If you read this file _as_is_, just ignore the funny characters you
see. It is written in the POD format (see pod/perlpod.pod) which is
specially designed to be readable as is.

=head1 NAME

perlvos - Perl for Stratus OpenVOS

=head1 SYNOPSIS

This file contains notes for building perl on the Stratus OpenVOS
operating system.  Perl is a scripting or macro language that is
popular on many systems.  See L<perlbook> for a number of good books
on Perl.

These are instructions for building Perl from source.  This version of
Perl requires the dynamic linking support that is found in OpenVOS
Release 17.1 and thus is not supported on OpenVOS Release 17.0 or
earlier releases.

If you are running VOS Release 14.4.1 or later, you can obtain a
pre-compiled, supported copy of perl by purchasing the GNU Tools
product from Stratus Technologies.

=head1 BUILDING PERL FOR OPENVOS

To build perl from its source code on the Stratus V Series platform
you must have OpenVOS Release 17.1.0 or later, GNU Tools Release
3.5 or later, and the C/POSIX Runtime Libraries.

Follow the normal instructions for building perl; e.g., enter bash, run
the Configure script, then use "gmake" to build perl.

=head1 INSTALLING PERL IN OPENVOS

=over 4

=item 1

After you have built perl using the Configure script, ensure that you
have modify and default write permission to C<< >system>ported >> and
all subdirectories.  Then type

     gmake install

=item 2

While there are currently no architecture-specific extensions or
modules distributed with perl, the following directory can be
used to hold such files (replace the string VERSION by the
appropriate version number):

     >system>ported>lib>perl5>VERSION>i786

=item 3

Site-specific perl extensions and modules can be installed in one of
two places.  Put architecture-independent files into:

     >system>ported>lib>perl5>site_perl>VERSION

Put site-specific architecture-dependent files into the following
directory:

     >system>ported>lib>perl5>site_perl>VERSION>i786

=item 4

You can examine the @INC variable from within a perl program
to see the order in which Perl searches these directories.

=back
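
As a minimal sketch (not part of the original port notes), the
following one-line program prints the search path, one directory per
line, in the order Perl consults it:

    # Print each @INC entry on its own line, in search order.
    print "$_\n" for @INC;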

=head1 USING PERL IN OPENVOS

=head2 Restrictions of Perl on OpenVOS

This port of Perl version 5 prefers Unix-style, slash-separated
pathnames over OpenVOS-style greater-than-separated pathnames.
OpenVOS-style pathnames should work in most contexts, but if you have
trouble, replace all greater-than characters by slash characters.
Because the slash character is used as a pathname delimiter, Perl
cannot process OpenVOS pathnames containing a slash character in a
directory or file name; these must be renamed.
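
If you do run into trouble, a small helper along the following lines
can do the conversion.  This is only a sketch: the name
C<vos_to_unix_path> is hypothetical and not part of Perl or OpenVOS,
and it assumes that every greater-than character in the string is a
pathname separator.

    use strict;
    use warnings;

    # Hypothetical helper: replace each '>' separator with '/'.
    sub vos_to_unix_path {
        my ($path) = @_;
        (my $unix = $path) =~ tr{>}{/};
        return $unix;
    }

    print vos_to_unix_path('>system>ported>lib>perl5'), "\n";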

This port of Perl also uses Unix-epoch date values internally.
As long as you are dealing with ASCII character string
representations of dates, this should not be an issue.  The
supported epoch is January 1, 1980 to January 17, 2038.

See the file pod/perlport.pod for more information about the OpenVOS
port of Perl.

=head1 TEST STATUS

A number of the perl self-tests fail for various reasons; generally
these are minor and due to subtle differences between common
POSIX-based environments and the OpenVOS POSIX environment.  Ensure
that you conduct sufficient testing of your code to guarantee that it
works properly in the OpenVOS environment.

=head1 SUPPORT STATUS

I'm offering this port "as is".  You can ask me questions, but I
can't guarantee I'll be able to answer them.  There are some
excellent books available on the Perl language; consult a book
seller.

If you want a supported version of perl for OpenVOS, purchase the
OpenVOS GNU Tools product from Stratus Technologies, along with a
support contract (or from anyone else who will sell you support).

=head1 AUTHOR

Paul Green (Paul.Green@stratus.com)

=head1 LAST UPDATE

February 28, 2013

=cut
If you read this file _as_is_, just ignore the funny characters you
see.  It is written in the POD format (see pod/perlpod.pod) which is
specifically designed to be readable as is.

=head1 NAME

perllinux - Perl version 5 on Linux systems

=head1 DESCRIPTION

This document describes various features of Linux that will affect how Perl
version 5 (hereafter just Perl) is compiled and/or runs.

=head2 Experimental Support for Sun Studio Compilers for Linux OS

Sun Microsystems has released a port of their Sun Studio compilers for
Linux.  As of November 2005, only an alpha version has been released.
Until a release of these compilers is made, support for compiling Perl with
these compilers is experimental.

Also, some special instructions apply when building Perl with Sun Studio on
Linux.  Following the normal C<Configure>, you have to run make as follows:

    LDLOADLIBS=-lc make

C<LDLOADLIBS> is an environment variable used by the linker to link
C</ext> modules to glibc.  Currently, that environment variable is not getting
populated by a combination of C<Config> entries and C<ExtUtils::MakeMaker>.
While there may be a bug somewhere in Perl's configuration or
C<ExtUtils::MakeMaker> causing the problem, the most likely cause is an
incomplete understanding of Sun Studio by this author.  Further investigation
is needed to get this working better.

=head1 AUTHOR

Steve Peters <steve@fisharerojo.org>

Please report any errors, updates, or suggestions to F<perlbug@perl.org>.

=for comment
This document is in Pod format.  To read this, use a Pod formatter,
like "perldoc perlpod".

=head1 NAME
X<POD> X<plain old documentation>

perlpod - the Plain Old Documentation format

=head1 DESCRIPTION

Pod is a simple-to-use markup language used for writing documentation
for Perl, Perl programs, and Perl modules.

Translators are available for converting Pod to various formats
like plain text, HTML, man pages, and more.

Pod markup consists of three basic kinds of paragraphs:
L<ordinary|/"Ordinary Paragraph">,
L<verbatim|/"Verbatim Paragraph">, and 
L<command|/"Command Paragraph">.


=head2 Ordinary Paragraph
X<POD, ordinary paragraph>

Most paragraphs in your documentation will be ordinary blocks
of text, like this one.  You can simply type in your text without
any markup whatsoever, and with just a blank line before and
after.  When it gets formatted, it will undergo minimal formatting, 
like being rewrapped, probably put into a proportionally spaced
font, and maybe even justified.

You can use formatting codes in ordinary paragraphs, for B<bold>,
I<italic>, C<code-style>, L<hyperlinks|perlfaq>, and more.  Such
codes are explained in the "L<Formatting Codes|/"Formatting Codes">"
section, below.


=head2 Verbatim Paragraph
X<POD, verbatim paragraph> X<verbatim>

Verbatim paragraphs are usually used for presenting a codeblock or
other text which does not require any special parsing or formatting,
and which shouldn't be wrapped.

A verbatim paragraph is distinguished by having its first character
be a space or a tab.  (And commonly, all its lines begin with spaces
and/or tabs.)  It should be reproduced exactly, with tabs assumed to
be on 8-column boundaries.  There are no special formatting codes,
so you can't italicize or anything like that.  A \ means \, and
nothing else.


=head2 Command Paragraph
X<POD, command>

A command paragraph is used for special treatment of whole chunks
of text, usually as headings or parts of lists.

All command paragraphs (which are typically only one line long) start
with "=", followed by an identifier, followed by arbitrary text that
the command can use however it pleases.  Currently recognized commands
are

    =pod
    =head1 Heading Text
    =head2 Heading Text
    =head3 Heading Text
    =head4 Heading Text
    =over indentlevel
    =item stuff
    =back
    =begin format
    =end format
    =for format text...
    =encoding type
    =cut

To explain them each in detail:

=over

=item C<=head1 I<Heading Text>>
X<=head1> X<=head2> X<=head3> X<=head4>
X<head1> X<head2> X<head3> X<head4>

=item C<=head2 I<Heading Text>>

=item C<=head3 I<Heading Text>>

=item C<=head4 I<Heading Text>>

Head1 through head4 produce headings, head1 being the highest
level.  The text in the rest of this paragraph is the content of the
heading.  For example:

  =head2 Object Attributes

The text "Object Attributes" comprises the heading there.
The text in these heading commands can use formatting codes, as seen here:

  =head2 Possible Values for C<$/>

Such codes are explained in the
"L<Formatting Codes|/"Formatting Codes">" section, below.

=item C<=over I<indentlevel>>
X<=over> X<=item> X<=back> X<over> X<item> X<back>

=item C<=item I<stuff...>>

=item C<=back>

Item, over, and back require a little more explanation:  "=over" starts
a region specifically for the generation of a list using "=item"
commands, or for indenting (groups of) normal paragraphs.  At the end
of your list, use "=back" to end it.  The I<indentlevel> option to
"=over" indicates how far over to indent, generally in ems (where
one em is the width of an "M" in the document's base font) or roughly
comparable units; if there is no I<indentlevel> option, it defaults
to four.  (And some formatters may just ignore whatever I<indentlevel>
you provide.)  In the I<stuff> in C<=item I<stuff...>>, you may
use formatting codes, as seen here:

  =item Using C<$|> to Control Buffering

Such codes are explained in the
"L<Formatting Codes|/"Formatting Codes">" section, below.

Note also that there are some basic rules to using "=over" ...
"=back" regions:

=over

=item *

Don't use "=item"s outside of an "=over" ... "=back" region.

=item *

The first thing after the "=over" command should be an "=item", unless
there aren't going to be any items at all in this "=over" ... "=back"
region.

=item *

Don't put "=headI<n>" commands inside an "=over" ... "=back" region.

=item *

And perhaps most importantly, keep the items consistent: either use
"=item *" for all of them, to produce bullets; or use "=item 1.",
"=item 2.", etc., to produce numbered lists; or use "=item foo",
"=item bar", etc.--namely, things that look nothing like bullets or
numbers.

If you start with bullets or numbers, stick with them, as
formatters use the first "=item" type to decide how to format the
list.

=back

=item C<=cut>
X<=cut> X<cut>

To end a Pod block, use a blank line,
then a line beginning with "=cut", and a blank
line after it.  This lets Perl (and the Pod formatter) know that
this is where Perl code is resuming.  (The blank line before the "=cut"
is not technically necessary, but many older Pod processors require it.)

=item C<=pod>
X<=pod> X<pod>

The "=pod" command by itself doesn't do much of anything, but it
signals to Perl (and Pod formatters) that a Pod block starts here.  A
Pod block starts with I<any> command paragraph, so a "=pod" command is
usually used just when you want to start a Pod block with an ordinary
paragraph or a verbatim paragraph.  For example:

  =item stuff()

  This function does stuff.

  =cut

  sub stuff {
    ...
  }

  =pod

  Remember to check its return value, as in:

    stuff() || die "Couldn't do stuff!";

  =cut

=item C<=begin I<formatname>>
X<=begin> X<=end> X<=for> X<begin> X<end> X<for>

=item C<=end I<formatname>>

=item C<=for I<formatname> I<text...>>

For, begin, and end will let you have regions of text/code/data that
are not generally interpreted as normal Pod text, but are passed
directly to particular formatters, or are otherwise special.  A
formatter that can use that format will use the region, otherwise it
will be completely ignored.

A command "=begin I<formatname>", some paragraphs, and a
command "=end I<formatname>", mean that the text/data in between
is meant for formatters that understand the special format
called I<formatname>.  For example,

  =begin html

  <hr> <img src="thang.png">
  <p> This is a raw HTML paragraph </p>

  =end html

The command "=for I<formatname> I<text...>"
specifies that the remainder of just this paragraph (starting
right after I<formatname>) is in that special format.  

  =for html <hr> <img src="thang.png">
  <p> This is a raw HTML paragraph </p>

This means the same thing as the above "=begin html" ... "=end html"
region.

That is, with "=for", you can have only one paragraph's worth
of text (i.e., the text in "=for targetname text..."), but with
"=begin targetname" ... "=end targetname", you can have any amount
of stuff in between.  (Note that there still must be a blank line
after the "=begin" command and a blank line before the "=end"
command.)

Here are some examples of how to use these:

  =begin html

  <br>Figure 1.<br><IMG SRC="figure1.png"><br>

  =end html

  =begin text

    ---------------
    |  foo        |
    |        bar  |
    ---------------

  ^^^^ Figure 1. ^^^^

  =end text

Some format names that formatters currently are known to accept
include "roff", "man", "latex", "tex", "text", and "html".  (Some
formatters will treat some of these as synonyms.)

A format name of "comment" is common for just making notes (presumably
to yourself) that won't appear in any formatted version of the Pod
document:

  =for comment
  Make sure that all the available options are documented!

Some I<formatnames> will require a leading colon (as in
C<"=for :formatname">, or
C<"=begin :formatname" ... "=end :formatname">),
to signal that the text is not raw data, but instead I<is> Pod text
(i.e., possibly containing formatting codes) that's just not for
normal formatting (e.g., may not be a normal-use paragraph, but might
be for formatting as a footnote).

=item C<=encoding I<encodingname>>
X<=encoding> X<encoding>

This command is used for declaring the encoding of a document.  Most
users won't need this; but if your encoding isn't US-ASCII,
then put a C<=encoding I<encodingname>> command very early in the document so
that pod formatters will know how to decode the document.  For
I<encodingname>, use a name recognized by the L<Encode::Supported>
module.  Some pod formatters may try to guess between a Latin-1 or
CP-1252 versus
UTF-8 encoding, but they may guess wrong.  It's best to be explicit if
you use anything besides strict ASCII.  Examples:

  =encoding latin1

  =encoding utf8

  =encoding koi8-r

  =encoding ShiftJIS

  =encoding big5

C<=encoding> affects the whole document, and must occur only once.

=back

And don't forget, every command but C<=encoding> lasts up
until the end of its I<paragraph>, not its line.  So in the
examples below, you can see that every command needs the blank
line after it, to end its paragraph.  (And some older Pod translators
may require the C<=encoding> line to have a following blank line as
well, even though it should be legal to omit.)

Some examples of lists include:

  =over

  =item *

  First item

  =item *

  Second item

  =back

  =over

  =item Foo()

  Description of Foo function

  =item Bar()

  Description of Bar function

  =back


=head2 Formatting Codes
X<POD, formatting code> X<formatting code>
X<POD, interior sequence> X<interior sequence>

In ordinary paragraphs and in some command paragraphs, various
formatting codes (a.k.a. "interior sequences") can be used:

=for comment
 "interior sequences" is such an opaque term.
 Prefer "formatting codes" instead.

=over

=item C<IE<lt>textE<gt>> -- italic text
X<I> X<< IZ<><> >> X<POD, formatting code, italic> X<italic>

Used for emphasis ("C<be IE<lt>careful!E<gt>>") and parameters
("C<redo IE<lt>LABELE<gt>>")

=item C<BE<lt>textE<gt>> -- bold text
X<B> X<< BZ<><> >> X<POD, formatting code, bold> X<bold>

Used for switches ("C<perl's BE<lt>-nE<gt> switch>"), programs
("C<some systems provide a BE<lt>chfnE<gt> for that>"),
emphasis ("C<be BE<lt>careful!E<gt>>"), and so on
("C<and that feature is known as BE<lt>autovivificationE<gt>>").

=item C<CE<lt>codeE<gt>> -- code text
X<C> X<< CZ<><> >> X<POD, formatting code, code> X<code>

Renders code in a typewriter font, or gives some other indication that
this represents program text ("C<CE<lt>gmtime($^T)E<gt>>") or some other
form of computerese ("C<CE<lt>drwxr-xr-xE<gt>>").

=item C<LE<lt>nameE<gt>> -- a hyperlink
X<L> X<< LZ<><> >> X<POD, formatting code, hyperlink> X<hyperlink>

There are various syntaxes, listed below.  In the syntaxes given,
C<text>, C<name>, and C<section> cannot contain the characters
'/' and '|'; and any '<' or '>' should be matched.

=over

=item *

C<LE<lt>nameE<gt>>

Link to a Perl manual page (e.g., C<LE<lt>Net::PingE<gt>>).  Note
that C<name> should not contain spaces.  This syntax
is also occasionally used for references to Unix man pages, as in
C<LE<lt>crontab(5)E<gt>>.

=item *

C<LE<lt>name/"sec"E<gt>> or C<LE<lt>name/secE<gt>>

Link to a section in another manual page.  E.g.,
C<LE<lt>perlsyn/"For Loops"E<gt>>

=item *

C<LE<lt>/"sec"E<gt>> or C<LE<lt>/secE<gt>>

Link to a section in this manual page.  E.g.,
C<LE<lt>/"Object Methods"E<gt>>

=back

A section is started by the named heading or item.  For
example, C<LE<lt>perlvar/$.E<gt>> or C<LE<lt>perlvar/"$."E<gt>> both
link to the section started by "C<=item $.>" in perlvar.  And
C<LE<lt>perlsyn/For LoopsE<gt>> or C<LE<lt>perlsyn/"For Loops"E<gt>>
both link to the section started by "C<=head2 For Loops>"
in perlsyn.

To control what text is used for display, you
use "C<LE<lt>text|...E<gt>>", as in:

=over

=item *

C<LE<lt>text|nameE<gt>>

Link this text to that manual page.  E.g.,
C<LE<lt>Perl Error Messages|perldiagE<gt>>

=item *

C<LE<lt>text|name/"sec"E<gt>> or C<LE<lt>text|name/secE<gt>>

Link this text to that section in that manual page.  E.g.,
C<LE<lt>postfix "if"|perlsyn/"Statement Modifiers"E<gt>>

=item *

C<LE<lt>text|/"sec"E<gt>> or C<LE<lt>text|/secE<gt>>
or C<LE<lt>text|"sec"E<gt>>

Link this text to that section in this manual page.  E.g.,
C<LE<lt>the various attributes|/"Member Data"E<gt>>

=back

Or you can link to a web page:

=over

=item *

C<LE<lt>scheme:...E<gt>>

C<LE<lt>text|scheme:...E<gt>>

Links to an absolute URL.  For example, C<LE<lt>http://www.perl.org/E<gt>> or
C<LE<lt>The Perl Home Page|http://www.perl.org/E<gt>>.

=back

=item C<EE<lt>escapeE<gt>> -- a character escape
X<E> X<< EZ<><> >> X<POD, formatting code, escape> X<escape>

Very similar to HTML/XML C<&I<foo>;> "entity references":

=over

=item *

C<EE<lt>ltE<gt>> -- a literal E<lt> (less than)

=item *

C<EE<lt>gtE<gt>> -- a literal E<gt> (greater than)

=item *

C<EE<lt>verbarE<gt>> -- a literal | (I<ver>tical I<bar>)

=item *

C<EE<lt>solE<gt>> -- a literal / (I<sol>idus)

The above four are optional except in other formatting codes,
notably C<LE<lt>...E<gt>>, and when preceded by a
capital letter.

=item *

C<EE<lt>htmlnameE<gt>>

Some non-numeric HTML entity name, such as C<EE<lt>eacuteE<gt>>,
meaning the same thing as C<&eacute;> in HTML -- i.e., a lowercase
e with an acute (/-shaped) accent.

=item *

C<EE<lt>numberE<gt>>

The ASCII/Latin-1/Unicode character with that number.  A
leading "0x" means that I<number> is hex, as in
C<EE<lt>0x201EE<gt>>.  A leading "0" means that I<number> is octal,
as in C<EE<lt>075E<gt>>.  Otherwise I<number> is interpreted as being
in decimal, as in C<EE<lt>181E<gt>>.

Note that older Pod formatters might not recognize octal or
hex numeric escapes, and that many formatters cannot reliably
render characters above 255.  (Some formatters may even have
to use compromised renderings of Latin-1/CP-1252 characters, like
rendering C<EE<lt>eacuteE<gt>> as just a plain "e".)

=back
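
Perl's C<oct> function happens to understand both the hex and the
octal form, so a short Perl sketch can show which code point each of
the C<EE<lt>numberE<gt>> examples denotes.  The helper name
C<escape_number_to_codepoint> is made up for this illustration; it is
not how any particular Pod formatter is implemented.

    use strict;
    use warnings;

    # Hypothetical helper: turn the number inside an E<...> code
    # into a code point, using the hex/octal/decimal rules above.
    sub escape_number_to_codepoint {
        my ($number) = @_;
        # oct() handles a leading "0x" (hex) and a leading "0" (octal).
        return $number =~ /^0/ ? oct($number) : $number + 0;
    }

    printf "%s => U+%04X\n", $_, escape_number_to_codepoint($_)
        for qw(0x201E 075 181);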

=item C<FE<lt>filenameE<gt>> -- used for filenames
X<F> X<< FZ<><> >> X<POD, formatting code, filename> X<filename>

Typically displayed in italics.  Example: "C<FE<lt>.cshrcE<gt>>"

=item C<SE<lt>textE<gt>> -- text contains non-breaking spaces
X<S> X<< SZ<><> >> X<POD, formatting code, non-breaking space> 
X<non-breaking space>

This means that the words in I<text> should not be broken
across lines.  Example: S<C<SE<lt>$x ? $y : $zE<gt>>>.

=item C<XE<lt>topic nameE<gt>> -- an index entry
X<X> X<< XZ<><> >> X<POD, formatting code, index entry> X<index entry>

This is ignored by most formatters, but some may use it for building
indexes.  It always renders as an empty string.
Example: C<XE<lt>absolutizing relative URLsE<gt>>

=item C<ZE<lt>E<gt>> -- a null (zero-effect) formatting code
X<Z> X<< ZZ<><> >> X<POD, formatting code, null> X<null>

This is rarely used.  It's one way to get around using an
EE<lt>...E<gt> code sometimes.  For example, instead of
"C<NEE<lt>ltE<gt>3>" (for "NE<lt>3") you could write
"C<NZE<lt>E<gt>E<lt>3>" (the "ZE<lt>E<gt>" breaks up the "N" and
the "E<lt>" so they can't be considered
part of a (fictitious) "NE<lt>...E<gt>" code).

=for comment
 This was formerly explained as a "zero-width character".  But in
 most parser models, it parses to nothing at all, as opposed to parsing
 as if it were a E<zwnj> or E<zwj>, which are REAL zero-width characters.
 So "width" and "character" are exactly the wrong words.

=back

Most of the time, you will need only a single set of angle brackets to
delimit the beginning and end of formatting codes.  However,
sometimes you will want to put a real right angle bracket (a
greater-than sign, '>') inside of a formatting code.  This is particularly
common when using a formatting code to provide a different font-type for a
snippet of code.  As with all things in Perl, there is more than
one way to do it.  One way is to simply escape the closing bracket
using an C<E> code:

    C<$a E<lt>=E<gt> $b>

This will produce: "C<$a E<lt>=E<gt> $b>"

A more readable, and perhaps more "plain" way is to use an alternate
set of delimiters that doesn't require a single ">" to be escaped.
Doubled angle brackets ("<<" and ">>") may be used I<if and only if there is
whitespace right after the opening delimiter and whitespace right
before the closing delimiter!>  For example, the following will
do the trick:
X<POD, formatting code, escaping with multiple brackets>

    C<< $a <=> $b >>

In fact, you can use as many repeated angle-brackets as you like so
long as you have the same number of them in the opening and closing
delimiters, and make sure that whitespace immediately follows the last
'<' of the opening delimiter, and immediately precedes the first '>'
of the closing delimiter.  (The whitespace is ignored.)  So the
following will also work:
X<POD, formatting code, escaping with multiple brackets>

    C<<< $a <=> $b >>>
    C<<<<  $a <=> $b     >>>>

And they all mean exactly the same as this:

    C<$a E<lt>=E<gt> $b>

The multiple-bracket form does not affect the interpretation of the contents of
the formatting code, only how it must end.  That means that the examples above
are also exactly the same as this:

    C<< $a E<lt>=E<gt> $b >>

As a further example, this means that if you wanted to put these bits of
code in C<C> (code) style:

    open(X, ">>thing.dat") || die $!
    $foo->bar();

you could do it like so:

    C<<< open(X, ">>thing.dat") || die $! >>>
    C<< $foo->bar(); >>

which is presumably easier to read than the old way:

    C<open(X, "E<gt>E<gt>thing.dat") || die $!>
    C<$foo-E<gt>bar();>

This is currently supported by pod2text (Pod::Text), pod2man (Pod::Man),
and any other pod2xxx or Pod::Xxxx translators that use
Pod::Parser 1.093 or later, or Pod::Tree 1.02 or later.

=head2 The Intent
X<POD, intent of>

The intent is simplicity of use, not power of expression.  Paragraphs
look like paragraphs (block format), so that they stand out
visually, and so that I could run them through C<fmt> easily to reformat
them (that's F7 in my version of B<vi>, or Esc Q in my version of
B<emacs>).  I wanted the translator to always leave the C<'> and C<`> and
C<"> quotes alone, in verbatim mode, so I could slurp in a
working program, shift it over four spaces, and have it print out, er,
verbatim.  And presumably in a monospace font.

The Pod format is not necessarily sufficient for writing a book.  Pod
is just meant to be an idiot-proof common source for nroff, HTML,
TeX, and other markup languages, as used for online
documentation.  Translators exist for B<pod2text>, B<pod2html>,
B<pod2man> (that's for nroff(1) and troff(1)), B<pod2latex>, and
B<pod2fm>.  Various others are available in CPAN.


=head2 Embedding Pods in Perl Modules
X<POD, embedding>

You can embed Pod documentation in your Perl modules and scripts.  Start
your documentation with an empty line, a "=head1" command at the
beginning, and end it with a "=cut" command and an empty line.  The
B<perl> executable will ignore the Pod text.  You can place a Pod
statement where B<perl> expects the beginning of a new statement, but
not within a statement, as that would result in an error.  See any of
the supplied library modules for examples.

If you're going to put your Pod at the end of the file, and you're using
an C<__END__> or C<__DATA__> cut mark, make sure to put an empty line there
before the first Pod command.

  __END__

  =head1 NAME

  Time::Local - efficiently compute time from local and GMT time

Without that empty line before the "=head1", many translators wouldn't
have recognized the "=head1" as starting a Pod block.

=head2 Hints for Writing Pod

=over

=item *
X<podchecker> X<POD, validating>

The B<podchecker> command is provided for checking Pod syntax for errors
and warnings.  For example, it checks for completely blank lines in
Pod blocks and for unknown commands and formatting codes.  You should
still also pass your document through one or more translators and proofread
the result, or print out the result and proofread that.  Some of the
problems found may be bugs in the translators, which you may or may not
wish to work around.

=item *

If you're more familiar with writing in HTML than with writing in Pod, you
can try your hand at writing documentation in simple HTML, and converting
it to Pod with the experimental L<Pod::HTML2Pod|Pod::HTML2Pod> module,
(available in CPAN), and looking at the resulting code.  The experimental
L<Pod::PXML|Pod::PXML> module in CPAN might also be useful.

=item *

Many older Pod translators require the lines before every Pod
command and after every Pod command (including "=cut"!) to be a blank
line.  Having something like this:

 # - - - - - - - - - - - -
 =item $firecracker->boom()

 This noisily detonates the firecracker object.
 =cut
 sub boom {
 ...

...will make such Pod translators completely fail to see the Pod block
at all.

Instead, have it like this:

 # - - - - - - - - - - - -

 =item $firecracker->boom()

 This noisily detonates the firecracker object.

 =cut

 sub boom {
 ...

=item *

Some older Pod translators require paragraphs (including command
paragraphs like "=head2 Functions") to be separated by I<completely>
empty lines.  If you have an apparently empty line with some spaces
on it, this might not count as a separator for those translators, and
that could cause odd formatting.

=item *

Older translators might add wording around an LE<lt>E<gt> link, so that
C<LE<lt>Foo::BarE<gt>> may become "the Foo::Bar manpage", for example.
So you shouldn't write things like C<the LE<lt>fooE<gt>
documentation>, if you want the translated document to read sensibly.
Instead, write C<the LE<lt>Foo::Bar|Foo::BarE<gt> documentation> or
C<LE<lt>the Foo::Bar documentation|Foo::BarE<gt>>, to control how the
link comes out.

=item *

Lines that go past the 70th column in a verbatim block might be
ungracefully wrapped by some formatters.

=back
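
The syntax check mentioned in the first hint can also be run from Perl
code with the L<Pod::Checker> module, which the B<podchecker> command
uses.  The sketch below assumes a file name passed on the command line
and sends diagnostics to standard error:

    use strict;
    use warnings;
    use Pod::Checker;

    # File to check; falls back to this script itself if none is given.
    my $file = shift(@ARGV) // $0;

    # podchecker() returns the number of errors found, or -1 if the
    # file contains no Pod commands at all.
    my $errors = podchecker($file, \*STDERR, -warnings => 1);

    print "$file: ",
        $errors < 0  ? "no Pod found"
      : $errors == 0 ? "Pod syntax OK"
      :                "$errors Pod error(s)",
        "\n";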

=head1 SEE ALSO

L<perlpodspec>, L<perlsyn/"PODs: Embedded Documentation">,
L<perlnewmod>, L<perldoc>, L<pod2html>, L<pod2man>, L<podchecker>.

=head1 AUTHOR

Larry Wall, Sean M. Burke

=cut
=head1 NAME

perl5004delta - what's new for perl5.004

=head1 DESCRIPTION

This document describes differences between the 5.003 release (as
documented in I<Programming Perl>, second edition--the Camel Book) and
this one.

=head1 Supported Environments

Perl5.004 builds out of the box on Unix, Plan 9, LynxOS, VMS, OS/2,
QNX, AmigaOS, and Windows NT.  Perl runs on Windows 95 as well, but it
cannot be built there, for lack of a reasonable command interpreter.

=head1 Core Changes

Most importantly, many bugs were fixed, including several security
problems.  See the F<Changes> file in the distribution for details.

=head2 List assignment to %ENV works

C<%ENV = ()> and C<%ENV = @list> now work as expected (except on VMS
where it generates a fatal error).

=head2 Change to "Can't locate Foo.pm in @INC" error

The error "Can't locate Foo.pm in @INC" now lists the contents of @INC
for easier debugging.
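
A quick way to see the message is to trap a failed C<require>.  The
module name C<No::Such::Module> below is deliberately fictitious:

    # Try to load a module that does not exist and show the error.
    eval { require No::Such::Module; 1 }
        or print "require failed: $@";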

=head2 Compilation option: Binary compatibility with 5.003

There is a new Configure question that asks if you want to maintain
binary compatibility with Perl 5.003.  If you choose binary
compatibility, you do not have to recompile your extensions, but you
might have symbol conflicts if you embed Perl in another application,
just as in the 5.003 release.  By default, binary compatibility
is preserved at the expense of symbol table pollution.

=head2 $PERL5OPT environment variable

You may now put Perl options in the $PERL5OPT environment variable.
Unless Perl is running with taint checks, it will interpret this
variable as if its contents had appeared on a "#!perl" line at the
beginning of your script, except that hyphens are optional.  PERL5OPT
may only be used to set the following switches: B<-[DIMUdmw]>.
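
As a rough illustration, the following sketch sets the variable for a
child perl and reports whether the B<-w> switch took effect there.  It
assumes the parent is not running with taint checks:

    # Run a child perl with -w supplied through the environment.
    local $ENV{PERL5OPT} = '-w';
    my $code = 'print "warnings are ", ($^W ? "on" : "off"), "\n"';
    system($^X, '-e', $code);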

=head2 Limitations on B<-M>, B<-m>, and B<-T> options

The C<-M> and C<-m> options are no longer allowed on the C<#!> line of
a script.  If a script needs a module, it should invoke it with the
C<use> pragma.

The B<-T> option is also forbidden on the C<#!> line of a script,
unless it was present on the Perl command line.  Due to the way C<#!>
works, this usually means that B<-T> must be in the first argument.
Thus:

    #!/usr/bin/perl -T -w

will probably work for an executable script invoked as C<scriptname>,
while:

    #!/usr/bin/perl -w -T

will probably fail under the same conditions.  (Non-Unix systems will
probably not follow this rule.)  But C<perl scriptname> is guaranteed
to fail, since then there is no chance of B<-T> being found on the
command line before it is found on the C<#!> line.

=head2 More precise warnings

If you removed the B<-w> option from your Perl 5.003 scripts because it
made Perl too verbose, we recommend that you try putting it back when
you upgrade to Perl 5.004.  Each new perl version tends to remove some
undesirable warnings, while adding new warnings that may catch bugs in
your scripts.

=head2 Deprecated: Inherited C<AUTOLOAD> for non-methods

Before Perl 5.004, C<AUTOLOAD> functions were looked up as methods
(using the C<@ISA> hierarchy), even when the function to be autoloaded
was called as a plain function (e.g. C<Foo::bar()>), not a method
(e.g. C<< Foo->bar() >> or C<< $obj->bar() >>).

Perl 5.005 will use method lookup only for methods' C<AUTOLOAD>s.
However, there is a significant base of existing code that may be using
the old behavior.  So, as an interim step, Perl 5.004 issues an optional
warning when a non-method uses an inherited C<AUTOLOAD>.

The simple rule is:  Inheritance will not work when autoloading
non-methods.  The simple fix for old code is:  In any module that used to
depend on inheriting C<AUTOLOAD> for non-methods from a base class named
C<BaseClass>, execute C<*AUTOLOAD = \&BaseClass::AUTOLOAD> during startup.
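
A minimal sketch of that fix follows.  The package name C<Derived> and
the function name C<frobnicate> are placeholders chosen for this
example:

    package BaseClass;
    use strict;
    use warnings;
    our $AUTOLOAD;

    sub AUTOLOAD {
        my $name = $AUTOLOAD;
        return if $name =~ /::DESTROY$/;    # never autoload destructors
        return "autoloaded $name";
    }

    package Derived;
    our @ISA = ('BaseClass');

    # Import the base class's AUTOLOAD explicitly, so that plain
    # function calls no longer depend on inheritance to find it.
    *AUTOLOAD = \&BaseClass::AUTOLOAD;

    package main;
    print Derived::frobnicate(), "\n";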

=head2 Previously deprecated %OVERLOAD is no longer usable

Using %OVERLOAD to define overloading was deprecated in 5.003.
Overloading is now defined using the overload pragma. %OVERLOAD is
still used internally but should not be used by Perl scripts. See
L<overload> for more details.

=head2 Subroutine arguments created only when they're modified

In Perl 5.004, nonexistent array and hash elements used as subroutine
parameters are brought into existence only if they are actually
assigned to (via C<@_>).

Earlier versions of Perl vary in their handling of such arguments.
Perl versions 5.002 and 5.003 always brought them into existence.
Perl versions 5.000 and 5.001 brought them into existence only if
they were not the first argument (which was almost certainly a bug).
Earlier versions of Perl never brought them into existence.

For example, given this code:

     undef @a; undef %a;
     sub show { print $_[0] };
     sub change { $_[0]++ };
     show($a[2]);
     change($a{b});

After this code executes in Perl 5.004, $a{b} exists but $a[2] does
not.  In Perl 5.002 and 5.003, both $a{b} and $a[2] would have existed
(but $a[2]'s value would have been undefined).

=head2 Group vector changeable with C<$)>

The C<$)> special variable has always (well, in Perl 5, at least)
reflected not only the current effective group, but also the group list
as returned by the C<getgroups()> C function (if there is one).
However, until this release, there has not been a way to call the
C<setgroups()> C function from Perl.

In Perl 5.004, assigning to C<$)> is exactly symmetrical with examining
it: The first number in its string value is used as the effective gid;
if there are any numbers after the first one, they are passed to the
C<setgroups()> C function (if there is one).
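
The symmetry can be sketched as follows.  The variable names are
illustrative only, and the assignment is left commented out because
changing groups normally requires privileges:

    # Read side: $) holds a space-separated list of group ids.
    my ($egid, @groups) = split ' ', $);
    print "effective gid: $egid\n";
    print "group list:    @groups\n";

    # Write side: the first number becomes the effective gid and
    # any remaining numbers are handed to setgroups().
    # $) = "$egid @groups";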

=head2 Fixed parsing of $$<digit>, &$<digit>, etc.

Perl versions before 5.004 misinterpreted any type marker followed by
"$" and a digit.  For example, "$$0" was incorrectly taken to mean
"${$}0" instead of "${$0}".  This bug is (mostly) fixed in Perl 5.004.

However, the developers of Perl 5.004 could not fix this bug completely,
because at least two widely-used modules depend on the old meaning of
"$$0" in a string.  So Perl 5.004 still interprets "$$<digit>" in the
old (broken) way inside strings, but it generates a warning when it
does so.  In Perl 5.005, this special treatment will cease.

=head2 Fixed localization of $<digit>, $&, etc.

Perl versions before 5.004 did not always properly localize the
regex-related special variables.  Perl 5.004 does localize them, as
the documentation has always said it should.  This may result in $1,
$2, etc. no longer being set where existing programs use them.

=head2 No resetting of $. on implicit close

The documentation for Perl 5.0 has always stated that C<$.> is I<not>
reset when an already-open file handle is reopened with no intervening
call to C<close>.  Due to a bug, perl versions 5.000 through 5.003
I<did> reset C<$.> under that circumstance; Perl 5.004 does not.

=head2 C<wantarray> may return undef

The C<wantarray> operator returns true if a subroutine is expected to
return a list, and false otherwise.  In Perl 5.004, C<wantarray> can
also return the undefined value if a subroutine's return value will
not be used at all, which allows subroutines to avoid a time-consuming
calculation of a return value if it isn't going to be used.
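
A short sketch of the three possible results, using a hypothetical
subroutine C<report> that records the context in which it was called:

    my @seen;

    sub report {
        my $want = wantarray;
        push @seen, !defined($want) ? 'void'
                  : $want           ? 'list'
                  :                   'scalar';
        return;
    }

    report();                # void context
    my $x = report();        # scalar context
    my @y = report();        # list context
    print "@seen\n";         # void scalar list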

=head2 C<eval EXPR> determines value of EXPR in scalar context

Perl (version 5) used to determine the value of EXPR inconsistently,
sometimes incorrectly using the surrounding context for the determination.
Now, the value of EXPR (before being parsed by eval) is always determined in
a scalar context.  Once parsed, it is executed as before, by providing
the context that the scope surrounding the eval provided.  This change
makes the behavior Perl4 compatible, besides fixing bugs resulting from
the inconsistent behavior.  This program:

    @a = qw(time now is time);
    print eval @a;
    print '|', scalar eval @a;

used to print something like "timenowis881399109|4", but now (and in perl4)
prints "4|4".

=head2 Changes to tainting checks

Because of a bug, previous versions may have failed to detect some
insecure conditions when taint checks are turned on.  (Taint checks are used
in setuid or setgid scripts, or when explicitly turned on with the
C<-T> invocation option.)  Although it's unlikely, this may cause a
previously-working script to now fail, which should be construed
as a blessing since that indicates a potentially-serious security
hole was just plugged.

The new restrictions when tainting include:

=over 4

=item No glob() or <*>

These operators may spawn the C shell (csh), which cannot be made
safe.  This restriction will be lifted in a future version of Perl
when globbing is implemented without the use of an external program.

=item No spawning if tainted $CDPATH, $ENV, $BASH_ENV

These environment variables may alter the behavior of spawned programs
(especially shells) in ways that subvert security.  So now they are
treated as dangerous, in the manner of $IFS and $PATH.

=item No spawning if tainted $TERM doesn't look like a terminal name

Some termcap libraries do unsafe things with $TERM.  However, it would be
unnecessarily harsh to treat all $TERM values as unsafe, since only shell
metacharacters can cause trouble in $TERM.  So a tainted $TERM is
considered to be safe if it contains only alphanumerics, underscores,
dashes, and colons, and unsafe if it contains other characters (including
whitespace).

=back

=head2 New Opcode module and revised Safe module

A new Opcode module supports the creation, manipulation and
application of opcode masks.  The revised Safe module has a new API
and is implemented using the new Opcode module.  Please read the new
Opcode and Safe documentation.

=head2 Embedding improvements

In older versions of Perl it was not possible to create more than one
Perl interpreter instance inside a single process without leaking like a
sieve and/or crashing.  The bugs that caused this behavior have all been
fixed.  However, you still must take care when embedding Perl in a C
program.  See the updated perlembed manpage for tips on how to manage
your interpreters.

=head2 Internal change: FileHandle class based on IO::* classes

File handles are now stored internally as type IO::Handle.  The
FileHandle module is still supported for backwards compatibility, but
it is now merely a front end to the IO::* modules, specifically
IO::Handle, IO::Seekable, and IO::File.  We suggest, but do not
require, that you use the IO::* modules in new code.

In harmony with this change, C<*GLOB{FILEHANDLE}> is now just a
backward-compatible synonym for C<*GLOB{IO}>.
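
For new code, the object interface might be used along these lines.
This is only a sketch; it relies on C<new_tmpfile> from IO::File,
C<seek> from IO::Seekable and C<getline> from IO::Handle:

    use IO::File;

    # new_tmpfile() opens an anonymous temporary file for read/write.
    my $fh = IO::File->new_tmpfile
        or die "cannot create temporary file: $!";

    print $fh "first line\n";
    $fh->seek(0, 0);             # rewind to the start of the file
    my $line = $fh->getline;     # read the line back
    print $line;
    $fh->close;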

=head2 Internal change: PerlIO abstraction interface

It is now possible to build Perl with AT&T's sfio IO package
instead of stdio.  See L<perlapio> for more details, and
the F<INSTALL> file for how to use it.

=head2 New and changed syntax

=over 4

=item $coderef->(PARAMS)

A subroutine reference may now be suffixed with an arrow and a
(possibly empty) parameter list.  This syntax denotes a call of the
referenced subroutine, with the given parameters (if any).

This new syntax follows the pattern of S<C<< $hashref->{FOO} >>> and
S<C<< $aryref->[$foo] >>>: You may now write S<C<&$subref($foo)>> as
S<C<< $subref->($foo) >>>.  All these arrow terms may be chained;
thus, S<C<< &{$table->{FOO}}($bar) >>> may now be written
S<C<< $table->{FOO}->($bar) >>>.

=back
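
The old and new spellings can be compared side by side.  The names
C<$double> and C<$table> below are illustrative only:

    my $double = sub { return 2 * $_[0] };
    my $table  = { double => $double };

    # Old and new spellings of the same call.
    my $old = &$double(21);
    my $new = $double->(21);

    # Chained arrows through a hash of code references.
    my $chained_old = &{ $table->{double} }(4);
    my $chained_new = $table->{double}->(4);

    print "$old $new $chained_old $chained_new\n";   # 42 42 8 8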

=head2 New and changed builtin constants

=over 4

=item __PACKAGE__

The current package name at compile time, or the undefined value if
there is no current package (due to a C<package;> directive).  Like
C<__FILE__> and C<__LINE__>, C<__PACKAGE__> does I<not> interpolate
into strings.

=back
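
Since C<__PACKAGE__> does not interpolate, it has to be concatenated
into a string.  A small sketch, using a hypothetical package name:

    package My::Module;

    my $joined  = "compiled in " . __PACKAGE__;
    my $literal = "compiled in __PACKAGE__";

    print "$joined\n";       # compiled in My::Module
    print "$literal\n";      # compiled in __PACKAGE__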

=head2 New and changed builtin variables

=over 4

=item $^E

Extended error message on some platforms.  (Also known as
$EXTENDED_OS_ERROR if you C<use English>).

=item $^H

The current set of syntax checks enabled by C<use strict>.  See the
documentation of C<strict> for more details.  Not actually new, but
newly documented.
Because it is intended for internal use by Perl core components,
there is no C<use English> long name for this variable.

=item $^M

By default, running out of memory is not trappable.  However, if
compiled for this, Perl may use the contents of C<$^M> as an emergency
pool after die()ing with an out-of-memory error.  Suppose that your Perl were
compiled with -DPERL_EMERGENCY_SBRK and used Perl's malloc.  Then

    $^M = 'a' x (1<<16);

would allocate a 64K buffer for use in an emergency.
See the F<INSTALL> file for information on how to enable this option.
As a disincentive to casual use of this advanced feature,
there is no C<use English> long name for this variable.

=back

=head2 New and changed builtin functions

=over 4

=item delete on slices

This now works.  (e.g. C<delete @ENV{'PATH', 'MANPATH'}>)

=item flock

is now supported on more platforms, prefers fcntl to lockf when
emulating, and always flushes before (un)locking.

=item printf and sprintf

Perl now implements these functions itself; it doesn't use the C
library function sprintf() any more, except for floating-point
numbers, and even then only known flags are allowed.  As a result, it
is now possible to know which conversions and flags will work, and
what they will do.

The new conversions in Perl's sprintf() are:

   %i	a synonym for %d
   %p	a pointer (the address of the Perl value, in hexadecimal)
   %n	special: *stores* the number of characters output so far
        into the next variable in the parameter list 

The new flags that go between the C<%> and the conversion are:

   #	prefix octal with "0", hex with "0x"
   h	interpret integer as C type "short" or "unsigned short"
   V	interpret integer as Perl's standard integer type

Also, where a number would appear in the flags, an asterisk ("*") may
be used instead, in which case Perl uses the next item in the
parameter list as the given number (that is, as the field width or
precision).  If a field width obtained through "*" is negative, it has
the same effect as the '-' flag: left-justification.

See L<perlfunc/sprintf> for a complete list of conversions and flags.

=item keys as an lvalue

As an lvalue, C<keys> allows you to increase the number of hash buckets
allocated for the given hash.  This can gain you a measure of efficiency if
you know the hash is going to get big.  (This is similar to pre-extending
an array by assigning a larger number to $#array.)  If you say

    keys %hash = 200;

then C<%hash> will have at least 200 buckets allocated for it.  These
buckets will be retained even if you do C<%hash = ()>; use C<undef
%hash> if you want to free the storage while C<%hash> is still in scope.
You can't shrink the number of buckets allocated for the hash using
C<keys> in this way (but you needn't worry about doing this by accident,
as trying has no effect).

=item my() in Control Structures

You can now use my() (with or without the parentheses) in the control
expressions of control structures such as:

    while (defined(my $line = <>)) {
        $line = lc $line;
    } continue {
        print $line;
    }

    if ((my $answer = <STDIN>) =~ /^y(es)?$/i) {
        user_agrees();
    } elsif ($answer =~ /^n(o)?$/i) {
        user_disagrees();
    } else {
        chomp $answer;
        die "`$answer' is neither `yes' nor `no'";
    }

Also, you can declare a foreach loop control variable as lexical by
preceding it with the word "my".  For example, in:

    foreach my $i (1, 2, 3) {
        some_function();
    }

$i is a lexical variable, and the scope of $i extends to the end of
the loop, but not beyond it.

Note that you still cannot use my() on global punctuation variables
such as $_ and the like.

=item pack() and unpack()

A new format 'w' represents a BER compressed integer (as defined in
ASN.1).  Its format is a sequence of one or more bytes, each of which
provides seven bits of the total value, with the most significant
first.  Bit eight of each byte is set, except for the last byte, in
which bit eight is clear.

If 'p' or 'P' are given undef as values, they now generate a NULL
pointer.

Both pack() and unpack() now fail when their templates contain invalid
types.  (Invalid types used to be ignored.)

=item sysseek()

The new sysseek() operator is a variant of seek() that sets and gets the
file's system read/write position, using the lseek(2) system call.  It is
the only reliable way to seek before using sysread() or syswrite().  Its
return value is the new position, or the undefined value on failure.

=item use VERSION

If the first argument to C<use> is a number, it is treated as a version
number instead of a module name.  If the version of the Perl interpreter
is less than VERSION, then an error message is printed and Perl exits
immediately.  Because C<use> occurs at compile time, this check happens
immediately during the compilation process, unlike C<require VERSION>,
which waits until runtime for the check.  This is often useful if you
need to check the current Perl version before C<use>ing library modules
which have changed in incompatible ways from older versions of Perl.
(We try not to do this more than we have to.)

=item use Module VERSION LIST

If the VERSION argument is present between Module and LIST, then the
C<use> will call the VERSION method in class Module with the given
version as an argument.  The default VERSION method, inherited from
the UNIVERSAL class, croaks if the given version is larger than the
value of the variable $Module::VERSION.  (Note that there is not a
comma after VERSION!)

This version-checking mechanism is similar to the one currently used
in the Exporter module, but it is faster and can be used with modules
that don't use the Exporter.  It is the recommended method for new
code.

=item prototype(FUNCTION)

Returns the prototype of a function as a string (or C<undef> if the
function has no prototype).  FUNCTION is a reference to or the name of the
function whose prototype you want to retrieve.
(Not actually new; just never documented before.)

=item srand

The default seed for C<srand>, which used to be C<time>, has been changed.
Now it's a heady mix of difficult-to-predict system-dependent values,
which should be sufficient for most everyday purposes.

Previous to version 5.004, calling C<rand> without first calling C<srand>
would yield the same sequence of random numbers on most or all machines.
Now, when perl sees that you're calling C<rand> and haven't yet called
C<srand>, it calls C<srand> with the default seed. You should still call
C<srand> manually if your code might ever be run on a pre-5.004 system,
of course, or if you want a seed other than the default.

=item $_ as Default

Functions documented in the Camel to default to $_ now in
fact do, and all those that do are so documented in L<perlfunc>.

=item C<m//gc> does not reset search position on failure

The C<m//g> match iteration construct has always reset its target
string's search position (which is visible through the C<pos> operator)
when a match fails; as a result, the next C<m//g> match after a failure
starts again at the beginning of the string.  With Perl 5.004, this
reset may be disabled by adding the "c" (for "continue") modifier,
i.e. C<m//gc>.  This feature, in conjunction with the C<\G> zero-width
assertion, makes it possible to chain matches together.  See L<perlop>
and L<perlre>.

=item C<m//x> ignores whitespace before ?*+{}

The C<m//x> construct has always been intended to ignore all unescaped
whitespace.  However, before Perl 5.004, whitespace had the effect of
escaping repeat modifiers like "*" or "?"; for example, C</a *b/x> was
(mis)interpreted as C</a\*b/x>.  This bug has been fixed in 5.004.

=item nested C<sub{}> closures work now

Prior to the 5.004 release, nested anonymous functions didn't work
right.  They do now.

=item formats work right on changing lexicals

Just like anonymous functions that contain lexical variables
that change (like a lexical index variable for a C<foreach> loop),
formats now work properly.  For example, this silently failed
before (printed only zeros), but is fine now:

    my $i;
    foreach $i ( 1 .. 10 ) {
	write;
    }
    format =
	my i is @#
	$i
    .

However, it still fails (without a warning) if the foreach is within a
subroutine:

    my $i;
    sub foo {
      foreach $i ( 1 .. 10 ) {
	write;
      }
    }
    foo;
    format =
	my i is @#
	$i
    .

=back
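
The C<m//gc> item above mentions chaining matches with C<\G>.  One way
this might look is a tiny tokenizer; the input string and the token
labels are made up for illustration:

    my $text = "abc 123 def";
    my @tokens;

    LEX: {
        # Each failed match leaves pos($text) alone because of /c,
        # so the next pattern is tried at the same position.
        if ($text =~ /\G\s+/gc)      { redo LEX }
        if ($text =~ /\G([a-z]+)/gc) { push @tokens, "WORD:$1"; redo LEX }
        if ($text =~ /\G(\d+)/gc)    { push @tokens, "NUM:$1";  redo LEX }
    }

    print "@tokens\n";       # WORD:abc NUM:123 WORD:def

Without the "c" modifier, the first failed match would reset the
search position and the block would never get past the first word.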

=head2 New builtin methods

The C<UNIVERSAL> package automatically contains the following methods that
are inherited by all other classes:

=over 4

=item isa(CLASS)

C<isa> returns I<true> if its object is blessed into C<CLASS> or a
subclass of C<CLASS>.

C<isa> is also exportable and can be called as a sub with two arguments.  This
makes it possible to check what a reference points to.  Example:

    use UNIVERSAL qw(isa);

    if(isa($ref, 'ARRAY')) {
       ...
    }

=item can(METHOD)

C<can> checks whether its object has a method called C<METHOD>.
If it does, a reference to the sub is returned; if it does not,
I<undef> is returned.

=item VERSION( [NEED] )

C<VERSION> returns the version number of the class (package).  If the
NEED argument is given then it will check that the current version (as
defined by the $VERSION variable in the given package) is not less than
NEED; it will die if this is not the case.  This method is normally
called as a class method.  This method is called automatically by the
C<VERSION> form of C<use>.

    use A 1.2 qw(some imported subs);
    # implies:
    A->VERSION(1.2);

=back

B<NOTE:> C<can> directly uses Perl's internal code for method lookup, and
C<isa> uses a very similar method and caching strategy. This may cause
strange effects if the Perl code dynamically changes @ISA in any package.

You may add other methods to the UNIVERSAL class via Perl or XS code.
You do not need to C<use UNIVERSAL> in order to make these methods
available to your program.  This is necessary only if you wish to
have C<isa> available as a plain subroutine in the current package.
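
The three methods can be tried out on a pair of hypothetical classes,
C<Animal> and C<Dog>, which exist only for this sketch:

    package Animal;
    use vars qw($VERSION);
    $VERSION = '1.05';
    sub new   { return bless {}, shift }
    sub speak { return 'generic noise' }

    package Dog;
    use vars qw(@ISA);
    @ISA = ('Animal');

    package main;
    my $dog = Dog->new;

    print "is an Animal\n" if $dog->isa('Animal');

    # can() returns a code reference, so the result can be called.
    my $method = $dog->can('speak');
    print $dog->$method(), "\n" if $method;
    print "cannot fly\n" unless $dog->can('fly');

    print Animal->VERSION, "\n";
    eval { Animal->VERSION(2.0) };     # dies: 1.05 is less than 2.0
    print "version check failed\n" if $@;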

=head2 TIEHANDLE now supported

See L<perltie> for other kinds of tie()s.

=over 4

=item TIEHANDLE classname, LIST

This is the constructor for the class.  That means it is expected to
return an object of some sort. The reference can be used to
hold some internal information.

    sub TIEHANDLE {
	print "<shout>\n";
	my $i;
	return bless \$i, shift;
    }

=item PRINT this, LIST

This method will be triggered every time the tied handle is printed to.
Beyond its self reference it also expects the list that was passed to
the print function.

    sub PRINT {
	my $r = shift;
	$$r++;
	return print join( $, => map {uc} @_), $\;
    }

=item PRINTF this, LIST

This method will be triggered every time the tied handle is printed to
with the C<printf()> function.
Beyond its self reference it also expects the format and list that was
passed to the printf function.

    sub PRINTF {
        shift;
        my $fmt = shift;
        print sprintf($fmt, @_)."\n";
    }

=item READ this, LIST

This method will be called when the handle is read from via the C<read>
or C<sysread> functions.

    sub READ {
	my $r = shift;
	my($buf,$len,$offset) = @_;
	print "READ called, \$buf=$buf, \$len=$len, \$offset=$offset";
    }

=item READLINE this

This method will be called when the handle is read from. The method
should return undef when there is no more data.

    sub READLINE {
	my $r = shift;
	return "PRINT called $$r times\n";
    }

=item GETC this

This method will be called when the C<getc> function is called.

    sub GETC { print "Don't GETC, Get Perl"; return "a"; }

=item DESTROY this

As with the other types of ties, this method will be called when the
tied handle is about to be destroyed. This is useful for debugging and
possibly for cleaning up.

    sub DESTROY {
	print "</shout>\n";
    }

=back
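
To see the methods in action, a handle has to be tied to a class that
defines them.  The following sketch puts a cut-down version of the
methods above into a hypothetical package named C<Shout> and ties the
handle C<FOO> to it:

    package Shout;

    sub TIEHANDLE { my $count = 0; return bless \$count, shift }

    sub PRINT {
        my $self = shift;
        $$self++;                        # count the calls
        return print STDOUT map { uc } @_;
    }

    sub READLINE {
        my $self = shift;
        return "PRINT called $$self times\n";
    }

    package main;

    tie *FOO, 'Shout';
    print FOO "hello\n";                 # arrives on STDOUT as HELLO
    my $line = <FOO>;                    # calls READLINE
    print $line;
    untie *FOO;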

=head2 Malloc enhancements

If perl is compiled with the malloc included with the perl distribution
(that is, if C<perl -V:d_mymalloc> is 'define') then you can print
memory statistics at runtime by running Perl like this:

  env PERL_DEBUG_MSTATS=2 perl your_script_here

The value of 2 means to print statistics after compilation and on
exit; with a value of 1, the statistics are printed only on exit.
(If you want the statistics at an arbitrary time, you'll need to
install the optional module Devel::Peek.)

Three new compilation flags are recognized by malloc.c.  (They have no
effect if perl is compiled with system malloc().)

=over 4

=item -DPERL_EMERGENCY_SBRK

If this macro is defined, running out of memory need not be a fatal
error: a memory pool can be allocated by assigning to the special
variable C<$^M>.  See L</"$^M">.

=item -DPACK_MALLOC

Perl memory allocation is by bucket with sizes close to powers of two.
Because of this, malloc overhead may be large, especially for data of
size exactly a power of two.  If C<PACK_MALLOC> is defined, perl uses
a slightly different algorithm for small allocations (up to 64 bytes
long), which makes it possible to have overhead down to 1 byte for
allocations which are powers of two (and appear quite often).

Expected memory savings (with 8-byte alignment in C<alignbytes>) are
about 20% for typical Perl usage.  Expected slowdown due to additional
malloc overhead is in fractions of a percent (hard to measure, because
of the effect of saved memory on speed).

=item -DTWO_POT_OPTIMIZE

Similarly to C<PACK_MALLOC>, this macro improves allocations of data
with size close to a power of two; but this works for big allocations
(starting with 16K by default).  Such allocations are typical for big
hashes and special-purpose scripts, especially image processing.

On recent systems, the fact that perl requires 2M from the system for a
1M allocation will not affect speed of execution, since the tail of such
a chunk is not going to be touched (and thus will not require real
memory).  However, it may result in a premature out-of-memory error.
So if you will be manipulating very large blocks with sizes close to
powers of two, it would be wise to define this macro.

Expected saving of memory is 0-100% (100% in applications which
require most memory in such 2**n chunks); expected slowdown is
negligible.

=back

=head2 Miscellaneous efficiency enhancements

Functions that have an empty prototype and that do nothing but return
a fixed value are now inlined (e.g. C<sub PI () { 3.14159 }>).
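
A sketch of such a constant function in use; the variable names are
illustrative only:

    sub PI () { 3.14159 }

    my $radius = 2;
    my $area   = PI * $radius ** 2;
    print "$area\n";

The empty prototype also tells the parser that C<PI> takes no
arguments, so the C<*> that follows it is read as multiplication.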

Each unique hash key is only allocated once, no matter how many hashes
have an entry with that key.  So even if you have 100 copies of the
same hash, the hash keys never have to be reallocated.

=head1 Support for More Operating Systems

Support for the following operating systems is new in Perl 5.004.

=head2 Win32

Perl 5.004 now includes support for building a "native" perl under
Windows NT, using the Microsoft Visual C++ compiler (versions 2.0
and above) or the Borland C++ compiler (versions 5.02 and above).
The resulting perl can be used under Windows 95 (provided it is
installed in the same directory locations as it was under
Windows NT).  This port includes support for perl extension
building tools like L<ExtUtils::MakeMaker> and L<h2xs>, so that many extensions
available on the Comprehensive Perl Archive Network (CPAN) can now be
readily built under Windows NT.  See http://www.perl.com/ for more
information on CPAN and F<README.win32> in the perl distribution for more
details on how to get started with building this port.

There is also support for building perl under the Cygwin32 environment.
Cygwin32 is a set of GNU tools that make it possible to compile and run
many Unix programs under Windows NT by providing a mostly Unix-like 
interface for compilation and execution.  See F<README.cygwin32> in the
perl distribution for more details on this port and how to obtain the
Cygwin32 toolkit.

=head2 Plan 9

See F<README.plan9> in the perl distribution.

=head2 QNX

See F<README.qnx> in the perl distribution.

=head2 AmigaOS

See F<README.amigaos> in the perl distribution.

=head1 Pragmata

Six new pragmatic modules exist:

=over 4

=item use autouse MODULE => qw(sub1 sub2 sub3)

Defers C<require MODULE> until someone calls one of the specified
subroutines (which must be exported by MODULE).  This pragma should be
used with caution, and only when necessary.

=item use blib

=item use blib 'dir'

Looks for MakeMaker-like I<'blib'> directory structure starting in
I<dir> (or current directory) and working back up to five levels of
parent directories.

Intended for use on command line with B<-M> option as a way of testing
arbitrary scripts against an uninstalled version of a package.

=item use constant NAME => VALUE

Provides a convenient interface for creating compile-time constants.
See L<perlsub/"Constant Functions">.

=item use locale

Tells the compiler to enable (or disable) the use of POSIX locales for
builtin operations.

When C<use locale> is in effect, the current LC_CTYPE locale is used
for regular expressions and case mapping; LC_COLLATE for string
ordering; and LC_NUMERIC for numeric formatting in printf and sprintf
(but B<not> in print).  LC_NUMERIC is always used in write, since
lexical scoping of formats is problematic at best.

Each C<use locale> or C<no locale> affects statements to the end of
the enclosing BLOCK or, if not inside a BLOCK, to the end of the
current file.  Locales can be switched and queried with
POSIX::setlocale().

See L<perllocale> for more information.

=item use ops

Disable unsafe opcodes, or any named opcodes, when compiling Perl code.

=item use vmsish

Enable VMS-specific language features.  Currently, there are three
VMS-specific features available: 'status', which makes C<$?> and
C<system> return genuine VMS status values instead of emulating POSIX;
'exit', which makes C<exit> take a genuine VMS status value instead of
assuming that C<exit 1> is an error; and 'time', which makes all times
relative to the local time zone, in the VMS tradition.

=back

=head1 Modules

=head2 Required Updates

Though Perl 5.004 is compatible with almost all modules that work
with Perl 5.003, there are a few exceptions:

    Module   Required Version for Perl 5.004
    ------   -------------------------------
    Filter   Filter-1.12
    LWP      libwww-perl-5.08
    Tk       Tk400.202 (-w makes noise)

Also, the majordomo mailing list program, version 1.94.1, doesn't work
with Perl 5.004 (nor with perl 4), because it executes an invalid
regular expression.  This bug is fixed in majordomo version 1.94.2.

=head2 Installation directories

The I<installperl> script now places the Perl source files for
extensions in the architecture-specific library directory, which is
where the shared libraries for extensions have always been.  This
change is intended to allow administrators to keep the Perl 5.004
library directory unchanged from a previous version, without running
the risk of binary incompatibility between extensions' Perl source and
shared libraries.

=head2 Module information summary

Brand new modules, arranged by topic rather than strictly
alphabetically:

    CGI.pm               Web server interface ("Common Gateway Interface")
    CGI/Apache.pm        Support for Apache's Perl module
    CGI/Carp.pm          Log server errors with helpful context
    CGI/Fast.pm          Support for FastCGI (persistent server process)
    CGI/Push.pm          Support for server push
    CGI/Switch.pm        Simple interface for multiple server types

    CPAN                 Interface to Comprehensive Perl Archive Network
    CPAN::FirstTime      Utility for creating CPAN configuration file
    CPAN::Nox            Runs CPAN while avoiding compiled extensions

    IO.pm                Top-level interface to IO::* classes
    IO/File.pm           IO::File extension Perl module
    IO/Handle.pm         IO::Handle extension Perl module
    IO/Pipe.pm           IO::Pipe extension Perl module
    IO/Seekable.pm       IO::Seekable extension Perl module
    IO/Select.pm         IO::Select extension Perl module
    IO/Socket.pm         IO::Socket extension Perl module

    Opcode.pm            Disable named opcodes when compiling Perl code

    ExtUtils/Embed.pm    Utilities for embedding Perl in C programs
    ExtUtils/testlib.pm  Fixes up @INC to use just-built extension

    FindBin.pm           Find path of currently executing program

    Class/Struct.pm      Declare struct-like datatypes as Perl classes
    File/stat.pm         By-name interface to Perl's builtin stat
    Net/hostent.pm       By-name interface to Perl's builtin gethost*
    Net/netent.pm        By-name interface to Perl's builtin getnet*
    Net/protoent.pm      By-name interface to Perl's builtin getproto*
    Net/servent.pm       By-name interface to Perl's builtin getserv*
    Time/gmtime.pm       By-name interface to Perl's builtin gmtime
    Time/localtime.pm    By-name interface to Perl's builtin localtime
    Time/tm.pm           Internal object for Time::{gm,local}time
    User/grent.pm        By-name interface to Perl's builtin getgr*
    User/pwent.pm        By-name interface to Perl's builtin getpw*

    Tie/RefHash.pm       Base class for tied hashes with references as keys

    UNIVERSAL.pm         Base class for *ALL* classes

=head2 Fcntl

New constants in the existing Fcntl module are now supported,
provided that your operating system happens to support them:

    F_GETOWN F_SETOWN
    O_ASYNC O_DEFER O_DSYNC O_FSYNC O_SYNC
    O_EXLOCK O_SHLOCK

These constants are intended for use with the Perl operators sysopen()
and fcntl() and the basic database modules like SDBM_File.  For the
exact meaning of these and other Fcntl constants please refer to your
operating system's documentation for fcntl() and open().

In addition, the Fcntl module now provides these constants for use
with the Perl operator flock():

	LOCK_SH LOCK_EX LOCK_NB LOCK_UN

These constants are defined in all environments (because where there is
no flock() system call, Perl emulates it).  However, for historical
reasons, these constants are not exported unless they are explicitly
requested with the ":flock" tag (e.g. C<use Fcntl ':flock'>).
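
A minimal sketch of the C<:flock> constants in use.  The scratch file
name is made up for this example, and the file is removed afterwards:

    use Fcntl ':flock';

    my $file = "flock_demo.$$";
    open(LOG, ">>$file") or die "can't open $file: $!";
    flock(LOG, LOCK_EX)  or die "can't lock $file: $!";
    print LOG "one record\n";
    flock(LOG, LOCK_UN);
    close(LOG);
    unlink $file;

To avoid blocking while another process holds the lock, C<LOCK_NB> can
be combined with the lock type, as in C<LOCK_EX | LOCK_NB>.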

=head2 IO

The IO module provides a simple mechanism to load all the IO modules in one
go.  Currently this includes:

     IO::Handle
     IO::Seekable
     IO::File
     IO::Pipe
     IO::Socket

For more information on any of these modules, please see its
respective documentation.

=head2 Math::Complex

The Math::Complex module has been totally rewritten, and now supports
more operations.  These are overloaded:

     + - * / ** <=> neg ~ abs sqrt exp log sin cos atan2 "" (stringify)

And these functions are now exported:

    pi i Re Im arg
    log10 logn ln cbrt root
    tan
    csc sec cot
    asin acos atan
    acsc asec acot
    sinh cosh tanh
    csch sech coth
    asinh acosh atanh
    acsch asech acoth
    cplx cplxe
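
A brief sketch of the overloaded operators and a few of the exported
functions; the variable names are illustrative only:

    use Math::Complex;

    my $z    = cplx(3, 4);            # 3 + 4i
    my $root = sqrt(-4);              # a complex result, not an error

    print "z    = $z\n";              # uses the "" (stringify) overload
    print "|z|  = ", abs($z), "\n";
    print "Re/Im: ", Re($z), " ", Im($z), "\n";
    print "root = $root\n";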

=head2 Math::Trig

This new module provides a simpler interface to parts of Math::Complex for
those who need trigonometric functions only for real numbers.
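
For example, assuming the C<pi>, C<tan> and C<deg2rad> functions that
Math::Trig exports by default:

    use Math::Trig;

    my $tangent = tan(pi / 4);        # close to 1
    my $radians = deg2rad(180);       # close to pi

    printf "%.4f %.4f\n", $tangent, $radians;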

=head2 DB_File

There have been quite a few changes made to DB_File. Here are a few of
the highlights:

=over 4

=item *

Fixed a handful of bugs.

=item *

By public demand, added support for the standard hash function exists().

=item *

Made it compatible with Berkeley DB 1.86.

=item *

Made negative subscripts work with RECNO interface.

=item *

Changed the default flags from O_RDWR to O_CREAT|O_RDWR and the default
mode from 0640 to 0666.

=item *

Made DB_File automatically import the open() constants (O_RDWR,
O_CREAT etc.) from Fcntl, if available.

=item *

Updated documentation.

=back

Refer to the HISTORY section in DB_File.pm for a complete list of
changes. Everything after DB_File 1.01 has been added since 5.003.

=head2 Net::Ping

Major rewrite: support added for both UDP echo and real ICMP pings.

=head2 Object-oriented overrides for builtin operators

Many of the Perl builtins returning lists now have
object-oriented overrides.  These are:

    File::stat
    Net::hostent
    Net::netent
    Net::protoent
    Net::servent
    Time::gmtime
    Time::localtime
    User::grent
    User::pwent

For example, you can now say

    use File::stat;
    use User::pwent;
    $his = (stat($filename)->st_uid == pwent($whoever)->pw_uid);

=head1 Utility Changes

=head2 pod2html

=over 4

=item Sends converted HTML to standard output

The I<pod2html> utility included with Perl 5.004 is entirely new.
By default, it sends the converted HTML to its standard output,
instead of writing it to a file like Perl 5.003's I<pod2html> did.
Use the B<--outfile=FILENAME> option to write to a file.

=back

=head2 xsubpp

=over 4

=item C<void> XSUBs now default to returning nothing

Due to a documentation/implementation bug in previous versions of
Perl, XSUBs with a return type of C<void> have actually been
returning one value.  Usually that value was the GV for the XSUB,
but sometimes it was some already freed or reused value, which would
sometimes lead to program failure.

In Perl 5.004, if an XSUB is declared as returning C<void>, it
actually returns no value, i.e. an empty list (though there is a
backward-compatibility exception; see below).  If your XSUB really
does return an SV, you should give it a return type of C<SV *>.

For backward compatibility, I<xsubpp> tries to guess whether a
C<void> XSUB is really C<void> or if it wants to return an C<SV *>.
It does so by examining the text of the XSUB: if I<xsubpp> finds
what looks like an assignment to C<ST(0)>, it assumes that the
XSUB's return type is really C<SV *>.

=back

=head1 C Language API Changes

=over 4

=item C<gv_fetchmethod> and C<perl_call_sv>

The C<gv_fetchmethod> function finds a method for an object, just like
in Perl 5.003.  The GV it returns may be a method cache entry.
However, in Perl 5.004, method cache entries are not visible to users;
therefore, they can no longer be passed directly to C<perl_call_sv>.
Instead, you should use the C<GvCV> macro on the GV to extract its CV,
and pass the CV to C<perl_call_sv>.

The most likely symptom of passing the result of C<gv_fetchmethod> to
C<perl_call_sv> is Perl's producing an "Undefined subroutine called"
error on the I<second> call to a given method (since there is no cache
on the first call).

=item C<perl_eval_pv>

A new function handy for eval'ing strings of Perl code inside C code.
This function returns the value from the eval statement, which can
be used instead of fetching globals from the symbol table.  See
L<perlguts>, L<perlembed> and L<perlcall> for details and examples.

=item Extended API for manipulating hashes

Internal handling of hash keys has changed.  The old hashtable API is
still fully supported, and will likely remain so.  The additions to the
API allow passing keys as C<SV*>s, so that C<tied> hashes can be given
real scalars as keys rather than plain strings (nontied hashes still
can only use strings as keys).  New extensions must use the new hash
access functions and macros if they wish to use C<SV*> keys.  These
additions also make it feasible to manipulate C<HE*>s (hash entries),
which can be more efficient.  See L<perlguts> for details.

=back

=head1 Documentation Changes

Many of the base and library pods were updated.  These
new pods are included in section 1:

=over 4

=item L<perldelta>

This document.

=item L<perlfaq>

Frequently asked questions.

=item L<perllocale>

Locale support (internationalization and localization).

=item L<perltoot>

Tutorial on Perl OO programming.

=item L<perlapio>

Perl internal IO abstraction interface.

=item L<perlmodlib>

Perl module library and recommended practice for module creation.
Extracted from L<perlmod> (which is much smaller as a result).

=item L<perldebug>

Although not new, this has been massively updated.

=item L<perlsec>

Although not new, this has been massively updated.

=back

=head1 New Diagnostics

Several new conditions will trigger warnings that were
silent before.  Some only affect certain platforms.
The following new warnings and errors outline these.
These messages are classified as follows (listed in
increasing order of desperation):

   (W) A warning (optional).
   (D) A deprecation (optional).
   (S) A severe warning (mandatory).
   (F) A fatal error (trappable).
   (P) An internal error you should never see (trappable).
   (X) A very fatal error (nontrappable).
   (A) An alien error message (not generated by Perl).

=over 4

=item "my" variable %s masks earlier declaration in same scope

(W) A lexical variable has been redeclared in the same scope, effectively
eliminating all access to the previous instance.  This is almost always
a typographical error.  Note that the earlier variable will still exist
until the end of the scope or until all closure referents to it are
destroyed.

=item %s argument is not a HASH element or slice

(F) The argument to delete() must be either a hash element, such as

    $foo{$bar}
    $ref->[12]->{"susie"}

or a hash slice, such as

    @foo{$bar, $baz, $xyzzy}
    @{$ref->[12]}{"susie", "queue"}

=item Allocation too large: %lx

(X) You can't allocate more than 64K on an MS-DOS machine.

=item Allocation too large

(F) You can't allocate more than 2^31+"small amount" bytes.

=item Applying %s to %s will act on scalar(%s)

(W) The pattern match (//), substitution (s///), and transliteration (tr///)
operators work on scalar values.  If you apply one of them to an array
or a hash, it will convert the array or hash to a scalar value (the
length of an array or the population info of a hash) and then work on
that scalar value.  This is probably not what you meant to do.  See
L<perlfunc/grep> and L<perlfunc/map> for alternatives.
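
As a small illustration (the array contents are made up), C<grep>
applies the pattern to each element rather than to the array's length:

    my @names = ("fred", "barney", "betty");

    # Writing  @names =~ /^b/  would test the string "3", the array's
    # length, instead of the names themselves.
    my @b_names = grep { /^b/ } @names;    # ("barney", "betty")
    print "@b_names\n";                    # barney betty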

=item Attempt to free nonexistent shared string

(P) Perl maintains a reference counted internal table of strings to
optimize the storage and access of hash keys and other strings.  This
indicates someone tried to decrement the reference count of a string
that can no longer be found in the table.

=item Attempt to use reference as lvalue in substr

(W) You supplied a reference as the first argument to substr() used
as an lvalue, which is pretty strange.  Perhaps you forgot to
dereference it first.  See L<perlfunc/substr>.

=item Bareword "%s" refers to nonexistent package

(W) You used a qualified bareword of the form C<Foo::>, but
the compiler saw no other uses of that namespace before that point.
Perhaps you need to predeclare a package?

=item Can't redefine active sort subroutine %s

(F) Perl optimizes the internal handling of sort subroutines and keeps
pointers into them.  You tried to redefine one such sort subroutine when it
was currently active, which is not allowed.  If you really want to do
this, you should write C<sort { &func } @x> instead of C<sort func @x>.

=item Can't use bareword ("%s") as %s ref while "strict refs" in use

(F) Only hard references are allowed by "strict refs".  Symbolic references
are disallowed.  See L<perlref>.

=item Cannot resolve method `%s' overloading `%s' in package `%s'

(P) Internal error trying to resolve overloading specified by a method
name (as opposed to a subroutine reference).

=item Constant subroutine %s redefined

(S) You redefined a subroutine which had previously been eligible for
inlining.  See L<perlsub/"Constant Functions"> for commentary and
workarounds.

=item Constant subroutine %s undefined

(S) You undefined a subroutine which had previously been eligible for
inlining.  See L<perlsub/"Constant Functions"> for commentary and
workarounds.

=item Copy method did not return a reference

(F) The method which overloads "=" is buggy. See L<overload/Copy Constructor>.

=item Died

(F) You passed die() an empty string (the equivalent of C<die "">) or
you called it with no args and both C<$@> and C<$_> were empty.

=item Exiting pseudo-block via %s

(W) You are exiting a rather special block construct (like a sort block or
subroutine) by unconventional means, such as a goto, or a loop control
statement.  See L<perlfunc/sort>.

=item Identifier too long

(F) Perl limits identifiers (names for variables, functions, etc.) to
252 characters for simple names, somewhat more for compound names (like
C<$A::B>).  You've exceeded Perl's limits.  Future versions of Perl are
likely to eliminate these arbitrary limitations.

=item Illegal character %s (carriage return)

(F) A carriage return character was found in the input.  This is an
error, and not a warning, because carriage return characters can break
multi-line strings, including here documents (e.g., C<print <<EOF;>).

=item Illegal switch in PERL5OPT: %s

(X) The PERL5OPT environment variable may only be used to set the
following switches: B<-[DIMUdmw]>.

=item Integer overflow in hex number

(S) The literal hex number you have specified is too big for your
architecture. On a 32-bit architecture the largest hex literal is
0xFFFFFFFF.

=item Integer overflow in octal number

(S) The literal octal number you have specified is too big for your
architecture. On a 32-bit architecture the largest octal literal is
037777777777.

=item internal error: glob failed

(P) Something went wrong with the external program(s) used for C<glob>
and C<< <*.c> >>.  This may mean that your csh (C shell) is
broken.  If so, you should change all of the csh-related variables in
config.sh:  If you have tcsh, make the variables refer to it as if it
were csh (e.g. C<full_csh='/usr/bin/tcsh'>); otherwise, make them all
empty (except that C<d_csh> should be C<'undef'>) so that Perl will
think csh is missing.  In either case, after editing config.sh, run
C<./Configure -S> and rebuild Perl.

=item Invalid conversion in %s: "%s"

(W) Perl does not understand the given format conversion.
See L<perlfunc/sprintf>.

=item Invalid type in pack: '%s'

(F) The given character is not a valid pack type.  See L<perlfunc/pack>.

=item Invalid type in unpack: '%s'

(F) The given character is not a valid unpack type.  See L<perlfunc/unpack>.

=item Name "%s::%s" used only once: possible typo

(W) Typographical errors often show up as unique variable names.
If you had a good reason for having a unique name, then just mention
it again somehow to suppress the message (the C<use vars> pragma is
provided for just this purpose).

=item Null picture in formline

(F) The first argument to formline must be a valid format picture
specification.  It was found to be empty, which probably means you
supplied it an uninitialized value.  See L<perlform>.

=item Offset outside string

(F) You tried to do a read/write/send/recv operation with an offset
pointing outside the buffer.  This is difficult to imagine.
The sole exception to this is that C<sysread()>ing past the buffer
will extend the buffer and zero pad the new area.

=item Out of memory!

(X|F) The malloc() function returned 0, indicating there was insufficient
remaining memory (or virtual memory) to satisfy the request.

The request was judged to be small, so the possibility to trap it
depends on the way Perl was compiled.  By default it is not trappable.
However, if compiled for this, Perl may use the contents of C<$^M> as
an emergency pool after die()ing with this message.  In this case the
error is trappable I<once>.

=item Out of memory during request for %s

(F) The malloc() function returned 0, indicating there was insufficient
remaining memory (or virtual memory) to satisfy the request. However,
the request was judged large enough (compile-time default is 64K), so
you are given a chance to shut down by trapping this error.

=item panic: frexp

(P) The library function frexp() failed, making printf("%f") impossible.

=item Possible attempt to put comments in qw() list

(W) qw() lists contain items separated by whitespace; as with literal
strings, comment characters are not ignored, but are instead treated
as literal data.  (You may have used different delimiters than the
parentheses shown here; braces are also frequently used.)

You probably wrote something like this:

    @list = qw(
        a # a comment
        b # another comment
    );

when you should have written this:

    @list = qw(
        a
        b
    );

If you really want comments, build your list the
old-fashioned way, with quotes and commas:

    @list = (
        'a',    # a comment
        'b',    # another comment
    );

=item Possible attempt to separate words with commas

(W) qw() lists contain items separated by whitespace; therefore commas
aren't needed to separate the items. (You may have used different
delimiters than the parentheses shown here; braces are also frequently
used.)

You probably wrote something like this:

    qw! a, b, c !;

which puts literal commas into some of the list items.  Write it without
commas if you don't want them to appear in your data:

    qw! a b c !;

=item Scalar value @%s{%s} better written as $%s{%s}

(W) You've used a hash slice (indicated by @) to select a single element of
a hash.  Generally it's better to ask for a scalar value (indicated by $).
The difference is that C<$foo{&bar}> always behaves like a scalar, both when
assigning to it and when evaluating its argument, while C<@foo{&bar}> behaves
like a list when you assign to it, and provides a list context to its
subscript, which can do weird things if you're expecting only one subscript.
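
A short sketch of the difference; C<bar> here is a hypothetical helper
that reports the context it was called in:

    my %foo = (1 => "one", 3 => "three");

    # Hypothetical helper: returns two keys in list context, one in scalar.
    sub bar { return wantarray ? (1, 3) : 1 }

    my $scalar = $foo{ bar() };    # subscript in scalar context
    my @list   = @foo{ bar() };    # subscript in list context

    print "$scalar\n";             # one
    print "@list\n";               # one three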

=item Stub found while resolving method `%s' overloading `%s' in %s

(P) Overloading resolution over @ISA tree may be broken by importing stubs.
Stubs should never be implicitly created, but explicit calls to C<can>
may break this.

=item Too late for "B<-T>" option

(X) The #! line (or local equivalent) in a Perl script contains the
B<-T> option, but Perl was not invoked with B<-T> in its argument
list.  This is an error because, by the time Perl discovers a B<-T> in
a script, it's too late to properly taint everything from the
environment.  So Perl gives up.

=item untie attempted while %d inner references still exist

(W) A copy of the object returned from C<tie> (or C<tied>) was still
valid when C<untie> was called.

=item Unrecognized character %s

(F) The Perl parser has no idea what to do with the specified character
in your Perl script (or eval).  Perhaps you tried to run a compressed
script, a binary program, or a directory as a Perl program.

=item Unsupported function fork

(F) Your version of the Perl executable does not support forking.

Note that under some systems, like OS/2, there may be different flavors of
Perl executables, some of which may support fork, some not. Try changing
the name you call Perl by to C<perl_>, C<perl__>, and so on.

=item Use of "$$<digit>" to mean "${$}<digit>" is deprecated

(D) Perl versions before 5.004 misinterpreted any type marker followed
by "$" and a digit.  For example, "$$0" was incorrectly taken to mean
"${$}0" instead of "${$0}".  This bug is (mostly) fixed in Perl 5.004.

However, the developers of Perl 5.004 could not fix this bug completely,
because at least two widely-used modules depend on the old meaning of
"$$0" in a string.  So Perl 5.004 still interprets "$$<digit>" in the
old (broken) way inside strings; but it generates this message as a
warning.  And in Perl 5.005, this special treatment will cease.

=item Value of %s can be "0"; test with defined()

(W) In a conditional expression, you used <HANDLE>, <*> (glob), C<each()>,
or C<readdir()> as a boolean value.  Each of these constructs can return a
value of "0"; that would make the conditional expression false, which is
probably not what you intended.  When using these constructs in conditional
expressions, test their values with the C<defined> operator.
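
The following sketch shows the recommended form.  It assumes a newer
Perl than 5.004, because it reads from an in-memory filehandle (available
from Perl 5.8), and the data is invented for the example:

    my $data = "a\n0";                     # last line is "0", no newline
    open(my $fh, '<', \$data) or die "can't open in-memory file: $!";

    my @seen;
    while (defined(my $line = <$fh>)) {    # explicit defined() test
        chomp $line;
        push @seen, $line;
    }
    close $fh;

    print "read: @seen\n";                 # read: a 0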

=item Variable "%s" may be unavailable

(W) An inner (nested) I<anonymous> subroutine is inside a I<named>
subroutine, and outside that is another subroutine; and the anonymous
(innermost) subroutine is referencing a lexical variable defined in
the outermost subroutine.  For example:

   sub outermost { my $a; sub middle { sub { $a } } }

If the anonymous subroutine is called or referenced (directly or
indirectly) from the outermost subroutine, it will share the variable
as you would expect.  But if the anonymous subroutine is called or
referenced when the outermost subroutine is not active, it will see
the value of the shared variable as it was before and during the
*first* call to the outermost subroutine, which is probably not what
you want.

In these circumstances, it is usually best to make the middle
subroutine anonymous, using the C<sub {}> syntax.  Perl has specific
support for shared variables in nested anonymous subroutines; a named
subroutine in between interferes with this feature.

=item Variable "%s" will not stay shared

(W) An inner (nested) I<named> subroutine is referencing a lexical
variable defined in an outer subroutine.

When the inner subroutine is called, it will probably see the value of
the outer subroutine's variable as it was before and during the
*first* call to the outer subroutine; in this case, after the first
call to the outer subroutine is complete, the inner and outer
subroutines will no longer share a common value for the variable.  In
other words, the variable will no longer be shared.

Furthermore, if the outer subroutine is anonymous and references a
lexical variable outside itself, then the outer and inner subroutines
will I<never> share the given variable.

This problem can usually be solved by making the inner subroutine
anonymous, using the C<sub {}> syntax.  When inner anonymous subs that
reference variables in outer subroutines are called or referenced,
they are automatically rebound to the current values of such
variables.
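
A small sketch of the effect, using hypothetical subroutine names; the
named inner subroutine keeps seeing the first C<$x>, while the anonymous
one is rebound on every call:

    sub outer {
        my $x = shift;
        sub inner_named { return $x }          # bound to the first $x only
        my $inner_anon = sub { return $x };    # rebound on each call
        return (inner_named(), $inner_anon->());
    }

    my @first  = outer(1);
    my @second = outer(2);
    print "@first\n";      # 1 1
    print "@second\n";     # 1 2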

=item Warning: something's wrong

(W) You passed warn() an empty string (the equivalent of C<warn "">) or
you called it with no args and C<$_> was empty.

=item Ill-formed logical name |%s| in prime_env_iter

(W) A warning peculiar to VMS.  A logical name was encountered when preparing
to iterate over %ENV which violates the syntactic rules governing logical
names.  Since it cannot be translated normally, it is skipped, and will not
appear in %ENV.  This may be a benign occurrence, as some software packages
might directly modify logical name tables and introduce nonstandard names,
or it may indicate that a logical name table has been corrupted.

=item Got an error from DosAllocMem

(P) An error peculiar to OS/2.  Most probably you're using an obsolete
version of Perl, and this should not happen anyway.

=item Malformed PERLLIB_PREFIX

(F) An error peculiar to OS/2.  PERLLIB_PREFIX should be of the form

    prefix1;prefix2

or

    prefix1 prefix2

with nonempty prefix1 and prefix2.  If C<prefix1> is indeed a prefix
of a builtin library search path, prefix2 is substituted.  The error
may appear if components are not found, or are too long.  See
"PERLLIB_PREFIX" in F<README.os2>.

=item PERL_SH_DIR too long

(F) An error peculiar to OS/2. PERL_SH_DIR is the directory to find the
C<sh>-shell in.  See "PERL_SH_DIR" in F<README.os2>.

=item Process terminated by SIG%s

(W) This is a standard message issued by OS/2 applications, while *nix
applications die in silence.  It is considered a feature of the OS/2
port.  You can easily disable this with appropriate signal handlers; see
L<perlipc/"Signals">.  See also "Process terminated by SIGTERM/SIGINT"
in F<README.os2>.

=back

=head1 BUGS

If you find what you think is a bug, you might check the headers of
recently posted articles in the comp.lang.perl.misc newsgroup.
There may also be information at http://www.perl.com/perl/ , the Perl
Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Make sure you trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to <F<perlbug@perl.com>> to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.  This file has been
significantly updated for 5.004, so even veteran users should
look through it.

The F<README> file for general stuff.

The F<Copying> file for copyright information.

=head1 HISTORY

Constructed by Tom Christiansen, grabbing material with permission
from innumerable contributors, with kibitzing by more than a few Perl
porters.

Last update: Wed May 14 11:14:09 EDT 1997
perlplan9.pod000064400000012005150344123420007152 0ustar00If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perlplan9 - Plan 9-specific documentation for Perl

=head1 DESCRIPTION

These are a few notes describing features peculiar to
Plan 9 Perl. As such, it is not intended to be a replacement
for the rest of the Perl 5 documentation (which is both 
copious and excellent). If you have any questions to 
which you can't find answers in these man pages, contact 
Luther Huffman at lutherh@stratcom.com and we'll try to 
answer them.

=head2 Invoking Perl

Perl is invoked from the command line as described in 
L<perl>. Most perl scripts, however, do have a first line 
such as "#!/usr/local/bin/perl". This is known as a shebang 
(shell-bang) statement and tells the OS shell where to find 
the perl interpreter. In Plan 9 Perl this statement should be 
"#!/bin/perl" if you wish to be able to directly invoke the 
script by its name.
     Alternatively, you may invoke perl with the command "Perl"
instead of "perl". This will produce Acme-friendly error
messages of the form "filename:18".

Some scripts, usually identified with a *.PL extension, are 
self-configuring and are able to correctly create their own 
shebang path from config information located in Plan 9 
Perl. These you won't need to be worried about.

=head2 What's in Plan 9 Perl

Although Plan 9 Perl currently provides only static
loading, it is built with a number of useful extensions. 
These include Opcode, FileHandle, Fcntl, and POSIX. Expect 
to see others (and DynaLoading!) in the future.

=head2 What's not in Plan 9 Perl

As mentioned previously, dynamic loading isn't currently 
available nor is MakeMaker. Both are high-priority items.

=head2 Perl5 Functions not currently supported in Plan 9 Perl

Some, such as C<chown> and C<umask>, aren't provided
because the concept does not exist within Plan 9. Others,
such as some of the socket-related functions, simply
haven't been written yet. Many in the latter category 
may be supported in the future.

The functions not currently implemented include:

    chown, chroot, dbmclose, dbmopen, getsockopt, 
    setsockopt, recvmsg, sendmsg, getnetbyname, 
    getnetbyaddr, getnetent, getprotoent, getservent, 
    sethostent, setnetent, setprotoent, setservent, 
    endservent, endnetent, endprotoent, umask

There may be several other functions that have undefined 
behavior so this list shouldn't be considered complete.

=head2 Signals in Plan 9 Perl

For compatibility with perl scripts written for the Unix 
environment, Plan 9 Perl uses the POSIX signal emulation
provided in Plan 9's ANSI POSIX Environment (APE). Signal stacking
isn't supported. The signals provided are:

    SIGHUP, SIGINT, SIGQUIT, SIGILL, SIGABRT,
    SIGFPE, SIGKILL, SIGSEGV, SIGPIPE, SIGALRM,
    SIGTERM, SIGUSR1, SIGUSR2, SIGCHLD, SIGCONT,
    SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU

=head1 COMPILING AND INSTALLING PERL ON PLAN 9

WELCOME to Plan 9 Perl, brave soul!

This is a preliminary alpha version of Plan 9 Perl. Still to be
implemented are MakeMaker and DynaLoader. Many perl commands are
missing or currently behave in an inscrutable manner. These gaps will,
with perseverance and a modicum of luck, be remedied in the near
future.

To install this software:

1. Create the source directories and libraries for perl by running the
plan9/setup.rc command (located in the plan9 subdirectory).
Note: the setup routine assumes that you haven't dearchived these
files into /sys/src/cmd/perl. After running setup.rc you may delete
the copy of the source you originally detarred, as source code has now
been installed in /sys/src/cmd/perl. If you plan on installing perl
binaries for all architectures, run "setup.rc -a".

2. After making sure that you have adequate privileges to build system
software, from /sys/src/cmd/perl/5.00301 (adjust version
appropriately) run:

	mk install

If you wish to install perl versions for all architectures (68020,
mips, sparc and 386) run:

	mk installall

3. Wait. The build process will take a *long* time because perl
bootstraps itself. A 75MHz Pentium, 16MB RAM machine takes roughly 30
minutes to build the distribution from scratch.

=head2 Installing Perl Documentation on Plan 9

This perl distribution comes with a tremendous amount of
documentation. To add these to the built-in manuals that come with
Plan 9, from /sys/src/cmd/perl/5.00301 (adjust version appropriately)
run:

	mk man

To begin your reading, start with:

	man perl

This is a good introduction and will direct you towards other man
pages that may interest you.

(Note: "mk man" may produce some extraneous noise. Fear not.)

=head1 BUGS

"As many as there are grains of sand on all the beaches of the 
world . . ." - Carl Sagan

=head1 Revision date

This document was revised 09-October-1996 for Perl 5.003_7.

=head1 AUTHOR

Direct questions, comments, and the unlikely bug report (ahem)
to:

Luther Huffman, lutherh@stratcom.com, 
Strategic Computer Solutions, Inc.

=head1 NAME

perllol - Manipulating Arrays of Arrays in Perl

=head1 DESCRIPTION

=head2 Declaration and Access of Arrays of Arrays

The simplest two-level data structure to build in Perl is an array of
arrays, sometimes casually called a list of lists.  It's reasonably easy to
understand, and almost everything that applies here will also be applicable
later on with the fancier data structures.

An array of an array is just a regular old array @AoA that you can
get at with two subscripts, like C<$AoA[3][2]>.  Here's a declaration
of the array:

    use 5.010;  # so we can use say()

    # assign to our array, an array of array references
    @AoA = (
	   [ "fred", "barney", "pebbles", "bambam", "dino", ],
	   [ "george", "jane", "elroy", "judy", ],
	   [ "homer", "bart", "marge", "maggie", ],
    );
    say $AoA[2][1];
  bart

Now you should be very careful that the outer bracket type
is a round one, that is, a parenthesis.  That's because you're assigning to
an @array, so you need parentheses.  If you wanted there I<not> to be an @AoA,
but rather just a reference to it, you could do something more like this:

    # assign a reference to array of array references
    $ref_to_AoA = [
	[ "fred", "barney", "pebbles", "bambam", "dino", ],
	[ "george", "jane", "elroy", "judy", ],
	[ "homer", "bart", "marge", "maggie", ],
    ];
    say $ref_to_AoA->[2][1];
  bart

Notice that the outer bracket type has changed, and so our access syntax
has also changed.  That's because unlike C, in perl you can't freely
interchange arrays and references thereto.  $ref_to_AoA is a reference to an
array, whereas @AoA is an array proper.  Likewise, C<$AoA[2]> is not an
array, but an array ref.  So how come you can write these:

    $AoA[2][2]
    $ref_to_AoA->[2][2]

instead of having to write these:

    $AoA[2]->[2]
    $ref_to_AoA->[2]->[2]

Well, that's because the rule is that on adjacent brackets only (whether
square or curly), you are free to omit the pointer dereferencing arrow.
But you cannot do so for the very first one if it's a scalar containing
a reference, which means that $ref_to_AoA always needs it.
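
Here's a quick check of that rule, using a cut-down copy of the data
from above.  All four lines print the same element:

    my @AoA = (
        [ "fred", "barney" ],
        [ "homer", "bart" ],
    );
    my $ref_to_AoA = \@AoA;

    print $AoA[1][1], "\n";                # bart
    print $AoA[1]->[1], "\n";              # bart, arrow written out
    print $ref_to_AoA->[1][1], "\n";       # bart
    print $ref_to_AoA->[1]->[1], "\n";     # bart, both arrows written out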

=head2 Growing Your Own

That's all well and good for declaration of a fixed data structure,
but what if you wanted to add new elements on the fly, or build
it up entirely from scratch?

First, let's look at reading it in from a file.  This is something like
adding a row at a time.  We'll assume that there's a flat file in which
each line is a row and each word an element.  If you're trying to develop an
@AoA array containing all these, here's the right way to do that:

    while (<>) {
	@tmp = split;
	push @AoA, [ @tmp ];
    }

You might also have loaded that from a function:

    for $i ( 1 .. 10 ) {
	$AoA[$i] = [ somefunc($i) ];
    }

Or you might have had a temporary variable sitting around with the
array in it.

    for $i ( 1 .. 10 ) {
	@tmp = somefunc($i);
	$AoA[$i] = [ @tmp ];
    }

It's important you make sure to use the C<[ ]> array reference
constructor.  That's because this wouldn't work:

    $AoA[$i] = @tmp;   # WRONG!

The reason that doesn't do what you want is that assigning a
named array like that to a scalar evaluates the array in scalar
context, which just counts the number of elements in @tmp.
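
To see the difference for yourself, here's a tiny sketch with made-up
data that stores the same array both ways:

    my @tmp = ("homer", "bart", "marge");
    my @AoA;

    $AoA[0] = @tmp;        # scalar context: stores the element count
    $AoA[1] = [ @tmp ];    # stores a reference to a fresh copy of @tmp

    print "$AoA[0]\n";             # 3
    print "@{ $AoA[1] }\n";        # homer bart marge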

If you are running under C<use strict> (and if you aren't, why in
the world aren't you?), you'll have to add some declarations to
make it happy:

    use strict;
    my(@AoA, @tmp);
    while (<>) {
	@tmp = split;
	push @AoA, [ @tmp ];
    }

Of course, you don't need the temporary array to have a name at all:

    while (<>) {
	push @AoA, [ split ];
    }

You also don't have to use push().  You could just make a direct assignment
if you knew where you wanted to put it:

    my (@AoA, $i, $line);
    for $i ( 0 .. 10 ) {
	$line = <>;
	$AoA[$i] = [ split " ", $line ];
    }

or even just

    my (@AoA, $i);
    for $i ( 0 .. 10 ) {
	$AoA[$i] = [ split " ", <> ];
    }

You should in general be leery of using functions that could
potentially return lists in scalar context without explicitly stating
such.  This would be clearer to the casual reader:

    my (@AoA, $i);
    for $i ( 0 .. 10 ) {
	$AoA[$i] = [ split " ", scalar(<>) ];
    }

If you wanted to have a $ref_to_AoA variable as a reference to an array,
you'd have to do something like this:

    while (<>) {
	push @$ref_to_AoA, [ split ];
    }

Now you can add new rows.  What about adding new columns?  If you're
dealing with just matrices, it's often easiest to use simple assignment:

    for $x (1 .. 10) {
	for $y (1 .. 10) {
	    $AoA[$x][$y] = func($x, $y);
	}
    }

    for $x ( 3, 7, 9 ) {
	$AoA[$x][20] += func2($x);
    }

It doesn't matter whether those elements are already
there or not: it'll gladly create them for you, setting
intervening elements to C<undef> as need be.
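
For instance, assigning to a single far-off element of an empty array
is enough to bring the row and its enclosing array into being (a small
sketch, with an arbitrary value):

    my @AoA;
    $AoA[2][3] = "x";                       # creates row 2 on the fly

    print scalar(@AoA), "\n";               # 3, rows 0 and 1 are undef
    print scalar(@{ $AoA[2] }), "\n";       # 4, columns 0..2 are undef
    print defined($AoA[0]) ? "defined" : "undef", "\n";    # undef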

If you wanted just to append to a row, you'd have
to do something a bit funnier looking:

    # add new columns to an existing row
    push @{ $AoA[0] }, "wilma", "betty";   # explicit deref

=head2 Access and Printing

Now it's time to print your data structure out.  How
are you going to do that?  Well, if you want only one
of the elements, it's trivial:

    print $AoA[0][0];

If you want to print the whole thing, though, you can't
say

    print @AoA;		# WRONG

because you'll get just references listed, and perl will never
automatically dereference things for you.  Instead, you have to
roll yourself a loop or two.  This prints the whole structure,
using the shell-style for() construct to loop across the outer
set of subscripts.

    for $aref ( @AoA ) {
	say "\t [ @$aref ],";
    }

If you wanted to keep track of subscripts, you might do this:

    for $i ( 0 .. $#AoA ) {
	say "\t elt $i is [ @{$AoA[$i]} ],";
    }

or maybe even this.  Notice the inner loop.

    for $i ( 0 .. $#AoA ) {
	for $j ( 0 .. $#{$AoA[$i]} ) {
	    say "elt $i $j is $AoA[$i][$j]";
	}
    }

As you can see, it's getting a bit complicated.  That's why
sometimes it's easier to take a temporary on your way through:

    for $i ( 0 .. $#AoA ) {
	$aref = $AoA[$i];
	for $j ( 0 .. $#{$aref} ) {
	    say "elt $i $j is $AoA[$i][$j]";
	}
    }

Hmm... that's still a bit ugly.  How about this:

    for $i ( 0 .. $#AoA ) {
	$aref = $AoA[$i];
	$n = @$aref - 1;
	for $j ( 0 .. $n ) {
	    say "elt $i $j is $AoA[$i][$j]";
	}
    }

When you get tired of writing a custom print for your data structures,
you might look at the standard L<Dumpvalue> or L<Data::Dumper> modules.
The former is what the Perl debugger uses, while the latter generates
parsable Perl code.  For example:

 use v5.14;     # using the + prototype, new to v5.14

 sub show(+) {
	require Dumpvalue;
	state $prettily = new Dumpvalue::
			    tick        => q("),
			    compactDump => 1,  # comment these two lines
                                               # out
			    veryCompact => 1,  # if you want a bigger
                                               # dump
			;
	dumpValue $prettily @_;
 }

 # Assign a list of array references to an array.
 my @AoA = (
	   [ "fred", "barney" ],
	   [ "george", "jane", "elroy" ],
	   [ "homer", "marge", "bart" ],
 );
 push @{ $AoA[0] }, "wilma", "betty";
 show @AoA;

will print out:

    0  0..3  "fred" "barney" "wilma" "betty"
    1  0..2  "george" "jane" "elroy"
    2  0..2  "homer" "marge" "bart"

Whereas if you comment out the two lines I said you might wish to,
then it shows it to you this way instead:

    0  ARRAY(0x8031d0)
       0  "fred"
       1  "barney"
       2  "wilma"
       3  "betty"
    1  ARRAY(0x803d40)
       0  "george"
       1  "jane"
       2  "elroy"
    2  ARRAY(0x803e10)
       0  "homer"
       1  "marge"
       2  "bart"
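
For comparison, here is a minimal sketch of the other module mentioned
above, L<Data::Dumper>.  The shortened C<@AoA> and the C<Terse> and
C<Indent> settings are illustrative choices only; the point is that the
dump is Perl source that C<eval> can turn back into a copy of the
structure.

    use strict;
    use warnings;
    use Data::Dumper;

    my @AoA = (
        [ "fred", "barney" ],
        [ "george", "jane", "elroy" ],
    );

    # Terse drops the "$VAR1 =" prefix; Indent = 1 gives a compact layout.
    local $Data::Dumper::Terse  = 1;
    local $Data::Dumper::Indent = 1;

    my $text = Dumper(\@AoA);    # a string of parsable Perl code
    print $text;

    # Round trip: evaluating the dump rebuilds an equivalent structure.
    my $copy = eval $text;
    print $copy->[1][2], "\n";   # elroy

Only C<eval> a dump that your own program produced, since the string is
executed as code.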

=head2 Slices

If you want to get at a slice (part of a row) in a multidimensional
array, you're going to have to do some fancy subscripting.  That's
because while we have a nice synonym for single elements via the
pointer arrow for dereferencing, no such convenience exists for slices.

Here's how to do one operation using a loop.  We'll assume an @AoA
variable as before.

    @part = ();
    $x = 4;
    for ($y = 7; $y < 13; $y++) {
	push @part, $AoA[$x][$y];
    }

That same loop could be replaced with a slice operation:

    @part = @{$AoA[4]}[7..12];

or spaced out a bit:

    @part = @{ $AoA[4] } [ 7..12 ];

But as you might well imagine, this can get pretty rough on the reader.

Ah, but what if you wanted a I<two-dimensional slice>, such as having
$x run from 4..8 and $y run from 7 to 12?  Hmm... here's the simple way:

    @newAoA = ();
    for ($startx = $x = 4; $x <= 8; $x++) {
	for ($starty = $y = 7; $y <= 12; $y++) {
	    $newAoA[$x - $startx][$y - $starty] = $AoA[$x][$y];
	}
    }

We can reduce some of the looping through slices:

    for ($x = 4; $x <= 8; $x++) {
	push @newAoA, [ @{ $AoA[$x] } [ 7..12 ] ];
    }

If you were into Schwartzian Transforms, you would probably
have selected C<map> for that:

    @newAoA = map { [ @{ $AoA[$_] } [ 7..12 ] ] } 4 .. 8;

Although if your manager accused you of seeking job security (or rapid
insecurity) through inscrutable code, it would be hard to argue. :-)
If I were you, I'd put that in a function:

    @newAoA = splice_2D( \@AoA, 4 => 8, 7 => 12 );
    sub splice_2D {
	my $lrr = shift; 	# ref to array of array refs!
	my ($x_lo, $x_hi,
	    $y_lo, $y_hi) = @_;

	return map {
	    [ @{ $lrr->[$_] } [ $y_lo .. $y_hi ] ]
	} $x_lo .. $x_hi;
    }
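
To see what the function returns, here is a small self-contained sketch.
The 4x5 grid and its C<"r$x,c$y"> cell labels are made up for
illustration; C<splice_2D> is the function shown above, repeated so the
example runs on its own.

    use strict;
    use warnings;

    sub splice_2D {
        my $lrr = shift;                    # ref to array of array refs
        my ($x_lo, $x_hi, $y_lo, $y_hi) = @_;
        return map {
            [ @{ $lrr->[$_] } [ $y_lo .. $y_hi ] ]
        } $x_lo .. $x_hi;
    }

    # Build a grid whose cells record their own coordinates.
    my @AoA = map { my $x = $_; [ map { "r$x,c$_" } 0 .. 4 ] } 0 .. 3;

    # Rows 1..2, columns 2..3.
    my @newAoA = splice_2D( \@AoA, 1 => 2, 2 => 3 );
    print "@$_\n" for @newAoA;
    # r1,c2 r1,c3
    # r2,c2 r2,c3

Because each row is rebuilt inside C<[ ... ]>, the result holds copies
of the cell values, so later changes to C<@newAoA> leave C<@AoA> alone.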


=head1 SEE ALSO

L<perldata>, L<perlref>, L<perldsc>

=head1 AUTHOR

Tom Christiansen <F<tchrist@perl.com>>

Last update: Tue Apr 26 18:30:55 MDT 2011
=head1 NAME
X<operator>

perlop - Perl operators and precedence

=head1 DESCRIPTION

In Perl, the operator determines what operation is performed,
independent of the type of the operands.  For example S<C<$x + $y>>
is always a numeric addition, and if C<$x> or C<$y> do not contain
numbers, an attempt is made to convert them to numbers first.

This is in contrast to many other dynamic languages, where the
operation is determined by the type of the first argument.  It also
means that Perl has two versions of some operators, one for numeric
and one for string comparison.  For example S<C<$x == $y>> compares
two numbers for equality, and S<C<$x eq $y>> compares two strings.

There are a few exceptions though: C<x> can be either string
repetition or list repetition, depending on the type of the left
operand, and C<&>, C<|>, C<^> and C<~> can be either string or numeric bit
operations.
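
A short sketch of the difference, using made-up values: the same two
operands give different answers depending only on which operator is
applied to them.

    use strict;
    use warnings;

    my ($x, $y) = ("10", "9");

    my $sum  = $x + $y;                       # 19:  numeric addition
    my $cat  = $x . $y;                       # 109: string concatenation
    my $num  = ($x <  $y) ? "yes" : "no";     # no:  10 < 9 is false
    my $str  = ($x lt $y) ? "yes" : "no";     # yes: "10" sorts before "9"
    print "$sum $cat $num $str\n";            # 19 109 no yes

    # "x" picks its behaviour from the form of its left operand.
    my $rep  = "ab" x 3;                      # "ababab"
    my @list = ("a", "b") x 2;                # ("a", "b", "a", "b")
    print "$rep @list\n";                     # ababab a b a b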

=head2 Operator Precedence and Associativity
X<operator, precedence> X<precedence> X<associativity>

Operator precedence and associativity work in Perl more or less like
they do in mathematics.

I<Operator precedence> means some operators are evaluated before
others.  For example, in S<C<2 + 4 * 5>>, the multiplication has higher
precedence so S<C<4 * 5>> is evaluated first yielding S<C<2 + 20 ==
22>> and not S<C<6 * 5 == 30>>.

I<Operator associativity> defines what happens if a sequence of the
same operators is used one after another: whether the evaluator will
evaluate the left operations first, or the right first.  For example, in
S<C<8 - 4 - 2>>, subtraction is left associative so Perl evaluates the
expression left to right.  S<C<8 - 4>> is evaluated first making the
expression S<C<4 - 2 == 2>> and not S<C<8 - 2 == 6>>.
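
The two examples above can be checked directly.  The parenthesized
variants are added here only to show how the result changes when the
default grouping is overridden.

    use strict;
    use warnings;

    my $sum       = 2 + 4 * 5;       # 22: "*" binds tighter than "+"
    my $forced    = (2 + 4) * 5;     # 30: parentheses override precedence
    my $diff      = 8 - 4 - 2;       # 2:  "-" is left associative
    my $regrouped = 8 - (4 - 2);     # 6:  what right associativity would give
    print "$sum $forced $diff $regrouped\n";    # 22 30 2 6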

Perl operators have the following associativity and precedence,
listed from highest precedence to lowest.  Operators borrowed from
C keep the same precedence relationship with each other, even where
C's precedence is slightly screwy.  (This makes learning Perl easier
for C folks.)  With very few exceptions, these all operate on scalar
values only, not array values.

    left	terms and list operators (leftward)
    left	->
    nonassoc	++ --
    right	**
    right	! ~ \ and unary + and -
    left	=~ !~
    left	* / % x
    left	+ - .
    left	<< >>
    nonassoc	named unary operators
    nonassoc	< > <= >= lt gt le ge
    nonassoc	== != <=> eq ne cmp ~~
    left	&
    left	| ^
    left	&&
    left	|| //
    nonassoc	..  ...
    right	?:
    right	= += -= *= etc. goto last next redo dump
    left	, =>
    nonassoc	list operators (rightward)
    right	not
    left	and
    left	or xor

In the following sections, these operators are covered in detail, in the
same order in which they appear in the table above.

Many operators can be overloaded for objects.  See L<overload>.

=head2 Terms and List Operators (Leftward)
X<list operator> X<operator, list> X<term>

A TERM has the highest precedence in Perl.  Terms include variables,
quote and quote-like operators, any expression in parentheses,
and any function whose arguments are parenthesized.  Actually, there
aren't really functions in this sense, just list operators and unary
operators behaving as functions because you put parentheses around
the arguments.  These are all documented in L<perlfunc>.

If any list operator (C<print()>, etc.) or any unary operator (C<chdir()>, etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
just like a normal function call.

In the absence of parentheses, the precedence of list operators such as
C<print>, C<sort>, or C<chmod> is either very high or very low depending on
whether you are looking at the left side or the right side of the operator.
For example, in

    @ary = (1, 3, sort 4, 2);
    print @ary;		# prints 1324

the commas on the right of the C<sort> are evaluated before the C<sort>,
but the commas on the left are evaluated after.  In other words,
list operators tend to gobble up all arguments that follow, and
then act like a simple TERM with regard to the preceding expression.
Be careful with parentheses:

    # These evaluate exit before doing the print:
    print($foo, exit);	# Obviously not what you want.
    print $foo, exit;	# Nor is this.

    # These do the print before evaluating exit:
    (print $foo), exit;	# This is what you want.
    print($foo), exit;	# Or this.
    print ($foo), exit;	# Or even this.

Also note that

    print ($foo & 255) + 1, "\n";

probably doesn't do what you expect at first glance.  The parentheses
enclose the argument list for C<print> which is evaluated (printing
the result of S<C<$foo & 255>>).  Then one is added to the return value
of C<print> (usually 1).  The result is something like this:

    1 + 1, "\n";    # Obviously not what you meant.

To do what you meant properly, you must write:

    print(($foo & 255) + 1, "\n");

See L</Named Unary Operators> for more discussion of this.

Also parsed as terms are the S<C<do {}>> and S<C<eval {}>> constructs, as
well as subroutine and method calls, and the anonymous
constructors C<[]> and C<{}>.

See also L</Quote and Quote-like Operators> toward the end of this section,
as well as L</"I/O Operators">.

=head2 The Arrow Operator
X<arrow> X<dereference> X<< -> >>

"C<< -> >>" is an infix dereference operator, just as it is in C
and C++.  If the right side is either a C<[...]>, C<{...}>, or a
C<(...)> subscript, then the left side must be either a hard or
symbolic reference to an array, a hash, or a subroutine respectively.
(Or technically speaking, a location capable of holding a hard
reference, if it's an array or hash reference being used for
assignment.)  See L<perlreftut> and L<perlref>.

Otherwise, the right side is a method name or a simple scalar
variable containing either the method name or a subroutine reference,
and the left side must be either an object (a blessed reference)
or a class name (that is, a package name).  See L<perlobj>.

The dereferencing cases (as opposed to method-calling cases) are
somewhat extended by the C<postderef> feature.  For the
details of that feature, consult L<perlref/Postfix Dereference Syntax>.
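
Here is a minimal sketch of each use of the arrow.  The C<Counter>
class and its C<new> and C<bump> methods are hypothetical, invented
for this illustration only.

    use strict;
    use warnings;

    my $aref = [ 10, 20, 30 ];
    my $href = { color => "red" };
    my $cref = sub { return "called with @_" };

    # Subscript forms: array element, hash element, subroutine call.
    my $elem  = $aref->[1];          # 20
    my $color = $href->{color};      # red
    my $msg   = $cref->(1, 2);       # called with 1 2

    # Method forms: a class name or an object on the left.
    {
        package Counter;             # hypothetical class
        sub new  { my ($class) = @_; return bless { n => 0 }, $class }
        sub bump { my ($self)  = @_; return ++$self->{n} }
    }

    my $obj    = Counter->new;       # class name as invocant
    my $method = "bump";
    $obj->bump;                      # literal method name
    my $n = $obj->$method;           # method name held in a scalar
    print "$elem $color $n\n";       # 20 red 2
    print "$msg\n";                  # called with 1 2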

=head2 Auto-increment and Auto-decrement
X<increment> X<auto-increment> X<++> X<decrement> X<auto-decrement> X<-->

C<"++"> and C<"--"> work as in C.  That is, if placed before a variable,
they increment or decrement the variable by one before returning the
value, and if placed after, increment or decrement after returning the
value.

    $i = 0;  $j = 0;
    print $i++;  # prints 0
    print ++$j;  # prints 1

Note that just as in C, Perl doesn't define B<when> the variable is
incremented or decremented.  You just know it will be done sometime
before or after the value is returned.  This also means that modifying
a variable twice in the same statement will lead to undefined behavior.
Avoid statements like:

    $i = $i ++;
    print ++ $i + $i ++;

Perl will not guarantee what the result of the above statements is.

The auto-increment operator has a little extra builtin magic to it.  If
you increment a variable that is numeric, or that has ever been used in
a numeric context, you get a normal increment.  If, however, the
variable has been used in only string contexts since it was set, and
has a value that is not the empty string and matches the pattern
C</^[a-zA-Z]*[0-9]*\z/>, the increment is done as a string, preserving each
character within its range, with carry:

    print ++($foo = "99");	# prints "100"
    print ++($foo = "a0");	# prints "a1"
    print ++($foo = "Az");	# prints "Ba"
    print ++($foo = "zz");	# prints "aaa"

C<undef> is always treated as numeric, and in particular is changed
to C<0> before incrementing (so that a post-increment of an undef value
will return C<0> rather than C<undef>).

The auto-decrement operator is not magical.
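
The following sketch contrasts the cases described above; the variable
names and values are arbitrary.  The C<no warnings 'numeric'> line is
there only because decrementing a non-numeric string would otherwise
warn.

    use strict;
    use warnings;

    my $id = "Zz";
    $id++;                       # string increment with carry
    print "$id\n";               # AAa

    my $ver = "a9";
    $ver++;                      # digits carry into the letters
    print "$ver\n";              # b0

    my $count;                   # undef
    my $old = $count++;          # undef is treated as 0
    print "$old $count\n";       # 0 1

    my $s = "ab";
    { no warnings 'numeric'; $s--; }    # no magic: "ab" becomes 0 first
    print "$s\n";                # -1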

=head2 Exponentiation
X<**> X<exponentiation> X<power>

Binary C<"**"> is the exponentiation operator.  It binds even more
tightly than unary minus, so C<-2**4> is C<-(2**4)>, not C<(-2)**4>.
(This is
implemented using C's C<pow(3)> function, which actually works on doubles
internally.)
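
A few illustrative cases (the numbers are arbitrary) show how the
binding rules and the use of doubles play out:

    use strict;
    use warnings;

    my $neg   = -2 ** 4;        # -16: parsed as -(2 ** 4)
    my $pos   = (-2) ** 4;      # 16
    my $tower = 2 ** 3 ** 2;    # 512: right associative, 2 ** (3 ** 2)
    my $half  = 2 ** -1;        # 0.5: a fractional result is fine
    print "$neg $pos $tower $half\n";    # -16 16 512 0.5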

Note that certain exponentiation expressions are ill-defined:
these include C<0**0>, C<1**Inf>, and C<Inf**0>.  Do not expect
any particular results from these special cases; the results
are platform-dependent.

=head2 Symbolic Unary Operators
X<unary operator> X<operator, unary>

Unary C<"!"> performs logical negation, that is, "not".  See also
L<C<not>|/Logical Not> for a lower precedence version of this.
X<!>

Unary C<"-"> performs arithmetic negation if the operand is numeric,
including any string that looks like a number.  If the operand is
an identifier, a string consisting of a minus sign concatenated
with the identifier is returned.  Otherwise, if the string starts
with a plus or minus, a string starting with the opposite sign is
returned.  One effect of these rules is that C<-bareword> is equivalent
to the string C<"-bareword">.  If, however, the string begins with a
non-alphabetic character (excluding C<"+"> or C<"-">), Perl will attempt
to convert
the string to a numeric, and the arithmetic negation is performed.  If the
string cannot be cleanly converted to a numeric, Perl will give the warning
B<Argument "the string" isn't numeric in negation (-) at ...>.
X<-> X<negation, arithmetic>
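
The string cases above can be seen side by side in this small sketch,
which uses quoted strings with arbitrary contents:

    use strict;
    use warnings;

    my @results = (
        -"foo",     # "-foo": an identifier gets a minus sign prepended
        -"-foo",    # "+foo": a leading minus flips to plus
        -"+foo",    # "-foo": a leading plus flips to minus
        -"10",      # -10:    looks like a number, so it is negated
    );
    print "@results\n";    # -foo +foo -foo -10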

Unary C<"~"> performs bitwise negation, that is, 1's complement.  For
example, S<C<0666 & ~027>> is 0640.  (See also L</Integer Arithmetic> and
L</Bitwise String Operators>.)  Note that the width of the result is
platform-dependent: C<~0> is 32 bits wide on a 32-bit platform, but 64
bits wide on a 64-bit platform, so if you are expecting a certain bit
width, remember to use the C<"&"> operator to mask off the excess bits.
X<~> X<negation, binary>
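
As a sketch of the masking advice, the following computes the example
above and then shows a complement whose masked value does not depend on
the platform.  The C<0xFF> mask is just one possible width.

    use strict;
    use warnings;

    # Clear the bits of 027 from 0666.
    my $perm = 0666 & ~027;
    printf "%04o\n", $perm;           # 0640

    # ~0x0F has every higher bit set; masking keeps only the low byte,
    # so the result is the same on 32-bit and 64-bit builds.
    my $byte = ~0x0F & 0xFF;
    printf "%#x\n", $byte;            # 0xf0

    # The unmasked width depends on how perl was built.
    my $width = length( sprintf "%b", ~0 );
    print "$width\n";                 # 32 or 64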

When complementing strings, if all characters have ordinal values under
256, then their complements will, also.  But if they do not, all
characters will be in either 32- or 64-bit complements, depending on your
architecture.  So for example, C<~"\x{3B1}"> is C<"\x{FFFF_FC4E}"> on
32-bit machines and C<"\x{FFFF_FFFF_FFFF_FC4E}"> on 64-bit machines.

If the experimental "bitwise" feature is enabled via S<C<use feature
'bitwise'>>, then unary C<"~"> always treats its argument as a number, and an
alternate form of the operator, C<"~.">, always treats its argument as a
string.  So C<~0> and C<~"0"> will both give 2**32-1 on 32-bit platforms,
whereas C<~.0> and C<~."0"> will both yield C<"\xff">.  This feature
produces a warning unless you use S<C<no warnings 'experimental::bitwise'>>.

Unary C<"+"> has no effect whatsoever, even on strings.  It is useful
syntactically for separating a function name from a parenthesized expression
that would otherwise be interpreted as the complete list of function
arguments.  (See examples above under L</Terms and List Operators (Leftward)>.)
X<+>

Unary C<"\"> creates a reference to whatever follows it.  See L<perlreftut>
and L<perlref>.  Do not confuse this behavior with the behavior of
backslash within a string, although both forms do convey the notion
of protecting the next thing from interpolation.
X<\> X<reference> X<backslash>

=head2 Binding Operators
X<binding> X<operator, binding> X<=~> X<!~>

Binary C<"=~"> binds a scalar expression to a pattern match.  Certain operations
search or modify the string C<$_> by default.  This operator makes that kind
of operation work on some other string.  The right argument is a search
pattern, substitution, or transliteration.  The left argument is what is
supposed to be searched, substituted, or transliterated instead of the default
C<$_>.  When used in scalar context, the return value generally indicates the
success of the operation.  The exceptions are substitution (C<s///>)
and transliteration (C<y///>) with the C</r> (non-destructive) option,
which cause the B<r>eturn value to be the result of the substitution.
Behavior in list context depends on the particular operator.
See L</"Regexp Quote-Like Operators"> for details and L<perlretut> for
examples using these operators.

If the right argument is an expression rather than a search pattern,
substitution, or transliteration, it is interpreted as a search pattern at run
time.  Note that this means that its
contents will be interpolated twice, so

    '\\' =~ q'\\';

is not ok, as the regex engine will end up trying to compile the
pattern C<\>, which it will consider a syntax error.

Binary C<"!~"> is just like C<"=~"> except the return value is negated in
the logical sense.

Binary C<"!~"> with a non-destructive substitution (C<s///r>) or transliteration
(C<y///r>) is a syntax error.

=head2 Multiplicative Operators
X<operator, multiplicative>

Binary C<"*"> multiplies two numbers.
X<*>

Binary C<"/"> divides two numbers.
X</> X<slash>

Binary C<"%"> is the modulo operator, which computes the division
remainder of its first argument with respect to its second argument.
Given integer
operands C<$m> and C<$n>: If C<$n> is positive, then S<C<$m % $n>> is
C<$m> minus the largest multiple of C<$n> less than or equal to
C<$m>.  If C<$n> is negative, then S<C<$m % $n>> is C<$m> minus the
smallest multiple of C<$n> that is not less than C<$m> (that is, the
result will be less than or equal to zero).  If the operands
C<$m> and C<$n> are floating point values and the absolute value of
C<$n> (that is C<abs($n)>) is less than S<C<(UV_MAX + 1)>>, only
the integer portion of C<$m> and C<$n> will be used in the operation
(Note: here C<UV_MAX> means the maximum of the unsigned integer type).
If the absolute value of the right operand (C<abs($n)>) is greater than
or equal to S<C<(UV_MAX + 1)>>, C<"%"> computes the floating-point remainder
C<$r> in the equation S<C<($r = $m - $i*$n)>> where C<$i> is a certain
integer that makes C<$r> have the same sign as the right operand
C<$n> (B<not> as the left operand C<$m> like C function C<fmod()>)
and the absolute value less than that of C<$n>.
Note that when S<C<use integer>> is in scope, C<"%"> gives you direct access
to the modulo operator as implemented by your C compiler.  This
operator is not as well defined for negative operands, but it will
execute faster.
X<%> X<remainder> X<modulo> X<mod>
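
The sign rules are easiest to see with small integers.  In this sketch
(the operands are arbitrary) the result always takes the sign of the
right operand, and floating point operands are truncated first:

    use strict;
    use warnings;

    my @r = (
         7 %  3,      #  1
        -7 %  3,      #  2:  -7 minus -9, the largest multiple of 3 <= -7
         7 % -3,      # -2:   7 minus  9, the smallest multiple of -3 >= 7
        -7 % -3,      # -1:  -7 minus -6
        7.9 % 3.9,    #  1:  only the integer parts, 7 % 3, are used
    );
    print "@r\n";     # 1 2 -2 -1 1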

Binary C<"x"> is the repetition operator.  In scalar context or if the left
operand is not enclosed in parentheses, it returns a string consisting
of the left operand repeated the number of times specified by the right
operand.  In list context, if the left operand is enclosed in
parentheses or is a list formed by C<qw/I<STRING>/>, it repeats the list.
If the right operand is zero or negative (raising a warning on
negative), it returns an empty string
or an empty list, depending on the context.
X<x>

    print '-' x 80;		# print row of dashes

    print "\t" x ($tab/8), ' ' x ($tab%8);	# tab over

    @ones = (1) x 80;		# a list of 80 1's
    @ones = (5) x @ones;	# set all elements to 5


=head2 Additive Operators
X<operator, additive>

Binary C<"+"> returns the sum of two numbers.
X<+>

Binary C<"-"> returns the difference of two numbers.
X<->

Binary C<"."> concatenates two strings.
X<string, concatenation> X<concatenation>
X<cat> X<concat> X<concatenate> X<.>

=head2 Shift Operators
X<shift operator> X<operator, shift> X<<< << >>>
X<<< >> >>> X<right shift> X<left shift> X<bitwise shift>
X<shl> X<shr> X<shift, right> X<shift, left>

Binary C<<< "<<" >>> returns the value of its left argument shifted left by the
number of bits specified by the right argument.  Arguments should be
integers.  (See also L</Integer Arithmetic>.)

Binary C<<< ">>" >>> returns the value of its left argument shifted right by
the number of bits specified by the right argument.  Arguments should
be integers.  (See also L</Integer Arithmetic>.)

If S<C<use integer>> (see L</Integer Arithmetic>) is in force then
signed C integers are used (I<arithmetic shift>), otherwise unsigned C
integers are used (I<logical shift>), even for negative shiftees.
In arithmetic right shift the sign bit is replicated on the left,
in logical shift zero bits come in from the left.
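
A minimal sketch of the two behaviors, using an arbitrary negative
shiftee.  The exact value of the logical shift depends on the integer
size perl was built with, so only its sign is shown.

    use strict;
    use warnings;

    my $n = -8;

    # Default: the shiftee is treated as unsigned (logical shift), so
    # zero bits come in from the left and the result is positive.
    my $logical = $n >> 1;

    # Under "use integer" the shift is arithmetic: the sign bit is
    # replicated on the left.
    my $arith = do { use integer; $n >> 1 };

    print "$arith\n";                                        # -4
    print( ($logical > 0 ? "positive" : "negative"), "\n" ); # positive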

Either way, the implementation isn't going to generate results larger
than the size of the integer type Perl was built with (32 bits or 64 bits).

Shifting by a negative number of bits means the reverse shift: left
shift becomes right shift, right shift becomes left shift.  This is
unlike in C, where negative shift is undefined.

Shifting by more bits than the size of the integers usually yields
zero (all bits fall off), except that under S<C<use integer>>
right overshifting a negative shiftee results in -1.  This is unlike
in C, where shifting by too many bits is undefined.  A common C
behavior is "shift by modulo wordbits", so that for example

    1 >> 64 == 1 >> (64 % 64) == 1 >> 0 == 1  # Common C behavior.

but that is completely accidental.

If you get tired of being subject to your platform's native integers,
the S<C<use bigint>> pragma neatly sidesteps the issue altogether:

    print 20 << 20;  # 20971520
    print 20 << 40;  # 5120 on 32-bit machines,
                     # 21990232555520 on 64-bit machines
    use bigint;
    print 20 << 100; # 25353012004564588029934064107520

=head2 Named Unary Operators
X<operator, named unary>

The various named unary operators are treated as functions with one
argument, with optional parentheses.

If any list operator (C<print()>, etc.) or any unary operator (C<chdir()>, etc.)
is followed by a left parenthesis as the next token, the operator and
arguments within parentheses are taken to be of highest precedence,
just like a normal function call.  For example,
because named unary operators are higher precedence than C<||>:

    chdir $foo    || die;	# (chdir $foo) || die
    chdir($foo)   || die;	# (chdir $foo) || die
    chdir ($foo)  || die;	# (chdir $foo) || die
    chdir +($foo) || die;	# (chdir $foo) || die

but, because C<"*"> is higher precedence than named operators:

    chdir $foo * 20;	# chdir ($foo * 20)
    chdir($foo) * 20;	# (chdir $foo) * 20
    chdir ($foo) * 20;	# (chdir $foo) * 20
    chdir +($foo) * 20;	# chdir ($foo * 20)

    rand 10 * 20;	# rand (10 * 20)
    rand(10) * 20;	# (rand 10) * 20
    rand (10) * 20;	# (rand 10) * 20
    rand +(10) * 20;	# rand (10 * 20)

Regarding precedence, the filetest operators, like C<-f>, C<-M>, etc. are
treated like named unary operators, but they don't follow this functional
parenthesis rule.  That means, for example, that C<-f($file).".bak"> is
equivalent to S<C<-f "$file.bak">>.
X<-X> X<filetest> X<operator, filetest>

See also L</"Terms and List Operators (Leftward)">.

=head2 Relational Operators
X<relational operator> X<operator, relational>

Perl operators that return true or false generally return values
that can be safely used as numbers.  For example, the relational
operators in this section and the equality operators in the next
one return C<1> for true and, for false, a special version of the
defined empty string, C<"">, which counts as a zero but is exempt from
warnings about improper numeric conversions, just as S<C<"0 but true">> is.
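
The following sketch, with arbitrary comparisons, shows both return
values and that the false value can be used as a number even with
warnings enabled:

    use strict;
    use warnings;

    my $t = (2 > 1);
    my $f = (2 < 1);

    print "true is [$t]\n";          # true is [1]
    print "false is [$f]\n";         # false is []

    my $as_number = $f + 0;          # no "isn't numeric" warning here
    print "$as_number\n";            # 0
    print( (defined($f) ? "defined" : "undef"), "\n" );   # defined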

Binary C<< "<" >> returns true if the left argument is numerically less than
the right argument.
X<< < >>

Binary C<< ">" >> returns true if the left argument is numerically greater
than the right argument.
X<< > >>

Binary C<< "<=" >> returns true if the left argument is numerically less than
or equal to the right argument.
X<< <= >>

Binary C<< ">=" >> returns true if the left argument is numerically greater
than or equal to the right argument.
X<< >= >>

Binary C<"lt"> returns true if the left argument is stringwise less than
the right argument.
X<< lt >>

Binary C<"gt"> returns true if the left argument is stringwise greater
than the right argument.
X<< gt >>

Binary C<"le"> returns true if the left argument is stringwise less than
or equal to the right argument.
X<< le >>

Binary C<"ge"> returns true if the left argument is stringwise greater
than or equal to the right argument.
X<< ge >>

=head2 Equality Operators
X<equality> X<equal> X<equals> X<operator, equality>

Binary C<< "==" >> returns true if the left argument is numerically equal to
the right argument.
X<==>

Binary C<< "!=" >> returns true if the left argument is numerically not equal
to the right argument.
X<!=>

Binary C<< "<=>" >> returns -1, 0, or 1 depending on whether the left
argument is numerically less than, equal to, or greater than the right
argument.  If your platform supports C<NaN>'s (not-a-numbers) as numeric
values, using them with C<< "<=>" >> returns undef.  C<NaN> is not
C<< "<" >>, C<< "==" >>, C<< ">" >>, C<< "<=" >> or C<< ">=" >> anything
(even C<NaN>), so those 5 return false.  S<C<< NaN != NaN >>> returns
true, as does S<C<NaN !=> I<anything else>>.  If your platform doesn't
support C<NaN>'s then C<NaN> is just a string with numeric value 0.
X<< <=> >>
X<spaceship>

    $ perl -le '$x = "NaN"; print "No NaN support here" if $x == $x'
    $ perl -le '$x = "NaN"; print "NaN support here" if $x != $x'

(Note that the L<bigint>, L<bigrat>, and L<bignum> pragmas all
support C<"NaN">.)

Binary C<"eq"> returns true if the left argument is stringwise equal to
the right argument.
X<eq>

Binary C<"ne"> returns true if the left argument is stringwise not equal
to the right argument.
X<ne>

Binary C<"cmp"> returns -1, 0, or 1 depending on whether the left
argument is stringwise less than, equal to, or greater than the right
argument.
X<cmp>
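
The most common use of C<< <=> >> and C<cmp> is as the comparison in a
C<sort> block.  This sketch sorts the same made-up list both ways to
show how the numeric and stringwise orders differ:

    use strict;
    use warnings;

    my @data = (10, 9, 100);

    my @numeric = sort { $a <=> $b } @data;   # by numeric value
    my @string  = sort { $a cmp $b } @data;   # by string comparison

    print "@numeric\n";    # 9 10 100
    print "@string\n";     # 10 100 9

Swapping C<$a> and C<$b> in either block reverses the order.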

Binary C<"~~"> does a smartmatch between its arguments.  Smart matching
is described in the next section.
X<~~>

C<"lt">, C<"le">, C<"ge">, C<"gt"> and C<"cmp"> use the collation (sort)
order specified by the current C<LC_COLLATE> locale if a S<C<use
locale>> form that includes collation is in effect.  See L<perllocale>.
Do not mix these with Unicode;
use them only with legacy 8-bit locale encodings.
The standard C<L<Unicode::Collate>> and
C<L<Unicode::Collate::Locale>> modules offer much more powerful
solutions to collation issues.

For case-insensitive comparisons, look at the L<perlfunc/fc> case-folding
function, available in Perl v5.16 or later:

    if ( fc($x) eq fc($y) ) { ... }

=head2 Smartmatch Operator

First available in Perl 5.10.1 (the 5.10.0 version behaved differently),
binary C<~~> does a "smartmatch" between its arguments.  This is mostly
used implicitly in the C<when> construct described in L<perlsyn>, although
not all C<when> clauses call the smartmatch operator.  Unique among all of
Perl's operators, the smartmatch operator can recurse.  The smartmatch
operator is L<experimental|perlpolicy/experimental> and its behavior is
subject to change.

It is also unique in that all other Perl operators impose a context
(usually string or numeric context) on their operands, autoconverting
those operands to those imposed contexts.  In contrast, smartmatch
I<infers> contexts from the actual types of its operands and uses that
type information to select a suitable comparison mechanism.

The C<~~> operator compares its operands "polymorphically", determining how
to compare them according to their actual types (numeric, string, array,
hash, etc.)  Like the equality operators with which it shares the same
precedence, C<~~> returns 1 for true and C<""> for false.  It is often best
read aloud as "in", "inside of", or "is contained in", because the left
operand is often looked for I<inside> the right operand.  That makes the
order of the operands to the smartmatch operator often opposite that of
the regular match operator.  In other words, the "smaller" thing is usually
placed in the left operand and the larger one in the right.

The behavior of a smartmatch depends on what type of things its arguments
are, as determined by the following table.  The first row of the table
whose types apply determines the smartmatch behavior.  Because what
actually happens is mostly determined by the type of the second operand,
the table is sorted on the right operand instead of on the left.

 Left      Right      Description and pseudocode
 ===============================================================
 Any       undef      check whether Any is undefined
                like: !defined Any

 Any       Object     invoke ~~ overloading on Object, or die

 Right operand is an ARRAY:

 Left      Right      Description and pseudocode
 ===============================================================
 ARRAY1    ARRAY2     recurse on paired elements of ARRAY1 and ARRAY2[2]
                like: (ARRAY1[0] ~~ ARRAY2[0])
                        && (ARRAY1[1] ~~ ARRAY2[1]) && ...
 HASH      ARRAY      any ARRAY elements exist as HASH keys
                like: grep { exists HASH->{$_} } ARRAY
 Regexp    ARRAY      any ARRAY elements pattern match Regexp
                like: grep { /Regexp/ } ARRAY
 undef     ARRAY      undef in ARRAY
                like: grep { !defined } ARRAY
 Any       ARRAY      smartmatch each ARRAY element[3]
                like: grep { Any ~~ $_ } ARRAY

 Right operand is a HASH:

 Left      Right      Description and pseudocode
 ===============================================================
 HASH1     HASH2      all same keys in both HASHes
                like: keys HASH1 ==
                         grep { exists HASH2->{$_} } keys HASH1
 ARRAY     HASH       any ARRAY elements exist as HASH keys
                like: grep { exists HASH->{$_} } ARRAY
 Regexp    HASH       any HASH keys pattern match Regexp
                like: grep { /Regexp/ } keys HASH
 undef     HASH       always false (undef can't be a key)
                like: 0 == 1
 Any       HASH       HASH key existence
                like: exists HASH->{Any}

 Right operand is CODE:

 Left      Right      Description and pseudocode
 ===============================================================
 ARRAY     CODE       sub returns true on all ARRAY elements[1]
                like: !grep { !CODE->($_) } ARRAY
 HASH      CODE       sub returns true on all HASH keys[1]
                like: !grep { !CODE->($_) } keys HASH
 Any       CODE       sub passed Any returns true
                like: CODE->(Any)

 Right operand is a Regexp:

 Left      Right      Description and pseudocode
 ===============================================================
 ARRAY     Regexp     any ARRAY elements match Regexp
                like: grep { /Regexp/ } ARRAY
 HASH      Regexp     any HASH keys match Regexp
                like: grep { /Regexp/ } keys HASH
 Any       Regexp     pattern match
                like: Any =~ /Regexp/

 Other:

 Left      Right      Description and pseudocode
 ===============================================================
 Object    Any        invoke ~~ overloading on Object,
                      or fall back to...

 Any       Num        numeric equality
                like: Any == Num
 Num       nummy[4]   numeric equality
                like: Num == nummy
 undef     Any        check whether undefined
                like: !defined(Any)
 Any       Any        string equality
                like: Any eq Any


Notes:

=over

=item 1.
Empty hashes or arrays match.

=item 2.
That is, each element smartmatches the element of the same index in the other array.[3]

=item 3.
If a circular reference is found, fall back to referential equality.

=item 4.
Either an actual number, or a string that looks like one.

=back

The smartmatch implicitly dereferences any non-blessed hash or array
reference, so the C<I<HASH>> and C<I<ARRAY>> entries apply in those cases.
For blessed references, the C<I<Object>> entries apply.  Smartmatches
involving hashes only consider hash keys, never hash values.

The "like" code entry is not always an exact rendition.  For example, the
smartmatch operator short-circuits whenever possible, but C<grep> does
not.  Also, C<grep> in scalar context returns the number of matches, but
C<~~> returns only true or false.

Unlike most operators, the smartmatch operator knows to treat C<undef>
specially:

    use v5.10.1;
    @array = (1, 2, 3, undef, 4, 5);
    say "some elements undefined" if undef ~~ @array;

Each operand is considered in a modified scalar context, the modification
being that array and hash variables are passed by reference to the
operator, which implicitly dereferences them.  Both elements
of each pair are the same:

    use v5.10.1;

    my %hash = (red    => 1, blue   => 2, green  => 3,
                orange => 4, yellow => 5, purple => 6,
                black  => 7, grey   => 8, white  => 9);

    my @array = qw(red blue green);

    say "some array elements in hash keys" if  @array ~~  %hash;
    say "some array elements in hash keys" if \@array ~~ \%hash;

    say "red in array" if "red" ~~  @array;
    say "red in array" if "red" ~~ \@array;

    say "some keys end in e" if /e$/ ~~  %hash;
    say "some keys end in e" if /e$/ ~~ \%hash;

Two arrays smartmatch if each element in the first array smartmatches
(that is, is "in") the corresponding element in the second array,
recursively.

    use v5.10.1;
    my @little = qw(red blue green);
    my @bigger = ("red", "blue", [ "orange", "green" ] );
    if (@little ~~ @bigger) {  # true!
        say "little is contained in bigger";
    }

Because the smartmatch operator recurses on nested arrays, this
will still report that "red" is in the array.

    use v5.10.1;
    my @array = qw(red blue green);
    my $nested_array = [[[[[[[ @array ]]]]]]];
    say "red in array" if "red" ~~ $nested_array;

If two arrays smartmatch each other, then they are deep
copies of each other's values, as this example reports:

    use v5.12.0;
    my @a = (0, 1, 2, [3, [4, 5], 6], 7);
    my @b = (0, 1, 2, [3, [4, 5], 6], 7);

    if (@a ~~ @b && @b ~~ @a) {
        say "a and b are deep copies of each other";
    }
    elsif (@a ~~ @b) {
        say "a smartmatches in b";
    }
    elsif (@b ~~ @a) {
        say "b smartmatches in a";
    }
    else {
        say "a and b don't smartmatch each other at all";
    }


If you were to set S<C<$b[3] = 4>>, then instead of reporting that "a and b
are deep copies of each other", it now reports that C<"b smartmatches in a">.
That's because the corresponding position in C<@a> contains an array that
(eventually) has a 4 in it.

Smartmatching one hash against another reports whether both contain the
same keys, no more and no less.  This could be used to see whether two
records have the same field names, without caring what values those fields
might have.  For example:

    use v5.10.1;
    sub make_dogtag {
        state $REQUIRED_FIELDS = { name=>1, rank=>1, serial_num=>1 };

        my ($class, $init_fields) = @_;

        die "Must supply (only) name, rank, and serial number"
            unless $init_fields ~~ $REQUIRED_FIELDS;

        ...
    }

However, this only does what you mean if C<$init_fields> is indeed a hash
reference. The condition C<$init_fields ~~ $REQUIRED_FIELDS> also allows the
strings C<"name">, C<"rank">, C<"serial_num"> as well as any array reference
that contains C<"name"> or C<"rank"> or C<"serial_num"> anywhere to pass
through.

The smartmatch operator is most often used as the implicit operator of a
C<when> clause.  See the section on "Switch Statements" in L<perlsyn>.

=head3 Smartmatching of Objects

To avoid relying on an object's underlying representation, if the
smartmatch's right operand is an object that doesn't overload C<~~>,
it raises the exception "C<Smartmatching a non-overloaded object
breaks encapsulation>".  That's because one has no business digging
around to see whether something is "in" an object.  These are all
illegal on objects without a C<~~> overload:

    %hash ~~ $object
       42 ~~ $object
   "fred" ~~ $object

However, you can change the way an object is smartmatched by overloading
the C<~~> operator.  This is allowed to
extend the usual smartmatch semantics.
For objects that do have a C<~~> overload, see L<overload>.

Using an object as the left operand is allowed, although not very useful.
Smartmatching rules take precedence over overloading, so even if the
object in the left operand has smartmatch overloading, this will be
ignored.  A left operand that is a non-overloaded object falls back on a
string or numeric comparison of whatever the C<ref> operator returns.  That
means that

    $object ~~ X

does I<not> invoke the overload method with C<I<X>> as an argument.
Instead the above table is consulted as normal, and based on the type of
C<I<X>>, overloading may or may not be invoked.  For simple strings or
numbers, "in" becomes equivalent to this:

    $object ~~ $number          ref($object) == $number
    $object ~~ $string          ref($object) eq $string

For example, this reports that the handle smells IOish
(but please don't really do this!):

    use IO::Handle;
    my $fh = IO::Handle->new();
    if ($fh ~~ /\bIO\b/) {
        say "handle smells IOish";
    }

That's because it treats C<$fh> as a string like
C<"IO::Handle=GLOB(0x8039e0)">, then pattern matches against that.

=head2 Bitwise And
X<operator, bitwise, and> X<bitwise and> X<&>

Binary C<"&"> returns its operands ANDed together bit by bit.  Although no
warning is currently raised, the result is not well defined when this operation
is performed on operands that are neither numbers (see
L</Integer Arithmetic>) nor bitstrings (see L</Bitwise String Operators>).

Note that C<"&"> has lower priority than relational operators, so for example
the parentheses are essential in a test like

    print "Even\n" if ($x & 1) == 0;

If the experimental "bitwise" feature is enabled via S<C<use feature
'bitwise'>>, then this operator always treats its operands as numbers.  This
feature produces a warning unless you also use C<S<no warnings
'experimental::bitwise'>>.
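
As a minimal sketch (assuming the "bitwise" feature is I<not> enabled, and
using operand values chosen only for illustration), the same digits give
different results depending on whether the operands are numbers or strings:

    my $num = 12 & 10;        # numeric AND: 0b1100 & 0b1010
    my $str = "12" & "10";    # string AND, done character by character
    print "$num\n";           # prints 8
    print "$str\n";           # prints 10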

=head2 Bitwise Or and Exclusive Or
X<operator, bitwise, or> X<bitwise or> X<|> X<operator, bitwise, xor>
X<bitwise xor> X<^>

Binary C<"|"> returns its operands ORed together bit by bit.

Binary C<"^"> returns its operands XORed together bit by bit.

Although no warning is currently raised, the results are not well
defined when these operations are performed on operands that are neither
numbers (see L</Integer Arithmetic>) nor bitstrings (see L</Bitwise String
Operators>).

Note that C<"|"> and C<"^"> have lower priority than relational operators, so
for example the parentheses are essential in a test like

    print "false\n" if (8 | 2) != 10;

If the experimental "bitwise" feature is enabled via S<C<use feature
'bitwise'>>, then these operators always treat their operands as numbers.  This
feature produces a warning unless you also use S<C<no warnings
'experimental::bitwise'>>.

=head2 C-style Logical And
X<&&> X<logical and> X<operator, logical, and>

Binary C<"&&"> performs a short-circuit logical AND operation.  That is,
if the left operand is false, the right operand is not even evaluated.
Scalar or list context propagates down to the right operand if it
is evaluated.

=head2 C-style Logical Or
X<||> X<operator, logical, or>

Binary C<"||"> performs a short-circuit logical OR operation.  That is,
if the left operand is true, the right operand is not even evaluated.
Scalar or list context propagates down to the right operand if it
is evaluated.

=head2 Logical Defined-Or
X<//> X<operator, logical, defined-or>

Although it has no direct equivalent in C, Perl's C<//> operator is related
to its C-style "or".  In fact, it's exactly the same as C<||>, except that it
tests the left hand side's definedness instead of its truth.  Thus,
S<C<< EXPR1 // EXPR2 >>> returns the value of C<< EXPR1 >> if it's defined,
otherwise, the value of C<< EXPR2 >> is returned.
(C<< EXPR1 >> is evaluated in scalar context, C<< EXPR2 >>
in the context of C<< // >> itself).  Usually,
this is the same result as S<C<< defined(EXPR1) ? EXPR1 : EXPR2 >>> (except that
the ternary-operator form can be used as a lvalue, while S<C<< EXPR1 // EXPR2 >>>
cannot).  This is very useful for
providing default values for variables.  If you actually want to test if
at least one of C<$x> and C<$y> is defined, use S<C<defined($x // $y)>>.
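
The difference from C<||> shows up when the left operand is defined but
false.  A minimal sketch, with a hypothetical C<%opt> hash used only for
illustration:

    my %opt = (verbose => 0);

    my $with_or  = $opt{verbose} || 1;        # 1: the false 0 is replaced
    my $with_dor = $opt{verbose} // 1;        # 0: defined, so it is kept
    my $missing  = $opt{colour}  // "none";   # "none": no such key

    print "$with_or $with_dor $missing\n";    # prints 1 0 none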

The C<||>, C<//> and C<&&> operators return the last value evaluated
(unlike C's C<||> and C<&&>, which return 0 or 1).  Thus, a reasonably
portable way to find out the home directory might be:

    $home =  $ENV{HOME}
	  // $ENV{LOGDIR}
	  // (getpwuid($<))[7]
	  // die "You're homeless!\n";

In particular, this means that you shouldn't use this
for selecting between two aggregates for assignment:

    @a = @b || @c;            # This doesn't do the right thing
    @a = scalar(@b) || @c;    # because it really means this.
    @a = @b ? @b : @c;        # This works fine, though.

As alternatives to C<&&> and C<||> when used for
control flow, Perl provides the C<and> and C<or> operators (see below).
The short-circuit behavior is identical.  The precedence of C<"and">
and C<"or"> is much lower, however, so that you can safely use them after a
list operator without the need for parentheses:

    unlink "alpha", "beta", "gamma"
	    or gripe(), next LINE;

With the C-style operators that would have been written like this:

    unlink("alpha", "beta", "gamma")
	    || (gripe(), next LINE);

It would be even more readable to write that this way:

    unless(unlink("alpha", "beta", "gamma")) {
        gripe();
        next LINE;
    }

Using C<"or"> for assignment is unlikely to do what you want; see below.

=head2 Range Operators
X<operator, range> X<range> X<..> X<...>

Binary C<".."> is the range operator, which is really two different
operators depending on the context.  In list context, it returns a
list of values counting (up by ones) from the left value to the right
value.  If the left value is greater than the right value then it
returns the empty list.  The range operator is useful for writing
S<C<foreach (1..10)>> loops and for doing slice operations on arrays.  In
the current implementation, no temporary array is created when the
range operator is used as the expression in C<foreach> loops, but older
versions of Perl might burn a lot of memory when you write something
like this:

    for (1 .. 1_000_000) {
	# code
    }

The range operator also works on strings, using the magical
auto-increment, see below.

In scalar context, C<".."> returns a boolean value.  The operator is
bistable, like a flip-flop, and emulates the line-range (comma)
operator of B<sed>, B<awk>, and various editors.  Each C<".."> operator
maintains its own boolean state, even across calls to a subroutine
that contains it.  It is false as long as its left operand is false.
Once the left operand is true, the range operator stays true until the
right operand is true, I<AFTER> which the range operator becomes false
again.  It doesn't become false till the next time the range operator
is evaluated.  It can test the right operand and become false on the
same evaluation it became true (as in B<awk>), but it still returns
true once.  If you don't want it to test the right operand until the
next evaluation, as in B<sed>, just use three dots (C<"...">) instead of
two.  In all other regards, C<"..."> behaves just like C<".."> does.

The right operand is not evaluated while the operator is in the
"false" state, and the left operand is not evaluated while the
operator is in the "true" state.  The precedence is a little lower
than C<||> and C<&&>.  The value returned is either the empty string for
false, or a sequence number (beginning with 1) for true.  The sequence
number is reset for each range encountered.  The final sequence number
in a range has the string C<"E0"> appended to it, which doesn't affect
its numeric value, but gives you something to search for if you want
to exclude the endpoint.  You can exclude the beginning point by
waiting for the sequence number to be greater than 1.
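
A minimal sketch of these return values, using a hypothetical loop counter
in place of input lines so that neither operand is a constant expression:

    my @states;
    for my $n (1 .. 6) {
        my $state = ($n == 2) .. ($n == 4);   # scalar context: flip-flop
        push @states, "$n='$state'";
    }
    print "@states\n";
    # prints 1='' 2='1' 3='2' 4='3E0' 5='' 6=''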

If either operand of scalar C<".."> is a constant expression,
that operand is considered true if it is equal (C<==>) to the current
input line number (the C<$.> variable).

To be pedantic, the comparison is actually S<C<int(EXPR) == int(EXPR)>>,
but that is only an issue if you use a floating point expression; when
implicitly using C<$.> as described in the previous paragraph, the
comparison is S<C<int(EXPR) == int($.)>> which is only an issue when C<$.>
is set to a floating point value and you are not reading from a file.
Furthermore, S<C<"span" .. "spat">> or S<C<2.18 .. 3.14>> will not do what
you want in scalar context because each of the operands is evaluated
using their integer representation.

Examples:

As a scalar operator:

    if (101 .. 200) { print; } # print 2nd hundred lines, short for
                               #  if ($. == 101 .. $. == 200) { print; }

    next LINE if (1 .. /^$/);  # skip header lines, short for
                               #   next LINE if ($. == 1 .. /^$/);
                               # (typically in a loop labeled LINE)

    s/^/> / if (/^$/ .. eof());  # quote body

    # parse mail messages
    while (<>) {
        $in_header =   1  .. /^$/;
        $in_body   = /^$/ .. eof;
        if ($in_header) {
            # do something
        } else { # in body
            # do something else
        }
    } continue {
        close ARGV if eof;             # reset $. each file
    }

Here's a simple example to illustrate the difference between
the two range operators:

    @lines = ("   - Foo",
              "01 - Bar",
              "1  - Baz",
              "   - Quux");

    foreach (@lines) {
        if (/0/ .. /1/) {
            print "$_\n";
        }
    }

This program will print only the line containing "Bar".  If
the range operator is changed to C<...>, it will also print the
"Baz" line.

And now some examples as a list operator:

    for (101 .. 200) { print }      # print $_ 100 times
    @foo = @foo[0 .. $#foo];        # an expensive no-op
    @foo = @foo[$#foo-4 .. $#foo];  # slice last 5 items

The range operator (in list context) makes use of the magical
auto-increment algorithm if the operands are strings.  You
can say

    @alphabet = ("A" .. "Z");

to get all normal letters of the English alphabet, or

    $hexdigit = (0 .. 9, "a" .. "f")[$num & 15];

to get a hexadecimal digit, or

    @z2 = ("01" .. "31");
    print $z2[$mday - 1];   # $mday runs from 1 to 31, the array from 0

to get dates with leading zeros.

If the final value specified is not in the sequence that the magical
increment would produce, the sequence goes until the next value would
be longer than the final value specified.
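
For instance, in this minimal sketch (endpoints chosen only for
illustration) the final value C<"b"> is never produced by incrementing
C<"w">, so the list stops before the two-character string C<"aa">:

    my @letters = ("w" .. "b");
    print "@letters\n";       # prints w x y z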

As of Perl 5.26, the list-context range operator on strings works as expected
in the scope of L<< S<C<"use feature 'unicode_strings'">>|feature/The
'unicode_strings' feature >>. In previous versions, and outside the scope of
that feature, it exhibits L<perlunicode/The "Unicode Bug">: its behavior
depends on the internal encoding of the range endpoint.

If the initial value specified isn't part of a magical increment
sequence (that is, a non-empty string matching C</^[a-zA-Z]*[0-9]*\z/>),
only the initial value will be returned.  So the following will only
return an alpha:

    use charnames "greek";
    my @greek_small =  ("\N{alpha}" .. "\N{omega}");

To get the 25 traditional lowercase Greek letters, including both sigmas,
you could use this instead:

    use charnames "greek";
    my @greek_small =  map { chr } ( ord("\N{alpha}")
                                        ..
                                     ord("\N{omega}")
                                   );

However, because there are I<many> more lowercase Greek characters than
just those, to match lowercase Greek characters in a regular expression,
you could use the pattern C</(?:(?=\p{Greek})\p{Lower})+/> (or the
L<experimental feature|perlrecharclass/Extended Bracketed Character
Classes> C<S</(?[ \p{Greek} & \p{Lower} ])+/>>).

Because each operand is evaluated in integer form, S<C<2.18 .. 3.14>> will
return two elements in list context.

    @list = (2.18 .. 3.14); # same as @list = (2 .. 3);

=head2 Conditional Operator
X<operator, conditional> X<operator, ternary> X<ternary> X<?:>

Ternary C<"?:"> is the conditional operator, just as in C.  It works much
like an if-then-else.  If the argument before the C<?> is true, the
argument before the C<:> is returned, otherwise the argument after the
C<:> is returned.  For example:

    printf "I have %d dog%s.\n", $n,
	    ($n == 1) ? "" : "s";

Scalar or list context propagates downward into the 2nd
or 3rd argument, whichever is selected.

    $x = $ok ? $y : $z;  # get a scalar
    @x = $ok ? @y : @z;  # get an array
    $x = $ok ? @y : @z;  # oops, that's just a count!

The operator may be assigned to if both the 2nd and 3rd arguments are
legal lvalues (meaning that you can assign to them):

    ($x_or_y ? $x : $y) = $z;

Because this operator produces an assignable result, using assignments
without parentheses will get you in trouble.  For example, this:

    $x % 2 ? $x += 10 : $x += 2

Really means this:

    (($x % 2) ? ($x += 10) : $x) += 2

Rather than this:

    ($x % 2) ? ($x += 10) : ($x += 2)

That should probably be written more simply as:

    $x += ($x % 2) ? 10 : 2;

=head2 Assignment Operators
X<assignment> X<operator, assignment> X<=> X<**=> X<+=> X<*=> X<&=>
X<<< <<= >>> X<&&=> X<-=> X</=> X<|=> X<<< >>= >>> X<||=> X<//=> X<.=>
X<%=> X<^=> X<x=> X<&.=> X<|.=> X<^.=>

C<"="> is the ordinary assignment operator.

Assignment operators work as in C.  That is,

    $x += 2;

is equivalent to

    $x = $x + 2;

although without duplicating any side effects that dereferencing the lvalue
might trigger, such as from C<tie()>.  Other assignment operators work similarly.
The following are recognized:

    **=    +=    *=    &=    &.=    <<=    &&=
           -=    /=    |=    |.=    >>=    ||=
           .=    %=    ^=    ^.=           //=
                 x=

Although these are grouped by family, they all have the precedence
of assignment.  These combined assignment operators can only operate on
scalars, whereas the ordinary assignment operator can assign to arrays,
hashes, lists and even references.  (See L<"Context"|perldata/Context>
and L<perldata/List value constructors>, and L<perlref/Assigning to
References>.)

Unlike in C, the scalar assignment operator produces a valid lvalue.
Modifying an assignment is equivalent to doing the assignment and
then modifying the variable that was assigned to.  This is useful
for modifying a copy of something, like this:

    ($tmp = $global) =~ tr/13579/24680/;

Although as of 5.14, that can also be accomplished this way:

    use v5.14;
    $tmp = ($global =~  tr/13579/24680/r);

Likewise,

    ($x += 2) *= 3;

is equivalent to

    $x += 2;
    $x *= 3;

Similarly, a list assignment in list context produces the list of
lvalues assigned to, and a list assignment in scalar context returns
the number of elements produced by the expression on the right hand
side of the assignment.
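
A minimal sketch of both behaviors (the variable names are only
illustrative):

    # Scalar context: the count of right-hand elements, not of lvalues.
    my $count = (my ($first, $second) = (10, 20, 30));
    print "$count\n";         # prints 3

    # Assigning to an empty list is a common way to count matches.
    my $commas = () = "a,b,c,d" =~ /,/g;
    print "$commas\n";        # prints 3

    # List context: the lvalues themselves, so they can be modified.
    my ($x, $y);
    $_ *= 2 for (($x, $y) = (1, 2));
    print "$x $y\n";          # prints 2 4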

The three dotted bitwise assignment operators (C<&.=> C<|.=> C<^.=>) are new in
Perl 5.22 and experimental.  See L</Bitwise String Operators>.

=head2 Comma Operator
X<comma> X<operator, comma> X<,>

Binary C<","> is the comma operator.  In scalar context it evaluates
its left argument, throws that value away, then evaluates its right
argument and returns that value.  This is just like C's comma operator.

In list context, it's just the list argument separator, and inserts
both its arguments into the list.  These arguments are also evaluated
from left to right.

The C<< => >> operator (sometimes pronounced "fat comma") is a synonym
for the comma except that it causes a
word on its left to be interpreted as a string if it begins with a letter
or underscore and is composed only of letters, digits and underscores.
This includes operands that might otherwise be interpreted as operators,
constants, single number v-strings or function calls.  If in doubt about
this behavior, the left operand can be quoted explicitly.

Otherwise, the C<< => >> operator behaves exactly as the comma operator
or list argument separator, according to context.

For example:

    use constant FOO => "something";

    my %h = ( FOO => 23 );

is equivalent to:

    my %h = ("FOO", 23);

It is I<NOT>:

    my %h = ("something", 23);

The C<< => >> operator is helpful in documenting the correspondence
between keys and values in hashes, and other paired elements in lists.

    %hash = ( $key => $value );
    login( $username => $password );

The special quoting behavior ignores precedence, and hence may apply to
I<part> of the left operand:

    print time.shift => "bbb";

That example prints something like C<"1314363215shiftbbb">, because the
C<< => >> implicitly quotes the C<shift> immediately on its left, ignoring
the fact that C<time.shift> is the entire left operand.

=head2 List Operators (Rightward)
X<operator, list, rightward> X<list operator>

On the right side of a list operator, the comma has very low precedence,
such that it controls all comma-separated expressions found there.
The only operators with lower precedence are the logical operators
C<"and">, C<"or">, and C<"not">, which may be used to evaluate calls to list
operators without the need for parentheses:

    open HANDLE, "< :encoding(UTF-8)", "filename"
        or die "Can't open: $!\n";

However, some people find that code harder to read than writing
it with parentheses:

    open(HANDLE, "< :encoding(UTF-8)", "filename")
        or die "Can't open: $!\n";

in which case you might as well just use the more customary C<"||"> operator:

    open(HANDLE, "< :encoding(UTF-8)", "filename")
        || die "Can't open: $!\n";

See also discussion of list operators in L</Terms and List Operators (Leftward)>.

=head2 Logical Not
X<operator, logical, not> X<not>

Unary C<"not"> returns the logical negation of the expression to its right.
It's the equivalent of C<"!"> except for the very low precedence.

=head2 Logical And
X<operator, logical, and> X<and>

Binary C<"and"> returns the logical conjunction of the two surrounding
expressions.  It's equivalent to C<&&> except for the very low
precedence.  This means that it short-circuits: the right
expression is evaluated only if the left expression is true.

=head2 Logical or and Exclusive Or
X<operator, logical, or> X<operator, logical, xor>
X<operator, logical, exclusive or>
X<or> X<xor>

Binary C<"or"> returns the logical disjunction of the two surrounding
expressions.  It's equivalent to C<||> except for the very low precedence.
This makes it useful for control flow:

    print FH $data		or die "Can't write to FH: $!";

This means that it short-circuits: the right expression is evaluated
only if the left expression is false.  Due to its precedence, you must
be careful to avoid using it as a replacement for the C<||> operator.
It usually works out better for flow control than in assignments:

    $x = $y or $z;              # bug: this is wrong
    ($x = $y) or $z;            # really means this
    $x = $y || $z;              # better written this way

However, when it's a list-context assignment and you're trying to use
C<||> for control flow, you probably need C<"or"> so that the assignment
takes higher precedence.

    @info = stat($file) || die;     # oops, scalar sense of stat!
    @info = stat($file) or die;     # better, now @info gets its due

Then again, you could always use parentheses.

Binary C<"xor"> returns the exclusive-OR of the two surrounding expressions.
It cannot short-circuit (of course).

There is no low precedence operator for defined-OR.

=head2 C Operators Missing From Perl
X<operator, missing from perl> X<&> X<*>
X<typecasting> X<(TYPE)>

Here is what C has that Perl doesn't:

=over 8

=item unary &

Address-of operator.  (But see the C<"\"> operator for taking a reference.)

=item unary *

Dereference-address operator.  (Perl's prefix dereferencing
operators are typed: C<$>, C<@>, C<%>, and C<&>.)

=item (TYPE)

Type-casting operator.

=back

=head2 Quote and Quote-like Operators
X<operator, quote> X<operator, quote-like> X<q> X<qq> X<qx> X<qw> X<m>
X<qr> X<s> X<tr> X<'> X<''> X<"> X<""> X<//> X<`> X<``> X<<< << >>>
X<escape sequence> X<escape>

While we usually think of quotes as literal values, in Perl they
function as operators, providing various kinds of interpolating and
pattern matching capabilities.  Perl provides customary quote characters
for these behaviors, but also provides a way for you to choose your
quote character for any of them.  In the following table, a C<{}> represents
any pair of delimiters you choose.

    Customary  Generic        Meaning	     Interpolates
	''	 q{}	      Literal		  no
	""	qq{}	      Literal		  yes
	``	qx{}	      Command		  yes*
		qw{}	     Word list		  no
	//	 m{}	   Pattern match	  yes*
		qr{}	      Pattern		  yes*
		 s{}{}	    Substitution	  yes*
		tr{}{}	  Transliteration	  no (but see below)
		 y{}{}	  Transliteration	  no (but see below)
        <<EOF                 here-doc            yes*

	* unless the delimiter is ''.

Non-bracketing delimiters use the same character fore and aft, but the four
sorts of ASCII brackets (round, angle, square, curly) all nest, which means
that

    q{foo{bar}baz}

is the same as

    'foo{bar}baz'

Note, however, that this does not always work for quoting Perl code:

    $s = q{ if($x eq "}") ... }; # WRONG

is a syntax error.  The C<L<Text::Balanced>> module (standard as of v5.8,
and from CPAN before then) is able to do this properly.

There can (and in some cases, must) be whitespace between the operator
and the quoting
characters, except when C<#> is being used as the quoting character.
C<q#foo#> is parsed as the string C<foo>, while S<C<q #foo#>> is the
operator C<q> followed by a comment.  Its argument will be taken
from the next line.  This allows you to write:

    s {foo}  # Replace foo
      {bar}  # with bar.

The cases where whitespace must be used are when the quoting character
is a word character (meaning it matches C</\w/>):

    q XfooX # Works: means the string 'foo'
    qXfooX  # WRONG!

The following escape sequences are available in constructs that interpolate,
and in transliterations:
X<\t> X<\n> X<\r> X<\f> X<\b> X<\a> X<\e> X<\x> X<\0> X<\c> X<\N> X<\N{}>
X<\o{}>

    Sequence     Note  Description
    \t                  tab               (HT, TAB)
    \n                  newline           (NL)
    \r                  return            (CR)
    \f                  form feed         (FF)
    \b                  backspace         (BS)
    \a                  alarm (bell)      (BEL)
    \e                  escape            (ESC)
    \x{263A}     [1,8]  hex char          (example: SMILEY)
    \x1b         [2,8]  restricted range hex char (example: ESC)
    \N{name}     [3]    named Unicode character or character sequence
    \N{U+263D}   [4,8]  Unicode character (example: FIRST QUARTER MOON)
    \c[          [5]    control char      (example: chr(27))
    \o{23072}    [6,8]  octal char        (example: SMILEY)
    \033         [7,8]  restricted range octal char  (example: ESC)

=over 4

=item [1]

The result is the character specified by the hexadecimal number between
the braces.  See L</[8]> below for details on which character.

Only hexadecimal digits are valid between the braces.  If an invalid
character is encountered, a warning will be issued and the invalid
character and all subsequent characters (valid or invalid) within the
braces will be discarded.

If there are no valid digits between the braces, the generated character is
the NULL character (C<\x{00}>).  However, an explicit empty brace (C<\x{}>)
will not cause a warning (currently).

=item [2]

The result is the character specified by the hexadecimal number in the range
0x00 to 0xFF.  See L</[8]> below for details on which character.

Only hexadecimal digits are valid following C<\x>.  When C<\x> is followed
by fewer than two valid digits, any valid digits will be zero-padded.  This
means that C<\x7> will be interpreted as C<\x07>, and a lone C<"\x"> will be
interpreted as C<\x00>.  Except at the end of a string, having fewer than
two valid digits will result in a warning.  Note that although the warning
says the illegal character is ignored, it is only ignored as part of the
escape and will still be used as the subsequent character in the string.
For example:

  Original    Result    Warns?
  "\x7"       "\x07"    no
  "\x"        "\x00"    no
  "\x7q"      "\x07q"   yes
  "\xq"       "\x00q"   yes

=item [3]

The result is the Unicode character or character sequence given by I<name>.
See L<charnames>.

=item [4]

S<C<\N{U+I<hexadecimal number>}>> means the Unicode character whose Unicode code
point is I<hexadecimal number>.

=item [5]

The character following C<\c> is mapped to some other character as shown in the
table:

 Sequence   Value
   \c@      chr(0)
   \cA      chr(1)
   \ca      chr(1)
   \cB      chr(2)
   \cb      chr(2)
   ...
   \cZ      chr(26)
   \cz      chr(26)
   \c[      chr(27)
                     # See below for chr(28)
   \c]      chr(29)
   \c^      chr(30)
   \c_      chr(31)
   \c?      chr(127) # (on ASCII platforms; see below for link to
                     #  EBCDIC discussion)

In other words, it's the character whose code point is that of the
uppercased character xor'd with 64.  C<\c?> is DELETE on ASCII platforms
because S<C<ord("?") ^ 64>> is 127, and C<\c@> is NULL because the ord of
C<"@"> is 64, so xor'ing it with 64 produces 0.

Also, C<\c\I<X>> yields S<C< chr(28) . "I<X>">> for any I<X>, but cannot come at the
end of a string, because the backslash would be parsed as escaping the end
quote.

On ASCII platforms, the resulting characters from the list above are the
complete set of ASCII controls.  This isn't the case on EBCDIC platforms; see
L<perlebcdic/OPERATOR DIFFERENCES> for a full discussion of the
differences between these for ASCII versus EBCDIC platforms.

Use of any other character following the C<"c"> besides those listed above is
discouraged, and as of Perl v5.20, the only characters actually allowed
are the printable ASCII ones, minus the left brace C<"{">.  What happens
for any of the allowed other characters is that the value is derived by
xor'ing with the seventh bit, which is 64, and a warning raised if
enabled.  Using the non-allowed characters generates a fatal error.

To get platform independent controls, you can use C<\N{...}>.

=item [6]

The result is the character specified by the octal number between the braces.
See L</[8]> below for details on which character.

If a character that isn't an octal digit is encountered, a warning is raised,
and the value is based on the octal digits before it, discarding it and all
following characters up to the closing brace.  It is a fatal error if there are
no octal digits at all.

=item [7]

The result is the character specified by the three-digit octal number in the
range 000 to 777 (but best to not use above 077, see next paragraph).  See
L</[8]> below for details on which character.

Some contexts allow 2 or even 1 digit, but any usage without exactly
three digits, the first being a zero, may give unintended results.  (For
example, in a regular expression it may be confused with a backreference;
see L<perlrebackslash/Octal escapes>.)  Starting in Perl 5.14, you may
use C<\o{}> instead, which avoids all these problems.  Otherwise, it is best to
use this construct only for ordinals C<\077> and below, remembering to pad to
the left with zeros to make three digits.  For larger ordinals, either use
C<\o{}>, or convert to something else, such as to hex and use C<\N{U+}>
(which is portable between platforms with different character sets) or
C<\x{}> instead.

=item [8]

Several constructs above specify a character by a number.  That number
gives the character's position in the character set encoding (indexed from 0).
This is called synonymously its ordinal, code position, or code point.  Perl
works on platforms that have a native encoding currently of either ASCII/Latin1
or EBCDIC, each of which allows specification of 256 characters.  In general, if
the number is 255 (0xFF, 0377) or below, Perl interprets this in the platform's
native encoding.  If the number is 256 (0x100, 0400) or above, Perl interprets
it as a Unicode code point and the result is the corresponding Unicode
character.  For example C<\x{50}> and C<\o{120}> both are the number 80 in
decimal, which is less than 256, so the number is interpreted in the native
character set encoding.  In ASCII the character in the 80th position (indexed
from 0) is the letter C<"P">, and in EBCDIC it is the ampersand symbol C<"&">.
C<\x{100}> and C<\o{400}> are both 256 in decimal, so the number is interpreted
as a Unicode code point no matter what the native encoding is.  The name of the
character in the 256th position (indexed from 0) in Unicode is
C<LATIN CAPITAL LETTER A WITH MACRON>.

An exception to the above rule is that S<C<\N{U+I<hex number>}>> is
always interpreted as a Unicode code point, so that C<\N{U+0050}> is C<"P"> even
on EBCDIC platforms.

=back
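
A minimal sketch of the numbering rules in note [8] (it assumes Perl 5.14
or later for C<\o{}>; the sample values are the ones used above):

    print "same ordinal\n" if "\x{50}" eq "\o{120}";  # both are 80
    print "\x{50}\n";               # P on ASCII platforms, & on EBCDIC
    printf "%d\n", ord("\x{100}");  # prints 256
    print "\N{U+0050}\n";           # prints P on every platform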

B<NOTE>: Unlike C and other languages, Perl has no C<\v> escape sequence for
the vertical tab (VT, which is 11 in both ASCII and EBCDIC), but you may
use C<\N{VT}>, C<\ck>, C<\N{U+0b}>, or C<\x0b>.  (C<\v>
does have meaning in regular expression patterns in Perl, see L<perlre>.)

The following escape sequences are available in constructs that interpolate,
but not in transliterations.
X<\l> X<\u> X<\L> X<\U> X<\E> X<\Q> X<\F>

    \l		lowercase next character only
    \u		titlecase (not uppercase!) next character only
    \L		lowercase all characters till \E or end of string
    \U		uppercase all characters till \E or end of string
    \F		foldcase all characters till \E or end of string
    \Q          quote (disable) pattern metacharacters till \E or
                end of string
    \E		end either case modification or quoted section
		(whichever was last seen)

See L<perlfunc/quotemeta> for the exact definition of characters that
are quoted by C<\Q>.

C<\L>, C<\U>, C<\F>, and C<\Q> can stack, in which case you need one
C<\E> for each.  For example:

 say"This \Qquoting \ubusiness \Uhere isn't quite\E done yet,\E is it?";
 This quoting\ Business\ HERE\ ISN\'T\ QUITE\ done\ yet\, is it?

If a S<C<use locale>> form that includes C<LC_CTYPE> is in effect (see
L<perllocale>), the case map used by C<\l>, C<\L>, C<\u>, and C<\U> is
taken from the current locale.  If Unicode (for example, C<\N{}> or code
points of 0x100 or beyond) is being used, the case map used by C<\l>,
C<\L>, C<\u>, and C<\U> is as defined by Unicode.  That means that
case-mapping a single character can sometimes produce a sequence of
several characters.
Under S<C<use locale>>, C<\F> produces the same results as C<\L>
for all locales but a UTF-8 one, where it instead uses the Unicode
definition.
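
A short sketch of how these escapes combine, assuming Perl 5.16 or
later (so that C<say> and C<\F> are available) and no
S<C<use locale>> in effect; the variable name is only an example:

    use v5.16;    # enables say; \F needs Perl 5.16 or later

    my $name = "mARY";
    say "Hello, \u\L$name\E!";        # Hello, Mary!
    say "\Uloud\E and \LQUIET\E";     # LOUD and quiet

    # Foldcasing both sides gives a case-insensitive comparison.
    say "\FHELLO\E" eq "\Fhello\E" ? "same" : "different";   # same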

All systems use the virtual C<"\n"> to represent a line terminator,
called a "newline".  There is no such thing as an unvarying, physical
newline character.  It is only an illusion that the operating system,
device drivers, C libraries, and Perl all conspire to preserve.  Not all
systems read C<"\r"> as ASCII CR and C<"\n"> as ASCII LF.  For example,
on the ancient Macs (pre-MacOS X) of yesteryear, these used to be reversed,
and on systems without a line terminator,
printing C<"\n"> might emit no actual data.  In general, use C<"\n"> when
you mean a "newline" for your system, but use the literal ASCII when you
need an exact character.  For example, most networking protocols expect
and prefer a CR+LF (C<"\015\012"> or C<"\cM\cJ">) for line terminators,
and although they often accept just C<"\012">, they seldom tolerate just
C<"\015">.  If you get in the habit of using C<"\n"> for networking,
you may be burned some day.
X<newline> X<line terminator> X<eol> X<end of line>
X<\n> X<\r> X<\r\n>
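
As a sketch of that advice, a protocol line can be built with explicit
octal escapes instead of C<"\n">.  The request text and variable names
here are only examples:

    use strict;
    use warnings;

    my $CRLF = "\015\012";     # explicit CR+LF, not "\r\n"

    # Build a protocol line by hand, ending with an empty line.
    my $request = "GET / HTTP/1.0" . $CRLF . $CRLF;

    printf "%d bytes, ends in CR+LF: %s\n",
        length($request),
        ($request =~ /\015\012\z/ ? "yes" : "no");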

For constructs that do interpolate, variables beginning with "C<$>"
or "C<@>" are interpolated.  Subscripted variables such as C<$a[3]> or
C<< $href->{key}[0] >> are also interpolated, as are array and hash slices.
But method calls such as C<< $obj->meth >> are not.

Interpolating an array or slice interpolates the elements in order,
separated by the value of C<$">, so is equivalent to interpolating
S<C<join $", @array>>.  "Punctuation" arrays such as C<@*> are usually
interpolated only if the name is enclosed in braces C<@{*}>, but the
arrays C<@_>, C<@+>, and C<@-> are interpolated even without braces.
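
A small sketch of array and slice interpolation (the array contents
are arbitrary).  The last statement shows one common workaround for
interpolating an arbitrary expression, such as a method call:
dereferencing an anonymous array inside the string.

    use strict;
    use warnings;

    my @fruit = qw(apple pear plum);
    print "Fruit: @fruit\n";              # Fruit: apple pear plum
    {
        local $" = ", ";                  # change the list separator
        print "Fruit: @fruit[0, 1]\n";    # Fruit: apple, pear
    }

    # Expressions are not interpolated directly, but an anonymous
    # array dereference is.
    print "Count: @{[ scalar @fruit ]}\n";    # Count: 3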

For double-quoted strings, the quoting from C<\Q> is applied after
interpolation and escapes are processed.

    "abc\Qfoo\tbar$s\Exyz"

is equivalent to

    "abc" . quotemeta("foo\tbar$s") . "xyz"

For the pattern of regex operators (C<qr//>, C<m//> and C<s///>),
the quoting from C<\Q> is applied after interpolation is processed,
but before escapes are processed.  This allows the pattern to match
literally (except for C<$> and C<@>).  For example, the following matches:

    '\s\t' =~ /\Q\s\t/

Because C<$> or C<@> trigger interpolation, you'll need to use something
like C</\Quser\E\@\Qhost/> to match them literally.

Patterns are subject to an additional level of interpretation as a
regular expression.  This is done as a second pass, after variables are
interpolated, so that regular expressions may be incorporated into the
pattern from the variables.  If this is not what you want, use C<\Q> to
interpolate a variable literally.
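
The difference can be seen in a minimal sketch, where C<$input> stands
for some hypothetical user-supplied text:

    use strict;
    use warnings;

    my $input = "a.b";    # the "." is a regex metacharacter

    for my $text ("axb", "a.b") {
        printf "%s: regex=%s literal=%s\n", $text,
            ($text =~ /\A$input\z/     ? "yes" : "no"),
            ($text =~ /\A\Q$input\E\z/ ? "yes" : "no");
    }

This prints C<axb: regex=yes literal=no> followed by
C<a.b: regex=yes literal=yes>, because only the unquoted form lets
C<"."> match any character.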

Apart from the behavior described above, Perl does not expand
multiple levels of interpolation.  In particular, contrary to the
expectations of shell programmers, back-quotes do I<NOT> interpolate
within double quotes, nor do single quotes impede evaluation of
variables when used within double quotes.

=head2 Regexp Quote-Like Operators
X<operator, regexp>

Here are the quote-like operators that apply to pattern
matching and related activities.

=over 8

=item C<qr/I<STRING>/msixpodualn>
X<qr> X</i> X</m> X</o> X</s> X</x> X</p>

This operator quotes (and possibly compiles) its I<STRING> as a regular
expression.  I<STRING> is interpolated the same way as I<PATTERN>
in C<m/I<PATTERN>/>.  If C<"'"> is used as the delimiter, no variable
interpolation is done.  Returns a Perl value which may be used instead of the
corresponding C</I<STRING>/msixpodualn> expression.  The returned value is a
normalized version of the original pattern.  It magically differs from
a string containing the same characters: C<ref(qr/x/)> returns "Regexp";
however, dereferencing it is not well defined (you currently get the
normalized version of the original pattern, but this may change).


For example,

    $rex = qr/my.STRING/is;
    print $rex;                 # prints (?si-xm:my.STRING)
    s/$rex/foo/;

is equivalent to

    s/my.STRING/foo/is;

The result may be used as a subpattern in a match:

    $re = qr/$pattern/;
    $string =~ /foo${re}bar/;	# can be interpolated in other
                                # patterns
    $string =~ $re;		# or used standalone
    $string =~ /$re/;		# or this way

Since Perl may compile the pattern at the moment of execution of the C<qr()>
operator, using C<qr()> may have speed advantages in some situations,
notably if the result of C<qr()> is used standalone:

    sub match {
	my $patterns = shift;
	my @compiled = map qr/$_/i, @$patterns;
	grep {
	    my $success = 0;
	    foreach my $pat (@compiled) {
		$success = 1, last if /$pat/;
	    }
	    $success;
	} @_;
    }

Precompilation of the pattern into an internal representation at
the moment of C<qr()> avoids the need to recompile the pattern every
time a match C</$pat/> is attempted.  (Perl has many other internal
optimizations, but none would be triggered in the above example if
we did not use the C<qr()> operator.)

Options (specified by the following modifiers) are:

    m	Treat string as multiple lines.
    s	Treat string as single line. (Make . match a newline)
    i	Do case-insensitive pattern matching.
    x   Use extended regular expressions; specifying two
        x's means \t and the SPACE character are ignored within
        square-bracketed character classes
    p	When matching preserve a copy of the matched string so
        that ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} will be
        defined (ignored starting in v5.20, as these are always
        defined starting in that release)
    o	Compile pattern only once.
    a   ASCII-restrict: Use ASCII for \d, \s, \w and [[:posix:]]
        character classes; specifying two a's adds the further
        restriction that no ASCII character will match a
        non-ASCII one under /i.
    l   Use the current run-time locale's rules.
    u   Use Unicode rules.
    d   Use Unicode or native charset, as in 5.12 and earlier.
    n   Non-capture mode. Don't let () fill in $1, $2, etc...

If a precompiled pattern is embedded in a larger pattern then the effect
of C<"msixpluadn"> will be propagated appropriately.  The effect that the
C</o> modifier has is not propagated, being restricted to those patterns
explicitly using it.

The C</a>, C</l>, C</u>, and C</d> modifiers, added in Perl 5.14,
control the character set rules, but C</a> is the only one you are likely
to want to specify explicitly; the other three are selected
automatically by various pragmas.

See L<perlre> for additional information on valid syntax for I<STRING>, and
for a detailed look at the semantics of regular expressions.  In
particular, all modifiers except the largely obsolete C</o> are further
explained in L<perlre/Modifiers>.  C</o> is described in the next section.
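
As a sketch of how the modifiers of an embedded C<qr//> object stay
with that object rather than spreading to the enclosing pattern (the
strings and variable names are only examples):

    use strict;
    use warnings;

    my $word = qr/perl/i;            # case-insensitive subpattern

    # The outer pattern has no /i, so "I use " must match exactly,
    # but the embedded $word still matches case-insensitively.
    for my $text ("I use PERL", "i use perl") {
        printf "%-12s %s\n", $text,
            ($text =~ /\AI use $word\z/ ? "matches"
                                        : "does not match");
    }

Only the first string matches, because the lowercase C<"i"> at the
start of the second one falls outside the part governed by C</i>.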

=item C<m/I<PATTERN>/msixpodualngc>
X<m> X<operator, match>
X<regexp, options> X<regexp> X<regex, options> X<regex>
X</m> X</s> X</i> X</x> X</p> X</o> X</g> X</c>

=item C</I<PATTERN>/msixpodualngc>

Searches a string for a pattern match, and in scalar context returns
true if it succeeds, false if it fails.  If no string is specified
via the C<=~> or C<!~> operator, the C<$_> string is searched.  (The
string specified with C<=~> need not be an lvalue--it may be the
result of an expression evaluation, but remember the C<=~> binds
rather tightly.)  See also L<perlre>.

Options are as described in C<qr//> above; in addition, the following match
process modifiers are available:

 g  Match globally, i.e., find all occurrences.
 c  Do not reset search position on a failed match when /g is
    in effect.

If C<"/"> is the delimiter then the initial C<m> is optional.  With the C<m>
you can use any pair of non-whitespace (ASCII) characters
as delimiters.  This is particularly useful for matching path names
that contain C<"/">, to avoid LTS (leaning toothpick syndrome).  If C<"?"> is
the delimiter, then a match-only-once rule applies,
described in C<m?I<PATTERN>?> below.  If C<"'"> (single quote) is the delimiter,
no variable interpolation is performed on the I<PATTERN>.
When using a delimiter character valid in an identifier, whitespace is required
after the C<m>.

I<PATTERN> may contain variables, which will be interpolated
every time the pattern search is evaluated, except
for when the delimiter is a single quote.  (Note that C<$(>, C<$)>, and
C<$|> are not interpolated because they look like end-of-string tests.)
Perl will not recompile the pattern unless an interpolated
variable that it contains changes.  You can force Perl to skip the
test and never recompile by adding a C</o> (which stands for "once")
after the trailing delimiter.
Once upon a time, Perl would recompile regular expressions
unnecessarily, and this modifier was useful to tell it not to do so, in the
interests of speed.  But now, the only reasons to use C</o> are:

=over

=item 1

The variables are thousands of characters long and you know that they
don't change, and you need to wring out the last little bit of speed by
having Perl skip testing for that.  (There is a maintenance penalty for
doing this, as mentioning C</o> constitutes a promise that you won't
change the variables in the pattern.  If you do change them, Perl won't
even notice.)

=item 2

You want the pattern to use the initial values of the variables
regardless of whether they change or not.  (But there are saner ways
of accomplishing this than using C</o>.)

=item 3

If the pattern contains embedded code, such as

    use re 'eval';
    $code = 'foo(?{ $x })';
    /$code/

then perl will recompile each time, even though the pattern string hasn't
changed, to ensure that the current value of C<$x> is seen each time.
Use C</o> if you want to avoid this.

=back

The bottom line is that using C</o> is almost never a good idea.

=item The empty pattern C<//>

If the I<PATTERN> evaluates to the empty string, the last
I<successfully> matched regular expression is used instead.  In this
case, only the C<g> and C<c> flags on the empty pattern are honored;
the other flags are taken from the original pattern.  If no match has
previously succeeded, this will (silently) act instead as a genuine
empty pattern (which will always match).

Note that it's possible to confuse Perl into thinking C<//> (the empty
regex) is really C<//> (the defined-or operator).  Perl is usually pretty
good about this, but some pathological cases might trigger this, such as
C<$x///> (is that S<C<($x) / (//)>> or S<C<$x // />>?) and S<C<print $fh //>>
(S<C<print $fh(//>> or S<C<print($fh //>>?).  In all of these examples, Perl
will assume you meant defined-or.  If you meant the empty regex, just
use parentheses or spaces to disambiguate, or even prefix the empty
regex with an C<m> (so C<//> becomes C<m//>).
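
A minimal sketch of the reuse rule, written with an explicit C<m//>
so that it cannot be mistaken for defined-or (the strings are
arbitrary):

    use strict;
    use warnings;

    "foobar" =~ /bar/ or die "unexpected";   # last successful match

    # The empty pattern now reuses /bar/ rather than matching
    # everything.
    print "barn:  ", ("barn"  =~ m// ? "match" : "no match"), "\n";
    print "fooey: ", ("fooey" =~ m// ? "match" : "no match"), "\n";

Here C<"barn"> matches and C<"fooey"> does not, which is easy to
overlook when the empty pattern comes from an interpolated variable
that happens to be empty.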

=item Matching in list context

If the C</g> option is not used, C<m//> in list context returns a
list consisting of the subexpressions matched by the parentheses in the
pattern, that is, (C<$1>, C<$2>, C<$3>...).  (Note that here C<$1> etc. are
also set.)  When there are no parentheses in the pattern, the return
value is the list C<(1)> for success.
With or without parentheses, an empty list is returned upon failure.

Examples:

 open(TTY, "+</dev/tty")
    || die "can't access /dev/tty: $!";

 <TTY> =~ /^y/i && foo();	# do foo if desired

 if (/Version: *([0-9.]*)/) { $version = $1; }

 next if m#^/usr/spool/uucp#;

 # poor man's grep
 $arg = shift;
 while (<>) {
    print if /$arg/o; # compile only once (no longer needed!)
 }

 if (($F1, $F2, $Etc) = ($foo =~ /^(\S+)\s+(\S+)\s*(.*)/))

This last example splits C<$foo> into the first two words and the
remainder of the line, and assigns those three fields to C<$F1>, C<$F2>, and
C<$Etc>.  The conditional is true if any variables were assigned; that is,
if the pattern matched.

The C</g> modifier specifies global pattern matching--that is,
matching as many times as possible within the string.  How it behaves
depends on the context.  In list context, it returns a list of the
substrings matched by any capturing parentheses in the regular
expression.  If there are no parentheses, it returns a list of all
the matched strings, as if there were parentheses around the whole
pattern.

In scalar context, each execution of C<m//g> finds the next match,
returning true if it matches, and false if there is no further match.
The position after the last match can be read or set using the C<pos()>
function; see L<perlfunc/pos>.  A failed match normally resets the
search position to the beginning of the string, but you can avoid that
by adding the C</c> modifier (for example, C<m//gc>).  Modifying the target
string also resets the search position.
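
As a small sketch of scalar-context C<m//g> and C<pos()> (the string
is arbitrary):

    use strict;
    use warnings;

    my $str = "aXbXc";
    while ($str =~ /X/g) {
        print "X ends at offset ", pos($str), "\n";   # 2, then 4
    }

    # The failed match that ended the loop reset pos() to undef.
    print defined pos($str) ? "still set\n" : "reset\n";   # reset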

=item C<\G I<assertion>>

You can intermix C<m//g> matches with C<m/\G.../g>, where C<\G> is a
zero-width assertion that matches the exact position where the
previous C<m//g>, if any, left off.  Without the C</g> modifier, the
C<\G> assertion still anchors at C<pos()> as it was at the start of
the operation (see L<perlfunc/pos>), but the match is of course only
attempted once.  Using C<\G> without C</g> on a target string that has
not previously had a C</g> match applied to it is the same as using
the C<\A> assertion to match the beginning of the string.  Note also
that, currently, C<\G> is only properly supported when anchored at the
very beginning of the pattern.

Examples:

    # list context
    ($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);

    # scalar context
    local $/ = "";
    while ($paragraph = <>) {
	while ($paragraph =~ /\p{Ll}['")]*[.!?]+['")]*\s/g) {
	    $sentences++;
	}
    }
    say $sentences;

Here's another way to check for sentences in a paragraph:

 my $sentence_rx = qr{
    (?: (?<= ^ ) | (?<= \s ) )  # after start-of-string or
                                # whitespace
    \p{Lu}                      # capital letter
    .*?                         # a bunch of anything
    (?<= \S )                   # that ends in non-
                                # whitespace
    (?<! \b [DMS]r  )           # but isn't a common abbr.
    (?<! \b Mrs )
    (?<! \b Sra )
    (?<! \b St  )
    [.?!]                       # followed by a sentence
                                # ender
    (?= $ | \s )                # in front of end-of-string
                                # or whitespace
 }sx;
 local $/ = "";
 while (my $paragraph = <>) {
    say "NEW PARAGRAPH";
    my $count = 0;
    while ($paragraph =~ /($sentence_rx)/g) {
        printf "\tgot sentence %d: <%s>\n", ++$count, $1;
    }
 }

Here's how to use C<m//gc> with C<\G>:

    $_ = "ppooqppqq";
    while ($i++ < 2) {
        print "1: '";
        print $1 while /(o)/gc; print "', pos=", pos, "\n";
        print "2: '";
        print $1 if /\G(q)/gc;  print "', pos=", pos, "\n";
        print "3: '";
        print $1 while /(p)/gc; print "', pos=", pos, "\n";
    }
    print "Final: '$1', pos=",pos,"\n" if /\G(.)/;

The last example should print:

    1: 'oo', pos=4
    2: 'q', pos=5
    3: 'pp', pos=7
    1: '', pos=7
    2: 'q', pos=8
    3: '', pos=8
    Final: 'q', pos=8

Notice that the final match matched C<q> instead of C<p>, which a match
without the C<\G> anchor would have done.  Also note that the final match
did not update C<pos>.  C<pos> is only updated on a C</g> match.  If the
final match did indeed match C<p>, it's a good bet that you're running a
very old (pre-5.6.0) version of Perl.

A useful idiom for C<lex>-like scanners is C</\G.../gc>.  You can
combine several regexps like this to process a string part-by-part,
doing different actions depending on which regexp matched.  Each
regexp tries to match where the previous one leaves off.

 $_ = <<'EOL';
    $url = URI::URL->new( "http://example.com/" );
    die if $url eq "xXx";
 EOL

 LOOP: {
     print(" digits"),       redo LOOP if /\G\d+\b[,.;]?\s*/gc;
     print(" lowercase"),    redo LOOP
                                    if /\G\p{Ll}+\b[,.;]?\s*/gc;
     print(" UPPERCASE"),    redo LOOP
                                    if /\G\p{Lu}+\b[,.;]?\s*/gc;
     print(" Capitalized"),  redo LOOP
                              if /\G\p{Lu}\p{Ll}+\b[,.;]?\s*/gc;
     print(" MiXeD"),        redo LOOP if /\G\pL+\b[,.;]?\s*/gc;
     print(" alphanumeric"), redo LOOP
                            if /\G[\p{Alpha}\pN]+\b[,.;]?\s*/gc;
     print(" line-noise"),   redo LOOP if /\G\W+/gc;
     print ". That's all!\n";
 }

Here is the output (split into several lines):

 line-noise lowercase line-noise UPPERCASE line-noise UPPERCASE
 line-noise lowercase line-noise lowercase line-noise lowercase
 lowercase line-noise lowercase lowercase line-noise lowercase
 lowercase line-noise MiXeD line-noise. That's all!

=item C<m?I<PATTERN>?msixpodualngc>
X<?> X<operator, match-once>

This is just like the C<m/I<PATTERN>/> search, except that it matches
only once between calls to the C<reset()> operator.  This is a useful
optimization when you want to see only the first occurrence of
something in each file of a set of files, for instance.  Only C<m??>
patterns local to the current package are reset.

    while (<>) {
	if (m?^$?) {
			    # blank line between header and body
	}
    } continue {
	reset if eof;	    # clear m?? status for next file
    }

Another example switches the first "latin1" encoding it finds
to "utf8" in a pod file:

    s//utf8/ if m? ^ =encoding \h+ \K latin1 ?x;

The match-once behavior is controlled by the match delimiter being
C<?>; with any other delimiter this is the normal C<m//> operator.

In the past, the leading C<m> in C<m?I<PATTERN>?> was optional, but omitting it
would produce a deprecation warning.  As of v5.22.0, omitting it produces a
syntax error.  If you encounter this construct in older code, you can just add
C<m>.

=item C<s/I<PATTERN>/I<REPLACEMENT>/msixpodualngcer>
X<s> X<substitute> X<substitution> X<replace> X<regexp, replace>
X<regexp, substitute> X</m> X</s> X</i> X</x> X</p> X</o> X</g> X</c> X</e> X</r>

Searches a string for a pattern, and if found, replaces that pattern
with the replacement text and returns the number of substitutions
made.  Otherwise it returns false (specifically, the empty string).

If the C</r> (non-destructive) option is used then it runs the
substitution on a copy of the string and instead of returning the
number of substitutions, it returns the copy whether or not a
substitution occurred.  The original string is never changed when
C</r> is used.  The copy will always be a plain string, even if the
input is an object or a tied variable.

If no string is specified via the C<=~> or C<!~> operator, the C<$_>
variable is searched and modified.  Unless the C</r> option is used,
the string specified must be a scalar variable, an array element, a
hash element, or an assignment to one of those; that is, some sort of
scalar lvalue.

If the delimiter chosen is a single quote, no variable interpolation is
done on either the I<PATTERN> or the I<REPLACEMENT>.  Otherwise, if the
I<PATTERN> contains a C<$> that looks like a variable rather than an
end-of-string test, the variable will be interpolated into the pattern
at run-time.  If you want the pattern compiled only once the first time
the variable is interpolated, use the C</o> option.  If the pattern
evaluates to the empty string, the last successfully executed regular
expression is used instead.  See L<perlre> for further explanation on these.

Options are as with C<m//> with the addition of the following replacement
specific options:

    e	Evaluate the right side as an expression.
    ee  Evaluate the right side as a string then eval the
        result.
    r   Return substitution and leave the original string
        untouched.

Any non-whitespace delimiter may replace the slashes.  Add space after
the C<s> when using a character allowed in identifiers.  If single quotes
are used, no interpretation is done on the replacement string (the C</e>
modifier overrides this, however).  Note that Perl treats backticks
as normal delimiters; the replacement text is not evaluated as a command.
If the I<PATTERN> is delimited by bracketing quotes, the I<REPLACEMENT> has
its own pair of quotes, which may or may not be bracketing quotes, for example,
C<s(foo)(bar)> or C<< s<foo>/bar/ >>.  A C</e> will cause the
replacement portion to be treated as a full-fledged Perl expression
and evaluated right then and there.  It is, however, syntax checked at
compile-time.  A second C<e> modifier will cause the replacement portion
to be C<eval>ed before being run as a Perl expression.

Examples:

    s/\bgreen\b/mauve/g;	      # don't change wintergreen

    $path =~ s|/usr/bin|/usr/local/bin|;

    s/Login: $foo/Login: $bar/; # run-time pattern

    ($foo = $bar) =~ s/this/that/;	# copy first, then
                                        # change
    ($foo = "$bar") =~ s/this/that/;	# convert to string,
                                        # copy, then change
    $foo = $bar =~ s/this/that/r;	# Same as above using /r
    $foo = $bar =~ s/this/that/r
                =~ s/that/the other/r;	# Chained substitutes
                                        # using /r
    @foo = map { s/this/that/r } @bar;	# /r is very useful in
                                        # maps

    $count = ($paragraph =~ s/Mister\b/Mr./g);  # get change-cnt

    $_ = 'abc123xyz';
    s/\d+/$&*2/e;		# yields 'abc246xyz'
    s/\d+/sprintf("%5d",$&)/e;	# yields 'abc  246xyz'
    s/\w/$& x 2/eg;		# yields 'aabbcc  224466xxyyzz'

    s/%(.)/$percent{$1}/g;	# change percent escapes; no /e
    s/%(.)/$percent{$1} || $&/ge;	# expr now, so /e
    s/^=(\w+)/pod($1)/ge;	# use function call

    $_ = 'abc123xyz';
    $x = s/abc/def/r;           # $x is 'def123xyz' and
                                # $_ remains 'abc123xyz'.

    # expand variables in $_, but dynamics only, using
    # symbolic dereferencing
    s/\$(\w+)/${$1}/g;

    # Add one to the value of any numbers in the string
    s/(\d+)/1 + $1/eg;

    # Titlecase words in the last 30 characters only
    substr($str, -30) =~ s/\b(\p{Alpha}+)\b/\u\L$1/g;

    # This will expand any embedded scalar variable
    # (including lexicals) in $_ : First $1 is interpolated
    # to the variable name, and then evaluated
    s/(\$\w+)/$1/eeg;

    # Delete (most) C comments.
    $program =~ s {
	/\*	# Match the opening delimiter.
	.*?	# Match a minimal number of characters.
	\*/	# Match the closing delimiter.
    } []gsx;

    s/^\s*(.*?)\s*$/$1/;	# trim whitespace in $_,
                                # expensively

    for ($variable) {		# trim whitespace in $variable,
                                # cheap
	s/^\s+//;
	s/\s+$//;
    }

    s/([^ ]*) *([^ ]*)/$2 $1/;	# reverse 1st two fields

Note the use of C<$> instead of C<\> in the last example.  Unlike
B<sed>, we use the \<I<digit>> form only in the left hand side.
Anywhere else it's $<I<digit>>.

Occasionally, you can't use just a C</g> to get all the changes
to occur that you might want.  Here are two common cases:

    # put commas in the right places in an integer
    1 while s/(\d)(\d\d\d)(?!\d)/$1,$2/g;

    # expand tabs to 8-column spacing
    1 while s/\t+/' ' x (length($&)*8 - length($`)%8)/e;
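
To see why the loop is needed in the first case, here is a sketch
applied to a sample number (the value is arbitrary).  Each pass can
insert only the commas whose following digits are already grouped, so
the substitution is repeated until it no longer matches:

    use strict;
    use warnings;

    my $n = "1234567";
    1 while $n =~ s/(\d)(\d\d\d)(?!\d)/$1,$2/g;
    print "$n\n";    # 1,234,567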

=back

=head2 Quote-Like Operators
X<operator, quote-like>

=over 4

=item C<q/I<STRING>/>
X<q> X<quote, single> X<'> X<''>

=item C<'I<STRING>'>

A single-quoted, literal string.  A backslash represents a backslash
unless followed by the delimiter or another backslash, in which case
the delimiter or backslash is interpolated.

    $foo = q!I said, "You said, 'She said it.'"!;
    $bar = q('This is it.');
    $baz = '\n';		# a two-character string

=item C<qq/I<STRING>/>
X<qq> X<quote, double> X<"> X<"">

=item "I<STRING>"

A double-quoted, interpolated string.

    $_ .= qq
     (*** The previous line contains the naughty word "$1".\n)
		if /\b(tcl|java|python)\b/i;      # :-)
    $baz = "\n";		# a one-character string

=item C<qx/I<STRING>/>
X<qx> X<`> X<``> X<backtick>

=item C<`I<STRING>`>

A string which is (possibly) interpolated and then executed as a
system command with F</bin/sh> or its equivalent.  Shell wildcards,
pipes, and redirections will be honored.  The collected standard
output of the command is returned; standard error is unaffected.  In
scalar context, it comes back as a single (potentially multi-line)
string, or C<undef> if the command failed.  In list context, returns a
list of lines (however you've defined lines with C<$/> or
C<$INPUT_RECORD_SEPARATOR>), or an empty list if the command failed.
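
A sketch of the two contexts, assuming a shell in which C<echo> and
C<&&> behave as they do in POSIX shells (the commands are only
examples):

    use strict;
    use warnings;

    my $text  = `echo one && echo two`;   # scalar: one string
    my @lines = `echo one && echo two`;   # list: one line each

    print "scalar has ", length($text), " characters\n";
    print "list has ", scalar(@lines), " lines\n";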

Because backticks do not affect standard error, use shell file descriptor
syntax (assuming the shell supports this) if you care to address this.
To capture a command's STDERR and STDOUT together:

    $output = `cmd 2>&1`;

To capture a command's STDOUT but discard its STDERR:

    $output = `cmd 2>/dev/null`;

To capture a command's STDERR but discard its STDOUT (ordering is
important here):

    $output = `cmd 2>&1 1>/dev/null`;

To exchange a command's STDOUT and STDERR in order to capture the STDERR
but leave its STDOUT to come out the old STDERR:

    $output = `cmd 3>&1 1>&2 2>&3 3>&-`;

To read both a command's STDOUT and its STDERR separately, it's easiest
to redirect them separately to files, and then read from those files
when the program is done:

    system("program args 1>program.stdout 2>program.stderr");

The STDIN filehandle used by the command is inherited from Perl's STDIN.
For example:

    open(SPLAT, "stuff")   || die "can't open stuff: $!";
    open(STDIN, "<&SPLAT") || die "can't dupe SPLAT: $!";
    print STDOUT `sort`;

will print the sorted contents of the file named F<"stuff">.

Using single-quote as a delimiter protects the command from Perl's
double-quote interpolation, passing it on to the shell instead:

    $perl_info  = qx(ps $$);            # that's Perl's $$
    $shell_info = qx'ps $$';            # that's the new shell's $$

How that string gets evaluated is entirely subject to the command
interpreter on your system.  On most platforms, you will have to protect
shell metacharacters if you want them treated literally.  This is in
practice difficult to do, as it's unclear how to escape which characters.
See L<perlsec> for a clean and safe example of a manual C<fork()> and C<exec()>
to emulate backticks safely.

On some platforms (notably DOS-like ones), the shell may not be
capable of dealing with multiline commands, so putting newlines in
the string may not get you what you want.  You may be able to evaluate
multiple commands in a single line by separating them with the command
separator character, if your shell supports that (for example, C<;> on
many Unix shells and C<&> on the Windows NT C<cmd> shell).

Perl will attempt to flush all files opened for
output before starting the child process, but this may not be supported
on some platforms (see L<perlport>).  To be safe, you may need to set
C<$|> (C<$AUTOFLUSH> in C<L<English>>) or call the C<autoflush()> method of
C<L<IO::Handle>> on any open handles.

Beware that some command shells may place restrictions on the length
of the command line.  You must ensure your strings don't exceed this
limit after any necessary interpolations.  See the platform-specific
release notes for more details about your particular environment.

Using this operator can lead to programs that are difficult to port,
because the shell commands called vary between systems, and may in
fact not be present at all.  As one example, the C<type> command under
the POSIX shell is very different from the C<type> command under DOS.
That doesn't mean you should go out of your way to avoid backticks
when they're the right way to get something done.  Perl was made to be
a glue language, and one of the things it glues together is commands.
Just understand what you're getting yourself into.

Like C<system>, backticks put the child process exit code in C<$?>.
If you'd like to manually inspect failure, you can check all possible
failure modes by inspecting C<$?> like this:

    if ($? == -1) {
        print "failed to execute: $!\n";
    }
    elsif ($? & 127) {
        printf "child died with signal %d, %s coredump\n",
            ($? & 127),  ($? & 128) ? 'with' : 'without';
    }
    else {
        printf "child exited with value %d\n", $? >> 8;
    }

Use the L<open> pragma to control the I/O layers used when reading the
output of the command, for example:

  use open IN => ":encoding(UTF-8)";
  my $x = `cmd-producing-utf-8`;

See L</"I/O Operators"> for more discussion.

=item C<qw/I<STRING>/>
X<qw> X<quote, list> X<quote, words>

Evaluates to a list of the words extracted out of I<STRING>, using embedded
whitespace as the word delimiters.  It can be understood as being roughly
equivalent to:

    split(" ", q/STRING/);

the differences being that it generates a real list at compile time, and
in scalar context it returns the last element in the list.  So
this expression:

    qw(foo bar baz)

is semantically equivalent to the list:

    "foo", "bar", "baz"

Some frequently seen examples:

    use POSIX qw( setlocale localeconv );
    @EXPORT = qw( foo bar baz );

A common mistake is to try to separate the words with commas or to
put comments into a multi-line C<qw>-string.  For this reason, the
S<C<use warnings>> pragma and the B<-w> switch (that is, the C<$^W> variable)
produce warnings if the I<STRING> contains the C<","> or the C<"#"> character.
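
A brief sketch comparing C<qw> with the C<split> form above, and
showing that any whitespace, including newlines, separates the words
(the words themselves are arbitrary):

    use strict;
    use warnings;

    my @from_qw    = qw(foo bar baz);
    my @from_split = split(" ", q/foo bar baz/);
    print "same\n" if "@from_qw" eq "@from_split";   # same

    my @days = qw(
        Mon Tue
        Wed
    );
    print scalar(@days), "\n";   # 3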

=item C<tr/I<SEARCHLIST>/I<REPLACEMENTLIST>/cdsr>
X<tr> X<y> X<transliterate> X</c> X</d> X</s>

=item C<y/I<SEARCHLIST>/I<REPLACEMENTLIST>/cdsr>

Transliterates all occurrences of the characters found in the search list
with the corresponding character in the replacement list.  It returns
the number of characters replaced or deleted.  If no string is
specified via the C<=~> or C<!~> operator, the C<$_> string is transliterated.

If the C</r> (non-destructive) option is present, a new copy of the string
is made and its characters transliterated, and this copy is returned no
matter whether it was modified or not: the original string is always
left unchanged.  The new copy is always a plain string, even if the input
string is an object or a tied variable.

Unless the C</r> option is used, the string specified with C<=~> must be a
scalar variable, an array element, a hash element, or an assignment to one
of those; in other words, an lvalue.

A character range may be specified with a hyphen, so C<tr/A-J/0-9/>
does the same replacement as C<tr/ACEGIBDFHJ/0246813579/>.
For B<sed> devotees, C<y> is provided as a synonym for C<tr>.  If the
I<SEARCHLIST> is delimited by bracketing quotes, the I<REPLACEMENTLIST>
must have its own pair of quotes, which may or may not be bracketing
quotes; for example, C<tr[aeiouy][yuoiea]> or C<tr(+\-*/)/ABCD/>.

Characters may be literals or any of the escape sequences accepted in
double-quoted strings.  But there is no variable interpolation, so C<"$">
and C<"@"> are treated as literals.  A hyphen at the beginning or end, or
preceded by a backslash is considered a literal.  Escape sequence
details are in L<the table near the beginning of this section|/Quote and
Quote-like Operators>.

Note that C<tr> does B<not> do regular expression character classes such as
C<\d> or C<\pL>.  The C<tr> operator is not equivalent to the C<L<tr(1)>>
utility.  C<tr[a-z][A-Z]> will uppercase the 26 letters "a" through "z",
but for case changing not confined to ASCII, use
L<C<lc>|perlfunc/lc>, L<C<uc>|perlfunc/uc>,
L<C<lcfirst>|perlfunc/lcfirst>, L<C<ucfirst>|perlfunc/ucfirst>
(all documented in L<perlfunc>), or the
L<substitution operator C<sE<sol>I<PATTERN>E<sol>I<REPLACEMENT>E<sol>>|/sE<sol>PATTERNE<sol>REPLACEMENTE<sol>msixpodualngcer>
(with C<\U>, C<\u>, C<\L>, and C<\l> string-interpolation escapes in the
I<REPLACEMENT> portion).
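
A minimal sketch of the difference, assuming a string that contains one
non-ASCII letter (the variable names are illustrative only):

    use strict;
    use warnings;

    # "abc" followed by GREEK SMALL LETTER ALPHA
    my $word = "abc\x{3b1}";

    (my $ascii_only = $word) =~ tr/a-z/A-Z/;
    my $full = uc $word;

    # $ascii_only is "ABC\x{3b1}": alpha is outside a-z, so tr skips it
    # $full       is "ABC\x{391}": uc also maps it to the capital alpha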

Most ranges are unportable between character sets, but certain ones
signal Perl to do special handling to make them portable.  There are two
classes of portable ranges.  The first are any subsets of the ranges
C<A-Z>, C<a-z>, and C<0-9>, when expressed as literal characters.

  tr/h-k/H-K/

capitalizes the letters C<"h">, C<"i">, C<"j">, and C<"k"> and nothing
else, no matter what the platform's character set is.  In contrast, all
of

  tr/\x68-\x6B/\x48-\x4B/
  tr/h-\x6B/H-\x4B/
  tr/\x68-k/\x48-K/

do the same capitalizations as the previous example when run on ASCII
platforms, but something completely different on EBCDIC ones.

The second class of portable ranges is invoked when one or both of the
range's end points are expressed as C<\N{...}>.  For example,

 $string =~ tr/\N{U+20}-\N{U+7E}//d;

removes from C<$string> all the platform's characters which are
equivalent to any of Unicode U+0020, U+0021, ... U+007D, U+007E.  This
is a portable range, and has the same effect on every platform it is
run on.  It turns out that in this example, these are the ASCII
printable characters.  So after this is run, C<$string> has only
controls and characters which have no ASCII equivalents.

But, even for portable ranges, it is not generally obvious what is
included without having to look things up.  A sound principle is to use
only ranges that begin from and end at either ASCII alphabetics of equal
case (C<b-e>, C<B-E>), or digits (C<1-4>).  Anything else is unclear
(and unportable unless C<\N{...}> is used).  If in doubt, spell out the
character sets in full.

Options:

    c	Complement the SEARCHLIST.
    d	Delete found but unreplaced characters.
    s	Squash duplicate replaced characters.
    r	Return the modified string and leave the original string
	untouched.

If the C</c> modifier is specified, the I<SEARCHLIST> character set
is complemented.  If the C</d> modifier is specified, any characters
specified by I<SEARCHLIST> not found in I<REPLACEMENTLIST> are deleted.
(Note that this is slightly more flexible than the behavior of some
B<tr> programs, which delete anything they find in the I<SEARCHLIST>,
period.)  If the C</s> modifier is specified, sequences of characters
that were transliterated to the same character are squashed down
to a single instance of the character.

If the C</d> modifier is used, the I<REPLACEMENTLIST> is always interpreted
exactly as specified.  Otherwise, if the I<REPLACEMENTLIST> is shorter
than the I<SEARCHLIST>, the final character is replicated till it is long
enough.  If the I<REPLACEMENTLIST> is empty, the I<SEARCHLIST> is replicated.
This latter is useful for counting characters in a class or for
squashing character sequences in a class.
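
The rules above can be sketched side by side.  This is an illustration
only; the sample strings are made up, and C</r> is used so that string
constants can serve as input:

    use strict;
    use warnings;

    # Without /d the short list is padded with its last character.
    my $padded  = "aabbcc" =~ tr/abc/x/r;     # "xxxxxx"

    # With /d the list is taken as is: a -> x, b and c are deleted.
    my $deleted = "aabbcc" =~ tr/abc/x/dr;    # "xx"

    # An empty REPLACEMENTLIST counts matches and changes nothing.
    my $text   = "hello world";
    my $vowels = ($text =~ tr/aeiou//);       # 3

    # Empty REPLACEMENTLIST plus /s squashes runs of the same letter.
    my $squashed = "bookkeeper" =~ tr/a-z//sr;    # "bokeper"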

Examples:

    $ARGV[1] =~ tr/A-Z/a-z/;	# canonicalize to lower case ASCII

    $cnt = tr/*/*/;		# count the stars in $_

    $cnt = $sky =~ tr/*/*/;	# count the stars in $sky

    $cnt = tr/0-9//;		# count the digits in $_

    tr/a-zA-Z//s;		# bookkeeper -> bokeper

    ($HOST = $host) =~ tr/a-z/A-Z/;
     $HOST = $host  =~ tr/a-z/A-Z/r;   # same thing

    $HOST = $host =~ tr/a-z/A-Z/r    # chained with s///r
                  =~ s/:/ -p/r;

    tr/a-zA-Z/ /cs;		# change non-alphas to single space

    @stripped = map tr/a-zA-Z/ /csr, @original;
				# /r with map

    tr [\200-\377]
       [\000-\177];		# wickedly delete 8th bit

If multiple transliterations are given for a character, only the
first one is used:

    tr/AAA/XYZ/

will transliterate any A to X.

Because the transliteration table is built at compile time, neither
the I<SEARCHLIST> nor the I<REPLACEMENTLIST> is subjected to double-quote
interpolation.  That means that if you want to use variables, you
must use an C<eval()>:

    eval "tr/$oldlist/$newlist/";
    die $@ if $@;

    eval "tr/$oldlist/$newlist/, 1" or die $@;
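
If the same variable-driven transliteration is needed many times, one
possible approach is to compile it once into an anonymous subroutine.
The following is a sketch under the assumption that both lists come
from a trusted source and contain no C</> characters; the names
C<$shift_letters> and C<$shifted> are hypothetical:

    use strict;
    use warnings;

    my ($oldlist, $newlist) = ('a-y', 'b-z');

    # The string eval runs once; the resulting sub can be reused.
    my $shift_letters
        = eval "sub { \$_[0] =~ tr/$oldlist/$newlist/r }"
        or die "Could not compile transliteration: $@";

    my $shifted = $shift_letters->("hello");    # "ifmmp"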

=item C<< <<I<EOF> >>
X<here-doc> X<heredoc> X<here-document> X<<< << >>>

A line-oriented form of quoting is based on the shell "here-document"
syntax.  Following a C<< << >> you specify a string to terminate
the quoted material, and all lines following the current line down to
the terminating string are the value of the item.

Prefixing the terminating string with a C<~> specifies that you
want to use L</Indented Here-docs> (see below).

The terminating string may be either an identifier (a word), or some
quoted text.  An unquoted identifier works like double quotes.
There may not be a space between the C<< << >> and the identifier,
unless the identifier is explicitly quoted.  (If you put a space it
will be treated as a null identifier, which is valid, and matches the
first empty line.)  The terminating string must appear by itself
(unquoted and with no surrounding whitespace) on the terminating line.

If the terminating string is quoted, the type of quotes used determines
the treatment of the text.

=over 4

=item Double Quotes

Double quotes indicate that the text will be interpolated using exactly
the same rules as normal double quoted strings.

       print <<EOF;
    The price is $Price.
    EOF

       print << "EOF"; # same as above
    The price is $Price.
    EOF


=item Single Quotes

Single quotes indicate the text is to be treated literally with no
interpolation of its content.  This is similar to single quoted
strings except that backslashes have no special meaning, with C<\\>
being treated as two backslashes and not one as they would in every
other quoting construct.

Just as in the shell, a backslashed bareword following the C<<< << >>>
means the same thing as a single-quoted string does:

	$cost = <<'VISTA';  # hasta la ...
    That'll be $10 please, ma'am.
    VISTA

	$cost = <<\VISTA;   # Same thing!
    That'll be $10 please, ma'am.
    VISTA

This is the only form of quoting in perl where there is no need
to worry about escaping content, something that code generators
can and do make good use of.

=item Backticks

The content of the here doc is treated just as it would be if the
string were embedded in backticks.  Thus the content is interpolated
as though it were double quoted and then executed via the shell, with
the results of the execution returned.

       print << `EOC`; # execute command and get results
    echo hi there
    EOC

=back

=over 4

=item Indented Here-docs

The here-doc modifier C<~> allows you to indent your here-docs to make
the code more readable:

    if ($some_var) {
      print <<~EOF;
        This is a here-doc
        EOF
    }

This will print...

    This is a here-doc

...with no leading whitespace.

The delimiter is used to determine the B<exact> whitespace to
remove from the beginning of each line.  All lines B<must> have
at least the same starting whitespace (except lines only
containing a newline) or perl will croak.  Tabs and spaces can
be mixed, but are matched exactly.  One tab will not be equal to
8 spaces!

Additional beginning whitespace (beyond what preceded the
delimiter) will be preserved:

    print <<~EOF;
      This text is not indented
        This text is indented with two spaces
      		This text is indented with two tabs
      EOF
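
A short sketch of the same rule inside a subroutine, assuming a perl
new enough to support the C<~> modifier (5.26 or later); the
subroutine name C<usage_text> is hypothetical:

    use 5.026;
    use strict;
    use warnings;

    sub usage_text {
        my ($prog) = @_;
        # The terminator's indentation is removed from every line;
        # the two extra spaces before "-v" are preserved.
        return <<~"END";
            Usage: $prog FILE
              -v  be verbose
            END
    }

    my $usage = usage_text("demo");
    # $usage is "Usage: demo FILE\n  -v  be verbose\n"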

Finally, the modifier may be used with all of the forms
mentioned above:

    <<~\EOF;
    <<~'EOF'
    <<~"EOF"
    <<~`EOF`

And whitespace may be used between the C<~> and quoted delimiters:

    <<~ 'EOF'; # ... "EOF", `EOF`

=back

It is possible to stack multiple here-docs in a row:

       print <<"foo", <<"bar"; # you can stack them
    I said foo.
    foo
    I said bar.
    bar

       myfunc(<< "THIS", 23, <<'THAT');
    Here's a line
    or two.
    THIS
    and here's another.
    THAT

Just don't forget that you have to put a semicolon on the end
to finish the statement, as Perl doesn't know you're not going to
try to do this:

       print <<ABC
    179231
    ABC
       + 20;

If you want to remove the line terminator from your here-docs,
use C<chomp()>.

    chomp($string = <<'END');
    This is a string.
    END

If you want your here-docs to be indented with the rest of the code,
you'll need to remove leading whitespace from each line manually:

    ($quote = <<'FINIS') =~ s/^\s+//gm;
       The Road goes ever on and on,
       down from the door where it began.
    FINIS

If you use a here-doc within a delimited construct, such as in C<s///eg>,
the quoted material must still come on the line following the
C<<< <<FOO >>> marker, which means it may be inside the delimited
construct:

    s/this/<<E . 'that'
    the other
    E
     . 'more '/eg;

It works this way as of Perl 5.18.  Historically, it was inconsistent, and
you would have to write

    s/this/<<E . 'that'
     . 'more '/eg;
    the other
    E

outside of string evals.

Additionally, quoting rules for the end-of-string identifier are
unrelated to Perl's quoting rules.  C<q()>, C<qq()>, and the like are not
supported in place of C<''> and C<"">, and the only interpolation is for
backslashing the quoting character:

    print << "abc\"def";
    testing...
    abc"def

Finally, quoted strings cannot span multiple lines.  The general rule is
that the identifier must be a string literal.  Stick with that, and you
should be safe.

=back

=head2 Gory details of parsing quoted constructs
X<quote, gory details>

When presented with something that might have several different
interpretations, Perl uses the B<DWIM> (that's "Do What I Mean")
principle to pick the most probable interpretation.  This strategy
is so successful that Perl programmers often do not suspect the
ambivalence of what they write.  But from time to time, Perl's
notions differ substantially from what the author honestly meant.

This section hopes to clarify how Perl handles quoted constructs.
Although the most common reason to learn this is to unravel labyrinthine
regular expressions, because the initial steps of parsing are the
same for all quoting operators, they are all discussed together.

The most important Perl parsing rule is the first one discussed
below: when processing a quoted construct, Perl first finds the end
of that construct, then interprets its contents.  If you understand
this rule, you may skip the rest of this section on the first
reading.  The other rules are likely to contradict the user's
expectations much less frequently than this first one.

Some passes discussed below are performed concurrently, but because
their results are the same, we consider them individually.  For different
quoting constructs, Perl performs different numbers of passes, from
one to four, but these passes are always performed in the same order.

=over 4

=item Finding the end

The first pass is finding the end of the quoted construct.  This results
in saving to a safe location a copy of the text (between the starting
and ending delimiters), normalized as necessary to avoid needing to know
what the original delimiters were.

If the construct is a here-doc, the ending delimiter is a line
that has a terminating string as the content.  Therefore C<<<EOF> is
terminated by C<EOF> immediately followed by C<"\n"> and starting
from the first column of the terminating line.
When searching for the terminating line of a here-doc, nothing
is skipped.  In other words, lines after the here-doc syntax
are compared with the terminating string line by line.

For constructs other than here-docs, single characters are used as starting
and ending delimiters.  If the starting delimiter is an opening punctuation
(that is C<(>, C<[>, C<{>, or C<< < >>), the ending delimiter is the
corresponding closing punctuation (that is C<)>, C<]>, C<}>, or C<< > >>).
If the starting delimiter is an unpaired character like C</> or a closing
punctuation, the ending delimiter is the same as the starting delimiter.
Therefore a C</> terminates a C<qq//> construct, while a C<]> terminates
both C<qq[]> and C<qq]]> constructs.

When searching for single-character delimiters, escaped delimiters
and C<\\> are skipped.  For example, while searching for terminating C</>,
combinations of C<\\> and C<\/> are skipped.  If the delimiters are
bracketing, nested pairs are also skipped.  For example, while searching
for a closing C<]> paired with the opening C<[>, combinations of C<\\>, C<\]>,
and C<\[> are all skipped, and nested C<[> and C<]> are skipped as well.
However, when backslashes are used as the delimiters (like C<qq\\> and
C<tr\\\>), nothing is skipped.
During the search for the end, backslashes that escape delimiters or
other backslashes are removed (exactly speaking, they are not copied to the
safe location).
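
These delimiter rules can be seen with C<q> alone.  The snippet is an
illustrative sketch with made-up variable names:

    use strict;
    use warnings;

    my $nested  = q[a[b]c];    # nested pair is skipped:        "a[b]c"
    my $escaped = q/a\/b/;     # escaping backslash is removed: "a/b"
    my $closing = q]x y];      # "]" both starts and ends:      "x y"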

For constructs with three-part delimiters (C<s///>, C<y///>, and
C<tr///>), the search is repeated once more.
If the first delimiter is not an opening punctuation, the three delimiters must
be the same, such as C<s!!!> and C<tr)))>,
in which case the second delimiter
terminates the left part and starts the right part at once.
If the left part is delimited by bracketing punctuation (that is C<()>,
C<[]>, C<{}>, or C<< <> >>), the right part needs another pair of
delimiters such as C<s(){}> and C<tr[]//>.  In these cases, whitespace
and comments are allowed between the two parts, although the comment must follow
at least one whitespace character; otherwise a character expected as the
start of the comment may be regarded as the starting delimiter of the right part.

During this search no attention is paid to the semantics of the construct.
Thus:

    "$hash{"$foo/$bar"}"

or:

    m/
      bar	# NOT a comment, this slash / terminated m//!
     /x

do not form legal quoted expressions.   The quoted part ends on the
first C<"> and C</>, and the rest happens to be a syntax error.
Because the slash that terminated C<m//> was followed by a C<SPACE>,
the example above is not C<m//x>, but rather C<m//> with no C</x>
modifier.  So the embedded C<#> is interpreted as a literal C<#>.

Also no attention is paid to C<\c\> (multichar control char syntax) during
this search.  Thus the second C<\> in C<qq/\c\/> is interpreted as a part
of C<\/>, and the following C</> is not recognized as a delimiter.
Instead, use C<\034> or C<\x1c> at the end of quoted constructs.

=item Interpolation
X<interpolation>

The next step is interpolation in the text obtained, which is now
delimiter-independent.  There are multiple cases.

=over 4

=item C<<<'EOF'>

No interpolation is performed.
Note that the combination C<\\> is left intact, since escaped delimiters
are not available for here-docs.

=item  C<m''>, the pattern of C<s'''>

No interpolation is performed at this stage.
Any backslashed sequences, including C<\\>, are handled at the
L</"parsing regular expressions"> stage.

=item C<''>, C<q//>, C<tr'''>, C<y'''>, the replacement of C<s'''>

The only interpolation is removal of C<\> from pairs of C<\\>.
Therefore C<"-"> in C<tr'''> and C<y'''> is treated literally
as a hyphen and no character range is available.
C<\1> in the replacement of C<s'''> does not work as C<$1>.

=item C<tr///>, C<y///>

No variable interpolation occurs.  String modifying combinations for
case and quoting such as C<\Q>, C<\U>, and C<\E> are not recognized.
The other escape sequences such as C<\200> and C<\t> and backslashed
characters such as C<\\> and C<\-> are converted to appropriate literals.
The character C<"-"> is treated specially and therefore C<\-> is treated
as a literal C<"-">.

=item C<"">, C<``>, C<qq//>, C<qx//>, C<< <file*glob> >>, C<<<"EOF">

C<\Q>, C<\U>, C<\u>, C<\L>, C<\l>, C<\F> (possibly paired with C<\E>) are
converted to corresponding Perl constructs.  Thus, C<"$foo\Qbaz$bar">
is converted to S<C<$foo . (quotemeta("baz" . $bar))>> internally.
The other escape sequences such as C<\200> and C<\t> and backslashed
characters such as C<\\> and C<\-> are replaced with appropriate
expansions.

Let it be stressed that I<whatever falls between C<\Q> and C<\E>>
is interpolated in the usual way.  Something like C<"\Q\\E"> has
no C<\E> inside.  Instead, it has C<\Q>, C<\\>, and C<E>, so the
result is the same as for C<"\\\\E">.  As a general rule, backslashes
between C<\Q> and C<\E> may lead to counterintuitive results.  So,
C<"\Q\t\E"> is converted to C<quotemeta("\t")>, which is the same
as C<"\\\t"> (since TAB is not alphanumeric).  Note also that:

  $str = '\t';
  return "\Q$str";

may be closer to the conjectural I<intention> of the writer of C<"\Q\t\E">.
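
The two spellings can be compared directly.  This is a sketch with
hypothetical variable names:

    use strict;
    use warnings;

    my $str          = '\t';        # two characters: backslash, "t"
    my $from_var     = "\Q$str";    # quotemeta of the variable's value
    my $from_literal = "\Q\t\E";    # TAB is produced first, then quoted

    # $from_var     is three characters: backslash, backslash, "t"
    # $from_literal is two characters:   backslash, TAB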

Interpolated scalars and arrays are converted internally to the C<join> and
C<"."> catenation operations.  Thus, S<C<"$foo XXX '@arr'">> becomes:

  $foo . " XXX '" . (join $", @arr) . "'";
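
Because the array is joined with C<$">, changing that variable changes
the interpolated result.  A brief sketch (the values are made up):

    use strict;
    use warnings;

    my $foo = "total";
    my @arr = (1, 2, 3);

    my $default  = "$foo XXX '@arr'";
    my $custom   = do { local $" = ","; "$foo XXX '@arr'" };
    my $explicit = $foo . " XXX '" . join(",", @arr) . "'";

    # $default is "total XXX '1 2 3'"
    # $custom and $explicit are both "total XXX '1,2,3'"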

All operations above are performed simultaneously, left to right.

Because the result of S<C<"\Q I<STRING> \E">> has all metacharacters
quoted, there is no way to insert a literal C<$> or C<@> inside a
C<\Q\E> pair.  If protected by C<\>, C<$> will be quoted to become
C<"\\\$">; if not, it is interpreted as the start of an interpolated
scalar.

Note also that the interpolation code needs to make a decision on
where the interpolated scalar ends.  For instance, whether
S<C<< "a $x -> {c}" >>> really means:

  "a " . $x . " -> {c}";

or:

  "a " . $x -> {c};

Most of the time, the interpolated expression is the longest possible
text that does not include spaces between components and that contains
matching braces or brackets.  Because the outcome may be determined by
voting based on heuristic estimators, the result is not strictly
predictable.  Fortunately, it's usually correct for ambiguous cases.

=item the replacement of C<s///>

Processing of C<\Q>, C<\U>, C<\u>, C<\L>, C<\l>, C<\F> and interpolation
happens as with C<qq//> constructs.

It is at this step that C<\1> is begrudgingly converted to C<$1> in
the replacement text of C<s///>, in order to correct the incorrigible
I<sed> hackers who haven't picked up the saner idiom yet.  A warning
is emitted if the S<C<use warnings>> pragma or the B<-w> command-line flag
(that is, the C<$^W> variable) was set.
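
For comparison, the preferred spelling uses C<$1> and C<$2> in the
replacement.  A minimal sketch with a made-up input string:

    use strict;
    use warnings;

    my $name = "Wall, Larry";
    (my $swapped = $name) =~ s/^(\w+),\s*(\w+)$/$2 $1/;
    # $swapped is "Larry Wall"; $name is unchanged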

=item C<RE> in C<m?RE?>, C</RE/>, C<m/RE/>, C<s/RE/foo/>

Processing of C<\Q>, C<\U>, C<\u>, C<\L>, C<\l>, C<\F>, C<\E>,
and interpolation happens (almost) as with C<qq//> constructs.

Processing of C<\N{...}> is also done here, and compiled into an intermediate
form for the regex compiler.  (This is because, as mentioned below, the regex
compilation may be done at execution time, and C<\N{...}> is a compile-time
construct.)

However any other combinations of C<\> followed by a character
are not substituted but only skipped, in order to parse them
as regular expressions at the following step.
As C<\c> is skipped at this step, C<@> of C<\c@> in RE is possibly
treated as an array symbol (for example C<@foo>),
even though the same text in C<qq//> gives interpolation of C<\c@>.

Code blocks such as C<(?{BLOCK})> are handled by temporarily passing control
back to the perl parser, much as an interpolated array
subscript expression such as C<"foo$array[1+f("[xyz")]bar"> would be.

Moreover, inside C<(?{BLOCK})>, S<C<(?# comment )>>, and
a C<#>-comment in a C</x>-regular expression, no processing is
performed whatsoever.  This is the first step at which the presence
of the C</x> modifier is relevant.

Interpolation in patterns has several quirks: C<$|>, C<$(>, C<$)>, C<@+>
and C<@-> are not interpolated, and constructs C<$var[SOMETHING]> are
voted (by several different estimators) to be either an array element
or C<$var> followed by an RE alternative.  This is where the notation
C<${arr[$bar]}> comes in handy: C</${arr[0-9]}/> is interpreted as
array element C<-9>, not as a regular expression from the variable
C<$arr> followed by a digit, which would be the interpretation of
C</$arr[0-9]/>.  Since voting among different estimators may occur,
the result is not predictable.
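
One way to sidestep the guessing is to brace the name explicitly in
both cases.  The sketch below assumes a scalar and an array that share
the name C<arr>; the values are made up:

    use strict;
    use warnings;

    my @arr = ('x');
    my $arr = 'y';

    # Braces around the subscript force the array element $arr[0].
    my $element = 'x'  =~ /^${arr[0]}$/   ? 1 : 0;

    # Braces around the name alone force $arr followed by a
    # bracketed character class.
    my $class   = 'y5' =~ /^${arr}[0-9]$/ ? 1 : 0;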

The lack of processing of C<\\> creates specific restrictions on
the post-processed text.  If the delimiter is C</>, one cannot get
the combination C<\/> into the result of this step.  C</> will
finish the regular expression, C<\/> will be stripped to C</> on
the previous step, and C<\\/> will be left as is.  Because C</> is
equivalent to C<\/> inside a regular expression, this does not
matter unless the delimiter happens to be a character special to the
RE engine, such as in C<s*foo*bar*>, C<m[foo]>, or C<m?foo?>; or an
alphanumeric char, as in:

  m m ^ a \s* b mmx;

In the RE above, which is intentionally obfuscated for illustration, the
delimiter is C<m>, the modifier is C<mx>, and after delimiter-removal the
RE is the same as for S<C<m/ ^ a \s* b /mx>>.  There's more than one
reason you're encouraged to restrict your delimiters to non-alphanumeric,
non-whitespace choices.

=back

This step is the last one for all constructs except regular expressions,
which are processed further.

=item parsing regular expressions
X<regexp, parse>

Previous steps were performed during the compilation of Perl code,
but this one happens at run time, although it may be optimized to
be calculated at compile time if appropriate.  After preprocessing
described above, and possibly after evaluation if concatenation,
joining, casing translation, or metaquoting are involved, the
resulting I<string> is passed to the RE engine for compilation.

Whatever happens in the RE engine might be better discussed in L<perlre>,
but for the sake of continuity, we shall do so here.

This is another step where the presence of the C</x> modifier is
relevant.  The RE engine scans the string from left to right and
converts it into a finite automaton.

Backslashed characters are either replaced with corresponding
literal strings (as with C<\{>), or else they generate special nodes
in the finite automaton (as with C<\b>).  Characters special to the
RE engine (such as C<|>) generate corresponding nodes or groups of
nodes.  C<(?#...)> comments are ignored.  All the rest is either
converted to literal strings to match, or else is ignored (as is
whitespace and C<#>-style comments if C</x> is present).

Parsing of the bracketed character class construct, C<[...]>, is
rather different than the rule used for the rest of the pattern.
The terminator of this construct is found using the same rules as
for finding the terminator of a C<{}>-delimited construct, the only
exception being that C<]> immediately following C<[> is treated as
though preceded by a backslash.

The terminator of runtime C<(?{...})> is found by temporarily switching
control to the perl parser, which should stop at the point where the
logically balancing terminating C<}> is found.

It is possible to inspect both the string given to the RE engine and the
resulting finite automaton.  See the arguments C<debug>/C<debugcolor>
in the S<C<use L<re>>> pragma, as well as Perl's B<-Dr> command-line
switch documented in L<perlrun/"Command Switches">.

=item Optimization of regular expressions
X<regexp, optimization>

This step is listed for completeness only.  Since it does not change
semantics, details of this step are not documented and are subject
to change without notice.  This step is performed over the finite
automaton that was generated during the previous pass.

It is at this stage that C<split()> silently optimizes C</^/> to
mean C</^/m>.

=back

=head2 I/O Operators
X<operator, i/o> X<operator, io> X<io> X<while> X<filehandle>
X<< <> >> X<< <<>> >> X<@ARGV>

There are several I/O operators you should know about.

A string enclosed by backticks (grave accents) first undergoes
double-quote interpolation.  It is then interpreted as an external
command, and the output of that command is the value of the
backtick string, as in a shell.  In scalar context, a single string
consisting of all output is returned.  In list context, a list of
values is returned, one per line of output.  (You can set C<$/> to use
a different line terminator.)  The command is executed each time the
pseudo-literal is evaluated.  The status value of the command is
returned in C<$?> (see L<perlvar> for the interpretation of C<$?>).
Unlike in B<csh>, no translation is done on the return data--newlines
remain newlines.  Unlike in any of the shells, single quotes do not
hide variable names in the command from interpretation.  To pass a
literal dollar-sign through to the shell you need to hide it with a
backslash.  The generalized form of backticks is C<qx//>.  (Because
backticks always undergo shell expansion as well, see L<perlsec> for
security concerns.)
X<qx> X<`> X<``> X<backtick> X<glob>
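
A minimal sketch of capturing output and checking the exit status.  It
assumes that the running perl (C<$^X>) can be invoked through the
shell; the variable names are hypothetical:

    use strict;
    use warnings;

    my $perl   = $^X;                           # path of this perl
    my $output = qx{"$perl" -e "print 6*7"};    # all output, one string
    my $status = $? >> 8;                       # exit code of the command

    # $output is "42" and $status is 0 if the command succeeded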

In scalar context, evaluating a filehandle in angle brackets yields
the next line from that file (the newline, if any, included), or
C<undef> at end-of-file or on error.  When C<$/> is set to C<undef>
(sometimes known as file-slurp mode) and the file is empty, it
returns C<''> the first time, followed by C<undef> subsequently.
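
The sketch below contrasts a line-at-a-time read with slurp mode.  It
reads from an in-memory filehandle so that it is self-contained; the
data and variable names are made up:

    use strict;
    use warnings;

    my $data = "one\ntwo\nthree\n";
    open(my $fh, '<', \$data)
        or die "Cannot open in-memory file: $!";

    my $first = <$fh>;                     # next line: "one\n"
    my $rest  = do { local $/; <$fh> };    # slurp: "two\nthree\n"
    my $after = <$fh>;                     # undef at end-of-file
    close $fh;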

Ordinarily you must assign the returned value to a variable, but
there is one situation where an automatic assignment happens.  If
and only if the input symbol is the only thing inside the conditional
of a C<while> statement (even if disguised as a C<for(;;)> loop),
the value is automatically assigned to the global variable C<$_>,
destroying whatever was there previously.  (This may seem like an
odd thing to you, but you'll use the construct in almost every Perl
script you write.)  The C<$_> variable is not implicitly localized.
You'll have to put a S<C<local $_;>> before the loop if you want that
to happen.

The following lines are equivalent:

    while (defined($_ = <STDIN>)) { print; }
    while ($_ = <STDIN>) { print; }
    while (<STDIN>) { print; }
    for (;<STDIN>;) { print; }
    print while defined($_ = <STDIN>);
    print while ($_ = <STDIN>);
    print while <STDIN>;

This also behaves similarly, but assigns to a lexical variable
instead of to C<$_>:

    while (my $line = <STDIN>) { print $line }

In these loop constructs, the assigned value (whether assignment
is automatic or explicit) is then tested to see whether it is
defined.  The defined test avoids problems where the line has a string
value that would be treated as false by Perl; for example a "" or
a C<"0"> with no trailing newline.  If you really mean for such values
to terminate the loop, they should be tested for explicitly:

    while (($_ = <STDIN>) ne '0') { ... }
    while (<STDIN>) { last unless $_; ... }
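
The effect of the implicit C<defined> test can be seen with input
whose last line is a bare C<"0">.  This is a sketch using an in-memory
filehandle and made-up data:

    use strict;
    use warnings;

    my $data = "first\n0";    # final line is "0" with no newline
    open(my $fh, '<', \$data)
        or die "Cannot open in-memory file: $!";

    my @seen;
    while (my $line = <$fh>) {    # tested with defined(), not truth
        push @seen, $line;
    }
    close $fh;

    # @seen holds "first\n" and "0"; the false value "0" was not lost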

In other boolean contexts, C<< <I<FILEHANDLE>> >> without an
explicit C<defined> test or comparison elicits a warning if the
S<C<use warnings>> pragma or the B<-w>
command-line switch (the C<$^W> variable) is in effect.

The filehandles STDIN, STDOUT, and STDERR are predefined.  (The
filehandles C<stdin>, C<stdout>, and C<stderr> will also work except
in packages, where they would be interpreted as local identifiers
rather than global.)  Additional filehandles may be created with
the C<open()> function, amongst others.  See L<perlopentut> and
L<perlfunc/open> for details on this.
X<stdin> X<stdout> X<stderr>

If a C<< <I<FILEHANDLE>> >> is used in a context that is looking for
a list, a list comprising all input lines is returned, one line per
list element.  It's easy to grow to a rather large data space this
way, so use with care.

C<< <I<FILEHANDLE>> >>  may also be spelled C<readline(*I<FILEHANDLE>)>.
See L<perlfunc/readline>.

The null filehandle C<< <> >> is special: it can be used to emulate the
behavior of B<sed> and B<awk>, and any other Unix filter program
that takes a list of filenames, doing the same to each line
of input from all of them.  Input from C<< <> >> comes either from
standard input, or from each file listed on the command line.  Here's
how it works: the first time C<< <> >> is evaluated, the C<@ARGV> array is
checked, and if it is empty, C<$ARGV[0]> is set to C<"-">, which when opened
gives you standard input.  The C<@ARGV> array is then processed as a list
of filenames.  The loop

    while (<>) {
	...			# code for each line
    }

is equivalent to the following Perl-like pseudo code:

    unshift(@ARGV, '-') unless @ARGV;
    while ($ARGV = shift) {
	open(ARGV, $ARGV);
	while (<ARGV>) {
	    ...		# code for each line
	}
    }

except that it isn't so cumbersome to say, and will actually work.
It really does shift the C<@ARGV> array and put the current filename
into the C<$ARGV> variable.  It also uses filehandle I<ARGV>
internally.  C<< <> >> is just a synonym for C<< <ARGV> >>, which
is magical.  (The pseudo code above doesn't work because it treats
C<< <ARGV> >> as non-magical.)

Since the null filehandle uses the two argument form of L<perlfunc/open>
it interprets special characters, so if you have a script like this:

    while (<>) {
        print;
    }

and call it with S<C<perl dangerous.pl 'rm -rfv *|'>>, it actually opens a
pipe, executes the C<rm> command and reads C<rm>'s output from that pipe.
If you want all items in C<@ARGV> to be interpreted as file names, you
can use the module C<ARGV::readonly> from CPAN, or use the double bracket:

    while (<<>>) {
        print;
    }

Using double angle brackets inside of a while causes the open to use the
three argument form (with the second argument being C<< < >>), so all
arguments in C<@ARGV> are treated as literal filenames (including C<"-">).
(Note that for convenience, if you use C<< <<>> >> and if C<@ARGV> is
empty, it will still read from the standard input.)
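
A self-contained sketch of the double-bracket form, assuming perl 5.22
or later.  It writes a temporary file with L<File::Temp> and localizes
C<@ARGV> so the surrounding program's arguments are untouched; the file
contents are made up:

    use 5.022;
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    my ($out, $path) = tempfile(UNLINK => 1);
    print {$out} "alpha\nbeta\n";
    close $out or die "Cannot close $path: $!";

    my @lines;
    {
        local @ARGV = ($path);    # treated as a literal filename
        while (my $line = <<>>) {
            push @lines, $line;
        }
    }
    # @lines holds "alpha\n" and "beta\n"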

You can modify C<@ARGV> before the first C<< <> >> as long as the array ends up
containing the list of filenames you really want.  Line numbers (C<$.>)
continue as though the input were one big happy file.  See the example
in L<perlfunc/eof> for how to reset line numbers on each file.

If you want to set C<@ARGV> to your own list of files, go right ahead.
This sets C<@ARGV> to all plain text files if no C<@ARGV> was given:

    @ARGV = grep { -f && -T } glob('*') unless @ARGV;

You can even set them to pipe commands.  For example, this automatically
filters compressed arguments through B<gzip>:

    @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc < $_ |" : $_ } @ARGV;

If you want to pass switches into your script, you can use one of the
C<Getopt> modules (such as L<Getopt::Long> or L<Getopt::Std>) or put a loop on the front like this:

    while ($_ = $ARGV[0], /^-/) {
	shift;
        last if /^--$/;
	if (/^-D(.*)/) { $debug = $1 }
	if (/^-v/)     { $verbose++  }
	# ...		# other switches
    }

    while (<>) {
	# ...		# code for each line
    }
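
For comparison, here is a sketch of similar switch handling with
L<Getopt::Long>.  The argument list is hypothetical and is parsed from
an array with C<GetOptionsFromArray> so that the example does not
depend on the real command line:

    use strict;
    use warnings;
    use Getopt::Long qw(GetOptionsFromArray);

    my @args = ('-v', '--debug=trace', 'input.txt');
    my ($debug, $verbose) = ('', 0);

    GetOptionsFromArray(
        \@args,
        'debug|D=s'  => \$debug,      # option taking a string value
        'verbose|v+' => \$verbose,    # counter, one per occurrence
    ) or die "Bad command-line options\n";

    # @args now holds only the non-option argument "input.txt"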

The C<< <> >> symbol will return C<undef> for end-of-file only once.
If you call it again after this, it will assume you are processing another
C<@ARGV> list, and if you haven't set C<@ARGV>, will read input from STDIN.

If what the angle brackets contain is a simple scalar variable (for example,
C<$foo>), then that variable contains the name of the
filehandle to input from, or its typeglob, or a reference to the
same.  For example:

    $fh = \*STDIN;
    $line = <$fh>;

If what's within the angle brackets is neither a filehandle nor a simple
scalar variable containing a filehandle name, typeglob, or typeglob
reference, it is interpreted as a filename pattern to be globbed, and
either a list of filenames or the next filename in the list is returned,
depending on context.  This distinction is determined on syntactic
grounds alone.  That means C<< <$x> >> is always a C<readline()> from
an indirect handle, but C<< <$hash{key}> >> is always a C<glob()>.
That's because C<$x> is a simple scalar variable, but C<$hash{key}> is
not--it's a hash element.  Even C<< <$x > >> (note the extra space)
is treated as C<glob("$x ")>, not C<readline($x)>.
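
A short sketch of the practical consequence: when a handle lives in a
hash or array element, call C<readline()> explicitly instead of relying
on angle brackets.  The in-memory file and the hash key are hypothetical:

    use strict;
    use warnings;

    my $text = "first\nsecond\n";
    open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
    my %handle = (in => $fh);

    my $one = <$fh>;                    # simple scalar: a readline
    my $two = readline($handle{in});    # hash element: call readline()
    # <$handle{in}> here would be parsed as a glob, not a read.

    print $one, $two;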

One level of double-quote interpretation is done first, but you can't
say C<< <$foo> >> because that's an indirect filehandle as explained
in the previous paragraph.  (In older versions of Perl, programmers
would insert curly brackets to force interpretation as a filename glob:
C<< <${foo}> >>.  These days, it's considered cleaner to call the
internal function directly as C<glob($foo)>, which is probably the right
way to have done it in the first place.)  For example:

    while (<*.c>) {
	chmod 0644, $_;
    }

is roughly equivalent to:

    open(FOO, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
    while (<FOO>) {
	chomp;
	chmod 0644, $_;
    }

except that the globbing is actually done internally using the standard
C<L<File::Glob>> extension.  Of course, the shortest way to do the above is:

    chmod 0644, <*.c>;

A (file)glob evaluates its (embedded) argument only when it is
starting a new list.  All values must be read before it will start
over.  In list context, this isn't important because you automatically
get them all anyway.  However, in scalar context the operator returns
the next value each time it's called, or C<undef> when the list has
run out.  As with filehandle reads, an automatic C<defined> is
generated when the glob occurs in the test part of a C<while>,
because legal glob returns (for example,
a file called F<0>) would otherwise
terminate the loop.  Again, C<undef> is returned only once.  So if
you're expecting a single value from a glob, it is much better to
say

    ($file) = <blurch*>;

than

    $file = <blurch*>;

because the latter will alternate between returning a filename and
returning false.

If you're trying to do variable interpolation, it's definitely better
to use the C<glob()> function, because the older notation can cause people
to become confused with the indirect filehandle notation.

    @files = glob("$dir/*.[ch]");
    @files = glob($files[$i]);
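
To make the list-versus-scalar point concrete, this sketch creates a
scratch directory (the file names are made up) and fetches the first
match through a list assignment, which does not leave a partially
consumed iterator behind.  It assumes the directory path contains no
whitespace, since C<glob()> splits its argument on whitespace:

    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    my $dir = tempdir(CLEANUP => 1);
    for my $name (qw(a.c b.c notes.txt)) {
        open(my $out, '>', "$dir/$name") or die "cannot create $name: $!";
        close($out) or die "close failed: $!";
    }

    my @sources = glob("$dir/*.c");     # list context: every match
    my ($first) = glob("$dir/*.c");     # list assignment: first match only

    print scalar(@sources), " C files; first is $first\n";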

=head2 Constant Folding
X<constant folding> X<folding>

Like C, Perl does a certain amount of expression evaluation at
compile time whenever it determines that all arguments to an
operator are static and have no side effects.  In particular, string
concatenation happens at compile time between literals that don't do
variable substitution.  Backslash interpolation also happens at
compile time.  You can say

      'Now is the time for all'
    . "\n"
    .  'good men to come to.'

and this all reduces to one string internally.  Likewise, if
you say

    foreach $file (@filenames) {
	if (-s $file > 5 + 100 * 2**16) {  }
    }

the compiler precomputes the number which that expression
represents so that the interpreter won't have to.
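
One way to observe this folding, sketched here with the standard
C<B::Deparse> module, is to deparse a small subroutine and look for the
precomputed constant.  The exact layout of the deparsed text varies
between Perl versions, so the sketch only searches it for the folded
value:

    use strict;
    use warnings;
    use B::Deparse;

    # The body contains only constants, so it can be folded at compile time.
    my $code = sub { return 5 + 100 * 2**16 };

    my $text = B::Deparse->new->coderef2text($code);
    print $text, "\n";

    print "folded to a single constant\n" if $text =~ /\b6553605\b/;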

=head2 No-ops
X<no-op> X<nop>

Perl doesn't officially have a no-op operator, but the bare constants
C<0> and C<1> are special-cased not to produce a warning in void
context, so you can for example safely do

    1 while foo();

=head2 Bitwise String Operators
X<operator, bitwise, string> X<&.> X<|.> X<^.> X<~.>

Bitstrings of any size may be manipulated by the bitwise operators
(C<~ | & ^>).

If the operands to a binary bitwise op are strings of different
sizes, B<|> and B<^> ops act as though the shorter operand had
additional zero bits on the right, while the B<&> op acts as though
the longer operand were truncated to the length of the shorter.
The granularity for such extension or truncation is one or more
bytes.

    # ASCII-based examples
    print "j p \n" ^ " a h";        	# prints "JAPH\n"
    print "JA" | "  ph\n";          	# prints "japh\n"
    print "japh\nJunk" & '_____';   	# prints "JAPH\n";
    print 'p N$' ^ " E<H\n";		# prints "Perl\n";

If you are intending to manipulate bitstrings, be certain that
you're supplying bitstrings: If an operand is a number, that will imply
a B<numeric> bitwise operation.  You may explicitly show which type of
operation you intend by using C<""> or C<0+>, as in the examples below.

    $foo =  150  |  105;	# yields 255  (0x96 | 0x69 is 0xFF)
    $foo = '150' |  105;	# yields 255
    $foo =  150  | '105';	# yields 255
    $foo = '150' | '105';	# yields string '155' (under ASCII)

    $baz = 0+$foo & 0+$bar;	# both ops explicitly numeric
    $biz = "$foo" ^ "$bar";	# both ops explicitly stringy

This somewhat unpredictable behavior can be avoided with the experimental
"bitwise" feature, new in Perl 5.22.  You can enable it via S<C<use feature
'bitwise'>>.  By default, it will warn unless the C<"experimental::bitwise">
warnings category has been disabled.  (S<C<use experimental 'bitwise'>> will
enable the feature and disable the warning.)  Under this feature, the four
standard bitwise operators (C<~ | & ^>) are always numeric.  Adding a dot
after each operator (C<~. |. &. ^.>) forces it to treat its operands as
strings:

    use experimental "bitwise";
    $foo =  150  |  105;	# yields 255  (0x96 | 0x69 is 0xFF)
    $foo = '150' |  105;	# yields 255
    $foo =  150  | '105';	# yields 255
    $foo = '150' | '105';	# yields 255
    $foo =  150  |. 105;	# yields string '155'
    $foo = '150' |. 105;	# yields string '155'
    $foo =  150  |.'105';	# yields string '155'
    $foo = '150' |.'105';	# yields string '155'

    $baz = $foo &  $bar;	# both operands numeric
    $biz = $foo ^. $bar;	# both operands stringy

The assignment variants of these operators (C<&= |= ^= &.= |.= ^.=>)
behave likewise under the feature.

The behavior of these operators is problematic (and subject to change)
if either or both of the strings are encoded in UTF-8 (see
L<perlunicode/Byte and Character Semantics>).

See L<perlfunc/vec> for information on how to manipulate individual bits
in a bit vector.

=head2 Integer Arithmetic
X<integer>

By default, Perl assumes that it must do most of its arithmetic in
floating point.  But by saying

    use integer;

you may tell the compiler to use integer operations
(see L<integer> for a detailed explanation) from here to the end of
the enclosing BLOCK.  An inner BLOCK may countermand this by saying

    no integer;

which lasts until the end of that BLOCK.  Note that this doesn't
mean everything is an integer, merely that Perl will use integer
operations for arithmetic, comparison, and bitwise operators.  For
example, even under S<C<use integer>>, if you take the C<sqrt(2)>, you'll
still get C<1.4142135623731> or so.

Used on numbers, the bitwise operators (C<&> C<|> C<^> C<~> C<< << >>
C<< >> >>) always produce integral results.  (But see also
L</Bitwise String Operators>.)  However, S<C<use integer>> still has meaning for
them.  By default, their results are interpreted as unsigned integers, but
if S<C<use integer>> is in effect, their results are interpreted
as signed integers.  For example, C<~0> usually evaluates to a large
integral value.  However, S<C<use integer; ~0>> is C<-1> on two's-complement
machines.
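
The following sketch shows the lexical scope of the pragma and the signed
interpretation of C<~0>; the expected values in the comments assume a
two's-complement machine:

    use strict;
    use warnings;

    my ($int_div, $signed, $root);
    {
        use integer;
        $int_div = 10 / 3;      # integer division: 3
        $signed  = ~0;          # signed interpretation: -1
        $root    = sqrt(2);     # functions are unaffected: 1.414...
    }
    my $float_div = 10 / 3;     # outside the block: 3.333...
    my $unsigned  = ~0;         # large positive integer

    print "$int_div $signed $float_div\n";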

=head2 Floating-point Arithmetic
X<floating-point> X<floating point> X<float> X<real>

While S<C<use integer>> provides integer-only arithmetic, there is no
analogous mechanism to provide automatic rounding or truncation to a
certain number of decimal places.  For rounding to a certain number
of digits, C<sprintf()> or C<printf()> is usually the easiest route.
See L<perlfaq4>.

Floating-point numbers are only approximations to what a mathematician
would call real numbers.  There are infinitely more reals than floats,
so some corners must be cut.  For example:

    printf "%.20g\n", 123456789123456789;
    #        produces 123456789123456784

Testing for exact floating-point equality or inequality is not a
good idea.  Here's a (relatively expensive) work-around to compare
whether two floating-point numbers are equal to a particular number of
significant digits (the precision given to C<%g>).  See Knuth, volume II,
for a more robust treatment of this topic.

    sub fp_equal {
	my ($X, $Y, $POINTS) = @_;
	my ($tX, $tY);
	$tX = sprintf("%.${POINTS}g", $X);
	$tY = sprintf("%.${POINTS}g", $Y);
	return $tX eq $tY;
    }
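
For illustration, here is how such a helper might be called.  The sketch
repeats a compact version of C<fp_equal> so that it runs on its own, and
it assumes the usual IEEE 754 double-precision build of Perl:

    use strict;
    use warnings;

    sub fp_equal {
        my ($x, $y, $points) = @_;
        # Compare the values after formatting to $points significant digits.
        return sprintf("%.${points}g", $x) eq sprintf("%.${points}g", $y);
    }

    my $sum = 0.1 + 0.2;
    print "exact:   ", ($sum == 0.3             ? "equal" : "different"), "\n";
    print "rounded: ", (fp_equal($sum, 0.3, 10) ? "equal" : "different"), "\n";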

The POSIX module (part of the standard perl distribution) implements
C<ceil()>, C<floor()>, and other mathematical and trigonometric functions.
The C<L<Math::Complex>> module (part of the standard perl distribution)
defines mathematical functions that work on both the reals and the
imaginary numbers.  C<Math::Complex> is not as efficient as POSIX, but
POSIX can't work with complex numbers.
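
A brief sketch of both modules, using only C<floor> and C<ceil> from
POSIX and C<cplx> plus the overloaded C<abs> from C<Math::Complex>; the
sample values are arbitrary:

    use strict;
    use warnings;
    use POSIX qw(floor ceil);
    use Math::Complex;

    my $down = floor(-3.5);     # -4: rounds toward negative infinity
    my $up   = ceil(-3.5);      # -3: rounds toward positive infinity

    my $z   = cplx(3, 4);       # the complex number 3+4i
    my $mod = abs($z);          # its modulus, 5

    print "$down $up $mod\n";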

Rounding in financial applications can have serious implications, and
the rounding method used should be specified precisely.  In these
cases, it probably pays not to trust whichever system rounding is
being used by Perl, but to instead implement the rounding function you
need yourself.
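
As one hedged illustration of taking control of rounding, the sketch
below keeps money as whole cents and applies an explicit round-half-up
rule with integer arithmetic.  The function name, the rates, and the
choice of rule are all assumptions made for the example; a real
application should follow whatever rule its specification mandates:

    use strict;
    use warnings;

    # Return $percent percent of $cents, rounded half up to a whole cent.
    # Both arguments are assumed to be non-negative integers.
    sub percent_of_cents {
        my ($cents, $percent) = @_;
        use integer;                        # integer division below
        return ($cents * $percent + 50) / 100;
    }

    print percent_of_cents(1999, 15), "\n";     # 299.85 rounds to 300
    print percent_of_cents(1000, 15), "\n";     # exactly 150
    print percent_of_cents(  10, 25), "\n";     # 2.5 rounds up to 3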

=head2 Bigger Numbers
X<number, arbitrary precision>

The standard C<L<Math::BigInt>>, C<L<Math::BigRat>>, and
C<L<Math::BigFloat>> modules,
along with the C<bignum>, C<bigint>, and C<bigrat> pragmas, provide
variable-precision arithmetic and overloaded operators, although
they're currently pretty slow.  At the cost of some space and
considerable speed, they avoid the normal pitfalls associated with
limited-precision representations.

	use 5.010;
	use bigint;  # easy interface to Math::BigInt
	$x = 123456789123456789;
	say $x * $x;
    +15241578780673678515622620750190521

Or with rationals:

        use 5.010;
        use bigrat;
        $x = 3/22;
        $y = 4/6;
        say "x/y is ", $x/$y;
        say "x*y is ", $x*$y;
        x/y is 9/44
        x*y is 1/11

Several modules let you calculate with unlimited or fixed precision
(bound only by memory and CPU time).  There
are also some non-standard modules that
provide faster implementations via external C libraries.

Here is a short but incomplete summary:

  Math::String           treat string sequences like numbers
  Math::FixedPrecision   calculate with a fixed precision
  Math::Currency         for currency calculations
  Bit::Vector            manipulate bit vectors fast (uses C)
  Math::BigIntFast       Bit::Vector wrapper for big numbers
  Math::Pari             provides access to the Pari C library
  Math::Cephes           uses the external Cephes C library (no
                         big numbers)
  Math::Cephes::Fraction fractions via the Cephes library
  Math::GMP              another one using an external C library
  Math::GMPz             an alternative interface to libgmp's big ints
  Math::GMPq             an interface to libgmp's fraction numbers
  Math::GMPf             an interface to libgmp's floating point numbers

Choose wisely.

=cut
=head1 NAME

perl5005delta - what's new for perl5.005

=head1 DESCRIPTION

This document describes differences between the 5.004 release and this one.

=head1 About the new versioning system

Perl is now developed on two tracks: a maintenance track that makes
small, safe updates to released production versions with emphasis on
compatibility; and a development track that pursues more aggressive
evolution.  Maintenance releases (which should be considered production
quality) have subversion numbers that run from C<1> to C<49>, and
development releases (which should be considered "alpha" quality) run
from C<50> to C<99>.

Perl 5.005 is the combined product of the new dual-track development
scheme.

=head1 Incompatible Changes

=head2 WARNING:  This version is not binary compatible with Perl 5.004.

Starting with Perl 5.004_50 there were many deep and far-reaching changes
to the language internals.  If you have dynamically loaded extensions
that you built under perl 5.003 or 5.004, you can continue to use them
with 5.004, but you will need to rebuild and reinstall those extensions
to use them with 5.005.  See F<INSTALL> for detailed instructions on how to
upgrade.

=head2 Default installation structure has changed

The new Configure defaults are designed to allow a smooth upgrade from
5.004 to 5.005, but you should read F<INSTALL> for a detailed
discussion of the changes in order to adapt them to your system.

=head2 Perl Source Compatibility

When none of the experimental features are enabled, there should be
very few user-visible Perl source compatibility issues.

If threads are enabled, then some caveats apply.  C<@_> and C<$_> become
lexical variables.  The effect of this should be largely transparent to
the user, but there are some boundary conditions under which the user will
need to be aware of the issues.  For example, C<local(@_)> results in
a "Can't localize lexical variable @_ ..." message.  Localizing these
variables may be enabled in a future version.

Some new keywords have been introduced.  These are generally expected to
have very little impact on compatibility.  See L</New C<INIT> keyword>,
L</New C<lock> keyword>, and L</New C<qrE<sol>E<sol>> operator>.

Certain barewords are now reserved.  Use of these will provoke a warning
if you have asked for them with the C<-w> switch.
See L</C<our> is now a reserved word>.

=head2 C Source Compatibility

There have been a large number of changes in the internals to support
the new features in this release.

=over 4

=item *

Core sources now require ANSI C compiler

An ANSI C compiler is now B<required> to build perl.  See F<INSTALL>.

=item *

All Perl global variables must now be referenced with an explicit prefix

All Perl global variables that are visible for use by extensions now
have a C<PL_> prefix.  New extensions should B<not> refer to perl globals
by their unqualified names.  To preserve sanity, we provide limited
backward compatibility for widely used globals such as
C<sv_undef> and C<na> (which should now be written as C<PL_sv_undef>,
C<PL_na>, etc.).

If you find that your XS extension does not compile anymore because a
perl global is not visible, try adding a C<PL_> prefix to the global
and rebuild.

It is strongly recommended that all functions in the Perl API that don't
begin with C<perl> be referenced with a C<Perl_> prefix.  The bare function
names without the C<Perl_> prefix are supported with macros, but this
support may cease in a future release.

See L<perlapi>.

=item *

Enabling threads has source compatibility issues

Perl built with threading enabled requires extensions to use the new
C<dTHR> macro to initialize the handle to access per-thread data.
If you see a compiler error that talks about the variable C<thr> not
being declared (when building a module that has XS code),  you need
to add C<dTHR;> at the beginning of the block that elicited the error.

The API function C<perl_get_sv("@",GV_ADD)> should be used instead of
directly accessing perl globals as C<GvSV(errgv)>.  The API call is
backward compatible with existing perls and provides source compatibility
when threading is enabled.

See L</"C Source Compatibility"> for more information.

=back

=head2 Binary Compatibility

This version is NOT binary compatible with older versions.  All extensions
will need to be recompiled.  Furthermore, binaries built with threads enabled
are incompatible with binaries built without.  This should largely be
transparent to the user, as all binary incompatible configurations have
their own unique architecture name, and extension binaries get installed at
unique locations.  This allows coexistence of several configurations in
the same directory hierarchy.  See F<INSTALL>.

=head2 Security fixes may affect compatibility

A few taint leaks and taint omissions have been corrected.  This may lead
to "failure" of scripts that used to work with older versions.  Compiling
with -DINCOMPLETE_TAINTS provides a perl with minimal amounts of changes
to the tainting behavior.  But note that the resulting perl will have
known insecurities.

One-liners with the C<-e> switch do not create temporary files anymore.

=head2 Relaxed new mandatory warnings introduced in 5.004

Many new warnings that were introduced in 5.004 have been made
optional.  Some of these warnings are still present, but perl's new
features make them less often a problem.  See L</New Diagnostics>.

=head2 Licensing

Perl has a new Social Contract for contributors.  See F<Porting/Contract>.

The license included in much of the Perl documentation has changed.
Most of the Perl documentation was previously under the implicit GNU
General Public License or the Artistic License (at the user's choice).
Now much of the documentation unambiguously states the terms under which
it may be distributed.  Those terms are in general much less restrictive
than the GNU GPL.  See L<perl> and the individual perl manpages listed
therein.

=head1 Core Changes


=head2 Threads

WARNING: Threading is considered an B<experimental> feature.  Details of the
implementation may change without notice.  There are known limitations
and some bugs.  These are expected to be fixed in future versions.

See F<README.threads>.

=head2 Compiler

WARNING: The Compiler and related tools are considered B<experimental>.
Features may change without notice, and there are known limitations
and bugs.  Since the compiler is fully external to perl, the default
configuration will build and install it.

The Compiler produces three different types of transformations of a
perl program.  The C backend generates C code that captures perl's state
just before execution begins.  It eliminates the compile-time overheads
of the regular perl interpreter, but the run-time performance remains
comparatively the same.  The CC backend generates optimized C code
equivalent to the code path at run-time.  The CC backend has greater
potential for big optimizations, but only a few optimizations are
implemented currently.  The Bytecode backend generates a platform
independent bytecode representation of the interpreter's state
just before execution.  Thus, the Bytecode back end also eliminates
much of the compilation overhead of the interpreter.

The compiler comes with several valuable utilities.

C<B::Lint> is an experimental module to detect and warn about suspicious
code, especially the cases that the C<-w> switch does not detect.

C<B::Deparse> can be used to demystify perl code, and understand
how perl optimizes certain constructs.

C<B::Xref> generates cross reference reports of all definition and use
of variables, subroutines and formats in a program.

C<B::Showlex> shows the lexical variables used by a subroutine or file
at a glance.

C<perlcc> is a simple frontend for compiling perl.

See C<ext/B/README>, L<B>, and the respective compiler modules.

=head2 Regular Expressions

Perl's regular expression engine has been seriously overhauled, and
many new constructs are supported.  Several bugs have been fixed.

Here is an itemized summary:

=over 4

=item Many new and improved optimizations

Changes in the RE engine:

	Unneeded nodes removed;
	Substrings merged together;
	New types of nodes to process (SUBEXPR)* and similar expressions
	    quickly, used if the SUBEXPR has no side effects and matches
	    strings of the same length;
	Better optimizations by lookup for constant substrings;
	Better search for constant substrings anchored by $ ;

Changes in Perl code using RE engine:

	More optimizations to s/longer/short/;
	study() was not working;
	/blah/ may be optimized to an analogue of index() if $& $` $' not seen;
	Unneeded copying of matched-against string removed;
	Only matched part of the string is copied if $` $' were not seen;

=item Many bug fixes

Note that only the major bug fixes are listed here.  See F<Changes> for others.

	Backtracking might not restore start of $3.
	No feedback if max count for * or + on "complex" subexpression
	    was reached, similarly (but at compile time) for {3,34567}
	Primitive restrictions on max count introduced to decrease the
	    possibility of a segfault;
	(ZERO-LENGTH)* could segfault;
	(ZERO-LENGTH)* was prohibited;
	Long REs were not allowed;
	/RE/g could skip matches at the same position after a 
	  zero-length match;

=item New regular expression constructs

The following new syntax elements are supported:

	(?<=RE)
	(?<!RE)
	(?{ CODE })
	(?i-x)
	(?i:RE)
	(?(COND)YES_RE|NO_RE)
	(?>RE)
	\z

=item New operator for precompiled regular expressions

See L</New C<qrE<sol>E<sol>> operator>.

=item Other improvements

	Better debugging output (possibly with colors),
            even from non-debugging Perl;
	RE engine code now looks like C, not like assembler;
	Behaviour of RE modifiable by `use re' directive;
	Improved documentation;
	Test suite significantly extended;
	Syntax [:^upper:] etc., reserved inside character classes;

=item Incompatible changes

	(?i) localized inside enclosing group;
	$( is not interpolated into RE any more;
	/RE/g may match at the same position (with non-zero length)
	    after a zero-length match (bug fix).

=back

See L<perlre> and L<perlop>.
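
The sketch below exercises four of the constructs listed above:
lookbehind, C<\z>, a locally scoped modifier, and an independent
(non-backtracking) subexpression.  The sample strings are invented:

    use strict;
    use warnings;

    # (?<=RE): match digits only when preceded by a dollar sign.
    my ($amount) = 'cost: $42' =~ /(?<=\$)(\d+)/;

    # \z matches only at the very end; $ also matches before a final newline.
    my $dollar_ok = "abc\n" =~ /abc$/  ? 1 : 0;     # 1
    my $z_ok      = "abc\n" =~ /abc\z/ ? 1 : 0;     # 0

    # (?i:RE): case-insensitivity applies inside the group only.
    my $mixed = 'FOObar' =~ /(?i:foo)bar/ ? 1 : 0;  # 1

    # (?>RE): once a+ has matched, it will not give characters back.
    my $atomic = 'aaab' =~ /(?>a+)ab/ ? 1 : 0;      # 0

    print "$amount $dollar_ok $z_ok $mixed $atomic\n";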

=head2 Improved malloc()

See banner at the beginning of C<malloc.c> for details.

=head2 Quicksort is internally implemented

Perl now contains its own highly optimized qsort() routine.  The new qsort()
is resistant to inconsistent comparison functions, so Perl's C<sort()> will
not provoke coredumps any more when given poorly written sort subroutines.
(Some C library C<qsort()>s that were being used before used to have this
problem.)  In our testing, the new C<qsort()> required the minimal number
of pair-wise compares on average, among all known C<qsort()> implementations.

See L<perlfunc/sort>.

=head2 Reliable signals

Perl's signal handling is susceptible to random crashes, because signals
arrive asynchronously, and the Perl runtime is not reentrant at arbitrary
times.

However, one experimental implementation of reliable signals is available
when threads are enabled.  See C<Thread::Signal>.  Also see F<INSTALL> for
how to build a Perl capable of threads.

=head2 Reliable stack pointers

The internals now reallocate the perl stack only at predictable times.
In particular, magic calls never trigger reallocations of the stack,
because all reentrancy of the runtime is handled using a "stack of stacks".
This should improve reliability of cached stack pointers in the internals
and in XSUBs.

=head2 More generous treatment of carriage returns

Perl used to complain if it encountered literal carriage returns in
scripts.  Now they are mostly treated like whitespace within program text.
Inside string literals and here documents, literal carriage returns are
ignored if they occur paired with linefeeds, or get interpreted as whitespace
if they stand alone.  This behavior means that literal carriage returns
in files should be avoided.  You can get the older, more compatible (but
less generous) behavior by defining the preprocessor symbol
C<PERL_STRICT_CR> when building perl.  Of course, all this has nothing
whatever to do with how escapes like C<\r> are handled within strings.

Note that this doesn't somehow magically allow you to keep all text files
in DOS format.  The generous treatment only applies to files that perl
itself parses.  If your C compiler doesn't allow carriage returns in
files, you may still be unable to build modules that need a C compiler.

=head2 Memory leaks

C<substr>, C<pos> and C<vec> don't leak memory anymore when used in lvalue
context.  Many small leaks that impacted applications that embed multiple
interpreters have been fixed.

=head2 Better support for multiple interpreters

The build-time option C<-DMULTIPLICITY> has had many of the details
reworked.  Some previously global variables that should have been
per-interpreter now are.  With care, this allows interpreters to call
each other.  See the C<PerlInterp> extension on CPAN.

=head2 Behavior of local() on array and hash elements is now well-defined

See L<perlsub/"Temporary Values via local()">.
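
A small sketch of localizing a single hash element; the C<%config> hash
and the subroutine names are hypothetical.  The old value comes back when
the enclosing block exits:

    use strict;
    use warnings;

    our %config = (mode => 'fast');

    sub current_mode { return $config{mode} }

    sub in_slow_mode {
        local $config{mode} = 'slow';   # restored on scope exit
        return current_mode();
    }

    my $during = in_slow_mode();        # 'slow'
    my $after  = current_mode();        # 'fast' again
    print "$during $after\n";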

=head2 C<%!> is transparently tied to the L<Errno> module

See L<perlvar>, and L<Errno>.
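
For example, a failed C<open> can be tested by symbolic error name rather
than by number or message text.  The path below is deliberately bogus,
and C<ENOENT> is assumed to exist on the platform (it does on POSIX
systems):

    use strict;
    use warnings;

    my $missing = 0;
    unless (open(my $fh, '<', '/no/such/dir/no-such-file.txt')) {
        # %! is true for the key that names the current value of $!.
        $missing = 1 if $!{ENOENT};
    }
    print $missing ? "file does not exist\n" : "some other outcome\n";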

=head2 Pseudo-hashes are supported

See L<perlref>.

=head2 C<EXPR foreach EXPR> is supported

See L<perlsyn>.

=head2 Keywords can be globally overridden

See L<perlsub>.

=head2 C<$^E> is meaningful on Win32

See L<perlvar>.

=head2 C<foreach (1..1000000)> optimized

C<foreach (1..1000000)> is now optimized into a counting loop.  It does
not try to allocate a 1000000-size list anymore.

=head2 C<Foo::> can be used as implicitly quoted package name

Barewords caused unintuitive behavior when a subroutine with the same
name as a package happened to be defined.  Thus, C<new Foo @args>
used the result of the call to C<Foo()> instead of treating C<Foo>
as a literal.  The recommended way to write barewords in the indirect
object slot is C<new Foo:: @args>.  Note that the method C<new()> is
called with a first argument of C<Foo>, not C<Foo::>, when you do that.

=head2 C<exists $Foo::{Bar::}> tests existence of a package

It was impossible to test for the existence of a package without
actually creating it before.  Now C<exists $Foo::{Bar::}> can be
used to test if the C<Foo::Bar> namespace has been created.
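
A sketch of the test, with made-up package names.  The key is written as
a quoted string here to sidestep any question about how a bareword ending
in C<::> is parsed inside the subscript:

    use strict;
    use warnings;

    package Foo::Bar;
    sub hello { return 'hi' }

    package main;

    my $has_bar = exists $Foo::{'Bar::'} ? 1 : 0;   # package was declared
    my $has_baz = exists $Foo::{'Baz::'} ? 1 : 0;   # never mentioned

    print "Foo::Bar: $has_bar, Foo::Baz: $has_baz\n";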

=head2 Better locale support

See L<perllocale>.

=head2 Experimental support for 64-bit platforms

Perl5 has always had 64-bit support on systems with 64-bit longs.
Starting with 5.005, the beginnings of experimental support for systems
with 32-bit longs and 64-bit 'long long' integers have been added.
If you add -DUSE_LONG_LONG to your ccflags in config.sh (or manually
define it in perl.h) then perl will be built with 'long long' support.
There will be many compiler warnings, and the resultant perl may not
work on all systems.  There are many other issues related to
third-party extensions and libraries.  This option exists to allow
people to work on those issues.

=head2 prototype() returns useful results on builtins

See L<perlfunc/prototype>.

=head2 Extended support for exception handling

C<die()> now accepts a reference value, and C<$@> gets set to that
value in exception traps.  This makes it possible to propagate
exception objects.  This is an undocumented B<experimental> feature.
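
A minimal sketch of propagating structured error information this way,
using a plain hash reference; the field names are arbitrary:

    use strict;
    use warnings;

    my $ok = eval {
        die { code => 404, message => 'not found' };
        1;
    };

    my ($code, $message) = (0, '');
    if (!$ok && ref $@ eq 'HASH') {
        # $@ holds the very reference that was passed to die().
        ($code, $message) = @{$@}{qw(code message)};
    }
    print "$code: $message\n";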

=head2 Re-blessing in DESTROY() supported for chaining DESTROY() methods

See L<perlobj/Destructors>.

=head2 All C<printf> format conversions are handled internally

See L<perlfunc/printf>.

=head2 New C<INIT> keyword

C<INIT> subs are like C<BEGIN> and C<END>, but they get run just before
the perl runtime begins execution.  For example, the Perl Compiler makes
use of C<INIT> blocks to initialize and resolve pointers to XSUBs.

=head2 New C<lock> keyword

The C<lock> keyword is the fundamental synchronization primitive
in threaded perl.  When threads are not enabled, it is currently a no-op.

To minimize impact on source compatibility this keyword is "weak", i.e., any
user-defined subroutine of the same name overrides it, unless a C<use Thread>
has been seen.

=head2 New C<qr//> operator

The C<qr//> operator, which is syntactically similar to the other quote-like
operators, is used to create precompiled regular expressions.  This compiled
form can now be explicitly passed around in variables, and interpolated in
other regular expressions.  See L<perlop>.
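
A short sketch of building a larger pattern from precompiled pieces; the
patterns and input are illustrative:

    use strict;
    use warnings;

    my $word = qr/[a-z]+/i;                 # compiled once, flags included
    my $pair = qr/($word)\s+($word)/;       # interpolated into another regex

    my ($first, $second) = ('Hello big world' =~ $pair);
    print "$first / $second\n";

The C</i> flag travels with C<$word>, so it still applies inside C<$pair>
even though C<$pair> itself was compiled without it.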

=head2 C<our> is now a reserved word

Calling a subroutine with the name C<our> will now provoke a warning when
using the C<-w> switch.

=head2 Tied arrays are now fully supported

See L<Tie::Array>.

=head2 Tied handles support is better

Several missing hooks have been added.  There is also a new base class for
TIEARRAY implementations.  See L<Tie::Array>.

=head2 4th argument to substr

substr() can now both return and replace in one operation.  The optional
4th argument is the replacement string.  See L<perlfunc/substr>.
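
For instance (the strings are arbitrary), a single call can fetch the old
text and splice in the new:

    use strict;
    use warnings;

    my $greeting = 'Hello, world';

    # Replace the first five characters and return what was there before.
    my $old = substr($greeting, 0, 5, 'Howdy');

    print "$old -> $greeting\n";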

=head2 Negative LENGTH argument to splice

splice() with a negative LENGTH argument now works similarly to
substr() with a negative LENGTH.  Previously a negative LENGTH was treated as
0.  See L<perlfunc/splice>.
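
A small sketch with an arbitrary list: a LENGTH of C<-2> removes
everything from OFFSET onward except the last two elements.

    use strict;
    use warnings;

    my @list = (1 .. 6);

    # Remove from index 1 onward, but leave the final two elements alone.
    my @removed = splice(@list, 1, -2);

    print "removed: @removed; left: @list\n";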

=head2 Magic lvalues are now more magical

When you say something like C<substr($x, 5) = "hi">, the scalar returned
by substr() is special, in that any modifications to it affect $x.
(This is called a 'magic lvalue' because an 'lvalue' is something on
the left side of an assignment.)  Normally, this is exactly what you
would expect to happen, but Perl uses the same magic if you use substr(),
pos(), or vec() in a context where they might be modified, like taking
a reference with C<\> or as an argument to a sub that modifies C<@_>.
In previous versions, this 'magic' only went one way, but now changes
to the scalar the magic refers to ($x in the above example) affect the
magic lvalue too. For instance, this code now acts differently:

    $x = "hello";
    sub printit {
	$x = "g'bye";
	print $_[0], "\n";
    }
    printit(substr($x, 0, 5));

In previous versions, this would print "hello", but it now prints "g'bye".

=head2 <> now reads in records

If C<$/> is a reference to an integer, or a scalar that holds an integer,
C<< <> >> will read in records instead of lines.  For more info, see
L<perlvar/$E<sol>>.
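
As a sketch, reading an in-memory file in four-byte records looks like
this; the data and record size are arbitrary, and the final record is
simply shorter when the input runs out:

    use strict;
    use warnings;

    my $data = 'abcdefghij';
    open(my $fh, '<', \$data) or die "cannot open in-memory file: $!";

    my @records;
    {
        local $/ = \4;          # fixed-size records of 4 bytes
        @records = <$fh>;
    }
    close($fh);

    print join('|', @records), "\n";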

=head1 Supported Platforms

Configure has many incremental improvements.  Site-wide policy for building
perl can now be made persistent, via Policy.sh.  Configure also records
the command-line arguments used in F<config.sh>.

=head2 New Platforms

BeOS is now supported.  See F<README.beos>.

DOS is now supported under the DJGPP tools.  See F<README.dos> (installed 
as L<perldos> on some systems).

MiNT is now supported.  See F<README.mint>.

MPE/iX is now supported.  See F<README.mpeix>.

MVS (aka OS390, aka Open Edition) is now supported.  See F<README.os390> 
(installed as L<perlos390> on some systems).

Stratus VOS is now supported.  See F<README.vos>.

=head2 Changes in existing support

Win32 support has been vastly enhanced.  Perl Object, a C++
encapsulation of Perl, is now supported.  GCC and EGCS are now supported
on Win32.  See F<README.win32>, aka L<perlwin32>.

VMS configuration system has been rewritten.  See F<README.vms> (installed 
as F<README_vms> on some systems).

The hints files for most Unix platforms have seen incremental improvements.

=head1 Modules and Pragmata

=head2 New Modules

=over 4

=item B

Perl compiler and tools.  See L<B>.

=item Data::Dumper

A module to pretty print Perl data.  See L<Data::Dumper>.

=item Dumpvalue

A module to dump perl values to the screen. See L<Dumpvalue>.

=item Errno

A module to look up errors more conveniently.  See L<Errno>.

=item File::Spec

A portable API for file operations.

=item ExtUtils::Installed

Query and manage installed modules.

=item ExtUtils::Packlist

Manipulate .packlist files.

=item Fatal

Make functions/builtins succeed or die.

=item IPC::SysV

Constants and other support infrastructure for System V IPC operations
in perl.

=item Test

A framework for writing test suites.

=item Tie::Array

Base class for tied arrays.

=item Tie::Handle

Base class for tied handles.

=item Thread

Perl thread creation, manipulation, and support.

=item attrs

Set subroutine attributes.

=item fields

Compile-time class fields.

=item re

Various pragmata to control behavior of regular expressions.

=back

=head2 Changes in existing modules

=over 4

=item Benchmark

You can now run tests for I<x> seconds instead of guessing the right
number of tests to run.

Keeps better time.

=item Carp

Carp has a new function cluck(). cluck() warns, like carp(), but also adds
a stack backtrace to the error message, like confess().

=item CGI

CGI has been updated to version 2.42.

=item Fcntl

More Fcntl constants added: F_SETLK64, F_SETLKW64, O_LARGEFILE for
large (more than 4G) file access (the 64-bit support is not yet
working, though, so no need to get overly excited), Free/Net/OpenBSD
locking behaviour flags F_FLOCK, F_POSIX, Linux F_SHLCK, and
O_ACCMODE: the mask of O_RDONLY, O_WRONLY, and O_RDWR.

=item Math::Complex

The accessor methods Re, Im, arg, abs, rho, and theta ($z->Re()) can
now also act as mutators ($z->Re(3)).

=item Math::Trig

A little bit of radial trigonometry (cylindrical and spherical) added,
for example the great circle distance.

=item POSIX

POSIX now has its own platform-specific hints files.

=item DB_File

DB_File supports version 2.x of Berkeley DB.  See C<ext/DB_File/Changes>.

=item MakeMaker

MakeMaker now supports writing empty makefiles and provides a way to
specify that site umask() policy should be honored.  There is also
better support for manipulation of .packlist files, and getting
information about installed modules.

Extensions that have both architecture-dependent and
architecture-independent files are now always installed completely in
the architecture-dependent locations.  Previously, the shareable parts
were shared both across architectures and across perl versions and were
therefore liable to be overwritten with newer versions that might have
subtle incompatibilities.

=item CPAN

See L<perlmodinstall> and L<CPAN>.

=item Cwd

Cwd::cwd is faster on most platforms.

=back
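
The cluck() function added to Carp (see above) can be tried with a short
sketch such as the following.  The subroutine names C<inner> and C<outer>
are made up for illustration, and the C<$SIG{__WARN__}> handler is there
only to capture the message so that it can be inspected.

    use strict;
    use warnings;
    use Carp qw(cluck);

    # Collect warnings in a string instead of sending them to STDERR.
    my $message = '';
    local $SIG{__WARN__} = sub { $message .= $_[0] };

    sub inner { cluck "something looks odd" }
    sub outer { inner() }

    outer();

    # $message now holds the warning text followed by a backtrace
    # that mentions both main::inner and main::outer.
    print $message;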

=head1 Utility Changes

C<h2ph> and related utilities have been vastly overhauled.

C<perlcc>, a new experimental front end for the compiler, is available.

The crude GNU C<configure> emulator is now called C<configure.gnu> to
avoid trampling on C<Configure> under case-insensitive filesystems.

C<perldoc> used to be rather slow.  The slower features are now optional.
In particular, case-insensitive searches need the C<-i> switch, and
recursive searches need C<-r>.  You can set these switches in the
C<PERLDOC> environment variable to get the old behavior.

=head1 Documentation Changes

Config.pm now has a glossary of variables.

F<Porting/patching.pod> has detailed instructions on how to create and
submit patches for perl.

L<perlport> specifies guidelines on how to write portably. 

L<perlmodinstall> describes how to fetch and install modules from C<CPAN>
sites.

Some more Perl traps are documented now.  See L<perltrap>.

L<perlopentut> gives a tutorial on using open().

L<perlreftut> gives a tutorial on references.

L<perlthrtut> gives a tutorial on threads.

=head1 New Diagnostics

=over 4

=item Ambiguous call resolved as CORE::%s(), qualify as such or use &

(W) A subroutine you have declared has the same name as a Perl keyword,
and you have used the name without qualification for calling one or the
other.  Perl decided to call the builtin because the subroutine is
not imported.

To force interpretation as a subroutine call, either put an ampersand
before the subroutine name, or qualify the name with its package.
Alternatively, you can import the subroutine (or pretend that it's
imported with the C<use subs> pragma).

To silently interpret it as the Perl operator, use the C<CORE::> prefix
on the operator (e.g. C<CORE::log($x)>) or declare the subroutine
to be an object method (see L</attrs>).

=item Bad index while coercing array into hash

(F) The index looked up in the hash found as the 0'th element of a
pseudo-hash is not legal.  Index values must be 1 or greater.
See L<perlref>.

=item Bareword "%s" refers to nonexistent package

(W) You used a qualified bareword of the form C<Foo::>, but
the compiler saw no other uses of that namespace before that point.
Perhaps you need to predeclare a package?

=item Can't call method "%s" on an undefined value

(F) You used the syntax of a method call, but the slot filled by the
object reference or package name contains an undefined value.
Something like this will reproduce the error:

    $BADREF = undef;
    process $BADREF 1,2,3;
    $BADREF->process(1,2,3);

=item Can't check filesystem of script "%s" for nosuid

(P) For some reason you can't check the filesystem of the script for nosuid.

=item Can't coerce array into hash

(F) You used an array where a hash was expected, but the array has no
information on how to map from keys to array indices.  You can do that
only with arrays that have a hash reference at index 0.

=item Can't goto subroutine from an eval-string

(F) The "goto subroutine" call can't be used to jump out of an eval "string".
(You can use it to jump out of an eval {BLOCK}, but you probably don't want to.)

=item Can't localize pseudo-hash element

(F) You said something like C<< local $ar->{'key'} >>, where $ar is
a reference to a pseudo-hash.  That hasn't been implemented yet, but
you can get a similar effect by localizing the corresponding array
element directly: C<< local $ar->[$ar->[0]{'key'}] >>.

=item Can't use %%! because Errno.pm is not available

(F) The first time the %! hash is used, perl automatically loads the
Errno.pm module. The Errno module is expected to tie the %! hash to
provide symbolic names for C<$!> errno values.

=item Cannot find an opnumber for "%s"

(F) A string of a form C<CORE::word> was given to prototype(), but
there is no builtin with the name C<word>.

=item Character class syntax [. .] is reserved for future extensions

(W) Within regular expression character classes ([]) the syntax beginning
with "[." and ending with ".]" is reserved for future extensions.
If you need to represent those character sequences inside a regular
expression character class, just quote the square brackets with the
backslash: "\[." and ".\]".

=item Character class syntax [: :] is reserved for future extensions

(W) Within regular expression character classes ([]) the syntax beginning
with "[:" and ending with ":]" is reserved for future extensions.
If you need to represent those character sequences inside a regular
expression character class, just quote the square brackets with the
backslash: "\[:" and ":\]".

=item Character class syntax [= =] is reserved for future extensions

(W) Within regular expression character classes ([]) the syntax
beginning with "[=" and ending with "=]" is reserved for future extensions.
If you need to represent those character sequences inside a regular
expression character class, just quote the square brackets with the
backslash: "\[=" and "=\]".

=item %s: Eval-group in insecure regular expression

(F) Perl detected tainted data when trying to compile a regular expression
that contains the C<(?{ ... })> zero-width assertion, which is unsafe.
See L<perlre/(?{ code })>, and L<perlsec>.

=item %s: Eval-group not allowed, use re 'eval'

(F) A regular expression contained the C<(?{ ... })> zero-width assertion,
but that construct is only allowed when the C<use re 'eval'> pragma is
in effect.  See L<perlre/(?{ code })>.

=item %s: Eval-group not allowed at run time

(F) Perl tried to compile a regular expression containing the C<(?{ ... })>
zero-width assertion at run time, as it would when the pattern contains
interpolated values.  Since that is a security risk, it is not allowed.
If you insist, you may still do this by explicitly building the pattern
from an interpolated string at run time and using that in an eval().
See L<perlre/(?{ code })>.

=item Explicit blessing to '' (assuming package main)

(W) You are blessing a reference to a zero length string.  This has
the effect of blessing the reference into the package main.  This is
usually not what you want.  Consider providing a default target
package, e.g. bless($ref, $p || 'MyPackage');

=item Illegal hex digit ignored

(W) You may have tried to use a character other than 0 - 9 or A - F in a
hexadecimal number.  Interpretation of the hexadecimal number stopped
before the illegal character.

=item No such array field

(F) You tried to access an array as a hash, but the field name used is
not defined.  The hash at index 0 should map all valid field names to
array indices for that to work.

=item No such field "%s" in variable %s of type %s

(F) You tried to access a field of a typed variable where the type
does not know about the field name.  The field names are looked up in
the %FIELDS hash in the type package at compile time.  The %FIELDS hash
is usually set up with the 'fields' pragma.

=item Out of memory during ridiculously large request

(F) You can't allocate more than 2^31+"small amount" bytes.  This error
is most likely to be caused by a typo in the Perl program, e.g., C<$arr[time]>
instead of C<$arr[$time]>.

=item Range iterator outside integer range

(F) One (or both) of the numeric arguments to the range operator ".."
are outside the range which can be represented by integers internally.
One possible workaround is to force Perl to use magical string
increment by prepending "0" to your numbers.

=item Recursive inheritance detected while looking for method '%s' %s

(F) More than 100 levels of inheritance were encountered while invoking a
method.  Probably indicates an unintended loop in your inheritance hierarchy.

=item Reference found where even-sized list expected

(W) You gave a single reference where Perl was expecting a list with
an even number of elements (for assignment to a hash). This
usually means that you used the anon hash constructor when you meant 
to use parens. In any case, a hash requires key/value B<pairs>.

    %hash = { one => 1, two => 2, };   # WRONG
    %hash = [ qw/ an anon array / ];   # WRONG
    %hash = ( one => 1, two => 2, );   # right
    %hash = qw( one 1 two 2 );         # also fine

=item Undefined value assigned to typeglob

(W) An undefined value was assigned to a typeglob, a la C<*foo = undef>.
This does nothing.  It's possible that you really mean C<undef *foo>.

=item Use of reserved word "%s" is deprecated

(D) The indicated bareword is a reserved word.  Future versions of perl
may use it as a keyword, so you're better off either explicitly quoting
the word in a manner appropriate for its context of use, or using a
different name altogether.  The warning can be suppressed for subroutine
names by either adding a C<&> prefix, or using a package qualifier,
e.g. C<&our()>, or C<Foo::our()>.

=item perl: warning: Setting locale failed.

(S) The whole warning message will look something like:

       perl: warning: Setting locale failed.
       perl: warning: Please check that your locale settings:
               LC_ALL = "En_US",
               LANG = (unset)
           are supported and installed on your system.
       perl: warning: Falling back to the standard locale ("C").

The exact locale settings that failed will vary.  In the above the
settings were that the LC_ALL was "En_US" and the LANG had no value.
This error means that Perl detected that you and/or your system
administrator have set up the so-called locale system but Perl could
not use those settings.  This was not dead serious, fortunately: there
is a "default locale" called "C" that Perl can and will use, so the
script will be run.  Before you really fix the problem, however, you
will get the same error message each time you run Perl.  How to really
fix the problem can be found in L<perllocale/"LOCALE PROBLEMS">.

=back
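
As a sketch of the disambiguation described under "Ambiguous call
resolved as CORE::%s()" above, the following defines a subroutine that
shares its name with the builtin log() and then calls each one
explicitly.  The subroutine body is invented for illustration.

    use strict;
    use warnings;

    # A user-defined sub that shares its name with a builtin.
    sub log { return "custom log: @_" }

    my $builtin = CORE::log(1);     # the builtin natural logarithm
    my $custom  = &log(1);          # the user-defined subroutine
    my $also    = main::log(1);     # likewise, via package qualification

    print "$builtin\n$custom\n$also\n";

Because every call is qualified one way or the other, there is nothing
left for Perl to guess about.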


=head1 Obsolete Diagnostics

=over 4

=item Can't mktemp()

(F) The mktemp() routine failed for some reason while trying to process
a B<-e> switch.  Maybe your /tmp partition is full, or clobbered.

Removed because B<-e> doesn't use temporary files any more.

=item Can't write to temp file for B<-e>: %s

(F) The write routine failed for some reason while trying to process
a B<-e> switch.  Maybe your /tmp partition is full, or clobbered.

Removed because B<-e> doesn't use temporary files any more.

=item Cannot open temporary file

(F) The create routine failed for some reason while trying to process
a B<-e> switch.  Maybe your /tmp partition is full, or clobbered.

Removed because B<-e> doesn't use temporary files any more.

=item regexp too big

(F) The current implementation of regular expressions uses shorts as
address offsets within a string.  Unfortunately this means that if
the regular expression compiles to longer than 32767, it'll blow up.
Usually when you want a regular expression this big, there is a better
way to do it with multiple statements.  See L<perlre>.

=back

=head1 Configuration Changes

You can use "Configure -Uinstallusrbinperl" which causes installperl
to skip installing perl also as /usr/bin/perl.  This is useful if you
prefer not to modify /usr/bin for some reason or another, but harmful
because many scripts assume that Perl can be found at /usr/bin/perl.

=head1 BUGS

If you find what you think is a bug, you might check the headers of
recently posted articles in the comp.lang.perl.misc newsgroup.
There may also be information at http://www.perl.com/perl/ , the Perl
Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Make sure you trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to <F<perlbug@perl.com>> to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=head1 HISTORY

Written by Gurusamy Sarathy <F<gsar@activestate.com>>, with many contributions
from The Perl Porters.

Send omissions or corrections to <F<perlbug@perl.com>>.

=cut

=head1 NAME

perlport - Writing portable Perl

=head1 DESCRIPTION

Perl runs on numerous operating systems.  While most of them share
much in common, they also have their own unique features.

This document is meant to help you to find out what constitutes portable
Perl code.  That way once you make a decision to write portably,
you know where the lines are drawn, and you can stay within them.

There is a tradeoff between taking full advantage of one particular
type of computer and taking advantage of a full range of them.
Naturally, as you broaden your range and become more diverse, the
common factors drop, and you are left with an increasingly smaller
area of common ground in which you can operate to accomplish a
particular task.  Thus, when you begin attacking a problem, it is
important to consider under which part of the tradeoff curve you
want to operate.  Specifically, you must decide whether it is
important that the task that you are coding has the full generality
of being portable, or whether to just get the job done right now.
This is the hardest choice to be made.  The rest is easy, because
Perl provides many choices, whichever way you want to approach your
problem.

Looking at it another way, writing portable code is usually about
willfully limiting your available choices.  Naturally, it takes
discipline and sacrifice to do that.  The product of portability
and convenience may be a constant.  You have been warned.

Be aware of two important points:

=over 4

=item Not all Perl programs have to be portable

There is no reason you should not use Perl as a language to glue Unix
tools together, or to prototype a Macintosh application, or to manage the
Windows registry.  If it makes no sense to aim for portability for one
reason or another in a given program, then don't bother.

=item Nearly all of Perl already I<is> portable

Don't be fooled into thinking that it is hard to create portable Perl
code.  It isn't.  Perl tries its level-best to bridge the gaps between
what's available on different platforms, and all the means available to
use those features.  Thus almost all Perl code runs on any machine
without modification.  But there are some significant issues in
writing portable code, and this document is entirely about those issues.

=back

Here's the general rule: When you approach a task commonly done
using a whole range of platforms, think about writing portable
code.  That way, you don't sacrifice much by way of the implementation
choices you can avail yourself of, and at the same time you can give
your users lots of platform choices.  On the other hand, when you have to
take advantage of some unique feature of a particular platform, as is
often the case with systems programming (whether for Unix, Windows,
VMS, etc.), consider writing platform-specific code.

When the code will run on only two or three operating systems, you
may need to consider only the differences of those particular systems.
The important thing is to decide where the code will run and to be
deliberate in your decision.

The material below is separated into three main sections: main issues of
portability (L</"ISSUES">), platform-specific issues (L</"PLATFORMS">), and
built-in Perl functions that behave differently on various ports
(L</"FUNCTION IMPLEMENTATIONS">).

This information should not be considered complete; it includes possibly
transient information about idiosyncrasies of some of the ports, almost
all of which are in a state of constant evolution.  Thus, this material
should be considered a perpetual work in progress
(C<< <IMG SRC="yellow_sign.gif" ALT="Under Construction"> >>).

=head1 ISSUES

=head2 Newlines

In most operating systems, lines in files are terminated by newlines.
Just what is used as a newline may vary from OS to OS.  Unix
traditionally uses C<\012>, one type of DOSish I/O uses C<\015\012>,
S<Mac OS> uses C<\015>, and z/OS uses C<\025>.

Perl uses C<\n> to represent the "logical" newline, where what is
logical may depend on the platform in use.  In MacPerl, C<\n> always
means C<\015>.  On EBCDIC platforms, C<\n> could be C<\025> or C<\045>.
In DOSish perls, C<\n> usually means C<\012>, but when
accessing a file in "text" mode, perl uses the C<:crlf> layer that
translates it to (or from) C<\015\012>, depending on whether you're
reading or writing. Unix does the same thing on ttys in canonical
mode.  C<\015\012> is commonly referred to as CRLF.

To trim trailing newlines from text lines use
L<C<chomp>|perlfunc/chomp VARIABLE>.  With default settings that function
looks for a trailing C<\n> character and thus trims in a portable way.

When dealing with binary files (or text files in binary mode) be sure
to explicitly set L<C<$E<sol>>|perlvar/$E<sol>> to the appropriate value for
your file format before using L<C<chomp>|perlfunc/chomp VARIABLE>.
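
A minimal sketch of that advice follows, assuming a file format whose
records end in CRLF.  The data is made up, and an in-memory filehandle
stands in for a real file opened in binary mode.

    use strict;
    use warnings;

    # Hypothetical CRLF-terminated data, read without newline translation.
    my $data = "first\015\012second\015\012";
    open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
    binmode($fh);

    my @records;
    {
        local $/ = "\015\012";      # records end in CRLF in this format
        while (my $line = <$fh>) {
            chomp $line;            # removes the whole "\015\012"
            push @records, $line;
        }
    }
    close($fh);

    print join(',', @records), "\n";

Localizing C<$/> inside a block keeps the changed record separator from
leaking into unrelated code.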

Because of the "text" mode translation, DOSish perls have limitations in
using L<C<seek>|perlfunc/seek FILEHANDLE,POSITION,WHENCE> and
L<C<tell>|perlfunc/tell FILEHANDLE> on a file accessed in "text" mode.
Stick to L<C<seek>|perlfunc/seek FILEHANDLE,POSITION,WHENCE>-ing to
locations you got from L<C<tell>|perlfunc/tell FILEHANDLE> (and no
others), and you are usually free to use
L<C<seek>|perlfunc/seek FILEHANDLE,POSITION,WHENCE> and
L<C<tell>|perlfunc/tell FILEHANDLE> even in "text" mode.  Using
L<C<seek>|perlfunc/seek FILEHANDLE,POSITION,WHENCE> or
L<C<tell>|perlfunc/tell FILEHANDLE> or other file operations may be
non-portable.  If you use L<C<binmode>|perlfunc/binmode FILEHANDLE> on a
file, however, you can usually
L<C<seek>|perlfunc/seek FILEHANDLE,POSITION,WHENCE> and
L<C<tell>|perlfunc/tell FILEHANDLE> with arbitrary values safely.
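
The conservative pattern described above, seeking only to positions
previously returned by tell, can be sketched like this.  The temporary
file and its contents are invented for the example.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Create a small scratch file; it is removed when the program exits.
    my ($out, $filename) = tempfile(UNLINK => 1);
    print {$out} "line one\n", "line two\n", "line three\n";
    close($out) or die "Cannot close $filename: $!";

    open(my $in, '<', $filename) or die "Cannot open $filename: $!";
    my $first  = <$in>;
    my $mark   = tell($in);          # position just after the first line
    my $second = <$in>;
    my $third  = <$in>;

    # Return only to a position that tell() gave us (0 means SEEK_SET).
    seek($in, $mark, 0) or die "Cannot seek in $filename: $!";
    my $again = <$in>;               # reads the second line once more
    close($in);

    print $again;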

A common misconception in socket programming is that S<C<\n eq \012>>
everywhere.  When using protocols such as common Internet protocols,
C<\012> and C<\015> are called for specifically, and the values of
the logical C<\n> and C<\r> (carriage return) are not reliable.

    print $socket "Hi there, client!\r\n";      # WRONG
    print $socket "Hi there, client!\015\012";  # RIGHT

However, using C<\015\012> (or C<\cM\cJ>, or C<\x0D\x0A>) can be tedious
and unsightly, as well as confusing to those maintaining the code.  As
such, the L<C<Socket>|Socket> module supplies the Right Thing for those
who want it.

    use Socket qw(:DEFAULT :crlf);
    print $socket "Hi there, client!$CRLF";     # RIGHT

When reading from a socket, remember that the default input record
separator L<C<$E<sol>>|perlvar/$E<sol>> is C<\n>, but robust socket code
will recognize either C<\012> or C<\015\012> as end of line:

    while (<$socket>) {  # NOT ADVISABLE!
        # ...
    }

Because both CRLF and LF end in LF, the input record separator can
be set to LF and any CR stripped later.  Better to write:

    use Socket qw(:DEFAULT :crlf);
    local($/) = LF;      # not needed if $/ is already \012

    while (<$socket>) {
        s/$CR?$LF/\n/;   # not sure if socket uses LF or CRLF, OK
    #   s/\015?\012/\n/; # same thing
    }

This example is preferred over the previous one--even for Unix
platforms--because now any C<\015>'s (C<\cM>'s) are stripped out
(and there was much rejoicing).

Similarly, functions that return text data--such as a function that
fetches a web page--should sometimes translate newlines before
returning the data, if they've not yet been translated to the local
newline representation.  A single line of code will often suffice:

    $data =~ s/\015?\012/\n/g;
    return $data;

Some of this may be confusing.  Here's a handy reference to the ASCII CR
and LF characters.  You can print it out and stick it in your wallet.

    LF  eq  \012  eq  \x0A  eq  \cJ  eq  chr(10)  eq  ASCII 10
    CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq  ASCII 13

             | Unix | DOS  | Mac  |
        ---------------------------
        \n   |  LF  |  LF  |  CR  |
        \r   |  CR  |  CR  |  LF  |
        \n * |  LF  | CRLF |  CR  |
        \r * |  CR  |  CR  |  LF  |
        ---------------------------
        * text-mode STDIO

The Unix column assumes that you are not accessing a serial line
(like a tty) in canonical mode.  If you are, then CR on input becomes
"\n", and "\n" on output becomes CRLF.

These are just the most common definitions of C<\n> and C<\r> in Perl.
There may well be others.  For example, on an EBCDIC implementation
such as z/OS (OS/390) or OS/400 (using the ILE, the PASE is ASCII-based)
the above material is similar to "Unix" but the code numbers change:

    LF  eq  \025  eq  \x15  eq  \cU  eq  chr(21)  eq  CP-1047 21
    LF  eq  \045  eq  \x25  eq           chr(37)  eq  CP-0037 37
    CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq  CP-1047 13
    CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq  CP-0037 13

             | z/OS | OS/400 |
        ----------------------
        \n   |  LF  |  LF    |
        \r   |  CR  |  CR    |
        \n * |  LF  |  LF    |
        \r * |  CR  |  CR    |
        ----------------------
        * text-mode STDIO

=head2 Numbers endianness and Width

Different CPUs store integers and floating point numbers in different
orders (called I<endianness>) and widths (32-bit and 64-bit being the
most common today).  This affects your programs when they attempt to transfer
numbers in binary format from one CPU architecture to another,
usually either "live" via network connection, or by storing the
numbers to secondary storage such as a disk file or tape.

Conflicting storage orders make an utter mess out of the numbers.  If a
little-endian host (Intel, VAX) stores 0x12345678 (305419896 in
decimal), a big-endian host (Motorola, Sparc, PA) reads it as
0x78563412 (2018915346 in decimal).  Alpha and MIPS can be either:
Digital/Compaq used/uses them in little-endian mode; SGI/Cray uses
them in big-endian mode.  To avoid this problem in network (socket)
connections use the L<C<pack>|perlfunc/pack TEMPLATE,LIST> and
L<C<unpack>|perlfunc/unpack TEMPLATE,EXPR> formats C<n> and C<N>, the
"network" orders.  These are guaranteed to be portable.

As of Perl 5.10.0, you can also use the C<E<gt>> and C<E<lt>> modifiers
to force big- or little-endian byte-order.  This is useful if you want
to store signed integers or 64-bit integers, for example.
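
As a brief sketch of the two approaches just described, the following
packs integers in network order with C<n> and C<N>, and then packs a
signed integer with the C<E<gt>> and C<E<lt>> modifiers (which assumes
Perl 5.10.0 or later).  The values are arbitrary.

    use strict;
    use warnings;

    # Network (big-endian) order: 16-bit "n", then 32-bit "N".
    my $wire     = pack("n N", 513, 305419896);
    my $wire_hex = unpack("H*", $wire);           # "020112345678"
    my ($short, $long) = unpack("n N", $wire);    # 513 and 305419896

    # Explicit byte order for a signed 32-bit integer.
    my $big_hex    = unpack("H*", pack("l>", -2));   # "fffffffe"
    my $little_hex = unpack("H*", pack("l<", -2));   # "feffffff"
    my ($value)    = unpack("l>", pack("l>", -2));   # -2

    print "$wire_hex $short $long $big_hex $little_hex $value\n";

The hexadecimal strings come out the same on big- and little-endian
hosts, which is the point of choosing an explicit byte order.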

You can explore the endianness of your platform by unpacking a
data structure packed in native format such as:

    print unpack("h*", pack("s2", 1, 2)), "\n";
    # '10002000' on e.g. Intel x86 or Alpha 21064 in little-endian mode
    # '00100020' on e.g. Motorola 68040

If you need to distinguish between endian architectures you could use
either of the variables set like so:

    $is_big_endian   = unpack("h*", pack("s", 1)) =~ /01/;
    $is_little_endian = unpack("h*", pack("s", 1)) =~ /^1/;

Differing widths can cause truncation even between platforms of equal
endianness.  The platform of shorter width loses the upper parts of the
number.  There is no good solution for this problem except to avoid
transferring or storing raw binary numbers.

One can circumvent both of these problems in two ways.  Either
transfer and store numbers always in text format, instead of raw
binary, or else consider using modules like
L<C<Data::Dumper>|Data::Dumper> and L<C<Storable>|Storable> (included as
of Perl 5.8).  Keeping all data as text significantly simplifies matters.
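
A minimal sketch of the text-format approach follows.  The
one-number-per-line layout is merely an illustrative choice, not a
format prescribed by any module.

    use strict;
    use warnings;

    my @numbers = (305419896, -42, 3.5);

    # Serialize as decimal text, one number per line.
    my $text = join("\n", @numbers) . "\n";

    # Later, possibly on a machine with a different byte order or
    # integer width, parse the text back into numbers.
    my @restored = map { $_ + 0 } split /\n/, $text;

    print "@restored\n";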

=head2 Files and Filesystems

Most platforms these days structure files in a hierarchical fashion.
So, it is reasonably safe to assume that all platforms support the
notion of a "path" to uniquely identify a file on the system.  How
that path is really written, though, differs considerably.

Although similar, file path specifications differ between Unix,
Windows, S<Mac OS>, OS/2, VMS, VOS, S<RISC OS>, and probably others.
Unix, for example, is one of the few OSes that has the elegant idea
of a single root directory.

DOS, OS/2, VMS, VOS, and Windows can work similarly to Unix with C</>
as path separator, or in their own idiosyncratic ways (such as having
several root directories and various "unrooted" device files such as NIL:
and LPT:).

S<Mac OS> 9 and earlier used C<:> as a path separator instead of C</>.

The filesystem may support neither hard links
(L<C<link>|perlfunc/link OLDFILE,NEWFILE>) nor symbolic links
(L<C<symlink>|perlfunc/symlink OLDFILE,NEWFILE>,
L<C<readlink>|perlfunc/readlink EXPR>,
L<C<lstat>|perlfunc/lstat FILEHANDLE>).

The filesystem may support neither access timestamp nor change
timestamp (meaning that about the only portable timestamp is the
modification timestamp), or one second granularity of any timestamps
(e.g. the FAT filesystem limits the time granularity to two seconds).

The "inode change timestamp" (the L<C<-C>|perlfunc/-X FILEHANDLE>
filetest) may really be the "creation timestamp" (which it is not in
Unix).

VOS perl can emulate Unix filenames with C</> as path separator.  The
native pathname characters greater-than, less-than, number-sign, and
percent-sign are always accepted.

S<RISC OS> perl can emulate Unix filenames with C</> as path
separator, or go native and use C<.> for path separator and C<:> to
signal filesystems and disk names.

Don't assume Unix filesystem access semantics: that read, write,
and execute are all the permissions there are, and even if they exist,
that their semantics (for example what do C<r>, C<w>, and C<x> mean on
a directory) are the Unix ones.  The various Unix/POSIX compatibility
layers usually try to make interfaces like L<C<chmod>|perlfunc/chmod LIST>
work, but sometimes there simply is no good mapping.

The L<C<File::Spec>|File::Spec> modules provide methods to manipulate path
specifications and return the results in native format for each
platform.  This is often unnecessary as Unix-style paths are
understood by Perl on every supported platform, but if you need to
produce native paths for a native utility that does not understand
Unix syntax, or if you are operating on paths or path components
in unknown (and thus possibly native) syntax, L<C<File::Spec>|File::Spec>
is your friend.  Here are two brief examples:

    use File::Spec::Functions;
    chdir(updir());        # go up one directory

    # Concatenate a path from its components
    my $file = catfile(updir(), 'temp', 'file.txt');
    # on Unix:    '../temp/file.txt'
    # on Win32:   '..\temp\file.txt'
    # on VMS:     '[-.temp]file.txt'

In general, production code should not have file paths hardcoded.
Making them user-supplied or read from a configuration file is
better, keeping in mind that file path syntax varies on different
machines.

This is especially noticeable in scripts like Makefiles and test suites,
which often assume C</> as a path separator for subdirectories.

Also of use is L<C<File::Basename>|File::Basename> from the standard
distribution, which splits a pathname into pieces (base filename, full
path to directory, and file suffix).
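
For instance, a short sketch using fileparse() might look like the
following.  The path is hypothetical, and the suffix pattern (a final
dot followed by non-dot characters) is just one reasonable choice.

    use strict;
    use warnings;
    use File::Basename qw(fileparse);

    # Split a path into base name, directory, and suffix.
    my ($name, $dir, $suffix) = fileparse('/tmp/report.txt', qr/\.[^.]*/);

    print "name=$name dir=$dir suffix=$suffix\n";
    # On a Unix-like system: name=report dir=/tmp/ suffix=.txt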

Even when on a single platform (if you can call Unix a single platform),
remember not to count on the existence or the contents of particular
system-specific files or directories, like F</etc/passwd>,
F</etc/sendmail.conf>, F</etc/resolv.conf>, or even F</tmp/>.  For
example, F</etc/passwd> may exist but not contain the encrypted
passwords, because the system is using some form of enhanced security.
Or it may not contain all the accounts, because the system is using NIS.
If code does need to rely on such a file, include a description of the
file and its format in the code's documentation, then make it easy for
the user to override the default location of the file.
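
One way to make such a location overridable is sketched below.  The
environment variable name C<MYAPP_PASSWD_FILE> and the helper
C<passwd_file> are invented for this example; a configuration file or
command-line option would serve equally well.

    use strict;
    use warnings;

    # Return the user's override if one is set, else the usual default.
    sub passwd_file {
        my $override = $ENV{MYAPP_PASSWD_FILE};
        return (defined $override && length $override)
            ? $override
            : '/etc/passwd';
    }

    my $path = passwd_file();
    if (-r $path) {
        print "using $path\n";
    }
    else {
        warn "cannot read $path; set MYAPP_PASSWD_FILE to override\n";
    }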

Don't assume a text file will end with a newline.  They should,
but people forget.

Do not have two files or directories of the same name with different
case, like F<test.pl> and F<Test.pl>, as many platforms have
case-insensitive (or at least case-forgiving) filenames.  Also, try
not to have non-word characters (except for C<.>) in the names, and
keep them to the 8.3 convention, for maximum portability, onerous a
burden though this may appear.

Likewise, when using the L<C<AutoSplit>|AutoSplit> module, try to keep
your functions to 8.3 naming and case-insensitive conventions; or, at the
least, make it so the resulting files have a unique (case-insensitively)
first 8 characters.

Whitespace in filenames is tolerated on most systems, but not all,
and even on systems where it might be tolerated, some utilities
might become confused by such whitespace.

Many systems (DOS, VMS ODS-2) cannot have more than one C<.> in their
filenames.

Don't assume C<< > >> won't be the first character of a filename.
Always use the three-arg version of
L<C<open>|perlfunc/open FILEHANDLE,EXPR>:

    open(my $fh, '<', $existing_file) or die $!;

Two-arg L<C<open>|perlfunc/open FILEHANDLE,EXPR> is magic and can
translate characters like C<< > >>, C<< < >>, and C<|> in filenames,
which is usually the wrong thing to do.
L<C<sysopen>|perlfunc/sysopen FILEHANDLE,FILENAME,MODE> and three-arg
L<C<open>|perlfunc/open FILEHANDLE,EXPR> don't have this problem.

Don't use C<:> as a part of a filename since many systems use that for
their own semantics (Mac OS Classic for separating pathname components,
many networking schemes and utilities for separating the nodename and
the pathname, and so on).  For the same reasons, avoid C<@>, C<;> and
C<|>.

Don't assume that in pathnames you can collapse two leading slashes
C<//> into one: some networking and clustering filesystems have special
semantics for that.  Let the operating system sort it out.

The I<portable filename characters> as defined by ANSI C are

 a b c d e f g h i j k l m n o p q r s t u v w x y z
 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
 0 1 2 3 4 5 6 7 8 9
 . _ -

and C<-> shouldn't be the first character.  If you want to be
hypercorrect, stay case-insensitive and within the 8.3 naming
convention (all the files and directories have to be unique within one
directory if their names are lowercased and truncated to eight
characters before the C<.>, if any, and to three characters after the
C<.>, if any).  (And do not use C<.>s in directory names.)

=head2 System Interaction

Not all platforms provide a command line.  These are usually platforms
that rely primarily on a Graphical User Interface (GUI) for user
interaction.  A program requiring a command line interface might
not work everywhere.  This is probably for the user of the program
to deal with, so don't stay up late worrying about it.

Some platforms can't delete or rename files held open by the system;
this limitation may also apply to changing filesystem metainformation
like file permissions or owners.  Remember to
L<C<close>|perlfunc/close FILEHANDLE> files when you are done with them.
Don't L<C<unlink>|perlfunc/unlink LIST> or
L<C<rename>|perlfunc/rename OLDNAME,NEWNAME> an open file.  Don't
L<C<tie>|perlfunc/tie VARIABLE,CLASSNAME,LIST> or
L<C<open>|perlfunc/open FILEHANDLE,EXPR> a file already tied or opened;
L<C<untie>|perlfunc/untie VARIABLE> or
L<C<close>|perlfunc/close FILEHANDLE> it first.

Don't open the same file more than once at a time for writing, as some
operating systems put mandatory locks on such files.

Don't assume that write/modify permission on a directory gives the
right to add or delete files/directories in that directory.  That is
filesystem specific: in some filesystems you need write/modify
permission also (or even just) in the file/directory itself.  In some
filesystems (AFS, DFS) the permission to add/delete directory entries
is a completely separate permission.

Don't assume that a single L<C<unlink>|perlfunc/unlink LIST> completely
gets rid of the file: some filesystems (most notably the ones in VMS) are
versioned, and L<C<unlink>|perlfunc/unlink LIST> removes only the most
recent version (it doesn't remove all the versions because by default
the native tools on those platforms remove just the most recent version,
too).  The portable idiom to remove all the versions of a file is

    1 while unlink "file";

This will terminate if the file is undeletable for some reason
(protected, not there, and so on).

Don't count on a specific environment variable existing in
L<C<%ENV>|perlvar/%ENV>.  Don't count on L<C<%ENV>|perlvar/%ENV> entries
being case-sensitive, or even case-preserving.  Don't try to clear
L<C<%ENV>|perlvar/%ENV> by saying C<%ENV = ();>, or, if you really have
to, make it conditional on C<$^O ne 'VMS'> since in VMS the
L<C<%ENV>|perlvar/%ENV> table is much more than a per-process key-value
string table.

On VMS, some entries in the L<C<%ENV>|perlvar/%ENV> hash are dynamically
created when their key is used on a read if they did not previously
exist.  The values for C<$ENV{HOME}>, C<$ENV{TERM}>, C<$ENV{PATH}>, and
C<$ENV{USER}>, are known to be dynamically generated.  The specific names
that are dynamically generated may vary with the version of the C library
on VMS, and more may exist than are documented.

On VMS by default, changes to the L<C<%ENV>|perlvar/%ENV> hash persist
after perl exits.  Subsequent invocations of perl in the same process can
inadvertently inherit environment settings that were meant to be
temporary.

Don't count on signals or L<C<%SIG>|perlvar/%SIG> for anything.

Don't count on filename globbing.  Use
L<C<opendir>|perlfunc/opendir DIRHANDLE,EXPR>,
L<C<readdir>|perlfunc/readdir DIRHANDLE>, and
L<C<closedir>|perlfunc/closedir DIRHANDLE> instead.
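
A minimal sketch of that replacement, roughly equivalent to globbing
for C<*.txt> in one directory, might look as follows.  The helper name
C<files_with_suffix> is hypothetical, and the match is deliberately
case-insensitive because not every filesystem preserves case.

    use strict;
    use warnings;

    # Hypothetical helper: names in $dir that end in $suffix, sorted.
    sub files_with_suffix {
        my ($dir, $suffix) = @_;
        opendir(my $dh, $dir) or die "cannot opendir '$dir': $!";
        my @names = grep { /\Q$suffix\E\z/i } readdir($dh);
        closedir($dh);
        @names = sort @names;
        return @names;
    }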

Don't count on per-program environment variables, or per-program current
directories.

Don't count on specific values of L<C<$!>|perlvar/$!>, neither numeric nor
especially the string values. Users may switch their locales causing
error messages to be translated into their languages.  If you can
trust a POSIXish environment, you can portably use the symbols defined
by the L<C<Errno>|Errno> module, like C<ENOENT>.  And don't trust the
values of L<C<$!>|perlvar/$!> at all except immediately after a failed
system call.
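
As a sketch of the symbolic approach, assuming a POSIXish environment,
the following tests C<$!{ENOENT}> immediately after a failed
L<C<open>|perlfunc/open FILEHANDLE,EXPR> instead of comparing numbers or
message text.  The function name is hypothetical.

    use strict;
    use warnings;
    use Errno;

    # Hypothetical helper: true if $path could not be opened for reading
    # because it does not exist.
    sub open_failed_because_missing {
        my ($path) = @_;
        return 0 if open(my $fh, '<', $path);
        return $!{ENOENT} ? 1 : 0;      # checked right after the failure
    }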

=head2 Command names versus file pathnames

Don't assume that the name used to invoke a command or program with
L<C<system>|perlfunc/system LIST> or L<C<exec>|perlfunc/exec LIST> can
also be used to test for the existence of the file that holds the
executable code for that command or program.
First, many systems have "internal" commands that are built-in to the
shell or OS and while these commands can be invoked, there is no
corresponding file.  Second, some operating systems (e.g., Cygwin,
DJGPP, OS/2, and VOS) have required suffixes for executable files;
these suffixes are generally permitted on the command name but are not
required.  Thus, a command like C<perl> might exist in a file named
F<perl>, F<perl.exe>, or F<perl.pm>, depending on the operating system.
The variable L<C<$Config{_exe}>|Config/C<_exe>> in the
L<C<Config>|Config> module holds the executable suffix, if any.  Third,
the VMS port carefully sets up L<C<$^X>|perlvar/$^X> and
L<C<$Config{perlpath}>|Config/C<perlpath>> so that no further processing
is required.  This is just as well, because the matching regular
expression used below would then have to deal with a possible trailing
version number in the VMS file name.

To convert L<C<$^X>|perlvar/$^X> to a file pathname, taking account of
the requirements of the various operating system possibilities, say:

 use Config;
 my $thisperl = $^X;
 if ($^O ne 'VMS') {
     $thisperl .= $Config{_exe}
         unless $thisperl =~ m/\Q$Config{_exe}\E$/i;
 }

To convert L<C<$Config{perlpath}>|Config/C<perlpath>> to a file pathname, say:

 use Config;
 my $thisperl = $Config{perlpath};
 if ($^O ne 'VMS') {
     $thisperl .= $Config{_exe}
         unless $thisperl =~ m/\Q$Config{_exe}\E$/i;
 }

=head2 Networking

Don't assume that you can reach the public Internet.

Don't assume that there is only one way to get through firewalls
to the public Internet.

Don't assume that you can reach the outside world through any port other
than 80, or some web proxy.  ftp is blocked by many firewalls.

Don't assume that you can send email by connecting to the local SMTP port.

Don't assume that you can reach yourself or any node by the name
'localhost'.  The same goes for '127.0.0.1'.  You will have to try both.

Don't assume that the host has only one network card, or that it
can't bind to many virtual IP addresses.

Don't assume a particular network device name.

Don't assume a particular set of
L<C<ioctl>|perlfunc/ioctl FILEHANDLE,FUNCTION,SCALAR>s will work.

Don't assume that you can ping hosts and get replies.

Don't assume that any particular port (service) will respond.

Don't assume that L<C<Sys::Hostname>|Sys::Hostname> (or any other API or
command) returns either a fully qualified hostname or a non-qualified
hostname: it all depends on how the system has been configured.  Also
remember that for things such as DHCP and NAT, the hostname you get back
might not be very useful.

All the above I<don't>s may look daunting, and they are, but the key
is to degrade gracefully if one cannot reach the particular network
service one wants.  Croaking or hanging do not look very professional.

=head2 Interprocess Communication (IPC)

In general, don't directly access the system in code meant to be
portable.  That means, no L<C<system>|perlfunc/system LIST>,
L<C<exec>|perlfunc/exec LIST>, L<C<fork>|perlfunc/fork>,
L<C<pipe>|perlfunc/pipe READHANDLE,WRITEHANDLE>,
L<C<``> or C<qxE<sol>E<sol>>|perlop/C<qxE<sol>I<STRING>E<sol>>>,
L<C<open>|perlfunc/open FILEHANDLE,EXPR> with a C<|>, nor any of the other
things that make being a Perl hacker worth being.

Commands that launch external processes are generally supported on
most platforms (though many of them do not support any type of
forking).  The problem with using them arises from what you invoke
them on.  External tools are often named differently on different
platforms, may not be available in the same location, might accept
different arguments, can behave differently, and often present their
results in a platform-dependent way.  Thus, you should seldom depend
on them to produce consistent results.  (Then again, if you're calling
C<netstat -a>, you probably don't expect it to run on both Unix and CP/M.)

One especially common bit of Perl code is opening a pipe to B<sendmail>:

    open(my $mail, '|-', '/usr/lib/sendmail -t')
        or die "cannot fork sendmail: $!";

This is fine for systems programming when sendmail is known to be
available.  But it is not fine for many non-Unix systems, and even
some Unix systems that may not have sendmail installed.  If a portable
solution is needed, see the various distributions on CPAN that deal
with it.  L<C<Mail::Mailer>|Mail::Mailer> and L<C<Mail::Send>|Mail::Send>
in the C<MailTools> distribution are commonly used, and provide several
mailing methods, including C<mail>, C<sendmail>, and direct SMTP (via
L<C<Net::SMTP>|Net::SMTP>) if a mail transfer agent is not available.
L<C<Mail::Sendmail>|Mail::Sendmail> is a standalone module that provides
simple, platform-independent mailing.

The Unix System V IPC (C<msg*(), sem*(), shm*()>) is not available
even on all Unix platforms.

Do not use either the bare result of C<pack("C4", 10, 20, 30, 40)> or
bare v-strings (such as C<v10.20.30.40>) to represent IPv4 addresses:
both forms just pack the four bytes into network order.  That this
would be equal to the C language C<in_addr> struct (which is what the
socket code internally uses) is not guaranteed.  To be portable use
the routines of the L<C<Socket>|Socket> module, such as
L<C<inet_aton>|Socket/$ip_address = inet_aton $string>,
L<C<inet_ntoa>|Socket/$string = inet_ntoa $ip_address>, and
L<C<sockaddr_in>|Socket/$sockaddr = sockaddr_in $port, $ip_address>.
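
A short sketch of those routines in use follows.  The address and the
port number are arbitrary values chosen for illustration.

    use strict;
    use warnings;
    use Socket qw(inet_aton inet_ntoa sockaddr_in);

    my $packed = inet_aton("10.20.30.40");  # opaque packed address
    my $dotted = inet_ntoa($packed);        # back to dotted-quad text
    my $addr   = sockaddr_in(80, $packed);  # scalar context: pack
    my ($port, $ip) = sockaddr_in($addr);   # list context: unpack

Treat C<$packed> and C<$addr> as opaque values: pass them to the socket
functions, but don't build or inspect them by hand.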

The rule of thumb for portable code is: Do it all in portable Perl, or
use a module (that may internally implement it with platform-specific
code, but exposes a common interface).

=head2 External Subroutines (XS)

XS code can usually be made to work with any platform, but dependent
libraries, header files, etc., might not be readily available or
portable, or the XS code itself might be platform-specific, just as Perl
code might be.  If the libraries and headers are portable, then it is
normally reasonable to make sure the XS code is portable, too.

A different type of portability issue arises when writing XS code:
availability of a C compiler on the end-user's system.  C brings
with it its own portability issues, and writing XS code will expose
you to some of those.  Writing purely in Perl is an easier way to
achieve portability.

=head2 Standard Modules

In general, the standard modules work across platforms.  Notable
exceptions are the L<C<CPAN>|CPAN> module (which currently makes
connections to external programs that may not be available),
platform-specific modules (like L<C<ExtUtils::MM_VMS>|ExtUtils::MM_VMS>),
and DBM modules.

There is no one DBM module available on all platforms.
L<C<SDBM_File>|SDBM_File> and the others are generally available on all
Unix and DOSish ports, but not in MacPerl, where only
L<C<NDBM_File>|NDBM_File> and L<C<DB_File>|DB_File> are available.

The good news is that at least some DBM module should be available, and
L<C<AnyDBM_File>|AnyDBM_File> will use whichever module it can find.  Of
course, then the code needs to be fairly strict, dropping to the greatest
common factor (e.g., not exceeding 1K for each record), so that it will
work with any DBM module.  See L<AnyDBM_File> for more details.

=head2 Time and Date

The system's notion of time of day and calendar date is controlled in
widely different ways.  Don't assume the timezone is stored in C<$ENV{TZ}>,
and even if it is, don't assume that you can control the timezone through
that variable.  Don't assume anything about the three-letter timezone
abbreviations (for example, that MST means Mountain Standard Time:
it's been known to stand for Moscow Standard Time).  If you need to
use timezones, express them in some unambiguous format like the
exact number of minutes offset from UTC, or the POSIX timezone
format.

Don't assume that the epoch starts at 00:00:00, January 1, 1970,
because that is OS- and implementation-specific.  It is better to
store a date in an unambiguous representation.  The ISO 8601 standard
defines YYYY-MM-DD as the date format, or YYYY-MM-DDTHH:MM:SS
(that's a literal "T" separating the date from the time).
Please do use the ISO 8601 instead of making us guess what
date 02/03/04 might be.  ISO 8601 even sorts nicely as-is.
A text representation (like "1987-12-18") can be easily converted
into an OS-specific value using a module like
L<C<Time::Piece>|Time::Piece> (see L<Time::Piece/Date Parsing>) or
L<C<Date::Parse>|Date::Parse>.  An array of values, such as those
returned by L<C<localtime>|perlfunc/localtime EXPR>, can be converted to an OS-specific
representation using L<C<Time::Local>|Time::Local>.
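
As a brief sketch of both conversions, the following parses an ISO 8601
date with L<C<Time::Piece>|Time::Piece>, builds the same instant from
separate fields with L<C<Time::Local>|Time::Local>, and formats the
result back into text.  The date is the one used above; treating it as
midnight UTC is an assumption of this example.

    use strict;
    use warnings;
    use Time::Piece;
    use Time::Local qw(timegm);

    # Parse the text form; strptime() treats the input as UTC.
    my $t     = Time::Piece->strptime("1987-12-18", "%Y-%m-%d");
    my $epoch = $t->epoch;              # OS-specific seconds value

    # The same instant from separate fields (the month is 0-based).
    my $same  = timegm(0, 0, 0, 18, 11, 1987);

    # Back to an unambiguous string for storage.
    my $iso   = gmtime($epoch)->ymd;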

When calculating specific times, such as for tests in time or date modules,
it may be appropriate to calculate an offset for the epoch.

    use Time::Local qw(timegm);
    my $offset = timegm(0, 0, 0, 1, 0, 1970);

The value for C<$offset> in Unix will be C<0>, but in Mac OS Classic
will be some large number.  C<$offset> can then be added to a Unix time
value to get what should be the proper value on any system.

=head2 Character sets and character encoding

Assume very little about character sets.

Assume nothing about numerical values (L<C<ord>|perlfunc/ord EXPR>,
L<C<chr>|perlfunc/chr NUMBER>) of characters.
Do not use explicit code point ranges (like C<\xHH-\xHH>).  However,
starting in Perl v5.22, regular expression pattern bracketed character
class ranges specified like C<qr/[\N{U+HH}-\N{U+HH}]/> are portable,
and starting in Perl v5.24, the same ranges are portable in
L<C<trE<sol>E<sol>E<sol>>|perlop/C<trE<sol>I<SEARCHLIST>E<sol>I<REPLACEMENTLIST>E<sol>cdsr>>.
You can portably use symbolic character classes like C<[:print:]>.

Do not assume that the alphabetic characters are encoded contiguously
(in the numeric sense).  There may be gaps.  Special coding in Perl,
however, guarantees that all subsets of C<qr/[A-Z]/>, C<qr/[a-z]/>, and
C<qr/[0-9]/> behave as expected.
L<C<trE<sol>E<sol>E<sol>>|perlop/C<trE<sol>I<SEARCHLIST>E<sol>I<REPLACEMENTLIST>E<sol>cdsr>>
behaves the same for these ranges.  In patterns, any range whose end
points are specified using the C<\N{...}> notation is portable across
character sets, but it is a bug in Perl v5.22 that this isn't true of
L<C<trE<sol>E<sol>E<sol>>|perlop/C<trE<sol>I<SEARCHLIST>E<sol>I<REPLACEMENTLIST>E<sol>cdsr>>,
fixed in v5.24.
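
The portable alternatives can be sketched as follows.  The variable
names are illustrative only, and the second pattern relies on the
C<\N{U+HH}> range behavior described above for Perl v5.22 and later.

    use strict;
    use warnings;

    # A symbolic class instead of a numeric range.
    my $printable   = qr/\A[[:print:]]+\z/;

    # A range whose end points are given as Unicode code points.
    my $ascii_upper = qr/\A[\N{U+41}-\N{U+5A}]+\z/;

    my $ok_print = "Hello, world." =~ $printable   ? 1 : 0;
    my $ok_upper = "PERL"          =~ $ascii_upper ? 1 : 0;
    my $no_upper = "perl"          =~ $ascii_upper ? 1 : 0;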

Do not assume anything about the ordering of the characters.
The lowercase letters may come before or after the uppercase letters;
the lowercase and uppercase may be interlaced so that both "a" and "A"
come before "b"; the accented and other international characters may
be interlaced so that E<auml> comes before "b".
L<Unicode::Collate> can be used to sort this all out.
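
A minimal sketch, assuming the default collation table shipped with
L<Unicode::Collate> is acceptable for your data:

    use strict;
    use warnings;
    use Unicode::Collate;

    my $collator = Unicode::Collate->new;
    my @sorted   = $collator->sort(qw(b A a B));

The order of C<@sorted> is then determined by the collation rules and
not by the numeric values of the characters on the current platform.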

=head2 Internationalisation

If you may assume POSIX (a rather large assumption), you may read
more about the POSIX locale system from L<perllocale>.  The locale
system at least attempts to make things a little bit more portable,
or at least more convenient and native-friendly for non-English
users.  The system affects character sets and encoding, and date
and time formatting--amongst other things.

If you really want to be international, you should consider Unicode.
See L<perluniintro> and L<perlunicode> for more information.

By default Perl assumes your source code is written in an 8-bit ASCII
superset. To embed Unicode characters in your strings and regexes, you can
use the L<C<\x{HH}> or (more portably) C<\N{U+HH}>
notations|perlop/Quote and Quote-like Operators>. You can also use the
L<C<utf8>|utf8> pragma and write your code in UTF-8, which lets you use
Unicode characters directly (not just in quoted constructs but also in
identifiers).

=head2 System Resources

If your code is destined for systems with severely constrained (or
missing!) virtual memory systems then you want to be I<especially> mindful
of avoiding wasteful constructs such as:

    my @lines = <$very_large_file>;            # bad

    while (<$fh>) {$file .= $_}                # sometimes bad
    my $file = join('', <$fh>);                # better

The last two constructs may appear unintuitive to most people.  The
first repeatedly grows a string, whereas the second allocates a
large chunk of memory in one go.  On some systems, the second is
more efficient than the first.
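
When the whole file is not needed at once, processing it one line at a
time avoids both problems.  The following is a sketch only; the helper
name C<count_matching_lines> is hypothetical.

    use strict;
    use warnings;

    # Hypothetical helper: count the lines that match $pattern while
    # holding only one line in memory at a time.
    sub count_matching_lines {
        my ($path, $pattern) = @_;
        open(my $fh, '<', $path) or die "cannot open '$path': $!";
        my $count = 0;
        while (my $line = <$fh>) {
            $count++ if $line =~ $pattern;
        }
        close($fh);
        return $count;
    }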

=head2 Security

Most multi-user platforms provide basic levels of security, usually
implemented at the filesystem level.  Some, however, unfortunately do
not.  Thus the notion of user id, or "home" directory,
or even the state of being logged-in, may be unrecognizable on many
platforms.  If you write programs that are security-conscious, it
is usually best to know what type of system you will be running
under so that you can write code explicitly for that platform (or
class of platforms).

Don't assume the Unix filesystem access semantics: the operating
system or the filesystem may be using some ACL systems, which are
richer languages than the usual C<rwx>.  Even if the C<rwx> exist,
their semantics might be different.

(From the security viewpoint, testing for permissions before attempting to
do something is silly anyway: if one tries this, there is potential
for race conditions. Someone or something might change the
permissions between the permissions check and the actual operation.
Just try the operation.)
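
A sketch of "just try the operation": the following attempts the append
and reports failure to the caller, with no permission test beforehand.
The helper name C<try_append> and its return convention are assumptions
made for this example.

    use strict;
    use warnings;

    # Hypothetical helper: returns (1, '') on success or (0, $reason).
    sub try_append {
        my ($path, $text) = @_;
        open(my $fh, '>>', $path) or return (0, "$!");
        print {$fh} $text         or return (0, "$!");
        close($fh)                or return (0, "$!");
        return (1, '');
    }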

Don't assume the Unix user and group semantics: especially, don't
expect L<C<< $< >>|perlvar/$E<lt>> and L<C<< $> >>|perlvar/$E<gt>> (or
L<C<$(>|perlvar/$(> and L<C<$)>|perlvar/$)>) to work for switching
identities (or memberships).

Don't assume set-uid and set-gid semantics.  (And even if you do,
think twice: set-uid and set-gid are a known can of security worms.)

=head2 Style

For those times when it is necessary to have platform-specific code,
consider keeping the platform-specific code in one place, making porting
to other platforms easier.  Use the L<C<Config>|Config> module and the
special variable L<C<$^O>|perlvar/$^O> to differentiate platforms, as
described in L</"PLATFORMS">.

Beware of the "else syndrome":

  if ($^O eq 'MSWin32') {
    # code that assumes Windows
  } else {
    # code that assumes Linux
  }

The C<else> branch should be used for the really ultimate fallback,
not for code specific to some platform.
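
One way to avoid the syndrome is sketched below: each platform family is
named explicitly, and the C<else> branch refuses to guess.  The function
name and the particular list of platforms are assumptions chosen for
illustration.

    use strict;
    use warnings;

    # Hypothetical dispatcher: separator used in PATH-like lists.
    sub path_list_separator {
        my ($os) = @_;                  # normally $^O
        if ($os eq 'MSWin32' || $os eq 'os2' || $os eq 'dos') {
            return ';';
        }
        elsif ($os eq 'linux' || $os eq 'darwin' || $os =~ /bsd\z/) {
            return ':';
        }
        else {
            # Ultimate fallback: report the problem instead of guessing.
            die "don't know the path list separator on '$os'\n";
        }
    }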

Be careful in the tests you supply with your module or programs.
Module code may be fully portable, but its tests might not be.  This
often happens when tests spawn off other processes or call external
programs to aid in the testing, or when (as noted above) the tests
assume certain things about the filesystem and paths.  Be careful not
to depend on a specific output style for errors, such as when checking
L<C<$!>|perlvar/$!> after a failed system call.  Using
L<C<$!>|perlvar/$!> for anything other than displaying it as output is
doubtful (though see the L<C<Errno>|Errno> module for a reasonably
portable way to test error values).  Some platforms expect a certain output format,
and Perl on those platforms may have been adjusted accordingly.  Most
specifically, don't anchor a regex when testing an error value.

=head1 CPAN Testers

Modules uploaded to CPAN are tested by a variety of volunteers on
different platforms.  These CPAN testers are notified by mail of each
new upload, and reply to the list with PASS, FAIL, NA (not applicable to
this platform), or UNKNOWN (unknown), along with any relevant notations.

The purpose of the testing is twofold: one, to help developers fix any
problems in their code that crop up because of lack of testing on other
platforms; two, to provide users with information about whether
a given module works on a given platform.

Also see:

=over 4

=item *

Mailing list: cpan-testers-discuss@perl.org

=item *

Testing results: L<http://www.cpantesters.org/>

=back

=head1 PLATFORMS

Perl is built with a L<C<$^O>|perlvar/$^O> variable that indicates the
operating system it was built on.  This was implemented
to help speed up code that would otherwise have to C<use Config>
and use the value of L<C<$Config{osname}>|Config/C<osname>>.  Of course,
to get more detailed information about the system, looking into
L<C<%Config>|Config/DESCRIPTION> is certainly recommended.

L<C<%Config>|Config/DESCRIPTION> cannot always be trusted, however,
because it was built at compile time.  If perl was built in one place,
then transferred elsewhere, some values may be wrong.  The values may
even have been edited after the fact.

=head2 Unix

Perl works on a bewildering variety of Unix and Unix-like platforms (see
e.g. most of the files in the F<hints/> directory in the source code kit).
On most of these systems, the value of L<C<$^O>|perlvar/$^O> (hence
L<C<$Config{osname}>|Config/C<osname>>, too) is determined either by
lowercasing and stripping punctuation from the first field of the string
returned by typing C<uname -a> (or a similar command) at the shell prompt
or by testing the file system for the presence of uniquely named files
such as a kernel or header file.  Here, for example, are a few of the
more popular Unix flavors:

    uname         $^O        $Config{archname}
    --------------------------------------------
    AIX           aix        aix
    BSD/OS        bsdos      i386-bsdos
    Darwin        darwin     darwin
    DYNIX/ptx     dynixptx   i386-dynixptx
    FreeBSD       freebsd    freebsd-i386
    Haiku         haiku      BePC-haiku
    Linux         linux      arm-linux
    Linux         linux      armv5tel-linux
    Linux         linux      i386-linux
    Linux         linux      i586-linux
    Linux         linux      ppc-linux
    HP-UX         hpux       PA-RISC1.1
    IRIX          irix       irix
    Mac OS X      darwin     darwin
    NeXT 3        next       next-fat
    NeXT 4        next       OPENSTEP-Mach
    openbsd       openbsd    i386-openbsd
    OSF1          dec_osf    alpha-dec_osf
    reliantunix-n svr4       RM400-svr4
    SCO_SV        sco_sv     i386-sco_sv
    SINIX-N       svr4       RM400-svr4
    sn4609        unicos     CRAY_C90-unicos
    sn6521        unicosmk   t3e-unicosmk
    sn9617        unicos     CRAY_J90-unicos
    SunOS         solaris    sun4-solaris
    SunOS         solaris    i86pc-solaris
    SunOS4        sunos      sun4-sunos

Because the value of L<C<$Config{archname}>|Config/C<archname>> may
depend on the hardware architecture, it can vary more than the value of
L<C<$^O>|perlvar/$^O>.

=head2 DOS and Derivatives

Perl has long been ported to Intel-style microcomputers running under
systems like PC-DOS, MS-DOS, OS/2, and most Windows platforms you can
bring yourself to mention (except for Windows CE, if you count that).
Users familiar with I<COMMAND.COM> or I<CMD.EXE> style shells should
be aware that each of these file specifications may have subtle
differences:

    my $filespec0 = "c:/foo/bar/file.txt";
    my $filespec1 = "c:\\foo\\bar\\file.txt";
    my $filespec2 = 'c:\foo\bar\file.txt';
    my $filespec3 = 'c:\\foo\\bar\\file.txt';

System calls accept either C</> or C<\> as the path separator.
However, many command-line utilities of DOS vintage treat C</> as
the option prefix, so may get confused by filenames containing C</>.
Aside from calling any external programs, C</> will work just fine,
and probably better, as it is more consistent with popular usage,
and avoids the problem of remembering what to backwhack and what
not to.
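
If a filename does have to be handed to such a utility, it can be
converted at the last moment.  The following is a sketch; the helper
name C<dos_native_path> is hypothetical.

    use strict;
    use warnings;

    # Hypothetical helper: backslash form of a path, for passing to an
    # external DOS-style command.  Use "/" everywhere else.
    sub dos_native_path {
        my ($path) = @_;
        (my $native = $path) =~ tr{/}{\\};
        return $native;
    }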

The DOS FAT filesystem can accommodate only "8.3" style filenames.  Under
the "case-insensitive, but case-preserving" HPFS (OS/2) and NTFS (NT)
filesystems you may have to be careful about case returned with functions
like L<C<readdir>|perlfunc/readdir DIRHANDLE> or used with functions like
L<C<open>|perlfunc/open FILEHANDLE,EXPR> or
L<C<opendir>|perlfunc/opendir DIRHANDLE,EXPR>.

DOS also treats several filenames as special, such as F<AUX>, F<PRN>,
F<NUL>, F<CON>, F<COM1>, F<LPT1>, F<LPT2>, etc.  Unfortunately, sometimes
these filenames won't even work if you include an explicit directory
prefix.  It is best to avoid such filenames, if you want your code to be
portable to DOS and its derivatives.  It's hard to know what these all
are, unfortunately.

Users of these operating systems may also wish to make use of
scripts such as F<pl2bat.bat> to put wrappers around your scripts.

Newline (C<\n>) is translated as C<\015\012> by the I/O system when
reading from and writing to files (see L</"Newlines">).
C<binmode($filehandle)> will keep C<\n> translated as C<\012> for that
filehandle.
L<C<binmode>|perlfunc/binmode FILEHANDLE> should always be used for code
that deals with binary data.  That's assuming you realize in advance that
your data is in binary.  General-purpose programs should often assume
nothing about their data.
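
As a sketch, a byte-for-byte file copy that behaves the same with or
without newline translation could look like this.  The helper name
C<copy_binary> and the 64K read size are arbitrary choices made for the
example.

    use strict;
    use warnings;

    # Hypothetical helper: copy $from to $to without newline translation.
    sub copy_binary {
        my ($from, $to) = @_;
        open(my $in,  '<', $from) or die "cannot read '$from': $!";
        open(my $out, '>', $to)   or die "cannot write '$to': $!";
        binmode($in);
        binmode($out);
        local $/ = \65536;              # fixed-size reads, not lines
        while (my $chunk = <$in>) {
            print {$out} $chunk or die "write to '$to' failed: $!";
        }
        close($out) or die "cannot close '$to': $!";
        close($in);
        return 1;
    }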

The L<C<$^O>|perlvar/$^O> variable and the
L<C<$Config{archname}>|Config/C<archname>> values for various DOSish
perls are as follows:

    OS             $^O       $Config{archname}  ID    Version
    ---------------------------------------------------------
    MS-DOS         dos       ?
    PC-DOS         dos       ?
    OS/2           os2       ?
    Windows 3.1    ?         ?                  0     3 01
    Windows 95     MSWin32   MSWin32-x86        1     4 00
    Windows 98     MSWin32   MSWin32-x86        1     4 10
    Windows ME     MSWin32   MSWin32-x86        1     ?
    Windows NT     MSWin32   MSWin32-x86        2     4 xx
    Windows NT     MSWin32   MSWin32-ALPHA      2     4 xx
    Windows NT     MSWin32   MSWin32-ppc        2     4 xx
    Windows 2000   MSWin32   MSWin32-x86        2     5 00
    Windows XP     MSWin32   MSWin32-x86        2     5 01
    Windows 2003   MSWin32   MSWin32-x86        2     5 02
    Windows Vista  MSWin32   MSWin32-x86        2     6 00
    Windows 7      MSWin32   MSWin32-x86        2     6 01
    Windows 7      MSWin32   MSWin32-x64        2     6 01
    Windows 2008   MSWin32   MSWin32-x86        2     6 01
    Windows 2008   MSWin32   MSWin32-x64        2     6 01
    Windows CE     MSWin32   ?                  3
    Cygwin         cygwin    cygwin

The various MSWin32 Perls can distinguish the OS they are running on
via the value of the fifth element of the list returned from
L<C<Win32::GetOSVersion()>|Win32/Win32::GetOSVersion()>.  For example:

    if ($^O eq 'MSWin32') {
        my @os_version_info = Win32::GetOSVersion();
        print +('3.1','95','NT')[$os_version_info[4]],"\n";
    }

There are also L<C<Win32::IsWinNT()>|Win32/Win32::IsWinNT()>,
L<C<Win32::IsWin95()>|Win32/Win32::IsWin95()>, and
L<C<Win32::GetOSName()>|Win32/Win32::GetOSName()>; try
L<C<perldoc Win32>|Win32>.
The very portable L<C<POSIX::uname()>|POSIX/C<uname>> will work too:

    c:\> perl -MPOSIX -we "print join '|', uname"
    Windows NT|moonru|5.0|Build 2195 (Service Pack 2)|x86

Errors set by Winsock functions are now put directly into C<$^E>,
and the relevant C<WSAE*> error codes are now exported from the
L<Errno> and L<POSIX> modules for testing this against.

The previous behavior of putting the errors (converted to POSIX-style
C<E*> error codes since Perl 5.20.0) into C<$!> was buggy due to
the non-equivalence of like-named Winsock and POSIX error constants,
a relationship between which has unfortunately been established
in one way or another since Perl 5.8.0.

The new behavior provides a much more robust solution for checking
Winsock errors in portable software without accidentally matching
POSIX tests that were intended for other OSes and may have different
meanings for Winsock.

The old behavior is currently retained, warts and all, for backwards
compatibility, but users are encouraged to change any code that
tests C<$!> against C<E*> constants for Winsock errors to instead
test C<$^E> against C<WSAE*> constants.  After a suitable deprecation
period, which started with Perl 5.24, the old behavior may be
removed, leaving C<$!> unchanged after Winsock function calls, to
avoid any possible confusion over which error variable to check.

Also see:

=over 4

=item *

The djgpp environment for DOS, L<http://www.delorie.com/djgpp/>
and L<perldos>.

=item *

The EMX environment for DOS, OS/2, etc. emx@iaehv.nl,
L<ftp://hobbes.nmsu.edu/pub/os2/dev/emx/>  Also L<perlos2>.

=item *

Build instructions for Win32 in L<perlwin32>, or under the Cygwin environment
in L<perlcygwin>.

=item *

The C<Win32::*> modules in L<Win32>.

=item *

The ActiveState Pages, L<http://www.activestate.com/>

=item *

The Cygwin environment for Win32; F<README.cygwin> (installed
as L<perlcygwin>), L<http://www.cygwin.com/>

=item *

The U/WIN environment for Win32,
L<http://www.research.att.com/sw/tools/uwin/>

=item *

Build instructions for OS/2, L<perlos2>

=back

=head2 VMS

Perl on VMS is discussed in L<perlvms> in the Perl distribution.

The official name of VMS as of this writing is OpenVMS.

Interacting with Perl from the Digital Command Language (DCL) shell
often requires a different set of quotation marks than Unix shells do.
For example:

    $ perl -e "print ""Hello, world.\n"""
    Hello, world.

There are several ways to wrap your Perl scripts in DCL F<.COM> files, if
you are so inclined.  For example:

    $ write sys$output "Hello from DCL!"
    $ if p1 .eqs. ""
    $ then perl -x 'f$environment("PROCEDURE")
    $ else perl -x - 'p1 'p2 'p3 'p4 'p5 'p6 'p7 'p8
    $ deck/dollars="__END__"
    #!/usr/bin/perl

    print "Hello from Perl!\n";

    __END__
    $ endif

Do take care with C<$ ASSIGN/nolog/user SYS$COMMAND: SYS$INPUT> if your
Perl-in-DCL script expects to do things like C<< $read = <STDIN>; >>.

The VMS operating system has two filesystems, designated by their
on-disk structure (ODS) level: ODS-2 and its successor ODS-5.  The
initial port of Perl to VMS pre-dates ODS-5, but all current testing and
development assumes ODS-5 and its capabilities, including case
preservation, extended characters in filespecs, and names up to 8192
bytes long.

Perl on VMS can accept either VMS- or Unix-style file
specifications as in either of the following:

    $ perl -ne "print if /perl_setup/i" SYS$LOGIN:LOGIN.COM
    $ perl -ne "print if /perl_setup/i" /sys$login/login.com

but not a mixture of both as in:

    $ perl -ne "print if /perl_setup/i" sys$login:/login.com
    Can't open sys$login:/login.com: file specification syntax error

In general, the easiest path to portability is always to specify
filenames in Unix format unless they will need to be processed by native
commands or utilities.  Because of this latter consideration, the
L<File::Spec> module by default returns native format specifications
regardless of input format.  This default may be reversed so that
filenames are always reported in Unix format by specifying the
C<DECC$FILENAME_UNIX_REPORT> feature logical in the environment.

The file type, or extension, is always present in a VMS-format file
specification even if it's zero-length.  This means that, by default,
L<C<readdir>|perlfunc/readdir DIRHANDLE> will return a trailing dot on a
file with no extension, so where you would see C<"a"> on Unix you'll see
C<"a."> on VMS.  However, the trailing dot may be suppressed by enabling
the C<DECC$READDIR_DROPDOTNOTYPE> feature in the environment (see the CRTL
documentation on feature logical names).

What C<\n> represents depends on the type of file opened.  It usually
represents C<\012> but it could also be C<\015>, C<\012>, C<\015\012>,
C<\000>, C<\040>, or nothing depending on the file organization and
record format.  The L<C<VMS::Stdio>|VMS::Stdio> module provides access to
the special C<fopen()> requirements of files with unusual attributes on
VMS.

The value of L<C<$^O>|perlvar/$^O> on OpenVMS is "VMS".  To determine the
architecture that you are running on refer to
L<C<$Config{archname}>|Config/C<archname>>.

On VMS, perl determines the UTC offset from the C<SYS$TIMEZONE_DIFFERENTIAL>
logical name.  Although the VMS epoch began at 17-NOV-1858 00:00:00.00,
calls to L<C<localtime>|perlfunc/localtime EXPR> are adjusted to count
offsets from 01-JAN-1970 00:00:00.00, just like Unix.

Also see:

=over 4

=item *

F<README.vms> (installed as F<README_vms>), L<perlvms>

=item *

vmsperl list, vmsperl-subscribe@perl.org

=item *

vmsperl on the web, L<http://www.sidhe.org/vmsperl/index.html>

=item *

VMS Software Inc. web site, L<http://www.vmssoftware.com>

=back

=head2 VOS

Perl on VOS (also known as OpenVOS) is discussed in F<README.vos>
in the Perl distribution (installed as L<perlvos>).  Perl on VOS
can accept either VOS- or Unix-style file specifications as in
either of the following:

    $ perl -ne "print if /perl_setup/i" >system>notices
    $ perl -ne "print if /perl_setup/i" /system/notices

or even a mixture of both as in:

    $ perl -ne "print if /perl_setup/i" >system/notices

Even though VOS allows the slash character to appear in object
names, because the VOS port of Perl interprets it as a pathname
delimiting character, VOS files, directories, or links whose
names contain a slash character cannot be processed.  Such files
must be renamed before they can be processed by Perl.

Older releases of VOS (prior to OpenVOS Release 17.0) limit file
names to 32 or fewer characters, prohibit file names from
starting with a C<-> character, and prohibit file names from
containing C< > (space) or any character from the set C<< !#%&'()*;<=>? >>.

Newer releases of VOS (OpenVOS Release 17.0 or later) support a
feature known as extended names.  On these releases, file names
can contain up to 255 characters, are prohibited from starting
with a C<-> character, and the set of prohibited characters is
reduced to C<< #%*<>? >>.  There are
restrictions involving spaces and apostrophes:  these characters
must not begin or end a name, nor can they immediately precede or
follow a period.  Additionally, a space must not immediately
precede another space or hyphen.  Specifically, the following
character combinations are prohibited:  space-space,
space-hyphen, period-space, space-period, period-apostrophe,
apostrophe-period, leading or trailing space, and leading or
trailing apostrophe.  Although an extended file name is limited
to 255 characters, a path name is still limited to 256
characters.
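
The extended-name rules above can be sketched as a small validator.
This is only an illustration: the function name is hypothetical, it
checks just the rules stated in this section, and rejecting the empty
string is an added assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $name obeys the extended-name rules
# listed above (OpenVOS Release 17.0 or later).
sub is_valid_vos_extended_name {
    my ($name) = @_;
    return 0 if !defined $name || $name eq '';   # assumption, see text
    return 0 if length($name) > 255;             # length limit
    return 0 if $name =~ /\A-/;                  # leading hyphen
    return 0 if $name =~ /[#%*<>?]/;             # prohibited characters
    return 0 if $name =~ /\A[ ']|[ ']\z/;        # leading/trailing space or apostrophe
    # space-space, space-hyphen, period-space, space-period,
    # period-apostrophe, apostrophe-period
    return 0 if $name =~ /  | -|\. | \.|\.'|'\./;
    return 1;
}

for my $name ('notes.txt', 'a  b', '-draft', q{it's fine}) {
    printf "%-12s %s\n", "'$name'",
        is_valid_vos_extended_name($name) ? 'ok' : 'rejected';
}
```

Remember that the Perl port also treats C</> as a path delimiter, so
a name that passes these checks may still be unusable from Perl if it
contains a slash.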

The value of L<C<$^O>|perlvar/$^O> on VOS is "vos".  To determine the
architecture that you are running on refer to
L<C<$Config{archname}>|Config/C<archname>>.
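
A minimal sketch of reading both values follows.  The helper name
C<platform_summary> is hypothetical; C<$^O> and C<%Config> are the only
interfaces it relies on.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: describe the running platform in one line.
sub platform_summary {
    my $arch = defined $Config{archname} ? $Config{archname} : 'unknown';
    return "$^O ($arch)";
}

print platform_summary(), "\n";
print "Running on VOS\n" if $^O eq 'vos';
```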

Also see:

=over 4

=item *

F<README.vos> (installed as L<perlvos>)

=item *

The VOS mailing list.

There is no specific mailing list for Perl on VOS.  You can contact
the Stratus Technologies Customer Assistance Center (CAC) for your
region, or you can use the contact information located in the
distribution files on the Stratus Anonymous FTP site.

=item *

Stratus Technologies on the web at L<http://www.stratus.com>

=item *

VOS Open-Source Software on the web at L<http://ftp.stratus.com/pub/vos/vos.html>

=back

=head2 EBCDIC Platforms

v5.22 core Perl runs on z/OS (formerly OS/390).  Theoretically it could
run on the successors of OS/400 on AS/400 minicomputers as well as
VM/ESA, and BS2000 for S/390 Mainframes.  Such computers use EBCDIC
character sets internally (usually Character Code Set ID 0037 for OS/400
and either 1047 or POSIX-BC for S/390 systems).

The rest of this section may need updating, but we don't know what it
should say.  Please email comments to
L<perlbug@perl.org|mailto:perlbug@perl.org>.

On the mainframe Perl currently works under the "Unix system
services for OS/390" (formerly known as OpenEdition), VM/ESA OpenEdition, or
the BS2000 POSIX-BC system (BS2000 is supported in Perl 5.6 and greater).
See L<perlos390> for details.  Note that for OS/400 there is also a port of
Perl 5.8.1/5.10.0 or later to the PASE which is ASCII-based (as opposed to
ILE which is EBCDIC-based), see L<perlos400>.

As of R2.5 of USS for OS/390 and Version 2.3 of VM/ESA these Unix
sub-systems do not support the C<#!> shebang trick for script invocation.
Hence, on OS/390 and VM/ESA Perl scripts can be executed with a header
similar to the following simple script:

    : # use perl
        eval 'exec /usr/local/bin/perl -S $0 ${1+"$@"}'
            if 0;
    #!/usr/local/bin/perl     # just a comment really

    print "Hello from perl!\n";

OS/390 will support the C<#!> shebang trick in release 2.8 and beyond.
Calls to L<C<system>|perlfunc/system LIST> and backticks can use POSIX
shell syntax on all S/390 systems.

On the AS/400, if PERL5 is in your library list, you may need
to wrap your Perl scripts in a CL procedure to invoke them like so:

    BEGIN
      CALL PGM(PERL5/PERL) PARM('/QOpenSys/hello.pl')
    ENDPGM

This will invoke the Perl script F<hello.pl> in the root of the
QOpenSys file system.  On the AS/400 calls to
L<C<system>|perlfunc/system LIST> or backticks must use CL syntax.

On these platforms, bear in mind that the EBCDIC character set may have
an effect on what happens with some Perl functions (such as
L<C<chr>|perlfunc/chr NUMBER>, L<C<pack>|perlfunc/pack TEMPLATE,LIST>,
L<C<print>|perlfunc/print FILEHANDLE LIST>,
L<C<printf>|perlfunc/printf FILEHANDLE FORMAT, LIST>,
L<C<ord>|perlfunc/ord EXPR>, L<C<sort>|perlfunc/sort SUBNAME LIST>,
L<C<sprintf>|perlfunc/sprintf FORMAT, LIST>,
L<C<unpack>|perlfunc/unpack TEMPLATE,EXPR>), as
well as bit-fiddling with ASCII constants using operators like
L<C<^>, C<&> and C<|>|perlop/Bitwise String Operators>, not to mention
dealing with socket interfaces to ASCII computers (see L</"Newlines">).

Fortunately, most web servers for the mainframe will correctly
translate the C<\n> in the following statement to its ASCII equivalent
(C<\r> is the same under both Unix and z/OS):

    print "Content-type: text/html\r\n\r\n";

The values of L<C<$^O>|perlvar/$^O> on some of these platforms include:

    uname         $^O        $Config{archname}
    --------------------------------------------
    OS/390        os390      os390
    OS400         os400      os400
    POSIX-BC      posix-bc   BS2000-posix-bc

Some simple tricks for determining if you are running on an EBCDIC
platform could include any of the following (perhaps all):

    if ("\t" eq "\005")  { print "EBCDIC may be spoken here!\n"; }

    if (ord('A') == 193) { print "EBCDIC may be spoken here!\n"; }

    if (chr(169) eq 'z') { print "EBCDIC may be spoken here!\n"; }

One thing you may not want to rely on is the EBCDIC encoding
of punctuation characters since these may differ from code page to code
page (and once your module or script is rumoured to work with EBCDIC,
folks will want it to work with all EBCDIC character sets).
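
In line with that advice, one way to package the test is a helper that
looks only at a letter's code point and never at punctuation.  This is
a sketch; the name C<is_ebcdic> is an illustration, not an existing
function.

```perl
use strict;
use warnings;

# Hypothetical helper: true on an EBCDIC platform.  It relies on the
# code point of the letter 'A' (193 in EBCDIC, as in the checks above).
sub is_ebcdic {
    return ord('A') == 193 ? 1 : 0;
}

print is_ebcdic()
    ? "EBCDIC may be spoken here!\n"
    : "This does not look like an EBCDIC platform.\n";
```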

Also see:

=over 4

=item *

L<perlos390>, L<perlos400>, L<perlbs2000>, L<perlebcdic>.

=item *

The perl-mvs@perl.org list is for discussion of porting issues as well as
general usage issues for all EBCDIC Perls.  Send a message body of
"subscribe perl-mvs" to majordomo@perl.org.

=item *

AS/400 Perl information at
L<http://as400.rochester.ibm.com/>
as well as on CPAN in the F<ports/> directory.

=back

=head2 Acorn RISC OS

Because Acorns use ASCII with newlines (C<\n>) in text files as C<\012> like
Unix, and because Unix filename emulation is turned on by default,
most simple scripts will probably work "out of the box".  The native
filesystem is modular, and individual filesystems are free to be
case-sensitive or insensitive, and are usually case-preserving.  Some
native filesystems have name length limits, and file and directory
names are silently truncated to fit them.  Scripts should be aware that the
standard filesystem currently has a name length limit of B<10>
characters, with up to 77 items in a directory, but other filesystems
may not impose such limitations.

Native filenames are of the form

    Filesystem#Special_Field::DiskName.$.Directory.Directory.File

where

    Special_Field is not usually present, but may contain . and $ .
    Filesystem =~ m|[A-Za-z0-9_]|
    DiskName   =~ m|[A-Za-z0-9_/]|
    $ represents the root directory
    . is the path separator
    @ is the current directory (per filesystem but machine global)
    ^ is the parent directory
    Directory and File =~ m|[^\0- "\.\$\%\&:\@\\^\|\177]+|

The default filename translation is roughly C<tr|/.|./|>, swapping dots
and slashes.
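
To make the rough rule concrete, here is a sketch that applies exactly
that C<tr> to a Unix-style relative path.  The helper name is
hypothetical, and the sketch deliberately ignores everything else the
real translation does, such as suffix handling.

```perl
use strict;
use warnings;

# Hypothetical helper: swap "/" and "." as the default translation
# roughly does.  Not a full implementation of the real mapping.
sub unix_to_riscos {
    my ($path) = @_;
    (my $native = $path) =~ tr|/.|./|;
    return $native;
}

print unix_to_riscos('lib/File/Spec.pm'), "\n";   # prints lib.File.Spec/pm
```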

Note that C<"ADFS::HardDisk.$.File" ne 'ADFS::HardDisk.$.File'> and that
the second stage of C<$> interpolation in regular expressions will fall
foul of the L<C<$.>|perlvar/$.> variable if scripts are not careful.

Logical paths specified by system variables containing comma-separated
search lists are also allowed; hence C<System:Modules> is a valid
filename, and the filesystem will prefix C<Modules> with each section of
C<System$Path> until a name is made that points to an object on disk.
Writing to a new file C<System:Modules> would be allowed only if
C<System$Path> contains a single item list.  The filesystem will also
expand system variables in filenames if enclosed in angle brackets, so
C<< <System$Dir>.Modules >> would look for the file
S<C<$ENV{'System$Dir'} . 'Modules'>>.  The obvious implication of this is
that B<fully qualified filenames can start with C<< <> >>> and the
three-argument form of L<C<open>|perlfunc/open FILEHANDLE,EXPR> should
always be used.
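
A minimal sketch of the three-argument form, using a name of the kind
described above, might look like this.  The helper C<open_for_read> is
hypothetical, and on other platforms the example path will simply fail
to open.

```perl
use strict;
use warnings;

# Hypothetical helper: open a file for reading with an explicit mode,
# so a leading "<" or ">" in $path is never parsed as an open mode.
sub open_for_read {
    my ($path) = @_;
    open(my $fh, '<', $path) or return;
    return $fh;
}

# Single quotes keep Perl from interpolating $Dir.
my $path = '<System$Dir>.Modules';
if (my $fh = open_for_read($path)) {
    print "Opened $path\n";
    close $fh;
}
else {
    print "Cannot open $path: $!\n";
}
```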

Because C<.> was in use as a directory separator and filenames could not
be assumed to be unique after 10 characters, Acorn implemented the C
compiler to strip the trailing C<.c> C<.h> C<.s> and C<.o> suffix from
filenames specified in source code and store the respective files in
subdirectories named after the suffix.  Hence files are translated:

    foo.h           h.foo
    C:foo.h         C:h.foo        (logical path variable)
    sys/os.h        sys.h.os       (C compiler groks Unix-speak)
    10charname.c    c.10charname
    10charname.o    o.10charname
    11charname_.c   c.11charname   (assuming filesystem truncates at 10)

The Unix emulation library's translation of filenames to native assumes
that this sort of translation is required, and it allows a user-defined list
of known suffixes that it will transpose in this fashion.  This may
seem transparent, but consider that with these rules F<foo/bar/baz.h>
and F<foo/bar/h/baz> both map to F<foo.bar.h.baz>, and that
L<C<readdir>|perlfunc/readdir DIRHANDLE> and L<C<glob>|perlfunc/glob EXPR>
cannot and do not attempt to emulate the reverse mapping.  Other
C<.>'s in filenames are translated to C</>.

As implied above, the environment accessed through
L<C<%ENV>|perlvar/%ENV> is global, and the convention is that program
specific environment variables are of the form C<Program$Name>.
Each filesystem maintains a current directory,
and the current filesystem's current directory is the B<global> current
directory.  Consequently, sociable programs don't change the current
directory but rely on full pathnames, and programs (and Makefiles) cannot
assume that they can spawn a child process which can change the current
directory without affecting its parent (and everyone else for that
matter).

Because native operating system filehandles are global and are currently
allocated down from 255, with 0 being a reserved value, the Unix emulation
library emulates Unix filehandles.  Consequently, you can't rely on
passing C<STDIN>, C<STDOUT>, or C<STDERR> to your children.

The desire of users to express filenames of the form
C<< <Foo$Dir>.Bar >> on the command line unquoted causes problems,
too: L<C<``>|perlop/C<qxE<sol>I<STRING>E<sol>>> command output capture has
to perform a guessing game.  It assumes that a string C<< <[^<>]+\$[^<>]> >>
is a reference to an environment variable, whereas anything else involving
C<< < >> or C<< > >> is redirection, and generally manages to be 99%
right.  Of course, the problem remains that scripts cannot rely on any
Unix tools being available, or that any tools found have Unix-like command
line arguments.

Extensions and XS are, in theory, buildable by anyone using free
tools.  In practice, many don't, as users of the Acorn platform are
used to binary distributions.  MakeMaker does run, but no available
make currently copes with MakeMaker's makefiles; even if and when
this should be fixed, the lack of a Unix-like shell will cause
problems with makefile rules, especially lines of the form
C<cd sdbm && make all>, and anything using quoting.

S<"RISC OS"> is the proper name for the operating system, but the value
in L<C<$^O>|perlvar/$^O> is "riscos" (because we don't like shouting).

=head2 Other perls

Perl has been ported to many platforms that do not fit into any of
the categories listed above.  Some, such as AmigaOS,
QNX, Plan 9, and VOS, have been well-integrated into the standard
Perl source code kit.  You may need to see the F<ports/> directory
on CPAN for information, and possibly binaries, for the likes of:
aos, Atari ST, lynxos, riscos, Novell Netware, Tandem Guardian,
I<etc.>  (Yes, we know that some of these OSes may fall under the
Unix category, but we are not a standards body.)

Some approximate operating system names and their L<C<$^O>|perlvar/$^O>
values in the "OTHER" category include:

    OS            $^O        $Config{archname}
    ------------------------------------------
    Amiga DOS     amigaos    m68k-amigos

See also:

=over 4

=item *

Amiga, F<README.amiga> (installed as L<perlamiga>).

=item *

A free perl5-based PERL.NLM for Novell Netware is available in
precompiled binary and source code form from L<http://www.novell.com/>
as well as from CPAN.

=item *

S<Plan 9>, F<README.plan9>

=back

=head1 FUNCTION IMPLEMENTATIONS

Listed below are functions that are either completely unimplemented
or else have been implemented differently on various platforms.
Preceding each description will be, in parentheses, a list of
platforms that the description applies to.

The list may well be incomplete, or even wrong in some places.  When
in doubt, consult the platform-specific README files in the Perl
source distribution, and any other documentation resources accompanying
a given port.

Be aware, moreover, that even among Unix-ish systems there are variations.

For many functions, you can also query L<C<%Config>|Config/DESCRIPTION>,
exported by default from the L<C<Config>|Config> module.  For example, to
check whether the platform has the L<C<lstat>|perlfunc/lstat FILEHANDLE>
call, check L<C<$Config{d_lstat}>|Config/C<d_lstat>>.  See L<Config> for a
full description of available variables.
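
For instance, a script could wrap the lookup in a small helper.  This
is a sketch: C<has_config_feature> is a hypothetical name, and it
assumes the usual convention that Configure records an available
feature with the value C<define>.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: true if the named %Config symbol is recorded
# as "define" for this build of perl.
sub has_config_feature {
    my ($key) = @_;
    my $value = $Config{$key};
    return (defined $value && $value eq 'define') ? 1 : 0;
}

for my $key (qw(d_lstat d_fork d_symlink)) {
    printf "%-10s %s\n", $key, has_config_feature($key) ? 'yes' : 'no';
}
```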

=head2 Alphabetical Listing of Perl Functions

=over 8

=item -X

(Win32)
C<-w> only inspects the read-only file attribute (FILE_ATTRIBUTE_READONLY),
which determines whether the directory can be deleted, not whether it can
be written to. Directories always have read and write access unless denied
by discretionary access control lists (DACLs).

(VMS)
C<-r>, C<-w>, C<-x>, and C<-o> tell whether the file is accessible,
which may not reflect UIC-based file protections.

(S<RISC OS>)
C<-s> by name on an open file will return the space reserved on disk,
rather than the current extent.  C<-s> on an open filehandle returns the
current size.

(Win32, VMS, S<RISC OS>)
C<-R>, C<-W>, C<-X>, C<-O> are indistinguishable from C<-r>, C<-w>,
C<-x>, C<-o>.

(Win32, VMS, S<RISC OS>)
C<-g>, C<-k>, C<-l>, C<-u>, C<-A> are not particularly meaningful.

(VMS, S<RISC OS>)
C<-p> is not particularly meaningful.

(VMS)
C<-d> is true if passed a device spec without an explicit directory.

(Win32)
C<-x> (or C<-X>) determines if a file ends in one of the executable
suffixes.  C<-S> is meaningless.

(S<RISC OS>)
C<-x> (or C<-X>) determines if a file has an executable file type.

=item alarm

(Win32)
Emulated using timers that must be explicitly polled whenever Perl
wants to dispatch "safe signals" and therefore cannot interrupt
blocking system calls.

=item atan2

(Tru64, HP-UX 10.20)
Due to issues with various CPUs, math libraries, compilers, and standards,
results for C<atan2> may vary depending on any combination of the above.
Perl attempts to conform to the Open Group/IEEE standards for the results
returned from C<atan2>, but cannot force the issue if the system Perl is
run on does not allow it.

The current version of the standards for C<atan2> is available at
L<http://www.opengroup.org/onlinepubs/009695399/functions/atan2.html>.

=item binmode

(S<RISC OS>)
Meaningless.

(VMS)
Reopens file and restores pointer; if function fails, underlying
filehandle may be closed, or pointer may be in a different position.

(Win32)
The value returned by L<C<tell>|perlfunc/tell FILEHANDLE> may be affected
after the call, and the filehandle may be flushed.

=item chmod

(Win32)
Only good for changing "owner" read-write access; "group" and "other"
bits are meaningless.

(S<RISC OS>)
Only good for changing "owner" and "other" read-write access.

(VOS)
Access permissions are mapped onto VOS access-control list changes.

(Cygwin)
The actual permissions set depend on the value of the C<CYGWIN> variable
in the SYSTEM environment settings.

(Android)
Setting the exec bit on some locations (generally F</sdcard>) will return true
but not actually set the bit.

=item chown

(S<Plan 9>, S<RISC OS>)
Not implemented.

(Win32)
Does nothing, but won't fail.

(VOS)
A little funky, because VOS's notion of ownership is a little funky.

=item chroot

(Win32, VMS, S<Plan 9>, S<RISC OS>, VOS)
Not implemented.

=item crypt

(Win32)
May not be available if library or source was not provided when building
perl.

(Android)
Not implemented.

=item dbmclose

(VMS, S<Plan 9>, VOS)
Not implemented.

=item dbmopen

(VMS, S<Plan 9>, VOS)
Not implemented.

=item dump

(S<RISC OS>)
Not useful.

(Cygwin, Win32)
Not supported.

(VMS)
Invokes VMS debugger.

=item exec

(Win32)
C<exec LIST> without the use of indirect object syntax (C<exec PROGRAM LIST>)
may fall back to trying the shell if the first C<spawn()> fails.

(SunOS, Solaris, HP-UX)
Does not automatically flush output handles on some platforms.

(Symbian OS)
Not supported.

=item exit

(VMS)
Emulates Unix C<exit> (which considers C<exit 1> to indicate an error) by
mapping the C<1> to C<SS$_ABORT> (C<44>).  This behavior may be overridden
with the pragma L<C<use vmsish 'exit'>|vmsish/C<vmsish exit>>.  As with
the CRTL's C<exit()> function, C<exit 0> is also mapped to an exit status
of C<SS$_NORMAL> (C<1>); this mapping cannot be overridden.  Any other
argument to C<exit>
is used directly as Perl's exit status.  On VMS, unless the future
POSIX_EXIT mode is enabled, the exit code should always be a valid
VMS exit code and not a generic number.  When the POSIX_EXIT mode is
enabled, a generic number will be encoded in a method compatible with
the C library _POSIX_EXIT macro so that it can be decoded by other
programs, particularly ones written in C, like the GNV package.

(Solaris)
C<exit> resets file pointers, which is a problem when called
from a child process (created by L<C<fork>|perlfunc/fork>) in
L<C<BEGIN>|perlmod/BEGIN, UNITCHECK, CHECK, INIT and END>.
A workaround is to use L<C<POSIX::_exit>|POSIX/C<_exit>>.

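    use Config;
    # Ordinary exit everywhere except Solaris.
    exit unless $Config{archname} =~ /\bsolaris\b/;
    # On Solaris, leave through POSIX::_exit instead.
    require POSIX;
    POSIX::_exit(0);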

=item fcntl

(Win32)
Not implemented.

(VMS)
Some functions available based on the version of VMS.

=item flock

(VMS, S<RISC OS>, VOS)
Not implemented.

=item fork

(AmigaOS, S<RISC OS>, VMS)
Not implemented.

(Win32)
Emulated using multiple interpreters.  See L<perlfork>.

(SunOS, Solaris, HP-UX)
Does not automatically flush output handles on some platforms.

=item getlogin

(S<RISC OS>)
Not implemented.

=item getpgrp

(Win32, VMS, S<RISC OS>)
Not implemented.

=item getppid

(Win32, S<RISC OS>)
Not implemented.

=item getpriority

(Win32, VMS, S<RISC OS>, VOS)
Not implemented.

=item getpwnam

(Win32)
Not implemented.

(S<RISC OS>)
Not useful.

=item getgrnam

(Win32, VMS, S<RISC OS>)
Not implemented.

=item getnetbyname

(Android, Win32, S<Plan 9>)
Not implemented.

=item getpwuid

(Win32)
Not implemented.

(S<RISC OS>)
Not useful.

=item getgrgid

(Win32, VMS, S<RISC OS>)
Not implemented.

=item getnetbyaddr

(Android, Win32, S<Plan 9>)
Not implemented.

=item getprotobynumber

(Android)
Not implemented.

=item getpwent

(Android, Win32)
Not implemented.

=item getgrent

(Android, Win32, VMS)
Not implemented.

=item gethostbyname

(S<Irix 5>)
C<gethostbyname('localhost')> does not work everywhere: you may have
to use C<gethostbyname('127.0.0.1')>.

=item gethostent

(Win32)
Not implemented.

=item getnetent

(Android, Win32, S<Plan 9>)
Not implemented.

=item getprotoent

(Android, Win32, S<Plan 9>)
Not implemented.

=item getservent

(Win32, S<Plan 9>)
Not implemented.

=item seekdir

(Android)
Not implemented.

=item sethostent

(Android, Win32, S<Plan 9>, S<RISC OS>)
Not implemented.

=item setnetent

(Win32, S<Plan 9>, S<RISC OS>)
Not implemented.

=item setprotoent

(Android, Win32, S<Plan 9>, S<RISC OS>)
Not implemented.

=item setservent

(S<Plan 9>, Win32, S<RISC OS>)
Not implemented.

=item endpwent

(Win32)
Not implemented.

(Android)
Either not implemented or a no-op.

=item endgrent

(Android, S<RISC OS>, VMS, Win32)
Not implemented.

=item endhostent

(Android, Win32)
Not implemented.

=item endnetent

(Android, Win32, S<Plan 9>)
Not implemented.

=item endprotoent

(Android, Win32, S<Plan 9>)
Not implemented.

=item endservent

(S<Plan 9>, Win32)
Not implemented.

=item getsockopt

(S<Plan 9>)
Not implemented.

=item glob

This operator is implemented via the L<C<File::Glob>|File::Glob> extension
on most platforms.  See L<File::Glob> for portability information.

=item gmtime

In theory, C<gmtime> is reliable from -2**63 to 2**63-1.  However,
because work-arounds in the implementation use floating point numbers,
it will become inaccurate as the time gets larger.  This is a bug and
will be fixed in the future.

(VOS)
Time values are 32-bit quantities.

=item ioctl

(VMS)
Not implemented.

(Win32)
Available only for socket handles, and it does what the C<ioctlsocket()> call
in the Winsock API does.

(S<RISC OS>)
Available only for socket handles.

=item kill

(S<RISC OS>)
Not implemented, hence not useful for taint checking.

(Win32)
C<kill> doesn't send a signal to the identified process like it does on
Unix platforms.  Instead C<kill($sig, $pid)> terminates the process
identified by C<$pid>, and makes it exit immediately with exit status
C<$sig>.  As in Unix, if C<$sig> is 0 and the specified process exists, it
returns true without actually terminating it.

(Win32)
C<kill(-9, $pid)> will terminate the process specified by C<$pid> and
recursively all child processes owned by it.  This is different from
the Unix semantics, where the signal will be delivered to all
processes in the same process group as the process specified by
C<$pid>.

(VMS)
A pid of -1 indicating all processes on the system is not currently
supported.

=item link

(S<RISC OS>, VOS)
Not implemented.

(AmigaOS)
Link count not updated because hard links are not quite that hard
(they are sort of half-way between hard and soft links).

(Win32)
Hard links are implemented on Win32 under NTFS only. They are
natively supported on Windows 2000 and later.  On Windows NT they
are implemented using the Windows POSIX subsystem support and the
Perl process will need Administrator or Backup Operator privileges
to create hard links.

(VMS)
Available on 64 bit OpenVMS 8.2 and later.

=item localtime

C<localtime> has the same range as L</gmtime>, but because time zone
rules change, its accuracy for historical and future times may degrade
but usually by no more than an hour.

=item lstat

(S<RISC OS>)
Not implemented.

(Win32)
Return values (especially for device and inode) may be bogus.

=item msgctl

=item msgget

=item msgsnd

=item msgrcv

(Android, Win32, VMS, S<Plan 9>, S<RISC OS>, VOS)
Not implemented.

=item open

(Win32, S<RISC OS>)
Open modes C<|-> and C<-|> are unsupported.

(SunOS, Solaris, HP-UX)
Opening a process does not automatically flush output handles on some
platforms.

=item readlink

(Win32, VMS, S<RISC OS>)
Not implemented.

=item rename

(Win32)
Can't move directories between directories on different logical volumes.

=item rewinddir

(Win32)
Will not cause L<C<readdir>|perlfunc/readdir DIRHANDLE> to re-read the
directory stream.  The entries already read before the C<rewinddir> call
will just be returned again from a cache buffer.

=item select

(Win32, VMS)
Only implemented on sockets.

(S<RISC OS>)
Only reliable on sockets.

Note that the L<C<select FILEHANDLE>|perlfunc/select FILEHANDLE> form is
generally portable.

=item semctl

=item semget

=item semop

(Android, Win32, VMS, S<RISC OS>)
Not implemented.

=item setgrent

(Android, VMS, Win32, S<RISC OS>)
Not implemented.

=item setpgrp

(Win32, VMS, S<RISC OS>, VOS)
Not implemented.

=item setpriority

(Win32, VMS, S<RISC OS>, VOS)
Not implemented.

=item setpwent

(Android, Win32, S<RISC OS>)
Not implemented.

=item setsockopt

(S<Plan 9>)
Not implemented.

=item shmctl

=item shmget

=item shmread

=item shmwrite

(Android, Win32, VMS, S<RISC OS>)
Not implemented.

=item sleep

(Win32)
Emulated using synchronization functions such that it can be
interrupted by L<C<alarm>|perlfunc/alarm SECONDS>, and limited to a
maximum of 4294967 seconds, approximately 49 days.

=item socketpair

(S<RISC OS>)
Not implemented.

(VMS)
Available on 64 bit OpenVMS 8.2 and later.

=item stat

Platforms that do not have C<rdev>, C<blksize>, or C<blocks> will return
these as C<''>, so numeric comparison or manipulation of these fields may
cause 'not numeric' warnings.

(S<Mac OS X>)
C<ctime> not supported on UFS.

(Win32)
C<ctime> is creation time instead of inode change time.

(Win32)
C<dev> and C<ino> are not meaningful.

(VMS)
C<dev> and C<ino> are not necessarily reliable.

(S<RISC OS>)
C<mtime>, C<atime> and C<ctime> all return the last modification time.
C<dev> and C<ino> are not necessarily reliable.

(OS/2)
C<dev>, C<rdev>, C<blksize>, and C<blocks> are not available.  C<ino> is not
meaningful and will differ between stat calls on the same file.

(Cygwin)
Some versions of Cygwin, when doing a C<stat("foo")> and not finding it,
may then attempt to C<stat("foo.exe")>.

(Win32)
C<stat> needs to open the file to determine the link count
and update attributes that may have been changed through hard links.
Setting L<C<${^WIN32_SLOPPY_STAT}>|perlvar/${^WIN32_SLOPPY_STAT}> to a
true value speeds up C<stat> by not performing this operation.

=item symlink

(Win32, S<RISC OS>)
Not implemented.

(VMS)
Implemented on 64 bit VMS 8.3.  VMS requires the symbolic link to be in Unix
syntax if it is intended to resolve to a valid path.

=item syscall

(Win32, VMS, S<RISC OS>, VOS)
Not implemented.

=item sysopen

(S<Mac OS>, OS/390)
The traditional C<0>, C<1>, and C<2> MODEs are implemented with different
numeric values on some systems.  The flags exported by L<C<Fcntl>|Fcntl>
(C<O_RDONLY>, C<O_WRONLY>, C<O_RDWR>) should work everywhere though.
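
The following sketch uses the symbolic flags rather than the numeric
modes.  The helper name and the scratch file are illustrations only.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY O_WRONLY O_CREAT O_TRUNC);
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper: write $text to $path using Fcntl flags
# instead of the numeric modes 0, 1, and 2.
sub write_with_sysopen {
    my ($path, $text) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_TRUNC)
        or die "Cannot open $path: $!";
    print {$fh} $text;
    close $fh or die "Cannot close $path: $!";
    return;
}

my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'example.txt');
write_with_sysopen($file, "hello\n");

# Read the line back, again with a symbolic flag.
sysopen(my $in, $file, O_RDONLY) or die "Cannot open $file: $!";
my $line = <$in>;
close $in;
print $line;
```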

=item system

(Win32)
As an optimization, may not call the command shell specified in
C<$ENV{PERL5SHELL}>.  C<system(1, @args)> spawns an external
process and immediately returns its process designator, without
waiting for it to terminate.  Return value may be used subsequently
in L<C<wait>|perlfunc/wait> or L<C<waitpid>|perlfunc/waitpid PID,FLAGS>.
Failure to C<spawn()> a subprocess is indicated by setting
L<C<$?>|perlvar/$?> to C<<< 255 << 8 >>>.  L<C<$?>|perlvar/$?> is set in a
way compatible with Unix (i.e. the exit status of the subprocess is
obtained by C<<< $? >> 8 >>>, as described in the documentation).

(S<RISC OS>)
There is no shell to process metacharacters, and the native standard is
to pass a command line terminated by "\n" "\r" or "\0" to the spawned
program.  Redirection such as C<< > foo >> is performed (if at all) by
the run time library of the spawned program.  C<system LIST> will call
the Unix emulation library's L<C<exec>|perlfunc/exec LIST> emulation,
which attempts to provide emulation of the stdin, stdout, stderr in force
in the parent, provided the child program uses a compatible version of the
emulation library.  C<system SCALAR> will call the native command line
directly and no such emulation of a child Unix program will occur.
Mileage B<will> vary.

(Win32)
C<system LIST> without the use of indirect object syntax (C<system PROGRAM LIST>)
may fall back to trying the shell if the first C<spawn()> fails.

(SunOS, Solaris, HP-UX)
Does not automatically flush output handles on some platforms.

(VMS)
The return value is POSIX-like (shifted up by 8 bits), which only allows
room for a made-up value derived from the severity bits of the native
32-bit condition code (unless overridden by
L<C<use vmsish 'status'>|vmsish/C<vmsish status>>).  If the native
condition code is one that has a POSIX value encoded, the POSIX value will
be decoded to extract the expected exit value.  For more details see
L<perlvms/$?>.

=item telldir

(Android)
Not implemented.

=item times

(Win32)
"Cumulative" times will be bogus.  On anything other than Windows NT
or Windows 2000, "system" time will be bogus, and "user" time is
actually the time returned by the L<C<clock()>|clock(3)> function in the C
runtime library.

(S<RISC OS>)
Not useful.

=item truncate

(Older versions of VMS)
Not implemented.

(VOS)
Truncation to same-or-shorter lengths only.

(Win32)
If a FILEHANDLE is supplied, it must be writable and opened in append
mode (i.e., use C<<< open(my $fh, '>>', 'filename') >>>
or C<sysopen(my $fh, ..., O_APPEND|O_RDWR)>).  If a filename is supplied, it
should not be held open elsewhere.
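
A sketch that stays within the restrictions listed for this function
might look like the following: the handle is opened in append mode, the
new length is shorter than the old one, and nothing else holds the file
open.  The scratch file name is an illustration only.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'log.txt');

# Create ten bytes of content, then close the handle so that the
# file is not held open elsewhere.
open(my $out, '>', $file) or die "Cannot create $file: $!";
print {$out} '0123456789';
close $out or die "Cannot close $file: $!";

# Reopen in append mode and truncate through the handle.
open(my $fh, '>>', $file) or die "Cannot open $file: $!";
truncate($fh, 4) or die "Cannot truncate $file: $!";
close $fh or die "Cannot close $file: $!";

my $size = -s $file;
print "$size\n";
```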

=item umask

Returns C<undef> where unavailable.

(AmigaOS)
C<umask> works but the correct permissions are set only when the file
is finally closed.

=item utime

(VMS, S<RISC OS>)
Only the modification time is updated.

(Win32)
May not behave as expected.  Behavior depends on the C runtime
library's implementation of L<C<utime()>|utime(2)>, and the filesystem
being used.  The FAT filesystem typically does not support an "access
time" field, and it may limit timestamps to a granularity of two seconds.

=item wait

=item waitpid

(Win32)
Can only be applied to process handles returned for processes spawned
using C<system(1, ...)> or pseudo processes created with
L<C<fork>|perlfunc/fork>.

(S<RISC OS>)
Not useful.

=back


=head1 Supported Platforms

The following platforms are known to build Perl 5.12 (as of April 2010,
its release date) from the standard source code distribution available
at L<http://www.cpan.org/src>:

=over

=item Linux (x86, ARM, IA64)

=item HP-UX

=item AIX

=item Win32

=over

=item Windows 2000

=item Windows XP

=item Windows Server 2003

=item Windows Vista

=item Windows Server 2008

=item Windows 7

=back

=item Cygwin

Some tests are known to fail:

=over

=item *

F<ext/XS-APItest/t/call_checker.t> - see
L<https://rt.perl.org/Ticket/Display.html?id=78502>

=item *

F<dist/I18N-Collate/t/I18N-Collate.t>

=item *

F<ext/Win32CORE/t/win32core.t> - may fail on recent cygwin installs.

=back

=item Solaris (x86, SPARC)

=item OpenVMS

=over

=item Alpha (7.2 and later)

=item I64 (8.2 and later)

=back

=item Symbian

=item NetBSD

=item FreeBSD

=item Debian GNU/kFreeBSD

=item Haiku

=item Irix (6.5. What else?)

=item OpenBSD

=item Dragonfly BSD

=item Midnight BSD

=item QNX Neutrino RTOS (6.5.0)

=item MirOS BSD

=item Stratus OpenVOS (17.0 or later)

Caveats:

=over

=item time_t issues that may or may not be fixed

=back

=item Symbian (Series 60 v3, 3.2 and 5 - what else?)

=item Stratus VOS / OpenVOS

=item Android

=item FreeMINT

Perl now builds with FreeMiNT/Atari.  It fails a few tests, which
need some investigation.

The FreeMiNT port uses GNU dld for loadable module capabilities. So
ensure you have that library installed when building perl.

=back

=head1 EOL Platforms

=head2 (Perl 5.20)

The following platforms were supported by a previous version of
Perl but have been officially removed from Perl's source code
as of 5.20:

=over

=item AT&T 3b1

=back

=head2 (Perl 5.14)

The following platforms were supported up to 5.10.  They may still
have worked in 5.12, but supporting code has been removed for 5.14:

=over

=item Windows 95

=item Windows 98

=item Windows ME

=item Windows NT4

=back

=head2 (Perl 5.12)

The following platforms were supported by a previous version of
Perl but have been officially removed from Perl's source code
as of 5.12:

=over

=item Atari MiNT

=item Apollo Domain/OS

=item Apple Mac OS 8/9

=item Tenon Machten

=back


=head1 Supported Platforms (Perl 5.8)

As of July 2002 (the Perl release 5.8.0), the following platforms were
able to build Perl from the standard source code distribution
available at L<http://www.cpan.org/src/>

        AIX
        BeOS
        BSD/OS          (BSDi)
        Cygwin
        DG/UX
        DOS DJGPP       1)
        DYNIX/ptx
        EPOC R5
        FreeBSD
        HI-UXMPP        (Hitachi) (5.8.0 worked but we didn't know it)
        HP-UX
        IRIX
        Linux
        Mac OS Classic
        Mac OS X        (Darwin)
        MPE/iX
        NetBSD
        NetWare
        NonStop-UX
        ReliantUNIX     (formerly SINIX)
        OpenBSD
        OpenVMS         (formerly VMS)
        Open UNIX       (Unixware) (since Perl 5.8.1/5.9.0)
        OS/2
        OS/400          (using the PASE) (since Perl 5.8.1/5.9.0)
        PowerUX
        POSIX-BC        (formerly BS2000)
        QNX
        Solaris
        SunOS 4
        SUPER-UX        (NEC)
        Tru64 UNIX      (formerly DEC OSF/1, Digital UNIX)
        UNICOS
        UNICOS/mk
        UTS
        VOS / OpenVOS
        Win95/98/ME/2K/XP 2)
        WinCE
        z/OS            (formerly OS/390)
        VM/ESA

        1) in DOS mode either the DOS or OS/2 ports can be used
        2) compilers: Borland, MinGW (GCC), VC6

The following platforms worked with the previous releases (5.6 and
5.7), but we did not manage either to fix or to test these in time
for the 5.8.0 release.  There is a very good chance that many of these
will work fine with 5.8.0.

        BSD/OS
        DomainOS
        Hurd
        LynxOS
        MachTen
        PowerMAX
        SCO SV
        SVR4
        Unixware
        Windows 3.1

Known to be broken for 5.8.0 (but 5.6.1 and 5.7.2 can be used):

        AmigaOS 3

The following platforms have been known to build Perl from source in
the past (5.005_03 and earlier), but we haven't been able to verify
their status for the current release, either because the
hardware/software platforms are rare or because we don't have an
active champion on these platforms--or both.  They used to work,
though, so go ahead and try compiling them, and let perlbug@perl.org
know of any trouble.

        3b1
        A/UX
        ConvexOS
        CX/UX
        DC/OSx
        DDE SMES
        DOS EMX
        Dynix
        EP/IX
        ESIX
        FPS
        GENIX
        Greenhills
        ISC
        MachTen 68k
        MPC
        NEWS-OS
        NextSTEP
        OpenSTEP
        Opus
        Plan 9
        RISC/os
        SCO ODT/OSR
        Stellar
        SVR2
        TI1500
        TitanOS
        Ultrix
        Unisys Dynix

The following platforms have their own source code distributions and
binaries available via L<http://www.cpan.org/ports/>

                                Perl release

        OS/400 (ILE)            5.005_02
        Tandem Guardian         5.004

The following platforms have only binaries available via
L<http://www.cpan.org/ports/index.html> :

                                Perl release

        Acorn RISCOS            5.005_02
        AOS                     5.002
        LynxOS                  5.004_02

Although we do suggest that you always build your own Perl from
the source code, both for maximal configurability and for security,
in case you are in a hurry you can check
L<http://www.cpan.org/ports/index.html> for binary distributions.

=head1 SEE ALSO

L<perlaix>, L<perlamiga>, L<perlbs2000>,
L<perlce>, L<perlcygwin>, L<perldos>,
L<perlebcdic>, L<perlfreebsd>, L<perlhurd>, L<perlhpux>, L<perlirix>,
L<perlmacos>, L<perlmacosx>,
L<perlnetware>, L<perlos2>, L<perlos390>, L<perlos400>,
L<perlplan9>, L<perlqnx>, L<perlsolaris>, L<perltru64>,
L<perlunicode>, L<perlvms>, L<perlvos>, L<perlwin32>, and L<Win32>.

=head1 AUTHORS / CONTRIBUTORS

Abigail <abigail@abigail.be>,
Charles Bailey <bailey@newman.upenn.edu>,
Graham Barr <gbarr@pobox.com>,
Tom Christiansen <tchrist@perl.com>,
Nicholas Clark <nick@ccl4.org>,
Thomas Dorner <Thomas.Dorner@start.de>,
Andy Dougherty <doughera@lafayette.edu>,
Dominic Dunlop <domo@computer.org>,
Neale Ferguson <neale@vma.tabnsw.com.au>,
David J. Fiander <davidf@mks.com>,
Paul Green <Paul.Green@stratus.com>,
M.J.T. Guy <mjtg@cam.ac.uk>,
Jarkko Hietaniemi <jhi@iki.fi>,
Luther Huffman <lutherh@stratcom.com>,
Nick Ing-Simmons <nick@ing-simmons.net>,
Andreas J. KE<ouml>nig <a.koenig@mind.de>,
Markus Laker <mlaker@contax.co.uk>,
Andrew M. Langmead <aml@world.std.com>,
Lukas Mai <l.mai@web.de>,
Larry Moore <ljmoore@freespace.net>,
Paul Moore <Paul.Moore@uk.origin-it.com>,
Chris Nandor <pudge@pobox.com>,
Matthias Neeracher <neeracher@mac.com>,
Philip Newton <pne@cpan.org>,
Gary Ng <71564.1743@CompuServe.COM>,
Tom Phoenix <rootbeer@teleport.com>,
AndrE<eacute> Pirard <A.Pirard@ulg.ac.be>,
Peter Prymmer <pvhp@forte.com>,
Hugo van der Sanden <hv@crypt0.demon.co.uk>,
Gurusamy Sarathy <gsar@activestate.com>,
Paul J. Schinder <schinder@pobox.com>,
Michael G Schwern <schwern@pobox.com>,
Dan Sugalski <dan@sidhe.org>,
Nathan Torkington <gnat@frii.com>,
John Malmberg <wb8tyw@qsl.net>
perl5261delta.pod000064400000017367150344123420007556 0ustar00=encoding utf8

=head1 NAME

perl5261delta - what is new for perl v5.26.1

=head1 DESCRIPTION

This document describes differences between the 5.26.0 release and the 5.26.1
release.

If you are upgrading from an earlier release such as 5.24.0, first read
L<perl5260delta>, which describes differences between 5.24.0 and 5.26.0.

=head1 Security

=head2 [CVE-2017-12837] Heap buffer overflow in regular expression compiler

Compiling certain regular expression patterns with the case-insensitive
modifier could cause a heap buffer overflow and crash perl.  This has now been
fixed.
L<[perl #131582]|https://rt.perl.org/Public/Bug/Display.html?id=131582>

=head2 [CVE-2017-12883] Buffer over-read in regular expression parser

For certain types of syntax error in a regular expression pattern, the error
message could either contain the contents of a random, possibly large, chunk of
memory, or could crash perl.  This has now been fixed.
L<[perl #131598]|https://rt.perl.org/Public/Bug/Display.html?id=131598>

=head2 [CVE-2017-12814] C<$ENV{$key}> stack buffer overflow on Windows

A possible stack buffer overflow in the C<%ENV> code on Windows has been fixed
by removing the buffer completely since it was superfluous anyway.
L<[perl #131665]|https://rt.perl.org/Public/Bug/Display.html?id=131665>

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.26.0.  If any exist,
they are bugs, and we request that you submit a report.  See L</Reporting
Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<base> has been upgraded from version 2.25 to 2.26.

The effects of dotless C<@INC> on this module have been limited by the
introduction of a more refined and accurate solution for removing C<'.'> from
C<@INC> while reducing the false positives.

=item *

L<charnames> has been upgraded from version 1.44 to 1.45.

=item *

L<Module::CoreList> has been upgraded from version 5.20170530 to 5.20170922_26.

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item FreeBSD

=over 4

=item *

Building with B<g++> on FreeBSD-11.0 has been fixed.
L<[perl #131337]|https://rt.perl.org/Public/Bug/Display.html?id=131337>

=back

=item Windows

=over 4

=item *

Support for compiling perl on Windows using Microsoft Visual Studio 2017
(containing Visual C++ 14.1) has been added.

=item *

Building XS modules with GCC 6 in a 64-bit build of Perl failed due to
incorrect mapping of C<strtoll> and C<strtoull>.  This has now been fixed.
L<[perl #131726]|https://rt.perl.org/Public/Bug/Display.html?id=131726>
L<[cpan #121683]|https://rt.cpan.org/Public/Bug/Display.html?id=121683>
L<[cpan #122353]|https://rt.cpan.org/Public/Bug/Display.html?id=122353>

=back

=back

=head1 Selected Bug Fixes

=over 4

=item *

Several built-in functions previously had bugs that could cause them to write
to the internal stack without allocating room for the item being written.  In
rare situations, this could have led to a crash.  These bugs have now been
fixed, and if any similar bugs are introduced in future, they will be detected
automatically in debugging builds.
L<[perl #131732]|https://rt.perl.org/Public/Bug/Display.html?id=131732>

=item *

Using a symbolic ref with postderef syntax as the key in a hash lookup was
yielding an assertion failure on debugging builds.
L<[perl #131627]|https://rt.perl.org/Public/Bug/Display.html?id=131627>

=item *

List assignment (C<aassign>) could in some rare cases allocate an entry on the
mortal stack and leave the entry uninitialized.
L<[perl #131570]|https://rt.perl.org/Public/Bug/Display.html?id=131570>

=item *

Attempting to apply an attribute to an C<our> variable where a function of that
name already exists could result in a NULL pointer being supplied where an SV
was expected, crashing perl.
L<[perl #131597]|https://rt.perl.org/Public/Bug/Display.html?id=131597>

=item *

The code that vivifies a typeglob out of a code ref made some false assumptions
that could lead to a crash in cases such as C<< $::{"A"} = sub {}; \&{"A"} >>.
This has now been fixed.
L<[perl #131085]|https://rt.perl.org/Public/Bug/Display.html?id=131085>

=item *

C<my_atof2> no longer reads beyond the terminating NUL, which previously
occurred if the decimal point was immediately before the NUL.
L<[perl #131526]|https://rt.perl.org/Public/Bug/Display.html?id=131526>

=item *

Occasional "Malformed UTF-8 character" crashes in C<s//> on utf8 strings have
been fixed.
L<[perl #131575]|https://rt.perl.org/Public/Bug/Display.html?id=131575>

=item *

C<perldoc -f s> now finds C<s///>.
L<[perl #131371]|https://rt.perl.org/Public/Bug/Display.html?id=131371>

=item *

Some erroneous warnings after utf8 conversion have been fixed.
L<[perl #131190]|https://rt.perl.org/Public/Bug/Display.html?id=131190>

=item *

The C<jmpenv> frame to catch Perl exceptions is set up lazily, and this used to
be a bit too lazy.  The catcher is now set up earlier, preventing some possible
crashes.
L<[perl #105930]|https://rt.perl.org/Public/Bug/Display.html?id=105930>

=item *

Spurious "Assuming NOT a POSIX class" warnings have been removed.
L<[perl #131522]|https://rt.perl.org/Public/Bug/Display.html?id=131522>

=back

=head1 Acknowledgements

Perl 5.26.1 represents approximately 4 months of development since Perl 5.26.0
and contains approximately 8,900 lines of changes across 85 files from 23
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 990 lines of changes to 38 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.26.1:

Aaron Crane, Andy Dougherty, Aristotle Pagaltzis, Chris 'BinGOs' Williams,
Craig A. Berry, Dagfinn Ilmari Mannsåker, David Mitchell, E. Choroba, Eric
Herman, Father Chrysostomos, Jacques Germishuys, James E Keenan, John SJ
Anderson, Karl Williamson, Ken Brown, Lukas Mai, Matthew Horsfall, Ricardo
Signes, Sawyer X, Steve Hay, Tony Cook, Yves Orton, Zefram.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the perl bug database
at L<https://rt.perl.org/> .  There may also be information at
L<http://www.perl.org/> , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications which make it
inappropriate to send to a publicly archived mailing list, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION> for details of how to
report the issue.

=head1 Give Thanks

If you wish to thank the Perl 5 Porters for the work we have done in Perl 5, you
can do so by running the C<perlthanks> program:

    perlthanks

This will send an email to the Perl 5 Porters list with your show of thanks.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlmod.pod000064400000063212150344123420006714 0ustar00=head1 NAME

perlmod - Perl modules (packages and symbol tables)

=head1 DESCRIPTION

=head2 Is this the document you were after?

There are other documents which might contain the information that you're
looking for:

=over 2

=item This doc

Perl's packages, namespaces, and some info on classes.

=item L<perlnewmod>

Tutorial on making a new module.

=item L<perlmodstyle>

Best practices for making a new module.

=back

=head2 Packages
X<package> X<namespace> X<variable, global> X<global variable> X<global>

Unlike Perl 4, in which all the variables were dynamic and shared one
global name space, causing maintainability problems, Perl 5 provides two
mechanisms for protecting code from having its variables stomped on by
other code: lexically scoped variables created with C<my> or C<state> and
namespaced global variables, which are exposed via the C<vars> pragma,
or the C<our> keyword. Any global variable is considered to
be part of a namespace and can be accessed via a "fully qualified form".
Conversely, any lexically scoped variable is considered to be part of
that lexical-scope, and does not have a "fully qualified form".

In Perl, namespaces are called "packages" and
the C<package> declaration tells the compiler which
namespace to prefix to C<our> variables and unqualified dynamic names.
This both protects
against accidental stomping and provides an interface for deliberately
clobbering global dynamic variables declared and used in other scopes or
packages, when that is what you want to do.

The scope of the C<package> declaration is from the
declaration itself through the end of the enclosing block, C<eval>,
or file, whichever comes first (the same scope as the my(), our(), state(), and
local() operators, and also the effect
of the experimental "reference aliasing," which may change), or until
the next C<package> declaration.  Unqualified dynamic identifiers will be in
this namespace, except for those few identifiers that, if unqualified,
default to the main package instead of the current one as described
below.  A C<package> statement affects only dynamic global
symbols, including subroutine names, and variables you've used local()
on, but I<not> lexical variables created with my(), our() or state().
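
As a minimal sketch of the scoping rule above, the following uses two
hypothetical packages, C<Outer> and C<Inner>, that are not part of any real
module.  The inner C<package> declaration lasts only until the closing brace
of its block, after which the earlier package is in effect again.

    use strict;
    use warnings;

    package Outer;

    {
        package Inner;              # in effect only until the closing brace
        sub name { __PACKAGE__ }    # compiled as Inner::name
    }

    sub name { __PACKAGE__ }        # back in Outer: compiled as Outer::name

    package main;

    print Outer::name(), " ", Inner::name(), "\n";   # prints "Outer Inner"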

Typically, a C<package> statement is the first declaration in a file
included in a program by one of the C<do>, C<require>, or C<use> operators.  You can
switch into a package in more than one place: C<package> has no
effect beyond specifying which symbol table the compiler will use for
dynamic symbols for the rest of that block or until the next C<package> statement.
You can refer to variables and filehandles in other packages
by prefixing the identifier with the package name and a double
colon: C<$Package::Variable>.  If the package name is null, the
C<main> package is assumed.  That is, C<$::sail> is equivalent to
C<$main::sail>.

The old package delimiter was a single quote, but double colon is now the
preferred delimiter, in part because it's more readable to humans, and
in part because it's more readable to B<emacs> macros.  It also makes C++
programmers feel like they know what's going on--as opposed to using the
single quote as separator, which was there to make Ada programmers feel
like they knew what was going on.  Because the old-fashioned syntax is still
supported for backwards compatibility, if you try to use a string like
C<"This is $owner's house">, you'll be accessing C<$owner::s>; that is,
the $s variable in package C<owner>, which is probably not what you meant.
Use braces to disambiguate, as in C<"This is ${owner}'s house">.
X<::> X<'>
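
The brace form can be tried in isolation.  This short sketch uses a made-up
value for C<$owner> and behaves the same whether or not the running perl
still treats the single quote as a package separator.

    use strict;
    use warnings;

    my $owner = "Alice";

    # Braces end the variable name before the apostrophe, so $owner is
    # interpolated and "'s" is left as literal text.
    my $sign = "This is ${owner}'s house";

    print "$sign\n";    # prints "This is Alice's house"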

Packages may themselves contain package separators, as in
C<$OUTER::INNER::var>.  This implies nothing about the order of
name lookups, however.  There are no relative packages: all symbols
are either local to the current package, or must be fully qualified
from the outer package name down.  For instance, there is nowhere
within package C<OUTER> that C<$INNER::var> refers to
C<$OUTER::INNER::var>.  C<INNER> refers to a totally
separate global package. The custom of treating package names as a
hierarchy is very strong, but the language in no way enforces it.
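
The absence of relative package lookup can be seen in a small sketch.  The
subroutine name C<inner_var> and the values assigned here are illustrative
assumptions; only the package names come from the text above.

    use strict;
    use warnings;

    package OUTER::INNER;
    our $var = "nested";

    package INNER;
    our $var = "top-level";

    package OUTER;

    # No relative lookup happens here: $INNER::var is the variable in the
    # top-level package INNER, not the one in OUTER::INNER.
    sub inner_var { return $INNER::var }

    package main;

    print OUTER::inner_var(), "\n";     # prints "top-level"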

Only identifiers starting with letters (or underscore) are stored
in a package's symbol table.  All other symbols are kept in package
C<main>, including all punctuation variables, like $_.  In addition,
when unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV,
ARGVOUT, ENV, INC, and SIG are forced to be in package C<main>,
even when used for other purposes than their built-in ones.  If you
have a package called C<m>, C<s>, or C<y>, then you can't use the
qualified form of an identifier because it would be instead interpreted
as a pattern match, a substitution, or a transliteration.
X<variable, punctuation> 
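
One way to check the forcing of these identifiers into C<main> is to compare
references from inside another package.  The package C<Shop> and its two
subroutines are hypothetical names chosen for this sketch.

    use strict;
    use warnings;

    package Shop;

    # Unqualified %ENV and STDOUT resolve to the main package even here,
    # so both comparisons are true.
    sub env_is_main    { return \%ENV    == \%main::ENV }
    sub stdout_is_main { return \*STDOUT == \*main::STDOUT }

    package main;

    print "same\n" if Shop::env_is_main() && Shop::stdout_is_main();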

Variables beginning with underscore used to be forced into package
main, but we decided it was more useful for package writers to be able
to use leading underscore to indicate private variables and method names.
However, variables and functions named with a single C<_>, such as
$_ and C<sub _>, are still forced into the package C<main>.  See also
L<perlvar/"The Syntax of Variable Names">.

C<eval>ed strings are compiled in the package in which the eval() was
compiled.  (Assignments to C<$SIG{}>, however, assume the signal
handler specified is in the C<main> package.  Qualify the signal handler
name if you wish to have a signal handler in a package.)  For an
example, examine F<perldb.pl> in the Perl library.  It initially switches
to the C<DB> package so that the debugger doesn't interfere with variables
in the program you are trying to debug.  At various points, however, it
temporarily switches back to the C<main> package to evaluate various
expressions in the context of the C<main> package (or wherever you came
from).  See L<perldebug>.

The special symbol C<__PACKAGE__> contains the current package, but cannot
(easily) be used to construct variable names. After C<my($foo)> has hidden
package variable C<$foo>, it can still be accessed, without knowing what
package you are in, as C<${__PACKAGE__.'::foo'}>.
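
A short sketch of that lookup follows, assuming a hypothetical package
C<Demo>.  Because the name is built from a string, the access is a symbolic
reference and needs C<no strict 'refs'> when C<strict> is in effect.

    use strict;
    use warnings;

    package Demo;

    our $foo = "package value";

    sub hidden_foo {
        my $foo = "lexical value";  # hides $Demo::foo by its short name
        no strict 'refs';           # the lookup below is a symbolic reference
        return ${ __PACKAGE__ . '::foo' };
    }

    package main;

    print Demo::hidden_foo(), "\n";     # prints "package value"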

See L<perlsub> for other scoping issues related to my() and local(),
and L<perlref> regarding closures.

=head2 Symbol Tables
X<symbol table> X<stash> X<%::> X<%main::> X<typeglob> X<glob> X<alias>

The symbol table for a package happens to be stored in the hash of that
name with two colons appended.  The main symbol table's name is thus
C<%main::>, or C<%::> for short.  Likewise the symbol table for the nested
package mentioned earlier is named C<%OUTER::INNER::>.

The value in each entry of the hash is what you are referring to when you
use the C<*name> typeglob notation.

    local *main::foo    = *main::bar;

You can use this to print out all the variables in a package, for
instance.  The standard but antiquated F<dumpvar.pl> library and
the CPAN module Devel::Symdump make use of this.
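
A rough sketch of such a listing is shown here.  The package C<Zoo>, its
contents, and the helper C<symbol_names> are invented for illustration; the
sketch reads the symbol table without modifying it.

    use strict;
    use warnings;

    package Zoo;

    our $lion   = "Leo";
    our @tigers = ("Tony", "Tara");
    sub bear { return "Baloo" }

    package main;

    # Return the sorted names found in a package's symbol table.
    sub symbol_names {
        my ($package) = @_;
        no strict 'refs';           # %{"Zoo::"} is a symbolic reference
        my @names = sort keys %{ $package . '::' };
        return @names;
    }

    print join(", ", symbol_names('Zoo')), "\n";

Each name appears once however many slots (scalar, array, code, and so on)
its typeglob has in use.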

The results of creating new symbol table entries directly or modifying any
entries that are not already typeglobs are undefined and subject to change
between releases of perl.

Assignment to a typeglob performs an aliasing operation, i.e.,

    *dick = *richard;

causes variables, subroutines, formats, and file and directory handles
accessible via the identifier C<richard> also to be accessible via the
identifier C<dick>.  If you want to alias only a particular variable or
subroutine, assign a reference instead:

    *dick = \$richard;

This makes $richard and $dick the same variable, but leaves
@richard and @dick as separate arrays.  Tricky, eh?
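
The difference can be observed directly.  This sketch declares the package
variables with C<our> so that it runs under C<strict>; the values are
arbitrary.

    use strict;
    use warnings;

    our ($richard, @richard, $dick, @dick);

    $richard = "scalar";
    @richard = (1, 2, 3);

    *dick = \$richard;      # alias only the scalar slot

    $dick = "changed";      # writes through to $richard

    print "$richard\n";         # prints "changed"
    print scalar(@dick), "\n";  # prints "0": @dick is a separate, empty array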

There is one subtle difference between the following statements:

    *foo = *bar;
    *foo = \$bar;

C<*foo = *bar> makes the typeglobs themselves synonymous while
C<*foo = \$bar> makes the SCALAR portions of two distinct typeglobs
refer to the same scalar value. This means that the following code:

    $bar = 1;
    *foo = \$bar;       # Make $foo an alias for $bar

    {
        local $bar = 2; # Restrict changes to block
        print $foo;     # Prints '1'!
    }

would print '1', because C<$foo> holds a reference to the I<original>
C<$bar>, the one that was stuffed away by C<local()> and which will be
restored when the block ends. Because variables are accessed through the
typeglob, you can use C<*foo = *bar> to create an alias which can be
localized. (But be aware that this means you can't have a separate
C<@foo> and C<@bar>, etc.)
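
For comparison, here is a sketch of the whole-typeglob form, in which the
alias does follow C<local>.  The subroutine name C<foo_inside_local> is a
hypothetical helper.

    use strict;
    use warnings;

    our ($foo, $bar);

    $bar = 1;
    *foo = *bar;            # alias the whole typeglob, not just the scalar

    sub foo_inside_local {
        local $bar = 2;
        return $foo;        # sees the localized value
    }

    print foo_inside_local(), "\n";     # prints "2"
    print "$foo\n";                     # prints "1" once the local is undone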

What makes all of this important is that the Exporter module uses glob
aliasing as the import/export mechanism. Whether or not you can properly
localize a variable that has been exported from a module depends on how
it was exported:

    @EXPORT = qw($FOO); # Usual form, can't be localized
    @EXPORT = qw(*FOO); # Can be localized

You can work around the first case by using the fully qualified name
(C<$Package::FOO>) where you need a local value, or by overriding it
by saying C<*FOO = *Package::FOO> in your script.

The C<*x = \$y> mechanism may be used to pass and return cheap references
into or from subroutines if you don't want to copy the whole
thing.  It only works when assigning to dynamic variables, not
lexicals.

    %some_hash = ();                    # can't be my()
    *some_hash = fn( \%another_hash );
    sub fn {
        local *hashsym = shift;
        # now use %hashsym normally, and you
        # will affect the caller's %another_hash
        my %nhash = (); # do what you want
        return \%nhash;
    }

On return, the reference will overwrite the hash slot in the
symbol table specified by the *some_hash typeglob.  This
is a somewhat tricky way of passing around references cheaply
when you don't want to have to remember to dereference variables
explicitly.

Another use of symbol tables is for making "constant" scalars.
X<constant> X<scalar, constant>

    *PI = \3.14159265358979;

Now you cannot alter C<$PI>, which is probably a good thing all in all.
This isn't the same as a constant subroutine, which is subject to
optimization at compile-time.  A constant subroutine is one prototyped
to take no arguments and to return a constant expression.  See
L<perlsub> for details on these.  The C<use constant> pragma is a
convenient shorthand for these.
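
A brief sketch of what "cannot alter" means in practice: the assignment
raises an exception at run time, which is trapped here with C<eval>.  The
variable C<$assigned> is an illustrative name.

    use strict;
    use warnings;

    our $PI;
    *PI = \3.14159265358979;    # $PI now aliases a read-only literal

    my $assigned = eval { $PI = 3; 1 };     # the assignment dies inside eval

    print "still $PI\n" unless $assigned;   # prints "still 3.14159265358979"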

You can say C<*foo{PACKAGE}> and C<*foo{NAME}> to find out what name and
package the *foo symbol table entry comes from.  This may be useful
in a subroutine that gets passed typeglobs as arguments:

    sub identify_typeglob {
        my $glob = shift;
        print 'You gave me ', *{$glob}{PACKAGE},
            '::', *{$glob}{NAME}, "\n";
    }
    identify_typeglob *foo;
    identify_typeglob *bar::baz;

This prints

    You gave me main::foo
    You gave me bar::baz

The C<*foo{THING}> notation can also be used to obtain references to the
individual elements of *foo.  See L<perlref>.

Subroutine definitions (and declarations, for that matter) need
not necessarily be situated in the package whose symbol table they
occupy.  You can define a subroutine outside its package by
explicitly qualifying the name of the subroutine:

    package main;
    sub Some_package::foo { ... }   # &foo defined in Some_package

This is just a shorthand for a typeglob assignment at compile time:

    BEGIN { *Some_package::foo = sub { ... } }

and is I<not> the same as writing:

    {
        package Some_package;
        sub foo { ... }
    }

In the first two versions, the body of the subroutine is
lexically in the main package, I<not> in Some_package. So
something like this:

    package main;

    $Some_package::name = "fred";
    $main::name = "barney";

    sub Some_package::foo {
        print "in ", __PACKAGE__, ": \$name is '$name'\n";
    }

    Some_package::foo();

prints:

    in main: $name is 'barney'

rather than:

    in Some_package: $name is 'fred'

This also has implications for the use of the SUPER:: qualifier
(see L<perlobj>).

=head2 BEGIN, UNITCHECK, CHECK, INIT and END
X<BEGIN> X<UNITCHECK> X<CHECK> X<INIT> X<END>

Five specially named code blocks are executed at the beginning and at
the end of a running Perl program.  These are the C<BEGIN>,
C<UNITCHECK>, C<CHECK>, C<INIT>, and C<END> blocks.

These code blocks can be prefixed with C<sub> to give the appearance of a
subroutine (although this is not considered good style).  One should note
that these code blocks don't really exist as named subroutines (despite
their appearance). The thing that gives this away is the fact that you can
have B<more than one> of these code blocks in a program, and they will
B<all> be executed at the appropriate moment.  So you can't execute any of
these code blocks by name.

A C<BEGIN> code block is executed as soon as possible, that is, the moment
it is completely defined, even before the rest of the containing file (or
string) is parsed.  You may have multiple C<BEGIN> blocks within a file (or
eval'ed string); they will execute in order of definition.  Because a C<BEGIN>
code block executes immediately, it can pull in definitions of subroutines
and such from other files in time to be visible to the rest of the compile
and run time.  Once a C<BEGIN> has run, it is immediately undefined and any
code it used is returned to Perl's memory pool.
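
The timing can be seen in a small sketch.  Although the C<BEGIN> block comes
later in the file, it runs during compilation, before the ordinary
statement above it is executed.

    use strict;
    use warnings;

    our @order;

    push @order, "run time";

    BEGIN { push @order, "BEGIN block" }    # runs as soon as it is parsed

    print join(", ", @order), "\n";     # prints "BEGIN block, run time"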

An C<END> code block is executed as late as possible, that is, after
perl has finished running the program and just before the interpreter
exits, even if it is exiting as a result of a die() function.
(But not if it's morphing into another program via C<exec>, or
being blown out of the water by a signal--you have to trap that yourself
(if you can).)  You may have multiple C<END> blocks within a file--they
will execute in reverse order of definition; that is: last in, first
out (LIFO).  C<END> blocks are not executed when you run perl with the
C<-c> switch, or if compilation fails.
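
Because C<END> blocks run only as the interpreter exits, one way to observe
their order is from a parent process.  This sketch assumes a platform where
the list form of a piped C<open> is available, and uses C<$^X> to run the
same perl binary as a child.

    use strict;
    use warnings;

    # A child program whose two END blocks each print one digit.
    my $program = 'END { print 1 } END { print 2 }';

    open(my $child, '-|', $^X, '-e', $program)
        or die "cannot run $^X: $!";
    my $output = do { local $/; <$child> };
    close $child;

    print "$output\n";      # prints "21": the block defined last runs first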

Note that C<END> code blocks are B<not> executed at the end of a string
C<eval()>: if any C<END> code blocks are created in a string C<eval()>,
they will be executed just as any other C<END> code block of that package
in LIFO order just before the interpreter exits.

Inside an C<END> code block, C<$?> contains the value that the program is
going to pass to C<exit()>.  You can modify C<$?> to change the exit
value of the program.  Beware of changing C<$?> by accident (e.g. by
running something via C<system>).
X<$?>
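
As a sketch of changing the exit value, the child program below would
normally exit with status 0, but its C<END> block sets C<$?> to 3.  The
parent reads the status from the high byte of its own C<$?> after
C<system> returns.

    use strict;
    use warnings;

    # The child's END block overrides the exit status.
    system($^X, '-e', 'END { $? = 3 }');

    my $exit_status = $? >> 8;
    print "$exit_status\n";     # prints "3"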

Inside of an C<END> block, the value of C<${^GLOBAL_PHASE}> will be
C<"END">.

C<UNITCHECK>, C<CHECK> and C<INIT> code blocks are useful to catch the
transition between the compilation phase and the execution phase of
the main program.

C<UNITCHECK> blocks are run just after the unit which defined them has
been compiled.  The main program file and each module it loads are
compilation units, as are string C<eval>s, run-time code compiled using the
C<(?{ })> construct in a regex, calls to C<do FILE>, C<require FILE>,
and code after the C<-e> switch on the command line.

C<BEGIN> and C<UNITCHECK> blocks are not directly related to the phase of
the interpreter.  They can be created and executed during any phase.

C<CHECK> code blocks are run just after the B<initial> Perl compile phase ends
and before the run time begins, in LIFO order.  C<CHECK> code blocks are used
in the Perl compiler suite to save the compiled state of the program.

Inside of a C<CHECK> block, the value of C<${^GLOBAL_PHASE}> will be
C<"CHECK">.

C<INIT> blocks are run just before the Perl runtime begins execution, in
"first in, first out" (FIFO) order.

Inside of an C<INIT> block, the value of C<${^GLOBAL_PHASE}> will be C<"INIT">.
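
The phase names can be collected with a short sketch.  It assumes the code
is the main program file (not loaded at run time by C<require> or a string
C<eval>) and a perl recent enough to provide C<${^GLOBAL_PHASE}>.

    use strict;
    use warnings;

    our @phases;

    BEGIN { push @phases, ${^GLOBAL_PHASE} }    # during compilation
    CHECK { push @phases, ${^GLOBAL_PHASE} }    # after compilation ends
    INIT  { push @phases, ${^GLOBAL_PHASE} }    # just before run time

    push @phases, ${^GLOBAL_PHASE};             # ordinary run-time code

    print "@phases\n";      # prints "START CHECK INIT RUN"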

The C<CHECK> and C<INIT> blocks in code compiled by C<require>, string C<do>,
or string C<eval> will not be executed if they occur after the end of the
main compilation phase; that can be a problem in mod_perl and other persistent
environments which use those functions to load code at runtime.

When you use the B<-n> and B<-p> switches to Perl, C<BEGIN> and
C<END> work just as they do in B<awk>, as a degenerate case.
Both C<BEGIN> and C<CHECK> blocks are run when you use the B<-c>
switch for a compile-only syntax check, although your main code
is not.

The B<begincheck> program makes it all clear, eventually:

  #!/usr/bin/perl

  # begincheck

  print         "10. Ordinary code runs at runtime.\n";

  END { print   "16.   So this is the end of the tale.\n" }
  INIT { print  " 7. INIT blocks run FIFO just before runtime.\n" }
  UNITCHECK {
    print       " 4.   And therefore before any CHECK blocks.\n"
  }
  CHECK { print " 6.   So this is the sixth line.\n" }

  print         "11.   It runs in order, of course.\n";

  BEGIN { print " 1. BEGIN blocks run FIFO during compilation.\n" }
  END { print   "15.   Read perlmod for the rest of the story.\n" }
  CHECK { print " 5. CHECK blocks run LIFO after all compilation.\n" }
  INIT { print  " 8.   Run this again, using Perl's -c switch.\n" }

  print         "12.   This is anti-obfuscated code.\n";

  END { print   "14. END blocks run LIFO at quitting time.\n" }
  BEGIN { print " 2.   So this line comes out second.\n" }
  UNITCHECK {
   print " 3. UNITCHECK blocks run LIFO after each file is compiled.\n"
  }
  INIT { print  " 9.   You'll see the difference right away.\n" }

  print         "13.   It only _looks_ like it should be confusing.\n";

  __END__

=head2 Perl Classes
X<class> X<@ISA>

There is no special class syntax in Perl, but a package may act
as a class if it provides subroutines to act as methods.  Such a
package may also derive some of its methods from another class (package)
by listing the other package name(s) in its global @ISA array (which
must be a package global, not a lexical).

For more on this, see L<perlootut> and L<perlobj>.
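
A minimal sketch of a package acting as a class, with a second package
inheriting from it through C<@ISA>, might look like this.  The class names
C<Animal> and C<Dog> and their methods are invented for illustration.

    use strict;
    use warnings;

    package Animal;

    sub new {
        my ($class, %args) = @_;
        return bless { name => $args{name} }, $class;
    }

    sub describe {
        my ($self) = @_;
        return $self->{name} . " says " . $self->sound;
    }

    sub sound { return "..." }

    package Dog;

    our @ISA = ('Animal');      # package global, so method lookup can see it

    sub sound { return "Woof" }

    package main;

    my $dog = Dog->new(name => "Rex");
    print $dog->describe, "\n";     # prints "Rex says Woof"

Here C<new> and C<describe> are found in C<Animal> through C<@ISA>, while
C<sound> is found in C<Dog> first.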

=head2 Perl Modules
X<module>

A module is just a set of related functions in a library file, i.e.,
a Perl package with the same name as the file.  It is specifically
designed to be reusable by other modules or programs.  It may do this
by providing a mechanism for exporting some of its symbols into the
symbol table of any package using it, or it may function as a class
definition and make its semantics available implicitly through
method calls on the class and its objects, without explicitly
exporting anything.  Or it can do a little of both.

For example, to start a traditional, non-OO module called Some::Module,
create a file called F<Some/Module.pm> and start with this template:

    package Some::Module;  # assumes Some/Module.pm

    use strict;
    use warnings;

    BEGIN {
        require Exporter;

        # set the version for version checking
        our $VERSION     = 1.00;

        # Inherit from Exporter to export functions and variables
        our @ISA         = qw(Exporter);

        # Functions and variables which are exported by default
        our @EXPORT      = qw(func1 func2);

        # Functions and variables which can be optionally exported
        our @EXPORT_OK   = qw($Var1 %Hashit func3);
    }

    # exported package globals go here
    our $Var1    = '';
    our %Hashit  = ();

    # non-exported package globals go here
    # (they are still accessible as $Some::Module::stuff)
    our @more    = ();
    our $stuff   = '';

    # file-private lexicals go here, before any functions which use them
    my $priv_var    = '';
    my %secret_hash = ();

    # here's a file-private function as a closure,
    # callable as $priv_func->();
    my $priv_func = sub {
        ...
    };

    # make all your functions, whether exported or not;
    # remember to put something interesting in the {} stubs
    sub func1      { ... }
    sub func2      { ... }

    # this one isn't exported, but could be called directly
    # as Some::Module::func3()
    sub func3      { ... }

    END { ... }       # module clean-up code here (global destructor)

    1;  # don't forget to return a true value from the file

Then go on to declare and use your variables in functions without
any qualifications.  See L<Exporter> and L<perlmodlib> for
details on mechanics and style issues in module creation.

Perl modules are included into your program by saying

    use Module;

or

    use Module LIST;

This is exactly equivalent to

    BEGIN { require 'Module.pm'; 'Module'->import; }

or

    BEGIN { require 'Module.pm'; 'Module'->import( LIST ); }

As a special case

    use Module ();

is exactly equivalent to

    BEGIN { require 'Module.pm'; }

All Perl module files have the extension F<.pm>.  The C<use> operator
assumes this so you don't have to spell out "F<Module.pm>" in quotes.
This also helps to differentiate new modules from old F<.pl> and
F<.ph> files.  Module names are also capitalized unless they're
functioning as pragmas; pragmas are in effect compiler directives,
and are sometimes called "pragmatic modules" (or even "pragmata"
if you're a classicist).

The two statements:

    require SomeModule;
    require "SomeModule.pm";

differ from each other in two ways.  In the first case, any double
colons in the module name, such as C<Some::Module>, are translated
into your system's directory separator, usually "/".   The second
case does not, and would have to be specified literally.  The other
difference is that seeing the first C<require> clues in the compiler
that uses of indirect object notation involving "SomeModule", as
in C<$ob = purge SomeModule>, are method calls, not function calls.
(Yes, this really can make a difference.)
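
The first difference can be made concrete with a short sketch.  The
helper name C<module_to_file> is invented for this illustration; it
only mimics the translation that a bareword C<require> performs, and
assumes "/" as the separator:

    # Hypothetical helper: the file name that a bareword "require"
    # looks for, given a module name.
    sub module_to_file {
        my ($module) = @_;
        (my $file = $module) =~ s{::}{/}g;   # Some::Module -> Some/Module
        return "$file.pm";
    }

    print module_to_file('Some::Module'), "\n";

    # These two statements therefore look for the same file:
    #     require Some::Module;
    #     require "Some/Module.pm";

A helper like this is mainly useful when the module name is only known
at run time, since C<require> with a string expects the file name form.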

Because the C<use> statement implies a C<BEGIN> block, the importing
of semantics happens as soon as the C<use> statement is compiled,
before the rest of the file is compiled.  This is how it is able
to function as a pragma mechanism, and also how modules are able to
declare subroutines that are then visible as list or unary operators for
the rest of the current file.  This will not work if you use C<require>
instead of C<use>.  With C<require> you can get into this problem:

    require Cwd;		# make Cwd:: accessible
    $here = Cwd::getcwd();

    use Cwd;			# import names from Cwd::
    $here = getcwd();

    require Cwd;	    	# make Cwd:: accessible
    $here = getcwd(); 		# oops! no main::getcwd()

In general, C<use Module ()> is recommended over C<require Module>,
because it determines module availability at compile time, not in the
middle of your program's execution.  An exception would be if two modules
each tried to C<use> each other, and each also called a function from
that other module.  In that case, it's easy to use C<require> instead.

Perl packages may be nested inside other package names, so we can have
package names containing C<::>.  But if we used that package name
directly as a filename it would make for unwieldy or impossible
filenames on some systems.  Therefore, if a module's name is, say,
C<Text::Soundex>, then its definition is actually found in the library
file F<Text/Soundex.pm>.

Perl modules always have a F<.pm> file, but there may also be
dynamically linked executables (often ending in F<.so>) or autoloaded
subroutine definitions (often ending in F<.al>) associated with the
module.  If so, these will be entirely transparent to the user of
the module.  It is the responsibility of the F<.pm> file to load
(or arrange to autoload) any additional functionality.  For example,
although the POSIX module happens to do both dynamic loading and
autoloading, the user can say just C<use POSIX> to get it all.

=head2 Making your module threadsafe
X<threadsafe> X<thread safe>
X<module, threadsafe> X<module, thread safe>
X<CLONE> X<CLONE_SKIP> X<thread> X<threads> X<ithread>

Perl supports a type of threads called interpreter threads (ithreads).
These threads can be used explicitly and implicitly.

Ithreads work by cloning the data tree so that no data is shared
between different threads.  These threads can be used by using the
C<threads> module or by doing fork() on Win32 (fake fork() support).
When a thread is cloned, all Perl data is cloned; however, non-Perl
data cannot be cloned automatically.  Perl after 5.8.0 has support for
the C<CLONE> special subroutine.  In C<CLONE> you can do whatever you
need to do, for example handle the cloning of non-Perl data, if
necessary.
C<CLONE> will be called once as a class method for every package that has it
defined (or inherits it).  It will be called in the context of the new thread,
so all modifications are made in the new area.  Currently CLONE is called with
no parameters other than the invocant package name, but code should not assume
that this will remain unchanged, as it is likely that in future extra parameters
will be passed in to give more information about the state of cloning.

If you want to CLONE all objects you will need to keep track of them per
package. This is simply done using a hash and Scalar::Util::weaken().
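
One possible shape for such bookkeeping is sketched below.  The class
name C<Counter> and the registry variable are invented for this
illustration, and the code only shows the technique (a hash keyed by
address, holding weakened references); it is not the only way to do it:

    package Counter;          # hypothetical class, for illustration only

    use strict;
    use warnings;
    use Scalar::Util qw(weaken refaddr);

    my %REGISTRY;             # address => weak reference to a live object

    sub new {
        my ($class) = @_;
        my $self = bless { count => 0 }, $class;
        weaken($REGISTRY{ refaddr($self) } = $self);
        return $self;
    }

    sub DESTROY {
        my ($self) = @_;
        delete $REGISTRY{ refaddr($self) };
    }

    sub CLONE {
        my ($class) = @_;
        # Runs in the new thread, where every cloned object has a new
        # address, so the registry is rebuilt under the new keys.
        my @objects = grep { defined } values %REGISTRY;
        %REGISTRY = ();
        for my $obj (@objects) {
            # ... fix up any non-Perl data attached to $obj here ...
            weaken($REGISTRY{ refaddr($obj) } = $obj);
        }
    }

    1;

The references are weakened so that the registry itself never keeps an
object alive.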

Perl after 5.8.7 has support for the C<CLONE_SKIP> special subroutine.
Like C<CLONE>, C<CLONE_SKIP> is called once per package; however, it is
called just before cloning starts, and in the context of the parent
thread. If it returns a true value, then no objects of that class will
be cloned; or rather, they will be copied as unblessed, undef values.
For example: if in the parent there are two references to a single blessed
hash, then in the child there will be two references to a single undefined
scalar value instead.
This provides a simple mechanism for making a module threadsafe; just add
C<sub CLONE_SKIP { 1 }> at the top of the class, and C<DESTROY()> will
now only be called once per object. Of course, if the child thread needs
to make use of the objects, then a more sophisticated approach is
needed.

Like C<CLONE>, C<CLONE_SKIP> is currently called with no parameters other
than the invocant package name, although that may change. Similarly, to
allow for future expansion, the return value should be a single C<0> or
C<1> value.
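
A minimal sketch of the simple case, assuming a hypothetical class
C<My::Handle> whose objects wrap a resource that must be released
exactly once:

    package My::Handle;       # hypothetical class, for illustration only

    use strict;
    use warnings;

    sub CLONE_SKIP { 1 }      # objects of this class are not cloned

    sub new {
        my ($class, $id) = @_;
        return bless { id => $id }, $class;
    }

    sub DESTROY {
        my ($self) = @_;
        print "releasing $self->{id}\n";   # runs in the parent only
    }

    1;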

=head1 SEE ALSO

See L<perlmodlib> for general style issues related to building Perl
modules and classes, as well as descriptions of the standard library
and CPAN, L<Exporter> for how Perl's standard import/export mechanism
works, L<perlootut> and L<perlobj> for in-depth information on
creating classes, L<perlobj> for a hard-core reference document on
objects, L<perlsub> for an explanation of functions and scoping,
and L<perlxstut> and L<perlguts> for more information on writing
extension modules.
perlsymbian.pod000064400000035777150344123420007616 0ustar00If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perlsymbian - Perl version 5 on Symbian OS

=head1 DESCRIPTION

This document describes various features of the Symbian operating
system that will affect how Perl version 5 (hereafter just Perl)
is compiled and/or runs.

B<NOTE: this port (as of 0.4.1) does not compile into a Symbian
OS GUI application, but instead it results in a Symbian DLL.>
The DLL includes a C++ class called CPerlBase, which one can then
(derive from and) use to embed Perl into applications, see F<symbian/README>.

The base port of Perl to Symbian only implements the basic POSIX-like
functionality; it does not implement any further Symbian or Series 60,
Series 80, or UIQ bindings for Perl.

It is also possible to generate Symbian executables for "miniperl"
and "perl", but since there is no standard command line interface
for Symbian (nor full keyboards in the devices), these are useful
mainly as demonstrations.

=head2 Compiling Perl on Symbian

(0) You need to have the appropriate Symbian SDK installed.

These instructions have been tested under various Nokia Series 60
Symbian SDKs (1.2 to 2.6, 2.8 should also work, 1.2 compiles but
does not work), Series 80 2.0, and Nokia 7710 (Series 90) SDK.
You can get the SDKs from Forum Nokia (L<http://www.forum.nokia.com/>).
A very rough port ("it compiles") to UIQ 2.1 has also been made.

A prerequisite for any of the SDKs is to install ActivePerl
from ActiveState, L<http://www.activestate.com/Products/ActivePerl/>

Having the SDK installed also means that you need to have either
the Metrowerks CodeWarrior installed (2.8 and 3.0 were used in testing)
or the Microsoft Visual C++ 6.0 installed (SP3 minimum, SP5 recommended).

Note that for example the Series 60 2.0 VC SDK installation talks
about ActivePerl build 518, which (as of mid-2005) no longer exists
at the ActiveState website.  The ActivePerl 5.8.4 build 810 was
used successfully for compiling Perl on Symbian.  The 5.6.x ActivePerls
do not work.

Other SDKs or compilers like Visual.NET, command-line-only
Visual.NET, Borland, GnuPoc, or sdk2unix have not been tried.

These instructions almost certainly won't work with older Symbian
releases or other SDKs.  Patches to get this port running in other
releases, SDKs, compilers, platforms, or devices are naturally welcome.

(1) Get a Perl source code distribution (for example the file
perl-5.9.2.tar.gz is fine) from L<http://www.cpan.org/src/>
and unpack it in the C:/Symbian directory of your Windows
system.

(2) Change to the perl source directory.

    cd c:\Symbian\perl-5.x.x

(3) Run the following script using the perl coming with the SDK

    perl symbian\config.pl

You must use cmd.exe; the Cygwin shell will not work.
The PATH must include the SDK tools, including a Perl,
which should be the case under cmd.exe.  If you do not
have that, see the end of symbian\sdk.pl for notes on
how your environment should be set up for Symbian compiles.

(4) Build the project, either by

     make all

in cmd.exe or by using either the Metrowerks CodeWarrior
or the Visual C++ 6.0, or the Visual Studio 8 (the Visual C++
2005 Express Edition works fine).

If you use the VC IDE, you will have to run F<symbian\config.pl>
first using the cmd.exe, and then run 'make win.mf vc6.mf' to generate
the VC6 makefiles and workspaces.  "make vc6" will compile for the VC6,
and "make cw" for the CodeWarrior.

The following SDK and compiler configurations and Nokia phones were
tested at some point in time (+ = compiled and PerlApp run, - = not),
both for Perl 5.8.x and 5.9.x:

     SDK     | VC | CW |
     --------+----+----+---
     S60 1.2 | +  | +  | 3650 (*)
     S60 2.0 | +  | +  | 6600
     S60 2.1 | -  | +  | 6670
     S60 2.6 | +  | +  | 6630
     S60 2.8 | +  | +  | (not tested in a device)
     S80 2.6 | -  | +  | 9300
     S90 1.1 | +  | -  | 7710
     UIQ 2.1 | -  | +  | (not tested in a device)

 (*) Compiles but does not work, unfortunately, a problem with Symbian.

If you are using the 'make' directly, it is the GNU make from the SDKs,
and it will invoke the right make commands for the Windows emulator
build and the Arm target builds ('thumb' by default) as necessary.

The build scripts assume the 'absolute style' SDK installs under C:,
the 'subst style' will not work.

If using the VC IDE, to build use for example the File->Open Workspace->
C:\Symbian\8.0a\S60_2nd_FP2\epoc32\build\symbian\perl\perl\wins\perl.dsw
The emulator binaries will appear in the same directory.

If using the VC IDE, you will see a lot of warnings at the beginning of
the build because a lot of headers mentioned by the source cannot
be found, but this is not serious since those headers are not used.

The Metrowerks will give a lot of warnings about unused variables and
empty declarations, you can ignore those.

When the Windows and Arm DLLs are built, do not be scared by the very long
messages whizzing by: it is the "export freeze" phase where the whole
(rather large) API of Perl is listed.

Once the build is completed you need to create the DLL SIS file by

     make perldll.sis

which will create the file perlXYZ.sis (the XYZ being the Perl version)
which you can then install into your Symbian device: an easy way
to do this is to send them via Bluetooth or infrared and just open
the messages.

Since the total size of all Perl SIS files once installed is
over 2 MB, it is recommended to do the installation into a
memory card (drive E:) instead of the C: drive.

The size of the perlXYZ.SIS is about 370 kB but once it is in the
device it is about 750 kB (according to the application manager).

The perlXYZ.sis includes only the Perl DLL: to create an additional
SIS file which includes some of the standard (pure) Perl libraries,
issue the command

     make perllib.sis

Some of the standard Perl libraries are included, but not all:
see L</HISTORY> or F<symbian\install.cfg> for more details
(250 kB -> 700 kB).

Some of the standard Perl XS extensions (see L</HISTORY>) are
also available:

     make perlext.sis

which will create perlXYZext.sis (290 kB -> 770 kB).

To compile the demonstration application PerlApp you need first to
install the Perl headers under the SDK.

To install the Perl headers and the class CPerlBase documentation
so that you no longer need the Perl sources around to compile Perl
applications using the SDK:

     make sdkinstall

The destination directory is C:\Symbian\perl\X.Y.Z.  For more
details, see F<symbian\PerlBase.pod>.

Once the headers have been installed, you can create a SIS for
the PerlApp:

     make perlapp.sis

The perlapp.sis (11 kB -> 16 kB) will be built in the symbian
subdirectory, but a copy will also be made to the main directory.

If you want to package the Perl DLLs (one for WINS, one for ARMI),
the headers, and the documentation:

     make perlsdk.zip

which will create perlXYZsdk.zip that can be used in another
Windows system with the SDK, without having to compile Perl in
that system.

If you want to package the PerlApp sources:

     make perlapp.zip

If you want to package the perl.exe and miniperl.exe, you
can use the perlexe.sis and miniperlexe.sis make targets.
You also probably want the perllib.sis for the libraries
and maybe even the perlapp.sis for the recognizer.

The make target 'allsis' combines all the above SIS targets.

To clean up after compilation you can use either of

     make clean
     make distclean

depending on how clean you want to be.

=head2 Compilation problems

If you see right after "make" this

    cat makefile.sh >makefile
    'cat' is not recognized as an internal or external command,
    operable program or batch file.

it means you need to (re)run the F<symbian\config.pl>.

If you get the error

        'perl' is not recognized as an internal or external command,
        operable program or batch file.

you may need to reinstall the ActivePerl.

If you see this

    ren makedef.pl nomakedef.pl
    The system cannot find the file specified.
    C:\Symbian\...\make.exe: [rename_makedef] Error 1 (ignored)

please ignore it since it is nothing serious (the build process
renames the Perl makedef.pl as nomakedef.pl to avoid confusing it
with a makedef.pl of the SDK).

=head2 PerlApp

The PerlApp application demonstrates how to embed Perl interpreters
into a Symbian application.  The "Time" menu item runs the following
Perl code: C<print "Running in ", $^O, "\n", scalar localtime>,
the "Oneliner" allows one to type in Perl code, and the "Run"
opens a file chooser for selecting a Perl file to run.

The PerlApp also is started when the "Perl recognizer" (also included
and installed) detects a Perl file being activated through the GUI,
and offers either to install it under \Perl (if the Perl file is in
the inbox of the messaging application) or to run it (if the Perl file
is under \Perl).

=head2 sisify.pl

In the symbian subdirectory there is the F<sisify.pl> utility, which can be used
to package Perl scripts and/or Perl library directories into SIS files,
which can be installed to the device.  To run the sisify.pl utility,
you will need to have the 'makesis' and 'uidcrc' utilities already
installed.  If you don't have the Win32 SDKs, you may try for example
L<http://gnupoc.sourceforge.net/> or L<http://symbianos.org/~andreh/>.

=head2 Using Perl in Symbian

First of all note that you have full access to the Symbian device
when using Perl: you can do a lot of damage to your device (like
removing system files) unless you are careful.  Please do take
backups before doing anything.

The Perl port has been done for the most part using the Symbian
standard POSIX-ish STDLIB library. It is a reasonably complete
library, but certain corners of such emulation libraries that tend
to be left unimplemented on non-UNIX platforms are unimplemented
here as well: fork(), signals, user/group ids,
select() working for sockets, non-blocking sockets, and so forth.
See the file F<symbian/config.sh> and look for 'undef' to find the
unsupported APIs (or from Perl use Config).
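
Checking from Perl could look like the following sketch.  The keys
shown are ordinary C<Config> entries chosen for illustration; which of
them are actually undefined on a given Symbian build is not asserted
here:

    use strict;
    use warnings;
    use Config;

    # Each d_* key is 'define' when Configure found the API;
    # anything else is treated as missing.
    for my $key (qw(d_fork d_alarm d_select d_getpwent)) {
        my $value = $Config{$key};
        my $state = (defined $value && $value eq 'define')
                  ? 'available'
                  : 'missing';
        printf "%-12s %s\n", $key, $state;
    }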

The filesystem of Symbian devices uses DOSish syntax, "drives"
separated from paths by a colon, and backslashes for the path.  The
exact assignment of the drives probably varies between platforms, but
for example in Series 60 you might see C: as the (flash) main memory,
D: as the RAM drive, E: as the memory card (MMC), Z: as the ROM.  In
Series 80 D: is the memory card.  As far as the devices go, NUL: is
the bit bucket, the COMx: are the serial lines, IRCOMx: are the IR
ports, TMP: might be C:\System\Temp.  Remember to double those
backslashes in doublequoted strings.
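
For instance (the file name F<hello.pl> is made up for this
illustration):

    my $dir   = 'E:\Perl';              # single quotes: one backslash
    my $file  = "E:\\Perl\\hello.pl";   # double quotes: double them
    my $wrong = "E:\Perl\hello.pl";     # backslashes are lost here

    print "$file\n";                    # prints the path with single
                                        # backslashes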

The Perl DLL is installed in \System\Libs\.  The Perl libraries and
extension DLLs are installed in \System\Libs\Perl\X.Y.Z\.  The PerlApp
is installed in \System\Apps\, and the SIS also installs a couple of
demo scripts in \Perl\ (C:\Mydocs\Perl\ on Nokia 7710).

Note that the Symbian filesystem is very picky: it strongly prefers
the \ instead of the /.

When doing XS / Symbian C++ programming include first the Symbian
headers, then any standard C/POSIX headers, then Perl headers, and finally
any application headers.

New() and Copy() are unfortunately used by both Symbian and Perl code
so you'll have to play cpp games if you need them.  PerlBase.h undefines
the Perl definitions and redefines them as PerlNew() and PerlCopy().

=head1 TO DO

Lots.  See F<symbian/TODO>.

=head1 WARNING

As of Perl Symbian port version 0.4.1, no part of Perl's standard
regression test suite has been run on a real Symbian device using
the ported Perl, so innumerable bugs may lie in wait.  Therefore there
is absolutely no warranty.

=head1 NOTE

When creating and extending application programming interfaces (APIs)
for Symbian or Series 60 or Series 80 or Series 90 it is suggested
that trademarks, registered trademarks, or trade names are not used in
the API names.  Instead, developers should consider basing the API
naming on the existing (C++, or maybe Java) public component and API
naming, modified as appropriate by the rules of the programming
language the new APIs are for.

Nokia is a registered trademark of Nokia Corporation. Nokia's product
names are trademarks or registered trademarks of Nokia.  Other product
and company names mentioned herein may be trademarks or trade names of
their respective owners.

=head1 AUTHOR

Jarkko Hietaniemi

=head1 COPYRIGHT

Copyright (c) 2004-2005 Nokia.  All rights reserved.

Copyright (c) 2006-2007 Jarkko Hietaniemi.

=head1 LICENSE

The Symbian port is licensed under the same terms as Perl itself.

=head1 HISTORY

=over 4

=item *

0.1.0: April 2005

(This will show as "0.01" in the Symbian Installer.)

 - The console window is a very simple console indeed: one can
   get the newline with "000" and the "C" button is a backspace.
   Do not expect a terminal capable of vt100 or ANSI sequences.
   The console is also "ASCII", you cannot input e.g. any accented
   letters.  Because of obvious physical constraints the console is
   also very small: (in Nokia 6600) 22 columns, 17 rows.
 - The following libraries are available:
   AnyDBM_File AutoLoader base Carp Config Cwd constant
   DynaLoader Exporter File::Spec integer lib strict Symbol
   vars warnings XSLoader
 - The following extensions are available:
   attributes Compress::Zlib Cwd Data::Dumper Devel::Peek
   Digest::MD5 DynaLoader Fcntl File::Glob Filter::Util::Call
   IO List::Util MIME::Base64
   PerlIO::scalar PerlIO::via SDBM_File Socket Storable Time::HiRes
 - The following extensions are missing for various technical
   reasons:
   B ByteLoader Devel::DProf Devel::PPPort Encode GDBM_File
   I18N::Langinfo IPC::SysV NDBM_File Opcode PerlIO::encoding POSIX
   re Safe Sys::Hostname Sys::Syslog
   threads threads::shared Unicode::Normalize
 - Using MakeMaker or the Module::* to build and install modules
   is not supported.
 - Building XS other than the ones in the core is not supported.

Since this is a 0.something release, any future releases are almost
guaranteed to be binary incompatible.  As a sign of this the Symbian
symbol exports are kept unfrozen and the .def files fully rebuilt
every time.

=item *

0.2.0: October 2005

  - Perl 5.9.3 (patch level 25741)
  - Compress::Zlib and IO::Zlib supported
  - sisify.pl added

We maintain the binary incompatibility.

=item *

0.3.0: October 2005

  - Perl 5.9.3 (patch level 25911)
  - Series 80 2.0 and UIQ 2.1 support

We maintain the binary incompatibility.

=item *

0.4.0: November 2005

  - Perl 5.9.3 (patch level 26052)
  - adding a sample Symbian extension

We maintain the binary incompatibility.

=item *

0.4.1: December 2006

  - Perl 5.9.5-to-be (patch level 30002)
  - added extensions: Compress/Raw/Zlib, Digest/SHA,
    Hash/Util, Math/BigInt/FastCalc, Text/Soundex, Time/Piece
  - port to S90 1.1 by alexander smishlajev

We maintain the binary incompatibility.

=item *

0.4.2: March 2007

  - catchup with Perl 5.9.5-to-be (patch level 30812)
  - tested to build with Microsoft Visual C++ 2005 Express Edition
    (which uses Microsoft Visual C 8, instead of the old VC6),
    SDK used for testing S60_2nd_FP3 aka 8.1a

We maintain the binary incompatibility.

=back

=cut
perlstyle.pod000064400000020666150344123420007303 0ustar00=head1 NAME

perlstyle - Perl style guide

=head1 DESCRIPTION

Each programmer will, of course, have his or her own preferences with
regard to formatting, but there are some general guidelines that will
make your programs easier to read, understand, and maintain.

The most important thing is to run your programs under the B<-w>
flag at all times.  You may turn it off explicitly for particular
portions of code via the C<no warnings> pragma or the C<$^W> variable
if you must.  You should also always run under C<use strict> or know the
reason why not.  The C<use sigtrap> and even C<use diagnostics> pragmas
may also prove useful.

Regarding aesthetics of code layout, about the only thing Larry
cares strongly about is that the closing curly bracket of
a multi-line BLOCK should line up with the keyword that started the construct.
Beyond that, he has other preferences that aren't so strong:

=over 4

=item *

4-column indent.

=item *

Opening curly on same line as keyword, if possible, otherwise line up.

=item *

Space before the opening curly of a multi-line BLOCK.

=item *

One-line BLOCK may be put on one line, including curlies.

=item *

No space before the semicolon.

=item *

Semicolon omitted in "short" one-line BLOCK.

=item *

Space around most operators.

=item *

Space around a "complex" subscript (inside brackets).

=item *

Blank lines between chunks that do different things.

=item *

Uncuddled elses.

=item *

No space between function name and its opening parenthesis.

=item *

Space after each comma.

=item *

Long lines broken after an operator (except C<and> and C<or>).

=item *

Space after last parenthesis matching on current line.

=item *

Line up corresponding items vertically.

=item *

Omit redundant punctuation as long as clarity doesn't suffer.

=back

Larry has his reasons for each of these things, but he doesn't claim that
everyone else's mind works the same as his does.

Here are some other more substantive style issues to think about:

=over 4

=item *

Just because you I<CAN> do something a particular way doesn't mean that
you I<SHOULD> do it that way.  Perl is designed to give you several
ways to do anything, so consider picking the most readable one.  For
instance

    open(FOO,$foo) || die "Can't open $foo: $!";

is better than

    die "Can't open $foo: $!" unless open(FOO,$foo);

because the second way hides the main point of the statement in a
modifier.  On the other hand

    print "Starting analysis\n" if $verbose;

is better than

    $verbose && print "Starting analysis\n";

because the main point isn't whether the user typed B<-v> or not.

Similarly, just because an operator lets you assume default arguments
doesn't mean that you have to make use of the defaults.  The defaults
are there for lazy systems programmers writing one-shot programs.  If
you want your program to be readable, consider supplying the argument.

Along the same lines, just because you I<CAN> omit parentheses in many
places doesn't mean that you ought to:

    return print reverse sort num values %array;
    return print(reverse(sort num (values(%array))));

When in doubt, parenthesize.  At the very least it will let some poor
schmuck bounce on the % key in B<vi>.

Even if you aren't in doubt, consider the mental welfare of the person
who has to maintain the code after you, and who will probably put
parentheses in the wrong place.

=item *

Don't go through silly contortions to exit a loop at the top or the
bottom, when Perl provides the C<last> operator so you can exit in
the middle.  Just "outdent" it a little to make it more visible:

    LINE:
	for (;;) {
	    statements;
	  last LINE if $foo;
	    next LINE if /^#/;
	    statements;
	}

=item *

Don't be afraid to use loop labels--they're there to enhance
readability as well as to allow multilevel loop breaks.  See the
previous example.

=item *

Avoid using C<grep()> (or C<map()>) or `backticks` in a void context, that is,
when you just throw away their return values.  Those functions all
have return values, so use them.  Otherwise use a C<foreach()> loop or
the C<system()> function instead.
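
A short sketch of the difference, using an invented list of file
names:

    my @files = ('a.txt', 'b.log', 'c.txt');

    # Discouraged: grep in void context, used only for a side effect
    #     grep { print "$_\n" } @files;

    # Preferred when no list is wanted: a foreach loop
    print "$_\n" foreach @files;

    # And grep where the returned list is actually used
    my @text = grep { /\.txt$/ } @files;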

=item *

For portability, when using features that may not be implemented on
every machine, test the construct in an eval to see if it fails.  If
you know in which version or patchlevel a particular feature was
implemented, you can test C<$]> (C<$PERL_VERSION> in C<English>) to see if it
will be there.  The C<Config> module will also let you interrogate values
determined by the B<Configure> program when Perl was installed.
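
The three approaches might be sketched like this; the variable names
are arbitrary, and C<symlink> and C<d_fork> are merely convenient
examples of a feature and a C<Config> key:

    use Config;

    # Probe for a feature by trying it inside an eval
    my $has_symlink = eval { symlink('', ''); 1 } ? 1 : 0;

    # Test the interpreter version numerically
    my $is_5_10_or_later = $] >= 5.010 ? 1 : 0;

    # Ask what Configure discovered when Perl was built
    my $has_fork = (defined $Config{d_fork}
                    && $Config{d_fork} eq 'define') ? 1 : 0;

    print "symlink: $has_symlink, 5.10+: $is_5_10_or_later, ",
          "fork: $has_fork\n";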

=item *

Choose mnemonic identifiers.  If you can't remember what mnemonic means,
you've got a problem.

=item *

While short identifiers like C<$gotit> are probably ok, use underscores to
separate words in longer identifiers.  It is generally easier to read
C<$var_names_like_this> than C<$VarNamesLikeThis>, especially for
non-native speakers of English. It's also a simple rule that works
consistently with C<VAR_NAMES_LIKE_THIS>.

Package names are sometimes an exception to this rule.  Perl informally
reserves lowercase module names for "pragma" modules like C<integer> and
C<strict>.  Other modules should begin with a capital letter and use mixed
case, but probably without underscores due to limitations in primitive
file systems' representations of module names as files that must fit into a
few sparse bytes.

=item *

You may find it helpful to use letter case to indicate the scope
or nature of a variable. For example:

    $ALL_CAPS_HERE   constants only (beware clashes with perl vars!)
    $Some_Caps_Here  package-wide global/static
    $no_caps_here    function scope my() or local() variables

Function and method names seem to work best as all lowercase.
E.g., C<$obj-E<gt>as_string()>.

You can use a leading underscore to indicate that a variable or
function should not be used outside the package that defined it.

=item *

If you have a really hairy regular expression, use the C</x>  or C</xx>
modifiers and put in some whitespace to make it look a little less like
line noise.
Don't use slash as a delimiter when your regexp has slashes or backslashes.
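
As an illustration (the pattern and the sample strings are invented for
this example), a date matcher reads more easily when spread out under
C</x>, and a path matcher when it uses braces as delimiters:

    my $date_re = qr{
        ^ (\d{4})      # year
        - (\d{2})      # month
        - (\d{2})      # day
        $
    }x;

    if ('2005-04-30' =~ $date_re) {
        print "year=$1 month=$2 day=$3\n";
    }

    my $path = '/usr/local/bin/perl';
    print "local install\n" if $path =~ m{^/usr/local/};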

=item *

Use the new C<and> and C<or> operators to avoid having to parenthesize
list operators so much, and to reduce the incidence of punctuation
operators like C<&&> and C<||>.  Call your subroutines as if they were
functions or list operators to avoid excessive ampersands and parentheses.

=item *

Use here documents instead of repeated C<print()> statements.
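
For example, a usage message (the program name is invented for this
illustration) fits naturally into one here document.  As with all
indented examples in this document, the terminator must start at the
left margin in real code:

    my $prog = 'frobnicate';

    print <<"END_USAGE";
    Usage: $prog [-v] file ...
        -v    be verbose
    END_USAGE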

=item *

Line up corresponding things vertically, especially if it'd be too long
to fit on one line anyway.

    $IDX = $ST_MTIME;
    $IDX = $ST_ATIME 	   if $opt_u;
    $IDX = $ST_CTIME 	   if $opt_c;
    $IDX = $ST_SIZE  	   if $opt_s;

    mkdir $tmpdir, 0700	or die "can't mkdir $tmpdir: $!";
    chdir($tmpdir)      or die "can't chdir $tmpdir: $!";
    mkdir 'tmp',   0777	or die "can't mkdir $tmpdir/tmp: $!";

=item *

Always check the return codes of system calls.  Good error messages should
go to C<STDERR>, include which program caused the problem, what the failed
system call and arguments were, and (VERY IMPORTANT) should contain the
standard system error message for what went wrong.  Here's a simple but
sufficient example:

    opendir(D, $dir)	 or die "can't opendir $dir: $!";

=item *

Line up your transliterations when it makes sense:

    tr [abc]
       [xyz];

=item *

Think about reusability.  Why waste brainpower on a one-shot when you
might want to do something like it again?  Consider generalizing your
code.  Consider writing a module or object class.  Consider making your
code run cleanly with C<use strict> and C<use warnings> (or B<-w>) in
effect.  Consider giving away your code.  Consider changing your whole
world view.  Consider... oh, never mind.

=item *

Try to document your code and use Pod formatting in a consistent way. Here
are commonly expected conventions:

=over 4

=item *

use C<CE<lt>E<gt>> for function, variable and module names (and more
generally anything that can be considered part of code, like filehandles
or specific values). Note that function names are considered more readable
with parentheses after their name, that is C<function()>.

=item *

use C<BE<lt>E<gt>> for command names like B<cat> or B<grep>.

=item *

use C<FE<lt>E<gt>> or C<CE<lt>E<gt>> for file names. C<FE<lt>E<gt>> should
be the only Pod code for file names, but as most Pod formatters render it
as italic, Unix and Windows paths with their slashes and backslashes may
be less readable, and better rendered with C<CE<lt>E<gt>>.

=back

=item *

Be consistent.

=item *

Be nice.

=back
perl5161delta.pod000064400000013776150344123420007555 0ustar00=encoding utf8

=head1 NAME

perl5161delta - what is new for perl v5.16.1

=head1 DESCRIPTION

This document describes differences between the 5.16.0 release and
the 5.16.1 release.

If you are upgrading from an earlier release such as 5.14.0, first read
L<perl5160delta>, which describes differences between 5.14.0 and
5.16.0.

=head1 Security

=head2 an off-by-two error in Scalar-List-Util has been fixed

The bugfix was in Scalar-List-Util 1.23_04, and perl 5.16.1 includes
Scalar-List-Util 1.25.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.16.0.  If any
exist, they are bugs, and we request that you submit a report.  See
L</Reporting Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Scalar::Util> and L<List::Util> have been upgraded from version 1.23 to
version 1.25.

=item *

L<B::Deparse> has been updated from version 1.14 to 1.14_01.  An
"uninitialized" warning emitted by B::Deparse has been squashed
[perl #113464].

=back

=head1 Configuration and Compilation

=over

=item *

Building perl with some Windows compilers used to fail due to a problem
with miniperl's C<glob> operator (which uses the C<perlglob> program)
deleting the PATH environment variable [perl #113798].

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item VMS

All C header files from the top-level directory of the distribution are now
installed on VMS, providing consistency with a long-standing practice on other
platforms. Previously only a subset were installed, which broke non-core extension
builds for extensions that depended on the missing include files.

=back

=head1 Selected Bug Fixes

=over 4

=item *

A regression introduced in Perl v5.16.0 involving
C<tr/I<SEARCHLIST>/I<REPLACEMENTLIST>/> has been fixed.  Only the first
instance is supposed to be meaningful if a character appears more than
once in C<I<SEARCHLIST>>.  Under some circumstances, the final instance
was overriding all earlier ones.  [perl #113584]

=item *

C<B::COP::stashlen> has been added.   This provides access to an internal
field added in perl 5.16 under threaded builds.  It was broken at the last
minute before 5.16 was released [perl #113034].

=item *

The L<re> pragma will no longer clobber C<$_>. [perl #113750]

=item *

Unicode 6.1 published an incorrect alias for one of the
Canonical_Combining_Class property's values (which range between 0 and
254).  The alias C<CCC133> should have been C<CCC132>.  Perl now
overrides the data file furnished by Unicode to give the correct value.

=item *

Duplicating scalar filehandles works again.  [perl #113764]

=item *

Under threaded perls, a runtime code block in a regular expression could
corrupt the package name stored in the op tree, resulting in bad reads
in C<caller>, and possibly crashes [perl #113060].

=item *

For efficiency's sake, many operators and built-in functions return the
same scalar each time.  Lvalue subroutines and subroutines in the CORE::
namespace were allowing this implementation detail to leak through.
C<print &CORE::uc("a"), &CORE::uc("b")> used to print "BB".  The same thing
would happen with an lvalue subroutine returning the return value of C<uc>.
Now the value is copied in such cases [perl #113044].

=item *

C<__SUB__> now works in special blocks (C<BEGIN>, C<END>, etc.).

=item *

Formats that reference lexical variables from outside no longer result
in crashes.

=back
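
The C<tr///> fix listed above can be illustrated with a minimal sketch.
The variable name and sample string are arbitrary; the point is only that
a character repeated in the search list is mapped according to its first
occurrence:

    use strict;
    use warnings;

    # 'a' appears twice in the search list.  Only the first pairing
    # (a => x) is meant to apply; the second (a => y) is ignored.
    (my $text = "banana") =~ tr/aa/xy/;
    print "$text\n";    # prints "bxnxnx"

Under the 5.16.0 regression the final pairing could win in some
circumstances, which would have produced C<y> rather than C<x> here.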

=head1 Known Problems

There are no new known problems, but consult L<perl5160delta/Known
Problems> to see those identified in the 5.16.0 release.

=head1 Acknowledgements

Perl 5.16.1 represents approximately 2 months of development since Perl
5.16.0 and contains approximately 14,000 lines of changes across 96
files from 8 authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers. The following people are known to
have contributed the improvements that became Perl 5.16.1:

Chris 'BinGOs' Williams, Craig A. Berry, Father Chrysostomos, Karl
Williamson, Paul Johnson, Reini Urban, Ricardo Signes, Tony Cook.

The list above is almost certainly incomplete as it is automatically
generated from version control history. In particular, it does not
include the names of the (very much appreciated) contributors who
reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN
modules included in Perl's core. We're grateful to the entire CPAN
community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors,
please see the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please
send it to perl5-security-report@perl.org. This points to a closed
subscription unarchived mailing list, which includes all the core
committers, who will be able to help assess the impact of issues, figure
out a resolution, and help co-ordinate the release of patches to
mitigate or fix the problem across all platforms on which Perl is
supported. Please only use this address for security issues in the Perl
core, not for modules independently distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlexperiment.pod000064400000016033150344123420010314 0ustar00=head1 NAME

perlexperiment - A listing of experimental features in Perl

=head1 DESCRIPTION

This document lists the current and past experimental features in the perl
core. Although all of these are documented with their appropriate topics,
this succinct listing gives you an overview and basic facts about their
status.

So far we've merely tried to find and list the experimental features and infer
their inception, versions, etc. There's a lot of speculation here.

=head2 Current experiments

=over 8

=item C<our> can now have an experimental optional attribute C<unique>

Introduced in Perl 5.8.0

Deprecated in Perl 5.10.0

The ticket for this feature is
L<[perl #119313]|https://rt.perl.org/rt3/Ticket/Display.html?id=119313>.

=item Smart match (C<~~>)

Introduced in Perl 5.10.0

Modified in Perl 5.10.1, 5.12.0

Using this feature triggers warnings in the category
C<experimental::smartmatch>.

The ticket for this feature is
L<[perl #119317]|https://rt.perl.org/rt3/Ticket/Display.html?id=119317>.

=item Pluggable keywords

The ticket for this feature is
L<[perl #119455]|https://rt.perl.org/rt3/Ticket/Display.html?id=119455>.

See L<perlapi/PL_keyword_plugin> for the mechanism.

Introduced in Perl 5.11.2

=item Regular Expression Set Operations

Introduced in Perl 5.18

The ticket for this feature is
L<[perl #119451]|https://rt.perl.org/rt3/Ticket/Display.html?id=119451>.

See also: L<perlrecharclass/Extended Bracketed Character Classes>

Using this feature triggers warnings in the category
C<experimental::regex_sets>.

=item Subroutine signatures

Introduced in Perl 5.20.0

Using this feature triggers warnings in the category
C<experimental::signatures>.

The ticket for this feature is
L<[perl #121481]|https://rt.perl.org/Ticket/Display.html?id=121481>.

=item Aliasing via reference

Introduced in Perl 5.22.0

Using this feature triggers warnings in the category
C<experimental::refaliasing>.

The ticket for this feature is
L<[perl #122947]|https://rt.perl.org/rt3/Ticket/Display.html?id=122947>.

See also: L<perlref/Assigning to References>

=item The "const" attribute

Introduced in Perl 5.22.0

Using this feature triggers warnings in the category
C<experimental::const_attr>.

The ticket for this feature is
L<[perl #123630]|https://rt.perl.org/rt3/Ticket/Display.html?id=123630>.

See also: L<perlsub/Constant Functions>

=item use re 'strict';

Introduced in Perl 5.22.0

Using this feature triggers warnings in the category
C<experimental::re_strict>.

See L<re/'strict' mode>

=item String- and number-specific bitwise operators

Introduced in Perl 5.22.0

See also: L<perlop/Bitwise String Operators>

Using this feature triggers warnings in the category
C<experimental::bitwise>.

The ticket for this feature is
L<[perl #123707]|https://rt.perl.org/rt3/Ticket/Display.html?id=123707>.

=item The <:win32> IO pseudolayer

The ticket for this feature is
L<[perl #119453]|https://rt.perl.org/rt3/Ticket/Display.html?id=119453>.

See also L<perlrun>

=item Declaring a reference to a variable

Introduced in Perl 5.26.0

Using this feature triggers warnings in the category
C<experimental::declared_refs>.

The ticket for this feature is
L<[perl #128654]|https://rt.perl.org/rt3/Ticket/Display.html?id=128654>.

See also: L<perlref/Declaring a Reference to a Variable>

=item There is an C<installhtml> target in the Makefile.

The ticket for this feature is
L<[perl #116487]|https://rt.perl.org/rt3/Ticket/Display.html?id=116487>.

=item Unicode in Perl on EBCDIC

=back
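
As a sketch of how the warning categories listed above are typically
used, the following enables subroutine signatures and silences the
matching C<experimental::signatures> warning.  It assumes Perl 5.20.0 or
later, and the function name C<add()> is made up for the example:

    use strict;
    use warnings;
    use feature 'signatures';
    no warnings 'experimental::signatures';

    # add() is a hypothetical function; $x and $y are filled from @_.
    sub add ($x, $y) {
        return $x + $y;
    }

    print add(2, 3), "\n";    # prints 5

The C<no warnings> line has to come after C<use warnings>, otherwise the
category is switched back on.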

=head2 Accepted features

These features were so wildly successful and played so well with others that
we decided to remove their experimental status and admit them as full, stable
features in the world of Perl, lavishing all the benefits and luxuries thereof.
They are also awarded +5 Stability and +3 Charisma.

=over 8

=item 64-bit support

Introduced in Perl 5.005

=item die accepts a reference

Introduced in Perl 5.005

=item DB module

Introduced in Perl 5.6.0

See also L<perldebug>, L<perldebtut>

=item Weak references

Introduced in Perl 5.6.0

=item Internal file glob

Introduced in Perl 5.6.0

=item fork() emulation

Introduced in Perl 5.6.1

See also L<perlfork>

=item -Dusemultiplicity -Duseithreads

Introduced in Perl 5.6.0

Accepted in Perl 5.8.0

=item Support for long doubles

Introduced in Perl 5.6.0

Accepted in Perl 5.8.1

=item The C<\N> regex character class

The C<\N> character class, not to be confused with the named character
sequence C<\N{NAME}>, denotes any non-newline character in a regular
expression.

Introduced in Perl 5.12

Exact version of acceptance unclear, but no later than Perl 5.18.

=item C<(?{code})> and C<(??{ code })>

Introduced in Perl 5.6.0

Accepted in Perl 5.20.0

See also L<perlre>

=item Linux abstract Unix domain sockets

Introduced in Perl 5.9.2

Accepted before Perl 5.20.0.  The Socket library is now primarily maintained
on CPAN, rather than in the perl core.

See also L<Socket>

=item Lvalue subroutines

Introduced in Perl 5.6.0

Accepted in Perl 5.20.0

See also L<perlsub>

=item Backtracking control verbs

C<(*ACCEPT)>

Introduced in Perl 5.10

Accepted in Perl 5.20.0

=item The <:pop> IO pseudolayer

See also L<perlrun>

Accepted in Perl 5.20.0

=item C<\s> in regexp matches vertical tab

Accepted in Perl 5.22.0

=item Postfix dereference syntax

Introduced in Perl 5.20.0

Accepted in Perl 5.24.0

=item Lexical subroutines

Introduced in Perl 5.18.0

Accepted in Perl 5.26.0

=back
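
For example, postfix dereference syntax, one of the accepted features
above, can be sketched as follows.  This assumes Perl 5.24.0 or later,
where the syntax is available without enabling a feature; the variable
names are arbitrary:

    use strict;
    use warnings;

    my $aref = [ 1, 2, 3 ];
    my $href = { one => 1 };

    my @copy = $aref->@*;         # same as @{$aref}
    my $last = $aref->$#*;        # same as $#{$aref}
    my @keys = keys $href->%*;    # same as keys %{$href}

    print scalar(@copy), " $last @keys\n";    # prints "3 2 one"
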

=head2 Removed features

These features are no longer considered experimental and their functionality
has disappeared. It's your own fault if you wrote production programs using
these features after we explicitly told you not to (see L<perlpolicy>).

=over 8

=item 5.005-style threading

Introduced in Perl 5.005

Removed in Perl 5.10

=item perlcc

Introduced in Perl 5.005

Moved from Perl 5.9.0 to CPAN

=item The pseudo-hash data type

Introduced in Perl 5.6.0

Removed in Perl 5.9.0

=item Getopt::Long Options can now take multiple values at once (experimental)

C<Getopt::Long> upgraded to version 2.35

Removed in Perl 5.8.8

=item Assertions

The C<-A> command line switch

Introduced in Perl 5.9.0

Removed in Perl 5.9.5

=item Test::Harness::Straps

Moved from Perl 5.10.1 to CPAN

=item C<legacy>

The experimental C<legacy> pragma was swallowed by the C<feature> pragma.

Introduced in Perl 5.11.2

Removed in Perl 5.11.3

=item Lexical C<$_>

Using this feature triggered warnings in the category
C<experimental::lexical_topic>.

Introduced in Perl 5.10.0

Removed in Perl 5.24.0

=item Array and hash container functions accept references

Using this feature triggered warnings in the category
C<experimental::autoderef>.

Superseded by L</Postfix dereference syntax>.

Introduced in Perl 5.14.0

Removed in Perl 5.24.0

=back

=head1 SEE ALSO

For a complete list of features check L<feature>.

=head1 AUTHORS

brian d foy C<< <brian.d.foy@gmail.com> >>

SE<eacute>bastien Aperghis-Tramoni C<< <saper@cpan.org> >>

=head1 COPYRIGHT

Copyright 2010, brian d foy C<< <brian.d.foy@gmail.com> >>

=head1 LICENSE

You can use and redistribute this document under the same terms as Perl
itself.

=cut
perldos.pod000064400000024432150344123420006723 0ustar00If you read this file _as_is_, just ignore the funny characters you
see. It is written in the POD format (see perlpod manpage) which is
specially designed to be readable as is.

=head1 NAME

perldos - Perl under DOS, W31, W95.

=head1 SYNOPSIS

These are instructions for building Perl under DOS (or w??), using
DJGPP v2.03 or later.  Under w95 long filenames are supported.

=head1 DESCRIPTION

Before you start, you should glance through the README file
found in the top-level directory where the Perl distribution
was extracted.  Make sure you read and understand the terms under
which this software is being distributed.

This port currently supports MakeMaker (the set of modules that
is used to build extensions to perl).  Therefore, you should be
able to build and install most extensions found in the CPAN sites.

Detailed instructions on how to build and install perl extension
modules, including XS-type modules, are included.  See
L</BUILDING AND INSTALLING MODULES ON DOS>.

=head2 Prerequisites for Compiling Perl on DOS

=over 4

=item DJGPP

DJGPP is a port of GNU C/C++ compiler and development tools to 32-bit,
protected-mode environment on Intel 32-bit CPUs running MS-DOS and compatible
operating systems, by DJ Delorie <dj@delorie.com> and friends.

For more details (FAQ), check out the home of DJGPP at:

        http://www.delorie.com/djgpp/

If you have questions about DJGPP, try posting to the DJGPP newsgroup:
comp.os.msdos.djgpp, or use the email gateway djgpp@delorie.com.

You can find the full DJGPP distribution on any of the mirrors listed here:

        http://www.delorie.com/djgpp/getting.html

You need the following files to build perl (or add new modules):

        v2/djdev203.zip
        v2gnu/bnu2112b.zip
        v2gnu/gcc2953b.zip
        v2gnu/bsh204b.zip
        v2gnu/mak3791b.zip
        v2gnu/fil40b.zip
        v2gnu/sed3028b.zip
        v2gnu/txt20b.zip
        v2gnu/dif272b.zip
        v2gnu/grep24b.zip
        v2gnu/shl20jb.zip
        v2gnu/gwk306b.zip
        v2misc/csdpmi5b.zip

or possibly any newer version.

=item Pthreads

Thread support is not tested in this version of the djgpp perl.

=back

=head2 Shortcomings of Perl under DOS

Perl under DOS lacks some features of perl under UNIX because of
deficiencies in the UNIX-emulation, most notably:

=over 4

=item *

fork() and pipe()

=item *

some features of the UNIX filesystem regarding link count and file dates

=item *

in-place operation is a little bit broken with short filenames

=item *

sockets

=back

=head2 Building Perl on DOS

=over 4

=item *

Unpack the source package F<perl5.8*.tar.gz> with djtarx. If you want
to use long file names under w95 and also to get Perl to pass all its
tests, don't forget to use

        set LFN=y
        set FNCASE=y

before unpacking the archive.

=item *

Create a "symlink" or copy your bash.exe to sh.exe in your C<($DJDIR)/bin>
directory.

        ln -s bash.exe sh.exe

[If you have the recommended version of bash for DJGPP, this is already
done for you.]

And make the C<SHELL> environment variable point to this F<sh.exe>:

        set SHELL=c:/djgpp/bin/sh.exe (use full path name!)

You can do this in F<djgpp.env> too. Add this line BEFORE any section
definition:

        +SHELL=%DJDIR%/bin/sh.exe

=item *

If you have F<split.exe> and F<gsplit.exe> in your path, then rename 
F<split.exe> to F<djsplit.exe>, and F<gsplit.exe> to F<split.exe>.
Copy or link F<gecho.exe> to F<echo.exe> if you don't have F<echo.exe>.
Copy or link F<gawk.exe> to F<awk.exe> if you don't have F<awk.exe>.

[If you have the recommended versions of djdev, shell utilities and
gawk, all these are already done for you, and you will not need to do
anything.]

=item *

Chdir to the djgpp subdirectory of perl toplevel and type the following
commands:

        set FNCASE=y
        configure.bat

This will do some preprocessing then run the Configure script for you.
The Configure script is interactive, but in most cases you just need to
press ENTER.  The "set" command ensures that DJGPP preserves the letter
case of file names when reading directories.  If you already issued this
set command when unpacking the archive, and you are in the same DOS
session as when you unpacked the archive, you don't have to issue the
set command again.  This command is necessary *before* you start to 
(re)configure or (re)build perl in order to ensure both that perl builds 
correctly and that building XS-type modules can succeed.  See the DJGPP 
info entry for "_preserve_fncase" for more information:

        info libc alphabetical _preserve_fncase

If the script says that your package is incomplete, and asks whether
to continue, just answer with Y (this can only happen if you don't use
long filenames or forget to issue "set FNCASE=y" first).

When Configure asks about the extensions, I suggest IO and Fcntl,
and if you want database handling then SDBM_File or GDBM_File
(you need to install gdbm for this one). If you want to use the
POSIX extension (this is the default), make sure that the stack
size of your F<cc1.exe> is at least 512kbyte (you can check this
with: C<stubedit cc1.exe>).

You can use the Configure script in non-interactive mode too.
When I built my F<perl.exe>, I used something like this:

        configure.bat -des

You can find more info about Configure's command line switches in
the F<INSTALL> file.

When the script ends, and you want to change some values in the
generated F<config.sh> file, then run

        sh Configure -S

after you made your modifications.

IMPORTANT: if you use this C<-S> switch, be sure to delete the CONFIG
environment variable before running the script:

        set CONFIG=

=item *

Now you can compile Perl. Type:

        make

=back

=head2 Testing Perl on DOS

Type:

        make test

If you're lucky you should see "All tests successful". But there can be
a few failed subtests (fewer than 5, hopefully) depending on some external
conditions (e.g. some subtests fail under linux/dosemu or plain DOS
with short filenames only).

=head2 Installation of Perl on DOS

Type:

        make install

This will copy the newly compiled perl and libraries into your DJGPP
directory structure. Perl.exe and the utilities go into C<($DJDIR)/bin>,
and the library goes under C<($DJDIR)/lib/perl5>. The pod documentation
goes under C<($DJDIR)/lib/perl5/pod>.

=head1 BUILDING AND INSTALLING MODULES ON DOS

=head2 Building Prerequisites for Perl on DOS

For building and installing non-XS modules, all you need is a working
perl under DJGPP.  Non-XS modules do not require re-linking the perl
binary, and so are simpler to build and install.

XS-type modules do require re-linking the perl binary, because part of
an XS module is written in "C", and has to be linked together with the
perl binary to be executed.  This is required because perl under DJGPP
is built with the "static link" option, due to the lack of "dynamic
linking" in the DJGPP environment.

Because XS modules require re-linking of the perl binary, you need both
the perl binary distribution and the perl source distribution to build
an XS extension module.  In addition, you will have to have built your
perl binary from the source distribution so that all of the components
of the perl binary are available for the required link step.

=head2 Unpacking CPAN Modules on DOS

First, download the module package from CPAN (e.g., the "Comma Separated
Value" text package, Text-CSV-0.01.tar.gz).  Then expand the contents of
the package into some location on your disk.  Most CPAN modules are
built with an internal directory structure, so it is usually safe to
expand it in the root of your DJGPP installation.  Some people prefer to
locate source trees under /usr/src (i.e., C<($DJDIR)/usr/src>), but you may
put it wherever seems most logical to you, *EXCEPT* under the same
directory as your perl source code.  There are special rules that apply
to modules which live in the perl source tree that do not apply to most
of the modules in CPAN.

Unlike other DJGPP packages, which are normal "zip" files, most CPAN
module packages are "gzipped tarballs".  Recent versions of WinZip will
safely unpack and expand them, *UNLESS* they have zero-length files.  It
is a known WinZip bug (as of v7.0) that it will not extract zero-length
files.

From the command line, you can use the djtar utility provided with DJGPP
to unpack and expand these files.  For example:

        C:\djgpp>djtarx -v Text-CSV-0.01.tar.gz

This will create the new directory C<($DJDIR)/Text-CSV-0.01>, filling
it with the source for this module.

=head2 Building Non-XS Modules on DOS

To build a non-XS module, you can use the standard module-building
instructions distributed with perl modules.

    perl Makefile.PL
    make
    make test
    make install

This is sufficient because non-XS modules install only ".pm" files and
(sometimes) pod and/or man documentation.  No re-linking of the perl
binary is needed to build, install or use non-XS modules.
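
For orientation, the F<Makefile.PL> that the first command runs can be as
small as the following sketch for a pure-Perl module.  C<My::Module> and
the version number are placeholders, not a real CPAN distribution:

    use strict;
    use warnings;
    use ExtUtils::MakeMaker;

    # Writes the Makefile that the make commands above then drive.
    # 'My::Module' and '0.01' are hypothetical values.
    WriteMakefile(
        NAME    => 'My::Module',
        VERSION => '0.01',
    );

Real distributions usually set further keys (prerequisites, author,
abstract); see the L<ExtUtils::MakeMaker> documentation for those.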

=head2 Building XS Modules on DOS

To build an XS module, you must use the standard module-building
instructions distributed with perl modules *PLUS* three extra
instructions specific to the DJGPP "static link" build environment.

    set FNCASE=y
    perl Makefile.PL
    make
    make perl
    make test
    make -f Makefile.aperl inst_perl MAP_TARGET=perl.exe
    make install

The first extra instruction sets DJGPP's FNCASE environment variable so
that the new perl binary which you must build for an XS-type module will
build correctly.  The second extra instruction re-builds the perl binary
in your module directory before you run "make test", so that you are
testing with the new module code you built with "make".  The third extra
instruction installs the perl binary from your module directory into the
standard DJGPP binary directory, C<($DJDIR)/bin>, replacing your
previous perl binary.

Note that the MAP_TARGET value *must* have the ".exe" extension or you
will not create a "perl.exe" to replace the one in C<($DJDIR)/bin>.

When you are done, the XS-module install process will have added information
to your "perllocal" information telling that the perl binary has been replaced,
and what module was installed.  You can view this information at any time
by using the command:

        perl -S perldoc perllocal

=head1 AUTHOR

Laszlo Molnar, F<laszlo.molnar@eth.ericsson.se> [Installing/building perl]

Peter J. Farley III F<pjfarley@banet.net> [Building/installing modules]

=head1 SEE ALSO

perl(1).

=cut

perl586delta.pod000064400000011053150344123420007465 0ustar00=head1 NAME

perl586delta - what is new for perl v5.8.6

=head1 DESCRIPTION

This document describes differences between the 5.8.5 release and
the 5.8.6 release.

=head1 Incompatible Changes

There are no changes incompatible with 5.8.5.

=head1 Core Enhancements

The perl interpreter is now more tolerant of UTF-16-encoded scripts.

On Win32, Perl can now use non-IFS compatible LSPs, which allows Perl to
work in conjunction with firewalls such as McAfee Guardian. For full details
see the file F<README.win32>, particularly if you're running Win95.

=head1 Modules and Pragmata

=over 4

=item *

With the C<base> pragma, an intermediate class with no fields used to mess
up private fields in the base class. This has been fixed.

=item *

Cwd upgraded to version 3.01 (as part of the new PathTools distribution)

=item *

Devel::PPPort upgraded to version 3.03

=item *

File::Spec upgraded to version 3.01 (as part of the new PathTools distribution)

=item *

Encode upgraded to version 2.08

=item *

ExtUtils::MakeMaker remains at version 6.17, as later stable releases currently
available on CPAN have some issues with core modules on some core platforms.

=item *

I18N::LangTags upgraded to version 0.35

=item *

Math::BigInt upgraded to version 1.73

=item *

Math::BigRat upgraded to version 0.13

=item *

MIME::Base64 upgraded to version 3.05

=item *

POSIX::sigprocmask function can now retrieve the current signal mask without
also setting it.

=item *

Time::HiRes upgraded to version 1.65

=back

=head1 Utility Changes

Perl has a new -dt command-line flag, which enables threads support in the
debugger.

=head1 Performance Enhancements

C<reverse sort ...> is now optimized to sort in reverse, avoiding the
generation of a temporary intermediate list.

C<for (reverse @foo)> now iterates in reverse, avoiding the generation of a
temporary reversed list.
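
The two idioms affected look like this in practice.  This is only a
sketch with an arbitrary word list; the optimizations change how the
result is computed, not what it is:

    use strict;
    use warnings;

    my @words = qw(pear apple fig);

    # Sorted straight into descending order.
    my @descending = reverse sort @words;
    print "@descending\n";    # prints "pear fig apple"

    # Walks @words from the last element to the first.
    for my $word (reverse @words) {
        print "$word\n";      # prints fig, apple, pear in turn
    }
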

=head1 Selected Bug Fixes

The regexp engine is now more robust when given invalid utf8 input, as is
sometimes generated by buggy XS modules.

C<foreach> on threads::shared array used to be able to crash Perl. This bug
has now been fixed.

A regexp in C<STDOUT>'s destructor used to coredump, because the regexp pad
was already freed. This has been fixed.

C<goto &> is now more robust - bugs in deep recursion and chained C<goto &>
have been fixed.

Using C<delete> on an array no longer leaks memory. A C<pop> of an item from a
shared array reference no longer causes a leak.

C<eval_sv()> failing a taint test could corrupt the stack - this has been
fixed.

On platforms with 64 bit pointers numeric comparison operators used to
erroneously compare the addresses of references that are overloaded, rather
than using the overloaded values. This has been fixed.

C<read> into a UTF8-encoded buffer with an offset off the end of the buffer
no longer mis-calculates buffer lengths.

Although Perl has promised since version 5.8 that C<sort()> would be
stable, the two cases C<sort {$b cmp $a}> and C<< sort {$b <=> $a} >> could
produce non-stable sorts.   This is corrected in perl5.8.6.
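
A stable sort keeps elements that compare equal in their original
relative order.  The following sketch, with made-up records, uses one of
the reversed comparisons mentioned above:

    use strict;
    use warnings;

    # Each record is [ name, score ]; the data is invented for the example.
    my @records = ( [ a => 2 ], [ b => 1 ], [ c => 2 ], [ d => 1 ] );

    # Descending by score; ties should stay in input order.
    my @sorted = sort { $b->[1] <=> $a->[1] } @records;

    print join(' ', map { $_->[0] } @sorted), "\n";  # "a c b d" when stable
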

Localising C<$^D> no longer generates a diagnostic message about valid -D
flags.

=head1 New or Changed Diagnostics

For -t and -T,

   Too late for "-T" option

has been changed to the more informative

   "-T" is on the #! line, it must also be used on the command line

=head1 Changed Internals

From now on all applications embedding perl will behave as if perl
were compiled with -DPERL_USE_SAFE_PUTENV.  See "Environment access" in
the F<INSTALL> file for details.

Most C<C> source files now have comments at the top explaining their purpose,
which should help anyone wishing to get an overview of the implementation.

=head1 New Tests

There are significantly more tests for the C<B> suite of modules.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org.  There may also be
information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perl5101delta.pod000064400000125560150344123420007542 0ustar00=head1 NAME

perl5101delta - what is new for perl v5.10.1

=head1 DESCRIPTION

This document describes differences between the 5.10.0 release and
the 5.10.1 release.

If you are upgrading from an earlier release such as 5.8.8, first read
L<perl5100delta>, which describes differences between 5.8.8 and
5.10.0.

=head1 Incompatible Changes

=head2 Switch statement changes

The handling of complex expressions by the C<given>/C<when> switch
statement has been enhanced. There are two new cases where C<when> now
interprets its argument as a boolean, instead of an expression to be used
in a smart match:

=over 4

=item flip-flop operators

The C<..> and C<...> flip-flop operators are now evaluated in boolean
context, following their usual semantics; see L<perlop/"Range Operators">.

Note that, as in perl 5.10.0, C<when (1..10)> will not work to test
whether a given value is an integer between 1 and 10; you should use
C<when ([1..10])> instead (note the array reference).

Unlike in 5.10.0, however, evaluating the flip-flop operators in boolean
context makes them useful in a C<when()>, notably for implementing
bistable conditions, as in:

    when (/^=begin/ .. /^=end/) {
      # do something
    }

=item defined-or operator

A compound expression involving the defined-or operator, as in
C<when (expr1 // expr2)>, will be treated as boolean if the first
expression is boolean. (This just extends the existing rule that applies
to the regular or operator, as in C<when (expr1 || expr2)>.)

=back

The next section details further changes to the semantics of
the smart match operator, which naturally also modify the behaviour
of the switch statements where smart matching is implicitly used.

=head2 Smart match changes

=head3 Changes to type-based dispatch

The smart match operator C<~~> is no longer commutative. The behaviour of
a smart match now depends primarily on the type of its right hand
argument. Moreover, its semantics have been adjusted for greater
consistency or usefulness in several cases. While the general backwards
compatibility is maintained, several changes must be noted:

=over 4

=item *

Code references with an empty prototype are no longer treated specially.
They are passed an argument like the other code references (even if they
choose to ignore it).

=item *

C<%hash ~~ sub {}> and C<@array ~~ sub {}> now test that the subroutine
returns a true value for each key of the hash (or element of the
array), instead of passing the whole hash or array as a reference to
the subroutine.

=item *

Because C<~~> is no longer commutative, code references are no longer
treated specially when appearing on the left of the C<~~> operator,
but like any ordinary scalar.

=item *

C<undef ~~ %hash> is always false (since C<undef> can't be a key in a
hash). No implicit conversion to C<""> is done (as was the case in perl
5.10.0).

=item *

C<$scalar ~~ @array> now always distributes the smart match across the
elements of the array. It is true if at least one element of C<@array>
satisfies C<$scalar ~~ $element>. This is a generalization of the old
behaviour that tested whether the array contained the scalar.

=back

The full dispatch table for the smart match operator is given in
L<perlsyn/"Smart matching in detail">.
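
A few of the rules above can be sketched as follows.  The data is
invented for the example, and the blanket C<no warnings> is an assumption
made only to keep the sketch quiet on perls that warn about C<~~>:

    use strict;
    no warnings;    # blanket suppression, only to keep the sketch quiet

    my @primes = (2, 3, 5, 7);
    my %stock  = (apple => 3, pear => 1);
    my $missing;    # deliberately left undefined

    # Scalar against array: true if any element smart-matches the scalar.
    print "5 is listed\n" if 5 ~~ @primes;

    # Array or hash against a coderef: the sub is called once per element
    # (or per key) and must return true every time.
    print "all positive\n"   if @primes ~~ sub { $_[0] > 0 };
    print "all keys short\n" if %stock ~~ sub { length $_[0] <= 5 };

    # undef is never a hash key, so this is always false.
    print "never printed\n" if $missing ~~ %stock;
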

=head3 Smart match and overloading

According to the rule of dispatch based on the rightmost argument type,
when an object overloading C<~~> appears on the right side of the
operator, the overload routine will always be called (with a 3rd argument
set to a true value; see L<overload>).  However, when the object appears
on the left, the overload routine will be called only when the
rightmost argument is a simple scalar.  This way, distributivity of smart
match across arrays is not broken, nor are the other behaviours with complex
types (coderefs, hashes, regexes).  Thus, writers of overloading routines
for smart match mostly need to worry only about comparing against a scalar,
and possibly about stringification overloading; the other common cases
will be automatically handled consistently.

C<~~> will now refuse to work on objects that do not overload it (in order
to avoid relying on the object's underlying structure). (However, if the
object overloads the stringification or the numification operators, and
if overload fallback is active, it will be used instead, as usual.)
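
A minimal sketch of both behaviours follows. The class names
C<EvenMatcher> and C<PlainObject> are hypothetical, and C<no warnings> is
again assumed only to silence the notices that later perls attach to
C<~~>.

    use strict;
    no warnings;

    {
        package EvenMatcher;    # hypothetical class overloading ~~
        use overload '~~' => sub {
            my ($self, $other, $swapped) = @_;
            return $other % 2 == 0;
        };
        sub new { return bless {}, shift }
    }

    # Object on the right: the overload routine is called.
    my $matcher      = EvenMatcher->new;
    my $four_matches = (4 ~~ $matcher) ? 1 : 0;

    # An object with no ~~ overload is rejected at run time.
    my $plain = bless {}, 'PlainObject';
    my $died  = eval { my $r = (4 ~~ $plain); 1 } ? 0 : 1;

    print "$four_matches $died\n";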

=head2 Other incompatible changes

=over 4

=item *

The semantics of C<use feature :5.10*> have changed slightly.
See L</"Modules and Pragmata"> for more information.

=item *

It is now a run-time error to use the smart match operator C<~~>
with an object that has no overload defined for it. (This way
C<~~> will not break encapsulation by matching against the
object's internal representation as a reference.)

=item *

The version control system used for the development of the perl
interpreter has been switched from Perforce to git.  This is mainly an
internal issue that only affects people actively working on the perl core;
but it may have minor external visibility, for example in some of the details
of the output of C<perl -V>. See L<perlrepository> for more information.

=item *

The internal structure of the C<ext/> directory in the perl source has
been reorganised. In general, a module C<Foo::Bar> whose source was
stored under F<ext/Foo/Bar/> is now located under F<ext/Foo-Bar/>. Also,
some modules have been moved from F<lib/> to F<ext/>. This is purely a
source tarball change, and should make no difference to the compilation or
installation of perl, unless you have a very customised build process that
explicitly relies on this structure, or which hard-codes the C<nonxs_ext>
F<Configure> parameter. Specifically, this change does not by default
alter the location of any files in the final installation.

=item *

As part of the C<Test::Harness> 2.x to 3.x upgrade, the experimental
C<Test::Harness::Straps> module has been removed.
See L</"Updated Modules"> for more details.

=item *

As part of the C<ExtUtils::MakeMaker> upgrade, the
C<ExtUtils::MakeMaker::bytes> and C<ExtUtils::MakeMaker::vmsish> modules
have been removed from this distribution.

=item *

C<Module::CoreList> no longer contains the C<%Module::CoreList::patchlevel>
hash.

=item *

This one is actually a change introduced in 5.10.0, but it was omitted
from that release's perldelta, so it is mentioned here instead.

A bugfix related to the handling of the C</m> modifier and C<qr> resulted
in a change of behaviour between 5.8.x and 5.10.0:

    # matches in 5.8.x, doesn't match in 5.10.0
    $re = qr/^bar/; "foo\nbar" =~ /$re/m;

=back
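
For the C</m> and C<qr> change described in the last item, one way to get
a multi-line match is to put the modifier on the C<qr> itself. The
following is a small sketch with illustrative variable names:

    use strict;
    use warnings;

    my $text = "foo\nbar";

    my $plain = qr/^bar/;     # compiled without /m
    my $multi = qr/^bar/m;    # /m is part of the compiled pattern

    # In 5.10.0 and later the outer /m does not reach inside $plain.
    my $outer_m = ($text =~ /$plain/m) ? 1 : 0;
    my $inner_m = ($text =~ /$multi/)  ? 1 : 0;

    print "outer=$outer_m inner=$inner_m\n";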

=head1 Core Enhancements

=head2 Unicode Character Database 5.1.0

The copy of the Unicode Character Database included in Perl 5.10.1 has
been updated to 5.1.0 from 5.0.0. See
L<http://www.unicode.org/versions/Unicode5.1.0/#Notable_Changes> for the
notable changes.

=head2 A proper interface for pluggable Method Resolution Orders

As of Perl 5.10.1 there is a new interface for plugging and using method
resolution orders other than the default (linear depth first search).
The C3 method resolution order added in 5.10.0 has been re-implemented as
a plugin, without changing its Perl-space interface. See L<perlmroapi> for
more information.
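
The Perl-space interface is the C<mro> pragma together with functions such
as C<mro::get_linear_isa()>. The sketch below uses a hypothetical diamond
of classes to compare the C3 order with the default depth-first order; it
does not touch the new plugin API described in L<perlmroapi>.

    use strict;
    use warnings;

    { package Base;    sub hello { 'base' } }
    { package Left;    our @ISA = ('Base'); }
    { package Right;   our @ISA = ('Base'); sub hello { 'right' } }
    { package Diamond; use mro 'c3'; our @ISA = ('Left', 'Right'); }

    my @c3  = @{ mro::get_linear_isa('Diamond') };
    my @dfs = @{ mro::get_linear_isa('Diamond', 'dfs') };

    # Method lookup on Diamond follows the C3 order.
    my $greeting = Diamond->hello;

    print "c3:  @c3\ndfs: @dfs\nhello: $greeting\n";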

=head2 The C<overloading> pragma

This pragma allows you to lexically disable or enable overloading
for some or all operations. (Yuval Kogman)
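
A short sketch of the lexical scoping, using a hypothetical
C<Temperature> class that overloads stringification:

    use strict;
    use warnings;

    {
        package Temperature;    # hypothetical class
        use overload '""' => sub { $_[0]{degrees} . ' C' };
        sub new {
            my ($class, $degrees) = @_;
            return bless { degrees => $degrees }, $class;
        }
    }

    my $t = Temperature->new(21);

    my $overloaded = "$t";                    # uses the overload
    my $raw = do { no overloading; "$t" };    # plain reference string

    print "$overloaded\n$raw\n";

Outside the C<do> block overloading is active again; C<no overloading>
can also be given a list of operator keys to disable only those
operations.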

=head2 Parallel tests

The core distribution can now run its regression tests in parallel on
Unix-like platforms. Instead of running C<make test>, set C<TEST_JOBS> in
your environment to the number of tests to run in parallel, and run
C<make test_harness>. On a Bourne-like shell, this can be done as

    TEST_JOBS=3 make test_harness  # Run 3 tests in parallel

An environment variable is used, rather than parallel make itself, because
L<TAP::Harness> needs to be able to schedule individual non-conflicting test
scripts itself, and there is no standard interface to C<make> utilities to
interact with their job schedulers.

Note that currently some test scripts may fail when run in parallel (most
notably C<ext/IO/t/io_dir.t>). If necessary run just the failing scripts
again sequentially and see if the failures go away.

=head2 DTrace support

Some support for DTrace has been added. See "DTrace support" in F<INSTALL>.

=head2 Support for C<configure_requires> in CPAN module metadata

Both C<CPAN> and C<CPANPLUS> now support the C<configure_requires> keyword
in the C<META.yml> metadata file included in most recent CPAN distributions.
This allows distribution authors to specify configuration prerequisites that
must be installed before running F<Makefile.PL> or F<Build.PL>.

See the documentation for C<ExtUtils::MakeMaker> or C<Module::Build> for more
on how to specify C<configure_requires> when creating a distribution for CPAN.
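
With C<ExtUtils::MakeMaker>, a distribution might declare such a
prerequisite roughly as follows. This is a fragment of a hypothetical
F<Makefile.PL>: the distribution name and version numbers are
placeholders, and it assumes a MakeMaker recent enough to understand the
C<CONFIGURE_REQUIRES> key.

    use ExtUtils::MakeMaker;

    WriteMakefile(
        NAME               => 'My::Module',    # placeholder name
        VERSION            => '0.01',
        CONFIGURE_REQUIRES => {
            # needed before Makefile.PL itself can run
            'ExtUtils::MakeMaker' => '6.52',
        },
    );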

=head1 Modules and Pragmata

=head2 New Modules and Pragmata

=over 4

=item C<autodie>

This is a new lexically-scoped alternative for the C<Fatal> module.
The bundled version is 2.06_01. Note that in this release, using a string
eval when C<autodie> is in effect can cause the autodie behaviour to leak
into the surrounding scope. See L<autodie/"BUGS"> for more details.

=item C<Compress::Raw::Bzip2>

This has been added to the core (version 2.020).

=item C<parent>

This pragma establishes an ISA relationship with base classes at compile
time. It provides the key feature of C<base> without the feature creep.

=item C<Parse::CPAN::Meta>

This has been added to the core (version 1.39).

=back
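
The following sketch shows C<parent> and C<autodie> in use. The class
names and the file path are made up for the example; C<-norequire> is
used because the base class lives in the same file.

    use strict;
    use warnings;

    { package Animal; sub new { return bless {}, shift } }
    { package Dog;    use parent -norequire, 'Animal'; }

    my $is_animal = Dog->new->isa('Animal') ? 1 : 0;

    # autodie is lexical: only the open() inside this block throws.
    my $threw = do {
        use autodie;
        eval { open(my $in, '<', 'no-such-dir/no-such-file'); 1 } ? 0 : 1;
    };

    # Outside that scope a failed open() just returns false.
    my $quiet = open(my $fh, '<', 'no-such-dir/no-such-file') ? 0 : 1;

    print "$is_animal $threw $quiet\n";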

=head2 Pragmata Changes

=over 4

=item C<attributes>

Upgraded from version 0.08 to 0.09.

=item C<attrs>

Upgraded from version 1.02 to 1.03.

=item C<base>

Upgraded from version 2.13 to 2.14. See L<parent> for a replacement.

=item C<bigint>

Upgraded from version 0.22 to 0.23.

=item C<bignum>

Upgraded from version 0.22 to 0.23.

=item C<bigrat>

Upgraded from version 0.22 to 0.23.

=item C<charnames>

Upgraded from version 1.06 to 1.07.

The Unicode F<NameAliases.txt> database file has been added. This has the
effect of adding some extra C<\N> character names that formerly wouldn't
have been recognised; for example, C<"\N{LATIN CAPITAL LETTER GHA}">.

=item C<constant>

Upgraded from version 1.13 to 1.17.

=item C<feature>

The meaning of the C<:5.10> and C<:5.10.X> feature bundles has
changed slightly. The last component, if any (i.e. C<X>) is simply ignored.
This is predicated on the assumption that new features will not, in
general, be added to maintenance releases. So C<:5.10> and C<:5.10.X>
have identical effect. This is a change to the behaviour documented for
5.10.0.

=item C<fields>

Upgraded from version 2.13 to 2.14 (this was just a version bump; there
were no functional changes).

=item C<lib>

Upgraded from version 0.5565 to 0.62.

=item C<open>

Upgraded from version 1.06 to 1.07.

=item C<overload>

Upgraded from version 1.06 to 1.07.

=item C<overloading>

See L</"The C<overloading> pragma"> above.

=item C<version>

Upgraded from version 0.74 to 0.77.

=back
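
As an illustration of the C<charnames> change above, the alias mentioned
there can be resolved like this (a small sketch; the variable names are
illustrative):

    use strict;
    use warnings;
    use charnames ':full';

    my $gha       = "\N{LATIN CAPITAL LETTER GHA}";
    my $codepoint = sprintf 'U+%04X', ord $gha;

    # viacode() reports the character's formal Unicode name.
    my $official = charnames::viacode(ord $gha);

    print "$codepoint $official\n";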

=head2 Updated Modules

=over 4

=item C<Archive::Extract>

Upgraded from version 0.24 to 0.34.

=item C<Archive::Tar>

Upgraded from version 1.38 to 1.52.

=item C<Attribute::Handlers>

Upgraded from version 0.79 to 0.85.

=item C<AutoLoader>

Upgraded from version 5.63 to 5.68.

=item C<AutoSplit>

Upgraded from version 1.05 to 1.06.

=item C<B>

Upgraded from version 1.17 to 1.22.

=item C<B::Debug>

Upgraded from version 1.05 to 1.11.

=item C<B::Deparse>

Upgraded from version 0.83 to 0.89.

=item C<B::Lint>

Upgraded from version 1.09 to 1.11.

=item C<B::Xref>

Upgraded from version 1.01 to 1.02.

=item C<Benchmark>

Upgraded from version 1.10 to 1.11.

=item C<Carp>

Upgraded from version 1.08 to 1.11.

=item C<CGI>

Upgraded from version 3.29 to 3.43 (also includes the
"default_value for popup_menu()" fix from 3.45).

=item C<Compress::Zlib>

Upgraded from version 2.008 to 2.020.

=item C<CPAN>

Upgraded from version 1.9205 to 1.9402. C<CPAN::FTP> has a local fix to
stop it being too verbose on download failure.

=item C<CPANPLUS>

Upgraded from version 0.84 to 0.88.

=item C<CPANPLUS::Dist::Build>

Upgraded from version 0.06_02 to 0.36.

=item C<Cwd>

Upgraded from version 3.25_01 to 3.30.

=item C<Data::Dumper>

Upgraded from version 2.121_14 to 2.124.

=item C<DB>

Upgraded from version 1.01 to 1.02.

=item C<DB_File>

Upgraded from version 1.816_1 to 1.820.

=item C<Devel::PPPort>

Upgraded from version 3.13 to 3.19.

=item C<Digest::MD5>

Upgraded from version 2.36_01 to 2.39.

=item C<Digest::SHA>

Upgraded from version 5.45 to 5.47.

=item C<DirHandle>

Upgraded from version 1.01 to 1.03.

=item C<Dumpvalue>

Upgraded from version 1.12 to 1.13.

=item C<DynaLoader>

Upgraded from version 1.08 to 1.10.

=item C<Encode>

Upgraded from version 2.23 to 2.35.

=item C<Errno>

Upgraded from version 1.10 to 1.11.

=item C<Exporter>

Upgraded from version 5.62 to 5.63.

=item C<ExtUtils::CBuilder>

Upgraded from version 0.21 to 0.2602.

=item C<ExtUtils::Command>

Upgraded from version 1.13 to 1.16.

=item C<ExtUtils::Constant>

Upgraded from version 0.20 to 0.22. (Note that neither of these versions is
available on CPAN.)

=item C<ExtUtils::Embed>

Upgraded from version 1.27 to 1.28.

=item C<ExtUtils::Install>

Upgraded from version 1.44 to 1.54.

=item C<ExtUtils::MakeMaker>

Upgraded from version 6.42 to 6.55_02.

Note that C<ExtUtils::MakeMaker::bytes> and C<ExtUtils::MakeMaker::vmsish>
have been removed from this distribution.

=item C<ExtUtils::Manifest>

Upgraded from version 1.51_01 to 1.56.

=item C<ExtUtils::ParseXS>

Upgraded from version 2.18_02 to 2.2002.

=item C<Fatal>

Upgraded from version 1.05 to 2.06_01. See also the new pragma C<autodie>.

=item C<File::Basename>

Upgraded from version 2.76 to 2.77.

=item C<File::Compare>

Upgraded from version 1.1005 to 1.1006.

=item C<File::Copy>

Upgraded from version 2.11 to 2.14.

=item C<File::Fetch>

Upgraded from version 0.14 to 0.20.

=item C<File::Find>

Upgraded from version 1.12 to 1.14.

=item C<File::Path>

Upgraded from version 2.04 to 2.07_03.

=item C<File::Spec>

Upgraded from version 3.2501 to 3.30.

=item C<File::stat>

Upgraded from version 1.00 to 1.01.

=item C<File::Temp>

Upgraded from version 0.18 to 0.22.

=item C<FileCache>

Upgraded from version 1.07 to 1.08.

=item C<FileHandle>

Upgraded from version 2.01 to 2.02.

=item C<Filter::Simple>

Upgraded from version 0.82 to 0.84.

=item C<Filter::Util::Call>

Upgraded from version 1.07 to 1.08.

=item C<FindBin>

Upgraded from version 1.49 to 1.50.

=item C<GDBM_File>

Upgraded from version 1.08 to 1.09.

=item C<Getopt::Long>

Upgraded from version 2.37 to 2.38.

=item C<Hash::Util::FieldHash>

Upgraded from version 1.03 to 1.04. This fixes a memory leak.

=item C<I18N::Collate>

Upgraded from version 1.00 to 1.01.

=item C<IO>

Upgraded from version 1.23_01 to 1.25.

This makes non-blocking mode work on Windows in C<IO::Socket::INET>
[CPAN #43573].

=item C<IO::Compress::*>

Upgraded from version 2.008 to 2.020.

=item C<IO::Dir>

Upgraded from version 1.06 to 1.07.

=item C<IO::Handle>

Upgraded from version 1.27 to 1.28.

=item C<IO::Socket>

Upgraded from version 1.30_01 to 1.31.

=item C<IO::Zlib>

Upgraded from version 1.07 to 1.09.

=item C<IPC::Cmd>

Upgraded from version 0.40_1 to 0.46.

=item C<IPC::Open3>

Upgraded from version 1.02 to 1.04.

=item C<IPC::SysV>

Upgraded from version 1.05 to 2.01.

=item C<lib>

Upgraded from version 0.5565 to 0.62.

=item C<List::Util>

Upgraded from version 1.19 to 1.21.

=item C<Locale::Maketext>

Upgraded from version 1.12 to 1.13.

=item C<Log::Message>

Upgraded from version 0.01 to 0.02.

=item C<Math::BigFloat>

Upgraded from version 1.59 to 1.60.

=item C<Math::BigInt>

Upgraded from version 1.88 to 1.89.

=item C<Math::BigInt::FastCalc>

Upgraded from version 0.16 to 0.19.

=item C<Math::BigRat>

Upgraded from version 0.21 to 0.22.

=item C<Math::Complex>

Upgraded from version 1.37 to 1.56.

=item C<Math::Trig>

Upgraded from version 1.04 to 1.20.

=item C<Memoize>

Upgraded from version 1.01_02 to 1.01_03 (just a minor documentation
change).

=item C<Module::Build>

Upgraded from version 0.2808_01 to 0.34_02.

=item C<Module::CoreList>

Upgraded from version 2.13 to 2.18. This release no longer contains the
C<%Module::CoreList::patchlevel> hash.

=item C<Module::Load>

Upgraded from version 0.12 to 0.16.

=item C<Module::Load::Conditional>

Upgraded from version 0.22 to 0.30.

=item C<Module::Loaded>

Upgraded from version 0.01 to 0.02.

=item C<Module::Pluggable>

Upgraded from version 3.6 to 3.9.

=item C<NDBM_File>

Upgraded from version 1.07 to 1.08.

=item C<Net::Ping>

Upgraded from version 2.33 to 2.36.

=item C<NEXT>

Upgraded from version 0.60_01 to 0.64.

=item C<Object::Accessor>

Upgraded from version 0.32 to 0.34.

=item C<OS2::REXX>

Upgraded from version 1.03 to 1.04.

=item C<Package::Constants>

Upgraded from version 0.01 to 0.02.

=item C<PerlIO>

Upgraded from version 1.04 to 1.06.

=item C<PerlIO::via>

Upgraded from version 0.04 to 0.07.

=item C<Pod::Man>

Upgraded from version 2.16 to 2.22.

=item C<Pod::Parser>

Upgraded from version 1.35 to 1.37.

=item C<Pod::Simple>

Upgraded from version 3.05 to 3.07.

=item C<Pod::Text>

Upgraded from version 3.08 to 3.13.

=item C<POSIX>

Upgraded from version 1.13 to 1.17.

=item C<Safe>

Upgraded from version 2.12 to 2.18.

=item C<Scalar::Util>

Upgraded from version 1.19 to 1.21.

=item C<SelectSaver>

Upgraded from version 1.01 to 1.02.

=item C<SelfLoader>

Upgraded from version 1.11 to 1.17.

=item C<Socket>

Upgraded from version 1.80 to 1.82.

=item C<Storable>

Upgraded from version 2.18 to 2.20.

=item C<Switch>

Upgraded from version 2.13 to 2.14. Please see L</Deprecations>.

=item C<Symbol>

Upgraded from version 1.06 to 1.07.

=item C<Sys::Syslog>

Upgraded from version 0.22 to 0.27.

=item C<Term::ANSIColor>

Upgraded from version 1.12 to 2.00.

=item C<Term::ReadLine>

Upgraded from version 1.03 to 1.04.

=item C<Term::UI>

Upgraded from version 0.18 to 0.20.

=item C<Test::Harness>

Upgraded from version 2.64 to 3.17.

Note that one side-effect of the 2.x to 3.x upgrade is that the
experimental C<Test::Harness::Straps> module and its supporting
C<Assert>, C<Iterator>, C<Point> and C<Results> modules have been
removed. If you still need them, they are available in the
(unmaintained) C<Test-Harness-Straps> distribution on CPAN.

=item C<Test::Simple>

Upgraded from version 0.72 to 0.92.

=item C<Text::ParseWords>

Upgraded from version 3.26 to 3.27.

=item C<Text::Tabs>

Upgraded from version 2007.1117 to 2009.0305.

=item C<Text::Wrap>

Upgraded from version 2006.1117 to 2009.0305.

=item C<Thread::Queue>

Upgraded from version 2.00 to 2.11.

=item C<Thread::Semaphore>

Upgraded from version 2.01 to 2.09.

=item C<threads>

Upgraded from version 1.67 to 1.72.

=item C<threads::shared>

Upgraded from version 1.14 to 1.29.

=item C<Tie::RefHash>

Upgraded from version 1.37 to 1.38.

=item C<Tie::StdHandle>

This has documentation changes, and has been assigned a version for the
first time: version 4.2.

=item C<Time::HiRes>

Upgraded from version 1.9711 to 1.9719.

=item C<Time::Local>

Upgraded from version 1.18 to 1.1901.

=item C<Time::Piece>

Upgraded from version 1.12 to 1.15.

=item C<Unicode::Normalize>

Upgraded from version 1.02 to 1.03.

=item C<Unicode::UCD>

Upgraded from version 0.25 to 0.27.

C<charinfo()> now works on Unified CJK code points added to later versions
of Unicode.

C<casefold()> has new fields returned to provide both a simpler interface
and previously missing information. The old fields are retained for
backwards compatibility. Information about Turkic-specific code points is
now returned.

The documentation has been corrected and expanded.

=item C<UNIVERSAL>

Upgraded from version 1.04 to 1.05.

=item C<Win32>

Upgraded from version 0.34 to 0.39.

=item C<Win32API::File>

Upgraded from version 0.1001_01 to 0.1101.

=item C<XSLoader>

Upgraded from version 0.08 to 0.10.

=back

=head1 Utility Changes

=over 4

=item F<h2ph>

Now looks in C<include-fixed> too, which is a recent addition to gcc's
search path.

=item F<h2xs>

No longer incorrectly treats enum values like macros (Daniel Burr).

Now handles C++ style comments (C<//>) properly in enums. (A patch from
Rainer Weikusat was used; Daniel Burr also proposed a similar fix.)

=item F<perl5db.pl>

C<LVALUE> subroutines now work under the debugger.

The debugger now correctly handles proxy constant subroutines, and
subroutine stubs.

=item F<perlthanks>

Perl 5.10.1 adds a new utility F<perlthanks>, which is a variant of
F<perlbug>, but for sending non-bug-reports to the authors and maintainers
of Perl. Getting nothing but bug reports can become a bit demoralising:
we'll see if this changes things.

=back

=head1 New Documentation

=over 4

=item L<perlhaiku>

This contains instructions on how to build perl for the Haiku platform.

=item L<perlmroapi>

This describes the new interface for pluggable Method Resolution Orders.

=item L<perlperf>

This document, by Richard Foley, introduces performance and optimization
techniques, with particular reference to perl programs.

=item L<perlrepository>

This describes how to access the perl source using the I<git> version
control system.

=item L<perlthanks>

This describes the new F<perlthanks> utility.

=back

=head1 Changes to Existing Documentation

The various large C<Changes*> files (which listed every change made to perl
over the last 18 years) have been removed, and replaced by a small file,
also called C<Changes>, which just explains how that same information may
be extracted from the git version control system.

The file F<Porting/patching.pod> has been deleted, as it mainly described
interacting with the old Perforce-based repository, which is now obsolete.
Information still relevant has been moved to L<perlrepository>.

L<perlapi>, L<perlintern>, L<perlmodlib> and L<perltoc> are now all
generated at build time, rather than being shipped as part of the release.

=head1 Performance Enhancements

=over 4

=item *

A new internal cache means that C<isa()> will often be faster.

=item *

Under C<use locale>, the locale-relevant information is now cached on
read-only values, such as the list returned by C<keys %hash>. This makes
operations such as C<sort keys %hash> in the scope of C<use locale> much
faster.

=item *

Empty C<DESTROY> methods are no longer called.

=back

=head1 Installation and Configuration Improvements

=head2 F<ext/> reorganisation

The layout of directories in F<ext> has been revised. Specifically, all
extensions are now flat, and at the top level, with C</> in pathnames
replaced by C<->, so that F<ext/Data/Dumper/> is now F<ext/Data-Dumper/>,
etc.  The names of the extensions as specified to F<Configure>, and as
reported by C<%Config::Config> under the keys C<dynamic_ext>,
C<known_extensions>, C<nonxs_ext> and C<static_ext> have not changed, and
still use C</>. Hence this change will not have any effect once perl is
installed. However, C<Attribute::Handlers>, C<Safe> and C<mro> have now
become extensions in their own right, so if you run F<Configure> with
options to specify an exact list of extensions to build, you will need to
change it to account for this.
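
To see the extension names as an installed perl reports them, something
like the following can be used. It is only a sketch: C<ext_to_module()>
is a helper invented for this example, not part of C<Config>.

    use strict;
    use warnings;
    use Config;

    # %Config keeps the '/'-separated form, e.g. "Data/Dumper".
    my @dynamic = split ' ', ($Config{dynamic_ext} || '');
    my @nonxs   = split ' ', ($Config{nonxs_ext}   || '');

    # Turn a Configure-style extension name into a module name.
    sub ext_to_module {
        my ($ext) = @_;
        (my $module = $ext) =~ s{/}{::}g;
        return $module;
    }

    print ext_to_module($_), "\n" for sort @dynamic, @nonxs;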

For 5.10.2, it is planned that many dual-life modules will have been moved
from F<lib> to F<ext>; again this will have no effect on an installed
perl, but will matter if you invoke F<Configure> with a pre-canned list of
extensions to build.

=head2 Configuration improvements

If C<vendorlib> and C<vendorarch> are the same, then they are only added to
C<@INC> once.

C<$Config{usedevel}> and the C-level C<PERL_USE_DEVEL> are now defined if
perl is built with C<-Dusedevel>.
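
From Perl code, the setting can be checked along these lines (a minimal
sketch; it assumes that, as with other F<Configure> booleans, the value is
the string C<define> when the option was given):

    use strict;
    use warnings;
    use Config;

    my $is_devel = (($Config{usedevel} || '') eq 'define') ? 1 : 0;

    print $is_devel ? "development build\n" : "regular build\n";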

F<Configure> will enable use of C<-fstack-protector>, to provide protection
against stack-smashing attacks, if the compiler supports it.

F<Configure> will now determine the correct prototypes for re-entrant
functions, and for C<gconvert>, if you are using a C++ compiler rather
than a C compiler.

On Unix, if you build from a tree containing a git repository, the
configuration process will note the commit hash you have checked out, for
display in the output of C<perl -v> and C<perl -V>. Unpushed local commits
are automatically added to the list of local patches displayed by
C<perl -V>.

=head2 Compilation improvements

As part of the flattening of F<ext>, all extensions on all platforms are
built by F<make_ext.pl>. This replaces the Unix-specific
F<ext/util/make_ext>, VMS-specific F<make_ext.com> and Win32-specific
F<win32/buildext.pl>.

=head2 Platform Specific Changes

=over 4

=item AIX

Removed F<libbsd> for AIX 5L and 6.1. Only flock() was used from F<libbsd>.

Removed F<libgdbm> for AIX 5L and 6.1. The F<libgdbm> is delivered as an
optional package with the AIX Toolbox. Unfortunately the 64 bit version 
is broken.

Hints changes mean that AIX 4.2 should work again.

=item Cygwin

On Cygwin we now strip the last number from the DLL. This has been the
behaviour in the cygwin.com build for years. The hints files have been
updated.

=item FreeBSD

The hints files now identify the correct threading libraries on FreeBSD 7
and later.

=item Irix

We now work around a bizarre preprocessor bug in the Irix 6.5 compiler:
C<cc -E -> unfortunately goes into K&R mode, but C<cc -E file.c> doesn't.

=item Haiku

Patches from the Haiku maintainers have been merged in. Perl should now
build on Haiku.

=item MirOS BSD

Perl should now build on MirOS BSD.

=item NetBSD

The hints file now supports NetBSD versions 5.*.

=item Stratus VOS

Various changes from Stratus have been merged in.

=item Symbian

There is now support for Symbian S60 3.2 SDK and S60 5.0 SDK.

=item Win32

Improved message window handling means that C<alarm> and C<kill> messages
will no longer be dropped under race conditions.

=item VMS

Reads from the in-memory temporary files of C<PerlIO::scalar> used to fail
if C<$/> was set to a numeric reference (to indicate record-style reads).
This is now fixed.

VMS now supports C<getgrgid>.

Many improvements and cleanups have been made to the VMS file name handling
and conversion code.

Enabling the C<PERL_VMS_POSIX_EXIT> logical name now encodes a POSIX exit
status in a VMS condition value for better interaction with GNV's bash
shell and other utilities that depend on POSIX exit values.  See
L<perlvms/"$?"> for details.

=back

=head1 Selected Bug Fixes

=over 4

=item *

5.10.0 inadvertently disabled an optimisation, which caused a measurable
performance drop in list assignment, such as is often used to assign
function parameters from C<@_>. The optimisation has been re-instated, and
the performance regression fixed.

=item *

Fixed memory leak on C<while (1) { map 1, 1 }> [RT #53038].

=item *

Some potential coredumps in PerlIO fixed [RT #57322,54828].

=item *

The debugger now works with lvalue subroutines.

=item *

The debugger's C<m> command was broken on modules that defined constants
[RT #61222].

=item *

C<crypt()> and string complement could return tainted values for untainted
arguments [RT #59998].

=item *

The C<-i.suffix> command-line switch now recreates the file using
restricted permissions, before changing its mode to match the original
file. This eliminates a potential race condition [RT #60904].

=item *

On some Unix systems, the value in C<$?> would not have the top bit set
(C<$? & 128>) even if the child core dumped.

=item *

Under some circumstances, $^R could incorrectly become undefined
[RT #57042].

=item *

(XS) In various hash functions, passing a pre-computed hash when the
key is UTF-8 might result in an incorrect lookup.

=item *

(XS) Including F<XSUB.h> before F<perl.h> gave a compile-time error
[RT #57176].

=item *

C<< $object->isa('Foo') >> would report false if the package C<Foo> didn't
exist, even if the object's C<@ISA> contained C<Foo>.

=item *

Various bugs in the mro code (new in 5.10.0), triggered by manipulating
C<@ISA>, have been found and fixed.

=item *

Bitwise operations on references could crash the interpreter, e.g.
C<$x=\$y; $x |= "foo"> [RT #54956].

=item *

Patterns including alternation might be sensitive to the internal UTF-8
representation, e.g.

    my $byte = chr(192);
    my $utf8 = chr(192); utf8::upgrade($utf8);
    $utf8 =~ /$byte|X}/i;	# failed in 5.10.0

=item *

Within UTF8-encoded Perl source files (i.e. where C<use utf8> is in
effect), double-quoted literal strings could be corrupted where a C<\xNN>,
C<\0NNN> or C<\N{}> is followed by a literal character with ordinal value
greater than 255 [RT #59908].

=item *

C<B::Deparse> failed to correctly deparse various constructs:
C<readpipe STRING> [RT #62428], C<CORE::require(STRING)> [RT #62488],
C<sub foo(_)> [RT #62484].

=item *

Using C<setpgrp()> with no arguments could corrupt the perl stack.

=item *

The block form of C<eval> is now specifically trappable by C<Safe> and
C<ops>.  Previously it was erroneously treated like string C<eval>.

=item *

In 5.10.0, the two characters C<[~> were sometimes parsed as the smart
match operator (C<~~>) [RT #63854].

=item *

In 5.10.0, the C<*> quantifier in patterns was sometimes treated as
C<{0,32767}> [RT #60034, #60464]. For example, this match would fail:

    ("ab" x 32768) =~ /^(ab)*$/

=item *

C<shmget> was limited to a 32 bit segment size on a 64 bit OS [RT #63924].

=item *

Using C<next> or C<last> to exit a C<given> block no longer produces a
spurious warning like the following:

    Exiting given via last at foo.pl line 123

=item *

On Windows, C<'.\foo'> and C<'..\foo'> were treated differently from
C<'./foo'> and C<'../foo'> by C<do> and C<require> [RT #63492].

=item *

Assigning a format to a glob could corrupt the format; e.g.:

     *bar=*foo{FORMAT}; # foo format now bad

=item *

Attempting to coerce a typeglob to a string or number could cause an
assertion failure. The correct error message is now generated,
C<Can't coerce GLOB to I<$type>>.

=item *

Under C<use filetest 'access'>, C<-x> was using the wrong access mode. This
has been fixed [RT #49003].

=item *

C<length> on a tied scalar that returned a Unicode value would not be
correct the first time. This has been fixed.

=item *

Using an array C<tie> inside an array C<tie> could SEGV. This has been
fixed [RT #51636].

=item *

A race condition inside C<PerlIOStdio_close()> has been identified and
fixed. This used to cause various threading issues, including SEGVs.

=item *

In C<unpack>, the use of C<()> groups in scalar context was internally
placing a list on the interpreter's stack, which manifested in various
ways, including SEGVs.  This is now fixed [RT #50256].

=item *

Magic was called twice in C<substr>, C<\&$x>, C<tie $x, $m> and C<chop>.
These have all been fixed.

=item *

A 5.10.0 optimisation to clear the temporary stack within the implicit
loop of C<s///ge> has been reverted, as it turned out to be the cause of
obscure bugs in seemingly unrelated parts of the interpreter [commit 
ef0d4e17921ee3de].

=item *

The line numbers for warnings inside C<elsif> are now correct.

=item *

The C<..> operator now works correctly with ranges whose ends are at or
close to the values of the smallest and largest integers.

=item *

C<binmode STDIN, ':raw'> could lead to segmentation faults on some platforms.
This has been fixed [RT #54828].

=item *

An off-by-one error meant that C<index $str, ...> was effectively being
executed as C<index "$str\0", ...>. This has been fixed [RT #53746].

=item *

Various leaks associated with named captures in regexes have been fixed
[RT #57024].

=item *

A weak reference to a hash would leak. This was affecting C<DBI>
[RT #56908].

=item *

Using (?|) in a regex could cause a segfault [RT #59734].

=item *

Use of a UTF-8 C<tr//> within a closure could cause a segfault [RT #61520].

=item *

Calling C<sv_chop()> or otherwise upgrading an SV could result in an
unaligned 64-bit access on the SPARC architecture [RT #60574].

=item *

In the 5.10.0 release, C<inc_version_list> would incorrectly list
C<5.10.*> after C<5.8.*>; this affected the C<@INC> search order
[RT #67628].

=item *

In 5.10.0, C<pack "a*", $tainted_value> returned a non-tainted value
[RT #52552].

=item *

In 5.10.0, C<printf> and C<sprintf> could produce the fatal error
C<panic: utf8_mg_pos_cache_update> when printing UTF-8 strings
[RT #62666].

=item *

In the 5.10.0 release, a dynamically created C<AUTOLOAD> method might be
missed (method cache issue) [RT #60220,60232].

=item *

In the 5.10.0 release, a combination of C<use feature> and C<//ee> could
cause a memory leak [RT #63110].

=item *

C<-C> on the shebang (C<#!>) line is once more permitted if it is also
specified on the command line. C<-C> on the shebang line used to be a
silent no-op I<if> it was not also on the command line, so perl 5.10.0
disallowed it, which broke some scripts. Now perl checks whether it is
also on the command line and only dies if it is not [RT #67880].

=item *

In 5.10.0, certain types of re-entrant regular expression could crash,
or cause the following assertion failure [RT #60508]:

    Assertion rx->sublen >= (s - rx->subbeg) + i failed


=back
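
The C<$?> fix above concerns the usual layout of the child status word.
A sketch of decoding it follows; C<describe_status()> is a helper invented
for this example.

    use strict;
    use warnings;

    # Split a wait status into its conventional parts.
    sub describe_status {
        my ($status) = @_;
        return {
            exit_code   => $status >> 8,
            signal      => $status & 127,
            core_dumped => ($status & 128) ? 1 : 0,
        };
    }

    # Run a child perl that exits with status 3.
    system($^X, '-e', 'exit 3');
    my $info = describe_status($?);

    print "exit=$info->{exit_code} signal=$info->{signal} ",
          "core=$info->{core_dumped}\n";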

=head1 New or Changed Diagnostics

=over 4

=item C<panic: sv_chop %s>

This new fatal error occurs when the C routine C<Perl_sv_chop()> is
passed a position that is not within the scalar's string buffer. This
could be caused by buggy XS code, and at this point recovery is not
possible.

=item C<Can't locate package %s for the parents of %s>

This warning has been removed. In general, it only got produced in
conjunction with other warnings, and removing it allowed an ISA lookup
optimisation to be added.

=item C<v-string in use/require is non-portable>

This warning has been removed.

=item C<Deep recursion on subroutine "%s">

It is now possible to change the depth threshold for this warning from the
default of 100, by recompiling the F<perl> binary, setting the C
pre-processor macro C<PERL_SUB_DEPTH_WARN> to the desired value.

=back
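
For reference, the C<Deep recursion> warning with the default threshold
can be observed with a sketch like this one (the subroutine name is
illustrative, and the warning is captured rather than printed):

    use strict;
    use warnings;

    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    sub countdown {
        my ($n) = @_;
        return $n > 0 ? 1 + countdown($n - 1) : 0;
    }

    # Recurse deeper than the default threshold of 100.
    my $depth = countdown(150);
    my $deep  = grep { /Deep recursion/ } @warnings;

    print "depth=$depth, deep-recursion warnings=$deep\n";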

=head1 Changed Internals

=over 4

=item *

The J.R.R. Tolkien quotes at the head of each C source file have been
checked and proper citations added, thanks to a patch from Tom Christiansen.

=item *

C<vcroak()> now accepts a null first argument. In addition, a full audit
was made of the "not NULL" compiler annotations, and those for several
other internal functions were corrected.

=item *

New macros C<dSAVEDERRNO>, C<dSAVE_ERRNO>, C<SAVE_ERRNO>, C<RESTORE_ERRNO>
have been added to formalise the temporary saving of the C<errno>
variable.

=item *

The function C<Perl_sv_insert_flags> has been added to augment
C<Perl_sv_insert>.

=item *

The function C<Perl_newSV_type(type)> has been added, equivalent to
C<Perl_newSV()> followed by C<Perl_sv_upgrade(type)>.

=item *

The function C<Perl_newSVpvn_flags()> has been added, equivalent to
C<Perl_newSVpvn()> and then performing the action relevant to the flag.

Two flag bits are currently supported.

=over 4

=item C<SVf_UTF8>

This will call C<SvUTF8_on()> for you. (Note that this does not convert a
sequence of ISO 8859-1 characters to UTF-8.) A wrapper, C<newSVpvn_utf8()>,
is available for this.

=item C<SVs_TEMP>

Call C<sv_2mortal()> on the new SV.

=back

There is also a wrapper that takes constant strings, C<newSVpvs_flags()>.

=item *

The function C<Perl_croak_xs_usage> has been added as a wrapper to
C<Perl_croak>.

=item *

The functions C<PerlIO_find_layer> and C<PerlIO_list_alloc> are now
exported.

=item *

C<PL_na> has been exterminated from the core code, replaced by local STRLEN
temporaries, or C<*_nolen()> calls. Either approach is faster than C<PL_na>,
which is a pointer dereference into the interpreter structure under ithreads,
and a global variable otherwise.

=item *

C<Perl_mg_free()> used to leave freed memory accessible via SvMAGIC() on
the scalar. It now updates the linked list to remove each piece of magic
as it is freed.

=item *

Under ithreads, the regex in C<PL_reg_curpm> is now reference counted. This
eliminates a lot of hackish workarounds to cope with it not being reference
counted.

=item *

C<Perl_mg_magical()> would sometimes incorrectly turn on C<SvRMAGICAL()>.
This has been fixed.

=item *

The I<public> IV and NV flags are now not set if the string value has
trailing "garbage". This behaviour is consistent with not setting the
public IV or NV flags if the value is out of range for the type.

=item *

SV allocation tracing has been added to the diagnostics enabled by C<-Dm>.
The tracing can alternatively output via the C<PERL_MEM_LOG> mechanism, if
that was enabled when the F<perl> binary was compiled.

=item *

Uses of C<Nullav>, C<Nullcv>, C<Nullhv>, C<Nullop>, C<Nullsv> etc have been
replaced by C<NULL> in the core code, and non-dual-life modules, as C<NULL>
is clearer to those unfamiliar with the core code.

=item *

A macro C<MUTABLE_PTR(p)> has been added, which on (non-pedantic) gcc will
not cast away C<const>, returning a C<void *>. Macros C<MUTABLE_AV(p)>,
C<MUTABLE_CV(p)> etc. build on this, casting to C<AV *> etc. without
casting away C<const>. This allows proper compile-time auditing of
C<const> correctness in the core, and helped pick up some errors (now
fixed).

=item *

Macros C<mPUSHs()> and C<mXPUSHs()> have been added, for pushing SVs on the
stack and mortalizing them.

=item *

Use of the private structure C<mro_meta> has changed slightly. Nothing
outside the core should be accessing this directly anyway.

=item *

A new tool, C<Porting/expand-macro.pl> has been added, that allows you
to view how a C preprocessor macro would be expanded when compiled.
This is handy when trying to decode the macro hell that is the perl
guts.

=back

=head1 New Tests

Many modules updated from CPAN incorporate new tests.

Several tests that have the potential to hang forever if they fail now
incorporate a "watchdog" functionality that will kill them after a timeout,
which helps ensure that C<make test> and C<make test_harness> run to
completion automatically. (Jerry Hedden).

Some core-specific tests have been added:

=over 4

=item t/comp/retainedlines.t

Check that the debugger can retain source lines from C<eval>.

=item t/io/perlio_fail.t

Check that bad layers fail.

=item t/io/perlio_leaks.t

Check that PerlIO layers are not leaking.

=item t/io/perlio_open.t

Check that certain special forms of open work.

=item t/io/perlio.t

General PerlIO tests.

=item t/io/pvbm.t

Check that there is no unexpected interaction between the internal types
C<PVBM> and C<PVGV>.

=item t/mro/package_aliases.t

Check that mro works properly in the presence of aliased packages.

=item t/op/dbm.t

Tests for C<dbmopen> and C<dbmclose>.

=item t/op/index_thr.t

Tests for the interaction of C<index> and threads.

=item t/op/pat_thr.t

Tests for the interaction of esoteric patterns and threads.

=item t/op/qr_gc.t

Test that C<qr> doesn't leak.

=item t/op/reg_email_thr.t

Tests for the interaction of regex recursion and threads.

=item t/op/regexp_qr_embed_thr.t

Tests for the interaction of patterns with embedded C<qr//> and threads.

=item t/op/regexp_unicode_prop.t

Tests for Unicode properties in regular expressions.

=item t/op/regexp_unicode_prop_thr.t

Tests for the interaction of Unicode properties and threads.

=item t/op/reg_nc_tie.t

Test the tied methods of C<Tie::Hash::NamedCapture>.

=item t/op/reg_posixcc.t

Check that POSIX character classes behave consistently.

=item t/op/re.t

Check that exportable C<re> functions in F<universal.c> work.

=item t/op/setpgrpstack.t

Check that C<setpgrp> works.

=item t/op/substr_thr.t

Tests for the interaction of C<substr> and threads.

=item t/op/upgrade.t

Check that upgrading and assigning scalars works.

=item t/uni/lex_utf8.t

Check that Unicode in the lexer works.

=item t/uni/tie.t

Check that Unicode and C<tie> work.

=back

=head1 Known Problems

This is a list of some significant unfixed bugs, which are regressions
from either 5.10.0 or 5.8.x.

=over 4

=item *

C<List::Util::first> misbehaves in the presence of a lexical C<$_>
(typically introduced by C<my $_> or implicitly by C<given>). The variable
which gets set for each iteration is the package variable C<$_>, not the
lexical C<$_> [RT #67694].

A similar issue may occur in other modules that provide functions which
take a block as their first argument, like

    foo { ... $_ ...} list

=item *

The C<charnames> pragma may generate a run-time error when a regex is
interpolated [RT #56444]:

    use charnames ':full';
    my $r1 = qr/\N{THAI CHARACTER SARA I}/;
    "foo" =~ $r1;    # okay
    "foo" =~ /$r1+/; # runtime error

A workaround is to generate the character outside of the regex:

    my $a = "\N{THAI CHARACTER SARA I}";
    my $r1 = qr/$a/;

=item *

Some regexes may run much more slowly when run in a child thread compared
with the thread the pattern was compiled into [RT #55600].


=back

=head1 Deprecations

The following items are now deprecated.

=over 4

=item *

C<Switch> is buggy and should be avoided. From perl 5.11.0 onwards, it is
intended that any use of the core version of this module will emit a
warning, and that the module will eventually be removed from the core
(probably in perl 5.14.0). See L<perlsyn/"Switch statements"> for its
replacement.

=item *

C<suidperl> will be removed in 5.12.0. This provides a mechanism to
emulate setuid permission bits on systems that don't support it properly.

=back

=head1 Acknowledgements

Some of the work in this release was funded by a TPF grant.

Nicholas Clark officially retired from maintenance pumpking duty at the
end of 2008; however in reality he has put much effort in since then to
help get 5.10.1 into a fit state to be released, including writing a
considerable chunk of this perldelta.

Steffen Mueller and David Golden in particular helped get CPAN modules
polished and synchronised with their in-core equivalents.

Craig Berry was tireless in getting maint to run under VMS, no matter how
many times we broke it for him.

The other core committers contributed most of the changes, and applied most
of the patches sent in by the hundreds of contributors listed in F<AUTHORS>.

(Sorry to all the people I haven't mentioned by name).

Finally, thanks to Larry Wall, without whom none of this would be
necessary.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for
security issues in the Perl core, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perl5123delta.pod000064400000010004150344123430007531 0ustar00=encoding utf8

=head1 NAME

perl5123delta - what is new for perl v5.12.3

=head1 DESCRIPTION

This document describes differences between the 5.12.2 release and
the 5.12.3 release.

If you are upgrading from an earlier release such as 5.12.1, first read
L<perl5122delta>, which describes differences between 5.12.1 and
5.12.2.  The major changes made in 5.12.0 are described in L<perl5120delta>.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.12.2. If any
exist, they are bugs and reports are welcome.

=head1 Core Enhancements

=head2 C<keys>, C<values>, and C<each> work on arrays

You can now use the C<keys>, C<values>, C<each> builtin functions on arrays
(previously you could only use them on hashes).  See L<perlfunc> for details.
This is actually a change introduced in perl 5.12.0, but it was missed from
that release's perldelta.
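
As a brief sketch of what this looks like in practice (the array name is
invented for this example, and perl 5.12.0 or later is assumed):

    use strict;
    use warnings;

    my @colours = ('red', 'green', 'blue');

    my @indices = keys @colours;      # the indices 0, 1, 2
    my @copies  = values @colours;    # the elements themselves

    # each() walks the array and returns (index, element) pairs.
    my @pairs;
    while (my ($i, $c) = each @colours) {
        push @pairs, "$i=$c";
    }
    print join(',', @pairs), "\n";    # prints "0=red,1=green,2=blue"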

=head1 Bug Fixes

"no VERSION" will now correctly deparse with B::Deparse, as will certain
constant expressions.

Module::Build should now more reliably pass its tests under cygwin.

Lvalue subroutines are again able to return copy-on-write scalars.  This
had been broken since version 5.10.0.

=head1 Platform Specific Notes

=over 4

=item Solaris

A separate DTrace is now built for miniperl, which means that perl can be
compiled with -Dusedtrace on Solaris again.

=item VMS

A number of regressions on VMS have been fixed.  In addition to minor cleanup
of questionable expressions in F<vms.c>, file permissions should no longer be
garbled by the PerlIO layer, and spurious record boundaries should no longer be
introduced by the PerlIO layer during output.

For more details and discussion on the latter, see:

    http://www.nntp.perl.org/group/perl.vmsperl/2010/11/msg15419.html

=item VOS

A few very small changes were made to the build process on VOS to better
support the platform.  Longer-than-32-character filenames are now supported on
OpenVOS, and perl now builds properly without IPv6 support.

=back

=head1 Acknowledgements

Perl 5.12.3 represents approximately four months of development since
Perl 5.12.2 and contains approximately 2500 lines of changes across
54 files from 16 authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers.  The following people are known to
have contributed the improvements that became Perl 5.12.3:

Craig A. Berry, David Golden, David Leadbeater, Father Chrysostomos, Florian
Ragwitz, Jesse Vincent, Karl Williamson, Nick Johnston, Nicolas Kaiser, Paul
Green, Rafael Garcia-Suarez, Rainer Tammer, Ricardo Signes, Steffen Mueller,
Zsbán Ambrus, Ævar Arnfjörð Bjarmason

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for
security issues in the Perl core, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlrebackslash.pod000064400000076111150344123430010422 0ustar00=head1 NAME

perlrebackslash - Perl Regular Expression Backslash Sequences and Escapes

=head1 DESCRIPTION

The top level documentation about Perl regular expressions
is found in L<perlre>.

This document describes all backslash and escape sequences. After
explaining the role of the backslash, it lists all the sequences that have
a special meaning in Perl regular expressions (in alphabetical order),
then describes each of them.

Most sequences are described in detail in different documents; the primary
purpose of this document is to have a quick reference guide describing all
backslash and escape sequences.

=head2 The backslash

In a regular expression, the backslash can perform one of two tasks:
it either takes away the special meaning of the character following it
(for instance, C<\|> matches a vertical bar, it's not an alternation),
or it is the start of a backslash or escape sequence.

The rules determining what it is are quite simple: if the character
following the backslash is an ASCII punctuation (non-word) character (that is,
anything that is not a letter, digit, or underscore), then the backslash just
takes away any special meaning of the character following it.

If the character following the backslash is an ASCII letter or an ASCII digit,
then the sequence may be special; if so, it's listed below. A few letters have
not been used yet, so escaping them with a backslash doesn't change them to be
special.  A future version of Perl may assign a special meaning to them, so if
you have warnings turned on, Perl issues a warning if you use such a
sequence [1].

It is however guaranteed that backslash or escape sequences never have a
punctuation character following the backslash, not now, and not in a future
version of Perl 5. So it is safe to put a backslash in front of a non-word
character.

Note that the backslash itself is special; if you want to match a backslash,
you have to escape the backslash with a backslash: C</\\/> matches a single
backslash.

=over 4

=item [1]

There is one exception. If you use an alphanumeric character as the
delimiter of your pattern (which you probably shouldn't do for readability
reasons), you have to escape the delimiter if you want to match
it. Perl won't warn then. See also L<perlop/Gory details of parsing
quoted constructs>.

=back
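
The rules above can be tried out with a short script.  This is only a
sketch: the sample string and the variable names are made up for
illustration.

 use strict;
 use warnings;

 my $str = 'a|b (c) 1+1';

 # A backslash before a non-word character makes it literal.
 my $bar  = $str =~ /a\|b/   ? 1 : 0;   # literal vertical bar
 my $par  = $str =~ /\(c\)/  ? 1 : 0;   # literal parentheses
 my $plus = $str =~ /1\+1/   ? 1 : 0;   # literal plus sign

 # Escaping a non-word character that is not special is harmless.
 my $sp   = $str =~ /b\ \(/  ? 1 : 0;   # "\ " is just a space

 # A backslash before a letter may be special: \d is a digit.
 my $dig  = $str =~ /\d\+\d/ ? 1 : 0;

 print "$bar$par$plus$sp$dig\n";        # prints "11111"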


=head2 All the sequences and escapes

Those not usable within a bracketed character class (like C<[\da-z]>) are marked
as C<Not in [].>

 \000              Octal escape sequence.  See also \o{}.
 \1                Absolute backreference.  Not in [].
 \a                Alarm or bell.
 \A                Beginning of string.  Not in [].
 \b{}, \b          Boundary. (\b is a backspace in []).
 \B{}, \B          Not a boundary.  Not in [].
 \cX               Control-X.
 \d                Match any digit character.
 \D                Match any character that isn't a digit.
 \e                Escape character.
 \E                Turn off \Q, \L and \U processing.  Not in [].
 \f                Form feed.
 \F                Foldcase till \E.  Not in [].
 \g{}, \g1         Named, absolute or relative backreference.
                   Not in [].
 \G                Pos assertion.  Not in [].
 \h                Match any horizontal whitespace character.
 \H                Match any character that isn't horizontal whitespace.
 \k{}, \k<>, \k''  Named backreference.  Not in [].
 \K                Keep the stuff left of \K.  Not in [].
 \l                Lowercase next character.  Not in [].
 \L                Lowercase till \E.  Not in [].
 \n                (Logical) newline character.
 \N                Match any character but newline.  Not in [].
 \N{}              Named or numbered (Unicode) character or sequence.
 \o{}              Octal escape sequence.
 \p{}, \pP         Match any character with the given Unicode property.
 \P{}, \PP         Match any character without the given property.
 \Q                Quote (disable) pattern metacharacters till \E.  Not
                   in [].
 \r                Return character.
 \R                Generic new line.  Not in [].
 \s                Match any whitespace character.
 \S                Match any character that isn't a whitespace.
 \t                Tab character.
 \u                Titlecase next character.  Not in [].
 \U                Uppercase till \E.  Not in [].
 \v                Match any vertical whitespace character.
 \V                Match any character that isn't vertical whitespace.
 \w                Match any word character.
 \W                Match any character that isn't a word character.
 \x{}, \x00        Hexadecimal escape sequence.
 \X                Unicode "extended grapheme cluster".  Not in [].
 \z                End of string.  Not in [].
 \Z                End of string, or before a final newline.  Not in [].

=head2 Character Escapes

=head3 Fixed characters

A handful of characters have a dedicated I<character escape>. The following
table shows them, along with their ASCII code points (in decimal and hex),
their ASCII name, the control escape on ASCII platforms and a short
description.  (For EBCDIC platforms, see L<perlebcdic/OPERATOR DIFFERENCES>.)

 Seq.  Code Point  ASCII   Cntrl   Description.
       Dec    Hex
  \a     7     07    BEL    \cG    alarm or bell
  \b     8     08     BS    \cH    backspace [1]
  \e    27     1B    ESC    \c[    escape character
  \f    12     0C     FF    \cL    form feed
  \n    10     0A     LF    \cJ    line feed [2]
  \r    13     0D     CR    \cM    carriage return
  \t     9     09    TAB    \cI    tab

=over 4

=item [1]

C<\b> is the backspace character only inside a character class. Outside a
character class, C<\b> alone is a word-character/non-word-character
boundary, and C<\b{}> is some other type of boundary.

=item [2]

C<\n> matches a logical newline. Perl converts between C<\n> and your
OS's native newline character when reading from or writing to text files.

=back

=head4 Example

 $str =~ /\t/;   # Matches if $str contains a (horizontal) tab.

=head3 Control characters

C<\c> is used to denote a control character; the character following C<\c>
determines the value of the construct.  For example the value of C<\cA> is
C<chr(1)>, and the value of C<\cb> is C<chr(2)>, etc.
The gory details are in L<perlop/"Regexp Quote-Like Operators">.  A complete
list of what C<chr(1)>, etc. means for ASCII and EBCDIC platforms is in
L<perlebcdic/OPERATOR DIFFERENCES>.

Note that C<\c\> alone at the end of a regular expression (or double-quoted
string) is not valid.  The backslash must be followed by another character.
That is, C<\c\I<X>> means C<chr(28) . 'I<X>'> for all characters I<X>.

To write platform-independent code, you must use C<\N{I<NAME>}> instead, like
C<\N{ESCAPE}> or C<\N{U+001B}>; see L<charnames>.

Mnemonic: I<c>ontrol character.

=head4 Example

 $str =~ /\cK/;  # Matches if $str contains a vertical tab (control-K).
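
The following sketch checks a few control escapes against their code
points.  It assumes an ASCII platform; the C<@checks> array is just a
name chosen for this example.

 use strict;
 use warnings;

 # On ASCII platforms \cX is the character whose code point is
 # ord(uc X) XOR 64.
 my @checks = (
     "\cA" eq chr(1),
     "\cb" eq chr(2),      # a lowercase letter works too
     "\cK" eq "\x0B",      # vertical tab
     "\c[" eq "\e",        # escape
     "\cI" eq "\t",        # tab
 );
 print join('', map { $_ ? 1 : 0 } @checks), "\n";   # prints "11111"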

=head3 Named or numbered characters and character sequences

Unicode characters have a Unicode name and numeric code point (ordinal)
value.  Use the
C<\N{}> construct to specify a character by either of these values.
Certain sequences of characters also have names.

To specify by name, the name of the character or character sequence goes
between the curly braces.

To specify a character by Unicode code point, use the form C<\N{U+I<code
point>}>, where I<code point> is a number in hexadecimal that gives the
code point that Unicode has assigned to the desired character.  It is
customary but not required to use leading zeros to pad the number to 4
digits.  Thus C<\N{U+0041}> means C<LATIN CAPITAL LETTER A>, and you will
rarely see it written without the two leading zeros.  C<\N{U+0041}> means
"A" even on EBCDIC machines (where the ordinal value of "A" is not 0x41).

It is even possible to give your own names to characters and character
sequences.  For details, see L<charnames>.

(There is an expanded internal form that you may see in debug output:
C<\N{U+I<code point>.I<code point>...}>.
The C<...> means any number of these I<code point>s separated by dots.
This represents the sequence formed by the characters.  This is an internal
form only, subject to change, and you should not try to use it yourself.)

Mnemonic: I<N>amed character.

Note that a character or character sequence expressed as a named
or numbered character is considered a character without special
meaning by the regex engine, and will match "as is".

=head4 Example

 $str =~ /\N{THAI CHARACTER SO SO}/;  # Matches the Thai SO SO character

 use charnames 'Cyrillic';            # Loads Cyrillic names.
 $str =~ /\N{ZHE}\N{KA}/;             # Match "ZHE" followed by "KA".
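
A further sketch, assuming an ASCII (non-EBCDIC) platform, shows that the
name and code point forms yield the same character and that a named
character is matched literally.  The variable names are illustrative only.

 use strict;
 use warnings;
 use charnames ':full';

 my $by_name = "\N{LATIN CAPITAL LETTER A}";
 my $by_code = "\N{U+0041}";
 my $short   = "\N{U+41}";     # leading zeros are optional

 my $same = ($by_name eq 'A' && $by_code eq 'A' && $short eq 'A')
          ? "same" : "different";
 print "$same\n";                                # prints "same"

 # \N{PLUS SIGN} is a literal "+" here, not a quantifier.
 my $literal = '1+1' =~ /\A1\N{PLUS SIGN}1\z/ ? 1 : 0;
 print "$literal\n";                             # prints "1"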

=head3 Octal escapes

There are two forms of octal escapes.  Each is used to specify a character by
its code point specified in octal notation.

One form, available starting in Perl 5.14, looks like C<\o{...}>, where the dots
represent one or more octal digits.  It can be used for any Unicode character.

It was introduced to avoid the potential problems with the other form,
available in all Perls.  That form consists of a backslash followed by three
octal digits.  One problem with this form is that it can look exactly like an
old-style backreference (see
L</Disambiguation rules between old-style octal escapes and backreferences>
below.)  You can avoid this by making the first of the three digits always a
zero, but that makes \077 the largest code point specifiable.

In some contexts, a backslash followed by two or even one octal digits may be
interpreted as an octal escape, sometimes with a warning, and because of some
bugs, sometimes with surprising results.  Also, if you are creating a regex
out of smaller snippets concatenated together, and you use fewer than three
digits, the beginning of one snippet may be interpreted as adding digits to the
ending of the snippet before it.  See L</Absolute referencing> for more
discussion and examples of the snippet problem.

Note that a character expressed as an octal escape is considered
a character without special meaning by the regex engine, and will match
"as is".

To summarize, the C<\o{}> form is always safe to use, and the other form is
safe to use for code points through \077 when you use exactly three digits to
specify them.

Mnemonic: I<0>ctal or I<o>ctal.

=head4 Examples (assuming an ASCII platform)

 $str = "Perl";
 $str =~ /\o{120}/;  # Match, "\120" is "P".
 $str =~ /\120/;     # Same.
 $str =~ /\o{120}+/; # Match, "\120" is "P",
                     # it's repeated at least once.
 $str =~ /\120+/;    # Same.
 $str =~ /P\053/;    # No match, "\053" is "+" and taken literally.
 /\o{23073}/         # Black foreground, white background smiling face.
 /\o{4801234567}/    # Raises a warning, and yields chr(4).

=head4 Disambiguation rules between old-style octal escapes and backreferences

Octal escapes of the C<\000> form outside of bracketed character classes
potentially clash with old-style backreferences (see L</Absolute referencing>
below).  They both consist of a backslash followed by numbers.  So Perl has to
use heuristics to determine whether it is a backreference or an octal escape.
Perl uses the following rules to disambiguate:

=over 4

=item 1

If the backslash is followed by a single digit, it's a backreference.

=item 2

If the first digit following the backslash is a 0, it's an octal escape.

=item 3

If the number following the backslash is N (in decimal), and Perl already
has seen N capture groups, Perl considers this a backreference.  Otherwise,
it considers it an octal escape. If N has more than three digits, Perl
takes only the first three for the octal escape; the rest are matched as is.

 my $pat  = "(" x 999;
    $pat .= "a";
    $pat .= ")" x 999;
 /^($pat)\1000$/;   #  Matches 'aa'; there are 1000 capture groups.
 /^$pat\1000$/;     #  Matches 'a@0'; there are 999 capture groups
                    #  and \1000 is seen as \100 (a '@') and a '0'.

=back

You can force a backreference interpretation always by using the C<\g{...}>
form.  You can force an octal interpretation always by using the C<\o{...}>
form, or for numbers up through \077 (= 63 decimal), by using three digits,
beginning with a "0".
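
The three rules, and the two unambiguous spellings, can be exercised with
a small script.  This is a sketch that assumes an ASCII platform and
Perl 5.14 or later (for C<\o{}>); the C<@seen> array is a name invented
for the example.

 use strict;
 use warnings;

 my @seen;

 # Rule 1: a single digit is always a backreference.
 push @seen, 'rule1' if 'abab' =~ /\A(ab)\1\z/;

 # Rule 2: a leading 0 always means octal; \053 is a literal "+".
 push @seen, 'rule2' if 'a+' =~ /\A(a)\053\z/;

 # Rule 3: only one group exists, so \12 is octal 12 (a newline).
 push @seen, 'rule3' if "a\n" =~ /\A(a)\12\z/;

 # \g{} always means a backreference, \o{} always means octal.
 push @seen, 'forced' if "aa\n" =~ /\A(a)\g{1}\o{12}\z/;

 print "@seen\n";    # prints "rule1 rule2 rule3 forced"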

=head3 Hexadecimal escapes

Like octal escapes, there are two forms of hexadecimal escapes, but both start
with the sequence C<\x>.  This is followed by either exactly two hexadecimal
digits forming a number, or a hexadecimal number of arbitrary length surrounded
by curly braces. The hexadecimal number is the code point of the character you
want to express.

Note that a character expressed as one of these escapes is considered a
character without special meaning by the regex engine, and will match
"as is".

Mnemonic: heI<x>adecimal.

=head4 Examples (assuming an ASCII platform)

 $str = "Perl";
 $str =~ /\x50/;    # Match, "\x50" is "P".
 $str =~ /\x50+/;   # Match, "\x50" is "P", it is repeated at least once
 $str =~ /P\x2B/;   # No match, "\x2B" is "+" and taken literally.

 /\x{2603}\x{2602}/ # Snowman with an umbrella.
                    # The Unicode character 2603 is a snowman,
                    # the Unicode character 2602 is an umbrella.
 /\x{263B}/         # Black smiling face.
 /\x{263b}/         # Same, the hex digits A - F are case insensitive.

=head2 Modifiers

A number of backslash sequences have to do with changing the character,
or characters following them. C<\l> will lowercase the character following
it, while C<\u> will uppercase (or, more accurately, titlecase) the
character following it. They provide functionality similar to the
functions C<lcfirst> and C<ucfirst>.

To uppercase or lowercase several characters, one might want to use
C<\L> or C<\U>, which will lowercase/uppercase all characters following
them, until either the end of the pattern or the next occurrence of
C<\E>, whichever comes first. They provide functionality similar to what
the functions C<lc> and C<uc> provide.

C<\Q> is used to quote (disable) pattern metacharacters, up to the next
C<\E> or the end of the pattern. C<\Q> adds a backslash to any character
that could have special meaning to Perl.  In the ASCII range, it quotes
every character that isn't a letter, digit, or underscore.  See
L<perlfunc/quotemeta> for details on what gets quoted for non-ASCII
code points.  Using this ensures that any character between C<\Q> and
C<\E> will be matched literally, not interpreted as a metacharacter by
the regex engine.

C<\F> can be used to casefold all characters following, up to the next C<\E>
or the end of the pattern. It provides the functionality similar to
the C<fc> function.

Mnemonic: I<L>owercase, I<U>ppercase, I<F>old-case, I<Q>uotemeta, I<E>nd.

=head4 Examples

 $sid     = "sid";
 $greg    = "GrEg";
 $miranda = "(Miranda)";
 $str     =~ /\u$sid/;        # Matches 'Sid'
 $str     =~ /\L$greg/;       # Matches 'greg'
 $str     =~ /\Q$miranda\E/;  # Matches '(Miranda)', as if the pattern
                              #   had been written as /\(Miranda\)/
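
The same variables can be put into a self-contained script.  This sketch
anchors each pattern with C<\A> and C<\z> so that only the whole string
can match; the C<@seen> array is a name invented for the example.

 use strict;
 use warnings;

 my $sid     = "sid";
 my $greg    = "GrEg";
 my $miranda = "(Miranda)";

 my @seen;
 push @seen, 'u' if "Sid"       =~ /\A\u$sid\z/;
 push @seen, 'L' if "greg"      =~ /\A\L$greg\E\z/;
 push @seen, 'Q' if "(Miranda)" =~ /\A\Q$miranda\E\z/;

 # Without \Q the parentheses form a capture group instead, so
 # the pattern matches "Miranda" but not "(Miranda)".
 push @seen, 'group' if "Miranda" =~ /\A$miranda\z/
                    && "(Miranda)" !~ /\A$miranda\z/;

 print "@seen\n";    # prints "u L Q group"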

=head2 Character classes

Perl regular expressions have a large range of character classes. Some of
the character classes are written as a backslash sequence. We will briefly
discuss those here; full details of character classes can be found in
L<perlrecharclass>.

C<\w> is a character class that matches any single I<word> character
(letters, digits, Unicode marks, and connector punctuation (like the
underscore)).  C<\d> is a character class that matches any decimal
digit, while the character class C<\s> matches any whitespace character.
New in perl 5.10.0 are the classes C<\h> and C<\v> which match horizontal
and vertical whitespace characters.

The exact set of characters matched by C<\d>, C<\s>, and C<\w> varies
depending on various pragma and regular expression modifiers.  It is
possible to restrict the match to the ASCII range by using the C</a>
regular expression modifier.  See L<perlrecharclass>.

The uppercase variants (C<\W>, C<\D>, C<\S>, C<\H>, and C<\V>) are
character classes that match, respectively, any character that isn't a
word character, digit, whitespace, horizontal whitespace, or vertical
whitespace.

Mnemonics: I<w>ord, I<d>igit, I<s>pace, I<h>orizontal, I<v>ertical.
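
A short sketch of these classes follows.  It assumes Perl 5.14 or later
(for the C</a> modifier); the variable names are illustrative only.

 use strict;
 use warnings;

 my $arabic_three = "\x{0663}";   # ARABIC-INDIC DIGIT THREE

 my @seen;
 push @seen, 'd'   if     $arabic_three =~ /\A\d\z/;   # a Unicode digit
 push @seen, 'd/a' unless $arabic_three =~ /\A\d\z/a;  # /a allows 0-9 only
 push @seen, 'h'   if "\t" =~ /\A\h\z/ && "\t" !~ /\A\v\z/;
 push @seen, 'v'   if "\n" =~ /\A\v\z/ && "\n" !~ /\A\h\z/;
 push @seen, 'w'   if '_'  =~ /\A\w\z/ && '-'  =~ /\A\W\z/;

 print "@seen\n";    # prints "d d/a h v w"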

=head3 Unicode classes

C<\pP> (where C<P> is a single letter) and C<\p{Property}> are used to
match a character that matches the given Unicode property; properties
include things like "letter", or "thai character". Capitalizing the
sequence to C<\PP> and C<\P{Property}> makes the sequence match a character
that doesn't match the given Unicode property. For more details, see
L<perlrecharclass/Backslash sequences> and
L<perlunicode/Unicode Character Properties>.

Mnemonic: I<p>roperty.
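
A minimal sketch of property matching.  The properties used here (C<L>,
C<Lu> and C<Thai>) are standard Unicode ones; the C<@seen> array is a
name invented for the example.

 use strict;
 use warnings;
 use charnames ':full';

 my @seen;
 push @seen, 'pL'   if 'a' =~ /\A\pL\z/;       # any letter
 push @seen, 'Lu'   if 'A' =~ /\A\p{Lu}\z/     # uppercase letter
                    && 'a' !~ /\A\p{Lu}\z/;
 push @seen, 'PL'   if '7' =~ /\A\PL\z/;       # not a letter
 push @seen, 'Thai' if "\N{THAI CHARACTER SO SO}" =~ /\A\p{Thai}\z/;

 print "@seen\n";    # prints "pL Lu PL Thai"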

=head2 Referencing

If capturing parentheses are used in a regular expression, we can refer
to the part of the source string that was matched, and match exactly the
same thing. There are three ways of referring to such a I<backreference>:
absolutely, relatively, and by name.

=for later add link to perlrecapture

=head3 Absolute referencing

Either C<\gI<N>> (starting in Perl 5.10.0), or C<\I<N>> (old-style) where I<N>
is a positive (unsigned) decimal number of any length is an absolute reference
to a capturing group.

I<N> refers to the Nth set of parentheses, so C<\gI<N>> refers to whatever has
been matched by that set of parentheses.  Thus C<\g1> refers to the first
capture group in the regex.

The C<\gI<N>> form can be equivalently written as C<\g{I<N>}>
which avoids ambiguity when building a regex by concatenating shorter
strings.  Otherwise if you had a regex C<qr/$a$b/>, and C<$a> contained
C<"\g1">, and C<$b> contained C<"37">, you would get C</\g137/> which is
probably not what you intended.

In the C<\I<N>> form, I<N> must not begin with a "0", and there must be at
least I<N> capturing groups, or else I<N> is considered an octal escape
(but something like C<\18> is the same as C<\0018>; that is, the octal escape
C<"\001"> followed by a literal digit C<"8">).

Mnemonic: I<g>roup.

=head4 Examples

 /(\w+) \g1/;    # Finds a duplicated word, (e.g. "cat cat").
 /(\w+) \1/;     # Same thing; written old-style.
 /(.)(.)\g2\g1/;  # Match a four letter palindrome (e.g. "ABBA").
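
The concatenation problem described above can be reproduced with a short
script.  This is a sketch: the variable names are invented, and the second
half relies on Perl refusing to compile a reference to a capture group
that does not exist.

 use strict;
 use warnings;

 my $tail = '37';

 my $braced = '(\w)\g{1}' . $tail;   # group 1, then a literal "37"
 my $bare   = '(\w)\g1'   . $tail;   # reads as \g137, i.e. group 137

 my $braced_ok = 'xx37' =~ /\A$braced\z/ ? 1 : 0;

 # Only one group exists, so compiling the bare form fails.
 my $bare_ok = eval { my $re = qr/\A$bare\z/; 1 } ? 1 : 0;

 print "braced=$braced_ok bare=$bare_ok\n";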


=head3 Relative referencing

C<\g-I<N>> (starting in Perl 5.10.0) is used for relative addressing.  (It can
be written as C<\g{-I<N>}>.)  It refers to the I<N>th group before the
C<\g{-I<N>}>.

The big advantage of this form is that it makes it much easier to write
patterns with references that can be interpolated in larger patterns,
even if the larger pattern also contains capture groups.

=head4 Examples

 /(A)        # Group 1
  (          # Group 2
    (B)      # Group 3
    \g{-1}   # Refers to group 3 (B)
    \g{-3}   # Refers to group 1 (A)
  )
 /x;         # Matches "ABBA".

 my $qr = qr /(.)(.)\g{-2}\g{-1}/;  # Matches 'abab', 'cdcd', etc.
 /$qr$qr/                           # Matches 'ababcdcd'.
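
To see why this matters, the sketch below interpolates the same
sub-pattern twice, once written with relative references and once with
absolute ones.  The variable names are chosen for this example only.

 use strict;
 use warnings;

 my $rel = qr/(.)(.)\g{-2}\g{-1}/;   # relative references
 my $abs = qr/(.)(.)\g{1}\g{2}/;     # absolute references

 my @seen;
 push @seen, 'rel-once'  if 'abab'     =~ /\A$rel\z/;
 push @seen, 'rel-twice' if 'ababcdcd' =~ /\A$rel$rel\z/;

 # In the second copy \g{1} and \g{2} still refer to "a" and "b",
 # so "cdcd" cannot be matched.
 push @seen, 'abs-fails' unless 'ababcdcd' =~ /\A$abs$abs\z/;

 print "@seen\n";    # prints "rel-once rel-twice abs-fails"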

=head3 Named referencing

C<\g{I<name>}> (starting in Perl 5.10.0) can be used to back refer to a
named capture group, dispensing completely with having to think about capture
buffer positions.

To be compatible with .NET regular expressions, C<\g{name}> may also be
written as C<\k{name}>, C<< \k<name> >> or C<\k'name'>.

To prevent any ambiguity, I<name> must not start with a digit nor contain a
hyphen.

=head4 Examples

 /(?<word>\w+) \g{word}/ # Finds duplicated word, (e.g. "cat cat")
 /(?<word>\w+) \k{word}/ # Same.
 /(?<word>\w+) \k<word>/ # Same.
 /(?<letter1>.)(?<letter2>.)\g{letter2}\g{letter1}/
                         # Match a four letter palindrome (e.g. "ABBA")
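
After a successful match, the text captured by a named group is also
available through the C<%+> hash (see L<perlvar>).  A minimal sketch, with
variable names invented for the example:

 use strict;
 use warnings;

 my $word = '';
 if ('cat cat' =~ /\A(?<word>\w+) \k<word>\z/) {
     $word = $+{word};
 }
 print "repeated: $word\n";    # prints "repeated: cat"

 my $palindrome =
     'ABBA' =~ /\A(?<l1>.)(?<l2>.)\g{l2}\g{l1}\z/ ? 1 : 0;
 print "$palindrome\n";        # prints "1"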

=head2 Assertions

Assertions are conditions that have to be true; they don't actually
match parts of the substring. There are six assertions that are written as
backslash sequences.

=over 4

=item \A

C<\A> only matches at the beginning of the string. If the C</m> modifier
isn't used, then C</\A/> is equivalent to C</^/>. However, if the C</m>
modifier is used, then C</^/> matches internal newlines, but the meaning
of C</\A/> isn't changed by the C</m> modifier. C<\A> matches at the beginning
of the string regardless of whether the C</m> modifier is used.

=item \z, \Z

C<\z> and C<\Z> match at the end of the string. If the C</m> modifier isn't
used, then C</\Z/> is equivalent to C</$/>; that is, it matches at the
end of the string, or one before the newline at the end of the string. If the
C</m> modifier is used, then C</$/> matches at internal newlines, but the
meaning of C</\Z/> isn't changed by the C</m> modifier. C<\Z> matches at
the end of the string (or just before a trailing newline) regardless of whether
the C</m> modifier is used.

C<\z> is just like C<\Z>, except that it does not match before a trailing
newline. C<\z> matches at the end of the string only, regardless of the
modifiers used, and not just before a newline.  It is how to anchor the
match to the true end of the string under all conditions.

=item \G

C<\G> is usually used only in combination with the C</g> modifier. If the
C</g> modifier is used and the match is done in scalar context, Perl 
remembers where in the source string the last match ended, and the next time,
it will start the match from where it ended the previous time.

C<\G> matches the point where the previous match on that string ended, 
or the beginning of that string if there was no previous match.

=for later add link to perlremodifiers

Mnemonic: I<G>lobal.

=item \b{}, \b, \B{}, \B

C<\b{...}>, available starting in v5.22, matches a boundary (between two
characters, or before the first character of the string, or after the
final character of the string) based on the Unicode rules for the
boundary type specified inside the braces.  The boundary
types are given a few paragraphs below.  C<\B{...}> matches at any place
between characters where C<\b{...}> of the same type doesn't match.

C<\b> when not immediately followed by a C<"{"> matches at any place
between a word (something matched by C<\w>) and a non-word character
(C<\W>); C<\B> when not immediately followed by a C<"{"> matches at any
place between characters where C<\b> doesn't match.  To get better
word matching of natural language text, see L</\b{wb}> below.

C<\b>
and C<\B> assume there's a non-word character before the beginning and after
the end of the source string; so C<\b> will match at the beginning (or end)
of the source string if the source string begins (or ends) with a word
character. Otherwise, C<\B> will match.   

Do not use something like C<\b=head\d\b> and expect it to match the
beginning of a line.  It can't, because for there to be a boundary before
the non-word "=", there must be a word character immediately previous.  
All plain C<\b> and C<\B> boundary determinations look for word
characters alone, not for
non-word characters nor for string ends.  It may help to understand how
C<\b> and C<\B> work by equating them as follows:

    \b	really means	(?:(?<=\w)(?!\w)|(?<!\w)(?=\w))
    \B	really means	(?:(?<=\w)(?=\w)|(?<!\w)(?!\w))

In contrast, C<\b{...}> and C<\B{...}> may or may not match at the
beginning and end of the line, depending on the boundary type.  These
implement the Unicode default boundaries, specified in
L<http://www.unicode.org/reports/tr14/> and
L<http://www.unicode.org/reports/tr29/>.
The boundary types are:

=over

=item C<\b{gcb}> or C<\b{g}>

This matches a Unicode "Grapheme Cluster Boundary".  (Actually Perl
always uses the improved "extended" grapheme cluster.)  These are
explained below under L</C<\X>>.  In fact, C<\X> is another way to get
the same functionality.  It is equivalent to C</.+?\b{gcb}/>.  Use
whichever is most convenient for your situation.

=item C<\b{lb}>

This matches according to the default Unicode Line Breaking Algorithm
(L<http://www.unicode.org/reports/tr14/>), as customized in that
document
(L<Example 7 of revision 35|http://www.unicode.org/reports/tr14/tr14-35.html#Example7>)
for better handling of numeric expressions.

This is suitable for many purposes, but the L<Unicode::LineBreak> module,
available on CPAN, provides many more features, including
customization.

=item C<\b{sb}>

This matches a Unicode "Sentence Boundary".  This is an aid to parsing
natural language sentences.  It gives good, but imperfect results.  For
example, it thinks that "Mr. Smith" is two sentences.  More details are
at L<http://www.unicode.org/reports/tr29/>.  Note also that it thinks
that anything matching L</\R> (except form feed and vertical tab) is a
sentence boundary.  C<\b{sb}> works with text designed for
word-processors which wrap lines
automatically for display, but hard-coded line boundaries are considered
to be essentially the ends of text blocks (paragraphs really), and hence
the ends of sentences.  C<\b{sb}> doesn't do well with text containing
embedded newlines, like the source text of the document you are reading.
Such text needs to be preprocessed to get rid of the line separators
before looking for sentence boundaries.  Some people view this as a bug
in the Unicode standard, and this behavior is quite subject to change in
future Perl versions.

=item C<\b{wb}>

This matches a Unicode "Word Boundary", but tailored to Perl
expectations.  This gives better (though not
perfect) results for natural language processing than plain C<\b>
(without braces) does.  For example, it understands that apostrophes can
be in the middle of words and that parentheses aren't (see the examples
below).  More details are at L<http://www.unicode.org/reports/tr29/>.

The current Unicode definition of a Word Boundary matches between every
white space character.  Perl tailors this, starting in version 5.24, to
generally not break up spans of white space, just as plain C<\b> has
always functioned.  This allows C<\b{wb}> to be a drop-in replacement for
C<\b>, but with generally better results for natural language
processing.  (The exception to this tailoring is when a span of white
space is immediately followed by something like U+0303, COMBINING TILDE.
If the final space character in the span is a horizontal white space, it
is broken out so that it attaches instead to the combining character.
To be precise, if a span of white space that ends in a horizontal space
is immediately followed by a character with any of the Word
Boundary property values "Extend", "Format" or "ZWJ", the boundary between the
final horizontal space character and the rest of the span matches
C<\b{wb}>.  In all other cases the boundary between two white space
characters matches C<\B{wb}>.)

=back

It is important to realize when you use these Unicode boundaries,
that you are taking a risk that a future version of Perl which contains
a later version of the Unicode Standard will not work precisely the same
way as it did when your code was written.  These rules are not
considered stable and have been somewhat more subject to change than the
rest of the Standard.  Unicode reserves the right to change them at
will, and Perl reserves the right to update its implementation to
Unicode's new rules.  In the past, some changes have been because new
characters have been added to the Standard which have different
characteristics than all previous characters, so new rules are
formulated for handling them.  These should not cause any backward
compatibility issues.  But some changes have changed the treatment of
existing characters because the Unicode Technical Committee has decided
that the change is warranted for whatever reason.  This could be to fix
a bug, or because they think better results are obtained with the new
rule.

It is also important to realize that these are default boundary
definitions, and that implementations may wish to tailor the results for
particular purposes and locales.  For example, some languages, such as
Japanese and Thai, require dictionary lookup to determine word
boundaries.

Mnemonic: I<b>oundary.

=back

=head4 Examples

  "cat"   =~ /\Acat/;     # Match.
  "cat"   =~ /cat\Z/;     # Match.
  "cat\n" =~ /cat\Z/;     # Match.
  "cat\n" =~ /cat\z/;     # No match.

  "cat"   =~ /\bcat\b/;   # Matches.
  "cats"  =~ /\bcat\b/;   # No match.
  "cat"   =~ /\bcat\B/;   # No match.
  "cats"  =~ /\bcat\B/;   # Match.

  while ("cat dog" =~ /(\w+)/g) {
      print $1;           # Prints 'catdog'
  }
  while ("cat dog" =~ /\G(\w+)/g) {
      print $1;           # Prints 'cat'
  }

  my $s = "He said, \"Is pi 3.14? (I'm not sure).\"";
  print join("|", $s =~ m/ ( .+? \b     ) /xg), "\n";
  print join("|", $s =~ m/ ( .+? \b{wb} ) /xg), "\n";
 prints
  He| |said|, "|Is| |pi| |3|.|14|? (|I|'|m| |not| |sure
  He| |said|,| |"|Is| |pi| |3.14|?| |(|I'm| |not| |sure|)|.|"
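
A common use of C<\G> is a hand-written tokenizer.  The following is
only a minimal sketch under stated assumptions: the input string and the
token labels (C<NUM>, C<ID>, C<OP>) are invented for illustration.  It
adds the C</c> modifier so that a failed match leaves C<pos()> where it
was, which lets each alternative be tried at the same offset:

  use strict;
  use warnings;

  my $input = "x = 42 + y";       # hypothetical input
  my @tokens;
  pos($input) = 0;                # start scanning at the beginning
  while (pos($input) < length $input) {
      if    ($input =~ /\G\s+/gc)    { next }   # skip white space
      elsif ($input =~ /\G(\d+)/gc)  { push @tokens, "NUM($1)" }
      elsif ($input =~ /\G(\w+)/gc)  { push @tokens, "ID($1)" }
      elsif ($input =~ /\G([=+])/gc) { push @tokens, "OP($1)" }
      else  { die "Unexpected character at offset ", pos($input), "\n" }
  }
  # @tokens is now ("ID(x)", "OP(=)", "NUM(42)", "OP(+)", "ID(y)")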

=head2 Misc

Here we document the backslash sequences that don't fall in one of the
categories above. These are:

=over 4

=item \K

This appeared in perl 5.10.0. Anything matched left of C<\K> is
not included in C<$&>, and will not be replaced if the pattern is
used in a substitution. This lets you write C<s/PAT1 \K PAT2/REPL/x>
instead of C<s/(PAT1) PAT2/${1}REPL/x> or C<s/(?<=PAT1) PAT2/REPL/x>.

Mnemonic: I<K>eep.

=item \N

This feature, available starting in v5.12,  matches any character
that is B<not> a newline.  It is a short-hand for writing C<[^\n]>, and is
identical to the C<.> metasymbol, except under the C</s> flag, which changes
the meaning of C<.>, but not C<\N>.

Note that C<\N{...}> can mean a
L<named or numbered character
|/Named or numbered characters and character sequences>.

Mnemonic: Complement of I<\n>.

=item \R
X<\R>

C<\R> matches a I<generic newline>; that is, anything considered a
linebreak sequence by Unicode. This includes all characters matched by
C<\v> (vertical whitespace), and the multi character sequence C<"\x0D\x0A">
(carriage return followed by a line feed, sometimes called the network
newline; it's the end of line sequence used in Microsoft text files opened
in binary mode). C<\R> is equivalent to C<< (?>\x0D\x0A|\v) >>.  (The
reason it doesn't backtrack is that the sequence is considered
inseparable.  That means that

 "\x0D\x0A" =~ /^\R\x0A$/   # No match

fails, because the C<\R> matches the entire string, and won't backtrack
to match just the C<"\x0D">.)  Since
C<\R> can match a sequence of more than one character, it cannot be put
inside a bracketed character class; C</[\R]/> is an error; use C<\v>
instead.  C<\R> was introduced in perl 5.10.0.

Note that this does not respect any locale that might be in effect; it
matches according to the platform's native character set.

Mnemonic: none really. C<\R> was picked because PCRE already uses C<\R>,
and more importantly because Unicode recommends such a regular expression
metacharacter, and suggests C<\R> as its notation.

=item \X
X<\X>

This matches a Unicode I<extended grapheme cluster>.

C<\X> matches quite well what normal (non-Unicode-programmer) usage
would consider a single character.  As an example, consider a G with some sort
of diacritic mark, such as an arrow.  There is no such single character in
Unicode, but one can be composed by using a G followed by a Unicode "COMBINING
UPWARDS ARROW BELOW", and would be displayed by Unicode-aware software as if it
were a single character.

The match is greedy and non-backtracking, so that the cluster is never
broken up into smaller components.

See also L<C<\b{gcb}>|/\b{}, \b, \B{}, \B>.

Mnemonic: eI<X>tended Unicode character.

=back

=head4 Examples

 $str =~ s/foo\Kbar/baz/g; # Change any 'bar' following a 'foo' to 'baz'
 $str =~ s/(.)\K\g1//g;    # Delete duplicated characters.
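
The C<\K>, C<\R> and C<\N> escapes described above can also be checked
with a short script.  This is an illustrative sketch; the strings and
variable names are made up for the example:

  use strict;
  use warnings;

  my $text = "foobar barbar";
  $text =~ s/foo\Kbar/baz/g;      # only the 'bar' after 'foo' changes

  # \R treats "\r\n" as one inseparable line break.
  my @lines = split /\R/, "one\ntwo\r\nthree\rfour";

  my $dot = "a\nb" =~ /a.b/s  ? 1 : 0;   # 1: under /s, "." matches "\n"
  my $not = "a\nb" =~ /a\Nb/s ? 1 : 0;   # 0: \N never matches "\n"

  # $text is "foobaz barbar"; @lines is ("one", "two", "three", "four")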

 "\n"   =~ /^\R$/;         # Match, \n   is a generic newline.
 "\r"   =~ /^\R$/;         # Match, \r   is a generic newline.
 "\r\n" =~ /^\R$/;         # Match, \r\n is a generic newline.

 "P\x{307}" =~ /^\X$/     # \X matches a P with a dot above.

=cut
=encoding utf8

=head1 NAME

perl5100delta - what is new for perl 5.10.0

=head1 DESCRIPTION

This document describes the differences between the 5.8.8 release and
the 5.10.0 release.

Many of the bug fixes in 5.10.0 were already seen in the 5.8.X maintenance
releases; they are not duplicated here and are documented in the set of
man pages named perl58[1-8]?delta.

=head1 Core Enhancements

=head2 The C<feature> pragma

The C<feature> pragma is used to enable new syntax that would break Perl's
backwards-compatibility with older releases of the language. It's a lexical
pragma, like C<strict> or C<warnings>.

Currently the following new features are available: C<switch> (adds a
switch statement), C<say> (adds a C<say> built-in function), and C<state>
(adds a C<state> keyword for declaring "static" variables). Those
features are described in their own sections of this document.

The C<feature> pragma is also implicitly loaded when you require a minimal
perl version (with the C<use VERSION> construct) greater than, or equal
to, 5.9.5. See L<feature> for details.

=head2 New B<-E> command-line switch

B<-E> is equivalent to B<-e>, but it implicitly enables all
optional features (like C<use feature ":5.10">).

=head2 Defined-or operator

A new operator C<//> (defined-or) has been implemented.
The following expression:

    $a // $b

is equivalent to

   defined $a ? $a : $b

and the statement

   $c //= $d;

can now be used instead of

   $c = $d unless defined $c;

The C<//> operator has the same precedence and associativity as C<||>.
Special care has been taken to ensure that this operator Does What You Mean
while not breaking old code, but some edge cases involving the empty
regular expression may now parse differently.  See L<perlop> for
details.
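
The practical difference from C<||> shows up with values that are
defined but false.  A small sketch (the hash and its keys are
hypothetical, chosen only to illustrate the point):

    use strict;
    use warnings;

    my %opt = (retries => 0);

    my $with_or  = $opt{retries} || 3;    # 3:  0 is false, so || discards it
    my $with_dor = $opt{retries} // 3;    # 0:  0 is defined, so // keeps it
    my $timeout  = $opt{timeout} // 30;   # 30: the key is missing (undef)

    $opt{verbose} //= 1;                  # assigns only if currently undef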

=head2 Switch and Smart Match operator

Perl 5 now has a switch statement. It's available when C<use feature
'switch'> is in effect. This feature introduces three new keywords,
C<given>, C<when>, and C<default>:

    given ($foo) {
	when (/^abc/) { $abc = 1; }
	when (/^def/) { $def = 1; }
	when (/^xyz/) { $xyz = 1; }
	default { $nothing = 1; }
    }

A more complete description of how Perl matches the switch variable
against the C<when> conditions is given in L<perlsyn/"Switch statements">.

This kind of match is called I<smart match>, and it's also possible to use
it outside of switch statements, via the new C<~~> operator. See
L<perlsyn/"Smart matching in detail">.

This feature was contributed by Robin Houston.

=head2 Regular expressions

=over 4

=item Recursive Patterns

It is now possible to write recursive patterns without using the C<(??{})>
construct. This new way is more efficient, and in many cases easier to
read.

Each capturing parenthesis can now be treated as an independent pattern
that can be entered by using the C<(?PARNO)> syntax (C<PARNO> standing for
"parenthesis number"). For example, the following pattern will match
nested balanced angle brackets:

    /
     ^                      # start of line
     (                      # start capture buffer 1
	<                   #   match an opening angle bracket
	(?:                 #   match one of:
	    (?>             #     don't backtrack over the inside of this group
	        [^<>]+      #       one or more non angle brackets
	    )               #     end non backtracking group
	|                   #     ... or ...
	    (?1)            #     recurse to bracket 1 and try it again
	)*                  #   0 or more times.
	>                   #   match a closing angle bracket
     )                      # end capture buffer one
     $                      # end of line
    /x

PCRE users should note that Perl's recursive regex feature allows
backtracking into a recursed pattern, whereas in PCRE the recursion is
atomic or "possessive" in nature.  As in the example above, you can
add (?>) to control this selectively.  (Yves Orton)

=item Named Capture Buffers

It is now possible to name capturing parentheses in a pattern and refer to
the captured contents by name. The naming syntax is C<< (?<NAME>....) >>.
It's possible to backreference to a named buffer with the C<< \k<NAME> >>
syntax. In code, the new magical hashes C<%+> and C<%-> can be used to
access the contents of the capture buffers.

Thus, to replace all doubled chars with a single copy, one could write

    s/(?<letter>.)\k<letter>/$+{letter}/g

Only buffers with defined contents will be "visible" in the C<%+> hash, so
it's possible to do something like

    foreach my $name (keys %+) {
        print "content of buffer '$name' is $+{$name}\n";
    }

The C<%-> hash is a bit more complete, since it will contain array refs
holding values from all capture buffers similarly named, if there should
be many of them.

C<%+> and C<%-> are implemented as tied hashes through the new module
C<Tie::Hash::NamedCapture>.

Users exposed to the .NET regex engine will find that the perl
implementation differs in that the numerical ordering of the buffers
is sequential, and not "unnamed first, then named". Thus in the pattern

   /(A)(?<B>B)(C)(?<D>D)/

$1 will be 'A', $2 will be 'B', $3 will be 'C' and $4 will be 'D', rather
than the $1 = 'A', $2 = 'C', $3 = 'B' and $4 = 'D' that a .NET programmer
would expect. This is considered a feature. :-) (Yves Orton)

=item Possessive Quantifiers

Perl now supports the "possessive quantifier" syntax of the "atomic match"
pattern. Basically a possessive quantifier matches as much as it can and never
gives any back. Thus it can be used to control backtracking. The syntax is
similar to non-greedy matching, except instead of using a '?' as the modifier
the '+' is used. Thus C<?+>, C<*+>, C<++>, C<{min,max}+> are now legal
quantifiers. (Yves Orton)

=item Backtracking control verbs

The regex engine now supports a number of special-purpose backtrack
control verbs: (*THEN), (*PRUNE), (*MARK), (*SKIP), (*COMMIT), (*FAIL)
and (*ACCEPT). See L<perlre> for their descriptions. (Yves Orton)

=item Relative backreferences

A new syntax C<\g{N}> or C<\gN> where "N" is a decimal integer allows a
safer form of back-reference notation as well as allowing relative
backreferences. This should make it easier to generate and embed patterns
that contain backreferences. See L<perlre/"Capture buffers">. (Yves Orton)

=item C<\K> escape

The functionality of Jeff Pinyan's module Regexp::Keep has been added to
the core. In regular expressions you can now use the special escape C<\K>
as a way to do something like floating length positive lookbehind. It is
also useful in substitutions like:

  s/(foo)bar/$1/g

that can now be converted to

  s/foo\Kbar//g

which is much more efficient. (Yves Orton)

=item Vertical and horizontal whitespace, and linebreak

Regular expressions now recognize the C<\v> and C<\h> escapes that match
vertical and horizontal whitespace, respectively. C<\V> and C<\H>
logically match their complements.

C<\R> matches a generic linebreak, that is, vertical whitespace, plus
the multi-character sequence C<"\x0D\x0A">.

=item Optional pre-match and post-match captures with the /p flag

There is a new flag C</p> for regular expressions.  With this flag, the
engine saves three copies after a successful match: the part of the
target string before the matched substring in the new special variable
C<${^PREMATCH}>, the part after it in C<${^POSTMATCH}>, and the matched
substring itself in C<${^MATCH}>.

Perl is still able to store these substrings to the special variables
C<$`>, C<$'>, C<$&>, but using these variables anywhere in the program
adds a penalty to all regular expression matches, whereas if you use
the C</p> flag and the new special variables instead, you pay only for
the regular expressions where the flag is used.

For more detail on the new variables, see L<perlvar>; for the use of
the regular expression flag, see L<perlop> and L<perlre>.

=back
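
Several of the regular expression additions listed above can be tried
together.  The following is a sketch rather than a reference; the sample
strings and variable names are assumptions made for the example, and the
recursive pattern is the angle-bracket pattern shown earlier, written on
one line:

    use strict;
    use warnings;

    # Named capture: read the groups back through %+.
    my ($year, $month);
    if ("2007-12-18" =~ /^(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)$/) {
        ($year, $month) = ($+{year}, $+{month});
    }

    # Recursion: (?1) re-enters capture group 1.
    my $balanced = qr/^(<(?:(?>[^<>]+)|(?1))*>)$/;
    my $ok  = "<a<b>c>" =~ $balanced ? 1 : 0;       # 1
    my $bad = "<a<b>"   =~ $balanced ? 1 : 0;       # 0

    # Possessive quantifier: "a++" takes every "a" and gives none back.
    my $greedy     = "aaa" =~ /^a+a$/  ? 1 : 0;     # 1
    my $possessive = "aaa" =~ /^a++a$/ ? 1 : 0;     # 0

    # /p: per-match copies instead of the program-wide cost of $&.
    my ($pre, $match, $post);
    if ("hello world" =~ /o w/p) {
        ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
    }
    # $pre is "hell", $match is "o w", $post is "orld"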

=head2 C<say()>

say() is a new built-in, only available when C<use feature 'say'> is in
effect, that is similar to print(), but that implicitly appends a newline
to the printed string. See L<perlfunc/say>. (Robin Houston)

=head2 Lexical C<$_>

The default variable C<$_> can now be lexicalized, by declaring it like
any other lexical variable, with a simple

    my $_;

The operations that default on C<$_> will use the lexically-scoped
version of C<$_> when it exists, instead of the global C<$_>.

In a C<map> or a C<grep> block, if C<$_> was previously my'ed, then the
C<$_> inside the block is lexical as well (and scoped to the block).

In a scope where C<$_> has been lexicalized, you can still have access to
the global version of C<$_> by using C<$::_>, or, more simply, by
overriding the lexical declaration with C<our $_>. (Rafael Garcia-Suarez)

=head2 The C<_> prototype

A new prototype character has been added. C<_> is equivalent to C<$> but
defaults to C<$_> if the corresponding argument isn't supplied (both C<$>
and C<_> denote a scalar). Due to the optional nature of the argument, 
you can only use it at the end of a prototype, or before a semicolon.

This has a small incompatible consequence: the prototype() function has
been adjusted to return C<_> for some built-ins in appropriate cases (for
example, C<prototype('CORE::rmdir')>). (Rafael Garcia-Suarez)
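
A minimal sketch of a user-defined subroutine with the C<_> prototype
follows.  The subroutine name C<shout> is hypothetical; note that the
sub must be declared before the calls for the prototype to apply:

    use strict;
    use warnings;

    sub shout(_) { return uc $_[0] }

    my @loud;
    for (qw(a b)) {
        push @loud, shout();      # no argument: shout() receives $_
    }
    my $explicit = shout("c");    # an explicit argument still works

    # @loud is ("A", "B") and $explicit is "C"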

=head2 UNITCHECK blocks

C<UNITCHECK>, a new special code block, has been introduced, in addition to
C<BEGIN>, C<CHECK>, C<INIT> and C<END>.

C<CHECK> and C<INIT> blocks, while useful for some specialized purposes,
are always executed at the transition between the compilation and the
execution of the main program, and thus are useless whenever code is
loaded at runtime. On the other hand, C<UNITCHECK> blocks are executed
just after the unit which defined them has been compiled. See L<perlmod>
for more information. (Alex Gough)

=head2 New Pragma, C<mro>

A new pragma, C<mro> (for Method Resolution Order) has been added. It
lets you switch, on a per-class basis, the algorithm that perl uses to
find inherited methods in case of a multiple inheritance hierarchy. The
default MRO hasn't changed (DFS, for Depth First Search). Another MRO is
available: the C3 algorithm. See L<mro> for more information.
(Brandon Black)

Note that, due to changes in the implementation of class hierarchy search,
code that used to undef the C<*ISA> glob will most probably break. Anyway,
undef'ing C<*ISA> had the side-effect of removing the magic on the @ISA
array and should not have been done in the first place. Also, the
cache C<*::ISA::CACHE::> no longer exists; to force reset the @ISA cache,
you now need to use the C<mro> API, or more simply to assign to @ISA
(e.g. with C<@ISA = @ISA>).
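
The difference between the two orders is easiest to see on a diamond
hierarchy.  In this sketch the class names (C<Top>, C<Left>, C<Right>,
C<Bottom>) are invented for illustration; C<mro::get_linear_isa> is part
of the C<mro> API:

    use strict;
    use warnings;

    package Top;    sub hello { return "Top" }
    package Left;   our @ISA = ('Top');
    package Right;  our @ISA = ('Top'); sub hello { return "Right" }
    package Bottom; use mro 'c3'; our @ISA = ('Left', 'Right');

    package main;

    my $who = Bottom->hello;      # "Right": C3 visits Right before Top
    my $c3  = join " ", @{ mro::get_linear_isa('Bottom') };
    my $dfs = join " ", @{ mro::get_linear_isa('Bottom', 'dfs') };

    # $c3  is "Bottom Left Right Top"
    # $dfs is "Bottom Left Top Right"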

=head2 readdir() may return a "short filename" on Windows

The readdir() function may return a "short filename" when the long
filename contains characters outside the ANSI codepage.  Similarly
Cwd::cwd() may return a short directory name, and glob() may return short
names as well.  On the NTFS file system these short names can always be
represented in the ANSI codepage.  This will not be true for all other file
system drivers; e.g. the FAT filesystem stores short filenames in the OEM
codepage, so some files on FAT volumes remain inaccessible through the
ANSI APIs.

Similarly, $^X, @INC, and $ENV{PATH} are preprocessed at startup to make
sure all paths are valid in the ANSI codepage (if possible).

The Win32::GetLongPathName() function now returns the UTF-8 encoded
correct long file name instead of using replacement characters to force
the name into the ANSI codepage.  The new Win32::GetANSIPathName()
function can be used to turn a long pathname into a short one only if the
long one cannot be represented in the ANSI codepage.

Many other functions in the C<Win32> module have been improved to accept
UTF-8 encoded arguments.  Please see L<Win32> for details.

=head2 readpipe() is now overridable

The built-in function readpipe() is now overridable. Overriding it
also overrides its operator counterpart, C<qx//> (a.k.a. C<``>).
Moreover, it now defaults to C<$_> if no argument is provided. (Rafael
Garcia-Suarez)

=head2 Default argument for readline()

readline() now defaults to C<*ARGV> if no argument is provided. (Rafael
Garcia-Suarez)

=head2 state() variables

A new class of variables has been introduced. State variables are similar
to C<my> variables, but are declared with the C<state> keyword in place of
C<my>. They're visible only in their lexical scope, but their value is
persistent: unlike C<my> variables, they're not undefined at scope entry,
but retain their previous value. (Rafael Garcia-Suarez, Nicholas Clark)

To use state variables, one needs to enable them by using

    use feature 'state';

or by using the C<-E> command-line switch in one-liners.
See L<perlsub/"Persistent Private Variables">.
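
As an illustration, here is a small counter built on a state variable.
The subroutine name C<next_id> is hypothetical:

    use strict;
    use warnings;
    use feature 'state';

    sub next_id {
        state $n = 0;     # initialized once, on the first call only
        return ++$n;
    }

    my @ids = map { next_id() } 1 .. 3;    # (1, 2, 3)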

=head2 Stacked filetest operators

As a new form of syntactic sugar, it's now possible to stack up filetest
operators. You can now write C<-f -w -x $file> in a row to mean
C<-x $file && -w _ && -f _>. See L<perlfunc/-X>.
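
A short sketch comparing the stacked form with its longhand equivalent.
It assumes a scratch file created with C<File::Temp>, which the current
user can read and write:

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    my ($fh, $path) = tempfile(UNLINK => 1);   # a scratch file we own

    my $stacked  = (-f -r -w $path)           ? 1 : 0;
    my $longhand = (-w $path && -r _ && -f _) ? 1 : 0;

    # Both forms run the same tests, so the two results agree.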

=head2 UNIVERSAL::DOES()

The C<UNIVERSAL> class has a new method, C<DOES()>. It has been added to
solve semantic problems with the C<isa()> method. C<isa()> checks for
inheritance, while C<DOES()> has been designed to be overridden when
module authors use other types of relations between classes (in addition
to inheritance). (chromatic)

See L<< UNIVERSAL/"$obj->DOES( ROLE )" >>.
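
The following sketch shows one way a class might override C<DOES()>.
The class C<Report> and the role name C<Printable> are hypothetical;
nothing here is a real module:

    use strict;
    use warnings;

    package Report;

    sub new { return bless {}, shift }

    # Claim the "Printable" role without inheriting from it.
    sub DOES {
        my ($self, $role) = @_;
        return 1 if $role eq 'Printable';
        return $self->SUPER::DOES($role);   # fall back to UNIVERSAL::DOES
    }

    package main;

    my $r    = Report->new;
    my $does = $r->DOES('Printable') ? 1 : 0;   # 1
    my $isa  = $r->isa('Printable')  ? 1 : 0;   # 0
    my $own  = $r->DOES('Report')    ? 1 : 0;   # 1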

=head2 Formats

Formats were improved in several ways. A new field, C<^*>, can be used for
variable-width, one-line-at-a-time text. Null characters are now handled
correctly in picture lines. Using C<@#> and C<~~> together will now
produce a compile-time error, as those format fields are incompatible.
L<perlform> has been improved, and miscellaneous bugs fixed.

=head2 Byte-order modifiers for pack() and unpack()

There are two new byte-order modifiers, C<E<gt>> (big-endian) and C<E<lt>>
(little-endian), that can be appended to most pack() and unpack() template
characters and groups to force a certain byte-order for that type or group.
See L<perlfunc/pack> and L<perlpacktut> for details.
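
For instance, the same integer packs to different byte sequences
depending on the modifier.  In this sketch the results are dumped as hex
strings to make the byte order visible:

    use strict;
    use warnings;

    my $big    = unpack("H*", pack("L>", 1));     # "00000001"
    my $little = unpack("H*", pack("L<", 1));     # "01000000"
    my $short  = unpack("H*", pack("S>", 258));   # "0102"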

=head2 C<no VERSION>

You can now use C<no> followed by a version number to specify that you
want to use a version of perl older than the specified one.

=head2 C<chdir>, C<chmod> and C<chown> on filehandles

C<chdir>, C<chmod> and C<chown> can now work on filehandles as well as
filenames, if the system supports respectively C<fchdir>, C<fchmod> and
C<fchown>, thanks to a patch provided by Gisle Aas.

=head2 OS groups

C<$(> and C<$)> now return groups in the order in which the OS returns them,
thanks to Gisle Aas. This wasn't previously the case.

=head2 Recursive sort subs

You can now use recursive subroutines with sort(), thanks to Robin Houston.

=head2 Exceptions in constant folding

The constant folding routine is now wrapped in an exception handler, and
if folding throws an exception (such as attempting to evaluate 0/0), perl
now retains the current optree, rather than aborting the whole program.
Without this change, programs would not compile if they had expressions that
happened to generate exceptions, even though those expressions were in code
that could never be reached at runtime. (Nicholas Clark, Dave Mitchell)

=head2 Source filters in @INC

It's possible to enhance the mechanism of subroutine hooks in @INC by
adding a source filter on top of the filehandle opened and returned by the
hook. This feature was planned a long time ago, but wasn't quite working
until now. See L<perlfunc/require> for details. (Nicholas Clark)

=head2 New internal variables

=over 4

=item C<${^RE_DEBUG_FLAGS}>

This variable controls what debug flags are in effect for the regular
expression engine when running under C<use re "debug">. See L<re> for
details.

=item C<${^CHILD_ERROR_NATIVE}>

This variable gives the native status returned by the last pipe close,
backtick command, successful call to wait() or waitpid(), or from the
system() operator. See L<perlvar> for details. (Contributed by Gisle Aas.)

=item C<${^RE_TRIE_MAXBUF}>

See L</"Trie optimisation of literal string alternations">.

=item C<${^WIN32_SLOPPY_STAT}>

See L</"Sloppy stat on Windows">.

=back

=head2 Miscellaneous

C<unpack()> now defaults to unpacking the C<$_> variable.

C<mkdir()> without arguments now defaults to C<$_>.

The internal dump output has been improved, so that non-printable characters
such as newline and backspace are output in C<\x> notation, rather than
octal.

The B<-C> option can no longer be used on the C<#!> line. It wasn't
working there anyway, since the standard streams are already set up
at this point in the execution of the perl interpreter. You can use
binmode() instead to get the desired behaviour.

=head2 UCD 5.0.0

The copy of the Unicode Character Database included in Perl 5 has
been updated to version 5.0.0.

=head2 MAD

MAD, which stands for I<Miscellaneous Attribute Decoration>, is a
still-in-development work leading to a Perl 5 to Perl 6 converter. To
enable it, it's necessary to pass the argument C<-Dmad> to Configure. The
obtained perl isn't binary compatible with a regular perl 5.10, and has
space and speed penalties; moreover not all regression tests still pass
with it. (Larry Wall, Nicholas Clark)

=head2 kill() on Windows

On Windows platforms, C<kill(-9, $pid)> now kills a process tree.
(On Unix, this delivers the signal to all processes in the same process
group.)

=head1 Incompatible Changes

=head2 Packing and UTF-8 strings

The semantics of pack() and unpack() regarding UTF-8-encoded data has been
changed. Processing is now by default character per character instead of
byte per byte on the underlying encoding. Notably, code that used things
like C<pack("a*", $string)> to see through the encoding of a string will now
simply get back the original $string. Packed strings can also get upgraded
during processing when you store upgraded characters. You can get the old
behaviour by using C<use bytes>.

To be consistent with pack(), the C<C0> in unpack() templates indicates
that the data is to be processed in character mode, i.e. character by
character; on the contrary, C<U0> in unpack() indicates UTF-8 mode, where
the packed string is processed in its UTF-8-encoded Unicode form on a byte
by byte basis. This is reversed with regard to perl 5.8.X, but now consistent
between pack() and unpack().

Moreover, C<C0> and C<U0> can also be used in pack() templates to specify
respectively character and byte modes.

C<C0> and C<U0> in the middle of a pack or unpack format now switch to the
specified encoding mode, honoring parens grouping. Previously, parens were
ignored.

Also, there is a new pack() character format, C<W>, which is intended to
replace the old C<C>. C<C> is kept for unsigned chars coded as bytes in
the string's internal representation. C<W> represents unsigned (logical)
character values, which can be greater than 255. It is therefore more
robust when dealing with potentially UTF-8-encoded data (as C<C> will wrap
values outside the range 0..255, and not respect the string encoding).

In practice, that means that pack formats are now encoding-neutral, except
C<C>.
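
A brief sketch of the difference between C<W> and C<C>.  The code points
are arbitrary examples, and the C<pack> warning category is disabled
only to silence the expected "wrapped" warning:

    use strict;
    use warnings;

    my $wide    = pack "W", 0x263A;
    my $wrapped = do { no warnings 'pack'; pack "C", 300 };

    # length($wide) is 1 and ord($wide) is 0x263A
    # ord($wrapped) is 44, that is, 300 modulo 256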

For consistency, C<A> in unpack() format now trims all Unicode whitespace
from the end of the string. Before perl 5.9.2, it used to strip only the
classical ASCII space characters.

=head2 Byte/character count feature in unpack()

A new unpack() template character, C<".">, returns the number of bytes or
characters (depending on the selected encoding mode, see above) read so far.

=head2 The C<$*> and C<$#> variables have been removed

C<$*>, which was deprecated in favor of the C</s> and C</m> regexp
modifiers, has been removed.

The deprecated C<$#> variable (output format for numbers) has been
removed.

Two new severe warnings, C<$#/$* is no longer supported>, have been added.

=head2 substr() lvalues are no longer fixed-length

The lvalues returned by the three argument form of substr() used to be a
"fixed length window" on the original string. In some cases this could
cause surprising action at a distance or other undefined behaviour. Now the
length of the window adjusts itself to the length of the string assigned to
it.
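
The following sketch shows the adjusting window. It aliases the substr()
lvalue through a C<for> loop; the variable names are only illustrative:

    use strict;
    use warnings;

    my $x = '1234';
    my @seen;
    for (substr($x, 1, 2)) {          # $_ aliases "23" inside $x
        $_ = 'a';    push @seen, $x;  # "1a4":   window shrank to 1 char
        $_ = 'xyz';  push @seen, $x;  # "1xyz4": window grew to 3 chars
    }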

=head2 Parsing of C<-f _>

The identifier C<_> is now forced to be a bareword after a filetest
operator. This solves a number of misparsing issues when a global C<_>
subroutine is defined.

=head2 C<:unique>

The C<:unique> attribute has been made a no-op, since its current
implementation was fundamentally flawed and not threadsafe.

=head2 Effect of pragmas in eval

The compile-time value of the C<%^H> hint variable can now propagate into
code evaluated with eval(""). This makes it more useful for implementing
lexical pragmas.

As a side-effect of this, the overloaded-ness of constants now propagates
into eval("").
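
A minimal sketch of that propagation, assuming a made-up hint key
C<MyPragma/enabled> (no such pragma exists; a real one would set the key
from its C<import> method):

    use strict;
    use warnings;

    our $inner;
    {
        # 'MyPragma/enabled' is a hypothetical hint key for this sketch.
        BEGIN { $^H{'MyPragma/enabled'} = 1 }

        # The string is compiled at run time, yet its BEGIN block
        # still sees the hint that was in scope around the eval.
        eval q{ BEGIN { $main::inner = $^H{'MyPragma/enabled'} } 1 }
            or die $@;
    }
    # $inner is now 1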

=head2 chdir FOO

A bareword argument to chdir() is now recognized as a file handle.
Earlier releases interpreted the bareword as a directory name.
(Gisle Aas)

=head2 Handling of .pmc files

A long-standing feature of perl is that before C<require> or C<use> look
for a file with a F<.pm> extension, they first look for a similar filename
with a F<.pmc> extension. If this file is found, it is loaded in place of
any potentially existing file ending in a F<.pm> extension.

Previously, F<.pmc> files were loaded only if more recent than the
matching F<.pm> file. Starting with 5.9.4, they are always loaded if
they exist.

=head2 $^V is now a C<version> object instead of a v-string

$^V can still be used with the C<%vd> format in printf, but any
character-level operations will now access the string representation
of the C<version> object and not the ordinals of a v-string.
Expressions like C<< substr($^V, 0, 2) >> or C<< split //, $^V >>
no longer work and must be rewritten.
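
One possible way to rewrite such code, sketched here with illustrative
variable names, is to go through the C<%vd> format rather than through
character-level operations:

    use strict;
    use warnings;

    # %vd still formats $^V as a dotted version, e.g. "5.10.0".
    my $dotted = sprintf("%vd", $^V);
    my ($major, $minor, $patch) = split /\./, $dotted;

    # Stringifying the version object yields the "v"-prefixed form.
    my $string = "$^V";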

=head2 @- and @+ in patterns

The special arrays C<@-> and C<@+> are no longer interpolated in regular
expressions. (Sadahiro Tomoyuki)

=head2 $AUTOLOAD can now be tainted

If you call a subroutine by a tainted name, and if it defers to an
AUTOLOAD function, then $AUTOLOAD will be (correctly) tainted.
(Rick Delaney)

=head2 Tainting and printf

When perl is run under taint mode, C<printf()> and C<sprintf()> will now
reject any tainted format argument. (Rafael Garcia-Suarez)

=head2 undef and signal handlers

Undefining or deleting a signal handler via C<undef $SIG{FOO}> is now
equivalent to setting it to C<'DEFAULT'>. (Rafael Garcia-Suarez)

=head2 strictures and dereferencing in defined()

C<use strict 'refs'> was ignoring taking a hard reference in an argument
to defined(), as in:

    use strict 'refs';
    my $x = 'foo';
    if (defined $$x) {...}

This now correctly produces the run-time error C<Can't use string as a
SCALAR ref while "strict refs" in use>.
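
To observe the error without aborting the script, the dereference can be
wrapped in C<eval {}>. This is only a sketch; the variable names are
illustrative:

    use strict;
    use warnings;

    my $x = 'foo';

    # eval {} traps the run-time error so the script can inspect it.
    my $ok  = eval { my $is_defined = defined $$x; 1 };
    my $err = $@;
    # $ok is undef; $err mentions "strict refs"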

C<defined @$foo> and C<defined %$bar> are now also subject to C<strict
'refs'> (that is, C<$foo> and C<$bar> shall be proper references there.)
(C<defined(@foo)> and C<defined(%bar)> are discouraged constructs anyway.)
(Nicholas Clark)

=head2 C<(?p{})> has been removed

The regular expression construct C<(?p{})>, which was deprecated in perl
5.8, has been removed. Use C<(??{})> instead. (Rafael Garcia-Suarez)

=head2 Pseudo-hashes have been removed

Support for pseudo-hashes has been removed from Perl 5.9. (The C<fields>
pragma remains here, but uses an alternate implementation.)

=head2 Removal of the bytecode compiler and of perlcc

C<perlcc>, the byteloader and the supporting modules (B::C, B::CC,
B::Bytecode, etc.) are no longer distributed with the perl sources. Those
experimental tools have never worked reliably, and, due to the lack of
volunteers to keep them in line with the perl interpreter developments, it
was decided to remove them instead of shipping a broken version of those.
The last version of those modules can be found with perl 5.9.4.

However, the B compiler framework stays supported in the perl core, along
with the more useful modules it has made possible (among others, B::Deparse
and B::Concise).

=head2 Removal of the JPL

The JPL (Java-Perl Lingo) has been removed from the perl sources tarball.

=head2 Recursive inheritance detected earlier

Perl will now immediately throw an exception if you modify any package's
C<@ISA> in such a way that it would cause recursive inheritance.

Previously, the exception would not occur until Perl attempted to make
use of the recursive inheritance while resolving a method or doing a
C<$foo-E<gt>isa($bar)> lookup.
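
A minimal sketch of the new behaviour, using two hypothetical packages
C<Animal> and C<Dog>; the exact wording of the exception is not relied on
beyond the phrase "Recursive inheritance":

    use strict;
    use warnings;

    # Animal and Dog are hypothetical packages for this sketch.
    @Dog::ISA = ('Animal');

    # Making Animal inherit from Dog would close the loop, so the
    # assignment itself dies; eval {} traps the exception.
    my $ok  = eval { @Animal::ISA = ('Dog'); 1 };
    my $err = $@;

    @Animal::ISA = ();    # undo the cyclic assignment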

=head2 warnings::enabled and warnings::warnif changed to favor users of modules

The behaviour in 5.10.x favors the person using the module;
the behaviour in 5.8.x favors the module writer.

Assume the following code:

  main calls Foo::Bar::baz()
  Foo::Bar inherits from Foo::Base
  Foo::Bar::baz() calls Foo::Base::_bazbaz()
  Foo::Base::_bazbaz() calls: warnings::warnif('substr', 'some warning message');

On 5.8.x, the code warns when Foo::Bar contains C<use warnings;>.
It does not matter whether Foo::Base or main have warnings enabled;
to disable the warning one has to modify Foo::Bar.

On 5.10.0 and newer, the code warns when main contains C<use warnings;>.
It does not matter whether Foo::Base or Foo::Bar have warnings enabled;
to disable the warning one has to modify main.

=head1 Modules and Pragmata

=head2 Upgrading individual core modules

Even more core modules are now also available separately through the
CPAN.  If you wish to update one of these modules, you don't need to
wait for a new perl release.  From within the cpan shell, running the
'r' command will report on modules with upgrades available.  See
C<perldoc CPAN> for more information.

=head2 Pragmata Changes

=over 4

=item C<feature>

The new pragma C<feature> is used to enable new features that might break
old code. See L</"The C<feature> pragma"> above.

=item C<mro>

This new pragma lets you change the algorithm used to resolve inherited
methods. See L</"New Pragma, C<mro>"> above.

=item Scoping of the C<sort> pragma

The C<sort> pragma is now lexically scoped. Its effect used to be global.

=item Scoping of C<bignum>, C<bigint>, C<bigrat>

The three numeric pragmas C<bignum>, C<bigint> and C<bigrat> are now
lexically scoped. (Tels)

=item C<base>

The C<base> pragma now warns if a class tries to inherit from itself.
(Curtis "Ovid" Poe)

=item C<strict> and C<warnings>

C<strict> and C<warnings> will now complain loudly if they are loaded via
incorrect casing (as in C<use Strict;>). (Johan Vromans)

=item C<version>

The C<version> module provides support for version objects.

=item C<warnings>

The C<warnings> pragma doesn't load C<Carp> anymore. That means that code
that used C<Carp> routines without having loaded it at compile time might
need to be adjusted; typically, the following (faulty) code won't work
anymore, and will require parentheses to be added after the function name:

    use warnings;
    require Carp;
    Carp::confess 'argh';

=item C<less>

C<less> now does something useful (or at least it tries to). In fact, it
has been turned into a lexical pragma. So, in your modules, you can now
test whether your users have requested to use less CPU, or less memory,
less magic, or maybe even less fat. See L<less> for more. (Joshua ben
Jore)

=back
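
For the C<warnings> item in the list above: when C<Carp> is only loaded at
run time with C<require>, its functions are unknown at compile time, so
calls need parentheses. A minimal sketch, using C<croak> inside C<eval {}>
so that the script survives the call:

    use strict;
    use warnings;

    require Carp;                      # loaded at run time only

    # Parentheses make this parse as a function call even if
    # Carp::croak was not yet defined when the line was compiled.
    my $ok  = eval { Carp::croak('argh'); 1 };
    my $err = $@;                      # starts with "argh at ..."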

=head2 New modules

=over 4

=item *

C<encoding::warnings>, by Audrey Tang, is a module to emit warnings
whenever an ASCII character string containing high-bit bytes is implicitly
converted into UTF-8. It's a lexical pragma since Perl 5.9.4; on older
perls, its effect is global.

=item *

C<Module::CoreList>, by Richard Clamp, is a small handy module that tells
you what versions of core modules ship with any versions of Perl 5. It
comes with a command-line frontend, C<corelist>.

=item *

C<Math::BigInt::FastCalc> is an XS-enabled, and thus faster, version of
C<Math::BigInt::Calc>.

=item *

C<Compress::Zlib> is an interface to the zlib compression library. It
comes with a bundled version of zlib, so having a working zlib is not a
prerequisite to install it. It's used by C<Archive::Tar> (see below).

=item *

C<IO::Zlib> is an C<IO::>-style interface to C<Compress::Zlib>.

=item *

C<Archive::Tar> is a module to manipulate C<tar> archives.

=item *

C<Digest::SHA>, a module used to calculate many types of SHA digests,
has been included for SHA support in the CPAN module.

=item *

C<ExtUtils::CBuilder> and C<ExtUtils::ParseXS> have been added.

=item *

C<Hash::Util::FieldHash>, by Anno Siegel, has been added. This module
provides support for I<field hashes>: hashes that maintain an association
of a reference with a value, in a thread-safe garbage-collected way.
Such hashes are useful to implement inside-out objects.

=item *

C<Module::Build>, by Ken Williams, has been added. It's an alternative to
C<ExtUtils::MakeMaker> to build and install perl modules.

=item *

C<Module::Load>, by Jos Boumans, has been added. It provides a single
interface to load Perl modules and F<.pl> files.

=item *

C<Module::Loaded>, by Jos Boumans, has been added. It's used to mark
modules as loaded or unloaded.

=item *

C<Package::Constants>, by Jos Boumans, has been added. It's a simple
helper to list all constants declared in a given package.

=item *

C<Win32API::File>, by Tye McQueen, has been added (for Windows builds).
This module provides low-level access to Win32 system API calls for
files/dirs.

=item *

C<Locale::Maketext::Simple>, needed by CPANPLUS, is a simple wrapper around
C<Locale::Maketext::Lexicon>. Note that C<Locale::Maketext::Lexicon> isn't
included in the perl core; the behaviour of C<Locale::Maketext::Simple>
gracefully degrades when the latter isn't present.

=item *

C<Params::Check> implements a generic input parsing/checking mechanism. It
is used by CPANPLUS.

=item *

C<Term::UI> simplifies the task of asking questions at a terminal prompt.

=item *

C<Object::Accessor> provides an interface to create per-object accessors.

=item *

C<Module::Pluggable> is a simple framework to create modules that accept
pluggable sub-modules.

=item *

C<Module::Load::Conditional> provides simple ways to query and possibly
load installed modules.

=item *

C<Time::Piece> provides an object oriented interface to time functions,
overriding the built-ins localtime() and gmtime().

=item *

C<IPC::Cmd> helps to find and run external commands, possibly
interactively.

=item *

C<File::Fetch> provides a simple generic file fetching mechanism.

=item *

C<Log::Message> and C<Log::Message::Simple> are used by the log facility
of C<CPANPLUS>.

=item *

C<Archive::Extract> is a generic archive extraction mechanism
for F<.tar> (plain, gzipped or bzipped) or F<.zip> files.

=item *

C<CPANPLUS> provides an API and a command-line tool to access the CPAN
mirrors.

=item *

C<Pod::Escapes> provides utilities that are useful in decoding Pod
EE<lt>...E<gt> sequences.

=item *

C<Pod::Simple> is now the backend for several of the Pod-related modules
included with Perl.

=back
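
As a brief illustration of two of the modules listed above, here is a
sketch that uses C<Digest::SHA> and C<Time::Piece>. The input string and
the timestamp are arbitrary choices for the example:

    use strict;
    use warnings;
    use Digest::SHA qw(sha1_hex);
    use Time::Piece;                   # overrides gmtime() and localtime()

    # SHA-1 digest of a short string, as lowercase hex.
    my $digest = sha1_hex("abc");
    # a9993e364706816aba3e25717850c26c9cd0d89d

    # gmtime() now returns an object with accessor methods.
    my $epoch = gmtime(0);
    my $date  = $epoch->ymd;           # "1970-01-01"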

=head2 Selected Changes to Core Modules

=over 4

=item C<Attribute::Handlers>

C<Attribute::Handlers> can now report the caller's file and line number.
(David Feldman)

All interpreted attributes are now passed as array references. (Damian
Conway)

=item C<B::Lint>

C<B::Lint> is now based on C<Module::Pluggable>, and so can be extended
with plugins. (Joshua ben Jore)

=item C<B>

It's now possible to access the lexical pragma hints (C<%^H>) by using the
method B::COP::hints_hash(). It returns a C<B::RHE> object, which in turn
can be used to get a hash reference via the method B::RHE::HASH(). (Joshua
ben Jore)

=item C<Thread>

As the old 5005thread threading model has been removed, in favor of the
ithreads scheme, the C<Thread> module is now a compatibility wrapper, to
be used in old code only. It has been removed from the default list of
dynamic extensions.

=back

=head1 Utility Changes

=over 4

=item perl -d

The Perl debugger can now save all debugger commands for sourcing later;
notably, it can now emulate stepping backwards, by restarting and
rerunning all bar the last command from a saved command history.

It can also display the parent inheritance tree of a given class, with the
C<i> command.

=item ptar

C<ptar> is a pure perl implementation of C<tar> that comes with
C<Archive::Tar>.

=item ptardiff

C<ptardiff> is a small utility used to generate a diff between the contents
of a tar archive and a directory tree. Like C<ptar>, it comes with
C<Archive::Tar>.

=item shasum

C<shasum> is a command-line utility, used to print or to check SHA
digests. It comes with the new C<Digest::SHA> module.

=item corelist

The C<corelist> utility is now installed with perl (see L</"New modules">
above).

=item h2ph and h2xs

C<h2ph> and C<h2xs> have been made more robust with regard to
"modern" C code.

C<h2xs> implements a new option C<--use-xsloader> to force use of
C<XSLoader> even in backwards compatible modules.

The handling of authors' names that had apostrophes has been fixed.

Any enums with negative values are now skipped.

=item perlivp

C<perlivp> no longer checks for F<*.ph> files by default.  Use the new C<-a>
option to run I<all> tests.

=item find2perl

C<find2perl> now assumes C<-print> as a default action. Previously, it
needed to be specified explicitly.

Several bugs have been fixed in C<find2perl>, regarding C<-exec> and
C<-eval>. Also the options C<-path>, C<-ipath> and C<-iname> have been
added.

=item config_data

C<config_data> is a new utility that comes with C<Module::Build>. It
provides a command-line interface to the configuration of Perl modules
that use Module::Build's framework of configurability (that is,
C<*::ConfigData> modules that contain local configuration information for
their parent modules.)

=item cpanp

C<cpanp>, the CPANPLUS shell, has been added. (C<cpanp-run-perl>, a
helper for CPANPLUS operation, has been added too, but isn't intended for
direct use).

=item cpan2dist

C<cpan2dist> is a new utility that comes with CPANPLUS. It's a tool to
create distributions (or packages) from CPAN modules.

=item pod2html

The output of C<pod2html> has been enhanced to be more customizable via
CSS. Some formatting problems were also corrected. (Jari Aalto)

=back

=head1 New Documentation

The L<perlpragma> manpage documents how to write one's own lexical
pragmas in pure Perl (something that is possible starting with 5.9.4).

The new L<perlglossary> manpage is a glossary of terms used in the Perl
documentation, technical and otherwise, kindly provided by O'Reilly Media,
Inc.

The L<perlreguts> manpage, courtesy of Yves Orton, describes internals of the
Perl regular expression engine.

The L<perlreapi> manpage describes the interface to the perl interpreter
used to write pluggable regular expression engines (by Ævar Arnfjörð
Bjarmason).

The L<perlunitut> manpage is a tutorial for programming with Unicode and
string encodings in Perl, courtesy of Juerd Waalboer.

A new manual page, L<perlunifaq> (the Perl Unicode FAQ), has been added
(Juerd Waalboer).

The L<perlcommunity> manpage gives a description of the Perl community
on the Internet and in real life. (Edgar "Trizor" Bering)

The L<CORE> manual page documents the C<CORE::> namespace. (Tels)

The long-existing feature of C</(?{...})/> regexps setting C<$_> and pos()
is now documented.

=head1 Performance Enhancements

=head2 In-place sorting

Sorting arrays in place (C<@a = sort @a>) is now optimized to avoid
making a temporary copy of the array.

Likewise, C<reverse sort ...> is now optimized to sort in reverse,
avoiding the generation of a temporary intermediate list.
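
The two statement shapes in question look like this. The sketch only
shows the forms; whether a given build applies the optimisations cannot
be seen from the results, and it is an assumption here that the forms
with a comparison block qualify as well:

    use strict;
    use warnings;

    my @nums = (3, 1, 2);

    # Same array on both sides: the form eligible for in-place sorting.
    @nums = sort { $a <=> $b } @nums;

    # reverse applied directly to sort: the second form described above.
    my @desc = reverse sort { $a <=> $b } @nums;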

=head2 Lexical array access

Access to elements of lexical arrays via a numeric constant between 0 and
255 is now faster. (This used to be only the case for global arrays.)

=head2 XS-assisted SWASHGET

Some pure-perl code that perl was using to retrieve Unicode properties and
transliteration mappings has been reimplemented in XS.

=head2 Constant subroutines

The interpreter internals now support a far more memory efficient form of
inlineable constants. Storing a reference to a constant value in a symbol
table is equivalent to a full typeglob referencing a constant subroutine,
but using about 400 bytes less memory. This proxy constant subroutine is
automatically upgraded to a real typeglob with subroutine if necessary.
The approach taken is analogous to the existing space optimisation for
subroutine stub declarations, which are stored as plain scalars in place
of the full typeglob.

Several of the core modules have been converted to use this feature for
their system dependent constants - as a result C<use POSIX;> now takes about
200K less memory.

=head2 C<PERL_DONT_CREATE_GVSV>

The new compilation flag C<PERL_DONT_CREATE_GVSV>, introduced as an option
in perl 5.8.8, is turned on by default in perl 5.9.3. It prevents perl
from creating an empty scalar with every new typeglob. See L<perl589delta>
for details.

=head2 Weak references are cheaper

Weak reference creation is now I<O(1)> rather than I<O(n)>, courtesy of
Nicholas Clark. Weak reference deletion remains I<O(n)>, but if deletion only
happens at program exit, it may be skipped completely.

=head2 sort() enhancements

Salvador Fandiño provided improvements to reduce the memory usage of C<sort>
and to speed up some cases.

=head2 Memory optimisations

Several internal data structures (typeglobs, GVs, CVs, formats) have been
restructured to use less memory. (Nicholas Clark)

=head2 UTF-8 cache optimisation

The UTF-8 caching code is now more efficient, and used more often.
(Nicholas Clark)

=head2 Sloppy stat on Windows

On Windows, perl's stat() function normally opens the file to determine
the link count and update attributes that may have been changed through
hard links. Setting ${^WIN32_SLOPPY_STAT} to a true value speeds up
stat() by not performing this operation. (Jan Dubois)

=head2 Regular expressions optimisations

=over 4

=item Engine de-recursivised

The regular expression engine is no longer recursive, meaning that
patterns that used to overflow the stack will either die with useful
explanations, or run to completion, which, since they were able to blow
the stack before, will likely take a very long time to happen. If you were
experiencing the occasional stack overflow (or segfault) and upgrade to
discover that now perl apparently hangs instead, look for a degenerate
regex. (Dave Mitchell)

=item Single char char-classes treated as literals

Classes of a single character are now treated the same as if the character
had been used as a literal, meaning that code that uses char-classes as an
escaping mechanism will see a speedup. (Yves Orton)

=item Trie optimisation of literal string alternations

Alternations, where possible, are optimised into more efficient matching
structures. String literal alternations are merged into a trie and are
matched simultaneously.  This means that instead of O(N) time for matching
N alternations at a given point, the new code performs in O(1) time.
A new special variable, ${^RE_TRIE_MAXBUF}, has been added to fine-tune
this optimization. (Yves Orton)

B<Note:> Much code exists that works around perl's historic poor
performance on alternations. Often the tricks used to do so will disable
the new optimisations. Hopefully the utility modules used for this purpose
will be educated about these new optimisations.

=item Aho-Corasick start-point optimisation

When a pattern starts with a trie-able alternation and there aren't
better optimisations available, the regex engine will use Aho-Corasick
matching to find the start point. (Yves Orton)

=back
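
For the trie item above, the kind of pattern that can benefit is a plain
alternation of literal strings. A minimal sketch, with a hypothetical
word list:

    use strict;
    use warnings;

    # Hypothetical keyword list for this sketch.
    my @words = qw(apple apricot banana);

    # A plain alternation of literal strings, with no per-branch
    # tricks, is the shape the trie optimisation can work with.
    my $alt = join '|', map { quotemeta } @words;
    my $re  = qr/\b(?:$alt)\b/;

    my @hits = "banana or apricot jam" =~ /$re/g;
    # @hits is ("banana", "apricot")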

=head1 Installation and Configuration Improvements

=head2 Configuration improvements

=over 4

=item C<-Dusesitecustomize>

Run-time customization of @INC can be enabled by passing the
C<-Dusesitecustomize> flag to Configure. When enabled, this will make perl
run F<$sitelibexp/sitecustomize.pl> before anything else.  This script can
then be set up to add additional entries to @INC.

=item Relocatable installations

There is now Configure support for creating a relocatable perl tree. If
you Configure with C<-Duserelocatableinc>, then the paths in @INC (and
everything else in %Config) can be optionally located via the path of the
perl executable.

That means that, if the string C<".../"> is found at the start of any
path, it's substituted with the directory of $^X. So, the relocation can
be configured on a per-directory basis, although the default with
C<-Duserelocatableinc> is that everything is relocated. The initial
install is done to the original configured prefix.

=item strlcat() and strlcpy()

The configuration process now detects whether strlcat() and strlcpy() are
available.  When they are not available, perl's own version is used (from
Russ Allbery's public domain implementation).  Various places in the perl
interpreter now use them. (Steve Peters)

=item C<d_pseudofork> and C<d_printf_format_null>

A new configuration variable, available as C<$Config{d_pseudofork}> in
the L<Config> module, has been added, to distinguish real fork() support
from fake pseudofork used on Windows platforms.

A new configuration variable, C<d_printf_format_null>, has been added, 
to see if printf-like formats are allowed to be NULL.

=item Configure help

C<Configure -h> has been extended with the most commonly used options.

=back
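
The C<d_pseudofork> variable mentioned above can be queried like any other
configuration value. A short sketch; the three labels are invented for the
example:

    use strict;
    use warnings;
    use Config;

    # Distinguish emulated fork (Windows) from the real system call.
    my $fork_kind = $Config{d_pseudofork} ? 'pseudo'
                  : $Config{d_fork}       ? 'real'
                  :                         'none';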

=head2 Compilation improvements

=over 4

=item Parallel build

Parallel makes should work properly now, although there may still be problems
if C<make test> is instructed to run in parallel.

=item Borland's compilers support

Building with Borland's compilers on Win32 should work more smoothly. In
particular Steve Hay has worked to sidestep many warnings emitted by their
compilers and at least one C compiler internal error.

=item Static build on Windows

Perl extensions on Windows now can be statically built into the Perl DLL.

Also, it's now possible to build a C<perl-static.exe> that doesn't depend
on the Perl DLL on Win32. See the Win32 makefiles for details.
(Vadim Konovalov)

=item ppport.h files

All F<ppport.h> files in the XS modules bundled with perl are now
autogenerated at build time. (Marcus Holland-Moritz)

=item C++ compatibility

Efforts have been made to make perl and the core XS modules compilable
with various C++ compilers (although the situation is not perfect with
some of the compilers on some of the platforms tested.)

=item Support for Microsoft 64-bit compiler

Support for building perl with Microsoft's 64-bit compiler has been
improved. (ActiveState)

=item Visual C++

Perl can now be compiled with Microsoft Visual C++ 2005 (and 2008 Beta 2).

=item Win32 builds

All win32 builds (MS-Win, WinCE) have been merged and cleaned up.

=back

=head2 Installation improvements

=over 4

=item Module auxiliary files

README files and changelogs for CPAN modules bundled with perl are no
longer installed.

=back

=head2 New Or Improved Platforms

Perl has been reported to work on Symbian OS. See L<perlsymbian> for more
information.

Many improvements have been made towards making Perl work correctly on
z/OS.

Perl has been reported to work on DragonFlyBSD and MidnightBSD.

Perl has also been reported to work on NexentaOS
( http://www.gnusolaris.org/ ).

The VMS port has been improved. See L<perlvms>.

Support for Cray XT4 Catamount/Qk has been added. See
F<hints/catamount.sh> in the source code distribution for more
information.

Vendor patches have been merged for RedHat and Gentoo.

DynaLoader::dl_unload_file() now works on Windows.

=head1 Selected Bug Fixes

=over 4

=item strictures in regexp-eval blocks

C<strict> wasn't in effect in regexp-eval blocks (C</(?{...})/>).

=item Calling CORE::require()

CORE::require() and CORE::do() were always parsed as require() and do()
when they were overridden. This is now fixed.

=item Subscripts of slices

You can now use a non-arrowed form for chained subscripts after a list
slice, like in:

    ({foo => "bar"})[0]{foo}

This used to be a syntax error; a C<< -> >> was required.

=item C<no warnings 'category'> works correctly with -w

Previously, when running with warnings enabled globally via C<-w>, selectively
disabling a specific warning category would actually turn off all warnings.
This is now fixed; C<no warnings 'io';> will only turn off warnings in the
C<io> class.

=item threads improvements

Several memory leaks in ithreads were closed. Also, ithreads were made
less memory-intensive.

C<threads> is now a dual-life module, also available on CPAN. It has been
expanded in many ways. A kill() method is available for thread signalling.
One can get thread status, or the list of running or joinable threads.

A new C<< threads->exit() >> method is used to exit from the application
(this is the default for the main thread) or from the current thread only
(this is the default for all other threads). On the other hand, the exit()
built-in now always causes the whole application to terminate. (Jerry
D. Hedden)

=item chr() and negative values

chr() on a negative value now gives C<\x{FFFD}>, the Unicode replacement
character, except when the C<bytes> pragma is in effect, in which case the
low eight bits of the value are used.

=item PERL5SHELL and tainting

On Windows, the PERL5SHELL environment variable is now checked for
taintedness. (Rafael Garcia-Suarez)

=item Using *FILE{IO}

C<stat()> and C<-X> filetests now treat *FILE{IO} filehandles like *FILE
filehandles. (Steve Peters)

=item Overloading and reblessing

Overloading now works when references are reblessed into another class.
Internally, this has been implemented by moving the flag for "overloading"
from the reference to the referent, which logically is where it should
always have been. (Nicholas Clark)

=item Overloading and UTF-8

A few bugs related to UTF-8 handling with objects that have
stringification overloaded have been fixed. (Nicholas Clark)

=item eval memory leaks fixed

Traditionally, C<eval 'syntax error'> has leaked badly. Many (but not all)
of these leaks have now been eliminated or reduced. (Dave Mitchell)

=item Random device on Windows

In previous versions, perl would read the file F</dev/urandom> if it
existed when seeding its random number generator.  That file is unlikely
to exist on Windows, and if it did would probably not contain appropriate
data, so perl no longer tries to read it on Windows. (Alex Davies)

=item PERLIO_DEBUG

The C<PERLIO_DEBUG> environment variable no longer has any effect for
setuid scripts and for scripts run with B<-T>.

Moreover, with a thread-enabled perl, using C<PERLIO_DEBUG> could lead to
an internal buffer overflow. This has been fixed.

=item PerlIO::scalar and read-only scalars

PerlIO::scalar will now prevent writing to read-only scalars. Moreover,
seek() is now supported with PerlIO::scalar-based filehandles, the
underlying string being zero-filled as needed. (Rafael, Jarkko Hietaniemi)

=item study() and UTF-8

study() never worked for UTF-8 strings, but could lead to false results.
It's now a no-op on UTF-8 data. (Yves Orton)

=item Critical signals

The signals SIGILL, SIGBUS and SIGSEGV are now always delivered in an
"unsafe" manner (contrary to other signals, that are deferred until the
perl interpreter reaches a reasonably stable state; see
L<perlipc/"Deferred Signals (Safe Signals)">). (Rafael)

=item @INC-hook fix

When a module or a file is loaded through an @INC-hook, and when this hook
has set a filename entry in %INC, __FILE__ is now set for this module
according to the contents of that %INC entry. (Rafael)

=item C<-t> switch fix

The C<-w> and C<-t> switches can now be used together without messing
up which categories of warnings are activated. (Rafael)

=item Duping UTF-8 filehandles

Duping a filehandle which has the C<:utf8> PerlIO layer set will now
properly carry that layer on the duped filehandle. (Rafael)

=item Localisation of hash elements

Localizing a hash element whose key was given as a variable didn't work
correctly if the variable was changed while the local() was in effect (as
in C<local $h{$x}; ++$x>). (Bo Lindbergh)

=back
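
Two of the fixes listed above, the chained subscript after a list slice
and the localisation of a hash element keyed by a variable, can be
sketched as follows (the data is invented for the example):

    use strict;
    use warnings;

    # Chained subscript directly after a list slice, no arrow needed.
    my $value = ({ foo => "bar" })[0]{foo};     # "bar"

    # local() on a hash element remembers the key it localised,
    # even if the variable that supplied the key changes afterwards.
    our %h = (1 => 'one');
    my $x  = 1;
    {
        local $h{$x} = 'temp';
        ++$x;
    }
    # $h{1} is 'one' again, and no stray $h{2} was created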

=head1 New or Changed Diagnostics

=over 4

=item Use of uninitialized value

Perl will now try to tell you the name of the variable (if any) that was
undefined.

=item Deprecated use of my() in false conditional

A new deprecation warning, I<Deprecated use of my() in false conditional>,
has been added, to warn against the use of the dubious and deprecated
construct

    my $x if 0;

See L<perldiag>. Use C<state> variables instead.

=item !=~ should be !~

A new warning, C<!=~ should be !~>, is emitted to prevent this misspelling
of the non-matching operator.

=item Newline in left-justified string

The warning I<Newline in left-justified string> has been removed.

=item Too late for "-T" option

The error I<Too late for "-T" option> has been reformulated to be more
descriptive.

=item "%s" variable %s masks earlier declaration

This warning is now emitted in more consistent cases; in short, when one
of the declarations involved is a C<my> variable:

    my $x;   my $x;	# warns
    my $x;  our $x;	# warns
    our $x;  my $x;	# warns

On the other hand, the following:

    our $x; our $x;

now gives a C<"our" variable %s redeclared> warning.

=item readdir()/closedir()/etc. attempted on invalid dirhandle

These new warnings are now emitted when a dirhandle is used but is
either closed or not really a dirhandle.

=item Opening dirhandle/filehandle %s also as a file/directory

Two deprecation warnings have been added: (Rafael)

    Opening dirhandle %s also as a file
    Opening filehandle %s also as a directory

=item Use of -P is deprecated

Perl's command-line switch C<-P> is now deprecated.

=item v-string in use/require is non-portable

Perl will warn you against potential backwards compatibility problems with
the C<use VERSION> syntax.

=item perl -V

C<perl -V> has several improvements, making it more useable from shell
scripts to get the value of configuration variables. See L<perlrun> for
details.

=back
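
For the C<my() in false conditional> item above, a C<state> variable gives
the same persistent-value effect in a supported way. A minimal sketch; the
function name is only illustrative:

    use strict;
    use warnings;
    use feature 'state';

    # state initialises $n once and keeps it across calls, which is
    # what "my $x if 0" was being abused for.
    sub next_id {
        state $n = 0;
        return ++$n;
    }

    my @ids = (next_id(), next_id(), next_id());   # (1, 2, 3)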

=head1 Changed Internals

In general, the source code of perl has been refactored, tidied up,
and optimized in many places. Also, memory management and allocation
has been improved in several points.

When compiling the perl core with gcc, as many gcc warning flags are
turned on as is possible on the platform.  (This quest for cleanliness
doesn't extend to XS code because we cannot guarantee the tidiness of
code we didn't write.)  Similar strictness flags have been added or
tightened for various other C compilers.

=head2 Reordering of SVt_* constants

The relative ordering of constants that define the various types of C<SV>
has changed; in particular, C<SVt_PVGV> has been moved before C<SVt_PVLV>,
C<SVt_PVAV>, C<SVt_PVHV> and C<SVt_PVCV>.  This is unlikely to make any
difference unless you have code that explicitly makes assumptions about that
ordering. (The inheritance hierarchy of C<B::*> objects has been changed
to reflect this.)

=head2 Elimination of SVt_PVBM

Related to this, the internal type C<SVt_PVBM> has been removed. This
dedicated type of C<SV> was used by the C<index> operator and parts of the
regexp engine to facilitate fast Boyer-Moore matches. Its use internally has
been replaced by C<SV>s of type C<SVt_PVGV>.

=head2 New type SVt_BIND

A new type C<SVt_BIND> has been added, in readiness for the project to
implement Perl 6 on 5. There deliberately is no implementation yet, and
they cannot yet be created or destroyed.

=head2 Removal of CPP symbols

The C preprocessor symbols C<PERL_PM_APIVERSION> and
C<PERL_XS_APIVERSION>, which were supposed to give the version number of
the oldest perl binary-compatible (resp. source-compatible) with the
present one, were not used, and sometimes had misleading values. They have
been removed.

=head2 Less space is used by ops

The C<BASEOP> structure now uses less space. The C<op_seq> field has been
removed and replaced by a single bit bit-field C<op_opt>. C<op_type> is now 9
bits long. (Consequently, the C<B::OP> class doesn't provide an C<seq>
method anymore.)

=head2 New parser

perl's parser is now generated by bison (it used to be generated by
byacc). As a result, it seems to be a bit more robust.

Also, Dave Mitchell improved the lexer debugging output under C<-DT>.

=head2 Use of C<const>

Andy Lester supplied many improvements to determine which function
parameters and local variables could actually be declared C<const> to the C
compiler. Steve Peters provided new C<*_set> macros and reworked the core to
use these rather than assigning to macros in LVALUE context.

=head2 Mathoms

A new file, F<mathoms.c>, has been added. It contains functions that are
no longer used in the perl core, but that remain available for binary or
source compatibility reasons. However, those functions will not be
compiled in if you add C<-DNO_MATHOMS> in the compiler flags.

=head2 C<AvFLAGS> has been removed

The C<AvFLAGS> macro has been removed.

=head2 C<av_*> changes

The C<av_*()> functions, used to manipulate arrays, no longer accept null
C<AV*> parameters.

=head2 $^H and %^H

The implementation of the special variables $^H and %^H has changed, to
allow implementing lexical pragmas in pure Perl.

=head2 B:: modules inheritance changed

The inheritance hierarchy of C<B::> modules has changed; C<B::NV> now
inherits from C<B::SV> (it used to inherit from C<B::IV>).

=head2 Anonymous hash and array constructors

The anonymous hash and array constructors now take 1 op in the optree
instead of 3, now that pp_anonhash and pp_anonlist return a reference to
a hash/array when the op is flagged with OPf_SPECIAL. (Nicholas Clark)

=head1 Known Problems

There's still a remaining problem in the implementation of the lexical
C<$_>: it doesn't work inside C</(?{...})/> blocks. (See the TODO test in
F<t/op/mydef.t>.)

Stacked filetest operators won't work when the C<filetest> pragma is in
effect, because they rely on the stat() buffer C<_> being populated, and
filetest bypasses stat().

=head2 UTF-8 problems

The handling of Unicode still is unclean in several places, where it's
dependent on whether a string is internally flagged as UTF-8. This will
be made more consistent in perl 5.12, but that won't be possible without
a certain amount of backwards incompatibility.

=head1 Platform Specific Problems

When compiled with g++ and thread support on Linux, it's reported that
C<$!> stops working correctly. This is related to the fact that glibc
provides two strerror_r(3) implementations, and perl selects the wrong
one.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/rt3/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file and the perl590delta to perl595delta man pages for
exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perl5241delta.pod000064400000020027150344123430007540 0ustar00=encoding utf8

=head1 NAME

perl5241delta - what is new for perl v5.24.1

=head1 DESCRIPTION

This document describes differences between the 5.24.0 release and the 5.24.1
release.

If you are upgrading from an earlier release such as 5.22.0, first read
L<perl5240delta>, which describes differences between 5.22.0 and 5.24.0.

=head1 Security

=head2 B<-Di> switch is now required for PerlIO debugging output

Previously PerlIO debugging output would be sent to the file specified by the
C<PERLIO_DEBUG> environment variable if perl wasn't running setuid and the
B<-T> or B<-t> switches hadn't been parsed yet.

If perl performed output at a point where it hadn't yet parsed its switches
this could result in perl creating or overwriting the file named by
C<PERLIO_DEBUG> even when the B<-T> switch had been supplied.

Perl now requires the B<-Di> switch to produce PerlIO debugging output.  By
default this is written to C<stderr>, but can optionally be redirected to a
file by setting the C<PERLIO_DEBUG> environment variable.

If perl is running setuid or the B<-T> switch was supplied, C<PERLIO_DEBUG> is
ignored and the debugging output is sent to C<stderr> as for any other B<-D>
switch.

=head2 Core modules and tools no longer search F<"."> for optional modules

The tools and many modules supplied in core no longer search the default
current directory entry in L<C<@INC>|perlvar/@INC> for optional modules.  For
example, L<Storable> will remove the final F<"."> from C<@INC> before trying to
load L<Log::Agent>.

This prevents an attacker injecting an optional module into a process run by
another user where the current directory is writable by the attacker, e.g. the
F</tmp> directory.

In most cases this removal should not cause problems, but difficulties were
encountered with L<base>, which treats every module name supplied as optional.
These difficulties have not yet been resolved, so for this release there are no
changes to L<base>.  We hope to have a fix for L<base> in Perl 5.24.2.

To protect your own code from this attack, either remove the default F<".">
entry from C<@INC> at the start of your script, so:

  #!/usr/bin/perl
  use strict;
  ...

becomes:

  #!/usr/bin/perl
  BEGIN { pop @INC if $INC[-1] eq '.' }
  use strict;
  ...

or for modules, remove F<"."> from a localized C<@INC>, so:

  my $can_foo = eval { require Foo; };

becomes:

  my $can_foo = eval {
      local @INC = @INC;
      pop @INC if $INC[-1] eq '.';
      require Foo;
  };

=head1 Incompatible Changes

Other than the security changes above there are no changes intentionally
incompatible with Perl 5.24.0.  If any exist, they are bugs, and we request
that you submit a report.  See L</Reporting Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Archive::Tar> has been upgraded from version 2.04 to 2.04_01.

=item *

L<bignum> has been upgraded from version 0.42 to 0.42_01.

=item *

L<CPAN> has been upgraded from version 2.11 to 2.11_01.

=item *

L<Digest> has been upgraded from version 1.17 to 1.17_01.

=item *

L<Digest::SHA> has been upgraded from version 5.95 to 5.95_01.

=item *

L<Encode> has been upgraded from version 2.80 to 2.80_01.

=item *

L<ExtUtils::MakeMaker> has been upgraded from version 7.10_01 to 7.10_02.

=item *

L<File::Fetch> has been upgraded from version 0.48 to 0.48_01.

=item *

L<File::Spec> has been upgraded from version 3.63 to 3.63_01.

=item *

L<HTTP::Tiny> has been upgraded from version 0.056 to 0.056_001.

=item *

L<IO> has been upgraded from version 1.36 to 1.36_01.

=item *

The IO-Compress modules have been upgraded from version 2.069 to 2.069_001.

=item *

L<IPC::Cmd> has been upgraded from version 0.92 to 0.92_01.

=item *

L<JSON::PP> has been upgraded from version 2.27300 to 2.27300_01.

=item *

L<Locale::Maketext> has been upgraded from version 1.26 to 1.26_01.

=item *

L<Locale::Maketext::Simple> has been upgraded from version 0.21 to 0.21_01.

=item *

L<Memoize> has been upgraded from version 1.03 to 1.03_01.

=item *

L<Module::CoreList> has been upgraded from version 5.20160506 to 5.20170114_24.

=item *

L<Net::Ping> has been upgraded from version 2.43 to 2.43_01.

=item *

L<Parse::CPAN::Meta> has been upgraded from version 1.4417 to 1.4417_001.

=item *

L<Pod::Html> has been upgraded from version 1.22 to 1.2201.

=item *

L<Pod::Perldoc> has been upgraded from version 3.25_02 to 3.25_03.

=item *

L<Storable> has been upgraded from version 2.56 to 2.56_01.

=item *

L<Sys::Syslog> has been upgraded from version 0.33 to 0.33_01.

=item *

L<Test> has been upgraded from version 1.28 to 1.28_01.

=item *

L<Test::Harness> has been upgraded from version 3.36 to 3.36_01.

=item *

L<XSLoader> has been upgraded from version 0.21 to 0.22, fixing a security hole
in which binary files could be loaded from a path outside of C<@INC>.
L<[perl #128528]|https://rt.perl.org/Public/Bug/Display.html?id=128528>

=back

=head1 Documentation

=head2 Changes to Existing Documentation

=head3 L<perlapio>

=over 4

=item *

The documentation of C<PERLIO_DEBUG> has been updated.

=back

=head3 L<perlrun>

=over 4

=item *

The new B<-Di> switch has been documented, and the documentation of
C<PERLIO_DEBUG> has been updated.

=back

=head1 Testing

=over 4

=item *

A new test script, F<t/run/switchDx.t>, has been added to test that the new
B<-Di> switch is working correctly.

=back

=head1 Selected Bug Fixes

=over 4

=item *

The change to hashbang redirection introduced in Perl 5.24.0, whereby perl
would redirect to another interpreter (Perl 6) if it found a hashbang path
which contains "perl" followed by "6", has been reverted because it broke in
cases such as C<#!/opt/perl64/bin/perl>.

=back

=head1 Acknowledgements

Perl 5.24.1 represents approximately 8 months of development since Perl 5.24.0
and contains approximately 8,100 lines of changes across 240 files from 18
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 2,200 lines of changes to 170 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.24.1:

Aaron Crane, Alex Vandiver, Aristotle Pagaltzis, Chad Granum, Chris 'BinGOs'
Williams, Craig A. Berry, Father Chrysostomos, James E Keenan, Jarkko
Hietaniemi, Karen Etheridge, Leon Timmermans, Matthew Horsfall, Ricardo Signes,
Sawyer X, Sébastien Aperghis-Tramoni, Stevan Little, Steve Hay, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the Perl bug database at
L<https://rt.perl.org/> .  There may also be information at
L<http://www.perl.org/> , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications which make it
inappropriate to send to a publicly archived mailing list, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION> for details of how to
report the issue.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlipc.pod000064400000212255150344123430006714 0ustar00=head1 NAME

perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores)

=head1 DESCRIPTION

The basic IPC facilities of Perl are built out of the good old Unix
signals, named pipes, pipe opens, the Berkeley socket routines, and SysV
IPC calls.  Each is used in slightly different situations.

=head1 Signals

Perl uses a simple signal handling model: the %SIG hash contains names
or references of user-installed signal handlers.  Each handler is
called with a single argument, the name of the signal that triggered
it.  A signal may be generated intentionally from a
particular keyboard sequence like control-C or control-Z, sent to you
from another process, or triggered automatically by the kernel when
special events transpire, like a child process exiting, your own process
running out of stack space, or hitting a process file-size limit.

For example, to trap an interrupt signal, set up a handler like this:

    our $shucks;

    sub catch_zap {
        my $signame = shift;
        $shucks++;
        die "Somebody sent me a SIG$signame";
    }
    $SIG{INT} = __PACKAGE__ . "::catch_zap";
    $SIG{INT} = \&catch_zap;  # best strategy

Prior to Perl 5.8.0 it was necessary to do as little as you possibly
could in your handler; notice how all we do is set a global variable
and then raise an exception.  That's because on most systems,
libraries are not re-entrant; particularly, memory allocation and I/O
routines are not.  That meant that doing nearly I<anything> in your
handler could in theory trigger a memory fault and subsequent core
dump - see L</Deferred Signals (Safe Signals)> below.

The names of the signals are the ones listed out by C<kill -l> on your
system, or you can retrieve them using the CPAN module L<IPC::Signal>.
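
If installing a CPAN module is not an option, the core L<Config> module
also records the signal names and numbers known to your build of perl.
Here is a minimal sketch, assuming only the C<sig_name> and C<sig_num>
entries of C<%Config>; the C<signal_numbers()> helper is a name made up
for this illustration:

    use strict;
    use warnings;
    use Config;

    # Map each signal name (without the "SIG" prefix) to its number.
    # signal_numbers() is a hypothetical helper, not part of Config.
    sub signal_numbers {
        defined $Config{sig_name} || die "no signals on this platform?";
        my @names = split ' ', $Config{sig_name};
        my @nums  = split ' ', $Config{sig_num};
        my %signo;
        @signo{@names} = @nums;
        return %signo;
    }

    my %signo = signal_numbers();
    print "SIGINT is signal number $signo{INT}\n";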

You may also choose to assign the strings C<"IGNORE"> or C<"DEFAULT"> as
the handler, in which case Perl will try to discard the signal or do the
default thing.

On most Unix platforms, the C<CHLD> (sometimes also known as C<CLD>) signal
has special behavior with respect to a value of C<"IGNORE">.
Setting C<$SIG{CHLD}> to C<"IGNORE"> on such a platform has the effect of
not creating zombie processes when the parent process fails to C<wait()>
on its child processes (i.e., child processes are automatically reaped).
Calling C<wait()> with C<$SIG{CHLD}> set to C<"IGNORE"> usually returns
C<-1> on such platforms.
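
As a rough sketch of that behavior, assuming a platform where ignoring
C<CHLD> causes automatic reaping: the child's exit status is lost, and
the C<-1> result is typical rather than guaranteed.  The
C<spawn_unwatched()> helper is hypothetical.

    use strict;
    use warnings;

    # Fork a child that nobody will collect.  spawn_unwatched() is a
    # made-up name for this illustration.
    sub spawn_unwatched {
        my ($code) = @_;
        defined(my $pid = fork()) || die "cannot fork: $!";
        if ($pid == 0) {
            $code->();
            exit 0;
        }
        return $pid;
    }

    $SIG{CHLD} = "IGNORE";      # children are reaped automatically
    spawn_unwatched(sub { sleep 1 });
    my $reaped = wait();        # usually -1: nothing left to wait for
    print "wait() returned $reaped\n";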

Some signals can be neither trapped nor ignored, such as the KILL and STOP
(but not the TSTP) signals. Note that ignoring signals makes them disappear.
If you only want them blocked temporarily without them getting lost you'll
have to use POSIX' sigprocmask.
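
Here is a minimal sketch of that approach, assuming the C<sigprocmask()>
and C<POSIX::SigSet> interfaces of the core POSIX module; the
C<with_blocked()> helper is a name invented for this illustration:

    use strict;
    use warnings;
    use POSIX qw(:signal_h);

    # Run $code with the given signals blocked, then restore the
    # previous mask.  with_blocked() is a hypothetical helper.
    sub with_blocked {
        my ($code, @signals) = @_;
        my $block = POSIX::SigSet->new(@signals);
        my $old   = POSIX::SigSet->new;
        sigprocmask(SIG_BLOCK, $block, $old)
                            || die "cannot block signals: $!";
        my @result = eval { $code->() };
        my $err = $@;
        sigprocmask(SIG_SETMASK, $old)
                            || die "cannot restore signal mask: $!";
        die $err if $err;
        return @result;
    }

    with_blocked(sub { print "in critical section\n" }, SIGINT, SIGTERM);

A signal that arrives while it is blocked stays pending and is
delivered once the old mask is restored.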

Sending a signal to a negative process ID means that you send the signal
to the entire Unix process group.  This code sends a hang-up signal to all
processes in the current process group, and also sets $SIG{HUP} to C<"IGNORE">
so it doesn't kill itself:

    # block scope for local
    {
        local $SIG{HUP} = "IGNORE";
        kill HUP => -$$;
        # snazzy writing of: kill("HUP", -$$)
    }

Another interesting signal to send is signal number zero.  This doesn't
actually affect a child process, but instead checks whether it's alive
or has changed its UIDs.

    unless (kill 0 => $kid_pid) {
        warn "something wicked happened to $kid_pid";
    }

Signal number zero may fail because you lack permission to send the
signal when directed at a process whose real or saved UID is not
identical to the real or effective UID of the sending process, even
though the process is alive.  You may be able to determine the cause of
failure using C<$!> or C<%!>.

    unless (kill(0 => $pid) || $!{EPERM}) {
        warn "$pid looks dead";
    }

You might also want to employ anonymous functions for simple signal
handlers:

    $SIG{INT} = sub { die "\nOutta here!\n" };

SIGCHLD handlers require some special care.  If a second child dies
while in the signal handler caused by the first death, we won't get
another signal.  So we must loop here, or else we will leave the unreaped
child as a zombie.  And the next time two children die we get another
zombie.  And so on.

    use POSIX ":sys_wait_h";
    $SIG{CHLD} = sub {
        while ((my $child = waitpid(-1, WNOHANG)) > 0) {
            $Kid_Status{$child} = $?;
        }
    };
    # do something that forks...

Be careful: qx(), system(), and some modules for calling external commands
do a fork(), then wait() for the result. Thus, your signal handler
will be called. Because wait() was already called by system() or qx(),
the wait() in the signal handler will see no more zombies and will
therefore block.

The best way to prevent this issue is to use waitpid(), as in the following
example:

    use POSIX ":sys_wait_h"; # for WNOHANG (nonblocking wait)

    my %children;

    $SIG{CHLD} = sub {
        # don't change $! and $? outside handler
        local ($!, $?);
        while ( (my $pid = waitpid(-1, WNOHANG)) > 0 ) {
            delete $children{$pid};
            cleanup_child($pid, $?);
        }
    };

    while (1) {
        my $pid = fork();
        die "cannot fork" unless defined $pid;
        if ($pid == 0) {
            # ...
            exit 0;
        } else {
            $children{$pid}=1;
            # ...
            system($command);
            # ...
        }
    }

Signal handling is also used for timeouts in Unix.  While safely
protected within an C<eval{}> block, you set a signal handler to trap
alarm signals and then schedule to have one delivered to you in some
number of seconds.  Then try your blocking operation, clearing the alarm
when it's done but not before you've exited your C<eval{}> block.  If it
goes off, you'll use die() to jump out of the block.

Here's an example:

    my $ALARM_EXCEPTION = "alarm clock restart";
    eval {
        local $SIG{ALRM} = sub { die $ALARM_EXCEPTION };
        alarm 10;
        flock(FH, 2)    # blocking write lock
                        || die "cannot flock: $!";
        alarm 0;
    };
    if ($@ && $@ !~ quotemeta($ALARM_EXCEPTION)) { die }

If the operation being timed out is system() or qx(), this technique
is liable to generate zombies.  If this matters to you, you'll
need to do your own fork() and exec(), and kill the errant child process.

For more complex signal handling, you might see the standard POSIX
module.  Lamentably, this is almost entirely undocumented, but the
F<ext/POSIX/t/sigaction.t> file from the Perl source distribution has
some examples in it.

=head2 Handling the SIGHUP Signal in Daemons

A process that usually starts when the system boots and shuts down
when the system is shut down is called a daemon (Disk And Execution
MONitor). If a daemon process has a configuration file which is
modified after the process has been started, there should be a way to
tell that process to reread its configuration file without stopping
the process. Many daemons provide this mechanism using a C<SIGHUP>
signal handler. When you want to tell the daemon to reread the file,
simply send it the C<SIGHUP> signal.

The following example implements a simple daemon, which restarts
itself every time the C<SIGHUP> signal is received. The actual code is
located in the subroutine C<code()>, which just prints some debugging
info to show that it works; it should be replaced with the real code.

  #!/usr/bin/perl

  use strict;
  use warnings;

  use POSIX ();
  use FindBin ();
  use File::Basename ();
  use File::Spec::Functions qw(catfile);

  $| = 1;

  # make the daemon cross-platform, so exec always calls the script
  # itself with the right path, no matter how the script was invoked.
  my $script = File::Basename::basename($0);
  my $SELF  = catfile($FindBin::Bin, $script);

  # POSIX unmasks the sigprocmask properly
  $SIG{HUP} = sub {
      print "got SIGHUP\n";
      exec($SELF, @ARGV)        || die "$0: couldn't restart: $!";
  };

  code();

  sub code {
      print "PID: $$\n";
      print "ARGV: @ARGV\n";
      my $count = 0;
      while (1) {
          sleep 2;
          print ++$count, "\n";
      }
  }


=head2 Deferred Signals (Safe Signals)

Before Perl 5.8.0, installing Perl code to deal with signals exposed you to
danger from two things.  First, few system library functions are
re-entrant.  If the signal interrupts while Perl is executing one function
(like malloc(3) or printf(3)), and your signal handler then calls the same
function again, you could get unpredictable behavior--often, a core dump.
Second, Perl isn't itself re-entrant at the lowest levels.  If the signal
interrupts Perl while Perl is changing its own internal data structures,
similarly unpredictable behavior may result.

There were two things you could do, knowing this: be paranoid or be
pragmatic.  The paranoid approach was to do as little as possible in your
signal handler.  Set an existing integer variable that already has a
value, and return.  This doesn't help you if you're in a slow system call,
which will just restart.  That means you have to C<die> to longjmp(3) out
of the handler.  Even this is a little cavalier for the true paranoiac,
who avoids C<die> in a handler because the system I<is> out to get you.
The pragmatic approach was to say "I know the risks, but prefer the
convenience", and to do anything you wanted in your signal handler,
and be prepared to clean up core dumps now and again.

Perl 5.8.0 and later avoid these problems by "deferring" signals.  That is,
when the signal is delivered to the process by the system (to the C code
that implements Perl) a flag is set, and the handler returns immediately.
Then at strategic "safe" points in the Perl interpreter (e.g. when it is
about to execute a new opcode) the flags are checked and the Perl level
handler from %SIG is executed. The "deferred" scheme allows much more
flexibility in the coding of signal handlers as we know the Perl
interpreter is in a safe state, and that we are not in a system library
function when the handler is called.  However the implementation does
differ from previous Perls in the following ways:

=over 4

=item Long-running opcodes

As the Perl interpreter looks at signal flags only when it is about
to execute a new opcode, a signal that arrives during a long-running
opcode (e.g. a regular expression operation on a very large string) will
not be seen until the current opcode completes.

If a signal of any given type fires multiple times during an opcode
(such as from a fine-grained timer), the handler for that signal will
be called only once, after the opcode completes; all other
instances will be discarded.  Furthermore, if your system's signal queue
gets flooded to the point that there are signals that have been raised
but not yet caught (and thus not deferred) at the time an opcode
completes, those signals may well be caught and deferred during
subsequent opcodes, with sometimes surprising results.  For example, you
may see alarms delivered even after calling C<alarm(0)> as the latter
stops the raising of alarms but does not cancel the delivery of alarms
raised but not yet caught.  Do not depend on the behaviors described in
this paragraph as they are side effects of the current implementation and
may change in future versions of Perl.

=item Interrupting IO

When a signal is delivered (e.g., SIGINT from a control-C) the operating
system breaks into IO operations like I<read>(2), which is used to
implement Perl's readline() function, the C<< <> >> operator. On older
Perls the handler was called immediately (and as C<read> is not "unsafe",
this worked well). With the "deferred" scheme the handler is I<not> called
immediately, and if Perl is using the system's C<stdio> library that
library may restart the C<read> without returning to Perl to give it a
chance to call the %SIG handler. If this happens on your system the
solution is to use the C<:perlio> layer to do IO--at least on those handles
that you want to be able to break into with signals. (The C<:perlio> layer
checks the signal flags and calls %SIG handlers before resuming IO
operation.)

The default in Perl 5.8.0 and later is to automatically use
the C<:perlio> layer.

Note that it is not advisable to access a file handle within a signal
handler where that signal has interrupted an I/O operation on that same
handle. While perl will at least try hard not to crash, there are no
guarantees of data integrity; for example, some data might get dropped or
written twice.

Some networking library functions like gethostbyname() are known to have
their own implementations of timeouts which may conflict with your
timeouts.  If you have problems with such functions, try using the POSIX
sigaction() function, which bypasses Perl safe signals.  Be warned that
this does subject you to possible memory corruption, as described above.

Instead of setting C<$SIG{ALRM}>:

   local $SIG{ALRM} = sub { die "alarm" };

try something like the following:

 use POSIX qw(SIGALRM);
 POSIX::sigaction(SIGALRM,
                  POSIX::SigAction->new(sub { die "alarm" }))
          || die "Error setting SIGALRM handler: $!\n";

Another way to disable the safe signal behavior locally is to use
the C<Perl::Unsafe::Signals> module from CPAN, which affects
all signals.

=item Restartable system calls

On systems that supported it, older versions of Perl used the
SA_RESTART flag when installing %SIG handlers.  This meant that
restartable system calls would continue rather than returning when
a signal arrived.  In order to deliver deferred signals promptly,
Perl 5.8.0 and later do I<not> use SA_RESTART.  Consequently,
restartable system calls can fail (with $! set to C<EINTR>) in places
where they previously would have succeeded.

The default C<:perlio> layer retries C<read>, C<write>
and C<close> as described above; interrupted C<wait> and
C<waitpid> calls will always be retried.

=item Signals as "faults"

Certain signals like SEGV, ILL, and BUS are generated by virtual memory
addressing errors and similar "faults". These are normally fatal: there is
little a Perl-level handler can do with them.  So Perl delivers them
immediately rather than attempting to defer them.

=item Signals triggered by operating system state

On some operating systems certain signal handlers are supposed to "do
something" before returning. One example can be CHLD or CLD, which
indicates a child process has completed. On some operating systems the
signal handler is expected to C<wait> for the completed child
process. On such systems the deferred signal scheme will not work for
those signals: it does not do the C<wait>. Again the failure will
look like a loop as the operating system will reissue the signal because
there are completed child processes that have not yet been C<wait>ed for.

=back
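
Code that makes system calls directly, bypassing the C<:perlio> retry
logic, may want to handle C<EINTR> itself, as noted under restartable
system calls above.  Here is a minimal sketch using sysread() and the
core L<Errno> module; C<sysread_retry()> is a hypothetical helper, and
the pipe merely gives it something to read:

    use strict;
    use warnings;
    use Errno qw(EINTR);

    # Read up to $len bytes, retrying when a signal interrupts the
    # call.  Returns the data read, or "" at end of file.
    # sysread_retry() is a made-up name for this illustration.
    sub sysread_retry {
        my ($fh, $len) = @_;
        while (1) {
            my $n = sysread($fh, my $buf, $len);
            return $buf if defined $n;
            next if $! == EINTR;    # interrupted by a signal: retry
            die "sysread failed: $!";
        }
    }

    pipe(my $reader, my $writer)    || die "cannot pipe: $!";
    syswrite($writer, "hello\n");
    close($writer)                  || die "cannot close: $!";
    print sysread_retry($reader, 4096);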

If you want the old signal behavior back despite possible
memory corruption, set the environment variable C<PERL_SIGNALS> to
C<"unsafe">.  This feature first appeared in Perl 5.8.1.

=head1 Named Pipes

A named pipe (often referred to as a FIFO) is an old Unix IPC
mechanism for processes communicating on the same machine.  It works
just like regular anonymous pipes, except that the
processes rendezvous using a filename and need not be related.

To create a named pipe, use the C<POSIX::mkfifo()> function.

    use POSIX qw(mkfifo);
    mkfifo($path, 0700)     ||  die "mkfifo $path failed: $!";

You can also use the Unix command mknod(1), or on some
systems, mkfifo(1).  These may not be in your normal path, though.

    # system return val is backwards, so && not ||
    #
    $ENV{PATH} .= ":/etc:/usr/etc";
    if  (      system("mknod",  $path, "p")
            && system("mkfifo", $path) )
    {
        die "mk{nod,fifo} $path failed";
    }


A fifo is convenient when you want to connect a process to an unrelated
one.  When you open a fifo, the program will block until there's something
on the other end.

For example, let's say you'd like to have your F<.signature> file be a
named pipe that has a Perl program on the other end.  Now every time any
program (like a mailer, news reader, finger program, etc.) tries to read
from that file, the reading program will read the new signature from your
program.  We'll use the pipe-checking file-test operator, B<-p>, to find
out whether anyone (or anything) has accidentally removed our fifo.

    chdir();    # go home
    my $FIFO = ".signature";

    while (1) {
        unless (-p $FIFO) {
            unlink $FIFO;   # discard any failure, will catch later
            require POSIX;  # delayed loading of heavy module
            POSIX::mkfifo($FIFO, 0700)
                                || die "can't mkfifo $FIFO: $!";
        }

        # next line blocks till there's a reader
        open (FIFO, "> $FIFO")  || die "can't open $FIFO: $!";
        print FIFO "John Smith (smith\@host.org)\n", `fortune -s`;
        close(FIFO)             || die "can't close $FIFO: $!";
        sleep 2;                # to avoid dup signals
    }
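
The other end needs nothing special: a reader simply opens the fifo
like a regular file and reads until the writer closes it.  Here is a
small sketch of such a reader; the C<read_fifo()> helper is
hypothetical, and it runs only when a path is given on the command line:

    use strict;
    use warnings;

    # Open a named pipe for reading and return whatever the writer
    # sends before closing.  read_fifo() is a made-up helper name.
    sub read_fifo {
        my ($path) = @_;
        die "$path is not a named pipe" unless -p $path;
        # this open blocks till there's a writer
        open(my $fh, "<", $path)    || die "can't open $path: $!";
        local $/;                   # slurp mode
        my $text = <$fh>;
        close($fh)                  || die "can't close $path: $!";
        return $text;
    }

    print read_fifo($ARGV[0]) if @ARGV;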

=head1 Using open() for IPC

Perl's basic open() statement can also be used for unidirectional
interprocess communication by either appending or prepending a pipe
symbol to the second argument to open().  Here's how to start
something up in a child process you intend to write to:

    open(SPOOLER, "| cat -v | lpr -h 2>/dev/null")
                        || die "can't fork: $!";
    local $SIG{PIPE} = sub { die "spooler pipe broke" };
    print SPOOLER "stuff\n";
    close SPOOLER       || die "bad spool: $! $?";

And here's how to start up a child process you intend to read from:

    open(STATUS, "netstat -an 2>&1 |")
                        || die "can't fork: $!";
    while (<STATUS>) {
        next if /^(tcp|udp)/;
        print;
    }
    close STATUS        || die "bad netstat: $! $?";
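
The examples above use bareword filehandles and a single command
string.  Here is a sketch of the same reading pattern with a lexical
filehandle and the list form of pipe open, on systems that support it.
The C<each_line_from()> helper is hypothetical, and the child here is
just another perl (C<$^X>) so that the example runs anywhere:

    use strict;
    use warnings;

    # Run @cmd and hand each line of its output to $callback.
    # each_line_from() is a made-up name for this illustration.
    sub each_line_from {
        my ($callback, @cmd) = @_;
        open(my $fh, "-|", @cmd)    || die "can't run $cmd[0]: $!";
        while (my $line = <$fh>) {
            $callback->($line);
        }
        close($fh)                  || die "bad $cmd[0]: $! $?";
        return;
    }

    each_line_from(sub { print "child said: $_[0]" },
                   $^X, "-e", 'print "hello\n"');

Because the arguments are passed as a list, shell constructs such as
C<< 2>&1 >> are not available in this form.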

If one can be sure that a particular program is a Perl script expecting
filenames in @ARGV, the clever programmer can write something like this:

    % program f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile

and no matter which sort of shell it's called from, the Perl program will
read from the file F<f1>, the process F<cmd1>, standard input (F<tmpfile>
in this case), the F<f2> file, the F<cmd2> command, and finally the F<f3>
file.  Pretty nifty, eh?

You might notice that you could use backticks for much the
same effect as opening a pipe for reading:

    print grep { !/^(tcp|udp)/ } `netstat -an 2>&1`;
    die "bad netstatus ($?)" if $?;

While this is true on the surface, it's much more efficient to process the
file one line or record at a time because then you don't have to read the
whole thing into memory at once.  It also gives you finer control of the
whole process, letting you kill off the child process early if you'd like.

Be careful to check the return values from both open() and close().  If
you're I<writing> to a pipe, you should also trap SIGPIPE.  Otherwise,
think of what happens when you start up a pipe to a command that doesn't
exist: the open() will in all likelihood succeed (it only reflects the
fork()'s success), but then your output will fail--spectacularly.  Perl
can't know whether the command worked, because your command is actually
running in a separate process whose exec() might have failed.  Therefore,
while readers of bogus commands return just a quick EOF, writers
to bogus commands will get hit with a signal, which they'd best be prepared
to handle.  Consider:

    open(FH, "|bogus")      || die "can't fork: $!";
    print FH "bang\n";      #  neither necessary nor sufficient
                            #  to check print retval!
    close(FH)               || die "can't close: $!";

The return value from print() isn't checked because of pipe
buffering: physical writes are delayed.  That won't blow up until the
close, and it will blow up with a SIGPIPE.  To catch it, you could use
this:

    $SIG{PIPE} = "IGNORE";
    open(FH, "|bogus")  || die "can't fork: $!";
    print FH "bang\n";
    close(FH)           || die "can't close: status=$?";
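
The same idea can be wrapped up so that C<SIGPIPE> is ignored only for
the duration of the write.  This is a sketch under that assumption, not
the only way to do it; C<write_to_command()> is a hypothetical helper,
and the child is another perl (C<$^X>) that exits without reading:

    use strict;
    use warnings;

    # Send $data to @cmd on its standard input.  Returns true only if
    # the write and the command both succeeded.  write_to_command()
    # is a made-up name for this illustration.
    sub write_to_command {
        my ($data, @cmd) = @_;
        local $SIG{PIPE} = "IGNORE";
        open(my $fh, "|-", @cmd)    || die "can't fork: $!";
        print $fh $data;            # buffered; errors surface at close
        return close($fh) ? 1 : 0;
    }

    write_to_command("bang\n", $^X, "-e", "exit 3")
        || warn "command failed: status=$?\n";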

=head2 Filehandles

Both the main process and any child processes it forks share the same
STDIN, STDOUT, and STDERR filehandles.  If both processes try to access
them at once, strange things can happen.  You may also want to close
or reopen the filehandles for the child.  You can get around this by
opening your pipe with open(), but on some systems this means that the
child process cannot outlive the parent.

=head2 Background Processes

You can run a command in the background with:

    system("cmd &");

The command's STDOUT and STDERR (and possibly STDIN, depending on your
shell) will be the same as the parent's.  You won't need to catch
SIGCHLD because of the double-fork taking place; see below for details.

=head2 Complete Dissociation of Child from Parent

In some cases (starting server processes, for instance) you'll want to
completely dissociate the child process from the parent.  This is
often called daemonization.  A well-behaved daemon will also chdir()
to the root directory so it doesn't prevent unmounting the filesystem
containing the directory from which it was launched, and redirect its
standard file descriptors from and to F</dev/null> so that random
output doesn't wind up on the user's terminal.

 use POSIX "setsid";

 sub daemonize {
     chdir("/")                  || die "can't chdir to /: $!";
     open(STDIN,  "< /dev/null") || die "can't read /dev/null: $!";
     open(STDOUT, "> /dev/null") || die "can't write to /dev/null: $!";
     defined(my $pid = fork())   || die "can't fork: $!";
     exit if $pid;               # non-zero now means I am the parent
     (setsid() != -1)            || die "Can't start a new session: $!";
     open(STDERR, ">&STDOUT")    || die "can't dup stdout: $!";
 }

The fork() has to come before the setsid() to ensure you aren't a
process group leader; the setsid() will fail if you are.  If your
system doesn't have the setsid() function, open F</dev/tty> and use the
C<TIOCNOTTY> ioctl() on it instead.  See tty(4) for details.

Non-Unix users should check their C<< I<Your_OS>::Process >> module for
other possible solutions.

=head2 Safe Pipe Opens

Another interesting approach to IPC is making your single program go
multiprocess and communicate between--or even amongst--yourselves.  The
open() function will accept a file argument of either C<"-|"> or C<"|-">
to do a very interesting thing: it forks a child connected to the
filehandle you've opened.  The child is running the same program as the
parent.  This is useful for safely opening a file when running under an
assumed UID or GID, for example.  If you open a pipe I<to> minus, you can
write to the filehandle you opened and your kid will find it in I<his>
STDIN.  If you open a pipe I<from> minus, you can read from the filehandle
you opened whatever your kid writes to I<his> STDOUT.

    use English;
    my $PRECIOUS = "/path/to/some/safe/file";
    my $sleep_count;
    my $pid;

    do {
        $pid = open(KID_TO_WRITE, "|-");
        unless (defined $pid) {
            warn "cannot fork: $!";
            die "bailing out" if $sleep_count++ > 6;
            sleep 10;
        }
    } until defined $pid;

    if ($pid) {                 # I am the parent
        print KID_TO_WRITE @some_data;
        close(KID_TO_WRITE)     || warn "kid exited $?";
    } else {                    # I am the child
        # drop permissions in setuid and/or setgid programs:
        ($EUID, $EGID) = ($UID, $GID);
        open (OUTFILE, "> $PRECIOUS")
                                || die "can't open $PRECIOUS: $!";
        while (<STDIN>) {
            print OUTFILE;      # child's STDIN is parent's KID_TO_WRITE
        }
        close(OUTFILE)          || die "can't close $PRECIOUS: $!";
        exit(0);                # don't forget this!!
    }

Another common use for this construct is when you need to execute
something without the shell's interference.  With system(), it's
straightforward, but you can't use a pipe open or backticks safely.
That's because there's no way to stop the shell from getting its hands on
your arguments.   Instead, use lower-level control to call exec() directly.

Here's a safe backtick or pipe open for read:

    my $pid = open(KID_TO_READ, "-|");
    defined($pid)           || die "can't fork: $!";

    if ($pid) {             # parent
        while (<KID_TO_READ>) {
                            # do something interesting
        }
        close(KID_TO_READ)  || warn "kid exited $?";

    } else {                # child
        ($EUID, $EGID) = ($UID, $GID); # suid only
        exec($program, @options, @args)
                            || die "can't exec program: $!";
        # NOTREACHED
    }

And here's a safe pipe open for writing:

    my $pid = open(KID_TO_WRITE, "|-");
    defined($pid)           || die "can't fork: $!";

    $SIG{PIPE} = sub { die "whoops, $program pipe broke" };

    if ($pid) {             # parent
        print KID_TO_WRITE @data;
        close(KID_TO_WRITE) || warn "kid exited $?";

    } else {                # child
        ($EUID, $EGID) = ($UID, $GID);
        exec($program, @options, @args)
                            || die "can't exec program: $!";
        # NOTREACHED
    }

It is very easy to deadlock a process using this form of open(), or
indeed with any use of pipe() with multiple subprocesses.  The
example above is "safe" because it is simple and calls exec().  See
L</"Avoiding Pipe Deadlocks"> for general safety principles, but there
are extra gotchas with Safe Pipe Opens.

In particular, if you opened the pipe using C<open FH, "|-">, then you
cannot simply use close() in the parent process to close an unwanted
writer.  Consider this code:

    my $pid = open(WRITER, "|-");        # fork open a kid
    defined($pid)               || die "first fork failed: $!";
    if ($pid) {
        my $sub_pid = fork();
        defined($sub_pid)       || die "second fork failed: $!";
        if ($sub_pid) {
            close(WRITER)       || die "couldn't close WRITER: $!";
            # now do something else...
        }
        else {
            # first write to WRITER
            # ...
            # then when finished
            close(WRITER)       || die "couldn't close WRITER: $!";
            exit(0);
        }
    }
    else {
        # first do something with STDIN, then
        exit(0);
    }

In the example above, the true parent does not want to write to the WRITER
filehandle, so it closes it.  However, because WRITER was opened using
C<open FH, "|-">, it has a special behavior: closing it calls
waitpid() (see L<perlfunc/waitpid>), which waits for the subprocess
to exit.  If the child process ends up waiting for something happening
in the section marked "do something else", you have deadlock.

This can also be a problem with intermediate subprocesses in more
complicated code, which will call waitpid() on all open filehandles
during global destruction--in no predictable order.

To solve this, you must manually use pipe(), fork(), and the form of
open() which sets one file descriptor to another, as shown below:

    pipe(READER, WRITER)        || die "pipe failed: $!";
    $pid = fork();
    defined($pid)               || die "first fork failed: $!";
    if ($pid) {
        close READER;
        my $sub_pid = fork();
        defined($sub_pid)       || die "second fork failed: $!";
        if ($sub_pid) {
            close(WRITER)       || die "can't close WRITER: $!";
        }
        else {
            # write to WRITER...
            # ...
            # then  when finished
            close(WRITER)       || die "can't close WRITER: $!";
            exit(0);
        }
        # now do something else...
    }
    else {
        open(STDIN, "<&READER") || die "can't reopen STDIN: $!";
        close(WRITER)           || die "can't close WRITER: $!";
        # do something...
        exit(0);
    }

Since Perl 5.8.0, you can also use the list form of C<open> for pipes.
This is preferred when you wish to avoid having the shell interpret
metacharacters that may be in your command string.

So for example, instead of using:

    open(PS_PIPE, "ps aux|")    || die "can't open ps pipe: $!";

One would use either of these:

    open(PS_PIPE, "-|", "ps", "aux")
                                || die "can't open ps pipe: $!";

    @ps_args = qw[ ps aux ];
    open(PS_PIPE, "-|", @ps_args)
                                || die "can't open @ps_args|: $!";

Because there are more than three arguments to open(), it forks the ps(1)
command I<without> spawning a shell, and reads its standard output via the
C<PS_PIPE> filehandle.  The corresponding syntax to I<write> to command
pipes is to use C<"|-"> in place of C<"-|">.

This was admittedly a rather silly example, because you're using string
literals whose content is perfectly safe.  There is therefore no cause to
resort to the harder-to-read, multi-argument form of pipe open().  However,
whenever you cannot be assured that the program arguments are free of shell
metacharacters, the fancier form of open() should be used.  For example:

    @grep_args = ("egrep", "-i", $some_pattern, @many_files);
    open(GREP_PIPE, "-|", @grep_args)
                        || die "can't open @grep_args|: $!";

Here the multi-argument form of pipe open() is preferred because the
pattern and indeed even the filenames themselves might hold metacharacters.
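
To see that the list form really does bypass the shell, here is a small
sketch that hands echo(1) arguments a shell would otherwise mangle.  The
choice of echo(1) and the sample arguments are assumptions made purely
for illustration.

    use strict;
    use warnings;

    # Arguments that a shell would mangle: a command separator and
    # a variable reference.  echo(1) is just a convenient stand-in.
    my @args = ('a;b', '$HOME');

    open(my $echo, "-|", "echo", @args)
                            || die "can't run echo: $!";
    my $output = <$echo>;
    close($echo)            || die "echo failed: status=$?";

    print $output;          # the arguments arrive untouched

Because no shell is involved, the semicolon does not start a second
command and C<$HOME> is not expanded; echo(1) simply prints both
arguments as given.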

Be aware that these operations are full Unix forks, which means they may
not be correctly implemented on all alien systems.

=head2 Avoiding Pipe Deadlocks

Whenever you have more than one subprocess, you must be careful that each
closes whichever half of any pipes created for interprocess communication
it is not using.  Otherwise, any child process reading from the pipe
and expecting an EOF will never receive it, and will therefore never exit.
One process closing its copy of the write end is not enough; the reader
sees EOF only after the last process holding that end open has closed it.

Certain built-in Unix features help prevent this most of the time.  For
instance, filehandles have a "close on exec" flag, which is set I<en masse>
under control of the C<$^F> variable.  This is so any filehandles you
didn't explicitly route to the STDIN, STDOUT or STDERR of a child
I<program> will be automatically closed.

Always explicitly and immediately call close() on the writable end of any
pipe, unless that process is actually writing to it.  Even if you don't
explicitly call close(), Perl will still close() all filehandles during
global destruction.  As previously discussed, if those filehandles have
been opened with Safe Pipe Open, this will result in calling waitpid(),
which may again deadlock.
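
The rule above can be sketched with a plain pipe() and fork().  This is
a minimal illustration rather than a general recipe; the variable names
are arbitrary, and the child reports its line count through its exit
code only to keep the example short.

    use strict;
    use warnings;

    pipe(my $reader, my $writer)    || die "pipe failed: $!";

    my $pid = fork();
    defined($pid)                   || die "can't fork: $!";

    if ($pid == 0) {                # child: reads only
        close($writer);             # or <$reader> never sees EOF
        my $count = 0;
        $count++ while <$reader>;
        exit($count);               # report line count as exit code
    }

    close($reader);                 # parent: writes only
    print $writer "line $_\n" for 1 .. 3;
    close($writer)                  || die "can't close writer: $!";
    waitpid($pid, 0);
    print "child read ", $? >> 8, " lines\n";

If the child skipped its close($writer), it would hold a write end of
its own pipe open and block forever waiting for an EOF that only it
could deliver.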

=head2 Bidirectional Communication with Another Process

While this works reasonably well for unidirectional communication, what
about bidirectional communication?  The most obvious approach doesn't work:

    # THIS DOES NOT WORK!!
    open(PROG_FOR_READING_AND_WRITING, "| some program |")

If you forget to C<use warnings>, you'll miss out entirely on the
helpful diagnostic message:

    Can't do bidirectional pipe at -e line 1.

If you really want to, you can use the standard open2() from the
C<IPC::Open2> module to catch both ends.  There's also an open3() in
C<IPC::Open3> for tridirectional I/O so you can also catch your child's
STDERR, but doing so would then require an awkward select() loop and
wouldn't allow you to use normal Perl input operations.

If you look at its source, you'll see that open2() uses low-level
primitives like the pipe() and exec() syscalls to create all the
connections.  Although using socketpair() might have been more efficient,
that would have made it even less portable than it already is.  The open2()
and open3() functions are unlikely to work anywhere except on a Unix
system, or at least one purporting POSIX compliance.

=for TODO
Hold on, is this even true?  First it says that socketpair() is avoided
for portability, but then it says it probably won't work except on
Unixy systems anyway.  Which one of those is true?

Here's an example of using open2():

    use FileHandle;
    use IPC::Open2;
    $pid = open2(*Reader, *Writer, "cat -un");
    print Writer "stuff\n";
    $got = <Reader>;

The problem with this is that buffering is really going to ruin your
day.  Even though your C<Writer> filehandle is auto-flushed so the process
on the other end gets your data in a timely manner, you can't usually do
anything to force that process to give its data to you in a similarly quick
fashion.  In this special case, we could actually do so, because we gave
I<cat> a B<-u> flag to make it unbuffered.  But very few commands are
designed to operate over pipes, so this seldom works unless you yourself
wrote the program on the other end of the double-ended pipe.

A solution to this is to use a library which uses pseudottys to make your
program behave more reasonably.  This way you don't have to have control
over the source code of the program you're using.  The C<Expect> module
from CPAN also addresses this kind of thing.  This module requires two
other modules from CPAN, C<IO::Pty> and C<IO::Stty>.  It sets up a pseudo
terminal to interact with programs that insist on talking to the terminal
device driver.  If your system is supported, this may be your best bet.
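
One case where open2() is comparatively painless is a filter that reads
all of its input before producing any output.  The sketch below assumes
sort(1) is available and behaves that way; it writes everything, closes
the writer to signal EOF, and only then reads.  It does not help with
programs that expect an interleaved conversation.

    use strict;
    use warnings;
    use IPC::Open2;

    # sort(1) reads all of its input before writing anything, so we
    # can write, close our end to signal EOF, and only then read.
    my $pid = open2(my $from_sort, my $to_sort, "sort");

    print $to_sort "$_\n" for qw(pear apple fig);
    close($to_sort)         || die "can't close writer: $!";

    my @sorted = <$from_sort>;
    close($from_sort);
    waitpid($pid, 0);       # open2() does not reap the child for us

    print @sorted;

Note that open2() takes the handle for reading the child's output first
and the handle for writing to the child second.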

=head2 Bidirectional Communication with Yourself

If you want, you may make low-level pipe() and fork() syscalls to stitch
this together by hand.  This example only talks to itself, but you could
reopen the appropriate handles to STDIN and STDOUT and call other processes.
(The following example lacks proper error checking.)

 #!/usr/bin/perl -w
 # pipe1 - bidirectional communication using two pipe pairs
 #         designed for the socketpair-challenged
 use IO::Handle;             # thousands of lines just for autoflush :-(
 pipe(PARENT_RDR, CHILD_WTR);  # XXX: check failure?
 pipe(CHILD_RDR,  PARENT_WTR); # XXX: check failure?
 CHILD_WTR->autoflush(1);
 PARENT_WTR->autoflush(1);

 if ($pid = fork()) {
     close PARENT_RDR;
     close PARENT_WTR;
     print CHILD_WTR "Parent Pid $$ is sending this\n";
     chomp($line = <CHILD_RDR>);
     print "Parent Pid $$ just read this: '$line'\n";
     close CHILD_RDR; close CHILD_WTR;
     waitpid($pid, 0);
 } else {
     die "cannot fork: $!" unless defined $pid;
     close CHILD_RDR;
     close CHILD_WTR;
     chomp($line = <PARENT_RDR>);
     print "Child Pid $$ just read this: '$line'\n";
     print PARENT_WTR "Child Pid $$ is sending this\n";
     close PARENT_RDR;
     close PARENT_WTR;
     exit(0);
 }

But you don't actually have to make two pipe calls.  If you
have the socketpair() system call, it will do this all for you.

 #!/usr/bin/perl -w
 # pipe2 - bidirectional communication using socketpair
 #   "the best ones always go both ways"

 use Socket;
 use IO::Handle;  # thousands of lines just for autoflush :-(

 # We say AF_UNIX because although *_LOCAL is the
 # POSIX 1003.1g form of the constant, many machines
 # still don't have it.
 socketpair(CHILD, PARENT, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
                             ||  die "socketpair: $!";

 CHILD->autoflush(1);
 PARENT->autoflush(1);

 if ($pid = fork()) {
     close PARENT;
     print CHILD "Parent Pid $$ is sending this\n";
     chomp($line = <CHILD>);
     print "Parent Pid $$ just read this: '$line'\n";
     close CHILD;
     waitpid($pid, 0);
 } else {
     die "cannot fork: $!" unless defined $pid;
     close CHILD;
     chomp($line = <PARENT>);
     print "Child Pid $$ just read this: '$line'\n";
     print PARENT "Child Pid $$ is sending this\n";
     close PARENT;
     exit(0);
 }

=head1 Sockets: Client/Server Communication

While not entirely limited to Unix-derived operating systems (e.g., WinSock
on PCs provides socket support, as do some VMS libraries), you might not have
sockets on your system, in which case this section probably isn't going to
do you much good.  With sockets, you can do both virtual circuits like TCP
streams and datagrams like UDP packets.  You may be able to do even more
depending on your system.

The Perl functions for dealing with sockets have the same names as
the corresponding system calls in C, but their arguments tend to differ
for two reasons.  First, Perl filehandles work differently than C file
descriptors.  Second, Perl already knows the length of its strings, so you
don't need to pass that information.

One of the major problems with ancient, antemillennial socket code in Perl
was that it used hard-coded values for some of the constants, which
severely hurt portability.  If you ever see code that does anything like
explicitly setting C<$AF_INET = 2>, you know you're in for big trouble.
An immeasurably superior approach is to use the C<Socket> module, which more
reliably grants access to the various constants and functions you'll need.

If you're not writing a server/client for an existing protocol like
NNTP or SMTP, you should give some thought to how your server will
know when the client has finished talking, and vice-versa.  Most
protocols are based on one-line messages and responses (so one party
knows the other has finished when a "\n" is received) or multi-line
messages and responses that end with a period on an empty line
("\n.\n" terminates a message/response).
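
As a rough sketch of the second convention, the hypothetical helper
read_message() below collects lines until it sees a lone period.  An
in-memory filehandle stands in for a socket, and the sample text is made
up.  Real protocols such as SMTP also escape message lines that begin
with a period, which this sketch omits.

    use strict;
    use warnings;

    # read_message() is a hypothetical helper: it returns a reference
    # to the lines of one message, or undef if EOF came first.
    sub read_message {
        my ($fh) = @_;
        my @lines;
        while (defined(my $line = <$fh>)) {
            $line =~ s/\015?\012\z//;        # strip the terminator
            return \@lines if $line eq ".";  # end-of-message marker
            push @lines, $line;
        }
        return undef;                        # EOF before the "."
    }

    # An in-memory filehandle stands in for a socket here.
    my $wire = "HELLO\015\012WORLD\015\012.\015\012";
    open(my $fh, "<", \$wire)   || die "can't open string: $!";
    my $message = read_message($fh);
    print "got: $_\n" for @{ $message || [] };

Returning undef on a premature EOF lets the caller tell a dropped
connection apart from a legitimately empty message.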

=head2 Internet Line Terminators

The Internet line terminator is "\015\012".  Under ASCII variants of
Unix, that could usually be written as "\r\n", but under other systems,
"\r\n" might at times be "\015\015\012", "\012\012\015", or something
completely different.  The standards specify writing "\015\012" to be
conformant (be strict in what you provide), but they also recommend
accepting a lone "\012" on input (be lenient in what you require).
We haven't always been very good about that in the code in this manpage,
but unless you're on a Mac from way back in its pre-Unix dark ages, you'll
probably be ok.

=head2 Internet TCP Clients and Servers

Use Internet-domain sockets when you want to do client-server
communication that might extend to machines outside of your own system.

Here's a sample TCP client using Internet-domain sockets:

    #!/usr/bin/perl -w
    use strict;
    use Socket;
    my ($remote, $port, $iaddr, $paddr, $proto, $line);

    $remote  = shift || "localhost";
    $port    = shift || 2345;  # random port
    if ($port =~ /\D/) { $port = getservbyname($port, "tcp") }
    die "No port" unless $port;
    $iaddr   = inet_aton($remote)       || die "no host: $remote";
    $paddr   = sockaddr_in($port, $iaddr);

    $proto   = getprotobyname("tcp");
    socket(SOCK, PF_INET, SOCK_STREAM, $proto)  || die "socket: $!";
    connect(SOCK, $paddr)               || die "connect: $!";
    while ($line = <SOCK>) {
        print $line;
    }

    close (SOCK)                        || die "close: $!";
    exit(0);

And here's a corresponding server to go along with it.  We'll
leave the address as C<INADDR_ANY> so that the kernel can choose
the appropriate interface on multihomed hosts.  If you want to sit
on a particular interface (like the external side of a gateway
or firewall machine), fill this in with your real address instead.

 #!/usr/bin/perl -Tw
 use strict;
 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
 use Socket;
 use Carp;
 my $EOL = "\015\012";

 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }

 my $port  = shift || 2345;
 die "invalid port" unless $port =~ /^ \d+ $/x;

 my $proto = getprotobyname("tcp");

 socket(Server, PF_INET, SOCK_STREAM, $proto)   || die "socket: $!";
 setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
                                                || die "setsockopt: $!";
 bind(Server, sockaddr_in($port, INADDR_ANY))   || die "bind: $!";
 listen(Server, SOMAXCONN)                      || die "listen: $!";

 logmsg "server started on port $port";

 my $paddr;

 for ( ; $paddr = accept(Client, Server); close Client) {
     my($port, $iaddr) = sockaddr_in($paddr);
     my $name = gethostbyaddr($iaddr, AF_INET);

     logmsg "connection from $name [",
             inet_ntoa($iaddr),
             "] at port $port";

     print Client "Hello there, $name, it's now ",
                     scalar localtime(), $EOL;
 }

And here's a multitasking version.  It's multitasked in that,
like most typical servers, it spawns (fork()s) a child server to
handle the client request so that the master server can quickly
go back to service a new client.

 #!/usr/bin/perl -Tw
 use strict;
 BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
 use Socket;
 use Carp;
 my $EOL = "\015\012";

 sub spawn;  # forward declaration
 sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }

 my $port  = shift || 2345;
 die "invalid port" unless $port =~ /^ \d+ $/x;

 my $proto = getprotobyname("tcp");

 socket(Server, PF_INET, SOCK_STREAM, $proto)   || die "socket: $!";
 setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, pack("l", 1))
                                                || die "setsockopt: $!";
 bind(Server, sockaddr_in($port, INADDR_ANY))   || die "bind: $!";
 listen(Server, SOMAXCONN)                      || die "listen: $!";

 logmsg "server started on port $port";

 my $waitedpid = 0;
 my $paddr;

 use POSIX ":sys_wait_h";
 use Errno;

 sub REAPER {
     local $!;   # don't let waitpid() overwrite current error
     while ((my $pid = waitpid(-1, WNOHANG)) > 0 && WIFEXITED($?)) {
         logmsg "reaped $pid" . ($? ? " with exit $?" : "");
     }
     $SIG{CHLD} = \&REAPER;  # loathe SysV
 }

 $SIG{CHLD} = \&REAPER;

 while (1) {
     $paddr = accept(Client, Server) || do {
         # try again if accept() returned because got a signal
         next if $!{EINTR};
         die "accept: $!";
     };
     my ($port, $iaddr) = sockaddr_in($paddr);
     my $name = gethostbyaddr($iaddr, AF_INET);

     logmsg "connection from $name [",
            inet_ntoa($iaddr),
            "] at port $port";

     spawn sub {
         $| = 1;
         print "Hello there, $name, it's now ",
               scalar localtime(),
               $EOL;
         exec "/usr/games/fortune"       # XXX: "wrong" line terminators
             or confess "can't exec fortune: $!";
     };
     close Client;
 }

 sub spawn {
     my $coderef = shift;

     unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
         confess "usage: spawn CODEREF";
     }

     my $pid;
     unless (defined($pid = fork())) {
         logmsg "cannot fork: $!";
         return;
     }
     elsif ($pid) {
         logmsg "begat $pid";
         return; # I'm the parent
     }
     # else I'm the child -- go spawn

     open(STDIN,  "<&Client")    || die "can't dup client to stdin";
     open(STDOUT, ">&Client")    || die "can't dup client to stdout";
     ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
     exit($coderef->());
 }

This server takes the trouble to clone off a child version via fork()
for each incoming request.  That way it can handle many requests at
once, which you might not always want.  Even if you don't fork(), the
listen() backlog will still let pending connections queue up.  Forking servers
have to be particularly careful about cleaning up their dead children
(called "zombies" in Unix parlance), because otherwise you'll quickly
fill up your process table.  The REAPER subroutine is used here to
call waitpid() for any child processes that have finished, thereby
ensuring that they terminate cleanly and don't join the ranks of the
living dead.

Within the while loop we call accept() and check to see if it returns
a false value.  This would normally indicate a system error needs
to be reported.  However, the introduction of safe signals (see
L</Deferred Signals (Safe Signals)> above) in Perl 5.8.0 means that
accept() might also be interrupted when the process receives a signal.
This typically happens when one of the forked subprocesses exits and
notifies the parent process with a CHLD signal.

If accept() is interrupted by a signal, $! will be set to EINTR.
If this happens, we can safely continue to the next iteration of
the loop and another call to accept().  It is important that your
signal handling code not modify the value of $!, or else this test
will likely fail.  In the REAPER subroutine we create a local version
of $! before calling waitpid().  When waitpid() sets $! to ECHILD as
it inevitably does when it has no more children waiting, it
updates the local copy and leaves the original unchanged.
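
The effect of localizing C<$!> can be seen in isolation.  The two
subroutines below are stand-ins for a signal handler, not real handlers,
and assigning to C<$!> by hand merely simulates what waitpid() would do.

    use strict;
    use warnings;
    use Errno qw(EINTR ECHILD);

    # Stand-ins for a signal handler; neither is a real handler here.
    sub careless_handler {           $! = ECHILD }
    sub careful_handler  { local $!; $! = ECHILD }

    $! = EINTR;
    careful_handler();
    print "careful:  ", ($!{EINTR} ? "still EINTR" : "clobbered"), "\n";

    $! = EINTR;
    careless_handler();
    print "careless: ", ($!{EINTR} ? "still EINTR" : "clobbered"), "\n";

Only the careful version leaves the caller's EINTR intact, which is what
the C<next if $!{EINTR}> test in the accept() loop depends on.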

You should use the B<-T> flag to enable taint checking (see L<perlsec>)
even if you aren't running setuid or setgid.  This is always a good idea
for servers or any program run on behalf of someone else (like CGI
scripts), because it lessens the chances that people from the outside will
be able to compromise your system.

Let's look at another TCP client.  This one connects to the TCP "time"
service on a number of different machines and shows how far their clocks
differ from the system on which it's being run:

    #!/usr/bin/perl  -w
    use strict;
    use Socket;

    my $SECS_OF_70_YEARS = 2208988800;
    sub ctime { scalar localtime(shift() || time()) }

    my $iaddr = gethostbyname("localhost");
    my $proto = getprotobyname("tcp");
    my $port = getservbyname("time", "tcp");
    my $paddr = sockaddr_in(0, $iaddr);
    my($host);

    $| = 1;
    printf "%-24s %8s %s\n", "localhost", 0, ctime();

    foreach $host (@ARGV) {
        printf "%-24s ", $host;
        my $hisiaddr = inet_aton($host)     || die "unknown host";
        my $hispaddr = sockaddr_in($port, $hisiaddr);
        socket(SOCKET, PF_INET, SOCK_STREAM, $proto)
                                            || die "socket: $!";
        connect(SOCKET, $hispaddr)          || die "connect: $!";
        my $rtime = pack("C4", ());
        read(SOCKET, $rtime, 4);
        close(SOCKET);
        my $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
        printf "%8d %s\n", $histime - time(), ctime($histime);
    }
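
The constant C<$SECS_OF_70_YEARS> in the client above is there because
the "time" service counts seconds from the start of 1900, while this
sketch assumes a system whose own clock counts from the start of 1970.
The arithmetic can be checked directly; the server reply below is made
up for the demonstration.

    use strict;
    use warnings;

    # 70 years of 365 days, plus the 17 leap days from 1904 to 1968
    # (1900 was not a leap year), times 86400 seconds per day.
    my $offset = (70 * 365 + 17) * 86_400;
    print "$offset\n";                      # 2208988800

    # Decode a made-up 4-byte reply the way the client does.
    my $reply   = pack("N", $offset + 86_400);  # hypothetical reply
    my $histime = unpack("N", $reply) - $offset;
    print scalar(gmtime($histime)), "\n";   # one day after the epoch

The C<"N"> template matters here: the service sends a 32-bit unsigned
integer in network (big-endian) byte order.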

=head2 Unix-Domain TCP Clients and Servers

That's fine for Internet-domain clients and servers, but what about local
communications?  While you can use the same setup, sometimes you don't
want to.  Unix-domain sockets are local to the current host, and are often
used internally to implement pipes.  Unlike Internet domain sockets, Unix
domain sockets can show up in the file system with an ls(1) listing.

    % ls -l /dev/log
    srw-rw-rw-  1 root            0 Oct 31 07:23 /dev/log

You can test for these with Perl's B<-S> file test:

    unless (-S "/dev/log") {
        die "something's wicked with the log system";
    }

Here's a sample Unix-domain client:

    #!/usr/bin/perl -w
    use Socket;
    use strict;
    my ($rendezvous, $line);

    $rendezvous = shift || "catsock";
    socket(SOCK, PF_UNIX, SOCK_STREAM, 0)     || die "socket: $!";
    connect(SOCK, sockaddr_un($rendezvous))   || die "connect: $!";
    while (defined($line = <SOCK>)) {
        print $line;
    }
    exit(0);

And here's a corresponding server.  You don't have to worry about silly
network terminators here because Unix domain sockets are guaranteed
to be on the localhost, and thus everything works right.

    #!/usr/bin/perl -Tw
    use strict;
    use Socket;
    use Carp;

    BEGIN { $ENV{PATH} = "/usr/bin:/bin" }
    sub spawn;  # forward declaration
    sub logmsg { print "$0 $$: @_ at ", scalar localtime(), "\n" }

    my $NAME = "catsock";
    my $uaddr = sockaddr_un($NAME);
    my $proto = getprotobyname("tcp");

    socket(Server, PF_UNIX, SOCK_STREAM, 0) || die "socket: $!";
    unlink($NAME);
    bind  (Server, $uaddr)                  || die "bind: $!";
    listen(Server, SOMAXCONN)               || die "listen: $!";

    logmsg "server started on $NAME";

    my $waitedpid;

    use POSIX ":sys_wait_h";
    sub REAPER {
        my $child;
        while (($waitedpid = waitpid(-1, WNOHANG)) > 0) {
            logmsg "reaped $waitedpid" . ($? ? " with exit $?" : "");
        }
        $SIG{CHLD} = \&REAPER;  # loathe SysV
    }

    $SIG{CHLD} = \&REAPER;


    for ( $waitedpid = 0;
          accept(Client, Server) || $waitedpid;
          $waitedpid = 0, close Client)
    {
        next if $waitedpid;
        logmsg "connection on $NAME";
        spawn sub {
            print "Hello there, it's now ", scalar localtime(), "\n";
            exec("/usr/games/fortune")  || die "can't exec fortune: $!";
        };
    }

    sub spawn {
        my $coderef = shift();

        unless (@_ == 0 && $coderef && ref($coderef) eq "CODE") {
            confess "usage: spawn CODEREF";
        }

        my $pid;
        unless (defined($pid = fork())) {
            logmsg "cannot fork: $!";
            return;
        }
        elsif ($pid) {
            logmsg "begat $pid";
            return; # I'm the parent
        }
        else {
            # I'm the child -- go spawn
        }

        open(STDIN,  "<&Client")    || die "can't dup client to stdin";
        open(STDOUT, ">&Client")    || die "can't dup client to stdout";
        ## open(STDERR, ">&STDOUT") || die "can't dup stdout to stderr";
        exit($coderef->());
    }

As you see, it's remarkably similar to the Internet domain TCP server, so
much so, in fact, that its spawn(), logmsg(), and REAPER() functions are
essentially the same as in the other server.

So why would you ever want to use a Unix domain socket instead of a
simpler named pipe?  Because a named pipe doesn't give you sessions.  You
can't tell one process's data from another's.  With socket programming,
you get a separate session for each client; that's why accept() takes two
arguments.

For example, let's say that you have a long-running database server daemon
that you want folks to be able to access from the Web, but only
if they go through a CGI interface.  You'd have a small, simple CGI
program that does whatever checks and logging you feel like, and then acts
as a Unix-domain client and connects to your private server.

=head1 TCP Clients with IO::Socket

For those preferring a higher-level interface to socket programming, the
IO::Socket module provides an object-oriented approach.  If for some reason
you lack this module, you can just fetch IO::Socket from CPAN, where you'll also
find modules providing easy interfaces to the following systems: DNS, FTP,
Ident (RFC 931), NIS and NISPlus, NNTP, Ping, POP3, SMTP, SNMP, SSLeay,
Telnet, and Time--to name just a few.

=head2 A Simple Client

Here's a client that creates a TCP connection to the "daytime"
service at port 13 of the host name "localhost" and prints out everything
that the server there cares to provide.

    #!/usr/bin/perl -w
    use IO::Socket;
    $remote = IO::Socket::INET->new(
                        Proto    => "tcp",
                        PeerAddr => "localhost",
                        PeerPort => "daytime(13)",
                    )
                 || die "can't connect to daytime service on localhost";
    while (<$remote>) { print }

When you run this program, you should get something back that
looks like this:

    Wed May 14 08:40:46 MDT 1997

Here are what those parameters to the new() constructor mean:

=over 4

=item C<Proto>

This is which protocol to use.  In this case, the socket handle returned
will be connected to a TCP socket, because we want a stream-oriented
connection, that is, one that acts pretty much like a plain old file.
Not all sockets are of this type.  For example, the UDP protocol
can be used to make a datagram socket, used for message-passing.

=item C<PeerAddr>

This is the name or Internet address of the remote host the server is
running on.  We could have specified a longer name like C<"www.perl.com">,
or an address like C<"207.171.7.72">.  For demonstration purposes, we've
used the special hostname C<"localhost">, which should always mean the
current machine you're running on.  The corresponding Internet address
for localhost is C<"127.0.0.1">, if you'd rather use that.

=item C<PeerPort>

This is the service name or port number we'd like to connect to.
We could have gotten away with using just C<"daytime"> on systems with a
well-configured system services file,[FOOTNOTE: The system services file
is found in I</etc/services> under Unixy systems.] but here we've specified the
port number (13) in parentheses.  Using just the number would have also
worked, but numeric literals make careful programmers nervous.

=back

Notice how the return value from the C<new> constructor is used as
a filehandle in the C<while> loop?  That's what's called an I<indirect
filehandle>, a scalar variable containing a filehandle.  You can use
it the same way you would a normal filehandle.  For example, you
can read one line from it this way:

    $line = <$handle>;

all remaining lines from it this way:

    @lines = <$handle>;

and send a line of data to it this way:

    print $handle "some data\n";
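
If you'd like to try those idioms without a network connection, here
is a minimal sketch.  It assumes an in-memory filehandle opened on a
scalar (the variable C<$text> is made up for this illustration) is a
fair stand-in for a socket handle when reading lines:

    use strict;
    use warnings;

    # $text stands in for data a server might send; it is an
    # assumption for illustration, not part of the client above.
    my $text = "first\nsecond\nthird\n";
    open(my $handle, "<", \$text) || die "can't open in-memory file: $!";

    my $line  = <$handle>;      # exactly one line
    my @lines = <$handle>;      # all remaining lines
    close($handle);

    printf "first line: %s", $line;
    printf "%d more lines\n", scalar @lines;

The handle lives in an ordinary lexical variable, which is what makes
it easy to pass to subroutines or store in a data structure.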

=head2 A Webget Client

Here's a simple client that takes a remote host to fetch a document
from, and then a list of files to get from that host.  This is a
more interesting client than the previous one because it first sends
something to the server before fetching the server's response.

    #!/usr/bin/perl -w
    use IO::Socket;
    unless (@ARGV > 1) { die "usage: $0 host url ..." }
    $host = shift(@ARGV);
    $EOL = "\015\012";
    $BLANK = $EOL x 2;
    for my $document (@ARGV) {
        $remote = IO::Socket::INET->new( Proto     => "tcp",
                                         PeerAddr  => $host,
                                         PeerPort  => "http(80)",
                  )     || die "cannot connect to httpd on $host";
        $remote->autoflush(1);
        print $remote "GET $document HTTP/1.0" . $BLANK;
        while ( <$remote> ) { print }
        close $remote;
    }

The web server handling the HTTP service is assumed to be at
its standard port, number 80.  If the server you're trying to
connect to is at a different port, like 1080 or 8080, you should specify it
as the named-parameter pair, C<< PeerPort => 8080 >>.  The C<autoflush>
method is used on the socket because otherwise the system would buffer
up the output we sent it.  (If you're on a prehistoric Mac, you'll also
need to change every C<"\n"> in your code that sends data over the network
to be a C<"\015\012"> instead.)

Connecting to the server is only the first part of the process: once you
have the connection, you have to use the server's language.  Each server
on the network has its own little command language that it expects as
input.  The string that we send to the server starting with "GET" is in
HTTP syntax.  In this case, we simply request each specified document.
Yes, we really are making a new connection for each document, even though
it's the same host.  That's the way you always used to have to speak HTTP.
Recent versions of web browsers may request that the remote server leave
the connection open a little while, but the server doesn't have to honor
such a request.
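
The request and response formats can be explored without touching the
network.  The following is only a sketch: the helper names
C<http_get_request> and C<split_response> are invented here, and the
canned response is made up for the demonstration.

    use strict;
    use warnings;

    my $EOL   = "\015\012";
    my $BLANK = $EOL x 2;

    # Build the same request string that webget prints to the socket.
    sub http_get_request {
        my ($document) = @_;
        return "GET $document HTTP/1.0" . $BLANK;
    }

    # Split a complete response into its header block and its body.
    # A blank line (two consecutive line endings) separates them.
    sub split_response {
        my ($response) = @_;
        my ($head, $body) = split(/\015?\012\015?\012/, $response, 2);
        return ($head, defined $body ? $body : "");
    }

    my $request = http_get_request("/index.html");
    my ($head, $body) = split_response(
        "HTTP/1.0 200 OK${EOL}Content-type: text/plain${BLANK}hello\n");
    print "status line: ", (split /\015?\012/, $head)[0], "\n";
    print "body: $body";

The pattern tolerates a bare linefeed as well as the official
carriage-return/linefeed pair, since not every server is strict about
line endings.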

Here's an example of running that program, which we'll call I<webget>:

    % webget www.perl.com /guanaco.html
    HTTP/1.1 404 File Not Found
    Date: Thu, 08 May 1997 18:02:32 GMT
    Server: Apache/1.2b6
    Connection: close
    Content-type: text/html

    <HEAD><TITLE>404 File Not Found</TITLE></HEAD>
    <BODY><H1>File Not Found</H1>
    The requested URL /guanaco.html was not found on this server.<P>
    </BODY>

Ok, so that's not very interesting, because it didn't find that
particular document.  But a long response wouldn't have fit on this page.

For a more featureful version of this program, you should look to
the I<lwp-request> program included with the LWP modules from CPAN.

=head2 Interactive Client with IO::Socket

Well, that's all fine if you want to send one command and get one answer,
but what about setting up something fully interactive, somewhat like
the way I<telnet> works?  That way you can type a line, get the answer,
type a line, get the answer, etc.

This client is more complicated than the two we've done so far, but if
you're on a system that supports the powerful C<fork> call, the solution
isn't that rough.  Once you've made the connection to whatever service
you'd like to chat with, call C<fork> to clone your process.  Each of
these two identical processes has a very simple job to do: the parent
copies everything from the socket to standard output, while the child
simultaneously copies everything from standard input to the socket.
To accomplish the same thing using just one process would be I<much>
harder, because it's easier to code two processes to do one thing than it
is to code one process to do two things.  (This keep-it-simple principle
is one of the cornerstones of the Unix philosophy, and of good software
engineering as well, which is probably why it's spread to other systems.)

Here's the code:

    #!/usr/bin/perl -w
    use strict;
    use IO::Socket;
    my ($host, $port, $kidpid, $handle, $line);

    unless (@ARGV == 2) { die "usage: $0 host port" }
    ($host, $port) = @ARGV;

    # create a tcp connection to the specified host and port
    $handle = IO::Socket::INET->new(Proto     => "tcp",
                                    PeerAddr  => $host,
                                    PeerPort  => $port)
               || die "can't connect to port $port on $host: $!";

    $handle->autoflush(1);       # so output gets there right away
    print STDERR "[Connected to $host:$port]\n";

    # split the program into two processes, identical twins
    die "can't fork: $!" unless defined($kidpid = fork());

    # the if{} block runs only in the parent process
    if ($kidpid) {
        # copy the socket to standard output
        while (defined ($line = <$handle>)) {
            print STDOUT $line;
        }
        kill("TERM", $kidpid);   # send SIGTERM to child
    }
    # the else{} block runs only in the child process
    else {
        # copy standard input to the socket
        while (defined ($line = <STDIN>)) {
            print $handle $line;
        }
        exit(0);                # just in case
    }

The C<kill> function in the parent's C<if> block is there to send a
signal to our child process, currently running in the C<else> block,
as soon as the remote server has closed its end of the connection.

If the remote server sends data a byte at a time, and you need that
data immediately without waiting for a newline (which might not happen),
you may wish to replace the C<while> loop in the parent with the
following:

    my $byte;
    while (sysread($handle, $byte, 1) == 1) {
        print STDOUT $byte;
    }

Making a system call for each byte you want to read is not very efficient
(to put it mildly) but is the simplest to explain and works reasonably
well.
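
A middle road is to ask C<sysread> for a larger chunk; it still
returns as soon as some data is available, so partial lines are not
held back.  This is a sketch under assumptions: the subroutine name
C<copy_chunks> and the 4096-byte chunk size are arbitrary choices, and
a pipe stands in for the socket so the example runs on its own.

    use strict;
    use warnings;

    # Copy everything from $in to $out, reading up to 4096 bytes
    # per system call instead of one byte.
    sub copy_chunks {
        my ($in, $out) = @_;
        my $total = 0;
        while (1) {
            my $got = sysread($in, my $chunk, 4096);
            die "sysread: $!" unless defined $got;
            last if $got == 0;          # end of file
            print {$out} $chunk;
            $total += $got;
        }
        return $total;
    }

    # Demonstration with a pipe standing in for the socket.
    pipe(my $reader, my $writer)                  || die "pipe: $!";
    defined(syswrite($writer, "no newline here")) || die "syswrite: $!";
    close($writer);
    my $copied = copy_chunks($reader, \*STDOUT);
    print "\n[$copied bytes copied]\n";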

=head1 TCP Servers with IO::Socket

As always, setting up a server is a little bit more involved than running a client.
The model is that the server creates a special kind of socket that
does nothing but listen on a particular port for incoming connections.
It does this by calling the C<< IO::Socket::INET->new() >> method with
slightly different arguments than the client did.

=over 4

=item Proto

This is which protocol to use.  Like our clients, we'll
still specify C<"tcp"> here.

=item LocalPort

We specify a local
port in the C<LocalPort> argument, which we didn't do for the client.
This is the service name or port number for which you want to be the
server.  (Under Unix, ports under 1024 are restricted to the
superuser.)  In our sample, we'll use port 9000, but you can use
any port that's not currently in use on your system.  If you try
to use one already in use, you'll get an "Address already in use"
message.  Under Unix, the C<netstat -a> command will show
which services currently have servers.

=item Listen

The C<Listen> parameter is set to the maximum number of
pending connections we can accept until we turn away incoming clients.
Think of it as a call-waiting queue for your telephone.
The low-level Socket module has a special symbol for the system maximum, which
is SOMAXCONN.

=item Reuse

The C<Reuse> parameter is needed so that we can restart our server
manually without waiting a few minutes to allow system buffers to
clear out.

=back
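
The parameters above can be tried on their own before writing any
command loop.  In this sketch, C<LocalAddr> and the port value 0 are
assumptions added for the illustration: 0 asks the kernel for any free
port, and 127.0.0.1 keeps the listener off the external network.  On
failure, C<IO::Socket::INET> leaves a description in C<$@>.

    use strict;
    use warnings;
    use IO::Socket;

    my $server = IO::Socket::INET->new(
                        Proto     => "tcp",
                        LocalAddr => "127.0.0.1",   # assumption: loopback only
                        LocalPort => 0,             # assumption: kernel picks
                        Listen    => SOMAXCONN,
                        Reuse     => 1,
                    )
                 || die "can't set up server: $@";

    # sockport reports which port the kernel actually assigned
    my $port = $server->sockport();
    print "[listening on port $port]\n";
    close($server);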

Once the generic server socket has been created using the parameters
listed above, the server then waits for a new client to connect
to it.  The server blocks in the C<accept> method, which eventually accepts a
bidirectional connection from the remote client.  (Make sure to autoflush
this handle to circumvent buffering.)

To add to user-friendliness, our server prompts the user for commands.
Most servers don't do this.  Because of the prompt without a newline,
you'll have to use the C<sysread> variant of the interactive client above.

This server accepts one of five different commands, sending output back to
the client.  Unlike most network servers, this one handles only one
incoming client at a time.  Multitasking servers are covered in
Chapter 16 of the Camel.

Here's the code.

 #!/usr/bin/perl -w
 use IO::Socket;
 use Net::hostent;      # for OOish version of gethostbyaddr

 $PORT = 9000;          # pick something not in use

 $server = IO::Socket::INET->new( Proto     => "tcp",
                                  LocalPort => $PORT,
                                  Listen    => SOMAXCONN,
                                  Reuse     => 1);

 die "can't setup server" unless $server;
 print "[Server $0 accepting clients]\n";

 while ($client = $server->accept()) {
   $client->autoflush(1);
   print $client "Welcome to $0; type help for command list.\n";
   $hostinfo = gethostbyaddr($client->peeraddr);
   printf "[Connect from %s]\n",
          $hostinfo ? $hostinfo->name : $client->peerhost;
   print $client "Command? ";
   while ( <$client>) {
     next unless /\S/;     # blank line
     if    (/quit|exit/i)  { last                                      }
     elsif (/date|time/i)  { printf $client "%s\n", scalar localtime() }
     elsif (/who/i )       { print  $client `who 2>&1`                 }
     elsif (/cookie/i )    { print  $client `/usr/games/fortune 2>&1`  }
     elsif (/motd/i )      { print  $client `cat /etc/motd 2>&1`       }
     else {
       print $client "Commands: quit date who cookie motd\n";
     }
   } continue {
      print $client "Command? ";
   }
   close $client;
 }

=head1 UDP: Message Passing

Another kind of client-server setup is one that uses not connections, but
messages.  UDP communications involve much lower overhead but also provide
less reliability, as there are no promises that messages will arrive at
all, let alone in order and unmangled.  Still, UDP offers some advantages
over TCP, including being able to "broadcast" or "multicast" to a whole
bunch of destination hosts at once (usually on your local subnet).  If you
find yourself overly concerned about reliability and start building checks
into your message system, then you probably should use just TCP to start
with.

UDP datagrams are I<not> a bytestream and should not be treated as such.
This makes using I/O mechanisms with internal buffering like stdio (i.e.
print() and friends) especially cumbersome. Use syswrite(), or better
send(), like in the example below.

Here's a UDP program similar to the sample Internet TCP client given
earlier.  However, instead of checking one host at a time, the UDP version
will check many of them asynchronously by simulating a multicast and then
using select() to do a timed-out wait for I/O.  To do something similar
with TCP, you'd have to use a different socket handle for each host.

 #!/usr/bin/perl -w
 use strict;
 use Socket;
 use Sys::Hostname;

 my ( $count, $hisiaddr, $hispaddr, $histime,
      $host, $iaddr, $paddr, $port, $proto,
      $rin, $rout, $rtime, $SECS_OF_70_YEARS);

 $SECS_OF_70_YEARS = 2_208_988_800;

 $iaddr = gethostbyname(hostname());
 $proto = getprotobyname("udp");
 $port = getservbyname("time", "udp");
 $paddr = sockaddr_in(0, $iaddr); # 0 means let kernel pick

 socket(SOCKET, PF_INET, SOCK_DGRAM, $proto)   || die "socket: $!";
 bind(SOCKET, $paddr)                          || die "bind: $!";

 $| = 1;
 printf "%-12s %8s %s\n",  "localhost", 0, scalar localtime();
 $count = 0;
 for $host (@ARGV) {
     $count++;
     $hisiaddr = inet_aton($host)              || die "unknown host";
     $hispaddr = sockaddr_in($port, $hisiaddr);
     defined(send(SOCKET, 0, 0, $hispaddr))    || die "send $host: $!";
 }

 $rin = "";
 vec($rin, fileno(SOCKET), 1) = 1;

 # timeout after 10.0 seconds
 while ($count && select($rout = $rin, undef, undef, 10.0)) {
     $rtime = "";
     $hispaddr = recv(SOCKET, $rtime, 4, 0)    || die "recv: $!";
     ($port, $hisiaddr) = sockaddr_in($hispaddr);
     $host = gethostbyaddr($hisiaddr, AF_INET);
     $histime = unpack("N", $rtime) - $SECS_OF_70_YEARS;
     printf "%-12s ", $host;
     printf "%8d %s\n", $histime - time(), scalar localtime($histime);
     $count--;
 }

This example does not include any retries and may consequently fail to
contact a reachable host. The most prominent reason for this is congestion
of the queues on the sending host if the number of hosts to contact is
sufficiently large.
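
To see datagram semantics in isolation, here is a minimal sketch that
sends one message to itself over the loopback interface.  It uses
C<IO::Socket::INET> and C<IO::Select> rather than the low-level calls
above; the port 0 (kernel's choice), the five-second timeout, and the
"ping" payload are all assumptions made for the demonstration.

    use strict;
    use warnings;
    use IO::Socket;
    use IO::Select;

    # A bound socket to receive on; port 0 lets the kernel pick.
    my $receiver = IO::Socket::INET->new(
                        Proto     => "udp",
                        LocalAddr => "127.0.0.1",
                        LocalPort => 0,
                    )
                 || die "can't create receiver: $@";

    # A second socket whose default destination is the receiver.
    my $sender = IO::Socket::INET->new(
                        Proto    => "udp",
                        PeerAddr => "127.0.0.1",
                        PeerPort => $receiver->sockport(),
                    )
                 || die "can't create sender: $@";

    defined($sender->send("ping"))      || die "send: $!";

    # Datagrams may be lost, so never wait forever for one.
    my $message = "";
    my @ready = IO::Select->new($receiver)->can_read(5);
    if (@ready) {
        defined($receiver->recv($message, 1024)) || die "recv: $!";
    }
    print "got '$message'\n";

Each C<send> produces exactly one datagram and each C<recv> consumes
exactly one, which is why the buffered C<print> is a poor fit here.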

=head1 SysV IPC

While System V IPC isn't so widely used as sockets, it still has some
interesting uses.  However, you cannot use SysV IPC or Berkeley mmap() to
have a variable shared amongst several processes.  That's because Perl
would reallocate your string when you weren't wanting it to.  You might
look into the C<IPC::Shareable> or C<threads::shared> modules for that.

Here's a small example showing shared memory usage.

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);

    $size = 2000;
    $id = shmget(IPC_PRIVATE, $size, S_IRUSR | S_IWUSR);
    defined($id)                    || die "shmget: $!";
    print "shm key $id\n";

    $message = "Message #1";
    shmwrite($id, $message, 0, 60)  || die "shmwrite: $!";
    print "wrote: '$message'\n";
    shmread($id, $buff, 0, 60)      || die "shmread: $!";
    print "read : '$buff'\n";

    # the buffer of shmread is zero-character end-padded.
    substr($buff, index($buff, "\0")) = "";
    print "un" unless $buff eq $message;
    print "swell\n";

    print "deleting shm $id\n";
    shmctl($id, IPC_RMID, 0)        || die "shmctl: $!";
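
One caveat about the padding removal above: if the data happens to
fill the whole region read, C<index> finds no NUL and returns -1, and
the C<substr> assignment then chops the last real character.  Here is
a sketch of a more forgiving approach; the helper name
C<strip_padding> is invented for this example.  Note that it removes
only trailing NULs, whereas the original cuts at the first NUL.

    use strict;
    use warnings;

    # Remove only the trailing NUL padding; leave the string alone
    # when there is none.
    sub strip_padding {
        my ($buff) = @_;
        $buff =~ s/\0+\z//;
        return $buff;
    }

    my $padded = "Message #1" . ("\0" x 50);   # shaped like the shmread result
    my $full   = "x" x 60;                     # no padding at all
    print "swell\n" if strip_padding($padded) eq "Message #1"
                    && strip_padding($full)   eq $full;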

Here's an example of a semaphore:

    use IPC::SysV qw(IPC_CREAT);

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 10, 0666 | IPC_CREAT);
    defined($id)                    || die "semget: $!";
    print "sem id $id\n";

Put this code in a separate file to be run in more than one process.
Call the file F<take>:

    # look up the semaphore set created above

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    defined($id)                    || die "semget: $!";

    $semnum  = 0;
    $semflag = 0;

    # "take" semaphore
    # wait for semaphore to be zero
    $semop = 0;
    $opstring1 = pack("s!s!s!", $semnum, $semop, $semflag);

    # Increment the semaphore count
    $semop = 1;
    $opstring2 = pack("s!s!s!", $semnum, $semop,  $semflag);
    $opstring  = $opstring1 . $opstring2;

    semop($id, $opstring)   || die "semop: $!";

Put this code in a separate file to be run in more than one process.
Call this file F<give>:

    # "give" the semaphore
    # run this in the original process and you will see
    # that the second process continues

    $IPC_KEY = 1234;
    $id = semget($IPC_KEY, 0, 0);
    die unless defined($id);

    $semnum  = 0;
    $semflag = 0;

    # Decrement the semaphore count
    $semop = -1;
    $opstring = pack("s!s!s!", $semnum, $semop, $semflag);

    semop($id, $opstring)   || die "semop: $!";

The SysV IPC code above was written long ago, and it's definitely
clunky looking.  For a more modern look, see the IPC::SysV module.
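
The operation strings handed to C<semop> above are just packed
triples of native short integers, and several of them can be
concatenated so that they are applied in a single call.  This sketch
needs no semaphore at all; the helper name C<sembuf> is invented for
the illustration.

    use strict;
    use warnings;

    # Pack one semaphore operation: three native short integers.
    sub sembuf {
        my ($semnum, $semop, $semflag) = @_;
        return pack("s!s!s!", $semnum, $semop, $semflag);
    }

    my $wait_for_zero = sembuf(0, 0, 0);
    my $increment     = sembuf(0, 1, 0);
    my $take          = $wait_for_zero . $increment;  # two ops, one call

    my $op_size = length(sembuf(0, 0, 0));
    printf "%d operations of %d bytes each\n",
           length($take) / $op_size, $op_size;

    # Unpacking recovers the fields, which helps when debugging.
    my @fields = unpack("(s!s!s!)*", $take);
    print "@fields\n";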

A small example demonstrating SysV message queues:

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID IPC_CREAT S_IRUSR S_IWUSR);

    my $id = msgget(IPC_PRIVATE, IPC_CREAT | S_IRUSR | S_IWUSR);
    defined($id)                || die "msgget failed: $!";

    my $sent      = "message";
    my $type_sent = 1234;

    msgsnd($id, pack("l! a*", $type_sent, $sent), 0)
                                || die "msgsnd failed: $!";

    msgrcv($id, my $rcvd_buf, 60, 0, 0)
                                || die "msgrcv failed: $!";

    my($type_rcvd, $rcvd) = unpack("l! a*", $rcvd_buf);

    if ($rcvd eq $sent) {
        print "okay\n";
    } else {
        print "not okay\n";
    }

    msgctl($id, IPC_RMID, 0)    || die "msgctl failed: $!\n";
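
The message layout used above, a native long type tag followed by the
payload, can be checked without creating a queue.  The helper names
C<encode_message> and C<decode_message> below are made up for this
sketch.

    use strict;
    use warnings;

    # A SysV message is a native long type tag plus the payload.
    sub encode_message {
        my ($type, $payload) = @_;
        return pack("l! a*", $type, $payload);
    }

    # Returns the list ($type, $payload).
    sub decode_message {
        my ($buffer) = @_;
        return unpack("l! a*", $buffer);
    }

    my ($type, $payload) = decode_message(encode_message(1234, "message"));
    print "type $type, payload '$payload'\n";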

=head1 NOTES

Most of these routines quietly but politely return C<undef> when they
fail instead of causing your program to die right then and there due to
an uncaught exception.  (Actually, some of the new I<Socket> conversion
functions do croak() on bad arguments.)  It is therefore essential to
check return values from these functions.  Always begin your socket
programs this way for optimal success, and don't forget to add the B<-T>
taint-checking flag to the C<#!> line for servers:

    #!/usr/bin/perl -Tw
    use strict;
    use sigtrap;
    use Socket;

=head1 BUGS

These routines all create system-specific portability problems.  As noted
elsewhere, Perl is at the mercy of your C libraries for much of its system
behavior.  It's probably safest to assume broken SysV semantics for
signals and to stick with simple TCP and UDP socket operations; e.g., don't
try to pass open file descriptors over a local UDP datagram socket if you
want your code to stand a chance of being portable.

=head1 AUTHOR

Tom Christiansen, with occasional vestiges of Larry Wall's original
version and suggestions from the Perl Porters.

=head1 SEE ALSO

There's a lot more to networking than this, but this should get you
started.

For intrepid programmers, the indispensable textbook is I<Unix Network
Programming, 2nd Edition, Volume 1> by W. Richard Stevens (published by
Prentice-Hall).  Most books on networking address the subject from the
perspective of a C programmer; translation to Perl is left as an exercise
for the reader.

The IO::Socket(3) manpage describes the object library, and the Socket(3)
manpage describes the low-level interface to sockets.  Besides the obvious
functions in L<perlfunc>, you should also check out the F<modules> file at
your nearest CPAN site, especially
L<http://www.cpan.org/modules/00modlist.long.html#ID5_Networking_>.
See L<perlmodlib> or best yet, the F<Perl FAQ> for a description
of what CPAN is and where to get it if the previous link doesn't work
for you.

Section 5 of CPAN's F<modules> file is devoted to "Networking, Device
Control (modems), and Interprocess Communication", and contains numerous
unbundled modules for networking, Chat and Expect operations,
CGI programming, DCE, FTP, IPC, NNTP, Proxy, Ptty, RPC, SNMP, SMTP, Telnet,
Threads, and ToolTalk--to name just a few.
perlreapi.pod000064400000073176150344123430007250 0ustar00=head1 NAME

perlreapi - Perl regular expression plugin interface

=head1 DESCRIPTION

As of Perl 5.9.5 there is a new interface for plugging and using
regular expression engines other than the default one.

Each engine is supposed to provide access to a constant structure of the
following format:

    typedef struct regexp_engine {
        REGEXP* (*comp) (pTHX_
                         const SV * const pattern, const U32 flags);
        I32     (*exec) (pTHX_
                         REGEXP * const rx,
                         char* stringarg,
                         char* strend, char* strbeg,
                         SSize_t minend, SV* sv,
                         void* data, U32 flags);
        char*   (*intuit) (pTHX_
                           REGEXP * const rx, SV *sv,
			   const char * const strbeg,
                           char *strpos, char *strend, U32 flags,
                           struct re_scream_pos_data_s *data);
        SV*     (*checkstr) (pTHX_ REGEXP * const rx);
        void    (*free) (pTHX_ REGEXP * const rx);
        void    (*numbered_buff_FETCH) (pTHX_
                                        REGEXP * const rx,
                                        const I32 paren,
                                        SV * const sv);
        void    (*numbered_buff_STORE) (pTHX_
                                        REGEXP * const rx,
                                        const I32 paren,
                                        SV const * const value);
        I32     (*numbered_buff_LENGTH) (pTHX_
                                         REGEXP * const rx,
                                         const SV * const sv,
                                         const I32 paren);
        SV*     (*named_buff) (pTHX_
                               REGEXP * const rx,
                               SV * const key,
                               SV * const value,
                               U32 flags);
        SV*     (*named_buff_iter) (pTHX_
                                    REGEXP * const rx,
                                    const SV * const lastkey,
                                    const U32 flags);
        SV*     (*qr_package)(pTHX_ REGEXP * const rx);
    #ifdef USE_ITHREADS
        void*   (*dupe) (pTHX_ REGEXP * const rx, CLONE_PARAMS *param);
    #endif
        REGEXP* (*op_comp) (...);
    } regexp_engine;


When a regexp is compiled, its C<engine> field is then set to point at
the appropriate structure, so that when it needs to be used Perl can find
the right routines to do so.

In order to install a new regexp handler, C<$^H{regcomp}> is set
to an integer which (when cast appropriately) resolves to one of these
structures.  When compiling, the C<comp> method is executed, and the
resulting C<regexp> structure's engine field is expected to point back at
the same structure.

The pTHX_ symbol in the definition is a macro used by Perl under threading
to provide an extra argument to the routine holding a pointer back to
the interpreter that is executing the regexp. So under threading all
routines get an extra argument.

=head1 Callbacks

=head2 comp

    REGEXP* comp(pTHX_ const SV * const pattern, const U32 flags);

Compile the pattern stored in C<pattern> using the given C<flags> and
return a pointer to a prepared C<REGEXP> structure that can perform
the match.  See L</The REGEXP structure> below for an explanation of
the individual fields in the REGEXP struct.

The C<pattern> parameter is the scalar that was used as the
pattern.  Previous versions of Perl would pass two C<char*> indicating
the start and end of the stringified pattern; the following snippet can
be used to get the old parameters:

    STRLEN plen;
    char*  exp = SvPV(pattern, plen);
    char* xend = exp + plen;

Since any scalar can be passed as a pattern, it's possible to implement
an engine that does something with an array (C<< "ook" =~ [ qw/ eek
hlagh / ] >>) or with the non-stringified form of a compiled regular
expression (C<< "ook" =~ qr/eek/ >>).  Perl's own engine will always
stringify everything using the snippet above, but that doesn't mean
other engines have to.

The C<flags> parameter is a bitfield which indicates which of the
C<msixpn> flags the regex was compiled with.  It also contains
additional info, such as if C<use locale> is in effect.

The C<eogc> flags are stripped out before being passed to the comp
routine.  The regex engine does not need to know if any of these
are set, as those flags should only affect what Perl does with the
pattern and its match variables, not how it gets compiled and
executed.

By the time the comp callback is called, some of these flags have
already had effect (noted below where applicable).  However most of
their effect occurs after the comp callback has run, in routines that
read the C<< rx->extflags >> field which it populates.

In general the flags should be preserved in C<< rx->extflags >> after
compilation, although the regex engine might want to add or delete
some of them to invoke or disable some special behavior in Perl.  The
flags along with any special behavior they cause are documented below:

The pattern modifiers:

=over 4

=item C</m> - RXf_PMf_MULTILINE

If this is in C<< rx->extflags >> it will be passed to
C<Perl_fbm_instr> by C<pp_split> which will treat the subject string
as a multi-line string.

=item C</s> - RXf_PMf_SINGLELINE

=item C</i> - RXf_PMf_FOLD

=item C</x> - RXf_PMf_EXTENDED

If present on a regex, C<"#"> comments will be handled differently by the
tokenizer in some cases.

TODO: Document those cases.

=item C</p> - RXf_PMf_KEEPCOPY

TODO: Document this

=item Character set

The character set rules are determined by an enum that is contained
in this field.  This is still experimental and subject to change, but
the current interface returns the rules by use of the in-line function
C<get_regex_charset(const U32 flags)>.  The only currently documented
value returned from it is REGEX_LOCALE_CHARSET, which is set if
C<use locale> is in effect. If present in C<< rx->extflags >>,
C<split> will use the locale dependent definition of whitespace
when RXf_SKIPWHITE or RXf_WHITE is in effect.  ASCII whitespace
is defined as per L<isSPACE|perlapi/isSPACE>, and by the internal
macros C<is_utf8_space> under UTF-8, and C<isSPACE_LC> under C<use
locale>.

=back
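
The pattern modifiers that survive compilation can be observed from
Perl code with C<regexp_pattern> from the L<re> module.  This is only
a sketch: it reports modifiers at the Perl level, not the raw C<RXf_>
bits in C<< rx->extflags >>, and the modifier string may carry
additional letters depending on the Perl version and the pragmas in
effect.

    use strict;
    use warnings;
    use re qw(regexp_pattern);

    my $rx = qr/ eek | hlagh /ix;

    # In list context: the pattern text and the modifier letters
    my ($pattern, $modifiers) = regexp_pattern($rx);

    print "pattern:   '$pattern'\n";
    print "modifiers: '$modifiers'\n";
    print "case-insensitive\n" if $modifiers =~ /i/;
    print "extended\n"         if $modifiers =~ /x/;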

Additional flags:

=over 4

=item RXf_SPLIT

This flag was removed in perl 5.18.0.  C<split ' '> is now special-cased
solely in the parser.  RXf_SPLIT is still #defined, so you can test for it.
This is how it used to work:

If C<split> is invoked as C<split ' '> or with no arguments (which
really means C<split(' ', $_)>, see L<split|perlfunc/split>), Perl will
set this flag.  The regex engine can then check for it and set the
SKIPWHITE and WHITE extflags.  To do this, the Perl engine does:

    if (flags & RXf_SPLIT && r->prelen == 1 && r->precomp[0] == ' ')
        r->extflags |= (RXf_SKIPWHITE|RXf_WHITE);

=back

These flags can be set during compilation to enable optimizations in
the C<split> operator.

=over 4

=item RXf_SKIPWHITE

This flag was removed in perl 5.18.0.  It is still #defined, so you can
set it, but doing so will have no effect.  This is how it used to work:

If the flag is present in C<< rx->extflags >> C<split> will delete
whitespace from the start of the subject string before it's operated
on.  What is considered whitespace depends on if the subject is a
UTF-8 string and if the C<RXf_PMf_LOCALE> flag is set.

If RXf_WHITE is set in addition to this flag, C<split> will behave like
C<split " "> under the Perl engine.

=item RXf_START_ONLY

Tells the split operator to split the target string on newlines
(C<\n>) without invoking the regex engine.

Perl's engine sets this if the pattern is C</^/> (C<plen == 1 && *exp
== '^'>), even under C</^/s>; see L<split|perlfunc/split>.  Of course a
different regex engine might want to use the same optimizations
with a different syntax.

=item RXf_WHITE

Tells the split operator to split the target string on whitespace
without invoking the regex engine.  The definition of whitespace varies
depending on if the target string is a UTF-8 string and on
if RXf_PMf_LOCALE is set.

Perl's engine sets this flag if the pattern is C<\s+>.

=item RXf_NULL

Tells the split operator to split the target string on
characters.  The definition of character varies depending on if
the target string is a UTF-8 string.

Perl's engine sets this flag on empty patterns.  This optimization
makes C<split //> much faster than it would otherwise be.  It's even
faster than C<unpack>.

=item RXf_NO_INPLACE_SUBST

Added in perl 5.18.0, this flag indicates that a regular expression might
perform an operation that would interfere with inplace substitution. For
instance it might contain lookbehind, or assign to non-magical variables
(such as $REGMARK and $REGERROR) during matching.  C<s///> will skip
certain optimisations when this is set.

=back
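
The three patterns for which Perl's own engine sets C<RXf_START_ONLY>,
C<RXf_WHITE> and C<RXf_NULL> look like this from the Perl side.  The
sketch shows only the visible results of C<split>; it cannot
demonstrate that the regex engine was bypassed, and the sample strings
are made up.

    use strict;
    use warnings;

    my @lines = split /^/,   "one\ntwo\nthree\n";   # pattern /^/
    my @words = split /\s+/, "alpha  beta\tgamma";  # pattern /\s+/
    my @chars = split //,    "ook";                 # empty pattern

    printf "%d lines, %d words, %d chars\n",
           scalar @lines, scalar @words, scalar @chars;

An alternative engine that sets the same flags for its own syntax
would be expected to give C<split> results of the same shape.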

=head2 exec

    I32 exec(pTHX_ REGEXP * const rx,
             char *stringarg, char* strend, char* strbeg,
             SSize_t minend, SV* sv,
             void* data, U32 flags);

Execute a regexp. The arguments are

=over 4

=item rx

The regular expression to execute.

=item sv

This is the SV to be matched against.  Note that the
actual char array to be matched against is supplied by the arguments
described below; the SV is just used to determine UTF8ness, C<pos()> etc.

=item strbeg

Pointer to the physical start of the string.

=item strend

Pointer to the character following the physical end of the string (i.e.
the C<\0>, if any).

=item stringarg

Pointer to the position in the string where matching should start; it might
not be equal to C<strbeg> (for example in a later iteration of C</.../g>).

=item minend

Minimum length of string (measured in bytes from C<stringarg>) that must
match; if the engine reaches the end of the match but hasn't reached this
position in the string, it should fail.

=item data

Optimisation data; subject to change.

=item flags

Optimisation flags; subject to change.

=back

=head2 intuit

    char* intuit(pTHX_
		REGEXP * const rx,
		SV *sv,
		const char * const strbeg,
		char *strpos,
		char *strend,
		const U32 flags,
		struct re_scream_pos_data_s *data);

Find the start position where a regex match should be attempted,
or possibly if the regex engine should not be run because the
pattern can't match.  This is called, as appropriate, by the core,
depending on the values of the C<extflags> member of the C<regexp>
structure.

Arguments:

    rx:     the regex to match against
    sv:     the SV being matched: only used for utf8 flag; the string
	    itself is accessed via the pointers below. Note that on
	    something like an overloaded SV, SvPOK(sv) may be false
	    and the string pointers may point to something unrelated to
	    the SV itself.
    strbeg: real beginning of string
    strpos: the point in the string at which to begin matching
    strend: pointer to the byte following the last char of the string
    flags:  currently unused; set to 0
    data:   currently unused; set to NULL


=head2 checkstr

    SV*	checkstr(pTHX_ REGEXP * const rx);

Return a SV containing a string that must appear in the pattern. Used
by C<split> for optimising matches.

=head2 free

    void free(pTHX_ REGEXP * const rx);

Called by Perl when it is freeing a regexp pattern so that the engine
can release any resources pointed to by the C<pprivate> member of the
C<regexp> structure.  This is only responsible for freeing private data;
Perl will handle releasing anything else contained in the C<regexp> structure.

=head2 Numbered capture callbacks

Called to get/set the value of C<$`>, C<$'>, C<$&> and their named
equivalents, ${^PREMATCH}, ${^POSTMATCH} and ${^MATCH}, as well as the
numbered capture groups (C<$1>, C<$2>, ...).

The C<paren> parameter will be C<1> for C<$1>, C<2> for C<$2> and so
forth, and have these symbolic values for the special variables:

    ${^PREMATCH}  RX_BUFF_IDX_CARET_PREMATCH
    ${^POSTMATCH} RX_BUFF_IDX_CARET_POSTMATCH
    ${^MATCH}     RX_BUFF_IDX_CARET_FULLMATCH
    $`            RX_BUFF_IDX_PREMATCH
    $'            RX_BUFF_IDX_POSTMATCH
    $&            RX_BUFF_IDX_FULLMATCH

Note that in Perl 5.17.3 and earlier, the last three constants were also
used for the caret variants of the variables.


The names have been chosen by analogy with L<Tie::Scalar> method
names with an additional B<LENGTH> callback for efficiency.  However
named capture variables are currently not tied internally but
implemented via magic.

=head3 numbered_buff_FETCH

    void numbered_buff_FETCH(pTHX_ REGEXP * const rx, const I32 paren,
                             SV * const sv);

Fetch a specified numbered capture.  C<sv> should be set to the scalar
to return.  The scalar is passed as an argument rather than being
returned from the function because Perl already has a scalar in which
to store the value when the callback is called, so creating another one
would be redundant.  The scalar can be set with C<sv_setsv>,
C<sv_setpvn> and friends; see L<perlapi>.

This callback is where Perl untaints its own capture variables under
taint mode (see L<perlsec>).  See the C<Perl_reg_numbered_buff_fetch>
function in F<regcomp.c> for how to untaint capture variables if
that's something you'd like your engine to do as well.

=head3 numbered_buff_STORE

    void numbered_buff_STORE(pTHX_
                             REGEXP * const rx,
                             const I32 paren,
                             SV const * const value);

Set the value of a numbered capture variable.  C<value> is the scalar
that is to be used as the new value.  It's up to the engine to make
sure this is used as the new value (or reject it).

Example:

    if ("ook" =~ /(o*)/) {
        # 'paren' will be '1' and 'value' will be 'ee'
        $1 =~ tr/o/e/;
    }

Perl's own engine will croak on any attempt to modify the capture
variables.  To do the same in another engine, use the following callback
(copied from C<Perl_reg_numbered_buff_store>):

    void
    Example_reg_numbered_buff_store(pTHX_
                                    REGEXP * const rx,
                                    const I32 paren,
                                    SV const * const value)
    {
        PERL_UNUSED_ARG(rx);
        PERL_UNUSED_ARG(paren);
        PERL_UNUSED_ARG(value);

        if (!PL_localizing)
            Perl_croak(aTHX_ PL_no_modify);
    }

Actually Perl will not I<always> croak in a statement that looks
like it would modify a numbered capture variable.  This is because the
STORE callback will not be called if Perl can determine that it
doesn't have to modify the value.  This is exactly how tied variables
behave in the same situation:

    package CaptureVar;
    use parent 'Tie::Scalar';

    sub TIESCALAR { bless [] }
    sub FETCH { undef }
    sub STORE { die "This doesn't get called" }

    package main;

    tie my $sv => "CaptureVar";
    $sv =~ y/a/b/;

Because C<$sv> is C<undef> when the C<y///> operator is applied to it,
the transliteration won't actually execute and the program won't
C<die>.  This is different to how 5.8 and earlier versions behaved
since the capture variables were READONLY variables then; now they'll
just die when assigned to in the default engine.

=head3 numbered_buff_LENGTH

    I32 numbered_buff_LENGTH (pTHX_
                              REGEXP * const rx,
                              const SV * const sv,
                              const I32 paren);

Get the C<length> of a capture variable.  There's a special callback
for this so that Perl doesn't have to do a FETCH and run C<length> on
the result.  Since the length is (in Perl's case) known from the offsets
stored in C<< rx->offs >>, this is much more efficient:

    I32 s1  = rx->offs[paren].start;
    I32 s2  = rx->offs[paren].end;
    I32 len = s2 - s1;

This is a little bit more complex in the case of UTF-8, see what
C<Perl_reg_numbered_buff_length> does with
L<is_utf8_string_loclen|perlapi/is_utf8_string_loclen>.

=head2 Named capture callbacks

Called to get/set the value of C<%+> and C<%->, as well as by some
utility functions in L<re>.

There are two callbacks.  C<named_buff> is called in all the cases where
the FETCH, STORE, DELETE, CLEAR, EXISTS and SCALAR L<Tie::Hash> callbacks
would be called on C<%+> and C<%->, and C<named_buff_iter> is called in
the same cases as FIRSTKEY and NEXTKEY.

The C<flags> parameter can be used to determine which of these
operations the callbacks should respond to.  The following flags are
currently defined:

Which L<Tie::Hash> operation is being performed from the Perl level on
C<%+> or C<%->, if any:

    RXapif_FETCH
    RXapif_STORE
    RXapif_DELETE
    RXapif_CLEAR
    RXapif_EXISTS
    RXapif_SCALAR
    RXapif_FIRSTKEY
    RXapif_NEXTKEY

Whether C<%+> or C<%-> is being operated on, if any:

    RXapif_ONE /* %+ */
    RXapif_ALL /* %- */

Whether this is being called as C<re::regname>, C<re::regnames> or
C<re::regnames_count>, if any.  The first two will be combined with
C<RXapif_ONE> or C<RXapif_ALL>.

    RXapif_REGNAME
    RXapif_REGNAMES
    RXapif_REGNAMES_COUNT

Internally C<%+> and C<%-> are implemented with a real tied interface
via L<Tie::Hash::NamedCapture>.  The methods in that package will call
back into these functions.  However the usage of
L<Tie::Hash::NamedCapture> for this purpose might change in future
releases.  For instance this might be implemented by magic instead
(would need an extension to mgvtbl).
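The way a callback might inspect the C<flags> parameter can be sketched
in standalone C.  This is only an illustration under stated assumptions:
the C<EX_*> names and their bit values are placeholders invented for
this sketch so that it compiles without F<perl.h>; a real engine must
test the C<RXapif_*> constants from F<regexp.h> instead.  The helper
names C<ex_is_write> and C<ex_target> are likewise hypothetical.

```c
#include <assert.h>

/* Placeholder bits for illustration only.  Real code uses the RXapif_*
 * constants from regexp.h, whose values may differ from these. */
enum {
    EX_FETCH  = 1 << 0,
    EX_STORE  = 1 << 1,
    EX_DELETE = 1 << 2,
    EX_CLEAR  = 1 << 3,
    EX_EXISTS = 1 << 4,
    EX_SCALAR = 1 << 5,
    EX_ONE    = 1 << 8,   /* stands in for RXapif_ONE, i.e. %+ */
    EX_ALL    = 1 << 9    /* stands in for RXapif_ALL, i.e. %- */
};

/* Return 1 if the requested operation would modify the hash.  An engine
 * whose named captures are read-only might reject these operations. */
int ex_is_write(unsigned flags)
{
    return (flags & (EX_STORE | EX_DELETE | EX_CLEAR)) != 0;
}

/* Return '+' for %+, '-' for %-, or 0 when neither bit is set. */
char ex_target(unsigned flags)
{
    if (flags & EX_ALL)
        return '-';
    if (flags & EX_ONE)
        return '+';
    return 0;
}

/* Usage: a FETCH on %+ is a read, a CLEAR on %- is a write. */
void ex_usage(void)
{
    assert(!ex_is_write(EX_FETCH | EX_ONE));
    assert(ex_is_write(EX_CLEAR | EX_ALL));
}
```

Testing the operation bits and the target bits separately keeps the
dispatch logic readable, since the two groups of flags are combined in
a single value.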

=head3 named_buff

    SV*     (*named_buff) (pTHX_ REGEXP * const rx, SV * const key,
                           SV * const value, U32 flags);

=head3 named_buff_iter

    SV*     (*named_buff_iter) (pTHX_
                                REGEXP * const rx,
                                const SV * const lastkey,
                                const U32 flags);

=head2 qr_package

    SV* qr_package(pTHX_ REGEXP * const rx);

The package the qr// magic object is blessed into (as seen by C<ref
qr//>).  It is recommended that engines change this to their package
name for identification, regardless of whether they implement methods
on the object.

The package this method returns should also have the internal
C<Regexp> package in its C<@ISA>.  C<< qr//->isa("Regexp") >> should always
be true regardless of what engine is being used.

Example implementation might be:

    SV*
    Example_qr_package(pTHX_ REGEXP * const rx)
    {
        PERL_UNUSED_ARG(rx);
        return newSVpvs("re::engine::Example");
    }

Any method calls on an object created with C<qr//> will be dispatched to the
package as a normal object.

    use re::engine::Example;
    my $re = qr//;
    $re->meth; # dispatched to re::engine::Example::meth()

To retrieve the C<REGEXP> object from the scalar in an XS function use
the C<SvRX> macro, see L<"REGEXP Functions" in perlapi|perlapi/REGEXP
Functions>.

    void meth(SV * rv)
    PPCODE:
        REGEXP * re = SvRX(rv);

=head2 dupe

    void* dupe(pTHX_ REGEXP * const rx, CLONE_PARAMS *param);

On threaded builds a regexp may need to be duplicated so that the pattern
can be used by multiple threads.  This routine is expected to handle the
duplication of any private data pointed to by the C<pprivate> member of
the C<regexp> structure.  It will be called with the preconstructed new
C<regexp> structure as an argument, the C<pprivate> member will point at
the B<old> private structure, and it is this routine's responsibility to
construct a copy and return a pointer to it (which Perl will then use to
overwrite the field as passed to this routine.)

This allows the engine to dupe its private data but also if necessary
modify the final structure if it really must.

On unthreaded builds this field doesn't exist.

=head2 op_comp

This is private to the Perl core and subject to change. Should be left
null.

=head1 The REGEXP structure

The REGEXP struct is defined in F<regexp.h>.
All regex engines must be able to
correctly build such a structure in their L</comp> routine.

The REGEXP structure contains all the data that Perl needs to be aware of
to properly work with the regular expression.  It includes data about
optimisations that Perl can use to determine if the regex engine should
really be used, and various other control info that is needed to properly
execute patterns in various contexts, such as if the pattern anchored in
some way, or what flags were used during the compile, or if the
program contains special constructs that Perl needs to be aware of.

In addition it contains two fields that are intended for the private
use of the regex engine that compiled the pattern.  These are the
C<intflags> and C<pprivate> members.  C<pprivate> is a void pointer to
an arbitrary structure, whose use and management is the responsibility
of the compiling engine.  Perl will never modify either of these
values.

    typedef struct regexp {
        /* what engine created this regexp? */
        const struct regexp_engine* engine;

        /* what re is this a lightweight copy of? */
        struct regexp* mother_re;

        /* Information about the match that the Perl core uses to manage
         * things */
        U32 extflags;   /* Flags used both externally and internally */
        I32 minlen;     /* minimum possible number of chars in
                           string to match */
        I32 minlenret;  /* minimum possible number of chars in $& */
        U32 gofs;       /* chars left of pos that we search from */

        /* substring data about strings that must appear
           in the final match, used for optimisations */
        struct reg_substr_data *substrs;

        U32 nparens;  /* number of capture groups */

        /* private engine specific data */
        U32 intflags;   /* Engine Specific Internal flags */
        void *pprivate; /* Data private to the regex engine which 
                           created this object. */

        /* Data about the last/current match. These are modified during
         * matching*/
        U32 lastparen;            /* highest close paren matched ($+) */
        U32 lastcloseparen;       /* last close paren matched ($^N) */
        regexp_paren_pair *swap;  /* Swap copy of *offs */
        regexp_paren_pair *offs;  /* Array of offsets for (@-) and
                                     (@+) */

        char *subbeg;  /* saved or original string so \digit works
                          forever. */
        SV_SAVED_COPY  /* If non-NULL, SV which is COW from original */
        I32 sublen;    /* Length of string pointed by subbeg */
        I32 suboffset;  /* byte offset of subbeg from logical start of
                           str */
        I32 subcoffset; /* suboffset equiv, but in chars (for @-/@+) */

        /* Information about the match that isn't often used */
        I32 prelen;           /* length of precomp */
        const char *precomp;  /* pre-compilation regular expression */

        char *wrapped;  /* wrapped version of the pattern */
        I32 wraplen;    /* length of wrapped */

        I32 seen_evals;   /* number of eval groups in the pattern - for
                             security checks */
        HV *paren_names;  /* Optional hash of paren names */

        /* Refcount of this regexp */
        I32 refcnt;             /* Refcount of this regexp */
    } regexp;

The fields are discussed in more detail below:

=head2 C<engine>

This field points at a C<regexp_engine> structure which contains pointers
to the subroutines that are to be used for performing a match.  It
is the compiling routine's responsibility to populate this field before
returning the regexp object.

Internally this is set to C<NULL> unless a custom engine is specified in
C<$^H{regcomp}>.  Perl's own set of callbacks can be accessed in the struct
pointed to by C<RE_ENGINE_PTR>.

=head2 C<mother_re>

TODO, see L<http://www.mail-archive.com/perl5-changes@perl.org/msg17328.html>

=head2 C<extflags>

This will be used by Perl to see what flags the regexp was compiled
with.  It will normally be set to the value of the flags parameter by
the L<comp|/comp> callback.  See the L<comp|/comp> documentation for
valid flags.

=head2 C<minlen> C<minlenret>

The minimum string length (in characters) required for the pattern to match.
This is used to
prune the search space by not bothering to match any closer to the end of a
string than would allow a match.  For instance there is no point in even
starting the regex engine if the minlen is 10 but the string is only 5
characters long.  There is no way that the pattern can match.

C<minlenret> is the minimum length (in characters) of the string that would
be found in $& after a match.

The difference between C<minlen> and C<minlenret> can be seen in the
following pattern:

    /ns(?=\d)/

where the C<minlen> would be 3 but C<minlenret> would only be 2 as the \d is
required to match but is not actually
included in the matched content.  This
distinction is particularly important as the substitution logic uses the
C<minlenret> to tell if it can do in-place substitutions (these can
result in considerable speed-up).
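The pruning that C<minlen> makes possible can be sketched in standalone
C.  This is a simplified illustration, not Perl's implementation:
C<ex_last_start> is a hypothetical helper, and both lengths are assumed
to be counted in characters, as described above.

```c
#include <assert.h>

/* Hypothetical helper: given the subject length and the pattern's minlen
 * (both in characters), return the last offset at which a match could
 * still start, or -1 when the subject is too short to match at all. */
long ex_last_start(long subject_len, long minlen)
{
    assert(subject_len >= 0 && minlen >= 0);
    if (subject_len < minlen)
        return -1;                  /* no need to start the engine */
    return subject_len - minlen;
}
```

With a subject of 5 characters and a C<minlen> of 10 the helper returns
C<-1>, which corresponds to the case above where the engine need not be
started at all.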

=head2 C<gofs>

Left offset from pos() to start match at.

=head2 C<substrs>

Substring data about strings that must appear in the final match.  This
is currently only used internally by Perl's engine, but might be
used in the future for all engines for optimisations.

=head2 C<nparens>, C<lastparen>, and C<lastcloseparen>

These fields are used to keep track of how many paren groups could be matched
in the pattern, which was the last open paren to be entered, and which was
the last close paren to be entered.

=head2 C<intflags>

The engine's private copy of the flags the pattern was compiled with. Usually
this is the same as C<extflags> unless the engine chose to modify one of them.

=head2 C<pprivate>

A void* pointing to an engine-defined
data structure.  The Perl engine uses the
C<regexp_internal> structure (see L<perlreguts/Base Structures>) but a custom
engine should use something else.

=head2 C<swap>

Unused.  Left in for compatibility with Perl 5.10.0.

=head2 C<offs>

An array of C<regexp_paren_pair> structures which define offsets into the
string being matched that correspond to the C<$&> and C<$1>, C<$2> etc.
captures.  The C<regexp_paren_pair> struct is defined as follows:

    typedef struct regexp_paren_pair {
        I32 start;
        I32 end;
    } regexp_paren_pair;

If C<< ->offs[num].start >> or C<< ->offs[num].end >> is C<-1> then that
capture group did not match.
C<< ->offs[0].start/end >> represents C<$&> (or
C<${^MATCH}> under C</p>) and C<< ->offs[paren].end >> matches C<$$paren> where
C<$paren >= 1>.
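How the start and end offsets translate into captured text can be
sketched in standalone C.  The sketch makes several assumptions:
C<ex_paren_pair> is a stand-in for C<regexp_paren_pair> so that the code
compiles without F<perl.h>, C<ex_capture> is a hypothetical helper, and
the offsets are treated as byte offsets into a plain C string.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for regexp_paren_pair, so the sketch compiles without perl.h. */
typedef struct {
    int start;
    int end;
} ex_paren_pair;

/* Copy capture group `paren` of `subject` into `out` (NUL-terminated).
 * Returns the capture length in bytes, or -1 if the group did not match
 * or the text does not fit into `out`. */
int ex_capture(const char *subject, const ex_paren_pair *offs, int paren,
               char *out, size_t outsize)
{
    int start, end, len;

    assert(subject != NULL && offs != NULL && out != NULL);
    start = offs[paren].start;
    end   = offs[paren].end;
    if (start == -1 || end == -1)
        return -1;                  /* group did not take part in match */
    len = end - start;
    if (len < 0 || (size_t)len >= outsize)
        return -1;
    memcpy(out, subject + start, (size_t)len);
    out[len] = '\0';
    return len;
}
```

Index 0 plays the role of C<$&> and indices from 1 upwards the numbered
groups, so the same helper serves both.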

=head2 C<precomp> C<prelen>

Used for optimisations.  C<precomp> holds a copy of the pattern that
was compiled and C<prelen> its length.  When a new pattern is to be
compiled (such as inside a loop) the internal C<regcomp> operator
checks if the last compiled C<REGEXP>'s C<precomp> and C<prelen>
are equivalent to the new one, and if so uses the old pattern instead
of compiling a new one.

The relevant snippet from C<Perl_pp_regcomp>:

	if (!re || !re->precomp || re->prelen != (I32)len ||
	    memNE(re->precomp, t, len))
        /* Compile a new pattern */
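The same reuse test can be written as a standalone C function for
illustration.  C<ex_needs_recompile> is a hypothetical name, and the
sketch uses plain C<memcmp> rather than Perl's C<memNE> macro so that it
compiles without F<perl.h>.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if a cached pattern (text `cached`, length `cached_len`) cannot
 * be reused for the new pattern text `pat` of length `len`, 0 if it can.
 * A NULL `cached` means nothing has been compiled yet. */
int ex_needs_recompile(const char *cached, size_t cached_len,
                       const char *pat, size_t len)
{
    assert(pat != NULL);
    if (cached == NULL)
        return 1;
    if (cached_len != len)
        return 1;                   /* cheap length test comes first */
    return memcmp(cached, pat, len) != 0;
}
```

Comparing the lengths before the bytes means that patterns of different
length are rejected without touching the pattern text.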

=head2 C<paren_names>

This is a hash used internally to track named capture groups and their
offsets.  The keys are the names of the buffers; the values are dualvars,
with the IV slot holding the number of buffers with the given name and the
PV being an embedded array of I32.  The values may also be contained
independently in the data array in cases where named backreferences are
used.

=head2 C<substrs>

Holds information on the longest string that must occur at a fixed
offset from the start of the pattern, and the longest string that must
occur at a floating offset from the start of the pattern.  Used to do
Fast-Boyer-Moore searches on the string to find out if it's worth using
the regex engine at all, and if so where in the string to search.

=head2 C<subbeg> C<sublen> C<saved_copy> C<suboffset> C<subcoffset>

Used during the execution phase for managing search and replace patterns,
and for providing the text for C<$&>, C<$1> etc. C<subbeg> points to a
buffer (either the original string, or a copy in the case of
C<RX_MATCH_COPIED(rx)>), and C<sublen> is the length of the buffer.  The
C<RX_OFFS> start and end indices index into this buffer.

In the presence of the C<REXEC_COPY_STR> flag, but with the addition of
the C<REXEC_COPY_SKIP_PRE> or C<REXEC_COPY_SKIP_POST> flags, an engine
can choose not to copy the full buffer (although it must still do so in
the presence of C<RXf_PMf_KEEPCOPY> or the relevant bits being set in
C<PL_sawampersand>).  In this case, it may set C<suboffset> to indicate the
number of bytes from the logical start of the buffer to the physical start
(i.e. C<subbeg>).  It should also set C<subcoffset>, the number of
characters in the offset. The latter is needed to support C<@-> and C<@+>
which work in characters, not bytes.

=head2 C<wrapped> C<wraplen>

Stores the string C<qr//> stringifies to. The Perl engine for example
stores C<(?^:eek)> in the case of C<qr/eek/>.

When using a custom engine that doesn't support the C<(?:)> construct
for inline modifiers, it's probably best to have C<qr//> stringify to
the supplied pattern.  Note that this will create undesired patterns in
cases such as:

    my $x = qr/a|b/;  # "a|b"
    my $y = qr/c/i;   # "c"
    my $z = qr/$x$y/; # "a|bc"

There's no solution for this problem other than making the custom
engine understand a construct like C<(?:)>.
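For comparison, the kind of wrapping that keeps interpolated patterns
self-contained can be sketched in standalone C.  This is an
illustration only: C<ex_wrap> is a hypothetical helper, and it simply
pastes a caller-supplied modifier string into the C<(?^...:...)> form
shown above without validating it.

```c
#include <assert.h>
#include <stdio.h>

/* Write "(?^FLAGS:PATTERN)" into buf.  Returns the number of characters
 * written (excluding the NUL), or -1 if the result did not fit. */
int ex_wrap(char *buf, size_t size, const char *flags, const char *pattern)
{
    int n;

    assert(buf != NULL && flags != NULL && pattern != NULL);
    n = snprintf(buf, size, "(?^%s:%s)", flags, pattern);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

Wrapped this way, the pieces from the example above would concatenate
as C<(?^:a|b)(?^i:c)>, so the alternation and the C</i> modifier each
stay confined to their own group.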

=head2 C<seen_evals>

This stores the number of eval groups in
the pattern.  This is used for security
purposes when embedding compiled regexes into larger patterns with C<qr//>.

=head2 C<refcnt>

The number of times the structure is referenced.  When this falls to 0,
the regexp is automatically freed by a call to C<pregfree>.  This should
be set to 1 in each engine's L</comp> routine.

=head1 HISTORY

Originally part of L<perlreguts>.

=head1 AUTHORS

Originally written by Yves Orton, expanded by E<AElig>var ArnfjE<ouml>rE<eth>
Bjarmason.

=head1 LICENSE

Copyright 2006 Yves Orton and 2007 E<AElig>var ArnfjE<ouml>rE<eth> Bjarmason.

This program is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.

=cut
=begin comment

# !!!!!!!   DO NOT EDIT THIS FILE   !!!!!!!
# This file is machine-generated by lib/unicore/mktables from the Unicode
# database, Version 9.0.0.  Any changes made here will be lost!


To change this file, edit lib/unicore/mktables instead.

=end comment

=head1 NAME

perluniprops - Index of Unicode Version 9.0.0 character properties in Perl

=head1 DESCRIPTION

This document provides information about the portion of the Unicode database
that deals with character properties, that is the portion that is defined on
single code points.  (L</Other information in the Unicode data base>
below briefly mentions other data that Unicode provides.)

Perl can provide access to all non-provisional Unicode character properties,
though not all are enabled by default.  The omitted ones are the Unihan
properties (accessible via the CPAN module L<Unicode::Unihan>) and certain
deprecated or Unicode-internal properties.  (An installation may choose to
recompile Perl's tables to change this.  See L<Unicode character
properties that are NOT accepted by Perl>.)

For most purposes, access to Unicode properties from the Perl core is through
regular expression matches, as described in the next section.
For some special purposes, and to access the properties that are not suitable
for regular expression matching, all the Unicode character properties that
Perl handles are accessible via the standard L<Unicode::UCD> module, as
described in the section L</Properties accessible through Unicode::UCD>.

Perl also provides some additional extensions and short-cut synonyms
for Unicode properties.

This document merely lists all available properties and does not attempt to
explain what each property really means.  There is a brief description of each
Perl extension; see L<perlunicode/Other Properties> for more information on
these.  There is some detail about Blocks, Scripts, General_Category,
and Bidi_Class in L<perlunicode>, but to find out about the intricacies of the
official Unicode properties, refer to the Unicode standard.  A good starting
place is L<http://www.unicode.org/reports/tr44/>.

Note that you can define your own properties; see
L<perlunicode/"User-Defined Character Properties">.

=head1 Properties accessible through C<\p{}> and C<\P{}>

The Perl regular expression C<\p{}> and C<\P{}> constructs give access to
most of the Unicode character properties.  The table below shows all these
constructs, both single and compound forms.

B<Compound forms> consist of two components, separated by an equals sign or a
colon.  The first component is the property name, and the second component is
the particular value of the property to match against, for example,
C<\p{Script: Greek}> and C<\p{Script=Greek}> both mean to match characters
whose Script property value is Greek.

B<Single forms>, like C<\p{Greek}>, are mostly Perl-defined shortcuts for
their equivalent compound forms.  The table shows these equivalences.  (In our
example, C<\p{Greek}> is just a shortcut for C<\p{Script=Greek}>.)
There are also a few Perl-defined single forms that are not shortcuts for a
compound form.  One such is C<\p{Word}>.  These are also listed in the table.

In parsing these constructs, Perl always ignores Upper/lower case differences
everywhere within the {braces}.  Thus C<\p{Greek}> means the same thing as
C<\p{greek}>.  But note that changing the case of the C<"p"> or C<"P"> before
the left brace completely changes the meaning of the construct, from "match"
(for C<\p{}>) to "doesn't match" (for C<\P{}>).  Casing in this document is
for improved legibility.

Also, white space, hyphens, and underscores are normally ignored
everywhere between the {braces}, and hence can be freely added or removed
even if the C</x> modifier hasn't been specified on the regular expression.
But in the table below a 'B<T>' at the beginning of an entry
means that tighter (stricter) rules are used for that entry:

=over 4

=over 4

=item Single form (C<\p{name}>) tighter rules:

White space, hyphens, and underscores ARE significant
except for:

=over 4

=item * white space adjacent to a non-word character

=item * underscores separating digits in numbers

=back

That means, for example, that you can freely add or remove white space
adjacent to (but within) the braces without affecting the meaning.

=item Compound form (C<\p{name=value}> or C<\p{name:value}>) tighter rules:

The tighter rules given above for the single form apply to everything to the
right of the colon or equals; the looser rules still apply to everything to
the left.

That means, for example, that you can freely add or remove white space
adjacent to (but within) the braces and the colon or equal sign.

=back

=back

Some properties are considered obsolete by Unicode, but still available.
There are several varieties of obsolescence:

=over 4

=over 4

=item Stabilized

A property may be stabilized.  Such a determination does not indicate
that the property should or should not be used; instead it is a declaration
that the property will not be maintained nor extended for newly encoded
characters.  Such properties are marked with an 'B<S>' in the
table.

=item Deprecated

A property may be deprecated, perhaps because its original intent
has been replaced by another property, or because its specification was
somehow defective.  This means that its use is strongly
discouraged, so much so that a warning will be issued if used, unless the
regular expression is in the scope of a C<S<no warnings 'deprecated'>>
statement.  A 'B<D>' flags each such entry in the table, and
the entry there for the longest, most descriptive version of the property will
give the reason it is deprecated, and perhaps advice.  Perl may issue such a
warning, even for properties that aren't officially deprecated by Unicode,
when there used to be characters or code points that were matched by them, but
no longer.  This is to warn you that your program may not work like it did on
earlier Unicode releases.

A deprecated property may be made unavailable in a future Perl version, so it
is best to move away from them.

A deprecated property may also be stabilized, but this fact is not shown.

=item Obsolete

Properties marked with an 'B<O>' in the table are considered (plain)
obsolete.  Generally this designation is given to properties that Unicode once
used for internal purposes (but not any longer).

=item Discouraged

This is not actually a Unicode-specified obsolescence, but applies to certain
Perl extensions that are present for backwards compatibility, but are
discouraged from being used.  These are not obsolete, but their meanings are
not stable.  Future Unicode versions could force any of these extensions to be
removed without warning, replaced by another property with the same name that
means something different.  An 'B<X>' flags each such entry in the
table.  Use the equivalent shown instead.


In particular, matches in the Block property have single forms
defined by Perl that begin with C<"In_">, C<"Is_">, or even with no prefix at
all.  Like all B<DISCOURAGED> forms, these are not stable.  For example,
C<\p{Block=Deseret}> can currently be written as C<\p{In_Deseret}>,
C<\p{Is_Deseret}>, or C<\p{Deseret}>.  But, a new Unicode version may
come along that would force Perl to change the meaning of one or more of
these, and your program would no longer be correct.  Currently there are no
such conflicts with the form that begins C<"In_">, but there are many with the
other two shortcuts, and Unicode continues to define new properties that begin
with C<"In">, so it's quite possible that a conflict will occur in the future.
The compound form is guaranteed to not become obsolete, and its meaning is
clearer anyway.  See L<perlunicode/"Blocks"> for more information about this.


=back

=back

The table below has two columns.  The left column contains the C<\p{}>
constructs to look up, possibly preceded by the flags mentioned above; and
the right column contains information about them, like a description, or
synonyms.  The table shows both the single and compound forms for each
property that has them.  If the left column is a short name for a property,
the right column will give its longer, more descriptive name; and if the left
column is the longest name, the right column will show any equivalent shortest
name, in both single and compound forms if applicable.

If braces are not needed to specify a property (e.g., C<\pL>), the left
column contains both forms, with and without braces.

The right column will also caution you if a property means something different
than what might normally be expected.

All single forms are Perl extensions; a few compound forms are as well, and
are noted as such.

Numbers in (parentheses) indicate the total number of Unicode code points
matched by the property.  For emphasis, those properties that match no code
points at all are listed as well in a separate section following the table.

Most properties match the same code points regardless of whether C<"/i">
case-insensitive matching is specified or not.  But a few properties are
affected.  These are shown with the notation S<C<(/i= I<other_property>)>>
in the second column.  Under case-insensitive matching they match the
same code points as the property I<other_property>.

There is no description given for most non-Perl defined properties (See
L<http://www.unicode.org/reports/tr44/> for that).

For compactness, 'B<*>' is used as a wildcard instead of showing all possible
combinations.  For example, entries like:

 \p{Gc: *}                                  \p{General_Category: *}

mean that 'Gc' is a synonym for 'General_Category', and anything that is valid
for the latter is also valid for the former.  Similarly,

 \p{Is_*}                                   \p{*}

means that if and only if, for example, C<\p{Foo}> exists, then
C<\p{Is_Foo}> and C<\p{IsFoo}> are also valid and all mean the same thing.
And similarly, C<\p{Foo=Bar}> means the same as C<\p{Is_Foo=Bar}> and
C<\p{IsFoo=Bar}>.  "*" here is restricted to something not beginning with an
underscore.

Also, in binary properties, 'Yes', 'T', and 'True' are all synonyms for 'Y'.
And 'No', 'F', and 'False' are all synonyms for 'N'.  The table shows 'Y*' and
'N*' to indicate this, and doesn't have separate entries for the other
possibilities.  Note that not all properties which have values 'Yes' and 'No'
are binary, and they have all their values spelled out without using this wild
card, and a C<NOT> clause in their description that highlights their not being
binary.  These also require the compound form to match them, whereas true
binary properties have both single and compound forms available.

Note that all non-essential underscores are removed in the display of the
short names below.

B<Legend summary:>

=over 4

=item Z<>B<*> is a wild-card

=item B<(\d+)> in the info column gives the number of Unicode code points matched
by this property.

=item B<D> means this is deprecated.

=item B<O> means this is obsolete.

=item B<S> means this is stabilized.

=item B<T> means tighter (stricter) name matching applies.

=item B<X> means use of this form is discouraged, and may not be
stable.

=back

       NAME                           INFO

   \p{Adlam}               \p{Script_Extensions=Adlam} (Short:
                             \p{Adlm}; NOT \p{Block=Adlam}) (88)
   \p{Adlm}                \p{Adlam} (= \p{Script_Extensions=Adlam})
                             (NOT \p{Block=Adlam}) (88)
 X \p{Aegean_Numbers}      \p{Block=Aegean_Numbers} (64)
 T \p{Age: 1.1}            \p{Age=V1_1} (33_979)
 T \p{Age: 2.0}            \p{Age=V2_0} (144_521)
 T \p{Age: 2.1}            \p{Age=V2_1} (2)
 T \p{Age: 3.0}            \p{Age=V3_0} (10_307)
 T \p{Age: 3.1}            \p{Age=V3_1} (44_978)
 T \p{Age: 3.2}            \p{Age=V3_2} (1016)
 T \p{Age: 4.0}            \p{Age=V4_0} (1226)
 T \p{Age: 4.1}            \p{Age=V4_1} (1273)
 T \p{Age: 5.0}            \p{Age=V5_0} (1369)
 T \p{Age: 5.1}            \p{Age=V5_1} (1624)
 T \p{Age: 5.2}            \p{Age=V5_2} (6648)
 T \p{Age: 6.0}            \p{Age=V6_0} (2088)
 T \p{Age: 6.1}            \p{Age=V6_1} (732)
 T \p{Age: 6.2}            \p{Age=V6_2} (1)
 T \p{Age: 6.3}            \p{Age=V6_3} (5)
 T \p{Age: 7.0}            \p{Age=V7_0} (2834)
 T \p{Age: 8.0}            \p{Age=V8_0} (7716)
 T \p{Age: 9.0}            \p{Age=V9_0} (7500)
   \p{Age: NA}             \p{Age=Unassigned} (846_293 plus all
                             above-Unicode code points)
   \p{Age: Unassigned}     Code point's usage has not been assigned
                             in any Unicode release thus far. (Short:
                             \p{Age=NA}) (846_293 plus all above-
                             Unicode code points)
   \p{Age: V1_1}           Code point's usage introduced in version
                             1.1 (33_979)
   \p{Age: V2_0}           Code point's usage was introduced in
                             version 2.0; See also Property
                             'Present_In' (144_521)
   \p{Age: V2_1}           Code point's usage was introduced in
                             version 2.1; See also Property
                             'Present_In' (2)
   \p{Age: V3_0}           Code point's usage was introduced in
                             version 3.0; See also Property
                             'Present_In' (10_307)
   \p{Age: V3_1}           Code point's usage was introduced in
                             version 3.1; See also Property
                             'Present_In' (44_978)
   \p{Age: V3_2}           Code point's usage was introduced in
                             version 3.2; See also Property
                             'Present_In' (1016)
   \p{Age: V4_0}           Code point's usage was introduced in
                             version 4.0; See also Property
                             'Present_In' (1226)
   \p{Age: V4_1}           Code point's usage was introduced in
                             version 4.1; See also Property
                             'Present_In' (1273)
   \p{Age: V5_0}           Code point's usage was introduced in
                             version 5.0; See also Property
                             'Present_In' (1369)
   \p{Age: V5_1}           Code point's usage was introduced in
                             version 5.1; See also Property
                             'Present_In' (1624)
   \p{Age: V5_2}           Code point's usage was introduced in
                             version 5.2; See also Property
                             'Present_In' (6648)
   \p{Age: V6_0}           Code point's usage was introduced in
                             version 6.0; See also Property
                             'Present_In' (2088)
   \p{Age: V6_1}           Code point's usage was introduced in
                             version 6.1; See also Property
                             'Present_In' (732)
   \p{Age: V6_2}           Code point's usage was introduced in
                             version 6.2; See also Property
                             'Present_In' (1)
   \p{Age: V6_3}           Code point's usage was introduced in
                             version 6.3; See also Property
                             'Present_In' (5)
   \p{Age: V7_0}           Code point's usage was introduced in
                             version 7.0; See also Property
                             'Present_In' (2834)
   \p{Age: V8_0}           Code point's usage was introduced in
                             version 8.0; See also Property
                             'Present_In' (7716)
   \p{Age: V9_0}           Code point's usage was introduced in
                             version 9.0; See also Property
                             'Present_In' (7500)
   \p{Aghb}                \p{Caucasian_Albanian} (=
                             \p{Script_Extensions=
                             Caucasian_Albanian}) (NOT \p{Block=
                             Caucasian_Albanian}) (53)
   \p{AHex}                \p{PosixXDigit} (= \p{ASCII_Hex_Digit=Y})
                             (22)
   \p{AHex: *}             \p{ASCII_Hex_Digit: *}
   \p{Ahom}                \p{Script_Extensions=Ahom} (NOT \p{Block=
                             Ahom}) (57)
 X \p{Alchemical}          \p{Alchemical_Symbols} (= \p{Block=
                             Alchemical_Symbols}) (128)
 X \p{Alchemical_Symbols}  \p{Block=Alchemical_Symbols} (Short:
                             \p{InAlchemical}) (128)
   \p{All}                 All code points, including those above
                             Unicode.  Same as qr/./s (1_114_112 plus
                             all above-Unicode code points)
   \p{Alnum}               \p{XPosixAlnum} (118_820)
   \p{Alpha}               \p{XPosixAlpha} (= \p{Alphabetic=Y})
                             (118_240)
   \p{Alpha: *}            \p{Alphabetic: *}
   \p{Alphabetic}          \p{XPosixAlpha} (= \p{Alphabetic=Y})
                             (118_240)
   \p{Alphabetic: N*}      (Short: \p{Alpha=N}, \P{Alpha}) (995_872
                             plus all above-Unicode code points)
   \p{Alphabetic: Y*}      (Short: \p{Alpha=Y}, \p{Alpha}) (118_240)
 X \p{Alphabetic_PF}       \p{Alphabetic_Presentation_Forms} (=
                             \p{Block=Alphabetic_Presentation_Forms})
                             (80)
 X \p{Alphabetic_Presentation_Forms} \p{Block=
                             Alphabetic_Presentation_Forms} (Short:
                             \p{InAlphabeticPF}) (80)
   \p{Anatolian_Hieroglyphs} \p{Script_Extensions=
                             Anatolian_Hieroglyphs} (Short: \p{Hluw};
                             NOT \p{Block=Anatolian_Hieroglyphs})
                             (583)
 X \p{Ancient_Greek_Music} \p{Ancient_Greek_Musical_Notation} (=
                             \p{Block=
                             Ancient_Greek_Musical_Notation}) (80)
 X \p{Ancient_Greek_Musical_Notation} \p{Block=
                             Ancient_Greek_Musical_Notation} (Short:
                             \p{InAncientGreekMusic}) (80)
 X \p{Ancient_Greek_Numbers} \p{Block=Ancient_Greek_Numbers} (80)
 X \p{Ancient_Symbols}     \p{Block=Ancient_Symbols} (64)
   \p{Any}                 All Unicode code points: [\x{0000}-
                             \x{10FFFF}] (1_114_112)
   \p{Arab}                \p{Arabic} (= \p{Script_Extensions=
                             Arabic}) (NOT \p{Block=Arabic}) (1323)
   \p{Arabic}              \p{Script_Extensions=Arabic} (Short:
                             \p{Arab}; NOT \p{Block=Arabic}) (1323)
 X \p{Arabic_Ext_A}        \p{Arabic_Extended_A} (= \p{Block=
                             Arabic_Extended_A}) (96)
 X \p{Arabic_Extended_A}   \p{Block=Arabic_Extended_A} (Short:
                             \p{InArabicExtA}) (96)
 X \p{Arabic_Math}         \p{Arabic_Mathematical_Alphabetic_Symbols}
                             (= \p{Block=
                             Arabic_Mathematical_Alphabetic_Symbols})
                             (256)
 X \p{Arabic_Mathematical_Alphabetic_Symbols} \p{Block=
                             Arabic_Mathematical_Alphabetic_Symbols}
                             (Short: \p{InArabicMath}) (256)
 X \p{Arabic_PF_A}         \p{Arabic_Presentation_Forms_A} (=
                             \p{Block=Arabic_Presentation_Forms_A})
                             (688)
 X \p{Arabic_PF_B}         \p{Arabic_Presentation_Forms_B} (=
                             \p{Block=Arabic_Presentation_Forms_B})
                             (144)
 X \p{Arabic_Presentation_Forms_A} \p{Block=
                             Arabic_Presentation_Forms_A} (Short:
                             \p{InArabicPFA}) (688)
 X \p{Arabic_Presentation_Forms_B} \p{Block=
                             Arabic_Presentation_Forms_B} (Short:
                             \p{InArabicPFB}) (144)
 X \p{Arabic_Sup}          \p{Arabic_Supplement} (= \p{Block=
                             Arabic_Supplement}) (48)
 X \p{Arabic_Supplement}   \p{Block=Arabic_Supplement} (Short:
                             \p{InArabicSup}) (48)
   \p{Armenian}            \p{Script_Extensions=Armenian} (Short:
                             \p{Armn}; NOT \p{Block=Armenian}) (94)
   \p{Armi}                \p{Imperial_Aramaic} (=
                             \p{Script_Extensions=Imperial_Aramaic})
                             (NOT \p{Block=Imperial_Aramaic}) (31)
   \p{Armn}                \p{Armenian} (= \p{Script_Extensions=
                             Armenian}) (NOT \p{Block=Armenian}) (94)
 X \p{Arrows}              \p{Block=Arrows} (112)
   \p{ASCII}               \p{Block=Basic_Latin} [[:ASCII:]] (128)
   \p{ASCII_Hex_Digit}     \p{PosixXDigit} (= \p{ASCII_Hex_Digit=Y})
                             (22)
   \p{ASCII_Hex_Digit: N*} (Short: \p{AHex=N}, \P{AHex}) (1_114_090
                             plus all above-Unicode code points)
   \p{ASCII_Hex_Digit: Y*} (Short: \p{AHex=Y}, \p{AHex}) (22)
   \p{Assigned}            All assigned code points (267_753)
   \p{Avestan}             \p{Script_Extensions=Avestan} (Short:
                             \p{Avst}; NOT \p{Block=Avestan}) (61)
   \p{Avst}                \p{Avestan} (= \p{Script_Extensions=
                             Avestan}) (NOT \p{Block=Avestan}) (61)
   \p{Bali}                \p{Balinese} (= \p{Script_Extensions=
                             Balinese}) (NOT \p{Block=Balinese}) (121)
   \p{Balinese}            \p{Script_Extensions=Balinese} (Short:
                             \p{Bali}; NOT \p{Block=Balinese}) (121)
   \p{Bamu}                \p{Bamum} (= \p{Script_Extensions=Bamum})
                             (NOT \p{Block=Bamum}) (657)
   \p{Bamum}               \p{Script_Extensions=Bamum} (Short:
                             \p{Bamu}; NOT \p{Block=Bamum}) (657)
 X \p{Bamum_Sup}           \p{Bamum_Supplement} (= \p{Block=
                             Bamum_Supplement}) (576)
 X \p{Bamum_Supplement}    \p{Block=Bamum_Supplement} (Short:
                             \p{InBamumSup}) (576)
 X \p{Basic_Latin}         \p{ASCII} (= \p{Block=Basic_Latin}) (128)
   \p{Bass}                \p{Bassa_Vah} (= \p{Script_Extensions=
                             Bassa_Vah}) (NOT \p{Block=Bassa_Vah})
                             (36)
   \p{Bassa_Vah}           \p{Script_Extensions=Bassa_Vah} (Short:
                             \p{Bass}; NOT \p{Block=Bassa_Vah}) (36)
   \p{Batak}               \p{Script_Extensions=Batak} (Short:
                             \p{Batk}; NOT \p{Block=Batak}) (56)
   \p{Batk}                \p{Batak} (= \p{Script_Extensions=Batak})
                             (NOT \p{Block=Batak}) (56)
   \p{Bc: *}               \p{Bidi_Class: *}
   \p{Beng}                \p{Bengali} (= \p{Script_Extensions=
                             Bengali}) (NOT \p{Block=Bengali}) (98)
   \p{Bengali}             \p{Script_Extensions=Bengali} (Short:
                             \p{Beng}; NOT \p{Block=Bengali}) (98)
   \p{Bhaiksuki}           \p{Script_Extensions=Bhaiksuki} (Short:
                             \p{Bhks}; NOT \p{Block=Bhaiksuki}) (97)
   \p{Bhks}                \p{Bhaiksuki} (= \p{Script_Extensions=
                             Bhaiksuki}) (NOT \p{Block=Bhaiksuki})
                             (97)
   \p{Bidi_C}              \p{Bidi_Control} (= \p{Bidi_Control=Y})
                             (12)
   \p{Bidi_C: *}           \p{Bidi_Control: *}
   \p{Bidi_Class: AL}      \p{Bidi_Class=Arabic_Letter} (1420)
   \p{Bidi_Class: AN}      \p{Bidi_Class=Arabic_Number} (51)
   \p{Bidi_Class: Arabic_Letter} (Short: \p{Bc=AL}) (1420)
   \p{Bidi_Class: Arabic_Number} (Short: \p{Bc=AN}) (51)
   \p{Bidi_Class: B}       \p{Bidi_Class=Paragraph_Separator} (7)
   \p{Bidi_Class: BN}      \p{Bidi_Class=Boundary_Neutral} (4016)
   \p{Bidi_Class: Boundary_Neutral} (Short: \p{Bc=BN}) (4016)
   \p{Bidi_Class: Common_Separator} (Short: \p{Bc=CS}) (15)
   \p{Bidi_Class: CS}      \p{Bidi_Class=Common_Separator} (15)
   \p{Bidi_Class: EN}      \p{Bidi_Class=European_Number} (158)
   \p{Bidi_Class: ES}      \p{Bidi_Class=European_Separator} (12)
   \p{Bidi_Class: ET}      \p{Bidi_Class=European_Terminator} (87)
   \p{Bidi_Class: European_Number} (Short: \p{Bc=EN}) (158)
   \p{Bidi_Class: European_Separator} (Short: \p{Bc=ES}) (12)
   \p{Bidi_Class: European_Terminator} (Short: \p{Bc=ET}) (87)
   \p{Bidi_Class: First_Strong_Isolate} (Short: \p{Bc=FSI}) (1)
   \p{Bidi_Class: FSI}     \p{Bidi_Class=First_Strong_Isolate} (1)
   \p{Bidi_Class: L}       \p{Bidi_Class=Left_To_Right} (1_097_280
                             plus all above-Unicode code points)
   \p{Bidi_Class: Left_To_Right} (Short: \p{Bc=L}) (1_097_280 plus
                             all above-Unicode code points)
   \p{Bidi_Class: Left_To_Right_Embedding} (Short: \p{Bc=LRE}) (1)
   \p{Bidi_Class: Left_To_Right_Isolate} (Short: \p{Bc=LRI}) (1)
   \p{Bidi_Class: Left_To_Right_Override} (Short: \p{Bc=LRO}) (1)
   \p{Bidi_Class: LRE}     \p{Bidi_Class=Left_To_Right_Embedding} (1)
   \p{Bidi_Class: LRI}     \p{Bidi_Class=Left_To_Right_Isolate} (1)
   \p{Bidi_Class: LRO}     \p{Bidi_Class=Left_To_Right_Override} (1)
   \p{Bidi_Class: Nonspacing_Mark} (Short: \p{Bc=NSM}) (1700)
   \p{Bidi_Class: NSM}     \p{Bidi_Class=Nonspacing_Mark} (1700)
   \p{Bidi_Class: ON}      \p{Bidi_Class=Other_Neutral} (5267)
   \p{Bidi_Class: Other_Neutral} (Short: \p{Bc=ON}) (5267)
   \p{Bidi_Class: Paragraph_Separator} (Short: \p{Bc=B}) (7)
   \p{Bidi_Class: PDF}     \p{Bidi_Class=Pop_Directional_Format} (1)
   \p{Bidi_Class: PDI}     \p{Bidi_Class=Pop_Directional_Isolate} (1)
   \p{Bidi_Class: Pop_Directional_Format} (Short: \p{Bc=PDF}) (1)
   \p{Bidi_Class: Pop_Directional_Isolate} (Short: \p{Bc=PDI}) (1)
   \p{Bidi_Class: R}       \p{Bidi_Class=Right_To_Left} (4070)
   \p{Bidi_Class: Right_To_Left} (Short: \p{Bc=R}) (4070)
   \p{Bidi_Class: Right_To_Left_Embedding} (Short: \p{Bc=RLE}) (1)
   \p{Bidi_Class: Right_To_Left_Isolate} (Short: \p{Bc=RLI}) (1)
   \p{Bidi_Class: Right_To_Left_Override} (Short: \p{Bc=RLO}) (1)
   \p{Bidi_Class: RLE}     \p{Bidi_Class=Right_To_Left_Embedding} (1)
   \p{Bidi_Class: RLI}     \p{Bidi_Class=Right_To_Left_Isolate} (1)
   \p{Bidi_Class: RLO}     \p{Bidi_Class=Right_To_Left_Override} (1)
   \p{Bidi_Class: S}       \p{Bidi_Class=Segment_Separator} (3)
   \p{Bidi_Class: Segment_Separator} (Short: \p{Bc=S}) (3)
   \p{Bidi_Class: White_Space} (Short: \p{Bc=WS}) (17)
   \p{Bidi_Class: WS}      \p{Bidi_Class=White_Space} (17)
   \p{Bidi_Control}        \p{Bidi_Control=Y} (Short: \p{BidiC}) (12)
   \p{Bidi_Control: N*}    (Short: \p{BidiC=N}, \P{BidiC}) (1_114_100
                             plus all above-Unicode code points)
   \p{Bidi_Control: Y*}    (Short: \p{BidiC=Y}, \p{BidiC}) (12)
   \p{Bidi_M}              \p{Bidi_Mirrored} (= \p{Bidi_Mirrored=Y})
                             (545)
   \p{Bidi_M: *}           \p{Bidi_Mirrored: *}
   \p{Bidi_Mirrored}       \p{Bidi_Mirrored=Y} (Short: \p{BidiM})
                             (545)
   \p{Bidi_Mirrored: N*}   (Short: \p{BidiM=N}, \P{BidiM}) (1_113_567
                             plus all above-Unicode code points)
   \p{Bidi_Mirrored: Y*}   (Short: \p{BidiM=Y}, \p{BidiM}) (545)
   \p{Bidi_Paired_Bracket_Type: C} \p{Bidi_Paired_Bracket_Type=Close}
                             (60)
   \p{Bidi_Paired_Bracket_Type: Close} (Short: \p{Bpt=C}) (60)
   \p{Bidi_Paired_Bracket_Type: N} \p{Bidi_Paired_Bracket_Type=None}
                             (1_113_992 plus all above-Unicode code
                             points)
   \p{Bidi_Paired_Bracket_Type: None} (Short: \p{Bpt=N}) (1_113_992
                             plus all above-Unicode code points)
   \p{Bidi_Paired_Bracket_Type: O} \p{Bidi_Paired_Bracket_Type=Open}
                             (60)
   \p{Bidi_Paired_Bracket_Type: Open} (Short: \p{Bpt=O}) (60)
   \p{Blank}               \p{XPosixBlank} (18)
   \p{Blk: *}              \p{Block: *}
   \p{Block: Adlam}        (NOT \p{Adlam} NOR \p{Is_Adlam}) (96)
   \p{Block: Aegean_Numbers} (64)
   \p{Block: Ahom}         (NOT \p{Ahom} NOR \p{Is_Ahom}) (64)
   \p{Block: Alchemical}   \p{Block=Alchemical_Symbols} (128)
   \p{Block: Alchemical_Symbols} (Short: \p{Blk=Alchemical}) (128)
   \p{Block: Alphabetic_PF} \p{Block=Alphabetic_Presentation_Forms}
                             (80)
   \p{Block: Alphabetic_Presentation_Forms} (Short: \p{Blk=
                             AlphabeticPF}) (80)
   \p{Block: Anatolian_Hieroglyphs} (NOT \p{Anatolian_Hieroglyphs}
                             NOR \p{Is_Anatolian_Hieroglyphs}) (640)
   \p{Block: Ancient_Greek_Music} \p{Block=
                             Ancient_Greek_Musical_Notation} (80)
   \p{Block: Ancient_Greek_Musical_Notation} (Short: \p{Blk=
                             AncientGreekMusic}) (80)
   \p{Block: Ancient_Greek_Numbers} (80)
   \p{Block: Ancient_Symbols} (64)
   \p{Block: Arabic}       (NOT \p{Arabic} NOR \p{Is_Arabic}) (256)
   \p{Block: Arabic_Ext_A} \p{Block=Arabic_Extended_A} (96)
   \p{Block: Arabic_Extended_A} (Short: \p{Blk=ArabicExtA}) (96)
   \p{Block: Arabic_Math}  \p{Block=
                             Arabic_Mathematical_Alphabetic_Symbols}
                             (256)
   \p{Block: Arabic_Mathematical_Alphabetic_Symbols} (Short: \p{Blk=
                             ArabicMath}) (256)
   \p{Block: Arabic_PF_A}  \p{Block=Arabic_Presentation_Forms_A} (688)
   \p{Block: Arabic_PF_B}  \p{Block=Arabic_Presentation_Forms_B} (144)
   \p{Block: Arabic_Presentation_Forms_A} (Short: \p{Blk=ArabicPFA})
                             (688)
   \p{Block: Arabic_Presentation_Forms_B} (Short: \p{Blk=ArabicPFB})
                             (144)
   \p{Block: Arabic_Sup}   \p{Block=Arabic_Supplement} (48)
   \p{Block: Arabic_Supplement} (Short: \p{Blk=ArabicSup}) (48)
   \p{Block: Armenian}     (NOT \p{Armenian} NOR \p{Is_Armenian}) (96)
   \p{Block: Arrows}       (112)
   \p{Block: ASCII}        \p{Block=Basic_Latin} (128)
   \p{Block: Avestan}      (NOT \p{Avestan} NOR \p{Is_Avestan}) (64)
   \p{Block: Balinese}     (NOT \p{Balinese} NOR \p{Is_Balinese})
                             (128)
   \p{Block: Bamum}        (NOT \p{Bamum} NOR \p{Is_Bamum}) (96)
   \p{Block: Bamum_Sup}    \p{Block=Bamum_Supplement} (576)
   \p{Block: Bamum_Supplement} (Short: \p{Blk=BamumSup}) (576)
   \p{Block: Basic_Latin}  (Short: \p{Blk=ASCII}) (128)
   \p{Block: Bassa_Vah}    (NOT \p{Bassa_Vah} NOR \p{Is_Bassa_Vah})
                             (48)
   \p{Block: Batak}        (NOT \p{Batak} NOR \p{Is_Batak}) (64)
   \p{Block: Bengali}      (NOT \p{Bengali} NOR \p{Is_Bengali}) (128)
   \p{Block: Bhaiksuki}    (NOT \p{Bhaiksuki} NOR \p{Is_Bhaiksuki})
                             (112)
   \p{Block: Block_Elements} (32)
   \p{Block: Bopomofo}     (NOT \p{Bopomofo} NOR \p{Is_Bopomofo}) (48)
   \p{Block: Bopomofo_Ext} \p{Block=Bopomofo_Extended} (32)
   \p{Block: Bopomofo_Extended} (Short: \p{Blk=BopomofoExt}) (32)
   \p{Block: Box_Drawing}  (128)
   \p{Block: Brahmi}       (NOT \p{Brahmi} NOR \p{Is_Brahmi}) (128)
   \p{Block: Braille}      \p{Block=Braille_Patterns} (256)
   \p{Block: Braille_Patterns} (Short: \p{Blk=Braille}) (256)
   \p{Block: Buginese}     (NOT \p{Buginese} NOR \p{Is_Buginese}) (32)
   \p{Block: Buhid}        (NOT \p{Buhid} NOR \p{Is_Buhid}) (32)
   \p{Block: Byzantine_Music} \p{Block=Byzantine_Musical_Symbols}
                             (256)
   \p{Block: Byzantine_Musical_Symbols} (Short: \p{Blk=
                             ByzantineMusic}) (256)
   \p{Block: Canadian_Syllabics} \p{Block=
                             Unified_Canadian_Aboriginal_Syllabics}
                             (640)
   \p{Block: Carian}       (NOT \p{Carian} NOR \p{Is_Carian}) (64)
   \p{Block: Caucasian_Albanian} (NOT \p{Caucasian_Albanian} NOR
                             \p{Is_Caucasian_Albanian}) (64)
   \p{Block: Chakma}       (NOT \p{Chakma} NOR \p{Is_Chakma}) (80)
   \p{Block: Cham}         (NOT \p{Cham} NOR \p{Is_Cham}) (96)
   \p{Block: Cherokee}     (NOT \p{Cherokee} NOR \p{Is_Cherokee}) (96)
   \p{Block: Cherokee_Sup} \p{Block=Cherokee_Supplement} (80)
   \p{Block: Cherokee_Supplement} (Short: \p{Blk=CherokeeSup}) (80)
   \p{Block: CJK}          \p{Block=CJK_Unified_Ideographs} (20_992)
   \p{Block: CJK_Compat}   \p{Block=CJK_Compatibility} (256)
   \p{Block: CJK_Compat_Forms} \p{Block=CJK_Compatibility_Forms} (32)
   \p{Block: CJK_Compat_Ideographs} \p{Block=
                             CJK_Compatibility_Ideographs} (512)
   \p{Block: CJK_Compat_Ideographs_Sup} \p{Block=
                             CJK_Compatibility_Ideographs_Supplement}
                             (544)
   \p{Block: CJK_Compatibility} (Short: \p{Blk=CJKCompat}) (256)
   \p{Block: CJK_Compatibility_Forms} (Short: \p{Blk=CJKCompatForms})
                             (32)
   \p{Block: CJK_Compatibility_Ideographs} (Short: \p{Blk=
                             CJKCompatIdeographs}) (512)
   \p{Block: CJK_Compatibility_Ideographs_Supplement} (Short: \p{Blk=
                             CJKCompatIdeographsSup}) (544)
   \p{Block: CJK_Ext_A}    \p{Block=
                             CJK_Unified_Ideographs_Extension_A}
                             (6592)
   \p{Block: CJK_Ext_B}    \p{Block=
                             CJK_Unified_Ideographs_Extension_B}
                             (42_720)
   \p{Block: CJK_Ext_C}    \p{Block=
                             CJK_Unified_Ideographs_Extension_C}
                             (4160)
   \p{Block: CJK_Ext_D}    \p{Block=
                             CJK_Unified_Ideographs_Extension_D} (224)
   \p{Block: CJK_Ext_E}    \p{Block=
                             CJK_Unified_Ideographs_Extension_E}
                             (5776)
   \p{Block: CJK_Radicals_Sup} \p{Block=CJK_Radicals_Supplement} (128)
   \p{Block: CJK_Radicals_Supplement} (Short: \p{Blk=CJKRadicalsSup})
                             (128)
   \p{Block: CJK_Strokes}  (48)
   \p{Block: CJK_Symbols}  \p{Block=CJK_Symbols_And_Punctuation} (64)
   \p{Block: CJK_Symbols_And_Punctuation} (Short: \p{Blk=CJKSymbols})
                             (64)
   \p{Block: CJK_Unified_Ideographs} (Short: \p{Blk=CJK}) (20_992)
   \p{Block: CJK_Unified_Ideographs_Extension_A} (Short: \p{Blk=
                             CJKExtA}) (6592)
   \p{Block: CJK_Unified_Ideographs_Extension_B} (Short: \p{Blk=
                             CJKExtB}) (42_720)
   \p{Block: CJK_Unified_Ideographs_Extension_C} (Short: \p{Blk=
                             CJKExtC}) (4160)
   \p{Block: CJK_Unified_Ideographs_Extension_D} (Short: \p{Blk=
                             CJKExtD}) (224)
   \p{Block: CJK_Unified_Ideographs_Extension_E} (Short: \p{Blk=
                             CJKExtE}) (5776)
   \p{Block: Combining_Diacritical_Marks} (Short: \p{Blk=
                             Diacriticals}) (112)
   \p{Block: Combining_Diacritical_Marks_Extended} (Short: \p{Blk=
                             DiacriticalsExt}) (80)
   \p{Block: Combining_Diacritical_Marks_For_Symbols} (Short: \p{Blk=
                             DiacriticalsForSymbols}) (48)
   \p{Block: Combining_Diacritical_Marks_Supplement} (Short: \p{Blk=
                             DiacriticalsSup}) (64)
   \p{Block: Combining_Half_Marks} (Short: \p{Blk=HalfMarks}) (16)
   \p{Block: Combining_Marks_For_Symbols} \p{Block=
                             Combining_Diacritical_Marks_For_Symbols}
                             (48)
   \p{Block: Common_Indic_Number_Forms} (Short: \p{Blk=
                             IndicNumberForms}) (16)
   \p{Block: Compat_Jamo}  \p{Block=Hangul_Compatibility_Jamo} (96)
   \p{Block: Control_Pictures} (64)
   \p{Block: Coptic}       (NOT \p{Coptic} NOR \p{Is_Coptic}) (128)
   \p{Block: Coptic_Epact_Numbers} (32)
   \p{Block: Counting_Rod} \p{Block=Counting_Rod_Numerals} (32)
   \p{Block: Counting_Rod_Numerals} (Short: \p{Blk=CountingRod}) (32)
   \p{Block: Cuneiform}    (NOT \p{Cuneiform} NOR \p{Is_Cuneiform})
                             (1024)
   \p{Block: Cuneiform_Numbers} \p{Block=
                             Cuneiform_Numbers_And_Punctuation} (128)
   \p{Block: Cuneiform_Numbers_And_Punctuation} (Short: \p{Blk=
                             CuneiformNumbers}) (128)
   \p{Block: Currency_Symbols} (48)
   \p{Block: Cypriot_Syllabary} (64)
   \p{Block: Cyrillic}     (NOT \p{Cyrillic} NOR \p{Is_Cyrillic})
                             (256)
   \p{Block: Cyrillic_Ext_A} \p{Block=Cyrillic_Extended_A} (32)
   \p{Block: Cyrillic_Ext_B} \p{Block=Cyrillic_Extended_B} (96)
   \p{Block: Cyrillic_Ext_C} \p{Block=Cyrillic_Extended_C} (16)
   \p{Block: Cyrillic_Extended_A} (Short: \p{Blk=CyrillicExtA}) (32)
   \p{Block: Cyrillic_Extended_B} (Short: \p{Blk=CyrillicExtB}) (96)
   \p{Block: Cyrillic_Extended_C} (Short: \p{Blk=CyrillicExtC}) (16)
   \p{Block: Cyrillic_Sup} \p{Block=Cyrillic_Supplement} (48)
   \p{Block: Cyrillic_Supplement} (Short: \p{Blk=CyrillicSup}) (48)
   \p{Block: Cyrillic_Supplementary} \p{Block=Cyrillic_Supplement}
                             (48)
   \p{Block: Deseret}      (80)
   \p{Block: Devanagari}   (NOT \p{Devanagari} NOR \p{Is_Devanagari})
                             (128)
   \p{Block: Devanagari_Ext} \p{Block=Devanagari_Extended} (32)
   \p{Block: Devanagari_Extended} (Short: \p{Blk=DevanagariExt}) (32)
   \p{Block: Diacriticals} \p{Block=Combining_Diacritical_Marks} (112)
   \p{Block: Diacriticals_Ext} \p{Block=
                             Combining_Diacritical_Marks_Extended}
                             (80)
   \p{Block: Diacriticals_For_Symbols} \p{Block=
                             Combining_Diacritical_Marks_For_Symbols}
                             (48)
   \p{Block: Diacriticals_Sup} \p{Block=
                             Combining_Diacritical_Marks_Supplement}
                             (64)
   \p{Block: Dingbats}     (192)
   \p{Block: Domino}       \p{Block=Domino_Tiles} (112)
   \p{Block: Domino_Tiles} (Short: \p{Blk=Domino}) (112)
   \p{Block: Duployan}     (NOT \p{Duployan} NOR \p{Is_Duployan})
                             (160)
   \p{Block: Early_Dynastic_Cuneiform} (208)
   \p{Block: Egyptian_Hieroglyphs} (NOT \p{Egyptian_Hieroglyphs} NOR
                             \p{Is_Egyptian_Hieroglyphs}) (1072)
   \p{Block: Elbasan}      (NOT \p{Elbasan} NOR \p{Is_Elbasan}) (48)
   \p{Block: Emoticons}    (80)
   \p{Block: Enclosed_Alphanum} \p{Block=Enclosed_Alphanumerics} (160)
   \p{Block: Enclosed_Alphanum_Sup} \p{Block=
                             Enclosed_Alphanumeric_Supplement} (256)
   \p{Block: Enclosed_Alphanumeric_Supplement} (Short: \p{Blk=
                             EnclosedAlphanumSup}) (256)
   \p{Block: Enclosed_Alphanumerics} (Short: \p{Blk=
                             EnclosedAlphanum}) (160)
   \p{Block: Enclosed_CJK} \p{Block=Enclosed_CJK_Letters_And_Months}
                             (256)
   \p{Block: Enclosed_CJK_Letters_And_Months} (Short: \p{Blk=
                             EnclosedCJK}) (256)
   \p{Block: Enclosed_Ideographic_Sup} \p{Block=
                             Enclosed_Ideographic_Supplement} (256)
   \p{Block: Enclosed_Ideographic_Supplement} (Short: \p{Blk=
                             EnclosedIdeographicSup}) (256)
   \p{Block: Ethiopic}     (NOT \p{Ethiopic} NOR \p{Is_Ethiopic})
                             (384)
   \p{Block: Ethiopic_Ext} \p{Block=Ethiopic_Extended} (96)
   \p{Block: Ethiopic_Ext_A} \p{Block=Ethiopic_Extended_A} (48)
   \p{Block: Ethiopic_Extended} (Short: \p{Blk=EthiopicExt}) (96)
   \p{Block: Ethiopic_Extended_A} (Short: \p{Blk=EthiopicExtA}) (48)
   \p{Block: Ethiopic_Sup} \p{Block=Ethiopic_Supplement} (32)
   \p{Block: Ethiopic_Supplement} (Short: \p{Blk=EthiopicSup}) (32)
   \p{Block: General_Punctuation} (Short: \p{Blk=Punctuation}; NOT
                             \p{Punct} NOR \p{Is_Punctuation}) (112)
   \p{Block: Geometric_Shapes} (96)
   \p{Block: Geometric_Shapes_Ext} \p{Block=
                             Geometric_Shapes_Extended} (128)
   \p{Block: Geometric_Shapes_Extended} (Short: \p{Blk=
                             GeometricShapesExt}) (128)
   \p{Block: Georgian}     (NOT \p{Georgian} NOR \p{Is_Georgian}) (96)
   \p{Block: Georgian_Sup} \p{Block=Georgian_Supplement} (48)
   \p{Block: Georgian_Supplement} (Short: \p{Blk=GeorgianSup}) (48)
   \p{Block: Glagolitic}   (NOT \p{Glagolitic} NOR \p{Is_Glagolitic})
                             (96)
   \p{Block: Glagolitic_Sup} \p{Block=Glagolitic_Supplement} (48)
   \p{Block: Glagolitic_Supplement} (Short: \p{Blk=GlagoliticSup})
                             (48)
   \p{Block: Gothic}       (NOT \p{Gothic} NOR \p{Is_Gothic}) (32)
   \p{Block: Grantha}      (NOT \p{Grantha} NOR \p{Is_Grantha}) (128)
   \p{Block: Greek}        \p{Block=Greek_And_Coptic} (NOT \p{Greek}
                             NOR \p{Is_Greek}) (144)
   \p{Block: Greek_And_Coptic} (Short: \p{Blk=Greek}; NOT \p{Greek}
                             NOR \p{Is_Greek}) (144)
   \p{Block: Greek_Ext}    \p{Block=Greek_Extended} (256)
   \p{Block: Greek_Extended} (Short: \p{Blk=GreekExt}) (256)
   \p{Block: Gujarati}     (NOT \p{Gujarati} NOR \p{Is_Gujarati})
                             (128)
   \p{Block: Gurmukhi}     (NOT \p{Gurmukhi} NOR \p{Is_Gurmukhi})
                             (128)
   \p{Block: Half_And_Full_Forms} \p{Block=
                             Halfwidth_And_Fullwidth_Forms} (240)
   \p{Block: Half_Marks}   \p{Block=Combining_Half_Marks} (16)
   \p{Block: Halfwidth_And_Fullwidth_Forms} (Short: \p{Blk=
                             HalfAndFullForms}) (240)
   \p{Block: Hangul}       \p{Block=Hangul_Syllables} (NOT \p{Hangul}
                             NOR \p{Is_Hangul}) (11_184)
   \p{Block: Hangul_Compatibility_Jamo} (Short: \p{Blk=CompatJamo})
                             (96)
   \p{Block: Hangul_Jamo}  (Short: \p{Blk=Jamo}) (256)
   \p{Block: Hangul_Jamo_Extended_A} (Short: \p{Blk=JamoExtA}) (32)
   \p{Block: Hangul_Jamo_Extended_B} (Short: \p{Blk=JamoExtB}) (80)
   \p{Block: Hangul_Syllables} (Short: \p{Blk=Hangul}; NOT \p{Hangul}
                             NOR \p{Is_Hangul}) (11_184)
   \p{Block: Hanunoo}      (NOT \p{Hanunoo} NOR \p{Is_Hanunoo}) (32)
   \p{Block: Hatran}       (NOT \p{Hatran} NOR \p{Is_Hatran}) (32)
   \p{Block: Hebrew}       (NOT \p{Hebrew} NOR \p{Is_Hebrew}) (112)
   \p{Block: High_Private_Use_Surrogates} (Short: \p{Blk=
                             HighPUSurrogates}) (128)
   \p{Block: High_PU_Surrogates} \p{Block=
                             High_Private_Use_Surrogates} (128)
   \p{Block: High_Surrogates} (896)
   \p{Block: Hiragana}     (NOT \p{Hiragana} NOR \p{Is_Hiragana}) (96)
   \p{Block: IDC}          \p{Block=
                             Ideographic_Description_Characters} (NOT
                             \p{ID_Continue} NOR \p{Is_IDC}) (16)
   \p{Block: Ideographic_Description_Characters} (Short: \p{Blk=IDC};
                             NOT \p{ID_Continue} NOR \p{Is_IDC}) (16)
   \p{Block: Ideographic_Symbols} \p{Block=
                             Ideographic_Symbols_And_Punctuation} (32)
   \p{Block: Ideographic_Symbols_And_Punctuation} (Short: \p{Blk=
                             IdeographicSymbols}) (32)
   \p{Block: Imperial_Aramaic} (NOT \p{Imperial_Aramaic} NOR
                             \p{Is_Imperial_Aramaic}) (32)
   \p{Block: Indic_Number_Forms} \p{Block=Common_Indic_Number_Forms}
                             (16)
   \p{Block: Inscriptional_Pahlavi} (NOT \p{Inscriptional_Pahlavi}
                             NOR \p{Is_Inscriptional_Pahlavi}) (32)
   \p{Block: Inscriptional_Parthian} (NOT \p{Inscriptional_Parthian}
                             NOR \p{Is_Inscriptional_Parthian}) (32)
   \p{Block: IPA_Ext}      \p{Block=IPA_Extensions} (96)
   \p{Block: IPA_Extensions} (Short: \p{Blk=IPAExt}) (96)
   \p{Block: Jamo}         \p{Block=Hangul_Jamo} (256)
   \p{Block: Jamo_Ext_A}   \p{Block=Hangul_Jamo_Extended_A} (32)
   \p{Block: Jamo_Ext_B}   \p{Block=Hangul_Jamo_Extended_B} (80)
   \p{Block: Javanese}     (NOT \p{Javanese} NOR \p{Is_Javanese}) (96)
   \p{Block: Kaithi}       (NOT \p{Kaithi} NOR \p{Is_Kaithi}) (80)
   \p{Block: Kana_Sup}     \p{Block=Kana_Supplement} (256)
   \p{Block: Kana_Supplement} (Short: \p{Blk=KanaSup}) (256)
   \p{Block: Kanbun}       (16)
   \p{Block: Kangxi}       \p{Block=Kangxi_Radicals} (224)
   \p{Block: Kangxi_Radicals} (Short: \p{Blk=Kangxi}) (224)
   \p{Block: Kannada}      (NOT \p{Kannada} NOR \p{Is_Kannada}) (128)
   \p{Block: Katakana}     (NOT \p{Katakana} NOR \p{Is_Katakana}) (96)
   \p{Block: Katakana_Ext} \p{Block=Katakana_Phonetic_Extensions} (16)
   \p{Block: Katakana_Phonetic_Extensions} (Short: \p{Blk=
                             KatakanaExt}) (16)
   \p{Block: Kayah_Li}     (48)
   \p{Block: Kharoshthi}   (NOT \p{Kharoshthi} NOR \p{Is_Kharoshthi})
                             (96)
   \p{Block: Khmer}        (NOT \p{Khmer} NOR \p{Is_Khmer}) (128)
   \p{Block: Khmer_Symbols} (32)
   \p{Block: Khojki}       (NOT \p{Khojki} NOR \p{Is_Khojki}) (80)
   \p{Block: Khudawadi}    (NOT \p{Khudawadi} NOR \p{Is_Khudawadi})
                             (80)
   \p{Block: Lao}          (NOT \p{Lao} NOR \p{Is_Lao}) (128)
   \p{Block: Latin_1}      \p{Block=Latin_1_Supplement} (128)
   \p{Block: Latin_1_Sup}  \p{Block=Latin_1_Supplement} (128)
   \p{Block: Latin_1_Supplement} (Short: \p{Blk=Latin1}) (128)
   \p{Block: Latin_Ext_A}  \p{Block=Latin_Extended_A} (128)
   \p{Block: Latin_Ext_Additional} \p{Block=
                             Latin_Extended_Additional} (256)
   \p{Block: Latin_Ext_B}  \p{Block=Latin_Extended_B} (208)
   \p{Block: Latin_Ext_C}  \p{Block=Latin_Extended_C} (32)
   \p{Block: Latin_Ext_D}  \p{Block=Latin_Extended_D} (224)
   \p{Block: Latin_Ext_E}  \p{Block=Latin_Extended_E} (64)
   \p{Block: Latin_Extended_A} (Short: \p{Blk=LatinExtA}) (128)
   \p{Block: Latin_Extended_Additional} (Short: \p{Blk=
                             LatinExtAdditional}) (256)
   \p{Block: Latin_Extended_B} (Short: \p{Blk=LatinExtB}) (208)
   \p{Block: Latin_Extended_C} (Short: \p{Blk=LatinExtC}) (32)
   \p{Block: Latin_Extended_D} (Short: \p{Blk=LatinExtD}) (224)
   \p{Block: Latin_Extended_E} (Short: \p{Blk=LatinExtE}) (64)
   \p{Block: Lepcha}       (NOT \p{Lepcha} NOR \p{Is_Lepcha}) (80)
   \p{Block: Letterlike_Symbols} (80)
   \p{Block: Limbu}        (NOT \p{Limbu} NOR \p{Is_Limbu}) (80)
   \p{Block: Linear_A}     (NOT \p{Linear_A} NOR \p{Is_Linear_A})
                             (384)
   \p{Block: Linear_B_Ideograms} (128)
   \p{Block: Linear_B_Syllabary} (128)
   \p{Block: Lisu}         (48)
   \p{Block: Low_Surrogates} (1024)
   \p{Block: Lycian}       (NOT \p{Lycian} NOR \p{Is_Lycian}) (32)
   \p{Block: Lydian}       (NOT \p{Lydian} NOR \p{Is_Lydian}) (32)
   \p{Block: Mahajani}     (NOT \p{Mahajani} NOR \p{Is_Mahajani}) (48)
   \p{Block: Mahjong}      \p{Block=Mahjong_Tiles} (48)
   \p{Block: Mahjong_Tiles} (Short: \p{Blk=Mahjong}) (48)
   \p{Block: Malayalam}    (NOT \p{Malayalam} NOR \p{Is_Malayalam})
                             (128)
   \p{Block: Mandaic}      (NOT \p{Mandaic} NOR \p{Is_Mandaic}) (32)
   \p{Block: Manichaean}   (NOT \p{Manichaean} NOR \p{Is_Manichaean})
                             (64)
   \p{Block: Marchen}      (NOT \p{Marchen} NOR \p{Is_Marchen}) (80)
   \p{Block: Math_Alphanum} \p{Block=
                             Mathematical_Alphanumeric_Symbols} (1024)
   \p{Block: Math_Operators} \p{Block=Mathematical_Operators} (256)
   \p{Block: Mathematical_Alphanumeric_Symbols} (Short: \p{Blk=
                             MathAlphanum}) (1024)
   \p{Block: Mathematical_Operators} (Short: \p{Blk=MathOperators})
                             (256)
   \p{Block: Meetei_Mayek} (NOT \p{Meetei_Mayek} NOR
                             \p{Is_Meetei_Mayek}) (64)
   \p{Block: Meetei_Mayek_Ext} \p{Block=Meetei_Mayek_Extensions} (32)
   \p{Block: Meetei_Mayek_Extensions} (Short: \p{Blk=MeeteiMayekExt})
                             (32)
   \p{Block: Mende_Kikakui} (NOT \p{Mende_Kikakui} NOR
                             \p{Is_Mende_Kikakui}) (224)
   \p{Block: Meroitic_Cursive} (NOT \p{Meroitic_Cursive} NOR
                             \p{Is_Meroitic_Cursive}) (96)
   \p{Block: Meroitic_Hieroglyphs} (32)
   \p{Block: Miao}         (NOT \p{Miao} NOR \p{Is_Miao}) (160)
   \p{Block: Misc_Arrows}  \p{Block=Miscellaneous_Symbols_And_Arrows}
                             (256)
   \p{Block: Misc_Math_Symbols_A} \p{Block=
                             Miscellaneous_Mathematical_Symbols_A}
                             (48)
   \p{Block: Misc_Math_Symbols_B} \p{Block=
                             Miscellaneous_Mathematical_Symbols_B}
                             (128)
   \p{Block: Misc_Pictographs} \p{Block=
                             Miscellaneous_Symbols_And_Pictographs}
                             (768)
   \p{Block: Misc_Symbols} \p{Block=Miscellaneous_Symbols} (256)
   \p{Block: Misc_Technical} \p{Block=Miscellaneous_Technical} (256)
   \p{Block: Miscellaneous_Mathematical_Symbols_A} (Short: \p{Blk=
                             MiscMathSymbolsA}) (48)
   \p{Block: Miscellaneous_Mathematical_Symbols_B} (Short: \p{Blk=
                             MiscMathSymbolsB}) (128)
   \p{Block: Miscellaneous_Symbols} (Short: \p{Blk=MiscSymbols}) (256)
   \p{Block: Miscellaneous_Symbols_And_Arrows} (Short: \p{Blk=
                             MiscArrows}) (256)
   \p{Block: Miscellaneous_Symbols_And_Pictographs} (Short: \p{Blk=
                             MiscPictographs}) (768)
   \p{Block: Miscellaneous_Technical} (Short: \p{Blk=MiscTechnical})
                             (256)
   \p{Block: Modi}         (NOT \p{Modi} NOR \p{Is_Modi}) (96)
   \p{Block: Modifier_Letters} \p{Block=Spacing_Modifier_Letters} (80)
   \p{Block: Modifier_Tone_Letters} (32)
   \p{Block: Mongolian}    (NOT \p{Mongolian} NOR \p{Is_Mongolian})
                             (176)
   \p{Block: Mongolian_Sup} \p{Block=Mongolian_Supplement} (32)
   \p{Block: Mongolian_Supplement} (Short: \p{Blk=MongolianSup}) (32)
   \p{Block: Mro}          (NOT \p{Mro} NOR \p{Is_Mro}) (48)
   \p{Block: Multani}      (NOT \p{Multani} NOR \p{Is_Multani}) (48)
   \p{Block: Music}        \p{Block=Musical_Symbols} (256)
   \p{Block: Musical_Symbols} (Short: \p{Blk=Music}) (256)
   \p{Block: Myanmar}      (NOT \p{Myanmar} NOR \p{Is_Myanmar}) (160)
   \p{Block: Myanmar_Ext_A} \p{Block=Myanmar_Extended_A} (32)
   \p{Block: Myanmar_Ext_B} \p{Block=Myanmar_Extended_B} (32)
   \p{Block: Myanmar_Extended_A} (Short: \p{Blk=MyanmarExtA}) (32)
   \p{Block: Myanmar_Extended_B} (Short: \p{Blk=MyanmarExtB}) (32)
   \p{Block: Nabataean}    (NOT \p{Nabataean} NOR \p{Is_Nabataean})
                             (48)
   \p{Block: NB}           \p{Block=No_Block} (842_320 plus all
                             above-Unicode code points)
   \p{Block: New_Tai_Lue}  (NOT \p{New_Tai_Lue} NOR
                             \p{Is_New_Tai_Lue}) (96)
   \p{Block: Newa}         (NOT \p{Newa} NOR \p{Is_Newa}) (128)
   \p{Block: NKo}          (NOT \p{Nko} NOR \p{Is_NKo}) (64)
   \p{Block: No_Block}     (Short: \p{Blk=NB}) (842_320 plus all
                             above-Unicode code points)
   \p{Block: Number_Forms} (64)
   \p{Block: OCR}          \p{Block=Optical_Character_Recognition}
                             (32)
   \p{Block: Ogham}        (NOT \p{Ogham} NOR \p{Is_Ogham}) (32)
   \p{Block: Ol_Chiki}     (48)
   \p{Block: Old_Hungarian} (NOT \p{Old_Hungarian} NOR
                             \p{Is_Old_Hungarian}) (128)
   \p{Block: Old_Italic}   (NOT \p{Old_Italic} NOR \p{Is_Old_Italic})
                             (48)
   \p{Block: Old_North_Arabian} (32)
   \p{Block: Old_Permic}   (NOT \p{Old_Permic} NOR \p{Is_Old_Permic})
                             (48)
   \p{Block: Old_Persian}  (NOT \p{Old_Persian} NOR
                             \p{Is_Old_Persian}) (64)
   \p{Block: Old_South_Arabian} (32)
   \p{Block: Old_Turkic}   (NOT \p{Old_Turkic} NOR \p{Is_Old_Turkic})
                             (80)
   \p{Block: Optical_Character_Recognition} (Short: \p{Blk=OCR}) (32)
   \p{Block: Oriya}        (NOT \p{Oriya} NOR \p{Is_Oriya}) (128)
   \p{Block: Ornamental_Dingbats} (48)
   \p{Block: Osage}        (NOT \p{Osage} NOR \p{Is_Osage}) (80)
   \p{Block: Osmanya}      (NOT \p{Osmanya} NOR \p{Is_Osmanya}) (48)
   \p{Block: Pahawh_Hmong} (NOT \p{Pahawh_Hmong} NOR
                             \p{Is_Pahawh_Hmong}) (144)
   \p{Block: Palmyrene}    (32)
   \p{Block: Pau_Cin_Hau}  (NOT \p{Pau_Cin_Hau} NOR
                             \p{Is_Pau_Cin_Hau}) (64)
   \p{Block: Phags_Pa}     (NOT \p{Phags_Pa} NOR \p{Is_Phags_Pa}) (64)
   \p{Block: Phaistos}     \p{Block=Phaistos_Disc} (48)
   \p{Block: Phaistos_Disc} (Short: \p{Blk=Phaistos}) (48)
   \p{Block: Phoenician}   (NOT \p{Phoenician} NOR \p{Is_Phoenician})
                             (32)
   \p{Block: Phonetic_Ext} \p{Block=Phonetic_Extensions} (128)
   \p{Block: Phonetic_Ext_Sup} \p{Block=
                             Phonetic_Extensions_Supplement} (64)
   \p{Block: Phonetic_Extensions} (Short: \p{Blk=PhoneticExt}) (128)
   \p{Block: Phonetic_Extensions_Supplement} (Short: \p{Blk=
                             PhoneticExtSup}) (64)
   \p{Block: Playing_Cards} (96)
   \p{Block: Private_Use}  \p{Block=Private_Use_Area} (NOT
                             \p{Private_Use} NOR \p{Is_Private_Use})
                             (6400)
   \p{Block: Private_Use_Area} (Short: \p{Blk=PUA}; NOT
                             \p{Private_Use} NOR \p{Is_Private_Use})
                             (6400)
   \p{Block: Psalter_Pahlavi} (NOT \p{Psalter_Pahlavi} NOR
                             \p{Is_Psalter_Pahlavi}) (48)
   \p{Block: PUA}          \p{Block=Private_Use_Area} (NOT
                             \p{Private_Use} NOR \p{Is_Private_Use})
                             (6400)
   \p{Block: Punctuation}  \p{Block=General_Punctuation} (NOT
                             \p{Punct} NOR \p{Is_Punctuation}) (112)
   \p{Block: Rejang}       (NOT \p{Rejang} NOR \p{Is_Rejang}) (48)
   \p{Block: Rumi}         \p{Block=Rumi_Numeral_Symbols} (32)
   \p{Block: Rumi_Numeral_Symbols} (Short: \p{Blk=Rumi}) (32)
   \p{Block: Runic}        (NOT \p{Runic} NOR \p{Is_Runic}) (96)
   \p{Block: Samaritan}    (NOT \p{Samaritan} NOR \p{Is_Samaritan})
                             (64)
   \p{Block: Saurashtra}   (NOT \p{Saurashtra} NOR \p{Is_Saurashtra})
                             (96)
   \p{Block: Sharada}      (NOT \p{Sharada} NOR \p{Is_Sharada}) (96)
   \p{Block: Shavian}      (48)
   \p{Block: Shorthand_Format_Controls} (16)
   \p{Block: Siddham}      (NOT \p{Siddham} NOR \p{Is_Siddham}) (128)
   \p{Block: Sinhala}      (NOT \p{Sinhala} NOR \p{Is_Sinhala}) (128)
   \p{Block: Sinhala_Archaic_Numbers} (32)
   \p{Block: Small_Form_Variants} (Short: \p{Blk=SmallForms}) (32)
   \p{Block: Small_Forms}  \p{Block=Small_Form_Variants} (32)
   \p{Block: Sora_Sompeng} (NOT \p{Sora_Sompeng} NOR
                             \p{Is_Sora_Sompeng}) (48)
   \p{Block: Spacing_Modifier_Letters} (Short: \p{Blk=
                             ModifierLetters}) (80)
   \p{Block: Specials}     (16)
   \p{Block: Sundanese}    (NOT \p{Sundanese} NOR \p{Is_Sundanese})
                             (64)
   \p{Block: Sundanese_Sup} \p{Block=Sundanese_Supplement} (16)
   \p{Block: Sundanese_Supplement} (Short: \p{Blk=SundaneseSup}) (16)
   \p{Block: Sup_Arrows_A} \p{Block=Supplemental_Arrows_A} (16)
   \p{Block: Sup_Arrows_B} \p{Block=Supplemental_Arrows_B} (128)
   \p{Block: Sup_Arrows_C} \p{Block=Supplemental_Arrows_C} (256)
   \p{Block: Sup_Math_Operators} \p{Block=
                             Supplemental_Mathematical_Operators}
                             (256)
   \p{Block: Sup_PUA_A}    \p{Block=Supplementary_Private_Use_Area_A}
                             (65_536)
   \p{Block: Sup_PUA_B}    \p{Block=Supplementary_Private_Use_Area_B}
                             (65_536)
   \p{Block: Sup_Punctuation} \p{Block=Supplemental_Punctuation} (128)
   \p{Block: Sup_Symbols_And_Pictographs} \p{Block=
                             Supplemental_Symbols_And_Pictographs}
                             (256)
   \p{Block: Super_And_Sub} \p{Block=Superscripts_And_Subscripts} (48)
   \p{Block: Superscripts_And_Subscripts} (Short: \p{Blk=
                             SuperAndSub}) (48)
   \p{Block: Supplemental_Arrows_A} (Short: \p{Blk=SupArrowsA}) (16)
   \p{Block: Supplemental_Arrows_B} (Short: \p{Blk=SupArrowsB}) (128)
   \p{Block: Supplemental_Arrows_C} (Short: \p{Blk=SupArrowsC}) (256)
   \p{Block: Supplemental_Mathematical_Operators} (Short: \p{Blk=
                             SupMathOperators}) (256)
   \p{Block: Supplemental_Punctuation} (Short: \p{Blk=
                             SupPunctuation}) (128)
   \p{Block: Supplemental_Symbols_And_Pictographs} (Short: \p{Blk=
                             SupSymbolsAndPictographs}) (256)
   \p{Block: Supplementary_Private_Use_Area_A} (Short: \p{Blk=
                             SupPUAA}) (65_536)
   \p{Block: Supplementary_Private_Use_Area_B} (Short: \p{Blk=
                             SupPUAB}) (65_536)
   \p{Block: Sutton_SignWriting} (688)
   \p{Block: Syloti_Nagri} (NOT \p{Syloti_Nagri} NOR
                             \p{Is_Syloti_Nagri}) (48)
   \p{Block: Syriac}       (NOT \p{Syriac} NOR \p{Is_Syriac}) (80)
   \p{Block: Tagalog}      (NOT \p{Tagalog} NOR \p{Is_Tagalog}) (32)
   \p{Block: Tagbanwa}     (NOT \p{Tagbanwa} NOR \p{Is_Tagbanwa}) (32)
   \p{Block: Tags}         (128)
   \p{Block: Tai_Le}       (NOT \p{Tai_Le} NOR \p{Is_Tai_Le}) (48)
   \p{Block: Tai_Tham}     (NOT \p{Tai_Tham} NOR \p{Is_Tai_Tham})
                             (144)
   \p{Block: Tai_Viet}     (NOT \p{Tai_Viet} NOR \p{Is_Tai_Viet}) (96)
   \p{Block: Tai_Xuan_Jing} \p{Block=Tai_Xuan_Jing_Symbols} (96)
   \p{Block: Tai_Xuan_Jing_Symbols} (Short: \p{Blk=TaiXuanJing}) (96)
   \p{Block: Takri}        (NOT \p{Takri} NOR \p{Is_Takri}) (80)
   \p{Block: Tamil}        (NOT \p{Tamil} NOR \p{Is_Tamil}) (128)
   \p{Block: Tangut}       (NOT \p{Tangut} NOR \p{Is_Tangut}) (6144)
   \p{Block: Tangut_Components} (768)
   \p{Block: Telugu}       (NOT \p{Telugu} NOR \p{Is_Telugu}) (128)
   \p{Block: Thaana}       (NOT \p{Thaana} NOR \p{Is_Thaana}) (64)
   \p{Block: Thai}         (NOT \p{Thai} NOR \p{Is_Thai}) (128)
   \p{Block: Tibetan}      (NOT \p{Tibetan} NOR \p{Is_Tibetan}) (256)
   \p{Block: Tifinagh}     (NOT \p{Tifinagh} NOR \p{Is_Tifinagh}) (80)
   \p{Block: Tirhuta}      (NOT \p{Tirhuta} NOR \p{Is_Tirhuta}) (96)
   \p{Block: Transport_And_Map} \p{Block=Transport_And_Map_Symbols}
                             (128)
   \p{Block: Transport_And_Map_Symbols} (Short: \p{Blk=
                             TransportAndMap}) (128)
   \p{Block: UCAS}         \p{Block=
                             Unified_Canadian_Aboriginal_Syllabics}
                             (640)
   \p{Block: UCAS_Ext}     \p{Block=
                             Unified_Canadian_Aboriginal_Syllabics_-
                             Extended} (80)
   \p{Block: Ugaritic}     (NOT \p{Ugaritic} NOR \p{Is_Ugaritic}) (32)
   \p{Block: Unified_Canadian_Aboriginal_Syllabics} (Short: \p{Blk=
                             UCAS}) (640)
   \p{Block: Unified_Canadian_Aboriginal_Syllabics_Extended} (Short:
                             \p{Blk=UCASExt}) (80)
   \p{Block: Vai}          (NOT \p{Vai} NOR \p{Is_Vai}) (320)
   \p{Block: Variation_Selectors} (Short: \p{Blk=VS}; NOT
                             \p{Variation_Selector} NOR \p{Is_VS})
                             (16)
   \p{Block: Variation_Selectors_Supplement} (Short: \p{Blk=VSSup})
                             (240)
   \p{Block: Vedic_Ext}    \p{Block=Vedic_Extensions} (48)
   \p{Block: Vedic_Extensions} (Short: \p{Blk=VedicExt}) (48)
   \p{Block: Vertical_Forms} (16)
   \p{Block: VS}           \p{Block=Variation_Selectors} (NOT
                             \p{Variation_Selector} NOR \p{Is_VS})
                             (16)
   \p{Block: VS_Sup}       \p{Block=Variation_Selectors_Supplement}
                             (240)
   \p{Block: Warang_Citi}  (NOT \p{Warang_Citi} NOR
                             \p{Is_Warang_Citi}) (96)
   \p{Block: Yi_Radicals}  (64)
   \p{Block: Yi_Syllables} (1168)
   \p{Block: Yijing}       \p{Block=Yijing_Hexagram_Symbols} (64)
   \p{Block: Yijing_Hexagram_Symbols} (Short: \p{Blk=Yijing}) (64)
 X \p{Block_Elements}      \p{Block=Block_Elements} (32)
   \p{Bopo}                \p{Bopomofo} (= \p{Script_Extensions=
                             Bopomofo}) (NOT \p{Block=Bopomofo}) (110)
   \p{Bopomofo}            \p{Script_Extensions=Bopomofo} (Short:
                             \p{Bopo}; NOT \p{Block=Bopomofo}) (110)
 X \p{Bopomofo_Ext}        \p{Bopomofo_Extended} (= \p{Block=
                             Bopomofo_Extended}) (32)
 X \p{Bopomofo_Extended}   \p{Block=Bopomofo_Extended} (Short:
                             \p{InBopomofoExt}) (32)
 X \p{Box_Drawing}         \p{Block=Box_Drawing} (128)
   \p{Bpt: *}              \p{Bidi_Paired_Bracket_Type: *}
   \p{Brah}                \p{Brahmi} (= \p{Script_Extensions=
                             Brahmi}) (NOT \p{Block=Brahmi}) (109)
   \p{Brahmi}              \p{Script_Extensions=Brahmi} (Short:
                             \p{Brah}; NOT \p{Block=Brahmi}) (109)
   \p{Brai}                \p{Braille} (= \p{Script_Extensions=
                             Braille}) (256)
   \p{Braille}             \p{Script_Extensions=Braille} (Short:
                             \p{Brai}) (256)
 X \p{Braille_Patterns}    \p{Block=Braille_Patterns} (Short:
                             \p{InBraille}) (256)
   \p{Bugi}                \p{Buginese} (= \p{Script_Extensions=
                             Buginese}) (NOT \p{Block=Buginese}) (31)
   \p{Buginese}            \p{Script_Extensions=Buginese} (Short:
                             \p{Bugi}; NOT \p{Block=Buginese}) (31)
   \p{Buhd}                \p{Buhid} (= \p{Script_Extensions=Buhid})
                             (NOT \p{Block=Buhid}) (22)
   \p{Buhid}               \p{Script_Extensions=Buhid} (Short:
                             \p{Buhd}; NOT \p{Block=Buhid}) (22)
 X \p{Byzantine_Music}     \p{Byzantine_Musical_Symbols} (= \p{Block=
                             Byzantine_Musical_Symbols}) (256)
 X \p{Byzantine_Musical_Symbols} \p{Block=Byzantine_Musical_Symbols}
                             (Short: \p{InByzantineMusic}) (256)
   \p{C} \pC               \p{Other} (= \p{General_Category=Other})
                             (986_091 plus all above-Unicode code
                             points)
   \p{Cakm}                \p{Chakma} (= \p{Script_Extensions=
                             Chakma}) (NOT \p{Block=Chakma}) (87)
   \p{Canadian_Aboriginal} \p{Script_Extensions=Canadian_Aboriginal}
                             (Short: \p{Cans}) (710)
 X \p{Canadian_Syllabics}  \p{Unified_Canadian_Aboriginal_Syllabics}
                             (= \p{Block=
                             Unified_Canadian_Aboriginal_Syllabics})
                             (640)
 T \p{Canonical_Combining_Class: 0} \p{Canonical_Combining_Class=
                             Not_Reordered} (1_113_298 plus all
                             above-Unicode code points)
 T \p{Canonical_Combining_Class: 1} \p{Canonical_Combining_Class=
                             Overlay} (32)
 T \p{Canonical_Combining_Class: 7} \p{Canonical_Combining_Class=
                             Nukta} (22)
 T \p{Canonical_Combining_Class: 8} \p{Canonical_Combining_Class=
                             Kana_Voicing} (2)
 T \p{Canonical_Combining_Class: 9} \p{Canonical_Combining_Class=
                             Virama} (47)
 T \p{Canonical_Combining_Class: 10} \p{Canonical_Combining_Class=
                             CCC10} (1)
 T \p{Canonical_Combining_Class: 11} \p{Canonical_Combining_Class=
                             CCC11} (1)
 T \p{Canonical_Combining_Class: 12} \p{Canonical_Combining_Class=
                             CCC12} (1)
 T \p{Canonical_Combining_Class: 13} \p{Canonical_Combining_Class=
                             CCC13} (1)
 T \p{Canonical_Combining_Class: 14} \p{Canonical_Combining_Class=
                             CCC14} (1)
 T \p{Canonical_Combining_Class: 15} \p{Canonical_Combining_Class=
                             CCC15} (1)
 T \p{Canonical_Combining_Class: 16} \p{Canonical_Combining_Class=
                             CCC16} (1)
 T \p{Canonical_Combining_Class: 17} \p{Canonical_Combining_Class=
                             CCC17} (1)
 T \p{Canonical_Combining_Class: 18} \p{Canonical_Combining_Class=
                             CCC18} (2)
 T \p{Canonical_Combining_Class: 19} \p{Canonical_Combining_Class=
                             CCC19} (2)
 T \p{Canonical_Combining_Class: 20} \p{Canonical_Combining_Class=
                             CCC20} (1)
 T \p{Canonical_Combining_Class: 21} \p{Canonical_Combining_Class=
                             CCC21} (1)
 T \p{Canonical_Combining_Class: 22} \p{Canonical_Combining_Class=
                             CCC22} (1)
 T \p{Canonical_Combining_Class: 23} \p{Canonical_Combining_Class=
                             CCC23} (1)
 T \p{Canonical_Combining_Class: 24} \p{Canonical_Combining_Class=
                             CCC24} (1)
 T \p{Canonical_Combining_Class: 25} \p{Canonical_Combining_Class=
                             CCC25} (1)
 T \p{Canonical_Combining_Class: 26} \p{Canonical_Combining_Class=
                             CCC26} (1)
 T \p{Canonical_Combining_Class: 27} \p{Canonical_Combining_Class=
                             CCC27} (2)
 T \p{Canonical_Combining_Class: 28} \p{Canonical_Combining_Class=
                             CCC28} (2)
 T \p{Canonical_Combining_Class: 29} \p{Canonical_Combining_Class=
                             CCC29} (2)
 T \p{Canonical_Combining_Class: 30} \p{Canonical_Combining_Class=
                             CCC30} (2)
 T \p{Canonical_Combining_Class: 31} \p{Canonical_Combining_Class=
                             CCC31} (2)
 T \p{Canonical_Combining_Class: 32} \p{Canonical_Combining_Class=
                             CCC32} (2)
 T \p{Canonical_Combining_Class: 33} \p{Canonical_Combining_Class=
                             CCC33} (1)
 T \p{Canonical_Combining_Class: 34} \p{Canonical_Combining_Class=
                             CCC34} (1)
 T \p{Canonical_Combining_Class: 35} \p{Canonical_Combining_Class=
                             CCC35} (1)
 T \p{Canonical_Combining_Class: 36} \p{Canonical_Combining_Class=
                             CCC36} (1)
 T \p{Canonical_Combining_Class: 84} \p{Canonical_Combining_Class=
                             CCC84} (1)
 T \p{Canonical_Combining_Class: 91} \p{Canonical_Combining_Class=
                             CCC91} (1)
 T \p{Canonical_Combining_Class: 103} \p{Canonical_Combining_Class=
                             CCC103} (2)
 T \p{Canonical_Combining_Class: 107} \p{Canonical_Combining_Class=
                             CCC107} (4)
 T \p{Canonical_Combining_Class: 118} \p{Canonical_Combining_Class=
                             CCC118} (2)
 T \p{Canonical_Combining_Class: 122} \p{Canonical_Combining_Class=
                             CCC122} (4)
 T \p{Canonical_Combining_Class: 129} \p{Canonical_Combining_Class=
                             CCC129} (1)
 T \p{Canonical_Combining_Class: 130} \p{Canonical_Combining_Class=
                             CCC130} (6)
 T \p{Canonical_Combining_Class: 132} \p{Canonical_Combining_Class=
                             CCC132} (1)
 T \p{Canonical_Combining_Class: 133} \p{Canonical_Combining_Class=
                             CCC133} (0)
 T \p{Canonical_Combining_Class: 200} \p{Canonical_Combining_Class=
                             Attached_Below_Left} (0)
 T \p{Canonical_Combining_Class: 202} \p{Canonical_Combining_Class=
                             Attached_Below} (5)
 T \p{Canonical_Combining_Class: 214} \p{Canonical_Combining_Class=
                             Attached_Above} (1)
 T \p{Canonical_Combining_Class: 216} \p{Canonical_Combining_Class=
                             Attached_Above_Right} (9)
 T \p{Canonical_Combining_Class: 218} \p{Canonical_Combining_Class=
                             Below_Left} (1)
 T \p{Canonical_Combining_Class: 220} \p{Canonical_Combining_Class=
                             Below} (153)
 T \p{Canonical_Combining_Class: 222} \p{Canonical_Combining_Class=
                             Below_Right} (4)
 T \p{Canonical_Combining_Class: 224} \p{Canonical_Combining_Class=
                             Left} (2)
 T \p{Canonical_Combining_Class: 226} \p{Canonical_Combining_Class=
                             Right} (1)
 T \p{Canonical_Combining_Class: 228} \p{Canonical_Combining_Class=
                             Above_Left} (3)
 T \p{Canonical_Combining_Class: 230} \p{Canonical_Combining_Class=
                             Above} (461)
 T \p{Canonical_Combining_Class: 232} \p{Canonical_Combining_Class=
                             Above_Right} (4)
 T \p{Canonical_Combining_Class: 233} \p{Canonical_Combining_Class=
                             Double_Below} (4)
 T \p{Canonical_Combining_Class: 234} \p{Canonical_Combining_Class=
                             Double_Above} (5)
 T \p{Canonical_Combining_Class: 240} \p{Canonical_Combining_Class=
                             Iota_Subscript} (1)
   \p{Canonical_Combining_Class: A} \p{Canonical_Combining_Class=
                             Above} (461)
   \p{Canonical_Combining_Class: Above} (Short: \p{Ccc=A}) (461)
   \p{Canonical_Combining_Class: Above_Left} (Short: \p{Ccc=AL}) (3)
   \p{Canonical_Combining_Class: Above_Right} (Short: \p{Ccc=AR}) (4)
   \p{Canonical_Combining_Class: AL} \p{Canonical_Combining_Class=
                             Above_Left} (3)
   \p{Canonical_Combining_Class: AR} \p{Canonical_Combining_Class=
                             Above_Right} (4)
   \p{Canonical_Combining_Class: ATA} \p{Canonical_Combining_Class=
                             Attached_Above} (1)
   \p{Canonical_Combining_Class: ATAR} \p{Canonical_Combining_Class=
                             Attached_Above_Right} (9)
   \p{Canonical_Combining_Class: ATB} \p{Canonical_Combining_Class=
                             Attached_Below} (5)
   \p{Canonical_Combining_Class: ATBL} \p{Canonical_Combining_Class=
                             Attached_Below_Left} (0)
   \p{Canonical_Combining_Class: Attached_Above} (Short: \p{Ccc=ATA})
                             (1)
   \p{Canonical_Combining_Class: Attached_Above_Right} (Short:
                             \p{Ccc=ATAR}) (9)
   \p{Canonical_Combining_Class: Attached_Below} (Short: \p{Ccc=ATB})
                             (5)
   \p{Canonical_Combining_Class: Attached_Below_Left} (Short: \p{Ccc=
                             ATBL}) (0)
   \p{Canonical_Combining_Class: B} \p{Canonical_Combining_Class=
                             Below} (153)
   \p{Canonical_Combining_Class: Below} (Short: \p{Ccc=B}) (153)
   \p{Canonical_Combining_Class: Below_Left} (Short: \p{Ccc=BL}) (1)
   \p{Canonical_Combining_Class: Below_Right} (Short: \p{Ccc=BR}) (4)
   \p{Canonical_Combining_Class: BL} \p{Canonical_Combining_Class=
                             Below_Left} (1)
   \p{Canonical_Combining_Class: BR} \p{Canonical_Combining_Class=
                             Below_Right} (4)
   \p{Canonical_Combining_Class: CCC10} (Short: \p{Ccc=CCC10}) (1)
   \p{Canonical_Combining_Class: CCC103} (Short: \p{Ccc=CCC103}) (2)
   \p{Canonical_Combining_Class: CCC107} (Short: \p{Ccc=CCC107}) (4)
   \p{Canonical_Combining_Class: CCC11} (Short: \p{Ccc=CCC11}) (1)
   \p{Canonical_Combining_Class: CCC118} (Short: \p{Ccc=CCC118}) (2)
   \p{Canonical_Combining_Class: CCC12} (Short: \p{Ccc=CCC12}) (1)
   \p{Canonical_Combining_Class: CCC122} (Short: \p{Ccc=CCC122}) (4)
   \p{Canonical_Combining_Class: CCC129} (Short: \p{Ccc=CCC129}) (1)
   \p{Canonical_Combining_Class: CCC13} (Short: \p{Ccc=CCC13}) (1)
   \p{Canonical_Combining_Class: CCC130} (Short: \p{Ccc=CCC130}) (6)
   \p{Canonical_Combining_Class: CCC132} (Short: \p{Ccc=CCC132}) (1)
   \p{Canonical_Combining_Class: CCC133} (Short: \p{Ccc=CCC133}) (0)
   \p{Canonical_Combining_Class: CCC14} (Short: \p{Ccc=CCC14}) (1)
   \p{Canonical_Combining_Class: CCC15} (Short: \p{Ccc=CCC15}) (1)
   \p{Canonical_Combining_Class: CCC16} (Short: \p{Ccc=CCC16}) (1)
   \p{Canonical_Combining_Class: CCC17} (Short: \p{Ccc=CCC17}) (1)
   \p{Canonical_Combining_Class: CCC18} (Short: \p{Ccc=CCC18}) (2)
   \p{Canonical_Combining_Class: CCC19} (Short: \p{Ccc=CCC19}) (2)
   \p{Canonical_Combining_Class: CCC20} (Short: \p{Ccc=CCC20}) (1)
   \p{Canonical_Combining_Class: CCC21} (Short: \p{Ccc=CCC21}) (1)
   \p{Canonical_Combining_Class: CCC22} (Short: \p{Ccc=CCC22}) (1)
   \p{Canonical_Combining_Class: CCC23} (Short: \p{Ccc=CCC23}) (1)
   \p{Canonical_Combining_Class: CCC24} (Short: \p{Ccc=CCC24}) (1)
   \p{Canonical_Combining_Class: CCC25} (Short: \p{Ccc=CCC25}) (1)
   \p{Canonical_Combining_Class: CCC26} (Short: \p{Ccc=CCC26}) (1)
   \p{Canonical_Combining_Class: CCC27} (Short: \p{Ccc=CCC27}) (2)
   \p{Canonical_Combining_Class: CCC28} (Short: \p{Ccc=CCC28}) (2)
   \p{Canonical_Combining_Class: CCC29} (Short: \p{Ccc=CCC29}) (2)
   \p{Canonical_Combining_Class: CCC30} (Short: \p{Ccc=CCC30}) (2)
   \p{Canonical_Combining_Class: CCC31} (Short: \p{Ccc=CCC31}) (2)
   \p{Canonical_Combining_Class: CCC32} (Short: \p{Ccc=CCC32}) (2)
   \p{Canonical_Combining_Class: CCC33} (Short: \p{Ccc=CCC33}) (1)
   \p{Canonical_Combining_Class: CCC34} (Short: \p{Ccc=CCC34}) (1)
   \p{Canonical_Combining_Class: CCC35} (Short: \p{Ccc=CCC35}) (1)
   \p{Canonical_Combining_Class: CCC36} (Short: \p{Ccc=CCC36}) (1)
   \p{Canonical_Combining_Class: CCC84} (Short: \p{Ccc=CCC84}) (1)
   \p{Canonical_Combining_Class: CCC91} (Short: \p{Ccc=CCC91}) (1)
   \p{Canonical_Combining_Class: DA} \p{Canonical_Combining_Class=
                             Double_Above} (5)
   \p{Canonical_Combining_Class: DB} \p{Canonical_Combining_Class=
                             Double_Below} (4)
   \p{Canonical_Combining_Class: Double_Above} (Short: \p{Ccc=DA}) (5)
   \p{Canonical_Combining_Class: Double_Below} (Short: \p{Ccc=DB}) (4)
   \p{Canonical_Combining_Class: Iota_Subscript} (Short: \p{Ccc=IS})
                             (1)
   \p{Canonical_Combining_Class: IS} \p{Canonical_Combining_Class=
                             Iota_Subscript} (1)
   \p{Canonical_Combining_Class: Kana_Voicing} (Short: \p{Ccc=KV}) (2)
   \p{Canonical_Combining_Class: KV} \p{Canonical_Combining_Class=
                             Kana_Voicing} (2)
   \p{Canonical_Combining_Class: L} \p{Canonical_Combining_Class=
                             Left} (2)
   \p{Canonical_Combining_Class: Left} (Short: \p{Ccc=L}) (2)
   \p{Canonical_Combining_Class: NK} \p{Canonical_Combining_Class=
                             Nukta} (22)
   \p{Canonical_Combining_Class: Not_Reordered} (Short: \p{Ccc=NR})
                             (1_113_298 plus all above-Unicode code
                             points)
   \p{Canonical_Combining_Class: NR} \p{Canonical_Combining_Class=
                             Not_Reordered} (1_113_298 plus all
                             above-Unicode code points)
   \p{Canonical_Combining_Class: Nukta} (Short: \p{Ccc=NK}) (22)
   \p{Canonical_Combining_Class: OV} \p{Canonical_Combining_Class=
                             Overlay} (32)
   \p{Canonical_Combining_Class: Overlay} (Short: \p{Ccc=OV}) (32)
   \p{Canonical_Combining_Class: R} \p{Canonical_Combining_Class=
                             Right} (1)
   \p{Canonical_Combining_Class: Right} (Short: \p{Ccc=R}) (1)
   \p{Canonical_Combining_Class: Virama} (Short: \p{Ccc=VR}) (47)
   \p{Canonical_Combining_Class: VR} \p{Canonical_Combining_Class=
                             Virama} (47)
   \p{Cans}                \p{Canadian_Aboriginal} (=
                             \p{Script_Extensions=
                             Canadian_Aboriginal}) (710)
   \p{Cari}                \p{Carian} (= \p{Script_Extensions=
                             Carian}) (NOT \p{Block=Carian}) (49)
   \p{Carian}              \p{Script_Extensions=Carian} (Short:
                             \p{Cari}; NOT \p{Block=Carian}) (49)
   \p{Case_Ignorable}      \p{Case_Ignorable=Y} (Short: \p{CI}) (2240)
   \p{Case_Ignorable: N*}  (Short: \p{CI=N}, \P{CI}) (1_111_872 plus
                             all above-Unicode code points)
   \p{Case_Ignorable: Y*}  (Short: \p{CI=Y}, \p{CI}) (2240)
   \p{Cased}               \p{Cased=Y} (4105)
   \p{Cased: N*}           (Single: \P{Cased}) (1_110_007 plus all
                             above-Unicode code points)
   \p{Cased: Y*}           (Single: \p{Cased}) (4105)
   \p{Cased_Letter}        \p{General_Category=Cased_Letter} (Short:
                             \p{LC}) (3796)
   \p{Category: *}         \p{General_Category: *}
   \p{Caucasian_Albanian}  \p{Script_Extensions=Caucasian_Albanian}
                             (Short: \p{Aghb}; NOT \p{Block=
                             Caucasian_Albanian}) (53)
   \p{Cc}                  \p{XPosixCntrl} (= \p{General_Category=
                             Control}) (65)
   \p{Ccc: *}              \p{Canonical_Combining_Class: *}
   \p{CE}                  \p{Composition_Exclusion} (=
                             \p{Composition_Exclusion=Y}) (81)
   \p{CE: *}               \p{Composition_Exclusion: *}
   \p{Cf}                  \p{Format} (= \p{General_Category=Format})
                             (151)
   \p{Chakma}              \p{Script_Extensions=Chakma} (Short:
                             \p{Cakm}; NOT \p{Block=Chakma}) (87)
   \p{Cham}                \p{Script_Extensions=Cham} (NOT \p{Block=
                             Cham}) (83)
   \p{Changes_When_Casefolded} \p{Changes_When_Casefolded=Y} (Short:
                             \p{CWCF}) (1377)
   \p{Changes_When_Casefolded: N*} (Short: \p{CWCF=N}, \P{CWCF})
                             (1_112_735 plus all above-Unicode code
                             points)
   \p{Changes_When_Casefolded: Y*} (Short: \p{CWCF=Y}, \p{CWCF})
                             (1377)
   \p{Changes_When_Casemapped} \p{Changes_When_Casemapped=Y} (Short:
                             \p{CWCM}) (2669)
   \p{Changes_When_Casemapped: N*} (Short: \p{CWCM=N}, \P{CWCM})
                             (1_111_443 plus all above-Unicode code
                             points)
   \p{Changes_When_Casemapped: Y*} (Short: \p{CWCM=Y}, \p{CWCM})
                             (2669)
   \p{Changes_When_Lowercased} \p{Changes_When_Lowercased=Y} (Short:
                             \p{CWL}) (1304)
   \p{Changes_When_Lowercased: N*} (Short: \p{CWL=N}, \P{CWL})
                             (1_112_808 plus all above-Unicode code
                             points)
   \p{Changes_When_Lowercased: Y*} (Short: \p{CWL=Y}, \p{CWL}) (1304)
   \p{Changes_When_NFKC_Casefolded} \p{Changes_When_NFKC_Casefolded=
                             Y} (Short: \p{CWKCF}) (10_227)
   \p{Changes_When_NFKC_Casefolded: N*} (Short: \p{CWKCF=N},
                             \P{CWKCF}) (1_103_885 plus all above-
                             Unicode code points)
   \p{Changes_When_NFKC_Casefolded: Y*} (Short: \p{CWKCF=Y},
                             \p{CWKCF}) (10_227)
   \p{Changes_When_Titlecased} \p{Changes_When_Titlecased=Y} (Short:
                             \p{CWT}) (1369)
   \p{Changes_When_Titlecased: N*} (Short: \p{CWT=N}, \P{CWT})
                             (1_112_743 plus all above-Unicode code
                             points)
   \p{Changes_When_Titlecased: Y*} (Short: \p{CWT=Y}, \p{CWT}) (1369)
   \p{Changes_When_Uppercased} \p{Changes_When_Uppercased=Y} (Short:
                             \p{CWU}) (1396)
   \p{Changes_When_Uppercased: N*} (Short: \p{CWU=N}, \P{CWU})
                             (1_112_716 plus all above-Unicode code
                             points)
   \p{Changes_When_Uppercased: Y*} (Short: \p{CWU=Y}, \p{CWU}) (1396)
   \p{Cher}                \p{Cherokee} (= \p{Script_Extensions=
                             Cherokee}) (NOT \p{Block=Cherokee}) (172)
   \p{Cherokee}            \p{Script_Extensions=Cherokee} (Short:
                             \p{Cher}; NOT \p{Block=Cherokee}) (172)
 X \p{Cherokee_Sup}        \p{Cherokee_Supplement} (= \p{Block=
                             Cherokee_Supplement}) (80)
 X \p{Cherokee_Supplement} \p{Block=Cherokee_Supplement} (Short:
                             \p{InCherokeeSup}) (80)
   \p{CI}                  \p{Case_Ignorable} (= \p{Case_Ignorable=
                             Y}) (2240)
   \p{CI: *}               \p{Case_Ignorable: *}
 X \p{CJK}                 \p{CJK_Unified_Ideographs} (= \p{Block=
                             CJK_Unified_Ideographs}) (20_992)
 X \p{CJK_Compat}          \p{CJK_Compatibility} (= \p{Block=
                             CJK_Compatibility}) (256)
 X \p{CJK_Compat_Forms}    \p{CJK_Compatibility_Forms} (= \p{Block=
                             CJK_Compatibility_Forms}) (32)
 X \p{CJK_Compat_Ideographs} \p{CJK_Compatibility_Ideographs} (=
                             \p{Block=CJK_Compatibility_Ideographs})
                             (512)
 X \p{CJK_Compat_Ideographs_Sup}
                             \p{CJK_Compatibility_Ideographs_-
                             Supplement} (= \p{Block=
                             CJK_Compatibility_Ideographs_-
                             Supplement}) (544)
 X \p{CJK_Compatibility}   \p{Block=CJK_Compatibility} (Short:
                             \p{InCJKCompat}) (256)
 X \p{CJK_Compatibility_Forms} \p{Block=CJK_Compatibility_Forms}
                             (Short: \p{InCJKCompatForms}) (32)
 X \p{CJK_Compatibility_Ideographs} \p{Block=
                             CJK_Compatibility_Ideographs} (Short:
                             \p{InCJKCompatIdeographs}) (512)
 X \p{CJK_Compatibility_Ideographs_Supplement} \p{Block=
                             CJK_Compatibility_Ideographs_Supplement}
                             (Short: \p{InCJKCompatIdeographsSup})
                             (544)
 X \p{CJK_Ext_A}           \p{CJK_Unified_Ideographs_Extension_A} (=
                             \p{Block=
                             CJK_Unified_Ideographs_Extension_A})
                             (6592)
 X \p{CJK_Ext_B}           \p{CJK_Unified_Ideographs_Extension_B} (=
                             \p{Block=
                             CJK_Unified_Ideographs_Extension_B})
                             (42_720)
 X \p{CJK_Ext_C}           \p{CJK_Unified_Ideographs_Extension_C} (=
                             \p{Block=
                             CJK_Unified_Ideographs_Extension_C})
                             (4160)
 X \p{CJK_Ext_D}           \p{CJK_Unified_Ideographs_Extension_D} (=
                             \p{Block=
                             CJK_Unified_Ideographs_Extension_D})
                             (224)
 X \p{CJK_Ext_E}           \p{CJK_Unified_Ideographs_Extension_E} (=
                             \p{Block=
                             CJK_Unified_Ideographs_Extension_E})
                             (5776)
 X \p{CJK_Radicals_Sup}    \p{CJK_Radicals_Supplement} (= \p{Block=
                             CJK_Radicals_Supplement}) (128)
 X \p{CJK_Radicals_Supplement} \p{Block=CJK_Radicals_Supplement}
                             (Short: \p{InCJKRadicalsSup}) (128)
 X \p{CJK_Strokes}         \p{Block=CJK_Strokes} (48)
 X \p{CJK_Symbols}         \p{CJK_Symbols_And_Punctuation} (=
                             \p{Block=CJK_Symbols_And_Punctuation})
                             (64)
 X \p{CJK_Symbols_And_Punctuation} \p{Block=
                             CJK_Symbols_And_Punctuation} (Short:
                             \p{InCJKSymbols}) (64)
 X \p{CJK_Unified_Ideographs} \p{Block=CJK_Unified_Ideographs}
                             (Short: \p{InCJK}) (20_992)
 X \p{CJK_Unified_Ideographs_Extension_A} \p{Block=
                             CJK_Unified_Ideographs_Extension_A}
                             (Short: \p{InCJKExtA}) (6592)
 X \p{CJK_Unified_Ideographs_Extension_B} \p{Block=
                             CJK_Unified_Ideographs_Extension_B}
                             (Short: \p{InCJKExtB}) (42_720)
 X \p{CJK_Unified_Ideographs_Extension_C} \p{Block=
                             CJK_Unified_Ideographs_Extension_C}
                             (Short: \p{InCJKExtC}) (4160)
 X \p{CJK_Unified_Ideographs_Extension_D} \p{Block=
                             CJK_Unified_Ideographs_Extension_D}
                             (Short: \p{InCJKExtD}) (224)
 X \p{CJK_Unified_Ideographs_Extension_E} \p{Block=
                             CJK_Unified_Ideographs_Extension_E}
                             (Short: \p{InCJKExtE}) (5776)
   \p{Close_Punctuation}   \p{General_Category=Close_Punctuation}
                             (Short: \p{Pe}) (73)
   \p{Cn}                  \p{Unassigned} (= \p{General_Category=
                             Unassigned}) (846_359 plus all above-
                             Unicode code points)
   \p{Cntrl}               \p{XPosixCntrl} (= \p{General_Category=
                             Control}) (65)
   \p{Co}                  \p{Private_Use} (= \p{General_Category=
                             Private_Use}) (NOT \p{Private_Use_Area})
                             (137_468)
 X \p{Combining_Diacritical_Marks} \p{Block=
                             Combining_Diacritical_Marks} (Short:
                             \p{InDiacriticals}) (112)
 X \p{Combining_Diacritical_Marks_Extended} \p{Block=
                             Combining_Diacritical_Marks_Extended}
                             (Short: \p{InDiacriticalsExt}) (80)
 X \p{Combining_Diacritical_Marks_For_Symbols} \p{Block=
                             Combining_Diacritical_Marks_For_Symbols}
                             (Short: \p{InDiacriticalsForSymbols})
                             (48)
 X \p{Combining_Diacritical_Marks_Supplement} \p{Block=
                             Combining_Diacritical_Marks_Supplement}
                             (Short: \p{InDiacriticalsSup}) (64)
 X \p{Combining_Half_Marks} \p{Block=Combining_Half_Marks} (Short:
                             \p{InHalfMarks}) (16)
   \p{Combining_Mark}      \p{Mark} (= \p{General_Category=Mark})
                             (2097)
 X \p{Combining_Marks_For_Symbols}
                             \p{Combining_Diacritical_Marks_For_-
                             Symbols} (= \p{Block=
                             Combining_Diacritical_Marks_For_-
                             Symbols}) (48)
   \p{Common}              \p{Script_Extensions=Common} (Short:
                             \p{Zyyy}) (6864)
 X \p{Common_Indic_Number_Forms} \p{Block=Common_Indic_Number_Forms}
                             (Short: \p{InIndicNumberForms}) (16)
   \p{Comp_Ex}             \p{Full_Composition_Exclusion} (=
                             \p{Full_Composition_Exclusion=Y}) (1120)
   \p{Comp_Ex: *}          \p{Full_Composition_Exclusion: *}
 X \p{Compat_Jamo}         \p{Hangul_Compatibility_Jamo} (= \p{Block=
                             Hangul_Compatibility_Jamo}) (96)
   \p{Composition_Exclusion} \p{Composition_Exclusion=Y} (Short:
                             \p{CE}) (81)
   \p{Composition_Exclusion: N*} (Short: \p{CE=N}, \P{CE}) (1_114_031
                             plus all above-Unicode code points)
   \p{Composition_Exclusion: Y*} (Short: \p{CE=Y}, \p{CE}) (81)
   \p{Connector_Punctuation} \p{General_Category=
                             Connector_Punctuation} (Short: \p{Pc})
                             (10)
   \p{Control}             \p{XPosixCntrl} (= \p{General_Category=
                             Control}) (65)
 X \p{Control_Pictures}    \p{Block=Control_Pictures} (64)
   \p{Copt}                \p{Coptic} (= \p{Script_Extensions=
                             Coptic}) (NOT \p{Block=Coptic}) (165)
   \p{Coptic}              \p{Script_Extensions=Coptic} (Short:
                             \p{Copt}; NOT \p{Block=Coptic}) (165)
 X \p{Coptic_Epact_Numbers} \p{Block=Coptic_Epact_Numbers} (32)
 X \p{Counting_Rod}        \p{Counting_Rod_Numerals} (= \p{Block=
                             Counting_Rod_Numerals}) (32)
 X \p{Counting_Rod_Numerals} \p{Block=Counting_Rod_Numerals} (Short:
                             \p{InCountingRod}) (32)
   \p{Cprt}                \p{Cypriot} (= \p{Script_Extensions=
                             Cypriot}) (112)
   \p{Cs}                  \p{Surrogate} (= \p{General_Category=
                             Surrogate}) (2048)
   \p{Cuneiform}           \p{Script_Extensions=Cuneiform} (Short:
                             \p{Xsux}; NOT \p{Block=Cuneiform}) (1234)
 X \p{Cuneiform_Numbers}   \p{Cuneiform_Numbers_And_Punctuation} (=
                             \p{Block=
                             Cuneiform_Numbers_And_Punctuation}) (128)
 X \p{Cuneiform_Numbers_And_Punctuation} \p{Block=
                             Cuneiform_Numbers_And_Punctuation}
                             (Short: \p{InCuneiformNumbers}) (128)
   \p{Currency_Symbol}     \p{General_Category=Currency_Symbol}
                             (Short: \p{Sc}) (53)
 X \p{Currency_Symbols}    \p{Block=Currency_Symbols} (48)
   \p{CWCF}                \p{Changes_When_Casefolded} (=
                             \p{Changes_When_Casefolded=Y}) (1377)
   \p{CWCF: *}             \p{Changes_When_Casefolded: *}
   \p{CWCM}                \p{Changes_When_Casemapped} (=
                             \p{Changes_When_Casemapped=Y}) (2669)
   \p{CWCM: *}             \p{Changes_When_Casemapped: *}
   \p{CWKCF}               \p{Changes_When_NFKC_Casefolded} (=
                             \p{Changes_When_NFKC_Casefolded=Y})
                             (10_227)
   \p{CWKCF: *}            \p{Changes_When_NFKC_Casefolded: *}
   \p{CWL}                 \p{Changes_When_Lowercased} (=
                             \p{Changes_When_Lowercased=Y}) (1304)
   \p{CWL: *}              \p{Changes_When_Lowercased: *}
   \p{CWT}                 \p{Changes_When_Titlecased} (=
                             \p{Changes_When_Titlecased=Y}) (1369)
   \p{CWT: *}              \p{Changes_When_Titlecased: *}
   \p{CWU}                 \p{Changes_When_Uppercased} (=
                             \p{Changes_When_Uppercased=Y}) (1396)
   \p{CWU: *}              \p{Changes_When_Uppercased: *}
   \p{Cypriot}             \p{Script_Extensions=Cypriot} (Short:
                             \p{Cprt}) (112)
 X \p{Cypriot_Syllabary}   \p{Block=Cypriot_Syllabary} (64)
   \p{Cyrillic}            \p{Script_Extensions=Cyrillic} (Short:
                             \p{Cyrl}; NOT \p{Block=Cyrillic}) (446)
 X \p{Cyrillic_Ext_A}      \p{Cyrillic_Extended_A} (= \p{Block=
                             Cyrillic_Extended_A}) (32)
 X \p{Cyrillic_Ext_B}      \p{Cyrillic_Extended_B} (= \p{Block=
                             Cyrillic_Extended_B}) (96)
 X \p{Cyrillic_Ext_C}      \p{Cyrillic_Extended_C} (= \p{Block=
                             Cyrillic_Extended_C}) (16)
 X \p{Cyrillic_Extended_A} \p{Block=Cyrillic_Extended_A} (Short:
                             \p{InCyrillicExtA}) (32)
 X \p{Cyrillic_Extended_B} \p{Block=Cyrillic_Extended_B} (Short:
                             \p{InCyrillicExtB}) (96)
 X \p{Cyrillic_Extended_C} \p{Block=Cyrillic_Extended_C} (Short:
                             \p{InCyrillicExtC}) (16)
 X \p{Cyrillic_Sup}        \p{Cyrillic_Supplement} (= \p{Block=
                             Cyrillic_Supplement}) (48)
 X \p{Cyrillic_Supplement} \p{Block=Cyrillic_Supplement} (Short:
                             \p{InCyrillicSup}) (48)
 X \p{Cyrillic_Supplementary} \p{Cyrillic_Supplement} (= \p{Block=
                             Cyrillic_Supplement}) (48)
   \p{Cyrl}                \p{Cyrillic} (= \p{Script_Extensions=
                             Cyrillic}) (NOT \p{Block=Cyrillic}) (446)
   \p{Dash}                \p{Dash=Y} (28)
   \p{Dash: N*}            (Single: \P{Dash}) (1_114_084 plus all
                             above-Unicode code points)
   \p{Dash: Y*}            (Single: \p{Dash}) (28)
   \p{Dash_Punctuation}    \p{General_Category=Dash_Punctuation}
                             (Short: \p{Pd}) (24)
   \p{Decimal_Number}      \p{XPosixDigit} (= \p{General_Category=
                             Decimal_Number}) (580)
   \p{Decomposition_Type: Can} \p{Decomposition_Type=Canonical}
                             (13_232)
   \p{Decomposition_Type: Canonical} (Short: \p{Dt=Can}) (13_232)
   \p{Decomposition_Type: Circle} (Short: \p{Dt=Enc}) (240)
   \p{Decomposition_Type: Com} \p{Decomposition_Type=Compat} (720)
   \p{Decomposition_Type: Compat} (Short: \p{Dt=Com}) (720)
   \p{Decomposition_Type: Enc} \p{Decomposition_Type=Circle} (240)
   \p{Decomposition_Type: Fin} \p{Decomposition_Type=Final} (240)
   \p{Decomposition_Type: Final} (Short: \p{Dt=Fin}) (240)
   \p{Decomposition_Type: Font} (Short: \p{Dt=Font}) (1184)
   \p{Decomposition_Type: Fra} \p{Decomposition_Type=Fraction} (20)
   \p{Decomposition_Type: Fraction} (Short: \p{Dt=Fra}) (20)
   \p{Decomposition_Type: Init} \p{Decomposition_Type=Initial} (171)
   \p{Decomposition_Type: Initial} (Short: \p{Dt=Init}) (171)
   \p{Decomposition_Type: Iso} \p{Decomposition_Type=Isolated} (238)
   \p{Decomposition_Type: Isolated} (Short: \p{Dt=Iso}) (238)
   \p{Decomposition_Type: Med} \p{Decomposition_Type=Medial} (82)
   \p{Decomposition_Type: Medial} (Short: \p{Dt=Med}) (82)
   \p{Decomposition_Type: Nar} \p{Decomposition_Type=Narrow} (122)
   \p{Decomposition_Type: Narrow} (Short: \p{Dt=Nar}) (122)
   \p{Decomposition_Type: Nb} \p{Decomposition_Type=Nobreak} (5)
   \p{Decomposition_Type: Nobreak} (Short: \p{Dt=Nb}) (5)
   \p{Decomposition_Type: Non_Canon} \p{Decomposition_Type=
                             Non_Canonical} (Perl extension) (3662)
   \p{Decomposition_Type: Non_Canonical} Union of all non-canonical
                             decompositions (Short: \p{Dt=NonCanon})
                             (Perl extension) (3662)
   \p{Decomposition_Type: None} (Short: \p{Dt=None}) (1_097_218 plus
                             all above-Unicode code points)
   \p{Decomposition_Type: Small} (Short: \p{Dt=Sml}) (26)
   \p{Decomposition_Type: Sml} \p{Decomposition_Type=Small} (26)
   \p{Decomposition_Type: Sqr} \p{Decomposition_Type=Square} (285)
   \p{Decomposition_Type: Square} (Short: \p{Dt=Sqr}) (285)
   \p{Decomposition_Type: Sub} (Short: \p{Dt=Sub}) (38)
   \p{Decomposition_Type: Sup} \p{Decomposition_Type=Super} (152)
   \p{Decomposition_Type: Super} (Short: \p{Dt=Sup}) (152)
   \p{Decomposition_Type: Vert} \p{Decomposition_Type=Vertical} (35)
   \p{Decomposition_Type: Vertical} (Short: \p{Dt=Vert}) (35)
   \p{Decomposition_Type: Wide} (Short: \p{Dt=Wide}) (104)
   \p{Default_Ignorable_Code_Point} \p{Default_Ignorable_Code_Point=
                             Y} (Short: \p{DI}) (4173)
   \p{Default_Ignorable_Code_Point: N*} (Short: \p{DI=N}, \P{DI})
                             (1_109_939 plus all above-Unicode code
                             points)
   \p{Default_Ignorable_Code_Point: Y*} (Short: \p{DI=Y}, \p{DI})
                             (4173)
   \p{Dep}                 \p{Deprecated} (= \p{Deprecated=Y}) (15)
   \p{Dep: *}              \p{Deprecated: *}
   \p{Deprecated}          \p{Deprecated=Y} (Short: \p{Dep}) (15)
   \p{Deprecated: N*}      (Short: \p{Dep=N}, \P{Dep}) (1_114_097
                             plus all above-Unicode code points)
   \p{Deprecated: Y*}      (Short: \p{Dep=Y}, \p{Dep}) (15)
   \p{Deseret}             \p{Script_Extensions=Deseret} (Short:
                             \p{Dsrt}) (80)
   \p{Deva}                \p{Devanagari} (= \p{Script_Extensions=
                             Devanagari}) (NOT \p{Block=Devanagari})
                             (210)
   \p{Devanagari}          \p{Script_Extensions=Devanagari} (Short:
                             \p{Deva}; NOT \p{Block=Devanagari}) (210)
 X \p{Devanagari_Ext}      \p{Devanagari_Extended} (= \p{Block=
                             Devanagari_Extended}) (32)
 X \p{Devanagari_Extended} \p{Block=Devanagari_Extended} (Short:
                             \p{InDevanagariExt}) (32)
   \p{DI}                  \p{Default_Ignorable_Code_Point} (=
                             \p{Default_Ignorable_Code_Point=Y})
                             (4173)
   \p{DI: *}               \p{Default_Ignorable_Code_Point: *}
   \p{Dia}                 \p{Diacritic} (= \p{Diacritic=Y}) (782)
   \p{Dia: *}              \p{Diacritic: *}
   \p{Diacritic}           \p{Diacritic=Y} (Short: \p{Dia}) (782)
   \p{Diacritic: N*}       (Short: \p{Dia=N}, \P{Dia}) (1_113_330
                             plus all above-Unicode code points)
   \p{Diacritic: Y*}       (Short: \p{Dia=Y}, \p{Dia}) (782)
 X \p{Diacriticals}        \p{Combining_Diacritical_Marks} (=
                             \p{Block=Combining_Diacritical_Marks})
                             (112)
 X \p{Diacriticals_Ext}    \p{Combining_Diacritical_Marks_Extended}
                             (= \p{Block=
                             Combining_Diacritical_Marks_Extended})
                             (80)
 X \p{Diacriticals_For_Symbols}
                             \p{Combining_Diacritical_Marks_For_-
                             Symbols} (= \p{Block=
                             Combining_Diacritical_Marks_For_-
                             Symbols}) (48)
 X \p{Diacriticals_Sup}    \p{Combining_Diacritical_Marks_Supplement}
                             (= \p{Block=
                             Combining_Diacritical_Marks_Supplement})
                             (64)
   \p{Digit}               \p{XPosixDigit} (= \p{General_Category=
                             Decimal_Number}) (580)
 X \p{Dingbats}            \p{Block=Dingbats} (192)
 X \p{Domino}              \p{Domino_Tiles} (= \p{Block=
                             Domino_Tiles}) (112)
 X \p{Domino_Tiles}        \p{Block=Domino_Tiles} (Short:
                             \p{InDomino}) (112)
   \p{Dsrt}                \p{Deseret} (= \p{Script_Extensions=
                             Deseret}) (80)
   \p{Dt: *}               \p{Decomposition_Type: *}
   \p{Dupl}                \p{Duployan} (= \p{Script_Extensions=
                             Duployan}) (NOT \p{Block=Duployan}) (147)
   \p{Duployan}            \p{Script_Extensions=Duployan} (Short:
                             \p{Dupl}; NOT \p{Block=Duployan}) (147)
   \p{Ea: *}               \p{East_Asian_Width: *}
 X \p{Early_Dynastic_Cuneiform} \p{Block=Early_Dynastic_Cuneiform}
                             (208)
   \p{East_Asian_Width: A} \p{East_Asian_Width=Ambiguous} (138_739)
   \p{East_Asian_Width: Ambiguous} (Short: \p{Ea=A}) (138_739)
   \p{East_Asian_Width: F} \p{East_Asian_Width=Fullwidth} (104)
   \p{East_Asian_Width: Fullwidth} (Short: \p{Ea=F}) (104)
   \p{East_Asian_Width: H} \p{East_Asian_Width=Halfwidth} (123)
   \p{East_Asian_Width: Halfwidth} (Short: \p{Ea=H}) (123)
   \p{East_Asian_Width: N} \p{East_Asian_Width=Neutral} (794_146 plus
                             all above-Unicode code points)
   \p{East_Asian_Width: Na} \p{East_Asian_Width=Narrow} (111)
   \p{East_Asian_Width: Narrow} (Short: \p{Ea=Na}) (111)
   \p{East_Asian_Width: Neutral} (Short: \p{Ea=N}) (794_146 plus all
                             above-Unicode code points)
   \p{East_Asian_Width: W} \p{East_Asian_Width=Wide} (180_889)
   \p{East_Asian_Width: Wide} (Short: \p{Ea=W}) (180_889)
   \p{Egyp}                \p{Egyptian_Hieroglyphs} (=
                             \p{Script_Extensions=
                             Egyptian_Hieroglyphs}) (NOT \p{Block=
                             Egyptian_Hieroglyphs}) (1071)
   \p{Egyptian_Hieroglyphs} \p{Script_Extensions=
                             Egyptian_Hieroglyphs} (Short: \p{Egyp};
                             NOT \p{Block=Egyptian_Hieroglyphs})
                             (1071)
   \p{Elba}                \p{Elbasan} (= \p{Script_Extensions=
                             Elbasan}) (NOT \p{Block=Elbasan}) (40)
   \p{Elbasan}             \p{Script_Extensions=Elbasan} (Short:
                             \p{Elba}; NOT \p{Block=Elbasan}) (40)
 X \p{Emoticons}           \p{Block=Emoticons} (80)
 X \p{Enclosed_Alphanum}   \p{Enclosed_Alphanumerics} (= \p{Block=
                             Enclosed_Alphanumerics}) (160)
 X \p{Enclosed_Alphanum_Sup} \p{Enclosed_Alphanumeric_Supplement} (=
                             \p{Block=
                             Enclosed_Alphanumeric_Supplement}) (256)
 X \p{Enclosed_Alphanumeric_Supplement} \p{Block=
                             Enclosed_Alphanumeric_Supplement}
                             (Short: \p{InEnclosedAlphanumSup}) (256)
 X \p{Enclosed_Alphanumerics} \p{Block=Enclosed_Alphanumerics}
                             (Short: \p{InEnclosedAlphanum}) (160)
 X \p{Enclosed_CJK}        \p{Enclosed_CJK_Letters_And_Months} (=
                             \p{Block=
                             Enclosed_CJK_Letters_And_Months}) (256)
 X \p{Enclosed_CJK_Letters_And_Months} \p{Block=
                             Enclosed_CJK_Letters_And_Months} (Short:
                             \p{InEnclosedCJK}) (256)
 X \p{Enclosed_Ideographic_Sup} \p{Enclosed_Ideographic_Supplement}
                             (= \p{Block=
                             Enclosed_Ideographic_Supplement}) (256)
 X \p{Enclosed_Ideographic_Supplement} \p{Block=
                             Enclosed_Ideographic_Supplement} (Short:
                             \p{InEnclosedIdeographicSup}) (256)
   \p{Enclosing_Mark}      \p{General_Category=Enclosing_Mark}
                             (Short: \p{Me}) (13)
   \p{Ethi}                \p{Ethiopic} (= \p{Script_Extensions=
                             Ethiopic}) (NOT \p{Block=Ethiopic}) (495)
   \p{Ethiopic}            \p{Script_Extensions=Ethiopic} (Short:
                             \p{Ethi}; NOT \p{Block=Ethiopic}) (495)
 X \p{Ethiopic_Ext}        \p{Ethiopic_Extended} (= \p{Block=
                             Ethiopic_Extended}) (96)
 X \p{Ethiopic_Ext_A}      \p{Ethiopic_Extended_A} (= \p{Block=
                             Ethiopic_Extended_A}) (48)
 X \p{Ethiopic_Extended}   \p{Block=Ethiopic_Extended} (Short:
                             \p{InEthiopicExt}) (96)
 X \p{Ethiopic_Extended_A} \p{Block=Ethiopic_Extended_A} (Short:
                             \p{InEthiopicExtA}) (48)
 X \p{Ethiopic_Sup}        \p{Ethiopic_Supplement} (= \p{Block=
                             Ethiopic_Supplement}) (32)
 X \p{Ethiopic_Supplement} \p{Block=Ethiopic_Supplement} (Short:
                             \p{InEthiopicSup}) (32)
   \p{Ext}                 \p{Extender} (= \p{Extender=Y}) (42)
   \p{Ext: *}              \p{Extender: *}
   \p{Extender}            \p{Extender=Y} (Short: \p{Ext}) (42)
   \p{Extender: N*}        (Short: \p{Ext=N}, \P{Ext}) (1_114_070
                             plus all above-Unicode code points)
   \p{Extender: Y*}        (Short: \p{Ext=Y}, \p{Ext}) (42)
   \p{Final_Punctuation}   \p{General_Category=Final_Punctuation}
                             (Short: \p{Pf}) (10)
   \p{Format}              \p{General_Category=Format} (Short:
                             \p{Cf}) (151)
   \p{Full_Composition_Exclusion} \p{Full_Composition_Exclusion=Y}
                             (Short: \p{CompEx}) (1120)
   \p{Full_Composition_Exclusion: N*} (Short: \p{CompEx=N},
                             \P{CompEx}) (1_112_992 plus all above-
                             Unicode code points)
   \p{Full_Composition_Exclusion: Y*} (Short: \p{CompEx=Y},
                             \p{CompEx}) (1120)
   \p{Gc: *}               \p{General_Category: *}
   \p{GCB: *}              \p{Grapheme_Cluster_Break: *}
   \p{General_Category: C} \p{General_Category=Other} (986_091 plus
                             all above-Unicode code points)
   \p{General_Category: Cased_Letter} [\p{Ll}\p{Lu}\p{Lt}] (Short:
                             \p{Gc=LC}, \p{LC}) (3796)
   \p{General_Category: Cc} \p{General_Category=Control} (65)
   \p{General_Category: Cf} \p{General_Category=Format} (151)
   \p{General_Category: Close_Punctuation} (Short: \p{Gc=Pe}, \p{Pe})
                             (73)
   \p{General_Category: Cn} \p{General_Category=Unassigned} (846_359
                             plus all above-Unicode code points)
   \p{General_Category: Cntrl} \p{General_Category=Control} (65)
   \p{General_Category: Co} \p{General_Category=Private_Use} (137_468)
   \p{General_Category: Combining_Mark} \p{General_Category=Mark}
                             (2097)
   \p{General_Category: Connector_Punctuation} (Short: \p{Gc=Pc},
                             \p{Pc}) (10)
   \p{General_Category: Control} (Short: \p{Gc=Cc}, \p{Cc}) (65)
   \p{General_Category: Cs} \p{General_Category=Surrogate} (2048)
   \p{General_Category: Currency_Symbol} (Short: \p{Gc=Sc}, \p{Sc})
                             (53)
   \p{General_Category: Dash_Punctuation} (Short: \p{Gc=Pd}, \p{Pd})
                             (24)
   \p{General_Category: Decimal_Number} (Short: \p{Gc=Nd}, \p{Nd})
                             (580)
   \p{General_Category: Digit} \p{General_Category=Decimal_Number}
                             (580)
   \p{General_Category: Enclosing_Mark} (Short: \p{Gc=Me}, \p{Me})
                             (13)
   \p{General_Category: Final_Punctuation} (Short: \p{Gc=Pf}, \p{Pf})
                             (10)
   \p{General_Category: Format} (Short: \p{Gc=Cf}, \p{Cf}) (151)
   \p{General_Category: Initial_Punctuation} (Short: \p{Gc=Pi},
                             \p{Pi}) (12)
   \p{General_Category: L} \p{General_Category=Letter} (116_766)
 X \p{General_Category: L&} \p{General_Category=Cased_Letter} (3796)
 X \p{General_Category: L_} \p{General_Category=Cased_Letter} Note
                             the trailing '_' matters in spite of
                             loose matching rules. (3796)
   \p{General_Category: LC} \p{General_Category=Cased_Letter} (3796)
   \p{General_Category: Letter} (Short: \p{Gc=L}, \p{L}) (116_766)
   \p{General_Category: Letter_Number} (Short: \p{Gc=Nl}, \p{Nl})
                             (236)
   \p{General_Category: Line_Separator} (Short: \p{Gc=Zl}, \p{Zl}) (1)
   \p{General_Category: Ll} \p{General_Category=Lowercase_Letter}
                             (/i= General_Category=Cased_Letter)
                             (2063)
   \p{General_Category: Lm} \p{General_Category=Modifier_Letter} (249)
   \p{General_Category: Lo} \p{General_Category=Other_Letter}
                             (112_721)
   \p{General_Category: Lowercase_Letter} (Short: \p{Gc=Ll}, \p{Ll};
                             /i= General_Category=Cased_Letter) (2063)
   \p{General_Category: Lt} \p{General_Category=Titlecase_Letter}
                             (/i= General_Category=Cased_Letter) (31)
   \p{General_Category: Lu} \p{General_Category=Uppercase_Letter}
                             (/i= General_Category=Cased_Letter)
                             (1702)
   \p{General_Category: M} \p{General_Category=Mark} (2097)
   \p{General_Category: Mark} (Short: \p{Gc=M}, \p{M}) (2097)
   \p{General_Category: Math_Symbol} (Short: \p{Gc=Sm}, \p{Sm}) (948)
   \p{General_Category: Mc} \p{General_Category=Spacing_Mark} (394)
   \p{General_Category: Me} \p{General_Category=Enclosing_Mark} (13)
   \p{General_Category: Mn} \p{General_Category=Nonspacing_Mark}
                             (1690)
   \p{General_Category: Modifier_Letter} (Short: \p{Gc=Lm}, \p{Lm})
                             (249)
   \p{General_Category: Modifier_Symbol} (Short: \p{Gc=Sk}, \p{Sk})
                             (121)
   \p{General_Category: N} \p{General_Category=Number} (1492)
   \p{General_Category: Nd} \p{General_Category=Decimal_Number} (580)
   \p{General_Category: Nl} \p{General_Category=Letter_Number} (236)
   \p{General_Category: No} \p{General_Category=Other_Number} (676)
   \p{General_Category: Nonspacing_Mark} (Short: \p{Gc=Mn}, \p{Mn})
                             (1690)
   \p{General_Category: Number} (Short: \p{Gc=N}, \p{N}) (1492)
   \p{General_Category: Open_Punctuation} (Short: \p{Gc=Ps}, \p{Ps})
                             (75)
   \p{General_Category: Other} (Short: \p{Gc=C}, \p{C}) (986_091 plus
                             all above-Unicode code points)
   \p{General_Category: Other_Letter} (Short: \p{Gc=Lo}, \p{Lo})
                             (112_721)
   \p{General_Category: Other_Number} (Short: \p{Gc=No}, \p{No}) (676)
   \p{General_Category: Other_Punctuation} (Short: \p{Gc=Po}, \p{Po})
                             (544)
   \p{General_Category: Other_Symbol} (Short: \p{Gc=So}, \p{So})
                             (5777)
   \p{General_Category: P} \p{General_Category=Punctuation} (748)
   \p{General_Category: Paragraph_Separator} (Short: \p{Gc=Zp},
                             \p{Zp}) (1)
   \p{General_Category: Pc} \p{General_Category=
                             Connector_Punctuation} (10)
   \p{General_Category: Pd} \p{General_Category=Dash_Punctuation} (24)
   \p{General_Category: Pe} \p{General_Category=Close_Punctuation}
                             (73)
   \p{General_Category: Pf} \p{General_Category=Final_Punctuation}
                             (10)
   \p{General_Category: Pi} \p{General_Category=Initial_Punctuation}
                             (12)
   \p{General_Category: Po} \p{General_Category=Other_Punctuation}
                             (544)
   \p{General_Category: Private_Use} (Short: \p{Gc=Co}, \p{Co})
                             (137_468)
   \p{General_Category: Ps} \p{General_Category=Open_Punctuation} (75)
   \p{General_Category: Punct} \p{General_Category=Punctuation} (748)
   \p{General_Category: Punctuation} (Short: \p{Gc=P}, \p{P}) (748)
   \p{General_Category: S} \p{General_Category=Symbol} (6899)
   \p{General_Category: Sc} \p{General_Category=Currency_Symbol} (53)
   \p{General_Category: Separator} (Short: \p{Gc=Z}, \p{Z}) (19)
   \p{General_Category: Sk} \p{General_Category=Modifier_Symbol} (121)
   \p{General_Category: Sm} \p{General_Category=Math_Symbol} (948)
   \p{General_Category: So} \p{General_Category=Other_Symbol} (5777)
   \p{General_Category: Space_Separator} (Short: \p{Gc=Zs}, \p{Zs})
                             (17)
   \p{General_Category: Spacing_Mark} (Short: \p{Gc=Mc}, \p{Mc}) (394)
   \p{General_Category: Surrogate} (Short: \p{Gc=Cs}, \p{Cs}) (2048)
   \p{General_Category: Symbol} (Short: \p{Gc=S}, \p{S}) (6899)
   \p{General_Category: Titlecase_Letter} (Short: \p{Gc=Lt}, \p{Lt};
                             /i= General_Category=Cased_Letter) (31)
   \p{General_Category: Unassigned} (Short: \p{Gc=Cn}, \p{Cn})
                             (846_359 plus all above-Unicode code
                             points)
   \p{General_Category: Uppercase_Letter} (Short: \p{Gc=Lu}, \p{Lu};
                             /i= General_Category=Cased_Letter) (1702)
   \p{General_Category: Z} \p{General_Category=Separator} (19)
   \p{General_Category: Zl} \p{General_Category=Line_Separator} (1)
   \p{General_Category: Zp} \p{General_Category=Paragraph_Separator}
                             (1)
   \p{General_Category: Zs} \p{General_Category=Space_Separator} (17)
 X \p{General_Punctuation} \p{Block=General_Punctuation} (Short:
                             \p{InPunctuation}) (112)
 X \p{Geometric_Shapes}    \p{Block=Geometric_Shapes} (96)
 X \p{Geometric_Shapes_Ext} \p{Geometric_Shapes_Extended} (=
                             \p{Block=Geometric_Shapes_Extended})
                             (128)
 X \p{Geometric_Shapes_Extended} \p{Block=Geometric_Shapes_Extended}
                             (Short: \p{InGeometricShapesExt}) (128)
   \p{Geor}                \p{Georgian} (= \p{Script_Extensions=
                             Georgian}) (NOT \p{Block=Georgian}) (129)
   \p{Georgian}            \p{Script_Extensions=Georgian} (Short:
                             \p{Geor}; NOT \p{Block=Georgian}) (129)
 X \p{Georgian_Sup}        \p{Georgian_Supplement} (= \p{Block=
                             Georgian_Supplement}) (48)
 X \p{Georgian_Supplement} \p{Block=Georgian_Supplement} (Short:
                             \p{InGeorgianSup}) (48)
   \p{Glag}                \p{Glagolitic} (= \p{Script_Extensions=
                             Glagolitic}) (NOT \p{Block=Glagolitic})
                             (136)
   \p{Glagolitic}          \p{Script_Extensions=Glagolitic} (Short:
                             \p{Glag}; NOT \p{Block=Glagolitic}) (136)
 X \p{Glagolitic_Sup}      \p{Glagolitic_Supplement} (= \p{Block=
                             Glagolitic_Supplement}) (48)
 X \p{Glagolitic_Supplement} \p{Block=Glagolitic_Supplement} (Short:
                             \p{InGlagoliticSup}) (48)
   \p{Goth}                \p{Gothic} (= \p{Script_Extensions=
                             Gothic}) (NOT \p{Block=Gothic}) (27)
   \p{Gothic}              \p{Script_Extensions=Gothic} (Short:
                             \p{Goth}; NOT \p{Block=Gothic}) (27)
   \p{Gr_Base}             \p{Grapheme_Base} (= \p{Grapheme_Base=Y})
                             (126_288)
   \p{Gr_Base: *}          \p{Grapheme_Base: *}
   \p{Gr_Ext}              \p{Grapheme_Extend} (= \p{Grapheme_Extend=
                             Y}) (1828)
   \p{Gr_Ext: *}           \p{Grapheme_Extend: *}
   \p{Gran}                \p{Grantha} (= \p{Script_Extensions=
                             Grantha}) (NOT \p{Block=Grantha}) (113)
   \p{Grantha}             \p{Script_Extensions=Grantha} (Short:
                             \p{Gran}; NOT \p{Block=Grantha}) (113)
   \p{Graph}               \p{XPosixGraph} (265_621)
   \p{Grapheme_Base}       \p{Grapheme_Base=Y} (Short: \p{GrBase})
                             (126_288)
   \p{Grapheme_Base: N*}   (Short: \p{GrBase=N}, \P{GrBase}) (987_824
                             plus all above-Unicode code points)
   \p{Grapheme_Base: Y*}   (Short: \p{GrBase=Y}, \p{GrBase}) (126_288)
   \p{Grapheme_Cluster_Break: CN} \p{Grapheme_Cluster_Break=Control}
                             (5925)
   \p{Grapheme_Cluster_Break: Control} (Short: \p{GCB=CN}) (5925)
   \p{Grapheme_Cluster_Break: CR} (Short: \p{GCB=CR}) (1)
   \p{Grapheme_Cluster_Break: E_Base} (Short: \p{GCB=EB}) (79)
   \p{Grapheme_Cluster_Break: E_Base_GAZ} (Short: \p{GCB=EBG}) (4)
   \p{Grapheme_Cluster_Break: E_Modifier} (Short: \p{GCB=EM}) (5)
   \p{Grapheme_Cluster_Break: EB} \p{Grapheme_Cluster_Break=E_Base}
                             (79)
   \p{Grapheme_Cluster_Break: EBG} \p{Grapheme_Cluster_Break=
                             E_Base_GAZ} (4)
   \p{Grapheme_Cluster_Break: EM} \p{Grapheme_Cluster_Break=
                             E_Modifier} (5)
   \p{Grapheme_Cluster_Break: EX} \p{Grapheme_Cluster_Break=Extend}
                             (1828)
   \p{Grapheme_Cluster_Break: Extend} (Short: \p{GCB=EX}) (1828)
   \p{Grapheme_Cluster_Break: GAZ} \p{Grapheme_Cluster_Break=
                             Glue_After_Zwj} (3)
   \p{Grapheme_Cluster_Break: Glue_After_Zwj} (Short: \p{GCB=GAZ}) (3)
   \p{Grapheme_Cluster_Break: L} (Short: \p{GCB=L}) (125)
   \p{Grapheme_Cluster_Break: LF} (Short: \p{GCB=LF}) (1)
   \p{Grapheme_Cluster_Break: LV} (Short: \p{GCB=LV}) (399)
   \p{Grapheme_Cluster_Break: LVT} (Short: \p{GCB=LVT}) (10_773)
   \p{Grapheme_Cluster_Break: Other} (Short: \p{GCB=XX}) (1_094_356
                             plus all above-Unicode code points)
   \p{Grapheme_Cluster_Break: PP} \p{Grapheme_Cluster_Break=Prepend}
                             (13)
   \p{Grapheme_Cluster_Break: Prepend} (Short: \p{GCB=PP}) (13)
   \p{Grapheme_Cluster_Break: Regional_Indicator} (Short: \p{GCB=RI})
                             (26)
   \p{Grapheme_Cluster_Break: RI} \p{Grapheme_Cluster_Break=
                             Regional_Indicator} (26)
   \p{Grapheme_Cluster_Break: SM} \p{Grapheme_Cluster_Break=
                             SpacingMark} (341)
   \p{Grapheme_Cluster_Break: SpacingMark} (Short: \p{GCB=SM}) (341)
   \p{Grapheme_Cluster_Break: T} (Short: \p{GCB=T}) (137)
   \p{Grapheme_Cluster_Break: V} (Short: \p{GCB=V}) (95)
   \p{Grapheme_Cluster_Break: XX} \p{Grapheme_Cluster_Break=Other}
                             (1_094_356 plus all above-Unicode code
                             points)
   \p{Grapheme_Cluster_Break: ZWJ} (Short: \p{GCB=ZWJ}) (1)
   \p{Grapheme_Extend}     \p{Grapheme_Extend=Y} (Short: \p{GrExt})
                             (1828)
   \p{Grapheme_Extend: N*} (Short: \p{GrExt=N}, \P{GrExt}) (1_112_284
                             plus all above-Unicode code points)
   \p{Grapheme_Extend: Y*} (Short: \p{GrExt=Y}, \p{GrExt}) (1828)
   \p{Greek}               \p{Script_Extensions=Greek} (Short:
                             \p{Grek}; NOT \p{Greek_And_Coptic}) (522)
 X \p{Greek_And_Coptic}    \p{Block=Greek_And_Coptic} (Short:
                             \p{InGreek}) (144)
 X \p{Greek_Ext}           \p{Greek_Extended} (= \p{Block=
                             Greek_Extended}) (256)
 X \p{Greek_Extended}      \p{Block=Greek_Extended} (Short:
                             \p{InGreekExt}) (256)
   \p{Grek}                \p{Greek} (= \p{Script_Extensions=Greek})
                             (NOT \p{Greek_And_Coptic}) (522)
   \p{Gujarati}            \p{Script_Extensions=Gujarati} (Short:
                             \p{Gujr}; NOT \p{Block=Gujarati}) (99)
   \p{Gujr}                \p{Gujarati} (= \p{Script_Extensions=
                             Gujarati}) (NOT \p{Block=Gujarati}) (99)
   \p{Gurmukhi}            \p{Script_Extensions=Gurmukhi} (Short:
                             \p{Guru}; NOT \p{Block=Gurmukhi}) (93)
   \p{Guru}                \p{Gurmukhi} (= \p{Script_Extensions=
                             Gurmukhi}) (NOT \p{Block=Gurmukhi}) (93)
 X \p{Half_And_Full_Forms} \p{Halfwidth_And_Fullwidth_Forms} (=
                             \p{Block=Halfwidth_And_Fullwidth_Forms})
                             (240)
 X \p{Half_Marks}          \p{Combining_Half_Marks} (= \p{Block=
                             Combining_Half_Marks}) (16)
 X \p{Halfwidth_And_Fullwidth_Forms} \p{Block=
                             Halfwidth_And_Fullwidth_Forms} (Short:
                             \p{InHalfAndFullForms}) (240)
   \p{Han}                 \p{Script_Extensions=Han} (82_013)
   \p{Hang}                \p{Hangul} (= \p{Script_Extensions=
                             Hangul}) (NOT \p{Hangul_Syllables})
                             (11_775)
   \p{Hangul}              \p{Script_Extensions=Hangul} (Short:
                             \p{Hang}; NOT \p{Hangul_Syllables})
                             (11_775)
 X \p{Hangul_Compatibility_Jamo} \p{Block=Hangul_Compatibility_Jamo}
                             (Short: \p{InCompatJamo}) (96)
 X \p{Hangul_Jamo}         \p{Block=Hangul_Jamo} (Short: \p{InJamo})
                             (256)
 X \p{Hangul_Jamo_Extended_A} \p{Block=Hangul_Jamo_Extended_A}
                             (Short: \p{InJamoExtA}) (32)
 X \p{Hangul_Jamo_Extended_B} \p{Block=Hangul_Jamo_Extended_B}
                             (Short: \p{InJamoExtB}) (80)
   \p{Hangul_Syllable_Type: L} \p{Hangul_Syllable_Type=Leading_Jamo}
                             (125)
   \p{Hangul_Syllable_Type: Leading_Jamo} (Short: \p{Hst=L}) (125)
   \p{Hangul_Syllable_Type: LV} \p{Hangul_Syllable_Type=LV_Syllable}
                             (399)
   \p{Hangul_Syllable_Type: LV_Syllable} (Short: \p{Hst=LV}) (399)
   \p{Hangul_Syllable_Type: LVT} \p{Hangul_Syllable_Type=
                             LVT_Syllable} (10_773)
   \p{Hangul_Syllable_Type: LVT_Syllable} (Short: \p{Hst=LVT})
                             (10_773)
   \p{Hangul_Syllable_Type: NA} \p{Hangul_Syllable_Type=
                             Not_Applicable} (1_102_583 plus all
                             above-Unicode code points)
   \p{Hangul_Syllable_Type: Not_Applicable} (Short: \p{Hst=NA})
                             (1_102_583 plus all above-Unicode code
                             points)
   \p{Hangul_Syllable_Type: T} \p{Hangul_Syllable_Type=Trailing_Jamo}
                             (137)
   \p{Hangul_Syllable_Type: Trailing_Jamo} (Short: \p{Hst=T}) (137)
   \p{Hangul_Syllable_Type: V} \p{Hangul_Syllable_Type=Vowel_Jamo}
                             (95)
   \p{Hangul_Syllable_Type: Vowel_Jamo} (Short: \p{Hst=V}) (95)
 X \p{Hangul_Syllables}    \p{Block=Hangul_Syllables} (Short:
                             \p{InHangul}) (11_184)
   \p{Hani}                \p{Han} (= \p{Script_Extensions=Han})
                             (82_013)
   \p{Hano}                \p{Hanunoo} (= \p{Script_Extensions=
                             Hanunoo}) (NOT \p{Block=Hanunoo}) (23)
   \p{Hanunoo}             \p{Script_Extensions=Hanunoo} (Short:
                             \p{Hano}; NOT \p{Block=Hanunoo}) (23)
   \p{Hatr}                \p{Hatran} (= \p{Script_Extensions=
                             Hatran}) (NOT \p{Block=Hatran}) (26)
   \p{Hatran}              \p{Script_Extensions=Hatran} (Short:
                             \p{Hatr}; NOT \p{Block=Hatran}) (26)
   \p{Hebr}                \p{Hebrew} (= \p{Script_Extensions=
                             Hebrew}) (NOT \p{Block=Hebrew}) (133)
   \p{Hebrew}              \p{Script_Extensions=Hebrew} (Short:
                             \p{Hebr}; NOT \p{Block=Hebrew}) (133)
   \p{Hex}                 \p{XPosixXDigit} (= \p{Hex_Digit=Y}) (44)
   \p{Hex: *}              \p{Hex_Digit: *}
   \p{Hex_Digit}           \p{XPosixXDigit} (= \p{Hex_Digit=Y}) (44)
   \p{Hex_Digit: N*}       (Short: \p{Hex=N}, \P{Hex}) (1_114_068
                             plus all above-Unicode code points)
   \p{Hex_Digit: Y*}       (Short: \p{Hex=Y}, \p{Hex}) (44)
 X \p{High_Private_Use_Surrogates} \p{Block=
                             High_Private_Use_Surrogates} (Short:
                             \p{InHighPUSurrogates}) (128)
 X \p{High_PU_Surrogates}  \p{High_Private_Use_Surrogates} (=
                             \p{Block=High_Private_Use_Surrogates})
                             (128)
 X \p{High_Surrogates}     \p{Block=High_Surrogates} (896)
   \p{Hira}                \p{Hiragana} (= \p{Script_Extensions=
                             Hiragana}) (NOT \p{Block=Hiragana}) (143)
   \p{Hiragana}            \p{Script_Extensions=Hiragana} (Short:
                             \p{Hira}; NOT \p{Block=Hiragana}) (143)
   \p{Hluw}                \p{Anatolian_Hieroglyphs} (=
                             \p{Script_Extensions=
                             Anatolian_Hieroglyphs}) (NOT \p{Block=
                             Anatolian_Hieroglyphs}) (583)
   \p{Hmng}                \p{Pahawh_Hmong} (= \p{Script_Extensions=
                             Pahawh_Hmong}) (NOT \p{Block=
                             Pahawh_Hmong}) (127)
   \p{HorizSpace}          \p{XPosixBlank} (18)
   \p{Hst: *}              \p{Hangul_Syllable_Type: *}
   \p{Hung}                \p{Old_Hungarian} (= \p{Script_Extensions=
                             Old_Hungarian}) (NOT \p{Block=
                             Old_Hungarian}) (108)
 D \p{Hyphen}              \p{Hyphen=Y} (11)
 D \p{Hyphen: N*}          Supplanted by Line_Break property values;
                             see www.unicode.org/reports/tr14
                             (Single: \P{Hyphen}) (1_114_101 plus all
                             above-Unicode code points)
 D \p{Hyphen: Y*}          Supplanted by Line_Break property values;
                             see www.unicode.org/reports/tr14
                             (Single: \p{Hyphen}) (11)
   \p{ID_Continue}         \p{ID_Continue=Y} (Short: \p{IDC}; NOT
                             \p{Ideographic_Description_Characters})
                             (119_691)
   \p{ID_Continue: N*}     (Short: \p{IDC=N}, \P{IDC}) (994_421 plus
                             all above-Unicode code points)
   \p{ID_Continue: Y*}     (Short: \p{IDC=Y}, \p{IDC}) (119_691)
   \p{ID_Start}            \p{ID_Start=Y} (Short: \p{IDS}) (117_007)
   \p{ID_Start: N*}        (Short: \p{IDS=N}, \P{IDS}) (997_105 plus
                             all above-Unicode code points)
   \p{ID_Start: Y*}        (Short: \p{IDS=Y}, \p{IDS}) (117_007)
   \p{IDC}                 \p{ID_Continue} (= \p{ID_Continue=Y}) (NOT
                             \p{Ideographic_Description_Characters})
                             (119_691)
   \p{IDC: *}              \p{ID_Continue: *}
   \p{Ideo}                \p{Ideographic} (= \p{Ideographic=Y})
                             (88_284)
   \p{Ideo: *}             \p{Ideographic: *}
   \p{Ideographic}         \p{Ideographic=Y} (Short: \p{Ideo})
                             (88_284)
   \p{Ideographic: N*}     (Short: \p{Ideo=N}, \P{Ideo}) (1_025_828
                             plus all above-Unicode code points)
   \p{Ideographic: Y*}     (Short: \p{Ideo=Y}, \p{Ideo}) (88_284)
 X \p{Ideographic_Description_Characters} \p{Block=
                             Ideographic_Description_Characters}
                             (Short: \p{InIDC}) (16)
 X \p{Ideographic_Symbols} \p{Ideographic_Symbols_And_Punctuation} (=
                             \p{Block=
                             Ideographic_Symbols_And_Punctuation})
                             (32)
 X \p{Ideographic_Symbols_And_Punctuation} \p{Block=
                             Ideographic_Symbols_And_Punctuation}
                             (Short: \p{InIdeographicSymbols}) (32)
   \p{IDS}                 \p{ID_Start} (= \p{ID_Start=Y}) (117_007)
   \p{IDS: *}              \p{ID_Start: *}
   \p{IDS_Binary_Operator} \p{IDS_Binary_Operator=Y} (Short:
                             \p{IDSB}) (10)
   \p{IDS_Binary_Operator: N*} (Short: \p{IDSB=N}, \P{IDSB})
                             (1_114_102 plus all above-Unicode code
                             points)
   \p{IDS_Binary_Operator: Y*} (Short: \p{IDSB=Y}, \p{IDSB}) (10)
   \p{IDS_Trinary_Operator} \p{IDS_Trinary_Operator=Y} (Short:
                             \p{IDST}) (2)
   \p{IDS_Trinary_Operator: N*} (Short: \p{IDST=N}, \P{IDST})
                             (1_114_110 plus all above-Unicode code
                             points)
   \p{IDS_Trinary_Operator: Y*} (Short: \p{IDST=Y}, \p{IDST}) (2)
   \p{IDSB}                \p{IDS_Binary_Operator} (=
                             \p{IDS_Binary_Operator=Y}) (10)
   \p{IDSB: *}             \p{IDS_Binary_Operator: *}
   \p{IDST}                \p{IDS_Trinary_Operator} (=
                             \p{IDS_Trinary_Operator=Y}) (2)
   \p{IDST: *}             \p{IDS_Trinary_Operator: *}
   \p{Imperial_Aramaic}    \p{Script_Extensions=Imperial_Aramaic}
                             (Short: \p{Armi}; NOT \p{Block=
                             Imperial_Aramaic}) (31)
   \p{In: *}               \p{Present_In: *} (Perl extension)
 X \p{In_*}                \p{Block: *}
 X \p{Indic_Number_Forms}  \p{Common_Indic_Number_Forms} (= \p{Block=
                             Common_Indic_Number_Forms}) (16)
   \p{Indic_Positional_Category: Bottom} (Short: \p{InPC=Bottom})
                             (300)
   \p{Indic_Positional_Category: Bottom_And_Right} (Short: \p{InPC=
                             BottomAndRight}) (2)
   \p{Indic_Positional_Category: Left} (Short: \p{InPC=Left}) (57)
   \p{Indic_Positional_Category: Left_And_Right} (Short: \p{InPC=
                             LeftAndRight}) (21)
   \p{Indic_Positional_Category: NA} (Short: \p{InPC=NA}) (1_113_069
                             plus all above-Unicode code points)
   \p{Indic_Positional_Category: Overstruck} (Short: \p{InPC=
                             Overstruck}) (10)
   \p{Indic_Positional_Category: Right} (Short: \p{InPC=Right}) (258)
   \p{Indic_Positional_Category: Top} (Short: \p{InPC=Top}) (342)
   \p{Indic_Positional_Category: Top_And_Bottom} (Short: \p{InPC=
                             TopAndBottom}) (10)
   \p{Indic_Positional_Category: Top_And_Bottom_And_Right} (Short:
                             \p{InPC=TopAndBottomAndRight}) (1)
   \p{Indic_Positional_Category: Top_And_Left} (Short: \p{InPC=
                             TopAndLeft}) (6)
   \p{Indic_Positional_Category: Top_And_Left_And_Right} (Short:
                             \p{InPC=TopAndLeftAndRight}) (4)
   \p{Indic_Positional_Category: Top_And_Right} (Short: \p{InPC=
                             TopAndRight}) (13)
   \p{Indic_Positional_Category: Visual_Order_Left} (Short: \p{InPC=
                             VisualOrderLeft}) (19)
   \p{Indic_Syllabic_Category: Avagraha} (Short: \p{InSC=Avagraha})
                             (15)
   \p{Indic_Syllabic_Category: Bindu} (Short: \p{InSC=Bindu}) (67)
   \p{Indic_Syllabic_Category: Brahmi_Joining_Number} (Short:
                             \p{InSC=BrahmiJoiningNumber}) (20)
   \p{Indic_Syllabic_Category: Cantillation_Mark} (Short: \p{InSC=
                             CantillationMark}) (53)
   \p{Indic_Syllabic_Category: Consonant} (Short: \p{InSC=Consonant})
                             (1907)
   \p{Indic_Syllabic_Category: Consonant_Dead} (Short: \p{InSC=
                             ConsonantDead}) (10)
   \p{Indic_Syllabic_Category: Consonant_Final} (Short: \p{InSC=
                             ConsonantFinal}) (62)
   \p{Indic_Syllabic_Category: Consonant_Head_Letter} (Short:
                             \p{InSC=ConsonantHeadLetter}) (5)
   \p{Indic_Syllabic_Category: Consonant_Killer} (Short: \p{InSC=
                             ConsonantKiller}) (2)
   \p{Indic_Syllabic_Category: Consonant_Medial} (Short: \p{InSC=
                             ConsonantMedial}) (22)
   \p{Indic_Syllabic_Category: Consonant_Placeholder} (Short:
                             \p{InSC=ConsonantPlaceholder}) (16)
   \p{Indic_Syllabic_Category: Consonant_Preceding_Repha} (Short:
                             \p{InSC=ConsonantPrecedingRepha}) (1)
   \p{Indic_Syllabic_Category: Consonant_Prefixed} (Short: \p{InSC=
                             ConsonantPrefixed}) (2)
   \p{Indic_Syllabic_Category: Consonant_Subjoined} (Short: \p{InSC=
                             ConsonantSubjoined}) (90)
   \p{Indic_Syllabic_Category: Consonant_Succeeding_Repha} (Short:
                             \p{InSC=ConsonantSucceedingRepha}) (4)
   \p{Indic_Syllabic_Category: Consonant_With_Stacker} (Short:
                             \p{InSC=ConsonantWithStacker}) (4)
   \p{Indic_Syllabic_Category: Gemination_Mark} (Short: \p{InSC=
                             GeminationMark}) (2)
   \p{Indic_Syllabic_Category: Invisible_Stacker} (Short: \p{InSC=
                             InvisibleStacker}) (7)
   \p{Indic_Syllabic_Category: Joiner} (Short: \p{InSC=Joiner}) (1)
   \p{Indic_Syllabic_Category: Modifying_Letter} (Short: \p{InSC=
                             ModifyingLetter}) (1)
   \p{Indic_Syllabic_Category: Non_Joiner} (Short: \p{InSC=
                             NonJoiner}) (1)
   \p{Indic_Syllabic_Category: Nukta} (Short: \p{InSC=Nukta}) (24)
   \p{Indic_Syllabic_Category: Number} (Short: \p{InSC=Number}) (459)
   \p{Indic_Syllabic_Category: Number_Joiner} (Short: \p{InSC=
                             NumberJoiner}) (1)
   \p{Indic_Syllabic_Category: Other} (Short: \p{InSC=Other})
                             (1_110_129 plus all above-Unicode code
                             points)
   \p{Indic_Syllabic_Category: Pure_Killer} (Short: \p{InSC=
                             PureKiller}) (16)
   \p{Indic_Syllabic_Category: Register_Shifter} (Short: \p{InSC=
                             RegisterShifter}) (2)
   \p{Indic_Syllabic_Category: Syllable_Modifier} (Short: \p{InSC=
                             SyllableModifier}) (22)
   \p{Indic_Syllabic_Category: Tone_Letter} (Short: \p{InSC=
                             ToneLetter}) (7)
   \p{Indic_Syllabic_Category: Tone_Mark} (Short: \p{InSC=ToneMark})
                             (42)
   \p{Indic_Syllabic_Category: Virama} (Short: \p{InSC=Virama}) (24)
   \p{Indic_Syllabic_Category: Visarga} (Short: \p{InSC=Visarga}) (31)
   \p{Indic_Syllabic_Category: Vowel} (Short: \p{InSC=Vowel}) (30)
   \p{Indic_Syllabic_Category: Vowel_Dependent} (Short: \p{InSC=
                             VowelDependent}) (602)
   \p{Indic_Syllabic_Category: Vowel_Independent} (Short: \p{InSC=
                             VowelIndependent}) (431)
   \p{Inherited}           \p{Script_Extensions=Inherited} (Short:
                             \p{Zinh}) (496)
   \p{Initial_Punctuation} \p{General_Category=Initial_Punctuation}
                             (Short: \p{Pi}) (12)
   \p{InPC: *}             \p{Indic_Positional_Category: *}
   \p{InSC: *}             \p{Indic_Syllabic_Category: *}
   \p{Inscriptional_Pahlavi} \p{Script_Extensions=
                             Inscriptional_Pahlavi} (Short: \p{Phli};
                             NOT \p{Block=Inscriptional_Pahlavi}) (27)
   \p{Inscriptional_Parthian} \p{Script_Extensions=
                             Inscriptional_Parthian} (Short:
                             \p{Prti}; NOT \p{Block=
                             Inscriptional_Parthian}) (30)
 X \p{IPA_Ext}             \p{IPA_Extensions} (= \p{Block=
                             IPA_Extensions}) (96)
 X \p{IPA_Extensions}      \p{Block=IPA_Extensions} (Short:
                             \p{InIPAExt}) (96)
   \p{Is_*}                \p{*} (Any exceptions are individually
                             noted beginning with the word NOT.) If
                             an entry has flag(s) at its beginning,
                             like "D", the "Is_" form has the same
                              flag(s).
   \p{Ital}                \p{Old_Italic} (= \p{Script_Extensions=
                             Old_Italic}) (NOT \p{Block=Old_Italic})
                             (36)
 X \p{Jamo}                \p{Hangul_Jamo} (= \p{Block=Hangul_Jamo})
                             (256)
 X \p{Jamo_Ext_A}          \p{Hangul_Jamo_Extended_A} (= \p{Block=
                             Hangul_Jamo_Extended_A}) (32)
 X \p{Jamo_Ext_B}          \p{Hangul_Jamo_Extended_B} (= \p{Block=
                             Hangul_Jamo_Extended_B}) (80)
   \p{Java}                \p{Javanese} (= \p{Script_Extensions=
                             Javanese}) (NOT \p{Block=Javanese}) (91)
   \p{Javanese}            \p{Script_Extensions=Javanese} (Short:
                             \p{Java}; NOT \p{Block=Javanese}) (91)
   \p{Jg: *}               \p{Joining_Group: *}
   \p{Join_C}              \p{Join_Control} (= \p{Join_Control=Y}) (2)
   \p{Join_C: *}           \p{Join_Control: *}
   \p{Join_Control}        \p{Join_Control=Y} (Short: \p{JoinC}) (2)
   \p{Join_Control: N*}    (Short: \p{JoinC=N}, \P{JoinC}) (1_114_110
                             plus all above-Unicode code points)
   \p{Join_Control: Y*}    (Short: \p{JoinC=Y}, \p{JoinC}) (2)
   \p{Joining_Group: African_Feh} (Short: \p{Jg=AfricanFeh}) (1)
   \p{Joining_Group: African_Noon} (Short: \p{Jg=AfricanNoon}) (1)
   \p{Joining_Group: African_Qaf} (Short: \p{Jg=AfricanQaf}) (1)
   \p{Joining_Group: Ain}  (Short: \p{Jg=Ain}) (8)
   \p{Joining_Group: Alaph} (Short: \p{Jg=Alaph}) (1)
   \p{Joining_Group: Alef} (Short: \p{Jg=Alef}) (10)
   \p{Joining_Group: Beh}  (Short: \p{Jg=Beh}) (24)
   \p{Joining_Group: Beth} (Short: \p{Jg=Beth}) (2)
   \p{Joining_Group: Burushaski_Yeh_Barree} (Short: \p{Jg=
                             BurushaskiYehBarree}) (2)
   \p{Joining_Group: Dal}  (Short: \p{Jg=Dal}) (15)
   \p{Joining_Group: Dalath_Rish} (Short: \p{Jg=DalathRish}) (4)
   \p{Joining_Group: E}    (Short: \p{Jg=E}) (1)
   \p{Joining_Group: Farsi_Yeh} (Short: \p{Jg=FarsiYeh}) (7)
   \p{Joining_Group: Fe}   (Short: \p{Jg=Fe}) (1)
   \p{Joining_Group: Feh}  (Short: \p{Jg=Feh}) (10)
   \p{Joining_Group: Final_Semkath} (Short: \p{Jg=FinalSemkath}) (1)
   \p{Joining_Group: Gaf}  (Short: \p{Jg=Gaf}) (14)
   \p{Joining_Group: Gamal} (Short: \p{Jg=Gamal}) (3)
   \p{Joining_Group: Hah}  (Short: \p{Jg=Hah}) (18)
   \p{Joining_Group: Hamza_On_Heh_Goal} (Short: \p{Jg=
                             HamzaOnHehGoal}) (1)
   \p{Joining_Group: He}   (Short: \p{Jg=He}) (1)
   \p{Joining_Group: Heh}  (Short: \p{Jg=Heh}) (1)
   \p{Joining_Group: Heh_Goal} (Short: \p{Jg=HehGoal}) (2)
   \p{Joining_Group: Heth} (Short: \p{Jg=Heth}) (1)
   \p{Joining_Group: Kaf}  (Short: \p{Jg=Kaf}) (6)
   \p{Joining_Group: Kaph} (Short: \p{Jg=Kaph}) (1)
   \p{Joining_Group: Khaph} (Short: \p{Jg=Khaph}) (1)
   \p{Joining_Group: Knotted_Heh} (Short: \p{Jg=KnottedHeh}) (2)
   \p{Joining_Group: Lam}  (Short: \p{Jg=Lam}) (7)
   \p{Joining_Group: Lamadh} (Short: \p{Jg=Lamadh}) (1)
   \p{Joining_Group: Manichaean_Aleph} (Short: \p{Jg=
                             ManichaeanAleph}) (1)
   \p{Joining_Group: Manichaean_Ayin} (Short: \p{Jg=ManichaeanAyin})
                             (2)
   \p{Joining_Group: Manichaean_Beth} (Short: \p{Jg=ManichaeanBeth})
                             (2)
   \p{Joining_Group: Manichaean_Daleth} (Short: \p{Jg=
                             ManichaeanDaleth}) (1)
   \p{Joining_Group: Manichaean_Dhamedh} (Short: \p{Jg=
                             ManichaeanDhamedh}) (1)
   \p{Joining_Group: Manichaean_Five} (Short: \p{Jg=ManichaeanFive})
                             (1)
   \p{Joining_Group: Manichaean_Gimel} (Short: \p{Jg=
                             ManichaeanGimel}) (2)
   \p{Joining_Group: Manichaean_Heth} (Short: \p{Jg=ManichaeanHeth})
                             (1)
   \p{Joining_Group: Manichaean_Hundred} (Short: \p{Jg=
                             ManichaeanHundred}) (1)
   \p{Joining_Group: Manichaean_Kaph} (Short: \p{Jg=ManichaeanKaph})
                             (3)
   \p{Joining_Group: Manichaean_Lamedh} (Short: \p{Jg=
                             ManichaeanLamedh}) (1)
   \p{Joining_Group: Manichaean_Mem} (Short: \p{Jg=ManichaeanMem}) (1)
   \p{Joining_Group: Manichaean_Nun} (Short: \p{Jg=ManichaeanNun}) (1)
   \p{Joining_Group: Manichaean_One} (Short: \p{Jg=ManichaeanOne}) (1)
   \p{Joining_Group: Manichaean_Pe} (Short: \p{Jg=ManichaeanPe}) (2)
   \p{Joining_Group: Manichaean_Qoph} (Short: \p{Jg=ManichaeanQoph})
                             (3)
   \p{Joining_Group: Manichaean_Resh} (Short: \p{Jg=ManichaeanResh})
                             (1)
   \p{Joining_Group: Manichaean_Sadhe} (Short: \p{Jg=
                             ManichaeanSadhe}) (1)
   \p{Joining_Group: Manichaean_Samekh} (Short: \p{Jg=
                             ManichaeanSamekh}) (1)
   \p{Joining_Group: Manichaean_Taw} (Short: \p{Jg=ManichaeanTaw}) (1)
   \p{Joining_Group: Manichaean_Ten} (Short: \p{Jg=ManichaeanTen}) (1)
   \p{Joining_Group: Manichaean_Teth} (Short: \p{Jg=ManichaeanTeth})
                             (1)
   \p{Joining_Group: Manichaean_Thamedh} (Short: \p{Jg=
                             ManichaeanThamedh}) (1)
   \p{Joining_Group: Manichaean_Twenty} (Short: \p{Jg=
                             ManichaeanTwenty}) (1)
   \p{Joining_Group: Manichaean_Waw} (Short: \p{Jg=ManichaeanWaw}) (1)
   \p{Joining_Group: Manichaean_Yodh} (Short: \p{Jg=ManichaeanYodh})
                             (1)
   \p{Joining_Group: Manichaean_Zayin} (Short: \p{Jg=
                             ManichaeanZayin}) (2)
   \p{Joining_Group: Meem} (Short: \p{Jg=Meem}) (4)
   \p{Joining_Group: Mim}  (Short: \p{Jg=Mim}) (1)
   \p{Joining_Group: No_Joining_Group} (Short: \p{Jg=NoJoiningGroup})
                             (1_113_818 plus all above-Unicode code
                             points)
   \p{Joining_Group: Noon} (Short: \p{Jg=Noon}) (8)
   \p{Joining_Group: Nun}  (Short: \p{Jg=Nun}) (1)
   \p{Joining_Group: Nya}  (Short: \p{Jg=Nya}) (1)
   \p{Joining_Group: Pe}   (Short: \p{Jg=Pe}) (1)
   \p{Joining_Group: Qaf}  (Short: \p{Jg=Qaf}) (5)
   \p{Joining_Group: Qaph} (Short: \p{Jg=Qaph}) (1)
   \p{Joining_Group: Reh}  (Short: \p{Jg=Reh}) (19)
   \p{Joining_Group: Reversed_Pe} (Short: \p{Jg=ReversedPe}) (1)
   \p{Joining_Group: Rohingya_Yeh} (Short: \p{Jg=RohingyaYeh}) (1)
   \p{Joining_Group: Sad}  (Short: \p{Jg=Sad}) (6)
   \p{Joining_Group: Sadhe} (Short: \p{Jg=Sadhe}) (1)
   \p{Joining_Group: Seen} (Short: \p{Jg=Seen}) (11)
   \p{Joining_Group: Semkath} (Short: \p{Jg=Semkath}) (1)
   \p{Joining_Group: Shin} (Short: \p{Jg=Shin}) (1)
   \p{Joining_Group: Straight_Waw} (Short: \p{Jg=StraightWaw}) (1)
   \p{Joining_Group: Swash_Kaf} (Short: \p{Jg=SwashKaf}) (1)
   \p{Joining_Group: Syriac_Waw} (Short: \p{Jg=SyriacWaw}) (1)
   \p{Joining_Group: Tah}  (Short: \p{Jg=Tah}) (4)
   \p{Joining_Group: Taw}  (Short: \p{Jg=Taw}) (1)
   \p{Joining_Group: Teh_Marbuta} (Short: \p{Jg=TehMarbuta}) (3)
   \p{Joining_Group: Teh_Marbuta_Goal} \p{Joining_Group=
                             Hamza_On_Heh_Goal} (1)
   \p{Joining_Group: Teth} (Short: \p{Jg=Teth}) (2)
   \p{Joining_Group: Waw}  (Short: \p{Jg=Waw}) (16)
   \p{Joining_Group: Yeh}  (Short: \p{Jg=Yeh}) (11)
   \p{Joining_Group: Yeh_Barree} (Short: \p{Jg=YehBarree}) (2)
   \p{Joining_Group: Yeh_With_Tail} (Short: \p{Jg=YehWithTail}) (1)
   \p{Joining_Group: Yudh} (Short: \p{Jg=Yudh}) (1)
   \p{Joining_Group: Yudh_He} (Short: \p{Jg=YudhHe}) (1)
   \p{Joining_Group: Zain} (Short: \p{Jg=Zain}) (1)
   \p{Joining_Group: Zhain} (Short: \p{Jg=Zhain}) (1)
   \p{Joining_Type: C}     \p{Joining_Type=Join_Causing} (4)
   \p{Joining_Type: D}     \p{Joining_Type=Dual_Joining} (501)
   \p{Joining_Type: Dual_Joining} (Short: \p{Jt=D}) (501)
   \p{Joining_Type: Join_Causing} (Short: \p{Jt=C}) (4)
   \p{Joining_Type: L}     \p{Joining_Type=Left_Joining} (3)
   \p{Joining_Type: Left_Joining} (Short: \p{Jt=L}) (3)
   \p{Joining_Type: Non_Joining} (Short: \p{Jt=U}) (1_111_653 plus
                             all above-Unicode code points)
   \p{Joining_Type: R}     \p{Joining_Type=Right_Joining} (112)
   \p{Joining_Type: Right_Joining} (Short: \p{Jt=R}) (112)
   \p{Joining_Type: T}     \p{Joining_Type=Transparent} (1839)
   \p{Joining_Type: Transparent} (Short: \p{Jt=T}) (1839)
   \p{Joining_Type: U}     \p{Joining_Type=Non_Joining} (1_111_653
                             plus all above-Unicode code points)
   \p{Jt: *}               \p{Joining_Type: *}
   \p{Kaithi}              \p{Script_Extensions=Kaithi} (Short:
                             \p{Kthi}; NOT \p{Block=Kaithi}) (86)
   \p{Kali}                \p{Kayah_Li} (= \p{Script_Extensions=
                             Kayah_Li}) (48)
   \p{Kana}                \p{Katakana} (= \p{Script_Extensions=
                             Katakana}) (NOT \p{Block=Katakana}) (352)
 X \p{Kana_Sup}            \p{Kana_Supplement} (= \p{Block=
                             Kana_Supplement}) (256)
 X \p{Kana_Supplement}     \p{Block=Kana_Supplement} (Short:
                             \p{InKanaSup}) (256)
 X \p{Kanbun}              \p{Block=Kanbun} (16)
 X \p{Kangxi}              \p{Kangxi_Radicals} (= \p{Block=
                             Kangxi_Radicals}) (224)
 X \p{Kangxi_Radicals}     \p{Block=Kangxi_Radicals} (Short:
                             \p{InKangxi}) (224)
   \p{Kannada}             \p{Script_Extensions=Kannada} (Short:
                             \p{Knda}; NOT \p{Block=Kannada}) (100)
   \p{Katakana}            \p{Script_Extensions=Katakana} (Short:
                             \p{Kana}; NOT \p{Block=Katakana}) (352)
 X \p{Katakana_Ext}        \p{Katakana_Phonetic_Extensions} (=
                             \p{Block=Katakana_Phonetic_Extensions})
                             (16)
 X \p{Katakana_Phonetic_Extensions} \p{Block=
                             Katakana_Phonetic_Extensions} (Short:
                             \p{InKatakanaExt}) (16)
   \p{Kayah_Li}            \p{Script_Extensions=Kayah_Li} (Short:
                             \p{Kali}) (48)
   \p{Khar}                \p{Kharoshthi} (= \p{Script_Extensions=
                             Kharoshthi}) (NOT \p{Block=Kharoshthi})
                             (65)
   \p{Kharoshthi}          \p{Script_Extensions=Kharoshthi} (Short:
                             \p{Khar}; NOT \p{Block=Kharoshthi}) (65)
   \p{Khmer}               \p{Script_Extensions=Khmer} (Short:
                             \p{Khmr}; NOT \p{Block=Khmer}) (146)
 X \p{Khmer_Symbols}       \p{Block=Khmer_Symbols} (32)
   \p{Khmr}                \p{Khmer} (= \p{Script_Extensions=Khmer})
                             (NOT \p{Block=Khmer}) (146)
   \p{Khoj}                \p{Khojki} (= \p{Script_Extensions=
                             Khojki}) (NOT \p{Block=Khojki}) (72)
   \p{Khojki}              \p{Script_Extensions=Khojki} (Short:
                             \p{Khoj}; NOT \p{Block=Khojki}) (72)
   \p{Khudawadi}           \p{Script_Extensions=Khudawadi} (Short:
                             \p{Sind}; NOT \p{Block=Khudawadi}) (81)
   \p{Knda}                \p{Kannada} (= \p{Script_Extensions=
                             Kannada}) (NOT \p{Block=Kannada}) (100)
   \p{Kthi}                \p{Kaithi} (= \p{Script_Extensions=
                             Kaithi}) (NOT \p{Block=Kaithi}) (86)
   \p{L} \pL               \p{Letter} (= \p{General_Category=Letter})
                             (116_766)
 X \p{L&}                  \p{Cased_Letter} (= \p{General_Category=
                             Cased_Letter}) (3796)
 X \p{L_}                  \p{Cased_Letter} (= \p{General_Category=
                             Cased_Letter}) Note the trailing '_'
                             matters in spite of loose matching
                             rules. (3796)
   \p{Lana}                \p{Tai_Tham} (= \p{Script_Extensions=
                             Tai_Tham}) (NOT \p{Block=Tai_Tham}) (127)
   \p{Lao}                 \p{Script_Extensions=Lao} (NOT \p{Block=
                             Lao}) (67)
   \p{Laoo}                \p{Lao} (= \p{Script_Extensions=Lao}) (NOT
                             \p{Block=Lao}) (67)
   \p{Latin}               \p{Script_Extensions=Latin} (Short:
                             \p{Latn}) (1370)
 X \p{Latin_1}             \p{Latin_1_Supplement} (= \p{Block=
                             Latin_1_Supplement}) (128)
 X \p{Latin_1_Sup}         \p{Latin_1_Supplement} (= \p{Block=
                             Latin_1_Supplement}) (128)
 X \p{Latin_1_Supplement}  \p{Block=Latin_1_Supplement} (Short:
                             \p{InLatin1}) (128)
 X \p{Latin_Ext_A}         \p{Latin_Extended_A} (= \p{Block=
                             Latin_Extended_A}) (128)
 X \p{Latin_Ext_Additional} \p{Latin_Extended_Additional} (=
                             \p{Block=Latin_Extended_Additional})
                             (256)
 X \p{Latin_Ext_B}         \p{Latin_Extended_B} (= \p{Block=
                             Latin_Extended_B}) (208)
 X \p{Latin_Ext_C}         \p{Latin_Extended_C} (= \p{Block=
                             Latin_Extended_C}) (32)
 X \p{Latin_Ext_D}         \p{Latin_Extended_D} (= \p{Block=
                             Latin_Extended_D}) (224)
 X \p{Latin_Ext_E}         \p{Latin_Extended_E} (= \p{Block=
                             Latin_Extended_E}) (64)
 X \p{Latin_Extended_A}    \p{Block=Latin_Extended_A} (Short:
                             \p{InLatinExtA}) (128)
 X \p{Latin_Extended_Additional} \p{Block=Latin_Extended_Additional}
                             (Short: \p{InLatinExtAdditional}) (256)
 X \p{Latin_Extended_B}    \p{Block=Latin_Extended_B} (Short:
                             \p{InLatinExtB}) (208)
 X \p{Latin_Extended_C}    \p{Block=Latin_Extended_C} (Short:
                             \p{InLatinExtC}) (32)
 X \p{Latin_Extended_D}    \p{Block=Latin_Extended_D} (Short:
                             \p{InLatinExtD}) (224)
 X \p{Latin_Extended_E}    \p{Block=Latin_Extended_E} (Short:
                             \p{InLatinExtE}) (64)
   \p{Latn}                \p{Latin} (= \p{Script_Extensions=Latin})
                             (1370)
   \p{Lb: *}               \p{Line_Break: *}
   \p{LC}                  \p{Cased_Letter} (= \p{General_Category=
                             Cased_Letter}) (3796)
   \p{Lepc}                \p{Lepcha} (= \p{Script_Extensions=
                             Lepcha}) (NOT \p{Block=Lepcha}) (74)
   \p{Lepcha}              \p{Script_Extensions=Lepcha} (Short:
                             \p{Lepc}; NOT \p{Block=Lepcha}) (74)
   \p{Letter}              \p{General_Category=Letter} (Short: \p{L})
                             (116_766)
   \p{Letter_Number}       \p{General_Category=Letter_Number} (Short:
                             \p{Nl}) (236)
 X \p{Letterlike_Symbols}  \p{Block=Letterlike_Symbols} (80)
   \p{Limb}                \p{Limbu} (= \p{Script_Extensions=Limbu})
                             (NOT \p{Block=Limbu}) (69)
   \p{Limbu}               \p{Script_Extensions=Limbu} (Short:
                             \p{Limb}; NOT \p{Block=Limbu}) (69)
   \p{Lina}                \p{Linear_A} (= \p{Script_Extensions=
                             Linear_A}) (NOT \p{Block=Linear_A}) (386)
   \p{Linb}                \p{Linear_B} (= \p{Script_Extensions=
                             Linear_B}) (268)
   \p{Line_Break: AI}      \p{Line_Break=Ambiguous} (707)
   \p{Line_Break: AL}      \p{Line_Break=Alphabetic} (19_523)
   \p{Line_Break: Alphabetic} (Short: \p{Lb=AL}) (19_523)
   \p{Line_Break: Ambiguous} (Short: \p{Lb=AI}) (707)
   \p{Line_Break: B2}      \p{Line_Break=Break_Both} (3)
   \p{Line_Break: BA}      \p{Line_Break=Break_After} (218)
   \p{Line_Break: BB}      \p{Line_Break=Break_Before} (37)
   \p{Line_Break: BK}      \p{Line_Break=Mandatory_Break} (4)
   \p{Line_Break: Break_After} (Short: \p{Lb=BA}) (218)
   \p{Line_Break: Break_Before} (Short: \p{Lb=BB}) (37)
   \p{Line_Break: Break_Both} (Short: \p{Lb=B2}) (3)
   \p{Line_Break: Break_Symbols} (Short: \p{Lb=SY}) (1)
   \p{Line_Break: Carriage_Return} (Short: \p{Lb=CR}) (1)
   \p{Line_Break: CB}      \p{Line_Break=Contingent_Break} (1)
   \p{Line_Break: CJ}      \p{Line_Break=
                             Conditional_Japanese_Starter} (51)
   \p{Line_Break: CL}      \p{Line_Break=Close_Punctuation} (90)
   \p{Line_Break: Close_Parenthesis} (Short: \p{Lb=CP}) (2)
   \p{Line_Break: Close_Punctuation} (Short: \p{Lb=CL}) (90)
   \p{Line_Break: CM}      \p{Line_Break=Combining_Mark} (2090)
   \p{Line_Break: Combining_Mark} (Short: \p{Lb=CM}) (2090)
   \p{Line_Break: Complex_Context} (Short: \p{Lb=SA}) (734)
   \p{Line_Break: Conditional_Japanese_Starter} (Short: \p{Lb=CJ})
                             (51)
   \p{Line_Break: Contingent_Break} (Short: \p{Lb=CB}) (1)
   \p{Line_Break: CP}      \p{Line_Break=Close_Parenthesis} (2)
   \p{Line_Break: CR}      \p{Line_Break=Carriage_Return} (1)
   \p{Line_Break: E_Base}  (Short: \p{Lb=EB}) (83)
   \p{Line_Break: E_Modifier} (Short: \p{Lb=EM}) (5)
   \p{Line_Break: EB}      \p{Line_Break=E_Base} (83)
   \p{Line_Break: EM}      \p{Line_Break=E_Modifier} (5)
   \p{Line_Break: EX}      \p{Line_Break=Exclamation} (37)
   \p{Line_Break: Exclamation} (Short: \p{Lb=EX}) (37)
   \p{Line_Break: GL}      \p{Line_Break=Glue} (18)
   \p{Line_Break: Glue}    (Short: \p{Lb=GL}) (18)
   \p{Line_Break: H2}      (Short: \p{Lb=H2}) (399)
   \p{Line_Break: H3}      (Short: \p{Lb=H3}) (10_773)
   \p{Line_Break: Hebrew_Letter} (Short: \p{Lb=HL}) (74)
   \p{Line_Break: HL}      \p{Line_Break=Hebrew_Letter} (74)
   \p{Line_Break: HY}      \p{Line_Break=Hyphen} (1)
   \p{Line_Break: Hyphen}  (Short: \p{Lb=HY}) (1)
   \p{Line_Break: ID}      \p{Line_Break=Ideographic} (172_133)
   \p{Line_Break: Ideographic} (Short: \p{Lb=ID}) (172_133)
   \p{Line_Break: IN}      \p{Line_Break=Inseparable} (6)
   \p{Line_Break: Infix_Numeric} (Short: \p{Lb=IS}) (13)
   \p{Line_Break: Inseparable} (Short: \p{Lb=IN}) (6)
   \p{Line_Break: Inseperable} \p{Line_Break=Inseparable} (6)
   \p{Line_Break: IS}      \p{Line_Break=Infix_Numeric} (13)
   \p{Line_Break: JL}      (Short: \p{Lb=JL}) (125)
   \p{Line_Break: JT}      (Short: \p{Lb=JT}) (137)
   \p{Line_Break: JV}      (Short: \p{Lb=JV}) (95)
   \p{Line_Break: LF}      \p{Line_Break=Line_Feed} (1)
   \p{Line_Break: Line_Feed} (Short: \p{Lb=LF}) (1)
   \p{Line_Break: Mandatory_Break} (Short: \p{Lb=BK}) (4)
   \p{Line_Break: Next_Line} (Short: \p{Lb=NL}) (1)
   \p{Line_Break: NL}      \p{Line_Break=Next_Line} (1)
   \p{Line_Break: Nonstarter} (Short: \p{Lb=NS}) (30)
   \p{Line_Break: NS}      \p{Line_Break=Nonstarter} (30)
   \p{Line_Break: NU}      \p{Line_Break=Numeric} (572)
   \p{Line_Break: Numeric} (Short: \p{Lb=NU}) (572)
   \p{Line_Break: OP}      \p{Line_Break=Open_Punctuation} (87)
   \p{Line_Break: Open_Punctuation} (Short: \p{Lb=OP}) (87)
   \p{Line_Break: PO}      \p{Line_Break=Postfix_Numeric} (30)
   \p{Line_Break: Postfix_Numeric} (Short: \p{Lb=PO}) (30)
   \p{Line_Break: PR}      \p{Line_Break=Prefix_Numeric} (65)
   \p{Line_Break: Prefix_Numeric} (Short: \p{Lb=PR}) (65)
   \p{Line_Break: QU}      \p{Line_Break=Quotation} (39)
   \p{Line_Break: Quotation} (Short: \p{Lb=QU}) (39)
   \p{Line_Break: Regional_Indicator} (Short: \p{Lb=RI}) (26)
   \p{Line_Break: RI}      \p{Line_Break=Regional_Indicator} (26)
   \p{Line_Break: SA}      \p{Line_Break=Complex_Context} (734)
 D \p{Line_Break: SG}      \p{Line_Break=Surrogate} (2048)
   \p{Line_Break: SP}      \p{Line_Break=Space} (1)
   \p{Line_Break: Space}   (Short: \p{Lb=SP}) (1)
 D \p{Line_Break: Surrogate} Deprecated by Unicode because surrogates
                             should never appear in well-formed text,
                             and therefore shouldn't be the basis for
                             line breaking (Short: \p{Lb=SG}) (2048)
   \p{Line_Break: SY}      \p{Line_Break=Break_Symbols} (1)
   \p{Line_Break: Unknown} (Short: \p{Lb=XX}) (903_847 plus all
                             above-Unicode code points)
   \p{Line_Break: WJ}      \p{Line_Break=Word_Joiner} (2)
   \p{Line_Break: Word_Joiner} (Short: \p{Lb=WJ}) (2)
   \p{Line_Break: XX}      \p{Line_Break=Unknown} (903_847 plus all
                             above-Unicode code points)
   \p{Line_Break: ZW}      \p{Line_Break=ZWSpace} (1)
   \p{Line_Break: ZWJ}     (Short: \p{Lb=ZWJ}) (1)
   \p{Line_Break: ZWSpace} (Short: \p{Lb=ZW}) (1)
   \p{Line_Separator}      \p{General_Category=Line_Separator}
                             (Short: \p{Zl}) (1)
   \p{Linear_A}            \p{Script_Extensions=Linear_A} (Short:
                             \p{Lina}; NOT \p{Block=Linear_A}) (386)
   \p{Linear_B}            \p{Script_Extensions=Linear_B} (Short:
                             \p{Linb}) (268)
 X \p{Linear_B_Ideograms}  \p{Block=Linear_B_Ideograms} (128)
 X \p{Linear_B_Syllabary}  \p{Block=Linear_B_Syllabary} (128)
   \p{Lisu}                \p{Script_Extensions=Lisu} (48)
   \p{Ll}                  \p{Lowercase_Letter} (=
                             \p{General_Category=Lowercase_Letter})
                             (/i= General_Category=Cased_Letter)
                             (2063)
   \p{Lm}                  \p{Modifier_Letter} (=
                             \p{General_Category=Modifier_Letter})
                             (249)
   \p{Lo}                  \p{Other_Letter} (= \p{General_Category=
                             Other_Letter}) (112_721)
   \p{LOE}                 \p{Logical_Order_Exception} (=
                             \p{Logical_Order_Exception=Y}) (19)
   \p{LOE: *}              \p{Logical_Order_Exception: *}
   \p{Logical_Order_Exception} \p{Logical_Order_Exception=Y} (Short:
                             \p{LOE}) (19)
   \p{Logical_Order_Exception: N*} (Short: \p{LOE=N}, \P{LOE})
                             (1_114_093 plus all above-Unicode code
                             points)
   \p{Logical_Order_Exception: Y*} (Short: \p{LOE=Y}, \p{LOE}) (19)
 X \p{Low_Surrogates}      \p{Block=Low_Surrogates} (1024)
   \p{Lower}               \p{XPosixLower} (= \p{Lowercase=Y}) (/i=
                             Cased=Yes) (2252)
   \p{Lower: *}            \p{Lowercase: *}
   \p{Lowercase}           \p{XPosixLower} (= \p{Lowercase=Y}) (/i=
                             Cased=Yes) (2252)
   \p{Lowercase: N*}       (Short: \p{Lower=N}, \P{Lower}; /i= Cased=
                             No) (1_111_860 plus all above-Unicode
                             code points)
   \p{Lowercase: Y*}       (Short: \p{Lower=Y}, \p{Lower}; /i= Cased=
                             Yes) (2252)
   \p{Lowercase_Letter}    \p{General_Category=Lowercase_Letter}
                             (Short: \p{Ll}; /i= General_Category=
                             Cased_Letter) (2063)
   \p{Lt}                  \p{Titlecase_Letter} (=
                             \p{General_Category=Titlecase_Letter})
                             (/i= General_Category=Cased_Letter) (31)
   \p{Lu}                  \p{Uppercase_Letter} (=
                             \p{General_Category=Uppercase_Letter})
                             (/i= General_Category=Cased_Letter)
                             (1702)
   \p{Lyci}                \p{Lycian} (= \p{Script_Extensions=
                             Lycian}) (NOT \p{Block=Lycian}) (29)
   \p{Lycian}              \p{Script_Extensions=Lycian} (Short:
                             \p{Lyci}; NOT \p{Block=Lycian}) (29)
   \p{Lydi}                \p{Lydian} (= \p{Script_Extensions=
                             Lydian}) (NOT \p{Block=Lydian}) (27)
   \p{Lydian}              \p{Script_Extensions=Lydian} (Short:
                             \p{Lydi}; NOT \p{Block=Lydian}) (27)
   \p{M} \pM               \p{Mark} (= \p{General_Category=Mark})
                             (2097)
   \p{Mahajani}            \p{Script_Extensions=Mahajani} (Short:
                             \p{Mahj}; NOT \p{Block=Mahajani}) (61)
   \p{Mahj}                \p{Mahajani} (= \p{Script_Extensions=
                             Mahajani}) (NOT \p{Block=Mahajani}) (61)
 X \p{Mahjong}             \p{Mahjong_Tiles} (= \p{Block=
                             Mahjong_Tiles}) (48)
 X \p{Mahjong_Tiles}       \p{Block=Mahjong_Tiles} (Short:
                             \p{InMahjong}) (48)
   \p{Malayalam}           \p{Script_Extensions=Malayalam} (Short:
                             \p{Mlym}; NOT \p{Block=Malayalam}) (119)
   \p{Mand}                \p{Mandaic} (= \p{Script_Extensions=
                             Mandaic}) (NOT \p{Block=Mandaic}) (30)
   \p{Mandaic}             \p{Script_Extensions=Mandaic} (Short:
                             \p{Mand}; NOT \p{Block=Mandaic}) (30)
   \p{Mani}                \p{Manichaean} (= \p{Script_Extensions=
                             Manichaean}) (NOT \p{Block=Manichaean})
                             (52)
   \p{Manichaean}          \p{Script_Extensions=Manichaean} (Short:
                             \p{Mani}; NOT \p{Block=Manichaean}) (52)
   \p{Marc}                \p{Marchen} (= \p{Script_Extensions=
                             Marchen}) (NOT \p{Block=Marchen}) (68)
   \p{Marchen}             \p{Script_Extensions=Marchen} (Short:
                             \p{Marc}; NOT \p{Block=Marchen}) (68)
   \p{Mark}                \p{General_Category=Mark} (Short: \p{M})
                             (2097)
   \p{Math}                \p{Math=Y} (2310)
   \p{Math: N*}            (Single: \P{Math}) (1_111_802 plus all
                             above-Unicode code points)
   \p{Math: Y*}            (Single: \p{Math}) (2310)
 X \p{Math_Alphanum}       \p{Mathematical_Alphanumeric_Symbols} (=
                             \p{Block=
                             Mathematical_Alphanumeric_Symbols})
                             (1024)
 X \p{Math_Operators}      \p{Mathematical_Operators} (= \p{Block=
                             Mathematical_Operators}) (256)
   \p{Math_Symbol}         \p{General_Category=Math_Symbol} (Short:
                             \p{Sm}) (948)
 X \p{Mathematical_Alphanumeric_Symbols} \p{Block=
                             Mathematical_Alphanumeric_Symbols}
                             (Short: \p{InMathAlphanum}) (1024)
 X \p{Mathematical_Operators} \p{Block=Mathematical_Operators}
                             (Short: \p{InMathOperators}) (256)
   \p{Mc}                  \p{Spacing_Mark} (= \p{General_Category=
                             Spacing_Mark}) (394)
   \p{Me}                  \p{Enclosing_Mark} (= \p{General_Category=
                             Enclosing_Mark}) (13)
   \p{Meetei_Mayek}        \p{Script_Extensions=Meetei_Mayek} (Short:
                             \p{Mtei}; NOT \p{Block=Meetei_Mayek})
                             (79)
 X \p{Meetei_Mayek_Ext}    \p{Meetei_Mayek_Extensions} (= \p{Block=
                             Meetei_Mayek_Extensions}) (32)
 X \p{Meetei_Mayek_Extensions} \p{Block=Meetei_Mayek_Extensions}
                             (Short: \p{InMeeteiMayekExt}) (32)
   \p{Mend}                \p{Mende_Kikakui} (= \p{Script_Extensions=
                             Mende_Kikakui}) (NOT \p{Block=
                             Mende_Kikakui}) (213)
   \p{Mende_Kikakui}       \p{Script_Extensions=Mende_Kikakui}
                             (Short: \p{Mend}; NOT \p{Block=
                             Mende_Kikakui}) (213)
   \p{Merc}                \p{Meroitic_Cursive} (=
                             \p{Script_Extensions=Meroitic_Cursive})
                             (NOT \p{Block=Meroitic_Cursive}) (90)
   \p{Mero}                \p{Meroitic_Hieroglyphs} (=
                             \p{Script_Extensions=
                             Meroitic_Hieroglyphs}) (32)
   \p{Meroitic_Cursive}    \p{Script_Extensions=Meroitic_Cursive}
                             (Short: \p{Merc}; NOT \p{Block=
                             Meroitic_Cursive}) (90)
   \p{Meroitic_Hieroglyphs} \p{Script_Extensions=
                             Meroitic_Hieroglyphs} (Short: \p{Mero})
                             (32)
   \p{Miao}                \p{Script_Extensions=Miao} (NOT \p{Block=
                             Miao}) (133)
 X \p{Misc_Arrows}         \p{Miscellaneous_Symbols_And_Arrows} (=
                             \p{Block=
                             Miscellaneous_Symbols_And_Arrows}) (256)
 X \p{Misc_Math_Symbols_A} \p{Miscellaneous_Mathematical_Symbols_A}
                             (= \p{Block=
                             Miscellaneous_Mathematical_Symbols_A})
                             (48)
 X \p{Misc_Math_Symbols_B} \p{Miscellaneous_Mathematical_Symbols_B}
                             (= \p{Block=
                             Miscellaneous_Mathematical_Symbols_B})
                             (128)
 X \p{Misc_Pictographs}    \p{Miscellaneous_Symbols_And_Pictographs}
                             (= \p{Block=
                             Miscellaneous_Symbols_And_Pictographs})
                             (768)
 X \p{Misc_Symbols}        \p{Miscellaneous_Symbols} (= \p{Block=
                             Miscellaneous_Symbols}) (256)
 X \p{Misc_Technical}      \p{Miscellaneous_Technical} (= \p{Block=
                             Miscellaneous_Technical}) (256)
 X \p{Miscellaneous_Mathematical_Symbols_A} \p{Block=
                             Miscellaneous_Mathematical_Symbols_A}
                             (Short: \p{InMiscMathSymbolsA}) (48)
 X \p{Miscellaneous_Mathematical_Symbols_B} \p{Block=
                             Miscellaneous_Mathematical_Symbols_B}
                             (Short: \p{InMiscMathSymbolsB}) (128)
 X \p{Miscellaneous_Symbols} \p{Block=Miscellaneous_Symbols} (Short:
                             \p{InMiscSymbols}) (256)
 X \p{Miscellaneous_Symbols_And_Arrows} \p{Block=
                             Miscellaneous_Symbols_And_Arrows}
                             (Short: \p{InMiscArrows}) (256)
 X \p{Miscellaneous_Symbols_And_Pictographs} \p{Block=
                             Miscellaneous_Symbols_And_Pictographs}
                             (Short: \p{InMiscPictographs}) (768)
 X \p{Miscellaneous_Technical} \p{Block=Miscellaneous_Technical}
                             (Short: \p{InMiscTechnical}) (256)
   \p{Mlym}                \p{Malayalam} (= \p{Script_Extensions=
                             Malayalam}) (NOT \p{Block=Malayalam})
                             (119)
   \p{Mn}                  \p{Nonspacing_Mark} (=
                             \p{General_Category=Nonspacing_Mark})
                             (1690)
   \p{Modi}                \p{Script_Extensions=Modi} (NOT \p{Block=
                             Modi}) (89)
   \p{Modifier_Letter}     \p{General_Category=Modifier_Letter}
                             (Short: \p{Lm}) (249)
 X \p{Modifier_Letters}    \p{Spacing_Modifier_Letters} (= \p{Block=
                             Spacing_Modifier_Letters}) (80)
   \p{Modifier_Symbol}     \p{General_Category=Modifier_Symbol}
                             (Short: \p{Sk}) (121)
 X \p{Modifier_Tone_Letters} \p{Block=Modifier_Tone_Letters} (32)
   \p{Mong}                \p{Mongolian} (= \p{Script_Extensions=
                             Mongolian}) (NOT \p{Block=Mongolian})
                             (169)
   \p{Mongolian}           \p{Script_Extensions=Mongolian} (Short:
                             \p{Mong}; NOT \p{Block=Mongolian}) (169)
 X \p{Mongolian_Sup}       \p{Mongolian_Supplement} (= \p{Block=
                             Mongolian_Supplement}) (32)
 X \p{Mongolian_Supplement} \p{Block=Mongolian_Supplement} (Short:
                             \p{InMongolianSup}) (32)
   \p{Mro}                 \p{Script_Extensions=Mro} (NOT \p{Block=
                             Mro}) (43)
   \p{Mroo}                \p{Mro} (= \p{Script_Extensions=Mro}) (NOT
                             \p{Block=Mro}) (43)
   \p{Mtei}                \p{Meetei_Mayek} (= \p{Script_Extensions=
                             Meetei_Mayek}) (NOT \p{Block=
                             Meetei_Mayek}) (79)
   \p{Mult}                \p{Multani} (= \p{Script_Extensions=
                             Multani}) (NOT \p{Block=Multani}) (48)
   \p{Multani}             \p{Script_Extensions=Multani} (Short:
                             \p{Mult}; NOT \p{Block=Multani}) (48)
 X \p{Music}               \p{Musical_Symbols} (= \p{Block=
                             Musical_Symbols}) (256)
 X \p{Musical_Symbols}     \p{Block=Musical_Symbols} (Short:
                             \p{InMusic}) (256)
   \p{Myanmar}             \p{Script_Extensions=Myanmar} (Short:
                             \p{Mymr}; NOT \p{Block=Myanmar}) (224)
 X \p{Myanmar_Ext_A}       \p{Myanmar_Extended_A} (= \p{Block=
                             Myanmar_Extended_A}) (32)
 X \p{Myanmar_Ext_B}       \p{Myanmar_Extended_B} (= \p{Block=
                             Myanmar_Extended_B}) (32)
 X \p{Myanmar_Extended_A}  \p{Block=Myanmar_Extended_A} (Short:
                             \p{InMyanmarExtA}) (32)
 X \p{Myanmar_Extended_B}  \p{Block=Myanmar_Extended_B} (Short:
                             \p{InMyanmarExtB}) (32)
   \p{Mymr}                \p{Myanmar} (= \p{Script_Extensions=
                             Myanmar}) (NOT \p{Block=Myanmar}) (224)
   \p{N} \pN               \p{Number} (= \p{General_Category=Number})
                             (1492)
   \p{Nabataean}           \p{Script_Extensions=Nabataean} (Short:
                             \p{Nbat}; NOT \p{Block=Nabataean}) (40)
   \p{Narb}                \p{Old_North_Arabian} (=
                             \p{Script_Extensions=Old_North_Arabian})
                             (32)
 X \p{NB}                  \p{No_Block} (= \p{Block=No_Block})
                             (842_320 plus all above-Unicode code
                             points)
   \p{Nbat}                \p{Nabataean} (= \p{Script_Extensions=
                             Nabataean}) (NOT \p{Block=Nabataean})
                             (40)
   \p{NChar}               \p{Noncharacter_Code_Point} (=
                             \p{Noncharacter_Code_Point=Y}) (66)
   \p{NChar: *}            \p{Noncharacter_Code_Point: *}
   \p{Nd}                  \p{XPosixDigit} (= \p{General_Category=
                             Decimal_Number}) (580)
   \p{New_Tai_Lue}         \p{Script_Extensions=New_Tai_Lue} (Short:
                             \p{Talu}; NOT \p{Block=New_Tai_Lue}) (83)
   \p{Newa}                \p{Script_Extensions=Newa} (NOT \p{Block=
                             Newa}) (92)
   \p{NFC_QC: *}           \p{NFC_Quick_Check: *}
   \p{NFC_Quick_Check: M}  \p{NFC_Quick_Check=Maybe} (110)
   \p{NFC_Quick_Check: Maybe} (Short: \p{NFCQC=M}) (110)
   \p{NFC_Quick_Check: N}  \p{NFC_Quick_Check=No} (NOT
                             \P{NFC_Quick_Check} NOR \P{NFC_QC})
                             (1120)
   \p{NFC_Quick_Check: No} (Short: \p{NFCQC=N}; NOT
                             \P{NFC_Quick_Check} NOR \P{NFC_QC})
                             (1120)
   \p{NFC_Quick_Check: Y}  \p{NFC_Quick_Check=Yes} (NOT
                             \p{NFC_Quick_Check} NOR \p{NFC_QC})
                             (1_112_882 plus all above-Unicode code
                             points)
   \p{NFC_Quick_Check: Yes} (Short: \p{NFCQC=Y}; NOT
                             \p{NFC_Quick_Check} NOR \p{NFC_QC})
                             (1_112_882 plus all above-Unicode code
                             points)
   \p{NFD_QC: *}           \p{NFD_Quick_Check: *}
   \p{NFD_Quick_Check: N}  \p{NFD_Quick_Check=No} (NOT
                             \P{NFD_Quick_Check} NOR \P{NFD_QC})
                             (13_232)
   \p{NFD_Quick_Check: No} (Short: \p{NFDQC=N}; NOT
                             \P{NFD_Quick_Check} NOR \P{NFD_QC})
                             (13_232)
   \p{NFD_Quick_Check: Y}  \p{NFD_Quick_Check=Yes} (NOT
                             \p{NFD_Quick_Check} NOR \p{NFD_QC})
                             (1_100_880 plus all above-Unicode code
                             points)
   \p{NFD_Quick_Check: Yes} (Short: \p{NFDQC=Y}; NOT
                             \p{NFD_Quick_Check} NOR \p{NFD_QC})
                             (1_100_880 plus all above-Unicode code
                             points)
   \p{NFKC_QC: *}          \p{NFKC_Quick_Check: *}
   \p{NFKC_Quick_Check: M} \p{NFKC_Quick_Check=Maybe} (110)
   \p{NFKC_Quick_Check: Maybe} (Short: \p{NFKCQC=M}) (110)
   \p{NFKC_Quick_Check: N} \p{NFKC_Quick_Check=No} (NOT
                             \P{NFKC_Quick_Check} NOR \P{NFKC_QC})
                             (4794)
   \p{NFKC_Quick_Check: No} (Short: \p{NFKCQC=N}; NOT
                             \P{NFKC_Quick_Check} NOR \P{NFKC_QC})
                             (4794)
   \p{NFKC_Quick_Check: Y} \p{NFKC_Quick_Check=Yes} (NOT
                             \p{NFKC_Quick_Check} NOR \p{NFKC_QC})
                             (1_109_208 plus all above-Unicode code
                             points)
   \p{NFKC_Quick_Check: Yes} (Short: \p{NFKCQC=Y}; NOT
                             \p{NFKC_Quick_Check} NOR \p{NFKC_QC})
                             (1_109_208 plus all above-Unicode code
                             points)
   \p{NFKD_QC: *}          \p{NFKD_Quick_Check: *}
   \p{NFKD_Quick_Check: N} \p{NFKD_Quick_Check=No} (NOT
                             \P{NFKD_Quick_Check} NOR \P{NFKD_QC})
                             (16_894)
   \p{NFKD_Quick_Check: No} (Short: \p{NFKDQC=N}; NOT
                             \P{NFKD_Quick_Check} NOR \P{NFKD_QC})
                             (16_894)
   \p{NFKD_Quick_Check: Y} \p{NFKD_Quick_Check=Yes} (NOT
                             \p{NFKD_Quick_Check} NOR \p{NFKD_QC})
                             (1_097_218 plus all above-Unicode code
                             points)
   \p{NFKD_Quick_Check: Yes} (Short: \p{NFKDQC=Y}; NOT
                             \p{NFKD_Quick_Check} NOR \p{NFKD_QC})
                             (1_097_218 plus all above-Unicode code
                             points)
   \p{Nko}                 \p{Script_Extensions=Nko} (NOT \p{NKo})
                             (59)
   \p{Nkoo}                \p{Nko} (= \p{Script_Extensions=Nko}) (NOT
                             \p{NKo}) (59)
   \p{Nl}                  \p{Letter_Number} (= \p{General_Category=
                             Letter_Number}) (236)
   \p{No}                  \p{Other_Number} (= \p{General_Category=
                             Other_Number}) (676)
 X \p{No_Block}            \p{Block=No_Block} (Short: \p{InNB})
                             (842_320 plus all above-Unicode code
                             points)
   \p{Noncharacter_Code_Point} \p{Noncharacter_Code_Point=Y} (Short:
                             \p{NChar}) (66)
   \p{Noncharacter_Code_Point: N*} (Short: \p{NChar=N}, \P{NChar})
                             (1_114_046 plus all above-Unicode code
                             points)
   \p{Noncharacter_Code_Point: Y*} (Short: \p{NChar=Y}, \p{NChar})
                             (66)
   \p{Nonspacing_Mark}     \p{General_Category=Nonspacing_Mark}
                             (Short: \p{Mn}) (1690)
   \p{Nt: *}               \p{Numeric_Type: *}
   \p{Number}              \p{General_Category=Number} (Short: \p{N})
                             (1492)
 X \p{Number_Forms}        \p{Block=Number_Forms} (64)
   \p{Numeric_Type: De}    \p{Numeric_Type=Decimal} (580)
   \p{Numeric_Type: Decimal} (Short: \p{Nt=De}) (580)
   \p{Numeric_Type: Di}    \p{Numeric_Type=Digit} (128)
   \p{Numeric_Type: Digit} (Short: \p{Nt=Di}) (128)
   \p{Numeric_Type: None}  (Short: \p{Nt=None}) (1_112_539 plus all
                             above-Unicode code points)
   \p{Numeric_Type: Nu}    \p{Numeric_Type=Numeric} (865)
   \p{Numeric_Type: Numeric} (Short: \p{Nt=Nu}) (865)
 T \p{Numeric_Value: -1/2} (Short: \p{Nv=-1/2}) (1)
 T \p{Numeric_Value: 0}    (Short: \p{Nv=0}) (74)
 T \p{Numeric_Value: 1/160} (Short: \p{Nv=1/160}) (1)
 T \p{Numeric_Value: 1/40} (Short: \p{Nv=1/40}) (1)
 T \p{Numeric_Value: 3/80} (Short: \p{Nv=3/80}) (1)
 T \p{Numeric_Value: 1/20} (Short: \p{Nv=1/20}) (1)
 T \p{Numeric_Value: 1/16} (Short: \p{Nv=1/16}) (4)
 T \p{Numeric_Value: 1/12} (Short: \p{Nv=1/12}) (1)
 T \p{Numeric_Value: 1/10} (Short: \p{Nv=1/10}) (2)
 T \p{Numeric_Value: 1/9}  (Short: \p{Nv=1/9}) (1)
 T \p{Numeric_Value: 1/8}  (Short: \p{Nv=1/8}) (6)
 T \p{Numeric_Value: 1/7}  (Short: \p{Nv=1/7}) (1)
 T \p{Numeric_Value: 3/20} (Short: \p{Nv=3/20}) (1)
 T \p{Numeric_Value: 1/6}  (Short: \p{Nv=1/6}) (3)
 T \p{Numeric_Value: 3/16} (Short: \p{Nv=3/16}) (4)
 T \p{Numeric_Value: 1/5}  (Short: \p{Nv=1/5}) (2)
 T \p{Numeric_Value: 1/4}  (Short: \p{Nv=1/4}) (12)
 T \p{Numeric_Value: 1/3}  (Short: \p{Nv=1/3}) (6)
 T \p{Numeric_Value: 3/8}  (Short: \p{Nv=3/8}) (1)
 T \p{Numeric_Value: 2/5}  (Short: \p{Nv=2/5}) (1)
 T \p{Numeric_Value: 5/12} (Short: \p{Nv=5/12}) (1)
 T \p{Numeric_Value: 1/2}  (Short: \p{Nv=1/2}) (13)
 T \p{Numeric_Value: 7/12} (Short: \p{Nv=7/12}) (1)
 T \p{Numeric_Value: 3/5}  (Short: \p{Nv=3/5}) (1)
 T \p{Numeric_Value: 5/8}  (Short: \p{Nv=5/8}) (1)
 T \p{Numeric_Value: 2/3}  (Short: \p{Nv=2/3}) (7)
 T \p{Numeric_Value: 3/4}  (Short: \p{Nv=3/4}) (7)
 T \p{Numeric_Value: 4/5}  (Short: \p{Nv=4/5}) (1)
 T \p{Numeric_Value: 5/6}  (Short: \p{Nv=5/6}) (3)
 T \p{Numeric_Value: 7/8}  (Short: \p{Nv=7/8}) (1)
 T \p{Numeric_Value: 11/12} (Short: \p{Nv=11/12}) (1)
 T \p{Numeric_Value: 1}    (Short: \p{Nv=1}) (121)
 T \p{Numeric_Value: 3/2}  (Short: \p{Nv=3/2}) (1)
 T \p{Numeric_Value: 2}    (Short: \p{Nv=2}) (121)
 T \p{Numeric_Value: 5/2}  (Short: \p{Nv=5/2}) (1)
 T \p{Numeric_Value: 3}    (Short: \p{Nv=3}) (123)
 T \p{Numeric_Value: 7/2}  (Short: \p{Nv=7/2}) (1)
 T \p{Numeric_Value: 4}    (Short: \p{Nv=4}) (115)
 T \p{Numeric_Value: 9/2}  (Short: \p{Nv=9/2}) (1)
 T \p{Numeric_Value: 5}    (Short: \p{Nv=5}) (113)
 T \p{Numeric_Value: 11/2} (Short: \p{Nv=11/2}) (1)
 T \p{Numeric_Value: 6}    (Short: \p{Nv=6}) (100)
 T \p{Numeric_Value: 13/2} (Short: \p{Nv=13/2}) (1)
 T \p{Numeric_Value: 7}    (Short: \p{Nv=7}) (99)
 T \p{Numeric_Value: 15/2} (Short: \p{Nv=15/2}) (1)
 T \p{Numeric_Value: 8}    (Short: \p{Nv=8}) (95)
 T \p{Numeric_Value: 17/2} (Short: \p{Nv=17/2}) (1)
 T \p{Numeric_Value: 9}    (Short: \p{Nv=9}) (99)
 T \p{Numeric_Value: 10}   (Short: \p{Nv=10}) (54)
 T \p{Numeric_Value: 11}   (Short: \p{Nv=11}) (6)
 T \p{Numeric_Value: 12}   (Short: \p{Nv=12}) (6)
 T \p{Numeric_Value: 13}   (Short: \p{Nv=13}) (4)
 T \p{Numeric_Value: 14}   (Short: \p{Nv=14}) (4)
 T \p{Numeric_Value: 15}   (Short: \p{Nv=15}) (4)
 T \p{Numeric_Value: 16}   (Short: \p{Nv=16}) (5)
 T \p{Numeric_Value: 17}   (Short: \p{Nv=17}) (5)
 T \p{Numeric_Value: 18}   (Short: \p{Nv=18}) (5)
 T \p{Numeric_Value: 19}   (Short: \p{Nv=19}) (5)
 T \p{Numeric_Value: 20}   (Short: \p{Nv=20}) (31)
 T \p{Numeric_Value: 21}   (Short: \p{Nv=21}) (1)
 T \p{Numeric_Value: 22}   (Short: \p{Nv=22}) (1)
 T \p{Numeric_Value: 23}   (Short: \p{Nv=23}) (1)
 T \p{Numeric_Value: 24}   (Short: \p{Nv=24}) (1)
 T \p{Numeric_Value: 25}   (Short: \p{Nv=25}) (1)
 T \p{Numeric_Value: 26}   (Short: \p{Nv=26}) (1)
 T \p{Numeric_Value: 27}   (Short: \p{Nv=27}) (1)
 T \p{Numeric_Value: 28}   (Short: \p{Nv=28}) (1)
 T \p{Numeric_Value: 29}   (Short: \p{Nv=29}) (1)
 T \p{Numeric_Value: 30}   (Short: \p{Nv=30}) (16)
 T \p{Numeric_Value: 31}   (Short: \p{Nv=31}) (1)
 T \p{Numeric_Value: 32}   (Short: \p{Nv=32}) (1)
 T \p{Numeric_Value: 33}   (Short: \p{Nv=33}) (1)
 T \p{Numeric_Value: 34}   (Short: \p{Nv=34}) (1)
 T \p{Numeric_Value: 35}   (Short: \p{Nv=35}) (1)
 T \p{Numeric_Value: 36}   (Short: \p{Nv=36}) (1)
 T \p{Numeric_Value: 37}   (Short: \p{Nv=37}) (1)
 T \p{Numeric_Value: 38}   (Short: \p{Nv=38}) (1)
 T \p{Numeric_Value: 39}   (Short: \p{Nv=39}) (1)
 T \p{Numeric_Value: 40}   (Short: \p{Nv=40}) (16)
 T \p{Numeric_Value: 41}   (Short: \p{Nv=41}) (1)
 T \p{Numeric_Value: 42}   (Short: \p{Nv=42}) (1)
 T \p{Numeric_Value: 43}   (Short: \p{Nv=43}) (1)
 T \p{Numeric_Value: 44}   (Short: \p{Nv=44}) (1)
 T \p{Numeric_Value: 45}   (Short: \p{Nv=45}) (1)
 T \p{Numeric_Value: 46}   (Short: \p{Nv=46}) (1)
 T \p{Numeric_Value: 47}   (Short: \p{Nv=47}) (1)
 T \p{Numeric_Value: 48}   (Short: \p{Nv=48}) (1)
 T \p{Numeric_Value: 49}   (Short: \p{Nv=49}) (1)
 T \p{Numeric_Value: 50}   (Short: \p{Nv=50}) (27)
 T \p{Numeric_Value: 60}   (Short: \p{Nv=60}) (11)
 T \p{Numeric_Value: 70}   (Short: \p{Nv=70}) (11)
 T \p{Numeric_Value: 80}   (Short: \p{Nv=80}) (10)
 T \p{Numeric_Value: 90}   (Short: \p{Nv=90}) (10)
 T \p{Numeric_Value: 100}  (Short: \p{Nv=100}) (30)
 T \p{Numeric_Value: 200}  (Short: \p{Nv=200}) (4)
 T \p{Numeric_Value: 300}  (Short: \p{Nv=300}) (5)
 T \p{Numeric_Value: 400}  (Short: \p{Nv=400}) (4)
 T \p{Numeric_Value: 500}  (Short: \p{Nv=500}) (14)
 T \p{Numeric_Value: 600}  (Short: \p{Nv=600}) (4)
 T \p{Numeric_Value: 700}  (Short: \p{Nv=700}) (4)
 T \p{Numeric_Value: 800}  (Short: \p{Nv=800}) (4)
 T \p{Numeric_Value: 900}  (Short: \p{Nv=900}) (5)
 T \p{Numeric_Value: 1000} (Short: \p{Nv=1000}) (20)
 T \p{Numeric_Value: 2000} (Short: \p{Nv=2000}) (2)
 T \p{Numeric_Value: 3000} (Short: \p{Nv=3000}) (2)
 T \p{Numeric_Value: 4000} (Short: \p{Nv=4000}) (2)
 T \p{Numeric_Value: 5000} (Short: \p{Nv=5000}) (6)
 T \p{Numeric_Value: 6000} (Short: \p{Nv=6000}) (2)
 T \p{Numeric_Value: 7000} (Short: \p{Nv=7000}) (2)
 T \p{Numeric_Value: 8000} (Short: \p{Nv=8000}) (2)
 T \p{Numeric_Value: 9000} (Short: \p{Nv=9000}) (2)
 T \p{Numeric_Value: 10000} (= 1.0e+04) (Short: \p{Nv=10000}) (9)
 T \p{Numeric_Value: 20000} (= 2.0e+04) (Short: \p{Nv=20000}) (2)
 T \p{Numeric_Value: 30000} (= 3.0e+04) (Short: \p{Nv=30000}) (2)
 T \p{Numeric_Value: 40000} (= 4.0e+04) (Short: \p{Nv=40000}) (2)
 T \p{Numeric_Value: 50000} (= 5.0e+04) (Short: \p{Nv=50000}) (5)
 T \p{Numeric_Value: 60000} (= 6.0e+04) (Short: \p{Nv=60000}) (2)
 T \p{Numeric_Value: 70000} (= 7.0e+04) (Short: \p{Nv=70000}) (2)
 T \p{Numeric_Value: 80000} (= 8.0e+04) (Short: \p{Nv=80000}) (2)
 T \p{Numeric_Value: 90000} (= 9.0e+04) (Short: \p{Nv=90000}) (2)
 T \p{Numeric_Value: 100000} (= 1.0e+05) (Short: \p{Nv=100000}) (2)
 T \p{Numeric_Value: 200000} (= 2.0e+05) (Short: \p{Nv=200000}) (1)
 T \p{Numeric_Value: 216000} (= 2.2e+05) (Short: \p{Nv=216000}) (1)
 T \p{Numeric_Value: 300000} (= 3.0e+05) (Short: \p{Nv=300000}) (1)
 T \p{Numeric_Value: 400000} (= 4.0e+05) (Short: \p{Nv=400000}) (1)
 T \p{Numeric_Value: 432000} (= 4.3e+05) (Short: \p{Nv=432000}) (1)
 T \p{Numeric_Value: 500000} (= 5.0e+05) (Short: \p{Nv=500000}) (1)
 T \p{Numeric_Value: 600000} (= 6.0e+05) (Short: \p{Nv=600000}) (1)
 T \p{Numeric_Value: 700000} (= 7.0e+05) (Short: \p{Nv=700000}) (1)
 T \p{Numeric_Value: 800000} (= 8.0e+05) (Short: \p{Nv=800000}) (1)
 T \p{Numeric_Value: 900000} (= 9.0e+05) (Short: \p{Nv=900000}) (1)
 T \p{Numeric_Value: 1000000} (= 1.0e+06) (Short: \p{Nv=1000000}) (1)
 T \p{Numeric_Value: 100000000} (= 1.0e+08) (Short: \p{Nv=100000000})
                             (3)
 T \p{Numeric_Value: 10000000000} (= 1.0e+10) (Short: \p{Nv=
                             10000000000}) (1)
 T \p{Numeric_Value: 1000000000000} (= 1.0e+12) (Short: \p{Nv=
                             1000000000000}) (2)
   \p{Numeric_Value: NaN}  (Short: \p{Nv=NaN}) (1_112_539 plus all
                             above-Unicode code points)
   \p{Nv: *}               \p{Numeric_Value: *}
 X \p{OCR}                 \p{Optical_Character_Recognition} (=
                             \p{Block=Optical_Character_Recognition})
                             (32)
   \p{Ogam}                \p{Ogham} (= \p{Script_Extensions=Ogham})
                             (NOT \p{Block=Ogham}) (29)
   \p{Ogham}               \p{Script_Extensions=Ogham} (Short:
                             \p{Ogam}; NOT \p{Block=Ogham}) (29)
   \p{Ol_Chiki}            \p{Script_Extensions=Ol_Chiki} (Short:
                             \p{Olck}) (48)
   \p{Olck}                \p{Ol_Chiki} (= \p{Script_Extensions=
                             Ol_Chiki}) (48)
   \p{Old_Hungarian}       \p{Script_Extensions=Old_Hungarian}
                             (Short: \p{Hung}; NOT \p{Block=
                             Old_Hungarian}) (108)
   \p{Old_Italic}          \p{Script_Extensions=Old_Italic} (Short:
                             \p{Ital}; NOT \p{Block=Old_Italic}) (36)
   \p{Old_North_Arabian}   \p{Script_Extensions=Old_North_Arabian}
                             (Short: \p{Narb}) (32)
   \p{Old_Permic}          \p{Script_Extensions=Old_Permic} (Short:
                             \p{Perm}; NOT \p{Block=Old_Permic}) (44)
   \p{Old_Persian}         \p{Script_Extensions=Old_Persian} (Short:
                             \p{Xpeo}; NOT \p{Block=Old_Persian}) (50)
   \p{Old_South_Arabian}   \p{Script_Extensions=Old_South_Arabian}
                             (Short: \p{Sarb}) (32)
   \p{Old_Turkic}          \p{Script_Extensions=Old_Turkic} (Short:
                             \p{Orkh}; NOT \p{Block=Old_Turkic}) (73)
   \p{Open_Punctuation}    \p{General_Category=Open_Punctuation}
                             (Short: \p{Ps}) (75)
 X \p{Optical_Character_Recognition} \p{Block=
                             Optical_Character_Recognition} (Short:
                             \p{InOCR}) (32)
   \p{Oriya}               \p{Script_Extensions=Oriya} (Short:
                             \p{Orya}; NOT \p{Block=Oriya}) (94)
   \p{Orkh}                \p{Old_Turkic} (= \p{Script_Extensions=
                             Old_Turkic}) (NOT \p{Block=Old_Turkic})
                             (73)
 X \p{Ornamental_Dingbats} \p{Block=Ornamental_Dingbats} (48)
   \p{Orya}                \p{Oriya} (= \p{Script_Extensions=Oriya})
                             (NOT \p{Block=Oriya}) (94)
   \p{Osage}               \p{Script_Extensions=Osage} (Short:
                             \p{Osge}; NOT \p{Block=Osage}) (72)
   \p{Osge}                \p{Osage} (= \p{Script_Extensions=Osage})
                             (NOT \p{Block=Osage}) (72)
   \p{Osma}                \p{Osmanya} (= \p{Script_Extensions=
                             Osmanya}) (NOT \p{Block=Osmanya}) (40)
   \p{Osmanya}             \p{Script_Extensions=Osmanya} (Short:
                             \p{Osma}; NOT \p{Block=Osmanya}) (40)
   \p{Other}               \p{General_Category=Other} (Short: \p{C})
                             (986_091 plus all above-Unicode code
                             points)
   \p{Other_Letter}        \p{General_Category=Other_Letter} (Short:
                             \p{Lo}) (112_721)
   \p{Other_Number}        \p{General_Category=Other_Number} (Short:
                             \p{No}) (676)
   \p{Other_Punctuation}   \p{General_Category=Other_Punctuation}
                             (Short: \p{Po}) (544)
   \p{Other_Symbol}        \p{General_Category=Other_Symbol} (Short:
                             \p{So}) (5777)
   \p{P} \pP               \p{Punct} (= \p{General_Category=
                             Punctuation}) (NOT
                             \p{General_Punctuation}) (748)
   \p{Pahawh_Hmong}        \p{Script_Extensions=Pahawh_Hmong} (Short:
                             \p{Hmng}; NOT \p{Block=Pahawh_Hmong})
                             (127)
   \p{Palm}                \p{Palmyrene} (= \p{Script_Extensions=
                             Palmyrene}) (32)
   \p{Palmyrene}           \p{Script_Extensions=Palmyrene} (Short:
                             \p{Palm}) (32)
   \p{Paragraph_Separator} \p{General_Category=Paragraph_Separator}
                             (Short: \p{Zp}) (1)
   \p{Pat_Syn}             \p{Pattern_Syntax} (= \p{Pattern_Syntax=
                             Y}) (2760)
   \p{Pat_Syn: *}          \p{Pattern_Syntax: *}
   \p{Pat_WS}              \p{Pattern_White_Space} (=
                             \p{Pattern_White_Space=Y}) (11)
   \p{Pat_WS: *}           \p{Pattern_White_Space: *}
   \p{Pattern_Syntax}      \p{Pattern_Syntax=Y} (Short: \p{PatSyn})
                             (2760)
   \p{Pattern_Syntax: N*}  (Short: \p{PatSyn=N}, \P{PatSyn})
                             (1_111_352 plus all above-Unicode code
                             points)
   \p{Pattern_Syntax: Y*}  (Short: \p{PatSyn=Y}, \p{PatSyn}) (2760)
   \p{Pattern_White_Space} \p{Pattern_White_Space=Y} (Short:
                             \p{PatWS}) (11)
   \p{Pattern_White_Space: N*} (Short: \p{PatWS=N}, \P{PatWS})
                             (1_114_101 plus all above-Unicode code
                             points)
   \p{Pattern_White_Space: Y*} (Short: \p{PatWS=Y}, \p{PatWS}) (11)
   \p{Pau_Cin_Hau}         \p{Script_Extensions=Pau_Cin_Hau} (Short:
                             \p{Pauc}; NOT \p{Block=Pau_Cin_Hau}) (57)
   \p{Pauc}                \p{Pau_Cin_Hau} (= \p{Script_Extensions=
                             Pau_Cin_Hau}) (NOT \p{Block=
                             Pau_Cin_Hau}) (57)
   \p{Pc}                  \p{Connector_Punctuation} (=
                             \p{General_Category=
                             Connector_Punctuation}) (10)
   \p{PCM}                 \p{Prepended_Concatenation_Mark} (=
                             \p{Prepended_Concatenation_Mark=Y}) (10)
   \p{PCM: *}              \p{Prepended_Concatenation_Mark: *}
   \p{Pd}                  \p{Dash_Punctuation} (=
                             \p{General_Category=Dash_Punctuation})
                             (24)
   \p{Pe}                  \p{Close_Punctuation} (=
                             \p{General_Category=Close_Punctuation})
                             (73)
   \p{PerlSpace}           \p{PosixSpace} (6)
   \p{PerlWord}            \p{PosixWord} (63)
   \p{Perm}                \p{Old_Permic} (= \p{Script_Extensions=
                             Old_Permic}) (NOT \p{Block=Old_Permic})
                             (44)
   \p{Pf}                  \p{Final_Punctuation} (=
                             \p{General_Category=Final_Punctuation})
                             (10)
   \p{Phag}                \p{Phags_Pa} (= \p{Script_Extensions=
                             Phags_Pa}) (NOT \p{Block=Phags_Pa}) (59)
   \p{Phags_Pa}            \p{Script_Extensions=Phags_Pa} (Short:
                             \p{Phag}; NOT \p{Block=Phags_Pa}) (59)
 X \p{Phaistos}            \p{Phaistos_Disc} (= \p{Block=
                             Phaistos_Disc}) (48)
 X \p{Phaistos_Disc}       \p{Block=Phaistos_Disc} (Short:
                             \p{InPhaistos}) (48)
   \p{Phli}                \p{Inscriptional_Pahlavi} (=
                             \p{Script_Extensions=
                             Inscriptional_Pahlavi}) (NOT \p{Block=
                             Inscriptional_Pahlavi}) (27)
   \p{Phlp}                \p{Psalter_Pahlavi} (=
                             \p{Script_Extensions=Psalter_Pahlavi})
                             (NOT \p{Block=Psalter_Pahlavi}) (30)
   \p{Phnx}                \p{Phoenician} (= \p{Script_Extensions=
                             Phoenician}) (NOT \p{Block=Phoenician})
                             (29)
   \p{Phoenician}          \p{Script_Extensions=Phoenician} (Short:
                             \p{Phnx}; NOT \p{Block=Phoenician}) (29)
 X \p{Phonetic_Ext}        \p{Phonetic_Extensions} (= \p{Block=
                             Phonetic_Extensions}) (128)
 X \p{Phonetic_Ext_Sup}    \p{Phonetic_Extensions_Supplement} (=
                             \p{Block=
                             Phonetic_Extensions_Supplement}) (64)
 X \p{Phonetic_Extensions} \p{Block=Phonetic_Extensions} (Short:
                             \p{InPhoneticExt}) (128)
 X \p{Phonetic_Extensions_Supplement} \p{Block=
                             Phonetic_Extensions_Supplement} (Short:
                             \p{InPhoneticExtSup}) (64)
   \p{Pi}                  \p{Initial_Punctuation} (=
                             \p{General_Category=
                             Initial_Punctuation}) (12)
 X \p{Playing_Cards}       \p{Block=Playing_Cards} (96)
   \p{Plrd}                \p{Miao} (= \p{Script_Extensions=Miao})
                             (NOT \p{Block=Miao}) (133)
   \p{Po}                  \p{Other_Punctuation} (=
                             \p{General_Category=Other_Punctuation})
                             (544)
   \p{PosixAlnum}          [A-Za-z0-9] (62)
   \p{PosixAlpha}          [A-Za-z] (52)
   \p{PosixBlank}          \t and ' ' (2)
   \p{PosixCntrl}          ASCII control characters: NUL, SOH, STX,
                             ETX, EOT, ENQ, ACK, BEL, BS, HT, LF, VT,
                             FF, CR, SO, SI, DLE, DC1, DC2, DC3, DC4,
                             NAK, SYN, ETB, CAN, EOM, SUB, ESC, FS,
                             GS, RS, US, and DEL (33)
   \p{PosixDigit}          [0-9] (10)
   \p{PosixGraph}          [-!"#$%&'()*+,./:;<=>?@[\\]^_`{|}~0-9A-Za-z]
                             (94)
   \p{PosixLower}          [a-z] (/i= PosixAlpha) (26)
   \p{PosixPrint}          [- 0-9A-Za-z!"#$%&'()*+,./:;<=>?@[\\]^_`{|}~]
                             (95)
   \p{PosixPunct}          [-!"#$%&'()*+,./:;<=>?@[\\]^_`{|}~] (32)
   \p{PosixSpace}          \t, \n, \cK, \f, \r, and ' '.  (\cK is
                             vertical tab) (Short: \p{PerlSpace}) (6)
   \p{PosixUpper}          [A-Z] (/i= PosixAlpha) (26)
   \p{PosixWord}           \w, restricted to ASCII = [A-Za-z0-9_]
                             (Short: \p{PerlWord}) (63)
   \p{PosixXDigit}         \p{ASCII_Hex_Digit=Y} [0-9A-Fa-f] (Short:
                             \p{AHex}) (22)
   \p{Prepended_Concatenation_Mark} \p{Prepended_Concatenation_Mark=
                             Y} (Short: \p{PCM}) (10)
   \p{Prepended_Concatenation_Mark: N*} (Short: \p{PCM=N}, \P{PCM})
                             (1_114_102 plus all above-Unicode code
                             points)
   \p{Prepended_Concatenation_Mark: Y*} (Short: \p{PCM=Y}, \p{PCM})
                             (10)
 T \p{Present_In: 1.1}     \p{Age=V1_1} (Short: \p{In=1.1}) (Perl
                             extension) (33_979)
 T \p{Present_In: 2.0}     Code point's usage introduced in version
                             2.0 or earlier (Short: \p{In=2.0}) (Perl
                             extension) (178_500)
 T \p{Present_In: 2.1}     Code point's usage introduced in version
                             2.1 or earlier (Short: \p{In=2.1}) (Perl
                             extension) (178_502)
 T \p{Present_In: 3.0}     Code point's usage introduced in version
                             3.0 or earlier (Short: \p{In=3.0}) (Perl
                             extension) (188_809)
 T \p{Present_In: 3.1}     Code point's usage introduced in version
                             3.1 or earlier (Short: \p{In=3.1}) (Perl
                             extension) (233_787)
 T \p{Present_In: 3.2}     Code point's usage introduced in version
                             3.2 or earlier (Short: \p{In=3.2}) (Perl
                             extension) (234_803)
 T \p{Present_In: 4.0}     Code point's usage introduced in version
                             4.0 or earlier (Short: \p{In=4.0}) (Perl
                             extension) (236_029)
 T \p{Present_In: 4.1}     Code point's usage introduced in version
                             4.1 or earlier (Short: \p{In=4.1}) (Perl
                             extension) (237_302)
 T \p{Present_In: 5.0}     Code point's usage introduced in version
                             5.0 or earlier (Short: \p{In=5.0}) (Perl
                             extension) (238_671)
 T \p{Present_In: 5.1}     Code point's usage introduced in version
                             5.1 or earlier (Short: \p{In=5.1}) (Perl
                             extension) (240_295)
 T \p{Present_In: 5.2}     Code point's usage introduced in version
                             5.2 or earlier (Short: \p{In=5.2}) (Perl
                             extension) (246_943)
 T \p{Present_In: 6.0}     Code point's usage introduced in version
                             6.0 or earlier (Short: \p{In=6.0}) (Perl
                             extension) (249_031)
 T \p{Present_In: 6.1}     Code point's usage introduced in version
                             6.1 or earlier (Short: \p{In=6.1}) (Perl
                             extension) (249_763)
 T \p{Present_In: 6.2}     Code point's usage introduced in version
                             6.2 or earlier (Short: \p{In=6.2}) (Perl
                             extension) (249_764)
 T \p{Present_In: 6.3}     Code point's usage introduced in version
                             6.3 or earlier (Short: \p{In=6.3}) (Perl
                             extension) (249_769)
 T \p{Present_In: 7.0}     Code point's usage introduced in version
                             7.0 or earlier (Short: \p{In=7.0}) (Perl
                             extension) (252_603)
 T \p{Present_In: 8.0}     Code point's usage introduced in version
                             8.0 or earlier (Short: \p{In=8.0}) (Perl
                             extension) (260_319)
 T \p{Present_In: 9.0}     Code point's usage introduced in version
                             9.0 or earlier (Short: \p{In=9.0}) (Perl
                             extension) (267_819)
   \p{Present_In: Unassigned} \p{Age=Unassigned} (Short: \p{In=
                             Unassigned}) (Perl extension) (846_293
                             plus all above-Unicode code points)
   \p{Print}               \p{XPosixPrint} (265_638)
   \p{Private_Use}         \p{General_Category=Private_Use} (Short:
                             \p{Co}; NOT \p{Private_Use_Area})
                             (137_468)
 X \p{Private_Use_Area}    \p{Block=Private_Use_Area} (Short:
                             \p{InPUA}) (6400)
   \p{Prti}                \p{Inscriptional_Parthian} (=
                             \p{Script_Extensions=
                             Inscriptional_Parthian}) (NOT \p{Block=
                             Inscriptional_Parthian}) (30)
   \p{Ps}                  \p{Open_Punctuation} (=
                             \p{General_Category=Open_Punctuation})
                             (75)
   \p{Psalter_Pahlavi}     \p{Script_Extensions=Psalter_Pahlavi}
                             (Short: \p{Phlp}; NOT \p{Block=
                             Psalter_Pahlavi}) (30)
 X \p{PUA}                 \p{Private_Use_Area} (= \p{Block=
                             Private_Use_Area}) (6400)
   \p{Punct}               \p{General_Category=Punctuation} (Short:
                             \p{P}; NOT \p{General_Punctuation}) (748)
   \p{Punctuation}         \p{Punct} (= \p{General_Category=
                             Punctuation}) (NOT
                             \p{General_Punctuation}) (748)
   \p{Qaac}                \p{Coptic} (= \p{Script_Extensions=
                             Coptic}) (NOT \p{Block=Coptic}) (165)
   \p{Qaai}                \p{Inherited} (= \p{Script_Extensions=
                             Inherited}) (496)
   \p{QMark}               \p{Quotation_Mark} (= \p{Quotation_Mark=
                             Y}) (30)
   \p{QMark: *}            \p{Quotation_Mark: *}
   \p{Quotation_Mark}      \p{Quotation_Mark=Y} (Short: \p{QMark})
                             (30)
   \p{Quotation_Mark: N*}  (Short: \p{QMark=N}, \P{QMark}) (1_114_082
                             plus all above-Unicode code points)
   \p{Quotation_Mark: Y*}  (Short: \p{QMark=Y}, \p{QMark}) (30)
   \p{Radical}             \p{Radical=Y} (329)
   \p{Radical: N*}         (Single: \P{Radical}) (1_113_783 plus all
                             above-Unicode code points)
   \p{Radical: Y*}         (Single: \p{Radical}) (329)
   \p{Rejang}              \p{Script_Extensions=Rejang} (Short:
                             \p{Rjng}; NOT \p{Block=Rejang}) (37)
   \p{Rjng}                \p{Rejang} (= \p{Script_Extensions=
                             Rejang}) (NOT \p{Block=Rejang}) (37)
 X \p{Rumi}                \p{Rumi_Numeral_Symbols} (= \p{Block=
                             Rumi_Numeral_Symbols}) (32)
 X \p{Rumi_Numeral_Symbols} \p{Block=Rumi_Numeral_Symbols} (Short:
                             \p{InRumi}) (32)
   \p{Runic}               \p{Script_Extensions=Runic} (Short:
                             \p{Runr}; NOT \p{Block=Runic}) (86)
   \p{Runr}                \p{Runic} (= \p{Script_Extensions=Runic})
                             (NOT \p{Block=Runic}) (86)
   \p{S} \pS               \p{Symbol} (= \p{General_Category=Symbol})
                             (6899)
   \p{Samaritan}           \p{Script_Extensions=Samaritan} (Short:
                             \p{Samr}; NOT \p{Block=Samaritan}) (61)
   \p{Samr}                \p{Samaritan} (= \p{Script_Extensions=
                             Samaritan}) (NOT \p{Block=Samaritan})
                             (61)
   \p{Sarb}                \p{Old_South_Arabian} (=
                             \p{Script_Extensions=Old_South_Arabian})
                             (32)
   \p{Saur}                \p{Saurashtra} (= \p{Script_Extensions=
                             Saurashtra}) (NOT \p{Block=Saurashtra})
                             (82)
   \p{Saurashtra}          \p{Script_Extensions=Saurashtra} (Short:
                             \p{Saur}; NOT \p{Block=Saurashtra}) (82)
   \p{SB: *}               \p{Sentence_Break: *}
   \p{Sc}                  \p{Currency_Symbol} (=
                             \p{General_Category=Currency_Symbol})
                             (53)
   \p{Sc: *}               \p{Script: *}
   \p{Script: Adlam}       (Short: \p{Sc=Adlm}) (87)
   \p{Script: Adlm}        \p{Script=Adlam} (87)
   \p{Script: Aghb}        \p{Script=Caucasian_Albanian} (53)
   \p{Script: Ahom}        (Short: \p{Sc=Ahom}) (57)
   \p{Script: Anatolian_Hieroglyphs} (Short: \p{Sc=Hluw}) (583)
   \p{Script: Arab}        \p{Script=Arabic} (1279)
   \p{Script: Arabic}      (Short: \p{Sc=Arab}) (1279)
   \p{Script: Armenian}    (Short: \p{Sc=Armn}) (93)
   \p{Script: Armi}        \p{Script=Imperial_Aramaic} (31)
   \p{Script: Armn}        \p{Script=Armenian} (93)
   \p{Script: Avestan}     (Short: \p{Sc=Avst}) (61)
   \p{Script: Avst}        \p{Script=Avestan} (61)
   \p{Script: Bali}        \p{Script=Balinese} (121)
   \p{Script: Balinese}    (Short: \p{Sc=Bali}) (121)
   \p{Script: Bamu}        \p{Script=Bamum} (657)
   \p{Script: Bamum}       (Short: \p{Sc=Bamu}) (657)
   \p{Script: Bass}        \p{Script=Bassa_Vah} (36)
   \p{Script: Bassa_Vah}   (Short: \p{Sc=Bass}) (36)
   \p{Script: Batak}       (Short: \p{Sc=Batk}) (56)
   \p{Script: Batk}        \p{Script=Batak} (56)
   \p{Script: Beng}        \p{Script=Bengali} (93)
   \p{Script: Bengali}     (Short: \p{Sc=Beng}) (93)
   \p{Script: Bhaiksuki}   (Short: \p{Sc=Bhks}) (97)
   \p{Script: Bhks}        \p{Script=Bhaiksuki} (97)
   \p{Script: Bopo}        \p{Script=Bopomofo} (70)
   \p{Script: Bopomofo}    (Short: \p{Sc=Bopo}) (70)
   \p{Script: Brah}        \p{Script=Brahmi} (109)
   \p{Script: Brahmi}      (Short: \p{Sc=Brah}) (109)
   \p{Script: Brai}        \p{Script=Braille} (256)
   \p{Script: Braille}     (Short: \p{Sc=Brai}) (256)
   \p{Script: Bugi}        \p{Script=Buginese} (30)
   \p{Script: Buginese}    (Short: \p{Sc=Bugi}) (30)
   \p{Script: Buhd}        \p{Script=Buhid} (20)
   \p{Script: Buhid}       (Short: \p{Sc=Buhd}) (20)
   \p{Script: Cakm}        \p{Script=Chakma} (67)
   \p{Script: Canadian_Aboriginal} (Short: \p{Sc=Cans}) (710)
   \p{Script: Cans}        \p{Script=Canadian_Aboriginal} (710)
   \p{Script: Cari}        \p{Script=Carian} (49)
   \p{Script: Carian}      (Short: \p{Sc=Cari}) (49)
   \p{Script: Caucasian_Albanian} (Short: \p{Sc=Aghb}) (53)
   \p{Script: Chakma}      (Short: \p{Sc=Cakm}) (67)
   \p{Script: Cham}        (Short: \p{Sc=Cham}) (83)
   \p{Script: Cher}        \p{Script=Cherokee} (172)
   \p{Script: Cherokee}    (Short: \p{Sc=Cher}) (172)
   \p{Script: Common}      (Short: \p{Sc=Zyyy}) (7279)
   \p{Script: Copt}        \p{Script=Coptic} (137)
   \p{Script: Coptic}      (Short: \p{Sc=Copt}) (137)
   \p{Script: Cprt}        \p{Script=Cypriot} (55)
   \p{Script: Cuneiform}   (Short: \p{Sc=Xsux}) (1234)
   \p{Script: Cypriot}     (Short: \p{Sc=Cprt}) (55)
   \p{Script: Cyrillic}    (Short: \p{Sc=Cyrl}) (443)
   \p{Script: Cyrl}        \p{Script=Cyrillic} (443)
   \p{Script: Deseret}     (Short: \p{Sc=Dsrt}) (80)
   \p{Script: Deva}        \p{Script=Devanagari} (154)
   \p{Script: Devanagari}  (Short: \p{Sc=Deva}) (154)
   \p{Script: Dsrt}        \p{Script=Deseret} (80)
   \p{Script: Dupl}        \p{Script=Duployan} (143)
   \p{Script: Duployan}    (Short: \p{Sc=Dupl}) (143)
   \p{Script: Egyp}        \p{Script=Egyptian_Hieroglyphs} (1071)
   \p{Script: Egyptian_Hieroglyphs} (Short: \p{Sc=Egyp}) (1071)
   \p{Script: Elba}        \p{Script=Elbasan} (40)
   \p{Script: Elbasan}     (Short: \p{Sc=Elba}) (40)
   \p{Script: Ethi}        \p{Script=Ethiopic} (495)
   \p{Script: Ethiopic}    (Short: \p{Sc=Ethi}) (495)
   \p{Script: Geor}        \p{Script=Georgian} (127)
   \p{Script: Georgian}    (Short: \p{Sc=Geor}) (127)
   \p{Script: Glag}        \p{Script=Glagolitic} (132)
   \p{Script: Glagolitic}  (Short: \p{Sc=Glag}) (132)
   \p{Script: Goth}        \p{Script=Gothic} (27)
   \p{Script: Gothic}      (Short: \p{Sc=Goth}) (27)
   \p{Script: Gran}        \p{Script=Grantha} (85)
   \p{Script: Grantha}     (Short: \p{Sc=Gran}) (85)
   \p{Script: Greek}       (Short: \p{Sc=Grek}) (518)
   \p{Script: Grek}        \p{Script=Greek} (518)
   \p{Script: Gujarati}    (Short: \p{Sc=Gujr}) (85)
   \p{Script: Gujr}        \p{Script=Gujarati} (85)
   \p{Script: Gurmukhi}    (Short: \p{Sc=Guru}) (79)
   \p{Script: Guru}        \p{Script=Gurmukhi} (79)
   \p{Script: Han}         (Short: \p{Sc=Han}) (81_734)
   \p{Script: Hang}        \p{Script=Hangul} (11_739)
   \p{Script: Hangul}      (Short: \p{Sc=Hang}) (11_739)
   \p{Script: Hani}        \p{Script=Han} (81_734)
   \p{Script: Hano}        \p{Script=Hanunoo} (21)
   \p{Script: Hanunoo}     (Short: \p{Sc=Hano}) (21)
   \p{Script: Hatr}        \p{Script=Hatran} (26)
   \p{Script: Hatran}      (Short: \p{Sc=Hatr}) (26)
   \p{Script: Hebr}        \p{Script=Hebrew} (133)
   \p{Script: Hebrew}      (Short: \p{Sc=Hebr}) (133)
   \p{Script: Hira}        \p{Script=Hiragana} (91)
   \p{Script: Hiragana}    (Short: \p{Sc=Hira}) (91)
   \p{Script: Hluw}        \p{Script=Anatolian_Hieroglyphs} (583)
   \p{Script: Hmng}        \p{Script=Pahawh_Hmong} (127)
   \p{Script: Hung}        \p{Script=Old_Hungarian} (108)
   \p{Script: Imperial_Aramaic} (Short: \p{Sc=Armi}) (31)
   \p{Script: Inherited}   (Short: \p{Sc=Zinh}) (564)
   \p{Script: Inscriptional_Pahlavi} (Short: \p{Sc=Phli}) (27)
   \p{Script: Inscriptional_Parthian} (Short: \p{Sc=Prti}) (30)
   \p{Script: Ital}        \p{Script=Old_Italic} (36)
   \p{Script: Java}        \p{Script=Javanese} (90)
   \p{Script: Javanese}    (Short: \p{Sc=Java}) (90)
   \p{Script: Kaithi}      (Short: \p{Sc=Kthi}) (66)
   \p{Script: Kali}        \p{Script=Kayah_Li} (47)
   \p{Script: Kana}        \p{Script=Katakana} (300)
   \p{Script: Kannada}     (Short: \p{Sc=Knda}) (88)
   \p{Script: Katakana}    (Short: \p{Sc=Kana}) (300)
   \p{Script: Kayah_Li}    (Short: \p{Sc=Kali}) (47)
   \p{Script: Khar}        \p{Script=Kharoshthi} (65)
   \p{Script: Kharoshthi}  (Short: \p{Sc=Khar}) (65)
   \p{Script: Khmer}       (Short: \p{Sc=Khmr}) (146)
   \p{Script: Khmr}        \p{Script=Khmer} (146)
   \p{Script: Khoj}        \p{Script=Khojki} (62)
   \p{Script: Khojki}      (Short: \p{Sc=Khoj}) (62)
   \p{Script: Khudawadi}   (Short: \p{Sc=Sind}) (69)
   \p{Script: Knda}        \p{Script=Kannada} (88)
   \p{Script: Kthi}        \p{Script=Kaithi} (66)
   \p{Script: Lana}        \p{Script=Tai_Tham} (127)
   \p{Script: Lao}         (Short: \p{Sc=Lao}) (67)
   \p{Script: Laoo}        \p{Script=Lao} (67)
   \p{Script: Latin}       (Short: \p{Sc=Latn}) (1350)
   \p{Script: Latn}        \p{Script=Latin} (1350)
   \p{Script: Lepc}        \p{Script=Lepcha} (74)
   \p{Script: Lepcha}      (Short: \p{Sc=Lepc}) (74)
   \p{Script: Limb}        \p{Script=Limbu} (68)
   \p{Script: Limbu}       (Short: \p{Sc=Limb}) (68)
   \p{Script: Lina}        \p{Script=Linear_A} (341)
   \p{Script: Linb}        \p{Script=Linear_B} (211)
   \p{Script: Linear_A}    (Short: \p{Sc=Lina}) (341)
   \p{Script: Linear_B}    (Short: \p{Sc=Linb}) (211)
   \p{Script: Lisu}        (Short: \p{Sc=Lisu}) (48)
   \p{Script: Lyci}        \p{Script=Lycian} (29)
   \p{Script: Lycian}      (Short: \p{Sc=Lyci}) (29)
   \p{Script: Lydi}        \p{Script=Lydian} (27)
   \p{Script: Lydian}      (Short: \p{Sc=Lydi}) (27)
   \p{Script: Mahajani}    (Short: \p{Sc=Mahj}) (39)
   \p{Script: Mahj}        \p{Script=Mahajani} (39)
   \p{Script: Malayalam}   (Short: \p{Sc=Mlym}) (114)
   \p{Script: Mand}        \p{Script=Mandaic} (29)
   \p{Script: Mandaic}     (Short: \p{Sc=Mand}) (29)
   \p{Script: Mani}        \p{Script=Manichaean} (51)
   \p{Script: Manichaean}  (Short: \p{Sc=Mani}) (51)
   \p{Script: Marc}        \p{Script=Marchen} (68)
   \p{Script: Marchen}     (Short: \p{Sc=Marc}) (68)
   \p{Script: Meetei_Mayek} (Short: \p{Sc=Mtei}) (79)
   \p{Script: Mend}        \p{Script=Mende_Kikakui} (213)
   \p{Script: Mende_Kikakui} (Short: \p{Sc=Mend}) (213)
   \p{Script: Merc}        \p{Script=Meroitic_Cursive} (90)
   \p{Script: Mero}        \p{Script=Meroitic_Hieroglyphs} (32)
   \p{Script: Meroitic_Cursive} (Short: \p{Sc=Merc}) (90)
   \p{Script: Meroitic_Hieroglyphs} (Short: \p{Sc=Mero}) (32)
   \p{Script: Miao}        (Short: \p{Sc=Miao}) (133)
   \p{Script: Mlym}        \p{Script=Malayalam} (114)
   \p{Script: Modi}        (Short: \p{Sc=Modi}) (79)
   \p{Script: Mong}        \p{Script=Mongolian} (166)
   \p{Script: Mongolian}   (Short: \p{Sc=Mong}) (166)
   \p{Script: Mro}         (Short: \p{Sc=Mro}) (43)
   \p{Script: Mroo}        \p{Script=Mro} (43)
   \p{Script: Mtei}        \p{Script=Meetei_Mayek} (79)
   \p{Script: Mult}        \p{Script=Multani} (38)
   \p{Script: Multani}     (Short: \p{Sc=Mult}) (38)
   \p{Script: Myanmar}     (Short: \p{Sc=Mymr}) (223)
   \p{Script: Mymr}        \p{Script=Myanmar} (223)
   \p{Script: Nabataean}   (Short: \p{Sc=Nbat}) (40)
   \p{Script: Narb}        \p{Script=Old_North_Arabian} (32)
   \p{Script: Nbat}        \p{Script=Nabataean} (40)
   \p{Script: New_Tai_Lue} (Short: \p{Sc=Talu}) (83)
   \p{Script: Newa}        (Short: \p{Sc=Newa}) (92)
   \p{Script: Nko}         (Short: \p{Sc=Nko}) (59)
   \p{Script: Nkoo}        \p{Script=Nko} (59)
   \p{Script: Ogam}        \p{Script=Ogham} (29)
   \p{Script: Ogham}       (Short: \p{Sc=Ogam}) (29)
   \p{Script: Ol_Chiki}    (Short: \p{Sc=Olck}) (48)
   \p{Script: Olck}        \p{Script=Ol_Chiki} (48)
   \p{Script: Old_Hungarian} (Short: \p{Sc=Hung}) (108)
   \p{Script: Old_Italic}  (Short: \p{Sc=Ital}) (36)
   \p{Script: Old_North_Arabian} (Short: \p{Sc=Narb}) (32)
   \p{Script: Old_Permic}  (Short: \p{Sc=Perm}) (43)
   \p{Script: Old_Persian} (Short: \p{Sc=Xpeo}) (50)
   \p{Script: Old_South_Arabian} (Short: \p{Sc=Sarb}) (32)
   \p{Script: Old_Turkic}  (Short: \p{Sc=Orkh}) (73)
   \p{Script: Oriya}       (Short: \p{Sc=Orya}) (90)
   \p{Script: Orkh}        \p{Script=Old_Turkic} (73)
   \p{Script: Orya}        \p{Script=Oriya} (90)
   \p{Script: Osage}       (Short: \p{Sc=Osge}) (72)
   \p{Script: Osge}        \p{Script=Osage} (72)
   \p{Script: Osma}        \p{Script=Osmanya} (40)
   \p{Script: Osmanya}     (Short: \p{Sc=Osma}) (40)
   \p{Script: Pahawh_Hmong} (Short: \p{Sc=Hmng}) (127)
   \p{Script: Palm}        \p{Script=Palmyrene} (32)
   \p{Script: Palmyrene}   (Short: \p{Sc=Palm}) (32)
   \p{Script: Pau_Cin_Hau} (Short: \p{Sc=Pauc}) (57)
   \p{Script: Pauc}        \p{Script=Pau_Cin_Hau} (57)
   \p{Script: Perm}        \p{Script=Old_Permic} (43)
   \p{Script: Phag}        \p{Script=Phags_Pa} (56)
   \p{Script: Phags_Pa}    (Short: \p{Sc=Phag}) (56)
   \p{Script: Phli}        \p{Script=Inscriptional_Pahlavi} (27)
   \p{Script: Phlp}        \p{Script=Psalter_Pahlavi} (29)
   \p{Script: Phnx}        \p{Script=Phoenician} (29)
   \p{Script: Phoenician}  (Short: \p{Sc=Phnx}) (29)
   \p{Script: Plrd}        \p{Script=Miao} (133)
   \p{Script: Prti}        \p{Script=Inscriptional_Parthian} (30)
   \p{Script: Psalter_Pahlavi} (Short: \p{Sc=Phlp}) (29)
   \p{Script: Qaac}        \p{Script=Coptic} (137)
   \p{Script: Qaai}        \p{Script=Inherited} (564)
   \p{Script: Rejang}      (Short: \p{Sc=Rjng}) (37)
   \p{Script: Rjng}        \p{Script=Rejang} (37)
   \p{Script: Runic}       (Short: \p{Sc=Runr}) (86)
   \p{Script: Runr}        \p{Script=Runic} (86)
   \p{Script: Samaritan}   (Short: \p{Sc=Samr}) (61)
   \p{Script: Samr}        \p{Script=Samaritan} (61)
   \p{Script: Sarb}        \p{Script=Old_South_Arabian} (32)
   \p{Script: Saur}        \p{Script=Saurashtra} (82)
   \p{Script: Saurashtra}  (Short: \p{Sc=Saur}) (82)
   \p{Script: Sgnw}        \p{Script=SignWriting} (672)
   \p{Script: Sharada}     (Short: \p{Sc=Shrd}) (94)
   \p{Script: Shavian}     (Short: \p{Sc=Shaw}) (48)
   \p{Script: Shaw}        \p{Script=Shavian} (48)
   \p{Script: Shrd}        \p{Script=Sharada} (94)
   \p{Script: Sidd}        \p{Script=Siddham} (92)
   \p{Script: Siddham}     (Short: \p{Sc=Sidd}) (92)
   \p{Script: SignWriting} (Short: \p{Sc=Sgnw}) (672)
   \p{Script: Sind}        \p{Script=Khudawadi} (69)
   \p{Script: Sinh}        \p{Script=Sinhala} (110)
   \p{Script: Sinhala}     (Short: \p{Sc=Sinh}) (110)
   \p{Script: Sora}        \p{Script=Sora_Sompeng} (35)
   \p{Script: Sora_Sompeng} (Short: \p{Sc=Sora}) (35)
   \p{Script: Sund}        \p{Script=Sundanese} (72)
   \p{Script: Sundanese}   (Short: \p{Sc=Sund}) (72)
   \p{Script: Sylo}        \p{Script=Syloti_Nagri} (44)
   \p{Script: Syloti_Nagri} (Short: \p{Sc=Sylo}) (44)
   \p{Script: Syrc}        \p{Script=Syriac} (77)
   \p{Script: Syriac}      (Short: \p{Sc=Syrc}) (77)
   \p{Script: Tagalog}     (Short: \p{Sc=Tglg}) (20)
   \p{Script: Tagb}        \p{Script=Tagbanwa} (18)
   \p{Script: Tagbanwa}    (Short: \p{Sc=Tagb}) (18)
   \p{Script: Tai_Le}      (Short: \p{Sc=Tale}) (35)
   \p{Script: Tai_Tham}    (Short: \p{Sc=Lana}) (127)
   \p{Script: Tai_Viet}    (Short: \p{Sc=Tavt}) (72)
   \p{Script: Takr}        \p{Script=Takri} (66)
   \p{Script: Takri}       (Short: \p{Sc=Takr}) (66)
   \p{Script: Tale}        \p{Script=Tai_Le} (35)
   \p{Script: Talu}        \p{Script=New_Tai_Lue} (83)
   \p{Script: Tamil}       (Short: \p{Sc=Taml}) (72)
   \p{Script: Taml}        \p{Script=Tamil} (72)
   \p{Script: Tang}        \p{Script=Tangut} (6881)
   \p{Script: Tangut}      (Short: \p{Sc=Tang}) (6881)
   \p{Script: Tavt}        \p{Script=Tai_Viet} (72)
   \p{Script: Telu}        \p{Script=Telugu} (96)
   \p{Script: Telugu}      (Short: \p{Sc=Telu}) (96)
   \p{Script: Tfng}        \p{Script=Tifinagh} (59)
   \p{Script: Tglg}        \p{Script=Tagalog} (20)
   \p{Script: Thaa}        \p{Script=Thaana} (50)
   \p{Script: Thaana}      (Short: \p{Sc=Thaa}) (50)
   \p{Script: Thai}        (Short: \p{Sc=Thai}) (86)
   \p{Script: Tibetan}     (Short: \p{Sc=Tibt}) (207)
   \p{Script: Tibt}        \p{Script=Tibetan} (207)
   \p{Script: Tifinagh}    (Short: \p{Sc=Tfng}) (59)
   \p{Script: Tirh}        \p{Script=Tirhuta} (82)
   \p{Script: Tirhuta}     (Short: \p{Sc=Tirh}) (82)
   \p{Script: Ugar}        \p{Script=Ugaritic} (31)
   \p{Script: Ugaritic}    (Short: \p{Sc=Ugar}) (31)
   \p{Script: Unknown}     (Short: \p{Sc=Zzzz}) (985_875 plus all
                             above-Unicode code points)
   \p{Script: Vai}         (Short: \p{Sc=Vai}) (300)
   \p{Script: Vaii}        \p{Script=Vai} (300)
   \p{Script: Wara}        \p{Script=Warang_Citi} (84)
   \p{Script: Warang_Citi} (Short: \p{Sc=Wara}) (84)
   \p{Script: Xpeo}        \p{Script=Old_Persian} (50)
   \p{Script: Xsux}        \p{Script=Cuneiform} (1234)
   \p{Script: Yi}          (Short: \p{Sc=Yi}) (1220)
   \p{Script: Yiii}        \p{Script=Yi} (1220)
   \p{Script: Zinh}        \p{Script=Inherited} (564)
   \p{Script: Zyyy}        \p{Script=Common} (7279)
   \p{Script: Zzzz}        \p{Script=Unknown} (985_875 plus all
                             above-Unicode code points)
   \p{Script_Extensions: Adlam} (Short: \p{Scx=Adlm}, \p{Adlm}) (88)
   \p{Script_Extensions: Adlm} \p{Script_Extensions=Adlam} (88)
   \p{Script_Extensions: Aghb} \p{Script_Extensions=
                             Caucasian_Albanian} (53)
   \p{Script_Extensions: Ahom} (Short: \p{Scx=Ahom}, \p{Ahom}) (57)
   \p{Script_Extensions: Anatolian_Hieroglyphs} (Short: \p{Scx=Hluw},
                             \p{Hluw}) (583)
   \p{Script_Extensions: Arab} \p{Script_Extensions=Arabic} (1323)
   \p{Script_Extensions: Arabic} (Short: \p{Scx=Arab}, \p{Arab})
                             (1323)
   \p{Script_Extensions: Armenian} (Short: \p{Scx=Armn}, \p{Armn})
                             (94)
   \p{Script_Extensions: Armi} \p{Script_Extensions=Imperial_Aramaic}
                             (31)
   \p{Script_Extensions: Armn} \p{Script_Extensions=Armenian} (94)
   \p{Script_Extensions: Avestan} (Short: \p{Scx=Avst}, \p{Avst}) (61)
   \p{Script_Extensions: Avst} \p{Script_Extensions=Avestan} (61)
   \p{Script_Extensions: Bali} \p{Script_Extensions=Balinese} (121)
   \p{Script_Extensions: Balinese} (Short: \p{Scx=Bali}, \p{Bali})
                             (121)
   \p{Script_Extensions: Bamu} \p{Script_Extensions=Bamum} (657)
   \p{Script_Extensions: Bamum} (Short: \p{Scx=Bamu}, \p{Bamu}) (657)
   \p{Script_Extensions: Bass} \p{Script_Extensions=Bassa_Vah} (36)
   \p{Script_Extensions: Bassa_Vah} (Short: \p{Scx=Bass}, \p{Bass})
                             (36)
   \p{Script_Extensions: Batak} (Short: \p{Scx=Batk}, \p{Batk}) (56)
   \p{Script_Extensions: Batk} \p{Script_Extensions=Batak} (56)
   \p{Script_Extensions: Beng} \p{Script_Extensions=Bengali} (98)
   \p{Script_Extensions: Bengali} (Short: \p{Scx=Beng}, \p{Beng}) (98)
   \p{Script_Extensions: Bhaiksuki} (Short: \p{Scx=Bhks}, \p{Bhks})
                             (97)
   \p{Script_Extensions: Bhks} \p{Script_Extensions=Bhaiksuki} (97)
   \p{Script_Extensions: Bopo} \p{Script_Extensions=Bopomofo} (110)
   \p{Script_Extensions: Bopomofo} (Short: \p{Scx=Bopo}, \p{Bopo})
                             (110)
   \p{Script_Extensions: Brah} \p{Script_Extensions=Brahmi} (109)
   \p{Script_Extensions: Brahmi} (Short: \p{Scx=Brah}, \p{Brah}) (109)
   \p{Script_Extensions: Brai} \p{Script_Extensions=Braille} (256)
   \p{Script_Extensions: Braille} (Short: \p{Scx=Brai}, \p{Brai})
                             (256)
   \p{Script_Extensions: Bugi} \p{Script_Extensions=Buginese} (31)
   \p{Script_Extensions: Buginese} (Short: \p{Scx=Bugi}, \p{Bugi})
                             (31)
   \p{Script_Extensions: Buhd} \p{Script_Extensions=Buhid} (22)
   \p{Script_Extensions: Buhid} (Short: \p{Scx=Buhd}, \p{Buhd}) (22)
   \p{Script_Extensions: Cakm} \p{Script_Extensions=Chakma} (87)
   \p{Script_Extensions: Canadian_Aboriginal} (Short: \p{Scx=Cans},
                             \p{Cans}) (710)
   \p{Script_Extensions: Cans} \p{Script_Extensions=
                             Canadian_Aboriginal} (710)
   \p{Script_Extensions: Cari} \p{Script_Extensions=Carian} (49)
   \p{Script_Extensions: Carian} (Short: \p{Scx=Cari}, \p{Cari}) (49)
   \p{Script_Extensions: Caucasian_Albanian} (Short: \p{Scx=Aghb},
                             \p{Aghb}) (53)
   \p{Script_Extensions: Chakma} (Short: \p{Scx=Cakm}, \p{Cakm}) (87)
   \p{Script_Extensions: Cham} (Short: \p{Scx=Cham}, \p{Cham}) (83)
   \p{Script_Extensions: Cher} \p{Script_Extensions=Cherokee} (172)
   \p{Script_Extensions: Cherokee} (Short: \p{Scx=Cher}, \p{Cher})
                             (172)
   \p{Script_Extensions: Common} (Short: \p{Scx=Zyyy}, \p{Zyyy})
                             (6864)
   \p{Script_Extensions: Copt} \p{Script_Extensions=Coptic} (165)
   \p{Script_Extensions: Coptic} (Short: \p{Scx=Copt}, \p{Copt}) (165)
   \p{Script_Extensions: Cprt} \p{Script_Extensions=Cypriot} (112)
   \p{Script_Extensions: Cuneiform} (Short: \p{Scx=Xsux}, \p{Xsux})
                             (1234)
   \p{Script_Extensions: Cypriot} (Short: \p{Scx=Cprt}, \p{Cprt})
                             (112)
   \p{Script_Extensions: Cyrillic} (Short: \p{Scx=Cyrl}, \p{Cyrl})
                             (446)
   \p{Script_Extensions: Cyrl} \p{Script_Extensions=Cyrillic} (446)
   \p{Script_Extensions: Deseret} (Short: \p{Scx=Dsrt}, \p{Dsrt}) (80)
   \p{Script_Extensions: Deva} \p{Script_Extensions=Devanagari} (210)
   \p{Script_Extensions: Devanagari} (Short: \p{Scx=Deva}, \p{Deva})
                             (210)
   \p{Script_Extensions: Dsrt} \p{Script_Extensions=Deseret} (80)
   \p{Script_Extensions: Dupl} \p{Script_Extensions=Duployan} (147)
   \p{Script_Extensions: Duployan} (Short: \p{Scx=Dupl}, \p{Dupl})
                             (147)
   \p{Script_Extensions: Egyp} \p{Script_Extensions=
                             Egyptian_Hieroglyphs} (1071)
   \p{Script_Extensions: Egyptian_Hieroglyphs} (Short: \p{Scx=Egyp},
                             \p{Egyp}) (1071)
   \p{Script_Extensions: Elba} \p{Script_Extensions=Elbasan} (40)
   \p{Script_Extensions: Elbasan} (Short: \p{Scx=Elba}, \p{Elba}) (40)
   \p{Script_Extensions: Ethi} \p{Script_Extensions=Ethiopic} (495)
   \p{Script_Extensions: Ethiopic} (Short: \p{Scx=Ethi}, \p{Ethi})
                             (495)
   \p{Script_Extensions: Geor} \p{Script_Extensions=Georgian} (129)
   \p{Script_Extensions: Georgian} (Short: \p{Scx=Geor}, \p{Geor})
                             (129)
   \p{Script_Extensions: Glag} \p{Script_Extensions=Glagolitic} (136)
   \p{Script_Extensions: Glagolitic} (Short: \p{Scx=Glag}, \p{Glag})
                             (136)
   \p{Script_Extensions: Goth} \p{Script_Extensions=Gothic} (27)
   \p{Script_Extensions: Gothic} (Short: \p{Scx=Goth}, \p{Goth}) (27)
   \p{Script_Extensions: Gran} \p{Script_Extensions=Grantha} (113)
   \p{Script_Extensions: Grantha} (Short: \p{Scx=Gran}, \p{Gran})
                             (113)
   \p{Script_Extensions: Greek} (Short: \p{Scx=Grek}, \p{Grek}) (522)
   \p{Script_Extensions: Grek} \p{Script_Extensions=Greek} (522)
   \p{Script_Extensions: Gujarati} (Short: \p{Scx=Gujr}, \p{Gujr})
                             (99)
   \p{Script_Extensions: Gujr} \p{Script_Extensions=Gujarati} (99)
   \p{Script_Extensions: Gurmukhi} (Short: \p{Scx=Guru}, \p{Guru})
                             (93)
   \p{Script_Extensions: Guru} \p{Script_Extensions=Gurmukhi} (93)
   \p{Script_Extensions: Han} (Short: \p{Scx=Han}, \p{Han}) (82_013)
   \p{Script_Extensions: Hang} \p{Script_Extensions=Hangul} (11_775)
   \p{Script_Extensions: Hangul} (Short: \p{Scx=Hang}, \p{Hang})
                             (11_775)
   \p{Script_Extensions: Hani} \p{Script_Extensions=Han} (82_013)
   \p{Script_Extensions: Hano} \p{Script_Extensions=Hanunoo} (23)
   \p{Script_Extensions: Hanunoo} (Short: \p{Scx=Hano}, \p{Hano}) (23)
   \p{Script_Extensions: Hatr} \p{Script_Extensions=Hatran} (26)
   \p{Script_Extensions: Hatran} (Short: \p{Scx=Hatr}, \p{Hatr}) (26)
   \p{Script_Extensions: Hebr} \p{Script_Extensions=Hebrew} (133)
   \p{Script_Extensions: Hebrew} (Short: \p{Scx=Hebr}, \p{Hebr}) (133)
   \p{Script_Extensions: Hira} \p{Script_Extensions=Hiragana} (143)
   \p{Script_Extensions: Hiragana} (Short: \p{Scx=Hira}, \p{Hira})
                             (143)
   \p{Script_Extensions: Hluw} \p{Script_Extensions=
                             Anatolian_Hieroglyphs} (583)
   \p{Script_Extensions: Hmng} \p{Script_Extensions=Pahawh_Hmong}
                             (127)
   \p{Script_Extensions: Hung} \p{Script_Extensions=Old_Hungarian}
                             (108)
   \p{Script_Extensions: Imperial_Aramaic} (Short: \p{Scx=Armi},
                             \p{Armi}) (31)
   \p{Script_Extensions: Inherited} (Short: \p{Scx=Zinh}, \p{Zinh})
                             (496)
   \p{Script_Extensions: Inscriptional_Pahlavi} (Short: \p{Scx=Phli},
                             \p{Phli}) (27)
   \p{Script_Extensions: Inscriptional_Parthian} (Short: \p{Scx=
                             Prti}, \p{Prti}) (30)
   \p{Script_Extensions: Ital} \p{Script_Extensions=Old_Italic} (36)
   \p{Script_Extensions: Java} \p{Script_Extensions=Javanese} (91)
   \p{Script_Extensions: Javanese} (Short: \p{Scx=Java}, \p{Java})
                             (91)
   \p{Script_Extensions: Kaithi} (Short: \p{Scx=Kthi}, \p{Kthi}) (86)
   \p{Script_Extensions: Kali} \p{Script_Extensions=Kayah_Li} (48)
   \p{Script_Extensions: Kana} \p{Script_Extensions=Katakana} (352)
   \p{Script_Extensions: Kannada} (Short: \p{Scx=Knda}, \p{Knda})
                             (100)
   \p{Script_Extensions: Katakana} (Short: \p{Scx=Kana}, \p{Kana})
                             (352)
   \p{Script_Extensions: Kayah_Li} (Short: \p{Scx=Kali}, \p{Kali})
                             (48)
   \p{Script_Extensions: Khar} \p{Script_Extensions=Kharoshthi} (65)
   \p{Script_Extensions: Kharoshthi} (Short: \p{Scx=Khar}, \p{Khar})
                             (65)
   \p{Script_Extensions: Khmer} (Short: \p{Scx=Khmr}, \p{Khmr}) (146)
   \p{Script_Extensions: Khmr} \p{Script_Extensions=Khmer} (146)
   \p{Script_Extensions: Khoj} \p{Script_Extensions=Khojki} (72)
   \p{Script_Extensions: Khojki} (Short: \p{Scx=Khoj}, \p{Khoj}) (72)
   \p{Script_Extensions: Khudawadi} (Short: \p{Scx=Sind}, \p{Sind})
                             (81)
   \p{Script_Extensions: Knda} \p{Script_Extensions=Kannada} (100)
   \p{Script_Extensions: Kthi} \p{Script_Extensions=Kaithi} (86)
   \p{Script_Extensions: Lana} \p{Script_Extensions=Tai_Tham} (127)
   \p{Script_Extensions: Lao} (Short: \p{Scx=Lao}, \p{Lao}) (67)
   \p{Script_Extensions: Laoo} \p{Script_Extensions=Lao} (67)
   \p{Script_Extensions: Latin} (Short: \p{Scx=Latn}, \p{Latn}) (1370)
   \p{Script_Extensions: Latn} \p{Script_Extensions=Latin} (1370)
   \p{Script_Extensions: Lepc} \p{Script_Extensions=Lepcha} (74)
   \p{Script_Extensions: Lepcha} (Short: \p{Scx=Lepc}, \p{Lepc}) (74)
   \p{Script_Extensions: Limb} \p{Script_Extensions=Limbu} (69)
   \p{Script_Extensions: Limbu} (Short: \p{Scx=Limb}, \p{Limb}) (69)
   \p{Script_Extensions: Lina} \p{Script_Extensions=Linear_A} (386)
   \p{Script_Extensions: Linb} \p{Script_Extensions=Linear_B} (268)
   \p{Script_Extensions: Linear_A} (Short: \p{Scx=Lina}, \p{Lina})
                             (386)
   \p{Script_Extensions: Linear_B} (Short: \p{Scx=Linb}, \p{Linb})
                             (268)
   \p{Script_Extensions: Lisu} (Short: \p{Scx=Lisu}, \p{Lisu}) (48)
   \p{Script_Extensions: Lyci} \p{Script_Extensions=Lycian} (29)
   \p{Script_Extensions: Lycian} (Short: \p{Scx=Lyci}, \p{Lyci}) (29)
   \p{Script_Extensions: Lydi} \p{Script_Extensions=Lydian} (27)
   \p{Script_Extensions: Lydian} (Short: \p{Scx=Lydi}, \p{Lydi}) (27)
   \p{Script_Extensions: Mahajani} (Short: \p{Scx=Mahj}, \p{Mahj})
                             (61)
   \p{Script_Extensions: Mahj} \p{Script_Extensions=Mahajani} (61)
   \p{Script_Extensions: Malayalam} (Short: \p{Scx=Mlym}, \p{Mlym})
                             (119)
   \p{Script_Extensions: Mand} \p{Script_Extensions=Mandaic} (30)
   \p{Script_Extensions: Mandaic} (Short: \p{Scx=Mand}, \p{Mand}) (30)
   \p{Script_Extensions: Mani} \p{Script_Extensions=Manichaean} (52)
   \p{Script_Extensions: Manichaean} (Short: \p{Scx=Mani}, \p{Mani})
                             (52)
   \p{Script_Extensions: Marc} \p{Script_Extensions=Marchen} (68)
   \p{Script_Extensions: Marchen} (Short: \p{Scx=Marc}, \p{Marc}) (68)
   \p{Script_Extensions: Meetei_Mayek} (Short: \p{Scx=Mtei},
                             \p{Mtei}) (79)
   \p{Script_Extensions: Mend} \p{Script_Extensions=Mende_Kikakui}
                             (213)
   \p{Script_Extensions: Mende_Kikakui} (Short: \p{Scx=Mend},
                             \p{Mend}) (213)
   \p{Script_Extensions: Merc} \p{Script_Extensions=Meroitic_Cursive}
                             (90)
   \p{Script_Extensions: Mero} \p{Script_Extensions=
                             Meroitic_Hieroglyphs} (32)
   \p{Script_Extensions: Meroitic_Cursive} (Short: \p{Scx=Merc},
                             \p{Merc}) (90)
   \p{Script_Extensions: Meroitic_Hieroglyphs} (Short: \p{Scx=Mero},
                             \p{Mero}) (32)
   \p{Script_Extensions: Miao} (Short: \p{Scx=Miao}, \p{Miao}) (133)
   \p{Script_Extensions: Mlym} \p{Script_Extensions=Malayalam} (119)
   \p{Script_Extensions: Modi} (Short: \p{Scx=Modi}, \p{Modi}) (89)
   \p{Script_Extensions: Mong} \p{Script_Extensions=Mongolian} (169)
   \p{Script_Extensions: Mongolian} (Short: \p{Scx=Mong}, \p{Mong})
                             (169)
   \p{Script_Extensions: Mro} (Short: \p{Scx=Mro}, \p{Mro}) (43)
   \p{Script_Extensions: Mroo} \p{Script_Extensions=Mro} (43)
   \p{Script_Extensions: Mtei} \p{Script_Extensions=Meetei_Mayek} (79)
   \p{Script_Extensions: Mult} \p{Script_Extensions=Multani} (48)
   \p{Script_Extensions: Multani} (Short: \p{Scx=Mult}, \p{Mult}) (48)
   \p{Script_Extensions: Myanmar} (Short: \p{Scx=Mymr}, \p{Mymr})
                             (224)
   \p{Script_Extensions: Mymr} \p{Script_Extensions=Myanmar} (224)
   \p{Script_Extensions: Nabataean} (Short: \p{Scx=Nbat}, \p{Nbat})
                             (40)
   \p{Script_Extensions: Narb} \p{Script_Extensions=
                             Old_North_Arabian} (32)
   \p{Script_Extensions: Nbat} \p{Script_Extensions=Nabataean} (40)
   \p{Script_Extensions: New_Tai_Lue} (Short: \p{Scx=Talu}, \p{Talu})
                             (83)
   \p{Script_Extensions: Newa} (Short: \p{Scx=Newa}, \p{Newa}) (92)
   \p{Script_Extensions: Nko} (Short: \p{Scx=Nko}, \p{Nko}) (59)
   \p{Script_Extensions: Nkoo} \p{Script_Extensions=Nko} (59)
   \p{Script_Extensions: Ogam} \p{Script_Extensions=Ogham} (29)
   \p{Script_Extensions: Ogham} (Short: \p{Scx=Ogam}, \p{Ogam}) (29)
   \p{Script_Extensions: Ol_Chiki} (Short: \p{Scx=Olck}, \p{Olck})
                             (48)
   \p{Script_Extensions: Olck} \p{Script_Extensions=Ol_Chiki} (48)
   \p{Script_Extensions: Old_Hungarian} (Short: \p{Scx=Hung},
                             \p{Hung}) (108)
   \p{Script_Extensions: Old_Italic} (Short: \p{Scx=Ital}, \p{Ital})
                             (36)
   \p{Script_Extensions: Old_North_Arabian} (Short: \p{Scx=Narb},
                             \p{Narb}) (32)
   \p{Script_Extensions: Old_Permic} (Short: \p{Scx=Perm}, \p{Perm})
                             (44)
   \p{Script_Extensions: Old_Persian} (Short: \p{Scx=Xpeo}, \p{Xpeo})
                             (50)
   \p{Script_Extensions: Old_South_Arabian} (Short: \p{Scx=Sarb},
                             \p{Sarb}) (32)
   \p{Script_Extensions: Old_Turkic} (Short: \p{Scx=Orkh}, \p{Orkh})
                             (73)
   \p{Script_Extensions: Oriya} (Short: \p{Scx=Orya}, \p{Orya}) (94)
   \p{Script_Extensions: Orkh} \p{Script_Extensions=Old_Turkic} (73)
   \p{Script_Extensions: Orya} \p{Script_Extensions=Oriya} (94)
   \p{Script_Extensions: Osage} (Short: \p{Scx=Osge}, \p{Osge}) (72)
   \p{Script_Extensions: Osge} \p{Script_Extensions=Osage} (72)
   \p{Script_Extensions: Osma} \p{Script_Extensions=Osmanya} (40)
   \p{Script_Extensions: Osmanya} (Short: \p{Scx=Osma}, \p{Osma}) (40)
   \p{Script_Extensions: Pahawh_Hmong} (Short: \p{Scx=Hmng},
                             \p{Hmng}) (127)
   \p{Script_Extensions: Palm} \p{Script_Extensions=Palmyrene} (32)
   \p{Script_Extensions: Palmyrene} (Short: \p{Scx=Palm}, \p{Palm})
                             (32)
   \p{Script_Extensions: Pau_Cin_Hau} (Short: \p{Scx=Pauc}, \p{Pauc})
                             (57)
   \p{Script_Extensions: Pauc} \p{Script_Extensions=Pau_Cin_Hau} (57)
   \p{Script_Extensions: Perm} \p{Script_Extensions=Old_Permic} (44)
   \p{Script_Extensions: Phag} \p{Script_Extensions=Phags_Pa} (59)
   \p{Script_Extensions: Phags_Pa} (Short: \p{Scx=Phag}, \p{Phag})
                             (59)
   \p{Script_Extensions: Phli} \p{Script_Extensions=
                             Inscriptional_Pahlavi} (27)
   \p{Script_Extensions: Phlp} \p{Script_Extensions=Psalter_Pahlavi}
                             (30)
   \p{Script_Extensions: Phnx} \p{Script_Extensions=Phoenician} (29)
   \p{Script_Extensions: Phoenician} (Short: \p{Scx=Phnx}, \p{Phnx})
                             (29)
   \p{Script_Extensions: Plrd} \p{Script_Extensions=Miao} (133)
   \p{Script_Extensions: Prti} \p{Script_Extensions=
                             Inscriptional_Parthian} (30)
   \p{Script_Extensions: Psalter_Pahlavi} (Short: \p{Scx=Phlp},
                             \p{Phlp}) (30)
   \p{Script_Extensions: Qaac} \p{Script_Extensions=Coptic} (165)
   \p{Script_Extensions: Qaai} \p{Script_Extensions=Inherited} (496)
   \p{Script_Extensions: Rejang} (Short: \p{Scx=Rjng}, \p{Rjng}) (37)
   \p{Script_Extensions: Rjng} \p{Script_Extensions=Rejang} (37)
   \p{Script_Extensions: Runic} (Short: \p{Scx=Runr}, \p{Runr}) (86)
   \p{Script_Extensions: Runr} \p{Script_Extensions=Runic} (86)
   \p{Script_Extensions: Samaritan} (Short: \p{Scx=Samr}, \p{Samr})
                             (61)
   \p{Script_Extensions: Samr} \p{Script_Extensions=Samaritan} (61)
   \p{Script_Extensions: Sarb} \p{Script_Extensions=
                             Old_South_Arabian} (32)
   \p{Script_Extensions: Saur} \p{Script_Extensions=Saurashtra} (82)
   \p{Script_Extensions: Saurashtra} (Short: \p{Scx=Saur}, \p{Saur})
                             (82)
   \p{Script_Extensions: Sgnw} \p{Script_Extensions=SignWriting} (672)
   \p{Script_Extensions: Sharada} (Short: \p{Scx=Shrd}, \p{Shrd})
                             (100)
   \p{Script_Extensions: Shavian} (Short: \p{Scx=Shaw}, \p{Shaw}) (48)
   \p{Script_Extensions: Shaw} \p{Script_Extensions=Shavian} (48)
   \p{Script_Extensions: Shrd} \p{Script_Extensions=Sharada} (100)
   \p{Script_Extensions: Sidd} \p{Script_Extensions=Siddham} (92)
   \p{Script_Extensions: Siddham} (Short: \p{Scx=Sidd}, \p{Sidd}) (92)
   \p{Script_Extensions: SignWriting} (Short: \p{Scx=Sgnw}, \p{Sgnw})
                             (672)
   \p{Script_Extensions: Sind} \p{Script_Extensions=Khudawadi} (81)
   \p{Script_Extensions: Sinh} \p{Script_Extensions=Sinhala} (112)
   \p{Script_Extensions: Sinhala} (Short: \p{Scx=Sinh}, \p{Sinh})
                             (112)
   \p{Script_Extensions: Sora} \p{Script_Extensions=Sora_Sompeng} (35)
   \p{Script_Extensions: Sora_Sompeng} (Short: \p{Scx=Sora},
                             \p{Sora}) (35)
   \p{Script_Extensions: Sund} \p{Script_Extensions=Sundanese} (72)
   \p{Script_Extensions: Sundanese} (Short: \p{Scx=Sund}, \p{Sund})
                             (72)
   \p{Script_Extensions: Sylo} \p{Script_Extensions=Syloti_Nagri} (56)
   \p{Script_Extensions: Syloti_Nagri} (Short: \p{Scx=Sylo},
                             \p{Sylo}) (56)
   \p{Script_Extensions: Syrc} \p{Script_Extensions=Syriac} (93)
   \p{Script_Extensions: Syriac} (Short: \p{Scx=Syrc}, \p{Syrc}) (93)
   \p{Script_Extensions: Tagalog} (Short: \p{Scx=Tglg}, \p{Tglg}) (22)
   \p{Script_Extensions: Tagb} \p{Script_Extensions=Tagbanwa} (20)
   \p{Script_Extensions: Tagbanwa} (Short: \p{Scx=Tagb}, \p{Tagb})
                             (20)
   \p{Script_Extensions: Tai_Le} (Short: \p{Scx=Tale}, \p{Tale}) (45)
   \p{Script_Extensions: Tai_Tham} (Short: \p{Scx=Lana}, \p{Lana})
                             (127)
   \p{Script_Extensions: Tai_Viet} (Short: \p{Scx=Tavt}, \p{Tavt})
                             (72)
   \p{Script_Extensions: Takr} \p{Script_Extensions=Takri} (78)
   \p{Script_Extensions: Takri} (Short: \p{Scx=Takr}, \p{Takr}) (78)
   \p{Script_Extensions: Tale} \p{Script_Extensions=Tai_Le} (45)
   \p{Script_Extensions: Talu} \p{Script_Extensions=New_Tai_Lue} (83)
   \p{Script_Extensions: Tamil} (Short: \p{Scx=Taml}, \p{Taml}) (80)
   \p{Script_Extensions: Taml} \p{Script_Extensions=Tamil} (80)
   \p{Script_Extensions: Tang} \p{Script_Extensions=Tangut} (6881)
   \p{Script_Extensions: Tangut} (Short: \p{Scx=Tang}, \p{Tang})
                             (6881)
   \p{Script_Extensions: Tavt} \p{Script_Extensions=Tai_Viet} (72)
   \p{Script_Extensions: Telu} \p{Script_Extensions=Telugu} (101)
   \p{Script_Extensions: Telugu} (Short: \p{Scx=Telu}, \p{Telu}) (101)
   \p{Script_Extensions: Tfng} \p{Script_Extensions=Tifinagh} (59)
   \p{Script_Extensions: Tglg} \p{Script_Extensions=Tagalog} (22)
   \p{Script_Extensions: Thaa} \p{Script_Extensions=Thaana} (65)
   \p{Script_Extensions: Thaana} (Short: \p{Scx=Thaa}, \p{Thaa}) (65)
   \p{Script_Extensions: Thai} (Short: \p{Scx=Thai}, \p{Thai}) (86)
   \p{Script_Extensions: Tibetan} (Short: \p{Scx=Tibt}, \p{Tibt})
                             (207)
   \p{Script_Extensions: Tibt} \p{Script_Extensions=Tibetan} (207)
   \p{Script_Extensions: Tifinagh} (Short: \p{Scx=Tfng}, \p{Tfng})
                             (59)
   \p{Script_Extensions: Tirh} \p{Script_Extensions=Tirhuta} (94)
   \p{Script_Extensions: Tirhuta} (Short: \p{Scx=Tirh}, \p{Tirh}) (94)
   \p{Script_Extensions: Ugar} \p{Script_Extensions=Ugaritic} (31)
   \p{Script_Extensions: Ugaritic} (Short: \p{Scx=Ugar}, \p{Ugar})
                             (31)
   \p{Script_Extensions: Unknown} (Short: \p{Scx=Zzzz}, \p{Zzzz})
                             (985_875 plus all above-Unicode code
                             points)
   \p{Script_Extensions: Vai} (Short: \p{Scx=Vai}, \p{Vai}) (300)
   \p{Script_Extensions: Vaii} \p{Script_Extensions=Vai} (300)
   \p{Script_Extensions: Wara} \p{Script_Extensions=Warang_Citi} (84)
   \p{Script_Extensions: Warang_Citi} (Short: \p{Scx=Wara}, \p{Wara})
                             (84)
   \p{Script_Extensions: Xpeo} \p{Script_Extensions=Old_Persian} (50)
   \p{Script_Extensions: Xsux} \p{Script_Extensions=Cuneiform} (1234)
   \p{Script_Extensions: Yi} (Short: \p{Scx=Yi}, \p{Yi}) (1246)
   \p{Script_Extensions: Yiii} \p{Script_Extensions=Yi} (1246)
   \p{Script_Extensions: Zinh} \p{Script_Extensions=Inherited} (496)
   \p{Script_Extensions: Zyyy} \p{Script_Extensions=Common} (6864)
   \p{Script_Extensions: Zzzz} \p{Script_Extensions=Unknown} (985_875
                             plus all above-Unicode code points)
   \p{Scx: *}              \p{Script_Extensions: *}
   \p{SD}                  \p{Soft_Dotted} (= \p{Soft_Dotted=Y}) (46)
   \p{SD: *}               \p{Soft_Dotted: *}
   \p{Sentence_Break: AT}  \p{Sentence_Break=ATerm} (4)
   \p{Sentence_Break: ATerm} (Short: \p{SB=AT}) (4)
   \p{Sentence_Break: CL}  \p{Sentence_Break=Close} (187)
   \p{Sentence_Break: Close} (Short: \p{SB=CL}) (187)
   \p{Sentence_Break: CR}  (Short: \p{SB=CR}) (1)
   \p{Sentence_Break: EX}  \p{Sentence_Break=Extend} (2197)
   \p{Sentence_Break: Extend} (Short: \p{SB=EX}) (2197)
   \p{Sentence_Break: FO}  \p{Sentence_Break=Format} (53)
   \p{Sentence_Break: Format} (Short: \p{SB=FO}) (53)
   \p{Sentence_Break: LE}  \p{Sentence_Break=OLetter} (113_027)
   \p{Sentence_Break: LF}  (Short: \p{SB=LF}) (1)
   \p{Sentence_Break: LO}  \p{Sentence_Break=Lower} (2251)
   \p{Sentence_Break: Lower} (Short: \p{SB=LO}) (2251)
   \p{Sentence_Break: NU}  \p{Sentence_Break=Numeric} (572)
   \p{Sentence_Break: Numeric} (Short: \p{SB=NU}) (572)
   \p{Sentence_Break: OLetter} (Short: \p{SB=LE}) (113_027)
   \p{Sentence_Break: Other} (Short: \p{SB=XX}) (993_796 plus all
                             above-Unicode code points)
   \p{Sentence_Break: SC}  \p{Sentence_Break=SContinue} (26)
   \p{Sentence_Break: SContinue} (Short: \p{SB=SC}) (26)
   \p{Sentence_Break: SE}  \p{Sentence_Break=Sep} (3)
   \p{Sentence_Break: Sep} (Short: \p{SB=SE}) (3)
   \p{Sentence_Break: Sp}  (Short: \p{SB=Sp}) (20)
   \p{Sentence_Break: ST}  \p{Sentence_Break=STerm} (121)
   \p{Sentence_Break: STerm} (Short: \p{SB=ST}) (121)
   \p{Sentence_Break: UP}  \p{Sentence_Break=Upper} (1853)
   \p{Sentence_Break: Upper} (Short: \p{SB=UP}) (1853)
   \p{Sentence_Break: XX}  \p{Sentence_Break=Other} (993_796 plus all
                             above-Unicode code points)
   \p{Sentence_Terminal}   \p{Sentence_Terminal=Y} (Short: \p{STerm})
                             (124)
   \p{Sentence_Terminal: N*} (Short: \p{STerm=N}, \P{STerm})
                             (1_113_988 plus all above-Unicode code
                             points)
   \p{Sentence_Terminal: Y*} (Short: \p{STerm=Y}, \p{STerm}) (124)
   \p{Separator}           \p{General_Category=Separator} (Short:
                             \p{Z}) (19)
   \p{Sgnw}                \p{SignWriting} (= \p{Script_Extensions=
                             SignWriting}) (672)
   \p{Sharada}             \p{Script_Extensions=Sharada} (Short:
                             \p{Shrd}; NOT \p{Block=Sharada}) (100)
   \p{Shavian}             \p{Script_Extensions=Shavian} (Short:
                             \p{Shaw}) (48)
   \p{Shaw}                \p{Shavian} (= \p{Script_Extensions=
                             Shavian}) (48)
 X \p{Shorthand_Format_Controls} \p{Block=Shorthand_Format_Controls}
                             (16)
   \p{Shrd}                \p{Sharada} (= \p{Script_Extensions=
                             Sharada}) (NOT \p{Block=Sharada}) (100)
   \p{Sidd}                \p{Siddham} (= \p{Script_Extensions=
                             Siddham}) (NOT \p{Block=Siddham}) (92)
   \p{Siddham}             \p{Script_Extensions=Siddham} (Short:
                             \p{Sidd}; NOT \p{Block=Siddham}) (92)
   \p{SignWriting}         \p{Script_Extensions=SignWriting} (Short:
                             \p{Sgnw}) (672)
   \p{Sind}                \p{Khudawadi} (= \p{Script_Extensions=
                             Khudawadi}) (NOT \p{Block=Khudawadi})
                             (81)
   \p{Sinh}                \p{Sinhala} (= \p{Script_Extensions=
                             Sinhala}) (NOT \p{Block=Sinhala}) (112)
   \p{Sinhala}             \p{Script_Extensions=Sinhala} (Short:
                             \p{Sinh}; NOT \p{Block=Sinhala}) (112)
 X \p{Sinhala_Archaic_Numbers} \p{Block=Sinhala_Archaic_Numbers} (32)
   \p{Sk}                  \p{Modifier_Symbol} (=
                             \p{General_Category=Modifier_Symbol})
                             (121)
   \p{Sm}                  \p{Math_Symbol} (= \p{General_Category=
                             Math_Symbol}) (948)
 X \p{Small_Form_Variants} \p{Block=Small_Form_Variants} (Short:
                             \p{InSmallForms}) (32)
 X \p{Small_Forms}         \p{Small_Form_Variants} (= \p{Block=
                             Small_Form_Variants}) (32)
   \p{So}                  \p{Other_Symbol} (= \p{General_Category=
                             Other_Symbol}) (5777)
   \p{Soft_Dotted}         \p{Soft_Dotted=Y} (Short: \p{SD}) (46)
   \p{Soft_Dotted: N*}     (Short: \p{SD=N}, \P{SD}) (1_114_066 plus
                             all above-Unicode code points)
   \p{Soft_Dotted: Y*}     (Short: \p{SD=Y}, \p{SD}) (46)
   \p{Sora}                \p{Sora_Sompeng} (= \p{Script_Extensions=
                             Sora_Sompeng}) (NOT \p{Block=
                             Sora_Sompeng}) (35)
   \p{Sora_Sompeng}        \p{Script_Extensions=Sora_Sompeng} (Short:
                             \p{Sora}; NOT \p{Block=Sora_Sompeng})
                             (35)
   \p{Space}               \p{White_Space} (= \p{White_Space=Y}) (25)
   \p{Space: *}            \p{White_Space: *}
   \p{Space_Separator}     \p{General_Category=Space_Separator}
                             (Short: \p{Zs}) (17)
   \p{SpacePerl}           \p{XPosixSpace} (25)
   \p{Spacing_Mark}        \p{General_Category=Spacing_Mark} (Short:
                             \p{Mc}) (394)
 X \p{Spacing_Modifier_Letters} \p{Block=Spacing_Modifier_Letters}
                             (Short: \p{InModifierLetters}) (80)
 X \p{Specials}            \p{Block=Specials} (16)
   \p{STerm}               \p{Sentence_Terminal} (=
                             \p{Sentence_Terminal=Y}) (124)
   \p{STerm: *}            \p{Sentence_Terminal: *}
   \p{Sund}                \p{Sundanese} (= \p{Script_Extensions=
                             Sundanese}) (NOT \p{Block=Sundanese})
                             (72)
   \p{Sundanese}           \p{Script_Extensions=Sundanese} (Short:
                             \p{Sund}; NOT \p{Block=Sundanese}) (72)
 X \p{Sundanese_Sup}       \p{Sundanese_Supplement} (= \p{Block=
                             Sundanese_Supplement}) (16)
 X \p{Sundanese_Supplement} \p{Block=Sundanese_Supplement} (Short:
                             \p{InSundaneseSup}) (16)
 X \p{Sup_Arrows_A}        \p{Supplemental_Arrows_A} (= \p{Block=
                             Supplemental_Arrows_A}) (16)
 X \p{Sup_Arrows_B}        \p{Supplemental_Arrows_B} (= \p{Block=
                             Supplemental_Arrows_B}) (128)
 X \p{Sup_Arrows_C}        \p{Supplemental_Arrows_C} (= \p{Block=
                             Supplemental_Arrows_C}) (256)
 X \p{Sup_Math_Operators}  \p{Supplemental_Mathematical_Operators} (=
                             \p{Block=
                             Supplemental_Mathematical_Operators})
                             (256)
 X \p{Sup_PUA_A}           \p{Supplementary_Private_Use_Area_A} (=
                             \p{Block=
                             Supplementary_Private_Use_Area_A})
                             (65_536)
 X \p{Sup_PUA_B}           \p{Supplementary_Private_Use_Area_B} (=
                             \p{Block=
                             Supplementary_Private_Use_Area_B})
                             (65_536)
 X \p{Sup_Punctuation}     \p{Supplemental_Punctuation} (= \p{Block=
                             Supplemental_Punctuation}) (128)
 X \p{Sup_Symbols_And_Pictographs}
                             \p{Supplemental_Symbols_And_Pictographs}
                             (= \p{Block=
                             Supplemental_Symbols_And_Pictographs})
                             (256)
 X \p{Super_And_Sub}       \p{Superscripts_And_Subscripts} (=
                             \p{Block=Superscripts_And_Subscripts})
                             (48)
 X \p{Superscripts_And_Subscripts} \p{Block=
                             Superscripts_And_Subscripts} (Short:
                             \p{InSuperAndSub}) (48)
 X \p{Supplemental_Arrows_A} \p{Block=Supplemental_Arrows_A} (Short:
                             \p{InSupArrowsA}) (16)
 X \p{Supplemental_Arrows_B} \p{Block=Supplemental_Arrows_B} (Short:
                             \p{InSupArrowsB}) (128)
 X \p{Supplemental_Arrows_C} \p{Block=Supplemental_Arrows_C} (Short:
                             \p{InSupArrowsC}) (256)
 X \p{Supplemental_Mathematical_Operators} \p{Block=
                             Supplemental_Mathematical_Operators}
                             (Short: \p{InSupMathOperators}) (256)
 X \p{Supplemental_Punctuation} \p{Block=Supplemental_Punctuation}
                             (Short: \p{InSupPunctuation}) (128)
 X \p{Supplemental_Symbols_And_Pictographs} \p{Block=
                             Supplemental_Symbols_And_Pictographs}
                             (Short: \p{InSupSymbolsAndPictographs})
                             (256)
 X \p{Supplementary_Private_Use_Area_A} \p{Block=
                             Supplementary_Private_Use_Area_A}
                             (Short: \p{InSupPUAA}) (65_536)
 X \p{Supplementary_Private_Use_Area_B} \p{Block=
                             Supplementary_Private_Use_Area_B}
                             (Short: \p{InSupPUAB}) (65_536)
   \p{Surrogate}           \p{General_Category=Surrogate} (Short:
                             \p{Cs}) (2048)
 X \p{Sutton_SignWriting}  \p{Block=Sutton_SignWriting} (688)
   \p{Sylo}                \p{Syloti_Nagri} (= \p{Script_Extensions=
                             Syloti_Nagri}) (NOT \p{Block=
                             Syloti_Nagri}) (56)
   \p{Syloti_Nagri}        \p{Script_Extensions=Syloti_Nagri} (Short:
                             \p{Sylo}; NOT \p{Block=Syloti_Nagri})
                             (56)
   \p{Symbol}              \p{General_Category=Symbol} (Short: \p{S})
                             (6899)
   \p{Syrc}                \p{Syriac} (= \p{Script_Extensions=
                             Syriac}) (NOT \p{Block=Syriac}) (93)
   \p{Syriac}              \p{Script_Extensions=Syriac} (Short:
                             \p{Syrc}; NOT \p{Block=Syriac}) (93)
   \p{Tagalog}             \p{Script_Extensions=Tagalog} (Short:
                             \p{Tglg}; NOT \p{Block=Tagalog}) (22)
   \p{Tagb}                \p{Tagbanwa} (= \p{Script_Extensions=
                             Tagbanwa}) (NOT \p{Block=Tagbanwa}) (20)
   \p{Tagbanwa}            \p{Script_Extensions=Tagbanwa} (Short:
                             \p{Tagb}; NOT \p{Block=Tagbanwa}) (20)
 X \p{Tags}                \p{Block=Tags} (128)
   \p{Tai_Le}              \p{Script_Extensions=Tai_Le} (Short:
                             \p{Tale}; NOT \p{Block=Tai_Le}) (45)
   \p{Tai_Tham}            \p{Script_Extensions=Tai_Tham} (Short:
                             \p{Lana}; NOT \p{Block=Tai_Tham}) (127)
   \p{Tai_Viet}            \p{Script_Extensions=Tai_Viet} (Short:
                             \p{Tavt}; NOT \p{Block=Tai_Viet}) (72)
 X \p{Tai_Xuan_Jing}       \p{Tai_Xuan_Jing_Symbols} (= \p{Block=
                             Tai_Xuan_Jing_Symbols}) (96)
 X \p{Tai_Xuan_Jing_Symbols} \p{Block=Tai_Xuan_Jing_Symbols} (Short:
                             \p{InTaiXuanJing}) (96)
   \p{Takr}                \p{Takri} (= \p{Script_Extensions=Takri})
                             (NOT \p{Block=Takri}) (78)
   \p{Takri}               \p{Script_Extensions=Takri} (Short:
                             \p{Takr}; NOT \p{Block=Takri}) (78)
   \p{Tale}                \p{Tai_Le} (= \p{Script_Extensions=
                             Tai_Le}) (NOT \p{Block=Tai_Le}) (45)
   \p{Talu}                \p{New_Tai_Lue} (= \p{Script_Extensions=
                             New_Tai_Lue}) (NOT \p{Block=
                             New_Tai_Lue}) (83)
   \p{Tamil}               \p{Script_Extensions=Tamil} (Short:
                             \p{Taml}; NOT \p{Block=Tamil}) (80)
   \p{Taml}                \p{Tamil} (= \p{Script_Extensions=Tamil})
                             (NOT \p{Block=Tamil}) (80)
   \p{Tang}                \p{Tangut} (= \p{Script_Extensions=
                             Tangut}) (NOT \p{Block=Tangut}) (6881)
   \p{Tangut}              \p{Script_Extensions=Tangut} (Short:
                             \p{Tang}; NOT \p{Block=Tangut}) (6881)
 X \p{Tangut_Components}   \p{Block=Tangut_Components} (768)
   \p{Tavt}                \p{Tai_Viet} (= \p{Script_Extensions=
                             Tai_Viet}) (NOT \p{Block=Tai_Viet}) (72)
   \p{Telu}                \p{Telugu} (= \p{Script_Extensions=
                             Telugu}) (NOT \p{Block=Telugu}) (101)
   \p{Telugu}              \p{Script_Extensions=Telugu} (Short:
                             \p{Telu}; NOT \p{Block=Telugu}) (101)
   \p{Term}                \p{Terminal_Punctuation} (=
                             \p{Terminal_Punctuation=Y}) (246)
   \p{Term: *}             \p{Terminal_Punctuation: *}
   \p{Terminal_Punctuation} \p{Terminal_Punctuation=Y} (Short:
                             \p{Term}) (246)
   \p{Terminal_Punctuation: N*} (Short: \p{Term=N}, \P{Term})
                             (1_113_866 plus all above-Unicode code
                             points)
   \p{Terminal_Punctuation: Y*} (Short: \p{Term=Y}, \p{Term}) (246)
   \p{Tfng}                \p{Tifinagh} (= \p{Script_Extensions=
                             Tifinagh}) (NOT \p{Block=Tifinagh}) (59)
   \p{Tglg}                \p{Tagalog} (= \p{Script_Extensions=
                             Tagalog}) (NOT \p{Block=Tagalog}) (22)
   \p{Thaa}                \p{Thaana} (= \p{Script_Extensions=
                             Thaana}) (NOT \p{Block=Thaana}) (65)
   \p{Thaana}              \p{Script_Extensions=Thaana} (Short:
                             \p{Thaa}; NOT \p{Block=Thaana}) (65)
   \p{Thai}                \p{Script_Extensions=Thai} (NOT \p{Block=
                             Thai}) (86)
   \p{Tibetan}             \p{Script_Extensions=Tibetan} (Short:
                             \p{Tibt}; NOT \p{Block=Tibetan}) (207)
   \p{Tibt}                \p{Tibetan} (= \p{Script_Extensions=
                             Tibetan}) (NOT \p{Block=Tibetan}) (207)
   \p{Tifinagh}            \p{Script_Extensions=Tifinagh} (Short:
                             \p{Tfng}; NOT \p{Block=Tifinagh}) (59)
   \p{Tirh}                \p{Tirhuta} (= \p{Script_Extensions=
                             Tirhuta}) (NOT \p{Block=Tirhuta}) (94)
   \p{Tirhuta}             \p{Script_Extensions=Tirhuta} (Short:
                             \p{Tirh}; NOT \p{Block=Tirhuta}) (94)
   \p{Title}               \p{Titlecase} (/i= Cased=Yes) (31)
   \p{Titlecase}           (= \p{Gc=Lt}) (Short: \p{Title}; /i=
                             Cased=Yes) (31)
   \p{Titlecase_Letter}    \p{General_Category=Titlecase_Letter}
                             (Short: \p{Lt}; /i= General_Category=
                             Cased_Letter) (31)
 X \p{Transport_And_Map}   \p{Transport_And_Map_Symbols} (= \p{Block=
                             Transport_And_Map_Symbols}) (128)
 X \p{Transport_And_Map_Symbols} \p{Block=Transport_And_Map_Symbols}
                             (Short: \p{InTransportAndMap}) (128)
 X \p{UCAS}                \p{Unified_Canadian_Aboriginal_Syllabics}
                             (= \p{Block=
                             Unified_Canadian_Aboriginal_Syllabics})
                             (640)
 X \p{UCAS_Ext}            \p{Unified_Canadian_Aboriginal_Syllabics_-
                             Extended} (= \p{Block=
                             Unified_Canadian_Aboriginal_Syllabics_-
                             Extended}) (80)
   \p{Ugar}                \p{Ugaritic} (= \p{Script_Extensions=
                             Ugaritic}) (NOT \p{Block=Ugaritic}) (31)
   \p{Ugaritic}            \p{Script_Extensions=Ugaritic} (Short:
                             \p{Ugar}; NOT \p{Block=Ugaritic}) (31)
   \p{UIdeo}               \p{Unified_Ideograph} (=
                             \p{Unified_Ideograph=Y}) (80_388)
   \p{UIdeo: *}            \p{Unified_Ideograph: *}
   \p{Unassigned}          \p{General_Category=Unassigned} (Short:
                             \p{Cn}) (846_359 plus all above-Unicode
                             code points)
   \p{Unicode}             \p{Any} (1_114_112)
 X \p{Unified_Canadian_Aboriginal_Syllabics} \p{Block=
                             Unified_Canadian_Aboriginal_Syllabics}
                             (Short: \p{InUCAS}) (640)
 X \p{Unified_Canadian_Aboriginal_Syllabics_Extended} \p{Block=
                             Unified_Canadian_Aboriginal_Syllabics_-
                             Extended} (Short: \p{InUCASExt}) (80)
   \p{Unified_Ideograph}   \p{Unified_Ideograph=Y} (Short: \p{UIdeo})
                             (80_388)
   \p{Unified_Ideograph: N*} (Short: \p{UIdeo=N}, \P{UIdeo})
                             (1_033_724 plus all above-Unicode code
                             points)
   \p{Unified_Ideograph: Y*} (Short: \p{UIdeo=Y}, \p{UIdeo}) (80_388)
   \p{Unknown}             \p{Script_Extensions=Unknown} (Short:
                             \p{Zzzz}) (985_875 plus all above-
                             Unicode code points)
   \p{Upper}               \p{XPosixUpper} (= \p{Uppercase=Y}) (/i=
                             Cased=Yes) (1822)
   \p{Upper: *}            \p{Uppercase: *}
   \p{Uppercase}           \p{XPosixUpper} (= \p{Uppercase=Y}) (/i=
                             Cased=Yes) (1822)
   \p{Uppercase: N*}       (Short: \p{Upper=N}, \P{Upper}; /i= Cased=
                             No) (1_112_290 plus all above-Unicode
                             code points)
   \p{Uppercase: Y*}       (Short: \p{Upper=Y}, \p{Upper}; /i= Cased=
                             Yes) (1822)
   \p{Uppercase_Letter}    \p{General_Category=Uppercase_Letter}
                             (Short: \p{Lu}; /i= General_Category=
                             Cased_Letter) (1702)
   \p{Vai}                 \p{Script_Extensions=Vai} (NOT \p{Block=
                             Vai}) (300)
   \p{Vaii}                \p{Vai} (= \p{Script_Extensions=Vai}) (NOT
                             \p{Block=Vai}) (300)
   \p{Variation_Selector}  \p{Variation_Selector=Y} (Short: \p{VS};
                             NOT \p{Variation_Selectors}) (259)
   \p{Variation_Selector: N*} (Short: \p{VS=N}, \P{VS}) (1_113_853
                             plus all above-Unicode code points)
   \p{Variation_Selector: Y*} (Short: \p{VS=Y}, \p{VS}) (259)
 X \p{Variation_Selectors} \p{Block=Variation_Selectors} (Short:
                             \p{InVS}) (16)
 X \p{Variation_Selectors_Supplement} \p{Block=
                             Variation_Selectors_Supplement} (Short:
                             \p{InVSSup}) (240)
 X \p{Vedic_Ext}           \p{Vedic_Extensions} (= \p{Block=
                             Vedic_Extensions}) (48)
 X \p{Vedic_Extensions}    \p{Block=Vedic_Extensions} (Short:
                             \p{InVedicExt}) (48)
 X \p{Vertical_Forms}      \p{Block=Vertical_Forms} (16)
   \p{VertSpace}           \v (7)
   \p{VS}                  \p{Variation_Selector} (=
                             \p{Variation_Selector=Y}) (NOT
                             \p{Variation_Selectors}) (259)
   \p{VS: *}               \p{Variation_Selector: *}
 X \p{VS_Sup}              \p{Variation_Selectors_Supplement} (=
                             \p{Block=
                             Variation_Selectors_Supplement}) (240)
   \p{Wara}                \p{Warang_Citi} (= \p{Script_Extensions=
                             Warang_Citi}) (NOT \p{Block=
                             Warang_Citi}) (84)
   \p{Warang_Citi}         \p{Script_Extensions=Warang_Citi} (Short:
                             \p{Wara}; NOT \p{Block=Warang_Citi}) (84)
   \p{WB: *}               \p{Word_Break: *}
   \p{White_Space}         \p{White_Space=Y} (Short: \p{Space}) (25)
   \p{White_Space: N*}     (Short: \p{Space=N}, \P{Space}) (1_114_087
                             plus all above-Unicode code points)
   \p{White_Space: Y*}     (Short: \p{Space=Y}, \p{Space}) (25)
   \p{Word}                \p{XPosixWord} (119_821)
   \p{Word_Break: ALetter} (Short: \p{WB=LE}) (27_992)
   \p{Word_Break: CR}      (Short: \p{WB=CR}) (1)
   \p{Word_Break: Double_Quote} (Short: \p{WB=DQ}) (1)
   \p{Word_Break: DQ}      \p{Word_Break=Double_Quote} (1)
   \p{Word_Break: E_Base}  (Short: \p{WB=EB}) (79)
   \p{Word_Break: E_Base_GAZ} (Short: \p{WB=EBG}) (4)
   \p{Word_Break: E_Modifier} (Short: \p{WB=EM}) (5)
   \p{Word_Break: EB}      \p{Word_Break=E_Base} (79)
   \p{Word_Break: EBG}     \p{Word_Break=E_Base_GAZ} (4)
   \p{Word_Break: EM}      \p{Word_Break=E_Modifier} (5)
   \p{Word_Break: EX}      \p{Word_Break=ExtendNumLet} (11)
   \p{Word_Break: Extend}  (Short: \p{WB=Extend}) (2196)
   \p{Word_Break: ExtendNumLet} (Short: \p{WB=EX}) (11)
   \p{Word_Break: FO}      \p{Word_Break=Format} (52)
   \p{Word_Break: Format}  (Short: \p{WB=FO}) (52)
   \p{Word_Break: GAZ}     \p{Word_Break=Glue_After_Zwj} (3)
   \p{Word_Break: Glue_After_Zwj} (Short: \p{WB=GAZ}) (3)
   \p{Word_Break: Hebrew_Letter} (Short: \p{WB=HL}) (74)
   \p{Word_Break: HL}      \p{Word_Break=Hebrew_Letter} (74)
   \p{Word_Break: KA}      \p{Word_Break=Katakana} (310)
   \p{Word_Break: Katakana} (Short: \p{WB=KA}) (310)
   \p{Word_Break: LE}      \p{Word_Break=ALetter} (27_992)
   \p{Word_Break: LF}      (Short: \p{WB=LF}) (1)
   \p{Word_Break: MB}      \p{Word_Break=MidNumLet} (7)
   \p{Word_Break: MidLetter} (Short: \p{WB=ML}) (9)
   \p{Word_Break: MidNum}  (Short: \p{WB=MN}) (15)
   \p{Word_Break: MidNumLet} (Short: \p{WB=MB}) (7)
   \p{Word_Break: ML}      \p{Word_Break=MidLetter} (9)
   \p{Word_Break: MN}      \p{Word_Break=MidNum} (15)
   \p{Word_Break: Newline} (Short: \p{WB=NL}) (5)
   \p{Word_Break: NL}      \p{Word_Break=Newline} (5)
   \p{Word_Break: NU}      \p{Word_Break=Numeric} (571)
   \p{Word_Break: Numeric} (Short: \p{WB=NU}) (571)
   \p{Word_Break: Other}   (Short: \p{WB=XX}) (1_082_748 plus all
                             above-Unicode code points)
   \p{Word_Break: Regional_Indicator} (Short: \p{WB=RI}) (26)
   \p{Word_Break: RI}      \p{Word_Break=Regional_Indicator} (26)
   \p{Word_Break: Single_Quote} (Short: \p{WB=SQ}) (1)
   \p{Word_Break: SQ}      \p{Word_Break=Single_Quote} (1)
   \p{Word_Break: XX}      \p{Word_Break=Other} (1_082_748 plus all
                             above-Unicode code points)
   \p{Word_Break: ZWJ}     (Short: \p{WB=ZWJ}) (1)
   \p{WSpace}              \p{White_Space} (= \p{White_Space=Y}) (25)
   \p{WSpace: *}           \p{White_Space: *}
   \p{XDigit}              \p{XPosixXDigit} (= \p{Hex_Digit=Y}) (44)
   \p{XID_Continue}        \p{XID_Continue=Y} (Short: \p{XIDC})
                             (119_672)
   \p{XID_Continue: N*}    (Short: \p{XIDC=N}, \P{XIDC}) (994_440
                             plus all above-Unicode code points)
   \p{XID_Continue: Y*}    (Short: \p{XIDC=Y}, \p{XIDC}) (119_672)
   \p{XID_Start}           \p{XID_Start=Y} (Short: \p{XIDS}) (116_984)
   \p{XID_Start: N*}       (Short: \p{XIDS=N}, \P{XIDS}) (997_128
                             plus all above-Unicode code points)
   \p{XID_Start: Y*}       (Short: \p{XIDS=Y}, \p{XIDS}) (116_984)
   \p{XIDC}                \p{XID_Continue} (= \p{XID_Continue=Y})
                             (119_672)
   \p{XIDC: *}             \p{XID_Continue: *}
   \p{XIDS}                \p{XID_Start} (= \p{XID_Start=Y}) (116_984)
   \p{XIDS: *}             \p{XID_Start: *}
   \p{Xpeo}                \p{Old_Persian} (= \p{Script_Extensions=
                             Old_Persian}) (NOT \p{Block=
                             Old_Persian}) (50)
   \p{XPerlSpace}          \p{XPosixSpace} (25)
   \p{XPosixAlnum}         Alphabetic and (decimal) Numeric (Short:
                             \p{Alnum}) (118_820)
   \p{XPosixAlpha}         \p{Alphabetic=Y} (Short: \p{Alpha})
                             (118_240)
   \p{XPosixBlank}         \h, Horizontal white space (Short:
                             \p{Blank}) (18)
   \p{XPosixCntrl}         \p{General_Category=Control} Control
                             characters (Short: \p{Cc}) (65)
   \p{XPosixDigit}         \p{General_Category=Decimal_Number} [0-9]
                             + all other decimal digits (Short:
                             \p{Nd}) (580)
   \p{XPosixGraph}         Characters that are graphical (Short:
                             \p{Graph}) (265_621)
   \p{XPosixLower}         \p{Lowercase=Y} (Short: \p{Lower}; /i=
                             Cased=Yes) (2252)
   \p{XPosixPrint}         Characters that are graphical plus space
                             characters (but no controls) (Short:
                             \p{Print}) (265_638)
   \p{XPosixPunct}         \p{Punct} + ASCII-range \p{Symbol} (757)
   \p{XPosixSpace}         \s including beyond ASCII and vertical tab
                             (Short: \p{SpacePerl}) (25)
   \p{XPosixUpper}         \p{Uppercase=Y} (Short: \p{Upper}; /i=
                             Cased=Yes) (1822)
   \p{XPosixWord}          \w, including beyond ASCII; = \p{Alnum} +
                             \pM + \p{Pc} + \p{Join_Control} (Short:
                             \p{Word}) (119_821)
   \p{XPosixXDigit}        \p{Hex_Digit=Y} (Short: \p{Hex}) (44)
   \p{Xsux}                \p{Cuneiform} (= \p{Script_Extensions=
                             Cuneiform}) (NOT \p{Block=Cuneiform})
                             (1234)
   \p{Yi}                  \p{Script_Extensions=Yi} (1246)
 X \p{Yi_Radicals}         \p{Block=Yi_Radicals} (64)
 X \p{Yi_Syllables}        \p{Block=Yi_Syllables} (1168)
   \p{Yiii}                \p{Yi} (= \p{Script_Extensions=Yi}) (1246)
 X \p{Yijing}              \p{Yijing_Hexagram_Symbols} (= \p{Block=
                             Yijing_Hexagram_Symbols}) (64)
 X \p{Yijing_Hexagram_Symbols} \p{Block=Yijing_Hexagram_Symbols}
                             (Short: \p{InYijing}) (64)
   \p{Z} \pZ               \p{Separator} (= \p{General_Category=
                             Separator}) (19)
   \p{Zinh}                \p{Inherited} (= \p{Script_Extensions=
                             Inherited}) (496)
   \p{Zl}                  \p{Line_Separator} (= \p{General_Category=
                             Line_Separator}) (1)
   \p{Zp}                  \p{Paragraph_Separator} (=
                             \p{General_Category=
                             Paragraph_Separator}) (1)
   \p{Zs}                  \p{Space_Separator} (=
                             \p{General_Category=Space_Separator})
                             (17)
   \p{Zyyy}                \p{Common} (= \p{Script_Extensions=
                             Common}) (6864)
   \p{Zzzz}                \p{Unknown} (= \p{Script_Extensions=
                             Unknown}) (985_875 plus all above-
                             Unicode code points)
 TX\p{_CanonDCIJ}          (For internal use by Perl, not necessarily
                             stable) (= \p{Soft_Dotted=Y}) (46)
 TX\p{_Case_Ignorable}     (For internal use by Perl, not necessarily
                             stable) (= \p{Case_Ignorable=Y}) (2240)
 TX\p{_CombAbove}          (For internal use by Perl, not necessarily
                             stable) (= \p{Canonical_Combining_Class=
                             Above}) (461)
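
As a small illustration of how to read the table above, the following
sketch checks that several spellings the table lists as equivalent
(C<\p{White_Space}>, C<\p{Space}>, C<\p{WSpace}>, and the compound form
C<\p{White_Space=Y}>) match the same characters.  The sample string is
invented for this example and is not part of the table.

```perl
use strict;
use warnings;

# Hypothetical sample: a space, a tab, and U+2003 EM SPACE separate letters.
my $text = "a b\tc\x{2003}d";

# Each name below is a spelling the table lists for the same property.
my %count;
for my $form ('White_Space', 'Space', 'WSpace', 'White_Space=Y') {
    # The "= () =" idiom counts the matches found by the /g match.
    $count{$form} = () = $text =~ /\p{$form}/g;
}

printf "%-14s %d\n", $_, $count{$_} for sort keys %count;
```

Every form should report the same count, since they all name the same
set of code points.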



=head2 Legal C<\p{}> and C<\P{}> constructs that match no characters

Unicode has some property-value pairs that currently don't match anything.
This generally happens either because they are obsolete, or because they
exist for symmetry with other forms but no language has yet been encoded
that uses them.  In this version of Unicode, the following match zero code
points:

=over 4

=item \p{Canonical_Combining_Class=Attached_Below_Left}

=item \p{Canonical_Combining_Class=CCC133}

=back



=head1 Properties accessible through Unicode::UCD

The value of any Unicode (not including Perl extensions) character
property mentioned above for any single code point is available through
L<Unicode::UCD/charprop()>.  L<Unicode::UCD/charprops_all()> returns the
values of all the Unicode properties for a given code point.
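
A minimal sketch of both calls follows.  It assumes a Unicode::UCD recent
enough to export C<charprop()> and C<charprops_all()>; the code point
U+0041 is chosen only as an example.

```perl
use strict;
use warnings;
use Unicode::UCD qw(charprop charprops_all);

my $cp = 0x41;    # U+0041, chosen only as an example code point

# Ask for one property at a time.
my $name   = charprop($cp, "Name");
my $script = charprop($cp, "Script");
print "U+0041: name=$name, script=$script\n";

# Ask for every Unicode property at once; the result is a hash reference.
my $all = charprops_all($cp);
printf "charprops_all() returned %d properties\n", scalar keys %$all;
```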

Besides these, all the Unicode character properties mentioned above
(except for those marked as for internal use by Perl) are also
accessible by L<Unicode::UCD/prop_invlist()>.
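
As a sketch of what C<prop_invlist()> returns, the code below counts the
code points matched by C<White_Space>.  It assumes the usual inversion
list layout (elements at even indexes start a matching range, elements at
odd indexes start a non-matching one) and, as a simplification, treats an
unpaired final range as ending at U+10FFFF.

```perl
use strict;
use warnings;
use Unicode::UCD qw(prop_invlist);

# In list context, prop_invlist() returns the inversion list itself.
my @invlist = prop_invlist("White_Space");

# Sum the sizes of the matching ranges.  If the list has an odd number of
# elements, the last range is assumed here to stop at U+10FFFF.
my $count = 0;
for (my $i = 0; $i < @invlist; $i += 2) {
    my $end = $i + 1 < @invlist ? $invlist[$i + 1] : 0x110000;
    $count += $end - $invlist[$i];
}
print "White_Space matches $count code points\n";
```

The result can be compared with the count shown in parentheses for
C<\p{White_Space}> in the table above.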

Due to their nature, not all Unicode character properties are suitable for
regular expression matches or for C<prop_invlist()>.  The remaining
non-provisional, non-internal ones are accessible via
L<Unicode::UCD/prop_invmap()> (except for those that this Perl installation
hasn't included; see L<below for which those are|/Unicode character properties
that are NOT accepted by Perl>).

For compatibility with other parts of Perl, all the single forms given in the
table in the L<section above|/Properties accessible through \p{} and \P{}>
are recognized.  However, there are ambiguities between some Perl extensions
and the Unicode properties, all of which are silently resolved in favor of the
official Unicode property.  To avoid surprises, you should only use
C<prop_invmap()> for forms listed in the table below, which omits the
non-recommended ones.  The affected forms are the Perl single form equivalents
of Unicode properties, such as C<\p{sc}> being a single-form equivalent of
C<\p{gc=sc}>, which is treated by C<prop_invmap()> as the C<Script> property,
whose short name is C<sc>.  The table indicates the current ambiguities in the
INFO column, beginning with the word C<"NOT">.
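
The C<sc> case described above can be sketched as follows.  The example
assumes the four-element return list of C<prop_invmap()> and uses
C<search_invlist()>, also from Unicode::UCD, to find the range containing
a code point; the choice of "A" is arbitrary.

```perl
use strict;
use warnings;
use Unicode::UCD qw(prop_invmap search_invlist);

# "sc" is resolved here as the Script property, not as the single-form
# equivalent of General_Category=Sc (Currency_Symbol).
my ($ranges, $maps, $format, $default) = prop_invmap("sc");

# Locate the range that contains U+0041 and read its mapped value.
my $i = search_invlist($ranges, ord "A");
print "U+0041 maps to $maps->[$i] (format '$format')\n";
```

The printed value is a script, which shows that C<sc> was not treated as
a General_Category value.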

The standard Unicode properties listed below are documented in
L<http://www.unicode.org/reports/tr44/>; Perl_Decimal_Digit is documented in
L<Unicode::UCD/prop_invmap()>.  The other Perl extensions are in
L<perlunicode/Other Properties>.

The first column in the table is a name for the property; the second column is
an alternative name, if any, plus possibly some annotations.  The alternative
name is the property's full name, unless that would simply repeat the first
column, in which case the second column indicates the property's short name
(if different).  The annotations are given only in the entry for the full
name.  If a property is obsolete, etc., the entry will be flagged with the same
characters used in the table in the L<section above|/Properties accessible
through \p{} and \P{}>, like B<D> or B<S>.

   NAME                      INFO

   Age
   AHex                    ASCII_Hex_Digit
   All                     (Perl extension).  All code points,
                           including those above Unicode.  Same as
                           qr/./s
   Alnum                   XPosixAlnum.  (Perl extension)
   Alpha                   Alphabetic
   Alphabetic              (Short: Alpha)
   Any                     (Perl extension).  All Unicode code
                           points: [\x{0000}-\x{10FFFF}]
   ASCII                   Block=ASCII.  (Perl extension).
                           [[:ASCII:]]
   ASCII_Hex_Digit         (Short: AHex)
   Assigned                (Perl extension).  All assigned code points
   Bc                      Bidi_Class
   Bidi_C                  Bidi_Control
   Bidi_Class              (Short: bc)
   Bidi_Control            (Short: Bidi_C)
   Bidi_M                  Bidi_Mirrored
   Bidi_Mirrored           (Short: Bidi_M)
   Bidi_Mirroring_Glyph    (Short: bmg)
   Bidi_Paired_Bracket     (Short: bpb)
   Bidi_Paired_Bracket_Type (Short: bpt)
   Blank                   XPosixBlank.  (Perl extension)
   Blk                     Block
   Block                   (Short: blk)
   Bmg                     Bidi_Mirroring_Glyph
   Bpb                     Bidi_Paired_Bracket
   Bpt                     Bidi_Paired_Bracket_Type
   Canonical_Combining_Class (Short: ccc)
   Case_Folding            (Short: cf)
   Case_Ignorable          (Short: CI)
   Cased
   Category                General_Category
   Ccc                     Canonical_Combining_Class
   CE                      Composition_Exclusion
   Cf                      Case_Folding; NOT 'cf' meaning
                           'General_Category=Format'
   Changes_When_Casefolded (Short: CWCF)
   Changes_When_Casemapped (Short: CWCM)
   Changes_When_Lowercased (Short: CWL)
   Changes_When_NFKC_Casefolded (Short: CWKCF)
   Changes_When_Titlecased (Short: CWT)
   Changes_When_Uppercased (Short: CWU)
   CI                      Case_Ignorable
   Cntrl                   General_Category=XPosixCntrl.  (Perl
                           extension)
   Comp_Ex                 Full_Composition_Exclusion
   Composition_Exclusion   (Short: CE)
   CWCF                    Changes_When_Casefolded
   CWCM                    Changes_When_Casemapped
   CWKCF                   Changes_When_NFKC_Casefolded
   CWL                     Changes_When_Lowercased
   CWT                     Changes_When_Titlecased
   CWU                     Changes_When_Uppercased
   Dash
   Decomposition_Mapping   (Short: dm)
   Decomposition_Type      (Short: dt)
   Default_Ignorable_Code_Point (Short: DI)
   Dep                     Deprecated
   Deprecated              (Short: Dep)
   DI                      Default_Ignorable_Code_Point
   Dia                     Diacritic
   Diacritic               (Short: Dia)
   Digit                   General_Category=XPosixDigit.  (Perl
                           extension)
   Dm                      Decomposition_Mapping
   Dt                      Decomposition_Type
   Ea                      East_Asian_Width
   East_Asian_Width        (Short: ea)
   Ext                     Extender
   Extender                (Short: Ext)
   Full_Composition_Exclusion (Short: Comp_Ex)
   Gc                      General_Category
   GCB                     Grapheme_Cluster_Break
   General_Category        (Short: gc)
   Gr_Base                 Grapheme_Base
   Gr_Ext                  Grapheme_Extend
   Graph                   XPosixGraph.  (Perl extension)
   Grapheme_Base           (Short: Gr_Base)
   Grapheme_Cluster_Break  (Short: GCB)
   Grapheme_Extend         (Short: Gr_Ext)
   Hangul_Syllable_Type    (Short: hst)
   Hex                     Hex_Digit
   Hex_Digit               (Short: Hex)
   HorizSpace              XPosixBlank.  (Perl extension)
   Hst                     Hangul_Syllable_Type
 D Hyphen                  Supplanted by Line_Break property values;
                           see www.unicode.org/reports/tr14
   ID_Continue             (Short: IDC)
   ID_Start                (Short: IDS)
   IDC                     ID_Continue
   Ideo                    Ideographic
   Ideographic             (Short: Ideo)
   IDS                     ID_Start
   IDS_Binary_Operator     (Short: IDSB)
   IDS_Trinary_Operator    (Short: IDST)
   IDSB                    IDS_Binary_Operator
   IDST                    IDS_Trinary_Operator
   In                      Present_In.  (Perl extension)
   Indic_Positional_Category (Short: InPC)
   Indic_Syllabic_Category (Short: InSC)
   InPC                    Indic_Positional_Category
   InSC                    Indic_Syllabic_Category
   Isc                     ISO_Comment; NOT 'isc' meaning
                           'General_Category=Other'
   ISO_Comment             (Short: isc)
   Jg                      Joining_Group
   Join_C                  Join_Control
   Join_Control            (Short: Join_C)
   Joining_Group           (Short: jg)
   Joining_Type            (Short: jt)
   Jt                      Joining_Type
   Lb                      Line_Break
   Lc                      Lowercase_Mapping; NOT 'lc' meaning
                           'General_Category=Cased_Letter'
   Line_Break              (Short: lb)
   LOE                     Logical_Order_Exception
   Logical_Order_Exception (Short: LOE)
   Lower                   Lowercase
   Lowercase               (Short: Lower)
   Lowercase_Mapping       (Short: lc)
   Math
   Na                      Name
   Na1                     Unicode_1_Name
   Name                    (Short: na)
   Name_Alias
   NChar                   Noncharacter_Code_Point
   NFC_QC                  NFC_Quick_Check
   NFC_Quick_Check         (Short: NFC_QC)
   NFD_QC                  NFD_Quick_Check
   NFD_Quick_Check         (Short: NFD_QC)
   NFKC_Casefold           (Short: NFKC_CF)
   NFKC_CF                 NFKC_Casefold
   NFKC_QC                 NFKC_Quick_Check
   NFKC_Quick_Check        (Short: NFKC_QC)
   NFKD_QC                 NFKD_Quick_Check
   NFKD_Quick_Check        (Short: NFKD_QC)
   Noncharacter_Code_Point (Short: NChar)
   Nt                      Numeric_Type
   Numeric_Type            (Short: nt)
   Numeric_Value           (Short: nv)
   Nv                      Numeric_Value
   Pat_Syn                 Pattern_Syntax
   Pat_WS                  Pattern_White_Space
   Pattern_Syntax          (Short: Pat_Syn)
   Pattern_White_Space     (Short: Pat_WS)
   PCM                     Prepended_Concatenation_Mark
   Perl_Decimal_Digit      (Perl extension)
   PerlSpace               PosixSpace.  (Perl extension)
   PerlWord                PosixWord.  (Perl extension)
   PosixAlnum              (Perl extension).  [A-Za-z0-9]
   PosixAlpha              (Perl extension).  [A-Za-z]
   PosixBlank              (Perl extension).  \t and ' '
   PosixCntrl              (Perl extension).  ASCII control
                           characters: NUL, SOH, STX, ETX, EOT, ENQ,
                           ACK, BEL, BS, HT, LF, VT, FF, CR, SO, SI,
                           DLE, DC1, DC2, DC3, DC4, NAK, SYN, ETB,
                           CAN, EOM, SUB, ESC, FS, GS, RS, US, and DEL
   PosixDigit              (Perl extension).  [0-9]
   PosixGraph              (Perl extension).  [-!"#$%&'()*+,./:;<=
                           >?@[\\]^_`{|}~0-9A-Za-z]
   PosixLower              (Perl extension).  [a-z]
   PosixPrint              (Perl extension).  [- 0-9A-Za-
                           z!"#$%&'()*+,./:;<=>?@[\\]^_`{|}~]
   PosixPunct              (Perl extension).  [-!"#$%&'()*+,./:;<=
                           >?@[\\]^_`{|}~]
   PosixSpace              (Perl extension).  \t, \n, \cK, \f, \r,
                           and ' '.  (\cK is vertical tab)
   PosixUpper              (Perl extension).  [A-Z]
   PosixWord               (Perl extension).  \w, restricted to ASCII
                           = [A-Za-z0-9_]
   PosixXDigit             (Perl extension).  [0-9A-Fa-f]
   Prepended_Concatenation_Mark (Short: PCM)
   Present_In              (Short: In).  (Perl extension)
   Print                   XPosixPrint.  (Perl extension)
   Punct                   General_Category=Punct.  (Perl extension)
   QMark                   Quotation_Mark
   Quotation_Mark          (Short: QMark)
   Radical
   SB                      Sentence_Break
   Sc                      Script; NOT 'sc' meaning
                           'General_Category=Currency_Symbol'
   Scf                     Simple_Case_Folding
   Script                  (Short: sc)
   Script_Extensions       (Short: scx)
   Scx                     Script_Extensions
   SD                      Soft_Dotted
   Sentence_Break          (Short: SB)
   Sentence_Terminal       (Short: STerm)
   Sfc                     Simple_Case_Folding
   Simple_Case_Folding     (Short: scf)
   Simple_Lowercase_Mapping (Short: slc)
   Simple_Titlecase_Mapping (Short: stc)
   Simple_Uppercase_Mapping (Short: suc)
   Slc                     Simple_Lowercase_Mapping
   Soft_Dotted             (Short: SD)
   Space                   White_Space
   SpacePerl               XPosixSpace.  (Perl extension)
   Stc                     Simple_Titlecase_Mapping
   STerm                   Sentence_Terminal
   Suc                     Simple_Uppercase_Mapping
   Tc                      Titlecase_Mapping
   Term                    Terminal_Punctuation
   Terminal_Punctuation    (Short: Term)
   Title                   Titlecase.  (Perl extension)
   Titlecase               (Short: Title).  (Perl extension).  (=
                           \p{Gc=Lt})
   Titlecase_Mapping       (Short: tc)
   Uc                      Uppercase_Mapping
   UIdeo                   Unified_Ideograph
   Unicode                 Any.  (Perl extension)
   Unicode_1_Name          (Short: na1)
   Unified_Ideograph       (Short: UIdeo)
   Upper                   Uppercase
   Uppercase               (Short: Upper)
   Uppercase_Mapping       (Short: uc)
   Variation_Selector      (Short: VS)
   VertSpace               (Perl extension).  \v
   VS                      Variation_Selector
   WB                      Word_Break
   White_Space             (Short: WSpace)
   Word                    XPosixWord.  (Perl extension)
   Word_Break              (Short: WB)
   WSpace                  White_Space
   XDigit                  XPosixXDigit.  (Perl extension)
   XID_Continue            (Short: XIDC)
   XID_Start               (Short: XIDS)
   XIDC                    XID_Continue
   XIDS                    XID_Start
   XPerlSpace              XPosixSpace.  (Perl extension)
   XPosixAlnum             (Short: Alnum).  (Perl extension).
                           Alphabetic and (decimal) Numeric
   XPosixAlpha             (Perl extension)
   XPosixBlank             (Short: Blank).  (Perl extension).  \h,
                           Horizontal white space
   XPosixCntrl             General_Category=XPosixCntrl  (Short:
                           Cntrl).  (Perl extension).  Control
                           characters
   XPosixDigit             General_Category=XPosixDigit  (Short:
                           Digit).  (Perl extension).  [0-9] + all
                           other decimal digits
   XPosixGraph             (Short: Graph).  (Perl extension).
                           Characters that are graphical
   XPosixLower             (Perl extension)
   XPosixPrint             (Short: Print).  (Perl extension).
                           Characters that are graphical plus space
                           characters (but no controls)
   XPosixPunct             (Perl extension).  \p{Punct} + ASCII-range
                           \p{Symbol}
   XPosixSpace             (Perl extension).  \s including beyond
                           ASCII and vertical tab
   XPosixUpper             (Perl extension)
   XPosixWord              (Short: Word).  (Perl extension).  \w,
                           including beyond ASCII; = \p{Alnum} + \pM
                           + \p{Pc} + \p{Join_Control}
   XPosixXDigit            (Short: XDigit).  (Perl extension)


=head1 Properties accessible through other means

Certain properties are accessible also via core function calls.  These are:

 Lowercase_Mapping          lc() and lcfirst()
 Titlecase_Mapping          ucfirst()
 Uppercase_Mapping          uc()

Also, Case_Folding is accessible through the C</i> modifier in regular
expressions, the C<\F> transliteration escape, and the C<L<fc|perlfunc/fc>>
operator.

And, the Name and Name_Alias properties are accessible through the C<\N{}>
interpolation in double-quoted strings and regular expressions; and functions
C<charnames::viacode()>, C<charnames::vianame()>, and
C<charnames::string_vianame()> (which require a C<use charnames ();> to be
specified).

Finally, most properties related to decomposition are accessible via
L<Unicode::Normalize>.

=head1 Unicode character properties that are NOT accepted by Perl

Perl will generate an error for a few character properties in Unicode when
used in a regular expression.  The non-Unihan ones are listed below, with the
reasons they are not accepted, perhaps with work-arounds.  The short names for
the properties are listed enclosed in (parentheses).
As described after the list, an installation can change the defaults and choose
to accept any of these.  The list is machine generated based on the
choices made for the installation that generated this document.


=over 4



=item I<Expands_On_NFC> (XO_NFC)

=item I<Expands_On_NFD> (XO_NFD)

=item I<Expands_On_NFKC> (XO_NFKC)

=item I<Expands_On_NFKD> (XO_NFKD)

Deprecated by Unicode.  These are characters that expand to more than one character in the specified normalization form, but whether they actually take up more bytes or not depends on the encoding being used.  For example, a UTF-8 encoded character may expand to a different number of bytes than a UTF-32 encoded character.



=item I<Grapheme_Link> (Gr_Link)

Deprecated by Unicode:  Duplicates ccc=vr (Canonical_Combining_Class=Virama)



=item I<Jamo_Short_Name> (JSN)

=item I<Other_Alphabetic> (OAlpha)

=item I<Other_Default_Ignorable_Code_Point> (ODI)

=item I<Other_Grapheme_Extend> (OGr_Ext)

=item I<Other_ID_Continue> (OIDC)

=item I<Other_ID_Start> (OIDS)

=item I<Other_Lowercase> (OLower)

=item I<Other_Math> (OMath)

=item I<Other_Uppercase> (OUpper)

Used by Unicode internally for generating other properties and not intended to be used stand-alone



=item I<Script=Katakana_Or_Hiragana> (sc=Hrkt)

Obsolete.  All code points previously matched by this have been moved to "Script=Common".  Consider instead using "Script_Extensions=Katakana" or "Script_Extensions=Hiragana" (or both)



=item I<Script_Extensions=Katakana_Or_Hiragana> (scx=Hrkt)

All code points that would be matched by this are matched by either "Script_Extensions=Katakana" or "Script_Extensions=Hiragana"

=back


An installation can choose to allow any of these to be matched by downloading
the Unicode database from L<http://www.unicode.org/Public/> to
C<$Config{privlib}>/F<unicore/> in the Perl source tree, changing the
controlling lists contained in the program
C<$Config{privlib}>/F<unicore/mktables> and then re-compiling and installing.
(C<%Config> is available from the Config module).

Also, perl can be recompiled to operate on an earlier version of the Unicode
standard.  Further information is at
C<$Config{privlib}>/F<unicore/README.perl>.

=head1 Other information in the Unicode data base

The Unicode data base is delivered in two different formats.  The XML version
is valid for more modern Unicode releases.  The other version is a collection
of files.  The two are intended to give equivalent information.  Perl uses the
older form; this allows you to recompile Perl to use early Unicode releases.

The only non-character property that Perl currently supports is Named
Sequences, in which a sequence of code points
is given a name and generally treated as a single entity.  (Perl supports
these via the C<\N{...}> double-quotish construct,
L<charnames/charnames::string_vianame(name)>, and L<Unicode::UCD/namedseq()>.)

Below is a list of the files in the Unicode data base that Perl doesn't
currently use, along with very brief descriptions of their purposes.
Some of the names of the files have been shortened from those that Unicode
uses, in order to allow them to be distinguishable from similarly named files
on file systems for which only the first 8 characters of a name are
significant.

=over 4




=item F<auxiliary/GraphemeBreakTest.html> 

=item F<auxiliary/LineBreakTest.html> 

=item F<auxiliary/SentenceBreakTest.html> 

=item F<auxiliary/WordBreakTest.html> 

Documentation of validation Tests



=item F<BidiCharacterTest.txt> 

=item F<BidiTest.txt> 

=item F<NormTest.txt> 

Validation Tests



=item F<CJKRadicals.txt> 

Maps the kRSUnicode property values to corresponding code points



=item F<EmojiSources.txt> 

Maps certain Unicode code points to their legacy Japanese cell-phone values



=item F<Index.txt> 

Alphabetical index of Unicode characters



=item F<NamedSqProv.txt> 

Named sequences proposed for inclusion in a later version of the Unicode Standard; if you need them now, you can append this file to F<NamedSequences.txt> and recompile perl



=item F<NamesList.html> 

Describes the format and contents of F<NamesList.txt>



=item F<NamesList.txt> 

Annotated list of characters



=item F<NormalizationCorrections.txt> 

Documentation of corrections already incorporated into the Unicode data base



=item F<ReadMe.txt> 

Documentation



=item F<StandardizedVariants.html> 

Obsoleted as of Unicode 9.0, but previously provided a visual display of the standard variant sequences derived from F<StandardizedVariants.txt>.



=item F<StandardizedVariants.txt> 

Certain glyph variations for character display are standardized.  This lists the non-Unihan ones; the Unihan ones are also not used by Perl, and are in a separate Unicode data base L<http://www.unicode.org/ivd>



=item F<TangutSources.txt> 

Specifies source mappings for Tangut ideographs and components. This data file also includes informative radical-stroke values that are used internally by Unicode



=item F<USourceData.txt> 

Documentation of status and cross reference of proposals for encoding by Unicode of Unihan characters



=item F<USourceGlyphs.pdf> 

Pictures of the characters in F<USourceData.txt>


=back

=head1 SEE ALSO

L<http://www.unicode.org/reports/tr44/>

L<perlrecharclass>

L<perlunicode>

=head1 NAME

perlguts - Introduction to the Perl API

=head1 DESCRIPTION

This document attempts to describe how to use the Perl API, as well as
to provide some info on the basic workings of the Perl core.  It is far
from complete and probably contains many errors.  Please refer any
questions or comments to the author below.

=head1 Variables

=head2 Datatypes

Perl has three typedefs that handle Perl's three main data types:

    SV  Scalar Value
    AV  Array Value
    HV  Hash Value

Each typedef has specific routines that manipulate the various data types.

=head2 What is an "IV"?

Perl uses a special typedef IV which is a simple signed integer type that is
guaranteed to be large enough to hold a pointer (as well as an integer).
Additionally, there is the UV, which is simply an unsigned IV.

Perl also uses two special typedefs, I32 and I16, which will always be at
least 32 bits and 16 bits long, respectively.  (Again, there are U32 and U16,
as well.)  They will usually be exactly 32 and 16 bits long, but on Crays
they will both be 64 bits.

=head2 Working with SVs

An SV can be created and loaded with one command.  There are five types of
values that can be loaded: an integer value (IV), an unsigned integer
value (UV), a double (NV), a string (PV), and another scalar (SV).
("PV" stands for "Pointer Value".  You might think that it is misnamed
because it is described as pointing only to strings.  However, it is
possible to have it point to other things.  For example, it could point
to an array of UVs.  But,
using it for non-strings requires care, as the underlying assumption of
much of the internals is that PVs are just for strings.  Often, for
example, a trailing C<NUL> is tacked on automatically.  The non-string use
is documented only in this paragraph.)

The seven routines are:

    SV*  newSViv(IV);
    SV*  newSVuv(UV);
    SV*  newSVnv(double);
    SV*  newSVpv(const char*, STRLEN);
    SV*  newSVpvn(const char*, STRLEN);
    SV*  newSVpvf(const char*, ...);
    SV*  newSVsv(SV*);

C<STRLEN> is an integer type (Size_t, usually defined as size_t in
F<config.h>) guaranteed to be large enough to represent the size of
any string that perl can handle.
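
As a minimal sketch of the constructors above, the following helpers build
new SVs from C data.  The names C<make_label>, C<make_binary> and
C<copy_of> are hypothetical and not part of the Perl API, and the code
assumes it is compiled as part of an XS module so that the usual headers
are available:

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* Hypothetical helper: sprintf-style construction. */
    static SV *
    make_label(pTHX_ const char *name, IV n)
    {
        return newSVpvf("%s-%" IVdf, name, n);
    }

    /* Hypothetical helper: the explicit length keeps the embedded
     * NUL, which newSVpv(str, 0) would stop at. */
    static SV *
    make_binary(pTHX)
    {
        return newSVpvn("a\0b", 3);
    }

    /* Hypothetical helper: a new SV holding a copy of orig's value. */
    static SV *
    copy_of(pTHX_ SV *orig)
    {
        return newSVsv(orig);
    }

Each helper returns a new SV with a reference count of one, so the caller
is responsible for it.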

In the unlikely case of an SV requiring more complex initialization, you
can create an empty SV with newSV(len).  If C<len> is 0 an empty SV of
type NULL is returned, else an SV of type PV is returned with len + 1 (for
the C<NUL>) bytes of storage allocated, accessible via SvPVX.  In both cases
the SV has the undef value.

    SV *sv = newSV(0);   /* no storage allocated  */
    SV *sv = newSV(10);  /* 10 (+1) bytes of uninitialised storage
                          * allocated */

To change the value of an I<already-existing> SV, there are eight routines:

    void  sv_setiv(SV*, IV);
    void  sv_setuv(SV*, UV);
    void  sv_setnv(SV*, double);
    void  sv_setpv(SV*, const char*);
    void  sv_setpvn(SV*, const char*, STRLEN);
    void  sv_setpvf(SV*, const char*, ...);
    void  sv_vsetpvfn(SV*, const char*, STRLEN, va_list *,
                                                    SV **, I32, bool *);
    void  sv_setsv(SV*, SV*);

Notice that you can choose to specify the length of the string to be
assigned by using C<sv_setpvn>, C<newSVpvn>, or C<newSVpv>, or you may
allow Perl to calculate the length by using C<sv_setpv> or by specifying
0 as the second argument to C<newSVpv>.  Be warned, though, that Perl will
determine the string's length by using C<strlen>, which depends on the
string terminating with a C<NUL> character, and not otherwise containing
NULs.

The arguments of C<sv_setpvf> are processed like C<sprintf>, and the
formatted output becomes the value.
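
A short sketch of overwriting an existing SV, assuming an XS build
environment; C<set_status> and C<set_bytes> are hypothetical helper names
used only for illustration:

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* Hypothetical helper: replace sv's value with formatted text. */
    static void
    set_status(pTHX_ SV *sv, int code, const char *msg)
    {
        sv_setpvf(sv, "%d: %s", code, msg);
    }

    /* Hypothetical helper: replace sv's value with len bytes from
     * buf, which may contain NULs because the length is explicit. */
    static void
    set_bytes(pTHX_ SV *sv, const char *buf, STRLEN len)
    {
        sv_setpvn(sv, buf, len);
    }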

C<sv_vsetpvfn> is an analogue of C<vsprintf>, but it allows you to specify
either a pointer to a variable argument list or the address and length of
an array of SVs.  The last argument points to a boolean; on return, if that
boolean is true, then locale-specific information has been used to format
the string, and the string's contents are therefore untrustworthy (see
L<perlsec>).  This pointer may be NULL if that information is not
important.  Note that this function requires you to specify the length of
the format.

The C<sv_set*()> functions are not generic enough to operate on values
that have "magic".  See L</Magic Virtual Tables> later in this document.

All SVs that contain strings should be terminated with a C<NUL> character.
If it is not C<NUL>-terminated there is a risk of
core dumps and corruptions from code which passes the string to C
functions or system calls which expect a C<NUL>-terminated string.
Perl's own functions typically add a trailing C<NUL> for this reason.
Nevertheless, you should be very careful when you pass a string stored
in an SV to a C function or system call.

To access the actual value that an SV points to, you can use the macros:

    SvIV(SV*)
    SvUV(SV*)
    SvNV(SV*)
    SvPV(SV*, STRLEN len)
    SvPV_nolen(SV*)

which will automatically coerce the actual scalar type into an IV, UV, double,
or string.

In the C<SvPV> macro, the length of the string returned is placed into the
variable C<len> (this is a macro, so you do I<not> use C<&len>).  If you do
not care what the length of the data is, use the C<SvPV_nolen> macro.
Historically the C<SvPV> macro with the global variable C<PL_na> has been
used in this case.  But that can be quite inefficient because C<PL_na> must
be accessed in thread-local storage in threaded Perl.  In any case, remember
that Perl allows arbitrary strings of data that may both contain NULs and
might not be terminated by a C<NUL>.
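
The following sketch relies on the length reported by C<SvPV> rather than
on C<NUL> termination.  C<count_nuls> is a hypothetical helper, not part of
the Perl API, and is assumed to be compiled inside an XS module:

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* Hypothetical helper: count the NUL bytes in sv's string value. */
    static STRLEN
    count_nuls(pTHX_ SV *sv)
    {
        STRLEN len, i, n = 0;
        const char *p = SvPV(sv, len);  /* macro fills len; no &len */

        for (i = 0; i < len; i++) {
            if (p[i] == '\0')
                n++;
        }
        return n;
    }

Walking the buffer with C<strlen> instead would stop at the first embedded
C<NUL> and under-count.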

Also remember that C doesn't allow you to safely say C<foo(SvPV(s, len),
len);>.  It might work with your
compiler, but it won't work for everyone.
Break this sort of statement up into separate assignments:

    SV *s;
    STRLEN len;
    char *ptr;
    ptr = SvPV(s, len);
    foo(ptr, len);

If you want to know if the scalar value is TRUE, you can use:

    SvTRUE(SV*)

Although Perl will automatically grow strings for you, if you need to force
Perl to allocate more memory for your SV, you can use the macro

    SvGROW(SV*, STRLEN newlen)

which will determine if more memory needs to be allocated.  If so, it will
call the function C<sv_grow>.  Note that C<SvGROW> can only increase, not
decrease, the allocated memory of an SV and that it does not automatically
add space for the trailing C<NUL> byte (perl's own string functions typically do
C<SvGROW(sv, len + 1)>).

If you want to write to an existing SV's buffer and set its value to a
string, use SvPV_force() or one of its variants to force the SV to be
a PV.  This will remove any of various types of non-stringness from
the SV while preserving the content of the SV in the PV.  This can be
used, for example, to append data from an API function to a buffer
without extra copying:

    (void)SvPVbyte_force(sv, len);
    s = SvGROW(sv, len + needlen + 1);
    /* something that may modify up to needlen bytes at s+len, but
       actually modifies newlen bytes
         eg. newlen = read(fd, s + len, needlen);
       ignoring errors for these examples
     */
    s[len + newlen] = '\0';
    SvCUR_set(sv, len + newlen);
    SvUTF8_off(sv);
    SvSETMAGIC(sv);

If you already have the data in memory or if you want to keep your
code simple, you can use one of the sv_cat*() variants, such as
sv_catpvn().  If you want to insert anywhere in the string you can use
sv_insert() or sv_insert_flags().

If you don't need the existing content of the SV, you can avoid some
copying with:

    SvPVCLEAR(sv);
    s = SvGROW(sv, needlen + 1);
    /* something that may modify up to needlen bytes at s, but actually
       modifies newlen bytes
         eg. newlen = read(fd, s, needlen);
     */
    s[newlen] = '\0';
    SvCUR_set(sv, newlen);
    SvPOK_only(sv); /* also clears SVf_UTF8 */
    SvSETMAGIC(sv);

Again, if you already have the data in memory or want to avoid the
complexity of the above, you can use sv_setpvn().

If you have a buffer allocated with Newx() and want to set that as the
SV's value, you can use sv_usepvn_flags().  That has some requirements
if you want to avoid perl re-allocating the buffer to fit the trailing
NUL:

   Newx(buf, somesize+1, char);
   /* ... fill in buf ... */
   buf[somesize] = '\0';
   sv_usepvn_flags(sv, buf, somesize, SV_SMAGIC | SV_HAS_TRAILING_NUL);
   /* buf now belongs to perl, don't release it */

If you have an SV and want to know what kind of data Perl thinks is stored
in it, you can use the following macros to check the type of SV you have.

    SvIOK(SV*)
    SvNOK(SV*)
    SvPOK(SV*)

You can get and set the current length of the string stored in an SV with
the following macros:

    SvCUR(SV*)
    SvCUR_set(SV*, I32 val)

You can also get a pointer to the end of the string stored in the SV
with the macro:

    SvEND(SV*)

But note that these last three macros are valid only if C<SvPOK()> is true.
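
As an illustration of that restriction, a hedged sketch that checks
C<SvPOK> before touching C<SvCUR> or C<SvEND>.  C<last_byte> is a
hypothetical helper, assumed to live in an XS module:

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* Hypothetical helper: return the last byte of sv's string as an
     * unsigned value, or -1 if sv holds no string or is empty. */
    static int
    last_byte(pTHX_ SV *sv)
    {
        if (!SvPOK(sv) || SvCUR(sv) == 0)
            return -1;
        /* SvEND points just past the last byte of the string. */
        return (unsigned char) *(SvEND(sv) - 1);
    }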

If you want to append something to the end of the string stored in an C<SV*>,
you can use the following functions:

    void  sv_catpv(SV*, const char*);
    void  sv_catpvn(SV*, const char*, STRLEN);
    void  sv_catpvf(SV*, const char*, ...);
    void  sv_vcatpvfn(SV*, const char*, STRLEN, va_list *, SV **,
                                                             I32, bool *);
    void  sv_catsv(SV*, SV*);

The first function calculates the length of the string to be appended by
using C<strlen>.  In the second, you specify the length of the string
yourself.  The third function processes its arguments like C<sprintf> and
appends the formatted output.  The fourth function works like C<vsprintf>.
You can specify the address and length of an array of SVs instead of the
va_list argument.  The fifth function
extends the string stored in the first
SV with the string stored in the second SV.  It also forces the second SV
to be interpreted as a string.

The C<sv_cat*()> functions are not generic enough to operate on values that
have "magic".  See L</Magic Virtual Tables> later in this document.
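
A minimal sketch combining several of the append functions, assuming an XS
build; C<join_pair> is a hypothetical helper name:

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* Hypothetical helper: build "key=value [index]" in a new SV. */
    static SV *
    join_pair(pTHX_ const char *key, SV *value, int index)
    {
        SV *out = newSVpv(key, 0);       /* length found via strlen */

        sv_catpvn(out, "=", 1);          /* explicit length */
        sv_catsv(out, value);            /* value used as a string */
        sv_catpvf(out, " [%d]", index);  /* sprintf-style append */
        return out;
    }

The caller owns the returned SV.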

If you know the name of a scalar variable, you can get a pointer to its SV
by using the following:

    SV*  get_sv("package::varname", 0);

This returns NULL if the variable does not exist.

If you want to know if this variable (or any other SV) is actually C<defined>,
you can call:

    SvOK(SV*)

The scalar C<undef> value is stored in an SV instance called C<PL_sv_undef>.

Its address can be used whenever an C<SV*> is needed.  Make sure that
you don't try to compare a random sv with C<&PL_sv_undef>.  For example
when interfacing Perl code, it'll work correctly for:

  foo(undef);

But won't work when called as:

  $x = undef;
  foo($x);

So, to repeat: always use C<SvOK()> to check whether an SV is defined.
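
One way to combine C<get_sv> with C<SvOK> is sketched here.  The helper
name C<get_config_iv> and the idea of a fallback value are assumptions made
for illustration, and the code is assumed to be part of an XS module:

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* Hypothetical helper: read a package variable as an integer,
     * returning fallback if it does not exist or is undefined. */
    static IV
    get_config_iv(pTHX_ const char *name, IV fallback)
    {
        SV *sv = get_sv(name, 0);    /* NULL if no such variable */

        if (sv == NULL || !SvOK(sv))
            return fallback;
        return SvIV(sv);
    }

A call such as C<get_config_iv(aTHX_ "main::limit", 10)> tests definedness
with C<SvOK> and never compares the pointer against C<&PL_sv_undef>.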

Also you have to be careful when using C<&PL_sv_undef> as a value in
AVs or HVs (see L</AVs, HVs and undefined values>).

There are also the two values C<PL_sv_yes> and C<PL_sv_no>, which contain
boolean TRUE and FALSE values, respectively.  Like C<PL_sv_undef>, their
addresses can be used whenever an C<SV*> is needed.

Do not be fooled into thinking that C<(SV *) 0> is the same as C<&PL_sv_undef>.
Take this code:

    SV* sv = (SV*) 0;
    if (I-am-to-return-a-real-value) {
            sv = sv_2mortal(newSViv(42));
    }
    sv_setsv(ST(0), sv);

This code tries to return a new SV (which contains the value 42) if it should
return a real value, or undef otherwise.  Instead it has returned a NULL
pointer which, somewhere down the line, will cause a segmentation violation,
bus error, or just weird results.  Change the zero to C<&PL_sv_undef> in the
first line and all will be well.
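
The corrected logic could be sketched as a stand-alone function like this
one, where C<set_result> and its C<have_value> parameter are hypothetical
stand-ins for the XS context above:

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* Hypothetical helper: store 42 in target when have_value is
     * true, and undef otherwise. */
    static void
    set_result(pTHX_ SV *target, bool have_value)
    {
        SV *sv = &PL_sv_undef;       /* never a NULL pointer */

        if (have_value)
            sv = sv_2mortal(newSViv(42));
        sv_setsv(target, sv);
    }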

To free an SV that you've created, call C<SvREFCNT_dec(SV*)>.  Normally this
call is not necessary (see L</Reference Counts and Mortality>).

=head2 Offsets

Perl provides the function C<sv_chop> to efficiently remove characters
from the beginning of a string; you give it an SV and a pointer to
somewhere inside the PV, and it discards everything before the
pointer.  The efficiency comes by means of a little hack: instead of
actually removing the characters, C<sv_chop> sets the flag C<OOK>
(offset OK) to signal to other functions that the offset hack is in
effect, and it moves the PV pointer (called C<SvPVX>) forward
by the number of bytes chopped off, and adjusts C<SvCUR> and C<SvLEN>
accordingly.  (A portion of the space between the old and new PV
pointers is used to store the count of chopped bytes.)

Hence, at this point, the start of the buffer that we allocated lives
at C<SvPVX(sv)> minus the number of chopped bytes (which the
C<SvOOK_offset> macro retrieves), and the PV pointer is pointing
into the middle of this allocated storage.

This is best demonstrated by example.  Normally copy-on-write will prevent
the substitution operator from using this hack, but if you can craft a
string for which copy-on-write is not possible, you can see it in play.  In
the current implementation, the final byte of a string buffer is used as a
copy-on-write reference count.  If the buffer is not big enough, then
copy-on-write is skipped.  First have a look at an empty string:

  % ./perl -Ilib -MDevel::Peek -le '$a=""; $a .= ""; Dump $a'
  SV = PV(0x7ffb7c008a70) at 0x7ffb7c030390
    REFCNT = 1
    FLAGS = (POK,pPOK)
    PV = 0x7ffb7bc05b50 ""\0
    CUR = 0
    LEN = 10

Notice here the LEN is 10.  (It may differ on your platform.)  Extend the
length of the string to one less than 10, and do a substitution:

 % ./perl -Ilib -MDevel::Peek -le '$a=""; $a.="123456789"; $a=~s/.//; \
                                                            Dump($a)'
 SV = PV(0x7ffa04008a70) at 0x7ffa04030390
   REFCNT = 1
   FLAGS = (POK,OOK,pPOK)
   OFFSET = 1
   PV = 0x7ffa03c05b61 ( "\1" . ) "23456789"\0
   CUR = 8
   LEN = 9

Here the number of bytes chopped off (1) is shown next as the OFFSET.  The
portion of the string between the "real" and the "fake" beginnings is
shown in parentheses, and the values of C<SvCUR> and C<SvLEN> reflect
the fake beginning, not the real one.  (The first character of the string
buffer happens to have changed to "\1" here, not "1", because the current
implementation stores the offset count in the string buffer.  This is
subject to change.)

Something similar to the offset hack is performed on AVs to enable
efficient shifting and splicing off the beginning of the array; while
C<AvARRAY> points to the first element in the array that is visible from
Perl, C<AvALLOC> points to the real start of the C array.  These are
usually the same, but a C<shift> operation can be carried out by
increasing C<AvARRAY> by one and decreasing C<AvFILL> and C<AvMAX>.
Again, the location of the real start of the C array only comes into
play when freeing the array.  See C<av_shift> in F<av.c>.

=head2 What's Really Stored in an SV?

Recall that the usual method of determining the type of scalar you have is
to use C<Sv*OK> macros.  Because a scalar can be both a number and a string,
these macros will usually return TRUE, and calling the C<Sv*V>
macros will do the appropriate conversion of string to integer/double or
integer/double to string.

If you I<really> need to know if you have an integer, double, or string
pointer in an SV, you can use the following three macros instead:

    SvIOKp(SV*)
    SvNOKp(SV*)
    SvPOKp(SV*)

These will tell you if you truly have an integer, double, or string pointer
stored in your SV.  The "p" stands for private.

There are various ways in which the private and public flags may differ.
For example, in perl 5.16 and earlier a tied SV may have a valid
underlying value in the IV slot (so SvIOKp is true), but the data
should be accessed via the FETCH routine rather than directly,
so SvIOK is false.  (In perl 5.18 onwards, tied scalars use
the flags the same way as untied scalars.)  Another is when
numeric conversion has occurred and precision has been lost: only the
private flag is set on 'lossy' values.  So when an NV is converted to an
IV with loss, SvIOKp, SvNOKp and SvNOK will be set, while SvIOK won't be.

In general, though, it's best to use the C<Sv*V> macros.

=head2 Working with AVs

There are two ways to create and load an AV.  The first method creates an
empty AV:

    AV*  newAV();

The second method both creates the AV and initially populates it with SVs:

    AV*  av_make(SSize_t num, SV **ptr);

The second argument points to an array containing C<num> C<SV*>'s.  Once the
AV has been created, the SVs can be destroyed, if so desired.

Once the AV has been created, the following operations are possible on it:

    void  av_push(AV*, SV*);
    SV*   av_pop(AV*);
    SV*   av_shift(AV*);
    void  av_unshift(AV*, SSize_t num);

These should be familiar operations, with the exception of C<av_unshift>.
This routine adds C<num> elements at the front of the array with the C<undef>
value.  You must then use C<av_store> (described below) to assign values
to these new elements.

Here are some other functions:

    SSize_t av_top_index(AV*);
    SV**    av_fetch(AV*, SSize_t key, I32 lval);
    SV**    av_store(AV*, SSize_t key, SV* val);

The C<av_top_index> function returns the highest index value in an array (just
like $#array in Perl).  If the array is empty, -1 is returned.  The
C<av_fetch> function returns a pointer to the value at index C<key>; if C<lval>
is non-zero and no element exists there, C<av_fetch> will store an undef value
at that index.
The C<av_store> function stores the value C<val> at index C<key>, and does
not increment the reference count of C<val>.  Thus the caller is responsible
for taking care of that, and if C<av_store> returns NULL, the caller will
have to decrement the reference count to avoid a memory leak.  Note that
C<av_fetch> and C<av_store> both return C<SV**>'s, not C<SV*>'s as their
return value.
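
The following sketch shows one way to walk an array and to store into it
while respecting the reference-count rule just described.  C<sum_av> and
C<store_iv> are hypothetical helper names, assumed to be compiled inside an
XS module:

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* Hypothetical helper: sum the defined elements of av. */
    static IV
    sum_av(pTHX_ AV *av)
    {
        SSize_t i;
        SSize_t top = av_top_index(av);  /* -1 for an empty array */
        IV total = 0;

        for (i = 0; i <= top; i++) {
            SV **svp = av_fetch(av, i, 0);   /* may be NULL */
            if (svp != NULL && SvOK(*svp))
                total += SvIV(*svp);
        }
        return total;
    }

    /* Hypothetical helper: store a new integer SV at idx. */
    static void
    store_iv(pTHX_ AV *av, SSize_t idx, IV value)
    {
        SV *sv = newSViv(value);

        /* On success the array owns sv; on failure we still do. */
        if (av_store(av, idx, sv) == NULL)
            SvREFCNT_dec(sv);
    }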

A few more:

    void  av_clear(AV*);
    void  av_undef(AV*);
    void  av_extend(AV*, SSize_t key);

The C<av_clear> function deletes all the elements in the AV* array, but
does not actually delete the array itself.  The C<av_undef> function will
delete all the elements in the array plus the array itself.  The
C<av_extend> function extends the array so that it contains at least C<key+1>
elements.  If C<key+1> is less than the currently allocated length of the array,
then nothing is done.

If you know the name of an array variable, you can get a pointer to its AV
by using the following:

    AV*  get_av("package::varname", 0);

This returns NULL if the variable does not exist.

See L</Understanding the Magic of Tied Hashes and Arrays> for more
information on how to use the array access functions on tied arrays.

=head2 Working with HVs

To create an HV, you use the following routine:

    HV*  newHV();

Once the HV has been created, the following operations are possible on it:

    SV**  hv_store(HV*, const char* key, U32 klen, SV* val, U32 hash);
    SV**  hv_fetch(HV*, const char* key, U32 klen, I32 lval);

The C<klen> parameter is the length of the key being passed in (note that
you cannot pass 0 in as a value of C<klen> to tell Perl to measure the
length of the key).  The C<val> argument contains the SV pointer to the
scalar being stored, and C<hash> is the precomputed hash value (zero if
you want C<hv_store> to calculate it for you).  The C<lval> parameter
indicates whether this fetch is actually a part of a store operation, in
which case a new undefined value will be added to the HV with the supplied
key and C<hv_fetch> will return as if the value had already existed.

Remember that C<hv_store> and C<hv_fetch> return C<SV**>'s and not just
C<SV*>.  To access the scalar value, you must first dereference the return
value.  However, you should check to make sure that the return value is
not NULL before dereferencing it.

The first of these two functions checks if a hash table entry exists, and the 
second deletes it.

    bool  hv_exists(HV*, const char* key, U32 klen);
    SV*   hv_delete(HV*, const char* key, U32 klen, I32 flags);

If C<flags> does not include the C<G_DISCARD> flag then C<hv_delete> will
create and return a mortal copy of the deleted value.

And more miscellaneous functions:

    void   hv_clear(HV*);
    void   hv_undef(HV*);

Like their AV counterparts, C<hv_clear> deletes all the entries in the hash
table but does not actually delete the hash table.  The C<hv_undef> deletes
both the entries and the hash table itself.

Perl keeps the actual data in a linked list of structures with a typedef of HE.
These contain the actual key and value pointers (plus extra administrative
overhead).  The key is a string pointer; the value is an C<SV*>.  However,
once you have an C<HE*>, to get the actual key and value, use the routines
specified below.

    I32    hv_iterinit(HV*);
            /* Prepares starting point to traverse hash table */
    HE*    hv_iternext(HV*);
            /* Get the next entry, and return a pointer to a
               structure that has both the key and value */
    char*  hv_iterkey(HE* entry, I32* retlen);
            /* Get the key from an HE structure and also return
               the length of the key string */
    SV*    hv_iterval(HV*, HE* entry);
            /* Return an SV pointer to the value of the HE
               structure */
    SV*    hv_iternextsv(HV*, char** key, I32* retlen);
            /* This convenience routine combines hv_iternext,
               hv_iterkey, and hv_iterval.  The key and retlen
               arguments are return values for the key and its
               length.  The value is returned as the SV* result */

If you know the name of a hash variable, you can get a pointer to its HV
by using the following:

    HV*  get_hv("package::varname", 0);

This returns NULL if the variable does not exist.

The hash algorithm is defined in the C<PERL_HASH> macro:

    PERL_HASH(hash, key, klen)

The exact implementation of this macro varies by architecture and version
of perl, and the return value may change per invocation, so the value
is only valid for the duration of a single perl process.

See L</Understanding the Magic of Tied Hashes and Arrays> for more
information on how to use the hash access functions on tied hashes.

=head2 Hash API Extensions

Beginning with version 5.004, the following functions are also supported:

    HE*     hv_fetch_ent  (HV* tb, SV* key, I32 lval, U32 hash);
    HE*     hv_store_ent  (HV* tb, SV* key, SV* val, U32 hash);

    bool    hv_exists_ent (HV* tb, SV* key, U32 hash);
    SV*     hv_delete_ent (HV* tb, SV* key, I32 flags, U32 hash);

    SV*     hv_iterkeysv  (HE* entry);

Note that these functions take C<SV*> keys, which simplifies writing
of extension code that deals with hash structures.  These functions
also allow passing of C<SV*> keys to C<tie> functions without forcing
you to stringify the keys (unlike the previous set of functions).

They also return and accept whole hash entries (C<HE*>), making their
use more efficient (since the hash number for a particular string
doesn't have to be recomputed every time).  See L<perlapi> for detailed
descriptions.

The following macros must always be used to access the contents of hash
entries.  Note that the arguments to these macros must be simple
variables, since they may get evaluated more than once.  See
L<perlapi> for detailed descriptions of these macros.

    HePV(HE* he, STRLEN len)
    HeVAL(HE* he)
    HeHASH(HE* he)
    HeSVKEY(HE* he)
    HeSVKEY_force(HE* he)
    HeSVKEY_set(HE* he, SV* sv)

These two lower level macros are defined, but must only be used when
dealing with keys that are not C<SV*>s:

    HeKEY(HE* he)
    HeKLEN(HE* he)

Note that both C<hv_store> and C<hv_store_ent> do not increment the
reference count of the stored C<val>, which is the caller's responsibility.
If these functions return a NULL value, the caller will usually have to
decrement the reference count of C<val> to avoid a memory leak.

=head2 AVs, HVs and undefined values

Sometimes you have to store undefined values in AVs or HVs.  Although
this may be a rare case, it can be tricky.  That's because you're
used to using C<&PL_sv_undef> if you need an undefined SV.

For example, intuition tells you that this XS code:

    AV *av = newAV();
    av_store( av, 0, &PL_sv_undef );

is equivalent to this Perl code:

    my @av;
    $av[0] = undef;

Unfortunately, this isn't true.  In perl 5.18 and earlier, AVs use
C<&PL_sv_undef> as a marker for indicating that an array element has not
yet been initialized.  Thus, C<exists $av[0]> would be true for the above
Perl code, but false for the array generated by the XS code.  In perl 5.20,
storing C<&PL_sv_undef> will create a read-only element, because the scalar
C<&PL_sv_undef> itself is stored, not a copy.

Similar problems can occur when storing C<&PL_sv_undef> in HVs:

    hv_store( hv, "key", 3, &PL_sv_undef, 0 );

This will indeed make the value C<undef>, but if you try to modify
the value of C<key>, you'll get the following error:

    Modification of non-creatable hash value attempted

In perl 5.8.0, C<&PL_sv_undef> was also used to mark placeholders
in restricted hashes.  This caused such hash entries not to appear
when iterating over the hash or when checking for the keys
with the C<hv_exists> function.

You can run into similar problems when you store C<&PL_sv_yes> or
C<&PL_sv_no> into AVs or HVs.  Trying to modify such elements
will give you the following error:

    Modification of a read-only value attempted

To make a long story short, you can use the special variables
C<&PL_sv_undef>, C<&PL_sv_yes> and C<&PL_sv_no> with AVs and
HVs, but you have to make sure you know what you're doing.

Generally, if you want to store an undefined value in an AV
or HV, you should not use C<&PL_sv_undef>, but rather create a
new undefined value using the C<newSV> function, for example:

    av_store( av, 42, newSV(0) );
    hv_store( hv, "foo", 3, newSV(0), 0 );

=head2 References

References are a special type of scalar that point to other data types
(including other references).

To create a reference, use either of the following functions:

    SV* newRV_inc((SV*) thing);
    SV* newRV_noinc((SV*) thing);

The C<thing> argument can be any of an C<SV*>, C<AV*>, or C<HV*>.  The
functions are identical except that C<newRV_inc> increments the reference
count of the C<thing>, while C<newRV_noinc> does not.  For historical
reasons, C<newRV> is a synonym for C<newRV_inc>.

Once you have a reference, you can use the following macro to dereference
the reference:

    SvRV(SV*)

then call the appropriate routines, casting the returned C<SV*> to either an
C<AV*> or C<HV*>, if required.

To determine if an SV is a reference, you can use the following macro:

    SvROK(SV*)

To discover what type of value the reference refers to, use the following
macro and then check the return value.

    SvTYPE(SvRV(SV*))

The most useful types that will be returned are:

    < SVt_PVAV  Scalar
    SVt_PVAV    Array
    SVt_PVHV    Hash
    SVt_PVCV    Code
    SVt_PVGV    Glob (possibly a file handle)

See L<perlapi/svtype> for more details.

=head2 Blessed References and Class Objects

References are also used to support object-oriented programming.  In perl's
OO lexicon, an object is simply a reference that has been blessed into a
package (or class).  Once blessed, the programmer may now use the reference
to access the various methods in the class.

A reference can be blessed into a package with the following function:

    SV* sv_bless(SV* sv, HV* stash);

The C<sv> argument must be a reference value.  The C<stash> argument
specifies which class the reference will belong to.  See
L</Stashes and Globs> for information on converting class names into stashes.

/* Still under construction */

The following function upgrades C<rv> to a reference if it is not already
one, and creates a new SV for C<rv> to point to.  If C<classname> is
non-null, the new SV is blessed into the specified class.  The new SV is
returned.

	SV* newSVrv(SV* rv, const char* classname);

The following three functions copy integer, unsigned integer or double
into an SV whose reference is C<rv>.  SV is blessed if C<classname> is
non-null.

	SV* sv_setref_iv(SV* rv, const char* classname, IV iv);
	SV* sv_setref_uv(SV* rv, const char* classname, UV uv);
	SV* sv_setref_nv(SV* rv, const char* classname, NV nv);

The following function copies the pointer value (I<the address, not the
string!>) into an SV whose reference is C<rv>.  SV is blessed if
C<classname> is non-null.

	SV* sv_setref_pv(SV* rv, const char* classname, void* pv);

The following function copies a string into an SV whose reference is C<rv>.
Set length to 0 to let Perl calculate the string length.  SV is blessed if
C<classname> is non-null.

    SV* sv_setref_pvn(SV* rv, const char* classname, char* pv,
                                                         STRLEN length);

The following function tests whether the SV is blessed into the specified
class.  It does not check inheritance relationships.

	int  sv_isa(SV* sv, const char* name);

The following function tests whether the SV is a reference to a blessed object.

	int  sv_isobject(SV* sv);

The following function tests whether the SV is derived from the specified
class.  SV can be either a reference to a blessed object or a string
containing a class name.  This is the function implementing the
C<UNIVERSAL::isa> functionality.

	bool sv_derived_from(SV* sv, const char* name);

To check if you've got an object derived from a specific class you have
to write:

	if (sv_isobject(sv) && sv_derived_from(sv, class)) { ... }

=head2 Creating New Variables

To create a new Perl variable with an undef value which can be accessed from
your Perl script, use the following routines, depending on the variable type.

    SV*  get_sv("package::varname", GV_ADD);
    AV*  get_av("package::varname", GV_ADD);
    HV*  get_hv("package::varname", GV_ADD);

Notice the use of GV_ADD as the second parameter.  The new variable can now
be set, using the routines appropriate to the data type.

There are additional macros whose values may be bitwise OR'ed with the
C<GV_ADD> argument to enable certain extra features.  Those bits are:

=over

=item GV_ADDMULTI

Marks the variable as multiply defined, thus preventing the:

  Name <varname> used only once: possible typo

warning.

=item GV_ADDWARN

Issues the warning:

  Had to create <varname> unexpectedly

if the variable did not exist before the function was called.

=back

If you do not specify a package name, the variable is created in the current
package.

=head2 Reference Counts and Mortality

Perl uses a reference count-driven garbage collection mechanism.  SVs,
AVs, or HVs (xV for short in the following) start their life with a
reference count of 1.  If the reference count of an xV ever drops to 0,
then it will be destroyed and its memory made available for reuse.

This normally doesn't happen at the Perl level unless a variable is
undef'ed or the last variable holding a reference to it is changed or
overwritten.  At the internal level, however, reference counts can be
manipulated with the following macros:

    int SvREFCNT(SV* sv);
    SV* SvREFCNT_inc(SV* sv);
    void SvREFCNT_dec(SV* sv);

However, there is one other function which manipulates the reference
count of its argument.  The C<newRV_inc> function, you will recall,
creates a reference to the specified argument.  As a side effect,
it increments the argument's reference count.  If this is not what
you want, use C<newRV_noinc> instead.

For example, imagine you want to return a reference from an XSUB function.
Inside the XSUB routine, you create an SV which initially has a reference
count of one.  Then you call C<newRV_inc>, passing it the just-created SV.
This returns the reference as a new SV, but the reference count of the
SV you passed to C<newRV_inc> has been incremented to two.  Now you
return the reference from the XSUB routine and forget about the SV.
But Perl hasn't!  Whenever the returned reference is destroyed, the
reference count of the original SV is decreased to one and nothing happens.
The SV will hang around without any way to access it until Perl itself
terminates.  This is a memory leak.

The correct procedure, then, is to use C<newRV_noinc> instead of
C<newRV_inc>.  Then, if and when the last reference is destroyed,
the reference count of the SV will go to zero and it will be destroyed,
stopping any memory leak.
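
The difference can be seen in a small stand-alone model.  The sketch below
is plain C and does not use the Perl API: C<toy_sv>, C<toy_new>,
C<toy_dec>, C<toy_newrv> and C<toy_leaked> are hypothetical names chosen
for illustration, and the model assumes only the behaviour described above
(values start with a count of 1 and are freed when the count reaches 0).

```c

    #include <assert.h>
    #include <stdlib.h>

    /* Toy stand-in for an SV; these names are not the Perl API. */
    typedef struct toy_sv {
        int refcnt;
        struct toy_sv *referent;  /* set when this value is a reference */
    } toy_sv;

    static int toy_live = 0;      /* values allocated but not yet freed */

    static toy_sv *toy_new(void)
    {
        toy_sv *sv = malloc(sizeof *sv);
        assert(sv != NULL);
        sv->refcnt = 1;           /* every value starts life at 1 */
        sv->referent = NULL;
        toy_live++;
        return sv;
    }

    static void toy_dec(toy_sv *sv)
    {
        if (--sv->refcnt > 0)
            return;
        if (sv->referent != NULL)
            toy_dec(sv->referent);  /* a dying reference releases its target */
        free(sv);
        toy_live--;
    }

    /* inc != 0 models newRV_inc, inc == 0 models newRV_noinc. */
    static toy_sv *toy_newrv(toy_sv *target, int inc)
    {
        toy_sv *rv = toy_new();
        if (inc)
            target->refcnt++;
        rv->referent = target;
        return rv;
    }

    /* Create a value, wrap it in a reference, "return" the reference
       and later destroy it.  Returns how many values are left over. */
    int toy_leaked(int inc)
    {
        int before = toy_live;
        toy_sv *sv = toy_new();
        toy_sv *rv = toy_newrv(sv, inc);

        toy_dec(rv);              /* the last holder drops the reference */
        return toy_live - before;
    }

```

In this model C<toy_leaked(1)> leaves one unreachable value behind, while
C<toy_leaked(0)> leaves none.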

There are some convenience functions available that can help with the
destruction of xVs.  These functions introduce the concept of "mortality".
An xV that is mortal has had its reference count marked to be decremented,
but not actually decremented, until "a short time later".  Generally the
term "short time later" means a single Perl statement, such as a call to
an XSUB function.  The actual determinant for when mortal xVs have their
reference count decremented depends on two macros, SAVETMPS and FREETMPS.
See L<perlcall> and L<perlxs> for more details on these macros.

"Mortalization" then is at its simplest a deferred C<SvREFCNT_dec>.
However, if you mortalize a variable twice, the reference count will
later be decremented twice.
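
The "deferred decrement" idea can be sketched without the Perl API.  The
following plain C model is an illustration only: C<toy_sv>, C<toy_2mortal>,
C<toy_freetmps> and C<toy_mortal_demo> are hypothetical names, the pending
list is a fixed-size array, and, unlike Perl, the model never frees a
value whose count reaches zero.

```c

    #include <assert.h>

    #define TOY_TMPS_MAX 16

    typedef struct { int refcnt; } toy_sv;  /* toy stand-in for an SV */

    static toy_sv *toy_tmps[TOY_TMPS_MAX];  /* pending decrements */
    static int toy_tmps_top = 0;

    /* Models sv_2mortal: note the value, leave its count alone. */
    static toy_sv *toy_2mortal(toy_sv *sv)
    {
        assert(toy_tmps_top < TOY_TMPS_MAX);
        toy_tmps[toy_tmps_top++] = sv;
        return sv;
    }

    /* Loosely models FREETMPS: apply every decrement recorded
       since floor.  Unlike Perl, nothing is freed at zero. */
    static void toy_freetmps(int floor)
    {
        while (toy_tmps_top > floor)
            toy_tmps[--toy_tmps_top]->refcnt--;
    }

    /* Mortalize one value `times` times.  Stores the count seen
       before the cleanup in *during and returns the count after. */
    int toy_mortal_demo(int start, int times, int *during)
    {
        toy_sv sv = { start };
        int floor = toy_tmps_top;   /* loosely models SAVETMPS */
        int i;

        for (i = 0; i < times; i++)
            toy_2mortal(&sv);
        *during = sv.refcnt;
        toy_freetmps(floor);
        return sv.refcnt;
    }

```

Starting from a count of 2, mortalizing once leaves a count of 1 after the
cleanup, while mortalizing twice leaves 0, which for a real SV would mean
that it is destroyed.  In both cases the count is still 2 until the
cleanup runs.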

"Mortal" SVs are mainly used for SVs that are placed on perl's stack.
For example an SV which is created just to pass a number to a called sub
is made mortal to have it cleaned up automatically when it's popped off
the stack.  Similarly, results returned by XSUBs (which are pushed on the
stack) are often made mortal.

To create a mortal variable, use the functions:

    SV*  sv_newmortal()
    SV*  sv_2mortal(SV*)
    SV*  sv_mortalcopy(SV*)

The first call creates a mortal SV (with no value), the second converts an existing
SV to a mortal SV (and thus defers a call to C<SvREFCNT_dec>), and the
third creates a mortal copy of an existing SV.
Because C<sv_newmortal> gives the new SV no value, it must normally be
given one via C<sv_setpv>, C<sv_setiv>, etc.:

    SV *tmp = sv_newmortal();
    sv_setiv(tmp, an_integer);

As that is multiple C statements, it is quite common to see this idiom
instead:

    SV *tmp = sv_2mortal(newSViv(an_integer));


You should be careful about creating mortal variables.  Strange things
can happen if you make the same value mortal within multiple contexts,
or if you make a variable mortal multiple
times.  Thinking of "Mortalization"
as deferred C<SvREFCNT_dec> should help to minimize such problems.
For example if you are passing an SV which you I<know> has a high enough REFCNT
to survive its use on the stack you need not do any mortalization.
If you are not sure then doing an C<SvREFCNT_inc> and C<sv_2mortal>, or
making a C<sv_mortalcopy> is safer.

The mortal routines are not just for SVs; AVs and HVs can be
made mortal by passing their address (type-casted to C<SV*>) to the
C<sv_2mortal> or C<sv_mortalcopy> routines.

=head2 Stashes and Globs

A B<stash> is a hash that contains all variables that are defined
within a package.  Each key of the stash is a symbol
name (shared by all the different types of objects that have the same
name), and each value in the hash table is a GV (Glob Value).  This GV
in turn contains references to the various objects of that name,
including (but not limited to) the following:

    Scalar Value
    Array Value
    Hash Value
    I/O Handle
    Format
    Subroutine

There is a single stash called C<PL_defstash> that holds the items that exist
in the C<main> package.  To get at the items in other packages, append the
string "::" to the package name.  The items in the C<Foo> package are in
the stash C<Foo::> in PL_defstash.  The items in the C<Bar::Baz> package are
in the stash C<Baz::> in C<Bar::>'s stash.
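
As an illustration only, the nesting can be modelled in plain C without the
Perl API.  The helper below (C<toy_stash_key> is a hypothetical name, not a
Perl function) splits a package name into the sequence of stash keys that
would be followed, starting from C<PL_defstash>.

```c

    #include <assert.h>
    #include <string.h>

    /* Hypothetical helper, not part of the Perl API.  Writes the
       index-th stash key for the package name pkg into buf and
       returns 1, or returns 0 if there is no such component or
       buf is too small. */
    int toy_stash_key(const char *pkg, int index,
                      char *buf, size_t buflen)
    {
        const char *start = pkg;

        assert(pkg != NULL && buf != NULL);
        for (;;) {
            const char *sep = strstr(start, "::");
            size_t len = sep ? (size_t)(sep - start) : strlen(start);

            if (len == 0)
                return 0;             /* empty component */
            if (index == 0) {
                if (len + 3 > buflen)
                    return 0;         /* no room for name, "::", NUL */
                memcpy(buf, start, len);
                memcpy(buf + len, "::", 3);   /* copies the NUL too */
                return 1;
            }
            if (sep == NULL)
                return 0;             /* ran out of components */
            start = sep + 2;
            index--;
        }
    }

```

For C<"Bar::Baz"> this yields C<Bar::> for index 0 and C<Baz::> for
index 1, which is the order of lookups described above.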

To get the stash pointer for a particular package, use the function:

    HV*  gv_stashpv(const char* name, I32 flags)
    HV*  gv_stashsv(SV*, I32 flags)

The first function takes a literal string, the second uses the string stored
in the SV.  Remember that a stash is just a hash table, so you get back an
C<HV*>.  If C<flags> is set to C<GV_ADD>, the package will be created if
it does not already exist.

The name that C<gv_stash*v> wants is the name of the package whose symbol table
you want.  The default package is called C<main>.  If you have multiply nested
packages, pass their names to C<gv_stash*v>, separated by C<::> as in the Perl
language itself.

Alternatively, if you have an SV that is a blessed reference, you can find
out the stash pointer by using:

    HV*  SvSTASH(SvRV(SV*));

then use the following to get the package name itself:

    char*  HvNAME(HV* stash);

If you need to bless or re-bless an object you can use the following
function:

    SV*  sv_bless(SV*, HV* stash)

where the first argument, an C<SV*>, must be a reference, and the second
argument is a stash.  The returned C<SV*> can now be used in the same way
as any other SV.

For more information on references and blessings, consult L<perlref>.

=head2 Double-Typed SVs

Scalar variables normally contain only one type of value, an integer,
double, pointer, or reference.  Perl will automatically convert the
actual scalar data from the stored type into the requested type.

Some scalar variables contain more than one type of scalar data.  For
example, the variable C<$!> contains either the numeric value of C<errno>
or its string equivalent from either C<strerror> or C<sys_errlist[]>.

To force multiple data values into an SV, you must do two things: use the
C<sv_set*v> routines to add the additional scalar type, then set a flag
so that Perl will believe it contains more than one type of data.  The
four macros to set the flags are:

	SvIOK_on
	SvNOK_on
	SvPOK_on
	SvROK_on

The particular macro you must use depends on which C<sv_set*v> routine
you called first.  This is because every C<sv_set*v> routine turns on
only the bit for the particular type of data being set, and turns off
all the rest.

For example, to create a new Perl variable called "dberror" that contains
both the numeric and descriptive string error values, you could use the
following code:

    extern int  dberror;
    extern char *dberror_list[];

    SV* sv = get_sv("dberror", GV_ADD);
    sv_setiv(sv, (IV) dberror);
    sv_setpv(sv, dberror_list[dberror]);
    SvIOK_on(sv);

If the order of C<sv_setiv> and C<sv_setpv> had been reversed, then the
macro C<SvPOK_on> would need to be called instead of C<SvIOK_on>.
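
Why the flag has to be turned back on can be shown with a toy model.  The
sketch below is plain C, not the Perl API: C<toy_sv>, C<toy_setiv>,
C<toy_setpv>, C<toy_dual_flags> and the two flag values are hypothetical,
and the model assumes only what is stated above, namely that each setter
turns on its own flag and turns off all the rest.

```c

    #include <assert.h>
    #include <stddef.h>

    enum { TOY_IOK = 1, TOY_POK = 2 };  /* toy flags, not Perl's values */

    typedef struct {
        unsigned flags;
        long iv;
        const char *pv;
    } toy_sv;

    /* Each setter turns on only its own flag and clears the rest. */
    static void toy_setiv(toy_sv *sv, long iv)
    {
        sv->iv = iv;
        sv->flags = TOY_IOK;
    }

    static void toy_setpv(toy_sv *sv, const char *pv)
    {
        assert(pv != NULL);
        sv->pv = pv;
        sv->flags = TOY_POK;
    }

    /* Set an integer, then a string; optionally turn the integer
       flag back on, as SvIOK_on does in the example above. */
    unsigned toy_dual_flags(int reenable_iok)
    {
        toy_sv sv = { 0, 0, NULL };

        toy_setiv(&sv, 5);
        toy_setpv(&sv, "disk full");
        if (reenable_iok)
            sv.flags |= TOY_IOK;
        return sv.flags;
    }

```

Without the final step only the string flag survives; with it, both flags
are set and the value is treated as holding both kinds of data.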

=head2 Read-Only Values

In Perl 5.16 and earlier, copy-on-write (see the next section) shared a
flag bit with read-only scalars.  So the only way to test whether
C<sv_setsv>, etc., will raise a "Modification of a read-only value" error
in those versions is:

    SvREADONLY(sv) && !SvIsCOW(sv)

Under Perl 5.18 and later, SvREADONLY only applies to read-only variables,
and, under 5.20, copy-on-write scalars can also be read-only, so the above
check is incorrect.  You just want:

    SvREADONLY(sv)

If you need to do this check often, define your own macro like this:

    #if PERL_VERSION >= 18
    # define SvTRULYREADONLY(sv) SvREADONLY(sv)
    #else
    # define SvTRULYREADONLY(sv) (SvREADONLY(sv) && !SvIsCOW(sv))
    #endif

=head2 Copy on Write

Perl implements a copy-on-write (COW) mechanism for scalars, in which
string copies are not immediately made when requested, but are deferred
until made necessary by one or the other scalar changing.  This is mostly
transparent, but one must take care not to modify string buffers that are
shared by multiple SVs.

You can test whether an SV is using copy-on-write with C<SvIsCOW(sv)>.

You can force an SV to make its own copy of its string buffer by calling
C<sv_force_normal(sv)> or C<SvPV_force_nolen(sv)>.

If you want to make the SV drop its string buffer, use
C<sv_force_normal_flags(sv, SV_COW_DROP_PV)> or simply
C<sv_setsv(sv, NULL)>.

All of these functions will croak on read-only scalars (see the previous
section for more on those).

To test that your code is behaving correctly and not modifying COW buffers,
on systems that support L<mmap(2)> (i.e., Unix) you can configure perl with
C<-Accflags=-DPERL_DEBUG_READONLY_COW> and it will turn buffer violations
into crashes.  You will find it to be marvellously slow, so you may want to
skip perl's own tests.

=head2 Magic Variables

[This section still under construction.  Ignore everything here.  Post no
bills.  Everything not permitted is forbidden.]

Any SV may be magical, that is, it has special features that a normal
SV does not have.  These features are stored in the SV structure in a
linked list of C<struct magic>'s, typedef'ed to C<MAGIC>.

    struct magic {
        MAGIC*      mg_moremagic;
        MGVTBL*     mg_virtual;
        U16         mg_private;
        char        mg_type;
        U8          mg_flags;
        I32         mg_len;
        SV*         mg_obj;
        char*       mg_ptr;
    };

Note this is current as of patchlevel 0, and could change at any time.

=head2 Assigning Magic

Perl adds magic to an SV using the sv_magic function:

  void sv_magic(SV* sv, SV* obj, int how, const char* name, I32 namlen);

The C<sv> argument is a pointer to the SV that is to acquire a new magical
feature.

If C<sv> is not already magical, Perl uses the C<SvUPGRADE> macro to
convert C<sv> to type C<SVt_PVMG>.
Perl then continues by adding new magic
to the beginning of the linked list of magical features.  Any prior entry
of the same type of magic is deleted.  Note that this can be overridden,
and multiple instances of the same type of magic can be associated with an
SV.

The C<name> and C<namlen> arguments are used to associate a string with
the magic, typically the name of a variable.  C<namlen> is stored in the
C<mg_len> field and if C<name> is non-null then either a C<savepvn> copy of
C<name> or C<name> itself is stored in the C<mg_ptr> field, depending on
whether C<namlen> is greater than zero or equal to zero respectively.  As a
special case, if C<(name && namlen == HEf_SVKEY)> then C<name> is assumed
to contain an C<SV*> and is stored as-is with its REFCNT incremented.
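
The copy-or-store rule for C<name> can be sketched in plain C.  This is a
model only, not the code in C<sv_magic>: C<toy_magic> and C<toy_set_name>
are hypothetical names, C<malloc> stands in for C<savepvn>, and only the
C<namlen E<gt> 0> and C<namlen == 0> cases are modelled (the C<HEf_SVKEY>
case is left out).

```c

    #include <assert.h>
    #include <stdlib.h>
    #include <string.h>

    /* Toy stand-in for the two MAGIC fields involved. */
    typedef struct {
        char *ptr;    /* models mg_ptr */
        long  len;    /* models mg_len */
    } toy_magic;

    /* Models only the namlen > 0 and namlen == 0 cases; the
       HEf_SVKEY case and other negative values are left out. */
    void toy_set_name(toy_magic *mg, char *name, long namlen)
    {
        assert(mg != NULL && namlen >= 0);
        mg->len = namlen;
        mg->ptr = NULL;
        if (name == NULL)
            return;
        if (namlen > 0) {
            /* private, NUL-terminated copy, as savepvn would make */
            mg->ptr = malloc((size_t)namlen + 1);
            assert(mg->ptr != NULL);
            memcpy(mg->ptr, name, (size_t)namlen);
            mg->ptr[namlen] = '\0';
        } else {
            mg->ptr = name;   /* stored as-is, no copy */
        }
    }

```

With a positive length the stored pointer is a distinct copy of the first
C<namlen> bytes; with a zero length it is the caller's own pointer.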

The sv_magic function uses C<how> to determine which, if any, predefined
"Magic Virtual Table" should be assigned to the C<mg_virtual> field.
See the L</Magic Virtual Tables> section below.  The C<how> argument is also
stored in the C<mg_type> field.  The value of
C<how> should be chosen from the set of macros
C<PERL_MAGIC_foo> found in F<perl.h>.  Note that before
these macros were added, Perl internals used to directly use character
literals, so you may occasionally come across old code or documentation
referring to 'U' magic rather than C<PERL_MAGIC_uvar> for example.

The C<obj> argument is stored in the C<mg_obj> field of the C<MAGIC>
structure.  If it is not the same as the C<sv> argument, the reference
count of the C<obj> object is incremented.  If it is the same, or if
the C<how> argument is C<PERL_MAGIC_arylen>, C<PERL_MAGIC_regdatum>,
C<PERL_MAGIC_regdata>, or if it is a NULL pointer, then C<obj> is merely
stored, without the reference count being incremented.

See also C<sv_magicext> in L<perlapi> for a more flexible way to add magic
to an SV.

There is also a function to add magic to an C<HV>:

    void hv_magic(HV *hv, GV *gv, int how);

This simply calls C<sv_magic> and coerces the C<gv> argument into an C<SV>.

To remove the magic from an SV, call the function sv_unmagic:

    int sv_unmagic(SV *sv, int type);

The C<type> argument should be equal to the C<how> value when the C<SV>
was initially made magical.

However, note that C<sv_unmagic> removes all magic of a certain C<type> from the
C<SV>.  If you want to remove only certain
magic of a C<type> based on the magic
virtual table, use C<sv_unmagicext> instead:

    int sv_unmagicext(SV *sv, int type, MGVTBL *vtbl);

=head2 Magic Virtual Tables

The C<mg_virtual> field in the C<MAGIC> structure is a pointer to an
C<MGVTBL> (short for "Magic Virtual Table"), which is a structure of
function pointers that handle the various operations that might be
applied to that variable.

The C<MGVTBL> has five (or sometimes eight) pointers to the following
routine types:

    int  (*svt_get)  (pTHX_ SV* sv, MAGIC* mg);
    int  (*svt_set)  (pTHX_ SV* sv, MAGIC* mg);
    U32  (*svt_len)  (pTHX_ SV* sv, MAGIC* mg);
    int  (*svt_clear)(pTHX_ SV* sv, MAGIC* mg);
    int  (*svt_free) (pTHX_ SV* sv, MAGIC* mg);

    int  (*svt_copy) (pTHX_ SV *sv, MAGIC* mg, SV *nsv,
                                          const char *name, I32 namlen);
    int  (*svt_dup)  (pTHX_ MAGIC *mg, CLONE_PARAMS *param);
    int  (*svt_local)(pTHX_ SV *nsv, MAGIC *mg);


This MGVTBL structure is set at compile-time in F<perl.h> and there are
currently 32 types.  These different structures contain pointers to various
routines that perform additional actions depending on which function is
being called.

   Function pointer    Action taken
   ----------------    ------------
   svt_get             Do something before the value of the SV is
                       retrieved.
   svt_set             Do something after the SV is assigned a value.
   svt_len             Report on the SV's length.
   svt_clear           Clear something the SV represents.
   svt_free            Free any extra storage associated with the SV.

   svt_copy            copy tied variable magic to a tied element
   svt_dup             duplicate a magic structure during thread cloning
   svt_local           copy magic to local value during 'local'

For instance, the MGVTBL structure called C<vtbl_sv> (which corresponds
to an C<mg_type> of C<PERL_MAGIC_sv>) contains:

    { magic_get, magic_set, magic_len, 0, 0 }

Thus, when an SV is determined to be magical and of type C<PERL_MAGIC_sv>,
if a get operation is being performed, the routine C<magic_get> is
called.  All the various routines for the various magical types begin
with C<magic_>.  NOTE: the magic routines are not considered part of
the Perl API, and may not be exported by the Perl library.
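
The dispatch pattern can be illustrated with a reduced model in plain C.
Nothing below is the Perl API: C<toy_vtbl>, C<toy_magic>, C<toy_sv>,
C<toy_read> and the other names are hypothetical, the table has only two
slots, and the sketch assumes that a NULL table or a NULL slot simply
means that no hook is run.

```c

    #include <assert.h>
    #include <stddef.h>

    typedef struct toy_sv toy_sv;
    typedef struct toy_magic toy_magic;

    /* Toy two-slot table; the real MGVTBL has more slots. */
    typedef struct {
        int (*get)(toy_sv *sv, toy_magic *mg);
        int (*set)(toy_sv *sv, toy_magic *mg);
    } toy_vtbl;

    struct toy_magic {
        toy_magic *moremagic;     /* next entry in the list */
        const toy_vtbl *vtbl;     /* may be NULL, like "(none)" */
    };

    struct toy_sv {
        long value;
        long reads;
        toy_magic *magic;
    };

    /* A "get" hook: count how often the value is read. */
    static int toy_count_get(toy_sv *sv, toy_magic *mg)
    {
        (void)mg;
        sv->reads++;
        return 0;
    }

    static const toy_vtbl toy_vtbl_count = { toy_count_get, NULL };

    /* Read the value, first running every non-NULL get hook. */
    long toy_read(toy_sv *sv)
    {
        toy_magic *mg;

        for (mg = sv->magic; mg != NULL; mg = mg->moremagic)
            if (mg->vtbl != NULL && mg->vtbl->get != NULL)
                mg->vtbl->get(sv, mg);
        return sv->value;
    }

    /* Attach counting magic, read three times, report the count. */
    long toy_reads_after_three(void)
    {
        toy_magic mg = { NULL, &toy_vtbl_count };
        toy_sv sv = { 42, 0, NULL };

        sv.magic = &mg;
        toy_read(&sv);
        toy_read(&sv);
        toy_read(&sv);
        return sv.reads;
    }

```

A value with no magic attached is read without any hook being called, so
its C<reads> counter stays at zero.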

The last three slots are a recent addition, and for source code
compatibility they are only checked for if one of the three flags
MGf_COPY, MGf_DUP or MGf_LOCAL is set in mg_flags.
This means that most code can continue declaring
a vtable as a 5-element value.  These three are
currently used exclusively by the threading code, and are highly subject
to change.

The current kinds of Magic Virtual Tables are:

=for comment
This table is generated by regen/mg_vtable.pl.  Any changes made here
will be lost.

=for mg_vtable.pl begin

 mg_type
 (old-style char and macro)   MGVTBL         Type of magic
 --------------------------   ------         -------------
 \0 PERL_MAGIC_sv             vtbl_sv        Special scalar variable
 #  PERL_MAGIC_arylen         vtbl_arylen    Array length ($#ary)
 %  PERL_MAGIC_rhash          (none)         Extra data for restricted
                                             hashes
 *  PERL_MAGIC_debugvar       vtbl_debugvar  $DB::single, signal, trace
                                             vars
 .  PERL_MAGIC_pos            vtbl_pos       pos() lvalue
 :  PERL_MAGIC_symtab         (none)         Extra data for symbol
                                             tables
 <  PERL_MAGIC_backref        vtbl_backref   For weak ref data
 @  PERL_MAGIC_arylen_p       (none)         To move arylen out of XPVAV
 B  PERL_MAGIC_bm             vtbl_regexp    Boyer-Moore 
                                             (fast string search)
 c  PERL_MAGIC_overload_table vtbl_ovrld     Holds overload table 
                                             (AMT) on stash
 D  PERL_MAGIC_regdata        vtbl_regdata   Regex match position data 
                                             (@+ and @- vars)
 d  PERL_MAGIC_regdatum       vtbl_regdatum  Regex match position data
                                             element
 E  PERL_MAGIC_env            vtbl_env       %ENV hash
 e  PERL_MAGIC_envelem        vtbl_envelem   %ENV hash element
 f  PERL_MAGIC_fm             vtbl_regexp    Formline 
                                             ('compiled' format)
 g  PERL_MAGIC_regex_global   vtbl_mglob     m//g target
 H  PERL_MAGIC_hints          vtbl_hints     %^H hash
 h  PERL_MAGIC_hintselem      vtbl_hintselem %^H hash element
 I  PERL_MAGIC_isa            vtbl_isa       @ISA array
 i  PERL_MAGIC_isaelem        vtbl_isaelem   @ISA array element
 k  PERL_MAGIC_nkeys          vtbl_nkeys     scalar(keys()) lvalue
 L  PERL_MAGIC_dbfile         (none)         Debugger %_<filename
 l  PERL_MAGIC_dbline         vtbl_dbline    Debugger %_<filename
                                             element
 N  PERL_MAGIC_shared         (none)         Shared between threads
 n  PERL_MAGIC_shared_scalar  (none)         Shared between threads
 o  PERL_MAGIC_collxfrm       vtbl_collxfrm  Locale transformation
 P  PERL_MAGIC_tied           vtbl_pack      Tied array or hash
 p  PERL_MAGIC_tiedelem       vtbl_packelem  Tied array or hash element
 q  PERL_MAGIC_tiedscalar     vtbl_packelem  Tied scalar or handle
 r  PERL_MAGIC_qr             vtbl_regexp    Precompiled qr// regex
 S  PERL_MAGIC_sig            (none)         %SIG hash
 s  PERL_MAGIC_sigelem        vtbl_sigelem   %SIG hash element
 t  PERL_MAGIC_taint          vtbl_taint     Taintedness
 U  PERL_MAGIC_uvar           vtbl_uvar      Available for use by
                                             extensions
 u  PERL_MAGIC_uvar_elem      (none)         Reserved for use by
                                             extensions
 V  PERL_MAGIC_vstring        (none)         SV was vstring literal
 v  PERL_MAGIC_vec            vtbl_vec       vec() lvalue
 w  PERL_MAGIC_utf8           vtbl_utf8      Cached UTF-8 information
 x  PERL_MAGIC_substr         vtbl_substr    substr() lvalue
 y  PERL_MAGIC_defelem        vtbl_defelem   Shadow "foreach" iterator
                                             variable / smart parameter
                                             vivification
 \  PERL_MAGIC_lvref          vtbl_lvref     Lvalue reference
                                             constructor
 ]  PERL_MAGIC_checkcall      vtbl_checkcall Inlining/mutation of call
                                             to this CV
 ~  PERL_MAGIC_ext            (none)         Available for use by
                                             extensions

=for mg_vtable.pl end

When an uppercase and lowercase letter both exist in the table, then the
uppercase letter is typically used to represent some kind of composite type
(a list or a hash), and the lowercase letter is used to represent an element
of that composite type.  Some internals code makes use of this case
relationship.  However, 'v' and 'V' (vec and v-string) are in no way related.

The C<PERL_MAGIC_ext> and C<PERL_MAGIC_uvar> magic types are defined
specifically for use by extensions and will not be used by perl itself.
Extensions can use C<PERL_MAGIC_ext> magic to 'attach' private information
to variables (typically objects).  This is especially useful because
there is no way for normal perl code to corrupt this private information
(unlike using extra elements of a hash object).

Similarly, C<PERL_MAGIC_uvar> magic can be used much like tie() to call a
C function any time a scalar's value is used or changed.  The C<MAGIC>'s
C<mg_ptr> field points to a C<ufuncs> structure:

    struct ufuncs {
        I32 (*uf_val)(pTHX_ IV, SV*);
        I32 (*uf_set)(pTHX_ IV, SV*);
        IV uf_index;
    };

When the SV is read from or written to, the C<uf_val> or C<uf_set>
function will be called with C<uf_index> as the first arg and a pointer to
the SV as the second.  A simple example of how to add C<PERL_MAGIC_uvar>
magic is shown below.  Note that the ufuncs structure is copied by
sv_magic, so you can safely allocate it on the stack.

    void
    Umagic(sv)
        SV *sv;
    PREINIT:
        struct ufuncs uf;
    CODE:
        uf.uf_val   = &my_get_fn;
        uf.uf_set   = &my_set_fn;
        uf.uf_index = 0;
        sv_magic(sv, 0, PERL_MAGIC_uvar, (char*)&uf, sizeof(uf));

Attaching C<PERL_MAGIC_uvar> to arrays is permissible but has no effect.
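
The calling convention for C<uf_val> and C<uf_set> described above can be
illustrated outside of Perl.  The following is only a stand-alone model in
plain C, not Perl's implementation: C<struct model_sv>,
C<struct model_ufuncs>, C<model_read> and C<model_write> are hypothetical
names, and the interpreter context argument is left out.

```c
    #include <assert.h>
    #include <stddef.h>

    struct model_sv;

    /* Shaped after struct ufuncs, minus the interpreter context. */
    struct model_ufuncs {
        int  (*uf_val)(long, struct model_sv *);  /* called on read  */
        int  (*uf_set)(long, struct model_sv *);  /* called on write */
        long uf_index;
    };

    struct model_sv {
        int value;
        struct model_ufuncs *magic;   /* NULL when no magic attached */
    };

    /* Reading: run the 'get' hook first, then use the value. */
    static int model_read(struct model_sv *sv)
    {
        if (sv->magic && sv->magic->uf_val)
            sv->magic->uf_val(sv->magic->uf_index, sv);
        return sv->value;
    }

    /* Writing: store the value, then run the 'set' hook. */
    static void model_write(struct model_sv *sv, int v)
    {
        sv->value = v;
        if (sv->magic && sv->magic->uf_set)
            sv->magic->uf_set(sv->magic->uf_index, sv);
    }

    /* Example hooks that merely count how often they are called. */
    static int reads, writes;

    static int count_read(long ix, struct model_sv *sv)
    {
        (void)ix; (void)sv;
        reads++;
        return 0;
    }

    static int count_write(long ix, struct model_sv *sv)
    {
        (void)ix; (void)sv;
        writes++;
        return 0;
    }
```

Attaching C<count_read> and C<count_write> to a C<struct model_sv> and
then performing one write and one read leaves both counters at 1.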

For hashes there is a specialized hook that gives control over hash
keys (but not values).  This hook calls C<PERL_MAGIC_uvar> 'get' magic
if the "set" function in the C<ufuncs> structure is NULL.  The hook
is activated whenever the hash is accessed with a key specified as
an C<SV> through the functions C<hv_store_ent>, C<hv_fetch_ent>,
C<hv_delete_ent>, and C<hv_exists_ent>.  Accessing the key as a string
through the functions without the C<..._ent> suffix circumvents the
hook.  See L<Hash::Util::FieldHash/GUTS> for a detailed description.

Note that because multiple extensions may be using C<PERL_MAGIC_ext>
or C<PERL_MAGIC_uvar> magic, it is important for extensions to take
extra care to avoid conflict.  Typically only using the magic on
objects blessed into the same class as the extension is sufficient.
For C<PERL_MAGIC_ext> magic, it is usually a good idea to define an
C<MGVTBL>, even if all its fields will be C<0>, so that individual
C<MAGIC> pointers can be identified as a particular kind of magic
using their magic virtual table.  C<mg_findext> provides an easy way
to do that:

    STATIC MGVTBL my_vtbl = { 0, 0, 0, 0, 0, 0, 0, 0 };

    MAGIC *mg;
    if ((mg = mg_findext(sv, PERL_MAGIC_ext, &my_vtbl))) {
        /* this is really ours, not another module's PERL_MAGIC_ext */
        my_priv_data_t *priv = (my_priv_data_t *)mg->mg_ptr;
        ...
    }

Also note that the C<sv_set*()> and C<sv_cat*()> functions described
earlier do B<not> invoke 'set' magic on their targets.  This must
be done by the user either by calling the C<SvSETMAGIC()> macro after
calling these functions, or by using one of the C<sv_set*_mg()> or
C<sv_cat*_mg()> functions.  Similarly, generic C code must call the
C<SvGETMAGIC()> macro to invoke any 'get' magic if it uses an SV
obtained from external sources in functions that don't handle magic.
See L<perlapi> for a description of these functions.
For example, calls to the C<sv_cat*()> functions typically need to be
followed by C<SvSETMAGIC()>, but they don't need a prior C<SvGETMAGIC()>
since their implementation handles 'get' magic.

=head2 Finding Magic

    MAGIC *mg_find(SV *sv, int type); /* Finds the magic pointer of that
                                       * type */

This routine returns a pointer to a C<MAGIC> structure stored in the SV.
If the SV does not have that magical feature, C<NULL> is returned.  If the
SV has multiple instances of that magical feature, the first one will be
returned.  C<mg_findext> can be used to find a C<MAGIC> structure of an SV
based on both its magic type and its magic virtual table:

    MAGIC *mg_findext(SV *sv, int type, MGVTBL *vtbl);

Also, if the SV passed to C<mg_find> or C<mg_findext> is not of type
SVt_PVMG, Perl may core dump.

    int mg_copy(SV* sv, SV* nsv, const char* key, STRLEN klen);

This routine checks to see what types of magic C<sv> has.  If the mg_type
field is an uppercase letter, then the mg_obj is copied to C<nsv>, but
the mg_type field is changed to be the lowercase letter.
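
The case convention that C<mg_copy> relies on is purely mechanical, as the
following stand-alone sketch suggests.  It is a model in plain C rather
than Perl's own code, and the helper name C<element_magic_type> is
hypothetical.

```c
    #include <assert.h>
    #include <ctype.h>

    /* Given a container magic type such as 'P' (PERL_MAGIC_tied),
     * return the type its elements would carry ('p',
     * PERL_MAGIC_tiedelem).  Returns 0 for anything that is not an
     * uppercase letter, which this model treats as "nothing to copy". */
    static char element_magic_type(char container_type)
    {
        unsigned char c = (unsigned char)container_type;
        return isupper(c) ? (char)tolower(c) : 0;
    }
```

Because the mapping only looks at letter case, it pairs C<'V'> with C<'v'>
just as readily as C<'P'> with C<'p'>, even though the v-string and vec
magic types are unrelated.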

=head2 Understanding the Magic of Tied Hashes and Arrays

Tied hashes and arrays are magical beasts of the C<PERL_MAGIC_tied>
magic type.

WARNING: As of the 5.004 release, proper usage of the array and hash
access functions requires understanding a few caveats.  Some
of these caveats are actually considered bugs in the API, to be fixed
in later releases, and are bracketed with [MAYCHANGE] below.  If
you find yourself actually applying such information in this section, be
aware that the behavior may change in the future, umm, without warning.

The perl tie function associates a variable with an object that implements
the various GET, SET, etc. methods.  To perform the equivalent of the perl
tie function from an XSUB, you must mimic this behaviour.  The code below
carries out the necessary steps -- firstly it creates a new hash, and then
creates a second hash which it blesses into the class which will implement
the tie methods.  Lastly it ties the two hashes together, and returns a
reference to the new tied hash.  Note that the code below does NOT call the
TIEHASH method in the MyTie class; see
L</Calling Perl Routines from within C Programs> for details on how
to do this.

    SV*
    mytie()
    PREINIT:
        HV *hash;
        HV *stash;
        SV *tie;
    CODE:
        hash = newHV();
        tie = newRV_noinc((SV*)newHV());
        stash = gv_stashpv("MyTie", GV_ADD);
        sv_bless(tie, stash);
        hv_magic(hash, (GV*)tie, PERL_MAGIC_tied);
        RETVAL = newRV_noinc(hash);
    OUTPUT:
        RETVAL

The C<av_store> function, when given a tied array argument, merely
copies the magic of the array onto the value to be "stored", using
C<mg_copy>.  It may also return NULL, indicating that the value did not
actually need to be stored in the array.  [MAYCHANGE] After a call to
C<av_store> on a tied array, the caller will usually need to call
C<mg_set(val)> to actually invoke the perl level "STORE" method on the
TIEARRAY object.  If C<av_store> did return NULL, a call to
C<SvREFCNT_dec(val)> will usually also be necessary to avoid a memory
leak. [/MAYCHANGE]

The previous paragraph is applicable verbatim to tied hash access using the
C<hv_store> and C<hv_store_ent> functions as well.

C<av_fetch> and the corresponding hash functions C<hv_fetch> and
C<hv_fetch_ent> actually return an undefined mortal value whose magic
has been initialized using C<mg_copy>.  Note the value so returned does not
need to be deallocated, as it is already mortal.  [MAYCHANGE] But you will
need to call C<mg_get()> on the returned value in order to actually invoke
the perl level "FETCH" method on the underlying TIE object.  Similarly,
you may also call C<mg_set()> on the return value after possibly assigning
a suitable value to it using C<sv_setsv>, which will invoke the "STORE"
method on the TIE object. [/MAYCHANGE]

[MAYCHANGE]
In other words, the array or hash fetch/store functions don't really
fetch and store actual values in the case of tied arrays and hashes.  They
merely call C<mg_copy> to attach magic to the values that were meant to be
"stored" or "fetched".  Later calls to C<mg_get> and C<mg_set> actually
do the job of invoking the TIE methods on the underlying objects.  Thus
the magic mechanism currently implements a kind of lazy access to arrays
and hashes.

Currently (as of perl version 5.004), use of the hash and array access
functions requires the user to be aware of whether they are operating on
"normal" hashes and arrays, or on their tied variants.  The API may be
changed to provide more transparent access to both tied and normal data
types in future versions.
[/MAYCHANGE]

You would do well to understand that the TIEARRAY and TIEHASH interfaces
are mere sugar to invoke some perl method calls while using the uniform hash
and array syntax.  The use of this sugar imposes some overhead (typically
about two to four extra opcodes per FETCH/STORE operation, in addition to
the creation of all the mortal variables required to invoke the methods).
This overhead will be comparatively small if the TIE methods are themselves
substantial, but if they are only a few statements long, the overhead
will not be insignificant.

=head2 Localizing changes

Perl has a very handy construction

  {
    local $var = 2;
    ...
  }

This construction is I<approximately> equivalent to

  {
    my $oldvar = $var;
    $var = 2;
    ...
    $var = $oldvar;
  }

The biggest difference is that the first construction would
reinstate the initial value of $var, irrespective of how control exits
the block: C<goto>, C<return>, C<die>/C<eval>, etc.  It is a little bit
more efficient as well.

There is a way to achieve a similar task from C via the Perl API: create a
I<pseudo-block>, and arrange for some changes to be automatically
undone at the end of it, whether it ends explicitly or via a non-local
exit (via die()).  A I<block>-like construct is created by a pair of
C<ENTER>/C<LEAVE> macros (see L<perlcall/"Returning a Scalar">).
Such a construct may be created specially for some important localized
task, or an existing one (like boundaries of enclosing Perl
subroutine/block, or an existing pair for freeing TMPs) may be
used.  (In the second case the overhead of additional localization must
be almost negligible.)  Note that any XSUB is automatically enclosed in
an C<ENTER>/C<LEAVE> pair.

Inside such a I<pseudo-block> the following services are available:

=over 4

=item C<SAVEINT(int i)>

=item C<SAVEIV(IV i)>

=item C<SAVEI32(I32 i)>

=item C<SAVELONG(long i)>

These macros arrange things to restore the value of integer variable
C<i> at the end of the enclosing I<pseudo-block>.

=item C<SAVESPTR(s)>

=item C<SAVEPPTR(p)>

These macros arrange things to restore the value of pointers C<s> and
C<p>.  C<s> must be a pointer of a type which survives conversion to
C<SV*> and back, C<p> should be able to survive conversion to C<char*>
and back.

=item C<SAVEFREESV(SV *sv)>

The refcount of C<sv> will be decremented at the end of
I<pseudo-block>.  This is similar to C<sv_2mortal> in that it is also a
mechanism for doing a delayed C<SvREFCNT_dec>.  However, while C<sv_2mortal>
extends the lifetime of C<sv> until the beginning of the next statement,
C<SAVEFREESV> extends it until the end of the enclosing scope.  These
lifetimes can be wildly different.

Also compare C<SAVEMORTALIZESV>.

=item C<SAVEMORTALIZESV(SV *sv)>

Just like C<SAVEFREESV>, but mortalizes C<sv> at the end of the current
scope instead of decrementing its reference count.  This usually has the
effect of keeping C<sv> alive until the statement that called the currently
live scope has finished executing.

=item C<SAVEFREEOP(OP *op)>

The C<OP *> is op_free()ed at the end of I<pseudo-block>.

=item C<SAVEFREEPV(p)>

The chunk of memory which is pointed to by C<p> is Safefree()ed at the
end of I<pseudo-block>.

=item C<SAVECLEARSV(SV *sv)>

Clears a slot in the current scratchpad which corresponds to C<sv> at
the end of I<pseudo-block>.

=item C<SAVEDELETE(HV *hv, char *key, I32 length)>

The key C<key> of C<hv> is deleted at the end of I<pseudo-block>.  The
string pointed to by C<key> is Safefree()ed.  If one has a I<key> in
short-lived storage, the corresponding string may be reallocated like
this:

  SAVEDELETE(PL_defstash, savepv(tmpbuf), strlen(tmpbuf));

=item C<SAVEDESTRUCTOR(DESTRUCTORFUNC_NOCONTEXT_t f, void *p)>

At the end of I<pseudo-block> the function C<f> is called with the
only argument C<p>.

=item C<SAVEDESTRUCTOR_X(DESTRUCTORFUNC_t f, void *p)>

At the end of I<pseudo-block> the function C<f> is called with the
implicit context argument (if any), and C<p>.

=item C<SAVESTACK_POS()>

The current offset on the Perl internal stack (cf. C<SP>) is restored
at the end of I<pseudo-block>.

=back
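
The idea behind these macros can be sketched with a small save stack.
The code below is a stand-alone model in plain C and not Perl's
implementation; C<model_enter>, C<model_saveint> and C<model_leave> are
hypothetical stand-ins for C<ENTER>, C<SAVEINT> and C<LEAVE>, and the
fixed stack size is an arbitrary assumption.

```c
    #include <assert.h>
    #include <stddef.h>

    #define SAVE_MAX 64

    /* One saved integer: where it lives and what it held. */
    struct save_entry { int *addr; int old; };

    static struct save_entry save_stack[SAVE_MAX];
    static size_t save_ix = 0;            /* next free save slot       */
    static size_t scope_stack[SAVE_MAX];  /* save_ix at each "ENTER"   */
    static size_t scope_ix = 0;

    /* Model of ENTER: remember where this pseudo-block's saves start. */
    static void model_enter(void)
    {
        assert(scope_ix < SAVE_MAX);
        scope_stack[scope_ix++] = save_ix;
    }

    /* Model of SAVEINT(i): record the address and the current value. */
    static void model_saveint(int *p)
    {
        assert(save_ix < SAVE_MAX);
        save_stack[save_ix].addr = p;
        save_stack[save_ix].old  = *p;
        save_ix++;
    }

    /* Model of LEAVE: undo, newest first, everything saved since the
     * matching model_enter(). */
    static void model_leave(void)
    {
        size_t base;
        assert(scope_ix > 0);
        base = scope_stack[--scope_ix];
        while (save_ix > base) {
            save_ix--;
            *save_stack[save_ix].addr = save_stack[save_ix].old;
        }
    }
```

Nested pseudo-blocks unwind independently: leaving the inner block
restores only the values saved since the inner C<model_enter>, and leaving
the outer block restores the rest.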

The following API list contains functions, thus one needs to
provide pointers to the modifiable data explicitly (either C pointers,
or Perlish C<GV *>s).  Where the above macros take C<int>, a similar
function takes C<int *>.

=over 4

=item C<SV* save_scalar(GV *gv)>

Equivalent to Perl code C<local $gv>.

=item C<AV* save_ary(GV *gv)>

=item C<HV* save_hash(GV *gv)>

Similar to C<save_scalar>, but localize C<@gv> and C<%gv>.

=item C<void save_item(SV *item)>

Duplicates the current value of C<item>.  On exit from the current
C<ENTER>/C<LEAVE> I<pseudo-block>, the value of C<item> is restored
from the stored copy.  It doesn't handle magic.  Use C<save_scalar> if
magic is affected.

=item C<void save_list(SV **sarg, I32 maxsarg)>

A variant of C<save_item> which takes multiple arguments via an array
C<sarg> of C<SV*> of length C<maxsarg>.

=item C<SV* save_svref(SV **sptr)>

Similar to C<save_scalar>, but will reinstate an C<SV *>.

=item C<void save_aptr(AV **aptr)>

=item C<void save_hptr(HV **hptr)>

Similar to C<save_svref>, but localize C<AV *> and C<HV *>.

=back

The C<Alias> module implements localization of the basic types within the
I<caller's scope>.  People who are interested in how to localize things in
the containing scope should take a look there too.

=head1 Subroutines

=head2 XSUBs and the Argument Stack

The XSUB mechanism is a simple way for Perl programs to access C subroutines.
An XSUB routine will have a stack that contains the arguments from the Perl
program, and a way to map from the Perl data structures to a C equivalent.

The stack arguments are accessible through the C<ST(n)> macro, which returns
the C<n>'th stack argument.  Argument 0 is the first argument passed in the
Perl subroutine call.  These arguments are C<SV*>, and can be used anywhere
an C<SV*> is used.

Most of the time, output from the C routine can be handled through use of
the RETVAL and OUTPUT directives.  However, there are some cases where the
argument stack is not already long enough to handle all the return values.
An example is the POSIX tzname() call, which takes no arguments, but returns
two, the local time zone's standard and summer time abbreviations.

To handle this situation, the PPCODE directive is used and the stack is
extended using the macro:

    EXTEND(SP, num);

where C<SP> is the macro that represents the local copy of the stack pointer,
and C<num> is the number of elements the stack should be extended by.

Now that there is room on the stack, values can be pushed on it using C<PUSHs>
macro.  The pushed values will often need to be "mortal" (See
L</Reference Counts and Mortality>):

    PUSHs(sv_2mortal(newSViv(an_integer)))
    PUSHs(sv_2mortal(newSVuv(an_unsigned_integer)))
    PUSHs(sv_2mortal(newSVnv(a_double)))
    PUSHs(sv_2mortal(newSVpv("Some String",0)))
    /* Although the last example is better written as the more
     * efficient: */
    PUSHs(newSVpvs_flags("Some String", SVs_TEMP))

Now, in the Perl program calling C<tzname>, the two values will be assigned
as in:

    ($standard_abbrev, $summer_abbrev) = POSIX::tzname;

An alternate (and possibly simpler) method of pushing values on the stack is
to use the macro:

    XPUSHs(SV*)

This macro automatically adjusts the stack for you, if needed.  Thus, you
do not need to call C<EXTEND> to extend the stack.
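
The relationship between extending, pushing, and the combined form can be
sketched as follows.  This is a stand-alone model in plain C that stores
C<int>s instead of C<SV*>s; C<struct model_stack>, C<model_extend>,
C<model_push> and C<model_xpush> are hypothetical names, not part of the
Perl API.

```c
    #include <assert.h>
    #include <stdlib.h>

    struct model_stack {
        int   *base;   /* start of allocated storage */
        size_t len;    /* number of values pushed    */
        size_t cap;    /* number of slots allocated  */
    };

    /* Model of EXTEND(SP, n): make room for n more values.
     * Returns 0 on success, -1 if memory could not be obtained. */
    static int model_extend(struct model_stack *s, size_t n)
    {
        if (s->cap - s->len < n) {
            size_t newcap = s->len + n;
            int *p = realloc(s->base, newcap * sizeof *p);
            if (!p)
                return -1;
            s->base = p;
            s->cap  = newcap;
        }
        return 0;
    }

    /* Model of PUSHs: the caller must already have made room. */
    static void model_push(struct model_stack *s, int v)
    {
        assert(s->len < s->cap);
        s->base[s->len++] = v;
    }

    /* Model of XPUSHs: extend by one if needed, then push. */
    static int model_xpush(struct model_stack *s, int v)
    {
        if (model_extend(s, 1) != 0)
            return -1;
        model_push(s, v);
        return 0;
    }
```

When the number of return values is known in advance, a single
C<model_extend> followed by plain pushes avoids re-checking the capacity
on every push; C<model_xpush> trades that check for convenience.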

Despite suggestions in earlier versions of this document, the macros
C<(X)PUSH[iunp]> are I<not> suited to XSUBs which return multiple results.
For that, either stick to the C<(X)PUSHs> macros shown above, or use the new
C<m(X)PUSH[iunp]> macros instead; see L</Putting a C value on Perl stack>.

For more information, consult L<perlxs> and L<perlxstut>.

=head2 Autoloading with XSUBs

If an AUTOLOAD routine is an XSUB, as with Perl subroutines, Perl puts the
fully-qualified name of the autoloaded subroutine in the $AUTOLOAD variable
of the XSUB's package.

But it also puts the same information in certain fields of the XSUB itself:

    HV *stash           = CvSTASH(cv);
    const char *subname = SvPVX(cv);
    STRLEN name_length  = SvCUR(cv); /* in bytes */
    U32 is_utf8         = SvUTF8(cv);

C<SvPVX(cv)> contains just the sub name itself, not including the package.
For an AUTOLOAD routine in UNIVERSAL or one of its superclasses,
C<CvSTASH(cv)> returns NULL during a method call on a nonexistent package.

B<Note>: Setting $AUTOLOAD stopped working in 5.6.1, which did not support
XS AUTOLOAD subs at all.  Perl 5.8.0 introduced the use of fields in the
XSUB itself.  Perl 5.16.0 restored the setting of $AUTOLOAD.  If you need
to support 5.8-5.14, use the XSUB's fields.

=head2 Calling Perl Routines from within C Programs

There are four routines that can be used to call a Perl subroutine from
within a C program.  These four are:

    I32  call_sv(SV*, I32);
    I32  call_pv(const char*, I32);
    I32  call_method(const char*, I32);
    I32  call_argv(const char*, I32, char**);

The routine most often used is C<call_sv>.  The C<SV*> argument
contains either the name of the Perl subroutine to be called, or a
reference to the subroutine.  The second argument consists of flags
that control the context in which the subroutine is called, whether
or not the subroutine is being passed arguments, how errors should be
trapped, and how to treat return values.

All four routines return the number of arguments that the subroutine returned
on the Perl stack.

These routines used to be called C<perl_call_sv>, etc., before Perl v5.6.0,
but those names are now deprecated; macros of the same name are provided for
compatibility.

When using any of these routines (except C<call_argv>), the programmer
must manipulate the Perl stack.  The tools for doing so include the
following macros and functions:

    dSP
    SP
    PUSHMARK()
    PUTBACK
    SPAGAIN
    ENTER
    SAVETMPS
    FREETMPS
    LEAVE
    XPUSH*()
    POP*()

For a detailed description of calling conventions from C to Perl,
consult L<perlcall>.

=head2 Putting a C value on Perl stack

A lot of opcodes (an opcode is an elementary operation in the internal
perl stack machine) put an SV* on the stack.  However, as an optimization
the corresponding SV is (usually) not recreated each time.  The opcodes
reuse specially assigned SVs (I<target>s) which are (as a corollary)
not constantly freed/created.

Each of the targets is created only once (but see
L</Scratchpads and recursion> below), and when an opcode needs to put
an integer, a double, or a string on the stack, it just sets the
corresponding parts of its I<target> and puts the I<target> on the stack.

The macro to put this target on the stack is C<PUSHTARG>, and it is
directly used in some opcodes, as well as indirectly in zillions of
others, which use it via C<(X)PUSH[iunp]>.

Because the target is reused, you must be careful when pushing multiple
values on the stack.  The following code will not do what you think:

    XPUSHi(10);
    XPUSHi(20);

This translates as "set C<TARG> to 10, push a pointer to C<TARG> onto
the stack; set C<TARG> to 20, push a pointer to C<TARG> onto the stack".
At the end of the operation, the stack does not contain the values 10
and 20, but actually contains two pointers to C<TARG>, which we have set
to 20.

If you need to push multiple different values then you should either use
the C<(X)PUSHs> macros, or else use the new C<m(X)PUSH[iunp]> macros,
none of which make use of C<TARG>.  The C<(X)PUSHs> macros simply push an
SV* on the stack, which, as noted under L</XSUBs and the Argument Stack>,
will often need to be "mortal".  The new C<m(X)PUSH[iunp]> macros make
this a little easier to achieve by creating a new mortal for you (via
C<(X)PUSHmortal>), pushing that onto the stack (extending it if necessary
in the case of the C<mXPUSH[iunp]> macros), and then setting its value.
Thus, instead of writing this to "fix" the example above:

    XPUSHs(sv_2mortal(newSViv(10)))
    XPUSHs(sv_2mortal(newSViv(20)))

you can simply write:

    mXPUSHi(10)
    mXPUSHi(20)

On a related note, if you do use C<(X)PUSH[iunp]>, then you're going to
need a C<dTARG> in your variable declarations so that the C<*PUSH*>
macros can make use of the local variable C<TARG>.  See also C<dTARGET>
and C<dXSTARG>.
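
The difference between pushing via the shared target and pushing fresh
values can be reproduced in a few lines.  This is a stand-alone model in
plain C, not Perl's implementation; C<struct model_sv>, C<model_xpushi>
and C<model_mxpushi> are hypothetical names, and the array sizes are
arbitrary.

```c
    #include <assert.h>
    #include <stddef.h>

    struct model_sv { int iv; };

    static struct model_sv  targ;        /* the single reusable target */
    static struct model_sv  mortals[8];  /* fresh values, one per push */
    static size_t           nmortals = 0;
    static struct model_sv *stack[8];
    static size_t           sp = 0;

    /* Model of XPUSHi(n): set the shared target, push a pointer to it. */
    static void model_xpushi(int n)
    {
        assert(sp < 8);
        targ.iv = n;
        stack[sp++] = &targ;
    }

    /* Model of mXPUSHi(n): take a fresh value for every push. */
    static void model_mxpushi(int n)
    {
        assert(sp < 8 && nmortals < 8);
        mortals[nmortals].iv = n;
        stack[sp++] = &mortals[nmortals++];
    }
```

After C<model_xpushi(10); model_xpushi(20);> both stack slots point at the
same object, which holds 20.  After C<model_mxpushi(10);
model_mxpushi(20);> the two new slots point at distinct objects holding 10
and 20.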

=head2 Scratchpads

The question remains on when the SVs which are I<target>s for opcodes
are created.  The answer is that they are created when the current
unit--a subroutine or a file (for opcodes for statements outside of
subroutines)--is compiled.  During this time a special anonymous Perl
array is created, which is called a scratchpad for the current unit.

A scratchpad keeps SVs which are lexicals for the current unit and are
targets for opcodes.  A previous version of this document
stated that one can deduce that an SV lives on a scratchpad
by looking on its flags: lexicals have C<SVs_PADMY> set, and
I<target>s have C<SVs_PADTMP> set.  But this has never been fully true.
C<SVs_PADMY> could be set on a variable that no longer resides in any pad.
While I<target>s do have C<SVs_PADTMP> set, it can also be set on variables
that have never resided in a pad, but nonetheless act like I<target>s.  As
of perl 5.21.5, the C<SVs_PADMY> flag is no longer used and is defined as
0.  C<SvPADMY()> now returns true for anything without C<SVs_PADTMP>.

The correspondence between OPs and I<target>s is not 1-to-1.  Different
OPs in the compile tree of the unit can use the same target, if this
would not conflict with the expected life of the temporary.

=head2 Scratchpads and recursion

It is not 100% true that a compiled unit contains a pointer to
the scratchpad AV.  In fact it contains a pointer to an AV of
(initially) one element, and this element is the scratchpad AV.  Why do
we need an extra level of indirection?

The answer is B<recursion>, and maybe B<threads>.  Both of
these can create several execution pointers going into the same
subroutine.  For the subroutine-child not to write over the temporaries
for the subroutine-parent (lifespan of which covers the call to the
child), the parent and the child should have different
scratchpads.  (I<And> the lexicals should be separate anyway!)

So each subroutine is born with an array of scratchpads (of length 1).
On each entry to the subroutine it is checked that the current
depth of the recursion is not more than the length of this array, and
if it is, a new scratchpad is created and pushed into the array.

The I<target>s on this scratchpad are C<undef>s, but they are already
marked with correct flags.
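
The growth of the scratchpad array with recursion depth can be modelled as
follows.  This is a stand-alone sketch in plain C, not Perl's
implementation; C<struct model_sub>, C<model_sub_init>,
C<model_sub_enter> and C<model_sub_leave> are hypothetical names, and a
pad is reduced to a small zero-filled array.

```c
    #include <assert.h>
    #include <stdlib.h>

    #define PAD_SLOTS 4   /* targets + lexicals per pad (arbitrary) */

    struct model_pad { int slot[PAD_SLOTS]; };

    struct model_sub {
        struct model_pad **padlist;  /* one pad per recursion depth */
        size_t npads;                /* pads created so far         */
        size_t depth;                /* current recursion depth     */
    };

    /* A subroutine is "born" with a pad list of length 1. */
    static int model_sub_init(struct model_sub *sub)
    {
        sub->padlist = malloc(sizeof *sub->padlist);
        if (!sub->padlist)
            return -1;
        sub->padlist[0] = calloc(1, sizeof **sub->padlist);
        if (!sub->padlist[0]) {
            free(sub->padlist);
            return -1;
        }
        sub->npads = 1;
        sub->depth = 0;
        return 0;
    }

    /* On entry: if the new depth exceeds the number of pads, create
     * one.  Returns the pad for this call, or NULL on failure. */
    static struct model_pad *model_sub_enter(struct model_sub *sub)
    {
        if (sub->depth + 1 > sub->npads) {
            struct model_pad **pl =
                realloc(sub->padlist, (sub->npads + 1) * sizeof *pl);
            if (!pl)
                return NULL;
            sub->padlist = pl;
            pl[sub->npads] = calloc(1, sizeof **pl);
            if (!pl[sub->npads])
                return NULL;
            sub->npads++;
        }
        return sub->padlist[sub->depth++];
    }

    /* On exit: the pad is kept for a later call at the same depth. */
    static void model_sub_leave(struct model_sub *sub)
    {
        assert(sub->depth > 0);
        sub->depth--;
    }
```

In this model a nested call receives a pad distinct from its caller's, and
a later call at the same depth reuses the pad that was created the first
time that depth was reached.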

=head1 Memory Allocation

=head2 Allocation

All memory meant to be used with the Perl API functions should be manipulated
using the macros described in this section.  The macros provide the necessary
transparency between differences in the actual malloc implementation that is
used within perl.

It is suggested that you enable the version of malloc that is distributed
with Perl.  It keeps pools of various sizes of unallocated memory in
order to satisfy allocation requests more quickly.  However, on some
platforms, it may cause spurious malloc or free errors.

The following three macros are used to initially allocate memory:

    Newx(pointer, number, type);
    Newxc(pointer, number, type, cast);
    Newxz(pointer, number, type);

The first argument C<pointer> should be the name of a variable that will
point to the newly allocated memory.

The second and third arguments C<number> and C<type> specify how many of
the specified type of data structure should be allocated.  The argument
C<type> is passed to C<sizeof>.  The final argument to C<Newxc>, C<cast>,
should be used if the C<pointer> argument is different from the C<type>
argument.

Unlike the C<Newx> and C<Newxc> macros, the C<Newxz> macro calls C<memzero>
to zero out all the newly allocated memory.
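
The argument order of these macros can be tried out with stand-ins built
on the C library.  The sketch below is only a model: C<MODEL_NEWX>,
C<MODEL_NEWXZ> and C<MODEL_SAFEFREE> are hypothetical names, they bypass
Perl's allocator entirely, and they signal failure by yielding C<NULL>,
which is an assumption of the model rather than a description of the real
macros.

```c
    #include <assert.h>
    #include <stdlib.h>

    /* Same argument order as Newx/Newxz: pointer, count, type. */
    #define MODEL_NEWX(p, n, t)  ((p) = (t *)malloc((n) * sizeof(t)))
    #define MODEL_NEWXZ(p, n, t) ((p) = (t *)calloc((n), sizeof(t)))
    #define MODEL_SAFEFREE(p)    free(p)

    /* Allocate an array of n counters, all zero; NULL on failure. */
    static long *new_counters(size_t n)
    {
        long *c;
        MODEL_NEWXZ(c, n, long);
        return c;
    }
```

The zeroing variant is the natural choice here because the counters must
start at zero; with C<MODEL_NEWX> the contents would be indeterminate
until written.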

=head2 Reallocation

    Renew(pointer, number, type);
    Renewc(pointer, number, type, cast);
    Safefree(pointer)

These three macros are used to change a memory buffer size or to free a
piece of memory no longer needed.  The arguments to C<Renew> and C<Renewc>
match those of C<Newx> and C<Newxc>.

=head2 Moving

    Move(source, dest, number, type);
    Copy(source, dest, number, type);
    Zero(dest, number, type);

These three macros are used to move, copy, or zero out previously allocated
memory.  The C<source> and C<dest> arguments point to the source and
destination starting points.  Perl will move, copy, or zero out C<number>
instances of the size of the C<type> data structure (using the C<sizeof>
operator).
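
The source-then-destination argument order, and the difference between
moving and copying, can be illustrated with stand-ins built on the C
library.  C<MODEL_MOVE>, C<MODEL_COPY>, C<MODEL_ZERO> and C<shift_up> are
hypothetical names used only for this sketch.

```c
    #include <assert.h>
    #include <string.h>

    /* Same argument order as Move/Copy/Zero: source, destination,
     * element count, element type. */
    #define MODEL_MOVE(s, d, n, t) \
        ((void)memmove((d), (s), (n) * sizeof(t)))
    #define MODEL_COPY(s, d, n, t) \
        ((void)memcpy((d), (s), (n) * sizeof(t)))
    #define MODEL_ZERO(d, n, t) \
        ((void)memset((d), 0, (n) * sizeof(t)))

    /* Shift the first n elements of buf up by one slot.  Source and
     * destination overlap, so this must be a move, not a copy. */
    static void shift_up(int *buf, size_t n)
    {
        MODEL_MOVE(buf, buf + 1, n, int);
    }
```

A move is required whenever the two regions may overlap, as in
C<shift_up>; a copy is appropriate only when they are known to be
disjoint.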

=head1 PerlIO

The most recent development releases of Perl have been experimenting with
removing Perl's dependency on the "normal" standard I/O suite and allowing
other stdio implementations to be used.  This involves creating a new
abstraction layer that then calls whichever implementation of stdio Perl
was compiled with.  All XSUBs should now use the functions in the PerlIO
abstraction layer and not make any assumptions about what kind of stdio
is being used.

For a complete description of the PerlIO abstraction, consult L<perlapio>.

=head1 Compiled code

=head2 Code tree

Here we describe the internal form your code is converted to by
Perl.  Start with a simple example:

  $a = $b + $c;

This is converted to a tree similar to this one:

             assign-to
           /           \
          +             $a
        /   \
      $b     $c

(but slightly more complicated).  This tree reflects the way Perl
parsed your code, but has nothing to do with the execution order.
There is an additional "thread" going through the nodes of the tree
which shows the order of execution of the nodes.  In our simplified
example above it looks like:

     $b ---> $c ---> + ---> $a ---> assign-to

But with the actual compile tree for C<$a = $b + $c> it is different:
some nodes are I<optimized away>.  As a corollary, though the actual tree
contains more nodes than our simplified example, the execution order
is the same as in our example.
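
The distinction between the tree structure and the execution-order thread
can be made concrete with a small model.  The code below is a stand-alone
sketch in plain C, not Perl's implementation; C<struct tree_op> and
C<tree_thread> are hypothetical names, and the threading rule (children
left to right, then the node itself) is the one implied by the simplified
example above.

```c
    #include <assert.h>
    #include <stddef.h>

    struct tree_op {
        const char     *name;
        struct tree_op *first;    /* first child, or NULL       */
        struct tree_op *sibling;  /* next child of same parent  */
        struct tree_op *next;     /* next op in execution order */
    };

    /* Thread the subtree rooted at o in post-order.  *tail is the op
     * threaded most recently, or NULL if there is none yet.  Returns
     * the first op to execute within this subtree. */
    static struct tree_op *
    tree_thread(struct tree_op *o, struct tree_op **tail)
    {
        struct tree_op *start = NULL, *kid;

        for (kid = o->first; kid; kid = kid->sibling) {
            struct tree_op *s = tree_thread(kid, tail);
            if (!start)
                start = s;
        }
        if (*tail)
            (*tail)->next = o;
        *tail = o;
        o->next = NULL;
        return start ? start : o;
    }
```

Building the five-node tree for C<$a = $b + $c> and calling
C<tree_thread> on its root links the nodes in the order C<$b>, C<$c>,
C<+>, C<$a>, C<assign-to>, while the C<first> and C<sibling> pointers
still describe the tree.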

=head2 Examining the tree

If you have your perl compiled for debugging (usually done with
C<-DDEBUGGING> on the C<Configure> command line), you may examine the
compiled tree by specifying C<-Dx> on the Perl command line.  The
output takes several lines per node, and for C<$b+$c> it looks like
this:

    5           TYPE = add  ===> 6
                TARG = 1
                FLAGS = (SCALAR,KIDS)
                {
                    TYPE = null  ===> (4)
                      (was rv2sv)
                    FLAGS = (SCALAR,KIDS)
                    {
    3                   TYPE = gvsv  ===> 4
                        FLAGS = (SCALAR)
                        GV = main::b
                    }
                }
                {
                    TYPE = null  ===> (5)
                      (was rv2sv)
                    FLAGS = (SCALAR,KIDS)
                    {
    4                   TYPE = gvsv  ===> 5
                        FLAGS = (SCALAR)
                        GV = main::c
                    }
                }

This tree has 5 nodes (one per C<TYPE> specifier), of which only 3 are
not optimized away (one per number in the left column).  The immediate
children of the given node correspond to C<{}> pairs on the same level
of indentation, thus this listing corresponds to the tree:

                   add
                 /     \
               null    null
                |       |
               gvsv    gvsv

The execution order is indicated by C<===E<gt>> marks, thus it is C<3
4 5 6> (node C<6> is not included in the above listing), i.e.,
C<gvsv gvsv add whatever>.

Each of these nodes represents an op, a fundamental operation inside the
Perl core.  The code which implements each operation can be found in the
F<pp*.c> files; the function which implements the op with type C<gvsv>
is C<pp_gvsv>, and so on.  As the tree above shows, different ops have
different numbers of children: C<add> is a binary operator, as one would
expect, and so has two children.  To accommodate the various different
numbers of children, there are various types of op data structure, and
they link together in different ways.

The simplest type of op structure is C<OP>: this has no children.  Unary
operators, C<UNOP>s, have one child, and this is pointed to by the
C<op_first> field.  Binary operators (C<BINOP>s) have not only an
C<op_first> field but also an C<op_last> field.  The most complex type of
op is a C<LISTOP>, which has any number of children.  In this case, the
first child is pointed to by C<op_first> and the last child by
C<op_last>.  The children in between can be found by iteratively
following the C<OpSIBLING> pointer from the first child to the last (but
see below).

There are also some other op types: a C<PMOP> holds a regular expression,
and has no children, and a C<LOOP> may or may not have children.  If the
C<op_children> field is non-zero, it behaves like a C<LISTOP>.  To
complicate matters, if a C<UNOP> is actually a C<null> op after
optimization (see L</Compile pass 2: context propagation>) it will still
have children in accordance with its former type.

Finally, there is a C<LOGOP>, or logic op. Like a C<LISTOP>, this has one
or more children, but it doesn't have an C<op_last> field: so you have to
follow C<op_first> and then the C<OpSIBLING> chain itself to find the
last child. Instead it has an C<op_other> field, which is comparable to
the C<op_next> field described below, and represents an alternate
execution path. Operators like C<and>, C<or> and C<?> are C<LOGOP>s. Note
that in general, C<op_other> may not point to any of the direct children
of the C<LOGOP>.

Starting in version 5.21.2, perls built with the experimental
define C<-DPERL_OP_PARENT> add an extra boolean flag for each op,
C<op_moresib>.  When not set, this indicates that this is the last op in an
C<OpSIBLING> chain. This frees up the C<op_sibling> field on the last
sibling to point back to the parent op. Under this build, that field is
also renamed C<op_sibparent> to reflect its joint role. The macro
C<OpSIBLING(o)> wraps this special behaviour, and always returns NULL on
the last sibling.  With this build the C<op_parent(o)> function can be
used to find the parent of any op. Thus for forward compatibility, you
should always use the C<OpSIBLING(o)> macro rather than accessing
C<op_sibling> directly.
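
The joint role of the sibling/parent field can be sketched as follows.
This is a stand-alone model in plain C, not Perl's implementation;
C<struct sib_op>, C<sib_next>, C<sib_parent> and C<sib_count_kids> are
hypothetical names standing in for the op structure, C<OpSIBLING>,
C<op_parent> and a typical child-walking loop.

```c
    #include <assert.h>
    #include <stddef.h>

    struct sib_op {
        struct sib_op *first;      /* first child, or NULL          */
        struct sib_op *sibparent;  /* next sibling if moresib is
                                    * set, otherwise the parent     */
        int            moresib;    /* non-zero: a sibling follows   */
    };

    /* Model of OpSIBLING(o): NULL on the last sibling. */
    static struct sib_op *sib_next(const struct sib_op *o)
    {
        return o->moresib ? o->sibparent : NULL;
    }

    /* Model of op_parent(o): walk to the last sibling, whose
     * sibparent field holds the parent (NULL for the root). */
    static struct sib_op *sib_parent(const struct sib_op *o)
    {
        while (o->moresib)
            o = o->sibparent;
        return o->sibparent;
    }

    /* Count children using only the accessor, never the raw field. */
    static size_t sib_count_kids(const struct sib_op *parent)
    {
        size_t n = 0;
        const struct sib_op *kid;

        for (kid = parent->first; kid; kid = sib_next(kid))
            n++;
        return n;
    }
```

Code that walks children through an accessor such as C<sib_next> keeps
working whether or not the last sibling's field doubles as a parent
pointer, which is the point of preferring C<OpSIBLING(o)>.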

Another way to examine the tree is to use a compiler back-end module, such
as L<B::Concise>.

=head2 Compile pass 1: check routines

The tree is created by the compiler while I<yacc> code feeds it
the constructions it recognizes.  Since I<yacc> works bottom-up, so does
the first pass of perl compilation.

What makes this pass interesting for perl developers is that some
optimization may be performed on this pass.  This is optimization by
so-called "check routines".  The correspondence between node names
and corresponding check routines is described in F<opcode.pl> (do not
forget to run C<make regen_headers> if you modify this file).

A check routine is called when the node is fully constructed except
for the execution-order thread.  Since at this time there are no
back-links to the currently constructed node, one can do almost any
operation to the top-level node, including freeing it and/or creating
new nodes above/below it.

The check routine returns the node which should be inserted into the
tree (if the top-level node was not modified, the check routine returns
its argument).

By convention, check routines have names C<ck_*>.  They are usually
called from C<new*OP> subroutines (or C<convert>) (which in turn are
called from F<perly.y>).

=head2 Compile pass 1a: constant folding

Immediately after the check routine is called, the returned node is
checked for being compile-time executable.  If it is (the value is
judged to be constant) it is immediately executed, and a I<constant>
node with the "return value" of the corresponding subtree is
substituted instead.  The subtree is deleted.

If constant folding was not performed, the execution-order thread is
created.
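
The folding step can be pictured with a small self-contained sketch.
This is a toy expression tree, not perl's optree: the C<toy_node> type
and C<toy_fold> are hypothetical, and where perl actually executes the
subtree to obtain its value, the sketch simply does the arithmetic
inline.

```c
#include <stddef.h>

typedef enum { TOY_CONST, TOY_VAR, TOY_ADD, TOY_MUL } toy_type;

typedef struct toy_node {
    toy_type         type;
    long             value;        /* meaningful only for TOY_CONST */
    struct toy_node *left, *right; /* operands of TOY_ADD / TOY_MUL */
} toy_node;

/*
 * Called once a node's children are complete (bottom-up, like pass 1).
 * If both operands are constants, the node is turned into a constant
 * holding the result; otherwise it is returned unchanged.
 */
static toy_node *toy_fold(toy_node *n)
{
    if ((n->type == TOY_ADD || n->type == TOY_MUL)
        && n->left->type == TOY_CONST && n->right->type == TOY_CONST)
    {
        long l = n->left->value;
        long r = n->right->value;

        n->value = (n->type == TOY_ADD) ? l + r : l * r;
        n->type  = TOY_CONST;
        /* The operand subtree is detached here; real code would free it. */
        n->left = n->right = NULL;
    }
    return n;
}
```

Because folding runs bottom-up, an expression such as C<(2 + 3) * $x>
ends up with a folded constant on the left and an unfolded
multiplication at the top.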

=head2 Compile pass 2: context propagation

When a context for a part of the compile tree is known, it is propagated
down through the tree.  At this time the context can have 5 values
(instead of 2 for runtime context): void, boolean, scalar, list, and
lvalue.  In contrast with pass 1, this pass is processed from top
to bottom: a node's context determines the context for its children.

Additional context-dependent optimizations are performed at this time.
Since at this moment the compile tree contains back-references (via
"thread" pointers), nodes cannot be free()d now.  To allow
optimized-away nodes at this stage, such nodes are null()ified instead
of being free()d (i.e. their type is changed to OP_NULL).
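
A minimal sketch of top-down propagation with null-ification follows.
It is a toy model under simplifying assumptions: only three contexts
are modelled, and the node types (C<OPT_SEQ>, C<OPT_PRINT> and so on)
and the C<toy_propagate> function are invented for illustration rather
than taken from the perl source.

```c
#include <stddef.h>

typedef enum { CTX_UNKNOWN, CTX_VOID, CTX_SCALAR, CTX_LIST } toy_ctx;
typedef enum { OPT_NULL, OPT_CONST, OPT_SEQ, OPT_PRINT } toy_optype;

typedef struct ctx_node {
    toy_optype       type;
    toy_ctx          ctx;
    struct ctx_node *kids[2];
} ctx_node;

/* Push a context down the tree: the parent decides for its children. */
static void toy_propagate(ctx_node *n, toy_ctx ctx)
{
    if (!n)
        return;
    n->ctx = ctx;

    if (n->type == OPT_CONST && ctx == CTX_VOID) {
        /* A constant in void context does nothing.  Other nodes may
         * still point at it, so it is retyped rather than freed. */
        n->type = OPT_NULL;
        return;
    }
    if (n->type == OPT_SEQ) {
        /* First statement is void; the last one inherits our context. */
        toy_propagate(n->kids[0], CTX_VOID);
        toy_propagate(n->kids[1], ctx);
    }
    else if (n->type == OPT_PRINT) {
        /* Arguments of the toy "print" are in list context. */
        toy_propagate(n->kids[0], CTX_LIST);
        toy_propagate(n->kids[1], CTX_LIST);
    }
}
```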

=head2 Compile pass 3: peephole optimization

After the compile tree for a subroutine (or for an C<eval> or a file)
is created, an additional pass over the code is performed.  This pass
is neither top-down nor bottom-up, but in the execution order (with
additional complications for conditionals).  Optimizations performed
at this stage are subject to the same restrictions as in pass 2.

Peephole optimizations are done by calling the function pointed to
by the global variable C<PL_peepp>.  By default, C<PL_peepp> just
calls the function pointed to by the global variable C<PL_rpeepp>.
By default, that performs some basic op fixups and optimisations along
the execution-order op chain, and recursively calls C<PL_rpeepp> for
each side chain of ops (resulting from conditionals).  Extensions may
provide additional optimisations or fixups, hooking into either the
per-subroutine or recursive stage, like this:

    static peep_t prev_peepp;
    static void my_peep(pTHX_ OP *o)
    {
        /* custom per-subroutine optimisation goes here */
        prev_peepp(aTHX_ o);
        /* custom per-subroutine optimisation may also go here */
    }
    BOOT:
        prev_peepp = PL_peepp;
        PL_peepp = my_peep;

    static peep_t prev_rpeepp;
    static void my_rpeep(pTHX_ OP *o)
    {
        OP *orig_o = o;
        for(; o; o = o->op_next) {
            /* custom per-op optimisation goes here */
        }
        prev_rpeepp(aTHX_ orig_o);
    }
    BOOT:
        prev_rpeepp = PL_rpeepp;
        PL_rpeepp = my_rpeep;

=head2 Pluggable runops

The compile tree is executed in a runops function.  There are two runops
functions, in F<run.c> and in F<dump.c>.  C<Perl_runops_debug> is used
with DEBUGGING and C<Perl_runops_standard> is used otherwise.  For fine
control over the execution of the compile tree it is possible to provide
your own runops function.

It's probably best to copy one of the existing runops functions and
change it to suit your needs.  Then, in the BOOT section of your XS
file, add the line:

  PL_runops = my_runops;

This function should be as efficient as possible to keep your programs
running as fast as possible.
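
To show the shape of such a function without depending on the perl
headers, here is a toy run loop.  Everything in it (C<run_op>,
C<toy_runops>, the C<pp_*> functions) is a hypothetical stand-in: each
op does its work and returns the next op to run, and the loop pointer
can be swapped for an instrumented variant in the way C<PL_runops> can.

```c
#include <stddef.h>

typedef struct run_op run_op;
struct run_op {
    run_op *(*ppaddr)(run_op *self, long *acc); /* do the work, return next op */
    run_op *next;
    long    arg;
};

static run_op *pp_add(run_op *o, long *acc) { *acc += o->arg; return o->next; }
static run_op *pp_mul(run_op *o, long *acc) { *acc *= o->arg; return o->next; }

/* Plain loop: keep calling ops until one returns NULL. */
static long toy_runops_standard(run_op *start)
{
    long acc = 0;
    run_op *o = start;
    while (o)
        o = o->ppaddr(o, &acc);
    return acc;
}

/* Replacement loop: same behaviour, but also counts the ops executed. */
static long toy_ops_seen;
static long toy_runops_counting(run_op *start)
{
    long acc = 0;
    run_op *o = start;
    while (o) {
        toy_ops_seen++;
        o = o->ppaddr(o, &acc);
    }
    return acc;
}

/* The pluggable pointer; assign a different loop to change behaviour. */
static long (*toy_runops)(run_op *) = toy_runops_standard;
```

The counting variant adds one increment per op, which hints at why the
loop body should be kept as small as possible.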

=head2 Compile-time scope hooks

As of perl 5.14 it is possible to hook into the compile-time lexical
scope mechanism using C<Perl_blockhook_register>.  This is used like
this:

    STATIC void my_start_hook(pTHX_ int full);
    STATIC BHK my_hooks;

    BOOT:
        BhkENTRY_set(&my_hooks, bhk_start, my_start_hook);
        Perl_blockhook_register(aTHX_ &my_hooks);

This will arrange to have C<my_start_hook> called at the start of
compiling every lexical scope.  The available hooks are:

=over 4

=item C<void bhk_start(pTHX_ int full)>

This is called just after starting a new lexical scope.  Note that Perl
code like

    if ($x) { ... }

creates two scopes: the first starts at the C<(> and has C<full == 1>,
the second starts at the C<{> and has C<full == 0>.  Both end at the
C<}>, so calls to C<start> and C<pre>/C<post_end> will match.  Anything
pushed onto the save stack by this hook will be popped just before the
scope ends (between the C<pre_> and C<post_end> hooks, in fact).

=item C<void bhk_pre_end(pTHX_ OP **o)>

This is called at the end of a lexical scope, just before unwinding the
stack.  I<o> is the root of the optree representing the scope; it is a
double pointer so you can replace the OP if you need to.

=item C<void bhk_post_end(pTHX_ OP **o)>

This is called at the end of a lexical scope, just after unwinding the
stack.  I<o> is as above.  Note that it is possible for calls to C<pre_>
and C<post_end> to nest, if there is something on the save stack that
calls string eval.

=item C<void bhk_eval(pTHX_ OP *const o)>

This is called just before starting to compile an C<eval STRING>, C<do
FILE>, C<require> or C<use>, after the eval has been set up.  I<o> is the
OP that requested the eval, and will normally be an C<OP_ENTEREVAL>,
C<OP_DOFILE> or C<OP_REQUIRE>.

=back

Once you have your hook functions, you need a C<BHK> structure to put
them in.  It's best to allocate it statically, since there is no way to
free it once it's registered.  The function pointers should be inserted
into this structure using the C<BhkENTRY_set> macro, which will also set
flags indicating which entries are valid.  If you do need to allocate
your C<BHK> dynamically for some reason, be sure to zero it before you
start.

Once registered, there is no mechanism to switch these hooks off, so if
that is necessary you will need to do this yourself.  An entry in C<%^H>
is probably the best way, so the effect is lexically scoped; however it
is also possible to use the C<BhkDISABLE> and C<BhkENABLE> macros to
temporarily switch entries on and off.  You should also be aware that
generally speaking at least one scope will have opened before your
extension is loaded, so you will see some C<pre>/C<post_end> pairs that
didn't have a matching C<start>.
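
The idea of a statically allocated hook table whose flags record which
entries are valid can be sketched in plain C.  This is only a model:
C<toy_hooks>, C<HOOK_START> and the C<toy_*> helpers are invented names
that loosely mirror C<BHK>, C<BhkENTRY_set>, C<BhkDISABLE> and
C<BhkENABLE>, and the real macros may differ in detail.

```c
#include <stddef.h>

#define HOOK_START 0x01u

typedef struct toy_hooks {
    unsigned flags;             /* which entries are currently valid */
    void   (*start)(int full);
} toy_hooks;

/* Store the pointer and mark the entry valid in one step. */
static void toy_set_start(toy_hooks *h, void (*fn)(int))
{
    h->start  = fn;
    h->flags |= HOOK_START;
}

/* Temporarily switch an entry off without forgetting the pointer. */
static void toy_disable(toy_hooks *h, unsigned which)
{
    h->flags &= ~which;
}

/* Switch it back on, but only if a pointer was ever stored. */
static void toy_enable(toy_hooks *h, unsigned which)
{
    if (which == HOOK_START && h->start)
        h->flags |= which;
}

/* What the "compiler" does on scope entry: call only valid entries. */
static void toy_call_start(const toy_hooks *h, int full)
{
    if (h->flags & HOOK_START)
        h->start(full);
}

static int  toy_start_calls;
static void toy_count_start(int full) { (void)full; toy_start_calls++; }
```

A table with static storage duration starts out zeroed, so no entry is
valid until it is set; a dynamically allocated table would have to be
cleared by hand to get the same guarantee.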

=head1 Examining internal data structures with the C<dump> functions

To aid debugging, the source file F<dump.c> contains a number of
functions which produce formatted output of internal data structures.

The most commonly used of these functions is C<Perl_sv_dump>; it's used
for dumping SVs, AVs, HVs, and CVs.  The C<Devel::Peek> module calls
C<sv_dump> to produce debugging output from Perl-space, so users of that
module should already be familiar with its format.

C<Perl_op_dump> can be used to dump an C<OP> structure or any of its
derivatives, and produces output similar to C<perl -Dx>; in fact,
C<Perl_dump_eval> will dump the main root of the code being evaluated,
exactly like C<-Dx>.

Other useful functions are C<Perl_dump_sub>, which turns a C<GV> into an
op tree; C<Perl_dump_packsubs>, which calls C<Perl_dump_sub> on all the
subroutines in a package, like so (thankfully, these are all xsubs, so
there is no op tree):

    (gdb) print Perl_dump_packsubs(PL_defstash)

    SUB attributes::bootstrap = (xsub 0x811fedc 0)

    SUB UNIVERSAL::can = (xsub 0x811f50c 0)

    SUB UNIVERSAL::isa = (xsub 0x811f304 0)

    SUB UNIVERSAL::VERSION = (xsub 0x811f7ac 0)

    SUB DynaLoader::boot_DynaLoader = (xsub 0x805b188 0)

and C<Perl_dump_all>, which dumps all the subroutines in the stash and
the op tree of the main root.

=head1 How multiple interpreters and concurrency are supported

=head2 Background and PERL_IMPLICIT_CONTEXT

The Perl interpreter can be regarded as a closed box: it has an API
for feeding it code or otherwise making it do things, but it also has
functions for its own use.  This smells a lot like an object, and
there are ways for you to build Perl so that you can have multiple
interpreters, with one interpreter represented either as a C structure,
or inside a thread-specific structure.  These structures contain all
the context, the state of that interpreter.

One macro controls the major Perl build flavor: MULTIPLICITY.  The
MULTIPLICITY build has a C structure that packages all the interpreter
state.  With multiplicity-enabled perls, PERL_IMPLICIT_CONTEXT is also
normally defined, and enables the support for passing in a "hidden" first
argument that represents the interpreter's context.  MULTIPLICITY makes
multi-threaded perls possible (with the ithreads threading model, related
to the macro USE_ITHREADS.)

Two other "encapsulation" macros are the PERL_GLOBAL_STRUCT and
PERL_GLOBAL_STRUCT_PRIVATE (the latter turns on the former, and the
former turns on MULTIPLICITY.)  The PERL_GLOBAL_STRUCT causes all the
internal variables of Perl to be wrapped inside a single global struct,
struct perl_vars, accessible as (globals) &PL_Vars or PL_VarsPtr or
the function Perl_GetVars().  The PERL_GLOBAL_STRUCT_PRIVATE goes
one step further: there is still a single struct (allocated in main()
either from heap or from stack) but there are no global data symbols
pointing to it.  In either case the global struct should be initialized
as the very first thing in main() using Perl_init_global_struct() and
correspondingly torn down after perl_free() using
Perl_free_global_struct(); please see F<miniperlmain.c> for usage
details.  You may also need to use C<dVAR> in your coding to "declare
the global variables" when you are using them.  dTHX does this for you
automatically.

To see whether you have non-const data you can use a BSD (or GNU)
compatible C<nm>:

  nm libperl.a | grep -v ' [TURtr] '

If this displays any C<D> or C<d> symbols (or possibly C<C> or C<c>),
you have non-const data.  The symbols the C<grep> removed are as follows:
C<Tt> are I<text>, or code; C<Rr> are I<read-only> (const) data;
and C<U> is I<undefined>, external symbols referred to.

The test F<t/porting/libperl.t> does this kind of symbol sanity
checking on C<libperl.a>.

For backward compatibility reasons defining just PERL_GLOBAL_STRUCT
doesn't actually hide all symbols inside a big global struct: some
PerlIO_xxx vtables are left visible.  The PERL_GLOBAL_STRUCT_PRIVATE
then hides everything (see how the PERLIO_FUNCS_DECL is used).

All this obviously requires a way for the Perl internal functions to be
either subroutines taking some kind of structure as the first
argument, or subroutines taking nothing as the first argument.  To
enable these two very different ways of building the interpreter,
the Perl source (as it does in so many other situations) makes heavy
use of macros and subroutine naming conventions.

First problem: deciding which functions will be public API functions and
which will be private.  All functions whose names begin C<S_> are private
(think "S" for "secret" or "static").  All other functions begin with
"Perl_", but just because a function begins with "Perl_" does not mean it is
part of the API.  (See L</Internal Functions>.)  The easiest way to be
B<sure> a function is part of the API is to find its entry in
L<perlapi>.
If it exists in L<perlapi>, it's part of the API.  If it doesn't, and you
think it should be (i.e., you need it for your extension), send mail via
L<perlbug> explaining why you think it should be.

Second problem: there must be a syntax so that the same subroutine
declarations and calls can pass a structure as their first argument,
or pass nothing.  To solve this, the subroutines are named and
declared in a particular way.  Here's a typical start of a static
function used within the Perl guts:

  STATIC void
  S_incline(pTHX_ char *s)

STATIC becomes "static" in C, and may be #define'd to nothing in some
configurations in the future.

A public function (i.e. part of the internal API, but not necessarily
sanctioned for use in extensions) begins like this:

  void
  Perl_sv_setiv(pTHX_ SV* dsv, IV num)

C<pTHX_> is one of a number of macros (in F<perl.h>) that hide the
details of the interpreter's context.  THX stands for "thread", "this",
or "thingy", as the case may be.  (And no, George Lucas is not involved. :-)
The first character could be 'p' for a B<p>rototype, 'a' for B<a>rgument,
or 'd' for B<d>eclaration, so we have C<pTHX>, C<aTHX> and C<dTHX>, and
their variants.

When Perl is built without options that set PERL_IMPLICIT_CONTEXT, there is no
first argument containing the interpreter's context.  The trailing underscore
in the pTHX_ macro indicates that the macro expansion needs a comma
after the context argument because other arguments follow it.  If
PERL_IMPLICIT_CONTEXT is not defined, pTHX_ will be ignored, and the
subroutine is not prototyped to take the extra argument.  The form of the
macro without the trailing underscore is used when there are no additional
explicit arguments.
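
The effect of the two macro forms can be reproduced in a few lines of
stand-alone C.  The names below (C<toy_pTHX_>, C<toy_aTHX_>,
C<TOY_IMPLICIT_CONTEXT>, C<toy_interp>) are made up for this sketch and
are not the definitions in F<perl.h>; they only imitate the pattern so
that the same source compiles with or without a context argument.

```c
#include <stddef.h>

typedef struct toy_interp { long last_iv; } toy_interp;

#ifdef TOY_IMPLICIT_CONTEXT
   /* Context is an explicit first parameter called my_perl. */
#  define toy_pTHX   toy_interp *my_perl
#  define toy_pTHX_  toy_pTHX,
#  define toy_aTHX   my_perl
#  define toy_aTHX_  toy_aTHX,
#  define TOY_STATE  (my_perl)
#else
   /* Single interpreter: state is global, no extra parameter. */
static toy_interp toy_global_interp;
#  define toy_pTHX   void
#  define toy_pTHX_
#  define toy_aTHX
#  define toy_aTHX_
#  define TOY_STATE  (&toy_global_interp)
#endif

/* Underscore form: other parameters follow the context. */
static void Toy_set_iv(toy_pTHX_ long num) { TOY_STATE->last_iv = num; }

/* Plain form: the context is the only parameter. */
static long Toy_get_iv(toy_pTHX) { return TOY_STATE->last_iv; }

/* Short names hide the context argument from callers. */
#define toy_set_iv(n) Toy_set_iv(toy_aTHX_ n)
#define toy_get_iv()  Toy_get_iv(toy_aTHX)

static long toy_demo(toy_pTHX)
{
    toy_set_iv(42);
    return toy_get_iv();
}

/* Entry point that works in both builds. */
static long toy_run(void)
{
#ifdef TOY_IMPLICIT_CONTEXT
    toy_interp interp = { 0 };
    toy_interp *my_perl = &interp;
#endif
    return toy_demo(toy_aTHX);
}
```

Compiling the sketch once as is and once with C<-DTOY_IMPLICIT_CONTEXT>
exercises both expansions of the same function bodies.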

When a core function calls another, it must pass the context.  This
is normally hidden via macros.  Consider C<sv_setiv>.  It expands into
something like this:

    #ifdef PERL_IMPLICIT_CONTEXT
      #define sv_setiv(a,b)      Perl_sv_setiv(aTHX_ a, b)
      /* can't do this for vararg functions, see below */
    #else
      #define sv_setiv           Perl_sv_setiv
    #endif

This works well, and means that XS authors can gleefully write:

    sv_setiv(foo, bar);

and still have it work under all the modes Perl could have been
compiled with.

This doesn't work so cleanly for varargs functions, though, as macros
imply that the number of arguments is known in advance.  Instead we
either need to spell them out fully, passing C<aTHX_> as the first
argument (the Perl core tends to do this with functions like
Perl_warner), or use a context-free version.

The context-free version of Perl_warner is called
Perl_warner_nocontext, and does not take the extra argument.  Instead
it does dTHX; to get the context from thread-local storage.  We
C<#define warner Perl_warner_nocontext> so that extensions get source
compatibility at the expense of performance.  (Passing an arg is
cheaper than grabbing it from thread-local storage.)
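
The following stand-alone sketch imitates that pairing of an
explicit-context varargs function with a context-free twin.  It is an
analogy only: C<toy_interp>, C<toy_get_context> and the C<Toy_warn*>
functions are hypothetical, and a plain static pointer stands in for
the thread-local slot.

```c
#include <stdarg.h>
#include <stdio.h>

typedef struct toy_interp { char msg[64]; } toy_interp;

/* Stand-in for the thread-local slot that holds the current context. */
static toy_interp *toy_current;

static toy_interp *toy_get_context(void) { return toy_current; }

/* Shared worker: formats the message into the given interpreter. */
static void Toy_vwarn(toy_interp *my_perl, const char *pat, va_list args)
{
    vsnprintf(my_perl->msg, sizeof my_perl->msg, pat, args);
}

/* Explicit-context version: the caller spells out the context. */
static void Toy_warn(toy_interp *my_perl, const char *pat, ...)
{
    va_list args;
    va_start(args, pat);
    Toy_vwarn(my_perl, pat, args);
    va_end(args);
}

/* Context-free version: looks the context up itself on every call. */
static void Toy_warn_nocontext(const char *pat, ...)
{
    toy_interp *my_perl = toy_get_context();
    va_list args;
    va_start(args, pat);
    Toy_vwarn(my_perl, pat, args);
    va_end(args);
}

/* Extensions see the short name and need no context argument. */
#define toy_warn Toy_warn_nocontext
```

The short name can be a plain object-like macro precisely because the
context-free function has the argument list the caller writes.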

You can ignore [pad]THXx when browsing the Perl headers/sources.
Those are strictly for use within the core.  Extensions and embedders
need only be aware of [pad]THX.

=head2 So what happened to dTHR?

C<dTHR> was introduced in perl 5.005 to support the older thread model.
The older thread model now uses the C<THX> mechanism to pass context
pointers around, so C<dTHR> is not useful any more.  Perl 5.6.0 and
later still have it for backward source compatibility, but it is defined
to be a no-op.

=head2 How do I use all this in extensions?

When Perl is built with PERL_IMPLICIT_CONTEXT, extensions that call
any functions in the Perl API will need to pass the initial context
argument somehow.  The kicker is that you will need to write it in
such a way that the extension still compiles when Perl hasn't been
built with PERL_IMPLICIT_CONTEXT enabled.

There are three ways to do this.  First, the easy but inefficient way,
which is also the default, in order to maintain source compatibility
with extensions: whenever F<XSUB.h> is #included, it redefines the aTHX
and aTHX_ macros to call a function that will return the context.
Thus, something like:

        sv_setiv(sv, num);

in your extension will translate to this when PERL_IMPLICIT_CONTEXT is
in effect:

        Perl_sv_setiv(Perl_get_context(), sv, num);

or to this otherwise:

        Perl_sv_setiv(sv, num);

You don't have to do anything new in your extension to get this; since
the Perl library provides Perl_get_context(), it will all just
work.

The second, more efficient way is to use the following template for
your Foo.xs:

        #define PERL_NO_GET_CONTEXT     /* we want efficiency */
        #include "EXTERN.h"
        #include "perl.h"
        #include "XSUB.h"

        STATIC void my_private_function(int arg1, int arg2);

        STATIC void
        my_private_function(int arg1, int arg2)
        {
            dTHX;       /* fetch context */
            ... call many Perl API functions ...
        }

        [... etc ...]

        MODULE = Foo            PACKAGE = Foo

        /* typical XSUB */

        void
        my_xsub(arg)
                int arg
            CODE:
                my_private_function(arg, 10);

Note that the only two changes from the normal way of writing an
extension are the addition of a C<#define PERL_NO_GET_CONTEXT> before
including the Perl headers, followed by a C<dTHX;> declaration at
the start of every function that will call the Perl API.  (You'll
know which functions need this, because the C compiler will complain
that there's an undeclared identifier in those functions.)  No changes
are needed for the XSUBs themselves, because the XS() macro is
correctly defined to pass in the implicit context if needed.

The third, even more efficient way is to ape how it is done within
the Perl guts:


        #define PERL_NO_GET_CONTEXT     /* we want efficiency */
        #include "EXTERN.h"
        #include "perl.h"
        #include "XSUB.h"

        /* pTHX_ only needed for functions that call Perl API */
        STATIC void my_private_function(pTHX_ int arg1, int arg2);

        STATIC void
        my_private_function(pTHX_ int arg1, int arg2)
        {
            /* dTHX; not needed here, because THX is an argument */
            ... call Perl API functions ...
        }

        [... etc ...]

        MODULE = Foo            PACKAGE = Foo

        /* typical XSUB */

        void
        my_xsub(arg)
                int arg
            CODE:
                my_private_function(aTHX_ arg, 10);

This implementation never has to fetch the context using a function
call, since it is always passed as an extra argument.  Depending on
your needs for simplicity or efficiency, you may mix the previous
two approaches freely.

Never add a comma after C<pTHX> yourself--always use the form of the
macro with the underscore for functions that take explicit arguments,
or the form without the underscore for functions with no explicit
arguments.

If one is compiling Perl with C<-DPERL_GLOBAL_STRUCT>, the C<dVAR>
definition is needed if the Perl global variables (see F<perlvars.h>
or F<globvar.sym>) are accessed in the function and C<dTHX> is not
used (C<dTHX> includes C<dVAR> if necessary).  One notices
the need for C<dVAR> only with that compile-time define, because
otherwise the Perl global variables are visible as-is.

=head2 Should I do anything special if I call perl from multiple threads?

If you create interpreters in one thread and then proceed to call them in
another, you need to make sure perl's own Thread Local Storage (TLS) slot is
initialized correctly in each of those threads.

The C<perl_alloc> and C<perl_clone> API functions will automatically set
the TLS slot to the interpreter they created, so that there is no need to do
anything special if the interpreter is always accessed in the same thread that
created it, and that thread did not create or call any other interpreters
afterwards.  If that is not the case, you have to set the TLS slot of the
thread before calling any functions in the Perl API on that particular
interpreter.  This is done by calling the C<PERL_SET_CONTEXT> macro in that
thread as the first thing you do:

	/* do this before doing anything else with some_perl */
	PERL_SET_CONTEXT(some_perl);

	... other Perl API calls on some_perl go here ...
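
The behaviour of the TLS slot can be modelled in stand-alone C11.  The
sketch below is an assumption-laden toy: C<toy_interp>, C<toy_alloc>
and C<TOY_SET_CONTEXT> are invented, and a C<_Thread_local> variable
plays the part of perl's slot.  It shows why creating a second
interpreter changes which one the thread considers current.

```c
#include <stddef.h>

typedef struct toy_interp { int id; } toy_interp;

/* One slot per thread, like perl's TLS context slot. */
static _Thread_local toy_interp *toy_tls_context;

#define TOY_SET_CONTEXT(p) (toy_tls_context = (p))
#define TOY_GET_CONTEXT    (toy_tls_context)

static toy_interp toy_pool[4];
static int        toy_pool_used;

/* Creating an interpreter also makes it current for this thread. */
static toy_interp *toy_alloc(int id)
{
    toy_interp *p;

    if (toy_pool_used >= 4)
        return NULL;
    p = &toy_pool[toy_pool_used++];
    p->id = id;
    TOY_SET_CONTEXT(p);
    return p;
}

/* Anything that relies on the implicit context sees the slot's value. */
static int toy_current_id(void)
{
    return TOY_GET_CONTEXT ? TOY_GET_CONTEXT->id : -1;
}
```

After two calls to C<toy_alloc> the slot names the second interpreter,
so work on the first one has to begin with C<TOY_SET_CONTEXT>.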

=head2 Future Plans and PERL_IMPLICIT_SYS

Just as PERL_IMPLICIT_CONTEXT provides a way to bundle up everything
that the interpreter knows about itself and pass it around, so too are
there plans to allow the interpreter to bundle up everything it knows
about the environment it's running on.  This is enabled with the
PERL_IMPLICIT_SYS macro.  Currently it only works with USE_ITHREADS on
Windows.

This makes it possible to provide an extra pointer (called the "host"
environment) for all the system calls, so that all the system stuff
can maintain its own state, broken down into
seven C structures.  These are thin wrappers around the usual system
calls (see F<win32/perllib.c>) for the default perl executable, but for a
more ambitious host (like the one that would do fork() emulation) all
the extra work needed to pretend that different interpreters are
actually different "processes", would be done here.

The Perl engine/interpreter and the host are orthogonal entities.
There could be one or more interpreters in a process, and one or
more "hosts", with free association between them.

=head1 Internal Functions

All of Perl's internal functions which will be exposed to the outside
world are prefixed by C<Perl_> so that they will not conflict with XS
functions or functions used in a program in which Perl is embedded.
Similarly, all global variables begin with C<PL_>.  (By convention,
static functions start with C<S_>.)

Inside the Perl core (C<PERL_CORE> defined), you can get at the functions
either with or without the C<Perl_> prefix, thanks to a bunch of defines
that live in F<embed.h>.  Note that extension code should I<not> set
C<PERL_CORE>; this exposes the full perl internals, and is likely to cause
breakage of the XS in each new perl release.

The file F<embed.h> is generated automatically from
F<embed.pl> and F<embed.fnc>.  F<embed.pl> also creates the prototyping
header files for the internal functions, generates the documentation
and a lot of other bits and pieces.  It's important that when you add
a new function to the core or change an existing one, you change the
data in the table in F<embed.fnc> as well.  Here's a sample entry from
that table:

    Apd |SV**   |av_fetch   |AV* ar|I32 key|I32 lval

The second column is the return type, the third column the name.  Columns
after that are the arguments.  The first column is a set of flags:

=over 3

=item A

This function is a part of the public
API.  All such functions should also
have 'd'; very few do not.

=item p

This function has a C<Perl_> prefix; i.e. it is defined as
C<Perl_av_fetch>.

=item d

This function has documentation using the C<apidoc> feature which we'll
look at in a second.  Some functions have 'd' but not 'A'; docs are good.

=back

Other available flags are:

=over 3

=item s

This is a static function and is defined as C<STATIC S_whatever>, and
usually called within the sources as C<whatever(...)>.

=item n

This does not need an interpreter context, so the definition has no
C<pTHX>, and it follows that callers don't use C<aTHX>.  (See
L</Background and PERL_IMPLICIT_CONTEXT>.)

=item r

This function never returns; C<croak>, C<exit> and friends.

=item f

This function takes a variable number of arguments, C<printf> style.
The argument list should end with C<...>, like this:

    Afprd   |void   |croak          |const char* pat|...

=item M

This function is part of the experimental development API, and may change
or disappear without notice.

=item o

This function should not have a compatibility macro to define, say,
C<Perl_parse> to C<parse>.  It must be called as C<Perl_parse>.

=item x

This function isn't exported out of the Perl core.

=item m

This is implemented as a macro.

=item X

This function is explicitly exported.

=item E

This function is visible to extensions included in the Perl core.

=item b

Binary backward compatibility; this function is a macro but also has
a C<Perl_> implementation (which is exported).

=item others

See the comments at the top of C<embed.fnc> for others.

=back

If you edit F<embed.pl> or F<embed.fnc>, you will need to run
C<make regen_headers> to force a rebuild of F<embed.h> and other
auto-generated files.

=head2 Formatted Printing of IVs, UVs, and NVs

If you are printing IVs, UVs, or NVs, then instead of stdio(3)-style
formatting codes like C<%d>, C<%ld>, C<%f>, you should use the
following macros for portability:

        IVdf            IV in decimal
        UVuf            UV in decimal
        UVof            UV in octal
        UVxf            UV in hexadecimal
        NVef            NV %e-like
        NVff            NV %f-like
        NVgf            NV %g-like

These will take care of 64-bit integers and long doubles.
For example:

        printf("IV is %" IVdf "\n", iv);

The IVdf will expand to whatever is the correct format for the IVs.

Note that there are different "long doubles": Perl will use
whatever the compiler has.

If you are printing addresses of pointers, use UVxf combined
with PTR2UV(); do not use %lx or %p.
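
Standard C uses the same trick in F<inttypes.h>, which makes for a
self-contained analogy.  In the sketch below the C<TOY_*> macros and
C<toy_*> types are stand-ins defined on top of C<PRId64> and C<PRIx64>;
they are not perl's definitions, which are chosen by Configure to match
however IV and UV were built.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for perl's IV/UV and their format macros. */
typedef int64_t  toy_IV;
typedef uint64_t toy_UV;
#define TOY_IVdf      PRId64
#define TOY_UVxf      PRIx64
#define TOY_PTR2UV(p) ((toy_UV)(uintptr_t)(p))

/* The macro is a string literal, so it is pasted between the pieces
 * of the format; note the spaces around it. */
static int toy_format_iv(char *buf, size_t len, toy_IV iv)
{
    return snprintf(buf, len, "IV is %" TOY_IVdf, iv);
}

/* Pointers are converted to an integer first, then printed in hex. */
static int toy_format_ptr(char *buf, size_t len, const void *p)
{
    return snprintf(buf, len, "0x%" TOY_UVxf, TOY_PTR2UV(p));
}
```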

=head2 Formatted Printing of Size_t and SSize_t

The most general way to do this is to cast them to a UV or IV, and
print as in the
L<previous section|/Formatted Printing of IVs, UVs, and NVs>.

But if you're using C<PerlIO_printf()>, it's less typing and visual
clutter to use the C<"%z"> length modifier (for I<siZe>):

        PerlIO_printf(PerlIO_stdout(), "STRLEN is %zu\n", len);

This modifier is not portable, so its use should be restricted to
C<PerlIO_printf()>.

=head2 Pointer-To-Integer and Integer-To-Pointer

Because pointer size does not necessarily equal integer size,
use the following macros to do it right.

        PTR2UV(pointer)
        PTR2IV(pointer)
        PTR2NV(pointer)
        INT2PTR(pointertotype, integer)

For example:

        IV  iv = ...;
        SV *sv = INT2PTR(SV*, iv);

and

        AV *av = ...;
        UV  uv = PTR2UV(av);
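
For readers without the perl headers at hand, the round trip can be
tried in plain C using C<uintptr_t>, which is guaranteed to be wide
enough for a data pointer.  The C<TOY_PTR2UV> and C<TOY_INT2PTR> macros
and the C<toy_av> type are illustrative assumptions, not the real
definitions.

```c
#include <stdint.h>

/* An integer type wide enough to hold a data pointer. */
typedef uintptr_t toy_UV;

#define TOY_PTR2UV(p)        ((toy_UV)(p))
#define TOY_INT2PTR(type, i) ((type)(toy_UV)(i))

typedef struct toy_av { int fill; } toy_av;

/* Store a pointer in an integer slot... */
static toy_UV toy_stash(toy_av *av)
{
    return TOY_PTR2UV(av);
}

/* ...and recover the same pointer later. */
static toy_av *toy_unstash(toy_UV uv)
{
    return TOY_INT2PTR(toy_av *, uv);
}
```

Casting through C<int> or C<long> instead would silently truncate on
platforms where pointers are wider than those types.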

=head2 Exception Handling

There are a couple of macros to do very basic exception handling in XS
modules.  You have to define C<NO_XSLOCKS> before including F<XSUB.h> to
be able to use these macros:

        #define NO_XSLOCKS
        #include "XSUB.h"

You can use these macros if you call code that may croak, but you need
to do some cleanup before giving control back to Perl.  For example:

        dXCPT;    /* set up necessary variables */

        XCPT_TRY_START {
          code_that_may_croak();
        } XCPT_TRY_END

        XCPT_CATCH
        {
          /* do cleanup here */
          XCPT_RETHROW;
        }

Note that you always have to rethrow an exception that has been
caught.  Using these macros, it is not possible to just catch the
exception and ignore it.  If you have to ignore the exception, you
have to use one of the C<call_*> functions.

The advantage of using the above macros is that you don't have
to set up an extra function for C<call_*>, and that using these
macros is faster than using C<call_*>.
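
The catch, clean up, rethrow pattern can be imitated with
C<setjmp>/C<longjmp> in stand-alone C.  This is a sketch of the control
flow only; C<toy_croak>, C<toy_guarded> and the handler chain are
hypothetical and do not show how the C<XCPT_*> macros are actually
implemented.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

static jmp_buf *toy_top_env;   /* innermost active handler, if any */
static int      toy_cleanups;  /* counts how often cleanup ran */

/* "Throw": jump to the innermost handler, or give up if none. */
static void toy_croak(void)
{
    if (toy_top_env)
        longjmp(*toy_top_env, 1);
    abort();
}

static void toy_may_croak(int fail)
{
    if (fail)
        toy_croak();
}

/* Try the call; on failure clean up, then pass the exception on. */
static void toy_guarded(int fail)
{
    jmp_buf  env;
    jmp_buf *outer = toy_top_env;   /* remember the enclosing handler */

    if (setjmp(env) == 0) {         /* try */
        toy_top_env = &env;
        toy_may_croak(fail);
        toy_top_env = outer;
    }
    else {                          /* catch */
        toy_top_env = outer;
        toy_cleanups++;             /* do cleanup here */
        toy_croak();                /* rethrow to the outer handler */
    }
}

/* Outermost caller: returns 1 if an exception reached it, else 0. */
static int toy_call(int fail)
{
    jmp_buf env;

    if (setjmp(env) == 0) {
        toy_top_env = &env;
        toy_guarded(fail);
        toy_top_env = NULL;
        return 0;
    }
    toy_top_env = NULL;
    return 1;
}
```

The cleanup runs exactly once per failure, and the outer caller still
sees the failure because the inner handler throws again.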

=head2 Source Documentation

There's an effort going on to document the internal functions and
automatically produce reference manuals from them -- L<perlapi> is one
such manual which details all the functions which are available to XS
writers.  L<perlintern> is the autogenerated manual for the functions
which are not part of the API and are supposedly for internal use only.

Source documentation is created by putting POD comments into the C
source, like this:

 /*
 =for apidoc sv_setiv

 Copies an integer into the given SV.  Does not handle 'set' magic.  See
 L<perlapi/sv_setiv_mg>.

 =cut
 */

Please try to supply some documentation if you add functions to the
Perl core.

=head2 Backwards compatibility

The Perl API changes over time.  New functions are
added or the interfaces of existing functions are
changed.  The C<Devel::PPPort> module tries to
provide compatibility code for some of these changes, so XS writers don't
have to code it themselves when supporting multiple versions of Perl.

C<Devel::PPPort> generates a C header file F<ppport.h> that can also
be run as a Perl script.  To generate F<ppport.h>, run:

    perl -MDevel::PPPort -eDevel::PPPort::WriteFile

Besides checking existing XS code, the script can also be used to retrieve
compatibility information for various API calls using the C<--api-info>
command line switch.  For example:

  % perl ppport.h --api-info=sv_magicext

For details, see C<perldoc ppport.h>.

=head1 Unicode Support

Perl 5.6.0 introduced Unicode support.  It's important for porters and XS
writers to understand this support and make sure that the code they
write does not corrupt Unicode data.

=head2 What B<is> Unicode, anyway?

In the olden, less enlightened times, we all used to use ASCII.  Most of
us did, anyway.  The big problem with ASCII is that it's American.  Well,
no, that's not actually the problem; the problem is that it's not
particularly useful for people who don't use the Roman alphabet.  What
used to happen was that particular languages would stick their own
alphabet in the upper range of the sequence, between 128 and 255.  Of
course, we then ended up with plenty of variants that weren't quite
ASCII, and the whole point of it being a standard was lost.

Worse still, if you've got a language like Chinese or
Japanese that has hundreds or thousands of characters, then you really
can't fit them into a mere 256, so they had to forget about ASCII
altogether, and build their own systems using pairs of numbers to refer
to one character.

To fix this, some people formed Unicode, Inc. and
produced a new character set containing all the characters you can
possibly think of and more.  There are several ways of representing these
characters, and the one Perl uses is called UTF-8.  UTF-8 uses
a variable number of bytes to represent a character.  You can learn more
about Unicode and Perl's Unicode model in L<perlunicode>.

(On EBCDIC platforms, Perl uses instead UTF-EBCDIC, which is a form of
UTF-8 adapted for EBCDIC platforms.  Below, we just talk about UTF-8.
UTF-EBCDIC is like UTF-8, but the details are different.  The macros
hide the differences from you, just remember that the particular numbers
and bit patterns presented below will differ in UTF-EBCDIC.)

=head2 How can I recognise a UTF-8 string?

You can't.  This is because UTF-8 data is stored in bytes just like
non-UTF-8 data.  The Unicode character 200, (C<0xC8> for you hex types)
capital E with a grave accent, is represented by the two bytes
C<v195.136>.  Unfortunately, the non-Unicode string C<chr(195).chr(136)>
has that byte sequence as well.  So you can't tell just by looking -- this
is what makes Unicode input an interesting problem.

In general, you either have to know what you're dealing with, or you
have to guess.  The API function C<is_utf8_string> can help; it'll tell
you if a string contains only valid UTF-8 characters, and the chances
of a non-UTF-8 string looking like valid UTF-8 become very small very
quickly with increasing string length.  On a character-by-character
basis, C<isUTF8_CHAR> will tell you whether the current character in a
string is valid UTF-8.

=head2 How does UTF-8 represent Unicode characters?

As mentioned above, UTF-8 uses a variable number of bytes to store a
character.  Characters with values 0...127 are stored in one
byte, just like good ol' ASCII.  Character 128 is stored as
C<v194.128>; this continues up to character 191, which is
C<v194.191>.  Now we've run out of bits (191 is binary
C<10111111>) so we move on; character 192 is C<v195.128>.  And
so it goes on, moving to three bytes at character 2048.
L<perlunicode/Unicode Encodings> has pictures of how this works.
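
The byte values quoted above can be checked with a small encoder
written in plain C.  C<toy_utf8_encode> is a hypothetical helper for
this sketch, not a perl API function; it handles only code points below
0x10000, makes no attempt to reject surrogates, and does not apply to
UTF-EBCDIC.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode cp into out and return the number of bytes written,
 * or 0 if cp is outside the range this sketch handles.
 */
static size_t toy_utf8_encode(uint32_t cp, unsigned char out[3])
{
    if (cp < 0x80) {                 /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {              /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    return 0;
}
```

Each continuation byte carries six payload bits, which is why the
second byte wraps from 191 back to 128 between characters 191 and 192
while the first byte steps from 194 to 195.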

Assuming you know you're dealing with a UTF-8 string, you can find out
how long the first character in it is with the C<UTF8SKIP> macro:

    char *utf = "\305\233\340\240\201";
    I32 len;

    len = UTF8SKIP(utf); /* len is 2 here */
    utf += len;
    len = UTF8SKIP(utf); /* len is 3 here */

Another way to skip over characters in a UTF-8 string is to use
C<utf8_hop>, which takes a string and a number of characters to skip
over.  You're on your own about bounds checking, though, so don't use it
lightly.
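
What a skip lookup and a bounds-aware hop involve can be sketched
without the perl headers.  Both functions below are illustrative
stand-ins written for this example (C<toy_utf8_skip> and
C<toy_hop_bounded> are not perl API names), and the hop stops at the
end of the buffer instead of trusting the caller.

```c
#include <stddef.h>

/* Length of a character, judged from its first byte alone. */
static size_t toy_utf8_skip(unsigned char lead)
{
    if (lead < 0x80)           return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;  /* continuation or invalid byte: step over one byte */
}

/*
 * Advance up to n characters from s, but never past end.  A character
 * that would run off the end of the buffer stops the hop at end.
 */
static const unsigned char *
toy_hop_bounded(const unsigned char *s, const unsigned char *end, size_t n)
{
    while (n-- > 0 && s < end) {
        size_t len = toy_utf8_skip(*s);
        if (len > (size_t)(end - s))
            return end;
        s += len;
    }
    return s;
}
```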

All bytes in a multi-byte UTF-8 character will have the high bit set,
so you can test if you need to do something special with this
character like this (the C<UTF8_IS_INVARIANT()> is a macro that tests
whether the byte is encoded as a single byte even in UTF-8):

    U8 *utf;
    U8 *utf_end; /* 1 beyond buffer pointed to by utf */
    UV uv;	/* Note: a UV, not a U8, not a char */
    STRLEN len; /* length of character in bytes */

    if (!UTF8_IS_INVARIANT(*utf))
        /* Must treat this as UTF-8 */
        uv = utf8_to_uvchr_buf(utf, utf_end, &len);
    else
        /* OK to treat this character as a byte */
        uv = *utf;

You can also see in that example that we use C<utf8_to_uvchr_buf> to get the
value of the character; the inverse function C<uvchr_to_utf8> is available
for putting a UV into UTF-8:

    if (!UVCHR_IS_INVARIANT(uv))
        /* Must treat this as UTF8 */
        utf8 = uvchr_to_utf8(utf8, uv);
    else
        /* OK to treat this character as a byte */
        *utf8++ = uv;

You B<must> convert characters to UVs using the above functions if
you're ever in a situation where you have to match UTF-8 and non-UTF-8
characters.  You may not skip over UTF-8 characters in this case.  If you
do this, you'll lose the ability to match hi-bit non-UTF-8 characters;
for instance, if your UTF-8 string contains C<v196.172>, and you skip
that character, you can never match a C<chr(200)> in a non-UTF-8 string.
So don't do that!

(Note that we don't have to test for invariant characters in the
examples above.  The functions work on any well-formed UTF-8 input.
It's just that it's faster to avoid the function overhead when it's not
needed.)

=head2 How does Perl store UTF-8 strings?

Currently, Perl deals with UTF-8 strings and non-UTF-8 strings
slightly differently.  A flag in the SV, C<SVf_UTF8>, indicates that the
string is internally encoded as UTF-8.  Without it, the byte value is the
codepoint number and vice versa.  This flag is only meaningful if the SV
is C<SvPOK> or immediately after stringification via C<SvPV> or a
similar macro.  You can check and manipulate this flag with the
following macros:

    SvUTF8(sv)
    SvUTF8_on(sv)
    SvUTF8_off(sv)

This flag has an important effect on Perl's treatment of the string: if
UTF-8 data is not properly distinguished, regular expressions,
C<length>, C<substr> and other string handling operations will have
undesirable (wrong) results.

The problem comes when you have, for instance, a string that isn't
flagged as UTF-8, and contains a byte sequence that could be UTF-8 --
especially when combining non-UTF-8 and UTF-8 strings.

Never forget that the C<SVf_UTF8> flag is separate from the PV value; you
need to be sure you don't accidentally knock it off while you're
manipulating SVs.  More specifically, you cannot expect to do this:

    SV *sv;
    SV *nsv;
    STRLEN len;
    char *p;

    p = SvPV(sv, len);
    frobnicate(p);
    nsv = newSVpvn(p, len);

The C<char*> string does not tell you the whole story, and you can't
copy or reconstruct an SV just by copying the string value.  Check if the
old SV has the UTF8 flag set (I<after> the C<SvPV> call), and act
accordingly:

    p = SvPV(sv, len);
    is_utf8 = SvUTF8(sv);
    frobnicate(p, is_utf8);
    nsv = newSVpvn(p, len);
    if (is_utf8)
        SvUTF8_on(nsv);

In the above, your C<frobnicate> function has been changed to be made
aware of whether or not it's dealing with UTF-8 data, so that it can
handle the string appropriately.

Since just passing an SV to an XS function and copying the data of
the SV is not enough to copy the UTF8 flags, passing just a
S<C<char *>> to an XS function is even less adequate.

For full generality, use the L<C<DO_UTF8>|perlapi/DO_UTF8> macro to see if the
string in an SV is to be I<treated> as UTF-8.  This takes into account
if the call to the XS function is being made from within the scope of
L<S<C<use bytes>>|bytes>.  If so, the underlying bytes that comprise the
UTF-8 string are to be exposed, rather than the character they
represent.  But this pragma should only really be used for debugging and
perhaps low-level testing at the byte level.  Hence most XS code need
not concern itself with this, but various areas of the perl core do need
to support it.

And this isn't the whole story.  Starting in Perl v5.12, strings that
aren't encoded in UTF-8 may also be treated as Unicode under various
conditions (see L<perlunicode/ASCII Rules versus Unicode Rules>).
This is only really a problem for characters whose ordinals are between
128 and 255, and whose behavior varies under ASCII versus Unicode rules
in ways that your code cares about (see L<perlunicode/The "Unicode Bug">).
There is no published API for dealing with this, as it is subject to
change, but you can look at the code for C<pp_lc> in F<pp.c> for an
example as to how it's currently done.

=head2 How do I convert a string to UTF-8?

If you're mixing UTF-8 and non-UTF-8 strings, it is necessary to upgrade
the non-UTF-8 strings to UTF-8.  If you've got an SV, the easiest way to do
this is:

    sv_utf8_upgrade(sv);

However, you must not do this, for example:

    if (!SvUTF8(left))
        sv_utf8_upgrade(left);

If you do this in a binary operator, you will actually change one of the
strings that came into the operator, and, while it shouldn't be noticeable
by the end user, it can cause problems in deficient code.

Instead, C<bytes_to_utf8> will give you a UTF-8-encoded B<copy> of its
string argument.  This is useful for having the data available for
comparisons and so on, without harming the original SV.  There's also
C<utf8_to_bytes> to go the other way, but naturally, this will fail if
the string contains any characters above 255 that can't be represented
in a single byte.
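
The idea of converting into a fresh copy, and of the reverse conversion
failing for characters above 255, can be sketched in standalone C.  The
two functions below are hypothetical illustrations, not the Perl API: they
use C<malloc> rather than perl's allocator, assume an ASCII platform, and
make no claim to match the signatures of C<bytes_to_utf8> or
C<utf8_to_bytes>.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: return a malloc'ed, NUL-terminated UTF-8 copy of
 * len Latin-1 bytes, storing its length in *outlen.  The input is left
 * untouched.  Returns NULL if allocation fails; the caller frees. */
static unsigned char *
latin1_to_utf8_copy(const unsigned char *in, size_t len, size_t *outlen)
{
    unsigned char *out = malloc(2 * len + 1);  /* worst case: all 2-byte */
    size_t j = 0;

    if (!out)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        if (in[i] < 0x80) {
            out[j++] = in[i];
        }
        else {
            out[j++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[j++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    out[j] = '\0';
    *outlen = j;
    return out;
}

/* Hypothetical helper for the other direction: returns NULL if the
 * input is malformed or holds a character above 255, which cannot be
 * represented in a single byte. */
static unsigned char *
utf8_to_latin1_copy(const unsigned char *in, size_t len, size_t *outlen)
{
    unsigned char *out = malloc(len + 1);
    size_t i = 0, j = 0;

    if (!out)
        return NULL;
    while (i < len) {
        if (in[i] < 0x80) {
            out[j++] = in[i++];
        }
        else if ((in[i] == 0xC2 || in[i] == 0xC3)
                 && i + 1 < len && (in[i + 1] & 0xC0) == 0x80) {
            /* only lead bytes 0xC2 and 0xC3 encode 128...255 */
            out[j++] = (unsigned char)(((in[i] & 0x03) << 6)
                                       | (in[i + 1] & 0x3F));
            i += 2;
        }
        else {
            free(out);
            return NULL;
        }
    }
    out[j] = '\0';
    *outlen = j;
    return out;
}
```

Because the first function never writes to its input, the original buffer
remains usable afterwards, which is the property that makes a copying
conversion safe inside a binary operator.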

=head2 How do I compare strings?

L<perlapi/sv_cmp> and L<perlapi/sv_cmp_flags> do a lexicographic
comparison of two SVs, and handle UTF-8ness properly.  Note, however,
that Unicode specifies a much fancier mechanism for collation, available
via the L<Unicode::Collate> module.

To compare two strings only for equality/non-equality, you can use
L<C<memEQ()>|perlapi/memEQ> and L<C<memNE()>|perlapi/memEQ> as usual,
provided that either both strings are UTF-8 encoded or neither is.

To compare two strings case-insensitively, use
L<C<foldEQ_utf8()>|perlapi/foldEQ_utf8> (the strings don't have to have
the same UTF-8ness).

=head2 Is there anything else I need to know?

Not really.  Just remember these things:

=over 3

=item *

There's no way to tell if a S<C<char *>> or S<C<U8 *>> string is UTF-8
or not.  But you can tell if an SV is to be treated as UTF-8 by calling
C<DO_UTF8> on it, after stringifying it with C<SvPV> or a similar
macro.  And, you can tell if SV is actually UTF-8 (even if it is not to
be treated as such) by looking at its C<SvUTF8> flag (again after
stringifying it).  Don't forget to set the flag if something should be
UTF-8.
Treat the flag as part of the PV, even though it's not -- if you pass on
the PV to somewhere, pass on the flag too.

=item *

If a string is UTF-8, B<always> use C<utf8_to_uvchr_buf> to get at the value,
unless C<UTF8_IS_INVARIANT(*s)> in which case you can use C<*s>.

=item *

When writing a character UV to a UTF-8 string, B<always> use
C<uvchr_to_utf8>, unless C<UVCHR_IS_INVARIANT(uv)> in which case
you can use C<*s = uv>.

=item *

Mixing UTF-8 and non-UTF-8 strings is
tricky.  Use C<bytes_to_utf8> to get
a new string which is UTF-8 encoded, and then combine them.

=back

=head1 Custom Operators

Custom operator support is an experimental feature that allows you to
define your own ops.  This is primarily to allow the building of
interpreters for other languages in the Perl core, but it also allows
optimizations through the creation of "macro-ops" (ops which perform the
functions of multiple ops which are usually executed together, such as
C<gvsv, gvsv, add>.)

This feature is implemented as a new op type, C<OP_CUSTOM>.  The Perl
core does not "know" anything special about this op type, and so it will
not be involved in any optimizations.  This also means that you can
define your custom ops to be any op structure -- unary, binary, list and
so on -- you like.

It's important to know what custom operators won't do for you.  They
won't let you add new syntax to Perl, directly.  They won't even let you
add new keywords, directly.  In fact, they won't change the way Perl
compiles a program at all.  You have to do those changes yourself, after
Perl has compiled the program.  You do this either by manipulating the op
tree using a C<CHECK> block and the C<B::Generate> module, or by adding
a custom peephole optimizer with the C<optimize> module.

When you do this, you replace ordinary Perl ops with custom ops by
creating ops with the type C<OP_CUSTOM> and the C<op_ppaddr> of your own
PP function.  This should be defined in XS code, and should look like
the PP ops in C<pp_*.c>.  You are responsible for ensuring that your op
takes the appropriate number of values from the stack, and you are
responsible for adding stack marks if necessary.

You should also "register" your op with the Perl interpreter so that it
can produce sensible error and warning messages.  Since it is possible to
have multiple custom ops within the one "logical" op type C<OP_CUSTOM>,
Perl uses the value of C<< o->op_ppaddr >> to determine which custom op
it is dealing with.  You should create an C<XOP> structure for each
ppaddr you use, set the properties of the custom op with
C<XopENTRY_set>, and register the structure against the ppaddr using
C<Perl_custom_op_register>.  A trivial example might look like:

    static XOP my_xop;
    static OP *my_pp(pTHX);

    BOOT:
        XopENTRY_set(&my_xop, xop_name, "myxop");
        XopENTRY_set(&my_xop, xop_desc, "Useless custom op");
        Perl_custom_op_register(aTHX_ my_pp, &my_xop);

The available fields in the structure are:

=over 4

=item xop_name

A short name for your op.  This will be included in some error messages,
and will also be returned as C<< $op->name >> by the L<B|B> module, so
it will appear in the output of modules like L<B::Concise|B::Concise>.

=item xop_desc

A short description of the function of the op.

=item xop_class

Which of the various C<*OP> structures this op uses.  This should be one of
the C<OA_*> constants from F<op.h>, namely

=over 4

=item OA_BASEOP

=item OA_UNOP

=item OA_BINOP

=item OA_LOGOP

=item OA_LISTOP

=item OA_PMOP

=item OA_SVOP

=item OA_PADOP

=item OA_PVOP_OR_SVOP

This should be interpreted as 'C<PVOP>' only.  The C<_OR_SVOP> is because
the only core C<PVOP>, C<OP_TRANS>, can sometimes be a C<SVOP> instead.

=item OA_LOOP

=item OA_COP

=back

The other C<OA_*> constants should not be used.

=item xop_peep

This member is of type C<Perl_cpeep_t>, which expands to C<void
(*Perl_cpeep_t)(aTHX_ OP *o, OP *oldop)>.  If it is set, this function
will be called from C<Perl_rpeep> when ops of this type are encountered
by the peephole optimizer.  I<o> is the OP that needs optimizing;
I<oldop> is the previous OP optimized, whose C<op_next> points to I<o>.

=back

C<B::Generate> directly supports the creation of custom ops by name.


=head1 Dynamic Scope and the Context Stack

B<Note:> this section describes a non-public internal API that is subject
to change without notice.

=head2 Introduction to the context stack

In Perl, dynamic scoping refers to the runtime nesting of things like
subroutine calls, evals etc, as well as the entering and exiting of block
scopes. For example, the restoring of a C<local>ised variable is
determined by the dynamic scope.

Perl tracks the dynamic scope by a data structure called the context
stack, which is an array of C<PERL_CONTEXT> structures, each of which is
itself a big union for all the types of context. Whenever a new scope is
entered (such as a block, a C<for> loop, or a subroutine call), a new
context entry is pushed onto the stack. Similarly when leaving a block or
returning from a subroutine call etc. a context is popped. Since the
context stack represents the current dynamic scope, it can be searched.
For example, C<next LABEL> searches back through the stack looking for a
loop context that matches the label; C<return> pops contexts until it
finds a sub or eval context or similar; C<caller> examines sub contexts on
the stack.
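
The push, pop and search behaviour described above can be modelled with a
deliberately tiny toy in standalone C.  Everything here (the C<toy_*>
names, the fixed-size array, the label field) is hypothetical and bears no
relation to the real C<PERL_CONTEXT> layout; it is only meant to show why
a stack of typed frames makes operations like C<next LABEL> a simple
backwards search.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names exist in perl. */
enum toy_type { TOY_BLOCK, TOY_LOOP, TOY_SUB, TOY_EVAL };

struct toy_cx {
    enum toy_type type;
    const char   *label;    /* loop label, or NULL if none */
};

#define TOY_MAX 64
static struct toy_cx toy_stack[TOY_MAX];
static int toy_ix = -1;     /* index of current frame; -1 when empty */

/* Enter a scope: push a frame and return its index, or -1 if full. */
static int toy_push(enum toy_type type, const char *label)
{
    if (toy_ix + 1 >= TOY_MAX)
        return -1;
    toy_ix++;
    toy_stack[toy_ix].type  = type;
    toy_stack[toy_ix].label = label;
    return toy_ix;
}

/* Leave a scope: pop the current frame. */
static void toy_pop(void)
{
    if (toy_ix >= 0)
        toy_ix--;
}

/* Roughly what "next LABEL" needs: search from the innermost frame
 * outwards for a loop frame with a matching label.  A NULL label
 * matches the innermost loop.  Returns the frame index or -1. */
static int toy_find_loop(const char *label)
{
    for (int i = toy_ix; i >= 0; i--) {
        if (toy_stack[i].type != TOY_LOOP)
            continue;
        if (label == NULL)
            return i;
        if (toy_stack[i].label && strcmp(toy_stack[i].label, label) == 0)
            return i;
    }
    return -1;
}

/* Pop every frame above index ix, leaving ix as the current frame. */
static void toy_unwind(int ix)
{
    while (toy_ix > ix)
        toy_pop();
}
```

A caller would push a frame on entering each scope, and implement a
labelled jump by calling C<toy_find_loop> followed by C<toy_unwind> on the
index it returns.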

Each context entry is labelled with a context type, C<cx_type>. Typical
context types are C<CXt_SUB>, C<CXt_EVAL> etc., as well as C<CXt_BLOCK>
and C<CXt_NULL> which represent a basic scope (as pushed by C<pp_enter>)
and a sort block respectively. The type determines which parts of the
context union are valid.

The main division in the context struct is between a substitution scope
(C<CXt_SUBST>) and block scopes, which are everything else. The former is
just used while executing C<s///e>, and won't be discussed further
here.

All the block scope types share a common base, which corresponds to
C<CXt_BLOCK>. This stores the old values of various scope-related
variables like C<PL_curpm>, as well as information about the current
scope, such as C<gimme>. On scope exit, the old variables are restored.

Particular block scope types store extra per-type information. For
example, C<CXt_SUB> stores the currently executing CV, while the various
for loop types might hold the original loop variable SV. On scope exit,
the per-type data is processed; for example the CV has its reference count
decremented, and the original loop variable is restored.

The macro C<cxstack> returns the base of the current context stack, while
C<cxstack_ix> is the index of the current frame within that stack.

In fact, the context stack is actually part of a stack-of-stacks system;
whenever something unusual is done such as calling a C<DESTROY> or tie
handler, a new stack is pushed, then popped at the end.

Note that the API described here changed considerably in perl 5.24; prior
to that, big macros like C<PUSHBLOCK> and C<POPSUB> were used; in 5.24
they were replaced by the inline static functions described below. In
addition, the ordering and detail of how these macros/functions work
changed in many ways, often subtly. In particular they didn't handle
saving the savestack and temps stack positions, and required additional
C<ENTER>, C<SAVETMPS> and C<LEAVE> compared to the new functions. The
old-style macros will not be described further.


=head2 Pushing contexts

For pushing a new context, the two basic functions are
C<cx = cx_pushblock()>, which pushes a new basic context block and returns
its address, and a family of similar functions with names like
C<cx_pushsub(cx)> which populate the additional type-dependent fields in
the C<cx> struct. Note that C<CXt_NULL> and C<CXt_BLOCK> don't have their
own push functions, as they don't store any data beyond that pushed by
C<cx_pushblock>.

The fields of the context struct and the arguments to the C<cx_*>
functions are subject to change between perl releases, representing
whatever is convenient or efficient for that release.

A typical context stack pushing can be found in C<pp_entersub>; the
following shows a simplified and stripped-down example of a non-XS call,
along with comments showing roughly what each function does.

 dMARK;
 U8 gimme      = GIMME_V;
 bool hasargs  = cBOOL(PL_op->op_flags & OPf_STACKED);
 OP *retop     = PL_op->op_next;
 I32 old_ss_ix = PL_savestack_ix;
 CV *cv        = ....;

 /* ... make mortal copies of stack args which are PADTMPs here ... */

 /* ... do any additional savestack pushes here ... */

 /* Now push a new context entry of type 'CXt_SUB'; initially just
  * doing the actions common to all block types: */

 cx = cx_pushblock(CXt_SUB, gimme, MARK, old_ss_ix);

     /* this does (approximately):
         CXINC;              /* cxstack_ix++ (grow if necessary) */
         cx = CX_CUR();      /* and get the address of new frame */
         cx->cx_type        = CXt_SUB;
         cx->blk_gimme      = gimme;
         cx->blk_oldsp      = MARK - PL_stack_base;
         cx->blk_oldsaveix  = old_ss_ix;
         cx->blk_oldcop     = PL_curcop;
         cx->blk_oldmarksp  = PL_markstack_ptr - PL_markstack;
         cx->blk_oldscopesp = PL_scopestack_ix;
         cx->blk_oldpm      = PL_curpm;
         cx->blk_old_tmpsfloor = PL_tmps_floor;

         PL_tmps_floor        = PL_tmps_ix;
     */


 /* then update the new context frame with subroutine-specific info,
  * such as the CV about to be executed: */

 cx_pushsub(cx, cv, retop, hasargs);

     /* this does (approximately):
         cx->blk_sub.cv          = cv;
         cx->blk_sub.olddepth    = CvDEPTH(cv);
         cx->blk_sub.prevcomppad = PL_comppad;
         cx->cx_type            |= (hasargs) ? CXp_HASARGS : 0;
         cx->blk_sub.retop       = retop;
         SvREFCNT_inc_simple_void_NN(cv);
     */

Note that C<cx_pushblock()> sets two new floors: for the args stack (to
C<MARK>) and the temps stack (to C<PL_tmps_ix>). While executing at this
scope level, every C<nextstate> (amongst others) will reset the args and
tmps stack levels to these floors. Note that since C<cx_pushblock> uses
the current value of C<PL_tmps_ix> rather than it being passed as an arg,
this dictates at what point C<cx_pushblock> should be called. In
particular, any new mortals which should be freed only on scope exit
(rather than at the next C<nextstate>) should be created first.

Most callers of C<cx_pushblock> simply set the new args stack floor to the
top of the previous stack frame, but for C<CXt_LOOP_LIST> it stores the
items being iterated over on the stack, and so sets C<blk_oldsp> to the
top of these items instead. Note that, contrary to its name, C<blk_oldsp>
doesn't always represent the value to restore C<PL_stack_sp> to on scope
exit.

Note the early capture of C<PL_savestack_ix> to C<old_ss_ix>, which is
later passed as an arg to C<cx_pushblock>. In the case of C<pp_entersub>,
this is because, although most values needing saving are stored in fields
of the context struct, an extra value needs saving only when the debugger
is running, and it doesn't make sense to bloat the struct for this rare
case. So instead it is saved on the savestack. Since this value gets
calculated and saved before the context is pushed, it is necessary to pass
the old value of C<PL_savestack_ix> to C<cx_pushblock>, to ensure that the
saved value gets freed during scope exit.  For most users of
C<cx_pushblock>, where nothing needs pushing on the save stack,
C<PL_savestack_ix> is just passed directly as an arg to C<cx_pushblock>.

Note that where possible, values should be saved in the context struct
rather than on the save stack; it's much faster that way.

Normally C<cx_pushblock> should be immediately followed by the appropriate
C<cx_pushfoo>, with nothing between them; this is because if code
in-between could die (e.g. a warning upgraded to fatal), then the context
stack unwinding code in C<dounwind> would see (in the example above) a
C<CXt_SUB> context frame, but without all the subroutine-specific fields
set, and crashes would soon ensue.

Where the two must be separate, initially set the type to C<CXt_NULL> or
C<CXt_BLOCK>, and later change it to C<CXt_foo> when doing the
C<cx_pushfoo>. This is exactly what C<pp_enteriter> does, once it's
determined which type of loop it's pushing.

=head2 Popping contexts

Contexts are popped using C<cx_popsub()> etc. and C<cx_popblock()>. Note
however, that unlike C<cx_pushblock>, neither of these functions actually
decrements the current context stack index; this is done separately using
C<CX_POP()>.

There are two main ways that contexts are popped. During normal execution
as scopes are exited, functions like C<pp_leave>, C<pp_leaveloop> and
C<pp_leavesub> process and pop just one context using C<cx_popfoo> and
C<cx_popblock>. On the other hand, things like C<pp_return> and C<next>
may have to pop back several scopes until a sub or loop context is found,
and exceptions (such as C<die>) need to pop back contexts until an eval
context is found. Both of these are accomplished by C<dounwind()>, which
is capable of processing and popping all contexts above the target one.

Here is a typical example of context popping, as found in C<pp_leavesub>
(simplified slightly):

 U8 gimme;
 PERL_CONTEXT *cx;
 SV **oldsp;
 OP *retop;

 cx = CX_CUR();

 gimme = cx->blk_gimme;
 oldsp = PL_stack_base + cx->blk_oldsp; /* last arg of previous frame */

 if (gimme == G_VOID)
     PL_stack_sp = oldsp;
 else
     leave_adjust_stacks(oldsp, oldsp, gimme, 0);

 CX_LEAVE_SCOPE(cx);
 cx_popsub(cx);
 cx_popblock(cx);
 retop = cx->blk_sub.retop;
 CX_POP(cx);

 return retop;

The steps above are in a very specific order, designed to be the reverse
order of when the context was pushed. The first thing to do is to copy
and/or protect any return arguments and free any temps in the current
scope. Scope exits like an rvalue sub normally return a mortal copy of
their return args (as opposed to lvalue subs). It is important to make
this copy before the save stack is popped or variables are restored, or
bad things like the following can happen:

    sub f { my $x =...; $x }  # $x freed before we get to copy it
    sub f { /(...)/;    $1 }  # PL_curpm restored before $1 copied

Although we wish to free any temps at the same time, we have to be careful
not to free any temps which are keeping return args alive; nor to free the
temps we have just created while mortal copying return args. Fortunately,
C<leave_adjust_stacks()> is capable of making mortal copies of return args,
shifting args down the stack, and only processing those entries on the
temps stack that are safe to do so.

In void context no args are returned, so it's more efficient to skip
calling C<leave_adjust_stacks()>. Also in void context, a C<nextstate> op
is likely to be imminently called which will do a C<FREETMPS>, so there's
no need to do that either.

The next step is to pop savestack entries: C<CX_LEAVE_SCOPE(cx)> is just
defined as C<< LEAVE_SCOPE(cx->blk_oldsaveix) >>. Note that during the
popping, it's possible for perl to call destructors, call C<STORE> to undo
localisations of tied vars, and so on. Any of these can die or call
C<exit()>. In this case, C<dounwind()> will be called, and the current
context stack frame will be re-processed. Thus it is vital that all steps
in popping a context are done in such a way as to support reentrancy.  The
other alternative, of decrementing C<cxstack_ix> I<before> processing the
frame, would lead to leaks and the like if something died halfway through,
or overwriting of the current frame.

C<CX_LEAVE_SCOPE> itself is safely re-entrant: if only half the savestack
items have been popped before dying and getting trapped by eval, then the
C<CX_LEAVE_SCOPE>s in C<dounwind> or C<pp_leaveeval> will continue where
the first one left off.

The next step is the type-specific context processing; in this case
C<cx_popsub>. In part, this looks like:

    cv = cx->blk_sub.cv;
    CvDEPTH(cv) = cx->blk_sub.olddepth;
    cx->blk_sub.cv = NULL;
    SvREFCNT_dec(cv);

where it's processing the just-executed CV. Note that before it decrements
the CV's reference count, it nulls the C<blk_sub.cv>. This means that if
it re-enters, the CV won't be freed twice. It also means that you can't
rely on such type-specific fields having useful values after the return
from C<cx_popfoo>.
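
The "clear the field, then release" idiom is easy to see in isolation.
The following standalone toy uses hypothetical C<toy_*> types in place of
the real CV and context structures, and a counter in place of actually
freeing memory; it only demonstrates that running the pop step twice on
the same frame releases the object once.

```c
#include <stddef.h>

/* Toy stand-ins; none of these names exist in perl. */
struct toy_cv {
    int refcnt;     /* reference count */
    int freed;      /* how many times the object was "freed" */
};

struct toy_frame {
    struct toy_cv *cv;
};

/* Drop one reference; record a "free" when the count reaches zero. */
static void toy_cv_dec(struct toy_cv *cv)
{
    if (cv && --cv->refcnt == 0)
        cv->freed++;
}

/* Re-entrant-safe pop: clear the slot first, then release, so that a
 * second pass over the same frame finds NULL and does nothing. */
static void toy_popsub(struct toy_frame *cx)
{
    struct toy_cv *cv = cx->cv;

    cx->cv = NULL;
    toy_cv_dec(cv);
}
```

If the two statements in C<toy_popsub> were swapped, a second call made
before the slot was cleared would decrement the count again.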

Next, C<cx_popblock> restores all the various interpreter vars to their
previous values or previous high water marks; it expands to:

    PL_markstack_ptr = PL_markstack + cx->blk_oldmarksp;
    PL_scopestack_ix = cx->blk_oldscopesp;
    PL_curpm         = cx->blk_oldpm;
    PL_curcop        = cx->blk_oldcop;
    PL_tmps_floor    = cx->blk_old_tmpsfloor;

Note that it I<doesn't> restore C<PL_stack_sp>; as mentioned earlier,
which value to restore it to depends on the context type (specifically
C<for (list) {}>), and what args (if any) it returns; and that will
already have been sorted out earlier by C<leave_adjust_stacks()>.

Finally, the context stack pointer is actually decremented by C<CX_POP(cx)>.
After this point, it's possible that the current context frame could
be overwritten by other contexts being pushed. Although things like ties
and C<DESTROY> are supposed to work within a new context stack, it's best
not to assume this. Indeed on debugging builds, C<CX_POP(cx)> deliberately
sets C<cx> to null to detect code that is still relying on the field
values in that context frame. Note in the C<pp_leavesub()> example above,
we grab C<blk_sub.retop> I<before> calling C<CX_POP>.

=head2 Redoing contexts

Finally, there is C<cx_topblock(cx)>, which acts like a super-C<nextstate>
as regards resetting various vars to their base values. It is used in
places like C<pp_next>, C<pp_redo> and C<pp_goto> where rather than
exiting a scope, we want to re-initialise the scope. As well as resetting
C<PL_stack_sp> like C<nextstate>, it also resets C<PL_markstack_ptr>,
C<PL_scopestack_ix> and C<PL_curpm>. Note that it doesn't do a
C<FREETMPS>.


=head1 AUTHORS

Until May 1997, this document was maintained by Jeff Okamoto
E<lt>okamoto@corp.hp.comE<gt>.  It is now maintained as part of Perl
itself by the Perl 5 Porters E<lt>perl5-porters@perl.orgE<gt>.

With lots of help and suggestions from Dean Roehrich, Malcolm Beattie,
Andreas Koenig, Paul Hudson, Ilya Zakharevich, Paul Marquess, Neil
Bowers, Matthew Green, Tim Bunce, Spider Boardman, Ulrich Pfeifer,
Stephen McCamant, and Gurusamy Sarathy.

=head1 SEE ALSO

L<perlapi>, L<perlintern>, L<perlxs>, L<perlembed>

=encoding utf8

=head1 NAME

perlunicook - cookbookish examples of handling Unicode in Perl

=head1 DESCRIPTION

This manpage contains short recipes demonstrating how to handle common Unicode
operations in Perl, plus one complete program at the end. Any undeclared
variables in individual recipes are assumed to have a previous appropriate
value in them.

=head1 EXAMPLES

=head2 ℞ 0: Standard preamble

Unless otherwise noted, all examples below require this standard preamble
to work correctly, with the C<#!> adjusted to work on your system:

 #!/usr/bin/env perl

 use utf8;      # so literals and identifiers can be in UTF-8
 use v5.12;     # or later to get "unicode_strings" feature
 use strict;    # quote strings, declare variables
 use warnings;  # on by default
 use warnings  qw(FATAL utf8);    # fatalize encoding glitches
 use open      qw(:std :encoding(UTF-8)); # undeclared streams in UTF-8
 use charnames qw(:full :short);  # unneeded in v5.16

This I<does> make even Unix programmers C<binmode> your binary streams,
or open them with C<:raw>, but that's the only way to get at them
portably anyway.

B<WARNING>: C<use autodie> (pre 2.26) and C<use open> do not get along with each
other.

=head2 ℞ 1: Generic Unicode-savvy filter

Always decompose on the way in, then recompose on the way out.

 use Unicode::Normalize;

 while (<>) {
     $_ = NFD($_);   # decompose + reorder canonically
     ...
 } continue {
     print NFC($_);  # recompose (where possible) + reorder canonically
 }

=head2 ℞ 2: Fine-tuning Unicode warnings

As of v5.14, Perl distinguishes three subclasses of UTF‑8 warnings.

 use v5.14;                  # subwarnings unavailable any earlier
 no warnings "nonchar";      # the 66 forbidden non-characters
 no warnings "surrogate";    # UTF-16/CESU-8 nonsense
 no warnings "non_unicode";  # for codepoints over 0x10_FFFF

=head2 ℞ 3: Declare source in utf8 for identifiers and literals

Without the all-critical C<use utf8> declaration, putting UTF‑8 in your
literals and identifiers won’t work right.  If you used the standard
preamble just given above, this already happened.  If you did, you can
do things like this:

 use utf8;

 my $measure   = "Ångström";
 my @μsoft     = qw( cp852 cp1251 cp1252 );
 my @ὑπέρμεγας = qw( ὑπέρ  μεγας );
 my @鯉        = qw( koi8-f koi8-u koi8-r );
 my $motto     = "👪 💗 🐪"; # FAMILY, GROWING HEART, DROMEDARY CAMEL

If you forget C<use utf8>, high bytes will be misunderstood as
separate characters, and nothing will work right.

=head2 ℞ 4: Characters and their numbers

The C<ord> and C<chr> functions work transparently on all codepoints,
not just on ASCII, and in fact not even just on Unicode.

 # ASCII characters
 ord("A")
 chr(65)

 # characters from the Basic Multilingual Plane
 ord("Σ")
 chr(0x3A3)

 # beyond the BMP
 ord("𝑛")               # MATHEMATICAL ITALIC SMALL N
 chr(0x1D45B)

 # beyond Unicode! (up to MAXINT)
 ord("\x{20_0000}")
 chr(0x20_0000)

=head2 ℞ 5: Unicode literals by character number

In an interpolated literal, whether a double-quoted string or a
regex, you may specify a character by its number using the
C<\x{I<HHHHHH>}> escape.

 String: "\x{3a3}"
 Regex:  /\x{3a3}/

 String: "\x{1d45b}"
 Regex:  /\x{1d45b}/

 # even non-BMP ranges in regex work fine
 /[\x{1D434}-\x{1D467}]/

=head2 ℞ 6: Get character name by number

 use charnames ();
 my $name = charnames::viacode(0x03A3);

=head2 ℞ 7: Get character number by name

 use charnames ();
 my $number = charnames::vianame("GREEK CAPITAL LETTER SIGMA");

=head2 ℞ 8: Unicode named characters

Use the C<< \N{I<charname>} >> notation to get the character
by that name for use in interpolated literals (double-quoted
strings and regexes).  In v5.16, there is an implicit

 use charnames qw(:full :short);

But prior to v5.16, you must be explicit about which set of charnames you
want.  The C<:full> names are the official Unicode character name, alias, or
sequence, which all share a namespace.

 use charnames qw(:full :short latin greek);

 "\N{MATHEMATICAL ITALIC SMALL N}"      # :full
 "\N{GREEK CAPITAL LETTER SIGMA}"       # :full

Anything else is a Perl-specific convenience abbreviation.  Specify one or
more scripts by names if you want short names that are script-specific.

 "\N{Greek:Sigma}"                      # :short
 "\N{ae}"                               #  latin
 "\N{epsilon}"                          #  greek

The v5.16 release also supports a C<:loose> import for loose matching of
character names, which works just like loose matching of property names:
that is, it disregards case, whitespace, and underscores:

 "\N{euro sign}"                        # :loose (from v5.16)

=head2 ℞ 9: Unicode named sequences

These look just like character names but return multiple codepoints.
Notice the C<%vx> vector-print functionality in C<printf>.

 use charnames qw(:full);
 my $seq = "\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}";
 printf "U+%v04X\n", $seq;
 U+0100.0300

=head2 ℞ 10: Custom named characters

Use C<:alias> to give your own lexically scoped nicknames to existing
characters, or even to give unnamed private-use characters useful names.

 use charnames ":full", ":alias" => {
     ecute => "LATIN SMALL LETTER E WITH ACUTE",
     "APPLE LOGO" => 0xF8FF, # private use character
 };

 "\N{ecute}"
 "\N{APPLE LOGO}"

=head2 ℞ 11: Names of CJK codepoints

Sinograms like “東京” come back with character names of
C<CJK UNIFIED IDEOGRAPH-6771> and C<CJK UNIFIED IDEOGRAPH-4EAC>,
because their “names” vary.  The CPAN C<Unicode::Unihan> module
has a large database for decoding these (and a whole lot more), provided you
know how to understand its output.

 # cpan -i Unicode::Unihan
 use Unicode::Unihan;
 my $str = "東京";
 my $unhan = Unicode::Unihan->new;
 for my $lang (qw(Mandarin Cantonese Korean JapaneseOn JapaneseKun)) {
     printf "CJK $str in %-12s is ", $lang;
     say $unhan->$lang($str);
 }

prints:

 CJK 東京 in Mandarin     is DONG1JING1
 CJK 東京 in Cantonese    is dung1ging1
 CJK 東京 in Korean       is TONGKYENG
 CJK 東京 in JapaneseOn   is TOUKYOU KEI KIN
 CJK 東京 in JapaneseKun  is HIGASHI AZUMAMIYAKO

If you have a specific romanization scheme in mind,
use the specific module:

 # cpan -i Lingua::JA::Romanize::Japanese
 use Lingua::JA::Romanize::Japanese;
 my $k2r = Lingua::JA::Romanize::Japanese->new;
 my $str = "東京";
 say "Japanese for $str is ", $k2r->chars($str);

prints

 Japanese for 東京 is toukyou

=head2 ℞ 12: Explicit encode/decode

On rare occasion, such as a database read, you may be
given encoded text you need to decode.

  use Encode qw(encode decode);

  my $chars = decode("shiftjis", $bytes, 1);
 # OR
  my $bytes = encode("MIME-Header-ISO_2022_JP", $chars, 1);

For streams all in the same encoding, don't use encode/decode; instead
set the file encoding when you open the file or immediately after with
C<binmode> as described later below.

=head2 ℞ 13: Decode program arguments as utf8

     $ perl -CA ...
 or
     $ export PERL_UNICODE=A
 or
    use Encode qw(decode);
    @ARGV = map { decode('UTF-8', $_, 1) } @ARGV;

=head2 ℞ 14: Decode program arguments as locale encoding

    # cpan -i Encode::Locale
    use Encode qw(decode);
    use Encode::Locale;

    # use "locale" as an arg to encode/decode
    @ARGV = map { decode(locale => $_, 1) } @ARGV;

=head2 ℞ 15: Declare STD{IN,OUT,ERR} to be utf8

Use a command-line option, an environment variable, or else
call C<binmode> explicitly:

     $ perl -CS ...
 or
     $ export PERL_UNICODE=S
 or
     use open qw(:std :encoding(UTF-8));
 or
     binmode(STDIN,  ":encoding(UTF-8)");
     binmode(STDOUT, ":utf8");
     binmode(STDERR, ":utf8");

=head2 ℞ 16: Declare STD{IN,OUT,ERR} to be in locale encoding

    # cpan -i Encode::Locale
    use Encode;
    use Encode::Locale;

    # or as a stream for binmode or open
    binmode STDIN,  ":encoding(console_in)"  if -t STDIN;
    binmode STDOUT, ":encoding(console_out)" if -t STDOUT;
    binmode STDERR, ":encoding(console_out)" if -t STDERR;

=head2 ℞ 17: Make file I/O default to utf8

Files opened without an encoding argument will be in UTF-8:

     $ perl -CD ...
 or
     $ export PERL_UNICODE=D
 or
     use open qw(:encoding(UTF-8));

=head2 ℞ 18: Make all I/O and args default to utf8

     $ perl -CSDA ...
 or
     $ export PERL_UNICODE=SDA
 or
     use open qw(:std :encoding(UTF-8));
     use Encode qw(decode);
     @ARGV = map { decode('UTF-8', $_, 1) } @ARGV;

=head2 ℞ 19: Open file with specific encoding

Specify stream encoding.  This is the normal way
to deal with encoded text, not by calling low-level
functions.

 # input file
     open(my $in_file, "< :encoding(UTF-16)", "wintext");
 OR
     open(my $in_file, "<", "wintext");
     binmode($in_file, ":encoding(UTF-16)");
 THEN
     my $line = <$in_file>;

 # output file
     open(my $out_file, "> :encoding(cp1252)", "wintext");
 OR
     open(my $out_file, ">", "wintext");
     binmode($out_file, ":encoding(cp1252)");
 THEN
     print $out_file "some text\n";

More layers than just the encoding can be specified here. For example,
the incantation C<":raw :encoding(UTF-16LE) :crlf"> includes implicit
CRLF handling.
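
The following is a small self-contained sketch of the same idea, assuming a
Unix-like system and a scratch file from C<File::Temp> (both are choices made
for this illustration only).  It writes one line through an encoding layer
and reads it back through the same layer.

 use strict;
 use warnings;
 use File::Temp qw(tempfile);

 my ($fh, $path) = tempfile(UNLINK => 1);
 close($fh);

 # write characters; the layer turns them into UTF-16LE bytes
 open(my $out, "> :encoding(UTF-16LE)", $path) or die "open: $!";
 print $out "\x{3A3}\n";          # GREEK CAPITAL LETTER SIGMA
 close($out) or die "close: $!";

 # two characters at two bytes each on a Unix-like system
 my $size = -s $path;

 # read bytes; the layer turns them back into characters
 open(my $in, "< :encoding(UTF-16LE)", $path) or die "open: $!";
 my $line = <$in>;
 close($in);

 printf "%d bytes on disk, %d characters read\n", $size, length($line);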

=head2 ℞ 20: Unicode casing

Unicode casing is very different from ASCII casing.

 uc("henry ⅷ")  # "HENRY Ⅷ"
 uc("tschüß")   # "TSCHÜSS"  notice ß => SS

 # both are true:
 "tschüß"  =~ /TSCHÜSS/i   # notice ß => SS
 "Σίσυφος" =~ /ΣΊΣΥΦΟΣ/i   # notice Σ,σ,ς sameness

=head2 ℞ 21: Unicode case-insensitive comparisons

Also available in the CPAN L<Unicode::CaseFold> module,
the new C<fc> “foldcase” function from v5.16 grants
access to the same Unicode casefolding as the C</i>
pattern modifier has always used:

 use feature "fc"; # fc() function is from v5.16

 # sort case-insensitively
 my @sorted = sort { fc($a) cmp fc($b) } @list;

 # both are true:
 fc("tschüß")  eq fc("TSCHÜSS")
 fc("Σίσυφος") eq fc("ΣΊΣΥΦΟΣ")

=head2 ℞ 22: Match Unicode linebreak sequence in regex

A Unicode linebreak matches the two-character CRLF
grapheme or any of seven vertical whitespace characters.
Good for dealing with textfiles coming from different
operating systems.

 \R

 s/\R/\n/g;  # normalize all linebreaks to \n
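
As a brief illustration (the sample text is made up for this sketch), one
substitution handles a string that mixes several linebreak conventions:

 use strict;
 use warnings;

 # CRLF, bare CR, LF, and U+2028 LINE SEPARATOR in one string
 my $text = "one\r\ntwo\rthree\nfour\x{2028}five";

 (my $unix = $text) =~ s/\R/\n/g;
 my @lines = split /\n/, $unix;

 print scalar(@lines), " lines\n";   # 5 lines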

=head2 ℞ 23: Get character category

Find the general category of a numeric codepoint.

 use Unicode::UCD qw(charinfo);
 my $cat = charinfo(0x3A3)->{category};  # "Lu"

=head2 ℞ 24: Disabling Unicode-awareness in builtin charclasses

Disable C<\w>, C<\b>, C<\s>, C<\d>, and the POSIX
classes from working correctly on Unicode either in this
scope, or in just one regex.

 use v5.14;
 use re "/a";

 # OR

 my($num) = $str =~ /(\d+)/a;

Or use specific un-Unicode properties, like C<\p{ahex}>
and C<\p{POSIX_Digit}>.  Properties still work normally
no matter what charset modifiers (C</d /u /l /a /aa>)
are in effect.

=head2 ℞ 25: Match Unicode properties in regex with \p, \P

These all match a single codepoint with the given
property.  Use C<\P> in place of C<\p> to match
one codepoint lacking that property.

 \pL, \pN, \pS, \pP, \pM, \pZ, \pC
 \p{Sk}, \p{Ps}, \p{Lt}
 \p{alpha}, \p{upper}, \p{lower}
 \p{Latin}, \p{Greek}
 \p{script_extensions=Latin}, \p{scx=Greek}
 \p{East_Asian_Width=Wide}, \p{EA=W}
 \p{Line_Break=Hyphen}, \p{LB=HY}
 \p{Numeric_Value=4}, \p{NV=4}

=head2 ℞ 26: Custom character properties

Define at compile-time your own custom character
properties for use in regexes.

 # using private-use characters
 sub In_Tengwar { "E000\tE07F\n" }

 if (/\p{In_Tengwar}/) { ... }

 # blending existing properties
 sub Is_GraecoRoman_Title {<<'END_OF_SET'}
 +utf8::IsLatin
 +utf8::IsGreek
 &utf8::IsTitle
 END_OF_SET

 if (/\p{Is_GraecoRoman_Title}/) { ... }

=head2 ℞ 27: Unicode normalization

Typically render into NFD on input and NFC on output. Using NFKC or NFKD
functions improves recall on searches, assuming you've already done the same
to the text to be searched. Note that this is about much more than just
pre-combined compatibility glyphs; it also reorders marks according to their
canonical combining classes and weeds out singletons.

 use Unicode::Normalize;
 my $nfd  = NFD($orig);
 my $nfc  = NFC($orig);
 my $nfkd = NFKD($orig);
 my $nfkc = NFKC($orig);
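
To see why normalizing matters before comparing, here is a small sketch
(the two sample strings are chosen for illustration) in which the same
user-visible character is spelled in two different ways:

 use strict;
 use warnings;
 use Unicode::Normalize qw(NFC NFD);

 my $composed   = "\x{E9}";     # LATIN SMALL LETTER E WITH ACUTE
 my $decomposed = "e\x{301}";   # "e" + COMBINING ACUTE ACCENT

 print "raw strings differ\n" if $composed ne $decomposed;
 print "equal after NFD\n"    if NFD($composed) eq NFD($decomposed);
 print "equal after NFC\n"    if NFC($composed) eq NFC($decomposed);

 # NFC yields one codepoint, NFD yields two
 printf "NFC length %d, NFD length %d\n",
     length(NFC($decomposed)), length(NFD($composed));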

=head2 ℞ 28: Convert non-ASCII Unicode numerics

Unless you’ve used C</a> or C</aa>, C<\d> matches more than
ASCII digits only, but Perl’s implicit string-to-number
conversion does not currently recognize these.  Here’s how to
convert such strings manually.

 use v5.14;  # needed for num() function
 use Unicode::UCD qw(num);
 my $str = "got Ⅻ and ४५६७ and ⅞ and here";
 my @nums = ();
 while ($str =~ /(\d+|\N)/g) {  # not just ASCII!
    push @nums, num($1);
 }
 say "@nums";   #     12      4567      0.875

 use charnames qw(:full);
 my $nv = num("\N{RUMI DIGIT ONE}\N{RUMI DIGIT TWO}");

=head2 ℞ 29: Match Unicode grapheme cluster in regex

Programmer-visible “characters” are codepoints matched by C</./s>,
but user-visible “characters” are graphemes matched by C</\X/>.

 # Find vowel *plus* any combining diacritics,underlining,etc.
 my $nfd = NFD($orig);
 $nfd =~ / (?=[aeiou]) \X /xi

=head2 ℞ 30: Extract by grapheme instead of by codepoint (regex)

 # match and grab five first graphemes
 my($first_five) = $str =~ /^ ( \X{5} ) /x;

=head2 ℞ 31: Extract by grapheme instead of by codepoint (substr)

 # cpan -i Unicode::GCString
 use Unicode::GCString;
 my $gcs = Unicode::GCString->new($str);
 my $first_five = $gcs->substr(0, 5);

=head2 ℞ 32: Reverse string by grapheme

Reversing by codepoint messes up diacritics, mistakenly converting
C<crème brûlée> into C<éel̂urb em̀erc> instead of into C<eélûrb emèrc>;
so reverse by grapheme instead.  Both these approaches work
right no matter what normalization the string is in:

 $str = join("", reverse $str =~ /\X/g);

 # OR: cpan -i Unicode::GCString
 use Unicode::GCString;
 $str = reverse Unicode::GCString->new($str);

=head2 ℞ 33: String length in graphemes

The string C<brûlée> has six graphemes but up to eight codepoints.
This counts by grapheme, not by codepoint:

 my $str = "brûlée";
 my $count = 0;
 while ($str =~ /\X/g) { $count++ }

  # OR: cpan -i Unicode::GCString
 use Unicode::GCString;
 my $gcs = Unicode::GCString->new($str);
 my $count = $gcs->length;
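
A more compact variant of the regex approach, sketched here with core Perl
only, assigns the list of matches to an empty list to obtain the count.  The
sample string is spelled with combining marks so that the two counts differ.

 use strict;
 use warnings;

 # "brulee" with combining circumflex and acute (decomposed form)
 my $str = "bru\x{302}le\x{301}e";

 my $codepoints = length($str);
 my $graphemes  = () = $str =~ /\X/g;

 print "$codepoints codepoints, $graphemes graphemes\n";
 # 8 codepoints, 6 graphemes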

=head2 ℞ 34: Unicode column-width for printing

Perl’s C<printf>, C<sprintf>, and C<format> think all
codepoints take up 1 print column, but many take 0 or 2.
Here to show that normalization makes no difference,
we print out both forms:

 use Unicode::GCString;
 use Unicode::Normalize;

 my @words = qw/crème brûlée/;
 @words = map { NFC($_), NFD($_) } @words;

 for my $str (@words) {
     my $gcs = Unicode::GCString->new($str);
     my $cols = $gcs->columns;
     my $pad = " " x (10 - $cols);
     say $str, $pad, " |";
 }

generates this to show that it pads correctly no matter
the normalization:

 crème      |
 crème      |
 brûlée     |
 brûlée     |

=head2 ℞ 35: Unicode collation

Text sorted by numeric codepoint follows no reasonable alphabetic order;
use the UCA for sorting text.

 use Unicode::Collate;
 my $col = Unicode::Collate->new();
 my @list = $col->sort(@old_list);

See the I<ucsort> program from the L<Unicode::Tussle> CPAN module
for a convenient command-line interface to this module.

=head2 ℞ 36: Case- I<and> accent-insensitive Unicode sort

Specify a collation strength of level 1 to ignore case and
diacritics, only looking at the basic character.

 use Unicode::Collate;
 my $col = Unicode::Collate->new(level => 1);
 my @list = $col->sort(@old_list);

=head2 ℞ 37: Unicode locale collation

Some locales have special sorting rules.

 # either use v5.12, OR: cpan -i Unicode::Collate::Locale
 use Unicode::Collate::Locale;
 my $col = Unicode::Collate::Locale->new(locale => "de__phonebook");
 my @list = $col->sort(@old_list);

The I<ucsort> program mentioned above accepts a C<--locale> parameter.

=head2 ℞ 38: Making C<cmp> work on text instead of codepoints

Instead of this:

 @srecs = sort {
     $b->{AGE}   <=>  $a->{AGE}
                 ||
     $a->{NAME}  cmp  $b->{NAME}
 } @recs;

Use this:

 my $coll = Unicode::Collate->new();
 for my $rec (@recs) {
     $rec->{NAME_key} = $coll->getSortKey( $rec->{NAME} );
 }
 @srecs = sort {
     $b->{AGE}       <=>  $a->{AGE}
                     ||
     $a->{NAME_key}  cmp  $b->{NAME_key}
 } @recs;

=head2 ℞ 39: Case- I<and> accent-insensitive comparisons

Use a collator object to compare Unicode text by character
instead of by codepoint.

 use Unicode::Collate;
 my $es = Unicode::Collate->new(
     level => 1,
     normalization => undef
 );

  # now both are true:
 $es->eq("García",  "GARCIA" );
 $es->eq("Márquez", "MARQUEZ");

=head2 ℞ 40: Case- I<and> accent-insensitive locale comparisons

Same, but in a specific locale.

 my $de = Unicode::Collate::Locale->new(
            locale => "de__phonebook",
          );

 # now this is true:
 $de->eq("tschüß", "TSCHUESS");  # notice ü => UE, ß => SS

=head2 ℞ 41: Unicode linebreaking

Break up text into lines according to Unicode rules.

 # cpan -i Unicode::LineBreak
 use Unicode::LineBreak;
 use charnames qw(:full);

 my $para = "This is a super\N{HYPHEN}long string. " x 20;
 my $fmt = Unicode::LineBreak->new;
 print $fmt->break($para), "\n";

=head2 ℞ 42: Unicode text in DBM hashes, the tedious way

Using a regular Perl string as a key or value for a DBM
hash will trigger a wide character exception if any codepoints
won’t fit into a byte.  Here’s how to manually manage the translation:

    use DB_File;
    use Encode qw(encode decode);
    tie %dbhash, "DB_File", "pathname";

 # STORE

    # assume $uni_key and $uni_value are abstract Unicode strings
    my $enc_key   = encode("UTF-8", $uni_key, 1);
    my $enc_value = encode("UTF-8", $uni_value, 1);
    $dbhash{$enc_key} = $enc_value;

 # FETCH

    # assume $uni_key holds a normal Perl string (abstract Unicode)
    my $enc_key   = encode("UTF-8", $uni_key, 1);
    my $enc_value = $dbhash{$enc_key};
    my $uni_value = decode("UTF-8", $enc_value, 1);

=head2 ℞ 43: Unicode text in DBM hashes, the easy way

Here’s how to implicitly manage the translation; all encoding
and decoding is done automatically, just as with streams that
have a particular encoding attached to them:

    use DB_File;
    use DBM_Filter;

    my $dbobj = tie %dbhash, "DB_File", "pathname";
    $dbobj->Filter_Push("utf8");  # this is the magic bit (keys and values)

 # STORE

    # assume $uni_key and $uni_value are abstract Unicode strings
    $dbhash{$uni_key} = $uni_value;

  # FETCH

    # $uni_key holds a normal Perl string (abstract Unicode)
    my $uni_value = $dbhash{$uni_key};

=head2 ℞ 44: PROGRAM: Demo of Unicode collation and printing

Here’s a full program showing how to make use of locale-sensitive
sorting, Unicode casing, and managing print widths when some of the
characters take up zero or two columns, not just one column each time.
When run, the following program produces this nicely aligned output:

    Crème Brûlée....... €2.00
    Éclair............. €1.60
    Fideuà............. €4.20
    Hamburger.......... €6.00
    Jamón Serrano...... €4.45
    Linguiça........... €7.00
    Pâté............... €4.15
    Pears.............. €2.00
    Pêches............. €2.25
    Smørbrød........... €5.75
    Spätzle............ €5.50
    Xoriço............. €3.00
    Γύρος.............. €6.50
    막걸리............. €4.00
    おもち............. €2.65
    お好み焼き......... €8.00
    シュークリーム..... €1.85
    寿司............... €9.99
    包子............... €7.50

Here's that program; tested on v5.14.

 #!/usr/bin/env perl
 # umenu - demo sorting and printing of Unicode food
 #
 # (obligatory and increasingly long preamble)
 #
 use utf8;
 use v5.14;                       # for locale sorting
 use strict;
 use warnings;
 use warnings  qw(FATAL utf8);    # fatalize encoding faults
 use open      qw(:std :encoding(UTF-8)); # undeclared streams in UTF-8
 use charnames qw(:full :short);  # unneeded in v5.16

 # std modules
 use Unicode::Normalize;          # std perl distro as of v5.8
 use List::Util qw(max);          # std perl distro as of v5.10
 use Unicode::Collate::Locale;    # std perl distro as of v5.14

 # cpan modules
 use Unicode::GCString;           # from CPAN

 # forward defs
 sub pad($$$);
 sub colwidth(_);
 sub entitle(_);

 my %price = (
     "γύρος"             => 6.50, # gyros
     "pears"             => 2.00, # like um, pears
     "linguiça"          => 7.00, # spicy sausage, Portuguese
     "xoriço"            => 3.00, # chorizo sausage, Catalan
     "hamburger"         => 6.00, # burgermeister meisterburger
     "éclair"            => 1.60, # dessert, French
     "smørbrød"          => 5.75, # sandwiches, Norwegian
     "spätzle"           => 5.50, # Bayerisch noodles, little sparrows
     "包子"              => 7.50, # bao1 zi5, steamed pork buns, Mandarin
     "jamón serrano"     => 4.45, # country ham, Spanish
     "pêches"            => 2.25, # peaches, French
     "シュークリーム"    => 1.85, # cream-filled pastry like eclair
     "막걸리"            => 4.00, # makgeolli, Korean rice wine
     "寿司"              => 9.99, # sushi, Japanese
     "おもち"            => 2.65, # omochi, rice cakes, Japanese
     "crème brûlée"      => 2.00, # crema catalana
     "fideuà"            => 4.20, # more noodles, Valencian
                                  # (Catalan=fideuada)
     "pâté"              => 4.15, # gooseliver paste, French
     "お好み焼き"        => 8.00, # okonomiyaki, Japanese
 );

 my $width = 5 + max map { colwidth } keys %price;

 # So the Asian stuff comes out in an order that someone
 # who reads those scripts won't freak out over; the
 # CJK stuff will be in JIS X 0208 order that way.
 my $coll  = Unicode::Collate::Locale->new(locale => "ja");

 for my $item ($coll->sort(keys %price)) {
     print pad(entitle($item), $width, ".");
     printf " €%.2f\n", $price{$item};
 }

 sub pad($$$) {
     my($str, $width, $padchar) = @_;
     return $str . ($padchar x ($width - colwidth($str)));
 }

 sub colwidth(_) {
     my($str) = @_;
     return Unicode::GCString->new($str)->columns;
 }

 sub entitle(_) {
     my($str) = @_;
     $str =~ s{ (?=\pL)(\S)     (\S*) }
              { ucfirst($1) . lc($2)  }xge;
     return $str;
 }

=head1 SEE ALSO

See these manpages, some of which are CPAN modules:
L<perlunicode>, L<perluniprops>,
L<perlre>, L<perlrecharclass>,
L<perluniintro>, L<perlunitut>, L<perlunifaq>,
L<PerlIO>, L<DB_File>, L<DBM_Filter>, L<DBM_Filter::utf8>,
L<Encode>, L<Encode::Locale>,
L<Unicode::UCD>,
L<Unicode::Normalize>,
L<Unicode::GCString>, L<Unicode::LineBreak>,
L<Unicode::Collate>, L<Unicode::Collate::Locale>,
L<Unicode::Unihan>,
L<Unicode::CaseFold>,
L<Unicode::Tussle>,
L<Lingua::JA::Romanize::Japanese>,
L<Lingua::ZH::Romanize::Pinyin>,
L<Lingua::KO::Romanize::Hangul>.

The L<Unicode::Tussle> CPAN module includes many programs
to help with working with Unicode, including
these programs to fully or partly replace standard utilities:
I<tcgrep> instead of I<egrep>,
I<uniquote> instead of I<cat -v> or I<hexdump>,
I<uniwc> instead of I<wc>,
I<unilook> instead of I<look>,
I<unifmt> instead of I<fmt>,
and
I<ucsort> instead of I<sort>.
For exploring Unicode character names and character properties,
see its I<uniprops>, I<unichars>, and I<uninames> programs.
It also supplies these programs, all of which are general filters that do Unicode-y things:
I<unititle> and I<unicaps>;
I<uniwide> and I<uninarrow>;
I<unisupers> and I<unisubs>;
I<nfd>, I<nfc>, I<nfkd>, and I<nfkc>;
and I<uc>, I<lc>, and I<tc>.

Finally, see the published Unicode Standard (page numbers are from version
6.0.0), including these specific annexes and technical reports:

=over

=item §3.13 Default Case Algorithms, page 113;
§4.2  Case, pages 120–122;
Case Mappings, pages 166–172, especially Caseless Matching starting on page 170.

=item UAX #44: Unicode Character Database

=item UTS #18: Unicode Regular Expressions

=item UAX #15: Unicode Normalization Forms

=item UTS #10: Unicode Collation Algorithm

=item UAX #29: Unicode Text Segmentation

=item UAX #14: Unicode Line Breaking Algorithm

=item UAX #11: East Asian Width

=back

=head1 AUTHOR

Tom Christiansen E<lt>tchrist@perl.comE<gt> wrote this, with occasional
kibitzing from Larry Wall and Jeffrey Friedl in the background.

=head1 COPYRIGHT AND LICENCE

Copyright © 2012 Tom Christiansen.

This program is free software; you may redistribute it and/or modify it
under the same terms as Perl itself.

Most of these examples are taken from the current edition of the “Camel Book”;
that is, from the 4ᵗʰ Edition of I<Programming Perl>, Copyright © 2012 Tom
Christiansen I<et al.>, 2012-02-13 by O’Reilly Media.  The code itself is
freely redistributable, and you are encouraged to transplant, fold,
spindle, and mutilate any of the examples in this manpage however you please
for inclusion into your own programs without any encumbrance whatsoever.
Acknowledgement via code comment is polite but not required.

=head1 REVISION HISTORY

v1.0.0 – first public release, 2012-02-27

=encoding utf8

=head1 NAME

perl5200delta - what is new for perl v5.20.0

=head1 DESCRIPTION

This document describes differences between the 5.18.0 release and the
5.20.0 release.

If you are upgrading from an earlier release such as 5.16.0, first read
L<perl5180delta>, which describes differences between 5.16.0 and 5.18.0.

=head1 Core Enhancements

=head2 Experimental Subroutine signatures

Declarative syntax to unwrap argument list into lexical variables.
C<sub foo ($a,$b) {...}> checks the number of arguments and puts the
arguments into lexical variables.  Signatures are not equivalent to
the existing idiom of C<sub foo { my($a,$b) = @_; ... }>.  Signatures
are only available by enabling a non-default feature, and generate
warnings about being experimental.  The syntactic clash with
prototypes is managed by disabling the short prototype syntax when
signatures are enabled.

See L<perlsub/Signatures> for details.
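
A minimal sketch of the feature in use follows.  The subroutine name
C<greet> and its parameters are invented for this illustration; the feature
and warning category names are the ones described above.

  use v5.20;
  use feature "signatures";
  no warnings "experimental::signatures";

  # $greeting has a default, so one or two arguments are accepted
  sub greet ($name, $greeting = "Hello") {
      return "$greeting, $name!";
  }

  say greet("World");            # Hello, World!
  say greet("World", "Howdy");   # Howdy, World!

  # calling with too few arguments dies at run time
  my $ok = eval { greet(); 1 };
  say "greet() with no arguments died" unless $ok;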

=head2 C<sub>s now take a C<prototype> attribute

When declaring or defining a C<sub>, the prototype can now be specified inside
of a C<prototype> attribute instead of in parens following the name.

For example, C<sub foo($$){}> could be rewritten as
C<sub foo : prototype($$){}>.
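
As a short sketch (the name C<pair_sum> is hypothetical), the attribute form
can be checked with the C<prototype> builtin:

  use strict;
  use warnings;

  # same effect as declaring "sub pair_sum($$) {...}"
  sub pair_sum : prototype($$) {
      my ($x, $y) = @_;
      return $x + $y;
  }

  print prototype(\&pair_sum), "\n";   # $$
  print pair_sum(2, 3), "\n";          # 5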

=head2 More consistent prototype parsing

Multiple semicolons in subroutine prototypes have long been tolerated and
treated as a single semicolon.  There was one case where this did not
happen.  A subroutine whose prototype begins with "*" or ";*" can affect
whether a bareword is considered a method name or sub call.  This now
applies also to ";;;*".

Whitespace has long been allowed inside subroutine prototypes, so
C<sub( $ $ )> is equivalent to C<sub($$)>, but until now it was stripped
when the subroutine was parsed.  Hence, whitespace was I<not> allowed in
prototypes set by C<Scalar::Util::set_prototype>.  Now it is permitted,
and the parser no longer strips whitespace.  This means
C<prototype &mysub> returns the original prototype, whitespace and all.

=head2 C<rand> now uses a consistent random number generator

Previously perl would use a platform specific random number generator, varying
between the libc rand(), random() or drand48().

This meant that the quality of perl's random numbers would vary from platform
to platform, from the 15 bits of rand() on Windows to 48 bits on POSIX
platforms such as Linux with drand48().

Perl now uses its own internal drand48() implementation on all platforms.  This
does not make perl's C<rand> cryptographically secure.  [perl #115928]

=head2 New slice syntax

The new C<%hash{...}> and C<%array[...]> syntax returns a list of key/value (or
index/value) pairs.  See L<perldata/"Key/Value Hash Slices">.
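
For instance, with sample data invented for this sketch:

  use v5.20;

  my %color = (red => 1, green => 2, blue => 3);
  my %some  = %color{qw(red blue)};    # (red => 1, blue => 3)

  my @name  = qw(zero one two three);
  my @pairs = %name[1, 3];             # (1, "one", 3, "three")

  say join ",", map { "$_=$some{$_}" } sort keys %some;  # blue=3,red=1
  say "@pairs";                                          # 1 one 3 three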

=head2 Experimental Postfix Dereferencing

When the C<postderef> feature is in effect, the following syntactical
equivalencies are set up:

  $sref->$*;  # same as ${ $sref }  # interpolates
  $aref->@*;  # same as @{ $aref }  # interpolates
  $href->%*;  # same as %{ $href }
  $cref->&*;  # same as &{ $cref }
  $gref->**;  # same as *{ $gref }

  $aref->$#*; # same as $#{ $aref }

  $gref->*{ $slot }; # same as *{ $gref }{ $slot }

  $aref->@[ ... ];  # same as @$aref[ ... ]  # interpolates
  $href->@{ ... };  # same as @$href{ ... }  # interpolates
  $aref->%[ ... ];  # same as %$aref[ ... ]
  $href->%{ ... };  # same as %$href{ ... }

Those marked as interpolating only interpolate if the associated
C<postderef_qq> feature is also enabled.  This feature is B<experimental> and
will trigger C<experimental::postderef>-category warnings when used, unless
they are suppressed.

For more information, consult L<the Postfix Dereference Syntax section of
perlref|perlref/Postfix Dereference Syntax>.
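
The following sketch exercises a few of these forms outside of string
interpolation.  The variable names and data are illustrative only.

  use v5.20;
  use feature "postderef";
  no warnings "experimental::postderef";

  my $aref = [10, 20, 30];
  my $href = { a => 1, b => 2 };

  my @all   = $aref->@*;          # (10, 20, 30)
  my $last  = $aref->$#*;         # 2, the last index
  my @first = $aref->@[0, 1];     # (10, 20)
  my %only  = $href->%{"a"};      # (a => 1)

  say "@all / $last / @first / $only{a}";
  # 10 20 30 / 2 / 10 20 / 1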

=head2 Unicode 6.3 now supported

Perl now supports and is shipped with Unicode 6.3 (though Perl may be
recompiled with any previous Unicode release as well).  A detailed list of
Unicode 6.3 changes is at L<http://www.unicode.org/versions/Unicode6.3.0/>.

=head2 New C<\p{Unicode}> regular expression pattern property

This is a synonym for C<\p{Any}> and matches the set of Unicode-defined
code points 0 - 0x10FFFF.

=head2 Better 64-bit support

On 64-bit platforms, the internal array functions now use 64-bit offsets,
allowing Perl arrays to hold more than 2**31 elements, if you have the memory
available.

The regular expression engine now supports strings longer than 2**31
characters.  [perl #112790, #116907]

The functions PerlIO_get_bufsiz, PerlIO_get_cnt, PerlIO_set_cnt and
PerlIO_set_ptrcnt now have SSize_t, rather than int, return values and
parameters.

=head2 C<S<use locale>> now works on UTF-8 locales

Until this release, only single-byte locales, such as the ISO 8859
series were supported.  Now, the increasingly common multi-byte UTF-8
locales are also supported.  A UTF-8 locale is one in which the
character set is Unicode and the encoding is UTF-8.  The POSIX
C<LC_CTYPE> category operations (case changing (like C<lc()>, C<"\U">),
and character classification (C<\w>, C<\D>, C<qr/[[:punct:]]/>)) under
such a locale work just as if not under locale, but instead as if under
C<S<use feature 'unicode_strings'>>, except taint rules are followed.
Sorting remains by code point order in this release.  [perl #56820].

=head2 C<S<use locale>> now compiles on systems without locale ability

Previously doing this caused the program to not compile.  Within its
scope the program behaves as if in the "C" locale.  Thus programs
written for platforms that support locales can run on locale-less
platforms without change.  Attempts to change the locale away from the
"C" locale will, of course, fail.

=head2 More locale initialization fallback options

If there was an error with locales during Perl start-up, it immediately
gave up and tried to use the C<"C"> locale.  Now it first tries using
other locales given by the environment variables, as detailed in
L<perllocale/ENVIRONMENT>.  For example, if C<LC_ALL> and C<LANG> are
both set, and using the C<LC_ALL> locale fails, Perl will now try the
C<LANG> locale, and only if that fails, will it fall back to C<"C">.  On
Windows machines, Perl will try, ahead of using C<"C">, the system
default locale if all the locales given by environment variables fail.

=head2 C<-DL> runtime option now added for tracing locale setting

This is designed for Perl core developers to aid in field debugging bugs
regarding locales.

=head2 B<-F> now implies B<-a> and B<-a> implies B<-n>

Previously B<-F> without B<-a> was a no-op, and B<-a> without B<-n> or B<-p>
was a no-op.  With this change, if you supply B<-F> then both B<-a> and B<-n>
are implied, and if you supply B<-a> then B<-n> is implied.

You can still use B<-p> for its extra behaviour. [perl #116190]

=head2 $a and $b warnings exemption

The special variables $a and $b, used in C<sort>, are now exempt from "used
once" warnings, even where C<sort> is not used.  This makes it easier for
CPAN modules to provide functions using $a and $b for similar purposes.
[perl #120462]

=head1 Security

=head2 Avoid possible read of free()d memory during parsing

It was possible that free()d memory could be read during parsing in the unusual
circumstance of the Perl program ending with a heredoc and the last line of the
file on disk having no terminating newline character.  This has now been fixed.

=head1 Incompatible Changes

=head2 C<do> can no longer be used to call subroutines

The C<do SUBROUTINE(LIST)> form has resulted in a deprecation warning
since Perl v5.0.0, and is now a syntax error.

=head2 Quote-like escape changes

The character after C<\c> in a double-quoted string ("..." or qq(...))
or regular expression must now be a printable character and may not be
C<{>.

A literal C<{> after C<\B> or C<\b> is now fatal.

These were deprecated in perl v5.14.0.

=head2 Tainting happens under more circumstances; now conforms to documentation

This affects regular expression matching and changing the case of a
string (C<lc>, C<"\U">, I<etc>.) within the scope of C<use locale>.
The result is now tainted based on the operation, no matter what the
contents of the string were, as the documentation (L<perlsec>,
L<perllocale/SECURITY>) indicates it should.  Previously, for the case
change operation, if the string contained no characters whose case
change could be affected by the locale, the result would not be tainted.
For example, the result of C<uc()> on an empty string or one containing
only above-Latin1 code points is now tainted, and wasn't before.  This
leads to more consistent tainting results.  Regular expression patterns
taint their non-binary results (like C<$&>, C<$2>) if and only if the
pattern contains elements whose matching depends on the current
(potentially tainted) locale.  Like the case changing functions, the
actual contents of the string being matched now do not matter, whereas
formerly it did.  For example, if the pattern contains a C<\w>, the
results will be tainted even if the match did not have to use that
portion of the pattern to succeed or fail, because what a C<\w> matches
depends on locale.  However, for example, a C<.> in a pattern will not
enable tainting, because the dot matches any single character, and what
the current locale is doesn't change in any way what matches and what
doesn't.

=head2 C<\p{}>, C<\P{}> matching has changed for non-Unicode code
points.

C<\p{}> and C<\P{}> are defined by Unicode only on Unicode-defined code
points (C<U+0000> through C<U+10FFFF>).  Their behavior on matching
these legal Unicode code points is unchanged, but there are changes for
code points C<0x110000> and above.  Previously, Perl treated the result
of matching C<\p{}> and C<\P{}> against these as C<undef>, which
translates into "false".  For C<\P{}>, this was then complemented into
"true".  A warning was supposed to be raised when this happened.
However, various optimizations could prevent the warning, and the
results were often counter-intuitive, with both a match and its seeming
complement being false.  Now all non-Unicode code points are treated as
typical unassigned Unicode code points.  This generally is more
Do-What-I-Mean.  A warning is raised only if the results are arguably
different from a strict Unicode approach, and from what Perl used to do.
Code that needs to be strictly Unicode compliant can make this warning
fatal, and then Perl always raises the warning.

Details are in L<perlunicode/Beyond Unicode code points>.

=head2 C<\p{All}> has been expanded to match all possible code points

The Perl-defined regular expression pattern element C<\p{All}>, unused
on CPAN, used to match just the Unicode code points; now it matches all
possible code points; that is, it is equivalent to C<qr/./s>.  Thus
C<\p{All}> is no longer synonymous with C<\p{Any}>, which continues to
match just the Unicode code points, as Unicode says it should.

=head2 Data::Dumper's output may change

Depending on the data structures dumped and the settings set for
Data::Dumper, the dumped output may have changed from previous
versions.

If you have tests that depend on the exact output of Data::Dumper,
they may fail.

To avoid this problem in your code, test against the data structure
from evaluating the dumped structure, instead of the dump itself.
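
One possible way to write such a test is sketched below, assuming the dump
comes from your own trusted code (never C<eval> a dump from an untrusted
source).  The sample data and test name are made up for this illustration.

  use strict;
  use warnings;
  use Data::Dumper;
  use Test::More;

  my $data = { name => "perl", versions => [ 18, 20 ] };

  # Dumper() emits '$VAR1 = ...;', so relax strict vars for the eval
  my $copy = do { no strict "vars"; eval Dumper($data) };
  die "could not evaluate dump: $@" if $@;

  # compare the structures, not the text of the dump
  is_deeply($copy, $data, "round trip through Data::Dumper");
  done_testing();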

=head2 Locale decimal point character no longer leaks outside of S<C<use locale>> scope

This is actually a bug fix, but some code has come to rely on the bug
being present, so this change is listed here.  The current locale that
the program is running under is not supposed to be visible to Perl code
except within the scope of a S<C<use locale>>.  However, until now under
certain circumstances, the character used for a decimal point (often a
comma) leaked outside the scope.  If your code is affected by this
change, simply add a S<C<use locale>>.

=head2 Assignments of Windows sockets error codes to $! now prefer F<errno.h> values over WSAGetLastError() values

In previous versions of Perl, Windows sockets error codes as returned by
WSAGetLastError() were assigned to $!, and some constants such as ECONNABORTED,
not in F<errno.h> in VC++ (or the various Windows ports of gcc) were defined to
corresponding WSAE* values to allow $! to be tested against the E* constants
exported by L<Errno> and L<POSIX>.

This worked well until VC++ 2010 and later, which introduced new E* constants
with values E<gt> 100 into F<errno.h>, including some being (re)defined by perl
to WSAE* values.  That caused problems when linking XS code against other
libraries which used the original definitions of F<errno.h> constants.

To avoid this incompatibility, perl now maps WSAE* error codes to E* values
where possible, and assigns those values to $!.  The E* constants exported by
L<Errno> and L<POSIX> are updated to match so that testing $! against them,
wherever previously possible, will continue to work as expected, and all E*
constants found in F<errno.h> are now exported from those modules with their
original F<errno.h> values.

In order to avoid breakage in existing Perl code which assigns WSAE* values to
$!, perl now intercepts the assignment and performs the same mapping to E*
values as it uses internally when assigning to $! itself.

However, one backwards-incompatibility remains: existing Perl code which
compares $! against the numeric values of the WSAE* error codes that were
previously assigned to $! will now be broken in those cases where a
corresponding E* value has been assigned instead.  This is only an issue for
those E* values E<lt> 100, which were always exported from L<Errno> and
L<POSIX> with their original F<errno.h> values, and therefore could not be used
for WSAE* error code tests (e.g. WSAEINVAL is 10022, but the corresponding
EINVAL is 22).  (E* values E<gt> 100, if present, were redefined to WSAE*
values anyway, so compatibility can be achieved by using the E* constants,
which will work both before and after this change, albeit using different
numeric values under the hood.)
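
As a rough illustration of the portable style recommended above, the
following sketch tests C<$!> against a symbolic constant from L<Errno>
rather than a hard-coded number.  Assigning C<EINVAL> to C<$!> by hand is
only a stand-in for a failed system call; real code would not do this.

    use strict;
    use warnings;
    use Errno qw(EINVAL);

    $! = EINVAL;                  # simulate a failure, for demonstration only

    # Portable tests: the symbolic constant, or the %! hash.
    my $by_constant = ($! == EINVAL) ? 1 : 0;
    my $by_hash     = $!{EINVAL}     ? 1 : 0;

    print "EINVAL detected\n" if $by_constant && $by_hash;

Because neither test mentions a numeric value, the same code should keep
working whether C<$!> holds an E* value or, on older perls, a WSAE* value
that the constant was redefined to.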

=head2 Functions C<PerlIO_vsprintf> and C<PerlIO_sprintf> have been removed

These two functions, undocumented, unused in CPAN, and problematic, have been
removed.

=head1 Deprecations

=head2 The C</\C/> character class

The C</\C/> regular expression character class is deprecated. From perl
5.22 onwards it will generate a warning, and from perl 5.24 onwards it
will be a regular expression compiler error. If you need to examine the
individual bytes that make up a UTF8-encoded character, then use
C<utf8::encode()> on the string (or a copy) first.
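
A minimal sketch of that replacement, assuming the goal is simply to list
the bytes of a single character.  The variable names are illustrative only.

    use strict;
    use warnings;

    my $char  = "\x{263A}";       # WHITE SMILING FACE, one character
    my $bytes = $char;            # work on a copy; utf8::encode modifies in place
    utf8::encode($bytes);

    # Each element of the string is now one byte of the UTF-8 encoding.
    my @octets = map { sprintf "%02X", ord } split //, $bytes;
    print "@octets\n";            # E2 98 BA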

=head2 Literal control characters in variable names

This deprecation affects things like $\cT, where \cT is a literal control (such
as a C<NAK> or C<NEGATIVE ACKNOWLEDGE> character) in
the source code.  Surprisingly, it appears that originally this was intended as
the canonical way of accessing variables like $^T, with the caret form only
being added as an alternative.

The literal control form is being deprecated for two main reasons.  It has what
are likely unfixable bugs, such as $\cI not working as an alias for $^I, and
its usage is not portable to non-ASCII platforms: while $^T will work
everywhere, \cT is whitespace in EBCDIC.  [perl #119123]

=head2 References to non-integers and non-positive integers in C<$/>

Setting C<$/> to a reference to zero or a reference to a negative integer is
now deprecated, and will behave B<exactly> as though it was set to C<undef>.
If you want slurp behavior, set C<$/> to C<undef> explicitly.

Setting C<$/> to a reference to a non-integer is now forbidden and will
throw an error.  Perl has never documented what would happen in this
context, and while it used to behave the same as setting C<$/> to
the address of the reference, in future it may behave differently, so we
have forbidden this usage.
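
The two forms that remain supported can be sketched as follows.  The
in-memory filehandle and the record size of 4 are arbitrary choices made
for the example.

    use strict;
    use warnings;

    my $data = "abcdefghij";
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

    # A reference to a positive integer still selects fixed-size records.
    my @records = do { local $/ = \4; <$fh> };    # ("abcd", "efgh", "ij")

    # For slurp mode, use undef explicitly rather than \0 or \-1.
    seek $fh, 0, 0 or die "Cannot rewind: $!";
    my $all = do { local $/ = undef; <$fh> };     # "abcdefghij"

    close $fh;

Localizing C<$/> inside a C<do> block keeps the change from leaking into
other code that reads from filehandles.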

=head2 Character matching routines in POSIX

Use of any of these functions in the C<POSIX> module is now deprecated:
C<isalnum>, C<isalpha>, C<iscntrl>, C<isdigit>, C<isgraph>, C<islower>,
C<isprint>, C<ispunct>, C<isspace>, C<isupper>, and C<isxdigit>.  The
functions are buggy and don't work on UTF-8 encoded strings.  See their
entries in L<POSIX> for more information.

A warning is raised on the first call to any of them from each place in
the code that they are called.  (Hence a repeated statement in a loop
will raise just the one warning.)
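
One possible replacement is a regular expression using a POSIX bracket
class.  The helper names below are hypothetical, and this is a sketch
rather than a drop-in equivalent: the helpers return false for an empty
string, which may differ from the functions they stand in for.

    use strict;
    use warnings;

    # Hypothetical helpers standing in for POSIX::isalpha and POSIX::isdigit.
    sub is_alpha { return $_[0] =~ /\A[[:alpha:]]+\z/ ? 1 : 0 }
    sub is_digit { return $_[0] =~ /\A[[:digit:]]+\z/ ? 1 : 0 }

    print is_alpha("Perl"), is_digit("2014"), is_alpha("5.20"), "\n";   # 110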

=head2 Interpreter-based threads are now I<discouraged>

The "interpreter-based threads" provided by Perl are not the fast, lightweight
system for multitasking that one might expect or hope for.  Threads are
implemented in a way that makes them easy to misuse.  Few people know how to
use them correctly or will be able to provide help.

The use of interpreter-based threads in perl is officially
L<discouraged|perlpolicy/discouraged>.

=head2 Module removals

The following modules will be removed from the core distribution in a
future release, and will at that time need to be installed from CPAN.
Distributions on CPAN which require these modules will need to list them as
prerequisites.

The core versions of these modules will now issue C<"deprecated">-category
warnings to alert you to this fact.  To silence these deprecation warnings,
install the modules in question from CPAN.

Note that the planned removal of these modules from core does not reflect a
judgement about the quality of the code and should not be taken as a suggestion
that their use be halted.  Their disinclusion from core primarily hinges on
their necessity to bootstrapping a fully functional, CPAN-capable Perl
installation, not on concerns over their design.

=over

=item L<CGI> and its associated CGI:: packages

=item L<inc::latest>

=item L<Package::Constants>

=item L<Module::Build> and its associated Module::Build:: packages

=back

=head2 Utility removals

The following utilities will be removed from the core distribution in a
future release, and will at that time need to be installed from CPAN.

=over 4

=item L<find2perl>

=item L<s2p>

=item L<a2p>

=back

=head1 Performance Enhancements

=over 4

=item *

Perl has a new copy-on-write mechanism that avoids the need to copy the
internal string buffer when assigning from one scalar to another. This
makes copying large strings appear much faster.  Modifying one of the two
(or more) strings after an assignment will force a copy internally. This
makes it unnecessary to pass strings by reference for efficiency.

This feature was already available in 5.18.0, but wasn't enabled by
default. It is the default now, and so you no longer need to build perl with
the F<Configure> argument:

    -Accflags=-DPERL_NEW_COPY_ON_WRITE

It can be disabled (for now) in a perl build with:

    -Accflags=-DPERL_NO_COW

On some operating systems Perl can be compiled in such a way that any
attempt to modify string buffers shared by multiple SVs will crash.  This
way XS authors can test that their modules handle copy-on-write scalars
correctly.  See L<perlguts/"Copy on Write"> for detail.

=item *

Perl has an optimizer for regular expression patterns.  It analyzes the pattern
to find things such as the minimum length a string has to be to match, etc.  It
now better handles code points that are above the Latin1 range.

=item *

Executing a regex that contains the C<^> anchor (or its variant under the
C</m> flag) has been made much faster in several situations.

=item *

Precomputed hash values are now used in more places during method lookup.

=item *

Constant hash key lookups (C<$hash{key}> as opposed to C<$hash{$key}>) have
long had the internal hash value computed at compile time, to speed up
lookup.  This optimisation has only now been applied to hash slices as
well.

=item *

Combined C<and> and C<or> operators in void context, like those
generated for C<< unless ($a && $b) >> and C<< if ($a || $b) >> now
short circuit directly to the end of the statement. [perl #120128]

=item *

In certain situations, when C<return> is the last statement in a subroutine's
main scope, it will be optimized out. This means code like:

  sub baz { return $cat; }

will now behave like:

  sub baz { $cat; }

which is notably faster.

[perl #120765]

=item *

Code like:

  my $x; # or @x, %x
  my $y;

is now optimized to:

  my ($x, $y);

In combination with the L<padrange optimization introduced in
v5.18.0|perl5180delta/Internal Changes>, this means longer uninitialized my
variable statements are also optimized, so:

  my $x; my @y; my %z;

becomes:

  my ($x, @y, %z);

[perl #121077]

=item *

The creation of certain sorts of lists, including array and hash slices, is now
faster.

=item *

The optimisation for arrays indexed with a small constant integer is now
applied for integers in the range -128..127, rather than 0..255. This should
speed up Perl code using expressions like C<$x[-1]>, at the expense of
(presumably much rarer) code using expressions like C<$x[200]>.

=item *

The first iteration over a large hash (using C<keys> or C<each>) is now
faster. This is achieved by preallocating the hash's internal iterator
state, rather than lazily creating it when the hash is first iterated. (For
small hashes, the iterator is still created only when first needed. The
assumption is that small hashes are more likely to be used as objects, and
therefore never iterated. For large hashes, that's less likely to be true,
and the cost of allocating the iterator is swamped by the cost of allocating
space for the hash itself.)

=item *

When doing a global regex match on a string that came from the C<readline>
or C<E<lt>E<gt>> operator, the data is no longer copied unnecessarily.
[perl #121259]

=item *

Dereferencing (as in C<$obj-E<gt>[0]> or C<$obj-E<gt>{k}>) is now faster
when C<$obj> is an instance of a class that has overloaded methods, but
doesn't overload any of the dereferencing methods C<@{}>, C<%{}>, and so on.

=item *

Perl's optimiser no longer skips optimising code that follows certain
C<eval {}> expressions (including those with an apparent infinite loop).

=item *

The implementation now does a better job of avoiding meaningless work at
runtime. Internal effect-free "null" operations (created as a side-effect of
parsing Perl programs) are normally deleted during compilation. That
deletion is now applied in some situations that weren't previously handled.

=item *

Perl now does less disk I/O when dealing with Unicode properties that cover
up to three ranges of consecutive code points.

=back
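
The copy-on-write mechanism described in the first item above is meant to
be invisible to Perl code.  The sketch below shows only the visible
semantics, namely that a copy stays independent once it is modified.  It
does not measure performance, and the string size is arbitrary.

    use strict;
    use warnings;

    my $big  = "x" x 1_000_000;
    my $copy = $big;              # may share the buffer internally until modified
    substr($copy, 0, 1) = "y";    # modifying the copy forces a private buffer

    print substr($big, 0, 1), substr($copy, 0, 1), "\n";   # xy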

=head1 Modules and Pragmata

=head2 New Modules and Pragmata

=over 4

=item *

L<experimental> 0.007 has been added to the Perl core.

=item *

L<IO::Socket::IP> 0.29 has been added to the Perl core.

=back

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Archive::Tar> has been upgraded from version 1.90 to 1.96.

=item *

L<arybase> has been upgraded from version 0.06 to 0.07.

=item *

L<Attribute::Handlers> has been upgraded from version 0.94 to 0.96.

=item *

L<attributes> has been upgraded from version 0.21 to 0.22.

=item *

L<autodie> has been upgraded from version 2.13 to 2.23.

=item *

L<AutoLoader> has been upgraded from version 5.73 to 5.74.

=item *

L<autouse> has been upgraded from version 1.07 to 1.08.

=item *

L<B> has been upgraded from version 1.42 to 1.48.

=item *

L<B::Concise> has been upgraded from version 0.95 to 0.992.

=item *

L<B::Debug> has been upgraded from version 1.18 to 1.19.

=item *

L<B::Deparse> has been upgraded from version 1.20 to 1.26.

=item *

L<base> has been upgraded from version 2.18 to 2.22.

=item *

L<Benchmark> has been upgraded from version 1.15 to 1.18.

=item *

L<bignum> has been upgraded from version 0.33 to 0.37.

=item *

L<Carp> has been upgraded from version 1.29 to 1.3301.

=item *

L<CGI> has been upgraded from version 3.63 to 3.65.
NOTE: L<CGI> is deprecated and may be removed from a future version of Perl.

=item *

L<charnames> has been upgraded from version 1.36 to 1.40.

=item *

L<Class::Struct> has been upgraded from version 0.64 to 0.65.

=item *

L<Compress::Raw::Bzip2> has been upgraded from version 2.060 to 2.064.

=item *

L<Compress::Raw::Zlib> has been upgraded from version 2.060 to 2.065.

=item *

L<Config::Perl::V> has been upgraded from version 0.17 to 0.20.

=item *

L<constant> has been upgraded from version 1.27 to 1.31.

=item *

L<CPAN> has been upgraded from version 2.00 to 2.05.

=item *

L<CPAN::Meta> has been upgraded from version 2.120921 to 2.140640.

=item *

L<CPAN::Meta::Requirements> has been upgraded from version 2.122 to 2.125.

=item *

L<CPAN::Meta::YAML> has been upgraded from version 0.008 to 0.012.

=item *

L<Data::Dumper> has been upgraded from version 2.145 to 2.151.

=item *

L<DB> has been upgraded from version 1.04 to 1.07.

=item *

L<DB_File> has been upgraded from version 1.827 to 1.831.

=item *

L<DBM_Filter> has been upgraded from version 0.05 to 0.06.

=item *

L<deprecate> has been upgraded from version 0.02 to 0.03.

=item *

L<Devel::Peek> has been upgraded from version 1.11 to 1.16.

=item *

L<Devel::PPPort> has been upgraded from version 3.20 to 3.21.

=item *

L<diagnostics> has been upgraded from version 1.31 to 1.34.

=item *

L<Digest::MD5> has been upgraded from version 2.52 to 2.53.

=item *

L<Digest::SHA> has been upgraded from version 5.84 to 5.88.

=item *

L<DynaLoader> has been upgraded from version 1.18 to 1.25.

=item *

L<Encode> has been upgraded from version 2.49 to 2.60.

=item *

L<encoding> has been upgraded from version 2.6_01 to 2.12.

=item *

L<English> has been upgraded from version 1.06 to 1.09.

C<$OLD_PERL_VERSION> was added as an alias of C<$]>.

=item *

L<Errno> has been upgraded from version 1.18 to 1.20_03.

=item *

L<Exporter> has been upgraded from version 5.68 to 5.70.

=item *

L<ExtUtils::CBuilder> has been upgraded from version 0.280210 to 0.280216.

=item *

L<ExtUtils::Command> has been upgraded from version 1.17 to 1.18.

=item *

L<ExtUtils::Embed> has been upgraded from version 1.30 to 1.32.

=item *

L<ExtUtils::Install> has been upgraded from version 1.59 to 1.67.

=item *

L<ExtUtils::MakeMaker> has been upgraded from version 6.66 to 6.98.

=item *

L<ExtUtils::Miniperl> has been upgraded to version 1.01.

=item *

L<ExtUtils::ParseXS> has been upgraded from version 3.18 to 3.24.

=item *

L<ExtUtils::Typemaps> has been upgraded from version 3.19 to 3.24.

=item *

L<ExtUtils::XSSymSet> has been upgraded from version 1.2 to 1.3.

=item *

L<feature> has been upgraded from version 1.32 to 1.36.

=item *

L<fields> has been upgraded from version 2.16 to 2.17.

=item *

L<File::Basename> has been upgraded from version 2.84 to 2.85.

=item *

L<File::Copy> has been upgraded from version 2.26 to 2.29.

=item *

L<File::DosGlob> has been upgraded from version 1.10 to 1.12.

=item *

L<File::Fetch> has been upgraded from version 0.38 to 0.48.

=item *

L<File::Find> has been upgraded from version 1.23 to 1.27.

=item *

L<File::Glob> has been upgraded from version 1.20 to 1.23.

=item *

L<File::Spec> has been upgraded from version 3.40 to 3.47.

=item *

L<File::Temp> has been upgraded from version 0.23 to 0.2304.

=item *

L<FileCache> has been upgraded from version 1.08 to 1.09.

=item *

L<Filter::Simple> has been upgraded from version 0.89 to 0.91.

=item *

L<Filter::Util::Call> has been upgraded from version 1.45 to 1.49.

=item *

L<Getopt::Long> has been upgraded from version 2.39 to 2.42.

=item *

L<Getopt::Std> has been upgraded from version 1.07 to 1.10.

=item *

L<Hash::Util::FieldHash> has been upgraded from version 1.10 to 1.15.

=item *

L<HTTP::Tiny> has been upgraded from version 0.025 to 0.043.

=item *

L<I18N::Langinfo> has been upgraded from version 0.10 to 0.11.

=item *

L<I18N::LangTags> has been upgraded from version 0.39 to 0.40.

=item *

L<if> has been upgraded from version 0.0602 to 0.0603.

=item *

L<inc::latest> has been upgraded from version 0.4003 to 0.4205.
NOTE: L<inc::latest> is deprecated and may be removed from a future version of Perl.

=item *

L<integer> has been upgraded from version 1.00 to 1.01.

=item *

L<IO> has been upgraded from version 1.28 to 1.31.

=item *

L<IO::Compress::Gzip> and friends have been upgraded from version 2.060 to
2.064.

=item *

L<IPC::Cmd> has been upgraded from version 0.80 to 0.92.

=item *

L<IPC::Open3> has been upgraded from version 1.13 to 1.16.

=item *

L<IPC::SysV> has been upgraded from version 2.03 to 2.04.

=item *

L<JSON::PP> has been upgraded from version 2.27202 to 2.27203.

=item *

L<List::Util> has been upgraded from version 1.27 to 1.38.

=item *

L<locale> has been upgraded from version 1.02 to 1.03.

=item *

L<Locale::Codes> has been upgraded from version 3.25 to 3.30.

=item *

L<Locale::Maketext> has been upgraded from version 1.23 to 1.25.

=item *

L<Math::BigInt> has been upgraded from version 1.9991 to 1.9993.

=item *

L<Math::BigInt::FastCalc> has been upgraded from version 0.30 to 0.31.

=item *

L<Math::BigRat> has been upgraded from version 0.2604 to 0.2606.

=item *

L<MIME::Base64> has been upgraded from version 3.13 to 3.14.

=item *

L<Module::Build> has been upgraded from version 0.4003 to 0.4205.
NOTE: L<Module::Build> is deprecated and may be removed from a future version of Perl.

=item *

L<Module::CoreList> has been upgraded from version 2.89 to 3.10.

=item *

L<Module::Load> has been upgraded from version 0.24 to 0.32.

=item *

L<Module::Load::Conditional> has been upgraded from version 0.54 to 0.62.

=item *

L<Module::Metadata> has been upgraded from version 1.000011 to 1.000019.

=item *

L<mro> has been upgraded from version 1.11 to 1.16.

=item *

L<Net::Ping> has been upgraded from version 2.41 to 2.43.

=item *

L<Opcode> has been upgraded from version 1.25 to 1.27.

=item *

L<Package::Constants> has been upgraded from version 0.02 to 0.04.
NOTE: L<Package::Constants> is deprecated and may be removed from a future version of Perl.

=item *

L<Params::Check> has been upgraded from version 0.36 to 0.38.

=item *

L<parent> has been upgraded from version 0.225 to 0.228.

=item *

L<Parse::CPAN::Meta> has been upgraded from version 1.4404 to 1.4414.

=item *

L<Perl::OSType> has been upgraded from version 1.003 to 1.007.

=item *

L<perlfaq> has been upgraded from version 5.0150042 to 5.0150044.

=item *

L<PerlIO> has been upgraded from version 1.07 to 1.09.

=item *

L<PerlIO::encoding> has been upgraded from version 0.16 to 0.18.

=item *

L<PerlIO::scalar> has been upgraded from version 0.16 to 0.18.

=item *

L<PerlIO::via> has been upgraded from version 0.12 to 0.14.

=item *

L<Pod::Escapes> has been upgraded from version 1.04 to 1.06.

=item *

L<Pod::Functions> has been upgraded from version 1.06 to 1.08.

=item *

L<Pod::Html> has been upgraded from version 1.18 to 1.21.

=item *

L<Pod::Parser> has been upgraded from version 1.60 to 1.62.

=item *

L<Pod::Perldoc> has been upgraded from version 3.19 to 3.23.

=item *

L<Pod::Usage> has been upgraded from version 1.61 to 1.63.

=item *

L<POSIX> has been upgraded from version 1.32 to 1.38_03.

=item *

L<re> has been upgraded from version 0.23 to 0.26.

=item *

L<Safe> has been upgraded from version 2.35 to 2.37.

=item *

L<Scalar::Util> has been upgraded from version 1.27 to 1.38.

=item *

L<SDBM_File> has been upgraded from version 1.09 to 1.11.

=item *

L<Socket> has been upgraded from version 2.009 to 2.013.

=item *

L<Storable> has been upgraded from version 2.41 to 2.49.

=item *

L<strict> has been upgraded from version 1.07 to 1.08.

=item *

L<subs> has been upgraded from version 1.01 to 1.02.

=item *

L<Sys::Hostname> has been upgraded from version 1.17 to 1.18.

=item *

L<Sys::Syslog> has been upgraded from version 0.32 to 0.33.

=item *

L<Term::Cap> has been upgraded from version 1.13 to 1.15.

=item *

L<Term::ReadLine> has been upgraded from version 1.12 to 1.14.

=item *

L<Test::Harness> has been upgraded from version 3.26 to 3.30.

=item *

L<Test::Simple> has been upgraded from version 0.98 to 1.001002.

=item *

L<Text::ParseWords> has been upgraded from version 3.28 to 3.29.

=item *

L<Text::Tabs> has been upgraded from version 2012.0818 to 2013.0523.

=item *

L<Text::Wrap> has been upgraded from version 2012.0818 to 2013.0523.

=item *

L<Thread> has been upgraded from version 3.02 to 3.04.

=item *

L<Thread::Queue> has been upgraded from version 3.02 to 3.05.

=item *

L<threads> has been upgraded from version 1.86 to 1.93.

=item *

L<threads::shared> has been upgraded from version 1.43 to 1.46.

=item *

L<Tie::Array> has been upgraded from version 1.05 to 1.06.

=item *

L<Tie::File> has been upgraded from version 0.99 to 1.00.

=item *

L<Tie::Hash> has been upgraded from version 1.04 to 1.05.

=item *

L<Tie::Scalar> has been upgraded from version 1.02 to 1.03.

=item *

L<Tie::StdHandle> has been upgraded from version 4.3 to 4.4.

=item *

L<Time::HiRes> has been upgraded from version 1.9725 to 1.9726.

=item *

L<Time::Piece> has been upgraded from version 1.20_01 to 1.27.

=item *

L<Unicode::Collate> has been upgraded from version 0.97 to 1.04.

=item *

L<Unicode::Normalize> has been upgraded from version 1.16 to 1.17.

=item *

L<Unicode::UCD> has been upgraded from version 0.51 to 0.57.

=item *

L<utf8> has been upgraded from version 1.10 to 1.13.

=item *

L<version> has been upgraded from version 0.9902 to 0.9908.

=item *

L<vmsish> has been upgraded from version 1.03 to 1.04.

=item *

L<warnings> has been upgraded from version 1.18 to 1.23.

=item *

L<Win32> has been upgraded from version 0.47 to 0.49.

=item *

L<XS::Typemap> has been upgraded from version 0.10 to 0.13.

=item *

L<XSLoader> has been upgraded from version 0.16 to 0.17.

=back

=head1 Documentation

=head2 New Documentation

=head3 L<perlrepository>

This document was removed (actually, renamed L<perlgit> and given a major
overhaul) in Perl v5.14, causing Perl documentation websites to show the now
out of date version in Perl v5.12 as the latest version.  It has now been
restored in stub form, directing readers to current information.

=head2 Changes to Existing Documentation

=head3 L<perldata>

=over 4

=item *

New sections have been added to document the new index/value array slice and
key/value hash slice syntax.

=back
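
For reference, the two slice forms can be sketched like this.  The variable
names and data are made up for the example.

    use strict;
    use warnings;
    use 5.020;                       # key/value slices need perl 5.20 or later

    my %colour = (red => 1, green => 2, blue => 3);
    my %subset = %colour{'red', 'blue'};      # (red => 1, blue => 3)

    my @letters  = qw(a b c);
    my %by_index = %letters[0, 2];            # (0 => 'a', 2 => 'c')

In both cases the leading C<%> asks for key/value (or index/value) pairs,
whereas a leading C<@> would return the values alone.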

=head3 L<perldebguts>

=over 4

=item *

The C<DB::goto> and C<DB::lsub> debugger subroutines are now documented.  [perl
#77680]

=back

=head3 L<perlexperiment>

=over

=item *

C<\s> matching C<\cK> is marked experimental.

=item *

ithreads were accepted in v5.8.0 (but are discouraged as of v5.20.0).

=item *

Long doubles are not considered experimental.

=item *

Code in regular expressions, regular expression backtracking verbs,
and lvalue subroutines are no longer listed as experimental.  (This
also affects L<perlre> and L<perlsub>.)

=back

=head3 L<perlfunc>

=over

=item *

C<chop> and C<chomp> now note that they can reset the hash iterator.

=item *

C<exec>'s handling of arguments is now more clearly documented.

=item *

C<eval EXPR> now has caveats about expanding floating point numbers in some
locales.

=item *

C<goto EXPR> is now documented to handle an expression that evaluates to a
code reference as if it was C<goto &$coderef>.  This behavior is at least ten
years old.

=item *

Since Perl v5.10, it has been possible for subroutines in C<@INC> to return
a reference to a scalar holding initial source code to prepend to the file.
This is now documented.

=item *

The documentation of C<ref> has been updated to recommend the use of
C<blessed>, C<isa> and C<reftype> when dealing with references to blessed
objects.

=back
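
The recommendation about C<ref> in the last item above can be illustrated
with a short sketch.  The class name C<My::Class> is hypothetical.

    use strict;
    use warnings;
    use Scalar::Util qw(blessed reftype);

    my $obj = bless [], 'My::Class';              # hypothetical class name

    my $class = blessed($obj);                    # 'My::Class'
    my $type  = reftype($obj);                    # 'ARRAY'
    my $is_a  = $obj->isa('My::Class') ? 1 : 0;   # 1

    print "$class $type $is_a\n";                 # My::Class ARRAY 1

C<blessed> and C<reftype> separate the class name from the underlying
reference type, which a single call to C<ref> cannot do for a blessed
reference.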

=head3 L<perlguts>

=over 4

=item *

Numerous minor changes have been made to reflect changes made to the perl
internals in this release.

=item *

New sections on L<Read-Only Values|perlguts/"Read-Only Values"> and
L<Copy on Write|perlguts/"Copy on Write"> have been added.

=back

=head3 L<perlhack>

=over 4

=item *

The L<Super Quick Patch Guide|perlhack/SUPER QUICK PATCH GUIDE> section has
been updated.

=back

=head3 L<perlhacktips>

=over 4

=item *

The documentation has been updated to include some more examples of C<gdb>
usage.

=back

=head3 L<perllexwarn>

=over 4

=item *

The L<perllexwarn> documentation used to describe the hierarchy of warning
categories understood by the L<warnings> pragma. That description has now
been moved to the L<warnings> documentation itself, leaving L<perllexwarn>
as a stub that points to it. This change consolidates all documentation for
lexical warnings in a single place.

=back

=head3 L<perllocale>

=over

=item *

The documentation now mentions C<fc()> and C<\F>, and includes many
clarifications and corrections in general.

=back

=head3 L<perlop>

=over 4

=item *

The language design of Perl has always called for monomorphic operators.
This is now mentioned explicitly.

=back

=head3 L<perlopentut>

=over 4

=item *

The C<open> tutorial has been completely rewritten by Tom Christiansen, and now
focuses on covering only the basics, rather than providing a comprehensive
reference to all things openable.  This rewrite came as the result of a
vigorous discussion on perl5-porters kicked off by a set of improvements
written by Alexander Hartmaier to the existing L<perlopentut>.  A "more than
you ever wanted to know about C<open>" document may follow in subsequent
versions of perl.

=back

=head3 L<perlre>

=over 4

=item *

The fact that the regexp engine makes no effort to call (?{}) and (??{})
constructs any specified number of times (although it will basically DWIM
in case of a successful match) has been documented.

=item *

The C</r> modifier (for non-destructive substitution) is now documented. [perl
#119151]

=item *

The documentation for C</x> and C<(?# comment)> has been expanded and clarified.

=back
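
As a brief reminder of what the C</r> modifier does, here is a minimal
sketch.  The strings are arbitrary.

    use strict;
    use warnings;

    my $original = "hello world";
    my $changed  = $original =~ s/world/Perl/r;   # returns the result instead

    print "$original -> $changed\n";              # hello world -> hello Perl

With C</r>, the substitution leaves C<$original> untouched and returns the
modified copy.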

=head3 L<perlreguts>

=over 4

=item *

The documentation has been updated in the light of recent changes to
F<regcomp.c>.

=back

=head3 L<perlsub>

=over 4

=item *

The need to predeclare recursive functions with prototypes in order for the
prototype to be honoured in the recursive call is now documented. [perl #2726]

=item *

A list of subroutine names used by the perl implementation is now included.
[perl #77680]

=back
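
The predeclaration requirement in the first item above can be sketched as
follows.  The function C<total> is hypothetical and exists only to show a
recursive call that depends on its own prototype.

    use strict;
    use warnings;

    sub total(\@);                # forward declaration makes the prototype visible

    sub total(\@) {
        my ($aref) = @_;
        return 0 unless @$aref;
        my ($head, @tail) = @$aref;
        return $head + total(@tail);   # prototype turns @tail into \@tail
    }

    my @nums = (1, 2, 3);
    print total(@nums), "\n";     # 6

Without the forward declaration, the recursive call would be compiled
before the prototype is known, so C<@tail> would be flattened into a list
instead of being passed as a reference.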

=head3 L<perltrap>

=over 4

=item *

There is now a L<JavaScript|perltrap/JavaScript Traps> section.

=back

=head3 L<perlunicode>

=over 4

=item *

The documentation has been updated to reflect C<Bidi_Class> changes in
Unicode 6.3.

=back

=head3 L<perlvar>

=over 4

=item *

A new section explaining the performance issues of $`, $& and $', including
workarounds and changes in different versions of Perl, has been added.

=item *

Three L<English> variable names which have long been documented but do not
actually exist have been removed from the documentation.  These were
C<$OLD_PERL_VERSION>, C<$OFMT>, and C<$ARRAY_BASE>.

(Actually, C<$OLD_PERL_VERSION> I<does> exist, starting with this revision, but
remained undocumented until perl 5.22.0.)

=back

=head3 L<perlxs>

=over 4

=item *

Several problems in the C<MY_CXT> example have been fixed.

=back

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 New Diagnostics

=head3 New Errors

=over 4

=item *

L<delete argument is indexE<sol>value array slice, use array slice|perldiag/"delete argument is index/value array slice, use array slice">

(F) You used index/value array slice syntax (C<%array[...]>) as the argument to
C<delete>.  You probably meant C<@array[...]> with an @ symbol instead.

=item *

L<delete argument is keyE<sol>value hash slice, use hash slice|perldiag/"delete argument is key/value hash slice, use hash slice">

(F) You used key/value hash slice syntax (C<%hash{...}>) as the argument to
C<delete>.  You probably meant C<@hash{...}> with an @ symbol instead.

=item *

L<Magical list constants are not supported|perldiag/"Magical list constants are not supported">

(F) You assigned a magical array to a stash element, and then tried to use the
subroutine from the same slot.  You are asking Perl to do something it cannot
do, details subject to change between Perl versions.

=item *

L<Setting $E<sol> to a %s reference is forbidden|perldiag/"Setting $E<sol> to a %s reference is forbidden">

=back
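
For the two C<delete> errors above, the form that Perl accepts is the plain
slice with a leading C<@>.  A minimal sketch, with made-up data:

    use strict;
    use warnings;

    my %h = (a => 1, b => 2, c => 3);
    my @removed = delete @h{qw(a b)};     # hash slice: returns (1, 2)

    print "@removed; remaining: ", join(",", sort keys %h), "\n";
    # 1 2; remaining: c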

=head3 New Warnings

=over 4

=item *

L<%s on reference is experimental|perldiag/"push on reference is experimental">:

The "auto-deref" feature is experimental.

Starting in v5.14.0, it was possible to use push, pop, keys, and other
built-in functions not only on aggregate types, but on references to
them.  The feature was not deployed to its original intended
specification, and now may become redundant to postfix dereferencing.
It has always been categorized as an experimental feature, and in
v5.20.0 it carries a warning as such.

Warnings will now be issued at compile time when these operations are
detected.  To suppress them in code that must also run on earlier perls,
use:

  no if $] >= 5.01908, warnings => "experimental::autoderef";

Consider, though, replacing the use of these features, as they may
change behavior again before becoming stable.

=item *

L<A sequence of multiple spaces in a charnames alias definition is deprecated|perldiag/"A sequence of multiple spaces in a charnames alias definition is deprecated">

L<Trailing white-space in a charnames alias definition is deprecated|perldiag/"Trailing white-space in a charnames alias definition is deprecated">

These two deprecation warnings involving C<\N{...}> were incorrectly
implemented.  They did not warn by default (now they do) and could not be
made fatal via C<< use warnings FATAL => 'deprecated' >> (now they can).

=item *

L<Attribute prototype(%s) discards earlier prototype attribute in same sub|perldiag/"Attribute prototype(%s) discards earlier prototype attribute in same sub">

(W misc) A sub was declared as C<sub foo : prototype(A) : prototype(B) {}>, for
example.  Since each sub can only have one prototype, the earlier
declaration(s) are discarded while the last one is applied.

=item *

L<Invalid \0 character in %s for %s: %s\0%s|perldiag/"Invalid \0 character in %s for %s: %s\0%s">

(W syscalls) Embedded \0 characters in pathnames or other system call arguments
produce a warning as of 5.20.  The parts after the \0 were formerly ignored by
system calls.

=item *

L<Matched non-Unicode code point 0x%X against Unicode property; may not be portable|perldiag/"Matched non-Unicode code point 0x%X against Unicode property; may not be portable">.

This replaces the message "Code point 0x%X is not Unicode, all \p{} matches
fail; all \P{} matches succeed".

=item *

L<Missing ']' in prototype for %s : %s|perldiag/"Missing ']' in prototype for %s : %s">

(W illegalproto) A grouping was started with C<[> but never closed with C<]>.

=item *

L<Possible precedence issue with control flow operator|perldiag/"Possible precedence issue with control flow operator">

(W syntax) There is a possible problem with the mixing of a control flow
operator (e.g. C<return>) and a low-precedence operator like C<or>.  Consider:

    sub { return $a or $b; }

This is parsed as:

    sub { (return $a) or $b; }

Which is effectively just:

    sub { return $a; }

Either use parentheses or the high-precedence variant of the operator.

Note this may be also triggered for constructs like:

    sub { 1 if die; }

=item *

L<Postfix dereference is experimental|perldiag/"Postfix dereference is experimental">

(S experimental::postderef) This warning is emitted if you use the experimental
postfix dereference syntax.  Simply suppress the warning if you want to use the
feature, but know that in doing so you are taking the risk of using an
experimental feature which may change or be removed in a future Perl version:

    no warnings "experimental::postderef";
    use feature "postderef", "postderef_qq";
    $ref->$*;
    $aref->@*;
    $aref->@[@indices];
    ... etc ...

=item *

L<Prototype '%s' overridden by attribute 'prototype(%s)' in %s|perldiag/"Prototype '%s' overridden by attribute 'prototype(%s)' in %s">

(W prototype) A prototype was declared in both the parentheses after the sub
name and via the prototype attribute.  The prototype in parentheses is useless,
since it will be replaced by the prototype from the attribute before it's ever
used.

=item *

L<Scalar value @%s[%s] better written as $%s[%s]|perldiag/"Scalar value @%s[%s] better written as $%s[%s]">

(W syntax) In scalar context, you've used an array index/value slice (indicated
by %) to select a single element of an array.  Generally it's better to ask for
a scalar value (indicated by $).  The difference is that C<$foo[&bar]> always
behaves like a scalar, both in the value it returns and when evaluating its
argument, while C<%foo[&bar]> provides a list context to its subscript, which
can do weird things if you're expecting only one subscript.  When called in
list context, it also returns the index (what C<&bar> returns) in addition to
the value.

=item *

L<Scalar value @%s{%s} better written as $%s{%s}|perldiag/"Scalar value @%s{%s} better written as $%s{%s}">

(W syntax) In scalar context, you've used a hash key/value slice (indicated by
%) to select a single element of a hash.  Generally it's better to ask for a
scalar value (indicated by $).  The difference is that C<$foo{&bar}> always
behaves like a scalar, both in the value it returns and when evaluating its
argument, while C<%foo{&bar}> provides a list context to its subscript,
which can do weird things if you're expecting only one subscript.  When called
in list context, it also returns the key in addition to the value.

=item *

L<Setting $E<sol> to a reference to %s as a form of slurp is deprecated, treating as undef|perldiag/"Setting $E<sol> to a reference to %s as a form of slurp is deprecated, treating as undef">

=item *

L<Unexpected exit %u|perldiag/"Unexpected exit %u">

(S) exit() was called or the script otherwise finished gracefully when
C<PERL_EXIT_WARN> was set in C<PL_exit_flags>.

=item *

L<Unexpected exit failure %d|perldiag/"Unexpected exit failure %d">

(S) An uncaught die() was called when C<PERL_EXIT_WARN> was set in
C<PL_exit_flags>.

=item *

L<Use of literal control characters in variable names is deprecated|perldiag/"Use of literal control characters in variable names is deprecated">

(D deprecated) Using literal control characters in the source to refer to the
^FOO variables, like $^X and ${^GLOBAL_PHASE} is now deprecated.  This only
affects code like $\cT, where \cT is a control (like a C<SOH>) in the
source code: ${"\cT"} and $^T remain valid.
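
As a rough illustration of the spellings that remain valid, the sketch
below reads C<$^T> through caret notation and again through a symbolic
reference whose name is written with a string escape, so no raw control
character appears in the source.  The variable names are hypothetical:

    use strict;
    use warnings;

    # Caret notation is unaffected by the deprecation.
    my $start = $^T;
    my $phase = ${^GLOBAL_PHASE};

    # The same variable, looked up by name.  "\cT" here is an escape
    # inside a string literal, not a literal control character.
    my $same = do { no strict 'refs'; ${"\cT"} };

    print "start: $start, phase: $phase\n";
    print "both spellings refer to the same variable\n" if $same == $start;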

=item *

L<Useless use of greediness modifier|perldiag/"Useless use of greediness modifier '%c' in regex; marked by <-- HERE in m/%s/">

This fixes [Perl #42957].

=back

=head2 Changes to Existing Diagnostics

=over 4

=item *

Warnings and errors from the regexp engine are now UTF-8 clean.

=item *

The "Unknown switch condition" error message has some slight changes.  This
error triggers when there is an unknown condition in a C<(?(foo))> conditional.
The error message used to read:

    Unknown switch condition (?(%s in regex;

But what %s could be was mostly up to luck.  For C<(?(foobar))>, you might have
seen "fo" or "f".  For Unicode characters, you would generally get a corrupted
string.  The message has been changed to read:

    Unknown switch condition (?(...)) in regex;

Additionally, the C<'E<lt>-- HERE'> marker in the error will now point to the
correct spot in the regex.
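
One way to see the reworded message is to compile a pattern with an
invalid condition at run time and trap the failure.  This is only a
sketch; the pattern and variable names are made up for the example:

    use strict;
    use warnings;

    # Build the pattern from a string so the error can be caught by eval
    # instead of aborting compilation of the script.
    my $source = '(?(foobar)a|b)';    # "foobar" is not a valid condition
    my $regex  = eval { qr/$source/ };
    my $error  = $@;

    print $error;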

=item *

The "%s "\x%X" does not map to Unicode" warning is now correctly listed as a
severe warning rather than as a fatal error.

=item *

Under rare circumstances, one could get a "Can't coerce readonly REF to
string" instead of the customary "Modification of a read-only value".  This
alternate error message has been removed.

=item *

"Ambiguous use of * resolved as operator *": This and similar warnings
about "%" and "&" used to occur in some circumstances where there was no
operator of the type cited, so the warning was completely wrong.  This has
been fixed [perl #117535, #76910].

=item *

Warnings about malformed subroutine prototypes are now more consistent in
how the prototypes are rendered.  Some of these warnings would truncate
prototypes containing nulls.  In other cases one warning would suppress
another.  The warning about illegal characters in prototypes no longer says
"after '_'" if the bad character came before the underscore.

=item *

L<Perl folding rules are not up-to-date for 0x%X; please use the perlbug
utility to report; in regex; marked by <-- HERE in
mE<sol>%sE<sol>|perldiag/"Perl folding rules are not up-to-date for 0x%X;
please use the perlbug utility to report; in regex; marked by <-- HERE in
m/%s/">

This message is now only in the regexp category, and not in the deprecated
category.  It is still a default (i.e., severe) warning [perl #89648].

=item *

L<%%s[%s] in scalar context better written as $%s[%s]|perldiag/"%%s[%s] in scalar context better written as $%s[%s]">

This warning now occurs for any C<%array[$index]> or C<%hash{key}> known to
be in scalar context at compile time.  Previously it was worded "Scalar
value %%s[%s] better written as $%s[%s]".

=item *

L<Switch condition not recognized in regex; marked by <-- HERE in mE<sol>%sE<sol>|perldiag/"Switch condition not recognized in regex; marked by <-- HERE in m/%s/">:

The description for this diagnostic has been extended to cover all cases where the warning may occur.
Issues with the positioning of the arrow indicator have also been resolved.

=item *

The error messages for C<my($a?$b:$c)> and C<my(do{})> now mention "conditional
expression" and "do block", respectively, instead of reading 'Can't declare
null operation in "my"'.

=item *

When C<use re "debug"> executes a regex containing a backreference, the
debugging output now shows what string is being matched.

=item *

The now fatal error message C<Character following "\c" must be ASCII> has been
reworded as C<Character following "\c" must be printable ASCII> to emphasize
that in C<\cI<X>>, I<X> must be a I<printable (non-control)> ASCII character.

=back

=head1 Utility Changes

=head3 L<a2p>

=over 4

=item *

A possible crash from an off-by-one error when trying to access before the
beginning of a buffer has been fixed.  [perl #120244]

=back

=head3 F<bisect.pl>

The git bisection tool F<Porting/bisect.pl> has had many enhancements.

It is provided as part of the source distribution but not installed because
it is not self-contained as it relies on being run from within a git
checkout. Note also that it makes no attempt to fix tests, correct runtime
bugs or make something useful to install - its purpose is to make minimal
changes to get any historical revision of interest to build and run as close
as possible to "as-was", and thereby make C<git bisect> easy to use.

=over 4

=item *

Can optionally run the test case with a timeout.

=item *

Can now run in-place in a clean git checkout.

=item *

Can run the test case under C<valgrind>.

=item *

Can apply user supplied patches and fixes to the source checkout before
building.

=item *

Now has fixups to enable building several more historical ranges of bleadperl,
which can be useful for pinpointing the origins of bugs or behaviour changes.

=back

=head3 L<find2perl>

=over 4

=item *

L<find2perl> now handles C<?> wildcards correctly.  [perl #113054]

=back

=head3 L<perlbug>

=over 4

=item *

F<perlbug> now has a C<-p> option for attaching patches with a bug report.

=item *

L<perlbug> has been modified to supply the report template with CRLF line
endings on Windows.
[L<perl #121277|https://rt.perl.org/Public/Bug/Display.html?id=121277>]

=item *

L<perlbug> now makes as few assumptions as possible about the encoding of the
report.  This will likely change in the future to assume UTF-8 by default but
allow a user override.

=back

=head1 Configuration and Compilation

=over 4

=item *

The F<Makefile.PL> for L<SDBM_File> now generates a better F<Makefile>, which
avoids a race condition during parallel makes, which could cause the build to
fail.  This is the last known parallel make problem (on *nix platforms), and
therefore we believe that a parallel make should now always be error free.

=item *

F<installperl> and F<installman>'s option handling has been refactored to use
L<Getopt::Long>. Both are used by the F<Makefile> C<install> targets, and
are not installed, so these changes are only likely to affect custom
installation scripts.

=over 4

=item *

Single letter options now also have long names.

=item *

Invalid options are now rejected.

=item *

Command line arguments that are not options are now rejected.

=item *

Each now has a C<--help> option to display the usage message.

=back

The behaviour for all valid documented invocations is unchanged.

=item *

Where possible, the build now avoids recursive invocations of F<make> when
building pure-Perl extensions, without removing any parallelism from the
build. Currently around 80 extensions can be processed directly by the
F<make_ext.pl> tool, meaning that 80 invocations of F<make> and 160
invocations of F<miniperl> are no longer made.

=item *

The build system now works correctly when compiling under GCC or Clang with
link-time optimization enabled (the C<-flto> option). [perl #113022]

=item *

Distinct library basenames with C<d_libname_unique>.

When compiling perl with this option, the library files for XS modules are
named something "unique" -- for example, Hash/Util/Util.so becomes
Hash/Util/PL_Hash__Util.so.  This behavior is similar to what currently
happens on VMS, and serves as groundwork for the Android port.

=item *

C<sysroot> option to indicate the logical root directory under gcc and clang.

When building with this option set, both Configure and the compilers search
for all headers and libraries under this new sysroot, instead of /.

This is a huge time saver if cross-compiling, but can also help
on native builds if your toolchain's files have non-standard locations.

=item *

The cross-compilation model has been renovated.
There are several new options, and some backwards-incompatible changes:

We now build binaries for miniperl and generate_uudmap to be used on the host,
rather than running every miniperl call on the target; this means that, short
of 'make test', we no longer need access to the target system once Configure is
done.  You can provide already-built binaries through the C<hostperl> and
C<hostgenerate> options to Configure.

Additionally, if targeting an EBCDIC platform from an ASCII host,
or vice versa, you'll need to run Configure with C<-Uhostgenerate>, to
indicate that generate_uudmap should be run on the target.

Finally, there's also a way of having Configure end early, right after
building the host binaries, by cross-compiling without specifying a
C<targethost>.

The incompatible changes include no longer using xconfig.h, xlib, or
Cross.pm, so canned config files and Makefiles will have to be updated.

=item *

Related to the above, there is now a way of specifying the location of sh
(or equivalent) on the target system: C<targetsh>.

For example, Android has its sh in /system/bin/sh, so if cross-compiling
from a more normal Unixy system with sh in /bin/sh, "targetsh" would end
up as /system/bin/sh, and "sh" as /bin/sh.

=item *

By default, B<gcc> 4.9 does some optimizations that break perl.  The B<-fwrapv>
option disables those optimizations (and probably others), so for B<gcc> 4.3
and later (since there might be similar problems lurking on older versions
too, but B<-fwrapv> was broken before 4.3, and the optimizations probably won't
go away), F<Configure> now adds B<-fwrapv> unless the user requests
B<-fno-wrapv>, which disables B<-fwrapv>, or B<-fsanitize=undefined>, which
turns the overflows B<-fwrapv> ignores into runtime errors.
[L<perl #121505|https://rt.perl.org/Public/Bug/Display.html?id=121505>]

=back

=head1 Testing

=over 4

=item *

The C<test.valgrind> make target now allows tests to be run in parallel.
This target allows Perl's test suite to be run under Valgrind, which detects
certain sorts of C programming errors, though at significant cost in running
time. On suitable hardware, allowing parallel execution claws back a lot of
that additional cost. [perl #121431]

=item *

Various tests in F<t/porting/> are no longer skipped when the perl
F<.git> directory is outside the perl tree and pointed to by
C<$GIT_DIR>. [perl #120505]

=item *

The test suite no longer fails when the user's interactive shell maintains a
C<$PWD> environment variable, but the F</bin/sh> used for running tests
doesn't.

=back

=head1 Platform Support

=head2 New Platforms

=over 4

=item Android

Perl can now be built for Android, either natively or through
cross-compilation, for all three currently available architectures (ARM,
MIPS, and x86), on a wide range of versions.

=item Bitrig

Compile support has been added for Bitrig, a fork of OpenBSD.

=item FreeMiNT

Support has been added for FreeMiNT, a free open-source OS for the Atari ST
system and its successors, based on the original MiNT that was officially
adopted by Atari.

=item Synology

Synology ships its NAS boxes with a lean Linux distribution (DSM) on relatively
cheap CPUs (like the Marvell Kirkwood mv6282 - ARMv5tel or Freescale QorIQ
P1022 ppc - e500v2) not meant for workstations or development. These boxes
should build now. The basic problem is the non-standard location of tools.

=back

=head2 Discontinued Platforms

=over 4

=item C<sfio>

Code related to supporting the C<sfio> I/O system has been removed.

Perl 5.004 added support to use the native API of C<sfio>, AT&T's Safe/Fast
I/O library. This code still built with v5.8.0, albeit with many regression
tests failing, but was inadvertently broken before the v5.8.1 release,
meaning that it has not worked on any version of Perl released since then.
In over a decade we have received no bug reports about this, hence it is clear
that no-one is using this functionality on any version of Perl that is still
supported to any degree.

=item AT&T 3b1

Configure support for the 3b1, also known as the AT&T Unix PC (and the similar
AT&T 7300), has been removed.

=item DG/UX

DG/UX was a Unix sold by Data General. The last release was in April 2001.
It only runs on Data General's own hardware.

=item EBCDIC

In the absence of a regular source of smoke reports, code intended to support
native EBCDIC platforms will be removed from perl before 5.22.0.

=back

=head2 Platform-Specific Notes

=over 4

=item Cygwin

=over 4

=item *

recv() on a connected handle would populate the returned sender
address with whatever happened to be in the working buffer.  recv()
now uses a workaround similar to the Win32 recv() wrapper and returns
an empty string when recvfrom(2) doesn't modify the supplied address
length. [perl #118843]

=item *

Fixed a build error in cygwin.c on Cygwin 1.7.28.

Tests now handle the errors that occur when C<cygserver> isn't
running.

=back

=item GNU/Hurd

The BSD compatibility library C<libbsd> is no longer required for builds.

=item Linux

The hints file now looks for C<libgdbm_compat> only if C<libgdbm> itself is
also wanted. The former is never useful without the latter, and in some
circumstances, including it could actually prevent building.

=item Mac OS

The build system now honors an C<ld> setting supplied by the user running
F<Configure>.

=item MidnightBSD

C<objformat> was removed from version 0.4-RELEASE of MidnightBSD and had been
deprecated on earlier versions.  This caused the build environment to be
erroneously configured for C<a.out> rather than C<elf>.  This has now been
corrected.

=item Mixed-endian platforms

The code supporting C<pack> and C<unpack> operations on mixed endian
platforms has been removed. We believe that Perl has long been unable to
build on mixed endian architectures (such as PDP-11s), so we don't think
that this change will affect any platforms which were able to build v5.18.0.

=item VMS

=over 4

=item *

The C<PERL_ENV_TABLES> feature to control the population of %ENV at perl
start-up was broken in Perl 5.16.0 but has now been fixed.

=item *

Skip access checks on remotes in opendir().  [perl #121002]

=item *

A check for glob metacharacters in a path returned by the
L<C<glob()>|perlfunc/glob> operator has been replaced with a check for VMS
wildcard characters.  This saves a significant number of unnecessary
L<C<lstat()>|perlfunc/lstat> calls such that some simple glob operations become
60-80% faster.

=back

=item Win32

=over 4

=item *

C<rename> and C<link> on Win32 now set $! to ENOSPC and EDQUOT when
appropriate.  [perl #119857]

=item *

The BUILD_STATIC and ALL_STATIC makefile options for linking some or (nearly)
all extensions statically (into perl520.dll, and into a separate
perl-static.exe too) were broken for MinGW builds. This has now been fixed.

The ALL_STATIC option has also been improved to include the Encode and Win32
extensions (for both VC++ and MinGW builds).

=item *

Support for building with Visual C++ 2013 has been added.  There are currently
two possible test failures (see L<perlwin32/"Testing Perl on Windows">) which
will hopefully be resolved soon.

=item *

Experimental support for building with Intel C++ Compiler has been added.  The
nmake makefile (win32/Makefile) and the dmake makefile (win32/makefile.mk) can
be used.  A "nmake test" will not pass at this time due to F<cpan/CGI/t/url.t>.

=item *

Killing a process tree with L<perlfunc/kill> and a negative signal was broken
starting in 5.18.0. In this bug, C<kill> always returned 0 for a negative
signal even for valid PIDs, and no processes were terminated. This has been
fixed [perl #121230].

=item *

The time taken to build perl on Windows has been reduced quite significantly
(time savings in the region of 30-40% are typically seen) by reducing the
number of, usually failing, I/O calls for each L<C<require()>|perlfunc/require>
(for B<miniperl.exe> only).
[L<perl #121119|https://rt.perl.org/Public/Bug/Display.html?id=121119>]

=item *

About 15 minutes of idle sleeping was removed from running C<make test> by
fixing a bug in which the timeout monitor used for tests could not be
cancelled once the test completed, so the full timeout period elapsed before
the next test file was run.
[L<perl #121395|https://rt.perl.org/Public/Bug/Display.html?id=121395>]

=item *

On a perl built without pseudo-fork (pseudo-fork builds were not affected by
this bug), killing a process tree with L<C<kill()>|perlfunc/kill> and a negative
signal resulted in C<kill()> inverting the returned value.  For example, if
C<kill()> killed 1 process tree PID then it returned 0 instead of 1, and if
C<kill()> was passed 2 invalid PIDs then it returned 2 instead of 0.  This has
probably been the case since the process tree kill feature was implemented on
Win32.  It has now been corrected to follow the documented behaviour.
[L<perl #121230|https://rt.perl.org/Public/Bug/Display.html?id=121230>]

=item *

When building a 64-bit perl, an uninitialized memory read in B<miniperl.exe>,
used during the build process, could lead to a 4GB B<wperl.exe> being created.
This has now been fixed.  (Note that B<perl.exe> itself was unaffected, but
obviously B<wperl.exe> would have been completely broken.)
[L<perl #121471|https://rt.perl.org/Public/Bug/Display.html?id=121471>]

=item *

Perl can now be built with B<gcc> version 4.8.1 from L<http://www.mingw.org>.
This was previously broken due to an incorrect definition of DllMain() in one
of perl's source files.  Earlier B<gcc> versions were also affected when using
version 4 of the w32api package.  Versions of B<gcc> available from
L<http://mingw-w64.sourceforge.net/> were not affected.
[L<perl #121643|https://rt.perl.org/Public/Bug/Display.html?id=121643>]

=item *

The test harness now has no failures when perl is built on a FAT drive with the
Windows OS on an NTFS drive.
[L<perl #21442|https://rt.perl.org/Public/Bug/Display.html?id=21442>]

=item *

When cloning the context stack in fork() emulation, Perl_cx_dup()
would crash accessing parameter information for context stack entries
that included no parameters, as with C<&foo;>.
[L<perl #121721|https://rt.perl.org/Public/Bug/Display.html?id=121721>]

=item *

Introduced by
L<perl #113536|https://rt.perl.org/Public/Bug/Display.html?id=113536>, a memory
leak on every call to C<system> and backticks (C< `` >), on most Win32 Perls
starting from 5.18.0 has been fixed.  The memory leak only occurred if you
enabled pseudo-fork in your build of Win32 Perl, and were running that build on
Server 2003 R2 or newer OS.  The leak does not appear on WinXP SP3.
[L<perl #121676|https://rt.perl.org/Public/Bug/Display.html?id=121676>]

=back

=item WinCE

=over 4

=item *

The building of XS modules has largely been restored.  Several still cannot
(yet) be built but it is now possible to build Perl on WinCE with only a couple
of further patches (to L<Socket> and L<ExtUtils::MakeMaker>), hopefully to be
incorporated soon.

=item *

Perl can now be built in one shot with no user intervention on WinCE by running
C<nmake -f Makefile.ce all>.

Support for building with EVC (Embedded Visual C++) 4 has been restored.  Perl
can also be built using Smart Devices for Visual C++ 2005 or 2008.

=back

=back

=head1 Internal Changes

=over 4

=item *

The internal representation has changed for the match variables $1, $2 etc.,
$`, $&, $', ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH}.  It uses slightly less
memory, avoids string comparisons and numeric conversions during lookup, and
uses 23 fewer lines of C.  This change should not affect any external code.

=item *

Arrays now use NULL internally to represent unused slots, instead of
&PL_sv_undef.  &PL_sv_undef is no longer treated as a special value, so
av_store(av, 0, &PL_sv_undef) will cause element 0 of that array to hold a
read-only undefined scalar.  C<$array[0] = anything> will croak and
C<\$array[0]> will compare equal to C<\undef>.

=item *

The SV returned by HeSVKEY_force() now correctly reflects the UTF8ness of the
underlying hash key when that key is not stored as a SV.  [perl #79074]

=item *

Certain rarely used functions and macros available to XS code are now
deprecated.  These are:
C<utf8_to_uvuni_buf> (use C<utf8_to_uvchr_buf> instead),
C<valid_utf8_to_uvuni> (use C<utf8_to_uvchr_buf> instead),
C<NATIVE_TO_NEED> (this did not work properly anyway),
and C<ASCII_TO_NEED> (this did not work properly anyway).

Starting in this release, application code almost never needs to
distinguish between the platform's character set and Latin1, on which the
lowest 256 characters of Unicode are based.  New code should not use
C<utf8n_to_uvuni> (use C<utf8_to_uvchr_buf> instead),
nor
C<uvuni_to_utf8> (use C<uvchr_to_utf8> instead).

=item *

The Makefile shortcut targets for many rarely (or never) used testing and
profiling targets have been removed, or merged into the only other Makefile
target that uses them.  Specifically, these targets are gone, along with
documentation that referenced them or explained how to use them:

    check.third check.utf16 check.utf8 coretest minitest.prep
    minitest.utf16 perl.config.dashg perl.config.dashpg
    perl.config.gcov perl.gcov perl.gprof perl.gprof.config
    perl.pixie perl.pixie.atom perl.pixie.config perl.pixie.irix
    perl.third perl.third.config perl.valgrind.config purecovperl
    pureperl quantperl test.deparse test.taintwarn test.third
    test.torture test.utf16 test.utf8 test_notty.deparse
    test_notty.third test_notty.valgrind test_prep.third
    test_prep.valgrind torturetest ucheck ucheck.third ucheck.utf16
    ucheck.valgrind utest utest.third utest.utf16 utest.valgrind

It's still possible to run the relevant commands by "hand" - no underlying
functionality has been removed.

=item *

It is now possible to keep Perl from initializing locale handling.
For the most part, Perl doesn't pay attention to locale.  (See
L<perllocale>.)  Nonetheless, until now, on startup, it has always
initialized locale handling to the system default, just in case the
program being executed ends up using locales.  (This is one of the first
things a locale-aware program should do, long before Perl knows if it
will actually be needed or not.)  This works well except when Perl is
embedded in another application which wants a locale that isn't the
system default.  Now, if the environment variable
C<PERL_SKIP_LOCALE_INIT> is set at the time Perl is started, this
initialization step is skipped.  Prior to this, on Windows platforms,
the only workaround for this deficiency was to use a hacked-up copy of
internal Perl code.  Applications that need to use older Perls can
discover if the embedded Perl they are using needs the workaround by
testing that the C preprocessor symbol C<HAS_SKIP_LOCALE_INIT> is not
defined.  [RT #38193]

=item *

C<BmRARE> and C<BmPREVIOUS> have been removed.  They were not used anywhere
and are not part of the API.  For XS modules, they are now #defined as 0.

=item *

C<sv_force_normal>, which usually croaks on read-only values, used to allow
read-only values to be modified at compile time.  This has been changed to
croak on read-only values regardless.  This change uncovered several core
bugs.

=item *

Perl's new copy-on-write mechanism (which is now enabled by default)
allows any C<SvPOK> scalar to be automatically upgraded to a copy-on-write
scalar when copied. A reference count on the string buffer is stored in
the string buffer itself.

For example:

    $ perl -MDevel::Peek -e'$a="abc"; $b = $a; Dump $a; Dump $b'
    SV = PV(0x260cd80) at 0x2620ad8
      REFCNT = 1
      FLAGS = (POK,IsCOW,pPOK)
      PV = 0x2619bc0 "abc"\0
      CUR = 3
      LEN = 16
      COW_REFCNT = 1
    SV = PV(0x260ce30) at 0x2620b20
      REFCNT = 1
      FLAGS = (POK,IsCOW,pPOK)
      PV = 0x2619bc0 "abc"\0
      CUR = 3
      LEN = 16
      COW_REFCNT = 1

Note that both scalars share the same PV buffer and have a COW_REFCNT
greater than zero.

This means that XS code which wishes to modify the C<SvPVX()> buffer of an
SV should call C<SvPV_force()> or similar first, to ensure a valid (and
unshared) buffer, and to call C<SvSETMAGIC()> afterwards. This in fact has
always been the case (for example hash keys were already copy-on-write);
this change just spreads the COW behaviour to a wider variety of SVs.

One important difference is that before 5.18.0, shared hash-key scalars
used to have the C<SvREADONLY> flag set; this is no longer the case.

This new behaviour can still be disabled by running F<Configure> with
B<-Accflags=-DPERL_NO_COW>.  This option will probably be removed in Perl
5.22.

=item *

C<PL_sawampersand> is now a constant.  The switch this variable provided
(to enable/disable the pre-match copy depending on whether C<$&> had been
seen) has been removed and replaced with copy-on-write, eliminating a few
bugs.

The previous behaviour can still be enabled by running F<Configure> with
B<-Accflags=-DPERL_SAWAMPERSAND>.

=item *

The functions C<my_swap>, C<my_htonl> and C<my_ntohl> have been removed.
It is unclear why these functions were ever marked as I<A>, part of the
API. XS code can't call them directly, as it can't rely on them being
compiled. Unsurprisingly, no code on CPAN references them.

=item *

The signature of the C<Perl_re_intuit_start()> regex function has changed;
the function pointer C<intuit> in the regex engine plugin structure
has also changed accordingly. A new parameter, C<strbeg> has been added;
this has the same meaning as the same-named parameter in
C<Perl_regexec_flags>. Previously intuit would try to guess the start of
the string from the passed SV (if any), and would sometimes get it wrong
(e.g. with an overloaded SV).

=item *

The signature of the C<Perl_regexec_flags()> regex function has
changed; the function pointer C<exec> in the regex engine plugin
structure has also changed to match.  The C<minend> parameter now has
type C<SSize_t> to better support 64-bit systems.

=item *

XS code may use various macros to change the case of a character or code
point (for example C<toLOWER_utf8()>).  Only a couple of these were
documented until now; they should now be used in preference to calling the
underlying functions.  See L<perlapi/Character case changing>.

=item *

The code dealt rather inconsistently with uids and gids. Some
places assumed that they could be safely stored in UVs, others
in IVs, others in ints. Four new macros are introduced:
C<SvUID()>, C<sv_setuid()>, C<SvGID()>, and C<sv_setgid()>.

=item *

C<sv_pos_b2u_flags> has been added to the API.  It is similar to C<sv_pos_b2u>,
but supports long strings on 64-bit platforms.

=item *

C<PL_exit_flags> can now be used by perl embedders or other XS code to have
perl C<warn> or C<abort> on an attempted exit. [perl #52000]

=item *

Compiling with C<-Accflags=-DPERL_BOOL_AS_CHAR> now allows C99 and C++
compilers to emulate the aliasing of C<bool> to C<char> that perl does for
C89 compilers.  [perl #120314]

=item *

The C<sv> argument in L<perlapi/sv_2pv_flags>, L<perlapi/sv_2iv_flags>,
L<perlapi/sv_2uv_flags>, and L<perlapi/sv_2nv_flags> and their older wrappers
sv_2pv, sv_2iv, sv_2uv, sv_2nv, is now non-NULL. Passing NULL now will crash.
When the non-NULL marker was introduced en masse in 5.9.3 the functions
were marked non-NULL, but since the creation of the SV API in 5.0 alpha 2, if
NULL was passed, the functions returned 0 or false-type values. The code that
supports C<sv> argument being non-NULL dates to 5.0 alpha 2 directly, and
indirectly to Perl 1.0 (pre 5.0 api). The lack of documentation that the
functions accepted a NULL C<sv> was corrected in 5.11.0 and between 5.11.0
and 5.19.5 the functions were marked NULLOK. As an optimization the NULLOK code
has now been removed, and the functions became non-NULL marked again, because
core getter-type macros never pass NULL to these functions and would crash
before ever passing NULL.

The only way a NULL C<sv> can be passed to sv_2*v* functions is if XS code
directly calls sv_2*v*. This is unlikely as XS code uses Sv*V* macros to get
the underlying value out of the SV. One possible situation which leads to
a NULL C<sv> being passed to sv_2*v* functions, is if XS code defines its own
getter type Sv*V* macros, which check for NULL B<before> dereferencing and
checking the SV's flags through public API Sv*OK* macros or directly using
private API C<SvFLAGS>, and if C<sv> is NULL, then calling the sv_2*v functions
with a NULL literal or passing the C<sv> containing a NULL value.

=item *

newATTRSUB is now a macro

The public API newATTRSUB was previously a macro to the private
function Perl_newATTRSUB. Function Perl_newATTRSUB has been removed. newATTRSUB
is now a macro to a different internal function.

=item *

Changes in warnings raised by C<utf8n_to_uvchr()>

This bottom level function decodes the first character of a UTF-8 string
into a code point.  It is accessible to C<XS> level code, but using it
directly is discouraged.  There are higher level functions
that call this that should be used instead, such as
L<perlapi/utf8_to_uvchr_buf>.  For completeness though, this documents
some changes to it.  Now, tests for malformations are done before any
tests for other potential issues.  One of those issues involves code
points so large that they have never appeared in any official standard
(the current standard has scaled back the highest acceptable code point
from earlier versions).  It is possible (though not done in CPAN) to
warn and/or forbid these code points, while accepting smaller code
points that are still above the legal Unicode maximum.  The warning
message for this now includes the code point if representable on the
machine.  Previously it always displayed raw bytes, which is what it
still does for non-representable code points.

=item *

Regexp engine changes that affect the pluggable regex engine interface

Many flags that used to be exposed via regexp.h and used to populate the
extflags member of struct regexp have been removed. These fields were
technically private to Perl's own regexp engine and should not have been
exposed there in the first place.

The affected flags are:

    RXf_NOSCAN
    RXf_CANY_SEEN
    RXf_GPOS_SEEN
    RXf_GPOS_FLOAT
    RXf_ANCH_BOL
    RXf_ANCH_MBOL
    RXf_ANCH_SBOL
    RXf_ANCH_GPOS

As well as the following flag masks:

    RXf_ANCH_SINGLE
    RXf_ANCH

All have been renamed to PREGf_ equivalents and moved to regcomp.h.

The behavior previously achieved by setting one or more of the RXf_ANCH_
flags (via the RXf_ANCH mask) has now been replaced by a *single* flag bit
in extflags:

    RXf_IS_ANCHORED

Pluggable regex engines which previously set these flags should
now set this flag ALONE.

=item *

The Perl core now consistently uses C<av_tindex()> ("the top index of an
array") as a more clearly-named synonym for C<av_len()>.

=item *

The obscure interpreter variable C<PL_timesbuf> is expected to be removed
early in the 5.21.x development series, so that Perl 5.22.0 will not provide
it to XS authors.  While the variable still exists in 5.20.0, we hope that
this advance warning of the deprecation will help anyone who is using that
variable.

=back

=head1 Selected Bug Fixes

=head2 Regular Expressions

=over 4

=item *

Fixed a small number of regexp constructions that could either fail to
match or crash perl when the string being matched against was
allocated above the 2GB line on 32-bit systems. [RT #118175]

=item *

Various memory leaks involving the parsing of the C<(?[...])> regular
expression construct have been fixed.

=item *

C<(?[...])> now allows interpolation of precompiled patterns consisting of
C<(?[...])> with bracketed character classes inside (C<$pat =
S<qr/(?[ [a] ])/;> S</(?[ $pat ])/>>).  Formerly, the brackets would
confuse the regular expression parser.

=item *

The "Quantifier unexpected on zero-length expression" warning message could
appear twice starting in Perl v5.10 for a regular expression also
containing alternations (e.g., "a|b") triggering the trie optimisation.

=item *

Perl v5.18 inadvertently introduced a bug whereby interpolating mixed up-
and down-graded UTF-8 strings in a regex could result in malformed UTF-8
in the pattern: specifically if a downgraded character in the range
C<\x80..\xff> followed a UTF-8 string, e.g.

    utf8::upgrade(  my $u = "\x{e5}");
    utf8::downgrade(my $d = "\x{e5}");
    /$u$d/

[RT #118297]

=item *

In regular expressions containing multiple code blocks, the values of
C<$1>, C<$2>, etc., set by nested regular expression calls would leak from
one block to the next.  Now these variables always refer to the outer
regular expression at the start of an embedded block [perl #117917].

=item *

C</$qr/p> was broken in Perl 5.18.0; the C</p> flag was ignored.  This has been
fixed. [perl #118213]
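
As a minimal sketch of the fixed behaviour (the pattern and the string are
illustrative and not taken from the ticket), the match variables are
populated when C</p> is applied to a pattern built from an interpolated
C<qr//> object:

    use strict;
    use warnings;

    my $qr = qr/b+/;

    # /p asks perl to preserve the pre-match, match and post-match strings
    my ($pre, $match, $post);
    if ("aabbcc" =~ /$qr/p) {
        ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
    }
    print "$pre|$match|$post\n";    # prints "aa|bb|cc"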

=item *

Starting in Perl 5.18.0, a construct like C</[#](?{})/x> would have its C<#>
incorrectly interpreted as a comment.  The code block would be skipped,
unparsed.  This has been corrected.

=item *

Starting in Perl 5.001, a regular expression like C</[#$a]/x> or C</[#]$a/x>
would have its C<#> incorrectly interpreted as a comment, so the variable would
not interpolate.  This has been corrected. [perl #45667]
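
A small sketch of the corrected behaviour, assuming a hypothetical variable
C<$extra> holding additional class members.  Inside a bracketed character
class the C<#> is a literal even under C</x>, so the variable interpolates
and the class becomes C<[#xyz]>:

    use strict;
    use warnings;

    my $extra = "xyz";    # hypothetical extra members of the class

    # Both the literal "#" and the interpolated letters should match
    my $hash_matches   = ("#" =~ /[#$extra]/x) ? 1 : 0;
    my $letter_matches = ("y" =~ /[#$extra]/x) ? 1 : 0;
    print "$hash_matches $letter_matches\n";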

=item *

Perl 5.18.0 inadvertently made dereferenced regular expressions
S<(C<${ qr// }>)> false as booleans.  This has been fixed.

=item *

The use of C<\G> in regular expressions, where it's not at the start of the
pattern, is now slightly less buggy (although it is still somewhat
problematic).

=item *

Where a regular expression included code blocks (C</(?{...})/>), and where the
use of constant overloading triggered a re-compilation of the code block, the
second compilation didn't see its outer lexical scope.  This was a regression
in Perl 5.18.0.

=item *

The string position set by C<pos> could shift if the string changed
representation internally to or from utf8.  This could happen, e.g., with
references to objects with string overloading.

=item *

Taking references to the return values of two C<pos> calls with the same
argument, and then assigning a reference to one and C<undef> to the other,
could result in assertion failures or memory leaks.

=item *

Elements of @- and @+ now update correctly when they refer to non-existent
captures.  Previously, a referenced element (C<$ref = \$-[1]>) could refer to
the wrong match after subsequent matches.

=item *

The code that parses regex backrefs (or ambiguous backref/octals) such as \123
did a simple atoi(), which could wrap round to negative values on long digit
strings and cause segmentation faults.  This has now been fixed.  [perl
#119505]

=item *

Assigning another typeglob to C<*^R> no longer makes the regular expression
engine crash.

=item *

The C<\N> regular expression escape, when used without the curly braces (to
mean C<[^\n]>), was ignoring a following C<*> if followed by whitespace
under /x.  It had been this way since C<\N> to mean C<[^\n]> was introduced
in 5.12.0.
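
For illustration, with an arbitrary two-line string, the quantifier below is
separated from C<\N> by whitespace under C</x> and is expected to apply to
it after this fix:

    use strict;
    use warnings;

    # Under /x the space is insignificant, so this is \N* (any run of
    # non-newline characters)
    my ($first_line) = "abc\ndef" =~ /^(\N *)/x;
    print "$first_line\n";    # prints "abc"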

=item *

C<s///>, C<tr///> and C<y///> now work when a wide character is used as the
delimiter.  [perl #120463]

=item *

Some cases of unterminated (?...) sequences in regular expressions (e.g.,
C</(?</>) have been fixed to produce the proper error message instead of
"panic: memory wrap".  Other cases (e.g., C</(?(/>) have yet to be fixed.

=item *

When a reference to a reference to an overloaded object was returned from
a regular expression C<(??{...})> code block, an incorrect implicit
dereference could take place if the inner reference had been returned by
a code block previously.

=item *

A tied variable returned from C<(??{...})> sees the inner values of match
variables (i.e., the $1 etc. from any matches inside the block) in its
FETCH method.  This was not the case if a reference to an overloaded object
was the last thing assigned to the tied variable.  Instead, the match
variables referred to the outer pattern during the FETCH call.

=item *

Fix unexpected tainting via regexp using locale. Previously, under certain
conditions, the use of character classes could cause tainting when it
shouldn't. Some character classes are locale-dependent, but before this
patch, sometimes tainting was happening even for character classes that
don't depend on the locale. [perl #120675]

=item *

Under certain conditions, Perl would throw an error if, in a lookbehind
assertion in a regexp, the assertion referred to a named subpattern,
complaining the lookbehind was variable when it wasn't. This has been
fixed. [perl #120600], [perl #120618]. The current fix may be improved
on in the future.

=item *

C<$^R> wasn't available outside of the regular expression that
initialized it.  [perl #121070]

=item *

A large set of fixes and refactoring for re_intuit_start() was merged.
The highlights are:

=over

=item *

Fixed a panic when compiling the regular expression
C</\x{100}[xy]\x{100}{2}/>.

=item *

Fixed a performance regression when performing a global pattern match
against a UTF-8 string.  [perl #120692]

=item *

Fixed another performance issue where matching a regular expression
like C</ab.{1,2}x/> against a long UTF-8 string would unnecessarily
calculate byte offsets for a large portion of the string. [perl
#120692]

=back

=item *

Fixed an alignment error when compiling regular expressions when built
with GCC on HP-UX 64-bit.

=item *

On 64-bit platforms C<pos> can now be set to a value higher than 2**31-1.
[perl #72766]

=back

=head2 Perl 5 Debugger and -d

=over 4

=item *

The debugger's C<man> command has been fixed. It was broken in the v5.18.0
release. The C<man> command is aliased to the names C<doc> and C<perldoc> -
all now work again.

=item *

C<@_> is now correctly visible in the debugger, fixing a regression
introduced in v5.18.0's debugger. [RT #118169]

=item *

Under copy-on-write builds (the default as of 5.20.0) C<< ${'_<-e'}[0] >>
no longer gets mangled.  This is the first line of input saved for the
debugger's use for one-liners [perl #118627].

=item *

On non-threaded builds, setting C<${"_E<lt>filename"}> to a reference or
typeglob no longer causes C<__FILE__> and some error messages to produce a
corrupt string, and no longer prevents C<#line> directives in string evals from
providing the source lines to the debugger.  Threaded builds were unaffected.

=item *

Starting with Perl 5.12, line numbers were off by one if the B<-d> switch was
used on the #! line.  Now they are correct.

=item *

C<*DB::DB = sub {} if 0> no longer stops Perl's debugging mode from finding
C<DB::DB> subs declared thereafter.

=item *

C<%{'_<...'}> hashes now set breakpoints on the corresponding C<@{'_<...'}>
rather than whichever array C<@DB::dbline> is aliased to.  [perl #119799]

=item *

Set-magic is now called when setting C<$DB::sub>.  [perl #121255]

=item *

The debugger's "n" command now respects lvalue subroutines and steps over
them [perl #118839].

=back

=head2 Lexical Subroutines

=over 4

=item *

Lexical constants (C<my sub a() { 42 }>) no longer crash when inlined.

=item *

Parameter prototypes attached to lexical subroutines are now respected when
compiling sub calls without parentheses.  Previously, the prototypes were
honoured only for calls I<with> parentheses. [RT #116735]
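
A short sketch of a call without parentheses, using a hypothetical lexical
sub C<first_of> with a C<\@> prototype.  The C<feature> and C<warnings>
lines are only needed on perls where lexical subs are still experimental:

    use strict;
    use warnings;
    use feature 'lexical_subs';
    no warnings 'experimental::lexical_subs';

    # The \@ prototype makes perl pass a reference to the named array
    my sub first_of (\@) {
        my ($array_ref) = @_;
        return $array_ref->[0];
    }

    my @list  = (10, 20, 30);
    my $first = first_of @list;    # no parentheses; prototype still applies
    print "$first\n";              # prints "10"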

=item *

Syntax errors in lexical subroutines in combination with calls to the same
subroutines no longer cause crashes at compile time.

=item *

Deep recursion warnings no longer crash lexical subroutines. [RT #118521]

=item *

The dtrace sub-entry probe now works with lexical subs, instead of
crashing [perl #118305].

=item *

Undefining an inlinable lexical subroutine (C<my sub foo() { 42 } undef
&foo>) would result in a crash if warnings were turned on.

=item *

An undefined lexical sub used as an inherited method no longer crashes.

=item *

The presence of a lexical sub named "CORE" no longer stops the CORE::
prefix from working.

=back

=head2 Everything Else

=over 4

=item *

The OP allocation code now returns correctly aligned memory in all cases
for C<struct pmop>. Previously it could return memory only aligned to a
4-byte boundary, which is not correct for an ithreads build with 64 bit IVs
on some 32 bit platforms. Notably, this caused the build to fail completely
on sparc GNU/Linux. [RT #118055]

=item *

Evaluating large hashes in scalar context is now much faster, as the number
of used chains in the hash is now cached for larger hashes. Smaller hashes
continue not to store it and calculate it when needed, as this saves one IV.
That would be 1 IV overhead for every object built from a hash. [RT #114576]

=item *

Perl v5.16 inadvertently introduced a bug whereby calls to XSUBs that were
not visible at compile time were treated as lvalues and could be assigned
to, even when the subroutine was not an lvalue sub.  This has been fixed.
[RT #117947]

=item *

In Perl v5.18.0 dualvars that had an empty string for the string part but a
non-zero number for the number part started being treated as true.  In
previous versions they were treated as false, the string representation
taking precedence.  The old behaviour has been restored. [RT #118159]
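
The restored behaviour can be sketched with C<Scalar::Util::dualvar>; the
values used here are arbitrary:

    use strict;
    use warnings;
    use Scalar::Util qw(dualvar);

    # Number part is 5, string part is the empty string
    my $dv = dualvar(5, "");

    my $as_number  = $dv + 0;                    # numeric slot
    my $as_boolean = $dv ? "true" : "false";     # string slot decides truth
    print "$as_number $as_boolean\n";            # prints "5 false"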

=item *

Since Perl v5.12, inlining of constants that override built-in keywords of
the same name had countermanded C<use subs>, causing subsequent mentions of
the constant to use the built-in keyword instead.  This has been fixed.

=item *

The warning produced by C<-l $handle> now applies to IO refs and globs, not
just to glob refs.  That warning is also now UTF8-clean. [RT #117595]

=item *

C<delete local $ENV{nonexistent_env_var}> no longer leaks memory.

=item *

C<sort> and C<require> followed by a keyword prefixed with C<CORE::> now
treat it as a keyword, and not as a subroutine or module name. [RT #24482]

=item *

Through certain conundrums, it is possible to cause the current package to
be freed.  Certain operators (C<bless>, C<reset>, C<open>, C<eval>) could
not cope and would crash.  They have been made more resilient. [RT #117941]

=item *

Aliasing filehandles through glob-to-glob assignment would not update
internal method caches properly if a package of the same name as the
filehandle existed, resulting in filehandle method calls going to the
package instead.  This has been fixed.

=item *

C<./Configure -de -Dusevendorprefix> didn't default. [RT #64126]

=item *

The C<Statement unlikely to be reached> warning was listed in
L<perldiag> as an C<exec>-category warning, but was enabled and disabled
by the C<syntax> category.  On the other hand, the C<exec> category
controlled its fatal-ness.  It is now entirely handled by the C<exec>
category.

=item *

The "Replacement list is longer than search list" warning for C<tr///> and
C<y///> no longer occurs in the presence of the C</c> flag. [RT #118047]

=item *

Stringification of NVs is not cached so that the lexical locale controls
stringification of the decimal point. [perl #108378] [perl #115800]

=item *

There have been several fixes related to Perl's handling of locales.  perl
#38193 was described above in L</Internal Changes>.  Also fixed is #118197,
where the radix (decimal point) character had to be an ASCII character
(which doesn't work for some non-Western languages); and #115808, in which
C<POSIX::setlocale()> on failure returned an C<undef> which didn't warn
about not being defined even if those warnings were enabled.

=item *

Compiling a C<split> operator whose third argument is a named constant
evaluating to 0 no longer causes the constant's value to change.

=item *

A named constant used as the second argument to C<index> no longer gets
coerced to a string if it is a reference, regular expression, dualvar, etc.

=item *

A named constant evaluating to the undefined value used as the second
argument to C<index> no longer produces "uninitialized" warnings at compile
time.  It will still produce them at run time.

=item *

When a scalar was returned from a subroutine in @INC, the referenced scalar
was magically converted into an IO thingy, possibly resulting in "Bizarre
copy" errors if that scalar continued to be used elsewhere.  Now Perl uses
an internal copy of the scalar instead.

=item *

Certain uses of the C<sort> operator are optimised to modify an array in
place, such as C<@a = sort @a>.  During the sorting, the array is made
read-only.  If a sort block should happen to die, then the array remained
read-only even outside the C<sort>.  This has been fixed.
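
A minimal sketch of the scenario, with an arbitrary array and a comparison
block that always dies.  After the fix the array is expected to be
modifiable again once the C<eval> has caught the exception:

    use strict;
    use warnings;

    my @numbers = (3, 1, 2);

    # The in-place form: the same array on both sides of the assignment
    eval { @numbers = sort { die "boom\n" } @numbers; };
    my $error = $@;

    # This push would fail if the array had been left read-only
    push @numbers, 4;
    print scalar(@numbers), "\n";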

=item *

C<$a> and C<$b> inside a sort block are aliased to the actual arguments to
C<sort>, so they can be modified through those two variables.  This did not
always work, e.g., for lvalue subs and C<$#ary>, and probably many other
operators.  It works now.

=item *

The arguments to C<sort> are now all in list context.  If the C<sort>
itself were called in void or scalar context, then I<some>, but not all, of
the arguments used to be in void or scalar context.

=item *

Subroutine prototypes with Unicode characters above U+00FF were getting
mangled during closure cloning.  This would happen with subroutines closing
over lexical variables declared outside, and with lexical subs.

=item *

C<UNIVERSAL::can> now treats its first argument the same way that method
calls do: Typeglobs and glob references with non-empty IO slots are treated
as handles, and strings are treated as filehandles, rather than packages,
if a handle with that name exists [perl #113932].

=item *

Method calls on typeglobs (e.g., C<< *ARGV->getline >>) used to stringify
the typeglob and then look it up again.  Combined with changes in Perl
5.18.0, this allowed C<< *foo->bar >> to call methods on the "foo" package
(like C<< foo->bar >>).  In some cases it could cause the method to be
called on the wrong handle.  Now a typeglob argument is treated as a
handle (just like C<< (\*foo)->bar >>), or, if its IO slot is empty, an
error is raised.

=item *

Assigning a vstring to a tied variable or to a subroutine argument aliased
to a nonexistent hash or array element now works, without flattening the
vstring into a regular string.

=item *

C<pos>, C<tie>, C<tied> and C<untie> did not work
properly on subroutine arguments aliased to nonexistent
hash and array elements [perl #77814, #27010].

=item *

The C<< => >> fat arrow operator can now quote built-in keywords even if it
occurs on the next line, making it consistent with how it treats other
barewords.

=item *

Autovivifying a subroutine stub via C<\&$glob> started causing crashes in Perl
5.18.0 if the $glob was merely a copy of a real glob, i.e., a scalar that had
had a glob assigned to it.  This has been fixed. [perl #119051]

=item *

Perl used to leak an implementation detail when it came to referencing the
return values of certain operators.  C<for ($a+$b) { warn \$_; warn \$_ }> used
to display two different memory addresses, because the C<\> operator was
copying the variable.  Under threaded builds, it would also happen for
constants (C<for(1) { ... }>).  This has been fixed. [perl #21979, #78194,
#89188, #109746, #114838, #115388]

=item *

The range operator C<..> was returning the same modifiable scalars with each
call, unless it was the only thing in a C<foreach> loop header.  This meant
that changes to values within the list returned would be visible the next time
the operator was executed. [perl #3105]

=item *

Constant folding and subroutine inlining no longer cause operations that would
normally return new modifiable scalars to return read-only values instead.

=item *

Closures of the form C<sub () { $some_variable }> are no longer inlined,
causing changes to the variable to be ignored by callers of the subroutine.
[perl #79908]

=item *

Return values of certain operators such as C<ref> would sometimes be shared
between recursive calls to the same subroutine, causing the inner call to
modify the value returned by C<ref> in the outer call.  This has been fixed.

=item *

C<__PACKAGE__> and constants returning a package name or hash key are now
consistently read-only.  In various previous Perl releases, they have become
mutable under certain circumstances.

=item *

Enabling "used once" warnings no longer causes crashes on stash circularities
created at compile time (C<*Foo::Bar::Foo:: = *Foo::>).

=item *

Undef constants used in hash keys (C<use constant u =E<gt> undef; $h{+u}>) no
longer produce "uninitialized" warnings at compile time.

=item *

Modifying a substitution target inside the substitution replacement no longer
causes crashes.

=item *

The first statement inside a string eval used to use the wrong pragma setting
sometimes during constant folding.  C<eval 'uc chr 0xe0'> would randomly choose
between Unicode, byte, and locale semantics.  This has been fixed.

=item *

The handling of return values of @INC filters (subroutines returned by
subroutines in @INC) has been fixed in various ways.  Previously tied variables
were mishandled, and setting $_ to a reference or typeglob could result in
crashes.

=item *

The C<SvPVbyte> XS function has been fixed to work with tied scalars returning
something other than a string.  It used to return utf8 in those cases where
C<SvPV> would.

=item *

Perl 5.18.0 inadvertently made C<--> and C<++> crash on dereferenced regular
expressions, and stopped C<++> from flattening vstrings.

=item *

C<bless> no longer dies with "Can't bless non-reference value" if its first
argument is a tied reference.

=item *

C<reset> with an argument no longer skips copy-on-write scalars, regular
expressions, typeglob copies, and vstrings.  Also, when encountering those or
read-only values, it no longer skips any array or hash with the same name.

=item *

C<reset> with an argument now skips scalars aliased to typeglobs
(C<for $z (*foo) { reset "z" }>).  Previously it would corrupt memory or crash.

=item *

C<ucfirst> and C<lcfirst> were not respecting the bytes pragma.  This was a
regression from Perl 5.12. [perl #117355]

=item *

Changes to C<UNIVERSAL::DESTROY> now update DESTROY caches in all classes,
instead of causing classes that have already had objects destroyed to continue
using the old sub.  This was a regression in Perl 5.18. [perl #114864]

=item *

All known false-positive occurrences of the deprecation warning "Useless use of
'\'; doesn't escape metacharacter '%c'", added in Perl 5.18.0, have been
removed. [perl #119101]

=item *

The value of $^E is now saved across signal handlers on Windows.  [perl #85104]

=item *

A lexical filehandle (as in C<open my $fh...>) is usually given a name based on
the current package and the name of the variable, e.g. "main::$fh".  Under
recursion, the filehandle was losing the "$fh" part of the name.  This has been
fixed.

=item *

Uninitialized values returned by XSUBs are no longer exempt from uninitialized
warnings.  [perl #118693]

=item *

C<elsif ("")> no longer erroneously produces a warning about void context.
[perl #118753]

=item *

Passing C<undef> to a subroutine now causes @_ to contain the same read-only
undefined scalar that C<undef> returns.  Furthermore, C<exists $_[0]> will now
return true if C<undef> was the first argument.  [perl #7508, #109726]
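
For example, with a hypothetical helper C<first_arg_exists>, a literal
C<undef> argument can now be told apart from a missing argument:

    use strict;
    use warnings;

    # Reports whether the caller supplied a first argument at all
    sub first_arg_exists { return exists($_[0]) ? 1 : 0 }

    my $with_undef = first_arg_exists(undef);
    my $with_none  = first_arg_exists();
    print "$with_undef $with_none\n";    # prints "1 0"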

=item *

Passing a non-existent array element to a subroutine does not usually
autovivify it unless the subroutine modifies its argument.  This did not work
correctly with negative indices and with non-existent elements within the
array.  The element would be vivified immediately.  The delayed vivification
has been extended to work with those.  [perl #118691]
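
The delayed vivification can be sketched as follows.  The subroutine names
C<ignore> and C<assign> are hypothetical; the array is made sparse so that
indices 0 to 4 do not exist:

    use strict;
    use warnings;

    my @sparse;
    $sparse[5] = "last";    # elements 0..4 are non-existent

    sub ignore { return }                   # never touches its argument
    sub assign { $_[0] = "set"; return }    # modifies its argument

    ignore($sparse[2]);     # non-existent element within the array
    ignore($sparse[-3]);    # negative index, refers to element 3
    my $untouched = (exists($sparse[2]) || exists($sparse[3])) ? 1 : 0;

    assign($sparse[2]);     # assignment through @_ vivifies the element
    my $vivified = exists($sparse[2]) ? 1 : 0;
    print "$untouched $vivified\n";    # prints "0 1"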

=item *

Assigning references or globs to the scalar returned by $#foo after the @foo
array has been freed no longer causes assertion failures on debugging builds
and memory leaks on regular builds.

=item *

On 64-bit platforms, large ranges like 1..1000000000000 no longer crash, but
eat up all your memory instead.  [perl #119161]

=item *

C<__DATA__> now puts the C<DATA> handle in the right package, even if the
current package has been renamed through glob assignment.

=item *

When C<die>, C<last>, C<next>, C<redo>, C<goto> and C<exit> unwind the scope,
it is possible for C<DESTROY> recursively to call a subroutine or format that
is currently being exited.  In that case, sometimes the lexical variables
inside the sub would start out having values from the outer call, instead of
being undefined as they should.  This has been fixed.  [perl #119311]

=item *

${^MPEN} is no longer treated as a synonym for ${^MATCH}.

=item *

Perl now tries a little harder to return the correct line number in
C<(caller)[2]>.  [perl #115768]

=item *

Line numbers inside multiline quote-like operators are now reported correctly.
[perl #3643]

=item *

C<#line> directives inside code embedded in quote-like operators are now
respected.

=item *

Line numbers are now correct inside the second here-doc when two here-doc
markers occur on the same line.

=item *

An optimization in Perl 5.18 made incorrect assumptions causing a bad
interaction with the L<Devel::CallParser> CPAN module.  If the module was
loaded then lexical variables declared in separate statements following a
C<my(...)> list might fail to be cleared on scope exit.

=item *

C<&xsub> and C<goto &xsub> calls now allow the called subroutine to autovivify
elements of @_.

=item *

C<&xsub> and C<goto &xsub> no longer crash if *_ has been undefined and has no
ARRAY entry (i.e. @_ does not exist).

=item *

C<&xsub> and C<goto &xsub> now work with tied @_.

=item *

Overlong identifiers no longer cause a buffer overflow (and a crash).  They
started doing so in Perl 5.18.

=item *

The warning "Scalar value @hash{foo} better written as $hash{foo}" now produces
far fewer false positives.  In particular, C<@hash{+function_returning_a_list}>
and C<@hash{ qw "foo bar baz" }> no longer warn.  The same applies to array
slices.  [perl #28380, #114024]

=item *

C<$! = EINVAL; waitpid(0, WNOHANG);> no longer goes into an internal infinite
loop.  [perl #85228]

=item *

A possible segmentation fault in filehandle duplication has been fixed.

=item *

A subroutine in @INC can return a reference to a scalar containing the initial
contents of the file.  However, that scalar was freed prematurely if not
referenced elsewhere, giving random results.

=item *

C<last> no longer returns values that the same statement has accumulated so
far, fixing amongst other things the long-standing bug that C<push @a, last>
would try to return the @a, copying it like a scalar in the process and
resulting in the error, "Bizarre copy of ARRAY in last."  [perl #3112]

=item *

In some cases, closing file handles opened to pipe to or from a process, which
had been duplicated into a standard handle, would call perl's internal waitpid
wrapper with a pid of zero.  With the fix for [perl #85228] this zero pid was
passed to C<waitpid>, possibly blocking the process.  This wait for process
zero no longer occurs.  [perl #119893]

=item *

C<select> used to ignore magic on the fourth (timeout) argument, leading to
effects such as C<select> blocking indefinitely rather than the expected sleep
time.  This has now been fixed.  [perl #120102]

=item *

The class name in C<for my class $foo> is now parsed correctly.  In the case of
the second character of the class name being followed by a digit (e.g. 'a1b')
this used to give the error "Missing $ on loop variable".  [perl #120112]

=item *

Perl 5.18.0 accidentally disallowed C<-bareword> under C<use strict> and
C<use integer>.  This has been fixed.  [perl #120288]
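
A brief sketch with an arbitrary bareword: with both pragmas in effect, a
negated bareword is again treated as the string with a leading hyphen:

    use strict;
    use warnings;
    use integer;

    # -verbose is equivalent to the string "-verbose"
    my $option = -verbose;
    print "$option\n";    # prints "-verbose"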

=item *

C<-a> at the start of a line (or a hyphen with any single letter that is
not a filetest operator) no longer produces an erroneous 'Use of "-a"
without parentheses is ambiguous' warning.  [perl #120288]

=item *

Lvalue context is now properly propagated into bare blocks and C<if> and
C<else> blocks in lvalue subroutines.  Previously, arrays and hashes would
sometimes incorrectly be flattened when returned in lvalue list context, or
"Bizarre copy" errors could occur.  [perl #119797]

=item *

Lvalue context is now propagated to the branches of C<||> and C<&&> (and
their alphabetic equivalents, C<or> and C<and>).  This means
C<foreach (pos $x || pos $y) {...}> now allows C<pos> to be modified
through $_.

=item *

C<stat> and C<readline> remember the last handle used; the former
for the special C<_> filehandle, the latter for C<${^LAST_FH}>.
C<eval "*foo if 0"> where *foo was the last handle passed to C<stat>
or C<readline> could cause that handle to be forgotten if the
handle were not opened yet.  This has been fixed.

=item *

Various cases of C<delete $::{a}>, C<delete $::{ENV}> etc. causing a crash
have been fixed.  [perl #54044]

=item *

Setting C<$!> to EACCES before calling C<require> could affect
C<require>'s behaviour.  This has been fixed.

=item *

The "Can't use \1 to mean $1 in expression" warning message now only occurs
on the right-hand (replacement) part of a substitution.  Formerly it could
happen in code embedded in the left-hand side, or in any other quote-like
operator.

=item *

Blessing into a reference (C<bless $thisref, $thatref>) has long been
disallowed, but magical scalars for the second like C<$/> and those tied
were exempt.  They no longer are.  [perl #119809]

=item *

Blessing into a reference was accidentally allowed in 5.18 if the class
argument were a blessed reference with stale method caches (i.e., whose
class had had subs defined since the last method call).  They are
disallowed once more, as in 5.16.

=item *

C<< $x->{key} >> where $x was declared as C<my Class $x> no longer crashes
if a Class::FIELDS subroutine stub has been declared.

=item *

C<@$obj{'key'}> and C<${$obj}{key}> used to be exempt from compile-time
field checking ("No such class field"; see L<fields>) but no longer are.

=item *

A nonexistent array element with a large index passed to a subroutine that
ties the array and then tries to access the element no longer results in a
crash.

=item *

Declaring a subroutine stub named NEGATIVE_INDICES no longer makes negative
array indices crash when the current package is a tied array class.

=item *

Declaring a C<require>, C<glob>, or C<do> subroutine stub in the
CORE::GLOBAL:: package no longer makes compilation of calls to the
corresponding functions crash.

=item *

Aliasing CORE::GLOBAL:: functions to constants stopped working in Perl 5.10
but has now been fixed.

=item *

When C<`...`> or C<qx/.../> calls a C<readpipe> override, double-quotish
interpolation now happens, as is the case when there is no override.
Previously, the presence of an override would make these quote-like
operators act like C<q{}>, suppressing interpolation.  [perl #115330]
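
As a hedged illustration, the override below is a hypothetical stub that
records the command it receives instead of running it.  With the fix, the
recorded command should contain the interpolated value rather than the
literal text C<$dir>:

    use strict;
    use warnings;

    our @seen;    # commands passed to the override

    BEGIN {
        no warnings 'once';
        # Hypothetical stub installed before the backticks are compiled
        *CORE::GLOBAL::readpipe = sub {
            push @seen, $_[0];
            return "stubbed\n";
        };
    }

    my $dir = "/tmp";
    my $out = `ls $dir`;    # calls the stub; no external command is run
    print "$seen[0]\n";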

=item *

C<<<<`...`> here-docs (with backticks as the delimiters) now call
C<readpipe> overrides.  [perl #119827]

=item *

C<&CORE::exit()> and C<&CORE::die()> now respect L<vmsish> hints.

=item *

Undefining a glob that triggers a DESTROY method that undefines the same
glob is now safe.  It used to produce "Attempt to free unreferenced glob
pointer" warnings and leak memory.

=item *

If subroutine redefinition (C<eval 'sub foo{}'> or C<newXS> for XS code)
triggers a DESTROY method on the sub that is being redefined, and that
method assigns a subroutine to the same slot (C<*foo = sub {}>), C<$_[0]>
is no longer left pointing to a freed scalar.  Now DESTROY is delayed until
the new subroutine has been installed.

=item *

On Windows, perl no longer calls CloseHandle() on a socket handle.  This makes
debugging easier on Windows by removing certain irrelevant bad handle
exceptions.  It also fixes a race condition that made socket functions randomly
fail in a Perl process with multiple OS threads, and possible test failures in
F<dist/IO/t/cachepropagate-tcp.t>.  [perl #120091/118059]

=item *

Formats involving UTF-8 encoded strings, or strange vars like ties,
overloads, or stringified refs (and in recent perls, pure NOK vars) would
generally do the wrong thing when the var is treated as a string and
repeatedly chopped, as in C<< ^<<<~~ >> and similar.  This has now been
resolved.  [perl #33832/45325/113868/119847/119849/119851]

=item *

C<< semctl(..., SETVAL, ...) >> would set the semaphore to the top
32-bits of the supplied integer instead of the bottom 32-bits on
64-bit big-endian systems. [perl #120635]

=item *

C<< readdir() >> now only sets C<$!> on error.  C<$!> is no longer set
to C<EBADF> when the terminating C<undef> is read from the directory
unless the system call sets C<$!>. [perl #118651]

=item *

C<&CORE::glob> no longer causes an intermittent crash due to perl's stack
getting corrupted. [perl #119993]

=item *

C<open> with layers that load modules (e.g., "<:encoding(utf8)") no longer
runs the risk of crashing due to stack corruption.

=item *

Perl 5.18 broke autoloading via C<< ->SUPER::foo >> method calls by looking
up AUTOLOAD from the current package rather than the current package's
superclass.  This has been fixed. [perl #120694]
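
A minimal sketch with two hypothetical classes, C<Base> and C<Derived>.
C<Base> defines no C<greet> method, so the C<SUPER::> call should fall
through to the C<AUTOLOAD> found via the superclass:

    use strict;
    use warnings;

    package Base;
    our $AUTOLOAD;
    sub AUTOLOAD {
        my $name = $AUTOLOAD;
        $name =~ s/.*:://;               # strip the package qualifier
        return if $name eq 'DESTROY';
        return "Base handled $name";
    }

    package Derived;
    our @ISA = ('Base');
    sub greet {
        my ($class) = @_;
        return $class->SUPER::greet();   # resolved through Base::AUTOLOAD
    }

    package main;
    my $result = Derived->greet;
    print "$result\n";    # prints "Base handled greet"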

=item *

A longstanding bug causing C<do {} until CONSTANT>, where the constant
holds a true value, to read unallocated memory has been resolved.  This
would usually happen after a syntax error.  In past versions of Perl it has
crashed intermittently. [perl #72406]

=item *

Fix HP-UX C<$!> failure. HP-UX strerror() returns an empty string for an
unknown error code.  This caused an assertion to fail under DEBUGGING
builds.  Now instead, the returned string for C<"$!"> contains text
indicating the code is for an unknown error.

=item *

Individually-tied elements of @INC (as in C<tie $INC[0]...>) are now
handled correctly.  Formerly, whether a sub returned by such a tied element
would be treated as a sub depended on whether a FETCH had occurred
previously.

=item *

C<getc> on a byte-sized handle after the same C<getc> operator had been
used on a utf8 handle used to treat the bytes as utf8, resulting in erratic
behavior (e.g., malformed UTF-8 warnings).

=item *

An initial C<{> at the beginning of a format argument line was always
interpreted as the beginning of a block prior to v5.18.  In Perl v5.18, it
started being treated as an ambiguous token.  The parser would guess
whether it was supposed to be an anonymous hash constructor or a block
based on the contents.  Now the previous behaviour has been restored.
[perl #119973]

=item *

In Perl v5.18 C<undef *_; goto &sub> and C<local *_; goto &sub> started
crashing.  This has been fixed. [perl #119949]

=item *

Backticks (C< `` > or C< qx// >) combined with multiple threads on
Win32 could result in output sent to stdout on one thread being
captured by backticks of an external command in another thread.

This could occur for pseudo-forked processes too, as Win32's
pseudo-fork is implemented in terms of threads.  [perl #77672]

=item *

C<< open $fh, ">+", undef >> no longer leaks memory when TMPDIR is set
but points to a directory a temporary file cannot be created in.  [perl
#120951]

=item *

C< for ( $h{k} || '' ) > no longer auto-vivifies C<$h{k}>.  [perl
#120374]
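
The fix can be checked with a short sketch; the hash and key names are
arbitrary:

    use strict;
    use warnings;

    my %h;

    # Looping over the expression must not create $h{k}
    for ( $h{k} || '' ) { 1 }

    my $key_exists = exists($h{k}) ? 1 : 0;
    print "$key_exists\n";    # prints "0"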

=item *

On Windows machines, Perl now emulates the POSIX use of the environment
for locale initialization.  Previously, the environment was ignored.
See L<perllocale/ENVIRONMENT>.

=item *

Fixed a crash when destroying a self-referencing GLOB.  [perl #121242]

=back

=head1 Known Problems

=over 4

=item *

L<IO::Socket> is known to fail tests on AIX 5.3.  There is
L<a patch|https://rt.perl.org/Ticket/Display.html?id=120835> in the request
tracker, #120835, which may be applied to future releases.

=item *

The following modules are known to have test failures with this version of
Perl.  Patches have been submitted, so there will hopefully be new releases
soon:

=over

=item *

L<Data::Structure::Util> version 0.15

=item *

L<HTML::StripScripts> version 1.05

=item *

L<List::Gather> version 0.08

=back

=back

=head1 Obituary

Diana Rosa, 27, of Rio de Janeiro, went to her long rest on May 10,
2014, along with the plush camel she kept hanging on her computer screen
all the time. She was a passionate Perl hacker who loved the language and its
community, and who never missed a Rio.pm event. She was a true artist, an
enthusiast about writing code, singing arias and graffiting walls. We'll never
forget you.

Greg McCarroll died on August 28, 2013.

Greg was well known for many good reasons. He was one of the organisers of
the first YAPC::Europe, which concluded with an unscheduled auction where he
frantically tried to raise extra money to avoid the conference making a
loss. It was Greg who mistakenly arrived for a london.pm meeting a week
late; some years later he was the one who sold the choice of official
meeting date at a YAPC::Europe auction, and eventually as glorious leader of
london.pm he got to inherit the irreverent confusion that he had created.

Always helpful, friendly and cheerfully optimistic, you will be missed, but
never forgotten.

=head1 Acknowledgements

Perl 5.20.0 represents approximately 12 months of development since Perl 5.18.0
and contains approximately 470,000 lines of changes across 2,900 files from 124
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 280,000 lines of changes to 1,800 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers. The following people are known to have contributed the
improvements that became Perl 5.20.0:

Aaron Crane, Abhijit Menon-Sen, Abigail, Abir Viqar, Alan Haggai Alavi, Alan
Hourihane, Alexander Voronov, Alexandr Ciornii, Andy Dougherty, Anno Siegel,
Aristotle Pagaltzis, Arthur Axel 'fREW' Schmidt, Brad Gilbert, Brendan Byrd,
Brian Childs, Brian Fraser, Brian Gottreu, Chris 'BinGOs' Williams, Christian
Millour, Colin Kuskie, Craig A. Berry, Dabrien 'Dabe' Murphy, Dagfinn Ilmari
Mannsåker, Daniel Dragan, Darin McBride, David Golden, David Leadbeater, David
Mitchell, David Nicol, David Steinbrunner, Dennis Kaarsemaker, Dominic
Hargreaves, Ed Avis, Eric Brine, Evan Zacks, Father Chrysostomos, Florian
Ragwitz, François Perrad, Gavin Shelley, Gideon Israel Dsouza, Gisle Aas,
Graham Knop, H.Merijn Brand, Hauke D, Heiko Eissfeldt, Hiroo Hayashi, Hojung
Youn, James E Keenan, Jarkko Hietaniemi, Jerry D. Hedden, Jess Robinson, Jesse
Luehrs, Johan Vromans, John Gardiner Myers, John Goodyear, John P. Linderman,
John Peacock, kafka, Kang-min Liu, Karen Etheridge, Karl Williamson, Keedi Kim,
Kent Fredric, kevin dawson, Kevin Falcone, Kevin Ryde, Leon Timmermans, Lukas
Mai, Marc Simpson, Marcel Grünauer, Marco Peereboom, Marcus Holland-Moritz,
Mark Jason Dominus, Martin McGrath, Matthew Horsfall, Max Maischein, Mike
Doherty, Moritz Lenz, Nathan Glenn, Nathan Trapuzzano, Neil Bowers, Neil
Williams, Nicholas Clark, Niels Thykier, Niko Tyni, Olivier Mengué, Owain G.
Ainsworth, Paul Green, Paul Johnson, Peter John Acklam, Peter Martini, Peter
Rabbitson, Petr Písař, Philip Boulain, Philip Guenther, Piotr Roszatycki,
Rafael Garcia-Suarez, Reini Urban, Reuben Thomas, Ricardo Signes, Ruslan
Zakirov, Sergey Alekseev, Shirakata Kentaro, Shlomi Fish, Slaven Rezic,
Smylers, Steffen Müller, Steve Hay, Sullivan Beck, Thomas Sibley, Tobias
Leich, Toby Inkster, Tokuhiro Matsuno, Tom Christiansen, Tom Hukins, Tony Cook,
Victor Efimov, Viktor Turskyi, Vladimir Timofeev, YAMASHINA Hio, Yves Orton,
Zefram, Zsbán Ambrus, Ævar Arnfjörð Bjarmason.

The list above is almost certainly incomplete as it is automatically generated
from version control history. In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core. We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
http://rt.perl.org/perlbug/ .  There may also be information at
http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=encoding utf8

If you are reading this file as is, please ignore the funny characters.
This document is written in the POD format (see F<pod/perlpod.pod>) so
that it can be read as POD.


=head1 NAME

perlko - Korean Perl guide

=head1 DESCRIPTION

Welcome to the world of Perl!

Perl is sometimes called the B<'Practical Extraction and Report Language'>,
and among other popular expansions it is also called the
B<'Pathologically Eclectic Rubbish Lister'>. Both are in fact backronyms:
Perl was not named by taking their initial letters. Larry, the creator of
Perl, came up with the name first and the expansions afterwards. That is
why B<'Perl'> is not written in all capitals. There is no point in arguing
over which expansion is the right one. Larry endorses both.

You will sometimes see B<'perl'> written with a lowercase p. B<'Perl'> with
a capital P refers to the language, while B<'perl'> with a lowercase p refers
to the interpreter that compiles and runs your programs.


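=head1 About Perl

Perl was originally developed for text manipulation, but it is now a
general-purpose programming language used in a wide range of fields,
including system administration, web development, network programming and
GUI development.

The language is intended to be practical (easy to use, efficient, complete)
rather than beautiful (tiny, elegant, minimal).
Its major features are that it is easy to use, supports both procedural and
object-oriented programming, has powerful built-in text processing, and has
one of the world's most impressive collections of third-party modules.

The language features of Perl are introduced in F<pod/perlintro.pod>.

The most important changes in this release are discussed in
F<pod/perldelta.pod>.

There are also many Perl books from a variety of publishers covering a wide
range of topics. See F<pod/perlbook.pod> for more information.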


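=head1 Installation

If you are using a relatively modern operating system and want to install
this version of Perl locally, run the following commands.

    ./Configure -des -Dprefix=$HOME/localperl
    make test
    make install

These commands configure and compile perl for your platform, run the
regression test suite, and then install perl in the F<localperl> directory
under your home directory.

If you run into any problems, or need to install a customized version of
Perl, you should read the detailed instructions in the F<INSTALL> file that
comes with this distribution. In addition, there are a number of F<README>
files with hints and tips about building and using Perl on a wide variety of
less common platforms.

Once Perl is installed, a wealth of documentation is available through the
C<perldoc> tool. To get started, run the following command.

    perldoc perl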


=head1 If You Run Into Trouble

Perl is a large and complex system that is used for everything from knitting
to rocket science. If you run into trouble, it is quite likely that someone
else has already solved the problem you are facing. If you have checked all
the documentation and are sure you have found a bug, please report it to us
with the C<perlbug> tool. For more information about C<perlbug>, run
C<perldoc perlbug> or C<perlbug> from the command line.

Even if you have Perl working, Perl keeps evolving, so there may be a more
recent version that fixes bugs you have run into or adds new features that
you may find useful.

You can always find the latest version of perl on a CPAN (Comprehensive Perl
Archive Network) site at L<http://www.cpan.org/src/>.

If you want to submit a simple patch to the perl source, see the
B<"SUPER QUICK PATCH GUIDE"> in F<pod/perlhack.pod>.

Just a personal note:
I want you to know that I create nice things like this
because it pleases the B<"Author"> of my story.
If this bothers you, then your notion of B<"Authorship"> may need
some revision. But you can use Perl anyway. :-)

- From the B<"author">.


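=head1 Encodings

Since version 5.8.0, Perl has had extensive support for Unicode/ISO 10646.
As part of that Unicode support, it handles the many encodings that were in
use before Unicode, and are still widely used, in China, Japan, Korea and
other countries around the world. Unicode aims to accommodate the writing
systems of all the languages used in the world (the Latin, Cyrillic and
Greek alphabets of Europe, the Brahmic scripts of India and Southeast Asia,
Arabic script, Hebrew script, CJK ideographs, Korean Hangul, Japanese Kana,
the writing systems of Native North Americans, and so on). It therefore
includes not only every character available in the existing character sets
and encodings specific to each language, country and operating system, but
also a great many characters that those character sets did not support.

Perl uses Unicode internally to represent characters.
More concretely, UTF-8 strings can be used inside Perl scripts, and the
various functions and operators (for example, regular expressions, index,
substr) work on Unicode characters rather than bytes.
See F<pod/perlunicode.pod> for details.
The L<Encode> module helps with input and output in the national and
language-specific encodings that were widely used before Unicode became
widespread, and still are, and with handling data and documents in those
encodings.
Above all, the L<Encode> module makes it easy to convert between the many
encodings.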


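=head2 The Encode module

=head3 Supported encodings

The L<Encode> module supports the following Korean encodings.

=over 4

=item * C<euc-kr>

A multibyte encoding that combines US-ASCII and KS X 1001, commonly
known as Wansung. See KS X 2901 and RFC 1557.

=item * C<cp949>

Extended Wansung, used in MS-Windows 9x/ME. It adds 8,822 Hangul
syllables to euc-kr. Its aliases are uhc, windows-949, x-windows-949 and
ks_c_5601-1987. The last name is not appropriate, but Microsoft
products use it to mean CP949.

=item * C<johab>

The Johab encoding specified in Annex 3 of KS X 1001:1998. Its character
repertoire is the same as that of cp949, namely US-ASCII and KS X 1001 plus
8,822 Hangul syllables, but the encoding scheme is entirely different.

=item * C<iso-2022-kr>

The encoding for Korean Internet mail exchange specified in RFC 1557. It has
the same repertoire as euc-kr (US-ASCII and KS X 1001) but uses a different
encoding scheme. It was used until about 1997-8 but is no longer used for
mail exchange.

=item * C<ksc5601-raw>

KS X 1001 (KS C 5601) placed in GL (that is, with the MSB set to 0).
It is rarely used on its own without US-ASCII, except as a font
encoding in X11 and similar systems (ksc5601.1987-0, where '0' denotes GL).
KS C 5601 was renamed KS X 1001 in 1997. Two characters (the euro sign and
the registered trademark sign) were added in 1998.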

=back

=head3 Conversion examples

For example, to convert a file in the euc-kr encoding to UTF-8, run the
following on the command line.

    perl -Mencoding=euc-kr,STDOUT,utf8 -pe1 < file.euc-kr > file.utf8

To convert in the other direction, run the following.

    perl -Mencoding=utf8,STDOUT,euc-kr -pe1 < file.utf8 > file.euc-kr

F<piconv>, which ships with Perl, makes this kind of conversion more
convenient. It is a pure-Perl utility built on the L<Encode> module and,
as its name suggests, is modelled on the Unix C<iconv>.
It is used as follows.

   piconv -f euc-kr -t utf8 < file.euc-kr > file.utf8
   piconv -f utf8 -t euc-kr < file.utf8 > file.euc-kr

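=head3 Best practices

Perl uses UTF-8 internally by default and supports a variety of encodings
through the Encode module, but we recommend always following these rules to
reduce the likelihood of the various encoding-related problems that can arise.

=over 4

=item * Always save source code in the UTF-8 encoding

=item * Use the C<use utf8;> pragma at the top of the source code

=item * Treat the encodings of the source code, the terminal, the operating system and the data as separate things

=item * Use an explicit encoding on input and output file handles

=item * Watch out for double encoding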

=back


=head3 Resources on Unicode and Korean encodings

=over 4

=item * L<perluniintro>

=item * L<perlunicode>

=item * L<Encode>

=item * L<Encode::KR>

=item * L<encoding>

=item * L<http://www.unicode.org/>

The Unicode Consortium

=item * L<http://std.dkuug.dk/JTC1/SC2/WG2>

The web page of ISO/IEC JTC1/SC2/WG2, which maintains ISO/IEC 10646 UCS
(Universal Character Set), an ISO standard that is essentially the same as
Unicode

=item * L<http://www.cl.cam.ac.uk/~mgk25/unicode.html>

UTF-8 and Unicode FAQ for Unix/Linux users

=item * L<http://wiki.kldp.org/Translations/html/UTF8-Unicode-KLDP/UTF8-Unicode-KLDP.html>

Korean translation of the UTF-8 and Unicode FAQ for Unix/Linux users

=back


=head1 Perl resources

The following are some of the official Perl resources.

=over 4

=item * L<http://www.perl.org/>

The official Perl home page

=item * L<http://www.perl.com/>

O'Reilly's Perl web page

=item * L<http://www.cpan.org/>

CPAN - Comprehensive Perl Archive Network

=item * L<http://metacpan.org>

MetaCPAN

=item * L<http://lists.perl.org/>

Perl mailing lists

=item * L<http://blogs.perl.org/>

Perl meta blog

=item * L<http://www.perlmonks.org/>

The Monastery for Perl monks

=item * L<http://www.pm.org/groups/asia.html>

Perl Mongers groups in Asia

=item * L<http://www.perladvent.org/>

Perl Advent Calendar

=back


The following Korean-language sites can help you study Perl in more depth.

=over 4

=item * L<http://perl.kr/>

Official portal of the Korean Perl community

=item * L<http://doc.perl.kr/>

Perl documentation Korean translation project

=item * L<http://cafe.naver.com/perlstudy.cafe>

Naver Perl cafe

=item * L<http://www.perl.or.kr/>

Korean Perl users group

=item * L<http://advent.perl.kr>

Seoul.pm Perl Advent Calendar (2010 ~ 2012)

=item * L<http://gypark.pe.kr/wiki/Perl>

GYPARK (Geunyoung Park)'s repository of Korean-language Perl documents

=item * L<http://seoul.pm.org>

Seoul.pm - Seoul Perl Mongers

=back


=head1 License

See the B<'LICENSING'> section of the F<README> file.


=head1 AUTHORS

=over

=item * Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>

=item * 신정식 E<lt>jshin@mailaps.orgE<gt>

=item * 김도형 E<lt>keedi@cpan.orgE<gt>

=back


=cut

If you read this file _as_is_, just ignore the funny characters you
see.  It is written in the POD format (see pod/perlpod.pod) which is
specifically designed to be readable as is.

=head1 NAME

perlriscos - Perl version 5 for RISC OS

=head1 DESCRIPTION

This document gives instructions for building Perl for RISC OS. It is
complicated by the need to cross compile. There is a binary version of
perl available from L<http://www.cp15.org/perl/> which you may wish to
use instead of trying to compile it yourself.

=head1 BUILD

You need an installed and working gccsdk cross compiler
L<http://gccsdk.riscos.info/> and REXEN
L<http://www.cp15.org/programming/>.

Firstly, copy the source and build a native copy of perl for your host
system.
Then, in the source to be cross compiled:

=over 4

=item 1.

    $ ./Configure

=item 2.

Select the riscos hint file. The default answers for the rest of the
questions are usually sufficient.

Note that, if you wish to run Configure non-interactively (see the INSTALL
document for details), to have it select the correct hint file, you'll
need to provide the argument -Dhintfile=riscos on the Configure
command-line.

=item 3.

    $ make miniperl

=item 4.

This should build miniperl and then fail when it tries to run it.

=item 5.

Copy the miniperl executable from the native build done earlier to
replace the cross compiled miniperl.

=item 6.

    $ make

=item 7.

This will use miniperl to complete the rest of the build.

=back

=head1 AUTHOR

Alex Waugh <alex@alexwaugh.com>

=encoding utf8

=head1 NAME

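perljp - Japanese Perl guide

=head1 DESCRIPTION

Welcome to the world of Perl!

Starting with Perl 5.8.0, Unicode support has been greatly strengthened, and as a result support has been added for character encodings beyond the Latin ones, including CJK (Chinese, Japanese and Korean). Unicode is a standard that aims to handle all of the world's characters in a single character encoding. It already includes the scripts from East to West and everything in between (Greek, Cyrillic, Arabic, Hebrew, Devanagari and so on), as well as characters that until now were defined independently by OS vendors (PC and Macintosh).

Perl itself works in Unicode. String literals and regular expressions in Perl scripts are assumed to be Unicode. For input and output, the "Encode" module, which handles the various character encodings that have been in use until now, comes as standard, so converting between Unicode and these encodings is easy.

The encodings currently supported by Encode are as follows.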

  7bit-jis      AdobeStandardEncoding AdobeSymbol       AdobeZdingbat
  ascii             big5              big5-hkscs        cp1006
  cp1026            cp1047            cp1250            cp1251
  cp1252            cp1253            cp1254            cp1255
  cp1256            cp1257            cp1258            cp37
  cp424             cp437             cp500             cp737
  cp775             cp850             cp852             cp855
  cp856             cp857             cp860             cp861
  cp862             cp863             cp864             cp865
  cp866             cp869             cp874             cp875
  cp932             cp936             cp949             cp950
  dingbats          euc-cn            euc-jp            euc-kr
  gb12345-raw       gb2312-raw        gsm0338           hp-roman8
  hz                iso-2022-jp       iso-2022-jp-1     iso-8859-1
  iso-8859-10       iso-8859-11       iso-8859-13       iso-8859-14
  iso-8859-15       iso-8859-16       iso-8859-2        iso-8859-3
  iso-8859-4        iso-8859-5        iso-8859-6        iso-8859-7
  iso-8859-8        iso-8859-9        iso-ir-165        jis0201-raw
  jis0208-raw       jis0212-raw       johab             koi8-f
  koi8-r            koi8-u            ksc5601-raw       MacArabic
  MacCentralEurRoman  MacChineseSimp    MacChineseTrad    MacCroatian
  MacCyrillic       MacDingbats       MacFarsi          MacGreek
  MacHebrew         MacIcelandic      MacJapanese       MacKorean
  MacRoman          MacRomanian       MacRumanian       MacSami
  MacSymbol         MacThai           MacTurkish        MacUkrainian
  nextstep          posix-bc          shiftjis          symbol
  UCS-2BE           UCS-2LE           UTF-16            UTF-16BE
  UTF-16LE          UTF-32            UTF-32BE          UTF-32LE
  utf8              viscii                              

(114 encodings in total)

For example, to convert a file in encoding FOO to UTF-8, do the following.

    perl -Mencoding=FOO,STDOUT,utf8 -pe1 < file.FOO > file.utf8

Perl also comes with piconv, a character encoding conversion utility written entirely in Perl, so you can also do the following.

   piconv -f FOO -t utf8 < file.FOO > file.utf8
   piconv -f utf8 -t FOO < file.utf8 > file.FOO

=head2 About (jcode.pl|Jcode.pm|JPerl)

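Before 5.8, a script written in EUC-JP could at least handle literals. In addition, Jcode.pm ( L<http://openlab.ring.gr.jp/Jcode/> ) existed as a module for handling input and output, and jcode.pl existed as a utility for perl4. Many readers will know that these are widely used in CGI scripts that handle Japanese. However, it was not possible to handle regular expressions in Japanese properly.

For Perl 5.005 and earlier there was Jperl, a localized version specialized for Japanese ( L<http://homepage2.nifty.com/kipp/perl/jperl/index.html> ). There was also MacJPerl, a Japanese version of MacPerl, the Perl for Mac OS 9.x/Classic ( L<http://habilis.net/macjperl/> ). These could handle Shift_JIS directly in addition to EUC-JP, and could also handle regular expressions in Japanese.

In Perl 5.8, all of these features are available in Perl itself. On top of that, Perl can handle not only Japanese but all 114 of the encodings listed above, even at the same time. It has also become easy to obtain modules for new encodings from CPAN and elsewhere.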

=over 4

=item *

Input and output

Each of the following examples converts Shift_JIS input to EUC-JP and prints it.

  # jcode.pl
  require "jcode.pl";
  while(<>){
    jcode::convert(*_, 'euc', 'sjis');
    print;
  }
  # Jcode.pm
  use Jcode;
  while(<>){
    print Jcode->new($_, 'sjis')->euc;
  }
  # Perl 5.8
  use Encode;
  while(<>){
    from_to($_, 'shiftjis', 'euc-jp');
    print;
  }
  # Perl 5.8 - using encoding
  use encoding 'euc-jp', STDIN => 'shiftjis';
  while(<>){
    print;
  }

=item *

Jperl-compatible scripts

Most scripts written for Jperl can probably be used without modification, just by changing the so-called "shebang".

   #!/path/to/jperl
   ↓
   #!/path/to/perl -Mencoding=euc-jp

See perldoc encoding for details.

=back

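=head2 More information

Perl comes with a vast amount of documentation that covers in detail Perl's new features, its Unicode support, how to use the Encode module, and more (unfortunately, almost all of it is in English). You can view some of it with the following commands.

  perldoc perlunicode # Perl's Unicode support in general
  perldoc Encode      # About the Encode module
  perldoc Encode::JP  # About Japanese character encodings in particular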

=head2 URLs about Perl in general

=over 4

=item L<http://www.perl.com/>

Perl home page (O'Reilly and Associates)

=item L<http://www.cpan.org/>

CPAN (Comprehensive Perl Archive Network)

=item L<http://lists.perl.org/>

Perl mailing lists

=back

=head2 URLs useful for learning Perl

=over 4

=item L<http://www.oreilly.com.tw/>

O'Reilly's Perl books (Traditional Chinese)

=item L<http://www.oreilly.com.cn/>

O'Reilly's Perl books (Simplified Chinese)

=item L<http://www.oreilly.co.jp/catalog/>

O'Reilly's Perl books (Japanese)

=back

=head2 Perl user groups

=over 4

=item L<http://www.pm.org/groups/asia.html>

=back

=head2 Unicode-related URLs

=over 4

=item L<http://www.unicode.org/>

Unicode Consortium (the body that defines the Unicode standard)

=item L<http://www.cl.cam.ac.uk/%7Emgk25/unicode.html>

UTF-8 and Unicode FAQ for Unix/Linux

=item L<http://wiki.kldp.org/Translations/html/UTF8-Unicode-KLDP/UTF8-Unicode-KLDP.html>

UTF-8 and Unicode FAQ for Unix/Linux (Korean translation)

=back

=head1 AUTHORS

Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>
Dan Kogai (小飼　弾) E<lt>dankogai@dan.co.jpE<gt>

=cut

=head1 NAME

perlreguts - Description of the Perl regular expression engine.

=head1 DESCRIPTION

This document is an attempt to shine some light on the guts of the regex
engine and how it works. The regex engine represents a significant chunk
of the perl codebase, but is relatively poorly understood. This document
is a meagre attempt at addressing this situation. It is derived from the
author's experience, comments in the source code, other papers on the
regex engine, feedback on the perl5-porters mail list, and no doubt other
places as well.

B<NOTICE!> It should be clearly understood that the behavior and
structures discussed in this document represent the state of the engine as the
author understood it at the time of writing. It is B<NOT> an API
definition, it is purely an internals guide for those who want to hack
the regex engine, or understand how the regex engine works. Readers of
this document are expected to understand perl's regex syntax and its
usage in detail. If you want to learn about the basics of Perl's
regular expressions, see L<perlre>. And if you want to replace the
regex engine with your own, see L<perlreapi>.

=head1 OVERVIEW

=head2 A quick note on terms

There is some debate as to whether to say "regexp" or "regex". In this
document we will use the term "regex" unless there is a special reason
not to, in which case we will explain why.

When speaking about regexes we need to distinguish between their source
code form and their internal form. In this document we will use the term
"pattern" when we speak of their textual, source code form, and the term
"program" when we speak of their internal representation. These
correspond to the terms I<S-regex> and I<B-regex> that Mark Jason
Dominus employs in his paper on "Rx" ([1] in L</REFERENCES>).

=head2 What is a regular expression engine?

A regular expression engine is a program that takes a set of constraints
specified in a mini-language, and then applies those constraints to a
target string, and determines whether or not the string satisfies the
constraints. See L<perlre> for a full definition of the language.

In less grandiose terms, the first part of the job is to turn a pattern into
something the computer can efficiently use to find the matching point in
the string, and the second part is performing the search itself.

To do this we need to produce a program by parsing the text. We then
need to execute the program to find the point in the string that
matches. And we need to do the whole thing efficiently.

=head2 Structure of a Regexp Program

=head3 High Level

Although it is a bit confusing and some people object to the terminology, it
is worth taking a look at a comment that has
been in F<regexp.h> for years:

I<This is essentially a linear encoding of a nondeterministic
finite-state machine (aka syntax charts or "railroad normal form" in
parsing technology).>

The term "railroad normal form" is a bit esoteric, with "syntax
diagram/charts", or "railroad diagram/charts" being more common terms.
Nevertheless it provides a useful mental image of a regex program: each
node can be thought of as a unit of track, with a single entry and in
most cases a single exit point (there are pieces of track that fork, but
statistically not many), and the whole forms a layout with a
single entry and single exit point. The matching process can be thought
of as a car that moves along the track, with the particular route through
the system being determined by the character read at each possible
connector point. A car can fall off the track at any point but it may
only proceed as long as it matches the track.

Thus the pattern C</foo(?:\w+|\d+|\s+)bar/> can be thought of as the
following chart:

                      [start]
                         |
                       <foo>
                         |
                   +-----+-----+
                   |     |     |
                 <\w+> <\d+> <\s+>
                   |     |     |
                   +-----+-----+
                         |
                       <bar>
                         |
                       [end]

The truth of the matter is that perl's regular expressions these days are
much more complex than this kind of structure, but visualising it this way
can help when trying to get your bearings, and it matches the
current implementation pretty closely.

To be more precise, we will say that a regex program is an encoding
of a graph. Each node in the graph corresponds to part of
the original regex pattern, such as a literal string or a branch,
and has a pointer to the nodes representing the next component
to be matched. Since "node" and "opcode" already have other meanings in the
perl source, we will call the nodes in a regex program "regops".

The program is represented by an array of C<regnode> structures, one or
more of which represent a single regop of the program. Struct
C<regnode> is the smallest struct needed, and has a field structure which is
shared with all the other larger structures.

The "next" pointers of all regops except C<BRANCH> implement concatenation;
a "next" pointer with a C<BRANCH> on both ends of it is connecting two
alternatives.  [Here we have one of the subtle syntax dependencies: an
individual C<BRANCH> (as opposed to a collection of them) is never
concatenated with anything because of operator precedence.]

The operand of some types of regop is a literal string; for others,
it is a regop leading into a sub-program.  In particular, the operand
of a C<BRANCH> node is the first regop of the branch.

B<NOTE>: As the railroad metaphor suggests, this is B<not> a tree
structure:  the tail of the branch connects to the thing following the
set of C<BRANCH>es.  It is like a single line of railway track that
splits as it goes into a station or railway yard and rejoins as it comes
out the other side.

=head3 Regops

The base structure of a regop is defined in F<regexp.h> as follows:

    struct regnode {
        U8  flags;    /* Various purposes, sometimes overridden */
        U8  type;     /* Opcode value as specified by regnodes.h */
        U16 next_off; /* Offset in size regnode */
    };

Other larger C<regnode>-like structures are defined in F<regcomp.h>. They
are almost like subclasses in that they have the same fields as
C<regnode>, with possibly additional fields following in
the structure, and in some cases the specific meaning (and name)
of some of the base fields is overridden. The following is a more
complete description.

=over 4

=item C<regnode_1>

=item C<regnode_2>

C<regnode_1> structures have the same header, followed by a single
four-byte argument; C<regnode_2> structures contain two two-byte
arguments instead:

    regnode_1                U32 arg1;
    regnode_2                U16 arg1;  U16 arg2;

=item C<regnode_string>

C<regnode_string> structures, used for literal strings, follow the header
with a one-byte length and then the string data. Strings are padded on
the end with zero bytes so that the total length of the node is a
multiple of four bytes:

    regnode_string           char string[1];
                             U8 str_len; /* overrides flags */

=item C<regnode_charclass>

Bracketed character classes are represented by C<regnode_charclass>
structures, which have a four-byte argument and then a 32-byte (256-bit)
bitmap indicating which characters in the Latin1 range are included in
the class.

    regnode_charclass        U32 arg1;
                             char bitmap[ANYOF_BITMAP_SIZE];

Various flags whose names begin with C<ANYOF_> are used for special
situations.  Above Latin1 matches and things not known until run-time
are stored in L</Perl's pprivate structure>.

=item C<regnode_charclass_posixl>

There is also a larger form of a char class structure used to represent
POSIX char classes under C</l> matching,
called C<regnode_charclass_posixl> which has an
additional 32-bit bitmap indicating which POSIX char classes
have been included.

   regnode_charclass_posixl U32 arg1;
                            char bitmap[ANYOF_BITMAP_SIZE];
                            U32 classflags;

=back

F<regnodes.h> defines an array called C<regarglen[]> which gives the size
of each opcode in units of C<size regnode> (4-byte). A macro is used
to calculate the size of an C<EXACT> node based on its C<str_len> field.
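
The layouts above, and the size calculation for a string node, can be
sketched in plain C. This is a simplified illustration and not the code in
F<regcomp.h>: every name beginning with C<sketch_> is hypothetical, the
fixed-width C<stdint.h> types stand in for perl's C<U8>, C<U16> and C<U32>,
and the rounding rule is an assumption based on the description of string
nodes being padded to a multiple of four bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the base regnode: one 4-byte unit. */
typedef struct sketch_regnode {
    uint8_t  flags;
    uint8_t  type;
    uint16_t next_off;
} sketch_regnode;

/* Same header, followed by one four-byte argument. */
typedef struct sketch_regnode_1 {
    uint8_t  flags;
    uint8_t  type;
    uint16_t next_off;
    uint32_t arg1;
} sketch_regnode_1;

/* Same header, followed by two two-byte arguments. */
typedef struct sketch_regnode_2 {
    uint8_t  flags;
    uint8_t  type;
    uint16_t next_off;
    uint16_t arg1;
    uint16_t arg2;
} sketch_regnode_2;

/* Units of sizeof(sketch_regnode) needed to hold str_len bytes of
 * string data, rounding up so the node stays unit-aligned. */
static size_t sketch_str_units(size_t str_len)
{
    return (str_len + sizeof(sketch_regnode) - 1) / sizeof(sketch_regnode);
}

/* Total size of a string node in units: one header unit (whose flags
 * byte doubles as the length) plus the padded string data. */
static size_t sketch_string_node_units(size_t str_len)
{
    return 1 + sketch_str_units(str_len);
}
```

Under these assumptions a three-byte literal such as "foo" occupies two
units (eight bytes), and a five-byte literal occupies three.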

The regops are defined in F<regnodes.h> which is generated from
F<regcomp.sym> by F<regcomp.pl>. Currently the maximum possible number
of distinct regops is restricted to 256, with about a quarter already
used.

A set of macros makes accessing the fields
easier and more consistent. These include C<OP()>, which is used to determine
the type of a C<regnode>-like structure; C<NEXT_OFF()>, which is the offset to
the next node (more on this later); C<ARG()>, C<ARG1()>, C<ARG2()>, C<ARG_SET()>,
and equivalents for reading and setting the arguments; and C<STR_LEN()>,
C<STRING()> and C<OPERAND()> for manipulating strings and regop bearing
types.

=head3 What regop is next?

There are three distinct concepts of "next" in the regex engine, and
it is important to keep them clear.

=over 4

=item *

There is the "next regnode" from a given regnode, a value which is
rarely useful except that sometimes it matches up in terms of value
with one of the others, and that sometimes the code assumes this to
always be so.

=item *

There is the "next regop" from a given regop/regnode. This is the
regop physically located after the current one, as determined by
the size of the current regop. This is often useful, such as when
dumping the structure we use this order to traverse. Sometimes the code
assumes that the "next regnode" is the same as the "next regop", or in
other words assumes that the size of a given regop type is always going
to be one regnode large.

=item *

There is the "regnext" from a given regop. This is the regop which
is reached by jumping forward by the value of C<NEXT_OFF()>,
or in a few cases for longer jumps by the C<arg1> field of the C<regnode_1>
structure. The subroutine C<regnext()> handles this transparently.
This is the logical successor of the node, which in some cases, like
that of the C<BRANCH> regop, has special meaning.

=back
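
The three notions of "next" can be contrasted in a small C sketch. It is an
illustration under stated assumptions, not the engine's code: the
C<sketch_> names are hypothetical, the per-type size table passed to
C<sketch_next_regop()> is a stand-in for whatever the engine uses to size a
regop, and the long-jump case that reads C<arg1> is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the base regnode: one 4-byte unit. */
typedef struct sketch_regnode {
    uint8_t  flags;
    uint8_t  type;
    uint16_t next_off;
} sketch_regnode;

/* "Next regnode": simply the adjacent unit in the array. */
static const sketch_regnode *
sketch_next_regnode(const sketch_regnode *p)
{
    return p + 1;
}

/* "Next regop": skip the whole current regop. extra_units[type] is an
 * assumed table giving how many units follow the header for that type. */
static const sketch_regnode *
sketch_next_regop(const sketch_regnode *p, const uint8_t *extra_units)
{
    return p + 1 + extra_units[p->type];
}

/* "regnext": follow the stored offset to the logical successor.
 * An offset of zero is taken to mean "no successor". */
static const sketch_regnode *
sketch_regnext(const sketch_regnode *p)
{
    return p->next_off ? p + p->next_off : NULL;
}
```

For a two-unit regop whose C<next_off> is 3, the three functions return
three different addresses, which is why code that treats them as
interchangeable only works for one-unit regops with an offset of 1.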

=head1 Process Overview

Broadly speaking, performing a match of a string against a pattern
involves the following steps:

=over 5

=item A. Compilation

=over 5

=item 1. Parsing for size

=item 2. Parsing for construction

=item 3. Peep-hole optimisation and analysis

=back

=item B. Execution

=over 5

=item 4. Start position and no-match optimisations

=item 5. Program execution

=back

=back


Where these steps occur in the actual execution of a perl program is
determined by whether the pattern involves interpolating any string
variables. If interpolation occurs, then compilation happens at run time. If it
does not, then compilation is performed at compile time. (The C</o> modifier changes this,
as does C<qr//> to a certain extent.) The engine doesn't really care that
much.

=head2 Compilation

This code resides primarily in F<regcomp.c>, along with the header files
F<regcomp.h>, F<regexp.h> and F<regnodes.h>.

Compilation starts with C<pregcomp()>, which is mostly an initialisation
wrapper which farms work out to two other routines for the heavy lifting: the
first is C<reg()>, which is the start point for parsing; the second,
C<study_chunk()>, is responsible for optimisation.

Initialisation in C<pregcomp()> mostly involves the creation and data-filling
of a special structure, C<RExC_state_t> (defined in F<regcomp.c>).
Almost all internally-used routines in F<regcomp.h> take a pointer to one
of these structures as their first argument, with the name C<pRExC_state>.
This structure is used to store the compilation state and contains many
fields. Likewise there are many macros which operate on this
variable: anything that looks like C<RExC_xxxx> is a macro that operates on
this pointer/structure.

=head3 Parsing for size

In this pass the input pattern is parsed in order to calculate how much
space is needed for each regop we would need to emit. The size is also
used to determine whether long jumps will be required in the program.

This stage is controlled by the macro C<SIZE_ONLY> being set.

The parse proceeds pretty much exactly as it does during the
construction phase, except that most routines are short-circuited to
change the size field C<RExC_size> and not do anything else.

=head3 Parsing for construction

Once the size of the program has been determined, the pattern is parsed
again, but this time for real. Now C<SIZE_ONLY> will be false, and the
actual construction can occur.
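
As a rough illustration of the two passes, the following C sketch runs one
toy parser twice: first with a C<size_only> flag set, so that emitting a
node only bumps a counter, and then again with the flag cleared, writing
into a buffer of exactly the size just computed. Every name here
(C<toy_state>, C<toy_emit()>, C<toy_compile()>) is hypothetical, and one
byte per pattern character stands in for real regops; it is not the code
in F<regcomp.c>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: one one-byte "node" per pattern character. */
typedef struct {
    int size_only;   /* plays the role of SIZE_ONLY */
    size_t size;     /* plays the role of RExC_size */
    char *emit;      /* next free slot; unused while sizing */
} toy_state;

/* The same routine serves both passes. */
static void toy_emit(toy_state *st, char op)
{
    if (st->size_only) {
        st->size++;          /* pass 1: just count */
        return;
    }
    *st->emit++ = op;        /* pass 2: really write the node */
}

static void toy_parse(toy_state *st, const char *pat)
{
    for (; *pat; pat++)
        toy_emit(st, *pat);
}

/* Returns a malloc'd program of *len nodes, or NULL if malloc fails. */
static char *toy_compile(const char *pat, size_t *len)
{
    toy_state st = { 1, 0, NULL };
    char *prog;

    toy_parse(&st, pat);                   /* pass 1: sizing */
    prog = malloc(st.size ? st.size : 1);
    if (!prog)
        return NULL;
    st.size_only = 0;
    st.emit = prog;
    toy_parse(&st, pat);                   /* pass 2: construction */
    assert((size_t)(st.emit - prog) == st.size);
    *len = st.size;
    return prog;
}
```

The point of the design is that both passes walk the pattern through the
same code path, so the size computed in the first pass cannot drift from
what the second pass writes.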

C<reg()> is the start of the parse process. It is responsible for
parsing an arbitrary chunk of pattern up to either the end of the
string, or the first closing parenthesis it encounters in the pattern.
This means it can be used to parse the top-level regex, or any section
inside of a grouping parenthesis. It also handles the "special parens"
that perl's regexes have. For instance when parsing C</x(?:foo)y/> C<reg()>
will at one point be called to parse from the "?" symbol up to and
including the ")".

Additionally, C<reg()> is responsible for parsing the one or more
branches from the pattern, and for "finishing them off" by correctly
setting their next pointers. In order to do the parsing, it repeatedly
calls out to C<regbranch()>, which is responsible for handling up to the
first C<|> symbol it sees.

C<regbranch()> in turn calls C<regpiece()> which
handles "things" followed by a quantifier. In order to parse the
"things", C<regatom()> is called. This is the lowest level routine, which
parses out constant strings, character classes, and the
various special symbols like C<$>. If C<regatom()> encounters a "("
character it in turn calls C<reg()>.
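
The mutual recursion described above can be sketched as a small
recursive-descent recogniser. This is a simplified stand-in, assuming
only literal characters, C<|>, parentheses and the quantifiers C<*>,
C<+> and C<?>; the C<toy_> functions are hypothetical and merely mirror
the roles of C<reg()>, C<regbranch()>, C<regpiece()> and C<regatom()>.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *p;   /* current position in the pattern */
    int atoms;       /* literal atoms seen so far */
} toy_parser;

static int toy_reg(toy_parser *ps);

/* atom: a literal character, or a parenthesised group (recurses) */
static int toy_atom(toy_parser *ps)
{
    if (*ps->p == '(') {
        ps->p++;
        if (!toy_reg(ps) || *ps->p != ')')
            return 0;               /* missing closing paren */
        ps->p++;
        return 1;
    }
    ps->p++;
    ps->atoms++;
    return 1;
}

/* piece: an atom optionally followed by a single quantifier */
static int toy_piece(toy_parser *ps)
{
    if (*ps->p == '*' || *ps->p == '+' || *ps->p == '?')
        return 0;                   /* quantifier follows nothing */
    if (!toy_atom(ps))
        return 0;
    if (*ps->p == '*' || *ps->p == '+' || *ps->p == '?')
        ps->p++;
    return 1;
}

/* branch: pieces up to the first '|', ')' or end of pattern */
static int toy_branch(toy_parser *ps)
{
    while (*ps->p && *ps->p != '|' && *ps->p != ')')
        if (!toy_piece(ps))
            return 0;
    return 1;
}

/* reg: one or more branches separated by '|' */
static int toy_reg(toy_parser *ps)
{
    if (!toy_branch(ps))
        return 0;
    while (*ps->p == '|') {
        ps->p++;
        if (!toy_branch(ps))
            return 0;
    }
    return 1;
}

/* Returns the number of literal atoms, or -1 on a syntax error. */
static int toy_count_atoms(const char *pat)
{
    toy_parser ps;

    assert(pat != NULL);
    ps.p = pat;
    ps.atoms = 0;
    if (!toy_reg(&ps) || *ps.p != '\0')
        return -1;
    return ps.atoms;
}
```

Note how C<toy_reg()> stops at a C<)> without consuming it, leaving the
caller (either C<toy_atom()> or the top level) to decide whether that
paren was expected.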

The routine C<regtail()> is called by both C<reg()> and C<regbranch()>
in order to "set the tail pointer" correctly. At execution time, when
we get to the end of a branch, we need to go to the node following the
grouping parens. When parsing, however, we don't know where the end will
be until we get there, so when we do we must go back and update the
offsets as appropriate. C<regtail()> is used to make this easier.
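
The back-patching idea can be shown with a minimal sketch. It assumes a
simplified node that stores only a relative C<next> offset, with 0
meaning "not yet known"; C<toy_node> and C<toy_regtail()> are
hypothetical names and the real routine handles considerably more cases.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *name;   /* for readability only */
    int next;           /* relative offset to the next node, 0 if unset */
} toy_node;

/* Follow the chain from 'start' to the node whose next is still unset,
 * then point that node at 'target'. */
static void toy_regtail(toy_node *prog, int start, int target)
{
    int i = start;

    assert(prog != NULL);
    while (prog[i].next != 0)
        i += prog[i].next;
    prog[i].next = target - i;
}
```

The caller does not need to remember which node ended the chain; it only
needs the start of the chain and the place the chain should lead to.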

A subtlety of the parsing process means that a regex like C</foo/> is
originally parsed into an alternation with a single branch. It is only
afterwards that the optimiser converts single branch alternations into the
simpler form.

=head3 Parse Call Graph and a Grammar

The call graph looks like this:

 reg()                        # parse a top level regex, or inside of
                              # parens
     regbranch()              # parse a single branch of an alternation
         regpiece()           # parse a pattern followed by a quantifier
             regatom()        # parse a simple pattern
                 regclass()   #   used to handle a class
                 reg()        #   used to handle a parenthesised
                              #   subpattern
                 ....
         ...
         regtail()            # finish off the branch
     ...
     regtail()                # finish off the branch sequence. Tie each
                              # branch's tail to the tail of the
                              # sequence
                              # (NEW) In Debug mode this is
                              # regtail_study().

A grammar form might be something like this:

    atom  : constant | class
    quant : '*' | '+' | '?' | '{min,max}'
    _branch: piece
           | piece _branch
           | nothing
    branch: _branch
          | _branch '|' branch
    group : '(' branch ')'
    _piece: atom | group
    piece : _piece
          | _piece quant

=head3 Parsing complications

The implication of the above description is that a pattern containing nested
parentheses will result in a call graph which cycles through C<reg()>,
C<regbranch()>, C<regpiece()>, C<regatom()>, C<reg()>, C<regbranch()> I<etc>
multiple times, until the deepest level of nesting is reached. All the above
routines return a pointer to a C<regnode>, which is usually the last regnode
added to the program. However, one complication is that C<reg()> returns NULL
for parsing C<(?:)> syntax for embedded modifiers, setting the flag
C<TRYAGAIN>. The C<TRYAGAIN> propagates upwards until it is captured, in
some cases by C<regatom()>, but otherwise unconditionally by
C<regbranch()>. Hence it will never be returned by C<regbranch()> to
C<reg()>. This flag permits patterns such as C<(?i)+> to be detected as
errors (I<Quantifier follows nothing in regex; marked by <-- HERE in m/(?i)+
<-- HERE />).

Another complication is that the representation used for the program differs
if it needs to store Unicode, but it's not always possible to know for sure
whether it does until midway through parsing. The Unicode representation for
the program is larger, and cannot be matched as efficiently. (See L</Unicode
and Localisation Support> below for more details as to why.)  If the pattern
contains literal Unicode, it's obvious that the program needs to store
Unicode. Otherwise, the parser optimistically assumes that the more
efficient representation can be used, and starts sizing on this basis.
However, if it then encounters something in the pattern which must be stored
as Unicode, such as an C<\x{...}> escape sequence representing a character
literal, then this means that all previously calculated sizes need to be
redone, using values appropriate for the Unicode representation. Currently,
all regular expression constructions which can trigger this are parsed by code
in C<regatom()>.

To avoid wasted work when a restart is needed, the sizing pass is abandoned
- C<regatom()> immediately returns NULL, setting the flag C<RESTART_UTF8>.
(This action is encapsulated using the macro C<REQUIRE_UTF8>.) This restart
request is propagated up the call chain in a similar fashion, until it is
"caught" in C<Perl_re_op_compile()>, which marks the pattern as containing
Unicode, and restarts the sizing pass. It is also possible for constructions
within run-time code blocks to turn out to need Unicode representation,
which is signalled by C<S_compile_runtime_code()> returning false to
C<Perl_re_op_compile()>.
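
The restart protocol can be modelled in a few lines of C. The sketch
below is an assumption-laden toy: any byte of 0x80 or above stands in for
"something that must be stored as Unicode", and the names
C<toy_restart_state>, C<toy_rx_atom()>, C<toy_rx_parse()> and
C<toy_rx_compile()> are hypothetical. What it shows is the shape of the
control flow: the lowest routine sets a flag and returns failure, the
failure propagates upwards untouched, and the outermost routine catches
it, switches representation and starts again.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int utf8;      /* representation currently assumed */
    int restart;   /* plays the role of the RESTART_UTF8 flag */
    int passes;    /* number of sizing passes started */
} toy_restart_state;

/* Returns 1 on success, 0 if the caller must give up or restart. */
static int toy_rx_atom(toy_restart_state *st, unsigned char c)
{
    if (c >= 0x80 && !st->utf8) {
        st->restart = 1;     /* request a restart and bail out at once */
        return 0;
    }
    return 1;
}

/* Intermediate level: simply passes the failure upwards. */
static int toy_rx_parse(toy_restart_state *st, const char *pat)
{
    for (; *pat; pat++)
        if (!toy_rx_atom(st, (unsigned char)*pat))
            return 0;
    return 1;
}

/* Outermost level: the only place where the restart is "caught". */
static int toy_rx_compile(toy_restart_state *st, const char *pat)
{
    assert(st != NULL && pat != NULL);
    st->utf8 = 0;
    st->passes = 0;
    for (;;) {
        st->restart = 0;
        st->passes++;
        if (toy_rx_parse(st, pat))
            return 1;
        if (!st->restart)
            return 0;        /* a genuine failure, not a restart */
        st->utf8 = 1;        /* mark as Unicode and size again */
    }
}
```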

The restart was previously implemented using a C<longjmp> in C<regatom()>
back to a C<setjmp> in C<Perl_re_op_compile()>, but this proved to be
problematic as the latter is a large function containing many automatic
variables, which interact badly with the emergent control flow of C<setjmp>.

=head3 Debug Output

In the 5.9.x development version of perl you can C<< use re Debug => 'PARSE' >>
to see some trace information about the parse process. We will start with some
simple patterns and build up to more complex patterns.

So when we parse C</foo/> we see something like the following table. The
left shows what is being parsed, and the number indicates where the next regop
would go. The stuff on the right is the trace output of the graph. The
names are chosen to be short to make it less dense on the screen. 'tsdy'
is a special form of C<regtail()> which does some extra analysis.

 >foo<             1    reg
                          brnc
                            piec
                              atom
 ><                4      tsdy~ EXACT <foo> (EXACT) (1)
                              ~ attach to END (3) offset to 2

The resulting program then looks like:

   1: EXACT <foo>(3)
   3: END(0)

As you can see, even though we parsed out a branch and a piece, it was ultimately
only an atom. The final program shows us how things work. We have an C<EXACT> regop,
followed by an C<END> regop. The number in parens indicates where the C<regnext> of
the node goes. The C<regnext> of an C<END> regop is unused, as C<END> regops mean
we have successfully matched. The number on the left indicates the position of
the regop in the regnode array.

Now let's try a harder pattern. We will add a quantifier, so now we have the pattern
C</foo+/>. We will see that C<regbranch()> calls C<regpiece()> twice.

 >foo+<            1    reg
                          brnc
                            piec
                              atom
 >o+<              3        piec
                              atom
 ><                6        tail~ EXACT <fo> (1)
                   7      tsdy~ EXACT <fo> (EXACT) (1)
                              ~ PLUS (END) (3)
                              ~ attach to END (6) offset to 3

And we end up with the program:

   1: EXACT <fo>(3)
   3: PLUS(6)
   4:   EXACT <o>(0)
   6: END(0)

Now we have a special case. The C<EXACT> regop has a C<regnext> of 0. This is
because if it matches it should try to match itself again. The C<PLUS> regop
handles the actual failure of the C<EXACT> regop and acts appropriately (going
to regnode 6 if the C<EXACT> matched at least once, or failing if it didn't).

Now for something much more complex: C</x(?:foo*|b[a][rR])(foo|bar)$/>

 >x(?:foo*|b...    1    reg
                          brnc
                            piec
                              atom
 >(?:foo*|b[...    3        piec
                              atom
 >?:foo*|b[a...                 reg
 >foo*|b[a][...                   brnc
                                    piec
                                      atom
 >o*|b[a][rR...    5                piec
                                      atom
 >|b[a][rR])...    8                tail~ EXACT <fo> (3)
 >b[a][rR])(...    9              brnc
                  10                piec
                                      atom
 >[a][rR])(f...   12                piec
                                      atom
 >a][rR])(fo...                         clas
 >[rR])(foo|...   14                tail~ EXACT <b> (10)
                                    piec
                                      atom
 >rR])(foo|b...                         clas
 >)(foo|bar)...   25                tail~ EXACT <a> (12)
                                  tail~ BRANCH (3)
                  26              tsdy~ BRANCH (END) (9)
                                      ~ attach to TAIL (25) offset to 16
                                  tsdy~ EXACT <fo> (EXACT) (4)
                                      ~ STAR (END) (6)
                                      ~ attach to TAIL (25) offset to 19
                                  tsdy~ EXACT <b> (EXACT) (10)
                                      ~ EXACT <a> (EXACT) (12)
                                      ~ ANYOF[Rr] (END) (14)
                                      ~ attach to TAIL (25) offset to 11
 >(foo|bar)$<               tail~ EXACT <x> (1)
                            piec
                              atom
 >foo|bar)$<                    reg
                  28              brnc
                                    piec
                                      atom
 >|bar)$<         31              tail~ OPEN1 (26)
 >bar)$<                          brnc
                  32                piec
                                      atom
 >)$<             34              tail~ BRANCH (28)
                  36              tsdy~ BRANCH (END) (31)
                                     ~ attach to CLOSE1 (34) offset to 3
                                  tsdy~ EXACT <foo> (EXACT) (29)
                                     ~ attach to CLOSE1 (34) offset to 5
                                  tsdy~ EXACT <bar> (EXACT) (32)
                                     ~ attach to CLOSE1 (34) offset to 2
 >$<                        tail~ BRANCH (3)
                                ~ BRANCH (9)
                                ~ TAIL (25)
                            piec
                              atom
 ><               37        tail~ OPEN1 (26)
                                ~ BRANCH (28)
                                ~ BRANCH (31)
                                ~ CLOSE1 (34)
                  38      tsdy~ EXACT <x> (EXACT) (1)
                              ~ BRANCH (END) (3)
                              ~ BRANCH (END) (9)
                              ~ TAIL (END) (25)
                              ~ OPEN1 (END) (26)
                              ~ BRANCH (END) (28)
                              ~ BRANCH (END) (31)
                              ~ CLOSE1 (END) (34)
                              ~ EOL (END) (36)
                              ~ attach to END (37) offset to 1

Resulting in the program

   1: EXACT <x>(3)
   3: BRANCH(9)
   4:   EXACT <fo>(6)
   6:   STAR(26)
   7:     EXACT <o>(0)
   9: BRANCH(25)
  10:   EXACT <ba>(14)
  12:   OPTIMIZED (2 nodes)
  14:   ANYOF[Rr](26)
  25: TAIL(26)
  26: OPEN1(28)
  28:   TRIE-EXACT(34)
        [StS:1 Wds:2 Cs:6 Uq:5 #Sts:7 Mn:3 Mx:3 Stcls:bf]
          <foo>
          <bar>
  30:   OPTIMIZED (4 nodes)
  34: CLOSE1(36)
  36: EOL(37)
  37: END(0)

Here we can see a much more complex program, with various optimisations in
play. At regnode 10 we see an example where a character class with only
one character in it was turned into an C<EXACT> node. We can also see where
an entire alternation was turned into a C<TRIE-EXACT> node. As a consequence,
some of the regnodes have been marked as optimised away. We can see that
the C<$> symbol has been converted into an C<EOL> regop, a special piece of
code that looks for C<\n> or the end of the string.

The next pointer for C<BRANCH>es is interesting in that it points at where
execution should go if the branch fails. When executing, if the engine
tries to traverse from a branch to a C<regnext> that isn't a branch then
the engine will know that the entire set of branches has failed.
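
A minimal sketch of that traversal follows. It assumes absolute node
indexes (as in the dumps above) rather than perl's real offsets, and
branch bodies consisting of a single C<EXACT> node; C<toy_bnode> and
C<toy_try_branches()> are hypothetical names.

```c
#include <assert.h>
#include <string.h>

enum toy_bop { TOY_B_BRANCH, TOY_B_EXACT, TOY_B_TAIL, TOY_B_END };

typedef struct {
    enum toy_bop op;
    int next;           /* absolute index of the next node */
    const char *str;    /* literal for EXACT nodes, else NULL */
} toy_bnode;

/* Returns the index of the BRANCH whose body matched at the start of
 * 's', or -1 if every branch failed. */
static int toy_try_branches(const toy_bnode *prog, int first, const char *s)
{
    int i = first;

    assert(prog != NULL && s != NULL);
    while (prog[i].op == TOY_B_BRANCH) {
        const toy_bnode *body = &prog[i + 1];   /* body follows BRANCH */

        if (body->op == TOY_B_EXACT
            && strncmp(s, body->str, strlen(body->str)) == 0)
            return i;
        i = prog[i].next;    /* this branch failed: follow its next */
    }
    return -1;   /* next was not a BRANCH: the whole set has failed */
}
```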

=head3 Peep-hole Optimisation and Analysis

The regular expression engine can be a weighty tool to wield. On long
strings and complex patterns it can end up having to do a lot of work
to find a match, and even more to decide that no match is possible.
Consider a situation like the following pattern.

   'ababababababababababab' =~ /(a|b)*z/

The C<(a|b)*> part can match at every char in the string, and then fail
every time because there is no C<z> in the string. So obviously we can
avoid using the regex engine unless there is a C<z> in the string.
Likewise in a pattern like:

   /foo(\w+)bar/

In this case we know that the string must contain a C<foo> which must be
followed by C<bar>. We can use Fast Boyer-Moore matching as implemented
in C<fbm_instr()> to find the location of these strings. If they don't exist
then we don't need to resort to the much more expensive regex engine.
Even better, if they do exist then we can use their positions to
reduce the search space that the regex engine needs to cover to determine
if the entire pattern matches.
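
For C</foo(\w+)bar/> the idea can be sketched with the standard C
C<strstr()> standing in for C<fbm_instr()>. The function name
C<toy_could_match_foo_bar()> is hypothetical, and the check is only a
necessary condition: a result of 1 means the real engine still has to
run, while 0 means it can be skipped entirely.

```c
#include <assert.h>
#include <string.h>

/* Returns 0 if /foo(\w+)bar/ cannot possibly match 's', 1 if it might. */
static int toy_could_match_foo_bar(const char *s)
{
    const char *foo;

    assert(s != NULL);
    foo = strstr(s, "foo");
    if (!foo)
        return 0;               /* mandatory "foo" is absent */
    if (foo[3] == '\0')
        return 0;               /* nothing at all follows "foo" */
    /* \w+ needs at least one character, so "bar" can start no earlier
     * than four characters after the first "foo". */
    return strstr(foo + 4, "bar") != NULL;
}
```

Using the first occurrence of C<foo> is safe here because any later
C<foo> that is followed by C<bar> also places that C<bar> after the
first one.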

There are various aspects of the pattern that can be used to facilitate
optimisations along these lines:

=over 5

=item * anchored fixed strings

=item * floating fixed strings

=item * minimum and maximum length requirements

=item * start class

=item * Beginning/End of line positions

=back

Another form of optimisation that can occur is the post-parse "peep-hole"
optimisation, where inefficient constructs are replaced by more efficient
constructs. The C<TAIL> regops which are used during parsing to mark the end
of branches and the end of groups are examples of this. These regops are used
as place-holders during construction and "always match" so they can be
"optimised away" by making the things that point to the C<TAIL> point to the
thing that C<TAIL> points to, thus "skipping" the node.
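
The "skipping" step can be sketched as a single pass over a toy program.
The sketch assumes absolute C<next> indexes and a three-opcode node type;
C<toy_pnode> and C<toy_skip_tails()> are hypothetical and far simpler
than what C<study_chunk()> really does.

```c
#include <assert.h>
#include <stddef.h>

enum toy_pop { TOY_P_EXACT, TOY_P_TAIL, TOY_P_END };

typedef struct {
    enum toy_pop op;
    int next;           /* absolute index of the next node */
} toy_pnode;

/* Re-point every node past any run of TAIL nodes it leads to. */
static void toy_skip_tails(toy_pnode *prog, int count)
{
    int i;

    assert(prog != NULL && count > 0);
    for (i = 0; i < count; i++) {
        int t;

        if (prog[i].op == TOY_P_END)
            continue;               /* END's next is unused */
        t = prog[i].next;
        while (prog[t].op == TOY_P_TAIL)
            t = prog[t].next;       /* TAIL always matches: skip it */
        prog[i].next = t;
    }
}
```

The C<TAIL> nodes stay in the array; they are simply no longer reachable
from the nodes that used to point at them.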

Another optimisation that can occur is that of "C<EXACT> merging" which is
where two consecutive C<EXACT> nodes are merged into a single
regop. An even more aggressive form of this is that a branch
sequence of the form C<EXACT BRANCH ... EXACT> can be converted into a
C<TRIE-EXACT> regop.

All of this occurs in the routine C<study_chunk()> which uses a special
structure C<scan_data_t> to store the analysis that it has performed, and
does the "peep-hole" optimisations as it goes.

The code involved in C<study_chunk()> is extremely cryptic. Be careful. :-)

=head2 Execution

Execution of a regex generally involves two phases, the first being
finding the start point in the string where we should match from,
and the second being running the regop interpreter.

If we can tell that there is no valid start point then we don't bother running
the interpreter at all. Likewise, if we know from the analysis phase that we
cannot detect a short-cut to the start position, we go straight to the
interpreter.

The two entry points are C<re_intuit_start()> and C<pregexec()>. These routines
have a somewhat incestuous relationship with overlap between their functions,
and C<pregexec()> may even call C<re_intuit_start()> on its own. Nevertheless
other parts of the perl source code may call into either, or both.

Execution of the interpreter itself used to be recursive, but thanks to the
efforts of Dave Mitchell in the 5.9.x development track, that has changed: now an
internal stack is maintained on the heap and the routine is fully
iterative. This can make it tricky as the code is quite conservative
about what state it stores, with the result that two consecutive lines in the
code can actually be running in totally different contexts due to the
simulated recursion.

=head3 Start position and no-match optimisations

C<re_intuit_start()> is responsible for handling start points and no-match
optimisations as determined by the results of the analysis done by
C<study_chunk()> (and described in L</Peep-hole Optimisation and Analysis>).

The basic structure of this routine is to try to find the start- and/or
end-points of where the pattern could match, and to ensure that the string
is long enough to match the pattern. It tries to use more efficient
methods over less efficient methods and may involve considerable
cross-checking of constraints to find the place in the string that matches.
For instance it may try to determine that a given fixed string must be
not only present but also a certain number of chars before the end of the
string.

It calls several other routines, such as C<fbm_instr()> which does
Fast Boyer-Moore matching and C<find_byclass()> which is responsible for
finding the start using the first mandatory regop in the program.

When the optimisation criteria have been satisfied, C<regtry()> is called
to perform the match.

=head3 Program execution

C<pregexec()> is the main entry point for running a regex. It contains
support for initialising the regex interpreter's state, running
C<re_intuit_start()> if needed, and running the interpreter on the string
from various start positions as needed. When it is necessary to use
the regex interpreter C<pregexec()> calls C<regtry()>.

C<regtry()> is the entry point into the regex interpreter. It expects
as arguments a pointer to a C<regmatch_info> structure and a pointer to
a string.  It returns an integer 1 for success and a 0 for failure.
It is basically a set-up wrapper around C<regmatch()>.

C<regmatch()> is the main "recursive loop" of the interpreter. It is
basically a giant switch statement that implements a state machine, where
the possible states are the regops themselves, plus a number of additional
intermediate and failure states. A few of the states are implemented as
subroutines but the bulk are inline code.
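
The overall shape (a set-up wrapper that tries start positions, around a
switch-driven loop over opcodes) can be sketched as follows. This toy has
three opcodes and no backtracking at all, so it says nothing about how
the real interpreter saves and restores state; C<toy_inst>,
C<toy_regmatch()> and C<toy_exec()> are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

enum toy_iop { TOY_I_EXACT, TOY_I_ANY, TOY_I_END };

typedef struct {
    enum toy_iop op;
    char ch;            /* character to match for TOY_I_EXACT */
} toy_inst;

/* Returns 1 if the program matches at the start of 's', else 0. */
static int toy_regmatch(const toy_inst *prog, const char *s)
{
    assert(prog != NULL && s != NULL);
    for (;;) {
        switch (prog->op) {
        case TOY_I_EXACT:
            if (*s != prog->ch)
                return 0;
            s++;
            break;
        case TOY_I_ANY:
            if (*s == '\0')
                return 0;
            s++;
            break;
        case TOY_I_END:
            return 1;       /* reaching END means success */
        }
        prog++;             /* the opcode is the state: advance to next */
    }
}

/* Tries each start position in turn, including the empty tail. */
static int toy_exec(const toy_inst *prog, const char *s)
{
    for (;; s++) {
        if (toy_regmatch(prog, s))
            return 1;
        if (*s == '\0')
            return 0;
    }
}
```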

=head1 MISCELLANEOUS

=head2 Unicode and Localisation Support

When dealing with strings containing characters that cannot be represented
using an eight-bit character set, perl uses an internal representation
that is a permissive version of Unicode's UTF-8 encoding[2]. This uses single
bytes to represent characters from the ASCII character set, and sequences
of two or more bytes for all other characters. (See L<perlunitut>
for more information about the relationship between UTF-8 and perl's
encoding, utf8. The difference isn't important for this discussion.)

No matter how you look at it, Unicode support is going to be a pain in a
regex engine. Tricks that might be fine when you have 256 possible
characters often won't scale to handle the size of the UTF-8 character
set.  Things you can take for granted with ASCII may not be true with
Unicode. For instance, in ASCII, it is safe to assume that
C<sizeof(char1) == sizeof(char2)>, but in UTF-8 it isn't. Unicode case folding is
vastly more complex than the simple rules of ASCII, and even when not
using Unicode but only localised single byte encodings, things can get
tricky (for example, B<LATIN SMALL LETTER SHARP S> (U+00DF, E<szlig>)
should match 'SS' in localised case-insensitive matching).
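
The point about character sizes can be made concrete with a short
sketch. It assumes strict, standard UTF-8 (perl's own encoding is more
permissive, as noted above), and C<toy_utf8_len()> and
C<toy_utf8_chars()> are hypothetical helpers, not perl API.

```c
#include <assert.h>
#include <stddef.h>

/* Byte length of the sequence introduced by 'lead', or 0 if 'lead'
 * cannot start a well-formed standard UTF-8 sequence. */
static size_t toy_utf8_len(unsigned char lead)
{
    if (lead < 0x80)
        return 1;
    if (lead >= 0xC2 && lead <= 0xDF)
        return 2;
    if (lead >= 0xE0 && lead <= 0xEF)
        return 3;
    if (lead >= 0xF0 && lead <= 0xF4)
        return 4;
    return 0;
}

/* Counts characters in a NUL-terminated string; returns (size_t)-1 on a
 * bad lead byte or a missing continuation byte. */
static size_t toy_utf8_chars(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (*s) {
        size_t len = toy_utf8_len((unsigned char)*s);
        size_t k;

        if (len == 0)
            return (size_t)-1;
        for (k = 1; k < len; k++)
            if (((unsigned char)s[k] & 0xC0) != 0x80)
                return (size_t)-1;
        s += len;
        n++;
    }
    return n;
}
```

Stepping one character forward is therefore a data-dependent operation,
which is one reason so many regops need a separate UTF-8 code path.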

Making things worse is that UTF-8 support was a later addition to the
regex engine (as it was to perl) and this necessarily made things a lot
more complicated. Obviously it is easier to design a regex engine with
Unicode support in mind from the beginning than it is to retrofit it to
one that wasn't.

Nearly all regops that involve looking at the input string have
two cases, one for UTF-8, and one not. In fact, it's often more complex
than that, as the pattern may be UTF-8 as well.

Care must be taken when making changes to make sure that you handle
UTF-8 properly, both at compile time and at execution time, including
when the string and pattern are mismatched.

=head2 Base Structures

The C<regexp> structure described in L<perlreapi> is common to all
regex engines. Two of its fields are intended for the private use
of the regex engine that compiled the pattern. These are the
C<intflags> and C<pprivate> members. The C<pprivate> is a void pointer to
an arbitrary structure whose use and management is the responsibility
of the compiling engine. perl will never modify either of these
values. In the case of the stock engine the structure pointed to by
C<pprivate> is called C<regexp_internal>.

Its C<pprivate> and C<intflags> fields contain data
specific to each engine.

There are two structures used to store a compiled regular expression.
One, the C<regexp> structure described in L<perlreapi>, is populated by
the engine currently being used, and some of its fields are read by perl to
implement things such as the stringification of C<qr//>.

The other structure is pointed to by the C<regexp> struct's
C<pprivate> field. Together with C<intflags> in the same struct, it is
considered to be the property of the regex engine which compiled the
regular expression.

The C<regexp> structure contains all the data that perl needs to be aware of
to properly work with the regular expression. It includes data about
optimisations that perl can use to determine if the regex engine should
really be used, and various other control info that is needed to properly
execute patterns in various contexts, such as whether the pattern is
anchored in some way, what flags were used during the compile, or whether
the program contains special constructs that perl needs to be aware of.

In addition it contains two fields that are intended for the private use
of the regex engine that compiled the pattern. These are the C<intflags>
and C<pprivate> members. The C<pprivate> is a void pointer to an arbitrary
structure whose use and management is the responsibility of the compiling
engine. perl will never modify either of these values.

As mentioned earlier, in the case of the default engines, the C<pprivate>
will be a pointer to a C<regexp_internal> structure which holds the compiled
program and any additional data that is private to the regex engine
implementation.

=head3 Perl's C<pprivate> structure

The following structure is used as the C<pprivate> struct by perl's
regex engine. Since it is specific to perl it is only of curiosity
value to other engine implementations.

 typedef struct regexp_internal {
         U32 *offsets;           /* offset annotations 20001228 MJD
                                  * data about mapping the program to
                                  * the string*/
         regnode *regstclass;    /* Optional startclass as identified or
                                  * constructed by the optimiser */
         struct reg_data *data;  /* Additional miscellaneous data used
                                  * by the program.  Used to make it
                                  * easier to clone and free arbitrary
                                  * data that the regops need. Often the
                                  * ARG field of a regop is an index
                                  * into this structure */
         regnode program[1];     /* Unwarranted chumminess with
                                  * compiler. */
 } regexp_internal;

=over 5

=item C<offsets>

C<offsets> holds a mapping from offsets in the C<program>
to offsets in the C<precomp> string. This is only used by ActiveState's
visual regex debugger.

=item C<regstclass>

Special regop that is used by C<re_intuit_start()> to check if a pattern
can match at a certain position. For instance if the regex engine knows
that the pattern must start with a 'Z' then it can scan the string until
it finds one and then launch the regex engine from there. The routine
that handles this is called C<find_byclass()>. Sometimes this field
points at a regop embedded in the program, and sometimes it points at
an independent synthetic regop that has been constructed by the optimiser.

=item C<data>

This field points at a C<reg_data> structure, which is defined as follows:

    struct reg_data {
        U32 count;
        U8 *what;
        void* data[1];
    };

This structure is used for handling data structures that the regex engine
needs to handle specially during a clone or free operation on the compiled
product. Each element in the C<data> array has a corresponding element in the
C<what> array. During compilation regops that need special structures stored
will add an element to each array using the C<add_data()> routine and then
store the index in the regop.

=item C<program>

Compiled program. Inlined into the structure so the entire struct can be
treated as a single blob.

=back
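
The parallel C<what>/C<data> arrays described under C<data> above can be
sketched like this. It is a simplification under stated assumptions: the
two arrays are separate heap allocations rather than part of one struct
with a trailing array, the tag characters C<'m'> and C<'s'> are invented
for the example, and C<toy_reg_data>, C<toy_add_data()> and
C<toy_free_data()> are hypothetical names, not perl's own routines.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    unsigned count;
    unsigned char *what;    /* what[i] says what kind of thing data[i] is */
    void **data;
} toy_reg_data;

/* Appends one item; returns its index (to be stored in the regop), or
 * -1 if memory could not be allocated. */
static int toy_add_data(toy_reg_data *rd, unsigned char kind, void *item)
{
    unsigned char *w;
    void **d;

    assert(rd != NULL);
    w = realloc(rd->what, rd->count + 1);
    if (!w)
        return -1;
    rd->what = w;
    d = realloc(rd->data, (rd->count + 1) * sizeof *d);
    if (!d)
        return -1;
    rd->data = d;
    rd->what[rd->count] = kind;
    rd->data[rd->count] = item;
    return (int)rd->count++;
}

/* Freeing walks both arrays in step: the tag decides how to release. */
static void toy_free_data(toy_reg_data *rd)
{
    unsigned i;

    for (i = 0; i < rd->count; i++)
        if (rd->what[i] == 'm')     /* invented tag: malloc'd block */
            free(rd->data[i]);
    free(rd->what);
    free(rd->data);
    rd->what = NULL;
    rd->data = NULL;
    rd->count = 0;
}
```

Keeping the type tag next to each pointer is what lets a generic clone or
free routine treat every entry correctly without knowing which regop
created it.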

=head1 SEE ALSO

L<perlreapi>

L<perlre>

L<perlunitut>

=head1 AUTHOR

by Yves Orton, 2006.

With excerpts from Perl, and contributions and suggestions from
Ronald J. Kimball, Dave Mitchell, Dominic Dunlop, Mark Jason Dominus,
Stephen McCamant, and David Landgren.

=head1 LICENCE

Same terms as Perl.

=head1 REFERENCES

[1] L<http://perl.plover.com/Rx/paper/>

[2] L<http://www.unicode.org>

=cut
perlclib.pod000064400000022623150344123430007050 0ustar00=head1 NAME

perlclib - Internal replacements for standard C library functions

=head1 DESCRIPTION

One thing Perl porters should note is that F<perl> doesn't tend to use that
much of the C standard library internally; you'll see very little use of, 
for example, the F<ctype.h> functions in there. This is because Perl
tends to reimplement or abstract standard library functions, so that we
know exactly how they're going to operate.

This is a reference card for people who are familiar with the C library
and who want to do things the Perl way; to tell them which functions
they ought to use instead of the more normal C functions. 

=head2 Conventions

In the following tables:

=over 3

=item C<t>

is a type.

=item C<p>

is a pointer.

=item C<n>

is a number.

=item C<s>

is a string.

=back

C<sv>, C<av>, C<hv>, etc. represent variables of their respective types.

=head2 File Operations

Instead of the F<stdio.h> functions, you should use the Perl abstraction
layer. Instead of C<FILE*> types, you need to be handling C<PerlIO*>
types.  Don't forget that with the new PerlIO layered I/O abstraction 
C<FILE*> types may not even be available. See also the C<perlapio>
documentation for more information about the following functions:

 Instead Of:                 Use:

 stdin                       PerlIO_stdin()
 stdout                      PerlIO_stdout()
 stderr                      PerlIO_stderr()

 fopen(fn, mode)             PerlIO_open(fn, mode)
 freopen(fn, mode, stream)   PerlIO_reopen(fn, mode, perlio)
                               (Deprecated)
 fflush(stream)              PerlIO_flush(perlio)
 fclose(stream)              PerlIO_close(perlio)

=head2 File Input and Output

 Instead Of:                 Use:

 fprintf(stream, fmt, ...)   PerlIO_printf(perlio, fmt, ...)

 [f]getc(stream)             PerlIO_getc(perlio)
 [f]putc(stream, n)          PerlIO_putc(perlio, n)
 ungetc(n, stream)           PerlIO_ungetc(perlio, n)

Note that the PerlIO equivalents of C<fread> and C<fwrite> are slightly
different from their C library counterparts:

 fread(p, size, n, stream)   PerlIO_read(perlio, buf, numbytes)
 fwrite(p, size, n, stream)  PerlIO_write(perlio, buf, numbytes)

 fputs(s, stream)            PerlIO_puts(perlio, s)

There is no equivalent to C<fgets>; one should use C<sv_gets> instead:

 fgets(s, n, stream)         sv_gets(sv, perlio, append)

=head2 File Positioning

 Instead Of:                 Use:

 feof(stream)                PerlIO_eof(perlio)
 fseek(stream, n, whence)    PerlIO_seek(perlio, n, whence)
 rewind(stream)              PerlIO_rewind(perlio)

 fgetpos(stream, p)          PerlIO_getpos(perlio, sv)
 fsetpos(stream, p)          PerlIO_setpos(perlio, sv)

 ferror(stream)              PerlIO_error(perlio)
 clearerr(stream)            PerlIO_clearerr(perlio)

=head2 Memory Management and String Handling

 Instead Of:                    Use:

 t* p = malloc(n)               Newx(p, n, t)
 t* p = calloc(n, s)            Newxz(p, n, t)
 p = realloc(p, n)              Renew(p, n, t)
 memcpy(dst, src, n)            Copy(src, dst, n, t)
 memmove(dst, src, n)           Move(src, dst, n, t)
 memcpy(dst, src, sizeof(t))    StructCopy(src, dst, t)
 memset(dst, 0, n * sizeof(t))  Zero(dst, n, t)
 memzero(dst, n)                Zero(dst, n, char)
 free(p)                        Safefree(p)

 strdup(p)                      savepv(p)
 strndup(p, n)                  savepvn(p, n) (Hey, strndup doesn't
                                               exist!)

 strstr(big, little)            instr(big, little)
 strcmp(s1, s2)                 strLE(s1, s2) / strEQ(s1, s2)
                                              / strGT(s1,s2)
 strncmp(s1, s2, n)             strnNE(s1, s2, n) / strnEQ(s1, s2, n)

 memcmp(p1, p2, n)              memNE(p1, p2, n)
 !memcmp(p1, p2, n)             memEQ(p1, p2, n)

Note that C<Copy> and C<Move> take their source and destination arguments
in the opposite order from C<memcpy> and C<memmove>.

Most of the time, though, you'll want to be dealing with SVs internally
instead of raw C<char *> strings:

 strlen(s)                   sv_len(sv)
 strcpy(dt, src)             sv_setpv(sv, s)
 strncpy(dt, src, n)         sv_setpvn(sv, s, n)
 strcat(dt, src)             sv_catpv(sv, s)
 strncat(dt, src, n)         sv_catpvn(sv, s, n)
 sprintf(s, fmt, ...)        sv_setpvf(sv, fmt, ...)

Note also the existence of C<sv_catpvf> and C<sv_vcatpvfn>, combining
concatenation with formatting.

Sometimes instead of zeroing the allocated heap by using Newxz() you
should consider "poisoning" the data.  This means writing a bit
pattern into it that should be illegal as pointers (and floating point
numbers), and also hopefully surprising enough as integers, so that
any code attempting to use the data without forethought will break
sooner rather than later.  Poisoning can be done using the Poison()
macros, which have similar arguments to Zero():

 PoisonWith(dst, n, t, b)    scribble memory with byte b
 PoisonNew(dst, n, t)        equal to PoisonWith(dst, n, t, 0xAB)
 PoisonFree(dst, n, t)       equal to PoisonWith(dst, n, t, 0xEF)
 Poison(dst, n, t)           equal to PoisonFree(dst, n, t)

=head2 Character Class Tests

There are several types of character class tests that Perl implements.
The only ones described here are those that directly correspond to C
library functions that operate on 8-bit characters, but there are
equivalents that operate on wide characters, and UTF-8 encoded strings.
All are more fully described in L<perlapi/Character classification> and
L<perlapi/Character case changing>.

The C library routines listed in the table below return values based on
the current locale.  Use the entries in the final column for that
functionality.  The other two columns always assume a POSIX (or C)
locale.  The entries in the ASCII column are only meaningful for ASCII
inputs, returning FALSE for anything else.  Use these only when you
B<know> that is what you want.  The entries in the Latin1 column assume
that the non-ASCII 8-bit characters are as Unicode defines them, the
same as ISO-8859-1, often called Latin 1.

 Instead Of:  Use for ASCII:   Use for Latin1:      Use for locale:

 isalnum(c)  isALPHANUMERIC(c) isALPHANUMERIC_L1(c) isALPHANUMERIC_LC(c)
 isalpha(c)  isALPHA(c)        isALPHA_L1(c)        isALPHA_LC(c)
 isascii(c)  isASCII(c)                             isASCII_LC(c)
 isblank(c)  isBLANK(c)        isBLANK_L1(c)        isBLANK_LC(c)
 iscntrl(c)  isCNTRL(c)        isCNTRL_L1(c)        isCNTRL_LC(c)
 isdigit(c)  isDIGIT(c)        isDIGIT_L1(c)        isDIGIT_LC(c)
 isgraph(c)  isGRAPH(c)        isGRAPH_L1(c)        isGRAPH_LC(c)
 islower(c)  isLOWER(c)        isLOWER_L1(c)        isLOWER_LC(c)
 isprint(c)  isPRINT(c)        isPRINT_L1(c)        isPRINT_LC(c)
 ispunct(c)  isPUNCT(c)        isPUNCT_L1(c)        isPUNCT_LC(c)
 isspace(c)  isSPACE(c)        isSPACE_L1(c)        isSPACE_LC(c)
 isupper(c)  isUPPER(c)        isUPPER_L1(c)        isUPPER_LC(c)
 isxdigit(c) isXDIGIT(c)       isXDIGIT_L1(c)       isXDIGIT_LC(c)

 tolower(c)  toLOWER(c)        toLOWER_L1(c)        toLOWER_LC(c)
 toupper(c)  toUPPER(c)                             toUPPER_LC(c)

To emphasize that you are operating only on ASCII characters, you can
append C<_A> to each of the macros in the ASCII column: C<isALPHA_A>,
C<isDIGIT_A>, and so on.

(There is no entry in the Latin1 column for C<isascii> even though there
is an C<isASCII_L1>, which is identical to C<isASCII>;  the
latter name is clearer.  There is no entry in the Latin1 column for
C<toupper> because the result can be non-Latin1.  You have to use
C<toUPPER_uni>, as described in L<perlapi/Character case changing>.)
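
To make the behaviour of the ASCII column concrete, here is a small
standalone C sketch of locale-independent tests that return false for
anything outside ASCII.  The function names are hypothetical and this is
not how Perl implements its macros; the sketch also assumes an ASCII
execution character set, where the letters form contiguous ranges.

```c
#include <stdbool.h>

/* Hypothetical stand-ins, for illustration only.  Unlike isalpha()
 * and isdigit(), the result never depends on the current locale. */
static bool is_alpha_ascii(int c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

static bool is_digit_ascii(int c)
{
    return c >= '0' && c <= '9';
}
```

With these definitions the Latin 1 code point 0xE9 (e with acute accent)
is not alphabetic, whereas a locale-aware C<isalpha> may say that it is.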

=head2 F<stdlib.h> functions

 Instead Of:                 Use:

 atof(s)                     Atof(s)
 atoi(s)                     grok_atoUV(s, &uv, &e)
 atol(s)                     grok_atoUV(s, &uv, &e)
 strtod(s, &p)               Nothing.  Just don't use it.
 strtol(s, &p, n)            grok_atoUV(s, &uv, &e)
 strtoul(s, &p, n)           grok_atoUV(s, &uv, &e)

Typical use is to do range checks on C<uv> before casting:

  int i; UV uv; const char* end_ptr;
  if (grok_atoUV(input, &uv, &end_ptr)
      && uv <= INT_MAX) {
    i = (int)uv;
    ... /* continue parsing from end_ptr */
  } else {
    ... /* parse error: not a decimal integer in range 0 .. INT_MAX */
  }

Notice also the C<grok_bin>, C<grok_hex>, and C<grok_oct> functions in
F<numeric.c> for converting strings representing numbers in the respective
bases into C<NV>s.  Note that grok_atoUV() doesn't handle negative inputs,
or leading whitespace (being purposefully strict).
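
The kind of strictness described here can be sketched in plain C.  The
helper below is hypothetical (C<strict_atoul> is not part of the Perl
API, and it does not claim to match grok_atoUV() in every detail); it
only shows a parser that refuses signs and leading whitespace and
reports overflow instead of silently wrapping.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of the Perl API: parse an unsigned
 * decimal integer strictly.  Fails on an empty string, a leading sign,
 * leading whitespace, or a value that does not fit in unsigned long.
 * On success stores the value in *out and, if end is non-NULL, stores
 * the address of the first unparsed character in *end. */
static bool strict_atoul(const char *s, unsigned long *out,
                         const char **end)
{
    unsigned long val = 0;
    const char *p = s;

    if (*p < '0' || *p > '9')
        return false;                 /* input must start with a digit */

    while (*p >= '0' && *p <= '9') {
        unsigned long digit = (unsigned long)(*p - '0');
        if (val > (ULONG_MAX - digit) / 10)
            return false;             /* val * 10 + digit would overflow */
        val = val * 10 + digit;
        p++;
    }

    *out = val;
    if (end != NULL)
        *end = p;
    return true;
}
```

Because the function never skips characters on its own, the caller
decides explicitly whether whitespace or a sign is acceptable before
calling it.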

Note that strtol() and strtoul() may be disguised as Strtol(), Strtoul(),
Atol(), Atoul().  Avoid those, too.

In theory C<Strtol> and C<Strtoul> may not be defined if the machine perl is
built on doesn't actually have strtol and strtoul. But as those 2
functions are part of the 1989 ANSI C spec we suspect you'll find them
everywhere by now.

 int rand()                  double Drand01()
 srand(n)                    { seedDrand01((Rand_seed_t)n);
                               PL_srand_called = TRUE; }

 exit(n)                     my_exit(n)
 system(s)                   Don't. Look at pp_system or use my_popen.

 getenv(s)                   PerlEnv_getenv(s)
 setenv(s, val)              my_setenv(s, val)

=head2 Miscellaneous functions

You should not even B<want> to use F<setjmp.h> functions, but if you
think you do, use the C<JMPENV> stack in F<scope.h> instead.

For C<signal>/C<sigaction>, use C<rsignal(signo, handler)>.

=head1 SEE ALSO

L<perlapi>, L<perlapio>, L<perlguts>

=head1 NAME

perlbook - Books about and related to Perl

=head1 DESCRIPTION

There are many books on Perl and Perl-related topics. A few of these are
good, some are OK, but many aren't worth your money. There is a list
of these books, some with extensive reviews, at
L<http://books.perl.org/> . We list some of the books here, and while
listing a book implies our endorsement, don't think that not including
a book means anything.

Most of these books are available online through Safari Books Online
( L<http://safaribooksonline.com/> ).

=head2 The most popular books

The major reference book on Perl, written by the creator of Perl, is
I<Programming Perl>:

=over 4

=item I<Programming Perl> (the "Camel Book"):

 by Tom Christiansen, brian d foy, Larry Wall with Jon Orwant
 ISBN 978-0-596-00492-7 [4th edition February 2012]
 ISBN 978-1-4493-9890-3 [ebook]
 http://oreilly.com/catalog/9780596004927

=back

The Ram is a cookbook with hundreds of examples of using Perl to
accomplish specific tasks:

=over 4

=item I<The Perl Cookbook> (the "Ram Book"):

 by Tom Christiansen and Nathan Torkington,
 with Foreword by Larry Wall
 ISBN 978-0-596-00313-5 [2nd Edition August 2003]
 ISBN 978-0-596-15888-0 [ebook]
 http://oreilly.com/catalog/9780596003135/

=back

If you want to learn the basics of Perl, you might start with the
Llama book, which assumes that you already know a little about
programming:

=over 4

=item I<Learning Perl>  (the "Llama Book")

 by Randal L. Schwartz, Tom Phoenix, and brian d foy
 ISBN 978-1-4493-0358-7 [6th edition June 2011]
 ISBN 978-1-4493-0458-4 [ebook]
 http://www.learning-perl.com/

=back

The tutorial started in the Llama continues in the Alpaca, which
introduces the intermediate features of references, data structures,
object-oriented programming, and modules:

=over 4

=item I<Intermediate Perl> (the "Alpaca Book")

 by Randal L. Schwartz and brian d foy, with Tom Phoenix
         foreword by Damian Conway
 ISBN 978-1-4493-9309-0 [2nd edition August 2012]
 ISBN 978-1-4493-0459-1 [ebook]
 http://www.intermediateperl.com/

=back

=head2 References

You might want to keep these desktop references close by your keyboard:

=over 4

=item I<Perl 5 Pocket Reference>

 by Johan Vromans
 ISBN 978-1-4493-0370-9 [5th edition July 2011]
 ISBN 978-1-4493-0813-1 [ebook]
 http://oreilly.com/catalog/0636920018476/

=item I<Perl Debugger Pocket Reference>

 by Richard Foley
 ISBN 978-0-596-00503-0 [1st edition January 2004]
 ISBN 978-0-596-55625-9 [ebook]
 http://oreilly.com/catalog/9780596005030/

=item I<Regular Expression Pocket Reference>

 by Tony Stubblebine
 ISBN 978-0-596-51427-3 [2nd edition July 2007]
 ISBN 978-0-596-55782-9 [ebook]
 http://oreilly.com/catalog/9780596514273/

=back

=head2 Tutorials

=over 4

=item I<Beginning Perl>

(There are 2 books with this title)

 by Curtis 'Ovid' Poe
 ISBN 978-1-118-01384-7
 http://www.wrox.com/WileyCDA/WroxTitle/productCd-1118013840.html

 by James Lee
 ISBN 1-59059-391-X [3rd edition April 2010 & ebook]
 http://www.apress.com/9781430227939

=item I<Learning Perl> (the "Llama Book")

 by Randal L. Schwartz, Tom Phoenix, and brian d foy
 ISBN 978-1-4493-0358-7 [6th edition June 2011]
 ISBN 978-1-4493-0458-4 [ebook]
 http://www.learning-perl.com/

=item I<Intermediate Perl> (the "Alpaca Book")

 by Randal L. Schwartz and brian d foy, with Tom Phoenix
         foreword by Damian Conway
 ISBN 978-1-4493-9309-0 [2nd edition August 2012]
 ISBN 978-1-4493-0459-1 [ebook]
 http://www.intermediateperl.com/

=item I<Mastering Perl>

 by brian d foy
 ISBN 978-1-4493-9311-3 [2nd edition January 2014]
 ISBN 978-1-4493-6487-8 [ebook]
 http://www.masteringperl.org/

=item I<Effective Perl Programming>

 by Joseph N. Hall, Joshua A. McAdams, brian d foy
 ISBN 0-321-49694-9 [2nd edition 2010]
 http://www.effectiveperlprogramming.com/

=back

=head2 Task-Oriented

=over 4

=item I<Writing Perl Modules for CPAN>

 by Sam Tregar
 ISBN 1-59059-018-X [1st edition August 2002 & ebook]
 http://www.apress.com/9781590590188

=item I<The Perl Cookbook>

 by Tom Christiansen and Nathan Torkington,
     with Foreword by Larry Wall
 ISBN 978-0-596-00313-5 [2nd Edition August 2003]
 ISBN 978-0-596-15888-0 [ebook]
 http://oreilly.com/catalog/9780596003135/

=item I<Automating System Administration with Perl>

 by David N. Blank-Edelman
 ISBN 978-0-596-00639-6 [2nd edition May 2009]
 ISBN 978-0-596-80251-6 [ebook]
 http://oreilly.com/catalog/9780596006396

=item I<Real World SQL Server Administration with Perl>

 by Linchi Shea
 ISBN 1-59059-097-X [1st edition July 2003 & ebook]
 http://www.apress.com/9781590590973

=back

=head2 Special Topics

=over 4

=item I<Regular Expressions Cookbook>

 by Jan Goyvaerts and Steven Levithan
 ISBN 978-1-4493-1943-4 [2nd edition August 2012]
 ISBN 978-1-4493-2747-7 [ebook]
 http://shop.oreilly.com/product/0636920023630.do

=item I<Programming the Perl DBI>

 by Tim Bunce and Alligator Descartes
 ISBN 978-1-56592-699-8 [February 2000]
 ISBN 978-1-4493-8670-2 [ebook]
 http://oreilly.com/catalog/9781565926998

=item I<Perl Best Practices>

 by Damian Conway
 ISBN 978-0-596-00173-5 [1st edition July 2005]
 ISBN 978-0-596-15900-9 [ebook]
 http://oreilly.com/catalog/9780596001735

=item I<Higher-Order Perl>

 by Mark-Jason Dominus
 ISBN 1-55860-701-3 [1st edition March 2005]
 free ebook http://hop.perl.plover.com/book/
 http://hop.perl.plover.com/

=item I<Mastering Regular Expressions>

 by Jeffrey E. F. Friedl
 ISBN 978-0-596-52812-6 [3rd edition August 2006]
 ISBN 978-0-596-55899-4 [ebook]
 http://oreilly.com/catalog/9780596528126

=item I<Network Programming with Perl>

 by Lincoln Stein
 ISBN 0-201-61571-1 [1st edition 2001]
 http://www.pearsonhighered.com/educator/product/Network-Programming-with-Perl/9780201615715.page

=item I<Perl Template Toolkit>

 by Darren Chamberlain, Dave Cross, and Andy Wardley
 ISBN 978-0-596-00476-7 [December 2003]
 ISBN 978-1-4493-8647-4 [ebook]
 http://oreilly.com/catalog/9780596004767

=item I<Object Oriented Perl>

 by Damian Conway
     with foreword by Randal L. Schwartz
 ISBN 1-884777-79-1 [1st edition August 1999 & ebook]
 http://www.manning.com/conway/

=item I<Data Munging with Perl>

 by Dave Cross
 ISBN 1-930110-00-6 [1st edition 2001 & ebook]
 http://www.manning.com/cross

=item I<Mastering Perl/Tk>

 by Steve Lidie and Nancy Walsh
 ISBN 978-1-56592-716-2 [1st edition January 2002]
 ISBN 978-0-596-10344-6 [ebook]
 http://oreilly.com/catalog/9781565927162

=item I<Extending and Embedding Perl>

 by Tim Jenness and Simon Cozens
 ISBN 1-930110-82-0 [1st edition August 2002 & ebook]
 http://www.manning.com/jenness

=item I<Pro Perl Debugging>

 by Richard Foley with Andy Lester
 ISBN 1-59059-454-1 [1st edition July 2005 & ebook]
 http://www.apress.com/9781590594544

=back

=head2 Free (as in beer) books

Some of these books are available as free downloads.

I<Higher-Order Perl>: L<http://hop.perl.plover.com/>

I<Modern Perl>: L<http://onyxneon.com/books/modern_perl/>

=head2 Other interesting, non-Perl books

You might notice several familiar Perl concepts in this collection of
ACM columns from Jon Bentley. The similarity to the title of the major
Perl book (which came later) is not completely accidental:

=over 4

=item I<Programming Pearls>

 by Jon Bentley
 ISBN 978-0-201-65788-3 [2nd edition, October 1999]

=item I<More Programming Pearls>

 by Jon Bentley
 ISBN 0-201-11889-0 [January 1988]

=back

=head2 A note on freshness

Each version of Perl comes with the documentation that was current at
the time of release. This poses a problem for content such as book
lists. There are probably very nice books published after this list
was included in your Perl release, and you can check the latest
released version at L<http://perldoc.perl.org/perlbook.html> .

Some of the books we've listed appear almost ancient on an internet
time scale, but we've included those books because they still describe the
current way of doing things. Not everything in Perl changes every day.
Many of the beginner-level books, too, go over basic features and
techniques that are still valid today. In general though, we try to
limit this list to books published in the past five years.

=head2 Get your book listed

If your Perl book isn't listed and you think it should be, let us know.
L<mailto:perl5-porters@perl.org>

=cut
If you read this file _as_is_, just ignore the funny characters you
see. It is written in the POD format (see F<pod/perlpod.pod>) which is
specially designed to be readable as is.

=head1 NAME

perlcygwin - Perl for Cygwin

=head1 SYNOPSIS

This document will help you configure, make, test and install Perl
on Cygwin.  This document also describes features of Cygwin that will
affect how Perl behaves at runtime.

B<NOTE:> There are pre-built Perl packages available for Cygwin and a
version of Perl is provided in the normal Cygwin install.  If you do
not need to customize the configuration, consider using one of those
packages.


=head1 PREREQUISITES FOR COMPILING PERL ON CYGWIN

=head2 Cygwin = GNU+Cygnus+Windows (Don't leave UNIX without it)

The Cygwin tools are ports of the popular GNU development tools for Win32
platforms.  They run thanks to the Cygwin library which provides the UNIX
system calls and environment these programs expect.  More information
about this project can be found at:

L<http://www.cygwin.com/>

A recent net or commercial release of Cygwin is required.

At the time this document was last updated, Cygwin 1.7.16 was current.


=head2 Cygwin Configuration

While building Perl some changes may be necessary to your Cygwin setup so
that Perl builds cleanly.  These changes are B<not> required for normal
Perl usage.

B<NOTE:> The binaries that are built will run on all Win32 versions.
They do not depend on your host system (WinXP/Win2K/Win7) or your
Cygwin configuration (binary/text mounts, cygserver).
The only dependencies come from hard-coded pathnames like F</usr/local>.
However, your host system and Cygwin configuration will affect Perl's
runtime behavior (see L</"TEST ON CYGWIN">).

=over 4

=item * C<PATH>

Set the C<PATH> environment variable so that Configure finds the Cygwin
versions of programs.  Any Windows directories that are not needed should
be removed or moved to the end of your C<PATH>.

=item * I<nroff>

If you do not have I<nroff> (which is part of the I<groff> package),
Configure will B<not> prompt you to install I<man> pages.

=back

=head1 CONFIGURE PERL ON CYGWIN

The default options gathered by Configure with the assistance of
F<hints/cygwin.sh> will build a Perl that supports dynamic loading
(which requires a shared F<cygperl5_16.dll>).

This will run Configure and keep a record:

  ./Configure 2>&1 | tee log.configure

If you are willing to accept all the defaults run Configure with B<-de>.
However, several useful customizations are available.

=head2 Stripping Perl Binaries on Cygwin

It is possible to strip the EXEs and DLLs created by the build process.
The resulting binaries will be significantly smaller.  If you want the
binaries to be stripped, you can either add a B<-s> option when Configure
prompts you,

  Any additional ld flags (NOT including libraries)? [none] -s
  Any special flags to pass to g++ to create a dynamically loaded
  library?
  [none] -s
  Any special flags to pass to gcc to use dynamic linking? [none] -s

or you can edit F<hints/cygwin.sh> and uncomment the relevant variables
near the end of the file.

=head2 Optional Libraries for Perl on Cygwin

Several Perl functions and modules depend on the existence of
some optional libraries.  Configure will find them if they are
installed in one of the directories listed as being used for library
searches.  Pre-built packages for most of these are available from
the Cygwin installer.

=over 4

=item * C<-lcrypt>

The crypt package distributed with Cygwin is a Linux compatible 56-bit
DES crypt port by Corinna Vinschen.

Alternatively, the crypt libraries in GNU libc have been ported to Cygwin.

As of libcrypt 1.3 (March 2016), you will need to install the
libcrypt-devel package for Configure to detect crypt().

=item * C<-lgdbm_compat> (C<use GDBM_File>)

GDBM is available for Cygwin.

NOTE: The GDBM library only works on NTFS partitions.

=item * C<-ldb> (C<use DB_File>)

BerkeleyDB is available for Cygwin.

NOTE: The BerkeleyDB library only completely works on NTFS partitions.

=item * C<cygserver> (C<use IPC::SysV>)

A port of SysV IPC is available for Cygwin.

NOTE: This has B<not> been extensively tested.  In particular,
C<d_semctl_semun> is undefined because it fails a Configure test
and on Win9x the I<shm*()> functions seem to hang.  It also creates
a compile time dependency because F<perl.h> includes F<<sys/ipc.h>>
and F<<sys/sem.h>> (which will be required in the future when compiling
CPAN modules). CURRENTLY NOT SUPPORTED!

=item * C<-lutil>

Included with the standard Cygwin netrelease is the inetutils package
which includes libutil.a.

=back

=head2 Configure-time Options for Perl on Cygwin

The F<INSTALL> document describes several Configure-time options.  Some of
these will work with Cygwin, others are not yet possible.  Also, some of
these are experimental.  You can either select an option when Configure
prompts you or you can define (undefine) symbols on the command line.

=over 4

=item * C<-Uusedl>

Undefining this symbol forces Perl to be compiled statically.

=item * C<-Dusemymalloc>

By default Perl does not use the C<malloc()> included with the Perl source,
because it was slower and not entirely thread-safe.  If you want to force
Perl to build with its own C<malloc()> anyway, define this symbol.

=item * C<-Uuseperlio>

Undefining this symbol disables the PerlIO abstraction.  PerlIO is now the
default; it is not recommended to disable PerlIO.

=item * C<-Dusemultiplicity>

Multiplicity is required when embedding Perl in a C program and using
more than one interpreter instance.  You only need to define this when
you build a non-threaded perl with C<-Uuseithreads>.

=item * C<-Uuse64bitint>

By default Perl uses 64-bit integers.  If you want to use smaller 32-bit
integers, undefine this symbol.

=item * C<-Duselongdouble>

I<gcc> supports long doubles (12 bytes).  However, several additional
long double math functions are necessary to use them within Perl
(I<{atan2, cos, exp, floor, fmod, frexp, isnan, log, modf, pow, sin, sqrt}l,
strtold>).
These are B<not> yet available with newlib, the Cygwin libc.

=item * C<-Uuseithreads>

Undefine this symbol if you want a faster, non-threaded perl.

=item * C<-Duselargefiles>

Cygwin uses 64-bit integers for internal size and position calculations;
this will be correctly detected and defined by Configure.

=item * C<-Dmksymlinks>

Use this to build perl outside of the source tree.  Details can be
found in the F<INSTALL> document.  This is the recommended way to
build perl from sources.

=back

=head2 Suspicious Warnings on Cygwin

You may see some messages during Configure that seem suspicious.

=over 4

=item * Win9x and C<d_eofnblk>

Win9x does not correctly report C<EOF> with a non-blocking read on a
closed pipe.  You will see the following messages:

 But it also returns -1 to signal EOF, so be careful!
 WARNING: you can't distinguish between EOF and no data!

 *** WHOA THERE!!! ***
     The recommended value for $d_eofnblk on this machine was
     "define"!
     Keep the recommended value? [y]

At least for consistency with WinNT, you should keep the recommended
value.

=item * Compiler/Preprocessor defines

The following error occurs because of the Cygwin C<#define> of
C<_LONG_DOUBLE>:

  Guessing which symbols your C compiler and preprocessor define...
  try.c:<line#>: missing binary operator

This failure does not seem to cause any problems.  With older gcc
versions, "parse error" is reported instead of "missing binary
operator".

=back

=head1 MAKE ON CYGWIN

Simply run I<make> and wait:

  make 2>&1 | tee log.make

=head1 TEST ON CYGWIN

There are two steps to running the test suite:

  make test 2>&1 | tee log.make-test

  cd t; ./perl harness 2>&1 | tee ../log.harness

The same tests are run both times, but more information is provided when
running as C<./perl harness>.

Test results vary depending on your host system and your Cygwin
configuration.  If a test can pass in some Cygwin setup, it is always
attempted and explainable test failures are documented.  It is possible
for Perl to pass all the tests, but it is more likely that some tests
will fail for one of the reasons listed below.

=head2 File Permissions on Cygwin

UNIX file permissions are based on sets of mode bits for
{read,write,execute} for each {user,group,other}.  By default Cygwin
only tracks the Win32 read-only attribute represented as the UNIX file
user write bit (files are always readable, files are executable if they
have a F<.{com,bat,exe}> extension or begin with C<#!>, directories are
always readable and executable).  On WinNT with the I<ntea> C<CYGWIN>
setting, the additional mode bits are stored as extended file attributes.
On WinNT with the default I<ntsec> C<CYGWIN> setting, permissions use the
standard WinNT security descriptors and access control lists. Without one of
these options, these tests will fail (listing not updated yet):

  Failed Test           List of failed
  ------------------------------------
  io/fs.t               5, 7, 9-10
  lib/anydbm.t          2
  lib/db-btree.t        20
  lib/db-hash.t         16
  lib/db-recno.t        18
  lib/gdbm.t            2
  lib/ndbm.t            2
  lib/odbm.t            2
  lib/sdbm.t            2
  op/stat.t             9, 20 (.tmp not an executable extension)

=head2 NDBM_File and ODBM_File do not work on FAT filesystems

Do not use NDBM_File or ODBM_File on FAT filesystem.  They can be
built on a FAT filesystem, but many tests will fail:

 ../ext/NDBM_File/ndbm.t       13  3328    71   59  83.10%  1-2 4 16-71
 ../ext/ODBM_File/odbm.t      255 65280    ??   ??       %  ??
 ../lib/AnyDBM_File.t           2   512    12    2  16.67%  1 4
 ../lib/Memoize/t/errors.t      0   139    11    5  45.45%  7-11
 ../lib/Memoize/t/tie_ndbm.t   13  3328     4    4 100.00%  1-4
 run/fresh_perl.t                          97    1   1.03%  91

If you intend to run only on FAT (or if using AnyDBM_File on FAT),
run Configure with the -Ui_ndbm and -Ui_dbm options to prevent
NDBM_File and ODBM_File being built.

With NTFS (and no CYGWIN=nontsec), there should be no problems even if
perl was built on FAT.

=head2 C<fork()> failures in io_* tests

A C<fork()> failure may result in the following tests failing:

  ext/IO/lib/IO/t/io_multihomed.t
  ext/IO/lib/IO/t/io_sock.t
  ext/IO/lib/IO/t/io_unix.t

See the comment on fork in L</rebase errors on fork or system> below.

=head1 Specific features of the Cygwin port

=head2 Script Portability on Cygwin

Cygwin does an outstanding job of providing UNIX-like semantics on top of
Win32 systems.  However, in addition to the items noted above, there are
some differences that you should know about.  This is a very brief guide
to portability, more information can be found in the Cygwin documentation.

=over 4

=item * Pathnames

Cygwin pathnames are separated by forward slashes (F</>).  Universal
Naming Convention (F<//UNC>) paths are also supported.  Since cygwin-1.7,
non-POSIX pathnames are discouraged.  Names may contain all printable
characters.

File names are case insensitive, but case preserving.  A pathname that
contains a backslash or drive letter is a Win32 pathname, and not
subject to the translations applied to POSIX style pathnames.  Cygwin
will warn you about such pathnames, so it is better to convert them to
POSIX style.

For conversion we have C<Cygwin::win_to_posix_path()> and
C<Cygwin::posix_to_win_path()>.

Since cygwin-1.7 pathnames are UTF-8 encoded.

=item * Text/Binary

Since cygwin-1.7 textmounts are deprecated and strongly discouraged.

When a file is opened it is in either text or binary mode.  In text mode
a file is subject to CR/LF/Ctrl-Z translations.  With Cygwin, the default
mode for an C<open()> is determined by the mode of the mount that underlies
the file. See L</Cygwin::is_binmount>. Perl provides a C<binmode()> function
to set binary mode on files that otherwise would be treated as text.
C<sysopen()> with the C<O_TEXT> flag sets text mode on files that otherwise
would be treated as binary:

    sysopen(FOO, "bar", O_WRONLY|O_CREAT|O_TEXT)

C<lseek()>, C<tell()> and C<sysseek()> only work with files opened in binary
mode.

The text/binary issue is covered at length in the Cygwin documentation.

=item * PerlIO

PerlIO overrides the default Cygwin Text/Binary behaviour.  A file will
always be treated as binary, regardless of the mode of the mount it lives
on, just like it is in UNIX.  So CR/LF translation needs to be requested in
either the C<open()> call like this:

  open(FH, ">:crlf", "out.txt");

which will do conversion from LF to CR/LF on the output, or in the
environment settings (add this to your .bashrc):

  export PERLIO=crlf

which will pull in the crlf PerlIO layer which does LF -> CRLF conversion
on every output generated by perl.

=item * F<.exe>

The Cygwin C<stat()>, C<lstat()> and C<readlink()> functions make the F<.exe>
extension transparent by looking for F<foo.exe> when you ask for F<foo>
(unless a F<foo> also exists).  Cygwin does not require a F<.exe>
extension, but I<gcc> adds it automatically when building a program.
However, when accessing an executable as a normal file (e.g., I<cp>
in a makefile) the F<.exe> is not transparent.  The I<install> program
included with Cygwin automatically appends a F<.exe> when necessary.

=item * Cygwin vs. Windows process ids

Cygwin processes have their own pid, which is different from the
underlying Windows pid.  Most POSIX-compliant Proc functions expect
the cygwin pid, but several Win32::Process functions expect the
winpid.  For example, C<$$> is the cygwin pid of F</usr/bin/perl>, which
is not the winpid.  Use C<Cygwin::pid_to_winpid()> and
C<Cygwin::winpid_to_pid()> to translate between them.

=item * Cygwin vs. Windows errors

Under Cygwin, $^E is the same as $!.  When using L<Win32 API Functions|Win32>,
use C<Win32::GetLastError()> to get the last Windows error.

=item * rebase errors on fork or system

Using C<fork()> or C<system()> out to another perl after loading multiple DLLs
may result in a DLL base address conflict.  The internal cygwin error
looks like the following:

 0 [main] perl 8916 child_info_fork::abort: data segment start:
 parent (0xC1A000) != child(0xA6A000)

or:

 183 [main] perl 3588 C:\cygwin\bin\perl.exe: *** fatal error -
 unable to remap C:\cygwin\bin\cygsvn_subr-1-0.dll to same address
 as parent(0x6FB30000) != 0x6FE60000 46 [main] perl 3488 fork: child
 3588 - died waiting for dll loading, errno11

See L<http://cygwin.com/faq/faq-nochunks.html#faq.using.fixing-fork-failures>.
It helps if not too many DLLs are loaded in memory, so that the available
address space is larger; for example, closing MS Internet Explorer might help.

Use the perlrebase or rebase utilities to resolve the conflicting dll addresses.
The rebase package is included in the Cygwin setup. Use F<setup.exe>
from L<http://www.cygwin.com/setup.exe> to install it.

1. kill all perl processes and run C<perlrebase> or

2. kill all cygwin processes and services, start dash from cmd.exe and run C<rebaseall>.

=item * C<chown()>

On WinNT C<chown()> can change a file's user and group IDs.  On Win9x C<chown()>
is a no-op, although this is appropriate since there is no security model.

=item * Miscellaneous

File locking using the C<F_GETLK> command to C<fcntl()> is a stub that
returns C<ENOSYS>.

Win9x can not C<rename()> an open file (although WinNT can).

The Cygwin C<chroot()> implementation has holes (it can not restrict file
access by native Win32 programs).

In-place editing of files with C<perl -i> doesn't work without making a
backup of the file being edited (C<perl -i.bak>) because of Windows
restrictions.  Perl therefore adds the suffix C<.bak> automatically if you
use C<perl -i> without specifying a backup extension.

=back

=head2 Prebuilt methods:

=over 4

=item C<Cwd::cwd>

Returns the current working directory.

=item C<Cygwin::pid_to_winpid>

Translates a cygwin pid to the corresponding Windows pid (which may or
may not be the same).

=item C<Cygwin::winpid_to_pid>

Translates a Windows pid to the corresponding cygwin pid (if any).

=item C<Cygwin::win_to_posix_path>

Translates a Windows path to the corresponding cygwin path respecting
the current mount points. With a second non-null argument returns an
absolute path. Double-byte characters will not be translated.

=item C<Cygwin::posix_to_win_path>

Translates a cygwin path to the corresponding Windows path respecting
the current mount points. With a second non-null argument returns an
absolute path. Double-byte characters will not be translated.

=item C<Cygwin::mount_table()>

Returns an array of [mnt_dir, mnt_fsname, mnt_type, mnt_opts].

  perl -e 'for $i (Cygwin::mount_table) {print join(" ",@$i),"\n";}'
  /bin c:\cygwin\bin system binmode,cygexec
  /usr/bin c:\cygwin\bin system binmode
  /usr/lib c:\cygwin\lib system binmode
  / c:\cygwin system binmode
  /cygdrive/c c: system binmode,noumount
  /cygdrive/d d: system binmode,noumount
  /cygdrive/e e: system binmode,noumount

=item C<Cygwin::mount_flags>

Returns the mount type and flags for a specified mount point.
A comma-separated string of mntent->mnt_type (always
"system" or "user"), then the mntent->mnt_opts, where
the first is always "binmode" or "textmode".

  system|user,binmode|textmode,exec,cygexec,cygdrive,mixed,
  notexec,managed,nosuid,devfs,proc,noumount

If the argument is "/cygdrive", then just the volume mount settings,
and the cygdrive mount prefix are returned.

User mounts override system mounts.

  $ perl -e 'print Cygwin::mount_flags "/usr/bin"'
  system,binmode,cygexec
  $ perl -e 'print Cygwin::mount_flags "/cygdrive"'
  binmode,cygdrive,/cygdrive

=item C<Cygwin::is_binmount>

Returns true if the given cygwin path is binary mounted, false if the
path is mounted in textmode.

=item C<Cygwin::sync_winenv>

Cygwin does not initialize all original Win32 environment variables.
See the bottom of this page L<http://cygwin.com/cygwin-ug-net/setup-env.html>
for "Restricted Win32 environment".

Certain Win32 programs called from cygwin programs might need some
environment variables; for example, ADODB needs %COMMONPROGRAMFILES%.
Call Cygwin::sync_winenv() to copy all Win32 environment variables to your
process, and note that cygwin will warn on every encounter of non-POSIX paths.

=back

=head1 INSTALL PERL ON CYGWIN

This will install Perl, including I<man> pages.

  make install 2>&1 | tee log.make-install

NOTE: If C<STDERR> is redirected C<make install> will B<not> prompt
you to install I<perl> into F</usr/bin>.

You may need to be I<Administrator> to run C<make install>.  If you
are not, you must have write access to the directories in question.

Information on installing the Perl documentation in HTML format can be
found in the F<INSTALL> document.

=head1 MANIFEST ON CYGWIN

These are the files in the Perl release that contain references to Cygwin.
These very brief notes attempt to explain the reason for all conditional
code.  Hopefully, keeping this up to date will allow the Cygwin port to
be kept as clean as possible.

=over 4

=item Documentation

 INSTALL README.cygwin README.win32 MANIFEST
 pod/perl.pod pod/perlport.pod pod/perlfaq3.pod
 pod/perldelta.pod pod/perl5004delta.pod pod/perl56delta.pod
 pod/perl561delta.pod pod/perl570delta.pod pod/perl572delta.pod
 pod/perl573delta.pod pod/perl58delta.pod pod/perl581delta.pod
 pod/perl590delta.pod pod/perlhist.pod pod/perlmodlib.pod
 pod/perltoc.pod Porting/Glossary pod/perlgit.pod
 Porting/checkAUTHORS.pl
 dist/Cwd/Changes ext/Compress-Raw-Zlib/Changes
 dist/Time-HiRes/Changes
 ext/Compress-Raw-Zlib/README ext/Compress-Zlib/Changes
 ext/DB_File/Changes ext/Encode/Changes ext/Sys-Syslog/Changes
 ext/Win32API-File/Changes
 lib/ExtUtils/CBuilder/Changes lib/ExtUtils/Changes
 lib/ExtUtils/NOTES lib/ExtUtils/PATCHING lib/ExtUtils/README
 lib/Net/Ping/Changes lib/Test/Harness/Changes
 lib/Term/ANSIColor/ChangeLog lib/Term/ANSIColor/README
 README.symbian symbian/TODO

=item Build, Configure, Make, Install

 cygwin/Makefile.SHs
 ext/IPC/SysV/hints/cygwin.pl
 ext/NDBM_File/hints/cygwin.pl
 ext/ODBM_File/hints/cygwin.pl
 hints/cygwin.sh
 Configure             - help finding hints from uname,
                         shared libperl required for dynamic loading
 Makefile.SH Cross/Makefile-cross-SH
                       - linklibperl
 Porting/patchls       - cygwin in port list
 installman            - man pages with :: translated to .
 installperl           - install dll, install to 'pods'
 makedepend.SH         - uwinfix
 regen_lib.pl          - file permissions

 NetWare/Makefile
 plan9/mkfile
 symbian/sanity.pl symbian/sisify.pl
 hints/uwin.sh
 vms/descrip_mms.template
 win32/Makefile win32/makefile.mk

=item Tests

 t/io/fs.t             - no file mode checks if not ntsec
                         skip rename() check when not
                         check_case:relaxed
 t/io/tell.t           - binmode
 t/lib/cygwin.t        - builtin cygwin function tests
 t/op/groups.t         - basegroup has ID = 0
 t/op/magic.t          - $^X/symlink WORKAROUND, s/.exe//
 t/op/stat.t           - no /dev, skip Win32 ftCreationTime quirk
                         (cache manager sometimes preserves ctime of
                         file previously created and deleted), no -u
                         (setuid)
 t/op/taint.t          - can't use empty path under Cygwin Perl
 t/op/time.t           - no tzset()

=item Compiled Perl Source

 EXTERN.h              - __declspec(dllimport)
 XSUB.h                - __declspec(dllexport)
 cygwin/cygwin.c       - os_extras (getcwd, spawn, and several
                         Cygwin:: functions)
 perl.c                - os_extras, -i.bak
 perl.h                - binmode
 doio.c                - win9x can not rename a file when it is open
 pp_sys.c              - do not define h_errno, init
                         _pwent_struct.pw_comment
 util.c                - use setenv
 util.h                - PERL_FILE_IS_ABSOLUTE macro
 pp.c                  - Comment about Posix vs IEEE math under
                         Cygwin
 perlio.c              - CR/LF mode
 perliol.c             - Comment about EXTCONST under Cygwin

=item Compiled Module Source

 ext/Compress-Raw-Zlib/Makefile.PL
                       - Can't install via CPAN shell under Cygwin
 ext/Compress-Raw-Zlib/zlib-src/zutil.h
                       - Cygwin is Unix-like and has vsnprintf
 ext/Errno/Errno_pm.PL - Special handling for Win32 Perl under
                         Cygwin
 ext/POSIX/POSIX.xs    - tzname defined externally
 ext/SDBM_File/sdbm/pair.c
                       - EXTCONST needs to be redefined from
                         EXTERN.h
 ext/SDBM_File/sdbm/sdbm.c
                       - binary open
 ext/Sys/Syslog/Syslog.xs
                       - Cygwin has syslog.h
 ext/Sys/Syslog/win32/compile.pl
                       - Convert paths to Windows paths
 ext/Time-HiRes/HiRes.xs
                       - Various timers not available
 ext/Time-HiRes/Makefile.PL
                       - Find w32api/windows.h
 ext/Win32/Makefile.PL - Use various libraries under Cygwin
 ext/Win32/Win32.xs    - Child dir and child env under Cygwin
 ext/Win32API-File/File.xs
                       - _open_osfhandle not implemented under
                         Cygwin
 ext/Win32CORE/Win32CORE.c
                       - __declspec(dllexport)

=item Perl Modules/Scripts

 ext/B/t/OptreeCheck.pm - Comment about stderr/stdout order under
                          Cygwin
 ext/Digest-SHA/bin/shasum
                       - Use binary mode under Cygwin
 ext/Sys/Syslog/win32/Win32.pm
                       - Convert paths to Windows paths
 ext/Time-HiRes/HiRes.pm
                       - Comment about various timers not available
 ext/Win32API-File/File.pm
                       - _open_osfhandle not implemented under
                         Cygwin
 ext/Win32CORE/Win32CORE.pm
                       - History of Win32CORE under Cygwin
 lib/Cwd.pm            - hook to internal Cwd::cwd
 lib/ExtUtils/CBuilder/Platform/cygwin.pm
                       - use gcc for ld, and link to libperl.dll.a
 lib/ExtUtils/CBuilder.pm
                       - Cygwin is Unix-like
 lib/ExtUtils/Install.pm - Install and rename issues under Cygwin
 lib/ExtUtils/MM.pm    - OS classifications
 lib/ExtUtils/MM_Any.pm - Example for Cygwin
 lib/ExtUtils/MakeMaker.pm
                       - require MM_Cygwin.pm
 lib/ExtUtils/MM_Cygwin.pm
                       - canonpath, cflags, manifypods, perl_archive
 lib/File/Fetch.pm     - Comment about quotes using a Cygwin example
 lib/File/Find.pm      - on remote drives stat() always sets
                         st_nlink to 1
 lib/File/Spec/Cygwin.pm - case_tolerant
 lib/File/Spec/Unix.pm - preserve //unc
 lib/File/Spec/Win32.pm - References a message on cygwin.com
 lib/File/Spec.pm      - Pulls in lib/File/Spec/Cygwin.pm
 lib/File/Temp.pm      - no directory sticky bit
 lib/Module/CoreList.pm - List of all module files and versions
 lib/Net/Domain.pm     - No domainname command under Cygwin
 lib/Net/Netrc.pm      - Bypass using stat() under Cygwin
 lib/Net/Ping.pm       - ECONREFUSED is EAGAIN under Cygwin
 lib/Pod/Find.pm       - Set 'pods' dir
 lib/Pod/Perldoc/ToMan.pm - '-c' switch for pod2man
 lib/Pod/Perldoc.pm    - Use 'less' pager, and use .exe extension
 lib/Term/ANSIColor.pm - Cygwin terminal info
 lib/perl5db.pl        - use stdin not /dev/tty
 utils/perlbug.PL      - Add CYGWIN environment variable to report

=item Perl Module Tests

 dist/Cwd/t/cwd.t
 ext/Compress-Zlib/t/14gzopen.t
 ext/DB_File/t/db-btree.t
 ext/DB_File/t/db-hash.t
 ext/DB_File/t/db-recno.t
 ext/DynaLoader/t/DynaLoader.t
 ext/File-Glob/t/basic.t
 ext/GDBM_File/t/gdbm.t
 ext/POSIX/t/sysconf.t
 ext/POSIX/t/time.t
 ext/SDBM_File/t/sdbm.t
 ext/Sys/Syslog/t/syslog.t
 ext/Time-HiRes/t/HiRes.t
 ext/Win32/t/Unicode.t
 ext/Win32API-File/t/file.t
 ext/Win32CORE/t/win32core.t
 lib/AnyDBM_File.t
 lib/Archive/Extract/t/01_Archive-Extract.t
 lib/Archive/Tar/t/02_methods.t
 lib/ExtUtils/t/Embed.t
 lib/ExtUtils/t/eu_command.t
 lib/ExtUtils/t/MM_Cygwin.t
 lib/ExtUtils/t/MM_Unix.t
 lib/File/Compare.t
 lib/File/Copy.t
 lib/File/Find/t/find.t
 lib/File/Path.t
 lib/File/Spec/t/crossplatform.t
 lib/File/Spec/t/Spec.t
 lib/Net/hostent.t
 lib/Net/Ping/t/110_icmp_inst.t
 lib/Net/Ping/t/500_ping_icmp.t
 lib/Net/t/netrc.t
 lib/Pod/Simple/t/perlcyg.pod
 lib/Pod/Simple/t/perlcygo.txt
 lib/Pod/Simple/t/perlfaq.pod
 lib/Pod/Simple/t/perlfaqo.txt
 lib/User/grent.t
 lib/User/pwent.t

=back

=head1 BUGS ON CYGWIN

Support for swapping real and effective user and group IDs is incomplete.
On WinNT Cygwin provides C<setuid()>, C<seteuid()>, C<setgid()> and C<setegid()>.
However, additional Cygwin calls for manipulating WinNT access tokens
and security contexts are required.

=head1 AUTHORS

Charles Wilson <cwilson@ece.gatech.edu>,
Eric Fifer <egf7@columbia.edu>,
alexander smishlajev <als@turnhere.com>,
Steven Morlock <newspost@morlock.net>,
Sebastien Barre <Sebastien.Barre@utc.fr>,
Teun Burgers <burgers@ecn.nl>,
Gerrit P. Haase <gp@familiehaase.de>,
Reini Urban <rurban@cpan.org>,
Jan Dubois <jand@activestate.com>,
Jerry D. Hedden <jdhedden@cpan.org>.

=head1 HISTORY

Last updated: 2012-02-08
perlunicode.pod000064400000241073150344123430007567 0ustar00=head1 NAME

perlunicode - Unicode support in Perl

=head1 DESCRIPTION

If you haven't already, before reading this document, you should become
familiar with both L<perlunitut> and L<perluniintro>.

Unicode aims to B<UNI>-fy the en-B<CODE>-ings of all the world's
character sets into a single Standard.   For quite a few of the various
coding standards that existed when Unicode was first created, converting
from each to Unicode essentially meant adding a constant to each code
point in the original standard, and converting back meant just
subtracting that same constant.  For ASCII and ISO-8859-1, the constant
is 0.  For ISO-8859-5 (Cyrillic), the constant is 864; for Hebrew
(ISO-8859-8), it's 1488; Thai (ISO-8859-11), 3424; and so forth.  This
made it easy to do the conversions, and facilitated the adoption of
Unicode.
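
As a minimal sketch of the constant-offset idea (the variable names are
illustrative only), the following adds 864 to the ISO-8859-5 code for
CYRILLIC CAPITAL LETTER A and compares the result with what the core
C<Encode> module returns for the same byte:

    use strict;
    use warnings;
    use Encode qw(decode);

    my $offset   = 864;     # the Cyrillic constant quoted above
    my $iso_byte = 0xB0;    # CYRILLIC CAPITAL LETTER A in ISO-8859-5

    # Convert by simple addition ...
    my $by_offset = chr($iso_byte + $offset);

    # ... and by asking Encode to decode the same single byte.
    my $by_encode = decode('iso-8859-5', chr($iso_byte));

    # Both code points are printed as U+0410.
    printf "U+%04X U+%04X\n", ord($by_offset), ord($by_encode);

The offset holds for the letters; a few positions, such as the no-break
space, are mapped individually, which is why real code should use
C<Encode> rather than arithmetic.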

And it worked; nowadays, those legacy standards are rarely used.  Most
everyone uses Unicode.

Unicode is a comprehensive standard.  It specifies many things outside
the scope of Perl, such as how to display sequences of characters.  For
a full discussion of all aspects of Unicode, see
L<http://www.unicode.org>.

=head2 Important Caveats

Even though some of this section may not be understandable to you on
first reading, we think it's important enough to highlight some of the
gotchas before delving further, so here goes:

Unicode support is an extensive requirement. While Perl does not
implement the Unicode standard or the accompanying technical reports
from cover to cover, Perl does support many Unicode features.

Also, the use of Unicode may present security issues that aren't
obvious; see L</Security Implications of Unicode>.

=over 4

=item Safest if you C<use feature 'unicode_strings'>

In order to preserve backward compatibility, Perl does not turn
on full internal Unicode support unless the pragma
L<S<C<use feature 'unicode_strings'>>|feature/The 'unicode_strings' feature>
is specified.  (This is automatically
selected if you S<C<use 5.012>> or higher.)  Failure to do this can
lead to surprises.  See L</The "Unicode Bug"> below.

This pragma doesn't affect I/O.  Nor does it change the internal
representation of strings, only their interpretation.  There are still
several places where Unicode isn't fully supported, such as in
filenames.

=item Input and Output Layers

Use the C<:encoding(...)> layer to read from and write to
filehandles using the specified encoding.  (See L<open>.)

=item You should convert your non-ASCII, non-UTF-8 Perl scripts to be
UTF-8.

See L<encoding>.

=item C<use utf8> still needed to enable L<UTF-8|/Unicode Encodings> in scripts

If your Perl script is itself encoded in L<UTF-8|/Unicode Encodings>,
the S<C<use utf8>> pragma must be explicitly included to enable
recognition of that (in string or regular expression literals, or in
identifier names).  B<This is the only time when an explicit S<C<use
utf8>> is needed.>  (See L<utf8>).

If a Perl script begins with the bytes that form the UTF-8 encoding of
the Unicode BYTE ORDER MARK (C<BOM>, see L</Unicode Encodings>), those
bytes are completely ignored.

=item L<UTF-16|/Unicode Encodings> scripts autodetected

If a Perl script begins with the Unicode C<BOM> (UTF-16LE,
UTF-16BE), or if the script looks like non-C<BOM>-marked
UTF-16 of either endianness, Perl will correctly read in the script as
the appropriate Unicode encoding.

=back
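
The first caveat above can be seen in a small sketch.  It assumes that
no C<use locale> is in effect and that the string has not been upgraded
to UTF-8; the variable names are illustrative only.

    use strict;
    use warnings;

    my $e_acute = "\xE9";   # LATIN SMALL LETTER E WITH ACUTE, one byte

    # Outside the feature's scope, uc() leaves the character alone.
    my $without = do { no feature 'unicode_strings'; uc $e_acute };

    # Inside its scope, uc() applies the Unicode case mapping.
    my $with = do { use feature 'unicode_strings'; uc $e_acute };

    # Prints "without: E9  with: C9"
    printf "without: %02X  with: %02X\n", ord($without), ord($with);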

=head2 Byte and Character Semantics

Before Unicode, most encodings used 8 bits (a single byte) to encode
each character.  Thus a character was a byte, and a byte was a
character, and there could be only 256 or fewer possible characters.
"Byte Semantics" in the title of this section refers to
this behavior.  There was no need to distinguish between "Byte" and
"Character".

Then along comes Unicode which has room for over a million characters
(and Perl allows for even more).  This means that a character may
require more than a single byte to represent it, and so the two terms
are no longer equivalent.  What matter are the characters as whole
entities, and not usually the bytes that comprise them.  That's what the
term "Character Semantics" in the title of this section refers to.

Perl had to change internally to decouple "bytes" from "characters".
It is important that you too change your ideas, if you haven't already,
so that "byte" and "character" no longer mean the same thing in your
mind.

The basic building block of Perl strings has always been a "character".
The change comes down to this: the implementation no longer assumes
that a character is always just a single byte.

There are various things to note:

=over 4

=item *

String handling functions, for the most part, continue to operate in
terms of characters.  C<length()>, for example, returns the number of
characters in a string, just as before.  But that number no longer is
necessarily the same as the number of bytes in the string (there may be
more bytes than characters).  The other such functions include
C<chop()>, C<chomp()>, C<substr()>, C<pos()>, C<index()>, C<rindex()>,
C<sort()>, C<sprintf()>, and C<write()>.

The exceptions are:

=over 4

=item *

the bit-oriented C<vec>

E<nbsp>

=item *

the byte-oriented C<pack>/C<unpack> C<"C"> format

However, the C<W> specifier does operate on whole characters, as does the
C<U> specifier.

=item *

some operators that interact with the platform's operating system

Operators dealing with filenames are examples.

=item *

when the functions are called from within the scope of the
S<C<L<use bytes|bytes>>> pragma

Likely, you should use this only for debugging anyway.

=back

=item *

Strings--including hash keys--and regular expression patterns may
contain characters that have ordinal values larger than 255.

If you use a Unicode editor to edit your program, Unicode characters may
occur directly within the literal strings in UTF-8 encoding, or UTF-16.
(The former requires a C<use utf8>, the latter may require a C<BOM>.)

L<perluniintro/Creating Unicode> gives other ways to place non-ASCII
characters in your strings.

=item *

The C<chr()> and C<ord()> functions work on whole characters.

=item *

Regular expressions match whole characters.  For example, C<"."> matches
a whole character instead of only a single byte.

=item *

The C<tr///> operator translates whole characters.  (Note that the
C<tr///CU> functionality has been removed.  For similar functionality to
that, see C<pack('U0', ...)> and C<pack('C0', ...)>).

=item *

C<scalar reverse()> reverses by character rather than by byte.

=item *

The bit string operators, C<& | ^ ~> and (starting in v5.22)
C<&. |. ^.  ~.> can operate on characters that don't fit into a byte.
However, the current behavior is likely to change.  You should not use
these operators on strings that are encoded in UTF-8.  If you're not
sure about the encoding of a string, downgrade it before using any of
these operators; you can use
L<C<utf8::downgrade()>|utf8/Utility functions>.

=back

The bottom line is that Perl has always practiced "Character Semantics",
but with the advent of Unicode, that is now different than "Byte
Semantics".
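
A short sketch of the difference, using the core C<Encode> module to
obtain the UTF-8 bytes (the sample string is arbitrary):

    use strict;
    use warnings;
    use Encode qw(encode);

    my $str = "\x{263A}abc";    # WHITE SMILING FACE followed by "abc"

    # length() counts characters, whatever their size in bytes.
    my $chars = length $str;

    # The UTF-8 encoding of U+263A takes three bytes.
    my $bytes = length encode('UTF-8', $str);

    # reverse() in scalar context moves whole characters.
    my $reversed = scalar reverse $str;

    # Prints "4 characters, 6 bytes, last is U+263A"
    printf "%d characters, %d bytes, last is U+%04X\n",
        $chars, $bytes, ord(substr($reversed, -1));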

=head2 ASCII Rules versus Unicode Rules

Before Unicode, when a character was a byte was a character,
Perl knew only about the 128 characters defined by ASCII, code points 0
through 127 (except for under L<S<C<use locale>>|perllocale>).  That
left the code
points 128 to 255 as unassigned, and available for whatever use a
program might want.  The only semantics they have is their ordinal
numbers, and that they are members of none of the non-negative character
classes.  None are considered to match C<\w> for example, but all match
C<\W>.

Unicode, of course, assigns each of those code points a particular
meaning (along with ones above 255).  To preserve backward
compatibility, Perl only uses the Unicode meanings when there is some
indication that Unicode is what is intended; otherwise the non-ASCII
code points remain treated as if they are unassigned.

Here are the ways that Perl knows that a string should be treated as
Unicode:

=over

=item *

Within the scope of S<C<use utf8>>

If the whole program is Unicode (signified by using 8-bit B<U>nicode
B<T>ransformation B<F>ormat), then all strings within it must be
Unicode.

=item *

Within the scope of
L<S<C<use feature 'unicode_strings'>>|feature/The 'unicode_strings' feature>

This pragma was created so you can explicitly tell Perl that operations
executed within its scope are to use Unicode rules.  More operations are
affected with newer perls.  See L</The "Unicode Bug">.

=item *

Within the scope of S<C<use 5.012>> or higher

This implicitly turns on S<C<use feature 'unicode_strings'>>.

=item *

Within the scope of
L<S<C<use locale 'not_characters'>>|perllocale/Unicode and UTF-8>,
or L<S<C<use locale>>|perllocale> and the current
locale is a UTF-8 locale.

The former is defined to imply Unicode handling; and the latter
indicates a Unicode locale, hence a Unicode interpretation of all
strings within it.

=item *

When the string contains a Unicode-only code point

Perl has never accepted code points above 255 without them being
Unicode, so their use implies Unicode for the whole string.

=item *

When the string contains a Unicode named code point C<\N{...}>

The C<\N{...}> construct explicitly refers to a Unicode code point,
even if it is one that is also in ASCII.  Therefore the string
containing it must be Unicode.

=item *

When the string has come from an external source marked as
Unicode

The L<C<-C>|perlrun/-C [numberE<sol>list]> command line option can
specify that certain inputs to the program are Unicode, and the values
of this can be read by your Perl code, see L<perlvar/"${^UNICODE}">.

=item * When the string has been upgraded to UTF-8

The function L<C<utf8::upgrade()>|utf8/Utility functions>
can be explicitly used to permanently (unless a subsequent
C<utf8::downgrade()> is called) cause a string to be treated as
Unicode.

=item * There are additional methods for regular expression patterns

A pattern that is compiled with the C<< /u >> or C<< /a >> modifiers is
treated as Unicode (though there are some restrictions with C<< /a >>).
Under the C<< /d >> and C<< /l >> modifiers, there are several other
indications for Unicode; see L<perlre/Character set modifiers>.

=back

Note that all of the above are overridden within the scope of
C<L<use bytes|bytes>>; but you should be using this pragma only for
debugging.

Note also that some interactions with the platform's operating system
never use Unicode rules.
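
A minimal sketch of two of these indications, assuming that neither
C<use feature 'unicode_strings'> nor C<use locale> is in scope (the
C</u> modifier requires Perl v5.14 or later):

    use strict;
    use warnings;

    my $str = "\xE9";   # LATIN SMALL LETTER E WITH ACUTE, one byte

    # With no indication of Unicode, the code point is treated as
    # unassigned, so it is not a word character.
    my $before = $str =~ /^\w$/ ? 1 : 0;

    # Upgrading the string marks it as Unicode.
    utf8::upgrade($str);
    my $after = $str =~ /^\w$/ ? 1 : 0;

    # The /u modifier forces Unicode rules for the pattern.
    my $with_u = "\xE9" =~ /^\w$/u ? 1 : 0;

    # Prints "before=0 after=1 /u=1"
    print "before=$before after=$after /u=$with_u\n";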

When Unicode rules are in effect:

=over 4

=item *

Case translation operators use the Unicode case translation tables.

Note that C<uc()>, or C<\U> in interpolated strings, translates to
uppercase, while C<ucfirst>, or C<\u> in interpolated strings,
translates to titlecase in languages that make the distinction (which is
equivalent to uppercase in languages without the distinction).

There is a CPAN module, C<L<Unicode::Casing>>, which allows you to
define your own mappings to be used in C<lc()>, C<lcfirst()>, C<uc()>,
C<ucfirst()>, and C<fc> (or their double-quoted string inlined versions
such as C<\U>).  (Prior to Perl 5.16, this functionality was partially
provided in the Perl core, but suffered from a number of insurmountable
drawbacks, so the CPAN module was written instead.)

=item *

Character classes in regular expressions match based on the character
properties specified in the Unicode properties database.

C<\w> can be used to match a Japanese ideograph, for instance; and
C<[[:digit:]]> a Bengali number.

=item *

Named Unicode properties, scripts, and block ranges may be used (like
bracketed character classes) by using the C<\p{}> "matches property"
construct and the C<\P{}> negation, "doesn't match property".

See L</"Unicode Character Properties"> for more details.

You can define your own character properties and use them
in the regular expression with the C<\p{}> or C<\P{}> construct.
See L</"User-Defined Character Properties"> for more details.

=back
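
The following sketch illustrates the first two points.  The characters
were chosen for illustration: U+01C6 has distinct uppercase and
titlecase forms, U+65E5 is a CJK ideograph, and U+09E9 is BENGALI DIGIT
THREE.

    use strict;
    use warnings;

    my $dz = "\x{1C6}";     # LATIN SMALL LETTER DZ WITH CARON

    my $upper = uc $dz;         # uppercase form, U+01C4
    my $title = ucfirst $dz;    # titlecase form, U+01C5

    my $ideograph = "\x{65E5}";
    my $bengali_3 = "\x{9E9}";

    my $is_word  = $ideograph =~ /^\w$/           ? 1 : 0;
    my $is_digit = $bengali_3 =~ /^[[:digit:]]$/  ? 1 : 0;

    # Prints "uc: U+01C4, ucfirst: U+01C5, word: 1, digit: 1"
    printf "uc: U+%04X, ucfirst: U+%04X, word: %d, digit: %d\n",
        ord($upper), ord($title), $is_word, $is_digit;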

=head2 Extended Grapheme Clusters (Logical characters)

Consider a character, say C<H>.  It could appear with various marks around it,
such as an acute accent, or a circumflex, or various hooks, circles, arrows,
I<etc.>, above, below, to one side or the other, I<etc>.  There are many
possibilities among the world's languages.  The number of combinations is
astronomical, and if there were a character for each combination, it would
soon exhaust Unicode's more than a million possible characters.  So Unicode
took a different approach: there is a character for the base C<H>, and a
character for each of the possible marks, and these can be variously combined
to get a final logical character.  So a logical character--what appears to be a
single character--can be a sequence of more than one individual character.
The Unicode standard calls these "extended grapheme clusters" (which
is an improved version of the no-longer much used "grapheme cluster");
Perl furnishes the C<\X> regular expression construct to match such
sequences in their entirety.

But Unicode's intent is to unify the existing character set standards and
practices, and several pre-existing standards have single characters that
mean the same thing as some of these combinations, like ISO-8859-1,
which has quite a few of them. For example, C<"LATIN CAPITAL LETTER E
WITH ACUTE"> was already in this standard when Unicode came along.
Unicode therefore added it to its repertoire as that single character.
But this character is considered by Unicode to be equivalent to the
sequence consisting of the character C<"LATIN CAPITAL LETTER E">
followed by the character C<"COMBINING ACUTE ACCENT">.

C<"LATIN CAPITAL LETTER E WITH ACUTE"> is called a "pre-composed"
character, and its equivalence with the "E" and the "COMBINING ACUTE
ACCENT" sequence is called canonical equivalence.  All pre-composed
characters are said to have a decomposition (into the equivalent
sequence), and the decomposition type is also called canonical.  A
string may be composed as much as possible of precomposed characters, or
it may be composed entirely of decomposed characters.  Unicode calls
these, respectively, "Normalization Form Composed" (NFC) and
"Normalization Form Decomposed" (NFD).
The C<L<Unicode::Normalize>> module contains functions that convert
between the two.  A string may also have both composed characters and
decomposed characters; this module can be used to make it all one or the
other.

You may be presented with strings in any of these equivalent forms.
There is currently nothing in Perl 5 that ignores the differences.  So
you'll have to handle it specially.  The usual advice is to convert your
inputs to C<NFD> before processing further.

For more detailed information, see L<http://unicode.org/reports/tr15/>.
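
As a sketch of the points above, the following compares the two
equivalent spellings of an accented "E" and counts grapheme clusters
with C<\X>.  It uses the C<NFC> and C<NFD> functions exported by
C<Unicode::Normalize>; the variable names are illustrative only.

    use strict;
    use warnings;
    use Unicode::Normalize qw(NFC NFD);

    my $composed   = "\x{C9}";      # LATIN CAPITAL LETTER E WITH ACUTE
    my $decomposed = "E\x{301}";    # "E" + COMBINING ACUTE ACCENT

    # Perl compares code points, so the raw strings differ ...
    my $raw_equal = $composed eq $decomposed ? 1 : 0;

    # ... but they agree once both are in the same normalization form.
    my $nfd_equal = NFD($composed) eq $decomposed ? 1 : 0;
    my $nfc_equal = NFC($decomposed) eq $composed ? 1 : 0;

    # Two code points, but a single extended grapheme cluster.
    my $code_points = length $decomposed;
    my $clusters    = () = $decomposed =~ /\X/g;

    # Prints "raw=0 nfd=1 nfc=1 code points=2 clusters=1"
    print "raw=$raw_equal nfd=$nfd_equal nfc=$nfc_equal ",
          "code points=$code_points clusters=$clusters\n";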

=head2 Unicode Character Properties

(The only time that Perl considers a sequence of individual code
points as a single logical character is in the C<\X> construct, already
mentioned above.   Therefore "character" in this discussion means a single
Unicode code point.)

Very nearly all Unicode character properties are accessible through
regular expressions by using the C<\p{}> "matches property" construct
and the C<\P{}> "doesn't match property" for its negation.

For instance, C<\p{Uppercase}> matches any single character with the Unicode
C<"Uppercase"> property, while C<\p{L}> matches any character with a
C<General_Category> of C<"L"> (letter) property (see
L</General_Category> below).  Braces are not
required for single letter property names, so C<\p{L}> is equivalent to C<\pL>.

More formally, C<\p{Uppercase}> matches any single character whose Unicode
C<Uppercase> property value is C<True>, and C<\P{Uppercase}> matches any character
whose C<Uppercase> property value is C<False>, and they could have been written as
C<\p{Uppercase=True}> and C<\p{Uppercase=False}>, respectively.

This formality is needed when properties are not binary; that is, if they can
take on more values than just C<True> and C<False>.  For example, the
C<Bidi_Class> property (see L</"Bidirectional Character Types"> below),
can take on several different
values, such as C<Left>, C<Right>, C<Whitespace>, and others.  To match these, one needs
to specify both the property name (C<Bidi_Class>), AND the value being
matched against
(C<Left>, C<Right>, I<etc.>).  This is done, as in the examples above, by having the
two components separated by an equal sign (or interchangeably, a colon), like
C<\p{Bidi_Class: Left}>.
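
A brief sketch of the single and compound forms.  The C<Bidi_Class>
checks use the short value names C<L> and C<R>, and the sample
characters are chosen only for illustration:

    use strict;
    use warnings;

    my $alef = "\x{5D0}";   # HEBREW LETTER ALEF

    # Single form and its compound equivalent for a binary property.
    my $single   = "A" =~ /\p{Uppercase}/        ? 1 : 0;
    my $compound = "A" =~ /\p{Uppercase=True}/   ? 1 : 0;
    my $negated  = "a" =~ /\p{Uppercase=False}/  ? 1 : 0;

    # Braces are optional for a single-letter name.
    my $braced   = "A" =~ /\p{L}/ ? 1 : 0;
    my $unbraced = "A" =~ /\pL/   ? 1 : 0;

    # A non-binary property needs both the name and the value.
    my $ltr = "A"   =~ /\p{Bidi_Class: L}/ ? 1 : 0;
    my $rtl = $alef =~ /\p{Bidi_Class=R}/  ? 1 : 0;

    # Prints "1 1 1 1 1 1 1"
    print join(" ", $single, $compound, $negated,
                    $braced, $unbraced, $ltr, $rtl), "\n";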

All Unicode-defined character properties may be written in these compound forms
of C<\p{I<property>=I<value>}> or C<\p{I<property>:I<value>}>, but Perl provides some
additional properties that are written only in the single form, as well as
single-form short-cuts for all binary properties and certain others described
below, in which you may omit the property name and the equals or colon
separator.

Most Unicode character properties have at least two synonyms (or aliases if you
prefer): a short one that is easier to type and a longer one that is more
descriptive and hence easier to understand.  Thus the C<"L"> and
C<"Letter"> properties above are equivalent and can be used
interchangeably.  Likewise, C<"Upper"> is a synonym for C<"Uppercase">,
and we could have written C<\p{Uppercase}> equivalently as C<\p{Upper}>.
Also, there are typically various synonyms for the values the property
can be.   For binary properties, C<"True"> has 3 synonyms: C<"T">,
C<"Yes">, and C<"Y">; and C<"False"> has correspondingly C<"F">,
C<"No">, and C<"N">.  But be careful.  A short form of a value for one
property may not mean the same thing as the same short form for another.
Thus, for the C<L</General_Category>> property, C<"L"> means
C<"Letter">, but for the L<C<Bidi_Class>|/Bidirectional Character Types>
property, C<"L"> means C<"Left">.  A complete list of properties and
synonyms is in L<perluniprops>.

Upper/lower case differences in property names and values are irrelevant;
thus C<\p{Upper}> means the same thing as C<\p{upper}> or even C<\p{UpPeR}>.
Similarly, you can add or subtract underscores anywhere in the middle of a
word, so that these are also equivalent to C<\p{U_p_p_e_r}>.  And white space
is irrelevant adjacent to non-word characters, such as the braces and the equals
or colon separators, so C<\p{   Upper  }> and C<\p{ Upper_case : Y }> are
equivalent to these as well.  In fact, white space and even
hyphens can usually be added or deleted anywhere.  So even C<\p{ Up-per case = Yes}> is
equivalent.  All this is called "loose-matching" by Unicode.  The few places
where stricter matching is used are in the middle of numbers, and in the Perl
extension properties that begin or end with an underscore.  Stricter matching
cares about white space (except adjacent to non-word characters),
hyphens, and non-interior underscores.

You can also use negation in both C<\p{}> and C<\P{}> by introducing a caret
(C<^>) between the first brace and the property name: C<\p{^Tamil}> is
equal to C<\P{Tamil}>.
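
A sketch of loose matching and of the caret form, using some of the
spellings mentioned above (the list is illustrative, not exhaustive):

    use strict;
    use warnings;

    # Several spellings that loose matching treats as the same property.
    my @spellings = (
        qr/\p{Upper}/,
        qr/\p{upper}/,
        qr/\p{UpPeR}/,
        qr/\p{U_p_p_e_r}/,
        qr/\p{ Upper_case : Y }/,
    );

    # Count the spellings that accept "A" and reject "a".
    my $agreeing = grep { "A" =~ $_ && "a" !~ $_ } @spellings;

    # The caret form and \P{} both negate the property.
    my $caret   = "a" =~ /\p{^Upper}/ ? 1 : 0;
    my $capital = "a" =~ /\P{Upper}/  ? 1 : 0;

    printf "%d of %d spellings agree; caret=%d \\P=%d\n",
        $agreeing, scalar(@spellings), $caret, $capital;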

Almost all properties are immune to case-insensitive matching.  That is,
adding a C</i> regular expression modifier does not change what they
match.  There are two sets that are affected.
The first set is
C<Uppercase_Letter>,
C<Lowercase_Letter>,
and C<Titlecase_Letter>,
all of which match C<Cased_Letter> under C</i> matching.
And the second set is
C<Uppercase>,
C<Lowercase>,
and C<Titlecase>,
all of which match C<Cased> under C</i> matching.
This set also includes its subsets C<PosixUpper> and C<PosixLower> both
of which under C</i> match C<PosixAlpha>.
(The difference between these sets is that some things, such as Roman
numerals, come in both upper and lower case so they are C<Cased>, but
aren't considered letters, so they aren't C<Cased_Letter>'s.)
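
The effect of C</i> on the two sets can be sketched as follows.  The
Roman numeral U+2170 is used as an example of a character that is
C<Cased> but is not a letter:

    use strict;
    use warnings;

    my $roman = "\x{2170}";     # SMALL ROMAN NUMERAL ONE

    # Without /i, a lowercase letter is not an Uppercase_Letter.
    my $plain = "a" =~ /\p{Lu}/ ? 1 : 0;

    # Under /i, \p{Lu} behaves like \p{Cased_Letter}.
    my $letter_i = "a" =~ /\p{Lu}/i ? 1 : 0;

    # Under /i, \p{Uppercase} behaves like \p{Cased}, which
    # includes the Roman numeral ...
    my $roman_cased = $roman =~ /\p{Uppercase}/i ? 1 : 0;

    # ... but the numeral is not a letter, so \p{Lu} still fails.
    my $roman_letter = $roman =~ /\p{Lu}/i ? 1 : 0;

    print "plain=$plain letter_i=$letter_i ",
          "roman_cased=$roman_cased roman_letter=$roman_letter\n";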

See L</Beyond Unicode code points> for special considerations when
matching Unicode properties against non-Unicode code points.

=head3 B<General_Category>

Every Unicode character is assigned a general category, which is the "most
usual categorization of a character" (from
L<http://www.unicode.org/reports/tr44>).

The compound way of writing these is like C<\p{General_Category=Number}>
(short: C<\p{gc:n}>).  But Perl furnishes shortcuts in which everything up
through the equal or colon separator is omitted.  So you can instead just write
C<\pN>.

Here are the short and long forms of the values the C<General_Category> property
can have:

    Short       Long

    L           Letter
    LC, L&      Cased_Letter (that is: [\p{Ll}\p{Lu}\p{Lt}])
    Lu          Uppercase_Letter
    Ll          Lowercase_Letter
    Lt          Titlecase_Letter
    Lm          Modifier_Letter
    Lo          Other_Letter

    M           Mark
    Mn          Nonspacing_Mark
    Mc          Spacing_Mark
    Me          Enclosing_Mark

    N           Number
    Nd          Decimal_Number (also Digit)
    Nl          Letter_Number
    No          Other_Number

    P           Punctuation (also Punct)
    Pc          Connector_Punctuation
    Pd          Dash_Punctuation
    Ps          Open_Punctuation
    Pe          Close_Punctuation
    Pi          Initial_Punctuation
                (may behave like Ps or Pe depending on usage)
    Pf          Final_Punctuation
                (may behave like Ps or Pe depending on usage)
    Po          Other_Punctuation

    S           Symbol
    Sm          Math_Symbol
    Sc          Currency_Symbol
    Sk          Modifier_Symbol
    So          Other_Symbol

    Z           Separator
    Zs          Space_Separator
    Zl          Line_Separator
    Zp          Paragraph_Separator

    C           Other
    Cc          Control (also Cntrl)
    Cf          Format
    Cs          Surrogate
    Co          Private_Use
    Cn          Unassigned

Single-letter properties match all characters in any of the
two-letter sub-properties starting with the same letter.
C<LC> and C<L&> are special: both are aliases for the set consisting of everything matched by C<Ll>, C<Lu>, and C<Lt>.
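
A sketch of how the single-letter values cover their sub-properties.
The three sample numbers were chosen to have the categories C<Nd>,
C<Nl>, and C<No> respectively:

    use strict;
    use warnings;

    my @numbers = (
        "7",            # DIGIT SEVEN                 (Nd)
        "\x{2167}",     # ROMAN NUMERAL EIGHT         (Nl)
        "\x{BD}",       # VULGAR FRACTION ONE HALF    (No)
    );

    # All three spellings of the property accept every sample.
    my $by_short    = grep { /^\pN$/ }                        @numbers;
    my $by_alias    = grep { /^\p{gc:n}$/ }                   @numbers;
    my $by_compound = grep { /^\p{General_Category=Number}$/ } @numbers;

    # Only the first sample is a Decimal_Number.
    my $decimal = grep { /^\p{Nd}$/ } @numbers;

    # LC covers Ll, Lu, and Lt, but not Lo.
    my $titlecase_in_lc = "\x{1C5}" =~ /\p{LC}/ ? 1 : 0;    # an Lt
    my $alef_in_lc      = "\x{5D0}" =~ /\p{LC}/ ? 1 : 0;    # an Lo

    # Prints "3 3 3 1 1 0"
    print join(" ", $by_short, $by_alias, $by_compound,
                    $decimal, $titlecase_in_lc, $alef_in_lc), "\n";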

=head3 B<Bidirectional Character Types>

Because scripts differ in their directionality (Hebrew and Arabic are
written right to left, for example) Unicode supplies a C<Bidi_Class> property.
Some of the values this property can have are:

    Value       Meaning

    L           Left-to-Right
    LRE         Left-to-Right Embedding
    LRO         Left-to-Right Override
    R           Right-to-Left
    AL          Arabic Letter
    RLE         Right-to-Left Embedding
    RLO         Right-to-Left Override
    PDF         Pop Directional Format
    EN          European Number
    ES          European Separator
    ET          European Terminator
    AN          Arabic Number
    CS          Common Separator
    NSM         Non-Spacing Mark
    BN          Boundary Neutral
    B           Paragraph Separator
    S           Segment Separator
    WS          Whitespace
    ON          Other Neutrals

This property is always written in the compound form.
For example, C<\p{Bidi_Class:R}> matches characters that are normally
written right to left.  Unlike the
C<L</General_Category>> property, this
property can have more values added in a future Unicode release.  Those
listed above comprised the complete set for many Unicode releases, but
others were added in Unicode 6.3; you can always find what the
current ones are in L<perluniprops>.  And
L<http://www.unicode.org/reports/tr9/> describes how to use them.

=head3 B<Scripts>

The world's languages are written in many different scripts.  This sentence
(unless you're reading it in translation) is written in Latin, while Russian is
written in Cyrillic, and Greek is written in, well, Greek; Japanese mainly in
Hiragana or Katakana.  There are many more.

The Unicode C<Script> and C<Script_Extensions> properties give what
script a given character is in.  The C<Script_Extensions> property is an
improved version of C<Script>, as demonstrated below.  Either property
can be specified with the compound form like
C<\p{Script=Hebrew}> (short: C<\p{sc=hebr}>), or
C<\p{Script_Extensions=Javanese}> (short: C<\p{scx=java}>).
In addition, Perl furnishes shortcuts for all
C<Script_Extensions> property names.  You can omit everything up through
the equals (or colon), and simply write C<\p{Latin}> or C<\P{Cyrillic}>.
(This is not true for C<Script>, which is required to be
written in the compound form.  Prior to Perl v5.26, the single form
returned the plain old C<Script> version, but was changed because
C<Script_Extensions> gives better results.)

The difference between these two properties involves characters that are
used in multiple scripts.  For example the digits '0' through '9' are
used in many parts of the world.  These are placed in a script named
C<Common>.  Other characters are used in just a few scripts.  For
example, the C<"KATAKANA-HIRAGANA DOUBLE HYPHEN"> is used in both Japanese
scripts, Katakana and Hiragana, but nowhere else.  The C<Script>
property places all characters that are used in multiple scripts in the
C<Common> script, while the C<Script_Extensions> property places those
that are used in only a few scripts into each of those scripts; while
still using C<Common> for those used in many scripts.  Thus both these
match:

 "0" =~ /\p{sc=Common}/     # Matches
 "0" =~ /\p{scx=Common}/    # Matches

and only the first of these matches:

 "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{sc=Common}/  # Matches
 "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{scx=Common}/ # No match

And only the last two of these match:

 "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{sc=Hiragana}  # No match
 "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{sc=Katakana}  # No match
 "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{scx=Hiragana} # Matches
 "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{scx=Katakana} # Matches

C<Script_Extensions> is thus an improved C<Script>, in which there are
fewer characters in the C<Common> script, and correspondingly more in
other scripts.  It is new in Unicode version 6.0, and its data are likely
to change significantly in later releases, as things get sorted out.
New code should probably be using C<Script_Extensions> and not plain
C<Script>.  If you compile perl with a Unicode release that doesn't have
C<Script_Extensions>, the single form Perl extensions will instead refer
to the plain C<Script> property.  If you compile with a version of
Unicode that doesn't have the C<Script> property, these extensions will
not be defined at all.

(Actually, besides C<Common>, the C<Inherited> script contains
characters that are used in multiple scripts.  These are modifier
characters which inherit the script value
of the controlling character.  Some of these are used in many scripts,
and so go into C<Inherited> in both C<Script> and C<Script_Extensions>.
Others are used in just a few scripts, so are in C<Inherited> in
C<Script>, but not in C<Script_Extensions>.)

It is worth stressing that there are several different sets of digits in
Unicode that are equivalent to 0-9 and are matchable by C<\d> in a
regular expression.  If they are used in a single language only, they
are in that language's C<Script> and C<Script_Extensions>.  If they are
used in more than one script, they will be in C<sc=Common>, but only
if they are used in many scripts should they be in C<scx=Common>.

The explanation above has omitted some detail; refer to UAX#24 "Unicode
Script Property": L<http://www.unicode.org/reports/tr24>.

A complete list of scripts and their shortcuts is in L<perluniprops>.
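
The following sketch shows one way to ask which of a handful of scripts
a character belongs to, using C<Script_Extensions>.  The hash
C<%script_re> and the helper C<script_of> are hypothetical names, and
the four scripts tested are an arbitrary selection, not a complete
classifier.

    use strict;
    use warnings;

    # Precompiled patterns for the few scripts this sketch knows about.
    my %script_re = (
        Latin    => qr/\p{Script_Extensions=Latin}/,
        Greek    => qr/\p{Script_Extensions=Greek}/,
        Cyrillic => qr/\p{Script_Extensions=Cyrillic}/,
        Common   => qr/\p{Script_Extensions=Common}/,
    );

    # Return the first known script (in sorted order) that matches.
    sub script_of {
        my ($char) = @_;
        for my $name (sort keys %script_re) {
            return $name if $char =~ $script_re{$name};
        }
        return 'other';
    }

    # "a", GREEK SMALL LETTER BETA, CYRILLIC SMALL LETTER GHE, "0"
    my @found = map { script_of($_) } ("a", "\x{03B2}", "\x{0433}", "0");
    print "@found\n";

A character whose C<Script_Extensions> value lists several scripts
would match more than one of these patterns; this sketch simply reports
the first.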

=head3 B<Use of the C<"Is"> Prefix>

For backward compatibility (with Perl 5.6), all properties writable
without using the compound form mentioned
so far may have C<Is> or C<Is_> prepended to their name, so C<\P{Is_Lu}>, for
example, is equal to C<\P{Lu}>, and C<\p{IsScript:Arabic}> is equal to
C<\p{Arabic}>.
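
A quick way to convince yourself of this equivalence is to compare both
spellings over a few sample characters.  This is only a spot check; the
samples and the variable C<$mismatches> are chosen for this illustration.

    use strict;
    use warnings;

    # "A", "a", CYRILLIC CAPITAL LETTER A, ARABIC LETTER ALEF, "1"
    my @samples = ("A", "a", "\x{0410}", "\x{0627}", "1");

    my $mismatches = 0;
    for my $char (@samples) {
        $mismatches++
            if ($char =~ /\p{Is_Lu}/ ? 1 : 0) != ($char =~ /\p{Lu}/ ? 1 : 0);
        $mismatches++
            if ($char =~ /\p{Is_Arabic}/ ? 1 : 0)
            != ($char =~ /\p{Arabic}/ ? 1 : 0);
    }
    print "mismatches: $mismatches\n";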

=head3 B<Blocks>

In addition to B<scripts>, Unicode also defines B<blocks> of
characters.  The difference between scripts and blocks is that the
concept of scripts is closer to natural languages, while the concept
of blocks is more of an artificial grouping based on groups of Unicode
characters with consecutive ordinal values. For example, the C<"Basic Latin">
block is all the characters whose ordinals are between 0 and 127, inclusive; in
other words, the ASCII characters.  The C<"Latin"> script contains some letters
from this as well as several other blocks, like C<"Latin-1 Supplement">,
C<"Latin Extended-A">, I<etc.>, but it does not contain all the characters from
those blocks. It does not, for example, contain the digits 0-9, because
those digits are shared across many scripts, and hence are in the
C<Common> script.

For more about scripts versus blocks, see UAX#24 "Unicode Script Property":
L<http://www.unicode.org/reports/tr24>

The C<Script_Extensions> or C<Script> properties are likely to be the
ones you want to use when processing
natural language; the C<Block> property may occasionally be useful in working
with the nuts and bolts of Unicode.

Block names are matched in the compound form, like C<\p{Block: Arrows}> or
C<\p{Blk=Hebrew}>.  Unlike most other properties, only a few block names have a
Unicode-defined short name.

Perl also defines single form synonyms for the block property in cases
where these do not conflict with something else.  But don't use any of
these, because they are unstable.  Since these are Perl extensions, they
are subordinate to official Unicode property names; Unicode doesn't know
nor care about Perl's extensions.  It may happen that a name that
currently means the Perl extension will later be changed without warning
to mean a different Unicode property in a future version of the perl
interpreter that uses a later Unicode release, and your code would no
longer work.  The extensions are mentioned here for completeness:  Take
the block name and prefix it with one of: C<In> (for example
C<\p{Blk=Arrows}> can currently be written as C<\p{In_Arrows}>); or
sometimes C<Is> (like C<\p{Is_Arrows}>); or sometimes no prefix at all
(C<\p{Arrows}>).  As of this writing (Unicode 9.0) there are no
conflicts with using the C<In_> prefix, but there are plenty with the
other two forms.  For example, C<\p{Is_Hebrew}> and C<\p{Hebrew}> mean
C<\p{Script_Extensions=Hebrew}> which is NOT the same thing as
C<\p{Blk=Hebrew}>.  Our
advice used to be to use the C<In_> prefix as a single form way of
specifying a block.  But Unicode 8.0 added properties whose names begin
with C<In>, and it's now clear that it's only luck that's so far
prevented a conflict.  Using C<In> is only marginally less typing than
C<Blk:>, and the latter's meaning is clearer anyway, and guaranteed to
never conflict.  So don't take chances.  Use C<\p{Blk=foo}> for new
code.  And be sure that block is what you really really want to do.  In
most cases scripts are what you want instead.

A complete list of blocks is in L<perluniprops>.
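
To make the block-versus-script distinction concrete, here is a small
sketch.  It relies on two facts: the digit "0" lies in the
C<"Basic Latin"> block but is in the C<Common> script, and C<U+FB1D>
(C<"HEBREW LETTER YOD WITH HIRIQ">) is in the Hebrew script but lies
outside the Hebrew block.  The hash C<%result> is a name invented for
this example.

    use strict;
    use warnings;

    my $zero = "0";
    my $yod  = "\x{FB1D}";   # HEBREW LETTER YOD WITH HIRIQ

    my %result = (
        zero_in_basic_latin_block => ($zero =~ /\p{Blk=Basic_Latin}/ ? 1 : 0),
        zero_in_latin_script      => ($zero =~ /\p{Latin}/           ? 1 : 0),
        yod_in_hebrew_script      => ($yod  =~ /\p{Hebrew}/          ? 1 : 0),
        yod_in_hebrew_block       => ($yod  =~ /\p{Blk=Hebrew}/      ? 1 : 0),
    );

    print "$_: $result{$_}\n" for sort keys %result;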

=head3 B<Other Properties>

There are many more properties than the very basic ones described here.
A complete list is in L<perluniprops>.

Unicode defines all its properties in the compound form, so all single-form
properties are Perl extensions.  Most of these are just synonyms for the
Unicode ones, but some are genuine extensions, including several that are in
the compound form.  And quite a few of these are actually recommended by Unicode
(in L<http://www.unicode.org/reports/tr18>).

This section gives some details on all extensions that aren't just
synonyms for compound-form Unicode properties
(for those properties, you'll have to refer to the
L<Unicode Standard|http://www.unicode.org/reports/tr44>).

=over

=item B<C<\p{All}>>

This matches every possible code point.  It is equivalent to C<qr/./s>.
Unlike all the other non-user-defined C<\p{}> property matches, no
warning is ever generated if this property is matched against a
non-Unicode code point (see L</Beyond Unicode code points> below).

=item B<C<\p{Alnum}>>

This matches any C<\p{Alphabetic}> or C<\p{Decimal_Number}> character.

=item B<C<\p{Any}>>

This matches any of the 1_114_112 Unicode code points.  It is a synonym
for C<\p{Unicode}>.

=item B<C<\p{ASCII}>>

This matches any of the 128 characters in the US-ASCII character set,
which is a subset of Unicode.

=item B<C<\p{Assigned}>>

This matches any assigned code point; that is, any code point whose L<general
category|/General_Category> is not C<Unassigned> (or equivalently, not C<Cn>).

=item B<C<\p{Blank}>>

This is the same as C<\h> and C<\p{HorizSpace}>:  A character that changes the
spacing horizontally.

=item B<C<\p{Decomposition_Type: Non_Canonical}>>    (Short: C<\p{Dt=NonCanon}>)

Matches a character that has a non-canonical decomposition.

The L</Extended Grapheme Clusters (Logical characters)> section above
talked about canonical decompositions.  However, many more characters
have a different type of decomposition, a "compatible" or
"non-canonical" decomposition.  The sequences that form these
decompositions are not considered canonically equivalent to the
pre-composed character.  An example is the C<"SUPERSCRIPT ONE">.  It is
somewhat like a regular digit 1, but not exactly; its decomposition into
the digit 1 is called a "compatible" decomposition, specifically a
"super" decomposition.  There are several such compatibility
decompositions (see L<http://www.unicode.org/reports/tr44>), including
one called "compat", which means some miscellaneous type of
decomposition that doesn't fit into the other decomposition categories
that Unicode has chosen.

Note that most Unicode characters don't have a decomposition, so their
decomposition type is C<"None">.

For your convenience, Perl has added the C<Non_Canonical> decomposition
type to mean any of the several compatibility decompositions.

=item B<C<\p{Graph}>>

Matches any character that is graphic.  Theoretically, this means a character
that on a printer would cause ink to be used.

=item B<C<\p{HorizSpace}>>

This is the same as C<\h> and C<\p{Blank}>:  a character that changes the
spacing horizontally.

=item B<C<\p{In=*}>>

This is a synonym for C<\p{Present_In=*}>.

=item B<C<\p{PerlSpace}>>

This is the same as C<\s>, restricted to ASCII, namely C<S<[ \f\n\r\t]>>
and starting in Perl v5.18, a vertical tab.

Mnemonic: Perl's (original) space

=item B<C<\p{PerlWord}>>

This is the same as C<\w>, restricted to ASCII, namely C<[A-Za-z0-9_]>.

Mnemonic: Perl's (original) word.

=item B<C<\p{Posix...}>>

There are several of these, which are equivalents, using the C<\p{}>
notation, for Posix classes and are described in
L<perlrecharclass/POSIX Character Classes>.

=item B<C<\p{Present_In: *}>>    (Short: C<\p{In=*}>)

This property is used when you need to know in what Unicode version(s) a
character is.

The "*" above stands for some two digit Unicode version number, such as
C<1.1> or C<4.0>; or the "*" can also be C<Unassigned>.  This property will
match the code points whose final disposition has been settled as of the
Unicode release given by the version number; C<\p{Present_In: Unassigned}>
will match those code points whose meaning has yet to be assigned.

For example, C<U+0041> C<"LATIN CAPITAL LETTER A"> was present in the very first
Unicode release available, which is C<1.1>, so this property is true for all
valid "*" versions.  On the other hand, C<U+1EFF> was not assigned until version
5.1 when it became C<"LATIN SMALL LETTER Y WITH LOOP">, so the only "*" that
would match it are 5.1, 5.2, and later.

Unicode furnishes the C<Age> property from which this is derived.  The problem
with Age is that a strict interpretation of it (which Perl takes) has it
matching the precise release a code point's meaning is introduced in.  Thus
C<U+0041> would match only 1.1; and C<U+1EFF> only 5.1.  This is not usually what
you want.

Some non-Perl implementations of the Age property may change its meaning to be
the same as the Perl C<Present_In> property; just be aware of that.

Another confusion with both these properties is that the definition is not
that the code point has been I<assigned>, but that the meaning of the code point
has been I<determined>.  This is because 66 code points will always be
unassigned, and so the C<Age> for them is the Unicode version in which the decision
to make them so was made.  For example, C<U+FDD0> is to be permanently
unassigned to a character, and the decision to do that was made in version 3.1,
so C<\p{Age=3.1}> matches this character, as also does C<\p{Present_In: 3.1}> and up.

=item B<C<\p{Print}>>

This matches any character that is graphical or blank, except controls.

=item B<C<\p{SpacePerl}>>

This is the same as C<\s>, including beyond ASCII.

Mnemonic: Space, as modified by Perl.  (Until v5.18, it didn't include the
vertical tab, which both the Posix standard and Unicode consider white space.)

=item B<C<\p{Title}>> and  B<C<\p{Titlecase}>>

Under case-sensitive matching, these both match the same code points as
C<\p{General Category=Titlecase_Letter}> (C<\p{gc=lt}>).  The difference
is that under C</i> caseless matching, these match the same as
C<\p{Cased}>, whereas C<\p{gc=lt}> matches C<\p{Cased_Letter}>.

=item B<C<\p{Unicode}>>

This matches any of the 1_114_112 Unicode code points.  It is a synonym
for C<\p{Any}>.

=item B<C<\p{VertSpace}>>

This is the same as C<\v>:  A character that changes the spacing vertically.

=item B<C<\p{Word}>>

This is the same as C<\w>, including over 100_000 characters beyond ASCII.

=item B<C<\p{XPosix...}>>

There are several of these, which are the standard Posix classes
extended to the full Unicode range.  They are described in
L<perlrecharclass/POSIX Character Classes>.

=back
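
A few of the extensions above can be exercised as follows.  This is a
sketch, not an exhaustive test: it contrasts C<Age> with C<Present_In>
using the C<U+1EFF> example given earlier, C<PerlSpace> with
C<SpacePerl> using a NO-BREAK SPACE, and checks the C<Non_Canonical>
decomposition type on C<"SUPERSCRIPT ONE">.  The hash C<%check> is a
name made up for this illustration.

    use strict;
    use warnings;

    my $y_loop = "\x{1EFF}";   # LATIN SMALL LETTER Y WITH LOOP, new in 5.1
    my $nbsp   = "\x{A0}";     # NO-BREAK SPACE
    my $sup1   = "\x{B9}";     # SUPERSCRIPT ONE
    my $eacute = "\x{E9}";     # has a canonical decomposition only

    my %check = (
        age_5_1        => ($y_loop =~ /\p{Age=5.1}/         ? 1 : 0),
        age_5_2        => ($y_loop =~ /\p{Age=5.2}/         ? 1 : 0),
        present_in_5_0 => ($y_loop =~ /\p{Present_In: 5.0}/ ? 1 : 0),
        present_in_5_2 => ($y_loop =~ /\p{Present_In: 5.2}/ ? 1 : 0),
        nbsp_spaceperl => ($nbsp   =~ /\p{SpacePerl}/       ? 1 : 0),
        nbsp_perlspace => ($nbsp   =~ /\p{PerlSpace}/       ? 1 : 0),
        sup1_noncanon  => ($sup1   =~ /\p{Dt=NonCanon}/     ? 1 : 0),
        e_noncanon     => ($eacute =~ /\p{Dt=NonCanon}/     ? 1 : 0),
    );

    print "$_: $check{$_}\n" for sort keys %check;

C<Age> is true only for the exact release (5.1 here), while
C<Present_In> is true for that release and every later one.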


=head2 User-Defined Character Properties

You can define your own binary character properties by defining subroutines
whose names begin with C<"In"> or C<"Is">.  (The experimental feature
L<perlre/(?[ ])> provides an alternative which allows more complex
definitions.)  The subroutines can be defined in any
package.  The user-defined properties can be used in the regular expression
C<\p{}> and C<\P{}> constructs; if you are using a user-defined property from a
package other than the one you are in, you must specify its package in the
C<\p{}> or C<\P{}> construct.

    # assuming property IsForeign defined in Lang::
    package main;  # property package name required
    if ($txt =~ /\p{Lang::IsForeign}+/) { ... }

    package Lang;  # property package name not required
    if ($txt =~ /\p{IsForeign}+/) { ... }


Note that the effect is compile-time and immutable once defined.
However, the subroutines are passed a single parameter, which is 0 if
case-sensitive matching is in effect and non-zero if caseless matching
is in effect.  The subroutine may return different values depending on
the value of the flag, and one set of values will immutably be in effect
for all case-sensitive matches, and the other set for all case-insensitive
matches.

Note that if the regular expression is tainted, then Perl will die rather
than calling the subroutine when the name of the subroutine is
determined by the tainted data.

The subroutines must return a specially-formatted string, with one
or more newline-separated lines.  Each line must be one of the following:

=over 4

=item *

A single hexadecimal number denoting a code point to include.

=item *

Two hexadecimal numbers separated by horizontal whitespace (space or
tab characters) denoting a range of code points to include.

=item *

Something to include, prefixed by C<"+">: a built-in character
property (prefixed by C<"utf8::">) or a fully qualified (including package
name) user-defined character property,
to represent all the characters in that property; two hexadecimal code
points for a range; or a single hexadecimal code point.

=item *

Something to exclude, prefixed by C<"-">: an existing character
property (prefixed by C<"utf8::">) or a fully qualified (including package
name) user-defined character property,
to represent all the characters in that property; two hexadecimal code
points for a range; or a single hexadecimal code point.

=item *

Something to negate, prefixed by C<"!">: an existing character
property (prefixed by C<"utf8::">) or a fully qualified (including package
name) user-defined character property,
to represent all the characters in that property; two hexadecimal code
points for a range; or a single hexadecimal code point.

=item *

Something to intersect with, prefixed by C<"&">: an existing character
property (prefixed by C<"utf8::">) or a fully qualified (including package
name) user-defined character property,
to keep only those characters matched so far that are also in that
property; two hexadecimal code points for a range; or a single
hexadecimal code point.

=back

For example, to define a property that covers both the Japanese
syllabaries (hiragana and katakana), you can define

    sub InKana {
        return <<END;
    3040\t309F
    30A0\t30FF
    END
    }

Imagine that the here-doc end marker is at the beginning of the line.
Now you can use C<\p{InKana}> and C<\P{InKana}>.

You could also have used the existing block property names:

    sub InKana {
        return <<'END';
    +utf8::InHiragana
    +utf8::InKatakana
    END
    }

Suppose you wanted to match only the allocated characters,
not the raw block ranges: in other words, you want to remove
the unassigned characters:

    sub InKana {
        return <<'END';
    +utf8::InHiragana
    +utf8::InKatakana
    -utf8::IsCn
    END
    }

The negation is useful for defining (surprise!) negated classes.

    sub InNotKana {
        return <<'END';
    !utf8::InHiragana
    -utf8::InKatakana
    +utf8::IsCn
    END
    }

This will match all non-Unicode code points, since every one of them is
not in Kana.  You can use intersection to exclude these, if desired, as
this modified example shows:

    sub InNotKana {
        return <<'END';
    !utf8::InHiragana
    -utf8::InKatakana
    +utf8::IsCn
    &utf8::Any
    END
    }

C<&utf8::Any> must be the last line in the definition.

Intersection is used generally for getting the common characters matched
by two (or more) classes.  It's important to remember not to use C<"&"> for
the first set; that would be intersecting with nothing, resulting in an
empty set.

Unlike non-user-defined C<\p{}> property matches, no warning is ever
generated if these properties are matched against a non-Unicode code
point (see L</Beyond Unicode code points> below).
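
Putting the pieces together, here is a self-contained sketch that
defines a property and then uses it.  It spells the two block ranges in
hexadecimal and removes unassigned code points with C<-utf8::IsCn>; the
string is built with explicit C<"\n"> separators so that no here-doc
terminator is needed.  The sample text and the array C<@runs> are
invented for this illustration.

    use strict;
    use warnings;

    # Hiragana and Katakana block ranges, minus unassigned code points.
    sub InKana {
        return "3040\t309F\n"
             . "30A0\t30FF\n"
             . "-utf8::IsCn\n";
    }

    # HIRAGANA LETTER A and KATAKANA LETTER KA between ASCII words.
    my $text = "abc \x{3042}\x{30AB} xyz";
    my @runs = $text =~ /(\p{InKana}+)/g;

    # U+3040 lies in the Hiragana block but is unassigned.
    my $reserved_matches = "\x{3040}" =~ /\p{InKana}/ ? 1 : 0;

    printf "runs: %d, first run length: %d, U+3040 matches: %d\n",
        scalar(@runs), length($runs[0]), $reserved_matches;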

=head2 User-Defined Case Mappings (for serious hackers only)

B<This feature has been removed as of Perl 5.16.>
The CPAN module C<L<Unicode::Casing>> provides better functionality without
the drawbacks that this feature had.  If you are using a Perl earlier
than 5.16, this feature was most fully documented in the 5.14 version of
this pod:
L<http://perldoc.perl.org/5.14.0/perlunicode.html#User-Defined-Case-Mappings-%28for-serious-hackers-only%29>

=head2 Character Encodings for Input and Output

See L<Encode>.

=head2 Unicode Regular Expression Support Level

The following list of Unicode supported features for regular expressions describes
all features currently directly supported by core Perl.  The references
to "Level I<N>" and the section numbers refer to
L<UTS#18 "Unicode Regular Expressions"|http://www.unicode.org/reports/tr18>,
version 13, November 2013.

=head3 Level 1 - Basic Unicode Support

 RL1.1   Hex Notation                     - Done          [1]
 RL1.2   Properties                       - Done          [2]
 RL1.2a  Compatibility Properties         - Done          [3]
 RL1.3   Subtraction and Intersection     - Experimental  [4]
 RL1.4   Simple Word Boundaries           - Done          [5]
 RL1.5   Simple Loose Matches             - Done          [6]
 RL1.6   Line Boundaries                  - Partial       [7]
 RL1.7   Supplementary Code Points        - Done          [8]

=over 4

=item [1] C<\N{U+...}> and C<\x{...}>

=item [2]
C<\p{...}> C<\P{...}>.  This requirement is for a minimal list of
properties.  Perl supports these and all other Unicode character
properties, as RL2.7 asks (see L</"Unicode Character Properties"> above).

=item [3]
Perl has C<\d> C<\D> C<\s> C<\S> C<\w> C<\W> C<\X> C<[:I<prop>:]>
C<[:^I<prop>:]>, plus all the properties specified by
L<http://www.unicode.org/reports/tr18/#Compatibility_Properties>.  These
are described above in L</Other Properties>.

=item [4]

The experimental feature C<"(?[...])"> starting in v5.18 accomplishes
this.

See L<perlre/(?[ ])>.  If you don't want to use an experimental
feature, you can use one of the following:

=over 4

=item *
Regular expression lookahead

You can mimic class subtraction using lookahead.
For example, what UTS#18 might write as

    [{Block=Greek}-[{UNASSIGNED}]]

in Perl can be written as either of:

    (?!\p{Unassigned})\p{Block=Greek}
    (?=\p{Assigned})\p{Block=Greek}

But in this particular example, you probably really want

    \p{Greek}

which will match assigned characters known to be part of the Greek script.

=item *

CPAN module C<L<Unicode::Regex::Set>>

It does implement the full UTS#18 grouping, intersection, union, and
removal (subtraction) syntax.

=item *

L</"User-Defined Character Properties">

C<"+"> for union, C<"-"> for removal (set-difference), C<"&"> for intersection

=back

=item [5]
C<\b> and C<\B> meet most, but not all, of the details of this requirement;
C<\b{wb}> and C<\B{wb}> meet all of them, as well as the stricter RL2.3.

=item [6]

Note that Perl does Full case-folding in matching, not Simple:

For example C<U+1F88> is equivalent to C<U+1F00 U+03B9>, instead of just
C<U+1F80>.  This difference matters mainly for certain Greek capital
letters with certain modifiers: the Full case-folding decomposes the
letter, while the Simple case-folding would map it to a single
character.

=item [7]

The reason this is considered to be only partially implemented is that
Perl has L<C<qrE<sol>\b{lb}E<sol>>|perlrebackslash/\b{lb}> and
C<L<Unicode::LineBreak>> that are conformant with
L<UAX#14 "Unicode Line Breaking Algorithm"|http://www.unicode.org/reports/tr14>.
The regular expression construct provides default behavior, while the
heavier-weight module provides customizable line breaking.

But Perl treats C<\n> as the start- and end-line
delimiter, whereas Unicode specifies more characters that should be
so-interpreted.

These are:

 VT   U+000B  (\v in C)
 FF   U+000C  (\f)
 CR   U+000D  (\r)
 NEL  U+0085
 LS   U+2028
 PS   U+2029

C<^> and C<$> in regular expression patterns are supposed to match all
these, but don't.
These characters also don't, but should, affect C<< <> >> C<$.>, and
script line numbers.

Also, lines should not be split within C<CRLF> (i.e. there is no
empty line between C<\r> and C<\n>).  For C<CRLF>, try the C<:crlf>
layer (see L<PerlIO>).

=item [8]
UTF-8/UTF-EBCDIC used in Perl allows not only C<U+10000> to
C<U+10FFFF> but also beyond C<U+10FFFF>.

=back
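
Two of the notes above, class subtraction by lookahead ([4]) and Full
case folding ([6]), can be sketched as follows.  The code point
C<U+0378> is assumed to still be unassigned in the Unicode version your
perl was built with; if a later release assigns it, pick a different
reserved code point from the Greek block.  The variable names are
invented for this illustration.

    use strict;
    use warnings;

    # Class subtraction via lookahead: Greek block minus unassigned.
    my $assigned_greek = qr/(?=\p{Assigned})\p{Block=Greek}/;

    my $alpha    = "\x{03B1}";   # GREEK SMALL LETTER ALPHA
    my $reserved = "\x{0378}";   # assumed unassigned (see above)

    my $alpha_ok     = $alpha    =~ $assigned_greek   ? 1 : 0;
    my $reserved_ok  = $reserved =~ $assigned_greek   ? 1 : 0;
    my $reserved_blk = $reserved =~ /\p{Block=Greek}/ ? 1 : 0;

    # Full case folding: U+1F88 folds to the sequence U+1F00 U+03B9.
    my $full_fold = "\x{1F88}" =~ /^\x{1F00}\x{03B9}$/i ? 1 : 0;

    print "alpha: $alpha_ok, reserved: $reserved_ok, ",
          "reserved in block: $reserved_blk, full fold: $full_fold\n";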

=head3 Level 2 - Extended Unicode Support

 RL2.1   Canonical Equivalents           - Retracted     [9]
                                           by Unicode
 RL2.2   Extended Grapheme Clusters      - Partial       [10]
 RL2.3   Default Word Boundaries         - Done          [11]
 RL2.4   Default Case Conversion         - Done
 RL2.5   Name Properties                 - Done
 RL2.6   Wildcard Properties             - Missing
 RL2.7   Full Properties                 - Done

=over 4

=item [9]
Unicode has rewritten this portion of UTS#18 to say that getting
canonical equivalence (see UAX#15
L<"Unicode Normalization Forms"|http://www.unicode.org/reports/tr15>)
is basically to be done at the programmer level.  Use NFD to write
both your regular expressions and text to match them against (you
can use L<Unicode::Normalize>).

=item [10]
Perl has C<\X> and C<\b{gcb}> but we don't have a "Grapheme Cluster Mode".

=item [11]
See
L<UAX#29 "Unicode Text Segmentation"|http://www.unicode.org/reports/tr29>.

=back
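
As a sketch of the advice in [9] and of C<\X> from [10], the following
normalizes text to NFD before matching and then counts grapheme
clusters.  It uses the core module L<Unicode::Normalize>, which exports
C<NFD> by default; the variable names are made up for this illustration.

    use strict;
    use warnings;
    use Unicode::Normalize;   # exports NFD()

    my $precomposed = "\x{E9}";     # LATIN SMALL LETTER E WITH ACUTE
    my $decomposed  = "e\x{301}";   # "e" plus COMBINING ACUTE ACCENT

    # Different code point sequences, same text after normalization.
    my $raw_equal = $precomposed eq $decomposed           ? 1 : 0;
    my $nfd_equal = NFD($precomposed) eq NFD($decomposed) ? 1 : 0;

    # A pattern written in NFD matches the normalized text.
    my $nfd_match = NFD($precomposed) =~ /^e\x{301}$/ ? 1 : 0;

    # \X treats the base letter and its mark as one logical character.
    my @clusters = "e\x{301}a" =~ /\X/g;

    printf "raw equal: %d, NFD equal: %d, NFD match: %d, clusters: %d\n",
        $raw_equal, $nfd_equal, $nfd_match, scalar @clusters;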

=head3 Level 3 - Tailored Support

 RL3.1   Tailored Punctuation            - Missing
 RL3.2   Tailored Grapheme Clusters      - Missing       [12]
 RL3.3   Tailored Word Boundaries        - Missing
 RL3.4   Tailored Loose Matches          - Retracted by Unicode
 RL3.5   Tailored Ranges                 - Retracted by Unicode
 RL3.6   Context Matching                - Missing       [13]
 RL3.7   Incremental Matches             - Missing
 RL3.8   Unicode Set Sharing             - Unicode is proposing
                                           to retract this
 RL3.9   Possible Match Sets             - Missing
 RL3.10  Folded Matching                 - Retracted by Unicode
 RL3.11  Submatchers                     - Missing

=over 4

=item [12]
Perl has L<Unicode::Collate>, but it isn't integrated with regular
expressions.  See
L<UTS#10 "Unicode Collation Algorithm"|http://www.unicode.org/reports/tr10>.

=item [13]
Perl has C<(?<=x)> and C<(?=x)>, but lookaheads or lookbehinds should
see outside of the target substring.

=back

=head2 Unicode Encodings

Unicode characters are assigned to I<code points>, which are abstract
numbers.  To use these numbers, various encodings are needed.

=over 4

=item *

UTF-8

UTF-8 is a variable-length (1 to 4 bytes), byte-order independent
encoding.  In most of Perl's documentation, including elsewhere in this
document, the term "UTF-8" means also "UTF-EBCDIC".  But in this section,
"UTF-8" refers only to the encoding used on ASCII platforms.  It is a
superset of 7-bit US-ASCII, so anything encoded in ASCII has the
identical representation when encoded in UTF-8.

The following table is from Unicode 3.2.

 Code Points            1st Byte  2nd Byte  3rd Byte 4th Byte

   U+0000..U+007F       00..7F
   U+0080..U+07FF     * C2..DF    80..BF
   U+0800..U+0FFF       E0      * A0..BF    80..BF
   U+1000..U+CFFF       E1..EC    80..BF    80..BF
   U+D000..U+D7FF       ED        80..9F    80..BF
   U+D800..U+DFFF       +++++ utf16 surrogates, not legal utf8 +++++
   U+E000..U+FFFF       EE..EF    80..BF    80..BF
  U+10000..U+3FFFF      F0      * 90..BF    80..BF    80..BF
  U+40000..U+FFFFF      F1..F3    80..BF    80..BF    80..BF
 U+100000..U+10FFFF     F4        80..8F    80..BF    80..BF

Note the gaps marked by "*" before several of the byte entries above.  These are
caused by legal UTF-8 avoiding non-shortest encodings: it is technically
possible to UTF-8-encode a single code point in different ways, but that is
explicitly forbidden, and the shortest possible encoding should always be used
(and that is what Perl does).

Another way to look at it is via bits:

                Code Points  1st Byte  2nd Byte  3rd Byte  4th Byte

                   0aaaaaaa  0aaaaaaa
           00000bbbbbaaaaaa  110bbbbb  10aaaaaa
           ccccbbbbbbaaaaaa  1110cccc  10bbbbbb  10aaaaaa
 00000dddccccccbbbbbbaaaaaa  11110ddd  10cccccc  10bbbbbb  10aaaaaa

As you can see, the continuation bytes all begin with C<"10">, and the
leading bits of the start byte tell how many bytes there are in the
encoded character.

The original UTF-8 specification allowed up to 6 bytes, to allow
encoding of numbers up to C<0x7FFF_FFFF>.  Perl continues to allow those,
and has extended that up to 13 bytes to encode code points up to what
can fit in a 64-bit word.  However, Perl will warn if you output any of
these as being non-portable; and under strict UTF-8 input protocols,
they are forbidden.  In addition, it is deprecated to use a code point
larger than what a signed integer variable on your system can hold.  On
32-bit ASCII systems, this means C<0x7FFF_FFFF> is the legal maximum
going forward (much higher on 64-bit systems).

=item *

UTF-EBCDIC

Like UTF-8, but EBCDIC-safe, in the way that UTF-8 is ASCII-safe.
This means that all the basic characters (which includes all
those that have ASCII equivalents (like C<"A">, C<"0">, C<"%">, I<etc.>)
are the same in both EBCDIC and UTF-EBCDIC.)

UTF-EBCDIC is used on EBCDIC platforms.  It generally requires more
bytes to represent a given code point than UTF-8 does; the largest
Unicode code points take 5 bytes to represent (instead of 4 in UTF-8),
and, extended for 64-bit words, it uses 14 bytes instead of 13 bytes in
UTF-8.

=item *

UTF-16, UTF-16BE, UTF-16LE, Surrogates, and C<BOM>'s (Byte Order Marks)

The following items are mostly for reference and general Unicode
knowledge; Perl doesn't use these constructs internally.

Like UTF-8, UTF-16 is a variable-width encoding, but where
UTF-8 uses 8-bit code units, UTF-16 uses 16-bit code units.
All code points occupy either 2 or 4 bytes in UTF-16: code points
C<U+0000..U+FFFF> are stored in a single 16-bit unit, and code
points C<U+10000..U+10FFFF> in two 16-bit units.  The latter case is
using I<surrogates>, the first 16-bit unit being the I<high
surrogate>, and the second being the I<low surrogate>.

Surrogates are code points set aside to encode the C<U+10000..U+10FFFF>
range of Unicode code points in pairs of 16-bit units.  The I<high
surrogates> are the range C<U+D800..U+DBFF> and the I<low surrogates>
are the range C<U+DC00..U+DFFF>.  The surrogate encoding is

    $hi = int(($uni - 0x10000) / 0x400) + 0xD800;
    $lo = ($uni - 0x10000) % 0x400 + 0xDC00;

and the decoding is

    $uni = 0x10000 + ($hi - 0xD800) * 0x400 + ($lo - 0xDC00);

Because of the 16-bitness, UTF-16 is byte-order dependent.  UTF-16
itself can be used for in-memory computations, but if storage or
transfer is required either UTF-16BE (big-endian) or UTF-16LE
(little-endian) encodings must be chosen.

This introduces another problem: what if you just know that your data
is UTF-16, but you don't know which endianness?  Byte Order Marks, or
C<BOM>'s, are a solution to this.  A special character has been reserved
in Unicode to function as a byte order marker: the character with the
code point C<U+FEFF> is the C<BOM>.

The trick is that if you read a C<BOM>, you will know the byte order,
since if it was written on a big-endian platform, you will read the
bytes C<0xFE 0xFF>, but if it was written on a little-endian platform,
you will read the bytes C<0xFF 0xFE>.  (And if the originating platform
was an ASCII platform writing UTF-8, you will read the bytes
C<0xEF 0xBB 0xBF>.)

The way this trick works is that the character with the code point
C<U+FFFE> is not supposed to be in input streams, so the
sequence of bytes C<0xFF 0xFE> is unambiguously "C<BOM>, represented in
little-endian format" and cannot be "C<U+FFFE>, represented in big-endian
format".

Surrogates have no meaning in Unicode outside their use in pairs to
represent other code points.  However, Perl allows them to be
represented individually internally, for example by saying
C<chr(0xD801)>, so that all code points, not just those valid for open
interchange, are
representable.  Unicode does define semantics for them, such as their
C<L</General_Category>> is C<"Cs">.  But because their use is somewhat dangerous,
Perl will warn (using the warning category C<"surrogate">, which is a
sub-category of C<"utf8">) if an attempt is made
to do things like take the lower case of one, or match
case-insensitively, or to output them.  (But don't try this on Perls
before 5.14.)

=item *

UTF-32, UTF-32BE, UTF-32LE

The UTF-32 family is pretty much like the UTF-16 family, except that
the units are 32-bit, and therefore the surrogate scheme is not
needed.  UTF-32 is a fixed-width encoding.  The C<BOM> signatures are
C<0x00 0x00 0xFE 0xFF> for BE and C<0xFF 0xFE 0x00 0x00> for LE.

=item *

UCS-2, UCS-4

Legacy, fixed-width encodings defined by the ISO 10646 standard.  UCS-2 is a 16-bit
encoding.  Unlike UTF-16, UCS-2 is not extensible beyond C<U+FFFF>,
because it does not use surrogates.  UCS-4 is a 32-bit encoding,
functionally identical to UTF-32 (the difference being that
UCS-4 forbids neither surrogates nor code points larger than C<0x10_FFFF>).

=item *

UTF-7

A seven-bit safe (non-eight-bit) encoding, which is useful if the
transport or storage is not eight-bit safe.  Defined by RFC 2152.

=back
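
The UTF-8 bit layout and the surrogate arithmetic described above can
be checked against the core L<Encode> module.  This is a sketch for
ASCII platforms only, and it handles only C<U+0000..U+10FFFF>.  The
helpers C<utf8_bytes>, C<to_surrogates>, and C<from_surrogates> are
hypothetical names written for this illustration, not part of any
module.

    use strict;
    use warnings;
    use Encode qw(encode);

    # Encode one code point by hand, following the bit table above.
    sub utf8_bytes {
        my ($cp) = @_;
        return ($cp) if $cp < 0x80;
        return (0xC0 | ($cp >> 6), 0x80 | ($cp & 0x3F)) if $cp < 0x800;
        return (0xE0 | ($cp >> 12),
                0x80 | (($cp >> 6) & 0x3F),
                0x80 | ($cp & 0x3F)) if $cp < 0x10000;
        return (0xF0 | ($cp >> 18),
                0x80 | (($cp >> 12) & 0x3F),
                0x80 | (($cp >> 6) & 0x3F),
                0x80 | ($cp & 0x3F));
    }

    # Split a supplementary code point into (high, low) surrogates.
    sub to_surrogates {
        my ($cp) = @_;
        my $offset = $cp - 0x10000;
        return (0xD800 + ($offset >> 10), 0xDC00 + ($offset & 0x3FF));
    }

    # Recombine a surrogate pair into the original code point.
    sub from_surrogates {
        my ($hi, $lo) = @_;
        return 0x10000 + (($hi - 0xD800) << 10) + ($lo - 0xDC00);
    }

    my $cp      = 0x1F600;
    my @manual  = utf8_bytes($cp);
    my @builtin = unpack "C*", encode("UTF-8", chr($cp));
    my @pair    = to_surrogates($cp);
    my @utf16   = unpack "n*", encode("UTF-16BE", chr($cp));

    printf "UTF-8:  %s\n", join " ", map { sprintf "%02X", $_ } @manual;
    printf "UTF-16: %s\n", join " ", map { sprintf "%04X", $_ } @pair;

The shifts and masks are integer-only equivalents of dividing by
C<0x400> and taking the remainder.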

=head2 Noncharacter code points

66 code points are set aside in Unicode as "noncharacter code points".
These all have the C<Unassigned> (C<Cn>) C<L</General_Category>>, and
no character will ever be assigned to any of them.  They are the 32 code
points between C<U+FDD0> and C<U+FDEF> inclusive, and the 34 code
points:

 U+FFFE   U+FFFF
 U+1FFFE  U+1FFFF
 U+2FFFE  U+2FFFF
 ...
 U+EFFFE  U+EFFFF
 U+FFFFE  U+FFFFF
 U+10FFFE U+10FFFF

Until Unicode 7.0, the noncharacters were "B<forbidden> for use in open
interchange of Unicode text data", so that code that processed those
streams could use these code points as sentinels that could be mixed in
with character data, and would always be distinguishable from that data.
(Emphasis above and in the next paragraph is added in this document.)

Unicode 7.0 changed the wording so that they are "B<not recommended> for
use in open interchange of Unicode text data".  The 7.0 Standard goes on
to say:

=over 4

"If a noncharacter is received in open interchange, an application is
not required to interpret it in any way.  It is good practice, however,
to recognize it as a noncharacter and to take appropriate action, such
as replacing it with C<U+FFFD> replacement character, to indicate the
problem in the text.  It is not recommended to simply delete
noncharacter code points from such text, because of the potential
security issues caused by deleting uninterpreted characters.  (See
conformance clause C7 in Section 3.2, Conformance Requirements, and
L<Unicode Technical Report #36, "Unicode Security
Considerations"|http://www.unicode.org/reports/tr36/#Substituting_for_Ill_Formed_Subsequences>)."

=back

This change was made because it was found that various commercial tools,
such as editors and source code control systems, had been written
so that they would not handle program files that used these code points,
effectively precluding their use almost entirely!  And that was never
the intent.  They've always been meant to be usable within an
application, or cooperating set of applications, at will.

If you're writing code, such as an editor, that is supposed to be able
to handle any Unicode text data, then you shouldn't be using these code
points yourself, and should instead allow them in the input.  If you need
sentinels, they should instead be something that isn't legal Unicode.
For UTF-8 data, you can use the bytes 0xC0 and 0xC1 as sentinels, as
they never appear in well-formed UTF-8.  (There are equivalents for
UTF-EBCDIC).  You can also store your Unicode code points in integer
variables and use negative values as sentinels.

If you're not writing such a tool, then whether you accept noncharacters
as input is up to you (though the Standard recommends that you not).  If
you do strict input stream checking with Perl, these code points
continue to be forbidden.  This is to maintain backward compatibility
(otherwise potential security holes could open up, as an unsuspecting
application that was written assuming the noncharacters would be
filtered out before getting to it, could now, without warning, start
getting them).  To do strict checking, you can use the layer
C<:encoding('UTF-8')>.

Perl continues to warn (using the warning category C<"nonchar">, which
is a sub-category of C<"utf8">) if an attempt is made to output
noncharacters.

=head2 Beyond Unicode code points

The maximum Unicode code point is C<U+10FFFF>, and Unicode only defines
operations on code points up through that.  But Perl works on code
points up to the maximum permissible unsigned number available on the
platform.  However, Perl will not accept these from input streams unless
lax rules are being used, and will warn (using the warning category
C<"non_unicode">, which is a sub-category of C<"utf8">) if any are output.

Since Unicode rules are not defined on these code points, if a
Unicode-defined operation is done on them, Perl uses what we believe are
sensible rules, while generally warning, using the C<"non_unicode">
category.  For example, C<uc("\x{11_0000}")> will generate such a
warning, returning the input parameter as its result, since Perl defines
the uppercase of every non-Unicode code point to be the code point
itself.  (All the case changing operations, not just uppercasing, work
this way.)
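
The following sketch exercises the case-changing rule just described.
It captures warnings in an array rather than assuming how many are
raised, since that can depend on the Perl version; the variable names
are illustrative only.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them reach STDERR.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

my $above = chr(0x11_0000);    # first code point beyond U+10FFFF

my %result = (
    uc      => uc($above),
    lc      => lc($above),
    ucfirst => ucfirst($above),
    lcfirst => lcfirst($above),
);

for my $op (sort keys %result) {
    printf "%-7s returned its argument: %s\n",
        $op, ($result{$op} eq $above ? "yes" : "no");
}
printf "warnings captured: %d\n", scalar @warnings;
```

Nothing here prints the above-Unicode character itself, so any
warnings captured come from the case-changing operations and not from
output.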

The situation with matching Unicode properties in regular expressions,
the C<\p{}> and C<\P{}> constructs, against these code points is not as
clear cut, and how these are handled has changed as we've gained
experience.

One possibility is to treat any match against these code points as
undefined.  But since Perl doesn't have the concept of a match being
undefined, it converts this to failing or C<FALSE>.  This is almost, but
not quite, what Perl did from v5.14 (when use of these code points
became generally reliable) through v5.18.  The difference is that Perl
treated all C<\p{}> matches as failing, but all C<\P{}> matches as
succeeding.

One problem with this is that it leads to unexpected, and confusing
results in some cases:

 chr(0x110000) =~ /\p{ASCII_Hex_Digit=True}/    # Failed on <= v5.18
 chr(0x110000) =~ /\p{ASCII_Hex_Digit=False}/   # Failed! on <= v5.18

That is, it treated both matches as undefined, and converted that to
false (raising a warning on each).  The first case is the expected
result, but the second is likely counterintuitive: "How could both be
false when they are complements?"  Another problem was that the
implementation optimized many Unicode property matches down to already
existing simpler, faster operations, which don't raise the warning.  We
chose to not forgo those optimizations, which help the vast majority of
matches, just to generate a warning for the unlikely event that an
above-Unicode code point is being matched against.

As a result of these problems, starting in v5.20, what Perl does is
to treat non-Unicode code points as just typical unassigned Unicode
characters, and matches accordingly.  (Note: Unicode has atypical
unassigned code points.  For example, it has noncharacter code points,
and ones that, when they do get assigned, are destined to be written
Right-to-left, as Arabic and Hebrew are.  Perl assumes that no
non-Unicode code point has any atypical properties.)

Perl, in most cases, will raise a warning when matching an above-Unicode
code point against a Unicode property when the result is C<TRUE> for
C<\p{}>, and C<FALSE> for C<\P{}>.  For example:

 chr(0x110000) =~ /\p{ASCII_Hex_Digit=True}/    # Fails, no warning
 chr(0x110000) =~ /\p{ASCII_Hex_Digit=False}/   # Succeeds, with warning

In both these examples, the character being matched is non-Unicode, so
Unicode doesn't define how it should match.  It clearly isn't an ASCII
hex digit, so the first example clearly should fail, and so it does,
with no warning.  But it is arguable that the second example should have
an undefined, hence C<FALSE>, result.  So a warning is raised for it.

Thus the warning is raised for many fewer cases than in earlier Perls,
and only when what the result is could be arguable.  It turns out that
none of the optimizations made by Perl (or ever likely to be made)
cause the warning to be skipped, so it solves both problems of Perl's
earlier approach.  The most commonly used property that is affected by
this change is C<\p{Unassigned}> which is a short form for
C<\p{General_Category=Unassigned}>.  Starting in v5.20, all non-Unicode
code points are considered C<Unassigned>.  In earlier releases the
matches failed because the result was considered undefined.
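
A minimal sketch of the v5.20 behavior described above, assuming a Perl
of at least that version (hence the C<use 5.020> line).  Warnings are
captured and counted rather than asserted, because the text above says
only that they are raised "in most cases".

```perl
use 5.020;
use warnings;

# Collect warnings instead of letting them reach STDERR.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

my $above = chr(0x11_0000);    # a non-Unicode code point

# Each match is reduced to 1 or 0 so the results are easy to print.
my $is_hex     = $above =~ /\p{ASCII_Hex_Digit=True}/  ? 1 : 0;
my $not_hex    = $above =~ /\p{ASCII_Hex_Digit=False}/ ? 1 : 0;
my $unassigned = $above =~ /\p{Unassigned}/            ? 1 : 0;

print "ASCII_Hex_Digit=True:  $is_hex\n";
print "ASCII_Hex_Digit=False: $not_hex\n";
print "Unassigned:            $unassigned\n";
printf "warnings captured:     %d\n", scalar @warnings;
```

On v5.20 or later the three results should be 0, 1 and 1, matching the
rule that a non-Unicode code point is treated as a typical unassigned
one.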

The only place where the warning is not raised when it arguably should
have been is if optimizations cause the whole pattern match to not even
be attempted.  For example, Perl may figure out that for a string to
match a certain regular expression pattern, the string has to contain
the substring C<"foobar">.  Before attempting the match, Perl may look
for that substring, and if not found, immediately fail the match without
actually trying it; so no warning gets generated even if the string
contains an above-Unicode code point.

This behavior is more "Do what I mean" than in earlier Perls for most
applications.  But it catches fewer issues for code that needs to be
strictly Unicode compliant.  Therefore there is an additional mode of
operation available to accommodate such code.  This mode is enabled if a
regular expression pattern is compiled within the lexical scope where
the C<"non_unicode"> warning class has been made fatal, say by:

 use warnings FATAL => "non_unicode";

(see L<warnings>).  In this mode of operation, Perl will raise the
warning for all matches against a non-Unicode code point (not just the
arguable ones), and it skips the optimizations that might cause the
warning to not be output.  (It currently still won't warn if the match
isn't even attempted, like in the C<"foobar"> example above.)

In summary, Perl now normally treats non-Unicode code points as typical
Unicode unassigned code points for regular expression matches, raising a
warning only when it is arguable what the result should be.  However, if
this warning has been made fatal, it isn't skipped.

There is one exception to all this.  C<\p{All}> looks like a Unicode
property, but it is a Perl extension that is defined to be true for all
possible code points, Unicode or not, so no warning is ever generated
when matching this against a non-Unicode code point.  (Prior to v5.20,
it was an exact synonym for C<\p{Any}>, matching code points C<0>
through C<0x10FFFF>.)
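
The difference between the two properties can be seen with a short
test, again assuming v5.20 or later.  This is only a sketch; the
variable names are arbitrary.

```perl
use 5.020;
use warnings;

# Collect warnings instead of letting them reach STDERR.
my @warnings;
$SIG{__WARN__} = sub { push @warnings, $_[0] };

my $above = chr(0x11_0000);    # a non-Unicode code point

my $in_all = $above =~ /\p{All}/ ? 1 : 0;   # Perl extension: every code point
my $in_any = $above =~ /\p{Any}/ ? 1 : 0;   # Unicode code points only

print "matches \\p{All}: $in_all\n";
print "matches \\p{Any}: $in_any\n";
printf "warnings captured: %d\n", scalar @warnings;
```

Here C<\p{All}> should match and C<\p{Any}> should not, since the
latter covers only C<0> through C<0x10FFFF>.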

=head2 Security Implications of Unicode

First, read
L<Unicode Security Considerations|http://www.unicode.org/reports/tr36>.

Also, note the following:

=over 4

=item *

Malformed UTF-8

UTF-8 is very structured, so many combinations of bytes are invalid.  In
the past, Perl tried to soldier on and make some sense of invalid
combinations, but this can lead to security holes, so now, if the Perl
core needs to process an invalid combination, it will either raise a
fatal error, or will replace those bytes by the sequence that forms the
Unicode REPLACEMENT CHARACTER, for which purpose Unicode created it.

Every code point can be represented by more than one possible
syntactically valid UTF-8 sequence.  Early on, both Unicode and Perl
considered any of these to be valid, but now, all sequences longer
than the shortest possible one are considered to be malformed.

Unicode considers many code points to be illegal, or to be avoided.
Perl generally accepts them, once they have passed through any input
filters that may try to exclude them.  These have been discussed above
(see "Surrogates" under UTF-16 in L</Unicode Encodings>,
L</Noncharacter code points>, and L</Beyond Unicode code points>).

=item *

Regular expression pattern matching may surprise you if you're not
accustomed to Unicode.  Starting in Perl 5.14, several pattern
modifiers are available to control this, called the character set
modifiers.  Details are given in L<perlre/Character set modifiers>.

=back
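
To make the point about over-long sequences under "Malformed UTF-8"
concrete, this sketch feeds a two-byte spelling of C<U+0000> to the
standard L<Encode> module.  It shows one way to observe the rejection
from user code; it is not a description of how the Perl core handles
such input internally.

```perl
use strict;
use warnings;
use Encode qw(decode);

# 0xC0 0x80 is an over-long (two-byte) spelling of U+0000.
my $overlong = "\xC0\x80";

# Default checking: malformed input is replaced, not interpreted.
my $lenient  = decode("UTF-8", $overlong);
my $replaced = $lenient =~ /\x{FFFD}/ ? 1 : 0;
my $has_nul  = $lenient =~ /\x00/     ? 1 : 0;

# FB_CROAK: malformed input is a fatal error.  A copy is passed because
# decode() may modify its input when a CHECK value is given.
my $copy      = $overlong;
my $strict_ok = eval { decode("UTF-8", $copy, Encode::FB_CROAK()); 1 } ? 1 : 0;

print "replacement character present: $replaced\n";
print "NUL smuggled through:          $has_nul\n";
print "strict decode succeeded:       $strict_ok\n";
```

The decoded text should contain C<U+FFFD> and no NUL, and the strict
decode should die, so the over-long form never turns into the character
it pretends to encode.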

As discussed elsewhere, Perl has one foot (two hooves?) planted in
each of two worlds: the old world of ASCII and single-byte locales, and
the new world of Unicode, upgrading when necessary.
If your legacy code does not explicitly use Unicode, no automatic
switch-over to Unicode should happen.

=head2 Unicode in Perl on EBCDIC

Unicode is supported on EBCDIC platforms.  See L<perlebcdic>.

Unless ASCII vs. EBCDIC issues are specifically being discussed,
references to UTF-8 encoding in this document and elsewhere should be
read as meaning UTF-EBCDIC on EBCDIC platforms.
See L<perlebcdic/Unicode and UTF>.

Because UTF-EBCDIC is so similar to UTF-8, the differences are mostly
hidden from you; S<C<use utf8>> (and NOT something like
S<C<use utfebcdic>>) declares that the script is in the platform's
"native" 8-bit encoding of Unicode.  (Similarly for the C<":utf8">
layer.)

=head2 Locales

See L<perllocale/Unicode and UTF-8>.

=head2 When Unicode Does Not Happen

There are still many places where Unicode (in some encoding or
another) could be given as arguments or received as results, or both in
Perl, but it is not, in spite of Perl having extensive ways to input and
output in Unicode, and a few other "entry points" like the C<@ARGV>
array (which can sometimes be interpreted as UTF-8).

The following are such interfaces.  Also, see L</The "Unicode Bug">.
For all of these interfaces Perl
currently (as of v5.16.0) simply assumes byte strings both as arguments
and results, or UTF-8 strings if the (deprecated) C<encoding> pragma has been used.

One reason that Perl does not attempt to resolve the role of Unicode in
these situations is that the answers are highly dependent on the operating
system and the file system(s).  For example, whether filenames can be
in Unicode and in exactly what kind of encoding, is not exactly a
portable concept.  Similarly for C<qx> and C<system>: how well will the
"command-line interface" (and which of them?) handle Unicode?

=over 4

=item *

C<chdir>, C<chmod>, C<chown>, C<chroot>, C<exec>, C<link>, C<lstat>, C<mkdir>,
C<rename>, C<rmdir>, C<stat>, C<symlink>, C<truncate>, C<unlink>, C<utime>, C<-X>

=item *

C<%ENV>

=item *

C<glob> (aka the C<E<lt>*E<gt>> operator)

=item *

C<open>, C<opendir>, C<sysopen>

=item *

C<qx> (aka the backtick operator), C<system>

=item *

C<readdir>, C<readlink>

=back

=head2 The "Unicode Bug"

The term "Unicode bug" has been applied to an inconsistency with the
code points in the C<Latin-1 Supplement> block, that is, between
128 and 255.  Without a locale specified, unlike all other characters or
code points, these characters can have very different semantics
depending on the rules in effect.  (Characters whose code points are
above 255 force Unicode rules; whereas the rules for ASCII characters
are the same under both ASCII and Unicode rules.)

Under Unicode rules, these upper-Latin1 characters are interpreted as
Unicode code points, which means they have the same semantics as Latin-1
(ISO-8859-1) and C1 controls.

As explained in L</ASCII Rules versus Unicode Rules>, under ASCII rules,
they are considered to be unassigned characters.

This can lead to unexpected results.  For example, a string's
semantics can suddenly change if a code point above 255 is appended to
it, which changes the rules from ASCII to Unicode.  As an
example, consider the following program and its output:

 $ perl -le'
     no feature "unicode_strings";
     $s1 = "\xC2";
     $s2 = "\x{2660}";
     for ($s1, $s2, $s1.$s2) {
         print /\w/ || 0;
     }
 '
 0
 0
 1

If there's no C<\w> in C<$s1> or in C<$s2>, why does their concatenation
have one?

This anomaly stems from Perl's attempt to not disturb older programs that
didn't use Unicode, along with Perl's desire to add Unicode support
seamlessly.  But the result turned out to not be seamless.  (By the way,
you can choose to be warned when things like this happen.  See
C<L<encoding::warnings>>.)

L<S<C<use feature 'unicode_strings'>>|feature/The 'unicode_strings' feature>
was added, starting in Perl v5.12, to address this problem.  It affects
these things:

=over 4

=item *

Changing the case of a scalar, that is, using C<uc()>, C<ucfirst()>, C<lc()>,
and C<lcfirst()>, or C<\L>, C<\U>, C<\u> and C<\l> in double-quotish
contexts, such as regular expression substitutions.

Under C<unicode_strings> starting in Perl 5.12.0, Unicode rules are
generally used.  See L<perlfunc/lc> for details on how this works
in combination with various other pragmas.

=item *

Using caseless (C</i>) regular expression matching.

Starting in Perl 5.14.0, regular expressions compiled within
the scope of C<unicode_strings> use Unicode rules
even when executed or compiled into larger
regular expressions outside the scope.

=item *

Matching any of several properties in regular expressions.

These properties are C<\b> (without braces), C<\B> (without braces),
C<\s>, C<\S>, C<\w>, C<\W>, and all the Posix character classes
I<except> C<[[:ascii:]]>.

Starting in Perl 5.14.0, regular expressions compiled within
the scope of C<unicode_strings> use Unicode rules
even when executed or compiled into larger
regular expressions outside the scope.

=item *

In C<quotemeta> or its inline equivalent C<\Q>.

Starting in Perl 5.16.0, consistent quoting rules are used within the
scope of C<unicode_strings>, as described in L<perlfunc/quotemeta>.
Prior to that, or outside its scope, no code points above 127 are quoted
in UTF-8 encoded strings, but in byte encoded strings, code points
between 128 and 255 are always quoted.

=item *

In the C<..> or L<range|perlop/Range Operators> operator.

Starting in Perl 5.26.0, the range operator on strings treats their lengths
consistently within the scope of C<unicode_strings>. Prior to that, or
outside its scope, it could produce strings whose length in characters
exceeded that of the right-hand side, where the right-hand side took up more
bytes than the correct range endpoint.

=item *

In L<< C<split>'s special-case whitespace splitting|perlfunc/split >>.

Starting in Perl 5.28.0, the C<split> function with a pattern specified as
a string containing a single space handles whitespace characters consistently
within the scope of C<unicode_strings>. Prior to that, or outside its scope,
characters that are whitespace according to Unicode rules but not according to
ASCII rules were treated as field contents rather than field separators when
they appear in byte-encoded strings.

=back

You can see from the above that the effect of C<unicode_strings>
increased over several Perl releases.  (And Perl's support for Unicode
continues to improve; it's best to use the latest available release in
order to get the most complete and accurate results possible.)  Note that
C<unicode_strings> is automatically chosen if you S<C<use 5.012>> or
higher.

For Perls earlier than those described above, or when a string is passed
to a function outside the scope of C<unicode_strings>, see the next section.
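
The effect of the feature on a byte-encoded string can be sketched as
follows.  The same one-character string is tested in two lexical
scopes, one with C<unicode_strings> switched off and one with it
switched on; the hash names are arbitrary.

```perl
use strict;
use warnings;

my $byte_string = "\xE9";    # LATIN SMALL LETTER E WITH ACUTE, one byte

my (%without, %with);
{
    no feature 'unicode_strings';
    %without = (
        word => ($byte_string =~ /\w/ ? 1 : 0),
        uc   => sprintf("%02X", ord uc $byte_string),
    );
}
{
    use feature 'unicode_strings';
    %with = (
        word => ($byte_string =~ /\w/ ? 1 : 0),
        uc   => sprintf("%02X", ord uc $byte_string),
    );
}

print "without unicode_strings: \\w=$without{word} uc=$without{uc}\n";
print "with unicode_strings:    \\w=$with{word} uc=$with{uc}\n";
```

Without the feature the character is neither a word character nor
changed by C<uc>; with it, C<\w> matches and C<uc> yields C<0xC9>.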

=head2 Forcing Unicode in Perl (Or Unforcing Unicode in Perl)

Sometimes (see L</"When Unicode Does Not Happen"> or L</The "Unicode Bug">)
there are situations where you simply need to force a byte
string into UTF-8, or vice versa.  The standard module L<Encode> can be
used for this, or the low-level calls
L<C<utf8::upgrade($bytestring)>|utf8/Utility functions> and
L<C<utf8::downgrade($utf8string[, FAIL_OK])>|utf8/Utility functions>.

Note that C<utf8::downgrade()> can fail if the string contains characters
that don't fit into a byte.

Calling either function on a string that already is in the desired state is a
no-op.

L</ASCII Rules versus Unicode Rules> gives all the ways that a string is
made to use Unicode rules.
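
A short sketch of the two low-level calls follows.  It uses
C<utf8::is_utf8> only to observe the internal state; the variable names
are illustrative.

```perl
use strict;
use warnings;

my $s = "caf\xE9";                        # four characters, stored as bytes
my $flag_before = utf8::is_utf8($s) ? 1 : 0;

utf8::upgrade($s);                        # same characters, UTF-8 internally
my $flag_up   = utf8::is_utf8($s) ? 1 : 0;
my $same_text = ($s eq "caf\xE9" && length($s) == 4) ? 1 : 0;

my $down_ok   = utf8::downgrade($s) ? 1 : 0;    # fits in bytes: succeeds
my $flag_down = utf8::is_utf8($s) ? 1 : 0;

my $wide    = "\x{263A}";                 # does not fit in a byte
my $wide_ok = utf8::downgrade($wide, 1) ? 1 : 0;   # FAIL_OK: false, no die

print "flag before/after upgrade/after downgrade: ",
      "$flag_before/$flag_up/$flag_down\n";
print "text unchanged by upgrade: $same_text\n";
print "downgrade of narrow string ok: $down_ok\n";
print "downgrade of wide string ok:   $wide_ok\n";
```

Upgrading and downgrading change only the internal representation; the
string compares equal and has the same length throughout.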

=head2 Using Unicode in XS

See L<perlguts/"Unicode Support"> for an introduction to Unicode at
the XS level, and L<perlapi/Unicode Support> for the API details.

=head2 Hacking Perl to work on earlier Unicode versions (for very serious hackers only)

Perl by default comes with the latest supported Unicode version built-in, but
the goal is to allow you to change to use any earlier one.  In Perls
v5.20 and v5.22, however, the earliest usable version is Unicode 5.1.
Perl v5.18 and v5.24 are able to handle all earlier versions.

Download the files in the desired version of Unicode from the Unicode web
site (L<http://www.unicode.org>).  These should replace the existing files in
F<lib/unicore> in the Perl source tree.  Follow the instructions in
F<README.perl> in that directory to change some of their names, and then build
perl (see L<INSTALL>).

=head2 Porting code from perl-5.6.X

Perls starting in 5.8 have a different Unicode model from 5.6. In 5.6 the
programmer was required to use the C<utf8> pragma to declare that a
given scope expected to deal with Unicode data and had to make sure that
only Unicode data were reaching that scope. If you have code that is
working with 5.6, you will need some of the following adjustments to
your code. The examples are written such that the code will continue to
work under 5.6, so you should be safe to try them out.

=over 3

=item *

A filehandle that should read or write UTF-8

  if ($] > 5.008) {
    binmode $fh, ":encoding(UTF-8)";
  }

=item *

A scalar that is going to be passed to some extension

Be it C<Compress::Zlib>, C<Apache::Request> or any extension that has no
mention of Unicode in the manpage, you need to make sure that the
UTF8 flag is stripped off. Note that at the time of this writing
(January 2012) the mentioned modules are not UTF-8-aware. Please
check the documentation to verify if this is still true.

  if ($] > 5.008) {
    require Encode;
    $val = Encode::encode("UTF-8", $val); # make octets
  }

=item *

A scalar we got back from an extension

If you believe the scalar comes back as UTF-8, you will most likely
want the UTF8 flag restored:

  if ($] > 5.008) {
    require Encode;
    $val = Encode::decode("UTF-8", $val);
  }

=item *

Same thing, if you are really sure it is UTF-8

  if ($] > 5.008) {
    require Encode;
    Encode::_utf8_on($val);
  }

=item *

A wrapper for L<DBI> C<fetchrow_array> and C<fetchrow_hashref>

When the database contains only UTF-8, a wrapper function or method is
a convenient way to replace all your C<fetchrow_array> and
C<fetchrow_hashref> calls. A wrapper function will also make it easier to
adapt to future enhancements in your database driver. Note that at the
time of this writing (January 2012), the DBI has no standardized way
to deal with UTF-8 data. Please check the L<DBI documentation|DBI> to verify if
that is still true.

  sub fetchrow {
    # $what is one of fetchrow_{array,hashref}
    my($self, $sth, $what) = @_;
    if ($] < 5.008) {
      return $sth->$what;
    } else {
      require Encode;
      if (wantarray) {
        my @arr = $sth->$what;
        for (@arr) {
          defined && /[^\000-\177]/ && Encode::_utf8_on($_);
        }
        return @arr;
      } else {
        my $ret = $sth->$what;
        if (ref $ret) {
          for my $k (keys %$ret) {
            defined
            && /[^\000-\177]/
            && Encode::_utf8_on($_) for $ret->{$k};
          }
          return $ret;
        } else {
          defined && /[^\000-\177]/ && Encode::_utf8_on($_) for $ret;
          return $ret;
        }
      }
    }
  }


=item *

A large scalar that you know can only contain ASCII

Scalars that contain only ASCII and are marked as UTF-8 are sometimes
a drag to your program. If you recognize such a situation, just remove
the UTF8 flag:

  utf8::downgrade($val) if $] > 5.008;

=back

=head1 BUGS

See also L</The "Unicode Bug"> above.

=head2 Interaction with Extensions

When Perl exchanges data with an extension, the extension should be
able to understand the UTF8 flag and act accordingly. If the
extension doesn't recognize that flag, it's likely that the extension
will return incorrectly-flagged data.

So if you're working with Unicode data, consult the documentation of
every module you're using to see whether there are any issues with
Unicode data exchange. If the documentation does not talk about Unicode at all,
suspect the worst and probably look at the source to learn how the
module is implemented. Modules written completely in Perl shouldn't
cause problems. Modules that directly or indirectly access code written
in other programming languages are at risk.

For affected functions, the simple strategy to avoid data corruption is
to always make the encoding of the exchanged data explicit. Choose an
encoding that you know the extension can handle. Convert arguments passed
to the extensions to that encoding and convert results back from that
encoding. Write wrapper functions that do the conversions for you, so
you can later change the functions when the extension catches up.

To provide an example, let's say the popular C<Foo::Bar::escape_html>
function doesn't deal with Unicode data yet. The wrapper function
would convert the argument to raw UTF-8 and convert the result back to
Perl's internal representation like so:

    sub my_escape_html ($) {
        my($what) = shift;
        return unless defined $what;
        Encode::decode("UTF-8", Foo::Bar::escape_html(
                                     Encode::encode("UTF-8", $what)));
    }
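
Since C<Foo::Bar::escape_html> is only a placeholder, here is a
self-contained variant of the same wrapper that can be run as is.  The
function C<legacy_escape_html> is a hypothetical stand-in for a
byte-oriented extension; everything else follows the pattern above.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical stand-in for an extension that only understands octets.
sub legacy_escape_html {
    my ($octets) = @_;
    my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;');
    $octets =~ s/([&<>])/$entity{$1}/g;
    return $octets;
}

# Wrapper: characters in, characters out, octets in between.
sub my_escape_html {
    my ($what) = @_;
    return unless defined $what;
    return decode("UTF-8", legacy_escape_html(encode("UTF-8", $what)));
}

my $escaped = my_escape_html("<\x{263A}>");
printf "length: %d\n", length $escaped;
```

The result is a character string in which the markup is escaped and the
non-ASCII character survives as a single character.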

Sometimes, when the extension does not convert data but just stores
and retrieves it, you will be able to use the otherwise
dangerous L<C<Encode::_utf8_on()>|Encode/_utf8_on> function. Let's say
the popular C<Foo::Bar> extension, written in C, provides a C<param>
method that lets you store and retrieve data according to these prototypes:

    $self->param($name, $value);            # set a scalar
    $value = $self->param($name);           # retrieve a scalar

If it does not yet provide support for any encoding, one could write a
derived class with such a C<param> method:

    sub param {
      my($self,$name,$value) = @_;
      utf8::upgrade($name);     # make sure it is UTF-8 encoded
      if (defined $value) {
        utf8::upgrade($value);  # make sure it is UTF-8 encoded
        return $self->SUPER::param($name,$value);
      } else {
        my $ret = $self->SUPER::param($name);
        Encode::_utf8_on($ret); # we know, it is UTF-8 encoded
        return $ret;
      }
    }

Some extensions provide filters on data entry/exit points, such as
C<DB_File::filter_store_key> and family. Look out for such filters in
the documentation of your extensions; they can make the transition to
Unicode data much easier.

=head2 Speed

Some functions are slower when working on UTF-8 encoded strings than
on byte encoded strings.  All functions that need to hop over
characters such as C<length()>, C<substr()> or C<index()>, or matching
regular expressions can work B<much> faster when the underlying data are
byte-encoded.

In Perl 5.8.0 the slowness was often quite spectacular; in Perl 5.8.1
a caching scheme was introduced which improved the situation.  In general,
operations with UTF-8 encoded strings are still slower. As an example,
the Unicode properties (character classes) like C<\p{Nd}> are known to
be quite a bit slower (5-20 times) than their simpler counterparts
like C<[0-9]> (then again, there are hundreds of Unicode characters matching
C<Nd> compared with the 10 ASCII characters matching C<[0-9]>).

=head1 SEE ALSO

L<perlunitut>, L<perluniintro>, L<perluniprops>, L<Encode>, L<open>, L<utf8>, L<bytes>,
L<perlretut>, L<perlvar/"${^UNICODE}">,
L<http://www.unicode.org/reports/tr44>.

=cut
=head1 NAME

perlreftut - Mark's very short tutorial about references

=head1 DESCRIPTION

One of the most important new features in Perl 5 was the capability to
manage complicated data structures like multidimensional arrays and
nested hashes.  To enable these, Perl 5 introduced a feature called
I<references>, and using references is the key to managing complicated,
structured data in Perl.  Unfortunately, there's a lot of funny syntax
to learn, and the main manual page can be hard to follow.  The manual
is quite complete, and sometimes people find that a problem, because
it can be hard to tell what is important and what isn't.

Fortunately, you only need to know 10% of what's in the main page to get
90% of the benefit.  This page will show you that 10%.

=head1 Who Needs Complicated Data Structures?

One problem that comes up all the time is needing a hash whose values are
lists.  Perl has hashes, of course, but the values have to be scalars;
they can't be lists.

Why would you want a hash of lists?  Let's take a simple example: You
have a file of city and country names, like this:

	Chicago, USA
	Frankfurt, Germany
	Berlin, Germany
	Washington, USA
	Helsinki, Finland
	New York, USA

and you want to produce an output like this, with each country mentioned
once, and then an alphabetical list of the cities in that country:

	Finland: Helsinki.
	Germany: Berlin, Frankfurt.
	USA: Chicago, New York, Washington.

The natural way to do this is to have a hash whose keys are country
names.  Associated with each country name key is a list of the cities in
that country.  Each time you read a line of input, split it into a country
and a city, look up the list of cities already known to be in that
country, and append the new city to the list.  When you're done reading
the input, iterate over the hash as usual, sorting each list of cities
before you print it out.

If hash values can't be lists, you lose.  You'd probably have to
combine all the cities into a single string somehow, and then when
time came to write the output, you'd have to break the string into a
list, sort the list, and turn it back into a string.  This is messy
and error-prone.  And it's frustrating, because Perl already has
perfectly good lists that would solve the problem if only you could
use them.

=head1 The Solution

By the time Perl 5 rolled around, we were already stuck with this
design: Hash values must be scalars.  The solution to this is
references.

A reference is a scalar value that I<refers to> an entire array or an
entire hash (or to just about anything else).  Names are one kind of
reference that you're already familiar with.  Think of the President
of the United States: a messy, inconvenient bag of blood and bones.
But to talk about him, or to represent him in a computer program, all
you need is the easy, convenient scalar string "Barack Obama".

References in Perl are like names for arrays and hashes.  They're
Perl's private, internal names, so you can be sure they're
unambiguous.  Unlike "Barack Obama", a reference only refers to one
thing, and you always know what it refers to.  If you have a reference
to an array, you can recover the entire array from it.  If you have a
reference to a hash, you can recover the entire hash.  But the
reference is still an easy, compact scalar value.

You can't have a hash whose values are arrays; hash values can only be
scalars.  We're stuck with that.  But a single reference can refer to
an entire array, and references are scalars, so you can have a hash of
references to arrays, and it'll act a lot like a hash of arrays, and
it'll be just as useful as a hash of arrays.

We'll come back to this city-country problem later, after we've seen
some syntax for managing references.


=head1 Syntax

There are just two ways to make a reference, and just two ways to use
it once you have it.

=head2 Making References

=head3 B<Make Rule 1>

If you put a C<\> in front of a variable, you get a
reference to that variable.

    $aref = \@array;         # $aref now holds a reference to @array
    $href = \%hash;          # $href now holds a reference to %hash
    $sref = \$scalar;        # $sref now holds a reference to $scalar

Once the reference is stored in a variable like $aref or $href, you
can copy it or store it just the same as any other scalar value:

    $xy = $aref;             # $xy now holds a reference to @array
    $p[3] = $href;           # $p[3] now holds a reference to %hash
    $z = $p[3];              # $z now holds a reference to %hash


These examples show how to make references to variables with names.
Sometimes you want to make an array or a hash that doesn't have a
name.  This is analogous to the way you like to be able to use the
string C<"\n"> or the number 80 without having to store it in a named
variable first.

=head3 B<Make Rule 2>

C<[ ITEMS ]> makes a new, anonymous array, and returns a reference to
that array.  C<{ ITEMS }> makes a new, anonymous hash, and returns a
reference to that hash.

    $aref = [ 1, "foo", undef, 13 ];
    # $aref now holds a reference to an array

    $href = { APR => 4, AUG => 8 };
    # $href now holds a reference to a hash


The references you get from rule 2 are the same kind of
references that you get from rule 1:

	# This:
	$aref = [ 1, 2, 3 ];

	# Does the same as this:
	@array = (1, 2, 3);
	$aref = \@array;


The first line is an abbreviation for the following two lines, except
that it doesn't create the superfluous array variable C<@array>.

If you write just C<[]>, you get a new, empty anonymous array.
If you write just C<{}>, you get a new, empty anonymous hash.
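
Here is a small sketch of what B<Make Rule 2> gives you.  It uses the
built-in C<ref> function to ask what kind of thing a reference points
to, and numeric comparison to ask whether two references point to the
same thing; the variable names are arbitrary.

```perl
use strict;
use warnings;

my $first  = [];                        # a new, empty anonymous array
my $second = [];                        # another one, distinct from the first
my $href   = { APR => 4, AUG => 8 };    # a new anonymous hash

# ref() names the kind of thing a reference refers to.
my $kinds = join ",", ref($first), ref($href);

# Each [] makes a fresh array, so the two references differ ...
my $distinct = ($first != $second) ? 1 : 0;

# ... but copying a reference copies the scalar, not the array.
my $copy = $first;
my $same = ($copy == $first) ? 1 : 0;

print "kinds: $kinds\n";
print "distinct arrays: $distinct, copy refers to same array: $same\n";
```

This prints the kinds C<ARRAY> and C<HASH>, and shows that two uses of
C<[]> never yield the same array.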


=head2 Using References

What can you do with a reference once you have it?  It's a scalar
value, and we've seen that you can store it as a scalar and get it back
again just like any scalar.  There are just two more ways to use it:

=head3 B<Use Rule 1>

You can always use an array reference, in curly braces, in place of
the name of an array.  For example, C<@{$aref}> instead of C<@array>.

Here are some examples of that:

Arrays:


	@a		@{$aref}		An array
	reverse @a	reverse @{$aref}	Reverse the array
	$a[3]		${$aref}[3]		An element of the array
	$a[3] = 17	${$aref}[3] = 17	Assigning an element


On each line are two expressions that do the same thing.  The
left-hand versions operate on the array C<@a>.  The right-hand
versions operate on the array that is referred to by C<$aref>.  Once
they find the array they're operating on, both versions do the same
things to the arrays.

Using a hash reference is I<exactly> the same:

	%h		%{$href}	      A hash
	keys %h		keys %{$href}	      Get the keys from the hash
	$h{'red'}	${$href}{'red'}	      An element of the hash
	$h{'red'} = 17	${$href}{'red'} = 17  Assigning an element

Whatever you want to do with a reference, B<Use Rule 1> tells you how
to do it.  You just write the Perl code that you would have written
for doing the same thing to a regular array or hash, and then replace
the array or hash name with C<{$reference}>.  "How do I loop over an
array when all I have is a reference?"  Well, to loop over an array, you
would write

        for my $element (@array) {
          ...
        }

so replace the array name, C<@array>, with the reference:

        for my $element (@{$aref}) {
          ...
        }

"How do I print out the contents of a hash when all I have is a
reference?"  First write the code for printing out a hash:

        for my $key (keys %hash) {
          print "$key => $hash{$key}\n";
        }

And then replace the hash name with the reference:

        for my $key (keys %{$href}) {
          print "$key => ${$href}{$key}\n";
        }

=head3 B<Use Rule 2>

L<B<Use Rule 1>|/B<Use Rule 1>> is all you really need, because it tells
you how to do absolutely everything you ever need to do with references.
But the most common thing to do with an array or a hash is to extract a
single element, and the L<B<Use Rule 1>|/B<Use Rule 1>> notation is
cumbersome.  So there is an abbreviation.

C<${$aref}[3]> is too hard to read, so you can write C<< $aref->[3] >>
instead.

C<${$href}{red}> is too hard to read, so you can write
C<< $href->{red} >> instead.

If C<$aref> holds a reference to an array, then C<< $aref->[3] >> is
the fourth element of the array.  Don't confuse this with C<$aref[3]>,
which is the fourth element of a totally different array, one
deceptively named C<@aref>.  C<$aref> and C<@aref> are unrelated the
same way that C<$item> and C<@item> are.

Similarly, C<< $href->{'red'} >> is part of the hash referred to by
the scalar variable C<$href>, perhaps even one with no name.
C<$href{'red'}> is part of the deceptively named C<%href> hash.  It's
easy to leave out the C<< -> >> by accident, and if you do, you'll get
bizarre results when your program gets array and hash elements out of
totally unexpected hashes and arrays that weren't the ones you wanted
to use.
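
Here is a small sketch of that confusion.  It deliberately declares
both an array C<@aref> and a scalar C<$aref> (names chosen only to
mirror the text above) so that the two subscript forms can be compared
side by side:

```perl
use strict;
use warnings;

my @aref = ('zero', 'one', 'two', 'three');   # an array named @aref
my $aref = [ 'a', 'b', 'c', 'd' ];            # a scalar holding a reference

my $from_reference = $aref->[3];   # fourth element of the anonymous array
my $from_array     = $aref[3];     # fourth element of @aref

print "$from_reference $from_array\n";
```

If no C<@aref> had been declared, C<use strict> would reject
C<$aref[3]> at compile time instead of quietly reading from the wrong
array, which is one more reason to enable it.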


=head2 An Example

Let's see a quick example of how all this is useful.

First, remember that C<[1, 2, 3]> makes an anonymous array containing
C<(1, 2, 3)>, and gives you a reference to that array.

Now think about

	@a = ( [1, 2, 3],
               [4, 5, 6],
	       [7, 8, 9]
             );

C<@a> is an array with three elements, and each one is a reference to
another array.

C<$a[1]> is one of these references.  It refers to an array, the array
containing C<(4, 5, 6)>, and because it is a reference to an array,
L<B<Use Rule 2>|/B<Use Rule 2>> says that we can write C<< $a[1]->[2] >>
to get the third element from that array.  C<< $a[1]->[2] >> is the 6.
Similarly, C<< $a[0]->[1] >> is the 2.  What we have here is like a
two-dimensional array; you can write C<< $a[ROW]->[COLUMN] >> to get or
set the element in any row and any column of the array.

The notation still looks a little cumbersome, so there's one more
abbreviation:

=head2 Arrow Rule

In between two B<subscripts>, the arrow is optional.

Instead of C<< $a[1]->[2] >>, we can write C<$a[1][2]>; it means the
same thing.  Instead of C<< $a[0]->[1] = 23 >>, we can write
C<$a[0][1] = 23>; it means the same thing.

Now it really looks like two-dimensional arrays!

You can see why the arrows are important.  Without them, we would have
had to write C<${$a[1]}[2]> instead of C<$a[1][2]>.  For
three-dimensional arrays, they let us write C<$x[2][3][5]> instead of
the unreadable C<${${$x[2]}[3]}[5]>.
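
The following sketch reads the same element of the two-dimensional
array in each of the three notations discussed above.  The extra
variable C<$grid> is an assumption added for illustration; it shows
that the arrow between a reference variable and its first subscript is
still required, because the B<Arrow Rule> only covers arrows between
two subscripts:

```perl
use strict;
use warnings;

my @a = ( [1, 2, 3],
          [4, 5, 6],
          [7, 8, 9],
        );

# Three spellings of the same element (row 1, column 2).
my $long  = ${$a[1]}[2];
my $arrow = $a[1]->[2];
my $short = $a[1][2];

# With a reference to the outer array, only the arrows between
# subscripts may be dropped; the first one stays.
my $grid    = \@a;
my $via_ref = $grid->[1][2];

$a[0][1] = 23;    # assignment works through the short form too

print "$long $arrow $short $via_ref $a[0][1]\n";
```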

=head1 Solution

Here's the answer to the problem I posed earlier, of reformatting a
file of city and country names.

    1   my %table;

    2   while (<>) {
    3     chomp;
    4     my ($city, $country) = split /, /;
    5     $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

    8   for my $country (sort keys %table) {
    9     print "$country: ";
   10     my @cities = @{$table{$country}};
   11     print join ', ', sort @cities;
   12     print ".\n";
   13	}


The program has two pieces: Lines 2-7 read the input and build a data
structure, and lines 8-13 analyze the data and print out the report.
We're going to have a hash, C<%table>, whose keys are country names,
and whose values are references to arrays of city names.  The data
structure will look like this:


           %table
        +-------+---+
        |       |   |   +-----------+--------+
        |Germany| *---->| Frankfurt | Berlin |
        |       |   |   +-----------+--------+
        +-------+---+
        |       |   |   +----------+
        |Finland| *---->| Helsinki |
        |       |   |   +----------+
        +-------+---+
        |       |   |   +---------+------------+----------+
        |  USA  | *---->| Chicago | Washington | New York |
        |       |   |   +---------+------------+----------+
        +-------+---+

We'll look at output first.  Supposing we already have this structure,
how do we print it out?

    8   for my $country (sort keys %table) {
    9     print "$country: ";
   10     my @cities = @{$table{$country}};
   11     print join ', ', sort @cities;
   12     print ".\n";
   13	}

C<%table> is an ordinary hash, and we get a list of keys from it, sort
the keys, and loop over the keys as usual.  The only use of references
is in line 10.  C<$table{$country}> looks up the key C<$country> in the
hash and gets the value, which is a reference to an array of cities in
that country.  L<B<Use Rule 1>|/B<Use Rule 1>> says that we can recover
the array by saying C<@{$table{$country}}>.  Line 10 is just like

	@cities = @array;

except that the name C<array> has been replaced by the reference
C<{$table{$country}}>.  The C<@> tells Perl to get the entire array.
Having gotten the list of cities, we sort it, join it, and print it
out as usual.

Lines 2-7 are responsible for building the structure in the first
place.  Here they are again:

    2   while (<>) {
    3     chomp;
    4     my ($city, $country) = split /, /;
    5     $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

Lines 2-4 acquire a city and country name.  Line 5 looks to see if the
country is already present as a key in the hash.  If it's not, the
program uses the C<[]> notation (L<B<Make Rule 2>|/B<Make Rule 2>>) to
manufacture a new, empty anonymous array of cities, and installs a
reference to it into the hash under the appropriate key.

Line 6 installs the city name into the appropriate array.
C<$table{$country}> now holds a reference to the array of cities seen
in that country so far.  Line 6 is exactly like

	push @array, $city;

except that the name C<array> has been replaced by the reference
C<{$table{$country}}>.  The L<C<push>|perlfunc/push ARRAY,LIST> adds a
city name to the end of the referred-to array.

There's one fine point I skipped.  Line 5 is unnecessary, and we can
get rid of it.

    2   while (<>) {
    3     chomp;
    4     my ($city, $country) = split /, /;
    5   ####  $table{$country} = [] unless exists $table{$country};
    6     push @{$table{$country}}, $city;
    7   }

If there's already an entry in C<%table> for the current C<$country>,
then nothing is different.  Line 6 will locate the value in
C<$table{$country}>, which is a reference to an array, and push C<$city>
into the array.  But what does it do when C<$country> holds a key, say
C<Greece>, that is not yet in C<%table>?

This is Perl, so it does the exact right thing.  It sees that you want
to push C<Athens> onto an array that doesn't exist, so it helpfully
makes a new, empty, anonymous array for you, installs it into
C<%table>, and then pushes C<Athens> onto it.  This is called
I<autovivification>--bringing things to life automatically.  Perl saw
that the key wasn't in the hash, so it created a new hash entry
automatically. Perl saw that you wanted to use the hash value as an
array, so it created a new empty array and installed a reference to it
in the hash automatically.  And as usual, Perl made the array one
element longer to hold the new city name.
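
Autovivification can be seen in isolation with a sketch like the one
below.  The second city name is an illustrative addition, not part of
the original example:

```perl
use strict;
use warnings;

my %table;

# No entry for Greece yet.
my $before = exists $table{Greece} ? 1 : 0;

# push treats $table{Greece} as an array reference, so Perl creates
# the hash entry and an empty anonymous array before pushing.
push @{ $table{Greece} }, 'Athens';
push @{ $table{Greece} }, 'Thessaloniki';

print "before: $before\n";
print "Greece: ", join(', ', @{ $table{Greece} }), "\n";
```

The second C<push> finds the array that the first one brought into
existence, so both cities end up in the same list.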

=head1 The Rest

I promised to give you 90% of the benefit with 10% of the details, and
that means I left out 90% of the details.  Now that you have an
overview of the important parts, it should be easier to read the
L<perlref> manual page, which discusses 100% of the details.

Some of the highlights of L<perlref>:

=over 4

=item *

You can make references to anything, including scalars, functions, and
other references.

=item *

In L<B<Use Rule 1>|/B<Use Rule 1>>, you can omit the curly brackets
whenever the thing inside them is an atomic scalar variable like
C<$aref>.  For example, C<@$aref> is the same as C<@{$aref}>, and
C<$$aref[1]> is the same as C<${$aref}[1]>.  If you're just starting
out, you may want to adopt the habit of always including the curly
brackets.

=item *

This doesn't copy the underlying array:

        $aref2 = $aref1;

You get two references to the same array.  If you modify
C<< $aref1->[23] >> and then look at
C<< $aref2->[23] >> you'll see the change.

To copy the array, use

        $aref2 = [@{$aref1}];

This uses C<[...]> notation to create a new anonymous array, and
C<$aref2> is assigned a reference to the new array.  The new array is
initialized with the contents of the array referred to by C<$aref1>.

Similarly, to copy an anonymous hash, you can use

        $href2 = {%{$href1}};

=item *

To see if a variable contains a reference, use the
L<C<ref>|perlfunc/ref EXPR> function.  It returns true if its argument
is a reference.  Actually it's a little better than that: It returns
C<HASH> for hash references and C<ARRAY> for array references.

=item *

If you try to use a reference like a string, you get strings like

	ARRAY(0x80f5dec)   or    HASH(0x826afc0)

If you ever see a string that looks like this, you'll know you
printed out a reference by mistake.

A side effect of this representation is that you can use
L<C<eq>|perlop/Equality Operators> to see if two references refer to the
same thing.  (But you should usually use
L<C<==>|perlop/Equality Operators> instead because it's much faster.)

=item *

You can use a string as if it were a reference.  If you use the string
C<"foo"> as an array reference, it's taken to be a reference to the
array C<@foo>.  This is called a I<symbolic reference>.  The declaration
L<C<use strict 'refs'>|strict> disables this feature, which can cause
all sorts of trouble if you use it by accident.

=back
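
Several of the points above can be checked with a short sketch.  The
variable names C<$alias> and C<$copy> are illustrative; the code
contrasts assigning a reference with copying the array it refers to,
compares references with C<==>, and calls C<ref> on both a reference
and an ordinary string:

```perl
use strict;
use warnings;

my $aref1 = [ 1, 2, 3 ];
my $alias = $aref1;           # second reference to the same array
my $copy  = [ @{$aref1} ];    # new array with the same contents

$aref1->[0] = 'changed';

my $alias_first = $alias->[0];    # sees the change
my $copy_first  = $copy->[0];     # still the original value

# == compares where the references point, not the array contents.
my $alias_same = ($alias == $aref1) ? 1 : 0;
my $copy_same  = ($copy  == $aref1) ? 1 : 0;

my $kind_of_ref    = ref($aref1);             # 'ARRAY'
my $kind_of_string = ref('plain string');     # empty string: not a reference

print "$alias_first $copy_first $alias_same $copy_same $kind_of_ref\n";
```

Note that C<[@{$aref1}]> copies only the top level.  If the array
itself contains references, the copy shares whatever they refer to.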

You might prefer to go on to L<perllol> instead of L<perlref>; it
discusses lists of lists and multidimensional arrays in detail.  After
that, you should move on to L<perldsc>; it's a Data Structure Cookbook
that shows recipes for using and printing out arrays of hashes, hashes
of arrays, and other kinds of data.

=head1 Summary

Everyone needs compound data structures, and in Perl the way you get
them is with references.  There are four important rules for managing
references: Two for making references and two for using them.  Once
you know these rules you can do most of the important things you need
to do with references.

=head1 Credits

Author: Mark Jason Dominus, Plover Systems (C<mjd-perl-ref+@plover.com>)

This article originally appeared in I<The Perl Journal>
( L<http://www.tpj.com/> ) volume 3, #2.  Reprinted with permission.

The original title was I<Understand References Today>.

=head2 Distribution Conditions

Copyright 1998 The Perl Journal.

This documentation is free; you can redistribute it and/or modify it
under the same terms as Perl itself.

Irrespective of its distribution, all code examples in these files are
hereby placed into the public domain.  You are permitted and
encouraged to use this code in your own programs for fun or for profit
as you see fit.  A simple comment in the code giving credit would be
courteous but is not required.




=cut

=head1 NAME

perlartistic - the Perl Artistic License

=head1 SYNOPSIS

 You can refer to this document in Pod via "L<perlartistic>"
 Or you can see this document by entering "perldoc perlartistic"

=head1 DESCRIPTION

Perl is free software; you can redistribute it and/or modify
it under the terms of either:

        a) the GNU General Public License as published by the Free
        Software Foundation; either version 1, or (at your option) any
        later version, or

        b) the "Artistic License" which comes with this Kit.

This is B<"The Artistic License">.
It's here so that modules, programs, etc., that want to declare
this as their distribution license can link to it.

For the GNU General Public License, see L<perlgpl>.

=head1 The "Artistic License"

=head2 Preamble

The intent of this document is to state the conditions under which a
Package may be copied, such that the Copyright Holder maintains some
semblance of artistic control over the development of the package,
while giving the users of the package the right to use and distribute
the Package in a more-or-less customary fashion, plus the right to make
reasonable modifications.

=head2 Definitions

=over

=item "Package"

refers to the collection of files distributed by the
Copyright Holder, and derivatives of that collection of files created
through textual modification.

=item "Standard Version"

refers to such a Package if it has not been
modified, or has been modified in accordance with the wishes of the
Copyright Holder as specified below.

=item "Copyright Holder"

is whoever is named in the copyright or
copyrights for the package.

=item "You"

is you, if you're thinking about copying or distributing this Package.

=item "Reasonable copying fee"

is whatever you can justify on the basis
of media cost, duplication charges, time of people involved, and so on.
(You will not be required to justify it to the Copyright Holder, but
only to the computing community at large as a market that must bear the
fee.)

=item "Freely Available"

means that no fee is charged for the item
itself, though there may be fees involved in handling the item. It also
means that recipients of the item may redistribute it under the same
conditions they received it.

=back

=head2 Conditions

=over

=item 1.

You may make and give away verbatim copies of the source form of the
Standard Version of this Package without restriction, provided that you
duplicate all of the original copyright notices and associated disclaimers.

=item 2.

You may apply bug fixes, portability fixes and other modifications
derived from the Public Domain or from the Copyright Holder.  A Package
modified in such a way shall still be considered the Standard Version.

=item 3.

You may otherwise modify your copy of this Package in any way, provided
that you insert a prominent notice in each changed file stating how and
when you changed that file, and provided that you do at least ONE of the
following:

=over

=item a)

place your modifications in the Public Domain or otherwise make them
Freely Available, such as by posting said modifications to Usenet or an
equivalent medium, or placing the modifications on a major archive site
such as uunet.uu.net, or by allowing the Copyright Holder to include
your modifications in the Standard Version of the Package.

=item b)

use the modified Package only within your corporation or organization.

=item c)

rename any non-standard executables so the names do not conflict with
standard executables, which must also be provided, and provide a
separate manual page for each non-standard executable that clearly
documents how it differs from the Standard Version.

=item d)

make other distribution arrangements with the Copyright Holder.

=back

=item 4.

You may distribute the programs of this Package in object code or
executable form, provided that you do at least ONE of the following:

=over

=item a)

distribute a Standard Version of the executables and library files,
together with instructions (in the manual page or equivalent) on where
to get the Standard Version.

=item b)

accompany the distribution with the machine-readable source of the
Package with your modifications.

=item c)

give non-standard executables non-standard names, and clearly
document the differences in manual pages (or equivalent), together with
instructions on where to get the Standard Version.

=item d)

make other distribution arrangements with the Copyright Holder.

=back

=item 5.

You may charge a reasonable copying fee for any distribution of this
Package.  You may charge any fee you choose for support of this
Package.  You may not charge a fee for this Package itself.  However,
you may distribute this Package in aggregate with other (possibly
commercial) programs as part of a larger (possibly commercial) software
distribution provided that you do not advertise this Package as a
product of your own.  You may embed this Package's interpreter within
an executable of yours (by linking); this shall be construed as a mere
form of aggregation, provided that the complete Standard Version of the
interpreter is so embedded.

=item 6.

The scripts and library files supplied as input to or produced as
output from the programs of this Package do not automatically fall
under the copyright of this Package, but belong to whoever generated
them, and may be sold commercially, and may be aggregated with this
Package.  If such scripts or library files are aggregated with this
Package via the so-called "undump" or "unexec" methods of producing a
binary executable image, then distribution of such an image shall
neither be construed as a distribution of this Package nor shall it
fall under the restrictions of Paragraphs 3 and 4, provided that you do
not represent such an executable image as a Standard Version of this
Package.

=item 7.

C subroutines (or comparably compiled subroutines in other
languages) supplied by you and linked into this Package in order to
emulate subroutines and variables of the language defined by this
Package shall not be considered part of this Package, but are the
equivalent of input as in Paragraph 6, provided these subroutines do
not change the language in any way that would cause it to fail the
regression tests for the language.

=item 8.

Aggregation of this Package with a commercial distribution is always
permitted provided that the use of this Package is embedded; that is,
when no overt attempt is made to make this Package's interfaces visible
to the end user of the commercial distribution.  Such use shall not be
construed as a distribution of this Package.

=item 9.

The name of the Copyright Holder may not be used to endorse or promote
products derived from this software without specific prior written permission.


=item 10.

THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.

=back

The End

=cut


perlvms.pod000064400000143207150344123430006746 0ustar00=head1 NAME

perlvms - VMS-specific documentation for Perl

=head1 DESCRIPTION

Gathered below are notes describing details of Perl 5's 
behavior on VMS.  They are a supplement to the regular Perl 5 
documentation, so we have focussed on the ways in which Perl 
5 functions differently under VMS than it does under Unix, 
and on the interactions between Perl and the rest of the 
operating system.  We haven't tried to duplicate complete 
descriptions of Perl features from the main Perl 
documentation, which can be found in the F<[.pod]> 
subdirectory of the Perl distribution.

We hope these notes will save you from confusion and lost 
sleep when writing Perl scripts on VMS.  If you find we've 
missed something you think should appear here, please don't 
hesitate to drop a line to vmsperl@perl.org.

=head1 Installation

Directions for building and installing Perl 5 can be found in 
the file F<README.vms> in the main source directory of the 
Perl distribution.

=head1 Organization of Perl Images

=head2 Core Images

During the build process, three Perl images are produced.
F<Miniperl.Exe> is an executable image which contains all of
the basic functionality of Perl, but cannot take advantage of
Perl XS extensions and has a hard-wired list of library locations
for loading pure-Perl modules.  It is used extensively to build and
test Perl and various extensions, but is not installed.

Most of the complete Perl resides in the shareable image F<PerlShr.Exe>,
which provides a core to which the Perl executable image and all Perl
extensions are linked. It is generally located via the logical name
F<PERLSHR>.  While it's possible to put the image in F<SYS$SHARE> to
make it loadable, that's not recommended. And while you may wish to
INSTALL the image for performance reasons, you should not install it
with privileges; if you do, the result will not be what you expect as
image privileges are disabled during Perl start-up.

Finally, F<Perl.Exe> is an executable image containing the main
entry point for Perl, as well as some initialization code.  It
should be placed in a public directory, and made world executable.
In order to run Perl with command line arguments, you should
define a foreign command to invoke this image.

=head2 Perl Extensions

Perl extensions are packages which provide both XS and Perl code
to add new functionality to perl.  (XS is a meta-language which
simplifies writing C code which interacts with Perl; see
L<perlxs> for more details.)  The Perl code for an
extension is treated like any other library module - it's
made available in your script through the appropriate
C<use> or C<require> statement, and usually defines a Perl
package containing the extension.

The portion of the extension provided by the XS code may be
connected to the rest of Perl in either of two ways.  In the
B<static> configuration, the object code for the extension is
linked directly into F<PerlShr.Exe>, and is initialized whenever
Perl is invoked.  In the B<dynamic> configuration, the extension's
machine code is placed into a separate shareable image, which is
mapped by Perl's DynaLoader when the extension is C<use>d or
C<require>d in your script.  This allows you to maintain the
extension as a separate entity, at the cost of keeping track of the
additional shareable image.  Most extensions can be set up as either
static or dynamic.

The source code for an extension usually resides in its own
directory.  At least three files are generally provided:
I<Extshortname>F<.xs> (where I<Extshortname> is the portion of
the extension's name following the last C<::>), containing
the XS code, I<Extshortname>F<.pm>, the Perl library module
for the extension, and F<Makefile.PL>, a Perl script which uses
the C<MakeMaker> library modules supplied with Perl to generate
a F<Descrip.MMS> file for the extension.

=head2 Installing static extensions

Since static extensions are incorporated directly into
F<PerlShr.Exe>, you'll have to rebuild Perl to incorporate a
new extension.  You should edit the main F<Descrip.MMS> or F<Makefile>
you use to build Perl, adding the extension's name to the C<ext>
macro, and the extension's object file to the C<extobj> macro.
You'll also need to build the extension's object file, either
by adding dependencies to the main F<Descrip.MMS>, or using a
separate F<Descrip.MMS> for the extension.  Then, rebuild
F<PerlShr.Exe> to incorporate the new code.

Finally, you'll need to copy the extension's Perl library
module to the F<[.>I<Extname>F<]> subdirectory under one
of the directories in C<@INC>, where I<Extname> is the name
of the extension, with all C<::> replaced by C<.> (e.g.
the library module for extension Foo::Bar would be copied
to a F<[.Foo.Bar]> subdirectory).

=head2 Installing dynamic extensions

In general, the distributed kit for a Perl extension includes
a file named Makefile.PL, which is a Perl program which is used
to create a F<Descrip.MMS> file which can be used to build and
install the files required by the extension.  The kit should be
unpacked into a directory tree B<not> under the main Perl source
directory, and the procedure for building the extension is simply

    $ perl Makefile.PL  ! Create Descrip.MMS
    $ mmk               ! Build necessary files
    $ mmk test          ! Run test code, if supplied
    $ mmk install       ! Install into public Perl tree

VMS support for this process in the current release of Perl
is sufficient to handle most extensions.  (See the MakeMaker
documentation for more details on installation options for
extensions.)

The shareable image for a dynamic extension is normally placed in
one of the following locations:

=over 4

=item *

the F<[.Lib.Auto.>I<Arch>I<$PVers>I<Extname>F<]> subdirectory
of one of the directories in C<@INC> (where I<PVers>
is the version of Perl you're using, as supplied in C<$]>,
with '.' converted to '_'), or

=item *

one of the directories in C<@INC>, or

=item *

a directory which the extension's Perl library module
passes to the DynaLoader when asking it to map
the shareable image, or

=item *

F<Sys$Share> or F<Sys$Library>.

=back

If the shareable image isn't in any of these places, you'll need
to define a logical name I<Extshortname>, where I<Extshortname>
is the portion of the extension's name after the last C<::>, which
translates to the full file specification of the shareable image.

=head1 File specifications

=head2 Syntax

We have tried to make Perl aware of both VMS-style and Unix-style file
specifications wherever possible.  You may use either style, or both,
on the command line and in scripts, but you may not combine the two
styles within a single file specification.  VMS Perl interprets Unix
pathnames in much the same way as the CRTL (I<e.g.> the first component
of an absolute path is read as the device name for the VMS file
specification).  There is a set of functions provided in the
C<VMS::Filespec> package for explicit interconversion between VMS and
Unix syntax; its documentation provides more details.

We've tried to minimize the dependence of Perl library
modules on Unix syntax, but you may find that some of these,
as well as some scripts written for Unix systems, will
require that you use Unix syntax, since they will assume that
'/' is the directory separator, I<etc.>  If you find instances
of this in the Perl distribution itself, please let us know,
so we can try to work around them.

Also, when working on Perl programs on VMS, if you need a file
specification in the syntax of a specific operating system, you must
either check the appropriate DECC$ feature logical or call a conversion
routine to force it into that format.

The feature logical name DECC$FILENAME_UNIX_REPORT modifies traditional
Perl behavior in the conversion of file specifications from Unix to VMS
format in order to follow the extended character handling rules now
expected by the CRTL.  Specifically, when this feature is in effect, the
C<./.../> in a Unix path is now translated to C<[.^.^.^.]> instead of
the traditional VMS C<[...]>.  To be compatible with what MakeMaker
expects, if a VMS path cannot be translated to a Unix path, it is
passed through unchanged, so C<unixify("[...]")> will return C<[...]>.

There are several ambiguous cases where a conversion routine cannot
determine whether an input filename is in Unix format or in VMS format,
since now both VMS and Unix file specifications may have characters in
them that could be mistaken for syntax delimiters of the other type. So
some pathnames simply cannot be used in a mode that allows either type
of pathname to be present.  Perl will tend to assume that an ambiguous
filename is in Unix format.

Allowing "." as a version delimiter is simply incompatible with
determining whether a pathname is in VMS format or in Unix format with
extended file syntax.  There is no way to know whether "perl-5.8.6" is a
Unix "perl-5.8.6" or a VMS "perl-5.8;6" when passing it to unixify() or
vmsify().

The DECC$FILENAME_UNIX_REPORT logical name controls how Perl interprets
filenames to the extent that Perl uses the CRTL internally for many
purposes, and attempts to follow CRTL conventions for reporting
filenames.  The DECC$FILENAME_UNIX_ONLY feature differs in that it
expects all filenames passed to the C run-time to be already in Unix
format.  This feature is not yet supported in Perl since Perl uses
traditional OpenVMS file specifications internally and in the test
harness, and it is not yet clear whether this mode will be useful or
useable.  The feature logical name DECC$POSIX_COMPLIANT_PATHNAMES is new
with the RMS Symbolic Link SDK and included with OpenVMS v8.3, but is
not yet supported in Perl.

=head2 Filename Case

Perl enables DECC$EFS_CASE_PRESERVE and DECC$ARGV_PARSE_STYLE by
default.  Note that the latter only takes effect when extended parse
is set in the process in which Perl is running.  When these features
are explicitly disabled in the environment or the CRTL does not support
them, Perl follows the traditional CRTL behavior of downcasing command-line
arguments and returning file specifications in lower case only.

I<N. B.>  It is very easy to get tripped up using a mixture of other
programs, external utilities, and Perl scripts that are in varying
states of being able to handle case preservation.  For example, a file
created by an older version of an archive utility or a build utility
such as MMK or MMS may generate a filename in all upper case even on an
ODS-5 volume.  If this filename is later retrieved by a Perl script or
module in a case preserving environment, that upper case name may not
match the mixed-case or lower-case expectations of the Perl code.  Your
best bet is to follow an all-or-nothing approach to case preservation:
either don't use it at all, or make sure your entire toolchain and
application environment support and use it.

OpenVMS Alpha v7.3-1 and later and all versions of OpenVMS I64 support
case sensitivity as a process setting (see C<SET PROCESS
/CASE_LOOKUP=SENSITIVE>). Perl does not currently support case
sensitivity on VMS, but it may in the future, so Perl programs should
use the C<< File::Spec->case_tolerant >> method to determine the state, and
not the C<$^O> variable.
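
A minimal sketch of that advice follows.  C<File::Spec> and its
C<case_tolerant> method are part of the core distribution; the helper
C<same_filename> is a hypothetical name introduced only for this
example:

```perl
use strict;
use warnings;
use File::Spec;

# Ask File::Spec instead of testing $^O for particular platforms.
my $case_tolerant = File::Spec->case_tolerant;

# Hypothetical helper: compare two filenames the way the current
# platform is reported to treat case.
sub same_filename {
    my ($left, $right) = @_;
    return $case_tolerant ? lc($left) eq lc($right) : $left eq $right;
}

print "case tolerant: ", ($case_tolerant ? "yes" : "no"), "\n";
print "README.vms and readme.VMS match\n"
    if same_filename('README.vms', 'readme.VMS');
```

Code written this way should keep working if a future Perl on VMS
starts reporting case-sensitive behavior.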

=head2 Symbolic Links

When built on an ODS-5 volume with symbolic links enabled, Perl by
default supports symbolic links when the requisite support is available
in the filesystem and CRTL (generally 64-bit OpenVMS v8.3 and later). 
There are a number of limitations and caveats to be aware of when
working with symbolic links on VMS.  Most notably, the target of a valid
symbolic link must be expressed as a Unix-style path and it must exist
on a volume visible from your POSIX root (see the C<SHOW ROOT> command
in DCL help).  For further details on symbolic link capabilities and
requirements, see chapter 12 of the CRTL manual that ships with OpenVMS
v8.3 or later.

=head2 Wildcard expansion

File specifications containing wildcards are allowed both on 
the command line and within Perl globs (e.g. C<E<lt>*.cE<gt>>).  If
the wildcard filespec uses VMS syntax, the resultant 
filespecs will follow VMS syntax; if a Unix-style filespec is 
passed in, Unix-style filespecs will be returned.
Similar to the behavior of wildcard globbing for a Unix shell,
one can escape command line wildcards with double quotation
marks C<"> around a perl program command line argument.  However,
owing to the stripping of C<"> characters carried out by the C
handling of argv you will need to escape a construct such as
this one (in a directory containing the files F<PERL.C>, F<PERL.EXE>,
F<PERL.H>, and F<PERL.OBJ>):

    $ perl -e "print join(' ',@ARGV)" perl.*
    perl.c perl.exe perl.h perl.obj

in the following triple quoted manner:

    $ perl -e "print join(' ',@ARGV)" """perl.*"""
    perl.*

For both unquoted command line arguments and calls
to C<glob()>, VMS wildcard expansion is performed. (csh-style
wildcard expansion is available if you use C<File::Glob::glob>.)
If the wildcard filespec contains a device or directory 
specification, then the resultant filespecs will also contain 
a device and directory; otherwise, device and directory 
information are removed.  VMS-style resultant filespecs will 
contain a full device and directory, while Unix-style 
resultant filespecs will contain only as much of a directory 
path as was present in the input filespec.  For example, if 
your default directory is Perl_Root:[000000], the expansion 
of C<[.t]*.*> will yield filespecs like 
"perl_root:[t]base.dir", while the expansion of C<t/*/*> will 
yield filespecs like "t/base.dir".  (This is done to match 
the behavior of glob expansion performed by Unix shells.) 

Similarly, the resultant filespec will contain the file version
only if one was present in the input filespec.


=head2 Pipes

Input and output pipes to Perl filehandles are supported; the 
"file name" is passed to lib$spawn() for asynchronous 
execution.  You should be careful to close any pipes you have 
opened in a Perl script, lest you leave any "orphaned" 
subprocesses around when Perl exits. 

You may also use backticks to invoke a DCL subprocess, whose 
output is used as the return value of the expression.  The 
string between the backticks is handled as if it were the
argument to the C<system> operator (see below).  In this case,
Perl will wait for the subprocess to complete before continuing. 

The mailbox (MBX) that perl can create to communicate with a pipe
defaults to a buffer size of 8192 on 64-bit systems, 512 on VAX.  The
default buffer size is adjustable via the logical name PERL_MBX_SIZE
provided that the value falls between 128 and the SYSGEN parameter
MAXBUF inclusive.  For example, to set the mailbox size to 32767 use
C<$ENV{'PERL_MBX_SIZE'} = 32767;> and then open and use pipe constructs. 
An alternative would be to issue the command:

    $ Define PERL_MBX_SIZE 32767

before running your wide record pipe program.  A larger value may
improve performance at the expense of the BYTLM UAF quota.
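
A minimal sketch of the pattern described above: set the mailbox size,
open a pipe, and close it explicitly so that no subprocess is left
behind.  The child command (a second copy of perl printing C<42>) is
only an assumption chosen for illustration, and the way C<$^X> is
invoked may need adjusting on your system.

    use strict;
    use warnings;

    # Request a larger mailbox before any pipe is opened.  Outside VMS
    # this is an ordinary environment variable that nothing consults.
    $ENV{'PERL_MBX_SIZE'} = 32767;

    # Hypothetical child command: another perl that prints one value.
    my $command = $^X . ' -e "print 42"';

    open(my $pipe, '-|', $command)
        or die "Cannot start '$command': $!";
    my $output = <$pipe>;

    # Closing the pipe waits for the subprocess to finish.
    close($pipe) or warn "Subprocess reported failure, status $?\n";

    print "child said: $output\n";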

=head1 PERL5LIB and PERLLIB

The PERL5LIB and PERLLIB environment elements work as documented in L<perl>,
except that the element separator is, by default, '|' instead of ':'.
However, when running under a Unix shell as determined by the logical
name C<GNV$UNIX_SHELL>, the separator will be ':' as on Unix systems. The
directory specifications may use either VMS or Unix syntax.
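
The separator rule can be sketched as below.  The helper names are
hypothetical, and the sketch models only the two VMS cases described
above; it treats every other platform as using ':', which is not true
of all non-Unix systems.

    use strict;
    use warnings;

    # Pick the PERL5LIB element separator from the OS name and from
    # whether a Unix shell (GNV$UNIX_SHELL) is in use.
    sub perl5lib_separator {
        my ($osname, $under_unix_shell) = @_;
        return '|' if $osname eq 'VMS' && !$under_unix_shell;
        return ':';
    }

    # Split a PERL5LIB-style value on the chosen separator.
    sub split_lib_path {
        my ($value, $separator) = @_;
        return split /\Q$separator\E/, $value;
    }

    my $sep  = perl5lib_separator($^O, exists $ENV{'GNV$UNIX_SHELL'});
    my @dirs = split_lib_path($ENV{'PERL5LIB'} // '', $sep);
    print "$_\n" for @dirs;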

=head1 The Perl Forked Debugger

The Perl forked debugger places the debugger commands and output in a
separate X11 terminal window so that commands and output from multiple
processes are not mixed together.

Perl on VMS supports an emulation of the forked debugger when Perl is
run on a VMS system that has X11 support installed.

To use the forked debugger, you need to have the default display set to an
X11 server and some environment variables set that Unix expects.

The forked debugger requires the environment variable C<TERM> to be C<xterm>,
and the environment variable C<DISPLAY> to exist.  C<xterm> must be in
lower case.

  $define TERM "xterm"

  $define DISPLAY "hostname:0.0"

Currently the value of C<DISPLAY> is ignored.  It is recommended that it be set
to be the hostname of the display, the server and screen in Unix notation.  In
the future the value of DISPLAY may be honored by Perl instead of using the
default display.

It may be helpful to always use the forked debugger so that script I/O is
separated from debugger I/O.  You can force the debugger to be forked by
assigning a value to the logical name C<PERLDB_PIDS> that is not a process
identification number.

  $define PERLDB_PIDS XXXX


=head1 PERL_VMS_EXCEPTION_DEBUG

Defining the logical name PERL_VMS_EXCEPTION_DEBUG as "ENABLE" causes the
VMS debugger to be invoked if a fatal exception that is not otherwise
handled is raised.  The purpose of this is to allow debugging of
internal Perl problems that would cause such a condition.

This allows the programmer to look at the execution stack and variables to
find out the cause of the exception.  As the debugger is being invoked as
the Perl interpreter is about to do a fatal exit, continuing the execution
in debug mode is usually not practical.

Starting Perl in the VMS debugger may change the program execution
profile in a way that such problems are not reproduced.

The C<kill> function can be used to test this functionality from within
a program.

In typical VMS style, only the first letter of the value of this logical
name is checked, without regard to case, and the feature is considered
enabled if that letter is "T", "1", or "E".

This logical name must be defined before Perl is started.

=head1 Command line

=head2 I/O redirection and backgrounding

Perl for VMS supports redirection of input and output on the 
command line, using a subset of Bourne shell syntax:

=over 4

=item *

C<E<lt>file> reads stdin from C<file>,

=item *

C<E<gt>file> writes stdout to C<file>,

=item *

C<E<gt>E<gt>file> appends stdout to C<file>,

=item *

C<2E<gt>file> writes stderr to C<file>,

=item *

C<2E<gt>E<gt>file> appends stderr to C<file>, and

=item *

C<< 2>&1 >> redirects stderr to stdout.

=back

In addition, output may be piped to a subprocess, using the  
character '|'.  Anything after this character on the command 
line is passed to a subprocess for execution; the subprocess 
takes the output of Perl as its input.

Finally, if the command line ends with '&', the entire 
command is run in the background as an asynchronous 
subprocess.

=head2 Command line switches

The following command line switches behave differently under
VMS than described in L<perlrun>.  Note also that in order
to pass uppercase switches to Perl, you need to enclose
them in double-quotes on the command line, since the CRTL
downcases all unquoted strings.

On newer 64-bit versions of OpenVMS, a process setting
controls whether quoting is needed to preserve the case of
command line arguments.

=over 4

=item -i

If the C<-i> switch is present but no extension for a backup
copy is given, then inplace editing creates a new version of
a file; the existing copy is not deleted.  (Note that if
an extension is given, an existing file is renamed to the backup
file, as is the case under other operating systems, so it does
not remain as a previous version under the original filename.)

=item -S

If the C<"-S"> or C<-"S"> switch is present I<and> the script
name does not contain a directory, then Perl translates the
logical name DCL$PATH as a searchlist, using each translation
as a directory in which to look for the script.  In addition,
if no file type is specified, Perl looks in each directory
for a file matching the name specified, with a blank type,
a type of F<.pl>, and a type of F<.com>, in that order.

=item -u

The C<-u> switch causes the VMS debugger to be invoked
after the Perl program is compiled, but before it has
run.  It does not create a core dump file.

=back

=head1 Perl functions

As of the time this document was last revised, the following 
Perl functions were implemented in the VMS port of Perl 
(functions marked with * are discussed in more detail below):

    file tests*, abs, alarm, atan2, backticks*, binmode*, bless,
    caller, chdir, chmod, chown, chomp, chop, chr,
    close, closedir, cos, crypt*, defined, delete, die, do, dump*, 
    each, endgrent, endpwent, eof, eval, exec*, exists, exit, exp, 
    fileno, flock, getc, getgrent*, getgrgid*, getgrnam, getlogin,
    getppid, getpwent*, getpwnam*, getpwuid*, glob, gmtime*, goto,
    grep, hex, ioctl, import, index, int, join, keys, kill*,
    last, lc, lcfirst, lchown*, length, link*, local, localtime, log,
    lstat, m//, map, mkdir, my, next, no, oct, open, opendir, ord,
    pack, pipe, pop, pos, print, printf, push, q//, qq//, qw//,
    qx//*, quotemeta, rand, read, readdir, readlink*, redo, ref,
    rename, require, reset, return, reverse, rewinddir, rindex,
    rmdir, s///, scalar, seek, seekdir, select(internal),
    select (system call)*, setgrent, setpwent, shift, sin, sleep,
    socketpair, sort, splice, split, sprintf, sqrt, srand, stat,
    study, substr, symlink*, sysread, system*, syswrite, tell,
    telldir, tie, time, times*, tr///, uc, ucfirst, umask,
    undef, unlink*, unpack, untie, unshift, use, utime*,
    values, vec, wait, waitpid*, wantarray, warn, write, y///

The following functions were not implemented in the VMS port, 
and calling them produces a fatal error (usually) or 
undefined behavior (rarely, we hope):

    chroot, dbmclose, dbmopen, fork*, getpgrp, getpriority,  
    msgctl, msgget, msgsend, msgrcv, semctl,
    semget, semop, setpgrp, setpriority, shmctl, shmget,
    shmread, shmwrite, syscall

The following functions are available on Perls compiled with DEC C
5.2 or greater and running VMS 7.0 or greater:

    truncate

The following functions are available on Perls built on VMS 7.2 or
greater:

    fcntl (without locking)

The following functions may or may not be implemented, 
depending on what type of socket support you've built into 
your copy of Perl:

    accept, bind, connect, getpeername,
    gethostbyname, getnetbyname, getprotobyname,
    getservbyname, gethostbyaddr, getnetbyaddr,
    getprotobynumber, getservbyport, gethostent,
    getnetent, getprotoent, getservent, sethostent,
    setnetent, setprotoent, setservent, endhostent,
    endnetent, endprotoent, endservent, getsockname,
    getsockopt, listen, recv, select(system call)*,
    send, setsockopt, shutdown, socket

The following function is available on Perls built on 64-bit OpenVMS v8.2
with hard links enabled on an ODS-5 formatted build disk.  CRTL support
is in principle available as of OpenVMS v7.3-1, and better configuration
support could detect this.

    link

The following functions are available on Perls built on 64-bit OpenVMS
v8.2 and later.  CRTL support is in principle available as of OpenVMS
v7.3-2, and better configuration support could detect this.

   getgrgid, getgrnam, getpwnam, getpwuid,
   setgrent, ttyname

The following functions are available on Perls built on 64-bit OpenVMS v8.2
and later.  

   statvfs, socketpair

=over 4

=item File tests

The tests C<-b>, C<-B>, C<-c>, C<-C>, C<-d>, C<-e>, C<-f>,
C<-o>, C<-M>, C<-s>, C<-S>, C<-t>, C<-T>, and C<-z> work as
advertised.  The return values for C<-r>, C<-w>, and C<-x>
tell you whether you can actually access the file; this may
not reflect the UIC-based file protections.  Since real and
effective UIC don't differ under VMS, C<-O>, C<-R>, C<-W>,
and C<-X> are equivalent to C<-o>, C<-r>, C<-w>, and C<-x>.
Similarly, several other tests, including C<-A>, C<-g>, C<-k>,
C<-l>, C<-p>, and C<-u>, aren't particularly meaningful under
VMS, and the values returned by these tests reflect whatever
your CRTL C<stat()> routine does to the equivalent bits in the
st_mode field.  Finally, C<-d> returns true if passed a device
specification without an explicit directory (e.g. C<DUA1:>), as
well as if passed a directory.

There are DECC feature logical names and ODS-5 volume attributes that
also control what values are returned for the date fields.

Note: Some sites have reported problems when using the file-access
tests (C<-r>, C<-w>, and C<-x>) on files accessed via DEC's DFS.
Specifically, since DFS does not currently provide access to the
extended file header of files on remote volumes, attempts to
examine the ACL fail, and the file tests will return false,
with C<$!> indicating that the file does not exist.  You can
use C<stat> on these files, since that checks UIC-based protection
only, and then manually check the appropriate bits, as defined by
your C compiler's F<stat.h>, in the mode value it returns, if you
need an approximation of the file's protections.
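
The manual check suggested in the note above might look like the
following sketch.  It assumes the C<S_IRUSR> constant exported by the
standard C<Fcntl> module; the helper name is hypothetical, and the
result only approximates the real protection, as the note says.

    use strict;
    use warnings;
    use Fcntl ':mode';

    # Return 1 or 0 depending on the owner-read bit in the mode value
    # from stat(), or nothing at all if stat() itself fails.
    sub mode_allows_owner_read {
        my ($file) = @_;
        my @info = stat($file);
        return unless @info;
        return ($info[2] & S_IRUSR) ? 1 : 0;
    }

    print "owner may read $0\n" if mode_allows_owner_read($0);

The same pattern works for the other bits (for example C<S_IWUSR> or
C<S_IXUSR>).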

=item backticks

Backticks create a subprocess, and pass the enclosed string
to it for execution as a DCL command.  Since the subprocess is
created directly via C<lib$spawn()>, any valid DCL command string
may be specified.

=item binmode FILEHANDLE

The C<binmode> operator will attempt to ensure that no translation
of carriage control occurs on input from or output to this filehandle.
Since this involves reopening the file and then restoring its
file position indicator, if this function returns FALSE, the
underlying filehandle may no longer point to an open file, or may
point to a different position in the file than before C<binmode>
was called.

Note that C<binmode> is generally not necessary when using normal
filehandles; it is provided so that you can control I/O to existing
record-structured files when necessary.  You can also use the
C<vmsfopen> function in the VMS::Stdio extension to gain finer
control of I/O to files and devices with different record structures.

=item crypt PLAINTEXT, USER

The C<crypt> operator uses the C<sys$hash_password> system
service to generate the hashed representation of PLAINTEXT.
If USER is a valid username, the algorithm and salt values
are taken from that user's UAF record.  If it is not, then
the preferred algorithm and a salt of 0 are used.  The
quadword encrypted value is returned as an 8-character string.

The value returned by C<crypt> may be compared against
the encrypted password from the UAF returned by the C<getpw*>
functions, in order to authenticate users.  If you're
going to do this, remember that the encrypted password in
the UAF was generated using uppercase username and
password strings; you'll have to upcase the arguments to
C<crypt> to ensure that you'll get the proper value:

    sub validate_passwd {
        my($user,$passwd) = @_;
        my($pwdhash);
        # The UAF hash was generated from upcased strings, so upcase both.
        if ( !($pwdhash = (getpwnam($user))[1]) ||
               $pwdhash ne crypt("\U$passwd","\U$user") ) {
            intruder_alert($user);
            return 0;
        }
        return 1;
    }


=item die

C<die> will force the native VMS exit status to be an SS$_ABORT code
if neither the C<$!> nor the C<$?> status value is one that would cause
the native status to be interpreted as what VMS classifies as
SEVERE_ERROR severity for DCL error handling.

When C<PERL_VMS_POSIX_EXIT> is active (see L</"$?"> below), the native VMS exit
status value will have either one of the C<$!> or C<$?> or C<$^E> or
the Unix value 255 encoded into it in a way that the effective original
value can be decoded by other programs written in C, including Perl
and the GNV package.  As per the normal non-VMS behavior of C<die>, if
either C<$!> or C<$?> is non-zero, one of those values will be
encoded into a native VMS status value.  If both of the Unix status
values are 0, and the C<$^E> value is set to one of ERROR or SEVERE_ERROR
severity, then the C<$^E> value will be used as the exit code as is.
If none of the above apply, the Unix value of 255 will be encoded into
a native VMS exit status value.

Please note a significant difference in the behavior of C<die> in
the C<PERL_VMS_POSIX_EXIT> mode is that it does not force a VMS
SEVERE_ERROR status on exit.  The Unix exit values of 2 through
255 will be encoded in VMS status values with severity levels of
SUCCESS.  The Unix exit value of 1 will be encoded in a VMS status
value with a severity level of ERROR.  This is to be compatible with
how the VMS C library encodes these values.

The minimum severity level set by C<die> in C<PERL_VMS_POSIX_EXIT> mode
may be changed to be ERROR or higher in the future depending on the 
results of testing and further review.

See L</"$?"> for a description of the encoding of the Unix value to
produce a native VMS status containing it.

=item dump

Rather than causing Perl to abort and dump core, the C<dump>
operator invokes the VMS debugger.  If you continue to
execute the Perl program under the debugger, control will
be transferred to the label specified as the argument to
C<dump>, or, if no label was specified, back to the
beginning of the program.  All other state of the program
(I<e.g.> values of variables, open file handles) is not
affected by calling C<dump>.

=item exec LIST

A call to C<exec> will cause Perl to exit, and to invoke the command
given as an argument to C<exec> via C<lib$do_command>.  If the
argument begins with '@' or '$' (other than as part of a filespec),
then it is executed as a DCL command.  Otherwise, the first token on
the command line is treated as the filespec of an image to run, and
an attempt is made to invoke it (using F<.Exe> and the process
defaults to expand the filespec) and pass the rest of C<exec>'s
argument to it as parameters.  If the token has no file type, and
matches a file with null type, then an attempt is made to determine
whether the file is an executable image which should be invoked
using C<MCR> or a text file which should be passed to DCL as a
command procedure.

=item fork

While in principle the C<fork> operator could be implemented via
(and with the same rather severe limitations as) the CRTL C<vfork()>
routine, and while some internal support to do just that is in
place, the implementation has never been completed, making C<fork>
currently unavailable.  A true kernel C<fork()> is expected in a
future version of VMS, and the pseudo-fork based on interpreter
threads may be available in a future version of Perl on VMS (see
L<perlfork>).  In the meantime, use C<system>, backticks, or piped
filehandles to create subprocesses.

=item getpwent

=item getpwnam

=item getpwuid

These operators obtain the information described in L<perlfunc>,
if you have the privileges necessary to retrieve the named user's
UAF information via C<sys$getuai>.  If not, then only the C<$name>,
C<$uid>, and C<$gid> items are returned.  The C<$dir> item contains
the login directory in VMS syntax, while the C<$comment> item
contains the login directory in Unix syntax. The C<$gcos> item
contains the owner field from the UAF record.  The C<$quota>
item is not used.

=item gmtime

The C<gmtime> operator will function properly if you have a
working CRTL C<gmtime()> routine, or if the logical name
SYS$TIMEZONE_DIFFERENTIAL is defined as the number of seconds
which must be added to UTC to yield local time.  (This logical
name is defined automatically if you are running a version of
VMS with built-in UTC support.)  If neither of these cases is
true, a warning message is printed, and C<undef> is returned.

=item kill

In most cases, C<kill> is implemented via the undocumented system
service C<$SIGPRC>, which has the same calling sequence as C<$FORCEX>, but
throws an exception in the target process rather than forcing it to call
C<$EXIT>.  Generally speaking, C<kill> follows the behavior of the
CRTL's C<kill()> function, but unlike that function can be called from
within a signal handler.  Also, unlike the C<kill> in some versions of
the CRTL, Perl's C<kill> checks the validity of the signal passed in and
returns an error rather than attempting to send an unrecognized signal.

Also, negative signal values don't do anything special under
VMS; they're just converted to the corresponding positive value.

=item qx//

See the entry on C<backticks> above.

=item select (system call)

If Perl was not built with socket support, the system call
version of C<select> is not available at all.  If socket
support is present, then the system call version of
C<select> functions only for file descriptors attached
to sockets.  It will not provide information about regular
files or pipes, since the CRTL C<select()> routine does not
provide this functionality.

=item stat EXPR

Since VMS keeps track of files according to a different scheme
than Unix, it's not really possible to represent the file's ID
in the C<st_dev> and C<st_ino> fields of a C<struct stat>.  Perl
tries its best, though, and the values it uses are pretty unlikely
to be the same for two different files.  We can't guarantee this,
though, so caveat scriptor.

=item system LIST

The C<system> operator creates a subprocess, and passes its 
arguments to the subprocess for execution as a DCL command.  
Since the subprocess is created directly via C<lib$spawn()>, any 
valid DCL command string may be specified.  If the string begins with
'@', it is treated as a DCL command unconditionally.  Otherwise, if
the first token contains a character used as a delimiter in file
specification (e.g. C<:> or C<]>), an attempt is made to expand it
using  a default type of F<.Exe> and the process defaults, and if
successful, the resulting file is invoked via C<MCR>. This allows you
to invoke an image directly simply by passing the file specification
to C<system>, a common Unixish idiom.  If the token has no file type,
and matches a file with null type, then an attempt is made to
determine whether the file is an executable image which should be
invoked using C<MCR> or a text file which should be passed to DCL
as a command procedure.

If LIST consists of the empty string, C<system> spawns an
interactive DCL subprocess, in the same fashion as typing
B<SPAWN> at the DCL prompt.

Perl waits for the subprocess to complete before continuing
execution in the current process.  As described in L<perlfunc>,
the return value of C<system> is a fake "status" which follows
POSIX semantics unless the pragma C<use vmsish 'status'> is in
effect; see the description of C<$?> in this document for more 
detail.  
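
As a minimal sketch of the POSIX-style status (that is, without
C<use vmsish 'status'>), the exit code of a child can be taken from the
second byte of C<$?>.  The child here is another perl exiting with 3,
a value chosen purely for illustration; on VMS the value you see
depends on how the child encodes its exit status.

    use strict;
    use warnings;

    # Run a child perl that exits with status 3.
    my $raw = system($^X, '-e', 'exit 3');
    die "Could not start child: $!" if $raw == -1;

    # The low-order 8 bits are reserved; the next 8 bits hold the
    # termination status of the program.
    my $exit_code = $? >> 8;
    print "child exit code: $exit_code\n";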

=item time

The value returned by C<time> is the offset in seconds from
01-JAN-1970 00:00:00 (just like the CRTL's C<time()> routine), in order
to make life easier for code coming in from the POSIX/Unix world.

=item times

The array returned by the C<times> operator is divided up 
according to the same rules as the CRTL C<times()> routine.  
Therefore, the "system time" elements will always be 0, since 
there is no difference between "user time" and "system time" 
under VMS, and the time accumulated by a subprocess may or may 
not appear separately in the "child time" field, depending on 
whether C<times()> keeps track of subprocesses separately.  Note
especially that the VAXCRTL (at least) keeps track only of
subprocesses spawned using C<fork()> and C<exec()>; it will not
accumulate the times of subprocesses spawned via pipes, C<system()>,
or backticks.

=item unlink LIST

C<unlink> will delete the highest version of a file only; in
order to delete all versions, you need to say

    1 while unlink LIST;

You may need to make this change to scripts written for a
Unix system which expect that after a call to C<unlink>,
no files with the names passed to C<unlink> will exist.
(Note: This can be changed at compile time; if you
C<use Config> and C<$Config{'d_unlink_all_versions'}> is
C<define>, then C<unlink> will delete all versions of a
file on the first call.)
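
A script that has to behave the same under either setting could
consult C<%Config> at run time, as in this sketch.  The helper name
C<unlink_all_versions> is hypothetical; the return value is simply the
number of successful C<unlink> calls.

    use strict;
    use warnings;
    use Config;

    sub unlink_all_versions {
        my @files   = @_;
        my $deleted = 0;
        foreach my $file (@files) {
            if (($Config{'d_unlink_all_versions'} // '') eq 'define') {
                # One call already removes every version.
                $deleted += unlink $file;
            }
            else {
                # Keep deleting until no version is left.
                my $count;
                $deleted += $count while ($count = unlink $file);
            }
        }
        return $deleted;
    }

    print unlink_all_versions(@ARGV), " unlink call(s) succeeded\n";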

C<unlink> will delete a file if at all possible, even if it
requires changing file protection (though it won't try to
change the protection of the parent directory).  You can tell
whether you've got explicit delete access to a file by using the
C<VMS::Filespec::candelete> operator.  For instance, in order
to delete only files to which you have delete access, you could
say something like

    sub safe_unlink {
        my $num = 0;    # count of files actually deleted
        foreach my $file (@_) {
            next unless VMS::Filespec::candelete($file);
            $num += unlink $file;
        }
        return $num;
    }

(or you could just use C<VMS::Stdio::remove>, if you've installed
the VMS::Stdio extension distributed with Perl). If C<unlink> has to
change the file protection to delete the file, and you interrupt it
in midstream, the file may be left intact, but with a changed ACL
allowing you delete access.

This behavior of C<unlink> is to be compatible with POSIX behavior
and not traditional VMS behavior.

=item utime LIST

This operator changes only the modification time of the file (VMS 
revision date) on ODS-2 volumes and ODS-5 volumes without access 
dates enabled. On ODS-5 volumes with access dates enabled, the 
true access time is modified.

=item waitpid PID,FLAGS

If PID is a subprocess started by a piped C<open()> (see L<perlfunc/open>), 
C<waitpid> will wait for that subprocess, and return its final status
value in C<$?>.  If PID is a subprocess created in some other way (e.g.
SPAWNed before Perl was invoked), C<waitpid> will simply check once per
second whether the process has completed, and return when it has.  (If
PID specifies a process that isn't a subprocess of the current process,
and you invoked Perl with the C<-w> switch, a warning will be issued.)

Returns PID on success, -1 on error.  The FLAGS argument is ignored
in all cases.

=back

=head1 Perl variables

The following VMS-specific information applies to the indicated
"special" Perl variables, in addition to the general information
in L<perlvar>.  Where there is a conflict, this information
takes precedence.

=over 4

=item %ENV 

The operation of the C<%ENV> array depends on the translation
of the logical name F<PERL_ENV_TABLES>.  If defined, it should
be a search list, each element of which specifies a location
for C<%ENV> elements.  If you tell Perl to read or set the
element C<$ENV{>I<name>C<}>, then Perl uses the translations of
F<PERL_ENV_TABLES> as follows:

=over 4

=item CRTL_ENV

This string tells Perl to consult the CRTL's internal C<environ> array
of key-value pairs, using I<name> as the key.  In most cases, this
contains only a few keys, but if Perl was invoked via the C
C<exec[lv]e()> function, as is the case for some embedded Perl
applications or when running under a shell such as GNV bash, the
C<environ> array may have been populated by the calling program.

=item CLISYM_[LOCAL]

A string beginning with C<CLISYM_> tells Perl to consult the CLI's
symbol tables, using I<name> as the name of the symbol.  When reading
an element of C<%ENV>, the local symbol table is scanned first, followed
by the global symbol table.  The characters following C<CLISYM_> are
significant when an element of C<%ENV> is set or deleted: if the
complete string is C<CLISYM_LOCAL>, the change is made in the local
symbol table; otherwise the global symbol table is changed.

=item Any other string

If an element of F<PERL_ENV_TABLES> translates to any other string,
that string is used as the name of a logical name table, which is
consulted using I<name> as the logical name.  The normal search
order of access modes is used.

=back

F<PERL_ENV_TABLES> is translated once when Perl starts up; any changes
you make while Perl is running do not affect the behavior of C<%ENV>.
If F<PERL_ENV_TABLES> is not defined, then Perl defaults to consulting
first the logical name tables specified by F<LNM$FILE_DEV>, and then
the CRTL C<environ> array.  This default order is reversed when the
logical name F<GNV$UNIX_SHELL> is defined, such as when running under
GNV bash.

For operations on %ENV entries based on logical names or DCL symbols, the
key string is treated as if it were entirely uppercase, regardless of the
case actually specified in the Perl expression. Entries in %ENV based on the
CRTL's environ array preserve the case of the key string when stored, and
lookups are case sensitive.

When an element of C<%ENV> is read, the locations to which
F<PERL_ENV_TABLES> points are checked in order, and the value
obtained from the first successful lookup is returned.  If the
name of the C<%ENV> element contains a semi-colon, it and
any characters after it are removed.  These are ignored when
the CRTL C<environ> array or a CLI symbol table is consulted.
However, if the name is looked up in a logical name table, the
suffix after the semi-colon is treated as the translation index
to be used for the lookup.  This lets you look up successive values
for search list logical names.  For instance, if you say

   $  Define STORY  once,upon,a,time,there,was
   $  perl -e "for ($i = 0; $i <= 6; $i++) " -
   _$ -e "{ print $ENV{'story;'.$i},' '}"

Perl will print C<ONCE UPON A TIME THERE WAS>, assuming, of course,
that F<PERL_ENV_TABLES> is set up so that the logical name C<story>
is found, rather than a CLI symbol or CRTL C<environ> element with
the same name.

When an element of C<%ENV> is set to a defined string, the
corresponding definition is made in the location to which the
first translation of F<PERL_ENV_TABLES> points.  If this causes a
logical name to be created, it is defined in supervisor mode.
(The same is done if an existing logical name was defined in
executive or kernel mode; an existing user or supervisor mode
logical name is reset to the new value.)  If the value is an empty
string, the logical name's translation is defined as a single C<NUL>
(ASCII C<\0>) character, since a logical name cannot translate to a
zero-length string.  (This restriction does not apply to CLI symbols
or CRTL C<environ> values; they are set to the empty string.)

When an element of C<%ENV> is set to C<undef>, the element is looked
up as if it were being read, and if it is found, it is deleted.  (An
item "deleted" from the CRTL C<environ> array is set to the empty
string.)  Using C<delete> to remove an element from C<%ENV> has a
similar effect, but after the element is deleted, another attempt is
made to look up the element, so an inner-mode logical name or a name
in another location will replace the logical name just deleted. In
either case, only the first value found searching PERL_ENV_TABLES is
altered.  It is not possible at present to define a search list
logical name via %ENV.

The element C<$ENV{DEFAULT}> is special: when read, it returns
Perl's current default device and directory, and when set, it
resets them, regardless of the definition of F<PERL_ENV_TABLES>.
It cannot be cleared or deleted; attempts to do so are silently
ignored.

Note that if you want to pass on any elements of the
C-local environ array to a subprocess which isn't
started by fork/exec, or isn't running a C program, you
can "promote" them to logical names in the current
process, which will then be inherited by all subprocesses,
by saying

    foreach my $key (qw[C-local keys you want promoted]) {
        my $temp = $ENV{$key}; # read from C-local array
        $ENV{$key} = $temp;    # and define as logical name
    }

(You can't just say C<$ENV{$key} = $ENV{$key}>, since the
Perl optimizer is smart enough to elide the expression.)

Don't try to clear C<%ENV> by saying C<%ENV = ();>; it will throw
a fatal error.  This is equivalent to doing the following from DCL:

    DELETE/LOGICAL *

You can imagine how bad things would be if, for example, the SYS$MANAGER
or SYS$SYSTEM logical names were deleted.

At present, the first time you iterate over C<%ENV> using
C<keys> or C<values>, you will incur a time penalty as all
logical names are read, in order to fully populate C<%ENV>.
Subsequent iterations will not reread logical names, so they
won't be as slow, but they also won't reflect any changes
to logical name tables caused by other programs.

You do need to be careful with the logical names representing
process-permanent files, such as C<SYS$INPUT> and C<SYS$OUTPUT>.
The translations for these logical names are prepended with a
two-byte binary value (0x1B 0x00) that needs to be stripped off
if you want to use it.  (In previous versions of Perl it wasn't
possible to get the values of these logical names, as the null
byte acted as an end-of-string marker.)
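
Stripping the prefix could be done as in the sketch below.  It assumes
the prefix is exactly the two bytes described above and removes
nothing else; the helper name C<strip_ppf_prefix> is hypothetical.

    use strict;
    use warnings;

    # Remove a leading 0x1B 0x00 pair, if present, from a translation.
    sub strip_ppf_prefix {
        my ($translation) = @_;
        $translation =~ s/\A\x1b\x00//;
        return $translation;
    }

    my $raw = $ENV{'SYS$OUTPUT'};
    print strip_ppf_prefix($raw), "\n" if defined $raw;

Values without the prefix pass through unchanged, so the helper is
safe to apply to any logical name translation.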

=item $!

The string value of C<$!> is that returned by the CRTL's
strerror() function, so it will include the VMS message for
VMS-specific errors.  The numeric value of C<$!> is the
value of C<errno>, except if errno is EVMSERR, in which
case C<$!> contains the value of vaxc$errno.  Setting C<$!>
always sets errno to the value specified.  If this value is
EVMSERR, it also sets vaxc$errno to 4 (NONAME-F-NOMSG), so
that the string value of C<$!> won't reflect the VMS error
message from before C<$!> was set.

=item $^E

This variable provides direct access to VMS status values
in vaxc$errno, which are often more specific than the
generic Unix-style error messages in C<$!>.  Its numeric value
is the value of vaxc$errno, and its string value is the
corresponding VMS message string, as retrieved by sys$getmsg().
Setting C<$^E> sets vaxc$errno to the value specified.

While Perl attempts to keep the vaxc$errno value current, if
errno is not EVMSERR, it may not be from the current operation.

=item $?

The "status value" returned in C<$?> is synthesized from the
actual exit status of the subprocess in a way that approximates
POSIX wait(5) semantics, in order to allow Perl programs to
portably test for successful completion of subprocesses.  The
low order 8 bits of C<$?> are always 0 under VMS, since the
termination status of a process may or may not have been
generated by an exception.

The next 8 bits contain the termination status of the program.

If the child process follows the convention of C programs
compiled with the _POSIX_EXIT macro set, the status value will
contain the actual value of 0 to 255 returned by that program
on a normal exit.

With the _POSIX_EXIT macro set, the Unix exit value of zero is
represented as a VMS native status of 1, and the Unix values
from 2 to 255 are encoded by the equation:

   VMS_status = 0x35a000 + (unix_value * 8) + 1.

And in the special case of Unix value 1 the encoding is:

   VMS_status = 0x35a000 + 8 + 2 + 0x10000000.
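
The two equations above can be collected into one small helper.  This is
only a sketch: the name C<posix_exit_to_vms> is invented for illustration
and is not part of Perl or the CRTL, and the arithmetic simply restates
the encoding given in the text.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Perl or CRTL API): encode a Unix exit
# value (0..255) as the VMS status described in the text above.
sub posix_exit_to_vms {
    my ($unix_value) = @_;
    die "exit value out of range\n"
        if $unix_value < 0 || $unix_value > 255;

    return 1 if $unix_value == 0;              # Unix success
    return 0x35a000 + 8 + 2 + 0x10000000       # special case for value 1
        if $unix_value == 1;
    return 0x35a000 + ($unix_value * 8) + 1;   # values 2 to 255
}

printf "%3d => 0x%08x\n", $_, posix_exit_to_vms($_) for 0, 1, 2, 255;
```

Because it is plain arithmetic, the sketch runs on any platform and can
be used to check by hand a status seen in C<${^CHILD_ERROR_NATIVE}>.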

For other termination statuses, the severity portion of the
subprocess's exit status is used: if the severity was success or
informational, these bits are all 0; if the severity was
warning, they contain a value of 1; if the severity was
error or fatal error, they contain the actual severity bits,
which turns out to be a value of 2 for error and 4 for severe_error.
Fatal is another term for the severe_error status.
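
For statuses that are not _POSIX_EXIT encoded, the severity mapping
described above might be sketched as follows.  The sketch assumes that
the severity is held in the low three bits of the VMS status (0 warning,
1 success, 2 error, 3 informational, 4 severe error).  The function name
is hypothetical; Perl's real logic lives in its C source and handles
more cases than this.

```perl
use strict;
use warnings;

# Assumption: severity is the low three bits of a VMS status value
# (0 warning, 1 success, 2 error, 3 informational, 4 severe error).
# Hypothetical helper; reserved severity values are passed through.
sub emulated_status_from_severity {
    my ($vms_status) = @_;
    my $severity = $vms_status & 0x7;

    my $code = ($severity == 1 || $severity == 3) ? 0          # success, info
             : ($severity == 0)                   ? 1          # warning
             :                                      $severity; # error 2, severe 4

    return $code << 8;    # the low-order 8 bits of $? stay 0
}

printf "0x%02x => %d\n", $_, emulated_status_from_severity($_)
    for 0x01, 0x03, 0x10, 0x12, 0x2c;
```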

As a result, C<$?> will always be zero if the subprocess's exit
status indicated successful completion, and non-zero if a
warning or error occurred or a program compliant with encoding
_POSIX_EXIT values was run and set a status.

How can you tell the difference between a non-zero status that is
the result of a VMS native error status and one that is an encoded
Unix status?  You cannot, unless you look at the ${^CHILD_ERROR_NATIVE}
value, which holds the actual VMS status value, and check its severity
bits.  If the severity bits are equal to 1 and the numeric value of
C<$?> is 0 or between 2 and 255, then C<$?> accurately reflects a value
passed back from a Unix application.  If C<$?> is 1 and the severity
bits indicate a VMS error (2), then C<$?> is from a Unix application
exit value.

In practice, Perl scripts that call programs that return _POSIX_EXIT
type status values will be expecting those values, and programs that
call traditional VMS programs will either be expecting the previous
behavior or just checking for a non-zero status.

And success is always the value 0 in all behaviors.
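
Because success is always 0, a portable caller can test for zero and
treat anything else as a failure.  The following sketch runs the current
perl (C<$^X>) as the child; the helper name C<run_child> is made up for
this example.

```perl
use strict;
use warnings;

# Hypothetical helper: run a child perl that exits with the given
# value and return the resulting $?.  $^X is the running perl binary.
sub run_child {
    my ($exit_value) = @_;
    system($^X, '-e', "exit $exit_value");
    return $?;
}

my $ok     = run_child(0);
my $failed = run_child(3);

print "first child succeeded\n"          if $ok == 0;
print "second child reported a problem\n" if $failed != 0;
```

Testing only for zero versus non-zero avoids depending on how a
particular platform encodes the failure.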

When the actual VMS termination status of the child is an error,
internally the C<$!> value will be set to the closest Unix errno
value to that error so that Perl scripts that test for error
messages will see the expected Unix style error message instead
of a VMS message.

Conversely, when setting C<$?> in an END block, an attempt is made
to convert the POSIX value into a native status intelligible to
the operating system upon exiting Perl.  What this boils down to
is that setting C<$?> to zero results in the generic success value
SS$_NORMAL, and setting C<$?> to a non-zero value results in the
generic failure status SS$_ABORT.  See also L<perlport/exit>.

With the C<PERL_VMS_POSIX_EXIT> logical name defined as "ENABLE",
setting C<$?> will cause the new value to be encoded into C<$^E>
so that either the original parent or child exit status values
0 to 255 can be automatically recovered by C programs expecting
_POSIX_EXIT behavior.  If both a parent and a child exit value are
non-zero, then it will be assumed that this is actually a VMS native
status value to be passed through.  The special value of 0xFFFF is
almost a NOOP as it will cause the current native VMS status in the
C library to become the current native Perl VMS status, and is handled
this way as it is known to not be a valid native VMS status value.
It is recommended that only values in the range of normal Unix parent or
child status numbers, 0 to 255, be used.

The pragma C<use vmsish 'status'> makes C<$?> reflect the actual 
VMS exit status instead of the default emulation of POSIX status 
described above.  This pragma also disables the conversion of
non-zero values to SS$_ABORT when setting C<$?> in an END
block (but zero will still be converted to SS$_NORMAL).

Do not use the pragma C<use vmsish 'status'> with C<PERL_VMS_POSIX_EXIT>
enabled, as they are at times requesting conflicting actions and the
consequence of ignoring this advice will be undefined to allow future
improvements in the POSIX exit handling.

In general, with C<PERL_VMS_POSIX_EXIT> enabled, the exit status carries
more detailed information for DCL scripts and other native VMS tools,
and gives the expected information for POSIX programs.  It has not been
made the default in order to preserve backward compatibility.

N.B. Setting C<DECC$FILENAME_UNIX_REPORT> implicitly enables 
C<PERL_VMS_POSIX_EXIT>.

=item $|

Setting C<$|> for an I/O stream causes data to be flushed
all the way to disk on each write (I<i.e.> not just to
the underlying RMS buffers for a file).  In other words,
it's equivalent to calling fflush() and fsync() from C.

=back
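
As a minimal sketch of the C<$|> behaviour described above, autoflush can
be enabled on a single handle through the core IO::Handle module rather
than by selecting the handle and assigning to C<$|>.  The scratch file
comes from File::Temp, and the variable names are illustrative only.  On
VMS the flush goes all the way to disk as described; on other systems it
only hands the data to the operating system.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Scratch file that is removed automatically when the program exits.
my ($fh, $path) = tempfile(UNLINK => 1);

$fh->autoflush(1);             # per-handle equivalent of setting $|

print {$fh} "first record\n";  # written out as soon as print returns
my $size_while_open = -s $path;

close($fh) or die "Cannot close $path: $!";
print "bytes visible before close: $size_while_open\n";
```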

=head1 Standard modules with VMS-specific differences

=head2 SDBM_File

SDBM_File works properly on VMS. It has, however, one minor
difference. The database directory file created has a F<.sdbm_dir>
extension rather than a F<.dir> extension. F<.dir> files are VMS filesystem
directory files, and using them for other purposes could cause unacceptable
problems.

=head1 Revision date

Please see the git repository for revision history.

=head1 AUTHOR

Charles Bailey  bailey@cor.newman.upenn.edu
Craig Berry  craigberry@mac.com
Dan Sugalski  dan@sidhe.org
John Malmberg wb8tyw@qsl.net

=encoding utf8

=head1 NAME

perl5181delta - what is new for perl v5.18.1

=head1 DESCRIPTION

This document describes differences between the 5.18.0 release and the 5.18.1
release.

If you are upgrading from an earlier release such as 5.16.0, first read
L<perl5180delta>, which describes differences between 5.16.0 and 5.18.0.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.18.0.
If any exist, they are bugs, and we request that you submit a
report.  See L</Reporting Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

B has been upgraded from 1.42 to 1.42_01, fixing bugs related to lexical
subroutines.

=item *

Digest::SHA has been upgraded from 5.84 to 5.84_01, fixing a crashing bug.
[RT #118649]

=item *

Module::CoreList has been upgraded from 2.89 to 2.96.

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item AIX

A rarely-encountered configuration bug in the AIX hints file has been corrected.

=item MidnightBSD

After a patch to the relevant hints file, perl should now build correctly on
MidnightBSD 0.4-RELEASE.

=back

=head1 Selected Bug Fixes

=over 4

=item *

Starting in v5.18.0, a construct like C</[#](?{})/x> would have its C<#>
incorrectly interpreted as a comment.  The code block would be skipped,
unparsed.  This has been corrected.

=item *

A number of memory leaks related to the new, experimental regexp bracketed
character class feature have been plugged.

=item *

The OP allocation code now returns correctly aligned memory in all cases
for C<struct pmop>. Previously it could return memory only aligned to a
4-byte boundary, which is not correct for an ithreads build with 64 bit IVs
on some 32 bit platforms. Notably, this caused the build to fail completely
on sparc GNU/Linux. [RT #118055]

=item *

The debugger's C<man> command has been fixed. It was broken in the v5.18.0
release. The C<man> command is aliased to the names C<doc> and C<perldoc> -
all now work again.

=item *

C<@_> is now correctly visible in the debugger, fixing a regression
introduced in v5.18.0's debugger. [RT #118169]

=item *

Fixed a small number of regexp constructions that could either fail to
match or crash perl when the string being matched against was
allocated above the 2GB line on 32-bit systems. [RT #118175]

=item *

Perl v5.16 inadvertently introduced a bug whereby calls to XSUBs that were
not visible at compile time were treated as lvalues and could be assigned
to, even when the subroutine was not an lvalue sub.  This has been fixed.
[perl #117947]

=item *

Perl v5.18 inadvertently introduced a bug whereby the truth of dual-vars
(i.e. variables with both string and numeric values, such as C<$!>) was
determined by the numeric value rather than the string value.
[RT #118159]

=item *

Perl v5.18 inadvertently introduced a bug whereby interpolating mixed up-
and down-graded UTF-8 strings in a regex could result in malformed UTF-8
in the pattern: specifically if a downgraded character in the range
C<\x80..\xff> followed a UTF-8 string, e.g.

    utf8::upgrade(  my $u = "\x{e5}");
    utf8::downgrade(my $d = "\x{e5}");
    /$u$d/

[perl #118297].

=item *

Lexical constants (C<my sub a() { 42 }>) no longer crash when inlined.

=item *

Parameter prototypes attached to lexical subroutines are now respected when
compiling sub calls without parentheses.  Previously, the prototypes were
honoured only for calls I<with> parentheses. [RT #116735]

=item *

Syntax errors in lexical subroutines in combination with calls to the same
subroutines no longer cause crashes at compile time.

=item *

The dtrace sub-entry probe now works with lexical subs, instead of
crashing [perl #118305].

=item *

Undefining an inlinable lexical subroutine (C<my sub foo() { 42 } undef
&foo>) would result in a crash if warnings were turned on.

=item *

Deep recursion warnings no longer crash lexical subroutines. [RT #118521]

=back
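
The dual-var item in the list above can be illustrated with the
C<dualvar> function from the core Scalar::Util module.  This is a small
sketch with arbitrary variable names; it shows the intended behaviour, in
which the string value decides whether the variable is true.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

# A dual-var carries separate numeric and string values.
my $zero_but_text  = dualvar(0, "some text");  # number 0, non-empty string
my $five_but_empty = dualvar(5, "");           # number 5, empty string

# Truth follows the string value, not the numeric one.
print "numeric 0 with text is ", ($zero_but_text  ? "true" : "false"), "\n";
print "numeric 5 with '' is ",   ($five_but_empty ? "true" : "false"), "\n";
```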

=head1 Acknowledgements

Perl 5.18.1 represents approximately 2 months of development since Perl 5.18.0
and contains approximately 8,400 lines of changes across 60 files from 12
authors.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers. The following people are known to have contributed the
improvements that became Perl 5.18.1:

Chris 'BinGOs' Williams, Craig A. Berry, Dagfinn Ilmari Mannsåker, David
Mitchell, Father Chrysostomos, Karl Williamson, Lukas Mai, Nicholas Clark,
Peter Martini, Ricardo Signes, Shlomi Fish, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated
from version control history. In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core. We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
http://rt.perl.org/perlbug/ .  There may also be information at
http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=encoding utf8

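If you are reading this file in an ordinary text editor, please ignore
the odd markup characters.  This file is written in POD (Plain Old
Documentation), a format specially designed to be readable as is.  For
more information about the format, see the perlpod documentation.

=head1 NAME

perltw - Perl guide in Traditional Chinese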

=head1 DESCRIPTION

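Welcome to the world of Perl!

Starting with version 5.8.0, Perl has comprehensive Unicode support,
and with it support for many encodings beyond the Latin family; CJK
(Chinese, Japanese, and Korean) is among them.  Unicode is an
international standard that aims to cover all the characters of the
world: the Western world, the Eastern world, and everything in between
(Greek, Syriac, Arabic, Hebrew, Indic, Native American, and so on).  It
also spans multiple operating systems and platforms (such as PC and
Macintosh).

Perl itself operates on Unicode.  This means that string data inside
Perl can be represented in Unicode, and that Perl's functions and
operators (such as regular expression matching) can operate on Unicode
as well.  For input and output, to handle data stored in encodings that
predate Unicode, Perl provides the Encode module, which lets you easily
read and write data in legacy encodings.

The Encode extension supports the following Traditional Chinese
encodings ('big5' means 'big5-eten'):

    big5-eten   Big5 encoding (with the ETEN extensions)
    big5-hkscs  Big5 + Hong Kong Supplementary Character Set, 2001 edition
    cp950       Code Page 950 (Big5 + characters added by Microsoft)

For example, to convert a Big5-encoded file to Unicode, just type the
following command: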

    perl -MEncode -pe '$_= encode( utf8 => decode( big5 => $_ ) )' \
      < file.big5 > file.utf8

Perl also ships with "piconv", a character conversion utility written
entirely in Perl.  It is used as follows:

    piconv -f big5 -t utf8 < file.big5 > file.utf8
    piconv -f utf8 -t big5 < file.utf8 > file.big5
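
The same conversion can be done inside a program with the C<encode> and
C<decode> functions of the Encode module.  The following is a minimal
sketch with illustrative variable names; the two characters are written
as code points so that the example itself stays plain ASCII.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Two CJK characters (U+99F1 U+99DD) given by code point.
my $text = "\x{99F1}\x{99DD}";

my $big5_bytes = encode('big5-eten', $text);        # characters to Big5 bytes
my $utf8_bytes = encode('UTF-8', $text);            # characters to UTF-8 bytes
my $round_trip = decode('big5-eten', $big5_bytes);  # Big5 bytes to characters

printf "Big5: %d bytes, UTF-8: %d bytes, characters: %d\n",
    length($big5_bytes), length($utf8_bytes), length($round_trip);
```

Decoding on input and encoding on output keeps the string in characters
while the program works on it.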

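In addition, if the source code itself is saved in UTF-8, using the utf8
pragma makes the strings in the code, and the operations on them, work
in units of characters rather than bytes, as shown below:

    #!/usr/bin/env perl
    use utf8;
    print length("駱駝");            #  2 (not 6)
    print index("諄諄教誨", "教誨"); #  2 (character 2, counting from 0)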


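=head2 Extra Chinese encodings

If you need more Chinese encodings, you can download the
Encode::HanExtra module from CPAN (L<http://www.cpan.org/>).  It
currently provides the following encodings:

    cccii       Chinese Character Code for Information Interchange
                (Council for Cultural Affairs, 1980)
    euc-tw      Extended Unix Code character set, with CNS11643 planes 1-7
    big5plus    Big5+ from the Chinese Foundation for Digitization Technology
    big5ext     Big5e from the Chinese Foundation for Digitization Technology

In addition, the Encode::HanConvert module provides two encodings for
converting between Simplified and Traditional Chinese:

    big5-simp   Big5 Traditional Chinese to and from Unicode Simplified Chinese
    gbk-trad    GBK Simplified Chinese to and from Unicode Traditional Chinese

To convert between GBK and Big5, see the b2g.pl and g2b.pl programs
shipped with that module, or use the following in your program:

    use Encode::HanConvert;
    $euc_cn = big5_to_gb($big5); # convert from Big5 to GBK
    $big5 = gb_to_big5($euc_cn); # convert from GBK to Big5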

=head2 Further information

See the extensive documentation that ships with Perl (unfortunately, all
of it written in English) to learn more about Perl and about using
Unicode.  There are, however, plenty of external resources:

=head2 Perl resource sites

=over 4

=item L<http://www.perl.com/>

The Perl home page (maintained by O'Reilly)

=item L<http://www.cpan.org/>

The Comprehensive Perl Archive Network

=item L<http://lists.perl.org/>

A list of Perl mailing lists

=back

=head2 Sites for learning Perl

=over 4

=item L<http://www.oreilly.com.tw/product_perl.php?id=index_perl>

O'Reilly Perl books in Traditional Chinese

=back

=head2 Perl user groups

=over 4

=item L<http://www.pm.org/groups/taiwan.html>

A list of Perl Mongers groups in Taiwan

=item L<irc://irc.freenode.org/#perl.tw>

The Perl.tw online chat room

=back

=head2 Unicode-related sites

=over 4

=item L<http://www.unicode.org/>

The Unicode Consortium (the maker of the Unicode standard)

=item L<http://www.cl.cam.ac.uk/%7Emgk25/unicode.html>

UTF-8 and Unicode FAQ for Unix/Linux

=back

=head2 Chinese localization information

=over 4

=item Chinese Software Localization Alliance (中文化軟體聯盟)

L<http://www.cpatch.org/>

=item Linux Software Chinese Localization Project (Linux 軟體中文化計劃)

L<http://www.linux.org.tw/CLDP/>

=back

=head1 SEE ALSO

L<Encode>, L<Encode::TW>, L<perluniintro>, L<perlunicode>

=head1 AUTHORS

Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>

Audrey Tang (唐鳳) E<lt>audreyt@audreyt.orgE<gt>

=cut

=encoding utf8

=head1 NAME

perl5143delta - what is new for perl v5.14.3

=head1 DESCRIPTION

This document describes differences between the 5.14.2 release and
the 5.14.3 release.

If you are upgrading from an earlier release such as 5.12.0, first read
L<perl5140delta>, which describes differences between 5.12.0 and
5.14.0.

=head1 Core Enhancements

No changes since 5.14.0.

=head1 Security

=head2 C<Digest> unsafe use of eval (CVE-2011-3597)

The C<Digest-E<gt>new()> function did not properly sanitize input before
using it in an eval() call, which could lead to the injection of arbitrary
Perl code.

In order to exploit this flaw, the attacker would need to be able to set
the algorithm name used, or be able to execute arbitrary Perl code already.

This problem has been fixed.

=head2 Heap buffer overrun in 'x' string repeat operator (CVE-2012-5195)

Poorly written perl code that allows an attacker to specify the count to
perl's 'x' string repeat operator can already cause a memory exhaustion
denial-of-service attack. A flaw in versions of perl before 5.15.5 can
escalate that into a heap buffer overrun; coupled with versions of glibc
before 2.16, it possibly allows the execution of arbitrary code.

This problem has been fixed.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.14.0. If any
exist, they are bugs and reports are welcome.

=head1 Deprecations

There have been no deprecations since 5.14.0.

=head1 Modules and Pragmata

=head2 New Modules and Pragmata

None

=head2 Updated Modules and Pragmata

=over 4

=item *

L<PerlIO::scalar> was updated to fix a bug in which opening a filehandle to
a glob copy caused assertion failures (under debugging) or hangs or other
erratic behaviour without debugging.

=item *

L<ODBM_File> and L<NDBM_File> were updated to allow building on GNU/Hurd.

=item *

L<IPC::Open3> has been updated to fix a regression introduced in perl
5.12, which broke C<IPC::Open3::open3($in, $out, $err, '-')>.
[perl #95748]

=item *

L<Digest> has been upgraded from version 1.16 to 1.16_01.

See L</Security>.

=item *

L<Module::CoreList> has been updated to version 2.49_04 to add data for
this release.

=back

=head2 Removed Modules and Pragmata

None

=head1 Documentation

=head2 New Documentation

None

=head2 Changes to Existing Documentation

=head3 L<perlcheat>

=over 4

=item *

L<perlcheat> was updated to 5.14.

=back

=head1 Configuration and Compilation

=over 4

=item *

h2ph was updated to correctly search gcc include directories on platforms
such as Debian with multi-architecture support.

=item *

In Configure, the test for procselfexe was refactored into a loop.

=back

=head1 Platform Support

=head2 New Platforms

None

=head2 Discontinued Platforms

None

=head2 Platform-Specific Notes

=over 4

=item FreeBSD

The FreeBSD hints file was corrected to be compatible with FreeBSD 10.0.

=item Solaris and NetBSD

Configure was updated for "procselfexe" support on Solaris and NetBSD.

=item HP-UX

README.hpux was updated to note the existence of a broken header in
HP-UX 11.00.

=item Linux

libutil is no longer used when compiling on Linux platforms, which avoids
warnings being emitted.

The system gcc (rather than any other gcc which might be in the compiling
user's path) is now used when searching for libraries such as C<-lm>.

=item Mac OS X

The locale tests were updated to reflect the behaviour of locales in
Mountain Lion.

=item GNU/Hurd

Various build and test fixes were included for GNU/Hurd.

LFS support was enabled in GNU/Hurd.

=item NetBSD

The NetBSD hints file was corrected to be compatible with NetBSD 6.*.

=back

=head1 Bug Fixes

=over 4

=item *

A regression has been fixed that was introduced in 5.14, in C</i>
regular expression matching, in which a match improperly fails if the
pattern is in UTF-8, the target string is not, and a Latin-1 character
precedes a character in the string that should match the pattern.  [perl
#101710]

=item *

In case-insensitive regular expression pattern matching on UTF-8 encoded
strings, the scan for the start of a match no longer looks only at
the first possible position.  This caused matches such as
C<"f\x{FB00}" =~ /ff/i> to fail.

=item *

The sitecustomize support was made relocatableinc aware, so that
-Dusesitecustomize and -Duserelocatableinc may be used together.

=item *

The smartmatch operator (C<~~>) was changed so that the right-hand side
takes precedence during C<Any ~~ Object> operations.

=item *

A bug has been fixed in the tainting support, in which an C<index()>
operation on a tainted constant would cause all other constants to become
tainted.  [perl #64804]

=item *

A regression has been fixed that was introduced in perl 5.12, whereby
tainting errors were not correctly propagated through C<die()>.
[perl #111654]

=item *

A regression has been fixed that was introduced in perl 5.14, in which
C</[[:lower:]]/i> and C</[[:upper:]]/i> no longer matched the opposite case.
[perl #101970]

=back
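
The case-insensitive matching fixes listed above can be spot-checked
with a few one-line matches.  This is an illustrative sketch with
arbitrary variable names, not part of the Perl test suite.

```perl
use strict;
use warnings;

# Bracketed POSIX classes match the opposite case under /i.
my $lower_matches_upper = ("A" =~ /[[:lower:]]/i) ? 1 : 0;
my $upper_matches_lower = ("a" =~ /[[:upper:]]/i) ? 1 : 0;

# U+FB00 (LATIN SMALL LIGATURE FF) folds to "ff" under /i, so the
# match has to be found at the second position of the target string.
my $ligature_matches = ("f\x{FB00}" =~ /ff/i) ? 1 : 0;

print "lower/i vs 'A': $lower_matches_upper\n";
print "upper/i vs 'a': $upper_matches_lower\n";
print "ligature vs /ff/i: $ligature_matches\n";
```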

=head1 Acknowledgements

Perl 5.14.3 represents approximately 12 months of development since Perl 5.14.2
and contains approximately 2,300 lines of changes across 64 files from 22
authors.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers. The following people are known to have contributed the
improvements that became Perl 5.14.3:

Abigail, Andy Dougherty, Carl Hayter, Chris 'BinGOs' Williams, Dave Rolsky,
David Mitchell, Dominic Hargreaves, Father Chrysostomos, Florian Ragwitz,
H.Merijn Brand, Jilles Tjoelker, Karl Williamson, Leon Timmermans, Michael G
Schwern, Nicholas Clark, Niko Tyni, Pino Toscano, Ricardo Signes, Salvador
Fandiño, Samuel Thibault, Steve Hay, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated
from version control history. In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core. We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be able
to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for
security issues in the Perl core, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=head1 NAME
X<reference> X<pointer> X<data structure> X<structure> X<struct>

perlref - Perl references and nested data structures

=head1 NOTE

This is complete documentation about all aspects of references.
For a shorter, tutorial introduction to just the essential features,
see L<perlreftut>.

=head1 DESCRIPTION

Before release 5 of Perl it was difficult to represent complex data
structures, because all references had to be symbolic--and even then
it was difficult to refer to a variable instead of a symbol table entry.
Perl now not only makes it easier to use symbolic references to variables,
but also lets you have "hard" references to any piece of data or code.
Any scalar may hold a hard reference.  Because arrays and hashes contain
scalars, you can now easily build arrays of arrays, arrays of hashes,
hashes of arrays, arrays of hashes of functions, and so on.

Hard references are smart--they keep track of reference counts for you,
automatically freeing the thing referred to when its reference count goes
to zero.  (Reference counts for values in self-referential or
cyclic data structures may not go to zero without a little help; see
L</"Circular References"> for a detailed explanation.)
If that thing happens to be an object, the object is destructed.  See
L<perlobj> for more about objects.  (In a sense, everything in Perl is an
object, but we usually reserve the word for references to objects that
have been officially "blessed" into a class package.)

Symbolic references are names of variables or other objects, just as a
symbolic link in a Unix filesystem contains merely the name of a file.
The C<*glob> notation is something of a symbolic reference.  (Symbolic
references are sometimes called "soft references", but please don't call
them that; references are confusing enough without useless synonyms.)
X<reference, symbolic> X<reference, soft>
X<symbolic reference> X<soft reference>

In contrast, hard references are more like hard links in a Unix file
system: They are used to access an underlying object without concern for
what its (other) name is.  When the word "reference" is used without an
adjective, as in the following paragraph, it is usually talking about a
hard reference.
X<reference, hard> X<hard reference>

References are easy to use in Perl.  There is just one overriding
principle: in general, Perl does no implicit referencing or dereferencing.
When a scalar is holding a reference, it always behaves as a simple scalar.
It doesn't magically start being an array or hash or subroutine; you have to
tell it explicitly to do so, by dereferencing it.
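
A minimal sketch of that principle follows.  It uses dereferencing
syntax that is only explained later in this document, and the variable
names are arbitrary.

```perl
use strict;
use warnings;

my @colors = qw(red green blue);
my $ref    = \@colors;        # a scalar holding an array reference

# Used as a plain scalar, $ref is a single value of type ARRAY ref.
my $kind = ref($ref);

# To reach the array you have to dereference explicitly.
my $count = scalar @{$ref};   # number of elements in the array
my $first = $ref->[0];        # one element of the array

print "$kind reference, $count elements, first is $first\n";
```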

=head2 Making References
X<reference, creation> X<referencing>

References can be created in several ways.

=over 4

=item 1.
X<\> X<backslash>

By using the backslash operator on a variable, subroutine, or value.
(This works much like the & (address-of) operator in C.)
This typically creates I<another> reference to a variable, because
there's already a reference to the variable in the symbol table.  But
the symbol table reference might go away, and you'll still have the
reference that the backslash returned.  Here are some examples:

    $scalarref = \$foo;
    $arrayref  = \@ARGV;
    $hashref   = \%ENV;
    $coderef   = \&handler;
    $globref   = \*foo;

It isn't possible to create a true reference to an IO handle (filehandle
or dirhandle) using the backslash operator.  The most you can get is a
reference to a typeglob, which is actually a complete symbol table entry.
But see the explanation of the C<*foo{THING}> syntax below.  However,
you can still use type globs and globrefs as though they were IO handles.
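
For instance, a globref can be passed to a subroutine and used wherever a
filehandle is expected.  This is a small sketch using the standard
C<STDOUT> handle; the helper C<say_to> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: print one line to whatever handle it is given.
sub say_to {
    my ($handle, $message) = @_;
    return print {$handle} "$message\n";
}

my $globref = \*STDOUT;            # reference to the typeglob, not the IO handle

say_to($globref, "via globref");   # a globref works as a filehandle
say_to(*STDOUT,  "via typeglob");  # so does the typeglob itself
```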

=item 2.
X<array, anonymous> X<[> X<[]> X<square bracket>
X<bracket, square> X<arrayref> X<array reference> X<reference, array>

A reference to an anonymous array can be created using square
brackets:

    $arrayref = [1, 2, ['a', 'b', 'c']];

Here we've created a reference to an anonymous array of three elements
whose final element is itself a reference to another anonymous array of three
elements.  (The multidimensional syntax described later can be used to
access this.  For example, after the above, C<< $arrayref->[2][1] >> would have
the value "b".)
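
As a quick check of that structure (a sketch only; the extra variable
names are made up):

```perl
use strict;
use warnings;

my $arrayref = [1, 2, ['a', 'b', 'c']];

my $outer_count = scalar @{$arrayref};       # elements in the outer array
my $inner       = $arrayref->[2];            # itself an array reference
my $inner_count = scalar @{$inner};          # elements in the inner array
my $second      = $arrayref->[2][1];         # second element of the inner array

print "outer: $outer_count, inner: $inner_count, picked: $second\n";
```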

Taking a reference to an enumerated list is not the same
as using square brackets--instead it's the same as creating
a list of references!

    @list = (\$a, \@b, \%c);
    @list = \($a, @b, %c);      # same thing!

As a special case, C<\(@foo)> returns a list of references to the contents
of C<@foo>, not a reference to C<@foo> itself.  Likewise for C<%foo>,
except that the key references are to copies (since the keys are just
strings rather than full-fledged scalars).
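
The difference between C<\(@foo)> and C<\@foo> can be seen in a short
sketch (variable names are illustrative):

```perl
use strict;
use warnings;

my @foo = (1, 2, 3);

my @refs = \(@foo);   # one scalar reference per element of @foo
my $aref = \@foo;     # a single reference to @foo itself

${ $refs[0] } = 10;   # writes through to $foo[0]

print scalar(@refs), " element refs of type ", ref($refs[0]), "\n";
print "array ref of type ", ref($aref), ", first element now $foo[0]\n";
```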

=item 3.
X<hash, anonymous> X<{> X<{}> X<curly bracket>
X<bracket, curly> X<brace> X<hashref> X<hash reference> X<reference, hash>

A reference to an anonymous hash can be created using curly
brackets:

    $hashref = {
        'Adam'  => 'Eve',
        'Clyde' => 'Bonnie',
    };

Anonymous hash and array composers like these can be intermixed freely to
produce as complicated a structure as you want.  The multidimensional
syntax described below works for these too.  The values above are
literals, but variables and expressions would work just as well, because
assignment operators in Perl (even within local() or my()) are executable
statements, not compile-time declarations.
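
A small sketch of mixing the two composers, with values computed at run
time rather than written as literals (all names here are invented for
the example):

```perl
use strict;
use warnings;

my $base = 10;

# Anonymous hash whose values are expressions, an anonymous array,
# and another anonymous hash.
my $config = {
    name   => 'demo',
    limits => [ $base, $base * 2 ],
    owner  => { uid => 1000 + 1 },
};

print "upper limit: $config->{limits}[1]\n";
print "owner uid:   $config->{owner}{uid}\n";
```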

Because curly brackets (braces) are used for several other things
including BLOCKs, you may occasionally have to disambiguate braces at the
beginning of a statement by putting a C<+> or a C<return> in front so
that Perl realizes the opening brace isn't starting a BLOCK.  The economy and
mnemonic value of using curlies is deemed worth this occasional extra
hassle.

For example, if you wanted a function to make a new hash and return a
reference to it, you have these options:

    sub hashem {        { @_ } }   # silently wrong
    sub hashem {       +{ @_ } }   # ok
    sub hashem { return { @_ } }   # ok

On the other hand, if you want the other meaning, you can do this:

    sub showem {        { @_ } }   # ambiguous (currently ok,
                                   # but may change)
    sub showem {       {; @_ } }   # ok
    sub showem { { return @_ } }   # ok

The leading C<+{> and C<{;> always serve to disambiguate
the expression to mean either the HASH reference, or the BLOCK.
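
The unambiguous forms can be compared side by side.  In this sketch the
subroutine names are variations on the ones above, chosen only so that
all three can live in one file:

```perl
use strict;
use warnings;

sub hashem_plus   {       +{ @_ } }   # unary plus forces an anonymous hash
sub hashem_return { return { @_ } }   # so does an explicit return
sub showem        { { return @_ } }   # here the inner braces are a BLOCK

my $h1   = hashem_plus(a => 1, b => 2);
my $h2   = hashem_return(a => 1, b => 2);
my @list = showem(a => 1, b => 2);

print ref($h1), " ", ref($h2), " ", scalar(@list), "\n";
```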

=item 4.
X<subroutine, anonymous> X<subroutine, reference> X<reference, subroutine>
X<scope, lexical> X<closure> X<lexical> X<lexical scope>

A reference to an anonymous subroutine can be created by using
C<sub> without a subname:

    $coderef = sub { print "Boink!\n" };

Note the semicolon.  Except for the code
inside not being immediately executed, a C<sub {}> is not so much a
declaration as it is an operator, like C<do{}> or C<eval{}>.  (However, no
matter how many times you execute that particular line (unless you're in an
C<eval("...")>), $coderef will still have a reference to the I<same>
anonymous subroutine.)

Anonymous subroutines act as closures with respect to my() variables,
that is, variables lexically visible within the current scope.  Closure
is a notion out of the Lisp world that says if you define an anonymous
function in a particular lexical context, it pretends to run in that
context even when it's called outside the context.

In human terms, it's a funny way of passing arguments to a subroutine when
you define it as well as when you call it.  It's useful for setting up
little bits of code to run later, such as callbacks.  You can even
do object-oriented stuff with it, though Perl already provides a different
mechanism to do that--see L<perlobj>.

You might also think of closure as a way to write a subroutine
template without using eval().  Here's a small example of how
closures work:

    sub newprint {
        my $x = shift;
        return sub { my $y = shift; print "$x, $y!\n"; };
    }
    $h = newprint("Howdy");
    $g = newprint("Greetings");

    # Time passes...

    &$h("world");
    &$g("earthlings");

This prints

    Howdy, world!
    Greetings, earthlings!

Note particularly that $x continues to refer to the value passed
into newprint() I<despite> "my $x" having gone out of scope by the
time the anonymous subroutine runs.  That's what a closure is all
about.

This applies only to lexical variables, by the way.  Dynamic variables
continue to work as they have always worked.  Closure is not something
that most Perl programmers need trouble themselves about to begin with.
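
If you do want to trouble yourself, here is one more sketch.  The
make_counter() name is made up for this illustration and is not part of
Perl.  Each call creates a fresh $count, so the two closures below keep
separate state:

    use strict;
    use warnings;

    # make_counter() is a hypothetical helper for this example only.
    sub make_counter {
        my $count = shift // 0;           # private to this one call
        return sub { return $count++ };   # closure captures $count
    }

    my $from_zero = make_counter();
    my $from_ten  = make_counter(10);

    print $from_zero->(), " ", $from_zero->(), "\n";   # 0 1
    print $from_ten->(),  " ", $from_zero->(), "\n";   # 10 2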

=item 5.
X<constructor> X<new>

References are often returned by special subroutines called constructors.  Perl
objects are just references to a special type of object that happens to know
which package it's associated with.  Constructors are just special subroutines
that know how to create that association.  They do so by starting with an
ordinary reference, and it remains an ordinary reference even while it's also
being an object.  Constructors are often named C<new()>.  You I<can> call them
indirectly:

    $objref = new Doggie( Tail => 'short', Ears => 'long' );

But that can produce ambiguous syntax in certain cases, so it's often
better to use the direct method invocation approach:

    $objref   = Doggie->new(Tail => 'short', Ears => 'long');

    use Term::Cap;
    $terminal = Term::Cap->Tgetent( { OSPEED => 9600 });

    use Tk;
    $main    = MainWindow->new();
    $menubar = $main->Frame(-relief              => "raised",
                            -borderwidth         => 2);
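
For the curious, here is roughly what such a constructor might look like
on the inside.  The Doggie class and its default attributes are invented
for this sketch; see L<perlobj> for the real story on bless():

    use strict;
    use warnings;

    package Doggie;

    # A hypothetical constructor: start with an ordinary hash
    # reference, then associate it with the class via bless().
    sub new {
        my ($class, %args) = @_;
        my $self = { Tail => 'long', Ears => 'short', %args };
        return bless $self, $class;
    }

    package main;

    my $objref = Doggie->new(Tail => 'short', Ears => 'long');
    print ref($objref), "\n";       # Doggie
    print $objref->{Tail}, "\n";    # short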

=item 6.
X<autovivification>

References of the appropriate type can spring into existence if you
dereference them in a context that assumes they exist.  Because we haven't
talked about dereferencing yet, we can't show you any examples yet.

=item 7.
X<*foo{THING}> X<*>

A reference can be created by using a special syntax, lovingly known as
the *foo{THING} syntax.  *foo{THING} returns a reference to the THING
slot in *foo (which is the symbol table entry which holds everything
known as foo).

    $scalarref = *foo{SCALAR};
    $arrayref  = *ARGV{ARRAY};
    $hashref   = *ENV{HASH};
    $coderef   = *handler{CODE};
    $ioref     = *STDIN{IO};
    $globref   = *foo{GLOB};
    $formatref = *foo{FORMAT};
    $globname  = *foo{NAME};    # "foo"
    $pkgname   = *foo{PACKAGE}; # "main"

Most of these are self-explanatory, but C<*foo{IO}>
deserves special attention.  It returns
the IO handle, used for file handles (L<perlfunc/open>), sockets
(L<perlfunc/socket> and L<perlfunc/socketpair>), and directory
handles (L<perlfunc/opendir>).  For compatibility with previous
versions of Perl, C<*foo{FILEHANDLE}> is a synonym for C<*foo{IO}>, though it
is discouraged, to encourage a consistent use of one name: IO.  On perls
between v5.8 and v5.22, it will issue a deprecation warning, but this
deprecation has since been rescinded.

C<*foo{THING}> returns undef if that particular THING hasn't been used yet,
except in the case of scalars.  C<*foo{SCALAR}> returns a reference to an
anonymous scalar if $foo hasn't been used yet.  This might change in a
future release.

C<*foo{NAME}> and C<*foo{PACKAGE}> are the exception, in that they return
strings, rather than references.  These return the package and name of the
typeglob itself, rather than one that has been assigned to it.  So, after
C<*foo=*Foo::bar>, C<*foo> will become "*Foo::bar" when used as a string,
but C<*foo{PACKAGE}> and C<*foo{NAME}> will continue to produce "main" and
"foo", respectively.

C<*foo{IO}> is an alternative to the C<*HANDLE> mechanism given in
L<perldata/"Typeglobs and Filehandles"> for passing filehandles
into or out of subroutines, or storing into larger data structures.
Its disadvantage is that it won't create a new filehandle for you.
Its advantage is that you have less risk of clobbering more than
you want to with a typeglob assignment.  (It still conflates file
and directory handles, though.)  However, if you assign the incoming
value to a scalar instead of a typeglob as we do in the examples
below, there's no risk of that happening.

    splutter(*STDOUT);          # pass the whole glob
    splutter(*STDOUT{IO});      # pass both file and dir handles

    sub splutter {
        my $fh = shift;
        print $fh "her um well a hmmm\n";
    }

    $rec = get_rec(*STDIN);     # pass the whole glob
    $rec = get_rec(*STDIN{IO}); # pass both file and dir handles

    sub get_rec {
        my $fh = shift;
        return scalar <$fh>;
    }

=back

=head2 Using References
X<reference, use> X<dereferencing> X<dereference>

That's it for creating references.  By now you're probably dying to
know how to use references to get back to your long-lost data.  There
are several basic methods.

=over 4

=item 1.

Anywhere you'd put an identifier (or chain of identifiers) as part
of a variable or subroutine name, you can replace the identifier with
a simple scalar variable containing a reference of the correct type:

    $bar = $$scalarref;
    push(@$arrayref, $filename);
    $$arrayref[0] = "January";
    $$hashref{"KEY"} = "VALUE";
    &$coderef(1,2,3);
    print $globref "output\n";

It's important to understand that we are specifically I<not> dereferencing
C<$arrayref[0]> or C<$hashref{"KEY"}> there.  The dereference of the
scalar variable happens I<before> it does any key lookups.  Anything more
complicated than a simple scalar variable must use methods 2 or 3 below.
However, a "simple scalar" includes an identifier that itself uses method
1 recursively.  Therefore, the following prints "howdy".

    $refrefref = \\\"howdy";
    print $$$$refrefref;

=item 2.

Anywhere you'd put an identifier (or chain of identifiers) as part of a
variable or subroutine name, you can replace the identifier with a
BLOCK returning a reference of the correct type.  In other words, the
previous examples could be written like this:

    $bar = ${$scalarref};
    push(@{$arrayref}, $filename);
    ${$arrayref}[0] = "January";
    ${$hashref}{"KEY"} = "VALUE";
    &{$coderef}(1,2,3);
    $globref->print("output\n");  # iff IO::Handle is loaded

Admittedly, it's a little silly to use the curlies in this case, but
the BLOCK can contain any arbitrary expression, in particular,
subscripted expressions:

    &{ $dispatch{$index} }(1,2,3);      # call correct routine

Because of being able to omit the curlies for the simple case of C<$$x>,
people often make the mistake of viewing the dereferencing symbols as
proper operators, and wonder about their precedence.  If they were,
though, you could use parentheses instead of braces.  That's not the case.
Consider the difference below; case 0 is a short-hand version of case 1,
I<not> case 2:

    $$hashref{"KEY"}   = "VALUE";       # CASE 0
    ${$hashref}{"KEY"} = "VALUE";       # CASE 1
    ${$hashref{"KEY"}} = "VALUE";       # CASE 2
    ${$hashref->{"KEY"}} = "VALUE";     # CASE 3

Case 2 is also deceptive in that you're accessing a variable
called %hashref, not dereferencing through $hashref to the hash
it's presumably referencing.  That would be case 3.
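
A small sketch may make the difference concrete.  It assumes two
unrelated variables that merely share a name, $hashref and %hashref,
both invented for the illustration:

    use strict;
    use warnings;

    my $hashref = { KEY => "from the referenced hash" };
    my %hashref = ( KEY => \"from the hash named %hashref" );

    print $$hashref{"KEY"},   "\n";   # CASE 0: looks in %$hashref
    print ${$hashref}{"KEY"}, "\n";   # CASE 1: same as case 0
    print ${$hashref{"KEY"}}, "\n";   # CASE 2: looks in %hashref,
                                      # then dereferences that value

The first two lines print "from the referenced hash"; the third prints
"from the hash named %hashref".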

=item 3.

Subroutine calls and lookups of individual array elements arise often
enough that it gets cumbersome to use method 2.  As a form of
syntactic sugar, the examples for method 2 may be written:

    $arrayref->[0] = "January";   # Array element
    $hashref->{"KEY"} = "VALUE";  # Hash element
    $coderef->(1,2,3);            # Subroutine call

The left side of the arrow can be any expression returning a reference,
including a previous dereference.  Note that C<$array[$x]> is I<not> the
same thing as C<< $array->[$x] >> here:

    $array[$x]->{"foo"}->[0] = "January";

This is one of the cases we mentioned earlier in which references could
spring into existence when in an lvalue context.  Before this
statement, C<$array[$x]> may have been undefined.  If so, it's
automatically defined with a hash reference so that we can look up
C<{"foo"}> in it.  Likewise C<< $array[$x]->{"foo"} >> will automatically get
defined with an array reference so that we can look up C<[0]> in it.
This process is called I<autovivification>.

One more thing here.  The arrow is optional I<between> bracketed
subscripts, so you can shrink the above down to

    $array[$x]{"foo"}[0] = "January";

Which, in the degenerate case of using only ordinary arrays, gives you
multidimensional arrays just like C's:

    $score[$x][$y][$z] += 42;

Well, okay, not entirely like C's arrays, actually.  C doesn't know how
to grow its arrays on demand.  Perl does.
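
Here is a short sketch of autovivification at work, using throwaway
variables named only for this example.  Note that the intermediate
levels spring into existence even when you are merely looking something
up:

    use strict;
    use warnings;

    my @array;
    $array[2]{"foo"}[0] = "January";

    print ref($array[2]), "\n";          # HASH
    print ref($array[2]{"foo"}), "\n";   # ARRAY
    print scalar(@array), "\n";          # 3

    my %h;
    my $found = exists $h{x}{y};         # only a lookup, and false...
    print exists $h{x} ? "vivified\n" : "untouched\n";   # vivified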

=item 4.

If a reference happens to be a reference to an object, then there are
probably methods to access the things referred to, and you should probably
stick to those methods unless you're in the class package that defines the
object's methods.  In other words, be nice, and don't violate the object's
encapsulation without a very good reason.  Perl does not enforce
encapsulation.  We are not totalitarians here.  We do expect some basic
civility though.

=back

Using a string or number as a reference produces a symbolic reference,
as explained above.  Using a reference as a number produces an
integer representing its storage location in memory.  The only
useful thing to be done with this is to compare two references
numerically to see whether they refer to the same location.
X<reference, numeric context>

    if ($ref1 == $ref2) {  # cheap numeric compare of references
        print "refs 1 and 2 refer to the same thing\n";
    }

Using a reference as a string produces both its referent's type,
including any package blessing as described in L<perlobj>, as well
as the numeric address expressed in hex.  The ref() operator returns
just the type of thing the reference is pointing to, without the
address.  See L<perlfunc/ref> for details and examples of its use.
X<reference, string context>
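
The following sketch pulls the numeric, string, and ref() views
together.  The variable names are arbitrary, and the address in the
string form will differ from run to run, so the example only checks its
shape:

    use strict;
    use warnings;

    my $aref = [ 1, 2, 3 ];
    my $copy = $aref;          # a second reference to the same array
    my $twin = [ 1, 2, 3 ];    # equal contents, but a different array

    print "same\n"      if $aref == $copy;    # printed
    print "different\n" if $aref != $twin;    # printed
    print ref($aref), "\n";                   # ARRAY
    print "type plus hex address\n"
        if "$aref" =~ /^ARRAY\(0x[0-9a-f]+\)$/;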

The bless() operator may be used to associate the object a reference
points to with a package functioning as an object class.  See L<perlobj>.

A typeglob may be dereferenced the same way a reference can, because
the dereference syntax always indicates the type of reference desired.
So C<${*foo}> and C<${\$foo}> both indicate the same scalar variable.

Here's a trick for interpolating a subroutine call into a string:

    print "My sub returned @{[mysub(1,2,3)]} that time.\n";

The way it works is that when the C<@{...}> is seen in the double-quoted
string, it's evaluated as a block.  The block creates a reference to an
anonymous array containing the results of the call to C<mysub(1,2,3)>.  So
the whole block returns a reference to an array, which is then
dereferenced by C<@{...}> and stuck into the double-quoted string. This
chicanery is also useful for arbitrary expressions:

    print "That yields @{[$n + 5]} widgets\n";

Similarly, an expression that returns a reference to a scalar can be
dereferenced via C<${...}>. Thus, the above expression may be written
as:

    print "That yields ${\($n + 5)} widgets\n";
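
As a further sketch with made-up data, the array form suits anything
that returns a list, while the scalar form suits a single value:

    use strict;
    use warnings;

    my @t = (3, 1, 2);
    print "Reversed: @{[ reverse @t ]}\n";   # Reversed: 2 1 3
    print "Count: ${\ scalar(@t) }\n";       # Count: 3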

=head2 Circular References
X<circular reference> X<reference, circular>

It is possible to create a "circular reference" in Perl, which can lead
to memory leaks. A circular reference occurs when two references
contain a reference to each other, like this:

    my $foo = {};
    my $bar = { foo => $foo };
    $foo->{bar} = $bar;

You can also create a circular reference with a single variable:

    my $foo;
    $foo = \$foo;

In this case, the reference count for the variables will never reach 0,
and the references will never be garbage-collected. This can lead to
memory leaks.

Because objects in Perl are implemented as references, it's possible to
have circular references with objects as well. Imagine a TreeNode class
where each node references its parent and child nodes. Any node with a
parent will be part of a circular reference.

You can break circular references by creating a "weak reference". A
weak reference does not increment the reference count for a variable,
which means that the object can go out of scope and be destroyed. You
can weaken a reference with the C<weaken> function exported by the
L<Scalar::Util> module.

Here's how we can make the first example safer:

    use Scalar::Util 'weaken';

    my $foo = {};
    my $bar = { foo => $foo };
    $foo->{bar} = $bar;

    weaken $foo->{bar};

The reference from C<$foo> to C<$bar> has been weakened. When the
C<$bar> variable goes out of scope, it will be garbage-collected. The
next time you look at the value of the C<< $foo->{bar} >> key, it will
be C<undef>.

This action at a distance can be confusing, so you should be careful
with your use of weaken. You should weaken the reference in the
variable that will go out of scope I<first>. That way, the longer-lived
variable will contain the expected reference until it goes out of
scope.
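
Applied to the tree situation described earlier, that advice might look
like the sketch below.  The hash layout (name, parent, children) is
invented for the example, and isweak() is another function exported by
L<Scalar::Util>.  The child's link back to its parent is the one that
gets weakened, so the parent keeps its children alive but not the other
way around:

    use strict;
    use warnings;
    use Scalar::Util qw(weaken isweak);

    my $parent = { name => "root", children => [] };
    {
        my $child = { name => "leaf", parent => $parent };
        push @{ $parent->{children} }, $child;
        weaken $child->{parent};    # back-link no longer counts
        print isweak($child->{parent}) ? "weak\n" : "strong\n";  # weak
    }

    my $leaf = $parent->{children}[0];   # keep the child alive
    undef $parent;                       # last strong ref to the root

    print defined $leaf->{parent} ? "parent alive\n"
                                  : "parent gone\n";   # parent gone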

=head2 Symbolic references
X<reference, symbolic> X<reference, soft>
X<symbolic reference> X<soft reference>

We said that references spring into existence as necessary if they are
undefined, but we didn't say what happens if a value used as a
reference is already defined, but I<isn't> a hard reference.  If you
use it as a reference, it'll be treated as a symbolic
reference.  That is, the value of the scalar is taken to be the I<name>
of a variable, rather than a direct link to a (possibly) anonymous
value.

People frequently expect it to work like this.  So it does.

    $name = "foo";
    $$name = 1;                 # Sets $foo
    ${$name} = 2;               # Sets $foo
    ${$name x 2} = 3;           # Sets $foofoo
    $name->[0] = 4;             # Sets $foo[0]
    @$name = ();                # Clears @foo
    &$name();                   # Calls &foo()
    $pack = "THAT";
    ${"${pack}::$name"} = 5;    # Sets $THAT::foo without eval

This is powerful, and slightly dangerous, in that it's possible
to intend (with the utmost sincerity) to use a hard reference, and
accidentally use a symbolic reference instead.  To protect against
that, you can say

    use strict 'refs';

and then only hard references will be allowed for the rest of the enclosing
block.  An inner block may countermand that with

    no strict 'refs';

Only package variables (globals, even if localized) are visible to
symbolic references.  Lexical variables (declared with my()) aren't in
a symbol table, and thus are invisible to this mechanism.  For example:

    local $value = 10;
    $ref = "value";
    {
        my $value = 20;
        print $$ref;
    }

This will still print 10, not 20.  Remember that local() affects package
variables, which are all "global" to the package.
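
To see the protection in action, here is a sketch that traps the error
with eval.  The variable names are arbitrary, and the exact wording of
the message may vary between Perl versions:

    use strict;
    use warnings;

    our $foo = "package variable";
    my $name = "foo";

    # Under strict refs, using a string as a reference is a
    # run-time error, so eval returns undef here.
    my $ok = eval { my $v = $$name; 1 };
    print "blocked: $@" unless $ok;

    {
        no strict 'refs';           # countermand for this block only
        print ${$name}, "\n";       # package variable
    }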

=head2 Not-so-symbolic references

Brackets around a symbolic reference can simply
serve to isolate an identifier or variable name from the rest of an
expression, just as they always have within a string.  For example,

    $push = "pop on ";
    print "${push}over";

has always meant to print "pop on over", even though push is
a reserved word.  This is generalized to work the same
without the enclosing double quotes, so that

    print ${push} . "over";

and even

    print ${ push } . "over";

will have the same effect.  This
construct is I<not> considered to be a symbolic reference when you're
using strict refs:

    use strict 'refs';
    ${ bareword };      # Okay, means $bareword.
    ${ "bareword" };    # Error, symbolic reference.

Similarly, because of all the subscripting that is done using single words,
the same rule applies to any bareword that is used for subscripting a hash.
So now, instead of writing

    $array{ "aaa" }{ "bbb" }{ "ccc" }

you can write just

    $array{ aaa }{ bbb }{ ccc }

and not worry about whether the subscripts are reserved words.  In the
rare event that you do wish to do something like

    $array{ shift }

you can force interpretation as a reserved word by adding anything that
makes it more than a bareword:

    $array{ shift() }
    $array{ +shift }
    $array{ shift @_ }

The C<use warnings> pragma or the B<-w> switch will warn you if it
interprets a reserved word as a string.
But it will no longer warn you about using lowercase words, because the
string is effectively quoted.
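
The difference is easy to demonstrate.  In this sketch the hash and the
lookup() routine are invented for the illustration:

    use strict;

    my %h = ( shift => "literal key", first => "from the arguments" );

    sub lookup {
        # $h{shift} uses the string "shift" as the key;
        # $h{ shift() } calls shift on @_ and uses the result.
        return ( $h{shift}, $h{ shift() } );
    }

    print join(", ", lookup("first")), "\n";
    # literal key, from the arguments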

=head2 Pseudo-hashes: Using an array as a hash
X<pseudo-hash> X<pseudo hash> X<pseudohash>

Pseudo-hashes have been removed from Perl.  The 'fields' pragma
remains available.

=head2 Function Templates
X<scope, lexical> X<closure> X<lexical> X<lexical scope>
X<subroutine, nested> X<sub, nested> X<subroutine, local> X<sub, local>

As explained above, an anonymous function with access to the lexical
variables visible when that function was compiled, creates a closure.  It
retains access to those variables even though it doesn't get run until
later, such as in a signal handler or a Tk callback.

Using a closure as a function template allows us to generate many functions
that act similarly.  Suppose you wanted functions named after the colors
that generated HTML font changes for the various colors:

    print "Be ", red("careful"), " with that ", green("light");

The red() and green() functions would be similar.  To create these,
we'll assign a closure to a typeglob of the name of the function we're
trying to build.

    @colors = qw(red blue green yellow orange purple violet);
    for my $name (@colors) {
        no strict 'refs';       # allow symbol table manipulation
        *$name = *{uc $name} = sub { "<FONT COLOR='$name'>@_</FONT>" };
    }

Now all those different functions appear to exist independently.  You can
call red(), RED(), blue(), BLUE(), green(), etc.  This technique saves on
both compile time and memory use, and is less error-prone as well, since
syntax checks happen at compile time.  It's critical that any variables in
the anonymous subroutine be lexicals in order to create a proper closure.
That's the reason for the C<my> on the loop iteration variable.

This is one of the only places where giving a prototype to a closure makes
much sense.  If you wanted to impose scalar context on the arguments of
these functions (probably not a wise idea for this particular example),
you could have written it this way instead:

    *$name = sub ($) { "<FONT COLOR='$name'>$_[0]</FONT>" };

However, since prototype checking happens at compile time, the assignment
above happens too late to be of much use.  You could address this by
putting the whole loop of assignments within a BEGIN block, forcing it
to occur during compilation.
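
A minimal sketch of that arrangement, trimmed to two colors for brevity,
might look like this.  Because the subs exist (with their prototypes) by
the time the rest of the file is compiled, they can be called like unary
operators:

    use strict;
    use warnings;

    BEGIN {
        for my $name (qw(red green)) {
            no strict 'refs';       # allow symbol table manipulation
            *$name = sub ($) { "<FONT COLOR='$name'>$_[0]</FONT>" };
        }
    }

    # Thanks to the ($) prototype, each call takes one argument.
    my @out = (red "stop", green "go");
    print scalar(@out), "\n";       # 2
    print "$out[1]\n";              # <FONT COLOR='green'>go</FONT>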

Access to lexicals that change over time--like those in the C<for> loop
above, basically aliases to elements from the surrounding lexical scopes--
only works with anonymous subs, not with named subroutines. Generally
said, named subroutines do not nest properly and should only be declared
in the main package scope.

This is because named subroutines are created at compile time so their
lexical variables get assigned to the parent lexicals from the first
execution of the parent block. If a parent scope is entered a second
time, its lexicals are created again, while the nested subs still
reference the old ones.

Anonymous subroutines get to capture each time you execute the C<sub>
operator, as they are created on the fly. If you are accustomed to using
nested subroutines in other programming languages with their own private
variables, you'll have to work at it a bit in Perl.  The intuitive coding
of this type of thing incurs mysterious warnings about "will not stay
shared" due to the reasons explained above.
For example, this won't work:

    sub outer {
        my $x = $_[0] + 35;
        sub inner { return $x * 19 }   # WRONG
        return $x + inner();
    }

A work-around is the following:

    sub outer {
        my $x = $_[0] + 35;
        local *inner = sub { return $x * 19 };
        return $x + inner();
    }

Now inner() can only be called from within outer(), because of the
temporary assignments of the anonymous subroutine. But when it does,
it has normal access to the lexical variable $x from the scope of
outer() at the time outer is invoked.

This has the interesting effect of creating a function local to another
function, something not normally supported in Perl.
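
If you don't need the helper to have a name in the symbol table, a
variation (not taken from the text above) is to keep the anonymous
subroutine in a lexical variable and call it through the arrow.  This
sketch reuses the same arithmetic:

    use strict;
    use warnings;

    sub outer {
        my $x = $_[0] + 35;
        my $inner = sub { return $x * 19 };   # captures this call's $x
        return $x + $inner->();
    }

    print outer(1), "\n";   # 720
    print outer(5), "\n";   # 800

Each call to outer() builds a new closure over its own $x, so the
second call does not see the first call's value.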

=head1 WARNING: Don't use references as hash keys
X<reference, string context> X<reference, use as hash key>

You may not (usefully) use a reference as the key to a hash.  It will be
converted into a string:

    $x{ \$a } = $a;

If you try to dereference the key, it won't do a hard dereference, and
you won't accomplish what you're attempting.  You might want to do something
more like

    $r = \@a;
    $x{ $r } = $r;

And then at least you can use the values(), which will be
real refs, instead of the keys(), which won't.

The standard Tie::RefHash module provides a convenient workaround to this.
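
A brief sketch of the difference, using throwaway variable names:

    use strict;
    use warnings;
    use Tie::RefHash;

    my $r = [ 1, 2, 3 ];

    my %plain = ( $r => "payload" );
    my ($string_key) = keys %plain;
    print ref($string_key) ? "ref" : "plain string", "\n";
    # plain string

    tie my %x, 'Tie::RefHash';
    $x{$r} = "payload";
    my ($ref_key) = keys %x;
    print ref($ref_key), " with ", scalar(@$ref_key), " elements\n";
    # ARRAY with 3 elements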

=head2 Postfix Dereference Syntax

Beginning in v5.20.0, a postfix syntax for using references is
available.  It behaves as described in L</Using References>, but instead
of a prefixed sigil, a postfixed sigil-and-star is used.

For example:

    $r = \@a;
    @b = $r->@*; # equivalent to @$r or @{ $r }

    $r = [ 1, [ 2, 3 ], 4 ];
    $r->[1]->@*;  # equivalent to @{ $r->[1] }

In Perl 5.20 and 5.22, this syntax must be enabled with C<use feature
'postderef'>. As of Perl 5.24, no feature declarations are required to make
it available.

Postfix dereference should work in all circumstances where block
(circumfix) dereference worked, and should be entirely equivalent.  This
syntax allows dereferencing to be written and read entirely
left-to-right.  The following equivalencies are defined:

  $sref->$*;  # same as  ${ $sref }
  $aref->@*;  # same as  @{ $aref }
  $aref->$#*; # same as $#{ $aref }
  $href->%*;  # same as  %{ $href }
  $cref->&*;  # same as  &{ $cref }
  $gref->**;  # same as  *{ $gref }

Note especially that C<< $cref->&* >> is I<not> equivalent to C<<
$cref->() >>, and can serve different purposes.

Glob elements can be extracted through the postfix dereferencing feature:

  $gref->*{SCALAR}; # same as *{ $gref }{SCALAR}

Postfix array and scalar dereferencing I<can> be used in interpolating
strings (double quotes or the C<qq> operator), but only if the
C<postderef_qq> feature is enabled.
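
Here is a short sketch of the postfix forms in use.  It assumes Perl
5.24 or later, where C<use v5.24> also turns on C<say> and the
C<postderef_qq> feature; the data structure is made up:

    use v5.24;

    my $r = { list => [ 10, 20, 30 ] };

    say scalar $r->{list}->@*;        # 3
    say $r->{list}->$#*;              # 2
    say join ",", sort keys $r->%*;   # list
    say "items: $r->{list}->@*";      # items: 10 20 30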

=head2 Postfix Reference Slicing

Value slices of arrays and hashes may also be taken with postfix
dereferencing notation, with the following equivalencies:

  $aref->@[ ... ];  # same as @$aref[ ... ]
  $href->@{ ... };  # same as @$href{ ... }

Postfix key/value pair slicing, added in 5.20.0 and documented in
L<the KeyE<sol>Value Hash Slices section of perldata|perldata/"Key/Value Hash
Slices">, also behaves as expected:

  $aref->%[ ... ];  # same as %$aref[ ... ]
  $href->%{ ... };  # same as %$href{ ... }

As with postfix array, postfix value slice dereferencing I<can> be used
in interpolating strings (double quotes or the C<qq> operator), but only
if the C<postderef_qq> L<feature> is enabled.
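
The slices can be sketched like so, again assuming Perl 5.24 or later
and some made-up data:

    use v5.24;

    my $aref = [ qw(a b c d) ];
    my $href = { one => 1, two => 2, three => 3 };

    say join " ", $aref->@[ 1, 2 ];          # b c
    say join " ", $href->@{ qw(one two) };   # 1 2

    my %some = $href->%{ qw(one three) };    # key/value slice
    say join " ", map { "$_=$some{$_}" } sort keys %some;
    # one=1 three=3

    my %byidx = $aref->%[ 0, 3 ];            # index/value slice
    say join " ", map { "$_=$byidx{$_}" } sort keys %byidx;
    # 0=a 3=d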

=head2 Assigning to References

Beginning in v5.22.0, the referencing operator can be assigned to.  It
performs an aliasing operation, so that the variable name referenced on the
left-hand side becomes an alias for the thing referenced on the right-hand
side:

    \$a = \$b; # $a and $b now point to the same scalar
    \&foo = \&bar; # foo() now means bar()

This syntax must be enabled with C<use feature 'refaliasing'>.  It is
experimental, and will warn by default unless C<no warnings
'experimental::refaliasing'> is in effect.
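
A minimal sketch, assuming Perl 5.22 or later and using names invented
for the example:

    use v5.22;
    use feature 'refaliasing';
    no warnings 'experimental::refaliasing';

    my @data = (1, 2, 3);
    \my @alias = \@data;          # @alias is another name for @data
    push @alias, 4;
    say "@data";                  # 1 2 3 4

    my %totals = (apples => 1);
    \my $n = \$totals{apples};    # alias a single hash element
    $n += 5;
    say $totals{apples};          # 6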

These forms may be assigned to, and cause the right-hand side to be
evaluated in scalar context:

    \$scalar
    \@array
    \%hash
    \&sub
    \my $scalar
    \my @array
    \my %hash
    \state $scalar # or @array, etc.
    \our $scalar   # etc.
    \local $scalar # etc.
    \local our $scalar # etc.
    \$some_array[$index]
    \$some_hash{$key}
    \local $some_array[$index]
    \local $some_hash{$key}
    condition ? \$this : \$that[0] # etc.

Slicing operations and parentheses cause
the right-hand side to be evaluated in
list context:

    \@array[5..7]
    (\@array[5..7])
    \(@array[5..7])
    \@hash{'foo','bar'}
    (\@hash{'foo','bar'})
    \(@hash{'foo','bar'})
    (\$scalar)
    \($scalar)
    \(my $scalar)
    \my($scalar)
    (\@array)
    (\%hash)
    (\&sub)
    \(&sub)
    \($foo, @bar, %baz)
    (\$foo, \@bar, \%baz)

Each element on the right-hand side must be a reference to a datum of the
right type.  Parentheses immediately surrounding an array (and possibly
also C<my>/C<state>/C<our>/C<local>) will make each element of the array an
alias to the corresponding scalar referenced on the right-hand side:

    \(@a) = \(@b); # @a and @b now have the same elements
    \my(@a) = \(@b); # likewise
    \(my @a) = \(@b); # likewise
    push @a, 3; # but now @a has an extra element that @b lacks
    \(@a) = (\$a, \$b, \$c); # @a now contains $a, $b, and $c

Combining that form with C<local> and putting parentheses immediately
around a hash are forbidden (because it is not clear what they should do):

    \local(@array) = foo(); # WRONG
    \(%hash)       = bar(); # WRONG

Assignment to references and non-references may be combined in lists and
conditional ternary expressions, as long as the values on the right-hand
side are the right type for each element on the left, though this may make
for obfuscated code:

    (my $tom, \my $dick, \my @harry) = (\1, \2, [1..3]);
    # $tom is now \1
    # $dick is now 2 (read-only)
    # @harry is (1,2,3)

    my $type = ref $thingy;
    ($type ? $type eq 'ARRAY' ? \@foo : \$bar : $baz) = $thingy;

The C<foreach> loop can also take a reference constructor for its loop
variable, though the syntax is limited to one of the following, with an
optional C<my>, C<state>, or C<our> after the backslash:

    \$s
    \@a
    \%h
    \&c

No parentheses are permitted.  This feature is particularly useful for
arrays-of-arrays, or arrays-of-hashes:

    foreach \my @a (@array_of_arrays) {
        frobnicate($a[0], $a[-1]);
    }

    foreach \my %h (@array_of_hashes) {
        $h{gelastic}++ if $h{type} eq 'funny';
    }

B<CAVEAT:> Aliasing does not work correctly with closures.  If you try to
alias lexical variables from an inner subroutine or C<eval>, the aliasing
will only be visible within that inner sub, and will not affect the outer
subroutine where the variables are declared.  This bizarre behavior is
subject to change.

=head1 Declaring a Reference to a Variable

Beginning in v5.26.0, the referencing operator can come after C<my>,
C<state>, C<our>, or C<local>.  This syntax must be enabled with C<use
feature 'declared_refs'>.  It is experimental, and will warn by default
unless C<no warnings 'experimental::declared_refs'> is in effect.

This feature makes these:

    my \$x;
    our \$y;

equivalent to:

    \my $x;
    \our $y;

It is intended mainly for use in assignments to references (see
L</Assigning to References>, above).  It also allows the backslash to be
used on just some items in a list of declared variables:

    my ($foo, \@bar, \%baz); # equivalent to:  my $foo, \my(@bar, %baz);
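
Put together with aliasing, a sketch might read as follows.  It assumes
Perl 5.26 or later, silences both experimental warning categories, and
uses a made-up %config hash:

    use v5.26;
    use feature qw(refaliasing declared_refs);
    no warnings qw(experimental::refaliasing
                   experimental::declared_refs);

    my %config = (debug => 1);

    my \%c = \%config;      # same as:  \my %c = \%config;
    $c{debug} = 0;

    say $config{debug};     # 0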

=head1 SEE ALSO

Besides the obvious documents, source code can be instructive.
Some pathological examples of the use of references can be found
in the F<t/op/ref.t> regression test in the Perl source directory.

See also L<perldsc> and L<perllol> for how to use references to create
complex data structures, and L<perlootut> and L<perlobj>
for how to use them to create objects.

=encoding utf8

=head1 NAME

perlrepository - Links to current information on the Perl source repository

=head1 DESCRIPTION

Perl's source code is stored in a Git repository.

See L<perlhack> for an explanation of Perl development, including the
L<Super Quick Patch Guide|perlhack/SUPER QUICK PATCH GUIDE> for making and
submitting a small patch.

See L<perlgit> for detailed information about Perl's Git repository.

(The above documents supersede the information that was formerly here in
perlrepository.)

=head1 NAME

perlcommunity - a brief overview of the Perl community

=head1 DESCRIPTION

This document aims to provide an overview of the vast perl community, which is
far too large and diverse to provide a detailed listing. If any specific niche
has been forgotten, it is not meant as an insult but an omission for the sake
of brevity.

The Perl community is as diverse as Perl, and there is a large amount of
evidence that the Perl users apply TMTOWTDI to all endeavors, not just
programming. From websites, to IRC, to mailing lists, there is more than one
way to get involved in the community.

=head2 Where to Find the Community

There is a central directory for the Perl community: L<http://perl.org>
maintained by the Perl Foundation (L<http://www.perlfoundation.org/>),
which tracks and provides services for a variety of other community sites.

=head2 Mailing Lists and Newsgroups

Perl runs on e-mail; there is no doubt about it. The Camel book was originally
written mostly over e-mail and today Perl's development is co-ordinated through
mailing lists. The largest repository of Perl mailing lists is located at
L<http://lists.perl.org>.

Most Perl-related projects set up mailing lists for both users and
contributors. If you don't see a certain project listed at
L<http://lists.perl.org>, check the particular website for that project.
Most mailing lists are archived at L<http://nntp.perl.org/>.

=head2 IRC

The Perl community has a rather large IRC presence. For starters, it has its
own IRC network, L<irc://irc.perl.org>. General (not help-oriented) chat can be
found at L<irc://irc.perl.org/#perl>. Many other more specific chats are also
hosted on the network. Information about irc.perl.org is located on the
network's website: L<http://www.irc.perl.org>. For a more help-oriented #perl,
check out L<irc://irc.freenode.net/#perl>. Perl 6 development also has a
presence in L<irc://irc.freenode.net/#perl6>. Most Perl-related channels will
be kind enough to point you in the right direction if you ask nicely.

Any large IRC network (Dalnet, EFnet) is also likely to have a #perl channel,
with varying activity levels.

=head2 Websites

Perl websites come in a variety of forms, but they fit into two large
categories: forums and news websites. There are many Perl-related
websites, so only a few of the community's largest are mentioned here.

=head3 News sites

=over 4

=item L<http://perl.com/>

Originally run by O'Reilly Media (the publisher of L<the Camel Book|perlbook>),
this site provides quality articles mostly about technical details of Perl.

=item L<http://blogs.perl.org/>

Many members of the community have a Perl-related blog on this site. If
you'd like to join them, you can sign up for free.

=item L<http://perlsphere.net/>

Perlsphere is one of several aggregators of Perl-related blog feeds.

=item L<http://perlweekly.com/>

Perl Weekly is a weekly mailing list that keeps you up to date on conferences,
releases and notable blog posts.

=item L<http://use.perl.org/>

use Perl; used to provide a slashdot-style news/blog website covering all
things Perl, from minutes of the meetings of the Perl 6 Design team to
conference announcements with (ir)relevant discussion. It no longer accepts
updates, but you can still use the site to read old entries and comments.

=back

=head3 Forums

=over 4

=item L<http://www.perlmonks.org/>

PerlMonks is one of the largest Perl forums, and describes itself as "A place
for individuals to polish, improve, and showcase their Perl skills." and "A
community which allows everyone to grow and learn from each other."

=item L<http://stackoverflow.com/>

Stack Overflow is a free question-and-answer site for programmers. It's not
focussed solely on Perl, but it does have an active group of users who do
their best to help people with their Perl programming questions.

=item L<http://prepan.org/>

PrePAN is used as a place to discuss modules that you're considering uploading
to the CPAN.  You can get feedback on their design before you upload.

=back

=head2 User Groups

Many cities around the world have local Perl Mongers chapters. A Perl Mongers
chapter is a local user group which typically holds regular in-person meetings,
both social and technical; helps organize local conferences, workshops, and
hackathons; and provides a mailing list or other continual contact method for
its members to keep in touch.

To find your local Perl Mongers (or PM as they're commonly abbreviated) group,
check the international Perl Mongers directory at L<http://www.pm.org/>.

=head2 Workshops

Perl workshops are, as the name might suggest, workshops where Perl is taught
in a variety of ways. At the workshops, subjects range from a beginner's
introduction (such as the Pittsburgh Perl Workshop's "Zero To Perl") to much
more advanced subjects.

There are several great resources for locating workshops: the
L<websites|"Websites"> mentioned above, the
L<calendar|"Calendar of Perl Events"> mentioned below, and the YAPC Europe
website, L<http://www.yapceurope.org/>, which is probably the best resource for
European Perl events.

=head2 Hackathons

Hackathons are a very different kind of gathering where Perl hackers gather to
do just that, hack nonstop for an extended (several day) period on a specific
project or projects. Information about hackathons can be located in the same
place as information about L<workshops|"Workshops"> as well as in
L<irc://irc.perl.org/#perl>.

If you have never been to a hackathon, here are a few basic things you need to
know before attending: have a working laptop and know how to use it; check out
the involved projects beforehand; have the necessary version control client;
and bring backup equipment (an extra LAN cable, additional power strips, etc.)
because someone will forget.

=head2 Conventions

Perl has two major annual conventions: The Perl Conference (now part of OSCON),
put on by O'Reilly, and Yet Another Perl Conference or YAPC (pronounced
yap-see), which is localized into several regional YAPCs (North America,
Europe, Asia) in a stunning grassroots display by the Perl community. For more
information about either conference, check out their respective web pages:
OSCON L<http://conferences.oreillynet.com/>; YAPC L<http://www.yapc.org>.

A relatively new conference franchise with a large Perl portion is the
Open Source Developers Conference or OSDC. First held in Australia it has
recently also spread to Israel and France. More information can be found at:
L<http://www.osdc.com.au/> for Australia, L<http://www.osdc.org.il>
for Israel, and L<http://www.osdc.fr/> for France.

=head2 Calendar of Perl Events

The Perl Review, L<http://www.theperlreview.com>, maintains a website
and Google calendar
(L<http://www.theperlreview.com/community_calendar>) for tracking
workshops, hackathons, Perl Mongers meetings, and other events. Views
of this calendar are at L<http://www.perl.org/events.html> and
L<http://www.yapc.org>.

Not every event or Perl Mongers group is on that calendar, so don't lose
heart if you don't see yours posted. To have your event or group listed,
contact brian d foy (brian@theperlreview.com).

=head1 AUTHOR

Edgar "Trizor" Bering <trizor@gmail.com>

=cut
If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perlos400 - Perl version 5 on OS/400

B<This document needs to be updated, but we don't know what it should say.
Please email comments to L<perlbug@perl.org|mailto:perlbug@perl.org>.>

=head1 DESCRIPTION

This document describes various features of IBM's OS/400 operating
system that will affect how Perl version 5 (hereafter just Perl) is
compiled and/or runs.

By far the easiest way to build Perl for OS/400 is to use the PASE
(Portable Application Solutions Environment).  For more information see
L<http://www.iseries.ibm.com/developer/factory/pase/index.html>.
This environment allows one to use AIX APIs while programming, and it
provides a runtime that allows AIX binaries to execute directly on the
PowerPC iSeries.

=head2 Compiling Perl for OS/400 PASE

The recommended way to build Perl for the OS/400 PASE is to build the
Perl 5 source code (release 5.8.1 or later) under AIX.

The trick is to give a special parameter to the Configure shell script
when running it on AIX:

  sh Configure -DPASE ...

The default installation directory of Perl under PASE is /QOpenSys/perl.
This can be modified if needed with Configure parameter -Dprefix=/some/dir.

Starting from OS/400 V5R2 the IBM Visual Age compiler is supported
on OS/400 PASE, so it is possible to build Perl natively on OS/400.  
The easier way, however, is to compile in AIX, as just described.

If you don't want to install the compiled Perl in AIX into /QOpenSys
(for packaging it before copying it to PASE), you can use a Configure
parameter: -Dinstallprefix=/tmp/QOpenSys/perl.  This will cause the
"make install" to install everything into that directory, while the
installed files still think they are (will be) in /QOpenSys/perl.

If building natively on PASE, please do the build under the /QOpenSys
directory, since Perl is happier when built on a case sensitive filesystem.

=head2 Installing Perl in OS/400 PASE

If you are compiling on AIX, simply do a "make install" on the AIX box.
Once the install finishes, tar up the /QOpenSys/perl directory.  Transfer
the tarball to the OS/400 using FTP with the following commands:

  > binary
  > site namefmt 1
  > put perl.tar /QOpenSys

Once the tarball is on the OS/400, simply bring up a PASE shell and extract it.

If you are compiling in PASE, then "make install" is the only thing you
will need to do.

The default path for the perl binary is /QOpenSys/perl/bin/perl.  You'll
want to symlink /QOpenSys/usr/bin/perl to this file so you don't have
to modify your path.

=head2 Using Perl in OS/400 PASE

Perl in PASE may be used in the same manner as you would use Perl on AIX.

Scripts starting with #!/usr/bin/perl should work if you have
/QOpenSys/usr/bin/perl symlinked to your perl binary.  This will not
work if you've done a setuid/setgid or have environment variable
PASE_EXEC_QOPENSYS="N".  If you have V5R1, you'll need to get the
latest PTFs to have this feature.  Scripts starting with
#!/QOpenSys/perl/bin/perl should always work.

=head2 Known Problems

When compiling in PASE, there is no "oslevel" command.  Therefore,
you may want to create a script called "oslevel" that echoes the
level of AIX that your version of PASE runtime supports.  If you're
unsure, consult your documentation or use "4.3.3.0".

If you have test cases that fail, check for the existence of spool files.
The test case may be trying to use a syscall that is not implemented
in PASE.  To avoid the SIGILL, try setting the PASE_SYSCALL_NOSIGILL
environment variable or have a handler for the SIGILL.  If you can
compile programs for PASE, run the config script and edit config.sh
when it gives you the option.  If you want to remove fchdir(), which
isn't implemented in V5R1, simply change the line that says:

  d_fchdir='define'

to

  d_fchdir='undef'

and then compile Perl.  The places where fchdir() is used have
alternatives for systems that do not have fchdir() available.

=head2 Perl on ILE

There exists a port of Perl to the ILE environment.  This port, however,
is based on quite an old release of Perl, Perl 5.00502 (August 1998).
(As of July 2002 the latest release of Perl is 5.8.0, and even 5.6.1
has been out since April 2001.)  If you need to run Perl on ILE, though,
you may need this older port: L<http://www.cpan.org/ports/#os400>.
Note that any Perl release later than 5.00502 has not been ported to ILE.

If you need to use Perl in the ILE environment, you may want to consider
using Qp2RunPase() to call the PASE version of Perl.

=head1 AUTHORS

Jarkko Hietaniemi <jhi@iki.fi>
Bryan Logan <bryanlog@us.ibm.com>
David Larson <larson1@us.ibm.com>

=cut
=encoding utf8

=head1 NAME

perl5182delta - what is new for perl v5.18.2

=head1 DESCRIPTION

This document describes differences between the 5.18.1 release and the 5.18.2
release.

If you are upgrading from an earlier release such as 5.18.0, first read
L<perl5181delta>, which describes differences between 5.18.0 and 5.18.1.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<B> has been upgraded from version 1.42_01 to 1.42_02.

The fix for [perl #118525] introduced a regression in the behaviour of
C<B::CV::GV>, changing the return value from a C<B::SPECIAL> object on
a C<NULL> C<CvGV> to C<undef>.  C<B::CV::GV> again returns a
C<B::SPECIAL> object in this case.  [perl #119413]

=item *

L<B::Concise> has been upgraded from version 0.95 to 0.95_01.

This fixes a bug in dumping unexpected SPECIALs.

=item *

L<English> has been upgraded from version 1.06 to 1.06_01.  This fixes an
error about the performance of C<$`>, C<$&>, and C<$'>.

=item *

L<File::Glob> has been upgraded from version 1.20 to 1.20_01.

=back

=head1 Documentation

=head2 Changes to Existing Documentation

=over 4

=item *

L<perlrepository> has been restored with a pointer to more useful pages.

=item *

L<perlhack> has been updated with the latest changes from blead.

=back

=head1 Selected Bug Fixes

=over 4

=item *

Perl 5.18.1 introduced a regression along with a bugfix for lexical subs.
Some B::SPECIAL results from B::CV::GV became undefs instead.  This broke
Devel::Cover among other libraries.  This has been fixed.  [perl #119351]

=item *

Perl 5.18.0 introduced a regression whereby C<[:^ascii:]>, if used in the same
character class as other qualifiers, would fail to match characters in the
Latin-1 block.  This has been fixed.  [perl #120799]

=item *

Perl 5.18.0 introduced a regression when using ->SUPER::method with AUTOLOAD
by looking up AUTOLOAD from the current package, rather than the current
package’s superclass.  This has been fixed. [perl #120694]

=item *

Perl 5.18.0 introduced a regression whereby C<-bareword> was no longer
permitted under the C<strict> and C<integer> pragmata when used together.  This
has been fixed.  [perl #120288]

=item *

Previously PerlIOBase_dup didn't check if pushing the new layer succeeded
before (optionally) setting the utf8 flag. This could cause
segfaults-by-nullpointer.  This has been fixed.

=item *

A buffer overflow with very long identifiers has been fixed.

=item *

A regression from 5.16 in the handling of padranges led to assertion failures
if a keyword plugin declined to handle the second ‘my’, but only after creating
a padop.

This affected, at least, Devel::CallParser under threaded builds.

This has been fixed.

=item *

The construct C<< $r=qr/.../; /$r/p >> is now handled properly, an issue which
had been worsened by changes in 5.18.0. [perl #118213]

=back
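
The C<< ->SUPER::method >> fix listed above can be illustrated with a small
sketch.  The package names C<Base> and C<Derived> and the method name
C<missing> are hypothetical, chosen only for this illustration.  On a perl
that includes the fix, the call is expected to be answered by the
superclass's C<AUTOLOAD> rather than by the one in the current package:

    use strict;
    use warnings;

    package Base;
    sub new      { return bless {}, shift }
    sub DESTROY  { }                        # keep DESTROY away from AUTOLOAD
    sub AUTOLOAD { return 'Base::AUTOLOAD' }

    package Derived;
    our @ISA = ('Base');
    sub AUTOLOAD { return 'Derived::AUTOLOAD' }

    # No method named "missing" is defined anywhere, so the SUPER:: call
    # has to fall back on an AUTOLOAD routine.
    sub call_super {
        my $self = shift;
        return $self->SUPER::missing();
    }

    package main;
    my $who = Derived->new->call_super;
    print "handled by $who\n";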

=head1 Acknowledgements

Perl 5.18.2 represents approximately 3 months of development since Perl
5.18.1 and contains approximately 980 lines of changes across 39 files from 4
authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers. The following people are known to have
contributed the improvements that became Perl 5.18.2:

Craig A. Berry, David Mitchell, Ricardo Signes, Tony Cook.

The list above is almost certainly incomplete as it is automatically
generated from version control history. In particular, it does not include
the names of the (very much appreciated) contributors who reported issues to
the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core. We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
http://rt.perl.org/perlbug/ .  There may also be information at
http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
=encoding utf8

=head1 NAME

perl5263delta - what is new for perl v5.26.3

=head1 DESCRIPTION

This document describes differences between the 5.26.2 release and the 5.26.3
release.

If you are upgrading from an earlier release such as 5.26.1, first read
L<perl5262delta>, which describes differences between 5.26.1 and 5.26.2.

=head1 Security

=head2 [CVE-2018-12015] Directory traversal in module Archive::Tar

By default, L<Archive::Tar> doesn't allow extracting files outside the current
working directory.  However, this secure extraction mode could be bypassed by
putting a symlink and a regular file with the same name into the tar file.

L<[perl #133250]|https://rt.perl.org/Ticket/Display.html?id=133250>
L<[cpan #125523]|https://rt.cpan.org/Ticket/Display.html?id=125523>

=head2 [CVE-2018-18311] Integer overflow leading to buffer overflow and segmentation fault

Integer arithmetic in C<Perl_my_setenv()> could wrap when the combined length
of the environment variable name and value exceeded around 0x7fffffff.  This
could lead to writing beyond the end of an allocated buffer with attacker
supplied data.

L<[perl #133204]|https://rt.perl.org/Ticket/Display.html?id=133204>

=head2 [CVE-2018-18312] Heap-buffer-overflow write in S_regatom (regcomp.c)

A crafted regular expression could cause heap-buffer-overflow write during
compilation, potentially allowing arbitrary code execution.

L<[perl #133423]|https://rt.perl.org/Ticket/Display.html?id=133423>

=head2 [CVE-2018-18313] Heap-buffer-overflow read in S_grok_bslash_N (regcomp.c)

A crafted regular expression could cause heap-buffer-overflow read during
compilation, potentially leading to sensitive information being leaked.

L<[perl #133192]|https://rt.perl.org/Ticket/Display.html?id=133192>

=head2 [CVE-2018-18314] Heap-buffer-overflow write in S_regatom (regcomp.c)

A crafted regular expression could cause heap-buffer-overflow write during
compilation, potentially allowing arbitrary code execution.

L<[perl #131649]|https://rt.perl.org/Ticket/Display.html?id=131649>

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.26.2.  If any exist,
they are bugs, and we request that you submit a report.  See
L</Reporting Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Archive::Tar> has been upgraded from version 2.24 to 2.24_01.

=item *

L<Module::CoreList> has been upgraded from version 5.20180414_26 to 5.20181129_26.

=back

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 New Diagnostics

=head3 New Errors

=over 4

=item *

L<Unexpected ']' with no following ')' in (?[... in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Unexpected ']' with no following ')' in (?[... in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>">

(F) While parsing an extended character class a ']' character was encountered
at a point in the definition where the only legal use of ']' is to close the
character class definition as part of a '])'.  You may have forgotten the close
paren, or otherwise confused the parser.

=item *

L<Expecting close paren for nested extended charclass in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Expecting close paren for nested extended charclass in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>">

(F) While parsing a nested extended character class like:

    (?[ ... (?flags:(?[ ... ])) ... ])
                             ^

we expected to see a close paren ')' (marked by ^) but did not.

=item *

L<Expecting close paren for wrapper for nested extended charclass in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Expecting close paren for wrapper for nested extended charclass in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>">

(F) While parsing a nested extended character class like:

    (?[ ... (?flags:(?[ ... ])) ... ])
                              ^

we expected to see a close paren ')' (marked by ^) but did not.

=back

=head2 Changes to Existing Diagnostics

=over 4

=item *

L<Syntax error in (?[...]) in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Syntax error in (?[...]) in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>">

This fatal error message has been slightly expanded (from "Syntax error in
(?[...]) in regex mE<sol>%sE<sol>") for greater clarity.

=back

=head1 Acknowledgements

Perl 5.26.3 represents approximately 8 months of development since Perl 5.26.2
and contains approximately 4,500 lines of changes across 51 files from 15
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 770 lines of changes to 10 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.26.3:

Aaron Crane, Abigail, Chris 'BinGOs' Williams, Dagfinn Ilmari Mannsåker, David
Mitchell, H.Merijn Brand, James E Keenan, John SJ Anderson, Karen Etheridge,
Karl Williamson, Sawyer X, Steve Hay, Todd Rinaldo, Tony Cook, Yves Orton.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the perl bug database
at L<https://rt.perl.org/> .  There may also be information at
L<http://www.perl.org/> , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications which make it
inappropriate to send to a publicly archived mailing list, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION>
for details of how to report the issue.

=head1 Give Thanks

If you wish to thank the Perl 5 Porters for the work we have done in Perl 5,
you can do so by running the C<perlthanks> program:

    perlthanks

This will send an email to the Perl 5 Porters list with your show of thanks.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
=head1 NAME

perl58delta - what is new for perl v5.8.0

=head1 DESCRIPTION

This document describes differences between the 5.6.0 release and
the 5.8.0 release.

Many of the bug fixes in 5.8.0 were already seen in the 5.6.1
maintenance release since the two releases were kept closely
coordinated (while 5.8.0 was still called 5.7.something).

Changes that were integrated into the 5.6.1 release are marked C<[561]>.
Many of these changes have been further developed since 5.6.1 was released;
those are marked C<[561+]>.

You can see the list of changes in the 5.6.1 release (both from the
5.005_03 release and the 5.6.0 release) by reading L<perl561delta>.

=head1 Highlights In 5.8.0

=over 4

=item *

Better Unicode support

=item *

New IO Implementation

=item *

New Thread Implementation

=item *

Better Numeric Accuracy

=item *

Safe Signals

=item *

Many New Modules

=item *

More Extensive Regression Testing

=back

=head1 Incompatible Changes

=head2 Binary Incompatibility

B<Perl 5.8 is not binary compatible with earlier releases of Perl.>

B<You have to recompile your XS modules.>

(Pure Perl modules should continue to work.)

The major reason for the discontinuity is the new IO architecture
called PerlIO.  PerlIO is the default configuration because without
it many new features of Perl 5.8 cannot be used.  In other words:
you just have to recompile your modules containing XS code, sorry
about that.

In future releases of Perl, non-PerlIO aware XS modules may become
completely unsupported.  This shouldn't be too difficult for module
authors, however: PerlIO has been designed as a drop-in replacement
(at the source code level) for the stdio interface.

Depending on your platform, there are also other reasons why
we decided to break binary compatibility; please read on.

=head2 64-bit platforms and malloc

If your pointers are 64 bits wide, the Perl malloc is no longer being
used because it does not work well with 8-byte pointers.  Also,
usually the system mallocs on such platforms are much better optimized
for such large memory models than the Perl malloc.  Some memory-hungry
Perl applications like the PDL don't work well with Perl's malloc.
Finally, applications other than Perl (such as mod_perl) tend to prefer
the system malloc.  Such platforms include Alpha and 64-bit HPPA,
MIPS, PPC, and Sparc.

=head2 AIX Dynaloading

On AIX releases 4.3 and newer, dynaloading now uses the native
dlopen interface of AIX instead of the old emulated interface.  This
change will probably break backward compatibility with compiled
modules.  The change was made to make Perl more compliant with other
applications like mod_perl which are using the AIX native interface.

=head2 Attributes for C<my> variables now handled at run-time

The C<my EXPR : ATTRS> syntax now applies variable attributes at
run-time.  (Subroutine and C<our> variables still get attributes applied
at compile-time.)  See L<attributes> for additional details.  In particular,
however, this allows variable attributes to be useful for C<tie> interfaces,
which was a deficiency of earlier releases.  Note that the new semantics
doesn't work with the Attribute::Handlers module (as of version 0.76).
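
A minimal sketch of the run-time behaviour described above.  The attribute
name C<Logged> and the variable names are hypothetical; the
C<MODIFY_SCALAR_ATTRIBUTES> hook is the one documented in L<attributes>.
Because the attribute is applied each time the declaration executes, the
hook should run once per loop iteration:

    use strict;
    use warnings;

    my @applied;

    # attributes.pm calls this hook for "my $var : Attr" declarations made
    # in this package.  Returning an empty list accepts every attribute.
    sub MODIFY_SCALAR_ATTRIBUTES {
        my ($package, $ref, @attrs) = @_;
        push @applied, @attrs;
        return;
    }

    for my $pass (1 .. 3) {
        my $value : Logged = $pass;   # hook runs when this line executes
    }

    print scalar(@applied), " attribute applications\n";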

=head2 Socket Extension Dynamic in VMS

The Socket extension is now dynamically loaded instead of being
statically built in.  This may or may not be a problem with ancient
TCP/IP stacks of VMS: we do not know since we weren't able to test
Perl in such configurations.

=head2 IEEE-format Floating Point Default on OpenVMS Alpha

Perl now uses IEEE format (T_FLOAT) as the default internal floating
point format on OpenVMS Alpha, potentially breaking binary compatibility
with external libraries or existing data.  G_FLOAT is still available as
a configuration option.  The default on VAX (D_FLOAT) has not changed.

=head2 New Unicode Semantics (no more C<use utf8>, almost)

Previously in Perl 5.6 to use Unicode one would say "use utf8" and
then the operations (like string concatenation) were Unicode-aware
in that lexical scope.

This was found to be an inconvenient interface, and in Perl 5.8 the
Unicode model has completely changed: now the "Unicodeness" is bound
to the data itself, and for most of the time "use utf8" is not needed
at all.  The only remaining use of "use utf8" is when the Perl script
itself has been written in the UTF-8 encoding of Unicode.  (UTF-8 has
not been made the default since there are many Perl scripts out there
that are using various national eight-bit character sets, which would
be illegal in UTF-8.)

See L<perluniintro> for the explanation of the current model,
and L<utf8> for the current use of the utf8 pragma.
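
As an illustration of "Unicodeness" being bound to the data, the following
sketch handles a character above 255 without C<use utf8>, since the source
text itself is plain ASCII.  The variable names are hypothetical, and
C<utf8::encode> is used here only to show the difference between
characters and UTF-8 bytes:

    use strict;
    use warnings;

    # No "use utf8" is needed: the script contains only ASCII characters.
    my $smiley = chr(0x263A);          # WHITE SMILING FACE
    my $chars  = length($smiley);      # length counts characters

    # utf8::encode() converts the characters to UTF-8 bytes in place.
    my $octets = $smiley;
    utf8::encode($octets);
    my $bytes  = length($octets);

    print "$chars character, $bytes bytes\n";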

=head2 New Unicode Properties

Unicode I<scripts> are now supported. Scripts are similar to (and superior
to) Unicode I<blocks>. The difference between scripts and blocks is that
scripts are the glyphs used by a language or a group of languages, while
the blocks are more artificial groupings of (mostly) 256 characters based
on the Unicode numbering.

In general, scripts are more inclusive, but not universally so. For
example, while the script C<Latin> includes all the Latin characters and
their various diacritic-adorned versions, it does not include the various
punctuation or digits (since they are not solely C<Latin>).

A number of other properties are now supported, including C<\p{L&}>,
C<\p{Any}>, C<\p{Assigned}>, C<\p{Unassigned}>, C<\p{Blank}> [561] and
C<\p{SpacePerl}> [561] (along with their C<\P{...}> versions, of course).
See L<perlunicode> for details, and more additions.

The C<In> and C<Is> prefixes to names used with C<\p{...}> and C<\P{...}>
are now almost always optional. The only exception is that an C<In> prefix
is required to signify a Unicode block when a block name conflicts with a
script name. For example, C<\p{Tibetan}> refers to the script, while
C<\p{InTibetan}> refers to the block. When there is no name conflict, you
can omit the C<In> from the block name (e.g. C<\p{BraillePatterns}>), but
to be safe, it's probably best to always use the C<In>.
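
The script properties discussed above can be tried out with a short sketch.
The variable names are hypothetical, and the sample characters were picked
only for illustration: a Latin letter, a digit (which is shared between
scripts and so is not C<Latin>), and TIBETAN LETTER KA, which belongs to
both the Tibetan script and the Tibetan block:

    use strict;
    use warnings;

    my $letter_is_latin = ('a' =~ /\p{Latin}/) ? 1 : 0;
    my $digit_is_latin  = ('7' =~ /\p{Latin}/) ? 1 : 0;

    my $ka           = chr(0x0F40);    # TIBETAN LETTER KA
    my $ka_in_script = ($ka =~ /\p{Tibetan}/)   ? 1 : 0;
    my $ka_in_block  = ($ka =~ /\p{InTibetan}/) ? 1 : 0;

    print "letter=$letter_is_latin digit=$digit_is_latin ",
          "script=$ka_in_script block=$ka_in_block\n";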

=head2 REF(...) Instead Of SCALAR(...)

A reference to a reference now stringifies as "REF(0x81485ec)" instead
of "SCALAR(0x81485ec)" in order to be more consistent with the return
value of ref().
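
A short sketch of the new stringification; the variable names are
hypothetical and the hexadecimal address will differ from run to run:

    use strict;
    use warnings;

    my $inner = \"text";     # reference to a plain scalar
    my $outer = \$inner;     # reference to a reference

    my $inner_kind = ref($inner);
    my $outer_kind = ref($outer);

    # The stringified form now starts with the same word ref() returns.
    print "$inner_kind: $inner\n";
    print "$outer_kind: $outer\n";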

=head2 pack/unpack D/F recycled

The undocumented pack/unpack template letters D/F have been recycled
for better use: now they stand for long double (if supported by the
platform) and NV (Perl internal floating point type).  (They used
to be aliases for d/f, but you never knew that.)
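
A minimal sketch of the C<F> template letter, assuming nothing beyond the
core L<Config> module.  Variable names are hypothetical.  C<D> is left out
because it is only usable on platforms with long double support:

    use strict;
    use warnings;
    use Config;

    # "F" packs a value in perl's own floating point format (NV), so the
    # packed string is as long as an NV and the value survives unchanged.
    my $packed     = pack('F', 0.1);
    my $round_trip = unpack('F', $packed);

    print length($packed), " bytes, nvsize is $Config{nvsize}\n";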

=head2 glob() now returns filenames in alphabetical order

The list of filenames from glob() (or <...>) is now by default sorted
alphabetically to be csh-compliant (which is what happened before
in most Unix platforms).  (bsd_glob() does still sort platform
natively, ASCII or EBCDIC, unless GLOB_ALPHASORT is specified.) [561]
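
The default ordering can be observed with a sketch like the following,
which assumes a writable temporary directory.  The file names are
hypothetical and are created in a deliberately unsorted order:

    use strict;
    use warnings;
    use Cwd qw(getcwd);
    use File::Temp qw(tempdir);

    my $start = getcwd();
    my $dir   = tempdir(CLEANUP => 1);

    for my $name (qw(pear apple mango)) {
        open(my $fh, '>', "$dir/$name") or die "cannot create $name: $!";
        close($fh);
    }

    # Glob inside the directory so the pattern needs no path prefix.
    chdir($dir) or die "cannot chdir to $dir: $!";
    my @found = glob('*');
    chdir($start) or die "cannot chdir back to $start: $!";

    print "@found\n";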

=head2 Deprecations

=over 4

=item *

The semantics of bless(REF, REF) were unclear and until someone proves
it to make some sense, it is forbidden.

=item *

The obsolete chat2 library that should never have been allowed
to escape the laboratory has been decommissioned.

=item *

Using chdir("") or chdir(undef) instead of explicit chdir() is
doubtful.  A failure (think chdir(some_function())) can lead to an
unintended chdir() to the home directory, therefore this behaviour
is deprecated.

=item *

The builtin dump() function has probably outlived most of its
usefulness.  The core-dumping functionality will remain available
as an explicit call to C<CORE::dump()>, but in future
releases the behaviour of an unqualified C<dump()> call may change.

=item *

The very dusty examples in the eg/ directory have been removed.
Suggestions for new shiny examples welcome but the main issue is that
the examples need to be documented, tested and (most importantly)
maintained.

=item *

The (bogus) escape sequences \8 and \9 now give an optional warning
("Unrecognized escape passed through").  There is no need to \-escape
any C<\w> character.

=item *

The *glob{FILEHANDLE} is deprecated, use *glob{IO} instead.

=item *

The C<package;> syntax (C<package> without an argument) has been
deprecated.  Its semantics were never that clear and its
implementation even less so.  If you have used that feature to
disallow all but fully qualified variables, use C<use strict;> instead.

=item *

The unimplemented POSIX regex features [[.cc.]] and [[=c=]] are still
recognised but now cause fatal errors.  The previous behaviour of
ignoring them by default and warning if requested was unacceptable
since it, in a way, falsely promised that the features could be used.

=item *

In future releases, non-PerlIO aware XS modules may become completely
unsupported.  Since PerlIO is a drop-in replacement for stdio at the
source code level, this shouldn't be that drastic a change.

=item *

Previous versions of perl and some readings of some sections of Camel
III implied that the C<:raw> "discipline" was the inverse of C<:crlf>.
Turning off "crlfness" is no longer enough to make a stream truly
binary. So the PerlIO C<:raw> layer (or "discipline", to use the Camel
book's older terminology) is now formally defined as being equivalent
to binmode(FH) - which is in turn defined as doing whatever is
necessary to pass each byte as-is without any translation.  In
particular binmode(FH) - and hence C<:raw> - will now turn off both
CRLF and UTF-8 translation and remove other layers (e.g. :encoding())
which would modify the byte stream.

=item *

The current user-visible implementation of pseudo-hashes (the weird
use of the first array element) is deprecated starting from Perl 5.8.0
and will be removed in Perl 5.10.0, and the feature will be
implemented differently.  Not only is the current interface rather
ugly, but the current implementation slows down normal array and hash
use quite noticeably. The C<fields> pragma interface will remain
available.  The I<restricted hashes> interface is expected to
be the replacement interface (see L<Hash::Util>).  If your existing
programs depend on the underlying implementation, consider using
L<Class::PseudoHash> from CPAN.

=item *

The syntaxes C<< @a->[...] >> and  C<< %h->{...} >> have now been deprecated.

=item *

After years of trying, suidperl is considered to be too complex to
ever be considered truly secure.  The suidperl functionality is likely
to be removed in a future release.

=item *

The 5.005 threads model (module C<Thread>) is deprecated and expected
to be removed in Perl 5.10.  Multithreaded code should be migrated to
the new ithreads model (see L<threads>, L<threads::shared> and
L<perlthrtut>).

=item *

The long deprecated uppercase aliases for the string comparison
operators (EQ, NE, LT, LE, GE, GT) have now been removed.

=item *

The tr///C and tr///U features have been removed and will not return;
the interface was a mistake.  Sorry about that.  For similar
functionality, see pack('U0', ...) and pack('C0', ...). [561]

=item *

Earlier Perls treated "sub foo (@bar)" as equivalent to "sub foo (@)".
The prototypes are now checked better at compile-time for invalid
syntax.  An optional warning is generated ("Illegal character in
prototype...")  but this may be upgraded to a fatal error in a future
release.

=item *

The C<exec LIST> and C<system LIST> operations now produce warnings on
tainted data and in some future release they will produce fatal errors.

=item *

The existing behaviour when localising tied arrays and hashes is wrong,
and will be changed in a future release, so do not rely on the existing
behaviour. See L</"Localising Tied Arrays and Hashes Is Broken">.

=back

=head1 Core Enhancements

=head2 Unicode Overhaul

Unicode in general should now be much more usable than in Perl 5.6.0
(or even in 5.6.1).  Unicode can be used in hash keys, Unicode in
regular expressions should work now, Unicode in tr/// should work now,
Unicode in I/O should work now.  See L<perluniintro> for introduction
and L<perlunicode> for details.

=over 4

=item *

The Unicode Character Database coming with Perl has been upgraded
to Unicode 3.2.0.  For more information, see http://www.unicode.org/ .
[561+] (5.6.1 has UCD 3.0.1.)

=item *

For developers interested in enhancing Perl's Unicode capabilities:
almost all the UCD files are included with the Perl distribution in
the F<lib/unicore> subdirectory.  The most notable omission, for space
considerations, is the Unihan database.

=item *

The properties \p{Blank} and \p{SpacePerl} have been added. "Blank" is like
C isblank(), that is, it contains only "horizontal whitespace" (the space
character is, the newline isn't), and the "SpacePerl" is the Unicode
equivalent of C<\s> (\p{Space} isn't, since that includes the vertical
tabulator character, whereas C<\s> doesn't.)

See "New Unicode Properties" earlier in this document for additional
information on changes with Unicode properties.

=back
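
As a rough illustration of the C<\p{Blank}> property described in the
list above, the following sketch classifies a few characters.  The
variable names are invented for the example.

    use strict;
    use warnings;

    # "Blank" is horizontal whitespace only: space and tab match,
    # newline does not.
    my @report;
    for my $char (" ", "\t", "\n") {
        my $verdict = $char =~ /\p{Blank}/ ? "blank" : "not blank";
        push @report, sprintf("U+%04X is %s", ord($char), $verdict);
    }
    print "$_\n" for @report;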

=head2 PerlIO is Now The Default

=over 4

=item *

IO is now by default done via PerlIO rather than the system's "stdio".
PerlIO allows "layers" to be "pushed" onto a file handle to alter the
handle's behaviour.  Layers can be specified at open time via the 3-arg
form of open:

   open($fh,'>:crlf :utf8', $path) || ...

or on already opened handles via extended C<binmode>:

   binmode($fh,':encoding(iso-8859-7)');

The built-in layers are: unix (low level read/write), stdio (as in
previous Perls), perlio (re-implementation of stdio buffering in a
portable manner), crlf (does CRLF <=> "\n" translation as on Win32,
but available on any platform).  A mmap layer may be available if
the platform supports it (mostly Unixes).

Layers to be applied by default may be specified via the 'open' pragma.

See L</"Installation and Configuration Improvements"> for the effects
of PerlIO on your architecture name.

=item *

If your platform supports fork(), you can use the list form of C<open>
for pipes.  For example:

    open KID_PS, "-|", "ps", "aux" or die $!;

forks the ps(1) command (without spawning a shell, as there are more
than three arguments to open()), and reads its standard output via the
C<KID_PS> filehandle.  See L<perlipc>.

=item *

File handles can be marked as accepting Perl's internal encoding of Unicode
(UTF-8 or UTF-EBCDIC depending on platform) by a pseudo layer ":utf8" :

   open($fh,">:utf8","Uni.txt");

Note for EBCDIC users: the pseudo layer ":utf8" is erroneously named
for you, since what you will be getting is not UTF-8 but
UTF-EBCDIC.  See L<perlunicode>, L<utf8>, and
http://www.unicode.org/unicode/reports/tr16/ for more information.
In future releases this naming may change.  See L<perluniintro>
for more information about UTF-8.

=item *

If your environment variables (LC_ALL, LC_CTYPE, LANG) look like you
want to use UTF-8 (any of the variables match C</utf-?8/i>), your
STDIN, STDOUT, STDERR handles and the default open layer (see L<open>)
are marked as UTF-8.  (This feature, like other new features that
combine Unicode and I/O, works only if you are using PerlIO, but that's
the default.)

Note that after this Perl really does assume that everything is UTF-8:
for example if some input handle is not, Perl will probably very soon
complain about the input data like this "Malformed UTF-8 ..." since
any old eight-bit data is not legal UTF-8.

Note for code authors: if you want to enable your users to use UTF-8
as their default encoding but in your code still have eight-bit I/O streams
(such as images or zip files), you need to explicitly open() or binmode()
with C<:bytes> (see L<perlfunc/open> and L<perlfunc/binmode>), or you
can just use C<binmode(FH)> (nice for pre-5.8.0 backward compatibility).

=item *

File handles can translate character encodings from/to Perl's internal
Unicode form on read/write via the ":encoding()" layer.

=item *

File handles can be opened to "in memory" files held in Perl scalars via:

   open($fh,'>', \$variable) || ...

=item *

Anonymous temporary files are available without need to
'use FileHandle' or other module via

   open($fh,"+>", undef) || ...

That is a literal undef, not an undefined value.

=back
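
The "in memory" file handles mentioned in the list above can be
sketched as follows.  This is only a minimal example; the variable
names are arbitrary, and the lexical C<my $out> style is one of several
ways to hold the handle.

    use strict;
    use warnings;

    # Write to a scalar through an ordinary filehandle.
    my $buffer = '';
    open(my $out, '>', \$buffer) or die "Cannot open in-memory file: $!";
    print $out "first line\n";
    print $out "second line\n";
    close($out) or die "Cannot close in-memory file: $!";

    # Read the same scalar back line by line.
    open(my $in, '<', \$buffer) or die "Cannot reopen in-memory file: $!";
    my @lines = <$in>;
    close($in);

    print scalar(@lines), " lines, ", length($buffer), " bytes\n";

Because the handle behaves like any other, code that expects a file
handle can be pointed at a scalar without modification.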

=head2 ithreads

The new interpreter threads ("ithreads" for short) implementation of
multithreading, by Arthur Bergman, replaces the old "5.005 threads"
implementation.  In the ithreads model any data sharing between
threads must be explicit, as opposed to the model where data sharing
was implicit.  See L<threads> and L<threads::shared>, and
L<perlthrtut>.

As a part of the ithreads implementation Perl will also use
any necessary and detectable reentrant libc interfaces.

=head2 Restricted Hashes

A restricted hash is restricted to a certain set of keys: no keys
outside the set can be added.  Individual keys can also be restricted
so that the key cannot be deleted and the value cannot be changed.
No new syntax is involved: the Hash::Util module is the interface.
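
A minimal sketch of the idea, using the C<lock_keys> and C<lock_value>
functions exported by Hash::Util.  The hash contents and the key name
C<colour> are made up for the example.

    use strict;
    use warnings;
    use Hash::Util qw(lock_keys lock_value);

    my %config = (host => 'localhost', port => 8080);

    # Restrict the hash to its current set of keys.
    lock_keys(%config);

    $config{port} = 9090;                     # existing key: allowed
    my $added = eval { $config{colour} = 'red'; 1 };
    print "adding a new key failed\n" unless $added;

    # Additionally freeze the value stored under one key.
    lock_value(%config, 'host');
    my $changed = eval { $config{host} = 'example.org'; 1 };
    print "changing a locked value failed\n" unless $changed;

    print "host=$config{host} port=$config{port}\n";

Both disallowed operations die, which is why they are wrapped in
C<eval> here.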

=head2 Safe Signals

Perl used to be fragile in that signals arriving at inopportune moments
could corrupt Perl's internal state.  Now Perl postpones handling of
signals until it's safe (between opcodes).

This change may have surprising side effects because signals no longer
interrupt Perl instantly.  Perl will now first finish whatever it was
doing, like finishing an internal operation (like sort()) or an
external operation (like an I/O operation), and only then look at any
arrived signals (and before starting the next operation).  No more corrupt
internal state since the current operation is always finished first,
but the signal may take more time to get heard.  Note that breaking
out from potentially blocking operations should still work, though.
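
The deferred delivery can be observed with a small sketch like the
following.  It assumes a POSIX-style platform that provides C<SIGUSR1>;
the handler only increments a counter, which Perl runs at the next safe
point rather than inside the signal itself.

    use strict;
    use warnings;

    my $received = 0;
    $SIG{USR1} = sub { $received++ };

    # Send the signal to ourselves.  The Perl-level handler is not run
    # in the middle of an operation; it is dispatched once the current
    # operation has finished.
    kill('USR1', $$) or die "kill failed: $!";

    print "handler ran $received time(s)\n";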

=head2 Understanding of Numbers

In general a lot of fixing has happened in the area of Perl's
understanding of numbers, both integer and floating point.  Since in
many systems the standard number parsing functions like C<strtoul()>
and C<atof()> seem to have bugs, Perl tries to work around their
deficiencies.  This results hopefully in more accurate numbers.

Perl now tries internally to use integer values in numeric conversions
and basic arithmetic (+ - * /) if the arguments are integers, and
also tries to keep the results stored internally as integers.
This change often leads to slightly faster, and always to less lossy,
arithmetic.  (Previously Perl always preferred floating point numbers
in its math.)
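
The effect is easiest to see with integers too large for a 64-bit
double.  The sketch below assumes a perl built with 64-bit integers
(checked via C<$Config{ivsize}>); on other builds it simply prints a
note.

    use strict;
    use warnings;
    use Config;

    my $sum;
    if ($Config{ivsize} >= 8) {
        # 2**53 + 1 cannot be represented exactly as a 64-bit double,
        # but it fits comfortably in a 64-bit integer.
        my $n = 9_007_199_254_740_993;
        $sum = $n + 2;                  # stays an integer internally
        print "integer sum:   $sum\n";
        # Formatting as a float goes through floating point and may
        # lose the low-order digits.
        printf "as a float:    %.0f\n", $n;
    }
    else {
        print "this perl has integers narrower than 64 bits\n";
    }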

=head2 Arrays now always interpolate into double-quoted strings [561]

In double-quoted strings, arrays now interpolate, no matter what.  The
behavior in earlier versions of perl 5 was that arrays would interpolate
into strings if the array had been mentioned before the string was
compiled, and otherwise Perl would raise a fatal compile-time error.
In versions 5.000 through 5.003, the error was

        Literal @example now requires backslash

In versions 5.004_01 through 5.6.0, the error was

        In string, @example now must be written as \@example

The idea here was to get people into the habit of writing
C<"fred\@example.com"> when they wanted a literal C<@> sign, just as
they have always written C<"Give me back my \$5"> when they wanted a
literal C<$> sign.

Starting with 5.6.1, when Perl sees an C<@> sign in a
double-quoted string, it I<always> attempts to interpolate an array,
regardless of whether or not the array has been used or declared
already.  The fatal error has been downgraded to an optional warning:

        Possible unintended interpolation of @example in string

This warns you that C<"fred@example.com"> is going to turn into
C<fred.com> if you don't backslash the C<@>.
See http://perl.plover.com/at-error.html for more details
about the history here.
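
A short sketch of the three cases.  It deliberately omits C<use strict>
so that the undeclared array C<@example> compiles, and under
C<use warnings> the first assignment triggers the warning quoted above.

    use warnings;    # no "use strict", so that @example compiles

    my $mangled = "fred@example.com";    # @example interpolates (empty)
    my $escaped = "fred\@example.com";   # backslash keeps the literal @
    my $single  = 'fred@example.com';    # single quotes never interpolate

    print "$mangled\n$escaped\n$single\n";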

=head2 Miscellaneous Changes

=over 4

=item *

AUTOLOAD is now lvaluable, meaning that you can add the :lvalue attribute
to AUTOLOAD subroutines and you can assign to the AUTOLOAD return value.

=item *

The $Config{byteorder} (and corresponding BYTEORDER in config.h) was
previously wrong on platforms where sizeof(long) was 4 but sizeof(IV)
was 8.  The byteorder was only sizeof(long) bytes long (1234 or 4321),
but now it is correctly sizeof(IV) bytes long, (12345678 or 87654321).
(This problem didn't affect Windows platforms.)

Also, $Config{byteorder} is now computed dynamically--this is more
robust with "fat binaries" where an executable image contains binaries
for more than one binary platform, and when cross-compiling.

=item *

C<perl -d:Module=arg,arg,arg> now works (previously one couldn't pass
in multiple arguments.)

=item *

C<do> followed by a bareword now ensures that this bareword isn't
a keyword (to avoid a bug where C<do q(foo.pl)> tried to call a
subroutine called C<q>).  This means that for example instead of
C<do format()> you must write C<do &format()>.

=item *

The builtin dump() now gives an optional warning
C<dump() better written as CORE::dump()>,
meaning that by default C<dump(...)> is resolved as the builtin
dump() which dumps core and aborts, not as (possibly) user-defined
C<sub dump>.  To call the latter, qualify the call as C<&dump(...)>.
(The whole dump() feature is to be considered deprecated, and possibly
removed/changed in future releases.)

=item *

chomp() and chop() are now overridable.  Note, however, that their
prototype (as given by C<prototype("CORE::chomp")>) is undefined,
because it cannot be expressed and therefore one cannot really write
replacements to override these builtins.

=item *

END blocks are now run even if you exit/die in a BEGIN block.
Internally, the execution of END blocks is now controlled by
PL_exit_flags & PERL_EXIT_DESTRUCT_END. This enables the new
behaviour for Perl embedders. This will be the default in 5.10. See
L<perlembed>.

=item *

Formats now support zero-padded decimal fields.

=item *

Although "you shouldn't do that", it was possible to write code that
depends on Perl's hashed key order (Data::Dumper does this).  The new
algorithm "One-at-a-Time" produces a different hashed key order.
More details are in L</"Performance Enhancements">.

=item *

lstat(FILEHANDLE) now gives a warning because the operation makes no sense.
In future releases this may become a fatal error.

=item *

Spurious syntax errors generated in certain situations, when glob()
caused File::Glob to be loaded for the first time, have been fixed. [561]

=item *

Lvalue subroutines can now return C<undef> in list context.  However,
the lvalue subroutine feature still remains experimental.  [561+]

=item *

A lost warning "Can't declare ... dereference in my" has been
restored (Perl had it earlier but it became lost in later releases.)

=item *

A new special regular expression variable has been introduced:
C<$^N>, which contains the most-recently closed group (submatch).

=item *

C<no Module;> does not produce an error even if Module does not have an
unimport() method.  This parallels the behavior of C<use> vis-a-vis
C<import>. [561]

=item *

The numerical comparison operators return C<undef> if either operand
is a NaN.  Previously the behaviour was unspecified.

=item *

C<our> can now have an experimental optional attribute C<unique> that
affects how global variables are shared among multiple interpreters,
see L<perlfunc/our>.

=item *

The following builtin functions are now overridable: each(), keys(),
pop(), push(), shift(), splice(), unshift(). [561]

=item *

C<pack() / unpack()> can now group template letters with C<()> and then
apply repetition/count modifiers on the groups.

=item *

C<pack() / unpack()> can now process the Perl internal numeric types:
IVs, UVs, NVs-- and also long doubles, if supported by the platform.
The template letters are C<j>, C<J>, C<F>, and C<D>.

=item *

C<pack('U0a*', ...)> can now be used to force a string to UTF-8.

=item *

C<my __PACKAGE__ $obj> now works. [561]

=item *

POSIX::sleep() now returns the number of I<unslept> seconds
(as the POSIX standard says), as opposed to CORE::sleep() which
returns the number of slept seconds.

=item *

printf() and sprintf() now support parameter reordering using the
C<%\d+\$> and C<*\d+\$> syntaxes.  For example

    printf "%2\$s %1\$s\n", "foo", "bar";

will print "bar foo\n".  This feature helps in writing
internationalised software, and in general when the order
of the parameters can vary.

=item *

The (\&) prototype now works properly. [561]

=item *

prototype(\[$@%&]) is now available to implicitly create references
(useful for example if you want to emulate the tie() interface).

=item *

A new command-line option, C<-t>, is available.  It is the
little brother of C<-T>: instead of dying on taint violations,
lexical warnings are given.  B<This is only meant as a temporary
debugging aid while securing the code of old legacy applications.
This is not a substitute for -T.>

=item *

In other taint news, the C<exec LIST> and C<system LIST> have now been
considered too risky (think C<exec @ARGV>: it can start any program
with any arguments), and now the said forms cause a warning under
lexical warnings.  You should carefully launder the arguments to
guarantee their validity.  In future releases of Perl the forms will
become fatal errors so consider starting laundering now.

=item *

Tied hash interfaces are now required to have the EXISTS and DELETE
methods (either own or inherited).

=item *

If tr/// is just counting characters, it doesn't attempt to
modify its target.

=item *

untie() will now call an UNTIE() hook if it exists.  See L<perltie>
for details. [561]

=item *

L<perlfunc/utime> now supports C<utime undef, undef, @files> to change the
file timestamps to the current time.

=item *

The rules for allowing underscores (underbars) in numeric constants
have been relaxed and simplified: now you can have an underscore
simply B<between digits>.

=item *

Rather than relying on C's argv[0] (which may not contain a full pathname),
where possible $^X is now set by asking the operating system
(e.g. by reading F</proc/self/exe> on Linux or F</proc/curproc/file> on FreeBSD).

=item *

A new variable, C<${^TAINT}>, indicates whether taint mode is enabled.

=item *

You can now override the readline() builtin, and this also overrides
the <FILEHANDLE> angle bracket operator.

=item *

The command-line options -s and -F are now recognized on the shebang
(#!) line.

=item *

Use of the C</c> match modifier without an accompanying C</g> modifier
elicits a new warning: C<Use of /c modifier is meaningless without /g>.

Use of C</c> in substitutions, even with C</g>, elicits
C<Use of /c modifier is meaningless in s///>.

Use of C</g> with C<split> elicits C<Use of /g modifier is meaningless
in split>.

=item *

Support for the C<CLONE> special subroutine has been added.
With ithreads, when a new thread is created, all Perl data is cloned;
however, non-Perl data cannot be cloned automatically.  In C<CLONE> you
can do whatever you need to do, like for example handle the cloning of
non-Perl data, if necessary.  C<CLONE> will be executed once for every
package that has it defined or inherited.  It will be called in the
context of the new thread, so all modifications are made in the new area.

See L<perlmod>.

=back
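
As one illustration of the changes listed above, the grouping syntax
for C<pack()> and C<unpack()> can be sketched like this.  The record
layout (a two-character tag followed by a 16-bit big-endian count) is
invented for the example.

    use strict;
    use warnings;

    # "(A2 n)2" repeats the group "A2 n" twice.
    my $packed = pack("(A2 n)2", "ab", 1, "cd", 2);

    # "(A2 n)*" keeps applying the group until the data runs out.
    my @fields = unpack("(A2 n)*", $packed);

    print length($packed), " bytes: @fields\n";

Without groups the same template would have to be written out in full
as C<"A2 n A2 n">, and a variable number of records could not be
expressed with a single count.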

=head1 Modules and Pragmata

=head2 New Modules and Pragmata

=over 4

=item *

C<Attribute::Handlers>, originally by Damian Conway and now maintained
by Arthur Bergman, allows a class to define attribute handlers.

    package MyPack;
    use Attribute::Handlers;
    sub Wolf :ATTR(SCALAR) { print "howl!\n" }

    # later, in some package using or inheriting from MyPack...

    my MyPack $Fluffy : Wolf; # the attribute handler Wolf will be called

Both variables and routines can have attribute handlers.  Handlers can
be specific to type (SCALAR, ARRAY, HASH, or CODE), or specific to the
exact compilation phase (BEGIN, CHECK, INIT, or END).
See L<Attribute::Handlers>.

=item *

C<B::Concise>, by Stephen McCamant, is a new compiler backend for
walking the Perl syntax tree, printing concise info about ops.
The output is highly customisable.  See L<B::Concise>. [561+]

=item *

The new bignum, bigint, and bigrat pragmas, by Tels, implement
transparent bignum support (using the Math::BigInt, Math::BigFloat,
and Math::BigRat backends).

=item *

C<Class::ISA>, by Sean Burke, is a module for reporting the search
path for a class's ISA tree.  See L<Class::ISA>.

=item *

C<Cwd> now has a split personality: if possible, an XS extension is
used (this will hopefully be faster, more secure, and more robust),
but if not possible, the familiar Perl implementation is used.

=item *

C<Devel::PPPort>, originally by Kenneth Albanowski and now
maintained by Paul Marquess, has been added.  It is primarily used
by C<h2xs> to enhance portability of XS modules between different
versions of Perl.  See L<Devel::PPPort>.

=item *

C<Digest>, frontend module for calculating digests (checksums), from
Gisle Aas, has been added.  See L<Digest>.

=item *

C<Digest::MD5> for calculating MD5 digests (checksums) as defined in
RFC 1321, from Gisle Aas, has been added.  See L<Digest::MD5>.

    use Digest::MD5 'md5_hex';

    $digest = md5_hex("Thirsty Camel");

    print $digest, "\n"; # 01d19d9d2045e005c3f1b80e8b164de1

NOTE: the C<MD5> backward compatibility module is deliberately not
included since its further use is discouraged.

See also L<PerlIO::via::QuotedPrint>.

=item *

C<Encode>, originally by Nick Ing-Simmons and now maintained by Dan
Kogai, provides a mechanism to translate between different character
encodings.  Support for Unicode, ISO-8859-1, and ASCII is compiled in
to the module.  Several other encodings (like the rest of the
ISO-8859, CP*/Win*, Mac, KOI8-R, three variants of EBCDIC, Chinese,
Japanese, and Korean encodings) are included and can be loaded at
runtime.  (For space considerations, the largest Chinese encodings
have been separated into their own CPAN module, Encode::HanExtra,
which Encode will use if available).  See L<Encode>.

Any encoding supported by the Encode module is also available to the
":encoding()" layer if PerlIO is used.

=item *

C<Hash::Util> is the interface to the new I<restricted hashes>
feature.  (Implemented by Jeffrey Friedl, Nick Ing-Simmons, and
Michael Schwern.)  See L<Hash::Util>.

=item *

C<I18N::Langinfo> can be used to query locale information.
See L<I18N::Langinfo>.

=item *

C<I18N::LangTags>, by Sean Burke, has functions for dealing with
RFC3066-style language tags.  See L<I18N::LangTags>.

=item *

C<ExtUtils::Constant>, by Nicholas Clark, is a new tool for extension
writers for generating XS code to import C header constants.
See L<ExtUtils::Constant>.

=item *

C<Filter::Simple>, by Damian Conway, is an easy-to-use frontend to
Filter::Util::Call.  See L<Filter::Simple>.

    # in MyFilter.pm:

    package MyFilter;

    use Filter::Simple sub {
        while (my ($from, $to) = splice @_, 0, 2) {
                s/$from/$to/g;
        }
    };

    1;

    # in user's code:

    use MyFilter qr/red/ => 'green';

    print "red\n";   # this code is filtered, will print "green\n"
    print "bored\n"; # this code is filtered, will print "bogreen\n"

    no MyFilter;

    print "red\n";   # this code is not filtered, will print "red\n"

=item *

C<File::Temp>, by Tim Jenness, allows one to create temporary files
and directories in an easy, portable, and secure way.  See L<File::Temp>.
[561+]

=item *

C<Filter::Util::Call>, by Paul Marquess, provides you with the
framework to write I<source filters> in Perl.  For most uses, the
frontend Filter::Simple is to be preferred.  See L<Filter::Util::Call>.

=item *

C<if>, by Ilya Zakharevich, is a new pragma for conditional inclusion
of modules.

=item *

C<libnet>, by Graham Barr, is a collection of perl5 modules related
to network programming.  See L<Net::FTP>, L<Net::NNTP>, L<Net::Ping>
(not part of libnet, but related), L<Net::POP3>, L<Net::SMTP>,
and L<Net::Time>.

Perl installation leaves libnet unconfigured; use F<libnetcfg>
to configure it.

=item *

C<List::Util>, by Graham Barr, is a selection of general-utility
list subroutines, such as sum(), min(), first(), and shuffle().
See L<List::Util>.

=item *

C<Locale::Constants>, C<Locale::Country>, C<Locale::Currency>,
C<Locale::Language>, and C<Locale::Script>, by Neil Bowers, have
been added.  They provide the codes for various locale standards, such
as "fr" for France, "usd" for US Dollar, and "ja" for Japanese.

    use Locale::Country;

    $country = code2country('jp');               # $country gets 'Japan'
    $code    = country2code('Norway');           # $code gets 'no'

See L<Locale::Constants>, L<Locale::Country>, L<Locale::Currency>,
and L<Locale::Language>.

=item *

C<Locale::Maketext>, by Sean Burke, is a localization framework.  See
L<Locale::Maketext>, and L<Locale::Maketext::TPJ13>.  The latter is an
article about software localization, originally published in The Perl
Journal #13, and republished here with kind permission.

=item *

C<Math::BigRat> for big rational numbers, to accompany Math::BigInt and
Math::BigFloat, from Tels.  See L<Math::BigRat>.

=item *

C<Memoize> can make your functions faster by trading space for time,
from Mark-Jason Dominus.  See L<Memoize>.

=item *

C<MIME::Base64>, by Gisle Aas, allows you to encode data in base64,
as defined in RFC 2045 - I<MIME (Multipurpose Internet Mail
Extensions)>.

    use MIME::Base64;

    $encoded = encode_base64('Aladdin:open sesame');
    $decoded = decode_base64($encoded);

    print $encoded, "\n"; # "QWxhZGRpbjpvcGVuIHNlc2FtZQ=="

See L<MIME::Base64>.

=item *

C<MIME::QuotedPrint>, by Gisle Aas, allows you to encode data
in quoted-printable encoding, as defined in RFC 2045 - I<MIME
(Multipurpose Internet Mail Extensions)>.

    use MIME::QuotedPrint;

    $encoded = encode_qp("\xDE\xAD\xBE\xEF");
    $decoded = decode_qp($encoded);

    print $encoded, "\n"; # "=DE=AD=BE=EF\n"
    print $decoded, "\n"; # "\xDE\xAD\xBE\xEF\n"

See also L<PerlIO::via::QuotedPrint>.

=item *

C<NEXT>, by Damian Conway, is a pseudo-class for method redispatch.
See L<NEXT>.

=item *

C<open> is a new pragma for setting the default I/O layers
for open().

=item *

C<PerlIO::scalar>, by Nick Ing-Simmons, provides the implementation
of IO to "in memory" Perl scalars as discussed above.  It also serves
as an example of a loadable PerlIO layer.  Other future possibilities
include PerlIO::Array and PerlIO::Code.  See L<PerlIO::scalar>.

=item *

C<PerlIO::via>, by Nick Ing-Simmons, acts as a PerlIO layer and wraps
PerlIO layer functionality provided by a class (typically implemented
in Perl code).

=item *

C<PerlIO::via::QuotedPrint>, by Elizabeth Mattijsen, is an example
of a C<PerlIO::via> class:

    use PerlIO::via::QuotedPrint;
    open($fh,">:via(QuotedPrint)",$path);

This will automatically convert everything output to C<$fh> to
Quoted-Printable.  See L<PerlIO::via> and L<PerlIO::via::QuotedPrint>.

=item *

C<Pod::ParseLink>, by Russ Allbery, has been added,
to parse LZ<><> links in pods as described in the new
perlpodspec.

=item *

C<Pod::Text::Overstrike>, by Joe Smith, has been added.
It converts POD data to formatted overstrike text.
See L<Pod::Text::Overstrike>. [561+]

=item *

C<Scalar::Util> is a selection of general-utility scalar subroutines,
such as blessed(), reftype(), and tainted().  See L<Scalar::Util>.

=item *

C<sort> is a new pragma for controlling the behaviour of sort().

=item *

C<Storable> gives persistence to Perl data structures by allowing the
storage and retrieval of Perl data to and from files in a fast and
compact binary format.  Because in effect Storable does serialisation
of Perl data structures, with it you can also clone deep, hierarchical
datastructures.  Storable was originally created by Raphael Manfredi,
but it is now maintained by Abhijit Menon-Sen.  Storable has been
enhanced to understand the two new hash features, Unicode keys and
restricted hashes.  See L<Storable>.

=item *

C<Switch>, by Damian Conway, has been added.  Just by saying

    use Switch;

you have C<switch> and C<case> available in Perl.

    use Switch;

    switch ($val) {
        case 1          { print "number 1" }
        case "a"        { print "string a" }
        case [1..10,42] { print "number in list" }
        case (@array)   { print "number in list" }
        case /\w+/      { print "pattern" }
        case qr/\w+/    { print "pattern" }
        case (%hash)    { print "entry in hash" }
        case (\%hash)   { print "entry in hash" }
        case (\&sub)    { print "arg to subroutine" }
        else            { print "previous case not true" }
    }

See L<Switch>.

=item *

C<Test::More>, by Michael Schwern, is yet another framework for writing
test scripts, more extensive than Test::Simple.  See L<Test::More>.

=item *

C<Test::Simple>, by Michael Schwern, has basic utilities for writing
tests.   See L<Test::Simple>.

=item *

C<Text::Balanced>, by Damian Conway, has been added, for extracting
delimited text sequences from strings.

    use Text::Balanced 'extract_delimited';

    ($a, $b) = extract_delimited("'never say never', he never said", "'", '');

$a will be "'never say never'", $b will be ', he never said'.

In addition to extract_delimited(), there are also extract_bracketed(),
extract_quotelike(), extract_codeblock(), extract_variable(),
extract_tagged(), extract_multiple(), gen_delimited_pat(), and
gen_extract_tagged().  With these, you can implement rather advanced
parsing algorithms.  See L<Text::Balanced>.

=item *

C<threads>, by Arthur Bergman, is an interface to interpreter threads.
Interpreter threads (ithreads) is the new thread model introduced in
Perl 5.6, where it was only available as an internal interface for extension
writers (and for Win32 Perl for C<fork()> emulation).  See L<threads>,
L<threads::shared>, and L<perlthrtut>.

=item *

C<threads::shared>, by Arthur Bergman, allows data sharing for
interpreter threads.  See L<threads::shared>.

=item *

C<Tie::File>, by Mark-Jason Dominus, associates a Perl array with the
lines of a file.  See L<Tie::File>.

=item *

C<Tie::Memoize>, by Ilya Zakharevich, provides on-demand loaded hashes.
See L<Tie::Memoize>.

=item *

C<Tie::RefHash::Nestable>, by Edward Avis, allows storing hash
references (unlike the standard Tie::RefHash).  The module is contained
within Tie::RefHash.  See L<Tie::RefHash>.

=item *

C<Time::HiRes>, by Douglas E. Wegscheid, provides high resolution
timing (ualarm, usleep, and gettimeofday).  See L<Time::HiRes>.

=item *

C<Unicode::UCD> offers a querying interface to the Unicode Character
Database.  See L<Unicode::UCD>.

=item *

C<Unicode::Collate>, by SADAHIRO Tomoyuki, implements the UCA
(Unicode Collation Algorithm) for sorting Unicode strings.
See L<Unicode::Collate>.

=item *

C<Unicode::Normalize>, by SADAHIRO Tomoyuki, implements the various
Unicode normalization forms.  See L<Unicode::Normalize>.

=item *

C<XS::APItest>, by Tim Jenness, is a test extension that exercises XS
APIs.  Currently only C<printf()> is tested: how to output various
basic data types from XS.

=item *

C<XS::Typemap>, by Tim Jenness, is a test extension that exercises
XS typemaps.  Nothing gets installed, but the code is worth studying
for extension writers.

=back
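
Two of the utility modules from the list above, C<List::Util> and
C<Scalar::Util>, lend themselves to a brief sketch.  The data and the
class name C<My::Class> are hypothetical; only functions named in the
entries above are used.

    use strict;
    use warnings;
    use List::Util   qw(sum min first);
    use Scalar::Util qw(blessed reftype);

    my @numbers = (7, 3, 12, 5);

    my $total    = sum(@numbers);                 # 27
    my $smallest = min(@numbers);                 # 3
    my $big      = first { $_ > 10 } @numbers;    # 12

    my $object = bless {}, 'My::Class';           # hypothetical class
    my $class  = blessed($object);                # 'My::Class'
    my $type   = reftype($object);                # 'HASH'

    print "$total $smallest $big $class $type\n";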

=head2 Updated And Improved Modules and Pragmata

=over 4

=item *

The following independently supported modules have been updated to the
newest versions from CPAN: CGI, CPAN, DB_File, File::Spec, File::Temp,
Getopt::Long, Math::BigFloat, Math::BigInt, the podlators bundle
(Pod::Man, Pod::Text), Pod::LaTeX [561+], Pod::Parser, Storable,
Term::ANSIColor, Test, Text-Tabs+Wrap.

=item *

attributes::reftype() now works on tied arguments.

=item *

AutoLoader can now be disabled with C<no AutoLoader;>.

=item *

B::Deparse has been significantly enhanced by Robin Houston.  It can
now deparse almost all of the standard test suite (so that the tests
still succeed).  There is a make target "test.deparse" for trying this
out.

=item *

Carp now has better interface documentation, and the @CARP_NOT
interface has been added to get optional control over where errors
are reported independently of @ISA, by Ben Tilly.

=item *

Class::Struct can now define the classes at compile time.

=item *

Class::Struct now assigns the array/hash element if the accessor
is called with an array/hash element as the B<sole> argument.

=item *

The return value of Cwd::fastcwd() is now tainted.

=item *

Data::Dumper now has an option to sort hashes.

=item *

Data::Dumper now has an option to dump code references
using B::Deparse.

=item *

DB_File now supports newer Berkeley DB versions, among
other improvements.

=item *

Devel::Peek now has an interface for the Perl memory statistics
(this works only if you are using perl's malloc, and if you have
compiled with debugging).

=item *

The English module can now be used without the infamous performance
hit by saying

	use English '-no_match_vars';

(Assuming, of course, that you don't need the troublesome variables
C<$`>, C<$&>, or C<$'>.)  Also, the C<@LAST_MATCH_START> and
C<@LAST_MATCH_END> English aliases for C<@-> and C<@+> have been introduced.

=item *

ExtUtils::MakeMaker has been significantly cleaned up and fixed.
The enhanced version has also been backported to earlier releases
of Perl and submitted to CPAN so that the earlier releases can
enjoy the fixes.

=item *

The arguments of WriteMakefile() in Makefile.PL are now checked
for sanity much more carefully than before.  This may cause new
warnings when modules are being installed.  See L<ExtUtils::MakeMaker>
for more details.

=item *

ExtUtils::MakeMaker now uses File::Spec internally, which hopefully
leads to better portability.

=item *

Fcntl, Socket, and Sys::Syslog have been rewritten by Nicholas Clark
to use the new-style constant dispatch section (see L<ExtUtils::Constant>).
This means that they will be more robust and hopefully faster.

=item *

File::Find now chdir()s correctly when chasing symbolic links. [561]

=item *

File::Find now has pre- and post-processing callbacks.  It also
correctly changes directories when chasing symbolic links.  Callbacks
(naughtily) exiting with "next;" instead of "return;" now work.

=item *

File::Find is now (again) reentrant.  It also has been made
more portable.

=item *

The warnings issued by File::Find now belong to their own category.
You can enable/disable them with C<use/no warnings 'File::Find';>.

=item *

File::Glob::glob() has been renamed to File::Glob::bsd_glob()
because the name clashes with the builtin glob().  The older
name is still available for compatibility, but is deprecated. [561]

=item *

File::Glob now supports C<GLOB_LIMIT> constant to limit the size of
the returned list of filenames.

=item *

IPC::Open3 now allows the use of numeric file descriptors.

=item *

IO::Socket now has an atmark() method, which returns true if the socket
is positioned at the out-of-band mark.  The method is also exportable
as a sockatmark() function.

=item *

IO::Socket::INET failed to open the specified port if the service name
was not known.  It now correctly uses the supplied port number as is. [561]

=item *

IO::Socket::INET has support for the ReusePort option (if your
platform supports it).  The Reuse option now has an alias, ReuseAddr.
For clarity, you may want to prefer ReuseAddr.

=item *

IO::Socket::INET now supports a value of zero for C<LocalPort>
(usually meaning that the operating system will make one up.)

=item *

'use lib' now works identically to @INC.  Removing directories
with 'no lib' now works.

=item *

Math::BigFloat and Math::BigInt have undergone a full rewrite by Tels.
They are now orders of magnitude faster, and they support various bignum
libraries such as GMP and PARI as their backends.

=item *

Math::Complex handles inf, NaN, etc. better.

=item *

Net::Ping has been considerably enhanced by Rob Brown: multihoming is
now supported, Win32 functionality is better, there is now time
measuring functionality (optionally high-resolution using
Time::HiRes), and there is now an "external" protocol which uses the
Net::Ping::External module, which runs your external ping utility and
parses the output.  A version of Net::Ping::External is available on
CPAN.

Note that some of the Net::Ping tests are disabled when running
under the Perl distribution since one cannot assume one or more
of the following: enabled echo port at localhost, full Internet
connectivity, or sympathetic firewalls.  You can set the environment
variable PERL_TEST_Net_Ping to "1" (one) before running the Perl test
suite to enable all the Net::Ping tests.

=item *

POSIX::sigaction() is now much more flexible and robust.
You can now install coderef handlers as well as 'DEFAULT' and 'IGNORE'
handlers.  Previously, installing new handlers was not atomic.
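
Installing a code reference as a handler might look like the sketch
below.  It assumes a platform that provides C<SIGUSR1>, and the counter
variable is made up for the example.

    use strict;
    use warnings;
    use POSIX qw(sigaction SIGUSR1);

    my $got = 0;

    # Install a code reference as the handler for SIGUSR1.
    my $action = POSIX::SigAction->new(sub { $got++ });
    sigaction(SIGUSR1, $action) or die "sigaction: $!";

    kill 'USR1', $$;          # send the signal to ourselves
    print "handler ran $got time(s)\n";

    # Put the default disposition back.
    sigaction(SIGUSR1, POSIX::SigAction->new('DEFAULT'))
        or die "sigaction: $!";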

=item *

In Safe, C<%INC> is now localised in a Safe compartment so that
use/require work.

=item *

In SDBM_File on DOSish platforms, some keys went missing because of
lack of support for files with "holes".  A workaround for the problem
has been added.

=item *

In Search::Dict one can now have a pre-processing hook for the
lines being searched.

=item *

The Shell module now has an OO interface.

=item *

In Sys::Syslog there is now a failover mechanism that will go
through alternative connection mechanisms until the message
is successfully logged.

=item *

The Test module has been significantly enhanced.

=item *

Time::Local::timelocal() does not handle fractional seconds anymore.
The rationale is that neither does localtime(), and timelocal() and
localtime() are supposed to be inverses of each other.
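
The inverse relationship can be checked with a short round trip; the
date used here is arbitrary.

    use strict;
    use warnings;
    use Time::Local qw(timelocal);

    # Whole seconds only: sec, min, hour, mday, mon (0-based), year.
    my @fields = (30, 15, 12, 18, 6, 2002);    # 18 July 2002, 12:15:30
    my $epoch  = timelocal(@fields);

    # localtime() turns the epoch value back into the same fields.
    my @back = (localtime $epoch)[0 .. 5];
    $back[5] += 1900;    # localtime() reports years since 1900
    print "round trip ok\n" if "@back" eq "@fields";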

=item *

The vars pragma now supports declaring fully qualified variables.
(Something that C<our()> does not and will not support.)

=item *

The C<utf8::> name space (as in the pragma) provides various
Perl-callable functions to provide low level access to Perl's
internal Unicode representation.  At the moment only length()
has been implemented.

=back

=head1 Utility Changes

=over 4

=item *

Emacs perl mode (emacs/cperl-mode.el) has been updated to version
4.31.

=item *

F<emacs/e2ctags.pl> is now much faster.

=item *

C<enc2xs> is a tool for people adding their own encodings to the
Encode module.

=item *

C<h2ph> now supports C trigraphs.

=item *

C<h2xs> now produces a template README.

=item *

C<h2xs> now uses C<Devel::PPPort> for better portability between
different versions of Perl.

=item *

C<h2xs> uses the new L<ExtUtils::Constant|ExtUtils::Constant> module
which will affect newly created extensions that define constants.
Since the new code is more correct (if you have two constants where the
first one is a prefix of the second one, the first constant B<never>
got defined), less lossy (it uses integers for integer constant,
as opposed to the old code that used floating point numbers even for
integer constants), and slightly faster, you might want to consider
regenerating your extension code (the new scheme makes regenerating
easy).  L<h2xs> now also supports C trigraphs.

=item *

C<libnetcfg> has been added to configure libnet.

=item *

C<perlbug> is now much more robust.  It also sends the bug report to
perl.org, not perl.com.

=item *

C<perlcc> has been rewritten and its user interface (that is,
command line) is much more like that of the Unix C compiler, cc.
(The perlbc tool has been removed.  Use C<perlcc -B> instead.)
B<Note that perlcc is still considered very experimental and
unsupported.> [561]

=item *

C<perlivp> is a new Installation Verification Procedure utility
for running any time after installing Perl.

=item *

C<piconv> is an implementation of the character conversion utility
C<iconv>, demonstrating the new Encode module.

=item *

C<pod2html> now allows specifying a cache directory.

=item *

C<pod2html> now produces XHTML 1.0.

=item *

C<pod2html> now understands POD written using different line endings
(PC-like CRLF versus Unix-like LF versus MacClassic-like CR).

=item *

C<s2p> has been completely rewritten in Perl.  (It is in fact a full
implementation of sed in Perl: you can use the sed functionality by
using the C<psed> utility.)

=item *

C<xsubpp> now understands POD documentation embedded in the *.xs
files. [561]

=item *

C<xsubpp> now supports the OUT keyword.

=back

=head1 New Documentation

=over 4

=item *

perl56delta details the changes between the 5.005 release and the
5.6.0 release.

=item *

perlclib documents the internal replacements for standard C library
functions.  (Interesting only for extension writers and Perl core
hackers.) [561+]

=item *

perldebtut is a Perl debugging tutorial. [561+]

=item *

perlebcdic contains considerations for running Perl on EBCDIC
platforms. [561+]

=item *

perlintro is a gentle introduction to Perl.

=item *

perliol documents the internals of PerlIO with layers.

=item *

perlmodstyle is a style guide for writing modules.

=item *

perlnewmod tells about writing and submitting a new module. [561+]

=item *

perlpacktut is a pack() tutorial.

=item *

perlpod has been rewritten to be clearer and to record the best
practices gathered over the years.

=item *

perlpodspec is a more formal specification of the pod format,
mainly of interest for writers of pod applications, not to
people writing in pod.

=item *

perlretut is a regular expression tutorial. [561+]

=item *

perlrequick is a regular expressions quick-start guide.
Yes, much quicker than perlretut. [561]

=item *

perltodo has been updated.

=item *

perltootc has been renamed as perltooc (so as not to conflict
with perltoot in filesystems restricted to "8.3" names).

=item *

perluniintro is an introduction to using Unicode in Perl.
(perlunicode is more of a detailed reference and background
information.)

=item *

perlutil explains the command line utilities packaged with the Perl
distribution. [561+]

=back

The following platform-specific documents are available before
the installation as README.I<platform>, and after the installation
as perlI<platform>:

    perlaix perlamiga perlapollo perlbeos perlbs2000
    perlce perlcygwin perldgux perldos perlepoc perlfreebsd perlhpux
    perlhurd perlirix perlmachten perlmacos perlmint perlmpeix
    perlnetware perlos2 perlos390 perlplan9 perlqnx perlsolaris
    perltru64 perluts perlvmesa perlvms perlvos perlwin32

These documents usually detail one or more of the following subjects:
configuring, building, testing, installing, and sometimes also using
Perl on the said platform.

Eastern Asian Perl users are now welcomed in their own languages:
README.jp (Japanese), README.ko (Korean), README.cn (simplified
Chinese) and README.tw (traditional Chinese), which are written in
normal pod but encoded in EUC-JP, EUC-KR, EUC-CN and Big5.  These
will get installed as

   perljp perlko perlcn perltw

=over 4

=item *

The documentation for the POSIX-BC platform is called "BS2000", to avoid
confusion with the Perl POSIX module.

=item *

The documentation for the WinCE platform is called perlce (README.ce
in the source code kit), to avoid confusion with the perlwin32
documentation on 8.3-restricted filesystems.

=back

=head1 Performance Enhancements

=over 4

=item *

map() could get pathologically slow when the result list it generates
is larger than the source list.  The performance has been improved for
common scenarios. [561]

=item *

sort() is also fully reentrant, in the sense that the sort function
can itself call sort().  This did not work reliably in previous
releases. [561]
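
A minimal sketch of such a nested call; the helper C<letters> is
hypothetical and exists only for this example.

    use strict;
    use warnings;

    # Hypothetical helper: the letters of a word in sorted order.
    sub letters { join '', sort(split //, $_[0]) }

    # The comparison block calls letters(), which itself calls sort().
    my @by_letters = sort { letters($a) cmp letters($b) } qw(dog bird cat);
    print "@by_letters\n";    # cat bird dog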

=item *

sort() has been changed to use primarily mergesort internally as
opposed to the earlier quicksort.  For very small lists this may
result in slightly slower sorting times, but in general the speedup
should be at least 20%.  Additional bonuses are that the worst case
behaviour of sort() is now better (in computer science terms it now
runs in time O(N log N), as opposed to quicksort's Theta(N**2)
worst-case run time behaviour), and that sort() is now stable
(meaning that elements with identical keys will stay ordered as they
were before the sort).  See the C<sort> pragma for information.

The story in more detail: suppose you want to serve yourself a little
slice of Pi.

    @digits = ( 3,1,4,1,5,9 );

A numerical sort of the digits will yield (1,1,3,4,5,9), as expected.
Which C<1> comes first is hard to know, since one C<1> looks pretty
much like any other.  You can regard this as totally trivial,
or somewhat profound.  However, if you just want to sort the even
digits ahead of the odd ones, then what will

    sort { ($a % 2) <=> ($b % 2) } @digits;

yield?  The only even digit, C<4>, will come first.  But how about
the odd numbers, which all compare equal?  With the quicksort algorithm
used to implement sort() in Perl 5.6 and earlier, the order of ties is left
up to the sort.  So, as you add more and more digits of Pi, the order
in which the sorted even and odd digits appear will change,
and, for sufficiently large slices of Pi, the quicksort algorithm
in Perl 5.8 won't return the same results even if reinvoked with the
same input.  The justification for this rests with quicksort's
worst case behavior.  If you run

   sort { $a <=> $b } ( 1 .. $N , 1 .. $N );

(something you might approximate if you wanted to merge two sorted
arrays using sort), doubling $N doesn't just double the quicksort time,
it I<quadruples> it.  Quicksort has a worst case run time that can
grow like N**2, so-called I<quadratic> behaviour, and it can happen
on patterns that may well arise in normal use.  You won't notice this
for small arrays, but you I<will> notice it with larger arrays,
and you may not live long enough for the sort to complete on arrays
of a million elements.  So the 5.8 quicksort scrambles large arrays
before sorting them, as a statistical defence against quadratic behaviour.
But that means if you sort the same large array twice, ties may be
broken in different ways.

Because of the unpredictability of tie-breaking order, and the quadratic
worst-case behaviour, quicksort was I<almost> replaced completely with
a stable mergesort.  I<Stable> means that ties are broken to preserve
the original order of appearance in the input array.  So

    sort { ($a % 2) <=> ($b % 2) } (3,1,4,1,5,9);

will yield (4,3,1,1,5,9), guaranteed.  The even and odd numbers
appear in the output in the same order they appeared in the input.
Mergesort has worst case O(N log N) behaviour, the best value
attainable.  And, ironically, this mergesort does particularly
well where quicksort goes quadratic:  mergesort sorts (1..$N, 1..$N)
in O(N) time.  But quicksort was rescued at the last moment because
it is faster than mergesort on certain inputs and platforms.
For example, if you really I<don't> care about the order of even
and odd digits, quicksort will run in O(N) time; it's very good
at sorting many repetitions of a small number of distinct elements.
The quicksort divide and conquer strategy works well on platforms
with relatively small, very fast, caches.  Eventually, the problem gets
whittled down to one that fits in the cache, from which point it
benefits from the increased memory speed.

Quicksort was rescued by implementing a sort pragma to control aspects
of the sort.  The B<stable> subpragma forces stable behaviour,
regardless of algorithm.  The B<_quicksort> and B<_mergesort>
subpragmas are heavy-handed ways to select the underlying implementation.
The leading C<_> is a reminder that these subpragmas may not survive
beyond 5.8.  More appropriate mechanisms for selecting the implementation
exist, but they wouldn't have arrived in time to save quicksort.
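
Putting the pieces together, the slice-of-Pi example can be run with
the B<stable> subpragma requested explicitly.  This is a sketch that
assumes a Perl in which the C<sort> pragma is available.

    use strict;
    use warnings;
    use sort 'stable';    # request a stable sort, whatever the algorithm

    my @digits = (3, 1, 4, 1, 5, 9);

    # Even digits first; ties keep their original relative order.
    my @sorted = sort { ($a % 2) <=> ($b % 2) } @digits;
    print "@sorted\n";    # 4 3 1 1 5 9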

=item *

Hashes now use Bob Jenkins' "One-at-a-Time" key hashing algorithm
( http://burtleburtle.net/bob/hash/doobs.html ).  This algorithm is
reasonably fast while producing a much better spread of values than
the old hashing algorithm (originally by Chris Torek, later tweaked by
Ilya Zakharevich).  Hash values output from the algorithm on a hash of
all 3-char printable ASCII keys come much closer to passing the
DIEHARD random number generation tests.  According to perlbench, this
change has not affected the overall speed of Perl.
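
For the curious, the published One-at-a-Time algorithm can be sketched
in pure Perl as below.  This is not Perl's internal C implementation,
which may differ in detail; the function name is made up, and the code
assumes 64-bit Perl integers, masking every step back to 32 bits.

    use strict;
    use warnings;

    # Sketch of Bob Jenkins' One-at-a-Time hash over the bytes of a key.
    sub one_at_a_time {
        my ($key) = @_;
        my $mask = 0xFFFFFFFF;
        my $hash = 0;
        for my $byte (unpack 'C*', $key) {
            $hash = ($hash + $byte)         & $mask;
            $hash = ($hash + ($hash << 10)) & $mask;
            $hash ^= $hash >> 6;
        }
        $hash = ($hash + ($hash << 3))  & $mask;
        $hash ^= $hash >> 11;
        $hash = ($hash + ($hash << 15)) & $mask;
        return $hash;
    }

    printf "%08x\n", one_at_a_time('a');    # ca2e9442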

=item *

unshift() should now be noticeably faster.

=back

=head1 Installation and Configuration Improvements

=head2 Generic Improvements

=over 4

=item *

INSTALL now explains how you can configure Perl to use 64-bit
integers even on non-64-bit platforms.

=item *

Policy.sh policy change: if you are reusing a Policy.sh file
(see INSTALL) and you use Configure -Dprefix=/foo/bar and in the old
Policy $prefix eq $siteprefix and $prefix eq $vendorprefix, all of
them will now be changed to the new prefix, /foo/bar.  (Previously
only $prefix changed.)  If you do not like this new behaviour,
specify prefix, siteprefix, and vendorprefix explicitly.

=item *

A new optional location for Perl libraries, otherlibdirs, is available.
It can be used for example for vendor add-ons without disturbing Perl's
own library directories.

=item *

In many platforms, the vendor-supplied 'cc' is too stripped-down to
build Perl (basically, 'cc' doesn't do ANSI C).  If this seems
to be the case and 'cc' does not seem to be the GNU C compiler
'gcc', an automatic attempt is made to find and use 'gcc' instead.

=item *

gcc needs to closely track the operating system release to avoid
build problems. If Configure finds that gcc was built for a different
operating system release than is running, it now gives a clearly visible
warning that there may be trouble ahead.

=item *

Since Perl 5.8 is not binary-compatible with previous releases
of Perl, Configure no longer suggests including the 5.005
modules in @INC.

=item *

Configure C<-S> can now run non-interactively. [561]

=item *

Configure support for pdp11-style memory models has been removed due
to obsolescence. [561]

=item *

configure.gnu now works with options with whitespace in them.

=item *

installperl now outputs everything to STDERR.

=item *

Because PerlIO is now the default on most platforms, "-perlio" doesn't
get appended to the $Config{archname} anymore.
Instead, if you explicitly choose not to use perlio (Configure command
line option -Uuseperlio), you will get "-stdio" appended.

=item *

Another change related to the architecture name is that "-64all"
(-Duse64bitall, or "maximally 64-bit") is appended only if your
pointers are 64 bits wide.  (To be exact, the use64bitall is ignored.)

=item *

In AFS installations, one can configure the root of the AFS to be
somewhere else than the default F</afs> by using the Configure
parameter C<-Dafsroot=/some/where/else>.

=item *

APPLLIB_EXP, a lesser-known configuration-time definition, has been
documented.  It can be used to prepend site-specific directories
to Perl's default search path (@INC); see INSTALL for information.

=item *

The version of Berkeley DB used when Perl (and, presumably, the
DB_File extension) was built is now available as
C<@Config{qw(db_version_major db_version_minor db_version_patch)}>
from Perl and as C<DB_VERSION_MAJOR_CFG DB_VERSION_MINOR_CFG
DB_VERSION_PATCH_CFG> from C.
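
From Perl, the three values can be read with a hash slice, for example
as in this sketch.  The values may be empty or undefined when Perl was
built without Berkeley DB, so the example substitutes a question mark.

    use strict;
    use warnings;
    use Config;

    # Each value may be empty or undefined if Berkeley DB was not found.
    my @parts = map { defined($_) && length($_) ? $_ : '?' }
        @Config{qw(db_version_major db_version_minor db_version_patch)};
    my $db_version = join '.', @parts;
    print "Berkeley DB version at build time: $db_version\n";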

=item *

Building Berkeley DB3 for compatibility modes for DB, NDBM, and ODBM
has been documented in INSTALL.

=item *

If you have CPAN access (either network or a local copy such as a
CD-ROM) you can specify extra modules for Configure to build and
install with Perl using the -Dextras=...  option.  See INSTALL for
more details.

=item *

In addition to config.over, a new override file, config.arch, is
available.  This file is supposed to be used by hints file writers
for architecture-wide changes (as opposed to config.over which is
for site-wide changes).

=item *

If your file system supports symbolic links, you can build Perl outside
of the source directory by

	mkdir perl/build/directory
	cd perl/build/directory
	sh /path/to/perl/source/Configure -Dmksymlinks ...

This will create in perl/build/directory a tree of symbolic links
pointing to files in /path/to/perl/source.  The original files are left
unaffected.  After Configure has finished, you can just say

	make all test

and Perl will be built and tested, all in perl/build/directory.
[561]

=item *

For Perl developers, several new make targets for profiling
and debugging have been added; see L<perlhack>.

=over 8

=item *

Use of the F<gprof> tool to profile Perl has been documented in
L<perlhack>.  There is a make target called "perl.gprof" for
generating a gprofiled Perl executable.

=item *

If you have GCC 3, there is a make target called "perl.gcov" for
creating a gcoved Perl executable for coverage analysis.  See
L<perlhack>.

=item *

If you are on IRIX or Tru64 platforms, new profiling/debugging options
have been added; see L<perlhack> for more information about pixie and
Third Degree.

=back

=item *

Guidelines on how to construct minimal Perl installations have
been added to INSTALL.

=item *

The Thread extension is now not built at all under ithreads
(C<Configure -Duseithreads>) because it wouldn't work anyway (the
Thread extension requires being Configured with C<-Duse5005threads>).

B<Note that the 5.005 threads are unsupported and deprecated: if you
have code written for the old threads you should migrate it to the
new ithreads model.>

=item *

The Gconvert macro ($Config{d_Gconvert}) used by perl for stringifying
floating-point numbers is now more picky about using sprintf %.*g
rules for the conversion.  Some platforms that used to use gcvt may
now resort to the slower sprintf.

=item *

The obsolete method of making a special (e.g., debugging) flavor
of perl by saying

	make LIBPERL=libperld.a

has been removed. Use -DDEBUGGING instead.

=back

=head2 New Or Improved Platforms

For the list of platforms known to support Perl,
see L<perlport/"Supported Platforms">.

=over 4

=item *

AIX dynamic loading should now be better supported.

=item *

AIX should now work better with gcc, threads, and 64-bitness.  Also the
long doubles support in AIX should be better now.  See L<perlaix>.

=item *

AtheOS ( http://www.atheos.cx/ ) is a new platform.

=item *

BeOS has been reclaimed.

=item *

The DG/UX platform now supports 5.005-style threads.
See L<perldgux>.

=item *

The DYNIX/ptx platform (also known as dynixptx) is supported at or
near osvers 4.5.2.

=item *

EBCDIC platforms (z/OS (also known as OS/390), POSIX-BC, and VM/ESA)
have been regained.  Many test suite tests still fail and the
co-existence of Unicode and EBCDIC isn't quite settled, but the
situation is much better than with Perl 5.6.  See L<perlos390>,
L<perlbs2000> (for POSIX-BC), and perlvmesa for more information.
(B<Note:> support for VM/ESA was removed in Perl v5.18.0. The relevant
information was in F<README.vmesa>.)

=item *

Building perl with -Duseithreads or -Duse5005threads now works under
HP-UX 10.20 (previously it only worked under 10.30 or later). You will
need a thread library package installed. See README.hpux. [561]

=item *

Mac OS Classic is now supported in the mainstream source package
(MacPerl has of course been available since perl 5.004 but now the
source code bases of standard Perl and MacPerl have been synchronised).
[561]

=item *

Mac OS X (or Darwin) should now be able to build Perl even on HFS+
filesystems.  (The case-insensitivity used to confuse the Perl build
process.)

=item *

NCR MP-RAS is now supported. [561]

=item *

All the NetBSD specific patches (except for the installation
specific ones) have been merged back to the main distribution.

=item *

NetWare from Novell is now supported.  See L<perlnetware>.

=item *

NonStop-UX is now supported. [561]

=item *

NEC SUPER-UX is now supported.

=item *

All the OpenBSD specific patches (except for the installation
specific ones) have been merged back to the main distribution.

=item *

Perl has been tested with the GNU pth userlevel thread package
( http://www.gnu.org/software/pth/pth.html ).  All thread tests
of Perl now work, but not without adding some yield()s to the tests,
so while pth (and other userlevel thread implementations) can be
considered to be "working" with Perl ithreads, keep in mind the
possible non-preemptability of the underlying thread implementation.

=item *

Stratus VOS is now supported using Perl's native build method
(Configure).  This is the recommended method to build Perl on
VOS.  The older methods, which build miniperl, are still
available.  See L<perlvos>. [561+]

=item *

The Amdahl UTS Unix mainframe platform is now supported. [561]

=item *

WinCE is now supported.  See L<perlce>.

=item *

z/OS (formerly known as OS/390, formerly known as MVS OE) now has
support for dynamic loading.  This is not selected by default,
however; you must specify -Dusedl in the arguments of Configure. [561]

=back

=head1 Selected Bug Fixes

Numerous memory leaks and uninitialized memory accesses have been
hunted down.  Most importantly, anonymous subs used to leak quite
a bit. [561]

=over 4

=item *

The autouse pragma didn't work for Multi::Part::Function::Names.

=item *

caller() could cause core dumps in certain situations.  Carp was
sometimes affected by this problem.  In particular, caller() now
returns a subroutine name of C<(unknown)> for subroutines that have
been removed from the symbol table.

=item *

chop(@list) in list context returned the characters chopped in
reverse order.  This has been reversed to be in the right order. [561]

=item *

Configure no longer includes the DBM libraries (dbm, gdbm, db, ndbm)
when building the Perl binary.  The only exception to this is SunOS 4.x,
which needs them. [561]

=item *

The behaviour of non-decimal but numeric string constants such as
"0x23" was platform-dependent: in some platforms that was seen as 35,
in some as 0, in some as a floating point number (don't ask).  This
was caused by Perl's using the operating system libraries in a situation
where the result of the string to number conversion is undefined: now
Perl consistently handles such strings as zero in numeric contexts.
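
A short sketch of the difference between implicit numeric conversion
and an explicit hex() call (the variable names are arbitrary):

    use strict;
    use warnings;

    my $text = "0x23";

    # In numeric context only the leading "0" is used.
    my $as_number = do { no warnings 'numeric'; $text + 0 };

    # hex() performs the hexadecimal conversion explicitly.
    my $as_hex = hex $text;

    print "$as_number $as_hex\n";    # 0 35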

=item *

Several debugger fixes: exit code now reflects the script exit code,
condition C<"0"> now treated correctly, the C<d> command now checks
line number, C<$.> no longer gets corrupted, and all debugger output
now goes correctly to the socket if RemotePort is set. [561]

=item *

The debugger (perl5db.pl) has been modified to present a more
consistent commands interface, via (CommandSet=580).  perl5db.t was
also added to test the changes, and as a placeholder for further tests.

See L<perldebug>.

=item *

The debugger has a new C<dumpDepth> option to control the maximum
depth to which nested structures are dumped.  The C<x> command has
been extended so that C<x N EXPR> dumps out the value of I<EXPR> to a
depth of at most I<N> levels.

=item *

The debugger can now show lexical variables if you have the CPAN
module PadWalker installed.

=item *

The order of DESTROYs has been made more predictable.

=item *

Perl 5.6.0 could emit spurious warnings about redefinition of
dl_error() when statically building extensions into perl.
This has been corrected. [561]

=item *

L<dprofpp> -R didn't work.

=item *

C<*foo{FORMAT}> now works.

=item *

Infinity is now recognized as a number.

=item *

UNIVERSAL::isa no longer caches methods incorrectly.  (This broke
the Tk extension with 5.6.0.) [561]

=item *

Lexicals I: lexicals outside an eval "" weren't resolved
correctly inside a subroutine definition inside the eval "" if they
were not already referenced in the top level of the eval""ed code.

=item *

Lexicals II: lexicals leaked at file scope into subroutines that
were declared before the lexicals.

=item *

Lexical warnings now propagate correctly between scopes
and into C<eval "...">.

=item *

C<use warnings qw(FATAL all)> did not work as intended.  This has been
corrected. [561]

=item *

warnings::enabled() now reports the state of $^W correctly if the caller
isn't using lexical warnings. [561]

=item *

Line renumbering with eval and C<#line> now works. [561]

=item *

Fixed numerous memory leaks, especially in eval "".

=item *

Localised tied variables no longer leak memory.

    use Tie::Hash;
    tie my %tied_hash => 'Tie::StdHash';

    ...

    # Used to leak memory every time local() was called;
    # in a loop, this added up.
    local($tied_hash{Foo}) = 1;

=item *

Localised hash elements (and %ENV) are correctly unlocalised to not
exist, if they didn't before they were localised.


    use Tie::Hash;
    tie my %tied_hash => 'Tie::StdHash';

    ...

    # Nothing has set the FOO element so far

    { local $tied_hash{FOO} = 'Bar' }

    # This used to print, but not now.
    print "exists!\n" if exists $tied_hash{FOO};

As a side effect of this fix, tied hash interfaces B<must> define
the EXISTS and DELETE methods.

=item *

mkdir() now ignores trailing slashes in the directory name,
as mandated by POSIX.
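
For example, in a scratch directory created just for this sketch:

    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    my $base = tempdir(CLEANUP => 1);    # scratch area for the example

    # The trailing slash is ignored; "$base/reports" is created.
    mkdir "$base/reports/" or die "mkdir: $!";
    print "created\n" if -d "$base/reports";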

=item *

Some versions of glibc have a broken modfl().  This affects builds
with C<-Duselongdouble>.  This version of Perl detects this brokenness
and has a workaround for it.  The glibc release 2.2.2 is known to have
fixed the modfl() bug.

=item *

Modulus of unsigned numbers now works (4063328477 % 65535 used to
return 27406, instead of 27407). [561]

=item *

Some "not a number" warnings introduced in 5.6.0 have been eliminated to be
more compatible with 5.005.  Infinity is now recognised as a number. [561]

=item *

Numeric conversions did not recognize changes in the string value
properly in certain circumstances. [561]

=item *

Attributes (such as :shared) didn't work with our().

=item *

our() variables will not cause bogus "Variable will not stay shared"
warnings. [561]

=item *

"our" variables of the same name declared in two sibling blocks
resulted in bogus warnings about "redeclaration" of the variables.
The problem has been corrected. [561]

=item *

pack "Z" now correctly terminates the string with "\0".
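
A brief sketch of the resulting behaviour, with and without an explicit
field width:

    use strict;
    use warnings;

    # Z* appends a terminating NUL after the whole string.
    my $whole = pack 'Z*', 'perl';        # "perl\0"

    # With a count, the field is exactly that wide and ends in NUL.
    my $fixed = pack 'Z4', 'perl';        # "per\0"

    printf "%d %d\n", length($whole), length($fixed);    # 5 4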

=item *

Fix password routines which in some shadow password platforms
(e.g. HP-UX) caused getpwent() to return every other entry.

=item *

The PERL5OPT environment variable (for passing command line arguments
to Perl) didn't work for more than a single group of options. [561]

=item *

PERL5OPT with embedded spaces didn't work.

=item *

printf() no longer resets the numeric locale to "C".

=item *

C<qw(a\\b)> now parses correctly as C<'a\\b'>: that is, as three
characters, not four. [561]
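
This can be confirmed with a one-line check:

    use strict;
    use warnings;

    my ($word) = qw(a\\b);

    # Inside qw(), as inside single quotes, \\ stands for one backslash.
    print length($word), "\n";    # 3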

=item *

pos() did not return the correct value within s///ge in earlier
versions.  This is now handled correctly. [561]

=item *

Printing quads (64-bit integers) with printf/sprintf now works
without the q L ll prefixes (assuming you are on a quad-capable platform).

=item *

Regular expressions on references and overloaded scalars now work. [561+]

=item *

Right-hand side magic (GMAGIC) could in many cases such as string
concatenation be invoked too many times.

=item *

scalar() now forces scalar context even when used in void context.

=item *

SOCKS support is now much more robust.

=item *

sort() arguments are now compiled in the right wantarray context
(they were accidentally using the context of the sort() itself).
The comparison block is now run in scalar context, and the arguments
to be sorted are always provided in list context. [561]

=item *

Changed the POSIX character class C<[[:space:]]> to include the (very
rarely used) vertical tab character.  Added a new POSIX-ish character
class C<[[:blank:]]> which stands for horizontal whitespace
(currently, the space and the tab).
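
The sketch below tabulates both classes for a few whitespace
characters; the hash C<%class> is only there to collect the results.

    use strict;
    use warnings;

    my %class;
    for my $char (" ", "\t", "\n", "\x0B") {
        my $blank = $char =~ /\A[[:blank:]]\z/ ? 1 : 0;
        my $space = $char =~ /\A[[:space:]]\z/ ? 1 : 0;
        $class{ ord($char) } = "$blank$space";
        printf "0x%02X blank=%d space=%d\n", ord($char), $blank, $space;
    }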

=item *

The tainting behaviour of sprintf() has been rationalized.  It does
not taint the result of floating point formats anymore, making the
behaviour consistent with that of string interpolation. [561]

=item *

Some cases of inconsistent taint propagation (such as within hash
values) have been fixed.

=item *

The RE engine found in Perl 5.6.0 accidentally pessimised certain kinds
of simple pattern matches.  These are now handled better. [561]

=item *

Regular expression debug output (whether through C<use re 'debug'>
or via C<-Dr>) now looks better. [561]

=item *

Multi-line matches like C<"a\nxb\n" =~ /(?!\A)x/m> were flawed.  The
bug has been fixed. [561]

=item *

Use of $& could trigger a core dump under some situations.  This
is now avoided. [561]

=item *

The regular expression captured submatches ($1, $2, ...) are now
more consistently unset if the match fails, instead of leaving false
data lying around in them. [561]

=item *

readline() on files opened in "slurp" mode could return an extra
"" (blank line) at the end in certain situations.  This has been
corrected. [561]

=item *

Autovivification of symbolic references of special variables described
in L<perlvar> (as in C<${$num}>) was accidentally disabled.  This works
again now. [561]

=item *

Sys::Syslog ignored the C<LOG_AUTH> constant.

=item *

$AUTOLOAD, sort(), lock(), and spawning subprocesses
in multiple threads simultaneously are now thread-safe.

=item *

Tie::Array's SPLICE method was broken.

=item *

Allow a read-only string on the left-hand side of a non-modifying tr///.
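
For instance, counting characters in a string constant, where tr///
has an empty replacement list and so modifies nothing:

    use strict;
    use warnings;

    # With an empty replacement list and no modifiers, tr/// only counts.
    my $count = ("hello world" =~ tr/o//);
    print "$count\n";    # 2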

=item *

If C<STDERR> is tied, warnings caused by C<warn> and C<die> now
correctly pass to it.

=item *

Several Unicode fixes.

=over 8

=item *

BOMs (byte order marks) at the beginning of Perl files
(scripts, modules) should now be transparently skipped.
UTF-16 and UCS-2 encoded Perl files should now be read correctly.

=item *

The character tables have been updated to Unicode 3.2.0.

=item *

Comparing with utf8 data does not magically upgrade non-utf8 data
into utf8.  (This was a problem for example if you were mixing data
from I/O and Unicode data: your output might have got magically encoded
as UTF-8.)

=item *

Generating illegal Unicode code points such as U+FFFE, or the UTF-16
surrogates, now also generates an optional warning.

=item *

C<IsAlnum>, C<IsAlpha>, and C<IsWord> now match titlecase.

=item *

Concatenation with the C<.> operator or via variable interpolation,
C<eq>, C<substr>, C<reverse>, C<quotemeta>, the C<x> operator,
substitution with C<s///>, and single-quoted UTF-8 should now work.

=item *

The C<tr///> operator now works.  Note that the C<tr///CU>
functionality has been removed (but see pack('U0', ...)).

=item *

C<eval "v200"> now works.

=item *

Perl 5.6.0 parsed m/\x{ab}/ incorrectly, leading to spurious warnings.
This has been corrected. [561]

=item *

Zero entries were missing from the Unicode classes such as C<IsDigit>.

=back

=item *

Large unsigned numbers (those above 2**31) could sometimes lose their
unsignedness, causing bogus results in arithmetic operations. [561]

=item *

The Perl parser has been stress tested using both random input and
Markov chain input, and the few crashes and lockups found have been
fixed.

=back

=head2 Platform Specific Changes and Fixes

=over 4

=item *

BSDI 4.*

Perl now works on post-4.0 BSD/OSes.

=item *

All BSDs

Setting C<$0> now works (as much as possible; see L<perlvar> for details).

=item *

Cygwin

Numerous updates; currently synchronised with Cygwin 1.3.10.

=item *

Previously DYNIX/ptx had problems in its Configure probe for non-blocking I/O.

=item *

EPOC

EPOC now better supported.  See README.epoc. [561]

=item *

FreeBSD 3.*

Perl now works on post-3.0 FreeBSDs.

=item *

HP-UX

README.hpux updated; C<Configure -Duse64bitall> now works;
now uses HP-UX malloc instead of Perl malloc.

=item *

IRIX

Numerous compilation flag and hint enhancements; accidental mixing
of 32-bit and 64-bit libraries (a doomed attempt) made much harder.

=item *

Linux

=over 8

=item *

Long doubles should now work (see INSTALL). [561]

=item *

Linux previously had problems related to sockaddrlen when using
accept(), recvfrom() (in Perl: recv()), getpeername(), and
getsockname().

=back

=item *

Mac OS Classic

Compilation of the standard Perl distribution in Mac OS Classic should
now work if you have the Metrowerks development environment and the
missing Mac-specific toolkit bits.  Contact the macperl mailing list
for details.

=item *

MPE/iX

MPE/iX update after Perl 5.6.0.  See README.mpeix. [561]

=item *

NetBSD/threads: try installing the GNU pth (should be in the
packages collection, or http://www.gnu.org/software/pth/),
and Configure with -Duseithreads.

=item *

NetBSD/sparc

Perl now works on NetBSD/sparc.

=item *

OS/2

Now works with usethreads (see INSTALL). [561]

=item *

Solaris

64-bitness using the Sun Workshop compiler now works.

=item *

Stratus VOS

The native build method requires at least VOS Release 14.5.0
and GNU C++/GNU Tools 2.0.1 or later.  The Perl pack function
now maps overflowed values to +infinity and underflowed values
to -infinity.

=item *

Tru64 (aka Digital UNIX, aka DEC OSF/1)

The operating system version letter is now recorded in $Config{osvers}.
Compiling with gcc is now allowed (previously explicitly forbidden),
but is still not recommended because buggy code results, even with
gcc 2.95.2.

=item *

Unicos

Fixed various alignment problems that led to core dumps either
during build or later; no longer dies on math errors at runtime;
now using full quad integers (64 bits), previously was using
only 46-bit integers for speed.

=item *

VMS

See L</"Socket Extension Dynamic in VMS"> and L</"IEEE-format Floating Point
Default on OpenVMS Alpha"> for important changes not otherwise listed here.

chdir() now works better despite a CRT bug; now works with MULTIPLICITY
(see INSTALL); now works with Perl's malloc.

The tainting of C<%ENV> elements via C<keys> or C<values> was previously
unimplemented.  It now works as documented.

The C<waitpid> emulation has been improved.  The worst bug (now fixed)
was that a pid of -1 would cause a wildcard search of all processes on
the system.

POSIX-style signals are now emulated much better on VMS versions prior
to 7.0.

The C<system> function and backticks operator have improved
functionality and better error handling. [561]

File access tests now use current process privileges rather than the
user's default privileges, which could sometimes result in a mismatch
between reported access and actual access.  This improvement is only
available on VMS v6.0 and later.

There is a new C<kill> implementation based on C<sys$sigprc> that allows
older VMS systems (pre-7.0) to use C<kill> to send signals rather than
simply force exit.  This implementation also allows later systems to
call C<kill> from within a signal handler.

Iterative logical name translations are now limited to 10 iterations in
imitation of SHOW LOGICAL and other OpenVMS facilities.

=item *

Windows

=over 8

=item *

Signal handling now works better than it used to.  It is now implemented
using a Windows message loop, and is therefore less prone to random
crashes.

=item *

fork() emulation is now more robust, but still continues to have a few
esoteric bugs and caveats.  See L<perlfork> for details. [561+]

=item *

A failed (pseudo)fork now returns undef and sets errno to EAGAIN. [561]

=item *

The following modules now work on Windows:

    ExtUtils::Embed         [561]
    IO::Pipe
    IO::Poll
    Net::Ping

=item *

IO::File::new_tmpfile() is no longer limited to 32767 invocations
per-process.

=item *

Better chdir() return value for a non-existent directory.

=item *

Compiling perl using the 64-bit Platform SDK tools is now supported.

=item *

The Win32::SetChildShowWindow() builtin can be used to control the
visibility of windows created by child processes.  See L<Win32> for
details.

=item *

Non-blocking waits for child processes (or pseudo-processes) are
supported via C<waitpid($pid, &POSIX::WNOHANG)>.

=item *

The behavior of system() with multiple arguments has been rationalized.
Each unquoted argument will be automatically quoted to protect whitespace,
and any existing whitespace in the arguments will be preserved.  This
improves the portability of system(@args) by avoiding the need for
Windows C<cmd> shell specific quoting in perl programs.

Note that this means that some scripts that may have relied on earlier
buggy behavior may no longer work correctly.  For example,
C<system("nmake /nologo", @args)> will now attempt to run the file
C<nmake /nologo> and will fail when such a file isn't found.
On the other hand, perl will now execute code such as
C<system("c:/Program Files/MyApp/foo.exe", @args)> correctly.

=item *

The perl header files no longer suppress common warnings from the
Microsoft Visual C++ compiler.  This means that additional warnings may
now show up when compiling XS code.

=item *

Borland C++ v5.5 is now a supported compiler that can build Perl.
However, the generated binaries continue to be incompatible with those
generated by the other supported compilers (GCC and Visual C++). [561]

=item *

Duping socket handles with open(F, ">&MYSOCK") now works under Windows 9x.
[561]

=item *

Current directory entries in %ENV are now correctly propagated to child
processes. [561]

=item *

New %ENV entries now propagate to subprocesses. [561]

=item *

Win32::GetCwd() correctly returns C:\ instead of C: when at the drive root.
Other bugs in chdir() and Cwd::cwd() have also been fixed. [561]

=item *

The makefiles now default to the features enabled in ActiveState ActivePerl
(a popular Win32 binary distribution). [561]

=item *

HTML files will now be installed in c:\perl\html instead of
c:\perl\lib\pod\html

=item *

REG_EXPAND_SZ keys are now allowed in registry settings used by perl. [561]

=item *

Can now send() from all threads, not just the first one. [561]

=item *

ExtUtils::MakeMaker now uses $ENV{LIB} to search for libraries. [561]

=item *

Less stack reserved per thread so that more threads can run
concurrently. (Still 16M per thread.) [561]

=item *

C<< File::Spec->tmpdir() >> now prefers C:/temp over /tmp
(works better when perl is running as service).

=item *

Better UNC path handling under ithreads. [561]

=item *

wait(), waitpid(), and backticks now return the correct exit status
under Windows 9x. [561]

=item *

A socket handle leak in accept() has been fixed. [561]

=back

=back
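
The multi-argument system() behaviour described in the Windows notes
above can be illustrated with a small sketch.  It assumes only that
C<$^X> points at a runnable perl; the child one-liner and the sample
argument are invented for illustration and are not part of the Perl
distribution.

```perl
use strict;
use warnings;

# Hypothetical argument containing whitespace.  With the list form of
# system() it should reach the child as one single argument.
my $arg = 'two words';

# $^X is the path of the running perl.  The child exits 0 only if it
# received the argument intact.
my $status = system($^X, '-e', 'exit($ARGV[0] eq q(two words) ? 0 : 1)', $arg);

print $status == 0
    ? "argument preserved\n"
    : "argument was split or altered\n";
```

Passing the program and its arguments as separate list elements is
what lets the script avoid any shell-specific quoting of its own.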

=head1 New or Changed Diagnostics

Please see L<perldiag> for more details.

=over 4

=item *

Ambiguous range in the transliteration operator (like a-z-9) now
gives a warning.

=item *

chdir("") and chdir(undef) now give a deprecation warning because they
cause a possible unintentional chdir to the home directory.
Say chdir() if you really mean that.

=item *

Two new debugging options have been added: if you have compiled your
Perl with debugging, you can use the -DT [561] and -DR options to trace
tokenising and to add reference counts when displaying variables,
respectively.

=item *

The lexical warnings category "deprecated" is no longer a sub-category
of the "syntax" category. It is now a top-level category in its own
right.

=item *

Unadorned dump() will now give a warning suggesting the use of an
explicit CORE::dump() if that's what is really meant.

=item *

The "Unrecognized escape" warning has been extended to include C<\8>,
C<\9>, and C<\_>.  There is no need to escape any of the C<\w> characters.

=item *

All regular expression compilation error messages are now hopefully
easier to understand both because the error message now comes before
the failed regex and because the point of failure is now clearly
marked by a C<E<lt>-- HERE> marker.

=item *

Various I/O (and socket) functions like binmode(), close(), and so
forth now more consistently warn if they are used illogically either
on a yet unopened or on an already closed filehandle (or socket).

=item *

Using lstat() on a filehandle now gives a warning.  (It's a non-sensical
thing to do.)

=item *

The C<-M> and C<-m> options now warn if you didn't supply the module name.

=item *

If you specify a required minimum version in C<use>, modules matching
the name but not defining a $VERSION will cause a fatal failure.

=item *

Using negative offset for vec() in lvalue context is now a warnable offense.

=item *

Odd number of arguments to overload::constant now elicits a warning.

=item *

Odd number of elements in anonymous hash now elicits a warning.

=item *

The various "opened only for", "on closed", "never opened" warnings
drop the C<main::> prefix for filehandles in the C<main> package,
for example C<STDIN> instead of C<main::STDIN>.

=item *

Subroutine prototypes are now checked more carefully; you may get
warnings, for example, if you have used non-prototype characters.

=item *

If an attempt to use a (non-blessed) reference as an array index
is made, a warning is given.

=item *

C<push @a;> and C<unshift @a;> (with no values to push or unshift)
now give a warning.  This may be a problem for generated and eval'ed
code.

=item *

If you try to L<perlfunc/pack> a number less than 0 or larger than 255
using the C<"C"> format you will get an optional warning.  Similarly
for the C<"c"> format and a number less than -128 or more than 127.

=item *

pack C<P> format now demands an explicit size.

=item *

unpack C<w> now warns of unterminated compressed integers.

=item *

Warnings relating to the use of PerlIO have been added.

=item *

Certain regex modifiers such as C<(?o)> make sense only if applied to
the entire regex.  You will get an optional warning if you try to do
otherwise.

=item *

Variable length lookbehind has not yet been implemented; trying to
use it now produces a diagnostic saying so.

=item *

Using arrays or hashes as references (e.g. C<< %foo->{bar} >>)
has been deprecated for a while.  Now you will get an optional warning.

=item *

Warnings relating to the use of the new restricted hashes feature
have been added.

=item *

Self-ties of arrays and hashes are not supported, and even an attempt
to create one causes a fatal error.

=item *

Using C<sort> in scalar context now issues an optional warning.
This didn't do anything useful, as the sort was not performed.

=item *

Using the /g modifier in split() is meaningless and will cause a warning.

=item *

Using splice() past the end of an array now causes a warning.

=item *

Malformed Unicode encodings (UTF-8 and UTF-16) cause a lot of warnings,
as does trying to use UTF-16 surrogates (which are unimplemented).

=item *

Trying to use Unicode characters on an I/O stream without marking the
stream's encoding (using open() or binmode()) will cause "Wide character"
warnings.

=item *

Use of v-strings in use/require causes a (backward) portability warning.

=item *

Warnings relating to the use of interpreter threads and their shared
data have been added.

=back
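
Two of the diagnostics listed above (the C<"C"> pack format range
check and the odd-sized anonymous hash) can be observed with a short
sketch.  The variable names are made up for illustration, and the
exact wording of the messages may differ between Perl versions, so
the code only collects and prints whatever is emitted.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go to STDERR.
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

# Values come from variables so that nothing is evaluated at compile
# time, before the handler above is installed.
my $too_big = 256;
my $packed  = pack('C', $too_big);   # out of range for the "C" format

my @odd  = (1, 2, 3);
my $href = { @odd };                 # odd number of elements

print "warning: $_" for @seen;
```

Both warnings are optional: they appear only when the relevant
warning categories are enabled, here through C<use warnings>.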

=head1 Changed Internals

=over 4

=item *

PerlIO is now the default.

=item *

perlapi.pod (a companion to perlguts) now attempts to document the
internal API.

=item *

You can now build a really minimal perl called microperl.
Building microperl does not require even running Configure;
C<make -f Makefile.micro> should be enough.  Beware: microperl makes
many assumptions, some of which may be too bold; the resulting
executable may crash or otherwise misbehave in wondrous ways.
For careful hackers only.

=item *

Added rsignal(), whichsig(), do_join(), op_clear, op_null,
ptr_table_clear(), ptr_table_free(), sv_setref_uv(), and several UTF-8
interfaces to the publicised API.  For the full list of the available
APIs see L<perlapi>.

=item *

It is now possible to propagate customised exceptions via croak()ing.

=item *

Now xsubs can have attributes just like subs.  (Well, at least the
built-in attributes.)

=item *

dTHR and djSP have been obsoleted; the former removed (because it's
a no-op) and the latter replaced with dSP.

=item *

PERL_OBJECT has been completely removed.

=item *

The MAGIC constants (e.g. C<'P'>) have been macrofied
(e.g. C<PERL_MAGIC_TIED>) for better source code readability
and maintainability.

=item *

The regex compiler now maintains a structure that identifies nodes in
the compiled bytecode with the corresponding syntactic features of the
original regular expression.  The information is attached to the new
C<offsets> member of the C<struct regexp>. See L<perldebguts> for more
complete information.

=item *

The C code has been made much more C<gcc -Wall> clean.  Some warning
messages still remain in some platforms, so if you are compiling with
gcc you may see some warnings about dubious practices.  The warnings
are being worked on.

=item *

F<perly.c>, F<sv.c>, and F<sv.h> have now been extensively commented.

=item *

Documentation on how to use the Perl source repository has been added
to F<Porting/repository.pod>.

=item *

There are now several profiling make targets.

=back

=head1 Security Vulnerability Closed [561]

(This change was already made in 5.7.0 but bears repeating here.)
(5.7.0 came out before 5.6.1: the development branch 5.7 released
earlier than the maintenance branch 5.6)

A potential security vulnerability in the optional suidperl component
of Perl was identified in August 2000.  suidperl is neither built nor
installed by default.  As of November 2001 the only known vulnerable
platform is Linux, most likely all Linux distributions.  CERT and
various vendors and distributors have been alerted about the vulnerability.
See http://www.cpan.org/src/5.0/sperl-2000-08-05/sperl-2000-08-05.txt
for more information.

The problem was caused by Perl trying to report a suspected security
exploit attempt using an external program, /bin/mail.  On Linux
platforms the /bin/mail program had an undocumented feature which
when combined with suidperl gave access to a root shell, resulting in
a serious compromise instead of reporting the exploit attempt.  If you
don't have /bin/mail, or if you have 'safe setuid scripts', or if
suidperl is not installed, you are safe.

The exploit attempt reporting feature has been completely removed from
Perl 5.8.0 (and the maintenance release 5.6.1, and it was removed also
from all the Perl 5.7 releases), so that particular vulnerability
isn't there anymore.  However, further security vulnerabilities are,
unfortunately, always possible.  The suidperl functionality is most
probably going to be removed in Perl 5.10.  In any case, suidperl
should only be used by security experts who know exactly what they are
doing and why they are using suidperl instead of some other solution
such as sudo ( see http://www.courtesan.com/sudo/ ).

=head1 New Tests

Several new tests have been added, especially for the F<lib> and
F<ext> subsections.  There are now about 69 000 individual tests
(spread over about 700 test scripts) in the regression suite (5.6.1
has about 11 700 tests, in 258 test scripts).  The exact numbers depend
on the platform and Perl configuration used.  Many of the new tests
are of course introduced by the new modules, but still in general Perl
is now more thoroughly tested.

Because of the large number of tests, running the regression suite
will take considerably longer than it used to: expect the suite
to take up to 4-5 times longer to run than in perl 5.6.  On a really
fast machine you can hope to finish the suite in about 6-8 minutes
(wallclock time).

The tests are now reported in a different order than in earlier Perls.
(This happens because the test scripts from under t/lib have been moved
to be closer to the library/extension they are testing.)

=head1 Known Problems

=head2 The Compiler Suite Is Still Very Experimental

The compiler suite is slowly getting better but it continues to be
highly experimental.  Use in production environments is discouraged.

=head2 Localising Tied Arrays and Hashes Is Broken

    local %tied_array;

doesn't work as one would expect: the old value is restored
incorrectly.  This will be changed in a future release, but we don't
know yet what the new semantics will exactly be.  In any case, the
change will break existing code that relies on the current
(ill-defined) semantics, so just avoid doing this in general.

=head2 Building Extensions Can Fail Because Of Largefiles

Some extensions like mod_perl are known to have issues with
`largefiles', a change brought by Perl 5.6.0 in which file offsets
default to 64 bits wide, where supported.  Modules may fail to compile
at all, or they may compile and work incorrectly.  Currently, there
is no good solution for the problem, but Configure now provides
appropriate non-largefile ccflags, ldflags, libswanted, and libs
in the %Config hash (e.g., $Config{ccflags_nolargefiles}) so the
extensions that are having problems can try configuring themselves
without the largefileness.  This is admittedly not a clean solution,
and the solution may not even work at all.  One potential failure is
whether one can (or, if one can, whether it's a good idea to) link
together at all binaries with different ideas about file offsets;
all this is platform-dependent.
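
As a minimal sketch of how an extension's build script might inspect
these settings, the loop below prints the non-largefile counterparts
of the four values named above.  Only C<ccflags_nolargefiles> is
spelled out in the text; the other three key names are assumed to
follow the same C<_nolargefiles> pattern, and any of them may be
empty or unset on a given build.

```perl
use strict;
use warnings;
use Config;

# The four settings named in the text, each looked up with the
# _nolargefiles suffix (the suffix pattern is an assumption for all
# but ccflags).
for my $name (qw(ccflags ldflags libswanted libs)) {
    my $key   = "${name}_nolargefiles";
    my $value = $Config{$key};
    printf "%-26s %s\n", $key, defined $value ? "'$value'" : '(not set)';
}
```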

=head2 Modifying $_ Inside for(..)

   for (1..5) { $_++ }

works without complaint.  It shouldn't.  (You should be able to
modify only lvalue elements inside the loops.)  You can see the
correct behaviour by replacing the 1..5 with 1, 2, 3, 4, 5.
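
The difference between the two forms can be checked with a small
sketch that wraps each loop in C<eval>, so that the failure is caught
rather than aborting the program.  The variable names are
illustrative only.

```perl
use strict;
use warnings;

# Range operator: the increment goes through without complaint.
my $range_ok = eval { for (1..5) { $_++ } 1 };

# Literal list: the loop variable aliases read-only constants, so the
# increment dies and eval returns undef.
my $list_ok = eval { for (1, 2, 3, 4, 5) { $_++ } 1 };
my $error   = $@;

print "range: ", ($range_ok ? "no error\n" : "died\n");
print "list:  ", ($list_ok  ? "no error\n" : "died: $error");
```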

=head2 mod_perl 1.26 Doesn't Build With Threaded Perl

Use mod_perl 1.27 or higher.

=head2 lib/ftmp-security tests warn 'system possibly insecure'

Don't panic.  Read the 'make test' section of INSTALL instead.

=head2 libwww-perl (LWP) fails base/date #51

Use libwww-perl 5.65 or later.

=head2 PDL failing some tests

Use PDL 2.3.4 or later.

=head2 Perl_get_sv

You may get errors like 'Undefined symbol "Perl_get_sv"' or "can't
resolve symbol 'Perl_get_sv'", or the symbol may be "Perl_sv_2pv".
This probably means that you are trying to use an older shared Perl
library (or extensions linked with such) with a Perl 5.8.0 executable.
Perl used to have such a subroutine, but that is no longer the case.
Check your shared library path, and any shared Perl libraries in those
directories.

Sometimes this problem may also indicate a partial Perl 5.8.0
installation, see L</"Mac OS X dyld undefined symbols"> for an
example and how to deal with it.

=head2 Self-tying Problems

Self-tying of arrays and hashes is broken in rather deep and
hard-to-fix ways.  As a stop-gap measure to keep people from getting
frustrated at the mysterious results (core dumps, most often), it is
forbidden for now (you will get a fatal error even from an attempt).

A change to self-tying of globs has caused them to be recursively
referenced (see: L<perlobj/"Two-Phased Garbage Collection">).  You
will now need an explicit untie to destroy a self-tied glob.  This
behaviour may be fixed at a later date.

Self-tying of scalars and IO thingies works.

=head2 ext/threads/t/libc

If this test fails, it indicates that your libc (C library) is not
threadsafe.  This particular test stress tests the localtime() call to
find out whether it is threadsafe.  See L<perlthrtut> for more information.

=head2 Failure of Thread (5.005-style) tests

B<Note that support for 5.005-style threading is deprecated,
experimental and practically unsupported.  In 5.10, it is expected
to be removed.  You should migrate your code to ithreads.>

The following tests are known to fail due to fundamental problems in
the 5.005 threading implementation. These are not new failures--Perl
5.005_0x has the same bugs, but didn't have these tests.

 ../ext/B/t/xref.t                    255 65280    14   12  85.71%  3-14
 ../ext/List/Util/t/first.t           255 65280     7    4  57.14%  2 5-7
 ../lib/English.t                       2   512    54    2   3.70%  2-3
 ../lib/FileCache.t                                 5    1  20.00%  5
 ../lib/Filter/Simple/t/data.t                      6    3  50.00%  1-3
 ../lib/Filter/Simple/t/filter_only.                9    3  33.33%  1-2 5
 ../lib/Math/BigInt/t/bare_mbf.t                 1627    4   0.25%  8 11 1626-1627
 ../lib/Math/BigInt/t/bigfltpm.t                 1629    4   0.25%  10 13 1628-
                                                                    1629
 ../lib/Math/BigInt/t/sub_mbf.t                  1633    4   0.24%  8 11 1632-1633
 ../lib/Math/BigInt/t/with_sub.t                 1628    4   0.25%  9 12 1627-1628
 ../lib/Tie/File/t/31_autodefer.t     255 65280    65   32  49.23%  34-65
 ../lib/autouse.t                                  10    1  10.00%  4
 op/flip.t                                         15    1   6.67%  15

These failures are unlikely to get fixed as 5.005-style threads
are considered fundamentally broken.  (Basically what happens is that
competing threads can corrupt shared global state, one good example
being regular expression engine's state.)

=head2 Timing problems

The following tests may fail intermittently because of timing
problems, for example if the system is heavily loaded.

    t/op/alarm.t
    ext/Time/HiRes/HiRes.t
    lib/Benchmark.t
    lib/Memoize/t/expmod_t.t
    lib/Memoize/t/speed.t

In case of failure please try running them manually, for example

    ./perl -Ilib ext/Time/HiRes/HiRes.t

=head2 Tied/Magical Array/Hash Elements Do Not Autovivify

For normal arrays C<$foo = \$bar[1]> will assign C<undef> to
C<$bar[1]> (assuming that it didn't exist before), but for
tied/magical arrays and hashes such autovivification does not happen
because there is currently no way to catch the reference creation.
The same problem affects slicing over non-existent indices/keys of
a tied/magical array/hash.
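
The behaviour for normal arrays can be seen in the following sketch,
which covers only the ordinary, untied case; it makes no claim about
what a particular tied class will do.  The variable names follow the
example in the text.

```perl
use strict;
use warnings;

my @bar;                                 # ordinary, untied array
my $before = exists $bar[1] ? 1 : 0;

my $foo = \$bar[1];                      # taking the reference creates the element

my $after = exists $bar[1] ? 1 : 0;
printf "exists before: %d, after: %d, length: %d\n",
    $before, $after, scalar @bar;
```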

=head2 Unicode in package/class and subroutine names does not work

One can have Unicode in identifier names, but not in package/class or
subroutine names.  While some limited functionality towards this does
exist as of Perl 5.8.0, that is more accidental than designed; use of
Unicode for the said purposes is unsupported.

One reason for this unfinishedness is its (currently) inherent
unportability: since both package names and subroutine names may
need to be mapped to file and directory names, the Unicode capability
of the filesystem becomes important, and unfortunately there are no
portable answers.

=head1 Platform Specific Problems

=head2 AIX

=over 4

=item *

If using the AIX native make command, instead of just "make" issue
"make all".  In some setups the former has been known to spuriously
also try to run "make install".  Alternatively, you may want to use
GNU make.

=item *

In AIX 4.2, Perl extensions that use C++ functions that use statics
may have problems in that the statics are not getting initialized.
In newer AIX releases, this has been solved by linking Perl with
the libC_r library, but unfortunately in AIX 4.2 the said library
has an obscure bug where the various functions related to time
(such as time() and gettimeofday()) return broken values, and
therefore in AIX 4.2 Perl is not linked against libC_r.

=item *

vac 5.0.0.0 May Produce Buggy Code For Perl

The AIX C compiler vac version 5.0.0.0 may produce buggy code,
resulting in a few random tests failing when run as part of "make
test", but when the failing tests are run by hand, they succeed.
We suggest upgrading to at least vac version 5.0.1.0, which is
known to compile Perl correctly.  "lslpp -L|grep vac.C" will tell
you the vac version.  See README.aix.

=item *

If building threaded Perl, you may get a compilation warning from pp_sys.c:

  "pp_sys.c", line 4651.39: 1506-280 (W) Function argument assignment between types "unsigned char*" and "const void*" is not allowed.

This is harmless; it is caused by the getnetbyaddr() and getnetbyaddr_r()
having slightly different types for their first argument.

=back

=head2 Alpha systems with old gccs fail several tests

If you see op/pack, op/pat, op/regexp, or ext/Storable tests failing
on Linux/alpha or *BSD/Alpha, it's probably time to upgrade your gcc.
gccs prior to 2.95.3 are definitely not good enough, and gcc 3.1 may
be even better.  (RedHat Linux/alpha with gcc 3.1 reported no problems,
as did Linux 2.4.18 with gcc 2.95.4.)  (In Tru64, it is preferable to
use the bundled C compiler.)

=head2 AmigaOS

Perl 5.8.0 doesn't build in AmigaOS.  It broke at some point during
the ithreads work and we could not find Amiga experts to unbreak the
problems.  Perl 5.6.1 still works for AmigaOS (as does the 5.7.2
development release).

=head2 BeOS

The following tests fail on 5.8.0 Perl in BeOS Personal 5.03:

 t/op/lfs............................FAILED at test 17
 t/op/magic..........................FAILED at test 24
 ext/Fcntl/t/syslfs..................FAILED at test 17
 ext/File/Glob/t/basic...............FAILED at test 3
 ext/POSIX/t/sigaction...............FAILED at test 13
 ext/POSIX/t/waitpid.................FAILED at test 1

(B<Note:> more information was available in F<README.beos> until support for
BeOS was removed in Perl v5.18.0)

=head2 Cygwin "unable to remap"

For example when building the Tk extension for Cygwin,
you may get an error message saying "unable to remap".
This is a known problem with Cygwin, and a workaround is
detailed here: http://sources.redhat.com/ml/cygwin/2001-12/msg00894.html

=head2 Cygwin ndbm tests fail on FAT

One can build but not install (or test the build of) the NDBM_File
on FAT filesystems.  Installation (or build) on NTFS works fine.
If one attempts the test on a FAT install (or build) the following
failures are expected:

 ../ext/NDBM_File/ndbm.t       13  3328    71   59  83.10%  1-2 4 16-71
 ../ext/ODBM_File/odbm.t      255 65280    ??   ??       %  ??
 ../lib/AnyDBM_File.t           2   512    12    2  16.67%  1 4
 ../lib/Memoize/t/errors.t      0   139    11    5  45.45%  7-11
 ../lib/Memoize/t/tie_ndbm.t   13  3328     4    4 100.00%  1-4
 run/fresh_perl.t                          97    1   1.03%  91

NDBM_File fails and ODBM_File just coredumps.

If you intend to run only on FAT (or if using AnyDBM_File on FAT),
run Configure with the -Ui_ndbm and -Ui_dbm options to prevent
NDBM_File and ODBM_File being built.

=head2 DJGPP Failures

 t/op/stat............................FAILED at test 29
 lib/File/Find/t/find.................FAILED at test 1
 lib/File/Find/t/taint................FAILED at test 1
 lib/h2xs.............................FAILED at test 15
 lib/Pod/t/eol........................FAILED at test 1
 lib/Test/Harness/t/strap-analyze.....FAILED at test 8
 lib/Test/Harness/t/test-harness......FAILED at test 23
 lib/Test/Simple/t/exit...............FAILED at test 1

The above failures are known as of 5.8.0 with native builds with long
filenames, but there are a few more if running under dosemu because of
limitations (and maybe bugs) of dosemu:

 t/comp/cpp...........................FAILED at test 3
 t/op/inccode.........................(crash)

and a few lib/ExtUtils tests, and several hundred Encode/t/Aliases.t
failures that work fine with long filenames.  So you really might
prefer native builds and long filenames.

=head2 FreeBSD built with ithreads coredumps reading large directories

This is a known bug in FreeBSD 4.5's readdir_r(); it has been fixed in
FreeBSD 4.6 (see L<perlfreebsd> (README.freebsd)).

=head2 FreeBSD Failing locale Test 117 For ISO 8859-15 Locales

The ISO 8859-15 locales may fail the locale test 117 in FreeBSD.
This is caused by the characters \xFF (y with diaeresis) and \xBE
(Y with diaeresis) not behaving correctly when being matched
case-insensitively.  Apparently this problem has been fixed in
the latest FreeBSD releases.
( http://www.freebsd.org/cgi/query-pr.cgi?pr=34308 )

=head2 IRIX fails ext/List/Util/t/shuffle.t or Digest::MD5

IRIX with MIPSpro 7.3.1.2m or 7.3.1.3m compiler may fail the List::Util
test ext/List/Util/t/shuffle.t by dumping core.  This seems to be
a compiler error since if compiled with gcc no core dump ensues, and
no failures have been seen on the said test on any other platform.

Similarly, building the Digest::MD5 extension has been
known to fail with "*** Termination code 139 (bu21)".

The cure is to drop optimization level (Configure -Doptimize=-O2).

=head2 HP-UX lib/posix Subtest 9 Fails When LP64-Configured

If perl is configured with -Duse64bitall, the successful result of the
subtest 10 of lib/posix may arrive before the successful result of the
subtest 9, which confuses the test harness so much that it thinks the
subtest 9 failed.

=head2 Linux with glibc 2.2.5 fails t/op/int subtest #6 with -Duse64bitint

This is a known bug in the glibc 2.2.5 with long long integers.
( http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=65612 )

=head2 Linux With Sfio Fails op/misc Test 48

No known fix.

=head2 Mac OS X

Please remember to set your environment variable LC_ALL to "C"
(setenv LC_ALL C) before running "make test" to avoid a lot of
warnings about the broken locales of Mac OS X.

The following tests are known to fail in Mac OS X 10.1.5 because of
buggy (old) implementations of Berkeley DB included in Mac OS X:

 Failed Test                 Stat Wstat Total Fail  Failed  List of Failed
 -------------------------------------------------------------------------
 ../ext/DB_File/t/db-btree.t    0    11    ??   ??       %  ??
 ../ext/DB_File/t/db-recno.t              149    3   2.01%  61 63 65

If you are building on a UFS partition, you will also probably see
t/op/stat.t subtest #9 fail.  This is caused by Darwin's UFS not
supporting inode change time.

Also the ext/POSIX/t/posix.t subtest #10 fails but it is skipped for
now because the failure is Apple's fault, not Perl's (blocked signals
are lost).

If you Configure with ithreads, ext/threads/t/libc.t will fail. Again,
this is not Perl's fault-- the libc of Mac OS X is not threadsafe
(in this particular test, the localtime() call is found to be
threadunsafe.)

=head2 Mac OS X dyld undefined symbols

If after installing Perl 5.8.0 you are getting warnings about missing
symbols, for example

    dyld: perl Undefined symbols
    _perl_sv_2pv
    _perl_get_sv

you probably have an old pre-Perl-5.8.0 installation (or parts of one)
in /Library/Perl (the undefined symbols used to exist in pre-5.8.0 Perls).
It seems that for some reason "make install" doesn't always completely
overwrite the files in /Library/Perl.  You can move the old Perl
shared library out of the way like this:

    cd /Library/Perl/darwin/CORE
    mv libperl.dylib libperlold.dylib

and then reissue "make install".  Note that the above of course is
extremely disruptive for anything using the /usr/local/bin/perl.
If that doesn't help, you may have to try removing all the .bundle
files from beneath /Library/Perl, and again "make install"-ing.

=head2 OS/2 Test Failures

The following tests are known to fail on OS/2 (for clarity
only the failures are shown, not the full error messages):

 ../lib/ExtUtils/t/Mkbootstrap.t    1   256    18    1   5.56%  8
 ../lib/ExtUtils/t/Packlist.t       1   256    34    1   2.94%  17
 ../lib/ExtUtils/t/basic.t          1   256    17    1   5.88%  14
 lib/os2_process.t                  2   512   227    2   0.88%  174 209
 lib/os2_process_kid.t                        227    2   0.88%  174 209
 lib/rx_cmprt.t                   255 65280    18    3  16.67%  16-18

=head2 op/sprintf tests 91, 129, and 130

The op/sprintf tests 91, 129, and 130 are known to fail on some platforms.
Examples include any platform using sfio, and Compaq/Tandem's NonStop-UX.

Test 91 is known to fail on QNX6 (nto), because C<sprintf '%e',0>
incorrectly produces C<0.000000e+0> instead of C<0.000000e+00>.

For tests 129 and 130, the failing platforms do not comply with
the ANSI C Standard: lines 19ff on page 134 of ANSI X3.159 1989, to
be exact.  (They produce something other than "1" and "-1" when
formatting 0.6 and -0.6 using the printf format "%.0f"; most often,
they produce "0" and "-0".)
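
A quick way to see whether a given platform shows these symptoms is
sketched below.  It is not the actual F<op/sprintf> test code: the
table simply restates the expectations quoted above, and which of the
two C<%.0f> cases corresponds to test 129 and which to test 130 is
not stated here.

```perl
use strict;
use warnings;

# Each entry: format, value, result expected on a conforming platform.
my @checks = (
    [ '%e',   0,    '0.000000e+00' ],   # the test 91 case
    [ '%.0f', 0.6,  '1'            ],   # tests 129/130 cases
    [ '%.0f', -0.6, '-1'           ],
);

my $failures = 0;
for my $check (@checks) {
    my ($format, $value, $expected) = @$check;
    my $got = sprintf($format, $value);
    my $ok  = $got eq $expected;
    $failures++ unless $ok;
    printf "%-5s %-4s => %-14s %s\n",
        $format, $value, $got, $ok ? 'ok' : "expected $expected";
}
print "$failures mismatch(es)\n";
```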

=head2 SCO

The socketpair tests are known to be unhappy in SCO 3.2v5.0.4:

 ext/Socket/socketpair.t...............FAILED tests 15-45

=head2 Solaris 2.5

In case you are still using Solaris 2.5 (aka SunOS 5.5), you may
experience failures (the test core dumping) in lib/locale.t.
The suggested cure is to upgrade your Solaris.

=head2 Solaris x86 Fails Tests With -Duse64bitint

The following tests are known to fail in Solaris x86 with Perl
configured to use 64 bit integers:

 ext/Data/Dumper/t/dumper.............FAILED at test 268
 ext/Devel/Peek/Peek..................FAILED at test 7

=head2 SUPER-UX (NEC SX)

The following tests are known to fail on SUPER-UX:

 op/64bitint...........................FAILED tests 29-30, 32-33, 35-36
 op/arith..............................FAILED tests 128-130
 op/pack...............................FAILED tests 25-5625
 op/pow................................
 op/taint..............................# msgsnd failed
 ../ext/IO/lib/IO/t/io_poll............FAILED tests 3-4
 ../ext/IPC/SysV/ipcsysv...............FAILED tests 2, 5-6
 ../ext/IPC/SysV/t/msg.................FAILED tests 2, 4-6
 ../ext/Socket/socketpair..............FAILED tests 12
 ../lib/IPC/SysV.......................FAILED tests 2, 5-6
 ../lib/warnings.......................FAILED tests 115-116, 118-119

The op/pack failure ("Cannot compress negative numbers at op/pack.t line 126")
is serious but as of yet unsolved.  It points at some problems with the
signedness handling of the C compiler, as do the 64bitint, arith, and pow
failures.  Most of the rest point at problems with SysV IPC.

=head2 Term::ReadKey not working on Win32

Use Term::ReadKey 2.20 or later.

=head2 UNICOS/mk

=over 4

=item *

During Configure, the test

    Guessing which symbols your C compiler and preprocessor define...

will probably fail with error messages like

    CC-20 cc: ERROR File = try.c, Line = 3
      The identifier "bad" is undefined.

      bad switch yylook 79bad switch yylook 79bad switch yylook 79bad switch yylook 79#ifdef A29K
      ^

    CC-65 cc: ERROR File = try.c, Line = 3
      A semicolon is expected at this point.

This is caused by a bug in the awk utility of UNICOS/mk.  You can ignore
the error, but it does cause a slight problem: you cannot fully
benefit from the h2ph utility (see L<h2ph>) that can be used to
convert C headers to Perl libraries, mainly used to be able to access
from Perl the constants defined using C preprocessor, cpp.  Because of
the above error, parts of the converted headers will be invisible.
Luckily, these days the need for h2ph is rare.

=item *

If building Perl with interpreter threads (ithreads), the
getgrent(), getgrnam(), and getgrgid() functions cannot return the
list of the group members due to a bug in the multithreaded support of
UNICOS/mk.  What this means is that in list context the functions will
return only three values, not four.

=back
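
To see whether a given build is affected by the group-member problem,
one can count the fields that C<getgrgid()> returns in list context.
This is a hedged sketch: the helper name C<group_field_count> is
hypothetical, and group ID 0 is only an assumption about which group
exists on the system.

```perl
use strict;
use warnings;

# Returns how many fields getgrgid() yields in list context for $gid.
# 0 means no such group, or that the function is unavailable here.
sub group_field_count {
    my ($gid) = @_;
    my @fields = eval { getgrgid($gid) };
    return scalar @fields;
}

my $count = group_field_count(0);
print "getgrgid(0) returned $count field(s)\n";
print "the member list appears to be missing\n" if $count == 3;
```

Four fields (name, password, gid, members) is the normal result; three
would match the behaviour described above.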

=head2 UTS

There are a few known test failures.  (B<Note:> the relevant information was
available in F<README.uts> until support for UTS was removed in Perl
v5.18.0.)

=head2 VOS (Stratus)

When Perl is built using the native build process on VOS Release
14.5.0 and GNU C++/GNU Tools 2.0.1, all attempted tests either
pass or result in TODO (ignored) failures.

=head2 VMS

There should be no reported test failures with a default configuration,
though there are a number of tests marked TODO that point to areas
needing further debugging and/or porting work.

=head2 Win32

In multi-CPU boxes, there are some problems with the I/O buffering:
some output may appear twice.

=head2 XML::Parser not working

Use XML::Parser 2.31 or later.

=head2 z/OS (OS/390)

z/OS has quite a few test failures, but the situation is actually much
better than it was in 5.6.0; it's just that so many new modules and
tests have been added.

 Failed Test                   Stat Wstat Total Fail  Failed  List of Failed
 ---------------------------------------------------------------------------
 ../ext/Data/Dumper/t/dumper.t              357    8   2.24%  311 314 325 327
                                                              331 333 337 339
 ../ext/IO/lib/IO/t/io_unix.t                 5    4  80.00%  2-5
 ../ext/Storable/t/downgrade.t   12  3072   169   12   7.10%  14-15 46-47 78-79
                                                              110-111 150 161
 ../lib/ExtUtils/t/Constant.t   121 30976    48   48 100.00%  1-48
 ../lib/ExtUtils/t/Embed.t                    9    9 100.00%  1-9
 op/pat.t                                   922    7   0.76%  665 776 785 832-
                                                              834 845
 op/sprintf.t                               224    3   1.34%  98 100 136
 op/tr.t                                     97    5   5.15%  63 71-74
 uni/fold.t                                 780    6   0.77%  61 169 196 661
                                                              710-711

The failures in dumper.t and downgrade.t are problems in the tests;
those in io_unix and sprintf are problems in USS (UDP sockets and
printf formats).  The pat, tr, and fold failures are genuine Perl
problems caused by EBCDIC (and in the pat and fold cases, by combining
that with Unicode).  The Constant and Embed failures are probably
problems in the tests, since they test Perl's ability to build
extensions, and that seems to be working reasonably well.

=head2 Unicode Support on EBCDIC Still Spotty

Though mostly working, Unicode support still has problem spots on
EBCDIC platforms.  One such known spot is the C<\p{}> and C<\P{}>
regular expression constructs for code points less than 256: both
constructs test for Unicode code points and know nothing about EBCDIC.

=head2 Seen In Perl 5.7 But Gone Now

C<Time::Piece> (previously known as C<Time::Object>) was removed
because it was felt that it didn't have enough value in it to be a
core module.  It is still a useful module, though, and is available
from the CPAN.

Perl 5.8 unfortunately does not build anymore on AmigaOS; this broke
accidentally at some point.  Since there are not that many Amiga
developers available, we could not get this fixed and tested in time
for 5.8.0.  Perl 5.6.1 still works for AmigaOS (as does the 5.7.2
development release).

C<PerlIO::Scalar> and C<PerlIO::Via> (capitalised) were renamed to
C<PerlIO::scalar> and C<PerlIO::via> (all lowercase) just before 5.8.0.
The main rationale was for all core PerlIO layers to have all-lowercase
names.  The "plugins" are named as usual, for example
C<PerlIO::via::QuotedPrint>.
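
As a reminder of what the C<PerlIO::scalar> layer provides, the sketch
below opens a filehandle on a scalar reference (an in-memory file).  The
helper name C<lines_from_string> is made up for this example.

```perl
use strict;
use warnings;

# Reads the lines of $text through an in-memory filehandle, which is
# the feature implemented by the PerlIO::scalar layer.
sub lines_from_string {
    my ($text) = @_;
    open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
    my @lines = <$fh>;
    close $fh;
    chomp @lines;
    return @lines;
}

print join('|', lines_from_string("one\ntwo\n")), "\n";   # prints "one|two"
```

Code like this never names the layer explicitly, which is why the
rename should only matter to code that loads the module by name.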

C<threads::shared::queue> and C<threads::shared::semaphore> were
renamed to C<Thread::Queue> and C<Thread::Semaphore> just before 5.8.0.
The main rationale was for thread modules to follow the normal naming
convention, C<Thread::> (C<threads> and C<threads::shared> themselves
are more pragma-like: they affect compile time, so they stay lowercase).

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org/ .  There may also be
information at http://www.perl.com/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=head1 HISTORY

Written by Jarkko Hietaniemi <F<jhi@iki.fi>>.

=cut

=head1 NAME

perl583delta - what is new for perl v5.8.3

=head1 DESCRIPTION

This document describes differences between the 5.8.2 release and
the 5.8.3 release.

If you are upgrading from an earlier release such as 5.6.1, first read
L<perl58delta>, which describes differences between 5.6.0 and
5.8.0, and L<perl581delta> and L<perl582delta>, which describe differences
between 5.8.0, 5.8.1 and 5.8.2.

=head1 Incompatible Changes

There are no changes incompatible with 5.8.2.

=head1 Core Enhancements

A C<SCALAR> method is now available for tied hashes. This is called when
a tied hash is used in scalar context, such as

    if (%tied_hash) {
        ...
    }


The old behaviour was that %tied_hash would return whatever would have been
returned for that hash before the hash was tied (so usually 0). The new
behaviour in the absence of a SCALAR method is to return TRUE if in the
middle of an C<each> iteration, and otherwise call FIRSTKEY to check if the
hash is empty (making sure that a subsequent C<each> will also begin by
calling FIRSTKEY). Please see L<perltie/SCALAR> for the full details and
caveats.
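
A minimal sketch of a tied hash class that supplies its own C<SCALAR>
method follows.  The package name C<CountedHash> and the helper
C<is_populated> are hypothetical, and the class implements only the
methods this example needs; L<perltie> remains the authoritative
description.

```perl
use strict;
use warnings;

package CountedHash;

sub TIEHASH { my ($class) = @_; return bless { data => {} }, $class }
sub STORE   { my ($self, $key, $value) = @_; $self->{data}{$key} = $value }
sub FETCH   { my ($self, $key) = @_; return $self->{data}{$key} }

sub FIRSTKEY {
    my ($self) = @_;
    my $reset = keys %{ $self->{data} };    # reset the internal iterator
    return each %{ $self->{data} };
}

sub NEXTKEY { my ($self) = @_; return each %{ $self->{data} } }

# Called when the tied hash is used in scalar (including boolean) context.
sub SCALAR  { my ($self) = @_; return scalar keys %{ $self->{data} } }

package main;

# Evaluates the tied hash in boolean context and returns 1 or 0.
sub is_populated {
    my ($hash_ref) = @_;
    return %$hash_ref ? 1 : 0;
}

tie my %h, 'CountedHash';
my $before = is_populated(\%h);
$h{answer} = 42;
my $after = is_populated(\%h);
print "before: $before, after: $after\n";   # prints "before: 0, after: 1"
```

Here C<SCALAR> simply returns the number of stored keys, so an empty
tied hash is false and a populated one is true.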

=head1 Modules and Pragmata

=over 4

=item CGI

=item Cwd

=item Digest

=item Digest::MD5

=item Encode

=item File::Spec

=item FindBin

A function C<again> is provided to resolve problems where modules in different
directories wish to use FindBin.

=item List::Util

You can now weaken references to read-only values.

=item Math::BigInt

=item PodParser

=item Pod::Perldoc

=item POSIX

=item Unicode::Collate

=item Unicode::Normalize

=item Test::Harness

=item threads::shared

C<cond_wait> has a new two argument form. C<cond_timedwait> has been added.

=back
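
For the C<FindBin> entry above, a hedged usage sketch: a module that
needs the script directory can call C<FindBin::again()> to redo the
lookup before reading C<$FindBin::Bin>.  The wrapper name
C<current_script_dir> is invented for this example.

```perl
use strict;
use warnings;
use FindBin ();

# Re-runs FindBin's lookup and returns the directory of the current script.
sub current_script_dir {
    FindBin::again();
    return $FindBin::Bin;
}

print "script directory: ", current_script_dir(), "\n";
```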

=head1 Utility Changes

C<find2perl> now assumes C<-print> as a default action. Previously, it
needed to be specified explicitly.

A new utility, C<prove>, makes it easy to run an individual regression test
at the command line. C<prove> is part of Test::Harness, which users of earlier
Perl versions can install from CPAN.

=head1 New Documentation

The documentation has been revised in places to produce more standard manpages.

The documentation for the special code blocks (BEGIN, CHECK, INIT, END)
has been improved.

=head1 Installation and Configuration Improvements

Perl now builds on OpenVMS I64.

=head1 Selected Bug Fixes

Using substr() on a UTF8 string could cause subsequent accesses on that
string to return garbage. This was due to incorrect UTF8 offsets being
cached, and is now fixed.

join() could return garbage when the same join() statement was used to
process 8 bit data having earlier processed UTF8 data, due to the flags
on that statement's temporary workspace not being reset correctly. This
is now fixed.

C<$a .. $b> will now work as expected when either $a or $b is C<undef>.

Using Unicode keys with tied hashes should now work correctly.

Reading $^E now preserves $!. Previously, the C code implementing $^E
did not preserve C<errno>, so reading $^E could cause C<errno> and therefore
C<$!> to change unexpectedly.

Reentrant functions will (once more) work with C++. 5.8.2 introduced a bugfix
which accidentally broke the compilation of Perl extensions written in C++.
=head1 New or Changed Diagnostics

The fatal error "DESTROY created new reference to dead object" is now
documented in L<perldiag>.

=head1 Changed Internals

The hash code has been refactored to reduce source duplication. The
external interface is unchanged, and aside from the bug fixes described
above, there should be no change in behaviour.

C<hv_clear_placeholders> is now part of the perl API.

Some C macros have been tidied. In particular macros which create temporary
local variables now name these variables more defensively, which should
avoid bugs where names clash.

<signal.h> is now always included.

=head1 Configuration and Building

C<Configure> now invokes callbacks regardless of the value of the variable
they are called for. Previously callbacks were only invoked in the
C<case $variable $define)> branch. This change should only affect platform
maintainers writing configuration hints files.

=head1 Platform Specific Problems

The regression test ext/threads/shared/t/wait.t fails on early RedHat 9
and HP-UX 10.20 due to bugs in their threading implementations.
RedHat users should see https://rhn.redhat.com/errata/RHBA-2003-136.html
and consider upgrading their glibc.

=head1 Known Problems

Detached threads aren't supported on Windows yet, as they may lead to
memory access violation problems.

There is a known race condition opening scripts in C<suidperl>. C<suidperl>
is neither built nor installed by default, and has been deprecated since
perl 5.8.0. You are advised to replace use of suidperl with tools such
as sudo ( http://www.courtesan.com/sudo/ ).

We have a backlog of unresolved bugs. Dealing with bugs and bug reports
is unglamorous work; not something ideally suited to volunteer labour,
but that is all that we have.

The perl5 development team are implementing changes to help address this
problem, which should go live in early 2004.

=head1 Future Directions

Code freeze for the next maintenance release (5.8.4) is on March 31st 2004,
with release expected by mid April. Similarly 5.8.5's freeze will be at
the end of June, with release by mid July.

=head1 Obituary

Iain 'Spoon' Truskett, Perl hacker, author of L<perlreref> and
contributor to CPAN, died suddenly on 29th December 2003, aged 24.
He will be missed.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org.  There may also be
information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=head1 NAME

perl582delta - what is new for perl v5.8.2

=head1 DESCRIPTION

This document describes differences between the 5.8.1 release and
the 5.8.2 release.

If you are upgrading from an earlier release such as 5.6.1, first read
L<perl58delta>, which describes differences between 5.6.0 and
5.8.0, and L<perl581delta>, which describes differences between
5.8.0 and 5.8.1.

=head1 Incompatible Changes

For threaded builds for modules calling certain re-entrant system calls,
binary compatibility was accidentally lost between 5.8.0 and 5.8.1.
Binary compatibility with 5.8.0 has been restored in 5.8.2, which
necessitates breaking compatibility with 5.8.1. We see this as the
lesser of two evils.

This will only affect people who have a threaded perl 5.8.1, and compiled
modules which use these calls, and now attempt to run the compiled modules
with 5.8.2. The fix is to re-compile and re-install the modules using 5.8.2.

=head1 Core Enhancements

=head2 Hash Randomisation

The hash randomisation introduced with 5.8.1 has been amended. It
transpired that although the implementation introduced in 5.8.1 was source
compatible with 5.8.0, it was not binary compatible in certain cases. 5.8.2
contains an improved implementation which is both source and binary
compatible with both 5.8.0 and 5.8.1, and remains robust against the form of
attack which prompted the change for 5.8.1.

We are grateful to the Debian project for their input in this area.
See L<perlsec/"Algorithmic Complexity Attacks"> for the original
rationale behind this change.
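
One practical consequence of hash randomisation is that code should not
rely on the order in which C<keys> returns its results.  The sketch
below (with a hypothetical helper, C<ordered_pairs>) sorts the keys
whenever stable output is needed.

```perl
use strict;
use warnings;

# Returns "key=value" strings in sorted key order, independent of the
# hash's internal ordering.
sub ordered_pairs {
    my (%hash) = @_;
    return map { "$_=$hash{$_}" } sort keys %hash;
}

print join(', ', ordered_pairs(b => 2, a => 1, c => 3)), "\n";
# prints "a=1, b=2, c=3"
```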

=head2 Threading

Several memory leaks associated with variables shared between threads
have been fixed.

=head1 Modules and Pragmata

=head2 Updated Modules And Pragmata

The following modules and pragmata have been updated since Perl 5.8.1:

=over 4

=item Devel::PPPort

=item Digest::MD5

=item I18N::LangTags

=item libnet

=item MIME::Base64

=item Pod::Perldoc

=item strict

Documentation improved

=item Tie::Hash

Documentation improved

=item Time::HiRes

=item Unicode::Collate

=item Unicode::Normalize

=item UNIVERSAL

Documentation improved

=back

=head1 Selected Bug Fixes

Some syntax errors involving unrecognized filetest operators are now handled
correctly by the parser.

=head1 Changed Internals

Interpreter initialization is more complete when -DMULTIPLICITY is off.
This should resolve problems with initializing and destroying the Perl
interpreter more than once in a single process.

=head1 Platform Specific Problems

Dynamic linker flags have been tweaked for Solaris and OS X, which should
solve problems seen while building some XS modules.

Bugs in OS/2 sockets and tmpfile have been fixed.

On OS X, C<setreuid> and friends are troublesome; perl will now work
around their problems as best it can.

=head1 Future Directions

Starting with 5.8.3 we intend to make more frequent maintenance releases,
with a smaller number of changes in each. The intent is to propagate
bug fixes out to stable releases more rapidly and make upgrading stable
releases less of an upheaval. This should give end users more
flexibility in their choice of upgrade timing, and allow them easier
assessment of the impact of upgrades. The current plan is for code freezes
as follows:

=over 4

=item *

5.8.3 23:59:59 GMT, Wednesday December 31st 2003

=item *

5.8.4 23:59:59 GMT, Wednesday March 31st 2004

=item *

5.8.5 23:59:59 GMT, Wednesday June 30th 2004

=back

with the release following soon after, when testing is complete.

See L<perl581delta/"Future Directions"> for more soothsaying.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org/.  There may also be
information at http://www.perl.com/, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=head1 NAME

perlgpl - the GNU General Public License, version 1

=head1 SYNOPSIS

 You can refer to this document in Pod via "L<perlgpl>"
 Or you can see this document by entering "perldoc perlgpl"

=head1 DESCRIPTION

Perl is free software; you can redistribute it and/or modify
it under the terms of either:

        a) the GNU General Public License as published by the Free
        Software Foundation; either version 1, or (at your option) any
        later version, or

        b) the "Artistic License" which comes with this Kit.

This is the B<"GNU General Public License, version 1">.
It's here so that modules, programs, etc., that want to declare
this as their distribution license can link to it.

For the Perl Artistic License, see L<perlartistic>.

=cut

# Because the following document's language disallows "changing"
# it, we haven't gone thru and prettied it up with =item's or
# anything.  It's good enough the way it is.

=head1 GNU GENERAL PUBLIC LICENSE

                    GNU GENERAL PUBLIC LICENSE
                     Version 1, February 1989

  Copyright (C) 1989 Free Software Foundation, Inc.
                51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA

  Everyone is permitted to copy and distribute verbatim copies
  of this license document, but changing it is not allowed.

                            Preamble

   The license agreements of most software companies try to keep users
 at the mercy of those companies.  By contrast, our General Public
 License is intended to guarantee your freedom to share and change free
 software--to make sure the software is free for all its users.  The
 General Public License applies to the Free Software Foundation's
 software and to any other program whose authors commit to using it.
 You can use it for your programs, too.

   When we speak of free software, we are referring to freedom, not
 price.  Specifically, the General Public License is designed to make
 sure that you have the freedom to give away or sell copies of free
 software, that you receive source code or can get it if you want it,
 that you can change the software or use pieces of it in new free
 programs; and that you know you can do these things.

   To protect your rights, we need to make restrictions that forbid
 anyone to deny you these rights or to ask you to surrender the rights.
 These restrictions translate to certain responsibilities for you if you
 distribute copies of the software, or if you modify it.

   For example, if you distribute copies of a such a program, whether
 gratis or for a fee, you must give the recipients all the rights that
 you have.  You must make sure that they, too, receive or can get the
 source code.  And you must tell them their rights.

   We protect your rights with two steps: (1) copyright the software,
 and (2) offer you this license which gives you legal permission to
 copy, distribute and/or modify the software.

   Also, for each author's protection and ours, we want to make certain
 that everyone understands that there is no warranty for this free
 software.  If the software is modified by someone else and passed on,
 we want its recipients to know that what they have is not the original,
 so that any problems introduced by others will not reflect on the
 original authors' reputations.

   The precise terms and conditions for copying, distribution and
 modification follow.

                    GNU GENERAL PUBLIC LICENSE
    TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

   0. This License Agreement applies to any program or other work which
 contains a notice placed by the copyright holder saying it may be
 distributed under the terms of this General Public License.  The
 "Program", below, refers to any such program or work, and a "work based
 on the Program" means either the Program or any work containing the
 Program or a portion of it, either verbatim or with modifications.
 Each licensee is addressed as "you".

   1. You may copy and distribute verbatim copies of the Program's
 source code as you receive it, in any medium, provided that you
 conspicuously and appropriately publish on each copy an appropriate
 copyright notice and disclaimer of warranty; keep intact all the
 notices that refer to this General Public License and to the absence of
 any warranty; and give any other recipients of the Program a copy of
 this General Public License along with the Program.  You may charge a
 fee for the physical act of transferring a copy.

   2. You may modify your copy or copies of the Program or any portion
 of it, and copy and distribute such modifications under the terms of
 Paragraph 1 above, provided that you also do the following:

     a) cause the modified files to carry prominent notices stating that
     you changed the files and the date of any change; and

     b) cause the whole of any work that you distribute or publish, that
     in whole or in part contains the Program or any part thereof,
     either with or without modifications, to be licensed at no charge
     to all third parties under the terms of this General Public License
     (except that you may choose to grant warranty protection to some or
     all third parties, at your option).

     c) If the modified program normally reads commands interactively
     when run, you must cause it, when started running for such
     interactive use in the simplest and most usual way, to print or
     display an announcement including an appropriate copyright notice
     and a notice that there is no warranty (or else, saying that you
     provide a warranty) and that users may redistribute the program
     under these conditions, and telling the user how to view a copy of
     this General Public License.

     d) You may charge a fee for the physical act of transferring a
     copy, and you may at your option offer warranty protection in
     exchange for a fee.

 Mere aggregation of another independent work with the Program (or its
 derivative) on a volume of a storage or distribution medium does not
 bring the other work under the scope of these terms.

   3. You may copy and distribute the Program (or a portion or
 derivative of it, under Paragraph 2) in object code or executable form
 under the terms of Paragraphs 1 and 2 above provided that you also do
 one of the following:

     a) accompany it with the complete corresponding machine-readable
     source code, which must be distributed under the terms of
     Paragraphs 1 and 2 above; or,

     b) accompany it with a written offer, valid for at least three
     years, to give any third party free (except for a nominal charge
     for the cost of distribution) a complete machine-readable copy of
     the corresponding source code, to be distributed under the terms of
     Paragraphs 1 and 2 above; or,

     c) accompany it with the information you received as to where the
     corresponding source code may be obtained.  (This alternative is
     allowed only for noncommercial distribution and only if you
     received the program in object code or executable form alone.)

 Source code for a work means the preferred form of the work for making
 modifications to it.  For an executable file, complete source code
 means all the source code for all modules it contains; but, as a
 special exception, it need not include source code for modules which
 are standard libraries that accompany the operating system on which the
 executable file runs, or for standard header files or definitions files
 that accompany that operating system.

   4. You may not copy, modify, sublicense, distribute or transfer the
 Program except as expressly provided under this General Public License.
 Any attempt otherwise to copy, modify, sublicense, distribute or
 transfer the Program is void, and will automatically terminate your
 rights to use the Program under this License.  However, parties who
 have received copies, or rights to use copies, from you under this
 General Public License will not have their licenses terminated so long
 as such parties remain in full compliance.

   5. By copying, distributing or modifying the Program (or any work
 based on the Program) you indicate your acceptance of this license to
 do so, and all its terms and conditions.

   6. Each time you redistribute the Program (or any work based on the
 Program), the recipient automatically receives a license from the
 original licensor to copy, distribute or modify the Program subject to
 these terms and conditions.  You may not impose any further
 restrictions on the recipients' exercise of the rights granted herein.

   7. The Free Software Foundation may publish revised and/or new
 versions of the General Public License from time to time.  Such new
 versions will be similar in spirit to the present version, but may
 differ in detail to address new problems or concerns.

 Each version is given a distinguishing version number.  If the Program
 specifies a version number of the license which applies to it and "any
 later version", you have the option of following the terms and
 conditions either of that version or of any later version published by
 the Free Software Foundation.  If the Program does not specify a
 version number of the license, you may choose any version ever
 published by the Free Software Foundation.

   8. If you wish to incorporate parts of the Program into other free
 programs whose distribution conditions are different, write to the
 author to ask for permission.  For software which is copyrighted by the
 Free Software Foundation, write to the Free Software Foundation; we
 sometimes make exceptions for this.  Our decision will be guided by the
 two goals of preserving the free status of all derivatives of our free
 software and of promoting the sharing and reuse of software generally.

                            NO WARRANTY

   9. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO
 WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW.
 EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR
 OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND,
 EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
 THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS
 WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
 ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

   10. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
 WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY
 AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU
 FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
 CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE
 PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING
 RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A
 FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF
 SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH
 DAMAGES.

                     END OF TERMS AND CONDITIONS

        Appendix: How to Apply These Terms to Your New Programs

   If you develop a new program, and you want it to be of the greatest
 possible use to humanity, the best way to achieve this is to make it
 free software which everyone can redistribute and change under these
 terms.

   To do so, attach the following notices to the program.  It is safest
 to attach them to the start of each source file to most effectively
 convey the exclusion of warranty; and each file should have at least
 the "copyright" line and a pointer to where the full notice is found.

     <one line to give the program's name and a brief idea of what it
     does.>
     Copyright (C) 19yy  <name of author>

     This program is free software; you can redistribute it and/or
     modify it under the terms of the GNU General Public License as
     published by the Free Software Foundation; either version 1, or (at
     your option) any later version.

     This program is distributed in the hope that it will be useful,
     but WITHOUT ANY WARRANTY; without even the implied warranty of
     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
     GNU General Public License for more details.

     You should have received a copy of the GNU General Public License
     along with this program; if not, write to the Free Software
     Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA
     02110-1301 USA


 Also add information on how to contact you by electronic and paper
 mail.

 If the program is interactive, make it output a short notice like this
 when it starts in an interactive mode:

     Gnomovision version 69, Copyright (C) 19xx name of author
     Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type
     'show w'.  This is free software, and you are welcome to
     redistribute it under certain conditions; type 'show c' for
     details.

 The hypothetical commands 'show w' and 'show c' should show the
 appropriate parts of the General Public License.  Of course, the
 commands you use may be called something other than 'show w' and 'show
 c'; they could even be mouse-clicks or menu items--whatever suits your
 program.

 You should also get your employer (if you work as a programmer) or your
 school, if any, to sign a "copyright disclaimer" for the program, if
 necessary.  Here a sample; alter the names:

   Yoyodyne, Inc., hereby disclaims all copyright interest in the
   program 'Gnomovision' (a program to direct compilers to make passes
   at assemblers) written by James Hacker.

   <signature of Ty Coon>, 1 April 1989
   Ty Coon, President of Vice

 That's all there is to it!

=cut
perl5222delta.pod000064400000030525150344123440007544 0ustar00=encoding utf8

=head1 NAME

perl5222delta - what is new for perl v5.22.2

=head1 DESCRIPTION

This document describes differences between the 5.22.1 release and the 5.22.2
release.

If you are upgrading from an earlier release such as 5.22.0, first read
L<perl5221delta>, which describes differences between 5.22.0 and 5.22.1.

=head1 Security

=head2 Fix out of boundary access in Win32 path handling

This is CVE-2015-8608.  For more information see
L<[perl #126755]|https://rt.perl.org/Ticket/Display.html?id=126755>.

=head2 Fix loss of taint in C<canonpath()>

This is CVE-2015-8607.  For more information see
L<[perl #126862]|https://rt.perl.org/Ticket/Display.html?id=126862>.

=head2 Set proper umask before calling C<mkstemp(3)>

In 5.22.0 perl started setting umask to C<0600> before calling C<mkstemp(3)>
and restoring it afterwards.  This wrongly tells C<open(2)> to strip the
owner read and write bits from the given mode before applying it, when the
intent was the opposite: to leave only those bits in place.

Systems that use mode C<0666> in C<mkstemp(3)> (like old versions of glibc)
create a file with permissions C<0066>, leaving world read and write permissions
regardless of current umask.

This has been fixed by using umask C<0177> instead.

L<[perl #127322]|https://rt.perl.org/Ticket/Display.html?id=127322>
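
The arithmetic behind this fix can be checked with a small C sketch.  The
helper name C<effective_mode> is hypothetical and is not part of perl or
libc; it only models the rule that C<open(2)> applies, namely that the
created mode is the requested mode with the umask bits cleared.

```c
#include <assert.h>
#include <sys/types.h>

/* Hypothetical helper (an illustration, not a perl or libc function):
 * models how open(2) combines a requested mode with the process umask,
 * i.e. requested & ~umask. */
static mode_t effective_mode(mode_t requested, mode_t mask)
{
    /* Only the nine permission bits are meaningful in this sketch. */
    assert((requested & ~(mode_t)0777) == 0);
    return requested & ~mask & 0777;
}
```

With a requested mode of C<0666>, a umask of C<0600> leaves C<0066> (group
and world access only), whereas a umask of C<0177> leaves C<0600> (owner
read and write only).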

=head2 Avoid accessing uninitialized memory in Win32 C<crypt()>

Validation that will detect both a short salt and invalid characters in the
salt has been added.

L<[perl #126922]|https://rt.perl.org/Ticket/Display.html?id=126922>

=head2 Remove duplicate environment variables from C<environ>

Previously, if an environment variable appeared more than once in C<environ[]>,
L<C<%ENV>|perlvar/%ENV> would contain the last entry for that name, while a
typical C<getenv()> would return the first entry.  We now make sure C<%ENV>
contains the same as what C<getenv()> returns.

Secondly, we now remove duplicates from C<environ[]>, so if a setting with that
name is set in C<%ENV> we won't pass an unsafe value to a child process.

This is CVE-2016-2381.
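
The idea of keeping only the first entry for each name, which matches what
a typical C<getenv()> returns, can be sketched in C as follows.  This is
not perl's actual implementation; the helper names C<name_len> and
C<dedupe_env> are hypothetical, and the array is assumed to be a writable,
NULL-terminated list of C<NAME=value> strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the NAME part of a "NAME=value" entry. */
static size_t name_len(const char *entry)
{
    const char *eq = strchr(entry, '=');
    return eq ? (size_t)(eq - entry) : strlen(entry);
}

/* Hypothetical helper: compacts a NULL-terminated environ-style array in
 * place, keeping only the first entry for each name.  Returns the number
 * of entries kept. */
static size_t dedupe_env(char **env)
{
    size_t kept = 0;

    assert(env != NULL);
    for (size_t i = 0; env[i] != NULL; i++) {
        size_t len = name_len(env[i]);
        int seen = 0;

        /* Compare against the names already kept at the front. */
        for (size_t j = 0; j < kept; j++) {
            if (name_len(env[j]) == len
                && strncmp(env[j], env[i], len) == 0) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            env[kept++] = env[i];
    }
    env[kept] = NULL;   /* re-terminate the shortened array */
    return kept;
}
```

The quadratic scan is acceptable for a sketch because environments are
usually small; a real implementation would also have to consider who owns
the memory of the discarded entries.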

=head1 Incompatible Changes

There are no changes intentionally incompatible with Perl 5.22.1.  If any
exist, they are bugs, and we request that you submit a report.  See
L</Reporting Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<File::Spec> has been upgraded from version 3.56 to 3.56_01.

C<canonpath()> now preserves taint.  See L</"Fix loss of taint in
C<canonpath()>">.

=item *

L<Module::CoreList> has been upgraded from version 5.20151213 to 5.20160429.

The version number of L<Digest::SHA> listed for Perl 5.18.4 was wrong and has
been corrected.  Likewise for the version number of L<Config> in 5.18.3 and
5.18.4.
L<[perl #127624]|https://rt.perl.org/Ticket/Display.html?id=127624>

=back

=head1 Documentation

=head2 Changes to Existing Documentation

=head3 L<perldiag>

=over 4

=item *

The explanation of the warning "unable to close filehandle %s properly: %s"
which can occur when doing an implicit close of a filehandle has been expanded
and improved.

=back

=head3 L<perlfunc>

=over 4

=item *

The documentation of L<C<hex()>|perlfunc/hex> has been revised to clarify valid
inputs.

=back

=head1 Configuration and Compilation

=over 4

=item *

Dtrace builds now build successfully on systems with a newer dtrace that
require an input object file that uses the probes in the F<.d> file.

Previously the probe would fail and cause a build failure.

L<[perl #122287]|https://rt.perl.org/Ticket/Display.html?id=122287>

=item *

F<Configure> no longer probes for F<libnm> by default.  Originally this was the
"New Math" library, but the name has been re-used by the GNOME NetworkManager.

L<[perl #127131]|https://rt.perl.org/Ticket/Display.html?id=127131>

=item *

F<Configure> now knows about gcc 5.

=item *

Compiling perl with B<-DPERL_MEM_LOG> now works again.

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item Darwin

Compiling perl with B<-Dusecbacktrace> on Darwin now works again.

L<[perl #127764]|https://rt.perl.org/Ticket/Display.html?id=127764>

=item OS X/Darwin

Builds with both B<-DDEBUGGING> and threading enabled would fail with a "panic:
free from wrong pool" error when built or tested from Terminal on OS X.  This
was caused by perl's internal management of the environment conflicting with an
atfork handler using the libc C<setenv()> function to update the environment.

Perl now uses C<setenv()>/C<unsetenv()> to update the environment on OS X.

L<[perl #126240]|https://rt.perl.org/Ticket/Display.html?id=126240>

=item ppc64el

The floating point format of ppc64el (Debian naming for little-endian PowerPC)
is now detected correctly.

=item Tru64

A test failure in F<t/porting/extrefs.t> has been fixed.

=back

=head1 Internal Changes

=over 4

=item *

An unwarranted assertion in C<Perl_newATTRSUB_x()> has been removed.  If a stub
subroutine definition with a prototype has been seen, then any subsequent stub
(or definition) of the same subroutine with an attribute was causing an
assertion failure because of a null pointer.

L<[perl #126845]|https://rt.perl.org/Ticket/Display.html?id=126845>

=back

=head1 Selected Bug Fixes

=over 4

=item *

Calls to the placeholder C<&PL_sv_yes> used internally when an C<import()> or
C<unimport()> method isn't found now correctly handle scalar context.
L<[perl #126042]|https://rt.perl.org/Ticket/Display.html?id=126042>

=item *

The L<C<pipe()>|perlfunc/pipe> operator would assert for C<DEBUGGING> builds
instead of producing the correct error message.  The condition asserted on is
detected and reported on correctly without the assertions, so the assertions
were removed.
L<[perl #126480]|https://rt.perl.org/Ticket/Display.html?id=126480>

=item *

In some cases, failing to parse a here-doc would attempt to use freed memory.
This was caused by a pointer not being restored correctly.
L<[perl #126443]|https://rt.perl.org/Ticket/Display.html?id=126443>

=item *

Perl now reports more context when it sees an array where it expects to see an
operator, and avoids an assertion failure.
L<[perl #123737]|https://rt.perl.org/Ticket/Display.html?id=123737>

=item *

If a here-doc was found while parsing another operator, the parser had already
read end of file, and the here-doc was not terminated, perl could produce an
assertion or a segmentation fault.  This now reliably complains about the
unterminated here-doc.
L<[perl #125540]|https://rt.perl.org/Ticket/Display.html?id=125540>

=item *

Parsing beyond the end of the buffer when processing a C<#line> directive with
no filename is now avoided.
L<[perl #127334]|https://rt.perl.org/Ticket/Display.html?id=127334>

=item *

Perl 5.22.0 added support for the C99 hexadecimal floating point notation, but
sometimes misparsed hex floats.  This has been fixed.
L<[perl #127183]|https://rt.perl.org/Ticket/Display.html?id=127183>

=item *

Certain regex patterns involving a complemented POSIX class in an inverted
bracketed character class, and optionally matching something else, would
improperly fail to match.  An example of one that could fail is
C<qr/_?[^\Wbar]\x{100}/>.  This has been fixed.
L<[perl #127537]|https://rt.perl.org/Ticket/Display.html?id=127537>

=item *

Fixed an issue with L<C<pack()>|perlfunc/pack> where C<< pack "H" >> (and
C<< pack "h" >>) could read past the source when given a non-utf8 source and a
utf8 target.
L<[perl #126325]|https://rt.perl.org/Ticket/Display.html?id=126325>

=item *

Fixed some cases where perl would abort due to a segmentation fault, or a
C-level assert.
L<[perl #126193]|https://rt.perl.org/Ticket/Display.html?id=126193>
L<[perl #126257]|https://rt.perl.org/Ticket/Display.html?id=126257>
L<[perl #126258]|https://rt.perl.org/Ticket/Display.html?id=126258>
L<[perl #126405]|https://rt.perl.org/Ticket/Display.html?id=126405>
L<[perl #126602]|https://rt.perl.org/Ticket/Display.html?id=126602>
L<[perl #127773]|https://rt.perl.org/Ticket/Display.html?id=127773>
L<[perl #127786]|https://rt.perl.org/Ticket/Display.html?id=127786>

=item *

A memory leak when setting C<$ENV{foo}> on Darwin has been fixed.
L<[perl #126240]|https://rt.perl.org/Ticket/Display.html?id=126240>

=item *

Perl now correctly raises an error when trying to compile patterns with
unterminated character classes while there are trailing backslashes.
L<[perl #126141]|https://rt.perl.org/Ticket/Display.html?id=126141>

=item *

C<NOTHING> regops and C<EXACTFU_SS> regops in C<make_trie()> are now handled
properly.
L<[perl #126206]|https://rt.perl.org/Ticket/Display.html?id=126206>

=item *

Perl now only tests C<semctl()> if we have everything needed to use it.  In
FreeBSD the C<semctl()> entry point may exist, but it can be disabled by
policy.
L<[perl #127533]|https://rt.perl.org/Ticket/Display.html?id=127533>

=item *

A regression that allowed undeclared barewords as hash keys to work despite
strictures has been fixed.
L<[perl #126981]|https://rt.perl.org/Ticket/Display.html?id=126981>

=item *

As an optimization (introduced in Perl 5.20.0), L<C<uc()>|perlfunc/uc>,
L<C<lc()>|perlfunc/lc>, L<C<ucfirst()>|perlfunc/ucfirst> and
L<C<lcfirst()>|perlfunc/lcfirst> sometimes modify their argument in-place
rather than returning a modified copy.  The criteria for this optimization have
been made stricter to avoid these functions accidentally modifying in-place
when they should not, which has been happening in some cases, e.g. in
L<List::Util>.

=item *

Excessive memory usage in the compilation of some regular expressions involving
non-ASCII characters has been reduced.  A more complete fix is forthcoming in
Perl 5.24.0.

=back

=head1 Acknowledgements

Perl 5.22.2 represents approximately 5 months of development since Perl 5.22.1
and contains approximately 3,000 lines of changes across 110 files from 24
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 1,500 lines of changes to 52 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.22.2:

Aaron Crane, Abigail, Andreas König, Aristotle Pagaltzis, Chris 'BinGOs'
Williams, Craig A. Berry, Dagfinn Ilmari Mannsåker, David Golden, David
Mitchell, H.Merijn Brand, James E Keenan, Jarkko Hietaniemi, Karen Etheridge,
Karl Williamson, Matthew Horsfall, Niko Tyni, Ricardo Signes, Sawyer X, Stevan
Little, Steve Hay, Todd Rinaldo, Tony Cook, Vladimir Timofeev, Yves Orton.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
https://rt.perl.org/ .  There may also be information at http://www.perl.org/ ,
the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perl5125delta.pod000064400000017003150344123440007542 0ustar00=encoding utf8

=head1 NAME

perl5125delta - what is new for perl v5.12.5

=head1 DESCRIPTION

This document describes differences between the 5.12.4 release and
the 5.12.5 release.

If you are upgrading from an earlier release such as 5.12.3, first read
L<perl5124delta>, which describes differences between 5.12.3 and
5.12.4.

=head1 Security

=head2 C<Encode> decode_xs n-byte heap-overflow (CVE-2011-2939)

A bug in C<Encode> could, on certain inputs, cause the heap to overflow.
This problem has been corrected.  Bug reported by Robert Zacek.

=head2 C<File::Glob::bsd_glob()> memory error with GLOB_ALTDIRFUNC (CVE-2011-2728)

Calling C<File::Glob::bsd_glob> with the unsupported flag GLOB_ALTDIRFUNC would 
cause an access violation / segfault.  A Perl program that accepts a flags value from
an external source could expose itself to denial of service or arbitrary code
execution attacks.  There are no known exploits in the wild.  The problem has been
corrected by explicitly disabling all unsupported flags and setting unused function
pointers to null.  Bug reported by Clément Lecigne.

=head2 Heap buffer overrun in 'x' string repeat operator (CVE-2012-5195)

Poorly written perl code that allows an attacker to specify the count to
perl's 'x' string repeat operator can already cause a memory exhaustion
denial-of-service attack. A flaw in versions of perl before 5.15.5 can
escalate that into a heap buffer overrun; coupled with versions of glibc
before 2.16, it possibly allows the execution of arbitrary code.

This problem has been fixed.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.12.4. If any
exist, they are bugs and reports are welcome.

=head1 Modules and Pragmata

=head2 Updated Modules

=head3 L<B::Concise>

L<B::Concise> no longer produces mangled output with the B<-tree> option
[perl #80632].

=head3 L<charnames>

A regression introduced in Perl 5.8.8, which caused
C<charnames::viacode(0)> to return C<undef> instead of the string "NULL",
has been fixed [perl #72624].

=head3 L<Encode> has been upgraded from version 2.39 to version 2.39_01.

See L</Security>.

=head3 L<File::Glob> has been upgraded from version 1.07 to version 1.07_01.

See L</Security>.

=head3 L<Unicode::UCD>

The documentation for the C<upper> function now actually says "upper", not
"lower".

=head3 L<Module::CoreList>

L<Module::CoreList> has been updated to version 2.50_02 to add data for
this release.

=head1 Changes to Existing Documentation

=head2 L<perlebcdic>

The L<perlebcdic> document contains a helpful table to use in C<tr///> to
convert between EBCDIC and Latin1/ASCII.  Unfortunately, the table was the
inverse of the one it describes.  This has been corrected.

=head2 L<perlunicode>

The section on
L<User-Defined Case Mappings|perlunicode/User-Defined Case Mappings> had
some bad markup and unclear sentences, making parts of it unreadable.  This
has been rectified.

=head2 L<perluniprops>

This document has been corrected to take non-ASCII platforms into account.

=head1 Installation and Configuration Improvements

=head2 Platform Specific Changes

=over 4

=item Mac OS X

There have been configuration and test fixes to make Perl build cleanly on
Lion and Mountain Lion.

=item NetBSD

The NetBSD hints file was corrected to be compatible with NetBSD 6.*

=back

=head1 Selected Bug Fixes

=over 4

=item *

C<chop> now correctly handles characters above "\x{7fffffff}"
[perl #73246].

=item *

C<< ($<,$>) = (...) >> stopped working properly in 5.12.0.  It is supposed
to make a single C<setreuid()> call, but it was calling C<setruid()> and
C<seteuid()> separately instead.  This has been fixed [perl #75212].

=item *

Fixed a regression of kill() when a match variable is used for the
process ID to kill [perl #75812].

=item *

C<UNIVERSAL::VERSION> no longer leaks memory.  It started leaking in Perl
5.10.0.

=item *

The C-level C<my_strftime> function no longer leaks memory.  This fixes a
memory leak in C<POSIX::strftime> [perl #73520].

=item *

C<caller> no longer leaks memory when called from the DB package if
C<@DB::args> was assigned to after the first call to C<caller>.  L<Carp>
was triggering this bug [perl #97010].

=item *

Passing to C<index> an offset beyond the end of the string when the string
is encoded internally in UTF8 no longer causes panics [perl #75898].

=item *

Syntax errors in C<< (?{...}) >> blocks in regular expressions no longer
cause panic messages [perl #2353].

=item *

Perl 5.10.0 introduced some faulty logic that made "U*" in the middle of
a pack template equivalent to "U0" if the input string was empty.  This has
been fixed [perl #90160].

=back

=head1 Errata

=head2 split() and C<@_>

split() no longer modifies C<@_> when called in scalar or void context.
In void context it now produces a "Useless use of split" warning.
This is actually a change introduced in perl 5.12.0, but it was omitted from
that release's L<perl5120delta>.

=head1 Acknowledgements

Perl 5.12.5 represents approximately 17 months of development since Perl 5.12.4
and contains approximately 1,900 lines of changes across 64 files from 18
authors.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers. The following people are known to have contributed the
improvements that became Perl 5.12.5:

Andy Dougherty, Chris 'BinGOs' Williams, Craig A. Berry, David Mitchell,
Dominic Hargreaves, Father Chrysostomos, Florian Ragwitz, George Greer, Goro
Fuji, Jesse Vincent, Karl Williamson, Leon Brocard, Nicholas Clark, Rafael
Garcia-Suarez, Reini Urban, Ricardo Signes, Steve Hay, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated
from version control history. In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core. We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be able
to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for
security issues in the Perl core, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlxstut.pod000064400000141657150344123440007340 0ustar00=head1 NAME

perlxstut - Tutorial for writing XSUBs

=head1 DESCRIPTION

This tutorial will educate the reader on the steps involved in creating
a Perl extension.  The reader is assumed to have access to L<perlguts>,
L<perlapi> and L<perlxs>.

This tutorial starts with very simple examples and becomes more complex,
with each new example adding new features.  Certain concepts may not be
completely explained until later in the tutorial in order to slowly ease
the reader into building extensions.

This tutorial was written from a Unix point of view.  Where I know the
details to differ on other platforms (e.g. Win32), I will say so.  If you
find something that was missed, please let me know.

=head1 SPECIAL NOTES

=head2 make

This tutorial assumes that the make program that Perl is configured to
use is called C<make>.  Instead of running "make" in the examples that
follow, you may have to substitute whatever make program Perl has been
configured to use.  Running B<perl -V:make> should tell you what it is.

=head2 Version caveat

When writing a Perl extension for general consumption, one should expect that
the extension will be used with versions of Perl different from the
version available on your machine.  Since you are reading this document,
the version of Perl on your machine is probably 5.005 or later, but the users
of your extension may have more ancient versions.

To understand what kinds of incompatibilities one may expect, and in the rare
case that the version of Perl on your machine is older than this document,
see the section on "Troubleshooting these Examples" for more information.

If your extension uses some features of Perl which are not available on older
releases of Perl, your users would appreciate an early meaningful warning.
You would probably put this information into the F<README> file, but nowadays
installation of extensions may be performed automatically, guided by the
F<CPAN.pm> module or other tools.

In MakeMaker-based installations, F<Makefile.PL> provides the earliest
opportunity to perform version checks.  One can put something like this
in F<Makefile.PL> for this purpose:

    eval { require 5.007 }
        or die <<EOD;
    ############
    ### This module uses frobnication framework which is not available
    ### before version 5.007 of Perl.  Upgrade your Perl before
    ### installing Kara::Mba.
    ############
    EOD

=head2 Dynamic Loading versus Static Loading

It is commonly thought that if a system does not have the capability to
dynamically load a library, you cannot build XSUBs.  This is incorrect.
You I<can> build them, but you must link the XSUB subroutines with the
rest of Perl, creating a new executable.  This situation is similar to
Perl 4.

This tutorial can still be used on such a system.  The XSUB build mechanism
will check the system and build a dynamically-loadable library if possible,
or else a static library and then, optionally, a new statically-linked
executable with that static library linked in.

Should you wish to build a statically-linked executable on a system which
can dynamically load libraries, you may, in all the following examples,
where the command "C<make>" with no arguments is executed, run the command
"C<make perl>" instead.

If you have generated such a statically-linked executable by choice, then
instead of saying "C<make test>", you should say "C<make test_static>".
On systems that cannot build dynamically-loadable libraries at all, simply
saying "C<make test>" is sufficient.

=head2 Threads and PERL_NO_GET_CONTEXT

For threaded builds, perl requires the context pointer for the current
thread.  Without C<PERL_NO_GET_CONTEXT>, perl will call a function to
retrieve the context.

For improved performance, include:

  #define PERL_NO_GET_CONTEXT

as shown below.

For more details, see L<perlguts|perlguts/How multiple interpreters
and concurrency are supported>.

=head1 TUTORIAL

Now let's go on with the show!

=head2 EXAMPLE 1

Our first extension will be very simple.  When we call the routine in the
extension, it will print out a well-known message and return.

Run "C<h2xs -A -n Mytest>".  This creates a directory named Mytest,
possibly under ext/ if that directory exists in the current working
directory.  Several files will be created under the Mytest dir, including
MANIFEST, Makefile.PL, lib/Mytest.pm, Mytest.xs, t/Mytest.t, and Changes.

The MANIFEST file contains the names of all the files just created in the
Mytest directory.

The file Makefile.PL should look something like this:

    use ExtUtils::MakeMaker;
    # See lib/ExtUtils/MakeMaker.pm for details of how to influence
    # the contents of the Makefile that is written.
    WriteMakefile(
	NAME         => 'Mytest',
	VERSION_FROM => 'lib/Mytest.pm', # finds $VERSION
	LIBS         => [''],   # e.g., '-lm'
	DEFINE       => '',     # e.g., '-DHAVE_SOMETHING'
	INC          => '',     # e.g., '-I/usr/include/other'
    );

The file Mytest.pm should start with something like this:

    package Mytest;

    use 5.008008;
    use strict;
    use warnings;

    require Exporter;

    our @ISA = qw(Exporter);
    our %EXPORT_TAGS = ( 'all' => [ qw(

    ) ] );

    our @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );

    our @EXPORT = qw(

    );

    our $VERSION = '0.01';

    require XSLoader;
    XSLoader::load('Mytest', $VERSION);

    # Preloaded methods go here.

    1;
    __END__
    # Below is the stub of documentation for your module. You better
    # edit it!

The rest of the .pm file contains sample code for providing documentation for
the extension.

Finally, the Mytest.xs file should look something like this:

    #define PERL_NO_GET_CONTEXT
    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    #include "ppport.h"

    MODULE = Mytest		PACKAGE = Mytest

Let's edit the .xs file by adding this to the end of the file:

    void
    hello()
	CODE:
	    printf("Hello, world!\n");

It is okay for the lines starting at the "CODE:" line to not be indented.
However, for readability purposes, it is suggested that you indent CODE:
one level and the lines following one more level.

Now we'll run "C<perl Makefile.PL>".  This will create a real Makefile,
which make needs.  Its output looks something like:

    % perl Makefile.PL
    Checking if your kit is complete...
    Looks good
    Writing Makefile for Mytest
    %

Now, running make will produce output that looks something like this (some
long lines have been shortened for clarity and some extraneous lines have
been deleted):

 % make
 cp lib/Mytest.pm blib/lib/Mytest.pm
 perl xsubpp  -typemap typemap  Mytest.xs > Mytest.xsc && \
 mv Mytest.xsc Mytest.c
 Please specify prototyping behavior for Mytest.xs (see perlxs manual)
 cc -c     Mytest.c
 Running Mkbootstrap for Mytest ()
 chmod 644 Mytest.bs
 rm -f blib/arch/auto/Mytest/Mytest.so
 cc -shared -L/usr/local/lib Mytest.o -o blib/arch/auto/Mytest/Mytest.so

 chmod 755 blib/arch/auto/Mytest/Mytest.so
 cp Mytest.bs blib/arch/auto/Mytest/Mytest.bs
 chmod 644 blib/arch/auto/Mytest/Mytest.bs
 Manifying blib/man3/Mytest.3pm
 %

You can safely ignore the line about "prototyping behavior" - it is
explained in L<perlxs/"The PROTOTYPES: Keyword">.

Perl has its own special way of easily writing test scripts, but for this
example only, we'll create our own test script.  Create a file called hello
that looks like this:

    #! /opt/perl5/bin/perl

    use ExtUtils::testlib;

    use Mytest;

    Mytest::hello();

Now we make the script executable (C<chmod +x hello>), run the script
and we should see the following output:

    % ./hello
    Hello, world!
    %

=head2 EXAMPLE 2

Now let's add to our extension a subroutine that will take a single numeric
argument as input and return 1 if the number is even or 0 if the number
is odd.

Add the following to the end of Mytest.xs:

    int
    is_even(input)
	    int input
	CODE:
	    RETVAL = (input % 2 == 0);
	OUTPUT:
	    RETVAL

There does not need to be whitespace at the start of the "C<int input>"
line, but it is useful for improving readability.  Placing a semi-colon at
the end of that line is also optional.  Any amount and kind of whitespace
may be placed between the "C<int>" and "C<input>".
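
The body of the C<CODE:> section is ordinary C.  As a rough sketch outside
of XS, the same test can be written as a plain C function; the name
C<is_even_c> is hypothetical, and in F<Mytest.xs> the result is assigned to
C<RETVAL> rather than returned.

```c
#include <assert.h>

/* Hypothetical stand-alone version of the XSUB body above. */
static int is_even_c(int input)
{
    int result = (input % 2 == 0);

    /* A C comparison always yields exactly 0 or 1. */
    assert(result == 0 || result == 1);
    return result;
}
```

Comparing the remainder with 0, rather than testing for a remainder of 1,
also gives the expected answer for negative input, because in C the
remainder of a negative odd number divided by 2 is -1.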

Now perform the same steps as before: generate a Makefile from the
Makefile.PL file, then run make to rebuild our shared library.

In order to test that our extension works, we now need to look at the
file F<t/Mytest.t>.  This file is set up to imitate the same kind of testing
structure that Perl itself has.  Within the test script, you perform a
number of tests to confirm the behavior of the extension, printing "ok"
when the test is correct, "not ok" when it is not.  Edit the script so that
it reads:

    use Test::More tests => 4;
    BEGIN { use_ok('Mytest') };

    #########################

    # Insert your test code below, the Test::More module is use()ed here
    # so read its man page ( perldoc Test::More ) for help writing this
    # test script.

    is(&Mytest::is_even(0), 1);
    is(&Mytest::is_even(1), 0);
    is(&Mytest::is_even(2), 1);

We will be calling the test script through the command "C<make test>".  You
should see output that looks something like this:

 % make test
 PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
 "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
 t/Mytest....ok
 All tests successful.
 Files=1, Tests=4, 0 wallclock secs ( 0.03 cusr + 0.00 csys = 0.03 CPU)
 %

=head2 What has gone on?

The program h2xs is the starting point for creating extensions.  In later
examples we'll see how we can use h2xs to read header files and generate
templates to connect to C routines.

h2xs creates a number of files in the extension directory.  The file
Makefile.PL is a perl script which will generate a true Makefile to build
the extension.  We'll take a closer look at it later.

The .pm and .xs files contain the meat of the extension.  The .xs file holds
the C routines that make up the extension.  The .pm file contains routines
that tell Perl how to load your extension.

Generating the Makefile and running C<make> created a directory called blib
(which stands for "build library") in the current working directory.  This
directory will contain the shared library that we will build.  Once we have
tested it, we can install it into its final location.

Invoking the test script via "C<make test>" did something very important.
It invoked perl with the F<blib/lib> and F<blib/arch> directories passed to
the test harness so that it could find the various files that are part of
the extension.  It is I<very> important that you use "C<make test>" while
you are still testing extensions.  If you try to run the test script all by
itself, you will get a fatal error.
Another reason it is important to use "C<make test>" to run your test
script is that if you are testing an upgrade to an already-existing version,
using "C<make test>" ensures that you will test your new extension, not the
already-existing version.

When Perl sees a C<use extension;>, it searches for a file with the same name
as the C<use>'d extension that has a .pm suffix.  If that file cannot be found,
Perl dies with a fatal error.  The default search path is contained in the
C<@INC> array.

In our case, Mytest.pm tells perl that it will need the Exporter and
XSLoader modules.  It then sets the C<@ISA> and C<@EXPORT> arrays and the
C<$VERSION> scalar; finally it calls C<XSLoader::load> to bootstrap the
module.  Perl will call its dynamic loader routine (if there is one) and
load the shared library.

The two arrays C<@ISA> and C<@EXPORT> are very important.  The C<@ISA>
array contains a list of other packages in which to search for methods (or
subroutines) that do not exist in the current package.  This is usually
only important for object-oriented extensions (which we will talk about
much later), and so usually doesn't need to be modified.

The C<@EXPORT> array tells Perl which of the extension's variables and
subroutines should be placed into the calling package's namespace.  Because
you don't know if the user has already used your variable and subroutine
names, it's vitally important to carefully select what to export.  Do I<not>
export method or variable names I<by default> without a good reason.

As a general rule, if the module is trying to be object-oriented then don't
export anything.  If it's just a collection of functions and variables, then
you can export them via another array, called C<@EXPORT_OK>.  This array
does not automatically place its subroutine and variable names into the
namespace unless the user specifically requests that this be done.

See L<perlmod> for more information.

The C<$VERSION> variable is used to ensure that the .pm file and the shared
library are "in sync" with each other.  Any time you make changes to
the .pm or .xs files, you should increment the value of this variable.

=head2 Writing good test scripts

The importance of writing good test scripts cannot be over-emphasized.  You
should closely follow the "ok/not ok" style that Perl itself uses, so that
it is very easy and unambiguous to determine the outcome of each test case.
When you find and fix a bug, make sure you add a test case for it.

By running "C<make test>", you ensure that your Mytest.t script runs and uses
the correct version of your extension.  If you have many test cases,
save your test files in the "t" directory and use the suffix ".t".
When you run "C<make test>", all of these test files will be executed.

=head2 EXAMPLE 3

Our third extension will take one argument as its input, round off that
value, and set the I<argument> to the rounded value.

Add the following to the end of Mytest.xs:

	void
	round(arg)
		double  arg
	    CODE:
		if (arg > 0.0) {
			arg = floor(arg + 0.5);
		} else if (arg < 0.0) {
			arg = ceil(arg - 0.5);
		} else {
			arg = 0.0;
		}
	    OUTPUT:
		arg

Edit the Makefile.PL file so that the corresponding line looks like this:

	'LIBS'      => ['-lm'],   # e.g., '-lm'

Generate the Makefile and run make.  Change the test number in Mytest.t to
"9" and add the following tests:

	$i = -1.5; &Mytest::round($i); is( $i, -2.0 );
	$i = -1.1; &Mytest::round($i); is( $i, -1.0 );
	$i = 0.0; &Mytest::round($i);  is( $i,  0.0 );
	$i = 0.5; &Mytest::round($i);  is( $i,  1.0 );
	$i = 1.2; &Mytest::round($i);  is( $i,  1.0 );

Running "C<make test>" should now print out that all nine tests are okay.

Notice that in these new test cases, the argument passed to round was a
scalar variable.  You might be wondering if you can round a constant or
literal.  To see what happens, temporarily add the following line to Mytest.t:

	&Mytest::round(3);

Run "C<make test>" and notice that Perl dies with a fatal error.  Perl won't
let you change the value of constants!

=head2 What's new here?

=over 4

=item *

We've made some changes to Makefile.PL.  In this case, we've specified an
extra library to be linked into the extension's shared library, the math
library libm in this case.  We'll talk later about how to write XSUBs that
can call every routine in a library.

=item *

The value of the function is not being passed back as the function's return
value, but by changing the value of the variable that was passed into the
function.  You might have guessed that when you saw that the return value
of round is of type "void".

=back

=head2 Input and Output Parameters

You specify the parameters that will be passed into the XSUB on the line(s)
after you declare the function's return value and name.  Each input parameter
line starts with optional whitespace, and may have an optional terminating
semicolon.

The list of output parameters occurs at the very end of the function, just
after the OUTPUT: directive.  The use of RETVAL tells Perl that you
wish to send this value back as the return value of the XSUB function.  In
Example 3, we wanted the "return value" placed in the original variable
which we passed in, so we listed it (and not RETVAL) in the OUTPUT: section.

=head2 The XSUBPP Program

The B<xsubpp> program takes the XS code in the .xs file and translates it into
C code, placing it in a file whose suffix is .c.  The C code created makes
heavy use of the C functions within Perl.

=head2 The TYPEMAP file

The B<xsubpp> program uses rules to convert from Perl's data types (scalar,
array, etc.) to C's data types (int, char, etc.).  These rules are stored
in the typemap file ($PERLLIB/ExtUtils/typemap).  There's a brief discussion
below, but all the nitty-gritty details can be found in L<perlxstypemap>.
If you have a new-enough version of perl (5.16 and up) or an upgraded
XS compiler (C<ExtUtils::ParseXS> 3.13_01 or better), then you can inline
typemaps in your XS instead of writing separate files.
Either way, this typemap thing is split into three parts:

The first section maps various C data types to a name, which corresponds
somewhat with the various Perl types.  The second section contains C code
which B<xsubpp> uses to handle input parameters.  The third section contains
C code which B<xsubpp> uses to handle output parameters.

Let's take a look at a portion of the .c file created for our extension.
The file name is Mytest.c:

	XS(XS_Mytest_round)
	{
	    dXSARGS;
	    if (items != 1)
		Perl_croak(aTHX_ "Usage: Mytest::round(arg)");
	    PERL_UNUSED_VAR(cv); /* -W */
	    {
		double  arg = (double)SvNV(ST(0));	/* XXXXX */
		if (arg > 0.0) {
			arg = floor(arg + 0.5);
		} else if (arg < 0.0) {
			arg = ceil(arg - 0.5);
		} else {
			arg = 0.0;
		}
		sv_setnv(ST(0), (double)arg);	/* XXXXX */
		SvSETMAGIC(ST(0));
	    }
	    XSRETURN_EMPTY;
	}

Notice the two lines commented with "XXXXX".  If you check the first part
of the typemap file (or section), you'll see that doubles are of type
T_DOUBLE.  In the INPUT part of the typemap, an argument that is T_DOUBLE
is handled by calling the routine SvNV on something, casting the result
to double, and assigning it to the variable arg.  Similarly,
in the OUTPUT section, once arg has its final value, it is passed to the
sv_setnv function to be passed back to the calling subroutine.  These two
functions are explained in L<perlguts>; we'll talk more later about what
that "ST(0)" means in the section on the argument stack.

=head2 Warning about Output Arguments

In general, it's not a good idea to write extensions that modify their input
parameters, as in Example 3.  Instead, you should probably return multiple
values in an array and let the caller handle them (we'll do this in a later
example).  However, in order to better accommodate calling pre-existing C
routines, which often do modify their input parameters, this behavior is
tolerated.

=head2 EXAMPLE 4

In this example, we'll now begin to write XSUBs that will interact with
pre-defined C libraries.  To begin with, we will build a small library of
our own, then let h2xs write our .pm and .xs files for us.

Create a new directory called Mytest2 at the same level as the directory
Mytest.  In the Mytest2 directory, create another directory called mylib,
and cd into that directory.

Here we'll create some files that will generate a test library.  These will
include a C source file and a header file.  We'll also create a Makefile.PL
in this directory.  Then we'll make sure that running make at the Mytest2
level will automatically run this Makefile.PL file and the resulting Makefile.

In the mylib directory, create a file mylib.h that looks like this:

	#define TESTVAL	4

	extern double	foo(int, long, const char*);

Also create a file mylib.c that looks like this:

	#include <stdlib.h>
	#include "./mylib.h"

	double
	foo(int a, long b, const char *c)
	{
		return (a + b + atof(c) + TESTVAL);
	}

And finally create a file Makefile.PL that looks like this:

	use ExtUtils::MakeMaker;
	$Verbose = 1;
	WriteMakefile(
	    NAME   => 'Mytest2::mylib',
	    SKIP   => [qw(all static static_lib dynamic dynamic_lib)],
	    clean  => {'FILES' => 'libmylib$(LIB_EXT)'},
	);


	sub MY::top_targets {
		'
	all :: static

	pure_all :: static

	static ::       libmylib$(LIB_EXT)

	libmylib$(LIB_EXT): $(O_FILES)
		$(AR) cr libmylib$(LIB_EXT) $(O_FILES)
		$(RANLIB) libmylib$(LIB_EXT)

	';
	}

Make sure you use a tab and not spaces on the lines beginning with "$(AR)"
and "$(RANLIB)".  Make will not function properly if you use spaces.
It has also been reported that the "cr" argument to $(AR) is unnecessary
on Win32 systems.

We will now create the main top-level Mytest2 files.  Change to the directory
above Mytest2 and run the following command:

	% h2xs -O -n Mytest2 ./Mytest2/mylib/mylib.h

This will print out a warning about overwriting Mytest2, but that's okay.
Our files are stored in Mytest2/mylib, and will be untouched.

The normal Makefile.PL that h2xs generates doesn't know about the mylib
directory.  We need to tell it that there is a subdirectory and that we
will be generating a library in it.  Let's add the argument MYEXTLIB to
the WriteMakefile call so that it looks like this:

	WriteMakefile(
	    'NAME'      => 'Mytest2',
	    'VERSION_FROM' => 'Mytest2.pm', # finds $VERSION
	    'LIBS'      => [''],   # e.g., '-lm'
	    'DEFINE'    => '',     # e.g., '-DHAVE_SOMETHING'
	    'INC'       => '',     # e.g., '-I/usr/include/other'
	    'MYEXTLIB' => 'mylib/libmylib$(LIB_EXT)',
	);

and then at the end add a subroutine (which will override the pre-existing
subroutine).  Remember to use a tab character to indent the line beginning
with "cd"!

	sub MY::postamble {
	'
	$(MYEXTLIB): mylib/Makefile
		cd mylib && $(MAKE) $(PASSTHRU)
	';
	}

Let's also fix the MANIFEST file so that it accurately reflects the contents
of our extension.  The single line that says "mylib" should be replaced by
the following three lines:

	mylib/Makefile.PL
	mylib/mylib.c
	mylib/mylib.h

To keep our namespace nice and unpolluted, edit the .pm file and change
the variable C<@EXPORT> to C<@EXPORT_OK>.  Finally, in the
.xs file, edit the #include line to read:

	#include "mylib/mylib.h"

And also add the following function definition to the end of the .xs file:

	double
	foo(a,b,c)
		int             a
		long            b
		const char *    c
	    OUTPUT:
		RETVAL

Now we also need to create a typemap because the default Perl doesn't
currently support the C<const char *> type.  Include a new TYPEMAP
section in your XS code before the above function:

        TYPEMAP: <<END
	const char *	T_PV
        END

Now run perl on the top-level Makefile.PL.  Notice that it also created a
Makefile in the mylib directory.  Run make and watch that it does cd into
the mylib directory and run make in there as well.

Now edit the Mytest2.t script and change the number of tests to "4",
and add the following lines to the end of the script:

	is( &Mytest2::foo(1, 2, "Hello, world!"), 7 );
	is( &Mytest2::foo(1, 2, "0.0"), 7 );
	ok( abs(&Mytest2::foo(0, 0, "-3.4") - 0.6) <= 0.01 );

(When dealing with floating-point comparisons, it is best not to check for
equality, but rather to check that the difference between the expected and
actual result is below a certain amount, called epsilon, which is 0.01 in
this case.)

Run "C<make test>" and all should be well. There are some warnings on missing
tests for the Mytest2::mylib extension, but you can ignore them.
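
The arithmetic behind these three tests can be checked outside of Perl.  The
following is a minimal sketch in plain C, not part of the extension: it
repeats the C<foo> function and C<TESTVAL> from mylib, and adds a helper
called C<nearly_equal> (a hypothetical name introduced only for this
illustration) that performs an epsilon comparison like the one in the
third test.

	#include <assert.h>
	#include <stdlib.h>

	#define TESTVAL	4

	/* Same body as foo() in mylib.c. */
	double
	foo(int a, long b, const char *c)
	{
		return (a + b + atof(c) + TESTVAL);
	}

	/* Hypothetical helper: true if actual is within epsilon of expected. */
	int
	nearly_equal(double actual, double expected, double epsilon)
	{
		double diff = actual - expected;

		assert(epsilon >= 0.0);
		if (diff < 0.0)
			diff = -diff;
		return (diff <= epsilon);
	}

Since C<atof> yields 0 for a string that does not start with a number,
C<foo(1, 2, "Hello, world!")> is 1 + 2 + 0 + 4, which is why the first test
expects 7.  A call such as C<nearly_equal(foo(0, 0, "-3.4"), 0.6, 0.01)>
corresponds to the third test.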

=head2 What has happened here?

Unlike previous examples, we've now run h2xs on a real include file.  This
has caused some extra goodies to appear in both the .pm and .xs files.

=over 4

=item *

In the .xs file, there's now a #include directive with the absolute path to
the mylib.h header file.  We changed this to a relative path so that we
could move the extension directory if we wanted to.

=item *

There's now some new C code that's been added to the .xs file.  The purpose
of the C<constant> routine is to make the values that are #define'd in the
header file accessible by the Perl script (by calling either C<TESTVAL> or
C<&Mytest2::TESTVAL>).  There's also some XS code to allow calls to the
C<constant> routine.

=item *

The .pm file originally exported the name C<TESTVAL> in the C<@EXPORT> array.
This could lead to name clashes.  A good rule of thumb is that if the #define
is only going to be used by the C routines themselves, and not by the user,
they should be removed from the C<@EXPORT> array.  Alternately, if you don't
mind using the "fully qualified name" of a variable, you could move most
or all of the items from the C<@EXPORT> array into the C<@EXPORT_OK> array.

=item *

If our include file had contained #include directives, these would not have
been processed by h2xs.  There is no good solution to this right now.

=item *

We've also told Perl about the library that we built in the mylib
subdirectory.  That required only the addition of the C<MYEXTLIB> variable
to the WriteMakefile call and the replacement of the postamble subroutine
to cd into the subdirectory and run make.  The Makefile.PL for the
library is a bit more complicated, but not excessively so.  There we
replaced the top_targets subroutine to insert our own code.  This code
simply specified that the library to be created here was a static archive
library (as opposed to a dynamically loadable library) and provided the
commands to build it.

=back

=head2 Anatomy of .xs file

The .xs file of L<"EXAMPLE 4"> contained some new elements.  To understand
the meaning of these elements, pay attention to the line which reads

	MODULE = Mytest2		PACKAGE = Mytest2

Anything before this line is plain C code which describes which headers
to include, and defines some convenience functions.  No translations are
performed on this part: apart from embedded POD documentation, which is
skipped over (see L<perlpod>), it goes into the generated output C file as is.

Anything after this line is the description of XSUB functions.
These descriptions are translated by B<xsubpp> into C code which
implements these functions using Perl calling conventions, and which
makes these functions visible from the Perl interpreter.

Pay special attention to the function C<constant>.  This name appears
twice in the generated .xs file: once in the first part, as a static C
function, then another time in the second part, when an XSUB interface to
this static C function is defined.

This is quite typical for .xs files: usually the .xs file provides
an interface to an existing C function.  Then this C function is defined
somewhere (either in an external library, or in the first part of .xs file),
and a Perl interface to this function (i.e. "Perl glue") is described in the
second part of .xs file.  The situation in L<"EXAMPLE 1">, L<"EXAMPLE 2">,
and L<"EXAMPLE 3">, when all the work is done inside the "Perl glue", is
somewhat of an exception rather than the rule.

=head2 Getting the fat out of XSUBs

In L<"EXAMPLE 4"> the second part of .xs file contained the following
description of an XSUB:

	double
	foo(a,b,c)
		int             a
		long            b
		const char *    c
	    OUTPUT:
		RETVAL

Note that in contrast with L<"EXAMPLE 1">, L<"EXAMPLE 2"> and L<"EXAMPLE 3">,
this description does not contain the actual I<code> for what is done
during a call to Perl function foo().  To understand what is going
on here, one can add a CODE section to this XSUB:

	double
	foo(a,b,c)
		int             a
		long            b
		const char *    c
	    CODE:
		RETVAL = foo(a,b,c);
	    OUTPUT:
		RETVAL

However, these two XSUBs produce almost identical generated C code: the
B<xsubpp> compiler is smart enough to figure out the C<CODE:> section from
the first two lines of the description of the XSUB.  What about the
C<OUTPUT:> section?  In fact, that is absolutely the same!  The C<OUTPUT:>
section can be removed as well, I<as long as> neither a C<CODE:> section nor
a C<PPCODE:> section is specified: B<xsubpp> can see that it needs to
generate a function call section, and will autogenerate the OUTPUT section
too.  Thus one can shortcut the XSUB to become:

	double
	foo(a,b,c)
		int             a
		long            b
		const char *    c

Can we do the same with an XSUB

	int
	is_even(input)
		int	input
	    CODE:
		RETVAL = (input % 2 == 0);
	    OUTPUT:
		RETVAL

of L<"EXAMPLE 2">?  To do this, one needs to define a C function C<int
is_even(int input)>.  As we saw in L<Anatomy of .xs file>, a proper place
for this definition is in the first part of .xs file.  In fact a C function

	int
	is_even(int arg)
	{
		return (arg % 2 == 0);
	}

is probably overkill for this.  Something as simple as a C<#define> will
do too:

	#define is_even(arg)	((arg) % 2 == 0)

After having this in the first part of .xs file, the "Perl glue" part becomes
as simple as

	int
	is_even(input)
		int	input

This technique of separation of the glue part from the workhorse part has
obvious tradeoffs: if you want to change a Perl interface, you need to
change two places in your code.  However, it removes a lot of clutter,
and makes the workhorse part independent from idiosyncrasies of Perl calling
convention.  (In fact, there is nothing Perl-specific in the above description,
a different version of B<xsubpp> might have translated this to TCL glue or
Python glue as well.)
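
The two workhorse variants above can be compared in plain C.  The sketch
below is only an illustration: the names C<is_even_fn>, C<is_even_bad> and
C<check_is_even> are hypothetical and were chosen so that all definitions
can coexist in one file.  It shows why the argument of the macro is wrapped
in parentheses.

	#include <assert.h>

	/* Workhorse as a function: the argument is evaluated once. */
	int
	is_even_fn(int arg)
	{
		return (arg % 2 == 0);
	}

	/* Workhorse as a macro, as suggested in the text. */
	#define is_even(arg)	((arg) % 2 == 0)

	/*
	 * Hypothetical variant without the inner parentheses.  With an
	 * argument of 3 + 1 it expands to (3 + 1 % 2 == 0), which C parses
	 * as ((3 + (1 % 2)) == 0), so it reports that 4 is odd.
	 */
	#define is_even_bad(arg)	(arg % 2 == 0)

	/* Small self-check that a test program could call. */
	void
	check_is_even(void)
	{
		assert(is_even_fn(4));
		assert(is_even(3 + 1));
		assert(!is_even_bad(3 + 1));	/* wrong answer, by design */
	}

Either C<is_even_fn> or C<is_even> would serve as the workhorse behind the
short XSUB; the unparenthesized macro is shown only as a pitfall to avoid.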

=head2 More about XSUB arguments

With the completion of Example 4, we now have an easy way to simulate some
real-life libraries whose interfaces may not be the cleanest in the world.
We shall now continue with a discussion of the arguments passed to the
B<xsubpp> compiler.

When you specify arguments to routines in the .xs file, you are really
passing three pieces of information for each argument listed.  The first
piece is the order of that argument relative to the others (first, second,
etc).  The second is the type of argument, and consists of the type
declaration of the argument (e.g., int, char*, etc).  The third piece is
the calling convention for the argument in the call to the library function.

While Perl passes arguments to functions by reference,
C passes arguments by value; to implement a C function which modifies data
of one of the "arguments", the actual argument of this C function would be
a pointer to the data.  Thus two C functions with declarations

	int string_length(char *s);
	int upper_case_char(char *cp);

may have completely different semantics: the first one may inspect an array
of chars pointed to by C<s>, and the second one may immediately dereference C<cp>
and manipulate C<*cp> only (using the return value as, say, a success
indicator).  From Perl one would use these functions in
a completely different manner.
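
To make the contrast concrete, here is a sketch in plain C.  The function
bodies are assumptions made up for this illustration (the text above gives
only the declarations), and C<call_both> is a hypothetical caller.

	#include <assert.h>
	#include <ctype.h>
	#include <string.h>

	/* Inspects the whole NUL-terminated array that s points to. */
	int
	string_length(char *s)
	{
		assert(s != NULL);
		return (int)strlen(s);
	}

	/*
	 * Dereferences cp at once and touches *cp only.  Returns 1 if the
	 * character was changed to upper case, 0 otherwise.
	 */
	int
	upper_case_char(char *cp)
	{
		assert(cp != NULL);
		if (!islower((unsigned char)*cp))
			return 0;
		*cp = (char)toupper((unsigned char)*cp);
		return 1;
	}

	/* Hypothetical caller: same parameter type, different usage. */
	int
	call_both(void)
	{
		char text[] = "perl";
		char ch = 'x';
		int len = string_length(text);		/* pass the array */
		int changed = upper_case_char(&ch);	/* pass one char's address */

		return (len == 4 && changed == 1 && ch == 'X');
	}

The first call hands over a string; the second hands over the address of a
single character that the function is expected to modify.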

One conveys this info to B<xsubpp> by replacing C<*> before the
argument by C<&>.  C<&> means that the argument should be passed to a library
function by its address.  The above two functions may be XSUB-ified as

	int
	string_length(s)
		char *	s

	int
	upper_case_char(cp)
		char	&cp

For example, consider:

	int
	foo(a,b)
		char	&a
		char *	b

The first Perl argument to this function would be treated as a char and
assigned to the variable a, and its address would be passed into the function
foo. The second Perl argument would be treated as a string pointer and assigned
to the variable b. The I<value> of b would be passed into the function foo.
The actual call to the function foo that B<xsubpp> generates would look like
this:

	foo(&a, b);

B<xsubpp> will parse the following function argument lists identically:

	char	&a
	char&a
	char	& a

However, to help ease understanding, it is suggested that you place a "&"
next to the variable name (and away from the variable type), and place a
"*" near the variable type, but away from the variable name (as in the
declaration of foo above).  By doing so, it is easy to understand exactly
what will be passed to the C function; it will be whatever is in the "last
column".

You should take great pains to try to pass the function the type of variable
it wants, when possible.  It will save you a lot of trouble in the long run.

=head2 The Argument Stack

If you look at any of the C code generated by any of the examples except
example 1, you will notice a number of references to ST(n), where n is
usually 0.  "ST" is actually a macro that points to the n'th argument
on the argument stack.  ST(0) is thus the first argument on the stack and
therefore the first argument passed to the XSUB, ST(1) is the second
argument, and so on.

When you list the arguments to the XSUB in the .xs file, that tells B<xsubpp>
which argument corresponds to which position on the argument stack (i.e., the
first one listed is the first argument, and so on).  You invite disaster if you
do not list them in the same order as the function expects them.

The actual values on the argument stack are pointers to the values passed
in.  When an argument is listed as being an OUTPUT value, its corresponding
value on the stack (i.e., ST(0) if it was the first argument) is changed.
You can verify this by looking at the C code generated for Example 3.
The code for the round() XSUB routine contains lines that look like this:

	double  arg = (double)SvNV(ST(0));
	/* Round the contents of the variable arg */
	sv_setnv(ST(0), (double)arg);

The arg variable is initially set by taking the value from ST(0), then is
stored back into ST(0) at the end of the routine.
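
The read-modify-write pattern can be mimicked in plain C.  What follows is a
toy model only and not Perl's actual implementation: C<toy_sv>,
C<toy_stack>, C<TOY_ST> and C<toy_round> are hypothetical names, and the
real C<ST>, C<SvNV> and C<sv_setnv> do considerably more work.  The model
assumes the value fits in a C<long>, so that a cast can stand in for the
C<floor> and C<ceil> calls of Example 3.

	#include <assert.h>
	#include <stddef.h>

	/* Toy stand-in for a Perl scalar that holds a number. */
	typedef struct { double nv; } toy_sv;

	/* Toy argument stack: pointers to the caller's values. */
	static toy_sv *toy_stack[4];
	#define TOY_ST(n)	(toy_stack[(n)])

	/* Rounds the value behind TOY_ST(0) in place, half away from zero. */
	void
	toy_round(void)
	{
		double arg;

		assert(TOY_ST(0) != NULL);
		arg = TOY_ST(0)->nv;		/* like SvNV(ST(0)) */
		arg = (double)(long)(arg + (arg < 0.0 ? -0.5 : 0.5));
		TOY_ST(0)->nv = arg;		/* like sv_setnv(ST(0), arg) */
	}

Because the stack slot holds a pointer to the caller's value, the caller
sees the rounded number after C<toy_round> returns, which is the effect
that listing C<arg> in the OUTPUT: section has in the real XSUB.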

XSUBs are also allowed to return lists, not just scalars.  This must be
done by manipulating stack values ST(0), ST(1), etc, in a subtly
different way.  See L<perlxs> for details.

XSUBs are also allowed to avoid automatic conversion of Perl function arguments
to C function arguments.  See L<perlxs> for details.  Some people prefer
manual conversion by inspecting C<ST(i)> even in the cases when automatic
conversion will do, arguing that this makes the logic of an XSUB call clearer.
Compare with L<"Getting the fat out of XSUBs"> for a similar tradeoff of
a complete separation of "Perl glue" and "workhorse" parts of an XSUB.

While experts may argue about these idioms, a novice to Perl guts may
prefer a way which is as little Perl-guts-specific as possible, meaning
automatic conversion and automatic call generation, as in
L<"Getting the fat out of XSUBs">.  This approach has the additional
benefit of protecting the XSUB writer from future changes to the Perl API.

=head2 Extending your Extension

Sometimes you might want to provide some extra methods or subroutines
to assist in making the interface between Perl and your extension simpler
or easier to understand.  These routines should live in the .pm file.
Whether they are automatically loaded when the extension itself is loaded
or only loaded when called depends on where in the .pm file the subroutine
definition is placed.  You can also consult L<AutoLoader> for an alternate
way to store and load your extra subroutines.

=head2 Documenting your Extension

There is absolutely no excuse for not documenting your extension.
Documentation belongs in the .pm file.  This file will be fed to pod2man,
and the embedded documentation will be converted to the manpage format,
then placed in the blib directory.  It will be copied to Perl's
manpage directory when the extension is installed.

You may intersperse documentation and Perl code within the .pm file.
In fact, if you want to use method autoloading, you must do this,
as the comment inside the .pm file explains.

See L<perlpod> for more information about the pod format.

=head2 Installing your Extension

Once your extension is complete and passes all its tests, installing it
is quite simple: just run "make install".  You will either need
to have write permission into the directories where Perl is installed,
or ask your system administrator to run the make for you.

Alternately, you can specify the exact directory to place the extension's
files by placing a "PREFIX=/destination/directory" after the make install
(or in between the make and install if you have a brain-dead version of make).
This can be very useful if you are building an extension that will eventually
be distributed to multiple systems.  You can then just archive the files in
the destination directory and distribute them to your destination systems.

=head2 EXAMPLE 5

In this example, we'll do some more work with the argument stack.  The
previous examples have all returned only a single value.  We'll now
create an extension that returns an array.

This extension is very Unix-oriented (struct statfs and the statfs system
call).  If you are not running on a Unix system, you can substitute for
statfs any other function that returns multiple values, you can hard-code
values to be returned to the caller (although this will be a bit harder
to test the error case), or you can simply not do this example.  If you
change the XSUB, be sure to fix the test cases to match the changes.

Return to the Mytest directory and add the following code to the end of
Mytest.xs:

	void
	statfs(path)
		char *  path
	    INIT:
		int i;
		struct statfs buf;

	    PPCODE:
		i = statfs(path, &buf);
		if (i == 0) {
			XPUSHs(sv_2mortal(newSVnv(buf.f_bavail)));
			XPUSHs(sv_2mortal(newSVnv(buf.f_bfree)));
			XPUSHs(sv_2mortal(newSVnv(buf.f_blocks)));
			XPUSHs(sv_2mortal(newSVnv(buf.f_bsize)));
			XPUSHs(sv_2mortal(newSVnv(buf.f_ffree)));
			XPUSHs(sv_2mortal(newSVnv(buf.f_files)));
			XPUSHs(sv_2mortal(newSVnv(buf.f_type)));
		} else {
			XPUSHs(sv_2mortal(newSVnv(errno)));
		}

You'll also need to add the following code to the top of the .xs file, just
after the include of "XSUB.h":

	#include <sys/vfs.h>

Also add the following code segment to Mytest.t while incrementing the "9"
tests to "11":

	@a = &Mytest::statfs("/blech");
	ok( scalar(@a) == 1 && $a[0] == 2 );
	@a = &Mytest::statfs("/");
	is( scalar(@a), 7 );

=head2 New Things in this Example

This example added quite a few new concepts.  We'll take them one at a time.

=over 4

=item *

The INIT: directive contains code that will be placed immediately after
the argument stack is decoded.  C does not allow variable declarations at
arbitrary locations inside a function,
so this is usually the best way to declare local variables needed by the XSUB.
(Alternatively, one could put the whole C<PPCODE:> section into braces, and
put these declarations on top.)

=item *

This routine also returns a different number of arguments depending on the
success or failure of the call to statfs.  If there is an error, the error
number is returned as a single-element array.  If the call is successful,
then a 7-element array is returned.  Since only one argument is passed into
this function, we need room on the stack to hold the 7 values which may be
returned.

We do this by using the PPCODE: directive, rather than the CODE: directive.
This tells B<xsubpp> that we will be managing the return values that will be
put on the argument stack by ourselves.

=item *

When we want to place values to be returned to the caller onto the stack,
we use the series of macros that begin with "XPUSH".  There are five
different versions, for placing integers, unsigned integers, doubles,
strings, and Perl scalars on the stack.  In our example, we placed a
Perl scalar onto the stack.  (In fact this is the only macro which
can be used to return multiple values.)

The XPUSH* macros will automatically extend the return stack to prevent
it from being overrun.  You push values onto the stack in the order you
want them seen by the calling program.

=item *

The values pushed onto the return stack of the XSUB are actually mortal SV's.
They are made mortal so that once the values are copied by the calling
program, the SV's that held the returned values can be deallocated.
If they were not mortal, then they would continue to exist after the XSUB
routine returned, but would not be accessible.  This is a memory leak.

=item *

If we were interested in performance, not in code compactness, in the success
branch we would not use C<XPUSHs> macros, but C<PUSHs> macros, and would
pre-extend the stack before pushing the return values:

	EXTEND(SP, 7);

The tradeoff is that one needs to calculate the number of return values
in advance (though overextending the stack will not typically hurt
anything but memory consumption).

Similarly, in the failure branch we could use C<PUSHs> I<without> extending
the stack: the Perl function reference comes to an XSUB on the stack, thus
the stack is I<always> large enough to take one return value.

=back
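
The control flow of the XSUB in this example, seven values on success and a
single error number on failure, can be sketched in plain C.  This is an
illustration under assumptions: it presumes a Linux-like system where
C<sys/vfs.h> declares C<statfs>, and C<statfs_values> is a hypothetical
function that fills a caller-supplied array where the XSUB would push
values onto the stack.

	#include <assert.h>
	#include <errno.h>
	#include <stddef.h>
	#include <sys/vfs.h>

	/*
	 * Fills out[] (room for 7 doubles) and returns the number of values
	 * stored: 7 if statfs succeeded, 1 (the errno value) if it failed.
	 */
	int
	statfs_values(const char *path, double out[7])
	{
		struct statfs buf;

		assert(path != NULL && out != NULL);
		if (statfs(path, &buf) != 0) {
			out[0] = (double)errno;
			return 1;
		}
		out[0] = (double)buf.f_bavail;
		out[1] = (double)buf.f_bfree;
		out[2] = (double)buf.f_blocks;
		out[3] = (double)buf.f_bsize;
		out[4] = (double)buf.f_ffree;
		out[5] = (double)buf.f_files;
		out[6] = (double)buf.f_type;
		return 7;
	}

A caller checks the returned count first, just as the test script checks
C<scalar(@a)> before looking at the individual elements.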

=head2 EXAMPLE 6

In this example, we will accept a reference to an array as an input
parameter, and return a reference to an array of hashes.  This will
demonstrate manipulation of complex Perl data types from an XSUB.

This extension is somewhat contrived.  It is based on the code in
the previous example.  It calls the statfs function multiple times,
accepting a reference to an array of filenames as input, and returning
a reference to an array of hashes containing the data for each of the
filesystems.

Return to the Mytest directory and add the following code to the end of
Mytest.xs:

    SV *
    multi_statfs(paths)
	    SV * paths
	INIT:
	    AV * results;
	    SSize_t numpaths = 0, n;
	    int i;
	    struct statfs buf;

	    SvGETMAGIC(paths);
	    if ((!SvROK(paths))
		|| (SvTYPE(SvRV(paths)) != SVt_PVAV)
		|| ((numpaths = av_top_index((AV *)SvRV(paths))) < 0))
	    {
		XSRETURN_UNDEF;
	    }
	    results = (AV *)sv_2mortal((SV *)newAV());
	CODE:
	    for (n = 0; n <= numpaths; n++) {
		HV * rh;
		STRLEN l;
		char * fn = SvPV(*av_fetch((AV *)SvRV(paths), n, 0), l);

		i = statfs(fn, &buf);
		if (i != 0) {
		    av_push(results, newSVnv(errno));
		    continue;
		}

		rh = (HV *)sv_2mortal((SV *)newHV());

		hv_store(rh, "f_bavail", 8, newSVnv(buf.f_bavail), 0);
		hv_store(rh, "f_bfree",  7, newSVnv(buf.f_bfree),  0);
		hv_store(rh, "f_blocks", 8, newSVnv(buf.f_blocks), 0);
		hv_store(rh, "f_bsize",  7, newSVnv(buf.f_bsize),  0);
		hv_store(rh, "f_ffree",  7, newSVnv(buf.f_ffree),  0);
		hv_store(rh, "f_files",  7, newSVnv(buf.f_files),  0);
		hv_store(rh, "f_type",   6, newSVnv(buf.f_type),   0);

		av_push(results, newRV_inc((SV *)rh));
	    }
	    RETVAL = newRV_inc((SV *)results);
	OUTPUT:
	    RETVAL

And add the following code to Mytest.t, while incrementing the "11"
tests to "13":

	$results = Mytest::multi_statfs([ '/', '/blech' ]);
	ok( ref $results->[0] );
	ok( ! ref $results->[1] );

=head2 New Things in this Example

There are a number of new concepts introduced here, described below:

=over 4

=item *

This function does not use a typemap.  Instead, we declare it as accepting
one SV* (scalar) parameter, and returning an SV* value, and we take care of
populating these scalars within the code.  Because we are only returning
one value, we don't need a C<PPCODE:> directive - instead, we use C<CODE:>
and C<OUTPUT:> directives.

=item *

When dealing with references, it is important to handle them with caution.
The C<INIT:> block first calls SvGETMAGIC(paths), in case
paths is a tied variable.  Then it checks that C<SvROK> returns
true, which indicates that paths is a valid reference.  (Simply
checking C<SvROK> won't trigger FETCH on a tied variable.)  It
then verifies that the object referenced by paths is an array, using C<SvRV>
to dereference paths, and C<SvTYPE> to discover its type.  As an added test,
it checks that the array referenced by paths is non-empty, using the
C<av_top_index> function (which returns -1 if the array is empty). The
XSRETURN_UNDEF macro is used to abort the XSUB and return the undefined value
unless all three of these conditions are met.

=item *

We manipulate several arrays in this XSUB.  Note that an array is represented
internally by an AV* pointer.  The functions and macros for manipulating
arrays are similar to the functions in Perl: C<av_top_index> returns the
highest index in an AV*, much like $#array; C<av_fetch> fetches a single scalar
value from an array, given its index; C<av_push> pushes a scalar value onto the
end of the array, automatically extending the array as necessary.

Specifically, we read pathnames one at a time from the input array, and
store the results in an output array (results) in the same order.  If
statfs fails, the element pushed onto the return array is the value of
errno after the failure.  If statfs succeeds, though, the value pushed
onto the return array is a reference to a hash containing some of the
information in the statfs structure.

As with the return stack, it would be possible (and a small performance win)
to pre-extend the return array before pushing data into it, since we know
how many elements we will return:

	av_extend(results, numpaths);

=item *

We are performing only one hash operation in this function, which is storing
a new scalar under a key using C<hv_store>.  A hash is represented by an HV*
pointer.  Like arrays, the functions for manipulating hashes from an XSUB
mirror the functionality available from Perl.  See L<perlguts> and L<perlapi>
for details.

=item *

To create a reference, we use the C<newRV_inc> function.  Note that you can
cast an AV* or an HV* to type SV* in this case (and many others).  This
allows you to take references to arrays, hashes and scalars with the same
function.  Conversely, the C<SvRV> function always returns an SV*, which may
need to be cast to the appropriate type if it is something other than a
scalar (check with C<SvTYPE>).

=item *

At this point, xsubpp is doing very little work - the differences between
Mytest.xs and Mytest.c are minimal.

=back

=head2 EXAMPLE 7 (Coming Soon)

XPUSH args AND set RETVAL AND assign return value to array

=head2 EXAMPLE 8 (Coming Soon)

Setting $!

=head2 EXAMPLE 9 Passing open files to XSes

You would think passing files to an XS is difficult, with all the
typeglobs and stuff. Well, it isn't.

Suppose that for some strange reason we need a wrapper around the
standard C library function C<fputs()>. This is all we need:

	#define PERLIO_NOT_STDIO 0
	#define PERL_NO_GET_CONTEXT
	#include "EXTERN.h"
	#include "perl.h"
	#include "XSUB.h"

	#include <stdio.h>

	int
	fputs(s, stream)
		char *          s
		FILE *	        stream

The real work is done in the standard typemap.

B<But> you lose all the fine stuff done by the perlio layers. This
calls the stdio function C<fputs()>, which knows nothing about them.

The standard typemap offers three variants of PerlIO *:
C<InputStream> (T_IN), C<InOutStream> (T_INOUT) and C<OutputStream>
(T_OUT). A bare C<PerlIO *> is considered a T_INOUT. If it matters
in your code (see below for why it might) #define or typedef
one of the specific names and use that as the argument or result
type in your XS file.

The standard typemap does not contain PerlIO * before perl 5.7,
but it has the three stream variants. Using a PerlIO * directly
is not backwards compatible unless you provide your own typemap.

For streams coming I<from> perl the main difference is that
C<OutputStream> will get the output PerlIO * - which may make
a difference on a socket. Like in our example...

For streams being handed I<to> perl a new file handle is created
(i.e. a reference to a new glob) and associated with the PerlIO *
provided. If the read/write state of the PerlIO * is not correct then you
may get errors or warnings when the file handle is used.
So if you opened the PerlIO * as "w", it should really be an
C<OutputStream>; if opened as "r", it should be an C<InputStream>.

Now, suppose you want to use perlio layers in your XS. We'll use the
perlio C<PerlIO_puts()> function as an example.

In the C part of the XS file (above the first MODULE line) you
have

	#define OutputStream	PerlIO *
    or
	typedef PerlIO *	OutputStream;


And this is the XS code:

	int
	perlioputs(s, stream)
		char *          s
		OutputStream	stream
	CODE:
		RETVAL = PerlIO_puts(stream, s);
	OUTPUT:
		RETVAL

We have to use a C<CODE> section because C<PerlIO_puts()> has the arguments
reversed compared to C<fputs()>, and we want to keep the arguments the same.

To explore this thoroughly, suppose we want to use the stdio C<fputs()>
on a PerlIO *. This means we have to ask the perlio system for a stdio
C<FILE *>:

	int
	perliofputs(s, stream)
		char *          s
		OutputStream	stream
	PREINIT:
		FILE *fp = PerlIO_findFILE(stream);
	CODE:
		if (fp != (FILE*) 0) {
			RETVAL = fputs(s, fp);
		} else {
			RETVAL = -1;
		}
	OUTPUT:
		RETVAL

Note: C<PerlIO_findFILE()> will search the layers for a stdio
layer. If it can't find one, it will call C<PerlIO_exportFILE()> to
generate a new stdio C<FILE>. Please only call C<PerlIO_exportFILE()> if
you want a I<new> C<FILE>. It will generate one on each call and push a
new stdio layer. So don't call it repeatedly on the same
file. C<PerlIO_findFILE()> will retrieve the stdio layer once it has been
generated by C<PerlIO_exportFILE()>.

This applies to the perlio system only. For versions before 5.7,
C<PerlIO_exportFILE()> is equivalent to C<PerlIO_findFILE()>.

=head2 Troubleshooting these Examples

As mentioned at the top of this document, if you are having problems with
these example extensions, you might see if any of these help you.

=over 4

=item *

In versions of 5.002 prior to the gamma version, the test script in Example
1 will not function properly.  You need to change the "use lib" line to
read:

	use lib './blib';

=item *

In versions of 5.002 prior to version 5.002b1h, the test.pl file was not
automatically created by h2xs.  This means that you cannot say "make test"
to run the test script.  You will need to add the following line before the
"use extension" statement:

	use lib './blib';

=item *

In versions 5.000 and 5.001, instead of using the above line, you will need
to use the following line:

	BEGIN { unshift(@INC, "./blib") }

=item *

This document assumes that the executable named "perl" is Perl version 5.
Some systems may have installed Perl version 5 as "perl5".

=back

=head1 See also

For more information, consult L<perlguts>, L<perlapi>, L<perlxs>, L<perlmod>,
and L<perlpod>.

=head1 Author

Jeff Okamoto <F<okamoto@corp.hp.com>>

Reviewed and assisted by Dean Roehrich, Ilya Zakharevich, Andreas Koenig,
and Tim Bunce.

PerlIO material contributed by Lupe Christoph, with some clarification
by Nick Ing-Simmons.

Changes for h2xs as of Perl 5.8.x by Renee Baecker

=head2 Last Changed

2012-01-20

=head1 NAME
X<syntax>

perlsyn - Perl syntax

=head1 DESCRIPTION

A Perl program consists of a sequence of declarations and statements
which run from the top to the bottom.  Loops, subroutines, and other
control structures allow you to jump around within the code.

Perl is a B<free-form> language: you can format and indent it however
you like.  Whitespace serves mostly to separate tokens, unlike
languages like Python where it is an important part of the syntax,
or Fortran where it is immaterial.

Many of Perl's syntactic elements are B<optional>.  Rather than
requiring you to put parentheses around every function call and
declare every variable, you can often leave such explicit elements off
and Perl will figure out what you meant.  This is known as B<Do What I
Mean>, abbreviated B<DWIM>.  It allows programmers to be B<lazy> and to
code in a style with which they are comfortable.

Perl B<borrows syntax> and concepts from many languages: awk, sed, C,
Bourne Shell, Smalltalk, Lisp and even English.  Other
languages have borrowed syntax from Perl, particularly its regular
expression extensions.  So if you have programmed in another language
you will see familiar pieces in Perl.  They often work the same, but
see L<perltrap> for information about how they differ.

=head2 Declarations
X<declaration> X<undef> X<undefined> X<uninitialized>

The only things you need to declare in Perl are report formats and
subroutines (and sometimes not even subroutines).  A scalar variable holds
the undefined value (C<undef>) until it has been assigned a defined
value, which is anything other than C<undef>.  When used as a number,
C<undef> is treated as C<0>; when used as a string, it is treated as
the empty string, C<"">; and when used as a reference that isn't being
assigned to, it is treated as an error.  If you enable warnings,
you'll be notified of an uninitialized value whenever you treat
C<undef> as a string or a number.  Well, usually.  Boolean contexts,
such as:

    if ($a) {}

are exempt from warnings (because they care about truth rather than
definedness).  Operators such as C<++>, C<-->, C<+=>,
C<-=>, and C<.=>, that operate on undefined variables such as:

    undef $a;
    $a++;

are also always exempt from such warnings.
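
As a rough illustration of these rules, the following sketch collects
warnings in an array instead of printing them.  The variable names and the
C<$SIG{__WARN__}> handler are assumptions made only for this example.

    use strict;
    use warnings;

    my @warned;                          # illustrative name
    local $SIG{__WARN__} = sub { push @warned, $_[0] };

    my ($num, $str, $flag, $count, $text);

    my $sum    = $num + 1;               # undef as a number: warns
    my $joined = "<" . $str . ">";       # undef as a string: warns
    my $truth  = $flag ? "yes" : "no";   # boolean context: no warning
    $count++;                            # exempt: no warning
    $text .= "abc";                      # exempt: no warning

    printf "%d %s %s %d %s, %d warnings\n",
        $sum, $joined, $truth, $count, $text, scalar @warned;

Run as written, this should print C<< 1 <> no 1 abc, 2 warnings >>: only
the addition and the concatenation trigger the uninitialized-value warning.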

A declaration can be put anywhere a statement can, but has no effect on
the execution of the primary sequence of statements: declarations all
take effect at compile time.  All declarations are typically put at
the beginning or the end of the script.  However, if you're using
lexically-scoped private variables created with C<my()>,
C<state()>, or C<our()>, you'll have to make sure
your format or subroutine definition is within the same block scope
as the declaration if you expect to be able to access those private variables.

Declaring a subroutine allows a subroutine name to be used as if it were a
list operator from that point forward in the program.  You can declare a
subroutine without defining it by saying C<sub name>, thus:
X<subroutine, declaration>

    sub myname;
    $me = myname $0             or die "can't get myname";

A bare declaration like that declares the function to be a list operator,
not a unary operator, so you have to be careful to use parentheses (or
C<or> instead of C<||>).  The C<||> operator binds too tightly to use after
list operators; it becomes part of the last element.  You can always use
parentheses around the list operator's arguments to turn the list operator
back into something that behaves more like a function call.  Alternatively,
you can use the prototype C<($)> to turn the subroutine into a unary
operator:

  sub myname ($);
  $me = myname $0             || die "can't get myname";

That now parses as you'd expect, but you still ought to get in the habit of
using parentheses in that situation.  For more on prototypes, see
L<perlsub>.
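
The difference between the two declarations can be checked with a small
sketch.  The subroutine names C<as_list> and C<as_unary> are hypothetical
and exist only for this illustration.

    use strict;
    use warnings;

    sub as_list;            # bare declaration: a list operator
    sub as_unary ($);       # prototype ($): a unary operator

    # Parsed as as_list(0 || 5): the || becomes part of the argument.
    my $from_list  = (as_list 0 || 5);

    # Parsed as as_unary(0) || 5: the || applies to the result.
    my $from_unary = (as_unary 0 || 5);

    sub as_list      { return "args=@_" }
    sub as_unary ($) { return $_[0] }

    print "$from_list\n";   # args=5
    print "$from_unary\n";  # 5

In the first call the subroutine never sees the C<0>; in the second it
receives C<0>, returns it, and the C<||> then supplies the fallback.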

Subroutine declarations can also be loaded up with the C<require> statement
or both loaded and imported into your namespace with a C<use> statement.
See L<perlmod> for details on this.

A statement sequence may contain declarations of lexically-scoped
variables, but apart from declaring a variable name, the declaration acts
like an ordinary statement, and is elaborated within the sequence of
statements as if it were an ordinary statement.  That means it actually
has both compile-time and run-time effects.

=head2 Comments
X<comment> X<#>

Text from a C<"#"> character until the end of the line is a comment,
and is ignored.  Exceptions include C<"#"> inside a string or regular
expression.

=head2 Simple Statements
X<statement> X<semicolon> X<expression> X<;>

The only kind of simple statement is an expression evaluated for its
side-effects.  Every simple statement must be terminated with a
semicolon, unless it is the final statement in a block, in which case
the semicolon is optional.  But put the semicolon in anyway if the
block takes up more than one line, because you may eventually add
another line.  Note that there are operators like C<eval {}>, C<sub {}>, and
C<do {}> that I<look> like compound statements, but aren't--they're just
TERMs in an expression--and thus need an explicit termination when used
as the last item in a statement.

=head2 Truth and Falsehood
X<truth> X<falsehood> X<true> X<false> X<!> X<not> X<negation> X<0>

The number 0, the strings C<'0'> and C<"">, the empty list C<()>, and
C<undef> are all false in a boolean context.  All other values are true.
Negation of a true value by C<!> or C<not> returns a special false value.
When evaluated as a string it is treated as C<"">, but as a number, it
is treated as 0.  Most Perl operators
that return true or false behave this way.
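
A short sketch of these rules follows.  The sample values are chosen for
illustration; note that strings such as C<'0.0'> and C<'00'> are true even
though they are numerically zero, because only C<'0'> and C<""> are false
as strings.

    use strict;
    use warnings;

    # Each of these scalars is false.
    my @false = grep { !$_ } (0, '0', '', undef);

    # These strings are true, even though some are numerically zero.
    my @true  = grep { $_ } ('0.0', '00', '0E0', ' ', 'a');

    # The special false value returned by negating a true value.
    my $no = !1;

    printf "%d false, %d true, string '%s', number %d\n",
        scalar @false, scalar @true, $no, $no + 0;

This should print C<4 false, 5 true, string '', number 0>, and using C<$no>
as a number does not produce a warning.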

=head2 Statement Modifiers
X<statement modifier> X<modifier> X<if> X<unless> X<while>
X<until> X<when> X<foreach> X<for>

Any simple statement may optionally be followed by a I<SINGLE> modifier,
just before the terminating semicolon (or block ending).  The possible
modifiers are:

    if EXPR
    unless EXPR
    while EXPR
    until EXPR
    for LIST
    foreach LIST
    when EXPR

The C<EXPR> following the modifier is referred to as the "condition".
Its truth or falsehood determines how the modifier will behave.

C<if> executes the statement once I<if> and only if the condition is
true.  C<unless> is the opposite: it executes the statement I<unless>
the condition is true (that is, if the condition is false).

    print "Basset hounds got long ears" if length $ear >= 10;
    go_outside() and play() unless $is_raining;

The C<for(each)> modifier is an iterator: it executes the statement once
for each item in the LIST (with C<$_> aliased to each item in turn).

    print "Hello $_!\n" for qw(world Dolly nurse);

C<while> repeats the statement I<while> the condition is true.
C<until> does the opposite: it repeats the statement I<until> the
condition is true (or while the condition is false):

    # Both of these count from 0 to 10.
    print $i++ while $i <= 10;
    print $j++ until $j >  10;

The C<while> and C<until> modifiers have the usual "C<while> loop"
semantics (conditional evaluated first), except when applied to a
C<do>-BLOCK (or to the Perl4 C<do>-SUBROUTINE statement), in
which case the block executes once before the conditional is
evaluated.

This is so that you can write loops like:

    do {
        $line = <STDIN>;
        ...
    } until !defined($line) || $line eq ".\n"

See L<perlfunc/do>.  Note also that the loop control statements described
later will I<NOT> work in this construct, because modifiers don't take
loop labels.  Sorry.  You can always put another block inside of it
(for C<next>/C<redo>) or around it (for C<last>) to do that sort of thing.
X<next> X<last> X<redo>

For C<next> or C<redo>, just double the braces:

    do {{
        next if $x == $y;
        # do something here
    }} until $x++ > $z;

For C<last>, you have to be more elaborate and put braces around it:
X<last>

    {
        do {
            last if $x == $y**2;
            # do something here
        } while $x++ <= $z;
    }

If you need both C<next> and C<last>, you have to do both and also use a
loop label:

    LOOP: {
        do {{
            next if $x == $y;
            last LOOP if $x == $y**2;
            # do something here
        }} until $x++ > $z;
    }

B<NOTE:> The behaviour of a C<my>, C<state>, or
C<our> modified with a statement modifier conditional
or loop construct (for example, C<my $x if ...>) is
B<undefined>.  The value of the C<my> variable may be C<undef>, any
previously assigned value, or possibly anything else.  Don't rely on
it.  Future versions of perl might do something different from the
version of perl you try it out on.  Here be dragons.
X<my>

The C<when> modifier is an experimental feature that first appeared in Perl
5.14.  To use it, you should include a C<use v5.14> declaration.
(Technically, it requires only the C<switch> feature, but that aspect of it
was not available before 5.14.)  Operative only from within a C<foreach>
loop or a C<given> block, it executes the statement only if the smartmatch
C<< $_ ~~ I<EXPR> >> is true.  If the statement executes, it is followed by
a C<next> from inside a C<foreach> and C<break> from inside a C<given>.

Under the current implementation, the C<foreach> loop can be
anywhere within the C<when> modifier's dynamic scope, but must be
within the C<given> block's lexical scope.  This restriction may
be relaxed in a future release.  See L</"Switch Statements"> below.

=head2 Compound Statements
X<statement, compound> X<block> X<bracket, curly> X<curly bracket> X<brace>
X<{> X<}> X<if> X<unless> X<given> X<while> X<until> X<foreach> X<for> X<continue>

In Perl, a sequence of statements that defines a scope is called a block.
Sometimes a block is delimited by the file containing it (in the case
of a required file, or the program as a whole), and sometimes a block
is delimited by the extent of a string (in the case of an eval).

But generally, a block is delimited by curly brackets, also known as braces.
We will call this syntactic construct a BLOCK.

The following compound statements may be used to control flow:

    if (EXPR) BLOCK
    if (EXPR) BLOCK else BLOCK
    if (EXPR) BLOCK elsif (EXPR) BLOCK ...
    if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK

    unless (EXPR) BLOCK
    unless (EXPR) BLOCK else BLOCK
    unless (EXPR) BLOCK elsif (EXPR) BLOCK ...
    unless (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK

    given (EXPR) BLOCK

    LABEL while (EXPR) BLOCK
    LABEL while (EXPR) BLOCK continue BLOCK

    LABEL until (EXPR) BLOCK
    LABEL until (EXPR) BLOCK continue BLOCK

    LABEL for (EXPR; EXPR; EXPR) BLOCK
    LABEL for VAR (LIST) BLOCK
    LABEL for VAR (LIST) BLOCK continue BLOCK

    LABEL foreach (EXPR; EXPR; EXPR) BLOCK
    LABEL foreach VAR (LIST) BLOCK
    LABEL foreach VAR (LIST) BLOCK continue BLOCK

    LABEL BLOCK
    LABEL BLOCK continue BLOCK

    PHASE BLOCK

The experimental C<given> statement is I<not automatically enabled>; see
L</"Switch Statements"> below for how to do so, and the attendant caveats.

Unlike in C and Pascal, in Perl these are all defined in terms of BLOCKs,
not statements.  This means that the curly brackets are I<required>--no
dangling statements allowed.  If you want to write conditionals without
curly brackets, there are several other ways to do it.  The following
all do the same thing:

    if (!open(FOO)) { die "Can't open $FOO: $!" }
    die "Can't open $FOO: $!" unless open(FOO);
    open(FOO)  || die "Can't open $FOO: $!";
    open(FOO) ? () : die "Can't open $FOO: $!";
        # a bit exotic, that last one

The C<if> statement is straightforward.  Because BLOCKs are always
bounded by curly brackets, there is never any ambiguity about which
C<if> an C<else> goes with.  If you use C<unless> in place of C<if>,
the sense of the test is reversed.  Like C<if>, C<unless> can be followed
by C<else>.  C<unless> can even be followed by one or more C<elsif>
statements, though you may want to think twice before using that particular
language construct, as everyone reading your code will have to think at least
twice before they can understand what's going on.

The C<while> statement executes the block as long as the expression is
L<true|/"Truth and Falsehood">.
The C<until> statement executes the block as long as the expression is
false.
The LABEL is optional, and if present, consists of an identifier followed
by a colon.  The LABEL identifies the loop for the loop control
statements C<next>, C<last>, and C<redo>.
If the LABEL is omitted, the loop control statement
refers to the innermost enclosing loop.  This may include dynamically
looking back your call-stack at run time to find the LABEL.  Such
desperate behavior triggers a warning if you use the C<use warnings>
pragma or the B<-w> flag.

If there is a C<continue> BLOCK, it is always executed just before the
conditional is about to be evaluated again.  Thus it can be used to
increment a loop variable, even when the loop has been continued via
the C<next> statement.
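
As a minimal sketch of this behaviour (the variable names are illustrative
only), the loop below skips odd numbers with C<next>, yet the counter is
still incremented because the C<continue> block runs on every iteration:

    use strict;
    use warnings;

    my $i = 0;
    my @even;
    while ($i < 6) {
        next if $i % 2;     # skips the push, but not the continue block
        push @even, $i;
    } continue {
        $i++;               # runs after every iteration, including "next"
    }

    print "@even\n";        # 0 2 4

Had the increment been placed at the end of the loop body instead, the
first C<next> would have skipped it and the loop would never terminate.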

When a block is preceded by a compilation phase keyword such as C<BEGIN>,
C<END>, C<INIT>, C<CHECK>, or C<UNITCHECK>, then the block will run only
during the corresponding phase of execution.  See L<perlmod> for more details.

Extension modules can also hook into the Perl parser to define new
kinds of compound statements.  These are introduced by a keyword which
the extension recognizes, and the syntax following the keyword is
defined entirely by the extension.  If you are an implementor, see
L<perlapi/PL_keyword_plugin> for the mechanism.  If you are using such
a module, see the module's documentation for details of the syntax that
it defines.

=head2 Loop Control
X<loop control> X<loop, control> X<next> X<last> X<redo> X<continue>

The C<next> command starts the next iteration of the loop:

    LINE: while (<STDIN>) {
        next LINE if /^#/;      # discard comments
        ...
    }

The C<last> command immediately exits the loop in question.  The
C<continue> block, if any, is not executed:

    LINE: while (<STDIN>) {
        last LINE if /^$/;      # exit when done with header
        ...
    }

The C<redo> command restarts the loop block without evaluating the
conditional again.  The C<continue> block, if any, is I<not> executed.
This command is normally used by programs that want to lie to themselves
about what was just input.

For example, when processing a file like F</etc/termcap>, if your input
lines might end in backslashes to indicate continuation, you
want to skip ahead and get the next record.

    while (<>) {
        chomp;
        if (s/\\$//) {
            $_ .= <>;
            redo unless eof();
        }
        # now process $_
    }

which is Perl shorthand for the more explicitly written version:

    LINE: while (defined($line = <ARGV>)) {
        chomp($line);
        if ($line =~ s/\\$//) {
            $line .= <ARGV>;
            redo LINE unless eof(); # not eof(ARGV)!
        }
        # now process $line
    }

Note that if there were a C<continue> block on the above code, it would
get executed only on lines discarded by the regex (since redo skips the
continue block).  A continue block is often used to reset line counters
or C<m?pat?> one-time matches:

    # inspired by :1,$g/fred/s//WILMA/
    while (<>) {
        m?(fred)?    && s//WILMA $1 WILMA/;
        m?(barney)?  && s//BETTY $1 BETTY/;
        m?(homer)?   && s//MARGE $1 MARGE/;
    } continue {
        print "$ARGV $.: $_";
        close ARGV  if eof;             # reset $.
        reset       if eof;             # reset ?pat?
    }

If the word C<while> is replaced by the word C<until>, the sense of the
test is reversed, but the conditional is still tested before the first
iteration.

Loop control statements don't work in an C<if> or C<unless>, since
they aren't loops.  You can double the braces to make them such, though.

    if (/pattern/) {{
        last if /fred/;
        next if /barney/; # same effect as "last",
                          # but doesn't document as well
        # do something here
    }}

This works because a block by itself acts as a loop that
executes once; see L</"Basic BLOCKs">.

The form C<while/if BLOCK BLOCK>, available in Perl 4, is no longer
available.   Replace any occurrence of C<if BLOCK> by C<if (do BLOCK)>.

=head2 For Loops
X<for> X<foreach>

Perl's C-style C<for> loop works like the corresponding C<while> loop;
that means that this:

    for ($i = 1; $i < 10; $i++) {
        ...
    }

is the same as this:

    $i = 1;
    while ($i < 10) {
        ...
    } continue {
        $i++;
    }

There is one minor difference: if variables are declared with C<my>
in the initialization section of the C<for>, the lexical scope of
those variables is exactly the C<for> loop (the body of the loop
and the control sections).
X<my>
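
The scoping difference can be seen in a small sketch, where an outer
variable that happens to share the loop variable's name is left untouched
(the names and values are illustrative only):

    use strict;
    use warnings;

    my $i = "outer";
    my @seen;
    for (my $i = 0; $i < 3; $i++) {   # a new $i, private to the loop
        push @seen, $i;
    }

    print "@seen / $i\n";             # 0 1 2 / outer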

As a special case, if the test in the C<for> loop (or the corresponding
C<while> loop) is empty, it is treated as true.  That is, both

    for (;;) {
        ...
    }

and

    while () {
        ...
    }

are treated as infinite loops.

Besides the normal array index looping, C<for> can lend itself
to many other interesting applications.  Here's one that avoids the
problem you get into if you explicitly test for end-of-file on
an interactive file descriptor causing your program to appear to
hang.
X<eof> X<end-of-file> X<end of file>

    $on_a_tty = -t STDIN && -t STDOUT;
    sub prompt { print "yes? " if $on_a_tty }
    for ( prompt(); <STDIN>; prompt() ) {
        # do something
    }

Using C<readline> (or the operator form, C<< <EXPR> >>) as the
conditional of a C<for> loop is shorthand for the following.  This
behaviour is the same as a C<while> loop conditional.
X<readline> X<< <> >>

    for ( prompt(); defined( $_ = <STDIN> ); prompt() ) {
        # do something
    }

=head2 Foreach Loops
X<for> X<foreach>

The C<foreach> loop iterates over a normal list value and sets the scalar
variable VAR to be each element of the list in turn.  If the variable
is preceded with the keyword C<my>, then it is lexically scoped, and
is therefore visible only within the loop.  Otherwise, the variable is
implicitly local to the loop and regains its former value upon exiting
the loop.  If the variable was previously declared with C<my>, it uses
that variable instead of the global one, but it's still localized to
the loop.  This implicit localization occurs I<only> in a C<foreach>
loop.
X<my> X<local>

The C<foreach> keyword is actually a synonym for the C<for> keyword, so
you can use either.  If VAR is omitted, C<$_> is set to each value.
X<$_>

If any element of LIST is an lvalue, you can modify it by modifying
VAR inside the loop.  Conversely, if any element of LIST is NOT an
lvalue, any attempt to modify that element will fail.  In other words,
the C<foreach> loop index variable is an implicit alias for each item
in the list that you're looping over.
X<alias>
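
The aliasing and localization rules can be sketched as follows.  The
variable names are assumptions made for this example; the C<eval> is there
only to capture the failure that results from modifying a non-lvalue
element.

    use strict;
    use warnings;

    my @n = (1, 2, 3);
    for my $elem (@n) { $elem *= 10 }    # modifies @n through the alias

    our $g = "before";
    for $g (1 .. 3) { }                  # $g is localized to the loop

    # A literal constant is not an lvalue, so modifying it fails.
    my $ok  = eval { for my $c ("x", "y") { $c .= "!" } 1 };
    my $err = $ok ? "" : $@;

    print "@n / $g\n";                   # 10 20 30 / before
    print "failed: $err" unless $ok;

The second line of output should report an attempt to modify a read-only
value.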

If any part of LIST is an array, C<foreach> will get very confused if
you add or remove elements within the loop body, for example with
C<splice>.   So don't do that.
X<splice>

C<foreach> probably won't do what you expect if VAR is a tied or other
special variable.   Don't do that either.

As of Perl 5.22, there is an experimental variant of this loop that accepts
a variable preceded by a backslash for VAR, in which case the items in the
LIST must be references.  The backslashed variable will become an alias
to each referenced item in the LIST, which must be of the correct type.
The variable needn't be a scalar in this case, and the backslash may be
followed by C<my>.  To use this form, you must enable the C<refaliasing>
feature via C<use feature>.  (See L<feature>.  See also L<perlref/Assigning
to References>.)

Examples:

    for (@ary) { s/foo/bar/ }

    for my $elem (@elements) {
        $elem *= 2;
    }

    for $count (reverse(1..10), "BOOM") {
        print $count, "\n";
        sleep(1);
    }

    for (1..15) { print "Merry Christmas\n"; }

    foreach $item (split(/:[\\\n:]*/, $ENV{TERMCAP})) {
        print "Item: $item\n";
    }

    use feature "refaliasing";
    no warnings "experimental::refaliasing";
    foreach \my %hash (@array_of_hash_references) {
        # do something with each %hash
    }

Here's how a C programmer might code up a particular algorithm in Perl:

    for (my $i = 0; $i < @ary1; $i++) {
        for (my $j = 0; $j < @ary2; $j++) {
            if ($ary1[$i] > $ary2[$j]) {
                last; # can't go to outer :-(
            }
            $ary1[$i] += $ary2[$j];
        }
        # this is where that last takes me
    }

Whereas here's how a Perl programmer more comfortable with the idiom might
do it:

    OUTER: for my $wid (@ary1) {
    INNER:   for my $jet (@ary2) {
                next OUTER if $wid > $jet;
                $wid += $jet;
             }
          }

See how much easier this is?  It's cleaner, safer, and faster.  It's
cleaner because it's less noisy.  It's safer because if code gets added
between the inner and outer loops later on, the new code won't be
accidentally executed.  The C<next> explicitly iterates the other loop
rather than merely terminating the inner one.  And it's faster because
Perl executes a C<foreach> statement more rapidly than it would the
equivalent C<for> loop.

Perceptive Perl hackers may have noticed that a C<for> loop has a return
value, and that this value can be captured by wrapping the loop in a C<do>
block.  The reward for this discovery is this cautionary advice:  The
return value of a C<for> loop is unspecified and may change without notice.
Do not rely on it.

=head2 Basic BLOCKs
X<block>

A BLOCK by itself (labeled or not) is semantically equivalent to a
loop that executes once.  Thus you can use any of the loop control
statements in it to leave or restart the block.  (Note that this is
I<NOT> true in C<eval{}>, C<sub{}>, or contrary to popular belief
C<do{}> blocks, which do I<NOT> count as loops.)  The C<continue>
block is optional.
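
As a brief sketch (with illustrative variable names), C<redo> and C<last>
behave in a bare block just as they would in a loop that runs once:

    use strict;
    use warnings;

    my $tries = 0;
    {
        $tries++;
        redo if $tries < 3;    # restart the block
    }

    my @steps;
    {
        push @steps, "start";
        last;                  # leave the block early
        push @steps, "never reached";
    }

    print "$tries @steps\n";   # 3 start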

The BLOCK construct can be used to emulate case structures.

    SWITCH: {
        if (/^abc/) { $abc = 1; last SWITCH; }
        if (/^def/) { $def = 1; last SWITCH; }
        if (/^xyz/) { $xyz = 1; last SWITCH; }
        $nothing = 1;
    }

You'll also find the C<foreach> loop used to create a topicalizer
and a switch:

    SWITCH:
    for ($var) {
        if (/^abc/) { $abc = 1; last SWITCH; }
        if (/^def/) { $def = 1; last SWITCH; }
        if (/^xyz/) { $xyz = 1; last SWITCH; }
        $nothing = 1;
    }

Such constructs are quite frequently used, both because older versions of
Perl had no official C<switch> statement, and also because the new version
described immediately below remains experimental and can sometimes be confusing.

=head2 Switch Statements

X<switch> X<case> X<given> X<when> X<default>

Starting from Perl 5.10.1 (well, 5.10.0, but it didn't work
right), you can say

    use feature "switch";

to enable an experimental switch feature.  This is loosely based on an
old version of a Perl 6 proposal, but it no longer resembles the Perl 6
construct.   You also get the switch feature whenever you declare that your
code prefers to run under a version of Perl that is 5.10 or later.  For
example:

    use v5.14;

Under the "switch" feature, Perl gains the experimental keywords
C<given>, C<when>, C<default>, C<continue>, and C<break>.
Starting from Perl 5.16, one can prefix the switch
keywords with C<CORE::> to access the feature without a C<use feature>
statement.  The keywords C<given> and
C<when> are analogous to C<switch> and
C<case> in other languages -- though C<continue> is not -- so the code
in the previous section could be rewritten as

    use v5.10.1;
    for ($var) {
        when (/^abc/) { $abc = 1 }
        when (/^def/) { $def = 1 }
        when (/^xyz/) { $xyz = 1 }
        default       { $nothing = 1 }
    }

The C<foreach> is the non-experimental way to set a topicalizer.
If you wish to use the highly experimental C<given>, that could be
written like this:

    use v5.10.1;
    given ($var) {
        when (/^abc/) { $abc = 1 }
        when (/^def/) { $def = 1 }
        when (/^xyz/) { $xyz = 1 }
        default       { $nothing = 1 }
    }

As of 5.14, that can also be written this way:

    use v5.14;
    for ($var) {
        $abc = 1 when /^abc/;
        $def = 1 when /^def/;
        $xyz = 1 when /^xyz/;
        default { $nothing = 1 }
    }

Or if you don't care to play it safe, like this:

    use v5.14;
    given ($var) {
        $abc = 1 when /^abc/;
        $def = 1 when /^def/;
        $xyz = 1 when /^xyz/;
        default { $nothing = 1 }
    }

The arguments to C<given> and C<when> are in scalar context,
and C<given> assigns the C<$_> variable its topic value.

Exactly what the I<EXPR> argument to C<when> does is hard to describe
precisely, but in general, it tries to guess what you want done.  Sometimes
it is interpreted as C<< $_ ~~ I<EXPR> >>, and sometimes it is not.  It
also behaves differently when lexically enclosed by a C<given> block than
it does when dynamically enclosed by a C<foreach> loop.  The rules are far
too difficult to understand to be described here.  See L</"Experimental Details
on given and when"> later on.

Due to an unfortunate bug in how C<given> was implemented between Perl 5.10
and 5.16, under those implementations the version of C<$_> governed by
C<given> is merely a lexically scoped copy of the original, not a
dynamically scoped alias to the original, as it would be if it were a
C<foreach> or under both the original and the current Perl 6 language
specification.  This bug was fixed in Perl 5.18 (and lexicalized C<$_> itself
was removed in Perl 5.24).

If your code still needs to run on older versions,
stick to C<foreach> for your topicalizer and
you will be less unhappy.

=head2 Goto
X<goto>

Although not for the faint of heart, Perl does support a C<goto>
statement.  There are three forms: C<goto>-LABEL, C<goto>-EXPR, and
C<goto>-&NAME.  A loop's LABEL is not actually a valid target for
a C<goto>; it's just the name of the loop.

The C<goto>-LABEL form finds the statement labeled with LABEL and resumes
execution there.  It may not be used to go into any construct that
requires initialization, such as a subroutine or a C<foreach> loop.  It
also can't be used to go into a construct that is optimized away.  It
can be used to go almost anywhere else within the dynamic scope,
including out of subroutines, but it's usually better to use some other
construct such as C<last> or C<die>.  The author of Perl has never felt the
need to use this form of C<goto> (in Perl, that is--C is another matter).

The C<goto>-EXPR form expects a label name, whose scope will be resolved
dynamically.  This allows for computed C<goto>s per FORTRAN, but isn't
necessarily recommended if you're optimizing for maintainability:

    goto(("FOO", "BAR", "GLARCH")[$i]);
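
As a minimal sketch of the computed form, the hypothetical C<dispatch()>
subroutine below (its name and return values are assumptions made for
illustration) jumps to one of three labels chosen by an index:

    use strict;
    use warnings;

    # Hypothetical helper: pick a label by index and jump to it.
    sub dispatch {
        my ($i) = @_;
        goto(("FOO", "BAR", "GLARCH")[$i]);
        FOO:    return "foo";
        BAR:    return "bar";
        GLARCH: return "glarch";
    }

    print dispatch(1), "\n";    # prints "bar"

An index outside the list yields no label, and the C<goto> then dies at
run time, so validating the index first is probably wise.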

The C<goto>-&NAME form is highly magical, and substitutes a call to the
named subroutine for the currently running subroutine.  This is used by
C<AUTOLOAD()> subroutines that wish to load another subroutine and then
pretend that the other subroutine had been called in the first place
(except that any modifications to C<@_> in the current subroutine are
propagated to the other subroutine.)  After the C<goto>, not even C<caller()>
will be able to tell that this routine was called first.
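
A minimal sketch of that C<AUTOLOAD> pattern follows.  The package name
C<Lazy> and the body of the generated subroutine are assumptions made
for illustration; real code would more likely load the subroutine from
elsewhere than generate a stub:

    use strict;
    use warnings;

    {
        package Lazy;
        our $AUTOLOAD;

        sub AUTOLOAD {
            my $name = $AUTOLOAD;           # e.g. "Lazy::hello"
            $name =~ s/.*:://;              # strip the package prefix
            return if $name eq 'DESTROY';   # never autoload destructors

            # Install a stub under the requested name ...
            no strict 'refs';
            *{$AUTOLOAD} = sub { "called $name with (@_)" };

            # ... and replace this call frame with a call to it,
            # passing the current @_ along.
            goto &{$AUTOLOAD};
        }
    }

    print Lazy::hello("x", "y"), "\n";   # prints "called hello with (x y)"

Once the stub is installed, later calls to C<Lazy::hello()> reach it
directly without going through C<AUTOLOAD> again.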

In almost all cases like this, it's usually a far, far better idea to use the
structured control flow mechanisms of C<next>, C<last>, or C<redo> instead of
resorting to a C<goto>.  For certain applications, the catch and throw pair of
C<eval{}> and C<die()> for exception processing can also be a prudent approach.

=head2 The Ellipsis Statement
X<...>
X<... statement>
X<ellipsis operator>
X<elliptical statement>
X<unimplemented statement>
X<unimplemented operator>
X<yada-yada>
X<yada-yada operator>
X<... operator>
X<whatever operator>
X<triple-dot operator>

Beginning in Perl 5.12, Perl accepts an ellipsis, "C<...>", as a
placeholder for code that you haven't implemented yet.  This form of
ellipsis, the unimplemented statement, should not be confused with the
binary flip-flop C<...> operator.  One is a statement and the other an
operator.  (Perl doesn't usually confuse them because usually Perl can tell
whether it wants an operator or a statement, but see below for exceptions.)

When Perl 5.12 or later encounters an ellipsis statement, it parses this
without error, but if and when you should actually try to execute it, Perl
throws an exception with the text C<Unimplemented>:

    use v5.12;
    sub unimplemented { ... }
    eval { unimplemented() };
    if ($@ =~ /^Unimplemented at /) {
        say "I found an ellipsis!";
    }

You can only use the elliptical statement to stand in for a
complete statement.  These examples show how the ellipsis works:

    use v5.12;
    { ... }
    sub foo { ... }
    ...;
    eval { ... };
    sub somemeth {
        my $self = shift;
        ...;
    }
    $x = do {
        my $n;
        ...;
        say "Hurrah!";
        $n;
    };

The elliptical statement cannot stand in for an expression that
is part of a larger statement, since the C<...> is also the three-dot
version of the flip-flop operator (see L<perlop/"Range Operators">).

These examples of attempts to use an ellipsis are syntax errors:

    use v5.12;

    print ...;
    open(my $fh, ">", "/dev/passwd") or ...;
    if ($condition && ... ) { say "Howdy" };

There are some cases where Perl can't immediately tell the difference
between an expression and a statement.  For instance, the syntax for a
block and an anonymous hash reference constructor look the same unless
there's something in the braces to give Perl a hint.  The ellipsis is a
syntax error if Perl doesn't guess that the C<{ ... }> is a block.  In that
case, it doesn't think the C<...> is an ellipsis because it's expecting an
expression instead of a statement:

    @transformed = map { ... } @input;    # syntax error

Inside your block, you can use a C<;> before the ellipsis to denote that the
C<{ ... }> is a block and not a hash reference constructor.  Now the ellipsis
works:

    @transformed = map {; ... } @input;   # ';' disambiguates

Note: Some folks colloquially refer to this bit of punctuation as a
"yada-yada" or "triple-dot", but its true name
is actually an ellipsis.

=head2 PODs: Embedded Documentation
X<POD> X<documentation>

Perl has a mechanism for intermixing documentation with source code.
While it's expecting the beginning of a new statement, if the compiler
encounters a line that begins with an equal sign and a word, like this

    =head1 Here There Be Pods!

then that text and all remaining text up through and including a line
beginning with C<=cut> will be ignored.  The format of the intervening
text is described in L<perlpod>.

This allows you to intermix your source code
and your documentation text freely, as in

    =item snazzle($)

    The snazzle() function will behave in the most spectacular
    form that you can possibly imagine, not even excepting
    cybernetic pyrotechnics.

    =cut back to the compiler, nuff of this pod stuff!

    sub snazzle($) {
        my $thingie = shift;
        .........
    }

Note that pod translators should look at only paragraphs beginning
with a pod directive (it makes parsing easier), whereas the compiler
actually knows to look for pod escapes even in the middle of a
paragraph.  This means that the following secret stuff will be
ignored by both the compiler and the translators.

    $a=3;
    =secret stuff
     warn "Neither POD nor CODE!?"
    =cut back
    print "got $a\n";

You probably shouldn't rely upon the C<warn()> being podded out forever.
Not all pod translators are well-behaved in this regard, and perhaps
the compiler will become pickier.

One may also use pod directives to quickly comment out a section
of code.
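
As a small sketch of that trick (the variable name and values are
illustrative assumptions), the statement between the pod directives
below is never compiled:

    my $total = 1;

    =begin comment

    $total += 100;    # never compiled: this is pod to the compiler

    =end comment

    =cut

    $total += 2;
    print "$total\n";    # prints 3

The C<=begin comment> and C<=end comment> pair keeps pod formatters from
rendering the disabled code, and the C<=cut> hands control back to the
compiler.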

=head2 Plain Old Comments (Not!)
X<comment> X<line> X<#> X<preprocessor> X<eval>

Perl can process line directives, much like the C preprocessor.  Using
this, one can control Perl's idea of filenames and line numbers in
error or warning messages (especially for strings that are processed
with C<eval()>).  The syntax for this mechanism is almost the same as for
most C preprocessors: it matches the regular expression

    # example: '# line 42 "new_filename.plx"'
    /^\#   \s*
      line \s+ (\d+)   \s*
      (?:\s("?)([^"]+)\g2)? \s*
     $/x

with C<$1> being the line number for the next line, and C<$3> being
the optional filename (specified with or without quotes).  Note that
no whitespace may precede the C<< # >>, unlike modern C preprocessors.

There is a fairly obvious gotcha included with the line directive:
Debuggers and profilers will only show the last source line to appear
at a particular line number in a given file.  Care should be taken not
to cause line number collisions in code you'd like to debug later.

Here are some examples that you should be able to type into your command
shell:

    % perl
    # line 200 "bzzzt"
    # the '#' on the previous line must be the first char on line
    die 'foo';
    __END__
    foo at bzzzt line 201.

    % perl
    # line 200 "bzzzt"
    eval qq[\n#line 2001 ""\ndie 'foo']; print $@;
    __END__
    foo at - line 2001.

    % perl
    eval qq[\n#line 200 "foo bar"\ndie 'foo']; print $@;
    __END__
    foo at foo bar line 200.

    % perl
    # line 345 "goop"
    eval "\n#line " . __LINE__ . ' "' . __FILE__ ."\"\ndie 'foo'";
    print $@;
    __END__
    foo at goop line 345.
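
One plausible use of the directive is to label generated code so that
errors point at the original source rather than at the C<eval>.  In this
sketch the filename C<template.tt> and the failing statement are made-up
placeholders:

    use strict;
    use warnings;

    # The directive must start at the beginning of a line of the
    # evaluated string; it sets the number of the line that follows.
    my $generated = qq{#line 10 "template.tt"\n}
                  . q{die "bad value"};

    eval $generated;
    print $@;    # prints: bad value at template.tt line 10.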

=head2 Experimental Details on given and when

As previously mentioned, the "switch" feature is considered highly
experimental; it is subject to change with little notice.  In particular,
C<when> has tricky behaviours that are expected to change to become less
tricky in the future.  Do not rely upon its current (mis)implementation.
Before Perl 5.18, C<given> also had tricky behaviours that you should still
beware of if your code must run on older versions of Perl.

Here is a longer example of C<given>:

    use feature ":5.10";
    given ($foo) {
        when (undef) {
            say '$foo is undefined';
        }
        when ("foo") {
            say '$foo is the string "foo"';
        }
        when ([1,3,5,7,9]) {
            say '$foo is an odd digit';
            continue; # Fall through
        }
        when ($_ < 100) {
            say '$foo is numerically less than 100';
        }
        when (\&complicated_check) {
            say 'a complicated check for $foo is true';
        }
        default {
            die q(I don't know what to do with $foo);
        }
    }

Before Perl 5.18, C<given(EXPR)> assigned the value of I<EXPR> to
merely a lexically scoped I<B<copy>> (!) of C<$_>, not a dynamically
scoped alias the way C<foreach> does.  That made it similar to

        do { my $_ = EXPR; ... }

except that the block was automatically broken out of by a successful
C<when> or an explicit C<break>.  Because it was only a copy, and because
it was only lexically scoped, not dynamically scoped, you could not do the
things with it that you are used to in a C<foreach> loop.  In particular,
it did not work for arbitrary function calls if those functions might try
to access C<$_>.  Best stick to C<foreach> for that.

Most of the power comes from the implicit smartmatching that can
sometimes apply.  Most of the time, C<when(EXPR)> is treated as an
implicit smartmatch of C<$_>, that is, C<$_ ~~ EXPR>.  (See
L<perlop/"Smartmatch Operator"> for more information on smartmatching.)
But when I<EXPR> is one of the 10 exceptional cases (or things like them)
listed below, it is used directly as a boolean.

=over 4

=item Z<>1.

A user-defined subroutine call or a method invocation.

=item Z<>2.

A regular expression match in the form of C</REGEX/>, C<$foo =~ /REGEX/>,
or C<$foo =~ EXPR>.  Also, a negated regular expression match in
the form C<!/REGEX/>, C<$foo !~ /REGEX/>, or C<$foo !~ EXPR>.

=item Z<>3.

A smart match that uses an explicit C<~~> operator, such as C<EXPR ~~ EXPR>.

B<NOTE:> You will often have to use C<$c ~~ $_> because the default case
uses C<$_ ~~ $c>, which is frequently the opposite of what you want.

=item Z<>4.

A boolean comparison operator such as C<$_ E<lt> 10> or C<$x eq "abc">.  The
relational operators that this applies to are the six numeric comparisons
(C<< < >>, C<< > >>, C<< <= >>, C<< >= >>, C<< == >>, and C<< != >>), and
the six string comparisons (C<lt>, C<gt>, C<le>, C<ge>, C<eq>, and C<ne>).

=item Z<>5.

At least the three builtin functions C<defined(...)>, C<exists(...)>, and
C<eof(...)>.  We might someday add more of these later if we think of them.

=item Z<>6.

A negated expression, whether C<!(EXPR)> or C<not(EXPR)>, or a logical
exclusive-or, C<(EXPR1) xor (EXPR2)>.  The bitwise versions (C<~> and C<^>)
are not included.

=item Z<>7.

A filetest operator, with exactly 4 exceptions: C<-s>, C<-M>, C<-A>, and
C<-C>, as these return numerical values, not boolean ones.  The C<-z>
filetest operator is not included in the exception list.

=item Z<>8.

The C<..> and C<...> flip-flop operators.  Note that the C<...> flip-flop
operator is completely different from the C<...> elliptical statement
just described.

=back

In those 8 cases above, the value of EXPR is used directly as a boolean, so
no smartmatching is done.  You may think of C<when> as a smartsmartmatch.

Furthermore, Perl inspects the operands of logical operators to
decide whether to use smartmatching for each one by applying the
above test to the operands:

=over 4

=item Z<>9.

If EXPR is C<EXPR1 && EXPR2> or C<EXPR1 and EXPR2>, the test is applied
I<recursively> to both EXPR1 and EXPR2.
Only if I<both> operands also pass the
test, I<recursively>, will the expression be treated as boolean.  Otherwise,
smartmatching is used.

=item Z<>10.

If EXPR is C<EXPR1 || EXPR2>, C<EXPR1 // EXPR2>, or C<EXPR1 or EXPR2>, the
test is applied I<recursively> to EXPR1 only (which might itself be a
higher-precedence AND operator, for example, and thus subject to the
previous rule), not to EXPR2.  If EXPR1 is to use smartmatching, then EXPR2
also does so, no matter what EXPR2 contains.  But if EXPR1 does not get to
use smartmatching, then EXPR2 will not either.  This is
quite different from the C<&&> case just described, so be careful.

=back

These rules are complicated, but the goal is for them to do what you want
(even if you don't quite understand why they are doing it).  For example:

    when (/^\d+$/ && $_ < 75) { ... }

will be treated as a boolean match because the rules say both
a regex match and an explicit test on C<$_> will be treated
as boolean.

Also:

    when ([qw(foo bar)] && /baz/) { ... }

will use smartmatching because only I<one> of the operands is a boolean:
the other uses smartmatching, and that wins.

Further:

    when ([qw(foo bar)] || /^baz/) { ... }

will use smartmatching (only the first operand is considered), whereas

    when (/^baz/ || [qw(foo bar)]) { ... }

will test only the regex, which causes both operands to be
treated as boolean.  Watch out for this one, then, because an
arrayref is always a true value, which makes it effectively
redundant.  Not a good idea.

Tautologous boolean operators are still going to be optimized
away.  Don't be tempted to write

    when ("foo" or "bar") { ... }

This will optimize down to C<"foo">, so C<"bar"> will never be considered (even
though the rules say to use a smartmatch
on C<"foo">).  For an alternation like
this, an array ref will work, because this will instigate smartmatching:

    when ([qw(foo bar)]) { ... }

This is somewhat equivalent to the C-style switch statement's fallthrough
functionality (not to be confused with I<Perl's> fallthrough
functionality--see below), wherein the same block is used for several
C<case> statements.

Another useful shortcut is that, if you use a literal array or hash as the
argument to C<given>, it is turned into a reference.  So C<given(@foo)> is
the same as C<given(\@foo)>, for example.

C<default> behaves exactly like C<when(1 == 1)>, which is
to say that it always matches.

=head3 Breaking out

You can use the C<break> keyword to break out of the enclosing
C<given> block.  Every C<when> block is implicitly ended with
a C<break>.

=head3 Fall-through

You can use the C<continue> keyword to fall through from one
case to the next immediate C<when> or C<default>:

    given($foo) {
        when (/x/) { say '$foo contains an x'; continue }
        when (/y/) { say '$foo contains a y'            }
        default    { say '$foo does not contain a y'    }
    }

=head3 Return value

When a C<given> statement is also a valid expression (for example,
when it's the last statement of a block), it evaluates to:

=over 4

=item *

An empty list as soon as an explicit C<break> is encountered.

=item *

The value of the last evaluated expression of the successful
C<when>/C<default> clause, if there happens to be one.

=item *

The value of the last evaluated expression of the C<given> block if no
condition is true.

=back

In the last two cases, the last expression is evaluated in the context that
was applied to the C<given> block.

Note that, unlike C<if> and C<unless>, failed C<when> statements always
evaluate to an empty list.

    my $price = do {
        given ($item) {
            when (["pear", "apple"]) { 1 }
            break when "vote";      # My vote cannot be bought
            1e10  when /Mona Lisa/;
            "unknown";
        }
    };

Currently, C<given> blocks can't always
be used as proper expressions.  This
may be addressed in a future version of Perl.

=head3 Switching in a loop

Instead of using C<given()>, you can use a C<foreach()> loop.
For example, here's one way to count how many times a particular
string occurs in an array:

    use v5.10.1;
    my $count = 0;
    for (@array) {
        when ("foo") { ++$count }
    }
    print "\@array contains $count copies of 'foo'\n";

Or in a more recent version:

    use v5.14;
    my $count = 0;
    for (@array) {
        ++$count when "foo";
    }
    print "\@array contains $count copies of 'foo'\n";

At the end of every C<when> block, there is an implicit C<next>.
You can override that with an explicit C<last> if you're
interested in only the first match.

This doesn't work if you explicitly specify a loop variable, as
in C<for $item (@array)>.  You have to use the default variable C<$_>.

=head3 Differences from Perl 6

The Perl 5 smartmatch and C<given>/C<when> constructs are not compatible
with their Perl 6 analogues.  The most visible, and least important,
difference is that, in Perl 5, parentheses are required around
the argument to C<given()> and C<when()> (except when this last one is used
as a statement modifier).  Parentheses in Perl 6 are always optional in a
control construct such as C<if()>, C<while()>, or C<when()>; they can't be
made optional in Perl 5 without a great deal of potential confusion,
because Perl 5 would parse the expression

    given $foo {
        ...
    }

as though the argument to C<given> were an element of the hash
C<%foo>, interpreting the braces as hash-element syntax.

However, there are many, many other differences.  For example,
this works in Perl 5:

    use v5.12;
    my @primary = ("red", "blue", "green");

    if (@primary ~~ "red") {
        say "primary smartmatches red";
    }

    if ("red" ~~ @primary) {
        say "red smartmatches primary";
    }

    say "that's all, folks!";

But it doesn't work at all in Perl 6.  Instead, you should
use the (parallelizable) C<any> operator:

   if any(@primary) eq "red" {
       say "primary smartmatches red";
   }

   if "red" eq any(@primary) {
       say "red smartmatches primary";
   }

The table of smartmatches in L<perlop/"Smartmatch Operator"> is not
identical to that proposed by the Perl 6 specification, mainly due to
differences between Perl 6's and Perl 5's data models, but also because
the Perl 6 spec has changed since Perl 5 rushed into early adoption.

In Perl 6, C<when()> will always do an implicit smartmatch with its
argument, while in Perl 5 it is convenient (albeit potentially confusing) to
suppress this implicit smartmatch in various rather loosely-defined
situations, as roughly outlined above.  (The difference is largely because
Perl 5 does not have, even internally, a boolean type.)

=cut
=encoding utf8

=head1 NAME

perl5184delta - what is new for perl v5.18.4

=head1 DESCRIPTION

This document describes differences between the 5.18.4 release and the 5.18.2
release.  B<Please note:>  This document ignores perl 5.18.3, a broken release
which existed for a few hours only.

If you are upgrading from an earlier release such as 5.18.1, first read
L<perl5182delta>, which describes differences between 5.18.1 and 5.18.2.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Digest::SHA> has been upgraded from 5.84_01 to 5.84_02.

=item *

L<perl5db.pl> has been upgraded from version 1.39_10 to 1.39_11.

This fixes a crash in tab completion, where available. [perl #120827]  Also,
filehandle information is properly reset after a pager is run. [perl #121456]

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item Win32

=over 4

=item *

A memory leak introduced by
L<perl #113536|https://rt.perl.org/Public/Bug/Display.html?id=113536>, which
occurred on every call to C<system> and backticks (C< `` >) on most Win32 Perls
starting from 5.18.0, has been fixed.  The memory leak only occurred if you
enabled pseudo-fork in your build of Win32 Perl, and were running that build on
Server 2003 R2 or newer OS.  The leak does not appear on WinXP SP3.
[L<perl #121676|https://rt.perl.org/Public/Bug/Display.html?id=121676>]

=back

=back

=head1 Selected Bug Fixes

=over 4

=item *

The debugger now properly resets filehandles as needed. [perl #121456]

=item *

A segfault in Digest::SHA has been addressed.  [perl #121421]

=item *

perl can again be built with USE_64_BIT_INT, with Visual C 2003, 32 bit.
[perl #120925]

=item *

A leading { (brace) in formats is properly parsed again. [perl #119973]

=item *

The values used to perturb hash iteration are now copied when cloning an
interpreter.  Failing to copy them was fairly harmless but caused C<valgrind>
to complain. [perl #121336]

=item *

In Perl v5.18 C<undef *_; goto &sub> and C<local *_; goto &sub> started
crashing.  This has been fixed. [perl #119949]

=back

=head1 Acknowledgements

Perl 5.18.4 represents approximately 9 months of development since Perl 5.18.2
and contains approximately 2,000 lines of changes across 53 files from 13
authors.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers. The following people are known to have contributed the
improvements that became Perl 5.18.4:

Daniel Dragan, David Mitchell, Doug Bell, Father Chrysostomos, Hiroo Hayashi,
James E Keenan, Karl Williamson, Mark Shelor, Ricardo Signes, Shlomi Fish,
Smylers, Steve Hay, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated
from version control history. In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core. We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
http://rt.perl.org/perlbug/ .  There may also be information at
http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=encoding utf8

=head1 NAME

perlebcdic - Considerations for running Perl on EBCDIC platforms

=head1 DESCRIPTION

An exploration of some of the issues facing Perl programmers
on EBCDIC based computers.

Portions of this document that are still incomplete are marked with XXX.

Early Perl versions worked on some EBCDIC machines, but the last known
version that ran on EBCDIC was v5.8.7, until v5.22, when the Perl core
again works on z/OS.  Theoretically, it could work on OS/400 or Siemens'
BS2000 (or their successors), but this is untested.  In v5.22 and 5.24,
not all the modules found on CPAN but shipped with core Perl work on z/OS.

If you want to use Perl on a non-z/OS EBCDIC machine, please let us know
by sending mail to perlbug@perl.org.

Writing Perl on an EBCDIC platform is really no different than writing
on an L</ASCII> one, but with different underlying numbers, as we'll see
shortly.  You'll have to know something about those L</ASCII> platforms
because the documentation is biased and will frequently use example
numbers that don't apply to EBCDIC.  There are also very few CPAN
modules that are written for EBCDIC and which don't work on ASCII;
instead the vast majority of CPAN modules are written for ASCII, and
some may happen to work on EBCDIC, while a few have been designed to
portably work on both.

If your code just uses the 52 letters A-Z and a-z, plus SPACE, the
digits 0-9, and the punctuation characters that Perl uses, plus a few
controls that are denoted by escape sequences like C<\n> and C<\t>, then
there's nothing special about using Perl, and your code may very well
work on an ASCII machine without change.

But if you write code that uses C<\005> to mean a TAB or C<\xC1> to mean
an "A", or C<\xDF> to mean a "E<yuml>" (small C<"y"> with a diaeresis),
then your code may well work on your EBCDIC platform, but not on an
ASCII one.  That's fine to do if no one will ever want to run your code
on an ASCII platform; but the bias in this document will be towards writing
code portable between EBCDIC and ASCII systems.  Again, if every
character you care about is easily enterable from your keyboard, you
don't have to know anything about ASCII, but many keyboards don't easily
allow you to directly enter, say, the character C<\xDF>, so you have to
specify it indirectly, such as by using the C<"\xDF"> escape sequence.
In those cases it's easiest to know something about the ASCII/Unicode
character sets.  If you know that the small "E<yuml>" is C<U+00FF>, then
you can instead specify it as C<"\N{U+FF}">, and have the computer
automatically translate it to C<\xDF> on your platform, and leave it as
C<\xFF> on ASCII ones.  Or you could specify it by name, C<\N{LATIN
SMALL LETTER Y WITH DIAERESIS}> and not have to know the numbers.
Either way works, but both require familiarity with Unicode.
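
A short sketch of the two portable spellings follows (the variable names
are illustrative, and the EBCDIC value in the comment is taken from the
text above rather than from a test run):

    use strict;
    use warnings;
    use charnames ':full';    # needed for \N{NAME} before Perl 5.16

    my $by_code = "\N{U+FF}";
    my $by_name = "\N{LATIN SMALL LETTER Y WITH DIAERESIS}";

    print "same character\n" if $by_code eq $by_name;

    # 0xFF on ASCII platforms; expected to be 0xDF on an EBCDIC one
    printf "native code point: 0x%02X\n", ord $by_code;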

=head1 COMMON CHARACTER CODE SETS

=head2 ASCII

The American Standard Code for Information Interchange (ASCII or
US-ASCII) is a set of
integers running from 0 to 127 (decimal) that have standardized
interpretations by the computers which use ASCII.  For example, 65 means
the letter "A".
The range 0..127 can be covered by setting various bits in a 7-bit binary
number, hence the set is sometimes referred to as "7-bit ASCII".
ASCII was described by the American National Standards Institute
document ANSI X3.4-1986.  It was also described by ISO 646:1991
(with localization for currency symbols).  The full ASCII set is
given in the table L<below|/recipe 3> as the first 128 elements.
Languages that
can be written adequately with the characters in ASCII include
English, Hawaiian, Indonesian, Swahili and some Native American
languages.

Most non-EBCDIC character sets are supersets of ASCII.  That is, the
integers 0-127 mean what ASCII says they mean.  But integers 128 and
above are specific to the character set.

Many of these fit entirely into 8 bits, using ASCII as 0-127, while
specifying what 128-255 mean, and not using anything above 255.
Thus, these are single-byte (or octet if you prefer) character sets.
One important one (since Unicode is a superset of it) is the ISO 8859-1
character set.

=head2 ISO 8859

The ISO 8859-I<B<$n>> are a collection of character code sets from the
International Organization for Standardization (ISO), each of which adds
characters to the ASCII set that are typically found in various
languages, many of which are based on the Roman, or Latin, alphabet.
Most are for European languages, but there are also ones for Arabic,
Greek, Hebrew, and Thai.  There are good references on the web about
all these.

=head2 Latin 1 (ISO 8859-1)

A particular 8-bit extension to ASCII that includes grave and acute
accented Latin characters.  Languages that can employ ISO 8859-1
include all the languages covered by ASCII as well as Afrikaans,
Albanian, Basque, Catalan, Danish, Faroese, Finnish, Norwegian,
Portuguese, Spanish, and Swedish.  Dutch is covered albeit without
the ij ligature.  French is covered too but without the oe ligature.
German can use ISO 8859-1 but must do so without German-style
quotation marks.  This set is based on Western European extensions
to ASCII and is commonly encountered in world wide web work.
In IBM character code set identification terminology, ISO 8859-1 is
also known as CCSID 819 (or sometimes 0819 or even 00819).

=head2 EBCDIC

The Extended Binary Coded Decimal Interchange Code refers to a
large collection of single- and multi-byte coded character sets that are
quite different from ASCII and ISO 8859-1, and are all slightly
different from each other; they typically run on host computers.  The
EBCDIC encodings derive from 8-bit byte extensions of Hollerith punched
card encodings, which long predate ASCII.  The layout on the
cards was such that high bits were set for the upper and lower case
alphabetic characters C<[a-z]> and C<[A-Z]>, but there were gaps within
each Latin alphabet range, visible in the table L<below|/recipe 3>.  These
gaps can cause complications.

Some IBM EBCDIC character sets may be known by character code set
identification numbers (CCSID numbers) or code page numbers.

Perl can be compiled on platforms that run any of three commonly used EBCDIC
character sets, listed below.

=head3 The 13 variant characters

Among IBM EBCDIC character code sets there are 13 characters that
are often mapped to different integer values.  Those characters
are known as the 13 "variant" characters and are:

    \ [ ] { } ^ ~ ! # | $ @ `

When Perl is compiled for a platform, it looks at all of these characters to
guess which EBCDIC character set the platform uses, and adapts itself
accordingly to that platform.  If the platform uses a character set that is not
one of the three Perl knows about, Perl will either fail to compile, or
mistakenly and silently choose one of the three.
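
To see which values a given platform assigns to these characters,
something like the following sketch can be run there (the array name is
an arbitrary choice):

    use strict;
    use warnings;

    # '#' is kept out of qw() to avoid a "comments in qw" warning.
    my @variant = ('\\', qw([ ] { } ^ ~ ! | $ @ `), '#');

    printf "%s  0x%02X\n", $_, ord $_ for @variant;

Comparing the output from two machines shows where their code pages
disagree.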

=head3 EBCDIC code sets recognized by Perl

=over

=item B<0037>

Character code set ID 0037 is a mapping of the ASCII plus Latin-1
characters (i.e. ISO 8859-1) to an EBCDIC set.  0037 is used
in North American English locales on the OS/400 operating system
that runs on AS/400 computers.  CCSID 0037 differs from ISO 8859-1
in 236 places; in other words they agree on only 20 code point values.

=item B<1047>

Character code set ID 1047 is also a mapping of the ASCII plus
Latin-1 characters (i.e. ISO 8859-1) to an EBCDIC set.  1047 is
used under Unix System Services for OS/390 or z/OS, and OpenEdition
for VM/ESA.  CCSID 1047 differs from CCSID 0037 in eight places,
and from ISO 8859-1 in 236.

=item B<POSIX-BC>

The EBCDIC code page in use on Siemens' BS2000 system is distinct from
1047 and 0037.  It is identified below as the POSIX-BC set.
Like 0037 and 1047, it is the same as ISO 8859-1 in 20 code point
values.

=back

=head2 Unicode code points versus EBCDIC code points

In Unicode terminology a I<code point> is the number assigned to a
character: for example, in EBCDIC the character "A" is usually assigned
the number 193.  In Unicode, the character "A" is assigned the number 65.
All the code points in ASCII and Latin-1 (ISO 8859-1) have the same
meaning in Unicode.  All three of the recognized EBCDIC code sets have
256 code points, and in each code set, all 256 code points are mapped to
equivalent Latin-1 code points.  "A" maps to "A", "B" to "B", "%" to
"%", and so on, for all printable characters in Latin-1 and these
code pages.
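
The difference between a native code point and a Unicode code point can
be seen with the core functions C<utf8::native_to_unicode> and
C<utf8::unicode_to_native>.  This is a minimal sketch, assuming a Perl
recent enough to provide both functions; the variable names are only
illustrative.

```perl
use strict;
use warnings;

# ord() gives the native code point: 65 on ASCII platforms,
# 193 on the EBCDIC code sets described here.
my $native = ord 'A';

# Convert to the Unicode code point, which is 65 on every platform.
my $unicode = utf8::native_to_unicode($native);

printf "native %d, Unicode %d\n", $native, $unicode;

# Going the other way recovers the native character.
my $char = chr utf8::unicode_to_native(65);
print "round trip ok\n" if $char eq 'A';
```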

It also turns out that EBCDIC has nearly precise equivalents for the
ASCII/Latin-1 C0 controls and the DELETE control.  (The C0 controls are
those whose ASCII code points are 0..0x1F; things like TAB, ACK, BEL,
etc.)  A mapping is set up between these ASCII/EBCDIC controls.  There
isn't such a precise mapping between the C1 controls on ASCII platforms
and the remaining EBCDIC controls.  What has been done is to map these
controls, mostly arbitrarily, to some otherwise unmatched character in
the other character set.  Most of these are very rarely used
nowadays in EBCDIC anyway, and their names have been dropped, without
much complaint.  For example, the EO (Eight Ones) EBCDIC control
(consisting of eight one bits = 0xFF) is mapped to the C1 APC control
(0x9F), and you can't use the name "EO".

The EBCDIC controls provide three possible line terminator characters,
CR (0x0D), LF (0x25), and NL (0x15).  On ASCII platforms, the symbols
"NL" and "LF" refer to the same character, but in strict EBCDIC
terminology they are different ones.  The EBCDIC NL is mapped to the C1
control called "NEL" ("Next Line"; here's a case where the mapping makes
quite a bit of sense, and hence isn't just arbitrary).  On some EBCDIC
platforms, this NL or NEL is the typical line terminator.  This is true
of z/OS and BS2000.  In these platforms, the C compilers will swap the
LF and NEL code points, so that C<"\n"> is 0x15, and refers to NL.  Perl
does that too; you can see it in the code chart L<below|/recipe 3>.
This makes things generally "just work" without you even having to be
aware that there is a swap.
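
If you do need to know which convention is in effect, the ordinal of
C<"\n"> tells you.  The following sketch classifies it using the values
given above and in the code chart.  The function name
C<describe_newline> is made up for this example, and the 0x25 case is an
assumption read off the CCSID 0037 column of the chart.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a platform by the ordinal of "\n".
sub describe_newline {
    my ($nl_ord) = @_;
    return 'LF (ASCII platform)'             if $nl_ord == 0x0A;
    return 'NL (EBCDIC, LF and NEL swapped)' if $nl_ord == 0x15;
    return 'LF (EBCDIC, no swap)'            if $nl_ord == 0x25;
    return 'unrecognized';
}

printf "ord(\"\\n\") is 0x%02X: %s\n",
    ord("\n"), describe_newline(ord "\n");
```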

=head2 Unicode and UTF

UTF stands for "Unicode Transformation Format".
UTF-8 is an encoding of Unicode into a sequence of 8-bit byte chunks, based on
ASCII and Latin-1.
The length of a sequence required to represent a Unicode code point
depends on the ordinal number of that code point,
with larger numbers requiring more bytes.
UTF-EBCDIC is like UTF-8, but based on EBCDIC.
They are enough alike that often, casual usage will conflate the two
terms, and use "UTF-8" to mean both the UTF-8 found on ASCII platforms,
and the UTF-EBCDIC found on EBCDIC ones.

You may see the term "invariant" character or code point.
This simply means that the character has the same numeric
value and representation when encoded in UTF-8 (or UTF-EBCDIC) as when
not.  (Note that this is a very different concept from L</The 13 variant
characters> mentioned above.  Careful prose will use the term "UTF-8
invariant" instead of just "invariant", but most often you'll see just
"invariant".) For example, the ordinal value of "A" is 193 in most
EBCDIC code pages, and also is 193 when encoded in UTF-EBCDIC.  All
UTF-8 (or UTF-EBCDIC) variant code points occupy at least two bytes when
encoded in UTF-8 (or UTF-EBCDIC); by definition, the UTF-8 (or
UTF-EBCDIC) invariant code points are exactly one byte whether encoded
in UTF-8 (or UTF-EBCDIC), or not.  (By now you see why people typically
just say "UTF-8" when they also mean "UTF-EBCDIC".  For the rest of this
document, we'll mostly be casual about it too.)
In ASCII UTF-8, the code points corresponding to the lowest 128
ordinal numbers (0 - 127: the ASCII characters) are invariant.
In UTF-EBCDIC, there are 160 invariant characters.
(If you care, the EBCDIC invariants are those characters
which have ASCII equivalents, plus those that correspond to
the C1 controls (128 - 159 on ASCII platforms).)
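
The definition can be checked directly.  The sketch below counts the
native code points below 256 whose encoded form is a single byte, using
the core function C<utf8::encode>.  The helper name
C<is_utf8_invariant> is invented for this example.  On an ASCII platform
it reports 128; going by the text above, an EBCDIC build would be
expected to report 160.

```perl
use strict;
use warnings;

# True if the character with native code point $cp occupies one byte
# when encoded in UTF-8 (UTF-EBCDIC on EBCDIC platforms).
sub is_utf8_invariant {
    my ($cp) = @_;
    my $bytes = chr $cp;
    utf8::encode($bytes);    # replace the character by its encoded bytes
    return length($bytes) == 1;
}

# grep in scalar context returns the number of matching code points.
my $count = grep { is_utf8_invariant($_) } 0 .. 255;
print "$count invariant code points below 256\n";
```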

A string encoded in UTF-EBCDIC may be longer (very rarely shorter) than
one encoded in UTF-8.  Perl extends both UTF-8 and UTF-EBCDIC so that
they can encode code points above the Unicode maximum of U+10FFFF.  Both
extensions are constructed to allow encoding of any code point that fits
in a 64-bit word.

UTF-EBCDIC is defined by
L<Unicode Technical Report #16|http://www.unicode.org/reports/tr16>
(often referred to as just TR16).
It is defined based on CCSID 1047, not allowing for the differences for
other code pages.  This allows for easy interchange of text between
computers running different code pages, but makes it unusable, without
adaptation, for Perl on those other code pages.

The reason for this unusability is that a fundamental assumption of Perl
is that the characters it cares about for parsing and lexical analysis
are the same whether or not the text is in UTF-8.  For example, Perl
expects the character C<"["> to have the same representation, no matter
if the string containing it (or program text) is UTF-8 encoded or not.
To ensure this, Perl adapts UTF-EBCDIC to the particular code page so
that all characters it expects to be UTF-8 invariant are in fact UTF-8
invariant.  This means that text generated on a computer running one
version of Perl's UTF-EBCDIC has to be translated to be intelligible to
a computer running another.

TR16 implies a method to extend UTF-EBCDIC to encode code points up through
S<C<2 ** 31 - 1>>.  Perl uses this method for code points up through
S<C<2 ** 30 - 1>>, but uses an incompatible method for larger ones, to
enable it to handle much larger code points than otherwise.

=head2 Using Encode

Starting from Perl 5.8 you can use the standard module Encode
to translate from EBCDIC to Latin-1 code points.
Encode knows about more EBCDIC character sets than Perl can currently
be compiled to run on.

   use Encode 'from_to';

   # Map ord('^') on each EBCDIC platform to the matching Encode name.
   my %ebcdic = ( 176 => 'cp37', 95 => 'cp1047', 106 => 'posix-bc' );

   # $a is in EBCDIC code points
   from_to($a, $ebcdic{ord '^'}, 'latin1');
   # $a is ISO 8859-1 code points

The reverse, from Latin-1 code points to EBCDIC code points, is:

   use Encode 'from_to';

   my %ebcdic = ( 176 => 'cp37', 95 => 'cp1047', 106 => 'posix-bc' );

   # $a is ISO 8859-1 code points
   from_to($a, 'latin1', $ebcdic{ord '^'});
   # $a is in EBCDIC code points
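
The same encodings can be tried out on an ASCII platform.  The following
sketch, which assumes only that the standard Encode module is installed,
encodes the two characters C<"A"> and C<"^"> into each of the three
EBCDIC code sets, prints the resulting bytes in hex, and decodes them
again.  The output should agree with the single octet table later in
this document.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Encoding names as used in the examples above.
for my $name ('cp37', 'cp1047', 'posix-bc') {
    my $octets = encode($name, 'A^');    # characters to EBCDIC bytes
    my $hex    = join ' ', map { sprintf '%02X', ord } split //, $octets;
    my $back   = decode($name, $octets); # and back to characters
    print "$name: $hex (",
        ($back eq 'A^' ? 'round trip ok' : 'mismatch'), ")\n";
}
```

Unlike C<from_to>, which converts its argument in place, C<encode> and
C<decode> return a new string.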

For doing I/O it is suggested that you use the autotranslating features
of PerlIO; see L<perluniintro>.

Since version 5.8 Perl uses the PerlIO I/O library.  This enables
you to use different encodings per IO channel.  For example you may use

    use Encode;
    open(my $f, ">:encoding(ascii)", "test.ascii") or die $!;
    print $f "Hello World!\n";
    open($f, ">:encoding(cp37)", "test.ebcdic") or die $!;
    print $f "Hello World!\n";
    open($f, ">:encoding(latin1)", "test.latin1") or die $!;
    print $f "Hello World!\n";
    open($f, ">:encoding(utf8)", "test.utf8") or die $!;
    print $f "Hello World!\n";
    close($f) or die $!;

to get four files containing "Hello World!\n" in ASCII, CP 0037 EBCDIC,
ISO 8859-1 (Latin-1) (in this example identical to ASCII since only ASCII
characters were printed), and
UTF-EBCDIC (in this example identical to normal EBCDIC since only characters
that don't differ between EBCDIC and UTF-EBCDIC were printed).  See the
documentation of L<Encode::PerlIO> for details.

As the PerlIO layer uses raw IO (bytes) internally, all this totally
ignores things like the type of your filesystem (ASCII or EBCDIC).
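
To look at the bytes an encoding layer produces without creating files,
you can print to an in-memory filehandle.  This is only a sketch: the
helper name C<bytes_written> is made up for this example, and the sample
text deliberately contains no newline, so the line terminator questions
discussed earlier do not arise.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the bytes that printing $text through
# the given encoding layer would write to a file.
sub bytes_written {
    my ($encoding, $text) = @_;
    my $buffer = '';
    open(my $fh, ">:encoding($encoding)", \$buffer)
        or die "Cannot open in-memory file: $!";
    print {$fh} $text;
    close($fh) or die "Cannot close in-memory file: $!";
    return $buffer;
}

for my $encoding (qw(ascii cp37 latin1)) {
    my $hex = join ' ', map { sprintf '%02X', ord } split //,
        bytes_written($encoding, 'Hello');
    print "$encoding: $hex\n";
}
```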

=head1 SINGLE OCTET TABLES

The following tables list the ASCII and Latin-1 ordered sets including
the subsets: C0 controls (0x00..0x1F), ASCII graphics (0x20..0x7E),
delete (0x7F), C1 controls (0x80..0x9F), and Latin-1 (a.k.a. ISO 8859-1)
(0xA0..0xFF).  In the table, the Latin-1
extensions to ASCII are labelled with character names roughly
corresponding to I<The Unicode Standard, Version 6.1>, albeit with
substitutions such as C<s/LATIN//> and C<s/VULGAR//> in all cases;
S<C<s/CAPITAL LETTER//>> in some cases; and
S<C<s/SMALL LETTER ([A-Z])/\l$1/>> in some other
cases.  Controls are listed using their Unicode 6.2 abbreviations.
The differences between the 0037 and 1047 sets are
flagged with C<**>.  The differences between the 1047 and POSIX-BC sets
are flagged with C<##>.  All C<ord()> numbers listed are decimal.  If you
would rather see this table listing octal values, then run the table
(that is, the pod source text of this document, since this recipe may not
work with a pod2_other_format translation) through:

=over 4

=item recipe 0

=back

    perl -ne 'if(/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \
     -e '{printf("%s%-5.03o%-5.03o%-5.03o%.03o\n",$1,$2,$3,$4,$5)}' \
     perlebcdic.pod

If you want to retain the UTF-x code points then in script form you
might want to write:

=over 4

=item recipe 1

=back

 open(my $fh, '<', 'perlebcdic.pod')
     or die "Could not open perlebcdic.pod: $!";
 while (<$fh>) {
     if (/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\.?(\d*)
                                                     \s+(\d+)\.?(\d*)/x)
     {
         if ($7 ne '' && $9 ne '') {
             printf(
                "%s%-5.03o%-5.03o%-5.03o%-5.03o%-3o.%-5o%-3o.%.03o\n",
                                            $1,$2,$3,$4,$5,$6,$7,$8,$9);
         }
         elsif ($7 ne '') {
             printf("%s%-5.03o%-5.03o%-5.03o%-5.03o%-3o.%-5o%.03o\n",
                                           $1,$2,$3,$4,$5,$6,$7,$8);
         }
         else {
             printf("%s%-5.03o%-5.03o%-5.03o%-5.03o%-5.03o%.03o\n",
                                                $1,$2,$3,$4,$5,$6,$8);
         }
     }
 }

If you would rather see this table listing hexadecimal values then
run the table through:

=over 4

=item recipe 2

=back

    perl -ne 'if(/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \
     -e '{printf("%s%-5.02X%-5.02X%-5.02X%.02X\n",$1,$2,$3,$4,$5)}' \
     perlebcdic.pod

Or, in order to retain the UTF-x code points in hexadecimal:

=over 4

=item recipe 3

=back

 open(my $fh, '<', 'perlebcdic.pod')
     or die "Could not open perlebcdic.pod: $!";
 while (<$fh>) {
     if (/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\.?(\d*)
                                                     \s+(\d+)\.?(\d*)/x)
     {
         if ($7 ne '' && $9 ne '') {
             printf(
                "%s%-5.02X%-5.02X%-5.02X%-5.02X%-2X.%-6.02X%02X.%02X\n",
                                           $1,$2,$3,$4,$5,$6,$7,$8,$9);
         }
         elsif ($7 ne '') {
             printf("%s%-5.02X%-5.02X%-5.02X%-5.02X%-2X.%-6.02X%02X\n",
                                              $1,$2,$3,$4,$5,$6,$7,$8);
         }
         else {
             printf("%s%-5.02X%-5.02X%-5.02X%-5.02X%-5.02X%02X\n",
                                                  $1,$2,$3,$4,$5,$6,$8);
         }
     }
 }


                          ISO
                         8859-1             POS-         CCSID
                         CCSID  CCSID CCSID IX-          1047
  chr                     0819   0037 1047  BC  UTF-8  UTF-EBCDIC
 ---------------------------------------------------------------------
 <NUL>                       0    0    0    0    0        0
 <SOH>                       1    1    1    1    1        1
 <STX>                       2    2    2    2    2        2
 <ETX>                       3    3    3    3    3        3
 <EOT>                       4    55   55   55   4        55
 <ENQ>                       5    45   45   45   5        45
 <ACK>                       6    46   46   46   6        46
 <BEL>                       7    47   47   47   7        47
 <BS>                        8    22   22   22   8        22
 <HT>                        9    5    5    5    9        5
 <LF>                        10   37   21   21   10       21  **
 <VT>                        11   11   11   11   11       11
 <FF>                        12   12   12   12   12       12
 <CR>                        13   13   13   13   13       13
 <SO>                        14   14   14   14   14       14
 <SI>                        15   15   15   15   15       15
 <DLE>                       16   16   16   16   16       16
 <DC1>                       17   17   17   17   17       17
 <DC2>                       18   18   18   18   18       18
 <DC3>                       19   19   19   19   19       19
 <DC4>                       20   60   60   60   20       60
 <NAK>                       21   61   61   61   21       61
 <SYN>                       22   50   50   50   22       50
 <ETB>                       23   38   38   38   23       38
 <CAN>                       24   24   24   24   24       24
 <EOM>                       25   25   25   25   25       25
 <SUB>                       26   63   63   63   26       63
 <ESC>                       27   39   39   39   27       39
 <FS>                        28   28   28   28   28       28
 <GS>                        29   29   29   29   29       29
 <RS>                        30   30   30   30   30       30
 <US>                        31   31   31   31   31       31
 <SPACE>                     32   64   64   64   32       64
 !                           33   90   90   90   33       90
 "                           34   127  127  127  34       127
 #                           35   123  123  123  35       123
 $                           36   91   91   91   36       91
 %                           37   108  108  108  37       108
 &                           38   80   80   80   38       80
 '                           39   125  125  125  39       125
 (                           40   77   77   77   40       77
 )                           41   93   93   93   41       93
 *                           42   92   92   92   42       92
 +                           43   78   78   78   43       78
 ,                           44   107  107  107  44       107
 -                           45   96   96   96   45       96
 .                           46   75   75   75   46       75
 /                           47   97   97   97   47       97
 0                           48   240  240  240  48       240
 1                           49   241  241  241  49       241
 2                           50   242  242  242  50       242
 3                           51   243  243  243  51       243
 4                           52   244  244  244  52       244
 5                           53   245  245  245  53       245
 6                           54   246  246  246  54       246
 7                           55   247  247  247  55       247
 8                           56   248  248  248  56       248
 9                           57   249  249  249  57       249
 :                           58   122  122  122  58       122
 ;                           59   94   94   94   59       94
 <                           60   76   76   76   60       76
 =                           61   126  126  126  61       126
 >                           62   110  110  110  62       110
 ?                           63   111  111  111  63       111
 @                           64   124  124  124  64       124
 A                           65   193  193  193  65       193
 B                           66   194  194  194  66       194
 C                           67   195  195  195  67       195
 D                           68   196  196  196  68       196
 E                           69   197  197  197  69       197
 F                           70   198  198  198  70       198
 G                           71   199  199  199  71       199
 H                           72   200  200  200  72       200
 I                           73   201  201  201  73       201
 J                           74   209  209  209  74       209
 K                           75   210  210  210  75       210
 L                           76   211  211  211  76       211
 M                           77   212  212  212  77       212
 N                           78   213  213  213  78       213
 O                           79   214  214  214  79       214
 P                           80   215  215  215  80       215
 Q                           81   216  216  216  81       216
 R                           82   217  217  217  82       217
 S                           83   226  226  226  83       226
 T                           84   227  227  227  84       227
 U                           85   228  228  228  85       228
 V                           86   229  229  229  86       229
 W                           87   230  230  230  87       230
 X                           88   231  231  231  88       231
 Y                           89   232  232  232  89       232
 Z                           90   233  233  233  90       233
 [                           91   186  173  187  91       173  ** ##
 \                           92   224  224  188  92       224  ##
 ]                           93   187  189  189  93       189  **
 ^                           94   176  95   106  94       95   ** ##
 _                           95   109  109  109  95       109
 `                           96   121  121  74   96       121  ##
 a                           97   129  129  129  97       129
 b                           98   130  130  130  98       130
 c                           99   131  131  131  99       131
 d                           100  132  132  132  100      132
 e                           101  133  133  133  101      133
 f                           102  134  134  134  102      134
 g                           103  135  135  135  103      135
 h                           104  136  136  136  104      136
 i                           105  137  137  137  105      137
 j                           106  145  145  145  106      145
 k                           107  146  146  146  107      146
 l                           108  147  147  147  108      147
 m                           109  148  148  148  109      148
 n                           110  149  149  149  110      149
 o                           111  150  150  150  111      150
 p                           112  151  151  151  112      151
 q                           113  152  152  152  113      152
 r                           114  153  153  153  114      153
 s                           115  162  162  162  115      162
 t                           116  163  163  163  116      163
 u                           117  164  164  164  117      164
 v                           118  165  165  165  118      165
 w                           119  166  166  166  119      166
 x                           120  167  167  167  120      167
 y                           121  168  168  168  121      168
 z                           122  169  169  169  122      169
 {                           123  192  192  251  123      192  ##
 |                           124  79   79   79   124      79
 }                           125  208  208  253  125      208  ##
 ~                           126  161  161  255  126      161  ##
 <DEL>                       127  7    7    7    127      7
 <PAD>                       128  32   32   32   194.128  32
 <HOP>                       129  33   33   33   194.129  33
 <BPH>                       130  34   34   34   194.130  34
 <NBH>                       131  35   35   35   194.131  35
 <IND>                       132  36   36   36   194.132  36
 <NEL>                       133  21   37   37   194.133  37   **
 <SSA>                       134  6    6    6    194.134  6
 <ESA>                       135  23   23   23   194.135  23
 <HTS>                       136  40   40   40   194.136  40
 <HTJ>                       137  41   41   41   194.137  41
 <VTS>                       138  42   42   42   194.138  42
 <PLD>                       139  43   43   43   194.139  43
 <PLU>                       140  44   44   44   194.140  44
 <RI>                        141  9    9    9    194.141  9
 <SS2>                       142  10   10   10   194.142  10
 <SS3>                       143  27   27   27   194.143  27
 <DCS>                       144  48   48   48   194.144  48
 <PU1>                       145  49   49   49   194.145  49
 <PU2>                       146  26   26   26   194.146  26
 <STS>                       147  51   51   51   194.147  51
 <CCH>                       148  52   52   52   194.148  52
 <MW>                        149  53   53   53   194.149  53
 <SPA>                       150  54   54   54   194.150  54
 <EPA>                       151  8    8    8    194.151  8
 <SOS>                       152  56   56   56   194.152  56
 <SGC>                       153  57   57   57   194.153  57
 <SCI>                       154  58   58   58   194.154  58
 <CSI>                       155  59   59   59   194.155  59
 <ST>                        156  4    4    4    194.156  4
 <OSC>                       157  20   20   20   194.157  20
 <PM>                        158  62   62   62   194.158  62
 <APC>                       159  255  255  95   194.159  255      ##
 <NON-BREAKING SPACE>        160  65   65   65   194.160  128.65
 <INVERTED "!" >             161  170  170  170  194.161  128.66
 <CENT SIGN>                 162  74   74   176  194.162  128.67   ##
 <POUND SIGN>                163  177  177  177  194.163  128.68
 <CURRENCY SIGN>             164  159  159  159  194.164  128.69
 <YEN SIGN>                  165  178  178  178  194.165  128.70
 <BROKEN BAR>                166  106  106  208  194.166  128.71   ##
 <SECTION SIGN>              167  181  181  181  194.167  128.72
 <DIAERESIS>                 168  189  187  121  194.168  128.73   ** ##
 <COPYRIGHT SIGN>            169  180  180  180  194.169  128.74
 <FEMININE ORDINAL>          170  154  154  154  194.170  128.81
 <LEFT POINTING GUILLEMET>   171  138  138  138  194.171  128.82
 <NOT SIGN>                  172  95   176  186  194.172  128.83   ** ##
 <SOFT HYPHEN>               173  202  202  202  194.173  128.84
 <REGISTERED TRADE MARK>     174  175  175  175  194.174  128.85
 <MACRON>                    175  188  188  161  194.175  128.86   ##
 <DEGREE SIGN>               176  144  144  144  194.176  128.87
 <PLUS-OR-MINUS SIGN>        177  143  143  143  194.177  128.88
 <SUPERSCRIPT TWO>           178  234  234  234  194.178  128.89
 <SUPERSCRIPT THREE>         179  250  250  250  194.179  128.98
 <ACUTE ACCENT>              180  190  190  190  194.180  128.99
 <MICRO SIGN>                181  160  160  160  194.181  128.100
 <PARAGRAPH SIGN>            182  182  182  182  194.182  128.101
 <MIDDLE DOT>                183  179  179  179  194.183  128.102
 <CEDILLA>                   184  157  157  157  194.184  128.103
 <SUPERSCRIPT ONE>           185  218  218  218  194.185  128.104
 <MASC. ORDINAL INDICATOR>   186  155  155  155  194.186  128.105
 <RIGHT POINTING GUILLEMET>  187  139  139  139  194.187  128.106
 <FRACTION ONE QUARTER>      188  183  183  183  194.188  128.112
 <FRACTION ONE HALF>         189  184  184  184  194.189  128.113
 <FRACTION THREE QUARTERS>   190  185  185  185  194.190  128.114
 <INVERTED QUESTION MARK>    191  171  171  171  194.191  128.115
 <A WITH GRAVE>              192  100  100  100  195.128  138.65
 <A WITH ACUTE>              193  101  101  101  195.129  138.66
 <A WITH CIRCUMFLEX>         194  98   98   98   195.130  138.67
 <A WITH TILDE>              195  102  102  102  195.131  138.68
 <A WITH DIAERESIS>          196  99   99   99   195.132  138.69
 <A WITH RING ABOVE>         197  103  103  103  195.133  138.70
 <CAPITAL LIGATURE AE>       198  158  158  158  195.134  138.71
 <C WITH CEDILLA>            199  104  104  104  195.135  138.72
 <E WITH GRAVE>              200  116  116  116  195.136  138.73
 <E WITH ACUTE>              201  113  113  113  195.137  138.74
 <E WITH CIRCUMFLEX>         202  114  114  114  195.138  138.81
 <E WITH DIAERESIS>          203  115  115  115  195.139  138.82
 <I WITH GRAVE>              204  120  120  120  195.140  138.83
 <I WITH ACUTE>              205  117  117  117  195.141  138.84
 <I WITH CIRCUMFLEX>         206  118  118  118  195.142  138.85
 <I WITH DIAERESIS>          207  119  119  119  195.143  138.86
 <CAPITAL LETTER ETH>        208  172  172  172  195.144  138.87
 <N WITH TILDE>              209  105  105  105  195.145  138.88
 <O WITH GRAVE>              210  237  237  237  195.146  138.89
 <O WITH ACUTE>              211  238  238  238  195.147  138.98
 <O WITH CIRCUMFLEX>         212  235  235  235  195.148  138.99
 <O WITH TILDE>              213  239  239  239  195.149  138.100
 <O WITH DIAERESIS>          214  236  236  236  195.150  138.101
 <MULTIPLICATION SIGN>       215  191  191  191  195.151  138.102
 <O WITH STROKE>             216  128  128  128  195.152  138.103
 <U WITH GRAVE>              217  253  253  224  195.153  138.104  ##
 <U WITH ACUTE>              218  254  254  254  195.154  138.105
 <U WITH CIRCUMFLEX>         219  251  251  221  195.155  138.106  ##
 <U WITH DIAERESIS>          220  252  252  252  195.156  138.112
 <Y WITH ACUTE>              221  173  186  173  195.157  138.113  ** ##
 <CAPITAL LETTER THORN>      222  174  174  174  195.158  138.114
 <SMALL LETTER SHARP S>      223  89   89   89   195.159  138.115
 <a WITH GRAVE>              224  68   68   68   195.160  139.65
 <a WITH ACUTE>              225  69   69   69   195.161  139.66
 <a WITH CIRCUMFLEX>         226  66   66   66   195.162  139.67
 <a WITH TILDE>              227  70   70   70   195.163  139.68
 <a WITH DIAERESIS>          228  67   67   67   195.164  139.69
 <a WITH RING ABOVE>         229  71   71   71   195.165  139.70
 <SMALL LIGATURE ae>         230  156  156  156  195.166  139.71
 <c WITH CEDILLA>            231  72   72   72   195.167  139.72
 <e WITH GRAVE>              232  84   84   84   195.168  139.73
 <e WITH ACUTE>              233  81   81   81   195.169  139.74
 <e WITH CIRCUMFLEX>         234  82   82   82   195.170  139.81
 <e WITH DIAERESIS>          235  83   83   83   195.171  139.82
 <i WITH GRAVE>              236  88   88   88   195.172  139.83
 <i WITH ACUTE>              237  85   85   85   195.173  139.84
 <i WITH CIRCUMFLEX>         238  86   86   86   195.174  139.85
 <i WITH DIAERESIS>          239  87   87   87   195.175  139.86
 <SMALL LETTER eth>          240  140  140  140  195.176  139.87
 <n WITH TILDE>              241  73   73   73   195.177  139.88
 <o WITH GRAVE>              242  205  205  205  195.178  139.89
 <o WITH ACUTE>              243  206  206  206  195.179  139.98
 <o WITH CIRCUMFLEX>         244  203  203  203  195.180  139.99
 <o WITH TILDE>              245  207  207  207  195.181  139.100
 <o WITH DIAERESIS>          246  204  204  204  195.182  139.101
 <DIVISION SIGN>             247  225  225  225  195.183  139.102
 <o WITH STROKE>             248  112  112  112  195.184  139.103
 <u WITH GRAVE>              249  221  221  192  195.185  139.104  ##
 <u WITH ACUTE>              250  222  222  222  195.186  139.105
 <u WITH CIRCUMFLEX>         251  219  219  219  195.187  139.106
 <u WITH DIAERESIS>          252  220  220  220  195.188  139.112
 <y WITH ACUTE>              253  141  141  141  195.189  139.113
 <SMALL LETTER thorn>        254  142  142  142  195.190  139.114
 <y WITH DIAERESIS>          255  223  223  223  195.191  139.115

If you would rather see the above table in CCSID 0037 order rather than
ASCII + Latin-1 order then run the table through:

=over 4

=item recipe 4

=back

 perl \
    -ne 'if(/.{29}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}/)'\
     -e '{push(@l,$_)}' \
     -e 'END{print map{$_->[0]}' \
     -e '          sort{$a->[1] <=> $b->[1]}' \
     -e '          map{[$_,substr($_,34,3)]}@l;}' perlebcdic.pod

If you would rather see it in CCSID 1047 order then change the number
34 in the last line to 39, like this:

=over 4

=item recipe 5

=back

 perl \
    -ne 'if(/.{29}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}/)'\
    -e '{push(@l,$_)}' \
    -e 'END{print map{$_->[0]}' \
    -e '          sort{$a->[1] <=> $b->[1]}' \
    -e '          map{[$_,substr($_,39,3)]}@l;}' perlebcdic.pod

If you would rather see it in POSIX-BC order then change the number
34 in the last line of recipe 4 to 44, like this:

=over 4

=item recipe 6

=back

 perl \
    -ne 'if(/.{29}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}/)'\
     -e '{push(@l,$_)}' \
     -e 'END{print map{$_->[0]}' \
     -e '          sort{$a->[1] <=> $b->[1]}' \
     -e '          map{[$_,substr($_,44,3)]}@l;}' perlebcdic.pod
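
In script form, the three sorting recipes differ only in the column
offset passed to C<substr>.  The sketch below makes that explicit.  The
names C<%offset_for> and C<sort_rows> are invented for this example, and
the four sample rows are rebuilt with C<sprintf> in the same column
layout as the table rather than read from the pod source.

```perl
use strict;
use warnings;

# Column offsets within a table row, from the substr() calls in
# recipes 4, 5 and 6.
my %offset_for = ('0037' => 34, '1047' => 39, 'POSIX-BC' => 44);

# Hypothetical helper: returns the rows sorted by the chosen column.
sub sort_rows {
    my ($code_set, @rows) = @_;
    my $offset = $offset_for{$code_set};
    die "Unknown code set '$code_set'\n" unless defined $offset;
    return map  { $_->[0] }
           sort { $a->[1] <=> $b->[1] }
           map  { [ $_, (substr($_, $offset, 3) =~ /(\d+)/)[0] ] } @rows;
}

# Sample rows: chr, ISO 8859-1, 0037, 1047, POSIX-BC.
my @rows = map { sprintf " %-28s%-5d%-5d%-5d%-5d\n", @$_ } (
    [ '[',  91, 186, 173, 187 ],
    [ ']',  93, 187, 189, 189 ],
    [ '^',  94, 176,  95, 106 ],
    [ '~', 126, 161, 161, 255 ],
);

print sort_rows('POSIX-BC', @rows);
```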

=head2 Table in hex, sorted in 1047 order

Since this document was first written, the convention has become more
and more to use hexadecimal notation for code points.  To do this with
the recipes and to also sort is a multi-step process, so here, for
convenience, is the table from above, re-sorted to be in Code Page 1047
order, and using hex notation.

                          ISO
                         8859-1             POS-         CCSID
                         CCSID  CCSID CCSID IX-          1047
  chr                     0819   0037 1047  BC  UTF-8  UTF-EBCDIC
 ---------------------------------------------------------------------
 <NUL>                       00   00   00   00   00       00
 <SOH>                       01   01   01   01   01       01
 <STX>                       02   02   02   02   02       02
 <ETX>                       03   03   03   03   03       03
 <ST>                        9C   04   04   04   C2.9C    04
 <HT>                        09   05   05   05   09       05
 <SSA>                       86   06   06   06   C2.86    06
 <DEL>                       7F   07   07   07   7F       07
 <EPA>                       97   08   08   08   C2.97    08
 <RI>                        8D   09   09   09   C2.8D    09
 <SS2>                       8E   0A   0A   0A   C2.8E    0A
 <VT>                        0B   0B   0B   0B   0B       0B
 <FF>                        0C   0C   0C   0C   0C       0C
 <CR>                        0D   0D   0D   0D   0D       0D
 <SO>                        0E   0E   0E   0E   0E       0E
 <SI>                        0F   0F   0F   0F   0F       0F
 <DLE>                       10   10   10   10   10       10
 <DC1>                       11   11   11   11   11       11
 <DC2>                       12   12   12   12   12       12
 <DC3>                       13   13   13   13   13       13
 <OSC>                       9D   14   14   14   C2.9D    14
 <LF>                        0A   25   15   15   0A       15    **
 <BS>                        08   16   16   16   08       16
 <ESA>                       87   17   17   17   C2.87    17
 <CAN>                       18   18   18   18   18       18
 <EOM>                       19   19   19   19   19       19
 <PU2>                       92   1A   1A   1A   C2.92    1A
 <SS3>                       8F   1B   1B   1B   C2.8F    1B
 <FS>                        1C   1C   1C   1C   1C       1C
 <GS>                        1D   1D   1D   1D   1D       1D
 <RS>                        1E   1E   1E   1E   1E       1E
 <US>                        1F   1F   1F   1F   1F       1F
 <PAD>                       80   20   20   20   C2.80    20
 <HOP>                       81   21   21   21   C2.81    21
 <BPH>                       82   22   22   22   C2.82    22
 <NBH>                       83   23   23   23   C2.83    23
 <IND>                       84   24   24   24   C2.84    24
 <NEL>                       85   15   25   25   C2.85    25     **
 <ETB>                       17   26   26   26   17       26
 <ESC>                       1B   27   27   27   1B       27
 <HTS>                       88   28   28   28   C2.88    28
 <HTJ>                       89   29   29   29   C2.89    29
 <VTS>                       8A   2A   2A   2A   C2.8A    2A
 <PLD>                       8B   2B   2B   2B   C2.8B    2B
 <PLU>                       8C   2C   2C   2C   C2.8C    2C
 <ENQ>                       05   2D   2D   2D   05       2D
 <ACK>                       06   2E   2E   2E   06       2E
 <BEL>                       07   2F   2F   2F   07       2F
 <DCS>                       90   30   30   30   C2.90    30
 <PU1>                       91   31   31   31   C2.91    31
 <SYN>                       16   32   32   32   16       32
 <STS>                       93   33   33   33   C2.93    33
 <CCH>                       94   34   34   34   C2.94    34
 <MW>                        95   35   35   35   C2.95    35
 <SPA>                       96   36   36   36   C2.96    36
 <EOT>                       04   37   37   37   04       37
 <SOS>                       98   38   38   38   C2.98    38
 <SGC>                       99   39   39   39   C2.99    39
 <SCI>                       9A   3A   3A   3A   C2.9A    3A
 <CSI>                       9B   3B   3B   3B   C2.9B    3B
 <DC4>                       14   3C   3C   3C   14       3C
 <NAK>                       15   3D   3D   3D   15       3D
 <PM>                        9E   3E   3E   3E   C2.9E    3E
 <SUB>                       1A   3F   3F   3F   1A       3F
 <SPACE>                     20   40   40   40   20       40
 <NON-BREAKING SPACE>        A0   41   41   41   C2.A0    80.41
 <a WITH CIRCUMFLEX>         E2   42   42   42   C3.A2    8B.43
 <a WITH DIAERESIS>          E4   43   43   43   C3.A4    8B.45
 <a WITH GRAVE>              E0   44   44   44   C3.A0    8B.41
 <a WITH ACUTE>              E1   45   45   45   C3.A1    8B.42
 <a WITH TILDE>              E3   46   46   46   C3.A3    8B.44
 <a WITH RING ABOVE>         E5   47   47   47   C3.A5    8B.46
 <c WITH CEDILLA>            E7   48   48   48   C3.A7    8B.48
 <n WITH TILDE>              F1   49   49   49   C3.B1    8B.58
 <CENT SIGN>                 A2   4A   4A   B0   C2.A2    80.43  ##
 .                           2E   4B   4B   4B   2E       4B
 <                           3C   4C   4C   4C   3C       4C
 (                           28   4D   4D   4D   28       4D
 +                           2B   4E   4E   4E   2B       4E
 |                           7C   4F   4F   4F   7C       4F
 &                           26   50   50   50   26       50
 <e WITH ACUTE>              E9   51   51   51   C3.A9    8B.4A
 <e WITH CIRCUMFLEX>         EA   52   52   52   C3.AA    8B.51
 <e WITH DIAERESIS>          EB   53   53   53   C3.AB    8B.52
 <e WITH GRAVE>              E8   54   54   54   C3.A8    8B.49
 <i WITH ACUTE>              ED   55   55   55   C3.AD    8B.54
 <i WITH CIRCUMFLEX>         EE   56   56   56   C3.AE    8B.55
 <i WITH DIAERESIS>          EF   57   57   57   C3.AF    8B.56
 <i WITH GRAVE>              EC   58   58   58   C3.AC    8B.53
 <SMALL LETTER SHARP S>      DF   59   59   59   C3.9F    8A.73
 !                           21   5A   5A   5A   21       5A
 $                           24   5B   5B   5B   24       5B
 *                           2A   5C   5C   5C   2A       5C
 )                           29   5D   5D   5D   29       5D
 ;                           3B   5E   5E   5E   3B       5E
 ^                           5E   B0   5F   6A   5E       5F     ** ##
 -                           2D   60   60   60   2D       60
 /                           2F   61   61   61   2F       61
 <A WITH CIRCUMFLEX>         C2   62   62   62   C3.82    8A.43
 <A WITH DIAERESIS>          C4   63   63   63   C3.84    8A.45
 <A WITH GRAVE>              C0   64   64   64   C3.80    8A.41
 <A WITH ACUTE>              C1   65   65   65   C3.81    8A.42
 <A WITH TILDE>              C3   66   66   66   C3.83    8A.44
 <A WITH RING ABOVE>         C5   67   67   67   C3.85    8A.46
 <C WITH CEDILLA>            C7   68   68   68   C3.87    8A.48
 <N WITH TILDE>              D1   69   69   69   C3.91    8A.58
 <BROKEN BAR>                A6   6A   6A   D0   C2.A6    80.47  ##
 ,                           2C   6B   6B   6B   2C       6B
 %                           25   6C   6C   6C   25       6C
 _                           5F   6D   6D   6D   5F       6D
 >                           3E   6E   6E   6E   3E       6E
 ?                           3F   6F   6F   6F   3F       6F
 <o WITH STROKE>             F8   70   70   70   C3.B8    8B.67
 <E WITH ACUTE>              C9   71   71   71   C3.89    8A.4A
 <E WITH CIRCUMFLEX>         CA   72   72   72   C3.8A    8A.51
 <E WITH DIAERESIS>          CB   73   73   73   C3.8B    8A.52
 <E WITH GRAVE>              C8   74   74   74   C3.88    8A.49
 <I WITH ACUTE>              CD   75   75   75   C3.8D    8A.54
 <I WITH CIRCUMFLEX>         CE   76   76   76   C3.8E    8A.55
 <I WITH DIAERESIS>          CF   77   77   77   C3.8F    8A.56
 <I WITH GRAVE>              CC   78   78   78   C3.8C    8A.53
 `                           60   79   79   4A   60       79     ##
 :                           3A   7A   7A   7A   3A       7A
 #                           23   7B   7B   7B   23       7B
 @                           40   7C   7C   7C   40       7C
 '                           27   7D   7D   7D   27       7D
 =                           3D   7E   7E   7E   3D       7E
 "                           22   7F   7F   7F   22       7F
 <O WITH STROKE>             D8   80   80   80   C3.98    8A.67
 a                           61   81   81   81   61       81
 b                           62   82   82   82   62       82
 c                           63   83   83   83   63       83
 d                           64   84   84   84   64       84
 e                           65   85   85   85   65       85
 f                           66   86   86   86   66       86
 g                           67   87   87   87   67       87
 h                           68   88   88   88   68       88
 i                           69   89   89   89   69       89
 <LEFT POINTING GUILLEMET>   AB   8A   8A   8A   C2.AB    80.52
 <RIGHT POINTING GUILLEMET>  BB   8B   8B   8B   C2.BB    80.6A
 <SMALL LETTER eth>          F0   8C   8C   8C   C3.B0    8B.57
 <y WITH ACUTE>              FD   8D   8D   8D   C3.BD    8B.71
 <SMALL LETTER thorn>        FE   8E   8E   8E   C3.BE    8B.72
 <PLUS-OR-MINUS SIGN>        B1   8F   8F   8F   C2.B1    80.58
 <DEGREE SIGN>               B0   90   90   90   C2.B0    80.57
 j                           6A   91   91   91   6A       91
 k                           6B   92   92   92   6B       92
 l                           6C   93   93   93   6C       93
 m                           6D   94   94   94   6D       94
 n                           6E   95   95   95   6E       95
 o                           6F   96   96   96   6F       96
 p                           70   97   97   97   70       97
 q                           71   98   98   98   71       98
 r                           72   99   99   99   72       99
 <FEMININE ORDINAL>          AA   9A   9A   9A   C2.AA    80.51
 <MASC. ORDINAL INDICATOR>   BA   9B   9B   9B   C2.BA    80.69
 <SMALL LIGATURE ae>         E6   9C   9C   9C   C3.A6    8B.47
 <CEDILLA>                   B8   9D   9D   9D   C2.B8    80.67
 <CAPITAL LIGATURE AE>       C6   9E   9E   9E   C3.86    8A.47
 <CURRENCY SIGN>             A4   9F   9F   9F   C2.A4    80.45
 <MICRO SIGN>                B5   A0   A0   A0   C2.B5    80.64
 ~                           7E   A1   A1   FF   7E       A1     ##
 s                           73   A2   A2   A2   73       A2
 t                           74   A3   A3   A3   74       A3
 u                           75   A4   A4   A4   75       A4
 v                           76   A5   A5   A5   76       A5
 w                           77   A6   A6   A6   77       A6
 x                           78   A7   A7   A7   78       A7
 y                           79   A8   A8   A8   79       A8
 z                           7A   A9   A9   A9   7A       A9
 <INVERTED "!" >             A1   AA   AA   AA   C2.A1    80.42
 <INVERTED QUESTION MARK>    BF   AB   AB   AB   C2.BF    80.73
 <CAPITAL LETTER ETH>        D0   AC   AC   AC   C3.90    8A.57
 [                           5B   BA   AD   BB   5B       AD     ** ##
 <CAPITAL LETTER THORN>      DE   AE   AE   AE   C3.9E    8A.72
 <REGISTERED TRADE MARK>     AE   AF   AF   AF   C2.AE    80.55
 <NOT SIGN>                  AC   5F   B0   BA   C2.AC    80.53  ** ##
 <POUND SIGN>                A3   B1   B1   B1   C2.A3    80.44
 <YEN SIGN>                  A5   B2   B2   B2   C2.A5    80.46
 <MIDDLE DOT>                B7   B3   B3   B3   C2.B7    80.66
 <COPYRIGHT SIGN>            A9   B4   B4   B4   C2.A9    80.4A
 <SECTION SIGN>              A7   B5   B5   B5   C2.A7    80.48
 <PARAGRAPH SIGN>            B6   B6   B6   B6   C2.B6    80.65
 <FRACTION ONE QUARTER>      BC   B7   B7   B7   C2.BC    80.70
 <FRACTION ONE HALF>         BD   B8   B8   B8   C2.BD    80.71
 <FRACTION THREE QUARTERS>   BE   B9   B9   B9   C2.BE    80.72
 <Y WITH ACUTE>              DD   AD   BA   AD   C3.9D    8A.71  ** ##
 <DIAERESIS>                 A8   BD   BB   79   C2.A8    80.49  ** ##
 <MACRON>                    AF   BC   BC   A1   C2.AF    80.56  ##
 ]                           5D   BB   BD   BD   5D       BD     **
 <ACUTE ACCENT>              B4   BE   BE   BE   C2.B4    80.63
 <MULTIPLICATION SIGN>       D7   BF   BF   BF   C3.97    8A.66
 {                           7B   C0   C0   FB   7B       C0     ##
 A                           41   C1   C1   C1   41       C1
 B                           42   C2   C2   C2   42       C2
 C                           43   C3   C3   C3   43       C3
 D                           44   C4   C4   C4   44       C4
 E                           45   C5   C5   C5   45       C5
 F                           46   C6   C6   C6   46       C6
 G                           47   C7   C7   C7   47       C7
 H                           48   C8   C8   C8   48       C8
 I                           49   C9   C9   C9   49       C9
 <SOFT HYPHEN>               AD   CA   CA   CA   C2.AD    80.54
 <o WITH CIRCUMFLEX>         F4   CB   CB   CB   C3.B4    8B.63
 <o WITH DIAERESIS>          F6   CC   CC   CC   C3.B6    8B.65
 <o WITH GRAVE>              F2   CD   CD   CD   C3.B2    8B.59
 <o WITH ACUTE>              F3   CE   CE   CE   C3.B3    8B.62
 <o WITH TILDE>              F5   CF   CF   CF   C3.B5    8B.64
 }                           7D   D0   D0   FD   7D       D0     ##
 J                           4A   D1   D1   D1   4A       D1
 K                           4B   D2   D2   D2   4B       D2
 L                           4C   D3   D3   D3   4C       D3
 M                           4D   D4   D4   D4   4D       D4
 N                           4E   D5   D5   D5   4E       D5
 O                           4F   D6   D6   D6   4F       D6
 P                           50   D7   D7   D7   50       D7
 Q                           51   D8   D8   D8   51       D8
 R                           52   D9   D9   D9   52       D9
 <SUPERSCRIPT ONE>           B9   DA   DA   DA   C2.B9    80.68
 <u WITH CIRCUMFLEX>         FB   DB   DB   DB   C3.BB    8B.6A
 <u WITH DIAERESIS>          FC   DC   DC   DC   C3.BC    8B.70
 <u WITH GRAVE>              F9   DD   DD   C0   C3.B9    8B.68  ##
 <u WITH ACUTE>              FA   DE   DE   DE   C3.BA    8B.69
 <y WITH DIAERESIS>          FF   DF   DF   DF   C3.BF    8B.73
 \                           5C   E0   E0   BC   5C       E0     ##
 <DIVISION SIGN>             F7   E1   E1   E1   C3.B7    8B.66
 S                           53   E2   E2   E2   53       E2
 T                           54   E3   E3   E3   54       E3
 U                           55   E4   E4   E4   55       E4
 V                           56   E5   E5   E5   56       E5
 W                           57   E6   E6   E6   57       E6
 X                           58   E7   E7   E7   58       E7
 Y                           59   E8   E8   E8   59       E8
 Z                           5A   E9   E9   E9   5A       E9
 <SUPERSCRIPT TWO>           B2   EA   EA   EA   C2.B2    80.59
 <O WITH CIRCUMFLEX>         D4   EB   EB   EB   C3.94    8A.63
 <O WITH DIAERESIS>          D6   EC   EC   EC   C3.96    8A.65
 <O WITH GRAVE>              D2   ED   ED   ED   C3.92    8A.59
 <O WITH ACUTE>              D3   EE   EE   EE   C3.93    8A.62
 <O WITH TILDE>              D5   EF   EF   EF   C3.95    8A.64
 0                           30   F0   F0   F0   30       F0
 1                           31   F1   F1   F1   31       F1
 2                           32   F2   F2   F2   32       F2
 3                           33   F3   F3   F3   33       F3
 4                           34   F4   F4   F4   34       F4
 5                           35   F5   F5   F5   35       F5
 6                           36   F6   F6   F6   36       F6
 7                           37   F7   F7   F7   37       F7
 8                           38   F8   F8   F8   38       F8
 9                           39   F9   F9   F9   39       F9
 <SUPERSCRIPT THREE>         B3   FA   FA   FA   C2.B3    80.62
 <U WITH CIRCUMFLEX>         DB   FB   FB   DD   C3.9B    8A.6A  ##
 <U WITH DIAERESIS>          DC   FC   FC   FC   C3.9C    8A.70
 <U WITH GRAVE>              D9   FD   FD   E0   C3.99    8A.68  ##
 <U WITH ACUTE>              DA   FE   FE   FE   C3.9A    8A.69
 <APC>                       9F   FF   FF   5F   C2.9F    FF     ##

=head1 IDENTIFYING CHARACTER CODE SETS

It is possible to determine which character set you are operating under.
But first be very sure that you need to do this.  Your
code will be simpler and probably just as portable if you don't have to
test the character set and branch on the result.  There are
actually only very few circumstances where it's not easy to write
straight-line code portable to all character sets.  See
L<perluniintro/Unicode and EBCDIC> for how to portably specify
characters.

But there are some cases where you may want to know which character set
you are running under.  One possible example is doing
L<sorting|/SORTING> in inner loops where performance is critical.

To determine if you are running under ASCII or EBCDIC, you can use the
return value of C<ord()> or C<chr()> to test one or more character
values.  For example:

    $is_ascii  = "A" eq chr(65);
    $is_ebcdic = "A" eq chr(193);
    $is_ascii  = ord("A") == 65;
    $is_ebcdic = ord("A") == 193;

There's even less need to distinguish between EBCDIC code pages, but to
do so try looking at one or more of the characters that differ between
them.

    $is_ascii           = ord('[') == 91;
    $is_ebcdic_37       = ord('[') == 186;
    $is_ebcdic_1047     = ord('[') == 173;
    $is_ebcdic_POSIX_BC = ord('[') == 187;

However, it would be unwise to write tests such as:

    $is_ascii = "\r" ne chr(13);  #  WRONG
    $is_ascii = "\n" ne chr(10);  #  ILL ADVISED

Obviously the first of these will fail to distinguish most ASCII
platforms from either a CCSID 0037, a 1047, or a POSIX-BC EBCDIC
platform since S<C<"\r" eq chr(13)>> under all of those coded character
sets.  But note too that because C<"\n"> is C<chr(13)> and C<"\r"> is
C<chr(10)> on old Macintosh (which is an ASCII platform) the second
C<$is_ascii> test will lead to trouble there.

To determine whether or not perl was built under an EBCDIC
code page you can use the Config module like so:

    use Config;
    $is_ebcdic = $Config{'ebcdic'} eq 'define';
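
The C<ord('[')> tests shown earlier can be folded into a single lookup.
What follows is only a sketch: the name C<native_code_set()> and the
returned labels are hypothetical, and the ordinals are simply the ones
listed above for C<'['>.

    use strict;
    use warnings;

    # Hypothetical helper: name the native code set from ord('[').
    sub native_code_set {
        my %by_left_bracket = (
             91 => 'ASCII',
            186 => 'EBCDIC 0037',
            173 => 'EBCDIC 1047',
            187 => 'EBCDIC POSIX-BC',
        );
        my $name = $by_left_bracket{ ord('[') };
        return defined $name ? $name : 'unknown';
    }

    print native_code_set(), "\n";

Any other code page falls through to C<'unknown'> rather than being
misreported.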

=head1 CONVERSIONS

=head2 C<utf8::unicode_to_native()> and C<utf8::native_to_unicode()>

These functions take an input numeric code point in one encoding and
return what its equivalent value is in the other.

See L<utf8>.
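
As a brief illustration (a sketch, with variable names chosen only for
this example), the two functions are inverses of each other, so a code
point survives a round trip:

    # Unicode code point 0x41 is "A"; ask for its native ordinal.
    my $native = utf8::unicode_to_native(0x41);
    print chr($native), "\n";                   # prints "A"

    # Going the other way recovers the Unicode code point.
    my $unicode = utf8::native_to_unicode(ord("A"));
    printf "U+%04X\n", $unicode;                # prints "U+0041"

Neither call requires C<use utf8;>.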

=head2 tr///

In order to convert a string of characters from one character set to
another a simple list of numbers, such as in the right columns in the
above table, along with Perl's C<tr///> operator is all that is needed.
The data in the table are in ASCII/Latin1 order, hence the EBCDIC columns
provide easy-to-use ASCII/Latin1 to EBCDIC operations that are also easily
reversed.

For example, to convert ASCII/Latin1 to code page 037 take the output of the
second numbers column from the output of recipe 2 (modified to add
C<"\"> characters), and use it in C<tr///> like so:

    $cp_037 =
    '\x00\x01\x02\x03\x37\x2D\x2E\x2F\x16\x05\x25\x0B\x0C\x0D\x0E\x0F' .
    '\x10\x11\x12\x13\x3C\x3D\x32\x26\x18\x19\x3F\x27\x1C\x1D\x1E\x1F' .
    '\x40\x5A\x7F\x7B\x5B\x6C\x50\x7D\x4D\x5D\x5C\x4E\x6B\x60\x4B\x61' .
    '\xF0\xF1\xF2\xF3\xF4\xF5\xF6\xF7\xF8\xF9\x7A\x5E\x4C\x7E\x6E\x6F' .
    '\x7C\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xD1\xD2\xD3\xD4\xD5\xD6' .
    '\xD7\xD8\xD9\xE2\xE3\xE4\xE5\xE6\xE7\xE8\xE9\xBA\xE0\xBB\xB0\x6D' .
    '\x79\x81\x82\x83\x84\x85\x86\x87\x88\x89\x91\x92\x93\x94\x95\x96' .
    '\x97\x98\x99\xA2\xA3\xA4\xA5\xA6\xA7\xA8\xA9\xC0\x4F\xD0\xA1\x07' .
    '\x20\x21\x22\x23\x24\x15\x06\x17\x28\x29\x2A\x2B\x2C\x09\x0A\x1B' .
    '\x30\x31\x1A\x33\x34\x35\x36\x08\x38\x39\x3A\x3B\x04\x14\x3E\xFF' .
    '\x41\xAA\x4A\xB1\x9F\xB2\x6A\xB5\xBD\xB4\x9A\x8A\x5F\xCA\xAF\xBC' .
    '\x90\x8F\xEA\xFA\xBE\xA0\xB6\xB3\x9D\xDA\x9B\x8B\xB7\xB8\xB9\xAB' .
    '\x64\x65\x62\x66\x63\x67\x9E\x68\x74\x71\x72\x73\x78\x75\x76\x77' .
    '\xAC\x69\xED\xEE\xEB\xEF\xEC\xBF\x80\xFD\xFE\xFB\xFC\xAD\xAE\x59' .
    '\x44\x45\x42\x46\x43\x47\x9C\x48\x54\x51\x52\x53\x58\x55\x56\x57' .
    '\x8C\x49\xCD\xCE\xCB\xCF\xCC\xE1\x70\xDD\xDE\xDB\xDC\x8D\x8E\xDF';

    my $ebcdic_string = $ascii_string;
    eval '$ebcdic_string =~ tr/\000-\377/' . $cp_037 . '/';

To convert from EBCDIC 037 to ASCII just reverse the order of the tr///
arguments like so:

    my $ascii_string = $ebcdic_string;
    eval '$ascii_string =~ tr/' . $cp_037 . '/\000-\377/';

Similarly one could take the output of the third numbers column from recipe 2
to obtain a C<$cp_1047> table.  The fourth numbers column of the output from
recipe 2 could provide a C<$cp_posix_bc> table suitable for transcoding as
well.

If you wanted to see the inverse tables, you would first have to sort on the
desired numbers column as in recipes 4, 5 or 6, then take the output of the
first numbers column.

=head2 iconv

XPG operability often implies the presence of an I<iconv> utility
available from the shell or from the C library.  Consult your system's
documentation for information on iconv.

On OS/390 or z/OS see the L<iconv(1)> manpage.  One way to invoke the C<iconv>
shell utility from within perl would be to:

    # OS/390 or z/OS example
    $ascii_data = `echo '$ebcdic_data'| iconv -f IBM-1047 -t ISO8859-1`;

or the inverse map:

    # OS/390 or z/OS example
    $ebcdic_data = `echo '$ascii_data'| iconv -f ISO8859-1 -t IBM-1047`;

For other Perl-based conversion options see the C<Convert::*> modules on CPAN.
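
One option that is bundled with perl is the L<Encode> module, whose
C<Encode::EBCDIC> component provides the C<cp37>, C<cp1047>, and
C<posix-bc> encodings.  The following is a sketch for an ASCII platform;
the variable names are illustrative, and how C<Encode> behaves on a perl
built for EBCDIC is not covered here.

    use strict;
    use warnings;
    use Encode qw(encode decode);

    my $text        = "Hello [world]";
    my $cp1047_data = encode("cp1047", $text);         # to EBCDIC bytes
    my $round_trip  = decode("cp1047", $cp1047_data);  # and back again

    print unpack("H*", $cp1047_data), "\n";   # the EBCDIC bytes, in hex
    print "round trip ok\n" if $round_trip eq $text;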

=head2 C RTL

The OS/390 and z/OS C run-time libraries provide C<_atoe()> and C<_etoa()> functions.

=head1 OPERATOR DIFFERENCES

The C<..> range operator treats certain character ranges with
care on EBCDIC platforms.  For example the following array
will have twenty six elements on either an EBCDIC platform
or an ASCII platform:

    @alphabet = ('A'..'Z');   #  $#alphabet == 25

The bitwise operators such as & ^ | may return different results
when operating on string or character data in a Perl program running
on an EBCDIC platform than when run on an ASCII platform.  Here is
an example adapted from the one in L<perlop>:

    # EBCDIC-based examples
    print "j p \n" ^ " a h";                      # prints "JAPH\n"
    print "JA" | "  ph\n";                        # prints "japh\n"
    print "JAPH\nJunk" & "\277\277\277\277\277";  # prints "japh\n";
    print 'p N$' ^ " E<H\n";                      # prints "Perl\n";

An interesting property of the 32 C0 control characters
in the ASCII table is that they can "literally" be constructed
as control characters in Perl, e.g. C<(chr(0) eq "\c@")>,
C<(chr(1) eq "\cA")>, and so on.  Perl on EBCDIC platforms has been
ported to take C<\c@> to C<chr(0)> and C<\cA> to C<chr(1)>, etc. as well, but the
characters that result depend on which code page you are
using.  The table below uses the standard acronyms for the controls.
The POSIX-BC and 1047 sets are
identical throughout this range and differ from the 0037 set at only
one spot (21 decimal).  Note that the line terminator character
may be generated by C<\cJ> on ASCII platforms but by C<\cU> on 1047 or POSIX-BC
platforms and cannot be generated as a C<"\c.letter."> control character on
0037 platforms.  Note also that C<\c\> cannot be the final element in a string
or regex, as it will absorb the terminator.   But C<\c\I<X>> is a C<FILE
SEPARATOR> concatenated with I<X> for all I<X>.
The outlier C<\c?> on ASCII, which yields a non-C0 control C<DEL>,
yields the outlier control C<APC> on EBCDIC, the one that isn't in the
block of contiguous controls.  Note that a subtlety of this is that
C<\c?> on ASCII platforms is an ASCII character, while it isn't
equivalent to any ASCII character in EBCDIC platforms.

 chr   ord   8859-1    0037         1047 and POSIX-BC
 -----------------------------------------------------------------------
 \c@     0   <NUL>     <NUL>        <NUL>
 \cA     1   <SOH>     <SOH>        <SOH>
 \cB     2   <STX>     <STX>        <STX>
 \cC     3   <ETX>     <ETX>        <ETX>
 \cD     4   <EOT>     <ST>         <ST>
 \cE     5   <ENQ>     <HT>         <HT>
 \cF     6   <ACK>     <SSA>        <SSA>
 \cG     7   <BEL>     <DEL>        <DEL>
 \cH     8   <BS>      <EPA>        <EPA>
 \cI     9   <HT>      <RI>         <RI>
 \cJ    10   <LF>      <SS2>        <SS2>
 \cK    11   <VT>      <VT>         <VT>
 \cL    12   <FF>      <FF>         <FF>
 \cM    13   <CR>      <CR>         <CR>
 \cN    14   <SO>      <SO>         <SO>
 \cO    15   <SI>      <SI>         <SI>
 \cP    16   <DLE>     <DLE>        <DLE>
 \cQ    17   <DC1>     <DC1>        <DC1>
 \cR    18   <DC2>     <DC2>        <DC2>
 \cS    19   <DC3>     <DC3>        <DC3>
 \cT    20   <DC4>     <OSC>        <OSC>
 \cU    21   <NAK>     <NEL>        <LF>              **
 \cV    22   <SYN>     <BS>         <BS>
 \cW    23   <ETB>     <ESA>        <ESA>
 \cX    24   <CAN>     <CAN>        <CAN>
 \cY    25   <EOM>     <EOM>        <EOM>
 \cZ    26   <SUB>     <PU2>        <PU2>
 \c[    27   <ESC>     <SS3>        <SS3>
 \c\X   28   <FS>X     <FS>X        <FS>X
 \c]    29   <GS>      <GS>         <GS>
 \c^    30   <RS>      <RS>         <RS>
 \c_    31   <US>      <US>         <US>
 \c?    *    <DEL>     <APC>        <APC>

C<*> Note: C<\c?> maps to ordinal 127 (C<DEL>) on ASCII platforms, but
since ordinal 127 is not a control character on EBCDIC machines,
C<\c?> instead maps on them to C<APC>, which is 255 in 0037 and 1047,
and 95 in POSIX-BC.

=head1 FUNCTION DIFFERENCES

=over 8

=item C<chr()>

C<chr()> must be given an EBCDIC code number argument to yield a desired
character return value on an EBCDIC platform.  For example:

    $CAPITAL_LETTER_A = chr(193);

=item C<ord()>

C<ord()> will return EBCDIC code number values on an EBCDIC platform.
For example:

    $the_number_193 = ord("A");

=item C<pack()>


The C<"c"> and C<"C"> templates for C<pack()> are dependent upon character set
encoding.  Examples of usage on EBCDIC include:

    $foo = pack("CCCC",193,194,195,196);
    # $foo eq "ABCD"
    $foo = pack("C4",193,194,195,196);
    # same thing

    $foo = pack("ccxxcc",193,194,195,196);
    # $foo eq "AB\0\0CD"

The C<"U"> template has been ported to mean "Unicode" on all platforms so
that

    pack("U", 65) eq 'A'

is true on all platforms.  If you want native code points for the low
256, use the C<"W"> template.  This means that the equivalences

    pack("W", ord($character)) eq $character
    unpack("W", $character) == ord $character

will hold.

=item C<print()>

One must be careful with scalars and strings that are passed to
print that contain ASCII encodings.  One common place
for this to occur is in the output of the MIME type header for
CGI script writing.  For example, many Perl programming guides
recommend something similar to:

    print "Content-type:\ttext/html\015\012\015\012";
    # this may be wrong on EBCDIC

You can instead write

    print "Content-type:\ttext/html\r\n\r\n"; # OK for DGW et al

and have it work portably.

That is because the translation from EBCDIC to ASCII is done
by the web server in this case.  Consult your web server's documentation for
further details.

=item C<printf()>

The formats that can convert characters to numbers and vice versa
will be different from their ASCII counterparts when executed
on an EBCDIC platform.  Examples include:

    printf("%c%c%c",193,194,195);  # prints ABC

=item C<sort()>

EBCDIC sort results may differ from ASCII sort results especially for
mixed case strings.  This is discussed in more detail L<below|/SORTING>.

=item C<sprintf()>

See the discussion of C<L</printf()>> above.  An example of the use
of sprintf would be:

    $CAPITAL_LETTER_A = sprintf("%c",193);

=item C<unpack()>

See the discussion of C<L</pack()>> above.

=back

Note that it is possible to write portable code for these by specifying
things in Unicode numbers, and using a conversion function:

    printf("%c",utf8::unicode_to_native(65));  # prints A on all
                                               # platforms
    print utf8::native_to_unicode(ord("A"));   # Likewise, prints 65

See L<perluniintro/Unicode and EBCDIC> and L</CONVERSIONS>
for other options.

=head1 REGULAR EXPRESSION DIFFERENCES

You can write your regular expressions just like someone on an ASCII
platform would do.  But keep in mind that using octal or hex notation to
specify a particular code point will give you the character that the
EBCDIC code page natively maps to it.   (This is also true of all
double-quoted strings.)  If you want to write portably, just use the
C<\N{U+...}> notation everywhere where you would have used C<\x{...}>,
and don't use octal notation at all.

Starting in Perl v5.22, this applies to ranges in bracketed character
classes.  If you say, for example, C<qr/[\N{U+20}-\N{U+7E}]/>, it means
the characters C<\N{U+20}>, C<\N{U+21}>, ..., C<\N{U+7E}>.  This range
is all the printable characters that the ASCII character set contains.

Prior to v5.22, you couldn't specify any ranges portably, with one
exception: starting in Perl v5.5.3, all subsets of the C<[A-Z]> and
C<[a-z]> ranges are specially coded to not pick up gap characters.  For
example,
characters such as "E<ocirc>" (C<o WITH CIRCUMFLEX>) that lie between
"I" and "J" would not be matched by the regular expression range
C</[H-K]/>.  But if either of the range end points is explicitly numeric
(and neither is specified by C<\N{U+...}>), the gap characters are
matched:

    /[\x89-\x91]/

will match C<\x8e>, even though C<\x89> is "i" and C<\x91> is "j",
and C<\x8e> is a gap character, from the alphabetic viewpoint.
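
One way to see the difference is to count how many of the 256 native
characters each kind of range matches.  This is only a sketch; the
variable names are made up for the illustration.

    use strict;
    use warnings;

    # A letter range picks up only the letters, whatever the code page.
    my @by_letter = grep { /[h-k]/ } map { chr($_) } 0 .. 255;
    print scalar(@by_letter), "\n";   # 4: h, i, j, k

    # A numeric range matches every code point between its end points.
    my @by_number = grep { /[\x89-\x91]/ } map { chr($_) } 0 .. 255;
    print scalar(@by_number), "\n";   # 9 code points; on EBCDIC only two
                                      # of them ("i" and "j") are letters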

Another construct to be wary of is the inappropriate use of hex (unless
you use C<\N{U+...}>) or
octal constants in regular expressions.  Consider the following
set of subs:

    sub is_c0 {
        my $char = substr(shift,0,1);
        $char =~ /[\000-\037]/;
    }

    sub is_print_ascii {
        my $char = substr(shift,0,1);
        $char =~ /[\040-\176]/;
    }

    sub is_delete {
        my $char = substr(shift,0,1);
        $char eq "\177";
    }

    sub is_c1 {
        my $char = substr(shift,0,1);
        $char =~ /[\200-\237]/;
    }

    sub is_latin_1 {    # But not ASCII; not C1
        my $char = substr(shift,0,1);
        $char =~ /[\240-\377]/;
    }

These are valid only on ASCII platforms.  Starting in Perl v5.22, simply
changing the octal constants to equivalent C<\N{U+...}> values makes
them portable:

    sub is_c0 {
        my $char = substr(shift,0,1);
        $char =~ /[\N{U+00}-\N{U+1F}]/;
    }

    sub is_print_ascii {
        my $char = substr(shift,0,1);
        $char =~ /[\N{U+20}-\N{U+7E}]/;
    }

    sub is_delete {
        my $char = substr(shift,0,1);
        $char eq "\N{U+7F}";
    }

    sub is_c1 {
        my $char = substr(shift,0,1);
        $char =~ /[\N{U+80}-\N{U+9F}]/;
    }

    sub is_latin_1 {    # But not ASCII; not C1
        my $char = substr(shift,0,1);
        $char =~ /[\N{U+A0}-\N{U+FF}]/;
    }

And here are some alternative portable ways to write them:

    sub Is_c0 {
        my $char = substr(shift,0,1);
        return $char =~ /[[:cntrl:]]/a && ! Is_delete($char);

        # Alternatively:
        # return $char =~ /[[:cntrl:]]/
        #        && $char =~ /[[:ascii:]]/
        #        && ! Is_delete($char);
    }

    sub Is_print_ascii {
        my $char = substr(shift,0,1);

        return $char =~ /[[:print:]]/a;

        # Alternatively:
        # return $char =~ /[[:print:]]/ && $char =~ /[[:ascii:]]/;

        # Or
        # return $char
        #      =~ /[ !"\#\$%&'()*+,\-.\/0-9:;<=>?\@A-Z[\\\]^_`a-z{|}~]/;
    }

    sub Is_delete {
        my $char = substr(shift,0,1);
        return utf8::native_to_unicode(ord $char) == 0x7F;
    }

    sub Is_c1 {
        use feature 'unicode_strings';
        my $char = substr(shift,0,1);
        return $char =~ /[[:cntrl:]]/ && $char !~ /[[:ascii:]]/;
    }

    sub Is_latin_1 {    # But not ASCII; not C1
        use feature 'unicode_strings';
        my $char = substr(shift,0,1);
        return ord($char) < 256
               && $char !~ /[[:ascii:]]/
               && $char !~ /[[:cntrl:]]/;
    }

Another way to write C<Is_latin_1()> would be
to use the characters in the range explicitly:

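    sub Is_latin_1 {
        my $char = substr(shift,0,1);
        # NO-BREAK SPACE and SOFT HYPHEN are invisible, so they are
        # written as escapes.  The classes are alternatives.
        $char =~ /[\N{U+A0}¡¢£¤¥¦§¨©ª«¬\N{U+AD}®¯°±²³´µ¶·¸¹º»¼½¾¿]
                 |[ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞß]
                 |[àáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ]/x;
    }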

That form may run into trouble in network transit (due to the
presence of 8 bit characters) or on non ISO-Latin character sets.  But
it does allow C<Is_c1> to be rewritten so it works on Perls that don't
have C<'unicode_strings'> (earlier than v5.14):

    sub Is_c1 {    # Neither ASCII nor a Latin-1 printable
        my $char = substr(shift,0,1);
        return ord($char) < 256
               && $char !~ /[[:ascii:]]/
               && ! Is_latin_1($char);
    }

=head1 SOCKETS

Most socket programming assumes ASCII character encodings in network
byte order.  Exceptions can include CGI script writing under a
host web server where the server may take care of translation for you.
Most host web servers convert EBCDIC data to ISO-8859-1 or Unicode on
output.

=head1 SORTING

One big difference between ASCII-based character sets and EBCDIC ones
is the relative positions of the characters when sorted in native
order.  Of most concern are the upper- and lowercase letters, the
digits, and the underscore (C<"_">).  On ASCII platforms the native sort
order has the digits come before the uppercase letters which come before
the underscore which comes before the lowercase letters.  On EBCDIC, the
underscore comes first, then the lowercase letters, then the uppercase
ones, and the digits last.  If sorted on an ASCII-based platform, the
two-letter abbreviation for a physician comes before the two-letter
abbreviation for drive; that is:

 @sorted = sort(qw(Dr. dr.));  # @sorted holds ('Dr.','dr.') on ASCII,
                               # but ('dr.','Dr.') on EBCDIC

The property of lowercase before uppercase letters in EBCDIC is
even carried to the Latin 1 EBCDIC pages such as 0037 and 1047.
An example would be that "E<Euml>" (C<E WITH DIAERESIS>, 203) comes
before "E<euml>" (C<e WITH DIAERESIS>, 235) on an ASCII platform, but
the latter (83) comes before the former (115) on an EBCDIC platform.
(Astute readers will note that the uppercase version of "E<szlig>"
(C<SMALL LETTER SHARP S>) is simply "SS" and that the uppercase versions
of "E<yuml>" (small C<y WITH DIAERESIS>) and "E<micro>" (C<MICRO SIGN>)
are not in the 0..255 range but are in Unicode, in a Unicode enabled
Perl).

The sort order will cause differences between results obtained on
ASCII platforms versus EBCDIC platforms.  What follows are some suggestions
on how to deal with these differences.

=head2 Ignore ASCII vs. EBCDIC sort differences.

This is the least computationally expensive strategy.  It may require
some user education.

=head2 Use a sort helper function

This is completely general, but the most computationally expensive
strategy.  Choose one or the other character set and transform to that
for every sort comparison.  Here's a complete example that transforms
to ASCII sort order:

 sub native_to_uni($) {
    my $string = shift;

    # Saves time on an ASCII platform
    return $string if ord 'A' ==  65;

    my $output = "";
    for my $i (0 .. length($string) - 1) {
        $output
           .= chr(utf8::native_to_unicode(ord(substr($string, $i, 1))));
    }

    # Preserve utf8ness of input onto the output, even if it didn't need
    # to be utf8
    utf8::upgrade($output) if utf8::is_utf8($string);

    return $output;
 }

 sub ascii_order {   # Sort helper
    return native_to_uni($a) cmp native_to_uni($b);
 }

 sort ascii_order @list;
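
The helper above converts both operands on every comparison.  One way to
reduce that cost is to compute each sort key once and then sort on the
stored keys (the "Schwartzian transform").  The following is a minimal
sketch of that idea, not part of the original recipe; the name
C<order_key> and the sample list are assumptions made for illustration:

    # Hypothetical helper: map a native string to a key that sorts in
    # ASCII/Unicode order.  On an ASCII platform this is the identity.
    sub order_key {
        my ($string) = @_;
        return $string if ord('A') == 65;
        return join '',
               map { chr utf8::native_to_unicode(ord($_)) }
               split //, $string;
    }

    my @list = qw(dr. Dr. _x 9z);

    # Decorate each element with its key, sort on the key, undecorate.
    my @sorted = map  { $_->[1] }
                 sort { $a->[0] cmp $b->[0] }
                 map  { [ order_key($_), $_ ] } @list;

    print "@sorted\n";

The key is computed once per element rather than twice per comparison,
at the price of holding one extra string per element in memory.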

=head2 MONO CASE then sort data (for non-digits, non-underscore)

If you don't care about where digits and underscore sort to, you can do
something like this

 sub case_insensitive_order {   # Sort helper
    return lc($a) cmp lc($b)
 }

 sort case_insensitive_order @list;

If performance is an issue, and you don't care if the output is in the
same case as the input, use C<tr///> to transform to the case most
employed within the data.  If the data are primarily UPPERCASE
non-Latin1, then apply C<tr/[a-z]/[A-Z]/>, and then C<sort()>.  If the
data are primarily lowercase non-Latin1 then apply C<tr/[A-Z]/[a-z]/>
before sorting.  If the data are primarily UPPERCASE and include Latin-1
characters then apply:

   tr/[a-z]/[A-Z]/;
   tr/[àáâãäåæçèéêëìíîïðñòóôõöøùúûüýþ]/[ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ]/;
   s/ß/SS/g;

then C<sort()>.  If you have a choice, it's better to lowercase things
to avoid the problems of the two Latin-1 characters whose uppercase is
outside Latin-1: "E<yuml>" (small C<y WITH DIAERESIS>) and "E<micro>"
(C<MICRO SIGN>).  If you do need to uppercase, you can; with a
Unicode-enabled Perl, do:

    tr/ÿ/\x{178}/;
    tr/µ/\x{39C}/;

=head2 Perform sorting on one type of platform only.

This strategy can employ a network connection.  As such
it would be computationally expensive.

=head1 TRANSFORMATION FORMATS

There are a variety of ways of transforming data with an intra character set
mapping that serve a variety of purposes.  Sorting was discussed in the
previous section and a few of the other more popular mapping techniques are
discussed next.

=head2 URL decoding and encoding

Note that some URLs have hexadecimal ASCII code points in them in an
attempt to overcome character or protocol limitation issues.  For example
the tilde character is not on every keyboard hence a URL of the form:

    http://www.pvhp.com/~pvhp/

may also be expressed as either of:

    http://www.pvhp.com/%7Epvhp/

    http://www.pvhp.com/%7epvhp/

where 7E is the hexadecimal ASCII code point for "~".  Here is an example
of decoding such a URL in any EBCDIC code page:

    $url = 'http://www.pvhp.com/%7Epvhp/';
    $url =~ s/%([0-9a-fA-F]{2})/
              pack("C",utf8::unicode_to_native(hex($1)))/xge;

Conversely, here is a partial solution for the task of encoding such
a URL in any EBCDIC code page:

    $url = 'http://www.pvhp.com/~pvhp/';
    # The following regular expression does not address the
    # mappings for: ('.' => '%2E', '/' => '%2F', ':' => '%3A')
    $url =~ s/([\t "#%&\(\),;<=>\?\@\[\\\]^`{|}~])/
               sprintf("%%%02X",utf8::native_to_unicode(ord($1)))/xge;

where a more complete solution would split the URL into components
and apply a full s/// substitution only to the appropriate parts.
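
As a rough sketch of that component-wise approach, the following splits
off the scheme and host with a simple pattern and encodes only the path
segments.  The helper name C<encode_segment>, the sample URL, and the
choice to leave only the RFC 3986 "unreserved" characters unescaped are
assumptions made for illustration; the sketch handles single-byte
characters only and ignores query strings and fragments:

    # Hypothetical helper: percent-encode one path segment.
    sub encode_segment {
        my ($segment) = @_;
        # Escape everything except letters, digits, "-", ".", "_", "~".
        $segment =~ s/([^A-Za-z0-9\-._~])/
                      sprintf("%%%02X",
                              utf8::native_to_unicode(ord($1)))/xge;
        return $segment;
    }

    my $url = 'http://www.pvhp.com/~pvhp/my file.html';

    # Separate "scheme://host" from the path; the "/" separators
    # themselves are never encoded.
    my ($prefix, $path) = $url =~ m{^(\w+://[^/]+)(/.*)?$};
    $path = '' unless defined $path;

    my $encoded = $prefix
                . join('/', map { encode_segment($_) }
                            split(m{/}, $path, -1));

    print "$encoded\n";

Because the hexadecimal digits are produced from
C<utf8::native_to_unicode()>, the result is the same on ASCII and EBCDIC
platforms.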

=head2 uu encoding and decoding

The C<u> template to C<pack()> or C<unpack()> will render EBCDIC data in
EBCDIC characters equivalent to their ASCII counterparts.  For example,
the following will print "Yes indeed\n" on either an ASCII or EBCDIC
computer:

    $all_byte_chrs = '';
    for (0..255) { $all_byte_chrs .= chr($_); }
    $uuencode_byte_chrs = pack('u', $all_byte_chrs);
    ($uu = <<'ENDOFHEREDOC') =~ s/^\s*//gm;
    M``$"`P0%!@<("0H+#`T.#Q`1$A,4%187&!D:&QP='A\@(2(C)"4F)R@I*BLL
    M+2XO,#$R,S0U-C<X.3H[/#T^/T!!0D-$149'2$E*2TQ-3D]045)35%565UA9
    M6EM<75Y?8&%B8V1E9F=H:6IK;&UN;W!Q<G-T=79W>'EZ>WQ]?G^`@8*#A(6&
    MAXB)BHN,C8Z/D)&2DY25EI>8F9J;G)V>GZ"AHJ.DI::GJ*FJJZRMKJ^PL;*S
    MM+6VM[BYNKN\O;Z_P,'"P\3%QL?(R<K+S,W.S]#1TM/4U=;7V-G:V]S=WM_@
    ?X>+CY.7FY^CIZNOL[>[O\/'R\_3U]O?X^?K[_/W^_P``
    ENDOFHEREDOC
    if ($uuencode_byte_chrs eq $uu) {
        print "Yes ";
    }
    $uudecode_byte_chrs = unpack('u', $uuencode_byte_chrs);
    if ($uudecode_byte_chrs eq $all_byte_chrs) {
        print "indeed\n";
    }

Here is a very spartan uudecoder that will work on EBCDIC:

    #!/usr/local/bin/perl
    $_ = <> until ($mode,$file) = /^begin\s*(\d*)\s*(\S*)/;
    open(OUT, "> $file") if $file ne "";
    while(<>) {
        last if /^end/;
        next if /[a-z]/;
        next unless int((((utf8::native_to_unicode(ord()) - 32 ) & 077)
                                                               + 2) / 3)
                    == int(length() / 4);
        print OUT unpack("u", $_);
    }
    close(OUT);
    chmod oct($mode), $file;


=head2 Quoted-Printable encoding and decoding

On ASCII-encoded platforms it is possible to strip characters outside of
the printable set using:

    # This QP encoder works on ASCII only
    $qp_string =~ s/([=\x00-\x1F\x80-\xFF])/
                    sprintf("=%02X",ord($1))/xge;

Starting in Perl v5.22, this is trivially changeable to work portably on
both ASCII and EBCDIC platforms.

    # This QP encoder works on both ASCII and EBCDIC
    $qp_string =~ s/([=\N{U+00}-\N{U+1F}\N{U+80}-\N{U+FF}])/
                    sprintf("=%02X",ord($1))/xge;

For earlier Perls, a QP encoder that works on both ASCII and EBCDIC
platforms would look somewhat like the following:

    $delete = utf8::unicode_to_native(ord("\x7F"));
    $qp_string =~
      s/([^[:print:]$delete])/
         sprintf("=%02X",utf8::native_to_unicode(ord($1)))/xage;

(although in production code the substitutions might be done
in the EBCDIC branch with the function call and separately in the
ASCII branch without the expense of the identity map; in Perl v5.22, the
identity map is optimized out so there is no expense, but the
alternative above is simpler and is also available in v5.22).

Such QP strings can be decoded with:

    # This QP decoder is limited to ASCII only
    $string =~ s/=([[:xdigit:]]{2})/chr hex $1/ge;
    $string =~ s/=[\n\r]+$//;

Whereas a QP decoder that works on both ASCII and EBCDIC platforms
would look somewhat like the following:

    $string =~ s/=([[:xdigit:]]{2})/
                                chr utf8::unicode_to_native(hex $1)/xge;
    $string =~ s/=[\n\r]+$//;

=head2 Caesarean ciphers

The practice of shifting an alphabet one or more characters for encipherment
dates back thousands of years and was explicitly detailed by Gaius Julius
Caesar in his B<Gallic Wars> text.  A single alphabet shift is sometimes
referred to as a rotation and the shift amount is given as a number $n after
the string 'rot' or "rot$n".  Rot0 and rot26 would designate identity maps
on the 26-letter English version of the Latin alphabet.  Rot13 has the
interesting property that alternate subsequent invocations are identity maps
(thus rot13 is its own non-trivial inverse in the group of 26 alphabet
rotations).  Hence the following is a rot13 encoder and decoder that will
work on ASCII and EBCDIC platforms:

    #!/usr/local/bin/perl

    while(<>){
        tr/n-za-mN-ZA-M/a-zA-Z/;
        print;
    }

In one-liner form:

    perl -ne 'tr/n-za-mN-ZA-M/a-zA-Z/;print'
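
Since C<tr///> builds its translation table at compile time, the shift
amount above is fixed.  For an arbitrary rotation, one possible sketch
looks characters up by their position in a literal alphabet string, so
that it does not depend on native code point values.  The function name
C<rot_n> is an assumption made for illustration:

    # Hypothetical helper: rotate the 26 English letters by $n places.
    sub rot_n {
        my ($text, $n) = @_;
        my $lower = 'abcdefghijklmnopqrstuvwxyz';
        my $upper = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
        $n %= 26;                      # also normalizes negative shifts
        my $out = '';
        for my $ch (split //, $text) {
            my $i;
            if (($i = index($lower, $ch)) >= 0) {
                $ch = substr($lower, ($i + $n) % 26, 1);
            }
            elsif (($i = index($upper, $ch)) >= 0) {
                $ch = substr($upper, ($i + $n) % 26, 1);
            }
            $out .= $ch;               # non-letters pass through
        }
        return $out;
    }

    print rot_n("Hello, World!", 13), "\n";

Calling C<rot_n($text, 26 - $n)> undoes C<rot_n($text, $n)>.  This is
considerably slower than C<tr///> and is only worthwhile when the shift
is not known in advance.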


=head1 Hashing order and checksums

Perl deliberately randomizes hash order for security purposes on both
ASCII and EBCDIC platforms.

EBCDIC checksums will differ for the same file translated into ASCII
and vice versa.
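
Because hash order is randomized, code that needs reproducible output
on any platform generally has to sort the keys explicitly.  The
following small sketch uses a case-insensitive comparison so that the
result does not depend on the native ordering of upper- and lowercase
letters; the hash contents are made up for illustration:

    my %count = (apple => 3, Banana => 1, cherry => 2);

    # Never rely on the order returned by keys(); sort it.
    my @ordered = sort { lc($a) cmp lc($b) } keys %count;

    print "$_=$count{$_}\n" for @ordered;

Keys that contain digits or underscores still sort differently on ASCII
and EBCDIC platforms, as described under L</SORTING>.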

=head1 I18N AND L10N

Internationalization (I18N) and localization (L10N) are supported at least
in principle even on EBCDIC platforms.  The details are system-dependent
and discussed under the L</OS ISSUES> section below.

=head1 MULTI-OCTET CHARACTER SETS

Perl works with UTF-EBCDIC, a multi-byte encoding.  In Perls earlier
than v5.22, there may be various bugs in this regard.

Legacy multi byte EBCDIC code pages XXX.

=head1 OS ISSUES

There may be a few system-dependent issues
of concern to EBCDIC Perl programmers.

=head2 OS/400

=over 8

=item PASE

The PASE environment is a runtime environment for OS/400 that can run
executables built for PowerPC AIX in OS/400; see L<perlos400>.  PASE
is ASCII-based, not EBCDIC-based as the ILE is.

=item IFS access

XXX.

=back

=head2 OS/390, z/OS

Perl runs under Unix System Services or USS.

=over 8

=item C<sigaction>

Using C<SA_SIGINFO> can cause segmentation faults.

=item C<chcp>

B<chcp> is supported as a shell utility for displaying and changing
one's code page.  See also L<chcp(1)>.

=item dataset access

For sequential data set access try:

    my @ds_records = `cat //DSNAME`;

or:

    my @ds_records = `cat //'HLQ.DSNAME'`;

See also the OS390::Stdio module on CPAN.

=item C<iconv>

B<iconv> is supported as both a shell utility and a C RTL routine.
See also the L<iconv(1)> and L<iconv(3)> manual pages.

=item locales

Locales are supported.  There may be glitches when a locale is another
EBCDIC code page which has some of the
L<code-page variant characters|/The 13 variant characters> in other
positions.

There aren't currently any real UTF-8 locales, even though some locale
names contain the string "UTF-8".

See L<perllocale> for information on locales.  The L10N files
are in F</usr/nls/locale>.  C<$Config{d_setlocale}> is C<'define'> on
OS/390 or z/OS.

=back

=head2 POSIX-BC?

XXX.

=head1 BUGS

=over 4

=item *

Not all shells will allow multiple C<-e> string arguments to perl to
be concatenated together properly, as recipes 0, 2, 4, 5, and 6 in this
document might seem to imply.

=item *

There are a significant number of test failures in the CPAN modules
shipped with Perl v5.22 and 5.24.  These are only in modules not primarily
maintained by Perl 5 porters.  Some of these are failures in the tests
only: they don't realize that it is proper to get different results on
EBCDIC platforms.  And some of the failures are real bugs.  If you
compile and do a C<make test> on Perl, all tests on the C</cpan>
directory are skipped.

L<Encode> partially works.

=item *

In earlier Perl versions, when byte and character data were
concatenated, the new string was sometimes created by
decoding the byte strings as I<ISO 8859-1 (Latin-1)>, even if the
old Unicode string used EBCDIC.

=back

=head1 SEE ALSO

L<perllocale>, L<perlfunc>, L<perlunicode>, L<utf8>.

=head1 REFERENCES

L<http://anubis.dkuug.dk/i18n/charmaps>

L<http://www.unicode.org/>

L<http://www.unicode.org/unicode/reports/tr16/>

L<http://www.wps.com/projects/codes/>
B<ASCII: American Standard Code for Information Infiltration> Tom Jennings,
September 1999.

B<The Unicode Standard, Version 3.0> The Unicode Consortium, Lisa Moore ed.,
ISBN 0-201-61633-5, Addison Wesley Developers Press, February 2000.

B<CDRA: IBM - Character Data Representation Architecture -
Reference and Registry>, IBM SC09-2190-00, December 1996.

"Demystifying Character Sets", Andrea Vine, Multilingual Computing
& Technology, B<#26 Vol. 10 Issue 4>, August/September 1999;
ISSN 1523-0309; Multilingual Computing Inc. Sandpoint ID, USA.

B<Codes, Ciphers, and Other Cryptic and Clandestine Communication>
Fred B. Wrixon, ISBN 1-57912-040-7, Black Dog & Leventhal Publishers,
1998.

L<http://www.bobbemer.com/P-BIT.HTM>
B<IBM - EBCDIC and the P-bit; The biggest Computer Goof Ever> Robert Bemer.

=head1 HISTORY

15 April 2001: added UTF-8 and UTF-EBCDIC to main table, pvhp.

=head1 AUTHOR

Peter Prymmer pvhp@best.com wrote this in 1999 and 2000
with CCSID 0819 and 0037 help from Chris Leach and
AndrE<eacute> Pirard A.Pirard@ulg.ac.be as well as POSIX-BC
help from Thomas Dorner Thomas.Dorner@start.de.
Thanks also to Vickie Cooper, Philip Newton, William Raffloer, and
Joe Smith.  Trademarks, registered trademarks, service marks and
registered service marks used in this document are the property of
their respective owners.

Now maintained by Perl5 Porters.
perl5203delta.pod000064400000022260150344123440007540 0ustar00=encoding utf8

=head1 NAME

perl5203delta - what is new for perl v5.20.3

=head1 DESCRIPTION

This document describes differences between the 5.20.2 release and the 5.20.3
release.

If you are upgrading from an earlier release such as 5.20.1, first read
L<perl5202delta>, which describes differences between 5.20.1 and 5.20.2.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.20.2.  If any exist,
they are bugs, and we request that you submit a report.  See L</Reporting Bugs>
below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Errno> has been upgraded from version 1.20_05 to 1.20_06.

B<-P> is now added to the pre-processor command-line on GCC 5.  GCC 5 added
extra line directives, which broke parsing of error code definitions.
L<[perl #123784]|https://rt.perl.org/Ticket/Display.html?id=123784>

=item *

L<Module::CoreList> has been upgraded from version 5.20150214 to 5.20150822.

Updated to cover the latest releases of Perl.

=item *

L<perl5db.pl> has been upgraded from 1.44 to 1.44_01.

The debugger would cause an assertion failure.
L<[perl #124127]|https://rt.perl.org/Ticket/Display.html?id=124127>

=back

=head1 Documentation

=head2 Changes to Existing Documentation

=head3 L<perlfunc>

=over 4

=item *

Mention that L<C<study()>|perlfunc/study> is currently a no-op.

=back

=head3 L<perlguts>

=over 4

=item *

The OOK example has been updated to account for COW changes and a change in the
storage of the offset.

=back

=head3 L<perlhacktips>

=over 4

=item *

Documentation has been added illustrating the perils of assuming that the
contents of static memory pointed to by the return values of Perl wrappers for
C library functions don't change.

=back

=head3 L<perlpodspec>

=over 4

=item *

The specification of the POD language is changing so that the default encoding
of PODs that aren't in UTF-8 (unless otherwise indicated) is CP1252 instead of
ISO-8859-1 (Latin1).

=back

=head1 Utility Changes

=head2 L<h2ph>

=over 4

=item *

B<h2ph> now handles hexadecimal constants in the compiler's predefined macro
definitions, as visible in C<$Config{cppsymbols}>.
L<[perl #123784]|https://rt.perl.org/Ticket/Display.html?id=123784>

=back

=head1 Testing

=over 4

=item *

F<t/perf/taint.t> has been added to see if optimisations with taint issues are
keeping things fast.

=item *

F<t/porting/re_context.t> has been added to test that L<utf8> and its
dependencies only use the subset of the C<$1..$n> capture vars that
Perl_save_re_context() is hard-coded to localize, because that function has no
efficient way of determining at runtime what vars to localize.

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item Win32

=over 4

=item *

Previously, when compiling with a 64-bit Visual C++, every Perl XS module
(including CPAN ones) and Perl aware C file would unconditionally have around a
dozen warnings from F<hv_func.h>.  These warnings have been silenced.  GCC (all
bitness) and 32-bit Visual C++ were not affected.

=item *

B<miniperl.exe> is now built with B<-fno-strict-aliasing>, allowing 64-bit
builds to complete with GCC 4.8.
L<[perl #123976]|https://rt.perl.org/Ticket/Display.html?id=123976>

=back

=back

=head1 Selected Bug Fixes

=over 4

=item *

Repeated global pattern matches in scalar context on large tainted strings were
exponentially slow depending on the current match position in the string.
L<[perl #123202]|https://rt.perl.org/Ticket/Display.html?id=123202>

=item *

The original visible value of L<C<$E<sol>>|perlvar/$E<sol>> is now preserved
when it is set to an invalid value.  Previously if you set C<$/> to a reference
to an array, for example, perl would produce a runtime error and not set PL_rs,
but Perl code that checked C<$/> would see the array reference.
L<[perl #123218]|https://rt.perl.org/Ticket/Display.html?id=123218>

=item *

Perl 5.14.0 introduced a bug whereby C<eval { LABEL: }> would crash.  This has
been fixed.
L<[perl #123652]|https://rt.perl.org/Ticket/Display.html?id=123652>

=item *

Extending an array cloned from a parent thread could result in "Modification of
a read-only value attempted" errors when attempting to modify the new elements.
L<[perl #124127]|https://rt.perl.org/Ticket/Display.html?id=124127>

=item *

Several cases of data used to store environment variable contents in core C
code being potentially overwritten before being used have been fixed.
L<[perl #123748]|https://rt.perl.org/Ticket/Display.html?id=123748>

=item *

UTF-8 variable names used in array indexes, unquoted UTF-8 HERE-document
terminators and UTF-8 function names all now work correctly.
L<[perl #124113]|https://rt.perl.org/Ticket/Display.html?id=124113>

=item *

A subtle bug introduced in Perl 5.20.2 involving UTF-8 in regular expressions
and sometimes causing a crash has been fixed.  A new test script has been added
to test this fix; see under L</Testing>.
L<[perl #124109]|https://rt.perl.org/Ticket/Display.html?id=124109>

=item *

Some patterns starting with C</.*..../> matched against long strings have been
slow since Perl 5.8, and some of the form C</.*..../i> have been slow since
Perl 5.18.  They are now all fast again.
L<[perl #123743]|https://rt.perl.org/Ticket/Display.html?id=123743>

=item *

Warning fatality is now ignored when rewinding the stack.  This prevents
infinite recursion when the now fatal error also causes rewinding of the stack.
L<[perl #123398]|https://rt.perl.org/Ticket/Display.html?id=123398>

=item *

C<setpgrp($nonzero)> (with one argument) was accidentally changed in Perl 5.16
to mean C<setpgrp(0)>.  This has been fixed.

=item *

A crash with C<< %::=(); J->${\"::"} >> has been fixed.
L<[perl #125541]|https://rt.perl.org/Ticket/Display.html?id=125541>

=item *

A Perl 5.20 regression in regular expression possessive quantifiers has been
fixed.
C<qr/>I<PAT>C<{>I<min>,I<max>C<}+>C</> is supposed to behave identically to
C<qr/(?E<gt>>I<PAT>C<{>I<min>,I<max>C<})/>.  Since Perl 5.20, this didn't work
if I<min> and I<max> were equal.
L<[perl #125825]|https://rt.perl.org/Ticket/Display.html?id=125825>

=item *

Code like C</$a[/> used to read the next line of input and treat it as though
it came immediately after the opening bracket.  Some invalid code consequently
would parse and run, but some code caused crashes, so this is now disallowed.
L<[perl #123712]|https://rt.perl.org/Ticket/Display.html?id=123712>

=back

=head1 Acknowledgements

Perl 5.20.3 represents approximately 7 months of development since Perl 5.20.2
and contains approximately 3,200 lines of changes across 99 files from 26
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 1,500 lines of changes to 43 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.20.3:

Alex Vandiver, Andy Dougherty, Aristotle Pagaltzis, Chris 'BinGOs' Williams,
Craig A. Berry, Dagfinn Ilmari Mannsåker, Daniel Dragan, David Mitchell,
Father Chrysostomos, H.Merijn Brand, James E Keenan, James McCoy, Jarkko
Hietaniemi, Karen Etheridge, Karl Williamson, kmx, Lajos Veres, Lukas Mai,
Matthew Horsfall, Petr Písař, Randy Stauner, Ricardo Signes, Sawyer X, Steve
Hay, Tony Cook, Yves Orton.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
https://rt.perl.org/ .  There may also be information at
http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlthrtut.pod000064400000132573150344123440007500 0ustar00=encoding utf8

=head1 NAME

perlthrtut - Tutorial on threads in Perl

=head1 DESCRIPTION

This tutorial describes the use of Perl interpreter threads (sometimes
referred to as I<ithreads>).  In this
model, each thread runs in its own Perl interpreter, and any data sharing
between threads must be explicit.  The user-level interface for I<ithreads>
uses the L<threads> class.

B<NOTE>: There was another older Perl threading flavor called the 5.005 model
that used the L<Thread> class.  This old model was known to have problems, is
deprecated, and was removed for release 5.10.  You are
strongly encouraged to migrate any existing 5.005 threads code to the new
model as soon as possible.

You can see which (or neither) threading flavour you have by
running C<perl -V> and looking at the C<Platform> section.
If you have C<useithreads=define> you have ithreads, if you
have C<use5005threads=define> you have 5.005 threads.
If you have neither, you don't have any thread support built in.
If you have both, you are in trouble.

The L<threads> and L<threads::shared> modules are included in the core Perl
distribution.  Additionally, they are maintained as separate modules on
CPAN, so you can check there for any updates.

=head1 What Is A Thread Anyway?

A thread is a flow of control through a program with a single
execution point.

Sounds an awful lot like a process, doesn't it? Well, it should.
Threads are one of the pieces of a process.  Every process has at least
one thread and, up until now, every process running Perl had only one
thread.  With 5.8, though, you can create extra threads.  We're going
to show you how, when, and why.

=head1 Threaded Program Models

There are three basic ways that you can structure a threaded
program.  Which model you choose depends on what you need your program
to do.  For many non-trivial threaded programs, you'll need to choose
different models for different pieces of your program.

=head2 Boss/Worker

The boss/worker model usually has one I<boss> thread and one or more
I<worker> threads.  The boss thread gathers or generates tasks that need
to be done, then parcels those tasks out to the appropriate worker
thread.

This model is common in GUI and server programs, where a main thread
waits for some event and then passes that event to the appropriate
worker threads for processing.  Once the event has been passed on, the
boss thread goes back to waiting for another event.

The boss thread does relatively little work.  While tasks aren't
necessarily performed faster than with any other method, it tends to
have the best user-response times.
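
As a minimal sketch of this model (requiring a Perl built with ithreads),
the boss below hands numbers to three workers through a
L<Thread::Queue>.  The C<worker> subroutine, the doubling "task", and
the use of C<undef> as an end-of-work marker are assumptions made for
illustration:

    use threads;
    use Thread::Queue;

    my $queue = Thread::Queue->new();

    # Worker: take tasks until an undef marker arrives.
    sub worker {
        my @results;
        while (defined(my $task = $queue->dequeue())) {
            push @results, $task * 2;     # stand-in for real work
        }
        return @results;
    }

    # Create each worker in list context so that join() returns
    # the whole result list.
    my @workers;
    for (1 .. 3) {
        my ($thr) = threads->create(\&worker);
        push @workers, $thr;
    }

    # Boss: parcel out the tasks, then one marker per worker.
    $queue->enqueue($_)    for 1 .. 10;
    $queue->enqueue(undef) for @workers;

    my $total = 0;
    for my $thr (@workers) {
        $total += $_ for $thr->join();
    }
    print("Total of doubled tasks: $total\n");

Which worker handles which task varies from run to run, but every task
is handled exactly once, so the total is the same each time.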

=head2 Work Crew

In the work crew model, several threads are created that do
essentially the same thing to different pieces of data.  It closely
mirrors classical parallel processing and vector processors, where a
large array of processors do the exact same thing to many pieces of
data.

This model is particularly useful if the system running the program
will distribute multiple threads across different processors.  It can
also be useful in ray tracing or rendering engines, where the
individual threads can pass on interim results to give the user visual
feedback.

=head2 Pipeline

The pipeline model divides up a task into a series of steps, and
passes the results of one step on to the thread processing the
next.  Each thread does one thing to each piece of data and passes the
results to the next thread in line.

This model makes the most sense if you have multiple processors so two
or more threads will be executing in parallel, though it can often
make sense in other contexts as well.  It tends to keep the individual
tasks small and simple, as well as allowing some parts of the pipeline
to block (on I/O or system calls, for example) while other parts keep
going.  If you're running different parts of the pipeline on different
processors you may also take advantage of the caches on each
processor.

This model is also handy for a form of recursive programming where,
rather than having a subroutine call itself, it instead creates
another thread.  Prime and Fibonacci generators both map well to this
form of the pipeline model. (A version of a prime number generator is
presented later on.)

=head1 What kind of threads are Perl threads?

If you have experience with other thread implementations, you might
find that things aren't quite what you expect.  It's very important to
remember when dealing with Perl threads that I<Perl Threads Are Not X
Threads> for all values of X.  They aren't POSIX threads, or
DecThreads, or Java's Green threads, or Win32 threads.  There are
similarities, and the broad concepts are the same, but if you start
looking for implementation details you're going to be either
disappointed or confused.  Possibly both.

This is not to say that Perl threads are completely different from
everything that's ever come before. They're not.  Perl's threading
model owes a lot to other thread models, especially POSIX.  Just as
Perl is not C, though, Perl threads are not POSIX threads.  So if you
find yourself looking for mutexes, or thread priorities, it's time to
step back a bit and think about what you want to do and how Perl can
do it.

However, it is important to remember that Perl threads cannot magically
do things unless your operating system's threads allow it. So if your
system blocks the entire process on C<sleep()>, Perl usually will, as well.

B<Perl Threads Are Different.>

=head1 Thread-Safe Modules

The addition of threads has changed Perl's internals
substantially. There are implications for people who write
modules with XS code or external libraries. However, since Perl data is
not shared among threads by default, Perl modules stand a high chance of
being thread-safe or can be made thread-safe easily.  Modules that are not
tagged as thread-safe should be tested or code reviewed before being used
in production code.

Not all modules that you might use are thread-safe, and you should
always assume a module is unsafe unless the documentation says
otherwise.  This includes modules that are distributed as part of the
core.  Threads are a relatively new feature, and even some of the standard
modules aren't thread-safe.

Even if a module is thread-safe, it doesn't mean that the module is optimized
to work well with threads. A module could possibly be rewritten to utilize
the new features in threaded Perl to increase performance in a threaded
environment.

If you're using a module that's not thread-safe for some reason, you
can protect yourself by using it from one, and only one, thread.
If you need multiple threads to access such a module, you can use semaphores and
lots of programming discipline to control access to it.  Semaphores
are covered in L</"Basic semaphores">.

See also L</"Thread-Safety of System Libraries">.

=head1 Thread Basics

The L<threads> module provides the basic functions you need to write
threaded programs.  In the following sections, we'll cover the basics,
showing you what you need to do to create a threaded program.   After
that, we'll go over some of the features of the L<threads> module that
make threaded programming easier.

=head2 Basic Thread Support

Thread support is a Perl compile-time option. It's something that's
turned on or off when Perl is built at your site, rather than when
your programs are compiled. If your Perl wasn't compiled with thread
support enabled, then any attempt to use threads will fail.

Your programs can use the Config module to check whether threads are
enabled. If your program can't run without them, you can say something
like:

    use Config;
    $Config{useithreads} or
        die('Recompile Perl with threads to run this program.');

A possibly-threaded program using a possibly-threaded module might
have code like this:

    use Config;
    use MyMod;

    BEGIN {
        if ($Config{useithreads}) {
            # We have threads
            require MyMod_threaded;
            import MyMod_threaded;
        } else {
            require MyMod_unthreaded;
            import MyMod_unthreaded;
        }
    }

Since code that runs both with and without threads is usually pretty
messy, it's best to isolate the thread-specific code in its own
module.  In our example above, that's what C<MyMod_threaded> is, and it's
only imported if we're running on a threaded Perl.

=head2 A Note about the Examples

In a real situation, care should be taken that all threads are finished
executing before the program exits.  That care has B<not> been taken in these
examples in the interest of simplicity.  Running these examples I<as is> will
produce error messages, usually caused by the fact that there are still
threads running when the program exits.  You should not be alarmed by this.

=head2 Creating Threads

The L<threads> module provides the tools you need to create new
threads.  Like any other module, you need to tell Perl that you want to use
it; C<use threads;> imports all the pieces you need to create basic
threads.

The simplest, most straightforward way to create a thread is with C<create()>:

    use threads;

    my $thr = threads->create(\&sub1);

    sub sub1 {
        print("In the thread\n");
    }

The C<create()> method takes a reference to a subroutine and creates a new
thread that starts executing in the referenced subroutine.  Control
then passes both to the subroutine and the caller.

If you need to, your program can pass parameters to the subroutine as
part of the thread startup.  Just include the list of parameters as
part of the C<threads-E<gt>create()> call, like this:

    use threads;

    my $Param3 = 'foo';
    my $thr1 = threads->create(\&sub1, 'Param 1', 'Param 2', $Param3);
    my @ParamList = (42, 'Hello', 3.14);
    my $thr2 = threads->create(\&sub1, @ParamList);
    my $thr3 = threads->create(\&sub1, qw(Param1 Param2 Param3));

    sub sub1 {
        my @InboundParameters = @_;
        print("In the thread\n");
        print('Got parameters >', join('<>',@InboundParameters), "<\n");
    }

The last example illustrates another feature of threads.  You can spawn
off several threads using the same subroutine.  Each thread executes
the same subroutine, but in a separate thread with a separate
environment and potentially separate arguments.

C<new()> is a synonym for C<create()>.

=head2 Waiting For A Thread To Exit

Since threads are also subroutines, they can return values.  To wait
for a thread to exit and extract any values it might return, you can
use the C<join()> method:

    use threads;

    my ($thr) = threads->create(\&sub1);

    my @ReturnData = $thr->join();
    print('Thread returned ', join(', ', @ReturnData), "\n");

    sub sub1 { return ('Fifty-six', 'foo', 2); }

In the example above, the C<join()> method returns as soon as the thread
ends.  In addition to waiting for a thread to finish and gathering up
any values that the thread might have returned, C<join()> also performs
any OS cleanup necessary for the thread.  That cleanup might be
important, especially for long-running programs that spawn lots of
threads.  If you don't want the return values and don't want to wait
for the thread to finish, you should call the C<detach()> method
instead, as described next.

NOTE: In the example above, the thread returns a list, thus necessitating
that the thread creation call be made in list context (i.e., C<my ($thr)>).
See L<< threads/"$thr->join()" >> and L<threads/"THREAD CONTEXT"> for more
details on thread context and return values.
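
If relying on the parentheses in C<my ($thr)> feels too subtle, the context
can also be requested explicitly.  The following is a minimal sketch, assuming
the C<context> option to C<create()> and the C<wantarray()> method documented
in L<threads/"THREAD CONTEXT">; the subroutine name C<pair> is made up for
illustration:

    use threads;

    # Hypothetical worker that returns two values
    sub pair { return ('Fifty-six', 'foo'); }

    # Ask for list context explicitly instead of relying on my ($thr)
    my $thr = threads->create({'context' => 'list'}, \&pair);

    # wantarray() reports the context the thread was created with
    print("Thread runs in list context\n") if $thr->wantarray();

    my @ReturnData = $thr->join();
    print('Got ', scalar(@ReturnData), " values\n");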

=head2 Ignoring A Thread

C<join()> does three things: it waits for a thread to exit, cleans up
after it, and returns any data the thread may have produced.  But what
if you're not interested in the thread's return values, and you don't
really care when the thread finishes? All you want is for the thread
to be cleaned up after once it's done.

In this case, you use the C<detach()> method.  Once a thread is detached,
it'll run until it's finished; then Perl will clean up after it
automatically.

    use threads;

    my $thr = threads->create(\&sub1);   # Spawn the thread

    $thr->detach();   # Now we officially don't care any more

    sleep(15);        # Let thread run for a while

    sub sub1 {
        my $count = 0;
        while (1) {
            $count++;
            print("\$count is $count\n");
            sleep(1);
        }
    }

Once a thread is detached, it may not be joined, and any return data
that it might have produced (if it was done and waiting for a join) is
lost.
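
A thread object can be asked about its state before you decide what to do
with it.  The sketch below assumes the C<is_detached()> and C<is_joinable()>
methods described in L<threads>, and wraps the C<join()> attempt in C<eval>
because joining a detached thread is an error:

    use threads;

    my $thr = threads->create(sub { return 'ignored'; });
    $thr->detach();

    print("Thread is detached\n")     if $thr->is_detached();
    print("Thread is not joinable\n") if !$thr->is_joinable();

    # Trying to join anyway fails; eval keeps the program alive
    my @result = eval { $thr->join(); };
    print("join() failed: $@") if $@;

    sleep(1);   # Crude: give the thread time to finish before we exit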

C<detach()> can also be called as a class method to allow a thread to
detach itself:

    use threads;

    my $thr = threads->create(\&sub1);

    sub sub1 {
        threads->detach();
        # Do more work
    }

=head2 Process and Thread Termination

With threads one must be careful to make sure they all have a chance to
run to completion, assuming that is what you want.

An action that terminates a process will terminate I<all> running
threads.  C<die()> and C<exit()> have this property,
and perl does an exit when the main thread exits,
perhaps implicitly by falling off the end of your code,
even if that's not what you want.

As an example of this case, this code prints the message
"Perl exited with active threads: 2 running and unjoined":

    use threads;
    my $thr1 = threads->new(\&thrsub, "test1");
    my $thr2 = threads->new(\&thrsub, "test2");
    sub thrsub {
       my ($message) = @_;
       sleep 1;
       print "thread $message\n";
    }

But when the following lines are added at the end:

    $thr1->join();
    $thr2->join();

it prints two lines of output, a perhaps more useful outcome.

=head1 Threads And Data

Now that we've covered the basics of threads, it's time for our next
topic: Data.  Threading introduces a couple of complications to data
access that non-threaded programs never need to worry about.

=head2 Shared And Unshared Data

The biggest difference between Perl I<ithreads> and the old 5.005-style
threading, or for that matter most other threading systems out there,
is that by default, no data is shared. When a new Perl thread is created,
all the data associated with the current thread is copied to the new
thread, and is subsequently private to that new thread!
This is similar in feel to what happens when a Unix process forks,
except that in this case, the data is just copied to a different part of
memory within the same process rather than a real fork taking place.

To make use of threading, however, one usually wants the threads to share
at least some data between themselves. This is done with the
L<threads::shared> module and the C<:shared> attribute:

    use threads;
    use threads::shared;

    my $foo :shared = 1;
    my $bar = 1;
    threads->create(sub { $foo++; $bar++; })->join();

    print("$foo\n");  # Prints 2 since $foo is shared
    print("$bar\n");  # Prints 1 since $bar is not shared

In the case of a shared array, all the array's elements are shared, and for
a shared hash, all the keys and values are shared. This places
restrictions on what may be assigned to shared array and hash elements: only
simple values or references to shared variables are allowed - this is
so that a private variable can't accidentally become shared. A bad
assignment will cause the thread to die. For example:

    use threads;
    use threads::shared;

    my $var          = 1;
    my $svar :shared = 2;
    my %hash :shared;

    ... create some threads ...

    $hash{a} = 1;       # All threads see exists($hash{a})
                        # and $hash{a} == 1
    $hash{a} = $var;    # okay - copy-by-value: same effect as previous
    $hash{a} = $svar;   # okay - copy-by-value: same effect as previous
    $hash{a} = \$svar;  # okay - a reference to a shared variable
    $hash{a} = \$var;   # This will die
    delete($hash{a});   # okay - all threads will see !exists($hash{a})

Note that a shared variable guarantees that if two or more threads try to
modify it at the same time, the internal state of the variable will not
become corrupted. However, there are no guarantees beyond this, as
explained in the next section.
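
The restriction on references matters as soon as you want nested data in a
shared hash or array.  As a minimal sketch, assuming a version of
L<threads::shared> that provides C<shared_clone()>, a nested array can be
stored like this (the variable name C<%config> is only an illustration):

    use threads;
    use threads::shared;

    my %config :shared;

    # A plain [80, 443] would be a reference to private data and would die;
    # shared_clone() makes a shared copy that can be stored safely
    $config{ports} = shared_clone([80, 443]);

    threads->create(sub { push(@{ $config{ports} }, 8080); })->join();

    print(join(', ', @{ $config{ports} }), "\n");   # 80, 443, 8080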

=head2 Thread Pitfalls: Races

While threads bring a new set of useful tools, they also bring a
number of pitfalls.  One pitfall is the race condition:

    use threads;
    use threads::shared;

    my $x :shared = 1;
    my $thr1 = threads->create(\&sub1);
    my $thr2 = threads->create(\&sub2);

    $thr1->join();
    $thr2->join();
    print("$x\n");

    sub sub1 { my $foo = $x; $x = $foo + 1; }
    sub sub2 { my $bar = $x; $x = $bar + 1; }

What do you think C<$x> will be? The answer, unfortunately, is I<it
depends>. Both C<sub1()> and C<sub2()> access the global variable C<$x>, once
to read and once to write.  Depending on factors ranging from your
thread implementation's scheduling algorithm to the phase of the moon,
C<$x> can be 2 or 3.

Race conditions are caused by unsynchronized access to shared
data.  Without explicit synchronization, there's no way to be sure that
nothing has happened to the shared data between the time you access it
and the time you update it.  Even this simple code fragment has the
possibility of error:

    use threads;
    use threads::shared;
    my $x :shared = 2;
    my $y :shared;
    my $z :shared;
    my $thr1 = threads->create(sub { $y = $x; $x = $y + 1; });
    my $thr2 = threads->create(sub { $z = $x; $x = $z + 1; });
    $thr1->join();
    $thr2->join();

Two threads both access C<$x>.  Each thread can potentially be interrupted
at any point, or be executed in any order.  At the end, C<$x> could be 3
or 4, and both C<$y> and C<$z> could be 2 or 3.

Even C<$x += 5> or C<$x++> are not guaranteed to be atomic.

Whenever your program accesses data or resources that can be accessed
by other threads, you must take steps to coordinate access or risk
data inconsistency and race conditions. Note that Perl will protect its
internals from your race conditions, but it won't protect you from you.
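
For comparison, here is a sketch of the first race example with the read and
the write protected by C<lock()> (covered in detail under
L</"Controlling access: lock()">).  The subroutine name C<increment> is made
up for illustration:

    use threads;
    use threads::shared;

    my $x :shared = 1;

    # The read and the write now both happen while holding the lock on $x
    sub increment {
        lock($x);
        my $tmp = $x;
        $x = $tmp + 1;
    }

    my $thr1 = threads->create(\&increment);
    my $thr2 = threads->create(\&increment);
    $thr1->join();
    $thr2->join();
    print("$x\n");   # Always 3

Because only one thread at a time can hold the lock, the second thread
cannot read C<$x> until the first has finished updating it.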

=head1 Synchronization and control

Perl provides a number of mechanisms to coordinate the interactions
between threads and their data, to avoid race conditions and the like.
Some of these are designed to resemble the common techniques used in thread
libraries such as C<pthreads>; others are Perl-specific. Often, the
standard techniques are clumsy and difficult to get right (such as
condition waits). Where possible, it is usually easier to use Perlish
techniques such as queues, which remove some of the hard work involved.

=head2 Controlling access: lock()

The C<lock()> function takes a shared variable and puts a lock on it.
No other thread may lock the variable until the variable is unlocked
by the thread holding the lock. Unlocking happens automatically
when the locking thread exits the block that contains the call to the
C<lock()> function.  Using C<lock()> is straightforward: This example has
several threads doing some calculations in parallel, and occasionally
updating a running total:

    use threads;
    use threads::shared;

    my $total :shared = 0;

    sub calc {
        while (1) {
            my $result;
            # (... do some calculations and set $result ...)
            {
                lock($total);  # Block until we obtain the lock
                $total += $result;
            } # Lock implicitly released at end of scope
            last if $result == 0;
        }
    }

    my $thr1 = threads->create(\&calc);
    my $thr2 = threads->create(\&calc);
    my $thr3 = threads->create(\&calc);
    $thr1->join();
    $thr2->join();
    $thr3->join();
    print("total=$total\n");

C<lock()> blocks the thread until the variable being locked is
available.  When C<lock()> returns, your thread can be sure that no other
thread can lock that variable until the block containing the
lock exits.

It's important to note that locks don't prevent access to the variable
in question, only lock attempts.  This is in keeping with Perl's
longstanding tradition of courteous programming, and the advisory file
locking that C<flock()> gives you.

You may lock arrays and hashes as well as scalars.  Locking an array,
though, will not block subsequent locks on array elements, just lock
attempts on the array itself.

Locks are recursive, which means it's okay for a thread to
lock a variable more than once.  The lock will last until the outermost
C<lock()> on the variable goes out of scope. For example:

    my $x :shared;
    doit();

    sub doit {
        {
            {
                lock($x); # Wait for lock
                lock($x); # NOOP - we already have the lock
                {
                    lock($x); # NOOP
                    {
                        lock($x); # NOOP
                        lockit_some_more();
                    }
                }
            } # *** Implicit unlock here ***
        }
    }

    sub lockit_some_more {
        lock($x); # NOOP
    } # Nothing happens here

Note that there is no C<unlock()> function - the only way to unlock a
variable is to allow it to go out of scope.

A lock can either be used to guard the data contained within the variable
being locked, or it can be used to guard something else, like a section
of code. In this latter case, the variable in question does not hold any
useful data, and exists only for the purpose of being locked. In this
respect, the variable behaves like the mutexes and basic semaphores of
traditional thread libraries.
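
A sketch of the second use, where the locked variable carries no data of its
own and only guards a section of code.  The names C<$report_lock>,
C<@report>, and C<add_report> are assumptions made for this example:

    use threads;
    use threads::shared;

    my $report_lock :shared;   # Holds no data; it exists only to be locked
    my @report :shared;

    sub add_report {
        my ($name) = @_;
        lock($report_lock);
        # Both lines are added together, never interleaved with the
        # lines of another thread
        push(@report, "begin $name");
        push(@report, "end $name");
    }

    my @threads = map { threads->create(\&add_report, $_) } qw(a b c);
    $_->join() for @threads;
    print("$_\n") for @report;

The order in which the three pairs appear may vary from run to run, but each
"begin" line is always followed by its matching "end" line.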

=head2 A Thread Pitfall: Deadlocks

Locks are a handy tool to synchronize access to data, and using them
properly is the key to safe shared data.  Unfortunately, locks aren't
without their dangers, especially when multiple locks are involved.
Consider the following code:

    use threads;
    use threads::shared;

    my $x :shared = 4;
    my $y :shared = 'foo';
    my $thr1 = threads->create(sub {
        lock($x);
        sleep(20);
        lock($y);
    });
    my $thr2 = threads->create(sub {
        lock($y);
        sleep(20);
        lock($x);
    });

This program will probably hang until you kill it.  The only way it
won't hang is if one of the two threads acquires both locks
first.  A guaranteed-to-hang version is more complicated, but the
principle is the same.

The first thread will grab a lock on C<$x>, then, after a pause during which
the second thread has probably had time to do some work, try to grab a
lock on C<$y>.  Meanwhile, the second thread grabs a lock on C<$y>, then later
tries to grab a lock on C<$x>.  The second lock attempt for both threads will
block, each waiting for the other to release its lock.

This condition is called a deadlock, and it occurs whenever two or
more threads are trying to get locks on resources that the others
own.  Each thread will block, waiting for the other to release a lock
on a resource.  That never happens, though, since the thread with the
resource is itself waiting for a lock to be released.

There are a number of ways to handle this sort of problem.  The best
way is to always have all threads acquire locks in the exact same
order.  If, for example, you lock variables C<$x>, C<$y>, and C<$z>, always lock
C<$x> before C<$y>, and C<$y> before C<$z>.  It's also best to hold on to locks for
as short a period of time as possible, to minimize the risk of deadlock.

The other synchronization primitives described below can suffer from
similar problems.
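
As a sketch of the lock-ordering advice, here both threads take the lock on
C<$x> before the lock on C<$y>, so neither can end up holding one lock while
waiting for the other.  The subroutine name C<update_both> is made up for
illustration:

    use threads;
    use threads::shared;

    my $x :shared = 4;
    my $y :shared = 'foo';

    # Every thread takes the locks in the same order: $x first, then $y
    sub update_both {
        my ($amount) = @_;
        lock($x);
        lock($y);
        $x += $amount;
        $y = "updated by $amount";
    }

    my $thr1 = threads->create(\&update_both, 1);
    my $thr2 = threads->create(\&update_both, 2);
    $thr1->join();
    $thr2->join();
    print("$x\n");   # 7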

=head2 Queues: Passing Data Around

A queue is a special thread-safe object that lets you put data in one
end and take it out the other without having to worry about
synchronization issues.  They're pretty straightforward, and look like
this:

    use threads;
    use Thread::Queue;

    my $DataQueue = Thread::Queue->new();
    my $thr = threads->create(sub {
        while (my $DataElement = $DataQueue->dequeue()) {
            print("Popped $DataElement off the queue\n");
        }
    });

    $DataQueue->enqueue(12);
    $DataQueue->enqueue("A", "B", "C");
    sleep(10);
    $DataQueue->enqueue(undef);
    $thr->join();

You create the queue with C<Thread::Queue-E<gt>new()>.  Then you can
add lists of scalars onto the end with C<enqueue()>, and pop scalars off
the front of it with C<dequeue()>.  A queue has no fixed size, and can grow
as needed to hold everything pushed on to it.

If a queue is empty, C<dequeue()> blocks until another thread enqueues
something.  This makes queues ideal for event loops and other
communications between threads.
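
Several threads can read from the same queue, which gives a simple worker
pool.  The following is a minimal sketch under a few assumptions: the names
C<$WorkQueue> and C<@workers> are made up, C<undef> is used as the end
marker as in the example above, and the loop tests C<defined()> so that a
queued 0 does not end a worker early:

    use threads;
    use Thread::Queue;

    my $WorkQueue = Thread::Queue->new();

    # Three workers, each summing whatever numbers it happens to dequeue
    my @workers = map {
        threads->create(sub {
            my $sum = 0;
            # defined() lets a 0 pass through; only undef ends the loop
            while (defined(my $num = $WorkQueue->dequeue())) {
                $sum += $num;
            }
            return $sum;
        });
    } 1 .. 3;

    $WorkQueue->enqueue(0 .. 10);
    $WorkQueue->enqueue(undef) for @workers;   # One end marker per worker

    my $total = 0;
    foreach my $thr (@workers) {
        my ($sum) = $thr->join();
        $total += $sum;
    }
    print("total=$total\n");   # total=55

Which worker handles which number is not predictable, but every number is
dequeued exactly once, so the grand total is always the same.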

=head2 Semaphores: Synchronizing Data Access

Semaphores are a kind of generic locking mechanism. In their most basic
form, they behave very much like lockable scalars, except that they
can't hold data, and that they must be explicitly unlocked. In their
advanced form, they act like a kind of counter, and can allow multiple
threads to have the I<lock> at any one time.

=head2 Basic semaphores

Semaphores have two methods, C<down()> and C<up()>: C<down()> decrements the resource
count, while C<up()> increments it. Calls to C<down()> will block if the
semaphore's current count would decrement below zero.  This program
gives a quick demonstration:

    use threads;
    use Thread::Semaphore;

    my $semaphore = Thread::Semaphore->new();
    my $GlobalVariable :shared = 0;

    my $thr1 = threads->create(\&sample_sub, 1);
    my $thr2 = threads->create(\&sample_sub, 2);
    my $thr3 = threads->create(\&sample_sub, 3);

    sub sample_sub {
        my $SubNumber = shift(@_);
        my $TryCount = 10;
        my $LocalCopy;
        sleep(1);
        while ($TryCount--) {
            $semaphore->down();
            $LocalCopy = $GlobalVariable;
            print("$TryCount tries left for sub $SubNumber "
                 ."(\$GlobalVariable is $GlobalVariable)\n");
            sleep(2);
            $LocalCopy++;
            $GlobalVariable = $LocalCopy;
            $semaphore->up();
        }
    }

    $thr1->join();
    $thr2->join();
    $thr3->join();

The three invocations of the subroutine all operate in sync.  The
semaphore, though, makes sure that only one thread is accessing the
global variable at once.

=head2 Advanced Semaphores

By default, semaphores behave like locks, letting only one thread
C<down()> them at a time.  However, there are other uses for semaphores.

Each semaphore has a counter attached to it. By default, semaphores are
created with the counter set to one, C<down()> decrements the counter by
one, and C<up()> increments by one. However, we can override any or all
of these defaults simply by passing in different values:

    use threads;
    use Thread::Semaphore;

    my $semaphore = Thread::Semaphore->new(5);
                    # Creates a semaphore with the counter set to five

    my $thr1 = threads->create(\&sub1);
    my $thr2 = threads->create(\&sub1);

    sub sub1 {
        $semaphore->down(5); # Decrements the counter by five
        # Do stuff here
        $semaphore->up(5); # Increment the counter by five
    }

    $thr1->detach();
    $thr2->detach();

If C<down()> attempts to decrement the counter below zero, it blocks until
the counter is large enough.  Note that while a semaphore can be created
with a starting count of zero, any C<up()> or C<down()> always changes the
counter by at least one, and so C<< $semaphore->down(0) >> is the same as
C<< $semaphore->down(1) >>.

The question, of course, is why would you do something like this? Why
create a semaphore with a starting count that's not one, or why
decrement or increment it by more than one? The answer is resource
availability.  Many resources that you want to manage access for can be
safely used by more than one thread at once.

For example, let's take a GUI driven program.  It has a semaphore that
it uses to synchronize access to the display, so only one thread is
ever drawing at once.  Handy, but of course you don't want any thread
to start drawing until things are properly set up.  In this case, you
can create a semaphore with a counter set to zero, and up it when
things are ready for drawing.
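
A sketch of that idea, with a semaphore that starts at zero and is raised
once setup is complete.  There is no real GUI here; the names
C<$display_ready> and C<$painter> are placeholders:

    use threads;
    use Thread::Semaphore;

    my $display_ready = Thread::Semaphore->new(0);   # Nothing may draw yet

    my $painter = threads->create(sub {
        $display_ready->down();    # Blocks until the main thread calls up()
        my $message = "drawing\n";
        $display_ready->up();      # Let the next drawing thread in
        return $message;
    });

    # (... set up the display here ...)
    $display_ready->up();          # Setup finished: allow one thread to draw

    my $output = $painter->join();
    print($output);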

Semaphores with counters greater than one are also useful for
establishing quotas.  Say, for example, that you have a number of
threads that can do I/O at once.  You don't want all the threads
reading or writing at once though, since that can potentially swamp
your I/O channels, or deplete your process's quota of filehandles.  You
can use a semaphore initialized to the number of concurrent I/O
requests (or open files) that you want at any one time, and have your
threads quietly block and unblock themselves.
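
The quota idea can be sketched as follows.  Five threads compete for two
slots, and a shared counter records the largest number of threads that were
ever inside the guarded section at once.  The names C<$slots>, C<$active>,
C<$peak>, and C<fake_io> are assumptions, and the short C<select()> call
merely stands in for real I/O:

    use threads;
    use threads::shared;
    use Thread::Semaphore;

    my $slots = Thread::Semaphore->new(2);   # At most two threads at a time
    my $active :shared = 0;
    my $peak   :shared = 0;

    sub fake_io {
        $slots->down();
        {
            lock($active);
            $active++;
            $peak = $active if $active > $peak;
        }
        select(undef, undef, undef, 0.1);    # Pretend to do some I/O
        {
            lock($active);
            $active--;
        }
        $slots->up();
    }

    my @threads = map { threads->create(\&fake_io) } 1 .. 5;
    $_->join() for @threads;
    print("peak=$peak\n");   # Never more than 2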

Larger increments or decrements are handy in those cases where a
thread needs to check out or return a number of resources at once.

=head2 Waiting for a Condition

The functions C<cond_wait()> and C<cond_signal()>
can be used in conjunction with locks to notify
co-operating threads that a resource has become available. They are
very similar in use to the functions found in C<pthreads>. However
for most purposes, queues are simpler to use and more intuitive. See
L<threads::shared> for more details.
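
For illustration, here is a minimal sketch of the usual pattern: the waiting
thread locks a shared variable and calls C<cond_wait()> in a loop until the
condition holds, while the signalling thread changes the variable under the
same lock and then calls C<cond_signal()>.  The variable name C<$ready> is
made up for this example:

    use threads;
    use threads::shared;

    my $ready :shared = 0;

    my $thr = threads->create(sub {
        lock($ready);
        # cond_wait() releases the lock while waiting and reacquires it
        # before returning; the loop guards against waking up too early
        cond_wait($ready) until $ready;
        return 'resource available';
    });

    {
        lock($ready);
        $ready = 1;
        cond_signal($ready);
    }

    my $message = $thr->join();
    print("$message\n");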

=head2 Giving up control

There are times when you may find it useful to have a thread
explicitly give up the CPU to another thread.  You may be doing something
processor-intensive and want to make sure that the user-interface thread
gets called frequently.  Regardless, there are times that you might want
a thread to give up the processor.

Perl's threading package provides the C<yield()> function that does
this. C<yield()> is pretty straightforward, and works like this:

    use threads;

    sub loop {
        my $thread = shift;
        my $foo = 50;
        while($foo--) { print("In thread $thread\n"); }
        threads->yield();
        $foo = 50;
        while($foo--) { print("In thread $thread\n"); }
    }

    my $thr1 = threads->create(\&loop, 'first');
    my $thr2 = threads->create(\&loop, 'second');
    my $thr3 = threads->create(\&loop, 'third');

It is important to remember that C<yield()> is only a hint to give up the CPU;
what actually happens depends on your hardware, OS and threading libraries.
B<On many operating systems, yield() is a no-op.>  Therefore you should not
build the scheduling of your threads around C<yield()> calls.  It might work
on your platform, but it may not work on another.

=head1 General Thread Utility Routines

We've covered the workhorse parts of Perl's threading package, and
with these tools you should be well on your way to writing threaded
code and packages.  There are a few useful little pieces that didn't
really fit in anyplace else.

=head2 What Thread Am I In?

The C<threads-E<gt>self()> class method provides your program with a way to
get an object representing the thread it's currently in.  You can use this
object in the same way as the ones returned from thread creation.

=head2 Thread IDs

C<tid()> is a thread object method that returns the thread ID of the
thread the object represents.  Thread IDs are integers, with the main
thread in a program being 0.  Currently Perl assigns a unique TID to
every thread ever created in your program, assigning the first thread
to be created a TID of 1, and increasing the TID by 1 for each new
thread that's created.  When used as a class method, C<threads-E<gt>tid()>
can be used by a thread to get its own TID.
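
A short sketch that compares the TID seen from outside a thread with the TID
the thread reports about itself:

    use threads;

    my $main_tid = threads->tid();
    print("Main thread TID: $main_tid\n");   # 0

    my @threads = map {
        threads->create(sub { return threads->self()->tid(); });
    } 1 .. 3;

    my @mismatches;
    foreach my $thr (@threads) {
        my $seen_outside  = $thr->tid();
        my ($seen_inside) = $thr->join();
        print("TID $seen_outside reported itself as $seen_inside\n");
        push(@mismatches, $seen_outside) if $seen_outside != $seen_inside;
    }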

=head2 Are These Threads The Same?

The C<equal()> method takes two thread objects and returns true
if the objects represent the same thread, and false if they don't.

Thread objects also have an overloaded C<==> comparison so that you can do
comparison on them as you would with normal objects.
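
As a small sketch, the following looks a thread up again through
C<threads-E<gt>list()> (described in the next section) and compares the
objects with both C<equal()> and the overloaded operators:

    use threads;

    my $thr1 = threads->create(sub { return; });
    my $thr2 = threads->create(sub { return; });

    # list() hands back thread objects for the same underlying threads
    my ($found) = grep { $_ == $thr1 } threads->list();
    print("Found thr1 again via list()\n")
        if $found && $found->equal($thr1);

    my $different = ($thr1 != $thr2);
    print("thr1 and thr2 are different threads\n") if $different;

    $_->join() for ($thr1, $thr2);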

=head2 What Threads Are Running?

C<threads-E<gt>list()> returns a list of thread objects, one for each thread
that's currently running and not detached.  Handy for a number of things,
including cleaning up at the end of your program (from the main Perl thread,
of course):

    # Loop through all the threads
    foreach my $thr (threads->list()) {
        $thr->join();
    }

If some threads have not finished running when the main Perl thread
ends, Perl will warn you about it and die, since it is impossible for Perl
to clean up itself while other threads are running.

NOTE:  The main Perl thread (thread 0) is in a I<detached> state, and so
does not appear in the list returned by C<threads-E<gt>list()>.
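
According to L<threads>, C<threads-E<gt>list()> returns a count rather than
the objects themselves when called in scalar context.  A sketch that uses
this to check that everything has been joined:

    use threads;

    threads->create(sub { return; }) for 1 .. 3;

    # In scalar context, list() returns a count instead of the objects
    my $pending = threads->list();
    print("$pending threads to join\n");    # 3

    $_->join() for threads->list();

    my $remaining = threads->list();
    print("$remaining threads left\n");     # 0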

=head1 A Complete Example

Confused yet? It's time for an example program to show some of the
things we've covered.  This program finds prime numbers using threads.

   1 #!/usr/bin/perl
   2 # prime-pthread, courtesy of Tom Christiansen
   3
   4 use strict;
   5 use warnings;
   6
   7 use threads;
   8 use Thread::Queue;
   9
  10 sub check_num {
  11     my ($upstream, $cur_prime) = @_;
  12     my $kid;
  13     my $downstream = Thread::Queue->new();
  14     while (my $num = $upstream->dequeue()) {
  15         next unless ($num % $cur_prime);
  16         if ($kid) {
  17             $downstream->enqueue($num);
  18         } else {
  19             print("Found prime: $num\n");
  20             $kid = threads->create(\&check_num, $downstream, $num);
  21             if (! $kid) {
  22                 warn("Sorry.  Ran out of threads.\n");
  23                 last;
  24             }
  25         }
  26     }
  27     if ($kid) {
  28         $downstream->enqueue(undef);
  29         $kid->join();
  30     }
  31 }
  32
  33 my $stream = Thread::Queue->new(3..1000, undef);
  34 check_num($stream, 2);

This program uses the pipeline model to generate prime numbers.  Each
thread in the pipeline has an input queue that feeds numbers to be
checked, a prime number that it's responsible for, and an output queue
into which it funnels numbers that have failed the check.  If the thread
has a number that's failed its check and there's no child thread, then
the thread must have found a new prime number.  In that case, a new
child thread is created for that prime and stuck on the end of the
pipeline.

This probably sounds a bit more confusing than it really is, so let's
go through this program piece by piece and see what it does.  (For
those of you who might be trying to remember exactly what a prime
number is, it's a number that's only evenly divisible by itself and 1.)

The bulk of the work is done by the C<check_num()> subroutine, which
takes a reference to its input queue and a prime number that it's
responsible for.  After pulling in the input queue and the prime that
the subroutine is checking (line 11), we create a new queue (line 13)
and reserve a scalar for the thread that we're likely to create later
(line 12).

The while loop from line 14 to line 26 grabs a scalar off the input
queue and checks against the prime this thread is responsible
for.  Line 15 checks to see if there's a remainder when we divide the
number to be checked by our prime.  If there is one, the number
must not be evenly divisible by our prime, so we need to either pass
it on to the next thread if we've created one (line 17) or create a
new thread if we haven't.

The new thread creation is line 20.  We pass on to it a reference to
the queue we've created, and the prime number we've found.  In lines 21
through 24, we check to make sure that our new thread got created, and
if not, we stop checking any remaining numbers in the queue.

Finally, once the loop terminates (because we got a 0 or C<undef> in the
queue, which serves as a note to terminate), we pass on the notice to our
child, and wait for it to exit if we've created a child (lines 27 through
30).

Meanwhile, back in the main thread, we first create a queue (line 33) and
queue up all the numbers from 3 to 1000 for checking, plus a termination
notice.  Then all we have to do to get the ball rolling is pass the queue
and the first prime to the C<check_num()> subroutine (line 34).

That's how it works.  It's pretty simple; as with many Perl programs,
the explanation is much longer than the program.

=head1 Different implementations of threads

Here is some background on thread implementations from the operating system
viewpoint.  There are three basic categories of threads: user-mode threads,
kernel threads, and multiprocessor kernel threads.

User-mode threads are threads that live entirely within a program and
its libraries.  In this model, the OS knows nothing about threads.  As
far as it's concerned, your process is just a process.

This is the easiest way to implement threads, and the way most OSes
start.  The big disadvantage is that, since the OS knows nothing about
threads, if one thread blocks they all do.  Typical blocking activities
include most system calls, most I/O, and things like C<sleep()>.

Kernel threads are the next step in thread evolution.  The OS knows
about kernel threads, and makes allowances for them.  The main
difference between a kernel thread and a user-mode thread is
blocking.  With kernel threads, things that block a single thread don't
block other threads.  This is not the case with user-mode threads,
where the kernel blocks at the process level and not the thread level.

This is a big step forward, and can give a threaded program quite a
performance boost over non-threaded programs.  Threads that block
performing I/O, for example, won't block threads that are doing other
things.  Each process still has only one thread running at once,
though, regardless of how many CPUs a system might have.

Since kernel threading can interrupt a thread at any time, it will
uncover some of the implicit locking assumptions you may make in your
program.  For example, something as simple as C<$x = $x + 2> can behave
unpredictably with kernel threads if C<$x> is visible to other
threads, as another thread may have changed C<$x> between the time it
was fetched on the right hand side and the time the new value is
stored.

Multiprocessor kernel threads are the final step in thread
support.  With multiprocessor kernel threads on a machine with multiple
CPUs, the OS may schedule two or more threads to run simultaneously on
different CPUs.

This can give a serious performance boost to your threaded program,
since more than one thread will be executing at the same time.  As a
tradeoff, though, any of those nagging synchronization issues that
might not have shown with basic kernel threads will appear with a
vengeance.

In addition to the different levels of OS involvement in threads,
different OSes (and different thread implementations for a particular
OS) allocate CPU cycles to threads in different ways.

Cooperative multitasking systems have running threads give up control
if one of two things happens.  If a thread calls a yield function, it
gives up control.  It also gives up control if the thread does
something that would cause it to block, such as perform I/O.  In a
cooperative multitasking implementation, one thread can starve all the
others for CPU time if it so chooses.

Preemptive multitasking systems interrupt threads at regular intervals
while the system decides which thread should run next.  In a preemptive
multitasking system, one thread usually won't monopolize the CPU.

On some systems, there can be cooperative and preemptive threads
running simultaneously. (Threads running with realtime priorities
often behave cooperatively, for example, while threads running at
normal priorities behave preemptively.)

Most modern operating systems support preemptive multitasking nowadays.

=head1 Performance considerations

The main thing to bear in mind when comparing Perl's I<ithreads> to other threading
models is the fact that for each new thread created, a complete copy of
all the variables and data of the parent thread has to be taken. Thus,
thread creation can be quite expensive, both in terms of memory usage and
time spent in creation. The ideal way to reduce these costs is to have a
relatively small number of long-lived threads, all created fairly early
on (before the base thread has accumulated too much data). Of course, this
may not always be possible, so compromises have to be made. However, after
a thread has been created, its performance and extra memory usage should
be little different than ordinary code.

Also note that under the current implementation, shared variables
use a little more memory and are a little slower than ordinary variables.

=head1 Process-scope Changes

Note that while threads themselves are separate execution threads and
Perl data is thread-private unless explicitly shared, the threads can
affect process-scope state, affecting all the threads.

The most common example of this is changing the current working
directory using C<chdir()>.  One thread calls C<chdir()>, and the working
directory of all the threads changes.
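
A sketch that makes the effect visible.  It assumes a typical Unix build of
Perl, where the working directory belongs to the process as a whole; other
platforms may behave differently:

    use threads;
    use Cwd qw(getcwd);
    use File::Spec;

    my $before = getcwd();

    # The thread changes directory; the main thread never calls chdir()
    threads->create(sub {
        chdir(File::Spec->rootdir()) or die("chdir failed: $!");
    })->join();

    my $after = getcwd();
    print("before: $before\n");
    print("after:  $after\n");   # On a typical Unix build: the root directory

    chdir($before);              # Put things back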

An even more drastic example of a process-scope change is C<chroot()>:
the root directory of all the threads changes, and no thread can
undo it (as opposed to C<chdir()>).

Further examples of process-scope changes include C<umask()> and
changing uids and gids.

Thinking of mixing C<fork()> and threads?  Please lie down and wait
until the feeling passes.  Be aware that the semantics of C<fork()> vary
between platforms.  For example, some Unix systems copy all the current
threads into the child process, while others only copy the thread that
called C<fork()>. You have been warned!

Similarly, mixing signals and threads may be problematic.
Implementations are platform-dependent, and even the POSIX
semantics may not be what you expect (and Perl doesn't even
give you the full POSIX API).  For example, there is no way to
guarantee that a signal sent to a multi-threaded Perl application
will get intercepted by any particular thread.  (However, a recently
added feature does provide the capability to send signals between
threads.  See L<threads/THREAD SIGNALLING> for more details.)

=head1 Thread-Safety of System Libraries

Whether various library calls are thread-safe is outside the control
of Perl.  Calls often suffering from not being thread-safe include:
C<localtime()>, C<gmtime()>, functions fetching user, group and
network information (such as C<getgrent()>, C<gethostent()>,
C<getnetent()> and so on), C<readdir()>, C<rand()>, and C<srand()>.  In
general, these are calls that depend on some global external state.

If the system on which Perl is compiled has thread-safe variants of such
calls, they will be used.  Beyond that, Perl is at the mercy of
the thread-safety or -unsafety of the calls.  Please consult your
C library call documentation.

On some platforms the thread-safe library interfaces may fail if the
result buffer is too small (for example the user group databases may
be rather large, and the reentrant interfaces may have to carry around
a full snapshot of those databases).  Perl will start with a small
buffer, but keep retrying and growing the result buffer
until the result fits.  If this limitless growing sounds bad for
security or memory consumption reasons you can recompile Perl with
C<PERL_REENTRANT_MAXSIZE> defined to the maximum number of bytes you will
allow.

=head1 Conclusion

A complete thread tutorial could fill a book (and has, many times),
but with what we've covered in this introduction, you should be well
on your way to becoming a threaded Perl expert.

=head1 SEE ALSO

Annotated POD for L<threads>:
L<http://annocpan.org/?mode=search&field=Module&name=threads>

Latest version of L<threads> on CPAN:
L<http://search.cpan.org/search?module=threads>

Annotated POD for L<threads::shared>:
L<http://annocpan.org/?mode=search&field=Module&name=threads%3A%3Ashared>

Latest version of L<threads::shared> on CPAN:
L<http://search.cpan.org/search?module=threads%3A%3Ashared>

Perl threads mailing list:
L<http://lists.perl.org/list/ithreads.html>

=head1 Bibliography

Here's a short bibliography courtesy of Jürgen Christoffel:

=head2 Introductory Texts

Birrell, Andrew D. An Introduction to Programming with
Threads. Digital Equipment Corporation, 1989, DEC-SRC Research Report
#35 online as
L<ftp://ftp.dec.com/pub/DEC/SRC/research-reports/SRC-035.pdf>
(highly recommended)

Robbins, Kay A., and Steven Robbins. Practical Unix Programming: A
Guide to Concurrency, Communication, and
Multithreading. Prentice-Hall, 1996.

Lewis, Bill, and Daniel J. Berg. Multithreaded Programming with
Pthreads. Prentice Hall, 1997, ISBN 0-13-443698-9 (a well-written
introduction to threads).

Nelson, Greg (editor). Systems Programming with Modula-3.  Prentice
Hall, 1991, ISBN 0-13-590464-1.

Nichols, Bradford, Dick Buttlar, and Jacqueline Proulx Farrell.
Pthreads Programming. O'Reilly & Associates, 1996, ISBN 1-56592-115-1
(covers POSIX threads).

=head2 OS-Related References

Boykin, Joseph, David Kirschen, Alan Langerman, and Susan
LoVerso. Programming under Mach. Addison-Wesley, 1994, ISBN
0-201-52739-1.

Tanenbaum, Andrew S. Distributed Operating Systems. Prentice Hall,
1995, ISBN 0-13-219908-4 (great textbook).

Silberschatz, Abraham, and Peter B. Galvin. Operating System Concepts,
4th ed. Addison-Wesley, 1995, ISBN 0-201-59292-4.

=head2 Other References

Arnold, Ken and James Gosling. The Java Programming Language, 2nd
ed. Addison-Wesley, 1998, ISBN 0-201-31006-6.

comp.programming.threads FAQ,
L<http://www.serpentine.com/~bos/threads-faq/>

Le Sergent, T. and B. Berthomieu. "Incremental MultiThreaded Garbage
Collection on Virtually Shared Memory Architectures" in Memory
Management: Proc. of the International Workshop IWMM 92, St. Malo,
France, September 1992, Yves Bekkers and Jacques Cohen, eds. Springer,
1992, ISBN 3-540-55940-X (real-life thread applications).

Artur Bergman, "Where Wizards Fear To Tread", June 11, 2002,
L<http://www.perl.com/pub/a/2002/06/11/threads.html>

=head1 Acknowledgements

Thanks (in no particular order) to Chaim Frenkel, Steve Fink, Gurusamy
Sarathy, Ilya Zakharevich, Benjamin Sugars, Jürgen Christoffel, Joshua
Pritikin, and Alan Burlison, for their help in reality-checking and
polishing this article.  Big thanks to Tom Christiansen for his rewrite
of the prime number generator.

=head1 AUTHOR

Dan Sugalski E<lt>dan@sidhe.orgE<gt>

Slightly modified by Artur Bergman to fit the new thread model/module.

Reworked slightly by Jörg Walter E<lt>jwalt@cpan.orgE<gt> to be more concise
about thread-safety of Perl code.

Rearranged slightly by Elizabeth Mattijsen E<lt>liz@dijkmat.nlE<gt> to put
less emphasis on yield().

=head1 Copyrights

This article originally appeared in The Perl
Journal #10, and is copyright 1998 The Perl Journal. It appears courtesy
of Jon Orwant and The Perl Journal.  This document may be distributed
under the same terms as Perl itself.

=cut

=head1 NAME

perlpragma - how to write a user pragma

=head1 DESCRIPTION

A pragma is a module which influences some aspect of the compile time or run
time behaviour of Perl, such as C<strict> or C<warnings>. With Perl 5.10 you
are no longer limited to the built-in pragmata; you can now create user
pragmata that modify the behaviour of user functions within a lexical scope.

=head1 A basic example

For example, say you need to create a class implementing overloaded
mathematical operators, and would like to provide your own pragma that
functions much like C<use integer;>.  You'd like this code

    use MyMaths;

    my $l = MyMaths->new(1.2);
    my $r = MyMaths->new(3.4);

    print "A: ", $l + $r, "\n";

    use myint;
    print "B: ", $l + $r, "\n";

    {
        no myint;
        print "C: ", $l + $r, "\n";
    }

    print "D: ", $l + $r, "\n";

    no myint;
    print "E: ", $l + $r, "\n";

to give the output

    A: 4.6
    B: 4
    C: 4.6
    D: 4
    E: 4.6

I<i.e.>, where C<use myint;> is in effect, addition operations are forced
to integer, whereas by default they are not, with the default behaviour being
restored via C<no myint;>

The minimal implementation of the package C<MyMaths> would be something like
this:

    package MyMaths;
    use warnings;
    use strict;
    use myint();
    use overload '+' => sub {
        my ($l, $r) = @_;
        # Pass 1 to check up one call level from here
        if (myint::in_effect(1)) {
            int($$l) + int($$r);
        } else {
            $$l + $$r;
        }
    };

    sub new {
        my ($class, $value) = @_;
        bless \$value, $class;
    }

    1;

Note how we load the user pragma C<myint> with an empty list C<()> to
prevent its C<import> being called.

The interaction with the Perl compilation happens inside package C<myint>:

    package myint;

    use strict;
    use warnings;

    sub import {
        $^H{"myint/in_effect"} = 1;
    }

    sub unimport {
        $^H{"myint/in_effect"} = 0;
    }

    sub in_effect {
        my $level = shift // 0;
        my $hinthash = (caller($level))[10];
        return $hinthash->{"myint/in_effect"};
    }

    1;

As pragmata are implemented as modules, like any other module, C<use myint;>
becomes

    BEGIN {
        require myint;
        myint->import();
    }

and C<no myint;> is

    BEGIN {
        require myint;
        myint->unimport();
    }

Hence the C<import> and C<unimport> routines are called at B<compile time>
for the user's code.

User pragmata store their state by writing to the magical hash C<%^H>,
hence these two routines manipulate it. The state information in C<%^H> is
stored in the optree, and can be retrieved read-only at runtime with C<caller()>,
at index 10 of the list of returned results. In the example pragma, retrieval
is encapsulated into the routine C<in_effect()>, which takes as parameter
the number of call frames to go up to find the value of the pragma in the
user's script. This uses C<caller()> to determine the value of
C<$^H{"myint/in_effect"}> when each line of the user's script was called, and
therefore provide the correct semantics in the subroutine implementing the
overloaded addition.
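
The mechanism can be tried out in a single file.  The following is a
minimal sketch, not part of the example above: C<myflag> and C<report>
are hypothetical names, and the pragma is defined inline (with an
C<%INC> entry so that C<use> does not look for a file) purely to keep
the example self-contained.

    use 5.010;
    use strict;
    use warnings;

    BEGIN {
        package myflag;            # hypothetical pragma, defined inline
        sub import   { $^H{"myflag/on"} = 1 }
        sub unimport { $^H{"myflag/on"} = 0 }
        sub in_effect {
            my $level    = shift // 0;
            my $hinthash = (caller($level))[10];
            return $hinthash ? $hinthash->{"myflag/on"} : undef;
        }
        $INC{"myflag.pm"} = 1;     # pretend the module file was loaded
    }

    # Pass 1 so that the hints of report()'s caller are examined.
    sub report {
        my ($label) = @_;
        my $state = myflag::in_effect(1) ? "on" : "off";
        print "$label: $state\n";
    }

    report("outside");
    {
        use myflag;
        report("inside");
        {
            no myflag;
            report("nested no");
        }
    }
    report("after");

This should print C<off>, C<on>, C<off> and C<off> in that order,
because each call to C<report()> sees the hints that were in effect
where that call was compiled.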

=head1 Key naming

There is only a single C<%^H>, but arbitrarily many modules that want
to use its scoping semantics.  To avoid stepping on each other's toes,
they need to be sure to use different keys in the hash.  It is therefore
conventional for a module to use only keys that begin with the module's
name (the name of its main package) and a "/" character.  After this
module-identifying prefix, the rest of the key is entirely up to the
module: it may include any characters whatsoever.  For example, a module
C<Foo::Bar> should use keys such as C<Foo::Bar/baz> and C<Foo::Bar/$%/_!>.
Modules following this convention all play nicely with each other.

The Perl core uses a handful of keys in C<%^H> which do not follow this
convention, because they predate it.  Keys that follow the convention
won't conflict with the core's historical keys.
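
One way to follow the convention consistently is to build every key
from C<__PACKAGE__>.  The sketch below assumes a hypothetical module
C<My::Pragma> with hypothetical helpers C<_key> and C<setting>; it is
defined inline only so that the example runs as a single file.

    use 5.010;
    use strict;
    use warnings;

    BEGIN {
        package My::Pragma;        # hypothetical module name

        # Prefix every key with the package name and a "/".
        sub _key { return __PACKAGE__ . "/" . shift }

        sub import {
            my ($class, %opt) = @_;
            $^H{ _key("enabled") } = 1;
            $^H{ _key("level") }   = $opt{level} // 1;
        }

        sub unimport { $^H{ _key("enabled") } = 0 }

        # Look up a setting as seen at the call site of setting().
        sub setting {
            my ($name) = @_;
            my $hinthash = (caller(0))[10] or return;
            return $hinthash->{ _key($name) };
        }

        $INC{"My/Pragma.pm"} = 1;  # pretend the module file was loaded
    }

    use My::Pragma level => 3;

    # The value is stored under the key "My::Pragma/level".
    print My::Pragma::setting("level"), "\n";

Because the prefix is derived from the package name, renaming the
module renames its keys with it.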

=head1 Implementation details

The optree is shared between threads.  This means there is a possibility that
the optree will outlive the particular thread (and therefore the interpreter
instance) that created it, so true Perl scalars cannot be stored in the
optree.  Instead a compact form is used, which can only store values that are
integers (signed and unsigned), strings or C<undef> - references and
floating point values are stringified.  If you need to store multiple values
or complex structures, you should serialise them, for example with C<pack>.
The deletion of a hash key from C<%^H> is recorded, and as ever can be
distinguished from the existence of a key with value C<undef> with
C<exists>.
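
As an illustration of serialising several values into one hint, the
sketch below packs a list of byte strings with a length-prefixed
C<pack> template and unpacks it again at run time.  The pragma name
C<mylist> and its C<names> routine are assumptions made for this
example, and the template C<(N/a*)*> is just one possible encoding.

    use strict;
    use warnings;

    BEGIN {
        package mylist;            # hypothetical pragma, defined inline

        sub import {
            my ($class, @names) = @_;
            # One string holds every name, each with a 32-bit length.
            $^H{"mylist/names"} = pack("(N/a*)*", @names);
        }

        sub names {
            my $hinthash = (caller(0))[10] or return;
            my $packed   = $hinthash->{"mylist/names"};
            return defined $packed ? unpack("(N/a*)*", $packed) : ();
        }

        $INC{"mylist.pm"} = 1;     # pretend the module file was loaded
    }

    use mylist qw(alpha beta gamma);

    print join(",", mylist::names()), "\n";

Only a plain string is stored in C<%^H>, so nothing in the optree
refers to a Perl data structure owned by a particular thread.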

B<Don't> attempt to store references to data structures as integers which
are retrieved via C<caller> and converted back, as this will not be threadsafe.
Accesses would be to the structure without locking (which is not safe for
Perl's scalars), and either the structure has to leak, or it has to be
freed when its creating thread terminates, which may be before the optree
referencing it is deleted, if other threads outlive it.

=encoding utf8

=head1 NAME

perl5224delta - what is new for perl v5.22.4

=head1 DESCRIPTION

This document describes differences between the 5.22.3 release and the 5.22.4
release.

If you are upgrading from an earlier release such as 5.22.2, first read
L<perl5223delta>, which describes differences between 5.22.2 and 5.22.3.

=head1 Security

=head2 Improved handling of '.' in @INC in base.pm

The handling of (the removal of) C<'.'> in C<@INC> in L<base> has been
improved.  This resolves some problematic behaviour in the approach taken in
Perl 5.22.3, which is probably best described in the following two threads on
the Perl 5 Porters mailing list:
L<http://www.nntp.perl.org/group/perl.perl5.porters/2016/08/msg238991.html>,
L<http://www.nntp.perl.org/group/perl.perl5.porters/2016/10/msg240297.html>.

=head2 "Escaped" colons and relative paths in PATH

On Unix systems, Perl treats any relative paths in the PATH environment
variable as tainted when starting a new process.  Previously, it was allowing a
backslash to escape a colon (unlike the OS), consequently allowing relative
paths to be considered safe if the PATH was set to something like C</\:.>.  The
check has been fixed to treat C<.> as tainted in that example.
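
To see why C</\:.> contains a relative entry, it helps to split the
value the way the operating system does, on every colon and with no
escaping.  The routine below is only an illustrative sketch with a
hypothetical name, C<relative_path_entries>; it is not the check that
Perl itself performs.

    use strict;
    use warnings;

    # Return the entries of a PATH-style string that are not absolute.
    # An empty entry means the current directory, so it counts too.
    sub relative_path_entries {
        my ($path) = @_;
        return grep { $_ eq '' || !m{^/} } split /:/, $path, -1;
    }

    my @relative = relative_path_entries('/\\:.');
    print "relative entry: '$_'\n" for @relative;

Here the backslash is simply the last character of the first entry,
and the second entry is C<.>, which is relative.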

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<base> has been upgraded from version 2.22 to 2.22_01.

=item *

L<Module::CoreList> has been upgraded from version 5.20170114_22 to 5.20170715_22.

=back

=head1 Selected Bug Fixes

=over 4

=item *

Fixed a crash with C<s///l> where it thought it was dealing with UTF-8 when it
wasn't.
L<[perl #129038]|https://rt.perl.org/Ticket/Display.html?id=129038>

=back

=head1 Acknowledgements

Perl 5.22.4 represents approximately 6 months of development since Perl 5.22.3
and contains approximately 2,200 lines of changes across 52 files from 16
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 970 lines of changes to 18 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.22.4:

Aaron Crane, Abigail, Aristotle Pagaltzis, Chris 'BinGOs' Williams, David
Mitchell, Eric Herman, Father Chrysostomos, James E Keenan, Karl Williamson,
Lukas Mai, Renee Baecker, Ricardo Signes, Sawyer X, Stevan Little, Steve Hay,
Tony Cook.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
https://rt.perl.org/ .  There may also be information at
http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=encoding utf8

=head1 NAME

perl5142delta - what is new for perl v5.14.2

=head1 DESCRIPTION

This document describes differences between the 5.14.1 release and
the 5.14.2 release.

If you are upgrading from an earlier release such as 5.14.0, first read
L<perl5141delta>, which describes differences between 5.14.0 and
5.14.1.

=head1 Core Enhancements

No changes since 5.14.0.

=head1 Security

=head2 C<File::Glob::bsd_glob()> memory error with GLOB_ALTDIRFUNC (CVE-2011-2728)

Calling C<File::Glob::bsd_glob> with the unsupported flag GLOB_ALTDIRFUNC would
cause an access violation / segfault.  A Perl program that accepts a flags value from
an external source could expose itself to denial of service or arbitrary code
execution attacks.  There are no known exploits in the wild.  The problem has been
corrected by explicitly disabling all unsupported flags and setting unused function
pointers to null.  Bug reported by Clément Lecigne.

=head2 C<Encode> decode_xs n-byte heap-overflow (CVE-2011-2939)

A bug in C<Encode> could, on certain inputs, cause the heap to overflow.
This problem has been corrected.  Bug reported by Robert Zacek.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.14.0. If any
exist, they are bugs and reports are welcome.

=head1 Deprecations

There have been no deprecations since 5.14.0.

=head1 Modules and Pragmata

=head2 New Modules and Pragmata

None

=head2 Updated Modules and Pragmata

=over 4

=item *

L<CPAN> has been upgraded from version 1.9600 to version 1.9600_01.

L<CPAN::Distribution> has been upgraded from version 1.9602 to 1.9602_01.

Backported bugfixes from CPAN version 1.9800.  Ensures proper
detection of C<configure_requires> prerequisites from CPAN Meta files
in the case where C<dynamic_config> is true.  [rt.cpan.org #68835]

Also ensures that C<configure_requires> is only checked in META files,
not MYMETA files, to protect against MYMETA generation that drops
C<configure_requires>.

=item *

L<Encode> has been upgraded from version 2.42 to 2.42_01.

See L</Security>.

=item *

L<File::Glob> has been upgraded from version 1.12 to version 1.13.

See L</Security>.

=item *

L<PerlIO::scalar> has been upgraded from version 0.11 to 0.11_01.

It fixes a problem with C<< open my $fh, ">", \$scalar >> not working if
C<$scalar> is a copy-on-write scalar.

=back
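
For readers unfamiliar with the construct mentioned in the
L<PerlIO::scalar> entry, the sketch below shows the general pattern of
writing to an in-memory filehandle.  The variable names are arbitrary,
and the example does not try to reproduce the copy-on-write case that
was fixed.

    use strict;
    use warnings;

    my $buffer = "";

    # Opening a reference to a scalar gives a filehandle backed by it.
    open(my $fh, ">", \$buffer) or die "cannot open in-memory file: $!";
    print {$fh} "hello\n";
    close($fh) or die "close failed: $!";

    print $buffer;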

=head2 Removed Modules and Pragmata

None

=head1 Platform Support

=head2 New Platforms

None

=head2 Discontinued Platforms

None

=head2 Platform-Specific Notes

=over 4

=item HP-UX PA-RISC/64 now supports gcc-4.x

A fix to correct the socketsize now makes the test suite pass on HP-UX
PA-RISC for 64bitall builds.

=item Building on OS X 10.7 Lion and Xcode 4 works again

The build system has been updated to work with the build tools under Mac OS X
10.7.

=back

=head1 Bug Fixes

=over 4

=item *

In @INC filters (subroutines returned by subroutines in @INC), $_ used to
misbehave: If returned from a subroutine, it would not be copied, but the
variable itself would be returned; and freeing $_ (e.g., with C<undef *_>)
would cause perl to crash.  This has been fixed [perl #91880].

=item *

Perl 5.10.0 introduced some faulty logic that made "U*" in the middle of
a pack template equivalent to "U0" if the input string was empty.  This has
been fixed [perl #90160].

=item *

C<caller> no longer leaks memory when called from the DB package if
C<@DB::args> was assigned to after the first call to C<caller>.  L<Carp>
was triggering this bug [perl #97010].

=item *

C<utf8::decode> had a nasty bug that would modify copy-on-write scalars'
string buffers in place (i.e., skipping the copy).  This could result in
hashes having two elements with the same key [perl #91834].

=item *

Localising a tied variable used to make it read-only if it contained a
copy-on-write string.

=item *

Elements of restricted hashes (see the L<fields> pragma) containing
copy-on-write values couldn't be deleted, nor could such hashes be cleared
(C<%hash = ()>).

=item *

Locking a hash element that is a glob copy no longer causes subsequent
assignment to it to corrupt the glob.

=item *

A panic involving the combination of the regular expression modifiers
C</aa> introduced in 5.14.0 and the C<\b> escape sequence has been
fixed [perl #95964].

=back

=head1 Known Problems

This is a list of some significant unfixed bugs, which are regressions
from 5.12.0.

=over 4

=item *

C<PERL_GLOBAL_STRUCT> is broken.

Since perl 5.14.0, building with C<-DPERL_GLOBAL_STRUCT> hasn't been
possible. This means that perl currently doesn't work on any platforms that
require it to be built this way, including Symbian.

While C<PERL_GLOBAL_STRUCT> now works again on recent development versions of
perl, whether it actually works on Symbian again hasn't been verified.

We'd be very interested in hearing from anyone working with Perl on Symbian.

=back

=head1 Acknowledgements

Perl 5.14.2 represents approximately three months of development since
Perl 5.14.1 and contains approximately 1200 lines of changes
across 61 files from 9 authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers.  The following people are known to
have contributed the improvements that became Perl 5.14.2:

Craig A. Berry, David Golden, Father Chrysostomos, Florian Ragwitz, H.Merijn
Brand, Karl Williamson, Nicholas Clark, Pau Amma and Ricardo Signes.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be able
to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for
security issues in the Perl core, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=head1 NAME

perl587delta - what is new for perl v5.8.7

=head1 DESCRIPTION

This document describes differences between the 5.8.6 release and
the 5.8.7 release.

=head1 Incompatible Changes

There are no changes incompatible with 5.8.6.

=head1 Core Enhancements

=head2 Unicode Character Database 4.1.0

The copy of the Unicode Character Database included in Perl 5.8 has
been updated to 4.1.0 from 4.0.1. See
L<http://www.unicode.org/versions/Unicode4.1.0/#NotableChanges> for the
notable changes.

=head2 suidperl less insecure

A pair of exploits in C<suidperl> involving debugging code have been closed.

For new projects the core perl team strongly recommends that you use
dedicated, single purpose security tools such as C<sudo> in preference to
C<suidperl>.

=head2 Optional site customization script

The perl interpreter can be built to allow the use of a site customization
script. By default this is not enabled, to be consistent with previous perl
releases. To use this, add C<-Dusesitecustomize> to the command line flags
when running the C<Configure> script. See also L<perlrun/-f>.

=head2 C<Config.pm> is now much smaller

C<Config.pm> is now about 3K rather than 32K, with the infrequently used
code and C<%Config> values loaded on demand. This is transparent to the
programmer, but means that most code will save parsing and loading 29K of
script (for example, code that uses C<File::Find>).

=head1 Modules and Pragmata

=over 4

=item *

B upgraded to version 1.09

=item *

base upgraded to version 2.07

=item *

bignum upgraded to version 0.17

=item *

bytes upgraded to version 1.02

=item *

Carp upgraded to version 1.04

=item *

CGI upgraded to version 3.10

=item *

Class::ISA upgraded to version 0.33

=item *

Data::Dumper upgraded to version 2.121_02

=item *

DB_File upgraded to version 1.811

=item *

Devel::PPPort upgraded to version 3.06

=item *

Digest upgraded to version 1.10

=item *

Encode upgraded to version 2.10

=item *

FileCache upgraded to version 1.05

=item *

File::Path upgraded to version 1.07

=item *

File::Temp upgraded to version 0.16

=item *

IO::File upgraded to version 1.11

=item *

IO::Socket upgraded to version 1.28

=item *

Math::BigInt upgraded to version 1.77

=item *

Math::BigRat upgraded to version 0.15

=item *

overload upgraded to version 1.03

=item *

PathTools upgraded to version 3.05

=item *

Pod::HTML upgraded to version 1.0503

=item *

Pod::Perldoc upgraded to version 3.14

=item *

Pod::LaTeX upgraded to version 0.58

=item *

Pod::Parser upgraded to version 1.30

=item *

Symbol upgraded to version 1.06

=item *

Term::ANSIColor upgraded to version 1.09

=item *

Test::Harness upgraded to version 2.48

=item *

Test::Simple upgraded to version 0.54

=item *

Text::Wrap upgraded to version 2001.09293, to fix a bug when wrap() was
called with a non-space separator.

=item *

threads::shared upgraded to version 0.93

=item *

Time::HiRes upgraded to version 1.66

=item *

Time::Local upgraded to version 1.11

=item *

Unicode::Normalize upgraded to version 0.32

=item *

utf8 upgraded to version 1.05

=item *

Win32 upgraded to version 0.24, which provides Win32::GetFileVersion

=back

=head1 Utility Changes

=head2 find2perl enhancements

C<find2perl> has new options C<-iname>, C<-path> and C<-ipath>.

=head1 Performance Enhancements

The internal pointer mapping hash used during ithreads cloning now uses an
arena for memory allocation. In tests this reduced ithreads cloning time by
about 10%.

=head1 Installation and Configuration Improvements

=over 4

=item *

The Win32 "dmake" makefile.mk has been updated to make it compatible
with the latest versions of dmake.

=item *

C<PERL_MALLOC>, C<DEBUG_MSTATS>, C<PERL_HASH_SEED_EXPLICIT> and C<NO_HASH_SEED>
should now work in Win32 makefiles.

=back

=head1 Selected Bug Fixes

=over 4

=item *

The socket() function on Win32 has been fixed so that it is able to use
transport providers which specify a protocol of 0 (meaning any protocol
is allowed) once more.  (This was broken in 5.8.6, and typically caused
the use of ICMP sockets to fail.)

=item *

Another obscure bug involving C<substr> and UTF-8 caused by bad internal
offset caching has been identified and fixed.

=item *

A bug involving the loading of UTF-8 tables by the regexp engine has been
fixed - code such as C<"\x{100}" =~ /[[:print:]]/> will no longer give
corrupt results.

=item *

Case conversion operations such as C<uc> on a long Unicode string could
exhaust memory. This has been fixed.

=item *

C<index>/C<rindex> were buggy for some combinations of Unicode and
non-Unicode data. This has been fixed.

=item *

C<read> (and presumably C<sysread>) would expose the UTF-8 internals when
reading from a byte oriented file handle into a UTF-8 scalar. This has
been fixed.

=item *

Several C<pack>/C<unpack> bug fixes:

=over 4

=item *

Checksums with C<b> or C<B> formats were broken.

=item *

C<unpack> checksums could overflow with the C<C> format.

=item *

C<U0> and C<C0> are now scoped to C<()> C<pack> sub-templates.

=item *

Counted length prefixes now don't change C<C0>/C<U0> mode.

=item *

C<pack> C<Z0> used to destroy the preceding character.

=item *

C<P>/C<p> C<pack> formats used to only recognise literal C<undef>.

=back

=item *

Using closures with ithreads could cause perl to crash. This was due to
failure to correctly lock internal OP structures, and has been fixed.

=item *

The return value of C<close> now correctly reflects any file errors that
occur while flushing the handle's data, instead of just giving failure if
the actual underlying file close operation failed.

=item *

C<not() || 1> used to segfault. C<not()> now behaves like C<not(0)>, which was
the pre 5.6.0 behaviour.

=item *

C<h2ph> has various enhancements to cope with constructs in header files that
used to result in incorrect or invalid output.

=back
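
The checksum feature mentioned in the C<pack>/C<unpack> fixes above is
requested with a C<%> prefix in the template.  The following sketch
uses made-up input data to show a byte sum with the C<C> format and a
count of set bits with the C<b> format.

    use strict;
    use warnings;

    # 32-bit sum of the byte values: 97 + 98 + 99
    my $sum = unpack("%32C*", "abc");

    # Number of set bits: four in 0x0F plus one in 0x01
    my $bits = unpack("%32b*", "\x0F\x01");

    print "byte sum: $sum\n";      # 294
    print "set bits: $bits\n";     # 5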

=head1 New or Changed Diagnostics

There is a new taint error, "%ENV is aliased to %s". This error is thrown
when taint checks are enabled and when C<*ENV> has been aliased, so that
C<%ENV> has no env-magic anymore and hence the environment cannot be verified
as taint-free.

The internals of C<pack> and C<unpack> have been updated. All legitimate
templates should work as before, but there may be some changes in the error
reported for complex failure cases. Any behaviour changes for non-error cases
are bugs, and should be reported.

=head1 Changed Internals

There has been a fair amount of refactoring of the C<C> source code, partly to
make it tidier and more maintainable. The resulting object code and the
C<perl> binary may well be smaller than 5.8.6, and hopefully faster in some
cases, but apart from this there should be no user-detectable changes.

C<${^UTF8LOCALE}> has been added to give perl space access to C<PL_utf8locale>.

The size of the arenas used to allocate SV heads and most SV bodies can now
be changed at compile time. The old size was 1008 bytes, the new default size
is 4080 bytes.

=head1 Known Problems

Unicode strings returned from overloaded operators can be buggy. This is a
long standing bug reported since 5.8.6 was released, but we do not yet have
a suitable fix for it.

=head1 Platform Specific Problems

On UNICOS, lib/Math/BigInt/t/bigintc.t hangs burning CPU.
ext/B/t/bytecode.t and ext/Socket/t/socketpair.t both fail tests.
These are unlikely to be resolved, as our valiant UNICOS porter's last
Cray is being decommissioned.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org.  There may also be
information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=encoding utf8

=head1 NAME

perl5124delta - what is new for perl v5.12.4

=head1 DESCRIPTION

This document describes differences between the 5.12.3 release and
the 5.12.4 release.

If you are upgrading from an earlier release such as 5.12.2, first read
L<perl5123delta>, which describes differences between 5.12.2
and 5.12.3. The major changes made in 5.12.0 are described in L<perl5120delta>.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.12.3. If any
exist, they are bugs and reports are welcome.

=head1 Selected Bug Fixes

When strict "refs" mode is off, C<%{...}> in rvalue context returns
C<undef> if its argument is undefined.  An optimisation introduced in Perl
5.12.0 to make C<keys %{...}> faster when used as a boolean did not take
this into account, causing C<keys %{+undef}> (and C<keys %$foo> when
C<$foo> is undefined) to be an error, which it should be in strict
mode only [perl #81750].

C<lc>, C<uc>, C<lcfirst>, and C<ucfirst> no longer return untainted strings
when the argument is tainted. This has been broken since perl 5.8.9
[perl #87336].

Fixed a case where it was possible that a freed buffer may have been read
from when parsing a here document.

=head1 Modules and Pragmata

L<Module::CoreList> has been upgraded from version 2.43 to 2.50.

=head1 Testing

The F<cpan/CGI/t/http.t> test script has been fixed to work when the
environment has HTTPS_* environment variables, such as HTTPS_PROXY.

=head1 Documentation

Updated the documentation for rand() in L<perlfunc> to note that it is not
cryptographically secure.

=head1 Platform Specific Notes

=over 4

=item Linux

Support Ubuntu 11.04's new multi-arch library layout.

=back

=head1 Acknowledgements

Perl 5.12.4 represents approximately 5 months of development since
Perl 5.12.3 and contains approximately 200 lines of changes across
11 files from 8 authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers.  The following people are known to
have contributed the improvements that became Perl 5.12.4:

Andy Dougherty, David Golden, David Leadbeater, Father Chrysostomos,
Florian Ragwitz, Jesse Vincent, Leon Brocard, Zsbán Ambrus.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be able
to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for
security issues in the Perl core, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=head1 NAME
X<format> X<report> X<chart>

perlform - Perl formats

=head1 DESCRIPTION

Perl has a mechanism to help you generate simple reports and charts.  To
facilitate this, Perl helps you code up your output page close to how it
will look when it's printed.  It can keep track of things like how many
lines are on a page, what page you're on, when to print page headers,
etc.  Keywords are borrowed from FORTRAN: format() to declare and write()
to execute; see their entries in L<perlfunc>.  Fortunately, the layout is
much more legible, more like BASIC's PRINT USING statement.  Think of it
as a poor man's nroff(1).
X<nroff>

Formats, like packages and subroutines, are declared rather than
executed, so they may occur at any point in your program.  (Usually it's
best to keep them all together though.) They have their own namespace
apart from all the other "types" in Perl.  This means that if you have a
function named "Foo", it is not the same thing as having a format named
"Foo".  However, the default name for the format associated with a given
filehandle is the same as the name of the filehandle.  Thus, the default
format for STDOUT is named "STDOUT", and the default format for filehandle
TEMP is named "TEMP".  They just look the same.  They aren't.

Output record formats are declared as follows:

    format NAME =
    FORMLIST
    .

If the name is omitted, format "STDOUT" is defined. A single "." in 
column 1 is used to terminate a format.  FORMLIST consists of a sequence 
of lines, each of which may be one of three types:

=over 4

=item 1.

A comment, indicated by putting a '#' in the first column.

=item 2.

A "picture" line giving the format for one output line.

=item 3.

An argument line supplying values to plug into the previous picture line.

=back

Picture lines contain output field definitions, intermingled with
literal text. These lines do not undergo any kind of variable interpolation.
Field definitions are made up from a set of characters, for starting and
extending a field to its desired width. This is the complete set of
characters for field definitions:
X<format, picture line>
X<@> X<^> X<< < >> X<< | >> X<< > >> X<#> X<0> X<.> X<...>
X<@*> X<^*> X<~> X<~~>

   @    start of regular field
   ^    start of special field
   <    pad character for left justification
   |    pad character for centering
   >    pad character for right justification
   #    pad character for a right-justified numeric field
   0    instead of first #: pad number with leading zeroes
   .    decimal point within a numeric field
   ...  terminate a text field, show "..." as truncation evidence
   @*   variable width field for a multi-line value
   ^*   variable width field for next line of a multi-line value
   ~    suppress line with all fields empty
   ~~   repeat line until all fields are exhausted

Each field in a picture line starts with either "@" (at) or "^" (caret),
indicating what we'll call, respectively, a "regular" or "special" field.
The choice of pad characters determines whether a field is textual or
numeric. The tilde operators are not part of a field.  Let's look at
the various possibilities in detail.


=head2 Text Fields
X<format, text field>

The length of the field is supplied by padding out the field with multiple 
"E<lt>", "E<gt>", or "|" characters to specify a non-numeric field with,
respectively, left justification, right justification, or centering. 
For a regular field, the value (up to the first newline) is taken and
printed according to the selected justification, truncating excess characters.
If you terminate a text field with "...", three dots will be shown if
the value is truncated. A special text field may be used to do rudimentary 
multi-line text block filling; see L</Using Fill Mode> for details.

   Example:
      format STDOUT =
      @<<<<<<   @||||||   @>>>>>>
      "left",   "middle", "right"
      .
   Output:
      left      middle    right
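
The truncation rule can also be tried out without declaring a format, by
calling formline() directly (see L</Accessing Formatting Internals>).  The
following is only a minimal sketch; the helper name C<fit> and the sample
strings are made up for this illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: format one value with a 10-column text field
# that ends in "..." and return the resulting line.
sub fit {
    my ($value) = @_;
    $^A = "";                                # start with an empty accumulator
    formline('@<<<<<<...' . "\n", $value);   # single quotes keep "@" literal
    my $line = $^A;
    $^A = "";
    return $line;
}

my $short = fit("perl");                      # fits, so no dots are added
my $long  = fit("a value that is too long");  # truncated, ends in "..."
print $short, $long;
```

The three dots count towards the width of the field, so a truncated value
still occupies the same ten columns.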


=head2 Numeric Fields
X<#> X<format, numeric field>

Using "#" as a padding character specifies a numeric field, with
right justification. An optional "." defines the position of the
decimal point. With a "0" (zero) instead of the first "#", the
formatted number will be padded with leading zeroes if necessary.
A special numeric field is blanked out if the value is undefined.
If the resulting value would exceed the width specified, the field is
filled with "#" as overflow evidence.

   Example:
      format STDOUT =
      @###   @.###   @##.###  @###   @###   ^####
       42,   3.1415,  undef,    0, 10000,   undef
      .
   Output:
        42   3.142     0.000     0   ####


=head2 The Field @* for Variable-Width Multi-Line Text
X<@*>

The field "@*" can be used for printing multi-line, nontruncated
values; it should (but need not) appear by itself on a line. A final
line feed is chomped off, but all other characters are emitted verbatim.
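
As a rough illustration, the sketch below feeds a two-line value to "@*"
through formline() (described under L</Accessing Formatting Internals>)
instead of a declared format.  The variable names and the C<Body:> label are
assumptions made for this example only:

```perl
use strict;
use warnings;

my $body = "first line\nsecond line\n";

$^A = "";                                   # clear the accumulator
formline("Body:\n" . '@*' . "\n", $body);   # "@*" sits alone on its picture line
my $out = $^A;
$^A = "";

print $out;
```

The final newline of C<$body> is dropped, and the newline that ends the
picture line terminates the record instead.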


=head2 The Field ^* for Variable-Width One-line-at-a-time Text
X<^*>

Like "@*", this is a variable-width field. The value supplied must be a 
scalar variable. Perl puts the first line (up to the first "\n") of the 
text into the field, and then chops off the front of the string so that 
the next time the variable is referenced, more of the text can be printed. 
The variable will I<not> be restored.

   Example:
      $text = "line 1\nline 2\nline 3";
      format STDOUT =
      Text: ^*
            $text
      ~~    ^*
            $text
      .
   Output:
      Text: line 1
            line 2
            line 3


=head2 Specifying Values
X<format, specifying values>

The values are specified on the following format line in the same order as
the picture fields.  The expressions providing the values must be
separated by commas.  They are all evaluated in a list context
before the line is processed, so a single list expression could produce
multiple list elements.  The expressions may be spread out to more than
one line if enclosed in braces.  If so, the opening brace must be the first
token on the first line.  If an expression evaluates to a number with a
decimal part, and if the corresponding picture specifies that the decimal
part should appear in the output (that is, any picture except multiple "#"
characters B<without> an embedded "."), the character used for the decimal
point is determined by the current LC_NUMERIC locale if C<use locale> is in
effect.  This means that, if, for example, the run-time environment happens
to specify a German locale, "," will be used instead of the default ".".  See
L<perllocale> and L</"WARNINGS"> for more information.


=head2 Using Fill Mode
X<format, fill mode>

On text fields the caret enables a kind of fill mode.  Instead of an
arbitrary expression, the value supplied must be a scalar variable
that contains a text string.  Perl puts the next portion of the text into
the field, and then chops off the front of the string so that the next time
the variable is referenced, more of the text can be printed.  (Yes, this
means that the variable itself is altered during execution of the write()
call, and is not restored.)  The next portion of text is determined by
a crude line-breaking algorithm. You may use the carriage return character
(C<\r>) to force a line break. You can change which characters are legal 
to break on by changing the variable C<$:> (that's 
$FORMAT_LINE_BREAK_CHARACTERS if you're using the English module) to a 
list of the desired characters.

Normally you would use a sequence of fields in a vertical stack associated 
with the same scalar variable to print out a block of text. You might wish 
to end the final field with the text "...", which will appear in the output 
if the text was too long to appear in its entirety.  
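
The consuming behaviour described above can be observed with formline()
(see L</Accessing Formatting Internals>).  This is a sketch under the
assumption of a 10-column field and a made-up sample sentence; it combines
the caret field with "~~" (see L</Repeating Format Lines>) so that one
picture line is reused until the text runs out:

```perl
use strict;
use warnings;

my $text = "the quick brown fox jumps over the lazy dog";
my $copy = $text;            # formline() consumes $copy, so keep the original

$^A = "";
# One 10-column caret field; "~~" repeats the line until $copy is used up.
formline('^<<<<<<<<<~~' . "\n", $copy);
my $filled = $^A;
$^A = "";

print $filled;
print "left over: '$copy'\n";   # the filled variable has been chopped away
```

Working on a copy is a deliberate choice here: the variable handed to a
caret field is altered and not restored.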


=head2 Suppressing Lines Where All Fields Are Void
X<format, suppressing lines>

Using caret fields can produce lines where all fields are blank. You can
suppress such lines by putting a "~" (tilde) character anywhere in the
line.  The tilde will be translated to a space upon output.


=head2 Repeating Format Lines
X<format, repeating lines>

If you put two contiguous tilde characters "~~" anywhere into a line,
the line will be repeated until all the fields on the line are exhausted,
i.e. undefined. For special (caret) text fields this will occur sooner or
later, but if you use a text field of the at variety, the expression you
supply had better not give the same value every time forever! (C<shift(@f)>
is a simple example that would work.)  Don't use a regular (at) numeric 
field in such lines, because it will never go blank.


=head2 Top of Form Processing
X<format, top of form> X<top> X<header>

Top-of-form processing is by default handled by a format with the
same name as the current filehandle with "_TOP" concatenated to it.
It's triggered at the top of each page.  See L<perlfunc/write>.

Examples:

 # a report on the /etc/passwd file
 format STDOUT_TOP =
                         Passwd File
 Name                Login    Office   Uid   Gid Home
 ------------------------------------------------------------------
 .
 format STDOUT =
 @<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<<
 $name,              $login,  $office,$uid,$gid, $home
 .


 # a report from a bug report form
 format STDOUT_TOP =
                         Bug Reports
 @<<<<<<<<<<<<<<<<<<<<<<<     @|||         @>>>>>>>>>>>>>>>>>>>>>>>
 $system,                      $%,         $date
 ------------------------------------------------------------------
 .
 format STDOUT =
 Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
          $subject
 Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
        $index,                       $description
 Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
           $priority,        $date,   $description
 From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
       $from,                         $description
 Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
              $programmer,            $description
 ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                      $description
 ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                      $description
 ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                      $description
 ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<
                                      $description
 ~                                    ^<<<<<<<<<<<<<<<<<<<<<<<...
                                      $description
 .

It is possible to intermix print()s with write()s on the same output
channel, but you'll have to handle C<$-> (C<$FORMAT_LINES_LEFT>)
yourself.

=head2 Format Variables
X<format variables>
X<format, variables>

The current format name is stored in the variable C<$~> (C<$FORMAT_NAME>),
and the current top of form format name is in C<$^> (C<$FORMAT_TOP_NAME>).
The current output page number is stored in C<$%> (C<$FORMAT_PAGE_NUMBER>),
and the number of lines on the page is in C<$=> (C<$FORMAT_LINES_PER_PAGE>).
Whether to autoflush output on this handle is stored in C<$|>
(C<$OUTPUT_AUTOFLUSH>).  The string output before each top of page (except
the first) is stored in C<$^L> (C<$FORMAT_FORMFEED>).  These variables are
set on a per-filehandle basis, so you'll need to select() into a different
one to affect them:

    select((select(OUTF),
	    $~ = "My_Other_Format",
	    $^ = "My_Top_Format"
	   )[0]);

Pretty ugly, eh?  It's a common idiom though, so don't be too surprised
when you see it.  You can at least use a temporary variable to hold
the previous filehandle.  This is a much better approach in general:
not only does legibility improve, but you also get separate statements
that you can single-step through in the debugger:

    $ofh = select(OUTF);
    $~ = "My_Other_Format";
    $^ = "My_Top_Format";
    select($ofh);

If you use the English module, you can even read the variable names:

    use English;
    $ofh = select(OUTF);
    $FORMAT_NAME     = "My_Other_Format";
    $FORMAT_TOP_NAME = "My_Top_Format";
    select($ofh);

But you still have those funny select()s.  So just use the FileHandle
module.  Now, you can access these special variables using lowercase
method names instead:

    use FileHandle;
    format_name     OUTF "My_Other_Format";
    format_top_name OUTF "My_Top_Format";

Much better!
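
The same methods can be called with arrow syntax on a lexical filehandle.
The sketch below is an assumption-laden variation rather than the form shown
above: it loads L<IO::Handle> (the class that supplies these methods) instead
of FileHandle, and it writes to an in-memory buffer so that no real file is
needed.

```perl
use strict;
use warnings;
use IO::Handle;

# Write to an in-memory buffer so the sketch needs no external file.
open(my $out, '>', \my $buffer) or die "cannot open in-memory handle: $!";

my $before = select();                    # handle selected before the calls
$out->format_name("My_Other_Format");     # sets $~ for $out only
$out->format_top_name("My_Top_Format");   # sets $^ for $out only
my $after = select();                     # still the same handle

my $name = $out->format_name;             # no argument: just report the name
print "format for \$out is $name\n";
```

The currently selected handle is the same before and after the calls, which
is the point of using the methods instead of explicit select()s.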

=head1 NOTES

Because the values line may contain arbitrary expressions (for at fields,
not caret fields), you can farm out more sophisticated processing
to other functions, like sprintf() or one of your own.  For example:

    format Ident =
	@<<<<<<<<<<<<<<<
	&commify($n)
    .
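
The commify() function used above is not a builtin.  One possible
definition, given here only as a sketch, inserts commas with a repeated
substitution; the field is then filled through formline() (see
L</Accessing Formatting Internals>) so that the example can run on its own.
The sample number and field width are arbitrary:

```perl
use strict;
use warnings;

# Hypothetical helper: insert commas into the integer part of a number.
sub commify {
    my ($n) = @_;
    1 while $n =~ s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $n;
}

$^A = "";
formline('@>>>>>>>>>>>>' . "\n", commify(1234567));   # right-justified field
my $line = $^A;
$^A = "";

print $line;
```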

To get a real at or caret into the field, do this:

    format Ident =
    I have an @ here.
	    "@"
    .

To center a whole line of text, do something like this:

    format Ident =
    @|||||||||||||||||||||||||||||||||||||||||||||||
	    "Some text line"
    .

There is no builtin way to say "float this to the right hand side
of the page, however wide it is."  You have to specify where it goes.
The truly desperate can generate their own format on the fly, based
on the current number of columns, and then eval() it:

    $format  = "format STDOUT = \n"
             . '^' . '<' x $cols . "\n"
             . '$entry' . "\n"
             . "\t^" . "<" x ($cols-8) . "~~\n"
             . '$entry' . "\n"
             . ".\n";
    print $format if $Debugging;
    eval $format;
    die $@ if $@;

Which would generate a format looking something like this:

 format STDOUT =
 ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
 $entry
         ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~
 $entry
 .

Here's a little program that's somewhat like fmt(1):

 format =
 ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ~~
 $_

 .

 $/ = '';
 while (<>) {
     s/\s*\n\s*/ /g;
     write;
 }

=head2 Footers
X<format, footer> X<footer>

While $FORMAT_TOP_NAME contains the name of the current header format,
there is no corresponding mechanism to automatically do the same thing
for a footer.  Not knowing how big a format is going to be until you
evaluate it is one of the major problems.  It's on the TODO list.

Here's one strategy:  If you have a fixed-size footer, you can get footers
by checking $FORMAT_LINES_LEFT before each write() and print the footer
yourself if necessary.

Here's another strategy: Open a pipe to yourself, using C<open(MYSELF, "|-")>
(see L<perlfunc/open>) and always write() to MYSELF instead of STDOUT.
Have your child process massage its STDIN to rearrange headers and footers
however you like.  Not very convenient, but doable.

=head2 Accessing Formatting Internals
X<format, internals>

For low-level access to the formatting mechanism, you may use formline()
and access C<$^A> (the $ACCUMULATOR variable) directly.

For example:

    $str = formline <<'END', 1,2,3;
    @<<<  @|||  @>>>
    END

    print "Wow, I just stored '$^A' in the accumulator!\n";

Or to make an swrite() subroutine, which is to write() what sprintf()
is to printf(), do this:

    use Carp;
    sub swrite {
	croak "usage: swrite PICTURE ARGS" unless @_;
	my $format = shift;
	$^A = "";
	formline($format,@_);
	return $^A;
    }

    $string = swrite(<<'END', 1, 2, 3);
 Check me out
 @<<<  @|||  @>>>
 END
    print $string;

=head1 WARNINGS

The lone dot that ends a format can also prematurely end a mail
message passing through a misconfigured Internet mailer (and based on
experience, such misconfiguration is the rule, not the exception).  So
when sending format code through mail, you should indent it so that
the format-ending dot is not on the left margin; this will prevent
SMTP cutoff.

Lexical variables (declared with "my") are not visible within a
format unless the format is declared within the scope of the lexical
variable.

If a program's environment specifies an LC_NUMERIC locale and C<use
locale> is in effect when the format is declared, the locale is used
to specify the decimal point character in formatted output.  Formatted
output cannot be controlled by C<use locale> at the time when write()
is called. See L<perllocale> for further discussion of locale handling.

Within strings that are to be displayed in a fixed-length text field,
each control character is substituted by a space. (But remember the
special meaning of C<\r> when using fill mode.) This is done to avoid
misalignment when control characters "disappear" on some output media.

=head1 NAME

perldtrace - Perl's support for DTrace

=head1 SYNOPSIS

 # dtrace -Zn 'perl::sub-entry, perl::sub-return { trace(copyinstr(arg0)) }'
 dtrace: description 'perl::sub-entry, perl::sub-return ' matched 10 probes

 # perl -E 'sub outer { inner(@_) } sub inner { say shift } outer("hello")'
 hello

 (dtrace output)
 CPU     ID                    FUNCTION:NAME
   0  75915       Perl_pp_entersub:sub-entry   BEGIN
   0  75915       Perl_pp_entersub:sub-entry   import
   0  75922      Perl_pp_leavesub:sub-return   import
   0  75922      Perl_pp_leavesub:sub-return   BEGIN
   0  75915       Perl_pp_entersub:sub-entry   outer
   0  75915       Perl_pp_entersub:sub-entry   inner
   0  75922      Perl_pp_leavesub:sub-return   inner
   0  75922      Perl_pp_leavesub:sub-return   outer

=head1 DESCRIPTION

DTrace is a framework for comprehensive system- and application-level
tracing. Perl is a DTrace I<provider>, meaning it exposes several
I<probes> for instrumentation. You can use these in conjunction
with kernel-level probes, as well as probes from other providers
such as MySQL, in order to diagnose software defects, or even just
your application's bottlenecks.

Perl must be compiled with the C<-Dusedtrace> option in order to
make use of the provided probes. While DTrace aims to have no
overhead when its instrumentation is not active, Perl's support
itself cannot uphold that guarantee, so it is built without DTrace
probes under most systems. One notable exception is that Mac OS X
ships a F</usr/bin/perl> with DTrace support enabled.
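
Whether a given perl was built this way can be checked from Perl itself.
The following sketch assumes that the build records the option under the
C<usedtrace> key of L<Config>, which is also what C<perl -V:usedtrace>
reports:

```perl
use strict;
use warnings;
use Config;

# $Config{usedtrace} is "define" when perl was configured with -Dusedtrace.
my $usedtrace  = $Config{usedtrace};
my $has_dtrace = defined $usedtrace && $usedtrace eq 'define';

print "DTrace probes compiled in: ", ($has_dtrace ? "yes" : "no"), "\n";
```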

=head1 HISTORY

=over 4

=item 5.10.1

Perl's initial DTrace support was added, providing C<sub-entry> and
C<sub-return> probes.

=item 5.14.0

The C<sub-entry> and C<sub-return> probes gain a fourth argument: the
package name of the function.

=item 5.16.0

The C<phase-change> probe was added.

=item 5.18.0

The C<op-entry>, C<loading-file>, and C<loaded-file> probes were added.

=back

=head1 PROBES

=over 4

=item sub-entry(SUBNAME, FILE, LINE, PACKAGE)

Traces the entry of any subroutine. Note that all of the variables
refer to the subroutine that is being invoked; there is currently
no way to get ahold of any information about the subroutine's
I<caller> from a DTrace action.

 :*perl*::sub-entry {
     printf("%s::%s entered at %s line %d\n",
           copyinstr(arg3), copyinstr(arg0), copyinstr(arg1), arg2);
 }

=item sub-return(SUBNAME, FILE, LINE, PACKAGE)

Traces the exit of any subroutine. Note that all of the variables
refer to the subroutine that is returning; there is currently no
way to get ahold of any information about the subroutine's I<caller>
from a DTrace action.

 :*perl*::sub-return {
     printf("%s::%s returned at %s line %d\n",
           copyinstr(arg3), copyinstr(arg0), copyinstr(arg1), arg2);
 }

=item phase-change(NEWPHASE, OLDPHASE)

Traces changes to Perl's interpreter state. You can internalize this
as tracing changes to Perl's C<${^GLOBAL_PHASE}> variable, especially
since the values for C<NEWPHASE> and C<OLDPHASE> are the strings that
C<${^GLOBAL_PHASE}> reports.

 :*perl*::phase-change {
     printf("Phase changed from %s to %s\n",
         copyinstr(arg1), copyinstr(arg0));
 }

=item op-entry(OPNAME)

Traces the execution of each opcode in the Perl runloop. This probe
is fired before the opcode is executed. When the Perl debugger is
enabled, the DTrace probe is fired I<after> the debugger hooks (but
still before the opcode itself is executed).

 :*perl*::op-entry {
     printf("About to execute opcode %s\n", copyinstr(arg0));
 }

=item loading-file(FILENAME)

Fires when Perl is about to load an individual file, whether from
C<use>, C<require>, or C<do>. This probe fires before the file is
read from disk. The filename argument is converted to local filesystem
paths instead of providing C<Module::Name>-style names.

 :*perl*::loading-file {
     printf("About to load %s\n", copyinstr(arg0));
 }

=item loaded-file(FILENAME)

Fires when Perl has successfully loaded an individual file, whether
from C<use>, C<require>, or C<do>. This probe fires after the file
is read from disk and its contents evaluated. The filename argument
is converted to local filesystem paths instead of providing
C<Module::Name>-style names.

 :*perl*::loaded-file {
     printf("Successfully loaded %s\n", copyinstr(arg0));
 }

=back
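
To get a feel for the strings that the C<phase-change> probe reports, it can
help to look at C<${^GLOBAL_PHASE}> from inside a program.  This is a small
sketch with a made-up array name; it assumes the code is run as a normal
script, so that the C<BEGIN> block really executes at compile time:

```perl
use strict;
use warnings;

our @phases;

BEGIN { push @phases, ${^GLOBAL_PHASE} }   # recorded while compiling
push @phases, ${^GLOBAL_PHASE};            # recorded while running
END   { print "phase in END block: ${^GLOBAL_PHASE}\n" }

print "phases seen so far: @phases\n";
```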

=head1 EXAMPLES

=over 4

=item Most frequently called functions

 # dtrace -qZn 'sub-entry { @[strjoin(strjoin(copyinstr(arg3),"::"),copyinstr(arg0))] = count() } END {trunc(@, 10)}'

 Class::MOP::Attribute::slots                                    400
 Try::Tiny::catch                                                411
 Try::Tiny::try                                                  411
 Class::MOP::Instance::inline_slot_access                        451
 Class::MOP::Class::Immutable::Trait:::around                    472
 Class::MOP::Mixin::AttributeCore::has_initializer               496
 Class::MOP::Method::Wrapped::__ANON__                           544
 Class::MOP::Package::_package_stash                             737
 Class::MOP::Class::initialize                                  1128
 Class::MOP::get_metaclass_by_name                              1204

=item Trace function calls

 # dtrace -qFZn 'sub-entry, sub-return { trace(copyinstr(arg0)) }'

 0  -> Perl_pp_entersub                        BEGIN
 0  <- Perl_pp_leavesub                        BEGIN
 0  -> Perl_pp_entersub                        BEGIN
 0    -> Perl_pp_entersub                      import
 0    <- Perl_pp_leavesub                      import
 0  <- Perl_pp_leavesub                        BEGIN
 0  -> Perl_pp_entersub                        BEGIN
 0    -> Perl_pp_entersub                      dress
 0    <- Perl_pp_leavesub                      dress
 0    -> Perl_pp_entersub                      dirty
 0    <- Perl_pp_leavesub                      dirty
 0    -> Perl_pp_entersub                      whiten
 0    <- Perl_pp_leavesub                      whiten
 0  <- Perl_dounwind                           BEGIN

=item Function calls during interpreter cleanup

 # dtrace -Zn 'phase-change /copyinstr(arg0) == "END"/ { self->ending = 1 } sub-entry /self->ending/ { trace(copyinstr(arg0)) }'

 CPU     ID                    FUNCTION:NAME
   1  77214       Perl_pp_entersub:sub-entry   END
   1  77214       Perl_pp_entersub:sub-entry   END
   1  77214       Perl_pp_entersub:sub-entry   cleanup
   1  77214       Perl_pp_entersub:sub-entry   _force_writable
   1  77214       Perl_pp_entersub:sub-entry   _force_writable

=item System calls at compile time

 # dtrace -qZn 'phase-change /copyinstr(arg0) == "START"/ { self->interesting = 1 } phase-change /copyinstr(arg0) == "RUN"/ { self->interesting = 0 } syscall::: /self->interesting/ { @[probefunc] = count() } END { trunc(@, 3) }'

 lseek                                                           310
 read                                                            374
 stat64                                                         1056

=item Perl functions that execute the most opcodes

 # dtrace -qZn 'sub-entry { self->fqn = strjoin(copyinstr(arg3), strjoin("::", copyinstr(arg0))) } op-entry /self->fqn != ""/ { @[self->fqn] = count() } END { trunc(@, 3) }'

 warnings::unimport                                             4589
 Exporter::Heavy::_rebuild_cache                                5039
 Exporter::import                                              14578

=back

=head1 REFERENCES

=over 4

=item DTrace Dynamic Tracing Guide

L<http://dtrace.org/guide/preface.html>

=item DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X and FreeBSD

L<http://www.amazon.com/DTrace-Dynamic-Tracing-Solaris-FreeBSD/dp/0132091518/>

=back

=head1 SEE ALSO

=over 4

=item L<Devel::DTrace::Provider>

This CPAN module lets you create application-level DTrace probes written in
Perl.

=back

=head1 AUTHORS

Shawn M Moore C<sartak@gmail.com>

=cut

=encoding utf8

=head1 NAME

perl5180delta - what is new for perl v5.18.0

=head1 DESCRIPTION

This document describes differences between the v5.16.0 release and the v5.18.0
release.

If you are upgrading from an earlier release such as v5.14.0, first read
L<perl5160delta>, which describes differences between v5.14.0 and v5.16.0.

=head1 Core Enhancements

=head2 New mechanism for experimental features

Newly-added experimental features will now require this incantation:

    no warnings "experimental::feature_name";
    use feature "feature_name";  # would warn without the prev line

There is a new warnings category, called "experimental", containing
warnings that the L<feature> pragma emits when enabling experimental
features.

Newly-added experimental features will also be given special warning IDs,
which consist of "experimental::" followed by the name of the feature.  (The
plan is to extend this mechanism eventually to all warnings, to allow them
to be enabled or disabled individually, and not just by category.)

By saying

    no warnings "experimental::feature_name";

you are taking responsibility for any breakage that future changes to, or
removal of, the feature may cause.

Since some features (like C<~~> or C<my $_>) now emit experimental warnings,
and you may want to disable them in code that is also run on perls that do not
recognize these warning categories, consider using the C<if> pragma like this:

    no if $] >= 5.018, warnings => "experimental::feature_name";

Existing experimental features may begin emitting these warnings, too.  Please
consult L<perlexperiment> for information on which features are considered
experimental.

=head2 Hash overhaul

Changes to the implementation of hashes in perl v5.18.0 will be one of the most
visible changes to the behavior of existing code.

By default, two distinct hash variables with identical keys and values may now
provide their contents in a different order where it was previously identical.

When encountering these changes, the key to cleaning up from them is to accept
that B<hashes are unordered collections> and to act accordingly.

=head3 Hash randomization

The seed used by Perl's hash function is now random.  This means that the
order in which keys/values will be returned from functions like C<keys()>,
C<values()>, and C<each()> will differ from run to run.

This change was introduced to make Perl's hashes more robust to algorithmic
complexity attacks, and also because we discovered that it exposes hash
ordering dependency bugs and makes them easier to track down.

Toolchain maintainers might want to invest in additional infrastructure to
test for things like this.  Running tests several times in a row and then
comparing results will make it easier to spot hash order dependencies in
code.  Authors are strongly encouraged not to expose the key order of
Perl's hashes to insecure audiences.

Further, every hash has its own iteration order, which should make it much
more difficult to determine what the current hash seed is.
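
Code that needs stable output should therefore impose an order itself rather
than rely on what C<keys()> happens to return.  A minimal sketch, with a
made-up hash:

```perl
use strict;
use warnings;

my %stock = (apples => 3, pears => 7, plums => 2);

# keys() order may differ between runs, so sort explicitly whenever
# the output has to be stable.
my @by_name  = sort keys %stock;
my @by_count = sort { $stock{$a} <=> $stock{$b} or $a cmp $b } keys %stock;

print "$_=$stock{$_}\n" for @by_name;
```

The C<or $a cmp $b> tie-breaker keeps the second ordering deterministic even
when two values are equal.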

=head3 New hash functions

Perl v5.18 includes support for multiple hash functions and changes
the default (to ONE_AT_A_TIME_HARD).  You can choose a different
algorithm by defining a symbol at compile time.  For a current list,
consult the F<INSTALL> document.  Note that as of Perl v5.18 we can
only recommend use of the default or SIPHASH.  All the others are
known to have security issues and are for research purposes only.

=head3 PERL_HASH_SEED environment variable now takes a hex value

C<PERL_HASH_SEED> no longer accepts an integer as a parameter;
instead the value is expected to be a binary value encoded in a hex
string, such as "0xf5867c55039dc724".  This is to make the
infrastructure support hash seeds of arbitrary lengths, which might
exceed that of an integer.  (SipHash uses a 16 byte seed.)

=head3 PERL_PERTURB_KEYS environment variable added

The C<PERL_PERTURB_KEYS> environment variable allows one to control the level of
randomization applied to C<keys> and friends.

When C<PERL_PERTURB_KEYS> is 0, perl will not randomize the key order at all. The
chance that C<keys> changes due to an insert will be the same as in previous
perls, basically only when the bucket size is changed.

When C<PERL_PERTURB_KEYS> is 1, perl will randomize keys in a non-repeatable
way. The chance that C<keys> changes due to an insert will be very high.  This
is the most secure and default mode.

When C<PERL_PERTURB_KEYS> is 2, perl will randomize keys in a repeatable way.
Repeated runs of the same program should produce the same output every time.

C<PERL_HASH_SEED> implies a non-default C<PERL_PERTURB_KEYS> setting. Setting
C<PERL_HASH_SEED=0> (exactly one 0) implies C<PERL_PERTURB_KEYS=0> (hash key
randomization disabled); setting C<PERL_HASH_SEED> to any other value implies
C<PERL_PERTURB_KEYS=2> (deterministic and repeatable hash key randomization).
Specifying C<PERL_PERTURB_KEYS> explicitly to a different level overrides this
behavior.
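
One way to see the effect of these settings is to run the same one-liner
twice in child processes.  The sketch below is illustrative only: the helper
name C<key_order> is made up, it assumes a Unix-like system where the list
form of a piped C<open> is available, and it removes any inherited
C<PERL_PERTURB_KEYS> so that C<PERL_HASH_SEED=0> alone decides the mode.

```perl
use strict;
use warnings;

# Hypothetical helper: start a child perl with the given PERL_HASH_SEED
# and return the order in which it lists the keys of a small hash.
sub key_order {
    my ($seed) = @_;
    local $ENV{PERL_HASH_SEED} = $seed;
    delete local $ENV{PERL_PERTURB_KEYS};   # let the seed imply the mode
    my $code = 'my %h = map { $_ => 1 } "a" .. "j"; print join(",", keys %h)';
    open(my $child, '-|', $^X, '-e', $code) or die "cannot run $^X: $!";
    my $order = <$child>;
    close($child);
    return $order;
}

my $first  = key_order("0");
my $second = key_order("0");
print "same order with PERL_HASH_SEED=0: ",
      ($first eq $second ? "yes" : "no"), "\n";
```

This is meant for debugging order-dependent code; as noted above, the
default randomized mode is the most secure one.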

=head3 Hash::Util::hash_seed() now returns a string

Hash::Util::hash_seed() now returns a string instead of an integer.  This
is to make the infrastructure support hash seeds of arbitrary lengths
which might exceed that of an integer.  (SipHash uses a 16 byte seed.)

=head3 Output of PERL_HASH_SEED_DEBUG has been changed

The environment variable PERL_HASH_SEED_DEBUG now makes perl show both the
hash function perl was built with, I<and> the seed, in hex, in use for that
process. Code parsing this output, should it exist, must change to accommodate
the new format.  Example of the new format:

    $ PERL_HASH_SEED_DEBUG=1 ./perl -e1
    HASH_FUNCTION = MURMUR3 HASH_SEED = 0x1476bb9f

=head2 Upgrade to Unicode 6.2

Perl now supports Unicode 6.2.  A list of changes from Unicode
6.1 is at L<http://www.unicode.org/versions/Unicode6.2.0>.

=head2 Character name aliases may now include non-Latin1-range characters

It is possible to define your own names for characters for use in
C<\N{...}>, C<charnames::vianame()>, etc.  These names can now be
composed of characters from the whole Unicode range.  This allows for
names to be in your native language, and not just English.  Certain
restrictions apply to the characters that may be used (you can't define
a name that has punctuation in it, for example).  See L<charnames/CUSTOM
ALIASES>.

=head2 New DTrace probes

The following new DTrace probes have been added:

=over 4

=item *

C<op-entry>

=item *

C<loading-file>

=item *

C<loaded-file>

=back

=head2 C<${^LAST_FH}>

This new variable provides access to the filehandle that was last read.
This is the handle used by C<$.> and by C<tell> and C<eof> without
arguments.
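
A small sketch of how the variable might be used: the first line is read
through a lexical handle, and the next one through C<${^LAST_FH}>.  The
function name C<first_two_lines> and the in-memory file are assumptions made
for the example.

```perl
use strict;
use warnings;

# Read the first line through the lexical handle, then fetch the next line
# through ${^LAST_FH}, which refers to the handle that was last read.
sub first_two_lines {
    my ($text) = @_;
    open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";
    my $first  = <$fh>;
    my $second = readline(${^LAST_FH});
    close $fh;
    return ($first, $second);
}

my ($one, $two) = first_two_lines("alpha\nbeta\n");
print $one, $two;
```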

=head2 Regular Expression Set Operations

This is an B<experimental> feature to allow matching against the union,
intersection, etc., of sets of code points, similar to
L<Unicode::Regex::Set>.  It can also be used to extend C</x> processing
to [bracketed] character classes, and as a replacement of user-defined
properties, allowing more complex expressions than they do.  See
L<perlrecharclass/Extended Bracketed Character Classes>.

=head2 Lexical subroutines

This new feature is still considered B<experimental>.  To enable it:

    use 5.018;
    no warnings "experimental::lexical_subs";
    use feature "lexical_subs";

You can now declare subroutines with C<state sub foo>, C<my sub foo>, and
C<our sub foo>.  (C<state sub> requires that the "state" feature be
enabled, unless you write it as C<CORE::state sub foo>.)

C<state sub> creates a subroutine visible within the lexical scope in which
it is declared.  The subroutine is shared between calls to the outer sub.

C<my sub> declares a lexical subroutine that is created each time the
enclosing block is entered.  C<state sub> is generally slightly faster than
C<my sub>.

C<our sub> declares a lexical alias to the package subroutine of the same
name.

For more information, see L<perlsub/Lexical Subroutines>.
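
As an illustration, here is a sketch of a C<my sub> used as a private helper.
The names C<double_plus_one> and C<double> are invented for the example.

```perl
use 5.018;
no warnings "experimental::lexical_subs";
use feature "lexical_subs";

sub double_plus_one {
    my ($n) = @_;

    # Visible only inside this block; it is not installed in the package.
    my sub double { return 2 * $_[0] }

    return double($n) + 1;
}

say double_plus_one(3);
```

Since C<double> is lexical, code outside C<double_plus_one> cannot call it by
name.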

=head2 Computed Labels

The loop controls C<next>, C<last> and C<redo>, and the special C<dump>
operator, now allow arbitrary expressions to be used to compute labels at run
time.  Previously, any argument that was not a constant was treated as the
empty string.
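
For instance, a label name held in a variable can now choose which loop
C<next> applies to.  The sketch below uses a hypothetical function
C<kept_cells> and made-up labels C<ROW> and C<COL>:

```perl
use strict;
use warnings;

# Skip ahead in whichever loop is named by $label once column 2 is reached.
sub kept_cells {
    my ($label) = @_;       # label name chosen at run time
    my @seen;
    ROW: for my $row (1 .. 3) {
        COL: for my $col (1 .. 3) {
            next $label if $col == 2;
            push @seen, "$row$col";
        }
    }
    return @seen;
}

print join(",", kept_cells("ROW")), "\n";
print join(",", kept_cells("COL")), "\n";
```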

=head2 More CORE:: subs

Several more built-in functions have been added as subroutines to the
CORE:: namespace - namely, those non-overridable keywords that can be
implemented without custom parsers: C<defined>, C<delete>, C<exists>,
C<glob>, C<pos>, C<prototype>, C<scalar>, C<split>, C<study>, and C<undef>.

As some of these have prototypes, C<prototype('CORE::...')> has been
changed to not make a distinction between overridable and non-overridable
keywords.  This is to make C<prototype('CORE::pos')> consistent with
C<prototype(&CORE::pos)>.

=head2 C<kill> with negative signal names

C<kill> has always allowed a negative signal number, which kills the
process group instead of a single process.  It has also allowed signal
names.  But it did not behave consistently, because negative signal names
were treated as 0.  Now negative signal names like C<-INT> are supported
and treated the same way as -2 [perl #112990].

=head1 Security

=head2 See also: hash overhaul

Some of the changes in the L<hash overhaul|/"Hash overhaul"> were made to
enhance security.  Please read that section.

=head2 C<Storable> security warning in documentation

The documentation for C<Storable> now includes a section which warns readers
of the danger of accepting Storable documents from untrusted sources. The
short version is that deserializing certain types of data can lead to loading
modules and other code execution. This behavior is documented and intended,
but it opens an attack vector for malicious entities.

=head2 C<Locale::Maketext> allowed code injection via a malicious template

If users could provide a translation string to Locale::Maketext, this could be
used to invoke arbitrary Perl subroutines available in the current process.

This has been fixed, but it is still possible to invoke any method provided by
C<Locale::Maketext> itself or a subclass that you are using. One of these
methods in turn will invoke the Perl core's C<sprintf> subroutine.

In summary, allowing users to provide translation strings without auditing
them is a bad idea.

This vulnerability is documented in CVE-2012-6329.

=head2 Avoid calling memset with a negative count

Poorly written perl code that allows an attacker to specify the count to perl's
C<x> string repeat operator can already cause a memory exhaustion
denial-of-service attack. A flaw in versions of perl before v5.15.5 can escalate
that into a heap buffer overrun; coupled with versions of glibc before 2.16, it
possibly allows the execution of arbitrary code.

The flaw addressed by this fix has been assigned identifier CVE-2012-5195
and was researched by Tim Brown.

=head1 Incompatible Changes

=head2 See also: hash overhaul

Some of the changes in the L<hash overhaul|/"Hash overhaul"> are not fully
compatible with previous versions of perl.  Please read that section.

=head2 An unknown character name in C<\N{...}> is now a syntax error

Previously, it warned, and the Unicode REPLACEMENT CHARACTER was
substituted.  Unicode now recommends that this situation be a syntax
error.  Also, the previous behavior led to some confusing warnings and
behaviors, and since the REPLACEMENT CHARACTER has no use other than as
a stand-in for some unknown character, any code that has this problem is
buggy.

=head2 Formerly deprecated characters in C<\N{}> character name aliases are now errors

Since v5.12.0, it has been deprecated to use certain characters in
user-defined C<\N{...}> character names.  These now cause a syntax
error.  For example, it is now an error to begin a name with a digit,
such as in

 my $undraftable = "\N{4F}";    # Syntax error!

or to have commas anywhere in the name.  See L<charnames/CUSTOM ALIASES>.

=head2 C<\N{BELL}> now refers to U+1F514 instead of U+0007

Unicode 6.0 reused the name "BELL" for a different code point than it
traditionally had meant.  Since Perl v5.14, use of this name still
referred to U+0007, but would raise a deprecation warning.  Now, "BELL"
refers to U+1F514, and the name for U+0007 is "ALERT".  All the
functions in L<charnames> have been correspondingly updated.
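
One way to check which code points the two names now map to is to ask
C<charnames::vianame()>, as in this short sketch:

```perl
use strict;
use warnings;
use charnames ":full";

# Look up the code points behind the two names.
my $bell  = charnames::vianame("BELL");
my $alert = charnames::vianame("ALERT");

printf "BELL  is U+%04X\n", $bell;
printf "ALERT is U+%04X\n", $alert;
```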

=head2 New Restrictions in Multi-Character Case-Insensitive Matching in Regular Expression Bracketed Character Classes

Unicode has now withdrawn their previous recommendation for regular
expressions to automatically handle cases where a single character can
match multiple characters case-insensitively, for example, the letter
LATIN SMALL LETTER SHARP S and the sequence C<ss>.  This is because
it turns out to be impracticable to do this correctly in all
circumstances.  Because Perl has tried to do this as best it can, it
will continue to do so.  (We are considering an option to turn it off.)
However, a new restriction is being added on such matches when they
occur in [bracketed] character classes.  People were specifying
things such as C</[\0-\xff]/i>, and being surprised that it matches the
two character sequence C<ss> (since LATIN SMALL LETTER SHARP S occurs in
this range).  This behavior is also inconsistent with using a
property instead of a range:  C<\p{Block=Latin1}> also includes LATIN
SMALL LETTER SHARP S, but C</[\p{Block=Latin1}]/i> does not match C<ss>.
The new rule is that for there to be a multi-character case-insensitive
match within a bracketed character class, the character must be
explicitly listed, and not as an end point of a range.  This more
closely obeys the Principle of Least Astonishment.  See
L<perlrecharclass/Bracketed Character Classes>.  Note that a bug [perl
#89774], now fixed as part of this change, prevented the previous
behavior from working fully.
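
The difference between listing the character explicitly and covering it with
a range can be sketched as follows.  The variable names are illustrative, and
the C</u> modifier is added here to make sure Unicode rules are in effect.

```perl
use strict;
use warnings;

# LATIN SMALL LETTER SHARP S listed explicitly: the multi-character fold
# to "ss" is still attempted.
my $listed = ("ss" =~ /\A[\N{U+00DF}]\z/iu) ? 1 : 0;

# The same character only reachable through a range: no multi-character
# fold is attempted, so the class matches a single character only.
my $in_range = ("ss" =~ /\A[\0-\xff]\z/iu) ? 1 : 0;

print "explicitly listed: $listed, inside a range: $in_range\n";
```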

=head2 Explicit rules for variable names and identifiers

Due to an oversight, single character variable names in v5.16 were
completely unrestricted.  This opened the door to several kinds of
insanity.  As of v5.18, these now follow the rules of other identifiers,
in addition to accepting characters that match the C<\p{POSIX_Punct}>
property.

There is no longer any difference in the parsing of identifiers
specified by using braces versus without braces.  For instance, perl
used to allow C<${foo:bar}> (with a single colon) but not C<$foo:bar>.
Now that both are handled by a single code path, they are both treated
the same way: both are forbidden.  Note that this change is about the
range of permissible literal identifiers, not other expressions.

=head2 Vertical tabs are now whitespace

No one could recall why C<\s> didn't match C<\cK>, the vertical tab.
Now it does.  Given the extreme rarity of that character, very little
breakage is expected.  That said, here's what it means:

C<\s> in a regex now matches a vertical tab in all circumstances.

Literal vertical tabs in a regex literal are ignored when the C</x>
modifier is used.

Leading vertical tabs, alone or mixed with other whitespace, are now
ignored when interpreting a string as a number.  For example:

  $dec = " \cK \t 123";
  $hex = " \cK \t 0xF";

  say 0 + $dec;   # was 0 with warning, now 123
  say int $dec;   # was 0, now 123
  say oct $hex;   # was 0, now  15

=head2 C</(?{})/> and C</(??{})/> have been heavily reworked

The implementation of this feature has been almost completely rewritten.
Although its main intent is to fix bugs, some behaviors, especially
related to the scope of lexical variables, will have changed.  This is
described more fully in the L</Selected Bug Fixes> section.

=head2 Stricter parsing of substitution replacement

It is no longer possible to abuse the way the parser parses C<s///e> like
this:

    %_=(_,"Just another ");
    $_="Perl hacker,\n";
    s//_}->{_/e;print

=head2 C<given> now aliases the global C<$_>

Instead of assigning to an implicit lexical C<$_>, C<given> now makes the
global C<$_> an alias for its argument, just like C<foreach>.  However, it
still uses lexical C<$_> if there is lexical C<$_> in scope (again, just like
C<foreach>) [perl #114020].

=head2 The smartmatch family of features are now experimental

Smart match, added in v5.10.0 and significantly revised in v5.10.1, has been
a regular point of complaint.  Although there are a number of ways in which
it is useful, it has also proven problematic and confusing for both users and
implementors of Perl.  There have been a number of proposals on how to best
address the problem.  It is clear that smartmatch is almost certainly either
going to change or go away in the future.  Relying on its current behavior
is not recommended.

Warnings will now be issued when the parser sees C<~~>, C<given>, or C<when>.
To disable these warnings, you can add this line to the appropriate scope:

  no if $] >= 5.018, warnings => "experimental::smartmatch";

Consider, though, replacing the use of these features, as they may change
behavior again before becoming stable.

=head2 Lexical C<$_> is now experimental

Since it was introduced in Perl v5.10, it has caused much confusion with no
obvious solution:

=over

=item *

Various modules (e.g., List::Util) expect callback routines to use the
global C<$_>.  C<use List::Util 'first'; my $_; first { $_ == 1 } @list>
does not work as one would expect.

=item *

A C<my $_> declaration earlier in the same file can cause confusing closure
warnings.

=item *

The "_" subroutine prototype character allows called subroutines to access
your lexical C<$_>, so it is not really private after all.

=item *

Nevertheless, subroutines with a "(@)" prototype and methods cannot access
the caller's lexical C<$_>, unless they are written in XS.

=item *

But even XS routines cannot access a lexical C<$_> that was declared in an
outer scope rather than in the calling subroutine, if that subroutine happens
not to mention C<$_> or use any operators that default to C<$_>.

=back

It is our hope that lexical C<$_> can be rehabilitated, but this may
cause changes in its behavior.  Please use it with caution until it
becomes stable.

=head2 readline() with C<$/ = \N> now reads N characters, not N bytes

Previously, when reading from a stream with I/O layers such as
C<encoding>, the readline() function, otherwise known as the C<< <> >>
operator, would read I<N> bytes from the top-most layer. [perl #79960]

Now, I<N> characters are read instead.

There is no change in behaviour when reading from streams with no
extra layers, since bytes map exactly to characters.
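
The following sketch shows the character-based behavior on an in-memory file
opened with an encoding layer.  The helper name C<read_record> and the sample
data are assumptions for the example.

```perl
use strict;
use warnings;

# Read one fixed-size record of $n units from UTF-8 encoded $bytes.
sub read_record {
    my ($bytes, $n) = @_;
    open(my $fh, '<:encoding(UTF-8)', \$bytes) or die "cannot open: $!";
    local $/ = \$n;                 # record mode: N characters per read
    my $record = <$fh>;
    close $fh;
    return $record;
}

my $bytes  = "\xC3\xA9\xC3\xA9\xC3\xA9";   # three "e acute" characters, six bytes
my $record = read_record($bytes, 2);
printf "%d characters\n", length $record;
```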

=head2 Overridden C<glob> is now passed one argument

C<glob> overrides used to be passed a magical undocumented second argument
that identified the caller.  Nothing on CPAN was using this, and it got in
the way of a bug fix, so it was removed.  If you really need to identify
the caller, see L<Devel::Callsite> on CPAN.

=head2 Here doc parsing

The body of a here document inside a quote-like operator now always begins
on the line after the "<<foo" marker.  Previously, it was documented to
begin on the line following the containing quote-like operator, but that
was only sometimes the case [perl #114040].

=head2 Alphanumeric operators must now be separated from the closing
delimiter of regular expressions

You may no longer write something like:

 m/a/and 1

Instead you must write

 m/a/ and 1

with whitespace separating the operator from the closing delimiter of
the regular expression.  Not having whitespace has resulted in a
deprecation warning since Perl v5.14.0.

=head2 qw(...) can no longer be used as parentheses

C<qw> lists used to fool the parser into thinking they were always
surrounded by parentheses.  This permitted some surprising constructions
such as C<foreach $x qw(a b c) {...}>, which should really be written
C<foreach $x (qw(a b c)) {...}>.  These would sometimes get the lexer into
the wrong state, so they didn't fully work, and the similar C<foreach qw(a
b c) {...}> that one might expect to be permitted never worked at all.

This side effect of C<qw> has now been abolished.  It has been deprecated
since Perl v5.13.11.  It is now necessary to use real parentheses
everywhere that the grammar calls for them.
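
In practice this means wrapping the C<qw> list in parentheses, as in this
short sketch:

```perl
use strict;
use warnings;

my @upper;

# The parentheses around qw(...) are now required by the grammar.
foreach my $word (qw(a b c)) {
    push @upper, uc $word;
}

print "@upper\n";
```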

=head2 Interaction of lexical and default warnings

Turning on any lexical warnings used to first disable all default warnings
if lexical warnings were not already enabled:

    $*; # deprecation warning
    use warnings "void";
    $#; # void warning; no deprecation warning

Now, the C<debugging>, C<deprecated>, C<glob>, C<inplace> and C<malloc> warnings
categories are left on when turning on lexical warnings (unless they are
turned off by C<no warnings>, of course).

This may cause deprecation warnings to occur in code that used to be free
of warnings.

Those are the only categories consisting only of default warnings.  Default
warnings in other categories are still disabled by C<< use warnings "category" >>,
as we do not yet have the infrastructure for controlling
individual warnings.

=head2 C<state sub> and C<our sub>

Due to an accident of history, C<state sub> and C<our sub> were equivalent
to a plain C<sub>, so one could even create an anonymous sub with
C<our sub { ... }>.  These are now disallowed outside of the "lexical_subs"
feature.  Under the "lexical_subs" feature they have new meanings described
in L<perlsub/Lexical Subroutines>.

=head2 Defined values stored in environment are forced to byte strings

A value stored in an environment variable has always been stringified when
inherited by child processes.

In this release, when assigning to C<%ENV>, values are immediately stringified,
and converted to be only a byte string.

First, it is forced to be only a string.  Then if the string is utf8 and the
equivalent of C<utf8::downgrade()> works, that result is used; otherwise, the
equivalent of C<utf8::encode()> is used, and a warning is issued about wide
characters (L</Diagnostics>).
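
A sketch of what this looks like from Perl code; the environment variable
name C<PERLDELTA_DEMO> is made up for the example:

```perl
use strict;
use warnings;

# A reference is stringified as soon as it is stored.
$ENV{PERLDELTA_DEMO} = [1, 2, 3];
my $is_ref = ref($ENV{PERLDELTA_DEMO}) ? 1 : 0;

# A utf8-flagged string that fits in bytes is downgraded on assignment.
my $text = "caf\xe9";
utf8::upgrade($text);
$ENV{PERLDELTA_DEMO} = $text;
my $still_utf8 = utf8::is_utf8($ENV{PERLDELTA_DEMO}) ? 1 : 0;

print "reference kept: $is_ref, utf8 flag kept: $still_utf8\n";
```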

=head2 C<require> dies for unreadable files

When C<require> encounters an unreadable file, it now dies.  It used to
ignore the file and continue searching the directories in C<@INC>
[perl #113422].

=head2 C<gv_fetchmeth_*> and SUPER

The various C<gv_fetchmeth_*> XS functions used to treat a package whose
name ended with C<::SUPER> specially.  A method lookup on the C<Foo::SUPER>
package would be treated as a C<SUPER> method lookup on the C<Foo> package.  This
is no longer the case.  To do a C<SUPER> lookup, pass the C<Foo> stash and the
C<GV_SUPER> flag.

=head2 C<split>'s first argument is more consistently interpreted

After some changes earlier in v5.17, C<split>'s behavior has been
simplified: if the PATTERN argument evaluates to a string
containing one space, it is treated the way that a I<literal> string
containing one space once was.
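
For example, a separator held in a variable now triggers the same
whitespace-splitting mode as a literal C<' '>.  A minimal sketch with
invented sample data:

```perl
use strict;
use warnings;

my $separator = " ";                       # a single space, not a literal
my @fields = split $separator, "  alpha beta   gamma ";

# Leading whitespace is skipped and runs of whitespace separate fields.
print join(",", @fields), "\n";
```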

=head1 Deprecations

=head2 Module removals

The following modules will be removed from the core distribution in a future
release, and will at that time need to be installed from CPAN. Distributions
on CPAN which require these modules will need to list them as prerequisites.

The core versions of these modules will now issue C<"deprecated">-category
warnings to alert you to this fact. To silence these deprecation warnings,
install the modules in question from CPAN.

Note that these are (with rare exceptions) fine modules that you are encouraged
to continue to use. Their removal from core primarily hinges on whether they
are needed to bootstrap a fully functional, CPAN-capable Perl installation,
not usually on concerns over their design.

=over

=item L<encoding>

The use of this pragma is now strongly discouraged. It conflates the encoding
of source text with the encoding of I/O data, reinterprets escape sequences in
source text (a questionable choice), and introduces the UTF-8 bug to all runtime
handling of character strings. It is broken as designed and beyond repair.

For using non-ASCII literal characters in source text, please refer to L<utf8>.
For dealing with textual I/O data, please refer to L<Encode> and L<open>.

=item L<Archive::Extract>

=item L<B::Lint>

=item L<B::Lint::Debug>

=item L<CPANPLUS> and all included C<CPANPLUS::*> modules

=item L<Devel::InnerPackage>

=item L<Log::Message>

=item L<Log::Message::Config>

=item L<Log::Message::Handlers>

=item L<Log::Message::Item>

=item L<Log::Message::Simple>

=item L<Module::Pluggable>

=item L<Module::Pluggable::Object>

=item L<Object::Accessor>

=item L<Pod::LaTeX>

=item L<Term::UI>

=item L<Term::UI::History>

=back

=head2 Deprecated Utilities

The following utilities will be removed from the core distribution in a
future release as their associated modules have been deprecated. They
will remain available with the applicable CPAN distribution.

=over

=item L<cpanp>

=item C<cpanp-run-perl>

=item L<cpan2dist>

These items are part of the C<CPANPLUS> distribution.

=item L<pod2latex>

This item is part of the C<Pod::LaTeX> distribution.

=back

=head2 PL_sv_objcount

This interpreter-global variable used to track the total number of
Perl objects in the interpreter. It is no longer maintained and will
be removed altogether in Perl v5.20.

=head2 Five additional characters should be escaped in patterns with C</x>

When a regular expression pattern is compiled with C</x>, Perl treats 6
characters as white space to ignore, such as SPACE and TAB.  However,
Unicode recommends that 11 characters be treated this way.  We will conform
with this in a future Perl version.  In the meantime, use of any of the
missing characters will raise a deprecation warning, unless turned off.
The five characters are:

    U+0085 NEXT LINE
    U+200E LEFT-TO-RIGHT MARK
    U+200F RIGHT-TO-LEFT MARK
    U+2028 LINE SEPARATOR
    U+2029 PARAGRAPH SEPARATOR

=head2 User-defined charnames with surprising whitespace

A user-defined character name with trailing or multiple spaces in a row is
likely a typo.  This now generates a warning when defined, on the assumption
that uses of it will be unlikely to include the excess whitespace.

=head2 Various XS-callable functions are now deprecated

All the functions used to classify characters will be removed from a
future version of Perl, and should not be used.  With participating C
compilers (e.g., gcc), compiling any file that uses any of these will
generate a warning.  These were not intended for public use; there are
equivalent, faster, macros for most of them.

See L<perlapi/Character classes>.  The complete list is:

C<is_uni_alnum>, C<is_uni_alnumc>, C<is_uni_alnumc_lc>,
C<is_uni_alnum_lc>, C<is_uni_alpha>, C<is_uni_alpha_lc>,
C<is_uni_ascii>, C<is_uni_ascii_lc>, C<is_uni_blank>,
C<is_uni_blank_lc>, C<is_uni_cntrl>, C<is_uni_cntrl_lc>,
C<is_uni_digit>, C<is_uni_digit_lc>, C<is_uni_graph>,
C<is_uni_graph_lc>, C<is_uni_idfirst>, C<is_uni_idfirst_lc>,
C<is_uni_lower>, C<is_uni_lower_lc>, C<is_uni_print>,
C<is_uni_print_lc>, C<is_uni_punct>, C<is_uni_punct_lc>,
C<is_uni_space>, C<is_uni_space_lc>, C<is_uni_upper>,
C<is_uni_upper_lc>, C<is_uni_xdigit>, C<is_uni_xdigit_lc>,
C<is_utf8_alnum>, C<is_utf8_alnumc>, C<is_utf8_alpha>,
C<is_utf8_ascii>, C<is_utf8_blank>, C<is_utf8_char>,
C<is_utf8_cntrl>, C<is_utf8_digit>, C<is_utf8_graph>,
C<is_utf8_idcont>, C<is_utf8_idfirst>, C<is_utf8_lower>,
C<is_utf8_mark>, C<is_utf8_perl_space>, C<is_utf8_perl_word>,
C<is_utf8_posix_digit>, C<is_utf8_print>, C<is_utf8_punct>,
C<is_utf8_space>, C<is_utf8_upper>, C<is_utf8_xdigit>,
C<is_utf8_xidcont>, C<is_utf8_xidfirst>.

In addition these three functions that have never worked properly are
deprecated:
C<to_uni_lower_lc>, C<to_uni_title_lc>, and C<to_uni_upper_lc>.

=head2 Certain rare uses of backslashes within regexes are now deprecated

There are three pairs of characters that Perl recognizes as
metacharacters in regular expression patterns: C<{}>, C<[]>, and C<()>.
These can be used as well to delimit patterns, as in:

  m{foo}
  s(foo)(bar)

Since they are metacharacters, they have special meaning to regular
expression patterns, and it turns out that you can't turn off that
special meaning by the normal means of preceding them with a backslash,
if you use them, paired, within a pattern delimited by them.  For
example, in

  m{foo\{1,3\}}

the backslashes do not change the behavior, and this matches
S<C<"f o">> followed by one to three more occurrences of C<"o">.

Usages like this, where they are interpreted as metacharacters, are
exceedingly rare; we think there are none, for example, in all of CPAN.
Hence, this deprecation should affect very little code.  It does give
notice, however, that any such code needs to change, which will in turn
allow us to change the behavior in future Perl versions so that the
backslashes do have an effect, and without fear that we are silently
breaking any existing code.

=head2 Splitting the tokens C<(?> and C<(*> in regular expressions

A deprecation warning is now raised if the C<(> and C<?> are separated
by white space or comments in C<(?...)> regular expression constructs.
Similarly, if the C<(> and C<*> are separated in C<(*VERB...)>
constructs.

=head2 Pre-PerlIO IO implementations

In theory, you can currently build perl without PerlIO.  Instead, you'd use a
wrapper around stdio or sfio.  In practice, this isn't very useful.  It's not
well tested, and without any support for IO layers or (thus) Unicode, it's not
much of a perl.  Building without PerlIO will most likely be removed in the
next version of perl.

PerlIO supports a C<stdio> layer if stdio use is desired.  Similarly a
sfio layer could be produced in the future, if needed.

=head1 Future Deprecations

=over

=item *

Platforms without support infrastructure

Both Windows CE and z/OS have been historically under-maintained, and are
currently neither successfully building nor regularly being smoke tested.
Efforts are underway to change this situation, but it should not be taken for
granted that the platforms are safe and supported.  If they do not become
buildable and regularly smoked, support for them may be actively removed in
future releases.  If you have an interest in these platforms and you can lend
your time, expertise, or hardware to help support these platforms, please let
the perl development effort know by emailing C<perl5-porters@perl.org>.

Some platforms that appear otherwise entirely dead are also on the short list
for removal between now and v5.20.0:

=over

=item DG/UX

=item NeXT

=back

We also think it likely that current versions of Perl will no longer
build on AmigaOS, DJGPP, NetWare (natively), OS/2 and Plan 9. If you
are using Perl on such a platform and have an interest in ensuring
Perl's future on them, please contact us.

We believe that Perl has long been unable to build on mixed endian
architectures (such as PDP-11s), and intend to remove any remaining
support code. Similarly, code supporting the long unmaintained GNU
dld will be removed soon if no-one makes themselves known as an
active user.

=item *

Swapping of $< and $>

Perl has supported the idiom of swapping $< and $> (and likewise $( and
$)) to temporarily drop permissions since 5.0, like this:

    ($<, $>) = ($>, $<);

However, this idiom modifies the real user/group id, which can have
undesirable side-effects, is no longer useful on any platform perl
supports and complicates the implementation of these variables and list
assignment in general.

As an alternative, assignment only to C<< $> >> is recommended:

    local $> = $<;

See also: L<Setuid Demystified|http://www.cs.berkeley.edu/~daw/papers/setuid-usenix02.pdf>.

=item *

C<microperl>, long broken and of unclear present purpose, will be removed.

=item *

Revamping C<< "\Q" >> semantics in double-quotish strings when combined with
other escapes.

There are several bugs and inconsistencies involving combinations
of C<\Q> and escapes like C<\x>, C<\L>, etc., within a C<\Q...\E> pair.
These need to be fixed, and doing so will necessarily change current
behavior.  The changes have not yet been settled.

=item *

Use of C<$x>, where C<x> stands for any actual (non-printing) C0 control
character, will be disallowed in a future Perl version.  Use C<${x}>
instead (where again C<x> stands for a control character),
or better, C<$^A>, where C<^> is a caret (CIRCUMFLEX ACCENT),
and C<A> stands for any of the characters listed at the end of
L<perlebcdic/OPERATOR DIFFERENCES>.

=back

=head1 Performance Enhancements

=over 4

=item *

Lists of lexical variable declarations (C<my($x, $y)>) are now optimised
down to a single op and are hence faster than before.

=item *

A new C preprocessor define C<NO_TAINT_SUPPORT> was added that, if set,
disables Perl's taint support altogether.  Using the -T or -t command
line flags will cause a fatal error.  Beware that both core tests as
well as many a CPAN distribution's tests will fail with this change.  On
the upside, it provides a small performance benefit due to reduced
branching.

B<Do not enable this unless you know exactly what you are getting yourself
into.>

=item *

C<pack> with constant arguments is now constant folded in most cases
[perl #113470].

=item *

Speed up in regular expression matching against Unicode properties.  The
largest gain is for C<\X>, the Unicode "extended grapheme cluster."  The
gain for it is about 35% - 40%.  Bracketed character classes, e.g.,
C<[0-9\x{100}]> containing code points above 255 are also now faster.

=item *

On platforms supporting it, several former macros are now implemented as static
inline functions. This should speed things up slightly on non-GCC platforms.

=item *

The optimisation of hashes in boolean context has been extended to
affect C<scalar(%hash)>, C<%hash ? ... : ...>, and C<sub { %hash || ... }>.

=item *

Filetest operators manage the stack in a fractionally more efficient manner.

=item *

Globs used in a numeric context are now numified directly in most cases,
rather than being numified via stringification.

=item *

The C<x> repetition operator is now folded to a single constant at compile
time if called in scalar context with constant operands and no parentheses
around the left operand.

=back

=head1 Modules and Pragmata

=head2 New Modules and Pragmata

=over 4

=item *

L<Config::Perl::V> version 0.16 has been added as a dual-lifed module.
It provides structured data retrieval of C<perl -V> output including
information only known to the C<perl> binary and not available via L<Config>.

=back

=head2 Updated Modules and Pragmata

For a complete list of updates, run:

  $ corelist --diff 5.16.0 5.18.0

You can substitute your favorite version in place of C<5.16.0>, too.

=over

=item *

L<Archive::Extract> has been upgraded to 0.68.

Work around an edge case on Linux with Busybox's unzip.

=item *

L<Archive::Tar> has been upgraded to 1.90.

ptar now supports the -T option as well as dashless options
[rt.cpan.org #75473], [rt.cpan.org #75475].

Auto-encode filenames marked as UTF-8 [rt.cpan.org #75474].

Don't use C<tell> on L<IO::Zlib> handles [rt.cpan.org #64339].

Don't try to C<chown> on symlinks.

=item *

L<autodie> has been upgraded to 2.13.

C<autodie> now plays nicely with the 'open' pragma.

=item *

L<B> has been upgraded to 1.42.

The C<stashoff> method of COPs has been added.   This provides access to an
internal field added in perl 5.16 under threaded builds [perl #113034].

C<B::COP::stashpv> now supports UTF-8 package names and embedded NULs.

All C<CVf_*> and C<GVf_*>
and more SV-related flag values are now provided as constants in the C<B::>
namespace and available for export.  The default export list has not changed.

This makes the module work with the new pad API.

=item *

L<B::Concise> has been upgraded to 0.95.

The C<-nobanner> option has been fixed, and C<format>s can now be dumped.
When passed a sub name to dump, it will check also to see whether it
is the name of a format.  If a sub and a format share the same name,
it will dump both.

This adds support for the new C<OPpMAYBE_TRUEBOOL> and C<OPpTRUEBOOL> flags.

=item *

L<B::Debug> has been upgraded to 1.18.

This adds support (experimentally) for C<B::PADLIST>, which was
added in Perl 5.17.4.

=item *

L<B::Deparse> has been upgraded to 1.20.

Avoid warning when run under C<perl -w>.

It now deparses
loop controls with the correct precedence, and multiple statements in a
C<format> line are also now deparsed correctly.

This release suppresses trailing semicolons in formats.

This release adds stub deparsing for lexical subroutines.

It no longer dies when deparsing C<sort> without arguments.  It now
correctly omits the comma for C<system $prog @args> and C<exec $prog
@args>.

=item *

L<bignum>, L<bigint> and L<bigrat> have been upgraded to 0.33.

The overrides for C<hex> and C<oct> have been rewritten, eliminating
several problems, and making one incompatible change:

=over

=item *

Formerly, whichever of C<use bigint> or C<use bigrat> was compiled later
would take precedence over the other, causing C<hex> and C<oct> not to
respect the other pragma when in scope.

=item *

Using any of these three pragmata would cause C<hex> and C<oct> anywhere
else in the program to evaluate their arguments in list context and prevent
them from inferring $_ when called without arguments.

=item *

Using any of these three pragmata would make C<oct("1234")> return 1234
(for any number not beginning with 0) anywhere in the program.  Now "1234"
is translated from octal to decimal, whether within the pragma's scope or
not.

=item *

The global overrides that facilitate lexical use of C<hex> and C<oct> now
respect any existing overrides that were in place before the new overrides
were installed, falling back to them outside of the scope of C<use bignum>.

=item *

C<use bignum "hex">, C<use bignum "oct"> and similar invocations for bigint
and bigrat now export a C<hex> or C<oct> function, instead of providing a
global override.

=back

=item *

L<Carp> has been upgraded to 1.29.

Carp is no longer confused when C<caller> returns undef for a package that
has been deleted.

The C<longmess()> and C<shortmess()> functions are now documented.

=item *

L<CGI> has been upgraded to 3.63.

Unrecognized HTML escape sequences are now handled better, problematic
trailing newlines are no longer inserted after E<lt>formE<gt> tags
by C<startform()> or C<start_form()>, and bogus "Insecure Dependency"
warnings appearing with some versions of perl are now worked around.

=item *

L<Class::Struct> has been upgraded to 0.64.

The constructor now respects overridden accessor methods [perl #29230].

=item *

L<Compress::Raw::Bzip2> has been upgraded to 2.060.

The misuse of Perl's "magic" API has been fixed.

=item *

L<Compress::Raw::Zlib> has been upgraded to 2.060.

Upgrade bundled zlib to version 1.2.7.

Fix build failures on Irix, Solaris, and Win32, and also when building as C++
[rt.cpan.org #69985], [rt.cpan.org #77030], [rt.cpan.org #75222].

The misuse of Perl's "magic" API has been fixed.

C<compress()>, C<uncompress()>, C<memGzip()> and C<memGunzip()> have
been sped up by making parameter validation more efficient.

=item *

L<CPAN::Meta::Requirements> has been upgraded to 2.122.

Treat undef requirements to C<from_string_hash> as 0 (with a warning).

Added C<requirements_for_module> method.

=item *

L<CPANPLUS> has been upgraded to 0.9135.

Allow adding F<blib/script> to PATH.

Save the history between invocations of the shell.

Handle multiple C<makemakerargs> and C<makeflags> arguments better.

This resolves issues with the SQLite source engine.

=item *

L<Data::Dumper> has been upgraded to 2.145.

It has been optimized to only build a seen-scalar hash as necessary,
thereby speeding up serialization drastically.

Additional tests were added in order to improve statement, branch, condition
and subroutine coverage.  On the basis of the coverage analysis, some of the
internals of Dumper.pm were refactored.  Almost all methods are now
documented.

=item *

L<DB_File> has been upgraded to 1.827.

The main Perl module no longer uses the C<"@_"> construct.

=item *

L<Devel::Peek> has been upgraded to 1.11.

This fixes compilation with C++ compilers and makes the module work with
the new pad API.

=item *

L<Digest::MD5> has been upgraded to 2.52.

Fix C<Digest::Perl::MD5> OO fallback [rt.cpan.org #66634].

=item *

L<Digest::SHA> has been upgraded to 5.84.

This fixes a double-free bug, which might have caused vulnerabilities
in some cases.

=item *

L<DynaLoader> has been upgraded to 1.18.

This is due to a minor code change in the XS for the VMS implementation.

This fixes warnings about using C<CODE> sections without an C<OUTPUT>
section.

=item *

L<Encode> has been upgraded to 2.49.

The Mac alias x-mac-ce has been added, and various bugs have been fixed
in Encode::Unicode, Encode::UTF7 and Encode::GSM0338.

=item *

L<Env> has been upgraded to 1.04.

Its SPLICE implementation no longer misbehaves in list context.

=item *

L<ExtUtils::CBuilder> has been upgraded to 0.280210.

Manifest files are now correctly embedded for those versions of VC++ which
make use of them. [perl #111782, #111798].

A list of symbols to export can now be passed to C<link()> when on
Windows, as on other OSes [perl #115100].

=item *

L<ExtUtils::ParseXS> has been upgraded to 3.18.

The generated C code now avoids unnecessarily incrementing
C<PL_amagic_generation> on Perl versions where it's done automatically
(or on current Perl where the variable no longer exists).

This avoids a bogus warning for initialised XSUB non-parameters [perl
#112776].

=item *

L<File::Copy> has been upgraded to 2.26.

C<copy()> no longer zeros files when copying into the same directory,
and also now fails (as it has long been documented to do) when attempting
to copy a file over itself.

=item *

L<File::DosGlob> has been upgraded to 1.10.

The internal cache of file names that it keeps for each caller is now
freed when that caller is freed.  This means
C<< use File::DosGlob 'glob'; eval 'scalar <*>' >> no longer leaks memory.

=item *

L<File::Fetch> has been upgraded to 0.38.

Added the 'file_default' option for URLs that do not have a file
component.

Use C<File::HomeDir> when available, and provide C<PERL5_CPANPLUS_HOME> to
override the autodetection.

Always re-fetch F<CHECKSUMS> if C<fetchdir> is set.

=item *

L<File::Find> has been upgraded to 1.23.

This fixes inconsistent unixy path handling on VMS.

Individual files may now appear in list of directories to be searched
[perl #59750].

=item *

L<File::Glob> has been upgraded to 1.20.

File::Glob has had exactly the same fix as File::DosGlob.  Since it is
what Perl's own C<glob> operator itself uses (except on VMS), this means
C<< eval 'scalar <*>' >> no longer leaks.

A space-separated list of patterns that returns long lists of results no longer
results in memory corruption or crashes.  This bug was introduced in
Perl 5.16.0.  [perl #114984]

=item *

L<File::Spec::Unix> has been upgraded to 3.40.

C<abs2rel> could produce incorrect results when given two relative paths or
the root directory twice [perl #111510].

=item *

L<File::stat> has been upgraded to 1.07.

C<File::stat> ignores the L<filetest> pragma, and warns when used in
combination therewith.  But it was not warning for C<-r>.  This has been
fixed [perl #111640].

C<-p> now works, and does not return false for pipes [perl #111638].

Previously C<File::stat>'s overloaded C<-x> and C<-X> operators did not give
the correct results for directories or executable files when running as
root. They had been treating executable permissions for root just like for
any other user, performing group membership tests I<etc> for files not owned
by root. They now follow the correct Unix behaviour - for a directory they
are always true, and for a file if any of the three execute permission bits
are set then they report that root can execute the file. Perl's builtin
C<-x> and C<-X> operators have always been correct.

=item *

L<File::Temp> has been upgraded to 0.23.

Fixes various bugs involving directory removal.  Defers unlinking tempfiles if
the initial unlink fails, which fixes problems on NFS.

=item *

L<GDBM_File> has been upgraded to 1.15.

The undocumented optional fifth parameter to C<TIEHASH> has been
removed. This was intended to provide control of the callback used by
C<gdbm*> functions in case of fatal errors (such as filesystem problems),
but did not work (and could never have worked). No code on CPAN even
attempted to use it. The callback is now always the previous default,
C<croak>. Problems on some platforms with how the C<C> C<croak> function
is called have also been resolved.

=item *

L<Hash::Util> has been upgraded to 0.15.

C<hash_unlocked> and C<hashref_unlocked> now return true if the hash is
unlocked, instead of always returning false [perl #112126].

C<hash_unlocked>, C<hashref_unlocked>, C<lock_hash_recurse> and
C<unlock_hash_recurse> are now exportable [perl #112126].

Two new functions, C<hash_locked> and C<hashref_locked>, have been added.
Oddly enough, these two functions were already exported, even though they
did not exist [perl #112126].

=item *

L<HTTP::Tiny> has been upgraded to 0.025.

Add SSL verification features [github #6], [github #9].

Include the final URL in the response hashref.

Add C<local_address> option.

This improves SSL support.

=item *

L<IO> has been upgraded to 1.28.

C<sync()> can now be called on read-only file handles [perl #64772].

L<IO::Socket> tries harder to cache or otherwise fetch socket
information.

=item *

L<IPC::Cmd> has been upgraded to 0.80.

Use C<POSIX::_exit> instead of C<exit> in C<run_forked> [rt.cpan.org #76901].

=item *

L<IPC::Open3> has been upgraded to 1.13.

The C<open3()> function no longer uses C<POSIX::close()> to close file
descriptors since that breaks the ref-counting of file descriptors done by
PerlIO in cases where the file descriptors are shared by PerlIO streams,
leading to attempts to close the file descriptors a second time when
any such PerlIO streams are closed later on.

=item *

L<Locale::Codes> has been upgraded to 3.25.

It includes some new codes.

=item *

L<Memoize> has been upgraded to 1.03.

Fix the C<MERGE> cache option.

=item *

L<Module::Build> has been upgraded to 0.4003.

Fixed bug where modules without C<$VERSION> might have a version of '0' listed
in 'provides' metadata, which will be rejected by PAUSE.

Fixed bug in PodParser to allow numerals in module names.

Fixed bug where giving arguments twice led to them becoming arrays, resulting
in install paths like F<ARRAY(0xdeadbeef)/lib/Foo.pm>.

A minor bug fix allows markup to be used around the leading "Name" in
a POD "abstract" line, and some documentation improvements have been made.

=item *

L<Module::CoreList> has been upgraded to 2.90.

Version information is now stored as a delta, which greatly reduces the
size of the F<CoreList.pm> file.

This restores compatibility with older versions of perl and cleans up
the corelist data for various modules.

=item *

L<Module::Load::Conditional> has been upgraded to 0.54.

Fix use of C<requires> on perls installed to a path with spaces.

Various enhancements include the new use of Module::Metadata.

=item *

L<Module::Metadata> has been upgraded to 1.000011.

The creation of a Module::Metadata object for a typical module file has
been sped up by about 40%, and some spurious warnings about C<$VERSION>s
have been suppressed.

=item *

L<Module::Pluggable> has been upgraded to 4.7.

Amongst other changes, triggers are now allowed on events, which gives
a powerful way to modify behaviour.

=item *

L<Net::Ping> has been upgraded to 2.41.

This fixes some test failures on Windows.

=item *

L<Opcode> has been upgraded to 1.25.

Reflect the removal of the boolkeys opcode and the addition of the
clonecv, introcv and padcv opcodes.

=item *

L<overload> has been upgraded to 1.22.

C<no overload> now warns for invalid arguments, just like C<use overload>.

=item *

L<PerlIO::encoding> has been upgraded to 0.16.

This is the module implementing the ":encoding(...)" I/O layer.  It no
longer corrupts memory or crashes when the encoding back-end reallocates
the buffer or gives it a typeglob or shared hash key scalar.

=item *

L<PerlIO::scalar> has been upgraded to 0.16.

The buffer scalar supplied may now only contain code points 0xFF or
lower. [perl #109828]

=item *

L<Perl::OSType> has been upgraded to 1.003.

This fixes a bug detecting the VOS operating system.

=item *

L<Pod::Html> has been upgraded to 1.18.

The option C<--libpods> has been reinstated. It is deprecated, and its use
does nothing other than issue a warning that it is no longer supported.

Since the HTML files generated by pod2html claim to have a UTF-8 charset,
actually write the files out using UTF-8 [perl #111446].

=item *

L<Pod::Simple> has been upgraded to 3.28.

Numerous improvements have been made, mostly to Pod::Simple::XHTML,
which also has a compatibility change: the C<codes_in_verbatim> option
is now disabled by default.  See F<cpan/Pod-Simple/ChangeLog> for the
full details.

=item *

L<re> has been upgraded to 0.23.

Single character [class]es like C</[s]/> or C</[s]/i> are now optimized
as if they did not have the brackets, i.e. C</s/> or C</s/i>.

See note about C<op_comp> in the L</Internal Changes> section below.

=item *

L<Safe> has been upgraded to 2.35.

Fix interactions with C<Devel::Cover>.

Don't eval code under C<no strict>.

=item *

L<Scalar::Util> has been upgraded to version 1.27.

Fix an overloading issue with C<sum>.

C<first> and C<reduce> now check the callback first (so C<&first(1)> is
disallowed).

Fix C<tainted> on magical values [rt.cpan.org #55763].

Fix C<sum> on previously magical values [rt.cpan.org #61118].

Fix reading past the end of a fixed buffer [rt.cpan.org #72700].

=item *

L<Search::Dict> has been upgraded to 1.07.

No longer require C<stat> on filehandles.

Use C<fc> for casefolding.

=item *

L<Socket> has been upgraded to 2.009.

Constants and functions required for IP multicast source group membership
have been added.

C<unpack_sockaddr_in()> and C<unpack_sockaddr_in6()> now return just the IP
address in scalar context, and C<inet_ntop()> now guards against incorrect
length scalars being passed in.

This fixes an uninitialized memory read.

=item *

L<Storable> has been upgraded to 2.41.

Modifying C<$_[0]> within C<STORABLE_freeze> no longer results in crashes
[perl #112358].

An object whose class implements C<STORABLE_attach> is now thawed only once
when there are multiple references to it in the structure being thawed
[perl #111918].

Restricted hashes were not always thawed correctly [perl #73972].

Storable would croak when freezing a blessed REF object with a
C<STORABLE_freeze()> method [perl #113880].

It can now freeze and thaw vstrings correctly.  This causes a slight
incompatible change in the storage format, so the format version has
increased to 2.9.

This contains various bugfixes, including compatibility fixes for older
versions of Perl and vstring handling.

=item *

L<Sys::Syslog> has been upgraded to 0.32.

This contains several bug fixes relating to C<getservbyname()>,
C<setlogsock()> and log levels in C<syslog()>, together with fixes for
Windows, Haiku-OS and GNU/kFreeBSD.  See F<cpan/Sys-Syslog/Changes>
for the full details.

=item *

L<Term::ANSIColor> has been upgraded to 4.02.

Add support for italics.

Improve error handling.

=item *

L<Term::ReadLine> has been upgraded to 1.10.  This fixes the
use of the B<cpan> and B<cpanp> shells on Windows in the event that the current
drive happens to contain a F<\dev\tty> file.

=item *

L<Test::Harness> has been upgraded to 3.26.

Fix glob semantics on Win32 [rt.cpan.org #49732].

Don't use C<Win32::GetShortPathName> when calling perl [rt.cpan.org #47890].

Ignore -T when reading shebang [rt.cpan.org #64404].

Handle the case where we don't know the wait status of the test more
gracefully.

Make the test summary 'ok' line overridable so that it can be changed to a
plugin to make the output of prove idempotent.

Don't run world-writable files.

=item *

L<Text::Tabs> and L<Text::Wrap> have been upgraded to
2012.0818.  Support for Unicode combining characters has been added to them
both.

=item *

L<threads::shared> has been upgraded to 1.31.

This adds the option to warn about or ignore attempts to clone structures
that can't be cloned, as opposed to just unconditionally dying in
that case.

This adds support for dual-valued values as created by
L<Scalar::Util::dualvar|Scalar::Util/"dualvar NUM, STRING">.

=item *

L<Tie::StdHandle> has been upgraded to 4.3.

C<READ> now respects the offset argument to C<read> [perl #112826].

=item *

L<Time::Local> has been upgraded to 1.2300.

Seconds values greater than 59 but less than 60 no longer cause
C<timegm()> and C<timelocal()> to croak.

=item *

L<Unicode::UCD> has been upgraded to 0.53.

This adds a function L<all_casefolds()|Unicode::UCD/all_casefolds()>
that returns all the casefolds.

=item *

L<Win32> has been upgraded to 0.47.

New APIs have been added for getting and setting the current code page.

=back
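
To illustrate the L<Hash::Util> entry above, here is a small sketch (the hash
and its contents are made up for the example) of how C<hash_locked> and
C<hash_unlocked> are expected to behave once they are imported explicitly:

    use strict;
    use warnings;
    use Hash::Util qw(lock_hash unlock_hash hash_locked hash_unlocked);

    my %config = (colour => 'red');    # hypothetical data

    # After locking, hash_locked() should report true.
    lock_hash(%config);
    my $locked_after_lock = hash_locked(%config) ? 1 : 0;

    # After unlocking, hash_unlocked() should report true rather than
    # always returning false as it did before this upgrade.
    unlock_hash(%config);
    my $unlocked_after_unlock = hash_unlocked(%config) ? 1 : 0;

    print "locked: $locked_after_lock, unlocked: $unlocked_after_unlock\n";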


=head2 Removed Modules and Pragmata

=over

=item *

L<Version::Requirements> has been removed from the core distribution.  It is
available under a different name: L<CPAN::Meta::Requirements>.

=back

=head1 Documentation

=head2 Changes to Existing Documentation

=head3 L<perlcheat>

=over 4

=item *

L<perlcheat> has been reorganized, and a few new sections were added.

=back

=head3 L<perldata>

=over 4

=item *

Now explicitly documents the behaviour of hash initializer lists that
contain duplicate keys.

=back

=head3 L<perldiag>

=over 4

=item *

The explanation of symbolic references being prevented by "strict refs"
now doesn't assume that the reader knows what symbolic references are.

=back

=head3 L<perlfaq>

=over 4

=item *

L<perlfaq> has been synchronized with version 5.0150040 from CPAN.

=back

=head3 L<perlfunc>

=over 4

=item *

The return value of C<pipe> is now documented.

=item *

Clarified documentation of C<our>.

=back

=head3 L<perlop>

=over 4

=item *

Loop control verbs (C<dump>, C<goto>, C<next>, C<last> and C<redo>) have always
had the same precedence as assignment operators, but this was not documented
until now.

=back

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 New Diagnostics

=head3 New Errors

=over 4

=item *

L<Unterminated delimiter for here document|perldiag/"Unterminated delimiter for here document">

This message now occurs when a here document label has an initial quotation
mark but the final quotation mark is missing.

This replaces a bogus and misleading error message about not finding the label
itself [perl #114104].

=item *

L<panic: child pseudo-process was never scheduled|perldiag/"panic: child pseudo-process was never scheduled">

This error is thrown when a child pseudo-process in the ithreads implementation
on Windows was not scheduled within the time period allowed and therefore was
not able to initialize properly [perl #88840].

=item *

L<Group name must start with a non-digit word character in regex; marked by <-- HERE in mE<sol>%sE<sol>|perldiag/"Group name must start with a non-digit word character in regex; marked by <-- HERE in m/%s/">

This error has been added for C<(?&0)>, which is invalid.  It used to
produce an incomprehensible error message [perl #101666].

=item *

L<Can't use an undefined value as a subroutine reference|perldiag/"Can't use an undefined value as %s reference">

Calling an undefined value as a subroutine now produces this error message.
It used to, but was accidentally disabled, first in Perl 5.004 for
non-magical variables, and then in Perl v5.14 for magical (e.g., tied)
variables.  It has now been restored.  In the meantime, undef was treated
as an empty string [perl #113576].

=item *

L<Experimental "%s" subs not enabled|perldiag/"Experimental "%s" subs not enabled">

To use lexical subs, you must first enable them:

    no warnings 'experimental::lexical_subs';
    use feature 'lexical_subs';
    my sub foo { ... }

=back
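
The restored "undefined value as a subroutine reference" error described
above can be observed with a sketch along these lines.  The variable names
are illustrative, and the error is trapped with C<eval> only so that the
message can be inspected:

    use strict;
    use warnings;

    my $callback;    # deliberately left undefined

    # Calling through the undefined value is fatal, so trap the error.
    my $ok    = eval { $callback->(); 1 };
    my $error = $ok ? '' : $@;

    print "Caught: $error";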

=head3 New Warnings

=over 4

=item *

L<'Strings with code points over 0xFF may not be mapped into in-memory file handles'|perldiag/"Strings with code points over 0xFF may not be mapped into in-memory file handles">

=item *

L<'%s' resolved to '\o{%s}%d'|perldiag/"'%s' resolved to '\o{%s}%d'">

=item *

L<'Trailing white-space in a charnames alias definition is deprecated'|perldiag/"Trailing white-space in a charnames alias definition is deprecated">

=item *

L<'A sequence of multiple spaces in a charnames alias definition is deprecated'|perldiag/"A sequence of multiple spaces in a charnames alias definition is deprecated">

=item *

L<'Passing malformed UTF-8 to "%s" is deprecated'|perldiag/"Passing malformed UTF-8 to "%s" is deprecated">

=item *

L<Subroutine "&%s" is not available|perldiag/"Subroutine "&%s" is not available">

(W closure) During compilation, an inner named subroutine or eval is
attempting to capture an outer lexical subroutine that is not currently
available.  This can happen for one of two reasons.  First, the lexical
subroutine may be declared in an outer anonymous subroutine that has not
yet been created.  (Remember that named subs are created at compile time,
while anonymous subs are created at run-time.)  For example,

    sub { my sub a {...} sub f { \&a } }

At the time that f is created, it can't capture the current "a" sub,
since the anonymous subroutine hasn't been created yet.  Conversely, the
following won't give a warning since the anonymous subroutine has by now
been created and is live:

    sub { my sub a {...} eval 'sub f { \&a }' }->();

The second situation is caused by an eval accessing a variable that has
gone out of scope, for example,

    sub f {
        my sub a {...}
        sub { eval '\&a' }
    }
    f()->();

Here, when the '\&a' in the eval is being compiled, f() is not currently
being executed, so its &a is not available for capture.

=item *

L<"%s" subroutine &%s masks earlier declaration in same %s|perldiag/"%s" subroutine &%s masks earlier declaration in same %s>

(W misc) A "my" or "state" subroutine has been redeclared in the
current scope or statement, effectively eliminating all access to
the previous instance.  This is almost always a typographical error.
Note that the earlier subroutine will still exist until the end of
the scope or until all closure references to it are destroyed.

=item *

L<The %s feature is experimental|perldiag/"The %s feature is experimental">

(S experimental) This warning is emitted if you enable an experimental
feature via C<use feature>.  Simply suppress the warning if you want
to use the feature, but know that in doing so you are taking the risk
of using an experimental feature which may change or be removed in a
future Perl version:

    no warnings "experimental::lexical_subs";
    use feature "lexical_subs";

=item *

L<sleep(%u) too large|perldiag/"sleep(%u) too large">

(W overflow) You called C<sleep> with a number that was larger than it can
reliably handle and C<sleep> probably slept for less time than requested.

=item *

L<Wide character in setenv|perldiag/"Wide character in %s">

Attempts to put wide characters into environment variables via C<%ENV> now
provoke this warning.

=item *

"L<Invalid negative number (%s) in chr|perldiag/"Invalid negative number (%s) in chr">"

C<chr()> now warns when passed a negative value [perl #83048].

=item *

"L<Integer overflow in srand|perldiag/"Integer overflow in srand">"

C<srand()> now warns when passed a value that doesn't fit in a C<UV> (since the
value will be truncated rather than overflowing) [perl #40605].

=item *

"L<-i used with no filenames on the command line, reading from STDIN|perldiag/"-i used with no filenames on the command line, reading from STDIN">"

Running perl with the C<-i> flag now warns if no input files are provided on
the command line [perl #113410].

=back
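
As a rough illustration of one of the new warnings listed above, the
following sketch collects the warning that C<chr()> issues for a negative
argument.  The C<__WARN__> handler and the variable names are assumptions
made for the example, not part of any API described here:

    use strict;
    use warnings;

    my @caught;
    local $SIG{__WARN__} = sub { push @caught, $_[0] };

    # The value is held in a variable so that chr() is evaluated at run
    # time, after the handler above has been installed.
    my $code_point = -1;
    my $char       = chr($code_point);

    print "warning: $caught[0]";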

=head2 Changes to Existing Diagnostics

=over 4

=item *

L<$* is no longer supported|perldiag/"$* is no longer supported">

The warning that use of C<$*> and C<$#> is no longer supported is now
generated for every location that references them.  Previously it would fail
to be generated if another variable using the same typeglob was seen first
(e.g. C<@*> before C<$*>), and would not be generated for the second and
subsequent uses.  (It's hard to fix the failure to generate warnings at all
without also generating them every time, and warning every time is
consistent with the warnings that C<$[> used to generate.)

=item *

The warnings for C<\b{> and C<\B{> are deprecation warnings, so turning off
the "deprecated" category now suffices to silence them.  It is no longer
necessary to turn off the regular regexp warnings as well.

=item *

L<Constant(%s): Call to &{$^H{%s}} did not return a defined value|perldiag/Constant(%s): Call to &{$^H{%s}} did not return a defined value>

Constant overloading that returns C<undef> results in this error message.
For numeric constants, it used to say "Constant(undef)".  "undef" has been
replaced with the number itself.

=item *

The error produced when a module cannot be loaded now includes a hint that
the module may need to be installed: "Can't locate hopping.pm in @INC (you
may need to install the hopping module) (@INC contains: ...)"

=item *

L<vector argument not supported with alpha versions|perldiag/vector argument not supported with alpha versions>

This warning was not suppressible, even with C<no warnings>.  Now it is
suppressible, and has been moved from the "internal" category to the
"printf" category.

=item *

C<< Can't do {n,m} with n > m in regex; marked by <-- HERE in m/%s/ >>

This fatal error has been turned into a warning that reads:

L<< Quantifier {n,m} with n > m can't match in regex | perldiag/Quantifier {n,m} with n > m can't match in regex >>

(W regexp) Minima should be less than or equal to maxima.  If you really want
your regexp to match something 0 times, just put {0}.

=item *

The "Runaway prototype" warning that occurs in bizarre cases has been
removed as being unhelpful and inconsistent.

=item *

The "Not a format reference" error has been removed, as the only case in
which it could be triggered was a bug.

=item *

The "Unable to create sub named %s" error has been removed for the same
reason.

=item *

The 'Can't use "my %s" in sort comparison' error has been downgraded to a
warning, '"my %s" used in sort comparison' (with 'state' instead of 'my'
for state variables).  In addition, the heuristics for guessing whether
lexical $a or $b has been misused have been improved to generate fewer
false positives.  Lexical $a and $b are no longer disallowed if they are
outside the sort block.  Also, a named unary or list operator inside the
sort block no longer causes the $a or $b to be ignored [perl #86136].

=back
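
The change from a fatal error to a warning for C<{n,m}> quantifiers with
n > m can be sketched as follows.  The pattern and the C<__WARN__> handler
are chosen purely for illustration; the point is that compilation now
succeeds and the resulting pattern simply fails to match:

    use strict;
    use warnings;

    my @caught;
    local $SIG{__WARN__} = sub { push @caught, $_[0] };

    # Interpolating the pattern from a string defers compilation to run
    # time, so the handler above is in place when the warning is issued.
    my $pattern = 'a{3,1}';
    my $regex   = qr/$pattern/;

    my $matched = ('aaa' =~ $regex) ? 1 : 0;

    print "matched: $matched\n";
    print "warning: $_" for @caught;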

=head1 Utility Changes

=head3 L<h2xs>

=over 4

=item *

F<h2xs> no longer produces invalid code for empty defines.  [perl #20636]

=back

=head1 Configuration and Compilation

=over 4

=item *

Added C<useversionedarchname> option to Configure

When set, it includes 'api_versionstring' in 'archname'. E.g.
x86_64-linux-5.13.6-thread-multi.  It is unset by default.

This feature was requested by Tim Bunce, who observed that
C<INSTALL_BASE> creates a library structure that does not
differentiate by perl version.  Instead, it places architecture
specific files in "$install_base/lib/perl5/$archname".  This makes
it difficult to use a common C<INSTALL_BASE> library path with
multiple versions of perl.

By setting C<-Duseversionedarchname>, the $archname will be
distinct for architecture I<and> API version, allowing mixed use of
C<INSTALL_BASE>.

=item *

Add a C<PERL_NO_INLINE_FUNCTIONS> option

If C<PERL_NO_INLINE_FUNCTIONS> is defined, don't include "inline.h".

This permits test code to include the perl headers for definitions without
creating a link dependency on the perl library (which may not exist yet).

=item *

Configure will honour the external C<MAILDOMAIN> environment variable, if set.

=item *

C<installman> no longer ignores the silent option.

=item *

Both C<META.yml> and C<META.json> files are now included in the distribution.

=item *

F<Configure> will now correctly detect C<isblank()> when compiling with a C++
compiler.

=item *

The pager detection in F<Configure> has been improved to allow responses which
specify options after the program name, e.g. B</usr/bin/less -R>, if the user
accepts the default value.  This helps B<perldoc> when handling ANSI escapes
[perl #72156].

=back

=head1 Testing

=over 4

=item *

The test suite now has a section for tests that require very large amounts
of memory.  These tests won't run by default; they can be enabled by
setting the C<PERL_TEST_MEMORY> environment variable to the number of
gibibytes of memory that may be safely used.

=back

=head1 Platform Support

=head2 Discontinued Platforms

=over 4

=item BeOS

BeOS was an operating system for personal computers developed by Be Inc,
initially for their BeBox hardware. The OS Haiku was written as an open
source replacement for/continuation of BeOS, and its perl port is current and
actively maintained.

=item UTS Global

Support code relating to UTS global has been removed.  UTS was a mainframe
version of System V created by Amdahl, subsequently sold to UTS Global.  The
port has not been touched since before Perl v5.8.0, and UTS Global is now
defunct.

=item VM/ESA

Support for VM/ESA has been removed. The port was tested on 2.3.0, which
IBM ended service on in March 2002. 2.4.0 ended service in June 2003, and
was superseded by Z/VM. The current version of Z/VM is V6.2.0, and is scheduled
for end of service on 2015/04/30.

=item MPE/IX

Support for MPE/IX has been removed.

=item EPOC

Support code relating to EPOC has been removed.  EPOC was a family of
operating systems developed by Psion for mobile devices.  It was the
predecessor of Symbian.  The port was last updated in April 2002.

=item Rhapsody

Support for Rhapsody has been removed.

=back

=head2 Platform-Specific Notes

=head3 AIX

Configure now always adds C<-qlanglvl=extc99> to the CC flags on AIX when
using xlC.  This will make it easier to compile a number of XS-based modules
that assume C99 [perl #113778].

=head3 clang++

There is now a workaround for a compiler bug that prevented compiling
with clang++ since Perl v5.15.7 [perl #112786].

=head3 C++

When compiling the Perl core as C++ (which is only semi-supported), the
mathom functions are now compiled as C<extern "C">, to ensure proper
binary compatibility.  (However, binary compatibility isn't generally
guaranteed anyway in the situations where this would matter.)

=head3 Darwin

Stop hardcoding an alignment on 8 byte boundaries to fix builds using
-Dusemorebits.

=head3 Haiku

Perl should now work out of the box on Haiku R1 Alpha 4.

=head3 MidnightBSD

C<libc_r> was removed from recent versions of MidnightBSD and older versions
work better with C<pthread>. Threading is now enabled using C<pthread> which
corrects build errors with threading enabled on 0.4-CURRENT.

=head3 Solaris

In Configure, avoid running sed commands with flags not supported on Solaris.

=head3 VMS

=over

=item *

Where possible, the case of filenames and command-line arguments is now
preserved by enabling the CRTL features C<DECC$EFS_CASE_PRESERVE> and
C<DECC$ARGV_PARSE_STYLE> at start-up time.  The latter only takes effect
when extended parse is enabled in the process from which Perl is run.

=item *

The character set for Extended Filename Syntax (EFS) is now enabled by default
on VMS.  Among other things, this provides better handling of dots in directory
names, multiple dots in filenames, and spaces in filenames.  To obtain the old
behavior, set the logical name C<DECC$EFS_CHARSET> to C<DISABLE>.

=item *

Fixed linking on builds configured with C<-Dusemymalloc=y>.

=item *

Experimental support for building Perl with the HP C++ compiler is available
by configuring with C<-Dusecxx>.

=item *

All C header files from the top-level directory of the distribution are now
installed on VMS, providing consistency with a long-standing practice on other
platforms. Previously only a subset were installed, which broke non-core
extension builds for extensions that depended on the missing include files.

=item *

Quotes are now removed from the command verb (but not the parameters) for
commands spawned via C<system>, backticks, or a piped C<open>.  Previously,
quotes on the verb were passed through to DCL, which would fail to recognize
the command.  Also, if the verb is actually a path to an image or command
procedure on an ODS-5 volume, quoting it now allows the path to contain spaces.

=item *

The B<a2p> build has been fixed for the HP C++ compiler on OpenVMS.

=back

=head3 Win32

=over

=item *

Perl can now be built using Microsoft's Visual C++ 2012 compiler by specifying
CCTYPE=MSVC110 (or MSVC110FREE if you are using the free Express edition for
Windows Desktop) in F<win32/Makefile>.

=item *

The option to build without C<USE_SOCKETS_AS_HANDLES> has been removed.

=item *

Fixed a problem where perl could crash while cleaning up threads (including the
main thread) in threaded debugging builds on Win32 and possibly other platforms
[perl #114496].

=item *

A rare race condition that would lead to L<sleep|perlfunc/sleep> taking more
time than requested, and possibly even hanging, has been fixed [perl #33096].

=item *

C<link> on Win32 now attempts to set C<$!> to more appropriate values
based on the Win32 API error code. [perl #112272]

Perl no longer mangles the environment block, e.g. when launching a new
sub-process, when the environment contains non-ASCII characters. Known
problems still remain, however, when the environment contains characters
outside of the current ANSI codepage (e.g. see the item about Unicode in
C<%ENV> in L<http://perl5.git.perl.org/perl.git/blob/HEAD:/Porting/todo.pod>).
[perl #113536]

=item *

Building perl with some Windows compilers used to fail due to a problem
with miniperl's C<glob> operator (which uses the C<perlglob> program)
deleting the PATH environment variable [perl #113798].

=item *

A new makefile option, C<USE_64_BIT_INT>, has been added to the Windows
makefiles.  Set this to "define" when building a 32-bit perl if you want
it to use 64-bit integers.

=item *

Machine code size reductions, already made to the DLLs of XS modules in
Perl v5.17.2, have now been extended to the perl DLL itself.

=item *

Building with VC++ 6.0 was inadvertently broken in Perl v5.17.2 but has
now been fixed again.

=back

=head3 WinCE

Building on WinCE is now possible once again, although more work is required
to fully restore a clean build.

=head1 Internal Changes

=over

=item *

Synonyms for the misleadingly named C<av_len()> have been created:
C<av_top_index()> and C<av_tindex()>.  All three of these return the
number of the highest index in the array, not the number of elements it
contains.
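
The same distinction exists at the Perl level, which may help when reading
XS code: C<$#array> is the highest index (the value these functions report),
while C<scalar(@array)> is the element count.  A minimal Perl-level sketch,
where C<@items> is a purely illustrative name:

```perl
use strict;
use warnings;

my @items = qw(a b c);

my $top_index = $#items;         # highest index, analogous to av_top_index()
my $count     = scalar @items;   # number of elements, always $top_index + 1

print "top index: $top_index, count: $count\n";
```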

=item *

SvUPGRADE() is no longer an expression. Originally this macro (and its
underlying function, sv_upgrade()) were documented as boolean, although
in reality they always croaked on error and never returned false. In 2005
the documentation was updated to specify a void return value, but
SvUPGRADE() was left always returning 1 for backwards compatibility. This
has now been removed, and SvUPGRADE() is now a statement with no return
value.

So this is now a syntax error:

    if (!SvUPGRADE(sv)) { croak(...); }

If you have code like that, simply replace it with

    SvUPGRADE(sv);

or to avoid compiler warnings with older perls, possibly

    (void)SvUPGRADE(sv);

=item *

Perl has a new copy-on-write mechanism that allows any SvPOK scalar to be
upgraded to a copy-on-write scalar.  A reference count on the string buffer
is stored in the string buffer itself.  This feature is B<not enabled by
default>.

It can be enabled in a perl build by running F<Configure> with
B<-Accflags=-DPERL_NEW_COPY_ON_WRITE>, and we would encourage XS authors
to try their code with such an enabled perl, and provide feedback.
Unfortunately, there is not yet a good guide to updating XS code to cope
with COW.  Until such a document is available, consult the perl5-porters
mailing list.

The new mechanism breaks a few XS modules by allowing copy-on-write scalars to go
through code paths that never encountered them before.

=item *

Copy-on-write no longer uses the SvFAKE and SvREADONLY flags.  Hence,
SvREADONLY indicates a true read-only SV.

Use the SvIsCOW macro (as before) to identify a copy-on-write scalar.

=item *

C<PL_glob_index> is gone.

=item *

The private Perl_croak_no_modify has had its context parameter removed.  It
now has a void prototype.  Users of the public API croak_no_modify remain
unaffected.

=item *

Copy-on-write (shared hash key) scalars are no longer marked read-only.
C<SvREADONLY> returns false on such an SV, but C<SvIsCOW> still returns
true.

=item *

A new op type, C<OP_PADRANGE>, has been introduced.  The perl peephole
optimiser will, where possible, substitute a single padrange op for a
pushmark followed by one or more pad ops, and possibly also skipping list
and nextstate ops.  In addition, the op can carry out the tasks associated
with the RHS of a C<< my(...) = @_ >> assignment, so those ops may be optimised
away too.

=item *

Case-insensitive matching inside a [bracketed] character class with a
multi-character fold no longer excludes one of the possibilities in the
circumstances that it used to [perl #89774].

=item *

C<PL_formfeed> has been removed.

=item *

The regular expression engine no longer reads one byte past the end of the
target string.  While for all internally well-formed scalars this should
never have been a problem, this change facilitates clever tricks with
string buffers in CPAN modules.  [perl #73542]

=item *

Inside a BEGIN block, C<PL_compcv> now points to the currently-compiling
subroutine, rather than the BEGIN block itself.

=item *

C<mg_length> has been deprecated.

=item *

C<sv_len> now always returns a byte count and C<sv_len_utf8> a character
count.  Previously, C<sv_len> and C<sv_len_utf8> were both buggy and would
sometimes return bytes and sometimes characters.  C<sv_len_utf8> no longer
assumes that its argument is in UTF-8.  Neither of these creates UTF-8 caches
for tied or overloaded values or for non-PVs any more.

=item *

C<sv_mortalcopy> now copies string buffers of shared hash key scalars when
called from XS modules [perl #79824].

=item *

The new C<RXf_MODIFIES_VARS> flag can be set by custom regular expression
engines to indicate that the execution of the regular expression may cause
variables to be modified.  This lets C<s///> know to skip certain
optimisations.  Perl's own regular expression engine sets this flag for the
special backtracking verbs that set $REGMARK and $REGERROR.

=item *

The APIs for accessing lexical pads have changed considerably.

C<PADLIST>s are no longer C<AV>s, but their own type instead.
C<PADLIST>s now contain a C<PAD> and a C<PADNAMELIST> of C<PADNAME>s,
rather than C<AV>s for the pad and the list of pad names.  C<PAD>s,
C<PADNAMELIST>s, and C<PADNAME>s are to be accessed as such through the
newly added pad API instead of the plain C<AV> and C<SV> APIs.  See
L<perlapi> for details.

=item *

In the regex API, the numbered capture callbacks are passed an index
indicating what match variable is being accessed. There are special
index values for the C<$`, $&, $'> variables. Previously the same three
values were used to retrieve C<${^PREMATCH}, ${^MATCH}, ${^POSTMATCH}>
too, but these have now been assigned three separate values. See
L<perlreapi/Numbered capture callbacks>.

=item *

C<PL_sawampersand> was previously a boolean indicating that any of
C<$`, $&, $'> had been seen; it now contains three one-bit flags
indicating the presence of each of the variables individually.

=item *

The C<CV *> typemap entry now supports C<&{}> overloading and typeglobs,
just like C<&{...}> [perl #96872].

=item *

The C<SVf_AMAGIC> flag to indicate overloading is now on the stash, not the
object.  It is now set automatically whenever a method or @ISA changes, so
its meaning has changed, too.  It now means "potentially overloaded".  When
the overload table is calculated, the flag is automatically turned off if
there is no overloading, so there should be no noticeable slowdown.

The staleness of the overload tables is now checked when overload methods
are invoked, rather than during C<bless>.

"A" magic is gone.  The changes to the handling of the C<SVf_AMAGIC> flag
eliminate the need for it.

C<PL_amagic_generation> has been removed as no longer necessary.  For XS
modules, it is now a macro alias to C<PL_na>.

The fallback overload setting is now stored in a stash entry separate from
overloadedness itself.

=item *

The character-processing code has been cleaned up in places.  The changes
should be operationally invisible.

=item *

The C<study> function was made a no-op in v5.16.  It was simply disabled via
a C<return> statement; the code was left in place.  Now the code supporting
what C<study> used to do has been removed.

=item *

Under threaded perls, there is no longer a separate PV allocated for every
COP to store its package name (C<< cop->stashpv >>).  Instead, there is an
offset (C<< cop->stashoff >>) into the new C<PL_stashpad> array, which
holds stash pointers.

=item *

In the pluggable regex API, the C<regexp_engine> struct has acquired a new
field C<op_comp>, which is currently just for perl's internal use, and
should be initialized to NULL by other regex plugin modules.

=item *

A new function C<alloccopstash> has been added to the API, but is considered
experimental.  See L<perlapi>.

=item *

Perl used to implement get magic in a way that would sometimes hide bugs in
code that could call mg_get() too many times on magical values.  This hiding of
errors no longer occurs, so long-standing bugs may become visible now.  If
you see magic-related errors in XS code, check to make sure it, together
with the Perl API functions it uses, calls mg_get() only once on SvGMAGICAL()
values.

=item *

OP allocation for CVs now uses a slab allocator.  This simplifies
memory management for OPs allocated to a CV, so cleaning up after a
compilation error is simpler and safer [perl #111462][perl #112312].

=item *

C<PERL_DEBUG_READONLY_OPS> has been rewritten to work with the new slab
allocator, allowing it to catch more violations than before.

=item *

The old slab allocator for ops, which was only enabled for C<PERL_IMPLICIT_SYS>
and C<PERL_DEBUG_READONLY_OPS>, has been retired.

=back

=head1 Selected Bug Fixes

=over 4

=item *

Here document terminators no longer require a terminating newline character when
they occur at the end of a file.  This was already the case at the end of a
string eval [perl #65838].

=item *

C<-DPERL_GLOBAL_STRUCT> builds now free the global struct B<after>
they've finished using it.

=item *

A trailing '/' on a path in @INC will no longer have an additional '/'
appended.

=item *

The C<:crlf> layer now works when unread data doesn't fit into its own
buffer [perl #112244].

=item *

C<ungetc()> now handles UTF-8 encoded data [perl #116322].

=item *

A bug in the core typemap meant that a variable of any C type mapped to the
T_BOOL typemap entry was never set, updated, or modified when it was listed
in an OUTPUT: section; the only exception was RETVAL.  T_BOOL in an INPUT:
section was not affected, nor was a T_BOOL return type for an XSUB (RETVAL).
A side effect of the fix is that if a T_BOOL variable is specified in the
OUTPUT: section (which previously did nothing to the SV) and a read-only SV
(such as a literal) is passed to the XSUB, the call now croaks with
"Modification of a read-only value attempted". [perl #115796]

=item *

On many platforms, providing a directory name as the script name caused perl
to do nothing and report success.  It should now universally report an error
and exit nonzero. [perl #61362]

=item *

C<sort {undef} ...> under fatal warnings no longer crashes.  It had
begun crashing in Perl v5.16.

=item *

Stashes blessed into each other
(C<bless \%Foo::, 'Bar'; bless \%Bar::, 'Foo'>) no longer result in double
frees.  This bug started happening in Perl v5.16.

=item *

Numerous memory leaks have been fixed, mostly involving fatal warnings and
syntax errors.

=item *

Some failed regular expression matches such as C<'f' =~ /../g> were not
resetting C<pos>.  Also, "match-once" patterns (C<m?...?g>) failed to reset
it, too, when invoked a second time [perl #23180].
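
A minimal sketch of the first case, assuming a perl with this fix.  The
variable names are illustrative only, and C<pos> is set by hand to stand in
for an earlier successful C<//g> match:

```perl
use strict;
use warnings;

my $s = 'f';
pos($s) = 1;                           # pretend an earlier //g match stopped here

my $matched = ($s =~ /../g) ? 1 : 0;   # fails: the string is too short
my $pos     = pos($s);                 # the failed match resets pos to undef

print "matched: $matched, pos defined: ", (defined $pos ? "yes" : "no"), "\n";
```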

=item *

Several bugs involving C<local *ISA> and C<local *Foo::> causing stale
MRO caches have been fixed.

=item *

Defining a subroutine when its typeglob has been aliased no longer results
in stale method caches.  This bug was introduced in Perl v5.10.

=item *

Localising a typeglob containing a subroutine when the typeglob's package
has been deleted from its parent stash no longer produces an error.  This
bug was introduced in Perl v5.14.

=item *

Under some circumstances, C<local *method=...> would fail to reset method
caches upon scope exit.

=item *

C</[.foo.]/> is no longer an error, but produces a warning (as before) and
is treated as C</[.fo]/> [perl #115818].

=item *

C<goto $tied_var> now calls FETCH before deciding what type of goto
(subroutine or label) this is.

=item *

Renaming packages through glob assignment
(C<*Foo:: = *Bar::; *Bar:: = *Baz::>) in combination with C<m?...?> and
C<reset> no longer makes threaded builds crash.

=item *

A number of bugs related to assigning a list to a hash have been fixed. Many of
these involve lists with repeated keys like C<(1, 1, 1, 1)>.

=over 4

=item *

The expression C<scalar(%h = (1, 1, 1, 1))> now returns C<4>, not C<2>.

=item *

The return value of C<%h = (1, 1, 1)> in list context was wrong. Previously
this would return C<(1, undef, 1)>; now it returns C<(1, undef)>.

=item *

Perl now issues the same warning on C<($s, %h) = (1, {})> as it does for
C<(%h) = ({})>, "Reference found where even-sized list expected".

=item *

A number of additional edge cases in list assignment to hashes were
corrected. For more details see commit 23b7025ebc.

=back
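
The scalar-context case from the list above can be checked with a short
sketch.  It assumes a perl with these fixes; C<$rhs_count> and C<$key_count>
are illustrative names:

```perl
use strict;
use warnings;

my %h;

# List assignment in scalar context yields the number of right-hand elements.
my $rhs_count = scalar(%h = (1, 1, 1, 1));

# The repeated key collapses to a single hash entry.
my $key_count = keys %h;

print "rhs elements: $rhs_count, keys stored: $key_count\n";
```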

=item *

Attributes applied to lexical variables no longer leak memory.
[perl #114764]

=item *

C<dump>, C<goto>, C<last>, C<next>, C<redo> or C<require> followed by a
bareword (or version) and then an infix operator is no longer a syntax
error.  It used to be for those infix operators (like C<+>) that have a
different meaning where a term is expected.  [perl #105924]

=item *

C<require a::b . 1> and C<require a::b + 1> no longer produce erroneous
ambiguity warnings.  [perl #107002]

=item *

Class method calls are now allowed on any string, and not just strings
beginning with an alphanumeric character.  [perl #105922]

=item *

An empty pattern created with C<qr//> used in C<m///> no longer triggers
the "empty pattern reuses last pattern" behaviour.  [perl #96230]
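
As a rough illustration, assuming a perl with this fix: after a successful
match against C</b/>, an interpolated C<qr//> should behave as a genuine
empty pattern rather than as a repeat of C</b/>.  The names C<$empty> and
C<$matched> are illustrative.

```perl
use strict;
use warnings;

# Establish a "last successful pattern".
"abc" =~ /b/ or die "setup match failed";

my $empty = qr//;

# A real empty pattern matches anywhere, even in a string with no "b".
# Reusing /b/ here would make the match fail instead.
my $matched = ("xyz" =~ /$empty/) ? 1 : 0;

print "empty qr// matched: $matched\n";
```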

=item *

Tying a hash during iteration no longer results in a memory leak.

=item *

Freeing a tied hash during iteration no longer results in a memory leak.

=item *

List assignment to a tied array or hash that dies on STORE no longer
results in a memory leak.

=item *

If the hint hash (C<%^H>) is tied, compile-time scope entry (which copies
the hint hash) no longer leaks memory if FETCH dies.  [perl #107000]

=item *

Constant folding no longer inappropriately triggers the special
C<split " "> behaviour.  [perl #94490]

=item *

C<defined scalar(@array)>, C<defined do { &foo }>, and similar constructs
now treat the argument to C<defined> as a simple scalar.  [perl #97466]

=item *

Running a custom debugger that defines no C<*DB::DB> glob or provides a
subroutine stub for C<&DB::DB> no longer results in a crash, but an error
instead.  [perl #114990]

=item *

C<reset ""> now matches its documentation.  C<reset> only resets C<m?...?>
patterns when called with no argument.  An empty string for an argument now
does nothing.  (It used to be treated as no argument.)  [perl #97958]
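
A small sketch of the difference, assuming a perl with this fix.  The
counter C<$hits> is an illustrative name; the same C<m?a?> op is evaluated
twice, so it can match a second time only if it has been reset in between:

```perl
use strict;
use warnings;

my $hits = 0;
for my $round (1 .. 2) {
    $hits++ if "a" =~ m?a?;   # a match-once pattern
    reset "";                 # empty-string argument: does nothing
}

print "hits: $hits\n";
```

Replacing C<reset ""> with a bare C<reset> would allow the pattern to match
on the second round as well.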

=item *

C<printf> with an argument returning an empty list no longer reads past the
end of the stack, resulting in erratic behaviour.  [perl #77094]

=item *

C<--subname> no longer produces erroneous ambiguity warnings.
[perl #77240]

=item *

C<v10> is now allowed as a label or package name.  This was inadvertently
broken when v-strings were added in Perl v5.6.  [perl #56880]

=item *

C<length>, C<pos>, C<substr> and C<sprintf> could be confused by ties,
overloading, references and typeglobs if the stringification of such
changed the internal representation to or from UTF-8.  [perl #114410]

=item *

utf8::encode now calls FETCH and STORE on tied variables.  utf8::decode now
calls STORE (it was already calling FETCH).

=item *

C<$tied =~ s/$non_utf8/$utf8/> no longer loops infinitely if the tied
variable returns a Latin-1 string, shared hash key scalar, or reference or
typeglob that stringifies as ASCII or Latin-1.  This was a regression from
v5.12.

=item *

C<s///> without /e is now better at detecting when it needs to forgo
certain optimisations, fixing some buggy cases:

=over

=item *

Match variables in certain constructs (C<&&>, C<||>, C<..> and others) in
the replacement part; e.g., C<s/(.)/$l{$a||$1}/g>.  [perl #26986]

=item *

Aliases to match variables in the replacement.

=item *

C<$REGERROR> or C<$REGMARK> in the replacement.  [perl #49190]

=item *

An empty pattern (C<s//$foo/>) that causes the last-successful pattern to
be used, when that pattern contains code blocks that modify the variables
in the replacement.

=back

=item *

The taintedness of the replacement string no longer affects the taintedness
of the return value of C<s///e>.

=item *

The C<$|> autoflush variable is created on-the-fly when needed.  If this
happened (e.g., if it was mentioned in a module or eval) when the
currently-selected filehandle was a typeglob with an empty IO slot, it used
to crash.  [perl #115206]

=item *

Line numbers at the end of a string eval are no longer off by one.
[perl #114658]

=item *

@INC filters (subroutines returned by subroutines in @INC) that set $_ to a
copy-on-write scalar no longer cause the parser to modify that string
buffer in place.

=item *

C<length($object)> no longer returns the undefined value if the object has
string overloading that returns undef.  [perl #115260]

=item *

The use of C<PL_stashcache>, the stash name lookup cache for method calls, has
been restored.

Commit da6b625f78f5f133 in August 2011 inadvertently broke the code that looks
up values in C<PL_stashcache>.  As it's only a cache, everything quite correctly
carried on working without it.

=item *

The error "Can't localize through a reference" had disappeared in v5.16.0
when C<local %$ref> appeared on the last line of an lvalue subroutine.
This error disappeared for C<\local %$ref> in perl v5.8.1.  It has now
been restored.

=item *

The parsing of here-docs has been improved significantly, fixing several
parsing bugs and crashes and one memory leak, and correcting wrong
subsequent line numbers under certain conditions.

=item *

Inside an eval, the error message for an unterminated here-doc no longer
has a newline in the middle of it [perl #70836].

=item *

A substitution inside a substitution pattern (C<s/${s|||}//>) no longer
confuses the parser.

=item *

It may be an odd place to allow comments, but C<s//"" # hello/e> has
always worked, I<unless> there happens to be a null character before the
first #.  Now it works even in the presence of nulls.

=item *

An invalid range in C<tr///> or C<y///> no longer results in a memory leak.

=item *

String eval no longer treats a semicolon-delimited quote-like operator at
the very end (C<eval 'q;;'>) as a syntax error.

=item *

C<< warn {$_ => 1} + 1 >> is no longer a syntax error.  The parser used to
get confused with certain list operators followed by an anonymous hash and
then an infix operator that shares its form with a unary operator.

=item *

C<(caller $n)[6]> (which gives the text of the eval) used to return the
actual parser buffer.  Modifying it could result in crashes.  Now it always
returns a copy.  The string returned no longer has "\n;" tacked on to the
end.  The returned text also includes here-doc bodies, which used to be
omitted.

=item *

The UTF-8 position cache is now reset when accessing magical variables, to
avoid the string buffer and the UTF-8 position cache getting out of sync
[perl #114410].

=item *

Various cases of get magic being called twice for magical UTF-8
strings have been fixed.

=item *

This code (when not in the presence of C<$&> etc)

    $_ = 'x' x 1_000_000;
    1 while /(.)/;

used to skip the buffer copy for performance reasons, but suffered from C<$1>
etc changing if the original string changed.  That's now been fixed.

=item *

Perl doesn't use PerlIO anymore to report out of memory messages, as PerlIO
might attempt to allocate more memory.

=item *

In a regular expression, if something is quantified with C<{n,m}> where
C<S<n E<gt> m>>, it can't possibly match.  Previously this was a fatal
error, but now is merely a warning (and that something won't match).
[perl #82954].
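
A hedged sketch of the new behaviour, assuming a perl with this change.  The
pattern is built at run time so that it is compiled after the warning handler
is installed; C<@warnings>, C<$pattern> and C<$matched> are illustrative names:

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $pattern = 'a{3,1}';                       # minimum exceeds maximum
my $matched = ("aaa" =~ /$pattern/) ? 1 : 0;  # compiles, but cannot match

print "matched: $matched\n";
print "warning: $_" for @warnings;
```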

=item *

It used to be possible for formats defined in subroutines that have
subsequently been undefined and redefined to close over variables in the
wrong pad (the newly-defined enclosing sub), resulting in crashes or
"Bizarre copy" errors.

=item *

Redefinition of XSUBs at run time could produce warnings with the wrong
line number.

=item *

The %vd sprintf format does not support version objects for alpha versions.
It used to output the format itself (%vd) when passed an alpha version, and
also emit an "Invalid conversion in printf" warning.  It no longer does,
but produces the empty string in the output.  It also no longer leaks
memory in this case.

=item *

C<< $obj->SUPER::method >> calls in the main package could fail if the
SUPER package had already been accessed by other means.

=item *

Stash aliasing (C<< *foo:: = *bar:: >>) no longer causes SUPER calls to ignore
changes to methods or @ISA or use the wrong package.

=item *

Method calls on packages whose names end in ::SUPER are no longer treated
as SUPER method calls, resulting in failure to find the method.
Furthermore, defining subroutines in such packages no longer causes them to
be found by SUPER method calls on the containing package [perl #114924].

=item *

C<\w> now matches the code points U+200C (ZERO WIDTH NON-JOINER) and U+200D
(ZERO WIDTH JOINER).  C<\W> no longer matches these.  This change is because
Unicode corrected their definition of what C<\w> should match.
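
This can be checked directly.  The sketch below assumes a perl with the
updated definition; the variable names are illustrative:

```perl
use strict;
use warnings;

my $zwnj = "\x{200C}";   # ZERO WIDTH NON-JOINER
my $zwj  = "\x{200D}";   # ZERO WIDTH JOINER

my $zwnj_is_word    = ($zwnj =~ /\w/) ? 1 : 0;
my $zwj_is_word     = ($zwj  =~ /\w/) ? 1 : 0;
my $zwnj_is_nonword = ($zwnj =~ /\W/) ? 1 : 0;

print "ZWNJ \\w: $zwnj_is_word, ZWJ \\w: $zwj_is_word, ",
      "ZWNJ \\W: $zwnj_is_nonword\n";
```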

=item *

C<dump LABEL> no longer leaks its label.

=item *

Constant folding no longer changes the behaviour of functions like C<stat()>
and C<truncate()> that can take either filenames or handles.
C<stat 1 ? foo : bar> now treats its argument as a file name (since it is an
arbitrary expression), rather than the handle "foo".

=item *

C<truncate FOO, $len> no longer falls back to treating "FOO" as a file name if
the filehandle has been deleted.  This was broken in Perl v5.16.0.

=item *

Subroutine redefinitions after sub-to-glob and glob-to-glob assignments no
longer cause double frees or panic messages.

=item *

C<s///> now turns vstrings into plain strings when performing a substitution,
even if the resulting string is the same (C<s/a/a/>).

=item *

Prototype mismatch warnings no longer erroneously treat constant subs as having
no prototype when they actually have "".

=item *

Constant subroutines and forward declarations no longer prevent prototype
mismatch warnings from omitting the sub name.

=item *

C<undef> on a subroutine now clears call checkers.

=item *

The C<ref> operator started leaking memory on blessed objects in Perl v5.16.0.
This has been fixed [perl #114340].

=item *

C<use> no longer tries to parse its arguments as a statement, which used to
make C<use constant { () };> a syntax error [perl #114222].

=item *

On debugging builds, "uninitialized" warnings inside formats no longer cause
assertion failures.

=item *

On debugging builds, subroutines nested inside formats no longer cause
assertion failures [perl #78550].

=item *

Formats and C<use> statements are now permitted inside formats.

=item *

C<print $x> and C<sub { print $x }-E<gt>()> now always produce the same output.
It was possible for the latter to refuse to close over $x if the variable was
not active; e.g., if it was defined outside a currently-running named
subroutine.

=item *

Similarly, C<print $x> and C<print eval '$x'> now produce the same output.
This also allows "my $x if 0" variables to be seen in the debugger [perl
#114018].

=item *

Formats called recursively no longer stomp on their own lexical variables, but
each recursive call has its own set of lexicals.

=item *

Attempting to free an active format or the handle associated with it no longer
results in a crash.

=item *

Format parsing no longer gets confused by braces, semicolons and low-precedence
operators.  It used to be possible to use braces as format delimiters (instead
of C<=> and C<.>), but only sometimes.  Semicolons and low-precedence operators
in format argument lines no longer confuse the parser into ignoring the line's
return value.  In format argument lines, braces can now be used for anonymous
hashes, instead of being treated always as C<do> blocks.

=item *

Formats can now be nested inside code blocks in regular expressions and other
quoted constructs (C</(?{...})/> and C<qq/${...}/>) [perl #114040].

=item *

Formats are no longer created after compilation errors.

=item *

Under debugging builds, the B<-DA> command line option started crashing in Perl
v5.16.0.  It has been fixed [perl #114368].

=item *

A potential deadlock scenario involving the premature termination of a
pseudo-forked child in a Windows build with ithreads enabled has been fixed.
This resolves the common problem of the F<t/op/fork.t> test hanging on Windows
[perl #88840].

=item *

The code which generates errors from C<require()> could potentially read one or
two bytes before the start of the filename for filenames less than three bytes
long and ending C</\.p?\z/>.  This has now been fixed.  Note that it could
never have happened with module names given to C<use()> or C<require()> anyway.

=item *

The handling of pathnames of modules given to C<require()> has been made
thread-safe on VMS.

=item *

Non-blocking sockets have been fixed on VMS.

=item *

Pod can now be nested in code inside a quoted construct outside of a string
eval.  This used to work only within string evals [perl #114040].

=item *

C<goto ''> now looks for an empty label, producing the "goto must have
label" error message, instead of exiting the program [perl #111794].

=item *

C<goto "\0"> now dies with "Can't find label" instead of "goto must have
label".

=item *

The C function C<hv_store> used to result in crashes when used on C<%^H>
[perl #111000].

=item *

A call checker attached to a closure prototype via C<cv_set_call_checker>
is now copied to closures cloned from it.  So C<cv_set_call_checker> now
works inside an attribute handler for a closure.

=item *

Writing to C<$^N> used to have no effect.  Now it croaks with "Modification
of a read-only value" by default, but that can be overridden by a custom
regular expression engine, as with C<$1> [perl #112184].

=item *

C<undef> on a control character glob (C<undef *^H>) no longer emits an
erroneous warning about ambiguity [perl #112456].

=item *

For efficiency's sake, many operators and built-in functions return the
same scalar each time.  Lvalue subroutines and subroutines in the CORE::
namespace were allowing this implementation detail to leak through.
C<print &CORE::uc("a"), &CORE::uc("b")> used to print "BB".  The same thing
would happen with an lvalue subroutine returning the return value of C<uc>.
Now the value is copied in such cases.
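
The C<CORE::> case above, written as a runnable sketch.  It assumes a perl
with this fix, and C<$joined> is an illustrative name:

```perl
use strict;
use warnings;

# Each call now yields its own copy, so the first result is not
# overwritten by the second call.
my $joined = join '', &CORE::uc("a"), &CORE::uc("b");

print "$joined\n";
```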

=item *

C<method {}> syntax with an empty block or a block returning an empty list
used to crash or use some random value left on the stack as its invocant.
Now it produces an error.

=item *

C<vec> now works with extremely large offsets (E<gt>2 GB) [perl #111730].

=item *

Changes to overload settings now take effect immediately, as do changes to
inheritance that affect overloading.  They used to take effect only after
C<bless>.

Objects that were created before a class had any overloading used to remain
non-overloaded even if the class gained overloading through C<use overload>
or @ISA changes, and even after C<bless>.  This has been fixed
[perl #112708].
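
A minimal sketch of the second paragraph, assuming a perl with this fix.
The C<Temperature> class is hypothetical, and a string C<eval> is used only
so that C<use overload> runs after the object has been created:

```perl
use strict;
use warnings;

package Temperature;

sub new {
    my ($class, $degrees) = @_;
    return bless { degrees => $degrees }, $class;
}

package main;

# Created while the class has no overloading at all.
my $t = Temperature->new(21);

# The class gains string overloading afterwards.
eval q{
    package Temperature;
    use overload '""' => sub { $_[0]{degrees} . " C" };
    1;
} or die $@;

my $text = "$t";   # the existing object picks up the new overloading
print "$text\n";
```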

=item *

Classes with overloading can now inherit fallback values.

=item *

Overloading was not respecting a fallback value of 0 if there were
overloaded objects on both sides of an assignment operator like C<+=>
[perl #111856].

=item *

C<pos> now croaks with hash and array arguments, instead of producing
erroneous warnings.

=item *

C<while(each %h)> now implies C<while(defined($_ = each %h))>, like
C<readline> and C<readdir>.
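
A short sketch, assuming a perl with this change.  The hash deliberately
contains the false keys C<0> and the empty string, which do not end the loop
because the implied test is C<defined>; C<%h> and C<@seen> are illustrative:

```perl
use strict;
use warnings;

my %h = (0 => 'zero', '' => 'empty', a => 'letter');

my @seen;
while (each %h) {
    push @seen, $_;   # $_ holds the current key
}

print scalar(@seen), " keys visited\n";
```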

=item *

Subs in the CORE:: namespace no longer crash after C<undef *_> when called
with no argument list (C<&CORE::time> with no parentheses).

=item *

C<unpack> no longer produces the "'/' must follow a numeric type in unpack"
error when it is the data that are at fault [perl #60204].

=item *

C<join> and C<"@array"> now call FETCH only once on a tied C<$">
[perl #8931].

=item *

Some subroutine calls generated by compiling core ops affected by a
C<CORE::GLOBAL> override had op checking performed twice.  The checking
is always idempotent for pure Perl code, but the double checking can
matter when custom call checkers are involved.

=item *

A race condition used to exist around fork that could cause a signal sent to
the parent to be handled by both parent and child. Signals are now blocked
briefly around fork to prevent this from happening [perl #82580].

=item *

The implementation of code blocks in regular expressions, such as C<(?{})>
and C<(??{})>, has been heavily reworked to eliminate a whole slew of bugs.
The main user-visible changes are:

=over 4

=item *

Code blocks within patterns are now parsed in the same pass as the
surrounding code; in particular it is no longer necessary to have balanced
braces: this now works:

    /(?{  $x='{'  })/

This means that this error message is no longer generated:

    Sequence (?{...}) not terminated or not {}-balanced in regex

but a new error may be seen:

    Sequence (?{...}) not terminated with ')'

In addition, literal code blocks within run-time patterns are only
compiled once, at perl compile-time:

    for my $p (...) {
        # this 'FOO' block of code is compiled once,
        # at the same time as the surrounding 'for' loop
        /$p{(?{FOO;})/;
    }

=item *

Lexical variables are now sane as regards scope, recursion and closure
behavior. In particular, C</A(?{B})C/> behaves (from a closure viewpoint)
exactly like C</A/ && do { B } && /C/>, while  C<qr/A(?{B})C/> is like
C<sub {/A/ && do { B } && /C/}>. So this code now works how you might
expect, creating three regexes that match 0, 1, and 2:

    for my $i (0..2) {
        push @r, qr/^(??{$i})$/;
    }
    "1" =~ $r[1]; # matches

=item *

The C<use re 'eval'> pragma is now only required for code blocks defined
at runtime; in particular in the following, the text of the C<$r> pattern is
still interpolated into the new pattern and recompiled, but the individual
compiled code-blocks within C<$r> are reused rather than being recompiled,
and C<use re 'eval'> isn't needed any more:

    my $r = qr/abc(?{....})def/;
    /xyz$r/;

=item *

Flow control operators no longer crash. Each code block runs in a new
dynamic scope, so C<next> etc. will not see any enclosing loops. C<return>
returns a value from the code block, not from any enclosing subroutine.

=item *

Perl normally caches the compilation of run-time patterns, and doesn't
recompile if the pattern hasn't changed, but this is now disabled if
required for the correct behavior of closures. For example:

    my $code = '(??{$x})';
    for my $x (1..3) {
        # recompile to see fresh value of $x each time
        $x =~ /$code/;
    }

=item *

The C</msix> and C<(?msix)> etc. flags are now propagated into the return
value from C<(??{})>; this now works:

    "AB" =~ /a(??{'b'})/i;

=item *

Warnings and errors will appear to come from the surrounding code (or for
run-time code blocks, from an eval) rather than from an C<re_eval>:

    use re 'eval'; $c = '(?{ warn "foo" })'; /$c/;
    /(?{ warn "foo" })/;

formerly gave:

    foo at (re_eval 1) line 1.
    foo at (re_eval 2) line 1.

and now gives:

    foo at (eval 1) line 1.
    foo at /some/prog line 2.

=back

=item *

Perl now can be recompiled to use any Unicode version.  In v5.16, it
worked on Unicodes 6.0 and 6.1, but there were various bugs if earlier
releases were used; the older the release the more problems.

=item *

C<vec> no longer produces "uninitialized" warnings in lvalue context
[perl #9423].

=item *

An optimization involving fixed strings in regular expressions could cause
a severe performance penalty in edge cases.  This has been fixed
[perl #76546].

=item *

In certain cases, including empty subpatterns within a regular expression (such
as C<(?:)> or C<(?:|)>) could disable some optimizations. This has been fixed.

=item *

The "Can't find an opnumber" message that C<prototype> produces when passed
a string like "CORE::nonexistent_keyword" now passes UTF-8 and embedded
NULs through unchanged [perl #97478].

=item *

C<prototype> now treats magical variables like C<$1> the same way as
non-magical variables when checking for the CORE:: prefix, instead of
treating them as subroutine names.

=item *

Under threaded perls, a runtime code block in a regular expression could
corrupt the package name stored in the op tree, resulting in bad reads
in C<caller>, and possibly crashes [perl #113060].

=item *

Referencing a closure prototype (C<\&{$_[1]}> in an attribute handler for a
closure) no longer results in a copy of the subroutine (or assertion
failures on debugging builds).

=item *

C<eval '__PACKAGE__'> now returns the right answer on threaded builds if
the current package has been assigned over (as in
C<*ThisPackage:: = *ThatPackage::>) [perl #78742].

=item *

If a package is deleted by code that it calls, it is possible for C<caller>
to see a stack frame belonging to that deleted package.  C<caller> could
crash if the stash's memory address was reused for a scalar and a
substitution was performed on the same scalar [perl #113486].

=item *

C<UNIVERSAL::can> no longer treats its first argument differently
depending on whether it is a string or number internally.

=item *

C<open> with C<< <& >> for the mode checks to see whether the third argument is
a number, in determining whether to treat it as a file descriptor or a handle
name.  Magical variables like C<$1> were always failing the numeric check and
being treated as handle names.

=item *

C<warn>'s handling of magical variables (C<$1>, ties) has undergone several
fixes.  C<FETCH> is only called once now on a tied argument or a tied C<$@>
[perl #97480].  Tied variables returning objects that stringify as "" are
no longer ignored.  A tied C<$@> that happened to return a reference the
I<previous> time it was used is no longer ignored.

=item *

C<warn ""> now treats C<$@> with a number in it the same way, regardless of
whether it happened via C<$@=3> or C<$@="3">.  It used to ignore the
former.  Now it appends "\t...caught", as it has always done with
C<$@="3">.

=item *

Numeric operators on magical variables (e.g., S<C<$1 + 1>>) used to use
floating point operations even where integer operations were more appropriate,
resulting in loss of accuracy on 64-bit platforms [perl #109542].

=item *

Unary negation no longer treats a string as a number if the string happened
to be used as a number at some point.  So, if C<$x> contains the string "dogs",
C<-$x> returns "-dogs" even if C<$y=0+$x> has happened at some point.

=item *

In Perl v5.14, C<-'-10'> was fixed to return "10", not "+10".  But magical
variables (C<$1>, ties) were not fixed till now [perl #57706].

=item *

Unary negation now treats strings consistently, regardless of the internal
C<UTF8> flag.

=item *

A regression introduced in Perl v5.16.0 involving
C<tr/I<SEARCHLIST>/I<REPLACEMENTLIST>/> has been fixed.  Only the first
instance is supposed to be meaningful if a character appears more than
once in C<I<SEARCHLIST>>.  Under some circumstances, the final instance
was overriding all earlier ones.  [perl #113584]

=item *

Regular expressions like C<qr/\87/> previously silently inserted a NUL
character, thus matching as if it had been written C<qr/\00087/>.  Now it
matches as if it had been written as C<qr/87/>, with a message that the
sequence C<"\8"> is unrecognized.

=item *

C<__SUB__> now works in special blocks (C<BEGIN>, C<END>, etc.).

=item *

Thread creation on Windows could theoretically result in a crash if done
inside a C<BEGIN> block.  It still does not work properly, but it no longer
crashes [perl #111610].

=item *

C<\&{''}> (with the empty string) now autovivifies a stub like any other
sub name, and no longer produces the "Unable to create sub" error
[perl #94476].

=item *

A regression introduced in v5.14.0 has been fixed, in which some calls
to the C<re> module would clobber C<$_> [perl #113750].

=item *

C<do FILE> now always either sets or clears C<$@>, even when the file can't be
read. This ensures that testing C<$@> first (as recommended by the
documentation) always returns the correct result.

=item *

The array iterator used for the C<each @array> construct is now correctly
reset when C<@array> is cleared [perl #75596]. This happens, for example, when
the array is globally assigned to, as in C<@array = (...)>, but not when its
B<values> are assigned to. In terms of the XS API, it means that C<av_clear()>
will now reset the iterator.

This mirrors the behaviour of the hash iterator when the hash is cleared.

=item *

C<< $class->can >>, C<< $class->isa >>, and C<< $class->DOES >> now return
correct results, regardless of whether that package referred to by C<$class>
exists [perl #47113].

=item *

Arriving signals no longer clear C<$@> [perl #45173].

=item *

Allow C<my ()> declarations with an empty variable list [perl #113554].

=item *

During parsing, subs declared after errors no longer leave stubs
[perl #113712].

=item *

Closures containing no string evals no longer hang on to their containing
subroutines, allowing variables closed over by outer subroutines to be
freed when the outer sub is freed, even if the inner sub still exists
[perl #89544].

=item *

Duplication of in-memory filehandles by opening with a "<&=" or ">&=" mode
stopped working properly in v5.16.0.  It was causing the new handle to
reference a different scalar variable.  This has been fixed [perl #113764].

=item *

C<qr//> expressions no longer crash with custom regular expression engines
that do not set C<offs> at regular expression compilation time
[perl #112962].

=item *

C<delete local> no longer crashes with certain magical arrays and hashes
[perl #112966].

=item *

C<local> on elements of certain magical arrays and hashes used not to
arrange to have the element deleted on scope exit, even if the element did
not exist before C<local>.

=item *

C<scalar(write)> no longer returns multiple items [perl #73690].

=item *

String to floating point conversions no longer misparse certain strings under
C<use locale> [perl #109318].

=item *

C<@INC> filters that die no longer leak memory [perl #92252].

=item *

The implementations of overloaded operations are now called in the correct
context. This allows, among other things, being able to properly override
C<< <> >> [perl #47119].

=item *

Specifying only the C<fallback> key when calling C<use overload> now behaves
properly [perl #113010].

=item *

C<< sub foo { my $a = 0; while ($a) { ... } } >> and
C<< sub foo { while (0) { ... } } >> now return the same thing [perl #73618].

=item *

String negation now behaves the same under C<use integer;> as it does
without [perl #113012].

=item *

C<chr> now returns the Unicode replacement character (U+FFFD) for -1,
regardless of the internal representation.  -1 used to wrap if the argument
was tied or a string internally.

=item *

Using a C<format> after its enclosing sub was freed could crash as of
perl v5.12.0, if the format referenced lexical variables from the outer sub.

=item *

Using a C<format> after its enclosing sub was undefined could crash as of
perl v5.10.0, if the format referenced lexical variables from the outer sub.

=item *

Using a C<format> defined inside a closure, which format references
lexical variables from outside, never really worked unless the C<write>
call was directly inside the closure.  In v5.10.0 it even started crashing.
Now the copy of that closure nearest the top of the call stack is used to
find those variables.

=item *

Formats that close over variables in special blocks no longer crash if a
stub exists with the same name as the special block before the special
block is compiled.

=item *

The parser no longer gets confused, treating C<eval foo ()> as a syntax
error if preceded by C<print;> [perl #16249].

=item *

The return value of C<syscall> is no longer truncated on 64-bit platforms
[perl #113980].

=item *

Constant folding no longer causes C<print 1 ? FOO : BAR> to print to the
FOO handle [perl #78064].

=item *

C<do subname> now calls the named subroutine and uses the file name it
returns, instead of opening a file named "subname".

=item *

Subroutines looked up by rv2cv check hooks (registered by XS modules) are
now taken into consideration when determining whether C<foo bar> should be
the sub call C<foo(bar)> or the method call C<< "bar"->foo >>.

=item *

C<CORE::foo::bar> is no longer treated specially, allowing global overrides
to be called directly via C<CORE::GLOBAL::uc(...)> [perl #113016].

=item *

Calling an undefined sub whose typeglob has been undefined now produces the
customary "Undefined subroutine called" error, instead of "Not a CODE
reference".

=item *

Two bugs involving @ISA have been fixed.  C<*ISA = *glob_without_array> and
C<undef *ISA; @{*ISA}> would prevent future modifications to @ISA from
updating the internal caches used to look up methods.  The
*glob_without_array case was a regression from Perl v5.12.

=item *

Regular expression optimisations sometimes caused C<$> with C</m> to
produce failed or incorrect matches [perl #114068].

=item *

C<__SUB__> now works in a C<sort> block when the enclosing subroutine is
predeclared with C<sub foo;> syntax [perl #113710].

=item *

Unicode properties only apply to Unicode code points, which leads to
some subtleties when regular expressions are matched against
above-Unicode code points.  There is a warning generated to draw your
attention to this.  However, this warning was being generated
inappropriately in some cases, such as when a program was being parsed.
Non-Unicode matches such as C<\w> and C<[:word:]> should not generate the
warning, as their definitions don't limit them to apply to only Unicode
code points.  Now the message is only generated when matching against
C<\p{}> and C<\P{}>.  There remains a bug, [perl #114148], for the very
few properties in Unicode that match just a single code point.  The
warning is not generated if they are matched against an above-Unicode
code point.

=item *

Uninitialized warnings mentioning hash elements would only mention the
element name if it was not in the first bucket of the hash, due to an
off-by-one error.

=item *

A regular expression optimizer bug could cause multiline "^" to behave
incorrectly in the presence of line breaks, such that
C<"/\n\n" =~ m#\A(?:^/$)#im> would not match [perl #115242].

=item *

Failed C<fork> in list context no longer corrupts the stack.
C<@a = (1, 2, fork, 3)> used to gobble up the 2 and assign C<(1, undef, 3)>
if the C<fork> call failed.

=item *

Numerous memory leaks have been fixed, mostly involving tied variables that
die, regular expression character classes and code blocks, and syntax
errors.

=item *

Assigning a regular expression (C<${qr//}>) to a variable that happens to
hold a floating point number no longer causes assertion failures on
debugging builds.

=item *

Assigning a regular expression to a scalar containing a number no longer
causes subsequent numification to produce random numbers.

=item *

Assigning a regular expression to a magic variable no longer wipes away the
magic.  This was a regression from v5.10.

=item *

Assigning a regular expression to a blessed scalar no longer results in
crashes.  This was also a regression from v5.10.

=item *

Regular expressions can now be assigned to tied hash and array elements
without being flattened into strings.

=item *

Numifying a regular expression no longer results in an uninitialized
warning.

=item *

Negative array indices no longer cause EXISTS methods of tied variables to
be ignored.  This was a regression from v5.12.

=item *

Negative array indices no longer result in crashes on arrays tied to
non-objects.

=item *

C<$byte_overload .= $utf8> no longer results in doubly-encoded UTF-8 if the
left-hand scalar happened to have produced a UTF-8 string the last time
overloading was invoked.

=item *

C<goto &sub> now uses the current value of @_, instead of using the array
the subroutine was originally called with.  This means
C<local @_ = (...); goto &sub> now works [perl #43077].

=item *

If a debugger is invoked recursively, it no longer stomps on its own
lexical variables.  Formerly under recursion all calls would share the same
set of lexical variables [perl #115742].

=item *

C<*_{ARRAY}> returned from a subroutine no longer spontaneously
becomes empty.

=item *

When using C<say> to print to a tied filehandle, the value of C<$\> is
correctly localized, even if it was previously undef.  [perl #119927]

=back
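
Several of the fixes listed above are easy to confirm from a short script.
The following is a minimal sketch, assuming a perl of v5.18 or later; the
variable names and the missing file path are illustrative and do not come
from the list itself.

```perl
use strict;
use warnings;

# Unary negation of a non-numeric string stays a string operation,
# even after the string has been used in numeric context.
my $x = "dogs";
{ no warnings 'numeric'; my $y = 0 + $x; }
my $negated = -$x;

# tr///: when a character is repeated in SEARCHLIST, only its first
# instance counts, so every "A" becomes "X".
(my $translated = "AAA") =~ tr/AAA/XYZ/;

# do FILE clears a stale $@ even when the file cannot be read.
# The path below is assumed not to exist.
$@ = "stale error";
my $do_result      = do "/no-such-directory-for-this-example/missing.pl";
my $error_after_do = $@;

# List assignment clears the array, which resets the "each" iterator.
my @array = qw(a b c);
my ($first_index) = each @array;    # advances the iterator past index 0
@array = qw(x y z);
my ($index_after_reset, $value_after_reset) = each @array;

# chr(-1) yields the Unicode replacement character (U+FFFD).
my $replacement = do { no warnings; chr(-1) };

printf "%s %s %s %s\n",
    $negated, $translated, $index_after_reset, $value_after_reset;
```

On an older perl some of these values may differ, which is the point of the
corresponding entries.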

=head1 Known Problems

=over 4

=item *

UTF8-flagged strings in C<%ENV> on HP-UX 11.00 are buggy

The interaction of UTF8-flagged strings and C<%ENV> on HP-UX 11.00 is
currently dodgy in some not-yet-fully-diagnosed way.  Expect test
failures in F<t/op/magic.t>, followed by unknown behavior when storing
wide characters in the environment.

=back

=head1 Obituary

Hojung Yoon (AMORETTE), 24, of Seoul, South Korea, went to his long rest
on May 8, 2013 with llama figurine and autographed TIMTOADY card.  He
was a brilliant young Perl 5 & 6 hacker and a devoted member of
Seoul.pm.  He programmed Perl, talked Perl, ate Perl, and loved Perl.  We
believe that he is still programming in Perl with his broken IBM laptop
somewhere.  He will be missed.

=head1 Acknowledgements

Perl v5.18.0 represents approximately 12 months of development since
Perl v5.16.0 and contains approximately 400,000 lines of changes across
2,100 files from 113 authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers. The following people are known to
have contributed the improvements that became Perl v5.18.0:

Aaron Crane, Aaron Trevena, Abhijit Menon-Sen, Adrian M. Enache, Alan
Haggai Alavi, Alexandr Ciornii, Andrew Tam, Andy Dougherty, Anton Nikishaev,
Aristotle Pagaltzis, Augustina Blair, Bob Ernst, Brad Gilbert, Breno G. de
Oliveira, Brian Carlson, Brian Fraser, Charlie Gonzalez, Chip Salzenberg, Chris
'BinGOs' Williams, Christian Hansen, Colin Kuskie, Craig A. Berry, Dagfinn
Ilmari Mannsåker, Daniel Dragan, Daniel Perrett, Darin McBride, Dave Rolsky,
David Golden, David Leadbeater, David Mitchell, David Nicol, Dominic
Hargreaves, E. Choroba, Eric Brine, Evan Miller, Father Chrysostomos, Florian
Ragwitz, François Perrad, George Greer, Goro Fuji, H.Merijn Brand, Herbert
Breunung, Hugo van der Sanden, Igor Zaytsev, James E Keenan, Jan Dubois,
Jasmine Ahuja, Jerry D. Hedden, Jess Robinson, Jesse Luehrs, Joaquin Ferrero,
Joel Berger, John Goodyear, John Peacock, Karen Etheridge, Karl Williamson,
Karthik Rajagopalan, Kent Fredric, Leon Timmermans, Lucas Holt, Lukas Mai,
Marcus Holland-Moritz, Markus Jansen, Martin Hasch, Matthew Horsfall, Max
Maischein, Michael G Schwern, Michael Schroeder, Moritz Lenz, Nicholas Clark,
Niko Tyni, Oleg Nesterov, Patrik Hägglund, Paul Green, Paul Johnson, Paul
Marquess, Peter Martini, Rafael Garcia-Suarez, Reini Urban, Renee Baecker,
Rhesa Rozendaal, Ricardo Signes, Robin Barker, Ronald J. Kimball, Ruslan
Zakirov, Salvador Fandiño, Sawyer X, Scott Lanning, Sergey Alekseev, Shawn M
Moore, Shirakata Kentaro, Shlomi Fish, Sisyphus, Smylers, Steffen Müller,
Steve Hay, Steve Peters, Steven Schubiger, Sullivan Beck, Sven Strickroth,
Sébastien Aperghis-Tramoni, Thomas Sibley, Tobias Leich, Tom Wyant, Tony Cook,
Vadim Konovalov, Vincent Pit, Volker Schatz, Walt Mankowski, Yves Orton,
Zefram.

The list above is almost certainly incomplete as it is automatically generated
from version control history. In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core. We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
http://rt.perl.org/perlbug/ .  There may also be information at
http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
If you read this file _as_is_, just ignore the funny characters you
see. It is written in the POD format (see perlpod manpage) which is
specially designed to be readable as is.

=head1 NAME

perlos2 - Perl under OS/2, DOS, Win0.3*, Win0.95 and WinNT.

=head1 SYNOPSIS

One can read this document in the following formats:

	man perlos2
	view perl perlos2
	explorer perlos2.html
	info perlos2

to list some (not all may be available simultaneously), or it may
be read I<as is>: either as F<README.os2>, or F<pod/perlos2.pod>.

To read the F<.INF> version of the documentation (B<very> recommended)
outside of OS/2, one needs IBM's reader, which may be available on IBM
ftp sites (?)  (URL anyone?) and is shipped with PC DOS 7.0 and IBM's
Visual Age C++ 3.5.

A copy of a Win* viewer is contained in the "Just add OS/2 Warp" package

  ftp://ftp.software.ibm.com/ps/products/os2/tools/jaow/jaow.zip

in F<?:\JUST_ADD\view.exe>. This gives one access to EMX's
F<.INF> docs as well (the text form is available in F</emx/doc> in
EMX's distribution).  There is also a different viewer named xview.

Note that if you have F<lynx.exe> or F<netscape.exe> installed, you can follow WWW links
from this document in F<.INF> format. If you have the EMX docs installed
correctly, you can follow library links (you need to have C<view emxbook>
working by setting the C<EMXBOOK> environment variable as described
in the EMX docs).

=cut

Contents (This may be a little bit obsolete)
 
 perlos2 - Perl under OS/2, DOS, Win0.3*, Win0.95 and WinNT.

      NAME
      SYNOPSIS
      DESCRIPTION
	 -  Target
	 -  Other OSes
	 -  Prerequisites
	 -  Starting Perl programs under OS/2 (and DOS and...)
	 -  Starting OS/2 (and DOS) programs under Perl
      Frequently asked questions
	 -  "It does not work"
	 -  I cannot run external programs
	 -  I cannot embed perl into my program, or use perl.dll from my
	 -  `` and pipe-open do not work under DOS.
	 -  Cannot start find.exe "pattern" file
      INSTALLATION
	 -  Automatic binary installation
	 -  Manual binary installation
	 -  Warning
      Accessing documentation
	 -  OS/2 .INF file
	 -  Plain text
	 -  Manpages
	 -  HTML
	 -  GNU info files
	 -  PDF files
	 -  LaTeX docs
      BUILD
	 -  The short story
	 -  Prerequisites
	 -  Getting perl source
	 -  Application of the patches
	 -  Hand-editing
	 -  Making
	 -  Testing
	 -  Installing the built perl
	 -  a.out-style build
      Build FAQ
	 -  Some / became \ in pdksh.
	 -  'errno' - unresolved external
	 -  Problems with tr or sed
	 -  Some problem (forget which ;-)
	 -  Library ... not found
	 -  Segfault in make
	 -  op/sprintf test failure
      Specific (mis)features of OS/2 port
	 -  setpriority, getpriority
	 -  system()
	 -  extproc on the first line
	 -  Additional modules:
	 -  Prebuilt methods:
	 -  Prebuilt variables:
	 -  Misfeatures
	 -  Modifications
	 -  Identifying DLLs
	 -  Centralized management of resources
      Perl flavors
	 -  perl.exe
	 -  perl_.exe
	 -  perl__.exe
	 -  perl___.exe
	 -  Why strange names?
	 -  Why dynamic linking?
	 -  Why chimera build?
      ENVIRONMENT
	 -  PERLLIB_PREFIX
	 -  PERL_BADLANG
	 -  PERL_BADFREE
	 -  PERL_SH_DIR
	 -  USE_PERL_FLOCK
	 -  TMP or TEMP
      Evolution
	 -  Text-mode filehandles
	 -  Priorities
	 -  DLL name mangling: pre 5.6.2
	 -  DLL name mangling: 5.6.2 and beyond
	 -  DLL forwarder generation
	 -  Threading
	 -  Calls to external programs
	 -  Memory allocation
	 -  Threads
      BUGS
      AUTHOR
      SEE ALSO

=head1 DESCRIPTION

=head2 Target

The target is to make OS/2 one of the best supported platforms for
using/building/developing Perl and I<Perl applications>, as well as
make Perl the best language to use under OS/2. The secondary target is
to try to make this work under DOS and Win* as well (but not B<too> hard).

The current state is quite close to this target. Known limitations:

=over 5

=item *

Some *nix programs use fork() a lot.  The most useful flavors of
perl for OS/2 (there are several built simultaneously) support this,
but some flavors do not (e.g., when Perl is
called from inside REXX).  Using fork() after
I<use>ing dynamically loaded extensions does not work with I<very> old
versions of EMX.

=item *

You need a separate perl executable F<perl__.exe> (see L</perl__.exe>)
if you want to use PM code in your application (as Perl/Tk or OpenGL
Perl modules do) without having a text-mode window present.

While using the standard F<perl.exe> from a text-mode window is possible
too, I have seen cases when this causes degradation of the system stability.
Using F<perl__.exe> avoids such a degradation.

=item *

There is no simple way to access WPS objects. The only way I know
is via C<OS2::REXX> and C<SOM> extensions (see L<OS2::REXX>, L<SOM>).
However, we do not have access to
convenience methods of Object-REXX. (Is it possible at all? I know
of no Object-REXX API.)  The C<SOM> extension (currently in alpha-test)
may eventually remove this shortcoming; however, because
DII is not supported by the C<SOM> module, using C<SOM> is not as
convenient as one would like.

=back

Please keep this list up-to-date by informing me about other items.

=head2 Other OSes

Since the OS/2 port of perl uses the remarkable EMX environment, it can
run (and build extensions, and - possibly - be built itself) under any
environment which can run EMX. The current list is DOS,
DOS-inside-OS/2, Win0.3*, Win0.95 and WinNT. Out of the many perl flavors,
only one works, see L</"F<perl_.exe>">.

Note that not all features of Perl are available under these
environments. This depends on the features the I<extender> - most
probably RSX - decided to implement.

Cf. L</Prerequisites>.

=head2 Prerequisites

=over 6

=item EMX

EMX runtime is required (may be substituted by RSX). Note that
it is possible to make F<perl_.exe> run under DOS without any
external support by binding F<emx.exe>/F<rsx.exe> to it, see C<emxbind>. Note
that under DOS for best results one should use the RSX runtime, which
has many more functions working (like C<fork>, C<popen> and so on). In
fact RSX is required if there is no VCPI present. Note that
RSX requires DPMI.  Many implementations of DPMI are known to be very
buggy, beware!

Only the latest runtime is supported, currently C<0.9d fix 03>. Perl may run
under earlier versions of EMX, but this is not tested.

One can get different parts of EMX from, say

  ftp://crydee.sai.msu.ru/pub/comp/os/os2/leo/gnu/emx+gcc/
  http://hobbes.nmsu.edu/h-browse.php?dir=/pub/os2/dev/emx/v0.9d/

The runtime component should have the name F<emxrt.zip>.

B<NOTE>. When using F<emx.exe>/F<rsx.exe>, it is enough to have them on your path. One
does not need to specify them explicitly (though this

  emx perl_.exe -de 0

will work as well.)

=item RSX

To run Perl on DPMI platforms one needs the RSX runtime. This is
needed under DOS-inside-OS/2, Win0.3*, Win0.95 and WinNT (see
L</"Other OSes">). Unlike EMX, RSX does not work with VCPI
only; it requires DPMI.

Having RSX and the latest F<sh.exe> one gets a fully functional
B<*nix>-ish environment under DOS, say, C<fork>, C<``> and
pipe-C<open> work. In fact, MakeMaker works (for static build), so one
can have a Perl development environment under DOS.

One can get RSX from, say

  http://cd.textfiles.com/hobbesos29804/disk1/EMX09C/
  ftp://crydee.sai.msu.ru/pub/comp/os/os2/leo/gnu/emx+gcc/contrib/

Contact the author at C<rainer@mathematik.uni-bielefeld.de>.

The latest F<sh.exe> with DOS hooks is available in

  http://www.ilyaz.org/software/os2/

as F<sh_dos.zip> or under similar names starting with C<sh>, C<pdksh> etc.

=item HPFS

Perl does not care about file systems, but the perl library contains
many files with long names, so to install it intact one needs a file
system which supports long file names.

Note that if you do not plan to build perl itself, it may be
possible to fool EMX into truncating file names. This is not supported;
read the EMX docs to see how to do it.

=item pdksh

To start external programs with complicated command lines (like with
pipes in between, and/or quoting of arguments), Perl uses an external
shell. With EMX port such shell should be named F<sh.exe>, and located
either in the wired-in-during-compile locations (usually F<F:/bin>),
or in configurable location (see L</"C<PERL_SH_DIR>">).

For best results use EMX pdksh. The standard binary (5.2.14 or later) runs
under DOS (with L</RSX>) as well, see

  http://www.ilyaz.org/software/os2/

=back

=head2 Starting Perl programs under OS/2 (and DOS and...)

Start your Perl program F<foo.pl> with arguments C<arg1 arg2 arg3> the
same way as on any other platform, by

	perl foo.pl arg1 arg2 arg3

If you want to specify perl options C<-my_opts> to the perl itself (as
opposed to your program), use

	perl -my_opts foo.pl arg1 arg2 arg3

Alternately, if you use OS/2-ish shell, like CMD or 4os2, put
the following at the start of your perl script:

	extproc perl -S -my_opts

rename your program to F<foo.cmd>, and start it by typing

	foo arg1 arg2 arg3

Note that because of stupid OS/2 limitations the full path of the perl
script is not available when you use C<extproc>, thus you are forced to
use the C<-S> perl switch, and your script should be on the C<PATH>. On the
plus side, if you know the full path to your script, you may still start it
with

	perl ../../blah/foo.cmd arg1 arg2 arg3

(note that the argument C<-my_opts> is taken care of by the C<extproc> line
in your script, see L</C<extproc> on the first line>).

To understand what the above I<magic> does, read perl docs about C<-S>
switch - see L<perlrun>, and cmdref about C<extproc>:

	view perl perlrun
	man perlrun
	view cmdref extproc
	help extproc

or whatever method you prefer.

There are also endless possibilities to use I<executable extensions> of
4os2, I<associations> of WPS and so on... However, if you use
*nixish shell (like F<sh.exe> supplied in the binary distribution),
you need to follow the syntax specified in L<perlrun/"Command Switches">.

Note that B<-S> switch supports scripts with additional extensions 
F<.cmd>, F<.btm>, F<.bat>, F<.pl> as well.

=head2 Starting OS/2 (and DOS) programs under Perl

This is what system() (see L<perlfunc/system>), C<``> (see
L<perlop/"I/O Operators">), and I<open pipe> (see L<perlfunc/open>)
are for. (Avoid exec() (see L<perlfunc/exec>) unless you know what you
are doing).

Note however that to use some of these operators you need to have a
sh-syntax shell installed (see L</pdksh>,
L</"Frequently asked questions">), and perl should be able to find it
(see L</"C<PERL_SH_DIR>">).

The cases when the shell is used are:

=over

=item 1

One-argument system() (see L<perlfunc/system>), exec() (see L<perlfunc/exec>)
with redirection or shell meta-characters;

=item 2

Pipe-open (see L<perlfunc/open>) with the command which contains redirection 
or shell meta-characters;

=item 3

Backticks C<``> (see L<perlop/"I/O Operators">) with the command which contains
redirection or shell meta-characters;

=item 4

If the executable called by system()/exec()/pipe-open()/C<``> is a script
with the "magic" C<#!> line or C<extproc> line which specifies shell;

=item 5

If the executable called by system()/exec()/pipe-open()/C<``> is a script
without "magic" line, and C<$ENV{EXECSHELL}> is set to shell;

=item 6

If the executable called by system()/exec()/pipe-open()/C<``> is not
found (is not this remark obsolete?);

=item 7

For globbing (see L<perlfunc/glob>, L<perlop/"I/O Operators">)
(obsolete? Perl uses builtin globbing nowadays...).

=back

For the sake of speed for a common case, in the above algorithms 
backslashes in the command name are not considered as shell metacharacters.
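
The difference between the shell and non-shell cases can be sketched
portably.  The example below is only an illustration and uses nothing
specific to OS/2: it runs the current interpreter (C<$^X>) as the child
program.  Passing the program and its arguments to system() as a list falls
outside case 1 above, so a shell metacharacter such as C<< > >> reaches the
child as ordinary text.

```perl
use strict;
use warnings;

# List form: the program and each argument are passed separately, so
# "a > b" is delivered as one argument and is not read as redirection.
my @command = (
    $^X, '-e',
    'exit(@ARGV == 1 && $ARGV[0] eq q(a > b) ? 0 : 1)',
    'a > b',
);
my $status = system(@command);

# system() returns the wait status; the exit code is in the high byte.
my $exit_code = $status >> 8;
print "child exit code: $exit_code\n";
```

With the one-argument form, the same command line would be handed to the
shell, which would interpret the C<< > >> itself.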

Perl starts scripts which begin with cookies
C<extproc> or C<#!> directly, without an intervention of shell.  Perl uses the
same algorithm to find the executable as F<pdksh>: if the path
on C<#!> line does not work, and contains C</>, then the directory
part of the executable is ignored, and the executable
is searched in F<.> and on C<PATH>.  To find arguments for these scripts
Perl uses a different algorithm than F<pdksh>: up to 3 arguments are 
recognized, and trailing whitespace is stripped.

If a script
does not contain such a cookie, then to avoid calling F<sh.exe>, Perl uses
the same algorithm as F<pdksh>: if C<$ENV{EXECSHELL}> is set, the
script is given as the first argument to this command; if it is not set, then
C<$ENV{COMSPEC} /c> is used (or a hardwired guess if C<$ENV{COMSPEC}> is
not set).
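
The fallback just described can be written down as a small function.  This
is a sketch of the rule as stated here, not the port's actual code: the
helper name C<launcher_for_script> is hypothetical, a hash reference stands
in for C<%ENV>, and C<cmd.exe> is only a placeholder for the hardwired
guess, which this document does not name.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the command list used to start a script
# that has no "#!" or "extproc" line.  $env stands in for \%ENV.
sub launcher_for_script {
    my ($env, $script) = @_;

    # EXECSHELL wins: the script becomes its first argument.
    return ($env->{EXECSHELL}, $script)
        if defined $env->{EXECSHELL} && length $env->{EXECSHELL};

    # Otherwise use "COMSPEC /c".  'cmd.exe' is a placeholder for the
    # hardwired guess, which the text does not specify.
    my $comspec = defined $env->{COMSPEC} ? $env->{COMSPEC} : 'cmd.exe';
    return ($comspec, '/c', $script);
}

my @with_execshell = launcher_for_script({ EXECSHELL => 'sh.exe' }, 'foo.cmd');
my @with_comspec   = launcher_for_script({ COMSPEC => 'C:\\OS2\\CMD.EXE' }, 'foo.cmd');
print "@with_execshell\n@with_comspec\n";
```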

When starting scripts directly, Perl uses exactly the same algorithm as for 
the search of script given by B<-S> command-line option: it will look in
the current directory, then on components of C<$ENV{PATH}> using the 
following order of appended extensions: no extension, F<.cmd>, F<.btm>, 
F<.bat>, F<.pl>.
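
As an illustration, the search order can be spelled out as a list of
candidate file names.  The helper C<script_candidates> below is
hypothetical, and it assumes that every extension is tried in one directory
before the next directory is examined, which the description above does not
state explicitly.

```perl
use strict;
use warnings;

# Extensions in the order given above; '' means "no extension".
my @EXTENSIONS = ('', '.cmd', '.btm', '.bat', '.pl');

# Hypothetical helper: list every path that would be tried, in order.
# The current directory comes first, then each PATH component.
sub script_candidates {
    my ($name, @path_dirs) = @_;
    my @candidates;
    for my $dir ('.', @path_dirs) {
        push @candidates, map { "$dir/$name$_" } @EXTENSIONS;
    }
    return @candidates;
}

my @candidates = script_candidates('foo', 'f:/bin');
print "$_\n" for @candidates;
```

A real search would stop at the first candidate that exists; the sketch only
shows the order.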

Note that Perl will start to look for scripts only if OS/2 cannot start the
specified application, thus C<system 'blah'> will not look for a script if 
there is an executable file F<blah.exe> I<anywhere> on C<PATH>.  In
other words, C<PATH> is essentially searched twice: once by the OS for
an executable, then by Perl for scripts.

Note also that executable files on OS/2 can have an arbitrary extension, but
F<.exe> will be automatically appended if no dot is present in the name.  The
workaround is simple:  since F<blah.> and F<blah> denote the same
file (at least on FAT and HPFS file systems), to start an executable residing in
file F<n:/bin/blah> (no extension) give an argument C<n:/bin/blah.> (dot
appended) to system().
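
A minimal sketch of this workaround follows.  The helper name
C<name_for_system> is hypothetical; it only builds the string to pass to
system() and does not start anything.

```perl
use strict;
use warnings;

# Hypothetical helper: append a dot when the file name part has none,
# so that ".exe" is not appended automatically.
sub name_for_system {
    my ($path) = @_;

    # Take the part after the last "/" or "\" as the file name.
    my ($file) = $path =~ m{([^/\\]*)\z};
    return $file =~ /\./ ? $path : "$path.";
}

my $no_extension   = name_for_system('n:/bin/blah');
my $with_extension = name_for_system('n:/bin/blah.exe');
print "$no_extension $with_extension\n";
```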

Perl will start PM programs from VIO (=text-mode) Perl process in a
separate PM session;
the opposite is not true: when you start a non-PM program from a PM
Perl process, Perl would not run it in a separate session.  If a separate
session is desired, either ensure
that shell will be used, as in C<system 'cmd /c myprog'>, or start it using
optional arguments to system() documented in C<OS2::Process> module.  This
is considered to be a feature.

=head1 Frequently asked questions

=head2 "It does not work"

Perl binary distributions come with a F<testperl.cmd> script which tries
to detect common problems with misconfigured installations.  There is a
pretty large chance it will discover which step of the installation you
managed to goof.  C<;-)>

=head2 I cannot run external programs

=over 4

=item *

Did you run your programs with the C<-w> switch? See
L</Starting OSE<sol>2 (and DOS) programs under Perl>.

=item *

Do you try to run I<internal> shell commands, like C<`copy a b`>
(internal for F<cmd.exe>), or C<`glob a*b`> (internal for ksh)? You
need to specify your shell explicitly, like C<`cmd /c copy a b`>,
since Perl cannot deduce which commands are internal to your shell.

=back

=head2 I cannot embed perl into my program, or use F<perl.dll> from my
program. 

=over 4

=item Is your program EMX-compiled with C<-Zmt -Zcrtdll>?

Well, nowadays Perl DLL should be usable from a differently compiled
program too...  If you can run Perl code from REXX scripts (see
L<OS2::REXX>), then there are some other aspects of interaction which
are overlooked by the current hackish code to support
differently-compiled principal programs.

If everything else fails, you need to build a stand-alone DLL for
perl. Contact me, I did it once. Sockets would not work, nor would a lot of
other stuff.

=item Did you use L<ExtUtils::Embed>?

Some time ago I had reports it does not work.  Nowadays it is checked
in the Perl test suite, so grep F<./t> subdirectory of the build tree
(as well as F<*.t> files in the F<./lib> subdirectory) to find how it
should be done "correctly".

=back

=head2 C<``> and pipe-C<open> do not work under DOS.

This may be a variant of just L</"I cannot run external programs">, or a
deeper problem. Basically: you I<need> RSX (see L</Prerequisites>)
for these commands to work, and you may need a port of F<sh.exe> which
understands command arguments. One such port is listed in
L</Prerequisites> under RSX. Do not forget to set variable
L</"C<PERL_SH_DIR>"> as well.

DPMI is required for RSX.

=head2 Cannot start C<find.exe "pattern" file>

The whole idea of the "standard C API to start applications" is that
the forms C<foo> and C<"foo"> of program arguments are completely
interchangeable.  F<find> breaks this paradigm;

  find "pattern" file
  find pattern file

are not equivalent; F<find> cannot be started directly using the above
API.  One needs a way to surround the doublequotes in some other
quoting construction, necessarily having an extra non-Unixish shell in
between.

Use one of

  system 'cmd', '/c', 'find "pattern" file';
  `cmd /c 'find "pattern" file'`

This would start F<find.exe> via F<cmd.exe> via C<sh.exe> via
C<perl.exe>, but this is the price to pay if you want to use a
non-conforming program.

=head1 INSTALLATION

=head2 Automatic binary installation

The most convenient way of installing a binary distribution of perl is via perl installer
F<install.exe>. Just follow the instructions, and 99% of the
installation blues would go away. 

Note however, that you need to have F<unzip.exe> on your path, and
EMX environment I<running>. The latter means that if you just
installed EMX, and made all the needed changes to F<Config.sys>,
you may need to reboot in between. Check EMX runtime by running

	emxrev

Binary installer also creates a folder on your desktop with some useful
objects.  If you need to change some aspects of the work of the binary
installer, feel free to edit the file F<Perl.pkg>.  This may be useful
e.g., if you need to run the installer many times and do not want to
make many interactive changes in the GUI.

B<Things not taken care of by automatic binary installation:>

=over 15

=item C<PERL_BADLANG>

may be needed if you change your codepage I<after> perl installation,
and the new value is not supported by EMX. See L</"C<PERL_BADLANG>">.

=item C<PERL_BADFREE>

see L</"C<PERL_BADFREE>">.

=item F<Config.pm>

This file resides somewhere deep in the location you installed your
perl library, find it out by 

  perl -MConfig -le "print $INC{'Config.pm'}"

While most important values in this file I<are> updated by the binary
installer, some of them may need to be hand-edited. I know no such
data, please keep me informed if you find one.  Moreover, manual
changes to the installed version may need to be accompanied by an edit
of this file.

=back

B<NOTE>. Because of a typo the binary installer of 5.00305
would install a variable C<PERL_SHPATH> into F<Config.sys>. Please
remove this variable and put L</C<PERL_SH_DIR>> instead.

=head2 Manual binary installation

As of version 5.00305, OS/2 perl binary distribution comes split
into 11 components. Unfortunately, to enable configurable binary
installation, the file paths in the zip files are not absolute, but
relative to some directory.

Note that the extraction with the stored paths is still necessary
(default with unzip, specify C<-d> to pkunzip). However, you
need to know where to extract the files. You also need to manually
change entries in F<Config.sys> to reflect where you put the
files. Note that if you have some primitive unzipper (like
C<pkunzip>), you may get a lot of warnings/errors during
unzipping. Upgrade to C<(w)unzip>.

Below is the sample of what to do to reproduce the configuration on my
machine.  In F<VIEW.EXE> you can press C<Ctrl-Insert> now, and
cut-and-paste from the resulting file - created in the directory you
started F<VIEW.EXE> from.

For each component, we mention environment variables related to each
installation directory.  Either choose directories to match your
values of the variables, or create/append-to variables to take into
account the directories.

=over 3

=item Perl VIO and PM executables (dynamically linked)

  unzip perl_exc.zip *.exe *.ico -d f:/emx.add/bin
  unzip perl_exc.zip *.dll -d f:/emx.add/dll

(have the directories with C<*.exe> on PATH, and C<*.dll> on
LIBPATH);

=item Perl_ VIO executable (statically linked)

  unzip perl_aou.zip -d f:/emx.add/bin

(have the directory on PATH);

=item Executables for Perl utilities

  unzip perl_utl.zip -d f:/emx.add/bin

(have the directory on PATH);

=item Main Perl library

  unzip perl_mlb.zip -d f:/perllib/lib

If this directory is exactly the same as the prefix which was compiled
into F<perl.exe>, you do not need to change
anything. However, for perl to find the library if you use a different
path, you need to
C<set PERLLIB_PREFIX> in F<Config.sys>, see L</"C<PERLLIB_PREFIX>">.

=item Additional Perl modules

  unzip perl_ste.zip -d f:/perllib/lib/site_perl/5.26.3/

Same remark as above applies.  Additionally, if this directory is not
one of the directories on @INC (and @INC is influenced by C<PERLLIB_PREFIX>), you
need to put this
directory and subdirectory F<./os2> in C<PERLLIB> or C<PERL5LIB>
variable. Do not use C<PERL5LIB> unless you have it set already. See
L<perl/"ENVIRONMENT">.

B<[Check whether this extraction directory is still applicable with
the new directory structure layout!]>

=item Tools to compile Perl modules

  unzip perl_blb.zip -d f:/perllib/lib

Same remark as for F<perl_ste.zip>.

=item Manpages for Perl and utilities

  unzip perl_man.zip -d f:/perllib/man

This directory should better be on C<MANPATH>. You need to have a
working F<man> to access these files.

=item Manpages for Perl modules

  unzip perl_mam.zip -d f:/perllib/man

This directory should better be on C<MANPATH>. You need to have a
working F<man> to access these files.

=item Source for Perl documentation

  unzip perl_pod.zip -d f:/perllib/lib

This is used by the C<perldoc> program (see L<perldoc>), and may be used to
generate HTML documentation usable by WWW browsers, and
documentation in zillions of other formats: C<info>, C<LaTeX>,
C<Acrobat>, C<FrameMaker> and so on.  [Use programs such as
F<pod2latex> etc.]

=item Perl manual in F<.INF> format

  unzip perl_inf.zip -d d:/os2/book

This directory should better be on C<BOOKSHELF>.

=item Pdksh

  unzip perl_sh.zip -d f:/bin

This is used by perl to run external commands which explicitly
require shell, like the commands using I<redirection> and I<shell
metacharacters>. It is also used instead of explicit F</bin/sh>.

Set C<PERL_SH_DIR> (see L</"C<PERL_SH_DIR>">) if you move F<sh.exe> from
the above location.

B<Note.> It may be possible to use some other sh-compatible shell (untested).

=back

After you installed the components you needed and updated the
F<Config.sys> correspondingly, you need to hand-edit
F<Config.pm>. This file resides somewhere deep in the location you
installed your perl library, find it out by

  perl -MConfig -le "print $INC{'Config.pm'}"

You need to correct all the entries which look like file paths (they
currently start with C<f:/>).

=head2 B<Warning>

The automatic and manual perl installation leave precompiled paths
inside perl executables. While these paths are overwriteable (see
L</"C<PERLLIB_PREFIX>">, L</"C<PERL_SH_DIR>">), some people may prefer
binary editing of paths inside the executables/DLLs.

=head1 Accessing documentation

Depending on how you built/installed perl you may have (otherwise
identical) Perl documentation in the following formats:

=head2 OS/2 F<.INF> file

Most probably the most convenient form. Under OS/2 view it as

  view perl
  view perl perlfunc
  view perl less
  view perl ExtUtils::MakeMaker

(currently the last two may hit a wrong location, but this may improve
soon). Under Win* see L</"SYNOPSIS">.

If you want to build the docs yourself, and have I<OS/2 toolkit>, run

	pod2ipf > perl.ipf

in F</perllib/lib/pod> directory, then

	ipfc /inf perl.ipf

(Expect a lot of errors during both steps.) Now move it onto your
BOOKSHELF path.

=head2 Plain text

If you have perl documentation in the source form, perl utilities
installed, and GNU groff installed, you may use 

	perldoc perlfunc
	perldoc less
	perldoc ExtUtils::MakeMaker

to access the perl documentation in the text form (note that you may get
better results using perl manpages).

Alternately, try running pod2text on F<.pod> files.

=head2 Manpages

If you have F<man> installed on your system, and you installed perl
manpages, use something like this:

	man perlfunc
	man 3 less
	man ExtUtils.MakeMaker

to access documentation for different components of Perl. Start with

	man perl

Note that dot (F<.>) is used as a package separator for documentation
for packages, and as usual, sometimes you need to give the section - C<3>
above - to avoid shadowing by the I<less(1) manpage>.
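
The naming rule can be sketched as a one-line shell function.
C<module_to_manpage> is a hypothetical name used only for this
illustration; it assumes F<sed> is available, as it is in the GNU tool
suite.

```shell
# Hypothetical helper: turn a Perl package name into the manpage name
# by replacing each "::" with a dot.
module_to_manpage() {
    printf '%s\n' "$1" | sed 's/::/./g'
}

module_to_manpage ExtUtils::MakeMaker    # prints ExtUtils.MakeMaker
```

The result is the argument to give to F<man>.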

Make sure that the directory B<above> the directory with manpages is
on your C<MANPATH>, like this

  set MANPATH=c:/man;f:/perllib/man

for Perl manpages in C<f:/perllib/man/man1/> etc.
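
As a small sketch of the "directory above" rule, the C<MANPATH> entry is
simply the parent of the F<man1> directory.  C<manpath_entry> is a
hypothetical helper name, and the sketch assumes the F<dirname> utility
from the GNU shell utilities is installed.

```shell
# Hypothetical helper: given a directory holding manpages (such as .../man1),
# print the directory that belongs on MANPATH, i.e. its parent.
manpath_entry() {
    dirname "$1"
}

manpath_entry f:/perllib/man/man1/    # prints f:/perllib/man
```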

=head2 HTML

If you have some WWW browser available, installed the Perl
documentation in the source form, and Perl utilities, you can build
HTML docs. Cd to directory with F<.pod> files, and do like this

	cd f:/perllib/lib/pod
	pod2html

After this you can direct your browser to the file F<perl.html> in this
directory, and go ahead with reading docs, like this:

	explore file:///f:/perllib/lib/pod/perl.html

Alternatively you may be able to get these docs prebuilt from CPAN.

=head2 GNU C<info> files

Users of Emacs would appreciate it very much, especially with
C<CPerl> mode loaded. You need to get latest C<pod2texi> from C<CPAN>,
or, alternately, the prebuilt info pages.

=head2 F<PDF> files

for C<Acrobat> are available on CPAN (may be for slightly older version of
perl).

=head2 C<LaTeX> docs

can be constructed using C<pod2latex>.

=head1 BUILD

Here we discuss how to build Perl under OS/2.

=head2 The short story

Assume that you are a seasoned porter, so are sure that all the necessary
tools are already present on your system, and you know how to get the Perl
source distribution.  Untar it, change to the extract directory, and

  gnupatch -p0 < os2\diff.configure
  sh Configure -des -D prefix=f:/perllib
  make
  make test
  make install
  make aout_test
  make aout_install

This puts the executables in f:/perllib/bin.  Manually move them to the
C<PATH>, manually move the built F<perl*.dll> to C<LIBPATH> (here for
Perl DLL F<*> is a not-very-meaningful hex checksum), and run

  make installcmd INSTALLCMDDIR=d:/ir/on/path

Assuming that the C<man>-files were put on an appropriate location,
this completes the installation of minimal Perl system.  (The binary
distribution contains also a lot of additional modules, and the
documentation in INF format.)

What follows is a detailed guide through these steps.

=head2 Prerequisites

You need to have the latest EMX development environment and the full
GNU tool suite (with gawk renamed to awk, and with GNU F<find.exe> and
F<sort.exe> earlier on the path than the OS/2 F<find.exe> and
F<sort.exe>).  To check, use

  find --version
  sort --version

You also need the latest version of F<pdksh> installed as F<sh.exe>.
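
As a rough sketch, the check that the GNU versions of F<find> and
F<sort> come first on the path can be scripted under F<sh.exe>.  The
function name C<is_gnu_tool> is hypothetical, and the test assumes that a
GNU utility mentions C<GNU> in its C<--version> output and that
F</dev/null> is usable from the shell; both are assumptions, not
guarantees.

```shell
# Hypothetical helper: succeed if the given command reports itself as a
# GNU tool.  Assumes GNU utilities print "GNU" in their --version output.
# Standard input is redirected so that a non-GNU tool cannot wait for input.
is_gnu_tool() {
    "$1" --version 2>/dev/null </dev/null | grep GNU >/dev/null
}

for tool in find sort; do
    if is_gnu_tool "$tool"; then
        echo "$tool: GNU version found first on PATH"
    else
        echo "$tool: not GNU (check the order of directories on PATH)"
    fi
done
```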

Check that you have B<BSD> libraries and headers installed, and - 
optionally - Berkeley DB headers and libraries, and crypt.

Possible locations to get the files:


  ftp://ftp.uni-heidelberg.de/pub/os2/unix/
  http://hobbes.nmsu.edu/h-browse.php?dir=/pub/os2
  http://cd.textfiles.com/hobbesos29804/disk1/DEV32/
  http://cd.textfiles.com/hobbesos29804/disk1/EMX09C/

It is reported that the following archives contain enough utils to
build perl: F<gnufutil.zip>, F<gnusutil.zip>, F<gnututil.zip>, F<gnused.zip>,
F<gnupatch.zip>, F<gnuawk.zip>, F<gnumake.zip>, F<gnugrep.zip>, F<bsddev.zip> and
F<ksh527rt.zip> (or a later version).  Note that all these utilities are
known to be available from LEO:

  ftp://crydee.sai.msu.ru/pub/comp/os/os2/leo/gnu/

Note also that the F<db.lib> and F<db.a> from the EMX distribution
are not suitable for multi-threaded compile (even single-threaded
flavor of Perl uses multi-threaded C RTL, for
compatibility with XFree86-OS/2). Get a corrected one from

  http://www.ilyaz.org/software/os2/db_mt.zip

If you have I<exactly the same version of Perl> installed already,
make sure that no copies of perl are currently running.  Later steps
of the build may fail since an older version of F<perl.dll> loaded into
memory may be found.  Running C<make test> becomes meaningless, since
the tests are checking a previous build of perl (this situation is detected
and reported by F<os2/os2_base.t> test).  Do not forget to unset
C<PERL_EMXLOAD_SEC> in environment.

Also make sure that you have F</tmp> directory on the current drive,
and F<.> directory in your C<LIBPATH>. One may try to correct the
latter condition by

  set BEGINLIBPATH .\.

if you use something like F<CMD.EXE> or latest versions of
F<4os2.exe>.  (Setting BEGINLIBPATH to just C<.> is ignored by the
OS/2 kernel.)

Make sure your gcc is good for C<-Zomf> linking: run C<omflibs>
script in F</emx/lib> directory.

Check that you have link386 installed. It comes standard with OS/2,
but may be not installed due to customization. If typing

  link386

shows you do not have it, do I<Selective install>, and choose C<Link
object modules> in I<Optional system utilities/More>. If you get into
link386 prompts, press C<Ctrl-C> to exit.

=head2 Getting perl source

You need to fetch the latest perl source (including developers
releases). With some probability it is located in 

  http://www.cpan.org/src/
  http://www.cpan.org/src/unsupported

If not, you may need to dig in the indices to find it in the directory
of the current maintainer.

A quick cycle of developer releases may break the OS/2 build from time to
time; looking into

  http://www.cpan.org/ports/os2/

may indicate the latest release which was publicly released by the
maintainer. Note that the release may include some additional patches
to apply to the current source of perl.

Extract it like this

  tar vzxf perl5.00409.tar.gz

You may see a message about errors while extracting F<Configure>. This is
because there is a conflict with a similarly-named file F<configure>.

Change to the directory of extraction.

=head2 Application of the patches

You need to apply the patches in F<./os2/diff.*> like this:

  gnupatch -p0 < os2\diff.configure

You may also need to apply the patches supplied with the binary
distribution of perl.  It also makes sense to look on the
perl5-porters mailing list for the latest OS/2-related patches (see
L<http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/>).  Such
patches usually contain strings C</os2/> and C<patch>, so it makes
sense looking for these strings.

=head2 Hand-editing

You may look into the file F<./hints/os2.sh> and correct anything
wrong you find there. I do not expect it is needed anywhere.

=head2 Making

  sh Configure -des -D prefix=f:/perllib

C<prefix> means: where to install the resulting perl library. Giving
correct prefix you may avoid the need to specify C<PERLLIB_PREFIX>,
see L</"C<PERLLIB_PREFIX>">.

I<Ignore the message about missing C<ln>, and about C<-c> option to
tr>. The latter is most probably already fixed; if you see it and can trace
where this spurious warning comes from, please inform me.

Now

  make

At some moment the build may die, reporting a I<version mismatch> or
I<unable to run F<perl>>.  This means that you do not have F<.> in
your LIBPATH, so F<perl.exe> cannot find the needed F<perl67B2.dll> (treat
these hex digits as line noise).  After this is fixed the build
should finish without a lot of fuss.

=head2 Testing

Now run

  make test

All tests should succeed (with some of them skipped).  If you have the
same version of Perl installed, it is crucial that you have C<.> early
in your LIBPATH (or in BEGINLIBPATH), otherwise your tests will most
probably test the wrong version of Perl.

Some tests may generate extra messages similar to

=over 4

=item A lot of C<bad free>

in database tests related to Berkeley DB. I<This should be fixed already.>
If it persists, you may disable these warnings, see L</"C<PERL_BADFREE>">.

=item Process terminated by SIGTERM/SIGINT

This is a standard message issued by OS/2 applications. *nix
applications die in silence. It is considered to be a feature. One can
easily disable this by appropriate sighandlers. 

However the test engine bleeds these messages to the screen at unexpected
moments. Two messages of this kind I<should> be present during
testing.

=back

To get finer test reports, call

  perl t/harness

The report with F<io/pipe.t> failing may look like this:

 Failed Test  Status Wstat Total Fail  Failed  List of failed
 ------------------------------------------------------------
 io/pipe.t                    12    1   8.33%  9
 7 tests skipped, plus 56 subtests skipped.
 Failed 1/195 test scripts, 99.49% okay. 1/6542 subtests failed,
    99.98% okay.

The reasons for most important skipped tests are:

=over 8

=item F<op/fs.t>

=over 4

=item Z<>18

Checks C<atime> and C<mtime> of C<stat()> - unfortunately, HPFS
provides only 2sec time granularity (for compatibility with FAT?).

=item Z<>25

Checks C<truncate()> on a filehandle just opened for write - I do not
know why this should or should not work.

=back

=item F<op/stat.t>

Checks C<stat()>. Tests:

=over 4

=item 4

Checks C<atime> and C<mtime> of C<stat()> - unfortunately, HPFS
provides only 2sec time granularity (for compatibility with FAT?).

=back

=back

=head2 Installing the built perl

If you haven't yet moved C<perl*.dll> onto LIBPATH, do it now.

Run

  make install

This puts the generated files into the needed locations. Manually put
F<perl.exe>, F<perl__.exe> and F<perl___.exe> to a location on your
PATH, F<perl.dll> to a location on your LIBPATH.

Run

  make installcmd INSTALLCMDDIR=d:/ir/on/path

to convert perl utilities to F<.cmd> files and put them on
PATH. You need to put F<.EXE>-utilities on path manually. They are
installed in C<$prefix/bin>, here C<$prefix> is what you gave to
F<Configure>, see L</Making>.

If you use C<man>, either move the installed F<*/man/> directories to
your C<MANPATH>, or modify C<MANPATH> to match the location.  (One
could have avoided this by providing a correct C<manpath> option to
F<./Configure>, or editing F<./config.sh> between configuring and
making steps.)

=head2 C<a.out>-style build

Proceed as above, but make F<perl_.exe> (see L</"F<perl_.exe>">) by

  make perl_

test and install by

  make aout_test
  make aout_install

Manually put F<perl_.exe> to a location on your PATH.

B<Note.> The build process for C<perl_> I<does not know> about all the
dependencies, so you should make sure that everything is up-to-date,
say, by doing

  make perl_dll

first.

=head1 Building a binary distribution

[This section provides a short overview only...]

Building should proceed differently depending on whether the version of perl
you install is already present and used on your system, or is a new version
not yet used.  The description below assumes that the version is new, so
installing its DLLs and F<.pm> files will not disrupt the operation of your
system even if some intermediate steps are not yet fully working.

The other cases require a little bit more convoluted procedures.  Below I
suppose that the current version of Perl is C<5.8.2>, so the executables are
named accordingly.

=over

=item 1.

Fully build and test the Perl distribution.  Make sure that no tests are
failing with C<test> and C<aout_test> targets; fix the bugs in Perl and
the Perl test suite detected by these tests.  Make sure that C<all_test>
make target runs as clean as possible.  Check that F<os2/perlrexx.cmd>
runs fine.

=item 2.

Fully install Perl, including C<installcmd> target.  Copy the generated DLLs
to C<LIBPATH>; copy the numbered Perl executables (as in F<perl5.8.2.exe>)
to C<PATH>; copy C<perl_.exe> to C<PATH> as C<perl_5.8.2.exe>.  Think whether
you need backward-compatibility DLLs.  In most cases you do not need to install
them yet; but sometimes this may simplify the following steps.

=item 3.

Make sure that C<CPAN.pm> can download files from CPAN.  If not, you may need
to manually install C<Net::FTP>.

=item 4.

Install the bundle C<Bundle::OS2_default>

 perl5.8.2 -MCPAN -e "install Bundle::OS2_default" < nul |& tee 00cpan_i_1

This may take a couple of hours on a 1GHz processor (when run the first time).
And this will not necessarily be a smooth procedure.  Some modules may not
specify required dependencies, so one may need to repeat this procedure several
times until the results stabilize.

 perl5.8.2 -MCPAN -e "install Bundle::OS2_default" < nul |& tee 00cpan_i_2
 perl5.8.2 -MCPAN -e "install Bundle::OS2_default" < nul |& tee 00cpan_i_3

Even after they stabilize, some tests may fail.

Fix as many discovered bugs as possible.  Document all the bugs which are not
fixed, and all the failures with unknown reasons.  Inspect the produced logs
F<00cpan_i_1> to find suspiciously skipped tests, and other fishy events.

Keep in mind that I<installation> of some modules may fail too: for example,
the DLLs to update may be already loaded by F<CPAN.pm>.  Inspect the C<install>
logs (in the example above F<00cpan_i_1> etc) for errors, and install things
manually, as in

  cd $CPANHOME/.cpan/build/Digest-MD5-2.31
  make install

Some distributions may fail some tests, but you may want to install them
anyway (as above, or via C<force install> command of C<CPAN.pm> shell-mode).

Since this procedure may take quite a long time to complete, it makes sense
to "freeze" your CPAN configuration by disabling periodic updates of the
local copy of CPAN index: set C<index_expire> to some big value (I use 365),
then save the settings

  CPAN> o conf index_expire 365
  CPAN> o conf commit

Reset back to the default value C<1> when you are finished.

=item 5.

When satisfied with the results, rerun the C<installcmd> target.  Now you
can copy C<perl5.8.2.exe> to C<perl.exe>, and install the other OMF-build
executables: C<perl__.exe> etc.  They are ready to be used.

=item 6.

Change to the C<./pod> directory of the build tree, download the Perl logo
F<CamelGrayBig.BMP>, and run

  ( perl2ipf > perl.ipf ) |& tee 00ipf
  ipfc /INF perl.ipf |& tee 00inf

This produces the Perl docs online book C<perl.INF>.  Install it on the
C<BOOKSHELF> path.

=item 7.

Now is the time to build the statically linked executable F<perl_.exe> which
includes the modules newly installed via C<Bundle::OS2_default>.  Doing testing
via C<CPAN.pm> is going to be painfully slow, since it statically links
a new executable per XS extension.

Here is a possible workaround: create a toplevel F<Makefile.PL> in
F<$CPANHOME/.cpan/build/> with contents being (compare with L</Making
executables with a custom collection of statically loaded extensions>)

  use ExtUtils::MakeMaker;
  WriteMakefile NAME => 'dummy';

execute this as

  perl_5.8.2.exe Makefile.PL <nul |& tee 00aout_c1
  make -k all test <nul |& tee 00aout_t1

Again, this procedure is not expected to be absolutely smooth.  Some C<Makefile.PL>'s
in subdirectories may be buggy, and would not run as "child" scripts.  The
interdependency of modules can strike you; however, since non-XS modules
are already installed, the prerequisites of most modules have a very good
chance to be present.

If you discover some glitches, move directories of problematic modules to a
different location; if these modules are non-XS modules, you may just ignore
them - they are already installed; the remaining, XS, modules you need to
install manually one by one.

After each such removal you need to rerun the C<Makefile.PL>/C<make> process;
usually this procedure converges soon.  (But be sure to convert all the
necessary external C libraries from F<.lib> format to F<.a> format: run one of

  emxaout foo.lib
  emximp -o foo.a foo.lib

whichever is appropriate.)  Also, make sure that the DLLs for external
libraries are usable with executables compiled without C<-Zmtd> options.

When you are sure that only a few subdirectories
lead to failures, you may want to add C<-j4> option to C<make> to speed up
skipping subdirectories with already finished build.

When you are satisfied with the results of tests, install the built C libraries
for extensions:

  make install |& tee 00aout_i

Now you can rename the file F<./perl.exe> generated during the last phase
to F<perl_5.8.2.exe>; place it on C<PATH>; if there is an inter-dependency
between some XS modules, you may need to repeat the C<test>/C<install> loop
with this new executable and some excluded modules - until the procedure
converges.

Now you have all the necessary F<.a> libraries for these Perl modules in the
places where Perl builder can find them.  Use the perl builder: change to an
empty directory, create a "dummy" F<Makefile.PL> again, and run

  perl_5.8.2.exe Makefile.PL |& tee 00c
  make perl		     |& tee 00p

This should create an executable F<./perl.exe> with all the statically loaded
extensions built in.  Compare the generated F<perlmain.c> files to make sure
that during the iterations the number of loaded extensions only increases.
Rename F<./perl.exe> to F<perl_5.8.2.exe> on C<PATH>.

When it converges, you got a functional variant of F<perl_5.8.2.exe>; copy it
to C<perl_.exe>.  You are done with generation of the local Perl installation.

=item 8.

Make sure that the installed modules are actually installed in the location
of the new Perl, and are not inherited from entries of @INC given for
inheritance from the older versions of Perl: set C<PERLLIB_582_PREFIX> to
redirect the new version of Perl to a new location, and copy the installed
files to this new location.  Redo the tests to make sure that the versions of
modules inherited from older versions of Perl are not needed.

Actually, the log output of L<pod2ipf(1)> during step 6 gives very detailed
info about which modules are loaded from which place; so you may use it as
an additional verification tool.

Check that some temporary files did not make it into the perl install tree.
Run something like this

  pfind . -f "!(/\.(pm|pl|ix|al|h|a|lib|txt|pod|imp|bs|dll|ld|bs|inc|xbm|yml|cgi|uu|e2x|skip|packlist|eg|cfg|html|pub|enc|all|ini|po|pot)$/i or /^\w+$/") | less

in the install tree (both top one and F<sitelib> one).

Compress all the DLLs with F<lxlite>.  The tiny F<.exe> can be compressed with
C</c:max> (the bug only appears when there is a fixup in the last 6 bytes of a
page (?); since the tiny executables are much smaller than a page, the bug
will not hit).  Do not compress C<perl_.exe> - it would not work under DOS.

=item 9.

Now you can generate the binary distribution.  This is done by running the
test of the CPAN distribution C<OS2::SoftInstaller>.  Tune up the file
F<test.pl> to suit the layout of current version of Perl first.  Do not
forget to pack the necessary external DLLs accordingly.  Include the
description of the bugs and test suite failures you could not fix.  Include
the small-stack versions of Perl executables from Perl build directory.

Include F<perl5.def> so that people can relink the perl DLL preserving
the binary compatibility, or can create compatibility DLLs.  Include the diff
files (C<diff -pu old new>) of fixes you did so that people can rebuild your
version.  Include F<perl5.map> so that one can use remote debugging.

=item 10.

Share what you did with the other people.  Relax.  Enjoy fruits of your work.

=item 11.

Brace yourself for thanks, bug reports, hate mail and spam coming as result
of the previous step.  No good deed should remain unpunished!

=back
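
The F<.lib>-to-F<.a> conversion in step 7 is easy to leave incomplete.
As a minimal sketch, assuming each converted F<.a> file is kept next to
the original F<.lib> file, the following shell function lists the
libraries that still lack a counterpart.  C<list_unconverted> is a
hypothetical name; the function only reports files and leaves the choice
between C<emxaout> and C<emximp> to you.

```shell
# Hypothetical helper: print every DIR/*.lib that has no DIR/*.a beside it.
list_unconverted() {
    for lib in "$1"/*.lib; do
        [ -e "$lib" ] || continue                  # the glob matched nothing
        [ -e "${lib%.lib}.a" ] || printf '%s\n' "$lib"
    done
}

list_unconverted .
```

An empty listing suggests that every library in the directory has been
converted.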

=head1 Building custom F<.EXE> files

The Perl executables can be easily rebuilt at any moment.  Moreover, one can
use the I<embedding> interface (see L<perlembed>) to make very customized
executables.

=head2 Making executables with a custom collection of statically loaded extensions

It is a little bit easier to do so while I<decreasing> the list of statically
loaded extensions.  We discuss this case only here.

=over

=item 1.

Change to an empty directory, and create a placeholder F<Makefile.PL>:

  use ExtUtils::MakeMaker;
  WriteMakefile NAME => 'dummy';

=item 2.

Run it with the flavor of Perl (F<perl.exe> or F<perl_.exe>) you want to
rebuild.

  perl_ Makefile.PL

=item 3.

Ask it to create new Perl executable:

  make perl

(you may need to manually add C<PERLTYPE=-DPERL_CORE> to this commandline on
some versions of Perl; the symptom is that the command-line globbing does not
work from OS/2 shells with the newly-compiled executable; check with

  .\perl.exe -wle "print for @ARGV" *

).

=item 4.

The previous step created F<perlmain.c> which contains a list of newXS() calls
near the end.  Removing unnecessary calls, and rerunning

  make perl

will produce a customized executable.

=back

=head2 Making executables with a custom search-paths

The default perl executable is flexible enough to support most usages.
However, one may want something yet more flexible; for example, one may want
to find Perl DLL relatively to the location of the EXE file; or one may want
to ignore the environment when setting the Perl-library search path, etc.

If you feel comfortable with I<embedding> interface (see L<perlembed>), such
things are easy to do repeating the steps outlined in L</Making
executables with a custom collection of statically loaded extensions>, and
doing more comprehensive edits to main() of F<perlmain.c>.  The people with
little desire to understand Perl can just rename main(), and do necessary
modification in a custom main() which calls the renamed function at the appropriate
time.

However, there is a third way: perl DLL exports the main() function and several
callbacks to customize the search path.  Below is a complete example of a
"Perl loader" which

=over

=item 1.

Looks for Perl DLL in the directory C<$exedir/../dll>;

=item 2.

Prepends the above directory to C<BEGINLIBPATH>;

=item 3.

Fails if the Perl DLL found via C<BEGINLIBPATH> is different from what was
loaded on step 1; e.g., another process could have loaded it from C<LIBPATH>
or from a different value of C<BEGINLIBPATH>.  In these cases one needs to
modify the setting of the system so that this other process either does not
run, or loads the DLL from C<BEGINLIBPATH> with C<LIBPATHSTRICT=T> (available
with kernels after September 2000).

=item 4.

Loads Perl library from C<$exedir/../dll/lib/>.

=item 5.

Uses Bourne shell from C<$exedir/../dll/sh/ksh.exe>.

=back

For best results compile the C file below with the same options as the Perl
DLL.  However, a lot of functionality will work even if the executable is not
an EMX application, e.g., if compiled with

  gcc -Wall -DDOSISH -DOS2=1 -O2 -s -Zomf -Zsys perl-starter.c \
    -DPERL_DLL_BASENAME=\"perl312F\" -Zstack 8192 -Zlinker /PM:VIO

Here is the sample C file:

 #define INCL_DOS
 #define INCL_NOPM
 /* These are needed for compile if os2.h includes os2tk.h, not
  * os2emx.h */
 #define INCL_DOSPROCESS
 #include <os2.h>

 #include "EXTERN.h"
 #define PERL_IN_MINIPERLMAIN_C
 #include "perl.h"

 static char *me;
 HMODULE handle;

 static void
 die_with(char *msg1, char *msg2, char *msg3, char *msg4)
 {
    ULONG c;
    char *s = " error: ";

    DosWrite(2, me, strlen(me), &c);
    DosWrite(2, s, strlen(s), &c);
    DosWrite(2, msg1, strlen(msg1), &c);
    DosWrite(2, msg2, strlen(msg2), &c);
    DosWrite(2, msg3, strlen(msg3), &c);
    DosWrite(2, msg4, strlen(msg4), &c);
    DosWrite(2, "\r\n", 2, &c);
    exit(255);
 }

 typedef ULONG (*fill_extLibpath_t)(int type,
                                    char *pre,
                                    char *post,
                                    int replace,
                                    char *msg);
 typedef int (*main_t)(int type, char *argv[], char *env[]);
 typedef int (*handler_t)(void* data, int which);

 #ifndef PERL_DLL_BASENAME
 #  define PERL_DLL_BASENAME "perl"
 #endif

 static HMODULE
 load_perl_dll(char *basename)
 {
     char buf[300], fail[260];
     STRLEN l, dirl;
     fill_extLibpath_t f;
     ULONG rc_fullname;
     HMODULE handle, handle1;

     if (_execname(buf, sizeof(buf) - 13) != 0)
         die_with("Can't find full path: ", strerror(errno), "", "");
     /* XXXX Fill 'me' with new value */
     l = strlen(buf);
     while (l && buf[l-1] != '/' && buf[l-1] != '\\')
         l--;
     dirl = l - 1;
     strcpy(buf + l, basename);
     l += strlen(basename);
     strcpy(buf + l, ".dll");
     if ( (rc_fullname = DosLoadModule(fail, sizeof fail, buf, &handle))
                                                                    != 0
          && DosLoadModule(fail, sizeof fail, basename, &handle) != 0 )
         die_with("Can't load DLL ", buf, "", "");
     if (rc_fullname)
         return handle;    /* was loaded with short name; all is fine */
     if (DosQueryProcAddr(handle, 0, "fill_extLibpath", (PFN*)&f))
         die_with(buf,
                  ": DLL exports no symbol ",
                  "fill_extLibpath",
                  "");
     buf[dirl] = 0;
     if (f(0 /*BEGINLIBPATH*/, buf /* prepend */, NULL /* append */,
           0 /* keep old value */, me))
         die_with(me, ": prepending BEGINLIBPATH", "", "");
     if (DosLoadModule(fail, sizeof fail, basename, &handle1) != 0)
         die_with(me,
                  ": finding perl DLL again via BEGINLIBPATH",
                  "",
                  "");
     buf[dirl] = '\\';
     if (handle1 != handle) {
         if (DosQueryModuleName(handle1, sizeof(fail), fail))
             strcpy(fail, "???");
         die_with(buf,
                  ":\n\tperl DLL via BEGINLIBPATH is different: \n\t",
                  fail,
                  "\n\tYou may need to manipulate global BEGINLIBPATH"
                     " and LIBPATHSTRICT"
                     "\n\tso that the other copy is loaded via"
                      " BEGINLIBPATH.");
     }
     return handle;
 }

 int
 main(int argc, char **argv, char **env)
 {
     main_t f;
     handler_t h;

     me = argv[0];
     /**/
     handle = load_perl_dll(PERL_DLL_BASENAME);

     if (DosQueryProcAddr(handle,
                          0,
                          "Perl_OS2_handler_install",
                          (PFN*)&h))
         die_with(PERL_DLL_BASENAME,
                  ": DLL exports no symbol ",
                  "Perl_OS2_handler_install",
                  "");
     if ( !h((void *)"~installprefix", Perlos2_handler_perllib_from)
          || !h((void *)"~dll", Perlos2_handler_perllib_to)
          || !h((void *)"~dll/sh/ksh.exe", Perlos2_handler_perl_sh) )
         die_with(PERL_DLL_BASENAME,
                  ": Can't install @INC manglers",
                  "",
                  "");
     if (DosQueryProcAddr(handle, 0, "dll_perlmain", (PFN*)&f))
         die_with(PERL_DLL_BASENAME,
                  ": DLL exports no symbol ",
                  "dll_perlmain",
                  "");
     return f(argc, argv, env);
 }

=head1 Build FAQ

=head2 Some C</> became C<\> in pdksh.

You have a very old pdksh. See L</Prerequisites>.

=head2 C<'errno'> - unresolved external

You do not have MT-safe F<db.lib>. See L</Prerequisites>.

=head2 Problems with tr or sed

Reported with very old versions of tr and sed.

=head2 Some problem (forget which ;-)

You have an older version of F<perl.dll> on your LIBPATH, which
broke the build of extensions.

=head2 Library ... not found

You did not run C<omflibs>. See L</Prerequisites>.

=head2 Segfault in make

You use an old version of GNU make. See L</Prerequisites>.

=head2 op/sprintf test failure

This can result from a bug in emx sprintf which was fixed in 0.9d fix 03.

=head1 Specific (mis)features of OS/2 port

=head2 C<setpriority>, C<getpriority>

Note that these functions are compatible with *nix, not with the older
ports of '94 - 95. The priorities are absolute, go from 32 to -95,
lower is quicker. 0 is the default priority.

B<WARNING>.  Calling C<getpriority> on a non-existing process could lock
the system before Warp3 fixpak22.  Starting with Warp3, Perl will use
a workaround: it aborts getpriority() if the process is not present.
This is not possible on older versions C<2.*>, and has a race
condition anyway.

=head2 C<system()>

Multi-argument form of C<system()> allows an additional numeric
argument. The meaning of this argument is described in
L<OS2::Process>.

When finding a program to run, Perl first asks the OS to look for executables
on C<PATH> (OS/2 adds extension F<.exe> if no extension is present).
If not found, it looks for a script with possible extensions 
added in this order: no extension, F<.cmd>, F<.btm>, 
F<.bat>, F<.pl>.  If found, Perl checks the start of the file for magic
strings C<"#!"> and C<"extproc ">.  If found, Perl uses the rest of the
first line as the beginning of the command line to run this script.  The
only mangling done to the first line is extraction of arguments (currently
up to 3), and ignoring of the path-part of the "interpreter" name if it can't
be found using the full path.
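
The order in which script names are tried can be sketched in C as follows.
This is only an illustration of the rule stated above, not the code Perl
uses: the names C<script_exts> and C<script_candidate> are hypothetical, and
the sketch builds the candidate names without looking for them on C<PATH>.

```c
#include <assert.h>
#include <stdio.h>

/* Extensions tried for a script, in the order given in the text. */
static const char *const script_exts[] = { "", ".cmd", ".btm", ".bat", ".pl" };
#define N_SCRIPT_EXTS (sizeof script_exts / sizeof script_exts[0])

/*
 * Write the idx-th candidate file name for `name` into out.
 * Returns 1 on success, 0 if idx is past the last candidate or the
 * result does not fit into out.
 */
int
script_candidate(const char *name, size_t idx, char *out, size_t outsize)
{
    int n;

    assert(name != NULL && out != NULL && outsize > 0);
    if (idx >= N_SCRIPT_EXTS)
        return 0;
    n = snprintf(out, outsize, "%s%s", name, script_exts[idx]);
    return n >= 0 && (size_t)n < outsize;
}
```

A caller would loop over C<idx> starting from 0 and stop at the first
candidate which exists.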

E.g., C<system 'foo', 'bar', 'baz'> may lead Perl to finding
F<C:/emx/bin/foo.cmd> with the first line being

 extproc /bin/bash    -x   -c

If F</bin/bash.exe> is not found, then Perl looks for an executable F<bash.exe> on
C<PATH>.  If found in F<C:/emx.add/bin/bash.exe>, then the above system() is
translated to

  system qw(C:/emx.add/bin/bash.exe -x -c C:/emx/bin/foo.cmd bar baz)

One additional translation is performed: instead of F</bin/sh> Perl uses
the hardwired-or-customized shell (see L</"C<PERL_SH_DIR>">).
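
The extraction of the interpreter name and its arguments from a magic first
line can be sketched in C as follows.  This is a minimal sketch under stated
assumptions, not the actual implementation: the function
C<split_magic_line> and the limits C<MAX_MAGIC_WORDS> and C<MAX_WORD_LEN>
are hypothetical, and the search on C<PATH> and the replacement of
F</bin/sh> are left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical limits: the interpreter name plus up to 3 arguments. */
#define MAX_MAGIC_WORDS 4
#define MAX_WORD_LEN    64

/*
 * Split the first line of a script into whitespace-separated words if it
 * starts with "#!" or "extproc ".  Returns the number of words stored in
 * words[] (0 if the line carries no magic string).  Words beyond
 * MAX_MAGIC_WORDS are ignored; over-long words are truncated.
 */
int
split_magic_line(const char *line, char words[MAX_MAGIC_WORDS][MAX_WORD_LEN])
{
    int n = 0;

    assert(line != NULL && words != NULL);
    if (strncmp(line, "#!", 2) == 0)
        line += 2;
    else if (strncmp(line, "extproc ", 8) == 0)
        line += 8;
    else
        return 0;

    while (n < MAX_MAGIC_WORDS) {
        size_t full, len;

        line += strspn(line, " \t");            /* skip blanks */
        full = strcspn(line, " \t\r\n");        /* length of the next word */
        if (full == 0)
            break;                              /* end of the line */
        len = full < MAX_WORD_LEN ? full : MAX_WORD_LEN - 1;
        memcpy(words[n], line, len);
        words[n][len] = '\0';
        line += full;
        n++;
    }
    return n;
}
```

For the first line shown in the example above the sketch stores the three
words C</bin/bash>, C<-x> and C<-c>; the name of the script and the
arguments given to system() would be appended after them.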

The above search for "interpreter" is recursive: if F<bash> executable is not
found, but F<bash.btm> is found, Perl will investigate its first line etc.
The only hardwired limit on the recursion depth is implicit: there is a limit
of 4 on the number of additional arguments inserted before the actual arguments
given to system().  In particular, if no additional arguments are specified
on the "magic" first lines, then the limit on the depth is 4.

If Perl finds that the found executable is of PM type when the
current session is not, it will start the new process in a separate session of
the necessary type.  Call via C<OS2::Process> to disable this magic.

B<WARNING>.  Due to the described logic, you need to explicitly
specify F<.com> extension if needed.  Moreover, if the executable
F<perl5.6.1> is requested, Perl will not look for F<perl5.6.1.exe>.
[This may change in the future.]

=head2 C<extproc> on the first line

If the first chars of a Perl script are C<"extproc ">, this line is treated
as C<#!>-line, thus all the switches on this line are processed (twice
if script was started via cmd.exe).  See L<perlrun/DESCRIPTION>.

=head2 Additional modules:

L<OS2::Process>, L<OS2::DLL>, L<OS2::REXX>, L<OS2::PrfDB>, L<OS2::ExtAttr>. These
modules provide access to additional numeric argument for C<system>
and to the information about the running process,
to DLLs having functions with REXX signature and to the REXX runtime, to
OS/2 databases in the F<.INI> format, and to Extended Attributes.

Two additional extensions by Andreas Kaiser, C<OS2::UPM> and
C<OS2::FTP>, are included in the C<ILYAZ> directory, mirrored on CPAN.
Other OS/2-related extensions are available too.

=head2 Prebuilt methods:

=over 4

=item C<File::Copy::syscopy>

used by C<File::Copy::copy>, see L<File::Copy>.

=item C<DynaLoader::mod2fname>

used by C<DynaLoader> for DLL name mangling.

=item  C<Cwd::current_drive()>

Self explanatory.

=item  C<Cwd::sys_chdir(name)>

leaves drive as it is.

=item  C<Cwd::change_drive(name)>

changes the "current" drive.

=item  C<Cwd::sys_is_absolute(name)>

means has drive letter and is_rooted.

=item  C<Cwd::sys_is_rooted(name)>

means has leading C<[/\\]> (maybe after a drive-letter:).

=item  C<Cwd::sys_is_relative(name)>

means changes with current dir.

=item  C<Cwd::sys_cwd(name)>

Interface to cwd from EMX. Used by C<Cwd::cwd>.

=item  C<Cwd::sys_abspath(name, dir)>

Really really odious function to implement. Returns absolute name of
file which would have C<name> if CWD were C<dir>.  C<dir> defaults to the
current dir.

=item  C<Cwd::extLibpath([type])>

Get current value of extended library search path. If C<type> is
present and positive, works with C<END_LIBPATH>, if negative, works
with C<LIBPATHSTRICT>, otherwise with C<BEGIN_LIBPATH>. 

=item  C<Cwd::extLibpath_set( path [, type ] )>

Set current value of extended library search path. If C<type> is
present and positive, works with C<END_LIBPATH>, if negative, works
with C<LIBPATHSTRICT>, otherwise with C<BEGIN_LIBPATH>.

=item C<OS2::Error(do_harderror,do_exception)>

Returns C<undef> if it was not called yet, otherwise bit 1 is
set if on the previous call do_harderror was enabled, bit
2 is set if on previous call do_exception was enabled.

This function enables/disables error popups associated with 
hardware errors (Disk not ready etc.) and software exceptions.

I know of no way to find out the state of popups I<before> the first call
to this function.

=item C<OS2::Errors2Drive(drive)>

Returns C<undef> if it was not called yet, otherwise returns false if errors
were not requested to be written to a hard drive, or the drive letter if
this was requested.

This function may redirect error popups associated with hardware errors
(Disk not ready etc.) and software exceptions to the file POPUPLOG.OS2 at
the root directory of the specified drive.  Overrides OS2::Error() specified
by individual programs.  Given argument undef will disable redirection.

Has global effect, persists after the application exits.

I know of no way to find out the state of redirection of popups to the disk
I<before> the first call to this function.

=item OS2::SysInfo()

Returns a hash with system information. The keys of the hash are

	MAX_PATH_LENGTH, MAX_TEXT_SESSIONS, MAX_PM_SESSIONS,
	MAX_VDM_SESSIONS, BOOT_DRIVE, DYN_PRI_VARIATION,
	MAX_WAIT, MIN_SLICE, MAX_SLICE, PAGE_SIZE,
	VERSION_MAJOR, VERSION_MINOR, VERSION_REVISION,
	MS_COUNT, TIME_LOW, TIME_HIGH, TOTPHYSMEM, TOTRESMEM,
	TOTAVAILMEM, MAXPRMEM, MAXSHMEM, TIMER_INTERVAL,
	MAX_COMP_LENGTH, FOREGROUND_FS_SESSION,
	FOREGROUND_PROCESS

=item OS2::BootDrive()

Returns a letter without colon.

=item C<OS2::MorphPM(serve)>, C<OS2::UnMorphPM(serve)>

Transforms the current application into a PM application and back.
The argument true means that a real message loop is going to be served.
OS2::MorphPM() returns the PM message queue handle as an integer.

See L</"Centralized management of resources"> for additional details.

=item C<OS2::Serve_Messages(force)>

Fake on-demand retrieval of outstanding PM messages.  If C<force> is false,
will not dispatch messages if a real message loop is known to
be present.  Returns number of messages retrieved.

Dies with "QUITing..." if WM_QUIT message is obtained.

=item C<OS2::Process_Messages(force [, cnt])>

Retrieval of PM messages until window creation/destruction.  
If C<force> is false, will not dispatch messages if a real message loop
is known to be present.

Returns change in number of windows.  If C<cnt> is given,
it is incremented by the number of messages retrieved.

Dies with "QUITing..." if WM_QUIT message is obtained.

=item C<OS2::_control87(new,mask)>

the same as L<_control87(3)> of EMX.  Takes integers as arguments, returns
the previous coprocessor control word as an integer.  Only bits in C<new> which
are present in C<mask> are changed in the control word.

=item OS2::get_control87()

gets the coprocessor control word as an integer.

=item C<OS2::set_control87_em(new=MCW_EM,mask=MCW_EM)>

The variant of OS2::_control87() with default values good for
handling exception mask: if no C<mask>, uses exception mask part of C<new>
only.  If no C<new>, disables all the floating point exceptions.

See L</"Misfeatures"> for details.

=item C<OS2::DLLname([how [, \&xsub]])>

Gives the information about the Perl DLL or the DLL containing the C
function bound to by C<&xsub>.  The meaning of C<how> is: default (2):
full name; 0: handle; 1: module name.

=back

(Note that some of these may be moved to different libraries -
eventually).


=head2 Prebuilt variables:

=over 4

=item $OS2::emx_rev

numeric value is the same as _emx_rev of EMX, a string value the same
as _emx_vprt (similar to C<0.9c>).

=item $OS2::emx_env

same as _emx_env of EMX, a number similar to 0x8001.

=item $OS2::os_ver

a number C<OS_MAJOR + 0.001 * OS_MINOR>.

=item $OS2::is_aout

true if the Perl library was compiled in AOUT format.

=item $OS2::can_fork

true if the current executable is an AOUT EMX executable, so Perl can
fork.  Do not use this, use the portable check for
$Config::Config{d_fork}.

=item $OS2::nsyserror

This variable (default is 1) controls whether to enforce the contents
of $^E to start with C<SYS0003>-like id.  If set to 0, then the string
value of $^E is what is available from the OS/2 message file.  (Some
messages in this file have an C<SYS0003>-like id prepended, some not.)

=back

=head2 Misfeatures

=over 4

=item *

Since L<flock(3)> is present in EMX, but is not functional, it is 
emulated by perl.  To disable the emulations, set environment variable
C<USE_PERL_FLOCK=0>.

=item *

Here is the list of things which may be "broken" on
EMX (from EMX docs):

=over 4

=item *

The functions L<recvmsg(3)>, L<sendmsg(3)>, and L<socketpair(3)> are not
implemented.

=item *

L<sock_init(3)> is not required and not implemented.

=item *

L<flock(3)> is not yet implemented (dummy function).  (Perl has a workaround.)

=item *

L<kill(3)>:  Special treatment of PID=0, PID=1 and PID=-1 is not implemented.

=item *

L<waitpid(3)>:

      WUNTRACED
	      Not implemented.
      waitpid() is not implemented for negative values of PID.

=back

Note that C<kill -9> does not work with the current version of EMX.

=item *

See L</"Text-mode filehandles">.

=item *

Unix-domain sockets on OS/2 live in a pseudo-file-system C</sockets/...>.
To avoid a failure to create a socket with a name of a different form,
C<"/socket/"> is prepended to the socket name (unless it starts with this
already).

This may lead to problems later in case the socket is accessed via the
"usual" file-system calls using the "initial" name.

=item *

Apparently, IBM used a compiler (for some period of time around '95?) which
changes FP mask right and left.  This is not I<that> bad for IBM's
programs, but the same compiler was used for DLLs which are used with
general-purpose applications.  When these DLLs are used, the state of
floating-point flags in the application is not predictable.

What is much worse, some DLLs change the floating point flags when in
_DLLInitTerm() (e.g., F<TCP32IP>).  This means that even if you do not I<call>
any function in the DLL, just the act of loading this DLL will reset your
flags.  What is worse, the same compiler was used to compile some HOOK DLLs.
Given that HOOK DLLs are executed in the context of I<all> the applications
in the system, this means a complete unpredictability of floating point
flags on systems using such HOOK DLLs.  E.g., F<GAMESRVR.DLL> of B<DIVE>
origin changes the floating point flags on each write to the TTY of a VIO
(windowed text-mode) application.

Some other (not completely debugged) situations when FP flags change include
some video drivers (?), and some operations related to creation of the windows.
People who code B<OpenGL> may have more experience on this.

Perl is generally used in the situation when all the floating-point
exceptions are ignored, as is the default under EMX.  If they are not ignored,
some benign Perl programs would get a C<SIGFPE> and would die a horrible death.

To circumvent this, Perl uses two hacks.  They help against I<one> type of
damage only: FP flags changed when loading a DLL.

One of the hacks is to disable floating point exceptions on Perl startup (as
is the default with EMX).  This helps only with compile-time-linked DLLs
changing the flags before main() had a chance to be called.

The other hack is to restore FP flags after a call to dlopen().  This helps
against similar damage done by DLLs' _DLLInitTerm() at runtime.  Currently
no way to switch these hacks off is provided.

=back
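
The masked update performed by C<OS2::_control87(new,mask)>, and the
save-and-restore pattern behind the second hack, can be sketched in C as
follows.  This is an illustration only: C<fake_control87> is a hypothetical
stand-in which works on an ordinary variable instead of the coprocessor, and
the values 0x037f (the control word after C<FINIT>) and 0x003f (the
exception mask bits) are used purely as sample data.

```c
#include <assert.h>

/* Sample value: the x87 control word after FINIT, all exceptions masked. */
static unsigned int fake_control_word = 0x037f;

/* Keep the bits of old outside mask, take the bits of newbits inside it. */
unsigned int
masked_update(unsigned int old, unsigned int newbits, unsigned int mask)
{
    return (old & ~mask) | (newbits & mask);
}

/* Hypothetical stand-in for _control87(): returns the previous word. */
unsigned int
fake_control87(unsigned int newbits, unsigned int mask)
{
    unsigned int previous = fake_control_word;

    fake_control_word = masked_update(previous, newbits, mask);
    return previous;
}

/* Save the word, let a simulated DLL load clobber it, then restore it. */
void
restore_after_load_demo(void)
{
    unsigned int saved = fake_control87(0, 0);  /* mask 0: query only */

    fake_control87(0x0000, 0x003f);             /* "DLL" unmasks everything */
    assert(fake_control87(0, 0) == 0x0340);
    fake_control87(saved, 0xffff);              /* put all the bits back */
    assert(fake_control87(0, 0) == saved);
}
```

With a mask of 0 no bit is changed, so the call only reports the current
control word.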

=head2 Modifications

Perl modifies some standard C library calls in the following ways:

=over 9

=item C<popen>

C<my_popen> uses F<sh.exe> if shell is required, cf. L</"C<PERL_SH_DIR>">.

=item C<tmpnam>

is created using C<TMP> or C<TEMP> environment variable, via
C<tempnam>.

=item C<tmpfile>

If the current directory is not writable, file is created using modified
C<tmpnam>, so there may be a race condition.

=item C<ctermid>

a dummy implementation.

=item C<stat>

C<os2_stat> special-cases F</dev/tty> and F</dev/con>.

=item C<mkdir>, C<rmdir>

these EMX functions do not work if the path contains a trailing C</>.
Perl contains a workaround for this.

=item C<flock>

Since L<flock(3)> is present in EMX, but is not functional, it is 
emulated by perl.  To disable the emulations, set environment variable
C<USE_PERL_FLOCK=0>.

=back

=head2 Identifying DLLs

All the DLLs built with the current versions of Perl have ID strings
identifying the name of the extension, its version, and the version
of Perl required for this DLL.  Run C<bldlevel DLL-name> to find this
info.

=head2 Centralized management of resources

Since to call certain OS/2 API one needs to have a correctly initialized
C<Win> subsystem, OS/2-specific extensions may require getting C<HAB>s and
C<HMQ>s.  If an extension would do it on its own, another extension could
fail to initialize.

Perl provides a centralized management of these resources:

=over

=item C<HAB>

To get the HAB, the extension should call C<hab = perl_hab_GET()> in C.  After
this call is performed, C<hab> may be accessed as C<Perl_hab>.  There is
no need to release the HAB after it is used.

If for some reason F<perl.h> cannot be included, use

  extern int Perl_hab_GET(void);

instead.

=item C<HMQ>

There are two cases:

=over

=item *

the extension needs an C<HMQ> only because some API will not work otherwise.
Use C<serve = 0> below.

=item *

the extension needs an C<HMQ> since it wants to engage in a PM event loop.
Use C<serve = 1> below.

=back

To get an C<HMQ>, the extension should call C<hmq = perl_hmq_GET(serve)> in C.
After this call is performed, C<hmq> may be accessed as C<Perl_hmq>.

To signal to Perl that HMQ is not needed any more, call
C<perl_hmq_UNSET(serve)>.  Perl process will automatically morph/unmorph itself
into/from a PM process if HMQ is needed/not-needed.  Perl will automatically
enable/disable C<WM_QUIT> message during shutdown if the message queue is
served/not-served.

B<NOTE>.  If during a shutdown there is a message queue which did not disable
WM_QUIT, and which did not process the received WM_QUIT message, the
shutdown will be automatically cancelled.  Do not call C<perl_hmq_GET(1)>
unless you are going to process messages on an orderly basis.

=item Treating errors reported by OS/2 API

There are two principal conventions (it is useful to call them C<Dos*>
and C<Win*> - though this part of the function signature is not always
determined by the name of the API) of reporting the error conditions
of OS/2 API.  Most of C<Dos*> APIs report the error code as the result
of the call (so 0 means success, and there are many types of errors).
Most of C<Win*> API report success/fail via the result being
C<TRUE>/C<FALSE>; to find the reason for the failure one should call
WinGetLastError() API.

Some C<Win*> entry points also overload a "meaningful" return value
with the error indicator; having a 0 return value indicates an error.
Yet some other C<Win*> entry points overload things even more, and 0
return value may mean a successful call returning a valid value 0, as
well as an error condition; in the case of a 0 return value one should
call WinGetLastError() API to distinguish a successful call from a
failing one.

By convention, all the calls to OS/2 API should indicate their
failures by resetting $^E.  All the Perl-accessible functions which
call OS/2 API may be broken into two classes: some die()s when an API
error is encountered, the other report the error via a false return
value (of course, this does not concern Perl-accessible functions
which I<expect> a failure of the OS/2 API call, having some workarounds
coded).

Obviously, in the situation of the last type of the signature of an OS/2
API, it is much more convenient for the users if the failure is
indicated by die()ing: one does not need to check $^E to know that
something went wrong.  If, however, this solution is not desirable for
some reason, the code in question should reset $^E to 0 before making
this OS/2 API call, so that the caller of this Perl-accessible
function has a chance to distinguish a success-but-0-return value from
a failure.  (One may return undef as an alternative way of reporting
an error.)

The macros to simplify this type of error propagation are

=over

=item C<CheckOSError(expr)>

Returns true on error, sets $^E.  Expects expr() to be a call of
C<Dos*>-style API.

=item C<CheckWinError(expr)>

Returns true on error, sets $^E.  Expects expr() to be a call of
C<Win*>-style API.

=item C<SaveWinError(expr)>

Returns C<expr>, sets $^E from WinGetLastError() if C<expr> is false.

=item C<SaveCroakWinError(expr,die,name1,name2)>

Returns C<expr>, sets $^E from WinGetLastError() if C<expr> is false,
and die()s if C<die> and $^E are true.  The message to die is the
concatenated strings C<name1> and C<name2>, separated by C<": "> from
the contents of $^E.

=item C<WinError_2_Perl_rc>

Sets C<Perl_rc> to the return value of WinGetLastError().

=item C<FillWinError>

Sets C<Perl_rc> to the return value of WinGetLastError(), and sets $^E
to the corresponding value.

=item C<FillOSError(rc)>

Sets C<Perl_rc> to C<rc>, and sets $^E to the corresponding value.

=back

=item Loading DLLs and ordinals in DLLs

Some DLLs are only present in some versions of OS/2, or in some
configurations of OS/2.  Some exported entry points are present only
in DLLs shipped with some versions of OS/2.  If these DLLs and entry
points were linked directly for a Perl executable/DLL or from a Perl
extension, this binary would work only with the specified
versions/setups.  Even if these entry points were not needed, the
I<load> of the executable (or DLL) would fail.

For example, many newer useful APIs are not present in OS/2 v2; many
PM-related APIs require DLLs not available on floppy-boot setup.

To make these calls fail I<only when the calls are executed>, one
should call these API via a dynamic linking API.  There is a subsystem
in Perl to simplify such type of calls.  A large number of entry
points available for such linking is provided (see C<entries_ordinals>
- and also C<PMWIN_entries> - in F<os2ish.h>).  These ordinals can be
accessed via the APIs:

 CallORD(), DeclFuncByORD(), DeclVoidFuncByORD(),
 DeclOSFuncByORD(), DeclWinFuncByORD(), AssignFuncPByORD(),
 DeclWinFuncByORD_CACHE(), DeclWinFuncByORD_CACHE_survive(),
 DeclWinFuncByORD_CACHE_resetError_survive(),
 DeclWinFunc_CACHE(), DeclWinFunc_CACHE_resetError(),
 DeclWinFunc_CACHE_survive(), DeclWinFunc_CACHE_resetError_survive()

See the header files and the C code in the supplied OS/2-related
modules for the details on usage of these functions.

Some of these functions also combine dynaloading semantic with the
error-propagation semantic discussed above.

=back
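
The two error-reporting conventions described under "Treating errors
reported by OS/2 API" can be sketched in C as follows.  This is a sketch
with made-up stand-ins, not the macros from F<os2ish.h>: every name in it
(C<fake_DosCall>, C<fake_WinCall>, C<fake_WinGetLastError>,
C<check_os_error>, C<check_win_error>, C<last_error>) is hypothetical, the
error codes are arbitrary sample values, and C<last_error> merely plays the
role of $^E.

```c
#include <assert.h>

static unsigned long last_error;       /* hypothetical stand-in for $^E */
static unsigned long fake_win_error;   /* reported by fake_WinGetLastError */

unsigned long fake_WinGetLastError(void) { return fake_win_error; }

/* Dos*-style stand-in: returns an error code, 0 means success. */
unsigned long fake_DosCall(int fail) { return fail ? 2UL : 0UL; }

/* Win*-style stand-in: returns TRUE (1) or FALSE (0). */
int fake_WinCall(int fail)
{
    fake_win_error = fail ? 0x1001UL : 0UL;
    return !fail;
}

/* True on error; records the code returned by a Dos*-style call. */
int check_os_error(unsigned long rc)
{
    if (rc != 0)
        last_error = rc;
    return rc != 0;
}

/* True on error; asks the Win subsystem for the reason of the failure. */
int check_win_error(int ok)
{
    if (!ok)
        last_error = fake_WinGetLastError();
    return !ok;
}

/* Usage example covering both conventions. */
void error_convention_demo(void)
{
    assert(!check_os_error(fake_DosCall(0)));
    assert(check_os_error(fake_DosCall(1)) && last_error == 2UL);
    assert(!check_win_error(fake_WinCall(0)));
    assert(check_win_error(fake_WinCall(1)) && last_error == 0x1001UL);
}
```

Both wrappers give the caller the same "true on error" result, which is what
lets code above them ignore which convention the underlying API follows.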

=head1 Perl flavors

Because of idiosyncrasies of OS/2 one cannot have all the eggs in the
same basket (though the EMX environment tries hard to overcome these
limitations, so the situation may somehow improve). There are 4
executables for Perl provided by the distribution:

=head2 F<perl.exe>

The main workhorse. This is a chimera executable: it is compiled as an
C<a.out>-style executable, but is linked with C<omf>-style dynamic
library F<perl.dll>, and with dynamic CRT DLL. This executable is a
VIO application.

It can load perl dynamic extensions, and it can fork().

B<Note.> Keep in mind that fork() is needed to open a pipe to yourself.

=head2 F<perl_.exe>

This is a statically linked C<a.out>-style executable. It cannot
load dynamic Perl extensions. The executable supplied in binary
distributions has a lot of extensions prebuilt, thus the above restriction is 
important only if you use custom-built extensions. This executable is a VIO
application.

I<This is the only executable which does not require OS/2.> The
friends locked into C<M$> world would appreciate the fact that this
executable runs under DOS, Win0.3*, Win0.95 and WinNT with an
appropriate extender. See L</"Other OSes">.

=head2 F<perl__.exe>

This is the same executable as F<perl___.exe>, but it is a PM
application. 

B<Note.> Usually (unless explicitly redirected during the startup)
STDIN, STDERR, and STDOUT of a PM
application are redirected to F<nul>. However, it is possible to I<see>
them if you start C<perl__.exe> from a PM program which emulates a
console window, like I<Shell mode> of Emacs or EPM. Thus it I<is
possible> to use Perl debugger (see L<perldebug>) to debug your PM
application (but beware of the message loop lockups - this will not
work if you have a message queue to serve, unless you hook the serving
into the getc() function of the debugger).

Another way to see the output of a PM program is to run it as

  pm_prog args 2>&1 | cat -

with a shell I<different> from F<cmd.exe>, so that it does not create
a link between a VIO session and the session of C<pm_prog>.  (Such a link
closes the VIO window.)  E.g., this works with F<sh.exe> - or with Perl!

  open P, 'pm_prog args 2>&1 |' or die;
  print while <P>;

The flavor F<perl__.exe> is required if you want to start your program without
a VIO window present, but not C<detach>ed (run C<help detach> for more info).
Very useful for extensions which use PM, like C<Perl/Tk> or C<OpenGL>.

Note also that the differences between PM and VIO executables are only
in the I<default> behaviour.  One can start I<any> executable in
I<any> kind of session by using the C</fs>, C</pm> or
C</win> switches of the command C<start> (of F<CMD.EXE> or a similar
shell).  Alternatively, one can use the numeric first argument of the
C<system> Perl function (see L<OS2::Process>).

=head2 F<perl___.exe>

This is an C<omf>-style executable which is dynamically linked to
F<perl.dll> and CRT DLL. I know no advantages of this executable
over C<perl.exe>, but it cannot fork() at all. Well, one advantage is
that the build process is not so convoluted as with C<perl.exe>.

It is a VIO application.

=head2 Why strange names?

Since Perl processes the C<#!>-line (cf. 
L<perlrun/DESCRIPTION>, L<perlrun/Command Switches>,
L<perldiag/"No Perl script found in input">), it should know when a
program I<is a Perl>. There is some naming convention which allows
Perl to distinguish correct lines from wrong ones. The above names are
almost the only names allowed by this convention which do not contain
digits (which have absolutely different semantics).

=head2 Why dynamic linking?

Well, having several executables dynamically linked to the same huge
library has its advantages, but this would not substantiate the
additional work to make it compile. The reason is the complicated-to-developers
but very quick and convenient-to-users "hard" dynamic linking used by OS/2.

There are two distinctive features of the dyna-linking model of OS/2:
first, all the references to external functions are resolved at the compile time;
second, there is no runtime fixup of the DLLs after they are loaded into memory.
The first feature is an enormous advantage over other models: it avoids
conflicts when several DLLs used by an application export entries with
the same name.  In such cases "other" models of dyna-linking just choose
between these two entry points using some random criterion - with predictable
disasters as results.  But it is the second feature which requires the build
of F<perl.dll>.

The address tables of DLLs are patched only once, when they are
loaded. The addresses of the entry points into DLLs are guaranteed to be
the same for all the programs which use the same DLL.  This removes the
runtime fixup - once DLL is loaded, its code is read-only.

While this allows some (significant?) performance advantages, this makes life
much harder for developers, since the above scheme makes it impossible
for a DLL to be "linked" to a symbol in the F<.EXE> file.  Indeed, this
would need a DLL to have different relocations tables for the
(different) executables which use this DLL.

However, a dynamically loaded Perl extension is forced to use some symbols
from the perl
executable, e.g., to know how to find the arguments to the functions:
the arguments live on the perl
internal evaluation stack. The solution is to put the main code of
the interpreter into a DLL, and make the F<.EXE> file which just loads
this DLL into memory and supplies command-arguments.  The extension DLL
cannot link to symbols in F<.EXE>, but it has no problem linking
to symbols in the F<.DLL>.

This I<greatly> increases the load time for the application (as well as
complexity of the compilation). Since interpreter is in a DLL,
the C RTL is basically forced to reside in a DLL as well (otherwise
extensions would not be able to use CRT).  There are some advantages if
you use different flavors of perl, such as running F<perl.exe> and
F<perl__.exe> simultaneously: they share the memory of F<perl.dll>.

B<NOTE>.  There is one additional effect which makes DLLs more wasteful:
DLLs are loaded in the shared memory region, which is a scarce resource
given the 512M barrier of the "standard" OS/2 virtual memory.  The code of
F<.EXE> files is also shared by all the processes which use the particular
F<.EXE>, but they are "shared in the private address space of the process";
this is possible because the address at which different sections
of the F<.EXE> file are loaded is decided at compile-time, thus all the
processes have these sections loaded at same addresses, and no fixup
of internal links inside the F<.EXE> is needed.

Since DLLs may be loaded at run time, to have the same mechanism for DLLs
one needs to have the address range of I<any of the loaded> DLLs in the
system to be available I<in all the processes> which did not load a particular
DLL yet.  This is why the DLLs are mapped to the shared memory region.

=head2 Why chimera build?

Current EMX environment does not allow DLLs compiled using Unixish
C<a.out> format to export symbols for data (or at least some types of
data). This forces C<omf>-style compile of F<perl.dll>.

Current EMX environment does not allow F<.EXE> files compiled in
C<omf> format to fork(). fork() is needed for exactly three Perl
operations:

=over 4

=item *

explicit fork() in the script, 

=item *

C<open FH, "|-">

=item *

C<open FH, "-|">, in other words, opening pipes to itself.

=back

While these operations are not questions of life and death, they are
needed for a lot of
useful scripts. This forces C<a.out>-style compile of
F<perl.exe>.


=head1 ENVIRONMENT

Here we list environment variables which are either OS/2- and DOS- and
Win*-specific, or are more important under OS/2 than under other OSes.

=head2 C<PERLLIB_PREFIX>

Specific for EMX port. Should have the form

  path1;path2

or

  path1 path2

If the beginning of some prebuilt path matches F<path1>, it is
substituted with F<path2>.

Should be used if the perl library is moved from the default
location in preference to C<PERL(5)LIB>, since this would not leave wrong
entries in @INC.  For example, if the compiled version of perl looks for @INC
in F<f:/perllib/lib>, and you want to install the library in
F<h:/opt/gnu>, do

  set PERLLIB_PREFIX=f:/perllib/lib;h:/opt/gnu

This will cause Perl with the prebuilt @INC of

  f:/perllib/lib/5.00553/os2
  f:/perllib/lib/5.00553
  f:/perllib/lib/site_perl/5.00553/os2
  f:/perllib/lib/site_perl/5.00553
  .

to use the following @INC:

  h:/opt/gnu/5.00553/os2
  h:/opt/gnu/5.00553
  h:/opt/gnu/site_perl/5.00553/os2
  h:/opt/gnu/site_perl/5.00553
  .
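
The substitution itself is done inside perl, but its effect on the paths
above can be mimicked in a few lines.  This is only a sketch: it assumes a
plain, case-sensitive match at the beginning of each path, and the variable
names are invented for the example.

    use strict;
    use warnings;

    # The setting from the example above: old prefix, then new prefix.
    my $setting = 'f:/perllib/lib;h:/opt/gnu';
    my ($from, $to) = split /[; ]/, $setting, 2;

    my @prebuilt = qw(
        f:/perllib/lib/5.00553/os2
        f:/perllib/lib/5.00553
        f:/perllib/lib/site_perl/5.00553/os2
        f:/perllib/lib/site_perl/5.00553
        .
    );

    # Replace the prefix only where a path starts with it.
    my @effective = map { my $p = $_; $p =~ s/^\Q$from\E/$to/; $p } @prebuilt;
    print "$_\n" for @effective;

Entries which do not start with the old prefix, such as F<.>, pass through
unchanged.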

=head2 C<PERL_BADLANG>

If 0, perl ignores setlocale() failing. May be useful with some
strange I<locale>s.

=head2 C<PERL_BADFREE>

If 0, perl would not warn in case of an unwarranted free(). With older
perls this might be useful in conjunction with the module DB_File, which
was buggy when dynamically linked and OMF-built.

Should not be set with newer Perls, since this may hide some I<real> problems.

=head2 C<PERL_SH_DIR>

Specific for EMX port. Gives the directory part of the location for
F<sh.exe>.

=head2 C<USE_PERL_FLOCK>

Specific for EMX port. Since L<flock(3)> is present in EMX, but is not 
functional, it is emulated by perl.  To disable the emulations, set 
environment variable C<USE_PERL_FLOCK=0>.

=head2 C<TMP> or C<TEMP>

Specific for EMX port. Used as storage place for temporary files.

=head1 Evolution

Here we list major changes which could take you by surprise.

=head2 Text-mode filehandles

Starting from version 5.8, Perl uses a builtin translation layer for
text-mode files.  This replaces the efficient well-tested EMX layer by
some code which should be best characterized as a "quick hack".

In addition to possible bugs and an inability to follow changes to the
translation policy with off/on switches of TERMIO translation, this
introduces a serious incompatible change: before sysread() on
text-mode filehandles would go through the translation layer, now it
would not.

=head2 Priorities

C<setpriority> and C<getpriority> are not compatible with earlier
ports by Andreas Kaiser. See C<"setpriority, getpriority">.

=head2 DLL name mangling: pre 5.6.2

With the release 5.003_01 the dynamically loadable libraries
should be rebuilt when a different version of Perl is compiled. In particular,
DLLs (including F<perl.dll>) are now created with the names
which contain a checksum, thus allowing workaround for OS/2 scheme of
caching DLLs.

It may be possible to code a simple workaround which would 

=over

=item *

find the old DLLs looking through the old @INC;

=item *

mangle the names according to the scheme of new perl and copy the DLLs to
these names;

=item *

edit the internal C<LX> tables of DLL to reflect the change of the name
(probably not needed for Perl extension DLLs, since the internally coded names
are not used for "specific" DLLs, they used only for "global" DLLs).

=item *

edit the internal C<IMPORT> tables and change the name of the "old"
F<perl????.dll> to the "new" F<perl????.dll>.

=back

=head2 DLL name mangling: 5.6.2 and beyond

In fact mangling of I<extension> DLLs was done due to misunderstanding
of the OS/2 dynaloading model.  OS/2 (effectively) maintains two
different tables of loaded DLLs:

=over

=item Global DLLs

those loaded by the base name from C<LIBPATH>; including those
associated at link time;

=item specific DLLs

loaded by the full name.

=back

When resolving a request for a global DLL, the table of already-loaded
specific DLLs is (effectively) ignored; moreover, specific DLLs are
I<always> loaded from the prescribed path.

There is/was a minor twist which makes this scheme fragile: what to do
with DLLs loaded from

=over

=item C<BEGINLIBPATH> and C<ENDLIBPATH>

(which depend on the process)

=item F<.> from C<LIBPATH>

which I<effectively> depends on the process (although C<LIBPATH> is the
same for all the processes).

=back

Unless C<LIBPATHSTRICT> is set to C<T> (and the kernel is after
2000/09/01), such DLLs are considered to be global.  When loading a
global DLL it is first looked up in the table of already-loaded global
DLLs.  Because of this the fact that one executable loaded a DLL from
C<BEGINLIBPATH> and C<ENDLIBPATH>, or F<.> from C<LIBPATH> may affect
I<which> DLL is loaded when I<another> executable requests a DLL with
the same name.  I<This> is the reason for version-specific mangling of
the DLL name for perl DLL.

Since the Perl extension DLLs are always loaded with the full path,
there is no need to mangle their names in a version-specific way:
their directory already reflects the corresponding version of perl,
and @INC takes into account binary compatibility with older version.
Starting from C<5.6.2> the name mangling scheme is fixed to be the
same as for Perl 5.005_53 (same as in a popular binary release).  Thus
new Perls will be able to I<resolve the names> of old extension DLLs
if @INC allows finding their directories.

However, this still does not guarantee that these DLLs may be loaded.
The reason is the mangling of the name of the I<Perl DLL>.  And since
the extension DLLs link with the Perl DLL, extension DLLs for older
versions would load an older Perl DLL, and would most probably
segfault (since the data in this DLL is not properly initialized).

There is a partial workaround (which can be made complete with newer
OS/2 kernels): create a forwarder DLL with the same name as the DLL of
the older version of Perl, which forwards the entry points to the
newer Perl's DLL.  Make this DLL accessible on (say) the C<BEGINLIBPATH> of
the new Perl executable.  When the new executable accesses old Perl's
extension DLLs, they would request the old Perl's DLL by name, get the
forwarder instead, so effectively will link with the currently running
(new) Perl DLL.

This may break in two ways:

=over

=item *

An old perl executable is started while a running new executable has
loaded an extension compiled for the old executable (ouph!).  In this
case the old executable will get a forwarder DLL instead of the old
perl DLL, so would link with the new perl DLL.  While not directly
fatal, it will behave the same as the new executable.  This defeats the
whole purpose of explicitly starting an old executable.

=item *

A new executable loads an extension compiled for the old executable
when an old perl executable is running.  In this case the extension
will not pick up the forwarder - with fatal results.

=back

With support for C<LIBPATHSTRICT> this may be circumvented - unless
one of DLLs is started from F<.> from C<LIBPATH> (I do not know
whether C<LIBPATHSTRICT> affects this case).

B<REMARK>.  Unless newer kernels allow F<.> in C<BEGINLIBPATH> (older
do not), this mess cannot be completely cleaned.  (It turns out that
as of the beginning of 2002, F<.> is not allowed, but F<.\.> is - and
it has the same effect.)


B<REMARK>.  C<LIBPATHSTRICT>, C<BEGINLIBPATH> and C<ENDLIBPATH> are
not environment variables, although F<cmd.exe> emulates them on C<SET
...> lines.  From Perl they may be accessed by
L<Cwd::extLibpath|/Cwd::extLibpath([type])> and
L<Cwd::extLibpath_set|/Cwd::extLibpath_set( path [, type ] )>.

=head2 DLL forwarder generation

Assume that the old DLL is named F<perlE0AC.dll> (as is the one for
5.005_53), and the new version is 5.6.1.  Create a file
F<perl5shim.def-leader> with

  LIBRARY 'perlE0AC' INITINSTANCE TERMINSTANCE
  DESCRIPTION '@#perl5-porters@perl.org:5.006001#@ Perl module for 5.00553 -> Perl 5.6.1 forwarder'
  CODE LOADONCALL
  DATA LOADONCALL NONSHARED MULTIPLE
  EXPORTS

modifying the versions/names as needed.  Run

 perl -wnle "next if 0../EXPORTS/; print qq(  \"$1\")
                                          if /\"(\w+)\"/" perl5.def >lst

in the Perl build directory (to make the DLL smaller replace perl5.def
with the definition file for the older version of Perl if present).

 cat perl5shim.def-leader lst >perl5shim.def
 gcc -Zomf -Zdll -o perlE0AC.dll perl5shim.def -s -llibperl

(ignore multiple C<warning L4085>).

=head2 Threading

As of release 5.003_01 perl is linked to the multithreaded C RTL
DLL.  If perl itself is not compiled multithread-enabled, neither is perl's
malloc().  However, extensions may use multiple threads at their own
risk.

This was needed to compile C<Perl/Tk> for XFree86-OS/2 out-of-the-box, and
link with DLLs for other useful libraries, which typically are compiled
with C<-Zmt -Zcrtdll>.

=head2 Calls to external programs

Due to popular demand the perl external program calling has been
changed wrt Andreas Kaiser's port.  I<If> perl needs to call an
external program I<via shell>, the F<f:/bin/sh.exe> will be called, or
whatever is the override, see L</"C<PERL_SH_DIR>">.

This means that you need to get some copy of F<sh.exe> as well (I
use one from pdksh). The path F<F:/bin> above is set up automatically during
the build to a correct value on the builder machine, but is
overridable at runtime.

B<Reasons:> a consensus on C<perl5-porters> was that perl should use
one non-overridable shell per platform. The obvious choices for OS/2
are F<cmd.exe> and F<sh.exe>. Having perl build itself would be impossible
with F<cmd.exe> as a shell, thus I picked up C<sh.exe>. This assures almost
100% compatibility with the scripts coming from *nix. As an added benefit 
this works as well under DOS if you use DOS-enabled port of pdksh 
(see L</Prerequisites>).

B<Disadvantages:> currently F<sh.exe> of pdksh calls external programs
via fork()/exec(), and there is I<no> functioning exec() on
OS/2. exec() is emulated by EMX by an asynchronous call while the caller
waits for child completion (to pretend that the C<pid> did not change). This
means that 1 I<extra> copy of F<sh.exe> is made active via fork()/exec(),
which may lead to some resources taken from the system (even if we do
not count extra work needed for fork()ing).

Note that this is a lesser issue now when we do not spawn F<sh.exe>
unless needed (metachars found).

One can always start F<cmd.exe> explicitly via

  system 'cmd', '/c', 'mycmd', 'arg1', 'arg2', ...

If you need to use F<cmd.exe>, and do not want to hand-edit thousands of your
scripts, the long-term solution proposed on p5-p is to have a directive

  use OS2::Cmd;

which will override system(), exec(), C<``>, and
C<open(,'...|')>. With current perl you may override only system(),
readpipe() - the explicit version of C<``>, and maybe exec(). The code
will substitute the one-argument call to system() by
C<CORE::system('cmd.exe', '/c', shift)>.

If you have some working code for C<OS2::Cmd>, please send it to me,
I will include it into distribution. I have no need for such a module, so
cannot test it.
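
As a starting point only, the system() part of such a module might look
like the following sketch.  It is untested on OS/2, and C<OS2::Cmd> remains
a hypothetical module which does not ship with perl.  The sketch relies on
the standard C<CORE::GLOBAL::> override mechanism, so it affects every
package compiled after the module is loaded, and it leaves exec(), C<``>
and piped open() alone.

    package OS2::Cmd;       # hypothetical module name taken from the text
    use strict;
    use warnings;

    sub import {
        no warnings 'once';
        # Send the one-argument form through cmd.exe; pass lists unchanged.
        *CORE::GLOBAL::system = sub {
            return CORE::system('cmd.exe', '/c', $_[0]) if @_ == 1;
            return CORE::system(@_);
        };
    }

    1;

Saved as F<OS2/Cmd.pm> somewhere in @INC, it would be enabled by
C<use OS2::Cmd;> at the top of a script.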

For the details of the current situation with calling external programs,
see L<Starting OSE<sol>2 (and DOS) programs under Perl>.  Let us mention a couple
of features:

=over 4

=item *

External scripts may be called by their basename.  Perl will try the same
extensions as when processing B<-S> command-line switch.

=item *

External scripts starting with C<#!> or C<extproc > will be executed directly,
without calling the shell, by calling the program specified on the rest of
the first line.

=back

=head2 Memory allocation

Perl uses its own malloc() under OS/2 - interpreters are usually malloc-bound
for speed, but perl is not, since its malloc is lightning-fast.
Perl-memory-usage-tuned benchmarks show that Perl's malloc is 5 times quicker
than EMX one.  I do not have convincing data about memory footprint, but
a (pretty random) benchmark showed that Perl's one is 5% better.

Combination of perl's malloc() and rigid DLL name resolution creates
a special problem with library functions which expect their return value to
be free()d by system's free(). To facilitate extensions which need to call 
such functions, system memory-allocation functions are still available with
the prefix C<emx_> added. (Currently only DLL perl has this, it should 
propagate to F<perl_.exe> shortly.)

=head2 Threads

One can build perl with thread support enabled by providing C<-D usethreads>
option to F<Configure>.  Currently OS/2 support of threads is very 
preliminary.

Most notable problems: 

=over 4

=item C<COND_WAIT> 

may have a race condition (but probably does not due to edge-triggered
nature of OS/2 Event semaphores).  (Needs a reimplementation (in terms of chaining
waiting threads, with the linked list stored in per-thread structure?)?)

=item F<os2.c>

has a couple of static variables used in OS/2-specific functions.  (Need to be
moved to per-thread structure, or serialized?)

=back

Note that these problems should not discourage experimenting, since they
have a low probability of affecting small programs.

=head1 BUGS

This description is not updated often (since 5.6.1?), see F<./os2/Changes>
for more info.

=cut

OS/2 extensions
~~~~~~~~~~~~~~~
I include 3 extensions by Andreas Kaiser, OS2::REXX, OS2::UPM, and OS2::FTP, 
into my ftp directory, mirrored on CPAN. I made
some minor changes needed to compile them by standard tools. I cannot 
test UPM and FTP, so I will appreciate your feedback. Other extensions
there are OS2::ExtAttr, OS2::PrfDB for tied access to EAs and .INI
files - and maybe some other extensions at the time you read it.

Note that OS2 perl defines 2 pseudo-extension functions
OS2::Copy::copy and DynaLoader::mod2fname (many more now, see
L</Prebuilt methods>).

The -R switch of older perl is deprecated. If you need to call a REXX code
which needs access to variables, include the call into a REXX compartment
created by 
	REXX_call {...block...};

Two new functions are supported by REXX code, 
	REXX_eval 'string';
	REXX_eval_with 'string', REXX_function_name => \&perl_sub_reference;

If you have some other extensions you want to share, send the code to
me.  At least two are available: tied access to EA's, and tied access
to system databases.

=head1 AUTHOR

Ilya Zakharevich, cpan@ilyaz.org

=head1 SEE ALSO

perl(1).

=cut

perldsc.pod000064400000062016150344123440006711 0ustar00=head1 NAME
X<data structure> X<complex data structure> X<struct>

perldsc - Perl Data Structures Cookbook

=head1 DESCRIPTION

Perl lets us have complex data structures.  You can write something like
this and all of a sudden, you'd have an array with three dimensions!

    for my $x (1 .. 10) {
        for my $y (1 .. 10) {
            for my $z (1 .. 10) {
                $AoA[$x][$y][$z] =
                    $x ** $y + $z;
            }
        }
    }

Alas, however simple this may appear, underneath it's a much more
elaborate construct than meets the eye!

How do you print it out?  Why can't you say just C<print @AoA>?  How do
you sort it?  How can you pass it to a function or get one of these back
from a function?  Is it an object?  Can you save it to disk to read
back later?  How do you access whole rows or columns of that matrix?  Do
all the values have to be numeric?

As you see, it's quite easy to become confused.  While some small portion
of the blame for this can be attributed to the reference-based
implementation, it's really more due to a lack of existing documentation with
examples designed for the beginner.

This document is meant to be a detailed but understandable treatment of the
many different sorts of data structures you might want to develop.  It
should also serve as a cookbook of examples.  That way, when you need to
create one of these complex data structures, you can just pinch, pilfer, or
purloin a drop-in example from here.

Let's look at each of these possible constructs in detail.  There are separate
sections on each of the following:

=over 5

=item * arrays of arrays

=item * hashes of arrays

=item * arrays of hashes

=item * hashes of hashes

=item * more elaborate constructs

=back

But for now, let's look at general issues common to all
these types of data structures.

=head1 REFERENCES
X<reference> X<dereference> X<dereferencing> X<pointer>

The most important thing to understand about all data structures in
Perl--including multidimensional arrays--is that even though they might
appear otherwise, Perl C<@ARRAY>s and C<%HASH>es are all internally
one-dimensional.  They can hold only scalar values (meaning a string,
number, or a reference).  They cannot directly contain other arrays or
hashes, but instead contain I<references> to other arrays or hashes.
X<multidimensional array> X<array, multidimensional>

You can't use a reference to an array or hash in quite the same way that you
would a real array or hash.  For C or C++ programmers unused to
distinguishing between arrays and pointers to the same, this can be
confusing.  If so, just think of it as the difference between a structure
and a pointer to a structure.

You can (and should) read more about references in L<perlref>.
Briefly, references are rather like pointers that know what they
point to.  (Objects are also a kind of reference, but we won't be needing
them right away--if ever.)  This means that when you have something which
looks to you like an access to a two-or-more-dimensional array and/or hash,
what's really going on is that the base type is
merely a one-dimensional entity that contains references to the next
level.  It's just that you can I<use> it as though it were a
two-dimensional one.  This is actually the way almost all C
multidimensional arrays work as well.

    $array[7][12]                       # array of arrays
    $array[7]{string}                   # array of hashes
    $hash{string}[7]                    # hash of arrays
    $hash{string}{'another string'}     # hash of hashes

Now, because the top level contains only references, if you try to print
out your array with a simple print() function, you'll get something
that doesn't look very nice, like this:

    my @AoA = ( [2, 3], [4, 5, 7], [0] );
    print $AoA[1][2];
  7
    print @AoA;
  ARRAY(0x83c38)ARRAY(0x8b194)ARRAY(0x8b1d0)


That's because Perl doesn't (ever) implicitly dereference your variables.
If you want to get at the thing a reference is referring to, then you have
to do this yourself using either prefix typing indicators, like
C<${$blah}>, C<@{$blah}>, C<@{$blah[$i]}>, or else postfix pointer arrows,
like C<$a-E<gt>[3]>, C<$h-E<gt>{fred}>, or even C<$ob-E<gt>method()-E<gt>[3]>.
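
Here is a short sketch of both styles applied to the @AoA from the example
above; only the variable C<$row> is new.

    use strict;
    use warnings;

    my @AoA = ( [2, 3], [4, 5, 7], [0] );

    # Prefix form: @{...} turns each row reference back into a list.
    for my $row (@AoA) {
        print "@{$row}\n";
    }

    # Arrow form: the same element reached two ways.
    print $AoA[1]->[2], "\n";          # arrow written out
    print $AoA[1][2], "\n";            # arrow between subscripts is optional
    print scalar @{ $AoA[1] }, "\n";   # number of elements in row 1

The loop prints one row per line instead of the C<ARRAY(0x...)> strings.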

=head1 COMMON MISTAKES

The two most common mistakes made in constructing something like
an array of arrays are either accidentally counting the number of
elements or else taking a reference to the same memory location
repeatedly.  Here's the case where you just get the count instead
of a nested array:

    for my $i (1..10) {
        my @array = somefunc($i);
        $AoA[$i] = @array;      # WRONG!
    }

That's just the simple case of assigning an array to a scalar and getting
its element count.  If that's what you really and truly want, then you
might do well to consider being a tad more explicit about it, like this:

    for my $i (1..10) {
        my @array = somefunc($i);
        $counts[$i] = scalar @array;
    }

Here's the case of taking a reference to the same memory location
again and again:

    # Either without strict or having an outer-scope my @array;
    # declaration.

    for my $i (1..10) {
        @array = somefunc($i);
        $AoA[$i] = \@array;     # WRONG!
    }

So, what's the big problem with that?  It looks right, doesn't it?
After all, I just told you that you need an array of references, so by
golly, you've made me one!

Unfortunately, while this is true, it's still broken.  All the references
in @AoA refer to the I<very same place>, and they will therefore all hold
whatever was last in @array!  It's similar to the problem demonstrated in
the following C program:

    #include <stdio.h>
    #include <pwd.h>

    int main(void) {
        struct passwd *rp, *dp;
        rp = getpwnam("root");
        dp = getpwnam("daemon");

        printf("daemon name is %s\nroot name is %s\n",
                dp->pw_name, rp->pw_name);
    }

Which will print

    daemon name is daemon
    root name is daemon

The problem is that both C<rp> and C<dp> are pointers to the same location
in memory!  In C, you'd have to remember to malloc() yourself some new
memory.  In Perl, you'll want to use the array constructor C<[]> or the
hash constructor C<{}> instead.   Here's the right way to do the preceding
broken code fragments:
X<[]> X<{}>

    # Either without strict or having an outer-scope my @array;
    # declaration.

    for my $i (1..10) {
        @array = somefunc($i);
        $AoA[$i] = [ @array ];
    }

The square brackets make a reference to a new array with a I<copy>
of what's in @array at the time of the assignment.  This is what
you want.

Note that this will produce something similar, but it's
much harder to read:

    # Either without strict or having an outer-scope my @array;
    # declaration.
    for my $i (1..10) {
        @array = 0 .. $i;
        @{$AoA[$i]} = @array;
    }

Is it the same?  Well, maybe so--and maybe not.  The subtle difference
is that when you assign something in square brackets, you know for sure
it's always a brand new reference with a new I<copy> of the data.
Something else could be going on in this new case with the C<@{$AoA[$i]}>
dereference on the left-hand-side of the assignment.  It all depends on
whether C<$AoA[$i]> had been undefined to start with, or whether it
already contained a reference.  If you had already populated @AoA with
references, as in

    $AoA[3] = \@another_array;

Then the assignment with the indirection on the left-hand-side would
use the existing reference that was already there:

    @{$AoA[3]} = @array;

Of course, this I<would> have the "interesting" effect of clobbering
@another_array.  (Have you ever noticed how when a programmer says
something is "interesting", that rather than meaning "intriguing",
they're disturbingly more apt to mean that it's "annoying",
"difficult", or both?  :-)

So just remember always to use the array or hash constructors with C<[]>
or C<{}>, and you'll be fine, although it's not always optimally
efficient.
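
The clobbering effect described above is easy to reproduce.  In this small
sketch the contents of the arrays are made up for the illustration:

    use strict;
    use warnings;

    my @another_array = (1, 2, 3);
    my @array         = ("a", "b");
    my @AoA;

    $AoA[3] = \@another_array;     # slot already holds a reference
    @{ $AoA[3] } = @array;         # assigns through that reference

    print "@another_array\n";      # prints "a b"

    $AoA[3] = [ @array ];          # the constructor makes a fresh copy
    push @{ $AoA[3] }, "c";
    print "@another_array\n";      # still "a b"

Once the slot holds a copy made with C<[]>, changes made through it no
longer reach @another_array.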

Surprisingly, the following dangerous-looking construct will
actually work out fine:

    for my $i (1..10) {
        my @array = somefunc($i);
        $AoA[$i] = \@array;
    }

That's because my() is more of a run-time statement than it is a
compile-time declaration I<per se>.  This means that the my() variable is
remade afresh each time through the loop.  So even though it I<looks> as
though you stored the same variable reference each time, you actually did
not!  This is a subtle distinction that can produce more efficient code at
the risk of misleading all but the most experienced of programmers.  So I
usually advise against teaching it to beginners.  In fact, except for
passing arguments to functions, I seldom like to see the gimme-a-reference
operator (backslash) used much at all in code.  Instead, I advise
beginners that they (and most of the rest of us) should try to use the
much more easily understood constructors C<[]> and C<{}> instead of
relying upon lexical (or dynamic) scoping and hidden reference-counting to
do the right thing behind the scenes.
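
To see that each pass through the loop really does get its own array, the
stored references can be compared.  This is a sketch in which a simple range
stands in for somefunc():

    use strict;
    use warnings;

    my @AoA;
    for my $i (1 .. 3) {
        my @array = (1 .. $i);     # stand-in for somefunc($i)
        $AoA[$i] = \@array;
    }

    # References compare equal only if they point to the same array.
    my $same = $AoA[1] == $AoA[2] ? "same" : "different";
    print "rows 1 and 2 are $same arrays\n";
    print "row $_: @{ $AoA[$_] }\n" for 1 .. 3;

The rows are reported as different arrays, and each one keeps its own
contents.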

In summary:

    $AoA[$i] = [ @array ];     # usually best
    $AoA[$i] = \@array;        # perilous; just how my() was that array?
    @{ $AoA[$i] } = @array;    # way too tricky for most programmers
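
The first two forms can be compared side by side when the array lives in an
outer scope.  In this sketch the names @shared and @copied are invented for
the example:

    use strict;
    use warnings;

    my @array;                     # one array, declared outside the loop
    my (@shared, @copied);

    for my $i (1 .. 3) {
        @array = ($i) x $i;
        push @shared, \@array;     # the same array every time
        push @copied, [ @array ];  # a new copy every time
    }

    print "shared: @$_\n" for @shared;   # "3 3 3" three times
    print "copied: @$_\n" for @copied;   # "1", then "2 2", then "3 3 3"

Every element of @shared shows whatever was last put into @array, while
@copied keeps the three different rows.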


=head1 CAVEAT ON PRECEDENCE
X<dereference, precedence> X<dereferencing, precedence>

Speaking of things like C<@{$AoA[$i]}>, the following are actually the
same thing:
X<< -> >>

    $aref->[2][2]       # clear
    $$aref[2][2]        # confusing

That's because Perl's precedence rules on its five prefix dereferencers
(which look like someone swearing: C<$ @ * % &>) make them bind more
tightly than the postfix subscripting brackets or braces!  This will no
doubt come as a great shock to the C or C++ programmer, who is quite
accustomed to using C<*a[i]> to mean what's pointed to by the I<i'th>
element of C<a>.  That is, they first take the subscript, and only then
dereference the thing at that subscript.  That's fine in C, but this isn't C.

The seemingly equivalent construct in Perl, C<$$aref[$i]>, first does
the deref of $aref, making it take $aref as a reference to an
array, and then dereference that, and finally tell you the I<i'th> value
of the array pointed to by $aref.  If you wanted the C notion, you'd have to
write C<${$AoA[$i]}> to force the C<$AoA[$i]> to get evaluated first
before the leading C<$> dereferencer.
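
The two readings can be tried out directly.  In this sketch @refs is an
invented array of scalar references, standing in for the array in the
C-style reading:

    use strict;
    use warnings;

    my $aref = [ "a", "b", "c" ];
    my @refs = ( \"first", \"second" );   # elements are scalar references

    # Perl's reading: dereference $aref first, then subscript.
    my $perl_way = $$aref[1];             # same as $aref->[1]

    # The C reading: subscript @refs first, then dereference.
    my $c_way = ${ $refs[1] };

    print "$perl_way $c_way\n";           # prints "b second"

The braces in the second form are what force the subscript to be applied
before the dereference.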

=head1 WHY YOU SHOULD ALWAYS C<use strict>

If this is starting to sound scarier than it's worth, relax.  Perl has
some features to help you avoid its most common pitfalls.  The best
way to avoid getting confused is to start every program like this:

    #!/usr/bin/perl -w
    use strict;

This way, you'll be forced to declare all your variables with my() and
also disallow accidental "symbolic dereferencing".  Therefore if you'd done
this:

    my $aref = [
        [ "fred", "barney", "pebbles", "bambam", "dino", ],
        [ "homer", "bart", "marge", "maggie", ],
        [ "george", "jane", "elroy", "judy", ],
    ];

    print $aref[2][2];

The compiler would immediately flag that as an error I<at compile time>,
because you were accidentally accessing C<@aref>, an undeclared
variable, and it would thereby remind you to write instead:

    print $aref->[2][2]

=head1 DEBUGGING
X<data structure, debugging> X<complex data structure, debugging>
X<AoA, debugging> X<HoA, debugging> X<AoH, debugging> X<HoH, debugging>
X<array of arrays, debugging> X<hash of arrays, debugging>
X<array of hashes, debugging> X<hash of hashes, debugging>

You can use the debugger's C<x> command to dump out complex data structures.
For example, given the assignment to $aref above, with the variable named
$AoA instead, here's the debugger output:

    DB<1> x $AoA
    $AoA = ARRAY(0x13b5a0)
       0  ARRAY(0x1f0a24)
          0  'fred'
          1  'barney'
          2  'pebbles'
          3  'bambam'
          4  'dino'
       1  ARRAY(0x13b558)
          0  'homer'
          1  'bart'
          2  'marge'
          3  'maggie'
       2  ARRAY(0x13b540)
          0  'george'
          1  'jane'
          2  'elroy'
          3  'judy'
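
Outside the debugger, the core module Data::Dumper gives a comparable view
of the same structure.  The following is a sketch only; the two settings
are optional and merely make the output more compact and stable.

    use strict;
    use warnings;
    use Data::Dumper;

    $Data::Dumper::Indent   = 1;   # compact, fixed indentation
    $Data::Dumper::Sortkeys = 1;   # stable key order for any hashes

    my $AoA = [
        [ "fred", "barney", "pebbles", "bambam", "dino", ],
        [ "homer", "bart", "marge", "maggie", ],
        [ "george", "jane", "elroy", "judy", ],
    ];

    print Dumper($AoA);

The output is itself Perl code, which is handy when pasting a structure
into a bug report or a test.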

=head1 CODE EXAMPLES

Presented with little comment (these will get their own manpages someday)
here are short code examples illustrating access of various
types of data structures.

=head1 ARRAYS OF ARRAYS
X<array of arrays> X<AoA>

=head2 Declaration of an ARRAY OF ARRAYS

 @AoA = (
        [ "fred", "barney" ],
        [ "george", "jane", "elroy" ],
        [ "homer", "marge", "bart" ],
      );

=head2 Generation of an ARRAY OF ARRAYS

 # reading from file
 while ( <> ) {
     push @AoA, [ split ];
 }

 # calling a function
 for $i ( 1 .. 10 ) {
     $AoA[$i] = [ somefunc($i) ];
 }

 # using temp vars
 for $i ( 1 .. 10 ) {
     @tmp = somefunc($i);
     $AoA[$i] = [ @tmp ];
 }

 # add to an existing row
 push @{ $AoA[0] }, "wilma", "betty";

=head2 Access and Printing of an ARRAY OF ARRAYS

 # one element
 $AoA[0][0] = "Fred";

 # another element
 $AoA[1][1] =~ s/(\w)/\u$1/;

 # print the whole thing with refs
 for $aref ( @AoA ) {
     print "\t [ @$aref ],\n";
 }

 # print the whole thing with indices
 for $i ( 0 .. $#AoA ) {
     print "\t [ @{$AoA[$i]} ],\n";
 }

 # print the whole thing one at a time
 for $i ( 0 .. $#AoA ) {
     for $j ( 0 .. $#{ $AoA[$i] } ) {
         print "elt $i $j is $AoA[$i][$j]\n";
     }
 }

=head1 HASHES OF ARRAYS
X<hash of arrays> X<HoA>

=head2 Declaration of a HASH OF ARRAYS

 %HoA = (
        flintstones        => [ "fred", "barney" ],
        jetsons            => [ "george", "jane", "elroy" ],
        simpsons           => [ "homer", "marge", "bart" ],
      );

=head2 Generation of a HASH OF ARRAYS

 # reading from file
 # flintstones: fred barney wilma dino
 while ( <> ) {
     next unless s/^(.*?):\s*//;
     $HoA{$1} = [ split ];
 }

 # reading from file; more temps
 # flintstones: fred barney wilma dino
 while ( $line = <> ) {
     ($who, $rest) = split /:\s*/, $line, 2;
     @fields = split ' ', $rest;
     $HoA{$who} = [ @fields ];
 }

 # calling a function that returns a list
 for $group ( "simpsons", "jetsons", "flintstones" ) {
     $HoA{$group} = [ get_family($group) ];
 }

 # likewise, but using temps
 for $group ( "simpsons", "jetsons", "flintstones" ) {
     @members = get_family($group);
     $HoA{$group} = [ @members ];
 }

 # append new members to an existing family
 push @{ $HoA{"flintstones"} }, "wilma", "betty";

=head2 Access and Printing of a HASH OF ARRAYS

 # one element
 $HoA{flintstones}[0] = "Fred";

 # another element
 $HoA{simpsons}[1] =~ s/(\w)/\u$1/;

 # print the whole thing
 foreach $family ( keys %HoA ) {
     print "$family: @{ $HoA{$family} }\n"
 }

 # print the whole thing with indices
 foreach $family ( keys %HoA ) {
     print "$family: ";
     foreach $i ( 0 .. $#{ $HoA{$family} } ) {
         print " $i = $HoA{$family}[$i]";
     }
     print "\n";
 }

 # print the whole thing sorted by number of members
 foreach $family ( sort { @{$HoA{$b}} <=> @{$HoA{$a}} } keys %HoA ) {
     print "$family: @{ $HoA{$family} }\n"
 }

 # print the whole thing sorted by number of members and name
 foreach $family ( sort {
                            @{$HoA{$b}} <=> @{$HoA{$a}}
                                        ||
                                    $a cmp $b
            } keys %HoA )
 {
     print "$family: ", join(", ", sort @{ $HoA{$family} }), "\n";
 }
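
The fragments in this section use undeclared variables for brevity, so they
will not compile under C<use strict>.  Here is a sketch of how the
declaration, the append and the last sort fit together in a strict-clean
program; the array @by_size is invented for the example.

    use strict;
    use warnings;

    my %HoA = (
        flintstones => [ "fred", "barney" ],
        jetsons     => [ "george", "jane", "elroy" ],
        simpsons    => [ "homer", "marge", "bart" ],
    );

    # append new members to an existing family
    push @{ $HoA{flintstones} }, "wilma", "betty";

    # largest family first, ties broken by family name
    my @by_size = sort {
        @{ $HoA{$b} } <=> @{ $HoA{$a} } || $a cmp $b
    } keys %HoA;

    for my $family (@by_size) {
        print "$family: ", join(", ", sort @{ $HoA{$family} }), "\n";
    }

With four members after the push, the flintstones come first; the other two
families tie at three members and are ordered by name.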

=head1 ARRAYS OF HASHES
X<array of hashes> X<AoH>

=head2 Declaration of an ARRAY OF HASHES

 @AoH = (
        {
            Lead     => "fred",
            Friend   => "barney",
        },
        {
            Lead     => "george",
            Wife     => "jane",
            Son      => "elroy",
        },
        {
            Lead     => "homer",
            Wife     => "marge",
            Son      => "bart",
        }
  );

=head2 Generation of an ARRAY OF HASHES

 # reading from file
 # format: LEAD=fred FRIEND=barney
 while ( <> ) {
     $rec = {};
     for $field ( split ) {
         ($key, $value) = split /=/, $field;
         $rec->{$key} = $value;
     }
     push @AoH, $rec;
 }


 # reading from file
 # format: LEAD=fred FRIEND=barney
 # no temp
 while ( <> ) {
     push @AoH, { split /[\s=]+/ };
 }

 # calling a function  that returns a key/value pair list, like
 # "lead","fred","daughter","pebbles"
 while ( %fields = getnextpairset() ) {
     push @AoH, { %fields };
 }

 # likewise, but using no temp vars
 while (<>) {
     push @AoH, { parsepairs($_) };
 }

 # add key/value to an element
 $AoH[0]{pet} = "dino";
 $AoH[2]{pet} = "santa's little helper";

=head2 Access and Printing of an ARRAY OF HASHES

 # one element
 $AoH[0]{lead} = "fred";

 # another element
 $AoH[1]{lead} =~ s/(\w)/\u$1/;

 # print the whole thing with refs
 for $href ( @AoH ) {
     print "{ ";
     for $role ( keys %$href ) {
         print "$role=$href->{$role} ";
     }
     print "}\n";
 }

 # print the whole thing with indices
 for $i ( 0 .. $#AoH ) {
     print "$i is { ";
     for $role ( keys %{ $AoH[$i] } ) {
         print "$role=$AoH[$i]{$role} ";
     }
     print "}\n";
 }

 # print the whole thing one at a time
 for $i ( 0 .. $#AoH ) {
     for $role ( keys %{ $AoH[$i] } ) {
         print "elt $i $role is $AoH[$i]{$role}\n";
     }
 }

=head1 HASHES OF HASHES
X<hash of hashes> X<HoH>

=head2 Declaration of a HASH OF HASHES

 %HoH = (
        flintstones => {
                lead      => "fred",
                pal       => "barney",
        },
        jetsons     => {
                lead      => "george",
                wife      => "jane",
                "his boy" => "elroy",
        },
        simpsons    => {
                lead      => "homer",
                wife      => "marge",
                kid       => "bart",
        },
 );

=head2 Generation of a HASH OF HASHES

 # reading from file
 # flintstones: lead=fred pal=barney wife=wilma pet=dino
 while ( <> ) {
     next unless s/^(.*?):\s*//;
     $who = $1;
     for $field ( split ) {
         ($key, $value) = split /=/, $field;
         $HoH{$who}{$key} = $value;
     }
 }


 # reading from file; more temps
 while ( <> ) {
     next unless s/^(.*?):\s*//;
     $who = $1;
     $rec = {};
     $HoH{$who} = $rec;
     for $field ( split ) {
         ($key, $value) = split /=/, $field;
         $rec->{$key} = $value;
     }
 }

 # calling a function  that returns a key,value hash
 for $group ( "simpsons", "jetsons", "flintstones" ) {
     $HoH{$group} = { get_family($group) };
 }

 # likewise, but using temps
 for $group ( "simpsons", "jetsons", "flintstones" ) {
     %members = get_family($group);
     $HoH{$group} = { %members };
 }

 # append new members to an existing family
 %new_folks = (
     wife => "wilma",
     pet  => "dino",
 );

 for $what (keys %new_folks) {
     $HoH{flintstones}{$what} = $new_folks{$what};
 }

=head2 Access and Printing of a HASH OF HASHES

 # one element
 $HoH{flintstones}{wife} = "wilma";

 # another element
 $HoH{simpsons}{lead} =~ s/(\w)/\u$1/;

 # print the whole thing
 foreach $family ( keys %HoH ) {
     print "$family: { ";
     for $role ( keys %{ $HoH{$family} } ) {
         print "$role=$HoH{$family}{$role} ";
     }
     print "}\n";
 }

 # print the whole thing  somewhat sorted
 foreach $family ( sort keys %HoH ) {
     print "$family: { ";
     for $role ( sort keys %{ $HoH{$family} } ) {
         print "$role=$HoH{$family}{$role} ";
     }
     print "}\n";
 }


 # print the whole thing sorted by number of members
 foreach $family ( sort { keys %{$HoH{$b}} <=> keys %{$HoH{$a}} }
                                                             keys %HoH )
 {
     print "$family: { ";
     for $role ( sort keys %{ $HoH{$family} } ) {
         print "$role=$HoH{$family}{$role} ";
     }
     print "}\n";
 }

 # establish a sort order (rank) for each role
 $i = 0;
 for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i }

 # now print the whole thing sorted by number of members
 foreach $family ( sort { keys %{ $HoH{$b} } <=> keys %{ $HoH{$a} } }
                                                             keys %HoH )
 {
     print "$family: { ";
     # and print these according to rank order
     for $role ( sort { $rank{$a} <=> $rank{$b} }
                                               keys %{ $HoH{$family} } )
     {
         print "$role=$HoH{$family}{$role} ";
     }
     print "}\n";
 }
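
The rank-ordered printing above can be tried in isolation with a sketch
like the following.  The two sample families and the helper name
C<family_line> are assumptions for illustration; roles missing from the
rank table are arbitrarily sorted last here.

 use strict;
 use warnings;

 my %HoH = (
     flintstones => { lead => "fred",  pal  => "barney" },
     simpsons    => { lead => "homer", wife => "marge", son => "bart" },
 );

 # establish a sort order (rank) for each role
 my %rank;
 my $i = 0;
 for ( qw(lead wife son daughter pal pet) ) { $rank{$_} = ++$i }

 # format one family with its roles in rank order
 sub family_line {
     my ($family) = @_;
     my $members = $HoH{$family};
     my @roles = sort {
         ( $rank{$a} || 99 ) <=> ( $rank{$b} || 99 ) or $a cmp $b
     } keys %$members;
     my @pairs = map { "$_=$members->{$_}" } @roles;
     return "$family: { @pairs }";
 }

 # largest family first; ties are broken by family name
 for my $family (
     sort { keys %{ $HoH{$b} } <=> keys %{ $HoH{$a} } or $a cmp $b }
     keys %HoH )
 {
     print family_line($family), "\n";
 }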


=head1 MORE ELABORATE RECORDS
X<record> X<structure> X<struct>

=head2 Declaration of MORE ELABORATE RECORDS

Here's a sample showing how to create and use a record whose fields are of
many different sorts:

     $rec = {
         TEXT      => $string,
         SEQUENCE  => [ @old_values ],
         LOOKUP    => { %some_table },
         THATCODE  => \&some_function,
         THISCODE  => sub { $_[0] ** $_[1] },
         HANDLE    => \*STDOUT,
     };

     print $rec->{TEXT};

     print $rec->{SEQUENCE}[0];
     $last = pop @{ $rec->{SEQUENCE} };

     print $rec->{LOOKUP}{"key"};
     ($first_k, $first_v) = each %{ $rec->{LOOKUP} };

     $answer = $rec->{THATCODE}->($arg);
     $answer = $rec->{THISCODE}->($arg1, $arg2);

     # careful of extra block braces on fh ref
     print { $rec->{HANDLE} } "a string\n";

     use FileHandle;
     $rec->{HANDLE}->autoflush(1);
     $rec->{HANDLE}->print(" a string\n");
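
For a version that runs on its own, the following sketch fills the
record with concrete sample values.  The values and the body of
C<some_function> are made up for the demonstration.

     use strict;
     use warnings;

     # sample values; all of these are assumptions for illustration
     my $string     = "a title";
     my @old_values = ( 10, 20, 30 );
     my %some_table = ( key => "value" );
     sub some_function { return "called with @_" }

     my $rec = {
         TEXT     => $string,
         SEQUENCE => [ @old_values ],      # a copy of the array
         LOOKUP   => { %some_table },      # a copy of the hash
         THATCODE => \&some_function,
         THISCODE => sub { $_[0] ** $_[1] },
         HANDLE   => \*STDOUT,
     };

     # popping from the copy leaves @old_values untouched
     my $last = pop @{ $rec->{SEQUENCE} };
     my $pow  = $rec->{THISCODE}->( 2, 10 );
     my $said = $rec->{THATCODE}->("x");

     print { $rec->{HANDLE} } "$rec->{TEXT}: $last $pow $said\n";
     print { $rec->{HANDLE} } "original still has ",
         scalar(@old_values), " values\n";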

=head2 Declaration of a HASH OF COMPLEX RECORDS

     %TV = (
        flintstones => {
            series   => "flintstones",
            nights   => [ qw(monday thursday friday) ],
            members  => [
                { name => "fred",    role => "lead", age  => 36, },
                { name => "wilma",   role => "wife", age  => 31, },
                { name => "pebbles", role => "kid",  age  =>  4, },
            ],
        },

        jetsons     => {
            series   => "jetsons",
            nights   => [ qw(wednesday saturday) ],
            members  => [
                { name => "george",  role => "lead", age  => 41, },
                { name => "jane",    role => "wife", age  => 39, },
                { name => "elroy",   role => "kid",  age  =>  9, },
            ],
         },

        simpsons    => {
            series   => "simpsons",
            nights   => [ qw(monday) ],
            members  => [
                { name => "homer",   role => "lead", age  => 34, },
                { name => "marge",   role => "wife", age  => 37, },
                { name => "bart",    role => "kid",  age  => 11, },
            ],
         },
      );

=head2 Generation of a HASH OF COMPLEX RECORDS

     # reading from file
     # this is most easily done by having the file itself be
     # in the raw data format as shown above.  perl is happy
     # to parse complex data structures if declared as data, so
     # sometimes it's easiest to do that

     # here's a piece by piece build up
     $rec = {};
     $rec->{series} = "flintstones";
     $rec->{nights} = [ find_days() ];

     @members = ();
     # assume this file in field=value syntax
     while (<>) {
         %fields = split /[\s=]+/;
         push @members, { %fields };
     }
     $rec->{members} = [ @members ];

     # now remember the whole thing
     $TV{ $rec->{series} } = $rec;

     ###########################################################
     # now, you might want to make interesting extra fields that
     # include pointers back into the same data structure so that if
     # you change one piece, it changes everywhere; for example, you
     # might want a {kids} field that is a reference to an array of
     # the kids' records without having duplicate records and thus
     # update problems.
     ###########################################################
     foreach $family (keys %TV) {
         $rec = $TV{$family}; # temp pointer
         @kids = ();
         for $person ( @{ $rec->{members} } ) {
             if ($person->{role} =~ /kid|son|daughter/) {
                 push @kids, $person;
             }
         }
         # REMEMBER: $rec and $TV{$family} point to same data!!
         $rec->{kids} = [ @kids ];
     }

     # you copied the array, but the array itself contains pointers
     # to uncopied objects. this means that if you make bart get
     # older via

     $TV{simpsons}{kids}[0]{age}++;

     # then this would also change in
     print $TV{simpsons}{members}[2]{age};

     # because $TV{simpsons}{kids}[0] and $TV{simpsons}{members}[2]
     # both point to the same underlying anonymous hash table

     # print the whole thing
     foreach $family ( keys %TV ) {
         print "the $family";
         print " is on during @{ $TV{$family}{nights} }\n";
         print "its members are:\n";
         for $who ( @{ $TV{$family}{members} } ) {
             print " $who->{name} ($who->{role}), age $who->{age}\n";
         }
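         # find the member whose role is "lead" (there is no {lead} key)
         ($lead) = grep { $_->{role} eq "lead" } @{ $TV{$family}{members} };
         print "it turns out that $lead->{name} has ";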
         print scalar ( @{ $TV{$family}{kids} } ), " kids named ";
         print join (", ", map { $_->{name} } @{ $TV{$family}{kids} } );
         print "\n";
     }
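
The sharing described above can be checked with a much smaller
structure.  This is a minimal sketch with made-up sample data; it only
demonstrates that both paths lead to the same anonymous hash.

     use strict;
     use warnings;

     my %TV = (
         simpsons => {
             members => [
                 { name => "homer", role => "lead", age => 34 },
                 { name => "bart",  role => "kid",  age => 11 },
             ],
         },
     );

     # kids holds references to the same hashes that members holds
     my $rec = $TV{simpsons};
     $rec->{kids} = [ grep { $_->{role} eq "kid" } @{ $rec->{members} } ];

     # change the age through one path ...
     $TV{simpsons}{kids}[0]{age}++;

     # ... and read it back through the other
     print "via members: $TV{simpsons}{members}[1]{age}\n";

     # comparing references numerically compares their addresses
     print "same hash\n"
         if $TV{simpsons}{kids}[0] == $TV{simpsons}{members}[1];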

=head1 Database Ties

You cannot easily tie a multilevel data structure (such as a hash of
hashes) to a dbm file.  The first problem is that all but GDBM and
Berkeley DB have size limitations, but beyond that, you also have problems
with how references are to be represented on disk.  One experimental
module that does partially attempt to address this need is the MLDBM
module.  Check your nearest CPAN site as described in L<perlmodlib> for
source code to MLDBM.
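
To make the representation problem concrete: a dbm file stores only flat
strings, so a nested value has to be serialized before it is stored and
deserialized when it is read back.  The sketch below does this by hand
with the core Storable module, using an ordinary hash (C<%flat>, a
made-up name) to stand in for the tied dbm hash.  It illustrates the
general idea only and makes no claim about MLDBM's internals.

 use strict;
 use warnings;
 use Storable qw(freeze thaw);

 my %flat;    # stand-in for a hash tied to a dbm file

 # store: the nested record becomes one opaque string
 $flat{flintstones} = freeze( { lead => "fred", pal => "barney" } );

 # fetch: thaw returns a fresh copy, so changes must be written back
 my $family = thaw( $flat{flintstones} );
 $family->{wife} = "wilma";
 $flat{flintstones} = freeze($family);

 my @roles = sort keys %{ thaw( $flat{flintstones} ) };
 print "@roles\n";

Note that an assignment such as C<$flat{flintstones}{pet} = "dino"> would
not work in this scheme, because the stored value is a string, not a
reference.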

=head1 SEE ALSO

L<perlref>, L<perllol>, L<perldata>, L<perlobj>

=head1 AUTHOR

Tom Christiansen <F<tchrist@perl.com>>
perlce.pod000064400000034412150344123440006526 0ustar00If you read this file _as_is_, just ignore the funny characters you
see.  It is written in the POD format (see pod/perlpod.pod) which is
specifically designed to be readable as is.

=head1 NAME

perlce - Perl for WinCE

=head1 Building Perl for WinCE

=head2 WARNING

B<< Much of this document has become very out of date and needs updating,
rewriting or deleting. The build process was overhauled during the 5.19
development track and the current instructions as of that time are given
in L</CURRENT BUILD INSTRUCTIONS>; the previous build instructions, which
are largely superseded but may still contain some useful information, are
left in L</OLD BUILD INSTRUCTIONS> but really need removing after anything
of use has been extracted from them. >>

=head2 DESCRIPTION

This file gives the instructions for building Perl 5.8 and above for
WinCE.  Please read and understand the terms under which this
software is distributed.

=head2 General explanations on cross-compiling WinCE

=over

=item *

F<miniperl> is built first. This is a single executable (without a DLL),
intended to run on Win32, and it facilitates the remaining build process;
all binaries built after it are foreign and should not be run locally.

F<miniperl> is built using F<./win32/Makefile>; this is part of the normal
build process and is invoked as a dependency from F<wince/Makefile.ce>.

=item *

After F<miniperl> is built, F<configpm> is invoked to create the right
F<Config.pm> in the right place, together with its corresponding F<Cross.pm>.

Unlike in the Win32 build, miniperl will not have the host's F<Config.pm>
within reach; instead it will use the F<Config.pm> from within the
cross-compilation directories.

The file F<Cross.pm> is dead simple: for a given cross-architecture it puts
into C<@INC> the path where the perl modules are, with the right
F<Config.pm> in that place.

That said, C<miniperl -Ilib -MConfig -we 1> should report an error, because
it cannot find F<Config.pm>. If it does not give an error, the wrong
F<Config.pm> has been substituted, and the resulting binaries will be a mess.

C<miniperl -MCross -MConfig -we 1> should run okay, and it will provide the
right F<Config.pm> for further compilations.

=item *

During the extensions build phase, the script F<./win32/buildext.pl> is
invoked, which steps into the F<./ext> subdirectories and builds each
extension in turn.

All invocations of F<Makefile.PL> are given C<-MCross> to enable
cross-compilation.

=back
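
The F<Config.pm> check described in the list above can be made more
informative by printing which file was actually loaded.  The following is
a small sketch; the script name F<whichconfig.pl> is an assumption, and it
relies only on the standard C<%INC> and C<%Config> variables.

 # whichconfig.pl (hypothetical name)
 # run as:  miniperl -MCross whichconfig.pl
 use strict;
 use warnings;
 use Config;

 # %INC records the path each module was loaded from
 print "Config.pm loaded from: $INC{'Config.pm'}\n";
 print "archname: $Config{archname}\n";
 print "cc:       $Config{cc}\n";

For a cross build one would expect the reported path to lie inside the
cross-compilation directories rather than in the host's F<lib>.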

=head2 CURRENT BUILD INSTRUCTIONS

(These instructions assume the host is 32-bit Windows. If you're on 64-bit
Windows then change "C:\Program Files" to "C:\Program Files (x86)" throughout.)

1. Install EVC4 from

 http://download.microsoft.com/download/c/3/f/c3f8b58b-9753-4c2e-8b96-2dfe3476a2f7/eVC4.exe

Use the key mentioned at 

 http://download.cnet.com/Microsoft-eMbedded-Visual-C/3000-2212_4-10108490.html?tag=bc

The installer is ancient and has a few bugs in the paths it uses. You
will have to fix them later. Basically, some things go into "C:/Program
Files/Windows CE Tools", others go into "C:/Windows CE Tools" regardless 
of the path you gave to the installer (the default will be "C:/Windows 
CE Tools"). Reboots will be required for the installer to proceed. Also 
.c and .h associations with Visual Studio might get overridden when 
installing EVC4. You have been warned.

2. Download celib from GitHub (using "Download ZIP") at

    https://github.com/bulk88/celib 

Extract it to a spaceless path but not into the perl build source.
I call this directory "celib-palm-3.0" but in the GitHub 
snapshot it will be called "celib-master". Make a copy of the 
"wince-arm-pocket-wce300-release" folder and rename the copy to 
"wince-arm-pocket-wce400". This is a hack so we can build a CE 4.0 
binary by linking in CE 3.0 ARM asm; the linker doesn't care. Windows 
Mobile/WinCE are backwards compatible with machine code like Desktop Windows.

3. Download console-1.3-src.tar.gz from 

 http://sourceforge.net/projects/perlce/files/PerlCE%20support%20files/console/

Extract it to a spaceless path but not into the perl build source. 
Don't extract it into the same directory as celib. Make a copy of the 
"wince-arm-pocket-wce300" folder and rename the copy to 
"wince-arm-pocket-wce400". This is a hack so we can build a CE 4.0 
binary by linking in CE 3.0 ARM asm; the linker doesn't care. Windows 
Mobile/WinCE are backwards compatible with machine code like Desktop Windows.

4. Open a command prompt, run your regular batch file to set the environment
for desktop Visual C building, go to the perl source directory, cd into win32/,
fill out Makefile, and do a "nmake all" to build a Desktop Perl.

5. Open win32/Makefile.ce in a text editor and do something similar to the 
following patch.

    -CELIBDLLDIR  = h:\src\wince\celib-palm-3.0
    -CECONSOLEDIR = h:\src\wince\w32console
    +CELIBDLLDIR  = C:\sources\celib-palm-3.0
    +CECONSOLEDIR = C:\sources\w32console

Also change

    !if "$(MACHINE)" == ""
    MACHINE=wince-arm-hpc-wce300
    #MACHINE=wince-arm-hpc-wce211
    #MACHINE=wince-sh3-hpc-wce211
    #MACHINE=wince-mips-hpc-wce211
    #MACHINE=wince-sh3-hpc-wce200
    #MACHINE=wince-mips-hpc-wce200
    #MACHINE=wince-arm-pocket-wce300
    #MACHINE=wince-mips-pocket-wce300
    #MACHINE=wince-sh3-pocket-wce300
    #MACHINE=wince-x86em-pocket-wce300
    #MACHINE=wince-mips-palm-wce211
    #MACHINE=wince-sh3-palm-wce211
    #MACHINE=wince-x86em-palm-wce211
    #MACHINE=wince-x86-hpc-wce300
    #MACHINE=wince-arm-pocket-wce400
    !endif

to

    !if "$(MACHINE)" == ""
    #MACHINE=wince-arm-hpc-wce300
    #MACHINE=wince-arm-hpc-wce211
    #MACHINE=wince-sh3-hpc-wce211
    #MACHINE=wince-mips-hpc-wce211
    #MACHINE=wince-sh3-hpc-wce200
    #MACHINE=wince-mips-hpc-wce200
    #MACHINE=wince-arm-pocket-wce300
    #MACHINE=wince-mips-pocket-wce300
    #MACHINE=wince-sh3-pocket-wce300
    #MACHINE=wince-x86em-pocket-wce300
    #MACHINE=wince-mips-palm-wce211
    #MACHINE=wince-sh3-palm-wce211
    #MACHINE=wince-x86em-palm-wce211
    #MACHINE=wince-x86-hpc-wce300
    MACHINE=wince-arm-pocket-wce400
    !endif

so wince-arm-pocket-wce400 is the MACHINE type.

6. Use a text editor to open "C:\Program Files\Microsoft eMbedded C++ 
4.0\EVC\WCE400\BIN\WCEARMV4.BAT". Look for

    if "%SDKROOT%"=="" set SDKROOT=...

On a new install it is "C:\Windows CE Tools". Go to
"C:\Windows CE Tools" in a file manager and see if "C:\Windows CE
Tools\wce400\STANDARDSDK\Include\Armv4" exists on your disk. If not,
SDKROOT needs to be changed to "C:\Program Files\Windows CE Tools".

Open celib-palm-3.0\inc\cewin32.h, search for

    typedef struct _ABC {

and uncomment the struct.

7. Open another command prompt, ensure PLATFORM is not set to anything
already unless you know what you're doing (so that the correct default
value is set by the next command), and run "C:\Program Files\Microsoft
eMbedded C++ 4.0\EVC\WCE400\BIN\WCEARMV4.BAT"

8. In the WinCE command prompt you made with WCEARMV4.BAT, go to the perl
source directory, cd into win32/ and run "nmake -f Makefile.ce".

9. The ARM perl interpreter (perl519.dll and perl.exe) will be in something
like "C:\perl519\src\win32\wince-arm-pocket-wce400", with the XS DLLs in
"C:\perl519\src\xlib\wince-arm-pocket-wce400\auto".

To prove success on the host machine, run
"dumpbin /headers wince-arm-pocket-wce400\perl.exe" from the win32/ folder
and look for "machine (ARM)" in the FILE HEADER VALUES and
"subsystem (Windows CE GUI)" in the OPTIONAL HEADER VALUES.

=head2 OLD BUILD INSTRUCTIONS

This section describes the steps to be performed to build PerlCE.
You may find additional information about building perl for WinCE,
and some pre-built binaries, at L<http://perlce.sourceforge.net>.

=head3 Tools & SDK

For compiling, you need the following:

=over 4

=item * Microsoft Embedded Visual Tools

=item * Microsoft Visual C++

=item * Rainer Keuchel's celib-sources

=item * Rainer Keuchel's console-sources

=back

Needed source files can be downloaded at
L<http://perlce.sourceforge.net>

=head3 Make

Normally you only need to edit F<./win32/ce-helpers/compile.bat>
to reflect your system and run it.

The file F<./win32/ce-helpers/compile.bat> is actually a wrapper that calls
C<nmake -f makefile.ce> with appropriate parameters; it accepts extra
parameters and forwards them to the C<nmake> command as additional
arguments. You should pass the target this way.

To prepare a distribution you need to do the following:

=over 4

=item * go to the F<./win32> subdirectory

=item * edit the file F<./win32/ce-helpers/compile.bat>

=item * run C<compile.bat>

=item * run C<compile.bat dist>

=back

F<Makefile.ce> has a C<CROSS_NAME> macro, which is used to refer to
your cross-compilation scheme. You can assign a name to it, but this
is not necessary, because by default it is named after your machine
configuration, such as "wince-sh3-hpc-wce211", and this is enough
to distinguish different builds made at the same time. The option is
handy when you want several different builds on the same platform, say
a threaded build alongside a non-threaded one. In the following example
we assume that all required environment variables are set properly for
the C cross-compiler (a special *.bat file fits this purpose perfectly)
and that your F<compile.bat> has the "MACHINE" parameter set to, say,
C<wince-mips-pocket-wce300>.

  compile.bat
  compile.bat dist
  compile.bat CROSS_NAME=mips-wce300-thr "USE_ITHREADS=define" ^
    "USE_IMP_SYS=define" "USE_MULTI=define"
  compile.bat CROSS_NAME=mips-wce300-thr "USE_ITHREADS=define" ^
    "USE_IMP_SYS=define" "USE_MULTI=define" dist

If all goes okay and there are no errors during the build, you'll get two
independent distributions: C<wince-mips-pocket-wce300> and C<mips-wce300-thr>.

The target C<dist> prepares the distribution file set. The target C<zipdist>
does the same as C<dist> but additionally compresses the distribution files
into a zip archive.

NOTE: during a build one or more F<Config.pm> files may be created for
cross-compilation ("foreign" F<Config.pm>); these are hidden inside
F<../xlib/$(CROSS_NAME)> with other auxiliary files. It is important to
note, however, that there should be B<no> F<Config.pm> for the host miniperl.
If you get an error that perl could not find F<Config.pm> somewhere in the
build process, something went wrong. Most probably you forgot to
specify a cross-compilation when invoking miniperl.exe on F<Makefile.PL>.
When building an extension for cross-compilation your command line should
look like

  ..\miniperl.exe -I..\lib -MCross=mips-wce300-thr Makefile.PL

or just

  ..\miniperl.exe -I..\lib -MCross Makefile.PL

to refer to the cross-compilation that was created last.

All questions related to building for WinCE devices can be asked on the
F<perlce-user@lists.sourceforge.net> mailing list.

=head1 Using Perl on WinCE

=head2 DESCRIPTION

PerlCE is currently linked with a simple console window, so it also
works on non-hpc devices.

The simple stdio implementation creates the files F<stdin.txt>,
F<stdout.txt> and F<stderr.txt>, so you might examine them if your
console has only a limited number of columns.

When the exit code is non-zero, a message box appears; otherwise the
console closes, so you might have to catch an exit with
status 0 in your program to see any output.

stdout/stderr now go into the files F</perl-stdout.txt> and
F</perl-stderr.txt>.

PerlIDE is handy for working with perlce.

=head2 LIMITATIONS

No fork(), pipe(), popen() etc.

=head2 ENVIRONMENT

All environment vars must be stored in HKLM\Environment as
strings. They are read at process startup.

=over

=item PERL5LIB

Usual perl lib path (semicolon-separated list).

=item PATH

Semicolon-separated list of directories for executables.

=item TMP

Temporary directory.

=item UNIXROOTPATH

Root for accessing some special files, e.g. F</dev/null>, F</etc/services>.

=item ROWS/COLS

Rows/columns for the console.

=item HOME

Home directory.

=item CONSOLEFONTSIZE

Size of the console font.

=back

You can set these with cereg.exe, a (remote) registry editor
or via the PerlIDE.
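
From within a script these settings appear in C<%ENV> like any other
environment variable.  The sketch below prints a few of them, assuming
that the list-valued entries are separated by semicolons; the helper name
C<semi_list> is made up for illustration.

 use strict;
 use warnings;

 # split a semicolon-separated list, dropping empty entries
 sub semi_list {
     my ($value) = @_;
     return () unless defined $value;
     return grep { length } split /;/, $value;
 }

 my @lib  = semi_list( $ENV{PERL5LIB} );
 my @path = semi_list( $ENV{PATH} );
 my $tmp  = defined $ENV{TMP} ? $ENV{TMP} : "(not set)";

 print "PERL5LIB entry: $_\n" for @lib;
 print "PATH entry: $_\n"     for @path;
 print "TMP: $tmp\n";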

=head2 REGISTRY

To start perl by clicking on a perl source file, you have
to make the corresponding entries in HKCR (see F<ce-helpers/wince-reg.bat>).
cereg.exe (which must be executed on a desktop PC with
ActiveSync) is reported not to work on some devices.
In that case you have to create the registry entries by hand using a
registry editor.

=head2 XS

The following Win32 functions are built in:

	newXS("Win32::GetCwd", w32_GetCwd, file);
	newXS("Win32::SetCwd", w32_SetCwd, file);
	newXS("Win32::GetTickCount", w32_GetTickCount, file);
	newXS("Win32::GetOSVersion", w32_GetOSVersion, file);
	newXS("Win32::IsWinNT", w32_IsWinNT, file);
	newXS("Win32::IsWin95", w32_IsWin95, file);
	newXS("Win32::IsWinCE", w32_IsWinCE, file);
	newXS("Win32::CopyFile", w32_CopyFile, file);
	newXS("Win32::Sleep", w32_Sleep, file);
	newXS("Win32::MessageBox", w32_MessageBox, file);
	newXS("Win32::GetPowerStatus", w32_GetPowerStatus, file);
	newXS("Win32::GetOemInfo", w32_GetOemInfo, file);
	newXS("Win32::ShellEx", w32_ShellEx, file);
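
A script that should also run on other platforms can test for these
functions before calling them.  The following is a sketch; it assumes
that the argument conventions match the desktop Win32 functions of the
same names (for example, that C<Win32::Sleep> takes milliseconds), which
this document does not state.

 use strict;
 use warnings;

 # only call the Win32:: functions when this perl provides them
 if ( defined &Win32::IsWinCE && Win32::IsWinCE() ) {
     my $start = Win32::GetTickCount();
     Win32::Sleep(100);    # assumed to be milliseconds
     my $elapsed = Win32::GetTickCount() - $start;
     print "running on WinCE in ", Win32::GetCwd(), "\n";
     print "slept for about $elapsed ms\n";
 }
 else {
     print "not a WinCE perl; skipping the Win32:: calls\n";
 }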

=head2 BUGS

Opening files for read-write is currently not supported if
they use stdio (normal perl file handles).

If you find bugs or if it does not work at all on your
device, send mail to the address below. Please report
the details of your device (processor, CE version,
device type (hpc/palm/pocket)) and the date of the downloaded
files.

=head2 INSTALLATION

Installation instructions are currently at L<http://perlce.sourceforge.net/>.

Once the installation and testing processes have stabilized, the information
will be more precise.

=head1 ACKNOWLEDGEMENTS

The port for Win32 was used as a reference.

=head1 History of WinCE port

=over

=item 5.6.0

Initial port of perl to WinCE. It was performed in a separate directory
named F<wince>. This port was based on the contents of the F<./win32>
directory. F<miniperl> was not built; the user had to have a HOST perl and
edit F<makefile.ce> to reflect this.

=item 5.8.0

The wince port was kept in the same F<./wince> directory, and
F<wince/Makefile.ce> was used to invoke the native compiler to create the
HOST miniperl, which then facilitates the cross-compiling process.
Extension building support was added.

=item 5.9.4

The two directories F<./win32> and F<./wince> were merged, so the perlce
build process now takes place in the F<./win32> directory.

=back

=head1 AUTHORS

=over

=item Rainer Keuchel <coyxc@rainer-keuchel.de>

provided the initial port of Perl, which appears to be the most essential
work, as it was a breakthrough in having Perl ported at all.
Many thanks to Rainer!

=item Vadim Konovalov

provided further support for the WinCE port.

=item Daniel Dragan

updated the build process during the 5.19 development track.

=back

=head1 NAME

perltodo - Link to the Perl to-do list

=head1 DESCRIPTION

The Perl 5 to-do list is maintained in the git repository, and can
be viewed at L<http://perl5.git.perl.org/perl.git/blob/HEAD:/Porting/todo.pod>.

(The to-do list used to be here in perltodo. That has stopped, as installing a
snapshot that becomes increasingly out of date isn't that useful to anyone.)

If you read this file _as_is_, just ignore the funny characters you
see.  It is written in the POD format (see pod/perlpod.pod) which is
specifically designed to be readable as is.

=head1 NAME

perlsolaris - Perl version 5 on Solaris systems

=head1 DESCRIPTION

This document describes various features of Sun's Solaris operating system
that will affect how Perl version 5 (hereafter just perl) is
compiled and/or runs.  Some issues relating to the older SunOS 4.x are
also discussed, though they may be out of date.

For the most part, everything should just work.

Starting with Solaris 8, perl5.00503 (or higher) is supplied with the
operating system, so you might not even need to build a newer version
of perl at all.  The Sun-supplied version is installed in /usr/perl5
with F</usr/bin/perl> pointing to F</usr/perl5/bin/perl>.  Do not disturb
that installation unless you really know what you are doing.  If you
remove the perl supplied with the OS, you will render some bits of
your system inoperable.  If you wish to install a newer version of perl,
install it under a different prefix from /usr/perl5.  Common prefixes
to use are /usr/local and /opt/perl.

You may wish to put your version of perl in the PATH of all users by
changing the link F</usr/bin/perl>.  This is probably OK, as most perl
scripts shipped with Solaris use an explicit path.  (There are a few
exceptions, such as F</usr/bin/rpm2cpio> and F</etc/rcm/scripts/README>, but
these are also sufficiently generic that the actual version of perl
probably doesn't matter too much.)

Solaris ships with a range of Solaris-specific modules.  If you choose
to install your own version of perl you will find the source of many of
these modules is available on CPAN under the Sun::Solaris:: namespace.

Solaris may include two versions of perl, e.g. Solaris 9 includes
both 5.005_03 and 5.6.1.  This is to provide stability across Solaris
releases, in cases where a later perl version has incompatibilities
with the version included in the preceding Solaris release.  The
default perl version will always be the most recent, and in general
the old version will only be retained for one Solaris release.  Note
also that the default perl will NOT be configured to search for modules
in the older version, again due to compatibility/stability concerns.
As a consequence if you upgrade Solaris, you will have to
rebuild/reinstall any additional CPAN modules that you installed for
the previous Solaris version.  See the CPAN manpage under 'autobundle'
for a quick way of doing this.

As an interim measure, you may either change the #! line of your
scripts to specifically refer to the old perl version, e.g. on
Solaris 9 use #!/usr/perl5/5.00503/bin/perl to use the perl version
that was the default for Solaris 8, or if you have a large number of
scripts it may be more convenient to make the old version of perl the
default on your system.  You can do this by changing the appropriate
symlinks under /usr/perl5 as follows (example for Solaris 9):

 # cd /usr/perl5
 # rm bin man pod
 # ln -s ./5.00503/bin
 # ln -s ./5.00503/man
 # ln -s ./5.00503/lib/pod
 # rm /usr/bin/perl
 # ln -s ../perl5/5.00503/bin/perl /usr/bin/perl

In both cases this should only be considered to be a temporary
measure - you should upgrade to the later version of perl as soon as
is practicable.

Note also that the perl command-line utilities (e.g. perldoc) and any
that are added by modules that you install will be under
/usr/perl5/bin, so that directory should be added to your PATH.

=head2 Solaris Version Numbers.

For consistency with common usage, perl's Configure script performs
some minor manipulations on the operating system name and version
number as reported by uname.  Here's a partial translation table:

          Sun:                      perl's Configure:
 uname    uname -r   Name           osname     osvers
 SunOS    4.1.3     Solaris 1.1     sunos      4.1.3
 SunOS    5.6       Solaris 2.6     solaris    2.6
 SunOS    5.8       Solaris 8       solaris    2.8
 SunOS    5.9       Solaris 9       solaris    2.9
 SunOS    5.10      Solaris 10      solaris    2.10
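
The mapping in the table can be written as a small function.  This
sketch mirrors the table rows only and is not Configure's actual code;
the function name is made up, and it returns an empty list for anything
other than SunOS.

 use strict;
 use warnings;

 # map "uname" and "uname -r" output to (osname, osvers)
 sub solaris_osname_osvers {
     my ( $sysname, $release ) = @_;
     return unless $sysname eq "SunOS";

     # SunOS 5.x is reported as solaris 2.x
     if ( $release =~ /^5\.(\d+)/ ) {
         return ( "solaris", "2.$1" );
     }

     # older releases such as 4.1.3 keep their name and number
     return ( "sunos", $release );
 }

 for my $rel (qw(4.1.3 5.6 5.8 5.9 5.10)) {
     my ( $osname, $osvers ) = solaris_osname_osvers( "SunOS", $rel );
     print "SunOS $rel -> $osname $osvers\n";
 }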

The complete table can be found in the Sun Managers' FAQ
L<ftp://ftp.cs.toronto.edu/pub/jdd/sunmanagers/faq> under
"9.1) Which Sun models run which versions of SunOS?".

=head1 RESOURCES

There are many, many sources for Solaris information.  A few of the
important ones for perl:

=over 4

=item Solaris FAQ

The Solaris FAQ is available at
L<http://www.science.uva.nl/pub/solaris/solaris2.html>.

The Sun Managers' FAQ is available at
L<ftp://ftp.cs.toronto.edu/pub/jdd/sunmanagers/faq>

=item Precompiled Binaries

Precompiled binaries, links to many sites, and much, much more are
available at L<http://www.sunfreeware.com/> and
L<http://www.blastwave.org/>.

=item Solaris Documentation

All Solaris documentation is available on-line at L<http://docs.sun.com/>.

=back

=head1 SETTING UP

=head2 File Extraction Problems on Solaris.

Be sure to use a tar program compiled under Solaris (not SunOS 4.x)
to extract the perl-5.x.x.tar.gz file.  Do not use GNU tar compiled
for SunOS4 on Solaris.  (GNU tar compiled for Solaris should be fine.)
When you run SunOS4 binaries on Solaris, the run-time system magically
alters pathnames matching m#lib/locale# so that when tar tries to create
lib/locale.pm, a file named lib/oldlocale.pm gets created instead.
If you found this advice too late and used a SunOS4-compiled tar
anyway, you must find the incorrectly renamed file and move it back
to lib/locale.pm.

=head2 Compiler and Related Tools on Solaris.

You must use an ANSI C compiler to build perl.  Perl can be compiled
with either Sun's add-on C compiler or with gcc.  The C compiler that
shipped with SunOS4 will not do.

=head3 Include /usr/ccs/bin/ in your PATH.

Several tools needed to build perl are located in /usr/ccs/bin/:  ar,
as, ld, and make.  Make sure that /usr/ccs/bin/ is in your PATH.


On all the released versions of Solaris (8, 9 and 10) you need to make
sure the following packages are installed (this information is extracted
from the Solaris FAQ):

for tools (sccs, lex, yacc, make, nm, truss, ld, as): SUNWbtool,
SUNWsprot, SUNWtoo

for libraries & headers: SUNWhea, SUNWarc, SUNWlibm, SUNWlibms, SUNWdfbh,
SUNWcg6h, SUNWxwinc

Additionally, on Solaris 8 and 9 you also need:

for 64 bit development: SUNWarcx, SUNWbtoox, SUNWdplx, SUNWscpux,
SUNWsprox, SUNWtoox, SUNWlmsx, SUNWlmx, SUNWlibCx

And only on Solaris 8 you also need:

for libraries & headers: SUNWolinc


If you are in doubt which package contains a file you are missing,
try to find an installation that has that file. Then do a

 $ grep /my/missing/file /var/sadm/install/contents

This will display a line like this:

 /usr/include/sys/errno.h f none 0644 root bin 7471 37605 956241356 SUNWhea

The last item listed (SUNWhea in this example) is the package you need.
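
Picking the package name out of such a line is easy to script.  The
following sketch takes the last whitespace-separated field, as described
above; the function name C<package_of> is an assumption, and the sample
line is the one from this section.

 use strict;
 use warnings;

 # return the last field of a /var/sadm/install/contents line
 sub package_of {
     my ($line) = @_;
     my @fields = split ' ', $line;
     return $fields[-1];
 }

 my $line = "/usr/include/sys/errno.h f none 0644 root bin "
          . "7471 37605 956241356 SUNWhea";
 print package_of($line), "\n";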

=head3 Avoid /usr/ucb/cc.

You don't need to have /usr/ucb/ in your PATH to build perl.  If you
want /usr/ucb/ in your PATH anyway, make sure that /usr/ucb/ is NOT
in your PATH before the directory containing the right C compiler.

=head3 Sun's C Compiler

If you use Sun's C compiler, make sure the correct directory
(usually /opt/SUNWspro/bin/) is in your PATH (before /usr/ucb/).

=head3 GCC

If you use gcc, make sure your installation is recent and complete.
Perl versions since 5.6.0 build fine with gcc newer than 2.8.1 on
Solaris 2.6 and later.

You must Configure perl with

 $ sh Configure -Dcc=gcc

If you don't, you may experience strange build errors.

If you have updated your Solaris version, you may also have to update
your gcc.  For example, if you are running Solaris 2.6 and your gcc is
installed under /usr/local, check in /usr/local/lib/gcc-lib and make
sure you have the appropriate directory, sparc-sun-solaris2.6/ or
i386-pc-solaris2.6/.  If gcc's directory is for a different version of
Solaris than you are running, then you will need to rebuild gcc for
your new version of Solaris.

You can get a precompiled version of gcc from
L<http://www.sunfreeware.com/> or L<http://www.blastwave.org/>. Make
sure you pick up the package for your Solaris release.

If you wish to use gcc to build add-on modules for use with the perl
shipped with Solaris, you should use the Solaris::PerlGcc module
which is available from CPAN.  The perl shipped with Solaris
is configured and built with the Sun compilers, and the compiler
configuration information stored in Config.pm is therefore only
relevant to the Sun compilers.  The Solaris::PerlGcc module contains a
replacement Config.pm that is correct for gcc - see the module for
details.
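
To see which compiler settings a given perl has recorded in its
F<Config.pm>, a short script along the following lines can be used.  It
is a sketch that prints a handful of standard C<%Config> entries; the
selection of keys is arbitrary.

 use strict;
 use warnings;
 use Config;

 # show the compiler settings recorded when this perl was built
 for my $key (qw(cc ccflags optimize ld)) {
     my $value = defined $Config{$key} ? $Config{$key} : "(undef)";
     print "$key = $value\n";
 }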

=head3 GNU as and GNU ld

The following information applies to gcc version 2.  Volunteers to
update it as appropriate for gcc version 3 would be appreciated.

The versions of as and ld supplied with Solaris work fine for building
perl.  There is normally no need to install the GNU versions to
compile perl.

If you decide to ignore this advice and use the GNU versions anyway,
then be sure that they are relatively recent.  Versions newer than 2.7
are apparently new enough.  Older versions may have trouble with
dynamic loading.

If you wish to use GNU ld, then you need to pass it the -Wl,-E flag.
The hints/solaris_2.sh file tries to do this automatically by setting
the following Configure variables:

 ccdlflags="$ccdlflags -Wl,-E"
 lddlflags="$lddlflags -Wl,-E -G"

However, over the years, changes in gcc, GNU ld, and Solaris ld have made
it difficult to automatically detect which ld ultimately gets called.
You may have to manually edit config.sh and add the -Wl,-E flags
yourself, or else run Configure interactively and add the flags at the
appropriate prompts.

If your gcc is configured to use GNU as and ld but you want to use the
Solaris ones instead to build perl, then you'll need to add
-B/usr/ccs/bin/ to the gcc command line.  One convenient way to do
that is with

 $ sh Configure -Dcc='gcc -B/usr/ccs/bin/'

Note that the trailing slash is required.  This will result in some
harmless warnings as Configure is run:

 gcc: file path prefix `/usr/ccs/bin/' never used

These messages may safely be ignored.
(Note that for a SunOS4 system, you must use -B/bin/ instead.)

Alternatively, you can use the GCC_EXEC_PREFIX environment variable to
ensure that Sun's as and ld are used.  Consult your gcc documentation
for further information on the -B option and the GCC_EXEC_PREFIX variable.

=head3 Sun and GNU make

The make under /usr/ccs/bin works fine for building perl.  If you
have the Sun C compilers, you will also have a parallel version of
make (dmake).  This works fine to build perl, but can sometimes cause
problems when running 'make test' due to underspecified dependencies
between the different test harness files.  The same problem can also
affect the building of some add-on modules, so in those cases either
specify '-m serial' on the dmake command line, or use
/usr/ccs/bin/make instead.  If you wish to use GNU make, be sure that
the set-group-id bit is not set.  If it is, then arrange your PATH so
that /usr/ccs/bin/make is before GNU make or else have the system
administrator disable the set-group-id bit on GNU make.
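
The set-group-id check can be done with the shell's C<-g> file test.
The following is a minimal sketch only: the function name C<is_setgid>
and the path F</usr/local/bin/gmake> are illustrative assumptions, not
names used by the perl distribution.

```shell
# Succeed (exit status 0) if the named file has its set-group-id bit set.
is_setgid() {
    [ -g "$1" ]
}

# Hypothetical location of GNU make; adjust for your system.
gmake_path=/usr/local/bin/gmake

if is_setgid "$gmake_path"; then
    echo "$gmake_path is set-group-id; use /usr/ccs/bin/make instead"
else
    echo "$gmake_path is not set-group-id (or does not exist)"
fi
```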

=head3 Avoid libucb.

Solaris provides some BSD-compatibility functions in /usr/ucblib/libucb.a.
Perl will not build and run correctly if linked against -lucb since it
contains routines that are incompatible with the standard Solaris libc.
Normally this is not a problem since the solaris hints file prevents
Configure from even looking in /usr/ucblib for libraries, and also
explicitly omits -lucb.

=head2 Environment for Compiling perl on Solaris

=head3 PATH

Make sure your PATH includes the compiler (/opt/SUNWspro/bin/ if you're
using Sun's compiler) as well as /usr/ccs/bin/ to pick up the other
development tools (such as make, ar, as, and ld).  Make sure your path
either doesn't include /usr/ucb or that it includes it after the
compiler and compiler tools and other standard Solaris directories.
You definitely don't want /usr/ucb/cc.
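
The ordering rule above can be checked mechanically.  This is a rough
sketch, assuming a POSIX shell; the helper names C<path_position> and
C<ucb_is_safe> are made up for this example, and F</opt/SUNWspro/bin>
should be replaced by wherever your compiler actually lives.

```shell
# Print the 1-based position of directory $1 in the colon-separated
# list $2, or 0 if it is absent.  Trailing slashes are ignored.
# The body runs in a subshell so that IFS and "set -f" do not leak out.
path_position() (
    want=${1%/}
    pos=0
    set -f
    IFS=:
    for dir in $2; do
        pos=$((pos + 1))
        if [ "${dir%/}" = "$want" ]; then
            echo "$pos"
            return 0
        fi
    done
    echo 0
)

# Succeed if /usr/ucb is absent from list $2, or if compiler
# directory $1 appears in the list before /usr/ucb.
ucb_is_safe() {
    ucb=$(path_position /usr/ucb "$2")
    cc=$(path_position "$1" "$2")
    [ "$ucb" -eq 0 ] || { [ "$cc" -ne 0 ] && [ "$cc" -lt "$ucb" ]; }
}

if ucb_is_safe /opt/SUNWspro/bin "$PATH"; then
    echo "PATH ordering looks fine"
else
    echo "/usr/ucb precedes the compiler directory in PATH"
fi
```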

=head3 LD_LIBRARY_PATH

If you have the LD_LIBRARY_PATH environment variable set, be sure that
it does NOT include /lib or /usr/lib.  If you will be building
extensions that call third-party shared libraries (e.g. Berkeley DB)
then make sure that your LD_LIBRARY_PATH environment variable includes
the directory with that library (e.g. /usr/local/lib).

If you get an error message

 dlopen: stub interception failed

it is probably because your LD_LIBRARY_PATH environment variable
includes a directory which is a symlink to /usr/lib (such as /lib).
The reason this causes a problem is quite subtle.  The file
libdl.so.1.0 actually *only* contains functions which generate 'stub
interception failed' errors!  The runtime linker intercepts links to
"/usr/lib/libdl.so.1.0" and links in internal implementations of those
functions instead.  [Thanks to Tim Bunce for this explanation.]
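
A quick way to look for the offending entries is to resolve each
directory in LD_LIBRARY_PATH and compare the result with F</lib> and
F</usr/lib>.  The sketch below assumes a POSIX shell; the function
name C<risky_lib_dirs> is invented for this example.

```shell
# Print each entry of the colon-separated list $1 that is /lib or
# /usr/lib, either literally or after symlinks are resolved.
# The body runs in a subshell so that IFS and "set -f" do not leak out.
risky_lib_dirs() (
    set -f
    IFS=:
    for dir in $1; do
        [ -n "$dir" ] || continue
        # cd -P follows symlinks; entries that cannot be entered
        # are compared literally instead.
        real=$(cd -P "$dir" 2>/dev/null && pwd -P) || real=${dir%/}
        case $real in
            /lib|/usr/lib) echo "$dir" ;;
        esac
    done
)

bad=$(risky_lib_dirs "${LD_LIBRARY_PATH:-}")
if [ -n "$bad" ]; then
    echo "consider removing these from LD_LIBRARY_PATH:"
    echo "$bad"
fi
```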

=head1 RUN CONFIGURE.

See the INSTALL file for general information regarding Configure.
Only Solaris-specific issues are discussed here.  Usually, the
defaults should be fine.

=head2 64-bit perl on Solaris.

See the INSTALL file for general information regarding 64-bit compiles.
In general, the defaults should be fine for most people.

By default, perl-5.6.0 (or later) is compiled as a 32-bit application
with largefile and long-long support.

=head3 General 32-bit vs. 64-bit issues.

Solaris 7 and above will run in either 32 bit or 64 bit mode on SPARC
CPUs, via a reboot. You can build 64 bit apps whilst running 32 bit
mode and vice-versa. 32 bit apps will run under Solaris running in
either 32 or 64 bit mode.  64 bit apps require Solaris to be running
64 bit mode.

Existing 32 bit apps are properly known as LP32, i.e. Longs and
Pointers are 32 bit.  64-bit apps are more properly known as LP64.
The discriminating feature of an LP64 app is its ability to utilise a
64-bit address space.  It is perfectly possible to have an LP32 app
that supports both 64-bit integers (long long) and largefiles (> 2GB),
and this is the default for perl-5.6.0.

For a more complete explanation of 64-bit issues, see the
"Solaris 64-bit Developer's Guide" at L<http://docs.sun.com/>.

You can detect the OS mode using "isainfo -v", e.g.

 $ isainfo -v   # Ultra 30 in 64 bit mode
 64-bit sparcv9 applications
 32-bit sparc applications

By default, perl will be compiled as a 32-bit application.  Unless
you want to allocate more than ~ 4GB of memory inside perl, or unless
you need more than 255 open file descriptors, you probably don't need
perl to be a 64-bit app.

=head3 Large File Support

For Solaris 2.6 and onwards, there are two different ways for 32-bit
applications to manipulate large files (files whose size is > 2GByte).
(A 64-bit application automatically has largefile support built in
by default.)

First is the "transitional compilation environment", described in
lfcompile64(5).  According to the man page,

 The transitional compilation  environment  exports  all  the
 explicit 64-bit functions (xxx64()) and types in addition to
 all the regular functions (xxx()) and types. Both xxx()  and
 xxx64()  functions  are  available to the program source.  A
 32-bit application must use the xxx64() functions in  order
 to  access  large  files.  See the lf64(5) manual page for a
 complete listing of the 64-bit transitional interfaces.

The transitional compilation environment is obtained with the
following compiler and linker flags:

 getconf LFS64_CFLAGS        -D_LARGEFILE64_SOURCE
 getconf LFS64_LDFLAGS       # nothing special needed
 getconf LFS64_LIBS          # nothing special needed

Second is the "large file compilation environment", described in
lfcompile(5).  According to the man page,

 Each interface named xxx() that needs to access 64-bit entities
 to  access  large  files maps to a xxx64() call in the
 resulting binary. All relevant data types are defined to  be
 of correct size (for example, off_t has a typedef definition
 for a 64-bit entity).

 An application compiled in this environment is able  to  use
 the  xxx()  source interfaces to access both large and small
 files, rather than having to explicitly utilize the  transitional
 xxx64()  interface  calls to access large files.

Two exceptions are fseek() and ftell().  32-bit applications should
use fseeko(3C) and ftello(3C).  These will get automatically mapped
to fseeko64() and ftello64().

The large file compilation environment is obtained with

 getconf LFS_CFLAGS      -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
 getconf LFS_LDFLAGS     # nothing special needed
 getconf LFS_LIBS        # nothing special needed

By default, perl uses the large file compilation environment and
relies on Solaris to do the underlying mapping of interfaces.

=head3 Building an LP64 perl

To compile a 64-bit application on an UltraSparc with a recent Sun Compiler,
you need to use the flag "-xarch=v9".  getconf(1) will tell you this, e.g.

 $ getconf -a | grep v9
 XBS5_LP64_OFF64_CFLAGS:         -xarch=v9
 XBS5_LP64_OFF64_LDFLAGS:        -xarch=v9
 XBS5_LP64_OFF64_LINTFLAGS:      -xarch=v9
 XBS5_LPBIG_OFFBIG_CFLAGS:       -xarch=v9
 XBS5_LPBIG_OFFBIG_LDFLAGS:      -xarch=v9
 XBS5_LPBIG_OFFBIG_LINTFLAGS:    -xarch=v9
 _XBS5_LP64_OFF64_CFLAGS:        -xarch=v9
 _XBS5_LP64_OFF64_LDFLAGS:       -xarch=v9
 _XBS5_LP64_OFF64_LINTFLAGS:     -xarch=v9
 _XBS5_LPBIG_OFFBIG_CFLAGS:      -xarch=v9
 _XBS5_LPBIG_OFFBIG_LDFLAGS:     -xarch=v9
 _XBS5_LPBIG_OFFBIG_LINTFLAGS:   -xarch=v9

This flag is supported in Sun WorkShop Compilers 5.0 and onwards
(now marketed under the name Forte) when used on Solaris 7 or later on
UltraSparc systems.

If you are using gcc, you would need to use -mcpu=v9 -m64 instead.  This
option is not yet supported as of gcc 2.95.2; from install/SPECIFIC
in that release:

 GCC version 2.95 is not able to compile code correctly for sparc64
 targets. Users of the Linux kernel, at least, can use the sparc32
 program to start up a new shell invocation with an environment that
 causes configure to recognize (via uname -a) the system as sparc-*-*
 instead.

All this should be handled automatically by the hints file, if
requested.

=head3 Long Doubles.

As of 5.8.1, long doubles are working if you use the Sun compilers
(needed for additional math routines not included in libm).

=head2 Threads in perl on Solaris.

It is possible to build a threaded version of perl on Solaris.  The entire
perl thread implementation is still experimental, however, so beware.

=head2 Malloc Issues with perl on Solaris.

Starting from perl 5.7.1 perl uses the Solaris malloc, since the perl
malloc breaks when dealing with more than 2GB of memory, and the Solaris
malloc also seems to be faster.

If for some reason (such as binary backward compatibility) you really
need to use perl's malloc, you can rebuild perl from the sources
and Configure the build with

 $ sh Configure -Dusemymalloc

You should not use perl's malloc if you are building with gcc.  There
are reports of core dumps, especially in the PDL module.  The problem
appears to go away under -DDEBUGGING, so it has been difficult to
track down.  Sun's compiler appears to be okay with or without perl's
malloc. [XXX further investigation is needed here.]

=head1 MAKE PROBLEMS.

=over 4

=item Dynamic Loading Problems With GNU as and GNU ld

If you have problems with dynamic loading using gcc on SunOS or
Solaris, and you are using GNU as and GNU ld, see the section
L</"GNU as and GNU ld"> above.

=item ld.so.1: ./perl: fatal: relocation error:

If you get this message on SunOS or Solaris, and you're using gcc,
it's probably the GNU as or GNU ld problem described in the previous
item; see L</"GNU as and GNU ld">.

=item dlopen: stub interception failed

The primary cause of the 'dlopen: stub interception failed' message is
that the LD_LIBRARY_PATH environment variable includes a directory
which is a symlink to /usr/lib (such as /lib).  See
L</"LD_LIBRARY_PATH"> above.

=item #error "No DATAMODEL_NATIVE specified"

This is a common error when trying to build perl on Solaris 2.6 with a
gcc installation from Solaris 2.5 or 2.5.1.  The Solaris header files
changed, so you need to update your gcc installation.  You can either
rerun the fixincludes script from gcc or take the opportunity to
update your gcc installation.

=item sh: ar: not found

This is a message from your shell telling you that the command 'ar'
was not found.  You need to check your PATH environment variable to
make sure that it includes the directory with the 'ar' command.  This
is a common problem on Solaris, where 'ar' is in the /usr/ccs/bin/
directory.

=back
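
Before starting a build, it can save time to confirm that the
development tools named in this document are reachable.  The
following is a small sketch using the standard C<command -v> shell
builtin; the function name C<missing_tools> is an assumption made for
this example.

```shell
# Print the name of each listed command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools make ar as ld)
if [ -n "$missing" ]; then
    echo "not found on PATH:" $missing
    echo "on Solaris, check that /usr/ccs/bin is in PATH"
fi
```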

=head1 MAKE TEST

=head2 op/stat.t test 4 in Solaris

F<op/stat.t> test 4 may fail if you are on a tmpfs of some sort.
Building in /tmp sometimes shows this behavior.  The
test suite detects if you are building in /tmp, but it may not be able
to catch all tmpfs situations.

=head2 nss_delete core dump from op/pwent or op/grent

See L<perlhpux/"nss_delete core dump from op/pwent or op/grent">.

=head1 CROSS-COMPILATION

Nothing too unusual here.  You can easily do this if you have a
cross-compiler available.  A typical Configure invocation when targeting
Solaris x86 looks something like this:

    sh ./Configure -des -Dusecrosscompile \
        -Dcc=i386-pc-solaris2.11-gcc      \
        -Dsysroot=$SYSROOT                \
        -Alddlflags=" -Wl,-z,notext"      \
        -Dtargethost=... # The usual cross-compilation options

The lddlflags addition is the only abnormal bit.

=head1 PREBUILT BINARIES OF PERL FOR SOLARIS.

You can pick up prebuilt binaries for Solaris from
L<http://www.sunfreeware.com/>, L<http://www.blastwave.org>,
ActiveState L<http://www.activestate.com/>, and
L<http://www.perl.com/> under the Binaries list at the top of the
page.  There are probably other sources as well.  Please note that
these sites are under the control of their respective owners, not the
perl developers.

=head1 RUNTIME ISSUES FOR PERL ON SOLARIS.

=head2 Limits on Numbers of Open Files on Solaris.

The stdio(3C) manpage notes that for LP32 applications, only 255
files may be opened using fopen(), and only file descriptors 0
through 255 can be used in a stream.  Since perl calls open() and
then fdopen(3C) with the resulting file descriptor, perl is limited
to 255 simultaneous open files, even if sysopen() is used.  If this
proves to be an insurmountable problem, you can compile perl as an
LP64 application; see L</Building an LP64 perl> for details.  Note
also that the default resource limit for open file descriptors on
Solaris is 255, so you will have to modify your ulimit or rctl
(Solaris 9 onwards) appropriately.
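
The current soft limit can be inspected from the shell before running
a script that opens many files.  This is a sketch only: the function
name C<fd_limit_at_least> and the figure 1024 are illustrative
assumptions, and raising the limit does not lift the stdio
restriction that applies to an LP32 build of perl.

```shell
# Succeed if the soft limit on open file descriptors is at least $1.
# An "unlimited" soft limit always passes.
fd_limit_at_least() {
    limit=$(ulimit -n)
    [ "$limit" = unlimited ] || [ "$limit" -ge "$1" ]
}

want=1024    # illustrative figure, not a recommendation
if fd_limit_at_least "$want"; then
    echo "soft limit $(ulimit -n) is sufficient"
else
    echo "soft limit is $(ulimit -n); try: ulimit -n $want"
fi
```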

=head1 SOLARIS-SPECIFIC MODULES.

See the modules under the Solaris:: and Sun::Solaris namespaces on CPAN:
L<http://www.cpan.org/modules/by-module/Solaris/> and
L<http://www.cpan.org/modules/by-module/Sun/>.

=head1 SOLARIS-SPECIFIC PROBLEMS WITH MODULES.

=head2 Proc::ProcessTable on Solaris

Proc::ProcessTable does not compile on Solaris with perl5.6.0 and higher
if you have LARGEFILES defined.  Since largefile support is the
default in 5.6.0 and later, you have to take special steps to use this
module.

The problem is that various structures visible via procfs use off_t,
and if you compile with largefile support these change from 32 bits to
64 bits.  Thus what you get back from procfs doesn't match up with
the structures in perl, resulting in garbage.  See proc(4) for further
discussion.

A fix for Proc::ProcessTable is to edit Makefile to
explicitly remove the largefile flags from the ones MakeMaker picks up
from Config.pm.  This will result in Proc::ProcessTable being built
under the correct environment.  Everything should then be OK as long as
Proc::ProcessTable doesn't try to share off_t's with the rest of perl,
or if it does they should be explicitly specified as off64_t.
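
The Makefile edit can be scripted with sed.  The sketch below assumes
that the flags to remove are the two reported by C<getconf LFS_CFLAGS>
(C<-D_LARGEFILE_SOURCE> and C<-D_FILE_OFFSET_BITS=64>); check the
generated Makefile to see what your build actually contains.  The
function name C<strip_lfs_flags> is made up for this example.

```shell
# Copy stdin to stdout with the two large file flags removed,
# together with any spaces immediately before them.
strip_lfs_flags() {
    sed -e 's/ *-D_LARGEFILE_SOURCE//g' \
        -e 's/ *-D_FILE_OFFSET_BITS=64//g'
}

# Demonstration on a sample line; a real run would filter the Makefile.
echo 'CCFLAGS = -O -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DFOO' |
    strip_lfs_flags
```

To apply it, filter the Makefile into a new file (for example
C<strip_lfs_flags E<lt> Makefile E<gt> Makefile.new>), inspect the
result, and only then replace the original.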

=head2 BSD::Resource on Solaris

BSD::Resource versions earlier than 1.09 do not compile on Solaris
with perl 5.6.0 and higher, for the same reasons as Proc::ProcessTable.
BSD::Resource versions starting from 1.09 have a workaround for the problem.

=head2 Net::SSLeay on Solaris

Net::SSLeay requires a /dev/urandom to be present. This device is
available from Solaris 9 onwards.  For earlier Solaris versions you
can either get the package SUNWski (packaged with several Sun
software products, for example the Sun WebServer, which is part of
the Solaris Server Intranet Extension, or the Sun Directory Services,
part of Solaris for ISPs) or download the ANDIrand package from
L<http://www.cosy.sbg.ac.at/~andi/>. If you use SUNWski, make a
symbolic link /dev/urandom pointing to /dev/random.  For more details,
see Document ID27606 entitled "Differing /dev/random support requirements
within Solaris[TM] Operating Environments", available at
L<http://sunsolve.sun.com>.

It may be possible to use the Entropy Gathering Daemon (written in
Perl!), available from L<http://www.lothar.com/tech/crypto/>.

=head1 SunOS 4.x

In SunOS 4.x you most probably want to use the SunOS ld, /usr/bin/ld,
since the more recent versions of GNU ld (like 2.13) do not seem to
work for building Perl anymore.  When linking the extensions, the
GNU ld gets very unhappy and spews a lot of errors like this

  ... relocation truncated to fit: BASE13 ...

and dies.  Therefore the SunOS 4.1 hints file explicitly sets the
ld to be F</usr/bin/ld>.

As of Perl 5.8.1 the dynamic loading of libraries (DynaLoader, XSLoader)
also seems to have become broken in SunOS 4.x.  Therefore the default
is to build Perl statically.

Running the test suite in SunOS 4.1 is a bit tricky since the
F<dist/Tie-File/t/09_gen_rs.t> test hangs (subtest #51, FWIW) for some
unknown reason.  Just stop the test and kill that particular Perl
process.

There are various other failures that, as of SunOS 4.1.4 and gcc 3.2.2,
look a lot like gcc bugs.  Many of the failures happen in the Encode
tests, where for example when the test expects "0" you get "&#48;"
which should after a little squinting look very odd indeed.
Another example is earlier in F<t/run/fresh_perl> where chr(0xff) is
expected but the test fails because the result is chr(0xff).  Exactly.

This is the "make test" result from the said combination:

  Failed 27 test scripts out of 745, 96.38% okay.

Running the C<harness> is painful because the many failing
Unicode-related tests will output megabytes of failure messages,
but if one patiently waits, one gets these results:

 Failed Test                     Stat Wstat Total Fail  Failed  List of Failed
 -----------------------------------------------------------------------------
 ...
 ../ext/Encode/t/at-cn.t            4  1024    29    4  13.79%  14-17
 ../ext/Encode/t/at-tw.t           10  2560    17   10  58.82%  2 4 6 8 10 12
                                                                14-17
 ../ext/Encode/t/enc_data.t        29  7424    ??   ??       %  ??
 ../ext/Encode/t/enc_eucjp.t       29  7424    ??   ??       %  ??
 ../ext/Encode/t/enc_module.t      29  7424    ??   ??       %  ??
 ../ext/Encode/t/encoding.t        29  7424    ??   ??       %  ??
 ../ext/Encode/t/grow.t            12  3072    24   12  50.00%  2 4 6 8 10 12 14
                                                                16 18 20 22 24
 ../ext/Encode/t/guess.t          255 65280    29   40 137.93%  10-29
 ../ext/Encode/t/jperl.t           29  7424    15   30 200.00%  1-15
 ../ext/Encode/t/mime-header.t      2   512    10    2  20.00%  2-3
 ../ext/Encode/t/perlio.t          22  5632    38   22  57.89%  1-4 9-16 19-20
                                                                23-24 27-32
 ../ext/List/Util/t/shuffle.t       0   139    ??   ??       %  ??
 ../ext/PerlIO/t/encoding.t                    14    1   7.14%  11
 ../ext/PerlIO/t/fallback.t                     9    2  22.22%  3 5
 ../ext/Socket/t/socketpair.t       0     2    45   70 155.56%  11-45
 ../lib/CPAN/t/vcmp.t                          30    1   3.33%  25
 ../lib/Tie/File/t/09_gen_rs.t      0    15    ??   ??       %  ??
 ../lib/Unicode/Collate/t/test.t              199   30  15.08%  7 26-27 71-75
                                                                81-88 95 101
                                                                103-104 106 108-
                                                                109 122 124 161
                                                                169-172
 ../lib/sort.t                      0   139   119   26  21.85%  107-119
 op/alarm.t                                     4    1  25.00%  4
 op/utfhash.t                                  97    1   1.03%  31
 run/fresh_perl.t                              91    1   1.10%  32
 uni/tr_7jis.t                                 ??   ??       %  ??
 uni/tr_eucjp.t                    29  7424     6   12 200.00%  1-6
 uni/tr_sjis.t                     29  7424     6   12 200.00%  1-6
 56 tests and 467 subtests skipped.
 Failed 27/811 test scripts, 96.67% okay. 1383/75399 subtests failed,
   98.17% okay.

The alarm() test failure is caused by system() apparently blocking
alarm().  That is probably a libc bug, and given that SunOS 4.x
was end-of-lifed years ago, don't hold your breath for a fix.
In addition to that, don't try anything too Unicode-y, especially
with Encode, and you should be fine in SunOS 4.x.

=head1 AUTHOR

The original was written by Andy Dougherty F<doughera@lafayette.edu>
drawing heavily on advice from Alan Burlison, Nick Ing-Simmons, Tim Bunce,
and many other Solaris users over the years.

Please report any errors, updates, or suggestions to F<perlbug@perl.org>.
perlbot.pod000064400000000460150344123440006717 0ustar00=encoding utf8

=head1 NAME

perlbot - Links to information on object-oriented programming in Perl

=head1 DESCRIPTION

For information on OO programming with Perl, please see L<perlootut>
and L<perlobj>.

(The above documents supersede the collection of tricks that was formerly here
in perlbot.)

=cut
perlhack.pod000064400000116775150344123440007062 0ustar00=encoding utf8

=for comment
Consistent formatting of this file is achieved with:
  perl ./Porting/podtidy pod/perlhack.pod

=head1 NAME

perlhack - How to hack on Perl

=head1 DESCRIPTION

This document explains how Perl development works.  It includes details
about the Perl 5 Porters email list, the Perl repository, the Perlbug
bug tracker, patch guidelines, and commentary on Perl development
philosophy.

=head1 SUPER QUICK PATCH GUIDE

If you just want to submit a single small patch like a pod fix, a test
for a bug, comment fixes, etc., it's easy! Here's how:

=over 4

=item * Check out the source repository

The perl source is in a git repository.  You can clone the repository
with the following command:

  % git clone git://perl5.git.perl.org/perl.git perl

=item * Ensure you're following the latest advice

In case the advice in this guide has been updated recently, read the
latest version directly from the perl source:

  % perldoc pod/perlhack.pod

=item * Make your change

Hack, hack, hack.  Keep in mind that Perl runs on many different
platforms, with different operating systems that have different
capabilities, different filesystem organizations, and even different
character sets.  L<perlhacktips> gives advice on this.

=item * Test your change

You can run all the tests with the following commands:

  % ./Configure -des -Dusedevel
  % make test

Keep hacking until the tests pass.

=item * Commit your change

Committing your work will save the change I<on your local system>:

  % git commit -a -m 'Commit message goes here'

Make sure the commit message describes your change in a single
sentence.  For example, "Fixed spelling errors in perlhack.pod".

=item * Send your change to perlbug

The next step is to submit your patch to the Perl core ticket system
via email.

If your changes are in a single git commit, run the following commands
to generate the patch file and attach it to your bug report:

  % git format-patch -1
  % ./perl -Ilib utils/perlbug -p 0001-*.patch

The perlbug program will ask you a few questions about your email
address and the patch you're submitting.  Once you've answered them it
will submit your patch via email.

If your changes are in multiple commits, generate a patch file for each
one and provide them to perlbug's C<-p> option separated by commas:

  % git format-patch -3
  % ./perl -Ilib utils/perlbug -p 0001-fix1.patch,0002-fix2.patch,\
  > 0003-fix3.patch

When prompted, pick a subject that summarizes your changes.

=item * Thank you

The porters appreciate the time you spent helping to make Perl better.
Thank you!

=item * Next time

The next time you wish to make a patch, you need to start from the
latest perl in a pristine state.  Check you don't have any local changes
or added files in your perl check-out which you wish to keep, then run
these commands:

  % git pull
  % git reset --hard origin/blead
  % git clean -dxf

=back
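
When a series contains many commits, typing the comma-separated list
for perlbug's C<-p> option by hand is error-prone.  A small shell
helper can build it; this is a sketch, and the function name
C<join_with_commas> is an assumption made for this example.

```shell
# Join all arguments into one comma-separated string, the form that
# perlbug's -p option takes for multiple patch files.
# The body runs in a subshell so that the IFS change does not leak out.
join_with_commas() (
    IFS=,
    printf '%s\n' "$*"
)

join_with_commas 0001-fix1.patch 0002-fix2.patch 0003-fix3.patch
```

The result can be passed straight to perlbug, for example
C<./perl -Ilib utils/perlbug -p "$(join_with_commas 000*.patch)">.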

=head1 BUG REPORTING

If you want to report a bug in Perl, you must use the F<perlbug>
command line tool.  This tool will ensure that your bug report includes
all the relevant system and configuration information.

To browse existing Perl bugs and patches, you can use the web interface
at L<http://rt.perl.org/>.

Please check the archive of the perl5-porters list (see below) and/or
the bug tracking system before submitting a bug report.  Often, you'll
find that the bug has been reported already.

You can log in to the bug tracking system and comment on existing bug
reports.  If you have additional information regarding an existing bug,
please add it.  This will help the porters fix the bug.

=head1 PERL 5 PORTERS

The perl5-porters (p5p) mailing list is where the Perl standard
distribution is maintained and developed.  The people who maintain Perl
are also referred to as the "Perl 5 Porters", "p5p" or just the
"porters".

A searchable archive of the list is available at
L<http://markmail.org/search/?q=perl5-porters>.  There is also an archive at
L<http://archive.develooper.com/perl5-porters@perl.org/>.

=head2 perl5-changes mailing list

The perl5-changes mailing list receives a copy of each patch that gets
submitted to the maintenance and development branches of the perl
repository.  See L<http://lists.perl.org/list/perl5-changes.html> for
subscription and archive information.

=head2 #p5p on IRC

Many porters are also active on the L<irc://irc.perl.org/#p5p> channel.
Feel free to join the channel and ask questions about hacking on the
Perl core.

=head1 GETTING THE PERL SOURCE

All of Perl's source code is kept centrally in a Git repository at
I<perl5.git.perl.org>.  The repository contains many Perl revisions
from Perl 1 onwards and all the revisions from Perforce, the previous
version control system.

For much more detail on using git with the Perl repository, please see
L<perlgit>.

=head2 Read access via Git

You will need a copy of Git for your computer.  You can fetch a copy of
the repository using the git protocol:

  % git clone git://perl5.git.perl.org/perl.git perl

This clones the repository and makes a local copy in the F<perl>
directory.

If you cannot use the git protocol for firewall reasons, you can also
clone via http, though this is much slower:

  % git clone http://perl5.git.perl.org/perl.git perl

=head2 Read access via the web

You may access the repository over the web.  This allows you to browse
the tree, see recent commits, subscribe to RSS feeds for the changes,
search for particular commits and more.  You may access it at
L<http://perl5.git.perl.org/perl.git>.  A mirror of the repository is
found at L<https://github.com/Perl/perl5>.

=head2 Read access via rsync

You can also choose to use rsync to get a copy of the current source
tree for the bleadperl branch and all maintenance branches:

  % rsync -avz rsync://perl5.git.perl.org/perl-current .
  % rsync -avz rsync://perl5.git.perl.org/perl-5.12.x .
  % rsync -avz rsync://perl5.git.perl.org/perl-5.10.x .
  % rsync -avz rsync://perl5.git.perl.org/perl-5.8.x .
  % rsync -avz rsync://perl5.git.perl.org/perl-5.6.x .
  % rsync -avz rsync://perl5.git.perl.org/perl-5.005xx .

(Add the C<--delete> option to remove leftover files.)

To get a full list of the available sync points:

  % rsync perl5.git.perl.org::

=head2 Write access via git

If you have a commit bit, please see L<perlgit> for more details on
using git.

=head1 PATCHING PERL

If you're planning to do more extensive work than a single small fix,
we encourage you to read the documentation below.  This will help you
focus your work and make your patches easier to incorporate into the
Perl source.

=head2 Submitting patches

If you have a small patch to submit, please submit it via perlbug.  You
can also send email directly to perlbug@perl.org.  Please note that
messages sent to perlbug may be held in a moderation queue, so you
won't receive a response immediately.

You'll know your submission has been processed when you receive an
email from our ticket tracking system.  This email will give you a
ticket number.  Once your patch has made it to the ticket tracking
system, it will also be sent to the perl5-porters@perl.org list.

Patches are reviewed and discussed on the p5p list.  Simple,
uncontroversial patches will usually be applied without any discussion.
When the patch is applied, the ticket will be updated and you will
receive email.  In addition, an email will be sent to the p5p list.

In other cases, the patch will need more work or discussion.  That will
happen on the p5p list.

You are encouraged to participate in the discussion and advocate for
your patch.  Sometimes your patch may get lost in the shuffle.  It's
appropriate to send a reminder email to p5p if no action has been taken
in a month.  Please remember that the Perl 5 developers are all
volunteers, and be polite.

Changes are always applied directly to the main development branch,
called "blead".  Some patches may be backported to a maintenance
branch.  If you think your patch is appropriate for the maintenance
branch (see L<perlpolicy/MAINTENANCE BRANCHES>), please explain why
when you submit it.

=head2 Getting your patch accepted

If you are submitting a code patch there are several things that you
can do to help the Perl 5 Porters accept your patch.

=head3 Patch style

If you used git to check out the Perl source, then using C<git
format-patch> will produce a patch in a style suitable for Perl.  The
C<format-patch> command produces one patch file for each commit you
made.  If you prefer to send a single patch for all commits, you can
use C<git diff>.

  % git checkout blead
  % git pull
  % git diff blead my-branch-name

This produces a patch based on the difference between blead and your
current branch.  It's important to make sure that blead is up to date
before producing the diff; that's why we call C<git pull> first.

We strongly recommend that you use git if possible.  It will make your
life easier, and ours as well.

However, if you're not using git, you can still produce a suitable
patch.  You'll need a pristine copy of the Perl source to diff against.
The porters prefer unified diffs.  Using GNU C<diff>, you can produce a
diff like this:

  % diff -Npurd perl.pristine perl.mine

Make sure that you C<make realclean> in your copy of Perl to remove any
build artifacts, or you may get a confusing result.

=head3 Commit message

As you craft each patch you intend to submit to the Perl core, it's
important to write a good commit message.  This is especially important
if your submission will consist of a series of commits.

The first line of the commit message should be a short description
without a period.  It should be no longer than the subject line of an
email, 50 characters being a good rule of thumb.

A lot of Git tools (Gitweb, GitHub, git log --pretty=oneline, ...) will
only display the first line (cut off at 50 characters) when presenting
commit summaries.
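
These first-line conventions are easy to check before sending a
patch.  The following is a sketch rather than an official tool: the
function name C<subject_ok> is invented here, and it treats the
50-character rule of thumb as a hard limit purely for illustration.

```shell
# Succeed if subject line $1 is non-empty, at most 50 characters
# long, and does not end with a period.
subject_ok() {
    [ -n "$1" ] || return 1
    [ "${#1}" -le 50 ] || return 1
    case $1 in
        *.) return 1 ;;
    esac
    return 0
}

# In a checkout with at least one commit, the latest subject can be
# tested like this:
#   subject_ok "$(git log -1 --pretty=%s)" || echo "reword the subject"
```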

The commit message should include a description of the problem that the
patch corrects or new functionality that the patch adds.

As a general rule of thumb, your commit message should help a
programmer who knows the Perl core quickly understand what you were
trying to do, how you were trying to do it, and why the change matters
to Perl.

=over 4

=item * Why

Your commit message should describe why the change you are making is
important.  When someone looks at your change in six months or six
years, your intent should be clear.

If you're deprecating a feature with the intent of later simplifying
another bit of code, say so.  If you're fixing a performance problem or
adding a new feature to support some other bit of the core, mention
that.

=item * What

Your commit message should describe what part of the Perl core you're
changing and what you expect your patch to do.

=item * How

While it's not necessary for documentation changes, new tests or
trivial patches, it's often worth explaining how your change works.
Even if it's clear to you today, it may not be clear to a porter next
month or next year.

=back

A commit message isn't intended to take the place of comments in your
code.  Commit messages should describe the change you made, while code
comments should describe the current state of the code.

If you've just implemented a new feature, complete with doc, tests and
well-commented code, a brief commit message will often suffice.  If,
however, you've just changed a single character deep in the parser or
lexer, you might need to write a small novel to ensure that future
readers understand what you did and why you did it.

=head3 Comments, Comments, Comments

Be sure to adequately comment your code.  While commenting every line
is unnecessary, anything that takes advantage of side effects of
operators, that creates changes that will be felt outside of the
function being patched, or that others may find confusing should be
documented.  If you are going to err, it is better to err on the side
of adding too many comments than too few.

The best comments explain I<why> the code does what it does, not I<what
it does>.

=head3 Style

In general, please follow the particular style of the code you are
patching.

In particular, follow these general guidelines for patching Perl
sources:

=over 4

=item *

4-wide indents for code, 2-wide indents for nested CPP C<#define>s,
with 8-wide tabstops.

=item *

Use spaces for indentation, not tab characters.

The codebase is a mixture of tabs and spaces for indentation, and we
are moving to spaces only.  Converting lines you're patching from 8-wide
tabs to spaces will help this migration.

=item *

Try hard not to exceed 79 columns

=item *

ANSI C prototypes

=item *

Uncuddled elses and "K&R" style for indenting control constructs

=item *

No C++ style (//) comments

=item *

Mark places that need to be revisited with XXX (and revisit often!)

=item *

Opening brace lines up with "if" when conditional spans multiple lines;
should be at end-of-line otherwise

=item *

In function definitions, name starts in column 0 (return value-type is on
previous line)

=item *

Single space after keywords that are followed by parens, no space
between function name and following paren

=item *

Avoid assignments in conditionals, but if they're unavoidable, use
extra paren, e.g. "if (a && (b = c)) ..."

=item *

"return foo;" rather than "return(foo);"

=item *

"if (!foo) ..." rather than "if (foo == FALSE) ..." etc.

=item *

Do not declare variables using "register".  It may be counterproductive
with modern compilers, and is deprecated in C++, under which the Perl
source is regularly compiled.

=item *

In-line functions that are in headers that are accessible to XS code
need to be able to compile without warnings with commonly used extra
compilation flags, such as gcc's C<-Wswitch-default> which warns
whenever a switch statement does not have a "default" case.  The use of
these extra flags is to catch potential problems in legal C code, and
is often used by Perl aggregators, such as Linux distributors.

=back

=head3 Test suite

If your patch changes code (rather than just changing documentation),
you should also include one or more test cases which illustrate the bug
you're fixing or validate the new functionality you're adding.  In
general, you should update an existing test file rather than create a
new one.

Your test suite additions should generally follow these guidelines
(courtesy of Gurusamy Sarathy <gsar@activestate.com>):

=over 4

=item *

Know what you're testing.  Read the docs, and the source.

=item *

Tend to fail, not succeed.

=item *

Interpret results strictly.

=item *

Use unrelated features (this will flush out bizarre interactions).

=item *

Use non-standard idioms (otherwise you are not testing TIMTOWTDI).

=item *

Avoid using hardcoded test numbers whenever possible (the EXPECTED/GOT
found in t/op/tie.t is much more maintainable, and gives better failure
reports).

=item *

Give meaningful error messages when a test fails.

=item *

Avoid using qx// and system() unless you are testing for them.  If you
do use them, make sure that you cover _all_ perl platforms.

=item *

Unlink any temporary files you create.

=item *

Promote unforeseen warnings to errors with $SIG{__WARN__}.

=item *

Be sure to use the libraries and modules shipped with the version being
tested, not those that were already installed.

=item *

Add comments to the code explaining what you are testing for.

=item *

Make updating the '1..42' string unnecessary.  Or make sure that you
update it.

=item *

Test _all_ behaviors of a given operator, library, or function.

Test all optional arguments.

Test return values in various contexts (boolean, scalar, list, lvalue).

Use both global and lexical variables.

Don't forget the exceptional, pathological cases.

=back
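
Several of the guidelines above can be combined in one short script.
The following is a minimal sketch using L<Test::More>, so it would only
be appropriate outside the most basic test directories; C<reverse> was
chosen arbitrarily as the function under test.

    use strict;
    use warnings;
    use Test::More;
    use File::Temp qw(tempfile);

    # Promote unforeseen warnings to errors so that they fail the test.
    $SIG{__WARN__} = sub { die "Unexpected warning: $_[0]" };

    # Test return values in more than one context.
    is(join('', reverse('a', 'b')), 'ba', 'reverse in list context');
    is(scalar(reverse('ab')),       'ba', 'reverse in scalar context');

    # Temporary files are removed at exit thanks to UNLINK => 1.
    my ($fh, $filename) = tempfile(UNLINK => 1);
    print {$fh} "line\n" or die "Cannot write to $filename: $!";
    close $fh            or die "Cannot close $filename: $!";
    ok(-s $filename, 'temporary file has content');

    # done_testing() makes a hardcoded plan such as '1..42' unnecessary.
    done_testing();

Each assertion carries a description, so a failure report says what was
being checked rather than just giving a test number.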

=head2 Patching a core module

This works just like patching anything else, with one extra
consideration.

Modules in the F<cpan/> directory of the source tree are maintained
outside of the Perl core.  When the author updates the module, the
updates are simply copied into the core.  See that module's
documentation or its listing on L<http://search.cpan.org/> for more
information on reporting bugs and submitting patches.

In most cases, patches to modules in F<cpan/> should be sent upstream
and should not be applied to the Perl core individually.  If a patch to
a file in F<cpan/> absolutely cannot wait for the fix to be made
upstream, released to CPAN and copied to blead, you must add (or
update) a C<CUSTOMIZED> entry in the F<"Porting/Maintainers.pl"> file
to flag that a local modification has been made.  See
F<"Porting/Maintainers.pl"> for more details.

In contrast, modules in the F<dist/> directory are maintained in the
core.

=head2 Updating perldelta

For changes significant enough to warrant a F<pod/perldelta.pod> entry,
the porters will greatly appreciate it if you submit a delta entry
along with your actual change.  Significant changes include, but are
not limited to:

=over 4

=item *

Adding, deprecating, or removing core features

=item *

Adding, deprecating, removing, or upgrading core or dual-life modules

=item *

Adding new core tests

=item *

Fixing security issues and user-visible bugs in the core

=item *

Changes that might break existing code, either on the perl or C level

=item *

Significant performance improvements

=item *

Adding, removing, or significantly changing documentation in the
F<pod/> directory

=item *

Important platform-specific changes

=back

Please make sure you add the perldelta entry to the right section
within F<pod/perldelta.pod>.  More information on how to write good
perldelta entries is available in the C<Style> section of
F<Porting/how_to_write_a_perldelta.pod>.

=head2 What makes for a good patch?

New features and extensions to the language can be contentious.  There
is no specific set of criteria which determine what features get added,
but here are some questions to consider when developing a patch:

=head3 Does the concept match the general goals of Perl?

Our goals include, but are not limited to:

=over 4

=item 1.

Keep it fast, simple, and useful.

=item 2.

Keep features/concepts as orthogonal as possible.

=item 3.

No arbitrary limits (platforms, data sizes, cultures).

=item 4.

Keep it open and exciting to use/patch/advocate Perl everywhere.

=item 5.

Either assimilate new technologies, or build bridges to them.

=back

=head3 Where is the implementation?

All the talk in the world is useless without an implementation.  In
almost every case, the person or people who argue for a new feature
will be expected to be the ones who implement it.  Porters capable of
coding new features have their own agendas, and are not available to
implement your (possibly good) idea.

=head3 Backwards compatibility

It's a cardinal sin to break existing Perl programs.  New warnings can
be contentious--some say that a program that emits warnings is not
broken, while others say it is.  Adding keywords has the potential to
break programs, and changing the meaning of existing token sequences or
functions might break them too.

The Perl 5 core includes mechanisms to help porters make backwards
incompatible changes more compatible such as the L<feature> and
L<deprecate> modules.  Please use them when appropriate.
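
As a small sketch of why this helps, the L<feature> pragma is lexically
scoped, so a new keyword is only a keyword where it has been asked for.
The user-defined C<say()> below is a hypothetical stand-in for
pre-existing code that happens to use the same name.

    use strict;
    use warnings;

    {
        use feature 'say';      # enabled only inside this block
        say 'new keyword enabled here';
    }

    # Outside the block "say" is an ordinary identifier again, so code
    # that defines its own say() keeps working.
    sub say { return "user-defined say(@_)" }

    print say('x'), "\n";

The first line of output comes from the built-in, the second from the
user's own subroutine.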

=head3 Could it be a module instead?

Perl 5 has extension mechanisms, modules and XS, specifically to avoid
the need to keep changing the Perl interpreter.  You can write modules
that export functions, you can give those functions prototypes so they
can be called like built-in functions, you can even write XS code to
mess with the runtime data structures of the Perl interpreter if you
want to implement really complicated things.

Whenever possible, new features should be prototyped in a CPAN module
before they will be considered for the core.
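
For instance, a function with a C<(&@)> prototype can be called with a
leading block, much as C<map> and C<grep> are.  The sketch below keeps
everything in one file for illustration; the module name
C<My::ListUtil> and the function C<apply_to_each()> are hypothetical.
The C<BEGIN> blocks stand in for what C<use My::ListUtil
'apply_to_each'> would do at compile time if the package lived in its
own file.

    use strict;
    use warnings;

    package My::ListUtil;           # hypothetical module
    use Exporter 'import';
    our @EXPORT_OK;
    BEGIN { @EXPORT_OK = ('apply_to_each') }

    # The (&@) prototype lets callers pass a bare block first.
    sub apply_to_each (&@) {
        my ($code, @items) = @_;
        my @results;
        for my $item (@items) {
            local $_ = $item;       # the block sees the item in $_
            push @results, $code->();
        }
        return @results;
    }

    package main;
    BEGIN { My::ListUtil->import('apply_to_each') }

    my @doubled = apply_to_each { $_ * 2 } 1, 2, 3;
    print "@doubled\n";

The import has to happen at compile time because the parser needs to
know the prototype before it reaches the call.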

=head3 Is the feature generic enough?

Is this something that only the submitter wants added to the language,
or is it broadly useful?  Sometimes, instead of adding a feature with a
tight focus, the porters might decide to wait until someone implements
the more generalized feature.

=head3 Does it potentially introduce new bugs?

Radical rewrites of large chunks of the Perl interpreter have the
potential to introduce new bugs.

=head3 How big is it?

The smaller and more localized the change, the better.  Similarly, a
series of small patches is greatly preferred over a single large patch.

=head3 Does it preclude other desirable features?

A patch is likely to be rejected if it closes off future avenues of
development.  For instance, a patch that placed a true and final
interpretation on prototypes is likely to be rejected because there are
still options for the future of prototypes that haven't been addressed.

=head3 Is the implementation robust?

Good patches (tight code, complete, correct) stand more chance of going
in.  Sloppy or incorrect patches might be placed on the back burner
until the pumpking has time to fix, or might be discarded altogether
without further notice.

=head3 Is the implementation generic enough to be portable?

The worst patches make use of system-specific features.  It's highly
unlikely that non-portable additions to the Perl language will be
accepted.

=head3 Is the implementation tested?

Patches which change behaviour (fixing bugs or introducing new
features) must include regression tests to verify that everything works
as expected.

Without tests provided by the original author, how can anyone else
changing perl in the future be sure that they haven't unwittingly
broken the behaviour the patch implements? And without tests, how can
the patch's author be confident that his/her hard work put into the
patch won't be accidentally thrown away by someone in the future?

=head3 Is there enough documentation?

Patches without documentation are probably ill-thought-out or
incomplete.  No features can be added or changed without documentation,
so submitting a patch for the appropriate pod docs as well as the
source code is important.

=head3 Is there another way to do it?

Larry said "Although the Perl Slogan is I<There's More Than One Way to
Do It>, I hesitate to make 10 ways to do something".  This is a tricky
heuristic to navigate, though--one man's essential addition is another
man's pointless cruft.

=head3 Does it create too much work?

Work for the pumpking, work for Perl programmers, work for module
authors, ... Perl is supposed to be easy.

=head3 Patches speak louder than words

Working code is always preferred to pie-in-the-sky ideas.  A patch to
add a feature stands a much higher chance of making it to the language
than does a random feature request, no matter how fervently argued the
request might be.  This ties into "Will it be useful?", as the fact
that someone took the time to make the patch demonstrates a strong
desire for the feature.

=head1 TESTING

The core uses the same testing style as the rest of Perl, a simple
"ok/not ok" run through Test::Harness, but there are a few special
considerations.

There are three ways to write a test in the core: L<Test::More>,
F<t/test.pl> and ad hoc C<print $test ? "ok 42\n" : "not ok 42\n">.
The decision of which to use depends on what part of the test suite
you're working on.  This is a measure to prevent a high-level failure
(such as Config.pm breaking) from causing basic functionality tests to
fail.

The F<t/test.pl> library provides some of the features of
L<Test::More>, but avoids loading most modules and uses as few core
features as possible.

If you write your own test, use the L<Test Anything
Protocol|http://testanything.org>.
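
A minimal sketch of an ad hoc test that emits this protocol by hand is
shown below.  It deliberately loads no modules and defines no
subroutines; the two checks are arbitrary examples.

    # No strict, no modules, no subroutines: only print and simple
    # expressions, as suits the most basic test directories.
    print "1..2\n";

    my $test = 1;
    print +((1 + 1 == 2) ? "ok" : "not ok"),
        " $test - integer addition\n";

    $test++;
    print +((("a" . "b") eq "ab") ? "ok" : "not ok"),
        " $test - string concatenation\n";

The plan line (C<1..2>) comes first, and each following line reports
one numbered result with a short description.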

=over 4

=item * F<t/base>, F<t/comp> and F<t/opbasic>

Since we don't know if C<require> works, or even subroutines, use ad hoc
tests for these three.  Step carefully to avoid using the feature being
tested.  Tests in F<t/opbasic>, for instance, have been placed there
rather than in F<t/op> because they test functionality which
F<t/test.pl> presumes has already been demonstrated to work.

=item * F<t/cmd>, F<t/run>, F<t/io> and F<t/op>

Now that basic require() and subroutines are tested, you can use the
F<t/test.pl> library.

You can also use certain libraries like Config conditionally, but be
sure to skip the test gracefully if it's not there.

=item * Everything else

Now that the core of Perl is tested, L<Test::More> can and should be
used.  You can also use the full suite of core modules in the tests.

=back
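
As a sketch of using a library such as Config conditionally, a test can
attempt to load it and report a skip if that fails.  The example prints
its results by hand so that it stands alone; a real test in these
directories would normally use the helpers from F<t/test.pl> instead.

    use strict;
    use warnings;

    # Try to load Config at compile time, but don't die if it's missing.
    my $have_config;
    BEGIN { $have_config = eval { require Config; 1 } }

    print "1..1\n";
    if (!$have_config) {
        print "ok 1 # skip Config is not available\n";
    }
    else {
        my $ok = defined $Config::Config{osname};
        print +($ok ? 'ok' : 'not ok'), " 1 - osname is defined\n";
    }

A skipped test is still reported as C<ok>, with a C<# skip> directive
explaining why it was not run.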

When you say "make test", Perl uses the F<t/TEST> program to run the
test suite (except under Win32 where it uses F<t/harness> instead).
All tests are run from the F<t/> directory, B<not> the directory which
contains the test.  This causes some problems with the tests in
F<lib/>, so here's some opportunity for some patching.

You must be triply conscious of cross-platform concerns.  This usually
boils down to using L<File::Spec>, avoiding things like C<fork()>
and C<system()> unless absolutely necessary, and not assuming that a
given character has a particular ordinal value (code point) or that its
UTF-8 representation is composed of particular bytes.
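
For example, a path can be built from its components with
L<File::Spec> rather than by joining strings with C</>.  This is a
minimal sketch; the path components are arbitrary.

    use strict;
    use warnings;
    use File::Spec;

    # Build a path from its components instead of hardcoding "/".
    my $path = File::Spec->catfile('lib', 'File', 'Spec.pm');

    # Ask for the temporary directory instead of assuming "/tmp".
    my $tmpdir = File::Spec->tmpdir;

    print "path:   $path\n";
    print "tmpdir: $tmpdir\n";

On a Unix-like system the first line shows C<lib/File/Spec.pm>; other
platforms use their own separators.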

There are several functions available to specify characters and code
points portably in tests.  The always-preloaded functions
C<utf8::unicode_to_native()> and its inverse
C<utf8::native_to_unicode()> take code points and translate
appropriately.  The file F<t/charset_tools.pl> has several functions
that can be useful.  It has versions of the previous two functions
that take strings as inputs -- not single numeric code points:
C<uni_to_native()> and C<native_to_uni()>.  If you must look at the
individual bytes comprising a UTF-8 encoded string,
C<byte_utf8a_to_utf8n()> takes as input a string of those bytes encoded
for an ASCII platform, and returns the equivalent string in the native
platform.  For example, C<byte_utf8a_to_utf8n("\xC2\xA0")> returns the
byte sequence on the current platform that forms the UTF-8 for C<U+00A0>,
since C<"\xC2\xA0"> are the UTF-8 bytes on an ASCII platform for that
code point.  This function returns C<"\xC2\xA0"> on an ASCII platform, and
C<"\x80\x41"> on an EBCDIC 1047 one.
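
A short sketch of the two preloaded functions follows.  It uses only
C<utf8::unicode_to_native()> and C<utf8::native_to_unicode()>, so it
does not depend on F<t/charset_tools.pl>; the test descriptions are
illustrative.

    use strict;
    use warnings;

    print "1..2\n";

    # U+0041 is LATIN CAPITAL LETTER A.  Its native code point is 0x41
    # on ASCII platforms and 0xC1 on EBCDIC ones.
    my $native = utf8::unicode_to_native(0x41);
    print +(chr($native) eq 'A' ? 'ok' : 'not ok'),
        " 1 - U+0041 maps to the native 'A'\n";

    # The inverse maps the native ordinal back to the Unicode one.
    my $unicode = utf8::native_to_unicode(ord 'A');
    print +($unicode == 0x41 ? 'ok' : 'not ok'),
        " 2 - native 'A' maps back to U+0041\n";

Neither check assumes a particular ordinal for C<'A'>, so the same
script should pass on ASCII and EBCDIC platforms alike.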

But the easiest approach, if the character is specifiable as a literal,
like C<"A"> or C<"%">, is to use that; if it is not so specifiable, you
can use C<\N{}>, if the side effects aren't troublesome.  Simply specify
all your characters in hex, using C<\N{U+ZZ}> instead of C<\xZZ>.
C<\N{}> takes a Unicode name or code point, and so it always gives you
the Unicode character.  C<\N{U+41}> is the character whose Unicode code
point is C<0x41>, hence is C<'A'> on all platforms.  The side effects
are:

=over 4

=item *

These select Unicode rules.  That means that in double-quotish strings,
the string is always converted to UTF-8 to force a Unicode
interpretation (you can C<utf8::downgrade()> afterwards to convert back
to non-UTF8, if possible).  In regular expression patterns, the
conversion isn't done, but if the character set modifier would
otherwise be C</d>, it is changed to C</u>.

=item *

If you use the form C<\N{I<character name>}>, the L<charnames> module
gets automatically loaded.  This may not be suitable for the test level
you are doing.

=back
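
The following sketch shows the C<\N{U+...}> form together with
C<utf8::downgrade()>.  The choice of C<U+00E9> is arbitrary; any
character whose code point fits in a single byte would do.

    use strict;
    use warnings;

    print "1..2\n";

    # U+0041 names the same character on every platform.
    my $letter = "\N{U+41}";
    print +($letter eq 'A' ? 'ok' : 'not ok'), " 1 - U+0041 is 'A'\n";

    # U+00E9 fits in a single byte, so the string can be converted
    # back to non-UTF-8 after \N{} has selected Unicode rules.
    my $e_acute    = "\N{U+E9}";
    my $downgraded = utf8::downgrade($e_acute, 1);  # 1: fail quietly
    print +($downgraded && length($e_acute) == 1 ? 'ok' : 'not ok'),
        " 2 - string downgraded, still one character\n";

Because the C<U+> form is used rather than a character name,
L<charnames> is not needed here.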

If you are testing locales (see L<perllocale>), there are helper
functions in F<t/loc_tools.pl> to enable you to see what locales there
are on the current platform.

=head2 Special C<make test> targets

There are various special make targets that can be used to test Perl
slightly differently than the standard "test" target.  Not all of them are
expected to give a 100% success rate.  Many of them have several
aliases, and many of them are not available on certain operating
systems.

=over 4

=item * test_porting

This runs some basic sanity tests on the source tree and helps catch
basic errors before you submit a patch.

=item * minitest

Run F<miniperl> on F<t/base>, F<t/comp>, F<t/cmd>, F<t/run>, F<t/io>,
F<t/op>, F<t/uni> and F<t/mro> tests.

=item * test.valgrind check.valgrind

(Only on Linux) Run all the tests using the memory leak + naughty
memory access tool "valgrind".  The log files will be named
F<testname.valgrind>.

=item * test_harness

Run the test suite with the F<t/harness> controlling program, instead
of F<t/TEST>.  F<t/harness> is more sophisticated, and uses the
L<Test::Harness> module, thus using this test target supposes that perl
mostly works.  The main advantage for our purposes is that it prints a
detailed summary of failed tests at the end.  Also, unlike F<t/TEST>,
it doesn't redirect stderr to stdout.

Note that under Win32 F<t/harness> is always used instead of F<t/TEST>,
so there is no special "test_harness" target.

Under Win32's "test" target you may use the TEST_SWITCHES and
TEST_FILES environment variables to control the behaviour of
F<t/harness>.  This means you can say

    nmake test TEST_FILES="op/*.t"
    nmake test TEST_SWITCHES="-torture" TEST_FILES="op/*.t"

=item * test-notty test_notty

Sets PERL_SKIP_TTY_TEST to true before running normal test.

=back

=head2 Parallel tests

The core distribution can now run its regression tests in parallel on
Unix-like platforms.  Instead of running C<make test>, set C<TEST_JOBS>
in your environment to the number of tests to run in parallel, and run
C<make test_harness>.  On a Bourne-like shell, this can be done as

    TEST_JOBS=3 make test_harness  # Run 3 tests in parallel

An environment variable is used, rather than parallel make itself,
because L<TAP::Harness> needs to be able to schedule individual
non-conflicting test scripts itself, and there is no standard interface
to C<make> utilities to interact with their job schedulers.

Note that currently some test scripts may fail when run in parallel
(most notably F<dist/IO/t/io_dir.t>).  If necessary, run just the
failing scripts again sequentially and see if the failures go away.

=head2 Running tests by hand

You can run part of the test suite by hand by using one of the
following commands from the F<t/> directory:

    ./perl -I../lib TEST list-of-.t-files

or

    ./perl -I../lib harness list-of-.t-files

(If you don't specify test scripts, the whole test suite will be run.)

=head2 Using F<t/harness> for testing

If you use C<harness> for testing, you have several command line
options available to you.  The arguments are as follows, and are in the
order that they must appear if used together.

    harness -v -torture -re=pattern LIST OF FILES TO TEST
    harness -v -torture -re LIST OF PATTERNS TO MATCH

If C<LIST OF FILES TO TEST> is omitted, the file list is obtained from
the manifest.  The file list may include shell wildcards which will be
expanded out.

=over 4

=item * -v

Run the tests under verbose mode so you can see what tests were run,
and debug output.

=item * -torture

Run the torture tests as well as the normal set.

=item * -re=PATTERN

Filter the file list so that all the test files run match PATTERN.
Note that this form is distinct from the B<-re LIST OF PATTERNS> form
below in that it allows the file list to be provided as well.

=item * -re LIST OF PATTERNS

Filter the file list so that all the test files run match
/(LIST|OF|PATTERNS)/.  Note that with this form the patterns are joined
by '|' and you cannot supply a list of files, instead the test files
are obtained from the MANIFEST.

=back
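
The effect of joining the patterns with '|' can be sketched as follows.
This only illustrates the filtering described above and is not the
harness's own code; the pattern and file lists are made up.

    use strict;
    use warnings;

    # Hypothetical inputs: patterns from the command line and a file
    # list such as might be read from the MANIFEST.
    my @patterns = ('op/sp', 'io/open');
    my @files    = ('op/split.t', 'op/sprintf.t',
                    'io/open.t',  're/pat.t');

    # Join the patterns into one alternation and keep matching files.
    my $alternation = join '|', @patterns;
    my @selected    = grep { /($alternation)/ } @files;

    print "$_\n" for @selected;

With these inputs the script prints F<op/split.t>, F<op/sprintf.t> and
F<io/open.t>, one per line.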

You can run an individual test by a command similar to

    ./perl -I../lib path/to/foo.t

except that the harnesses set up some environment variables that may
affect the execution of the test:

=over 4

=item * PERL_CORE=1

indicates that we're running this test as part of the perl core test
suite.  This is useful for modules that have a dual life on CPAN.

=item * PERL_DESTRUCT_LEVEL=2

is set to 2 if it isn't set already (see
L<perlhacktips/PERL_DESTRUCT_LEVEL>).

=item * PERL

(used only by F<t/TEST>) if set, overrides the path to the perl
executable that should be used to run the tests (the default being
F<./perl>).

=item * PERL_SKIP_TTY_TEST

if set, tells the harness to skip the tests that need a terminal.  It's
actually set automatically by the Makefile, but can also be forced
artificially by running 'make test_notty'.

=back

=head3 Other environment variables that may influence tests

=over 4

=item * PERL_TEST_Net_Ping

Setting this variable runs all the Net::Ping modules tests, otherwise
some tests that interact with the outside world are skipped.  See
L<perl58delta>.

=item * PERL_TEST_NOVREXX

Setting this variable skips the vrexx.t tests for OS2::REXX.

=item * PERL_TEST_NUMCONVERTS

This sets a variable in op/numconvert.t.

=item * PERL_TEST_MEMORY

Setting this variable includes the tests in F<t/bigmem/>.  This should
be set to the number of gigabytes of memory available for testing, e.g.
C<PERL_TEST_MEMORY=4> indicates that tests that require 4GiB of
available memory can be run safely.

=back
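
A test can consult such a variable before doing anything expensive.
The sketch below shows the general shape for C<PERL_TEST_MEMORY>; the
4 GiB requirement is an arbitrary example, and the large allocation
itself is left out.

    use strict;
    use warnings;

    my $needed_gib    = 4;                            # example value
    my $available_gib = $ENV{PERL_TEST_MEMORY} || 0;  # unset means 0

    print "1..1\n";
    if ($available_gib < $needed_gib) {
        print "ok 1 # skip needs PERL_TEST_MEMORY >= $needed_gib\n";
    }
    else {
        # The memory-hungry test itself would go here.
        print "ok 1 - enough memory declared for the test\n";
    }

When the variable is unset the test is skipped rather than failed, so
an ordinary C<make test> run is unaffected.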

See also the documentation for the Test and Test::Harness modules, for
more environment variables that affect testing.

=head2 Performance testing

The file F<t/perf/benchmarks> contains snippets of perl code which are
intended to be benchmarked across a range of perls by the
F<Porting/bench.pl> tool. If you fix or enhance a performance issue, you
may want to add a representative code sample to the file, then run
F<bench.pl> against the previous and current perls to see what difference
it has made, and whether anything else has slowed down as a consequence.

The file F<t/perf/opcount.t> is designed to test whether a particular
code snippet has been compiled into an optree containing specified
numbers of particular op types. This is good for testing whether
optimisations which alter ops, such as converting an C<aelem> op into an
C<aelemfast> op, are really doing that.

The files F<t/perf/speed.t> and F<t/re/speed.t> are designed to test
things that run thousands of times slower if a particular optimisation
is broken (for example, the utf8 length cache on long utf8 strings).
Add a test that will take a fraction of a second normally, and minutes
otherwise, causing the test file to time out on failure.

=head1 MORE READING FOR GUTS HACKERS

To hack on the Perl guts, you'll need to read the following things:

=over 4

=item * L<perlsource>

An overview of the Perl source tree.  This will help you find the files
you're looking for.

=item * L<perlinterp>

An overview of the Perl interpreter source code and some details on how
Perl does what it does.

=item * L<perlhacktut>

This document walks through the creation of a small patch to Perl's C
code.  If you're just getting started with Perl core hacking, this will
help you understand how it works.

=item * L<perlhacktips>

More details on hacking the Perl core.  This document focuses on lower
level details such as how to write tests, compilation issues,
portability, debugging, etc.

If you plan on doing serious C hacking, make sure to read this.

=item * L<perlguts>

This is of paramount importance, since it's the documentation of what
goes where in the Perl source.  Read it over a couple of times and it
might start to make sense - don't worry if it doesn't yet, because the
best way to study it is to read it in conjunction with poking at Perl
source, and we'll do that later on.

Gisle Aas's "illustrated perlguts", also known as I<illguts>, has very
helpful pictures:

L<http://search.cpan.org/dist/illguts/>

=item * L<perlxstut> and L<perlxs>

A working knowledge of XSUB programming is incredibly useful for core
hacking; XSUBs use techniques drawn from the PP code, the portion of
the guts that actually executes a Perl program.  It's a lot gentler to
learn those techniques from simple examples and explanation than from
the core itself.

=item * L<perlapi>

The documentation for the Perl API explains what some of the internal
functions do, as well as the many macros used in the source.

=item * F<Porting/pumpkin.pod>

This is a collection of words of wisdom for a Perl porter; some of it
is only useful to the pumpkin holder, but most of it applies to anyone
wanting to go about Perl development.

=back

=head1 CPAN TESTERS AND PERL SMOKERS

The CPAN testers ( L<http://testers.cpan.org/> ) are a group of volunteers
who test CPAN modules on a variety of platforms.

Perl Smokers ( L<http://www.nntp.perl.org/group/perl.daily-build/> and
L<http://www.nntp.perl.org/group/perl.daily-build.reports/> )
automatically test Perl source releases on platforms with various
configurations.

Both efforts welcome volunteers.  In order to get involved in smoke
testing of perl itself, visit
L<http://search.cpan.org/dist/Test-Smoke/>.  In order to start smoke
testing CPAN modules visit
L<http://search.cpan.org/dist/CPANPLUS-YACSmoke/> or
L<http://search.cpan.org/dist/minismokebox/> or
L<http://search.cpan.org/dist/CPAN-Reporter/>.

=head1 WHAT NEXT?

If you've read all the documentation in this document and the ones
listed above, you're more than ready to hack on Perl.

Here are some more recommendations:

=over 4

=item *

Subscribe to perl5-porters, follow the patches and try and understand
them; don't be afraid to ask if there's a portion you're not clear on -
who knows, you may unearth a bug in the patch...

=item *

Do read the README associated with your operating system, e.g.
README.aix on the IBM AIX OS.  Don't hesitate to supply patches to that
README if you find anything missing or changed over a new OS release.

=item *

Find an area of Perl that seems interesting to you, and see if you can
work out how it works.  Scan through the source, and step over it in
the debugger.  Play, poke, investigate, fiddle! You'll probably get to
understand not just your chosen area but a much wider range of
F<perl>'s activity as well, and probably sooner than you'd think.

=back

=head2 "The Road goes ever on and on, down from the door where it began."

If you can do these things, you've started on the long road to Perl
porting.  Thanks for wanting to help make Perl better - and happy
hacking!

=head2 Metaphoric Quotations

If you recognized the quote about the Road above, you're in luck.

Most software projects begin each file with a literal description of
each file's purpose.  Perl instead begins each with a literary allusion
to that file's purpose.

Like chapters in many books, all top-level Perl source files (along
with a few others here and there) begin with an epigrammatic
inscription that alludes, indirectly and metaphorically, to the
material you're about to read.

Quotations are taken from writings of J.R.R. Tolkien pertaining to his
Legendarium, almost always from I<The Lord of the Rings>.  Chapters and
page numbers are given using the following editions:

=over 4

=item *

I<The Hobbit>, by J.R.R. Tolkien.  The hardcover, 70th-anniversary
edition of 2007 was used, published in the UK by Harper Collins
Publishers and in the US by the Houghton Mifflin Company.

=item *

I<The Lord of the Rings>, by J.R.R. Tolkien.  The hardcover,
50th-anniversary edition of 2004 was used, published in the UK by
Harper Collins Publishers and in the US by the Houghton Mifflin
Company.

=item *

I<The Lays of Beleriand>, by J.R.R. Tolkien and published posthumously
by his son and literary executor, C.J.R. Tolkien, being the 3rd of the
12 volumes in Christopher's mammoth I<History of Middle-earth>.  Page
numbers derive from the hardcover edition, first published in 1983 by
George Allen & Unwin; no page numbers changed for the special 3-volume
omnibus edition of 2002 or the various trade-paper editions, all again
now by Harper Collins or Houghton Mifflin.

=back

Other JRRT books fair game for quotes would thus include I<The
Adventures of Tom Bombadil>, I<The Silmarillion>, I<Unfinished Tales>,
and I<The Tale of the Children of Hurin>, all but the first
posthumously assembled by CJRT.  But I<The Lord of the Rings> itself is
perfectly fine and probably best to quote from, provided you can find a
suitable quote there.

So if you were to supply a new, complete, top-level source file to add
to Perl, you should conform to this peculiar practice by yourself
selecting an appropriate quotation from Tolkien, retaining the original
spelling and punctuation and using the same format the rest of the
quotes are in.  Indirect and oblique is just fine; remember, it's a
metaphor, so being meta is, after all, what it's for.

=head1 AUTHOR

This document was originally written by Nathan Torkington, and is
maintained by the perl5-porters mailing list.

=encoding utf8

=head1 NAME

perl5243delta - what is new for perl v5.24.3

=head1 DESCRIPTION

This document describes differences between the 5.24.2 release and the 5.24.3
release.

If you are upgrading from an earlier release such as 5.24.1, first read
L<perl5242delta>, which describes differences between 5.24.1 and 5.24.2.

=head1 Security

=head2 [CVE-2017-12837] Heap buffer overflow in regular expression compiler

Compiling certain regular expression patterns with the case-insensitive
modifier could cause a heap buffer overflow and crash perl.  This has now been
fixed.
L<[perl #131582]|https://rt.perl.org/Public/Bug/Display.html?id=131582>

=head2 [CVE-2017-12883] Buffer over-read in regular expression parser

For certain types of syntax error in a regular expression pattern, the error
message could either contain the contents of a random, possibly large, chunk of
memory, or could crash perl.  This has now been fixed.
L<[perl #131598]|https://rt.perl.org/Public/Bug/Display.html?id=131598>

=head2 [CVE-2017-12814] C<$ENV{$key}> stack buffer overflow on Windows

A possible stack buffer overflow in the C<%ENV> code on Windows has been fixed
by removing the buffer completely since it was superfluous anyway.
L<[perl #131665]|https://rt.perl.org/Public/Bug/Display.html?id=131665>

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.24.2.  If any exist,
they are bugs, and we request that you submit a report.  See L</Reporting
Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Module::CoreList> has been upgraded from version 5.20170715_24 to
5.20170922_24.

=item *

L<POSIX> has been upgraded from version 1.65 to 1.65_01.

=item *

L<Time::HiRes> has been upgraded from version 1.9733 to 1.9741.

L<[perl #128427]|https://rt.perl.org/Public/Bug/Display.html?id=128427>
L<[perl #128445]|https://rt.perl.org/Public/Bug/Display.html?id=128445>
L<[perl #128972]|https://rt.perl.org/Public/Bug/Display.html?id=128972>
L<[cpan #120032]|https://rt.cpan.org/Public/Bug/Display.html?id=120032>

=back

=head1 Configuration and Compilation

=over 4

=item *

When building with GCC 6 and link-time optimization (the B<-flto> option to
B<gcc>), F<Configure> was treating all probed symbols as present on the system,
regardless of whether they actually exist.  This has been fixed.
L<[perl #128131]|https://rt.perl.org/Public/Bug/Display.html?id=128131>

=item *

F<Configure> now aborts if both C<-Duselongdouble> and C<-Dusequadmath> are
requested.
L<[perl #126203]|https://rt.perl.org/Public/Bug/Display.html?id=126203>

=item *

Fixed a bug in which F<Configure> could append C<-quadmath> to the archname
even if it was already present.
L<[perl #128538]|https://rt.perl.org/Public/Bug/Display.html?id=128538>

=item *

Clang builds with C<-DPERL_GLOBAL_STRUCT> or C<-DPERL_GLOBAL_STRUCT_PRIVATE>
have been fixed (by disabling Thread Safety Analysis for these configurations).

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item VMS

=over 4

=item *

C<configure.com> now recognizes the VSI-branded C compiler.

=back

=item Windows

=over 4

=item *

Building XS modules with GCC 6 in a 64-bit build of Perl failed due to
incorrect mapping of C<strtoll> and C<strtoull>.  This has now been fixed.
L<[perl #131726]|https://rt.perl.org/Public/Bug/Display.html?id=131726>
L<[cpan #121683]|https://rt.cpan.org/Public/Bug/Display.html?id=121683>
L<[cpan #122353]|https://rt.cpan.org/Public/Bug/Display.html?id=122353>

=back

=back

=head1 Selected Bug Fixes

=over 4

=item *

C<< /@0{0*-E<gt>@*/*0 >> and similar contortions used to crash, but now merely
produce a syntax error.
L<[perl #128171]|https://rt.perl.org/Public/Bug/Display.html?id=128171>

=item *

C<do> or C<require> with an argument which is a reference or typeglob which,
when stringified, contains a null character, started crashing in Perl 5.20, but
has now been fixed.
L<[perl #128182]|https://rt.perl.org/Public/Bug/Display.html?id=128182>

=item *

Expressions containing an C<&&> or C<||> operator (or their synonyms C<and> and
C<or>) were being compiled incorrectly in some cases.  If the left-hand side
consisted of either a negated bareword constant or a negated C<do {}> block
containing a constant expression, and the right-hand side consisted of a
negated non-foldable expression, one of the negations was effectively ignored.
The same was true of C<if> and C<unless> statement modifiers, though with the
left-hand and right-hand sides swapped.  This long-standing bug has now been
fixed.
L<[perl #127952]|https://rt.perl.org/Public/Bug/Display.html?id=127952>

=item *

C<reset> with an argument no longer crashes when encountering stash entries
other than globs.
L<[perl #128106]|https://rt.perl.org/Public/Bug/Display.html?id=128106>

=item *

Assignment of hashes to, and deletion of, typeglobs named C<*::::::> no longer
causes crashes.
L<[perl #128086]|https://rt.perl.org/Public/Bug/Display.html?id=128086>

=item *

Assignment variants of any bitwise ops under the C<bitwise> feature no longer
crash if the left-hand side is an array or hash.
L<[perl #128204]|https://rt.perl.org/Public/Bug/Display.html?id=128204>

=item *

C<socket> now leaves the error code returned by the system in C<$!> on failure.
L<[perl #128316]|https://rt.perl.org/Public/Bug/Display.html?id=128316>

=item *

Parsing bad POSIX charclasses no longer leaks memory.
L<[perl #128313]|https://rt.perl.org/Public/Bug/Display.html?id=128313>

=item *

Since Perl 5.20, line numbers have been off by one when perl is invoked with
the B<-x> switch.  This has been fixed.
L<[perl #128508]|https://rt.perl.org/Public/Bug/Display.html?id=128508>

=item *

Some obscure cases of subroutines and file handles being freed at the same time
could result in crashes, but have been fixed.  The crash was introduced in Perl
5.22.
L<[perl #128597]|https://rt.perl.org/Public/Bug/Display.html?id=128597>

=item *

Some regular expression parsing glitches could lead to assertion failures with
regular expressions such as C</(?E<lt>=/> and C</(?E<lt>!/>.  This has now been
fixed.
L<[perl #128170]|https://rt.perl.org/Public/Bug/Display.html?id=128170>

=item *

C<gethostent> and similar functions now perform a null check internally, to
avoid crashing with the torsocks library.  This was a regression from Perl
5.22.
L<[perl #128740]|https://rt.perl.org/Public/Bug/Display.html?id=128740>

=item *

Mentioning the same constant twice in a row (which is a syntax error) no longer
fails an assertion under debugging builds.  This was a regression from Perl
5.20.
L<[perl #126482]|https://rt.perl.org/Public/Bug/Display.html?id=126482>

=item *

In Perl 5.24 C<fchown> was changed not to accept negative one as an argument
because on some platforms that is an error.  However, on some other platforms
that is an acceptable argument.  This change has been reverted.
L<[perl #128967]|https://rt.perl.org/Public/Bug/Display.html?id=128967>

=item *

C<@{x> followed by a newline where C<"x"> represents a control or non-ASCII
character no longer produces a garbled syntax error message or a crash.
L<[perl #128951]|https://rt.perl.org/Public/Bug/Display.html?id=128951>

=item *

A regression in Perl 5.24 with C<tr/\N{U+...}/foo/> when the code point was
between 128 and 255 has been fixed.
L<[perl #128734]|https://rt.perl.org/Public/Bug/Display.html?id=128734>

=item *

Many issues relating to C<printf "%a"> of hexadecimal floating point were
fixed.  In addition, the "subnormals" (formerly known as "denormals") floating
point numbers are now supported both with the plain IEEE 754 floating point
numbers (64-bit or 128-bit) and the x86 80-bit "extended precision".  Note that
subnormal hexadecimal floating point literals will give a warning about
"exponent underflow".
L<[perl #128843]|https://rt.perl.org/Public/Bug/Display.html?id=128843>
L<[perl #128888]|https://rt.perl.org/Public/Bug/Display.html?id=128888>
L<[perl #128889]|https://rt.perl.org/Public/Bug/Display.html?id=128889>
L<[perl #128890]|https://rt.perl.org/Public/Bug/Display.html?id=128890>
L<[perl #128893]|https://rt.perl.org/Public/Bug/Display.html?id=128893>
L<[perl #128909]|https://rt.perl.org/Public/Bug/Display.html?id=128909>
L<[perl #128919]|https://rt.perl.org/Public/Bug/Display.html?id=128919>

=item *

The parser could sometimes crash if a bareword came after C<evalbytes>.
L<[perl #129196]|https://rt.perl.org/Public/Bug/Display.html?id=129196>

=item *

Fixed a place where the regex parser was not setting the syntax error correctly
on a syntactically incorrect pattern.
L<[perl #129122]|https://rt.perl.org/Public/Bug/Display.html?id=129122>

=item *

A vulnerability in Perl's C<sprintf> implementation has been fixed by avoiding
a possible memory wrap.
L<[perl #131260]|https://rt.perl.org/Public/Bug/Display.html?id=131260>

=back
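
The C<&&>/C<||> fix listed above (perl #127952) concerns expressions of the
following shape.  This is only a minimal sketch: the constant C<DEBUG> and the
variable C<$quiet> are hypothetical names chosen for illustration, and on a
perl that includes the fix both negations should take effect.

```perl
use strict;
use warnings;
use constant DEBUG => 0;    # hypothetical constant, for illustration only

my $quiet = 0;              # a run-time value, so !$quiet cannot be constant-folded

# Left side: a negated bareword constant.
# Right side: a negated non-foldable expression.
my $both = (!DEBUG && !$quiet) ? 1 : 0;

print "both negations applied: $both\n";
```

If either negation were dropped, the condition would be false and C<$both>
would be 0.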

=head1 Acknowledgements

Perl 5.24.3 represents approximately 2 months of development since Perl 5.24.2
and contains approximately 3,200 lines of changes across 120 files from 23
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 1,600 lines of changes to 56 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.24.3:

Aaron Crane, Craig A. Berry, Dagfinn Ilmari Mannsåker, Dan Collins, Daniel
Dragan, Dave Cross, David Mitchell, Eric Herman, Father Chrysostomos, H.Merijn
Brand, Hugo van der Sanden, James E Keenan, Jarkko Hietaniemi, John SJ
Anderson, Karl Williamson, Ken Brown, Lukas Mai, Matthew Horsfall, Stevan
Little, Steve Hay, Steven Humphrey, Tony Cook, Yves Orton.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
L<https://rt.perl.org/> .  There may also be information at
L<http://www.perl.org/> , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications which make it
inappropriate to send to a publicly archived mailing list, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION> for details of how to
report the issue.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perl5242delta.pod000064400000010021150344123450007534 0ustar00=encoding utf8

=head1 NAME

perl5242delta - what is new for perl v5.24.2

=head1 DESCRIPTION

This document describes differences between the 5.24.1 release and the 5.24.2
release.

If you are upgrading from an earlier release such as 5.24.0, first read
L<perl5241delta>, which describes differences between 5.24.0 and 5.24.1.

=head1 Security

=head2 Improved handling of '.' in @INC in base.pm

The handling of (the removal of) C<'.'> in C<@INC> in L<base> has been
improved.  This resolves some problematic behaviour in the approach taken in
Perl 5.24.1, which is probably best described in the following two threads on
the Perl 5 Porters mailing list:
L<http://www.nntp.perl.org/group/perl.perl5.porters/2016/08/msg238991.html>,
L<http://www.nntp.perl.org/group/perl.perl5.porters/2016/10/msg240297.html>.

=head2 "Escaped" colons and relative paths in PATH

On Unix systems, Perl treats any relative paths in the PATH environment
variable as tainted when starting a new process.  Previously, it was allowing a
backslash to escape a colon (unlike the OS), consequently allowing relative
paths to be considered safe if the PATH was set to something like C</\:.>.  The
check has been fixed to treat C<.> as tainted in that example.
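
The idea behind the corrected check can be sketched as follows.  This is an
illustration only, not the code perl uses internally: the helper name
C<relative_path_entries> is hypothetical, and the sketch simply assumes that
the OS splits PATH on every colon, with no escaping.

```perl
use strict;
use warnings;

# Hypothetical helper: return the PATH entries that are not absolute.
# A backslash is an ordinary character here, just as it is for the OS.
sub relative_path_entries {
    my ($path) = @_;
    # A limit of -1 keeps trailing empty fields, which also mean
    # "current directory" in a PATH.
    my @entries = split /:/, $path, -1;
    return grep { $_ !~ m{^/} } @entries;
}

# The example from the text: a backslash followed by a colon and a dot.
my @unsafe = relative_path_entries('/\\:.');
print "relative entries: @unsafe\n";
```

With the PATH from the text, the split yields the two entries C</\> and C<.>,
and only C<.> is reported as relative.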

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<base> has been upgraded from version 2.23 to 2.23_01.

=item *

L<Module::CoreList> has been upgraded from version 5.20170114_24 to 5.20170715_24.

=back

=head1 Selected Bug Fixes

=over 4

=item *

Fixed a crash with C<s///l> where it thought it was dealing with UTF-8 when it
wasn't.
L<[perl #129038]|https://rt.perl.org/Ticket/Display.html?id=129038>

=back

=head1 Acknowledgements

Perl 5.24.2 represents approximately 6 months of development since Perl 5.24.1
and contains approximately 2,500 lines of changes across 53 files from 18
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 960 lines of changes to 17 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.24.2:

Aaron Crane, Abigail, Aristotle Pagaltzis, Chris 'BinGOs' Williams, Dan
Collins, David Mitchell, Eric Herman, Father Chrysostomos, James E Keenan, Karl
Williamson, Lukas Mai, Renee Baecker, Ricardo Signes, Sawyer X, Stevan Little,
Steve Hay, Tony Cook, Yves Orton.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
L<https://rt.perl.org/> .  There may also be information at
L<http://www.perl.org/> , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications which make it
inappropriate to send to a publicly archived mailing list, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION>
for details of how to report the issue.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlcn.pod000064400000011123150344123450006532 0ustar00=encoding utf8

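If you are reading this file in an ordinary text editor, please ignore the
odd markup characters.  This file is written in the POD (Plain Old
Documentation) format, which is specially designed to be readable as is.
For more information about the format, see the perlpod documentation.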

=head1 NAME

perlcn - Perl guide for Simplified Chinese

=head1 DESCRIPTION

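Welcome to the world of Perl!

Starting with version 5.8.0, Perl has comprehensive Unicode support, and
with it support for many encodings beyond the Latin family; CJK (Chinese,
Japanese and Korean) is among them.  Unicode is an international standard
that aims to cover all of the world's characters: those of the Western world,
the Eastern world, and everything in between (Greek, Syriac, Arabic, Hebrew,
Indic, Native American, and so on).  It also accommodates multiple operating
systems and platforms (such as PC and Macintosh).

Perl itself operates on Unicode.  This means that string data inside Perl can
be represented as Unicode, and that Perl's functions and operators (such as
regular expression matching) can operate on Unicode as well.  For input and
output, to handle data stored in pre-Unicode encodings, Perl provides the
Encode module, which lets you easily read and write data in legacy encodings.

The Encode extension module supports the following Simplified Chinese
encodings ('gb2312' means 'euc-cn'):

    euc-cn      Unix extended character set, commonly known as the GB code
    gb2312-raw  The raw (low-bit) GB2312 character table
    gb12345     The raw Traditional Chinese encoding used in China
    iso-ir-165  GB2312 + GB6345 + GB8565 + additional characters
    cp936       Code page 936, which can also be specified as 'GBK' (extended GB)
    hz          7-bit escaped GB2312 encoding

For example, to convert a file encoded in EUC-CN to Unicode, just type the
following command: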

    perl -Mencoding=euc-cn,STDOUT,utf8 -pe1 < file.euc-cn > file.utf8

Perl also ships with "piconv", a character conversion utility written entirely
in Perl, which is used as follows:

    piconv -f euc-cn -t utf8 < file.euc-cn > file.utf8
    piconv -f utf8 -t euc-cn < file.utf8 > file.euc-cn

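In addition, with the encoding module you can easily write code that works in
units of characters, as shown below:

    #!/usr/bin/env perl
    # enable euc-cn string parsing; set standard input, output and error to euc-cn
    use encoding 'euc-cn', STDIN => 'euc-cn', STDOUT => 'euc-cn';
    print length("骆驼");            #  2 (double quotes denote characters)
    print length('骆驼');            #  4 (single quotes denote bytes)
    print index("谆谆教诲", "蛔唤"); # -1 (does not contain this substring)
    print index('谆谆教诲', '蛔唤'); #  1 (starts at the second byte)

In the last example, the second byte of the first "谆" combines with the first
byte of the second "谆" to form the EUC-CN character "蛔"; the second byte of
the second "谆" then combines with the first byte of "教" to form "唤".
Working in characters solves this problem, which used to be common when
matching EUC-CN text.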
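
The same distinction between characters and bytes can be sketched with the
Encode module alone, without the encoding pragma.  This is a minimal sketch
that assumes the script itself is saved as UTF-8 (hence C<use utf8>); the
variable names are chosen for illustration only.

```perl
use strict;
use warnings;
use utf8;                                # the string literal below is UTF-8 source text
use Encode qw(decode encode);

my $chars = "骆驼";                      # a string of two characters
my $bytes = encode('euc-cn', $chars);    # the same text as EUC-CN bytes
my $back  = decode('euc-cn', $bytes);    # and decoded back into characters

printf "%d characters, %d bytes\n", length($chars), length($bytes);
print "round trip ok\n" if $back eq $chars;
```

Each of the two characters occupies two bytes in EUC-CN, so the character
string has length 2 while the encoded byte string has length 4.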

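=head2 Additional Chinese encodings

If you need more Chinese encodings, you can download the Encode::HanExtra
module from CPAN (L<http://www.cpan.org/>).  It currently provides the
following encoding:

    gb18030     Extended GB code, including Traditional Chinese

In addition, the Encode::HanConvert module provides two encodings for
converting between Simplified and Traditional Chinese:

    big5-simp   Converts between Big5 Traditional Chinese and Unicode Simplified Chinese
    gbk-trad    Converts between GBK Simplified Chinese and Unicode Traditional Chinese

To convert between GBK and Big5, see the b2g.pl and g2b.pl programs included
with that module, or use the following in your program:

    use Encode::HanConvert;
    $euc_cn = big5_to_gb($big5); # convert from Big5 to GBK
    $big5 = gb_to_big5($euc_cn); # convert from GBK to Big5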

=head2 Further information

Please refer to the extensive documentation included with Perl (unfortunately
all written in English) to learn more about Perl and about how to use Unicode.
That said, there are plenty of external resources:

=head2 Websites offering Perl resources

=over 4

=item L<http://www.perl.com/>

The Perl home page (maintained by O'Reilly)

=item L<http://www.cpan.org/>

The Comprehensive Perl Archive Network

=item L<http://lists.perl.org/>

An overview of Perl mailing lists

=back

=head2 Websites for learning Perl

=over 4

=item L<http://www.oreilly.com.cn/index.php?func=booklist&cat=68>

O'Reilly Perl books in Simplified Chinese

=back

=head2 Perl user groups

=over 4

=item L<http://www.pm.org/groups/asia.html>

A list of Perl Mongers groups in China

=back

=head2 Unicode-related websites

=over 4

=item L<http://www.unicode.org/>

The Unicode Consortium (the makers of the Unicode standard)

=item L<http://www.cl.cam.ac.uk/%7Emgk25/unicode.html>

UTF-8 and Unicode FAQ for Unix/Linux

=back

=head1 SEE ALSO

L<Encode>, L<Encode::CN>, L<encoding>, L<perluniintro>, L<perlunicode>

=head1 AUTHORS

Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>

Audrey Tang (唐凤) E<lt>audreyt@audreyt.orgE<gt>

=cut
perlcheat.pod000064400000010601150344123450007216 0ustar00=head1 NAME

perlcheat - Perl 5 Cheat Sheet

=head1 DESCRIPTION

This 'cheat sheet' is a handy reference, meant for beginning Perl
programmers. Not everything is mentioned, but 195 features may
already be overwhelming.

=head2 The sheet

  CONTEXTS  SIGILS  ref        ARRAYS        HASHES
  void      $scalar SCALAR     @array        %hash
  scalar    @array  ARRAY      @array[0, 2]  @hash{'a', 'b'}
  list      %hash   HASH       $array[0]     $hash{'a'}
            &sub    CODE
            *glob   GLOB       SCALAR VALUES
                    FORMAT     number, string, ref, glob, undef
  REFERENCES
  \      reference       $$foo[1]       aka $foo->[1]
  $@%&*  dereference     $$foo{bar}     aka $foo->{bar}
  []     anon. arrayref  ${$$foo[1]}[2] aka $foo->[1]->[2]
  {}     anon. hashref   ${$$foo[1]}[2] aka $foo->[1][2]
  \()    list of refs
                         SYNTAX
  OPERATOR PRECEDENCE    foreach (LIST) { }     for (a;b;c) { }
  ->                     while   (e) { }        until (e)   { }
  ++ --                  if      (e) { } elsif (e) { } else { }
  **                     unless  (e) { } elsif (e) { } else { }
  ! ~ \ u+ u-            given   (e) { when (e) {} default {} }
  =~ !~
  * / % x                 NUMBERS vs STRINGS  FALSE vs TRUE
  + - .                   =          =        undef, "", 0, "0"
  << >>                   +          .        anything else
  named uops              == !=      eq ne
  < > <= >= lt gt le ge   < > <= >=  lt gt le ge
  == != <=> eq ne cmp ~~  <=>        cmp
  &
  | ^             REGEX MODIFIERS       REGEX METACHARS
  &&              /i case insensitive   ^      string begin
  || //           /m line based ^$      $      str end (bfr \n)
  .. ...          /s . includes \n      +      one or more
  ?:              /x /xx ign. wh.space  *      zero or more
  = += last goto  /p preserve           ?      zero or one
  , =>            /a ASCII    /aa safe  {3,7}  repeat in range
  list ops        /l locale   /d  dual  |      alternation
  not             /u Unicode            []     character class
  and             /e evaluate /ee rpts  \b     boundary
  or xor          /g global             \z     string end
                  /o compile pat once   ()     capture
  DEBUG                                 (?:p)  no capture
  -MO=Deparse     REGEX CHARCLASSES     (?#t)  comment
  -MO=Terse       .   [^\n]             (?=p)  ZW pos ahead
  -D##            \s  whitespace        (?!p)  ZW neg ahead
  -d:Trace        \w  word chars        (?<=p) ZW pos behind \K
                  \d  digits            (?<!p) ZW neg behind
  CONFIGURATION   \pP named property    (?>p)  no backtrack
  perl -V:ivsize  \h  horiz.wh.space    (?|p|p)branch reset
                  \R  linebreak         (?<n>p)named capture
                  \S \W \D \H negate    \g{n}  ref to named cap
                                        \K     keep left part
  FUNCTION RETURN LISTS
  stat      localtime    caller         SPECIAL VARIABLES
   0 dev    0 second      0 package     $_    default variable
   1 ino    1 minute      1 filename    $0    program name
   2 mode   2 hour        2 line        $/    input separator
   3 nlink  3 day         3 subroutine  $\    output separator
   4 uid    4 month-1     4 hasargs     $|    autoflush
   5 gid    5 year-1900   5 wantarray   $!    sys/libcall error
   6 rdev   6 weekday     6 evaltext    $@    eval error
   7 size   7 yearday     7 is_require  $$    process ID
   8 atime  8 is_dst      8 hints       $.    line number
   9 mtime                9 bitmask     @ARGV command line args
  10 ctime               10 hinthash    @INC  include paths
  11 blksz               3..10 only     @_    subroutine args
  12 blcks               with EXPR      %ENV  environment

=head1 ACKNOWLEDGEMENTS

The first version of this document appeared on Perl Monks, where several
people had useful suggestions. Thank you, Perl Monks.

A special thanks to Damian Conway, who didn't only suggest important changes,
but also took the time to count the number of listed features and make a
Perl 6 version to show that Perl will stay Perl.

=head1 AUTHOR

Juerd Waalboer <#####@juerd.nl>, with the help of many Perl Monks.

=head1 SEE ALSO

=over 4

=item *

L<http://perlmonks.org/?node_id=216602> - the original PM post

=item *

L<http://perlmonks.org/?node_id=238031> - Damian Conway's Perl 6 version

=item *

L<http://juerd.nl/site.plp/perlcheat> - home of the Perl Cheat Sheet

=back
perlbs2000.pod000064400000017572150344123450007056 0ustar00This document is written in pod format hence there are punctuation
characters in odd places.  Do not worry, you've apparently got the
ASCII->EBCDIC translation worked out correctly.  You can read more
about pod in pod/perlpod.pod or the short summary in the INSTALL file.

=head1 NAME

perlbs2000 - building and installing Perl for BS2000.

B<This document needs to be updated, but we don't know what it should say.
Please email comments to L<perlbug@perl.org|mailto:perlbug@perl.org>.>

=head1 SYNOPSIS

This document will help you Configure, build, test and install Perl
on BS2000 in the POSIX subsystem.

=head1 DESCRIPTION

This is a ported perl for the POSIX subsystem in BS2000 VERSION OSD
V3.1A or later.  It may work on other versions, but we started porting
and testing it with 3.1A and are currently using Version V4.0A.

You may need the following GNU programs in order to install perl:

=head2 gzip on BS2000

We used version 1.2.4, which could be installed out of the box with
one failure during 'make check'.

=head2 bison on BS2000

The yacc coming with BS2000 POSIX didn't work for us.  So we had to
use bison.  We had to make a few changes to perl in order to use the
pure (reentrant) parser of bison.  We used version 1.25, but we had to
add a few changes due to EBCDIC.  See below for more details
concerning yacc.

=head2 Unpacking Perl Distribution on BS2000

To extract an ASCII tar archive on BS2000 POSIX you need an ASCII
filesystem (we used the mountpoint /usr/local/ascii for this).  Now
you extract the archive in the ASCII filesystem without
I/O-conversion:

    cd /usr/local/ascii
    export IO_CONVERSION=NO
    gunzip < /usr/local/src/perl.tar.gz | pax -r

You may ignore the error message for the first element of the archive
(this doesn't look like a tar archive / skipping to next file...),
it's only the directory which will be created automatically anyway.

After extracting the archive you copy the whole directory tree to your
EBCDIC filesystem.  B<This time you use I/O-conversion>:

    cd /usr/local/src
    IO_CONVERSION=YES
    cp -r /usr/local/ascii/perl5.005_02 ./

=head2 Compiling Perl on BS2000

There is a "hints" file for BS2000 called hints.posix-bc (because
posix-bc is the OS name given by `uname`) that specifies the correct
values for most things.  The major problem is (of course) the EBCDIC
character set.  We have a German EBCDIC version.

Because of our problems with the native yacc we used GNU bison to
generate a pure (=reentrant) parser for perly.y.  So our yacc is
really the following script:

    -----8<-----/usr/local/bin/yacc-----8<-----
    #! /usr/bin/sh

    # Bison as a reentrant yacc:

    # save parameters:
    params=""
    while [[ $# -gt 1 ]]; do
        params="$params $1"
        shift
    done

    # add flag %pure_parser:

    tmpfile=/tmp/bison.$$.y
    echo %pure_parser > $tmpfile
    cat $1 >> $tmpfile

    # call bison:

    echo "/usr/local/bin/bison --yacc $params $1\t\t\t(Pure Parser)"
    /usr/local/bin/bison --yacc $params $tmpfile

    # cleanup:

    rm -f $tmpfile
    -----8<----------8<-----

We still use the normal yacc for a2p.y though!!!  We made a softlink
called byacc to distinguish between the two versions:

    ln -s /usr/bin/yacc /usr/local/bin/byacc

We build perl using GNU make.  We tried the native make once and it
worked too.

=head2 Testing Perl on BS2000

We still got a few errors during C<make test>.  Some of them are the
result of using bison.  Bison prints I<parser error> instead of I<syntax
error>, so we may ignore them.  The following list shows
our errors; your results may differ:

    op/numconvert.......FAILED tests 1409-1440
    op/regexp...........FAILED tests 483, 496
    op/regexp_noamp.....FAILED tests 483, 496
    pragma/overload.....FAILED tests 152-153, 170-171
    pragma/warnings.....FAILED tests 14, 82, 129, 155, 192, 205, 207
    lib/bigfloat........FAILED tests 351-352, 355
    lib/bigfltpm........FAILED tests 354-355, 358
    lib/complex.........FAILED tests 267, 487
    lib/dumper..........FAILED tests 43, 45
    Failed 11/231 test scripts, 95.24% okay. 57/10595 subtests failed, 99.46% okay.

=head2 Installing Perl on BS2000

We have no nroff on BS2000 POSIX (yet), so we ignored any errors while
installing the documentation.


=head2 Using Perl in the Posix-Shell of BS2000

BS2000 POSIX doesn't support the shebang notation
(C<#!/usr/local/bin/perl>), so you have to use the following lines
instead:

    : # use perl
        eval 'exec /usr/local/bin/perl -S $0 ${1+"$@"}'
            if $running_under_some_shell;

=head2 Using Perl in "native" BS2000

We don't have much experience with this yet, but try the following:

Copy your Perl executable to a BS2000 LLM using bs2cp:

C<bs2cp /usr/local/bin/perl 'bs2:perl(perl,l)'>

Now you can start it with the following (SDF) command:

C</START-PROG FROM-FILE=*MODULE(PERL,PERL),PROG-MODE=*ANY,RUN-MODE=*ADV>

First you get the BS2000 commandline prompt ('*').  Here you may enter
your parameters, e.g. C<-e 'print "Hello World!\\n";'> (note the
double backslash!) or C<-w> and the name of your Perl script.
Filenames starting with C</> are searched in the Posix filesystem,
others are searched in the BS2000 filesystem.  You may even use
wildcards if you put a C<%> in front of your filename (e.g. C<-w
checkfiles.pl %*.c>).  Read your C/C++ manual for additional
possibilities of the commandline prompt (look for
PARAMETER-PROMPTING).

=head2 Floating point anomalies on BS2000

There appears to be a bug in the floating point implementation on BS2000 POSIX
systems such that calling int() on the product of a number and a small
magnitude number is not the same as calling int() on the quotient of
that number and a large magnitude number.  For example, in the following
Perl code:

    my $x = 100000.0;
    my $y = int($x * 1e-5) * 1e5; # '0'
    my $z = int($x / 1e+5) * 1e5;  # '100000'
    print "\$y is $y and \$z is $z\n"; # $y is 0 and $z is 100000

Although one would expect the quantities $y and $z to be the same and equal
to 100000 they will differ and instead will be 0 and 100000 respectively.
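
The source does not say how the anomaly arises or how to work around it.
Assuming it stems from the product landing marginally below the expected whole
number, one possible guard is to snap near-integers before truncating.  The
helper C<int_snapped> and its tolerance are hypothetical and not part of Perl;
this is only a sketch of the idea.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Perl: truncate like int(), but first
# snap values lying within $eps of a whole number onto that number.
sub int_snapped {
    my ($value, $eps) = @_;
    $eps = 1e-9 unless defined $eps;
    my $nearest = sprintf('%.0f', $value);    # nearest whole number, as a string
    return $nearest + 0 if abs($value - $nearest) < $eps;
    return int($value);
}

my $x = 100000.0;
my $y = int_snapped($x * 1e-5) * 1e5;
my $z = int_snapped($x / 1e+5) * 1e5;
print "\$y is $y and \$z is $z\n";
```

Values that are not close to a whole number are still truncated exactly as
C<int()> would truncate them.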

=head2 Using PerlIO and different encodings on ASCII and EBCDIC partitions

Since version 5.8 Perl uses the new PerlIO on BS2000.  This enables
you to use different encodings per IO channel.  For example you may use

    use Encode;
    open($f, ">:encoding(ascii)", "test.ascii");
    print $f "Hello World!\n";
    open($f, ">:encoding(posix-bc)", "test.ebcdic");
    print $f "Hello World!\n";
    open($f, ">:encoding(latin1)", "test.latin1");
    print $f "Hello World!\n";
    open($f, ">:encoding(utf8)", "test.utf8");
    print $f "Hello World!\n";

to get four files containing "Hello World!\n" in ASCII, EBCDIC, ISO
Latin-1 (in this example identical to ASCII) and UTF-EBCDIC (in
this example identical to normal EBCDIC), respectively.  See the
documentation of Encode::PerlIO for details.
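
The effect of an encoding layer can be inspected without touching the
filesystem.  The following is a minimal sketch that assumes an ASCII-based
machine and the standard Encode distribution, which provides the C<posix-bc>
encoding; it writes the same text through two layers into in-memory buffers
and prints the resulting bytes in hexadecimal.

```perl
use strict;
use warnings;

my $text = "Hello World!\n";
my %bytes;

for my $enc (qw(ascii posix-bc)) {
    my $buf = '';
    # Open an in-memory file handle with the requested encoding layer.
    open(my $fh, ">:encoding($enc)", \$buf) or die "cannot open buffer: $!";
    print {$fh} $text;
    close($fh) or die "cannot close buffer: $!";
    $bytes{$enc} = $buf;
}

# Show the bytes each layer produced.
printf "%-8s %s\n", $_, unpack('H*', $bytes{$_}) for sort keys %bytes;
```

Both buffers hold the same number of bytes, but the byte values differ because
EBCDIC assigns different codes to the same characters.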

As the PerlIO layer uses raw IO internally, all this totally ignores
the type of your filesystem (ASCII or EBCDIC) and the IO_CONVERSION
environment variable.  If you want the old behavior, in which the
BS2000 IO functions determine conversion depending on the filesystem,
PerlIO is still your friend.  Use IO_CONVERSION as usual and tell
Perl that it should use the native IO layer:

    export IO_CONVERSION=YES
    export PERLIO=stdio

Now your IO would be ASCII on ASCII partitions and EBCDIC on EBCDIC
partitions.  See the documentation of PerlIO (without C<Encode::>!)
for further possibilities.

=head1 AUTHORS

Thomas Dorner

=head1 SEE ALSO

L<INSTALL>, L<perlport>.

=head2 Mailing list

If you are interested in the z/OS (formerly known as OS/390)
and POSIX-BC (BS2000) ports of Perl then see the perl-mvs mailing list.
To subscribe, send an empty message to perl-mvs-subscribe@perl.org.

See also:

    http://lists.perl.org/list/perl-mvs.html

There are web archives of the mailing list at:

    http://www.xray.mpe.mpg.de/mailing-lists/perl-mvs/
    http://archive.develooper.com/perl-mvs@perl.org/

=head1 HISTORY

This document was originally written by Thomas Dorner for the 5.005
release of Perl.

This document was podified for the 5.6 release of perl 11 July 2000.

=cut
perl561delta.pod000064400000363451150344123450007475 0ustar00=head1 NAME

perl561delta - what's new for perl v5.6.1

=head1 DESCRIPTION

This document describes differences between the 5.005 release and the 5.6.1
release.

=head1 Summary of changes between 5.6.0 and 5.6.1

This section contains a summary of the changes between the 5.6.0 release
and the 5.6.1 release.  More details about the changes mentioned here
may be found in the F<Changes> files that accompany the Perl source
distribution.  See L<perlhack> for pointers to online resources where you
can inspect the individual patches described by these changes.

=head2 Security Issues

suidperl will not run /bin/mail anymore, because some platforms have
a /bin/mail that is vulnerable to buffer overflow attacks.

Note that suidperl is neither built nor installed by default in
any recent version of perl.  Use of suidperl is highly discouraged.
If you think you need it, try alternatives such as sudo first.
See http://www.courtesan.com/sudo/ .

=head2 Core bug fixes

This is not an exhaustive list.  It is intended to cover only the
significant user-visible changes.

=over

=item C<UNIVERSAL::isa()>

A bug in the caching mechanism used by C<UNIVERSAL::isa()> that affected
base.pm has been fixed.  The bug has existed since the 5.005 releases,
but wasn't tickled by base.pm in those releases.

=item Memory leaks

Various cases of memory leaks and attempts to access uninitialized memory
have been cured.  See L</"Known Problems"> below for further issues.

=item Numeric conversions

Numeric conversions did not recognize changes in the string value
properly in certain circumstances.

In other situations, large unsigned numbers (those above 2**31) could
sometimes lose their unsignedness, causing bogus results in arithmetic
operations.

Integer modulus on large unsigned integers sometimes returned
incorrect values.

Perl 5.6.0 generated "not a number" warnings on certain conversions where
previous versions didn't.

These problems have all been rectified.

Infinity is now recognized as a number.

=item qw(a\\b)

In Perl 5.6.0, qw(a\\b) produced a string with two backslashes instead
of one, in a departure from the behavior in previous versions.  The
older behavior has been reinstated.  
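
A minimal sketch of the reinstated behavior, with a variable name chosen for
illustration: inside C<qw()>, a doubled backslash stands for a single one.

```perl
use strict;
use warnings;

# qw() follows single-quote rules, so \\ yields one backslash.
my ($word) = qw(a\\b);
printf "%s has %d characters\n", $word, length $word;
```

The resulting string consists of C<a>, one backslash, and C<b>.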

=item caller()

caller() could cause core dumps in certain situations.  Carp was sometimes
affected by this problem.

=item Bugs in regular expressions

Pattern matches on overloaded values are now handled correctly.

Perl 5.6.0 parsed m/\x{ab}/ incorrectly, leading to spurious warnings.
This has been corrected.

The RE engine found in Perl 5.6.0 accidentally pessimised certain kinds
of simple pattern matches.  These are now handled better.

Regular expression debug output (whether through C<use re 'debug'>
or via C<-Dr>) now looks better.

Multi-line matches like C<"a\nxb\n" =~ /(?!\A)x/m> were flawed.  The
bug has been fixed.

Use of $& could trigger a core dump under some situations.  This
is now avoided.

Match variables $1 et al. weren't being unset when a pattern match
was backtracking, and the anomaly showed up inside C</...(?{ ... }).../>
etc.  These variables are now tracked correctly.

pos() did not return the correct value within s///ge in earlier
versions.  This is now handled correctly.

=item "slurp" mode

readline() on files opened in "slurp" mode could return an extra "" at
the end in certain situations.  This has been corrected.

=item Autovivification of symbolic references to special variables

Autovivification of symbolic references to special variables described
in L<perlvar> (as in C<${$num}>) was accidentally disabled.  This works
again now.

=item Lexical warnings 

Lexical warnings now propagate correctly into C<eval "...">.

C<use warnings qw(FATAL all)> did not work as intended.  This has been
corrected.

Lexical warnings could leak into other scopes in some situations.
This is now fixed.

warnings::enabled() now reports the state of $^W correctly if the caller
isn't using lexical warnings.

=item Spurious warnings and errors

Perl 5.6.0 could emit spurious warnings about redefinition of dl_error()
when statically building extensions into perl.  This has been corrected.

"our" variables could result in bogus "Variable will not stay shared"
warnings.  This is now fixed.

"our" variables of the same name declared in two sibling blocks
resulted in bogus warnings about "redeclaration" of the variables.
The problem has been corrected.

=item glob()

Compatibility of the builtin glob() with old csh-based glob has been
improved with the addition of GLOB_ALPHASORT option.  See C<File::Glob>.

File::Glob::glob() has been renamed to File::Glob::bsd_glob()
because the name clashes with the builtin glob().  The older
name is still available for compatibility, but is deprecated.

Spurious syntax errors generated in certain situations, when glob()
caused File::Glob to be loaded for the first time, have been fixed.

=item Tainting

Some cases of inconsistent taint propagation (such as within hash
values) have been fixed.

The tainting behavior of sprintf() has been rationalized.  It does
not taint the result of floating point formats anymore, making the
behavior consistent with that of string interpolation.

=item sort()

Arguments to sort() weren't being provided the right wantarray() context.
The comparison block is now run in scalar context, and the arguments to
be sorted are always provided list context.

sort() is also fully reentrant, in the sense that the sort function
can itself call sort().  This did not work reliably in previous releases.

=item #line directives

#line directives now work correctly when they appear at the very
beginning of C<eval "...">.

=item Subroutine prototypes

The (\&) prototype now works properly.

=item map()

map() could get pathologically slow when the result list it generates
is larger than the source list.  The performance has been improved for
common scenarios.

=item Debugger

Debugger exit code now reflects the script exit code.

Condition C<"0"> in breakpoints is now treated correctly.

The C<d> command now checks the line number.

C<$.> is no longer corrupted by the debugger.

All debugger output now correctly goes to the socket if RemotePort
is set.

=item PERL5OPT

PERL5OPT can now be set to more than one switch group.  Previously,
it was limited to a single group of options.

=item chop()

chop(@list) in list context returned the characters chopped in reverse
order.  They are now returned in the right order.

=item Unicode support

Unicode support has seen a large number of incremental improvements,
but continues to be highly experimental.  It is not expected to be
fully supported in the 5.6.x maintenance releases.

substr(), join(), repeat(), reverse(), quotemeta() and string
concatenation were all handling Unicode strings incorrectly in
Perl 5.6.0.  This has been corrected.

Support for C<tr///CU>, C<tr///UC> and the like has been removed because
the interface turned out to be broken.  For similar functionality,
see L<perlfunc/pack>.

The Unicode Character Database has been updated to version 3.0.1
with additions made available to the public as of August 30, 2000.

The Unicode character classes \p{Blank} and \p{SpacePerl} have been
added.  "Blank" is like C isblank(), that is, it contains only
"horizontal whitespace" (the space character is included, the newline
isn't), and "SpacePerl" is the Unicode equivalent of C<\s> (\p{Space}
isn't, since that includes the vertical tabulator character, whereas
C<\s> doesn't).

If you are experimenting with Unicode support in perl, the development
versions of Perl may have more to offer.  In particular, I/O layers
are now available in the development track, but not in the maintenance
track, primarily due to backward compatibility issues.  Unicode support
is also evolving rapidly in the development track, and the maintenance
track only reflects the most conservative of these changes.

=item 64-bit support

Support for 64-bit platforms has been improved, but continues to be
experimental.  The level of support varies greatly among platforms.

=item Compiler

The B Compiler and its various backends have had many incremental
improvements, but they remain highly experimental.  Use in
production environments is discouraged.

The perlcc tool has been rewritten so that the user interface is much
more like that of a C compiler.

The perlbc tool has been removed.  Use C<perlcc -B> instead.

=item Lvalue subroutines

There have been various bugfixes to support lvalue subroutines better.
However, the feature still remains experimental.

=item IO::Socket

IO::Socket::INET failed to open the specified port if the service
name was not known.  It now correctly uses the supplied port number
as is.

=item File::Find

File::Find now chdir()s correctly when chasing symbolic links.

=item xsubpp

xsubpp now tolerates embedded POD sections.

=item C<no Module;>

C<no Module;> does not produce an error even if Module does not have an
unimport() method.  This parallels the behavior of C<use> vis-a-vis
C<import>.

=item Tests

A large number of tests have been added.

=back
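
As a rough illustration of two of the fixes listed above, the following
sketch checks the restored C<qw(a\\b)> behavior and calls sort() from
inside a sort comparison.  The variable names and the smallest() helper
are invented for this example and are not part of the distribution.

    use strict;
    use warnings;

    # qw() treats \\ the way a single-quoted string does: one backslash.
    my @words = qw(a\\b);
    print length($words[0]), "\n";    # 3 characters: a, \, b

    # Hypothetical helper that itself calls sort().
    sub smallest {
        my @s = sort { $a <=> $b } @{ $_[0] };
        return $s[0];
    }

    # The outer comparison calls the helper, so sort() is re-entered.
    my @rows   = ([5, 9], [7, 1], [3, 4]);
    my @sorted = sort { smallest($a) <=> smallest($b) } @rows;
    print join(" ", map { "[@$_]" } @sorted), "\n";    # [7 1] [3 4] [5 9]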

=head2 Core features

untie() will now call an UNTIE() hook if it exists.  See L<perltie>
for details.
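
A minimal sketch of the hook, assuming a made-up tied-scalar class
named C<Ticker> (the class, its fields and the C<@log> array are
illustrative only):

    package Ticker;    # hypothetical tied-scalar class
    use strict;
    use warnings;

    our @log;

    sub TIESCALAR { my ($class, $start) = @_; return bless { n => $start }, $class }
    sub FETCH     { my ($self) = @_; return $self->{n}++ }
    sub STORE     { my ($self, $v) = @_; $self->{n} = $v }
    sub UNTIE     { push @log, "UNTIE called" }

    package main;

    tie my $t, 'Ticker', 10;
    my $first = $t;              # FETCH returns 10
    untie $t;                    # invokes Ticker::UNTIE because it exists
    print "@Ticker::log\n";      # UNTIE called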

The C<-DT> command line switch outputs copious tokenizing information.
See L<perlrun>.

Arrays are now always interpolated in double-quotish strings.  Previously,
C<"foo@bar.com"> was a fatal error at compile time if an array
C<@bar> was not used or declared.  This transitional behavior was
intended to help migrate perl4 code, and is deemed to be no longer useful.
See L</"Arrays now always interpolate into double-quoted strings">.
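
The effect can be sketched as follows, assuming an array C<@bar> that
happens to be in scope (the values are arbitrary):

    use strict;
    use warnings;

    my @bar = ('x', 'y');

    my $interpolated = "foo@bar.com";     # @bar is expanded: "foox y.com"
    my $escaped      = "foo\@bar.com";    # backslash keeps the at-sign literal
    my $single       = 'foo@bar.com';     # single quotes never interpolate

    print "$interpolated\n$escaped\n$single\n";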

keys(), each(), pop(), push(), shift(), splice() and unshift()
can all be overridden now.

C<my __PACKAGE__ $obj> now does the expected thing.

=head2 Configuration issues

On some systems (IRIX and Solaris among them) the system malloc is demonstrably
better.  While the defaults haven't been changed in order to retain binary
compatibility with earlier releases, you may be better off building perl
with C<Configure -Uusemymalloc ...> as discussed in the F<INSTALL> file.

C<Configure> has been enhanced in various ways:

=over

=item *

Minimizes use of temporary files.

=item *

By default, does not link perl with libraries not used by it, such as
the various dbm libraries.  SunOS 4.x hints preserve behavior on that
platform.

=item *

Support for pdp11-style memory models has been removed due to obsolescence.

=item *

Building outside the source tree is supported on systems that have
symbolic links. This is done by running

    sh /path/to/source/Configure -Dmksymlinks ...
    make all test install

in a directory other than the perl source directory.  See F<INSTALL>.

=item *

C<Configure -S> can be run non-interactively.

=back

=head2 Documentation

README.aix, README.solaris and README.macos have been added.
README.posix-bc has been renamed to README.bs2000.  These are
installed as L<perlaix>, L<perlsolaris>, L<perlmacos>, and
L<perlbs2000> respectively.

The following pod documents are brand new:

    perlclib    Internal replacements for standard C library functions
    perldebtut  Perl debugging tutorial
    perlebcdic  Considerations for running Perl on EBCDIC platforms
    perlnewmod  Perl modules: preparing a new module for distribution
    perlrequick Perl regular expressions quick start
    perlretut   Perl regular expressions tutorial
    perlutil    Utilities packaged with the Perl distribution

The F<INSTALL> file has been expanded to cover various issues, such as
64-bit support.

A longer list of contributors has been added to the source distribution.
See the file C<AUTHORS>.

Numerous other changes have been made to the included documentation and FAQs.

=head2 Bundled modules

The following modules have been added.

=over

=item B::Concise

Walks Perl syntax tree, printing concise info about ops.  See L<B::Concise>.

=item File::Temp

Returns name and handle of a temporary file safely.  See L<File::Temp>.

=item Pod::LaTeX

Converts Pod data to formatted LaTeX.  See L<Pod::LaTeX>.

=item Pod::Text::Overstrike

Converts POD data to formatted overstrike text.  See L<Pod::Text::Overstrike>.

=back

The following modules have been upgraded.

=over

=item CGI

CGI v2.752 is now included.

=item CPAN

CPAN v1.59_54 is now included.

=item Class::Struct

Various bugfixes have been added.

=item DB_File

DB_File v1.75 supports newer Berkeley DB versions, among other
improvements.

=item Devel::Peek

Devel::Peek has been enhanced to support dumping of memory statistics,
when perl is built with the included malloc().

=item File::Find

File::Find now supports pre- and post-processing of the files in order
to sort() them, etc.

=item Getopt::Long

Getopt::Long v2.25 is included.

=item IO::Poll

Various bug fixes have been included.

=item IPC::Open3

IPC::Open3 allows use of numeric file descriptors.

=item Math::BigFloat

The fmod() function supports modulus operations.  Various bug fixes
have also been included.

=item Math::Complex

Math::Complex handles inf, NaN, etc. better.

=item Net::Ping

ping() could fail on an odd number of data bytes, and when the echo
service isn't running.  This has been corrected.

=item Opcode

A memory leak has been fixed.

=item Pod::Parser

Version 1.13 of the Pod::Parser suite is included.

=item Pod::Text

Pod::Text and related modules have been upgraded to the versions
in podlators suite v2.08.

=item SDBM_File

On dosish platforms, some keys went missing because of lack of support for
files with "holes".  A workaround for the problem has been added.

=item Sys::Syslog

Various bug fixes have been included.

=item Tie::RefHash

Now supports Tie::RefHash::Nestable to automagically tie hashref values.

=item Tie::SubstrHash

Various bug fixes have been included.

=back

=head2 Platform-specific improvements

The following new ports are now available.

=over

=item NCR MP-RAS

=item NonStop-UX

=back

Perl now builds under Amdahl UTS.

Perl has also been verified to build under Amiga OS.

Support for EPOC has been much improved.  See README.epoc.

Building perl with -Duseithreads or -Duse5005threads now works
under HP-UX 10.20 (previously it only worked under 10.30 or later).
You will need a thread library package installed.  See README.hpux.

Long doubles should now work under Linux.

Mac OS Classic is now supported in the mainstream source package.
See README.macos.

Support for MPE/iX has been updated.  See README.mpeix.

Support for OS/2 has been improved.  See C<os2/Changes> and README.os2.

Dynamic loading on z/OS (formerly OS/390) has been improved.  See
README.os390.

Support for VMS has seen many incremental improvements, including
better support for operators like backticks and system(), and better
%ENV handling.  See C<README.vms> and L<perlvms>.

Support for Stratus VOS has been improved.  See C<vos/Changes> and README.vos.

Support for Windows has been improved.

=over

=item *

fork() emulation has been improved in various ways, but still continues
to be experimental.  See L<perlfork> for known bugs and caveats.

=item *

%SIG has been enabled under USE_ITHREADS, but its use is completely
unsupported under all configurations.

=item *

Borland C++ v5.5 is now a supported compiler that can build Perl.
However, the generated binaries continue to be incompatible with those
generated by the other supported compilers (GCC and Visual C++).

=item *

Non-blocking waits for child processes (or pseudo-processes) are
supported via C<waitpid($pid, &POSIX::WNOHANG)>.

=item *

A memory leak in accept() has been fixed.

=item *

wait(), waitpid() and backticks now return the correct exit status under
Windows 9x.

=item *

Trailing new %ENV entries weren't propagated to child processes.  This
is now fixed.

=item *

Current directory entries in %ENV are now correctly propagated to child
processes.

=item *

Duping socket handles with open(F, ">&MYSOCK") now works under Windows 9x.

=item *

The makefiles now provide a single switch to bulk-enable all the features
enabled in ActiveState ActivePerl (a popular binary distribution).

=item *

Win32::GetCwd() correctly returns C:\ instead of C: when at the drive root.
Other bugs in chdir() and Cwd::cwd() have also been fixed.

=item *

fork() correctly returns undef and sets EAGAIN when it runs out of
pseudo-process handles.

=item *

ExtUtils::MakeMaker now uses $ENV{LIB} to search for libraries.

=item *

UNC path handling is better when perl is built to support fork().

=item *

A handle leak in socket handling has been fixed.

=item *

send() works from within a pseudo-process.

=back

Unless specifically qualified otherwise, the remainder of this document
covers changes between the 5.005 and 5.6.0 releases.

=head1 Core Enhancements

=head2 Interpreter cloning, threads, and concurrency

Perl 5.6.0 introduces the beginnings of support for running multiple
interpreters concurrently in different threads.  In conjunction with
the perl_clone() API call, which can be used to selectively duplicate
the state of any given interpreter, it is possible to compile a
piece of code once in an interpreter, clone that interpreter
one or more times, and run all the resulting interpreters in distinct
threads.

On the Windows platform, this feature is used to emulate fork() at the
interpreter level.  See L<perlfork> for details about that.

This feature is still in evolution.  It is eventually meant to be used
to selectively clone a subroutine and data reachable from that
subroutine in a separate interpreter and run the cloned subroutine
in a separate thread.  Since there is no shared data between the
interpreters, little or no locking will be needed (unless parts of
the symbol table are explicitly shared).  This is obviously intended
to be an easy-to-use replacement for the existing threads support.

Support for cloning interpreters and interpreter concurrency can be
enabled using the -Dusethreads Configure option (see win32/Makefile for
how to enable it on Windows).  The resulting perl executable will be
functionally identical to one that was built with -Dusemultiplicity, but
the perl_clone() API call will only be available in the former.

-Dusethreads enables the cpp macro USE_ITHREADS by default, which in turn
enables Perl source code changes that provide a clear separation between
the op tree and the data it operates with.  The former is immutable, and
can therefore be shared between an interpreter and all of its clones,
while the latter is considered local to each interpreter, and is therefore
copied for each clone.

Note that building Perl with the -Dusemultiplicity Configure option
is adequate if you wish to run multiple B<independent> interpreters
concurrently in different threads.  -Dusethreads only provides the
additional functionality of the perl_clone() API call and other
support for running B<cloned> interpreters concurrently.

    NOTE: This is an experimental feature.  Implementation details are
    subject to change.

=head2 Lexically scoped warning categories

You can now control the granularity of warnings emitted by perl at a finer
level using the C<use warnings> pragma.  L<warnings> and L<perllexwarn>
have copious documentation on this feature.

=head2 Unicode and UTF-8 support

Perl now uses UTF-8 as its internal representation for character
strings.  The C<utf8> and C<bytes> pragmas are used to control this support
in the current lexical scope.  See L<perlunicode>, L<utf8> and L<bytes> for
more information.

This feature is expected to evolve quickly to support some form of I/O
disciplines that can be used to specify the kind of input and output data
(bytes or characters).  Until that happens, additional modules from CPAN
will be needed to complete the toolkit for dealing with Unicode.

    NOTE: This should be considered an experimental feature.  Implementation
    details are subject to change.

=head2 Support for interpolating named characters

The new C<\N> escape interpolates named characters within strings.
For example, C<"Hi! \N{WHITE SMILING FACE}"> evaluates to a string
with a Unicode smiley face at the end.
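
A short sketch of the escape in use.  The explicit C<use charnames>
line is what 5.6-era perls need; it is an assumption here that later
releases, which can load the names on demand, accept it unchanged.

    use strict;
    use warnings;
    use charnames ':full';    # makes full Unicode character names available

    my $greeting = "Hi! \N{WHITE SMILING FACE}";

    # The last character is the named code point, U+263A.
    printf "U+%04X\n", ord(substr($greeting, -1));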

=head2 "our" declarations

An "our" declaration introduces a value that can be best understood
as a lexically scoped symbolic alias to a global variable in the
package that was current where the variable was declared.  This is
mostly useful as an alternative to the C<vars> pragma, but also provides
the opportunity to introduce typing and other attributes for such
variables.  See L<perlfunc/our>.
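
A minimal sketch of the aliasing, using a made-up package named
C<Tally>.  Because the alias is lexically scoped, the short name stays
usable after the C<package> statement changes:

    package Tally;    # hypothetical package name
    use strict;
    use warnings;

    our $count = 0;   # lexically scoped alias for $Tally::count

    sub bump { return ++$count }

    package main;

    Tally::bump() for 1 .. 3;

    print "$Tally::count\n";    # 3
    print "$count\n";           # also 3: still the alias for $Tally::count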

=head2 Support for strings represented as a vector of ordinals

Literals of the form C<v1.2.3.4> are now parsed as a string composed
of characters with the specified ordinals.  This is an alternative, more
readable way to construct (possibly Unicode) strings instead of
interpolating characters, as in C<"\x{1}\x{2}\x{3}\x{4}">.  The leading
C<v> may be omitted if there are more than two ordinals, so C<1.2.3> is
parsed the same as C<v1.2.3>.

Strings written in this form are also useful to represent version "numbers".
It is easy to compare such version "numbers" (which are really just plain
strings) using any of the usual string comparison operators C<eq>, C<ne>,
C<lt>, C<gt>, etc., or perform bitwise string operations on them using C<|>,
C<&>, etc.
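
For instance, two such literals can be compared and taken apart like
any other string.  This is only a sketch and the variable names are
arbitrary:

    use strict;
    use warnings;

    my $old = v5.6.0;
    my $new = v5.6.1;

    print "newer\n" if $new gt $old;    # compared ordinal by ordinal

    print length($new), "\n";           # 3: one character per component

    # Recover the components from the character ordinals.
    my $dotted = join '.', map { ord } split //, $new;
    print "$dotted\n";                  # 5.6.1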

In conjunction with the new C<$^V> magic variable (which contains
the perl version as a string), such literals can be used as a readable way
to check if you're running a particular version of Perl:

    # this will parse in older versions of Perl also
    if ($^V and $^V gt v5.6.0) {
        # new features supported
    }

C<require> and C<use> also have some special magic to support such literals.
They will be interpreted as a version rather than as a module name:

    require v5.6.0;		# croak if $^V lt v5.6.0
    use v5.6.0;			# same, but croaks at compile-time

Alternatively, the C<v> may be omitted if there is more than one dot:

    require 5.6.0;
    use 5.6.0;

Also, C<sprintf> and C<printf> support the Perl-specific format flag C<%v>
to print ordinals of characters in arbitrary strings:

    printf "v%vd", $^V;		# prints current version, such as "v5.5.650"
    printf "%*vX", ":", $addr;	# formats IPv6 address
    printf "%*vb", " ", $bits;	# displays bitstring

See L<perldata/"Scalar value constructors"> for additional information.

=head2 Improved Perl version numbering system

Beginning with Perl version 5.6.0, the version number convention has been
changed to a "dotted integer" scheme that is more commonly found in open
source projects.

Maintenance versions of v5.6.0 will be released as v5.6.1, v5.6.2 etc.
The next development series following v5.6.0 will be numbered v5.7.x,
beginning with v5.7.0, and the next major production release following
v5.6.0 will be v5.8.0.

The English module now sets $PERL_VERSION to $^V (a string value) rather
than C<$]> (a numeric value).  (This is a potential incompatibility.
Send us a report via perlbug if you are affected by this.)

The v1.2.3 syntax is also now legal in Perl.
See L</Support for strings represented as a vector of ordinals> for more on that.

To cope with the new versioning system's use of at least three significant
digits for each version component, the method used for incrementing the
subversion number has also changed slightly.  We assume that versions older
than v5.6.0 have been incrementing the subversion component in multiples of
10.  Versions after v5.6.0 will increment them by 1.  Thus, using the new
notation, 5.005_03 is the "same" as v5.5.30, and the first maintenance
version following v5.6.0 will be v5.6.1 (which should be read as being
equivalent to a floating point value of 5.006_001 in the older format,
stored in C<$]>).
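
The correspondence between the two notations can be sketched with a
small conversion routine.  The dotted_from_numeric() helper below is
hypothetical and only handles the C<5.006_001> and C<5.005_03> styles
of number discussed here:

    use strict;
    use warnings;

    # Hypothetical helper: turn an old-style number into dotted form.
    sub dotted_from_numeric {
        my ($num) = @_;
        $num =~ tr/_//d;                        # drop the optional underscore
        my ($major, $frac) = split /\./, $num;
        $frac .= '0' x (6 - length($frac));     # pad to two groups of three
        my ($minor, $sub) = $frac =~ /^(\d{3})(\d{3})/;
        return sprintf "v%d.%d.%d", $major, $minor, $sub;
    }

    print dotted_from_numeric('5.006_001'), "\n";    # v5.6.1
    print dotted_from_numeric('5.005_03'),  "\n";    # v5.5.30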

=head2 New syntax for declaring subroutine attributes

Formerly, if you wanted to mark a subroutine as being a method call or
as requiring an automatic lock() when it is entered, you had to declare
that with a C<use attrs> pragma in the body of the subroutine.
That can now be accomplished with declaration syntax, like this:

    sub mymethod : locked method;
    ...
    sub mymethod : locked method {
	...
    }

    sub othermethod :locked :method;
    ...
    sub othermethod :locked :method {
	...
    }


(Note how only the first C<:> is mandatory, and whitespace surrounding
the C<:> is optional.)

F<AutoSplit.pm> and F<SelfLoader.pm> have been updated to keep the attributes
with the stubs they provide.  See L<attributes>.

=head2 File and directory handles can be autovivified

Similar to how constructs such as C<< $x->[0] >> autovivify a reference,
handle constructors (open(), opendir(), pipe(), socketpair(), sysopen(),
socket(), and accept()) now autovivify a file or directory handle
if the handle passed to them is an uninitialized scalar variable.  This
allows constructs such as C<open(my $fh, ...)> and C<open(local $fh,...)>
to be used to create filehandles that will conveniently be closed
automatically when the scope ends, provided there are no other references
to them.  This largely eliminates the need for typeglobs when opening
filehandles that must be passed around, as in the following example:

    sub myopen {
        open my $fh, "@_"
	     or die "Can't open '@_': $!";
	return $fh;
    }

    {
        my $f = myopen("</etc/motd");
	print <$f>;
	# $f implicitly closed here
    }

=head2 open() with more than two arguments

If open() is passed three arguments instead of two, the second argument
is used as the mode and the third argument is taken to be the file name.
This is primarily useful for protecting against unintended magic behavior
of the traditional two-argument form.  See L<perlfunc/open>.
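
A minimal sketch of the three-argument form, assuming a scratch
directory from File::Temp.  The file name and its contents are made up:

    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    my $dir  = tempdir(CLEANUP => 1);
    my $path = "$dir/notes.txt";

    # The mode is a separate argument, so nothing in $path is
    # interpreted as a mode character or a pipe.
    open(my $out, '>', $path) or die "Can't write '$path': $!";
    print {$out} "first line\n";
    close($out) or die "Can't close '$path': $!";

    open(my $in, '<', $path) or die "Can't read '$path': $!";
    my $line = <$in>;
    close($in);

    print $line;    # first line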

=head2 64-bit support

Any platform that has 64-bit integers either

	(1) natively as longs or ints
	(2) via special compiler flags
	(3) using long long or int64_t

is able to use "quads" (64-bit integers) as follows:

=over 4

=item *

constants (decimal, hexadecimal, octal, binary) in the code 

=item *

arguments to oct() and hex()

=item *

arguments to print(), printf() and sprintf() (flag prefixes ll, L, q)

=item *

printed as such

=item *

pack() and unpack() "q" and "Q" formats

=item *

in basic arithmetic: + - * / % (NOTE: operating close to the limits
of the integer values may produce surprising results)

=item *

in bitwise arithmetic: & | ^ ~ << >> (NOTE: these used to be forced
to be 32 bits wide but now operate on the full native width.)

=item *

vec()

=back

Note that unless case (1) applies to your platform, you will have to
configure and compile Perl using the -Duse64bitint Configure flag.

    NOTE: The Configure flags -Duselonglong and -Duse64bits have been
    deprecated.  Use -Duse64bitint instead.

There are actually two modes of 64-bitness: the first one is achieved
using Configure -Duse64bitint and the second one using Configure
-Duse64bitall.  The difference is that the first one is minimal and
the second one maximal.  The first works in more places than the second.

The C<use64bitint> option does only as much as is required to get 64-bit
integers into Perl (this may mean, for example, using "long longs")
while your memory may still be limited to 2 gigabytes (because your
pointers could still be 32-bit).  Note that the name C<64bitint> does
not imply that your C compiler will be using 64-bit C<int>s (it might,
but it doesn't have to): C<use64bitint> means only that you will be
able to have 64-bit-wide scalar values.

The C<use64bitall> option goes all the way by also attempting to switch
ints (if it can), longs, and pointers to 64 bits.  This may
create an even more binary incompatible Perl than -Duse64bitint: the
resulting executable may not run at all on a 32-bit box, or you may
have to reboot/reconfigure/rebuild your operating system to be 64-bit
aware.

Natively 64-bit systems like Alpha and Cray need neither -Duse64bitint
nor -Duse64bitall.

Last but not least: note that due to Perl's habit of always using
floating point numbers, the quads are still not true integers.
When quads overflow their limits (0...18_446_744_073_709_551_615 unsigned,
-9_223_372_036_854_775_808...9_223_372_036_854_775_807 signed), they
are silently promoted to floating point numbers, after which they will
start losing precision (in their lower digits).

    NOTE: 64-bit support is still experimental on most platforms.
    Existing support only covers the LP64 data model.  In particular, the
    LLP64 data model is not yet supported.  64-bit libraries and system
    APIs on many platforms have not stabilized--your mileage may vary.

=head2 Large file support

If you have filesystems that support "large files" (files larger than
2 gigabytes), you may now also be able to create and access them from
Perl.

    NOTE: The default action is to enable large file support, if
    available on the platform.

If the large file support is on, and you have a Fcntl constant
O_LARGEFILE, the O_LARGEFILE is automatically added to the flags
of sysopen().

Beware that unless your filesystem also supports "sparse files", seeking
to umpteen petabytes may be inadvisable.

Note that in addition to requiring a proper file system to do large
files you may also need to adjust your per-process (or your
per-system, or per-process-group, or per-user-group) maximum filesize
limits before running Perl scripts that try to handle large files,
especially if you intend to write such files.

Finally, in addition to your process/process group maximum filesize
limits, you may have quota limits on your filesystems that stop you
(your user id or your user group id) from using large files.

Adjusting your process/user/group/file system/operating system limits
is outside the scope of the Perl core language.  For process limits, you
may try increasing the limits using your shell's limits/limit/ulimit
command before running Perl.  The BSD::Resource extension (not
included with the standard Perl distribution) may also be of use: it
offers the getrlimit/setrlimit interface that can be used to adjust
process resource usage limits, including the maximum filesize limit.

=head2 Long doubles

On some systems you may be able to use long doubles to enhance the
range and precision of your double precision floating point numbers
(that is, Perl's numbers).  Use Configure -Duselongdouble to enable
this support (if it is available).

=head2 "more bits"

You can "Configure -Dusemorebits" to turn on both the 64-bit support
and the long double support.

=head2 Enhanced support for sort() subroutines

Perl subroutines with a prototype of C<($$)>, and XSUBs in general, can
now be used as sort subroutines.  In either case, the two elements to
be compared are passed as normal parameters in @_.  See L<perlfunc/sort>.

For unprototyped sort subroutines, the historical behavior of passing 
the elements to be compared as the global variables $a and $b remains
unchanged.
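
A small sketch of a prototyped comparison routine.  The name
by_number() is only an example:

    use strict;
    use warnings;

    # With a ($$) prototype the elements arrive in @_, not in $a and $b.
    sub by_number ($$) {
        my ($left, $right) = @_;
        return $left <=> $right;
    }

    my @sorted = sort by_number 10, 9, 100, 1;
    print "@sorted\n";    # 1 9 10 100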

=head2 C<sort $coderef @foo> allowed

sort() did not accept a subroutine reference as the comparison
function in earlier versions.  This is now permitted.
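
For example (the variable names are arbitrary):

    use strict;
    use warnings;

    my $descending = sub { $b <=> $a };    # comparison held in a scalar
    my @foo        = (3, 1, 2);

    my @sorted = sort $descending @foo;
    print "@sorted\n";    # 3 2 1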

=head2 File globbing implemented internally

Perl now uses the File::Glob implementation of the glob() operator
automatically.  This avoids using an external csh process and the
problems associated with it.

    NOTE: This is currently an experimental feature.  Interfaces and
    implementation are subject to change.

=head2 Support for CHECK blocks

In addition to C<BEGIN>, C<INIT>, C<END>, C<DESTROY> and C<AUTOLOAD>,
subroutines named C<CHECK> are now special.  These are queued up during
compilation and behave similarly to END blocks, except they are called at
the end of compilation rather than at the end of execution.  They cannot
be called directly.
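
The ordering can be observed with a sketch like the following, which
assumes the code is run as the main program file (a CHECK block that
is first seen at run time, for example inside C<eval "...">, is too
late to be queued):

    use strict;
    use warnings;

    our @order;

    BEGIN { push @order, 'BEGIN' }    # runs as soon as it has been compiled
    INIT  { push @order, 'INIT'  }    # runs just before the main program
    CHECK { push @order, 'CHECK' }    # runs when compilation has finished

    push @order, 'main';
    print "@order\n";    # BEGIN CHECK INIT main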

=head2 POSIX character class syntax [: :] supported

For example, to match alphabetic characters, use /[[:alpha:]]/.
See L<perlre> for details.
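
A brief sketch with two of the classes (the sample strings are
arbitrary):

    use strict;
    use warnings;

    my $text    = "abc123 def";
    my @letters = $text =~ /([[:alpha:]]+)/g;                  # ("abc", "def")
    my $numeric = "2001" =~ /\A[[:digit:]]+\z/ ? 1 : 0;        # 1

    print "@letters\n";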

=head2 Better pseudo-random number generator

In 5.005_0x and earlier, perl's rand() function used the C library
rand(3) function.  As of 5.005_52, Configure tests for drand48(),
random(), and rand() (in that order) and picks the first one it finds.

These changes should result in better random numbers from rand().

=head2 Improved C<qw//> operator

The C<qw//> operator is now evaluated at compile time into a true list
instead of being replaced with a run time call to C<split()>.  This
removes the confusing misbehaviour of C<qw//> in scalar context, which
had inherited that behaviour from split().

Thus:

    $foo = ($bar) = qw(a b c); print "$foo|$bar\n";

now correctly prints "3|a", instead of "2|a".

=head2 Better worst-case behavior of hashes

Small changes in the hashing algorithm have been implemented in
order to improve the distribution of lower order bits in the
hashed value.  This is expected to yield better performance on
keys that are repeated sequences.

=head2 pack() format 'Z' supported

The new format type 'Z' is useful for packing and unpacking null-terminated
strings.  See L<perlfunc/"pack">.

=head2 pack() format modifier '!' supported

The new format type modifier '!' is useful for packing and unpacking
native shorts, ints, and longs.  See L<perlfunc/"pack">.

=head2 pack() and unpack() support counted strings

The template character '/' can be used to specify a counted string
type to be packed or unpacked.  See L<perlfunc/"pack">.

=head2 Comments in pack() templates

The '#' character in a template introduces a comment that runs to
the end of the line.  This facilitates documentation of pack()
templates.
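
The pack() additions described in the last four sections can be
sketched together as follows.  The sample data is invented, and the
native size shown for 'l!' depends on the platform:

    use strict;
    use warnings;
    use Config;

    # 'Z' packs a null-terminated string; unpack stops at the first null.
    my $padded = pack("Z5", "abc");                # "abc\0\0"
    my $name   = unpack("Z*", "abc\0ignored");     # "abc"

    # '!' selects the native size, here that of a C long.
    my $native_long = length pack("l!", 0);        # $Config{longsize}

    # '/' packs a count (one byte, 'C') followed by that many characters.
    my $counted = pack("C/a*", "hello");           # "\x05hello"
    my $payload = unpack("C/a*", $counted);        # "hello"

    # Whitespace and '#' comments are allowed inside a template.
    my $record = pack("C    # one unsigned byte
                       a2   # two characters", 65, "xy");    # "Axy"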

=head2 Weak references

In previous versions of Perl, you couldn't cache objects in a way
that allowed them to be destroyed once the last reference from outside
the cache was deleted.  The reference in the cache would hold a
reference count on the object, so the object would never be
destroyed.

Another familiar problem is with circular references.  When an
object references itself, its reference count never goes
down to zero, and it is not destroyed until the program
is about to exit.

Weak references solve this by allowing you to "weaken" any
reference, that is, make it not count towards the reference count.
When the last non-weak reference to an object is deleted, the object
is destroyed and all the weak references to the object are
automatically undef-ed.

To use this feature, you need the Devel::WeakRef package from CPAN, which
contains additional documentation.

    NOTE: This is an experimental feature.  Details are subject to change.  
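
The idea can be sketched with the weaken() function from Scalar::Util,
which later Perl releases ship in the core.  This is a substitute for
the Devel::WeakRef interface named above, not a demonstration of it,
and the C<Node> class is invented for the example:

    use strict;
    use warnings;
    use Scalar::Util qw(weaken);    # assumption: stands in for Devel::WeakRef

    my $destroyed = 0;
    { package Node; sub DESTROY { $destroyed++ } }    # hypothetical class

    {
        my $parent = bless {}, 'Node';
        my $child  = bless { parent => $parent }, 'Node';
        $parent->{child} = $child;     # cycle: parent <-> child
        weaken($child->{parent});      # back-reference no longer counts
    }

    # With the cycle broken, both objects are freed at the end of the block.
    print "destroyed: $destroyed\n";    # destroyed: 2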

=head2 Binary numbers supported

Binary numbers are now supported as literals, in printf() and sprintf()
formats, and by C<oct()>:

    $answer = 0b101010;
    printf "The answer is: %b\n", oct("0b101010");

=head2 Lvalue subroutines

Subroutines can now return modifiable lvalues.
See L<perlsub/"Lvalue subroutines">.

    NOTE: This is an experimental feature.  Details are subject to change.
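
A minimal sketch, assuming a made-up accessor named current() that
exposes a lexical variable:

    use strict;
    use warnings;

    my $value = 1;

    # The last expression evaluated is what gets assigned to.
    sub current : lvalue { $value }    # hypothetical accessor

    current() = 42;
    print "$value\n";    # 42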

=head2 Some arrows may be omitted in calls through references

Perl now allows the arrow to be omitted in many constructs
involving subroutine calls through references.  For example,
C<< $foo[10]->('foo') >> may now be written C<$foo[10]('foo')>.
This is rather similar to how the arrow may be omitted from
C<< $foo[10]->{'foo'} >>.  Note however, that the arrow is still
required for C<< foo(10)->('bar') >>.
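
The following sketch shows both spellings side by side.  The names C<@handlers> and C<make_handler> are invented for the example.

```perl
use strict;
use warnings;

my @handlers = (sub { return "handled $_[0]" });

# Between subscripts, the arrow before the argument list is optional.
my $with_arrow    = $handlers[0]->('foo');
my $without_arrow = $handlers[0]('foo');

# When the left-hand side is itself a function call, the arrow is required.
sub make_handler { my ($prefix) = @_; return sub { return "$prefix $_[0]" } }
my $from_call = make_handler('made')->('bar');

print "$with_arrow / $without_arrow / $from_call\n";
```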

=head2 Boolean assignment operators are legal lvalues

Constructs such as C<($a ||= 2) += 1> are now allowed.
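
A brief sketch of what this means in practice, with illustrative variable names:

```perl
use strict;
use warnings;

# ||= supplies a default only when the variable is false,
# and its result can then be modified as an lvalue.
my $unset;
($unset ||= 2) += 1;      # default applied, then incremented

my $preset = 10;
($preset ||= 2) += 1;     # default skipped, then incremented

print "unset: $unset, preset: $preset\n";
```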

=head2 exists() is supported on subroutine names

The exists() builtin now works on subroutine names.  A subroutine
is considered to exist if it has been declared (even if implicitly).
See L<perlfunc/exists> for examples.
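
As a sketch of the distinction between exists() and defined() on subroutine names (the subroutine names here are hypothetical):

```perl
use strict;
use warnings;

sub declared_only;            # forward declaration without a body
sub with_body { return 1 }

# exists() asks whether the name has been declared;
# defined() asks whether it also has a body.
my $stub_exists   = (exists  &declared_only)   ? "yes" : "no";
my $stub_defined  = (defined &declared_only)   ? "yes" : "no";
my $body_defined  = (defined &with_body)       ? "yes" : "no";
my $absent_exists = (exists  &never_mentioned) ? "yes" : "no";

print "declared_only:   exists=$stub_exists defined=$stub_defined\n";
print "with_body:       defined=$body_defined\n";
print "never_mentioned: exists=$absent_exists\n";
```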

=head2 exists() and delete() are supported on array elements

The exists() and delete() builtins now work on simple arrays as well.
The behavior is similar to that on hash elements.

exists() can be used to check whether an array element has been
initialized.  This avoids autovivifying array elements that don't exist.
If the array is tied, the EXISTS() method in the corresponding tied
package will be invoked.

delete() may be used to remove an element from the array and return
it.  The array element at that position returns to its uninitialized
state, so that testing for the same element with exists() will return
false.  If the element happens to be the one at the end, the array
also shrinks down to the highest element that tests true for
exists(), or to size 0 if no such element is found.  If the array is
tied, the DELETE() method in the corresponding tied package will be invoked.

See L<perlfunc/exists> and L<perlfunc/delete> for examples.
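
The behavior described above can be sketched with a sparse array; C<@slots> is just an illustrative name.

```perl
use strict;
use warnings;

my @slots;
$slots[3] = "d";              # elements 0..2 are never initialized

my $gap_exists  = exists $slots[1] ? 1 : 0;   # never assigned
my $last_exists = exists $slots[3] ? 1 : 0;   # assigned above
my $size_before = scalar @slots;

# Deleting the final element returns it and lets the array shrink.
my $removed    = delete $slots[3];
my $size_after = scalar @slots;

print "gap=$gap_exists last=$last_exists removed=$removed ",
      "size $size_before -> $size_after\n";
```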

=head2 Pseudo-hashes work better

Dereferencing some types of reference values in a pseudo-hash,
such as C<< $ph->{foo}[1] >>, was accidentally disallowed.  This has
been corrected.

When applied to a pseudo-hash element, exists() now reports whether
the specified value exists, not merely if the key is valid.

delete() now works on pseudo-hashes.  When given a pseudo-hash element
or slice it deletes the values corresponding to the keys (but not the keys
themselves).  See L<perlref/"Pseudo-hashes: Using an array as a hash">.

Pseudo-hash slices with constant keys are now optimized to array lookups
at compile-time.

List assignments to pseudo-hash slices are now supported.

The C<fields> pragma now provides ways to create pseudo-hashes, via
fields::new() and fields::phash().  See L<fields>.

    NOTE: The pseudo-hash data type continues to be experimental.
    Limiting oneself to the interface elements provided by the
    fields pragma will provide protection from any future changes.

=head2 Automatic flushing of output buffers

fork(), exec(), system(), qx//, and pipe open()s now flush buffers
of all files opened for output when the operation was attempted.  This
mostly eliminates confusing buffering mishaps suffered by users unaware
of how Perl internally handles I/O.

This is not supported on some platforms like Solaris where a suitably
correct implementation of fflush(NULL) isn't available.

=head2 Better diagnostics on meaningless filehandle operations

Constructs such as C<< open(<FH>) >> and C<< close(<FH>) >>
are now compile-time errors.  Attempting to read from filehandles that
were opened only for writing will now produce warnings (just as
writing to read-only filehandles does).

=head2 Where possible, buffered data discarded from duped input filehandle

C<< open(NEW, "<&OLD") >> now attempts to discard any data that
was previously read and buffered in C<OLD> before duping the handle.
On platforms where doing this is allowed, the next read operation
on C<NEW> will return the same data as the corresponding operation
on C<OLD>.  Formerly, it would have returned the data from the start
of the following disk block instead.

=head2 eof() has the same old magic as <>

Formerly, C<eof()> would return true if no attempt to read from C<< <> >> had
yet been made.  C<eof()> has been changed to have a little magic of its
own: it now opens the C<< <> >> files.

=head2 binmode() can be used to set :crlf and :raw modes

binmode() now accepts a second argument that specifies a discipline
for the handle in question.  The two pseudo-disciplines ":raw" and
":crlf" are currently supported on DOS-derivative platforms.
See L<perlfunc/"binmode"> and L<open>.

=head2 C<-T> filetest recognizes UTF-8 encoded files as "text"

The algorithm used for the C<-T> filetest has been enhanced to
correctly identify UTF-8 content as "text".

=head2 system(), backticks and pipe open now reflect exec() failure

On Unix and similar platforms, system(), qx() and open(FOO, "cmd |")
etc., are implemented via fork() and exec().  When the underlying
exec() fails, earlier versions did not report the error properly,
since the exec() happened to be in a different process.

The child process now communicates with the parent about the
error in launching the external command, which allows these
constructs to return with their usual error value and set $!.
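
On a Unix-like platform, the effect can be sketched as follows.  The command path is hypothetical and is assumed not to exist on the system.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist.
my $command = "/nonexistent/perl-delta-demo";

my ($status, $error);
{
    no warnings 'exec';       # keep the demo quiet about the failed exec
    $status = system($command);
    $error  = "$!";           # copy $! before anything else can change it
}

if ($status == -1) {
    print "could not launch $command: $error\n";
}
```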

=head2 Improved diagnostics

Line numbers are no longer suppressed (under most likely circumstances)
during the global destruction phase.

Diagnostics emitted from code running in threads other than the main
thread are now accompanied by the thread ID.

Embedded null characters in diagnostics now actually show up.  They
used to truncate the message in prior versions.

$foo::a and $foo::b are now exempt from "possible typo" warnings only
if sort() is encountered in package C<foo>.

Unrecognized alphabetic escapes encountered when parsing quote
constructs now generate a warning, since they may take on new
semantics in later versions of Perl.

Many diagnostics now report the internal operation in which the warning
was provoked, like so:

    Use of uninitialized value in concatenation (.) at (eval 1) line 1.
    Use of uninitialized value in print at (eval 1) line 1.

Diagnostics that occur within eval may also report the file and line
number where the eval is located, in addition to the eval sequence
number and the line number within the evaluated text itself.  For
example:

    Not enough arguments for scalar at (eval 4)[newlib/perl5db.pl:1411] line 2, at EOF

=head2 Diagnostics follow STDERR

Diagnostic output now goes to whichever file the C<STDERR> handle
is pointing at, instead of always going to the underlying C runtime
library's C<stderr>.

=head2 More consistent close-on-exec behavior

On systems that support a close-on-exec flag on filehandles, the
flag is now set for any handles created by pipe(), socketpair(),
socket(), and accept(), if that is warranted by the value of $^F
that may be in effect.  Earlier versions neglected to set the flag
for handles created with these operators.  See L<perlfunc/pipe>,
L<perlfunc/socketpair>, L<perlfunc/socket>, L<perlfunc/accept>,
and L<perlvar/$^F>.

=head2 syswrite() ease-of-use

The length argument of C<syswrite()> has become optional.
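
For example (a minimal sketch writing to C<STDOUT>):

```perl
use strict;
use warnings;

my $message = "written without an explicit length\n";

# With the length omitted, syswrite() attempts to write the whole scalar
# and returns the number of bytes actually written.
my $written = syswrite(STDOUT, $message);

print "syswrite reported $written bytes\n" if defined $written;
```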

=head2 Better syntax checks on parenthesized unary operators

Expressions such as:

    print defined(&foo,&bar,&baz);
    print uc("foo","bar","baz");
    undef($foo,&bar);

used to be accidentally allowed in earlier versions, and produced
unpredictable behaviour.  Some produced ancillary warnings
when used in this way; others silently did the wrong thing.

The parenthesized forms of most unary operators that expect a single
argument now ensure that they are not called with more than one
argument, making the cases shown above syntax errors.  The usual
behaviour of:

    print defined &foo, &bar, &baz;
    print uc "foo", "bar", "baz";
    undef $foo, &bar;

remains unchanged.  See L<perlop>.

=head2 Bit operators support full native integer width

The bit operators (& | ^ ~ << >>) now operate on the full native
integral width (the exact size of which is available in $Config{ivsize}).
For example, if your platform is either natively 64-bit or if Perl
has been configured to use 64-bit integers, these operations apply
to 8 bytes (as opposed to 4 bytes on 32-bit platforms).
For portability, be sure to mask off the excess bits in the result of
unary C<~>, e.g., C<~$x & 0xffffffff>.
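
A short sketch of the portability concern, using an arbitrary value for C<$x>:

```perl
use strict;
use warnings;
use Config;

my $x = 0x0f;

# The unmasked result depends on the native integer width.
my $inverted = ~$x;

# Masking keeps only the low 32 bits, giving the same answer everywhere.
my $portable = ~$x & 0xffffffff;

printf "ivsize=%d  ~x=%x  masked=%x\n",
    $Config{ivsize}, $inverted, $portable;
```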

=head2 Improved security features

More potentially unsafe operations taint their results for improved
security.

The C<passwd> and C<shell> fields returned by getpwent(), getpwnam(),
and getpwuid() are now tainted, because the user can affect their own
encrypted password and login shell.

The variable modified by shmread(), and messages returned by msgrcv()
(and its object-oriented interface IPC::SysV::Msg::rcv) are also tainted,
because other untrusted processes can modify messages and shared memory
segments for their own nefarious purposes.

=head2 More functional bareword prototype (*)

Bareword prototypes have been rationalized to enable them to be used
to override builtins that accept barewords and interpret them in
a special way, such as C<require> or C<do>.

Arguments prototyped as C<*> will now be visible within the subroutine
as either a simple scalar or as a reference to a typeglob.
See L<perlsub/Prototypes>.

=head2 C<require> and C<do> may be overridden

C<require> and C<do 'file'> operations may be overridden locally
by importing subroutines of the same name into the current package 
(or globally by importing them into the CORE::GLOBAL:: namespace).
Overriding C<require> will also affect C<use>, provided the override
is visible at compile-time.
See L<perlsub/"Overriding Built-in Functions">.

=head2 $^X variables may now have names longer than one character

Formerly, $^X was synonymous with ${"\cX"}, but $^XY was a syntax
error.  Now variable names that begin with a control character may be
arbitrarily long.  However, for compatibility reasons, these variables
I<must> be written with explicit braces, as C<${^XY}> for example.
C<${^XYZ}> is synonymous with ${"\cXYZ"}.  Variable names with more
than one control character, such as C<${^XY^Z}>, are illegal.

The old syntax has not changed.  As before, `^X' may be either a
literal control-X character or the two-character sequence `caret' plus
`X'.  When braces are omitted, the variable name stops after the
control character.  Thus C<"$^XYZ"> continues to be synonymous with
C<$^X . "YZ"> as before.

As before, lexical variables may not have names beginning with control
characters.  As before, variables whose names begin with a control
character are always forced to be in package `main'.  All such variables
are reserved for future extensions, except those that begin with
C<^_>, which may be used by user programs and are guaranteed not to
acquire special meaning in any future version of Perl.
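
The rules above can be sketched as follows.  The variable name C<${^_DEMO_TRACE}> and the package C<Elsewhere> are invented for the example.

```perl
use strict;
use warnings;

# A user variable in the reserved-for-users ^_ namespace.
# Braces are mandatory for multi-character names.
${^_DEMO_TRACE} = 1;

package Elsewhere;

# Such variables always live in package main, whatever the current package.
my $seen_elsewhere = ${^_DEMO_TRACE};

package main;

# Without braces, the name stops after the control character,
# so "$^XYZ" is $^X followed by the literal text "YZ".
my $old_syntax_ok = ("$^XYZ" eq $^X . "YZ") ? 1 : 0;

print "seen elsewhere: $seen_elsewhere, old syntax unchanged: $old_syntax_ok\n";
```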

=head2 New variable $^C reflects C<-c> switch

C<$^C> has a boolean value that reflects whether perl is being run
in compile-only mode (i.e. via the C<-c> switch).  Since
BEGIN blocks are executed under such conditions, this variable
enables perl code to determine whether actions that make sense
only during normal running are warranted.  See L<perlvar>.

=head2 New variable $^V contains Perl version as a string

C<$^V> contains the Perl version number as a string composed of
characters whose ordinals match the version numbers, i.e. v5.6.0.
This may be used in string comparisons.

See C<Support for strings represented as a vector of ordinals> for an
example.
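
A minimal sketch of both uses, assuming the C<%vd> sprintf() flag introduced alongside this variable:

```perl
use strict;
use warnings;

# %vd prints the ordinals of the version string joined by dots.
my $dotted = sprintf("%vd", $^V);
print "Running Perl $dotted\n";

# String comparison against a version literal.
my $recent_enough = ($^V ge v5.6.0) ? 1 : 0;
print "5.6.0 or later\n" if $recent_enough;
```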

=head2 Optional Y2K warnings

If Perl is built with the cpp macro C<PERL_Y2KWARN> defined,
it emits optional warnings when concatenating the number 19
with another number.

This behavior must be specifically enabled when running Configure.
See F<INSTALL> and F<README.Y2K>.

=head2 Arrays now always interpolate into double-quoted strings

In double-quoted strings, arrays now interpolate, no matter what.  The
behavior in earlier versions of perl 5 was that arrays would interpolate
into strings if the array had been mentioned before the string was
compiled, and otherwise Perl would raise a fatal compile-time error.
In versions 5.000 through 5.003, the error was

        Literal @example now requires backslash

In versions 5.004_01 through 5.6.0, the error was

        In string, @example now must be written as \@example

The idea here was to get people into the habit of writing
C<"fred\@example.com"> when they wanted a literal C<@> sign, just as
they have always written C<"Give me back my \$5"> when they wanted a
literal C<$> sign.

Starting with 5.6.1, when Perl sees an C<@> sign in a
double-quoted string, it I<always> attempts to interpolate an array,
regardless of whether or not the array has been used or declared
already.  The fatal error has been downgraded to an optional warning:

        Possible unintended interpolation of @example in string

This warns you that C<"fred@example.com"> is going to turn into
C<fred.com> if you don't backslash the C<@>.
See http://perl.plover.com/at-error.html for more details
about the history here.
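
To make the difference concrete, here is a small sketch in which an array named C<@example> happens to exist:

```perl
use strict;
use warnings;

my @example = ("one", "two");

# A backslash keeps the @ sign literal.
my $literal = "fred\@example.com";

# Without it, the array is interpolated, with elements joined by a space.
my $interpolated = "fred@example.com";

print "$literal\n$interpolated\n";
```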

=head2 @- and @+ provide starting/ending offsets of regex submatches

The new magic variables @- and @+ provide the starting and ending
offsets, respectively, of $&, $1, $2, etc.  See L<perlvar> for
details.
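
A brief sketch of how the offsets line up with the match variables (the sample string is arbitrary):

```perl
use strict;
use warnings;

my $text = "hello world";

if ($text =~ /(wor)ld/) {
    # Index 0 describes the whole match ($&); index 1 describes $1.
    my ($match_start, $match_end) = ($-[0], $+[0]);
    my ($group_start, $group_end) = ($-[1], $+[1]);

    # The offsets can be fed straight to substr().
    my $group = substr($text, $group_start, $group_end - $group_start);

    print "match: $match_start..$match_end, ",
          "group: $group_start..$group_end ($group)\n";
}
```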

=head1 Modules and Pragmata

=head2 Modules

=over 4

=item attributes

While used internally by Perl as a pragma, this module also
provides a way to fetch subroutine and variable attributes.
See L<attributes>.

=item B

The Perl Compiler suite has been extensively reworked for this
release.  More of the standard Perl test suite passes when run
under the Compiler, but there is still a significant way to
go to achieve production quality compiled executables.

    NOTE: The Compiler suite remains highly experimental.  The
    generated code may not be correct, even when it manages to execute
    without errors.

=item Benchmark

Overall, Benchmark results exhibit lower average error and better timing
accuracy.  

You can now run tests for I<n> seconds instead of guessing the right
number of tests to run: e.g., timethese(-5, ...) will run each
piece of code for at least 5 CPU seconds.  Zero as the "number of repetitions"
means "for at least 3 CPU seconds".  The output format has also
changed.  For example:

   use Benchmark;$x=3;timethese(-5,{a=>sub{$x*$x},b=>sub{$x**2}})

will now output something like this:

   Benchmark: running a, b, each for at least 5 CPU seconds...
            a:  5 wallclock secs ( 5.77 usr +  0.00 sys =  5.77 CPU) @ 200551.91/s (n=1156516)
            b:  4 wallclock secs ( 5.00 usr +  0.02 sys =  5.02 CPU) @ 159605.18/s (n=800686)

New features: "each for at least N CPU seconds...", "wallclock secs",
and the "@ operations/CPU second (n=operations)".

timethese() now returns a reference to a hash of Benchmark objects containing
the test results, keyed on the names of the tests.

timethis() now returns the iterations field in the Benchmark result object
instead of 0.

timethese(), timethis(), and the new cmpthese() (see below) can also take
a format specifier of 'none' to suppress output.

A new function countit() is just like timeit() except that it takes a
TIME instead of a COUNT.

A new function cmpthese() prints a chart comparing the results of each test
returned from a timethese() call.  For each possible pair of tests, the
percentage speed difference (iters/sec or seconds/iter) is shown.

For other details, see L<Benchmark>.

=item ByteLoader

The ByteLoader is a dedicated extension to generate and run
Perl bytecode.  See L<ByteLoader>.

=item constant

References can now be used as constant values.

The new version also allows a leading underscore in constant names, but
disallows a double leading underscore (as in "__LINE__").  Some other names
are disallowed or warned against, including BEGIN, END, etc.  Some names
which were forced into main:: used to fail silently in some cases; now they're
fatal (outside of main::) and an optional warning (inside of main::).
The ability to detect whether a constant had been set with a given name has
been added.

See L<constant>.

=item charnames

This pragma implements the C<\N> string escape.  See L<charnames>.

=item Data::Dumper

A C<Maxdepth> setting can be specified to avoid venturing
too deeply into deep data structures.  See L<Data::Dumper>.

The XSUB implementation of Dump() is now automatically called if the
C<Useqq> setting is not in use.

Dumping C<qr//> objects works correctly.

=item DB

C<DB> is an experimental module that exposes a clean abstraction
to Perl's debugging API.

=item DB_File

DB_File can now be built with Berkeley DB versions 1, 2 or 3.
See C<ext/DB_File/Changes>.

=item Devel::DProf

Devel::DProf, a Perl source code profiler, has been added.  See
L<Devel::DProf> and L<dprofpp>.

=item Devel::Peek

The Devel::Peek module provides access to the internal representation
of Perl variables and data.  It is a data debugging tool for the XS programmer.

=item Dumpvalue

The Dumpvalue module provides screen dumps of Perl data.

=item DynaLoader

DynaLoader now supports a dl_unload_file() function on platforms that
support unloading shared objects using dlclose().

Perl can also optionally arrange to unload all extension shared objects
loaded by Perl.  To enable this, build Perl with the Configure option
C<-Accflags=-DDL_UNLOAD_ALL_AT_EXIT>.  (This may be useful if you are
using Apache with mod_perl.)

=item English

$PERL_VERSION now stands for C<$^V> (a string value) rather than for C<$]>
(a numeric value).

=item Env

Env now supports accessing environment variables like PATH as array
variables.
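
A minimal sketch of the array interface; the directory being appended is hypothetical.

```perl
use strict;
use warnings;
use Env qw(@PATH);

# Hypothetical directory, used only for illustration.
my $extra = "/opt/demo/bin";

# Changes to @PATH are reflected in $ENV{PATH}.
push @PATH, $extra;

print "PATH now has ", scalar(@PATH), " entries\n";
print "last entry: $PATH[-1]\n";
```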

=item Fcntl

More Fcntl constants have been added: F_SETLK64, F_SETLKW64, and
O_LARGEFILE for large file (more than 4GB) access; the Free/Net/OpenBSD
locking behaviour flags F_FLOCK and F_POSIX; Linux F_SHLCK; and
O_ACCMODE, the combined mask of O_RDONLY, O_WRONLY, and O_RDWR.
(NOTE: O_LARGEFILE is automatically added to sysopen() flags if large
file support has been configured, as is the default.)  The
seek()/sysseek() constants SEEK_SET, SEEK_CUR, and SEEK_END are
available via the C<:seek> tag.  The chmod()/stat() S_IF* constants
and S_IS* functions are available via the C<:mode> tag.

=item File::Compare

A compare_text() function has been added, which allows custom
comparison functions.  See L<File::Compare>.

=item File::Find

File::Find now works correctly when the wanted() function is either
autoloaded or is a symbolic reference.

A bug that caused File::Find to lose track of the working directory
when pruning top-level directories has been fixed.

File::Find now also supports several other options to control its
behavior.  It can follow symbolic links if the C<follow> option is
specified.  Enabling the C<no_chdir> option will make File::Find skip
changing the current directory when walking directories.  The C<untaint>
flag can be useful when running with taint checks enabled.

See L<File::Find>.

=item File::Glob

This extension implements BSD-style file globbing.  By default,
it will also be used for the internal implementation of the glob()
operator.  See L<File::Glob>.

=item File::Spec

New methods have been added to the File::Spec module: devnull() returns
the name of the null device (/dev/null on Unix) and tmpdir() the name of
the temp directory (normally /tmp on Unix).  There are now also methods
to convert between absolute and relative filenames: abs2rel() and
rel2abs().  For compatibility with operating systems that specify volume
names in file paths, the splitpath(), splitdir(), and catdir() methods
have been added.
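
The following sketch exercises several of these methods.  It calls File::Spec::Unix directly so that the results do not depend on the platform running the example; the paths are arbitrary.

```perl
use strict;
use warnings;
use File::Spec::Unix;

my $fs = "File::Spec::Unix";

# Split a path into volume, directory portion, and filename.
my ($volume, $dirs, $file) = $fs->splitpath("/usr/lib/perl5/strict.pm");

# Split and rejoin the directory components.
my @parts  = $fs->splitdir("/usr/lib/perl5");
my $joined = $fs->catdir("usr", "lib", "perl5");

# Convert between absolute and relative names against a base directory.
my $relative = $fs->abs2rel("/usr/lib/perl5", "/usr");
my $absolute = $fs->rel2abs("lib/perl5", "/usr");

print "dirs=$dirs file=$file\n";
print "joined=$joined relative=$relative absolute=$absolute\n";
print "null device: ", $fs->devnull, "\n";
```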

=item File::Spec::Functions

The new File::Spec::Functions module provides a function interface
to the File::Spec module.  It allows the shorthand

    $fullname = catfile($dir1, $dir2, $file);

instead of

    $fullname = File::Spec->catfile($dir1, $dir2, $file);

=item Getopt::Long

Getopt::Long licensing has changed to allow the Perl Artistic License
as well as the GPL. It used to be GPL only, which got in the way of
non-GPL applications that wanted to use Getopt::Long.

Getopt::Long encourages the use of Pod::Usage to produce help
messages. For example:

    use Getopt::Long;
    use Pod::Usage;
    my $man = 0;
    my $help = 0;
    GetOptions('help|?' => \$help, man => \$man) or pod2usage(2);
    pod2usage(1) if $help;
    pod2usage(-exitstatus => 0, -verbose => 2) if $man;

    __END__

    =head1 NAME

    sample - Using Getopt::Long and Pod::Usage

    =head1 SYNOPSIS

    sample [options] [file ...]

     Options:
       -help            brief help message
       -man             full documentation

    =head1 OPTIONS

    =over 8

    =item B<-help>

    Prints a brief help message and exits.

    =item B<-man>

    Prints the manual page and exits.

    =back

    =head1 DESCRIPTION

    B<This program> will read the given input file(s) and do something
    useful with the contents thereof.

    =cut

See L<Pod::Usage> for details.

A bug that prevented the non-option call-back <> from being
specified as the first argument has been fixed.

To specify the characters < and > as option starters, use ><. Note,
however, that changing option starters is strongly deprecated. 

=item IO

write() and syswrite() will now accept a single-argument
form of the call, for consistency with Perl's syswrite().

You can now create a TCP-based IO::Socket::INET without forcing
a connect attempt.  This allows you to configure its options
(like making it non-blocking) and then call connect() manually.

A bug that prevented the IO::Socket::protocol() accessor
from ever returning the correct value has been corrected.

IO::Socket::connect now uses non-blocking IO instead of alarm()
to do connect timeouts.

IO::Socket::accept now uses select() instead of alarm() for doing
timeouts.

IO::Socket::INET->new now sets $! correctly on failure. $@ is
still set for backwards compatibility.

=item JPL

Java Perl Lingo is now distributed with Perl.  See jpl/README
for more information.

=item lib

C<use lib> now weeds out any trailing duplicate entries.
C<no lib> removes all named entries.

=item Math::BigInt

The bitwise operations C<<< << >>>, C<<< >> >>>, C<&>, C<|>,
and C<~> are now supported on bigints.

=item Math::Complex

The accessor methods Re, Im, arg, abs, rho, and theta can now also
act as mutators (accessor $z->Re(), mutator $z->Re(3)).

The class method C<display_format> and the corresponding object method
C<display_format>, in addition to accepting just one argument, now can
also accept a parameter hash.  Recognized keys of a parameter hash are
C<"style">, which corresponds to the old one parameter case, and two
new parameters: C<"format">, which is a printf()-style format string
(defaults usually to C<"%.15g">; you can revert to the default by
setting the format string to C<undef>) used for both parts of a
complex number, and C<"polar_pretty_print"> (defaults to true),
which controls whether an attempt is made to try to recognize small
multiples and rationals of pi (2pi, pi/2) at the argument (angle) of a
polar complex number.

The potentially disruptive change is that in list context both methods
now I<return the parameter hash>, instead of only the value of the
C<"style"> parameter.

=item Math::Trig

A little bit of radial trigonometry (cylindrical and spherical),
radial coordinate conversions, and the great circle distance were added.

=item Pod::Parser, Pod::InputObjects

Pod::Parser is a base class for parsing and selecting sections of
pod documentation from an input stream.  This module takes care of
identifying pod paragraphs and commands in the input and hands off the
parsed paragraphs and commands to user-defined methods which are free
to interpret or translate them as they see fit.

Pod::InputObjects defines some input objects needed by Pod::Parser, and
by advanced users of Pod::Parser who need more information about a command
than its name and text.

As of release 5.6.0 of Perl, Pod::Parser is now the officially sanctioned
"base parser code" recommended for use by all pod2xxx translators.
Pod::Text (pod2text) and Pod::Man (pod2man) have already been converted
to use Pod::Parser and efforts to convert Pod::HTML (pod2html) are already
underway.  For any questions or comments about pod parsing and translating
issues and utilities, please use the pod-people@perl.org mailing list.

For further information, please see L<Pod::Parser> and L<Pod::InputObjects>.

=item Pod::Checker, podchecker

This utility checks pod files for correct syntax, according to
L<perlpod>.  Obvious errors are flagged as such, while warnings are
printed for mistakes that can be handled gracefully.  The checklist is
not complete yet.  See L<Pod::Checker>.

=item Pod::ParseUtils, Pod::Find

These modules provide a set of gizmos that are useful mainly for pod
translators.  L<Pod::Find|Pod::Find> traverses directory structures and
returns found pod files, along with their canonical names (like
C<File::Spec::Unix>).  L<Pod::ParseUtils|Pod::ParseUtils> contains
B<Pod::List> (useful for storing pod list information), B<Pod::Hyperlink>
(for parsing the contents of C<LE<lt>E<gt>> sequences) and B<Pod::Cache>
(for caching information about pod files, e.g., link nodes).

=item Pod::Select, podselect

Pod::Select is a subclass of Pod::Parser which provides a function
named "podselect()" to filter out user-specified sections of raw pod
documentation from an input stream. podselect is a script that provides
access to Pod::Select from other scripts to be used as a filter.
See L<Pod::Select>.

=item Pod::Usage, pod2usage

Pod::Usage provides the function "pod2usage()" to print usage messages for
a Perl script based on its embedded pod documentation.  The pod2usage()
function is generally useful to all script authors since it lets them
write and maintain a single source (the pods) for documentation, thus
removing the need to create and maintain redundant usage message text
consisting of information already in the pods.

There is also a pod2usage script which can be used from other kinds of
scripts to print usage messages from pods (even for non-Perl scripts
with pods embedded in comments).

For details and examples, please see L<Pod::Usage>.

=item Pod::Text and Pod::Man

Pod::Text has been rewritten to use Pod::Parser.  While pod2text() is
still available for backwards compatibility, the module now has a new
preferred interface.  See L<Pod::Text> for the details.  The new Pod::Text
module is easily subclassed for tweaks to the output, and two such
subclasses (Pod::Text::Termcap for man-page-style bold and underlining
using termcap information, and Pod::Text::Color for markup with ANSI color
sequences) are now standard.

pod2man has been turned into a module, Pod::Man, which also uses
Pod::Parser.  In the process, several outstanding bugs related to quotes
in section headers, quoting of code escapes, and nested lists have been
fixed.  pod2man is now a wrapper script around this module.

=item SDBM_File

An EXISTS method has been added to this module (and sdbm_exists() has
been added to the underlying sdbm library), so one can now call exists
on an SDBM_File tied hash and get the correct result, rather than a
runtime error.

A bug that may have caused data loss when more than one disk block
happens to be read from the database in a single FETCH() has been
fixed.

=item Sys::Syslog

Sys::Syslog now uses XSUBs to access facilities from syslog.h so it
no longer requires syslog.ph to exist. 

=item Sys::Hostname

Sys::Hostname now uses XSUBs to call the C library's gethostname() or
uname() if they exist.

=item Term::ANSIColor

Term::ANSIColor is a very simple module to provide easy and readable
access to the ANSI color and highlighting escape sequences, supported by
most ANSI terminal emulators.  It is now included standard.

=item Time::Local

The timelocal() and timegm() functions used to silently return bogus
results when the date fell outside the machine's integer range.  They
now consistently croak() if the date falls in an unsupported range.

=item Win32

The error return value in list context has been changed for all functions
that return a list of values.  Previously these functions returned a list
with a single element C<undef> if an error occurred.  Now these functions
return the empty list in these situations.  This applies to the following
functions:

    Win32::FsType
    Win32::GetOSVersion

The remaining functions are unchanged and continue to return C<undef> on
error even in list context.

The Win32::SetLastError(ERROR) function has been added as a complement
to the Win32::GetLastError() function.

The new Win32::GetFullPathName(FILENAME) returns the full absolute
pathname for FILENAME in scalar context.  In list context it returns
a two-element list containing the fully qualified directory name and
the filename.  See L<Win32>.

=item XSLoader

The XSLoader extension is a simpler alternative to DynaLoader.
See L<XSLoader>.

=item DBM Filters

A new feature called "DBM Filters" has been added to all the
DBM modules--DB_File, GDBM_File, NDBM_File, ODBM_File, and SDBM_File.
DBM Filters add four new methods to each DBM module:

    filter_store_key
    filter_store_value
    filter_fetch_key
    filter_fetch_value

These can be used to filter key-value pairs before the pairs are
written to the database or just after they are read from the database.
See L<perldbmfilter> for further information.

=back
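
As an illustration of the DBM Filters described above, here is a sketch using SDBM_File.  It assumes File::Temp is available for a scratch directory, and the filters themselves (lower-casing keys, wrapping values) are invented for the example.

```perl
use strict;
use warnings;
use Fcntl;
use SDBM_File;
use File::Temp qw(tempdir);    # assumption: used only for a scratch directory

my $dir = tempdir(CLEANUP => 1);
my %db;
my $handle = tie(%db, 'SDBM_File', "$dir/demo", O_RDWR | O_CREAT, 0666)
    or die "Cannot tie SDBM file: $!";

# Each filter receives the data in $_ and modifies it in place.
$handle->filter_store_key(sub { $_ = lc $_ });       # applied to keys going in
$handle->filter_fetch_value(sub { $_ = "<$_>" });    # applied to values coming out

$db{"Apple"} = "fruit";
my $value = $db{"APPLE"};      # the lookup key passes through the store filter too
my @keys  = keys %db;

print "value: $value, keys: @keys\n";

undef $handle;                 # drop the extra reference before untying
untie %db;
```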

=head2 Pragmata

C<use attrs> is now obsolete, and is only provided for
backward-compatibility.  It's been replaced by the C<sub : attributes>
syntax.  See L<perlsub/"Subroutine Attributes"> and L<attributes>.

The lexical warnings pragma, C<use warnings;>, controls optional warnings.
See L<perllexwarn>.
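
A small sketch of the lexical scoping, capturing warnings with a C<__WARN__> handler so they can be counted (the handler and variable names are illustrative):

```perl
use strict;
use warnings;

my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my $unset;
my $loud = "value: " . $unset;          # warns: warnings are enabled here

{
    # Only this block has the 'uninitialized' category switched off.
    no warnings 'uninitialized';
    my $quiet = "value: " . $unset;     # no warning
}

print scalar(@caught), " warning(s) caught\n";
```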

C<use filetest> controls the behaviour of filetests (C<-r> C<-w>
...).  Currently only one subpragma is implemented, "use filetest
'access';", which uses access(2) or equivalent to check permissions
instead of using stat(2) as usual.  This matters in filesystems
where there are ACLs (access control lists): the stat(2) might lie,
but access(2) knows better.

The C<open> pragma can be used to specify default disciplines for
handle constructors (e.g. open()) and for qx//.  The two
pseudo-disciplines C<:raw> and C<:crlf> are currently supported on
DOS-derivative platforms (i.e. where binmode is not a no-op).
See also L</"binmode() can be used to set :crlf and :raw modes">.

=head1 Utility Changes

=head2 dprofpp

C<dprofpp> is used to display profile data generated using C<Devel::DProf>.
See L<dprofpp>.

=head2 find2perl

The C<find2perl> utility now uses the enhanced features of the File::Find
module.  The -depth and -follow options are supported.  Pod documentation
is also included in the script.

=head2 h2xs

The C<h2xs> tool can now work in conjunction with C<C::Scan> (available
from CPAN) to automatically parse real-life header files.  The C<-M>,
C<-a>, C<-k>, and C<-o> options are new.

=head2 perlcc

C<perlcc> now supports the C and Bytecode backends.  By default,
it generates output from the simple C backend rather than the
optimized C backend.

Support for non-Unix platforms has been improved.

=head2 perldoc

C<perldoc> has been reworked to avoid possible security holes.
It will not by default let itself be run as the superuser, but you
may still use the B<-U> switch to try to make it drop privileges
first.

=head2 The Perl Debugger

Many bug fixes and enhancements were added to F<perl5db.pl>, the
Perl debugger.  New commands
include C<< < ? >>, C<< > ? >>, and C<< { ? >> to list out current
actions, C<man I<docpage>> to run your doc viewer on some perl
docset, and support for quoted options.  The help information was
rearranged, and should be viewable once again if you're using B<less>
as your pager.  A serious security hole was plugged--you should
immediately remove all older versions of the Perl debugger as
installed in previous releases, all the way back to perl3, from
your system to avoid being bitten by this.

=head1 Improved Documentation

Many of the platform-specific README files are now part of the perl
installation.  See L<perl> for the complete list.

=over 4

=item perlapi.pod

The official list of public Perl API functions.

=item perlboot.pod

A tutorial for beginners on object-oriented Perl.

=item perlcompile.pod

An introduction to using the Perl Compiler suite.

=item perldbmfilter.pod

A howto document on using the DBM filter facility.

=item perldebug.pod

All material unrelated to running the Perl debugger, plus all
low-level guts-like details that risked crushing the casual user
of the debugger, have been relocated from the old manpage to the
next entry below.

=item perldebguts.pod

This new manpage contains excessively low-level material not related
to the Perl debugger, but slightly related to debugging Perl itself.
It also contains some arcane internal details of how the debugging
process works that may only be of interest to developers of Perl
debuggers.

=item perlfork.pod

Notes on the fork() emulation currently available for the Windows platform.

=item perlfilter.pod

An introduction to writing Perl source filters.

=item perlhack.pod

Some guidelines for hacking the Perl source code.

=item perlintern.pod

A list of internal functions in the Perl source code.
(List is currently empty.)

=item perllexwarn.pod

Introduction and reference information about lexically scoped
warning categories.

=item perlnumber.pod

Detailed information about numbers as they are represented in Perl.

=item perlopentut.pod

A tutorial on using open() effectively.

=item perlreftut.pod

A tutorial that introduces the essentials of references.

=item perltootc.pod

A tutorial on managing class data for object modules.

=item perltodo.pod

Discussion of the most often wanted features that may someday be
supported in Perl.

=item perlunicode.pod

An introduction to Unicode support features in Perl.

=back

=head1 Performance enhancements

=head2 Simple sort() using { $a <=> $b } and the like are optimized

Many common sort() operations using a simple inlined block are now
optimized for faster performance.
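
As a small illustration (the variable names are invented for this
example), blocks of the following simple shape are the kind the
optimization is aimed at:

```perl
use strict;
use warnings;

my @unsorted = (10, 9, 100, 1);

# Inlined numeric comparison blocks of this simple form are the
# ones described above.
my @ascending  = sort { $a <=> $b } @unsorted;
my @descending = sort { $b <=> $a } @unsorted;

print "@ascending\n";
print "@descending\n";
```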

=head2 Optimized assignments to lexical variables

Certain operations in the RHS of assignment statements have been
optimized to directly set the lexical variable on the LHS,
eliminating redundant copying overheads.

=head2 Faster subroutine calls

Minor changes in how subroutine calls are handled internally
provide marginal improvements in performance.

=head2 delete(), each(), values() and hash iteration are faster

The hash values returned by delete(), each(), values() and hashes in a
list context are the actual values in the hash, instead of copies.
This results in significantly better performance, because it eliminates
needless copying in most situations.
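
One visible consequence is that modifying the values returned by
values() modifies the hash itself.  A minimal sketch, using a made-up
C<%price> hash:

```perl
use strict;
use warnings;

my %price = (apple => 1, pear => 2);

# values() yields the hash's own scalars, so changing $_ in the
# loop changes the hash rather than a temporary copy.
$_ *= 10 for values %price;

print join(",", map { "$_=$price{$_}" } sort keys %price), "\n";
```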

=head1 Installation and Configuration Improvements

=head2 -Dusethreads means something different

The -Dusethreads flag now enables the experimental interpreter-based thread
support by default.  To get the flavor of experimental threads that was in
5.005 instead, you need to run Configure with "-Dusethreads -Duse5005threads".

As of v5.6.0, interpreter-threads support is still lacking a way to
create new threads from Perl (i.e., C<use Thread;> will not work with
interpreter threads).  C<use Thread;> continues to be available when you
specify the -Duse5005threads option to Configure, bugs and all.

    NOTE: Support for threads continues to be an experimental feature.
    Interfaces and implementation are subject to sudden and drastic changes.

=head2 New Configure flags

The following new flags may be enabled on the Configure command line
by running Configure with C<-Dflag>.

    usemultiplicity
    usethreads useithreads	(new interpreter threads: no Perl API yet)
    usethreads use5005threads	(threads as they were in 5.005)

    use64bitint			(equal to now deprecated 'use64bits')
    use64bitall

    uselongdouble
    usemorebits
    uselargefiles
    usesocks			(only SOCKS v5 supported)

=head2 Threadedness and 64-bitness now more daring

The Configure options enabling the use of threads and the use of
64-bitness are now more daring in the sense that they no longer rely on
an explicit list of operating systems with known threads/64-bit
capabilities.  In other words: if your operating system has the
necessary APIs and datatypes, you should be able just to go ahead and
use them, for threads by Configure -Dusethreads, and for 64 bits
either explicitly by Configure -Duse64bitint or implicitly if your
system has 64-bit wide datatypes.  See also L</"64-bit support">.

=head2 Long Doubles

Some platforms have "long doubles", floating point numbers of even
larger range than ordinary "doubles".  To enable using long doubles for
Perl's scalars, use -Duselongdouble.

=head2 -Dusemorebits

You can enable both -Duse64bitint and -Duselongdouble with -Dusemorebits.
See also L</"64-bit support">.

=head2 -Duselargefiles

Some platforms support system APIs that are capable of handling large files
(typically, files larger than two gigabytes).  Perl will try to use these
APIs if you ask for -Duselargefiles.

See L</"Large file support"> for more information. 

=head2 installusrbinperl

You can use "Configure -Uinstallusrbinperl", which causes installperl
to skip installing perl as /usr/bin/perl as well.  This is useful if you
prefer not to modify /usr/bin for some reason, but it can be harmful
because many scripts expect to find Perl at /usr/bin/perl.

=head2 SOCKS support

You can use "Configure -Dusesocks" which causes Perl to probe
for the SOCKS proxy protocol library (v5, not v4).  For more information
on SOCKS, see:

    http://www.socks.nec.com/

=head2 C<-A> flag

You can "post-edit" the Configure variables using the Configure C<-A>
switch.  The editing happens immediately after the platform specific
hints files have been processed but before the actual configuration
process starts.  Run C<Configure -h> to find out the full C<-A> syntax.

=head2 Enhanced Installation Directories

The installation structure has been enriched to improve the support
for maintaining multiple versions of perl, to provide locations for
vendor-supplied modules, scripts, and manpages, and to ease maintenance
of locally-added modules, scripts, and manpages.  See the section on
Installation Directories in the INSTALL file for complete details.
For most users building and installing from source, the defaults should
be fine.

If you previously used C<Configure -Dsitelib> or C<-Dsitearch> to set
special values for library directories, you might wish to consider using
the new C<-Dsiteprefix> setting instead.  Also, if you wish to re-use a
config.sh file from an earlier version of perl, you should be sure to
check that Configure makes sensible choices for the new directories.
See INSTALL for complete details.

=head2 gcc automatically tried if 'cc' does not seem to be working

On many platforms the vendor-supplied 'cc' is too stripped-down to
build Perl (basically, the 'cc' doesn't do ANSI C).  If this seems
to be the case and the 'cc' does not seem to be the GNU C compiler
'gcc', an automatic attempt is made to find and use 'gcc' instead.

=head1 Platform specific changes

=head2 Supported platforms

=over 4

=item *

The Mach CThreads (NEXTSTEP, OPENSTEP) are now supported by the Thread
extension.

=item *

GNU/Hurd is now supported.

=item *

Rhapsody/Darwin is now supported.

=item *

EPOC is now supported (on Psion 5).

=item *

The cygwin port (formerly cygwin32) has been greatly improved.

=back

=head2 DOS

=over 4

=item *

Perl now works with djgpp 2.02 (and 2.03 alpha).

=item *

Environment variable names are not converted to uppercase any more.

=item *

Incorrect exit codes from backticks have been fixed.

=item *

This port continues to use its own builtin globbing (not File::Glob).

=back

=head2 OS390 (OpenEdition MVS)

Support for this EBCDIC platform has not been renewed in this release.
There are difficulties in reconciling Perl's standardization on UTF-8
as its internal representation for characters with the EBCDIC character
set, because the two are incompatible.

It is unclear whether future versions will renew support for this
platform, but the possibility exists.

=head2 VMS

Numerous revisions and extensions to configuration, build, testing, and
installation process to accommodate core changes and VMS-specific options.

Expand %ENV-handling code to allow runtime mapping to logical names,
CLI symbols, and CRTL environ array.

Extension of subprocess invocation code to accept filespecs as command
"verbs".

Add to Perl command line processing the ability to use default file types and
to recognize Unix-style C<2E<gt>&1>.

Expansion of File::Spec::VMS routines, and integration into ExtUtils::MM_VMS.

Extension of ExtUtils::MM_VMS to handle complex extensions more flexibly.

Barewords at start of Unix-syntax paths may be treated as text rather than
only as logical names.

Optional secure translation of several logical names used internally by Perl.

Miscellaneous bugfixing and porting of new core code to VMS.

Thanks are gladly extended to the many people who have contributed VMS
patches, testing, and ideas.

=head2 Win32

Perl can now emulate fork() internally, using multiple interpreters running
in different concurrent threads.  This support must be enabled at build
time.  See L<perlfork> for detailed information.

When given a pathname that consists only of a drivename, such as C<A:>,
opendir() and stat() now use the current working directory for the drive
rather than the drive root.

The builtin XSUB functions in the Win32:: namespace are documented.  See
L<Win32>.

$^X now contains the full path name of the running executable.

A Win32::GetLongPathName() function is provided to complement
Win32::GetFullPathName() and Win32::GetShortPathName().  See L<Win32>.

POSIX::uname() is supported.

system(1,...) now returns true process IDs rather than process
handles.  kill() accepts any real process id, rather than strictly
return values from system(1,...).

For better compatibility with Unix, C<kill(0, $pid)> can now be used to
test whether a process exists.
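
A minimal sketch of such a check, probing the current process (C<$$>)
so that the example is self-contained:

```perl
use strict;
use warnings;

# Signal 0 delivers nothing; kill() only reports whether the
# process could be signalled.  $$ is the current process.
my $alive = kill(0, $$);
print $alive ? "process $$ exists\n" : "process $$ not found\n";
```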

The C<Shell> module is supported.

Better support for building Perl under command.com in Windows 95
has been added.

Scripts are read in binary mode by default to allow ByteLoader (and
the filter mechanism in general) to work properly.  For compatibility,
the DATA filehandle will be set to text mode if a carriage return is
detected at the end of the line containing the __END__ or __DATA__
token; if not, the DATA filehandle will be left open in binary mode.
Earlier versions always opened the DATA filehandle in text mode.

The glob() operator is implemented via the C<File::Glob> extension,
which supports glob syntax of the C shell.  This increases the flexibility
of the glob() operator, but there may be compatibility issues for
programs that relied on the older globbing syntax.  If you want to
preserve compatibility with the older syntax, you might want to run
perl with C<-MFile::DosGlob>.  For details and compatibility information,
see L<File::Glob>.

=head1 Significant bug fixes

=head2 <HANDLE> on empty files

With C<$/> set to C<undef>, "slurping" an empty file returns a string of
zero length (instead of C<undef>, as it used to) the first time the
HANDLE is read after C<$/> is set to C<undef>.  Further reads yield
C<undef>.

This means that the following will append "foo" to an empty file (it used
to do nothing):

    perl -0777 -pi -e 's/^/foo/' empty_file

The behaviour of:

    perl -pi -e 's/^/foo/' empty_file

is unchanged (it continues to leave the file empty).
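
The same behavior can be observed from within a program.  The sketch
below assumes the core C<File::Temp> module is available and uses it
only to create an empty scratch file:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create an empty scratch file (File::Temp picks the name).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh or die "close: $!";

open my $in, '<', $path or die "open $path: $!";
my ($first, $second);
{
    local $/ = undef;    # slurp mode
    $first  = <$in>;     # first read of the empty file
    $second = <$in>;     # any further read
}
close $in;

print defined $first  ? "first: defined, length " . length($first) . "\n"
                      : "first: undef\n";
print defined $second ? "second: defined\n" : "second: undef\n";
```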

=head2 C<eval '...'> improvements

Line numbers (as reflected by caller() and most diagnostics) within
C<eval '...'> were often incorrect where here documents were involved.
This has been corrected.

Lexical lookups for variables appearing in C<eval '...'> within
functions that were themselves called within an C<eval '...'> were
searching the wrong place for lexicals.  The lexical search now
correctly ends at the subroutine's block boundary.

The use of C<return> within C<eval {...}> caused $@ not to be reset
correctly when no exception occurred within the eval.  This has
been fixed.
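
A short sketch of the corrected behavior; C<probe()> is a hypothetical
helper written only for this example:

```perl
use strict;
use warnings;

sub probe {
    eval { return 1 };   # return leaves the eval block, not probe()
    return $@;           # no exception occurred, so $@ is empty
}

eval { die "stale error\n" };   # leave something in $@ beforehand
my $after = probe();
print $after eq '' ? "\$\@ was reset\n" : "\$\@ still holds: $after";
```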

Parsing of here documents used to be flawed when they appeared as
the replacement expression in C<eval 's/.../.../e'>.  This has
been fixed.

=head2 All compilation errors are true errors

Some "errors" encountered at compile time were by necessity 
generated as warnings followed by eventual termination of the
program.  This enabled more such errors to be reported in a
single run, rather than causing a hard stop at the first error
that was encountered.

The mechanism for reporting such errors has been reimplemented
to queue compile-time errors and report them at the end of the
compilation as true errors rather than as warnings.  This fixes
cases where error messages leaked through in the form of warnings
when code was compiled at run time using C<eval STRING>, and
also allows such errors to be reliably trapped using C<eval "...">.
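
For example, a deliberate syntax error in a string passed to C<eval>
can be trapped through C<$@>.  This is only a sketch, and the exact
message text may vary between versions:

```perl
use strict;
use warnings;

# The string below contains a deliberate syntax error; eval STRING
# compiles it at run time and reports the failure through $@.
my $result = eval q{ my $n = 1 +; $n };
my $error  = $@;

print defined $result ? "compiled\n" : "trapped: $error";
```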

=head2 Implicitly closed filehandles are safer

Sometimes implicitly closed filehandles (as when they are localized,
and Perl automatically closes them on exiting the scope) could
inadvertently set $? or $!.  This has been corrected.


=head2 Behavior of list slices is more consistent

When taking a slice of a literal list (as opposed to a slice of
an array or hash), Perl used to return an empty list if the
result happened to be composed of all undef values.

The new behavior is to produce an empty list if (and only if)
the original list was empty.  Consider the following example:

    @a = (1,undef,undef,2)[2,1,2];

The old behavior would have resulted in @a having no elements.
The new behavior ensures it has three undefined elements.

Note in particular that the behavior of slices of the following
cases remains unchanged:

    @a = ()[1,2];
    @a = (getpwent)[7,0];
    @a = (anything_returning_empty_list())[2,1,2];
    @a = @b[2,1,2];
    @a = @c{'a','b','c'};

See L<perldata>.
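
The difference between a non-empty and an empty source list can be
checked directly.  The array names below are illustrative only:

```perl
use strict;
use warnings;

my @from_literal = (1, undef, undef, 2)[2, 1, 2];  # non-empty source list
my @from_empty   = ()[1, 2];                       # empty source list

print scalar(@from_literal), "\n";
print scalar(@from_empty), "\n";
```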

=head2 C<(\$)> prototype and C<$foo{a}>

A scalar reference prototype now correctly allows a hash or
array element in that slot.
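
A minimal sketch, using a hypothetical C<increment()> subroutine:

```perl
use strict;
use warnings;

# The (\$) prototype makes Perl pass a reference to the scalar argument.
sub increment(\$) {
    my ($ref) = @_;
    $$ref++;
}

my %foo   = (a => 1);
my @queue = (10, 20);

increment $foo{a};     # hash element in the \$ slot
increment $queue[1];   # array element in the \$ slot

print "$foo{a} $queue[1]\n";
```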

=head2 C<goto &sub> and AUTOLOAD

The C<goto &sub> construct works correctly when C<&sub> happens
to be autoloaded.
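
The following sketch shows the situation described.  The C<Demo>
package, its C<AUTOLOAD>, and the C<target> name are assumptions made
up for the example:

```perl
use strict;
use warnings;

package Demo;

our $AUTOLOAD;

# Hypothetical AUTOLOAD that defines the missing subroutine on demand.
sub AUTOLOAD {
    my $name = $AUTOLOAD;
    return if $name =~ /::DESTROY\z/;
    no strict 'refs';
    *{$name} = sub { "autoloaded(@_)" };
    goto &{$name};
}

# target() is never defined explicitly; AUTOLOAD supplies it.
sub wrapper { goto &target }

package main;

my $reply = Demo::wrapper(1, 2);
print "$reply\n";
```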

=head2 C<-bareword> allowed under C<use integer>

The autoquoting of barewords preceded by C<-> did not work
in prior versions when the C<integer> pragma was enabled.
This has been fixed.

=head2 Failures in DESTROY()

When code in a destructor threw an exception, it went unnoticed
in earlier versions of Perl, unless someone happened to be
looking in $@ just after the point the destructor happened to
run.  Such failures are now visible as warnings when warnings are
enabled.
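
A sketch of how such a failure surfaces.  The C<Fragile> class is
hypothetical, and the C<__WARN__> handler is there only to capture the
warning for display:

```perl
use strict;
use warnings;

package Fragile;
sub new     { return bless {}, shift }
sub DESTROY { die "cleanup failed\n" }

package main;

my @seen;
{
    # Collect warnings instead of printing them, for demonstration only.
    local $SIG{__WARN__} = sub { push @seen, @_ };
    my $obj = Fragile->new;
    undef $obj;    # last reference gone: DESTROY runs and dies
}

print @seen ? "warned: $seen[0]" : "no warning\n";
```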

=head2 Locale bugs fixed

printf() and sprintf() previously reset the numeric locale
back to the default "C" locale.  This has been fixed.

Numbers formatted according to the local numeric locale
(such as using a decimal comma instead of a decimal dot) caused
"isn't numeric" warnings, even while the operations accessing
those numbers produced correct results.  These warnings have been
discontinued.

=head2 Memory leaks

The C<eval 'return sub {...}'> construct could sometimes leak
memory.  This has been fixed.

Operations that aren't filehandle constructors used to leak memory
when used on invalid filehandles.  This has been fixed.

Constructs that modified C<@_> could fail to deallocate values
in C<@_> and thus leak memory.  This has been corrected.

=head2 Spurious subroutine stubs after failed subroutine calls

Perl could sometimes create empty subroutine stubs when a
subroutine was not found in the package.  Such cases stopped
later method lookups from progressing into base packages.
This has been corrected.

=head2 Taint failures under C<-U>

When running in unsafe mode, taint violations could sometimes
cause silent failures.  This has been fixed.

=head2 END blocks and the C<-c> switch

Prior versions used to run BEGIN B<and> END blocks when Perl was
run in compile-only mode.  Since this is typically not the expected
behavior, END blocks are not executed anymore when the C<-c> switch
is used, or if compilation fails.

See L</"Support for CHECK blocks"> for how to run things when the compile 
phase ends.

=head2 Potential to leak DATA filehandles

Using the C<__DATA__> token creates an implicit filehandle to
the file that contains the token.  It is the program's
responsibility to close it when it is done reading from it.

This caveat is now better explained in the documentation.
See L<perldata>.

=head1 New or Changed Diagnostics

=over 4

=item "%s" variable %s masks earlier declaration in same %s

(W misc) A "my" or "our" variable has been redeclared in the current scope or statement,
effectively eliminating all access to the previous instance.  This is almost
always a typographical error.  Note that the earlier variable will still exist
until the end of the scope or until all closure referents to it are
destroyed.
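
A sketch that provokes the warning and captures it.  The snippet is
compiled with C<eval STRING> so that the compile-time warning reaches
the handler, and C<$count> is an invented name:

```perl
use strict;
use warnings;

my @seen;
{
    local $SIG{__WARN__} = sub { push @seen, @_ };
    # Compile a snippet that declares $count twice in one scope.
    eval q{ use warnings; my $count = 1; my $count = 2; 1 } or die $@;
}
print @seen ? $seen[0] : "no warning\n";
```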

=item "my sub" not yet implemented

(F) Lexically scoped subroutines are not yet implemented.  Don't try that
yet.

=item "our" variable %s redeclared

(W misc) You seem to have already declared the same global once before in the
current lexical scope.

=item '!' allowed only after types %s

(F) The '!' is allowed in pack() and unpack() only after certain types.
See L<perlfunc/pack>.

=item / cannot take a count

(F) You had an unpack template indicating a counted-length string,
but you have also specified an explicit size for the string.
See L<perlfunc/pack>.

=item / must be followed by a, A or Z

(F) You had an unpack template indicating a counted-length string,
which must be followed by one of the letters a, A or Z
to indicate what sort of string is to be unpacked.
See L<perlfunc/pack>.

=item / must be followed by a*, A* or Z*

(F) You had a pack template indicating a counted-length string.
Currently the only things that can have their length counted are a*, A* or Z*.
See L<perlfunc/pack>.
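
A sketch of a well-formed counted-length template, with C<n> chosen
here as the length type purely for illustration:

```perl
use strict;
use warnings;

# "n/a*" packs a 16-bit big-endian length followed by the string itself.
my $packed = pack("n/a*", "hello");

# On unpack, the count read by "n" says how many bytes "a" takes.
my ($string) = unpack("n/a", $packed);

printf "%d bytes, payload '%s'\n", length($packed), $string;
```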

=item / must follow a numeric type

(F) You had an unpack template that contained a '/',
but this did not follow some numeric unpack specification.
See L<perlfunc/pack>.

=item /%s/: Unrecognized escape \\%c passed through

(W regexp) You used a backslash-character combination which is not recognized
by Perl.  This combination appears in an interpolated variable or a
C<'>-delimited regular expression.  The character was understood literally.

=item /%s/: Unrecognized escape \\%c in character class passed through

(W regexp) You used a backslash-character combination which is not recognized
by Perl inside character classes.  The character was understood literally.

=item /%s/ should probably be written as "%s"

(W syntax) You have used a pattern where Perl expected to find a string,
as in the first argument to C<join>.  Perl will treat the true
or false result of matching the pattern against $_ as the string,
which is probably not what you had in mind.

=item %s() called too early to check prototype

(W prototype) You've called a function that has a prototype before the parser saw a
definition or declaration for it, and Perl could not check that the call
conforms to the prototype.  You need to either add an early prototype
declaration for the subroutine in question, or move the subroutine
definition ahead of the call to get proper prototype checking.  Alternatively,
if you are certain that you're calling the function correctly, you may put
an ampersand before the name to avoid the warning.  See L<perlsub>.

=item %s argument is not a HASH or ARRAY element

(F) The argument to exists() must be a hash or array element, such as:

    $foo{$bar}
    $ref->{"susie"}[12]

=item %s argument is not a HASH or ARRAY element or slice

(F) The argument to delete() must be either a hash or array element, such as:

    $foo{$bar}
    $ref->{"susie"}[12]

or a hash or array slice, such as:

    @foo[$bar, $baz, $xyzzy]
    @{$ref->[12]}{"susie", "queue"}

=item %s argument is not a subroutine name

(F) The argument to exists() for C<exists &sub> must be a subroutine
name, and not a subroutine call.  C<exists &sub()> will generate this error.

=item %s package attribute may clash with future reserved word: %s

(W reserved) A lowercase attribute name was used that had a package-specific handler.
That name might have a meaning to Perl itself some day, even though it
doesn't yet.  Perhaps you should use a mixed-case attribute name, instead.
See L<attributes>.

=item (in cleanup) %s

(W misc) This prefix usually indicates that a DESTROY() method raised
the indicated exception.  Since destructors are usually called by
the system at arbitrary points during execution, and often a vast
number of times, the warning is issued only once for any number
of failures that would otherwise result in the same message being
repeated.

Failure of user callbacks dispatched using the C<G_KEEPERR> flag
could also result in this warning.  See L<perlcall/G_KEEPERR>.

=item <> should be quotes

(F) You wrote C<< require <file> >> when you should have written
C<require 'file'>.

=item Attempt to join self

(F) You tried to join a thread from within itself, which is an
impossible task.  You may be joining the wrong thread, or you may
need to move the join() to some other thread.

=item Bad evalled substitution pattern

(F) You've used the /e switch to evaluate the replacement for a
substitution, but perl found a syntax error in the code to evaluate,
most likely an unexpected right brace '}'.

=item Bad realloc() ignored

(S) An internal routine called realloc() on something that had never been
malloc()ed in the first place. Mandatory, but can be disabled by
setting environment variable C<PERL_BADFREE> to 1.

=item Bareword found in conditional

(W bareword) The compiler found a bareword where it expected a conditional,
which often indicates that an || or && was parsed as part of the
last argument of the previous construct, for example:

    open FOO || die;

It may also indicate a misspelled constant that has been interpreted
as a bareword:

    use constant TYPO => 1;
    if (TYOP) { print "foo" }

The C<strict> pragma is useful in avoiding such errors.

=item Binary number > 0b11111111111111111111111111111111 non-portable

(W portable) The binary number you specified is larger than 2**32-1
(4294967295) and therefore non-portable between systems.  See
L<perlport> for more on portability concerns.

=item Bit vector size > 32 non-portable

(W portable) Using bit vector sizes larger than 32 is non-portable.

=item Buffer overflow in prime_env_iter: %s

(W internal) A warning peculiar to VMS.  While Perl was preparing to iterate over
%ENV, it encountered a logical name or symbol definition which was too long,
so it was truncated to the string shown.

=item Can't check filesystem of script "%s"

(P) For some reason you can't check the filesystem of the script for nosuid.

=item Can't declare class for non-scalar %s in "%s"

(S) Currently, only scalar variables can be declared with a specific class
qualifier in a "my" or "our" declaration.  The semantics may be extended
for other types of variables in future.

=item Can't declare %s in "%s"

(F) Only scalar, array, and hash variables may be declared as "my" or
"our" variables.  They must have ordinary identifiers as names.

=item Can't ignore signal CHLD, forcing to default

(W signal) Perl has detected that it is being run with the SIGCHLD signal
(sometimes known as SIGCLD) disabled.  Since disabling this signal
will interfere with proper determination of exit status of child
processes, Perl has reset the signal to its default value.
This situation typically indicates that the parent program under
which Perl may be running (e.g., cron) is being very careless.

=item Can't modify non-lvalue subroutine call

(F) Subroutines meant to be used in lvalue context should be declared as
such, see L<perlsub/"Lvalue subroutines">.

=item Can't read CRTL environ

(S) A warning peculiar to VMS.  Perl tried to read an element of %ENV
from the CRTL's internal environment array and discovered the array was
missing.  You need to figure out where your CRTL misplaced its environ
or define F<PERL_ENV_TABLES> (see L<perlvms>) so that environ is not searched.

=item Can't remove %s: %s, skipping file 

(S) You requested an inplace edit without creating a backup file.  Perl
was unable to remove the original file to replace it with the modified
file.  The file was left unmodified.

=item Can't return %s from lvalue subroutine

(F) Perl detected an attempt to return illegal lvalues (such
as temporary or readonly values) from a subroutine used as an lvalue.
This is not allowed.

=item Can't weaken a nonreference

(F) You attempted to weaken something that was not a reference.  Only
references can be weakened.

=item Character class [:%s:] unknown

(F) The class in the character class [: :] syntax is unknown.
See L<perlre>.

=item Character class syntax [%s] belongs inside character classes

(W unsafe) The character class constructs [: :], [= =], and [. .] go
I<inside> character classes; the [] are part of the construct,
for example: /[012[:alpha:]345]/.  Note that [= =] and [. .]
are not currently implemented; they are simply placeholders for
future extensions.

=item Constant is not %s reference

(F) A constant value (perhaps declared using the C<use constant> pragma)
is being dereferenced, but it amounts to the wrong type of reference.  The
message indicates the type of reference that was expected. This usually
indicates a syntax error in dereferencing the constant value.
See L<perlsub/"Constant Functions"> and L<constant>.

=item constant(%s): %s

(F) The parser found inconsistencies either while attempting to define an
overloaded constant, or when trying to find the character name specified
in the C<\N{...}> escape.  Perhaps you forgot to load the corresponding
C<overload> or C<charnames> pragma?  See L<charnames> and L<overload>.

=item CORE::%s is not a keyword

(F) The CORE:: namespace is reserved for Perl keywords.

=item defined(@array) is deprecated

(D) defined() is not usually useful on arrays because it checks for an
undefined I<scalar> value.  If you want to see if the array is empty,
just use C<if (@array) { # not empty }> for example.  

=item defined(%hash) is deprecated

(D) defined() is not usually useful on hashes because it checks for an
undefined I<scalar> value.  If you want to see if the hash is empty,
just use C<if (%hash) { # not empty }> for example.  

=item Did not produce a valid header

See Server error.

=item (Did you mean "local" instead of "our"?)

(W misc) Remember that "our" does not localize the declared global variable.
You have declared it again in the same lexical scope, which seems superfluous.

=item Document contains no data

See Server error.

=item entering effective %s failed

(F) While under the C<use filetest> pragma, switching the real and
effective uids or gids failed.

=item false [] range "%s" in regexp

(W regexp) A character class range must start and end at a literal character, not
another character class like C<\d> or C<[:alpha:]>.  The "-" in your false
range is interpreted as a literal "-".  Consider quoting the "-",  "\-".
See L<perlre>.

=item Filehandle %s opened only for output

(W io) You tried to read from a filehandle opened only for writing.  If you
intended it to be a read/write filehandle, you needed to open it with
"+<" or "+>" or "+>>" instead of with ">".  If
you intended only to read from the file, use "<".  See
L<perlfunc/open>.

=item flock() on closed filehandle %s

(W closed) The filehandle you're attempting to flock() got itself closed some
time before now.  Check your logic flow.  flock() operates on filehandles.
Are you attempting to call flock() on a dirhandle by the same name?

=item Global symbol "%s" requires explicit package name

(F) You've said "use strict vars", which indicates that all variables
must either be lexically scoped (using "my"), declared beforehand using
"our", or explicitly qualified to say which package the global variable
is in (using "::").

=item Hexadecimal number > 0xffffffff non-portable

(W portable) The hexadecimal number you specified is larger than 2**32-1
(4294967295) and therefore non-portable between systems.  See
L<perlport> for more on portability concerns.

=item Ill-formed CRTL environ value "%s"

(W internal) A warning peculiar to VMS.  Perl tried to read the CRTL's internal
environ array, and encountered an element without the C<=> delimiter
used to separate keys from values.  The element is ignored.

=item Ill-formed message in prime_env_iter: |%s|

(W internal) A warning peculiar to VMS.  Perl tried to read a logical name
or CLI symbol definition when preparing to iterate over %ENV, and
didn't see the expected delimiter between key and value, so the
line was ignored.

=item Illegal binary digit %s

(F) You used a digit other than 0 or 1 in a binary number.

=item Illegal binary digit %s ignored

(W digit) You may have tried to use a digit other than 0 or 1 in a binary number.
Interpretation of the binary number stopped before the offending digit.

=item Illegal number of bits in vec

(F) The number of bits in vec() (the third argument) must be a power of
two from 1 to 32 (or 64, if your platform supports that).
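
A brief sketch contrasting an accepted width with a rejected one.  The
width 3 is picked only because it is not a power of two:

```perl
use strict;
use warnings;

my $bits = '';
vec($bits, 0, 8) = 65;    # 8 is a power of two: accepted
print "$bits\n";

# A width of 3 is not a power of two, so vec() dies at run time.
my $ok = eval { my $v = vec($bits, 0, 3); 1 };
print $ok ? "accepted\n" : "rejected: $@";
```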

=item Integer overflow in %s number

(W overflow) The hexadecimal, octal or binary number you have specified either
as a literal or as an argument to hex() or oct() is too big for your
architecture, and has been converted to a floating point number.  On a
32-bit architecture the largest hexadecimal, octal or binary number
representable without overflow is 0xFFFFFFFF, 037777777777, or
0b11111111111111111111111111111111 respectively.  Note that Perl
transparently promotes all numbers to a floating point representation
internally--subject to loss of precision errors in subsequent
operations.

=item Invalid %s attribute: %s

The indicated attribute for a subroutine or variable was not recognized
by Perl or by a user-supplied handler.  See L<attributes>.

=item Invalid %s attributes: %s

The indicated attributes for a subroutine or variable were not recognized
by Perl or by a user-supplied handler.  See L<attributes>.

=item invalid [] range "%s" in regexp

The offending range is now explicitly displayed.

=item Invalid separator character %s in attribute list

(F) Something other than a colon or whitespace was seen between the
elements of an attribute list.  If the previous attribute
had a parenthesised parameter list, perhaps that list was terminated
too soon.  See L<attributes>.

=item Invalid separator character %s in subroutine attribute list

(F) Something other than a colon or whitespace was seen between the
elements of a subroutine attribute list.  If the previous attribute
had a parenthesised parameter list, perhaps that list was terminated
too soon.

=item leaving effective %s failed

(F) While under the C<use filetest> pragma, switching the real and
effective uids or gids failed.

=item Lvalue subs returning %s not implemented yet

(F) Due to limitations in the current implementation, array and hash
values cannot be returned in subroutines used in lvalue context.
See L<perlsub/"Lvalue subroutines">.

=item Method %s not permitted

See Server error.

=item Missing %sbrace%s on \N{}

(F) Wrong syntax of character name literal C<\N{charname}> within
double-quotish context.

=item Missing command in piped open

(W pipe) You used the C<open(FH, "| command")> or C<open(FH, "command |")>
construction, but the command was missing or blank.

=item Missing name in "my sub"

(F) The reserved syntax for lexically scoped subroutines requires that they
have a name with which they can be found.

=item No %s specified for -%c

(F) The indicated command line switch needs a mandatory argument, but
you haven't specified one.

=item No package name allowed for variable %s in "our"

(F) Fully qualified variable names are not allowed in "our" declarations,
because that doesn't make much sense under existing semantics.  Such
syntax is reserved for future extensions.

=item No space allowed after -%c

(F) The argument to the indicated command line switch must follow immediately
after the switch, without intervening spaces.

=item no UTC offset information; assuming local time is UTC

(S) A warning peculiar to VMS.  Perl was unable to find the local
timezone offset, so it's assuming that local system time is equivalent
to UTC.  If it's not, define the logical name F<SYS$TIMEZONE_DIFFERENTIAL>
to translate to the number of seconds which need to be added to UTC to
get local time.

=item Octal number > 037777777777 non-portable

(W portable) The octal number you specified is larger than 2**32-1 (4294967295)
and therefore non-portable between systems.  See L<perlport> for more
on portability concerns and on writing portable code.

=item panic: del_backref

(P) Failed an internal consistency check while trying to reset a weak
reference.

=item panic: kid popen errno read

(F) A forked child returned an incomprehensible message about its errno.

=item panic: magic_killbackrefs

(P) Failed an internal consistency check while trying to reset all weak
references to an object.

=item Parentheses missing around "%s" list

(W parenthesis) You said something like

    my $foo, $bar = @_;

when you meant

    my ($foo, $bar) = @_;

Remember that "my", "our", and "local" bind tighter than comma.

=item Possible unintended interpolation of %s in string

(W ambiguous) It used to be that Perl would try to guess whether you
wanted an array interpolated or a literal @.  It no longer does this;
arrays are now I<always> interpolated into strings.  This means that 
if you try something like:

        print "fred@example.com";

and the array C<@example> doesn't exist, Perl is going to print
C<fred.com>, which is probably not what you wanted.  To get a literal
C<@> sign in a string, put a backslash before it, just as you would
to get a literal C<$> sign.
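
As a minimal sketch (the variable names are illustrative only), either of
the following forms keeps the C<@> literal:

    use strict;
    use warnings;

    # A backslash keeps the @ literal inside double quotes.
    my $escaped = "fred\@example.com";

    # Single quotes never interpolate, so no escape is needed.
    my $single = 'fred@example.com';

    print "identical\n" if $escaped eq $single;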

=item Possible Y2K bug: %s

(W y2k) You are concatenating the number 19 with another number, which
could be a potential Year 2000 problem.

=item pragma "attrs" is deprecated, use "sub NAME : ATTRS" instead

(W deprecated) You have written something like this:

    sub doit
    {
        use attrs qw(locked);
    }

You should use the new declaration syntax instead.

    sub doit : locked
    {
        ...
    }

The C<use attrs> pragma is now obsolete, and is only provided for
backward-compatibility. See L<perlsub/"Subroutine Attributes">.

=item Premature end of script headers

See Server error.

=item Repeat count in pack overflows

(F) You can't specify a repeat count so large that it overflows
your signed integers.  See L<perlfunc/pack>.

=item Repeat count in unpack overflows

(F) You can't specify a repeat count so large that it overflows
your signed integers.  See L<perlfunc/unpack>.

=item realloc() of freed memory ignored

(S) An internal routine called realloc() on something that had already
been freed.

=item Reference is already weak

(W misc) You have attempted to weaken a reference that is already weak.
Doing so has no effect.

=item setpgrp can't take arguments

(F) Your system has the setpgrp() from BSD 4.2, which takes no arguments,
unlike POSIX setpgid(), which takes a process ID and process group ID.

=item Strange *+?{} on zero-length expression

(W regexp) You applied a regular expression quantifier in a place where it
makes no sense, such as on a zero-width assertion.
Try putting the quantifier inside the assertion instead.  For example,
the way to match "abc" provided that it is followed by three
repetitions of "xyz" is C</abc(?=(?:xyz){3})/>, not C</abc(?=xyz){3}/>.
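
The corrected pattern can be checked with a short sketch such as the
following; the test strings are made up for illustration:

    use strict;
    use warnings;

    # Quantifier inside the lookahead: "abc" followed by three "xyz".
    my $re = qr/abc(?=(?:xyz){3})/;

    my $three = "abcxyzxyzxyz" =~ $re ? 1 : 0;
    my $two   = "abcxyzxyz"    =~ $re ? 1 : 0;

    print "three repetitions: $three, two repetitions: $two\n";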

=item switching effective %s is not implemented

(F) While under the C<use filetest> pragma, we cannot switch the
real and effective uids or gids.

=item This Perl can't reset CRTL environ elements (%s)

=item This Perl can't set CRTL environ elements (%s=%s)

(W internal) Warnings peculiar to VMS.  You tried to change or delete an element
of the CRTL's internal environ array, but your copy of Perl wasn't
built with a CRTL that contained the setenv() function.  You'll need to
rebuild Perl with a CRTL that does, or redefine F<PERL_ENV_TABLES> (see
L<perlvms>) so that the environ array isn't the target of the change to
%ENV which produced the warning.

=item Too late to run %s block

(W void) A CHECK or INIT block is being defined during run time proper,
when the opportunity to run them has already passed.  Perhaps you are
loading a file with C<require> or C<do> when you should be using
C<use> instead.  Or perhaps you should put the C<require> or C<do>
inside a BEGIN block.

=item Unknown open() mode '%s'

(F) The second argument of 3-argument open() is not among the list
of valid modes: C<< < >>, C<< > >>, C<<< >> >>>, C<< +< >>,
C<< +> >>, C<<< +>> >>>, C<-|>, C<|->.
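
For illustration, a sketch of 3-argument open() using two of the valid
modes.  The scratch file name is hypothetical and is removed at the end:

    use strict;
    use warnings;

    my $path = "open_mode_demo_$$.txt";   # hypothetical scratch file

    # '>' creates or truncates the file for writing.
    open(my $out, '>', $path) or die "Cannot write $path: $!";
    print {$out} "hello\n";
    close($out) or die "Cannot close $path: $!";

    # '<' opens it for reading.
    open(my $in, '<', $path) or die "Cannot read $path: $!";
    my $line = <$in>;
    close($in);

    unlink $path;
    print $line;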

=item Unknown process %x sent message to prime_env_iter: %s

(P) An error peculiar to VMS.  Perl was reading values for %ENV before
iterating over it, and someone else stuck a message in the stream of
data Perl expected.  Someone's very confused, or perhaps trying to
subvert Perl's population of %ENV for nefarious purposes.

=item Unrecognized escape \\%c passed through

(W misc) You used a backslash-character combination which is not recognized
by Perl.  The character was understood literally.

=item Unterminated attribute parameter in attribute list

(F) The lexer saw an opening (left) parenthesis character while parsing an
attribute list, but the matching closing (right) parenthesis
character was not found.  You may need to add (or remove) a backslash
character to get your parentheses to balance.  See L<attributes>.

=item Unterminated attribute list

(F) The lexer found something other than a simple identifier at the start
of an attribute, and it wasn't a semicolon or the start of a
block.  Perhaps you terminated the parameter list of the previous attribute
too soon.  See L<attributes>.

=item Unterminated attribute parameter in subroutine attribute list

(F) The lexer saw an opening (left) parenthesis character while parsing a
subroutine attribute list, but the matching closing (right) parenthesis
character was not found.  You may need to add (or remove) a backslash
character to get your parentheses to balance.

=item Unterminated subroutine attribute list

(F) The lexer found something other than a simple identifier at the start
of a subroutine attribute, and it wasn't a semicolon or the start of a
block.  Perhaps you terminated the parameter list of the previous attribute
too soon.

=item Value of CLI symbol "%s" too long

(W misc) A warning peculiar to VMS.  Perl tried to read the value of an %ENV
element from a CLI symbol table, and found a resultant string longer
than 1024 characters.  The return value has been truncated to 1024
characters.

=item Version number must be a constant number

(P) The attempt to translate a C<use Module n.n LIST> statement into
its equivalent C<BEGIN> block found an internal inconsistency with
the version number.

=back

=head1 New tests

=over 4

=item	lib/attrs

Compatibility tests for C<sub : attrs> vs the older C<use attrs>.

=item	lib/env

Tests for new environment scalar capability (e.g., C<use Env qw($BAR);>).

=item	lib/env-array

Tests for new environment array capability (e.g., C<use Env qw(@PATH);>).

=item	lib/io_const

IO constants (SEEK_*, _IO*).

=item	lib/io_dir

Directory-related IO methods (new, read, close, rewind, tied delete).

=item	lib/io_multihomed

INET sockets with multi-homed hosts.

=item	lib/io_poll

IO poll().

=item	lib/io_unix

UNIX sockets.

=item	op/attrs

Regression tests for C<my ($x,@y,%z) : attrs> and C<sub : attrs>.

=item	op/filetest

File test operators.

=item	op/lex_assign

Verify operations that access pad objects (lexicals and temporaries).

=item	op/exists_sub

Verify C<exists &sub> operations.

=back

=head1 Incompatible Changes

=head2 Perl Source Incompatibilities

Beware that any new warnings that have been added or old ones
that have been enhanced are B<not> considered incompatible changes.

Since all new warnings must be explicitly requested via the C<-w>
switch or the C<warnings> pragma, it is ultimately the programmer's
responsibility to ensure that warnings are enabled judiciously.

=over 4

=item CHECK is a new keyword

All subroutine definitions named CHECK are now special.  See
L</"Support for CHECK blocks"> for more information.

=item Treatment of list slices of undef has changed

There is a potential incompatibility in the behavior of list slices
that are comprised entirely of undefined values.
See L</"Behavior of list slices is more consistent">.

=item Format of $English::PERL_VERSION is different

The English module now sets $PERL_VERSION to $^V (a string value) rather
than C<$]> (a numeric value).  This is a potential incompatibility.
Send us a report via perlbug if you are affected by this.

See L</"Improved Perl version numbering system"> for the reasons for
this change.

=item Literals of the form C<1.2.3> parse differently

Previously, numeric literals with more than one dot in them were
interpreted as a floating point number concatenated with one or more
numbers.  Such "numbers" are now parsed as strings composed of the
specified ordinals.

For example, C<print 97.98.99> used to output C<97.9899> in earlier
versions, but now prints C<abc>.

See L</"Support for strings represented as a vector of ordinals">.
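
A small sketch of the new parsing.  The C<%vd> format of sprintf() is
used here only to show the ordinals again:

    use strict;
    use warnings;

    my $s = 97.98.99;        # chr(97) . chr(98) . chr(99)
    print "$s\n";            # abc
    printf "%vd\n", $s;      # 97.98.99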

=item Possibly changed pseudo-random number generator

Perl programs that depend on reproducing a specific set of pseudo-random
numbers may now produce different output due to improvements made to the
rand() builtin.  You can use C<sh Configure -Drandfunc=rand> to obtain
the old behavior.

See L</"Better pseudo-random number generator">.

=item Hashing function for hash keys has changed

Even though Perl hashes are not order preserving, the apparently
random order encountered when iterating on the contents of a hash
is actually determined by the hashing algorithm used.  Improvements
in the algorithm may yield a random order that is B<different> from
that of previous versions, especially when iterating on hashes.

See L</"Better worst-case behavior of hashes"> for additional
information.

=item C<undef> fails on read only values

Using the C<undef> operator on a readonly value (such as $1) has
the same effect as assigning C<undef> to the readonly value--it
throws an exception.

=item Close-on-exec bit may be set on pipe and socket handles

Pipe and socket handles are also now subject to the close-on-exec
behavior determined by the special variable $^F.

See L</"More consistent close-on-exec behavior">.

=item Writing C<"$$1"> to mean C<"${$}1"> is unsupported

Perl 5.004 deprecated the interpretation of C<$$1> and
similar within interpolated strings to mean C<$$ . "1">,
but still allowed it.

In Perl 5.6.0 and later, C<"$$1"> always means C<"${$1}">.

=item delete(), each(), values() and C<\(%h)> operate on aliases to values, not copies

delete(), each(), values() and hashes (e.g. C<\(%h)>)
in a list context return the actual
values in the hash, instead of copies (as they used to in earlier
versions).  Typical idioms for using these constructs copy the
returned values, but this can make a significant difference when
creating references to the returned values.  Keys in the hash are still
returned as copies when iterating on a hash.

See also L</"delete(), each(), values() and hash iteration are faster">.
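
The aliasing can be observed with a sketch like this one (the hash
contents are made up):

    use strict;
    use warnings;

    my %h = (a => 1, b => 2);

    # Each element returned by values() is an alias to the stored value,
    # so assigning through $_ updates the hash itself.
    $_ *= 10 for values %h;

    print join(',', map { "$_=$h{$_}" } sort keys %h), "\n";   # a=10,b=20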

=item vec(EXPR,OFFSET,BITS) enforces powers-of-two BITS

vec() generates a run-time error if the BITS argument is not
a valid power-of-two integer.
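
A minimal sketch of the check, assuming the error is trapped with
C<eval>:

    use strict;
    use warnings;

    my $bits = '';
    vec($bits, 0, 4) = 0xA;           # 4 is a power of two: accepted
    my $nibble = vec($bits, 0, 4);    # 10

    # 3 is not a power of two, so the call dies at run time.
    my $ok = eval { my $v = vec($bits, 0, 3); 1 };
    print "nibble=$nibble, BITS=3 rejected\n" unless $ok;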

=item Text of some diagnostic output has changed

Most references to internal Perl operations in diagnostics
have been changed to be more descriptive.  This may be an
issue for programs that may incorrectly rely on the exact
text of diagnostics for proper functioning.

=item C<%@> has been removed

The undocumented special variable C<%@> that used to accumulate
"background" errors (such as those that happen in DESTROY())
has been removed, because it could potentially result in memory
leaks.

=item Parenthesized not() behaves like a list operator

The C<not> operator now falls under the "if it looks like a function,
it behaves like a function" rule.

As a result, the parenthesized form can be used with C<grep> and C<map>.
The following construct used to be a syntax error before, but it works
as expected now:

    grep not($_), @things;

On the other hand, using C<not> with a literal list slice may not
work.  The following previously allowed construct:

    print not (1,2,3)[0];

needs to be written with additional parentheses now:

    print not((1,2,3)[0]);

The behavior remains unaffected when C<not> is not followed by parentheses.
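
Both forms can be tried with a sketch along these lines (the list
contents are illustrative):

    use strict;
    use warnings;

    my @things = (0, 1, '', 'x');

    # not(...) parses like a function call, so it can be grep's expression.
    my @false = grep not($_), @things;

    # A list slice needs its own parentheses inside not(...).
    my $negated = not((1, 2, 3)[0]);

    print scalar(@false), " false values\n";
    print "first element is true\n" unless $negated;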

=item Semantics of bareword prototype C<(*)> have changed

The semantics of the bareword prototype C<*> have changed.  Perl 5.005
always coerced simple scalar arguments to a typeglob, which wasn't useful
in situations where the subroutine must distinguish between a simple
scalar and a typeglob.  The new behavior is to not coerce bareword
arguments to a typeglob.  The value will always be visible as either
a simple scalar or as a reference to a typeglob.

See L</"More functional bareword prototype (*)">.

=item Semantics of bit operators may have changed on 64-bit platforms

If your platform is either natively 64-bit or if Perl has been
configured to use 64-bit integers, i.e., $Config{ivsize} is 8,
there may be a potential incompatibility in the behavior of bitwise
numeric operators (& | ^ ~ << >>).  These operators used to strictly
operate on the lower 32 bits of integers in previous versions, but now
operate over the entire native integral width.  In particular, note
that unary C<~> will produce different results on platforms that have
different $Config{ivsize}.  For portability, be sure to mask off
the excess bits in the result of unary C<~>, e.g., C<~$x & 0xffffffff>.

See L</"Bit operators support full native integer width">.
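
As a sketch of the masking advice, the following should give the same
result whatever the value of $Config{ivsize}:

    use strict;
    use warnings;
    use Config;

    my $x = 0x0F;

    # Mask off everything above the low 32 bits of the complement.
    my $masked = ~$x & 0xFFFFFFFF;

    printf "ivsize=%d masked=%#x\n", $Config{ivsize}, $masked;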

=item More builtins taint their results

As described in L</"Improved security features">, there may be more
sources of taint in a Perl program.

To avoid these new tainting behaviors, you can build Perl with the
Configure option C<-Accflags=-DINCOMPLETE_TAINTS>.  Beware that the
ensuing perl binary may be insecure.

=back

=head2 C Source Incompatibilities

=over 4

=item C<PERL_POLLUTE>

Release 5.005 grandfathered old global symbol names by providing preprocessor
macros for extension source compatibility.  As of release 5.6.0, these
preprocessor definitions are not available by default.  You need to explicitly
compile perl with C<-DPERL_POLLUTE> to get these definitions.  For
extensions still using the old symbols, this option can be
specified via MakeMaker:

    perl Makefile.PL POLLUTE=1

=item C<PERL_IMPLICIT_CONTEXT>

This new build option provides a set of macros for all API functions
such that an implicit interpreter/thread context argument is passed to
every API function.  As a result of this, something like C<sv_setsv(foo,bar)>
amounts to a macro invocation that actually translates to something like
C<Perl_sv_setsv(my_perl,foo,bar)>.  While this is generally expected
to not have any significant source compatibility issues, the difference
between a macro and a real function call will need to be considered.

This means that there B<is> a source compatibility issue as a result of
this if your extensions attempt to use pointers to any of the Perl API
functions.

Note that the above issue is not relevant to the default build of
Perl, whose interfaces continue to match those of prior versions
(but subject to the other options described here).

See L<perlguts/Background and PERL_IMPLICIT_CONTEXT> for detailed information
on the ramifications of building Perl with this option.

    NOTE: PERL_IMPLICIT_CONTEXT is automatically enabled whenever Perl is built
    with one of -Dusethreads, -Dusemultiplicity, or both.  It is not
    intended to be enabled by users at this time.

=item C<PERL_POLLUTE_MALLOC>

Enabling Perl's malloc in release 5.005 and earlier caused the namespace of
the system's malloc family of functions to be usurped by the Perl versions,
since by default they used the same names.  Besides causing problems on
platforms that do not allow these functions to be cleanly replaced, this
also meant that the system versions could not be called in programs that
used Perl's malloc.  Previous versions of Perl have allowed this behaviour
to be suppressed with the HIDEMYMALLOC and EMBEDMYMALLOC preprocessor
definitions.

As of release 5.6.0, Perl's malloc family of functions have default names
distinct from the system versions.  You need to explicitly compile perl with
C<-DPERL_POLLUTE_MALLOC> to get the older behaviour.  HIDEMYMALLOC
and EMBEDMYMALLOC have no effect, since the behaviour they enabled is now
the default.

Note that these functions do B<not> constitute Perl's memory allocation API.
See L<perlguts/"Memory Allocation"> for further information about that.

=back

=head2 Compatible C Source API Changes

=over 4

=item C<PATCHLEVEL> is now C<PERL_VERSION>

The cpp macros C<PERL_REVISION>, C<PERL_VERSION>, and C<PERL_SUBVERSION>
are now available by default from perl.h, and reflect the base revision,
patchlevel, and subversion respectively.  C<PERL_REVISION> had no
prior equivalent, while C<PERL_VERSION> and C<PERL_SUBVERSION> were
previously available as C<PATCHLEVEL> and C<SUBVERSION>.

The new names cause less pollution of the B<cpp> namespace and reflect what
the numbers have come to stand for in common practice.  For compatibility,
the old names are still supported when F<patchlevel.h> is explicitly
included (as required before), so there is no source incompatibility
from the change.

=back

=head2 Binary Incompatibilities

In general, the default build of this release is expected to be binary
compatible for extensions built with the 5.005 release or its maintenance
versions.  However, specific platforms may have broken binary compatibility
due to changes in the defaults used in hints files.  Therefore, please be
sure to always check the platform-specific README files for any notes to
the contrary.

The usethreads or usemultiplicity builds are B<not> binary compatible
with the corresponding builds in 5.005.

On platforms that require an explicit list of exports (AIX, OS/2 and Windows,
among others), purely internal symbols such as parser functions and the
run time opcodes are not exported by default.  Perl 5.005 used to export
all functions irrespective of whether they were considered part of the
public API or not.

For the full list of public API functions, see L<perlapi>.

=head1 Known Problems

=head2 Localizing a tied hash element may leak memory

As of the 5.6.1 release, there is a known leak when code such as this
is executed:

    use Tie::Hash;
    tie my %tie_hash => 'Tie::StdHash';

    ...

    local($tie_hash{Foo}) = 1; # leaks

=head2 Known test failures

=over

=item *

64-bit builds

Subtest #15 of lib/b.t may fail under 64-bit builds on platforms such
as HP-UX PA64 and Linux IA64.  The issue is still being investigated.

The lib/io_multihomed test may hang in HP-UX if Perl has been
configured to be 64-bit.  Because other 64-bit platforms do not
hang in this test, HP-UX is suspect.  All other tests pass
in 64-bit HP-UX.  The test attempts to create and connect to
"multihomed" sockets (sockets which have multiple IP addresses).

Note that 64-bit support is still experimental.

=item *

Failure of Thread tests

Subtests 19 and 20 of the lib/thr5005.t test are known to fail due to
fundamental problems in the 5.005 threading implementation.  These are
not new failures--Perl 5.005_0x has the same bugs, but didn't have these
tests.  (Note that support for 5.005-style threading remains experimental.)

=item *

NEXTSTEP 3.3 POSIX test failure

In NEXTSTEP 3.3p2 the implementation of strftime(3) in the
operating system libraries is buggy: the %j format numbers the days of
a month starting from zero, which, while being logical to programmers,
will cause subtests 19 to 27 of the lib/posix test to fail.

=item *

Tru64 (aka Digital UNIX, aka DEC OSF/1) lib/sdbm test failure with gcc

If compiled with gcc 2.95 the lib/sdbm test will fail (dump core).
The cure is to use the vendor cc, which comes with the operating system
and produces good code.

=back

=head2 EBCDIC platforms not fully supported

In earlier releases of Perl, EBCDIC environments like OS390 (also
known as Open Edition MVS) and VM-ESA were supported.  Due to changes
required by the UTF-8 (Unicode) support, the EBCDIC platforms are not
supported in Perl 5.6.0.

The 5.6.1 release improves support for EBCDIC platforms, but they
are not fully supported yet.

=head2 UNICOS/mk CC failures during Configure run

In UNICOS/mk the following errors may appear during the Configure run:

	Guessing which symbols your C compiler and preprocessor define...
	CC-20 cc: ERROR File = try.c, Line = 3
	...
	  bad switch yylook 79bad switch yylook 79bad switch yylook 79bad switch yylook 79#ifdef A29K
	...
	4 errors detected in the compilation of "try.c".

The culprit is the broken awk of UNICOS/mk.  The effect is fortunately
rather mild: Perl itself is not adversely affected by the error, only
the h2ph utility coming with Perl, and that is rather rarely needed
these days.

=head2 Arrow operator and arrays

When the left argument to the arrow operator C<< -> >> is an array, or
the C<scalar> operator operating on an array, the result of the
operation must be considered erroneous. For example:

    @x->[2]
    scalar(@x)->[2]

These expressions will get run-time errors in some future release of
Perl.

=head2 Experimental features

As discussed above, many features are still experimental.  Interfaces and
implementation of these features are subject to change, and in extreme cases,
even subject to removal in some future release of Perl.  These features
include the following:

=over 4

=item Threads

=item Unicode

=item 64-bit support

=item Lvalue subroutines

=item Weak references

=item The pseudo-hash data type

=item The Compiler suite

=item Internal implementation of file globbing

=item The DB module

=item The regular expression code constructs: 

C<(?{ code })> and C<(??{ code })>

=back

=head1 Obsolete Diagnostics

=over 4

=item Character class syntax [: :] is reserved for future extensions

(W) Within regular expression character classes ([]) the syntax beginning
with "[:" and ending with ":]" is reserved for future extensions.
If you need to represent those character sequences inside a regular
expression character class, just quote the square brackets with the
backslash: "\[:" and ":\]".

=item Ill-formed logical name |%s| in prime_env_iter

(W) A warning peculiar to VMS.  A logical name was encountered when preparing
to iterate over %ENV which violates the syntactic rules governing logical
names.  Because it cannot be translated normally, it is skipped, and will not
appear in %ENV.  This may be a benign occurrence, as some software packages
might directly modify logical name tables and introduce nonstandard names,
or it may indicate that a logical name table has been corrupted.

=item In string, @%s now must be written as \@%s

The description of this error used to say:

        (Someday it will simply assume that an unbackslashed @
         interpolates an array.)

That day has come, and this fatal error has been removed.  It has been
replaced by a non-fatal warning instead.
See L</Arrays now always interpolate into double-quoted strings> for
details.

=item Probable precedence problem on %s

(W) The compiler found a bareword where it expected a conditional,
which often indicates that an || or && was parsed as part of the
last argument of the previous construct, for example:

    open FOO || die;

=item regexp too big

(F) The current implementation of regular expressions uses shorts as
address offsets within a string.  Unfortunately this means that if
the regular expression compiles to longer than 32767, it'll blow up.
Usually when you want a regular expression this big, there is a better
way to do it with multiple statements.  See L<perlre>.

=item Use of "$$<digit>" to mean "${$}<digit>" is deprecated

(D) Perl versions before 5.004 misinterpreted any type marker followed
by "$" and a digit.  For example, "$$0" was incorrectly taken to mean
"${$}0" instead of "${$0}".  This bug is (mostly) fixed in Perl 5.004.

However, the developers of Perl 5.004 could not fix this bug completely,
because at least two widely-used modules depend on the old meaning of
"$$0" in a string.  So Perl 5.004 still interprets "$$<digit>" in the
old (broken) way inside strings; but it generates this message as a
warning.  And in Perl 5.005, this special treatment will cease.

=back

=head1 Reporting Bugs

If you find what you think is a bug, you might check the
articles recently posted to the comp.lang.perl.misc newsgroup.
There may also be information at http://www.perl.com/ , the Perl
Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=head1 HISTORY

Written by Gurusamy Sarathy <F<gsar@ActiveState.com>>, with many
contributions from The Perl Porters.

Send omissions or corrections to <F<perlbug@perl.org>>.

=cut
If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perlhaiku - Perl version 5.10+ on Haiku

=head1 DESCRIPTION

This file contains instructions on how to build Perl for Haiku and lists
known problems.

=head1 BUILD AND INSTALL

The build procedure is completely standard:

  ./Configure -de
  make
  make install

Make perl executable and create a symlink for libperl:

  chmod a+x /boot/common/bin/perl
  cd /boot/common/lib; ln -s perl5/5.26.3/BePC-haiku/CORE/libperl.so .

Replace C<5.26.3> with your respective version of Perl.

=head1 KNOWN PROBLEMS

The following problems are encountered with Haiku revision 28311:

=over 4

=item *

Perl cannot be compiled with threading support at the moment.

=item *

The F<cpan/Socket/t/socketpair.t> test fails. More precisely: the subtests
using datagram sockets fail. Unix datagram sockets aren't implemented in
Haiku yet.

=item *

A subtest of the F<cpan/Sys-Syslog/t/syslog.t> test fails. This is due to Haiku
not implementing F</dev/log> support yet.

=item *

The tests F<dist/Net-Ping/t/450_service.t> and F<dist/Net-Ping/t/510_ping_udp.t>
fail. This is due to bugs in Haiku's network stack implementation.

=back

=head1 CONTACT

For Haiku-specific problems, contact the HaikuPorts developers:
L<http://ports.haiku-files.org/>

The initial Haiku port was done by Ingo Weinhold <ingo_weinhold@gmx.de>.

Last update: 2008-10-29

=head1 NAME

perltrap - Perl traps for the unwary

=head1 DESCRIPTION

The biggest trap of all is forgetting to C<use warnings> or use the B<-w>
switch; see L<warnings> and L<perlrun>. The second biggest trap is not
making your entire program runnable under C<use strict>.  The third biggest
trap is not reading the list of changes in this version of Perl; see
L<perldelta>.

=head2 Awk Traps

Accustomed B<awk> users should take special note of the following:

=over 4

=item *

A Perl program executes only once, not once for each input line.  You can
do an implicit loop with C<-n> or C<-p>.

=item *

The English module, loaded via

    use English;

allows you to refer to special variables (like C<$/>) with names (like
$RS), as though they were in B<awk>; see L<perlvar> for details.

=item *

Semicolons are required after all simple statements in Perl (except
at the end of a block).  Newline is not a statement delimiter.

=item *

Curly brackets are required on C<if>s and C<while>s.

=item *

Variables begin with "$", "@" or "%" in Perl.

=item *

Arrays index from 0.  Likewise string positions in substr() and
index().

=item *

You have to decide whether your array has numeric or string indices.

=item *

Hash values do not spring into existence upon mere reference.

=item *

You have to decide whether you want to use string or numeric
comparisons.

=item *

Reading an input line does not split it for you.  You get to split it
to an array yourself.  And the split() operator has different
arguments than B<awk>'s.

=item *

The current input line is normally in $_, not $0.  It generally does
not have the newline stripped.  ($0 is the name of the program
executed.)  See L<perlvar>.

=item *

$<I<digit>> does not refer to fields--it refers to substrings matched
by the last match pattern.

=item *

The print() statement does not add field and record separators unless
you set C<$,> and C<$\>.  You can set $OFS and $ORS if you're using
the English module.

=item *

You must open your files before you print to them.

=item *

The range operator is "..", not comma.  The comma operator works as in
C.

=item *

The match operator is "=~", not "~".  ("~" is the one's complement
operator, as in C.)

=item *

The exponentiation operator is "**", not "^".  "^" is the XOR
operator, as in C.  (You know, one could get the feeling that B<awk> is
basically incompatible with C.)

=item *

The concatenation operator is ".", not the null string.  (Using the
null string would render C</pat/ /pat/> unparsable, because the third slash
would be interpreted as a division operator--the tokenizer is in fact
slightly context sensitive for operators like "/", "?", and ">".
And in fact, "." itself can be the beginning of a number.)

=item *

The C<next>, C<exit>, and C<continue> keywords work differently.

=item *


The following variables work differently:

      Awk       Perl
      ARGC      scalar @ARGV (compare with $#ARGV)
      ARGV[0]   $0
      FILENAME  $ARGV
      FNR       $. - something
      FS        (whatever you like)
      NF        $#Fld, or some such
      NR        $.
      OFMT      $#
      OFS       $,
      ORS       $\
      RLENGTH   length($&)
      RS        $/
      RSTART    length($`)
      SUBSEP    $;

=item *

You cannot set $RS to a pattern, only a string.

=item *

When in doubt, run the B<awk> construct through B<a2p> and see what it
gives you.

=back
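
Several of the points above can be seen together in a small sketch that
does roughly what B<awk>'s C<{ print NR ": " $2 }> does.  The input
lines and variable names are made up for illustration:

    use strict;
    use warnings;

    my @lines = ("alpha beta gamma\n", "one two\n");
    my @report;
    my $nr = 0;                          # stands in for awk's NR

    for my $line (@lines) {
        chomp $line;                     # the newline is not stripped for you
        $nr++;
        my @fld = split ' ', $line;      # you split the line yourself
        push @report, "$nr: $fld[1]";    # awk's $2 is $fld[1]: arrays index from 0
    }

    print "$_\n" for @report;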

=head2 C/C++ Traps

Cerebral C and C++ programmers should take note of the following:

=over 4

=item *

Curly brackets are required on C<if>s and C<while>s.

=item *

You must use C<elsif> rather than C<else if>.

=item *

The C<break> and C<continue> keywords from C become in Perl C<last>
and C<next>, respectively.  Unlike in C, these do I<not> work within a
C<do { } while> construct.  See L<perlsyn/"Loop Control">.

=item *

The switch statement is called C<given>/C<when> and only available in
perl 5.10 or newer.  See L<perlsyn/"Switch Statements">.

=item *

Variables begin with "$", "@" or "%" in Perl.

=item *

Comments begin with "#", not "/*" or "//".  Perl may interpret C/C++
comments as division operators, unterminated regular expressions or
the defined-or operator.

=item *

You can't take the address of anything, although a similar operator
in Perl is the backslash, which creates a reference.

=item *

C<ARGV> must be capitalized.  C<$ARGV[0]> is C's C<argv[1]>, and C<argv[0]>
ends up in C<$0>.

=item *

System calls such as link(), unlink(), rename(), etc. return nonzero for
success, not 0. (system(), however, returns zero for success.)

=item *

Signal handlers deal with signal names, not numbers.  Use C<kill -l>
to find their names on your system.

=back
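
A short sketch pulling a few of these points together.  The file name
passed to unlink() is hypothetical and is assumed not to exist:

    use strict;
    use warnings;

    my @even;
    for my $n (1 .. 10) {
        if ($n % 2) {
            next;               # C's "continue"
        }
        elsif ($n > 6) {        # "elsif", not "else if"
            last;               # C's "break"
        }
        push @even, $n;
    }
    print "@even\n";            # 2 4 6

    # unlink() returns the number of files removed, so 0 signals failure.
    my $removed = unlink "no_such_file_$$.tmp";
    print "nothing removed: $!\n" unless $removed;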

=head2 JavaScript Traps

Judicious JavaScript programmers should take note of the following:

=over 4

=item *

In Perl, binary C<+> is always addition.  C<$string1 + $string2> converts
both strings to numbers and then adds them.  To concatenate two strings,
use the C<.> operator.

=item *

The C<+> unary operator doesn't do anything in Perl.  It exists to avoid
syntactic ambiguities.

=item *

Unlike C<for...in>, Perl's C<for> (also spelled C<foreach>) does not allow
the left-hand side to be an arbitrary expression.  It must be a variable:

   for my $variable (keys %hash) {
       ...
   }

Furthermore, don't forget the C<keys> in there, as
C<foreach my $kv (%hash) {}> iterates over the keys and values, and is
generally not useful ($kv would be a key, then a value, and so on).

=item *

To iterate over the indices of an array, use C<foreach my $i (0 .. $#array)
{}>.  C<foreach my $v (@array) {}> iterates over the values.

=item *

Perl requires braces following C<if>, C<while>, C<foreach>, etc.

=item *

In Perl, C<else if> is spelled C<elsif>.

=item *

C<? :> has higher precedence than assignment.  In JavaScript, one can
write:

    condition ? do_something() : variable = 3

and the variable is only assigned if the condition is false.  In Perl, you
need parentheses:

    $condition ? do_something() : ($variable = 3);

Or just use C<if>.

=item *

Perl requires semicolons to separate statements.

=item *

Variables declared with C<my> only affect code I<after> the declaration.
You cannot write C<$x = 1; my $x;> and expect the first assignment to
affect the same variable.  It will instead assign to an C<$x> declared
previously in an outer scope, or to a global variable.

Note also that the variable is not visible until the following
I<statement>.  This means that in C<my $x = 1 + $x> the second $x refers
to one declared previously.

=item *

C<my> variables are scoped to the current block, not to the current
function.  If you write C<{my $x;} $x;>, the second C<$x> does not refer to
the one declared inside the block.

=item *

An object's members cannot be made accessible as variables.  The closest
Perl equivalent to C<with(object) { method() }> is C<for>, which can alias
C<$_> to the object:

    for ($object) {
        $_->method;
    }

=item *

The object or class on which a method is called is passed as one of the
method's arguments, not as a separate C<this> value.

=back
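
A few of these differences can be seen side by side in the sketch below.
All variable names and values are made up for the example.

```perl
use strict;
use warnings;

my ($x, $y) = ("3", "4");
my $sum    = $x + $y;    # binary + is always numeric addition
my $joined = $x . $y;    # the . operator concatenates

# Iterate over the keys only; the loop variable must be a plain variable.
my %price = (apple => 2, pear => 5);
my $total = 0;
for my $fruit (sort keys %price) {
    $total += $price{$fruit};
}

# "my" variables are scoped to the enclosing block, not the function.
my $scope = "outer";
{
    my $scope = "inner";   # a different variable, visible only in this block
}

print "$sum $joined $total $scope\n";
```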

=head2 Sed Traps

Seasoned B<sed> programmers should take note of the following:

=over 4

=item *

A Perl program executes only once, not once for each input line.  You can
do an implicit loop with C<-n> or C<-p>.

=item *

Backreferences in substitutions use "$" rather than "\".

=item *

The pattern matching metacharacters "(", ")", and "|" do not have backslashes
in front.

=item *

The range operator is C<...>, rather than comma.

=back
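
As a small illustration of the backreference and metacharacter points, here
is a substitution written the Perl way, with the roughly equivalent B<sed>
expression shown in a comment.  The subroutine name C<swap_halves> is
hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: swap the two words around a hyphen.
sub swap_halves {
    my ($line) = @_;
    # sed would write:  s/\([a-z]*\)-\([a-z]*\)/\2-\1/
    # Perl uses bare parentheses and $1, $2 in the replacement.
    $line =~ s/([a-z]*)-([a-z]*)/$2-$1/;
    return $line;
}

print swap_halves("left-right"), "\n";
```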

=head2 Shell Traps

Sharp shell programmers should take note of the following:

=over 4

=item *

The backtick operator does variable interpolation without regard to
the presence of single quotes in the command.

=item *

The backtick operator does no translation of the return value, unlike B<csh>.

=item *

Shells (especially B<csh>) do several levels of substitution on each
command line.  Perl does substitution in only certain constructs
such as double quotes, backticks, angle brackets, and search patterns.

=item *

Shells interpret scripts a little bit at a time.  Perl compiles the
entire program before executing it (except for C<BEGIN> blocks, which
execute at compile time).

=item *

The arguments are available via @ARGV, not $1, $2, etc.

=item *

The environment is not automatically made available as separate scalar
variables.

=item *

The shell's C<test> uses "=", "!=", "<" etc. for string comparisons and "-eq",
"-ne", "-lt" etc. for numeric comparisons.  This is the reverse of Perl, which
uses C<eq>, C<ne>, C<lt> for string comparisons, and C<==>, C<!=>, C<< < >> etc.
for numeric comparisons.

=back
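
The comparison-operator reversal is the trap that bites most often, so a
brief sketch may help.  It also shows where the environment and the
arguments live; the variable names are illustrative only.

```perl
use strict;
use warnings;

# String comparison uses eq/ne/lt; numeric comparison uses ==, !=, <.
my $as_strings = ("10" lt "9") ? "yes" : "no";   # "1" sorts before "9"
my $as_numbers = (10 < 9)      ? "yes" : "no";

# The environment is in %ENV and the arguments are in @ARGV.
my $home      = exists $ENV{HOME} ? $ENV{HOME} : "(unset)";
my $arg_count = scalar @ARGV;

print "string compare: $as_strings, numeric compare: $as_numbers\n";
print "HOME is $home, $arg_count argument(s) given\n";
```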

=head2 Perl Traps

Practicing Perl Programmers should take note of the following:

=over 4

=item *

Remember that many operations behave differently in a list
context than they do in a scalar one.  See L<perldata> for details.

=item *

Avoid barewords if you can, especially all lowercase ones.
You can't tell by just looking at it whether a bareword is
a function or a string.  By using quotes on strings and
parentheses on function calls, you won't ever get them confused.

=item *

You cannot discern from mere inspection which builtins
are unary operators (like chop() and chdir())
and which are list operators (like print() and unlink()).
(Unless prototyped, user-defined subroutines can B<only> be list
operators, never unary ones.)  See L<perlop> and L<perlsub>.

=item *

People have a hard time remembering that some functions
default to $_, or @ARGV, or whatever, but that others which
you might expect to do not.

=item *

The <FH> construct is not the name of the filehandle, it is a readline
operation on that handle.  The data read is assigned to $_ only if the
file read is the sole condition in a while loop:

    while (<FH>)      { }
    while (defined($_ = <FH>)) { }
    <FH>;  # data discarded!

=item *

Remember not to use C<=> when you need C<=~>;
these two constructs are quite different:

    $x =  /foo/;
    $x =~ /foo/;

=item *

The C<do {}> construct isn't a real loop that you can use
loop control on.

=item *

Use C<my()> for local variables whenever you can get away with
it (but see L<perlform> for where you can't).
Using C<local()> actually gives a local value to a global
variable, which leaves you open to unforeseen side-effects
of dynamic scoping.

=item *

If you localize an exported variable in a module, its exported value will
not change.  The local name becomes an alias to a new value but the
external name is still an alias for the original.

=back
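
The C<=> versus C<=~> trap is easy to demonstrate.  In this sketch the
variable names and strings are invented; the point is only that the first
match tests C<$_> while the second tests C<$text>.

```perl
use strict;
use warnings;

local $_ = "no match here";
my $text = "some foo text";

my $assigned = /foo/;            # matches against $_, then assigns the result
my $bound    = $text =~ /foo/;   # matches against $text

print $assigned ? "assigned: true\n" : "assigned: false\n";
print $bound    ? "bound: true\n"    : "bound: false\n";
```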

As always, if any of these are ever officially declared as bugs,
they'll be fixed and removed.

=head1 NAME

perlcall - Perl calling conventions from C

=head1 DESCRIPTION

The purpose of this document is to show you how to call Perl subroutines
directly from C, i.e., how to write I<callbacks>.

Apart from discussing the C interface provided by Perl for writing
callbacks the document uses a series of examples to show how the
interface actually works in practice.  In addition some techniques for
coding callbacks are covered.

Examples where callbacks are necessary include

=over 5

=item * An Error Handler

You have created an XSUB interface to an application's C API.

A fairly common feature in applications is to allow you to define a C
function that will be called whenever something nasty occurs. What we
would like is to be able to specify a Perl subroutine that will be
called instead.

=item * An Event-Driven Program

The classic example of where callbacks are used is when writing an
event driven program, such as for an X11 application.  In this case
you register functions to be called whenever specific events occur,
e.g., a mouse button is pressed, the cursor moves into a window or a
menu item is selected.

=back

Although the techniques described here are applicable when embedding
Perl in a C program, this is not the primary goal of this document.
There are other details that must be considered and are specific to
embedding Perl. For details on embedding Perl in C refer to
L<perlembed>.

Before you launch yourself head first into the rest of this document,
it would be a good idea to have read the following two documents--L<perlxs>
and L<perlguts>.

=head1 THE CALL_ FUNCTIONS

Although this stuff is easier to explain using examples, you first need
to be aware of a few important definitions.

Perl has a number of C functions that allow you to call Perl
subroutines.  They are

    I32 call_sv(SV* sv, I32 flags);
    I32 call_pv(char *subname, I32 flags);
    I32 call_method(char *methname, I32 flags);
    I32 call_argv(char *subname, I32 flags, char **argv);

The key function is I<call_sv>.  All the other functions are
fairly simple wrappers which make it easier to call Perl subroutines in
special cases. At the end of the day they will all call I<call_sv>
to invoke the Perl subroutine.

All the I<call_*> functions have a C<flags> parameter which is
used to pass a bit mask of options to Perl.  This bit mask operates
identically for each of the functions.  The settings available in the
bit mask are discussed in L</FLAG VALUES>.

Each of the functions will now be discussed in turn.

=over 5

=item call_sv

I<call_sv> takes two parameters. The first, C<sv>, is an SV*.
This allows you to specify the Perl subroutine to be called either as a
C string (which has first been converted to an SV) or a reference to a
subroutine. The section, L</Using call_sv>, shows how you can make
use of I<call_sv>.

=item call_pv

The function, I<call_pv>, is similar to I<call_sv> except it
expects its first parameter to be a C char* which identifies the Perl
subroutine you want to call, e.g., C<call_pv("fred", 0)>.  If the
subroutine you want to call is in another package, just include the
package name in the string, e.g., C<"pkg::fred">.

=item call_method

The function I<call_method> is used to call a method from a Perl
class.  The parameter C<methname> corresponds to the name of the method
to be called.  Note that the class that the method belongs to is passed
on the Perl stack rather than in the parameter list. This class can be
either the name of the class (for a static method) or a reference to an
object (for a virtual method).  See L<perlobj> for more information on
static and virtual methods and L</Using call_method> for an example
of using I<call_method>.

=item call_argv

I<call_argv> calls the Perl subroutine specified by the C string
stored in the C<subname> parameter. It also takes the usual C<flags>
parameter.  The final parameter, C<argv>, consists of a NULL-terminated
list of C strings to be passed as parameters to the Perl subroutine.
See L</Using call_argv>.

=back

All the functions return an integer. This is a count of the number of
items returned by the Perl subroutine. The actual items returned by the
subroutine are stored on the Perl stack.

As a general rule you should I<always> check the return value from
these functions.  Even if you are expecting only a particular number of
values to be returned from the Perl subroutine, there is nothing to
stop someone from doing something unexpected--don't say you haven't
been warned.

=head1 FLAG VALUES

The C<flags> parameter in all the I<call_*> functions is one of G_VOID,
G_SCALAR, or G_ARRAY, which indicate the call context, OR'ed together
with a bit mask of any combination of the other G_* symbols defined below.

=head2  G_VOID

Calls the Perl subroutine in a void context.

This flag has 2 effects:

=over 5

=item 1.

It indicates to the subroutine being called that it is executing in
a void context (if it executes I<wantarray> the result will be the
undefined value).

=item 2.

It ensures that nothing is actually returned from the subroutine.

=back

The value returned by the I<call_*> function indicates how many
items have been returned by the Perl subroutine--in this case it will
be 0.


=head2  G_SCALAR

Calls the Perl subroutine in a scalar context.  This is the default
context flag setting for all the I<call_*> functions.

This flag has 2 effects:

=over 5

=item 1.

It indicates to the subroutine being called that it is executing in a
scalar context (if it executes I<wantarray> the result will be false).

=item 2.

It ensures that only a scalar is actually returned from the subroutine.
The subroutine can, of course,  ignore the I<wantarray> and return a
list anyway. If so, then only the last element of the list will be
returned.

=back

The value returned by the I<call_*> function indicates how many
items have been returned by the Perl subroutine - in this case it will
be either 0 or 1.

If 0, then you have specified the G_DISCARD flag.

If 1, then the item actually returned by the Perl subroutine will be
stored on the Perl stack - the section L</Returning a Scalar> shows how
to access this value on the stack.  Remember that regardless of how
many items the Perl subroutine returns, only the last one will be
accessible from the stack - think of the case where only one value is
returned as being a list with only one element.  Any other items that
were returned will not exist by the time control returns from the
I<call_*> function.  The section L</Returning a List in Scalar
Context> shows an example of this behavior.


=head2 G_ARRAY

Calls the Perl subroutine in a list context.

As with G_SCALAR, this flag has 2 effects:

=over 5

=item 1.

It indicates to the subroutine being called that it is executing in a
list context (if it executes I<wantarray> the result will be true).

=item 2.

It ensures that all items returned from the subroutine will be
accessible when control returns from the I<call_*> function.

=back

The value returned by the I<call_*> function indicates how many
items have been returned by the Perl subroutine.

If 0, then you have specified the G_DISCARD flag.

If not 0, then it will be a count of the number of items returned by
the subroutine. These items will be stored on the Perl stack.  The
section L</Returning a List of Values> gives an example of using the
G_ARRAY flag and the mechanics of accessing the returned items from the
Perl stack.

=head2 G_DISCARD

By default, the I<call_*> functions place the items returned
by the Perl subroutine on the stack.  If you are not interested in
these items, then setting this flag will make Perl get rid of them
automatically for you.  Note that it is still possible to indicate a
context to the Perl subroutine by using either G_SCALAR or G_ARRAY.

If you do not set this flag then it is I<very> important that you make
sure that any temporaries (i.e., parameters passed to the Perl
subroutine and values returned from the subroutine) are disposed of
yourself.  The section L</Returning a Scalar> gives details of how to
dispose of these temporaries explicitly and the section L</Using Perl to
Dispose of Temporaries> discusses the specific circumstances where you
can ignore the problem and let Perl deal with it for you.

=head2 G_NOARGS

Whenever a Perl subroutine is called using one of the I<call_*>
functions, it is assumed by default that parameters are to be passed to
the subroutine.  If you are not passing any parameters to the Perl
subroutine, you can save a bit of time by setting this flag.  It has
the effect of not creating the C<@_> array for the Perl subroutine.

Although the functionality provided by this flag may seem
straightforward, it should be used only if there is a good reason to do
so.  The reason for being cautious is that, even if you have specified
the G_NOARGS flag, it is still possible for the Perl subroutine that
has been called to think that you have passed it parameters.

In fact, what can happen is that the Perl subroutine you have called
can access the C<@_> array from a previous Perl subroutine.  This will
occur when the code that is executing the I<call_*> function has
itself been called from another Perl subroutine. The code below
illustrates this

    sub fred
      { print "@_\n"  }

    sub joe
      { &fred }

    &joe(1,2,3);

This will print

    1 2 3

What has happened is that C<fred> accesses the C<@_> array which
belongs to C<joe>.
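
The same effect can be seen from pure Perl by comparing a call made with a
bare ampersand against one made with parentheses.  This is only a sketch of
the Perl-level behaviour that G_NOARGS resembles; it does not exercise the
C interface, and the subroutine names are hypothetical.

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

sub via_ampersand { return &count_args }    # no parens: reuses the caller's @_
sub via_parens    { return count_args() }   # parens: a fresh, empty @_

print via_ampersand(1, 2, 3), " ", via_parens(1, 2, 3), "\n";
```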


=head2 G_EVAL

It is possible for the Perl subroutine you are calling to terminate
abnormally, e.g., by calling I<die> explicitly or by not actually
existing.  By default, when either of these events occurs, the
process will terminate immediately.  If you want to trap this
type of event, specify the G_EVAL flag.  It will put an I<eval { }>
around the subroutine call.

Whenever control returns from the I<call_*> function you need to
check the C<$@> variable as you would in a normal Perl script.

The value returned from the I<call_*> function is dependent on
what other flags have been specified and whether an error has
occurred.  Here are all the different cases that can occur:

=over 5

=item *

If the I<call_*> function returns normally, then the value
returned is as specified in the previous sections.

=item *

If G_DISCARD is specified, the return value will always be 0.

=item *

If G_ARRAY is specified I<and> an error has occurred, the return value
will always be 0.

=item *

If G_SCALAR is specified I<and> an error has occurred, the return value
will be 1 and the value on the top of the stack will be I<undef>. This
means that if you have already detected the error by checking C<$@> and
you want the program to continue, you must remember to pop the I<undef>
from the stack.

=back

See L</Using G_EVAL> for details on using G_EVAL.
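
For readers more at home in Perl than in C, the scalar-context case can be
pictured roughly as follows.  This is a Perl-level analogy only, not what
the C code does internally, and C<try_call> is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical wrapper: call a code ref in scalar context, trapping die.
sub try_call {
    my ($code, @args) = @_;
    my $result = eval { $code->(@args) };
    return (undef, $@) if $@;     # on failure the "returned value" is undef
    return ($result, undef);
}

my $subtract = sub {
    my ($a, $b) = @_;
    die "death can be fatal\n" if $a < $b;
    return $a - $b;
};

my ($value, $error) = try_call($subtract, 4, 5);
print defined $error ? "Uh oh - $error" : "Result: $value\n";
```

As with G_SCALAR plus G_EVAL, a failed call still yields one (undefined)
value, and the error has to be detected by looking at C<$@>.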

=head2 G_KEEPERR

Using the G_EVAL flag described above will always set C<$@>: clearing
it if there was no error, and setting it to describe the error if there
was an error in the called code.  This is what you want if your intention
is to handle possible errors, but sometimes you just want to trap errors
and stop them interfering with the rest of the program.

This scenario will mostly be applicable to code that is meant to be called
from within destructors, asynchronous callbacks, and signal handlers.
In such situations, where the code being called has little relation to the
surrounding dynamic context, the main program needs to be insulated from
errors in the called code, even if they can't be handled intelligently.
It may also be useful to do this with code for C<__DIE__> or C<__WARN__>
hooks, and C<tie> functions.

The G_KEEPERR flag is meant to be used in conjunction with G_EVAL in
I<call_*> functions that are used to implement such code, or with
C<eval_sv>.  This flag has no effect on the C<call_*> functions when
G_EVAL is not used.

When G_KEEPERR is used, any error in the called code will terminate the
call as usual, and the error will not propagate beyond the call (as usual
for G_EVAL), but it will not go into C<$@>.  Instead the error will be
converted into a warning, prefixed with the string "\t(in cleanup)".
This can be disabled using C<no warnings 'misc'>.  If there is no error,
C<$@> will not be cleared.

Note that the G_KEEPERR flag does not propagate into inner evals; these
may still set C<$@>.

The G_KEEPERR flag was introduced in Perl version 5.002.

See L</Using G_KEEPERR> for an example of a situation that warrants the
use of this flag.
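
The effect can be observed from Perl without writing any C, because, as far
as I am aware, perl calls C<DESTROY> methods in this manner itself.  In the
sketch below the package name C<Noisy> and the helper C<run_demo> are
invented for illustration; the destructor's error should surface as an
"(in cleanup)" warning while C<$@> keeps the error from the surrounding
code.

```perl
use strict;
use warnings;

package Noisy;
sub new     { return bless {}, shift }
sub DESTROY { die "failure inside destructor\n" }

package main;

sub run_demo {
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };
    eval {
        my $obj = Noisy->new;     # destroyed while the eval unwinds
        die "real error\n";
    };
    my $error = $@;
    return ($error, @warnings);
}

my ($error, @warnings) = run_demo();
print "error: $error";
print "warning: $_" for @warnings;
```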

=head2 Determining the Context

As mentioned above, you can determine the context of the currently
executing subroutine in Perl with I<wantarray>.  The equivalent test
can be made in C by using the C<GIMME_V> macro, which returns
C<G_ARRAY> if you have been called in a list context, C<G_SCALAR> if
in a scalar context, or C<G_VOID> if in a void context (i.e., the
return value will not be used).  An older version of this macro is
called C<GIMME>; in a void context it returns C<G_SCALAR> instead of
C<G_VOID>.  An example of using the C<GIMME_V> macro is shown in
section L</Using GIMME_V>.
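
On the Perl side, the three possible results of I<wantarray> can be
distinguished like this.  The subroutine name C<record_context> is
hypothetical.

```perl
use strict;
use warnings;

my @seen;

# Hypothetical helper: remember the context of each call.
sub record_context {
    my $want = wantarray;
    push @seen, !defined $want ? "void"      # corresponds to G_VOID
              : $want          ? "list"      # corresponds to G_ARRAY
              :                  "scalar";   # corresponds to G_SCALAR
    return;
}

record_context();                 # void context
my $scalar = record_context();    # scalar context
my @list   = record_context();    # list context

print "@seen\n";
```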

=head1 EXAMPLES

Enough of the definition talk! Let's have a few examples.

Perl provides many macros to assist in accessing the Perl stack.
Wherever possible, these macros should always be used when interfacing
to Perl internals.  We hope this should make the code less vulnerable
to any changes made to Perl in the future.

Another point worth noting is that in the first series of examples I
have made use of only the I<call_pv> function.  This has been done
to keep the code simpler and ease you into the topic.  Wherever
possible, if the choice is between using I<call_pv> and
I<call_sv>, you should always try to use I<call_sv>.  See
L</Using call_sv> for details.

=head2 No Parameters, Nothing Returned

This first trivial example will call a Perl subroutine, I<PrintUID>, to
print out the UID of the process.

    sub PrintUID
    {
        print "UID is $<\n";
    }

and here is a C function to call it

    static void
    call_PrintUID(void)
    {
        dSP;

        PUSHMARK(SP);
        call_pv("PrintUID", G_DISCARD|G_NOARGS);
    }

Simple, eh?

A few points to note about this example:

=over 5

=item 1.

Ignore C<dSP> and C<PUSHMARK(SP)> for now. They will be discussed in
the next example.

=item 2.

We aren't passing any parameters to I<PrintUID> so G_NOARGS can be
specified.

=item 3.

We aren't interested in anything returned from I<PrintUID>, so
G_DISCARD is specified. Even if I<PrintUID> was changed to
return some value(s), having specified G_DISCARD will mean that they
will be wiped by the time control returns from I<call_pv>.

=item 4.

As I<call_pv> is being used, the Perl subroutine is specified as a
C string. In this case the subroutine name has been 'hard-wired' into the
code.

=item 5.

Because we specified G_DISCARD, it is not necessary to check the value
returned from I<call_pv>. It will always be 0.

=back

=head2 Passing Parameters

Now let's make a slightly more complex example. This time we want to
call a Perl subroutine, C<LeftString>, which will take 2 parameters--a
string ($s) and an integer ($n).  The subroutine will simply
print the first $n characters of the string.

So the Perl subroutine would look like this:

    sub LeftString
    {
        my($s, $n) = @_;
        print substr($s, 0, $n), "\n";
    }

The C function required to call I<LeftString> would look like this:

    static void
    call_LeftString(char *a, int b)
    {
        dSP;

        ENTER;
        SAVETMPS;

        PUSHMARK(SP);
        EXTEND(SP, 2);
        PUSHs(sv_2mortal(newSVpv(a, 0)));
        PUSHs(sv_2mortal(newSViv(b)));
        PUTBACK;

        call_pv("LeftString", G_DISCARD);

        FREETMPS;
        LEAVE;
    }

Here are a few notes on the C function I<call_LeftString>.

=over 5

=item 1.

Parameters are passed to the Perl subroutine using the Perl stack.
This is the purpose of the code beginning with the line C<dSP> and
ending with the line C<PUTBACK>.  The C<dSP> declares a local copy
of the stack pointer.  This local copy should B<always> be accessed
as C<SP>.

=item 2.

If you are going to put something onto the Perl stack, you need to know
where to put it. This is the purpose of the macro C<dSP>--it declares
and initializes a I<local> copy of the Perl stack pointer.

All the other macros which will be used in this example require you to
have used this macro.

The exception to this rule is if you are calling a Perl subroutine
directly from an XSUB function. In this case it is not necessary to
use the C<dSP> macro explicitly--it will be declared for you
automatically.

=item 3.

Any parameters to be pushed onto the stack should be bracketed by the
C<PUSHMARK> and C<PUTBACK> macros.  The purpose of these two macros, in
this context, is to count automatically the number of parameters you are
pushing.  Then whenever Perl is creating the C<@_> array for the
subroutine, it knows how big to make it.

The C<PUSHMARK> macro tells Perl to make a mental note of the current
stack pointer. Even if you aren't passing any parameters (like the
example shown in the section L</No Parameters, Nothing Returned>) you
must still call the C<PUSHMARK> macro before you can call any of the
I<call_*> functions--Perl still needs to know that there are no
parameters.

The C<PUTBACK> macro sets the global copy of the stack pointer to be
the same as our local copy. If we didn't do this, I<call_pv>
wouldn't know where the two parameters we pushed were--remember that
up to now all the stack pointer manipulation we have done is with our
local copy, I<not> the global copy.

=item 4.

Next, we come to EXTEND and PUSHs. This is where the parameters
actually get pushed onto the stack. In this case we are pushing a
string and an integer.

Alternatively you can use the XPUSHs() macro, which combines a
C<EXTEND(SP, 1)> and C<PUSHs()>.  This is less efficient if you're
pushing multiple values.

See L<perlguts/"XSUBs and the Argument Stack"> for details
on how the PUSH macros work.

=item 5.

Because we created temporary values (by means of sv_2mortal() calls)
we will have to tidy up the Perl stack and dispose of mortal SVs.

This is the purpose of

    ENTER;
    SAVETMPS;

at the start of the function, and

    FREETMPS;
    LEAVE;

at the end. The C<ENTER>/C<SAVETMPS> pair creates a boundary for any
temporaries we create.  This means that the temporaries we get rid of
will be limited to those which were created after these calls.

The C<FREETMPS>/C<LEAVE> pair will get rid of any values returned by
the Perl subroutine (see next example), plus it will also dump the
mortal SVs we have created.  Having C<ENTER>/C<SAVETMPS> at the
beginning of the code makes sure that no other mortals are destroyed.

Think of these macros as working a bit like C<{> and C<}> in Perl
to limit the scope of local variables.

See the section L</Using Perl to Dispose of Temporaries> for details of
an alternative to using these macros.

=item 6.

Finally, I<LeftString> can now be called via the I<call_pv> function.
The only flag specified this time is G_DISCARD. Because we are passing
2 parameters to the Perl subroutine this time, we have not specified
G_NOARGS.

=back

=head2 Returning a Scalar

Now for an example of dealing with the items returned from a Perl
subroutine.

Here is a Perl subroutine, I<Adder>, that takes 2 integer parameters
and simply returns their sum.

    sub Adder
    {
        my($a, $b) = @_;
        $a + $b;
    }

Because we are now concerned with the return value from I<Adder>, the C
function required to call it is now a bit more complex.

    static void
    call_Adder(int a, int b)
    {
        dSP;
        int count;

        ENTER;
        SAVETMPS;

        PUSHMARK(SP);
        EXTEND(SP, 2);
        PUSHs(sv_2mortal(newSViv(a)));
        PUSHs(sv_2mortal(newSViv(b)));
        PUTBACK;

        count = call_pv("Adder", G_SCALAR);

        SPAGAIN;

        if (count != 1)
            croak("Big trouble\n");

        printf ("The sum of %d and %d is %d\n", a, b, POPi);

        PUTBACK;
        FREETMPS;
        LEAVE;
    }

Points to note this time are

=over 5

=item 1.

The only flag specified this time was G_SCALAR. That means that the C<@_>
array will be created and that the value returned by I<Adder> will
still exist after the call to I<call_pv>.

=item 2.

The purpose of the macro C<SPAGAIN> is to refresh the local copy of the
stack pointer. This is necessary because it is possible that the memory
allocated to the Perl stack has been reallocated during the
I<call_pv> call.

If you are making use of the Perl stack pointer in your code you must
always refresh the local copy using SPAGAIN whenever you make use
of the I<call_*> functions or any other Perl internal function.

=item 3.

Although only a single value was expected to be returned from I<Adder>,
it is still good practice to check the return code from I<call_pv>
anyway.

Expecting a single value is not quite the same as knowing that there
will be one. If someone modified I<Adder> to return a list and we
didn't check for that possibility and take appropriate action, the Perl
stack would end up in an inconsistent state. That is something you
I<really> don't want to happen ever.

=item 4.

The C<POPi> macro is used here to pop the return value from the stack.
In this case we wanted an integer, so C<POPi> was used.


Here is the complete list of POP macros available, along with the types
they return.

    POPs	SV
    POPp	pointer (PV)
    POPpbytex   pointer to bytes (PV)
    POPn	double (NV)
    POPi	integer (IV)
    POPu        unsigned integer (UV)
    POPl	long
    POPul       unsigned long

Since these macros have side-effects, don't use them as arguments to
macros that may evaluate their argument several times, for example:

  /* Bad idea, don't do this */
  STRLEN len;
  const char *s = SvPV(POPs, len);

Instead, use a temporary:

  STRLEN len;
  SV *sv = POPs;
  const char *s = SvPV(sv, len);

or a macro that guarantees it will evaluate its arguments only once:

  STRLEN len;
  const char *s = SvPVx(POPs, len);

=item 5.

The final C<PUTBACK> is used to leave the Perl stack in a consistent
state before exiting the function.  This is necessary because when we
popped the return value from the stack with C<POPi> it updated only our
local copy of the stack pointer.  Remember, C<PUTBACK> sets the global
stack pointer to be the same as our local copy.

=back


=head2 Returning a List of Values

Now, let's extend the previous example to return both the sum of the
parameters and the difference.

Here is the Perl subroutine

    sub AddSubtract
    {
       my($a, $b) = @_;
       ($a+$b, $a-$b);
    }

and this is the C function

    static void
    call_AddSubtract(int a, int b)
    {
        dSP;
        int count;

        ENTER;
        SAVETMPS;

        PUSHMARK(SP);
        EXTEND(SP, 2);
        PUSHs(sv_2mortal(newSViv(a)));
        PUSHs(sv_2mortal(newSViv(b)));
        PUTBACK;

        count = call_pv("AddSubtract", G_ARRAY);

        SPAGAIN;

        if (count != 2)
            croak("Big trouble\n");

        printf ("%d - %d = %d\n", a, b, POPi);
        printf ("%d + %d = %d\n", a, b, POPi);

        PUTBACK;
        FREETMPS;
        LEAVE;
    }

If I<call_AddSubtract> is called like this

    call_AddSubtract(7, 4);

then here is the output

    7 - 4 = 3
    7 + 4 = 11

Notes

=over 5

=item 1.

We wanted list context, so G_ARRAY was used.

=item 2.

Not surprisingly C<POPi> is used twice this time because we were
retrieving 2 values from the stack. The important thing to note is that
when using the C<POP*> macros they come off the stack in I<reverse>
order.

=back
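
The reverse ordering is the same as you would get by treating the returned
list as an array and calling C<pop> on it.  The sketch below is a pure-Perl
picture of that ordering, not a use of the C macros.

```perl
use strict;
use warnings;

sub AddSubtract {
    my ($a, $b) = @_;
    return ($a + $b, $a - $b);
}

my @returned   = AddSubtract(7, 4);   # sum first, difference second
my $difference = pop @returned;       # the last item comes off first
my $sum        = pop @returned;

print "7 - 4 = $difference\n";
print "7 + 4 = $sum\n";
```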

=head2 Returning a List in Scalar Context

Say the Perl subroutine in the previous section was called in a scalar
context, like this

    static void
    call_AddSubScalar(int a, int b)
    {
        dSP;
        int count;
        int i;

        ENTER;
        SAVETMPS;

        PUSHMARK(SP);
        EXTEND(SP, 2);
        PUSHs(sv_2mortal(newSViv(a)));
        PUSHs(sv_2mortal(newSViv(b)));
        PUTBACK;

        count = call_pv("AddSubtract", G_SCALAR);

        SPAGAIN;

        printf ("Items Returned = %d\n", count);

        for (i = 1; i <= count; ++i)
            printf ("Value %d = %d\n", i, POPi);

        PUTBACK;
        FREETMPS;
        LEAVE;
    }

Apart from passing G_SCALAR instead of G_ARRAY, the other modification
made is that I<call_AddSubScalar> will print the number of items returned
from the Perl subroutine and their values (for simplicity it assumes that
they are integers).  So if I<call_AddSubScalar> is called

    call_AddSubScalar(7, 4);

then the output will be

    Items Returned = 1
    Value 1 = 3

In this case the main point to note is that only the last item in the
list returned from the subroutine I<AddSubtract> actually made it back to
I<call_AddSubScalar>.
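
Exactly the same thing happens when the subroutine is called in scalar
context from Perl itself, as this small sketch shows.  The variable name
C<$only> is illustrative.

```perl
use strict;
use warnings;

sub AddSubtract {
    my ($a, $b) = @_;
    ($a + $b, $a - $b);
}

# Scalar context: only the last element of the list survives.
my $only = AddSubtract(7, 4);
print "Value = $only\n";
```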


=head2 Returning Data from Perl via the Parameter List

It is also possible to return values directly via the parameter
list--whether it is actually desirable to do it is another matter entirely.

The Perl subroutine, I<Inc>, below takes 2 parameters and increments
each directly.

    sub Inc
    {
        ++ $_[0];
        ++ $_[1];
    }

and here is a C function to call it.

    static void
    call_Inc(int a, int b)
    {
        dSP;
        int count;
        SV * sva;
        SV * svb;

        ENTER;
        SAVETMPS;

        sva = sv_2mortal(newSViv(a));
        svb = sv_2mortal(newSViv(b));

        PUSHMARK(SP);
        EXTEND(SP, 2);
        PUSHs(sva);
        PUSHs(svb);
        PUTBACK;

        count = call_pv("Inc", G_DISCARD);

        if (count != 0)
            croak ("call_Inc: expected 0 values from 'Inc', got %d\n",
                   count);

        printf ("%d + 1 = %d\n", a, SvIV(sva));
        printf ("%d + 1 = %d\n", b, SvIV(svb));

        FREETMPS;
        LEAVE;
    }

To be able to access the two parameters that were pushed onto the stack
after control returns from I<call_pv>, it is necessary to make a note
of their addresses--thus the two variables C<sva> and C<svb>.

The reason this is necessary is that the area of the Perl stack which
held them will very likely have been overwritten by something else by
the time control returns from I<call_pv>.




=head2 Using G_EVAL

Now an example using G_EVAL. Below is a Perl subroutine which computes
the difference of its 2 parameters. If this would produce a negative
result, the subroutine calls I<die>.

    sub Subtract
    {
        my ($a, $b) = @_;

        die "death can be fatal\n" if $a < $b;

        $a - $b;
    }

and some C to call it

 static void
 call_Subtract(a, b)
 int a;
 int b;
 {
     dSP;
     int count;
     SV *err_tmp;

     ENTER;
     SAVETMPS;

     PUSHMARK(SP);
     EXTEND(SP, 2);
     PUSHs(sv_2mortal(newSViv(a)));
     PUSHs(sv_2mortal(newSViv(b)));
     PUTBACK;

     count = call_pv("Subtract", G_EVAL|G_SCALAR);

     SPAGAIN;

     /* Check the eval first */
     err_tmp = ERRSV;
     if (SvTRUE(err_tmp))
     {
         printf ("Uh oh - %s\n", SvPV_nolen(err_tmp));
         POPs;
     }
     else
     {
         if (count != 1)
             croak("call_Subtract: wanted 1 value from 'Subtract', got %d\n",
                   count);

         printf ("%d - %d = %d\n", a, b, POPi);
     }

     PUTBACK;
     FREETMPS;
     LEAVE;
 }

If I<call_Subtract> is called thus

    call_Subtract(4, 5)

the following will be printed

    Uh oh - death can be fatal

Notes

=over 5

=item 1.

We want to be able to catch the I<die> so we have used the G_EVAL
flag.  Not specifying this flag would mean that the program would
terminate immediately at the I<die> statement in the subroutine
I<Subtract>.

=item 2.

The code

    err_tmp = ERRSV;
    if (SvTRUE(err_tmp))
    {
        printf ("Uh oh - %s\n", SvPV_nolen(err_tmp));
        POPs;
    }

is the direct equivalent of this bit of Perl

    print "Uh oh - $@\n" if $@;

C<PL_errgv> is a perl global of type C<GV *> that points to the symbol
table entry containing the error.  C<ERRSV> therefore refers to the C
equivalent of C<$@>.  We use a local temporary, C<err_tmp>, since
C<ERRSV> is a macro that calls a function, and C<SvTRUE(ERRSV)> would
end up calling that function multiple times.

=item 3.

Note that the stack is popped using C<POPs> in the block where
C<SvTRUE(err_tmp)> is true.  This is necessary because whenever a
I<call_*> function invoked with G_EVAL|G_SCALAR returns an error,
the top of the stack holds the value I<undef>. Because we want the
program to continue after detecting this error, it is essential that
the stack be tidied up by removing the I<undef>.

=back
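
As a loose analogy for what G_EVAL provides, the sketch below traps a
failure in plain C with C<setjmp>/C<longjmp> and lets the caller carry
on.  It does not use the Perl API: C<die_with>, C<trapped_subtract> and
C<last_error> are hypothetical names, with C<last_error> standing in
for C<$@>, and only a single level of trapping is modelled.

```c
#include <setjmp.h>
#include <stddef.h>

static jmp_buf     trap;             /* the one active trap (no nesting) */
static const char *last_error = "";  /* plays the role of $@ */

/* Record the message and jump back to the trap; never returns. */
static void die_with(const char *msg)
{
    last_error = msg;
    longjmp(trap, 1);
}

static int subtract(int a, int b)
{
    if (a < b)
        die_with("death can be fatal");
    return a - b;
}

/* Returns a - b, or `fallback` if subtract() died. */
static int trapped_subtract(int a, int b, int fallback)
{
    int result;

    if (setjmp(trap) != 0)
        return fallback;             /* arrived here via longjmp */

    result = subtract(a, b);
    last_error = "";                 /* success leaves the error slot empty */
    return result;
}
```

Calling C<trapped_subtract(4, 5, -1)> returns the fallback and leaves
the message in C<last_error>, after which the program continues
normally, much as I<call_Subtract> does after printing its message.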


=head2 Using G_KEEPERR

Consider this rather facetious example, where we have used an XS
version of the call_Subtract example above inside a destructor:

    package Foo;
    sub new { bless {}, $_[0] }
    sub Subtract {
        my($a,$b) = @_;
        die "death can be fatal" if $a < $b;
        $a - $b;
    }
    sub DESTROY { call_Subtract(5, 4); }
    sub foo { die "foo dies"; }

    package main;
    {
        my $foo = Foo->new;
        eval { $foo->foo };
    }
    print "Saw: $@" if $@;             # should be, but isn't

This example will fail to recognize that an error occurred inside the
C<eval {}>.  Here's why: the call_Subtract code got executed while perl
was cleaning up temporaries when exiting the outer braced block, and because
call_Subtract is implemented with I<call_pv> using the G_EVAL
flag, it promptly reset C<$@>.  This results in the failure of the
outermost test for C<$@>, and thereby the failure of the error trap.

Appending the G_KEEPERR flag, so that the I<call_pv> call in
call_Subtract reads:

        count = call_pv("Subtract", G_EVAL|G_SCALAR|G_KEEPERR);

will preserve the error and restore reliable error handling.

=head2 Using call_sv

In all the previous examples I have 'hard-wired' the name of the Perl
subroutine to be called from C.  Most of the time though, it is more
convenient to be able to specify the name of the Perl subroutine from
within the Perl script, and you'll want to use
L<call_sv|perlapi/call_sv>.

Consider the Perl code below

    sub fred
    {
        print "Hello there\n";
    }

    CallSubPV("fred");

Here is a snippet of XSUB which defines I<CallSubPV>.

    void
    CallSubPV(name)
    	char *	name
    	CODE:
	PUSHMARK(SP);
	call_pv(name, G_DISCARD|G_NOARGS);

That is fine as far as it goes. The limitation is that the Perl subroutine
can be specified only as a string, whereas Perl also allows references
to subroutines and anonymous subroutines.
This is where I<call_sv> is useful.

The code below for I<CallSubSV> is identical to I<CallSubPV> except
that the C<name> parameter is now defined as an SV* and we use
I<call_sv> instead of I<call_pv>.

    void
    CallSubSV(name)
    	SV *	name
    	CODE:
	PUSHMARK(SP);
	call_sv(name, G_DISCARD|G_NOARGS);

Because we are using an SV to call I<fred> the following can all be used:

    CallSubSV("fred");
    CallSubSV(\&fred);
    $ref = \&fred;
    CallSubSV($ref);
    CallSubSV( sub { print "Hello there\n" } );

As you can see, I<call_sv> gives you much greater flexibility in
how you can specify the Perl subroutine.

You should note that, if it is necessary to store the SV (C<name> in the
example above) which corresponds to the Perl subroutine so that it can
be used later in the program, it is not enough just to store a copy of the
pointer to the SV. Say the code above had been like this:

    static SV * rememberSub;

    void
    SaveSub1(name)
    	SV *	name
    	CODE:
	rememberSub = name;

    void
    CallSavedSub1()
    	CODE:
	PUSHMARK(SP);
	call_sv(rememberSub, G_DISCARD|G_NOARGS);

The reason this is wrong is that, by the time you come to use the
pointer C<rememberSub> in C<CallSavedSub1>, it may or may not still refer
to the Perl subroutine that was recorded in C<SaveSub1>.  This is
particularly true for these cases:

    SaveSub1(\&fred);
    CallSavedSub1();

    SaveSub1( sub { print "Hello there\n" } );
    CallSavedSub1();

By the time each of the C<SaveSub1> statements above has been executed,
the SV*s which corresponded to the parameters will no longer exist.
Expect an error message from Perl of the form

    Can't use an undefined value as a subroutine reference at ...

for each of the C<CallSavedSub1> lines.

Similarly, with this code

    $ref = \&fred;
    SaveSub1($ref);
    $ref = 47;
    CallSavedSub1();

you can expect one of these messages (which one you actually get depends on
the version of Perl you are using)

    Not a CODE reference at ...
    Undefined subroutine &main::47 called ...

The variable $ref may have referred to the subroutine C<fred>
when the call to C<SaveSub1> was made, but by the time
C<CallSavedSub1> gets called it holds the number C<47>. Because we
saved only a pointer to the original SV in C<SaveSub1>, any changes to
$ref will be tracked by the pointer C<rememberSub>. This means that
whenever C<CallSavedSub1> gets called, it will attempt to execute the
code which is referenced by the SV* C<rememberSub>.  In this case
though, it now refers to the integer C<47>, so expect Perl to complain
loudly.

A similar but more subtle problem is illustrated with this code:

    $ref = \&fred;
    SaveSub1($ref);
    $ref = \&joe;
    CallSavedSub1();

This time, when C<CallSavedSub1> gets called it will execute the Perl
subroutine C<joe> (assuming it exists) rather than C<fred> as was
originally requested in the call to C<SaveSub1>.

To get around these problems it is necessary to take a full copy of the
SV.  The code below shows C<SaveSub2> modified to do that.

    /* this isn't thread-safe */
    static SV * keepSub = (SV*)NULL;

    void
    SaveSub2(name)
        SV *	name
    	CODE:
     	/* Take a copy of the callback */
    	if (keepSub == (SV*)NULL)
    	    /* First time, so create a new SV */
	    keepSub = newSVsv(name);
    	else
    	    /* Been here before, so overwrite */
	    SvSetSV(keepSub, name);

    void
    CallSavedSub2()
    	CODE:
	PUSHMARK(SP);
	call_sv(keepSub, G_DISCARD|G_NOARGS);

To avoid creating a new SV every time C<SaveSub2> is called,
the function first checks to see if it has been called before.  If not,
then space for a new SV is allocated and the reference to the Perl
subroutine C<name> is copied to the variable C<keepSub> in one
operation using C<newSVsv>.  Thereafter, whenever C<SaveSub2> is called,
the existing SV, C<keepSub>, is overwritten with the new value using
C<SvSetSV>.
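
The difference between remembering a pointer and keeping a copy can be
seen without any Perl API at all.  The following is a plain C analogy,
not XS code: C<Slot> is a hypothetical stand-in for an SV, and the
function names are invented for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for an SV: a slot whose contents can change. */
typedef struct { int code_id; } Slot;

static const Slot *remembered_ptr = NULL; /* SaveSub1 style: alias */
static Slot        kept_copy;             /* SaveSub2 style: own copy */
static int         have_copy = 0;

static void save_by_pointer(const Slot *s) { remembered_ptr = s; }
static void save_by_copy(const Slot *s)    { kept_copy = *s; have_copy = 1; }

static int call_saved_pointer(void)
{
    return remembered_ptr ? remembered_ptr->code_id : -1;
}

static int call_saved_copy(void)
{
    return have_copy ? kept_copy.code_id : -1;
}

/* Returns 1 if the pointer followed the later change while the copy
 * kept the value that was current when it was saved. */
static int demo_pointer_vs_copy(void)
{
    Slot ref = { 1 };              /* like $ref = \&fred (1 means fred) */
    int  ok;

    save_by_pointer(&ref);
    save_by_copy(&ref);
    ref.code_id = 2;               /* like $ref = \&joe (2 means joe) */

    ok = call_saved_pointer() == 2 && call_saved_copy() == 1;
    remembered_ptr = NULL;         /* ref is about to go out of scope */
    return ok;
}
```

The saved pointer sees whatever the caller's variable holds at call
time, which is the behaviour described for C<SaveSub1>; the copy is
unaffected, which is what C<SaveSub2> relies on.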

Note: using a static or global variable to store the SV isn't
thread-safe.  You can either use the C<MY_CXT> mechanism documented in
L<perlxs/Safely Storing Static Data in XS> which is fast, or store the
values in perl global variables, using get_sv(), which is much slower.

=head2 Using call_argv

Here is a Perl subroutine which prints whatever parameters are passed
to it.

    sub PrintList
    {
        my(@list) = @_;

        foreach (@list) { print "$_\n" }
    }

And here is an example of I<call_argv> which will call
I<PrintList>.

    static char * words[] = {"alpha", "beta", "gamma", "delta", NULL};

    static void
    call_PrintList()
    {
        call_argv("PrintList", G_DISCARD, words);
    }

Note that it is not necessary to call C<PUSHMARK> in this instance.
This is because I<call_argv> will do it for you.
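
Note also that I<call_argv> is given no length: the array in the
example ends with a C<NULL> entry, and that terminator is what marks
the end of the list.  The plain C sketch below shows how such a list is
walked; C<count_args> is a hypothetical helper, not part of the Perl
API.

```c
#include <stddef.h>

/* Count the entries in a NULL-terminated array of strings. */
static size_t count_args(char **argv)
{
    size_t n = 0;

    while (argv != NULL && argv[n] != NULL)
        ++n;
    return n;
}

/* Same shape as the array passed to call_argv above. */
static char *words[] = { "alpha", "beta", "gamma", "delta", NULL };
```

Leaving out the final C<NULL> would let a loop like this run past the
end of the array.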

=head2 Using call_method

Consider the following Perl code:

    {
        package Mine;

        sub new
        {
            my($type) = shift;
            bless [@_]
        }

        sub Display
        {
            my ($self, $index) = @_;
            print "$index: $$self[$index]\n";
        }

        sub PrintID
        {
            my($class) = @_;
            print "This is Class $class version 1.0\n";
        }
    }

It implements just a very simple class to manage an array.  Apart from
the constructor, C<new>, it declares two methods, one static and one
virtual. The static method, C<PrintID>, simply prints out the class
name and a version number. The virtual method, C<Display>, prints out a
single element of the array.  Here is an all-Perl example of using it.

    $a = Mine->new('red', 'green', 'blue');
    $a->Display(1);
    Mine->PrintID;

will print

    1: green
    This is Class Mine version 1.0

Calling a Perl method from C is fairly straightforward. The following
things are required:

=over 5

=item *

A reference to the object for a virtual method or the name of the class
for a static method

=item *

The name of the method

=item *

Any other parameters specific to the method

=back

Here is a simple XSUB which illustrates the mechanics of calling both
the C<PrintID> and C<Display> methods from C.

    void
    call_Method(ref, method, index)
        SV *	ref
        char *	method
        int		index
        CODE:
        PUSHMARK(SP);
        EXTEND(SP, 2);
        PUSHs(ref);
        PUSHs(sv_2mortal(newSViv(index)));
        PUTBACK;

        call_method(method, G_DISCARD);

    void
    call_PrintID(class, method)
        char *	class
        char *	method
        CODE:
        PUSHMARK(SP);
        XPUSHs(sv_2mortal(newSVpv(class, 0)));
        PUTBACK;

        call_method(method, G_DISCARD);


So the methods C<PrintID> and C<Display> can be invoked like this:

    $a = Mine->new('red', 'green', 'blue');
    call_Method($a, 'Display', 1);
    call_PrintID('Mine', 'PrintID');

The only thing to note is that, in both the static and virtual methods,
the method name is not passed via the stack--it is used as the first
parameter to I<call_method>.

=head2 Using GIMME_V

Here is a trivial XSUB which prints the context in which it is
currently executing.

    void
    PrintContext()
        CODE:
        U8 gimme = GIMME_V;
        if (gimme == G_VOID)
            printf ("Context is Void\n");
        else if (gimme == G_SCALAR)
            printf ("Context is Scalar\n");
        else
            printf ("Context is Array\n");

And here is some Perl to test it.

    PrintContext;
    $a = PrintContext;
    @a = PrintContext;

The output from that will be

    Context is Void
    Context is Scalar
    Context is Array

=head2 Using Perl to Dispose of Temporaries

In the examples given to date, any temporaries created in the callback
(i.e., parameters passed on the stack to the I<call_*> function or
values returned via the stack) have been freed by one of these methods:

=over 5

=item *

Specifying the G_DISCARD flag with I<call_*>

=item *

Explicitly using the C<ENTER>/C<SAVETMPS>--C<FREETMPS>/C<LEAVE> pairing

=back

There is another method which can be used, namely letting Perl do it
for you automatically whenever it regains control after the callback
has terminated.  This is done by simply not using the

    ENTER;
    SAVETMPS;
    ...
    FREETMPS;
    LEAVE;

sequence in the callback (and not, of course, specifying the G_DISCARD
flag).

If you are going to use this method you have to be aware of a possible
memory leak which can arise under very specific circumstances.  To
explain these circumstances you need to know a bit about the flow of
control between Perl and the callback routine.

The examples given at the start of the document (an error handler and
an event driven program) are typical of the two main sorts of flow
control that you are likely to encounter with callbacks.  There is a
very important distinction between them, so pay attention.

In the first example, an error handler, the flow of control could be as
follows.  You have created an interface to an external library.
Control can reach the external library like this

    perl --> XSUB --> external library

Whilst control is in the library, an error condition occurs. You have
previously set up a Perl callback to handle this situation, so it will
get executed. Once the callback has finished, control will drop back to
Perl again.  Here is what the flow of control will be like in that
situation

    perl --> XSUB --> external library
                      ...
                      error occurs
                      ...
                      external library --> call_* --> perl
                                                        |
    perl <-- XSUB <-- external library <-- call_* <-----+

After processing of the error using I<call_*> is completed,
control reverts back to Perl more or less immediately.

In the diagram, the further right you go the more deeply nested the
scope is.  It is only when control is back with perl on the extreme
left of the diagram that you will have dropped back to the enclosing
scope and any temporaries you have left hanging around will be freed.

In the second example, an event driven program, the flow of control
will be more like this

    perl --> XSUB --> event handler
                      ...
                      event handler --> call_* --> perl
                                                     |
                      event handler <-- call_* <-----+
                      ...
                      event handler --> call_* --> perl
                                                     |
                      event handler <-- call_* <-----+
                      ...
                      event handler --> call_* --> perl
                                                     |
                      event handler <-- call_* <-----+

In this case the flow of control can consist of only the repeated
sequence

    event handler --> call_* --> perl

for practically the complete duration of the program.  This means that
control may I<never> drop back to the surrounding scope in Perl at the
extreme left.

So what is the big problem? Well, if you are expecting Perl to tidy up
those temporaries for you, you might be in for a long wait.  For Perl
to dispose of your temporaries, control must drop back to the
enclosing scope at some stage.  In the event driven scenario that may
never happen.  This means that, as time goes on, your program will
create more and more temporaries, none of which will ever be freed. As
each of these temporaries consumes some memory your program will
eventually consume all the available memory in your system--kapow!

So here is the bottom line--if you are sure that control will revert
back to the enclosing Perl scope fairly quickly after the end of your
callback, then it isn't absolutely necessary to dispose explicitly of
any temporaries you may have created. Mind you, if you are at all
uncertain about what to do, it doesn't do any harm to tidy up anyway.
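
The effect can be illustrated with a toy model in plain C.  This is
only a sketch: it counts temporaries instead of allocating SVs, and
C<save_tmps>, C<free_tmps> and C<run_event_loop> are hypothetical
helpers that merely play the roles of C<ENTER>/C<SAVETMPS> and
C<FREETMPS>/C<LEAVE>.

```c
#include <stddef.h>

/* Toy model only: counts temporaries instead of allocating SVs. */
static size_t tmps_live  = 0;   /* temporaries currently alive */
static size_t tmps_floor = 0;   /* level recorded by the last save */

static void new_mortal(void) { ++tmps_live; }

/* Plays the role of ENTER; SAVETMPS: remember the current level. */
static size_t save_tmps(void)
{
    size_t old_floor = tmps_floor;

    tmps_floor = tmps_live;
    return old_floor;
}

/* Plays the role of FREETMPS; LEAVE: drop everything above the level. */
static void free_tmps(size_t old_floor)
{
    tmps_live  = tmps_floor;
    tmps_floor = old_floor;
}

/* One callback invocation that creates two temporaries. */
static void callback(int tidy)
{
    size_t old_floor = 0;

    if (tidy)
        old_floor = save_tmps();
    new_mortal();
    new_mortal();
    if (tidy)
        free_tmps(old_floor);
}

/* Run `events` callbacks without ever returning to the outer scope;
 * the result is the number of temporaries still alive afterwards. */
static size_t run_event_loop(int events, int tidy)
{
    int i;

    tmps_live = tmps_floor = 0;
    for (i = 0; i < events; ++i)
        callback(tidy);
    return tmps_live;
}
```

In this model a loop of 1000 untidy callbacks leaves 2000 temporaries
alive, while the tidy variant leaves none, however long the loop runs.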


=head2 Strategies for Storing Callback Context Information


Potentially one of the trickiest problems to overcome when designing a
callback interface can be figuring out how to store the mapping between
the C callback function and the Perl equivalent.

To help understand why this can be a real problem first consider how a
callback is set up in an all C environment.  Typically a C API will
provide a function to register a callback.  This will expect a pointer
to a function as one of its parameters.  Below is a call to a
hypothetical function C<register_fatal> which registers the C function
to get called when a fatal error occurs.

    register_fatal(cb1);

The single parameter C<cb1> is a pointer to a function, so you must
have defined C<cb1> in your code, say something like this

    static void
    cb1()
    {
        printf ("Fatal Error\n");
        exit(1);
    }

Now change that to call a Perl subroutine instead

    static SV * callback = (SV*)NULL;

    static void
    cb1()
    {
        dSP;

        PUSHMARK(SP);

        /* Call the Perl sub to process the callback */
        call_sv(callback, G_DISCARD);
    }


    void
    register_fatal(fn)
        SV *	fn
        CODE:
        /* Remember the Perl sub */
        if (callback == (SV*)NULL)
            callback = newSVsv(fn);
        else
            SvSetSV(callback, fn);

        /* register the callback with the external library */
        register_fatal(cb1);

where the Perl equivalent of C<register_fatal> and the callback it
registers, C<pcb1>, might look like this

    # Register the sub pcb1
    register_fatal(\&pcb1);

    sub pcb1
    {
        die "I'm dying...\n";
    }

The mapping between the C callback and the Perl equivalent is stored in
the global variable C<callback>.

This will be adequate if you only ever need to have one callback
registered at any time. An example could be an error handler like the
code sketched out above. Remember though, repeated calls to
C<register_fatal> will replace the previously registered callback
function with the new one.

Say for example you want to interface to a library which allows asynchronous
file i/o.  In this case you may be able to register a callback that is
invoked whenever a read operation has completed. To be of any use we want
to be able to call separate Perl subroutines for each file that is opened.  As it
stands, the error handler example above would not be adequate as it
allows only a single callback to be defined at any time. What we
require is a means of storing the mapping between the opened file and
the Perl subroutine we want to be called for that file.

Say the i/o library has a function C<asynch_read> which associates a C
function C<ProcessRead> with a file handle C<fh>--this assumes that it
has also provided some routine to open the file and so obtain the file
handle.

    asynch_read(fh, ProcessRead)

This may expect the C I<ProcessRead> function to be of this form

    void
    ProcessRead(fh, buffer)
    int	fh;
    char *	buffer;
    {
         ...
    }

To provide a Perl interface to this library we need to be able to map
between the C<fh> parameter and the Perl subroutine we want called.  A
hash is a convenient mechanism for storing this mapping.  The code
below shows a possible implementation

    static HV * Mapping = (HV*)NULL;

    void
    asynch_read(fh, callback)
        int	fh
        SV *	callback
        CODE:
        /* If the hash doesn't already exist, create it */
        if (Mapping == (HV*)NULL)
            Mapping = newHV();

        /* Save the fh -> callback mapping */
        hv_store(Mapping, (char*)&fh, sizeof(fh), newSVsv(callback), 0);

        /* Register with the C Library */
        asynch_read(fh, asynch_read_if);

and C<asynch_read_if> could look like this

    static void
    asynch_read_if(fh, buffer)
    int	fh;
    char *	buffer;
    {
        dSP;
        SV ** sv;

        /* Get the callback associated with fh */
        sv =  hv_fetch(Mapping, (char*)&fh , sizeof(fh), FALSE);
        if (sv == (SV**)NULL)
            croak("Internal error...\n");

        PUSHMARK(SP);
        EXTEND(SP, 2);
        PUSHs(sv_2mortal(newSViv(fh)));
        PUSHs(sv_2mortal(newSVpv(buffer, 0)));
        PUTBACK;

        /* Call the Perl sub */
        call_sv(*sv, G_DISCARD);
    }

For completeness, here is C<asynch_close>.  This shows how to remove
the entry from the hash C<Mapping>.

    void
    asynch_close(fh)
        int	fh
        CODE:
        /* Remove the entry from the hash */
        (void) hv_delete(Mapping, (char*)&fh, sizeof(fh), G_DISCARD);

        /* Now call the real asynch_close */
        asynch_close(fh);

So the Perl interface would look like this

    sub callback1
    {
        my($handle, $buffer) = @_;
    }

    # Register the Perl callback
    asynch_read($fh, \&callback1);

    asynch_close($fh);

The mapping between the C callback and Perl is stored in the global
hash C<Mapping> this time. Using a hash has the distinct advantage that
it allows an unlimited number of callbacks to be registered.
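
One detail of the code above deserves a note: the hash key is
C<(char*)&fh> with length C<sizeof(fh)>, that is, the raw bytes of the
integer rather than its decimal text.  The plain C sketch below shows
the same keying scheme with a small fixed-size table standing in for
the HV.  The table, C<map_store>, C<map_fetch> and C<map_delete> are
hypothetical, and unlike a real hash the table here is bounded.

```c
#include <string.h>

#define MAX_ENTRIES 8

struct Entry {
    char key[sizeof(int)];  /* raw bytes of the file handle */
    int  in_use;
    int  value;             /* stand-in for the stored SV* */
};

static struct Entry table[MAX_ENTRIES];

/* Look up an entry by comparing the raw key bytes. */
static struct Entry *find_entry(int fh)
{
    int i;

    for (i = 0; i < MAX_ENTRIES; ++i)
        if (table[i].in_use && memcmp(table[i].key, &fh, sizeof(fh)) == 0)
            return &table[i];
    return NULL;
}

/* Store or overwrite; returns 0 if the table is full. */
static int map_store(int fh, int value)
{
    struct Entry *e = find_entry(fh);
    int i;

    for (i = 0; e == NULL && i < MAX_ENTRIES; ++i)
        if (!table[i].in_use)
            e = &table[i];
    if (e == NULL)
        return 0;

    memcpy(e->key, &fh, sizeof(fh));
    e->in_use = 1;
    e->value  = value;
    return 1;
}

/* Returns a pointer to the stored value, or NULL if fh is unknown. */
static int *map_fetch(int fh)
{
    struct Entry *e = find_entry(fh);

    return e ? &e->value : NULL;
}

static void map_delete(int fh)
{
    struct Entry *e = find_entry(fh);

    if (e != NULL)
        e->in_use = 0;
}
```

Because the key is a byte image of the C<int>, it is not printable and
its layout depends on the platform, so it is only meaningful to the
same process that stored it.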

What if the interface provided by the C callback doesn't contain a
parameter which allows the file handle to Perl subroutine mapping?  Say
in the asynchronous i/o package, the callback function gets passed only
the C<buffer> parameter like this

    void
    ProcessRead(buffer)
    char *	buffer;
    {
        ...
    }

Without the file handle there is no straightforward way to map from the
C callback to the Perl subroutine.

In this case a possible way around this problem is to predefine a
series of C functions to act as the interface to Perl, thus

    #define MAX_CB		3
    #define NULL_HANDLE	-1
    typedef void (*FnMap)();

    struct MapStruct {
        FnMap    Function;
        SV *     PerlSub;
        int      Handle;
      };

    static void  fn1();
    static void  fn2();
    static void  fn3();

    static struct MapStruct Map [MAX_CB] =
        {
            { fn1, NULL, NULL_HANDLE },
            { fn2, NULL, NULL_HANDLE },
            { fn3, NULL, NULL_HANDLE }
        };

    static void
    Pcb(index, buffer)
    int index;
    char * buffer;
    {
        dSP;

        PUSHMARK(SP);
        XPUSHs(sv_2mortal(newSVpv(buffer, 0)));
        PUTBACK;

        /* Call the Perl sub */
        call_sv(Map[index].PerlSub, G_DISCARD);
    }

    static void
    fn1(buffer)
    char * buffer;
    {
        Pcb(0, buffer);
    }

    static void
    fn2(buffer)
    char * buffer;
    {
        Pcb(1, buffer);
    }

    static void
    fn3(buffer)
    char * buffer;
    {
        Pcb(2, buffer);
    }

    void
    array_asynch_read(fh, callback)
        int		fh
        SV *	callback
        CODE:
        int index;
        int null_index = MAX_CB;

        /* Find the same handle or an empty entry */
        for (index = 0; index < MAX_CB; ++index)
        {
            if (Map[index].Handle == fh)
                break;

            if (Map[index].Handle == NULL_HANDLE)
                null_index = index;
        }

        if (index == MAX_CB && null_index == MAX_CB)
            croak ("Too many callback functions registered\n");

        if (index == MAX_CB)
            index = null_index;

        /* Save the file handle */
        Map[index].Handle = fh;

        /* Remember the Perl sub */
        if (Map[index].PerlSub == (SV*)NULL)
            Map[index].PerlSub = newSVsv(callback);
        else
            SvSetSV(Map[index].PerlSub, callback);

        asynch_read(fh, Map[index].Function);

    void
    array_asynch_close(fh)
        int	fh
        CODE:
        int index;

        /* Find the file handle */
        for (index = 0; index < MAX_CB; ++ index)
            if (Map[index].Handle == fh)
                break;

        if (index == MAX_CB)
            croak ("could not close fh %d\n", fh);

        Map[index].Handle = NULL_HANDLE;
        SvREFCNT_dec(Map[index].PerlSub);
        Map[index].PerlSub = (SV*)NULL;

        asynch_close(fh);

In this case the functions C<fn1>, C<fn2>, and C<fn3> are used to
remember the Perl subroutine to be called. Each of the functions holds
a separate hard-wired index which is used in the function C<Pcb> to
access the C<Map> array and actually call the Perl subroutine.
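
The slot search at the top of C<array_asynch_read> can be isolated as a
small pure function, which makes its three outcomes easier to see.
This is a sketch in plain C rather than XS; C<find_slot> is a
hypothetical helper, and unlike the loop above it remembers the first
free slot rather than the last one.

```c
#define MAX_CB      3
#define NULL_HANDLE (-1)

/* Return the slot already holding fh, else a free slot, else -1. */
static int find_slot(const int handles[MAX_CB], int fh)
{
    int free_slot = -1;
    int i;

    for (i = 0; i < MAX_CB; ++i) {
        if (handles[i] == fh)
            return i;                /* re-registering an open handle */
        if (handles[i] == NULL_HANDLE && free_slot < 0)
            free_slot = i;           /* remember the first free slot */
    }
    return free_slot;                /* -1 means every slot is taken */
}
```

A result of -1 corresponds to the "Too many callback functions
registered" case in the XSUB.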

There are some obvious disadvantages with this technique.

Firstly, the code is considerably more complex than with the previous
example.

Secondly, there is a hard-wired limit (in this case 3) to the number of
callbacks that can exist simultaneously. The only way to increase the
limit is by modifying the code to add more functions and then
recompiling.  Nonetheless, as long as the number of functions is
chosen with some care, it is still a workable solution and in some
cases is the only one available.
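
Raising the limit still means recompiling, but the repetitive functions
themselves can be generated by the preprocessor.  The following plain C
sketch assumes a stand-in C<Pcb> that just records which slot fired
instead of calling Perl; C<DEFINE_TRAMPOLINE>, C<trampolines>,
C<last_index> and C<last_buffer> are hypothetical names.

```c
#include <stddef.h>

#define MAX_CB 4

typedef void (*FnMap)(char *buffer);

/* Stand-in for Pcb: records which slot fired instead of calling Perl. */
static int         last_index  = -1;
static const char *last_buffer = NULL;

static void Pcb(int index, char *buffer)
{
    last_index  = index;
    last_buffer = buffer;
}

/* One trampoline per slot, each with its index baked in. */
#define DEFINE_TRAMPOLINE(n) \
    static void fn##n(char *buffer) { Pcb(n, buffer); }

DEFINE_TRAMPOLINE(0)
DEFINE_TRAMPOLINE(1)
DEFINE_TRAMPOLINE(2)
DEFINE_TRAMPOLINE(3)

/* The table must list exactly MAX_CB generated functions. */
static FnMap trampolines[MAX_CB] = { fn0, fn1, fn2, fn3 };
```

Adding a slot then takes one more C<DEFINE_TRAMPOLINE> line, one more
table entry and a larger C<MAX_CB>.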

To summarize, here are a number of possible methods for you to consider
for storing the mapping between C and the Perl callback

=over 5

=item 1. Ignore the problem - Allow only 1 callback

For a lot of situations, like interfacing to an error handler, this may
be a perfectly adequate solution.

=item 2. Create a sequence of callbacks - hard wired limit

If it is impossible to tell from the parameters passed back from the C
callback what the context is, then you may need to create a sequence of C
callback interface functions, and store pointers to each in an array.

=item 3. Use a parameter to map to the Perl callback

A hash is an ideal mechanism to store the mapping between C and Perl.

=back


=head2 Alternate Stack Manipulation


Although I have made use of only the C<POP*> macros to access values
returned from Perl subroutines, it is also possible to bypass these
macros and read the stack using the C<ST> macro (see L<perlxs> for a
full description of the C<ST> macro).

Most of the time the C<POP*> macros should be adequate; the main
problem with them is that they force you to process the returned values
in sequence. This may not be the most suitable way to process the
values in some cases. What we want is to be able to access the stack in
a random order. The C<ST> macro as used when coding an XSUB is ideal
for this purpose.

The code below is the example given in the section L</Returning a List
of Values> recoded to use C<ST> instead of C<POP*>.

    static void
    call_AddSubtract2(a, b)
    int a;
    int b;
    {
        dSP;
        I32 ax;
        int count;

        ENTER;
        SAVETMPS;

        PUSHMARK(SP);
        EXTEND(SP, 2);
        PUSHs(sv_2mortal(newSViv(a)));
        PUSHs(sv_2mortal(newSViv(b)));
        PUTBACK;

        count = call_pv("AddSubtract", G_ARRAY);

        SPAGAIN;
        SP -= count;
        ax = (SP - PL_stack_base) + 1;

        if (count != 2)
            croak("Big trouble\n");

        printf ("%d + %d = %d\n", a, b, SvIV(ST(0)));
        printf ("%d - %d = %d\n", a, b, SvIV(ST(1)));

        PUTBACK;
        FREETMPS;
        LEAVE;
    }

Notes

=over 5

=item 1.

Notice that it was necessary to define the variable C<ax>.  This is
because the C<ST> macro expects it to exist.  If we were in an XSUB it
would not be necessary to define C<ax> as it is already defined for
us.

=item 2.

The code

        SPAGAIN;
        SP -= count;
        ax = (SP - PL_stack_base) + 1;

sets the stack up so that we can use the C<ST> macro.

=item 3.

Unlike the original coding of this example, the returned
values are not accessed in reverse order.  So C<ST(0)> refers to the
first value returned by the Perl subroutine and C<ST(count-1)>
refers to the last.

=back
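
The arithmetic behind C<ax> can be tried out on a simplified model of
the stack.  In the plain C sketch below the stack holds C<int> values
rather than C<SV*>, C<stack_base> stands in for C<PL_stack_base>, and
C<fake_call_add_subtract> and C<result_at> are hypothetical helpers;
only the C<ST> definition mirrors the real macro.

```c
#include <stddef.h>

static int       stack_base[16];   /* stand-in for PL_stack_base */
static int      *sp = stack_base;  /* points at the top element */
static ptrdiff_t ax;               /* index of the first returned item */

#define ST(off) stack_base[ax + (off)]

static void push(int v) { *++sp = v; }

/* Model of a callee that leaves two results on the stack. */
static int fake_call_add_subtract(int a, int b)
{
    push(a + b);
    push(a - b);
    return 2;
}

/* Make the call, set up ax as in the example above, and read item n. */
static int result_at(int a, int b, int n)
{
    int count = fake_call_add_subtract(a, b);

    sp -= count;                   /* step back over the returned items */
    ax  = (sp - stack_base) + 1;   /* first returned item */
    return ST(n);
}
```

With these definitions C<result_at(7, 4, 0)> yields the sum and
C<result_at(7, 4, 1)> the difference, in the order the callee produced
them.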

=head2 Creating and Calling an Anonymous Subroutine in C

As we've already shown, C<call_sv> can be used to invoke an
anonymous subroutine.  However, our example showed a Perl script
invoking an XSUB to perform this operation.  Let's see how it can be
done inside our C code:

 ...

 SV *cvrv
    = eval_pv("sub {"
              "  print 'You will not find me cluttering any namespace!'"
              "}", TRUE);

 ...

 call_sv(cvrv, G_VOID|G_NOARGS);

C<eval_pv> is used to compile the anonymous subroutine, which
will be the return value as well (read more about C<eval_pv> in
L<perlapi/eval_pv>).  Once this code reference is in hand, it
can be mixed in with all the previous examples we've shown.

=head1 LIGHTWEIGHT CALLBACKS

Sometimes you need to invoke the same subroutine repeatedly.
This usually happens with a function that acts on a list of
values, such as Perl's built-in sort(). You can pass a
comparison function to sort(), which will then be invoked
for every pair of values that needs to be compared. The first()
and reduce() functions from L<List::Util> follow a similar
pattern.

In this case it is possible to speed up the routine (often
quite substantially) by using the lightweight callback API.
The idea is that the calling context only needs to be
created and destroyed once, and the sub can be called
arbitrarily many times in between.

It is usual to pass parameters using global variables (typically
$_ for one parameter, or $a and $b for two parameters) rather
than via @_. (It is possible to use the @_ mechanism if you know
what you're doing, though there is as yet no supported API for
it. It's also inherently slower.)

The pattern of macro calls is like this:

    dMULTICALL;			/* Declare local variables */
    U8 gimme = G_SCALAR;	/* context of the call: G_SCALAR,
				 * G_ARRAY, or G_VOID */

    PUSH_MULTICALL(cv);		/* Set up the context for calling cv,
				   and set local vars appropriately */

    /* loop */ {
        /* set the value(s) of your parameter variables */
        MULTICALL;		/* Make the actual call */
    } /* end of loop */

    POP_MULTICALL;		/* Tear down the calling context */

For some concrete examples, see the implementation of the
first() and reduce() functions of List::Util 1.18. There you
will also find a header file that emulates the multicall API
on older versions of perl.

=head1 SEE ALSO

L<perlxs>, L<perlguts>, L<perlembed>

=head1 AUTHOR

Paul Marquess 

Special thanks to the following people who assisted in the creation of
the document.

Jeff Okamoto, Tim Bunce, Nick Gianniotis, Steve Kelem, Gurusamy Sarathy
and Larry Wall.

=head1 DATE

Last updated for perl 5.23.1.
perldelta.pod000064400000015627150344123450007240 0ustar00=encoding utf8

=head1 NAME

perldelta - what is new for perl v5.26.3

=head1 DESCRIPTION

This document describes differences between the 5.26.2 release and the 5.26.3
release.

If you are upgrading from an earlier release such as 5.26.1, first read
L<perl5262delta>, which describes differences between 5.26.1 and 5.26.2.

=head1 Security

=head2 [CVE-2018-12015] Directory traversal in module Archive::Tar

By default, L<Archive::Tar> doesn't allow extracting files outside the current
working directory.  However, this secure extraction mode could be bypassed by
putting a symlink and a regular file with the same name into the tar file.

L<[perl #133250]|https://rt.perl.org/Ticket/Display.html?id=133250>
L<[cpan #125523]|https://rt.cpan.org/Ticket/Display.html?id=125523>

=head2 [CVE-2018-18311] Integer overflow leading to buffer overflow and segmentation fault

Integer arithmetic in C<Perl_my_setenv()> could wrap when the combined length
of the environment variable name and value exceeded around 0x7fffffff.  This
could lead to writing beyond the end of an allocated buffer with
attacker-supplied data.

L<[perl #133204]|https://rt.perl.org/Ticket/Display.html?id=133204>

=head2 [CVE-2018-18312] Heap-buffer-overflow write in S_regatom (regcomp.c)

A crafted regular expression could cause a heap-buffer-overflow write during
compilation, potentially allowing arbitrary code execution.

L<[perl #133423]|https://rt.perl.org/Ticket/Display.html?id=133423>

=head2 [CVE-2018-18313] Heap-buffer-overflow read in S_grok_bslash_N (regcomp.c)

A crafted regular expression could cause a heap-buffer-overflow read during
compilation, potentially leading to sensitive information being leaked.

L<[perl #133192]|https://rt.perl.org/Ticket/Display.html?id=133192>

=head2 [CVE-2018-18314] Heap-buffer-overflow write in S_regatom (regcomp.c)

A crafted regular expression could cause a heap-buffer-overflow write during
compilation, potentially allowing arbitrary code execution.

L<[perl #131649]|https://rt.perl.org/Ticket/Display.html?id=131649>

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.26.2.  If any exist,
they are bugs, and we request that you submit a report.  See
L</Reporting Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Archive::Tar> has been upgraded from version 2.24 to 2.24_01.

=item *

L<Module::CoreList> has been upgraded from version 5.20180414_26 to 5.20181129_26.

=back

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 New Diagnostics

=head3 New Errors

=over 4

=item *

L<Unexpected ']' with no following ')' in (?[... in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Unexpected ']' with no following ')' in (?[... in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>">

(F) While parsing an extended character class a ']' character was encountered
at a point in the definition where the only legal use of ']' is to close the
character class definition as part of a '])'.  You may have forgotten the close
paren, or otherwise confused the parser.

=item *

L<Expecting close paren for nested extended charclass in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Expecting close paren for nested extended charclass in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>">

(F) While parsing a nested extended character class like:

    (?[ ... (?flags:(?[ ... ])) ... ])
                             ^

we expected to see a close paren ')' (marked by ^) but did not.

=item *

L<Expecting close paren for wrapper for nested extended charclass in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Expecting close paren for wrapper for nested extended charclass in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>">

(F) While parsing a nested extended character class like:

    (?[ ... (?flags:(?[ ... ])) ... ])
                              ^

we expected to see a close paren ')' (marked by ^) but did not.

=back

=head2 Changes to Existing Diagnostics

=over 4

=item *

L<Syntax error in (?[...]) in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Syntax error in (?[...]) in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>">

This fatal error message has been slightly expanded (from "Syntax error in
(?[...]) in regex mE<sol>%sE<sol>") for greater clarity.

=back

=head1 Acknowledgements

Perl 5.26.3 represents approximately 8 months of development since Perl 5.26.2
and contains approximately 4,500 lines of changes across 51 files from 15
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 770 lines of changes to 10 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.26.3:

Aaron Crane, Abigail, Chris 'BinGOs' Williams, Dagfinn Ilmari Mannsåker, David
Mitchell, H.Merijn Brand, James E Keenan, John SJ Anderson, Karen Etheridge,
Karl Williamson, Sawyer X, Steve Hay, Todd Rinaldo, Tony Cook, Yves Orton.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the perl bug database
at L<https://rt.perl.org/> .  There may also be information at
L<http://www.perl.org/> , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications which make it
inappropriate to send to a publicly archived mailing list, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION>
for details of how to report the issue.

=head1 Give Thanks

If you wish to thank the Perl 5 Porters for the work we have done in Perl 5,
you can do so by running the C<perlthanks> program:

    perlthanks

This will send an email to the Perl 5 Porters list with your show of thanks.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlxs.pod000064400000232110150344123450006565 0ustar00=head1 NAME

perlxs - XS language reference manual

=head1 DESCRIPTION

=head2 Introduction

XS is an interface description file format used to create an extension
interface between Perl and C code (or a C library) which one wishes
to use with Perl.  The XS interface is combined with the library to
create a new library which can then be either dynamically loaded
or statically linked into perl.  The XS interface description is
written in the XS language and is the core component of the Perl
extension interface.

Before writing XS, read the L</CAVEATS> section below.

An B<XSUB> forms the basic unit of the XS interface.  After compilation
by the B<xsubpp> compiler, each XSUB amounts to a C function definition
which will provide the glue between Perl calling conventions and C
calling conventions.

The glue code pulls the arguments from the Perl stack, converts these
Perl values to the formats expected by a C function, calls this C function,
and transfers the return values of the C function back to Perl.
Return values here may be a conventional C return value or any C
function arguments that may serve as output parameters.  These return
values may be passed back to Perl either by putting them on the
Perl stack, or by modifying the arguments supplied from the Perl side.

The above is a somewhat simplified view of what really happens.  Since
Perl allows more flexible calling conventions than C, XSUBs may do much
more in practice, such as checking input parameters for validity,
throwing exceptions (or returning undef/empty list) if the return value
from the C function indicates failure, calling different C functions
based on numbers and types of the arguments, providing an object-oriented
interface, etc.

Of course, one could write such glue code directly in C.  However, this
would be a tedious task, especially if one needs to write glue for
multiple C functions, and/or one is not familiar enough with the Perl
stack discipline and other such arcana.  XS comes to the rescue here:
instead of writing this glue C code in long-hand, one can write
a more concise short-hand I<description> of what should be done by
the glue, and let the XS compiler B<xsubpp> handle the rest.

The XS language allows one to describe the mapping between how the C
routine is used, and how the corresponding Perl routine is used.  It
also allows creation of Perl routines which are directly translated to
C code and which are not related to a pre-existing C function.  In cases
when the C interface coincides with the Perl interface, the XSUB
declaration is almost identical to a declaration of a C function (in K&R
style).  In such circumstances, there is another tool called C<h2xs>
that is able to translate an entire C header file into a corresponding
XS file that will provide glue to the functions/macros described in
the header file.

The XS compiler is called B<xsubpp>.  This compiler creates
the constructs necessary to let an XSUB manipulate Perl values, and
creates the glue necessary to let Perl call the XSUB.  The compiler
uses B<typemaps> to determine how to map C function parameters
and output values to Perl values and back.  The default typemap
(which comes with Perl) handles many common C types.  A supplementary
typemap may also be needed to handle any special structures and types
for the library being linked. For more information on typemaps,
see L<perlxstypemap>.

A file in XS format starts with a C language section which goes until the
first C<MODULE =Z<>> directive.  Other XS directives and XSUB definitions
may follow this line.  The "language" used in this part of the file
is usually referred to as the XS language.  B<xsubpp> recognizes and
skips POD (see L<perlpod>) in both the C and XS language sections, which
allows the XS file to contain embedded documentation.

See L<perlxstut> for a tutorial on the whole extension creation process.

Note: For some extensions, Dave Beazley's SWIG system may provide a
significantly more convenient mechanism for creating the extension
glue code.  See L<http://www.swig.org/> for more information.

=head2 On The Road

Many of the examples which follow will concentrate on creating an interface
between Perl and the ONC+ RPC bind library functions.  The rpcb_gettime()
function is used to demonstrate many features of the XS language.  This
function has two parameters; the first is an input parameter and the second
is an output parameter.  The function also returns a status value.

	bool_t rpcb_gettime(const char *host, time_t *timep);

From C this function will be called with the following
statements.

     #include <rpc/rpc.h>
     bool_t status;
     time_t timep;
     status = rpcb_gettime( "localhost", &timep );

If an XSUB is created to offer a direct translation between this function
and Perl, then this XSUB will be used from Perl with the following code.
The $status and $timep variables will contain the output of the function.

     use RPC;
     $status = rpcb_gettime( "localhost", $timep );

The following XS file shows an XS subroutine, or XSUB, which
demonstrates one possible interface to the rpcb_gettime()
function.  This XSUB represents a direct translation between
C and Perl and so preserves the interface even from Perl.
This XSUB will be invoked from Perl with the usage shown
above.  Note that the first three #include statements, for
C<EXTERN.h>, C<perl.h>, and C<XSUB.h>, will always be present at the
beginning of an XS file.  This approach and others will be
expanded later in this document.  A #define for C<PERL_NO_GET_CONTEXT>
should be present to fetch the interpreter context more efficiently;
see L<perlguts|perlguts/How multiple interpreters and concurrency are
supported> for details.

     #define PERL_NO_GET_CONTEXT
     #include "EXTERN.h"
     #include "perl.h"
     #include "XSUB.h"
     #include <rpc/rpc.h>

     MODULE = RPC  PACKAGE = RPC

     bool_t
     rpcb_gettime(host,timep)
          char *host
          time_t &timep
        OUTPUT:
          timep

Any extension to Perl, including those containing XSUBs,
should have a Perl module to serve as the bootstrap which
pulls the extension into Perl.  This module will export the
extension's functions and variables to the Perl program and
will cause the extension's XSUBs to be linked into Perl.
The following module will be used for most of the examples
in this document and should be used from Perl with the C<use>
command as shown earlier.  Perl modules are explained in
more detail later in this document.

     package RPC;

     require Exporter;
     require DynaLoader;
     @ISA = qw(Exporter DynaLoader);
     @EXPORT = qw( rpcb_gettime );

     bootstrap RPC;
     1;

Throughout this document a variety of interfaces to the rpcb_gettime()
XSUB will be explored.  The XSUBs will take their parameters in different
orders or will take different numbers of parameters.  In each case the
XSUB is an abstraction between Perl and the real C rpcb_gettime()
function, and the XSUB must always ensure that the real rpcb_gettime()
function is called with the correct parameters.  This abstraction will
allow the programmer to create a more Perl-like interface to the C
function.

=head2 The Anatomy of an XSUB

The simplest XSUBs consist of 3 parts: a description of the return
value, the name of the XSUB routine and the names of its arguments,
and a description of types or formats of the arguments.

The following XSUB allows a Perl program to access a C library function
called sin().  The XSUB will imitate the C function which takes a single
argument and returns a single value.

     double
     sin(x)
       double x

Optionally, one can merge the description of types and the list of
argument names, rewriting this as

     double
     sin(double x)

This makes this XSUB look similar to an ANSI C declaration.  An optional
semicolon is allowed after the argument list, as in

     double
     sin(double x);

Parameters with C pointer types can have different semantics: C functions
with similar declarations

     bool string_looks_as_a_number(char *s);
     bool make_char_uppercase(char *c);

are used in an absolutely incompatible manner.  Parameters to these functions
could be described to B<xsubpp> like this:

     char *  s
     char    &c

Both these XS declarations correspond to the C<char*> C type, but they have
different semantics; see L<"The & Unary Operator">.

It is convenient to think that the indirection operator
C<*> should be considered as a part of the type and the address operator C<&>
should be considered part of the variable.  See L<perlxstypemap>
for more info about handling qualifiers and unary operators in C types.
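
To see why the two declarations need different XS descriptions, consider
how the underlying C functions might be written.  The bodies below are
hypothetical (only the prototypes are given above); they are a sketch of
the two usage patterns, not code from any library.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

/* Reads a NUL-terminated string: 's' is an input string, which is
 * what the XS declaration "char * s" describes. */
bool string_looks_as_a_number(char *s)
{
    char *end;
    if (*s == '\0')
        return false;
    (void)strtod(s, &end);      /* 'end' points just past the number */
    return *end == '\0';        /* true only if all of 's' was consumed */
}

/* Modifies exactly one char in place: 'c' is the address of a single
 * character, which is what the XS declaration "char &c" describes. */
bool make_char_uppercase(char *c)
{
    if (!islower((unsigned char)*c))
        return false;
    *c = (char)toupper((unsigned char)*c);
    return true;
}
```

The first function walks memory until it finds a NUL byte, while the
second touches a single byte and writes a result back through the pointer,
so the glue code has to pass a whole string in one case and the address of
a one-character C variable in the other.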

The function name and the return type must be placed on
separate lines and should be flush left-adjusted.

  INCORRECT                        CORRECT

  double sin(x)                    double
    double x                       sin(x)
				     double x

The rest of the function description may be indented or left-adjusted. The
following example shows a function with its body left-adjusted.  Most
examples in this document will indent the body for better readability.

  CORRECT

  double
  sin(x)
  double x

More complicated XSUBs may contain many other sections.  Each section of
an XSUB starts with the corresponding keyword, such as INIT: or CLEANUP:.
However, the first two lines of an XSUB always contain the same data:
descriptions of the return type and the names of the function and its
parameters.  Whatever immediately follows these is considered to be
an INPUT: section unless explicitly marked with another keyword.
(See L<The INPUT: Keyword>.)

An XSUB section continues until another section-start keyword is found.

=head2 The Argument Stack

The Perl argument stack is used to store the values which are
sent as parameters to the XSUB and to store the XSUB's
return value(s).  In reality all Perl functions (including non-XSUB
ones) keep their values on this stack all at the same time, each limited
to its own range of positions on the stack.  In this document the
first position on that stack which belongs to the active
function will be referred to as position 0 for that function.

XSUBs refer to their stack arguments with the macro B<ST(x)>, where I<x>
refers to a position in this XSUB's part of the stack.  Position 0 for that
function would be known to the XSUB as ST(0).  The XSUB's incoming
parameters and outgoing return values always begin at ST(0).  For many
simple cases the B<xsubpp> compiler will generate the code necessary to
handle the argument stack by embedding code fragments found in the
typemaps.  In more complex cases the programmer must supply the code.

=head2 The RETVAL Variable

The RETVAL variable is a special C variable that is declared automatically
for you.  The C type of RETVAL matches the return type of the C library
function.  The B<xsubpp> compiler will declare this variable in each XSUB
with non-C<void> return type.  By default the generated C function
will use RETVAL to hold the return value of the C library function being
called.  In simple cases the value of RETVAL will be placed in ST(0) of
the argument stack where it can be received by Perl as the return value
of the XSUB.

If the XSUB has a return type of C<void> then the compiler will
not declare a RETVAL variable for that function.  When using
a PPCODE: section no manipulation of the RETVAL variable is required; the
section may use direct stack manipulation to place output values on the stack.

If the PPCODE: directive is not used, a C<void> return value should be used
only for subroutines which do not return a value, I<even if> a CODE:
directive is used which sets ST(0) explicitly.

Older versions of this document recommended using a C<void> return
value in such cases. It was discovered that this could lead to
segfaults in cases when the XSUB was I<truly> C<void>. This practice is
now deprecated, and may not be supported in some future version. Use
the return value C<SV *> in such cases. (Currently C<xsubpp> contains
some heuristic code which tries to disambiguate between "truly-void"
and "old-practice-declared-as-void" functions. Hence your code is at
the mercy of this heuristic unless you use C<SV *> as the return value.)

=head2 Returning SVs, AVs and HVs through RETVAL

When you're using RETVAL to return an C<SV *>, there's some magic
going on behind the scenes that should be mentioned. When you're
manipulating the argument stack using the ST(x) macro, for example,
you usually have to pay special attention to reference counts. (For
more about reference counts, see L<perlguts>.) To make your life
easier, the typemap file automatically makes C<RETVAL> mortal when
you're returning an C<SV *>. Thus, the following two XSUBs are more
or less equivalent:

  void
  alpha()
      PPCODE:
          ST(0) = newSVpv("Hello World",0);
          sv_2mortal(ST(0));
          XSRETURN(1);

  SV *
  beta()
      CODE:
          RETVAL = newSVpv("Hello World",0);
      OUTPUT:
          RETVAL

This is quite useful as it usually improves readability. While
this works fine for an C<SV *>, it's unfortunately not as easy
to have C<AV *> or C<HV *> as a return value. You I<should> be
able to write:

  AV *
  array()
      CODE:
          RETVAL = newAV();
          /* do something with RETVAL */
      OUTPUT:
          RETVAL

But due to an unfixable bug (fixing it would break lots of existing
CPAN modules) in the typemap file, the reference count of the C<AV *>
is not properly decremented. Thus, the above XSUB would leak memory
whenever it is being called. The same problem exists for C<HV *>,
C<CV *>, and C<SVREF> (which indicates a scalar reference, not
a general C<SV *>).
In XS code on perls starting with perl 5.16, you can override the
typemaps for any of these types with a version that has proper
handling of refcounts. In your C<TYPEMAP> section, do

  AV*	T_AVREF_REFCOUNT_FIXED

to get the repaired variant. For backward compatibility with older
versions of perl, you can instead decrement the reference count
manually when you're returning one of the aforementioned
types using C<sv_2mortal>:

  AV *
  array()
      CODE:
          RETVAL = newAV();
          sv_2mortal((SV*)RETVAL);
          /* do something with RETVAL */
      OUTPUT:
          RETVAL

Remember that you don't have to do this for an C<SV *>. The reference
documentation for all core typemaps can be found in L<perlxstypemap>.

=head2 The MODULE Keyword

The MODULE keyword is used to start the XS code and to specify the package
of the functions which are being defined.  All text preceding the first
MODULE keyword is considered C code and is passed through to the output with
POD stripped, but otherwise untouched.  Every XS module will have a
bootstrap function which is used to hook the XSUBs into Perl.  The package
name of this bootstrap function will match the value of the last MODULE
statement in the XS source files.  The value of MODULE should always remain
constant within the same XS file, though this is not required.

The following example will start the XS code and will place
all functions in a package named RPC.

     MODULE = RPC

=head2 The PACKAGE Keyword

When functions within an XS source file must be separated into packages
the PACKAGE keyword should be used.  This keyword is used with the MODULE
keyword and must follow immediately after it when used.

     MODULE = RPC  PACKAGE = RPC

     [ XS code in package RPC ]

     MODULE = RPC  PACKAGE = RPCB

     [ XS code in package RPCB ]

     MODULE = RPC  PACKAGE = RPC

     [ XS code in package RPC ]

The same package name can be used more than once, allowing for
non-contiguous code. This is useful if you have a stronger ordering
principle than package names.

Although this keyword is optional and in some cases provides redundant
information it should always be used.  This keyword will ensure that the
XSUBs appear in the desired package.

=head2 The PREFIX Keyword

The PREFIX keyword designates prefixes which should be
removed from the Perl function names.  If the C function is
C<rpcb_gettime()> and the PREFIX value is C<rpcb_> then Perl will
see this function as C<gettime()>.

This keyword should follow the PACKAGE keyword when used.
If PACKAGE is not used then PREFIX should follow the MODULE
keyword.

     MODULE = RPC  PREFIX = rpc_

     MODULE = RPC  PACKAGE = RPCB  PREFIX = rpcb_
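
The effect of PREFIX on names can be modelled in a few lines of C.  This
is only an illustration of the naming rule described above: B<xsubpp>
itself is written in Perl, and the helper name C<perl_name_for> is
hypothetical.

```c
#include <string.h>

/* Returns the name Perl would see for the C function 'c_name' when the
 * given PREFIX is in effect: the prefix is dropped if the name starts
 * with it, otherwise the name is returned unchanged. */
const char *perl_name_for(const char *c_name, const char *prefix)
{
    size_t plen = strlen(prefix);
    if (plen > 0 && strncmp(c_name, prefix, plen) == 0)
        return c_name + plen;
    return c_name;
}
```

With the prefix C<rpcb_>, the name C<rpcb_gettime> becomes C<gettime>,
while a name that does not begin with the prefix is left alone.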

=head2 The OUTPUT: Keyword

The OUTPUT: keyword indicates that certain function parameters should be
updated (new values made visible to Perl) when the XSUB terminates or that
certain values should be returned to the calling Perl function.  For
simple functions which have no CODE: or PPCODE: section,
such as the sin() function above, the RETVAL variable is
automatically designated as an output value.  For more complex functions
the B<xsubpp> compiler will need help to determine which variables are output
variables.

This keyword will normally be used to complement the CODE:  keyword.
The RETVAL variable is not recognized as an output variable when the
CODE: keyword is present.  The OUTPUT:  keyword is used in this
situation to tell the compiler that RETVAL really is an output
variable.

The OUTPUT: keyword can also be used to indicate that function parameters
are output variables.  This may be necessary when a parameter has been
modified within the function and the programmer would like the update to
be seen by Perl.

     bool_t
     rpcb_gettime(host,timep)
          char *host
          time_t &timep
        OUTPUT:
          timep

The OUTPUT: keyword will also allow an output parameter to
be mapped to a matching piece of code rather than to a
typemap.

     bool_t
     rpcb_gettime(host,timep)
          char *host
          time_t &timep
        OUTPUT:
          timep sv_setnv(ST(1), (double)timep);

B<xsubpp> emits an automatic C<SvSETMAGIC()> for all parameters in the
OUTPUT section of the XSUB, except RETVAL.  This is the usually desired
behavior, as it takes care of properly invoking 'set' magic on output
parameters (needed for hash or array element parameters that must be
created if they didn't exist).  If for some reason, this behavior is
not desired, the OUTPUT section may contain a C<SETMAGIC: DISABLE> line
to disable it for the remainder of the parameters in the OUTPUT section.
Likewise,  C<SETMAGIC: ENABLE> can be used to reenable it for the
remainder of the OUTPUT section.  See L<perlguts> for more details
about 'set' magic.

=head2 The NO_OUTPUT Keyword

The NO_OUTPUT keyword can be placed as the first token of the XSUB.  This keyword
indicates that while the C subroutine we provide an interface to has
a non-C<void> return type, the return value of this C subroutine should not
be returned from the generated Perl subroutine.

With this keyword present L<The RETVAL Variable> is created, and in the
generated call to the subroutine this variable is assigned to, but the value
of this variable is not going to be used in the auto-generated code.

This keyword makes sense only if C<RETVAL> is going to be accessed by the
user-supplied code.  It is useful for making a function interface more
Perl-like, especially when the C return value is just an error condition
indicator.  For example,

  NO_OUTPUT int
  delete_file(char *name)
    POSTCALL:
      if (RETVAL != 0)
	  croak("Error %d while deleting file '%s'", RETVAL, name);

Here the generated XS function returns nothing on success, and will die()
with a meaningful error message on error.

=head2 The CODE: Keyword

This keyword is used in more complicated XSUBs which require
special handling for the C function.  The RETVAL variable is
still declared, but it will not be returned unless it is specified
in the OUTPUT: section.

The following XSUB is for a C function which requires special handling of
its parameters.  The Perl usage is given first.

     $status = rpcb_gettime( "localhost", $timep );

The XSUB follows.

     bool_t
     rpcb_gettime(host,timep)
          char *host
          time_t timep
        CODE:
               RETVAL = rpcb_gettime( host, &timep );
        OUTPUT:
          timep
          RETVAL

=head2 The INIT: Keyword

The INIT: keyword allows initialization to be inserted into the XSUB before
the compiler generates the call to the C function.  Unlike the CODE: keyword
above, this keyword does not affect the way the compiler handles RETVAL.

    bool_t
    rpcb_gettime(host,timep)
          char *host
          time_t &timep
	INIT:
	  printf("# Host is %s\n", host );
        OUTPUT:
          timep

Another use for the INIT: section is to check for preconditions before
making a call to the C function:

    long long
    lldiv(a,b)
	long long a
	long long b
      INIT:
	if (a == 0 && b == 0)
	    XSRETURN_UNDEF;
	if (b == 0)
	    croak("lldiv: cannot divide by 0");

=head2 The NO_INIT Keyword

The NO_INIT keyword is used to indicate that a function
parameter is being used only as an output value.  The B<xsubpp>
compiler will normally generate code to read the values of
all function parameters from the argument stack and assign
them to C variables upon entry to the function.  NO_INIT
will tell the compiler that some parameters will be used for
output rather than for input and that they will be handled
before the function terminates.

The following example shows a variation of the rpcb_gettime() function.
This function uses the timep variable only as an output variable and does
not care about its initial contents.

     bool_t
     rpcb_gettime(host,timep)
          char *host
          time_t &timep = NO_INIT
        OUTPUT:
          timep

=head2 The TYPEMAP: Keyword

Starting with Perl 5.16, you can embed typemaps into your XS code
instead of or in addition to typemaps in a separate file.  Multiple
such embedded typemaps will be processed in order of appearance in
the XS code.  Like local typemap files, they take precedence over the
default typemap, and they may overwrite previous definitions of
TYPEMAP, INPUT, and OUTPUT stanzas.  The syntax for
embedded typemaps is

      TYPEMAP: <<HERE
      ... your typemap code here ...
      HERE

where the C<TYPEMAP> keyword must appear in the first column of a
new line.

Refer to L<perlxstypemap> for details on writing typemaps.

=head2 Initializing Function Parameters

C function parameters are normally initialized with their values from
the argument stack (which in turn contains the parameters that were
passed to the XSUB from Perl).  The typemaps contain the
code segments which are used to translate the Perl values to
the C parameters.  The programmer, however, is allowed to
override the typemaps and supply alternate (or additional)
initialization code.  Initialization code starts with the first
C<=>, C<;> or C<+> on a line in the INPUT: section.  The only
exception is a C<;> that terminates the line; such a C<;>
is quietly ignored.

The following code demonstrates how to supply initialization code for
function parameters.  The initialization code is eval'ed within double
quotes by the compiler before it is added to the output so anything
which should be interpreted literally [mainly C<$>, C<@>, or C<\\>]
must be protected with backslashes.  The variables C<$var>, C<$arg>,
and C<$type> can be used as in typemaps.

     bool_t
     rpcb_gettime(host,timep)
          char *host = (char *)SvPV_nolen($arg);
          time_t &timep = 0;
        OUTPUT:
          timep

This should not be used to supply default values for parameters.  One
would normally use this when a function parameter must be processed by
another library function before it can be used.  Default parameters are
covered in the next section.

If the initialization begins with C<=>, then it is output in
the declaration for the input variable, replacing the initialization
supplied by the typemap.  If the initialization
begins with C<;> or C<+>, then it is performed after
all of the input variables have been declared.  In the C<;>
case the initialization normally supplied by the typemap is not performed.
For the C<+> case, the declaration for the variable will include the
initialization from the typemap.  A global
variable, C<%v>, is available for the truly rare case where
information from one initialization is needed in another
initialization.

Here's a truly obscure example:

     bool_t
     rpcb_gettime(host,timep)
          time_t &timep; /* \$v{timep}=@{[$v{timep}=$arg]} */
          char *host + SvOK($v{timep}) ? SvPV_nolen($arg) : NULL;
        OUTPUT:
          timep

The construct C<\$v{timep}=@{[$v{timep}=$arg]}> used in the above
example has a two-fold purpose: first, when this line is processed by
B<xsubpp>, the Perl snippet C<$v{timep}=$arg> is evaluated.  Second,
the text of the evaluated snippet is output into the generated C file
(inside a C comment)!  During the processing of C<char *host> line,
C<$arg> will evaluate to C<ST(0)>, and C<$v{timep}> will evaluate to
C<ST(1)>.

=head2 Default Parameter Values

Default values for XSUB arguments can be specified by placing an
assignment statement in the parameter list.  The default value may
be a number, a string or the special string C<NO_INIT>.  Defaults should
always be used on the right-most parameters only.

To allow the XSUB for rpcb_gettime() to have a default host
value the parameters to the XSUB could be rearranged.  The
XSUB will then call the real rpcb_gettime() function with
the parameters in the correct order.  This XSUB can be called
from Perl with either of the following statements:

     $status = rpcb_gettime( $timep, $host );

     $status = rpcb_gettime( $timep );

The XSUB will look like the code which follows.  A CODE:
block is used to call the real rpcb_gettime() function with
the parameters in the correct order for that function.

     bool_t
     rpcb_gettime(timep,host="localhost")
          char *host
          time_t timep = NO_INIT
        CODE:
               RETVAL = rpcb_gettime( host, &timep );
        OUTPUT:
          timep
          RETVAL
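
The glue chooses between the caller's value and the default by looking at
how many arguments were passed.  The C sketch below models only that
decision and is not the code B<xsubpp> emits; C<host_or_default> and its
parameters are hypothetical stand-ins, with C<items> playing the role of
the argument count and C<args> the role of the stack slots.

```c
#include <stddef.h>

/* Model of default handling for rpcb_gettime(timep, host="localhost"):
 * args[0] stands for timep and args[1], if present, for host. */
const char *host_or_default(int items, const char *const *args)
{
    if (items < 2)
        return "localhost";   /* caller omitted the right-most argument */
    return args[1];
}
```

This also shows why defaults belong on the right-most parameters only: an
omitted argument can be detected purely from the count.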

=head2 The PREINIT: Keyword

The PREINIT: keyword allows extra variables to be declared immediately
before or after the declarations of the parameters from the INPUT: section
are emitted.

If a variable is declared inside a CODE: section it will follow any typemap
code that is emitted for the input parameters.  This may result in the
declaration ending up after C code, which is a C syntax error.  Similar
errors may happen when an explicit C<;>-type or C<+>-type initialization of
parameters is used (see L<"Initializing Function Parameters">).  Declaring
these variables in an INIT: section will not help.

In such cases, to force an additional variable to be declared together
with declarations of other variables, place the declaration into a
PREINIT: section.  The PREINIT: keyword may be used one or more times
within an XSUB.

The following examples are equivalent, but if the code is using complex
typemaps then the first example is safer.

     bool_t
     rpcb_gettime(timep)
          time_t timep = NO_INIT
	PREINIT:
          char *host = "localhost";
        CODE:
	  RETVAL = rpcb_gettime( host, &timep );
        OUTPUT:
          timep
          RETVAL

For this particular case an INIT: keyword would generate the
same C code as the PREINIT: keyword.  Another correct, but error-prone example:

     bool_t
     rpcb_gettime(timep)
          time_t timep = NO_INIT
	CODE:
          char *host = "localhost";
	  RETVAL = rpcb_gettime( host, &timep );
        OUTPUT:
          timep
          RETVAL

Another way to declare C<host> is to use a C block in the CODE: section:

     bool_t
     rpcb_gettime(timep)
          time_t timep = NO_INIT
	CODE:
	  {
            char *host = "localhost";
	    RETVAL = rpcb_gettime( host, &timep );
	  }
        OUTPUT:
          timep
          RETVAL

The ability to put additional declarations before the typemap entries are
processed is very handy in the cases when typemap conversions manipulate
some global state:

    MyObject
    mutate(o)
	PREINIT:
	    MyState st = global_state;
	INPUT:
	    MyObject o;
	CLEANUP:
	    reset_to(global_state, st);

Here we suppose that conversion to C<MyObject> in the INPUT: section and from
C<MyObject> when processing RETVAL will modify a global variable C<global_state>.
After these conversions are performed, we restore the old value of
C<global_state> (to avoid memory leaks, for example).

There is another way to trade clarity for compactness: INPUT sections allow
declaration of C variables which do not appear in the parameter list of
a subroutine.  Thus the above code for mutate() can be rewritten as

    MyObject
    mutate(o)
	  MyState st = global_state;
	  MyObject o;
	CLEANUP:
	  reset_to(global_state, st);

and the code for rpcb_gettime() can be rewritten as

     bool_t
     rpcb_gettime(timep)
	  time_t timep = NO_INIT
	  char *host = "localhost";
	C_ARGS:
	  host, &timep
	OUTPUT:
          timep
          RETVAL

=head2 The SCOPE: Keyword

The SCOPE: keyword allows scoping to be enabled for a particular XSUB. If
enabled, the XSUB will invoke ENTER and LEAVE automatically.

To support potentially complex type mappings, if a typemap entry used
by an XSUB contains a comment like C</*scope*/> then scoping will
be automatically enabled for that XSUB.

To enable scoping:

    SCOPE: ENABLE

To disable scoping:

    SCOPE: DISABLE

=head2 The INPUT: Keyword

The XSUB's parameters are usually evaluated immediately after entering the
XSUB.  The INPUT: keyword can be used to force those parameters to be
evaluated a little later.  The INPUT: keyword can be used multiple times
within an XSUB and can be used to list one or more input variables.  This
keyword is used with the PREINIT: keyword.

The following example shows how the input parameter C<timep> can be
evaluated late, after a PREINIT.

    bool_t
    rpcb_gettime(host,timep)
          char *host
	PREINIT:
	  time_t tt;
	INPUT:
          time_t timep
        CODE:
               RETVAL = rpcb_gettime( host, &tt );
	       timep = tt;
        OUTPUT:
          timep
          RETVAL

The next example shows each input parameter evaluated late.

    bool_t
    rpcb_gettime(host,timep)
	PREINIT:
	  time_t tt;
	INPUT:
          char *host
	PREINIT:
	  char *h;
	INPUT:
          time_t timep
        CODE:
	       h = host;
	       RETVAL = rpcb_gettime( h, &tt );
	       timep = tt;
        OUTPUT:
          timep
          RETVAL

Since INPUT sections allow declaration of C variables which do not appear
in the parameter list of a subroutine, this may be shortened to:

    bool_t
    rpcb_gettime(host,timep)
	  time_t tt;
          char *host;
	  char *h = host;
          time_t timep;
        CODE:
	  RETVAL = rpcb_gettime( h, &tt );
	  timep = tt;
        OUTPUT:
          timep
          RETVAL

(We used our knowledge that input conversion for C<char *> is a "simple" one,
thus C<host> is initialized on the declaration line, and our assignment
C<h = host> is not performed too early.  Otherwise one would need to have the
assignment C<h = host> in a CODE: or INIT: section.)

=head2 The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords

In the list of parameters for an XSUB, one can precede parameter names
by the C<IN>/C<OUTLIST>/C<IN_OUTLIST>/C<OUT>/C<IN_OUT> keywords.
The C<IN> keyword is the default; the other keywords indicate how the Perl
interface should differ from the C interface.

Parameters preceded by C<OUTLIST>/C<IN_OUTLIST>/C<OUT>/C<IN_OUT>
keywords are considered to be used by the C subroutine I<via
pointers>.  The C<OUTLIST>/C<OUT> keywords indicate that the C subroutine
does not inspect the memory pointed to by this parameter, but will write
through this pointer to provide additional return values.

Parameters preceded by the C<OUTLIST> keyword do not appear in the usage
signature of the generated Perl function.

Parameters preceded by C<IN_OUTLIST>/C<IN_OUT>/C<OUT> I<do> appear as
parameters to the Perl function.  With the exception of
C<OUT>-parameters, these parameters are converted to the corresponding
C type, then pointers to these data are given as arguments to the C
function.  It is expected that the C function will write through these
pointers.

The return list of the generated Perl function consists of the C return value
from the function (unless the XSUB is of C<void> return type or
L<"The NO_OUTPUT Keyword"> was used) followed by all the C<OUTLIST>
and C<IN_OUTLIST> parameters (in the order of appearance).  On the
return from the XSUB the C<IN_OUT>/C<OUT> Perl parameters will be
modified to have the values written by the C function.

For example, an XSUB

  void
  day_month(OUTLIST day, IN unix_time, OUTLIST month)
    int day
    int unix_time
    int month

should be used from Perl as

  my ($day, $month) = day_month(time);

The C signature of the corresponding function should be

  void day_month(int *day, int unix_time, int *month);

The C<IN>/C<OUTLIST>/C<IN_OUTLIST>/C<IN_OUT>/C<OUT> keywords can be
mixed with ANSI-style declarations, as in

  void
  day_month(OUTLIST int day, int unix_time, OUTLIST int month)

(here the optional C<IN> keyword is omitted).

The C<IN_OUT> parameters are identical with parameters introduced with
L<The & Unary Operator> and put into the C<OUTPUT:> section (see
L<The OUTPUT: Keyword>).  The C<IN_OUTLIST> parameters are very similar,
the only difference being that the value the C function writes through the
pointer does not modify the Perl parameter, but is put in the output
list.
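
As a sketch of the difference, assume a hypothetical C function
C<void normalize(double *x, double *y)> (not part of any real library)
that reads both values and writes scaled values back through the pointers.
Declared with C<IN_OUTLIST>, the XSUB might look like

  void
  normalize(IN_OUTLIST double x, IN_OUTLIST double y)

and would be used from Perl as

  my ($nx, $ny) = normalize($x, $y);   # $x and $y are left unmodified

Declaring the parameters as C<IN_OUT> instead would update C<$x> and
C<$y> in place and return nothing.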

The C<OUTLIST>/C<OUT> parameters differ from C<IN_OUTLIST>/C<IN_OUT>
parameters only in that the initial value of the Perl parameter is not
read (and is not given to the C function, which gets some
garbage instead).  For example, the same C function as above can be
interfaced with as

  void day_month(OUT int day, int unix_time, OUT int month);

or

  void
  day_month(day, unix_time, month)
      int &day = NO_INIT
      int  unix_time
      int &month = NO_INIT
    OUTPUT:
      day
      month

However, the generated Perl function is called in very C-ish style:

  my ($day, $month);
  day_month($day, time, $month);

=head2 The C<length(NAME)> Keyword

If one of the input arguments to the C function is the length of a string
argument C<NAME>, one can substitute the name of the length-argument by
C<length(NAME)> in the XSUB declaration.  This argument must be omitted when
the generated Perl function is called.  E.g.,

  void
  dump_chars(char *s, short l)
  {
    short n = 0;
    while (n < l) {
        printf("s[%d] = \"\\%#03o\"\n", n, (int)s[n]);
        n++;
    }
  }

  MODULE = x		PACKAGE = x

  void dump_chars(char *s, short length(s))

should be called as C<dump_chars($string)>.

This directive is supported with ANSI-type function declarations only.

=head2 Variable-length Parameter Lists

XSUBs can have variable-length parameter lists by specifying an ellipsis
C<(...)> in the parameter list.  This use of the ellipsis is similar to that
found in ANSI C.  The programmer is able to determine the number of
arguments passed to the XSUB by examining the C<items> variable which the
B<xsubpp> compiler supplies for all XSUBs.  By using this mechanism one can
create an XSUB which accepts a list of parameters of unknown length.

The I<host> parameter for the rpcb_gettime() XSUB can be
optional so the ellipsis can be used to indicate that the
XSUB will take a variable number of parameters.  Perl should
be able to call this XSUB with either of the following statements.

     $status = rpcb_gettime( $timep, $host );

     $status = rpcb_gettime( $timep );

The XS code, with ellipsis, follows.

     bool_t
     rpcb_gettime(timep, ...)
          time_t timep = NO_INIT
	PREINIT:
          char *host = "localhost";
        CODE:
	  if( items > 1 )
	       host = (char *)SvPV_nolen(ST(1));
	  RETVAL = rpcb_gettime( host, &timep );
        OUTPUT:
          timep
          RETVAL
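
The same mechanism handles lists of arbitrary length.  The following is a
minimal sketch, using a hypothetical XSUB named sum_ints() that is not part
of the RPC examples; it walks all arguments with C<ST(i)> and adds them up
as integers.

     IV
     sum_ints(...)
        PREINIT:
          int i;
        CODE:
          RETVAL = 0;
          /* items holds the number of arguments actually passed */
          for (i = 0; i < items; i++)
               RETVAL += SvIV(ST(i));
        OUTPUT:
          RETVAL

Under these assumptions it would be called from Perl as

     $total = sum_ints( 1, 2, 3 );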

=head2 The C_ARGS: Keyword

The C_ARGS: keyword allows creating XSUBs which have a different
calling sequence from Perl than from C, without a need to write a
CODE: or PPCODE: section.  The contents of the C_ARGS: paragraph are
put as the arguments to the called C function without any change.

For example, suppose that a C function is declared as

    symbolic nth_derivative(int n, symbolic function, int flags);

and that the default flags are kept in a global C variable
C<default_flags>.  Suppose that you want to create an interface which
is called as

    $second_deriv = $function->nth_derivative(2);

To do this, declare the XSUB as

    symbolic
    nth_derivative(function, n)
	symbolic	function
	int		n
      C_ARGS:
	n, function, default_flags

=head2 The PPCODE: Keyword

The PPCODE: keyword is an alternate form of the CODE: keyword and is used
to tell the B<xsubpp> compiler that the programmer is supplying the code to
control the argument stack for the XSUB's return values.  Occasionally one
will want an XSUB to return a list of values rather than a single value.
In these cases one must use PPCODE: and then explicitly push the list of
values on the stack.  The PPCODE: and CODE:  keywords should not be used
together within the same XSUB.

The actual difference between PPCODE: and CODE: sections is in the
initialization of the C<SP> macro (which stands for the I<current> Perl
stack pointer), and in the handling of data on the stack when returning
from an XSUB.  In CODE: sections SP preserves the value it had on
entry to the XSUB: SP is on the function pointer (which follows the
last parameter).  In PPCODE: sections SP is moved backward to the
beginning of the parameter list, which allows C<PUSH*()> macros
to place output values in the place Perl expects them to be when
the XSUB returns back to Perl.

The generated trailer for a CODE: section ensures that the number of return
values Perl will see is either 0 or 1 (depending on the C<void>ness of the
return value of the C function, and heuristics mentioned in
L<"The RETVAL Variable">).  The trailer generated for a PPCODE: section
is based on the number of return values and on the number of times
C<SP> was updated by C<[X]PUSH*()> macros.

Note that macros C<ST(i)>, C<XST_m*()> and C<XSRETURN*()> work equally
well in CODE: sections and PPCODE: sections.

The following XSUB will call the C rpcb_gettime() function
and will return its two output values, timep and status, to
Perl as a single list.

     void
     rpcb_gettime(host)
          char *host
	PREINIT:
          time_t  timep;
          bool_t  status;
        PPCODE:
          status = rpcb_gettime( host, &timep );
          EXTEND(SP, 2);
          PUSHs(sv_2mortal(newSViv(status)));
          PUSHs(sv_2mortal(newSViv(timep)));

Notice that the programmer must supply the C code necessary
to have the real rpcb_gettime() function called and to have
the return values properly placed on the argument stack.

The C<void> return type for this function tells the B<xsubpp> compiler that
the RETVAL variable is not needed or used and that it should not be created.
In most scenarios the void return type should be used with the PPCODE:
directive.

The EXTEND() macro is used to make room on the argument
stack for 2 return values.  The PPCODE: directive causes the
B<xsubpp> compiler to create a stack pointer available as C<SP>, and it
is this pointer which is being used in the EXTEND() macro.
The values are then pushed onto the stack with the PUSHs()
macro.

Now the rpcb_gettime() function can be used from Perl with
the following statement.

     ($status, $timep) = rpcb_gettime("localhost");

When handling output parameters with a PPCODE section, be sure to handle
'set' magic properly.  See L<perlguts> for details about 'set' magic.
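
When the number of return values is not known in advance, the C<XPUSHs()>
macro, which extends the stack as needed, can replace the EXTEND()/PUSHs()
pair.  A minimal sketch, using a hypothetical XSUB int_range() that returns
every integer from C<lo> to C<hi>:

     void
     int_range(lo, hi)
          IV lo
          IV hi
        PREINIT:
          IV i;
        PPCODE:
          /* XPUSHs() grows the stack by one slot if necessary */
          for (i = lo; i <= hi; i++)
               XPUSHs(sv_2mortal(newSViv(i)));

From Perl this might be used as

     @values = int_range( 1, 5 );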

=head2 Returning Undef And Empty Lists

Occasionally the programmer will want to return simply
C<undef> or an empty list if a function fails rather than a
separate status value.  The rpcb_gettime() function offers
just this situation.  If the function succeeds we would like
to have it return the time and if it fails we would like to
have undef returned.  In the following Perl code the value
of $timep will either be undef or it will be a valid time.

     $timep = rpcb_gettime( "localhost" );

The following XSUB uses the C<SV *> return type as a mnemonic only,
and uses a CODE: block to indicate to the compiler
that the programmer has supplied all the necessary code.  The
sv_newmortal() call will initialize the return value to undef, making that
the default return value.

     SV *
     rpcb_gettime(host)
          char *  host
	PREINIT:
          time_t  timep;
          bool_t x;
        CODE:
          ST(0) = sv_newmortal();
          if( rpcb_gettime( host, &timep ) )
               sv_setnv( ST(0), (double)timep);

The next example demonstrates how one would place an explicit undef in the
return value, should the need arise.

     SV *
     rpcb_gettime(host)
          char *  host
	PREINIT:
          time_t  timep;
          bool_t x;
        CODE:
          if( rpcb_gettime( host, &timep ) ){
               ST(0) = sv_newmortal();
               sv_setnv( ST(0), (double)timep);
          }
          else{
               ST(0) = &PL_sv_undef;
          }

To return an empty list one must use a PPCODE: block and
then not push return values on the stack.

     void
     rpcb_gettime(host)
          char *host
	PREINIT:
          time_t  timep;
        PPCODE:
          if( rpcb_gettime( host, &timep ) )
               PUSHs(sv_2mortal(newSViv(timep)));
          else{
	      /* Nothing pushed on stack, so an empty
	       * list is implicitly returned. */
          }

Some people may be inclined to include an explicit C<return> in the above
XSUB, rather than letting control fall through to the end.  In those
situations C<XSRETURN_EMPTY> should be used, instead.  This will ensure that
the XSUB stack is properly adjusted.  Consult L<perlapi> for other
C<XSRETURN> macros.

Since C<XSRETURN_*> macros can be used with CODE blocks as well, one can
rewrite this example as:

     int
     rpcb_gettime(host)
          char *host
	PREINIT:
          time_t  timep;
        CODE:
          RETVAL = rpcb_gettime( host, &timep );
	  if (RETVAL == 0)
		XSRETURN_UNDEF;
	OUTPUT:
	  RETVAL

In fact, one can put this check into a POSTCALL: section as well.  Together
with PREINIT: simplifications, this leads to:

     int
     rpcb_gettime(host)
          char *host
          time_t  timep;
	POSTCALL:
	  if (RETVAL == 0)
		XSRETURN_UNDEF;

=head2 The REQUIRE: Keyword

The REQUIRE: keyword is used to indicate the minimum version of the
B<xsubpp> compiler needed to compile the XS module.  An XS module which
contains the following statement will compile with only B<xsubpp> version
1.922 or greater:

	REQUIRE: 1.922

=head2 The CLEANUP: Keyword

This keyword can be used when an XSUB requires special cleanup procedures
before it terminates.  When the CLEANUP: keyword is used it must follow
any CODE: or OUTPUT: blocks which are present in the XSUB.  The code
specified for the cleanup block will be added as the last statements in
the XSUB.
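
As a minimal sketch, the hypothetical XSUB below (the name byte_sum() and
its logic are illustrative only) allocates a scratch buffer with Newx()
and releases it in a CLEANUP: block after RETVAL has been computed.

     UV
     byte_sum(s)
          char *s
        PREINIT:
          char *copy;
          STRLEN len, i;
        CODE:
          len = strlen(s);
          Newx(copy, len + 1, char);
          Copy(s, copy, len + 1, char);
          RETVAL = 0;
          for (i = 0; i < len; i++)
               RETVAL += (unsigned char)copy[i];
        OUTPUT:
          RETVAL
        CLEANUP:
          Safefree(copy);

Note that an early exit through one of the C<XSRETURN*()> macros returns
from the XSUB immediately, so code in the CLEANUP: block would not run in
that case.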

=head2 The POSTCALL: Keyword

This keyword can be used when an XSUB requires special procedures
executed after the C subroutine call is performed.  When the POSTCALL:
keyword is used it must precede OUTPUT: and CLEANUP: blocks which are
present in the XSUB.

See examples in L<"The NO_OUTPUT Keyword"> and L<"Returning Undef And Empty Lists">.

The POSTCALL: block does not make a lot of sense when the C subroutine
call is supplied by the user in a CODE: or PPCODE: section.

=head2 The BOOT: Keyword

The BOOT: keyword is used to add code to the extension's bootstrap
function.  The bootstrap function is generated by the B<xsubpp> compiler and
normally holds the statements necessary to register any XSUBs with Perl.
With the BOOT: keyword the programmer can tell the compiler to add extra
statements to the bootstrap function.

This keyword may be used any time after the first MODULE keyword and should
appear on a line by itself.  The first blank line after the keyword will
terminate the code block.

     BOOT:
     # The following message will be printed when the
     # bootstrap function executes.
     printf("Hello from the bootstrap!\n");
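
A more practical use is one-time setup.  The following sketch assumes a
module whose package is named C<RPC> and installs a constant subroutine
with newCONSTSUB(); the constant name and value are purely illustrative.
Note that the block contains no blank lines, since the first blank line
would end it.

     BOOT:
     {
          HV *stash = gv_stashpv("RPC", GV_ADD);
          newCONSTSUB(stash, "DEFAULT_TIMEOUT", newSViv(30));
     }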

=head2 The VERSIONCHECK: Keyword

The VERSIONCHECK: keyword corresponds to B<xsubpp>'s C<-versioncheck> and
C<-noversioncheck> options.  This keyword overrides the command line
options.  Version checking is enabled by default.  When version checking is
enabled the XS module will attempt to verify that its version matches the
version of the PM module.

To enable version checking:

    VERSIONCHECK: ENABLE

To disable version checking:

    VERSIONCHECK: DISABLE

Note that if the version of the PM module is an NV (a floating point
number), it will be stringified with a possible loss of precision
(currently chopping to nine decimal places) so that it may not match
the version of the XS module anymore. Quoting the $VERSION declaration
to make it a string is recommended if long version numbers are used.

=head2 The PROTOTYPES: Keyword

The PROTOTYPES: keyword corresponds to B<xsubpp>'s C<-prototypes> and
C<-noprototypes> options.  This keyword overrides the command line options.
Prototypes are disabled by default.  When prototypes are enabled, XSUBs will
be given Perl prototypes.  This keyword may be used multiple times in an XS
module to enable and disable prototypes for different parts of the module.
Note that B<xsubpp> will nag you if you don't explicitly enable or disable
prototypes, with:

    Please specify prototyping behavior for Foo.xs (see perlxs manual)

To enable prototypes:

    PROTOTYPES: ENABLE

To disable prototypes:

    PROTOTYPES: DISABLE

=head2 The PROTOTYPE: Keyword

This keyword is similar to the PROTOTYPES: keyword above but can be used to
force B<xsubpp> to use a specific prototype for the XSUB.  This keyword
overrides all other prototype options and keywords but affects only the
current XSUB.  Consult L<perlsub/Prototypes> for information about Perl
prototypes.

    bool_t
    rpcb_gettime(timep, ...)
          time_t timep = NO_INIT
	PROTOTYPE: $;$
	PREINIT:
          char *host = "localhost";
        CODE:
		  if( items > 1 )
		       host = (char *)SvPV_nolen(ST(1));
		  RETVAL = rpcb_gettime( host, &timep );
        OUTPUT:
          timep
          RETVAL

If prototypes are enabled, you can disable them locally for a given
XSUB as in the following example:

    void
    rpcb_gettime_noproto()
        PROTOTYPE: DISABLE
    ...

=head2 The ALIAS: Keyword

The ALIAS: keyword allows an XSUB to have two or more unique Perl names
and to know which of those names was used when it was invoked.  The Perl
names may be fully-qualified with package names.  Each alias is given an
index.  The compiler will set up a variable called C<ix> which contains the
index of the alias which was used.  When the XSUB is called with its
declared name C<ix> will be 0.

The following example will create aliases C<FOO::gettime()> and
C<BAR::getit()> for this function.

    bool_t
    rpcb_gettime(host,timep)
          char *host
          time_t &timep
	ALIAS:
	    FOO::gettime = 1
	    BAR::getit = 2
	INIT:
	  printf("# ix = %d\n", ix );
        OUTPUT:
          timep

=head2 The OVERLOAD: Keyword

Instead of writing an overloaded interface using pure Perl, you
can also use the OVERLOAD keyword to define additional Perl names
for your functions (like the ALIAS: keyword above).  However, the
overloaded functions must be defined with three parameters (except
for the nomethod() function which needs four parameters).  If any
function has the OVERLOAD: keyword, several additional lines
will be defined in the C file generated by B<xsubpp> in order to
register with the overload magic.

Since blessed objects are actually stored as RV's, it is useful
to use the typemap features to preprocess parameters and extract
the actual SV stored within the blessed RV.  See the sample for
T_PTROBJ_SPECIAL below.

To use the OVERLOAD: keyword, create an XS function which takes
three input parameters (or use the C-style '...' definition) like
this:

    SV *
    cmp (lobj, robj, swap)
    My_Module_obj    lobj
    My_Module_obj    robj
    IV               swap
    OVERLOAD: cmp <=>
    { /* function defined here */}

In this case, the function will overload both of the three-way
comparison operators.  For all overload operations using non-alpha
characters, you must type the parameter without quoting, separating
multiple overloads with whitespace.  Note that "" (the stringify
overload) should be entered as \"\" (i.e. escaped).

=head2 The FALLBACK: Keyword

In addition to the OVERLOAD keyword, if you need to control how
Perl autogenerates missing overloaded operators, you can set the
FALLBACK keyword in the module header section, like this:

    MODULE = RPC  PACKAGE = RPC

    FALLBACK: TRUE
    ...

where FALLBACK can take any of the three values TRUE, FALSE, or
UNDEF.  If you do not set any FALLBACK value when using OVERLOAD,
it defaults to UNDEF.  FALLBACK is not used except when one or
more functions using OVERLOAD have been defined.  Please see
L<overload/fallback> for more details.

=head2 The INTERFACE: Keyword

This keyword declares the current XSUB as a keeper of the given
calling signature.  If some text follows this keyword, it is
considered as a list of functions which have this signature, and
should be attached to the current XSUB.

For example, if you have 4 C functions multiply(), divide(), add(),
subtract() all having the signature:

    symbolic f(symbolic, symbolic);

you can make them all use the same XSUB like this:

    symbolic
    interface_s_ss(arg1, arg2)
	symbolic	arg1
	symbolic	arg2
    INTERFACE:
	multiply divide
	add subtract

(This is the complete XSUB code for 4 Perl functions!)  The four generated
Perl functions share names with the corresponding C functions.

The advantage of this approach compared to the ALIAS: keyword is that there
is no need to code a switch statement; each Perl function (which shares
the same XSUB) knows which C function it should call.  Additionally, one
can attach an extra function remainder() at runtime by using

    CV *mycv = newXSproto("Symbolic::remainder",
			  XS_Symbolic_interface_s_ss, __FILE__, "$$");
    XSINTERFACE_FUNC_SET(mycv, remainder);

say, from another XSUB.  (This example supposes that there was no
INTERFACE_MACRO: section, otherwise one needs to use something else instead of
C<XSINTERFACE_FUNC_SET>, see the next section.)

=head2 The INTERFACE_MACRO: Keyword

This keyword allows one to define an INTERFACE using a different way
to extract a function pointer from an XSUB.  The text which follows
this keyword should give the names of the macros which extract and set a
function pointer.  The extractor macro is given the return type, C<CV*>,
and C<XSANY.any_dptr> for this C<CV*>.  The setter macro is given the cv
and the function pointer.

The default value is C<XSINTERFACE_FUNC> and C<XSINTERFACE_FUNC_SET>.
An INTERFACE keyword with an empty list of functions can be omitted if
INTERFACE_MACRO keyword is used.

Suppose that in the previous example the function pointers for
multiply(), divide(), add(), subtract() are kept in a global C array
C<fp[]> with offsets being C<multiply_off>, C<divide_off>, C<add_off>,
C<subtract_off>.  Then one can use

    #define XSINTERFACE_FUNC_BYOFFSET(ret,cv,f) \
	((XSINTERFACE_CVT_ANON(ret))fp[CvXSUBANY(cv).any_i32])
    #define XSINTERFACE_FUNC_BYOFFSET_set(cv,f) \
	CvXSUBANY(cv).any_i32 = CAT2( f, _off )

in C section,

    symbolic
    interface_s_ss(arg1, arg2)
	symbolic	arg1
	symbolic	arg2
      INTERFACE_MACRO:
	XSINTERFACE_FUNC_BYOFFSET
	XSINTERFACE_FUNC_BYOFFSET_set
      INTERFACE:
	multiply divide
	add subtract

in XSUB section.

=head2 The INCLUDE: Keyword

This keyword can be used to pull other files into the XS module.  The other
files may have XS code.  INCLUDE: can also be used to run a command to
generate the XS code to be pulled into the module.

The file F<Rpcb1.xsh> contains our C<rpcb_gettime()> function:

    bool_t
    rpcb_gettime(host,timep)
          char *host
          time_t &timep
        OUTPUT:
          timep

The XS module can use INCLUDE: to pull that file into it.

    INCLUDE: Rpcb1.xsh

If the parameters to the INCLUDE: keyword are followed by a pipe (C<|>) then
the compiler will interpret the parameters as a command. This feature is
mildly deprecated in favour of the C<INCLUDE_COMMAND:> directive, as documented
below.

    INCLUDE: cat Rpcb1.xsh |

Do not use this to run perl: C<INCLUDE: perl |> will run the perl that
happens to be the first in your path and not necessarily the same perl that is
used to run C<xsubpp>. See L<"The INCLUDE_COMMAND: Keyword">.

=head2 The INCLUDE_COMMAND: Keyword

Runs the supplied command and includes its output into the current XS
document. C<INCLUDE_COMMAND> assigns special meaning to the C<$^X> token
in that it runs the same perl interpreter that is running C<xsubpp>:

    INCLUDE_COMMAND: cat Rpcb1.xsh

    INCLUDE_COMMAND: $^X -e ...

=head2 The CASE: Keyword

The CASE: keyword allows an XSUB to have multiple distinct parts with each
part acting as a virtual XSUB.  CASE: is greedy and if it is used then all
other XS keywords must be contained within a CASE:.  This means nothing may
precede the first CASE: in the XSUB and anything following the last CASE: is
included in that case.

A CASE: might switch via a parameter of the XSUB, via the C<ix> ALIAS:
variable (see L<"The ALIAS: Keyword">), or maybe via the C<items> variable
(see L<"Variable-length Parameter Lists">).  The last CASE: becomes the
B<default> case if it is not associated with a conditional.  The following
example shows CASE switched via C<ix> with a function C<rpcb_gettime()>
having an alias C<x_gettime()>.  When the function is called as
C<rpcb_gettime()> its parameters are the usual C<(char *host, time_t *timep)>,
but when the function is called as C<x_gettime()> its parameters are
reversed, C<(time_t *timep, char *host)>.

    long
    rpcb_gettime(a,b)
      CASE: ix == 1
	ALIAS:
	  x_gettime = 1
	INPUT:
	  # 'a' is timep, 'b' is host
          char *b
          time_t a = NO_INIT
        CODE:
               RETVAL = rpcb_gettime( b, &a );
        OUTPUT:
          a
          RETVAL
      CASE:
	  # 'a' is host, 'b' is timep
          char *a
          time_t &b = NO_INIT
        OUTPUT:
          b
          RETVAL

That function can be called with either of the following statements.  Note
the different argument lists.

	$status = rpcb_gettime( $host, $timep );

	$status = x_gettime( $timep, $host );

=head2 The EXPORT_XSUB_SYMBOLS: Keyword

The EXPORT_XSUB_SYMBOLS: keyword is likely something you will never need.
In perl versions earlier than 5.16.0, this keyword does nothing. Starting
with 5.16, XSUB symbols are no longer exported by default. That is, they
are C<static> functions. If you include

  EXPORT_XSUB_SYMBOLS: ENABLE

in your XS code, the XSUBs following this line will not be declared C<static>.
You can later disable this with

  EXPORT_XSUB_SYMBOLS: DISABLE

which, again, is the default that you should probably never change.
You cannot use this keyword on versions of perl before 5.16 to make
XSUBs C<static>.

=head2 The & Unary Operator

The C<&> unary operator in the INPUT: section is used to tell B<xsubpp>
that it should convert a Perl value to/from C using the C type to the left
of C<&>, but provide a pointer to this value when the C function is called.

This is useful to avoid a CODE: block for a C function which takes a parameter
by reference.  Typically, the parameter should not be a pointer type (an
C<int> or C<long> but not an C<int*> or C<long*>).

The following XSUB will generate incorrect C code.  The B<xsubpp> compiler will
turn this into code which calls C<rpcb_gettime()> with parameters C<(char
*host, time_t timep)>, but the real C<rpcb_gettime()> wants the C<timep>
parameter to be of type C<time_t*> rather than C<time_t>.

    bool_t
    rpcb_gettime(host,timep)
          char *host
          time_t timep
        OUTPUT:
          timep

That problem is corrected by using the C<&> operator.  The B<xsubpp> compiler
will now turn this into code which calls C<rpcb_gettime()> correctly with
parameters C<(char *host, time_t *timep)>.  It does this by carrying the
C<&> through, so the function call looks like C<rpcb_gettime(host, &timep)>.

    bool_t
    rpcb_gettime(host,timep)
          char *host
          time_t &timep
        OUTPUT:
          timep

=head2 Inserting POD, Comments and C Preprocessor Directives

C preprocessor directives are allowed within BOOT:, PREINIT:, INIT:, CODE:,
PPCODE:, POSTCALL:, and CLEANUP: blocks, as well as outside the functions.
Comments are allowed anywhere after the MODULE keyword.  The compiler will
pass the preprocessor directives through untouched and will remove the
commented lines. POD documentation is allowed at any point, both in the
C and XS language sections. POD must be terminated with a C<=cut> command;
C<xsubpp> will exit with an error if it is not. It is very unlikely that
human generated C code will be mistaken for POD, as most indenting styles
result in whitespace in front of any line starting with C<=>. Machine
generated XS files may fall into this trap unless care is taken to
ensure that a space breaks the sequence "\n=".

Comments can be added to XSUBs by placing a C<#> as the first
non-whitespace of a line.  Care should be taken to avoid making the
comment look like a C preprocessor directive, lest it be interpreted as
such.  The simplest way to prevent this is to put whitespace in front of
the C<#>.

If you use preprocessor directives to choose one of two
versions of a function, use

    #if ... version1
    #else /* ... version2  */
    #endif

and not

    #if ... version1
    #endif
    #if ... version2
    #endif

because otherwise B<xsubpp> will believe that you made a duplicate
definition of the function.  Also, put a blank line before the
#else/#endif so it will not be seen as part of the function body.
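
A fuller sketch of the recommended layout, with hypothetical names
(C<HAS_FAST_CLOCK>, fast_clock() and slow_clock() are assumptions made for
illustration only):

    #if defined(HAS_FAST_CLOCK)

    long
    clock_ticks()
        CODE:
          RETVAL = fast_clock();
        OUTPUT:
          RETVAL

    #else

    long
    clock_ticks()
        CODE:
          RETVAL = slow_clock();
        OUTPUT:
          RETVAL

    #endif

Each directive is separated from the neighboring XSUB by a blank line, so
that it is not taken as part of a function body.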

=head2 Using XS With C++

If an XSUB name contains C<::>, it is considered to be a C++ method.
The generated Perl function will assume that
its first argument is an object pointer.  The object pointer
will be stored in a variable called THIS.  The object should
have been created by C++ with the new() function and should
be blessed by Perl with the sv_setref_pv() macro.  The
blessing of the object by Perl can be handled by a typemap.  An example
typemap is shown at the end of this section.

If the return type of the XSUB includes C<static>, the method is considered
to be a static method.  It will call the C++
function using the class::method() syntax.  If the method is not static
the function will be called using the THIS-E<gt>method() syntax.

The next examples will use the following C++ class.

     class color {
          public:
          color();
          ~color();
          int blue();
          void set_blue( int );

          private:
          int c_blue;
     };

The XSUBs for the blue() and set_blue() methods are defined with the class
name but the parameter for the object (THIS, or "self") is implicit and is
not listed.

     int
     color::blue()

     void
     color::set_blue( val )
          int val

Both Perl functions will expect an object as the first parameter.  In the
generated C++ code the object is called C<THIS>, and the method call will
be performed on this object.  So in the C++ code the blue() and set_blue()
methods will be called like this:

     RETVAL = THIS->blue();

     THIS->set_blue( val );

You could also write a single get/set method using an optional argument:

     int
     color::blue( val = NO_INIT )
         int val
         PROTOTYPE: $;$
         CODE:
             if (items > 1)
                 THIS->set_blue( val );
             RETVAL = THIS->blue();
         OUTPUT:
             RETVAL

If the function's name is B<DESTROY> then the C++ C<delete> function will be
called and C<THIS> will be given as its parameter.  The generated C++ code for

     void
     color::DESTROY()

will look like this:

     color *THIS = ...;  // Initialized as in typemap

     delete THIS;

If the function's name is B<new> then the C++ C<new> function will be called
to create a dynamic C++ object.  The XSUB will expect the class name, which
will be kept in a variable called C<CLASS>, to be given as the first
argument.

     color *
     color::new()

The generated C++ code will call C<new>.

     RETVAL = new color();

The following is an example of a typemap that could be used for this C++
example.

    TYPEMAP
    color *  O_OBJECT

    OUTPUT
    # The Perl object is blessed into 'CLASS', which should be a
    # char* having the name of the package for the blessing.
    O_OBJECT
        sv_setref_pv( $arg, CLASS, (void*)$var );

    INPUT
    O_OBJECT
        if( sv_isobject($arg) && (SvTYPE(SvRV($arg)) == SVt_PVMG) )
            $var = ($type)SvIV((SV*)SvRV( $arg ));
        else{
            warn("${Package}::$func_name() -- "
                "$var is not a blessed SV reference");
            XSRETURN_UNDEF;
        }

=head2 Interface Strategy

When designing an interface between Perl and a C library a straight
translation from C to XS (such as created by C<h2xs -x>) is often sufficient.
However, sometimes the interface will look
very C-like and occasionally nonintuitive, especially when the C function
modifies one of its parameters, or returns failure inband (as in "negative
return values mean failure").  In cases where the programmer wishes to
create a more Perl-like interface the following strategy may help to
identify the more critical parts of the interface.

Identify the C functions with input/output or output parameters.  The XSUBs for
these functions may be able to return lists to Perl.

Identify the C functions which use some inband info as an indication
of failure.  They may be
candidates to return undef or an empty list in case of failure.  If the
failure may be detected without a call to the C function, you may want to use
an INIT: section to report the failure.  For failures detectable after the C
function returns one may want to use a POSTCALL: section to process the
failure.  In more complicated cases use CODE: or PPCODE: sections.

If many functions use the same failure indication based on the return value,
you may want to create a special typedef to handle this situation.  Put

  typedef int negative_is_failure;

near the beginning of the XS file, and create an OUTPUT typemap entry
for C<negative_is_failure> which converts negative values to C<undef>, or
maybe croak()s.  After this, a return value of type C<negative_is_failure>
will produce a more Perl-like interface.
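
The decision such a typemap entry has to make can be sketched in plain C.
This is a minimal sketch under stated assumptions, not typemap code: the
C<maybe_int> struct is a hypothetical stand-in for a Perl scalar that is
either undefined or holds an integer, and C<find_byte> is a hypothetical
library call that reports failure inband.

```c
#include <stdbool.h>
#include <stddef.h>

typedef int negative_is_failure;

/* Hypothetical stand-in for a Perl scalar: undefined, or an integer. */
typedef struct {
    bool defined;
    int  value;
} maybe_int;

/* The mapping an OUTPUT typemap entry for negative_is_failure would
 * perform: negative means "undef", anything else passes through. */
maybe_int convert_negative_is_failure(negative_is_failure rc)
{
    maybe_int out = { false, 0 };
    if (rc >= 0) {
        out.defined = true;
        out.value   = rc;
    }
    return out;
}

/* Hypothetical C library call that signals failure inband with -1. */
negative_is_failure find_byte(const char *buf, size_t len, char wanted)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] == wanted)
            return (negative_is_failure)i;
    return -1;
}
```

Because the conversion is keyed on the typedef rather than on each function,
every XSUB declared with this return type would pick it up without a CODE:
section of its own.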

Identify which values are used by only the C and XSUB functions
themselves, say, when a parameter to a function should be the contents of a
global variable.  If Perl does not need to access the contents of the value
then it may not be necessary to provide a translation for that value
from C to Perl.

Identify the pointers in the C function parameter lists and return
values.  Some pointers may be used to implement input/output or
output parameters; they can be handled in XS with the C<&> unary operator,
and, possibly, using the NO_INIT keyword.
Some others will require handling of types like C<int *>, and one needs
to decide what a useful Perl translation will do in such a case.  When
the semantic is clear, it is advisable to put the translation into a typemap
file.

Identify the structures used by the C functions.  In many
cases it may be helpful to use the T_PTROBJ typemap for
these structures so they can be manipulated by Perl as
blessed objects.  (This is handled automatically by C<h2xs -x>.)

If the same C type is used in several different contexts which require
different translations, C<typedef> several new types mapped to this C type,
and create separate F<typemap> entries for these new types.  Use these
types in declarations of return type and parameters to XSUBs.

=head2 Perl Objects And C Structures

When dealing with C structures one should select either
B<T_PTROBJ> or B<T_PTRREF> for the XS type.  Both types are
designed to handle pointers to complex objects.  The
T_PTRREF type will allow the Perl object to be unblessed
while the T_PTROBJ type requires that the object be blessed.
By using T_PTROBJ one can achieve a form of type-checking
because the XSUB will attempt to verify that the Perl object
is of the expected type.

The following XS code shows the getnetconfigent() function which is used
with ONC+ TIRPC.  The getnetconfigent() function will return a pointer to a
C structure and has the C prototype shown below.  The example will
demonstrate how the C pointer will become a Perl reference.  Perl will
consider this reference to be a pointer to a blessed object and will
attempt to call a destructor for the object.  A destructor will be
provided in the XS source to free the memory used by getnetconfigent().
Destructors in XS can be created by specifying an XSUB function whose name
ends with the word B<DESTROY>.  XS destructors can be used to free memory
which may have been malloc'd by another XSUB.

     struct netconfig *getnetconfigent(const char *netid);

A C<typedef> will be created for C<struct netconfig>.  The Perl
object will be blessed in a class matching the name of the C
type, with the tag C<Ptr> appended, and the name should not
have embedded spaces if it will be a Perl package name.  The
destructor will be placed in a class corresponding to the
class of the object and the PREFIX keyword will be used to
trim the name to the word DESTROY as Perl will expect.

     typedef struct netconfig Netconfig;

     MODULE = RPC  PACKAGE = RPC

     Netconfig *
     getnetconfigent(netid)
          char *netid

     MODULE = RPC  PACKAGE = NetconfigPtr  PREFIX = rpcb_

     void
     rpcb_DESTROY(netconf)
          Netconfig *netconf
        CODE:
          printf("Now in NetconfigPtr::DESTROY\n");
          free( netconf );

This example requires the following typemap entry.  Consult
L<perlxstypemap> for more information about adding new typemaps
for an extension.

     TYPEMAP
     Netconfig *  T_PTROBJ

This example will be used with the following Perl statements.

     use RPC;
     $netconf = getnetconfigent("udp");

When Perl destroys the object referenced by $netconf it will send the
object to the supplied XSUB DESTROY function.  Perl cannot determine, and
does not care, that this object is a C struct and not a Perl object.  In
this sense, there is no difference between the object created by the
getnetconfigent() XSUB and an object created by a normal Perl subroutine.
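
The life cycle described above can be sketched without Perl at all.  The
following is a minimal sketch, assuming that the blessed scalar simply holds
the C pointer as an integer; C<Netconfig_demo>, C<obj_handle> and the
C<demo_*> functions are hypothetical names, and C<intptr_t> stands in for
Perl's integer slot.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical C structure owned by the library. */
typedef struct {
    char netid[16];
} Netconfig_demo;

/* Stand-in for the integer slot of the blessed scalar. */
typedef intptr_t obj_handle;

/* Constructor side: allocate the struct and hand out its address
 * as an integer.  Returns 0 if allocation fails. */
obj_handle demo_new(const char *netid)
{
    Netconfig_demo *nc = calloc(1, sizeof *nc);
    if (nc == NULL)
        return 0;
    /* calloc zeroed the buffer, so the copy stays NUL-terminated */
    strncpy(nc->netid, netid, sizeof nc->netid - 1);
    return (obj_handle)nc;
}

/* Method side: turn the integer back into a typed pointer. */
const char *demo_netid(obj_handle h)
{
    Netconfig_demo *nc = (Netconfig_demo *)h;
    return nc->netid;
}

/* Destructor side: the counterpart of the DESTROY XSUB. */
void demo_destroy(obj_handle h)
{
    free((void *)h);
}
```

The integer alone carries no type information, which is why the class name
checked by T_PTROBJ matters: it is the only guard against converting an
unrelated object's integer back into the wrong pointer type.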

=head2 Safely Storing Static Data in XS

Starting with Perl 5.8, a macro framework has been defined to allow
static data to be safely stored in XS modules that will be accessed from
a multi-threaded Perl.

Although primarily designed for use with multi-threaded Perl, the macros
have been designed so that they will work with non-threaded Perl as well.

It is therefore strongly recommended that these macros be used by all
XS modules that make use of static data.

The easiest way to get a template set of macros to use is by specifying
the C<-g> (C<--global>) option with h2xs (see L<h2xs>).

Below is an example module that makes use of the macros.

    #define PERL_NO_GET_CONTEXT
    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* Global Data */

    #define MY_CXT_KEY "BlindMice::_guts" XS_VERSION

    typedef struct {
        int count;
        char name[3][100];
    } my_cxt_t;

    START_MY_CXT

    MODULE = BlindMice           PACKAGE = BlindMice

    BOOT:
    {
        MY_CXT_INIT;
        MY_CXT.count = 0;
        strcpy(MY_CXT.name[0], "None");
        strcpy(MY_CXT.name[1], "None");
        strcpy(MY_CXT.name[2], "None");
    }

    int
    newMouse(char * name)
        PREINIT:
          dMY_CXT;
        CODE:
          if (MY_CXT.count >= 3) {
              warn("Already have 3 blind mice");
              RETVAL = 0;
          }
          else {
              RETVAL = ++ MY_CXT.count;
              strcpy(MY_CXT.name[MY_CXT.count - 1], name);
          }
        OUTPUT:
          RETVAL

    char *
    get_mouse_name(index)
          int index
        PREINIT:
          dMY_CXT;
        CODE:
          if (index < 1 || index > MY_CXT.count)
            croak("There are only 3 blind mice.");
          else
            RETVAL = MY_CXT.name[index - 1];
        OUTPUT:
          RETVAL

    void
    CLONE(...)
        CODE:
          MY_CXT_CLONE;

=head3 MY_CXT REFERENCE

=over 5

=item MY_CXT_KEY

This macro is used to define a unique key to refer to the static data
for an XS module. The suggested naming scheme, as used by h2xs, is to
use a string that consists of the module name, the string "::_guts"
and the module version number.

    #define MY_CXT_KEY "MyModule::_guts" XS_VERSION

=item typedef my_cxt_t

This struct typedef I<must> always be called C<my_cxt_t>. The other
C<CXT*> macros assume the existence of the C<my_cxt_t> typedef name.

Declare a typedef named C<my_cxt_t> that is a structure that contains
all the data that needs to be interpreter-local.

    typedef struct {
        int some_value;
    } my_cxt_t;

=item START_MY_CXT

Always place the START_MY_CXT macro directly after the declaration
of C<my_cxt_t>.

=item MY_CXT_INIT

The MY_CXT_INIT macro initializes storage for the C<my_cxt_t> struct.

It I<must> be called exactly once, typically in a BOOT: section. If you
are maintaining multiple interpreters, it should be called once in each
interpreter instance, except for interpreters cloned from existing ones.
(But see L</MY_CXT_CLONE> below.)

=item dMY_CXT

Use the dMY_CXT macro (a declaration) in all the functions that access
MY_CXT.

=item MY_CXT

Use the MY_CXT macro to access members of the C<my_cxt_t> struct. For
example, if C<my_cxt_t> is

    typedef struct {
        int index;
    } my_cxt_t;

then use this to access the C<index> member

    dMY_CXT;
    MY_CXT.index = 2;

=item aMY_CXT/pMY_CXT

C<dMY_CXT> may be quite expensive to calculate, and to avoid the overhead
of invoking it in each function it is possible to pass the declaration
on to other functions using the C<aMY_CXT>/C<pMY_CXT> macros, e.g.

    void sub1() {
        dMY_CXT;
        MY_CXT.index = 1;
        sub2(aMY_CXT);
    }

    void sub2(pMY_CXT) {
        MY_CXT.index = 2;
    }

Analogously to C<pTHX>, there are equivalent forms for when the macro is the
first or last in multiple arguments, where an underscore represents a
comma, i.e.  C<_aMY_CXT>, C<aMY_CXT_>, C<_pMY_CXT> and C<pMY_CXT_>.

=item MY_CXT_CLONE

By default, when a new interpreter is created as a copy of an existing one
(e.g. via C<< threads->create() >>), both interpreters share the same physical
my_cxt_t structure. Calling C<MY_CXT_CLONE> (typically via the package's
C<CLONE()> function) causes a byte-for-byte copy of the structure to be
taken, and any future dMY_CXT will cause the copy to be accessed instead.

=item MY_CXT_INIT_INTERP(my_perl)

=item dMY_CXT_INTERP(my_perl)

These are versions of the macros which take an explicit interpreter as an
argument.

=back

Note that these macros will only work together within the I<same> source
file; that is, a dMY_CXT in one source file will access a different structure
than a dMY_CXT in another source file.
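
One consequence of the byte-for-byte copy described under C<MY_CXT_CLONE>
can be shown in plain C.  This is a minimal sketch, not the macro's
implementation: C<demo_cxt_t> and C<demo_cxt_clone> are hypothetical, and
C<memcpy> merely stands in for the copy.  Members stored inline are
duplicated, but a pointer member still refers to the same target in both
copies.

```c
#include <string.h>

/* Hypothetical context struct, modelled on the BlindMice example. */
typedef struct {
    int   count;
    char  name[3][100];    /* stored inline: duplicated by a byte copy */
    int  *shared_counter;  /* pointer member: only the address is copied */
} demo_cxt_t;

/* Byte-for-byte copy of the whole struct. */
void demo_cxt_clone(demo_cxt_t *dst, const demo_cxt_t *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

A struct that owns heap memory through such a pointer would therefore need
extra work in C<CLONE()> after the copy if each interpreter is to have its
own data.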

=head2 Thread-aware system interfaces

Starting from Perl 5.8, at the C/C++ level Perl knows how to wrap
system/library interfaces that have thread-aware versions
(e.g. getpwent_r()) into frontend macros (e.g. getpwent()) that
correctly handle the multithreaded interaction with the Perl
interpreter.  This happens transparently; the only thing
you need to do is to instantiate a Perl interpreter.

This wrapping always happens when compiling Perl core source
(PERL_CORE is defined) or the Perl core extensions (PERL_EXT is
defined).  When compiling XS code outside of Perl core the wrapping
does not take place.  Note, however, that intermixing the _r-forms
(as Perl compiled for multithreaded operation will do) and the _r-less
forms is neither well-defined (inconsistent results, data corruption,
or even crashes become more likely), nor is it very portable.
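
The difference between the two families of interfaces can be illustrated
with a pair of hypothetical functions written in the same two styles.  This
is only a sketch of the calling conventions: C<demo_label> and
C<demo_label_r> are invented for the example and are not system or Perl
functions.

```c
#include <stdio.h>
#include <stddef.h>

/* Static-buffer style, like the _r-less forms: every call reuses one
 * buffer, so a later call overwrites an earlier result. */
const char *demo_label(int id)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "id-%d", id);
    return buf;
}

/* Caller-buffer style, like the _r forms: the caller owns the storage,
 * so results from separate calls (or threads) do not collide. */
char *demo_label_r(int id, char *buf, size_t len)
{
    snprintf(buf, len, "id-%d", id);
    return buf;
}
```

Two results obtained from C<demo_label> point at the same storage, whereas
each C<demo_label_r> result lives in its own buffer; this is the hazard that
thread-aware versions exist to avoid.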

=head1 EXAMPLES

File C<RPC.xs>: Interface to some ONC+ RPC bind library functions.

     #define PERL_NO_GET_CONTEXT
     #include "EXTERN.h"
     #include "perl.h"
     #include "XSUB.h"

     #include <rpc/rpc.h>

     typedef struct netconfig Netconfig;

     MODULE = RPC  PACKAGE = RPC

     SV *
     rpcb_gettime(host="localhost")
          char *host
        PREINIT:
          time_t  timep;
        CODE:
          ST(0) = sv_newmortal();
          if( rpcb_gettime( host, &timep ) )
               sv_setnv( ST(0), (double)timep );

     Netconfig *
     getnetconfigent(netid="udp")
          char *netid

     MODULE = RPC  PACKAGE = NetconfigPtr  PREFIX = rpcb_

     void
     rpcb_DESTROY(netconf)
          Netconfig *netconf
        CODE:
          printf("NetconfigPtr::DESTROY\n");
          free( netconf );

File C<typemap>: Custom typemap for RPC.xs. (cf. L<perlxstypemap>)

     TYPEMAP
     Netconfig *  T_PTROBJ

File C<RPC.pm>: Perl module for the RPC extension.

     package RPC;

     require Exporter;
     require DynaLoader;
     @ISA = qw(Exporter DynaLoader);
     @EXPORT = qw(rpcb_gettime getnetconfigent);

     bootstrap RPC;
     1;

File C<rpctest.pl>: Perl test program for the RPC extension.

     use RPC;

     $netconf = getnetconfigent();
     $a = rpcb_gettime();
     print "time = $a\n";
     print "netconf = $netconf\n";

     $netconf = getnetconfigent("tcp");
     $a = rpcb_gettime("poplar");
     print "time = $a\n";
     print "netconf = $netconf\n";

=head1 CAVEATS

XS code has full access to system calls including C library functions.
It thus has the capability of interfering with things that the Perl core
or other modules have set up, such as signal handlers or file handles.
It could mess with the memory, or any number of harmful things.  Don't.

Some modules have an event loop, waiting for user-input.  It is highly
unlikely that two such modules would work adequately together in a
single Perl application.

In general, the perl interpreter views itself as the center of the
universe as far as the Perl program goes.  XS code is viewed as a
help-mate, to accomplish things that perl doesn't do, or doesn't do fast
enough, but always subservient to perl.  The closer XS code adheres to
this model, the less likely conflicts will occur.

One area where there has been conflict is with regard to C locales.  (See
L<perllocale>.)  perl, with one exception and unless told otherwise,
sets up the underlying locale the program is running in to the locale
passed into it from the environment.  This is an important difference
from a generic C language program, where the underlying locale is the "C"
locale unless the program changes it.  As of v5.20, this underlying
locale is completely hidden from pure perl code outside the lexical
scope of C<S<use locale>> except for a couple of function calls in the
POSIX module which of necessity use it.  But the underlying locale, with
that one exception, is exposed to XS code, affecting all C library routines
whose behavior is locale-dependent.  Your XS code had better not assume that
the underlying locale is "C".  The exception is the
L<C<LC_NUMERIC>|perllocale/Category LC_NUMERIC: Numeric Formatting>
locale category, and the reason it is an exception is that experience
has shown that it can be problematic for XS code, whereas we have not
had reports of problems with the
L<other locale categories|perllocale/WHAT IS A LOCALE>.  And the reason
for this one category being problematic is that the character used as a
decimal point can vary.  Many European languages use a comma, whereas
English, and hence Perl, expect a dot (U+002E: FULL STOP).  Many
modules can handle only the radix character being a dot, and so perl
attempts to make it so.  Up through Perl v5.20, the attempt was merely
to set C<LC_NUMERIC> upon startup to the C<"C"> locale.  Any
L<setlocale()|perllocale/The setlocale function> otherwise would change
it; this caused some failures.  Therefore, starting in v5.22, perl tries
to keep C<LC_NUMERIC> always set to C<"C"> for XS code.
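
The radix problem is easy to reproduce in plain C.  The sketch below
assumes nothing about Perl's own macros: C<format_with_dot> is a
hypothetical helper that formats a number with C<snprintf> and then
rewrites whatever radix string the current C<LC_NUMERIC> locale produced
into a dot, using the standard C<localeconv()> call to discover it.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Format v with two decimals and a '.' radix, whatever LC_NUMERIC says.
 * Returns 0 on success, -1 if the buffer is too small. */
int format_with_dot(double v, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%.2f", v);
    if (n < 0 || (size_t)n >= len)
        return -1;

    /* The radix string of the current locale; may be "," or multibyte. */
    const char *radix = localeconv()->decimal_point;
    size_t rlen = strlen(radix);
    char *p = (rlen > 0) ? strstr(buf, radix) : NULL;

    if (p != NULL && !(rlen == 1 && *p == '.')) {
        /* Overwrite the radix with a single dot and close any gap left
         * by a multibyte radix, including the terminating NUL. */
        *p = '.';
        memmove(p + 1, p + rlen, strlen(p + rlen) + 1);
    }
    return 0;
}
```

In the "C" locale the function leaves the output untouched; under a locale
whose radix is a comma, the same call would still yield a string that
dot-only parsers accept.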

To summarize, here's what to expect and how to handle locales in XS code:

=over

=item Non-locale-aware XS code

Keep in mind that even if you think your code is not locale-aware, it
may call a C library function that is.  Hopefully the man page for such
a function will indicate that dependency, but the documentation is
imperfect.

The current locale is exposed to XS code except possibly C<LC_NUMERIC>
(explained in the next paragraph).
There have not been reports of problems with the other categories.
Perl initializes things on start-up so that the current locale is the
one which is indicated by the user's environment in effect at that time.
See L<perllocale/ENVIRONMENT>.

However, up through v5.20, Perl initialized things on start-up so that
C<LC_NUMERIC> was set to the "C" locale.  But if any code anywhere
changed it, it would stay changed.  This means that your module can't
count on C<LC_NUMERIC> being something in particular, and you can't
expect floating point numbers (including version strings) to have dots
in them.  If you don't allow for a non-dot, your code could break if
anyone anywhere changed the locale.  For this reason, v5.22 changed
the behavior so that Perl tries to keep C<LC_NUMERIC> in the "C" locale
except around the operations internally where it should be something
else.  Misbehaving XS code will always be able to change the locale
anyway, but the most common instance of this is checked for and
handled.

=item Locale-aware XS code

If the locale from the user's environment is desired, there should be no
need for XS code to set the locale except for C<LC_NUMERIC>, as perl has
already set it up.  XS code should avoid changing the locale, as it can
adversely affect other, unrelated, code and may not be thread safe.
However, some alien libraries that may be called do set it, such as
C<Gtk>.  This can cause problems for the perl core and other modules.
Starting in v5.20.1, calling the function
L<sync_locale()|perlapi/sync_locale> from XS should be sufficient to
avoid most of these problems.  Prior to this, you need a pure Perl
statement that does this:

 POSIX::setlocale(LC_ALL, POSIX::setlocale(LC_ALL));

In the event that your XS code may need the underlying C<LC_NUMERIC>
locale, there are macros available to access this; see
L<perlapi/Locale-related functions and macros>.

=back

=head1 XS VERSION

This document covers features supported by C<ExtUtils::ParseXS>
(also known as C<xsubpp>) 3.13_01.

=head1 AUTHOR

Originally written by Dean Roehrich <F<roehrich@cray.com>>.

Maintained since 1996 by The Perl Porters <F<perlbug@perl.org>>.
=encoding utf8

=head1 NAME

perl5262delta - what is new for perl v5.26.2

=head1 DESCRIPTION

This document describes differences between the 5.26.1 release and the 5.26.2
release.

If you are upgrading from an earlier release such as 5.26.0, first read
L<perl5261delta>, which describes differences between 5.26.0 and 5.26.1.

=head1 Security

=head2 [CVE-2018-6797] heap-buffer-overflow (WRITE of size 1) in S_regatom (regcomp.c)

A crafted regular expression could cause a heap buffer write overflow, with
control over the bytes written.
L<[perl #132227]|https://rt.perl.org/Public/Bug/Display.html?id=132227>

=head2 [CVE-2018-6798] Heap-buffer-overflow in Perl__byte_dump_string (utf8.c)

Matching a crafted locale dependent regular expression could cause a heap
buffer read overflow and potentially information disclosure.
L<[perl #132063]|https://rt.perl.org/Public/Bug/Display.html?id=132063>

=head2 [CVE-2018-6913] heap-buffer-overflow in S_pack_rec

C<pack()> could cause a heap buffer write overflow with a large item count.
L<[perl #131844]|https://rt.perl.org/Public/Bug/Display.html?id=131844>

=head2 Assertion failure in Perl__core_swash_init (utf8.c)

Control characters in a supposed Unicode property name could cause perl to
crash.  This has been fixed.
L<[perl #132055]|https://rt.perl.org/Public/Bug/Display.html?id=132055>
L<[perl #132553]|https://rt.perl.org/Public/Bug/Display.html?id=132553>
L<[perl #132658]|https://rt.perl.org/Public/Bug/Display.html?id=132658>

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.26.1.  If any exist,
they are bugs, and we request that you submit a report.  See L</Reporting
Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Module::CoreList> has been upgraded from version 5.20170922_26 to 5.20180414_26.

=item *

L<PerlIO::via> has been upgraded from version 0.16 to 0.17.

=item *

L<Term::ReadLine> has been upgraded from version 1.16 to 1.17.

=item *

L<Unicode::UCD> has been upgraded from version 0.68 to 0.69.

=back

=head1 Documentation

=head2 Changes to Existing Documentation

=head3 L<perluniprops>

=over 4

=item *

This has been updated to note that C<\p{Word}> now includes code points
matching the C<\p{Join_Control}> property.  The change to the property was made
in Perl 5.18, but not documented until now.  There are currently only two code
points that match this property: U+200C (ZERO WIDTH NON-JOINER) and U+200D
(ZERO WIDTH JOINER).

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item Windows

Visual C++ compiler version detection has been improved to work on non-English
language systems.
L<[perl #132421]|https://rt.perl.org/Public/Bug/Display.html?id=132421>

We now set C<$Config{libpth}> correctly for 64-bit builds using Visual C++
versions earlier than 14.1.
L<[perl #132484]|https://rt.perl.org/Public/Bug/Display.html?id=132484>

=back

=head1 Selected Bug Fixes

=over 4

=item *

The C<readpipe()> built-in function now checks at compile time that it has only
one parameter expression, and puts it in scalar context, thus ensuring that it
doesn't corrupt the stack at runtime.
L<[perl #4574]|https://rt.perl.org/Public/Bug/Display.html?id=4574>

=item *

Fixed a use after free bug in C<pp_list> introduced in Perl 5.27.1.
L<[perl #131954]|https://rt.perl.org/Public/Bug/Display.html?id=131954>

=item *

Parsing a C<sub> definition could cause a use after free if the C<sub> keyword
was followed by whitespace including newlines (and comments).
L<[perl #131836]|https://rt.perl.org/Public/Bug/Display.html?id=131836>

=item *

The tokenizer now correctly adjusts a parse pointer when skipping whitespace in
an C< ${identifier} > construct.
L<[perl #131949]|https://rt.perl.org/Public/Bug/Display.html?id=131949>

=item *

Accesses to C<${^LAST_FH}> no longer assert after using any of a variety of I/O
operations on a non-glob.
L<[perl #128263]|https://rt.perl.org/Public/Bug/Display.html?id=128263>

=item *

C<sort> now performs correct reference counting when aliasing C<$a> and C<$b>,
thus avoiding premature destruction and leakage of scalars if they are
re-aliased during execution of the sort comparator.
L<[perl #92264]|https://rt.perl.org/Public/Bug/Display.html?id=92264>

=item *

Some convoluted kinds of regexp no longer cause an arithmetic overflow when
compiled.
L<[perl #131893]|https://rt.perl.org/Public/Bug/Display.html?id=131893>

=item *

Fixed a duplicate symbol failure with B<-flto -mieee-fp> builds.  F<pp.c>
defined C<_LIB_VERSION> which B<-lieee> already defines.
L<[perl #131786]|https://rt.perl.org/Public/Bug/Display.html?id=131786>

=item *

A NULL pointer dereference in the C<S_regmatch()> function has been fixed.
L<[perl #132017]|https://rt.perl.org/Public/Bug/Display.html?id=132017>

=item *

Failures while compiling code within other constructs, such as with string
interpolation and the right part of C<s///e> now cause compilation to abort
earlier.

Previously compilation could continue in order to report other errors, but the
failed sub-parse could leave partly parsed constructs on the parser
shift-reduce stack, confusing the parser, leading to perl crashes.
L<[perl #125351]|https://rt.perl.org/Public/Bug/Display.html?id=125351>

=back

=head1 Acknowledgements

Perl 5.26.2 represents approximately 7 months of development since Perl 5.26.1
and contains approximately 3,300 lines of changes across 82 files from 17
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 1,800 lines of changes to 36 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.26.2:

Aaron Crane, Abigail, Chris 'BinGOs' Williams, H.Merijn Brand, James E Keenan,
Jarkko Hietaniemi, John SJ Anderson, Karen Etheridge, Karl Williamson, Lukas
Mai, Renee Baecker, Sawyer X, Steve Hay, Todd Rinaldo, Tony Cook, Yves Orton,
Zefram.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the perl bug database
at L<https://rt.perl.org/> .  There may also be information at
L<http://www.perl.org/> , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications which make it
inappropriate to send to a publicly archived mailing list, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION>
for details of how to report the issue.

=head1 Give Thanks

If you wish to thank the Perl 5 Porters for the work we have done in Perl 5,
you can do so by running the C<perlthanks> program:

    perlthanks

This will send an email to the Perl 5 Porters list with your show of thanks.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
-*- buffer-read-only: t -*-
!!!!!!!   DO NOT EDIT THIS FILE   !!!!!!!
This file is built by pod/perlmodlib.PL extracting documentation from the
Perl source files.
Any changes made here will be lost!

=head1 NAME

perlmodlib - constructing new Perl modules and finding existing ones

=head1 THE PERL MODULE LIBRARY

Many modules are included in the Perl distribution.  These are described
below, and all end in F<.pm>.  You may discover compiled library
files (usually ending in F<.so>) or small pieces of modules to be
autoloaded (ending in F<.al>); these were automatically generated
by the installation process.  You may also discover files in the
library directory that end in either F<.pl> or F<.ph>.  These are
old libraries supplied so that old programs that use them still
run.  The F<.pl> files will all eventually be converted into standard
modules, and the F<.ph> files made by B<h2ph> will probably end up
as extension modules made by B<h2xs>.  (Some F<.ph> values may
already be available through the POSIX, Errno, or Fcntl modules.)
The B<pl2pm> file in the distribution may help in your conversion,
but it's just a mechanical process and therefore far from bulletproof.

=head2 Pragmatic Modules

They work somewhat like compiler directives (pragmata) in that they
tend to affect the compilation of your program, and thus will usually
work well only when used within a C<use> or C<no>.  Most of these
are lexically scoped, so an inner BLOCK may countermand them
by saying:

    no integer;
    no strict 'refs';
    no warnings;

which lasts until the end of that BLOCK.

Some pragmas are lexically scoped--typically those that affect the
C<$^H> hints variable.  Others affect the current package instead,
like C<use vars> and C<use subs>, which allow you to predeclare
variables or subroutines within a particular I<file> rather than
just a block.  Such declarations are effective for the entire file
for which they were declared.  You cannot rescind them with C<no
vars> or C<no subs>.

The following pragmas are defined (and have their own documentation).

=over 12

=item arybase

Set indexing base via $[

=item attributes

Get/set subroutine or variable attributes

=item autodie

Replace functions with ones that succeed or die with lexical scope

=item autodie::exception

Exceptions from autodying functions.

=item autodie::exception::system

Exceptions from autodying system().

=item autodie::hints

Provide hints about user subroutines to autodie

=item autodie::skip

Skip a package when throwing autodie exceptions

=item autouse

Postpone load of modules until a function is used

=item base

Establish an ISA relationship with base classes at compile time

=item bigint

Transparent BigInteger support for Perl

=item bignum

Transparent BigNumber support for Perl

=item bigrat

Transparent BigNumber/BigRational support for Perl

=item blib

Use MakeMaker's uninstalled version of a package

=item bytes

Expose the individual bytes of characters

=item charnames

Access to Unicode character names and named character sequences; also define character names

=item constant

Declare constants

=item deprecate

Perl pragma for deprecating the core version of a module

=item diagnostics

Produce verbose warning diagnostics

=item encoding

Allows you to write your script in non-ASCII and non-UTF-8

=item encoding::warnings

Warn on implicit encoding conversions

=item experimental

Experimental features made easy

=item feature

Enable new features

=item fields

Compile-time class fields

=item filetest

Control the filetest permission operators

=item if

C<use> a Perl module if a condition holds (also can C<no> a module)

=item integer

Use integer arithmetic instead of floating point

=item less

Request less of something

=item lib

Manipulate @INC at compile time

=item locale

Use or avoid POSIX locales for built-in operations

=item mro

Method Resolution Order

=item ok

Alternative to Test::More::use_ok

=item open

Set default PerlIO layers for input and output

=item ops

Restrict unsafe operations when compiling

=item overload

Package for overloading Perl operations

=item overloading

Lexically control overloading

=item parent

Establish an ISA relationship with base classes at compile time

=item re

Alter regular expression behaviour

=item sigtrap

Enable simple signal handling

=item sort

Control sort() behaviour

=item strict

Restrict unsafe constructs

=item subs

Predeclare sub names

=item threads

Perl interpreter-based threads

=item threads::shared

Perl extension for sharing data structures between threads

=item utf8

Enable/disable UTF-8 (or UTF-EBCDIC) in source code

=item vars

Predeclare global variable names

=item version

Perl extension for Version Objects

=item vmsish

Control VMS-specific language features

=item warnings::register

Warnings import function


=back

=head2 Standard Modules

Standard, bundled modules are all expected to behave in a well-defined
manner with respect to namespace pollution because they use the
Exporter module.  See their own documentation for details.

It's possible that not all modules listed below are installed on your
system. For example, the GDBM_File module will not be installed if you
don't have the gdbm library.

=over 12

=item Amiga::ARexx

Perl extension for ARexx support

=item Amiga::Exec

Perl extension for low-level Amiga support

=item AnyDBM_File

Provide framework for multiple DBMs

=item App::Cpan

Easily interact with CPAN from the command line

=item App::Prove

Implements the C<prove> command.

=item App::Prove::State

State storage for the C<prove> command.

=item App::Prove::State::Result

Individual test suite results.

=item App::Prove::State::Result::Test

Individual test results.

=item Archive::Tar

Module for manipulations of tar archives

=item Archive::Tar::File

A subclass for in-memory extracted file from Archive::Tar

=item Attribute::Handlers

Simpler definition of attribute handlers

=item AutoLoader

Load subroutines only on demand

=item AutoSplit

Split a package for autoloading

=item B

The Perl Compiler Backend

=item B::Concise

Walk Perl syntax tree, printing concise info about ops

=item B::Debug

Walk Perl syntax tree, printing debug info about ops

=item B::Deparse

Perl compiler backend to produce perl code

=item B::Op_private

OP op_private flag definitions

=item B::Showlex

Show lexical variables used in functions or files

=item B::Terse

Walk Perl syntax tree, printing terse info about ops

=item B::Xref

Generates cross reference reports for Perl programs

=item Benchmark

Benchmark running times of Perl code

=item IO::Socket::IP

Family-neutral IP socket supporting both IPv4 and IPv6

=item Socket

Networking constants and support functions

=item CORE

Namespace for Perl's core routines

=item CPAN

Query, download and build perl modules from CPAN sites

=item CPAN::API::HOWTO

A recipe book for programming with CPAN.pm

=item CPAN::Debug

Internal debugging for CPAN.pm

=item CPAN::Distroprefs

Read and match distroprefs

=item CPAN::FirstTime

Utility for CPAN::Config file Initialization

=item CPAN::HandleConfig

Internal configuration handling for CPAN.pm

=item CPAN::Kwalify

Interface between CPAN.pm and Kwalify.pm

=item CPAN::Meta

The distribution metadata for a CPAN dist

=item CPAN::Meta::Converter

Convert CPAN distribution metadata structures

=item CPAN::Meta::Feature

An optional feature provided by a CPAN distribution

=item CPAN::Meta::History

History of CPAN Meta Spec changes

=item CPAN::Meta::History::Meta_1_0

Version 1.0 metadata specification for META.yml

=item CPAN::Meta::History::Meta_1_1

Version 1.1 metadata specification for META.yml

=item CPAN::Meta::History::Meta_1_2

Version 1.2 metadata specification for META.yml

=item CPAN::Meta::History::Meta_1_3

Version 1.3 metadata specification for META.yml

=item CPAN::Meta::History::Meta_1_4

Version 1.4 metadata specification for META.yml

=item CPAN::Meta::Merge

Merging CPAN Meta fragments

=item CPAN::Meta::Prereqs

A set of distribution prerequisites by phase and type

=item CPAN::Meta::Requirements

A set of version requirements for a CPAN dist

=item CPAN::Meta::Spec

Specification for CPAN distribution metadata

=item CPAN::Meta::Validator

Validate CPAN distribution metadata structures

=item CPAN::Meta::YAML

Read and write a subset of YAML for CPAN Meta files

=item CPAN::Nox

Wrapper around CPAN.pm without using any XS module

=item CPAN::Plugin

Base class for CPAN shell extensions

=item CPAN::Plugin::Specfile

Proof of concept implementation of a trivial CPAN::Plugin

=item CPAN::Queue

Internal queue support for CPAN.pm

=item CPAN::Tarzip

Internal handling of tar archives for CPAN.pm

=item CPAN::Version

Utility functions to compare CPAN versions

=item Carp

Alternative warn and die for modules

=item Class::Struct

Declare struct-like datatypes as Perl classes

=item Compress::Raw::Bzip2

Low-Level Interface to bzip2 compression library

=item Compress::Raw::Zlib

Low-Level Interface to zlib compression library

=item Compress::Zlib

Interface to zlib compression library

=item Config

Access Perl configuration information

=item Config::Perl::V

Structured data retrieval of perl -V output

=item Cwd

Get pathname of current working directory

=item DB

Programmatic interface to the Perl debugging API

=item DBM_Filter

Filter DBM keys/values 

=item DBM_Filter::compress

Filter for DBM_Filter

=item DBM_Filter::encode

Filter for DBM_Filter

=item DBM_Filter::int32

Filter for DBM_Filter

=item DBM_Filter::null

Filter for DBM_Filter

=item DBM_Filter::utf8

Filter for DBM_Filter

=item DB_File

Perl5 access to Berkeley DB version 1.x

=item Data::Dumper

Stringified perl data structures, suitable for both printing and C<eval>

=item Devel::PPPort

Perl/Pollution/Portability

=item Devel::Peek

A data debugging tool for the XS programmer

=item Devel::SelfStubber

Generate stubs for a SelfLoading module

=item Digest

Modules that calculate message digests

=item Digest::MD5

Perl interface to the MD5 Algorithm

=item Digest::SHA

Perl extension for SHA-1/224/256/384/512

=item Digest::base

Digest base class

=item Digest::file

Calculate digests of files

=item DirHandle

Supply object methods for directory handles

=item Dumpvalue

Provides screen dump of Perl data.

=item DynaLoader

Dynamically load C libraries into Perl code

=item Encode

Character encodings in Perl

=item Encode::Alias

Alias definitions to encodings

=item Encode::Byte

Single Byte Encodings

=item Encode::CJKConstants

Internally used by Encode::??::ISO_2022_*

=item Encode::CN

China-based Chinese Encodings

=item Encode::CN::HZ

Internally used by Encode::CN

=item Encode::Config

Internally used by Encode

=item Encode::EBCDIC

EBCDIC Encodings

=item Encode::Encoder

Object Oriented Encoder

=item Encode::Encoding

Encode Implementation Base Class

=item Encode::GSM0338

ETSI GSM 03.38 Encoding

=item Encode::Guess

Guesses encoding from data

=item Encode::JP

Japanese Encodings

=item Encode::JP::H2Z

Internally used by Encode::JP::2022_JP*

=item Encode::JP::JIS7

Internally used by Encode::JP

=item Encode::KR

Korean Encodings

=item Encode::KR::2022_KR

Internally used by Encode::KR

=item Encode::MIME::Header

MIME encoding for an unstructured email header

=item Encode::MIME::Name

Internally used by Encode

=item Encode::PerlIO

A detailed document on Encode and PerlIO

=item Encode::Supported

Encodings supported by Encode

=item Encode::Symbol

Symbol Encodings

=item Encode::TW

Taiwan-based Chinese Encodings

=item Encode::Unicode

Various Unicode Transformation Formats

=item Encode::Unicode::UTF7

UTF-7 encoding

=item English

Use nice English (or awk) names for ugly punctuation variables

=item Env

Perl module that imports environment variables as scalars or arrays

=item Errno

System errno constants

=item Exporter

Implements default import method for modules

=item Exporter::Heavy

Exporter guts

=item ExtUtils::CBuilder

Compile and link C code for Perl modules

=item ExtUtils::CBuilder::Platform::Windows

Builder class for Windows platforms

=item ExtUtils::Command

Utilities to replace common UNIX commands in Makefiles etc.

=item ExtUtils::Command::MM

Commands for the MM's to use in Makefiles

=item ExtUtils::Constant

Generate XS code to import C header constants

=item ExtUtils::Constant::Base

Base class for ExtUtils::Constant objects

=item ExtUtils::Constant::Utils

Helper functions for ExtUtils::Constant

=item ExtUtils::Constant::XS

Generate C code for XS modules' constants.

=item ExtUtils::Embed

Utilities for embedding Perl in C/C++ applications

=item ExtUtils::Install

Install files from here to there

=item ExtUtils::Installed

Inventory management of installed modules

=item ExtUtils::Liblist

Determine libraries to use and how to use them

=item ExtUtils::MM

OS adjusted ExtUtils::MakeMaker subclass

=item ExtUtils::MM::Utils

ExtUtils::MM methods without dependency on ExtUtils::MakeMaker

=item ExtUtils::MM_AIX

AIX specific subclass of ExtUtils::MM_Unix

=item ExtUtils::MM_Any

Platform-agnostic MM methods

=item ExtUtils::MM_BeOS

Methods to override UN*X behaviour in ExtUtils::MakeMaker

=item ExtUtils::MM_Cygwin

Methods to override UN*X behaviour in ExtUtils::MakeMaker

=item ExtUtils::MM_DOS

DOS specific subclass of ExtUtils::MM_Unix

=item ExtUtils::MM_Darwin

Special behaviors for OS X

=item ExtUtils::MM_MacOS

Once produced Makefiles for MacOS Classic

=item ExtUtils::MM_NW5

Methods to override UN*X behaviour in ExtUtils::MakeMaker

=item ExtUtils::MM_OS2

Methods to override UN*X behaviour in ExtUtils::MakeMaker

=item ExtUtils::MM_QNX

QNX specific subclass of ExtUtils::MM_Unix

=item ExtUtils::MM_UWIN

U/WIN specific subclass of ExtUtils::MM_Unix

=item ExtUtils::MM_Unix

Methods used by ExtUtils::MakeMaker

=item ExtUtils::MM_VMS

Methods to override UN*X behaviour in ExtUtils::MakeMaker

=item ExtUtils::MM_VOS

VOS specific subclass of ExtUtils::MM_Unix

=item ExtUtils::MM_Win32

Methods to override UN*X behaviour in ExtUtils::MakeMaker

=item ExtUtils::MM_Win95

Method to customize MakeMaker for Win9X

=item ExtUtils::MY

ExtUtils::MakeMaker subclass for customization

=item ExtUtils::MakeMaker

Create a module Makefile

=item ExtUtils::MakeMaker::Config

Wrapper around Config.pm

=item ExtUtils::MakeMaker::FAQ

Frequently Asked Questions About MakeMaker

=item ExtUtils::MakeMaker::Locale

Bundled Encode::Locale

=item ExtUtils::MakeMaker::Tutorial

Writing a module with MakeMaker

=item ExtUtils::Manifest

Utilities to write and check a MANIFEST file

=item ExtUtils::Miniperl

Write the C code for miniperlmain.c and perlmain.c

=item ExtUtils::Mkbootstrap

Make a bootstrap file for use by DynaLoader

=item ExtUtils::Mksymlists

Write linker options files for dynamic extension

=item ExtUtils::Packlist

Manage .packlist files

=item ExtUtils::ParseXS

Converts Perl XS code into C code

=item ExtUtils::ParseXS::Constants

Initialization values for some globals

=item ExtUtils::ParseXS::Eval

Clean package to evaluate code in

=item ExtUtils::ParseXS::Utilities

Subroutines used with ExtUtils::ParseXS

=item ExtUtils::Typemaps

Read/Write/Modify Perl/XS typemap files

=item ExtUtils::Typemaps::Cmd

Quick commands for handling typemaps

=item ExtUtils::Typemaps::InputMap

Entry in the INPUT section of a typemap

=item ExtUtils::Typemaps::OutputMap

Entry in the OUTPUT section of a typemap

=item ExtUtils::Typemaps::Type

Entry in the TYPEMAP section of a typemap

=item ExtUtils::XSSymSet

Keep sets of symbol names palatable to the VMS linker

=item ExtUtils::testlib

Add blib/* directories to @INC

=item Fatal

Replace functions with equivalents which succeed or die

=item Fcntl

Load the C Fcntl.h defines

=item File::Basename

Parse file paths into directory, filename and suffix.

=item File::Compare

Compare files or filehandles

=item File::Copy

Copy files or filehandles

=item File::DosGlob

DOS like globbing and then some

=item File::Fetch

A generic file fetching mechanism

=item File::Find

Traverse a directory tree.

=item File::Glob

Perl extension for BSD glob routine

=item File::GlobMapper

Extend File Glob to Allow Input and Output Files

=item File::Path

Create or remove directory trees

=item File::Spec

Portably perform operations on file names

=item File::Spec::AmigaOS

File::Spec for AmigaOS

=item File::Spec::Cygwin

Methods for Cygwin file specs

=item File::Spec::Epoc

Methods for Epoc file specs

=item File::Spec::Functions

Portably perform operations on file names

=item File::Spec::Mac

File::Spec for Mac OS (Classic)

=item File::Spec::OS2

Methods for OS/2 file specs

=item File::Spec::Unix

File::Spec for Unix, base for other File::Spec modules

=item File::Spec::VMS

Methods for VMS file specs

=item File::Spec::Win32

Methods for Win32 file specs

=item File::Temp

Return name and handle of a temporary file safely

=item File::stat

By-name interface to Perl's built-in stat() functions

=item FileCache

Keep more files open than the system permits

=item FileHandle

Supply object methods for filehandles

=item Filter::Simple

Simplified source filtering

=item Filter::Util::Call

Perl Source Filter Utility Module

=item FindBin

Locate directory of original perl script

=item GDBM_File

Perl5 access to the gdbm library.

=item Getopt::Long

Extended processing of command line options

=item Getopt::Std

Process single-character switches with switch clustering

=item HTTP::Tiny

A small, simple, correct HTTP/1.1 client

=item Hash::Util

A selection of general-utility hash subroutines

=item Hash::Util::FieldHash

Support for Inside-Out Classes

=item I18N::Collate

Compare 8-bit scalar data according to the current locale

=item I18N::LangTags

Functions for dealing with RFC3066-style language tags

=item I18N::LangTags::Detect

Detect the user's language preferences

=item I18N::LangTags::List

Tags and names for human languages

=item I18N::Langinfo

Query locale information

=item IO

Load various IO modules

=item IO::Compress::Base

Base Class for IO::Compress modules 

=item IO::Compress::Bzip2

Write bzip2 files/buffers

=item IO::Compress::Deflate

Write RFC 1950 files/buffers

=item IO::Compress::FAQ

Frequently Asked Questions about IO::Compress

=item IO::Compress::Gzip

Write RFC 1952 files/buffers

=item IO::Compress::RawDeflate

Write RFC 1951 files/buffers

=item IO::Compress::Zip

Write zip files/buffers

=item IO::Dir

Supply object methods for directory handles

=item IO::File

Supply object methods for filehandles

=item IO::Handle

Supply object methods for I/O handles

=item IO::Pipe

Supply object methods for pipes

=item IO::Poll

Object interface to system poll call

=item IO::Seekable

Supply seek based methods for I/O objects

=item IO::Select

OO interface to the select system call

=item IO::Socket

Object interface to socket communications

=item IO::Socket::INET

Object interface for AF_INET domain sockets

=item IO::Socket::UNIX

Object interface for AF_UNIX domain sockets

=item IO::Uncompress::AnyInflate

Uncompress zlib-based (zip, gzip) file/buffer

=item IO::Uncompress::AnyUncompress

Uncompress gzip, zip, bzip2 or lzop file/buffer

=item IO::Uncompress::Base

Base Class for IO::Uncompress modules 

=item IO::Uncompress::Bunzip2

Read bzip2 files/buffers

=item IO::Uncompress::Gunzip

Read RFC 1952 files/buffers

=item IO::Uncompress::Inflate

Read RFC 1950 files/buffers

=item IO::Uncompress::RawInflate

Read RFC 1951 files/buffers

=item IO::Uncompress::Unzip

Read zip files/buffers

=item IO::Zlib

IO:: style interface to L<Compress::Zlib>

=item IPC::Cmd

Finding and running system commands made easy

=item IPC::Msg

SysV Msg IPC object class

=item IPC::Open2

Open a process for both reading and writing using open2()

=item IPC::Open3

Open a process for reading, writing, and error handling using open3()

=item IPC::Semaphore

SysV Semaphore IPC object class

=item IPC::SharedMem

SysV Shared Memory IPC object class

=item IPC::SysV

System V IPC constants and system calls

=item Internals

Reserved special namespace for internals related functions

=item JSON::PP

JSON::XS compatible pure-Perl module.

=item JSON::PP::Boolean

Dummy module providing JSON::PP::Boolean

=item List::Util

A selection of general-utility list subroutines

=item List::Util::XS

Indicate if List::Util was compiled with a C compiler

=item Locale::Codes

A distribution of modules to handle locale codes

=item Locale::Codes::API

A description of the callable function in each module

=item Locale::Codes::Changes

Details changes to Locale::Codes

=item Locale::Codes::Country

Standard codes for country identification

=item Locale::Codes::Currency

Standard codes for currency identification

=item Locale::Codes::LangExt

Standard codes for language extension identification

=item Locale::Codes::LangFam

Standard codes for language family identification

=item Locale::Codes::LangVar

Standard codes for language variation identification

=item Locale::Codes::Language

Standard codes for language identification

=item Locale::Codes::Script

Standard codes for script identification

=item Locale::Country

Standard codes for country identification

=item Locale::Currency

Standard codes for currency identification

=item Locale::Language

Standard codes for language identification

=item Locale::Maketext

Framework for localization

=item Locale::Maketext::Cookbook

Recipes for using Locale::Maketext

=item Locale::Maketext::Guts

Deprecated module to load Locale::Maketext utf8 code

=item Locale::Maketext::GutsLoader

Deprecated module to load Locale::Maketext utf8 code

=item Locale::Maketext::Simple

Simple interface to Locale::Maketext::Lexicon

=item Locale::Maketext::TPJ13

Article about software localization

=item Locale::Script

Standard codes for script identification

=item MIME::Base64

Encoding and decoding of base64 strings

=item MIME::QuotedPrint

Encoding and decoding of quoted-printable strings

=item Math::BigFloat

Arbitrary size floating point math package

=item Math::BigInt

Arbitrary size integer/float math package

=item Math::BigInt::Calc

Pure Perl module to support Math::BigInt

=item Math::BigInt::CalcEmu

Emulate low-level math with BigInt code

=item Math::BigInt::FastCalc

Math::BigInt::Calc with some XS for more speed

=item Math::BigInt::Lib

Virtual parent class for Math::BigInt libraries

=item Math::BigRat

Arbitrary big rational numbers

=item Math::Complex

Complex numbers and associated mathematical functions

=item Math::Trig

Trigonometric functions

=item Memoize

Make functions faster by trading space for time

=item Memoize::AnyDBM_File

Glue to provide EXISTS for AnyDBM_File for Storable use

=item Memoize::Expire

Plug-in module for automatic expiration of memoized values

=item Memoize::ExpireFile

Test for Memoize expiration semantics

=item Memoize::ExpireTest

Test for Memoize expiration semantics

=item Memoize::NDBM_File

Glue to provide EXISTS for NDBM_File for Storable use

=item Memoize::SDBM_File

Glue to provide EXISTS for SDBM_File for Storable use

=item Memoize::Storable

Store Memoized data in Storable database

=item Module::CoreList

What modules shipped with versions of perl

=item Module::CoreList::Utils

What utilities shipped with versions of perl

=item Module::Load

Runtime require of both modules and files

=item Module::Load::Conditional

Looking up module information / loading at runtime

=item Module::Loaded

Mark modules as loaded or unloaded

=item Module::Metadata

Gather package and POD information from perl module files

=item NDBM_File

Tied access to ndbm files

=item NEXT

Provide a pseudo-class NEXT (et al) that allows method redispatch

=item Net::Cmd

Network Command class (as used by FTP, SMTP etc)

=item Net::Config

Local configuration data for libnet

=item Net::Domain

Attempt to evaluate the current host's internet name and domain

=item Net::FTP

FTP Client class

=item Net::FTP::dataconn

FTP Client data connection class

=item Net::NNTP

NNTP Client class

=item Net::Netrc

OO interface to user's netrc file

=item Net::POP3

Post Office Protocol 3 Client class (RFC1939)

=item Net::Ping

Check a remote host for reachability

=item Net::SMTP

Simple Mail Transfer Protocol Client

=item Net::Time

Time and daytime network client interface

=item Net::hostent

By-name interface to Perl's built-in gethost*() functions

=item Net::libnetFAQ

Libnet Frequently Asked Questions

=item Net::netent

By-name interface to Perl's built-in getnet*() functions

=item Net::protoent

By-name interface to Perl's built-in getproto*() functions

=item Net::servent

By-name interface to Perl's built-in getserv*() functions

=item O

Generic interface to Perl Compiler backends

=item ODBM_File

Tied access to odbm files

=item Opcode

Disable named opcodes when compiling perl code

=item POSIX

Perl interface to IEEE Std 1003.1

=item Params::Check

A generic input parsing/checking mechanism.

=item Parse::CPAN::Meta

Parse META.yml and META.json CPAN metadata files

=item Perl::OSType

Map Perl operating system names to generic types

=item PerlIO

On demand loader for PerlIO layers and root of PerlIO::* name space

=item PerlIO::encoding

Encoding layer

=item PerlIO::mmap

Memory mapped IO

=item PerlIO::scalar

In-memory IO, scalar IO

=item PerlIO::via

Helper class for PerlIO layers implemented in perl

=item PerlIO::via::QuotedPrint

PerlIO layer for quoted-printable strings

=item Pod::Checker

Check pod documents for syntax errors

=item Pod::Escapes

For resolving Pod EE<lt>...E<gt> sequences

=item Pod::Find

Find POD documents in directory trees

=item Pod::Functions

Group Perl's functions a la perlfunc.pod

=item Pod::Html

Module to convert pod files to HTML

=item Pod::InputObjects

Objects representing POD input paragraphs, commands, etc.

=item Pod::Man

Convert POD data to formatted *roff input

=item Pod::ParseLink

Parse an LE<lt>E<gt> formatting code in POD text

=item Pod::ParseUtils

Helpers for POD parsing and conversion

=item Pod::Parser

Base class for creating POD filters and translators

=item Pod::Perldoc

Look up Perl documentation in Pod format.

=item Pod::Perldoc::BaseTo

Base for Pod::Perldoc formatters

=item Pod::Perldoc::GetOptsOO

Customized option parser for Pod::Perldoc

=item Pod::Perldoc::ToANSI

Render Pod with ANSI color escapes 

=item Pod::Perldoc::ToChecker

Let Perldoc check Pod for errors

=item Pod::Perldoc::ToMan

Let Perldoc render Pod as man pages

=item Pod::Perldoc::ToNroff

Let Perldoc convert Pod to nroff

=item Pod::Perldoc::ToPod

Let Perldoc render Pod as ... Pod!

=item Pod::Perldoc::ToRtf

Let Perldoc render Pod as RTF

=item Pod::Perldoc::ToTerm

Render Pod with terminal escapes

=item Pod::Perldoc::ToText

Let Perldoc render Pod as plaintext

=item Pod::Perldoc::ToTk

Let Perldoc use Tk::Pod to render Pod

=item Pod::Perldoc::ToXml

Let Perldoc render Pod as XML

=item Pod::PlainText

Convert POD data to formatted ASCII text

=item Pod::Select

Extract selected sections of POD from input

=item Pod::Simple

Framework for parsing Pod

=item Pod::Simple::Checker

Check the Pod syntax of a document

=item Pod::Simple::Debug

Put Pod::Simple into trace/debug mode

=item Pod::Simple::DumpAsText

Dump Pod-parsing events as text

=item Pod::Simple::DumpAsXML

Turn Pod into XML

=item Pod::Simple::HTML

Convert Pod to HTML

=item Pod::Simple::HTMLBatch

Convert several Pod files to several HTML files

=item Pod::Simple::LinkSection

Represent "section" attributes of L codes

=item Pod::Simple::Methody

Turn Pod::Simple events into method calls

=item Pod::Simple::PullParser

A pull-parser interface to parsing Pod

=item Pod::Simple::PullParserEndToken

End-tokens from Pod::Simple::PullParser

=item Pod::Simple::PullParserStartToken

Start-tokens from Pod::Simple::PullParser

=item Pod::Simple::PullParserTextToken

Text-tokens from Pod::Simple::PullParser

=item Pod::Simple::PullParserToken

Tokens from Pod::Simple::PullParser

=item Pod::Simple::RTF

Format Pod as RTF

=item Pod::Simple::Search

Find POD documents in directory trees

=item Pod::Simple::SimpleTree

Parse Pod into a simple parse tree 

=item Pod::Simple::Subclassing

Write a formatter as a Pod::Simple subclass

=item Pod::Simple::Text

Format Pod as plaintext

=item Pod::Simple::TextContent

Get the text content of Pod

=item Pod::Simple::XHTML

Format Pod as validating XHTML

=item Pod::Simple::XMLOutStream

Turn Pod into XML

=item Pod::Text

Convert POD data to formatted text

=item Pod::Text::Color

Convert POD data to formatted color ASCII text

=item Pod::Text::Termcap

Convert POD data to ASCII text with format escapes

=item Pod::Usage

Print a usage message from embedded pod documentation

=item SDBM_File

Tied access to sdbm files

=item Safe

Compile and execute code in restricted compartments

=item Scalar::Util

A selection of general-utility scalar subroutines

=item Search::Dict

Look - search for key in dictionary file

=item SelectSaver

Save and restore selected file handle

=item SelfLoader

Load functions only on demand

=item Storable

Persistence for Perl data structures

=item Sub::Util

A selection of utility subroutines for subs and CODE references

=item Symbol

Manipulate Perl symbols and their names

=item Sys::Hostname

Try every conceivable way to get hostname

=item Sys::Syslog

Perl interface to the UNIX syslog(3) calls

=item Sys::Syslog::Win32

Win32 support for Sys::Syslog

=item TAP::Base

Base class that provides common functionality to L<TAP::Parser>

=item TAP::Formatter::Base

Base class for harness output delegates

=item TAP::Formatter::Color

Run Perl test scripts with color

=item TAP::Formatter::Console

Harness output delegate for default console output

=item TAP::Formatter::Console::ParallelSession

Harness output delegate for parallel console output

=item TAP::Formatter::Console::Session

Harness output delegate for default console output

=item TAP::Formatter::File

Harness output delegate for file output

=item TAP::Formatter::File::Session

Harness output delegate for file output

=item TAP::Formatter::Session

Abstract base class for harness output delegate 

=item TAP::Harness

Run test scripts with statistics

=item TAP::Harness::Env

Parsing harness related environmental variables where appropriate

=item TAP::Object

Base class that provides common functionality to all C<TAP::*> modules

=item TAP::Parser

Parse L<TAP|Test::Harness::TAP> output

=item TAP::Parser::Aggregator

Aggregate TAP::Parser results

=item TAP::Parser::Grammar

A grammar for the Test Anything Protocol.

=item TAP::Parser::Iterator

Base class for TAP source iterators

=item TAP::Parser::Iterator::Array

Iterator for array-based TAP sources

=item TAP::Parser::Iterator::Process

Iterator for process-based TAP sources

=item TAP::Parser::Iterator::Stream

Iterator for filehandle-based TAP sources

=item TAP::Parser::IteratorFactory

Figures out which SourceHandler objects to use for a given Source

=item TAP::Parser::Multiplexer

Multiplex multiple TAP::Parsers

=item TAP::Parser::Result

Base class for TAP::Parser output objects

=item TAP::Parser::Result::Bailout

Bailout result token.

=item TAP::Parser::Result::Comment

Comment result token.

=item TAP::Parser::Result::Plan

Plan result token.

=item TAP::Parser::Result::Pragma

TAP pragma token.

=item TAP::Parser::Result::Test

Test result token.

=item TAP::Parser::Result::Unknown

Unknown result token.

=item TAP::Parser::Result::Version

TAP syntax version token.

=item TAP::Parser::Result::YAML

YAML result token.

=item TAP::Parser::ResultFactory

Factory for creating TAP::Parser output objects

=item TAP::Parser::Scheduler

Schedule tests during parallel testing

=item TAP::Parser::Scheduler::Job

A single testing job.

=item TAP::Parser::Scheduler::Spinner

A no-op job.

=item TAP::Parser::Source

A TAP source & meta data about it

=item TAP::Parser::SourceHandler

Base class for different TAP source handlers

=item TAP::Parser::SourceHandler::Executable

Stream output from an executable TAP source

=item TAP::Parser::SourceHandler::File

Stream TAP from a text file.

=item TAP::Parser::SourceHandler::Handle

Stream TAP from an IO::Handle or a GLOB.

=item TAP::Parser::SourceHandler::Perl

Stream TAP from a Perl executable

=item TAP::Parser::SourceHandler::RawTAP

Stream output from raw TAP in a scalar/array ref.

=item TAP::Parser::YAMLish::Reader

Read YAMLish data from iterator

=item TAP::Parser::YAMLish::Writer

Write YAMLish data

=item Term::ANSIColor

Color screen output using ANSI escape sequences

=item Term::Cap

Perl termcap interface

=item Term::Complete

Perl word completion module

=item Term::ReadLine

Perl interface to various C<readline> packages.

=item Test

Provides a simple framework for writing test scripts

=item Test2

Framework for writing test tools that all work together.

=item Test2::API

Primary interface for writing Test2 based testing tools.

=item Test2::API::Breakage

What breaks at what version

=item Test2::API::Context

Object to represent a testing context.

=item Test2::API::Instance

Object used by Test2::API under the hood

=item Test2::API::Stack

Object to manage a stack of L<Test2::Hub>

=item Test2::Event

Base class for events

=item Test2::Event::Bail

Bailout!

=item Test2::Event::Diag

Diag event type

=item Test2::Event::Encoding

Set the encoding for the output stream

=item Test2::Event::Exception

Exception event

=item Test2::Event::Generic

Generic event type.

=item Test2::Event::Info

Info event base class

=item Test2::Event::Note

Note event type

=item Test2::Event::Ok

Ok event type

=item Test2::Event::Plan

The event of a plan

=item Test2::Event::Skip

Skip event type

=item Test2::Event::Subtest

Event for subtest types

=item Test2::Event::TAP::Version

Event for TAP version.

=item Test2::Event::Waiting

Tell all procs/threads it is time to be done

=item Test2::Formatter

Namespace for formatters.

=item Test2::Formatter::TAP

Standard TAP formatter

=item Test2::Hub

The conduit through which all events flow.

=item Test2::Hub::Interceptor

Hub used by interceptor to grab results.

=item Test2::Hub::Interceptor::Terminator

Exception class used by

=item Test2::Hub::Subtest

Hub used by subtests

=item Test2::IPC

Turn on IPC for threading or forking support.

=item Test2::IPC::Driver

Base class for Test2 IPC drivers.

=item Test2::IPC::Driver::Files

Temp dir + Files concurrency model.

=item Test2::Tools::Tiny

Tiny set of tools for unfortunate souls who cannot use

=item Test2::Transition

Transition notes when upgrading to Test2

=item Test2::Util

Tools used by Test2 and friends.

=item Test2::Util::ExternalMeta

Allow third party tools to safely attach meta-data

=item Test2::Util::HashBase

Build hash based classes.

=item Test2::Util::Trace

Debug information for events

=item Test::Builder

Backend for building test libraries

=item Test::Builder::Formatter

Test::Builder subclass of Test2::Formatter::TAP

=item Test::Builder::IO::Scalar

A copy of IO::Scalar for Test::Builder

=item Test::Builder::Module

Base class for test modules

=item Test::Builder::Tester

Test testsuites that have been built with

=item Test::Builder::Tester::Color

Turn on colour in Test::Builder::Tester

=item Test::Builder::TodoDiag

Test::Builder subclass of Test2::Event::Diag

=item Test::Harness

Run Perl standard test scripts with statistics

=item Test::Harness::Beyond

Beyond make test

=item Test::More

Yet another framework for writing test scripts

=item Test::Simple

Basic utilities for writing tests.

=item Test::Tester

Ease testing test modules built with Test::Builder

=item Test::Tester::Capture

Help testing test modules built with Test::Builder

=item Test::Tester::CaptureRunner

Help testing test modules built with Test::Builder

=item Test::Tutorial

A tutorial about writing really basic tests

=item Test::use::ok

Alternative to Test::More::use_ok

=item Text::Abbrev

Abbrev - create an abbreviation table from a list

=item Text::Balanced

Extract delimited text sequences from strings.

=item Text::ParseWords

Parse text into an array of tokens or array of arrays

=item Text::Tabs

Expand and unexpand tabs like unix expand(1) and unexpand(1)

=item Text::Wrap

Line wrapping to form simple paragraphs

=item Thread

Manipulate threads in Perl (for old code only)

=item Thread::Queue

Thread-safe queues

=item Thread::Semaphore

Thread-safe semaphores

=item Tie::Array

Base class for tied arrays

=item Tie::File

Access the lines of a disk file via a Perl array

=item Tie::Handle

Base class definitions for tied handles

=item Tie::Hash

Base class definitions for tied hashes

=item Tie::Hash::NamedCapture

Named regexp capture buffers

=item Tie::Memoize

Add data to hash when needed

=item Tie::RefHash

Use references as hash keys

=item Tie::Scalar

Base class definitions for tied scalars

=item Tie::StdHandle

Base class definitions for tied handles

=item Tie::SubstrHash

Fixed-table-size, fixed-key-length hashing

=item Time::HiRes

High resolution alarm, sleep, gettimeofday, interval timers

=item Time::Local

Efficiently compute time from local and GMT time

=item Time::Piece

Object Oriented time objects

=item Time::Seconds

A simple API to convert seconds to other date values

=item Time::gmtime

By-name interface to Perl's built-in gmtime() function

=item Time::localtime

By-name interface to Perl's built-in localtime() function

=item Time::tm

Internal object used by Time::gmtime and Time::localtime

=item UNIVERSAL

Base class for ALL classes (blessed references)

=item Unicode::Collate

Unicode Collation Algorithm

=item Unicode::Collate::CJK::Big5

Weighting CJK Unified Ideographs

=item Unicode::Collate::CJK::GB2312

Weighting CJK Unified Ideographs

=item Unicode::Collate::CJK::JISX0208

Weighting JIS KANJI for Unicode::Collate

=item Unicode::Collate::CJK::Korean

Weighting CJK Unified Ideographs

=item Unicode::Collate::CJK::Pinyin

Weighting CJK Unified Ideographs

=item Unicode::Collate::CJK::Stroke

Weighting CJK Unified Ideographs

=item Unicode::Collate::CJK::Zhuyin

Weighting CJK Unified Ideographs

=item Unicode::Collate::Locale

Linguistic tailoring for DUCET via Unicode::Collate

=item Unicode::Normalize

Unicode Normalization Forms

=item Unicode::UCD

Unicode character database

=item User::grent

By-name interface to Perl's built-in getgr*() functions

=item User::pwent

By-name interface to Perl's built-in getpw*() functions

=item VMS::DCLsym

Perl extension to manipulate DCL symbols

=item VMS::Filespec

Convert between VMS and Unix file specification syntax

=item VMS::Stdio

Standard I/O functions via VMS extensions

=item Win32

Interfaces to some Win32 API Functions

=item Win32API::File

Low-level access to Win32 system API calls for files/dirs.

=item Win32CORE

Win32 CORE function stubs

=item XS::APItest

Test the perl C API

=item XS::Typemap

Module to test the XS typemaps distributed with perl

=item XSLoader

Dynamically load C libraries into Perl code

=item autodie::Scope::Guard

Wrapper class for calling subs at end of scope

=item autodie::Scope::GuardStack

Hook stack for managing scopes via %^H

=item autodie::Util

Internal Utility subroutines for autodie and Fatal

=item version::Internals

Perl extension for Version Objects


=back

To find out I<all> modules installed on your system, including
those without documentation or outside the standard release,
just use the following command (under the default win32 shell,
double quotes should be used instead of single quotes).

    % perl -MFile::Find=find -MFile::Spec::Functions -Tlwe \
      'find { wanted => sub { print canonpath $_ if /\.pm\z/ },
      no_chdir => 1 }, @INC'

(The -T is here to prevent '.' from being listed in @INC.)
They should all have their own documentation installed and accessible
via your system man(1) command.  If you do not have a B<find>
program, you can use the Perl B<find2perl> program instead, which
generates Perl code as output you can run through perl.  If you
have a B<man> program but it doesn't find your modules, you'll have
to fix your manpath.  See L<perl> for details.  If you have no
system B<man> command, you might try the B<perldoc> program.

Note also that the command C<perldoc perllocal> gives you a (possibly
incomplete) list of the modules that have been further installed on
your system. (The perllocal.pod file is updated by the standard MakeMaker
install process.)
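
The one-liner above can also be written as a short script, which sidesteps
shell quoting differences.  This is only a minimal sketch: the
C<installed_modules> helper name is an assumption made for illustration,
not part of any core module, and it silently skips C<@INC> entries that are
code references or missing directories.

```perl
use strict;
use warnings;
use File::Find qw(find);
use File::Spec::Functions qw(canonpath);

# Hypothetical helper: return the canonical paths of all *.pm files
# found under the given directories.
sub installed_modules {
    my @dirs = grep { !ref($_) && -d $_ } @_;   # skip hooks and missing dirs
    my @found;
    return @found unless @dirs;
    find(
        {
            # With no_chdir set, $_ holds the full path of each file.
            wanted   => sub { push @found, canonpath($_) if /\.pm\z/ },
            no_chdir => 1,
        },
        @dirs,
    );
    return @found;
}

print "$_\n" for installed_modules(@INC);
```

A module that is reachable through more than one C<@INC> entry will be
listed more than once.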

=head2 Extension Modules

Extension modules are written in C (or a mix of Perl and C).  They
are usually dynamically loaded into Perl if and when you need them,
but may also be linked in statically.  Supported extension modules
include Socket, Fcntl, and POSIX.
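
From the caller's point of view an extension module is used like any other
module.  The following sketch assumes a perl built with the POSIX and Fcntl
extensions available, which is the usual case:

```perl
use strict;
use warnings;
use POSIX qw(floor ceil);          # functions implemented in C
use Fcntl qw(:flock O_RDONLY);     # constants supplied by the C headers

# The "use" lines above load the compiled parts on demand; nothing else
# is needed before calling into them.
printf "floor(2.7) = %d, ceil(2.2) = %d\n", floor(2.7), ceil(2.2);
printf "LOCK_EX = %d, O_RDONLY = %d\n", LOCK_EX, O_RDONLY;
```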

Many popular C extension modules do not come bundled (at least, not
completely) due to their sizes, volatility, or simply lack of time
for adequate testing and configuration across the multitude of
platforms on which Perl was beta-tested.  You are encouraged to
look for them on CPAN (described below), or using web search engines
like Alta Vista or Google.

=head1 CPAN

CPAN stands for Comprehensive Perl Archive Network; it's a globally
replicated trove of Perl materials, including documentation, style
guides, tricks and traps, alternate ports to non-Unix systems and
occasional binary distributions for these.  Search engines for
CPAN can be found at http://www.cpan.org/

Most importantly, CPAN includes around a thousand unbundled modules,
some of which require a C compiler to build.  Major categories of
modules are:

=over

=item *

Language Extensions and Documentation Tools

=item *

Development Support

=item *

Operating System Interfaces

=item *

Networking, Device Control (modems) and InterProcess Communication

=item *

Data Types and Data Type Utilities

=item *

Database Interfaces

=item *

User Interfaces

=item *

Interfaces to / Emulations of Other Programming Languages

=item *

File Names, File Systems and File Locking (see also File Handles)

=item *

String Processing, Language Text Processing, Parsing, and Searching

=item *

Option, Argument, Parameter, and Configuration File Processing

=item *

Internationalization and Locale

=item *

Authentication, Security, and Encryption

=item *

World Wide Web, HTML, HTTP, CGI, MIME

=item *

Server and Daemon Utilities

=item *

Archiving and Compression

=item *

Images, Pixmap and Bitmap Manipulation, Drawing, and Graphing

=item *

Mail and Usenet News

=item *

Control Flow Utilities (callbacks and exceptions etc)

=item *

File Handle and Input/Output Stream Utilities

=item *

Miscellaneous Modules

=back

The list of the registered CPAN sites follows.
Please note that the sorting order is alphabetical on fields:

    Continent
       |
       |-->Country
             |
             |-->[state/province]
                       |
                       |-->ftp
                       |
                       |-->[http]

and thus the North American servers happen to be listed between the
European and the Oceanian sites.

Registered CPAN sites

=for maintainers
Generated by Porting/make_modlib_cpan.pl

=head2 Africa

=over 4

=item South Africa

  http://mirror.is.co.za/pub/cpan/
  ftp://ftp.is.co.za/pub/cpan/
  http://cpan.mirror.ac.za/
  ftp://cpan.mirror.ac.za/
  http://cpan.saix.net/
  ftp://ftp.saix.net/pub/CPAN/
  http://ftp.wa.co.za/pub/CPAN/
  ftp://ftp.wa.co.za/pub/CPAN/

=item Uganda

  http://mirror.ucu.ac.ug/cpan/

=item Zimbabwe

  http://mirror.zol.co.zw/CPAN/
  ftp://mirror.zol.co.zw/CPAN/

=back

=head2 Asia

=over 4

=item Bangladesh

  http://mirror.dhakacom.com/CPAN/
  ftp://mirror.dhakacom.com/CPAN/

=item China

  http://cpan.communilink.net/
  http://ftp.cuhk.edu.hk/pub/packages/perl/CPAN/
  ftp://ftp.cuhk.edu.hk/pub/packages/perl/CPAN/
  http://mirrors.hust.edu.cn/CPAN/
  http://mirrors.neusoft.edu.cn/cpan/
  http://mirror.lzu.edu.cn/CPAN/
  http://mirrors.163.com/cpan/
  http://mirrors.sohu.com/CPAN/
  http://mirrors.ustc.edu.cn/CPAN/
  ftp://mirrors.ustc.edu.cn/CPAN/
  http://mirrors.xmu.edu.cn/CPAN/
  ftp://mirrors.xmu.edu.cn/CPAN/
  http://mirrors.zju.edu.cn/CPAN/

=item India

  http://cpan.excellmedia.net/
  http://perlmirror.indialinks.com/

=item Indonesia

  http://kambing.ui.ac.id/cpan/
  http://cpan.pesat.net.id/
  http://mirror.poliwangi.ac.id/CPAN/
  http://kartolo.sby.datautama.net.id/CPAN/
  http://mirror.wanxp.id/cpan/

=item Iran

  http://mirror.yazd.ac.ir/cpan/

=item Israel

  http://biocourse.weizmann.ac.il/CPAN/

=item Japan

  http://ftp.jaist.ac.jp/pub/CPAN/
  ftp://ftp.jaist.ac.jp/pub/CPAN/
  http://mirror.jre655.com/CPAN/
  ftp://mirror.jre655.com/CPAN/
  ftp://ftp.kddilabs.jp/CPAN/
  http://ftp.nara.wide.ad.jp/pub/CPAN/
  ftp://ftp.nara.wide.ad.jp/pub/CPAN/
  http://ftp.riken.jp/lang/CPAN/
  ftp://ftp.riken.jp/lang/CPAN/
  ftp://ftp.u-aizu.ac.jp/pub/CPAN/
  http://ftp.yz.yamagata-u.ac.jp/pub/lang/cpan/
  ftp://ftp.yz.yamagata-u.ac.jp/pub/lang/cpan/

=item Kazakhstan

  http://mirror.neolabs.kz/CPAN/
  ftp://mirror.neolabs.kz/CPAN/

=item Philippines

  http://mirror.pregi.net/CPAN/
  ftp://mirror.pregi.net/CPAN/
  http://mirror.rise.ph/cpan/
  ftp://mirror.rise.ph/cpan/

=item Qatar

  http://mirror.qnren.qa/CPAN/
  ftp://mirror.qnren.qa/CPAN/

=item Republic of Korea

  http://cpan.mirror.cdnetworks.com/
  ftp://cpan.mirror.cdnetworks.com/CPAN/
  http://ftp.kaist.ac.kr/pub/CPAN/
  ftp://ftp.kaist.ac.kr/CPAN/
  http://ftp.kr.freebsd.org/pub/CPAN/
  ftp://ftp.kr.freebsd.org/pub/CPAN/
  http://mirror.navercorp.com/CPAN/
  http://ftp.neowiz.com/CPAN/
  ftp://ftp.neowiz.com/CPAN/

=item Singapore

  http://cpan.mirror.choon.net/
  http://mirror.0x.sg/CPAN/
  ftp://mirror.0x.sg/CPAN/

=item Taiwan

  http://cpan.cdpa.nsysu.edu.tw/Unix/Lang/CPAN/
  ftp://cpan.cdpa.nsysu.edu.tw/Unix/Lang/CPAN/
  http://cpan.stu.edu.tw/
  ftp://ftp.stu.edu.tw/CPAN/
  http://ftp.yzu.edu.tw/CPAN/
  ftp://ftp.yzu.edu.tw/CPAN/
  http://cpan.nctu.edu.tw/
  ftp://cpan.nctu.edu.tw/
  http://ftp.ubuntu-tw.org/mirror/CPAN/
  ftp://ftp.ubuntu-tw.org/mirror/CPAN/

=item Turkey

  http://cpan.ulak.net.tr/
  ftp://ftp.ulak.net.tr/pub/perl/CPAN/
  http://mirror.vit.com.tr/mirror/CPAN/
  ftp://mirror.vit.com.tr/CPAN/

=item Viet Nam

  http://mirrors.digipower.vn/CPAN/
  http://mirror.downloadvn.com/cpan/
  http://mirrors.vinahost.vn/CPAN/

=back

=head2 Europe

=over 4

=item Austria

  http://cpan.inode.at/
  ftp://cpan.inode.at/
  http://mirror.easyname.at/cpan/
  ftp://mirror.easyname.at/cpan/
  http://gd.tuwien.ac.at/languages/perl/CPAN/
  ftp://gd.tuwien.ac.at/pub/CPAN/

=item Belarus

  http://ftp.byfly.by/pub/CPAN/
  ftp://ftp.byfly.by/pub/CPAN/
  http://mirror.datacenter.by/pub/CPAN/
  ftp://mirror.datacenter.by/pub/CPAN/

=item Belgium

  http://ftp.belnet.be/ftp.cpan.org/
  ftp://ftp.belnet.be/mirror/ftp.cpan.org/
  http://cpan.cu.be/
  http://lib.ugent.be/CPAN/
  http://cpan.weepeetelecom.be/

=item Bosnia and Herzegovina

  http://cpan.mirror.ba/
  ftp://ftp.mirror.ba/CPAN/

=item Bulgaria

  http://mirrors.neterra.net/CPAN/
  ftp://mirrors.neterra.net/CPAN/
  http://mirrors.netix.net/CPAN/
  ftp://mirrors.netix.net/CPAN/

=item Croatia

  http://ftp.carnet.hr/pub/CPAN/
  ftp://ftp.carnet.hr/pub/CPAN/

=item Czech Republic

  http://mirror.dkm.cz/cpan/
  ftp://mirror.dkm.cz/cpan/
  ftp://ftp.fi.muni.cz/pub/CPAN/
  http://mirrors.nic.cz/CPAN/
  ftp://mirrors.nic.cz/pub/CPAN/
  http://cpan.mirror.vutbr.cz/
  ftp://mirror.vutbr.cz/cpan/

=item Denmark

  http://www.cpan.dk/
  http://mirrors.dotsrc.org/cpan/
  ftp://mirrors.dotsrc.org/cpan/

=item Finland

  ftp://ftp.funet.fi/pub/languages/perl/CPAN/

=item France

  http://ftp.ciril.fr/pub/cpan/
  ftp://ftp.ciril.fr/pub/cpan/
  http://distrib-coffee.ipsl.jussieu.fr/pub/mirrors/cpan/
  ftp://distrib-coffee.ipsl.jussieu.fr/pub/mirrors/cpan/
  http://ftp.lip6.fr/pub/perl/CPAN/
  ftp://ftp.lip6.fr/pub/perl/CPAN/
  http://mirror.ibcp.fr/pub/CPAN/
  ftp://ftp.oleane.net/pub/CPAN/
  http://cpan.mirrors.ovh.net/ftp.cpan.org/
  ftp://cpan.mirrors.ovh.net/ftp.cpan.org/
  http://cpan.enstimac.fr/

=item Germany

  http://mirror.23media.de/cpan/
  ftp://mirror.23media.de/cpan/
  http://artfiles.org/cpan.org/
  ftp://artfiles.org/cpan.org/
  http://mirror.bibleonline.ru/cpan/
  http://mirror.checkdomain.de/CPAN/
  ftp://mirror.checkdomain.de/CPAN/
  http://cpan.noris.de/
  http://mirror.de.leaseweb.net/CPAN/
  ftp://mirror.de.leaseweb.net/CPAN/
  http://cpan.mirror.euserv.net/
  ftp://mirror.euserv.net/cpan/
  http://ftp-stud.hs-esslingen.de/pub/Mirrors/CPAN/
  ftp://mirror.fraunhofer.de/CPAN/
  ftp://ftp.freenet.de/pub/ftp.cpan.org/pub/CPAN/
  http://ftp.hosteurope.de/pub/CPAN/
  ftp://ftp.hosteurope.de/pub/CPAN/
  ftp://ftp.fu-berlin.de/unix/languages/perl/
  http://ftp.gwdg.de/pub/languages/perl/CPAN/
  ftp://ftp.gwdg.de/pub/languages/perl/CPAN/
  http://ftp.hawo.stw.uni-erlangen.de/CPAN/
  ftp://ftp.hawo.stw.uni-erlangen.de/CPAN/
  http://cpan.mirror.iphh.net/
  ftp://cpan.mirror.iphh.net/pub/CPAN/
  ftp://ftp.mpi-inf.mpg.de/pub/perl/CPAN/
  http://cpan.netbet.org/
  http://mirror.netcologne.de/cpan/
  ftp://mirror.netcologne.de/cpan/
  ftp://mirror.petamem.com/CPAN/
  http://www.planet-elektronik.de/CPAN/
  http://ftp.halifax.rwth-aachen.de/cpan/
  ftp://ftp.halifax.rwth-aachen.de/cpan/
  http://mirror.softaculous.com/cpan/
  http://ftp.u-tx.net/CPAN/
  ftp://ftp.u-tx.net/CPAN/
  http://mirror.reismil.ch/CPAN/

=item Greece

  http://cpan.cc.uoc.gr/mirrors/CPAN/
  ftp://ftp.cc.uoc.gr/mirrors/CPAN/
  http://ftp.ntua.gr/pub/lang/perl/
  ftp://ftp.ntua.gr/pub/lang/perl/

=item Hungary

  http://mirror.met.hu/CPAN/

=item Ireland

  http://ftp.heanet.ie/mirrors/ftp.perl.org/pub/CPAN/
  ftp://ftp.heanet.ie/mirrors/ftp.perl.org/pub/CPAN/

=item Italy

  http://bo.mirror.garr.it/mirrors/CPAN/
  ftp://ftp.eutelia.it/CPAN_Mirror/
  http://cpan.panu.it/
  ftp://ftp.panu.it/pub/mirrors/perl/CPAN/
  http://cpan.muzzy.it/

=item Latvia

  http://kvin.lv/pub/CPAN/

=item Lithuania

  http://ftp.litnet.lt/pub/CPAN/
  ftp://ftp.litnet.lt/pub/CPAN/

=item Moldova

  http://mirror.as43289.net/pub/CPAN/
  ftp://mirror.as43289.net/pub/CPAN/

=item Netherlands

  http://cpan.cs.uu.nl/
  ftp://ftp.cs.uu.nl/pub/CPAN/
  http://mirror.nl.leaseweb.net/CPAN/
  ftp://mirror.nl.leaseweb.net/CPAN/
  http://ftp.nluug.nl/languages/perl/CPAN/
  ftp://ftp.nluug.nl/pub/languages/perl/CPAN/
  http://mirror.transip.net/CPAN/
  ftp://mirror.transip.net/CPAN/
  http://cpan.mirror.triple-it.nl/
  http://ftp.tudelft.nl/cpan/
  ftp://ftp.tudelft.nl/pub/CPAN/
  ftp://download.xs4all.nl/pub/mirror/CPAN/

=item Norway

  http://cpan.uib.no/
  ftp://cpan.uib.no/pub/CPAN/
  ftp://ftp.uninett.no/pub/languages/perl/CPAN/
  http://cpan.vianett.no/

=item Poland

  http://ftp.agh.edu.pl/CPAN/
  ftp://ftp.agh.edu.pl/CPAN/
  http://ftp.piotrkosoft.net/pub/mirrors/CPAN/
  ftp://ftp.piotrkosoft.net/pub/mirrors/CPAN/
  ftp://ftp.ps.pl/pub/CPAN/
  http://sunsite.icm.edu.pl/pub/CPAN/
  ftp://sunsite.icm.edu.pl/pub/CPAN/

=item Portugal

  http://cpan.dcc.fc.up.pt/
  http://mirrors.fe.up.pt/pub/CPAN/
  http://cpan.perl-hackers.net/
  http://cpan.perl.pt/

=item Romania

  http://mirrors.hostingromania.ro/cpan.org/
  ftp://ftp.lug.ro/CPAN/
  http://mirrors.m247.ro/CPAN/
  http://mirrors.evowise.com/CPAN/
  http://mirrors.teentelecom.net/CPAN/
  ftp://mirrors.teentelecom.net/CPAN/
  http://mirrors.xservers.ro/CPAN/

=item Russian Federation

  ftp://ftp.aha.ru/CPAN/
  http://cpan.rinet.ru/
  ftp://cpan.rinet.ru/pub/mirror/CPAN/
  http://cpan-mirror.rbc.ru/pub/CPAN/
  http://mirror.rol.ru/CPAN/
  http://cpan.uni-altai.ru/
  http://cpan.webdesk.ru/
  ftp://cpan.webdesk.ru/cpan/
  http://mirror.yandex.ru/mirrors/cpan/
  ftp://mirror.yandex.ru/mirrors/cpan/

=item Serbia

  http://mirror.sbb.rs/CPAN/
  ftp://mirror.sbb.rs/CPAN/

=item Slovakia

  http://cpan.lnx.sk/
  http://tux.rainside.sk/CPAN/
  ftp://tux.rainside.sk/CPAN/

=item Slovenia

  http://ftp.arnes.si/software/perl/CPAN/
  ftp://ftp.arnes.si/software/perl/CPAN/

=item Spain

  http://mirrors.evowise.com/CPAN/
  http://osl.ugr.es/CPAN/
  http://ftp.rediris.es/mirror/CPAN/
  ftp://ftp.rediris.es/mirror/CPAN/

=item Sweden

  http://ftp.acc.umu.se/mirror/CPAN/
  ftp://ftp.acc.umu.se/mirror/CPAN/

=item Switzerland

  http://www.pirbot.com/mirrors/cpan/
  http://mirror.switch.ch/ftp/mirror/CPAN/
  ftp://mirror.switch.ch/mirror/CPAN/

=item Ukraine

  http://cpan.ip-connect.vn.ua/
  ftp://cpan.ip-connect.vn.ua/mirror/cpan/

=item United Kingdom

  http://cpan.mirror.anlx.net/
  ftp://ftp.mirror.anlx.net/CPAN/
  http://mirror.bytemark.co.uk/CPAN/
  ftp://mirror.bytemark.co.uk/CPAN/
  http://mirrors.coreix.net/CPAN/
  http://cpan.etla.org/
  ftp://cpan.etla.org/pub/CPAN/
  http://cpan.cpantesters.org/
  http://mirror.sax.uk.as61049.net/CPAN/
  http://mirror.sov.uk.goscomb.net/CPAN/
  http://www.mirrorservice.org/sites/cpan.perl.org/CPAN/
  ftp://ftp.mirrorservice.org/sites/cpan.perl.org/CPAN/
  http://mirror.ox.ac.uk/sites/www.cpan.org/
  ftp://mirror.ox.ac.uk/sites/www.cpan.org/
  http://ftp.ticklers.org/pub/CPAN/
  ftp://ftp.ticklers.org/pub/CPAN/
  http://cpan.mirrors.uk2.net/
  ftp://mirrors.uk2.net/pub/CPAN/
  http://mirror.ukhost4u.com/CPAN/

=back

=head2 North America

=over 4

=item Canada

  http://CPAN.mirror.rafal.ca/
  ftp://CPAN.mirror.rafal.ca/pub/CPAN/
  http://mirror.csclub.uwaterloo.ca/CPAN/
  ftp://mirror.csclub.uwaterloo.ca/CPAN/
  http://mirrors.gossamer-threads.com/CPAN/
  http://mirror.its.dal.ca/cpan/
  ftp://mirror.its.dal.ca/cpan/
  ftp://ftp.ottix.net/pub/CPAN/

=item Costa Rica

  http://mirrors.ucr.ac.cr/CPAN/

=item Mexico

  http://www.msg.com.mx/CPAN/
  ftp://ftp.msg.com.mx/pub/CPAN/

=item United States

=over 8

=item Alabama

  http://mirror.teklinks.com/CPAN/

=item Arizona

  http://mirror.n5tech.com/CPAN/
  http://mirrors.namecheap.com/CPAN/
  ftp://mirrors.namecheap.com/CPAN/

=item California

  http://cpan.develooper.com/
  http://httpupdate127.cpanel.net/CPAN/
  http://mirrors.sonic.net/cpan/
  ftp://mirrors.sonic.net/cpan/
  http://www.perl.com/CPAN/
  http://cpan.yimg.com/

=item Idaho

  http://mirrors.syringanetworks.net/CPAN/
  ftp://mirrors.syringanetworks.net/CPAN/

=item Illinois

  http://cpan.mirrors.hoobly.com/
  http://mirror.team-cymru.org/CPAN/
  ftp://mirror.team-cymru.org/CPAN/

=item Indiana

  http://cpan.netnitco.net/
  ftp://cpan.netnitco.net/pub/mirrors/CPAN/
  ftp://ftp.uwsg.iu.edu/pub/perl/CPAN/

=item Kansas

  http://mirrors.concertpass.com/cpan/

=item Massachusetts

  http://mirrors.ccs.neu.edu/CPAN/

=item Michigan

  http://cpan.cse.msu.edu/
  ftp://cpan.cse.msu.edu/
  http://httpupdate118.cpanel.net/CPAN/
  http://mirrors-usa.go-parts.com/cpan/
  http://ftp.wayne.edu/CPAN/
  ftp://ftp.wayne.edu/CPAN/

=item New Hampshire

  http://mirror.metrocast.net/cpan/

=item New Jersey

  http://mirror.datapipe.net/CPAN/
  ftp://mirror.datapipe.net/pub/CPAN/
  http://www.hoovism.com/CPAN/
  ftp://ftp.hoovism.com/CPAN/
  http://cpan.mirror.nac.net/

=item New York

  http://mirror.cc.columbia.edu/pub/software/cpan/
  ftp://mirror.cc.columbia.edu/pub/software/cpan/
  http://cpan.belfry.net/
  http://cpan.erlbaum.net/
  ftp://cpan.erlbaum.net/CPAN/
  http://cpan.hexten.net/
  ftp://cpan.hexten.net/
  http://mirror.nyi.net/CPAN/
  ftp://mirror.nyi.net/pub/CPAN/
  http://noodle.portalus.net/CPAN/
  ftp://noodle.portalus.net/CPAN/
  http://mirrors.rit.edu/CPAN/
  ftp://mirrors.rit.edu/CPAN/

=item North Carolina

  http://httpupdate140.cpanel.net/CPAN/
  http://mirrors.ibiblio.org/CPAN/

=item Oregon

  http://ftp.osuosl.org/pub/CPAN/
  ftp://ftp.osuosl.org/pub/CPAN/
  http://mirror.uoregon.edu/CPAN/

=item Pennsylvania

  http://cpan.pair.com/
  ftp://cpan.pair.com/pub/CPAN/
  http://cpan.mirrors.ionfish.org/

=item South Carolina

  http://cpan.mirror.clemson.edu/

=item Texas

  http://mirror.uta.edu/CPAN/

=item Utah

  http://cpan.cs.utah.edu/
  ftp://cpan.cs.utah.edu/CPAN/
  ftp://mirror.xmission.com/CPAN/

=item Virginia

  http://mirror.cogentco.com/pub/CPAN/
  ftp://mirror.cogentco.com/pub/CPAN/
  http://mirror.jmu.edu/pub/CPAN/
  ftp://mirror.jmu.edu/pub/CPAN/
  http://mirror.us.leaseweb.net/CPAN/
  ftp://mirror.us.leaseweb.net/CPAN/

=item Washington

  http://cpan.llarian.net/
  ftp://cpan.llarian.net/pub/CPAN/

=item Wisconsin

  http://cpan.mirrors.tds.net/
  ftp://cpan.mirrors.tds.net/pub/CPAN/

=back

=back

=head2 Oceania

=over 4

=item Australia

  http://mirror.as24220.net/pub/cpan/
  ftp://mirror.as24220.net/pub/cpan/
  http://cpan.mirrors.ilisys.com.au/
  http://cpan.mirror.digitalpacific.com.au/
  ftp://mirror.internode.on.net/pub/cpan/
  http://mirror.optusnet.com.au/CPAN/
  http://cpan.mirror.serversaustralia.com.au/
  http://cpan.uberglobalmirror.com/
  http://mirror.waia.asn.au/pub/cpan/

=item New Caledonia

  http://cpan.lagoon.nc/pub/CPAN/
  ftp://cpan.lagoon.nc/pub/CPAN/
  http://cpan.nautile.nc/CPAN/
  ftp://cpan.nautile.nc/CPAN/

=item New Zealand

  ftp://ftp.auckland.ac.nz/pub/perl/CPAN/
  http://cpan.catalyst.net.nz/CPAN/
  ftp://cpan.catalyst.net.nz/pub/CPAN/
  http://cpan.inspire.net.nz/
  ftp://cpan.inspire.net.nz/cpan/
  http://mirror.webtastix.net/CPAN/
  ftp://mirror.webtastix.net/CPAN/

=back

=head2 South America

=over 4

=item Argentina

  http://cpan.mmgdesigns.com.ar/

=item Brazil

  http://cpan.kinghost.net/
  http://linorg.usp.br/CPAN/
  http://mirror.nbtelecom.com.br/CPAN/

=item Chile

  http://cpan.dcc.uchile.cl/
  ftp://cpan.dcc.uchile.cl/pub/lang/cpan/

=back

=head2 RSYNC Mirrors

		rsync://ftp.is.co.za/IS-Mirror/ftp.cpan.org/
		rsync://mirror.ac.za/CPAN/
		rsync://mirror.zol.co.zw/CPAN/
		rsync://mirror.dhakacom.com/CPAN/
		rsync://mirrors.ustc.edu.cn/CPAN/
		rsync://mirrors.xmu.edu.cn/CPAN/
		rsync://kambing.ui.ac.id/CPAN/
		rsync://ftp.jaist.ac.jp/pub/CPAN/
		rsync://mirror.jre655.com/CPAN/
		rsync://ftp.kddilabs.jp/cpan/
		rsync://ftp.nara.wide.ad.jp/cpan/
		rsync://ftp.riken.jp/cpan/
		rsync://mirror.neolabs.kz/CPAN/
		rsync://mirror.qnren.qa/CPAN/
		rsync://ftp.neowiz.com/CPAN/
		rsync://mirror.0x.sg/CPAN/
		rsync://ftp.yzu.edu.tw/pub/CPAN/
		rsync://ftp.ubuntu-tw.org/CPAN/
		rsync://mirrors.digipower.vn/CPAN/
		rsync://cpan.inode.at/CPAN/
		rsync://ftp.byfly.by/CPAN/
		rsync://mirror.datacenter.by/CPAN/
		rsync://ftp.belnet.be/cpan/
		rsync://cpan.mirror.ba/CPAN/
		rsync://mirrors.neterra.net/CPAN/
		rsync://mirrors.netix.net/CPAN/
		rsync://mirror.dkm.cz/cpan/
		rsync://mirrors.nic.cz/CPAN/
		rsync://cpan.mirror.vutbr.cz/cpan/
		rsync://rsync.nic.funet.fi/CPAN/
		rsync://ftp.ciril.fr/pub/cpan/
		rsync://distrib-coffee.ipsl.jussieu.fr/pub/mirrors/cpan/
		rsync://cpan.mirrors.ovh.net/CPAN/
		rsync://mirror.de.leaseweb.net/CPAN/
		rsync://mirror.euserv.net/cpan/
		rsync://ftp-stud.hs-esslingen.de/CPAN/
		rsync://ftp.gwdg.de/pub/languages/perl/CPAN/
		rsync://ftp.hawo.stw.uni-erlangen.de/CPAN/
		rsync://cpan.mirror.iphh.net/CPAN/
		rsync://mirror.netcologne.de/cpan/
		rsync://ftp.halifax.rwth-aachen.de/cpan/
		rsync://ftp.ntua.gr/CPAN/
		rsync://mirror.met.hu/CPAN/
		rsync://ftp.heanet.ie/mirrors/ftp.perl.org/pub/CPAN/
		rsync://rsync.panu.it/CPAN/
		rsync://mirror.as43289.net/CPAN/
		rsync://rsync.cs.uu.nl/CPAN/
		rsync://mirror.nl.leaseweb.net/CPAN/
		rsync://ftp.nluug.nl/CPAN/
		rsync://mirror.transip.net/CPAN/
		rsync://cpan.uib.no/cpan/
		rsync://cpan.vianett.no/CPAN/
		rsync://cpan.perl-hackers.net/CPAN/
		rsync://cpan.perl.pt/cpan/
		rsync://mirrors.m247.ro/CPAN/
		rsync://mirrors.teentelecom.net/CPAN/
		rsync://cpan.webdesk.ru/CPAN/
		rsync://mirror.yandex.ru/mirrors/cpan/
		rsync://mirror.sbb.rs/CPAN/
		rsync://ftp.acc.umu.se/mirror/CPAN/
		rsync://rsync.pirbot.com/ftp/cpan/
		rsync://cpan.ip-connect.vn.ua/CPAN/
		rsync://rsync.mirror.anlx.net/CPAN/
		rsync://mirror.bytemark.co.uk/CPAN/
		rsync://mirror.sax.uk.as61049.net/CPAN/
		rsync://rsync.mirrorservice.org/cpan.perl.org/CPAN/
		rsync://ftp.ticklers.org/CPAN/
		rsync://mirrors.uk2.net/CPAN/
		rsync://CPAN.mirror.rafal.ca/CPAN/
		rsync://mirror.csclub.uwaterloo.ca/CPAN/
		rsync://mirrors.namecheap.com/CPAN/
		rsync://mirrors.syringanetworks.net/CPAN/
		rsync://mirror.team-cymru.org/CPAN/
		rsync://debian.cse.msu.edu/cpan/
		rsync://mirrors-usa.go-parts.com/mirrors/cpan/
		rsync://rsync.hoovism.com/CPAN/
		rsync://mirror.cc.columbia.edu/cpan/
		rsync://noodle.portalus.net/CPAN/
		rsync://mirrors.rit.edu/cpan/
		rsync://mirrors.ibiblio.org/CPAN/
		rsync://cpan.pair.com/CPAN/
		rsync://cpan.cs.utah.edu/CPAN/
		rsync://mirror.cogentco.com/CPAN/
		rsync://mirror.jmu.edu/CPAN/
		rsync://mirror.us.leaseweb.net/CPAN/
		rsync://cpan.mirror.digitalpacific.com.au/cpan/
		rsync://mirror.internode.on.net/cpan/
		rsync://uberglobalmirror.com/cpan/
		rsync://cpan.lagoon.nc/cpan/
		rsync://mirrors.mmgdesigns.com.ar/CPAN/


For an up-to-date listing of CPAN sites,
see L<http://www.cpan.org/SITES> or L<ftp://www.cpan.org/SITES>.

=head1 Modules: Creation, Use, and Abuse

(The following section is borrowed directly from Tim Bunce's modules
file, available at your nearest CPAN site.)

Perl implements a class using a package, but the presence of a
package doesn't imply the presence of a class.  A package is just a
namespace.  A class is a package that provides subroutines that can be
used as methods.  A method is just a subroutine that expects, as its
first argument, either the name of a package (for "static" methods),
or a reference to something (for "virtual" methods).

A module is a file that (by convention) provides a class of the same
name (sans the .pm), plus an import method in that class that can be
called to fetch exported symbols.  This module may implement some of
its methods by loading dynamic C or C++ objects, but that should be
totally transparent to the user of the module.  Likewise, the module
might set up an AUTOLOAD function to slurp in subroutine definitions on
demand, but this is also transparent.  Only the F<.pm> file is required to
exist.  See L<perlsub>, L<perlobj>, and L<AutoLoader> for details about
the AUTOLOAD mechanism.
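
The distinction between a package, a class, and the two kinds of method can
be sketched as follows.  The C<Counter> class and its C<start> argument are
invented for illustration only.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class name

# "Static" (class) method: the first argument is the package name.
sub new {
    my ($class, %args) = @_;
    return bless { count => $args{start} // 0 }, $class;
}

# "Virtual" (instance) method: the first argument is the object reference.
sub increment {
    my ($self) = @_;
    return ++$self->{count};
}

package main;

my $counter = Counter->new(start => 5);
$counter->increment;
print $counter->increment, "\n";
```

In a real module the C<Counter> package would live in its own
F<Counter.pm> file and be loaded with C<use Counter;>.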

=head2 Guidelines for Module Creation

=over 4

=item  *

Do similar modules already exist in some form?

If so, please try to reuse the existing modules either in whole or
by inheriting useful features into a new class.  If this is not
practical try to get together with the module authors to work on
extending or enhancing the functionality of the existing modules.
A perfect example is the plethora of packages in perl4 for dealing
with command line options.

If you are writing a module to expand an already existing set of
modules, please coordinate with the author of the package.  It
helps if you follow the same naming scheme and module interaction
scheme as the original author.

=item  *

Try to design the new module to be easy to extend and reuse.

Try to C<use warnings;> (or C<use warnings qw(...);>).
Remember that you can add C<no warnings qw(...);> to individual blocks
of code that need fewer warnings.
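
A minimal sketch of switching off a single warning category for one block
(the C<join_fields> helper is hypothetical):

```perl
use strict;
use warnings;

# Hypothetical helper: join fields, treating undef as an empty string.
sub join_fields {
    my @fields = @_;
    # Silence only this category, and only inside this sub's block.
    no warnings 'uninitialized';
    return join ',', @fields;
}

print join_fields('a', undef, 'c'), "\n";
```

Warnings remain enabled everywhere outside C<join_fields>.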

Use blessed references.  Use the two argument form of bless to bless
into the class name given as the first parameter of the constructor,
e.g.:

 sub new {
     my $class = shift;
     return bless {}, $class;
 }

or even this if you'd like it to be used as either a static
or a virtual method.

 sub new {
     my $self  = shift;
     my $class = ref($self) || $self;
     return bless {}, $class;
 }

Pass arrays as references so more parameters can be added later
(it's also faster).  Convert functions into methods where
appropriate.  Split large methods into smaller more flexible ones.
Inherit methods from other modules if appropriate.

Avoid class name tests like: C<die "Invalid" unless ref $ref eq 'FOO'>.
Generally you can delete the C<eq 'FOO'> part with no harm at all.
Let the objects look after themselves! Generally, avoid hard-wired
class names as far as possible.

Avoid C<< $r->Class::func() >> where using C<@ISA=qw(... Class ...)> and
C<< $r->func() >> would work.

Use autosplit so little-used or newly added functions won't be a
burden to programs that don't use them. Add test functions to
the module after __END__ either using AutoSplit or by saying:

 eval join('',<main::DATA>) || die $@ unless caller();

Does your module pass the 'empty subclass' test? If you say
C<@SUBCLASS::ISA = qw(YOURCLASS);> your applications should be able
to use SUBCLASS in exactly the same way as YOURCLASS.  For example,
does your application still work if you change:  C<< $obj = YOURCLASS->new(); >>
into: C<< $obj = SUBCLASS->new(); >> ?
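
The empty subclass test can be tried in a few lines.  This is a sketch
only: C<YOURCLASS> and C<SUBCLASS> are the placeholder names used above,
and the C<greet> method is invented for illustration.

```perl
use strict;
use warnings;

package YOURCLASS;

sub new {
    my $class = shift;
    return bless {}, $class;    # two-argument bless honours subclasses
}

sub greet { return 'hello from ' . ref(shift) }

package SUBCLASS;
our @ISA = ('YOURCLASS');       # the "empty subclass": nothing but @ISA

package main;

my $obj = SUBCLASS->new();
print $obj->greet, "\n";
print "still a YOURCLASS\n" if $obj->isa('YOURCLASS');
```

Had C<new> used the one-argument form of C<bless>, the object would have
been blessed into YOURCLASS and the test would fail.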

Avoid keeping any state information in your packages. It makes it
difficult for multiple other packages to use yours. Keep state
information in objects.

Always use B<-w>.

Try to C<use strict;> (or C<use strict qw(...);>).
Remember that you can add C<no strict qw(...);> to individual blocks
of code that need less strictness.
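
One common case is installing subroutines by name, which needs symbolic
references.  The sketch below relaxes C<strict 'refs'> for a single loop
body; the C<Rect> class and its fields are assumptions made for
illustration.

```perl
use strict;
use warnings;

package Rect;    # hypothetical class

sub new {
    my $class = shift;
    return bless { width => 0, height => 0, @_ }, $class;
}

# Generate simple read-only accessors.  Strictness is relaxed only
# inside this loop body, where the symbolic glob reference is needed.
for my $field (qw(width height)) {
    no strict 'refs';
    *{"Rect::$field"} = sub { return $_[0]{$field} };
}

package main;

my $rect = Rect->new(width => 3, height => 4);
print $rect->width() * $rect->height(), "\n";
```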

Always use B<-w>.

Follow the guidelines in L<perlstyle>.

Always use B<-w>.

=item  *

Some simple style guidelines

The perlstyle manual supplied with Perl has many helpful points.

Coding style is a matter of personal taste. Many people evolve their
style over several years as they learn what helps them write and
maintain good code.  Here's one set of assorted suggestions that
seem to be widely used by experienced developers:

Use underscores to separate words.  It is generally easier to read
$var_names_like_this than $VarNamesLikeThis, especially for
non-native speakers of English. It's also a simple rule that works
consistently with VAR_NAMES_LIKE_THIS.

Package/Module names are an exception to this rule. Perl informally
reserves lowercase module names for 'pragma' modules like integer
and strict. Other modules normally begin with a capital letter and
use mixed case with no underscores (they need to be short and portable).

You may find it helpful to use letter case to indicate the scope
or nature of a variable. For example:

 $ALL_CAPS_HERE   constants only (beware clashes with Perl vars)
 $Some_Caps_Here  package-wide global/static
 $no_caps_here    function scope my() or local() variables

Function and method names seem to work best as all lowercase,
e.g., C<< $obj->as_string() >>.

You can use a leading underscore to indicate that a variable or
function should not be used outside the package that defined it.

=item  *

Select what to export.

Do NOT export method names!

Do NOT export anything else by default without a good reason!

Exports pollute the namespace of the module user.  If you must
export try to use @EXPORT_OK in preference to @EXPORT and avoid
short or common names to reduce the risk of name clashes.

Generally anything not exported is still accessible from outside the
module using the ModuleName::item_name (or C<< $blessed_ref->method >>)
syntax.  By convention you can use a leading underscore on names to
indicate informally that they are 'internal' and not for public use.

(It is actually possible to get private functions by saying:
C<my $subref = sub { ... };  &$subref;>.  But there's no way to call that
directly as a method, because a method must have a name in the symbol
table.)

As a general rule, if the module is trying to be object oriented
then export nothing. If it's just a collection of functions then
@EXPORT_OK anything but use @EXPORT with caution.
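As a rough sketch of the export advice above, a function-collection module might leave C<@EXPORT> empty and offer its functions through C<@EXPORT_OK>. The module name C<Local::Util> and its functions are hypothetical; the package is declared inline here only so that the example is self-contained.

```perl
package Local::Util;              # hypothetical module name
use strict;
use warnings;
use Exporter 'import';

our @EXPORT    = ();              # export nothing by default
our @EXPORT_OK = qw(trim_text);   # exported only on request

sub trim_text {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

sub _normalize { return lc $_[0] }   # 'internal' by convention only

package main;

# In a separate file this would be:  use Local::Util qw(trim_text);
Local::Util->import('trim_text');

print "[", trim_text("  padded  "), "]\n";

# Names that are not exported remain reachable when fully qualified.
print Local::Util::_normalize("MiXeD"), "\n";
```

The caller's namespace gains only the one name it asked for, which is the point of preferring C<@EXPORT_OK>.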

=item  *

Select a name for the module.

This name should be as descriptive, accurate, and complete as
possible.  Avoid any risk of ambiguity. Always try to use two or
more whole words.  Generally the name should reflect what is special
about what the module does rather than how it does it.  Please use
nested module names to group informally or categorize a module.
There should be a very good reason for a module not to have a nested name.
Module names should begin with a capital letter.

Having 57 modules all called Sort will not make life easy for anyone
(though having 23 called Sort::Quick is only marginally better :-).
Imagine someone trying to install your module alongside many others.

If you are developing a suite of related modules/classes it's good
practice to use nested classes with a common prefix as this will
avoid namespace clashes. For example: Xyz::Control, Xyz::View,
Xyz::Model etc. Use the modules in this list as a naming guide.

If adding a new module to a set, follow the original author's
standards for naming modules and the interface to methods in
those modules.

If developing modules for private internal or project specific use,
that will never be released to the public, then you should ensure
that their names will not clash with any future public module. You
can do this either by using the reserved Local::* category or by
using a category name that includes an underscore like Foo_Corp::*.

To be portable each component of a module name should be limited to
11 characters. If it might be used on MS-DOS then try to ensure each is
unique in the first 8 characters. Nested modules make this easier.

For additional guidance on the naming of modules, please consult:

    http://pause.perl.org/pause/query?ACTION=pause_namingmodules

or send mail to the <module-authors@perl.org> mailing list.

=item  *

Have you got it right?

How do you know that you've made the right decisions? Have you
picked an interface design that will cause problems later? Have
you picked the most appropriate name? Do you have any questions?

The best way to know for sure, and pick up many helpful suggestions,
is to ask someone who knows. The <module-authors@perl.org> mailing list
is useful for this purpose; it's also accessible via news interface as
perl.module-authors at nntp.perl.org.

All you need to do is post a short summary of the module, its
purpose and interfaces. A few lines on each of the main methods is
probably enough. (If you post the whole module it might be ignored
by busy people - generally the very people you want to read it!)

Don't worry about posting if you can't say when the module will be
ready - just say so in the message. It might be worth inviting
others to help you; they may be able to complete it for you!

=item  *

README and other Additional Files.

It's well known that software developers usually fully document the
software they write. If, however, the world is in urgent need of
your software and there is not enough time to write the full
documentation please at least provide a README file containing:

=over 10

=item *

A description of the module/package/extension etc.

=item *

A copyright notice - see below.

=item *

Prerequisites - what else you may need to have.

=item *

How to build it - possible changes to Makefile.PL etc.

=item *

How to install it.

=item *

Recent changes in this release, especially incompatibilities

=item *

Changes / enhancements you plan to make in the future.

=back

If the README file seems to be getting too large you may wish to
split out some of the sections into separate files: INSTALL,
Copying, ToDo etc.

=over 4

=item *

Adding a Copyright Notice.

How you choose to license your work is a personal decision.
The general mechanism is to assert your Copyright and then make
a declaration of how others may copy/use/modify your work.

Perl, for example, is supplied with two types of licence: The GNU GPL
and The Artistic Licence (see the files README, Copying, and Artistic,
or L<perlgpl> and L<perlartistic>).  Larry has good reasons for NOT
just using the GNU GPL.

My personal recommendation, out of respect for Larry, Perl, and the
Perl community at large is to state something simply like:

 Copyright (c) 1995 Your Name. All rights reserved.
 This program is free software; you can redistribute it and/or
 modify it under the same terms as Perl itself.

This statement should at least appear in the README file. You may
also wish to include it in a Copying file and your source files.
Remember to include the other words in addition to the Copyright.

=item  *

Give the module a version/issue/release number.

To be fully compatible with the Exporter and MakeMaker modules you
should store your module's version number in a non-my package
variable called $VERSION.  This should be a positive floating point
number with at least two digits after the decimal (i.e., hundredths,
e.g., C<$VERSION = "0.01">).  Don't use a "1.3.2" style version.
See L<Exporter> for details.

It may be handy to add a function or method to retrieve the number.
Use the number in announcements and archive file names when
releasing the module (ModuleName-1.02.tar.Z).
See L<ExtUtils::MakeMaker> for details.
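A minimal sketch of the versioning advice, assuming a hypothetical module called C<Local::Versioned>; the accessor name C<version_number> is likewise an invention for illustration.

```perl
package Local::Versioned;     # hypothetical module name
use strict;
use warnings;

our $VERSION = '1.02';        # package variable, not a my() lexical

# Optional accessor for retrieving the number.
sub version_number { return $VERSION }

package main;

print "Local-Versioned-", Local::Versioned->version_number, "\n";

# The inherited VERSION method dies if the module is older than requested.
print "new enough\n" if eval { Local::Versioned->VERSION(1.00); 1 };
```

Because C<$VERSION> is a package variable, tools that inspect the symbol table or scan the source can find it; a C<my> variable would be invisible from outside the file.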

=item  *

How to release and distribute a module.

If possible, register the module with CPAN. Follow the instructions
and links on:

   http://www.cpan.org/modules/04pause.html

and upload to:

   http://pause.perl.org/

and notify <modules@perl.org>. This will allow anyone to install
your module using the C<cpan> tool distributed with Perl.

By using the WWW interface you can ask the Upload Server to mirror
your modules from your ftp or WWW site into your own directory on
CPAN!

=item  *

Take care when changing a released module.

Always strive to remain compatible with previous released versions.
Otherwise try to add a mechanism to revert to the
old behavior if people rely on it.  Document incompatible changes.

=back

=back

=head2 Guidelines for Converting Perl 4 Library Scripts into Modules

=over 4

=item  *

There is no requirement to convert anything.

If it ain't broke, don't fix it! Perl 4 library scripts should
continue to work with no problems. You may need to make some minor
changes (like escaping non-array @'s in double quoted strings) but
there is no need to convert a .pl file into a Module for just that.

=item  *

Consider the implications.

All Perl applications that make use of the script will need to
be changed (slightly) if the script is converted into a module.  Is
it worth it unless you plan to make other changes at the same time?

=item  *

Make the most of the opportunity.

If you are going to convert the script to a module you can use the
opportunity to redesign the interface.  The guidelines for module
creation above include many of the issues you should consider.

=item  *

The pl2pm utility will get you started.

This utility will read *.pl files (given as parameters) and write
corresponding *.pm files. The pl2pm utility does the following:

=over 10

=item *

Adds the standard Module prologue lines

=item *

Converts package specifiers from ' to ::

=item *

Converts die(...) to croak(...)

=item *

Several other minor changes

=back

Being a mechanical process pl2pm is not bullet proof. The converted
code will need careful checking, especially any package statements.
Don't delete the original .pl file till the new .pm one works!

=back

=head2 Guidelines for Reusing Application Code

=over 4

=item  *

Complete applications rarely belong in the Perl Module Library.

=item  *

Many applications contain some Perl code that could be reused.

Help save the world! Share your code in a form that makes it easy
to reuse.

=item  *

Break-out the reusable code into one or more separate module files.

=item  *

Take the opportunity to reconsider and redesign the interfaces.

=item  *

In some cases the 'application' can then be reduced to a small
fragment of code built on top of the reusable modules. In these cases
the application could be invoked as:

     % perl -e 'use Module::Name; method(@ARGV)' ...
or
     % perl -mModule::Name ...    (in perl5.002 or higher)

=back

=head1 NOTE

Perl does not enforce private and public parts of its modules as you may
have been used to in other languages like C++, Ada, or Modula-17.  Perl
doesn't have an infatuation with enforced privacy.  It would prefer
that you stayed out of its living room because you weren't invited, not
because it has a shotgun.

The module and its user have a contract, part of which is common law,
and part of which is "written".  Part of the common law contract is
that a module doesn't pollute any namespace it wasn't asked to.  The
written contract for the module (A.K.A. documentation) may make other
provisions.  But then you know when you C<use RedefineTheWorld> that
you're redefining the world and willing to take the consequences.

=cut

ex: set ro:
perl5221delta.pod000064400000025017150344123450007544 0ustar00=encoding utf8

=head1 NAME

perl5221delta - what is new for perl v5.22.1

=head1 DESCRIPTION

This document describes differences between the 5.22.0 release and the 5.22.1
release.

If you are upgrading from an earlier release such as 5.20.0, first read
L<perl5220delta>, which describes differences between 5.20.0 and 5.22.0.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.22.0 other than the
following single exception, which we deemed to be a sensible change to make in
order to get the new C<\b{wb}> and (in particular) C<\b{sb}> features sane
before people decided they're worthless because of bugs in their Perl 5.22.0
implementation and avoided them in the future.
If any others exist, they are bugs, and we request that you submit a report.
See L</Reporting Bugs> below.

=head2 Bounds Checking Constructs

Several bugs, including a segmentation fault, have been fixed with the bounds
checking constructs (introduced in Perl 5.22) C<\b{gcb}>, C<\b{sb}>, C<\b{wb}>,
C<\B{gcb}>, C<\B{sb}>, and C<\B{wb}>.  All the C<\B{}> ones now match an empty
string; none of the C<\b{}> ones do.
L<[perl #126319]|https://rt.perl.org/Ticket/Display.html?id=126319>

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Module::CoreList> has been upgraded from version 5.20150520 to 5.20151213.

=item *

L<PerlIO::scalar> has been upgraded from version 0.22 to 0.23.

=item *

L<POSIX> has been upgraded from version 1.53 to 1.53_01.

If C<POSIX::strerror> was passed C<$!> as its argument then it accidentally
cleared C<$!>.  This has been fixed.
L<[perl #126229]|https://rt.perl.org/Ticket/Display.html?id=126229>

=item *

L<Storable> has been upgraded from version 2.53 to 2.53_01.

=item *

L<warnings> has been upgraded from version 1.32 to 1.34.

The C<warnings::enabled> example now actually uses C<warnings::enabled>.
L<[perl #126051]|https://rt.perl.org/Ticket/Display.html?id=126051>

=item *

L<Win32> has been upgraded from version 0.51 to 0.52.

This has been updated for Windows 8.1, 10 and 2012 R2 Server.

=back

=head1 Documentation

=head2 Changes to Existing Documentation

=head3 L<perltie>

=over 4

=item *

The usage of C<FIRSTKEY> and C<NEXTKEY> has been clarified.

=back

=head3 L<perlvar>

=over 4

=item *

The specific true value of C<$!{E...}> is now documented, noting that it is
subject to change and not guaranteed.

=back

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 Changes to Existing Diagnostics

=over 4

=item *

The C<printf> and C<sprintf> builtins are now more careful about the warnings
they emit: argument reordering now disables the "redundant argument" warning in
all cases.
L<[perl #125469]|https://rt.perl.org/Ticket/Display.html?id=125469>

=back

=head1 Configuration and Compilation

=over 4

=item *

Using the C<NO_HASH_SEED> define in combination with the default hash algorithm
C<PERL_HASH_FUNC_ONE_AT_A_TIME_HARD> resulted in a fatal error while compiling
the interpreter, since Perl 5.17.10.  This has been fixed.

=item *

Configuring with ccflags containing quotes (e.g.
C<< -Accflags='-DAPPLLIB_EXP=\"/usr/libperl\"' >>) was broken in Perl 5.22.0
but has now been fixed again.
L<[perl #125314]|https://rt.perl.org/Ticket/Display.html?id=125314>

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item IRIX

=over

=item *

Under some circumstances IRIX stdio fgetc() and fread() set the errno to
C<ENOENT>, which made no sense according to either IRIX or POSIX docs.  Errno
is now cleared in such cases.
L<[perl #123977]|https://rt.perl.org/Ticket/Display.html?id=123977>

=item *

Problems when multiplying long doubles by infinity have been fixed.
L<[perl #126396]|https://rt.perl.org/Ticket/Display.html?id=126396>

=item *

All tests pass now on IRIX with the default build configuration.

=back

=back

=head1 Selected Bug Fixes

=over 4

=item *

C<qr/(?[ () ])/> no longer segfaults, giving a syntax error message instead.
L<[perl #125805]|https://rt.perl.org/Ticket/Display.html?id=125805>

=item *

A Perl 5.20 regression in regular expression possessive quantifiers has been
fixed.
C<qr/>I<PAT>C<{>I<min>,I<max>C<}+>C</> is supposed to behave identically to
C<qr/(?E<gt>>I<PAT>C<{>I<min>,I<max>C<})/>.  Since Perl 5.20, this didn't work
if I<min> and I<max> were equal.
L<[perl #125825]|https://rt.perl.org/Ticket/Display.html?id=125825>

=item *

Certain syntax errors in
L<perlrecharclass/Extended Bracketed Character Classes> caused panics instead
of the proper error message.  This has now been fixed.
L<[perl #126481]|https://rt.perl.org/Ticket/Display.html?id=126481>

=item *

C<< BEGIN <> >> no longer segfaults and properly produces an error message.
L<[perl #125341]|https://rt.perl.org/Ticket/Display.html?id=125341>

=item *

A regression from Perl 5.20 has been fixed, in which some syntax errors in
L<C<(?[...])>|perlrecharclass/Extended Bracketed Character Classes> constructs
within regular expression patterns could cause a segfault instead of a proper
error message.
L<[perl #126180]|https://rt.perl.org/Ticket/Display.html?id=126180>

=item *

Another problem with
L<C<(?[...])>|perlrecharclass/Extended Bracketed Character Classes>
constructs has been fixed wherein things like C<\c]> could cause panics.
L<[perl #126181]|https://rt.perl.org/Ticket/Display.html?id=126181>

=item *

In Perl 5.22.0, the logic changed when parsing a numeric parameter to the -C
option, such that the successfully parsed number was not saved as the option
value if it parsed to the end of the argument.
L<[perl #125381]|https://rt.perl.org/Ticket/Display.html?id=125381>

=item *

Warning fatality is now ignored when rewinding the stack.  This prevents
infinite recursion when the now fatal error also causes rewinding of the stack.
L<[perl #123398]|https://rt.perl.org/Ticket/Display.html?id=123398>

=item *

A crash with C<< %::=(); J->${\"::"} >> has been fixed.
L<[perl #125541]|https://rt.perl.org/Ticket/Display.html?id=125541>

=item *

Nested quantifiers such as C</.{1}??/> should cause perl to throw a fatal
error, but were being silently accepted since Perl 5.20.0.  This has been
fixed.
L<[perl #126253]|https://rt.perl.org/Ticket/Display.html?id=126253>

=item *

Regular expression sequences such as C</(?i/> (and similarly with other
recognized flags or combination of flags) should cause perl to throw a fatal
error, but were being silently accepted since Perl 5.18.0.  This has been
fixed.
L<[perl #126178]|https://rt.perl.org/Ticket/Display.html?id=126178>

=item *

A bug in hexadecimal floating point literal support meant that high-order bits
could be lost in cases where mantissa overflow was caused by too many trailing
zeros in the fractional part.  This has been fixed.
L<[perl #126582]|https://rt.perl.org/Ticket/Display.html?id=126582>

=item *

Another hexadecimal floating point bug, causing low-order bits to be lost in
cases where the last hexadecimal digit of the mantissa has bits straddling the
limit of the number of bits allowed for the mantissa, has also been fixed.
L<[perl #126586]|https://rt.perl.org/Ticket/Display.html?id=126586>

=item *

Further hexadecimal floating point bugs have been fixed: In some circumstances,
the C<%a> format specifier could variously lose the sign of the negative zero,
fail to display zeros after the radix point with the requested precision, or
even lose the radix point after the leftmost hexadecimal digit completely.

=item *

A crash caused by incomplete expressions within C<< /(?[ ])/ >> (e.g.
C<< /(?[[0]+()+])/ >>) has been fixed.
L<[perl #126615]|https://rt.perl.org/Ticket/Display.html?id=126615>

=back

=head1 Acknowledgements

Perl 5.22.1 represents approximately 6 months of development since Perl 5.22.0
and contains approximately 19,000 lines of changes across 130 files from 27
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 1,700 lines of changes to 44 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.22.1:

Aaron Crane, Abigail, Andy Broad, Aristotle Pagaltzis, Chase Whitener, Chris
'BinGOs' Williams, Craig A. Berry, Daniel Dragan, David Mitchell, Father
Chrysostomos, Herbert Breunung, Hugo van der Sanden, James E Keenan, Jan
Dubois, Jarkko Hietaniemi, Karen Etheridge, Karl Williamson, Lukas Mai, Matthew
Horsfall, Peter Martini, Rafael Garcia-Suarez, Ricardo Signes, Shlomi Fish,
Sisyphus, Steve Hay, Tony Cook, Victor Adam.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
https://rt.perl.org/ .  There may also be information at
http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perl56delta.pod000064400000321301150344123450007400 0ustar00=head1 NAME

perl56delta - what's new for perl v5.6.0

=head1 DESCRIPTION

This document describes differences between the 5.005 release and the 5.6.0
release.

=head1 Core Enhancements

=head2 Interpreter cloning, threads, and concurrency

Perl 5.6.0 introduces the beginnings of support for running multiple
interpreters concurrently in different threads.  In conjunction with
the perl_clone() API call, which can be used to selectively duplicate
the state of any given interpreter, it is possible to compile a
piece of code once in an interpreter, clone that interpreter
one or more times, and run all the resulting interpreters in distinct
threads.

On the Windows platform, this feature is used to emulate fork() at the
interpreter level.  See L<perlfork> for details about that.

This feature is still in evolution.  It is eventually meant to be used
to selectively clone a subroutine and data reachable from that
subroutine in a separate interpreter and run the cloned subroutine
in a separate thread.  Since there is no shared data between the
interpreters, little or no locking will be needed (unless parts of
the symbol table are explicitly shared).  This is obviously intended
to be an easy-to-use replacement for the existing threads support.

Support for cloning interpreters and interpreter concurrency can be
enabled using the -Dusethreads Configure option (see win32/Makefile for
how to enable it on Windows.)  The resulting perl executable will be
functionally identical to one that was built with -Dusemultiplicity, but
the perl_clone() API call will only be available in the former.

-Dusethreads enables the cpp macro USE_ITHREADS by default, which in turn
enables Perl source code changes that provide a clear separation between
the op tree and the data it operates with.  The former is immutable, and
can therefore be shared between an interpreter and all of its clones,
while the latter is considered local to each interpreter, and is therefore
copied for each clone.

Note that building Perl with the -Dusemultiplicity Configure option
is adequate if you wish to run multiple B<independent> interpreters
concurrently in different threads.  -Dusethreads only provides the
additional functionality of the perl_clone() API call and other
support for running B<cloned> interpreters concurrently.

    NOTE: This is an experimental feature.  Implementation details are
    subject to change.

=head2 Lexically scoped warning categories

You can now control the granularity of warnings emitted by perl at a finer
level using the C<use warnings> pragma.  L<warnings> and L<perllexwarn>
have copious documentation on this feature.

=head2 Unicode and UTF-8 support

Perl now uses UTF-8 as its internal representation for character
strings.  The C<utf8> and C<bytes> pragmas are used to control this support
in the current lexical scope.  See L<perlunicode>, L<utf8> and L<bytes> for
more information.

This feature is expected to evolve quickly to support some form of I/O
disciplines that can be used to specify the kind of input and output data
(bytes or characters).  Until that happens, additional modules from CPAN
will be needed to complete the toolkit for dealing with Unicode.

    NOTE: This should be considered an experimental feature.  Implementation
    details are subject to change.

=head2 Support for interpolating named characters

The new C<\N> escape interpolates named characters within strings.
For example, C<"Hi! \N{WHITE SMILING FACE}"> evaluates to a string
with a unicode smiley face at the end.
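A small sketch of the C<\N> escape follows. The explicit C<use charnames ':full'> line is an assumption about the Perl in use: older releases need it to load the name table, and it is harmless on newer ones.

```perl
use strict;
use warnings;
use charnames ':full';   # loads the Unicode name table for \N{...}

my $greeting = "Hi! \N{WHITE SMILING FACE}";

# The named character is a single character in the resulting string.
printf "length %d, last character U+%04X\n",
    length($greeting), ord substr($greeting, -1);
```

The string is inspected by code point rather than printed directly, so the example does not depend on how the output handle is encoded.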

=head2 "our" declarations

An "our" declaration introduces a value that can be best understood
as a lexically scoped symbolic alias to a global variable in the
package that was current where the variable was declared.  This is
mostly useful as an alternative to the C<vars> pragma, but also provides
the opportunity to introduce typing and other attributes for such
variables.  See L<perlfunc/our>.
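As an illustration of the aliasing described above, the following sketch declares a package global with C<our> and then reads it through its fully qualified name. The package name C<Local::Counter> and the function C<bump> are hypothetical.

```perl
package Local::Counter;   # hypothetical package name
use strict;
use warnings;

our $count = 0;           # lexically scoped alias to $Local::Counter::count

sub bump { return ++$count }

package main;

Local::Counter::bump() for 1 .. 3;

# The alias and the fully qualified global are the same variable.
print "$Local::Counter::count\n";
```

Under C<use strict>, the short name C<$count> is accepted only because of the C<our> declaration; without it, the fully qualified name or the C<vars> pragma would be needed.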

=head2 Support for strings represented as a vector of ordinals

Literals of the form C<v1.2.3.4> are now parsed as a string composed
of characters with the specified ordinals.  This is an alternative, more
readable way to construct (possibly unicode) strings instead of
interpolating characters, as in C<"\x{1}\x{2}\x{3}\x{4}">.  The leading
C<v> may be omitted if there are more than two ordinals, so C<1.2.3> is
parsed the same as C<v1.2.3>.

Strings written in this form are also useful to represent version "numbers".
It is easy to compare such version "numbers" (which are really just plain
strings) using any of the usual string comparison operators C<eq>, C<ne>,
C<lt>, C<gt>, etc., or perform bitwise string operations on them using C<|>,
C<&>, etc.

In conjunction with the new C<$^V> magic variable (which contains
the perl version as a string), such literals can be used as a readable way
to check if you're running a particular version of Perl:

    # this will parse in older versions of Perl also
    if ($^V and $^V gt v5.6.0) {
        # new features supported
    }

C<require> and C<use> also have some special magic to support such
literals, but this particular usage should be avoided because it leads to
misleading error messages under versions of Perl which don't support vector
strings.  Using a true version number will ensure correct behavior in all
versions of Perl:

    require 5.006;    # run time check for v5.6
    use 5.006_001;    # compile time check for v5.6.1

Also, C<sprintf> and C<printf> support the Perl-specific format flag C<%v>
to print ordinals of characters in arbitrary strings:

    printf "v%vd", $^V;		# prints current version, such as "v5.5.650"
    printf "%*vX", ":", $addr;	# formats IPv6 address
    printf "%*vb", " ", $bits;	# displays bitstring

See L<perldata/"Scalar value constructors"> for additional information.
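The behavior of vector strings and the C<%vd> format can be sketched as follows. The variable name is arbitrary, and the literal values are chosen only for illustration.

```perl
use strict;
use warnings;

my $vstring = v1.22.333;

# A v-string is an ordinary string of characters with those ordinals
# (22 is 0x16 and 333 is 0x14D).
print "same string\n" if $vstring eq "\x{1}\x{16}\x{14D}";

# %vd joins the ordinal of each character with dots.
print sprintf("%vd", $vstring), "\n";

# Applied to a plain string it shows the ordinals of the characters
# '1', '.', '2', and so on, not the numbers written in the string.
print sprintf("%vd", "1.2"), "\n";

# String comparison orders v-strings component by component.
print "newer\n" if v5.10.1 gt v5.8.9;
```

Note the difference between the v-string literal and the quoted string C<"1.2">: only the former is composed of the ordinals themselves.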

=head2 Improved Perl version numbering system

Beginning with Perl version 5.6.0, the version number convention has been
changed to a "dotted integer" scheme that is more commonly found in open
source projects.

Maintenance versions of v5.6.0 will be released as v5.6.1, v5.6.2 etc.
The next development series following v5.6.0 will be numbered v5.7.x,
beginning with v5.7.0, and the next major production release following
v5.6.0 will be v5.8.0.

The English module now sets $PERL_VERSION to $^V (a string value) rather
than C<$]> (a numeric value).  (This is a potential incompatibility.
Send us a report via perlbug if you are affected by this.)

The v1.2.3 syntax is also now legal in Perl.
See L</Support for strings represented as a vector of ordinals> for more on that.

To cope with the new versioning system's use of at least three significant
digits for each version component, the method used for incrementing the
subversion number has also changed slightly.  We assume that versions older
than v5.6.0 have been incrementing the subversion component in multiples of
10.  Versions after v5.6.0 will increment them by 1.  Thus, using the new
notation, 5.005_03 is the "same" as v5.5.30, and the first maintenance
version following v5.6.0 will be v5.6.1 (which should be read as being
equivalent to a floating point value of 5.006_001 in the older format,
stored in C<$]>).

=head2 New syntax for declaring subroutine attributes

Formerly, if you wanted to mark a subroutine as being a method call or
as requiring an automatic lock() when it is entered, you had to declare
that with a C<use attrs> pragma in the body of the subroutine.
That can now be accomplished with declaration syntax, like this:

    sub mymethod : locked method;
    ...
    sub mymethod : locked method {
	...
    }

    sub othermethod :locked :method;
    ...
    sub othermethod :locked :method {
	...
    }


(Note how only the first C<:> is mandatory, and whitespace surrounding
the C<:> is optional.)

F<AutoSplit.pm> and F<SelfLoader.pm> have been updated to keep the attributes
with the stubs they provide.  See L<attributes>.

=head2 File and directory handles can be autovivified

Similar to how constructs such as C<< $x->[0] >> autovivify a reference,
handle constructors (open(), opendir(), pipe(), socketpair(), sysopen(),
socket(), and accept()) now autovivify a file or directory handle
if the handle passed to them is an uninitialized scalar variable.  This
allows the constructs such as C<open(my $fh, ...)> and C<open(local $fh,...)>
to be used to create filehandles that will conveniently be closed
automatically when the scope ends, provided there are no other references
to them.  This largely eliminates the need for typeglobs when opening
filehandles that must be passed around, as in the following example:

    sub myopen {
        open my $fh, "@_"
	     or die "Can't open '@_': $!";
	return $fh;
    }

    {
        my $f = myopen("</etc/motd");
	print <$f>;
	# $f implicitly closed here
    }

=head2 open() with more than two arguments

If open() is passed three arguments instead of two, the second argument
is used as the mode and the third argument is taken to be the file name.
This is primarily useful for protecting against unintended magic behavior
of the traditional two-argument form.  See L<perlfunc/open>.
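A minimal sketch of the three-argument form, using a temporary directory from C<File::Temp> so that it does not touch any real files; the file name F<data.txt> is arbitrary.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $name = "$dir/data.txt";

# The mode is a separate argument, so nothing in $name is ever
# interpreted as a mode character or a pipe.
open(my $out, '>', $name) or die "Can't write '$name': $!";
print {$out} "hello\n";
close($out) or die "Can't close '$name': $!";

open(my $in, '<', $name) or die "Can't read '$name': $!";
my $line = <$in>;
close($in);

print $line;
```

With the two-argument form, a file name that begins with C<< > >> or ends with C<|> would change what open() does; keeping the mode separate avoids that class of surprise.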

=head2 64-bit support

Any platform that has 64-bit integers either

	(1) natively as longs or ints
	(2) via special compiler flags
	(3) using long long or int64_t

is able to use "quads" (64-bit integers) as follows:

=over 4

=item *

constants (decimal, hexadecimal, octal, binary) in the code 

=item *

arguments to oct() and hex()

=item *

arguments to print(), printf() and sprintf() (flag prefixes ll, L, q)

=item *

printed as such

=item *

pack() and unpack() "q" and "Q" formats

=item *

in basic arithmetics: + - * / % (NOTE: operating close to the limits
of the integer values may produce surprising results)

=item *

in bit arithmetics: & | ^ ~ << >> (NOTE: these used to be forced 
to be 32 bits wide but now operate on the full native width.)

=item *

vec()

=back

Note that unless case (1) applies, you will have to configure
and compile Perl using the -Duse64bitint Configure flag.

    NOTE: The Configure flags -Duselonglong and -Duse64bits have been
    deprecated.  Use -Duse64bitint instead.

There are actually two modes of 64-bitness: the first one is achieved
using Configure -Duse64bitint and the second one using Configure
-Duse64bitall.  The difference is that the first one is minimal and
the second one maximal.  The first works in more places than the second.

The C<use64bitint> does only as much as is required to get 64-bit
integers into Perl (this may mean, for example, using "long longs")
while your memory may still be limited to 2 gigabytes (because your
pointers could still be 32-bit).  Note that the name C<64bitint> does
not imply that your C compiler will be using 64-bit C<int>s (it might,
but it doesn't have to): the C<use64bitint> means that you will be
able to have 64 bits wide scalar values.

The C<use64bitall> goes all the way by attempting to switch also
integers (if it can), longs (and pointers) to being 64-bit.  This may
create an even more binary incompatible Perl than -Duse64bitint: the
resulting executable may not run at all in a 32-bit box, or you may
have to reboot/reconfigure/rebuild your operating system to be 64-bit
aware.

Natively 64-bit systems like Alpha and Cray need neither -Duse64bitint
nor -Duse64bitall.

Last but not least: note that due to Perl's habit of always using
floating point numbers, the quads are still not true integers.
When quads overflow their limits (0...18_446_744_073_709_551_615 unsigned,
-9_223_372_036_854_775_808...9_223_372_036_854_775_807 signed), they
are silently promoted to floating point numbers, after which they will
start losing precision (in their lower digits).

    NOTE: 64-bit support is still experimental on most platforms.
    Existing support only covers the LP64 data model.  In particular, the
    LLP64 data model is not yet supported.  64-bit libraries and system
    APIs on many platforms have not stabilized--your mileage may vary.

=head2 Large file support

If you have filesystems that support "large files" (files larger than
2 gigabytes), you may now also be able to create and access them from
Perl.

    NOTE: The default action is to enable large file support, if
    available on the platform.

If the large file support is on, and you have a Fcntl constant
O_LARGEFILE, the O_LARGEFILE is automatically added to the flags
of sysopen().

Beware that unless your filesystem also supports "sparse files", seeking
to umpteen petabytes may be inadvisable.

Note that in addition to requiring a proper file system to do large
files you may also need to adjust your per-process (or your
per-system, or per-process-group, or per-user-group) maximum filesize
limits before running Perl scripts that try to handle large files,
especially if you intend to write such files.

Finally, in addition to your process/process group maximum filesize
limits, you may have quota limits on your filesystems that stop you
(your user id or your user group id) from using large files.

Adjusting your process/user/group/file system/operating system limits
is outside the scope of the Perl core language.  For process limits, you
may try increasing the limits using your shell's limits/limit/ulimit
command before running Perl.  The BSD::Resource extension (not
included with the standard Perl distribution) may also be of use; it
offers the getrlimit/setrlimit interface that can be used to adjust
process resource usage limits, including the maximum filesize limit.

=head2 Long doubles

In some systems you may be able to use long doubles to enhance the
range and precision of your double precision floating point numbers
(that is, Perl's numbers).  Use Configure -Duselongdouble to enable
this support (if it is available).

=head2 "more bits"

You can "Configure -Dusemorebits" to turn on both the 64-bit support
and the long double support.

=head2 Enhanced support for sort() subroutines

Perl subroutines with a prototype of C<($$)>, and XSUBs in general, can
now be used as sort subroutines.  In either case, the two elements to
be compared are passed as normal parameters in @_.  See L<perlfunc/sort>.

For unprototyped sort subroutines, the historical behavior of passing 
the elements to be compared as the global variables $a and $b remains
unchanged.
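
A minimal sketch of a prototyped comparison routine follows.  The
subroutine name and the word list are made up for illustration.

```perl
use strict;
use warnings;

# A comparison routine with a ($$) prototype receives the two
# elements in @_ instead of the package variables $a and $b.
sub by_length($$) {
    my ($left, $right) = @_;
    return length($left) <=> length($right) || $left cmp $right;
}

my @words  = qw(pear fig banana kiwi);
my @sorted = sort by_length @words;
print "@sorted\n";
```

Because such a routine does not depend on $a and $b, it can be
defined in one package and used to sort from another.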

=head2 C<sort $coderef @foo> allowed

sort() did not accept a subroutine reference as the comparison
function in earlier versions.  This is now permitted.
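
For instance, a comparison can be kept in a scalar and chosen at run
time.  This is only a sketch; the variable names are illustrative.

```perl
use strict;
use warnings;

# Keep the comparison in a scalar so it can be selected at run time.
my $numerically = sub { $a <=> $b };

my @numbers = (10, 9, 100, 1);
my @sorted  = sort $numerically @numbers;
print "@sorted\n";
```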

=head2 File globbing implemented internally

Perl now uses the File::Glob implementation of the glob() operator
automatically.  This avoids using an external csh process and the
problems associated with it.

    NOTE: This is currently an experimental feature.  Interfaces and
    implementation are subject to change.

=head2 Support for CHECK blocks

In addition to C<BEGIN>, C<INIT>, C<END>, C<DESTROY> and C<AUTOLOAD>,
subroutines named C<CHECK> are now special.  These are queued up during
compilation and behave similarly to END blocks, except they are called at
the end of compilation rather than at the end of execution.  They cannot
be called directly.

=head2 POSIX character class syntax [: :] supported

For example, to match alphabetic characters use /[[:alpha:]]/.
See L<perlre> for details.
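
A short sketch of the syntax in use; the sample text is invented.

```perl
use strict;
use warnings;

my $text = "route 66, exit 12b";

# [:alpha:] and [:digit:] are only valid inside a bracketed class,
# hence the doubled square brackets.
my @words   = $text =~ /([[:alpha:]]+)/g;
my @numbers = $text =~ /([[:digit:]]+)/g;

print "words:   @words\n";
print "numbers: @numbers\n";
```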

=head2 Better pseudo-random number generator

In 5.005_0x and earlier, perl's rand() function used the C library
rand(3) function.  As of 5.005_52, Configure tests for drand48(),
random(), and rand() (in that order) and picks the first one it finds.

These changes should result in better random numbers from rand().

=head2 Improved C<qw//> operator

The C<qw//> operator is now evaluated at compile time into a true list
instead of being replaced with a run time call to C<split()>.  This
removes the confusing misbehaviour of C<qw//> in scalar context, which
had inherited that behaviour from split().

Thus:

    $foo = ($bar) = qw(a b c); print "$foo|$bar\n";

now correctly prints "3|a", instead of "2|a".

=head2 Better worst-case behavior of hashes

Small changes in the hashing algorithm have been implemented in
order to improve the distribution of lower order bits in the
hashed value.  This is expected to yield better performance on
keys that are repeated sequences.

=head2 pack() format 'Z' supported

The new format type 'Z' is useful for packing and unpacking null-terminated
strings.  See L<perlfunc/"pack">.
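
As a rough illustration of the 'Z' format (the string values are
arbitrary):

```perl
use strict;
use warnings;

# "Z8" pads with nulls to exactly 8 bytes; "Z*" appends one null.
my $fixed    = pack("Z8", "perl");
my $variable = pack("Z*", "perl");

# Unpacking with "Z*" stops at the first null byte.
my ($name) = unpack("Z*", "perl\0trailing bytes");

printf "fixed: %d bytes, variable: %d bytes, name: %s\n",
    length($fixed), length($variable), $name;
```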

=head2 pack() format modifier '!' supported

The new format type modifier '!' is useful for packing and unpacking
native shorts, ints, and longs.  See L<perlfunc/"pack">.

=head2 pack() and unpack() support counted strings

The template character '/' can be used to specify a counted string
type to be packed or unpacked.  See L<perlfunc/"pack">.
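
A small sketch of a counted string, assuming a one-byte length
prefix (C<C>) in front of the string data:

```perl
use strict;
use warnings;

# "C/a*" stores the length as one unsigned byte, then the string.
my $record = pack("C/a*", "hello");

# On unpacking, the count says how many bytes belong to the string,
# so anything that follows is left alone.
my ($payload) = unpack("C/a*", $record . "ignored");

printf "record is %d bytes, payload is '%s'\n", length($record), $payload;
```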

=head2 Comments in pack() templates

The '#' character in a template introduces a comment up to
end of the line.  This facilitates documentation of pack()
templates.
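
For example, a template might be documented field by field.  The
record layout below is invented purely for illustration.

```perl
use strict;
use warnings;

# Each '#' starts a comment that runs to the end of its line.
my $template = <<'END_TEMPLATE';
    C       # record type, one unsigned byte
    n       # payload length, 16-bit big-endian
    a4      # four-byte tag
END_TEMPLATE

my $packed = pack($template, 1, 2, "perl");
printf "packed %d bytes\n", length($packed);
```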

=head2 Weak references

In previous versions of Perl, you couldn't cache objects so as
to allow them to be deleted if the last reference from outside
the cache is deleted.  The reference in the cache would hold a
reference count on the object and the object would never be
destroyed.

Another familiar problem is with circular references.  When an
object references itself, its reference count would never go
down to zero, and it would not get destroyed until the program
is about to exit.

Weak references solve this by allowing you to "weaken" any
reference, that is, make it not count towards the reference count.
When the last non-weak reference to an object is deleted, the object
is destroyed and all the weak references to the object are
automatically undef-ed.

To use this feature, you need the Devel::WeakRef package from CPAN, which
contains additional documentation.

    NOTE: This is an experimental feature.  Details are subject to change.  
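
The caching scenario can be sketched as follows.  Note the
assumption: this sketch does not use Devel::WeakRef but the
C<weaken()> function from Scalar::Util, which is how later Perl
releases expose weak references.  The cache layout and key names are
made up.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);   # assumption: a later perl with Scalar::Util

my %cache;

{
    my $object = { name => "widget" };

    # Store the object in the cache, then weaken the cached copy so
    # that it no longer counts towards the reference count.
    $cache{widget} = $object;
    weaken($cache{widget});

    print "inside:  ", (defined $cache{widget} ? "alive" : "gone"), "\n";
}

# The last strong reference went out of scope with the block, so
# the object was destroyed and the weak reference became undef.
print "outside: ", (defined $cache{widget} ? "alive" : "gone"), "\n";
```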

=head2 Binary numbers supported

Binary numbers are now supported as literals, in printf() and sprintf()
formats, and by C<oct()>:

    $answer = 0b101010;
    printf "The answer is: %b\n", oct("0b101010");

=head2 Lvalue subroutines

Subroutines can now return modifiable lvalues.
See L<perlsub/"Lvalue subroutines">.

    NOTE: This is an experimental feature.  Details are subject to change.
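
A minimal sketch of an lvalue subroutine; the names are illustrative.

```perl
use strict;
use warnings;

my $counter = 0;

# The :lvalue attribute lets the call itself be assigned to.  The
# last expression must be the variable, with no explicit return.
sub counter : lvalue {
    $counter;
}

counter() = 5;
counter() += 2;
print "counter is $counter\n";
```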

=head2 Some arrows may be omitted in calls through references

Perl now allows the arrow to be omitted in many constructs
involving subroutine calls through references.  For example,
C<< $foo[10]->('foo') >> may now be written C<$foo[10]('foo')>.
This is rather similar to how the arrow may be omitted from
C<< $foo[10]->{'foo'} >>.  Note however, that the arrow is still
required for C<< foo(10)->('bar') >>.
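
The following sketch shows both spellings side by side.  The handler
list and C<pick_handler()> are hypothetical names.

```perl
use strict;
use warnings;

my @handlers = (sub { "first got $_[0]" }, sub { "second got $_[0]" });

# Between subscripts the arrow is optional, for calls as well as for
# element lookups.
my $with_arrow    = $handlers[1]->('foo');
my $without_arrow = $handlers[1]('foo');

# After a function call the arrow is still required.
sub pick_handler { return $handlers[ $_[0] ] }
my $picked = pick_handler(0)->('bar');

print "$with_arrow\n$without_arrow\n$picked\n";
```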

=head2 Boolean assignment operators are legal lvalues

Constructs such as C<($a ||= 2) += 1> are now allowed.

=head2 exists() is supported on subroutine names

The exists() builtin now works on subroutine names.  A subroutine
is considered to exist if it has been declared (even if implicitly).
See L<perlfunc/exists> for examples.
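
As a brief sketch (the subroutine names are invented), the difference
between a declared and a defined subroutine looks like this:

```perl
use strict;
use warnings;

sub declared_only;          # forward declaration, no body yet
sub with_body { return 1 }

print "declared_only exists\n" if exists &declared_only;
print "with_body exists\n"     if exists &with_body;
print "missing does not\n"     unless exists &missing;

# defined() is stricter: it is true only once a body is present.
print "declared_only has no body\n" unless defined &declared_only;
```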

=head2 exists() and delete() are supported on array elements

The exists() and delete() builtins now work on simple arrays as well.
The behavior is similar to that on hash elements.

exists() can be used to check whether an array element has been
initialized.  This avoids autovivifying array elements that don't exist.
If the array is tied, the EXISTS() method in the corresponding tied
package will be invoked.

delete() may be used to remove an element from the array and return
it.  The array element at that position returns to its uninitialized
state, so that testing for the same element with exists() will return
false.  If the element happens to be the one at the end, the size of
the array also shrinks up to the highest element that tests true for
exists(), or 0 if none such is found.  If the array is tied, the DELETE() 
method in the corresponding tied package will be invoked.

See L<perlfunc/exists> and L<perlfunc/delete> for examples.
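
The behavior described above can be sketched with a sparse array;
the array name and contents are illustrative.

```perl
use strict;
use warnings;

my @slots;
$slots[0] = "first";
$slots[3] = "fourth";        # elements 1 and 2 are never initialized

print "slot 1 ", (exists $slots[1] ? "exists" : "does not exist"), "\n";
print "size before: ", scalar(@slots), "\n";

# Deleting the last element shrinks the array down to the highest
# remaining element that tests true for exists().
my $removed = delete $slots[3];
print "removed $removed, size after: ", scalar(@slots), "\n";
```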

=head2 Pseudo-hashes work better

Dereferencing some types of reference values in a pseudo-hash,
such as C<< $ph->{foo}[1] >>, was accidentally disallowed.  This has
been corrected.

When applied to a pseudo-hash element, exists() now reports whether
the specified value exists, not merely if the key is valid.

delete() now works on pseudo-hashes.  When given a pseudo-hash element
or slice it deletes the values corresponding to the keys (but not the keys
themselves).  See L<perlref/"Pseudo-hashes: Using an array as a hash">.

Pseudo-hash slices with constant keys are now optimized to array lookups
at compile-time.

List assignments to pseudo-hash slices are now supported.

The C<fields> pragma now provides ways to create pseudo-hashes, via
fields::new() and fields::phash().  See L<fields>.

    NOTE: The pseudo-hash data type continues to be experimental.
    Limiting oneself to the interface elements provided by the
    fields pragma will provide protection from any future changes.

=head2 Automatic flushing of output buffers

fork(), exec(), system(), qx//, and pipe open()s now flush buffers
of all files opened for output when the operation was attempted.  This
mostly eliminates confusing buffering mishaps suffered by users unaware
of how Perl internally handles I/O.

This is not supported on some platforms like Solaris where a suitably
correct implementation of fflush(NULL) isn't available.

=head2 Better diagnostics on meaningless filehandle operations

Constructs such as C<< open(<FH>) >> and C<< close(<FH>) >>
are now compile-time errors.  Attempting to read from filehandles that
were opened only for writing will now produce warnings (just as
writing to read-only filehandles does).

=head2 Where possible, buffered data discarded from duped input filehandle

C<< open(NEW, "<&OLD") >> now attempts to discard any data that
was previously read and buffered in C<OLD> before duping the handle.
On platforms where doing this is allowed, the next read operation
on C<NEW> will return the same data as the corresponding operation
on C<OLD>.  Formerly, it would have returned the data from the start
of the following disk block instead.

=head2 eof() has the same old magic as <>

C<eof()> used to return true if no attempt to read from C<< <> >> had
yet been made.  C<eof()> now has a little magic of its own: it opens
the C<< <> >> files.

=head2 binmode() can be used to set :crlf and :raw modes

binmode() now accepts a second argument that specifies a discipline
for the handle in question.  The two pseudo-disciplines ":raw" and
":crlf" are currently supported on DOS-derivative platforms.
See L<perlfunc/"binmode"> and L<open>.

=head2 C<-T> filetest recognizes UTF-8 encoded files as "text"

The algorithm used for the C<-T> filetest has been enhanced to
correctly identify UTF-8 content as "text".

=head2 system(), backticks and pipe open now reflect exec() failure

On Unix and similar platforms, system(), qx() and open(FOO, "cmd |")
etc., are implemented via fork() and exec().  When the underlying
exec() fails, earlier versions did not report the error properly,
since the exec() happened to be in a different process.

The child process now communicates with the parent about the
error in launching the external command, which allows these
constructs to return with their usual error value and set $!.
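
A sketch of checking for a failed launch on a Unix-like platform.
The command path is hypothetical and is assumed not to exist.

```perl
use strict;
use warnings;
no warnings 'exec';   # silence perl's own "Can't exec" warning

# Hypothetical path, assumed not to exist on this machine.
my $command = "/nonexistent/no-such-program";

# system() returns -1 when the program could not be started at all,
# and $! then holds the reason reported by the failed exec().
my $status = system($command);
if ($status == -1) {
    print "could not launch $command: $!\n";
}
```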

=head2 Improved diagnostics

Line numbers are no longer suppressed (under most likely circumstances)
during the global destruction phase.

Diagnostics emitted from code running in threads other than the main
thread are now accompanied by the thread ID.

Embedded null characters in diagnostics now actually show up.  They
used to truncate the message in prior versions.

$foo::a and $foo::b are now exempt from "possible typo" warnings only
if sort() is encountered in package C<foo>.

Unrecognized alphabetic escapes encountered when parsing quote
constructs now generate a warning, since they may take on new
semantics in later versions of Perl.

Many diagnostics now report the internal operation in which the warning
was provoked, like so:

    Use of uninitialized value in concatenation (.) at (eval 1) line 1.
    Use of uninitialized value in print at (eval 1) line 1.

Diagnostics that occur within eval may also report the file and line
number where the eval is located, in addition to the eval sequence
number and the line number within the evaluated text itself.  For
example:

    Not enough arguments for scalar at (eval 4)[newlib/perl5db.pl:1411] line 2, at EOF

=head2 Diagnostics follow STDERR

Diagnostic output now goes to whichever file the C<STDERR> handle
is pointing at, instead of always going to the underlying C runtime
library's C<stderr>.

=head2 More consistent close-on-exec behavior

On systems that support a close-on-exec flag on filehandles, the
flag is now set for any handles created by pipe(), socketpair(),
socket(), and accept(), if that is warranted by the value of $^F
that may be in effect.  Earlier versions neglected to set the flag
for handles created with these operators.  See L<perlfunc/pipe>,
L<perlfunc/socketpair>, L<perlfunc/socket>, L<perlfunc/accept>,
and L<perlvar/$^F>.

=head2 syswrite() ease-of-use

The length argument of C<syswrite()> has become optional.

=head2 Better syntax checks on parenthesized unary operators

Expressions such as:

    print defined(&foo,&bar,&baz);
    print uc("foo","bar","baz");
    undef($foo,&bar);

used to be accidentally allowed in earlier versions, and produced
unpredictable behaviour.  Some produced ancillary warnings
when used in this way; others silently did the wrong thing.

The parenthesized forms of most unary operators that expect a single
argument now ensure that they are not called with more than one
argument, making the cases shown above syntax errors.  The usual
behaviour of:

    print defined &foo, &bar, &baz;
    print uc "foo", "bar", "baz";
    undef $foo, &bar;

remains unchanged.  See L<perlop>.

=head2 Bit operators support full native integer width

The bit operators (& | ^ ~ << >>) now operate on the full native
integral width (the exact size of which is available in $Config{ivsize}).
For example, if your platform is either natively 64-bit or if Perl
has been configured to use 64-bit integers, these operations apply
to 8 bytes (as opposed to 4 bytes on 32-bit platforms).
For portability, be sure to mask off the excess bits in the result of
unary C<~>, e.g., C<~$x & 0xffffffff>.
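
The masking advice can be sketched like this; the flag value is
arbitrary.

```perl
use strict;
use warnings;
use Config;

my $flags = 0x0000ff00;

# The raw complement covers every bit of the native integer, so its
# value depends on $Config{ivsize}.
my $native = ~$flags;

# Masking keeps only the low 32 bits and gives the same answer on
# 32-bit and 64-bit builds.
my $portable = ~$flags & 0xffffffff;

printf "ivsize=%d native=%#x portable=%#x\n",
    $Config{ivsize}, $native, $portable;
```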

=head2 Improved security features

More potentially unsafe operations taint their results for improved
security.

The C<passwd> and C<shell> fields returned by getpwent(), getpwnam(),
and getpwuid() are now tainted, because users can affect their own
encrypted password and login shell.

The variable modified by shmread(), and messages returned by msgrcv()
(and its object-oriented interface IPC::SysV::Msg::rcv) are also tainted,
because other untrusted processes can modify messages and shared memory
segments for their own nefarious purposes.

=head2 More functional bareword prototype (*)

Bareword prototypes have been rationalized to enable them to be used
to override builtins that accept barewords and interpret them in
a special way, such as C<require> or C<do>.

Arguments prototyped as C<*> will now be visible within the subroutine
as either a simple scalar or as a reference to a typeglob.
See L<perlsub/Prototypes>.

=head2 C<require> and C<do> may be overridden

C<require> and C<do 'file'> operations may be overridden locally
by importing subroutines of the same name into the current package 
(or globally by importing them into the CORE::GLOBAL:: namespace).
Overriding C<require> will also affect C<use>, provided the override
is visible at compile-time.
See L<perlsub/"Overriding Built-in Functions">.

=head2 $^X variables may now have names longer than one character

Formerly, $^X was synonymous with ${"\cX"}, but $^XY was a syntax
error.  Now variable names that begin with a control character may be
arbitrarily long.  However, for compatibility reasons, these variables
I<must> be written with explicit braces, as C<${^XY}> for example.
C<${^XYZ}> is synonymous with ${"\cXYZ"}.  Variable names with more
than one control character, such as C<${^XY^Z}>, are illegal.

The old syntax has not changed.  As before, `^X' may be either a
literal control-X character or the two-character sequence `caret' plus
`X'.  When braces are omitted, the variable name stops after the
control character.  Thus C<"$^XYZ"> continues to be synonymous with
C<$^X . "YZ"> as before.

As before, lexical variables may not have names beginning with control
characters.  As before, variables whose names begin with a control
character are always forced to be in package `main'.  All such variables
are reserved for future extensions, except those that begin with
C<^_>, which may be used by user programs and are guaranteed not to
acquire special meaning in any future version of Perl.

=head2 New variable $^C reflects C<-c> switch

C<$^C> has a boolean value that reflects whether perl is being run
in compile-only mode (i.e. via the C<-c> switch).  Since
BEGIN blocks are executed under such conditions, this variable
enables perl code to determine whether actions that make sense
only during normal running are warranted.  See L<perlvar>.

=head2 New variable $^V contains Perl version as a string

C<$^V> contains the Perl version number as a string composed of
characters whose ordinals match the version numbers, i.e. v5.6.0.
This may be used in string comparisons.

See C<Support for strings represented as a vector of ordinals> for an
example.

=head2 Optional Y2K warnings

If Perl is built with the cpp macro C<PERL_Y2KWARN> defined,
it emits optional warnings when concatenating the number 19
with another number.

This behavior must be specifically enabled when running Configure.
See F<INSTALL> and F<README.Y2K>.

=head2 Arrays now always interpolate into double-quoted strings

In double-quoted strings, arrays now interpolate, no matter what.  The
behavior in earlier versions of perl 5 was that arrays would interpolate
into strings if the array had been mentioned before the string was
compiled, and otherwise Perl would raise a fatal compile-time error.
In versions 5.000 through 5.003, the error was

        Literal @example now requires backslash

In versions 5.004_01 through 5.6.0, the error was

        In string, @example now must be written as \@example

The idea here was to get people into the habit of writing
C<"fred\@example.com"> when they wanted a literal C<@> sign, just as
they have always written C<"Give me back my \$5"> when they wanted a
literal C<$> sign.

Starting with 5.6.1, when Perl sees an C<@> sign in a
double-quoted string, it I<always> attempts to interpolate an array,
regardless of whether or not the array has been used or declared
already.  The fatal error has been downgraded to an optional warning:

        Possible unintended interpolation of @example in string

This warns you that C<"fred@example.com"> is going to turn into
C<fred.com> if you don't backslash the C<@>.
See http://perl.plover.com/at-error.html for more details
about the history here.

=head2 @- and @+ provide starting/ending offsets of regex matches

The new magic variables @- and @+ provide the starting and ending
offsets, respectively, of $&, $1, $2, etc.  See L<perlvar> for
details.
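
A short sketch of recovering matched text from the offsets; the
sample string is arbitrary.

```perl
use strict;
use warnings;

my $string = "hello world";
my ($whole, $group);

if ($string =~ /(wor)ld/) {
    # Element 0 describes the whole match, element 1 describes $1.
    $whole = substr($string, $-[0], $+[0] - $-[0]);
    $group = substr($string, $-[1], $+[1] - $-[1]);
    printf "match '%s' spans offsets %d to %d\n", $whole, $-[0], $+[0];
    printf "group '%s' spans offsets %d to %d\n", $group, $-[1], $+[1];
}
```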

=head1 Modules and Pragmata

=head2 Modules

=over 4

=item attributes

While used internally by Perl as a pragma, this module also
provides a way to fetch subroutine and variable attributes.
See L<attributes>.

=item B

The Perl Compiler suite has been extensively reworked for this
release.  More of the standard Perl test suite passes when run
under the Compiler, but there is still a significant way to
go to achieve production quality compiled executables.

    NOTE: The Compiler suite remains highly experimental.  The
    generated code may not be correct, even when it manages to execute
    without errors.

=item Benchmark

Overall, Benchmark results exhibit lower average error and better timing
accuracy.  

You can now run tests for I<n> seconds instead of guessing the right
number of tests to run: e.g., timethese(-5, ...) will run each
piece of code for at least 5 CPU seconds.  Zero as the "number of
repetitions" means "for at least 3 CPU seconds".  The output format has
also changed.  For example:

   use Benchmark;$x=3;timethese(-5,{a=>sub{$x*$x},b=>sub{$x**2}})

will now output something like this:

   Benchmark: running a, b, each for at least 5 CPU seconds...
            a:  5 wallclock secs ( 5.77 usr +  0.00 sys =  5.77 CPU) @ 200551.91/s (n=1156516)
            b:  4 wallclock secs ( 5.00 usr +  0.02 sys =  5.02 CPU) @ 159605.18/s (n=800686)

New features: "each for at least N CPU seconds...", "wallclock secs",
and the "@ operations/CPU second (n=operations)".

timethese() now returns a reference to a hash of Benchmark objects containing
the test results, keyed on the names of the tests.

timethis() now returns the iterations field in the Benchmark result object
instead of 0.

timethese(), timethis(), and the new cmpthese() (see below) can also take
a format specifier of 'none' to suppress output.

A new function countit() is just like timeit() except that it takes a
TIME instead of a COUNT.

A new function cmpthese() prints a chart comparing the results of each test
returned from a timethese() call.  For each possible pair of tests, the
percentage speed difference (iters/sec or seconds/iter) is shown.

For other details, see L<Benchmark>.

=item ByteLoader

The ByteLoader is a dedicated extension to generate and run
Perl bytecode.  See L<ByteLoader>.

=item constant

References can now be used.

The new version also allows a leading underscore in constant names, but
disallows a double leading underscore (as in "__LINE__").  Some other names
are disallowed or warned against, including BEGIN, END, etc.  Some names
which were forced into main:: used to fail silently in some cases; now they're
fatal (outside of main::) and an optional warning (inside of main::).
The ability to detect whether a constant had been set with a given name has
been added.

See L<constant>.
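
A sketch of constants holding references; the constant names and
values are invented for illustration.

```perl
use strict;
use warnings;
use constant WEEKDAYS => [qw(Mon Tue Wed Thu Fri)];
use constant LIMITS   => { min => 0, max => 100 };

print "third weekday: ", WEEKDAYS->[2], "\n";
print "upper limit:   ", LIMITS->{max}, "\n";

# The unary plus stops perl from reading WEEKDAYS as a variable name.
print "weekday count: ", scalar(@{ +WEEKDAYS }), "\n";
```

Only the reference is constant here; the data it points to can still
be modified.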

=item charnames

This pragma implements the C<\N> string escape.  See L<charnames>.

=item Data::Dumper

A C<Maxdepth> setting can be specified to avoid venturing
too deeply into deep data structures.  See L<Data::Dumper>.

The XSUB implementation of Dump() is now automatically called if the
C<Useqq> setting is not in use.

Dumping C<qr//> objects works correctly.
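
The C<Maxdepth> setting might be used as in this sketch; the data
structure is made up.

```perl
use strict;
use warnings;
use Data::Dumper;

my $config = { server => { host => "localhost", ports => [80, 443] } };

# Stop after the first level; deeper structures are shown only as
# their stringified reference, such as 'HASH(0x...)'.
local $Data::Dumper::Maxdepth = 1;
my $shallow = Dumper($config);
print $shallow;
```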

=item DB

C<DB> is an experimental module that exposes a clean abstraction
to Perl's debugging API.

=item DB_File

DB_File can now be built with Berkeley DB versions 1, 2 or 3.
See C<ext/DB_File/Changes>.

=item Devel::DProf

Devel::DProf, a Perl source code profiler, has been added.  See
L<Devel::DProf> and L<dprofpp>.

=item Devel::Peek

The Devel::Peek module provides access to the internal representation
of Perl variables and data.  It is a data debugging tool for the XS programmer.

=item Dumpvalue

The Dumpvalue module provides screen dumps of Perl data.

=item DynaLoader

DynaLoader now supports a dl_unload_file() function on platforms that
support unloading shared objects using dlclose().

Perl can also optionally arrange to unload all extension shared objects
loaded by Perl.  To enable this, build Perl with the Configure option
C<-Accflags=-DDL_UNLOAD_ALL_AT_EXIT>.  (This may be useful if you are
using Apache with mod_perl.)

=item English

$PERL_VERSION now stands for C<$^V> (a string value) rather than for C<$]>
(a numeric value).

=item Env

Env now supports accessing environment variables like PATH as array
variables.
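
A sketch of the array interface.  The directory pushed onto the path
is made up, and the separator is taken from C<$Config{path_sep}>.

```perl
use strict;
use warnings;
use Config;
use Env qw(@PATH);

# @PATH is tied to $ENV{PATH}: each element is one directory.
print "search path has ", scalar(@PATH), " entries\n";

# Changes to the array show up in the environment string.
# "/opt/example/bin" is a made-up directory used for illustration.
push @PATH, "/opt/example/bin";
print "PATH now ends with: ",
    (split /\Q$Config{path_sep}\E/, $ENV{PATH})[-1], "\n";
```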

=item Fcntl

More Fcntl constants added: F_SETLK64, F_SETLKW64, O_LARGEFILE for
large file (more than 2GB) access (NOTE: the O_LARGEFILE is
automatically added to sysopen() flags if large file support has been
configured, as is the default), Free/Net/OpenBSD locking behaviour
flags F_FLOCK, F_POSIX, Linux F_SHLCK, and O_ACCMODE: the combined
mask of O_RDONLY, O_WRONLY, and O_RDWR.  The seek()/sysseek()
constants SEEK_SET, SEEK_CUR, and SEEK_END are available via the
C<:seek> tag.  The chmod()/stat() S_IF* constants and S_IS* functions
are available via the C<:mode> tag.
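
The two tags might be used as follows.  This sketch seeks within an
in-memory file so that it does not depend on any file on disk.

```perl
use strict;
use warnings;
use Fcntl qw(:seek :mode);

# :mode supplies the S_IS* test functions for stat() results.
my $mode = (stat ".")[2];
print "current directory is a directory\n" if S_ISDIR($mode);

# :seek supplies the whence constants for seek() and sysseek().
my $data = "0123456789";
open(my $fh, '<', \$data) or die "cannot open in-memory file: $!";
seek($fh, -3, SEEK_END)   or die "cannot seek: $!";
read($fh, my $tail, 3);
close($fh);
print "last three bytes: $tail\n";
```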

=item File::Compare

A compare_text() function has been added, which allows custom
comparison functions.  See L<File::Compare>.

=item File::Find

File::Find now works correctly when the wanted() function is either
autoloaded or is a symbolic reference.

A bug that caused File::Find to lose track of the working directory
when pruning top-level directories has been fixed.

File::Find now also supports several other options to control its
behavior.  It can follow symbolic links if the C<follow> option is
specified.  Enabling the C<no_chdir> option will make File::Find skip
changing the current directory when walking directories.  The C<untaint>
flag can be useful when running with taint checks enabled.

See L<File::Find>.
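
A sketch of the C<no_chdir> option.  It builds a throwaway directory
tree with File::Temp; the file names are illustrative.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use File::Temp qw(tempdir);

# Build a small throwaway tree to walk.
my $root = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($root, "sub") or die "mkdir failed: $!";
for my $name ("a.txt", File::Spec->catfile("sub", "b.txt")) {
    my $path = File::Spec->catfile($root, $name);
    open(my $fh, '>', $path) or die "cannot create $path: $!";
    close($fh);
}

# With no_chdir set, find() stays in the current directory and $_
# holds the full path, the same value as $File::Find::name.
my @found;
find({
    wanted   => sub { push @found, $_ if -f $_ },
    no_chdir => 1,
}, $root);

print "$_\n" for sort @found;
```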

=item File::Glob

This extension implements BSD-style file globbing.  By default,
it will also be used for the internal implementation of the glob()
operator.  See L<File::Glob>.

=item File::Spec

New methods have been added to the File::Spec module: devnull() returns
the name of the null device (/dev/null on Unix) and tmpdir() the name of
the temp directory (normally /tmp on Unix).  There are now also methods
to convert between absolute and relative filenames: abs2rel() and
rel2abs().  For compatibility with operating systems that specify volume
names in file paths, the splitpath(), splitdir(), and catdir() methods
have been added.
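
The new methods might be exercised like this.  The sketch calls
File::Spec::Unix directly for the path conversions so that the
results do not vary by platform; the paths themselves are examples.

```perl
use strict;
use warnings;
use File::Spec;
use File::Spec::Unix;

# Portable lookups: the answers depend on the current platform.
print "null device: ", File::Spec->devnull(), "\n";
print "temp dir:    ", File::Spec->tmpdir(), "\n";

# Conversions between absolute and relative names.
my $relative = File::Spec::Unix->abs2rel("/usr/lib/perl5", "/usr");
my $absolute = File::Spec::Unix->rel2abs("lib/perl5", "/usr");

# Splitting a path into volume, directories, and file name.
my ($volume, $dirs, $file) =
    File::Spec::Unix->splitpath("/usr/lib/perl5/strict.pm");
my $rejoined = File::Spec::Unix->catdir(File::Spec::Unix->splitdir($dirs));

print "$relative\n$absolute\n$dirs\n$file\n$rejoined\n";
```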

=item File::Spec::Functions

The new File::Spec::Functions module provides a function interface
to the File::Spec module.  It allows the shorthand

    $fullname = catfile($dir1, $dir2, $file);

instead of

    $fullname = File::Spec->catfile($dir1, $dir2, $file);

=item Getopt::Long

Getopt::Long licensing has changed to allow the Perl Artistic License
as well as the GPL. It used to be GPL only, which got in the way of
non-GPL applications that wanted to use Getopt::Long.

Getopt::Long encourages the use of Pod::Usage to produce help
messages. For example:

    use Getopt::Long;
    use Pod::Usage;
    my $man = 0;
    my $help = 0;
    GetOptions('help|?' => \$help, man => \$man) or pod2usage(2);
    pod2usage(1) if $help;
    pod2usage(-exitstatus => 0, -verbose => 2) if $man;

    __END__

    =head1 NAME

    sample - Using Getopt::Long and Pod::Usage

    =head1 SYNOPSIS

    sample [options] [file ...]

     Options:
       -help            brief help message
       -man             full documentation

    =head1 OPTIONS

    =over 8

    =item B<-help>

    Prints a brief help message and exits.

    =item B<-man>

    Prints the manual page and exits.

    =back

    =head1 DESCRIPTION

    B<This program> will read the given input file(s) and do something
    useful with the contents thereof.

    =cut

See L<Pod::Usage> for details.

A bug that prevented the non-option call-back <> from being
specified as the first argument has been fixed.

To specify the characters < and > as option starters, use ><. Note,
however, that changing option starters is strongly deprecated. 

=item IO

write() and syswrite() will now accept a single-argument
form of the call, for consistency with Perl's syswrite().

You can now create a TCP-based IO::Socket::INET without forcing
a connect attempt.  This allows you to configure its options
(like making it non-blocking) and then call connect() manually.

A bug that prevented the IO::Socket::protocol() accessor
from ever returning the correct value has been corrected.

IO::Socket::connect now uses non-blocking IO instead of alarm()
to do connect timeouts.

IO::Socket::accept now uses select() instead of alarm() for doing
timeouts.

IO::Socket::INET->new now sets $! correctly on failure. $@ is
still set for backwards compatibility.

=item JPL

Java Perl Lingo is now distributed with Perl.  See jpl/README
for more information.

=item lib

C<use lib> now weeds out any trailing duplicate entries.
C<no lib> removes all named entries.

=item Math::BigInt

The bitwise operations C<<< << >>>, C<<< >> >>>, C<&>, C<|>,
and C<~> are now supported on bigints.
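
A brief sketch of shifting and masking beyond the native integer
width; the shift counts are arbitrary.

```perl
use strict;
use warnings;
use Math::BigInt;

my $one = Math::BigInt->new(1);

# Shift well past the width of a native 64-bit integer.
my $big = $one << 70;
print "1 << 70 = $big\n";

# Shifting back and masking behave as they do for native integers.
my $back = $big >> 68;
my $low  = ($big + 5) & 0xff;
print "back = $back, low byte = $low\n";
```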

=item Math::Complex

The accessor methods Re, Im, arg, abs, rho, and theta can now also
act as mutators (accessor $z->Re(), mutator $z->Re(3)).

The class method C<display_format> and the corresponding object method
C<display_format>, in addition to accepting just one argument, now can
also accept a parameter hash.  Recognized keys of a parameter hash are
C<"style">, which corresponds to the old one parameter case, and two
new parameters: C<"format">, which is a printf()-style format string
(defaults usually to C<"%.15g">, you can revert to the default by
setting the format string to C<undef>) used for both parts of a
complex number, and C<"polar_pretty_print"> (defaults to true),
which controls whether an attempt is made to try to recognize small
multiples and rationals of pi (2pi, pi/2) at the argument (angle) of a
polar complex number.

The potentially disruptive change is that in list context both methods
now I<return the parameter hash>, instead of only the value of the
C<"style"> parameter.
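
The accessor and mutator forms, and the parameter-hash form of
C<display_format>, can be sketched as follows.  The numbers and the
C<"%.3f"> format are arbitrary choices.

```perl
use strict;
use warnings;
use Math::Complex;

my $z = Math::Complex->make(1, 2);    # 1+2i

# Called without arguments the methods read a part...
my $real_before = $z->Re;

# ...and with an argument they replace it.
$z->Re(3);
$z->Im(-4);
my $cartesian = "$z";
print "z is now $cartesian, |z| = ", abs($z), "\n";

# display_format() also accepts a parameter hash.
$z->display_format(style => "polar", format => "%.3f");
print "in polar form: $z\n";
```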

=item Math::Trig

A little bit of radial trigonometry (cylindrical and spherical),
radial coordinate conversions, and the great circle distance were added.
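
As a rough sketch of the great circle distance, consider two points
on the equator a quarter turn apart.  The helper C<to_radial()> and
the Earth radius figure are assumptions made for this example.

```perl
use strict;
use warnings;
use Math::Trig qw(pi deg2rad great_circle_distance);

# Math::Trig measures the second angle down from the north pole, so
# a latitude has to be converted as 90 degrees minus the latitude.
sub to_radial {
    my ($lat_deg, $lon_deg) = @_;
    return (deg2rad($lon_deg), deg2rad(90 - $lat_deg));
}

# Two points on the equator, a quarter turn apart.
my @from = to_radial(0, 0);
my @to   = to_radial(0, 90);

my $earth_radius_km = 6378;    # rough equatorial radius
my $km = great_circle_distance(@from, @to, $earth_radius_km);
printf "quarter of the equator: %.0f km\n", $km;
```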

=item Pod::Parser, Pod::InputObjects

Pod::Parser is a base class for parsing and selecting sections of
pod documentation from an input stream.  This module takes care of
identifying pod paragraphs and commands in the input and hands off the
parsed paragraphs and commands to user-defined methods which are free
to interpret or translate them as they see fit.

Pod::InputObjects defines some input objects needed by Pod::Parser, and
is intended for advanced users of Pod::Parser who need more information
about a command than its name and text.

As of release 5.6.0 of Perl, Pod::Parser is now the officially sanctioned
"base parser code" recommended for use by all pod2xxx translators.
Pod::Text (pod2text) and Pod::Man (pod2man) have already been converted
to use Pod::Parser and efforts to convert Pod::HTML (pod2html) are already
underway.  For any questions or comments about pod parsing and translating
issues and utilities, please use the pod-people@perl.org mailing list.

For further information, please see L<Pod::Parser> and L<Pod::InputObjects>.

=item Pod::Checker, podchecker

This utility checks pod files for correct syntax, according to
L<perlpod>.  Obvious errors are flagged as such, while warnings are
printed for mistakes that can be handled gracefully.  The checklist is
not complete yet.  See L<Pod::Checker>.

=item Pod::ParseUtils, Pod::Find

These modules provide a set of gizmos that are useful mainly for pod
translators.  L<Pod::Find|Pod::Find> traverses directory structures and
returns found pod files, along with their canonical names (like
C<File::Spec::Unix>).  L<Pod::ParseUtils|Pod::ParseUtils> contains
B<Pod::List> (useful for storing pod list information), B<Pod::Hyperlink>
(for parsing the contents of C<LE<lt>E<gt>> sequences) and B<Pod::Cache>
(for caching information about pod files, e.g., link nodes).

=item Pod::Select, podselect

Pod::Select is a subclass of Pod::Parser which provides a function
named "podselect()" to filter out user-specified sections of raw pod
documentation from an input stream. podselect is a script that provides
access to Pod::Select from other scripts to be used as a filter.
See L<Pod::Select>.

=item Pod::Usage, pod2usage

Pod::Usage provides the function "pod2usage()" to print usage messages for
a Perl script based on its embedded pod documentation.  The pod2usage()
function is generally useful to all script authors since it lets them
write and maintain a single source (the pods) for documentation, thus
removing the need to create and maintain redundant usage message text
consisting of information already in the pods.

There is also a pod2usage script which can be used from other kinds of
scripts to print usage messages from pods (even for non-Perl scripts
with pods embedded in comments).

For details and examples, please see L<Pod::Usage>.

=item Pod::Text and Pod::Man

Pod::Text has been rewritten to use Pod::Parser.  While pod2text() is
still available for backwards compatibility, the module now has a new
preferred interface.  See L<Pod::Text> for the details.  The new Pod::Text
module is easily subclassed for tweaks to the output, and two such
subclasses (Pod::Text::Termcap for man-page-style bold and underlining
using termcap information, and Pod::Text::Color for markup with ANSI color
sequences) are now standard.

pod2man has been turned into a module, Pod::Man, which also uses
Pod::Parser.  In the process, several outstanding bugs related to quotes
in section headers, quoting of code escapes, and nested lists have been
fixed.  pod2man is now a wrapper script around this module.

=item SDBM_File

An EXISTS method has been added to this module (and sdbm_exists() has
been added to the underlying sdbm library), so one can now call exists
on an SDBM_File tied hash and get the correct result, rather than a
runtime error.

A bug that may have caused data loss when more than one disk block
happens to be read from the database in a single FETCH() has been
fixed.

=item Sys::Syslog

Sys::Syslog now uses XSUBs to access facilities from syslog.h so it
no longer requires syslog.ph to exist. 

=item Sys::Hostname

Sys::Hostname now uses XSUBs to call the C library's gethostname() or
uname() if they exist.

=item Term::ANSIColor

Term::ANSIColor is a very simple module to provide easy and readable
access to the ANSI color and highlighting escape sequences, supported by
most ANSI terminal emulators.  It is now included standard.

=item Time::Local

The timelocal() and timegm() functions used to silently return bogus
results when the date fell outside the machine's integer range.  They
now consistently croak() if the date falls in an unsupported range.
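
Because these functions now croak(), callers that handle untrusted dates
may want to trap the failure.  The following is a minimal sketch; the
helper name epoch_or_undef() is hypothetical.  Exactly which dates count
as unsupported depends on the platform and the Time::Local version, so an
impossible calendar date is used here to provoke the croak():

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Hypothetical helper: return the epoch time, or undef if timegm() croaks.
# Arguments are sec, min, hour, mday, mon (0-11), year.
sub epoch_or_undef {
    my @fields = @_;
    my $epoch = eval { timegm(@fields) };
    return $epoch;    # undef when timegm() died; the reason is left in $@
}

my $y2k = epoch_or_undef(0, 0, 0, 1, 0, 2000);    # 2000-01-01 00:00:00 UTC
my $bad = epoch_or_undef(0, 0, 0, 31, 1, 2000);   # 31 February does not exist
```

Checking C<defined> on the result keeps the error handling at the call
site instead of letting the exception propagate.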

=item Win32

The error return value in list context has been changed for all functions
that return a list of values.  Previously these functions returned a list
with a single element C<undef> if an error occurred.  Now these functions
return the empty list in these situations.  This applies to the following
functions:

    Win32::FsType
    Win32::GetOSVersion

The remaining functions are unchanged and continue to return C<undef> on
error even in list context.

The Win32::SetLastError(ERROR) function has been added as a complement
to the Win32::GetLastError() function.

The new Win32::GetFullPathName(FILENAME) returns the full absolute
pathname for FILENAME in scalar context.  In list context it returns
a two-element list containing the fully qualified directory name and
the filename.  See L<Win32>.

=item XSLoader

The XSLoader extension is a simpler alternative to DynaLoader.
See L<XSLoader>.

=item DBM Filters

A new feature called "DBM Filters" has been added to all the
DBM modules--DB_File, GDBM_File, NDBM_File, ODBM_File, and SDBM_File.
DBM Filters add four new methods to each DBM module:

    filter_store_key
    filter_store_value
    filter_fetch_key
    filter_fetch_value

These can be used to filter key-value pairs before the pairs are
written to the database or just after they are read from the database.
See L<perldbmfilter> for further information.

=back
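
The DBM Filters entry above can be illustrated with a small sketch.  It
assumes SDBM_File is available and uses a temporary directory; the
database name C<demo> and the choice of case-folding filters are
illustrative only.  Each filter receives the datum in C<$_> and modifies
it in place:

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use File::Temp qw(tempdir);
use SDBM_File;

my $dir = tempdir(CLEANUP => 1);
my %db;
my $tied = tie %db, 'SDBM_File', "$dir/demo", O_RDWR | O_CREAT, 0666
    or die "Cannot open database: $!";

# Lower-case every key on its way into the DBM layer.
$tied->filter_store_key(sub { $_ = lc $_ });
# Upper-case every value as it is read back.
$tied->filter_fetch_value(sub { $_ = uc $_ });

$db{Fruit} = 'apple';
my $value = $db{FRUIT};    # the key filter maps both spellings to "fruit"

undef $tied;               # drop the extra reference before untie
untie %db;
```

See L<perldbmfilter> for the authoritative description of when each
filter is invoked.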

=head2 Pragmata

C<use attrs> is now obsolete, and is only provided for
backward-compatibility.  It's been replaced by the C<sub : attributes>
syntax.  See L<perlsub/"Subroutine Attributes"> and L<attributes>.

The lexical warnings pragma, C<use warnings;>, controls optional warnings.
See L<perllexwarn>.

C<use filetest> controls the behaviour of filetests (C<-r> C<-w>
...).  Currently only one subpragma is implemented, "use filetest
'access';", which uses access(2) or equivalent to check permissions
instead of using stat(2) as usual.  This matters in filesystems
where there are ACLs (access control lists): the stat(2) might lie,
but access(2) knows better.
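
A minimal sketch of the lexical scoping of C<use filetest 'access'>
follows.  It creates a scratch file with File::Temp (an assumption made
for the example); on a filesystem without ACLs both checks are expected
to agree:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $path) = tempfile(UNLINK => 1);
close $fh or die "close failed: $!";

# Default behaviour: -r and -w are derived from the mode bits seen by stat(2).
my $stat_says = (-r $path && -w $path) ? 1 : 0;

my $access_says;
{
    # Within this block the same operators consult access(2) or an equivalent.
    use filetest 'access';
    $access_says = (-r $path && -w $path) ? 1 : 0;
}
```

Outside the block the filetest operators revert to their usual behaviour.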

The C<open> pragma can be used to specify default disciplines for
handle constructors (e.g. open()) and for qx//.  The two
pseudo-disciplines C<:raw> and C<:crlf> are currently supported on
DOS-derivative platforms (i.e. where binmode is not a no-op).
See also L</"binmode() can be used to set :crlf and :raw modes">.

=head1 Utility Changes

=head2 dprofpp

C<dprofpp> is used to display profile data generated using C<Devel::DProf>.
See L<dprofpp>.

=head2 find2perl

The C<find2perl> utility now uses the enhanced features of the File::Find
module.  The -depth and -follow options are supported.  Pod documentation
is also included in the script.

=head2 h2xs

The C<h2xs> tool can now work in conjunction with C<C::Scan> (available
from CPAN) to automatically parse real-life header files.  The C<-M>,
C<-a>, C<-k>, and C<-o> options are new.

=head2 perlcc

C<perlcc> now supports the C and Bytecode backends.  By default,
it generates output from the simple C backend rather than the
optimized C backend.

Support for non-Unix platforms has been improved.

=head2 perldoc

C<perldoc> has been reworked to avoid possible security holes.
It will not by default let itself be run as the superuser, but you
may still use the B<-U> switch to try to make it drop privileges
first.

=head2 The Perl Debugger

Many bug fixes and enhancements were added to F<perl5db.pl>, the
Perl debugger.  New commands include C<< < ? >>, C<< > ? >>, and
C<< { ? >> to list out current actions, and C<man I<docpage>> to run
your doc viewer on some perl docset.  Quoted options are now
supported.  The help information was rearranged, and should be
viewable once again if you're using B<less>
as your pager.  A serious security hole was plugged--you should
immediately remove all older versions of the Perl debugger as
installed in previous releases, all the way back to perl3, from
your system to avoid being bitten by this.

=head1 Improved Documentation

Many of the platform-specific README files are now part of the perl
installation.  See L<perl> for the complete list.

=over 4

=item perlapi.pod

The official list of public Perl API functions.

=item perlboot.pod

A tutorial for beginners on object-oriented Perl.

=item perlcompile.pod

An introduction to using the Perl Compiler suite.

=item perldbmfilter.pod

A howto document on using the DBM filter facility.

=item perldebug.pod

All material unrelated to running the Perl debugger, plus all
low-level guts-like details that risked crushing the casual user
of the debugger, have been relocated from the old manpage to the
next entry below.

=item perldebguts.pod

This new manpage contains excessively low-level material not related
to the Perl debugger, but slightly related to debugging Perl itself.
It also contains some arcane internal details of how the debugging
process works that may only be of interest to developers of Perl
debuggers.

=item perlfork.pod

Notes on the fork() emulation currently available for the Windows platform.

=item perlfilter.pod

An introduction to writing Perl source filters.

=item perlhack.pod

Some guidelines for hacking the Perl source code.

=item perlintern.pod

A list of internal functions in the Perl source code.
(List is currently empty.)

=item perllexwarn.pod

Introduction and reference information about lexically scoped
warning categories.

=item perlnumber.pod

Detailed information about numbers as they are represented in Perl.

=item perlopentut.pod

A tutorial on using open() effectively.

=item perlreftut.pod

A tutorial that introduces the essentials of references.

=item perltootc.pod

A tutorial on managing class data for object modules.

=item perltodo.pod

Discussion of the most often wanted features that may someday be
supported in Perl.

=item perlunicode.pod

An introduction to Unicode support features in Perl.

=back

=head1 Performance enhancements

=head2 Simple sort() using { $a <=> $b } and the like are optimized

Many common sort() operations using a simple inlined block are now
optimized for faster performance.
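
The kind of simple inlined comparison block this refers to looks like the
following sketch (the variable names are illustrative):

```perl
use strict;
use warnings;

my @numbers = (10, 9, 100, 1);

my @ascending  = sort { $a <=> $b } @numbers;   # numeric, smallest first
my @descending = sort { $b <=> $a } @numbers;   # numeric, largest first
my @by_string  = sort { $a cmp $b } @numbers;   # string comparison
```

A block that calls a function or does extra work per comparison is not
one of these simple forms.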

=head2 Optimized assignments to lexical variables

Certain operations in the RHS of assignment statements have been
optimized to directly set the lexical variable on the LHS,
eliminating redundant copying overheads.

=head2 Faster subroutine calls

Minor changes in how subroutine calls are handled internally
provide marginal improvements in performance.

=head2 delete(), each(), values() and hash iteration are faster

The hash values returned by delete(), each(), values() and hashes in a
list context are the actual values in the hash, instead of copies.
This results in significantly better performance, because it eliminates
needless copying in most situations.
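
One visible consequence is that modifying the values returned by values()
modifies the hash itself.  A short sketch, with an illustrative hash:

```perl
use strict;
use warnings;

my %price = (apple => 2, pear => 3);

# Each element produced by values() is the hash value itself, not a copy,
# so assigning through $_ updates %price.
$_ *= 10 for values %price;
```

Code that relied on getting private copies should copy explicitly, for
example with C<my @copy = values %price;>.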

=head1 Installation and Configuration Improvements

=head2 -Dusethreads means something different

The -Dusethreads flag now enables the experimental interpreter-based thread
support by default.  To get the flavor of experimental threads that was in
5.005 instead, you need to run Configure with "-Dusethreads -Duse5005threads".

As of v5.6.0, interpreter-threads support is still lacking a way to
create new threads from Perl (i.e., C<use Thread;> will not work with
interpreter threads).  C<use Thread;> continues to be available when you
specify the -Duse5005threads option to Configure, bugs and all.

    NOTE: Support for threads continues to be an experimental feature.
    Interfaces and implementation are subject to sudden and drastic changes.

=head2 New Configure flags

The following new flags may be enabled on the Configure command line
by running Configure with C<-Dflag>.

    usemultiplicity
    usethreads useithreads	(new interpreter threads: no Perl API yet)
    usethreads use5005threads	(threads as they were in 5.005)

    use64bitint			(equal to now deprecated 'use64bits')
    use64bitall

    uselongdouble
    usemorebits
    uselargefiles
    usesocks			(only SOCKS v5 supported)

=head2 Threadedness and 64-bitness now more daring

The Configure options enabling the use of threads and the use of
64-bitness are now more daring in the sense that they no longer have an
explicit list of operating systems of known threads/64-bit
capabilities.  In other words: if your operating system has the
necessary APIs and datatypes, you should be able just to go ahead and
use them, for threads by Configure -Dusethreads, and for 64 bits
either explicitly by Configure -Duse64bitint or implicitly if your
system has 64-bit wide datatypes.  See also L</"64-bit support">.

=head2 Long Doubles

Some platforms have "long doubles", floating point numbers of even
larger range than ordinary "doubles".  To enable using long doubles for
Perl's scalars, use -Duselongdouble.

=head2 -Dusemorebits

You can enable both -Duse64bitint and -Duselongdouble with -Dusemorebits.
See also L</"64-bit support">.

=head2 -Duselargefiles

Some platforms support system APIs that are capable of handling large files
(typically, files larger than two gigabytes).  Perl will try to use these
APIs if you ask for -Duselargefiles.

See L</"Large file support"> for more information. 
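
To see which of these choices a given perl was built with, the Config
module can be consulted at run time.  This is only a sketch; the list of
keys shown is a selection, and a key is empty or undefined when the
corresponding option was not enabled:

```perl
use strict;
use warnings;
use Config;

# Collect a few build-time settings recorded by Configure.
my %build;
for my $name (qw(use64bitint use64bitall uselongdouble uselargefiles
                 ivsize nvsize)) {
    my $value = $Config{$name};
    $build{$name} = defined $value ? $value : '';
}

for my $name (sort keys %build) {
    printf "%-14s %s\n", $name, $build{$name};
}
```

The C<ivsize> and C<nvsize> entries give the size in bytes of Perl's
integer and floating point scalars respectively.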

=head2 installusrbinperl

You can use "Configure -Uinstallusrbinperl" which causes installperl
to skip installing perl also as /usr/bin/perl.  This is useful if you
prefer not to modify /usr/bin for some reason, but harmful because
many scripts expect to find perl at /usr/bin/perl.

=head2 SOCKS support

You can use "Configure -Dusesocks" which causes Perl to probe
for the SOCKS proxy protocol library (v5, not v4).  For more information
on SOCKS, see:

    http://www.socks.nec.com/

=head2 C<-A> flag

You can "post-edit" the Configure variables using the Configure C<-A>
switch.  The editing happens immediately after the platform specific
hints files have been processed but before the actual configuration
process starts.  Run C<Configure -h> to find out the full C<-A> syntax.

=head2 Enhanced Installation Directories

The installation structure has been enriched to improve the support
for maintaining multiple versions of perl, to provide locations for
vendor-supplied modules, scripts, and manpages, and to ease maintenance
of locally-added modules, scripts, and manpages.  See the section on
Installation Directories in the INSTALL file for complete details.
For most users building and installing from source, the defaults should
be fine.

If you previously used C<Configure -Dsitelib> or C<-Dsitearch> to set
special values for library directories, you might wish to consider using
the new C<-Dsiteprefix> setting instead.  Also, if you wish to re-use a
config.sh file from an earlier version of perl, you should be sure to
check that Configure makes sensible choices for the new directories.
See INSTALL for complete details.

=head1 Platform specific changes

=head2 Supported platforms

=over 4

=item *

The Mach CThreads (NEXTSTEP, OPENSTEP) are now supported by the Thread
extension.

=item *

GNU/Hurd is now supported.

=item *

Rhapsody/Darwin is now supported.

=item *

EPOC is now supported (on Psion 5).

=item *

The cygwin port (formerly cygwin32) has been greatly improved.

=back

=head2 DOS

=over 4

=item *

Perl now works with djgpp 2.02 (and 2.03 alpha).

=item *

Environment variable names are not converted to uppercase any more.

=item *

Incorrect exit codes from backticks have been fixed.

=item *

This port continues to use its own builtin globbing (not File::Glob).

=back

=head2 OS390 (OpenEdition MVS)

Support for this EBCDIC platform has not been renewed in this release.
There are difficulties in reconciling Perl's standardization on UTF-8
as its internal representation for characters with the EBCDIC character
set, because the two are incompatible.

It is unclear whether future versions will renew support for this
platform, but the possibility exists.

=head2 VMS

Numerous revisions and extensions to configuration, build, testing, and
installation process to accommodate core changes and VMS-specific options.

Expand %ENV-handling code to allow runtime mapping to logical names,
CLI symbols, and CRTL environ array.

Extension of subprocess invocation code to accept filespecs as command
"verbs".

Add to Perl command line processing the ability to use default file types and
to recognize Unix-style C<2E<gt>&1>.

Expansion of File::Spec::VMS routines, and integration into ExtUtils::MM_VMS.

Extension of ExtUtils::MM_VMS to handle complex extensions more flexibly.

Barewords at start of Unix-syntax paths may be treated as text rather than
only as logical names.

Optional secure translation of several logical names used internally by Perl.

Miscellaneous bugfixing and porting of new core code to VMS.

Thanks are gladly extended to the many people who have contributed VMS
patches, testing, and ideas.

=head2 Win32

Perl can now emulate fork() internally, using multiple interpreters running
in different concurrent threads.  This support must be enabled at build
time.  See L<perlfork> for detailed information.

When given a pathname that consists only of a drivename, such as C<A:>,
opendir() and stat() now use the current working directory for the drive
rather than the drive root.

The builtin XSUB functions in the Win32:: namespace are documented.  See
L<Win32>.

$^X now contains the full path name of the running executable.

A Win32::GetLongPathName() function is provided to complement
Win32::GetFullPathName() and Win32::GetShortPathName().  See L<Win32>.

POSIX::uname() is supported.

system(1,...) now returns true process IDs rather than process
handles.  kill() accepts any real process id, rather than strictly
return values from system(1,...).

For better compatibility with Unix, C<kill(0, $pid)> can now be used to
test whether a process exists.
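
A minimal sketch of that test follows.  The helper name process_exists()
is hypothetical.  Note that kill() can also return false when the process
exists but the caller lacks permission to signal it, so a false result is
not conclusive on every system:

```perl
use strict;
use warnings;

# Hypothetical helper: true if signal 0 can be delivered to $pid.
# Signal 0 sends nothing; it only checks that the target can be signalled.
sub process_exists {
    my ($pid) = @_;
    return kill(0, $pid) ? 1 : 0;
}

my $self_alive = process_exists($$);    # the current process always exists
```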

The C<Shell> module is supported.

Better support for building Perl under command.com in Windows 95
has been added.

Scripts are read in binary mode by default to allow ByteLoader (and
the filter mechanism in general) to work properly.  For compatibility,
the DATA filehandle will be set to text mode if a carriage return is
detected at the end of the line containing the __END__ or __DATA__
token; if not, the DATA filehandle will be left open in binary mode.
Earlier versions always opened the DATA filehandle in text mode.

The glob() operator is implemented via the C<File::Glob> extension,
which supports glob syntax of the C shell.  This increases the flexibility
of the glob() operator, but there may be compatibility issues for
programs that relied on the older globbing syntax.  If you want to
preserve compatibility with the older syntax, you might want to run
perl with C<-MFile::DosGlob>.  For details and compatibility information,
see L<File::Glob>.

=head1 Significant bug fixes

=head2 <HANDLE> on empty files

With C<$/> set to C<undef>, "slurping" an empty file returns a string of
zero length (instead of C<undef>, as it used to) the first time the
HANDLE is read after C<$/> is set to C<undef>.  Further reads yield
C<undef>.

This means that the following will append "foo" to an empty file (it used
to do nothing):

    perl -0777 -pi -e 's/^/foo/' empty_file

The behaviour of:

    perl -pi -e 's/^/foo/' empty_file

is unchanged (it continues to leave the file empty).
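
The same behaviour can be observed from within a program.  This sketch
uses File::Temp to create an empty file (an assumption made for the
example) and reads it twice in slurp mode:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $path) = tempfile(UNLINK => 1);
close $tmp or die "close failed: $!";      # leave the file empty

open my $in, '<', $path or die "Cannot open $path: $!";
my ($first, $second);
{
    local $/;                # undef: read the whole file at once
    $first  = <$in>;         # defined, zero-length string
    $second = <$in>;         # undef: nothing further to read
}
close $in;
```

Code that tests the result of a slurp with C<defined> therefore treats an
empty file as successfully read.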

=head2 C<eval '...'> improvements

Line numbers (as reflected by caller() and most diagnostics) within
C<eval '...'> were often incorrect where here documents were involved.
This has been corrected.

Lexical lookups for variables appearing in C<eval '...'> within
functions that were themselves called within an C<eval '...'> were
searching the wrong place for lexicals.  The lexical search now
correctly ends at the subroutine's block boundary.

The use of C<return> within C<eval {...}> caused $@ not to be reset
correctly when no exception occurred within the eval.  This has
been fixed.

Parsing of here documents used to be flawed when they appeared as
the replacement expression in C<eval 's/.../.../e'>.  This has
been fixed.

=head2 All compilation errors are true errors

Some "errors" encountered at compile time were by necessity 
generated as warnings followed by eventual termination of the
program.  This enabled more such errors to be reported in a
single run, rather than causing a hard stop at the first error
that was encountered.

The mechanism for reporting such errors has been reimplemented
to queue compile-time errors and report them at the end of the
compilation as true errors rather than as warnings.  This fixes
cases where error messages leaked through in the form of warnings
when code was compiled at run time using C<eval STRING>, and
also allows such errors to be reliably trapped using C<eval "...">.
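
A minimal sketch of trapping a compile-time error from a string eval.
The malformed snippet is illustrative only:

```perl
use strict;
use warnings;

# The string below is deliberately malformed: "+" has no right operand.
my $result = eval 'my $total = 1 +; $total';
my $error  = $@;

# The eval returns undef and the compiler's message is available in $@.
my $trapped = (!defined($result) && $error =~ /syntax error/) ? 1 : 0;
```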

=head2 Implicitly closed filehandles are safer

Sometimes implicitly closed filehandles (as when they are localized,
and Perl automatically closes them on exiting the scope) could
inadvertently set $? or $!.  This has been corrected.


=head2 Behavior of list slices is more consistent

When taking a slice of a literal list (as opposed to a slice of
an array or hash), Perl used to return an empty list if the
result happened to be composed of all undef values.

The new behavior is to produce an empty list if (and only if)
the original list was empty.  Consider the following example:

    @a = (1,undef,undef,2)[2,1,2];

The old behavior would have resulted in @a having no elements.
The new behavior ensures it has three undefined elements.

Note in particular that the behavior of slices of the following
cases remains unchanged:

    @a = ()[1,2];
    @a = (getpwent)[7,0];
    @a = (anything_returning_empty_list())[2,1,2];
    @a = @b[2,1,2];
    @a = @c{'a','b','c'};

See L<perldata>.
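
The difference can be checked by counting elements.  In this sketch the
helper nothing() is a hypothetical function that returns an empty list:

```perl
use strict;
use warnings;

sub nothing { return }    # hypothetical: returns the empty list

my @three      = (1, undef, undef, 2)[2, 1, 2];   # three undefined elements
my @none       = ()[1, 2];                        # empty: source list was empty
my @still_none = (nothing())[2, 1, 2];            # empty for the same reason

my @counts = (scalar @three, scalar @none, scalar @still_none);
```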

=head2 C<(\$)> prototype and C<$foo{a}>

A scalar reference prototype now correctly allows a hash or
array element in that slot.
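
A short sketch of what this permits.  The function name set_to() is
hypothetical; the C<\$> in its prototype makes Perl pass a reference to
the first argument:

```perl
use strict;
use warnings;

# Hypothetical setter: the first argument arrives as a scalar reference.
sub set_to (\$$) {
    my ($ref, $value) = @_;
    $$ref = $value;
    return;
}

my (%foo, @bar);
set_to($foo{a}, 'hash element');     # hash element in the \$ slot
set_to($bar[2], 'array element');    # array element in the \$ slot
```

The subroutine must be declared before the calls so that the prototype is
known when they are compiled.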

=head2 C<goto &sub> and AUTOLOAD

The C<goto &sub> construct works correctly when C<&sub> happens
to be autoloaded.

=head2 C<-bareword> allowed under C<use integer>

The autoquoting of barewords preceded by C<-> did not work
in prior versions when the C<integer> pragma was enabled.
This has been fixed.

=head2 Failures in DESTROY()

When code in a destructor threw an exception, it went unnoticed
in earlier versions of Perl, unless someone happened to be
looking in $@ just after the point the destructor happened to
run.  Such failures are now visible as warnings when warnings are
enabled.
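
The following sketch shows such a failure surfacing as a warning.  The
class name Fragile is hypothetical, and the warning is captured with a
local C<$SIG{__WARN__}> handler so that it can be inspected:

```perl
use strict;
use warnings;

{
    package Fragile;                       # hypothetical class
    sub new     { return bless {}, shift }
    sub DESTROY { die "cleanup failed\n" } # destructor that throws
}

my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    my $object = Fragile->new;
    undef $object;                         # runs DESTROY; the die becomes a warning
}

my $seen = grep { /\(in cleanup\) cleanup failed/ } @warnings;
```

Execution continues normally after the destructor fails; only the
warning records that something went wrong.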

=head2 Locale bugs fixed

printf() and sprintf() previously reset the numeric locale
back to the default "C" locale.  This has been fixed.

Numbers formatted according to the local numeric locale
(such as using a decimal comma instead of a decimal dot) caused
"isn't numeric" warnings, even while the operations accessing
those numbers produced correct results.  These warnings have been
discontinued.

=head2 Memory leaks

The C<eval 'return sub {...}'> construct could sometimes leak
memory.  This has been fixed.

Operations that aren't filehandle constructors used to leak memory
when used on invalid filehandles.  This has been fixed.

Constructs that modified C<@_> could fail to deallocate values
in C<@_> and thus leak memory.  This has been corrected.

=head2 Spurious subroutine stubs after failed subroutine calls

Perl could sometimes create empty subroutine stubs when a
subroutine was not found in the package.  Such cases stopped
later method lookups from progressing into base packages.
This has been corrected.

=head2 Taint failures under C<-U>

When running in unsafe mode, taint violations could sometimes
cause silent failures.  This has been fixed.

=head2 END blocks and the C<-c> switch

Prior versions used to run BEGIN B<and> END blocks when Perl was
run in compile-only mode.  Since this is typically not the expected
behavior, END blocks are not executed anymore when the C<-c> switch
is used, or if compilation fails.

See L</"Support for CHECK blocks"> for how to run things when the compile 
phase ends.

=head2 Potential to leak DATA filehandles

Using the C<__DATA__> token creates an implicit filehandle to
the file that contains the token.  It is the program's
responsibility to close it when it is done reading from it.

This caveat is now better explained in the documentation.
See L<perldata>.

=head1 New or Changed Diagnostics

=over 4

=item "%s" variable %s masks earlier declaration in same %s

(W misc) A "my" or "our" variable has been redeclared in the current scope or statement,
effectively eliminating all access to the previous instance.  This is almost
always a typographical error.  Note that the earlier variable will still exist
until the end of the scope or until all closure referents to it are
destroyed.

=item "my sub" not yet implemented

(F) Lexically scoped subroutines are not yet implemented.  Don't try that
yet.

=item "our" variable %s redeclared

(W misc) You seem to have already declared the same global once before in the
current lexical scope.

=item '!' allowed only after types %s

(F) The '!' is allowed in pack() and unpack() only after certain types.
See L<perlfunc/pack>.

=item / cannot take a count

(F) You had an unpack template indicating a counted-length string,
but you have also specified an explicit size for the string.
See L<perlfunc/pack>.

=item / must be followed by a, A or Z

(F) You had an unpack template indicating a counted-length string,
which must be followed by one of the letters a, A or Z
to indicate what sort of string is to be unpacked.
See L<perlfunc/pack>.

=item / must be followed by a*, A* or Z*

(F) You had a pack template indicating a counted-length string.
Currently the only things that can have their length counted are a*, A* or Z*.
See L<perlfunc/pack>.

=item / must follow a numeric type

(F) You had an unpack template that contained a '/',
but this did not follow some numeric unpack specification.
See L<perlfunc/pack>.

=item /%s/: Unrecognized escape \\%c passed through

(W regexp) You used a backslash-character combination which is not recognized
by Perl.  This combination appears in an interpolated variable or a
C<'>-delimited regular expression.  The character was understood literally.

=item /%s/: Unrecognized escape \\%c in character class passed through

(W regexp) You used a backslash-character combination which is not recognized
by Perl inside character classes.  The character was understood literally.

=item /%s/ should probably be written as "%s"

(W syntax) You have used a pattern where Perl expected to find a string,
as in the first argument to C<join>.  Perl will treat the true
or false result of matching the pattern against $_ as the string,
which is probably not what you had in mind.

=item %s() called too early to check prototype

(W prototype) You've called a function that has a prototype before the parser saw a
definition or declaration for it, and Perl could not check that the call
conforms to the prototype.  You need to either add an early prototype
declaration for the subroutine in question, or move the subroutine
definition ahead of the call to get proper prototype checking.  Alternatively,
if you are certain that you're calling the function correctly, you may put
an ampersand before the name to avoid the warning.  See L<perlsub>.

=item %s argument is not a HASH or ARRAY element

(F) The argument to exists() must be a hash or array element, such as:

    $foo{$bar}
    $ref->{"susie"}[12]

=item %s argument is not a HASH or ARRAY element or slice

(F) The argument to delete() must be either a hash or array element, such as:

    $foo{$bar}
    $ref->{"susie"}[12]

or a hash or array slice, such as:

    @foo[$bar, $baz, $xyzzy]
    @{$ref->[12]}{"susie", "queue"}

=item %s argument is not a subroutine name

(F) The argument to exists() for C<exists &sub> must be a subroutine
name, and not a subroutine call.  C<exists &sub()> will generate this error.

=item %s package attribute may clash with future reserved word: %s

(W reserved) A lowercase attribute name was used that had a package-specific handler.
That name might have a meaning to Perl itself some day, even though it
doesn't yet.  Perhaps you should use a mixed-case attribute name, instead.
See L<attributes>.

=item (in cleanup) %s

(W misc) This prefix usually indicates that a DESTROY() method raised
the indicated exception.  Since destructors are usually called by
the system at arbitrary points during execution, and often a vast
number of times, the warning is issued only once for any number
of failures that would otherwise result in the same message being
repeated.

Failure of user callbacks dispatched using the C<G_KEEPERR> flag
could also result in this warning.  See L<perlcall/G_KEEPERR>.

=item <> should be quotes

(F) You wrote C<< require <file> >> when you should have written
C<require 'file'>.

=item Attempt to join self

(F) You tried to join a thread from within itself, which is an
impossible task.  You may be joining the wrong thread, or you may
need to move the join() to some other thread.

=item Bad evalled substitution pattern

(F) You've used the /e switch to evaluate the replacement for a
substitution, but perl found a syntax error in the code to evaluate,
most likely an unexpected right brace '}'.

=item Bad realloc() ignored

(S) An internal routine called realloc() on something that had never been
malloc()ed in the first place. Mandatory, but can be disabled by
setting environment variable C<PERL_BADFREE> to 1.

=item Bareword found in conditional

(W bareword) The compiler found a bareword where it expected a conditional,
which often indicates that an || or && was parsed as part of the
last argument of the previous construct, for example:

    open FOO || die;

It may also indicate a misspelled constant that has been interpreted
as a bareword:

    use constant TYPO => 1;
    if (TYOP) { print "foo" }

The C<strict> pragma is useful in avoiding such errors.

=item Binary number > 0b11111111111111111111111111111111 non-portable

(W portable) The binary number you specified is larger than 2**32-1
(4294967295) and therefore non-portable between systems.  See
L<perlport> for more on portability concerns.

=item Bit vector size > 32 non-portable

(W portable) Using bit vector sizes larger than 32 is non-portable.

=item Buffer overflow in prime_env_iter: %s

(W internal) A warning peculiar to VMS.  While Perl was preparing to iterate over
%ENV, it encountered a logical name or symbol definition which was too long,
so it was truncated to the string shown.

=item Can't check filesystem of script "%s"

(P) For some reason you can't check the filesystem of the script for nosuid.

=item Can't declare class for non-scalar %s in "%s"

(S) Currently, only scalar variables can be declared with a specific class
qualifier in a "my" or "our" declaration.  The semantics may be extended
for other types of variables in future.

=item Can't declare %s in "%s"

(F) Only scalar, array, and hash variables may be declared as "my" or
"our" variables.  They must have ordinary identifiers as names.

=item Can't ignore signal CHLD, forcing to default

(W signal) Perl has detected that it is being run with the SIGCHLD signal
(sometimes known as SIGCLD) disabled.  Since disabling this signal
will interfere with proper determination of exit status of child
processes, Perl has reset the signal to its default value.
This situation typically indicates that the parent program under
which Perl may be running (e.g., cron) is being very careless.

=item Can't modify non-lvalue subroutine call

(F) Subroutines meant to be used in lvalue context should be declared as
such, see L<perlsub/"Lvalue subroutines">.

=item Can't read CRTL environ

(S) A warning peculiar to VMS.  Perl tried to read an element of %ENV
from the CRTL's internal environment array and discovered the array was
missing.  You need to figure out where your CRTL misplaced its environ
or define F<PERL_ENV_TABLES> (see L<perlvms>) so that environ is not searched.

=item Can't remove %s: %s, skipping file 

(S) You requested an inplace edit without creating a backup file.  Perl
was unable to remove the original file to replace it with the modified
file.  The file was left unmodified.

=item Can't return %s from lvalue subroutine

(F) Perl detected an attempt to return illegal lvalues (such
as temporary or readonly values) from a subroutine used as an lvalue.
This is not allowed.

=item Can't weaken a nonreference

(F) You attempted to weaken something that was not a reference.  Only
references can be weakened.

=item Character class [:%s:] unknown

(F) The class in the character class [: :] syntax is unknown.
See L<perlre>.

=item Character class syntax [%s] belongs inside character classes

(W unsafe) The character class constructs [: :], [= =], and [. .] go
I<inside> character classes; the [] are part of the construct,
for example: /[012[:alpha:]345]/.  Note that [= =] and [. .]
are not currently implemented; they are simply placeholders for
future extensions.

=item Constant is not %s reference

(F) A constant value (perhaps declared using the C<use constant> pragma)
is being dereferenced, but it amounts to the wrong type of reference.  The
message indicates the type of reference that was expected. This usually
indicates a syntax error in dereferencing the constant value.
See L<perlsub/"Constant Functions"> and L<constant>.

=item constant(%s): %s

(F) The parser found inconsistencies either while attempting to define an
overloaded constant, or when trying to find the character name specified
in the C<\N{...}> escape.  Perhaps you forgot to load the corresponding
C<overload> or C<charnames> pragma?  See L<charnames> and L<overload>.

=item CORE::%s is not a keyword

(F) The CORE:: namespace is reserved for Perl keywords.

=item defined(@array) is deprecated

(D) defined() is not usually useful on arrays because it checks for an
undefined I<scalar> value.  If you want to see if the array is empty,
just use C<if (@array) { # not empty }> for example.  

=item defined(%hash) is deprecated

(D) defined() is not usually useful on hashes because it checks for an
undefined I<scalar> value.  If you want to see if the hash is empty,
just use C<if (%hash) { # not empty }> for example.  
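
A minimal sketch of the recommended emptiness checks follows.  The helper
names C<array_is_empty> and C<hash_is_empty> are hypothetical and exist
only for this example.

    use strict;
    use warnings;

    # An array or hash in boolean context is true exactly when it
    # holds at least one element, whatever that element's value is.
    sub array_is_empty { my ($aref) = @_; return @$aref ? 0 : 1 }
    sub hash_is_empty  { my ($href) = @_; return %$href ? 0 : 1 }

    print "array is empty\n"    if array_is_empty([]);
    print "hash is not empty\n" unless hash_is_empty({ a => 1 });

Note that an array holding a single C<undef> still counts as non-empty,
which is one reason the scalar-oriented defined() test is misleading here.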

=item Did not produce a valid header

See Server error.

=item (Did you mean "local" instead of "our"?)

(W misc) Remember that "our" does not localize the declared global variable.
You have declared it again in the same lexical scope, which seems superfluous.

=item Document contains no data

See Server error.

=item entering effective %s failed

(F) While under the C<use filetest> pragma, switching the real and
effective uids or gids failed.

=item false [] range "%s" in regexp

(W regexp) A character class range must start and end at a literal character, not
another character class like C<\d> or C<[:alpha:]>.  The "-" in your false
range is interpreted as a literal "-".  Consider quoting the "-" as "\-".
See L<perlre>.

=item Filehandle %s opened only for output

(W io) You tried to read from a filehandle opened only for writing.  If you
intended it to be a read/write filehandle, you needed to open it with
"+<" or "+>" or "+>>" instead of with ">" or ">>".  If
you intended only to read from the file, use "<".  See
L<perlfunc/open>.
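
The following sketch shows one way to open a handle for both reading and
writing with the C<< +< >> mode.  It assumes the L<File::Temp> module is
installed; the C<roundtrip> helper is invented for the example.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    sub roundtrip {
        my ($text) = @_;

        # "+<" needs an existing file, so create a scratch file first.
        my ($tmp_fh, $path) = tempfile(UNLINK => 1);
        close $tmp_fh;

        open(my $fh, '+<', $path) or die "Cannot open $path: $!";
        print {$fh} $text;

        # Rewind before switching from writing to reading.
        seek($fh, 0, 0) or die "Cannot seek in $path: $!";
        my $got = do { local $/; <$fh> };
        close $fh;
        return $got;
    }

    print roundtrip("hello\n");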

=item flock() on closed filehandle %s

(W closed) The filehandle you're attempting to flock() got itself closed some
time before now.  Check your logic flow.  flock() operates on filehandles.
Are you attempting to call flock() on a dirhandle by the same name?

=item Global symbol "%s" requires explicit package name

(F) You've said "use strict vars", which indicates that all variables
must either be lexically scoped (using "my"), declared beforehand using
"our", or explicitly qualified to say which package the global variable
is in (using "::").
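
The three accepted forms can be sketched as follows.  The C<Counter>
package and its variables are made up for the illustration.

    use strict;
    use warnings;

    package Counter;

    our $total = 0;    # package global, declared with "our"
    my  $step  = 2;    # lexical, declared with "my"

    sub bump {
        $Counter::total += $step;    # fully qualified name
        return $total;               # same variable, via "our"
    }

    package main;

    print Counter::bump(), "\n";
    print Counter::bump(), "\n";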

=item Hexadecimal number > 0xffffffff non-portable

(W portable) The hexadecimal number you specified is larger than 2**32-1
(4294967295) and therefore non-portable between systems.  See
L<perlport> for more on portability concerns.

=item Ill-formed CRTL environ value "%s"

(W internal) A warning peculiar to VMS.  Perl tried to read the CRTL's internal
environ array, and encountered an element without the C<=> delimiter
used to separate keys from values.  The element is ignored.

=item Ill-formed message in prime_env_iter: |%s|

(W internal) A warning peculiar to VMS.  Perl tried to read a logical name
or CLI symbol definition when preparing to iterate over %ENV, and
didn't see the expected delimiter between key and value, so the
line was ignored.

=item Illegal binary digit %s

(F) You used a digit other than 0 or 1 in a binary number.

=item Illegal binary digit %s ignored

(W digit) You may have tried to use a digit other than 0 or 1 in a binary number.
Interpretation of the binary number stopped before the offending digit.

=item Illegal number of bits in vec

(F) The number of bits in vec() (the third argument) must be a power of
two from 1 to 32 (or 64, if your platform supports that).

=item Integer overflow in %s number

(W overflow) The hexadecimal, octal or binary number you have specified either
as a literal or as an argument to hex() or oct() is too big for your
architecture, and has been converted to a floating point number.  On a
32-bit architecture the largest hexadecimal, octal or binary number
representable without overflow is 0xFFFFFFFF, 037777777777, or
0b11111111111111111111111111111111 respectively.  Note that Perl
transparently promotes all numbers to a floating point representation
internally--subject to loss of precision errors in subsequent
operations.

=item Invalid %s attribute: %s

The indicated attribute for a subroutine or variable was not recognized
by Perl or by a user-supplied handler.  See L<attributes>.

=item Invalid %s attributes: %s

The indicated attributes for a subroutine or variable were not recognized
by Perl or by a user-supplied handler.  See L<attributes>.

=item invalid [] range "%s" in regexp

The offending range is now explicitly displayed.

=item Invalid separator character %s in attribute list

(F) Something other than a colon or whitespace was seen between the
elements of an attribute list.  If the previous attribute
had a parenthesised parameter list, perhaps that list was terminated
too soon.  See L<attributes>.

=item Invalid separator character %s in subroutine attribute list

(F) Something other than a colon or whitespace was seen between the
elements of a subroutine attribute list.  If the previous attribute
had a parenthesised parameter list, perhaps that list was terminated
too soon.

=item leaving effective %s failed

(F) While under the C<use filetest> pragma, switching the real and
effective uids or gids failed.

=item Lvalue subs returning %s not implemented yet

(F) Due to limitations in the current implementation, array and hash
values cannot be returned in subroutines used in lvalue context.
See L<perlsub/"Lvalue subroutines">.

=item Method %s not permitted

See Server error.

=item Missing %sbrace%s on \N{}

(F) Wrong syntax of character name literal C<\N{charname}> within
double-quotish context.

=item Missing command in piped open

(W pipe) You used the C<open(FH, "| command")> or C<open(FH, "command |")>
construction, but the command was missing or blank.

=item Missing name in "my sub"

(F) The reserved syntax for lexically scoped subroutines requires that they
have a name with which they can be found.

=item No %s specified for -%c

(F) The indicated command line switch needs a mandatory argument, but
you haven't specified one.

=item No package name allowed for variable %s in "our"

(F) Fully qualified variable names are not allowed in "our" declarations,
because that doesn't make much sense under existing semantics.  Such
syntax is reserved for future extensions.

=item No space allowed after -%c

(F) The argument to the indicated command line switch must follow immediately
after the switch, without intervening spaces.

=item no UTC offset information; assuming local time is UTC

(S) A warning peculiar to VMS.  Perl was unable to find the local
timezone offset, so it's assuming that local system time is equivalent
to UTC.  If it's not, define the logical name F<SYS$TIMEZONE_DIFFERENTIAL>
to translate to the number of seconds which need to be added to UTC to
get local time.

=item Octal number > 037777777777 non-portable

(W portable) The octal number you specified is larger than 2**32-1 (4294967295)
and therefore non-portable between systems.  See L<perlport> for more
on portability concerns.

=item panic: del_backref

(P) Failed an internal consistency check while trying to reset a weak
reference.

=item panic: kid popen errno read

(F) A forked child returned an incomprehensible message about its errno.

=item panic: magic_killbackrefs

(P) Failed an internal consistency check while trying to reset all weak
references to an object.

=item Parentheses missing around "%s" list

(W parenthesis) You said something like

    my $foo, $bar = @_;

when you meant

    my ($foo, $bar) = @_;

Remember that "my", "our", and "local" bind tighter than comma.

=item Possible unintended interpolation of %s in string

(W ambiguous) It used to be that Perl would try to guess whether you
wanted an array interpolated or a literal @.  It no longer does this;
arrays are now I<always> interpolated into strings.  This means that 
if you try something like:

        print "fred@example.com";

and the array C<@example> doesn't exist, Perl is going to print
C<fred.com>, which is probably not what you wanted.  To get a literal
C<@> sign in a string, put a backslash before it, just as you would
to get a literal C<$> sign.
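
A short sketch of the three cases, using an array C<@example> declared
purely for the demonstration:

    use strict;
    use warnings;

    my @example = ('a', 'b');

    my $escaped = "fred\@example.com";   # backslash keeps a literal @
    my $quoted  = 'fred@example.com';    # single quotes never interpolate
    my $joined  = "fred@example.com";    # @example is interpolated here

    print "$escaped\n$quoted\n$joined\n";

The last string contains the elements of C<@example> joined with the
list separator, not an e-mail address.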

=item Possible Y2K bug: %s

(W y2k) You are concatenating the number 19 with another number, which
could be a potential Year 2000 problem.

=item pragma "attrs" is deprecated, use "sub NAME : ATTRS" instead

(W deprecated) You have written something like this:

    sub doit
    {
        use attrs qw(locked);
    }

You should use the new declaration syntax instead.

    sub doit : locked
    {
        ...
    }

The C<use attrs> pragma is now obsolete, and is only provided for
backward-compatibility. See L<perlsub/"Subroutine Attributes">.

=item Premature end of script headers

See Server error.

=item Repeat count in pack overflows

(F) You can't specify a repeat count so large that it overflows
your signed integers.  See L<perlfunc/pack>.

=item Repeat count in unpack overflows

(F) You can't specify a repeat count so large that it overflows
your signed integers.  See L<perlfunc/unpack>.

=item realloc() of freed memory ignored

(S) An internal routine called realloc() on something that had already
been freed.

=item Reference is already weak

(W misc) You have attempted to weaken a reference that is already weak.
Doing so has no effect.

=item setpgrp can't take arguments

(F) Your system has the setpgrp() from BSD 4.2, which takes no arguments,
unlike POSIX setpgid(), which takes a process ID and process group ID.

=item Strange *+?{} on zero-length expression

(W regexp) You applied a regular expression quantifier in a place where it
makes no sense, such as on a zero-width assertion.
Try putting the quantifier inside the assertion instead.  For example,
the way to match "abc" provided that it is followed by three
repetitions of "xyz" is C</abc(?=(?:xyz){3})/>, not C</abc(?=xyz){3}/>.
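
The corrected pattern can be tried out with a small script such as the
one below; the sample strings are arbitrary.

    use strict;
    use warnings;

    # The quantifier sits inside the lookahead, so the assertion is
    # "followed by three copies of xyz".
    my $re = qr/abc(?=(?:xyz){3})/;

    for my $s ('abcxyzxyzxyz', 'abcxyzxyz') {
        printf "%-14s %s\n", $s,
            ($s =~ $re ? 'matches' : 'does not match');
    }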

=item switching effective %s is not implemented

(F) While under the C<use filetest> pragma, we cannot switch the
real and effective uids or gids.

=item This Perl can't reset CRTL environ elements (%s)

=item This Perl can't set CRTL environ elements (%s=%s)

(W internal) Warnings peculiar to VMS.  You tried to change or delete an element
of the CRTL's internal environ array, but your copy of Perl wasn't
built with a CRTL that contained the setenv() function.  You'll need to
rebuild Perl with a CRTL that does, or redefine F<PERL_ENV_TABLES> (see
L<perlvms>) so that the environ array isn't the target of the change to
%ENV which produced the warning.

=item Too late to run %s block

(W void) A CHECK or INIT block is being defined during run time proper,
when the opportunity to run them has already passed.  Perhaps you are
loading a file with C<require> or C<do> when you should be using
C<use> instead.  Or perhaps you should put the C<require> or C<do>
inside a BEGIN block.

=item Unknown open() mode '%s'

(F) The second argument of 3-argument open() is not among the list
of valid modes: C<< < >>, C<< > >>, C<<< >> >>>, C<< +< >>,
C<< +> >>, C<<< +>> >>>, C<-|>, C<|->.

=item Unknown process %x sent message to prime_env_iter: %s

(P) An error peculiar to VMS.  Perl was reading values for %ENV before
iterating over it, and someone else stuck a message in the stream of
data Perl expected.  Someone's very confused, or perhaps trying to
subvert Perl's population of %ENV for nefarious purposes.

=item Unrecognized escape \\%c passed through

(W misc) You used a backslash-character combination which is not recognized
by Perl.  The character was understood literally.

=item Unterminated attribute parameter in attribute list

(F) The lexer saw an opening (left) parenthesis character while parsing an
attribute list, but the matching closing (right) parenthesis
character was not found.  You may need to add (or remove) a backslash
character to get your parentheses to balance.  See L<attributes>.

=item Unterminated attribute list

(F) The lexer found something other than a simple identifier at the start
of an attribute, and it wasn't a semicolon or the start of a
block.  Perhaps you terminated the parameter list of the previous attribute
too soon.  See L<attributes>.

=item Unterminated attribute parameter in subroutine attribute list

(F) The lexer saw an opening (left) parenthesis character while parsing a
subroutine attribute list, but the matching closing (right) parenthesis
character was not found.  You may need to add (or remove) a backslash
character to get your parentheses to balance.

=item Unterminated subroutine attribute list

(F) The lexer found something other than a simple identifier at the start
of a subroutine attribute, and it wasn't a semicolon or the start of a
block.  Perhaps you terminated the parameter list of the previous attribute
too soon.

=item Value of CLI symbol "%s" too long

(W misc) A warning peculiar to VMS.  Perl tried to read the value of an %ENV
element from a CLI symbol table, and found a resultant string longer
than 1024 characters.  The return value has been truncated to 1024
characters.

=item Version number must be a constant number

(P) The attempt to translate a C<use Module n.n LIST> statement into
its equivalent C<BEGIN> block found an internal inconsistency with
the version number.

=back

=head1 New tests

=over 4

=item	lib/attrs

Compatibility tests for C<sub : attrs> vs the older C<use attrs>.

=item	lib/env

Tests for new environment scalar capability (e.g., C<use Env qw($BAR);>).

=item	lib/env-array

Tests for new environment array capability (e.g., C<use Env qw(@PATH);>).

=item	lib/io_const

IO constants (SEEK_*, _IO*).

=item	lib/io_dir

Directory-related IO methods (new, read, close, rewind, tied delete).

=item	lib/io_multihomed

INET sockets with multi-homed hosts.

=item	lib/io_poll

IO poll().

=item	lib/io_unix

UNIX sockets.

=item	op/attrs

Regression tests for C<my ($x,@y,%z) : attrs> and C<sub : attrs>.

=item	op/filetest

File test operators.

=item	op/lex_assign

Verify operations that access pad objects (lexicals and temporaries).

=item	op/exists_sub

Verify C<exists &sub> operations.

=back

=head1 Incompatible Changes

=head2 Perl Source Incompatibilities

Beware that any new warnings that have been added or old ones
that have been enhanced are B<not> considered incompatible changes.

Since all new warnings must be explicitly requested via the C<-w>
switch or the C<warnings> pragma, it is ultimately the programmer's
responsibility to ensure that warnings are enabled judiciously.

=over 4

=item CHECK is a new keyword

All subroutine definitions named CHECK are now special.  See
L</"Support for CHECK blocks"> for more information.

=item Treatment of list slices of undef has changed

There is a potential incompatibility in the behavior of list slices
that are comprised entirely of undefined values.
See L</"Behavior of list slices is more consistent">.

=item Format of $English::PERL_VERSION is different

The English module now sets $PERL_VERSION to $^V (a string value) rather
than C<$]> (a numeric value).  This is a potential incompatibility.
Send us a report via perlbug if you are affected by this.

See L</"Improved Perl version numbering system"> for the reasons for
this change.

=item Literals of the form C<1.2.3> parse differently

Previously, numeric literals with more than one dot in them were
interpreted as a floating point number concatenated with one or more
numbers.  Such "numbers" are now parsed as strings composed of the
specified ordinals.

For example, C<print 97.98.99> used to output C<97.9899> in earlier
versions, but now prints C<abc>.

See L</"Support for strings represented as a vector of ordinals">.
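
As a small sketch of the new parsing, assuming an ASCII-based platform,
the literal below is built from the ordinals 97, 98 and 99, and the
C<%vd> format of sprintf() turns it back into dotted form:

    use strict;
    use warnings;

    my $vs = 97.98.99;    # chr(97) . chr(98) . chr(99)

    print "$vs\n";
    print sprintf("%vd", $vs), "\n";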

=item Possibly changed pseudo-random number generator

Perl programs that depend on reproducing a specific set of pseudo-random
numbers may now produce different output due to improvements made to the
rand() builtin.  You can use C<sh Configure -Drandfunc=rand> to obtain
the old behavior.

See L</"Better pseudo-random number generator">.

=item Hashing function for hash keys has changed

Even though Perl hashes are not order preserving, the apparently
random order encountered when iterating on the contents of a hash
is actually determined by the hashing algorithm used.  Improvements
in the algorithm may yield a random order that is B<different> from
that of previous versions, especially when iterating on hashes.

See L</"Better worst-case behavior of hashes"> for additional
information.

=item C<undef> fails on read only values

Using the C<undef> operator on a readonly value (such as $1) has
the same effect as assigning C<undef> to the readonly value--it
throws an exception.

=item Close-on-exec bit may be set on pipe and socket handles

Pipe and socket handles are also now subject to the close-on-exec
behavior determined by the special variable $^F.

See L</"More consistent close-on-exec behavior">.

=item Writing C<"$$1"> to mean C<"${$}1"> is unsupported

Perl 5.004 deprecated the interpretation of C<$$1> and
similar within interpolated strings to mean C<$$ . "1">,
but still allowed it.

In Perl 5.6.0 and later, C<"$$1"> always means C<"${$1}">.

=item delete(), each(), values() and C<\(%h)> operate on aliases to values, not copies

delete(), each(), values() and hashes (e.g. C<\(%h)>)
in a list context return the actual
values in the hash, instead of copies (as they used to in earlier
versions).  Typical idioms for using these constructs copy the
returned values, but this can make a significant difference when
creating references to the returned values.  Keys in the hash are still
returned as copies when iterating on a hash.

See also L</"delete(), each(), values() and hash iteration are faster">.
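
The aliasing can be observed directly: modifying C<$_> while looping over
values() changes the hash itself.  The C<%price> hash is just sample
data.

    use strict;
    use warnings;

    my %price = (apple => 1, pear => 2);

    # Each $_ is an alias to a value stored in %price, not a copy.
    $_ *= 10 for values %price;

    print "$_ => $price{$_}\n" for sort keys %price;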

=item vec(EXPR,OFFSET,BITS) enforces powers-of-two BITS

vec() generates a run-time error if the BITS argument is not
a valid power-of-two integer.
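
A brief sketch of valid and invalid BITS values follows; the variable
names are arbitrary, and the invalid call is wrapped in C<eval> so the
run-time error can be inspected.

    use strict;
    use warnings;

    my $buf = '';
    vec($buf, 0, 8) = 65;    # 8 is a power of two: accepted
    vec($buf, 1, 8) = 66;

    # A width of 3 is not a power of two, so this call dies.
    my $ok = eval { my $v = vec($buf, 0, 3); 1 };
    print "vec failed: $@" unless $ok;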

=item Text of some diagnostic output has changed

Most references to internal Perl operations in diagnostics
have been changed to be more descriptive.  This may be an
issue for programs that may incorrectly rely on the exact
text of diagnostics for proper functioning.

=item C<%@> has been removed

The undocumented special variable C<%@> that used to accumulate
"background" errors (such as those that happen in DESTROY())
has been removed, because it could potentially result in memory
leaks.

=item Parenthesized not() behaves like a list operator

The C<not> operator now falls under the "if it looks like a function,
it behaves like a function" rule.

As a result, the parenthesized form can be used with C<grep> and C<map>.
The following construct used to be a syntax error before, but it works
as expected now:

    grep not($_), @things;

On the other hand, using C<not> with a literal list slice may not
work.  The following previously allowed construct:

    print not (1,2,3)[0];

needs to be written with additional parentheses now:

    print not((1,2,3)[0]);

The behavior remains unaffected when C<not> is not followed by parentheses.

=item Semantics of bareword prototype C<(*)> have changed

The semantics of the bareword prototype C<*> have changed.  Perl 5.005
always coerced simple scalar arguments to a typeglob, which wasn't useful
in situations where the subroutine must distinguish between a simple
scalar and a typeglob.  The new behavior is to not coerce bareword
arguments to a typeglob.  The value will always be visible as either
a simple scalar or as a reference to a typeglob.

See L</"More functional bareword prototype (*)">.

=item Semantics of bit operators may have changed on 64-bit platforms

If your platform is either natively 64-bit or if Perl has been
configured to use 64-bit integers, i.e., $Config{ivsize} is 8,
there may be a potential incompatibility in the behavior of bitwise
numeric operators (& | ^ ~ << >>).  These operators used to strictly
operate on the lower 32 bits of integers in previous versions, but now
operate over the entire native integral width.  In particular, note
that unary C<~> will produce different results on platforms that have
different $Config{ivsize}.  For portability, be sure to mask off
the excess bits in the result of unary C<~>, e.g., C<~$x & 0xffffffff>.

See L</"Bit operators support full native integer width">.
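
The masking advice can be wrapped in a helper, sketched here under the
hypothetical name C<not32>, so that the result is the same regardless of
the native integer width:

    use strict;
    use warnings;

    # Complement the argument, then keep only the low 32 bits.
    sub not32 {
        my ($x) = @_;
        return ~$x & 0xffffffff;
    }

    printf "%#x\n", not32(0x0000ffff);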

=item More builtins taint their results

As described in L</"Improved security features">, there may be more
sources of taint in a Perl program.

To avoid these new tainting behaviors, you can build Perl with the
Configure option C<-Accflags=-DINCOMPLETE_TAINTS>.  Beware that the
ensuing perl binary may be insecure.

=back

=head2 C Source Incompatibilities

=over 4

=item C<PERL_POLLUTE>

Release 5.005 grandfathered old global symbol names by providing preprocessor
macros for extension source compatibility.  As of release 5.6.0, these
preprocessor definitions are not available by default.  You need to explicitly
compile perl with C<-DPERL_POLLUTE> to get these definitions.  For
extensions still using the old symbols, this option can be
specified via MakeMaker:

    perl Makefile.PL POLLUTE=1

=item C<PERL_IMPLICIT_CONTEXT>

This new build option provides a set of macros for all API functions
such that an implicit interpreter/thread context argument is passed to
every API function.  As a result of this, something like C<sv_setsv(foo,bar)>
amounts to a macro invocation that actually translates to something like
C<Perl_sv_setsv(my_perl,foo,bar)>.  While this is generally expected
to not have any significant source compatibility issues, the difference
between a macro and a real function call will need to be considered.

This means that there B<is> a source compatibility issue as a result of
this if your extensions attempt to use pointers to any of the Perl API
functions.

Note that the above issue is not relevant to the default build of
Perl, whose interfaces continue to match those of prior versions
(but subject to the other options described here).


See L<perlguts/Background and PERL_IMPLICIT_CONTEXT> for detailed information on the
ramifications of building Perl with this option.

    NOTE: PERL_IMPLICIT_CONTEXT is automatically enabled whenever Perl is built
    with one of -Dusethreads, -Dusemultiplicity, or both.  It is not
    intended to be enabled by users at this time.

=item C<PERL_POLLUTE_MALLOC>

Enabling Perl's malloc in release 5.005 and earlier caused the namespace of
the system's malloc family of functions to be usurped by the Perl versions,
since by default they used the same names.  Besides causing problems on
platforms that do not allow these functions to be cleanly replaced, this
also meant that the system versions could not be called in programs that
used Perl's malloc.  Previous versions of Perl have allowed this behaviour
to be suppressed with the HIDEMYMALLOC and EMBEDMYMALLOC preprocessor
definitions.

As of release 5.6.0, Perl's malloc family of functions have default names
distinct from the system versions.  You need to explicitly compile perl with
C<-DPERL_POLLUTE_MALLOC> to get the older behaviour.  HIDEMYMALLOC
and EMBEDMYMALLOC have no effect, since the behaviour they enabled is now
the default.

Note that these functions do B<not> constitute Perl's memory allocation API.
See L<perlguts/"Memory Allocation"> for further information about that.

=back

=head2 Compatible C Source API Changes

=over 4

=item C<PATCHLEVEL> is now C<PERL_VERSION>

The cpp macros C<PERL_REVISION>, C<PERL_VERSION>, and C<PERL_SUBVERSION>
are now available by default from perl.h, and reflect the base revision,
patchlevel, and subversion respectively.  C<PERL_REVISION> had no
prior equivalent, while C<PERL_VERSION> and C<PERL_SUBVERSION> were
previously available as C<PATCHLEVEL> and C<SUBVERSION>.

The new names cause less pollution of the B<cpp> namespace and reflect what
the numbers have come to stand for in common practice.  For compatibility,
the old names are still supported when F<patchlevel.h> is explicitly
included (as required before), so there is no source incompatibility
from the change.

=back

=head2 Binary Incompatibilities

In general, the default build of this release is expected to be binary
compatible for extensions built with the 5.005 release or its maintenance
versions.  However, specific platforms may have broken binary compatibility
due to changes in the defaults used in hints files.  Therefore, please be
sure to always check the platform-specific README files for any notes to
the contrary.

The usethreads or usemultiplicity builds are B<not> binary compatible
with the corresponding builds in 5.005.

On platforms that require an explicit list of exports (AIX, OS/2 and Windows,
among others), purely internal symbols such as parser functions and the
run time opcodes are not exported by default.  Perl 5.005 used to export
all functions irrespective of whether they were considered part of the
public API or not.

For the full list of public API functions, see L<perlapi>.

=head1 Known Problems

=head2 Thread test failures

Subtests 19 and 20 of the lib/thr5005.t test are known to fail due to
fundamental problems in the 5.005 threading implementation.  These are
not new failures--Perl 5.005_0x has the same bugs, but didn't have these
tests.

=head2 EBCDIC platforms not supported

In earlier releases of Perl, EBCDIC environments like OS390 (also
known as Open Edition MVS) and VM-ESA were supported.  Due to changes
required by the UTF-8 (Unicode) support, the EBCDIC platforms are not
supported in Perl 5.6.0.

=head2 In 64-bit HP-UX the lib/io_multihomed test may hang

The lib/io_multihomed test may hang in HP-UX if Perl has been
configured to be 64-bit.  Because other 64-bit platforms do not
hang in this test, HP-UX is suspect.  All other tests pass
in 64-bit HP-UX.  The test attempts to create and connect to
"multihomed" sockets (sockets which have multiple IP addresses).

=head2 NEXTSTEP 3.3 POSIX test failure

In NEXTSTEP 3.3p2 the implementation of the strftime(3) in the
operating system libraries is buggy: the %j format numbers the days of
a month starting from zero, which, while being logical to programmers,
will cause subtests 19 to 27 of the lib/posix test to fail.

=head2 Tru64 (aka Digital UNIX, aka DEC OSF/1) lib/sdbm test failure with gcc

If compiled with gcc 2.95 the lib/sdbm test will fail (dump core).
The cure is to use the vendor cc, which comes with the operating system
and produces good code.

=head2 UNICOS/mk CC failures during Configure run

In UNICOS/mk the following errors may appear during the Configure run:

	Guessing which symbols your C compiler and preprocessor define...
	CC-20 cc: ERROR File = try.c, Line = 3
	...
	  bad switch yylook 79bad switch yylook 79bad switch yylook 79bad switch yylook 79#ifdef A29K
	...
	4 errors detected in the compilation of "try.c".

The culprit is the broken awk of UNICOS/mk.  The effect is fortunately
rather mild: Perl itself is not adversely affected by the error, only
the h2ph utility coming with Perl, and that is rather rarely needed
these days.

=head2 Arrow operator and arrays

When the left argument to the arrow operator C<< -> >> is an array, or
the C<scalar> operator operating on an array, the result of the
operation must be considered erroneous. For example:

    @x->[2]
    scalar(@x)->[2]

These expressions will get run-time errors in some future release of
Perl.
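
For comparison, here is a sketch of the unambiguous ways to reach the same
element, either directly or through a real array reference.  The sample
array is arbitrary.

    use strict;
    use warnings;

    my @x   = (10, 20, 30);
    my $ref = \@x;

    print $x[2], "\n";        # element of the array itself
    print $ref->[2], "\n";    # same element through a reference
    print scalar(@x), "\n";   # number of elements, not a reference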

=head2 Experimental features

As discussed above, many features are still experimental.  Interfaces and
implementation of these features are subject to change, and in extreme cases,
even subject to removal in some future release of Perl.  These features
include the following:

=over 4

=item Threads

=item Unicode

=item 64-bit support

=item Lvalue subroutines

=item Weak references

=item The pseudo-hash data type

=item The Compiler suite

=item Internal implementation of file globbing

=item The DB module

=item The regular expression code constructs: 

C<(?{ code })> and C<(??{ code })>

=back

=head1 Obsolete Diagnostics

=over 4

=item Character class syntax [: :] is reserved for future extensions

(W) Within regular expression character classes ([]) the syntax beginning
with "[:" and ending with ":]" is reserved for future extensions.
If you need to represent those character sequences inside a regular
expression character class, just quote the square brackets with the
backslash: "\[:" and ":\]".

=item Ill-formed logical name |%s| in prime_env_iter

(W) A warning peculiar to VMS.  A logical name was encountered when preparing
to iterate over %ENV which violates the syntactic rules governing logical
names.  Because it cannot be translated normally, it is skipped, and will not
appear in %ENV.  This may be a benign occurrence, as some software packages
might directly modify logical name tables and introduce nonstandard names,
or it may indicate that a logical name table has been corrupted.

=item In string, @%s now must be written as \@%s

The description of this error used to say:

        (Someday it will simply assume that an unbackslashed @
         interpolates an array.)

That day has come, and this fatal error has been removed.  It has been
replaced by a non-fatal warning instead.
See L</Arrays now always interpolate into double-quoted strings> for
details.

=item Probable precedence problem on %s

(W) The compiler found a bareword where it expected a conditional,
which often indicates that an || or && was parsed as part of the
last argument of the previous construct, for example:

    open FOO || die;
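
One way to avoid the problem, sketched below, is to parenthesize the
arguments of open() and use the low-precedence C<or>.  The C<first_line>
helper and the path are invented for the example.

    use strict;
    use warnings;

    sub first_line {
        my ($path) = @_;

        # "or" binds more loosely than the argument list of open().
        open(my $fh, '<', $path) or die "Cannot open $path: $!";
        my $line = <$fh>;
        close $fh;
        return $line;
    }

    my $line = eval { first_line('/no/such/dir/no-such-file') };
    print "open failed: $@" unless defined $line;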

=item regexp too big

(F) The current implementation of regular expressions uses shorts as
address offsets within a string.  Unfortunately this means that if
the regular expression compiles to longer than 32767, it'll blow up.
Usually when you want a regular expression this big, there is a better
way to do it with multiple statements.  See L<perlre>.

=item Use of "$$<digit>" to mean "${$}<digit>" is deprecated

(D) Perl versions before 5.004 misinterpreted any type marker followed
by "$" and a digit.  For example, "$$0" was incorrectly taken to mean
"${$}0" instead of "${$0}".  This bug is (mostly) fixed in Perl 5.004.

However, the developers of Perl 5.004 could not fix this bug completely,
because at least two widely-used modules depend on the old meaning of
"$$0" in a string.  So Perl 5.004 still interprets "$$<digit>" in the
old (broken) way inside strings; but it generates this message as a
warning.  And in Perl 5.005, this special treatment will cease.

=back

=head1 Reporting Bugs

If you find what you think is a bug, you might check the
articles recently posted to the comp.lang.perl.misc newsgroup.
There may also be information at http://www.perl.com/perl/ , the Perl
Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=head1 HISTORY

Written by Gurusamy Sarathy <F<gsar@activestate.com>>, with many
contributions from The Perl Porters.

Send omissions or corrections to <F<perlbug@perl.org>>.

=cut
=encoding utf8

=for comment
Consistent formatting of this file is achieved with:
  perl ./Porting/podtidy pod/perlhacktut.pod

=head1 NAME

perlhacktut - Walk through the creation of a simple C code patch

=head1 DESCRIPTION

This document takes you through a simple patch example.

If you haven't read L<perlhack> yet, go do that first! You might also
want to read through L<perlsource> too.

Once you're done here, check out L<perlhacktips> next.

=head1 EXAMPLE OF A SIMPLE PATCH

Let's take a simple patch from start to finish.

Here's something Larry suggested: if a C<U> is the first active format
during a C<pack>, (for example, C<pack "U3C8", @stuff>) then the
resulting string should be treated as UTF-8 encoded.
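
On a perl that already carries this change (an assumption about your
installation), the effect can be observed from Perl code with the
built-in C<utf8::is_utf8> function.  This is only a sketch; the variable
names are arbitrary.

    use strict;
    use warnings;

    my $smiley = pack("U", 0x263A);    # "U" is the first format
    my $byte   = pack("C", 0x41);      # no "U" involved

    printf "U: %s, length %d\n",
        (utf8::is_utf8($smiley) ? "flagged" : "not flagged"),
        length $smiley;
    printf "C: %s\n",
        (utf8::is_utf8($byte) ? "flagged" : "not flagged");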

If you are working with a git clone of the Perl repository, you will
want to create a branch for your changes. This will make creating a
proper patch much simpler. See L<perlgit> for details on how to do
this.

=head2 Writing the patch

How do we prepare to fix this up? First we locate the code in question
- the C<pack> happens at runtime, so it's going to be in one of the
F<pp> files. Sure enough, C<pp_pack> is in F<pp.c>. Since we're going
to be altering this file, let's copy it to F<pp.c~>.

[Well, it was in F<pp.c> when this tutorial was written. It has now
been split off with C<pp_unpack> to its own file, F<pp_pack.c>]

Now let's look over C<pp_pack>: we take a pattern into C<pat>, and then
loop over the pattern, taking each format character in turn into
C<datumtype>. Then for each possible format character, we swallow up
the other arguments in the pattern (a field width, an asterisk, and so
on) and convert the next chunk of input into the specified format,
adding it onto the output SV C<cat>.

How do we know if the C<U> is the first format in the C<pat>? Well, if
we have a pointer to the start of C<pat> then, if we see a C<U> we can
test whether we're still at the start of the string. So, here's where
C<pat> is set up:

    STRLEN fromlen;
    char *pat = SvPVx(*++MARK, fromlen);
    char *patend = pat + fromlen;
    I32 len;
    I32 datumtype;
    SV *fromstr;

We'll have another string pointer in there:

    STRLEN fromlen;
    char *pat = SvPVx(*++MARK, fromlen);
    char *patend = pat + fromlen;
 +  char *patcopy;
    I32 len;
    I32 datumtype;
    SV *fromstr;

And just before we start the loop, we'll set C<patcopy> to be the start
of C<pat>:

    items = SP - MARK;
    MARK++;
    SvPVCLEAR(cat);
 +  patcopy = pat;
    while (pat < patend) {

Now if we see a C<U> which was at the start of the string, we turn on
the C<UTF8> flag for the output SV, C<cat>:

 +  if (datumtype == 'U' && pat==patcopy+1)
 +      SvUTF8_on(cat);
    if (datumtype == '#') {
        while (pat < patend && *pat != '\n')
            pat++;

Remember that it has to be C<patcopy+1> because the first character of
the string is the C<U> which has been swallowed into C<datumtype>!

Oops, we forgot one thing: what if there are spaces at the start of the
pattern? C<pack("  U*", @stuff)> will have C<U> as the first active
character, even though it's not the first thing in the pattern. In this
case, we have to advance C<patcopy> along with C<pat> when we see
spaces:

    if (isSPACE(datumtype))
        continue;

needs to become

    if (isSPACE(datumtype)) {
        patcopy++;
        continue;
    }

OK. That's the C part done. Now we must do two additional things before
this patch is ready to go: we've changed the behaviour of Perl, and so
we must document that change. We must also provide some more regression
tests to make sure our patch works and doesn't create a bug somewhere
else along the line.

=head2 Testing the patch

The regression tests for each operator live in F<t/op/>, and so we make
a copy of F<t/op/pack.t> to F<t/op/pack.t~>. Now we can add our tests
to the end. First, we'll test that the C<U> does indeed create Unicode
strings.

t/op/pack.t has a sensible ok() function, but if it didn't we could use
the one from t/test.pl.

 require './test.pl';
 plan( tests => 159 );

so instead of this:

 print 'not ' unless "1.20.300.4000" eq sprintf "%vd",
                                               pack("U*",1,20,300,4000);
 print "ok $test\n"; $test++;

we can write the more sensible (see L<Test::More> for a full
explanation of is() and other testing functions):

 is( sprintf("%vd", pack("U*",1,20,300,4000)), "1.20.300.4000",
                                       "U* produces Unicode" );

Now we'll test that we got that space-at-the-beginning business right:

 is( sprintf("%vd", pack("  U*",1,20,300,4000)), "1.20.300.4000",
                                     "  with spaces at the beginning" );

And finally we'll test that we don't make Unicode strings if C<U> is
B<not> the first active format:

 isnt( sprintf("%vd", pack("C0U*",1,20,300,4000)), "1.20.300.4000",
                                       "U* not first isn't Unicode" );

Mustn't forget to change the number of tests which appears at the top,
or else the automated tester will get confused. This will either look
like this:

 print "1..156\n";

or this:

 plan( tests => 156 );

We now compile up Perl, and run it through the test suite. Our new
tests pass, hooray!
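
As a rough, standalone illustration of what the first of these tests
checks, here is a small sketch that uses L<Test::More> directly rather
than F<t/test.pl>. It assumes a reasonably modern perl in which the
behaviour added by this patch is already present; the variable name
C<$packed> and the two extra checks are ours, not part of
F<t/op/pack.t>.

```perl
use strict;
use warnings;
use Test::More tests => 3;

# Pattern starts with U, so the result should be a character string
my $packed = pack("U*", 1, 20, 300, 4000);

# "%vd" prints the ordinal of each character, joined by dots
is( sprintf("%vd", $packed), "1.20.300.4000", "U* produces Unicode" );

# One character per packed value, rather than one per UTF-8 byte
is( length($packed), 4, "four characters" );

# utf8::is_utf8() reports whether the internal UTF8 flag is set
ok( utf8::is_utf8($packed), "UTF8 flag is on when U comes first" );
```

Running the file with C<perl> prints ordinary TAP output, one line per
test.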

=head2 Documenting the patch

Finally, the documentation. The job is never done until the paperwork
is over, so let's describe the change we've just made. The relevant
place is F<pod/perlfunc.pod>; again, we make a copy, and then we'll
insert this text in the description of C<pack>:

 =item *

 If the pattern begins with a C<U>, the resulting string will be treated
 as UTF-8-encoded Unicode. You can force UTF-8 encoding on in a string
 with an initial C<U0>, and the bytes that follow will be interpreted as
 Unicode characters. If you don't want this to happen, you can begin
 your pattern with C<C0> (or anything else) to force Perl not to UTF-8
 encode your string, and then follow this with a C<U*> somewhere in your
 pattern.

=head2 Submit

See L<perlhack> for details on how to submit this patch.

=head1 AUTHOR

This document was originally written by Nathan Torkington, and is
maintained by the perl5-porters mailing list.

=head1 NAME

perlretut - Perl regular expressions tutorial

=head1 DESCRIPTION

This page provides a basic tutorial on understanding, creating and
using regular expressions in Perl.  It serves as a complement to the
reference page on regular expressions, L<perlre>.  Regular expressions
are an integral part of the C<m//>, C<s///>, C<qr//> and C<split>
operators and so this tutorial also overlaps with
L<perlop/"Regexp Quote-Like Operators"> and L<perlfunc/split>.

Perl is widely renowned for excellence in text processing, and regular
expressions are one of the big factors behind this fame.  Perl regular
expressions display an efficiency and flexibility unknown in most
other computer languages.  Mastering even the basics of regular
expressions will allow you to manipulate text with surprising ease.

What is a regular expression?  At its most basic, a regular expression
is a template that is used to determine if a string has certain
characteristics.  The string is most often some text, such as a line,
sentence, web page, or even a whole book, but less commonly it could be
some binary data as well.
Suppose we want to determine if the text in a variable, C<$var>, contains
the sequence of characters S<C<m u s h r o o m>>
(blanks added for legibility).  We can write in Perl

 $var =~ m/mushroom/

The value of this expression will be TRUE if C<$var> contains that
sequence of characters, and FALSE otherwise.  The portion enclosed in
C<'E<sol>'> characters denotes the characteristic we are looking for.
We use the term I<pattern> for it.  The process of looking to see if the
pattern occurs in the string is called I<matching>, and the C<"=~">
operator along with the C<m//> tell Perl to try to match the pattern
against the string.  Note that the pattern is also a string, but a very
special kind of one, as we will see.  Patterns are in common use these
days;
examples are the patterns typed into a search engine to find web pages
and the patterns used to list files in a directory, I<e.g.>, "C<ls *.txt>"
or "C<dir *.*>".  In Perl, the patterns described by regular expressions
are used not only to search strings, but also to extract desired parts
of strings, and to do search and replace operations.

Regular expressions have the undeserved reputation of being abstract
and difficult to understand.  This stems simply from the fact that the
notation used to express them tends to be terse and dense, and not
from any inherent complexity.  We recommend using the C</x> regular
expression modifier (described below) along with plenty of white space
to make them less dense, and easier to read.  Regular expressions are
constructed using
simple concepts like conditionals and loops and are no more difficult
to understand than the corresponding C<if> conditionals and C<while>
loops in the Perl language itself.

This tutorial flattens the learning curve by discussing regular
expression concepts, along with their notation, one at a time and with
many examples.  The first part of the tutorial will progress from the
simplest word searches to the basic regular expression concepts.  If
you master the first part, you will have all the tools needed to solve
about 98% of your needs.  The second part of the tutorial is for those
comfortable with the basics and hungry for more power tools.  It
discusses the more advanced regular expression operators and
introduces the latest cutting-edge innovations.

A note: to save time, "regular expression" is often abbreviated as
regexp or regex.  Regexp is a more natural abbreviation than regex, but
is harder to pronounce.  The Perl pod documentation is evenly split on
regexp vs regex; in Perl, there is more than one way to abbreviate it.
We'll use regexp in this tutorial.

New in v5.22, L<C<use re 'strict'>|re/'strict' mode> applies stricter
rules than otherwise when compiling regular expression patterns.  It can
find things that, while legal, may not be what you intended.

=head1 Part 1: The basics

=head2 Simple word matching

The simplest regexp is simply a word, or more generally, a string of
characters.  A regexp consisting of just a word matches any string that
contains that word:

    "Hello World" =~ /World/;  # matches

What is this Perl statement all about? C<"Hello World"> is a simple
double-quoted string.  C<World> is the regular expression and the
C<//> enclosing C</World/> tells Perl to search a string for a match.
The operator C<=~> associates the string with the regexp match and
produces a true value if the regexp matched, or false if the regexp
did not match.  In our case, C<World> matches the second word in
C<"Hello World">, so the expression is true.  Expressions like this
are useful in conditionals:

    if ("Hello World" =~ /World/) {
        print "It matches\n";
    }
    else {
        print "It doesn't match\n";
    }

There are useful variations on this theme.  The sense of the match can
be reversed by using the C<!~> operator:

    if ("Hello World" !~ /World/) {
        print "It doesn't match\n";
    }
    else {
        print "It matches\n";
    }

The literal string in the regexp can be replaced by a variable:

    my $greeting = "World";
    if ("Hello World" =~ /$greeting/) {
        print "It matches\n";
    }
    else {
        print "It doesn't match\n";
    }

If you're matching against the special default variable C<$_>, the
C<$_ =~> part can be omitted:

    $_ = "Hello World";
    if (/World/) {
        print "It matches\n";
    }
    else {
        print "It doesn't match\n";
    }

And finally, the C<//> default delimiters for a match can be changed
to arbitrary delimiters by putting an C<'m'> out front:

    "Hello World" =~ m!World!;   # matches, delimited by '!'
    "Hello World" =~ m{World};   # matches, note the matching '{}'
    "/usr/bin/perl" =~ m"/perl"; # matches after '/usr/bin',
                                 # '/' becomes an ordinary char

C</World/>, C<m!World!>, and C<m{World}> all represent the
same thing.  When, I<e.g.>, the quote (C<'"'>) is used as a delimiter, the forward
slash C<'/'> becomes an ordinary character and can be used in this regexp
without trouble.

Let's consider how different regexps would match C<"Hello World">:

    "Hello World" =~ /world/;  # doesn't match
    "Hello World" =~ /o W/;    # matches
    "Hello World" =~ /oW/;     # doesn't match
    "Hello World" =~ /World /; # doesn't match

The first regexp C<world> doesn't match because regexps are
case-sensitive.  The second regexp matches because the substring
S<C<'o W'>> occurs in the string S<C<"Hello World">>.  The space
character C<' '> is treated like any other character in a regexp and is
needed to match in this case.  The lack of a space character is the
reason the third regexp C<'oW'> doesn't match.  The fourth regexp
"C<World >" doesn't match because there is a space at the end of the
regexp, but not at the end of the string.  The lesson here is that
regexps must match a part of the string I<exactly> in order for the
statement to be true.

If a regexp matches in more than one place in the string, Perl will
always match at the earliest possible point in the string:

    "Hello World" =~ /o/;       # matches 'o' in 'Hello'
    "That hat is red" =~ /hat/; # matches 'hat' in 'That'

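To see where a match actually landed, one option is the special array
C<@->, whose first element holds the offset at which the last successful
match began.  The sketch below is only an illustration; the variable
names are ours, and it uses C<qr//> (mentioned in the introduction) to
store the regexps in a list.

```perl
use strict;
use warnings;

my @offsets;
for my $case ( [ "Hello World", qr/o/ ], [ "That hat is red", qr/hat/ ] ) {
    my ($string, $regexp) = @$case;
    if ($string =~ $regexp) {
        # $-[0] is the start offset of the whole match, counting from 0
        my $start = $-[0];
        push @offsets, $start;
        print "matched at offset $start\n";
    }
}
```

The offsets are 4 and 1: the C<'o'> of C<Hello> and the C<hat> inside
C<That>, not the later occurrences.
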
With respect to character matching, there are a few more points you
need to know about.  First of all, not all characters can be used "as
is" in a match.  Some characters, called I<metacharacters>, are reserved
for use in regexp notation.  The metacharacters are

    {}[]()^$.|*+?-\

The significance of each of these will be explained
in the rest of the tutorial, but for now, it is important only to know
that a metacharacter can be matched as-is by putting a backslash before
it:

    "2+2=4" =~ /2+2/;    # doesn't match, + is a metacharacter
    "2+2=4" =~ /2\+2/;   # matches, \+ is treated like an ordinary +
    "The interval is [0,1)." =~ /[0,1)./     # is a syntax error!
    "The interval is [0,1)." =~ /\[0,1\)\./  # matches
    "#!/usr/bin/perl" =~ /#!\/usr\/bin\/perl/;  # matches

In the last regexp, the forward slash C<'/'> is also backslashed,
because it is used to delimit the regexp.  This can lead to LTS
(leaning toothpick syndrome), however, and it is often more readable
to change delimiters.

    "#!/usr/bin/perl" =~ m!#\!/usr/bin/perl!;  # easier to read

The backslash character C<'\'> is a metacharacter itself and needs to
be backslashed:

    'C:\WIN32' =~ /C:\\WIN/;   # matches

In situations where it doesn't make sense for a particular metacharacter
to mean what it normally does, it automatically loses its
metacharacter-ness and becomes an ordinary character that is to be
matched literally.  For example, the C<'}'> is a metacharacter only when
it is the mate of a C<'{'> metacharacter.  Otherwise it is treated as a
literal RIGHT CURLY BRACKET.  This may lead to unexpected results.
L<C<use re 'strict'>|re/'strict' mode> can catch some of these.
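
When the text to be matched literally comes from a variable, adding the
backslashes by hand is not practical.  Perl's built-in C<quotemeta>
function, and the equivalent C<\Q...\E> escape inside a regexp, do the
backslashing for you.  This goes slightly beyond what has been
introduced so far, so treat the following as a sketch; the variable
names are ours.

```perl
use strict;
use warnings;

my $literal = "[0,1).";                 # contains several metacharacters
my $text    = "The interval is [0,1).";

# \Q...\E backslashes every non-word character of the interpolated text
my $found = $text =~ /\Q$literal\E/ ? 1 : 0;

# quotemeta() performs the same escaping as a function call
my $escaped = quotemeta("2+2");         # the four characters 2 \ + 2
my $sum_ok  = "2+2=4" =~ /$escaped/ ? 1 : 0;

print "found=$found escaped=$escaped sum_ok=$sum_ok\n";
```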

In addition to the metacharacters, there are some ASCII characters
which don't have printable character equivalents and are instead
represented by I<escape sequences>.  Common examples are C<\t> for a
tab, C<\n> for a newline, C<\r> for a carriage return and C<\a> for a
bell (or alert).  If your string is better thought of as a sequence of arbitrary
bytes, the octal escape sequence, I<e.g.>, C<\033>, or hexadecimal escape
sequence, I<e.g.>, C<\x1B> may be a more natural representation for your
bytes.  Here are some examples of escapes:

    "1000\t2000" =~ m(0\t2)   # matches
    "1000\n2000" =~ /0\n20/   # matches
    "1000\t2000" =~ /\000\t2/ # doesn't match, "0" ne "\000"
    "cat"   =~ /\o{143}\x61\x74/ # matches in ASCII, but a weird way
                                 # to spell cat

If you've been around Perl a while, all this talk of escape sequences
may seem familiar.  Similar escape sequences are used in double-quoted
strings and in fact the regexps in Perl are mostly treated as
double-quoted strings.  This means that variables can be used in
regexps as well.  Just like double-quoted strings, the values of the
variables in the regexp will be substituted in before the regexp is
evaluated for matching purposes.  So we have:

    $foo = 'house';
    'housecat' =~ /$foo/;      # matches
    'cathouse' =~ /cat$foo/;   # matches
    'housecat' =~ /${foo}cat/; # matches

So far, so good.  With the knowledge above you can already perform
searches with just about any literal string regexp you can dream up.
Here is a I<very simple> emulation of the Unix grep program:

    % cat > simple_grep
    #!/usr/bin/perl
    $regexp = shift;
    while (<>) {
        print if /$regexp/;
    }
    ^D

    % chmod +x simple_grep

    % simple_grep abba /usr/dict/words
    Babbage
    cabbage
    cabbages
    sabbath
    Sabbathize
    Sabbathizes
    sabbatical
    scabbard
    scabbards

This program is easy to understand.  C<#!/usr/bin/perl> is the standard
way to invoke a perl program from the shell.
S<C<$regexp = shift;>> saves the first command line argument as the
regexp to be used, leaving the rest of the command line arguments to
be treated as files.  S<C<< while (<>) >>> loops over all the lines in
all the files.  For each line, S<C<print if /$regexp/;>> prints the
line if the regexp matches the line.  In this line, both C<print> and
C</$regexp/> use the default variable C<$_> implicitly.
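
The same idea can be tried without a word list on disk.  The sketch
below wraps the matching loop in a hypothetical helper,
C<matching_lines>, and runs it over a small in-memory list; both the
helper and the word list are ours, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the elements of @lines that match $regexp
sub matching_lines {
    my ($regexp, @lines) = @_;
    # Inside grep, each element is in $_, so /$regexp/ tests it implicitly
    return grep { /$regexp/ } @lines;
}

my @words = qw(cabbage carrot sabbatical abbot scabbard);
my @hits  = matching_lines('abba', @words);
print "$_\n" for @hits;
```

Only the words containing the four characters C<abba> in sequence are
printed; C<abbot> is not one of them.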

With all of the regexps above, if the regexp matched anywhere in the
string, it was considered a match.  Sometimes, however, we'd like to
specify I<where> in the string the regexp should try to match.  To do
this, we would use the I<anchor> metacharacters C<'^'> and C<'$'>.  The
anchor C<'^'> means match at the beginning of the string and the anchor
C<'$'> means match at the end of the string, or before a newline at the
end of the string.  Here is how they are used:

    "housekeeper" =~ /keeper/;    # matches
    "housekeeper" =~ /^keeper/;   # doesn't match
    "housekeeper" =~ /keeper$/;   # matches
    "housekeeper\n" =~ /keeper$/; # matches

The second regexp doesn't match because C<'^'> constrains C<keeper> to
match only at the beginning of the string, but C<"housekeeper"> has
keeper starting in the middle.  The third regexp does match, since the
C<'$'> constrains C<keeper> to match only at the end of the string.

When both C<'^'> and C<'$'> are used at the same time, the regexp has to
match both the beginning and the end of the string, I<i.e.>, the regexp
matches the whole string.  Consider

    "keeper" =~ /^keep$/;      # doesn't match
    "keeper" =~ /^keeper$/;    # matches
    ""       =~ /^$/;          # ^$ matches an empty string

The first regexp doesn't match because the string has more to it than
C<keep>.  Since the second regexp is exactly the string, it
matches.  Using both C<'^'> and C<'$'> in a regexp forces the complete
string to match, so it gives you complete control over which strings
match and which don't.  Suppose you are looking for a fellow named
bert, off in a string by himself:

    "dogbert" =~ /bert/;   # matches, but not what you want

    "dilbert" =~ /^bert/;  # doesn't match, but ..
    "bertram" =~ /^bert/;  # matches, so still not good enough

    "bertram" =~ /^bert$/; # doesn't match, good
    "dilbert" =~ /^bert$/; # doesn't match, good
    "bert"    =~ /^bert$/; # matches, perfect

Of course, in the case of a literal string, one could just as easily
use the string comparison S<C<$string eq 'bert'>> and it would be
more efficient.  The C<^...$> regexp really becomes useful when we
add in the more powerful regexp tools below.
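
One difference between the two approaches is worth seeing in code:
because C<'$'> may match before a newline at the end of the string,
C</^bert$/> accepts C<"bert\n">, whereas C<eq> does not.  A minimal
sketch, with variable names of our own choosing:

```perl
use strict;
use warnings;

my $line = "bert\n";    # e.g. a line read from a file, newline included

my $eq_result     = $line eq 'bert'   ? 1 : 0;   # the newline makes this fail
my $regexp_result = $line =~ /^bert$/ ? 1 : 0;   # $ may match before "\n"

# After chomp removes the trailing newline, eq succeeds as well
chomp(my $chomped = $line);
my $chomped_eq = $chomped eq 'bert' ? 1 : 0;

print "eq=$eq_result regexp=$regexp_result chomped_eq=$chomped_eq\n";
```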

=head2 Using character classes

Although one can already do quite a lot with the literal string
regexps above, we've only scratched the surface of regular expression
technology.  In this and subsequent sections we will introduce regexp
concepts (and associated metacharacter notations) that will allow a
regexp to represent not just a single character sequence, but a I<whole
class> of them.

One such concept is that of a I<character class>.  A character class
allows a set of possible characters, rather than just a single
character, to match at a particular point in a regexp.  You can define
your own custom character classes.  These
are denoted by brackets C<[...]>, with the set of characters
to be possibly matched inside.  Here are some examples:

    /cat/;       # matches 'cat'
    /[bcr]at/;   # matches 'bat', 'cat', or 'rat'
    /item[0123456789]/;  # matches 'item0' or ... or 'item9'
    "abc" =~ /[cab]/;    # matches 'a'

In the last statement, even though C<'c'> is the first character in
the class, C<'a'> matches because the first character position in the
string is the earliest point at which the regexp can match.

    /[yY][eE][sS]/;      # match 'yes' in a case-insensitive way
                         # 'yes', 'Yes', 'YES', etc.

This regexp displays a common task: perform a case-insensitive
match.  Perl provides a way of avoiding all those brackets by simply
appending an C<'i'> to the end of the match.  Then C</[yY][eE][sS]/;>
can be rewritten as C</yes/i;>.  The C<'i'> stands for
case-insensitive and is an example of a I<modifier> of the matching
operation.  We will meet other modifiers later in the tutorial.

We saw in the section above that there were ordinary characters, which
represented themselves, and special characters, which needed a
backslash C<'\'> to represent themselves.  The same is true in a
character class, but the sets of ordinary and special characters
inside a character class are different than those outside a character
class.  The special characters for a character class are C<-]\^$> (and
the pattern delimiter, whatever it is).
C<']'> is special because it denotes the end of a character class.  C<'$'> is
special because it denotes a scalar variable.  C<'\'> is special because
it is used in escape sequences, just like above.  Here is how the
special characters C<]$\> are handled:

   /[\]c]def/; # matches ']def' or 'cdef'
   $x = 'bcr';
   /[$x]at/;   # matches 'bat', 'cat', or 'rat'
   /[\$x]at/;  # matches '$at' or 'xat'
   /[\\$x]at/; # matches '\at', 'bat', 'cat', or 'rat'

The last two are a little tricky.  In C<[\$x]>, the backslash protects
the dollar sign, so the character class has two members C<'$'> and C<'x'>.
In C<[\\$x]>, the backslash is protected, so C<$x> is treated as a
variable and substituted in double quote fashion.

The special character C<'-'> acts as a range operator within character
classes, so that a contiguous set of characters can be written as a
range.  With ranges, the unwieldy C<[0123456789]> and C<[abc...xyz]>
become the svelte C<[0-9]> and C<[a-z]>.  Some examples are

    /item[0-9]/;  # matches 'item0' or ... or 'item9'
    /[0-9bx-z]aa/;  # matches '0aa', ..., '9aa',
                    # 'baa', 'xaa', 'yaa', or 'zaa'
    /[0-9a-fA-F]/;  # matches a hexadecimal digit
    /[0-9a-zA-Z_]/; # matches a "word" character,
                    # like those in a Perl variable name

If C<'-'> is the first or last character in a character class, it is
treated as an ordinary character; C<[-ab]>, C<[ab-]> and C<[a\-b]> are
all equivalent.
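
The equivalence of the three spellings can be checked directly.  The
sketch below tests each class against a few single characters and counts
how many of the classes accept each one; the names C<@classes> and
C<%verdict> are ours.

```perl
use strict;
use warnings;

# Three ways of writing "a hyphen, an 'a', or a 'b'"
my @classes = ( qr/^[-ab]$/, qr/^[ab-]$/, qr/^[a\-b]$/ );

my %verdict;
for my $char ( '-', 'a', 'b', 'c' ) {
    # grep in scalar context counts how many classes accept this character
    $verdict{$char} = grep { $char =~ $_ } @classes;
}
print "$_ => $verdict{$_}\n" for sort keys %verdict;
```

Each of C<'-'>, C<'a'> and C<'b'> is accepted by all three classes,
while C<'c'> is accepted by none.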

The special character C<'^'> in the first position of a character class
denotes a I<negated character class>, which matches any character but
those in the brackets.  Both C<[...]> and C<[^...]> must match a
character, or the match fails.  Then

    /[^a]at/;  # doesn't match 'aat' or 'at', but matches
               # all other 'bat', 'cat', '0at', '%at', etc.
    /[^0-9]/;  # matches a non-numeric character
    /[a^]at/;  # matches 'aat' or '^at'; here '^' is ordinary

Now, even C<[0-9]> can be a bother to write multiple times, so in the
interest of saving keystrokes and making regexps more readable, Perl
has several abbreviations for common character classes, as shown below.
Since the introduction of Unicode, unless the C</a> modifier is in
effect, these character classes match more than just a few characters in
the ASCII range.

=over 4

=item *

C<\d> matches a digit, not just C<[0-9]> but also digits from non-roman scripts

=item *

C<\s> matches a whitespace character, the set C<[\ \t\r\n\f]> and others

=item *

C<\w> matches a word character (alphanumeric or C<'_'>), not just C<[0-9a-zA-Z_]>
but also digits and characters from non-roman scripts

=item *

C<\D> is a negated C<\d>; it represents any other character than a digit, or C<[^\d]>

=item *

C<\S> is a negated C<\s>; it represents any non-whitespace character C<[^\s]>

=item *

C<\W> is a negated C<\w>; it represents any non-word character C<[^\w]>

=item *

The period C<'.'> matches any character but C<"\n"> (unless the modifier C</s> is
in effect, as explained below).

=item *

C<\N>, like the period, matches any character but C<"\n">, but it does so
regardless of whether the modifier C</s> is in effect.

=back

The C</a> modifier, available starting in Perl 5.14, is used to
restrict the matches of C<\d>, C<\s>, and C<\w> to just those in the ASCII range.
It is useful to keep your program from being needlessly exposed to full
Unicode (and its accompanying security considerations) when all you want
is to process English-like text.  (The "a" may be doubled, C</aa>, to
provide even more restrictions, preventing case-insensitive matching of
ASCII with non-ASCII characters; otherwise a Unicode "Kelvin Sign"
would caselessly match a "k" or "K".)

The C<\d\s\w\D\S\W> abbreviations can be used both inside and outside
of bracketed character classes.  Here are some in use:

    /\d\d:\d\d:\d\d/; # matches a hh:mm:ss time format
    /[\d\s]/;         # matches any digit or whitespace character
    /\w\W\w/;         # matches a word char, followed by a
                      # non-word char, followed by a word char
    /..rt/;           # matches any two chars, followed by 'rt'
    /end\./;          # matches 'end.'
    /end[.]/;         # same thing, matches 'end.'

Because a period is a metacharacter, it needs to be escaped to match
as an ordinary period. Because, for example, C<\d> and C<\w> are sets
of characters, it is incorrect to think of C<[^\d\w]> as C<[\D\W]>; in
fact C<[^\d\w]> is the same as C<[^\w]>, which is the same as
C<[\W]>. Think DeMorgan's laws.
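
A short experiment makes the difference concrete.  The sketch below
tries a letter, a digit, an underscore and a punctuation character
against each of the three classes; the hash C<%result> and its key names
are ours.

```perl
use strict;
use warnings;

my %result;
for my $char ( 'a', '5', '_', '%' ) {
    $result{$char} = {
        neither  => ( $char =~ /[^\d\w]/ ? 1 : 0 ),  # not digit AND not word
        either   => ( $char =~ /[\D\W]/  ? 1 : 0 ),  # not digit OR not word
        non_word => ( $char =~ /[\W]/    ? 1 : 0 ),
    };
    printf "%-3s [^\\d\\w]=%d  [\\D\\W]=%d  [\\W]=%d\n",
        $char, @{ $result{$char} }{qw(neither either non_word)};
}
```

C<[^\d\w]> and C<[\W]> agree on every character, while C<[\D\W]> also
accepts C<'a'> and C<'_'>, since both are non-digits.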

In actuality, the period and C<\d\s\w\D\S\W> abbreviations are
themselves types of character classes, so the ones surrounded by
brackets are just one type of character class.  When we need to make a
distinction, we refer to them as "bracketed character classes."

An anchor useful in basic regexps is the I<word anchor>
C<\b>.  This matches a boundary between a word character and a non-word
character C<\w\W> or C<\W\w>:

    $x = "Housecat catenates house and cat";
    $x =~ /cat/;    # matches cat in 'Housecat'
    $x =~ /\bcat/;  # matches cat in 'catenates'
    $x =~ /cat\b/;  # matches cat in 'Housecat'
    $x =~ /\bcat\b/;  # matches 'cat' at end of string

Note in the last example, the end of the string is considered a word
boundary.
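
To confirm which C<cat> each regexp picks, one can record the offset at
which every match starts.  This sketch relies on the special array
C<@->, whose first element is the start offset of the last successful
match; the hash C<%start> is ours.

```perl
use strict;
use warnings;

my $x = "Housecat catenates house and cat";

my %start;
$start{'cat'}      = $-[0] if $x =~ /cat/;      # inside 'Housecat'
$start{'\bcat'}    = $-[0] if $x =~ /\bcat/;    # start of 'catenates'
$start{'cat\b'}    = $-[0] if $x =~ /cat\b/;    # end of 'Housecat'
$start{'\bcat\b'}  = $-[0] if $x =~ /\bcat\b/;  # the final, standalone 'cat'

printf "%-8s starts at offset %d\n", $_, $start{$_} for sort keys %start;
```

The offsets are 5, 9, 5 and 29 respectively, counting from 0.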

For natural language processing (so that, for example, apostrophes are
included in words), use C<\b{wb}> instead (available starting in Perl
5.22):

    "don't" =~ / .+? \b{wb} /x;  # matches the whole string

You might wonder why C<'.'> matches everything but C<"\n"> - why not
every character? The reason is that often one is matching against
lines and would like to ignore the newline characters.  For instance,
while the string C<"\n"> represents one line, we would like to think
of it as empty.  Then

    ""   =~ /^$/;    # matches
    "\n" =~ /^$/;    # matches, $ anchors before "\n"

    ""   =~ /./;      # doesn't match; it needs a char
    ""   =~ /^.$/;    # doesn't match; it needs a char
    "\n" =~ /^.$/;    # doesn't match; it needs a char other than "\n"
    "a"  =~ /^.$/;    # matches
    "a\n"  =~ /^.$/;  # matches, $ anchors before "\n"

This behavior is convenient, because we usually want to ignore
newlines when we count and match characters in a line.  Sometimes,
however, we want to keep track of newlines.  We might even want C<'^'>
and C<'$'> to anchor at the beginning and end of lines within the
string, rather than just the beginning and end of the string.  Perl
allows us to choose between ignoring and paying attention to newlines
by using the C</s> and C</m> modifiers.  C</s> and C</m> stand for
single line and multi-line and they determine whether a string is to
be treated as one continuous string, or as a set of lines.  The two
modifiers affect two aspects of how the regexp is interpreted: 1) how
the C<'.'> character class is defined, and 2) where the anchors C<'^'>
and C<'$'> are able to match.  Here are the four possible combinations:

=over 4

=item *

no modifiers: Default behavior.  C<'.'> matches any character
except C<"\n">.  C<'^'> matches only at the beginning of the string and
C<'$'> matches only at the end or before a newline at the end.

=item *

s modifier (C</s>): Treat string as a single long line.  C<'.'> matches
any character, even C<"\n">.  C<'^'> matches only at the beginning of
the string and C<'$'> matches only at the end or before a newline at the
end.

=item *

m modifier (C</m>): Treat string as a set of multiple lines.  C<'.'>
matches any character except C<"\n">.  C<'^'> and C<'$'> are able to match
at the start or end of I<any> line within the string.

=item *

both s and m modifiers (C</sm>): Treat string as a single long line, but
detect multiple lines.  C<'.'> matches any character, even
C<"\n">.  C<'^'> and C<'$'>, however, are able to match at the start or end
of I<any> line within the string.

=back

Here are examples of C</s> and C</m> in action:

    $x = "There once was a girl\nWho programmed in Perl\n";

    $x =~ /^Who/;   # doesn't match, "Who" not at start of string
    $x =~ /^Who/s;  # doesn't match, "Who" not at start of string
    $x =~ /^Who/m;  # matches, "Who" at start of second line
    $x =~ /^Who/sm; # matches, "Who" at start of second line

    $x =~ /girl.Who/;   # doesn't match, "." doesn't match "\n"
    $x =~ /girl.Who/s;  # matches, "." matches "\n"
    $x =~ /girl.Who/m;  # doesn't match, "." doesn't match "\n"
    $x =~ /girl.Who/sm; # matches, "." matches "\n"

Most of the time, the default behavior is what is wanted, but C</s> and
C</m> are occasionally very useful.  If C</m> is being used, the start
of the string can still be matched with C<\A> and the end of the string
can still be matched with the anchors C<\Z> (matches both the end and
the newline before, like C<'$'>), and C<\z> (matches only the end):

    $x =~ /^Who/m;   # matches, "Who" at start of second line
    $x =~ /\AWho/m;  # doesn't match, "Who" is not at start of string

    $x =~ /girl$/m;  # matches, "girl" at end of first line
    $x =~ /girl\Z/m; # doesn't match, "girl" is not at end of string

    $x =~ /Perl\Z/m; # matches, "Perl" is at newline before end
    $x =~ /Perl\z/m; # doesn't match, "Perl" is not at end of string
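
A typical use of C</m> is to pick something out of every line of a
multi-line string.  The sketch below borrows two features that are
introduced later in this tutorial, capturing parentheses and the C</g>
modifier (which repeats the match across the string), so treat it as a
preview; the array names are ours.

```perl
use strict;
use warnings;

my $x = "There once was a girl\nWho programmed in Perl\n";

# With /m, '^' matches at the start of each line ...
my @first_words = $x =~ /^(\w+)/mg;

# ... and '$' matches at the end of each line
my @last_words  = $x =~ /(\w+)$/mg;

print "first: @first_words\n";
print "last:  @last_words\n";
```

Without C</m>, each list would contain a single word, since C<'^'> and
C<'$'> would then anchor only at the ends of the whole string.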

We now know how to create choices among classes of characters in a
regexp.  What about choices among words or character strings? Such
choices are described in the next section.

=head2 Matching this or that

Sometimes we would like our regexp to be able to match different
possible words or character strings.  This is accomplished by using
the I<alternation> metacharacter C<'|'>.  To match C<dog> or C<cat>, we
form the regexp C<dog|cat>.  As before, Perl will try to match the
regexp at the earliest possible point in the string.  At each
character position, Perl will first try to match the first
alternative, C<dog>.  If C<dog> doesn't match, Perl will then try the
next alternative, C<cat>.  If C<cat> doesn't match either, then the
match fails and Perl moves to the next position in the string.  Some
examples:

    "cats and dogs" =~ /cat|dog|bird/;  # matches "cat"
    "cats and dogs" =~ /dog|cat|bird/;  # matches "cat"

Even though C<dog> is the first alternative in the second regexp,
C<cat> is able to match earlier in the string.

    "cats"          =~ /c|ca|cat|cats/; # matches "c"
    "cats"          =~ /cats|cat|ca|c/; # matches "cats"

Here, all the alternatives match at the first string position, so the
first alternative is the one that matches.  If some of the
alternatives are truncations of the others, put the longest ones first
to give them a chance to match.

    "cab" =~ /a|b|c/ # matches "c"
                     # /a|b|c/ == /[abc]/

The last example points out that character classes are like
alternations of characters.  At a given character position, the first
alternative that allows the regexp match to succeed will be the one
that matches.
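
The effect of ordering can be observed with the special variable C<$&>,
which holds the text matched by the last successful match.  A minimal
sketch, with variable names of our own choosing:

```perl
use strict;
use warnings;

my ($short_first, $long_first);

# Every alternative can match at position 0, so the first one listed wins
$short_first = $& if "cats" =~ /c|ca|cat|cats/;
$long_first  = $& if "cats" =~ /cats|cat|ca|c/;

print "short alternatives first: '$short_first'\n";
print "long alternatives first:  '$long_first'\n";
```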

=head2 Grouping things and hierarchical matching

Alternation allows a regexp to choose among alternatives, but by
itself it is unsatisfying.  The reason is that each alternative is a whole
regexp, but sometimes we want alternatives for just part of a
regexp.  For instance, suppose we want to search for housecats or
housekeepers.  The regexp C<housecat|housekeeper> fits the bill, but is
inefficient because we had to type C<house> twice.  It would be nice to
have parts of the regexp be constant, like C<house>, and some
parts have alternatives, like C<cat|keeper>.

The I<grouping> metacharacters C<()> solve this problem.  Grouping
allows parts of a regexp to be treated as a single unit.  Parts of a
regexp are grouped by enclosing them in parentheses.  Thus we could solve
the C<housecat|housekeeper> problem by forming the regexp as
C<house(cat|keeper)>.  The regexp C<house(cat|keeper)> means match
C<house> followed by either C<cat> or C<keeper>.  Some more examples
are

    /(a|b)b/;    # matches 'ab' or 'bb'
    /(ac|b)b/;   # matches 'acb' or 'bb'
    /(^a|b)c/;   # matches 'ac' at start of string or 'bc' anywhere
    /(a|[bc])d/; # matches 'ad', 'bd', or 'cd'

    /house(cat|)/;  # matches either 'housecat' or 'house'
    /house(cat(s|)|)/;  # matches either 'housecats' or 'housecat' or
                        # 'house'.  Note groups can be nested.

    /(19|20|)\d\d/;  # match years 19xx, 20xx, or the Y2K problem, xx
    "20" =~ /(19|20|)\d\d/;  # matches the null alternative '()\d\d',
                             # because '20\d\d' can't match

Alternations behave the same way in groups as out of them: at a given
string position, the leftmost alternative that allows the regexp to
match is taken.  So in the last example at the first string position,
C<"20"> matches the second alternative, but there is nothing left over
to match the next two digits C<\d\d>.  So Perl moves on to the next
alternative, which is the null alternative and that works, since
C<"20"> is two digits.
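
The following sketch exercises nested groups and null alternatives
together.  The word list and the C<^>...C<$> anchors are assumptions
added for illustration; the second match repeats the C<"20"> example.

```perl

    my @found;
    for my $word (qw(housecat housekeeper housecats house mouse)) {
        # $1 holds whatever followed 'house', possibly the empty string
        push @found, "$word:$1" if $word =~ /^house(cat(s|)|keeper|)$/;
    }
    # @found is ('housecat:cat', 'housekeeper:keeper',
    #            'housecats:cats', 'house:')

    "20" =~ /(19|20|)\d\d/;
    my $century = $1;   # '' because only the null alternative can succeed

```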

The process of trying one alternative, seeing if it matches, and, if it
doesn't, going back in the string to where that alternative was tried
and moving on to the next alternative, is called
I<backtracking>.  The term "backtracking" comes from the idea that
matching a regexp is like a walk in the woods.  Successfully matching
a regexp is like arriving at a destination.  There are many possible
trailheads, one for each string position, and each one is tried in
order, left to right.  From each trailhead there may be many paths,
some of which get you there, and some of which are dead ends.  When you
walk along a trail and hit a dead end, you have to backtrack along the
trail to an earlier point to try another trail.  If you hit your
destination, you stop immediately and forget about trying all the
other trails.  You are persistent, and only if you have tried all the
trails from all the trailheads and not arrived at your destination, do
you declare failure.  To be concrete, here is a step-by-step analysis
of what Perl does when it tries to match the regexp

    "abcde" =~ /(abd|abc)(df|d|de)/;

=over 4

=item Z<>0. Start with the first letter in the string C<'a'>.

E<nbsp>

=item Z<>1. Try the first alternative in the first group C<'abd'>.

E<nbsp>

=item Z<>2.  Match C<'a'> followed by C<'b'>. So far so good.

E<nbsp>

=item Z<>3.  C<'d'> in the regexp doesn't match C<'c'> in the string - a
dead end.  So backtrack two characters and pick the second alternative
in the first group C<'abc'>.

E<nbsp>

=item Z<>4.  Match C<'a'> followed by C<'b'> followed by C<'c'>.  We are on a roll
and have satisfied the first group. Set C<$1> to C<'abc'>.

E<nbsp>

=item Z<>5.  Move on to the second group and pick the first alternative C<'df'>.

E<nbsp>

=item Z<>6.  Match the C<'d'>.

E<nbsp>

=item Z<>7.  C<'f'> in the regexp doesn't match C<'e'> in the string, so a dead
end.  Backtrack one character and pick the second alternative in the
second group C<'d'>.

E<nbsp>

=item Z<>8.  C<'d'> matches. The second grouping is satisfied, so set
C<$2> to C<'d'>.

E<nbsp>

=item Z<>9.  We are at the end of the regexp, so we are done! We have
matched C<'abcd'> out of the string C<"abcde">.

=back

There are a couple of things to note about this analysis.  First, the
third alternative in the second group C<'de'> also allows a match, but we
stopped before we got to it - at a given character position, leftmost
wins.  Second, we were able to get a match at the first character
position of the string C<'a'>.  If there were no matches at the first
position, Perl would move to the second character position C<'b'> and
attempt the match all over again.  Only when all possible paths at all
possible character positions have been exhausted does Perl give
up and declare S<C<$string =~ /(abd|abc)(df|d|de)/;>> to be false.
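
The outcome of the walk-through can be confirmed with a few lines of
code.  This is only a sketch; the variable names are ours, and the
second regexp is a reordered variant added to show that the order of the
alternatives decides which one wins.

```perl

    my $matched = "abcde" =~ /(abd|abc)(df|d|de)/;
    my ($g1, $g2, $whole) = ($1, $2, $&);
    # $g1 = 'abc', $g2 = 'd', $whole = 'abcd'

    # With 'de' listed before 'd', the longer alternative is tried first.
    "abcde" =~ /(abd|abc)(df|de|d)/;
    my $g2_reordered = $2;      # 'de'

    print "$g1 $g2 $whole $g2_reordered\n";

```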

Even with all this work, regexp matching happens remarkably fast.  To
speed things up, Perl compiles the regexp into a compact sequence of
opcodes that can often fit inside a processor cache.  When the code is
executed, these opcodes can then run at full throttle and search very
quickly.

=head2 Extracting matches

The grouping metacharacters C<()> also serve another completely
different function: they allow the extraction of the parts of a string
that matched.  This is very useful to find out what matched and for
text processing in general.  For each grouping, the part that matched
inside goes into the special variables C<$1>, C<$2>, I<etc>.  They can be
used just as ordinary variables:

    # extract hours, minutes, seconds
    if ($time =~ /(\d\d):(\d\d):(\d\d)/) {    # match hh:mm:ss format
        $hours = $1;
        $minutes = $2;
        $seconds = $3;
    }

Now, we know that in scalar context,
S<C<$time =~ /(\d\d):(\d\d):(\d\d)/>> returns a true or false
value.  In list context, however, it returns the list of matched values
C<($1,$2,$3)>.  So we could write the code more compactly as

    # extract hours, minutes, seconds
    ($hours, $minutes, $seconds) = ($time =~ /(\d\d):(\d\d):(\d\d)/);

If the groupings in a regexp are nested, C<$1> gets the group with the
leftmost opening parenthesis, C<$2> the next opening parenthesis,
I<etc>.  Here is a regexp with nested groups:

    /(ab(cd|ef)((gi)|j))/;
     1  2      34

If this regexp matches, C<$1> contains a string starting with
C<'ab'>, C<$2> is either set to C<'cd'> or C<'ef'>, C<$3> equals either
C<'gi'> or C<'j'>, and C<$4> is either set to C<'gi'>, just like C<$3>,
or it remains undefined.

For convenience, Perl sets C<$+> to the string held by the highest numbered
C<$1>, C<$2>,... that got assigned (and, somewhat related, C<$^N> to the
value of the C<$1>, C<$2>,... most-recently assigned; I<i.e.> the C<$1>,
C<$2>,... associated with the rightmost closing parenthesis used in the
match).
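
To see the numbering at work, here is a small sketch that applies the
nested regexp to two sample strings of our own choosing.

```perl

    "abcdgi" =~ /(ab(cd|ef)((gi)|j))/;
    my @groups      = ($1, $2, $3, $4);  # ('abcdgi', 'cd', 'gi', 'gi')
    my $highest     = $+;                # 'gi', the value of $4
    my $last_closed = $^N;               # 'abcdgi', the value of $1,
                                         # whose ')' closed last

    "abefj" =~ /(ab(cd|ef)((gi)|j))/;
    my $third        = $3;                   # 'j'
    my $four_defined = defined $4 ? 1 : 0;   # 0, group 4 took no part

```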


=head2 Backreferences

Closely associated with the matching variables C<$1>, C<$2>, ... are
the I<backreferences> C<\g1>, C<\g2>,...  Backreferences are simply
matching variables that can be used I<inside> a regexp.  This is a
really nice feature; what matches later in a regexp is made to depend on
what matched earlier in the regexp.  Suppose we wanted to look
for doubled words in a text, like "the the".  The following regexp finds
all 3-letter doubles with a space in between:

    /\b(\w\w\w)\s\g1\b/;

The grouping assigns a value to C<\g1>, so that the same 3-letter sequence
is used for both parts.

A similar task is to find words consisting of two identical parts:

    % simple_grep '^(\w\w\w\w|\w\w\w|\w\w|\w)\g1$' /usr/dict/words
    beriberi
    booboo
    coco
    mama
    murmur
    papa

The regexp has a single grouping which considers 4-letter
combinations, then 3-letter combinations, I<etc>., and uses C<\g1> to look for
a repeat.  Although C<$1> and C<\g1> represent the same thing, care should be
taken to use matched variables C<$1>, C<$2>,... only I<outside> a regexp
and backreferences C<\g1>, C<\g2>,... only I<inside> a regexp; not doing
so may lead to surprising and unsatisfactory results.
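
As a sketch of the division of labour, the code below uses C<\g1>
inside the regexp to find a doubled 3-letter word and C<$1> outside it
to report the word.  The sample lines are invented for illustration.

```perl

    my @lines = ("Paris in the the spring",
                 "this is fine",
                 "say yes yes now");
    my @doubled;
    for my $line (@lines) {
        # \g1 inside the regexp, $1 outside of it
        push @doubled, $1 if $line =~ /\b(\w\w\w)\s\g1\b/;
    }
    # @doubled is ('the', 'yes')
    print "doubled: @doubled\n";

```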


=head2 Relative backreferences

Counting the opening parentheses to get the correct number for a
backreference is error-prone as soon as there is more than one
capturing group.  A more convenient technique became available
with Perl 5.10: relative backreferences. To refer to the immediately
preceding capture group one now may write C<\g{-1}>, the next but
last is available via C<\g{-2}>, and so on.

Another good reason in addition to readability and maintainability
for using relative backreferences is illustrated by the following example,
where a simple pattern for matching peculiar strings is used:

    $a99a = '([a-z])(\d)\g2\g1';   # matches a11a, g22g, x33x, etc.

Now that we have this pattern stored as a handy string, we might feel
tempted to use it as a part of some other pattern:

    $line = "code=e99e";
    if ($line =~ /^(\w+)=$a99a$/){   # unexpected behavior!
        print "$1 is valid\n";
    } else {
        print "bad line: '$line'\n";
    }

But this doesn't match, at least not the way one might expect. Only
after inserting the interpolated C<$a99a> and looking at the resulting
full text of the regexp is it obvious that the backreferences have
backfired. The subexpression C<(\w+)> has snatched number 1 and
demoted the groups in C<$a99a> by one rank. This can be avoided by
using relative backreferences:

    $a99a = '([a-z])(\d)\g{-1}\g{-2}';  # safe for being interpolated
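
The difference can be checked directly.  In this sketch C<$abs> and
C<$rel> are our own names for the two versions of the pattern.

```perl

    my $line = "code=e99e";
    my $abs  = '([a-z])(\d)\g2\g1';         # absolute numbering
    my $rel  = '([a-z])(\d)\g{-1}\g{-2}';   # relative numbering

    # After interpolation, (\w+) is group 1, so \g2 and \g1 in $abs
    # point at the wrong groups.
    my $abs_ok = $line =~ /^(\w+)=$abs$/ ? 1 : 0;   # 0
    my $rel_ok = $line =~ /^(\w+)=$rel$/ ? 1 : 0;   # 1
    my $key    = $rel_ok ? $1 : undef;               # 'code'

```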


=head2 Named backreferences

Perl 5.10 also introduced named capture groups and named backreferences.
To attach a name to a capturing group, you write either
C<< (?<name>...) >> or C<< (?'name'...) >>.  The backreference may
then be written as C<\g{name}>.  It is permissible to attach the
same name to more than one group, but then only the leftmost one of the
eponymous set can be referenced.  Outside of the pattern a named
capture group is accessible through the C<%+> hash.

Assuming that we have to match calendar dates which may be given in one
of the three formats yyyy-mm-dd, mm/dd/yyyy or dd.mm.yyyy, we can write
three suitable patterns where we use C<'d'>, C<'m'> and C<'y'> respectively as the
names of the groups capturing the pertaining components of a date. The
matching operation combines the three patterns as alternatives:

    $fmt1 = '(?<y>\d\d\d\d)-(?<m>\d\d)-(?<d>\d\d)';
    $fmt2 = '(?<m>\d\d)/(?<d>\d\d)/(?<y>\d\d\d\d)';
    $fmt3 = '(?<d>\d\d)\.(?<m>\d\d)\.(?<y>\d\d\d\d)';
    for my $d (qw( 2006-10-21 15.01.2007 10/31/2005 )) {
        if ( $d =~ m{$fmt1|$fmt2|$fmt3} ){
            print "day=$+{d} month=$+{m} year=$+{y}\n";
        }
    }

If any of the alternatives matches, the hash C<%+> is bound to contain the
three key-value pairs.
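
The date example uses the names only for extraction.  The sketch below
shows the other half, a named backreference inside the pattern, and
copies C<%+> into an ordinary hash.  The group name C<ch> and the
sample strings are assumptions made for illustration.

```perl

    # \g{ch} must repeat whatever the group named 'ch' matched
    "bookkeeper" =~ /(?<ch>\w)\g{ch}/;
    my $doubled_letter = $+{ch};    # 'o'
    my $where          = $-[0];     # 1, the offset of the first 'o'

    # %+ is overwritten by the next successful match, so copy it
    "2006-10-21" =~ /(?<y>\d\d\d\d)-(?<m>\d\d)-(?<d>\d\d)/;
    my %date = %+;                  # (y => '2006', m => '10', d => '21')

```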


=head2 Alternative capture group numbering

Yet another capturing group numbering technique (also available as of
Perl 5.10) deals with the problem of referring to groups within a set of
alternatives.  Consider a pattern for matching a time of the day, civil
or military style:

    if ( $time =~ /(\d\d|\d):(\d\d)|(\d\d)(\d\d)/ ){
        # process hour and minute
    }

Processing the results requires an additional if statement to determine
whether C<$1> and C<$2> or C<$3> and C<$4> contain the goodies. It would
be easier if we could use group numbers 1 and 2 in the second alternative
as well, and this is exactly what the parenthesized construct C<(?|...)>,
set around an alternative, achieves. Here is an extended version of the
previous pattern:

  if($time =~ /(?|(\d\d|\d):(\d\d)|(\d\d)(\d\d))\s+([A-Z][A-Z][A-Z])/){
      print "hour=$1 minute=$2 zone=$3\n";
  }

Within the alternative numbering group, group numbers start at the same
position for each alternative. After the group, numbering continues
with one higher than the maximum reached across all the alternatives.
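
A short sketch, run against two made-up time strings, shows that both
alternatives deliver their results in the same variables.

```perl

    my @parsed;
    for my $time ("8:30 EST", "2359 UTC") {
        if ($time =~ /(?|(\d\d|\d):(\d\d)|(\d\d)(\d\d))\s+([A-Z][A-Z][A-Z])/) {
            # $1 and $2 are filled by whichever alternative matched;
            # the zone group after (?|...) is number 3
            push @parsed, "$1-$2-$3";
        }
    }
    # @parsed is ('8-30-EST', '23-59-UTC')

```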

=head2 Position information

In addition to what was matched, Perl also provides the
positions of what was matched as contents of the C<@-> and C<@+>
arrays. C<$-[0]> is the position of the start of the entire match and
C<$+[0]> is the position of the end. Similarly, C<$-[n]> is the
position of the start of the C<$n> match and C<$+[n]> is the position
of the end. If C<$n> is undefined, so are C<$-[n]> and C<$+[n]>. Then
this code

    $x = "Mmm...donut, thought Homer";
    $x =~ /^(Mmm|Yech)\.\.\.(donut|peas)/; # matches
    foreach $exp (1..$#-) {
        print "Match $exp: '${$exp}' at position ($-[$exp],$+[$exp])\n";
    }

prints

    Match 1: 'Mmm' at position (0,3)
    Match 2: 'donut' at position (6,11)

Even if there are no groupings in a regexp, it is still possible to
find out what exactly matched in a string.  If you use them, Perl
will set C<$`> to the part of the string before the match, will set C<$&>
to the part of the string that matched, and will set C<$'> to the part
of the string after the match.  An example:

    $x = "the cat caught the mouse";
    $x =~ /cat/;  # $` = 'the ', $& = 'cat', $' = ' caught the mouse'
    $x =~ /the/;  # $` = '', $& = 'the', $' = ' cat caught the mouse'

In the second match, C<$`> equals C<''> because the regexp matched at the
first character position in the string and stopped; it never saw the
second "the".

If your code is to run on Perl versions earlier than
5.20, it is worthwhile to note that using C<$`> and C<$'>
slows down regexp matching quite a bit, while C<$&> slows it down to a
lesser extent, because if they are used in one regexp in a program,
they are generated for I<all> regexps in the program.  So if raw
performance is a goal of your application, they should be avoided.
If you need to extract the corresponding substrings, use C<@-> and
C<@+> instead:

    $` is the same as substr( $x, 0, $-[0] )
    $& is the same as substr( $x, $-[0], $+[0]-$-[0] )
    $' is the same as substr( $x, $+[0] )

As of Perl 5.10, the C<${^PREMATCH}>, C<${^MATCH}> and C<${^POSTMATCH}>
variables may be used.  These are only set if the C</p> modifier is
present.  Consequently they do not penalize the rest of the program.  In
Perl 5.20, C<${^PREMATCH}>, C<${^MATCH}> and C<${^POSTMATCH}> are available
whether or not C</p> has been used (the modifier is ignored), and
C<$`>, C<$'> and C<$&> do not cause any speed difference.
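
The equivalences can be verified with a short sketch.  It reuses the
sample string from above and assumes Perl 5.10 or later for the C</p>
modifier; the variable names are ours.

```perl

    my $x = "the cat caught the mouse";
    $x =~ /cat/p;
    my ($start, $end) = ($-[0], $+[0]);              # (4, 7)

    my $pre   = substr($x, 0, $start);               # 'the '
    my $match = substr($x, $start, $end - $start);   # 'cat'
    my $post  = substr($x, $end);                    # ' caught the mouse'

    # The /p variables hold the same three pieces.
    my $same = (   $pre   eq ${^PREMATCH}
                && $match eq ${^MATCH}
                && $post  eq ${^POSTMATCH} ) ? 1 : 0;

```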

=head2 Non-capturing groupings

A group that is required to bundle a set of alternatives may or may not be
useful as a capturing group.  If it isn't, it just creates a superfluous
addition to the set of available capture group values, inside as well as
outside the regexp.  Non-capturing groupings, denoted by C<(?:regexp)>,
still allow the regexp to be treated as a single unit, but don't establish
a capturing group at the same time.  Both capturing and non-capturing
groupings are allowed to co-exist in the same regexp.  Because there is
no extraction, non-capturing groupings are faster than capturing
groupings.  Non-capturing groupings are also handy for choosing exactly
which parts of a regexp are to be extracted to matching variables:

    # match a number, $1-$4 are set, but we only want $1
    /([+-]?\ *(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?)/;

    # match a number faster, only $1 is set
    /([+-]?\ *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?)/;

    # match a number, get $1 = whole number, $2 = exponent
    /([+-]?\ *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE]([+-]?\d+))?)/;

Non-capturing groupings are also useful for removing nuisance
elements gathered from a split operation where parentheses are
required for some reason:

    $x = '12aba34ba5';
    @num = split /(a|b)+/, $x;    # @num = ('12','a','34','a','5')
    @num = split /(?:a|b)+/, $x;  # @num = ('12','34','5')

In Perl 5.22 and later, all groups within a regexp can be set to
non-capturing by using the new C</n> flag:

    "hello" =~ /(hi|hello)/n; # $1 is not set!

See L<perlre/"n"> for more information.

=head2 Matching repetitions

The examples in the section on backreferences display an annoying
weakness.  We
were only matching 3-letter words, or chunks of words of 4 letters or
less.  We'd like to be able to match words or, more generally, strings
of any length, without writing out tedious alternatives like
C<\w\w\w\w|\w\w\w|\w\w|\w>.

This is exactly the problem the I<quantifier> metacharacters C<'?'>,
C<'*'>, C<'+'>, and C<{}> were created for.  They allow us to delimit the
number of repeats for a portion of a regexp we consider to be a
match.  Quantifiers are put immediately after the character, character
class, or grouping that we want to specify.  They have the following
meanings:

=over 4

=item *

C<a?> means: match C<'a'> 1 or 0 times

=item *

C<a*> means: match C<'a'> 0 or more times, I<i.e.>, any number of times

=item *

C<a+> means: match C<'a'> 1 or more times, I<i.e.>, at least once

=item *

C<a{n,m}> means: match at least C<n> times, but not more than C<m>
times.

=item *

C<a{n,}> means: match C<n> or more times

=item *

C<a{n}> means: match exactly C<n> times

=back

Here are some examples:

    /[a-z]+\s+\d*/;  # match a lowercase word, at least one space, and
                     # any number of digits
    /(\w+)\s+\g1/;    # match doubled words of arbitrary length
    /y(es)?/i;       # matches 'y', 'Y', or a case-insensitive 'yes'
    $year =~ /^\d{2,4}$/;  # make sure year is at least 2 but not more
                           # than 4 digits
    $year =~ /^\d{4}$|^\d{2}$/; # better match; throw out 3-digit dates
    $year =~ /^\d{2}(\d{2})?$/; # same thing written differently.
                                # However, this captures the last two
                                # digits in $1 and the other does not.

    % simple_grep '^(\w+)\g1$' /usr/dict/words   # isn't this easier?
    beriberi
    booboo
    coco
    mama
    murmur
    papa

For all of these quantifiers, Perl will try to match as much of the
string as possible, while still allowing the regexp to succeed.  Thus
with C</a?.../>, Perl will first try to match the regexp with the C<'a'>
present; if that fails, Perl will try to match the regexp without the
C<'a'> present.  For the quantifier C<'*'>, we get the following:

    $x = "the cat in the hat";
    $x =~ /^(.*)(cat)(.*)$/; # matches,
                             # $1 = 'the '
                             # $2 = 'cat'
                             # $3 = ' in the hat'

This is what we might expect: the match finds the only C<cat> in the
string and locks onto it.  Consider, however, this regexp:

    $x =~ /^(.*)(at)(.*)$/; # matches,
                            # $1 = 'the cat in the h'
                            # $2 = 'at'
                            # $3 = ''   (0 characters match)

One might initially guess that Perl would find the C<at> in C<cat> and
stop there, but that wouldn't give the longest possible string to the
first quantifier C<.*>.  Instead, the first quantifier C<.*> grabs as
much of the string as possible while still having the regexp match.  In
this example, that means matching the C<at> element against the final C<at>
in the string.  The other important principle illustrated here is that,
when there are two or more elements in a regexp, the I<leftmost>
quantifier, if there is one, gets to grab as much of the string as
possible, leaving the rest of the regexp to fight over scraps.  Thus in
our example, the first quantifier C<.*> grabs most of the string, while
the second quantifier C<.*> gets the empty string.   Quantifiers that
grab as much of the string as possible are called I<maximal match> or
I<greedy> quantifiers.

When a regexp can match a string in several different ways, we can use
the principles above to predict which way the regexp will match:

=over 4

=item *

Principle 0: Taken as a whole, any regexp will be matched at the
earliest possible position in the string.

=item *

Principle 1: In an alternation C<a|b|c...>, the leftmost alternative
that allows a match for the whole regexp will be the one used.

=item *

Principle 2: The maximal matching quantifiers C<'?'>, C<'*'>, C<'+'> and
C<{n,m}> will in general match as much of the string as possible while
still allowing the whole regexp to match.

=item *

Principle 3: If there are two or more elements in a regexp, the
leftmost greedy quantifier, if any, will match as much of the string
as possible while still allowing the whole regexp to match.  The next
leftmost greedy quantifier, if any, will try to match as much of the
string remaining available to it as possible, while still allowing the
whole regexp to match.  And so on, until all the regexp elements are
satisfied.

=back

As we have seen above, Principle 0 overrides the others. The regexp
will be matched as early as possible, with the other principles
determining how the regexp matches at that earliest character
position.

Here is an example of these principles in action:

    $x = "The programming republic of Perl";
    $x =~ /^(.+)(e|r)(.*)$/;  # matches,
                              # $1 = 'The programming republic of Pe'
                              # $2 = 'r'
                              # $3 = 'l'

This regexp matches at the earliest string position, C<'T'>.  One
might think that C<'e'>, being leftmost in the alternation, would be
matched, but C<'r'> produces the longest string in the first quantifier.

    $x =~ /(m{1,2})(.*)$/;  # matches,
                            # $1 = 'mm'
                            # $2 = 'ing republic of Perl'

Here, the earliest possible match is at the first C<'m'> in
C<programming>. C<m{1,2}> is the first quantifier, so it gets to match
a maximal C<mm>.

    $x =~ /.*(m{1,2})(.*)$/;  # matches,
                              # $1 = 'm'
                              # $2 = 'ing republic of Perl'

Here, the regexp matches at the start of the string. The first
quantifier C<.*> grabs as much as possible, leaving just a single
C<'m'> for the second quantifier C<m{1,2}>.

    $x =~ /(.?)(m{1,2})(.*)$/;  # matches,
                                # $1 = 'a'
                                # $2 = 'mm'
                                # $3 = 'ing republic of Perl'

Here, C<.?> eats its maximal one character at the earliest possible
position in the string, C<'a'> in C<programming>, leaving C<m{1,2}>
the opportunity to match both C<'m'>'s. Finally,

    "aXXXb" =~ /(X*)/; # matches with $1 = ''

because it can match zero copies of C<'X'> at the beginning of the
string.  If you definitely want to match at least one C<'X'>, use
C<X+>, not C<X*>.
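
Looking at the match position makes the difference plain.  This is a
small sketch; C<$-[0]> is the offset at which the match starts.

```perl

    "aXXXb" =~ /(X*)/;
    my ($star, $star_pos) = ($1, $-[0]);   # ('', 0): empty match at 'a'

    "aXXXb" =~ /(X+)/;
    my ($plus, $plus_pos) = ($1, $-[0]);   # ('XXX', 1)

```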

Sometimes greed is not good.  At times, we would like quantifiers to
match a I<minimal> piece of string, rather than a maximal piece.  For
this purpose, Larry Wall created the I<minimal match> or
I<non-greedy> quantifiers C<??>, C<*?>, C<+?>, and C<{}?>.  These are
the usual quantifiers with a C<'?'> appended to them.  They have the
following meanings:

=over 4

=item *

C<a??> means: match C<'a'> 0 or 1 times. Try 0 first, then 1.

=item *

C<a*?> means: match C<'a'> 0 or more times, I<i.e.>, any number of times,
but as few times as possible

=item *

C<a+?> means: match C<'a'> 1 or more times, I<i.e.>, at least once, but
as few times as possible

=item *

C<a{n,m}?> means: match at least C<n> times, not more than C<m>
times, as few times as possible

=item *

C<a{n,}?> means: match at least C<n> times, but as few times as
possible

=item *

C<a{n}?> means: match exactly C<n> times.  Because we match exactly
C<n> times, C<a{n}?> is equivalent to C<a{n}> and is just there for
notational consistency.

=back

Let's look at the example above, but with minimal quantifiers:

    $x = "The programming republic of Perl";
    $x =~ /^(.+?)(e|r)(.*)$/; # matches,
                              # $1 = 'Th'
                              # $2 = 'e'
                              # $3 = ' programming republic of Perl'

The minimal string that will allow both the start of the string C<'^'>
and the alternation to match is C<Th>, with the alternation C<e|r>
matching C<'e'>.  The second quantifier C<.*> is free to gobble up the
rest of the string.

    $x =~ /(m{1,2}?)(.*?)$/;  # matches,
                              # $1 = 'm'
                              # $2 = 'ming republic of Perl'

The first string position that this regexp can match is at the first
C<'m'> in C<programming>. At this position, the minimal C<m{1,2}?>
matches just one C<'m'>.  Although the second quantifier C<.*?> would
prefer to match no characters, it is constrained by the end-of-string
anchor C<'$'> to match the rest of the string.

    $x =~ /(.*?)(m{1,2}?)(.*)$/;  # matches,
                                  # $1 = 'The progra'
                                  # $2 = 'm'
                                  # $3 = 'ming republic of Perl'

In this regexp, you might expect the first minimal quantifier C<.*?>
to match the empty string, because it is not constrained by a C<'^'>
anchor to match the beginning of the word.  Principle 0 applies here,
however.  Because it is possible for the whole regexp to match at the
start of the string, it I<will> match at the start of the string.  Thus
the first quantifier has to match everything up to the first C<'m'>.  The
second minimal quantifier matches just one C<'m'> and the third
quantifier matches the rest of the string.

    $x =~ /(.??)(m{1,2})(.*)$/;  # matches,
                                 # $1 = 'a'
                                 # $2 = 'mm'
                                 # $3 = 'ing republic of Perl'

Just as in the previous regexp, the first quantifier C<.??> can match
earliest at position C<'a'>, so it does.  The second quantifier is
greedy, so it matches C<mm>, and the third matches the rest of the
string.

We can modify principle 3 above to take into account non-greedy
quantifiers:

=over 4

=item *

Principle 3: If there are two or more elements in a regexp, the
leftmost greedy (non-greedy) quantifier, if any, will match as much
(little) of the string as possible while still allowing the whole
regexp to match.  The next leftmost greedy (non-greedy) quantifier, if
any, will try to match as much (little) of the string remaining
available to it as possible, while still allowing the whole regexp to
match.  And so on, until all the regexp elements are satisfied.

=back
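
A common place where the choice matters is text with paired delimiters.
The sketch below uses an invented HTML-like string; it only contrasts
the two quantifiers and is not a recommendation for parsing HTML this
way.

```perl

    my $html = "<b>bold</b> and <i>it</i>";

    # Greedy: .+ runs to the last '>' in the string.
    my ($greedy) = $html =~ /<(.+)>/;    # 'b>bold</b> and <i>it</i'

    # Non-greedy: .+? stops at the first '>' that lets the match succeed.
    my ($lazy)   = $html =~ /<(.+?)>/;   # 'b'

```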

Just like alternation, quantifiers are also susceptible to
backtracking.  Here is a step-by-step analysis of the example

    $x = "the cat in the hat";
    $x =~ /^(.*)(at)(.*)$/; # matches,
                            # $1 = 'the cat in the h'
                            # $2 = 'at'
                            # $3 = ''   (0 characters match)

=over 4

=item Z<>0.  Start with the first letter in the string C<'t'>.

E<nbsp>

=item Z<>1.  The first quantifier C<'.*'> starts out by matching the whole
string "C<the cat in the hat>".

E<nbsp>

=item Z<>2.  C<'a'> in the regexp element C<'at'> doesn't match the end
of the string.  Backtrack one character.

E<nbsp>

=item Z<>3.  C<'a'> in the regexp element C<'at'> still doesn't match
the last letter of the string C<'t'>, so backtrack one more character.

E<nbsp>

=item Z<>4.  Now we can match the C<'a'> and the C<'t'>.

E<nbsp>

=item Z<>5.  Move on to the third element C<'.*'>.  Since we are at the
end of the string and C<'.*'> can match 0 times, assign it the empty
string.

E<nbsp>

=item Z<>6.  We are done!

=back

Most of the time, all this moving forward and backtracking happens
quickly and searching is fast. There are some pathological regexps,
however, whose execution time grows exponentially with the size of the
string.  A typical structure that blows up in your face is of the form

    /(a|b+)*/;

The problem is the nested indeterminate quantifiers.  There are many
different ways of partitioning a string of length n between the C<'+'>
and C<'*'>: one repetition with C<b+> of length n, two repetitions with
the first C<b+> length k and the second with length n-k, m repetitions
whose bits add up to length n, I<etc>.  In fact there are an exponential
number of ways to partition a string as a function of its length.  A
regexp may get lucky and match early in the process, but if there is
no match, Perl will try I<every> possibility before giving up.  So be
careful with nested C<'*'>'s, C<{n,m}>'s, and C<'+'>'s.  The book
I<Mastering Regular Expressions> by Jeffrey Friedl gives a wonderful
discussion of this and other efficiency issues.
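
When the inner group is not needed for capturing, one way to sidestep
the problem is to rewrite the pattern so that the quantifiers are no
longer nested.  The sketch below assumes that only acceptance or
rejection matters; the anchors, the trailing C<c> and the short test
strings are additions made for illustration.

```perl

    my @inputs = ("abbac", "bbbb", "abxc", "c");
    my (@nested, @flat);
    for my $s (@inputs) {
        push @nested, ($s =~ /^(a|b+)*c$/ ? 1 : 0);   # nested quantifiers
        push @flat,   ($s =~ /^[ab]*c$/   ? 1 : 0);   # one plain quantifier
    }
    # both lists are (1, 0, 0, 1): the same strings are accepted

```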


=head2 Possessive quantifiers

Backtracking during the relentless search for a match may be a waste
of time, particularly when the match is bound to fail.  Consider
the simple pattern

    /^\w+\s+\w+$/; # a word, spaces, a word

Whenever this is applied to a string which doesn't quite meet the
pattern's expectations such as S<C<"abc  ">> or S<C<"abc  def ">>,
the regexp engine will backtrack, approximately once for each character
in the string.  But we know that there is no way around taking I<all>
of the initial word characters to match the first repetition, that I<all>
spaces must be eaten by the middle part, and the same goes for the second
word.

With the introduction of the I<possessive quantifiers> in Perl 5.10, we
have a way of instructing the regexp engine not to backtrack, with the
usual quantifiers with a C<'+'> appended to them.  This makes them greedy as
well as stingy; once they succeed they won't give anything back to permit
another solution. They have the following meanings:

=over 4

=item *

C<a{n,m}+> means: match at least C<n> times, not more than C<m> times,
as many times as possible, and don't give anything up. C<a?+> is short
for C<a{0,1}+>.

=item *

C<a{n,}+> means: match at least C<n> times, but as many times as possible,
and don't give anything up. C<a*+> is short for C<a{0,}+> and C<a++> is
short for C<a{1,}+>.

=item *

C<a{n}+> means: match exactly C<n> times.  It is just there for
notational consistency.

=back

These possessive quantifiers represent a special case of a more general
concept, the I<independent subexpression>, see below.

As an example where a possessive quantifier is suitable we consider
matching a quoted string, as it appears in several programming languages.
The backslash is used as an escape character that indicates that the
next character is to be taken literally, as another character for the
string.  Therefore, after the opening quote, we expect a (possibly
empty) sequence of alternatives: either some character except an
unescaped quote or backslash or an escaped character.

    /"(?:[^"\\]++|\\.)*+"/;
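
Here is a sketch of both points: the quoted-string regexp applied to a
sample of our own, with a capturing group added around it, and a tiny
case showing that a possessive quantifier really gives nothing back.

```perl

    my $text = 'say "he said \"hi\"" now';
    my ($quoted) = $text =~ /("(?:[^"\\]++|\\.)*+")/;
    # $quoted holds the whole quoted string, escaped quotes included

    my $plain      = "aaa" =~ /^a+a$/  ? 1 : 0;  # 1: a+ gives one 'a' back
    my $possessive = "aaa" =~ /^a++a$/ ? 1 : 0;  # 0: a++ keeps all three

```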


=head2 Building a regexp

At this point, we have all the basic regexp concepts covered, so let's
give a more involved example of a regular expression.  We will build a
regexp that matches numbers.

The first task in building a regexp is to decide what we want to match
and what we want to exclude.  In our case, we want to match both
integers and floating point numbers and we want to reject any string
that isn't a number.

The next task is to break the problem down into smaller problems that
are easily converted into a regexp.

The simplest case is integers.  These consist of a sequence of digits,
with an optional sign in front.  The digits we can represent with
C<\d+> and the sign can be matched with C<[+-]>.  Thus the integer
regexp is

    /[+-]?\d+/;  # matches integers

A floating point number potentially has a sign, an integral part, a
decimal point, a fractional part, and an exponent.  One or more of these
parts is optional, so we need to check out the different
possibilities.  Floating point numbers which are in proper form include
123., 0.345, .34, -1e6, and 25.4E-72.  As with integers, the sign out
front is completely optional and can be matched by C<[+-]?>.  We can
see that if there is no exponent, floating point numbers must have a
decimal point, otherwise they are integers.  We might be tempted to
model these with C<\d*\.\d*>, but this would also match just a single
decimal point, which is not a number.  So the three cases of floating
point number without exponent are

   /[+-]?\d+\./;  # 1., 321., etc.
   /[+-]?\.\d+/;  # .1, .234, etc.
   /[+-]?\d+\.\d+/;  # 1.0, 30.56, etc.

These can be combined into a single regexp with a three-way alternation:

   /[+-]?(\d+\.\d+|\d+\.|\.\d+)/;  # floating point, no exponent

In this alternation, it is important to put C<'\d+\.\d+'> before
C<'\d+\.'>.  If C<'\d+\.'> were first, the regexp would happily match that
and ignore the fractional part of the number.
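
The effect of the ordering is easy to demonstrate.  In this sketch an
outer capturing group is added so that the matched text can be
inspected; C<"1.5"> is an arbitrary sample.

```perl

    # \d+\.\d+ first: the fractional part is included
    my ($good) = "1.5" =~ /([+-]?(?:\d+\.\d+|\d+\.|\.\d+))/;   # '1.5'

    # \d+\. first: the match stops after the decimal point
    my ($bad)  = "1.5" =~ /([+-]?(?:\d+\.|\d+\.\d+|\.\d+))/;   # '1.'

```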

Now consider floating point numbers with exponents.  The key
observation here is that I<both> integers and numbers with decimal
points are allowed in front of an exponent.  Then exponents, like the
overall sign, are independent of whether we are matching numbers with
or without decimal points, and can be "decoupled" from the
mantissa.  The overall form of the regexp now becomes clear:

    /^(optional sign)(integer | f.p. mantissa)(optional exponent)$/;

The exponent is an C<'e'> or C<'E'>, followed by an integer.  So the
exponent regexp is

   /[eE][+-]?\d+/;  # exponent

Putting all the parts together, we get a regexp that matches numbers:

   /^[+-]?(\d+\.\d+|\d+\.|\.\d+|\d+)([eE][+-]?\d+)?$/;  # Ta da!

Long regexps like this may impress your friends, but can be hard to
decipher.  In complex situations like this, the C</x> modifier for a
match is invaluable.  It allows one to put nearly arbitrary whitespace
and comments into a regexp without affecting its meaning.  Using it,
we can rewrite our "extended" regexp in the more pleasing form

   /^
      [+-]?         # first, match an optional sign
      (             # then match integers or f.p. mantissas:
          \d+\.\d+  # mantissa of the form a.b
         |\d+\.     # mantissa of the form a.
         |\.\d+     # mantissa of the form .b
         |\d+       # integer of the form a
      )
      ( [eE] [+-]? \d+ )?  # finally, optionally match an exponent
   $/x;

If whitespace is mostly irrelevant, how does one include space
characters in an extended regexp? The answer is to backslash it
S<C<'\ '>> or put it in a character class S<C<[ ]>>.  The same thing
goes for pound signs: use C<\#> or C<[#]>.  For instance, Perl allows
a space between the sign and the mantissa or integer, and we could add
this to our regexp as follows:

   /^
      [+-]?\ *      # first, match an optional sign *and space*
      (             # then match integers or f.p. mantissas:
          \d+\.\d+  # mantissa of the form a.b
         |\d+\.     # mantissa of the form a.
         |\.\d+     # mantissa of the form .b
         |\d+       # integer of the form a
      )
      ( [eE] [+-]? \d+ )?  # finally, optionally match an exponent
   $/x;

In this form, it is easier to see a way to simplify the
alternation.  Alternatives 1, 2, and 4 all start with C<\d+>, so it
could be factored out:

   /^
      [+-]?\ *      # first, match an optional sign
      (             # then match integers or f.p. mantissas:
          \d+       # start out with a ...
          (
              \.\d* # mantissa of the form a.b or a.
          )?        # ? takes care of integers of the form a
         |\.\d+     # mantissa of the form .b
      )
      ( [eE] [+-]? \d+ )?  # finally, optionally match an exponent
   $/x;

Starting in Perl v5.26, specifying C</xx> changes the square-bracketed
portions of a pattern to ignore tabs and space characters unless they
are escaped by preceding them with a backslash.  So, we could write

   /^
      [ + - ]?\ *   # first, match an optional sign
      (             # then match integers or f.p. mantissas:
          \d+       # start out with a ...
          (
              \.\d* # mantissa of the form a.b or a.
          )?        # ? takes care of integers of the form a
         |\.\d+     # mantissa of the form .b
      )
      ( [ e E ] [ + - ]? \d+ )?  # finally, optionally match an exponent
   $/xx;

This doesn't really improve the legibility of this example, but it's
available in case you want it.  Squashing the pattern down to the
compact form, we have

    /^[+-]?\ *(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?$/;

This is our final regexp.  To recap, we built a regexp by

=over 4

=item *

specifying the task in detail,

=item *

breaking down the problem into smaller parts,

=item *

translating the small parts into regexps,

=item *

combining the regexps,

=item *

and optimizing the final combined regexp.

=back

These are also the typical steps involved in writing a computer
program.  This makes perfect sense, because regular expressions are
essentially programs written in a little computer language that specifies
patterns.
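
As a quick sanity check, the compact regexp can be run against the
sample numbers listed earlier.  The following is only a sketch: the
variable names and the list of rejected strings are illustrative
assumptions, and C<qr//> (covered in Part 2) is used merely to store
the regexp in a variable.

```perl
use strict;
use warnings;

# The compact number regexp from this section, stored with qr//.
my $number_re = qr/^[+-]?\ *(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?$/;

# Strings the tutorial lists as proper numbers, plus a plain integer.
my @candidates = ('123.', '0.345', '.34', '-1e6', '25.4E-72', '42');

# Illustrative strings that should be rejected (an assumption of
# this sketch, not a list taken from the tutorial).
my @junk = ('.', 'abc', '1e', '1.2.3', '');

my @accepted = grep { $_ =~ $number_re } @candidates;
my @rejected = grep { $_ !~ $number_re } @junk;

print "accepted: @accepted\n";
print "rejected: ", scalar(@rejected), " of ", scalar(@junk), "\n";
```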

=head2 Using regular expressions in Perl

The last topic of Part 1 briefly covers how regexps are used in Perl
programs.  Where do they fit into Perl syntax?

We have already introduced the matching operator in its default
C</regexp/> and arbitrary delimiter C<m!regexp!> forms.  We have used
the binding operator C<=~> and its negation C<!~> to test for string
matches.  Associated with the matching operator, we have discussed the
single line C</s>, multi-line C</m>, case-insensitive C</i> and
extended C</x> modifiers.  There are a few more things you might
want to know about matching operators.

=head3 Prohibiting substitution

Ordinarily, variables in a regexp are substituted before the regexp
is matched.  If you don't want any substitutions at all, use the
special delimiter C<m''>:

    @pattern = ('Seuss');
    while (<>) {
        print if m'@pattern';  # matches literal '@pattern', not 'Seuss'
    }

Similar to strings, C<m''> acts like apostrophes on a regexp; all other
C<'m'> delimiters act like quotes.  If the regexp evaluates to the empty string,
the regexp in the I<last successful match> is used instead.  So we have

    "dog" =~ /d/;  # 'd' matches
    "dogbert" =~ //;  # this matches the 'd' regexp used before

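The two behaviors described above can be checked side by side.  This
is a minimal sketch; the sample strings and variable names are made up
for illustration.

```perl
use strict;
use warnings;

my @pattern = ('Seuss');
my $line    = 'Dr. Seuss';
my $literal = 'send mail to @pattern';   # single quotes: no interpolation

# Ordinary delimiters interpolate @pattern, so the regexp is 'Seuss'.
my $interp_hit = ($line =~ m/@pattern/) ? 1 : 0;

# With m'' the regexp is the literal text '@pattern'.
my $quoted_on_line    = ($line    =~ m'@pattern') ? 1 : 0;
my $quoted_on_literal = ($literal =~ m'@pattern') ? 1 : 0;

# An empty regexp reuses the last successfully matched regexp, here /d/.
"dog" =~ /d/;
my $reuse_hit  = ("dogbert" =~ //) ? 1 : 0;   # 'dogbert' contains a 'd'
my $reuse_miss = ("cat"     =~ //) ? 1 : 0;   # 'cat' does not

print "$interp_hit $quoted_on_line $quoted_on_literal ",
      "$reuse_hit $reuse_miss\n";
```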

=head3 Global matching

The final two modifiers we will discuss here,
C</g> and C</c>, concern multiple matches.
The modifier C</g> stands for global matching and allows the
matching operator to match within a string as many times as possible.
In scalar context, successive invocations against a string will have
C</g> jump from match to match, keeping track of position in the
string as it goes along.  You can get or set the position with the
C<pos()> function.

The use of C</g> is shown in the following example.  Suppose we have
a string that consists of words separated by spaces.  If we know how
many words there are in advance, we could extract the words using
groupings:

    $x = "cat dog house"; # 3 words
    $x =~ /^\s*(\w+)\s+(\w+)\s+(\w+)\s*$/; # matches,
                                           # $1 = 'cat'
                                           # $2 = 'dog'
                                           # $3 = 'house'

But what if we had an indeterminate number of words? This is the sort
of task C</g> was made for.  To extract all words, form the simple
regexp C<(\w+)> and loop over all matches with C</(\w+)/g>:

    while ($x =~ /(\w+)/g) {
        print "Word is $1, ends at position ", pos $x, "\n";
    }

prints

    Word is cat, ends at position 3
    Word is dog, ends at position 7
    Word is house, ends at position 13

A failed match or changing the target string resets the position.  If
you don't want the position reset after failure to match, add the
C</c>, as in C</regexp/gc>.  The current position in the string is
associated with the string, not the regexp.  This means that different
strings have different positions and their respective positions can be
set or read independently.
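
The effect of C</c> on the stored position can be observed directly
with C<pos()>.  The following sketch uses illustrative variable names;
the comments state what each step is expected to leave in the position.

```perl
use strict;
use warnings;

my $x = "cat dog house";

$x =~ /\w+/g;                  # scalar-context /g: matches 'cat'
my $after_cat = pos $x;        # position just past 'cat'

$x =~ /\d+/g;                  # fails, so the position is reset
my $after_fail = pos $x;       # undefined again

$x =~ /\w+/g;                  # starts over and matches 'cat' again
$x =~ /\d+/gc;                 # fails, but /c keeps the position
my $after_fail_c = pos $x;     # unchanged by the failure

pos($x) = 4;                   # set the position by hand
$x =~ /(\w+)/g;                # search resumes at offset 4
my $word = $1;

print "next word from offset 4 is $word\n";
```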

In list context, C</g> returns a list of matched groupings, or if
there are no groupings, a list of matches to the whole regexp.  So if
we wanted just the words, we could use

    @words = ($x =~ /(\w+)/g);  # matches,
                                # $words[0] = 'cat'
                                # $words[1] = 'dog'
                                # $words[2] = 'house'
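
When the regexp has more than one grouping, list-context C</g> returns
the groupings of every match in order, which is convenient for filling
a hash.  A small sketch, with an invented C<$config> string:

```perl
use strict;
use warnings;

my $config = "width=80 height=24 depth=8";

# Two groupings per match: the list is (key, value, key, value, ...).
my %setting = $config =~ /(\w+)=(\d+)/g;

# No groupings: the list holds each whole match.
my @numbers = $config =~ /\d+/g;

print "$_ => $setting{$_}\n" for sort keys %setting;
print "numbers: @numbers\n";
```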

Closely associated with the C</g> modifier is the C<\G> anchor.  The
C<\G> anchor matches at the point where the previous C</g> match left
off.  C<\G> allows us to easily do context-sensitive matching:

    $metric = 1;  # use metric units
    ...
    $x = <FILE>;  # read in measurement
    $x =~ /^([+-]?\d+)\s*/g;  # get magnitude
    $weight = $1;
    if ($metric) { # error checking
        print "Units error!" unless $x =~ /\Gkg\./g;
    }
    else {
        print "Units error!" unless $x =~ /\Glbs\./g;
    }
    $x =~ /\G\s+(widget|sprocket)/g;  # continue processing

The combination of C</g> and C<\G> allows us to process the string a
bit at a time and use arbitrary Perl logic to decide what to do next.
Currently, the C<\G> anchor is only fully supported when used to anchor
to the start of the pattern.
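
One common way to exploit this is a small tokenizer that tries several
C<\G>-anchored regexps in turn, using C</gc> so that a failed attempt
does not lose the position.  The token names and input below are
assumptions made for this sketch.

```perl
use strict;
use warnings;

my $input = "12 kg. widget";
my @tokens;

pos($input) = 0;
while (pos($input) < length $input) {
    if    ($input =~ /\G\s+/gc)        { next }                      # skip spaces
    elsif ($input =~ /\G([+-]?\d+)/gc) { push @tokens, "NUM($1)" }   # a magnitude
    elsif ($input =~ /\G(\w+)\.?/gc)   { push @tokens, "WORD($1)" }  # unit or name
    else                               { last }                      # unknown input
}

print join(' ', @tokens), "\n";
```

Each branch either consumes some of the string or leaves the position
untouched, so the loop always makes progress or stops.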

C<\G> is also invaluable in processing fixed-length records with
regexps.  Suppose we have a snippet of coding region DNA, encoded as
base pair letters C<ATCGTTGAAT...> and we want to find all the stop
codons C<TGA>.  In a coding region, codons are 3-letter sequences, so
we can think of the DNA snippet as a sequence of 3-letter records.  The
naive regexp

    # expanded, this is "ATC GTT GAA TGC AAA TGA CAT GAC"
    $dna = "ATCGTTGAATGCAAATGACATGAC";
    $dna =~ /TGA/;

doesn't work; it may match a C<TGA>, but there is no guarantee that
the match is aligned with codon boundaries, I<e.g.>, the substring
S<C<GTT GAA>> gives a match.  A better solution is

    while ($dna =~ /(\w\w\w)*?TGA/g) {  # note the minimal *?
        print "Got a TGA stop codon at position ", pos $dna, "\n";
    }

which prints

    Got a TGA stop codon at position 18
    Got a TGA stop codon at position 23

Position 18 is good, but position 23 is bogus.  What happened?

The answer is that our regexp works well until we get past the last
real match.  Then the regexp will fail to match a synchronized C<TGA>
and start stepping ahead one character position at a time, not what we
want.  The solution is to use C<\G> to anchor the match to the codon
alignment:

    while ($dna =~ /\G(\w\w\w)*?TGA/g) {
        print "Got a TGA stop codon at position ", pos $dna, "\n";
    }

This prints

    Got a TGA stop codon at position 18

which is the correct answer.  This example illustrates that it is
important not only to match what is desired, but to reject what is not
desired.

(There are other regexp modifiers that are available, such as
C</o>, but their specialized uses are beyond the
scope of this introduction.)

=head3 Search and replace

Regular expressions also play a big role in I<search and replace>
operations in Perl.  Search and replace is accomplished with the
C<s///> operator.  The general form is
C<s/regexp/replacement/modifiers>, with everything we know about
regexps and modifiers applying in this case as well.  The
I<replacement> is a Perl double-quoted string that replaces in the
string whatever is matched with the C<regexp>.  The operator C<=~> is
also used here to associate a string with C<s///>.  If matching
against C<$_>, the S<C<$_ =~>> can be dropped.  If there is a match,
C<s///> returns the number of substitutions made; otherwise it returns
false.  Here are a few examples:

    $x = "Time to feed the cat!";
    $x =~ s/cat/hacker/;   # $x contains "Time to feed the hacker!"
    if ($x =~ s/^(Time.*hacker)!$/$1 now!/) {
        $more_insistent = 1;
    }
    $y = "'quoted words'";
    $y =~ s/^'(.*)'$/$1/;  # strip single quotes,
                           # $y contains "quoted words"

In the last example, the whole string was matched, but only the part
inside the single quotes was grouped.  With the C<s///> operator, the
matched variables C<$1>, C<$2>, I<etc>. are immediately available for use
in the replacement expression, so we use C<$1> to replace the quoted
string with just what was quoted.  With the global modifier, C<s///g>
will search and replace all occurrences of the regexp in the string:

    $x = "I batted 4 for 4";
    $x =~ s/4/four/;   # doesn't do it all:
                       # $x contains "I batted four for 4"
    $x = "I batted 4 for 4";
    $x =~ s/4/four/g;  # does it all:
                       # $x contains "I batted four for four"
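
Because C<s///> returns the number of substitutions made, the return
value can be used to count replacements.  A brief sketch with
illustrative variable names:

```perl
use strict;
use warnings;

my $x = "I batted 4 for 4";

my $changed   = ($x =~ s/4/four/g);   # number of substitutions made
my $unchanged = ($x =~ s/5/five/g);   # no match: a false value

print "replaced $changed occurrence(s)\n";
print "nothing to replace\n" unless $unchanged;
```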

If you prefer "regex" over "regexp" in this tutorial, you could use
the following program to replace it:

    % cat > simple_replace
    #!/usr/bin/perl
    $regexp = shift;
    $replacement = shift;
    while (<>) {
        s/$regexp/$replacement/g;
        print;
    }
    ^D

    % simple_replace regexp regex perlretut.pod

In C<simple_replace> we used the C<s///g> modifier to replace all
occurrences of the regexp on each line.  (Even though the regular
expression appears in a loop, Perl is smart enough to compile it
only once.)  As with C<simple_grep>, both the
C<print> and the C<s/$regexp/$replacement/g> use C<$_> implicitly.

If you don't want C<s///> to change your original variable you can use
the non-destructive substitute modifier, C<s///r>.  This changes the
behavior so that C<s///r> returns the final substituted string
(instead of the number of substitutions):

    $x = "I like dogs.";
    $y = $x =~ s/dogs/cats/r;
    print "$x $y\n";

That example will print "I like dogs. I like cats.". Notice the original
C<$x> variable has not been affected. The overall
result of the substitution is instead stored in C<$y>. If the
substitution doesn't affect anything then the original string is
returned:

    $x = "I like dogs.";
    $y = $x =~ s/elephants/cougars/r;
    print "$x $y\n"; # prints "I like dogs. I like dogs."

One other interesting thing that the C<s///r> flag allows is chaining
substitutions:

    $x = "Cats are great.";
    print $x =~ s/Cats/Dogs/r =~ s/Dogs/Frogs/r =~
        s/Frogs/Hedgehogs/r, "\n";
    # prints "Hedgehogs are great."
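
C<s///r> is also handy inside C<map>, where it transforms each element
without touching the original list.  This sketch assumes a Perl recent
enough to support C</r>; the data is invented.

```perl
use strict;
use warnings;

my @raw = ('  Calvin ', "Hobbes\t");

# /r returns the modified copy, so @raw itself is left alone.
my @trimmed = map { s/^\s+|\s+$//gr } @raw;

print join('|', @trimmed), "\n";
```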

A modifier available specifically to search and replace is the
C<s///e> evaluation modifier.  C<s///e> treats the
replacement text as Perl code, rather than a double-quoted
string.  The value that the code returns is substituted for the
matched substring.  C<s///e> is useful if you need to do a bit of
computation in the process of replacing text.  This example counts
character frequencies in a line:

    $x = "Bill the cat";
    $x =~ s/(.)/$chars{$1}++;$1/eg; # final $1 replaces char with itself
    print "frequency of '$_' is $chars{$_}\n"
        foreach (sort {$chars{$b} <=> $chars{$a}} keys %chars);

This prints

    frequency of ' ' is 2
    frequency of 't' is 2
    frequency of 'l' is 2
    frequency of 'B' is 1
    frequency of 'c' is 1
    frequency of 'e' is 1
    frequency of 'h' is 1
    frequency of 'i' is 1
    frequency of 'a' is 1
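
A second, smaller illustration of C<s///e>: the replacement is an
arithmetic expression evaluated for each match.  The string and
variable names are made up for this sketch.

```perl
use strict;
use warnings;

my $order = "3 apples and 5 pears";

# Copy first, then substitute on the copy; $1 * 2 is run as Perl code.
(my $doubled = $order) =~ s/(\d+)/$1 * 2/ge;

print "$doubled\n";
```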

As with the match C<m//> operator, C<s///> can use other delimiters,
such as C<s!!!> and C<s{}{}>, and even C<s{}//>.  If single quotes are
used C<s'''>, then the regexp and replacement are
treated as single-quoted strings and there are no
variable substitutions.  C<s///> in list context
returns the same thing as in scalar context, I<i.e.>, the number of
matches.

=head3 The split function

The C<split()> function is another place where a regexp is used.
C<split /regexp/, string, limit> separates the C<string> operand into
a list of substrings and returns that list.  The regexp must be designed
to match whatever constitutes the separators for the desired substrings.
The C<limit>, if present, constrains splitting into no more than C<limit>
number of strings.  For example, to split a string into words, use

    $x = "Calvin and Hobbes";
    @words = split /\s+/, $x;  # $words[0] = 'Calvin'
                               # $words[1] = 'and'
                               # $words[2] = 'Hobbes'

If the empty regexp C<//> is used, the regexp always matches and
the string is split into individual characters.  If the regexp has
groupings, then the resulting list contains the matched substrings from the
groupings as well.  For instance,

    $x = "/usr/bin/perl";
    @dirs = split m!/!, $x;  # $dirs[0] = ''
                             # $dirs[1] = 'usr'
                             # $dirs[2] = 'bin'
                             # $dirs[3] = 'perl'
    @parts = split m!(/)!, $x;  # $parts[0] = ''
                                # $parts[1] = '/'
                                # $parts[2] = 'usr'
                                # $parts[3] = '/'
                                # $parts[4] = 'bin'
                                # $parts[5] = '/'
                                # $parts[6] = 'perl'

Since the first character of C<$x> matched the regexp, C<split> prepended
an empty initial element to the list.
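
The C<limit> argument and the empty regexp can be tried out as
follows.  This is a minimal sketch with invented data.

```perl
use strict;
use warnings;

# With a limit of 2, the remainder of the string stays in one piece.
my @two = split /,/, "a,b,c,d", 2;

# The empty regexp splits into individual characters.
my @chars = split //, "perl";

# The separator regexp may absorb surrounding whitespace.
my @fields = split /\s*;\s*/, "x ; y;z";

print "two: ", join('|', @two), "\n";
print "chars: @chars\n";
print "fields: @fields\n";
```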

If you have read this far, congratulations! You now have all the basic
tools needed to use regular expressions to solve a wide range of text
processing problems.  If this is your first time through the tutorial,
why not stop here and play around with regexps a while....  S<Part 2>
concerns the more esoteric aspects of regular expressions and those
concepts certainly aren't needed right at the start.

=head1 Part 2: Power tools

OK, you know the basics of regexps and you want to know more.  If
matching regular expressions is analogous to a walk in the woods, then
the tools discussed in Part 1 are analogous to topo maps and a
compass, basic tools we use all the time.  Most of the tools in Part 2
are analogous to flare guns and satellite phones.  They aren't used
too often on a hike, but when we are stuck, they can be invaluable.

What follows are the more advanced, less used, or sometimes esoteric
capabilities of Perl regexps.  In Part 2, we will assume you are
comfortable with the basics and concentrate on the advanced features.

=head2 More on characters, strings, and character classes

There are a number of escape sequences and character classes that we
haven't covered yet.

There are several escape sequences that convert characters or strings
between upper and lower case, and they are also available within
patterns.  C<\l> and C<\u> convert the next character to lower or
upper case, respectively:

    $x = "perl";
    $string =~ /\u$x/;  # matches 'Perl' in $string
    $x = "M(rs?|s)\\."; # note the double backslash
    $string =~ /\l$x/;  # matches 'mr.', 'mrs.', and 'ms.'

A C<\L> or C<\U> indicates a lasting conversion of case, until
terminated by C<\E> or thrown over by another C<\U> or C<\L>:

    $x = "This word is in lower case:\L SHOUT\E";
    $x =~ /shout/;       # matches
    $x = "I STILL KEYPUNCH CARDS FOR MY 360";
    $x =~ /\Ukeypunch/;  # matches punch card string

If there is no C<\E>, case is converted until the end of the
string. The regexps C<\L\u$word> or C<\u\L$word> convert the first
character of C<$word> to uppercase and the rest of the characters to
lowercase.
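
The capitalizing combination can be tried both in a double-quoted
string and in a regexp.  A small sketch; C<$word> and the test string
are illustrative.

```perl
use strict;
use warnings;

my $word = "hELLO";

# \L lowercases everything that follows, \u uppercases the first char.
my $tidy = "\u\L$word";

# The same escapes work inside a regexp; \E ends the \L conversion.
my $hit = ("Hello world" =~ /^\u\L$word\E world$/) ? 1 : 0;

print "$tidy (regexp match: $hit)\n";
```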

Control characters can be escaped with C<\c>, so that a control-Z
character would be matched with C<\cZ>.  The escape sequence
C<\Q>...C<\E> quotes, or protects, most non-alphabetic characters.  For
instance,

    $x = "\QThat !^*&%~& cat!";
    $x =~ /\Q!^*&%~&\E/;  # check for rough language

It does not protect C<'$'> or C<'@'>, so that variables can still be
substituted.

C<\Q>, C<\L>, C<\l>, C<\U>, C<\u> and C<\E> are actually part of
double-quotish syntax, and not part of regexp syntax proper.  They will
work if they appear in a regular expression embedded directly in a
program, but not when contained in a string that is interpolated in a
pattern.
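
A typical use of C<\Q> is matching a variable's contents literally.
In this sketch the variable names and strings are assumptions chosen
to show the difference.

```perl
use strict;
use warnings;

my $literal = "a.b*c";

# Without \Q the '.' and '*' are metacharacters.
my $loose_hit   = ("aXbbbc" =~ /^$literal$/)      ? 1 : 0;

# With \Q...\E they are matched as ordinary characters.
my $quoted_miss = ("aXbbbc" =~ /^\Q$literal\E$/) ? 1 : 0;
my $quoted_hit  = ("a.b*c"  =~ /^\Q$literal\E$/) ? 1 : 0;

print "$loose_hit $quoted_miss $quoted_hit\n";
```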

Perl regexps can handle more than just the
standard ASCII character set.  Perl supports I<Unicode>, a standard
for representing the alphabets from virtually all of the world's written
languages, and a host of symbols.  Perl's text strings are Unicode strings, so
they can contain characters with a value (codepoint or character number) higher
than 255.

What does this mean for regexps? Well, regexp users don't need to know
much about Perl's internal representation of strings.  But they do need
to know 1) how to represent Unicode characters in a regexp and 2) that
a matching operation will treat the string to be searched as a sequence
of characters, not bytes.  The answer to 1) is that Unicode characters
greater than C<chr(255)> are represented using the C<\x{hex}> notation, because
C<\x>I<XY> (without curly braces and I<XY> are two hex digits) doesn't
go further than 255.  (Starting in Perl 5.14, if you're an octal fan,
you can also use C<\o{oct}>.)

    /\x{263a}/;  # match a Unicode smiley face :)
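
To see that matching works on characters rather than bytes, one can
inspect a string holding such a character.  A minimal sketch with
illustrative names; nothing is printed, to avoid needing an output
encoding layer.

```perl
use strict;
use warnings;

my $s = "smile \x{263a}";

my $len = length $s;                    # counts characters, not bytes
my $hit = ($s =~ /\x{263a}$/) ? 1 : 0;  # the smiley ends the string

# '.' matches the smiley as one character.
my ($last) = $s =~ /(.)$/;
my $code   = ord $last;

printf "length %d, last code point U+%04X\n", $len, $code;
```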

B<NOTE>: In Perl 5.6.0 it used to be that one needed to say C<use
utf8> to use any Unicode features.  This is no longer the case: for
almost all Unicode processing, the explicit C<utf8> pragma is not
needed.  (The only case where it matters is if your Perl script is in
Unicode and encoded in UTF-8, then an explicit C<use utf8> is needed.)

Figuring out the hexadecimal sequence of a Unicode character you want
or deciphering someone else's hexadecimal Unicode regexp is about as
much fun as programming in machine code.  So another way to specify
Unicode characters is to use the I<named character> escape
sequence C<\N{I<name>}>.  I<name> is a name for the Unicode character, as
specified in the Unicode standard.  For instance, if we wanted to
represent or match the astrological sign for the planet Mercury, we
could use

    $x = "abc\N{MERCURY}def";
    $x =~ /\N{MERCURY}/;   # matches

One can also use "short" names:

    print "\N{GREEK SMALL LETTER SIGMA} is called sigma.\n";
    print "\N{greek:Sigma} is an upper-case sigma.\n";

You can also restrict names to a certain alphabet by specifying the
L<charnames> pragma:

    use charnames qw(greek);
    print "\N{sigma} is Greek sigma\n";

An index of character names is available on-line from the Unicode
Consortium, L<http://www.unicode.org/charts/charindex.html>; explanatory
material with links to other resources at
L<http://www.unicode.org/standard/where>.

The answer to requirement 2) is that a regexp (mostly)
uses Unicode characters.  The "mostly" is for messy backward
compatibility reasons, but starting in Perl 5.14, any regexp compiled in
the scope of a C<use feature 'unicode_strings'> (which is automatically
turned on within the scope of a C<use 5.012> or higher) will turn that
"mostly" into "always".  If you want to handle Unicode properly, you
should ensure that C<'unicode_strings'> is turned on.
Internally, this is encoded to bytes using either UTF-8 or a native 8
bit encoding, depending on the history of the string, but conceptually
it is a sequence of characters, not bytes. See L<perlunitut> for a
tutorial about that.

Let us now discuss Unicode character classes, most usually called
"character properties".  These are represented by the C<\p{I<name>}>
escape sequence.  The negation of this is C<\P{I<name>}>.  For example,
to match lower and uppercase characters,

    $x = "BOB";
    $x =~ /^\p{IsUpper}/;   # matches, uppercase char class
    $x =~ /^\P{IsUpper}/;   # doesn't match, char class sans uppercase
    $x =~ /^\p{IsLower}/;   # doesn't match, lowercase char class
    $x =~ /^\P{IsLower}/;   # matches, char class sans lowercase

(The "C<Is>" is optional.)

There are many, many Unicode character properties.  For the full list
see L<perluniprops>.  Most of them have synonyms with shorter names,
also listed there.  Some synonyms are a single character.  For these,
you can drop the braces.  For instance, C<\pM> is the same thing as
C<\p{Mark}>, meaning things like accent marks.
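
Properties can be combined with C</g> to pick characters out of a
string.  In this sketch the sample strings are assumptions; they are
written with C<\x{...}> escapes so that the source stays plain ASCII.

```perl
use strict;
use warnings;

# E-acute + "cole", capital Sigma + "igma", and some plain ASCII.
my $s = "\x{c9}cole \x{3a3}igma abc";

# Every uppercase character, whatever its script.
my @upper = $s =~ /\p{Upper}/g;

# \pM is the single-character synonym for \p{Mark}.
my @marks = "e\x{301}" =~ /\pM/g;     # 'e' + COMBINING ACUTE ACCENT

print scalar(@upper), " uppercase, ", scalar(@marks), " mark\n";
```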

The Unicode C<\p{Script}> and C<\p{Script_Extensions}> properties are
used to categorize every Unicode character into the language script it
is written in.  (C<Script_Extensions> is an improved version of
C<Script>, which is retained for backward compatibility, and so you
should generally use C<Script_Extensions>.)
For example,
English, French, and a bunch of other European languages are written in
the Latin script.  But there is also the Greek script, the Thai script,
the Katakana script, I<etc>.  You can test whether a character is in a
particular script (based on C<Script_Extensions>) with, for example
C<\p{Latin}>, C<\p{Greek}>, or C<\p{Katakana}>.  To test if it isn't in
the Balinese script, you would use C<\P{Balinese}>.

What we have described so far is the single form of the C<\p{...}> character
classes.  There is also a compound form which you may run into.  These
look like C<\p{I<name>=I<value>}> or C<\p{I<name>:I<value>}> (the equals sign and colon
can be used interchangeably).  These are more general than the single form,
and in fact most of the single forms are just Perl-defined shortcuts for common
compound forms.  For example, the script examples in the previous paragraph
could be written equivalently as C<\p{Script_Extensions=Latin}>, C<\p{Script_Extensions:Greek}>,
C<\p{script_extensions=katakana}>, and C<\P{script_extensions=balinese}> (case is irrelevant
between the C<{}> braces).  You may
never have to use the compound forms, but sometimes it is necessary, and their
use can make your code easier to understand.

C<\X> is an abbreviation for a character class that comprises
a Unicode I<extended grapheme cluster>.  This represents a "logical character":
what appears to be a single character, but may be represented internally by more
than one.  As an example, using the Unicode full names, "S<A + COMBINING
RING>" is a grapheme cluster with base character "A" and combining character
"S<COMBINING RING>", which translates in Danish to "A" with the circle atop it,
as in the word E<Aring>ngstrom.
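
The difference between characters and grapheme clusters shows up when
counting.  A short sketch, with the string spelled using an explicit
combining character (an assumption made for illustration):

```perl
use strict;
use warnings;

# "A" followed by U+030A COMBINING RING ABOVE, then "ngstrom".
my $s = "A\x{30a}ngstrom";

my $chars    = length $s;          # code points
my @clusters = $s =~ /\X/g;        # logical characters

print "$chars characters, ", scalar(@clusters), " grapheme clusters\n";
```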

For the full and latest information about Unicode see the latest
Unicode standard, or the Unicode Consortium's website L<http://www.unicode.org>.

As if all those classes weren't enough, Perl also defines POSIX-style
character classes.  These have the form C<[:I<name>:]>, with I<name> the
name of the POSIX class.  The POSIX classes are C<alpha>, C<alnum>,
C<ascii>, C<cntrl>, C<digit>, C<graph>, C<lower>, C<print>, C<punct>,
C<space>, C<upper>, and C<xdigit>, and two extensions, C<word> (a Perl
extension to match C<\w>), and C<blank> (a GNU extension).  The C</a>
modifier restricts these to matching just in the ASCII range; otherwise
they can match the same as their corresponding Perl Unicode classes:
C<[:upper:]> is the same as C<\p{IsUpper}>, I<etc>.  (There are some
exceptions and gotchas with this; see L<perlrecharclass> for a full
discussion.) The C<[:digit:]>, C<[:word:]>, and
C<[:space:]> correspond to the familiar C<\d>, C<\w>, and C<\s>
character classes.  To negate a POSIX class, put a C<'^'> in front of
the name, so that, I<e.g.>, C<[:^digit:]> corresponds to C<\D> and, under
Unicode, C<\P{IsDigit}>.  The Unicode and POSIX character classes can
be used just like C<\d>, with the exception that POSIX character
classes can only be used inside of a character class:

    /\s+[abc[:digit:]xyz]\s*/;  # match a,b,c,x,y,z, or a digit
    /^=item\s[[:digit:]]/;      # match '=item',
                                # followed by a space and a digit
    /\s+[abc\p{IsDigit}xyz]\s+/;  # match a,b,c,x,y,z, or a digit
    /^=item\s\p{IsDigit}/;        # match '=item',
                                  # followed by a space and a digit
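
The POSIX classes, including a negated one, can be exercised on a
short string.  This is only a sketch with invented data.

```perl
use strict;
use warnings;

my $s = "item 42, page 7";

my @digit_runs = $s =~ /[[:digit:]]+/g;     # runs of digits
my @words      = $s =~ /[[:alpha:]]+/g;     # runs of letters
my ($lead)     = $s =~ /^([[:^digit:]]+)/;  # everything before a digit

print "digits: @digit_runs\n";
print "words: @words\n";
print "lead: '$lead'\n";
```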

Whew! That is all the rest of the characters and character classes.

=head2 Compiling and saving regular expressions

In Part 1 we mentioned that Perl compiles a regexp into a compact
sequence of opcodes.  Thus, a compiled regexp is a data structure
that can be stored once and used again and again.  The regexp quote
C<qr//> does exactly that: C<qr/string/> compiles the C<string> as a
regexp and transforms the result into a form that can be assigned to a
variable:

    $reg = qr/foo+bar?/;  # $reg contains a compiled regexp

Then C<$reg> can be used as a regexp:

    $x = "fooooba";
    $x =~ $reg;     # matches, just like /foo+bar?/
    $x =~ /$reg/;   # same thing, alternate form

C<$reg> can also be interpolated into a larger regexp:

    $x =~ /(abc)?$reg/;  # still matches

As with the matching operator, the regexp quote can use different
delimiters, I<e.g.>, C<qr!!>, C<qr{}> or C<qr~~>.  Apostrophes
as delimiters (C<qr''>) inhibit any interpolation.
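
Compiled regexps compose well, and each one keeps the modifiers it
was compiled with.  The following sketch uses illustrative names and
strings.

```perl
use strict;
use warnings;

my $word = qr/\w+/;
my $pair = qr/($word)=($word)/;       # built from a smaller regexp

# In list context a match returns its groupings.
my ($key, $value) = "colour=blue" =~ $pair;

# The /i given to qr// travels with the compiled regexp.
my $ci  = qr/perl/i;
my $hit = ("I like PERL" =~ /like $ci/) ? 1 : 0;

print "$key -> $value, case-insensitive hit: $hit\n";
```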

Pre-compiled regexps are useful for creating dynamic matches that
don't need to be recompiled each time they are encountered.  Using
pre-compiled regexps, we write a C<grep_step> program which greps
for a sequence of patterns, advancing to the next pattern as soon
as one has been satisfied.

    % cat > grep_step
    #!/usr/bin/perl
    # grep_step - match <number> regexps, one after the other
    # usage: grep_step <number> regexp1 regexp2 ... file1 file2 ...

    $number = shift;
    $regexp[$_] = shift foreach (0..$number-1);
    @compiled = map qr/$_/, @regexp;
    while ($line = <>) {
        if ($line =~ /$compiled[0]/) {
            print $line;
            shift @compiled;
            last unless @compiled;
        }
    }
    ^D

    % grep_step 3 shift print last grep_step
    $number = shift;
            print $line;
            last unless @compiled;

Storing pre-compiled regexps in an array C<@compiled> allows us to
simply loop through the regexps without any recompilation, thus gaining
flexibility without sacrificing speed.


=head2 Composing regular expressions at runtime

Backtracking is more efficient than repeated tries with different regular
expressions.  If there are several regular expressions and a match with
any of them is acceptable, then it is possible to combine them into a set
of alternatives.  If the individual expressions are input data, this
can be done by programming a join operation.  We'll exploit this idea in
an improved version of the C<simple_grep> program: a program that matches
multiple patterns:

    % cat > multi_grep
    #!/usr/bin/perl
    # multi_grep - match any of <number> regexps
    # usage: multi_grep <number> regexp1 regexp2 ... file1 file2 ...

    $number = shift;
    $regexp[$_] = shift foreach (0..$number-1);
    $pattern = join '|', @regexp;

    while ($line = <>) {
        print $line if $line =~ /$pattern/;
    }
    ^D

    % multi_grep 2 shift for multi_grep
    $number = shift;
    $regexp[$_] = shift foreach (0..$number-1);
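
The same join technique can be tried without files or command-line
arguments.  In this sketch the patterns and the in-memory lines are
assumptions standing in for real input.

```perl
use strict;
use warnings;

my @regexp  = ('shift', 'for');
my $pattern = join '|', @regexp;      # 'shift|for'

my @lines = (
    '$number = shift;',
    'print $line;',
    'foreach my $i (1..3) {',
);

# One pass over the data with the combined regexp.
my @hits = grep { /$pattern/ } @lines;

print "$_\n" for @hits;
```

If the joined pattern is embedded in a larger regexp, wrapping it in a
non-capturing group, as in C<(?:$pattern)>, keeps the alternation from
swallowing its neighbors.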

Sometimes it is advantageous to construct a pattern from the I<input>
that is to be analyzed and use the permissible values on the left
hand side of the matching operations.  As an example for this somewhat
paradoxical situation, let's assume that our input contains a command
verb which should match one out of a set of available command verbs,
with the additional twist that commands may be abbreviated as long as
the given string is unique. The program below demonstrates the basic
algorithm.

    % cat > keymatch
    #!/usr/bin/perl
    $kwds = 'copy compare list print';
    while( $cmd = <> ){
        $cmd =~ s/^\s+|\s+$//g;  # trim leading and trailing spaces
        if( ( @matches = $kwds =~ /\b$cmd\w*/g ) == 1 ){
            print "command: '@matches'\n";
        } elsif( @matches == 0 ){
            print "no such command: '$cmd'\n";
        } else {
            print "not unique: '$cmd' (could be one of: @matches)\n";
        }
    }
    ^D

    % keymatch
    li
    command: 'list'
    co
    not unique: 'co' (could be one of: copy compare)
    printer
    no such command: 'printer'

Rather than trying to match the input against the keywords, we match the
combined set of keywords against the input.  The pattern matching
operation S<C<$kwds =~ /\b$cmd\w*/g>> does several things at the
same time. It makes sure that the given command begins where a keyword
begins (C<\b>). It tolerates abbreviations due to the added C<\w*>. It
tells us the number of matches (C<scalar @matches>) and all the keywords
that were actually matched.  You could hardly ask for more.
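
The matching step can also be wrapped in a subroutine.  This is a
sketch, not the program above: the name C<candidates> is hypothetical,
and C<\Q...\E> is added here as an extra precaution so that
metacharacters typed by the user are matched literally.

```perl
use strict;
use warnings;

# Return every keyword in $kwds that the abbreviation $cmd could mean.
sub candidates {
    my ($cmd, $kwds) = @_;
    $cmd =~ s/^\s+|\s+$//g;                 # trim, as in keymatch
    return $kwds =~ /\b\Q$cmd\E\w*/g;       # list of matching keywords
}

my $kwds = 'copy compare list print';

my @li      = candidates('li',      $kwds);
my @co      = candidates("co\n",    $kwds);
my @printer = candidates('printer', $kwds);

print "li: @li\n";
print "co: @co\n";
print "printer: ", scalar(@printer), " match(es)\n";
```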

=head2 Embedding comments and modifiers in a regular expression

Starting with this section, we will be discussing Perl's set of
I<extended patterns>.  These are extensions to the traditional regular
expression syntax that provide powerful new tools for pattern
matching.  We have already seen extensions in the form of the minimal
matching constructs C<??>, C<*?>, C<+?>, C<{n,m}?>, and C<{n,}?>.  Most
of the extensions below have the form C<(?char...)>, where the
C<char> is a character that determines the type of extension.

The first extension is an embedded comment C<(?#text)>.  This embeds a
comment into the regular expression without affecting its meaning.  The
comment should not have any closing parentheses in the text.  An
example is

    /(?# Match an integer:)[+-]?\d+/;

This style of commenting has been largely superseded by the raw,
freeform commenting that is allowed with the C</x> modifier.

Most modifiers, such as C</i>, C</m>, C</s> and C</x> (or any
combination thereof) can also be embedded in
a regexp using C<(?i)>, C<(?m)>, C<(?s)>, and C<(?x)>.  For instance,

    /(?i)yes/;  # match 'yes' case insensitively
    /yes/i;     # same thing
    /(?x)(          # freeform version of an integer regexp
             [+-]?  # match an optional sign
             \d+    # match a sequence of digits
         )
    /x;

Embedded modifiers can have two important advantages over the usual
modifiers.  Embedded modifiers allow a custom set of modifiers for
I<each> regexp pattern.  This is great for matching an array of regexps
that must have different modifiers:

    $pattern[0] = '(?i)doctor';
    $pattern[1] = 'Johnson';
    ...
    while (<>) {
        foreach $patt (@pattern) {
            print if /$patt/;
        }
    }

The second advantage is that embedded modifiers (except C</p>, which
modifies the entire regexp) only affect the regexp
inside the group the embedded modifier is contained in.  So grouping
can be used to localize the modifier's effects:

    /Answer: ((?i)yes)/;  # matches 'Answer: yes', 'Answer: YES', etc.

Embedded modifiers can also turn off any modifiers already present
by using, I<e.g.>, C<(?-i)>.  Modifiers can also be combined into
a single expression, I<e.g.>, C<(?s-i)> turns on single line mode and
turns off case insensitivity.

Embedded modifiers may also be added to a non-capturing grouping.
C<(?i-m:regexp)> is a non-capturing grouping that matches C<regexp>
case insensitively and turns off multi-line mode.
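
Here is a small sketch of a modifier scoped to a non-capturing group;
the sample names are invented for illustration.  Only the surname is
matched case insensitively, while the title must appear exactly as
written:

    my @names = ('Dr. Smith', 'Dr. SMITH', 'dr. smith');
    my @hits  = grep { /^Dr\. (?i:smith)$/ } @names;
    # @hits holds 'Dr. Smith' and 'Dr. SMITH'; 'dr. smith' is rejected
    # because case insensitivity is not in effect outside (?i:...).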


=head2 Looking ahead and looking behind

This section concerns the lookahead and lookbehind assertions.  First,
a little background.

In Perl regular expressions, most regexp elements "eat up" a certain
amount of string when they match.  For instance, the regexp element
C<[abc]> eats up one character of the string when it matches, in the
sense that Perl moves to the next character position in the string
after the match.  There are some elements, however, that don't eat up
characters (advance the character position) if they match.  The examples
we have seen so far are the anchors.  The anchor C<'^'> matches the
beginning of the line, but doesn't eat any characters.  Similarly, the
word boundary anchor C<\b> matches wherever a character matching C<\w>
is next to a character that doesn't, but it doesn't eat up any
characters itself.  Anchors are examples of I<zero-width assertions>:
zero-width, because they consume
no characters, and assertions, because they test some property of the
string.  In the context of our walk in the woods analogy to regexp
matching, most regexp elements move us along a trail, but anchors have
us stop a moment and check our surroundings.  If the local environment
checks out, we can proceed forward.  But if the local environment
doesn't satisfy us, we must backtrack.

Checking the environment entails either looking ahead on the trail,
looking behind, or both.  C<'^'> looks behind, to see that there are no
characters before.  C<'$'> looks ahead, to see that there are no
characters after.  C<\b> looks both ahead and behind, to see if the
characters on either side differ in their "word-ness".

The lookahead and lookbehind assertions are generalizations of the
anchor concept.  Lookahead and lookbehind are zero-width assertions
that let us specify which characters we want to test for.  The
lookahead assertion is denoted by C<(?=regexp)> and the lookbehind
assertion is denoted by C<< (?<=fixed-regexp) >>.  Some examples are

    $x = "I catch the housecat 'Tom-cat' with catnip";
    $x =~ /cat(?=\s)/;   # matches 'cat' in 'housecat'
    @catwords = ($x =~ /(?<=\s)cat\w+/g);  # matches,
                                           # $catwords[0] = 'catch'
                                           # $catwords[1] = 'catnip'
    $x =~ /\bcat\b/;  # matches 'cat' in 'Tom-cat'
    $x =~ /(?<=\s)cat(?=\s)/; # doesn't match; no isolated 'cat' in
                              # middle of $x

Note that the parentheses in C<(?=regexp)> and C<< (?<=regexp) >> are
non-capturing, since these are zero-width assertions.  Thus in the
second regexp, the substrings captured are those of the whole regexp
itself.  Lookahead C<(?=regexp)> can match arbitrary regexps, but
lookbehind C<< (?<=fixed-regexp) >> only works for regexps of fixed
width, I<i.e.>, a fixed number of characters long.  Thus
C<< (?<=(ab|bc)) >> is fine, but C<< (?<=(ab)*) >> is not.  The
negated versions of the lookahead and lookbehind assertions are
denoted by C<(?!regexp)> and C<< (?<!fixed-regexp) >> respectively.
They evaluate true if the regexps do I<not> match:

    $x = "foobar";
    $x =~ /foo(?!bar)/;  # doesn't match, 'bar' follows 'foo'
    $x =~ /foo(?!baz)/;  # matches, 'baz' doesn't follow 'foo'
    $x =~ /(?<!\s)foo/;  # matches, there is no \s before 'foo'

Here is an example where a string containing blank-separated words,
numbers and single dashes is to be split into its components.
Using C</\s+/> alone won't work, because spaces are not required between
dashes, or between a dash and a word or number.  Additional places for a
split are established by looking ahead and behind:

    $str = "one two - --6-8";
    @toks = split / \s+              # a run of spaces
                  | (?<=\S) (?=-)    # any non-space followed by '-'
                  | (?<=-)  (?=\S)   # a '-' followed by any non-space
                  /x, $str;          # @toks = qw(one two - - - 6 - 8)
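
Because lookahead and lookbehind consume nothing, a substitution can
use them to insert text at a position without replacing any characters.
The following sketch (the variable names are illustrative only) inserts
thousands separators into a string of digits: a comma goes wherever a
digit precedes and a whole number of three-digit groups follows up to
the end of the string.

    my $n = "1234567";
    (my $pretty = $n) =~ s/(?<=\d)(?=(?:\d{3})+$)/,/g;
    # $pretty is now "1,234,567"; $n is unchanged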


=head2 Using independent subexpressions to prevent backtracking

I<Independent subexpressions> are regular expressions, in the
context of a larger regular expression, that function independently of
the larger regular expression.  That is, they consume as much or as
little of the string as they wish without regard for the ability of
the larger regexp to match.  Independent subexpressions are represented
by C<< (?>regexp) >>.  We can illustrate their behavior by first
considering an ordinary regexp:

    $x = "ab";
    $x =~ /a*ab/;  # matches

This obviously matches, but in the process of matching, the
subexpression C<a*> first grabbed the C<'a'>.  Doing so, however,
wouldn't allow the whole regexp to match, so after backtracking, C<a*>
eventually gave back the C<'a'> and matched the empty string.  Here, what
C<a*> matched was I<dependent> on what the rest of the regexp matched.

Contrast that with an independent subexpression:

    $x =~ /(?>a*)ab/;  # doesn't match!

The independent subexpression C<< (?>a*) >> doesn't care about the rest
of the regexp, so it sees an C<'a'> and grabs it.  Then the rest of the
regexp C<ab> cannot match.  Because C<< (?>a*) >> is independent, there
is no backtracking and the independent subexpression does not give
up its C<'a'>.  Thus the match of the regexp as a whole fails.  A similar
behavior occurs with completely independent regexps:

    $x = "ab";
    $x =~ /a*/g;   # matches, eats an 'a'
    $x =~ /\Gab/g; # doesn't match, no 'a' available

Here C</g> and C<\G> create a "tag team" handoff of the string from
one regexp to the other.  Regexps with an independent subexpression are
much like this, with a handoff of the string to the independent
subexpression, and a handoff of the string back to the enclosing
regexp.

The ability of an independent subexpression to prevent backtracking
can be quite useful.  Suppose we want to match a non-empty string
enclosed in parentheses up to two levels deep.  Then the following
regexp matches:

    $x = "abc(de(fg)h";  # unbalanced parentheses
    $x =~ /\( ( [ ^ () ]+ | \( [ ^ () ]* \) )+ \)/xx;

The regexp matches an open parenthesis, one or more copies of an
alternation, and a close parenthesis.  The alternation is two-way, with
the first alternative C<[^()]+> matching a substring with no
parentheses and the second alternative C<\([^()]*\)>  matching a
substring delimited by parentheses.  The problem with this regexp is
that it is pathological: it has nested indeterminate quantifiers
of the form C<(a+|b)+>.  We discussed in Part 1 how nested quantifiers
like this could take an exponentially long time to execute if there
was no match possible.  To prevent the exponential blowup, we need to
prevent useless backtracking at some point.  This can be done by
enclosing the inner quantifier as an independent subexpression:

    $x =~ /\( ( (?> [ ^ () ]+ ) | \([ ^ () ]* \) )+ \)/xx;

Here, C<< (?>[^()]+) >> breaks the degeneracy of string partitioning
by gobbling up as much of the string as possible and keeping it.   Then
match failures fail much more quickly.
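
The difference in behavior can be checked side by side.  This sketch
also uses the possessive quantifier C<a*+> (available since Perl 5.10),
which L<perlre> describes as equivalent to C<< (?>a*) >>; the variable
names are illustrative only.

    my $x = "ab";
    my $ordinary    = ($x =~ /a*ab/)     ? 1 : 0;  # 1: a* gives back the 'a'
    my $independent = ($x =~ /(?>a*)ab/) ? 1 : 0;  # 0: the 'a' is kept
    my $possessive  = ($x =~ /a*+ab/)    ? 1 : 0;  # 0: same as (?>a*)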


=head2 Conditional expressions

A I<conditional expression> is a form of if-then-else statement
that allows one to choose which patterns are to be matched, based on
some condition.  There are two types of conditional expression:
C<(?(I<condition>)I<yes-regexp>)> and
C<(?(I<condition>)I<yes-regexp>|I<no-regexp>)>.
C<(?(I<condition>)I<yes-regexp>)> is
like an S<C<'if () {}'>> statement in Perl.  If the I<condition> is true,
the I<yes-regexp> will be matched.  If the I<condition> is false, the
I<yes-regexp> will be skipped and Perl will move onto the next regexp
element.  The second form is like an S<C<'if () {} else {}'>> statement
in Perl.  If the I<condition> is true, the I<yes-regexp> will be
matched, otherwise the I<no-regexp> will be matched.

The I<condition> can have several forms.  The first form is simply an
integer in parentheses C<(I<integer>)>.  It is true if the corresponding
backreference C<\I<integer>> matched earlier in the regexp.  The same
thing can be done with a name associated with a capture group, written
as C<<< (E<lt>I<name>E<gt>) >>> or C<< ('I<name>') >>.  The second form is a bare
zero-width assertion C<(?...)>, either a lookahead, a lookbehind, or a
code assertion (discussed in the next section).  The third set of forms
provides tests that return true if the expression is executed within
a recursion (C<(R)>) or is being called from some capturing group,
referenced either by number (C<(R1)>, C<(R2)>,...) or by name
(C<(R&I<name>)>).

The integer or name form of the C<condition> allows us to choose,
with more flexibility, what to match based on what matched earlier in the
regexp. This searches for words of the form C<"$x$x"> or C<"$x$y$y$x">:

    % simple_grep '^(\w+)(\w+)?(?(2)\g2\g1|\g1)$' /usr/dict/words
    beriberi
    coco
    couscous
    deed
    ...
    toot
    toto
    tutu

The lookbehind C<condition> allows, along with backreferences,
an earlier part of the match to influence a later part of the
match.  For instance,

    /[ATGC]+(?(?<=AA)G|C)$/;

matches a DNA sequence such that it either ends in C<AAG>, or some
other base pair combination and C<'C'>.  Note that the form is
C<< (?(?<=AA)G|C) >> and not C<< (?((?<=AA))G|C) >>; for the
lookahead, lookbehind or code assertions, the parentheses around the
conditional are not needed.
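
The integer form of the condition is handy for delimiters that must
appear in pairs.  In this sketch (the sample strings are made up for
illustration) a closing angle bracket is required if, and only if, an
opening one was captured by group 1:

    my $re = qr/^(<)?\w+(?(1)>)$/;
    my @ok = grep { /$re/ } ('<foo>', 'foo', '<foo', 'foo>');
    # @ok holds '<foo>' and 'foo'; the two lopsided strings are rejected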


=head2 Defining named patterns

Some regular expressions use identical subpatterns in several places.
Starting with Perl 5.10, it is possible to define named subpatterns in
a section of the pattern so that they can be called up by name
anywhere in the pattern.  The syntactic pattern for this definition
group is C<< (?(DEFINE)(?<I<name>>I<pattern>)...) >>.  An insertion
of a named pattern is written as C<(?&I<name>)>.

The example below illustrates this feature using the pattern for
floating point numbers that was presented earlier on.  The three
subpatterns that are used more than once are the optional sign, the
digit sequence for an integer and the decimal fraction.  The C<DEFINE>
group at the end of the pattern contains their definition.  Notice
that the decimal fraction pattern is the first place where we can
reuse the integer pattern.

   /^ (?&osg)\ * ( (?&int)(?&dec)? | (?&dec) )
      (?: [eE](?&osg)(?&int) )?
    $
    (?(DEFINE)
      (?<osg>[-+]?)         # optional sign
      (?<int>\d++)          # integer
      (?<dec>\.(?&int))     # decimal fraction
    )/x
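
To try the pattern out, it can be stored with C<qr//> and applied to a
few strings.  This is only a sketch: the variable names and the sample
input are assumptions made for the illustration.

    my $float = qr/^ (?&osg)\ * ( (?&int)(?&dec)? | (?&dec) )
                     (?: [eE](?&osg)(?&int) )?
                   $
                   (?(DEFINE)
                     (?<osg>[-+]?)         # optional sign
                     (?<int>\d++)          # integer
                     (?<dec>\.(?&int))     # decimal fraction
                   )/x;

    my @input   = ('42', '-3.14', '.5', '6.02e23', '1.', 'e10', 'abc');
    my @numbers = grep { $_ =~ $float } @input;
    # @numbers holds '42', '-3.14', '.5' and '6.02e23'

The string C<'1.'> is rejected because the decimal fraction requires at
least one digit after the point.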


=head2 Recursive patterns

This feature (introduced in Perl 5.10) significantly extends the
power of Perl's pattern matching.  By referring to some other
capture group anywhere in the pattern with the construct
C<(?I<group-ref>)>, the I<pattern> within the referenced group is used
as an independent subpattern in place of the group reference itself.
Because the group reference may be contained I<within> the group it
refers to, it is now possible to apply pattern matching to tasks that
hitherto required a recursive parser.

To illustrate this feature, we'll design a pattern that matches if
a string is a palindrome. (This is a word or a sentence that,
while ignoring spaces, punctuation and case, reads the same backwards
as forwards.) We begin by observing that the empty string or a string
containing just one word character is a palindrome. Otherwise it must
have a word character up front and the same at its end, with another
palindrome in between.

    /(?: (\w) (?...Here be a palindrome...) \g{-1} | \w? )/x

Adding C<\W*> at either end to eliminate what is to be ignored, we already
have the full pattern:

    my $pp = qr/^(\W* (?: (\w) (?1) \g{-1} | \w? ) \W*)$/ix;
    for $s ( "saippuakauppias", "A man, a plan, a canal: Panama!" ){
        print "'$s' is a palindrome\n" if $s =~ /$pp/;
    }

In C<(?...)> both absolute and relative backreferences may be used.
The entire pattern can be reinserted with C<(?R)> or C<(?0)>.
If you prefer to name your groups, you can use C<(?&I<name>)> to
recurse into that group.
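
Recursion is also a natural fit for nested delimiters.  The sketch
below is not part of the palindrome example; it uses C<(?1)> to match a
string that consists of a single balanced, arbitrarily nested
parenthesized group.  The sample strings are made up for illustration.

    my $balanced = qr/^ ( \( (?: [^()]++ | (?1) )* \) ) $/x;
    my @ok = grep { $_ =~ $balanced }
                  ('()', '(a(b)c)', '((()))', '(a(b c)', '(()');
    # @ok holds '()', '(a(b)c)' and '((()))'

Group 1 matches an open parenthesis, then any mix of non-parenthesis
runs and nested copies of itself, then a close parenthesis.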


=head2 A bit of magic: executing Perl code in a regular expression

Normally, regexps are a part of Perl expressions.
I<Code evaluation> expressions turn that around by allowing
arbitrary Perl code to be a part of a regexp.  A code evaluation
expression is denoted C<(?{I<code>})>, with I<code> a string of Perl
statements.

Code expressions are zero-width assertions, and the value they return
depends on their environment.  There are two possibilities: either the
code expression is used as a conditional in a conditional expression
C<(?(I<condition>)...)>, or it is not.  If the code expression is a
conditional, the code is evaluated and the result (I<i.e.>, the result of
the last statement) is used to determine truth or falsehood.  If the
code expression is not used as a conditional, the assertion always
evaluates true and the result is put into the special variable
C<$^R>.  The variable C<$^R> can then be used in code expressions later
in the regexp.  Here are some silly examples:

    $x = "abcdef";
    $x =~ /abc(?{print "Hi Mom!";})def/; # matches,
                                         # prints 'Hi Mom!'
    $x =~ /aaa(?{print "Hi Mom!";})def/; # doesn't match,
                                         # no 'Hi Mom!'

Pay careful attention to the next example:

    $x =~ /abc(?{print "Hi Mom!";})ddd/; # doesn't match,
                                         # no 'Hi Mom!'
                                         # but why not?

At first glance, you'd think that it shouldn't print, because obviously
the C<ddd> isn't going to match the target string. But look at this
example:

    $x =~ /abc(?{print "Hi Mom!";})[dD]dd/; # doesn't match,
                                            # but _does_ print

Hmm. What happened here? If you've been following along, you know that
the above pattern should be effectively (almost) the same as the last one;
enclosing the C<'d'> in a character class isn't going to change what it
matches. So why does the first not print while the second one does?

The answer lies in the optimizations the regexp engine makes. In the first
case, all the engine sees are plain old characters (aside from the
C<?{}> construct). It's smart enough to realize that the string C<'ddd'>
doesn't occur in our target string before actually running the pattern
through. But in the second case, we've tricked it into thinking that our
pattern is more complicated. It takes a look, sees our
character class, and decides that it will have to actually run the
pattern to determine whether or not it matches, and in the process of
running it hits the print statement before it discovers that we don't
have a match.

To take a closer look at how the engine does optimizations, see the
section L</"Pragmas and debugging"> below.

More fun with C<?{}>:

    $x =~ /(?{print "Hi Mom!";})/;       # matches,
                                         # prints 'Hi Mom!'
    $x =~ /(?{$c = 1;})(?{print "$c";})/;  # matches,
                                           # prints '1'
    $x =~ /(?{$c = 1;})(?{print "$^R";})/; # matches,
                                           # prints '1'

The bit of magic mentioned in the section title occurs when the regexp
backtracks in the process of searching for a match.  If the regexp
backtracks over a code expression and if the variables used within are
localized using C<local>, the changes in the variables produced by the
code expression are undone! Thus, if we wanted to count how many times
a character got matched inside a group, we could use, I<e.g.>,

    $x = "aaaa";
    $count = 0;  # initialize 'a' count
    $c = "bob";  # test if $c gets clobbered
    $x =~ /(?{local $c = 0;})         # initialize count
           ( a                        # match 'a'
             (?{local $c = $c + 1;})  # increment count
           )*                         # do this any number of times,
           aa                         # but match 'aa' at the end
           (?{$count = $c;})          # copy local $c var into $count
          /x;
    print "'a' count is $count, \$c variable is '$c'\n";

This prints

    'a' count is 2, $c variable is 'bob'

If we replace the S<C< (?{local $c = $c + 1;})>> with
S<C< (?{$c = $c + 1;})>>, the variable changes are I<not> undone
during backtracking, and we get

    'a' count is 4, $c variable is 'bob'

Note that only localized variable changes are undone.  Other side
effects of code expression execution are permanent.  Thus

    $x = "aaaa";
    $x =~ /(a(?{print "Yow\n";}))*aa/;

produces

   Yow
   Yow
   Yow
   Yow

The result C<$^R> is automatically localized, so that it will behave
properly in the presence of backtracking.

This example uses a code expression in a conditional to match a
definite article, either C<'the'> in English or C<'der|die|das'> in
German:

    $lang = 'DE';  # use German
    ...
    $text = "das";
    print "matched\n"
        if $text =~ /(?(?{
                          $lang eq 'EN'; # is the language English?
                         })
                       the |             # if so, then match 'the'
                       (der|die|das)     # else, match 'der|die|das'
                     )
                    /xi;

Note that the syntax here is C<(?(?{...})I<yes-regexp>|I<no-regexp>)>, not
C<(?((?{...}))I<yes-regexp>|I<no-regexp>)>.  In other words, in the case of a
code expression, we don't need the extra parentheses around the
conditional.

If you try to use code expressions where the code text is contained within
an interpolated variable, rather than appearing literally in the pattern,
Perl may surprise you:

    $bar = 5;
    $pat = '(?{ 1 })';
    /foo(?{ $bar })bar/; # compiles ok, $bar not interpolated
    /foo(?{ 1 })$bar/;   # compiles ok, $bar interpolated
    /foo${pat}bar/;      # compile error!

    $pat = qr/(?{ $foo = 1 })/;  # precompile code regexp
    /foo${pat}bar/;      # compiles ok

If a regexp has a variable that interpolates a code expression, Perl
treats the regexp as an error. If the code expression is precompiled into
a variable, however, interpolating is ok. The question is, why is this an
error?

The reason is that variable interpolation and code expressions
together pose a security risk.  The combination is dangerous because
many programmers who write search engines often take user input and
plug it directly into a regexp:

    $regexp = <>;       # read user-supplied regexp
    chomp $regexp;      # get rid of possible newline
    $text =~ /$regexp/; # search $text for the $regexp

If the C<$regexp> variable contains a code expression, the user could
then execute arbitrary Perl code.  For instance, some joker could
search for S<C<(?{ system('rm -rf *'); })>> to erase your files.  In this
sense, the combination of interpolation and code expressions I<taints>
your regexp.  So by default, using both interpolation and code
expressions in the same regexp is not allowed.  If you're not
concerned about malicious users, it is possible to bypass this
security check by invoking S<C<use re 'eval'>>:

    use re 'eval';       # throw caution out the door
    $bar = 5;
    $pat = '(?{ 1 })';
    /foo${pat}bar/;      # compiles ok
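
When the user is meant to supply plain text rather than a regexp, a
more cautious alternative is to quote the input with C<\Q...\E> (see
L<perlfunc/quotemeta>), so that no part of it is treated as regexp
syntax.  The strings in this sketch are made up for illustration:

    my $input = '(?{ print "gotcha" })';   # hostile "search term"
    my $text  = 'a log line containing (?{ print "gotcha" }) verbatim';
    my $found = ($text =~ /\Q$input\E/) ? 1 : 0;
    # $found is 1: the input matched as literal text, no code was run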

Another form of code expression is the I<pattern code expression>.
The pattern code expression is like a regular code expression, except
that the result of the code evaluation is treated as a regular
expression and matched immediately.  A simple example is

    $length = 5;
    $char = 'a';
    $x = 'aaaaabb';
    $x =~ /(??{$char x $length})/x; # matches, there are 5 of 'a'


This final example contains both ordinary and pattern code
expressions.  It detects whether a binary string C<1101010010001...> has a
Fibonacci spacing 0,1,1,2,3,5,...  of the C<'1'>'s:

    $x = "1101010010001000001";
    $z0 = ''; $z1 = '0';   # initial conditions
    print "It is a Fibonacci sequence\n"
        if $x =~ /^1         # match an initial '1'
                    (?:
                       ((??{ $z0 })) # match some '0'
                       1             # and then a '1'
                       (?{ $z0 = $z1; $z1 .= $^N; })
                    )+   # repeat as needed
                  $      # that is all there is
                 /x;
    printf "Largest sequence matched was %d\n", length($z1)-length($z0);

Remember that C<$^N> is set to whatever was matched by the last
completed capture group. This prints

    It is a Fibonacci sequence
    Largest sequence matched was 5

Ha! Try that with your garden variety regexp package...

Note that the variables C<$z0> and C<$z1> are not substituted when the
regexp is compiled, as happens for ordinary variables outside a code
expression.  Rather, the whole code block is parsed as perl code at the
same time as perl is compiling the code containing the literal regexp
pattern.

This regexp without the C</x> modifier is

    /^1(?:((??{ $z0 }))1(?{ $z0 = $z1; $z1 .= $^N; }))+$/

which shows that spaces are still possible in the code parts. Nevertheless,
when working with code and conditional expressions, the extended form of
regexps is almost necessary in creating and debugging regexps.


=head2 Backtracking control verbs

Perl 5.10 introduced a number of control verbs intended to provide
detailed control over the backtracking process, by directly influencing
the regexp engine and by providing monitoring techniques.  See
L<perlre/"Special Backtracking Control Verbs"> for a detailed
description.

Below is just one example, illustrating the control verb C<(*FAIL)>,
which may be abbreviated as C<(*F)>. If this is inserted in a regexp
it will cause it to fail, just as it would at some
mismatch between the pattern and the string. Processing
of the regexp continues as it would after any "normal"
failure, so that, for instance, the next position in the string or another
alternative will be tried. As failing to match doesn't preserve capture
groups or produce results, it may be necessary to use this in
combination with embedded code.

   %count = ();
   "supercalifragilisticexpialidocious" =~
       /([aeiou])(?{ $count{$1}++; })(*FAIL)/i;
   printf "%3d '%s'\n", $count{$_}, $_ for (sort keys %count);

The pattern begins with a class matching a subset of letters.  Whenever
this matches, a statement like C<$count{'a'}++;> is executed, incrementing
the letter's counter. Then C<(*FAIL)> does what it says, and
the regexp engine proceeds according to the book: as long as the end of
the string hasn't been reached, the position is advanced before looking
for another vowel. Thus, match or no match makes no difference, and the
regexp engine proceeds until the entire string has been inspected.
(It's remarkable that an alternative solution using something like

   $count2{lc($_)}++ for split('', "supercalifragilisticexpialidocious");
   printf "%3d '%s'\n", $count2{$_}, $_ for ( qw{ a e i o u } );

is considerably slower.)
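
As a cross-check, the counts produced by C<(*FAIL)> can be compared
with those from an ordinary C</g> loop.  This is a sketch only; the
hash names C<%by_fail> and C<%by_loop> are invented here, and lexical
variables inside a code expression are assumed to behave as they do in
recent versions of Perl.

    my $word = "supercalifragilisticexpialidocious";

    my %by_fail;    # filled in as a side effect of a failing match
    $word =~ /([aeiou])(?{ $by_fail{$1}++; })(*FAIL)/i;

    my %by_loop;    # filled in by repeated successful matches
    $by_loop{lc $1}++ while $word =~ /([aeiou])/gi;

Both hashes end up with the same keys and values, since each approach
visits every vowel in the string exactly once.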


=head2 Pragmas and debugging

Speaking of debugging, there are several pragmas available to control
and debug regexps in Perl.  We have already encountered one pragma in
the previous section, S<C<use re 'eval';>>, that allows variable
interpolation and code expressions to coexist in a regexp.  The other
pragmas are

    use re 'taint';
    $tainted = <>;
    @parts = ($tainted =~ /(\w+)\s+(\w+)/); # @parts is now tainted

The C<taint> pragma causes any substrings from a match with a tainted
variable to be tainted as well.  This is not normally the case, as
regexps are often used to extract the safe bits from a tainted
variable.  Use C<taint> when you are not extracting safe bits, but are
performing some other processing.  Both C<taint> and C<eval> pragmas
are lexically scoped, which means they are in effect only until
the end of the block enclosing the pragmas.

    use re '/m';  # or any other flags
    $multiline_string =~ /^foo/; # /m is implied

The C<re '/flags'> pragma (introduced in Perl
5.14) turns on the given regular expression flags
until the end of the lexical scope.  See
L<re/"'E<sol>flags' mode"> for more
detail.

    use re 'debug';
    /^(.*)$/s;       # output debugging info

    use re 'debugcolor';
    /^(.*)$/s;       # output debugging info in living color

The global C<debug> and C<debugcolor> pragmas allow one to get
detailed debugging info about regexp compilation and
execution.  C<debugcolor> is the same as debug, except the debugging
information is displayed in color on terminals that can display
termcap color sequences.  Here is example output:

    % perl -e 'use re "debug"; "abc" =~ /a*b+c/;'
    Compiling REx 'a*b+c'
    size 9 first at 1
       1: STAR(4)
       2:   EXACT <a>(0)
       4: PLUS(7)
       5:   EXACT <b>(0)
       7: EXACT <c>(9)
       9: END(0)
    floating 'bc' at 0..2147483647 (checking floating) minlen 2
    Guessing start of match, REx 'a*b+c' against 'abc'...
    Found floating substr 'bc' at offset 1...
    Guessed: match at offset 0
    Matching REx 'a*b+c' against 'abc'
      Setting an EVAL scope, savestack=3
       0 <> <abc>           |  1:  STAR
                             EXACT <a> can match 1 times out of 32767...
      Setting an EVAL scope, savestack=3
       1 <a> <bc>           |  4:    PLUS
                             EXACT <b> can match 1 times out of 32767...
      Setting an EVAL scope, savestack=3
       2 <ab> <c>           |  7:      EXACT <c>
       3 <abc> <>           |  9:      END
    Match successful!
    Freeing REx: 'a*b+c'

If you have gotten this far into the tutorial, you can probably guess
what the different parts of the debugging output tell you.  The first
part

    Compiling REx 'a*b+c'
    size 9 first at 1
       1: STAR(4)
       2:   EXACT <a>(0)
       4: PLUS(7)
       5:   EXACT <b>(0)
       7: EXACT <c>(9)
       9: END(0)

describes the compilation stage.  C<STAR(4)> means that there is a
starred object, in this case C<'a'>, and if it matches, goto line 4,
I<i.e.>, C<PLUS(7)>.  The middle lines describe some heuristics and
optimizations performed before a match:

    floating 'bc' at 0..2147483647 (checking floating) minlen 2
    Guessing start of match, REx 'a*b+c' against 'abc'...
    Found floating substr 'bc' at offset 1...
    Guessed: match at offset 0

Then the match is executed and the remaining lines describe the
process:

    Matching REx 'a*b+c' against 'abc'
      Setting an EVAL scope, savestack=3
       0 <> <abc>           |  1:  STAR
                             EXACT <a> can match 1 times out of 32767...
      Setting an EVAL scope, savestack=3
       1 <a> <bc>           |  4:    PLUS
                             EXACT <b> can match 1 times out of 32767...
      Setting an EVAL scope, savestack=3
       2 <ab> <c>           |  7:      EXACT <c>
       3 <abc> <>           |  9:      END
    Match successful!
    Freeing REx: 'a*b+c'

Each step is of the form S<C<< n <x> <y> >>>, with C<< <x> >> the
part of the string matched and C<< <y> >> the part not yet
matched.  The S<C<< |  1:  STAR >>> says that Perl is at line number 1
in the compilation list above.  See
L<perldebguts/"Debugging Regular Expressions"> for much more detail.

An alternative method of debugging regexps is to embed C<print>
statements within the regexp.  This provides a blow-by-blow account of
the backtracking in an alternation:

    "that this" =~ m@(?{print "Start at position ", pos, "\n";})
                     t(?{print "t1\n";})
                     h(?{print "h1\n";})
                     i(?{print "i1\n";})
                     s(?{print "s1\n";})
                         |
                     t(?{print "t2\n";})
                     h(?{print "h2\n";})
                     a(?{print "a2\n";})
                     t(?{print "t2\n";})
                     (?{print "Done at position ", pos, "\n";})
                    @x;

prints

    Start at position 0
    t1
    h1
    t2
    h2
    a2
    t2
    Done at position 4

=head1 SEE ALSO

This is just a tutorial.  For the full story on Perl regular
expressions, see the L<perlre> regular expressions reference page.

For more information on the matching C<m//> and substitution C<s///>
operators, see L<perlop/"Regexp Quote-Like Operators">.  For
information on the C<split> operation, see L<perlfunc/split>.

For an excellent all-around resource on the care and feeding of
regular expressions, see the book I<Mastering Regular Expressions> by
Jeffrey Friedl (published by O'Reilly, ISBN 1-56592-257-3).

=head1 AUTHOR AND COPYRIGHT

Copyright (c) 2000 Mark Kvale.
All rights reserved.
Now maintained by Perl porters.

This document may be distributed under the same terms as Perl itself.

=head2 Acknowledgments

The inspiration for the stop codon DNA example came from the ZIP
code example in chapter 7 of I<Mastering Regular Expressions>.

The author would like to thank Jeff Pinyan, Andrew Johnson, Peter
Haworth, Ronald J Kimball, and Joe Smith for all their helpful
comments.

=cut

=head1 NAME

perlvar - Perl predefined variables

=head1 DESCRIPTION

=head2 The Syntax of Variable Names

Variable names in Perl can have several formats.  Usually, they
must begin with a letter or underscore, in which case they can be
arbitrarily long (up to an internal limit of 251 characters) and
may contain letters, digits, underscores, or the special sequence
C<::> or C<'>.  In this case, the part before the last C<::> or
C<'> is taken to be a I<package qualifier>; see L<perlmod>.
A Unicode letter that is not ASCII is not considered to be a letter
unless S<C<"use utf8">> is in effect, and somewhat more complicated
rules apply; see L<perldata/Identifier parsing> for details.

Perl variable names may also be a sequence of digits, a single
punctuation character, or the two-character sequence: C<^> (caret or
CIRCUMFLEX ACCENT) followed by any one of the characters C<[][A-Z^_?\]>.
These names are all reserved for
special uses by Perl; for example, the all-digits names are used
to hold data captured by backreferences after a regular expression
match.

Since Perl v5.6.0, Perl variable names may also be alphanumeric strings
preceded by a caret.  These must all be written in the form C<${^Foo}>;
the braces are not optional.  C<${^Foo}> denotes the scalar variable
whose name is considered to be a control-C<F> followed by two C<o>'s.
These variables are
reserved for future special uses by Perl, except for the ones that
begin with C<^_> (caret-underscore).  No
name that begins with C<^_> will acquire a special
meaning in any future version of Perl; such names may therefore be
used safely in programs.  C<$^_> itself, however, I<is> reserved.

Perl identifiers that begin with digits or
punctuation characters are exempt from the effects of the C<package>
declaration and are always forced to be in package C<main>; they are
also exempt from C<strict 'vars'> errors.  A few other names are also
exempt in these ways:

    ENV      STDIN
    INC      STDOUT
    ARGV     STDERR
    ARGVOUT
    SIG

In particular, the special C<${^_XYZ}> variables are always taken
to be in package C<main>, regardless of any C<package> declarations
presently in scope.
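
As a minimal sketch of this rule, a caret-underscore variable assigned
inside one package is the very same variable when read from another.
The package name C<Counter> and the variable name C<${^_HITS}> are made
up for illustration:

    use strict;
    use warnings;

    package Counter;            # hypothetical package
    ${^_HITS} = 42;             # stored in main:: despite the package line

    package main;
    print "hits: ${^_HITS}\n";  # same variable, no qualifier needed

No C<our> declaration or package qualifier is needed here, even under
C<strict 'vars'>.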

=head1 SPECIAL VARIABLES

The following names have special meaning to Perl.  Most punctuation
names have reasonable mnemonics, or analogs in the shells.
Nevertheless, if you wish to use long variable names, you need only say:

    use English;

at the top of your program.  This aliases all the short names to the long
names in the current package.  Some even have medium names, generally
borrowed from B<awk>.  For more info, please see L<English>.
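
A short sketch of the aliasing, assuming you do not need the long names
for the match variables (the C<-no_match_vars> import option of
L<English> leaves those out):

    use strict;
    use warnings;
    use English qw(-no_match_vars);

    my $same_pid = $PROCESS_ID == $$;   # long and short names agree
    print "pid $PID running on $OSNAME\n";

Here C<$PROCESS_ID> and C<$PID> are both aliases for C<$$>, and
C<$OSNAME> is an alias for C<$^O>.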

Before you continue, note the sort order for variables.  In general, we
first list the variables in case-insensitive, almost-lexicographical
order (ignoring the C<{> or C<^> preceding words, as in C<${^UNICODE}>
or C<$^T>), although C<$_> and C<@_> move up to the top of the pile.
For variables with the same identifier, we list it in order of scalar,
array, hash, and bareword.

=head2 General Variables

=over 8

=item $ARG

=item $_
X<$_> X<$ARG>

The default input and pattern-searching space.  The following pairs are
equivalent:

    while (<>) {...}    # equivalent only in while!
    while (defined($_ = <>)) {...}

    /^Subject:/
    $_ =~ /^Subject:/

    tr/a-z/A-Z/
    $_ =~ tr/a-z/A-Z/

    chomp
    chomp($_)

Here are the places where Perl will assume C<$_> even if you don't use it:

=over 3

=item *

The following functions use C<$_> as a default argument:

abs, alarm, chomp, chop, chr, chroot,
cos, defined, eval, evalbytes, exp, fc, glob, hex, int, lc,
lcfirst, length, log, lstat, mkdir, oct, ord, pos, print, printf,
quotemeta, readlink, readpipe, ref, require, reverse (in scalar context only),
rmdir, say, sin, split (for its second
argument), sqrt, stat, study, uc, ucfirst,
unlink, unpack.

=item *

All file tests (C<-f>, C<-d>) except for C<-t>, which defaults to STDIN.
See L<perlfunc/-X>.

=item *

The pattern matching operations C<m//>, C<s///> and C<tr///> (aka C<y///>)
when used without an C<=~> operator.

=item *

The default iterator variable in a C<foreach> loop if no other
variable is supplied.

=item *

The implicit iterator variable in the C<grep()> and C<map()> functions.

=item *

The implicit variable of C<given()>.

=item *

The default place to put the next value or input record
when a C<< <FH> >>, C<readline>, C<readdir> or C<each>
operation's result is tested by itself as the sole criterion of a C<while>
test.  Outside a C<while> test, this will not happen.

=back

C<$_> is by default a global variable.  However, as
of perl v5.10.0, you can use a lexical version of
C<$_> by declaring it in a file or in a block with C<my>.  Moreover,
declaring C<our $_> restores the global C<$_> in the current scope.  Though
this seemed like a good idea at the time it was introduced, lexical C<$_>
actually causes more problems than it solves.  If you call a function that
expects to be passed information via C<$_>, it may or may not work,
depending on how the function is written, there not being any easy way to
solve this.  Just avoid lexical C<$_>, unless you are feeling particularly
masochistic.  For this reason lexical C<$_> is still experimental and will
produce a warning unless warnings have been disabled.  As with other
experimental features, the behavior of lexical C<$_> is subject to change
without notice, including change into a fatal error.

Mnemonic: underline is understood in certain operations.
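
The following sketch pulls several of the cases above together; the
data in C<@lines> is made up for illustration:

    use strict;
    use warnings;

    my @lines = ("apple\n", "Banana\n", "cherry\n");

    chomp for @lines;                     # foreach aliases $_; chomp uses it
    my @upper  = map  { uc }   @lines;    # uc defaults to $_
    my @has_an = grep { /an/ } @lines;    # m// matches against $_

    print "@upper\n";
    print "@has_an\n";

Because the C<foreach> variable is an alias, the C<chomp> modifies the
elements of C<@lines> in place.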

=item @ARG

=item @_
X<@_> X<@ARG>

Within a subroutine the array C<@_> contains the parameters passed to
that subroutine.  Inside a subroutine, C<@_> is the default array for
the array operators C<pop> and C<shift>.

See L<perlsub>.
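
A minimal sketch, using the hypothetical subroutines C<area> and
C<count_args>:

    use strict;
    use warnings;

    sub area {
        my $width  = shift;     # shift with no argument takes from @_
        my $height = shift;
        return $width * $height;
    }

    sub count_args { return scalar @_ }   # number of parameters passed

    print area(3, 4), "\n";
    print count_args('a', 'b', 'c'), "\n";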

=item $LIST_SEPARATOR

=item $"
X<$"> X<$LIST_SEPARATOR>

When an array or an array slice is interpolated into a double-quoted
string or a similar context such as C</.../>, its elements are
separated by this value.  Default is a space.  For example, this:

    print "The array is: @array\n";

is equivalent to this:

    print "The array is: " . join($", @array) . "\n";

Mnemonic: works in double-quoted context.
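
To change the separator for one interpolation only, it is usually
safest to C<local>ize it.  A small sketch:

    use strict;
    use warnings;

    my @array = (1, 2, 3);

    my $default = "@array";                             # space-separated
    my $custom  = do { local $" = ', '; "@array" };     # comma-separated

    print "$default\n$custom\n";

The previous value of C<$"> is restored when the C<do> block exits.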

=item $PROCESS_ID

=item $PID

=item $$
X<$$> X<$PID> X<$PROCESS_ID>

The process number of the Perl running this script.  Though you I<can> set
this variable, doing so is generally discouraged, although it can be
invaluable for some testing purposes.  It will be reset automatically
across C<fork()> calls.

Note for Linux and Debian GNU/kFreeBSD users: Before Perl v5.16.0 perl
would emulate POSIX semantics on Linux systems using LinuxThreads, a
partial implementation of POSIX Threads that has since been superseded
by the Native POSIX Thread Library (NPTL).

LinuxThreads is now obsolete on Linux, and caching C<getpid()>
like this made embedding perl unnecessarily complex (since you'd have
to manually update the value of $$), so now C<$$> and C<getppid()>
will always return the same values as the underlying C library.

Debian GNU/kFreeBSD systems also used LinuxThreads up until and
including the 6.0 release, but after that moved to FreeBSD thread
semantics, which are POSIX-like.

To see if your system is affected by this discrepancy check if
C<getconf GNU_LIBPTHREAD_VERSION | grep -q NPTL> returns a false
value.  NPTL threads preserve the POSIX semantics.

Mnemonic: same as shells.

=item $PROGRAM_NAME

=item $0
X<$0> X<$PROGRAM_NAME>

Contains the name of the program being executed.

On some (but not all) operating systems assigning to C<$0> modifies
the argument area that the C<ps> program sees.  On some platforms you
may have to use special C<ps> options or a different C<ps> to see the
changes.  Modifying the C<$0> is more useful as a way of indicating the
current program state than it is for hiding the program you're
running.

Note that there are platform-specific limitations on the maximum
length of C<$0>.  In the most extreme case it may be limited to the
space occupied by the original C<$0>.

In some platforms there may be an arbitrary amount of padding, for
example space characters, after the modified name as shown by C<ps>.
In some platforms this padding may extend all the way to the original
length of the argument area, no matter what you do (this is the case
for example with Linux 2.2).

Note for BSD users: setting C<$0> does not completely remove "perl"
from the ps(1) output.  For example, setting C<$0> to C<"foobar"> may
result in C<"perl: foobar (perl)"> (whether both the C<"perl: "> prefix
and the " (perl)" suffix are shown depends on your exact BSD variant
and version).  This is an operating system feature; Perl cannot help it.

In multithreaded scripts Perl coordinates the threads so that any
thread may modify its copy of the C<$0> and the change becomes visible
to ps(1) (assuming the operating system plays along).  Note that
the view of C<$0> the other threads have will not change since they
have their own copies of it.

If the program has been given to perl via the switches C<-e> or C<-E>,
C<$0> will contain the string C<"-e">.

On Linux as of perl v5.14.0 the legacy process name will be set with
C<prctl(2)>, in addition to altering the POSIX name via C<argv[0]> as
perl has done since version 4.000.  Now system utilities that read the
legacy process name such as ps, top and killall will recognize the
name you set when assigning to C<$0>.  The string you supply will be
cut off at 16 bytes; this is a limitation imposed by Linux.

Mnemonic: same as B<sh> and B<ksh>.

=item $REAL_GROUP_ID

=item $GID

=item $(
X<$(> X<$GID> X<$REAL_GROUP_ID>

The real gid of this process.  If you are on a machine that supports
membership in multiple groups simultaneously, gives a space separated
list of groups you are in.  The first number is the one returned by
C<getgid()>, and the subsequent ones by C<getgroups()>, one of which may be
the same as the first number.

However, a value assigned to C<$(> must be a single number used to
set the real gid.  So the value given by C<$(> should I<not> be assigned
back to C<$(> without being forced numeric, such as by adding zero.  Note
that this is different to the effective gid (C<$)>) which does take a
list.

You can change both the real gid and the effective gid at the same
time by using C<POSIX::setgid()>.  Changes
to C<$(> require a check to C<$!>
to detect any possible errors after an attempted change.

Mnemonic: parentheses are used to I<group> things.  The real gid is the
group you I<left>, if you're running setgid.
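
A small sketch of reading C<$(> safely; the variable names are
illustrative only:

    use strict;
    use warnings;

    my @group_ids = split ' ', $(;   # every group id listed in $(
    my $real_gid  = $( + 0;          # numeric context yields the real gid

    print "real gid: $real_gid\n";

The numeric form in C<$real_gid> is the kind of value that can be
assigned back to C<$(>; the full space-separated list is not.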

=item $EFFECTIVE_GROUP_ID

=item $EGID

=item $)
X<$)> X<$EGID> X<$EFFECTIVE_GROUP_ID>

The effective gid of this process.  If you are on a machine that
supports membership in multiple groups simultaneously, gives a space
separated list of groups you are in.  The first number is the one
returned by C<getegid()>, and the subsequent ones by C<getgroups()>,
one of which may be the same as the first number.

Similarly, a value assigned to C<$)> must also be a space-separated
list of numbers.  The first number sets the effective gid, and
the rest (if any) are passed to C<setgroups()>.  To get the effect of an
empty list for C<setgroups()>, just repeat the new effective gid; that is,
to force an effective gid of 5 and an effectively empty C<setgroups()>
list, say C< $) = "5 5" >.

You can change both the effective gid and the real gid at the same
time by using C<POSIX::setgid()> (use only a single numeric argument).
Changes to C<$)> require a check to C<$!> to detect any possible errors
after an attempted change.

C<< $< >>, C<< $> >>, C<$(> and C<$)> can be set only on
machines that support the corresponding I<set[re][ug]id()> routine.  C<$(>
and C<$)> can be swapped only on machines supporting C<setregid()>.

Mnemonic: parentheses are used to I<group> things.  The effective gid
is the group that's I<right> for you, if you're running setgid.

=item $REAL_USER_ID

=item $UID

=item $<
X<< $< >> X<$UID> X<$REAL_USER_ID>

The real uid of this process.  You can change both the real uid and the
effective uid at the same time by using C<POSIX::setuid()>.  Since
changes to C<< $< >> require a system call, check C<$!> after a change
attempt to detect any possible errors.

Mnemonic: it's the uid you came I<from>, if you're running setuid.

=item $EFFECTIVE_USER_ID

=item $EUID

=item $>
X<< $> >> X<$EUID> X<$EFFECTIVE_USER_ID>

The effective uid of this process.  For example:

    $< = $>;            # set real to effective uid
    ($<,$>) = ($>,$<);  # swap real and effective uids

You can change both the effective uid and the real uid at the same
time by using C<POSIX::setuid()>.  Changes to C<< $> >> require a check
to C<$!> to detect any possible errors after an attempted change.

C<< $< >> and C<< $> >> can be swapped only on machines
supporting C<setreuid()>.

Mnemonic: it's the uid you went I<to>, if you're running setuid.

=item $SUBSCRIPT_SEPARATOR

=item $SUBSEP

=item $;
X<$;> X<$SUBSEP> X<$SUBSCRIPT_SEPARATOR>

The subscript separator for multidimensional array emulation.  If you
refer to a hash element as

    $foo{$x,$y,$z}

it really means

    $foo{join($;, $x, $y, $z)}

But don't put

    @foo{$x,$y,$z}	# a slice--note the @

which means

    ($foo{$x},$foo{$y},$foo{$z})

Default is "\034", the same as SUBSEP in B<awk>.  If your keys contain
binary data there might not be any safe value for C<$;>.

Consider using "real" multidimensional arrays as described
in L<perllol>.

Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon.
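
The emulation can be seen at work in the following sketch, where the
hash C<%grid> and its contents are made up for illustration:

    use strict;
    use warnings;

    my %grid;
    my ($x, $y, $z) = (1, 2, 3);
    $grid{$x, $y, $z} = 'occupied';         # key is join($;, 1, 2, 3)

    my $found = exists $grid{ join($;, 1, 2, 3) };

    my $sep    = $;;                        # copy of the separator
    my ($key)  = keys %grid;
    my @coords = split /\Q$sep\E/, $key;    # recover the three subscripts

    print "coords: @coords\n";

Splitting the key apart again only works when none of the subscripts
contains the separator itself.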

=item $a

=item $b
X<$a> X<$b>

Special package variables when using C<sort()>, see L<perlfunc/sort>.
Because of this specialness C<$a> and C<$b> don't need to be declared
(using C<use vars>, or C<our()>) even when using the C<strict 'vars'>
pragma.  Don't lexicalize them with C<my $a> or C<my $b> if you want to
be able to use them in the C<sort()> comparison block or function.
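
For example, this sketch sorts a made-up list of numbers in both
directions without declaring C<$a> or C<$b>:

    use strict;
    use warnings;

    my @numbers    = (10, 9, 100, 1);
    my @ascending  = sort { $a <=> $b } @numbers;
    my @descending = sort { $b <=> $a } @numbers;

    print "@ascending\n@descending\n";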

=item %ENV
X<%ENV>

The hash C<%ENV> contains your current environment.  Setting a
value in C<%ENV> changes the environment for any child processes
you subsequently C<fork()> off.

As of v5.18.0, both keys and values stored in C<%ENV> are stringified.

    my $foo = 1;
    $ENV{'bar'} = \$foo;
    if( ref $ENV{'bar'} ) {
        say "Pre 5.18.0 Behaviour";
    } else {
        say "Post 5.18.0 Behaviour";
    }

Previously, only child processes received stringified values:

    my $foo = 1;
    $ENV{'bar'} = \$foo;

    # Always printed 'non ref'
    system($^X, '-e',
           q/print ( ref $ENV{'bar'}  ? 'ref' : 'non ref' ) /);

This happens because you can't really share arbitrary data structures with
foreign processes.
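
As a sketch of how a plain string value reaches a child process, the
following sets a made-up variable named C<GREETING> and asks a child
perl to print it.  It assumes a platform where the list form of a pipe
C<open()> is available:

    use strict;
    use warnings;

    local $ENV{GREETING} = 'hello';     # hypothetical variable name

    # The list form avoids shell quoting issues.
    open(my $child, '-|', $^X, '-e', 'print $ENV{GREETING}')
        or die "cannot start child perl: $!";
    my $seen_by_child = do { local $/; <$child> };
    close $child;

    print "child saw: $seen_by_child\n";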

=item $OLD_PERL_VERSION

=item $]
X<$]> X<$OLD_PERL_VERSION>

The revision, version, and subversion of the Perl interpreter, represented
as a decimal of the form 5.XXXYYY, where XXX is the version / 1e3 and YYY
is the subversion / 1e6.  For example, Perl v5.10.1 would be "5.010001".
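
The arithmetic can be sketched with C<sprintf>; the three components
below are those of the Perl v5.10.1 example:

    use strict;
    use warnings;

    my ($revision, $version, $subversion) = (5, 10, 1);
    my $decimal = sprintf '%d.%03d%03d',
        $revision, $version, $subversion;

    print "$decimal\n";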

This variable can be used to determine whether the Perl interpreter
executing a script is in the right range of versions:

    warn "No PerlIO!\n" if $] lt '5.008';

When comparing C<$]>, string comparison operators are B<highly
recommended>.  The inherent limitations of binary floating point
representation can sometimes lead to incorrect comparisons for some
numbers on some architectures.

See also the documentation of C<use VERSION> and C<require VERSION>
for a convenient way to fail if the running Perl interpreter is too old.

See L</$^V> for a representation of the Perl version as a L<version>
object, which allows more flexible string comparisons.

The main advantage of C<$]> over C<$^V> is that it works the same on any
version of Perl.  The disadvantages are that it can't easily be compared
to versions in other formats (e.g. literal v-strings, "v1.2.3" or
version objects) and numeric comparisons can occasionally fail; it's good
for string literal version checks and bad for comparing to a variable
that hasn't been sanity-checked.

The C<$OLD_PERL_VERSION> form was added in Perl v5.20.0 for historical
reasons but its use is discouraged. (If your reason to use C<$]> is to
run code on old perls then referring to it as C<$OLD_PERL_VERSION> would
be self-defeating.)

Mnemonic: Is this version of perl in the right bracket?

=item $SYSTEM_FD_MAX

=item $^F
X<$^F> X<$SYSTEM_FD_MAX>

The maximum system file descriptor, ordinarily 2.  System file
descriptors are passed to C<exec()>ed processes, while higher file
descriptors are not.  Also, during an
C<open()>, system file descriptors are
preserved even if the C<open()> fails (ordinary file descriptors are
closed before the C<open()> is attempted).  The close-on-exec
status of a file descriptor will be decided according to the value of
C<$^F> when the corresponding file, pipe, or socket was opened, not the
time of the C<exec()>.

=item @F
X<@F>

The array C<@F> contains the fields of each line read in when autosplit
mode is turned on.  See L<perlrun> for the B<-a> switch.  This array
is package-specific, and must be declared or given a full package name
if not in package main when running under C<strict 'vars'>.

=item @INC
X<@INC>

The array C<@INC> contains the list of places that the C<do EXPR>,
C<require>, or C<use> constructs look for their library files.  It
initially consists of the arguments to any B<-I> command-line
switches, followed by the default Perl library, probably
F</usr/local/lib/perl>, followed by ".", to represent the current
directory.  ("." will not be appended if taint checks are enabled,
either by C<-T> or by C<-t>, or if configured not to do so by the
C<-Ddefault_inc_excludes_dot> compile time option.)  If you need to
modify this at runtime, you should use the C<use lib> pragma to get
the machine-dependent library properly loaded also:

    use lib '/mypath/libdir/';
    use SomeMod;

You can also insert hooks into the file inclusion system by putting Perl
code directly into C<@INC>.  Those hooks may be subroutine references,
array references or blessed objects.  See L<perlfunc/require> for details.

=item %INC
X<%INC>

The hash C<%INC> contains entries for each filename included via the
C<do>, C<require>, or C<use> operators.  The key is the filename
you specified (with module names converted to pathnames), and the
value is the location of the file found.  The C<require>
operator uses this hash to determine whether a particular file has
already been included.

If the file was loaded via a hook (e.g. a subroutine reference, see
L<perlfunc/require> for a description of these hooks), this hook is
by default inserted into C<%INC> in place of a filename.  Note, however,
that the hook may have set the C<%INC> entry by itself to provide some more
specific info.
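
A minimal sketch of a subroutine-reference hook, assuming a made-up
module C<My::Virtual> whose source is held in a string rather than in
a file:

    use strict;
    use warnings;

    unshift @INC, sub {
        my ($self, $filename) = @_;
        return unless $filename eq 'My/Virtual.pm';   # not ours: decline

        my $source = "package My::Virtual; sub hello { 'hi' } 1;";
        open my $fh, '<', \$source or die "in-memory open failed: $!";
        return $fh;                     # perl reads the module from here
    };

    require My::Virtual;
    print My::Virtual->hello, "\n";

Since this hook does not set C<$INC{'My/Virtual.pm'}> itself, that
entry holds the hook's code reference after the C<require>.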

=item $INPLACE_EDIT

=item $^I
X<$^I> X<$INPLACE_EDIT>

The current value of the inplace-edit extension.  Use C<undef> to disable
inplace editing.

Mnemonic: value of B<-i> switch.
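
Setting C<$^I> from inside a program gives the effect of the B<-i>
switch for a limited scope.  The sketch below edits a scratch file
created with L<File::Temp>; the file contents and the C<.bak> backup
suffix are arbitrary choices:

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Create a scratch file to edit.
    my ($tmp, $file) = tempfile();
    print {$tmp} "foo one\nfoo two\n";
    close $tmp or die "close failed: $!";

    {
        # Roughly: perl -i.bak -pe 's/foo/bar/' $file
        local ($^I, @ARGV) = ('.bak', $file);
        while (<>) {
            s/foo/bar/;
            print;              # goes to the rewritten file, not STDOUT
        }
    }

    open my $in, '<', $file or die "cannot read $file: $!";
    my $edited = do { local $/; <$in> };
    close $in;
    unlink $file, "$file.bak";

Localizing C<$^I> and C<@ARGV> together keeps the inplace-edit setting
from leaking into later uses of C<< <> >>.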

=item @ISA
X<@ISA>

Each package contains a special array called C<@ISA> which contains a list
of that class's parent classes, if any. This array is simply a list of
scalars, each of which is a string that corresponds to a package name. The
array is examined when Perl does method resolution, which is covered in
L<perlobj>.

To load packages while adding them to C<@ISA>, see the L<parent> pragma. The
discouraged L<base> pragma does this as well, but should not be used except
when compatibility with the discouraged L<fields> pragma is required.
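
A small sketch of setting C<@ISA> by hand, using the hypothetical
classes C<Animal> and C<Dog> defined in the same file:

    use strict;
    use warnings;

    package Animal;
    sub new   { my $class = shift; return bless {}, $class }
    sub speak { return 'generic noise' }

    package Dog;
    our @ISA = ('Animal');      # Dog inherits from Animal

    package main;
    my $dog = Dog->new;
    print ref($dog), ' says ', $dog->speak, "\n";

Neither C<new> nor C<speak> is defined in C<Dog>; method resolution
finds both through C<@ISA>.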

=item $^M
X<$^M>

By default, running out of memory is an untrappable, fatal error.
However, if suitably built, Perl can use the contents of C<$^M>
as an emergency memory pool after C<die()>ing.  Suppose that your Perl
were compiled with C<-DPERL_EMERGENCY_SBRK> and used Perl's malloc.
Then

    $^M = 'a' x (1 << 16);

would allocate a 64K buffer for use in an emergency.  See the
F<INSTALL> file in the Perl distribution for information on how to
add custom C compilation flags when compiling perl.  To discourage casual
use of this advanced feature, there is no L<English|English> long name for
this variable.

This variable was added in Perl 5.004.

=item $OSNAME

=item $^O
X<$^O> X<$OSNAME>

The name of the operating system under which this copy of Perl was
built, as determined during the configuration process.  For examples
see L<perlport/PLATFORMS>.

The value is identical to C<$Config{'osname'}>.  See also L<Config>
and the B<-V> command-line switch documented in L<perlrun>.

In Windows platforms, C<$^O> is not very helpful: since it is always
C<MSWin32>, it doesn't tell the difference between
95/98/ME/NT/2000/XP/CE/.NET.  Use C<Win32::GetOSName()> or
C<Win32::GetOSVersion()> (see L<Win32> and L<perlport>) to distinguish
between the variants.

This variable was added in Perl 5.003.

=item %SIG
X<%SIG>

The hash C<%SIG> contains signal handlers for signals.  For example:

    sub handler {   # 1st argument is signal name
	my($sig) = @_;
	print "Caught a SIG$sig--shutting down\n";
	close(LOG);
	exit(0);
	}

    $SIG{'INT'}  = \&handler;
    $SIG{'QUIT'} = \&handler;
    ...
    $SIG{'INT'}  = 'DEFAULT';   # restore default action
    $SIG{'QUIT'} = 'IGNORE';    # ignore SIGQUIT

Using a value of C<'IGNORE'> usually has the effect of ignoring the
signal, except for the C<CHLD> signal.  See L<perlipc> for more about
this special case.

Here are some other examples:

    $SIG{"PIPE"} = "Plumber";   # assumes main::Plumber (not
				# recommended)
    $SIG{"PIPE"} = \&Plumber;   # just fine; assume current
				# Plumber
    $SIG{"PIPE"} = *Plumber;    # somewhat esoteric
    $SIG{"PIPE"} = Plumber();   # oops, what did Plumber()
				# return??

Be sure not to use a bareword as the name of a signal handler,
lest you inadvertently call it.

If your system has the C<sigaction()> function then signal handlers
are installed using it.  This means you get reliable signal handling.

The default delivery policy of signals changed in Perl v5.8.0 from
immediate (also known as "unsafe") to deferred, also known as "safe
signals".  See L<perlipc> for more information.

Certain internal hooks can also be set using the C<%SIG> hash.  The
routine indicated by C<$SIG{__WARN__}> is called when a warning
message is about to be printed.  The warning message is passed as the
first argument.  The presence of a C<__WARN__> hook causes the
ordinary printing of warnings to C<STDERR> to be suppressed.  You can
use this to save warnings in a variable, or turn warnings into fatal
errors, like this:

    local $SIG{__WARN__} = sub { die $_[0] };
    eval $proggie;

As the C<'IGNORE'> hook is not supported by C<__WARN__>, you can
disable warnings using the empty subroutine:

    local $SIG{__WARN__} = sub {};
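
To save warnings in a variable instead, a sketch along these lines may
serve; the array name C<@captured> and the message text are made up:

    use strict;
    use warnings;

    my @captured;
    {
        local $SIG{__WARN__} = sub { push @captured, $_[0] };
        warn "disk is nearly full\n";   # stored, not printed
    }

    print "captured: $_" for @captured;

Because the hook is installed with C<local>, ordinary printing of
warnings resumes when the block exits.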

The routine indicated by C<$SIG{__DIE__}> is called when a fatal
exception is about to be thrown.  The error message is passed as the
first argument.  When a C<__DIE__> hook routine returns, the exception
processing continues as it would have in the absence of the hook,
unless the hook routine itself exits via a C<goto &sub>, a loop exit,
or a C<die()>.  The C<__DIE__> handler is explicitly disabled during
the call, so that you can die from a C<__DIE__> handler.  Similarly
for C<__WARN__>.

The C<$SIG{__DIE__}> hook is called even inside an C<eval()>. It was
never intended to happen this way, but an implementation glitch made
this possible. This used to be deprecated, as it allowed strange action
at a distance like rewriting a pending exception in C<$@>. Plans to
rectify this have been scrapped, as users found that rewriting a
pending exception is actually a useful feature, and not a bug.

C<__DIE__>/C<__WARN__> handlers are very special in one respect: they
may be called to report (probable) errors found by the parser.  In such
a case the parser may be in an inconsistent state, so any attempt to
evaluate Perl code from such a handler will probably result in a
segfault.  This means that handlers for warnings or errors that result
from parsing Perl should be written with extreme caution, like this:

    require Carp if defined $^S;
    Carp::confess("Something wrong") if defined &Carp::confess;
    die "Something wrong, but could not load Carp to give "
      . "backtrace...\n\t"
      . "To see backtrace try starting Perl with -MCarp switch";

Here the first line will load C<Carp> I<unless> it is the parser who
called the handler.  The second line will print a backtrace and die if
C<Carp> was available.  The third line will be executed only if C<Carp> was
not available.

Having to even think about the C<$^S> variable in your exception
handlers is simply wrong.  C<$SIG{__DIE__}> as currently implemented
invites grievous and difficult to track down errors.  Avoid it
and use an C<END{}> or C<CORE::GLOBAL::die> override instead.

See L<perlfunc/die>, L<perlfunc/warn>, L<perlfunc/eval>, and
L<warnings> for additional information.

=item $BASETIME

=item $^T
X<$^T> X<$BASETIME>

The time at which the program began running, in seconds since the
epoch (beginning of 1970).  The values returned by the B<-M>, B<-A>,
and B<-C> filetests are based on this value.
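
A brief sketch of using C<$^T>; the variable names are illustrative:

    use strict;
    use warnings;

    my $elapsed    = time - $^T;            # seconds since startup
    my $started_at = scalar localtime $^T;  # human-readable start time
    my $script_age = -M $0;     # in days; undef if $0 is not a file

    print "started at $started_at, running for $elapsed second(s)\n";

Since B<-M> is measured from C<$^T> rather than from the current time,
a file modified after the program started has a negative age.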

=item $PERL_VERSION

=item $^V
X<$^V> X<$PERL_VERSION>

The revision, version, and subversion of the Perl interpreter,
represented as a L<version> object.

This variable first appeared in perl v5.6.0; earlier versions of perl
will see an undefined value.  Before perl v5.10.0 C<$^V> was represented
as a v-string rather than a L<version> object.

C<$^V> can be used to determine whether the Perl interpreter executing
a script is in the right range of versions.  For example:

    warn "Hashes not randomized!\n" if !$^V or $^V lt v5.8.1;

While version objects overload stringification, to portably convert
C<$^V> into its string representation, use C<sprintf()>'s C<"%vd">
conversion, which works for both v-strings or version objects:

    printf "version is v%vd\n", $^V;  # Perl's version

See the documentation of C<use VERSION> and C<require VERSION>
for a convenient way to fail if the running Perl interpreter is too old.

See also C<L</$]>> for a decimal representation of the Perl version.

The main advantage of C<$^V> over C<$]> is that, for Perl v5.10.0 or
later, it overloads operators, allowing easy comparison against other
version representations (e.g. decimal, literal v-string, "v1.2.3", or
objects).  The disadvantage is that prior to v5.10.0, it was only a
literal v-string, which can't be easily printed or compared, whereas
the behavior of C<$]> is unchanged on all versions of Perl.
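
The two variables can be compared side by side.  This sketch assumes
Perl v5.10.0 or later, where C<$^V> is a version object:

    use strict;
    use warnings;

    my $running = sprintf 'v%vd', $^V;          # dotted version string
    my $by_obj  = $^V ge v5.10.0 ? 1 : 0;       # check through $^V
    my $by_num  = $]  ge '5.010' ? 1 : 0;       # same check through $]

    print "running $running\n";

Both checks should give the same answer on any such perl.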

Mnemonic: use ^V for a version object.

=item ${^WIN32_SLOPPY_STAT}
X<${^WIN32_SLOPPY_STAT}> X<sitecustomize> X<sitecustomize.pl>

If this variable is set to a true value, then C<stat()> on Windows will
not try to open the file.  This means that the link count cannot be
determined and file attributes may be out of date if additional
hardlinks to the file exist.  On the other hand, not opening the file
is considerably faster, especially for files on network drives.

This variable could be set in the F<sitecustomize.pl> file to
configure the local Perl installation to use "sloppy" C<stat()> by
default.  See the documentation for B<-f> in
L<perlrun|perlrun/"Command Switches"> for more information about site
customization.

This variable was added in Perl v5.10.0.

=item $EXECUTABLE_NAME

=item $^X
X<$^X> X<$EXECUTABLE_NAME>

The name used to execute the current copy of Perl, from C's
C<argv[0]> or (where supported) F</proc/self/exe>.

Depending on the host operating system, the value of C<$^X> may be
a relative or absolute pathname of the perl program file, or may
be the string used to invoke perl but not the pathname of the
perl program file.  Also, most operating systems permit invoking
programs that are not in the PATH environment variable, so there
is no guarantee that the value of C<$^X> is in PATH.  For VMS, the
value may or may not include a version number.

You usually can use the value of C<$^X> to re-invoke an independent
copy of the same perl that is currently running, e.g.,

    @first_run = `$^X -le "print int rand 100 for 1..100"`;

But recall that not all operating systems support forking or
capturing of the output of commands, so this complex statement
may not be portable.

It is not safe to use the value of C<$^X> as a path name of a file,
as some operating systems that have a mandatory suffix on
executable files do not require use of the suffix when invoking
a command.  To convert the value of C<$^X> to a path name, use the
following statements:

    # Build up a set of file names (not command names).
    use Config;
    my $this_perl = $^X;
    if ($^O ne 'VMS') {
	$this_perl .= $Config{_exe}
	  unless $this_perl =~ m/$Config{_exe}$/i;
	}

Because many operating systems permit anyone with read access to
the Perl program file to make a copy of it, patch the copy, and
then execute the copy, the security-conscious Perl programmer
should take care to invoke the installed copy of perl, not the
copy referenced by C<$^X>.  The following statements accomplish
this goal, and produce a pathname that can be invoked as a
command or referenced as a file.

    use Config;
    my $secure_perl_path = $Config{perlpath};
    if ($^O ne 'VMS') {
	$secure_perl_path .= $Config{_exe}
	    unless $secure_perl_path =~ m/$Config{_exe}$/i;
	}

=back

=head2 Variables related to regular expressions

Most of the special variables related to regular expressions are side
effects.  Perl sets these variables when it has a successful match, so
you should check the match result before using them.  For instance:

    if( /P(A)TT(ER)N/ ) {
	print "I found $1 and $2\n";
	}

These variables are read-only and dynamically-scoped, unless we note
otherwise.

The dynamic nature of the regular expression variables means that
their value is limited to the block that they are in, as demonstrated
by this bit of code:

    my $outer = 'Wallace and Grommit';
    my $inner = 'Mutt and Jeff';

    my $pattern = qr/(\S+) and (\S+)/;

    sub show_n { print "\$1 is $1; \$2 is $2\n" }

    {
    OUTER:
	show_n() if $outer =~ m/$pattern/;

	INNER: {
	    show_n() if $inner =~ m/$pattern/;
	    }

	show_n();
    }

The output shows that while in the C<OUTER> block, the values of C<$1>
and C<$2> are from the match against C<$outer>.  Inside the C<INNER>
block, the values of C<$1> and C<$2> are from the match against
C<$inner>, but only until the end of the block (i.e. the dynamic
scope).  After the C<INNER> block completes, the values of C<$1> and
C<$2> return to the values for the match against C<$outer> even though
we have not made another match:

    $1 is Wallace; $2 is Grommit
    $1 is Mutt; $2 is Jeff
    $1 is Wallace; $2 is Grommit

=head3 Performance issues

Traditionally in Perl, any use of any of the three variables C<$`>, C<$&>
or C<$'> (or their C<use English> equivalents) anywhere in the code caused
all subsequent successful pattern matches to make a copy of the matched
string, in case the code might subsequently access one of those variables.
This imposed a considerable performance penalty across the whole program,
so generally the use of these variables has been discouraged.

In Perl 5.6.0 the C<@-> and C<@+> dynamic arrays were introduced that
supply the indices of successful matches. So you could for example do
this:

    $str =~ /pattern/;

    print $`, $&, $'; # bad: performance hit

    print             # good: no performance hit
	substr($str, 0,     $-[0]),
	substr($str, $-[0], $+[0]-$-[0]),
	substr($str, $+[0]);

In Perl 5.10.0 the C</p> match operator flag and the C<${^PREMATCH}>,
C<${^MATCH}>, and C<${^POSTMATCH}> variables were introduced, which allowed
you to suffer the penalties only on patterns marked with C</p>.

In Perl 5.18.0 onwards, perl started noting the presence of each of the
three variables separately, and only copied that part of the string
required; so in

    $`; $&; "abcdefgh" =~ /d/

perl would only copy the "abcd" part of the string. That could make a big
difference in something like

    $str = 'x' x 1_000_000;
    $&; # whoops
    $str =~ /x/g # one char copied a million times, not a million chars

In Perl 5.20.0 a new copy-on-write system was enabled by default, which
finally fixes all performance issues with these three variables, and makes
them safe to use anywhere.

The C<Devel::NYTProf> and C<Devel::FindAmpersand> modules can help you
find uses of these problematic match variables in your code.

=over 8

=item $<I<digits>> ($1, $2, ...)
X<$1> X<$2> X<$3> X<$I<digits>>

Contains the subpattern from the corresponding set of capturing
parentheses from the last successful pattern match, not counting patterns
matched in nested blocks that have been exited already.

Note there is a distinction between a capture buffer which matches
the empty string and a capture buffer which is optional, e.g. C<(x?)>
and C<(x)?>.  The latter may be undef, the former not.

These variables are read-only and dynamically-scoped.

Mnemonic: like \digits.
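
The difference between the two kinds of capture buffer can be seen in
this sketch, which matches both patterns against the string C<"y">:

    use strict;
    use warnings;

    my ($empty)   = 'y' =~ /(x?)y/;     # group matched the empty string
    my ($missing) = 'y' =~ /(x)?y/;     # group did not take part

    print defined $empty   ? "defined\n" : "undef\n";
    print defined $missing ? "defined\n" : "undef\n";

In list context a match returns the captured values, so C<$empty> and
C<$missing> hold what C<$1> held after each match.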

=item @{^CAPTURE}
X<@{^CAPTURE}> X<@^CAPTURE>

An array which exposes the contents of the capture buffers, if any, of
the last successful pattern match, not counting patterns matched
in nested blocks that have been exited already.

Note that the 0 index of C<@{^CAPTURE}> is equivalent to C<$1>, the 1 index
is equivalent to C<$2>, etc.

    if ("foal"=~/(.)(.)(.)(.)/) {
        print join "-", @{^CAPTURE};
    }

should output "f-o-a-l".

See also L</$I<digits>>, L</%{^CAPTURE}> and L</%{^CAPTURE_ALL}>.

Note that unlike most other regex magic variables there is no single
letter equivalent to C<@{^CAPTURE}>.

This variable was added in Perl v5.25.7.

=item $MATCH

=item $&
X<$&> X<$MATCH>

The string matched by the last successful pattern match (not counting
any matches hidden within a BLOCK or C<eval()> enclosed by the current
BLOCK).

See L</Performance issues> above for the serious performance implications
of using this variable (even once) in your code.

This variable is read-only and dynamically-scoped.

Mnemonic: like C<&> in some editors.

=item ${^MATCH}
X<${^MATCH}>

This is similar to C<$&> (C<$MATCH>) except that it does not incur the
performance penalty associated with that variable.

See L</Performance issues> above.

In Perl v5.18 and earlier, it is only guaranteed
to return a defined value when the pattern was compiled or executed with
the C</p> modifier.  In Perl v5.20, the C</p> modifier does nothing, so
C<${^MATCH}> does the same thing as C<$MATCH>.

This variable was added in Perl v5.10.0.

This variable is read-only and dynamically-scoped.

=item $PREMATCH

=item $`
X<$`> X<$PREMATCH> X<${^PREMATCH}>

The string preceding whatever was matched by the last successful
pattern match, not counting any matches hidden within a BLOCK or C<eval>
enclosed by the current BLOCK.

See L</Performance issues> above for the serious performance implications
of using this variable (even once) in your code.

This variable is read-only and dynamically-scoped.

Mnemonic: C<`> often precedes a quoted string.

=item ${^PREMATCH}
X<$`> X<${^PREMATCH}>

This is similar to C<$`> (C<$PREMATCH>) except that it does not incur the
performance penalty associated with that variable.

See L</Performance issues> above.

In Perl v5.18 and earlier, it is only guaranteed
to return a defined value when the pattern was compiled or executed with
the C</p> modifier.  In Perl v5.20, the C</p> modifier does nothing, so
C<${^PREMATCH}> does the same thing as C<$PREMATCH>.

This variable was added in Perl v5.10.0.

This variable is read-only and dynamically-scoped.

=item $POSTMATCH

=item $'
X<$'> X<$POSTMATCH> X<${^POSTMATCH}> X<@->

The string following whatever was matched by the last successful
pattern match (not counting any matches hidden within a BLOCK or C<eval()>
enclosed by the current BLOCK).  Example:

    local $_ = 'abcdefghi';
    /def/;
    print "$`:$&:$'\n";  	# prints abc:def:ghi

See L</Performance issues> above for the serious performance implications
of using this variable (even once) in your code.

This variable is read-only and dynamically-scoped.

Mnemonic: C<'> often follows a quoted string.

=item ${^POSTMATCH}
X<${^POSTMATCH}> X<$'> X<$POSTMATCH>

This is similar to C<$'> (C<$POSTMATCH>) except that it does not incur the
performance penalty associated with that variable.

See L</Performance issues> above.

In Perl v5.18 and earlier, it is only guaranteed
to return a defined value when the pattern was compiled or executed with
the C</p> modifier.  In Perl v5.20, the C</p> modifier does nothing, so
C<${^POSTMATCH}> does the same thing as C<$POSTMATCH>.

This variable was added in Perl v5.10.0.

This variable is read-only and dynamically-scoped.
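
A brief sketch of the three C<${^...}> match variables used together.
The sample text is arbitrary, and C</p> is kept so that the code should
also behave on perls older than v5.20:

    my $text = "Hello, world";
    my @parts;
    if ($text =~ /wor/p) {
        @parts = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
    }
    print join("|", @parts), "\n";    # Hello, |wor|ld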

=item $LAST_PAREN_MATCH

=item $+
X<$+> X<$LAST_PAREN_MATCH>

The text matched by the last bracket of the last successful search pattern.
This is useful if you don't know which one of a set of alternative patterns
matched.  For example:

    /Version: (.*)|Revision: (.*)/ && ($rev = $+);

This variable is read-only and dynamically-scoped.

Mnemonic: be positive and forward looking.

=item $LAST_SUBMATCH_RESULT

=item $^N
X<$^N> X<$LAST_SUBMATCH_RESULT>

The text matched by the used group most-recently closed (i.e. the group
with the rightmost closing parenthesis) of the last successful search
pattern.

This is primarily used inside C<(?{...})> blocks for examining text
recently matched.  For example, to effectively capture text to a variable
(in addition to C<$1>, C<$2>, etc.), replace C<(...)> with

    (?:(...)(?{ $var = $^N }))

Setting and then using C<$var> in this way relieves you from having to
worry about exactly which numbered set of parentheses it is.

This variable was added in Perl v5.8.0.

Mnemonic: the (possibly) Nested parenthesis that most recently closed.

=item @LAST_MATCH_END

=item @+
X<@+> X<@LAST_MATCH_END>

This array holds the offsets of the ends of the last successful
submatches in the currently active dynamic scope.  C<$+[0]> is
the offset into the string of the end of the entire match.  This
is the same value as what the C<pos> function returns when called
on the variable that was matched against.  The I<n>th element
of this array holds the offset of the I<n>th submatch, so
C<$+[1]> is the offset past where C<$1> ends, C<$+[2]> the offset
past where C<$2> ends, and so on.  You can use C<$#+> to determine
how many subgroups were in the last successful match.  See the
examples given for the C<@-> variable.

This variable was added in Perl v5.6.0.

=item %{^CAPTURE}

=item %LAST_PAREN_MATCH

=item %+
X<%+> X<%LAST_PAREN_MATCH> X<%{^CAPTURE}>

Similar to C<@+>, the C<%+> hash allows access to the named capture
buffers, should they exist, in the last successful match in the
currently active dynamic scope.

For example, C<$+{foo}> is equivalent to C<$1> after the following match:

    'foo' =~ /(?<foo>foo)/;

The keys of the C<%+> hash list only the names of buffers that have
captured (and that are thus associated to defined values).

The underlying behaviour of C<%+> is provided by the
L<Tie::Hash::NamedCapture> module.

B<Note:> C<%-> and C<%+> are tied views into a common internal hash
associated with the last successful regular expression.  Therefore mixing
iterative access to them via C<each> may have unpredictable results.
Likewise, if the last successful match changes, then the results may be
surprising.

This variable was added in Perl v5.10.0. The C<%{^CAPTURE}> alias was
added in 5.25.7.

This variable is read-only and dynamically-scoped.
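
Given the note above about tied views, one cautious approach (a sketch
only; C<%date> is an illustrative name) is to copy C<%+> into an
ordinary hash straight after the match:

    my %date;
    if ("2024-05-17" =~ /(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)/) {
        %date = %+;    # plain copy, unaffected by later matches
    }
    "unrelated" =~ /(?<year>un)/;    # a later successful match
    print "$date{day}/$date{month}/$date{year}\n";    # 17/05/2024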

=item @LAST_MATCH_START

=item @-
X<@-> X<@LAST_MATCH_START>

C<$-[0]> is the offset of the start of the last successful match.
C<$-[>I<n>C<]> is the offset of the start of the substring matched by
I<n>-th subpattern, or undef if the subpattern did not match.

Thus, after a match against C<$_>, C<$&> coincides with C<substr $_, $-[0],
$+[0] - $-[0]>.  Similarly, $I<n> coincides with C<substr $_, $-[n],
$+[n] - $-[n]> if C<$-[n]> is defined, and $+ coincides with
C<substr $_, $-[$#-], $+[$#-] - $-[$#-]>.  One can use C<$#-> to find the
last matched subgroup in the last successful match.  Contrast with
C<$#+>, the number of subgroups in the regular expression.  Compare
with C<@+>.

This array holds the offsets of the beginnings of the last
successful submatches in the currently active dynamic scope.
C<$-[0]> is the offset into the string of the beginning of the
entire match.  The I<n>th element of this array holds the offset
of the I<n>th submatch, so C<$-[1]> is the offset where C<$1>
begins, C<$-[2]> the offset where C<$2> begins, and so on.

After a match against some variable C<$var>:

=over 5

=item C<$`> is the same as C<substr($var, 0, $-[0])>

=item C<$&> is the same as C<substr($var, $-[0], $+[0] - $-[0])>

=item C<$'> is the same as C<substr($var, $+[0])>

=item C<$1> is the same as C<substr($var, $-[1], $+[1] - $-[1])>

=item C<$2> is the same as C<substr($var, $-[2], $+[2] - $-[2])>

=item C<$3> is the same as C<substr($var, $-[3], $+[3] - $-[3])>

=back

This variable was added in Perl v5.6.0.
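
To make the offsets concrete, here is a short sketch (the sample string
is arbitrary) that rebuilds C<$2> from C<@-> and C<@+>:

    my $var = "The quick brown fox";
    my ($start, $end, $second);
    if ($var =~ /(quick)\s+(brown)/) {
        ($start, $end) = ($-[0], $+[0]);    # 4 and 15
        $second = substr($var, $-[2], $+[2] - $-[2]);
    }
    print "$start-$end: $second\n";    # 4-15: brown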

=item %{^CAPTURE_ALL}
X<%{^CAPTURE_ALL}>

=item %-
X<%->

Similar to C<%+>, this variable allows access to the named capture groups
in the last successful match in the currently active dynamic scope.  To
each capture group name found in the regular expression, it associates a
reference to an array containing the list of values captured by all
buffers with that name (should there be several of them), in the order
in which they appear.

Here's an example:

    if ('1234' =~ /(?<A>1)(?<B>2)(?<A>3)(?<B>4)/) {
        foreach my $bufname (sort keys %-) {
            my $ary = $-{$bufname};
            foreach my $idx (0..$#$ary) {
                print "\$-{$bufname}[$idx] : ",
                      (defined($ary->[$idx])
                          ? "'$ary->[$idx]'"
                          : "undef"),
                      "\n";
            }
        }
    }

would print out:

    $-{A}[0] : '1'
    $-{A}[1] : '3'
    $-{B}[0] : '2'
    $-{B}[1] : '4'

The keys of the C<%-> hash correspond to all buffer names found in
the regular expression.

The behaviour of C<%-> is implemented via the
L<Tie::Hash::NamedCapture> module.

B<Note:> C<%-> and C<%+> are tied views into a common internal hash
associated with the last successful regular expression.  Therefore mixing
iterative access to them via C<each> may have unpredictable results.
Likewise, if the last successful match changes, then the results may be
surprising.

This variable was added in Perl v5.10.0. The C<%{^CAPTURE_ALL}> alias was
added in 5.25.7.

This variable is read-only and dynamically-scoped.

=item $LAST_REGEXP_CODE_RESULT

=item $^R
X<$^R> X<$LAST_REGEXP_CODE_RESULT>

The result of evaluation of the last successful C<(?{ code })>
regular expression assertion (see L<perlre>).  May be written to.

This variable was added in Perl 5.005.

=item ${^RE_DEBUG_FLAGS}
X<${^RE_DEBUG_FLAGS}>

The current value of the regex debugging flags.  Set to 0 for no debug output
even when the C<re 'debug'> module is loaded.  See L<re> for details.

This variable was added in Perl v5.10.0.

=item ${^RE_TRIE_MAXBUF}
X<${^RE_TRIE_MAXBUF}>

Controls how certain regex optimisations are applied and how much memory they
utilize.  This value by default is 65536 which corresponds to a 512kB
temporary cache.  Set this to a higher value to trade
memory for speed when matching large alternations.  Set
it to a lower value if you want the optimisations to
be as conservative of memory as possible but still occur, and set it to a
negative value to prevent the optimisation and conserve the most memory.
Under normal situations this variable should be of no interest to you.

This variable was added in Perl v5.10.0.

=back

=head2 Variables related to filehandles

Variables that depend on the currently selected filehandle may be set
by calling an appropriate object method on the C<IO::Handle> object,
although this is less efficient than using the regular built-in
variables.  (Summary lines below for this contain the word HANDLE.)
First you must say

    use IO::Handle;

after which you may use either

    method HANDLE EXPR

or more safely,

    HANDLE->method(EXPR)

Each method returns the old value of the C<IO::Handle> attribute.  The
methods each take an optional EXPR, which, if supplied, specifies the
new value for the C<IO::Handle> attribute in question.  If not
supplied, most methods do nothing to the current value--except for
C<autoflush()>, which will assume a 1 for you, just to be different.

Because loading in the C<IO::Handle> class is an expensive operation,
you should learn how to use the regular built-in variables.
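
For comparison, a minimal sketch of the two spellings for autoflush on
C<STDOUT> (C<$old> and C<$now> are illustrative names).  It assumes
C<STDOUT> is the currently selected handle, which is the default:

    use IO::Handle;

    my $old = STDOUT->autoflush(1);    # method form; returns the old value
    my $now = $|;                      # built-in view of the same attribute
    print "was $old, now $now\n";

The method form reads more clearly when several handles are involved;
the built-in form avoids loading C<IO::Handle>.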

A few of these variables are considered "read-only".  This means that
if you try to assign to this variable, either directly or indirectly
through a reference, you'll raise a run-time exception.

You should be very careful when modifying the default values of most
special variables described in this document.  In most cases you want
to localize these variables before changing them, since if you don't,
the change may affect other modules which rely on the default values
of the special variables that you have changed.  This is one of the
correct ways to read the whole file at once:

    open my $fh, "<", "foo" or die $!;
    local $/; # enable localized slurp mode
    my $content = <$fh>;
    close $fh;

But the following code is quite bad:

    open my $fh, "<", "foo" or die $!;
    undef $/; # enable slurp mode
    my $content = <$fh>;
    close $fh;

since some other module may want to read data from some file in the
default "line mode"; once the code we have just presented has been
executed, the global value of C<$/> is changed for any other code
running inside the same Perl interpreter.

Usually when a variable is localized you want to make sure that this
change affects the shortest scope possible.  So unless you are already
inside some short C<{}> block, you should create one yourself.  For
example:

    my $content = '';
    open my $fh, "<", "foo" or die $!;
    {
	local $/;
	$content = <$fh>;
    }
    close $fh;

Here is an example of how your own code can break:

    for ( 1..3 ){
	$\ = "\r\n";
	nasty_break();
	print "$_";
    }

    sub nasty_break {
	$\ = "\f";
	# do something with $_
    }

You probably expect this code to print the equivalent of

    "1\r\n2\r\n3\r\n"

but instead you get:

    "1\f2\f3\f"

Why? Because C<nasty_break()> modifies C<$\> without localizing it
first.  The value you set in C<nasty_break()> is still there when you
return.  The fix is to add C<local()> so the value doesn't leak out of
C<nasty_break()>:

    local $\ = "\f";

It's easy to notice the problem in such a short example, but in more
complicated code you are looking for trouble if you don't localize
changes to the special variables.
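
Putting the fix together, a sketch along these lines keeps each change
confined to its own scope.  Here C<polite_break()> is a hypothetical
replacement for C<nasty_break()>, and the output goes to an in-memory
filehandle only so that it can be inspected:

    my $out = '';
    open my $fh, '>', \$out or die $!;

    sub polite_break {
        local $\ = "\f";    # restored when the sub returns
        # do something with $_
    }

    {
        local $\ = "\r\n";
        for (1 .. 3) {
            polite_break();
            print {$fh} $_;
        }
    }
    close $fh;
    # $out now holds "1\r\n2\r\n3\r\n"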

=over 8

=item $ARGV
X<$ARGV>

Contains the name of the current file when reading from C<< <> >>.

=item @ARGV
X<@ARGV>

The array C<@ARGV> contains the command-line arguments intended for
the script.  C<$#ARGV> is generally the number of arguments minus
one, because C<$ARGV[0]> is the first argument, I<not> the program's
command name itself.  See L</$0> for the command name.

=item ARGV
X<ARGV>

The special filehandle that iterates over command-line filenames in
C<@ARGV>.  Usually written as the null filehandle in the angle operator
C<< <> >>.  Note that currently C<ARGV> only has its magical effect
within the C<< <> >> operator; elsewhere it is just a plain filehandle
corresponding to the last file opened by C<< <> >>.  In particular,
passing C<\*ARGV> as a parameter to a function that expects a filehandle
may not cause your function to automatically read the contents of all the
files in C<@ARGV>.

=item ARGVOUT
X<ARGVOUT>

The special filehandle that points to the currently open output file
when doing edit-in-place processing with B<-i>.  Useful when you have
to do a lot of inserting and don't want to keep modifying C<$_>.  See
L<perlrun> for the B<-i> switch.

=item IO::Handle->output_field_separator( EXPR )

=item $OUTPUT_FIELD_SEPARATOR

=item $OFS

=item $,
X<$,> X<$OFS> X<$OUTPUT_FIELD_SEPARATOR>

The output field separator for the print operator.  If defined, this
value is printed between each of print's arguments.  Default is C<undef>.

You cannot call C<output_field_separator()> on a handle, only as a
static method.  See L<IO::Handle|IO::Handle>.

Mnemonic: what is printed when there is a "," in your print statement.

=item HANDLE->input_line_number( EXPR )

=item $INPUT_LINE_NUMBER

=item $NR

=item $.
X<$.> X<$NR> X<$INPUT_LINE_NUMBER> X<line number>

Current line number for the last filehandle accessed.

Each filehandle in Perl counts the number of lines that have been read
from it.  (Depending on the value of C<$/>, Perl's idea of what
constitutes a line may not match yours.)  When a line is read from a
filehandle (via C<readline()> or C<< <> >>), or when C<tell()> or
C<seek()> is called on it, C<$.> becomes an alias to the line counter
for that filehandle.

You can adjust the counter by assigning to C<$.>, but this will not
actually move the seek pointer.  I<Localizing C<$.> will not localize
the filehandle's line count>.  Instead, it will localize perl's notion
of which filehandle C<$.> is currently aliased to.

C<$.> is reset when the filehandle is closed, but B<not> when an open
filehandle is reopened without an intervening C<close()>.  For more
details, see L<perlop/"IE<sol>O Operators">.  Because C<< <> >> never does
an explicit close, line numbers increase across C<ARGV> files (but see
examples in L<perlfunc/eof>).

You can also use C<< HANDLE->input_line_number(EXPR) >> to access the
line counter for a given filehandle without having to worry about
which handle you last accessed.

Mnemonic: many programs use "." to mean the current line number.
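
The following sketch shows the effect of an explicit close on C<$.>.
It writes two throwaway files with L<File::Temp> purely so that the
example is self-contained; the file contents are arbitrary:

    use File::Temp qw(tempfile);

    my @files;
    for my $text ("a\nb\n", "c\n") {
        my ($tmp, $name) = tempfile(UNLINK => 1);
        print {$tmp} $text;
        close $tmp or die $!;
        push @files, $name;
    }

    my @seen;
    {
        local @ARGV = @files;
        while (my $line = <>) {
            push @seen, $.;
        }
        continue {
            close ARGV if eof;    # reset $. before the next file
        }
    }
    print "@seen\n";    # 1 2 1

Without the C<close ARGV> the counter would simply keep increasing
across both files.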

=item IO::Handle->input_record_separator( EXPR )

=item $INPUT_RECORD_SEPARATOR

=item $RS

=item $/
X<$/> X<$RS> X<$INPUT_RECORD_SEPARATOR>

The input record separator, newline by default.  This influences Perl's
idea of what a "line" is.  Works like B<awk>'s RS variable, including
treating empty lines as a terminator if set to the null string (an
empty line cannot contain any spaces or tabs).  You may set it to a
multi-character string to match a multi-character terminator, or to
C<undef> to read through the end of file.  Setting it to C<"\n\n">
means something slightly different than setting to C<"">, if the file
contains consecutive empty lines.  Setting to C<""> will treat two or
more consecutive empty lines as a single empty line.  Setting to
C<"\n\n"> will blindly assume that the next input character belongs to
the next paragraph, even if it's a newline.

    local $/;           # enable "slurp" mode
    local $_ = <FH>;    # whole file now here
    s/\n[ \t]+/ /g;

Remember: the value of C<$/> is a string, not a regex.  B<awk> has to
be better for something. :-)

Setting C<$/> to a reference to an integer, scalar containing an
integer, or scalar that's convertible to an integer will attempt to
read records instead of lines, with the maximum record size being the
referenced integer number of characters.  So this:

    local $/ = \32768; # or \"32768", or \$var_containing_32768
    open my $fh, "<", $myfile or die $!;
    local $_ = <$fh>;

will read a record of no more than 32768 characters from $fh.  If you're
not reading from a record-oriented file (or your OS doesn't have
record-oriented files), then you'll likely get a full chunk of data
with every read.  If a record is larger than the record size you've
set, you'll get the record back in pieces.  Trying to set the record
size to zero or less is deprecated and will cause $/ to have the value
of "undef", which will cause reading in the (rest of the) whole file.
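
A small sketch of record reads, using an in-memory filehandle and a
record size of 4 chosen only for illustration:

    my $data = "abcdefghij";
    open my $fh, '<', \$data or die $!;
    my @chunks = do { local $/ = \4; <$fh> };
    close $fh;
    print join(",", @chunks), "\n";    # abcd,efgh,ij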

As of 5.19.9 setting C<$/> to any other form of reference will throw a
fatal exception. This is in preparation for supporting new ways to set
C<$/> in the future.

On VMS only, record reads bypass PerlIO layers and any associated
buffering, so you must not mix record and non-record reads on the
same filehandle.  Record mode mixes with line mode only when the
same buffering layer is in use for both modes.

You cannot call C<input_record_separator()> on a handle, only as a
static method.  See L<IO::Handle|IO::Handle>.

See also L<perlport/"Newlines">.  Also see L</$.>.

Mnemonic: / delimits line boundaries when quoting poetry.
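
To see paragraph mode in action, this sketch reads from an in-memory
filehandle (the sample text is arbitrary) with C<$/> set to the null
string, so the run of empty lines counts as a single separator:

    my $data = "one\ntwo\n\n\n\nthree\n";
    open my $fh, '<', \$data or die $!;
    my @paras = do { local $/ = ""; <$fh> };
    close $fh;
    print scalar(@paras), " paragraphs\n";    # 2 paragraphs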

=item IO::Handle->output_record_separator( EXPR )

=item $OUTPUT_RECORD_SEPARATOR

=item $ORS

=item $\
X<$\> X<$ORS> X<$OUTPUT_RECORD_SEPARATOR>

The output record separator for the print operator.  If defined, this
value is printed after the last of print's arguments.  Default is C<undef>.

You cannot call C<output_record_separator()> on a handle, only as a
static method.  See L<IO::Handle|IO::Handle>.

Mnemonic: you set C<$\> instead of adding "\n" at the end of the print.
Also, it's just like C<$/>, but it's what you get "back" from Perl.

=item HANDLE->autoflush( EXPR )

=item $OUTPUT_AUTOFLUSH

=item $|
X<$|> X<autoflush> X<flush> X<$OUTPUT_AUTOFLUSH>

If set to nonzero, forces a flush right away and after every write or
print on the currently selected output channel.  Default is 0
(regardless of whether the channel is really buffered by the system or
not; C<$|> tells you only whether you've asked Perl explicitly to
flush after each write).  STDOUT will typically be line buffered if
output is to the terminal and block buffered otherwise.  Setting this
variable is useful primarily when you are outputting to a pipe or
socket, such as when you are running a Perl program under B<rsh> and
want to see the output as it's happening.  This has no effect on input
buffering.  See L<perlfunc/getc> for that.  See L<perlfunc/select> on
how to select the output channel.  See also L<IO::Handle>.

Mnemonic: when you want your pipes to be piping hot.

=item ${^LAST_FH}
X<${^LAST_FH}>

This read-only variable contains a reference to the last-read filehandle.
This is set by C<< <HANDLE> >>, C<readline>, C<tell>, C<eof> and C<seek>.
This is the same handle that C<$.> and C<tell> and C<eof> without arguments
use.  It is also the handle used when Perl appends ", <STDIN> line 1" to
an error or warning message.

This variable was added in Perl v5.18.0.

=back

=head3 Variables related to formats

The special variables for formats are a subset of those for
filehandles.  See L<perlform> for more information about Perl's
formats.

=over 8

=item $ACCUMULATOR

=item $^A
X<$^A> X<$ACCUMULATOR>

The current value of the C<write()> accumulator for C<format()> lines.
A format contains C<formline()> calls that put their result into
C<$^A>.  After calling its format, C<write()> prints out the contents
of C<$^A> and empties it.  So you never really see the contents of C<$^A>
unless you call C<formline()> yourself and then look at it.  See
L<perlform> and L<perlfunc/"formline PICTURE,LIST">.
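
As a minimal sketch of calling C<formline()> yourself (the picture and
values are arbitrary), the accumulator is cleared, filled, copied, and
cleared again:

    $^A = "";
    formline('@<<< @>>>', "ab", 12);    # left- and right-justified fields
    my $line = $^A;
    $^A = "";
    print "[$line]\n";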

=item IO::Handle->format_formfeed(EXPR)

=item $FORMAT_FORMFEED

=item $^L
X<$^L> X<$FORMAT_FORMFEED>

What formats output as a form feed.  The default is C<\f>.

You cannot call C<format_formfeed()> on a handle, only as a static
method.  See L<IO::Handle|IO::Handle>.

=item HANDLE->format_page_number(EXPR)

=item $FORMAT_PAGE_NUMBER

=item $%
X<$%> X<$FORMAT_PAGE_NUMBER>

The current page number of the currently selected output channel.

Mnemonic: C<%> is page number in B<nroff>.

=item HANDLE->format_lines_left(EXPR)

=item $FORMAT_LINES_LEFT

=item $-
X<$-> X<$FORMAT_LINES_LEFT>

The number of lines left on the page of the currently selected output
channel.

Mnemonic: lines_on_page - lines_printed.

=item IO::Handle->format_line_break_characters EXPR

=item $FORMAT_LINE_BREAK_CHARACTERS

=item $:
X<$:> X<$FORMAT_LINE_BREAK_CHARACTERS>

The current set of characters after which a string may be broken to
fill continuation fields (starting with C<^>) in a format.  The default is
S<" \n-">, to break on a space, newline, or a hyphen.

You cannot call C<format_line_break_characters()> on a handle, only as
a static method.  See L<IO::Handle|IO::Handle>.

Mnemonic: a "colon" in poetry is a part of a line.

=item HANDLE->format_lines_per_page(EXPR)

=item $FORMAT_LINES_PER_PAGE

=item $=
X<$=> X<$FORMAT_LINES_PER_PAGE>

The current page length (printable lines) of the currently selected
output channel.  The default is 60.

Mnemonic: = has horizontal lines.

=item HANDLE->format_top_name(EXPR)

=item $FORMAT_TOP_NAME

=item $^
X<$^> X<$FORMAT_TOP_NAME>

The name of the current top-of-page format for the currently selected
output channel.  The default is the name of the filehandle with C<_TOP>
appended.  For example, the default format top name for the C<STDOUT>
filehandle is C<STDOUT_TOP>.

Mnemonic: points to top of page.

=item HANDLE->format_name(EXPR)

=item $FORMAT_NAME

=item $~
X<$~> X<$FORMAT_NAME>

The name of the current report format for the currently selected
output channel.  The default format name is the same as the filehandle
name.  For example, the default format name for the C<STDOUT>
filehandle is just C<STDOUT>.

Mnemonic: brother to C<$^>.

=back

=head2 Error Variables
X<error> X<exception>

The variables C<$@>, C<$!>, C<$^E>, and C<$?> contain information
about different types of error conditions that may appear during
execution of a Perl program.  The variables are shown ordered by
the "distance" between the subsystem which reported the error and
the Perl process.  They correspond to errors detected by the Perl
interpreter, C library, operating system, or an external program,
respectively.

To illustrate the differences between these variables, consider the
following Perl expression, which uses a single-quoted string.  After
execution of this statement, perl may have set all four special error
variables:

    eval q{
	open my $pipe, "/cdrom/install |" or die $!;
	my @res = <$pipe>;
	close $pipe or die "bad pipe: $?, $!";
    };

When perl executes the C<eval()> expression, it translates the
C<open()>, C<< <$pipe> >>, and C<close()> calls into calls to the C
run-time library and thence to the operating system kernel.  perl sets
C<$!> to the C library's C<errno> if one of these calls fails.

C<$@> is set if the string to be C<eval>-ed did not compile (this may
happen if C<open> or C<close> were imported with bad prototypes), or
if Perl code executed during evaluation C<die()>d.  In these cases the
value of C<$@> is the compile error, or the argument to C<die> (which
will interpolate C<$!> and C<$?>).  (See also L<Fatal>, though.)

Under a few operating systems, C<$^E> may contain a more verbose error
indicator, such as in this case, "CDROM tray not closed."  Systems that
do not support extended error messages leave C<$^E> the same as C<$!>.

Finally, C<$?> may be set to a non-0 value if the external program
F</cdrom/install> fails.  The upper eight bits reflect specific error
conditions encountered by the program (the program's C<exit()> value).
The lower eight bits reflect the mode of failure, like signal death and
core dump information.  See L<wait(2)> for details.  In contrast to
C<$!> and C<$^E>, which are set only if an error condition is detected,
the variable C<$?> is set on each C<wait> or pipe C<close>,
overwriting the old value.  This is more like C<$@>, which on every
C<eval()> is always set on failure and cleared on success.

For more details, see the individual descriptions at C<$@>, C<$!>,
C<$^E>, and C<$?>.

=over 8

=item ${^CHILD_ERROR_NATIVE}
X<$^CHILD_ERROR_NATIVE>

The native status returned by the last pipe close, backtick (C<``>)
command, successful call to C<wait()> or C<waitpid()>, or from the
C<system()> operator.  On POSIX-like systems this value can be decoded
with the WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG, WIFSTOPPED,
WSTOPSIG and WIFCONTINUED functions provided by the L<POSIX> module.

Under VMS this reflects the actual VMS exit status; i.e. it is the
same as C<$?> when the pragma C<use vmsish 'status'> is in effect.

This variable was added in Perl v5.10.0.

=item $EXTENDED_OS_ERROR

=item $^E
X<$^E> X<$EXTENDED_OS_ERROR>

Error information specific to the current operating system.  At the
moment, this differs from C<L</$!>> under only VMS, OS/2, and Win32 (and
for MacPerl).  On all other platforms, C<$^E> is always just the same
as C<$!>.

Under VMS, C<$^E> provides the VMS status value from the last system
error.  This is more specific information about the last system error
than that provided by C<$!>.  This is particularly important when C<$!>
is set to B<EVMSERR>.

Under OS/2, C<$^E> is set to the error code of the last call to OS/2
API either via CRT, or directly from perl.

Under Win32, C<$^E> always returns the last error information reported
by the Win32 call C<GetLastError()> which describes the last error
from within the Win32 API.  Most Win32-specific code will report errors
via C<$^E>.  ANSI C and Unix-like calls set C<errno> and so most
portable Perl code will report errors via C<$!>.

Caveats mentioned in the description of C<L</$!>> generally apply to
C<$^E>, also.

This variable was added in Perl 5.003.

Mnemonic: Extra error explanation.

=item $EXCEPTIONS_BEING_CAUGHT

=item $^S
X<$^S> X<$EXCEPTIONS_BEING_CAUGHT>

Current state of the interpreter.

	$^S         State
	---------   -------------------------------------
	undef       Parsing module, eval, or main program
	true (1)    Executing an eval
	false (0)   Otherwise

The first state may happen in C<$SIG{__DIE__}> and C<$SIG{__WARN__}>
handlers.

The English name $EXCEPTIONS_BEING_CAUGHT is slightly misleading, because
the C<undef> value does not indicate whether exceptions are being caught,
since compilation of the main program does not catch exceptions.

This variable was added in Perl 5.004.
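
A sketch of how a C<$SIG{__DIE__}> handler might consult C<$^S>; the
C<@log> array and its messages are illustrative only:

    my @log;
    {
        local $SIG{__DIE__} = sub {
            push @log, $^S ? "inside eval" : "not inside eval";
        };
        eval { die "oops\n" };
    }
    print "@log\n";    # inside eval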

=item $WARNING

=item $^W
X<$^W> X<$WARNING>

The current value of the warning switch, initially true if B<-w> was
used, false otherwise, but directly modifiable.

See also L<warnings>.

Mnemonic: related to the B<-w> switch.

=item ${^WARNING_BITS}
X<${^WARNING_BITS}>

The current set of warning checks enabled by the C<use warnings> pragma.
It has the same scoping as the C<$^H> and C<%^H> variables.  The exact
values are considered internal to the L<warnings> pragma and may change
between versions of Perl.

This variable was added in Perl v5.6.0.

=item $OS_ERROR

=item $ERRNO

=item $!
X<$!> X<$ERRNO> X<$OS_ERROR>

When referenced, C<$!> retrieves the current value
of the C C<errno> integer variable.
If C<$!> is assigned a numerical value, that value is stored in C<errno>.
When referenced as a string, C<$!> yields the system error string
corresponding to C<errno>.

Many system or library calls set C<errno> if they fail,
to indicate the cause of failure.  They usually do B<not>
set C<errno> to zero if they succeed.  This means C<errno>,
hence C<$!>, is meaningful only I<immediately> after a B<failure>:

    if (open my $fh, "<", $filename) {
        # Here $! is meaningless.
        ...
    }
    else {
        # ONLY here is $! meaningful.
        ...
        # Already here $! might be meaningless.
    }
    # Since here we might have either success or failure,
    # $! is meaningless.

Here, I<meaningless> means that C<$!> may be unrelated to the outcome
of the C<open()> operator.  Assignment to C<$!> is similarly ephemeral.
It can be used immediately before invoking the C<die()> operator,
to set the exit value, or to inspect the system error string
corresponding to error I<n>, or to restore C<$!> to a meaningful state.

Mnemonic: What just went bang?

=item %OS_ERROR

=item %ERRNO

=item %!
X<%!> X<%OS_ERROR> X<%ERRNO>

Each element of C<%!> has a true value only if C<$!> is set to that
value.  For example, C<$!{ENOENT}> is true if and only if the current
value of C<$!> is C<ENOENT>; that is, if the most recent error was "No
such file or directory" (or its moral equivalent: not all operating
systems give that exact error, and certainly not all languages).  The
specific true value is not guaranteed, but in the past has generally
been the numeric value of C<$!>.  To check if a particular key is
meaningful on your system, use C<exists $!{the_key}>; for a list of legal
keys, use C<keys %!>.  See L<Errno> for more information, and also see
L</$!>.

This variable was added in Perl 5.005.
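
A sketch of checking C<%!> right after a failure.  The path is
hypothetical and merely assumed not to exist on the system running the
example:

    my $missing = "/no/such/dir/example.txt";    # hypothetical, assumed absent
    my ($errno, $text, $is_enoent);
    unless (open my $fh, "<", $missing) {
        # Capture $! at once, before any other call can disturb it.
        ($errno, $text) = ($! + 0, "$!");
        $is_enoent = $!{ENOENT} ? 1 : 0;
    }
    print "$errno: $text\n" if $is_enoent;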

=item $CHILD_ERROR

=item $?
X<$?> X<$CHILD_ERROR>

The status returned by the last pipe close, backtick (C<``>) command,
successful call to C<wait()> or C<waitpid()>, or from the C<system()>
operator.  This is just the 16-bit status word returned by the
traditional Unix C<wait()> system call (or else is made up to look
like it).  Thus, the exit value of the subprocess is really (C<<< $? >>
8 >>>), and C<$? & 127> gives which signal, if any, the process died
from, and C<$? & 128> reports whether there was a core dump.

Additionally, if the C<h_errno> variable is supported in C, its value
is returned via C<$?> if any C<gethost*()> function fails.

If you have installed a signal handler for C<SIGCHLD>, the
value of C<$?> will usually be wrong outside that handler.

Inside an C<END> subroutine C<$?> contains the value that is going to be
given to C<exit()>.  You can modify C<$?> in an C<END> subroutine to
change the exit status of your program.  For example:

    END {
	$? = 1 if $? == 255;  # die would make it 255
    }

Under VMS, the pragma C<use vmsish 'status'> makes C<$?> reflect the
actual VMS exit status, instead of the default emulation of POSIX
status; see L<perlvms/$?> for details.

Mnemonic: similar to B<sh> and B<ksh>.
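
The decoding described above can be sketched as follows.  The child is
simply the running perl (C<$^X>) asked to exit with status 3, a value
picked for illustration:

    system($^X, "-e", "exit 3");
    my $exit   = $? >> 8;             # exit value of the child
    my $signal = $? & 127;            # signal that killed it, if any
    my $core   = $? & 128 ? 1 : 0;    # whether it dumped core
    print "exit=$exit signal=$signal core=$core\n";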

=item $EVAL_ERROR

=item $@
X<$@> X<$EVAL_ERROR>

The Perl error from the last C<eval> operator, i.e. the last exception that
was caught.  For C<eval BLOCK>, this is either a runtime error message or the
string or reference C<die> was called with.  The C<eval STRING> form also
catches syntax errors and other compile time exceptions.

If no error occurs, C<eval> sets C<$@> to the empty string.
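
A short sketch of both cases (the error text is made up for illustration):

    use strict;
    use warnings;

    # The trailing "1" makes the eval return true only on success.
    my $ok = eval { die "something failed\n"; 1 };
    print "caught: $@" unless $ok;

    # A successful eval resets $@ to the empty string.
    $ok = eval { 1 };
    print "no error\n" if $ok && $@ eq '';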

Warning messages are not collected in this variable.  You can, however,
set up a routine to process warnings by setting C<$SIG{__WARN__}> as
described in L</%SIG>.

Mnemonic: Where was the error "at"?

=back

=head2 Variables related to the interpreter state

These variables provide information about the current interpreter state.

=over 8

=item $COMPILING

=item $^C
X<$^C> X<$COMPILING>

The current value of the flag associated with the B<-c> switch.
Mainly of use with B<-MO=...> to allow code to alter its behavior
when being compiled, such as for example to C<AUTOLOAD> at compile
time rather than normal, deferred loading.  Setting
C<$^C = 1> is similar to calling C<B::minus_c>.

This variable was added in Perl v5.6.0.

=item $DEBUGGING

=item $^D
X<$^D> X<$DEBUGGING>

The current value of the debugging flags.  May be read or set.  Like its
L<command-line equivalent|perlrun/B<-D>I<letters>>, you can use numeric
or symbolic values, e.g. C<$^D = 10> or C<$^D = "st">.  See
L<perlrun/B<-D>I<number>>.  The contents of this variable also affect the
debugger operation.  See L<perldebguts/Debugger Internals>.

Mnemonic: value of B<-D> switch.

=item ${^ENCODING}
X<${^ENCODING}>

This variable is no longer supported.

It used to hold the I<object reference> to the C<Encode> object that was
used to convert the source code to Unicode.

Its purpose was to allow your non-ASCII Perl
scripts not to have to be written in UTF-8; this was
useful before editors that worked on UTF-8 encoded text were common, but
that was long ago.  It caused problems, such as affecting the operation
of other modules that weren't expecting it, causing general mayhem.

If you need something like this functionality, it is recommended that
you use a simple source filter, such as L<Filter::Encoding>.

If you are coming here because code of yours is being adversely affected
by someone's use of this variable, you can usually work around it by
doing this:

 local ${^ENCODING};

near the beginning of the functions that are getting broken.  This
undefines the variable during the scope of execution of the including
function.

This variable was added in Perl 5.8.2 and removed in 5.26.0.

=item ${^GLOBAL_PHASE}
X<${^GLOBAL_PHASE}>

The current phase of the perl interpreter.

Possible values are:

=over 8

=item CONSTRUCT

The C<PerlInterpreter*> is being constructed via C<perl_construct>.  This
value is mostly there for completeness and for use via the
underlying C variable C<PL_phase>.  It's not really possible for Perl
code to be executed unless construction of the interpreter is
finished.

=item START

This is the global compile-time.  That includes, basically, every
C<BEGIN> block executed directly or indirectly during the
compile-time of the top-level program.

This phase is not called "BEGIN" to avoid confusion with
C<BEGIN>-blocks, as those are executed during compile-time of any
compilation unit, not just the top-level program.  A new, localised
compile-time entered at run-time, for example by constructs such as
C<eval "use SomeModule">, is not a global interpreter phase, and
therefore isn't reflected by C<${^GLOBAL_PHASE}>.

=item CHECK

Execution of any C<CHECK> blocks.

=item INIT

Similar to "CHECK", but for C<INIT>-blocks, not C<CHECK> blocks.

=item RUN

The main run-time, i.e. the execution of C<PL_main_root>.

=item END

Execution of any C<END> blocks.

=item DESTRUCT

Global destruction.

=back

Also note that there's no value for UNITCHECK-blocks.  That's because
those are run for each compilation unit individually, and therefore do
not form a global interpreter phase.

Not every program has to go through each of the possible phases, but
transition from one phase to another can only happen in the order
described in the above list.

An example of all of the phases Perl code can see:

    BEGIN { print "compile-time: ${^GLOBAL_PHASE}\n" }

    INIT  { print "init-time: ${^GLOBAL_PHASE}\n" }

    CHECK { print "check-time: ${^GLOBAL_PHASE}\n" }

    {
        package Print::Phase;

        sub new {
            my ($class, $time) = @_;
            return bless \$time, $class;
        }

        sub DESTROY {
            my $self = shift;
            print "$$self: ${^GLOBAL_PHASE}\n";
        }
    }

    print "run-time: ${^GLOBAL_PHASE}\n";

    my $runtime = Print::Phase->new(
        "lexical variables are garbage collected before END"
    );

    END   { print "end-time: ${^GLOBAL_PHASE}\n" }

    our $destruct = Print::Phase->new(
        "package variables are garbage collected after END"
    );

This will print out

    compile-time: START
    check-time: CHECK
    init-time: INIT
    run-time: RUN
    lexical variables are garbage collected before END: RUN
    end-time: END
    package variables are garbage collected after END: DESTRUCT

This variable was added in Perl 5.14.0.
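
One common use is to skip cleanup work in a destructor during global
destruction, when other objects may already be gone.  A minimal sketch,
with a hypothetical class name:

    use strict;
    use warnings;

    {
        package Temp::Resource;    # hypothetical class name

        sub new { my ($class) = @_; return bless {}, $class }

        sub DESTROY {
            my $self = shift;
            # Do nothing once the interpreter is tearing everything down.
            return if ${^GLOBAL_PHASE} eq 'DESTRUCT';
            print "cleaning up during ${^GLOBAL_PHASE}\n";
        }
    }

    {
        my $resource = Temp::Resource->new;
    }    # $resource goes out of scope here, during the RUN phase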

=item $^H
X<$^H>

WARNING: This variable is strictly for
internal use only.  Its availability,
behavior, and contents are subject to change without notice.

This variable contains compile-time hints for the Perl interpreter.  At the
end of compilation of a BLOCK the value of this variable is restored to the
value when the interpreter started to compile the BLOCK.

When perl begins to parse any block construct that provides a lexical scope
(e.g., eval body, required file, subroutine body, loop body, or conditional
block), the existing value of C<$^H> is saved, but its value is left unchanged.
When the compilation of the block is completed, it regains the saved value.
Between the points where its value is saved and restored, code that
executes within BEGIN blocks is free to change the value of C<$^H>.

This behavior provides the semantic of lexical scoping, and is used in,
for instance, the C<use strict> pragma.

The contents should be an integer; different bits of it are used for
different pragmatic flags.  Here's an example:

    sub add_100 { $^H |= 0x100 }

    sub foo {
	BEGIN { add_100() }
	bar->baz($boon);
    }

Consider what happens during execution of the BEGIN block.  At this point
the BEGIN block has already been compiled, but the body of C<foo()> is still
being compiled.  The new value of C<$^H>
will therefore be visible only while
the body of C<foo()> is being compiled.

Substituting the C<BEGIN { add_100() }> block with:

    BEGIN { require strict; strict->import('vars') }

demonstrates how C<use strict 'vars'> is implemented.  Here's a conditional
version of the same lexical pragma:

    BEGIN {
	require strict; strict->import('vars') if $condition
    }

This variable was added in Perl 5.003.

=item %^H
X<%^H>

The C<%^H> hash provides the same scoping semantic as C<$^H>.  This makes
it useful for implementation of lexically scoped pragmas.  See
L<perlpragma>.   All the entries are stringified when accessed at
runtime, so only simple values can be accommodated.  This means no
pointers to objects, for example.

When putting items into C<%^H>, in order to avoid conflicting with other
users of the hash there is a convention regarding which keys to use.
A module should use only keys that begin with the module's name (the
name of its main package) and a "/" character.  For example, a module
C<Foo::Bar> should use keys such as C<Foo::Bar/baz>.

This variable was added in Perl v5.6.0.
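
The key convention and the scoping can be sketched in one file.  The
pragma name C<MyHint> and its C<in_effect> helper are hypothetical, and
reading the hints back through C<caller> assumes Perl 5.10 or later (see
L<perlpragma>):

    use strict;
    use warnings;

    {
        package MyHint;    # hypothetical pragma name

        # Keys follow the "Module::Name/key" convention.
        sub import   { $^H{"MyHint/enabled"} = 1 }
        sub unimport { $^H{"MyHint/enabled"} = 0 }

        # Element 10 of caller() is the hints hash the caller was
        # compiled with.
        sub in_effect {
            my $hints = (caller(0))[10];
            return $hints && $hints->{"MyHint/enabled"} ? 1 : 0;
        }
    }

    {
        BEGIN { MyHint->import }    # what "use MyHint" would do
        print "inner: ", MyHint::in_effect(), "\n";
    }
    print "outer: ", MyHint::in_effect(), "\n";

The hint is set only while the inner block is being compiled, so the two
calls should report different values.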

=item ${^OPEN}
X<${^OPEN}>

An internal variable used by PerlIO.  A string in two parts, separated
by a C<\0> byte, the first part describes the input layers, the second
part describes the output layers.

This variable was added in Perl v5.8.0.

=item $PERLDB

=item $^P
X<$^P> X<$PERLDB>

The internal variable for debugging support.  The meanings of the
various bits are subject to change, but currently indicate:

=over 6

=item 0x01

Debug subroutine enter/exit.

=item 0x02

Line-by-line debugging.  Causes C<DB::DB()> subroutine to be called for
each statement executed.  Also causes saving source code lines (like
0x400).

=item 0x04

Switch off optimizations.

=item 0x08

Preserve more data for future interactive inspections.

=item 0x10

Keep info about source lines on which a subroutine is defined.

=item 0x20

Start with single-step on.

=item 0x40

Use subroutine address instead of name when reporting.

=item 0x80

Report C<goto &subroutine> as well.

=item 0x100

Provide informative "file" names for evals based on the place they were compiled.

=item 0x200

Provide informative names to anonymous subroutines based on the place they
were compiled.

=item 0x400

Save source code lines into C<@{"_<$filename"}>.

=item 0x800

When saving source, include evals that generate no subroutines.

=item 0x1000

When saving source, include source that did not compile.

=back

Some bits may be relevant at compile-time only, some at
run-time only.  This is a new mechanism and the details may change.
See also L<perldebguts>.
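
A small sketch of testing individual bits; the bit values are copied from
the list above, while the descriptive names are made up for this example:

    use strict;
    use warnings;

    # Hypothetical names for three of the documented bits.
    my %bit = (
        sub_enter_exit => 0x01,
        line_by_line   => 0x02,
        save_source    => 0x400,
    );

    for my $name (sort keys %bit) {
        printf "%-15s %s\n", $name, ($^P & $bit{$name}) ? "on" : "off";
    }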

=item ${^TAINT}
X<${^TAINT}>

Reflects if taint mode is on or off.  1 for on (the program was run with
B<-T>), 0 for off, -1 when only taint warnings are enabled (i.e. with
B<-t> or B<-TU>).

This variable is read-only.

This variable was added in Perl v5.8.0.
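
For example, a program can report the mode it was started in (the
descriptive labels are this example's own wording):

    use strict;
    use warnings;

    my %label = (1 => "on", 0 => "off", -1 => "warnings only");
    my $taint = ${^TAINT};
    print "taint mode: $label{$taint}\n";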

=item ${^UNICODE}
X<${^UNICODE}>

Reflects certain Unicode settings of Perl.  See L<perlrun>
documentation for the C<-C> switch for more information about
the possible values.

This variable is set during Perl startup and is thereafter read-only.

This variable was added in Perl v5.8.2.

=item ${^UTF8CACHE}
X<${^UTF8CACHE}>

This variable controls the state of the internal UTF-8 offset caching code.
1 for on (the default), 0 for off, -1 to debug the caching code by checking
all its results against linear scans, and panicking on any discrepancy.

This variable was added in Perl v5.8.9.  It is subject to change or
removal without notice, but is currently used to avoid recalculating the
boundaries of multi-byte UTF-8-encoded characters.

=item ${^UTF8LOCALE}
X<${^UTF8LOCALE}>

This variable indicates whether a UTF-8 locale was detected by perl at
startup.  This information is used by perl when it's in
adjust-utf8ness-to-locale mode (as when run with the C<-CL> command-line
switch); see L<perlrun> for more info on this.

This variable was added in Perl v5.8.8.

=back

=head2 Deprecated and removed variables

Deprecating a variable announces the intent of the perl maintainers to
eventually remove the variable from the language.  It may still be
available despite its status.  Using a deprecated variable triggers
a warning.

Once a variable is removed, its use triggers an error telling you
the variable is unsupported.

See L<perldiag> for details about error messages.

=over 8

=item $#
X<$#>

C<$#> was a variable that could be used to format printed numbers.
After a deprecation cycle, its magic was removed in Perl v5.10.0 and
using it now triggers a warning: C<$# is no longer supported>.

This is not the sigil you use in front of an array name to get the
last index, like C<$#array>.  That's still how you get the last index
of an array in Perl.  The two have nothing to do with each other.

Deprecated in Perl 5.

Removed in Perl v5.10.0.

=item $*
X<$*>

C<$*> was a variable that you could use to enable multiline matching.
After a deprecation cycle, its magic was removed in Perl v5.10.0.
Using it now triggers a warning: C<$* is no longer supported>.
You should use the C</s> and C</m> regexp modifiers instead.
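
A brief sketch of the two modifiers, using made-up sample text:

    use strict;
    use warnings;

    my $text = "alpha one\nbeta two\n";

    # /m lets ^ and $ match at the start and end of every line.
    my @first_words = $text =~ /^(\w+)/mg;
    print "@first_words\n";    # alpha beta

    # /s lets . match a newline, so a match can span lines.
    print "spans lines\n" if $text =~ /one.beta/s;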

Deprecated in Perl 5.

Removed in Perl v5.10.0.

=item $[
X<$[>

This variable stores the index of the first element in an array, and
of the first character in a substring.  The default is 0, but you could
theoretically set it to 1 to make Perl behave more like B<awk> (or Fortran)
when subscripting and when evaluating the index() and substr() functions.

As of release 5 of Perl, assignment to C<$[> is treated as a compiler
directive, and cannot influence the behavior of any other file.
(That's why you can only assign compile-time constants to it.)
Its use is highly discouraged.

Prior to Perl v5.10.0, assignment to C<$[> could be seen from outer lexical
scopes in the same file, unlike other compile-time directives (such as
L<strict>).  Using local() on it would bind its value strictly to a lexical
block.  Now it is always lexically scoped.

As of Perl v5.16.0, it is implemented by the L<arybase> module.  See
L<arybase> for more details on its behaviour.

Under C<use v5.16>, or C<no feature "array_base">, C<$[> no longer has any
effect, and always contains 0.  Assigning 0 to it is permitted, but any
other value will produce an error.

Mnemonic: [ begins subscripts.

Deprecated in Perl v5.12.0.

=back

=cut

=head1 NAME

perl581delta - what is new for perl v5.8.1

=head1 DESCRIPTION

This document describes differences between the 5.8.0 release and
the 5.8.1 release.

If you are upgrading from an earlier release such as 5.6.1, first read
the L<perl58delta>, which describes differences between 5.6.0 and
5.8.0.

In case you are wondering about 5.6.1, it was bug-fix-wise rather
identical to the development release 5.7.1.  Confused?  This timeline
hopefully helps a bit: it lists the new major releases, their maintenance
releases, and the development releases.

          New     Maintenance  Development

          5.6.0                             2000-Mar-22
                               5.7.0        2000-Sep-02
                  5.6.1                     2001-Apr-08
                               5.7.1        2001-Apr-09
                               5.7.2        2001-Jul-13
                               5.7.3        2002-Mar-05
          5.8.0                             2002-Jul-18
                  5.8.1                     2003-Sep-25

=head1 Incompatible Changes

=head2 Hash Randomisation

Mainly due to security reasons, the "random ordering" of hashes
has been made even more random.  Previously while the order of hash
elements from keys(), values(), and each() was essentially random,
it was still repeatable.  Now, however, the order varies between
different runs of Perl.

B<Perl has never guaranteed any ordering of the hash keys>, and the
ordering has already changed several times during the lifetime of
Perl 5.  Also, the ordering of hash keys has always been, and
continues to be, affected by the insertion order.

The added randomness may affect applications.

One possible scenario is when output of an application has included
hash data.  For example, if you have used the Data::Dumper module to
dump data into different files, and then compared the files to see
whether the data has changed, now you will have false positives since
the order in which hashes are dumped will vary.  In general the cure
is to sort the keys (or the values); in particular for Data::Dumper to
use the C<Sortkeys> option.  If some particular order is really
important, use tied hashes: for example the Tie::IxHash module
which by default preserves the order in which the hash elements
were added.
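
A minimal sketch of both cures, using illustrative data:

    use strict;
    use warnings;
    use Data::Dumper;

    my %stock = (pear => 3, apple => 7, fig => 1);

    # Sort explicitly whenever the output order matters.
    for my $name (sort keys %stock) {
        print "$name=$stock{$name}\n";
    }

    # Ask Data::Dumper for sorted keys so that dumps can be compared.
    local $Data::Dumper::Sortkeys = 1;
    print Dumper(\%stock);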

A more subtle problem is reliance on the order of "global destruction".
That is what happens at the end of execution: Perl destroys all data
structures, including user data.  If your destructors (the DESTROY
subroutines) have assumed any particular ordering to the global
destruction, there might be problems ahead.  For example, in a
destructor of one object you cannot assume that objects of any other
class are still available, unless you hold a reference to them.
If the environment variable PERL_DESTRUCT_LEVEL is set to a non-zero
value, or if Perl is exiting a spawned thread, it will also destruct
the ordinary references and the symbol tables that are no longer in use.
You can't call a class method or an ordinary function on a class that
has been collected that way.

The hash randomisation is certain to reveal hidden assumptions about
some particular ordering of hash elements, and outright bugs: it
revealed a few bugs in the Perl core and core modules.

To disable the hash randomisation in runtime, set the environment
variable PERL_HASH_SEED to 0 (zero) before running Perl (for more
information see L<perlrun/PERL_HASH_SEED>), or to disable the feature
completely in compile time, compile with C<-DNO_HASH_SEED> (see F<INSTALL>).

See L<perlsec/"Algorithmic Complexity Attacks"> for the original
rationale behind this change.

=head2 UTF-8 On Filehandles No Longer Activated By Locale

In Perl 5.8.0 all filehandles, including the standard filehandles,
were implicitly set to be in Unicode UTF-8 if the locale settings
indicated the use of UTF-8.  This feature caused too many problems,
so the feature was turned off and redesigned: see L</"Core Enhancements">.

=head2 Single-number v-strings are no longer v-strings before "=>"

The version strings or v-strings (see L<perldata/"Version Strings">)
feature introduced in Perl 5.6.0 has been a source of some confusion--
especially when the user did not want to use it, but Perl thought it
knew better.  Especially troublesome has been the feature that before
a "=>" a version string (a "v" followed by digits) has been interpreted
as a v-string instead of a string literal.  In other words:

	%h = ( v65 => 42 );

has meant since Perl 5.6.0

	%h = ( 'A' => 42 );

(at least in platforms of ASCII progeny).  Perl 5.8.1 restores the
more natural interpretation

	%h = ( 'v65' => 42 );

The multi-number v-strings like v65.66 and 65.66.67 still continue to
be v-strings in Perl 5.8.

=head2 (Win32) The -C Switch Has Been Repurposed

The -C switch has changed in an incompatible way.  The old semantics
of this switch only made sense in Win32 and only in the "use utf8"
universe in 5.6.x releases, and do not make sense for the Unicode
implementation in 5.8.0.  Since this switch could not have been used
by anyone, it has been repurposed.  The behavior that this switch
enabled in 5.6.x releases may be supported in a transparent,
data-dependent fashion in a future release.

For the new life of this switch, see L</"UTF-8 no longer default under
UTF-8 locales">, and L<perlrun/-C>.

=head2 (Win32) The /d Switch Of cmd.exe

Perl 5.8.1 uses the /d switch when running the cmd.exe shell
internally for system(), backticks, and when opening pipes to external
programs.  The extra switch disables the execution of AutoRun commands
from the registry, which is generally considered undesirable when
running external programs.  If you wish to retain compatibility with
the older behavior, set PERL5SHELL in your environment to C<cmd /x/c>.

=head1 Core Enhancements

=head2 UTF-8 no longer default under UTF-8 locales

In Perl 5.8.0 many Unicode features were introduced.   One of them
was found to be of more nuisance than benefit: the automagic
(and silent) "UTF-8-ification" of filehandles, including the
standard filehandles, if the user's locale settings indicated
use of UTF-8.

For example, if you had C<en_US.UTF-8> as your locale, your STDIN and
STDOUT were automatically "UTF-8", in other words an implicit
binmode(..., ":utf8") was made.  This meant that trying to print, say,
chr(0xff), ended up printing the bytes 0xc3 0xbf.  Hardly what
you had in mind unless you were aware of this feature of Perl 5.8.0.
The problem is that the vast majority of people weren't: for example
in RedHat releases 8 and 9 the B<default> locale setting is UTF-8, so
all RedHat users got UTF-8 filehandles, whether they wanted it or not.
The pain was intensified by the Unicode implementation of Perl 5.8.0
(still) having nasty bugs, especially related to the use of s/// and
tr///.  (Bugs that have been fixed in 5.8.1.)

Therefore a decision was made to backtrack the feature and change it
from implicit silent default to explicit conscious option.  The new
Perl command line option C<-C> and its counterpart environment
variable PERL_UNICODE can now be used to control how Perl and Unicode
interact at interfaces like I/O and for example the command line
arguments.  See L<perlrun/-C> and L<perlrun/PERL_UNICODE> for more
information.
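
Independently of C<-C>, a program can still opt in explicitly for a
single filehandle.  A minimal sketch:

    # Request UTF-8 output explicitly instead of relying on the locale.
    binmode(STDOUT, ":utf8");

    # chr(0xff) is now written as the two bytes 0xc3 0xbf.
    print chr(0xff), "\n";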

=head2 Unsafe signals again available

In Perl 5.8.0 the so-called "safe signals" were introduced.  This
means that Perl no longer handles signals immediately but instead
"between opcodes", when it is safe to do so.  The earlier immediate
handling easily could corrupt the internal state of Perl, resulting
in mysterious crashes.

However, the new safer model has its problems too.  Because now an
opcode, a basic unit of Perl execution, is never interrupted but
instead left to run to completion, certain operations that can take a
long time now really do take a long time.  For example, certain
network operations have their own blocking and timeout mechanisms, and
being able to interrupt them immediately would be nice.

Therefore perl 5.8.1 introduces a "backdoor" to restore the pre-5.8.0
(pre-5.7.3, really) signal behaviour.  Just set the environment variable
PERL_SIGNALS to C<unsafe>, and the old immediate (and unsafe)
signal handling behaviour returns.  See L<perlrun/PERL_SIGNALS>
and L<perlipc/"Deferred Signals (Safe Signals)">.

In completely unrelated news, you can now use safe signals with
POSIX::SigAction.  See L<POSIX/POSIX::SigAction>.

=head2 Tied Arrays with Negative Array Indices

Formerly, the indices passed to C<FETCH>, C<STORE>, C<EXISTS>, and
C<DELETE> methods in tied array class were always non-negative.  If
the actual argument was negative, Perl would call FETCHSIZE implicitly
and add the result to the index before passing the result to the tied
array method.  This behaviour is now optional.  If the tied array class
contains a package variable named C<$NEGATIVE_INDICES> which is set to
a true value, negative values will be passed to C<FETCH>, C<STORE>,
C<EXISTS>, and C<DELETE> unchanged.
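
A minimal sketch of a tied array class that opts in; the class name is
hypothetical and only the methods needed for reading are implemented:

    use strict;
    use warnings;

    {
        package Tie::ShowIndex;    # hypothetical class name

        # Ask Perl to pass negative indices through unchanged.
        our $NEGATIVE_INDICES = 1;

        sub TIEARRAY  { my ($class) = @_; return bless {}, $class }
        sub FETCHSIZE { 5 }
        sub FETCH     { my ($self, $index) = @_; return "FETCH got $index" }
    }

    tie my @array, 'Tie::ShowIndex';
    print $array[-1], "\n";    # FETCH got -1

Without the package variable, the same access would reach C<FETCH> with
the index already adjusted by the result of C<FETCHSIZE>.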

=head2 local ${$x}

The syntaxes

	local ${$x}
	local @{$x}
	local %{$x}

now do localise variables, given that the $x is a valid variable name.
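
A short sketch, assuming C<$x> holds the name of a package variable (the
variable names are illustrative):

    use strict;
    use warnings;

    our $colour = "red";
    my $x = "main::colour";    # a variable name, not a reference

    sub show { print "colour is $colour\n" }

    {
        no strict 'refs';      # symbolic references are needed here
        local ${$x} = "blue";
        show();                # colour is blue
    }
    show();                    # colour is red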

=head2 Unicode Character Database 4.0.0

The copy of the Unicode Character Database included in Perl 5.8 has
been updated to 4.0.0 from 3.2.0.  This means for example that the
Unicode character properties are as in Unicode 4.0.0.

=head2 Deprecation Warnings

There is one new feature deprecation.  Perl 5.8.0 forgot to add
some deprecation warnings; these warnings have now been added.
Finally, a reminder of an impending feature removal.

=head3 (Reminder) Pseudo-hashes are deprecated (really)

Pseudo-hashes were deprecated in Perl 5.8.0 and will be removed in
Perl 5.10.0, see L<perl58delta> for details.  Each attempt to access
pseudo-hashes will trigger the warning C<Pseudo-hashes are deprecated>.
If you really want to continue using pseudo-hashes but not to see the
deprecation warnings, use:

    no warnings 'deprecated';

Or you can continue to use the L<fields> pragma, but please don't
expect the data structures to be pseudohashes any more.

=head3 (Reminder) 5.005-style threads are deprecated (really)

5.005-style threads (activated by C<use Thread;>) were deprecated in
Perl 5.8.0 and will be removed after Perl 5.8, see L<perl58delta> for
details.  Each 5.005-style thread creation will trigger the warning
C<5.005 threads are deprecated>.  If you really want to continue
using the 5.005 threads but not to see the deprecation warnings, use:

    no warnings 'deprecated';

=head3 (Reminder) The $* variable is deprecated (really)

The C<$*> variable controlling multi-line matching has been deprecated
and will be removed after 5.8.  The variable has been deprecated for a
long time, and a deprecation warning C<Use of $* is deprecated> is given;
now the variable will finally be removed.  The functionality has
been supplanted by the C</s> and C</m> modifiers on pattern matching.
If you really want to continue using the C<$*>-variable but not to see
the deprecation warnings, use:

    no warnings 'deprecated';

=head2 Miscellaneous Enhancements

C<map> in void context is no longer expensive. C<map> is now context
aware, and will not construct a list if called in void context.

If a socket gets closed by the server while printing to it, the client
now gets a SIGPIPE.  While this new feature was not planned, it fell
naturally out of PerlIO changes, and is to be considered an accidental
feature.

PerlIO::get_layers(FH) returns the names of the PerlIO layers
active on a filehandle.
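
For example (the layer names printed depend on the platform and on how
perl was built, so no particular output is assumed):

    use strict;
    use warnings;

    my @layers = PerlIO::get_layers(*STDOUT);
    print "STDOUT layers: @layers\n";

    # Pushing a layer changes what get_layers reports.
    binmode(STDOUT, ":utf8");
    @layers = PerlIO::get_layers(*STDOUT);
    print "after binmode: @layers\n";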

PerlIO::via layers can now have an optional UTF8 method to
indicate whether the layer wants to "auto-:utf8" the stream.

utf8::is_utf8() has been added as a quick way to test whether
a scalar is encoded internally in UTF-8 (Unicode).
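
A minimal sketch with two literal strings:

    use strict;
    use warnings;

    my $bytes = "caf\xe9";          # every character fits in one byte
    my $wide  = "smile \x{263A}";   # needs the internal UTF-8 form

    printf "bytes: %d\n", utf8::is_utf8($bytes) ? 1 : 0;    # 0
    printf "wide:  %d\n", utf8::is_utf8($wide)  ? 1 : 0;    # 1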

=head1 Modules and Pragmata

=head2 Updated Modules And Pragmata

The following modules and pragmata have been updated since Perl 5.8.0:

=over 4

=item base

=item B::Bytecode

In much better shape than it used to be.  Still far from perfect, but
maybe worth a try.

=item B::Concise

=item B::Deparse

=item Benchmark

An optional feature, C<:hireswallclock>, now allows for high
resolution wall clock times (uses Time::HiRes).

=item ByteLoader

See B::Bytecode.

=item bytes

Now has bytes::substr.

=item CGI

=item charnames

One can now have custom character name aliases.

=item CPAN

There is now a simple command line frontend to the CPAN.pm
module called F<cpan>.

=item Data::Dumper

A new option, Pair, allows choosing the separator between hash keys
and values.

=item DB_File

=item Devel::PPPort

=item Digest::MD5

=item Encode

Significant updates on the encoding pragma functionality
(tr/// and the DATA filehandle, formats).

If a filehandle has been marked as having an encoding, unmappable
characters are detected already during input, not later (when the
corrupted data is being used).

The ISO 8859-6 conversion table has been corrected (the 0x30..0x39
erroneously mapped to U+0660..U+0669, instead of U+0030..U+0039).  The
GSM 03.38 conversion did not handle escape sequences correctly.  The
UTF-7 encoding has been added (making Encode feature-complete with
Unicode::String).

=item fields

=item libnet

=item Math::BigInt

A lot of bugs have been fixed since v1.60, the version included in Perl
v5.8.0. Especially noteworthy are the bug in Calc that caused div and mod to
fail for some large values, and the fixes to the handling of bad inputs.

Some new features were added, e.g. the broot() method, you can now pass
parameters to config() to change some settings at runtime, and it is now
possible to trap the creation of NaN and infinity.

As usual, some optimizations took place and made the math overall a tad
faster. In some cases, quite a lot faster, actually. Especially alternative
libraries like Math::BigInt::GMP benefit from this. In addition, a lot of the
quite clunky routines like fsqrt() and flog() are now much much faster.

=item MIME::Base64

=item NEXT

Diamond inheritance now works.

=item Net::Ping

=item PerlIO::scalar

Reading from non-string scalars (like the special variables, see
L<perlvar>) now works.

=item podlators

=item Pod::LaTeX

=item PodParsers

=item Pod::Perldoc

Complete rewrite.  As a side-effect, no longer refuses to start up when
run by root.

=item Scalar::Util

New utilities: refaddr, isvstring, looks_like_number, set_prototype.

=item Storable

Can now store code references (via B::Deparse, so not foolproof).

=item strict

Earlier versions of the strict pragma did not check the parameters
implicitly passed to its "import" (use) and "unimport" (no) routine.
This allowed a false idiom such as:

        use strict qw(@ISA);
        @ISA = qw(Foo);

This however (probably) raised the false expectation that the strict
refs, vars and subs were being enforced (and that @ISA was somehow
"declared").  But the strict refs, vars, and subs are B<not> enforced
when using this false idiom.

Starting from Perl 5.8.1, the above B<will> cause an error to be
raised.  This may cause programs which used to execute seemingly
correctly without warnings and errors to fail when run under 5.8.1.
This happens because

        use strict qw(@ISA);

will now fail with the error:

        Unknown 'strict' tag(s) '@ISA'

The remedy to this problem is to replace this code with the correct idiom:

        use strict;
        use vars qw(@ISA);
        @ISA = qw(Foo);

=item Term::ANSIcolor

=item Test::Harness

Now much more picky about extra or missing output from test scripts.

=item Test::More

=item Test::Simple

=item Text::Balanced

=item Time::HiRes

Use of nanosleep(), if available, allows mixing subsecond sleeps with
alarms.

=item threads

Several fixes, for example for join() problems and memory
leaks.  In some platforms (like Linux) that use glibc the minimum memory
footprint of one ithread has been reduced by several hundred kilobytes.

=item threads::shared

Many memory leaks have been fixed.

=item Unicode::Collate

=item Unicode::Normalize

=item Win32::GetFolderPath

=item Win32::GetOSVersion

Now returns extra information.

=back

=head1 Utility Changes

The C<h2xs> utility now produces a more modern layout:
F<Foo-Bar/lib/Foo/Bar.pm> instead of F<Foo/Bar/Bar.pm>.
Also, the boilerplate test is now called F<t/Foo-Bar.t>
instead of F<t/1.t>.

The Perl debugger (F<lib/perl5db.pl>) has now been extensively
documented and bugs found while documenting have been fixed.

C<perldoc> has been rewritten from scratch to be more robust and
feature rich.

C<perlcc -B> works now at least somewhat better, while C<perlcc -c>
is rather more broken.  (The Perl compiler suite as a whole continues
to be experimental.)

=head1 New Documentation

perl573delta has been added to list the differences between the
(now quite obsolete) development releases 5.7.2 and 5.7.3.

perl58delta has been added: it is the perldelta of 5.8.0, detailing
the differences between 5.6.0 and 5.8.0.

perlartistic has been added: it is the Artistic License in pod format,
making it easier for modules to refer to it.

perlcheat has been added: it is a Perl cheat sheet.

perlgpl has been added: it is the GNU General Public License in pod
format, making it easier for modules to refer to it.

perlmacosx has been added to tell about the installation and use
of Perl in Mac OS X.

perlos400 has been added to tell about the installation and use
of Perl in OS/400 PASE.

perlreref has been added: it is a regular expressions quick reference.

=head1 Installation and Configuration Improvements

The Unix standard Perl location, F</usr/bin/perl>, is no longer
overwritten by default if it exists.  This change was very prudent
because so many Unix vendors already provide a F</usr/bin/perl>,
but simultaneously many system utilities may depend on that
exact version of Perl, so better not to overwrite it.

One can now specify installation directories for site and vendor man
and HTML pages, and site and vendor scripts.  See F<INSTALL>.

One can now specify a destination directory for Perl installation
by specifying the DESTDIR variable for C<make install>.  (This feature
is slightly different from the previous C<Configure -Dinstallprefix=...>.)
See F<INSTALL>.

gcc versions 3.x introduced a new warning that caused a lot of noise
during Perl compilation: C<gcc -Ialreadyknowndirectory (warning:
changing search order)>.  This warning has now been avoided by
Configure weeding out such directories before the compilation.

One can now build subsets of Perl core modules by using the
Configure flags C<-Dnoextensions=...> and C<-Donlyextensions=...>,
see F<INSTALL>.

=head2 Platform-specific enhancements

In Cygwin Perl can now be built with threads (C<Configure -Duseithreads>).
This works with both Cygwin 1.3.22 and Cygwin 1.5.3.

In newer FreeBSD releases Perl 5.8.0 compilation failed because of
trying to use F<malloc.h>, which in FreeBSD is just a dummy file that
causes a fatal error if used.  Now F<malloc.h> is not used.

Perl is now known to build also in Hitachi HI-UXMPP.

Perl is now known to build again in LynxOS.

Mac OS X now installs with Perl version number embedded in
installation directory names for easier upgrading of user-compiled
Perl, and the installation directories in general are more standard.
In other words, the default installation no longer breaks the
Apple-provided Perl.  On the other hand, with C<Configure -Dprefix=/usr>
you can now really replace the Apple-supplied Perl (B<please be careful>).

Mac OS X now builds Perl statically by default.  This change was done
mainly for faster startup times.  The Apple-provided Perl is still
dynamically linked and shared, and you can enable the sharedness for
your own Perl builds by C<Configure -Duseshrplib>.

Perl has been ported to IBM's OS/400 PASE environment.  The best way
to build a Perl for PASE is to use an AIX host as a cross-compilation
environment.  See README.os400.

Yet another cross-compilation option has been added: now Perl builds
on OpenZaurus, a Linux distribution based on Mandrake + Embedix for
the Sharp Zaurus PDA.  See the Cross/README file.

Tru64 when using gcc 3 drops the optimisation for F<toke.c> to C<-O2>
because of gigantic memory use with the default C<-O3>.

Tru64 can now build Perl with the newer Berkeley DBs.

Building Perl on WinCE has been much enhanced, see F<README.ce>
and F<README.perlce>.

=head1 Selected Bug Fixes

=head2 Closures, eval and lexicals

There have been many fixes in the area of anonymous subs, lexicals and
closures.  Although this means that Perl is now more "correct", it is
possible that some existing code will break that happens to rely on
the faulty behaviour.  In practice this is unlikely unless your code
contains a very complex nesting of anonymous subs, evals and lexicals.

=head2 Generic fixes

If an input filehandle is marked C<:utf8> and Perl sees illegal UTF-8
coming in when doing C<< <FH> >>, if warnings are enabled a warning is
immediately given - instead of being silent about it and Perl being
unhappy about the broken data later.  (The C<:encoding(utf8)> layer
also works the same way.)

binmode(SOCKET, ":utf8") only worked on the input side, not on the
output side of the socket.  Now it works both ways.

For threaded Perls certain system database functions like getpwent()
and getgrent() now grow their result buffer dynamically, instead of
failing.  This means that at sites with lots of users and groups the
functions no longer fail by returning only partial results.

Perl 5.8.0 had accidentally broken the capability for users
to define their own uppercase<->lowercase Unicode mappings
(as advertised by the Camel).  This feature has been fixed and
is also documented better.

In 5.8.0 this

	$some_unicode .= <FH>;

didn't work correctly but instead corrupted the data.  This has now
been fixed.

Tied methods like FETCH etc. may now safely access tied values, i.e.
resulting in a recursive call to FETCH etc.  Remember to break the
recursion, though.

At startup Perl blocks the SIGFPE signal away since there isn't much
Perl can do about it.  Previously this blocking was in effect also for
programs executed from within Perl.  Now Perl restores the original
SIGFPE handling routine, whatever it was, before running external
programs.

Line numbers in Perl scripts may now be greater than 65536, or 2**16.
(Perl scripts have always been able to be larger than that, it's just
that the line numbers in reported errors and warnings "wrapped
around".)  While scripts that large usually indicate a need to rethink
your code a bit, such Perl scripts do exist, for example as results
from generated code.  Now line numbers can go all the way to
4294967296, or 2**32.

=head2 Platform-specific fixes

Linux

=over 4

=item *

Setting $0 works again (with certain limitations that
Perl cannot do much about: see L<perlvar/$0>)

=back

HP-UX

=over 4

=item *

Setting $0 now works.

=back

VMS

=over 4

=item *

Configuration now tests for the presence of C<poll()>, and IO::Poll
now uses the vendor-supplied function if detected.

=item *

A rare access violation at Perl start-up could occur if the Perl image was
installed with privileges or if there was an identifier with the
subsystem attribute set in the process's rightslist.  Either of these
circumstances triggered tainting code that contained a pointer bug. 
The faulty pointer arithmetic has been fixed.

=item *

The length limit on values (not keys) in the %ENV hash has been raised
from 255 bytes to 32640 bytes (except when the PERL_ENV_TABLES setting
overrides the default use of logical names for %ENV).  If it is
necessary to access these long values from outside Perl, be aware that
they are implemented using search list logical names that store the
value in pieces, each 255-byte piece (up to 128 of them) being an
element in the search list. When doing a lookup in %ENV from within
Perl, the elements are combined into a single value.  The existing
VMS-specific ability to access individual elements of a search list
logical name via the $ENV{'foo;N'} syntax (where N is the search list
index) is unimpaired.

=item *

The piping implementation now uses local rather than global DCL
symbols for inter-process communication.

=item *

File::Find could become confused when navigating to a relative
directory whose name collided with a logical name.  This problem has
been corrected by adding directory syntax to relative path names, thus
preventing logical name translation.

=back

Win32

=over 4

=item *

A memory leak in the fork() emulation has been fixed.

=item *

The return value of the ioctl() built-in function was accidentally
broken in 5.8.0.  This has been corrected.

=item *

The internal message loop executed by perl during blocking operations
sometimes interfered with messages that were external to Perl.
This often resulted in blocking operations terminating prematurely or
returning incorrect results, when Perl was executing under environments
that could generate Windows messages.  This has been corrected.

=item *

Pipes and sockets are now automatically in binary mode.

=item *

The four-argument form of select() did not preserve $! (errno) properly
when there were errors in the underlying call.  This is now fixed.

=item *

The "CR CR LF" problem has been fixed; binmode(FH, ":crlf")
is now effectively a no-op.

=back

=head1 New or Changed Diagnostics

All the warnings related to pack() and unpack() were made more
informative and consistent.

=head2 Changed "A thread exited while %d threads were running"

The old version

    A thread exited while %d other threads were still running

was misleading because the "other" included also the thread giving
the warning.

=head2 Removed "Attempt to clear a restricted hash"

It is not illegal to clear a restricted hash, so the warning
was removed.

=head2 New "Illegal declaration of anonymous subroutine"

You must specify the block of code for C<sub>.

=head2 Changed "Invalid range "%s" in transliteration operator"

The old version

    Invalid [] range "%s" in transliteration operator

was simply wrong because there are no "[] ranges" in tr///.

=head2 New "Missing control char name in \c"

Self-explanatory.

=head2 New "Newline in left-justified string for %s"

The padding spaces would appear after the newline, which is
probably not what you had in mind.

=head2 New "Possible precedence problem on bitwise %c operator"

If you think this

    $x & $y == 0

tests whether the bitwise AND of $x and $y is zero,
you will like this warning.
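
As a minimal sketch of why the warning is useful (the values are chosen
only for illustration): C<==> binds more tightly than C<&>, so the
unparenthesised form is parsed as C<$x & ($y == 0)>.

    use strict;
    use warnings;

    my ($x, $y) = (6, 1);    # 6 & 1 is 0: the values share no bits

    my $unparenthesised = do {
        no warnings 'precedence';    # silence the new warning here
        $x & $y == 0;                # parsed as $x & ($y == 0)
    };
    my $parenthesised = ($x & $y) == 0;

    print "without parentheses: ",
        ($unparenthesised ? "true" : "false"), "\n";
    print "with parentheses:    ",
        ($parenthesised ? "true" : "false"), "\n";

With these values only the parenthesised test is true, which is the
result the author of such code almost certainly intended.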

=head2 New "Pseudo-hashes are deprecated"

This warning should already have been in 5.8.0, since pseudo-hashes
were deprecated there.

=head2 New "read() on %s filehandle %s"

You cannot read() (or sysread()) from a closed or unopened filehandle.

=head2 New "5.005 threads are deprecated"

This warning should already have been in 5.8.0, since 5.005 threads
were deprecated there.

=head2 New "Tied variable freed while still in use"

Something pulled the plug on a live tied variable; Perl plays
safe by bailing out.

=head2 New "To%s: illegal mapping '%s'"

An illegal user-defined Unicode casemapping was specified.

=head2 New "Use of freed value in iteration"

Something modified the values being iterated over.  This is not good.

=head1 Changed Internals

This news matters to you only if you either write XS code or like to
know about or hack Perl internals (using Devel::Peek or any of the
C<B::> modules counts), or like to run Perl with the C<-D> option.

The embedding examples of L<perlembed> have been reviewed to be
up to date and consistent: for example, the correct use of
PERL_SYS_INIT3() and PERL_SYS_TERM().

Extensive reworking of the pad code (the code responsible
for lexical variables) has been conducted by Dave Mitchell.

Extensive work on the v-strings by John Peacock.

UTF-8 length and position cache: to speed up the handling of Unicode
(UTF-8) scalars, a cache was introduced.  Potential problems exist if
an extension bypasses the official APIs and directly modifies the PV
of an SV: the UTF-8 cache does not get cleared as it should.

APIs obsoleted in Perl 5.8.0, like sv_2pv, sv_catpvn, sv_catsv,
sv_setsv, are again available.

Certain Perl core C APIs like cxinc and regatom are no longer
available at all to code outside the Perl core or the Perl core
extensions.  This is intentional.  They never should have been
available with the shorter names, and if your application depends on
them, you should (be ashamed and) contact perl5-porters to discuss
what the proper APIs are.

Certain Perl core C APIs like C<Perl_list> are no longer available
without their C<Perl_> prefix.  If your XS module stops working
because some functions cannot be found, in many cases a simple fix is
to add the C<Perl_> prefix to the function and the thread context
C<aTHX_> as the first argument of the function call.  This is also how
it should always have been done: letting the Perl_-less forms leak
from the core was an accident.  For cleaner embedding you can also
force this for all APIs by defining at compile time the cpp define
PERL_NO_SHORT_NAMES.

Perl_save_bool() has been added.

Regexp objects (those created with C<qr>) now have S-magic rather than
R-magic.  This fixed regexps of the form /...(??{...;$x})/ to no
longer ignore changes made to $x.  The S-magic avoids dropping
the caching optimization and making (??{...}) constructs obscenely
slow (and consequently useless).  See also L<perlguts/"Magic Variables">.
Regexp::Copy was affected by this change.

The Perl internal debugging macros DEBUG() and DEB() have been renamed
to PERL_DEBUG() and PERL_DEB() to avoid namespace conflicts.

C<-DL> removed (the leaktest had been broken and unsupported for years,
use alternative debugging mallocs or tools like valgrind and Purify).

Verbose modifier C<v> added for C<-DXv> and C<-Dsv>, see L<perlrun>.

=head1 New Tests

In Perl 5.8.0 there were about 69000 separate tests in about 700 test files,
in Perl 5.8.1 there are about 77000 separate tests in about 780 test files.
The exact numbers depend on the Perl configuration and on the operating
system platform.

=head1 Known Problems

The hash randomisation mentioned in L</Incompatible Changes> is definitely
problematic: it will wake dormant bugs and shake out bad assumptions.

If you want to use mod_perl 2.x with Perl 5.8.1, you will need
mod_perl-1.99_10 or higher.  Earlier versions of mod_perl 2.x
do not work with the randomised hashes.  (mod_perl 1.x works fine.)
You will also need Apache::Test 1.04 or higher.

Many of the rarer platforms that worked 100% or pretty close to it
with perl 5.8.0 have been left a little bit untended since their
maintainers have been otherwise busy lately, and therefore there will
be more failures on those platforms.  Such platforms include Mac OS
Classic, IBM z/OS (and other EBCDIC platforms), and NetWare.  The most
common Perl platforms (Unix and Unix-like, Microsoft platforms, and
VMS) have large enough testing and expert population that they are
doing well.

=head2 Tied hashes in scalar context

Tied hashes do not currently return anything useful in scalar context,
for example when used as boolean tests:

	if (%tied_hash) { ... }

The current nonsensical behaviour is always to return false,
regardless of whether the hash is empty or has elements.

The root cause is that there is no interface for the implementors of
tied hashes to implement the behaviour of a hash in scalar context.

=head2 Net::Ping 450_service and 510_ping_udp failures

The subtests 9 and 18 of lib/Net/Ping/t/450_service.t, and the
subtest 2 of lib/Net/Ping/t/510_ping_udp.t might fail if you have
an unusual networking setup.  For example in the latter case the
test is trying to send a UDP ping to the IP address 127.0.0.1.

=head2 B::C

The C-generating compiler backend B::C (the frontend being
C<perlcc -c>) is even more broken than it used to be because of
the extensive lexical variable changes.  (The good news is that
B::Bytecode and ByteLoader are better than they used to be.)

=head1 Platform Specific Problems

=head2 EBCDIC Platforms

IBM z/OS and other EBCDIC platforms continue to be problematic
regarding Unicode support.  Many Unicode tests are skipped when
they really should be fixed.

=head2 Cygwin 1.5 problems

In Cygwin 1.5 the F<io/tell> and F<op/sysio> tests have failures for
some yet unknown reason.  In 1.5.5 the threads tests stress_cv,
stress_re, and stress_string are failing unless the environment
variable PERLIO is set to "perlio" (which makes also the io/tell
failure go away).

Perl 5.8.1 does build and work well with Cygwin 1.3: with (uname -a)
C<CYGWIN_NT-5.0 ... 1.3.22(0.78/3/2) 2003-03-18 09:20 i686 ...>
a 100% "make test" was achieved with C<Configure -des -Duseithreads>.

=head2 HP-UX: HP cc warnings about sendfile and sendpath

With certain HP C compiler releases (e.g. B.11.11.02) you will
get many warnings like this (lines wrapped for easier reading):

  cc: "/usr/include/sys/socket.h", line 504: warning 562:
    Redeclaration of "sendfile" with a different storage class specifier:
      "sendfile" will have internal linkage.
  cc: "/usr/include/sys/socket.h", line 505: warning 562:
    Redeclaration of "sendpath" with a different storage class specifier:
      "sendpath" will have internal linkage.

The warnings show up both during the build of Perl and during certain
lib/ExtUtils tests that invoke the C compiler.  The warning, however,
is not serious and can be ignored.

=head2 IRIX: t/uni/tr_7jis.t falsely failing

The test t/uni/tr_7jis.t is known to report failure under 'make test'
or the test harness with certain releases of IRIX (at least IRIX 6.5
and MIPSpro Compilers Version 7.3.1.1m), but if run manually the test
fully passes.

=head2 Mac OS X: no usemymalloc

The Perl malloc (C<-Dusemymalloc>) does not work at all in Mac OS X.
This is not that serious, though, since the native malloc works just
fine.

=head2 Tru64: No threaded builds with GNU cc (gcc)

In the latest Tru64 releases (e.g. v5.1B or later) gcc cannot be used
to compile a threaded Perl (-Duseithreads) because the system
C<< <pthread.h> >> file doesn't know about gcc.

=head2 Win32: sysopen, sysread, syswrite

As of the 5.8.0 release, sysopen()/sysread()/syswrite() do not behave
like they used to in 5.6.1 and earlier with respect to "text" mode.
These built-ins now always operate in "binary" mode (even if sysopen()
was passed the O_TEXT flag, or if binmode() was used on the file
handle).  Note that this issue should only make a difference for disk
files, as sockets and pipes have always been in "binary" mode in the
Windows port.  As this behavior is currently considered a bug,
compatible behavior may be re-introduced in a future release.  Until
then, the use of sysopen(), sysread() and syswrite() is not supported
for "text" mode operations.

=head1 Future Directions

The following things B<might> happen in future.  The first publicly
available releases having these characteristics will be the developer
releases Perl 5.9.x, culminating in the Perl 5.10.0 release.  These
are our best guesses at the moment: we reserve the right to rethink.

=over 4

=item *

PerlIO will become The Default.  Currently (in Perl 5.8.x) the stdio
library is still used if Perl thinks it can use certain tricks to
make stdio go B<really> fast.  For future releases our goal is to
make PerlIO go even faster.

=item *

A new feature called I<assertions> will be available.  This means that
one can have code called assertions sprinkled in the code: usually
they are optimised away, but they can be enabled with the C<-A> option.

=item *

A new operator C<//> (defined-or) will be available.  This means that
one will be able to say

    $a // $b

instead of

   defined $a ? $a : $b

and

   $c //= $d;

instead of

   $c = $d unless defined $c;

The operator will have the same precedence and associativity as C<||>.
A source code patch against the Perl 5.8.1 sources will be available
in CPAN as F<authors/id/H/HM/HMBRAND/dor-5.8.1.diff>.

=item *

C<unpack()> will default to unpacking the C<$_>.

=item *

Various Copy-On-Write techniques will be investigated in hopes
of speeding up Perl.

=item *

CPANPLUS, Inline, and Module::Build will become core modules.

=item *

The ability to write true lexically scoped pragmas will be introduced.

=item *

Work will continue on the bytecompiler and byteloader.

=item *

v-strings as they currently exist are scheduled to be deprecated.  The
v-less form (1.2.3) will become a "version object" when used with C<use>,
C<require>, and C<$VERSION>.  $^V will also be a "version object" so the
printf("%vd",...) construct will no longer be needed.  The v-ful version
(v1.2.3) will become obsolete.  The equivalence of strings and v-strings (e.g.
that currently 5.8.0 is equal to "\5\8\0") will go away.  B<There may be no
deprecation warning for v-strings>, though: it is quite hard to detect when
v-strings are being used safely, and when they are not.

=item *

5.005 Threads Will Be Removed

=item *

The C<$*> Variable Will Be Removed
(it was deprecated a long time ago)

=item *

Pseudohashes Will Be Removed

=back
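
The defined-or operator described in the list above can be sketched as
follows.  This assumes a Perl in which C<//> is available (it shipped
in Perl 5.10.0); the variable names are illustrative only.

    use strict;
    use warnings;

    my $count = 0;                    # defined, but false
    my $or         = $count || 10;    # 10: || tests truth
    my $defined_or = $count // 10;    # 0:  // tests definedness

    my $limit;                        # undefined
    $limit //= 25;                    # assigned: $limit was undef
    $limit //= 50;                    # left alone: $limit is defined

    print "or=$or defined_or=$defined_or limit=$limit\n";

The difference only shows for values that are defined but false, such
as C<0> or the empty string.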

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org/ .  There may also be
information at http://www.perl.com/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
=head1 NAME

perluniintro - Perl Unicode introduction

=head1 DESCRIPTION

This document gives a general idea of Unicode and how to use Unicode
in Perl.  See L</Further Resources> for references to more in-depth
treatments of Unicode.

=head2 Unicode

Unicode is a character set standard which plans to codify all of the
writing systems of the world, plus many other symbols.

Unicode and ISO/IEC 10646 are coordinated standards that unify
almost all other modern character set standards,
covering more than 80 writing systems and hundreds of languages,
including all commercially-important modern languages.  All characters
in the largest Chinese, Japanese, and Korean dictionaries are also
encoded. The standards will eventually cover almost all characters in
more than 250 writing systems and thousands of languages.
Unicode 1.0 was released in October 1991, and 6.0 in October 2010.

A Unicode I<character> is an abstract entity.  It is not bound to any
particular integer width, especially not to the C language C<char>.
Unicode is language-neutral and display-neutral: it does not encode the
language of the text, and it does not generally define fonts or other graphical
layout details.  Unicode operates on characters and on text built from
those characters.

Unicode defines characters like C<LATIN CAPITAL LETTER A> or C<GREEK
SMALL LETTER ALPHA> and unique numbers for the characters, in this
case 0x0041 and 0x03B1, respectively.  These unique numbers are called
I<code points>.  A code point is essentially the position of the
character within the set of all possible Unicode characters, and thus in
Perl, the term I<ordinal> is often used interchangeably with it.

The Unicode standard prefers using hexadecimal notation for the code
points.  If numbers like C<0x0041> are unfamiliar to you, take a peek
at a later section, L</"Hexadecimal Notation">.  The Unicode standard
uses the notation C<U+0041 LATIN CAPITAL LETTER A>, to give the
hexadecimal code point and the normative name of the character.
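
As a small sketch of the relationship between characters, code points,
and names (assuming an ASCII platform, where C<ord()> returns the
Unicode code point directly):

 use strict;
 use warnings;
 use charnames ':full';    # enables \N{NAME} and charnames::viacode()

 for my $char ("A", "\N{GREEK SMALL LETTER ALPHA}") {
     my $code_point = ord($char);
     printf "U+%04X %s\n", $code_point, charnames::viacode($code_point);
 }

This prints each character in the standard's C<U+XXXX NAME> notation.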

Unicode also defines various I<properties> for the characters, like
"uppercase" or "lowercase", "decimal digit", or "punctuation";
these properties are independent of the names of the characters.
Furthermore, various operations on the characters like uppercasing,
lowercasing, and collating (sorting) are defined.

A Unicode I<logical> "character" can actually consist of more than one internal
I<actual> "character" or code point.  For Western languages, this is adequately
modelled by a I<base character> (like C<LATIN CAPITAL LETTER A>) followed
by one or more I<modifiers> (like C<COMBINING ACUTE ACCENT>).  This sequence of
base character and modifiers is called a I<combining character
sequence>.  Some non-western languages require more complicated
models, so Unicode created the I<grapheme cluster> concept, which was
later further refined into the I<extended grapheme cluster>.  For
example, a Korean Hangul syllable is considered a single logical
character, but most often consists of three actual
Unicode characters: a leading consonant followed by an interior vowel followed
by a trailing consonant.

Whether to call these extended grapheme clusters "characters" depends on your
point of view. If you are a programmer, you probably would tend towards seeing
each element in the sequences as one unit, or "character".  However from
the user's point of view, the whole sequence could be seen as one
"character" since that's probably what it looks like in the context of the
user's language.  In this document, we take the programmer's point of
view: one "character" is one Unicode code point.
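
The two points of view can be compared directly.  In the sketch below
(the helper C<describe()> is made up for this illustration), C<length()>
counts code points, while the regular expression escape C<\X> matches
one extended grapheme cluster at a time:

 use strict;
 use warnings;

 # Report code points and extended grapheme clusters in a string.
 sub describe {
     my ($label, $string) = @_;
     my $clusters = () = $string =~ /\X/g;
     printf "%s: %d code points, %d cluster(s)\n",
         $label, length($string), $clusters;
 }

 describe("A + acute",    "A\x{301}");
 describe("Hangul jamos", "\x{1112}\x{1161}\x{11AB}");

Each string holds several code points, but a Perl with extended
grapheme cluster support (v5.12 or later) should report a single
cluster for both.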

For some combinations of base character and modifiers, there are
I<precomposed> characters.  There is a single character equivalent, for
example, for the sequence C<LATIN CAPITAL LETTER A> followed by
C<COMBINING ACUTE ACCENT>.  It is called  C<LATIN CAPITAL LETTER A WITH
ACUTE>.  These precomposed characters are, however, only available for
some combinations, and are mainly meant to support round-trip
conversions between Unicode and legacy standards (like ISO 8859).  Using
sequences, as Unicode does, allows for needing fewer basic building blocks
(code points) to express many more potential grapheme clusters.  To
support conversion between equivalent forms, various I<normalization
forms> are also defined.  Thus, C<LATIN CAPITAL LETTER A WITH ACUTE> is
in I<Normalization Form Composed>, (abbreviated NFC), and the sequence
C<LATIN CAPITAL LETTER A> followed by C<COMBINING ACUTE ACCENT>
represents the same character in I<Normalization Form Decomposed> (NFD).
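
A minimal sketch of converting between the two forms with the core
C<Unicode::Normalize> module (the helper C<code_points()> is made up
for this illustration, and an ASCII platform is assumed):

 use strict;
 use warnings;
 use Unicode::Normalize qw(NFC NFD);

 # Render a string as a list of U+XXXX code points.
 sub code_points {
     my ($string) = @_;
     return join " ", map { sprintf "U+%04X", ord } split //, $string;
 }

 my $composed   = "\x{C1}";          # LATIN CAPITAL LETTER A WITH ACUTE
 my $decomposed = NFD($composed);    # base character plus accent

 print "NFD: ", code_points($decomposed),      "\n";
 print "NFC: ", code_points(NFC($decomposed)), "\n";

The NFD line shows C<U+0041 U+0301> and the NFC line shows C<U+00C1>:
two spellings of the same logical character.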

Because of backward compatibility with legacy encodings, the "a unique
number for every character" idea breaks down a bit: instead, there is
"at least one number for every character".  The same character could
be represented differently in several legacy encodings.  The
converse is not true: some code points do not have an assigned
character.  Firstly, there are unallocated code points within
otherwise used blocks.  Secondly, there are special Unicode control
characters that do not represent true characters.

When Unicode was first conceived, it was thought that all the world's
characters could be represented using a 16-bit word; that is a maximum of
C<0x10000> (or 65,536) characters would be needed, from C<0x0000> to
C<0xFFFF>.  This soon proved to be wrong, and since Unicode 2.0 (July
1996), Unicode has been defined all the way up to 21 bits (C<0x10FFFF>),
and Unicode 3.1 (March 2001) defined the first characters above C<0xFFFF>.
The first C<0x10000> characters are called the I<Plane 0>, or the
I<Basic Multilingual Plane> (BMP).  With Unicode 3.1, 17 (yes,
seventeen) planes in all were defined--but they are nowhere near full of
defined characters, yet.
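
The plane arithmetic can be sketched in a few lines: each plane holds
C<0x10000> code points, so the plane number is what remains after
dropping the low 16 bits of the code point.  The sample code points are
chosen only for illustration.

 use strict;
 use warnings;

 # The plane is the code point with its low 16 bits dropped.
 for my $code_point (0x0041, 0xFFFF, 0x10000, 0x1F600, 0x10FFFF) {
     printf "U+%04X is in plane %d\n", $code_point, $code_point >> 16;
 }

The highest code point, C<0x10FFFF>, falls in plane 16, which together
with plane 0 gives the seventeen planes mentioned above.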

When a new language is being encoded, Unicode generally will choose a
C<block> of consecutive unallocated code points for its characters.  So
far, the number of code points in these blocks has always been evenly
divisible by 16.  Extras in a block, not currently needed, are left
unallocated, for future growth.  But there have been occasions when
a later release needed more code points than the available extras, and a
new block had to be allocated somewhere else, not contiguous to the initial
one, to handle the overflow.  Thus, it became apparent early on that
"block" wasn't an adequate organizing principle, and so the C<Script>
property was created.  (Later an improved script property was added as
well, the C<Script_Extensions> property.)  Those code points that are in
overflow blocks can still
have the same script as the original ones.  The script concept fits more
closely with natural language: there is C<Latin> script, C<Greek>
script, and so on; and there are several artificial scripts, like
C<Common> for characters that are used in multiple scripts, such as
mathematical symbols.  Scripts usually span varied parts of several
blocks.  For more information about scripts, see L<perlunicode/Scripts>.
The division into blocks exists, but it is almost completely
accidental--an artifact of how the characters have been and still are
allocated.  (Note that this paragraph has oversimplified things for the
sake of this being an introduction.  Unicode doesn't really encode
languages, but the writing systems for them--their scripts; and one
script can be used by many languages.  Unicode also encodes things that
aren't really about languages, such as symbols like C<BAGGAGE CLAIM>.)
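
As a short sketch, the C<Script> property can be tested in a regular
expression with C<\p{Script=...}>.  The sample characters are chosen
only for illustration.

 use strict;
 use warnings;

 print "A is Latin\n"         if "A"       =~ /\p{Script=Latin}/;
 print "alpha is Greek\n"     if "\x{3B1}" =~ /\p{Script=Greek}/;
 print "plus is Common\n"     if "+"       =~ /\p{Script=Common}/;
 print "alpha is not Latin\n" if "\x{3B1}" !~ /\p{Script=Latin}/;

All four lines are printed: each character belongs to exactly one
value of the C<Script> property.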

The Unicode code points are just abstract numbers.  To input and
output these abstract numbers, the numbers must be I<encoded> or
I<serialised> somehow.  Unicode defines several I<character encoding
forms>, of which I<UTF-8> is the most popular.  UTF-8 is a
variable length encoding that encodes Unicode characters as 1 to 4
bytes.  Other encodings
include UTF-16 and UTF-32 and their big- and little-endian variants
(UTF-8 is byte-order independent).  The ISO/IEC 10646 defines the UCS-2
and UCS-4 encoding forms.
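
To make the variable length concrete, the following sketch uses the
core C<Encode> module to serialise four sample code points (chosen
only for illustration) and prints the resulting bytes:

 use strict;
 use warnings;
 use Encode qw(encode);

 for my $code_point (0x41, 0xE9, 0x263A, 0x1F600) {
     my $bytes = encode("UTF-8", chr($code_point));
     printf "U+%04X -> %d byte(s): %s\n", $code_point, length($bytes),
         join " ", map { sprintf "%02X", ord } split //, $bytes;
 }

On an ASCII platform the four code points take 1, 2, 3, and 4 bytes
respectively.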

For more information about encodings--for instance, to learn what
I<surrogates> and I<byte order marks> (BOMs) are--see L<perlunicode>.

=head2 Perl's Unicode Support

Starting from Perl v5.6.0, Perl has had the capacity to handle Unicode
natively.  Perl v5.8.0, however, is the first recommended release for
serious Unicode work.  The maintenance release 5.6.1 fixed many of the
problems of the initial Unicode implementation, but for example
regular expressions still do not work with Unicode in 5.6.1.
Perl v5.14.0 is the first release where Unicode support is
(almost) seamlessly integrable without some gotchas. (There are a few
exceptions. Firstly, some differences in L<quotemeta|perlfunc/quotemeta>
were fixed starting in Perl 5.16.0. Secondly, some differences in
L<the range operator|perlop/Range Operators> were fixed starting in
Perl 5.26.0. Thirdly, some differences in L<split|perlfunc/split> were fixed
starting in Perl 5.28.0.)

To enable this
seamless support, you should C<use feature 'unicode_strings'> (which is
automatically selected if you C<use 5.012> or higher).  See L<feature>.
(5.14 also fixes a number of bugs and departures from the Unicode
standard.)
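
A minimal sketch of what the feature changes, assuming an ASCII
platform and no locale in effect: a character between 0x80 and 0xFF
that Perl stores as a single native byte is treated as a Unicode
character only inside the scope of the feature.

 use strict;
 use warnings;

 my $e_acute = "\xE9";    # LATIN SMALL LETTER E WITH ACUTE, one byte

 my ($plain, $with_feature);
 {
     no feature 'unicode_strings';
     $plain = uc($e_acute);           # byte semantics: stays \xE9
 }
 {
     use feature 'unicode_strings';
     $with_feature = uc($e_acute);    # Unicode semantics: \xC9
 }

 printf "without: %02X, with: %02X\n", ord($plain), ord($with_feature);

The feature is lexically scoped, so only the second C<uc()> call sees
the character as having an uppercase form.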

Before Perl v5.8.0, C<use utf8> was used to declare
that operations in the current block or file would be Unicode-aware.
This model was found to be wrong, or at least clumsy: the "Unicodeness"
is now carried with the data, instead of being attached to the
operations.
Starting with Perl v5.8.0, only one case remains where an explicit C<use
utf8> is needed: if your Perl script itself is encoded in UTF-8, you can
use UTF-8 in your identifier names, and in string and regular expression
literals, by saying C<use utf8>.  This is not the default because
scripts with legacy 8-bit data in them would break.  See L<utf8>.

=head2 Perl's Unicode Model

Perl supports both pre-5.6 strings of eight-bit native bytes, and
strings of Unicode characters.  The general principle is that Perl tries
to keep its data as eight-bit bytes for as long as possible, but as soon
as Unicodeness cannot be avoided, the data is transparently upgraded
to Unicode.  Prior to Perl v5.14.0, the upgrade was not completely
transparent (see L<perlunicode/The "Unicode Bug">), and for backwards
compatibility, full transparency is not gained unless C<use feature
'unicode_strings'> (see L<feature>) or C<use 5.012> (or higher) is
selected.

Internally, Perl currently uses either the native eight-bit character
set of the platform (for example Latin-1) or UTF-8 to encode Unicode
strings.  Specifically, if all code points in the string are C<0xFF>
or less, Perl uses the native eight-bit character set.  Otherwise, it
uses UTF-8.
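
The internal choice can be observed, purely for illustration, with
C<utf8::is_utf8()>.  This is a sketch for freshly created string
literals; a string that has been upgraded elsewhere may be held in
UTF-8 even when all its code points are below 0x100.

 use strict;
 use warnings;

 for my $string ("\x{DF}", "\x{0100}\x{DF}") {
     printf "%d character(s), stored as %s\n", length($string),
         utf8::is_utf8($string) ? "UTF-8" : "native eight-bit";
 }

The first string is kept in the native eight-bit form; the second
contains a code point above 0xFF and is therefore held in UTF-8.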

A user of Perl does not normally need to know nor care how Perl
happens to encode its internal strings, but it becomes relevant when
outputting Unicode strings to a stream without a PerlIO layer (one with
the "default" encoding).  In such a case, the raw bytes used internally
(the native character set or UTF-8, as appropriate for each string)
will be used, and a "Wide character" warning will be issued if those
strings contain a character beyond 0x00FF.

For example,

      perl -e 'print "\x{DF}\n", "\x{0100}\x{DF}\n"'

produces a fairly useless mixture of native bytes and UTF-8, as well
as a warning:

     Wide character in print at ...

To output UTF-8, use the C<:encoding> or C<:utf8> output layer.  Prepending

      binmode(STDOUT, ":utf8");

to this sample program ensures that the output is completely UTF-8,
and removes the program's warning.
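
As a sketch of the layer at work without touching C<STDOUT>, the same
two strings can be written through an C<:encoding(UTF-8)> layer to an
in-memory filehandle (an assumption made here purely for illustration)
and the resulting bytes inspected:

 use strict;
 use warnings;

 my $buffer = "";
 open(my $out, ">:encoding(UTF-8)", \$buffer)
     or die "open failed: $!";
 print {$out} "\x{DF}\n", "\x{0100}\x{DF}\n";
 close($out) or die "close failed: $!";

 print join(" ", map { sprintf "%02X", ord } split //, $buffer), "\n";

On an ASCII platform this prints C<C3 9F 0A C4 80 C3 9F 0A>: every
character, including the one below 0x100, comes out as UTF-8, and no
"Wide character" warning is issued.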

You can enable automatic UTF-8-ification of your standard file
handles, default C<open()> layer, and C<@ARGV> by using either
the C<-C> command line switch or the C<PERL_UNICODE> environment
variable, see L<perlrun> for the documentation of the C<-C> switch.

Note that this means that Perl expects other software to work the same
way:
if Perl has been led to believe that STDIN should be UTF-8, but then
STDIN coming in from another command is not UTF-8, Perl will likely
complain about the malformed UTF-8.

All features that combine Unicode and I/O also require using the new
PerlIO feature.  Almost all Perl 5.8 platforms do use PerlIO, though:
you can see whether yours is by running "perl -V" and looking for
C<useperlio=define>.

=head2 Unicode and EBCDIC

Perl 5.8.0 added support for Unicode on EBCDIC platforms.  This support
was allowed to lapse in later releases, but was revived in 5.22.
Unicode support is somewhat more complex to implement since additional
conversions are needed.  See L<perlebcdic> for more information.

On EBCDIC platforms, the internal Unicode encoding form is UTF-EBCDIC
instead of UTF-8.  The difference is that UTF-8 is "ASCII-safe", in
that ASCII characters encode to UTF-8 as-is, while UTF-EBCDIC is
"EBCDIC-safe", in that all the basic characters (which include all
those that have ASCII equivalents, like C<"A">, C<"0">, C<"%">, I<etc.>)
are the same in both EBCDIC and UTF-EBCDIC.  Often, documentation
will use the term "UTF-8" to mean UTF-EBCDIC as well.  This is the case
in this document.

=head2 Creating Unicode

This section applies fully to Perls starting with v5.22.  Various
caveats for earlier releases are in the L</Earlier releases caveats>
subsection below.

To create Unicode characters in literals,
use the C<\N{...}> notation in double-quoted strings:

 my $smiley_from_name = "\N{WHITE SMILING FACE}";
 my $smiley_from_code_point = "\N{U+263a}";

Similarly, they can be used in regular expression literals:

 $smiley =~ /\N{WHITE SMILING FACE}/;
 $smiley =~ /\N{U+263a}/;

At run-time you can use:

 use charnames ();
 my $hebrew_alef_from_name
                      = charnames::string_vianame("HEBREW LETTER ALEF");
 my $hebrew_alef_from_code_point = charnames::string_vianame("U+05D0");

Naturally, C<ord()> will do the reverse: it turns a character into
a code point.
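
As a small illustration of that round trip (assuming Perl v5.16 or
later, and with variable names chosen only for this sketch),
C<charnames::viacode()> can then map the code point back to its name:

 use charnames ();
 my $alef       = "\N{U+05D0}";
 my $code_point = ord($alef);                      # 0x05D0
 my $name       = charnames::viacode($code_point);
 printf "U+%04X %s\n", $code_point, $name;   # U+05D0 HEBREW LETTER ALEF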

There are other runtime options as well.  You can use C<pack()>:

 my $hebrew_alef_from_code_point = pack("U", 0x05d0);

Or you can use C<chr()>, though it is less convenient in the general
case:

 $hebrew_alef_from_code_point = chr(utf8::unicode_to_native(0x05d0));
 utf8::upgrade($hebrew_alef_from_code_point);

The C<utf8::unicode_to_native()> and C<utf8::upgrade()> aren't needed if
the argument is above 0xFF, so the above could have been written as

 $hebrew_alef_from_code_point = chr(0x05d0);

since 0x5d0 is above 255.

C<\x{}> and C<\o{}> can also be used to specify code points at compile
time in double-quotish strings, but, for backward compatibility with
older Perls, the same rules apply as with C<chr()> for code points less
than 256.

C<utf8::unicode_to_native()> is used so that the Perl code is portable
to EBCDIC platforms.  You can omit it if you're I<really> sure no one
will ever want to use your code on a non-ASCII platform.  Starting in
Perl v5.22, calls to it on ASCII platforms are optimized out, so there's
no performance penalty at all in adding it.  Or you can simply use the
other constructs that don't require it.

See L</"Further Resources"> for how to find all these names and numeric
codes.

=head3 Earlier releases caveats

On EBCDIC platforms, prior to v5.22, using C<\N{U+...}> doesn't work
properly.

Prior to v5.16, using C<\N{...}> with a character name (as opposed to a
C<U+...> code point) required a S<C<use charnames ':full'>>.

Prior to v5.14, there were some bugs in C<\N{...}> with a character name
(as opposed to a C<U+...> code point).

C<charnames::string_vianame()> was introduced in v5.14.  Prior to that,
C<charnames::vianame()> should work, but only if the argument is of the
form C<"U+...">.  Your best bet there for runtime Unicode by character
name is probably:

 use charnames ();
 my $hebrew_alef_from_name
                  = pack("U", charnames::vianame("HEBREW LETTER ALEF"));

=head2 Handling Unicode

Handling Unicode is for the most part transparent: just use the
strings as usual.  Functions like C<index()>, C<length()>, and
C<substr()> will work on the Unicode characters; regular expressions
will work on the Unicode characters (see L<perlunicode> and L<perlretut>).

Note that Perl treats each code point in a grapheme cluster as a
separate character, so for example

 print length("\N{LATIN CAPITAL LETTER A}\N{COMBINING ACUTE ACCENT}"),
       "\n";

will print 2, not 1.  The only exception is that regular expressions
have C<\X> for matching an extended grapheme cluster.  (Thus C<\X> in a
regular expression would match the entire sequence of both the example
characters.)
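
A short sketch of that difference, counting code points with
C<length()> and grapheme clusters with C<\X> (this assumes Perl v5.16
or later, so that C<\N{...}> names load automatically; the variable
names are illustrative):

 my $str = "\N{LATIN CAPITAL LETTER A}\N{COMBINING ACUTE ACCENT}";
 my $code_points = length($str);            # 2
 my $clusters    = () = $str =~ /\X/g;      # 1
 print "$code_points code points, $clusters grapheme cluster\n";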

Life is not quite so transparent, however, when working with legacy
encodings, I/O, and certain special cases:

=head2 Legacy Encodings

When you combine legacy data and Unicode, the legacy data needs
to be upgraded to Unicode.  Normally the legacy data is assumed to be
ISO 8859-1 (or EBCDIC, if applicable).

The C<Encode> module knows about many encodings and has interfaces
for doing conversions between those encodings:

    use Encode 'decode';
    $data = decode("iso-8859-3", $data); # convert from legacy
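
Going the other way uses C<encode()>.  The following sketch assumes the
legacy bytes are ISO 8859-1 and uses made-up variable names; it decodes
one legacy byte into a character and then encodes that character as
UTF-8:

    use Encode qw(decode encode);
    my $legacy = "\xE4";                          # one ISO 8859-1 byte
    my $chars  = decode("iso-8859-1", $legacy);   # one character, U+00E4
    my $utf8   = encode("UTF-8", $chars);         # two bytes, 0xC3 0xA4
    printf "%d character, %d bytes\n", length($chars), length($utf8);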

=head2 Unicode I/O

Normally, writing out Unicode data

    print FH $some_string_with_unicode, "\n";

produces raw bytes that Perl happens to use to internally encode the
Unicode string.  Perl's internal encoding depends on the system as
well as what characters happen to be in the string at the time. If
any of the characters are at code points C<0x100> or above, you will get
a warning.  To ensure that the output is explicitly rendered in the
encoding you desire--and to avoid the warning--open the stream with
the desired encoding. Some examples:

    open FH, ">:utf8", "file";

    open FH, ">:encoding(ucs2)",      "file";
    open FH, ">:encoding(UTF-8)",     "file";
    open FH, ">:encoding(shift_jis)", "file";

and on already open streams, use C<binmode()>:

    binmode(STDOUT, ":utf8");

    binmode(STDOUT, ":encoding(ucs2)");
    binmode(STDOUT, ":encoding(UTF-8)");
    binmode(STDOUT, ":encoding(shift_jis)");

The matching of encoding names is loose: case does not matter, and
many encodings have several aliases.  Note that the C<:utf8> layer
must always be specified exactly like that; it is I<not> subject to
the loose matching of encoding names. Also note that currently C<:utf8> is unsafe for
input, because it accepts the data without validating that it is indeed valid
UTF-8; you should instead use C<:encoding(UTF-8)> (with or without a
hyphen).

See L<PerlIO> for the C<:utf8> layer, L<PerlIO::encoding> and
L<Encode::PerlIO> for the C<:encoding()> layer, and
L<Encode::Supported> for many encodings supported by the C<Encode>
module.

Reading in a file that you know happens to be encoded in one of the
Unicode or legacy encodings does not magically turn the data into
Unicode in Perl's eyes.  To do that, specify the appropriate
layer when opening files:

    open(my $fh,'<:encoding(UTF-8)', 'anything');
    my $line_of_unicode = <$fh>;

    open(my $fh,'<:encoding(Big5)', 'anything');
    my $line_of_unicode = <$fh>;

The I/O layers can also be specified more flexibly with
the C<open> pragma.  See L<open>, or look at the following example.

    use open ':encoding(UTF-8)'; # input/output default encoding will be
                                 # UTF-8
    open X, ">file";
    print X chr(0x100), "\n";
    close X;
    open Y, "<file";
    printf "%#x\n", ord(<Y>); # this should print 0x100
    close Y;

With the C<open> pragma you can use the C<:locale> layer:

    BEGIN { $ENV{LC_ALL} = $ENV{LANG} = 'ru_RU.KOI8-R' }
    # the :locale will probe the locale environment variables like
    # LC_ALL
    use open OUT => ':locale'; # russki parusski
    open(O, ">koi8");
    print O chr(0x430); # Unicode CYRILLIC SMALL LETTER A = KOI8-R 0xc1
    close O;
    open(I, "<koi8");
    printf "%#x\n", ord(<I>); # this should print 0xc1
    close I;

These methods install a transparent filter on the I/O stream that
converts data from the specified encoding when it is read in from the
stream.  The result is always Unicode.

The L<open> pragma affects all the C<open()> calls after the pragma by
setting default layers.  If you want to affect only certain
streams, use explicit layers directly in the C<open()> call.

You can switch encodings on an already opened stream by using
C<binmode()>; see L<perlfunc/binmode>.

The C<:locale> layer does not currently work with
C<open()> and C<binmode()>, only with the C<open> pragma.  The
C<:utf8> and C<:encoding(...)> methods do work with all of C<open()>,
C<binmode()>, and the C<open> pragma.

Similarly, you may use these I/O layers on output streams to
automatically convert Unicode to the specified encoding when it is
written to the stream. For example, the following snippet copies the
contents of the file "text.jis" (encoded as ISO-2022-JP, aka JIS) to
the file "text.utf8", encoded as UTF-8:

    open(my $nihongo, '<:encoding(iso-2022-jp)', 'text.jis');
    open(my $unicode, '>:utf8',                  'text.utf8');
    while (<$nihongo>) { print $unicode $_ }

The naming of encodings, both by the C<open()> and by the C<open>
pragma allows for flexible names: C<koi8-r> and C<KOI8R> will both be
understood.

Common encodings recognized by ISO, MIME, IANA, and various other
standardisation organisations are recognised; for a more detailed
list see L<Encode::Supported>.

C<read()> reads characters and returns the number of characters.
C<seek()> and C<tell()> operate on byte counts, as do C<sysread()>
and C<sysseek()>.

Notice that because of the default behaviour of not doing any
conversion upon input if there is no default layer,
it is easy to mistakenly write code that keeps on expanding a file
by repeatedly encoding the data:

    # BAD CODE WARNING
    open F, "file";
    local $/; ## read in the whole file of 8-bit characters
    $t = <F>;
    close F;
    open F, ">:encoding(UTF-8)", "file";
    print F $t; ## convert to UTF-8 on output
    close F;

If you run this code twice, the contents of the F<file> will be twice
UTF-8 encoded.  A C<use open ':encoding(UTF-8)'> would have avoided the
bug, as would explicitly opening the F<file> for input as UTF-8.
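
One way to apply the second fix, sketched here as a hypothetical helper
(the name C<rewrite_utf8_file> is not part of Perl), is to use the same
explicit layer for both reading and writing:

    sub rewrite_utf8_file {
        my ($path) = @_;
        open(my $in, "<:encoding(UTF-8)", $path)
            or die "Cannot read $path: $!";
        my $text = do { local $/; <$in> };   # decoded characters
        close $in;
        open(my $out, ">:encoding(UTF-8)", $path)
            or die "Cannot write $path: $!";
        print $out $text;                    # encoded exactly once
        close $out or die "Cannot close $path: $!";
    }

Calling this any number of times leaves a valid UTF-8 file unchanged.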

B<NOTE>: the C<:utf8> and C<:encoding> features work only if your
Perl has been built with L<PerlIO>, which is the default
on most systems.

=head2 Displaying Unicode As Text

Sometimes you might want to display Perl scalars containing Unicode as
simple ASCII (or EBCDIC) text.  The following subroutine converts
its argument so that Unicode characters with code points greater than
255 are displayed as C<\x{...}>, control characters (like C<\n>) are
displayed as C<\x..>, and the rest of the characters as themselves:

 sub nice_string {
        join("",
        map { $_ > 255                    # if wide character...
              ? sprintf("\\x{%04X}", $_)  # \x{...}
              : chr($_) =~ /[[:cntrl:]]/  # else if control character...
                ? sprintf("\\x%02X", $_)  # \x..
                : quotemeta(chr($_))      # else quoted or as themselves
        } unpack("W*", $_[0]));           # unpack Unicode characters
   }

For example,

   nice_string("foo\x{100}bar\n")

returns the string

   'foo\x{0100}bar\x0A'

which is ready to be printed.

(C<\\x{}> is used here instead of C<\\N{}>, since it's most likely that
you want to see what the native values are.)

=head2 Special Cases

=over 4

=item *

Bit Complement Operator ~ And vec()

The bit complement operator C<~> may produce surprising results if
used on strings containing characters with ordinal values above
255. In such a case, the results are consistent with the internal
encoding of the characters, but not with much else. So don't do
that. Similarly for C<vec()>: you will be operating on the
internally-encoded bit patterns of the Unicode characters, not on
the code point values, which is very probably not what you want.

=item *

Peeking At Perl's Internal Encoding

Normal users of Perl should never care how Perl encodes any particular
Unicode string (because the normal ways to get at the contents of a
string with Unicode--via input and output--should always be via
explicitly-defined I/O layers). But if you must, there are two
ways of looking behind the scenes.

One way of peeking inside the internal encoding of Unicode characters
is to use C<unpack("C*", ...)> to get the bytes of whatever the string
encoding happens to be, or C<unpack("U0..", ...)> to get the bytes of the
UTF-8 encoding:

    # this prints  c4 80  for the UTF-8 bytes 0xc4 0x80
    print join(" ", unpack("U0(H2)*", pack("U", 0x100))), "\n";

Yet another way would be to use the Devel::Peek module:

    perl -MDevel::Peek -e 'Dump(chr(0x100))'

That shows the C<UTF8> flag in FLAGS and both the UTF-8 bytes
and Unicode characters in C<PV>.  See also later in this document
the discussion about the C<utf8::is_utf8()> function.

=back

=head2 Advanced Topics

=over 4

=item *

String Equivalence

The question of string equivalence turns somewhat complicated
in Unicode: what do you mean by "equal"?

(Is C<LATIN CAPITAL LETTER A WITH ACUTE> equal to
C<LATIN CAPITAL LETTER A>?)

The short answer is that by default Perl compares equivalence (C<eq>,
C<ne>) based only on code points of the characters.  In the above
case, the answer is no (because 0x00C1 != 0x0041).  But sometimes, any
CAPITAL LETTER A's should be considered equal, or even A's of any case.

The long answer is that you need to consider character normalization
and casing issues: see L<Unicode::Normalize>, Unicode Technical Report #15,
L<Unicode Normalization Forms|http://www.unicode.org/unicode/reports/tr15> and
sections on case mapping in the L<Unicode Standard|http://www.unicode.org>.

As of Perl 5.8.0, the "Full" case-folding of I<Case
Mappings/SpecialCasing> is implemented, but bugs remain in C<qr//i> with them,
mostly fixed by 5.14, and essentially entirely by 5.18.

=item *

String Collation

People like to see their strings nicely sorted--or as Unicode
parlance goes, collated.  But again, what do you mean by collate?

(Does C<LATIN CAPITAL LETTER A WITH ACUTE> come before or after
C<LATIN CAPITAL LETTER A WITH GRAVE>?)

The short answer is that by default, Perl compares strings (C<lt>,
C<le>, C<cmp>, C<ge>, C<gt>) based only on the code points of the
characters.  In the above case, the answer is "after", since
C<0x00C1> > C<0x00C0>.

The long answer is that "it depends", and a good answer cannot be
given without knowing (at the very least) the language context.
See L<Unicode::Collate>, and I<Unicode Collation Algorithm>
L<http://www.unicode.org/unicode/reports/tr10/>.

=back
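
Both points can be sketched with the core modules named above.  In this
example (variable names are illustrative) the precomposed and decomposed
spellings of A WITH ACUTE compare unequal by code point but equal after
normalization, and C<Unicode::Collate> with its default settings sorts
the accented letter next to plain "A" rather than after "B":

    use Unicode::Normalize qw(NFC);
    use Unicode::Collate;

    my $precomposed = "\N{U+00C1}";     # LATIN CAPITAL LETTER A WITH ACUTE
    my $decomposed  = "A\N{U+0301}";    # "A" + COMBINING ACUTE ACCENT
    my $raw_equal   = $precomposed eq $decomposed;            # false
    my $nfc_equal   = NFC($precomposed) eq NFC($decomposed);  # true

    my @words      = ("B", "\N{U+00C1}", "A");
    my @code_point = sort @words;                         # A, B, A-acute
    my @collated   = Unicode::Collate->new->sort(@words); # A, A-acute, B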

=head2 Miscellaneous

=over 4

=item *

Character Ranges and Classes

Character ranges in regular expression bracketed character classes (e.g.,
C</[a-z]/>) and in the C<tr///> (also known as C<y///>) operator are not
magically Unicode-aware.  What this means is that C<[A-Za-z]> will not
magically start to mean "all alphabetic letters" (not that it does mean that
even for 8-bit characters; for those, if you are using locales (L<perllocale>),
use C</[[:alpha:]]/>; and if not, use the 8-bit-aware property C<\p{alpha}>).

All the properties that begin with C<\p> (and its inverse C<\P>) are actually
character classes that are Unicode-aware.  There are dozens of them; see
L<perluniprops>.

Starting in v5.22, you can use Unicode code points as the end points of
regular expression pattern character ranges, and the range will include
all Unicode code points that lie between those end points, inclusive.

 qr/ [ \N{U+03} - \N{U+20} ] /xx

includes the code points
C<\N{U+03}>, C<\N{U+04}>, ..., C<\N{U+20}>.

This also works for ranges in C<tr///> starting in Perl v5.24.

=item *

String-To-Number Conversions

Unicode does define several other decimal--and numeric--characters
besides the familiar 0 to 9, such as the Arabic and Indic digits.
Perl does not support string-to-number conversion for digits other
than ASCII C<0> to C<9> (and ASCII C<a> to C<f> for hexadecimal).
To get safe conversions from any Unicode string, use
L<Unicode::UCD/num()>.

=back
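
A brief sketch of the difference, using ARABIC-INDIC DIGIT ONE, TWO
and THREE (variable names are illustrative).  Ordinary numeric
conversion yields 0, while C<Unicode::UCD::num()> returns the value:

    use Unicode::UCD qw(num);
    my $digits = "\N{U+0661}\N{U+0662}\N{U+0663}";
    my $plain  = do { no warnings 'numeric'; $digits + 0 };   # 0
    my $value  = num($digits);                                # 123
    print "$plain $value\n";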

=head2 Questions With Answers

=over 4

=item *

Will My Old Scripts Break?

Very probably not.  Unless you are generating Unicode characters
somehow, old behaviour should be preserved.  About the only behaviour
that has changed and which could start generating Unicode is the old
behaviour of C<chr()> where supplying an argument more than 255
produced a character modulo 256.  C<chr(300)>, for example, was equal
to C<chr(44)> or "," (in ASCII); now it is LATIN CAPITAL LETTER I WITH
BREVE.

=item *

How Do I Make My Scripts Work With Unicode?

Very little work should be needed since nothing changes until you
generate Unicode data.  The most important thing is getting input as
Unicode; for that, see the earlier I/O discussion.
To get full seamless Unicode support, add
C<use feature 'unicode_strings'> (or C<use 5.012> or higher) to your
script.

=item *

How Do I Know Whether My String Is In Unicode?

You shouldn't have to care.  But you may if your Perl is before 5.14.0
or you haven't specified C<use feature 'unicode_strings'> or C<use
5.012> (or higher) because otherwise the rules for the code points
in the range 128 to 255 are different depending on
whether the string they are contained within is in Unicode or not.
(See L<perlunicode/When Unicode Does Not Happen>.)

To determine if a string is in Unicode, use:

    print utf8::is_utf8($string) ? 1 : 0, "\n";

But note that this doesn't mean that any of the characters in the
string are necessarily UTF-8 encoded, or that any of the characters have
code points greater than 0xFF (255) or even 0x80 (128), or that the
string has any characters at all.  All that C<is_utf8()> does is
return the value of the internal "utf8ness" flag attached to the
C<$string>.  If the flag is off, the bytes in the scalar are interpreted
as a single byte encoding.  If the flag is on, the bytes in the scalar
are interpreted as the (variable-length, potentially multi-byte) UTF-8 encoded
code points of the characters.  Bytes added to a UTF-8 encoded string are
automatically upgraded to UTF-8.  If mixed non-UTF-8 and UTF-8 scalars
are merged (double-quoted interpolation, explicit concatenation, or
printf/sprintf parameter substitution), the result will be UTF-8 encoded
as if copies of the byte strings were upgraded to UTF-8: for example,

    $a = "ab\x80c";
    $b = "\x{100}";
    print "$a = $b\n";

the output string will be UTF-8-encoded C<ab\x80c = \x{100}\n>, but
C<$a> will stay byte-encoded.

Sometimes you might really need to know the byte length of a string
instead of the character length. For that use the C<bytes> pragma
and the C<length()> function:

    my $unicode = chr(0x100);
    print length($unicode), "\n"; # will print 1
    use bytes;
    print length($unicode), "\n"; # will print 2
                                  # (the 0xC4 0x80 of the UTF-8)
    no bytes;

=item *

How Do I Find Out What Encoding a File Has?

You might try L<Encode::Guess>, but it has a number of limitations.

=item *

How Do I Detect Data That's Not Valid In a Particular Encoding?

Use the C<Encode> package to try converting it.
For example,

    use Encode 'decode';

    if (eval { decode('UTF-8', $string, Encode::FB_CROAK); 1 }) {
        # $string is valid UTF-8
    } else {
        # $string is not valid UTF-8
    }

Or use C<unpack> to try decoding it:

    use warnings;
    @chars = unpack("C0U*", $string_of_bytes_that_I_think_is_utf8);

If invalid, a C<Malformed UTF-8 character> warning is produced. The "C0" means
"process the string character per character".  Without that, the
C<unpack("U*", ...)> would work in C<U0> mode (the default if the format
string starts with C<U>) and it would return the bytes making up the UTF-8
encoding of the target string, something that will always work.

=item *

How Do I Convert Binary Data Into a Particular Encoding, Or Vice Versa?

This probably isn't as useful as you might think.
Normally, you shouldn't need to.

In one sense, what you are asking doesn't make much sense: encodings
are for characters, and binary data are not "characters", so converting
"data" into some encoding isn't meaningful unless you know what
character set and encoding the binary data is in, in which case it's
not just binary data, now is it?

If you have a raw sequence of bytes that you know should be
interpreted via a particular encoding, you can use C<Encode>:

    use Encode 'from_to';
    from_to($data, "iso-8859-1", "UTF-8"); # from latin-1 to UTF-8

The call to C<from_to()> changes the bytes in C<$data>, but nothing
material about the nature of the string has changed as far as Perl is
concerned.  Both before and after the call, the string C<$data>
contains just a bunch of 8-bit bytes. As far as Perl is concerned,
the encoding of the string remains as "system-native 8-bit bytes".

You might relate this to a fictional 'Translate' module:

   use Translate;
   my $phrase = "Yes";
   Translate::from_to($phrase, 'english', 'deutsch');
   ## phrase now contains "Ja"

The contents of the string change, but not the nature of the string.
Perl doesn't know any more after the call than before that the
contents of the string indicate the affirmative.

Back to converting data.  If you have (or want) data in your system's
native 8-bit encoding (e.g. Latin-1, EBCDIC, etc.), you can use
pack/unpack to convert to/from Unicode.

    $native_string  = pack("W*", unpack("U*", $Unicode_string));
    $Unicode_string = pack("U*", unpack("W*", $native_string));

If you have a sequence of bytes you B<know> is valid UTF-8,
but Perl doesn't know it yet, you can make Perl a believer, too:

    $Unicode = $bytes;
    utf8::decode($Unicode);

or:

    $Unicode = pack("U0a*", $bytes);

You can find the bytes that make up a UTF-8 sequence with

    @bytes = unpack("C*", $Unicode_string)

and you can create well-formed Unicode with

    $Unicode_string = pack("U*", 0xff, ...)

=item *

How Do I Display Unicode?  How Do I Input Unicode?

See L<http://www.alanwood.net/unicode/> and
L<http://www.cl.cam.ac.uk/~mgk25/unicode.html>

=item *

How Does Unicode Work With Traditional Locales?

If your locale is a UTF-8 locale, starting in Perl v5.26, Perl works
well for all categories; before this, starting with Perl v5.20, it works
for all categories but C<LC_COLLATE>, which deals with
sorting and the C<cmp> operator.  But note that the standard
C<L<Unicode::Collate>> and C<L<Unicode::Collate::Locale>> modules offer
much more powerful solutions to collation issues, and work on earlier
releases.

For other locales, starting in Perl 5.16, you can specify

    use locale ':not_characters';

to get Perl to work well with them.  The catch is that you
have to translate from the locale character set to/from Unicode
yourself.  See L</Unicode IE<sol>O> above for how to

    use open ':locale';

to accomplish this, but full details are in L<perllocale/Unicode and
UTF-8>, including gotchas that happen if you don't specify
C<:not_characters>.

=back
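
As a sketch of two of the answers above (variable names are
illustrative), C<utf8::decode()> both checks a byte string for valid
UTF-8 and converts it in place, returning false if the bytes are
malformed:

    my $bytes = "caf\xC3\xA9";      # 5 bytes: UTF-8 for "caf" + e-acute
    my $text  = $bytes;
    my $ok    = utf8::decode($text);
    printf "%d bytes, %d characters, flag %d\n",
        length($bytes), length($text), utf8::is_utf8($text) ? 1 : 0;

    my $bad    = "\xC3";               # truncated UTF-8 sequence
    my $bad_ok = utf8::decode($bad);   # false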

=head2 Hexadecimal Notation

The Unicode standard prefers using hexadecimal notation because
that more clearly shows the division of Unicode into blocks of 256 characters.
Hexadecimal is also simply shorter than decimal.  You can use decimal
notation, too, but learning to use hexadecimal just makes life easier
with the Unicode standard.  The C<U+HHHH> notation uses hexadecimal,
for example.

The C<0x> prefix means a hexadecimal number; the digits are 0-9 I<and>
a-f (or A-F, case doesn't matter).  Each hexadecimal digit represents
four bits, or half a byte.  C<print 0x..., "\n"> will show a
hexadecimal number in decimal, and C<printf "%x\n", $decimal> will
show a decimal number in hexadecimal.  If you have just the
"hex digits" of a hexadecimal number, you can use the C<hex()> function.

    print 0x0009, "\n";    # 9
    print 0x000a, "\n";    # 10
    print 0x000f, "\n";    # 15
    print 0x0010, "\n";    # 16
    print 0x0011, "\n";    # 17
    print 0x0100, "\n";    # 256

    print 0x0041, "\n";    # 65

    printf "%x\n",  65;    # 41
    printf "%#x\n", 65;    # 0x41

    print hex("41"), "\n"; # 65
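
Putting these together, a code point given as hex digits can be turned
into a number and printed back in the C<U+HHHH> style; this is only a
sketch, with illustrative variable names:

    my $code_point = hex("263A");                      # 9786
    my $label      = sprintf("U+%04X", $code_point);   # "U+263A"
    print "$label is $code_point in decimal\n";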

=head2 Further Resources

=over 4

=item *

Unicode Consortium

L<http://www.unicode.org/>

=item *

Unicode FAQ

L<http://www.unicode.org/unicode/faq/>

=item *

Unicode Glossary

L<http://www.unicode.org/glossary/>

=item *

Unicode Recommended Reading List

The Unicode Consortium has a list of articles and books, some of which
give a much more in depth treatment of Unicode:
L<http://unicode.org/resources/readinglist.html>

=item *

Unicode Useful Resources

L<http://www.unicode.org/unicode/onlinedat/resources.html>

=item *

Unicode and Multilingual Support in HTML, Fonts, Web Browsers and Other Applications

L<http://www.alanwood.net/unicode/>

=item *

UTF-8 and Unicode FAQ for Unix/Linux

L<http://www.cl.cam.ac.uk/~mgk25/unicode.html>

=item *

Legacy Character Sets

L<http://www.czyborra.com/>
L<http://www.eki.ee/letter/>

=item *

You can explore various information from the Unicode data files using
the C<Unicode::UCD> module.

=back

=head1 UNICODE IN OLDER PERLS

If you cannot upgrade your Perl to 5.8.0 or later, you can still
do some Unicode processing by using the modules C<Unicode::String>,
C<Unicode::Map8>, and C<Unicode::Map>, available from CPAN.
If you have the GNU recode installed, you can also use the
Perl front-end C<Convert::Recode> for character conversions.

The following are fast conversions from ISO 8859-1 (Latin-1) bytes
to UTF-8 bytes and back; the code works even with older Perl 5 versions.

    # ISO 8859-1 to UTF-8
    s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;

    # UTF-8 to ISO 8859-1
    s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;
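
As a worked example of the two substitutions (the variable names are
illustrative), the ISO 8859-1 byte 0xE9 becomes the two UTF-8 bytes
0xC3 0xA9 and then converts back to the original byte:

    my $latin1 = "\xE9";          # e with acute accent in ISO 8859-1
    my $utf8   = $latin1;
    $utf8 =~ s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;
    my $back   = $utf8;
    $back =~ s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;
    printf "%vX -> %vX -> %vX\n", $latin1, $utf8, $back;  # E9 -> C3.A9 -> E9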

=head1 SEE ALSO

L<perlunitut>, L<perlunicode>, L<Encode>, L<open>, L<utf8>, L<bytes>,
L<perlretut>, L<perlrun>, L<Unicode::Collate>, L<Unicode::Normalize>,
L<Unicode::UCD>

=head1 ACKNOWLEDGMENTS

Thanks to the kind readers of the perl5-porters@perl.org,
perl-unicode@perl.org, linux-utf8@nl.linux.org, and unicore@unicode.org
mailing lists for their valuable feedback.

=head1 AUTHOR, COPYRIGHT, AND LICENSE

Copyright 2001-2011 Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>.
Now maintained by Perl 5 Porters.

This document may be distributed under the same terms as Perl itself.

=head1 NAME

perldbmfilter - Perl DBM Filters

=head1 SYNOPSIS

    $db = tie %hash, 'DBM', ...

    $old_filter = $db->filter_store_key  ( sub { ... } );
    $old_filter = $db->filter_store_value( sub { ... } );
    $old_filter = $db->filter_fetch_key  ( sub { ... } );
    $old_filter = $db->filter_fetch_value( sub { ... } );

=head1 DESCRIPTION

The four C<filter_*> methods shown above are available in all the DBM
modules that ship with Perl, namely DB_File, GDBM_File, NDBM_File,
ODBM_File and SDBM_File.

Each of the methods works identically, and is used to install (or
uninstall) a single DBM Filter. The only difference between them is the
place that the filter is installed.

To summarise:

=over 5

=item B<filter_store_key>

If a filter has been installed with this method, it will be invoked
every time you write a key to a DBM database.

=item B<filter_store_value>

If a filter has been installed with this method, it will be invoked
every time you write a value to a DBM database.

=item B<filter_fetch_key>

If a filter has been installed with this method, it will be invoked
every time you read a key from a DBM database.

=item B<filter_fetch_value>

If a filter has been installed with this method, it will be invoked
every time you read a value from a DBM database.

=back

You can use any combination of the methods from none to all four.

All filter methods return the existing filter, if present, or C<undef>
if not.

To delete a filter, pass C<undef> to the method that installed it.

=head2 The Filter

When each filter is called by Perl, a local copy of C<$_> will contain
the key or value to be filtered. Filtering is achieved by modifying
the contents of C<$_>. The return code from the filter is ignored.
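
The following sketch shows these rules in a complete program.  It
assumes C<SDBM_File> is available and uses a temporary directory; the
file name F<demo> and the variable names are made up for the example.
A store-value filter upper-cases C<$_> in place, and passing C<undef>
later removes it and returns the filter that was installed:

    use strict;
    use warnings;
    use SDBM_File;
    use Fcntl;
    use File::Temp qw(tempdir);

    my $dir = tempdir(CLEANUP => 1);
    my %hash;
    my $db = tie(%hash, 'SDBM_File', "$dir/demo", O_RDWR|O_CREAT, 0640)
      or die "Cannot open $dir/demo: $!\n";

    $db->filter_store_value( sub { $_ = uc $_ } );  # modify $_ in place
    $hash{greeting} = "hello";                      # stored as "HELLO"

    my $old = $db->filter_store_value(undef);       # remove the filter
    $hash{farewell} = "bye";                        # stored unchanged

    my ($greeting, $farewell) = @hash{qw(greeting farewell)};
    print "$greeting $farewell\n";                  # HELLO bye
    undef $db;
    untie %hash;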

=head2 An Example: the NULL termination problem.

DBM Filters are useful for a class of problems where you I<always>
want to make the same transformation to all keys, all values or both.

For example, consider the following scenario. You have a DBM database
that you need to share with a third-party C application. The C application
assumes that I<all> keys and values are NULL terminated. Unfortunately
when Perl writes to DBM databases it doesn't use NULL termination, so
your Perl application will have to manage NULL termination itself. When
you write to the database you will have to use something like this:

    $hash{"$key\0"} = "$value\0";

Similarly the NULL needs to be taken into account when you are considering
the length of existing keys/values.

It would be much better if you could ignore the NULL termination issue
in the main application code and have a mechanism that automatically
added the terminating NULL to all keys and values whenever you write to
the database and have them removed when you read from the database. As I'm
sure you have already guessed, this is a problem that DBM Filters can
fix very easily.

    use strict;
    use warnings;
    use SDBM_File;
    use Fcntl;

    my %hash;
    my $filename = "filt";
    unlink $filename;

    my $db = tie(%hash, 'SDBM_File', $filename, O_RDWR|O_CREAT, 0640)
      or die "Cannot open $filename: $!\n";

    # Install DBM Filters
    $db->filter_fetch_key  ( sub { s/\0$//    } );
    $db->filter_store_key  ( sub { $_ .= "\0" } );
    $db->filter_fetch_value( 
        sub { no warnings 'uninitialized'; s/\0$// } );
    $db->filter_store_value( sub { $_ .= "\0" } );

    $hash{"abc"} = "def";
    my $a = $hash{"ABC"};
    # ...
    undef $db;
    untie %hash;

The code above uses SDBM_File, but it will work with any of the DBM
modules.

Hopefully the contents of each of the filters should be
self-explanatory. Both "fetch" filters remove the terminating NULL,
and both "store" filters add a terminating NULL.


=head2 Another Example: Key is a C int.

Here is another real-life example. By default, whenever Perl writes to
a DBM database it always writes the key and value as strings. So when
you use this:

    $hash{12345} = "something";

the key 12345 will get stored in the DBM database as the 5 byte string
"12345". If you actually want the key to be stored in the DBM database
as a C int, you will have to use C<pack> when writing, and C<unpack>
when reading.

Here is a DBM Filter that does it:

    use strict;
    use warnings;
    use DB_File;
    my %hash;
    my $filename = "filt";
    unlink $filename;


    my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666,
        $DB_HASH or die "Cannot open $filename: $!\n";

    $db->filter_fetch_key  ( sub { $_ = unpack("i", $_) } );
    $db->filter_store_key  ( sub { $_ = pack ("i", $_) } );
    $hash{123} = "def";
    # ...
    undef $db;
    untie %hash;

The code above uses DB_File, but again it will work with any of the
DBM modules.

This time only two filters have been used; we only need to manipulate
the contents of the key, so it wasn't necessary to install any value
filters.
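
To see what the key filters above do to a single key, this small sketch
(variable names are illustrative) packs an integer as a native C int
and unpacks it again; the packed length is C<sizeof(int)> on the
platform, commonly 4:

    my $packed = pack("i", 12345);
    my $bytes  = length($packed);        # sizeof(int), commonly 4
    my $number = unpack("i", $packed);   # 12345
    print "$bytes bytes hold $number\n";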

=head1 SEE ALSO

L<DB_File>, L<GDBM_File>, L<NDBM_File>, L<ODBM_File> and L<SDBM_File>.

=head1 AUTHOR

Paul Marquess

perl5121delta.pod000064400000023635150344123450007547 0ustar00=encoding utf8

=head1 NAME

perl5121delta - what is new for perl v5.12.1

=head1 DESCRIPTION

This document describes differences between the 5.12.0 release and
the 5.12.1 release.

If you are upgrading from an earlier release such as 5.10.1, first read
L<perl5120delta>, which describes differences between 5.10.1 and
5.12.0.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.12.0. If any
incompatibilities with 5.12.0 exist, they are bugs. Please report them.

=head1 Core Enhancements

Other than the bug fixes listed below, there should be no user-visible
changes to the core language in this release.

=head1 Modules and Pragmata

=head2 Pragmata Changes

=over 

=item *

We fixed exporting of C<is_strict> and C<is_lax> from L<version>.

These were being exported with a wrapper that treated them as method
calls, which caused them to fail.  They are just functions, are
documented as such, and should never be subclassed, so this patch
just exports them directly as functions without the wrapper.

=back

=head2 Updated Modules

=over 

=item *

We upgraded L<CGI.pm> to version 3.49 to incorporate fixes for regressions
introduced in the release we shipped with Perl 5.12.0.

=item *

We upgraded L<Pod::Simple> to version 3.14 to get an improvement to
C<< CE<lt>E<lt> E<gt>E<gt> >> parsing.

=item *

We made a small fix to the L<CPANPLUS> test suite to fix an occasional spurious test failure.

=item *

We upgraded L<Safe> to version 2.27 to wrap coderefs returned by C<reval()> and C<rdo()>.

=back

=head1 Changes to Existing Documentation

=over

=item *

We added the new maintenance release policy to L<perlpolicy.pod>

=item *

We've clarified the multiple-angle-bracket construct in the spec for POD
in L<perlpodspec>.

=item *

We added a missing explanation for a warning about C<:=> to L<perldiag.pod>.

=item *

We removed a false claim in L<perlunitut> that all text strings are Unicode strings in Perl.

=item *

We updated the GitHub mirror link in L<perlrepository> to point to mirrors/perl rather than github/perl.

=item *

We fixed a minor error in L<perl5114delta.pod>.

=item * 

We replaced a mention of the now-obsolete L<Switch.pm> with C<given>/C<when>.

=item *

We improved documentation about F<$sitelibexp/sitecustomize.pl> in L<perlrun>.

=item * 

We corrected L<perlmodlib.pod> which had unintentionally omitted a number of modules.

=item * 

We updated the documentation for 'require' in L<perlfunc.pod> relating to putting Perl code in @INC.

=item *

We reinstated some erroneously-removed documentation about quotemeta in L<perlfunc>.

=item *

We fixed an F<a2p> example in L<perlutil.pod>.

=item  *

We filled in a blank in L<perlport.pod> with the release date of Perl 5.12.

=item  *

We fixed broken links in a number of perldelta files.

=item * 

We corrected the documentation for L<Carp.pm>, which incorrectly stated
that the $Carp::Verbose variable makes cluck generate stack backtraces.

=item *

We fixed a number of typos in L<Pod::Functions>.

=item *

We improved documentation of case-changing functions in L<perlfunc.pod>.

=item *

We corrected L<perlgpl.pod> to contain the correct version of the GNU
General Public License.



=back

=head1 Testing

=head2 Testing Improvements

=over

=item *

F<t/op/sselect.t> is now less prone to clock jitter during timing checks
on Windows.

sleep() time on Win32 may be rounded down to a multiple of
the clock tick interval.

=item *

F<lib/blib.t> and F<lib/locale.t>: Fixes for test failures on Darwin/PPC

=item *

F<perl5db.t>: Fix for test failures when C<Term::ReadLine::Gnu> is installed.

=back

=head1 Installation and Configuration Improvements

=head2 Configuration improvements

=over 

=item * 

We updated F<INSTALL> with notes about how to deal with broken F<dbm.h>
on OpenSUSE (and possibly other platforms)

=back

=head1 Bug Fixes

=over 4

=item *

A bug in how we process filetest operations could cause a segfault.
Filetests don't always expect an op on the stack, so we now use
TOPs only if we're sure that we're not stat'ing the _ filehandle.
This is indicated by OPf_KIDS (as checked in ck_ftst).

See also: L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=74542>

=item *

When deparsing a nextstate op that has both a change of package (relative
to the previous nextstate) and a label, the package declaration is now
emitted first, because it is syntactically impermissible for a label to
prefix a package declaration.

=item * 

XSUB.h now correctly redefines fgets under PERL_IMPLICIT_SYS

See also: L<http://rt.cpan.org/Public/Bug/Display.html?id=55049>

=item * 

utf8::is_utf8 now respects GMAGIC (e.g. $1)

=item * 

XS code using C<fputc()> or C<fputs()> on Windows could cause an error
due to their arguments being swapped.

See also: L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=72704>

=item *

We fixed a small bug in lex_stuff_pvn() that caused spurious syntax errors
in an obscure situation.  It happened when stuffing was performed on the
last line of a file and the line ended with a statement that lacked a
terminating semicolon.  

See also: L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=74006>

=item *

We fixed a bug that could cause \N{} constructs followed by a single . to
be parsed incorrectly.

See also: L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=74978>

=item *


We fixed a bug that caused when(scalar) without an argument not to be
treated as a syntax error.

See also: L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=74114>

=item *

We fixed a regression in the handling of labels immediately before string
evals that was introduced in Perl 5.12.0.

See also: L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=74290>

=item *

We fixed a regression in case-insensitive matching of folded characters
in regular expressions introduced in Perl 5.10.1.

See also: L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=72998>

=back

=head1 Platform Specific Notes

=head2 HP-UX

=over 

=item *

Perl now allows -Duse64bitint without promoting to use64bitall on HP-UX

=back

=head2 AIX

=over

=item * 

Perl now builds on AIX 4.2

The changes required work around AIX 4.2's lack of support for IPv6,
and limited support for POSIX C<sigaction()>.

=back

=head2 FreeBSD 7

=over

=item * 

FreeBSD 7 no longer contains F</usr/bin/objformat>. At build time,
Perl now skips the F<objformat> check for versions 7 and higher and
assumes ELF.

=back

=head2 VMS

=over

=item *

It's now possible to build extensions on older (pre 7.3-2) VMS systems.

DCL symbol length was limited to 1K up until about seven years or
so ago, but there was no particularly deep reason to prevent those
older systems from configuring and building Perl.

=item *

We fixed the previously-broken C<-Uuseperlio> build on VMS.

We were checking a variable that doesn't exist in the non-default
case of disabling perlio.  Now we only look at it when it exists.

=item *

We fixed the -Uuseperlio command-line option in configure.com.

Formerly it only worked if you went through all the questions
interactively and explicitly answered no.

=back

=head1 Known Problems

=over

=item *

C<List::Util::first> misbehaves in the presence of a lexical C<$_>
(typically introduced by C<my $_> or implicitly by C<given>). The variable
which gets set for each iteration is the package variable C<$_>, not the
lexical C<$_>.

A similar issue may occur in other modules that provide functions which
take a block as their first argument, like

    foo { ... $_ ...} list

See also: L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=67694>

=item *

C<Module::Load::Conditional> and C<version> have an unfortunate
interaction which can cause C<CPANPLUS> to crash when it encounters
an unparseable version string.  Upgrading to C<CPANPLUS> 0.9004 or
C<Module::Load::Conditional> 0.38 from CPAN will resolve this issue.

=back


=head1 Acknowledgements

Perl 5.12.1 represents approximately four weeks of development since
Perl 5.12.0 and contains approximately 4,000 lines of changes
across 142 files from 28 authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers.  The following people are known to
have contributed the improvements that became Perl 5.12.1:

Ævar Arnfjörð Bjarmason, Chris Williams, chromatic, Craig A. Berry,
David Golden, Father Chrysostomos, Florian Ragwitz, Frank Wiegand,
Gene Sullivan, Goro Fuji, H.Merijn Brand, James E Keenan, Jan Dubois,
Jesse Vincent, Josh ben Jore, Karl Williamson, Leon Brocard, Michael
Schwern, Nga Tang Chan, Nicholas Clark, Niko Tyni, Philippe Bruhat,
Rafael Garcia-Suarez, Ricardo Signes, Steffen Mueller, Todd Rinaldo,
Vincent Pit and Zefram.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes
all the core committers, who will be able
to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for
security issues in the Perl core, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

perlfreebsd.pod000064400000003112150344123460007544 0ustar00If you read this file _as_is_, just ignore the funny characters you
see.  It is written in the POD format (see pod/perlpod.pod) which is
specifically designed to be readable as is.

=head1 NAME

perlfreebsd - Perl version 5 on FreeBSD systems

=head1 DESCRIPTION

This document describes various features of FreeBSD that will affect how Perl
version 5 (hereafter just Perl) is compiled and/or runs.

=head2 FreeBSD core dumps from readdir_r with ithreads

When perl is configured to use ithreads, it will use re-entrant library calls
in preference to non-re-entrant versions.  There is a bug in FreeBSD's
C<readdir_r> function in versions 4.5 and earlier that can cause a SEGV when
reading large directories. A patch for FreeBSD libc is available
(see L<http://www.freebsd.org/cgi/query-pr.cgi?pr=misc/30631> )
which has been integrated into FreeBSD 4.6.

=head2 C<$^X> doesn't always contain a full path in FreeBSD

perl sets C<$^X> where possible to a full path by asking the operating
system. On FreeBSD the full path of the perl interpreter is found by using
C<sysctl> with C<KERN_PROC_PATHNAME> if that is supported, else by reading
the symlink F</proc/curproc/file>. FreeBSD 7 and earlier have a bug where
either approach sometimes returns an incorrect value
(see L<http://www.freebsd.org/cgi/query-pr.cgi?pr=35703> ).
In these cases perl will fall back to the old behaviour of using C's
C<argv[0]> value for C<$^X>.
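
A script that depends on C<$^X> can check whether it received a full
path.  The following is only a sketch: it uses the core C<File::Spec>
module, and the wording of the messages is illustrative.  A relative
value suggests, but does not prove, that perl fell back to C<argv[0]>.

    use strict;
    use warnings;
    use File::Spec;

    if ( File::Spec->file_name_is_absolute($^X) ) {
        print "perl interpreter: $^X\n";
    }
    else {
        # Probably the argv[0] fallback described above.
        print "\$^X is not a full path: $^X\n";
    }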

=head1 AUTHOR

Nicholas Clark <nick@ccl4.org>, collating wisdom supplied by Slaven Rezic
and Tim Bunce.

Please report any errors, updates, or suggestions to L<mailto:perlbug@perl.org>.

perlutil.pod000064400000016730150344123460007121 0ustar00=head1 NAME

perlutil - utilities packaged with the Perl distribution

=head1 DESCRIPTION

Along with the Perl interpreter itself, the Perl distribution installs a
range of utilities on your system. There are also several utilities
which are used by the Perl distribution itself as part of the install
process. This document exists to list all of these utilities, explain
what they are for and provide pointers to each module's documentation,
if appropriate.

=head1 LIST OF UTILITIES

=head2 Documentation

=over 3

=item L<perldoc|perldoc>

The main interface to Perl's documentation is C<perldoc>, although
if you're reading this, it's more than likely that you've already found
it. F<perldoc> will extract and format the documentation from any file
in the current directory, any Perl module installed on the system, or
any of the standard documentation pages, such as this one. Use 
C<perldoc E<lt>nameE<gt>> to get information on any of the utilities
described in this document.

=item L<pod2man|pod2man> and L<pod2text|pod2text>

If it's run from a terminal, F<perldoc> will usually call F<pod2man> to
translate POD (Plain Old Documentation - see L<perlpod> for an
explanation) into a manpage, and then run F<man> to display it; if
F<man> isn't available, F<pod2text> will be used instead and the output
piped through your favourite pager.

=item L<pod2html|pod2html>

As well as these two, there is another converter: F<pod2html> will
produce HTML pages from POD.

=item L<pod2usage|pod2usage>

If you just want to know how to use the utilities described here,
F<pod2usage> will just extract the "USAGE" section; some of
the utilities will automatically call F<pod2usage> on themselves when
you call them with C<-help>.

=item L<podselect|podselect>

F<pod2usage> is a special case of F<podselect>, a utility to extract
named sections from documents written in POD. For instance, while
utilities have "USAGE" sections, Perl modules usually have "SYNOPSIS"
sections: C<podselect -s "SYNOPSIS" ...> will extract this section for
a given file.

=item L<podchecker|podchecker>

If you're writing your own documentation in POD, the F<podchecker>
utility will look for errors in your markup.

=item L<splain|splain>

F<splain> is an interface to L<perldiag> - paste in your error message
to it, and it'll explain it for you.

=item C<roffitall>

The C<roffitall> utility is not installed on your system but lives in
the F<pod/> directory of your Perl source kit; it converts all the
documentation from the distribution to F<*roff> format, and produces a
typeset PostScript or text file of the whole lot.

=back

=head2 Converters

To help you move legacy programs to more modern Perl, the
L<pl2pm|pl2pm> utility converts old-style Perl 4 libraries
to new-style Perl 5 modules.

=head2 Administration

=over 3

=item L<libnetcfg|libnetcfg>

To display and change the libnet configuration run the libnetcfg command.

=item L<perlivp>

The F<perlivp> program is set up at Perl source code build time to test
the Perl version it was built under.  It can be used after running C<make
install> (or your platform's equivalent procedure) to verify that perl
and its libraries have been installed correctly.

=back

=head2 Development

There is a set of utilities which help you in developing Perl programs,
and in particular, extending Perl with C.

=over 3

=item L<perlbug|perlbug>

F<perlbug> is the recommended way to report bugs in the perl interpreter
itself or any of the standard library modules back to the developers;
please read through the documentation for F<perlbug> thoroughly before
using it to submit a bug report.

=item L<perlthanks|perlbug>

This program provides an easy way to send a thank-you message back to the
authors and maintainers of perl. It's just F<perlbug> installed under
another name.

=item L<h2ph|h2ph>

Back before Perl had the XS system for connecting with C libraries,
programmers used to get library constants by reading through the C
header files. You may still see C<require 'syscall.ph'> or similar
around - the F<.ph> file should be created by running F<h2ph> on the
corresponding F<.h> file. See the F<h2ph> documentation for more on how
to convert a whole bunch of header files at once.

=item L<h2xs|h2xs>

F<h2xs> converts C header files into XS modules, and will try and write
as much glue between C libraries and Perl modules as it can. It's also
very useful for creating skeletons of pure Perl modules.

=item L<enc2xs>

F<enc2xs> builds a Perl extension for use by Encode from either
Unicode Character Mapping files (.ucm) or Tcl Encoding Files (.enc).
Besides being used internally during the build process of the Encode
module, you can use F<enc2xs> to add your own encoding to perl.
No knowledge of XS is necessary.

=item L<xsubpp>

F<xsubpp> is a compiler to convert Perl XS code into C code.
It is typically run by the makefiles created by L<ExtUtils::MakeMaker>.

F<xsubpp> will compile XS code into C code by embedding the constructs
necessary to let C functions manipulate Perl values and creates the glue
necessary to let Perl access those functions.

=item L<prove>

F<prove> is a command-line interface to the test-running functionality
of F<Test::Harness>.  It's an alternative to C<make test>.

=item L<corelist>

A command-line front-end to C<Module::CoreList>, to query what modules
were shipped with given versions of perl.

=back
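
The data behind F<corelist> can also be queried from Perl code through
C<Module::CoreList>.  As a small sketch, the following asks when a module
first shipped with perl; C<Data::Dumper> is just an example module name.

    use strict;
    use warnings;
    use Module::CoreList;

    # first_release() returns the perl version that first bundled the module
    my $module  = 'Data::Dumper';
    my $version = Module::CoreList->first_release($module);
    print "$module first shipped with perl $version\n";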

=head2 General tools

A few general-purpose tools are shipped with perl, mostly because they
came along with modules included in the perl distribution.

=over 3

=item L<piconv>

B<piconv> is a Perl version of B<iconv>, a character encoding converter
widely available for various Unixen today.  This script was primarily a
technology demonstrator for Perl v5.8.0, but you can use piconv in the
place of iconv for virtually any case.

=item L<ptar>

F<ptar> is a tar-like program, written in pure Perl.

=item L<ptardiff>

F<ptardiff> is a small utility that produces a diff between an extracted
archive and an unextracted one. (Note that this utility requires the
C<Text::Diff> module to function properly; this module isn't distributed
with perl, but is available from the CPAN.)

=item L<ptargrep>

F<ptargrep> is a utility to apply pattern matching to the contents of files 
in a tar archive.

=item L<shasum>

This utility, which comes with the C<Digest::SHA> module, is used to print
or verify SHA checksums.

=item L<zipdetails>

F<zipdetails> displays information about the internal record structure of a zip file.
It is not concerned with displaying any details of the compressed data stored in the zip file.

=back

=head2 Installation

These utilities help manage extra Perl modules that don't come with the perl
distribution.

=over 3

=item L<cpan>

F<cpan> is a command-line interface to CPAN.pm.  It allows you to install
modules or distributions from CPAN, or just get information about them, and
a lot more.  It is similar to the command line mode of the L<CPAN> module,

    perl -MCPAN -e shell

=item L<instmodsh>

A little interface to ExtUtils::Installed to examine installed modules,
validate your packlists and even create a tarball from an installed module.

=back

=head1 SEE ALSO

L<perldoc|perldoc>, L<pod2man|pod2man>, L<perlpod>,
L<pod2html|pod2html>, L<pod2usage|pod2usage>, L<podselect|podselect>,
L<podchecker|podchecker>, L<splain|splain>, L<perldiag>,
C<roffitall|roffitall>, L<File::Find|File::Find>, L<pl2pm|pl2pm>,
L<perlbug|perlbug>, L<h2ph|h2ph>, L<h2xs|h2xs>, L<enc2xs>,
L<xsubpp>, L<cpan>, L<instmodsh>, L<piconv>, L<prove>, L<corelist>, L<ptar>,
L<ptardiff>, L<shasum>, L<zipdetails>

=cut
perltie.pod000064400000113317150344123460006724 0ustar00=head1 NAME
X<tie>

perltie - how to hide an object class in a simple variable

=head1 SYNOPSIS

 tie VARIABLE, CLASSNAME, LIST

 $object = tied VARIABLE

 untie VARIABLE

=head1 DESCRIPTION

Prior to release 5.0 of Perl, a programmer could use dbmopen()
to connect an on-disk database in the standard Unix dbm(3x)
format magically to a %HASH in their program.  However, their Perl was either
built with one particular dbm library or another, but not both, and
you couldn't extend this mechanism to other packages or types of variables.

Now you can.

The tie() function binds a variable to a class (package) that will provide
the implementation for access methods for that variable.  Once this magic
has been performed, accessing a tied variable automatically triggers
method calls in the proper class.  The complexity of the class is
hidden behind magic method calls.  The method names are in ALL CAPS,
which is a convention that Perl uses to indicate that they're called
implicitly rather than explicitly--just like the BEGIN() and END()
functions.

In the tie() call, C<VARIABLE> is the name of the variable to be
enchanted.  C<CLASSNAME> is the name of a class implementing objects of
the correct type.  Any additional arguments in the C<LIST> are passed to
the appropriate constructor method for that class--meaning TIESCALAR(),
TIEARRAY(), TIEHASH(), or TIEHANDLE().  (Typically these are arguments
such as might be passed to the dbminit() function of C.) The object
returned by the "new" method is also returned by the tie() function,
which would be useful if you wanted to access other methods in
C<CLASSNAME>. (You don't actually have to return a reference to a right
"type" (e.g., HASH or C<CLASSNAME>) so long as it's a properly blessed
object.)  You can also retrieve a reference to the underlying object
using the tied() function.

Unlike dbmopen(), the tie() function will not C<use> or C<require> a module
for you--you need to do that explicitly yourself.
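
As a minimal sketch of these points, the following defines a throwaway
class in the same file, so no C<use> or C<require> is needed.  The name
C<Counter> is hypothetical and not part of Perl.  It shows that tie()
returns the same object that tied() later retrieves, and that every read
of the variable goes through FETCH:

    use strict;
    use warnings;

    package Counter;    # hypothetical class, for illustration only

    sub TIESCALAR {
        my ( $class, $start ) = @_;
        return bless { n => $start }, $class;
    }
    sub FETCH { my $self = shift; return $self->{n}++ }
    sub STORE { my ( $self, $value ) = @_; $self->{n} = $value }

    package main;

    my $object = tie my $count, 'Counter', 10;
    my $first  = $count;    # FETCH returns 10, then increments
    my $second = $count;    # FETCH returns 11
    print "$first $second\n";    # prints "10 11"
    print "tie() and tied() agree\n" if $object == tied($count);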

=head2 Tying Scalars
X<scalar, tying>

A class implementing a tied scalar should define the following methods:
TIESCALAR, FETCH, STORE, and possibly UNTIE and/or DESTROY.

Let's look at each in turn, using as an example a tie class for
scalars that allows the user to do something like:

    tie $his_speed, 'Nice', getppid();
    tie $my_speed,  'Nice', $$;

And now whenever either of those variables is accessed, its current
system priority is retrieved and returned.  If those variables are set,
then the process's priority is changed!

We'll use Jarkko Hietaniemi <F<jhi@iki.fi>>'s BSD::Resource class (not
included) to access the PRIO_PROCESS, PRIO_MIN, and PRIO_MAX constants
from your system, as well as the getpriority() and setpriority() system
calls.  Here's the preamble of the class.

    package Nice;
    use Carp;
    use BSD::Resource;
    use strict;
    $Nice::DEBUG = 0 unless defined $Nice::DEBUG;

=over 4

=item TIESCALAR classname, LIST
X<TIESCALAR>

This is the constructor for the class.  That means it is
expected to return a blessed reference to a new scalar
(probably anonymous) that it's creating.  For example:

 sub TIESCALAR {
     my $class = shift;
     my $pid = shift || $$; # 0 means me

     if ($pid !~ /^\d+$/) {
         carp "Nice::Tie::Scalar got non-numeric pid $pid" if $^W;
         return undef;
     }

     unless (kill 0, $pid) { # EPERM or ESRCH, no doubt
         carp "Nice::Tie::Scalar got bad pid $pid: $!" if $^W;
         return undef;
     }

     return bless \$pid, $class;
 }

This tie class has chosen to return an error rather than raising an
exception if its constructor should fail.  While this is how dbmopen() works,
other classes may well not wish to be so forgiving.  It checks the global
variable C<$^W> to see whether to emit a bit of noise anyway.

=item FETCH this
X<FETCH>

This method will be triggered every time the tied variable is accessed
(read).  It takes no arguments beyond its self reference, which is the
object representing the scalar we're dealing with.  Because in this case
we're using just a SCALAR ref for the tied scalar object, a simple $$self
allows the method to get at the real value stored there.  In our example
below, that real value is the process ID to which we've tied our variable.

    sub FETCH {
        my $self = shift;
        confess "wrong type" unless ref $self;
        croak "usage error" if @_;
        my $nicety;
        local($!) = 0;
        $nicety = getpriority(PRIO_PROCESS, $$self);
        if ($!) { croak "getpriority failed: $!" }
        return $nicety;
    }

This time we've decided to blow up (raise an exception) if the
getpriority() call fails--there's no place for us to return an error
otherwise, and it's probably the right thing to do.

=item STORE this, value
X<STORE>

This method will be triggered every time the tied variable is set
(assigned).  Beyond its self reference, it also expects one (and only one)
argument: the new value the user is trying to assign. Don't worry about
returning a value from STORE; the semantic of assignment returning the
assigned value is implemented with FETCH.

 sub STORE {
     my $self = shift;
     confess "wrong type" unless ref $self;
     my $new_nicety = shift;
     croak "usage error" if @_;

     if ($new_nicety < PRIO_MIN) {
         carp sprintf
           "WARNING: priority %d less than minimum system priority %d",
               $new_nicety, PRIO_MIN if $^W;
         $new_nicety = PRIO_MIN;
     }

     if ($new_nicety > PRIO_MAX) {
         carp sprintf
           "WARNING: priority %d greater than maximum system priority %d",
               $new_nicety, PRIO_MAX if $^W;
         $new_nicety = PRIO_MAX;
     }

     unless (defined setpriority(PRIO_PROCESS,
                                 $$self,
                                 $new_nicety))
     {
         confess "setpriority failed: $!";
     }
 }

=item UNTIE this
X<UNTIE>

This method will be triggered when the C<untie> occurs. This can be useful
if the class needs to know when no further calls will be made. (Except DESTROY
of course.) See L<The C<untie> Gotcha> below for more details.

=item DESTROY this
X<DESTROY>

This method will be triggered when the tied variable needs to be destructed.
As with other object classes, such a method is seldom necessary, because Perl
deallocates its moribund object's memory for you automatically--this isn't
C++, you know.  We'll use a DESTROY method here for debugging purposes only.

    sub DESTROY {
        my $self = shift;
        confess "wrong type" unless ref $self;
        carp "[ Nice::DESTROY pid $$self ]" if $Nice::DEBUG;
    }

=back

That's about all there is to it.  Actually, it's more than all there
is to it, because we've done a few nice things here for the sake
of completeness, robustness, and general aesthetics.  Simpler
TIESCALAR classes are certainly possible.
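
For comparison, here is a sketch of a much smaller tied scalar class that
needs no non-core modules.  The class name C<Bounded> and its behaviour
are invented for illustration.  Like C<Nice>, its constructor returns
C<undef> on bad input instead of raising an exception, and its STORE
clamps out-of-range values:

    use strict;
    use warnings;

    package Bounded;    # hypothetical class, not part of the Nice example

    sub TIESCALAR {
        my ( $class, $max ) = @_;
        return undef unless defined $max && $max =~ /^\d+$/;
        return bless { max => $max, value => 0 }, $class;
    }
    sub FETCH { my $self = shift; return $self->{value} }
    sub STORE {
        my ( $self, $new ) = @_;
        $new = $self->{max} if $new > $self->{max};    # clamp to the limit
        $self->{value} = $new;
    }

    package main;

    tie( my $level, 'Bounded', 10 ) or die "tie failed";
    $level = 42;
    print "$level\n";    # prints 10

    # tie() hands back whatever the constructor returned, so a failed
    # construction can be detected by testing the result.
    my $ok = tie my $broken, 'Bounded', 'lots';
    print "constructor refused the argument\n" unless $ok;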

=head2 Tying Arrays
X<array, tying>

A class implementing a tied ordinary array should define the following
methods: TIEARRAY, FETCH, STORE, FETCHSIZE, STORESIZE, CLEAR
and perhaps UNTIE and/or DESTROY.

FETCHSIZE and STORESIZE are used to provide C<$#array> and
equivalent C<scalar(@array)> access.

The methods POP, PUSH, SHIFT, UNSHIFT, SPLICE, DELETE, and EXISTS are
required if the perl operator with the corresponding (but lowercase) name
is to operate on the tied array. The B<Tie::Array> class can be used as a
base class to implement the first five of these in terms of the basic
methods above.  The default implementations of DELETE and EXISTS in
B<Tie::Array> simply C<croak>.
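
As a sketch of that division of labour, the class below supplies only the
basic methods and inherits the rest from B<Tie::Array>.  The class name
C<UpperArray> is made up for this example.  Because the inherited PUSH is
written in terms of STORE, pushed values are upper-cased as well:

    use strict;
    use warnings;

    package UpperArray;    # hypothetical class, for illustration only
    use Tie::Array;
    our @ISA = ('Tie::Array');

    sub TIEARRAY  { my $class = shift; return bless [], $class }
    sub FETCH     { my ( $self, $index ) = @_; return $self->[$index] }
    sub STORE     { my ( $self, $index, $value ) = @_;
                    $self->[$index] = uc $value }
    sub FETCHSIZE { my $self = shift; return scalar @$self }
    sub STORESIZE { my ( $self, $count ) = @_; $#$self = $count - 1 }

    package main;

    tie my @shout, 'UpperArray';
    push @shout, 'cat', 'dog';    # inherited PUSH calls our STORE
    my $last = pop @shout;        # inherited POP uses FETCH and STORESIZE
    print "$last @shout\n";       # prints "DOG CAT"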

In addition EXTEND will be called when perl would have pre-extended
allocation in a real array.

For this discussion, we'll implement an array whose elements are a fixed
size at creation.  If you try to create an element larger than the fixed
size, you'll take an exception.  For example:

    use FixedElem_Array;
    tie @array, 'FixedElem_Array', 3;
    $array[0] = 'cat';  # ok.
    $array[1] = 'dogs'; # exception, length('dogs') > 3.

The preamble code for the class is as follows:

    package FixedElem_Array;
    use Carp;
    use strict;

=over 4

=item TIEARRAY classname, LIST
X<TIEARRAY>

This is the constructor for the class.  That means it is expected to
return a blessed reference through which the new array (probably an
anonymous ARRAY ref) will be accessed.

In our example, just to show you that you don't I<really> have to return an
ARRAY reference, we'll choose a HASH reference to represent our object.
A HASH works out well as a generic record type: the C<{ELEMSIZE}> field will
store the maximum element size allowed, and the C<{ARRAY}> field will hold the
true ARRAY ref.  If someone outside the class tries to dereference the
object returned (doubtless thinking it an ARRAY ref), they'll blow up.
This just goes to show you that you should respect an object's privacy.

    sub TIEARRAY {
      my $class    = shift;
      my $elemsize = shift;
      if ( @_ || $elemsize =~ /\D/ ) {
        croak "usage: tie ARRAY, '" . __PACKAGE__ . "', elem_size";
      }
      return bless {
        ELEMSIZE => $elemsize,
        ARRAY    => [],
      }, $class;
    }

=item FETCH this, index
X<FETCH>

This method will be triggered every time an individual element of the tied array
is accessed (read).  It takes one argument beyond its self reference: the
index whose value we're trying to fetch.

    sub FETCH {
      my $self  = shift;
      my $index = shift;
      return $self->{ARRAY}->[$index];
    }

If a negative array index is used to read from an array, the index
will be translated to a positive one internally by calling FETCHSIZE
before being passed to FETCH.  You may disable this feature by
assigning a true value to the variable C<$NEGATIVE_INDICES> in the
tied array class.

As you may have noticed, the name of the FETCH method (et al.) is the same
for all accesses, even though the constructors differ in names (TIESCALAR
vs TIEARRAY).  While in theory you could have the same class servicing
several tied types, in practice this becomes cumbersome, and it's easiest
to keep them at simply one tie type per class.

=item STORE this, index, value
X<STORE>

This method will be triggered every time an element in the tied array is set
(written).  It takes two arguments beyond its self reference: the index at
which we're trying to store something and the value we're trying to put
there.

In our example, C<undef> is really C<$self-E<gt>{ELEMSIZE}> number of
spaces so we have a little more work to do here:

 sub STORE {
   my $self = shift;
   my( $index, $value ) = @_;
   if ( length $value > $self->{ELEMSIZE} ) {
     croak "length of $value is greater than $self->{ELEMSIZE}";
   }
   # fill in the blanks
   $self->EXTEND( $index ) if $index > $self->FETCHSIZE();
   # right justify to keep element size for smaller elements
   $self->{ARRAY}->[$index] = sprintf "%$self->{ELEMSIZE}s", $value;
 }

Negative indexes are treated the same as with FETCH.

=item FETCHSIZE this
X<FETCHSIZE>

Returns the total number of items in the tied array associated with
object I<this>. (Equivalent to C<scalar(@array)>).  For example:

    sub FETCHSIZE {
      my $self = shift;
      return scalar @{$self->{ARRAY}};
    }

=item STORESIZE this, count
X<STORESIZE>

Sets the total number of items in the tied array associated with
object I<this> to be I<count>. If this makes the array larger, then the
class's mapping of C<undef> should be returned for new positions.
If the array becomes smaller, then entries beyond I<count> should be
deleted.

In our example, 'undef' is really an element containing
C<$self-E<gt>{ELEMSIZE}> number of spaces.  Observe:

    sub STORESIZE {
      my $self  = shift;
      my $count = shift;
      my $size  = $self->FETCHSIZE();
      if ( $count > $size ) {
        # pad with blank elements directly; going through STORE would
        # call EXTEND, which calls STORESIZE again
        push @{$self->{ARRAY}},
          ( ' ' x $self->{ELEMSIZE} ) x ( $count - $size );
      } elsif ( $count < $size ) {
        # drop the surplus elements from the end
        $self->POP() foreach 1 .. $size - $count;
      }
    }

=item EXTEND this, count
X<EXTEND>

Informative call that the array is likely to grow to have I<count> entries.
Can be used to optimize allocation. This method need do nothing.

In our example, we want to make sure there are no blank (C<undef>)
entries, so C<EXTEND> will make use of C<STORESIZE> to fill elements
as needed:

    sub EXTEND {   
      my $self  = shift;
      my $count = shift;
      $self->STORESIZE( $count );
    }

=item EXISTS this, key
X<EXISTS>

Verify that the element at index I<key> exists in the tied array I<this>.

In our example, we will determine that if an element consists of
C<$self-E<gt>{ELEMSIZE}> spaces only, it does not exist:

 sub EXISTS {
   my $self  = shift;
   my $index = shift;
   return 0 if ! defined $self->{ARRAY}->[$index] ||
               $self->{ARRAY}->[$index] eq ' ' x $self->{ELEMSIZE};
   return 1;
 }

=item DELETE this, key
X<DELETE>

Delete the element at index I<key> from the tied array I<this>.

In our example, a deleted item is C<$self-E<gt>{ELEMSIZE}> spaces:

    sub DELETE {
      my $self  = shift;
      my $index = shift;
      return $self->STORE( $index, '' );
    }

=item CLEAR this
X<CLEAR>

Clear (remove, delete, ...) all values from the tied array associated with
object I<this>.  For example:

    sub CLEAR {
      my $self = shift;
      return $self->{ARRAY} = [];
    }

=item PUSH this, LIST 
X<PUSH>

Append elements of I<LIST> to the array.  For example:

    sub PUSH {  
      my $self = shift;
      my @list = @_;
      my $last = $self->FETCHSIZE();
      $self->STORE( $last + $_, $list[$_] ) foreach 0 .. $#list;
      return $self->FETCHSIZE();
    }   

=item POP this
X<POP>

Remove last element of the array and return it.  For example:

    sub POP {
      my $self = shift;
      return pop @{$self->{ARRAY}};
    }

=item SHIFT this
X<SHIFT>

Remove the first element of the array (shifting other elements down)
and return it.  For example:

    sub SHIFT {
      my $self = shift;
      return shift @{$self->{ARRAY}};
    }

=item UNSHIFT this, LIST 
X<UNSHIFT>

Insert LIST elements at the beginning of the array, moving existing elements
up to make room.  For example:

    sub UNSHIFT {
      my $self = shift;
      my @list = @_;
      my $size = scalar( @list );
      # make room for our list
      @{$self->{ARRAY}}[ $size .. $#{$self->{ARRAY}} + $size ]
       = @{$self->{ARRAY}};
      $self->STORE( $_, $list[$_] ) foreach 0 .. $#list;
    }

=item SPLICE this, offset, length, LIST
X<SPLICE>

Perform the equivalent of C<splice> on the array. 

I<offset> is optional and defaults to zero; negative values count back
from the end of the array.

I<length> is optional and defaults to the rest of the array.

I<LIST> may be empty.

Returns a list of the original I<length> elements at I<offset>.

In our example, we'll use a little shortcut if there is a I<LIST>:

    sub SPLICE {
      my $self   = shift;
      my $offset = shift || 0;
      # use // so that an explicit length of 0 (a pure insert) is kept
      my $length = shift // $self->FETCHSIZE() - $offset;
      my @list   = (); 
      if ( @_ ) {
        tie @list, __PACKAGE__, $self->{ELEMSIZE};
        @list   = @_;
      }
      return splice @{$self->{ARRAY}}, $offset, $length, @list;
    }

=item UNTIE this
X<UNTIE>

Will be called when C<untie> happens. (See L<The C<untie> Gotcha> below.)

=item DESTROY this
X<DESTROY>

This method will be triggered when the tied variable needs to be destructed.
As with the scalar tie class, this is almost never needed in a
language that does its own garbage collection, so this time we'll
just leave it out.

=back

=head2 Tying Hashes
X<hash, tying>

Hashes were the first Perl data type to be tied (see dbmopen()).  A class
implementing a tied hash should define the following methods: TIEHASH is
the constructor.  FETCH and STORE access the key and value pairs.  EXISTS
reports whether a key is present in the hash, and DELETE deletes one.
CLEAR empties the hash by deleting all the key and value pairs.  FIRSTKEY
and NEXTKEY implement the keys() and each() functions to iterate over all
the keys. SCALAR is triggered when the tied hash is evaluated in scalar 
context. UNTIE is called when C<untie> happens, and DESTROY is called when
the tied variable is garbage collected.

If this seems like a lot, then feel free to inherit most of your methods
from the standard Tie::StdHash module, redefining only the
interesting ones.  See L<Tie::Hash> for details.
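
As a minimal sketch of that approach, the class below inherits from
Tie::StdHash and redefines only the four methods that take a key, so that
keys are treated case-insensitively.  The name C<LowerKeyHash> and the
C<%config> hash are invented for this illustration.

```perl

    use strict;
    use warnings;

    {
        package LowerKeyHash;        # hypothetical class name
        require Tie::Hash;           # Tie::StdHash lives in Tie/Hash.pm
        our @ISA = ('Tie::StdHash');

        # Only the key-taking methods are redefined; TIEHASH, CLEAR,
        # FIRSTKEY, NEXTKEY and the rest are inherited unchanged.
        sub STORE  { my ($self, $key, $value) = @_;
                     $self->SUPER::STORE(lc $key, $value) }
        sub FETCH  { my ($self, $key) = @_; $self->SUPER::FETCH(lc $key) }
        sub EXISTS { my ($self, $key) = @_; $self->SUPER::EXISTS(lc $key) }
        sub DELETE { my ($self, $key) = @_; $self->SUPER::DELETE(lc $key) }
    }

    tie my %config, 'LowerKeyHash';
    $config{Editor} = 'vi';
    my $editor   = $config{EDITOR};      # same entry as $config{Editor}
    my @all_keys = keys %config;         # keys are stored in lower case

```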

Remember that Perl distinguishes between a key not existing in the hash,
and the key existing in the hash but having a corresponding value of
C<undef>.  The two possibilities can be tested with the C<exists()> and
C<defined()> functions.
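
The distinction can be seen on an ordinary hash; on a tied hash,
C<exists()> is answered by the EXISTS method while C<defined()> looks at
the value that FETCH returns.  The hash and key names below are only
illustrative.

```perl

    use strict;
    use warnings;

    my %h = ( shell => undef );     # key present, value undef

    my $present_exists  = exists  $h{shell}  ? 1 : 0;  # the key is there
    my $present_defined = defined $h{shell}  ? 1 : 0;  # its value is undef
    my $missing_exists  = exists  $h{editor} ? 1 : 0;  # never stored

```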

Here's an example of a somewhat interesting tied hash class:  it gives you
a hash representing a particular user's dot files.  You index into the hash
with the name of the file (minus the dot) and you get back that dot file's
contents.  For example:

    use DotFiles;
    tie %dot, 'DotFiles';
    if ( $dot{profile} =~ /MANPATH/ ||
         $dot{login}   =~ /MANPATH/ ||
         $dot{cshrc}   =~ /MANPATH/    )
    {
	print "you seem to set your MANPATH\n";
    }

Or here's another sample of using our tied class:

    tie %him, 'DotFiles', 'daemon';
    foreach $f ( keys %him ) {
	printf "daemon dot file %s is size %d\n",
	    $f, length $him{$f};
    }

In our tied hash DotFiles example, we use a regular
hash for the object containing several important
fields, of which only the C<{LIST}> field will be what the
user thinks of as the real hash.

=over 5

=item USER

whose dot files this object represents

=item HOME

where those dot files live

=item CLOBBER

whether we should try to change or remove those dot files

=item LIST

the hash of dot file names and content mappings

=back

Here's the start of F<DotFiles.pm>:

    package DotFiles;
    use Carp;
    sub whowasi { (caller(1))[3] . '()' }
    my $DEBUG = 0;
    sub debug { $DEBUG = @_ ? shift : 1 }

For our example, we want to be able to emit debugging info to help in tracing
during development.  We also keep one convenience function around
internally to help print out warnings; whowasi() returns the name of the
function that calls it.

Here are the methods for the DotFiles tied hash.

=over 4

=item TIEHASH classname, LIST
X<TIEHASH>

This is the constructor for the class.  That means it is expected to
return a blessed reference through which the new object (probably but not
necessarily an anonymous hash) will be accessed.

Here's the constructor:

    sub TIEHASH {
	my $self = shift;
	my $user = shift || $>;
	my $dotdir = shift || '';
	croak "usage: @{[&whowasi]} [USER [DOTDIR]]" if @_;
	$user = getpwuid($user) if $user =~ /^\d+$/;
	my $dir = (getpwnam($user))[7]
		|| croak "@{[&whowasi]}: no user $user";
	$dir .= "/$dotdir" if $dotdir;

	my $node = {
	    USER    => $user,
	    HOME    => $dir,
	    LIST    => {},
	    CLOBBER => 0,
	};

	opendir(DIR, $dir)
		|| croak "@{[&whowasi]}: can't opendir $dir: $!";
	foreach $dot ( grep /^\./ && -f "$dir/$_", readdir(DIR)) {
	    $dot =~ s/^\.//;
	    $node->{LIST}{$dot} = undef;
	}
	closedir DIR;
	return bless $node, $self;
    }

It's probably worth mentioning that if you're going to filetest the
return values out of a readdir, you'd better prepend the directory
in question.  Otherwise, because we didn't chdir() there, it would
have been testing the wrong file.

=item FETCH this, key
X<FETCH>

This method will be triggered every time an element in the tied hash is
accessed (read).  It takes one argument beyond its self reference: the key
whose value we're trying to fetch.

Here's the fetch for our DotFiles example.

    sub FETCH {
	carp &whowasi if $DEBUG;
	my $self = shift;
	my $dot = shift;
	my $dir = $self->{HOME};
	my $file = "$dir/.$dot";

	unless (exists $self->{LIST}->{$dot} || -f $file) {
	    carp "@{[&whowasi]}: no $dot file" if $DEBUG;
	    return undef;
	}

	if (defined $self->{LIST}->{$dot}) {
	    return $self->{LIST}->{$dot};
	} else {
	    return $self->{LIST}->{$dot} = `cat $dir/.$dot`;
	}
    }

It was easy to write by having it call the Unix cat(1) command, but it
would probably be more portable to open the file manually (and somewhat
more efficient).  Of course, because dot files are a Unixy concept, we're
not that concerned.
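
For comparison, here is a sketch of reading the file without an external
command.  The helper name C<slurp_file> is an assumption made for this
illustration and is not part of the DotFiles class; the temporary file
only serves to exercise it.

```perl

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical helper: return the whole content of $file, or
    # nothing (undef in scalar context) if it cannot be opened.
    sub slurp_file {
        my $file = shift;
        open(my $fh, '<', $file) or return;
        local $/;                  # undefined $/ means "slurp" mode
        my $content = <$fh>;
        close($fh);
        return $content;
    }

    # Exercise the helper on a throwaway file.
    my ($out, $name) = tempfile(UNLINK => 1);
    print {$out} "setenv MANPATH /usr/man\n";
    close($out);
    my $text = slurp_file($name);

```

With such a helper, the backtick expression in FETCH could be replaced by
a call like C<slurp_file($file)>.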

=item STORE this, key, value
X<STORE>

This method will be triggered every time an element in the tied hash is set
(written).  It takes two arguments beyond its self reference: the key under
which we're trying to store something, and the value we're trying to put
there.

Here in our DotFiles example, we'll be careful not to let
them try to overwrite the file unless they've called the clobber()
method on the original object reference returned by tie().

    sub STORE {
	carp &whowasi if $DEBUG;
	my $self = shift;
	my $dot = shift;
	my $value = shift;
	my $file = $self->{HOME} . "/.$dot";
	my $user = $self->{USER};

	croak "@{[&whowasi]}: $file not clobberable"
	    unless $self->{CLOBBER};

	open(my $f, '>', $file) || croak "can't open $file: $!";
	print $f $value;
	close($f);
    }

If they wanted to clobber something, they might say:

    $ob = tie %daemon_dots, 'DotFiles', 'daemon';
    $ob->clobber(1);
    $daemon_dots{signature} = "A true daemon\n";

Another way to lay hands on a reference to the underlying object is to
use the tied() function, so they might alternately have set clobber
using:

    tie %daemon_dots, 'DotFiles', 'daemon';
    tied(%daemon_dots)->clobber(1);

The clobber method is simply:

    sub clobber {
	my $self = shift;
	$self->{CLOBBER} = @_ ? shift : 1;
    }

=item DELETE this, key
X<DELETE>

This method is triggered when we remove an element from the hash,
typically by using the delete() function.  Again, we'll
be careful to check whether they really want to clobber files.

 sub DELETE   {
     carp &whowasi if $DEBUG;

     my $self = shift;
     my $dot = shift;
     my $file = $self->{HOME} . "/.$dot";
     croak "@{[&whowasi]}: won't remove file $file"
         unless $self->{CLOBBER};
     delete $self->{LIST}->{$dot};
     my $success = unlink($file);
     carp "@{[&whowasi]}: can't unlink $file: $!" unless $success;
     $success;
 }

The value returned by DELETE becomes the return value of the call
to delete().  If you want to emulate the normal behavior of delete(),
you should return whatever FETCH would have returned for this key.
In this example, we have chosen instead to return a value which tells
the caller whether the file was successfully deleted.

=item CLEAR this
X<CLEAR>

This method is triggered when the whole hash is to be cleared, usually by
assigning the empty list to it.

In our example, that would remove all the user's dot files!  It's such a
dangerous thing that they'll have to set CLOBBER to something higher than
1 to make it happen.

 sub CLEAR    {
     carp &whowasi if $DEBUG;
     my $self = shift;
     croak "@{[&whowasi]}: won't remove all dot files for $self->{USER}"
         unless $self->{CLOBBER} > 1;
     my $dot;
     foreach $dot ( keys %{$self->{LIST}}) {
         $self->DELETE($dot);
     }
 }

=item EXISTS this, key
X<EXISTS>

This method is triggered when the user uses the exists() function
on a particular hash.  In our example, we'll look at the C<{LIST}>
hash element for this:

    sub EXISTS   {
	carp &whowasi if $DEBUG;
	my $self = shift;
	my $dot = shift;
	return exists $self->{LIST}->{$dot};
    }

=item FIRSTKEY this
X<FIRSTKEY>

This method will be triggered when the user is going
to iterate through the hash, such as via a keys(), values(), or each() call.

    sub FIRSTKEY {
	carp &whowasi if $DEBUG;
	my $self = shift;
	my $a = keys %{$self->{LIST}};  # reset each() iterator
	each %{$self->{LIST}}
    }

FIRSTKEY is always called in scalar context and it should just
return the first key.  values(), and each() in list context,
will call FETCH for the returned keys.

=item NEXTKEY this, lastkey
X<NEXTKEY>

This method gets triggered during a keys(), values(), or each()
iteration.  It has a second argument which is the last key that had been
accessed.  This is useful if you care about ordering, are calling the
iterator from more than one sequence, or are not really storing things
in a hash anywhere.

NEXTKEY is always called in scalar context and it should just
return the next key.  values(), and each() in list context,
will call FETCH for the returned keys.

For our example, we're using a real hash so we'll do just the simple
thing, but we'll have to go through the LIST field indirectly.

    sub NEXTKEY  {
	carp &whowasi if $DEBUG;
	my $self = shift;
	return each %{ $self->{LIST} }
    }
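
To illustrate a case where the I<lastkey> argument matters, here is a
sketch of a tied hash that always iterates in sorted key order.  The
class name C<SortedHash> and its C<DATA> and C<NEXT> fields are invented
for this example and have nothing to do with DotFiles.

```perl

    use strict;
    use warnings;

    {
        package SortedHash;         # hypothetical class name
        sub TIEHASH { my $class = shift;
                      return bless { DATA => {}, NEXT => {} }, $class }
        sub STORE   { $_[0]{DATA}{ $_[1] } = $_[2] }
        sub FETCH   { $_[0]{DATA}{ $_[1] } }
        sub EXISTS  { exists $_[0]{DATA}{ $_[1] } }
        sub DELETE  { delete $_[0]{DATA}{ $_[1] } }
        sub CLEAR   { %{ $_[0]{DATA} } = () }

        sub FIRSTKEY {
            my $self  = shift;
            my @order = sort keys %{ $self->{DATA} };
            # Remember, for each key, which key follows it.
            my %next;
            @next{ @order[ 0 .. $#order - 1 ] } = @order[ 1 .. $#order ];
            $self->{NEXT} = \%next;
            return $order[0];       # undef when the hash is empty
        }

        sub NEXTKEY {
            my ($self, $lastkey) = @_;
            return $self->{NEXT}{$lastkey};   # undef after the last key
        }
    }

    tie my %fruit, 'SortedHash';
    %fruit = ( pear => 1, apple => 2, fig => 3 );
    my @order  = keys %fruit;       # iteration follows sorted key order
    my @values = values %fruit;     # FETCH is called for each key in turn

```

FIRSTKEY rebuilds the successor table each time an iteration starts,
which costs a sort per iteration; that trade-off is a choice of this
sketch, not a requirement of the interface.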

=item SCALAR this
X<SCALAR>

This is called when the hash is evaluated in scalar context. In order
to mimic the behaviour of untied hashes, this method should return a
false value when the tied hash is considered empty. If this method does
not exist, perl will make some educated guesses and return true when
the hash is inside an iteration. If this isn't the case, FIRSTKEY is
called, and the result will be a false value if FIRSTKEY returns the empty
list, true otherwise.

However, you should B<not> blindly rely on perl always doing the right 
thing. Particularly, perl will mistakenly return true when you clear the 
hash by repeatedly calling DELETE until it is empty. You are therefore 
advised to supply your own SCALAR method when you want to be absolutely 
sure that your hash behaves nicely in scalar context.

In our example we can just call C<scalar> on the underlying hash
referenced by C<$self-E<gt>{LIST}>:

    sub SCALAR {
	carp &whowasi if $DEBUG;
	my $self = shift;
	return scalar %{ $self->{LIST} }
    }

NOTE: In perl 5.25 the behavior of scalar %hash on an untied hash changed
to return the count of keys. Prior to this it returned a string containing
information about the bucket setup of the hash. See
L<Hash::Util/bucket_ratio> for a backwards compatibility path.

=item UNTIE this
X<UNTIE>

This is called when C<untie> occurs.  See L<The C<untie> Gotcha> below.

=item DESTROY this
X<DESTROY>

This method is triggered when a tied hash is about to go out of
scope.  You don't really need it unless you're trying to add debugging
or have auxiliary state to clean up.  Here's a very simple function:

    sub DESTROY  {
	carp &whowasi if $DEBUG;
    }

=back

Note that functions such as keys() and values() may return huge lists
when used on large objects, like DBM files.  You may prefer to use the
each() function to iterate over such.  Example:

    # print out history file offsets
    use NDBM_File;
    tie(%HIST, 'NDBM_File', '/usr/lib/news/history', 1, 0);
    while (($key,$val) = each %HIST) {
        print $key, ' = ', unpack('L',$val), "\n";
    }
    untie(%HIST);

=head2 Tying FileHandles
X<filehandle, tying>

This is partially implemented now.

A class implementing a tied filehandle should define the following
methods: TIEHANDLE, at least one of PRINT, PRINTF, WRITE, READLINE, GETC,
READ, and possibly CLOSE, UNTIE and DESTROY.  The class can also provide: BINMODE,
OPEN, EOF, FILENO, SEEK, TELL - if the corresponding perl operators are
used on the handle.

When STDERR is tied, its PRINT method will be called to issue warnings
and error messages.  This feature is temporarily disabled during the call, 
which means you can use C<warn()> inside PRINT without starting a recursive
loop.  And just like C<__WARN__> and C<__DIE__> handlers, STDERR's PRINT
method may be called to report parser errors, so the caveats mentioned under 
L<perlvar/%SIG> apply.

All of this is especially useful when perl is embedded in some other 
program, where output to STDOUT and STDERR may have to be redirected 
in some special way.  See nvi and the Apache module for examples.

When tying a handle, the first argument to C<tie> should begin with an
asterisk.  So, if you are tying STDOUT, use C<*STDOUT>.  If you have
assigned it to a scalar variable, say C<$handle>, use C<*$handle>.
C<tie $handle> ties the scalar variable C<$handle>, not the handle inside
it.
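
A small sketch of the C<*$handle> form follows.  The C<Collector> class
is hypothetical and implements only TIEHANDLE and PRINT; the anonymous
glob comes from the standard Symbol module.

```perl

    use strict;
    use warnings;
    use Symbol qw(gensym);

    {
        package Collector;          # hypothetical class name
        sub TIEHANDLE { my ($class, $store) = @_;
                        return bless { LINES => $store }, $class }
        sub PRINT     { my $self = shift;
                        push @{ $self->{LINES} }, join('', @_);
                        return 1 }
    }

    my @captured;
    my $handle = gensym();                  # anonymous glob in a scalar
    tie *$handle, 'Collector', \@captured;  # note the asterisk
    print {$handle} "hello\n";              # goes to Collector::PRINT
    untie *$handle;

```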

In our example we're going to create a shouting handle.

    package Shout;

=over 4

=item TIEHANDLE classname, LIST
X<TIEHANDLE>

This is the constructor for the class.  That means it is expected to
return a blessed reference of some sort. The reference can be used to
hold some internal information.

    sub TIEHANDLE { print "<shout>\n"; my $i; bless \$i, shift }

=item WRITE this, LIST
X<WRITE>

This method will be called when the handle is written to via the
C<syswrite> function.

 sub WRITE {
     my $r = shift;
     my($buf,$len,$offset) = @_;
     print "WRITE called, \$buf=$buf, \$len=$len, \$offset=$offset";
 }

=item PRINT this, LIST
X<PRINT>

This method will be triggered every time the tied handle is printed to
with the C<print()> or C<say()> functions.  Beyond its self reference
it also expects the list that was passed to the print function.

  sub PRINT { my $r = shift; $$r++; print join($,,map(uc($_),@_)),$\ }

C<say()> acts just like C<print()> except C<$\> will be localized to C<\n>, so
you need do nothing special to handle C<say()> in C<PRINT()>.
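
The following sketch shows the effect.  The C<Recorder> class is a
made-up example whose PRINT appends C<$\> to whatever it is given, so
the same method serves both C<print()> and C<say()>.

```perl

    use strict;
    use warnings;
    use feature 'say';
    use Symbol qw(gensym);

    {
        package Recorder;           # hypothetical class name
        sub TIEHANDLE { my ($class, $store) = @_;
                        return bless { OUT => $store }, $class }
        sub PRINT {
            my $self = shift;
            # $\ is undef for a plain print (unless the caller set it)
            # and "\n" while say() is running.
            push @{ $self->{OUT} },
                 join('', @_) . (defined $\ ? $\ : '');
            return 1;
        }
    }

    my @out;
    my $fh = gensym();
    tie *$fh, 'Recorder', \@out;
    print {$fh} "printed";
    say   {$fh} "said";
    untie *$fh;

```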

=item PRINTF this, LIST
X<PRINTF>

This method will be triggered every time the tied handle is printed to
with the C<printf()> function.
Beyond its self reference it also expects the format and list that was
passed to the printf function.

    sub PRINTF {
        shift;
        my $fmt = shift;
        print sprintf($fmt, @_);
    }

=item READ this, LIST
X<READ>

This method will be called when the handle is read from via the C<read>
or C<sysread> functions.

 sub READ {
   my $self = shift;
   my $bufref = \$_[0];
   my(undef,$len,$offset) = @_;
   print "READ called, \$buf=$bufref, \$len=$len, \$offset=$offset";
   # add to $$bufref, set $len to number of characters read
   $len;
 }

=item READLINE this
X<READLINE>

This method is called when the handle is read via C<E<lt>HANDLEE<gt>>
or C<readline HANDLE>.

As per L<C<readline>|perlfunc/readline>, in scalar context it should return
the next line, or C<undef> for no more data.  In list context it should
return all remaining lines, or an empty list for no more data.  The strings
returned should include the input record separator C<$/> (see L<perlvar>),
unless it is C<undef> (which means "slurp" mode).

    sub READLINE {
      my $r = shift;
      if (wantarray) {
        return ("all remaining\n",
                "lines up\n",
                "to eof\n");
      } else {
        return "READLINE called " . ++$$r . " times\n";
      }
    }

=item GETC this
X<GETC>

This method will be called when the C<getc> function is called.

    sub GETC { print "Don't GETC, Get Perl"; return "a"; }

=item EOF this
X<EOF>

This method will be called when the C<eof> function is called.

Starting with Perl 5.12, an additional integer parameter will be passed.  It
will be zero if C<eof> is called without parameter; C<1> if C<eof> is given
a filehandle as a parameter, e.g. C<eof(FH)>; and C<2> in the very special
case that the tied filehandle is C<ARGV> and C<eof> is called with an empty
parameter list, e.g. C<eof()>.

    sub EOF { not length $stringbuf }
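
The one-liner above assumes a C<$stringbuf> variable defined elsewhere.
A more complete sketch, which also records the extra integer parameter,
might look like this; the C<StringReader> class and its C<BUF> and
C<SEEN> fields are assumptions made for the illustration.

```perl

    use strict;
    use warnings;
    use Symbol qw(gensym);

    {
        package StringReader;       # hypothetical class name
        sub TIEHANDLE {
            my ($class, $text) = @_;
            return bless { BUF => $text, SEEN => [] }, $class;
        }
        # Hand back the whole buffer as one "line", then no more data.
        sub READLINE {
            my $self = shift;
            return unless length $self->{BUF};
            my $line = $self->{BUF};
            $self->{BUF} = '';
            return $line;
        }
        sub EOF {
            my ($self, $how) = @_;  # $how is the extra integer (5.12+)
            push @{ $self->{SEEN} }, $how;
            return length( $self->{BUF} ) ? 0 : 1;
        }
    }

    my $fh     = gensym();
    my $reader = tie *$fh, 'StringReader', "one line\n";
    my $before = eof($fh) ? 1 : 0;      # data still buffered
    my $line   = <$fh>;
    my $after  = eof($fh) ? 1 : 0;      # buffer now empty

```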

=item CLOSE this
X<CLOSE>

This method will be called when the handle is closed via the C<close>
function.

    sub CLOSE { print "CLOSE called.\n" }

=item UNTIE this
X<UNTIE>

As with the other types of ties, this method will be called when C<untie> happens.
It may be appropriate to "auto CLOSE" when this occurs.  See
L<The C<untie> Gotcha> below.

=item DESTROY this
X<DESTROY>

As with the other types of ties, this method will be called when the
tied handle is about to be destroyed. This is useful for debugging and
possibly cleaning up.

    sub DESTROY { print "</shout>\n" }

=back

Here's how to use our little example:

    tie(*FOO,'Shout');
    print FOO "hello\n";
    $a = 4; $b = 6;
    print FOO $a, " plus ", $b, " equals ", $a + $b, "\n";
    print <FOO>;

=head2 UNTIE this
X<UNTIE>

You can define for all tie types an UNTIE method that will be called
at untie().  See L<The C<untie> Gotcha> below.

=head2 The C<untie> Gotcha
X<untie>

If you intend making use of the object returned from either tie() or
tied(), and if the tie's target class defines a destructor, there is a
subtle gotcha you I<must> guard against.

As setup, consider this (admittedly rather contrived) example of a
tie; all it does is use a file to keep a log of the values assigned to
a scalar.

    package Remember;

    use strict;
    use warnings;
    use IO::File;

    sub TIESCALAR {
        my $class = shift;
        my $filename = shift;
        my $handle = IO::File->new( "> $filename" )
                         or die "Cannot open $filename: $!\n";

        print $handle "The Start\n";
        bless {FH => $handle, Value => 0}, $class;
    }

    sub FETCH {
        my $self = shift;
        return $self->{Value};
    }

    sub STORE {
        my $self = shift;
        my $value = shift;
        my $handle = $self->{FH};
        print $handle "$value\n";
        $self->{Value} = $value;
    }

    sub DESTROY {
        my $self = shift;
        my $handle = $self->{FH};
        print $handle "The End\n";
        close $handle;
    }

    1;

Here is an example that makes use of this tie:

    use strict;
    use Remember;

    my $fred;
    tie $fred, 'Remember', 'myfile.txt';
    $fred = 1;
    $fred = 4;
    $fred = 5;
    untie $fred;
    system "cat myfile.txt";

This is the output when it is executed:

    The Start
    1
    4
    5
    The End

So far so good.  Those of you who have been paying attention will have
spotted that the tied object hasn't been used so far.  So let's add an
extra method to the Remember class to allow comments to be included in
the file; say, something like this:

    sub comment {
        my $self = shift;
        my $text = shift;
        my $handle = $self->{FH};
        print $handle $text, "\n";
    }

And here is the previous example modified to use the C<comment> method
(which requires the tied object):

    use strict;
    use Remember;

    my ($fred, $x);
    $x = tie $fred, 'Remember', 'myfile.txt';
    $fred = 1;
    $fred = 4;
    comment $x "changing...";
    $fred = 5;
    untie $fred;
    system "cat myfile.txt";

When this code is executed there is no output.  Here's why:

When a variable is tied, it is associated with the object which is the
return value of the TIESCALAR, TIEARRAY, or TIEHASH function.  This
object normally has only one reference, namely, the implicit reference
from the tied variable.  When untie() is called, that reference is
destroyed.  Then, as in the first example above, the object's
destructor (DESTROY) is called, which is normal for objects that have
no more valid references; and thus the file is closed.

In the second example, however, we have stored another reference to
the tied object in $x.  That means that when untie() gets called
there will still be a valid reference to the object in existence, so
the destructor is not called at that time, and thus the file is not
closed.  The reason there is no output is because the file buffers
have not been flushed to disk.
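
The same effect can be observed without any file I/O.  In this sketch
the hypothetical C<Tracker> class merely records when its destructor
runs, and a C<__WARN__> handler collects any warning Perl emits at
untie() time.

```perl

    use strict;
    use warnings;

    {
        package Tracker;            # hypothetical class name
        our @EVENTS;
        sub TIESCALAR { my $class = shift; my $value;
                        return bless \$value, $class }
        sub FETCH     { ${ $_[0] } }
        sub STORE     { ${ $_[0] } = $_[1] }
        sub DESTROY   { push @EVENTS, 'DESTROY' }
    }

    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    my $var;
    my $obj = tie $var, 'Tracker';  # $obj is a second reference
    untie $var;                     # the object survives the untie
    my $destroyed_by_untie = scalar @Tracker::EVENTS;
    undef $obj;                     # last reference gone: DESTROY runs
    my $destroyed_by_undef = scalar @Tracker::EVENTS;

```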

Now that you know what the problem is, what can you do to avoid it?
Prior to the introduction of the optional UNTIE method the only way
was the good old C<-w> flag, which will spot any instances where you call
untie() and there are still valid references to the tied object.  If
the second script above had C<use warnings 'untie'> near the top
or was run with the C<-w> flag, Perl would print this
warning message:

    untie attempted while 1 inner references still exist

To get the script to work properly and silence the warning make sure
there are no valid references to the tied object I<before> untie() is
called:

    undef $x;
    untie $fred;

Now that UNTIE exists the class designer can decide which parts of the
class functionality are really associated with C<untie> and which with
the object being destroyed. What makes sense for a given class depends
on whether the inner references are being kept so that non-tie-related
methods can be called on the object. But in most cases it probably makes
sense to move the functionality that would have been in DESTROY to the UNTIE
method.

If the UNTIE method exists then the warning above does not occur. Instead the
UNTIE method is passed the count of "extra" references and can issue its own
warning if appropriate.  For example, to replicate the no-UNTIE case, this
method can be used:

 sub UNTIE
 {
  my ($obj,$count) = @_;
  carp "untie attempted while $count inner references still exist"
                                                              if $count;
 }
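
To see the count that UNTIE receives, consider this sketch.  The
C<Counted> class is invented for the illustration and simply records the
argument instead of warning.

```perl

    use strict;
    use warnings;

    {
        package Counted;            # hypothetical class name
        our @COUNTS;
        sub TIESCALAR { my $class = shift; my $value;
                        return bless \$value, $class }
        sub FETCH     { ${ $_[0] } }
        sub STORE     { ${ $_[0] } = $_[1] }
        sub UNTIE     { my ($self, $count) = @_;
                        push @COUNTS, $count }
    }

    my ($plain, $held);

    tie $plain, 'Counted';
    untie $plain;                   # no extra references are held

    my $obj = tie $held, 'Counted';
    untie $held;                    # $obj still refers to the object

```

Because the class defines UNTIE, Perl issues no warning of its own in
the second case; it is up to the method to decide what to do with the
count.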

=head1 SEE ALSO

See L<DB_File> or L<Config> for some interesting tie() implementations.
A good starting point for many tie() implementations is with one of the
modules L<Tie::Scalar>, L<Tie::Array>, L<Tie::Hash>, or L<Tie::Handle>.

=head1 BUGS

The normal return provided by C<scalar(%hash)> is not
available.  What this means is that using %tied_hash in boolean
context doesn't work right (currently this always tests false,
regardless of whether the hash is empty or has elements).
[ This paragraph needs review in light of changes in 5.25 ]

Localizing tied arrays or hashes does not work.  After exiting the
scope the arrays or the hashes are not restored.

Counting the number of entries in a hash via C<scalar(keys(%hash))>
or C<scalar(values(%hash))> is inefficient since it needs to iterate
through all the entries with FIRSTKEY/NEXTKEY.

Tied hash/array slices cause multiple FETCH/STORE pairs; there are no
tie methods for slice operations.
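
A sketch that makes the per-element calls visible is shown below.  The
C<CountingHash> class and its C<%CALLS> counter are assumptions made for
this illustration; the class inherits everything else from Tie::StdHash.

```perl

    use strict;
    use warnings;

    {
        package CountingHash;       # hypothetical class name
        require Tie::Hash;
        our @ISA   = ('Tie::StdHash');
        our %CALLS = ( FETCH => 0, STORE => 0 );
        sub FETCH { my $self = shift; $CALLS{FETCH}++;
                    return $self->SUPER::FETCH(@_) }
        sub STORE { my $self = shift; $CALLS{STORE}++;
                    return $self->SUPER::STORE(@_) }
    }

    tie my %h, 'CountingHash';
    @h{qw(a b c)} = (1, 2, 3);      # STORE is called for each key
    my @values = @h{qw(a b c)};     # FETCH is called for each key

```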

You cannot easily tie a multilevel data structure (such as a hash of
hashes) to a dbm file.  The first problem is that all but GDBM and
Berkeley DB have size limitations, but beyond that, you also have problems
with how references are to be represented on disk.  One
module that does attempt to address this need is DBM::Deep.  Check your
nearest CPAN site as described in L<perlmodlib> for source code.  Note
that despite its name, DBM::Deep does not use dbm.  Another earlier attempt
at solving the problem is MLDBM, which is also available on the CPAN, but
which has some fairly serious limitations.

Tied filehandles are still incomplete.  sysopen(), truncate(),
flock(), fcntl(), stat() and -X can't currently be trapped.

=head1 AUTHOR

Tom Christiansen

TIEHANDLE by Sven Verdoolaege <F<skimo@dns.ufsia.ac.be>> and Doug MacEachern <F<dougm@osf.org>>

UNTIE by Nick Ing-Simmons <F<nick@ing-simmons.net>>

SCALAR by Tassilo von Parseval <F<tassilo.von.parseval@rwth-aachen.de>>

Tying Arrays by Casey West <F<casey@geeknest.com>>
perlapio.pod000064400000045525150344123460007100 0ustar00=head1 NAME

perlapio - perl's IO abstraction interface.

=head1 SYNOPSIS

  #define PERLIO_NOT_STDIO 0    /* For co-existence with stdio only */
  #include <perlio.h>           /* Usually via #include <perl.h> */

  PerlIO *PerlIO_stdin(void);
  PerlIO *PerlIO_stdout(void);
  PerlIO *PerlIO_stderr(void);

  PerlIO *PerlIO_open(const char *path,const char *mode);
  PerlIO *PerlIO_fdopen(int fd, const char *mode);
  PerlIO *PerlIO_reopen(const char *path, /* deprecated */
          const char *mode, PerlIO *old);
  int     PerlIO_close(PerlIO *f);

  int     PerlIO_stdoutf(const char *fmt,...);
  int     PerlIO_puts(PerlIO *f,const char *string);
  int     PerlIO_putc(PerlIO *f,int ch);
  SSize_t PerlIO_write(PerlIO *f,const void *buf,size_t numbytes);
  int     PerlIO_printf(PerlIO *f, const char *fmt,...);
  int     PerlIO_vprintf(PerlIO *f, const char *fmt, va_list args);
  int     PerlIO_flush(PerlIO *f);

  int     PerlIO_eof(PerlIO *f);
  int     PerlIO_error(PerlIO *f);
  void    PerlIO_clearerr(PerlIO *f);

  int     PerlIO_getc(PerlIO *f);
  int     PerlIO_ungetc(PerlIO *f,int ch);
  SSize_t PerlIO_read(PerlIO *f, void *buf, size_t numbytes);

  int     PerlIO_fileno(PerlIO *f);

  void    PerlIO_setlinebuf(PerlIO *f);

  Off_t   PerlIO_tell(PerlIO *f);
  int     PerlIO_seek(PerlIO *f, Off_t offset, int whence);
  void    PerlIO_rewind(PerlIO *f);

  int     PerlIO_getpos(PerlIO *f, SV *save);    /* prototype changed */
  int     PerlIO_setpos(PerlIO *f, SV *saved);   /* prototype changed */

  int     PerlIO_fast_gets(PerlIO *f);
  int     PerlIO_has_cntptr(PerlIO *f);
  SSize_t PerlIO_get_cnt(PerlIO *f);
  char   *PerlIO_get_ptr(PerlIO *f);
  void    PerlIO_set_ptrcnt(PerlIO *f, char *ptr, SSize_t count);

  int     PerlIO_canset_cnt(PerlIO *f);              /* deprecated */
  void    PerlIO_set_cnt(PerlIO *f, int count);      /* deprecated */

  int     PerlIO_has_base(PerlIO *f);
  char   *PerlIO_get_base(PerlIO *f);
  SSize_t PerlIO_get_bufsiz(PerlIO *f);

  PerlIO *PerlIO_importFILE(FILE *stdio, const char *mode);
  FILE   *PerlIO_exportFILE(PerlIO *f, int flags);
  FILE   *PerlIO_findFILE(PerlIO *f);
  void    PerlIO_releaseFILE(PerlIO *f,FILE *stdio);

  int     PerlIO_apply_layers(PerlIO *f, const char *mode,
                                                    const char *layers);
  int     PerlIO_binmode(PerlIO *f, int ptype, int imode,
                                                    const char *layers);
  void    PerlIO_debug(const char *fmt,...);

=head1 DESCRIPTION

Perl's source code, and extensions that want maximum portability,
should use the above functions instead of those defined in ANSI C's
I<stdio.h>.  The perl headers (in particular "perlio.h") will
C<#define> them to the I/O mechanism selected at Configure time.

The functions are modeled on those in I<stdio.h>, but parameter order
has been "tidied up a little".

C<PerlIO *> takes the place of FILE *. Like FILE * it should be
treated as opaque (it is probably safe to assume it is a pointer to
something).

There are currently two implementations:

=over 4

=item 1. USE_STDIO

All above are #define'd to stdio functions or are trivial wrapper
functions which call stdio. In this case I<only> PerlIO * is a FILE *.
This has been the default implementation since the abstraction was
introduced in perl5.003_02.

=item 2. USE_PERLIO

Introduced just after perl5.7.0, this is a re-implementation of the
above abstraction which allows perl more control over how IO is done
as it decouples IO from the way the operating system and C library
choose to do things. For USE_PERLIO PerlIO * has an extra layer of
indirection - it is a pointer-to-a-pointer.  This allows the PerlIO *
to remain with a known value while swapping the implementation around
underneath I<at run time>. In this case all the above are true (but
very simple) functions which call the underlying implementation.

This is the only implementation for which C<PerlIO_apply_layers()>
does anything "interesting".

The USE_PERLIO implementation is described in L<perliol>.

=back

Because "perlio.h" is a thin layer (for efficiency) the semantics of
these functions are somewhat dependent on the underlying implementation.
Where these variations are understood they are noted below.

Unless otherwise noted, functions return 0 on success, or a negative
value (usually C<EOF> which is usually -1) and set C<errno> on error.

=over 4

=item B<PerlIO_stdin()>, B<PerlIO_stdout()>, B<PerlIO_stderr()>

Use these rather than C<stdin>, C<stdout>, C<stderr>. They are written
to look like "function calls" rather than variables because this makes
it easier to I<make them> function calls if the platform cannot export data
to loaded modules, or if (say) different "threads" might have different
values.

=item B<PerlIO_open(path, mode)>, B<PerlIO_fdopen(fd,mode)>

These correspond to fopen()/fdopen() and the arguments are the same.
Return C<NULL> and set C<errno> if there is an error.  There may be an
implementation limit on the number of open handles, which may be lower
than the limit on the number of open files - C<errno> may not be set
when C<NULL> is returned if this limit is exceeded.

=item B<PerlIO_reopen(path,mode,f)>

While this currently exists in both implementations, perl itself
does not use it. I<As perl does not use it, it is not well tested.>

Perl prefers to C<dup> the new low-level descriptor to the descriptor
used by the existing PerlIO. This may become the behaviour of this
function in the future.

=item B<PerlIO_printf(f,fmt,...)>, B<PerlIO_vprintf(f,fmt,a)>

These are fprintf()/vfprintf() equivalents.

=item B<PerlIO_stdoutf(fmt,...)>

This is the printf() equivalent. printf is #defined to this function,
so it is (currently) legal to use C<printf(fmt,...)> in perl sources.

=item B<PerlIO_read(f,buf,count)>, B<PerlIO_write(f,buf,count)>

These correspond functionally to fread() and fwrite() but the
arguments and return values are different.  The PerlIO_read() and
PerlIO_write() signatures have been modeled on the more sane low level
read() and write() functions instead: The "file" argument is passed
first, there is only one "count", and the return value can distinguish
between error and C<EOF>.

Returns a byte count if successful (which may be zero or
positive); returns a negative value and sets C<errno> on error.
Depending on the implementation, C<errno> may be C<EINTR> if the operation
was interrupted by a signal.

=item B<PerlIO_close(f)>

Depending on implementation C<errno> may be C<EINTR> if operation was
interrupted by a signal.

=item B<PerlIO_puts(f,s)>, B<PerlIO_putc(f,c)>

These correspond to fputs() and fputc().
Note that arguments have been revised to have "file" first.

=item B<PerlIO_ungetc(f,c)>

This corresponds to ungetc().  Note that arguments have been revised
to have "file" first.  Arranges that next read operation will return
the byte B<c>.  Despite the implied "character" in the name only
values in the range 0..0xFF are defined. Returns the byte B<c> on
success or -1 (C<EOF>) on error.  The number of bytes that can be
"pushed back" may vary; only 1 character is certain, and then only if
it is the last character that was read from the handle.

=item B<PerlIO_getc(f)>

This corresponds to getc().
Despite the c in the name only byte range 0..0xFF is supported.
Returns the character read or -1 (C<EOF>) on error.

=item B<PerlIO_eof(f)>

This corresponds to feof().  Returns a true/false indication of
whether the handle is at end of file.  For terminal devices this may
or may not be "sticky" depending on the implementation.  The flag is
cleared by PerlIO_seek(), or PerlIO_rewind().

=item B<PerlIO_error(f)>

This corresponds to ferror().  Returns a true/false indication of
whether there has been an IO error on the handle.

=item B<PerlIO_fileno(f)>

This corresponds to fileno().  Note that on some platforms the meaning
of "fileno" may not match Unix.  Returns -1 if the handle has no open
descriptor associated with it.

=item B<PerlIO_clearerr(f)>

This corresponds to clearerr(), i.e., clears 'error' and (usually)
'eof' flags for the "stream". Does not return a value.

=item B<PerlIO_flush(f)>

This corresponds to fflush().  Sends any buffered write data to the
underlying file.  If called with C<NULL> this may flush all open
streams (or core dump with some USE_STDIO implementations).  Calling
on a handle open for read only, or on which last operation was a read
of some kind may lead to undefined behaviour on some USE_STDIO
implementations.  The USE_PERLIO (layers) implementation tries to
behave better: it flushes all open streams when passed C<NULL>, and
attempts to retain data on read streams either in the buffer or by
seeking the handle to the current logical position.

=item B<PerlIO_seek(f,offset,whence)>

This corresponds to fseek().  Sends buffered write data to the
underlying file, or discards any buffered read data, then positions
the file descriptor as specified by B<offset> and B<whence> (sic).
This is the correct thing to do when switching between read and write
on the same handle (see issues with PerlIO_flush() above).  Offset is
of type C<Off_t> which is a perl Configure value which may not be same
as stdio's C<off_t>.

=item B<PerlIO_tell(f)>

This corresponds to ftell().  Returns the current file position, or
(Off_t) -1 on error.  May just return value system "knows" without
making a system call or checking the underlying file descriptor (so
use on shared file descriptors is not safe without a
PerlIO_seek()). Return value is of type C<Off_t> which is a perl
Configure value which may not be same as stdio's C<off_t>.

=item B<PerlIO_getpos(f,p)>, B<PerlIO_setpos(f,p)>

These correspond (loosely) to fgetpos() and fsetpos(). Rather than
stdio's Fpos_t they expect a "Perl Scalar Value" to be passed. What is
stored there should be considered opaque. The layout of the data may
vary from handle to handle.  When not using stdio or if platform does
not have the stdio calls then they are implemented in terms of
PerlIO_tell() and PerlIO_seek().

=item B<PerlIO_rewind(f)>

This corresponds to rewind(). It is usually defined as being

    PerlIO_seek(f,(Off_t)0L, SEEK_SET);
    PerlIO_clearerr(f);

=item B<PerlIO_tmpfile()>

This corresponds to tmpfile(), i.e., returns an anonymous PerlIO or
NULL on error.  The system will attempt to automatically delete the
file when closed.  On Unix the file is usually C<unlink>-ed just after
it is created so it does not matter how it gets closed. On other
systems the file may only be deleted if closed via PerlIO_close()
and/or the program exits via C<exit>.  Depending on the implementation
there may be "race conditions" which allow other processes access to
the file, though in general it will be safer in this regard than
ad hoc schemes.

=item B<PerlIO_setlinebuf(f)>

This corresponds to setlinebuf().  Does not return a value. What
constitutes a "line" is implementation dependent but usually means
that writing "\n" flushes the buffer.  What happens with things like
"this\nthat" is uncertain.  (Perl core uses it I<only> when "dumping";
it has nothing to do with $| auto-flush.)

=back

=head2 Co-existence with stdio

There is outline support for co-existence of PerlIO with stdio.
Obviously if PerlIO is implemented in terms of stdio there is no
problem.  However, in other cases mechanisms must exist to create a
FILE * which can be passed to library code which is going to use stdio
calls.

The first step is to add this line:

   #define PERLIO_NOT_STDIO 0

I<before> including any perl header files. (This will probably become
the default at some point).  That prevents "perlio.h" from attempting
to #define stdio functions onto PerlIO functions.

XS code is probably better using "typemap" if it expects FILE *
arguments.  The standard typemap will be adjusted to comprehend any
changes in this area.

=over 4

=item B<PerlIO_importFILE(f,mode)>

Used to get a PerlIO * from a FILE *.

The mode argument should be a string as would be passed to
fopen/PerlIO_open.  If it is NULL then - for legacy support - the code
will (depending upon the platform and the implementation) either
attempt to empirically determine the mode in which I<f> is open, or
use "r+" to indicate a read/write stream.

Once called the FILE * should I<ONLY> be closed by calling
C<PerlIO_close()> on the returned PerlIO *.

The PerlIO is set to textmode. Use PerlIO_binmode if this is
not the desired mode.

This is B<not> the reverse of PerlIO_exportFILE().

=item B<PerlIO_exportFILE(f,mode)>

Given a PerlIO * create a 'native' FILE * suitable for passing to code
expecting to be compiled and linked with ANSI C I<stdio.h>.  The mode
argument should be a string as would be passed to fopen/PerlIO_open.
If it is NULL then - for legacy support - the FILE * is opened in same
mode as the PerlIO *.

The fact that such a FILE * has been 'exported' is recorded, (normally
by pushing a new :stdio "layer" onto the PerlIO *), which may affect
future PerlIO operations on the original PerlIO *.  You should not
call C<fclose()> on the file unless you call C<PerlIO_releaseFILE()>
to disassociate it from the PerlIO *.  (Do not use PerlIO_importFILE()
for doing the disassociation.)

Calling this function repeatedly will create a FILE * on each call
(and will push an :stdio layer each time as well).

=item B<PerlIO_releaseFILE(p,f)>

Calling PerlIO_releaseFILE informs PerlIO that all use of FILE * is
complete. It is removed from the list of 'exported' FILE *s, and the
associated PerlIO * should revert to its original behaviour.

Use this to disassociate a file from a PerlIO * that was associated
using PerlIO_exportFILE().

=item B<PerlIO_findFILE(f)>

Returns a native FILE * used by a stdio layer. If there is none, it
will create one with PerlIO_exportFILE. In either case the FILE *
should be considered as belonging to PerlIO subsystem and should
only be closed by calling C<PerlIO_close()>.


=back

=head2 "Fast gets" Functions

In addition to the standard-like API defined above there is an
"implementation" interface which allows perl to get at the internals of
PerlIO.  The following calls correspond to the various FILE_xxx macros
determined by Configure - or their equivalent in other
implementations. This section is really of interest to only those
concerned with detailed perl-core behaviour, implementing a PerlIO
mapping or writing code which can make use of the "read ahead" that
has been done by the IO system in the same way perl does. Note that
any code that uses these interfaces must be prepared to do things the
traditional way if a handle does not support them.

=over 4

=item B<PerlIO_fast_gets(f)>

Returns true if implementation has all the interfaces required to
allow perl's C<sv_gets> to "bypass" normal IO mechanism.  This can
vary from handle to handle.

  PerlIO_fast_gets(f) = PerlIO_has_cntptr(f) && \
                        PerlIO_canset_cnt(f) && \
                        'Can set pointer into buffer'

=item B<PerlIO_has_cntptr(f)>

Implementation can return pointer to current position in the "buffer"
and a count of bytes available in the buffer.  Do not use this - use
PerlIO_fast_gets.

=item B<PerlIO_get_cnt(f)>

Return count of readable bytes in the buffer. Zero or negative return
means no more bytes available.

=item B<PerlIO_get_ptr(f)>

Return pointer to next readable byte in buffer.  Accessing via the
pointer (dereferencing) is only safe if PerlIO_get_cnt() has returned
a positive value.  Only positive offsets up to the value returned by
PerlIO_get_cnt() are allowed.

=item B<PerlIO_set_ptrcnt(f,p,c)>

Set pointer into buffer, and a count of bytes still in the
buffer. Should be used only to set pointer to within range implied by
previous calls to C<PerlIO_get_ptr> and C<PerlIO_get_cnt>. The two
values I<must> be consistent with each other (implementation may only
use one or the other or may require both).

=item B<PerlIO_canset_cnt(f)>

Implementation can adjust its idea of number of bytes in the buffer.
Do not use this - use PerlIO_fast_gets.

=item B<PerlIO_set_cnt(f,c)>

Obscure - set count of bytes in the buffer. Deprecated.  Only usable
if PerlIO_canset_cnt() returns true.  Currently used in only doio.c to
force count less than -1 to -1.  Perhaps should be PerlIO_set_empty or
similar.  This call may actually do nothing if "count" is deduced from
pointer and a "limit".  Do not use this - use PerlIO_set_ptrcnt().

=item B<PerlIO_has_base(f)>

Returns true if implementation has a buffer, and can return pointer
to whole buffer and its size. Used by perl for B<-T> / B<-B> tests.
Other uses would be very obscure...

=item B<PerlIO_get_base(f)>

Return I<start> of buffer. Access only positive offsets in the buffer
up to the value returned by PerlIO_get_bufsiz().

=item B<PerlIO_get_bufsiz(f)>

Return the I<total number of bytes> in the buffer.  This is neither the
number that can be read, nor the amount of memory allocated to the
buffer.  Rather it is what the operating system and/or implementation
happened to C<read()> (or whatever) last time IO was requested.

=back

=head2 Other Functions

=over 4

=item PerlIO_apply_layers(f,mode,layers)

The new interface to the USE_PERLIO implementation.  The layers ":crlf"
and ":raw" are the only ones allowed for other implementations, and
those are silently ignored.  (As of perl5.8 ":raw" is deprecated.)  Use
PerlIO_binmode() below for the portable case.

=item PerlIO_binmode(f,ptype,imode,layers)

The hook used by perl's C<binmode> operator.
B<ptype> is perl's character for the kind of IO:

=over 8

=item 'E<lt>' read

=item 'E<gt>' write

=item '+' read/write

=back

B<imode> is C<O_BINARY> or C<O_TEXT>.

B<layers> is a string of layers to apply, only ":crlf" makes sense in
the non USE_PERLIO case. (As of perl5.8 ":raw" is deprecated in favour
of passing NULL.)

Portable cases are:

    PerlIO_binmode(f,ptype,O_BINARY,NULL);

and

    PerlIO_binmode(f,ptype,O_TEXT,":crlf");

On Unix these calls probably have no effect whatsoever.  Elsewhere
they alter "\n" to CR,LF translation and possibly cause a special text
"end of file" indicator to be written or honoured on read. The effect
of making the call after doing any IO to the handle depends on the
implementation. (It may be ignored, affect any data which is already
buffered as well, or only apply to subsequent data.)

=item PerlIO_debug(fmt,...)

PerlIO_debug is a printf()-like function which can be used for
debugging.  No return value. Its main use is inside PerlIO where using
real printf, warn() etc. would recursively call PerlIO and be a
problem.

PerlIO_debug writes to the file named by $ENV{'PERLIO_DEBUG'} or defaults
to stderr if the environment variable is not defined. Typical
use might be

  Bourne shells (sh, ksh, bash, zsh, ash, ...):
   PERLIO_DEBUG=/tmp/perliodebug.log ./perl -Di somescript some args

  Csh/Tcsh:
   setenv PERLIO_DEBUG /tmp/perliodebug.log
   ./perl -Di somescript some args

  If you have the "env" utility:
   env PERLIO_DEBUG=/tmp/perliodebug.log ./perl -Di somescript args

  Win32:
   set PERLIO_DEBUG=perliodebug.log
   perl -Di somescript some args

On a Perl built without C<-DDEBUGGING>, or when the C<-Di> command-line switch
is not specified, or under taint, PerlIO_debug() is a no-op.

=back

If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is. But if you have been into Perl you
probably already know this.

=head1 NAME

perlsynology - Perl 5 on Synology DSM systems

=head1 DESCRIPTION

Synology manufactures a vast number of Network Attached Storage (NAS)
devices that are very popular in large organisations as well as small
businesses and homes.

The NAS systems are equipped with Synology Disk Storage Manager (DSM),
which is a trimmed-down Linux system enhanced with several tools for
managing the NAS. There are several flavours of hardware: Marvell
Armada (ARMv5tel, ARMv7l), Intel Atom (i686, x86_64), Freescale QorIQ
(PPC), and more. For a full list see the
L<Synology FAQ|http://forum.synology.com/wiki/index.php/What_kind_of_CPU_does_my_NAS_have>.

Since it is based on Linux, the NAS can run many popular Linux
software packages, including Perl. In fact, Synology provides a
ready-to-install package for Perl. Depending on the version of DSM,
the installed perl ranges from 5.8.6 on DSM-4.3 to 5.18.4 on DSM-6.0.1.

There is an active user community that provides many software packages
for the Synology DSM systems; at the time of writing this document
they provide Perl version 5.18.4.

This document describes various features of Synology DSM operating
system that will affect how Perl 5 (hereafter just Perl) is
configured, compiled and/or runs. It has been compiled and verified by
Johan Vromans for the Synology DS413 (QorIQ), with feedback from
H.Merijn Brand (DS213, ARMv5tel and RS815, Intel Atom x64).

=head2 Setting up the build environment

=head3 DSM 5

As DSM is a trimmed-down Linux system, it lacks many of the tools and
libraries commonly found on Linux. The basic tools like sh, cp, rm,
etc. are implemented using
L<BusyBox|http://en.wikipedia.org/wiki/BusyBox>.

=over 4

=item *

Using your favourite browser open the DSM management page and start
the Package Center.

=item *

If you want to smoke test Perl, install C<Perl>.

=item *

In Settings, add the following Package Sources:

  http://www.cphub.net
  http://packages.quadrat4.de

=item *

Still in Settings, in Channel Update, select Beta Channel.

=item *

Press Refresh. In the left panel the item "Community" will appear.
Click it. Select "Bootstrap Installer Beta" and install it.

=item *

Likewise, install "iPKGui Beta".

The application window should now show an icon for iPKGui.

=item *

Start iPKGui. Install the packages C<make>, C<gcc> and C<coreutils>.

If you want to smoke test Perl, install C<patch>.

=back

The next step is to add some symlinks to system libraries. For
example, the development software expects a library C<libm.so> that
normally is a symlink to C<libm.so.6>. Synology only provides the
latter and not the symlink.

Here the actual architecture of the Synology system matters. You have
to find out where the gcc libraries have been installed. Look in /opt
for a directory similar to arm-none-linux-gnueabi or
powerpc-linux-gnuspe. In the instructions below I'll use
powerpc-linux-gnuspe as an example.

=over 4

=item *

On the DSM management page start the Control Panel.

=item *

Click Terminal, and enable SSH service.

=item *

Close Terminal and the Control Panel.

=item *

Open a shell on the Synology using ssh and become root.

=item *

Execute the following commands:

  cd /lib
  ln -s libm.so.6 libm.so
  ln -s libcrypt.so.1 libcrypt.so
  ln -s libdl.so.2 libdl.so
  cd /opt/powerpc-linux-gnuspe/lib   # or /opt/arm-none-linux-gnueabi/lib
  ln -s /lib/libdl.so.2 libdl.so

=back

B<WARNING:> When you perform a system software upgrade, these links
will disappear and need to be re-established.
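
Because the links vanish on every upgrade, it can help to check for
them before starting a build. The following is a minimal sketch and
not part of any Synology or Perl tooling; the helper name
C<missing_links> is made up for illustration, and the link names are
simply the ones created in F</lib> above.

    use strict;
    use warnings;

    # Hypothetical helper: return the names that do not exist in $dir.
    # -e follows symlinks, so a dangling link also counts as missing.
    sub missing_links {
        my ($dir, @names) = @_;
        return grep { !-e "$dir/$_" } @names;
    }

    my @missing = missing_links('/lib', qw(libm.so libcrypt.so libdl.so));
    print @missing
        ? "Missing in /lib: @missing\n"
        : "All development symlinks are present\n";

If anything is reported as missing, re-run the C<ln -s> commands shown
above.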

=head3 DSM 6

The use of iPkg has been deprecated on DSM 6, but an alternative is
available: entware/opkg. For instructions on how to use it, please read
L<Install Entware-ng on Synology NAS|https://github.com/Entware-ng/Entware-ng/wiki/Install-on-Synology-NAS>.

That sadly does not (yet) work on QorIQ. At the moment of writing, the
supported architectures are armv5, armv7, mipsel, x86_32 and x86_64.

Entware-ng comes with a precompiled Perl 5.22.1 (June 2016) that allows
building shared XS code. Note that this installation does B<not> use
a site_perl folder.

=head2 Compiling Perl 5

When the build environment has been set up, building and testing Perl
is straightforward. The only thing you need to do is download the
sources as usual, and add a file Policy.sh as follows:

  # Administrivia.
  perladmin="your.email@goes.here"

  # Install Perl in a tree in /opt/perl instead of /opt/bin.
  prefix=/opt/perl

  # Select the compiler. Note that there is no 'cc' alias or link.
  cc=gcc

  # Build flags.
  ccflags="-DDEBUGGING"

  # Library and include paths.
  libpth="/lib"
  locincpth="/opt/include"
  loclibpth="/lib"

You may want to create the destination directory and give it the right
permissions before installing, thus eliminating the need to build Perl
as a super user.

In the directory where you unpacked the sources, issue the familiar
commands:

  ./Configure -des
  make
  make test
  make install

=head2 Known problems

=head3 Configure

No known problems yet.

=head3 Build

=over 4

=item Error message "No error definitions found".

This error is generated when it is not possible to find the local
definitions for error codes, due to the uncommon structure of the
Synology file system.

This error was fixed in the Perl development git for version 5.19,
commit 7a8f1212e5482613c8a5b0402528e3105b26ff24.

=back

=head3 Failing tests

=over 4

=item F<ext/DynaLoader/t/DynaLoader.t>

One subtest fails due to the uncommon structure of the Synology file
system. The file F</lib/glibc.so> is missing.

B<WARNING:> Do not symlink F</lib/glibc.so.6> to F</lib/glibc.so> or
some system components will start to fail.

=back

=head2 Smoke testing Perl 5

If building completes successfully, you can set up smoke testing as
described in the Test::Smoke documentation.

For smoke testing you need a running Perl. You can either install the
Synology supplied package for Perl 5.8.6, or build and install your
own, much more recent version.

Note that I could not run successful smokes when initiated by the
Synology Task Scheduler. I resorted to initiating the smokes via a
cron job run on another system, using ssh:

  ssh nas1 wrk/Test-Smoke/smoke/smokecurrent.sh

=head3 Local patches

When local patches are applied with smoke testing, the test driver
will automatically request regeneration of certain tables after the
patches are applied. The Synology supplied Perl 5.8.6 (at least on the
DS413) B<is NOT capable> of generating these tables. It will generate
opcodes with bogus values, causing the build to fail.

You can prevent regeneration by adding the setting

  'flags' => 0,

to the smoke config, or by adding another patch that inserts

  exit 0 if $] == 5.008006;

in the beginning of the C<regen.pl> program.

=head2 Adding libraries

The above procedure describes a basic environment and hence results in
a basic Perl. If you want to add additional libraries to Perl, you may
need some extra settings.

For example, the basic Perl does not have any of the DB libraries (db,
dbm, ndbm, gdbm). You can add these using iPKGui; however, you need to
set environment variable LD_LIBRARY_PATH to the appropriate value:

  LD_LIBRARY_PATH=/lib:/opt/lib
  export LD_LIBRARY_PATH

This setting needs to be in effect while Perl is built, but also when
the programs are run.

=head1 REVISION

June 2016, for Synology DSM 5.1.5022 and DSM 6.0.1-7393.

=head1 AUTHOR

Johan Vromans <jvromans@squirrel.nl>

H. Merijn Brand <h.m.brand@xs4all.nl>

=cut

=encoding utf8

=head1 NAME

perl5260delta - what is new for perl v5.26.0

=head1 DESCRIPTION

This document describes the differences between the 5.24.0 release and the
5.26.0 release.

=head1 Notice

This release includes three updates with widespread effects:

=over 4

=item * C<"."> no longer in C<@INC>

For security reasons, the current directory (C<".">) is no longer included
by default at the end of the module search path (C<@INC>). This may have
widespread implications for the building, testing and installing of
modules, and for the execution of scripts.  See the section
L<< Removal of the current directory (C<".">) from C<@INC> >>
for the full details.

=item * C<do> may now warn

C<do> now gives a deprecation warning when it fails to load a file which
it would have loaded had C<"."> been in C<@INC>.

=item * In regular expression patterns, a literal left brace C<"{">
should be escaped

See L</Unescaped literal C<"{"> characters in regular expression patterns are no longer permissible>.

=back

=head1 Core Enhancements

=head2 Lexical subroutines are no longer experimental

Using the C<lexical_subs> feature introduced in v5.18 no longer emits a warning.  Existing
code that disables the C<experimental::lexical_subs> warning category
that the feature previously used will continue to work.  The
C<lexical_subs> feature has no effect; all Perl code can use lexical
subroutines, regardless of what feature declarations are in scope.

=head2 Indented Here-documents

This adds a new modifier C<"~"> to here-docs that tells the parser
that it should look for C</^\s*$DELIM\n/> as the closing delimiter.

These syntaxes are all supported:

    <<~EOF;
    <<~\EOF;
    <<~'EOF';
    <<~"EOF";
    <<~`EOF`;
    <<~ 'EOF';
    <<~ "EOF";
    <<~ `EOF`;

The C<"~"> modifier will strip, from each line in the here-doc, the
same whitespace that appears before the delimiter.

Newlines will be copied as-is, and lines that don't include the
proper beginning whitespace will cause perl to croak.

For example:

    if (1) {
      print <<~EOF;
        Hello there
        EOF
    }

prints "Hello there\n" with no leading whitespace.
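
Only the indentation of the closing delimiter is removed, so any
extra indentation inside the text survives. The sketch below
illustrates this; the sub name C<usage> and the text are invented for
the example, and perl 5.26 or later is assumed.

    use 5.026;
    use warnings;

    sub usage {
        # The delimiter is indented by eight spaces, so eight spaces
        # are stripped from every line; the option lines keep two.
        return <<~EOT;
            Usage: frob [options]
              -v  verbose
              -h  help
            EOT
    }

    print usage();

The two option lines are printed with two leading spaces each, and the
first line with none.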

=head2 New regular expression modifier C</xx>

Specifying two C<"x"> characters to modify a regular expression pattern
does everything that a single one does, but additionally TAB and SPACE
characters within a bracketed character class are generally ignored and
can be added to improve readability, like
S<C</[ ^ A-Z d-f p-x ]/xx>>.  Details are at
L<perlre/E<sol>x and E<sol>xx>.
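
As a short illustration (a sketch only; the sub name C<is_hex> is
hypothetical), the blanks inside the bracketed class below are ignored
under C</xx>, so a string that really contains a space does not match:

    use 5.026;
    use warnings;

    # Under /xx the spaces inside [...] are layout, not members
    # of the character class.
    sub is_hex {
        my ($s) = @_;
        return $s =~ / \A [ 0-9 a-f A-F ]+ \z /xx ? 1 : 0;
    }

    for my $s ('DEADbeef', 'dead beef') {
        printf "%-10s %s\n", $s, is_hex($s) ? 'matches' : 'does not match';
    }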

=head2 C<@{^CAPTURE}>, C<%{^CAPTURE}>, and C<%{^CAPTURE_ALL}>

C<@{^CAPTURE}> exposes the capture buffers of the last match as an
array.  So C<$1> is C<${^CAPTURE}[0]>.  This is a more efficient equivalent
to code like C<substr($matched_string,$-[0],$+[0]-$-[0])>, and you don't
have to keep track of the C<$matched_string> either.  This variable has no
single character equivalent.  Note that, like the other regex magic variables,
the contents of this variable are dynamic; if you wish to store them beyond
the lifetime of the match you must copy them to another array.

C<%{^CAPTURE}> is equivalent to C<%+> (I<i.e.>, named captures).  Other than
being more self-documenting there is no difference between the two forms.

C<%{^CAPTURE_ALL}> is equivalent to C<%-> (I<i.e.>, all named captures).
Other than being more self-documenting there is no difference between the
two forms.
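
The following sketch shows one way these variables might be used; the
sub name C<parse_date> and the sample text are assumptions made for
the example. Both variables are copied straight after the match,
because their contents change with the next successful match.

    use 5.026;
    use warnings;

    sub parse_date {
        my ($text) = @_;
        return
            unless $text =~ /(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)/;

        # Copy immediately: the values belong to the last match only.
        my @groups = @{^CAPTURE};    # same values as ($1, $2, $3)
        my %named  = %{^CAPTURE};    # same contents as %+
        return (\@groups, \%named);
    }

    my ($groups, $named) = parse_date('released 2017-05-30');
    print "groups: @$groups\n";
    print "month:  $named->{month}\n";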

=head2 Declaring a reference to a variable

As an experimental feature, Perl now allows the referencing operator to come
after L<C<my()>|perlfunc/my>, L<C<state()>|perlfunc/state>,
L<C<our()>|perlfunc/our>, or L<C<local()>|perlfunc/local>.  This syntax must
be enabled with C<use feature 'declared_refs'>.  It is experimental, and will
warn by default unless C<no warnings 'experimental::declared_refs'> is in effect.
It is intended mainly for use in assignments to references.  For example:

    use experimental 'refaliasing', 'declared_refs';
    my \$a = \$b;

See L<perlref/Assigning to References> for more details.
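
A slightly larger sketch, under the assumption that both experimental
features are acceptable in your code: the sub name C<double_in_place>
is hypothetical, and both warning categories are silenced explicitly
rather than through the C<experimental> module.

    use strict;
    use warnings;
    use feature qw(refaliasing declared_refs);
    no warnings qw(experimental::refaliasing experimental::declared_refs);

    sub double_in_place {
        my ($numbers) = @_;
        my \@alias = $numbers;    # @alias is another name for @$numbers
        $_ *= 2 for @alias;       # so this changes the caller's array
        return;
    }

    my @values = (1, 2, 3);
    double_in_place(\@values);
    print "@values\n";            # prints "2 4 6"

Because C<@alias> is an alias and not a copy, no dereferencing syntax
is needed inside the sub.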

=head2 Unicode 9.0 is now supported

A list of changes is at L<http://www.unicode.org/versions/Unicode9.0.0/>.
Modules that are shipped with core Perl but not maintained by p5p do not
necessarily support Unicode 9.0.  L<Unicode::Normalize> does work on 9.0.

=head2 Use of C<\p{I<script>}> uses the improved Script_Extensions property

Unicode 6.0 introduced an improved form of the Script (C<sc>) property, and
called it Script_Extensions (C<scx>).  Perl now uses this improved
version when a property is specified as just C<\p{I<script>}>.  This
should make programs more accurate when determining if a character is
used in a given script, but there is a slight chance of breakage for
programs that very specifically needed the old behavior.  The meaning of
compound forms, like C<\p{sc=I<script>}>, is unchanged.  See
L<perlunicode/Scripts>.

=head2 Perl can now do default collation in UTF-8 locales on platforms
that support it

Some platforms natively do a reasonable job of collating and sorting in
UTF-8 locales.  Perl now works with those.  For portability and full
control, L<Unicode::Collate> is still recommended, but now you may
not need to do anything special to get good-enough results, depending on
your application.  See
L<perllocale/Category C<LC_COLLATE>: Collation: Text Comparisons and Sorting>.

=head2 Better locale collation of strings containing embedded C<NUL>
characters

In locales that have multi-level character weights, C<NUL>s are now
ignored at the higher priority ones.  There are still some gotchas in
some strings, though.  See
L<perllocale/Collation of strings containing embedded C<NUL> characters>.

=head2 C<CORE> subroutines for hash and array functions callable via
reference

The hash and array functions in the C<CORE> namespace (C<keys>, C<each>,
C<values>, C<push>, C<pop>, C<shift>, C<unshift> and C<splice>) can now
be called with ampersand syntax (C<&CORE::keys(\%hash)>) and via reference
(C<< my $k = \&CORE::keys; $k->(\%hash) >>).  Previously they could only be
used when inlined.
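
One possible use is a dispatch table of built-ins. This is only a
sketch: the table name C<%op> and the sample data are invented, and
perl 5.26 or later is assumed. Note that the array or hash is passed
by reference when the function is called this way.

    use 5.026;
    use warnings;

    # Built-ins taken by reference and stored in a table.
    my %op = (
        push => \&CORE::push,
        pop  => \&CORE::pop,
        keys => \&CORE::keys,
    );

    my @stack = (1, 2);
    $op{push}->(\@stack, 3, 4);
    my $top = $op{pop}->(\@stack);

    my %colour = (red => 1, green => 2);
    my @names  = $op{keys}->(\%colour);
    @names = sort @names;

    print "top=$top stack=@stack keys=@names\n";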

=head2 New Hash Function For 64-bit Builds

We have switched to a hybrid hash function to better balance
performance for short and long keys.

For short keys, 16 bytes and under, we use an optimised variant of
One At A Time Hard, and for longer keys we use Siphash 1-3.  For very
long keys this is a big improvement in performance.  For shorter keys
there is a modest improvement.

=head1 Security

=head2 Removal of the current directory (C<".">) from C<@INC>

The perl binary includes a default set of paths in C<@INC>.  Historically
it has also included the current directory (C<".">) as the final entry,
unless run with taint mode enabled (C<perl -T>).  While convenient, this has
security implications: for example, where a script attempts to load an
optional module when its current directory is untrusted (such as F</tmp>),
it could load and execute code from under that directory.

Starting with v5.26, C<"."> is always removed by default, not just under
tainting.  This has major implications for installing modules and executing
scripts.

The following new features have been added to help ameliorate these
issues.

=over

=item * F<Configure -Udefault_inc_excludes_dot>

There is a new F<Configure> option, C<default_inc_excludes_dot> (enabled
by default) which builds a perl executable without C<".">; unsetting this
option using C<-U> reverts perl to the old behaviour.  This may fix your
path issues but will reintroduce all the security concerns, so don't
build a perl executable like this unless you're I<really> confident that
such issues are not a concern in your environment.

=item * C<PERL_USE_UNSAFE_INC>

There is a new environment variable recognised by the perl interpreter.
If this variable has the value 1 when the perl interpreter starts up,
then C<"."> will be automatically appended to C<@INC> (except under tainting).

This allows you to restore the old perl interpreter behaviour on a
case-by-case basis.  But note that this is intended to be a temporary crutch,
and this feature will likely be removed in some future perl version.
It is currently set by the C<cpan> utility and C<Test::Harness> to
ease installation of CPAN modules which have not been updated to handle the
lack of dot.  Once again, don't use this unless you are sure that this
will not reintroduce any security concerns.

=item * A new deprecation warning issued by C<do>.

While it is well-known that C<use> and C<require> use C<@INC> to search
for the file to load, many people don't realise that C<do "file"> also
searches C<@INC> if the file is a relative path.  With the removal of C<".">,
a simple C<do "file.pl"> will fail to read in and execute C<file.pl> from
the current directory.  Since this is commonly expected behaviour, a new
deprecation warning is now issued whenever C<do> fails to load a file which
it otherwise would have found if a dot had been in C<@INC>.

=back
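
As a minimal sketch of coping with the C<do> change, the helper below
runs a file relative to the current directory by handing C<do> an
explicit C<./> path, so that C<@INC> is never searched. The name
C<do_local> is hypothetical, Unix-style paths are assumed, and the
file is expected to return a defined value.

    use strict;
    use warnings;
    use File::Spec;

    # Hypothetical helper: load a file from the current directory
    # without relying on "." being in @INC.
    sub do_local {
        my ($name) = @_;
        my $path = File::Spec->file_name_is_absolute($name)
                 ? $name
                 : "./$name";
        my $value = do $path;
        return $value if defined $value;
        die $@ ? "Cannot compile $path: $@" : "Cannot read $path: $!";
    }

    if (-e 'foo_config.pl') {
        my $config = do_local('foo_config.pl');
        print "loaded foo_config.pl\n";
    }

Only use something like this when the current directory is trusted.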

Here are some things script and module authors may need to do to make
their software work in the new regime.

=over

=item * Script authors

If the issue is within your own code (rather than within included
modules), then you have two main options.  Firstly, if you are confident
that your script will only be run within a trusted directory (under which
you expect to find trusted files and modules), then add C<"."> back into the
path; I<e.g.>:

    BEGIN {
        my $dir = "/some/trusted/directory";
        chdir $dir or die "Can't chdir to $dir: $!\n";
        # safe now
        push @INC, '.';
    }

    use Foo::Bar;   # may load /some/trusted/directory/Foo/Bar.pm
    do "config.pl"; # may load /some/trusted/directory/config.pl

On the other hand, if your script is intended to be run from within
untrusted directories (such as F</tmp>), then your script suddenly failing
to load files may be indicative of a security issue.  You most likely want
to replace any relative paths with full paths; for example,

    do "foo_config.pl"

might become

    do "$ENV{HOME}/foo_config.pl"

If you are absolutely certain that you want your script to load and
execute a file from the current directory, then use a C<./> prefix; for
example:

    do "./foo_config.pl"

=item * Installing and using CPAN modules

If you install a CPAN module using an automatic tool like C<cpan>, then
this tool will itself set the C<PERL_USE_UNSAFE_INC> environment variable
while building and testing the module, which may be sufficient to install
a distribution which hasn't been updated to be dot-aware.  If you want to
install such a module manually, then you'll need to replace the
traditional invocation:

    perl Makefile.PL && make && make test && make install

with something like

    (export PERL_USE_UNSAFE_INC=1; \
     perl Makefile.PL && make && make test && make install)

Note that this only helps build and install an unfixed module.  It's
possible for the tests to pass (since they were run under
C<PERL_USE_UNSAFE_INC=1>), but for the module itself to fail to perform
correctly in production.  In this case, you may have to temporarily modify
your script until a fixed version of the module is released.
For example:

    use Foo::Bar;
    {
        local @INC = (@INC, '.');
        # assuming read_config() needs '.' in @INC
        $config = Foo::Bar->read_config();
    }

This is only rarely expected to be necessary.  Again, if doing this,
assess the resultant risks first.

=item * Module Authors

If you maintain a CPAN distribution, it may need updating to run in
a dotless environment.  Although C<cpan> and other such tools will
currently set C<PERL_USE_UNSAFE_INC> during module build, this is a
temporary workaround for the set of modules which rely on C<"."> being in
C<@INC> for installation and testing, and this may mask deeper issues.  It
could result in a module which passes tests and installs, but which
fails at run time.

During build, test, and install, it will normally be the case that any perl
processes will be executing directly within the root directory of the
untarred distribution, or a known subdirectory of that, such as F<t/>.  It
may well be that F<Makefile.PL> or F<t/foo.t> will attempt to include
local modules and configuration files using their direct relative
filenames, which will now fail.

However, as described above, automatic tools like F<cpan> will (for now)
set the C<PERL_USE_UNSAFE_INC> environment variable, which reintroduces
dot into C<@INC> during a build.

This makes it likely that your existing build and test code will work, but
this may mask issues with your code which only manifest when used after
install.  It is prudent to try to run your build process with that
variable explicitly disabled:

    (export PERL_USE_UNSAFE_INC=0; \
     perl Makefile.PL && make && make test && make install)

This is more likely to show up any potential problems with your module's
build process, or even with the module itself.  Fixing such issues will
ensure both that your module can again be installed manually, and that
it will still build once the C<PERL_USE_UNSAFE_INC> crutch goes away.

When fixing issues in tests due to the removal of dot from C<@INC>,
reinsertion of dot into C<@INC> should be performed with caution, for this
too may suppress real errors in your runtime code.  You are encouraged
wherever possible to apply the aforementioned approaches with explicit
absolute/relative paths, or to relocate your needed files into a
subdirectory and insert that subdirectory into C<@INC> instead.
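
As a minimal sketch of the subdirectory approach, a test file can add a
directory relative to its own location instead of relying on the current
directory.  The F<lib> subdirectory next to the script and the
C<MyTestHelper> module mentioned in the comment are hypothetical:

    use strict;
    use warnings;
    use FindBin qw($Bin);   # $Bin is the directory holding this script
    use lib "$Bin/lib";     # e.g. t/lib when this file is t/foo.t

    # require MyTestHelper; # hypothetical helper that would live in t/lib
    my $found = grep { $_ eq "$Bin/lib" } @INC;
    print "helper directory is in \@INC\n" if $found;

Because the path is derived from the script's location, this does not depend
on which directory the test is run from.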

If your runtime code has problems under the dotless C<@INC>, then the comments
above on how to fix for script authors will mostly apply here too.  Bear in
mind though that it is considered bad form for a module to globally add a dot to
C<@INC>, since doing so both introduces a security risk and hides issues of
accidentally requiring dot in C<@INC>, as explained above.

=back

=head2 Escaped colons and relative paths in PATH

On Unix systems, Perl treats any relative paths in the C<PATH> environment
variable as tainted when starting a new process.  Previously, it
allowed a backslash to escape a colon (unlike the OS), consequently
allowing relative paths to be considered safe if the PATH was set to
something like C</\:.>.  The check has been fixed to treat C<"."> as tainted
in that example.

=head2 New C<-Di> switch is now required for PerlIO debugging output

This is used for debugging of code within PerlIO to avoid recursive
calls.  Previously this output would be sent to the file specified
by the C<PERLIO_DEBUG> environment variable if perl wasn't running
setuid and the C<-T> or C<-t> switches hadn't been parsed yet.

If perl performed output at a point where it hadn't yet parsed its
switches this could result in perl creating or overwriting the file
named by C<PERLIO_DEBUG> even when the C<-T> switch had been supplied.

Perl now requires the C<-Di> switch to be present before it will produce
PerlIO debugging output.  By default this is written to C<stderr>, but can
optionally be redirected to a file by setting the C<PERLIO_DEBUG>
environment variable.

If perl is running setuid or the C<-T> switch was supplied,
C<PERLIO_DEBUG> is ignored and the debugging output is sent to
C<stderr> as for any other C<-D> switch.

=head1 Incompatible Changes

=head2 Unescaped literal C<"{"> characters in regular expression
patterns are no longer permissible

You now have to say something like C<"\{"> or C<"[{]"> to match a literal
LEFT CURLY BRACKET; otherwise, it is a fatal pattern compilation
error.  This change will allow future extensions to the language.

These have been deprecated since v5.16, with a deprecation message
raised for some uses starting in v5.22.  Unfortunately, the code added
to raise the message was buggy and failed to warn in some cases where
it should have.  Therefore, enforcement of this ban for these cases is
deferred until Perl 5.30, but the code has been fixed to raise a
default-on deprecation message for them in the meantime.

Some uses of literal C<"{"> occur in contexts where we do not foresee
the meaning ever being anything but the literal, such as the very first
character in the pattern, or after a C<"|"> meaning alternation.  Thus

 qr/{fee|{fie/

matches either of the strings C<{fee> or C<{fie>.  To avoid forcing
unnecessary code changes, these uses do not need to be escaped, and no
warning is raised about them, and there are no current plans to change this.

But it is always correct to escape C<"{">, and the simple rule to
remember is to always do so.

See L<Unescaped left brace in regex is illegal here|perldiag/Unescaped left brace in regex is illegal here in regex; marked by S<E<lt>-- HERE> in mE<sol>%sE<sol>>.
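
The two spellings mentioned above can be sketched as follows; the sample
string and variable names are purely illustrative:

    use strict;
    use warnings;

    my $text = 'hash{key}';

    # Both spellings match a literal LEFT CURLY BRACKET.
    my $escaped   = $text =~ /hash\{key/  ? 1 : 0;
    my $bracketed = $text =~ /hash[{]key/ ? 1 : 0;

    print "escaped=$escaped bracketed=$bracketed\n";

Either form behaves the same on older perls, so escaping now should not
affect backward compatibility.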

=head2 C<scalar(%hash)> return signature changed

The value returned for C<scalar(%hash)> will no longer show information about
the buckets allocated in the hash.  It will simply return the count of used
keys.  It is thus equivalent to C<0+keys(%hash)>.

A form of backward compatibility is provided via
L<C<Hash::Util::bucket_ratio()>|Hash::Util/bucket_ratio> which provides
the same behavior as
C<scalar(%hash)> provided in Perl 5.24 and earlier.
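
A short sketch of the difference, assuming Perl 5.26 or later (where
C<Hash::Util::bucket_ratio()> is available); the hash contents are
illustrative only:

    use strict;
    use warnings;
    use Hash::Util ();

    my %colour = (red => 1, green => 2, blue => 3);

    # Portable key count; scalar(%colour) gives the same on 5.26 and later.
    my $count = keys %colour;

    # The old "used/allocated" style string, such as "3/8".  The exact
    # value depends on the hash internals.
    my $ratio = Hash::Util::bucket_ratio(%colour);

    print "keys=$count ratio=$ratio\n";

Code that only tested C<scalar(%hash)> for truth should be unaffected, since
both the old and new return values are true exactly when the hash is
non-empty.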

=head2 C<keys> returned from an lvalue subroutine

C<keys> returned from an lvalue subroutine can no longer be assigned
to in list context.

    sub foo : lvalue { keys(%INC) }
    (foo) = 3; # death
    sub bar : lvalue { keys(@_) }
    (bar) = 3; # also an error

This makes the lvalue sub case consistent with C<(keys %hash) = ...> and
C<(keys @_) = ...>, which are also errors.
L<[perl #128187]|https://rt.perl.org/Public/Bug/Display.html?id=128187>

=head2 The C<${^ENCODING}> facility has been removed

The special behaviour associated with assigning a value to this variable
has been removed.  As a consequence, the L<encoding> pragma's default mode
is no longer supported.  If
you still need to write your source code in encodings other than UTF-8, use a
source filter such as L<Filter::Encoding> on CPAN or L<encoding>'s C<Filter>
option.

=head2 C<POSIX::tmpnam()> has been removed

The fundamentally unsafe C<tmpnam()> interface was deprecated in
Perl 5.22 and has now been removed.  In its place, you can use,
for example, the L<File::Temp> interfaces.
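
One possible replacement, sketched with C<File::Temp::tempfile()>; the
template name C<scratchXXXXXX> is an arbitrary choice for illustration:

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # tempfile() creates and opens the file in one step, avoiding the race
    # inherent in asking tmpnam() for a name and opening it later.
    my ($fh, $filename) = tempfile('scratchXXXXXX', TMPDIR => 1, UNLINK => 1);

    print {$fh} "temporary data\n";
    close $fh or die "close $filename: $!";

    print "wrote ", -s $filename, " bytes to $filename\n";

With C<< UNLINK => 1 >> the file is removed when the program exits, so no
explicit cleanup is needed in this sketch.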

=head2 C<require ::Foo::Bar> is now illegal

Formerly, C<require ::Foo::Bar> would try to read F</Foo/Bar.pm>.  Now any
bareword require which starts with a double colon dies instead.

=head2 Literal control character variable names are no longer permissible

A variable name may no longer contain a literal control character under
any circumstances.  These previously were allowed in single-character
names on ASCII platforms, but have been deprecated there since Perl
5.20.  This affects things like C<$I<\cT>>, where I<\cT> is a literal
control (such as a C<NAK> or C<NEGATIVE ACKNOWLEDGE> character) in the
source code.

=head2 C<NBSP> is no longer permissible in C<\N{...}>

The name of a character may no longer contain non-breaking spaces.  It
has been deprecated to do so since Perl 5.22.

=head1 Deprecations

=head2 String delimiters that aren't stand-alone graphemes are now deprecated

For Perl to eventually allow string delimiters to be Unicode
grapheme clusters (which look like a single character, but may be
a sequence of several characters), we have to stop allowing a single-character
delimiter that isn't a grapheme by itself.  These are unlikely to exist
in actual code, as they would typically display as attached to the
character in front of them.

=head2 C<\cI<X>> that maps to a printable is no longer deprecated

This means we have no plans to remove this feature.  It still raises a
warning, but only if syntax warnings are enabled.  The feature was
originally intended to be a way to express non-printable characters that
don't have a mnemonic (C<\t> and C<\n> are mnemonics for two
non-printable characters, but most non-printables don't have a
mnemonic.)  But the feature can be used to specify a few printable
characters, though those are more clearly expressed as the printable
itself.  See
L<http://www.nntp.perl.org/group/perl.perl5.porters/2017/02/msg242944.html>.

=head1 Performance Enhancements

=over 4

=item *

A hash in boolean context is now sometimes faster, I<e.g.>

    if (!%h) { ... }

This was already special-cased, but some cases were missed (such as
C<grep %$_, @AoH>), and even the ones which weren't have been improved.

=item * New Faster Hash Function on 64 bit builds

We use a different hash function for short and long keys.  This should
improve performance and security, especially for long keys.

=item * readline is faster

Reading from a file line-by-line with C<readline()> or C<< E<lt>E<gt> >> should
now typically be faster due to a better implementation of the code that
searches for the next newline character.

=item *

Assigning one reference to another, I<e.g.> C<$ref1 = $ref2> has been
optimized in some cases.

=item *

Some exceptions to creating Copy-on-Write strings have been removed.  The
string buffer growth algorithm has been slightly altered so that you're less
likely to encounter a string which can't be COWed.

=item *

Array and hash assignment is now better optimised: where an array or hash appears
in the LHS of a list assignment, such as C<(..., @a) = (...);>, it's
likely to be considerably faster, especially if it involves emptying the
array/hash. For example, this code runs about a third faster compared to
Perl 5.24.0:

    my @a;
    for my $i (1..10_000_000) {
        @a = (1,2,3);
        @a = ();
    }

=item *

Converting a single-digit string to a number is now substantially faster.

=item *

The C<split> builtin is now slightly faster in many cases: in particular
for the two specially-handled forms

    my    @a = split ...;
    local @a = split ...;

=item *

The rather slow implementation for the experimental subroutine signatures
feature has been made much faster; it is now comparable in speed with the
traditional C<my ($a, $b, @c) = @_>.

=item *

Bareword constant strings are now permitted to take part in constant
folding.  They were originally exempted from constant folding in August 1999,
during the development of Perl 5.6, to ensure that C<use strict "subs">
would still apply to bareword constants.  That has now been accomplished a
different way, so barewords, like other constants, now gain the performance
benefits of constant folding.

This also means that void-context warnings on constant expressions of
barewords now report the folded constant operand, rather than the operation;
this matches the behaviour for non-bareword constants.

=back

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<IO::Compress> has been upgraded from version 2.069 to 2.074.

=item *

L<Archive::Tar> has been upgraded from version 2.04 to 2.24.

=item *

L<arybase> has been upgraded from version 0.11 to 0.12.

=item *

L<attributes> has been upgraded from version 0.27 to 0.29.

The deprecation messages for the C<:unique> and C<:locked> attributes
now mention that they will disappear in Perl 5.28.

=item *

L<B> has been upgraded from version 1.62 to 1.68.

=item *

L<B::Concise> has been upgraded from version 0.996 to 0.999.

Its output is now more descriptive for C<op_private> flags.

=item *

L<B::Debug> has been upgraded from version 1.23 to 1.24.

=item *

L<B::Deparse> has been upgraded from version 1.37 to 1.40.

=item *

L<B::Xref> has been upgraded from version 1.05 to 1.06.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=item *

L<base> has been upgraded from version 2.23 to 2.25.

=item *

L<bignum> has been upgraded from version 0.42 to 0.47.

=item *

L<Carp> has been upgraded from version 1.40 to 1.42.

=item *

L<charnames> has been upgraded from version 1.43 to 1.44.

=item *

L<Compress::Raw::Bzip2> has been upgraded from version 2.069 to 2.074.

=item *

L<Compress::Raw::Zlib> has been upgraded from version 2.069 to 2.074.

=item *

L<Config::Perl::V> has been upgraded from version 0.25 to 0.28.

=item *

L<CPAN> has been upgraded from version 2.11 to 2.18.

=item *

L<CPAN::Meta> has been upgraded from version 2.150005 to 2.150010.

=item *

L<Data::Dumper> has been upgraded from version 2.160 to 2.167.

The XS implementation now supports Deparse.

=item *

L<DB_File> has been upgraded from version 1.835 to 1.840.

=item *

L<Devel::Peek> has been upgraded from version 1.23 to 1.26.

=item *

L<Devel::PPPort> has been upgraded from version 3.32 to 3.35.

=item *

L<Devel::SelfStubber> has been upgraded from version 1.05 to 1.06.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=item *

L<diagnostics> has been upgraded from version 1.34 to 1.36.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=item *

L<Digest> has been upgraded from version 1.17 to 1.17_01.

=item *

L<Digest::MD5> has been upgraded from version 2.54 to 2.55.

=item *

L<Digest::SHA> has been upgraded from version 5.95 to 5.96.

=item *

L<DynaLoader> has been upgraded from version 1.38 to 1.42.

=item *

L<Encode> has been upgraded from version 2.80 to 2.88.

=item *

L<encoding> has been upgraded from version 2.17 to 2.19.

This module's default mode is no longer supported.  It now
dies when imported, unless the C<Filter> option is being used.

=item *

L<encoding::warnings> has been upgraded from version 0.12 to 0.13.

This module is no longer supported.  It emits a warning to
that effect and then does nothing.

=item *

L<Errno> has been upgraded from version 1.25 to 1.28.

It now documents that using C<%!> automatically loads Errno for you.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=item *

L<ExtUtils::Embed> has been upgraded from version 1.33 to 1.34.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=item *

L<ExtUtils::MakeMaker> has been upgraded from version 7.10_01 to 7.24.

=item *

L<ExtUtils::Miniperl> has been upgraded from version 1.05 to 1.06.

=item *

L<ExtUtils::ParseXS> has been upgraded from version 3.31 to 3.34.

=item *

L<ExtUtils::Typemaps> has been upgraded from version 3.31 to 3.34.

=item *

L<feature> has been upgraded from version 1.42 to 1.47.

=item *

L<File::Copy> has been upgraded from version 2.31 to 2.32.

=item *

L<File::Fetch> has been upgraded from version 0.48 to 0.52.

=item *

L<File::Glob> has been upgraded from version 1.26 to 1.28.

It now issues a deprecation message for C<File::Glob::glob()>.

=item *

L<File::Spec> has been upgraded from version 3.63 to 3.67.

=item *

L<FileHandle> has been upgraded from version 2.02 to 2.03.

=item *

L<Filter::Simple> has been upgraded from version 0.92 to 0.93.

It no longer treats C<no MyFilter> immediately following C<use MyFilter> as
end-of-file.
L<[perl #107726]|https://rt.perl.org/Public/Bug/Display.html?id=107726>

=item *

L<Getopt::Long> has been upgraded from version 2.48 to 2.49.

=item *

L<Getopt::Std> has been upgraded from version 1.11 to 1.12.

=item *

L<Hash::Util> has been upgraded from version 0.19 to 0.22.

=item *

L<HTTP::Tiny> has been upgraded from version 0.056 to 0.070.

Internal 599-series errors now include the redirect history.

=item *

L<I18N::LangTags> has been upgraded from version 0.40 to 0.42.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=item *

L<IO> has been upgraded from version 1.36 to 1.38.

=item *

L<IO::Socket::IP> has been upgraded from version 0.37 to 0.38.

=item *

L<IPC::Cmd> has been upgraded from version 0.92 to 0.96.

=item *

L<IPC::SysV> has been upgraded from version 2.06_01 to 2.07.

=item *

L<JSON::PP> has been upgraded from version 2.27300 to 2.27400_02.

=item *

L<lib> has been upgraded from version 0.63 to 0.64.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=item *

L<List::Util> has been upgraded from version 1.42_02 to 1.46_02.

=item *

L<Locale::Codes> has been upgraded from version 3.37 to 3.42.

=item *

L<Locale::Maketext> has been upgraded from version 1.26 to 1.28.

=item *

L<Locale::Maketext::Simple> has been upgraded from version 0.21 to 0.21_01.

=item *

L<Math::BigInt> has been upgraded from version 1.999715 to 1.999806.

=item *

L<Math::BigInt::FastCalc> has been upgraded from version 0.40 to 0.5005.

=item *

L<Math::BigRat> has been upgraded from version 0.260802 to 0.2611.

=item *

L<Math::Complex> has been upgraded from version 1.59 to 1.5901.

=item *

L<Memoize> has been upgraded from version 1.03 to 1.03_01.

=item *

L<Module::CoreList> has been upgraded from version 5.20170420 to 5.20170530.

=item *

L<Module::Load::Conditional> has been upgraded from version 0.64 to 0.68.

=item *

L<Module::Metadata> has been upgraded from version 1.000031 to 1.000033.

=item *

L<mro> has been upgraded from version 1.18 to 1.20.

=item *

L<Net::Ping> has been upgraded from version 2.43 to 2.55.

IPv6 addresses and C<AF_INET6> sockets are now supported, along with several
other enhancements.

=item *

L<NEXT> has been upgraded from version 0.65 to 0.67.

=item *

L<Opcode> has been upgraded from version 1.34 to 1.39.

=item *

L<open> has been upgraded from version 1.10 to 1.11.

=item *

L<OS2::Process> has been upgraded from version 1.11 to 1.12.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=item *

L<overload> has been upgraded from version 1.26 to 1.28.

Its compilation speed has been improved slightly.

=item *

L<parent> has been upgraded from version 0.234 to 0.236.

=item *

L<perl5db.pl> has been upgraded from version 1.50 to 1.51.

It now ignores F</dev/tty> on non-Unix systems.
L<[perl #113960]|https://rt.perl.org/Public/Bug/Display.html?id=113960>

=item *

L<Perl::OSType> has been upgraded from version 1.009 to 1.010.

=item *

L<perlfaq> has been upgraded from version 5.021010 to 5.021011.

=item *

L<PerlIO> has been upgraded from version 1.09 to 1.10.

=item *

L<PerlIO::encoding> has been upgraded from version 0.24 to 0.25.

=item *

L<PerlIO::scalar> has been upgraded from version 0.24 to 0.26.

=item *

L<Pod::Checker> has been upgraded from version 1.60 to 1.73.

=item *

L<Pod::Functions> has been upgraded from version 1.10 to 1.11.

=item *

L<Pod::Html> has been upgraded from version 1.22 to 1.2202.

=item *

L<Pod::Perldoc> has been upgraded from version 3.25_02 to 3.28.

=item *

L<Pod::Simple> has been upgraded from version 3.32 to 3.35.

=item *

L<Pod::Usage> has been upgraded from version 1.68 to 1.69.

=item *

L<POSIX> has been upgraded from version 1.65 to 1.76.

This remedies several defects in making its symbols exportable.
L<[perl #127821]|https://rt.perl.org/Public/Bug/Display.html?id=127821>

The C<POSIX::tmpnam()> interface has been removed,
see L</"POSIX::tmpnam() has been removed">.

The following deprecated functions have been removed:

    POSIX::isalnum
    POSIX::isalpha
    POSIX::iscntrl
    POSIX::isdigit
    POSIX::isgraph
    POSIX::islower
    POSIX::isprint
    POSIX::ispunct
    POSIX::isspace
    POSIX::isupper
    POSIX::isxdigit
    POSIX::tolower
    POSIX::toupper

Trying to import POSIX subs that have no real implementations
(like C<POSIX::atend()>) now fails at import time, instead of
waiting until runtime.
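
Code that relied on the removed character-class functions can usually switch
to POSIX bracket classes in a regular expression.  The helper names
C<is_alpha> and C<is_digit> below are hypothetical, and unlike the old
functions this sketch treats an empty string as false:

    use strict;
    use warnings;

    # Stand-ins for the removed POSIX::isalpha() and POSIX::isdigit(): true
    # only if the string is non-empty and every character is in the class.
    sub is_alpha { my ($s) = @_; return $s =~ /\A[[:alpha:]]+\z/ ? 1 : 0 }
    sub is_digit { my ($s) = @_; return $s =~ /\A[[:digit:]]+\z/ ? 1 : 0 }

    print is_alpha('perl'), is_alpha('perl5'), is_digit('526'), "\n";

C<POSIX::tolower> and C<POSIX::toupper> correspond to the C<lc> and C<uc>
builtins.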

=item *

L<re> has been upgraded from version 0.32 to 0.34.

This adds support for the new L<C<E<47>xx>|perlre/E<sol>x and E<sol>xx>
regular expression pattern modifier, and a change to the L<S<C<use re
'strict'>>|re/'strict' mode> experimental feature.  When S<C<re
'strict'>> is enabled, a warning now will be generated for all
unescaped uses of the two characters C<"}"> and C<"]"> in regular
expression patterns (outside bracketed character classes) that are taken
literally.  This brings them more in line with the C<")"> character which
is always a metacharacter unless escaped.  Being a metacharacter only
sometimes, depending on an action at a distance, can lead to silently
having the pattern mean something quite different than was intended,
which the S<C<re 'strict'>> mode is intended to minimize.

=item *

L<Safe> has been upgraded from version 2.39 to 2.40.

=item *

L<Scalar::Util> has been upgraded from version 1.42_02 to 1.46_02.

=item *

L<Storable> has been upgraded from version 2.56 to 2.62.

Fixes
L<[perl #130098]|https://rt.perl.org/Public/Bug/Display.html?id=130098>.

=item *

L<Symbol> has been upgraded from version 1.07 to 1.08.

=item *

L<Sys::Syslog> has been upgraded from version 0.33 to 0.35.

=item *

L<Term::ANSIColor> has been upgraded from version 4.04 to 4.06.

=item *

L<Term::ReadLine> has been upgraded from version 1.15 to 1.16.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=item *

L<Test> has been upgraded from version 1.28 to 1.30.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=item *

L<Test::Harness> has been upgraded from version 3.36 to 3.38.

=item *

L<Test::Simple> has been upgraded from version 1.001014 to 1.302073.

=item *

L<Thread::Queue> has been upgraded from version 3.09 to 3.12.

=item *

L<Thread::Semaphore> has been upgraded from version 2.12 to 2.13.

Added the C<down_timed> method.

=item *

L<threads> has been upgraded from version 2.07 to 2.15.

=item *

L<threads::shared> has been upgraded from version 1.51 to 1.56.

=item *

L<Tie::Hash::NamedCapture> has been upgraded from version 0.09 to 0.10.

=item *

L<Time::HiRes> has been upgraded from version 1.9733 to 1.9741.

It now builds on systems with C++11 compilers (such as G++ 6 and Clang++
3.9).

Now uses C<clockid_t>.

=item *

L<Time::Local> has been upgraded from version 1.2300 to 1.25.

=item *

L<Unicode::Collate> has been upgraded from version 1.14 to 1.19.

=item *

L<Unicode::UCD> has been upgraded from version 0.64 to 0.68.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=item *

L<version> has been upgraded from version 0.9916 to 0.9917.

=item *

L<VMS::DCLsym> has been upgraded from version 1.06 to 1.08.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=item *

L<warnings> has been upgraded from version 1.36 to 1.37.

=item *

L<XS::Typemap> has been upgraded from version 0.14 to 0.15.

=item *

L<XSLoader> has been upgraded from version 0.21 to 0.27.

Fixed a security hole in which binary files could be loaded from a path
outside of L<C<@INC>|perlvar/@INC>.

It now uses 3-arg C<open()> instead of 2-arg C<open()>.
L<[perl #130122]|https://rt.perl.org/Public/Bug/Display.html?id=130122>

=back

=head1 Documentation

=head2 New Documentation

=head3 L<perldeprecation>

This file documents all upcoming deprecations, and some of the deprecations
which already have been removed.  The purpose of this documentation is
two-fold: document what will disappear, and by which version, and serve
as a guide for people dealing with code which has features that no longer
work after an upgrade of their perl.

=head2 Changes to Existing Documentation

We have attempted to update the documentation to reflect the changes
listed in this document.  If you find any we have missed, send email to
L<perlbug@perl.org|mailto:perlbug@perl.org>.

Additionally, all references to Usenet have been removed, and the
following selected changes have been made:

=head3 L<perlfunc>

=over 4

=item *

Removed obsolete text about L<C<defined()>|perlfunc/defined>
on aggregates that should have been deleted earlier, when the feature
was removed.

=item *

Corrected documentation of L<C<eval()>|perlfunc/eval>,
and L<C<evalbytes()>|perlfunc/evalbytes>.

=item *

Clarified documentation of L<C<seek()>|perlfunc/seek>,
L<C<tell()>|perlfunc/tell> and L<C<sysseek()>|perlfunc/sysseek>
emphasizing that positions are in bytes and not characters.
L<[perl #128607]|https://rt.perl.org/Public/Bug/Display.html?id=128607>

=item *

Clarified documentation of L<C<sort()>|perlfunc/sort LIST> concerning
the variables C<$a> and C<$b>.

=item *

In L<C<split()>|perlfunc/split> noted that certain pattern modifiers are
legal, and added a caution about its use in Perls before v5.11.

=item *

Removed obsolete documentation of L<C<study()>|perlfunc/study>, noting
that it is now a no-op.

=item *

Noted that L<C<vec()>|perlfunc/vec> doesn't work well when the string
contains characters whose code points are above 255.

=back

=head3 L<perlguts>

=over 4

=item *

Added advice on
L<formatted printing of operands of C<Size_t> and C<SSize_t>|perlguts/Formatted Printing of Size_t and SSize_t>

=back

=head3 L<perlhack>

=over 4

=item *

Clarify what editor tab stop rules to use, and note that we are
migrating away from using tabs, replacing them with sequences of SPACE
characters.

=back

=head3 L<perlhacktips>

=over 4

=item *

Give another reason to use C<cBOOL> to cast an expression to boolean.

=item *

Note that the macros C<TRUE> and C<FALSE> are available to express
boolean values.

=back

=head3 L<perlinterp>

=over 4

=item *

L<perlinterp> has been expanded to give a more detailed example of how to
hunt around in the parser for how a given operator is handled.

=back

=head3 L<perllocale>

=over 4

=item *

Some locales aren't compatible with Perl.  Note that these can cause
core dumps.

=back

=head3 L<perlmod>

=over 4

=item *

Various clarifications have been added.

=back

=head3 L<perlmodlib>

=over 4

=item *

Updated the site mirror list.

=back

=head3 L<perlobj>

=over 4

=item *

Added a section on calling methods using their fully qualified names.

=item *

Do not discourage manual C<@ISA>.

=back

=head3 L<perlootut>

=over 4

=item *

Mention C<Moo> more.

=back

=head3 L<perlop>

=over 4

=item *

Note that white space must be used for quoting operators if the
delimiter is a word character (I<i.e.>, matches C<\w>).

=item *

Clarify that in regular expression patterns delimited by single quotes,
no variable interpolation is done.

=back

=head3 L<perlre>

=over 4

=item *

The first part was extensively rewritten to incorporate various basic
points that in earlier versions were relegated to what was effectively an
appendix on Version 8 regular expressions.

=item *

Note that it is common to have the C</x> modifier and forget that this
means that C<"#"> has to be escaped.
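
A small illustration of the pitfall; the pattern and input string are
invented for the example:

    use strict;
    use warnings;

    # Under /x an unescaped "#" starts a comment that runs to end of line,
    # so a literal "#" is written as \# (or [#]).
    my $re = qr/ issue \s* \# (\d+) /x;

    my ($number) = 'see issue #42' =~ $re;
    print "issue number: $number\n";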

=back

=head3 L<perlretut>

=over 4

=item *

Add introductory material.

=item *

Note that a metacharacter occurring in a context where it can't have its
special meaning silently loses its meta-ness and matches literally.
L<C<use re 'strict'>|re/'strict' mode> can catch some of these.

=back

=head3 L<perlunicode>

=over 4

=item *

Corrected the text about Unicode BYTE ORDER MARK handling.

=item *

Updated the text to correspond with changes in Unicode UTS#18, concerning
regular expressions, and Perl compatibility with what it says.

=back

=head3 L<perlvar>

=over 4

=item *

Document C<@ISA>.  It was documented in other places, but not in L<perlvar>.

=back

=head1 Diagnostics

=head2 New Diagnostics

=head3 New Errors

=over 4

=item *

L<A signature parameter must start with C<'$'>, C<'@'> or C<'%'>
|perldiag/A signature parameter must start with C<'$'>, C<'@'> or C<'%'>>

=item *

L<Bareword in require contains "%s"|perldiag/"Bareword in require contains "%s"">

=item *

L<Bareword in require maps to empty filename|perldiag/"Bareword in require maps to empty filename">

=item *

L<Bareword in require maps to disallowed filename "%s"|perldiag/"Bareword in require maps to disallowed filename "%s"">

=item *

L<Bareword in require must not start with a double-colon: "%s"|perldiag/"Bareword in require must not start with a double-colon: "%s"">

=item *

L<%s: command not found|perldiag/"%s: command not found">

(A) You've accidentally run your script through B<bash> or another shell
instead of Perl.  Check the C<#!> line, or manually feed your script into
Perl yourself.  The C<#!> line at the top of your file could look like:

  #!/usr/bin/perl

=item *

L<%s: command not found: %s|perldiag/"%s: command not found: %s">

(A) You've accidentally run your script through B<zsh> or another shell
instead of Perl.  Check the C<#!> line, or manually feed your script into
Perl yourself.  The C<#!> line at the top of your file could look like:

  #!/usr/bin/perl

=item *

L<The experimental declared_refs feature is not enabled|perldiag/"The experimental declared_refs feature is not enabled">

(F) To declare references to variables, as in C<my \%x>, you must first enable
the feature:

    no warnings "experimental::declared_refs";
    use feature "declared_refs";

See L</Declaring a reference to a variable>.

=item *

L<Illegal character following sigil in a subroutine signature
|perldiag/Illegal character following sigil in a subroutine signature>

=item *

L<Indentation on line %d of here-doc doesn't match delimiter
|perldiag/Indentation on line %d of here-doc doesn't match delimiter>

=item *

L<Infinite recursion via empty pattern|perldiag/"Infinite recursion via empty pattern">

Using the empty pattern (which re-executes the last successfully-matched
pattern) inside a code block in another regex, as in C</(?{ s!!new! })/>, has
always previously yielded a segfault.  It now produces this error.

=item *

L<Malformed UTF-8 string in "%s"
|perldiag/Malformed UTF-8 string in "%s">

=item *

L<Multiple slurpy parameters not allowed
|perldiag/Multiple slurpy parameters not allowed>

=item *

L<C<'#'> not allowed immediately following a sigil in a subroutine signature
|perldiag/C<'#'> not allowed immediately following a sigil in a subroutine signature>

=item *

L<panic: unknown OA_*: %x
|perldiag/panic: unknown OA_*: %x>

=item *

L<Unescaped left brace in regex is illegal here|perldiag/Unescaped left brace in regex is illegal here in regex; marked by S<E<lt>-- HERE> in mE<sol>%sE<sol>>

Unescaped left braces are now illegal in some contexts in regular expression
patterns.  In other contexts, they are still just deprecated; they will
be illegal in Perl 5.30.

=item *

L<Version control conflict marker|perldiag/"Version control conflict marker">

(F) The parser found a line starting with C<E<lt>E<lt>E<lt>E<lt>E<lt>E<lt>E<lt>>,
C<E<gt>E<gt>E<gt>E<gt>E<gt>E<gt>E<gt>>, or C<=======>.  These may be left by a
version control system to mark conflicts after a failed merge operation.

=back

=head3 New Warnings

=over 4

=item *

L<Can't determine class of operator %s, assuming C<BASEOP>
|perldiag/Can't determine class of operator %s, assuming C<BASEOP>>

=item *

L<Declaring references is experimental|perldiag/"Declaring references is experimental">

(S experimental::declared_refs) This warning is emitted if you use a reference
constructor on the right-hand side of C<my()>, C<state()>, C<our()>, or
C<local()>.  Simply suppress the warning if you want to use the feature, but
know that in doing so you are taking the risk of using an experimental feature
which may change or be removed in a future Perl version:

    no warnings "experimental::declared_refs";
    use feature "declared_refs";
    $fooref = my \$foo;

See L</Declaring a reference to a variable>.

=item *

L<do "%s" failed, '.' is no longer in @INC|perldiag/do "%s" failed, '.' is no longer in @INC; did you mean do ".E<sol>%s"?>

Since C<"."> is no longer in C<@INC> by default, a C<do> that fails for
this reason now triggers a warning suggesting how to fix the C<do>
statement.
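
The usual fix is to name the file with an explicit leading F<./>, so that
C<@INC> is not consulted at all.  This sketch creates a throwaway file to
load; the name F<settings.pl> and its contents are hypothetical:

    use strict;
    use warnings;
    use Cwd qw(getcwd);
    use File::Temp qw(tempdir);

    my $start = getcwd();
    my $dir   = tempdir(CLEANUP => 1);
    chdir $dir or die "chdir $dir: $!";

    # Write a small file that returns a hash reference when evaluated.
    open my $out, '>', 'settings.pl' or die "open: $!";
    print {$out} "+{ colour => 'blue' };\n";
    close $out or die "close: $!";

    # The leading "./" names the file explicitly, so @INC is not searched.
    my $settings = do './settings.pl'
        or die "cannot load settings: ", $@ || $!;

    print "colour=$settings->{colour}\n";
    chdir $start or die "chdir $start: $!";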

=item *

L<C<File::Glob::glob()> will disappear in perl 5.30. Use C<File::Glob::bsd_glob()> instead.
|perldiag/C<File::Glob::glob()> will disappear in perl 5.30. Use C<File::Glob::bsd_glob()> instead.>
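
A minimal sketch of the suggested replacement, using a temporary directory
and made-up file names so that it is self-contained:

    use strict;
    use warnings;
    use File::Glob qw(bsd_glob);
    use File::Temp qw(tempdir);

    my $dir = tempdir(CLEANUP => 1);
    for my $name (qw(a.txt b.txt c.log)) {
        open my $fh, '>', "$dir/$name" or die "open $name: $!";
        close $fh or die "close $name: $!";
    }

    # bsd_glob() takes a single pattern and, unlike the core glob(), does
    # not split it on whitespace.
    my @text_files = bsd_glob("$dir/*.txt");
    print scalar(@text_files), " .txt files found\n";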

=item *

L<Unescaped literal '%c' in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>
|perldiag/Unescaped literal '%c' in regex; marked by <-- HERE in mE<sol>%sE<sol>>

=item *

L<Use of unassigned code point or non-standalone grapheme for a delimiter will be a fatal error starting in Perl 5.30|perldiag/"Use of unassigned code point or non-standalone grapheme for a delimiter will be a fatal error starting in Perl 5.30">

See L</Deprecations>.

=back
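
As a rough illustration of the C<do> warning listed above, the following
sketch writes a throwaway file and loads it with an explicit F<./> prefix, so
that the lookup does not depend on C<@INC>.  The file name F<settings.pl> and
its contents are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Cwd qw(getcwd);

# Hypothetical config file, created only for this demonstration.
my $old_cwd = getcwd();
my $dir     = tempdir(CLEANUP => 1);
chdir $dir or die "Cannot chdir to $dir: $!";

open my $fh, '>', 'settings.pl' or die "Cannot write settings.pl: $!";
print {$fh} "+{ answer => 42 };\n";
close $fh or die "Cannot close settings.pl: $!";

# An explicit "./" makes do() read the file relative to the current
# directory instead of searching @INC.
my $settings = do './settings.pl';
die "Could not load settings.pl: ", ($@ || $!)
    unless ref $settings eq 'HASH';

chdir $old_cwd or die "Cannot chdir back to $old_cwd: $!";
print "answer = $settings->{answer}\n";
```

Writing C<do 'settings.pl'> instead would search C<@INC> only, which by
default no longer contains the current directory.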

=head2 Changes to Existing Diagnostics

=over 4

=item *

When a C<require> fails, we now do not provide C<@INC> when the C<require>
is for a file instead of a module.

=item *

When C<@INC> is not scanned for a C<require> call, we no longer display
C<@INC> to avoid confusion.

=item *

L<Attribute "locked" is deprecated, and will disappear in Perl 5.28
|perldiag/Attribute "locked" is deprecated, and will disappear in Perl 5.28>

This existing warning has had the I<and will disappear> text added in this
release.

=item *

L<Attribute "unique" is deprecated, and will disappear in Perl 5.28
|perldiag/Attribute "unique" is deprecated, and will disappear in Perl 5.28>

This existing warning has had the I<and will disappear> text added in this
release.

=item *

Calling POSIX::%s() is deprecated

This warning has been removed, as the deprecated functions have been
removed from POSIX.

=item *

L<Constants from lexical variables potentially modified elsewhere are deprecated. This will not be allowed in Perl 5.32
|perldiag/Constants from lexical variables potentially modified elsewhere are deprecated. This will not be allowed in Perl 5.32>

This existing warning has had the I<this will not be allowed> text added
in this release.

=item *

L<Deprecated use of C<my()> in false conditional. This will be a fatal error in Perl 5.30
|perldiag/Deprecated use of C<my()> in false conditional. This will be a fatal error in Perl 5.30>

This existing warning has had the I<this will be a fatal error> text added
in this release.

=item *

L<C<dump()> better written as C<CORE::dump()>. C<dump()> will no longer be available in Perl 5.30
|perldiag/C<dump()> better written as C<CORE::dump()>. C<dump()> will no longer be available in Perl 5.30>

This existing warning has had the I<no longer be available> text added in
this release.

=item *

L<Experimental %s on scalar is now forbidden
|perldiag/Experimental %s on scalar is now forbidden>

This message is now followed by more helpful text.
L<[perl #127976]|https://rt.perl.org/Public/Bug/Display.html?id=127976>

=item *

Experimental "%s" subs not enabled

This warning has been removed, as lexical subs are no longer experimental.

=item *

Having more than one /%c regexp modifier is deprecated

This deprecation warning has been removed, since C</xx> now has a new
meaning.

=item *

L<%s() is deprecated on C<:utf8> handles. This will be a fatal error in Perl 5.30
|perldiag/%s() is deprecated on C<:utf8> handles. This will be a fatal error in Perl 5.30>.

where "%s" is one of C<sysread>, C<recv>, C<syswrite>, or C<send>.

This existing warning has had the I<this will be a fatal error> text added
in this release.

This warning is now enabled by default, as all C<deprecated> category
warnings should be.

=item *

L<C<$*> is no longer supported. Its use will be fatal in Perl 5.30
|perldiag/C<$*> is no longer supported. Its use will be fatal in Perl 5.30>

This existing warning has had the I<its use will be fatal> text added in
this release.

=item *

L<C<$#> is no longer supported. Its use will be fatal in Perl 5.30
|perldiag/C<$#> is no longer supported. Its use will be fatal in Perl 5.30>

This existing warning has had the I<its use will be fatal> text added in
this release.

=item *

L<Malformed UTF-8 character%s
|perldiag/Malformed UTF-8 character%s>

Details as to the exact problem have been added at the end of this
message.

=item *

L<Missing or undefined argument to %s
|perldiag/Missing or undefined argument to %s>

This warning used to warn about C<require>, even if it was actually C<do>
which was being executed. It now gets the operation name right.

=item *

NO-BREAK SPACE in a charnames alias definition is deprecated

This warning has been removed as the behavior is now an error.

=item *

L<Odd nameE<sol>value argument for subroutine '%s'
|perldiag/"Odd nameE<sol>value argument for subroutine '%s'">

This warning now includes the name of the offending subroutine.

=item *

L<Opening dirhandle %s also as a file. This will be a fatal error in Perl 5.28
|perldiag/Opening dirhandle %s also as a file. This will be a fatal error in Perl 5.28>

This existing warning has had the I<this will be a fatal error> text added
in this release.

=item *

L<Opening filehandle %s also as a directory. This will be a fatal error in Perl 5.28
|perldiag/Opening filehandle %s also as a directory. This will be a fatal error in Perl 5.28>

This existing warning has had the I<this will be a fatal error> text added
in this release.

=item *

panic: ck_split, type=%u

panic: pp_split, pm=%p, s=%p

These panic errors have been removed.

=item *

Passing malformed UTF-8 to "%s" is deprecated

This warning has been changed to the fatal
L<Malformed UTF-8 string in "%s"
|perldiag/Malformed UTF-8 string in "%s">.

=item *

L<Setting C<< $E<sol> >> to a reference to %s as a form of slurp is deprecated, treating as undef. This will be fatal in Perl 5.28
|perldiag/Setting C<< $E<sol> >> to a reference to %s as a form of slurp is deprecated, treating as undef. This will be fatal in Perl 5.28>

This existing warning has had the I<this will be fatal> text added in
this release.

=item *

L<C<${^ENCODING}> is no longer supported. Its use will be fatal in Perl 5.28|perldiag/"${^ENCODING} is no longer supported. Its use will be fatal in Perl 5.28">

This warning used to be: "Setting C<${^ENCODING}> is deprecated".

The special action of the variable C<${^ENCODING}> was formerly used to
implement the C<encoding> pragma. As of Perl 5.26, rather than being
deprecated, assigning to this variable now has no effect except to issue
the warning.

=item *

L<Too few arguments for subroutine '%s'
|perldiag/Too few arguments for subroutine '%s'>

This warning now includes the name of the offending subroutine.

=item *

L<Too many arguments for subroutine '%s'
|perldiag/Too many arguments for subroutine '%s'>

This warning now includes the name of the offending subroutine.

=item *

L<Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by S<< E<lt>-- HERE >> in mE<sol>%sE<sol>
|perldiag/Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by S<< E<lt>-- HERE >> in mE<sol>%sE<sol>>

This existing warning has had the I<here (and will be fatal...)> text
added in this release.

=item *

L<Unknown charname '' is deprecated. Its use will be fatal in Perl 5.28
|perldiag/Unknown charname '' is deprecated. Its use will be fatal in Perl 5.28>

This existing warning has had the I<its use will be fatal> text added in
this release.

=item *

L<Use of bare E<lt>E<lt> to mean E<lt>E<lt>"" is deprecated. Its use will be fatal in Perl 5.28
|perldiag/Use of bare E<lt>E<lt> to mean E<lt>E<lt>"" is deprecated. Its use will be fatal in Perl 5.28>

This existing warning has had the I<its use will be fatal> text added in
this release.

=item *

L<Use of code point 0x%s is deprecated; the permissible max is 0x%s.  This will be fatal in Perl 5.28
|perldiag/Use of code point 0x%s is deprecated; the permissible max is 0x%s.  This will be fatal in Perl 5.28>

This existing warning has had the I<this will be fatal> text added in
this release.

=item *

L<Use of comma-less variable list is deprecated. Its use will be fatal in Perl 5.28
|perldiag/Use of comma-less variable list is deprecated. Its use will be fatal in Perl 5.28>

This existing warning has had the I<its use will be fatal> text added in
this release.

=item *

L<Use of inherited C<AUTOLOAD> for non-method %s() is deprecated. This will be fatal in Perl 5.28
|perldiag/Use of inherited C<AUTOLOAD> for non-method %s() is deprecated. This will be fatal in Perl 5.28>

This existing warning has had the I<this will be fatal> text added in
this release.

=item *

L<Use of strings with code points over 0xFF as arguments to %s operator is deprecated. This will be a fatal error in Perl 5.28
|perldiag/Use of strings with code points over 0xFF as arguments to %s operator is deprecated. This will be a fatal error in Perl 5.28>

This existing warning has had the I<this will be a fatal error> text added in
this release.

=back
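
To see the subroutine name in the "Too few arguments" diagnostic mentioned
above, a minimal sketch along these lines can be used.  The subroutine
C<greet> is hypothetical, and the exact wording after the name may differ
between Perl versions.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical subroutine with one mandatory parameter.
sub greet ($name) {
    return "Hello, $name";
}

# Calling it with no arguments dies at run time; trap the error so
# that the message can be inspected.
my $ok    = eval { greet(); 1 };
my $error = $ok ? '' : $@;

print "Caught: $error";
```

The captured message names the subroutine with its package, here
C<main::greet>.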

=head1 Utility Changes

=head2 F<c2ph> and F<pstruct>

=over 4

=item *

These old utilities have long since been superseded by L<h2xs>, and are
now gone from the distribution.

=back

=head2 F<Porting/pod_lib.pl>

=over 4

=item *

Removed spurious executable bit.

=item *

Account for the possibility of DOS file endings.

=back

=head2 F<Porting/sync-with-cpan>

=over 4

=item *

Many improvements.

=back

=head2 F<perf/benchmarks>

=over 4

=item *

Tidy file, rename some symbols.

=back

=head2 F<Porting/checkAUTHORS.pl>

=over 4

=item *

Replace obscure character range with C<\w>.

=back

=head2 F<t/porting/regen.t>

=over 4

=item *

Try to be more helpful when tests fail.

=back

=head2 F<utils/h2xs.PL>

=over 4

=item *

Avoid infinite loop for enums.

=back

=head2 L<perlbug>

=over 4

=item *

Long lines in the message body are now wrapped at 900 characters, to stay
well within the 1000-character limit imposed by SMTP mail transfer agents.
This is particularly likely to be important for the list of arguments to
F<Configure>, which can readily exceed the limit if, for example, it names
several non-default installation paths.  This change also adds the first unit
tests for perlbug.
L<[perl #128020]|https://rt.perl.org/Public/Bug/Display.html?id=128020>

=back
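
The effect of wrapping at 900 characters can be approximated with the core
L<Text::Wrap> module.  This is only a sketch of the idea, not the code
F<perlbug> actually uses, and the F<Configure> arguments below are invented
to make the line long enough.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Wrap at 900 columns, the figure quoted above.  Text::Wrap emits
# lines of at most $columns - 1 characters.
$Text::Wrap::columns = 900;

# Made-up Configure argument list, long enough to need wrapping.
my $args = join ' ',
    map { "-Dinstallpath$_=/opt/perl/custom/location/$_" } 1 .. 60;

my $wrapped   = wrap('', '', $args);
my @lines     = split /\n/, $wrapped;
my ($longest) = sort { $b <=> $a } map { length } @lines;

printf "%d lines, longest is %d characters\n", scalar @lines, $longest;
```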

=head1 Configuration and Compilation

=over 4

=item *

C<-Ddefault_inc_excludes_dot> has been added, and is enabled by default.

=item *

The C<dtrace> build process has further changes
L<[perl #130108]|https://rt.perl.org/Public/Bug/Display.html?id=130108>:

=over

=item *

If the C<-xnolibs> option is available, use that so a F<dtrace> perl can be
built within a FreeBSD jail.

=item *

On systems that build a F<dtrace> object file (FreeBSD, Solaris, and
SystemTap's dtrace emulation), copy the input objects to a separate
directory and process them there, and use those objects in the link,
since C<dtrace -G> also modifies these objects.

=item *

Add F<libelf> to the build on FreeBSD 10.x, since F<dtrace> adds
references to F<libelf> symbols.

=item *

Generate a dummy F<dtrace_main.o> if C<dtrace -G> fails to build it.  A
default build on Solaris generates probes from the unused inline
functions, while a default build on FreeBSD does not, which causes
C<dtrace -G> to fail.

=back

=item *

You can now disable perl's use of the C<PERL_HASH_SEED> and
C<PERL_PERTURB_KEYS> environment variables by configuring perl with
C<-Accflags=-DNO_PERL_HASH_ENV>.

=item *

You can now disable perl's use of the C<PERL_HASH_SEED_DEBUG> environment
variable by configuring perl with
C<-Accflags=-DNO_PERL_HASH_SEED_DEBUG>.

=item *

F<Configure> now zeroes out the alignment bytes when calculating the bytes
for 80-bit C<NaN> and C<Inf> to make builds more reproducible.
L<[perl #130133]|https://rt.perl.org/Public/Bug/Display.html?id=130133>

=item *

Since v5.18, for testing purposes we have included support for
building perl with a variety of non-standard, and non-recommended
hash functions.  Since we do not recommend the use of these functions,
we have removed them and their corresponding build options.  Specifically
this includes the following build options:

    PERL_HASH_FUNC_SDBM
    PERL_HASH_FUNC_DJB2
    PERL_HASH_FUNC_SUPERFAST
    PERL_HASH_FUNC_MURMUR3
    PERL_HASH_FUNC_ONE_AT_A_TIME
    PERL_HASH_FUNC_ONE_AT_A_TIME_OLD
    PERL_HASH_FUNC_MURMUR_HASH_64A
    PERL_HASH_FUNC_MURMUR_HASH_64B

=item *

Remove "Warning: perl appears in your path"

This install warning is more or less obsolete, since most platforms already
B<will> have a F</usr/bin/perl> or similar provided by the OS.

=item *

Reduce verbosity of C<make install.man>

Previously, two progress messages were emitted for each manpage: one by
installman itself, and one by the function in F<install_lib.pl> that it calls to
actually install the file.  Disabling the second of those in each case saves
over 750 lines of unhelpful output.

=item *

Cleanup for C<clang -Weverything> support.
L<[perl #129961]|https://rt.perl.org/Public/Bug/Display.html?id=129961>

=item *

The F<Configure> signbit scan was assuming too much; it no longer assumes
negative 0.

=item *

Various compiler warnings have been silenced.

=item *

Several smaller changes have been made to remove impediments to compiling
under C++11.

=item *

Builds using C<USE_PAD_RESET> now work again; this configuration had
bit-rotted.

=item *

A probe for C<gai_strerror> was added to F<Configure> that checks if
the C<gai_strerror()> routine is available and can be used to
translate error codes returned by C<getaddrinfo()> into human
readable strings.

=item *

F<Configure> now aborts if both C<-Duselongdouble> and C<-Dusequadmath> are
requested.
L<[perl #126203]|https://rt.perl.org/Public/Bug/Display.html?id=126203>

=item *

Fixed a bug in which F<Configure> could append C<-quadmath> to the
archname even if it was already present.
L<[perl #128538]|https://rt.perl.org/Public/Bug/Display.html?id=128538>

=item *

Clang builds with C<-DPERL_GLOBAL_STRUCT> or
C<-DPERL_GLOBAL_STRUCT_PRIVATE> have
been fixed (by disabling Thread Safety Analysis for these configurations).

=item *

F<make_ext.pl> no longer updates a module's F<pm_to_blib> file when no
files require updates.  This could cause dependencies, F<perlmain.c>
in particular, to be rebuilt unnecessarily.
L<[perl #126710]|https://rt.perl.org/Public/Bug/Display.html?id=126710>

=item *

The output of C<perl -V> has been reformatted so that each configuration
and compile-time option is now listed one per line, to improve
readability.

=item *

F<Configure> now builds C<miniperl> and C<generate_uudmap> if you
invoke it with C<-Dusecrosscompiler> but not C<-Dtargethost=somehost>.
This means you can supply your target platform C<config.sh>, generate
the headers and proceed to build your cross-target perl.
L<[perl #127234]|https://rt.perl.org/Public/Bug/Display.html?id=127234>

=item *

Perl built with C<-Accflags=-DPERL_TRACE_OPS> now only dumps the operator
counts when the environment variable C<PERL_TRACE_OPS> is set to a
non-zero integer.  This allows C<make test> to pass on such a build.

=item *

When building with GCC 6 and link-time optimization (the C<-flto> option to
C<gcc>), F<Configure> was treating all probed symbols as present on the
system, regardless of whether they actually exist.  This has been fixed.
L<[perl #128131]|https://rt.perl.org/Public/Bug/Display.html?id=128131>

=item *

The F<t/test.pl> library is used for internal testing of Perl itself, and
also copied by several CPAN modules.  Some of those modules must work on
older versions of Perl, so F<t/test.pl> must in turn avoid newer Perl
features.  Compatibility with Perl 5.8 was inadvertently removed some time
ago; it has now been restored.
L<[perl #128052]|https://rt.perl.org/Public/Bug/Display.html?id=128052>

=item *

The build process no longer emits an extra blank line before building each
"simple" extension (those with only F<*.pm> and F<*.pod> files).

=back
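
One way to check how a given perl was configured with respect to
C<-Ddefault_inc_excludes_dot> is to consult L<Config> and compare with the
live C<@INC>.  The helper C<has_dot_entry> below is a hypothetical name used
only for this sketch.

```perl
use strict;
use warnings;
use Config;

# Returns true if the given search path contains the current directory.
# References (such as @INC hooks) are skipped.
sub has_dot_entry {
    my @path = @_;
    return scalar grep { defined($_) && !ref($_) && $_ eq '.' } @path;
}

my $build_default =
    $Config{default_inc_excludes_dot} ? 'excludes' : 'includes';

print "This perl's default \@INC $build_default '.'\n";
print "Current \@INC ",
    (has_dot_entry(@INC) ? 'contains' : 'does not contain'), " '.'\n";
```

The two lines can disagree, since C<@INC> may be changed at run time or
through the environment.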

=head1 Testing

Tests were added and changed to reflect the other additions and changes
in this release.  Furthermore, these substantive changes were made:

=over 4

=item *

A new test script, F<comp/parser_run.t>, has been added that is like
F<comp/parser.t> but with F<test.pl> included so that C<runperl()> and the
like are available for use.

=item *

Tests for locales were erroneously using locales incompatible with Perl.

=item *

Some parts of the test suite that try to exhaustively test edge cases in the
regex implementation have been restricted to running for a maximum of five
minutes.  On slow systems they could otherwise take several hours, without
significantly improving our understanding of the correctness of the code
under test.

=item *

A new internal facility allows analysing the time taken by the individual
tests in Perl's own test suite; see F<Porting/harness-timer-report.pl>.

=item *

F<t/re/regexp_nonull.t> has been added to test that the regular expression
engine can handle scalars that do not have a null byte just past the end of
the string.

=item *

A new test script, F<t/op/decl-refs.t>, has been added to test the new feature
L</Declaring a reference to a variable>.

=item *

A new test script, F<t/re/keep_tabs.t>, has been added to contain tests
where C<\t> characters should not be expanded into spaces.

=item *

A new test script, F<t/re/anyof.t>, has been added to test that the ANYOF nodes
generated by bracketed character classes are as expected.

=item *

There is now more extensive testing of the Unicode-related API macros
and functions.

=item *

Several of the longer running API test files have been split into
multiple test files so that they can be run in parallel.

=item *

F<t/harness> now tries really hard not to run tests which are located
outside of the Perl source tree.
L<[perl #124050]|https://rt.perl.org/Public/Bug/Display.html?id=124050>

=item *

Prevent debugger tests (F<lib/perl5db.t>) from failing due to the contents
of C<$ENV{PERLDB_OPTS}>.
L<[perl #130445]|https://rt.perl.org/Public/Bug/Display.html?id=130445>

=back

=head1 Platform Support

=head2 New Platforms

=over 4

=item NetBSD/VAX

Perl now compiles under NetBSD on VAX machines.  However, it's not
possible for that platform to implement floating-point infinities and
NaNs compatible with most modern systems, which implement the IEEE-754
floating point standard.  The hexadecimal floating point (C<0x...p[+-]n>
literals, C<printf %a>) is not implemented, either.
C<make test> passes 98% of tests.

=over 4

=item *

Test fixes and minor updates.

=item *

Account for lack of C<inf>, C<nan>, and C<-0.0> support.

=back

=back
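
For reference, the hexadecimal floating-point notation that is unavailable on
NetBSD/VAX looks like this on platforms that do support it.  This is a small
sketch for an IEEE-754 system; the variable names are arbitrary.

```perl
use strict;
use warnings;

# Hexadecimal floating-point literals: the mantissa is written in hex
# and the exponent after "p" is a power of two.
my $three   = 0x1.8p+1;    # 1.5 * 2**1
my $quarter = 0x1p-2;      # 1   * 2**-2

printf "%s %s\n", $three, $quarter;

# "%a" prints a number in hexadecimal floating-point notation; the
# exact digits can vary with the platform's floating-point setup.
printf "%a\n", $three;
```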

=head2 Platform-Specific Notes

=over 4

=item Darwin

=over 4

=item *

Don't treat C<-Dprefix=/usr> as special: instead require an extra option
C<-Ddarwin_distribution> to produce the same results.

=item *

OS X El Capitan doesn't implement the C<clock_gettime()> or
C<clock_getres()> APIs; emulate them as necessary.

=item *

Deprecated C<syscall(2)> on macOS 10.12.

=back

=item EBCDIC

Several tests have been updated to work (or be skipped) on EBCDIC platforms.

=item HP-UX

The L<Net::Ping> UDP test is now skipped on HP-UX.

=item Hurd

The hints for Hurd have been improved, enabling malloc wrap and reporting the
GNU libc used (previously it was an empty string when reported).

=item VAX

VAX floating point formats are now supported on NetBSD.

=item VMS

=over 4

=item *

The path separator for the C<PERL5LIB> and C<PERLLIB> environment entries is
now a colon (C<":">) when running under a Unix shell.  There is no change when
running under DCL (it's still C<"|">).

=item *

F<configure.com> now recognizes the VSI-branded C compiler and no longer
recognizes the "DEC"-branded C compiler (as there hasn't been such a thing for
15 or more years).

=back

=item Windows

=over 4

=item *

Support for compiling perl on Windows using Microsoft Visual Studio 2015
(containing Visual C++ 14.0) has been added.

This version of VC++ includes a completely rewritten C run-time library, some
of the changes in which mean that work done to resolve a socket
C<close()> bug in
perl #120091 and perl #118059 is not workable in its current state with this
version of VC++.  Therefore, we have effectively reverted that bug fix for
VS2015 onwards on the basis that being able to build with VS2015 onwards is
more important than keeping the bug fix.  We may revisit this in the future to
attempt to fix the bug again in a way that is compatible with VS2015.

These changes do not affect compilation with GCC or with Visual Studio versions
up to and including VS2013, I<i.e.>, the bug fix is retained (unchanged) for those
compilers.

Note that you may experience compatibility problems if you mix a perl built
with GCC or VS E<lt>= VS2013 with XS modules built with VS2015, or if you mix a
perl built with VS2015 with XS modules built with GCC or VS E<lt>= VS2013.
Some incompatibility may arise because of the bug fix that has been reverted
for VS2015 builds of perl, but there may well be incompatibility anyway because
of the rewritten CRT in VS2015 (I<e.g.>, see discussion at
L<http://stackoverflow.com/questions/30412951>).

=item *

It now automatically detects GCC versus Visual C and sets the VC version
number on Win32.

=back

=item Linux

Drop support for Linux F<a.out> executable format. Linux has used ELF for
over twenty years.

=item OpenBSD 6

OpenBSD 6 still does not support returning C<pid>, C<gid>, or C<uid> with
C<SA_SIGINFO>.  Make sure to account for it.

=item FreeBSD

F<t/uni/overload.t>: Skip hanging test on FreeBSD.

=item DragonFly BSD

DragonFly BSD now has support for C<setproctitle()>.
L<[perl #130068]|https://rt.perl.org/Public/Bug/Display.html?id=130068>.

=back
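
Code that parses C<PERL5LIB> itself can avoid hard-coding a separator by
asking L<Config> for C<path_sep>.  The sketch below assumes nothing about
what a particular platform reports there, and C<split_lib_path> is a
hypothetical helper.

```perl
use strict;
use warnings;
use Config;

# Split a library search path on the given separator, dropping
# empty entries.
sub split_lib_path {
    my ($value, $sep) = @_;
    return grep { length } split /\Q$sep\E/, $value;
}

my $sep  = $Config{path_sep};    # ":" on Unix-like systems
my @dirs = split_lib_path($ENV{PERL5LIB} // '', $sep);

print "PERL5LIB has ", scalar(@dirs), " entries (separator '$sep')\n";
```

Quoting the separator with C<\Q...\E> matters for a separator such as C<"|">,
which would otherwise be treated as regex alternation.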

=head1 Internal Changes

=over 4

=item *

A new API function L<C<sv_setpv_bufsize()>|perlapi/sv_setpv_bufsize>
allows simultaneously setting the
length and the allocated size of the buffer in an C<SV>, growing the
buffer if necessary.

=item *

A new API macro L<C<SvPVCLEAR()>|perlapi/SvPVCLEAR> sets its C<SV>
argument to an empty string,
like Perl-space C<$x = ''>, but with several optimisations.

=item *

Several new macros and functions for dealing with Unicode and
UTF-8-encoded strings have been added to the API, as well as some
changes in the
functionality of existing functions (see L<perlapi/Unicode Support> for
more details):

=over

=item *

New versions of the API macros like C<isALPHA_utf8> and C<toLOWER_utf8>
have been added, each with the suffix C<_safe>, like
L<C<isSPACE_utf8_safe>|perlapi/isSPACE>.  These take an extra
parameter, giving an upper
limit of how far into the string it is safe to read.  Using the old
versions could cause attempts to read beyond the end of the input buffer
if the UTF-8 is not well-formed, and their use now raises a deprecation
warning.  Details are at L<perlapi/Character classification>.

=item *

Macros like L<C<isALPHA_utf8>|perlapi/isALPHA> and
L<C<toLOWER_utf8>|perlapi/toLOWER_utf8> now die if they detect
that their input UTF-8 is malformed.  A deprecation warning had been
issued since Perl 5.18.

=item *

Several new macros for analysing the validity of UTF-8 sequences have been
added.  These are:

L<C<UTF8_GOT_ABOVE_31_BIT>|perlapi/UTF8_GOT_ABOVE_31_BIT>,
L<C<UTF8_GOT_CONTINUATION>|perlapi/UTF8_GOT_CONTINUATION>,
L<C<UTF8_GOT_EMPTY>|perlapi/UTF8_GOT_EMPTY>,
L<C<UTF8_GOT_LONG>|perlapi/UTF8_GOT_LONG>,
L<C<UTF8_GOT_NONCHAR>|perlapi/UTF8_GOT_NONCHAR>,
L<C<UTF8_GOT_NON_CONTINUATION>|perlapi/UTF8_GOT_NON_CONTINUATION>,
L<C<UTF8_GOT_OVERFLOW>|perlapi/UTF8_GOT_OVERFLOW>,
L<C<UTF8_GOT_SHORT>|perlapi/UTF8_GOT_SHORT>,
L<C<UTF8_GOT_SUPER>|perlapi/UTF8_GOT_SUPER>,
L<C<UTF8_GOT_SURROGATE>|perlapi/UTF8_GOT_SURROGATE>,
L<C<UTF8_IS_INVARIANT>|perlapi/UTF8_IS_INVARIANT>,
L<C<UTF8_IS_NONCHAR>|perlapi/UTF8_IS_NONCHAR>,
L<C<UTF8_IS_SUPER>|perlapi/UTF8_IS_SUPER>,
L<C<UTF8_IS_SURROGATE>|perlapi/UTF8_IS_SURROGATE>,
L<C<UVCHR_IS_INVARIANT>|perlapi/UVCHR_IS_INVARIANT>,
L<C<isUTF8_CHAR_flags>|perlapi/isUTF8_CHAR_flags>,
L<C<isSTRICT_UTF8_CHAR>|perlapi/isSTRICT_UTF8_CHAR>,
L<C<isC9_STRICT_UTF8_CHAR>|perlapi/isC9_STRICT_UTF8_CHAR>.

=item *

Functions that are all extensions of the C<is_utf8_string_I<*>()> functions,
that apply various restrictions to the UTF-8 recognized as valid:

L<C<is_strict_utf8_string>|perlapi/is_strict_utf8_string>,
L<C<is_strict_utf8_string_loc>|perlapi/is_strict_utf8_string_loc>,
L<C<is_strict_utf8_string_loclen>|perlapi/is_strict_utf8_string_loclen>,

L<C<is_c9strict_utf8_string>|perlapi/is_c9strict_utf8_string>,
L<C<is_c9strict_utf8_string_loc>|perlapi/is_c9strict_utf8_string_loc>,
L<C<is_c9strict_utf8_string_loclen>|perlapi/is_c9strict_utf8_string_loclen>,

L<C<is_utf8_string_flags>|perlapi/is_utf8_string_flags>,
L<C<is_utf8_string_loc_flags>|perlapi/is_utf8_string_loc_flags>,
L<C<is_utf8_string_loclen_flags>|perlapi/is_utf8_string_loclen_flags>,

L<C<is_utf8_fixed_width_buf_flags>|perlapi/is_utf8_fixed_width_buf_flags>,
L<C<is_utf8_fixed_width_buf_loc_flags>|perlapi/is_utf8_fixed_width_buf_loc_flags>,
L<C<is_utf8_fixed_width_buf_loclen_flags>|perlapi/is_utf8_fixed_width_buf_loclen_flags>.

L<C<is_utf8_invariant_string>|perlapi/is_utf8_invariant_string>.
L<C<is_utf8_valid_partial_char>|perlapi/is_utf8_valid_partial_char>.
L<C<is_utf8_valid_partial_char_flags>|perlapi/is_utf8_valid_partial_char_flags>.

=item *

The functions L<C<utf8n_to_uvchr>|perlapi/utf8n_to_uvchr> and its
derivatives have had several changes of behaviour.

Calling them with a string length of 0 is now asserted against in
DEBUGGING builds; in other builds they return the Unicode REPLACEMENT
CHARACTER.  If you have nothing to decode, you shouldn't call the decode
function.

They now return the Unicode REPLACEMENT CHARACTER if called with UTF-8
that has the overlong malformation and that malformation is allowed by
the input parameters.  This malformation is where the UTF-8 looks valid
syntactically, but there is a shorter sequence that yields the same code
point.  This has been forbidden since Unicode version 3.1.

They now accept an input
flag to allow the overflow malformation.  This malformation is when the
UTF-8 may be syntactically valid, but the code point it represents is
not capable of being represented in the word length on the platform.
What "allowed" means, in this case, is that the function doesn't return an
error, and it advances the parse pointer to beyond the UTF-8 in
question, but it returns the Unicode REPLACEMENT CHARACTER as the value
of the code point (since the real value is not representable).

They no longer abandon searching for other malformations when the first
one is encountered.  A call to one of these functions thus can generate
multiple diagnostics, instead of just one.

=item *

L<C<valid_utf8_to_uvchr()>|perlapi/valid_utf8_to_uvchr> has been added
to the API (although it was
present in core earlier).  It is like C<utf8_to_uvchr_buf()>, but assumes that
the next character is well-formed.  Use with caution.

=item *

A new function, L<C<utf8n_to_uvchr_error>|perlapi/utf8n_to_uvchr_error>,
has been added for
use by modules that need to know the details of UTF-8 malformations
beyond pass/fail.  Previously, the only ways to know why a sequence was
ill-formed were to capture and parse the generated diagnostics or to do
your own analysis.

=item *

There is now a safer version of C<utf8_hop()>, called
L<C<utf8_hop_safe()>|perlapi/utf8_hop_safe>.
Unlike C<utf8_hop()>, C<utf8_hop_safe()> won't navigate before the beginning
or after the end of the supplied buffer.

=item *

Two new functions, L<C<utf8_hop_forward()>|perlapi/utf8_hop_forward> and
L<C<utf8_hop_back()>|perlapi/utf8_hop_back> are
similar to C<utf8_hop_safe()> but are for when you know which direction
you wish to travel.

=item *

Two new macros return useful UTF-8 byte sequences:

L<C<BOM_UTF8>|perlapi/BOM_UTF8>

L<C<REPLACEMENT_CHARACTER_UTF8>|perlapi/REPLACEMENT_CHARACTER_UTF8>

=back

=item *

Perl is now built with the C<PERL_OP_PARENT> compiler define enabled by
default.  To disable it, use the C<PERL_NO_OP_PARENT> compiler define.
This flag alters how the C<op_sibling> field is used in C<OP> structures,
and has been available optionally since perl 5.22.

See L<perl5220delta/"Internal Changes"> for more details of what this
build option does.

=item *

Three new ops, C<OP_ARGELEM>, C<OP_ARGDEFELEM>, and C<OP_ARGCHECK> have
been added.  These are intended principally to implement the individual
elements of a subroutine signature, plus any overall checking required.

=item *

The C<OP_PUSHRE> op has been eliminated and the C<OP_SPLIT> op has been
changed from class C<LISTOP> to C<PMOP>.

Formerly the first child of a split would be a C<pushre>, which would have the
C<split>'s regex attached to it. Now the regex is attached directly to the
C<split> op, and the C<pushre> has been eliminated.

=item *

The L<C<op_class()>|perlapi/op_class> API function has been added.  This
is like the existing
C<OP_CLASS()> macro, but can more accurately determine what struct an op
has been allocated as.  For example C<OP_CLASS()> might return
C<OA_BASEOP_OR_UNOP> indicating that ops of this type are usually
allocated as an C<OP> or C<UNOP>; while C<op_class()> will return
C<OPclass_BASEOP> or C<OPclass_UNOP> as appropriate.

=item *

All parts of the internals now agree that the C<sassign> op is a C<BINOP>;
previously it was listed as a C<BASEOP> in F<regen/opcodes>, which meant
that several parts of the internals had to be special-cased to accommodate
it.  This oddity's original motivation was to handle code like C<$x ||= 1>;
that is now handled in a simpler way.

=item *

The output format of the L<C<op_dump()>|perlapi/op_dump> function (as
used by C<perl -Dx>)
has changed: it now displays an "ASCII-art" tree structure, and shows more
low-level details about each op, such as its address and class.

=item *

The C<PADOFFSET> type has changed from being unsigned to signed, and
several pad-related variables such as C<PL_padix> have changed from being
of type C<I32> to type C<PADOFFSET>.

=item *

The C<DEBUGGING>-mode output for regex compilation and execution has been
enhanced.

=item *

Several obscure SV flags have been eliminated, sometimes along with the
macros which manipulate them: C<SVpbm_VALID>, C<SVpbm_TAIL>, C<SvTAIL_on>,
C<SvTAIL_off>, C<SVrepl_EVAL>, C<SvEVALED>.

=item *

An OP C<op_private> flag has been eliminated: C<OPpRUNTIME>. This used to
often get set on C<PMOP> ops, but had become meaningless over time.

=back
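
The overlong malformation described above for C<utf8n_to_uvchr> can be
observed from Perl space without touching the C API.  The sketch below relies
on the built-in C<utf8::decode()>, which reports whether its argument was
well-formed UTF-8; C<decodes_cleanly> is a hypothetical wrapper, and this
shows only the pass/fail outcome, not the detailed flags the new C functions
provide.

```perl
use strict;
use warnings;

# Returns 1 if the byte string is well-formed UTF-8, judged by whether
# the built-in utf8::decode() accepts it, and 0 otherwise.  The
# argument is copied, so the caller's string is left untouched.
sub decodes_cleanly {
    my ($bytes) = @_;
    return utf8::decode($bytes) ? 1 : 0;
}

my $well_formed = "\xC3\xA9";    # U+00E9 in its shortest (2-byte) form
my $overlong    = "\xC0\xAF";    # "/" (U+002F) padded out to 2 bytes

print "well-formed: ", decodes_cleanly($well_formed), "\n";
print "overlong:    ", decodes_cleanly($overlong),    "\n";
```

The second sequence looks syntactically like a two-byte character, but the
code point it would yield already has a one-byte encoding, so it is rejected.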

=head1 Selected Bug Fixes

=over 4

=item *

Perl no longer panics when switching into some locales on machines with
buggy C<strxfrm()> implementations in their F<libc>.
L<[perl #121734]|https://rt.perl.org/Public/Bug/Display.html?id=121734>

=item *

C< $-{$name} > would leak an C<AV> on each access if the regular
expression had no named captures.  The same applies to access to any
hash tied with L<Tie::Hash::NamedCapture> and C<< all =E<gt> 1 >>.
L<[perl #130822]|https://rt.perl.org/Public/Bug/Display.html?id=130822>

=item *

Attempting to use the deprecated variable C<$#> as the object in an
indirect object method call could cause a heap use after free or
buffer overflow.
L<[perl #129274]|https://rt.perl.org/Public/Bug/Display.html?id=129274>

=item *

When checking for an indirect object method call, in some rare cases
the parser could reallocate the line buffer but then continue to use
pointers to the old buffer.
L<[perl #129190]|https://rt.perl.org/Public/Bug/Display.html?id=129190>

=item *

Supplying a glob as the format argument to
L<C<formline>|perlfunc/formline> would
cause an assertion failure.
L<[perl #130722]|https://rt.perl.org/Public/Bug/Display.html?id=130722>

=item *

Code like C< $value1 =~ qr/.../ ~~ $value2 > would have the match
converted into a C<qr//> operator, leaving extra elements on the stack to
confuse any surrounding expression.
L<[perl #130705]|https://rt.perl.org/Public/Bug/Display.html?id=130705>

=item *

Since v5.24 in some obscure cases, a regex which included code blocks
from multiple sources (I<e.g.>, via embedded C<qr//> objects) could end up
with the wrong current pad and crash or give weird results.
L<[perl #129881]|https://rt.perl.org/Public/Bug/Display.html?id=129881>

=item *

Occasionally C<local()>s in a code block within a pattern weren't being
undone when the pattern matching backtracked over the code block.
L<[perl #126697]|https://rt.perl.org/Public/Bug/Display.html?id=126697>

=item *

Using C<substr()> to modify a magic variable could access freed memory
in some cases.
L<[perl #129340]|https://rt.perl.org/Public/Bug/Display.html?id=129340>

=item *

Under C<use utf8>, the entire source code is now checked for being UTF-8
well formed, not just quoted strings as before.
L<[perl #126310]|https://rt.perl.org/Public/Bug/Display.html?id=126310>.

=item *

The range operator C<".."> on strings now handles its arguments correctly when in
the scope of the L<< C<unicode_strings>|feature/"The 'unicode_strings' feature" >>
feature.  The previous behaviour was sufficiently unexpected that we believe no
correct program could have made use of it.

=item *

The C<split> operator did not ensure enough space was allocated for
its return value in scalar context.  It could then write a single
pointer immediately beyond the end of the memory block allocated for
the stack.
L<[perl #130262]|https://rt.perl.org/Public/Bug/Display.html?id=130262>

=item *

Using a large code point with the C<"W"> pack template character with
the current output position aligned at just the right point could
cause a write of a single zero byte immediately beyond the end of an
allocated buffer.
L<[perl #129149]|https://rt.perl.org/Public/Bug/Display.html?id=129149>

=item *

Supplying a format's picture argument as part of the format argument list
where the picture specifies modifying the argument could cause an
access to the now-freed compiled format.
L<[perl #129125]|https://rt.perl.org/Public/Bug/Display.html?id=129125>

=item *

The L<sort()|perlfunc/sort> operator's built-in numeric comparison
function didn't handle large integers that weren't exactly
representable by a double.  This now uses the same code used to
implement the C<< E<lt>=E<gt> >> operator.
L<[perl #130335]|https://rt.perl.org/Public/Bug/Display.html?id=130335>

=item *

Fix issues with C</(?{ ... E<lt>E<lt>EOF })/> that broke
L<Method::Signatures>.
L<[perl #130398]|https://rt.perl.org/Public/Bug/Display.html?id=130398>

=item *

Fixed an assertion failure with C<chop> and C<chomp>, which
could be triggered by C<chop(@x =~ tr/1/1/)>.
L<[perl #130198]|https://rt.perl.org/Public/Bug/Display.html?id=130198>.

=item *

Fixed a comment skipping error in patterns under C</x>; it could stop
skipping a byte early, which could be in the middle of a UTF-8
character.
L<[perl #130495]|https://rt.perl.org/Public/Bug/Display.html?id=130495>.

=item *

F<perldb> now ignores F</dev/tty> on non-Unix systems.
L<[perl #113960]|https://rt.perl.org/Public/Bug/Display.html?id=113960>.

=item *

Fix assertion failure for C<{}-E<gt>$x> when C<$x> isn't defined.
L<[perl #130496]|https://rt.perl.org/Public/Bug/Display.html?id=130496>.

=item *

Fix an assertion error which could be triggered when a lookahead string
in patterns exceeded a minimum length.
L<[perl #130522]|https://rt.perl.org/Public/Bug/Display.html?id=130522>.

=item *

Only warn once per literal number about a misplaced C<"_">.
L<[perl #70878]|https://rt.perl.org/Public/Bug/Display.html?id=70878>.

=item *

The C<tr///> parse code could be looking at uninitialized data after a
parse error.
L<[perl #129342]|https://rt.perl.org/Public/Bug/Display.html?id=129342>.

=item *

In a pattern match, a back-reference (C<\1>) to an unmatched capture could
read back beyond the start of the string being matched.
L<[perl #129377]|https://rt.perl.org/Public/Bug/Display.html?id=129377>.

=item *

C<use re 'strict'> is supposed to warn if you use a range (such as
C</(?[ [ X-Y ] ])/>) whose start and end digit aren't from the same group
of 10.  It didn't do that for five groups of mathematical digits starting
at C<U+1D7CE>.

=item *

A sub containing a "forward" declaration with the same name (I<e.g.>,
C<sub c { sub c; }>) could sometimes crash or loop infinitely.
L<[perl #129090]|https://rt.perl.org/Public/Bug/Display.html?id=129090>

=item *

A crash in executing a regex with a non-anchored UTF-8 substring against a
target string that also used UTF-8 has been fixed.
L<[perl #129350]|https://rt.perl.org/Public/Bug/Display.html?id=129350>

=item *

Previously, a shebang line like C<#!perl -i u> could be erroneously
interpreted as requesting the C<-u> option.  This has been fixed.
L<[perl #129336]|https://rt.perl.org/Public/Bug/Display.html?id=129336>

=item *

The regex engine was previously producing incorrect results in some rare
situations when backtracking past an alternation that matches only one
thing; this
showed up as capture buffers (C<$1>, C<$2>, I<etc.>) erroneously containing data
from regex execution paths that weren't actually executed for the final
match.
L<[perl #129897]|https://rt.perl.org/Public/Bug/Display.html?id=129897>

=item *

Certain regexes making use of the experimental C<regex_sets> feature could
trigger an assertion failure.  This has been fixed.
L<[perl #129322]|https://rt.perl.org/Public/Bug/Display.html?id=129322>

=item *

Invalid assignments to a reference constructor (I<e.g.>, C<\eval=time>) could
sometimes crash in addition to giving a syntax error.
L<[perl #125679]|https://rt.perl.org/Public/Bug/Display.html?id=125679>

=item *

The parser could sometimes crash if a bareword came after C<evalbytes>.
L<[perl #129196]|https://rt.perl.org/Public/Bug/Display.html?id=129196>

=item *

Autoloading via a method call would warn erroneously ("Use of inherited
AUTOLOAD for non-method") if there was a stub present in the package into
which the invocant had been blessed.  The warning is no longer emitted in
such circumstances.
L<[perl #47047]|https://rt.perl.org/Public/Bug/Display.html?id=47047>

=item *

The use of C<splice> on arrays with non-existent elements could cause other
operators to crash.
L<[perl #129164]|https://rt.perl.org/Public/Bug/Display.html?id=129164>

=item *

Fixed a possible buffer overrun when a pattern contains a fixed utf8 substring.
L<[perl #129012]|https://rt.perl.org/Public/Bug/Display.html?id=129012>

=item *

Fixed two possible use-after-free bugs in perl's lexer.
L<[perl #129069]|https://rt.perl.org/Public/Bug/Display.html?id=129069>

=item *

Fixed a crash with C<s///l> where it thought it was dealing with UTF-8
when it wasn't.
L<[perl #129038]|https://rt.perl.org/Public/Bug/Display.html?id=129038>

=item *

Fixed a place where the regex parser was not setting the syntax error
correctly on a syntactically incorrect pattern.
L<[perl #129122]|https://rt.perl.org/Public/Bug/Display.html?id=129122>

=item *

The C<&.> operator (and the C<"&"> operator, when it treats its arguments as
strings) were failing to append a trailing null byte if at least one string
was marked as utf8 internally.  Many code paths (system calls, regexp
compilation) still expect there to be a null byte in the string buffer
just past the end of the logical string.  An assertion failure was the
result.
L<[perl #129287]|https://rt.perl.org/Public/Bug/Display.html?id=129287>

=item *

Avoid a heap-use-after-free error in the parser when creating an error
message for a syntactically invalid heredoc.
L<[perl #128988]|https://rt.perl.org/Public/Bug/Display.html?id=128988>

=item *

Fix a segfault when run with C<-DC> options on DEBUGGING builds.
L<[perl #129106]|https://rt.perl.org/Public/Bug/Display.html?id=129106>

=item *

Fixed the parser error handling in subroutine attributes for an
'C<:attr(foo>' that does not have an ending 'C<")">'.

=item *

Fix the perl lexer to correctly handle a backslash as the last char in
quoted-string context. This actually fixed two bugs,
L<[perl #129064]|https://rt.perl.org/Public/Bug/Display.html?id=129064> and
L<[perl #129176]|https://rt.perl.org/Public/Bug/Display.html?id=129176>.

=item *

In the API function C<gv_fetchmethod_pvn_flags>, rework separator parsing
to prevent possible string overrun with an invalid C<len> argument.
L<[perl #129267]|https://rt.perl.org/Public/Bug/Display.html?id=129267>

=item *

Problems with in-place array sorts: code like C<@a = sort { ... } @a>,
where the source and destination of the sort are the same plain array, is
optimised to do less copying around.  Two side-effects of this optimisation
were that the contents of C<@a> as seen by sort routines were
partially sorted; and under some circumstances accessing C<@a> during the
sort could crash the interpreter.  Both these issues have been fixed, and
sort functions see the original value of C<@a>.
L<[perl #128340]|https://rt.perl.org/Public/Bug/Display.html?id=128340>

=item *

Non-ASCII string delimiters are now reported correctly in error messages
for unterminated strings.
L<[perl #128701]|https://rt.perl.org/Public/Bug/Display.html?id=128701>

=item *

C<pack("p", ...)> used to emit its warning ("Attempt to pack pointer to
temporary value") erroneously in some cases, but has been fixed.

=item *

C<@DB::args> is now exempt from "used once" warnings.  The warnings only
occurred under B<-w>, because F<warnings.pm> itself uses C<@DB::args>
multiple times.

=item *

The use of built-in arrays or hash slices in a double-quoted string no
longer issues a warning ("Possible unintended interpolation...") if the
variable has not been mentioned before.  This affected code like
C<qq|@DB::args|> and C<qq|@SIG{'CHLD', 'HUP'}|>.  (The special variables
C<@-> and C<@+> were already exempt from the warning.)

=item *

C<gethostent> and similar functions now perform a null check internally, to
avoid crashing with the torsocks library.  This was a regression from v5.22.
L<[perl #128740]|https://rt.perl.org/Public/Bug/Display.html?id=128740>

=item *

C<defined *{'!'}>, C<defined *{'['}>, and C<defined *{'-'}> no longer leak
memory if the typeglob in question has never been accessed before.

=item *

Mentioning the same constant twice in a row (which is a syntax error) no
longer fails an assertion under debugging builds.  This was a regression
from v5.20.
L<[perl #126482]|https://rt.perl.org/Public/Bug/Display.html?id=126482>

=item *

Many issues relating to C<printf "%a"> of hexadecimal floating point
were fixed.  In addition, the "subnormals" (formerly known as "denormals")
floating point numbers are now supported both with the plain IEEE 754
floating point numbers (64-bit or 128-bit) and the x86 80-bit
"extended precision".  Note that subnormal hexadecimal floating
point literals will give a warning about "exponent underflow".
L<[perl #128843]|https://rt.perl.org/Public/Bug/Display.html?id=128843>
L<[perl #128889]|https://rt.perl.org/Public/Bug/Display.html?id=128889>
L<[perl #128890]|https://rt.perl.org/Public/Bug/Display.html?id=128890>
L<[perl #128893]|https://rt.perl.org/Public/Bug/Display.html?id=128893>
L<[perl #128909]|https://rt.perl.org/Public/Bug/Display.html?id=128909>
L<[perl #128919]|https://rt.perl.org/Public/Bug/Display.html?id=128919>

=item *

A regression in v5.24 with C<tr/\N{U+...}/foo/> when the code point was between
128 and 255 has been fixed.
L<[perl #128734]|https://rt.perl.org/Public/Bug/Display.html?id=128734>.

=item *

Use of a string delimiter whose code point is above 2**31 now works
correctly on platforms that allow this.  Previously, certain characters,
due to truncation, would be confused with other delimiter characters
with special meaning (such as C<"?"> in C<m?...?>), resulting
in inconsistent behaviour.  Note that this is non-portable,
and is based on Perl's extension to UTF-8, and is probably not
displayable nor enterable by any editor.
L<[perl #128738]|https://rt.perl.org/Public/Bug/Display.html?id=128738>

=item *

C<@{x> followed by a newline where C<"x"> represents a control or non-ASCII
character no longer produces a garbled syntax error message or a crash.
L<[perl #128951]|https://rt.perl.org/Public/Bug/Display.html?id=128951>

=item *

An assertion failure with C<%: = 0> has been fixed.
L<[perl #128238]|https://rt.perl.org/Public/Bug/Display.html?id=128238>

=item *

In Perl 5.18, the parsing of C<"$foo::$bar"> was accidentally changed, such
that it would be treated as C<$foo."::".$bar>.  The previous behavior, which
was to parse it as C<$foo:: . $bar>, has been restored.
L<[perl #128478]|https://rt.perl.org/Public/Bug/Display.html?id=128478>

=item *

Since Perl 5.20, line numbers have been off by one when perl is invoked with
the B<-x> switch.  This has been fixed.
L<[perl #128508]|https://rt.perl.org/Public/Bug/Display.html?id=128508>

=item *

Vivifying a subroutine stub in a deleted stash (I<e.g.>,
C<delete $My::{"Foo::"}; \&My::Foo::foo>) no longer crashes.  It had begun
crashing in Perl 5.18.
L<[perl #128532]|https://rt.perl.org/Public/Bug/Display.html?id=128532>

=item *

Some obscure cases of subroutines and file handles being freed at the same time
could result in crashes, but have been fixed.  The crash was introduced in Perl
5.22.
L<[perl #128597]|https://rt.perl.org/Public/Bug/Display.html?id=128597>

=item *

Code that looks for a variable name associated with an uninitialized value
could cause an assertion failure in cases where magic is involved, such as
C<$ISA[0][0]>.  This has now been fixed.
L<[perl #128253]|https://rt.perl.org/Public/Bug/Display.html?id=128253>

=item *

A crash caused by code generating the warning "Subroutine STASH::NAME
redefined" in cases such as C<sub P::f{} undef *P::; *P::f =sub{};> has been
fixed.  In these cases, where the STASH is missing, the warning will now appear
as "Subroutine NAME redefined".
L<[perl #128257]|https://rt.perl.org/Public/Bug/Display.html?id=128257>

=item *

Fixed an assertion triggered by some code that handles deprecated behavior in
formats, I<e.g.>, in cases like this:

    format STDOUT =
    @
    0"$x"

L<[perl #128255]|https://rt.perl.org/Public/Bug/Display.html?id=128255>

=item *

A possible divide by zero in string transformation code on Windows has been
avoided, fixing a crash when collating an empty string.
L<[perl #128618]|https://rt.perl.org/Public/Bug/Display.html?id=128618>

=item *

Some regular expression parsing glitches could lead to assertion failures with
regular expressions such as C</(?E<lt>=/> and C</(?E<lt>!/>.  This has now been fixed.
L<[perl #128170]|https://rt.perl.org/Public/Bug/Display.html?id=128170>

=item *

C< until ($x = 1) { ... } > and C< ... until $x = 1 > now properly
warn when syntax warnings are enabled.
L<[perl #127333]|https://rt.perl.org/Public/Bug/Display.html?id=127333>

=item *

C<socket()> now leaves the error code returned by the system in C<$!> on
failure.
L<[perl #128316]|https://rt.perl.org/Public/Bug/Display.html?id=128316>

=item *

Assignment variants of any bitwise ops under the C<bitwise> feature would
crash if the left-hand side was an array or hash.
L<[perl #128204]|https://rt.perl.org/Public/Bug/Display.html?id=128204>

=item *

C<require> followed by a single colon (as in C<foo() ? require : ...>) is
now parsed correctly as C<require> with implicit C<$_>, rather than
C<require "">.
L<[perl #128307]|https://rt.perl.org/Public/Bug/Display.html?id=128307>

=item *

Scalar C<keys %hash> can now be assigned to consistently in all scalar
lvalue contexts.  Previously it worked for some contexts but not others.

=item *

List assignment to C<vec> or C<substr> with an array or hash for its first
argument used to result in crashes or "Can't coerce" error messages at run
time, unlike scalar assignment, which would give an error at compile time.
List assignment now gives a compile-time error, too.
L<[perl #128260]|https://rt.perl.org/Public/Bug/Display.html?id=128260>

=item *

Expressions containing an C<&&> or C<||> operator (or their synonyms C<and>
and C<or>) were being compiled incorrectly in some cases.  If the left-hand
side consisted of either a negated bareword constant or a negated C<do {}>
block containing a constant expression, and the right-hand side consisted of
a negated non-foldable expression, one of the negations was effectively
ignored.  The same was true of C<if> and C<unless> statement modifiers,
though with the left-hand and right-hand sides swapped.  This long-standing
bug has now been fixed.
L<[perl #127952]|https://rt.perl.org/Public/Bug/Display.html?id=127952>

=item *

C<reset> with an argument no longer crashes when encountering stash entries
other than globs.
L<[perl #128106]|https://rt.perl.org/Public/Bug/Display.html?id=128106>

=item *

Assignment of hashes to, and deletion of, typeglobs named C<*::::::> no
longer causes crashes.
L<[perl #128086]|https://rt.perl.org/Public/Bug/Display.html?id=128086>

=item *

Perl wasn't correctly handling true/false values in the LHS of a list
assign; specifically the truth values returned by boolean operators.
This could trigger an assertion failure in something like the following:

    for ($x > $y) {
        ($_, ...) = (...); # here $_ is aliased to a truth value
    }

This was a regression from v5.24.
L<[perl #129991]|https://rt.perl.org/Public/Bug/Display.html?id=129991>

=item *

Fixed an assertion failure with user-defined Unicode-like properties.
L<[perl #130010]|https://rt.perl.org/Public/Bug/Display.html?id=130010>

=item *

Fix error message for unclosed C<\N{> in a regex.  An unclosed C<\N{>
could give the wrong error message:
C<"\N{NAME} must be resolved by the lexer">.

=item *

List assignment in list context where the LHS contained aggregates and
where there were not enough RHS elements, used to skip scalar lvalues.
Previously, C<(($a,$b,@c,$d) = (1))> in list context returned C<($a)>; now
it returns C<($a,$b,$d)>.  C<(($a,$b,$c) = (1))> is unchanged: it still
returns C<($a,$b,$c)>.  This can be seen in the following:

    sub inc { $_++ for @_ }
    inc(($a,$b,@c,$d) = (10))

Formerly, the values of C<($a,$b,$d)> would be left as C<(11,undef,undef)>;
now they are C<(11,1,1)>.

=item *

Code like this: C</(?{ s!!! })/> could trigger infinite recursion on the C
stack (not the normal perl stack) when the last successful pattern in
scope is itself.  We avoid the segfault by simply forbidding the use of
the empty pattern when it would resolve to the currently executing
pattern.
L<[perl #129903]|https://rt.perl.org/Public/Bug/Display.html?id=129903>

=item *

Avoid reading beyond the end of the line buffer in perl's lexer when
there's a short UTF-8 character at the end.
L<[perl #128997]|https://rt.perl.org/Public/Bug/Display.html?id=128997>

=item *

Alternations in regular expressions were sometimes failing to match
a utf8 string against a utf8 alternate.
L<[perl #129950]|https://rt.perl.org/Public/Bug/Display.html?id=129950>

=item *

Make C<do "a\0b"> fail silently (and return C<undef> and set C<$!>)
instead of throwing an error.
L<[perl #129928]|https://rt.perl.org/Public/Bug/Display.html?id=129928>

=item *

C<chdir> with no argument didn't ensure that there was stack space
available for returning its result.
L<[perl #129130]|https://rt.perl.org/Public/Bug/Display.html?id=129130>

=item *

All error messages related to C<do> now refer to C<do>; some formerly
claimed to be from C<require> instead.

=item *

Executing C<undef $x> where C<$x> is tied or magical no longer incorrectly
blames the variable for an uninitialized-value warning encountered by the
tied/magical code.

=item *

Code like C<$x = $x . "a"> was incorrectly failing to yield a
L<use of uninitialized value|perldiag/"Use of uninitialized value%s">
warning when C<$x> was a lexical variable with an undefined value. That has
now been fixed.
L<[perl #127877]|https://rt.perl.org/Public/Bug/Display.html?id=127877>

=item *

C<undef *_; shift> or C<undef *_; pop> inside a subroutine, with no
argument to C<shift> or C<pop>, began crashing in Perl 5.14, but has now
been fixed.

=item *

C<< "string$scalar-E<gt>$*" >> now correctly prefers concatenation
overloading to string overloading if C<< $scalar-E<gt>$* >> returns an
overloaded object, bringing it into consistency with C<$$scalar>.

=item *

C<< /@0{0*-E<gt>@*/*0 >> and similar contortions used to crash, but now
merely produce a syntax error.
L<[perl #128171]|https://rt.perl.org/Public/Bug/Display.html?id=128171>

=item *

C<do> or C<require> with an argument which is a reference or typeglob
which, when stringified,
contains a null character, started crashing in Perl 5.20, but has now been
fixed.
L<[perl #128182]|https://rt.perl.org/Public/Bug/Display.html?id=128182>

=item *

Improve the error message for a missing C<tie()> package/method. This
brings the error messages in line with the ones used for normal method
calls.

=item *

Parsing bad POSIX charclasses no longer leaks memory.
L<[perl #128313]|https://rt.perl.org/Public/Bug/Display.html?id=128313>

=back
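
As a rough illustration of the C<sort> numeric-comparison fix listed above,
the sketch below sorts two integers that differ by one but round to the same
double.  It assumes a perl built with 64-bit integers and Perl 5.26 or later;
the variable names are made up for the example.

```perl
    use strict;
    use warnings;

    # Two integers just above 2**53 that differ by one.  Converted to a
    # double, both round to the same value, so a comparison done purely in
    # floating point cannot tell them apart.
    # Assumption: this perl was built with 64-bit integers.
    my @input  = (9007199254740993, 9007199254740992);

    # The { $a <=> $b } block is recognised by perl and handled by the
    # built-in numeric comparison that the fix concerns.
    my @sorted = sort { $a <=> $b } @input;

    print "@sorted\n";
```

On a perl with the fix, the smaller integer should come first; on older
perls the two values could be treated as equal and left in input order.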

=head1 Known Problems

=over 4

=item *

G++ 6 handles subnormal (denormal) floating point values differently
than gcc 6 or g++ 5 resulting in "flush-to-zero". The end result is
that if you specify very small values using the hexadecimal floating
point format, like C<0x1.fffffffffffffp-1022>, they become zeros.
L<[perl #131388]|https://rt.perl.org/Ticket/Display.html?id=131388>

=back

=head1 Errata From Previous Releases

=over 4

=item *

Fixed issues with recursive regexes.  The behavior was fixed in Perl 5.24.
L<[perl #126182]|https://rt.perl.org/Public/Bug/Display.html?id=126182>

=back

=head1 Obituary

Jon Portnoy (AVENJ), a prolific Perl author and admired Gentoo community
member, passed away on August 10, 2016.  He will be remembered and
missed by all those with whom he came in contact, and whom he enriched
with his intellect, wit, and spirit.

It is with great sadness that we also note Kip Hampton's passing.  Probably
best known as the author of the Perl & XML column on XML.com, he was a
core contributor to AxKit, an XML server platform that became an Apache
Foundation project.  He was a frequent speaker in the early days at
OSCON, and most recently at YAPC::NA in Madison.  He was frequently on
irc.perl.org as ubu, generally in the #axkit-dahut community, the
group responsible for YAPC::NA Asheville in 2011.

Kip and his constant contributions to the community will be greatly
missed.

=head1 Acknowledgements

Perl 5.26.0 represents approximately 13 months of development since Perl 5.24.0
and contains approximately 360,000 lines of changes across 2,600 files from 86
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 230,000 lines of changes to 1,800 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed the
improvements that became Perl 5.26.0:

Aaron Crane, Abigail, Ævar Arnfjörð Bjarmason, Alex Vandiver, Andreas
König, Andreas Voegele, Andrew Fresh, Andy Lester, Aristotle Pagaltzis, Chad
Granum, Chase Whitener, Chris 'BinGOs' Williams, Chris Lamb, Christian Hansen,
Christian Millour, Colin Newell, Craig A. Berry, Dagfinn Ilmari Mannsåker, Dan
Collins, Daniel Dragan, Dave Cross, Dave Rolsky, David Golden, David H.
Gutteridge, David Mitchell, Dominic Hargreaves, Doug Bell, E. Choroba, Ed Avis,
Father Chrysostomos, François Perrad, Hauke D, H.Merijn Brand, Hugo van der
Sanden, Ivan Pozdeev, James E Keenan, James Raspass, Jarkko Hietaniemi, Jerry
D. Hedden, Jim Cromie, J. Nick Koston, John Lightsey, Karen Etheridge, Karl
Williamson, Leon Timmermans, Lukas Mai, Matthew Horsfall, Maxwell Carey, Misty
De Meo, Neil Bowers, Nicholas Clark, Nicolas R., Niko Tyni, Pali, Paul
Marquess, Peter Avalos, Petr Písař, Pino Toscano, Rafael Garcia-Suarez, Reini
Urban, Renee Baecker, Ricardo Signes, Richard Levitte, Rick Delaney, Salvador
Fandiño, Samuel Thibault, Sawyer X, Sébastien Aperghis-Tramoni, Sergey
Aleynikov, Shlomi Fish, Smylers, Stefan Seifert, Steffen Müller, Stevan
Little, Steve Hay, Steven Humphrey, Sullivan Beck, Theo Buehler, Thomas Sibley,
Todd Rinaldo, Tomasz Konojacki, Tony Cook, Unicode Consortium, Yaroslav Kuzmin,
Yves Orton, Zefram.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the perl bug database at
L<https://rt.perl.org/>.  There may also be information at
L<http://www.perl.org/>, the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to C<perlbug@perl.org> to be analysed by the Perl porting team.

If the bug you are reporting has security implications which make it
inappropriate to send to a publicly archived mailing list, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION>
for details of how to report the issue.

=head1 Give Thanks

If you wish to thank the Perl 5 Porters for the work we have done on Perl 5,
you can do so by running the C<perlthanks> program:

    perlthanks

This will send an email to the Perl 5 Porters list with your show of thanks.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=head1 NAME
X<character class>

perlrecharclass - Perl Regular Expression Character Classes

=head1 DESCRIPTION

The top level documentation about Perl regular expressions
is found in L<perlre>.

This manual page discusses the syntax and use of character
classes in Perl regular expressions.

A character class is a way of denoting a set of characters
in such a way that one character of the set is matched.
It's important to remember that matching a character class
consumes exactly one character in the source string. (The source
string is the string the regular expression is matched against.)

There are three types of character classes in Perl regular
expressions: the dot, backslash sequences, and the form enclosed in square
brackets.  Keep in mind, though, that often the term "character class" is used
to mean just the bracketed form.  Certainly, most Perl documentation does that.
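
A minimal sketch of the three forms side by side, each used to match the
single character C<7>.  The variable names are illustrative only.

```perl
    use strict;
    use warnings;

    my $s = "7";

    # The same one-character string matched by each of the three forms.
    my $via_dot       = $s =~ /^.$/     ? 1 : 0;   # the dot
    my $via_backslash = $s =~ /^\d$/    ? 1 : 0;   # a backslash sequence
    my $via_bracket   = $s =~ /^[0-9]$/ ? 1 : 0;   # the bracketed form

    # A character class consumes exactly one character, so a
    # two-character string cannot satisfy a single anchored class.
    my $two_chars     = "77" =~ /^[0-9]$/ ? 1 : 0;

    print "$via_dot $via_backslash $via_bracket $two_chars\n";
```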

=head2 The dot

The dot (or period), C<.>, is probably the most used, and certainly
the best-known, character class. By default, a dot matches any
character, except for the newline. That default can be changed to
add matching the newline by using the I<single line> modifier:
for the entire regular expression with the C</s> modifier, or
locally with C<(?s)>  (and even globally within the scope of
L<C<use re '/s'>|re/'E<sol>flags' mode>).  (The C<L</\N>> backslash
sequence, described
below, matches any character except newline without regard to the
I<single line> modifier.)

Here are some examples:

 "a"  =~  /./       # Match
 "."  =~  /./       # Match
 ""   =~  /./       # No match (dot has to match a character)
 "\n" =~  /./       # No match (dot does not match a newline)
 "\n" =~  /./s      # Match (global 'single line' modifier)
 "\n" =~  /(?s:.)/  # Match (local 'single line' modifier)
 "ab" =~  /^.$/     # No match (dot matches one character)

=head2 Backslash sequences
X<\w> X<\W> X<\s> X<\S> X<\d> X<\D> X<\p> X<\P>
X<\N> X<\v> X<\V> X<\h> X<\H>
X<word> X<whitespace>

A backslash sequence is a sequence of characters, the first one of which is a
backslash.  Perl ascribes special meaning to many such sequences, and some of
these are character classes.  That is, they match a single character each,
provided that the character belongs to the specific set of characters defined
by the sequence.

Here's a list of the backslash sequences that are character classes.  They
are discussed in more detail below.  (For the backslash sequences that aren't
character classes, see L<perlrebackslash>.)

 \d             Match a decimal digit character.
 \D             Match a non-decimal-digit character.
 \w             Match a "word" character.
 \W             Match a non-"word" character.
 \s             Match a whitespace character.
 \S             Match a non-whitespace character.
 \h             Match a horizontal whitespace character.
 \H             Match a character that isn't horizontal whitespace.
 \v             Match a vertical whitespace character.
 \V             Match a character that isn't vertical whitespace.
 \N             Match a character that isn't a newline.
 \pP, \p{Prop}  Match a character that has the given Unicode property.
 \PP, \P{Prop}  Match a character that doesn't have the Unicode property

=head3 \N

C<\N>, available starting in v5.12, like the dot, matches any
character that is not a newline. The difference is that C<\N> is not influenced
by the I<single line> regular expression modifier (see L</The dot> above).  Note
that the form C<\N{...}> may mean something completely different.  When the
C<{...}> is a L<quantifier|perlre/Quantifiers>, it means to match a non-newline
character that many times.  For example, C<\N{3}> means to match 3
non-newlines; C<\N{5,}> means to match 5 or more non-newlines.  But if C<{...}>
is not a legal quantifier, it is presumed to be a named character.  See
L<charnames> for those.  For example, none of C<\N{COLON}>, C<\N{4F}>, and
C<\N{F4}> contain legal quantifiers, so Perl will try to find characters whose
names are respectively C<COLON>, C<4F>, and C<F4>.
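
The distinctions above can be sketched as follows.  This assumes a perl
recent enough to load character names automatically (v5.16 or later); the
variable names are illustrative only.

```perl
    use strict;
    use warnings;

    # Unlike the dot, \N is not affected by the /s modifier.
    my $dot_s = "\n" =~ /./s  ? 1 : 0;   # dot under /s accepts a newline
    my $N_s   = "\n" =~ /\N/s ? 1 : 0;   # \N still rejects it

    # {3} is a legal quantifier, so \N{3} means three non-newlines.
    my $three_plain   = "abc"   =~ /^\N{3}$/ ? 1 : 0;
    my $three_newline = "ab\nc" =~ /^\N{3}$/ ? 1 : 0;

    # {COLON} is not a legal quantifier, so \N{COLON} is a named character.
    my $colon = ":" =~ /^\N{COLON}$/ ? 1 : 0;

    print "$dot_s $N_s $three_plain $three_newline $colon\n";
```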

=head3 Digits

C<\d> matches a single character considered to be a decimal I<digit>.
If the C</a> regular expression modifier is in effect, it matches [0-9].
Otherwise, it
matches anything that is matched by C<\p{Digit}>, which includes [0-9].
(An unlikely possible exception is that under locale matching rules, the
current locale might not have C<[0-9]> matched by C<\d>, and/or might match
other characters whose code point is less than 256.  The only such locale
definitions that are legal would be to match C<[0-9]> plus another set of
10 consecutive digit characters;  anything else would be in violation of
the C language standard, but Perl doesn't currently assume anything in
regard to this.)

What this means is that unless the C</a> modifier is in effect, C<\d>
matches not only the digits '0' - '9', but also Arabic, Devanagari, and
digits from other languages.  This may cause some confusion, and some
security issues.
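
A short sketch of the difference, using DEVANAGARI DIGIT THREE (U+0969) as
the non-ASCII digit.  The variable names are illustrative only.

```perl
    use strict;
    use warnings;

    my $devanagari_three = "\x{0969}";   # DEVANAGARI DIGIT THREE

    # Without /a, \d follows \p{Digit} and accepts the Devanagari digit.
    my $default_rules = $devanagari_three =~ /^\d$/  ? 1 : 0;

    # With /a, \d is restricted to [0-9].
    my $ascii_rules   = $devanagari_three =~ /^\d$/a ? 1 : 0;
    my $ascii_digit   = "3"               =~ /^\d$/a ? 1 : 0;

    print "$default_rules $ascii_rules $ascii_digit\n";
```

An application that only expects ASCII digits may therefore prefer C</a>
or an explicit C<[0-9]>.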

Some digits that C<\d> matches look like some of the [0-9] ones, but
have different values.  For example, BENGALI DIGIT FOUR (U+09EA) looks
very much like an ASCII DIGIT EIGHT (U+0038).  An application that
is expecting only the ASCII digits might be misled, or if the match is
C<\d+>, the matched string might contain a mixture of digits from
different writing systems that look like they signify a number different
than they actually do.  L<Unicode::UCD/num()> can
be used to safely
calculate the value, returning C<undef> if the input string contains
such a mixture.
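
A minimal sketch of using C<num()> from L<Unicode::UCD> to reject such
mixtures.  The sample strings and variable names are made up for the
example; C<num()> is called in scalar context throughout.

```perl
    use strict;
    use warnings;
    use Unicode::UCD qw(num);

    # All ASCII digits: a safe, unambiguous number.
    my $ascii_value   = num("123");

    # BENGALI DIGIT FOUR followed by BENGALI DIGIT FIVE: one script only.
    my $bengali_value = num("\x{09EA}\x{09EB}");

    # ASCII "1" followed by BENGALI DIGIT FOUR: a mixture of scripts,
    # for which num() returns undef.
    my $mixed_value   = num("1\x{09EA}");

    printf "%s %s %s\n",
        $ascii_value, $bengali_value,
        defined $mixed_value ? $mixed_value : 'undef';
```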

What C<\p{Digit}> means (and hence C<\d> except under the C</a>
modifier) is C<\p{General_Category=Decimal_Number}>, or synonymously,
C<\p{General_Category=Digit}>.  Starting with Unicode version 4.1, this
is the same set of characters matched by C<\p{Numeric_Type=Decimal}>.
But Unicode also has a different property with a similar name,
C<\p{Numeric_Type=Digit}>, which matches a completely different set of
characters.  These characters are things such as C<CIRCLED DIGIT ONE>
or subscripts, or are from writing systems that lack all ten digits.
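
The difference between the two similarly named properties can be sketched
with CIRCLED DIGIT ONE (U+2460).  The variable names are illustrative only.

```perl
    use strict;
    use warnings;

    my $circled_one = "\x{2460}";   # CIRCLED DIGIT ONE
    my $seven       = "7";

    # CIRCLED DIGIT ONE has Numeric_Type=Digit but is not a decimal digit.
    my $circled_is_d       = $circled_one =~ /^\d$/                       ? 1 : 0;
    my $circled_is_nt_di   = $circled_one =~ /^\p{Numeric_Type=Digit}$/   ? 1 : 0;

    # An ordinary decimal digit is the other way round.
    my $seven_is_d         = $seven =~ /^\d$/                             ? 1 : 0;
    my $seven_is_nt_de     = $seven =~ /^\p{Numeric_Type=Decimal}$/       ? 1 : 0;
    my $seven_is_nt_di     = $seven =~ /^\p{Numeric_Type=Digit}$/         ? 1 : 0;

    print "$circled_is_d $circled_is_nt_di ",
          "$seven_is_d $seven_is_nt_de $seven_is_nt_di\n";
```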

The design intent is for C<\d> to exactly match the set of characters
that can safely be used with "normal" big-endian positional decimal
syntax, where, for example 123 means one 'hundred', plus two 'tens',
plus three 'ones'.  This positional notation does not necessarily apply
to characters that match the other type of "digit",
C<\p{Numeric_Type=Digit}>, and so C<\d> doesn't match them.

The Tamil digits (U+0BE6 - U+0BEF) can also legally be
used in old-style Tamil numbers in which they would appear no more than
one in a row, separated by characters that mean "times 10", "times 100",
etc.  (See L<http://www.unicode.org/notes/tn21>.)

Any character not matched by C<\d> is matched by C<\D>.

=head3 Word characters

A C<\w> matches a single alphanumeric character (an alphabetic character, or a
decimal digit); or a connecting punctuation character, such as an
underscore ("_"); or a "mark" character (like some sort of accent) that
attaches to one of those.  It does not match a whole word.  To match a
whole word, use C<\w+>.  This isn't the same thing as matching an
English word, but in the ASCII range it is the same as a string of
Perl-identifier characters.

=over

=item If the C</a> modifier is in effect ...

C<\w> matches the 63 characters [a-zA-Z0-9_].

=item otherwise ...

=over

=item For code points above 255 ...

C<\w> matches the same as C<\p{Word}> matches in this range.  That is,
it matches Thai letters, Greek letters, etc.  This includes connector
punctuation (like the underscore) which connect two words together, or
diacritics, such as a C<COMBINING TILDE> and the modifier letters, which
are generally used to add auxiliary markings to letters.

=item For code points below 256 ...

=over

=item if locale rules are in effect ...

C<\w> matches the platform's native underscore character plus whatever
the locale considers to be alphanumeric.

=item if, instead, Unicode rules are in effect ...

C<\w> matches exactly what C<\p{Word}> matches.

=item otherwise ...

C<\w> matches [a-zA-Z0-9_].

=back

=back

=back

Which rules apply are determined as described in L<perlre/Which character set modifier is in effect?>.

There are a number of security issues with the full Unicode list of word
characters.  See L<http://unicode.org/reports/tr36>.

Also, for a somewhat finer-grained set of characters that are in programming
language identifiers beyond the ASCII range, you may wish to instead use the
more customized L</Unicode Properties>, C<\p{ID_Start}>,
C<\p{ID_Continue}>, C<\p{XID_Start}>, and C<\p{XID_Continue}>.  See
L<http://unicode.org/reports/tr31>.

Any character not matched by C<\w> is matched by C<\W>.
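
The effect of the rules in force can be sketched as follows.  This
assumes an ASCII platform and Perl v5.14 or later (for the C</a> and
C</u> modifiers); the variable names are illustrative only:

 use strict;
 use warnings;

 my $e_acute = "\xE9";      # LATIN SMALL LETTER E WITH ACUTE (< 256)
 my $alpha   = "\x{3B1}";   # GREEK SMALL LETTER ALPHA (> 255)

 my $e_ascii   = $e_acute =~ /\w/a ? 1 : 0;  # 0: /a means [a-zA-Z0-9_]
 my $e_unicode = $e_acute =~ /\w/u ? 1 : 0;  # 1: /u means \p{Word}
 my $alpha_any = $alpha   =~ /\w/  ? 1 : 0;  # 1: above 255, \p{Word}
 my $alpha_a   = $alpha   =~ /\w/a ? 1 : 0;  # 0: /a excludes non-ASCII

 print "e-acute: /a=$e_ascii /u=$e_unicode\n";
 print "alpha:   default=$alpha_any /a=$alpha_a\n";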

=head3 Whitespace

C<\s> matches any single character considered whitespace.

=over

=item If the C</a> modifier is in effect ...

In all Perl versions, C<\s> matches the 5 characters [\t\n\f\r ]; that
is, the horizontal tab,
the newline, the form feed, the carriage return, and the space.
Starting in Perl v5.18, it also matches the vertical tab, C<\cK>.
See note C<[1]> below for a discussion of this.

=item otherwise ...

=over

=item For code points above 255 ...

C<\s> matches exactly the code points above 255 shown with an "s" column
in the table below.

=item For code points below 256 ...

=over

=item if locale rules are in effect ...

C<\s> matches whatever the locale considers to be whitespace.

=item if, instead, Unicode rules are in effect ...

C<\s> matches exactly the characters shown with an "s" column in the
table below.

=item otherwise ...

C<\s> matches [\t\n\f\r ] and, starting in Perl
v5.18, the vertical tab, C<\cK>.
(See note C<[1]> below for a discussion of this.)
Note that this list doesn't include the non-breaking space.

=back

=back

=back

Which rules apply are determined as described in L<perlre/Which character set modifier is in effect?>.

Any character not matched by C<\s> is matched by C<\S>.

C<\h> matches any character considered horizontal whitespace;
this includes the platform's space and tab characters and several others
listed in the table below.  C<\H> matches any character
not considered horizontal whitespace.  They use the platform's native
character set, and do not consider any locale that may otherwise be in
use.

C<\v> matches any character considered vertical whitespace;
this includes the platform's carriage return and line feed characters (newline)
plus several other characters, all listed in the table below.
C<\V> matches any character not considered vertical whitespace.
They use the platform's native character set, and do not consider any
locale that may otherwise be in use.

C<\R> matches anything that can be considered a newline under Unicode
rules. It can match a multi-character sequence. It cannot be used inside
a bracketed character class; use C<\v> instead (vertical whitespace).
It uses the platform's
native character set, and does not consider any locale that may
otherwise be in use.
Details are discussed in L<perlrebackslash>.

Note that unlike C<\s> (and C<\d> and C<\w>), C<\h> and C<\v> always match
the same characters, without regard to other factors, such as the active
locale or whether the source string is in UTF-8 format.

One might think that C<\s> is equivalent to C<[\h\v]>. This is indeed true
starting in Perl v5.18, but prior to that, the sole difference was that the
vertical tab (C<"\cK">) was not matched by C<\s>.
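
A brief sketch of how these classes differ, assuming an ASCII platform
and Perl v5.14 or later (for C</a>); the sample characters and variable
names are chosen only for illustration:

 use strict;
 use warnings;

 my $nbsp     = "\xA0";       # NO-BREAK SPACE
 my $line_sep = "\x{2028}";   # LINE SEPARATOR
 my $crlf     = "\r\n";       # CARRIAGE RETURN followed by LINE FEED

 my $nbsp_h   = $nbsp     =~ /\h/  ? 1 : 0;   # 1: always horizontal
 my $nbsp_s_a = $nbsp     =~ /\s/a ? 1 : 0;   # 0: /a keeps \s ASCII-only
 my $ls_v     = $line_sep =~ /\v/  ? 1 : 0;   # 1: vertical whitespace
 my $ls_h     = $line_sep =~ /\h/  ? 1 : 0;   # 0: not horizontal

 # \R can consume the two-character CR LF sequence as a unit, whereas
 # \v, like any character class, matches exactly one character.
 my $crlf_R = $crlf =~ /\A\R\z/ ? 1 : 0;      # 1
 my $crlf_v = $crlf =~ /\A\v\z/ ? 1 : 0;      # 0

 print "nbsp: \\h=$nbsp_h \\s/a=$nbsp_s_a\n";
 print "line separator: \\v=$ls_v \\h=$ls_h\n";
 print "CRLF: \\R=$crlf_R \\v=$crlf_v\n";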

The following table is a complete listing of characters matched by
C<\s>, C<\h> and C<\v> as of Unicode 6.3.

The first column gives the Unicode code point of the character (in hex format),
the second column gives the (Unicode) name. The third column indicates
by which class(es) the character is matched (assuming no locale is in
effect that changes the C<\s> matching).

 0x0009        CHARACTER TABULATION   h s
 0x000a              LINE FEED (LF)    vs
 0x000b             LINE TABULATION    vs  [1]
 0x000c              FORM FEED (FF)    vs
 0x000d        CARRIAGE RETURN (CR)    vs
 0x0020                       SPACE   h s
 0x0085             NEXT LINE (NEL)    vs  [2]
 0x00a0              NO-BREAK SPACE   h s  [2]
 0x1680            OGHAM SPACE MARK   h s
 0x2000                     EN QUAD   h s
 0x2001                     EM QUAD   h s
 0x2002                    EN SPACE   h s
 0x2003                    EM SPACE   h s
 0x2004          THREE-PER-EM SPACE   h s
 0x2005           FOUR-PER-EM SPACE   h s
 0x2006            SIX-PER-EM SPACE   h s
 0x2007                FIGURE SPACE   h s
 0x2008           PUNCTUATION SPACE   h s
 0x2009                  THIN SPACE   h s
 0x200a                  HAIR SPACE   h s
 0x2028              LINE SEPARATOR    vs
 0x2029         PARAGRAPH SEPARATOR    vs
 0x202f       NARROW NO-BREAK SPACE   h s
 0x205f   MEDIUM MATHEMATICAL SPACE   h s
 0x3000           IDEOGRAPHIC SPACE   h s

=over 4

=item [1]

Prior to Perl v5.18, C<\s> did not match the vertical tab.
C<[^\S\cK]> (obscurely) matches what C<\s> traditionally did.

=item [2]

NEXT LINE and NO-BREAK SPACE may or may not match C<\s> depending
on the rules in effect.  See
L<the beginning of this section|/Whitespace>.

=back

=head3 Unicode Properties

C<\pP> and C<\p{Prop}> are character classes to match characters that fit given
Unicode properties.  One-letter property names can be used in the C<\pP> form,
with the property name following the C<\p>; otherwise, braces are required.
When using braces, there is a single form, which is just the property name
enclosed in the braces, and a compound form which looks like C<\p{name=value}>,
which means to match if the property "name" for the character has that particular
"value".
For instance, a match for a number can be written as C</\pN/> or as
C</\p{Number}/>, or as C</\p{Number=True}/>.
Lowercase letters are matched by the property I<Lowercase_Letter> which
has the short form I<Ll>. They need the braces, so are written as C</\p{Ll}/> or
C</\p{Lowercase_Letter}/>, or C</\p{General_Category=Lowercase_Letter}/>
(the underscores are optional).
C</\pLl/> is valid, but means something different.
It matches a two character string: a letter (Unicode property C<\pL>),
followed by a lowercase C<l>.

If locale rules are not in effect, the use of
a Unicode property will force the regular expression into using Unicode
rules, if it isn't already.

Note that almost all properties are immune to case-insensitive matching.
That is, adding a C</i> regular expression modifier does not change what
they match.  There are two sets that are affected.  The first set is
C<Uppercase_Letter>,
C<Lowercase_Letter>,
and C<Titlecase_Letter>,
all of which match C<Cased_Letter> under C</i> matching.
The second set is
C<Uppercase>,
C<Lowercase>,
and C<Titlecase>,
all of which match C<Cased> under C</i> matching.
(The difference between these sets is that some things, such as Roman
numerals, come in both upper and lower case, so they are C<Cased>, but
aren't considered to be letters, so they aren't C<Cased_Letter>s. They're
actually C<Letter_Number>s.)
This set also includes its subsets C<PosixUpper> and C<PosixLower>, both
of which under C</i> match C<PosixAlpha>.
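
The brace requirement and the C</i> exception described above can be
checked with a short sketch (the variable names are illustrative only):

 use strict;
 use warnings;

 # \p{Ll} is one class; \pLl is the class \pL followed by a literal "l".
 my $brace_form = "a"  =~ /\A\p{Ll}\z/ ? 1 : 0;   # 1
 my $two_chars  = "al" =~ /\A\pLl\z/   ? 1 : 0;   # 1
 my $one_char   = "a"  =~ /\A\pLl\z/   ? 1 : 0;   # 0: the "l" is missing

 # Under /i, Lowercase_Letter matches what Cased_Letter matches.
 my $upper_plain = "A" =~ /\p{Ll}/  ? 1 : 0;      # 0
 my $upper_fold  = "A" =~ /\p{Ll}/i ? 1 : 0;      # 1
 my $digit_fold  = "1" =~ /\p{Ll}/i ? 1 : 0;      # 0: not a cased letter

 print "brace=$brace_form two=$two_chars one=$one_char\n";
 print "A: plain=$upper_plain /i=$upper_fold; 1: /i=$digit_fold\n";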

For more details on Unicode properties, see L<perlunicode/Unicode
Character Properties>; for a
complete list of possible properties, see
L<perluniprops/Properties accessible through \p{} and \P{}>,
which notes all forms that have C</i> differences.
It is also possible to define your own properties. This is discussed in
L<perlunicode/User-Defined Character Properties>.

Unicode properties are defined (surprise!) only on Unicode code points.
Starting in v5.20, when matching against C<\p> and C<\P>, Perl treats
non-Unicode code points (those above the legal Unicode maximum of
0x10FFFF) as if they were typical unassigned Unicode code points.

Prior to v5.20, Perl raised a warning and made all matches fail on
non-Unicode code points.  This could be somewhat surprising:

 chr(0x110000) =~ /\p{ASCII_Hex_Digit=True}/   # Fails on Perls < v5.20.
 chr(0x110000) =~ /\p{ASCII_Hex_Digit=False}/  # Also fails on Perls
                                               # < v5.20

Even though these two matches might be thought of as complements, until
v5.20 they were so only on Unicode code points.

=head4 Examples

 "a"  =~  /\w/      # Match, "a" is a 'word' character.
 "7"  =~  /\w/      # Match, "7" is a 'word' character as well.
 "a"  =~  /\d/      # No match, "a" isn't a digit.
 "7"  =~  /\d/      # Match, "7" is a digit.
 " "  =~  /\s/      # Match, a space is whitespace.
 "a"  =~  /\D/      # Match, "a" is a non-digit.
 "7"  =~  /\D/      # No match, "7" is not a non-digit.
 " "  =~  /\S/      # No match, a space is not non-whitespace.

 " "  =~  /\h/      # Match, space is horizontal whitespace.
 " "  =~  /\v/      # No match, space is not vertical whitespace.
 "\r" =~  /\v/      # Match, a return is vertical whitespace.

 "a"  =~  /\pL/     # Match, "a" is a letter.
 "a"  =~  /\p{Lu}/  # No match, /\p{Lu}/ matches upper case letters.

 "\x{0e0b}" =~ /\p{Thai}/  # Match, \x{0e0b} is the character
                           # 'THAI CHARACTER SO SO', and that's in
                           # Thai Unicode class.
 "a"  =~  /\P{Lao}/ # Match, as "a" is not a Laotian character.

It is worth emphasizing that C<\d>, C<\w>, etc., match single characters, not
complete numbers or words. To match a number (that consists of digits),
use C<\d+>; to match a word, use C<\w+>.  But be aware of the security
considerations in doing so, as mentioned above.

=head2 Bracketed Character Classes

The third form of character class you can use in Perl regular expressions
is the bracketed character class.  In its simplest form, it lists the characters
that may be matched, surrounded by square brackets, like this: C<[aeiou]>.
This matches one of C<a>, C<e>, C<i>, C<o> or C<u>.  Like the other
character classes, exactly one character is matched.* To match
a longer string consisting of characters mentioned in the character
class, follow the character class with a L<quantifier|perlre/Quantifiers>.  For
instance, C<[aeiou]+> matches one or more lowercase English vowels.

Repeating a character in a character class has no
effect; it's considered to be in the set only once.

Examples:

 "e"  =~  /[aeiou]/        # Match, as "e" is listed in the class.
 "p"  =~  /[aeiou]/        # No match, "p" is not listed in the class.
 "ae" =~  /^[aeiou]$/      # No match, a character class only matches
                           # a single character.
 "ae" =~  /^[aeiou]+$/     # Match, due to the quantifier.

 -------

* There are two exceptions to a bracketed character class matching a
single character only.  Each requires special handling by Perl to make
things work:

=over

=item *

When the class is to match caselessly under C</i> matching rules, and a
character that is explicitly mentioned inside the class matches a
multiple-character sequence caselessly under Unicode rules, the class
will also match that sequence.  For example, Unicode says that the
letter C<LATIN SMALL LETTER SHARP S> should match the sequence C<ss>
under C</i> rules.  Thus,

 'ss' =~ /\A\N{LATIN SMALL LETTER SHARP S}\z/i             # Matches
 'ss' =~ /\A[aeioust\N{LATIN SMALL LETTER SHARP S}]\z/i    # Matches

For this to happen, the class must not be inverted (see L</Negation>)
and the character must be explicitly specified, and not be part of a
multi-character range (not even as one of its endpoints).  (L</Character
Ranges> will be explained shortly.) Therefore,

 'ss' =~ /\A[\0-\x{ff}]\z/ui       # Doesn't match
 'ss' =~ /\A[\0-\N{LATIN SMALL LETTER SHARP S}]\z/ui   # No match
 'ss' =~ /\A[\xDF-\xDF]\z/ui   # Matches on ASCII platforms, since
                               # \xDF is LATIN SMALL LETTER SHARP S,
                               # and the range is just a single
                               # element

Note that it isn't a good idea to specify these types of ranges anyway.

=item *

Some names known to C<\N{...}> refer to a sequence of multiple characters,
instead of the usual single character.  When one of these is included in
the class, the entire sequence is matched.  For example,

  "\N{TAMIL LETTER KA}\N{TAMIL VOWEL SIGN AU}"
                              =~ / ^ [\N{TAMIL SYLLABLE KAU}]  $ /x;

matches, because C<\N{TAMIL SYLLABLE KAU}> is a named sequence
consisting of the two characters matched against.  Like the other
instance where a bracketed class can match multiple characters, and for
similar reasons, the class must not be inverted, and the named sequence
may not appear in a range, even one where it is both endpoints.  If
these happen, it is a fatal error if the character class is within the
scope of L<C<use re 'strict'>|re/'strict' mode>, or within an extended
L<C<(?[...])>|/Extended Bracketed Character Classes> class; otherwise
only the first code point is used (with a C<regexp>-type warning
raised).

=back

=head3 Special Characters Inside a Bracketed Character Class

Most characters that are meta characters in regular expressions (that
is, characters that carry a special meaning like C<.>, C<*>, or C<(>) lose
their special meaning and can be used inside a character class without
the need to escape them. For instance, C<[()]> matches either an opening
parenthesis, or a closing parenthesis, and the parens inside the character
class don't group or capture.  Be aware that, unless the pattern is
evaluated in single-quotish context, variable interpolation will take
place before the bracketed class is parsed:

 $, = "\t| ";
 $a =~ m'[$,]';        # single-quotish: matches '$' or ','
 $a =~ q{[$,]};        # same
 $a =~ m/[$,]/;        # double-quotish: matches "\t", "|", or " "

Characters that may carry a special meaning inside a character class are:
C<\>, C<^>, C<->, C<[> and C<]>, and are discussed below. They can be
escaped with a backslash, although this is sometimes not needed, in which
case the backslash may be omitted.

The sequence C<\b> is special inside a bracketed character class. While
outside the character class, C<\b> is an assertion indicating a point
that does not have either two word characters or two non-word characters
on either side, inside a bracketed character class, C<\b> matches a
backspace character.

The sequences
C<\a>,
C<\c>,
C<\e>,
C<\f>,
C<\n>,
C<\N{I<NAME>}>,
C<\N{U+I<hex char>}>,
C<\r>,
C<\t>,
and
C<\x>
are also special and have the same meanings as they do outside a
bracketed character class.

Also, a backslash followed by two or three octal digits is considered an octal
number.

A C<[> is not special inside a character class, unless it's the start of a
POSIX character class (see L</POSIX Character Classes> below). It normally does
not need escaping.

A C<]> is normally either the end of a POSIX character class (see
L</POSIX Character Classes> below), or it signals the end of the bracketed
character class.  If you want to include a C<]> in the set of characters, you
must generally escape it.

However, if the C<]> is the I<first> (or the second if the first
character is a caret) character of a bracketed character class, it
does not denote the end of the class (as you cannot have an empty class)
and is considered part of the set of characters that can be matched without
escaping.

Examples:

 "+"   =~ /[+?*]/     #  Match, "+" in a character class is not special.
 "\cH" =~ /[\b]/      #  Match, \b inside a character class
                      #  is equivalent to a backspace.
 "]"   =~ /[][]/      #  Match, as the character class contains
                      #  both [ and ].
 "[]"  =~ /[[]]/      #  Match, the pattern contains a character class
                      #  containing just [, and the character class is
                      #  followed by a ].

=head3 Bracketed Character Classes and the C</xx> pattern modifier

Normally SPACE and TAB characters have no special meaning inside a
bracketed character class; they are just added to the list of characters
matched by the class.  But if the L<C</xx>|perlre/E<sol>x and E<sol>xx>
pattern modifier is in effect, they are generally ignored and can be
added to improve readability.  They can't be added in the middle of a
single construct:

 / [ \x{10 FFFF} ] /xx  # WRONG!

The SPACE in the middle of the hex constant is illegal.

To specify a literal SPACE character, you can escape it with a
backslash, like:

 /[ a e i o u \  ]/xx

This matches the English vowels plus the SPACE character.

For clarity, you should already have been using C<\t> to specify a
literal tab, and C<\t> is unaffected by C</xx>.
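
A small sketch of the behavior just described, assuming Perl v5.26 or
later (the release in which C</xx> became available); the variable names
are illustrative only:

 use v5.26;       # /xx is not available in earlier Perls
 use warnings;

 my $vowels       = qr/[ a e i o u ]/xx;      # blanks are ignored
 my $vowels_space = qr/[ a e i o u \  ]/xx;   # "\ " is a literal SPACE

 my $e_matches     = "e" =~ $vowels       ? 1 : 0;   # 1
 my $space_ignored = " " =~ $vowels       ? 1 : 0;   # 0
 my $space_escaped = " " =~ $vowels_space ? 1 : 0;   # 1

 say "e=$e_matches space=$space_ignored escaped=$space_escaped";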

=head3 Character Ranges

It is not uncommon to want to match a range of characters. Luckily, instead
of listing all characters in the range, one may use the hyphen (C<->).
If inside a bracketed character class you have two characters separated
by a hyphen, it's treated as if all characters between the two were in
the class. For instance, C<[0-9]> matches any ASCII digit, and C<[a-m]>
matches any lowercase letter from the first half of the ASCII alphabet.

Note that the two characters on either side of the hyphen are not
necessarily both letters or both digits. Any character is possible,
although not advisable.  C<['-?]> contains a range of characters, but
most people will not know which characters that means.  Furthermore,
such ranges may lead to portability problems if the code has to run on
a platform that uses a different character set, such as EBCDIC.

If a hyphen in a character class cannot syntactically be part of a range, for
instance because it is the first or the last character of the character class,
or if it immediately follows a range, the hyphen isn't special, and so is
considered a character to be matched literally.  If you want a hyphen in
your set of characters to be matched and its position in the class is such
that it could be considered part of a range, you must escape that hyphen
with a backslash.

Examples:

 [a-z]       #  Matches a character that is a lower case ASCII letter.
 [a-fz]      #  Matches any letter between 'a' and 'f' (inclusive) or
             #  the letter 'z'.
 [-z]        #  Matches either a hyphen ('-') or the letter 'z'.
 [a-f-m]     #  Matches any letter between 'a' and 'f' (inclusive), the
             #  hyphen ('-'), or the letter 'm'.
 ['-?]       #  Matches any of the characters  '()*+,-./0123456789:;<=>?
             #  (But not on an EBCDIC platform).
 [\N{APOSTROPHE}-\N{QUESTION MARK}]
             #  Matches any of the characters  '()*+,-./0123456789:;<=>?
             #  even on an EBCDIC platform.
 [\N{U+27}-\N{U+3F}] # Same. (U+27 is "'", and U+3F is "?")

As the final two examples above show, you can achieve portability to
non-ASCII platforms by using the C<\N{...}> form for the range
endpoints.  These indicate that the specified range is to be interpreted
using Unicode values, so C<[\N{U+27}-\N{U+3F}]> means to match
C<\N{U+27}>, C<\N{U+28}>, C<\N{U+29}>, ..., C<\N{U+3D}>, C<\N{U+3E}>,
and C<\N{U+3F}>, whatever the native code point versions for those are.
These are called "Unicode" ranges.  If either end is of the C<\N{...}>
form, the range is considered Unicode.  A C<regexp> warning is raised
under C<S<"use re 'strict'">> if the other endpoint is specified
non-portably:

 [\N{U+00}-\x09]    # Warning under re 'strict'; \x09 is non-portable
 [\N{U+00}-\t]      # No warning;

Both of the above match the characters C<\N{U+00}>, C<\N{U+01}>, ...
C<\N{U+08}>, C<\N{U+09}>, but the C<\x09> looks like it could be a
mistake so the warning is raised (under C<re 'strict'>) for it.

Perl also guarantees that the ranges C<A-Z>, C<a-z>, C<0-9>, and any
subranges of these match what an English-only speaker would expect them
to match on any platform.  That is, C<[A-Z]> matches the 26 ASCII
uppercase letters;
C<[a-z]> matches the 26 lowercase letters; and C<[0-9]> matches the 10
digits.  Subranges, like C<[h-k]>, match correspondingly, in this case
just the four letters C<"h">, C<"i">, C<"j">, and C<"k">.  This is the
natural behavior on ASCII platforms where the code points (ordinal
values) for C<"h"> through C<"k"> are consecutive integers (0x68 through
0x6B).  But special handling to achieve this may be needed on platforms
with a non-ASCII native character set.  For example, on EBCDIC
platforms, the code point for C<"h"> is 0x88, C<"i"> is 0x89, C<"j"> is
0x91, and C<"k"> is 0x92.   Perl specially treats C<[h-k]> to exclude the
seven code points in the gap: 0x8A through 0x90.  This special handling is
only invoked when the range is a subrange of one of the ASCII uppercase,
lowercase, and digit ranges, AND each end of the range is expressed
either as a literal, like C<"A">, or as a named character (C<\N{...}>,
including the C<\N{U+...}> form).

EBCDIC Examples:

 [i-j]               #  Matches either "i" or "j"
 [i-\N{LATIN SMALL LETTER J}]  # Same
 [i-\N{U+6A}]        #  Same
 [\N{U+69}-\N{U+6A}] #  Same
 [\x{89}-\x{91}]     #  Matches 0x89 ("i"), 0x8A .. 0x90, 0x91 ("j")
 [i-\x{91}]          #  Same
 [\x{89}-j]          #  Same
 [i-J]               #  Matches 0x89 ("i") .. 0xD1 ("J"); special
                     #  handling doesn't apply because range is mixed
                     #  case

=head3 Negation

It is also possible to instead list the characters you do not want to
match. You can do so by using a caret (C<^>) as the first character in the
character class. For instance, C<[^a-z]> matches any character that is not a
lowercase ASCII letter, which therefore includes more than a million
Unicode code points.  The class is said to be "negated" or "inverted".

This syntax makes the caret a special character inside a bracketed character
class, but only if it is the first character of the class. So if you want
the caret as one of the characters to match, either escape the caret or
else don't list it first.

In inverted bracketed character classes, Perl ignores the Unicode rules
that normally say that named sequences, and certain characters under
caseless C</i> matching, should match a sequence of multiple characters.
Following those rules could lead to highly confusing situations:

 "ss" =~ /^[^\xDF]+$/ui;   # Matches!

This should match any sequences of characters that aren't C<\xDF> nor
what C<\xDF> matches under C</i>.  C<"s"> isn't C<\xDF>, but Unicode
says that C<"ss"> is what C<\xDF> matches under C</i>.  So which one
"wins"? Do you fail the match because the string has C<ss> or accept it
because it has an C<s> followed by another C<s>?  Perl has chosen the
latter.  (See note in L</Bracketed Character Classes> above.)

Examples:

 "e"  =~  /[^aeiou]/   #  No match, the 'e' is listed.
 "x"  =~  /[^aeiou]/   #  Match, as 'x' isn't a lowercase vowel.
 "^"  =~  /[^^]/       #  No match, matches anything that isn't a caret.
 "^"  =~  /[x^]/       #  Match, caret is not special here.

=head3 Backslash Sequences

You can put any backslash sequence character class (with the exception of
C<\N> and C<\R>) inside a bracketed character class, and it will act just
as if you had put all characters matched by the backslash sequence inside the
character class. For instance, C<[a-f\d]> matches any decimal digit, or any
of the lowercase letters between 'a' and 'f' inclusive.

C<\N> within a bracketed character class must be of the forms C<\N{I<name>}>
or C<\N{U+I<hex char>}>, and NOT be the form that matches non-newlines,
for the same reason that a dot C<.> inside a bracketed character class loses
its special meaning: it matches nearly anything, which generally isn't what you
want to happen.


Examples:

 /[\p{Thai}\d]/     # Matches a character that is either a Thai
                    # character, or a digit.
 /[^\p{Arabic}()]/  # Matches a character that is neither an Arabic
                    # character, nor a parenthesis.

Backslash sequence character classes cannot form one of the endpoints
of a range.  Thus, you can't say:

 /[\p{Thai}-\d]/     # Wrong!

=head3 POSIX Character Classes
X<character class> X<\p> X<\p{}>
X<alpha> X<alnum> X<ascii> X<blank> X<cntrl> X<digit> X<graph>
X<lower> X<print> X<punct> X<space> X<upper> X<word> X<xdigit>

POSIX character classes have the form C<[:class:]>, where I<class> is the
name, and C<[:> and C<:]> are the delimiters. POSIX character classes only appear
I<inside> bracketed character classes, and are a convenient and descriptive
way of listing a group of characters.

Be careful about the syntax,

 # Correct:
 $string =~ /[[:alpha:]]/

 # Incorrect (will warn):
 $string =~ /[:alpha:]/

The latter pattern would be a character class consisting of a colon,
and the letters C<a>, C<l>, C<p> and C<h>.

POSIX character classes can be part of a larger bracketed character class.
For example,

 [01[:alpha:]%]

is valid and matches '0', '1', any alphabetic character, and the percent sign.
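
As a quick sketch, the combined class can be tried against a handful of
single-character strings (the candidate list and variable names are
illustrative only):

 use strict;
 use warnings;

 my $class = qr/[01[:alpha:]%]/;

 # Keep only the candidates that the class matches in full.
 my @hits = grep { /\A$class\z/ } ("0", "1", "x", "%", "2", ":");

 print "matched: @hits\n";   # "2" and ":" are not in the class

Note that the colon is not matched: inside the outer brackets,
C<[:alpha:]> is parsed as a POSIX class, not as a list of literal
characters.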

Perl recognizes the following POSIX character classes:

 alpha  Any alphabetical character ("[A-Za-z]").
 alnum  Any alphanumeric character ("[A-Za-z0-9]").
 ascii  Any character in the ASCII character set.
 blank  A GNU extension, equal to a space or a horizontal tab ("\t").
 cntrl  Any control character.  See Note [2] below.
 digit  Any decimal digit ("[0-9]"), equivalent to "\d".
 graph  Any printable character, excluding a space.  See Note [3] below.
 lower  Any lowercase character ("[a-z]").
 print  Any printable character, including a space.  See Note [4] below.
 punct  Any graphical character excluding "word" characters.  Note [5].
 space  Any whitespace character. "\s" including the vertical tab
        ("\cK").
 upper  Any uppercase character ("[A-Z]").
 word   A Perl extension ("[A-Za-z0-9_]"), equivalent to "\w".
 xdigit Any hexadecimal digit ("[0-9a-fA-F]").

Like the L<Unicode properties|/Unicode Properties>, most of the POSIX
properties match the same regardless of whether case-insensitive (C</i>)
matching is in effect or not.  The two exceptions are C<[:upper:]> and
C<[:lower:]>.  Under C</i>, they each match the union of C<[:upper:]> and
C<[:lower:]>.
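
A minimal sketch of the two exceptions (variable names are illustrative
only):

 use strict;
 use warnings;

 my $plain      = "a" =~ /[[:upper:]]/  ? 1 : 0;   # 0: "a" is lowercase
 my $upper_fold = "a" =~ /[[:upper:]]/i ? 1 : 0;   # 1: /i adds [:lower:]
 my $lower_fold = "A" =~ /[[:lower:]]/i ? 1 : 0;   # 1: /i adds [:upper:]
 my $digit_fold = "1" =~ /[[:upper:]]/i ? 1 : 0;   # 0: digits are uncased

 print "plain=$plain upper/i=$upper_fold lower/i=$lower_fold ",
       "digit/i=$digit_fold\n";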

Most POSIX character classes have two Unicode-style C<\p> property
counterparts.  (They are not official Unicode properties, but Perl extensions
derived from official Unicode properties.)  The table below shows the relation
between POSIX character classes and these counterparts.

One counterpart, in the column labelled "ASCII-range Unicode" in
the table, matches only characters in the ASCII character set.

The other counterpart, in the column labelled "Full-range Unicode", matches any
appropriate characters in the full Unicode character set.  For example,
C<\p{Alpha}> matches not just the ASCII alphabetic characters, but any
character in the entire Unicode character set considered alphabetic.
An entry in the column labelled "backslash sequence" is a (short)
equivalent.

 [[:...:]]      ASCII-range          Full-range  backslash  Note
                 Unicode              Unicode     sequence
 -----------------------------------------------------
   alpha      \p{PosixAlpha}       \p{XPosixAlpha}
   alnum      \p{PosixAlnum}       \p{XPosixAlnum}
   ascii      \p{ASCII}
   blank      \p{PosixBlank}       \p{XPosixBlank}  \h      [1]
                                   or \p{HorizSpace}        [1]
   cntrl      \p{PosixCntrl}       \p{XPosixCntrl}          [2]
   digit      \p{PosixDigit}       \p{XPosixDigit}  \d
   graph      \p{PosixGraph}       \p{XPosixGraph}          [3]
   lower      \p{PosixLower}       \p{XPosixLower}
   print      \p{PosixPrint}       \p{XPosixPrint}          [4]
   punct      \p{PosixPunct}       \p{XPosixPunct}          [5]
              \p{PerlSpace}        \p{XPerlSpace}   \s      [6]
   space      \p{PosixSpace}       \p{XPosixSpace}          [6]
   upper      \p{PosixUpper}       \p{XPosixUpper}
   word       \p{PosixWord}        \p{XPosixWord}   \w
   xdigit     \p{PosixXDigit}      \p{XPosixXDigit}

=over 4

=item [1]

C<\p{Blank}> and C<\p{HorizSpace}> are synonyms.

=item [2]

Control characters don't produce output as such, but instead usually control
the terminal somehow: for example, newline and backspace are control characters.
On ASCII platforms, in the ASCII range, characters whose code points are
between 0 and 31 inclusive, plus 127 (C<DEL>) are control characters; on
EBCDIC platforms, their counterparts are control characters.

=item [3]

Any character that is I<graphical>, that is, visible. This class consists
of all alphanumeric characters and all punctuation characters.

=item [4]

All printable characters, which is the set of all graphical characters
plus those whitespace characters which are not also controls.

=item [5]

C<\p{PosixPunct}> and C<[[:punct:]]> in the ASCII range match all
non-controls, non-alphanumeric, non-space characters:
C<[-!"#$%&'()*+,./:;<=E<gt>?@[\\\]^_`{|}~]> (although if a locale is in effect,
it could alter the behavior of C<[[:punct:]]>).

The similarly named property, C<\p{Punct}>, matches a somewhat different
set in the ASCII range, namely
C<[-!"#%&'()*,./:;?@[\\\]_{}]>.  That is, it is missing the nine
characters C<[$+E<lt>=E<gt>^`|~]>.
This is because Unicode splits what POSIX considers to be punctuation into two
categories, Punctuation and Symbols.

C<\p{XPosixPunct}> and (under Unicode rules) C<[[:punct:]]>, match what
C<\p{PosixPunct}> matches in the ASCII range, plus what C<\p{Punct}>
matches.  This is different than strictly matching according to
C<\p{Punct}>.  Another way to say it is that
if Unicode rules are in effect, C<[[:punct:]]> matches all characters
that Unicode considers punctuation, plus all ASCII-range characters that
Unicode considers symbols.

=item [6]

C<\p{XPerlSpace}> and C<\p{Space}> match identically starting with Perl
v5.18.  In earlier versions, these differ only in that in non-locale
matching, C<\p{XPerlSpace}> did not match the vertical tab, C<\cK>.
Same for the two ASCII-only range forms.

=back

There are various other synonyms that can be used besides the names
listed in the table.  For example, C<\p{XPosixAlpha}> can be written as
C<\p{Alpha}>.  All are listed in
L<perluniprops/Properties accessible through \p{} and \P{}>.
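
The property names from the table, including the punctuation
distinctions of note [5], can be compared directly.  The following is a
sketch only; the sample characters and variable names are chosen for
illustration:

 use strict;
 use warnings;

 my $dollar  = '$';          # ASCII symbol (not Unicode punctuation)
 my $euro    = "\x{20AC}";   # EURO SIGN, a non-ASCII symbol
 my $em_dash = "\x{2014}";   # EM DASH, non-ASCII punctuation

 my $dollar_posix  = $dollar =~ /\p{PosixPunct}/  ? 1 : 0;   # 1
 my $dollar_punct  = $dollar =~ /\p{Punct}/       ? 1 : 0;   # 0
 my $dollar_xposix = $dollar =~ /\p{XPosixPunct}/ ? 1 : 0;   # 1

 # Outside the ASCII range, only true punctuation is included.
 my $euro_xposix = $euro    =~ /\p{XPosixPunct}/ ? 1 : 0;    # 0
 my $dash_xposix = $em_dash =~ /\p{XPosixPunct}/ ? 1 : 0;    # 1
 my $dash_punct  = $em_dash =~ /\p{Punct}/       ? 1 : 0;    # 1

 print "\$: posix=$dollar_posix punct=$dollar_punct ",
       "xposix=$dollar_xposix\n";
 print "euro: xposix=$euro_xposix\n";
 print "em dash: xposix=$dash_xposix punct=$dash_punct\n";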

Both the C<\p> counterparts always assume Unicode rules are in effect.
On ASCII platforms, this means they assume that the code points from 128
to 255 are Latin-1, and that means that using them under locale rules is
unwise unless the locale is guaranteed to be Latin-1 or UTF-8.  In contrast, the
POSIX character classes are useful under locale rules.  They are
affected by the actual rules in effect, as follows:

=over

=item If the C</a> modifier is in effect ...

Each of the POSIX classes matches exactly the same as their ASCII-range
counterparts.

=item otherwise ...

=over

=item For code points above 255 ...

The POSIX class matches the same as its Full-range counterpart.

=item For code points below 256 ...

=over

=item if locale rules are in effect ...

The POSIX class matches according to the locale, except:

=over

=item C<word>

also includes the platform's native underscore character, no matter what
the locale is.

=item C<ascii>

on platforms that don't have the POSIX C<ascii> extension, this matches
just the platform's native ASCII-range characters.

=item C<blank>

on platforms that don't have the POSIX C<blank> extension, this matches
just the platform's native tab and space characters.

=back

=item if, instead, Unicode rules are in effect ...

The POSIX class matches the same as the Full-range counterpart.

=item otherwise ...

The POSIX class matches the same as the ASCII range counterpart.

=back

=back

=back

Which rules apply is determined as described in
L<perlre/Which character set modifier is in effect?>.
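
The rules above can be sketched as follows.  The sample code points
(U+00E9, LATIN SMALL LETTER E WITH ACUTE, and U+03B1, GREEK SMALL LETTER
ALPHA) are chosen only for illustration, and the script assumes Perl
5.14 or later with neither C<use locale> nor
C<use feature 'unicode_strings'> in scope:

 use strict;
 use warnings;

 my $e_acute = "\xE9";       # below 256, not UTF-8 encoded
 my $alpha   = "\x{3B1}";    # above 255

 print "e-acute /a: ", ($e_acute =~ /[[:alpha:]]/a ? 1 : 0), "\n";
 print "e-acute /u: ", ($e_acute =~ /[[:alpha:]]/u ? 1 : 0), "\n";
 print "e-acute /d: ", ($e_acute =~ /[[:alpha:]]/  ? 1 : 0), "\n";
 print "alpha   /a: ", ($alpha   =~ /[[:alpha:]]/a ? 1 : 0), "\n";
 print "alpha   /d: ", ($alpha   =~ /[[:alpha:]]/  ? 1 : 0), "\n";

Under these assumptions the five lines should end in 0, 1, 0, 0 and 1
respectively: below 256 the answer depends on which rules are in
effect, and above 255 only C</a> prevents the Full-range match.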

It is proposed to change this behavior in a future release of Perl so that
whether or not Unicode rules are in effect would not change the
behavior:  Outside of locale, the POSIX classes
would behave like their ASCII-range counterparts.  If you wish to
comment on this proposal, send email to C<perl5-porters@perl.org>.

=head4 Negation of POSIX character classes
X<character class, negation>

A Perl extension to the POSIX character class is the ability to
negate it. This is done by prefixing the class name with a caret (C<^>).
Some examples:

     POSIX         ASCII-range     Full-range  backslash
                    Unicode         Unicode    sequence
 -----------------------------------------------------
 [[:^digit:]]   \P{PosixDigit}  \P{XPosixDigit}   \D
 [[:^space:]]   \P{PosixSpace}  \P{XPosixSpace}
                \P{PerlSpace}   \P{XPerlSpace}    \S
 [[:^word:]]    \P{PerlWord}    \P{XPosixWord}    \W

The backslash sequence can mean either ASCII- or Full-range Unicode,
depending on various factors as described in L<perlre/Which character set modifier is in effect?>.
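
For ASCII input, the first row of the table can be checked with a short
script such as this one (a sketch only; the sample characters are
arbitrary):

 use strict;
 use warnings;

 for my $char ('7', 'x', ' ') {
     printf "'%s': negated-posix=%d backslash-D=%d\n",
         $char,
         ($char =~ /[[:^digit:]]/ ? 1 : 0),
         ($char =~ /\D/           ? 1 : 0);
 }

Both columns should be 0 for C<7> and 1 for the other two characters.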

=head4 [= =] and [. .]

Perl recognizes the POSIX character classes C<[=class=]> and
C<[.class.]>, but does not (yet?) support them.  Any attempt to use
either construct raises an exception.
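
A sketch of trapping that exception follows.  It assumes the pattern is
compiled at run time from a string (the variable names are
illustrative); a literal pattern would instead stop the script from
compiling at all:

 use strict;
 use warnings;

 # Compile at run time so that the failure can be caught by eval.
 my $pattern = '[[=a=]]';
 my $regex   = eval { qr/$pattern/ };

 print defined $regex ? "compiled\n" : "rejected: $@";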

=head4 Examples

 /[[:digit:]]/            # Matches a character that is a digit.
 /[01[:lower:]]/          # Matches a character that is either a
                          # lowercase letter, or '0' or '1'.
 /[[:digit:][:^xdigit:]]/ # Matches a character that can be anything
                          # except the letters 'a' to 'f' and 'A' to
                          # 'F'.  This is because the main character
                          # class is composed of two POSIX character
                          # classes that are ORed together, one that
                          # matches any digit, and the other that
                          # matches anything that isn't a hex digit.
                          # The OR adds the digits, leaving only the
                          # letters 'a' to 'f' and 'A' to 'F' excluded.
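
The last example is the least obvious, so here is a small sketch that
tests it against a handful of arbitrarily chosen ASCII characters:

 use strict;
 use warnings;

 for my $char (qw(5 a F g Z _)) {
     my $result = $char =~ /[[:digit:][:^xdigit:]]/
                ? "matches"
                : "does not match";
     print "$char $result\n";
 }

Only C<a> and C<F> should be reported as not matching.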

=head3 Extended Bracketed Character Classes
X<character class>
X<set operations>

This is a fancy bracketed character class that can be used for more
readable and less error-prone classes, and to perform set operations,
such as intersection. An example is

 /(?[ \p{Thai} & \p{Digit} ])/

This will match all the digit characters that are in the Thai script.

This is an experimental feature available starting in 5.18, and is
subject to change as we gain field experience with it.  Any attempt to
use it will raise a warning, unless disabled via

 no warnings "experimental::regex_sets";

Comments on this feature are welcome; send email to
C<perl5-porters@perl.org>.

The rules used by L<C<use re 'strict'>|re/'strict' mode> apply to this
construct.

We can extend the example above:

 /(?[ ( \p{Thai} + \p{Lao} ) & \p{Digit} ])/

This matches digits that are in either the Thai or Laotian scripts.

Notice the white space in these examples.  This construct always has
the C<E<sol>xx> modifier turned on within it.

The available binary operators are:

 &    intersection
 +    union
 |    another name for '+', hence means union
 -    subtraction (the result matches the set consisting of those
      code points matched by the first operand, excluding any that
      are also matched by the second operand)
 ^    symmetric difference (the union minus the intersection).  This
      is like an exclusive or, in that the result is the set of code
      points that are matched by either, but not both, of the
      operands.

There is one unary operator:

 !    complement

All the binary operators left associate; C<"&"> is higher precedence
than the others, which all have equal precedence.  The unary operator
right associates, and has highest precedence.  Thus this follows the
normal Perl precedence rules for logical operators.  Use parentheses to
override the default precedence and associativity.
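
A minimal sketch of the effect of precedence, assuming a Perl (5.18 or
later) whose rules match those just described; the single-character
operands are chosen only to keep the result easy to read:

 use strict;
 use warnings;
 no warnings 'experimental::regex_sets';

 # Parsed as [a] + ( [b] & [b] ), which is the same as [ab].
 my $default = qr/\A(?[ [a] + [b] & [b] ])\z/;

 # Parentheses force the union first, leaving just [b].
 my $grouped = qr/\A(?[ ( [a] + [b] ) & [b] ])\z/;

 for my $char ('a', 'b') {
     printf "%s default=%d grouped=%d\n", $char,
         ($char =~ $default ? 1 : 0),
         ($char =~ $grouped ? 1 : 0);
 }

With C<"&"> binding more tightly, C<a> should match only the first
pattern, while C<b> matches both.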

The main restriction is that everything is a metacharacter.  Thus,
you cannot refer to single characters by doing something like this:

 /(?[ a + b ])/ # Syntax error!

The easiest way to specify an individual typable character is to enclose
it in brackets:

 /(?[ [a] + [b] ])/

(This is the same thing as C<[ab]>.)  You could also have said the
equivalent:

 /(?[[ a b ]])/

(You can, of course, specify single characters by using C<\x{...}>,
C<\N{...}>, etc.)

This last example shows the use of this construct to specify an ordinary
bracketed character class without additional set operations.  Note the
white space within it.  This is allowed because C<E<sol>xx> is
automatically turned on within this construct.

All the other escapes accepted by normal bracketed character classes are
accepted here as well.

Because this construct compiles under
L<C<use re 'strict'>|re/'strict' mode>, unrecognized escapes that
generate warnings in normal classes are fatal errors here.  So are
all other warnings from these class elements, and some
practices that don't currently warn outside C<re 'strict'>.  For example,
you cannot say

 /(?[ [ \xF ] ])/     # Syntax error!

You have to have two hex digits after a braceless C<\x> (use a leading
zero to make two).  These restrictions are to lower the incidence of
typos causing the class to not match what you thought it would.

If a regular bracketed character class contains a C<\p{}> or C<\P{}> and
is matched against a non-Unicode code point, a warning may be
raised, as the result is not Unicode-defined.  No such warning will come
when using this extended form.

The final difference between regular bracketed character classes and
these is that it is not possible to get these to match a
multi-character fold.  Thus,

 /(?[ [\xDF] ])/iu

does not match the string C<ss>.
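
The contrast with a regular bracketed character class can be sketched
like this, assuming Perl 5.18 or later:

 use strict;
 use warnings;
 no warnings 'experimental::regex_sets';

 my $regular  = "ss" =~ /\A[\xDF]\z/iu        ? 1 : 0;
 my $extended = "ss" =~ /\A(?[ [\xDF] ])\z/iu ? 1 : 0;

 print "regular=$regular extended=$extended\n";

Following the rule above, this should print C<regular=1 extended=0>.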

You don't have to enclose POSIX class names inside double brackets,
hence both of the following work:

 /(?[ [:word:] - [:lower:] ])/
 /(?[ [[:word:]] - [[:lower:]] ])/

Any contained POSIX character classes, including things like C<\w> and C<\D>,
respect the C<E<sol>a> (and C<E<sol>aa>) modifiers.

Note that C<< (?[ ]) >> is a regex-compile-time construct.  Any attempt
to use something which isn't knowable at the time the containing regular
expression is compiled is a fatal error.  In practice, this means
just three limitations:

=over 4

=item 1

When compiled within the scope of C<use locale> (or the C<E<sol>l> regex
modifier), this construct assumes that the execution-time locale will be
a UTF-8 one, and the generated pattern always uses Unicode rules.  What
gets matched or not thus isn't dependent on the actual runtime locale, so
tainting is not enabled.  But a C<locale> category warning is raised
if the runtime locale turns out to not be UTF-8.

=item 2

Any
L<user-defined property|perlunicode/"User-Defined Character Properties">
used must be already defined by the time the regular expression is
compiled (but note that this construct can be used instead of such
properties).

=item 3

A regular expression that otherwise would compile
using C<E<sol>d> rules, and which uses this construct will instead
use C<E<sol>u>.  Thus this construct tells Perl that you don't want
C<E<sol>d> rules for the entire regular expression containing it.

=back

Note that skipping white space applies only to the interior of this
construct.  There must not be any space between any of the characters
that form the initial C<(?[>.  Nor may there be space between the
closing C<])> characters.

Just as in all regular expressions, the pattern can be built up by
including variables that are interpolated at regex compilation time.
Care must be taken to ensure that you are getting what you expect.  For
example:

 my $thai_or_lao = '\p{Thai} + \p{Lao}';
 ...
 qr/(?[ \p{Digit} & $thai_or_lao ])/;

compiles to

 qr/(?[ \p{Digit} & \p{Thai} + \p{Lao} ])/;

But this does not have the effect that someone reading the code would
likely expect.  Because C<"&"> has higher precedence than C<"+">, the
intersection applies just to C<\p{Thai}>, so every Laotian character is
matched, digit or not.  Pitfalls like this can be avoided by
parenthesizing the component pieces:

 my $thai_or_lao = '( \p{Thai} + \p{Lao} )';

But any modifiers will still apply to all the components:

 my $lower = '\p{Lower} + \p{Digit}';
 qr/(?[ \p{Greek} & $lower ])/i;

matches upper case things.  You can avoid surprises by making the
components into instances of this construct by compiling them:

 my $thai_or_lao = qr/(?[ \p{Thai} + \p{Lao} ])/;
 my $lower = qr/(?[ \p{Lower} + \p{Digit} ])/;

When these are embedded in another pattern, what they match does not
change, regardless of parenthesization or what modifiers are in effect
in that outer pattern.
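
The difference between interpolating a string and interpolating a
compiled component can be sketched as follows.  The variable names are
illustrative, and the two test characters are U+0E81 (LAO LETTER KO)
and U+0ED1 (LAO DIGIT ONE):

 use strict;
 use warnings;
 no warnings 'experimental::regex_sets';

 my $as_string = '\p{Thai} + \p{Lao}';
 my $as_regex  = qr/(?[ \p{Thai} + \p{Lao} ])/;

 my $loose = qr/(?[ \p{Digit} & $as_string ])/;  # (Digit & Thai) + Lao
 my $tight = qr/(?[ \p{Digit} & $as_regex ])/;   # Digit & (Thai + Lao)

 my $lao_letter = "\x{0E81}";
 my $lao_digit  = "\x{0ED1}";

 printf "letter: loose=%d tight=%d\n",
     ($lao_letter =~ $loose ? 1 : 0), ($lao_letter =~ $tight ? 1 : 0);
 printf "digit:  loose=%d tight=%d\n",
     ($lao_digit  =~ $loose ? 1 : 0), ($lao_digit  =~ $tight ? 1 : 0);

The Lao letter should match only the C<$loose> pattern, which is the
surprise described above; the Lao digit should match both.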

Due to the way that Perl parses things, your parentheses and brackets
may need to be balanced, even including comments.  If you run into any
examples, please send them to C<perlbug@perl.org>, so that we can have a
concrete example for this man page.

We may change this construct so that some usages that remain legal in
normal bracketed character classes become illegal within it.
One proposal, for example, is to forbid adjacent uses of the
same character, as in C<(?[ [aa] ])>.  The motivation for such a change
is that this usage is likely a typo, as the second "a" adds nothing.
perlmodstyle.pod000064400000054057150344123460010010 0ustar00=head1 NAME

perlmodstyle - Perl module style guide

=head1 INTRODUCTION

This document attempts to describe the Perl Community's "best practice"
for writing Perl modules.  It extends the recommendations found in
L<perlstyle>, which should be considered required reading
before reading this document.

While this document is intended to be useful to all module authors, it is
particularly aimed at authors who wish to publish their modules on CPAN.

The focus is on elements of style which are visible to the users of a 
module, rather than those parts which are only seen by the module's 
developers.  However, many of the guidelines presented in this document
can be extrapolated and applied successfully to a module's internals.

This document differs from L<perlnewmod> in that it is a style guide
rather than a tutorial on creating CPAN modules.  It provides a
checklist against which modules can be compared to determine whether
they conform to best practice, without necessarily describing in detail
how to achieve this.  

All the advice contained in this document has been gleaned from
extensive conversations with experienced CPAN authors and users.  Every
piece of advice given here is the result of previous mistakes.  This
information is here to help you avoid the same mistakes and the extra
work that would inevitably be required to fix them.

The first section of this document provides an itemized checklist; 
subsequent sections provide a more detailed discussion of the items on 
the list.  The final section, "Common Pitfalls", describes some of the 
most popular mistakes made by CPAN authors.

=head1 QUICK CHECKLIST

For more detail on each item in this checklist, see below.

=head2 Before you start

=over 4

=item *

Don't re-invent the wheel

=item *

Patch, extend or subclass an existing module where possible

=item *

Do one thing and do it well

=item *

Choose an appropriate name

=item *

Get feedback before publishing

=back

=head2 The API

=over 4

=item *

API should be understandable by the average programmer

=item *

Simple methods for simple tasks

=item *

Separate functionality from output

=item *

Consistent naming of subroutines or methods

=item *

Use named parameters (a hash or hashref) when there are more than two
parameters

=back

=head2 Stability

=over 4

=item *

Ensure your module works under C<use strict> and C<-w>

=item *

Stable modules should maintain backwards compatibility

=back

=head2 Documentation

=over 4

=item *

Write documentation in POD

=item *

Document purpose, scope and target applications

=item *

Document each publicly accessible method or subroutine, including params and return values

=item *

Give examples of use in your documentation

=item *

Provide a README file and perhaps also release notes, changelog, etc

=item *

Provide links to further information (URL, email)

=back

=head2 Release considerations

=over 4

=item *

Specify pre-requisites in Makefile.PL or Build.PL

=item *

Specify Perl version requirements with C<use>

=item *

Include tests with your module

=item *

Choose a sensible and consistent version numbering scheme (X.YY is the common Perl module numbering scheme)

=item *

Increment the version number for every change, no matter how small

=item *

Package the module using "make dist"

=item *

Choose an appropriate license (GPL/Artistic is a good default)

=back

=head1 BEFORE YOU START WRITING A MODULE

Try not to launch headlong into developing your module without spending
some time thinking first.  A little forethought may save you a vast
amount of effort later on.

=head2 Has it been done before?

You may not even need to write the module.  Check whether it's already 
been done in Perl, and avoid re-inventing the wheel unless you have a 
good reason.

Good places to look for pre-existing modules include
L<http://search.cpan.org/> and L<https://metacpan.org>
and asking on C<module-authors@perl.org>
(L<http://lists.perl.org/list/module-authors.html>).

If an existing module B<almost> does what you want, consider writing a
patch, writing a subclass, or otherwise extending the existing module
rather than rewriting it.

=head2 Do one thing and do it well

At the risk of stating the obvious, modules are intended to be modular.
A Perl developer should be able to use modules to put together the
building blocks of their application.  However, it's important that the
blocks are the right shape, and that the developer shouldn't have to use
a big block when all they need is a small one.

Your module should have a clearly defined scope that can be described in
a single sentence.  Can your module be broken down into a family of
related modules?

Bad example:

"FooBar.pm provides an implementation of the FOO protocol and the
related BAR standard."

Good example:

"Foo.pm provides an implementation of the FOO protocol.  Bar.pm
implements the related BAR protocol."

This means that if a developer only needs a module for the BAR standard,
they should not be forced to install libraries for FOO as well.

=head2 What's in a name?

Make sure you choose an appropriate name for your module early on.  This
will help people find and remember your module, and make programming
with your module more intuitive.

When naming your module, consider the following:

=over 4

=item *

Be descriptive (i.e. accurately describe the purpose of the module).

=item * 

Be consistent with existing modules.

=item *

Reflect the functionality of the module, not the implementation.

=item *

Avoid starting a new top-level hierarchy, especially if a suitable
hierarchy already exists under which you could place your module.

=back

=head2 Get feedback before publishing

If you have never uploaded a module to CPAN before (and even if you have),
you are strongly encouraged to get feedback on L<PrePAN|http://prepan.org>.
PrePAN is a site dedicated to discussing ideas for CPAN modules with other
Perl developers and is a great resource for new (and experienced) Perl
developers.

You should also try to get feedback from people who are already familiar
with the module's application domain and the CPAN naming system.  Authors
of similar modules, or modules with similar names, may be a good place to
start, as are community sites like L<Perl Monks|http://www.perlmonks.org>.

=head1 DESIGNING AND WRITING YOUR MODULE

Considerations for module design and coding:

=head2 To OO or not to OO?

Your module may be object oriented (OO) or not, or it may have both kinds 
of interfaces available.  There are pros and cons of each technique, which 
should be considered when you design your API.

In I<Perl Best Practices> (copyright 2004, Published by O'Reilly Media, Inc.),
Damian Conway provides a list of criteria to use when deciding if OO is the
right fit for your problem:

=over 4

=item *

The system being designed is large, or is likely to become large.

=item *

The data can be aggregated into obvious structures, especially if
there's a large amount of data in each aggregate.

=item *

The various types of data aggregate form a natural hierarchy that
facilitates the use of inheritance and polymorphism.

=item *

You have a piece of data on which many different operations are
applied.

=item *

You need to perform the same general operations on related types of
data, but with slight variations depending on the specific type of data
the operations are applied to.

=item *

It's likely you'll have to add new data types later.

=item *

The typical interactions between pieces of data are best represented by
operators.

=item *

The implementation of individual components of the system is likely to
change over time.

=item *

The system design is already object-oriented.

=item *

Large numbers of other programmers will be using your code modules.

=back

Think carefully about whether OO is appropriate for your module.
Gratuitous object orientation results in complex APIs which are
difficult for the average module user to understand or use.

=head2 Designing your API

Your interfaces should be understandable by an average Perl programmer.  
The following guidelines may help you judge whether your API is
sufficiently straightforward:

=over 4

=item Write simple routines to do simple things.

It's better to have numerous simple routines than a few monolithic ones.
If your routine changes its behaviour significantly based on its
arguments, it's a sign that you should have two (or more) separate
routines.

=item Separate functionality from output.  

Return your results in the most generic form possible and allow the user 
to choose how to use them.  The most generic form possible is usually a
Perl data structure which can then be used to generate a text report,
HTML, XML, a database query, or whatever else your users require.

If your routine iterates through some kind of list (such as a list of
files, or records in a database) you may consider providing a callback
so that users can manipulate each element of the list in turn.
File::Find provides an example of this with its 
C<find(\&wanted, $dir)> syntax.

=item Provide sensible shortcuts and defaults.

Don't require every module user to jump through the same hoops to achieve a
simple result.  You can always include optional parameters or routines for 
more complex or non-standard behaviour.  If most of your users have to
type a few almost identical lines of code when they start using your
module, it's a sign that you should have made that behaviour a default.
Another good indicator that you should use defaults is if most of your 
users call your routines with the same arguments.

=item Naming conventions

Your naming should be consistent.  For instance, it's better to have:

	display_day();
	display_week();
	display_year();

than

	display_day();
	week_display();
	show_year();

This applies equally to method names, parameter names, and anything else
which is visible to the user (and most things that aren't!).

=item Parameter passing

Use named parameters.  It's easier to use a hash like this:

    $obj->do_something(
        name => "wibble",
        type => "text",
        size => 1024,
    );

... than to have a long list of unnamed parameters like this:

    $obj->do_something("wibble", "text", 1024);

While the list of arguments might work fine for one, two or even three
arguments, any more arguments become hard for the module user to
remember, and hard for the module author to manage.  If you want to add
a new parameter you will have to add it to the end of the list for
backward compatibility, and this will probably make your list order
unintuitive.  Also, if many elements may be undefined you may see the
following unattractive method calls:

    $obj->do_something(undef, undef, undef, undef, undef, 1024);

Provide sensible defaults for parameters which have them.  Don't make
your users specify parameters which will almost always be the same.

The issue of whether to pass the arguments in a hash or a hashref is
largely a matter of personal style. 

The use of hash keys starting with a hyphen (C<-name>) or entirely in 
upper case (C<NAME>) is a relic of older versions of Perl in which
ordinary lower case strings were not handled correctly by the C<=E<gt>>
operator.  While some modules retain uppercase or hyphenated argument
keys for historical reasons or as a matter of personal style, most new
modules should use simple lower case keys.  Whatever you choose, be
consistent!

=back
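
Several of the guidelines above (named parameters, sensible defaults, a
callback, and results returned as plain data) can be sketched in one
small routine.  Everything named here, including C<find_large_files>
and its C<dir>, C<min_size> and C<callback> parameters, is hypothetical
and exists only for illustration:

    use strict;
    use warnings;
    use File::Find ();

    # Named parameters with defaults.  Returns a list of hashrefs and
    # leaves all formatting to the caller.
    sub find_large_files {
        my %args = (
            dir      => '.',
            min_size => 1024,
            callback => undef,
            @_,
        );

        my @found;
        File::Find::find(
            sub {
                return unless -f $_;
                my $size = -s $_ || 0;
                return if $size < $args{min_size};

                my $record = { path => $File::Find::name, size => $size };
                $args{callback}->($record) if $args{callback};
                push @found, $record;
            },
            $args{dir},
        );
        return @found;
    }

    # Only the parameter that differs from the defaults is supplied.
    my @files = find_large_files( min_size => 4096 );
    print "$_->{path} ($_->{size} bytes)\n" for @files;

Because the defaults are listed before C<@_>, any value the caller
passes overrides them, and new parameters can be added later without
disturbing existing calls.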

=head2 Strictness and warnings

Your module should run successfully under the strict pragma and should
run without generating any warnings.  Your module should also handle 
taint-checking where appropriate, though this can cause difficulties in
many cases.

=head2 Backwards compatibility

Modules which are "stable" should not break backwards compatibility
without at least a long transition phase and a major change in version
number.

=head2 Error handling and messages

When your module encounters an error it should do one or more of:

=over 4

=item *

Return an undefined value.

=item *

Set C<$Module::errstr> or similar (C<errstr> is a common name used by
DBI and other popular modules; if you choose something else, be sure to
document it clearly).

=item *

C<warn()> or C<carp()> a message to STDERR.  

=item *

C<croak()> only when your module absolutely cannot figure out what to
do.  (C<croak()> is a better version of C<die()> for use within 
modules, which reports its errors from the perspective of the caller.  
See L<Carp> for details of C<croak()>, C<carp()> and other useful
routines.)

=item *

As an alternative to the above, you may prefer to throw exceptions using 
the Error module.

=back

Configurable error handling can be very useful to your users.  Consider
offering a choice of levels for warning and debug messages, an option to
send messages to a separate file, a way to specify an error-handling
routine, or other such features.  Be sure to default all these options
to the commonest use.
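
A minimal sketch of two of these strategies, returning an undefined
value with C<errstr> set for bad data and calling C<croak()> for misuse
of the routine itself.  The package C<My::Parser> and its
C<parse_number> routine are hypothetical:

    package My::Parser;
    use strict;
    use warnings;
    use Carp qw(croak);

    our $errstr = '';

    # Returns the parsed integer, or undef with $errstr set.
    sub parse_number {
        my ($text) = @_;
        croak "parse_number() requires an argument"
            unless defined $text;

        if ( $text !~ /\A\s*(-?\d+)\s*\z/ ) {
            $errstr = "not an integer: $text";
            return;
        }
        $errstr = '';
        return $1 + 0;
    }

    package main;

    my $value = My::Parser::parse_number("forty-two");
    print "error: $My::Parser::errstr\n" unless defined $value;

The error from C<croak()> is reported at the caller's line, which is
usually where the mistake actually is.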

=head1 DOCUMENTING YOUR MODULE

=head2 POD

Your module should include documentation aimed at Perl developers.
You should use Perl's "plain old documentation" (POD) for your general 
technical documentation, though you may wish to write additional
documentation (white papers, tutorials, etc) in some other format.  
You need to cover the following subjects:

=over 4

=item *

A synopsis of the common uses of the module

=item *

The purpose, scope and target applications of your module

=item *

Use of each publicly accessible method or subroutine, including
parameters and return values

=item *

Examples of use

=item *

Sources of further information

=item *

A contact email address for the author/maintainer

=back

The level of detail in Perl module documentation generally goes from
less detailed to more detailed.  Your SYNOPSIS section should contain a
minimal example of use (perhaps as little as one line of code; skip the
unusual use cases or anything not needed by most users); the
DESCRIPTION should describe your module in broad terms, generally in
just a few paragraphs; more detail of the module's routines or methods,
lengthy code examples, or other in-depth material should be given in 
subsequent sections.

Ideally, someone who's slightly familiar with your module should be able
to refresh their memory without hitting "page down".  As your reader
continues through the document, they should receive a progressively
greater amount of knowledge.

The recommended order of sections in Perl module documentation is:

=over 4

=item * 

NAME

=item *

SYNOPSIS

=item *

DESCRIPTION

=item *

One or more sections or subsections giving greater detail of available 
methods and routines and any other relevant information.

=item *

BUGS/CAVEATS/etc

=item *

AUTHOR

=item *

SEE ALSO

=item *

COPYRIGHT and LICENSE

=back

Keep your documentation near the code it documents ("inline"
documentation).  Include POD for a given method right above that 
method's subroutine.  This makes it easier to keep the documentation up
to date, and avoids having to document each piece of code twice (once in
POD and once in comments).

=head2 README, INSTALL, release notes, changelogs

Your module should also include a README file describing the module and
giving pointers to further information (website, author email).  

An INSTALL file should be included, and should contain simple installation 
instructions.  When using ExtUtils::MakeMaker this will usually be:

=over 4

=item perl Makefile.PL

=item make

=item make test

=item make install

=back

When using Module::Build, this will usually be:

=over 4

=item perl Build.PL

=item perl Build

=item perl Build test

=item perl Build install

=back

Release notes or changelogs should be produced for each release of your
software describing user-visible changes to your module, in terms
relevant to the user.

Unless you have good reasons for using some other format
(for example, a format used within your company),
the convention is to name your changelog file C<Changes>,
and to follow the simple format described in L<CPAN::Changes::Spec>.

=head1 RELEASE CONSIDERATIONS

=head2 Version numbering

Version numbers should indicate at least major and minor releases, and
possibly sub-minor releases.  A major release is one in which most of
the functionality has changed, or in which major new functionality is
added.  A minor release is one in which a small amount of functionality
has been added or changed.  Sub-minor version numbers are usually used
for changes which do not affect functionality, such as documentation
patches.

The most common CPAN version numbering scheme looks like this:

    1.00, 1.10, 1.11, 1.20, 1.30, 1.31, 1.32

A correct CPAN version number is a floating point number with at least 
2 digits after the decimal.  You can test whether it conforms to CPAN by 
using

    perl -MExtUtils::MakeMaker -le 'print MM->parse_version(shift)' \
                                                            'Foo.pm'

If you want to release a 'beta' or 'alpha' version of a module but
don't want CPAN.pm to list it as most recent, use an '_' after the
regular version number followed by at least 2 digits, e.g. 1.20_01.  If
you do this, the following idiom is recommended:

  our $VERSION = "1.12_01"; # so CPAN distribution will have
                            # right filename
  our $XS_VERSION = $VERSION; # only needed if you have XS code
  $VERSION = eval $VERSION; # so "use Module 0.002" won't warn on
                            # underscore

With that trick MakeMaker will only read the first line and thus read
the underscore, while the perl interpreter will evaluate the $VERSION
and convert the string into a number.  Later operations that treat
$VERSION as a number will then be able to do so without provoking a
warning about $VERSION not being a number.
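
What the C<eval> step does can be seen in isolation with a short
sketch (the variable names are illustrative):

  use strict;
  use warnings;

  my $string  = "1.12_01";
  my $numeric = eval $string;   # parsed as a numeric literal, in
                                # which underscores are ignored

  print "string:  $string\n";   # 1.12_01
  print "numeric: $numeric\n";  # 1.1201
  print "new enough\n" if $numeric >= 1.12;

The comparison on the last line runs without a "not numeric" warning
because C<$numeric> holds a real number rather than the original
string.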

Never release anything without incrementing the number.  Even a
one-word documentation patch should result in a change in version at
the sub-minor level.

Once picked, it is important to stick to your version scheme, without
reducing the number of digits.  This is because "downstream" packagers,
such as the FreeBSD ports system, interpret the version numbers in
various ways.  If you change the number of digits in your version scheme,
you can confuse these systems so they get the versions of your module
out of order, which is obviously bad.

=head2 Pre-requisites

Module authors should carefully consider whether to rely on other
modules, and which modules to rely on.

Most importantly, choose modules which are as stable as possible.  In
order of preference: 

=over 4

=item *

Core Perl modules

=item *

Stable CPAN modules

=item *

Unstable CPAN modules

=item *

Modules not available from CPAN

=back

Specify version requirements for other Perl modules in the
pre-requisites in your Makefile.PL or Build.PL.

Be sure to specify Perl version requirements both in Makefile.PL or
Build.PL and with C<require 5.6.1> or similar.  See the section on
C<use VERSION> of L<perlfunc/require> for details.
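
As a sketch, a F<Makefile.PL> that declares both kinds of requirement
might look like the following.  The module name, file path and
prerequisite versions are placeholders, and C<MIN_PERL_VERSION> is only
understood by reasonably recent releases of ExtUtils::MakeMaker:

    use 5.006001;
    use strict;
    use warnings;
    use ExtUtils::MakeMaker;

    WriteMakefile(
        NAME             => 'My::Module',
        VERSION_FROM     => 'lib/My/Module.pm',
        MIN_PERL_VERSION => '5.006001',
        PREREQ_PM        => {
            'File::Spec' => 0,         # any version will do
            'Test::More' => '0.88',    # minimum version required
        },
    );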

=head2 Testing

All modules should be tested before distribution (using "make disttest"),
and the tests should also be available to people installing the modules 
(using "make test").  
For Module::Build you would use the C<make test> equivalent C<perl Build test>.

The importance of these tests is proportional to the alleged stability of a
module.  A module which purports to be stable or which hopes to achieve wide
use should adhere to as strict a testing regime as possible.

Useful modules to help you write tests (with minimum impact on your 
development process or your time) include Test::Simple, Carp::Assert 
and Test::Inline.
For more sophisticated test suites there are Test::More and Test::MockObject.
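
A test file can be very short.  The sketch below uses Test::More; the
C<add> routine is a stand-in for whatever your module provides, and the
file name F<t/basic.t> is only a common convention:

    # t/basic.t
    use strict;
    use warnings;
    use Test::More tests => 2;

    # A stand-in for a routine that your module would export.
    sub add { return $_[0] + $_[1] }

    is( add( 2, 2 ),  4, 'adds two positive numbers' );
    is( add( -1, 1 ), 0, 'handles a negative operand' );

Both "make test" and C<perl Build test> run every F<.t> file in the
distribution's F<t> directory.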

=head2 Packaging

Modules should be packaged using one of the standard packaging tools.
Currently you have the choice between ExtUtils::MakeMaker and the
more platform independent Module::Build, allowing modules to be installed in a
consistent manner.
When using ExtUtils::MakeMaker, you can use "make dist" to create your
package.  Tools exist to help you to build your module in a
MakeMaker-friendly style.  These include ExtUtils::ModuleMaker and h2xs.
See also L<perlnewmod>.

=head2 Licensing

Make sure that your module has a license, and that the full text of it
is included in the distribution (unless it's a common one and the terms
of the license don't require you to include it).

If you don't know what license to use, dual licensing under the GPL
and Artistic licenses (the same as Perl itself) is a good idea.
See L<perlgpl> and L<perlartistic>.

=head1 COMMON PITFALLS

=head2 Reinventing the wheel

There are certain application spaces which are already very, very well
served by CPAN.  One example is templating systems, another is date and
time modules, and there are many more.  While it is a rite of passage to
write your own version of these things, please consider carefully
whether the Perl world really needs you to publish it.

=head2 Trying to do too much

Your module will be part of a developer's toolkit.  It will not, in
itself, form the B<entire> toolkit.  It's tempting to add extra features
until your code is a monolithic system rather than a set of modular
building blocks.

=head2 Inappropriate documentation

Don't fall into the trap of writing for the wrong audience.  Your
primary audience is a reasonably experienced developer with at least 
a moderate understanding of your module's application domain, who's just 
downloaded your module and wants to start using it as quickly as possible.

Tutorials, end-user documentation, research papers, FAQs etc are not 
appropriate in a module's main documentation.  If you really want to 
write these, include them as sub-documents such as C<My::Module::Tutorial> or
C<My::Module::FAQ> and provide a link in the SEE ALSO section of the
main documentation.  

=head1 SEE ALSO

=over 4

=item L<perlstyle>

General Perl style guide

=item L<perlnewmod>

How to create a new module

=item L<perlpod>

POD documentation

=item L<podchecker>

Verifies your POD's correctness

=item Packaging Tools

L<ExtUtils::MakeMaker>, L<Module::Build>

=item Testing tools

L<Test::Simple>, L<Test::Inline>, L<Carp::Assert>, L<Test::More>, L<Test::MockObject>

=item L<http://pause.perl.org/>

Perl Authors Upload Server.  Contains links to information for module
authors.

=item Any good book on software engineering

=back

=head1 AUTHOR

Kirrily "Skud" Robert <skud@cpan.org>

=head1 NAME

perldiag - various Perl diagnostics

=head1 DESCRIPTION

These messages are classified as follows (listed in increasing order of
desperation):

    (W) A warning (optional).
    (D) A deprecation (enabled by default).
    (S) A severe warning (enabled by default).
    (F) A fatal error (trappable).
    (P) An internal error you should never see (trappable).
    (X) A very fatal error (nontrappable).
    (A) An alien error message (not generated by Perl).

The majority of messages from the first three classifications above
(W, D & S) can be controlled using the C<warnings> pragma.

If a message can be controlled by the C<warnings> pragma, its warning
category is included with the classification letter in the description
below.  E.g. C<(W closed)> means a warning in the C<closed> category.

Optional warnings are enabled by using the C<warnings> pragma or the B<-w>
and B<-W> switches.  Warnings may be captured by setting C<$SIG{__WARN__}>
to a reference to a routine that will be called on each warning instead
of printing it.  See L<perlvar>.

Severe warnings are always enabled, unless they are explicitly disabled
with the C<warnings> pragma or the B<-X> switch.

Trappable errors may be trapped using the eval operator.  See
L<perlfunc/eval>.  In almost all cases, warnings may be selectively
disabled or promoted to fatal errors using the C<warnings> pragma.
See L<warnings>.
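
As a minimal sketch of the two mechanisms described above (the variable
and method names are illustrative only), a C<$SIG{__WARN__}> handler
collects a warning instead of printing it, and a block C<eval> traps a
fatal (F) error:

```perl

    use strict;
    use warnings;

    # Collect warnings in an array instead of printing them to STDERR.
    my @captured;
    local $SIG{__WARN__} = sub { push @captured, $_[0] };

    my $text   = "abc";
    my $number = $text + 0;     # (W numeric) Argument "abc" isn't numeric

    # A trappable (F) error: method call on an undefined value.
    my $object;                 # deliberately left undefined
    my $ok    = eval { $object->frobnicate; 1 };
    my $error = $ok ? '' : $@;  # holds the error text if the eval failed

```

After this runs, C<@captured> should hold the numeric warning and
C<$error> the text of the trapped error.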

The messages are in alphabetical order, without regard to upper or
lower-case.  Some of these messages are generic.  Spots that vary are
denoted with a %s or other printf-style escape.  These escapes are
ignored by the alphabetical order, as are all characters other than
letters.  To look up your message, just ignore anything that is not a
letter.

=over 4

=item accept() on closed socket %s

(W closed) You tried to do an accept on a closed socket.  Did you forget
to check the return value of your socket() call?  See
L<perlfunc/accept>.

=item Aliasing via reference is experimental

(S experimental::refaliasing) This warning is emitted if you use
a reference constructor on the left-hand side of an assignment to
alias one variable to another.  Simply suppress the warning if you
want to use the feature, but know that in doing so you are taking
the risk of using an experimental feature which may change or be
removed in a future Perl version:

    no warnings "experimental::refaliasing";
    use feature "refaliasing";
    \$x = \$y;

=item Allocation too large: %x

(X) You can't allocate more than 64K on an MS-DOS machine.

=item '%c' allowed only after types %s in %s

(F) The modifiers '!', '<' and '>' are allowed in pack() or unpack() only
after certain types.  See L<perlfunc/pack>.

=item alpha->numify() is lossy

(W numeric) An alpha version cannot be numified without losing
information.

=item Ambiguous call resolved as CORE::%s(), qualify as such or use &

(W ambiguous) A subroutine you have declared has the same name as a Perl
keyword, and you have used the name without qualification for calling
one or the other.  Perl decided to call the builtin because the
subroutine is not imported.

To force interpretation as a subroutine call, either put an ampersand
before the subroutine name, or qualify the name with its package.
Alternatively, you can import the subroutine (or pretend that it's
imported with the C<use subs> pragma).

To silently interpret it as the Perl operator, use the C<CORE::> prefix
on the operator (e.g. C<CORE::log($x)>) or declare the subroutine
to be an object method (see L<perlsub/"Subroutine Attributes"> or
L<attributes>).
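
The unambiguous spellings might look like the following sketch, which
assumes a user-defined C<log> subroutine in package C<main> (a
hypothetical example, not taken from any real module):

```perl

    use strict;
    use warnings;

    # A user-defined sub that shares its name with the builtin log().
    sub log { return "logged: @_" }

    my $from_sub     = &log("started");    # ampersand forces the user sub
    my $qualified    = main::log("done");  # so does a package-qualified name
    my $from_builtin = CORE::log(1);       # CORE:: forces the builtin

```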

=item Ambiguous range in transliteration operator

(F) You wrote something like C<tr/a-z-0//> which doesn't mean anything at
all.  To include a C<-> character in a transliteration, put it either
first or last.  (In the past, C<tr/a-z-0//> was synonymous with
C<tr/a-y//>, which was probably not what you would have expected.)
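
For instance, placing the C<-> last in the search list makes it a
literal character.  This is a small sketch with made-up variable names:

```perl

    use strict;
    use warnings;

    my $slug = "well-known";

    # The trailing "-" in the search list is a literal hyphen, not a range.
    (my $constant = $slug) =~ tr/a-z-/A-Z_/;

    # With an empty replacement list, tr simply counts the matches.
    my $hyphens = ($slug =~ tr/-//);

```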

=item Ambiguous use of %s resolved as %s

(S ambiguous) You said something that may not be interpreted the way
you thought.  Normally it's pretty easy to disambiguate it by supplying
a missing quote, operator, parenthesis pair or declaration.

=item Ambiguous use of -%s resolved as -&%s()

(S ambiguous) You wrote something like C<-foo>, which might be the
string C<"-foo">, or a call to the function C<foo>, negated.  If you meant
the string, just write C<"-foo">.  If you meant the function call,
write C<-foo()>.

=item Ambiguous use of %c resolved as operator %c

(S ambiguous) C<%>, C<&>, and C<*> are both infix operators (modulus,
bitwise and, and multiplication) I<and> initial special characters
(denoting hashes, subroutines and typeglobs), and you said something
like C<*foo * foo> that might be interpreted as either of them.  We
assumed you meant the infix operator, but please try to make it more
clear -- in the example given, you might write C<*foo * foo()> if you
really meant to multiply a glob by the result of calling a function.

=item Ambiguous use of %c{%s} resolved to %c%s

(W ambiguous) You wrote something like C<@{foo}>, which might be
asking for the variable C<@foo>, or it might be calling a function
named foo, and dereferencing it as an array reference.  If you wanted
the variable, you can just write C<@foo>.  If you wanted to call the
function, write C<@{foo()}> ... or you could just not have a variable
and a function with the same name, and save yourself a lot of trouble.

=item Ambiguous use of %c{%s[...]} resolved to %c%s[...]

=item Ambiguous use of %c{%s{...}} resolved to %c%s{...}

(W ambiguous) You wrote something like C<${foo[2]}> (where foo represents
the name of a Perl keyword), which might be looking for element number
2 of the array named C<@foo>, in which case please write C<$foo[2]>, or you
might have meant to pass an anonymous arrayref to the function named
foo, and then do a scalar deref on the value it returns.  If you meant
that, write C<${foo([2])}>.

In regular expressions, the C<${foo[2]}> syntax is sometimes necessary
to disambiguate between array subscripts and character classes.
C</$length[2345]/>, for instance, will be interpreted as C<$length> followed
by the character class C<[2345]>.  If an array subscript is what you
want, you can avoid the warning by changing C</${length[2345]}/> to the
unsightly C</${\$length[2345]}/>, by renaming your array to something
that does not coincide with a built-in keyword, or by simply turning
off warnings with C<no warnings 'ambiguous';>.

=item '|' and '<' may not both be specified on command line

(F) An error peculiar to VMS.  Perl does its own command line
redirection, and found that STDIN was a pipe, and that you also tried to
redirect STDIN using '<'.  Only one STDIN stream to a customer, please.

=item '|' and '>' may not both be specified on command line

(F) An error peculiar to VMS.  Perl does its own command line
redirection, and thinks you tried to redirect stdout both to a file and
into a pipe to another command.  You need to choose one or the other,
though nothing's stopping you from piping into a program or Perl script
which 'splits' output into two streams, such as

    open(OUT,">$ARGV[0]") or die "Can't write to $ARGV[0]: $!";
    while (<STDIN>) {
        print;
        print OUT;
    }
    close OUT;

=item Applying %s to %s will act on scalar(%s)

(W misc) The pattern match (C<//>), substitution (C<s///>), and
transliteration (C<tr///>) operators work on scalar values.  If you apply
one of them to an array or a hash, it will convert the array or hash to
a scalar value (the length of an array, or the population info of a
hash) and then work on that scalar value.  This is probably not what
you meant to do.  See L<perlfunc/grep> and L<perlfunc/map> for
alternatives.
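
A sketch of those alternatives, assuming the intent was to test or
modify each element rather than the array's length (the data is
invented for illustration):

```perl

    use strict;
    use warnings;

    my @names = qw(alice bob carol);

    # Match against every element instead of against scalar(@names).
    my @with_o = grep { /o/ } @names;
    my $count  = grep { /o/ } @names;   # scalar context gives the match count

    # Substitute in a copy of every element, leaving @names untouched.
    my @replaced = map { my $copy = $_; $copy =~ s/o/0/g; $copy } @names;

```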

=item Arg too short for msgsnd

(F) msgsnd() requires a string at least as long as sizeof(long).

=item Argument "%s" isn't numeric%s

(W numeric) The indicated string was fed as an argument to an operator
that expected a numeric value instead.  If you're fortunate the message
will identify which operator was so unfortunate.

Note that for C<Inf> and C<NaN> (infinity and not-a-number) the
definition of "numeric" is somewhat unusual: the strings themselves
(like "Inf") are considered numeric, and anything following them is
considered non-numeric.
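
One way to avoid this warning is to check a value before using it
numerically.  The sketch below uses C<looks_like_number> from the core
L<Scalar::Util> module; the sample strings are illustrative:

```perl

    use strict;
    use warnings;
    use Scalar::Util qw(looks_like_number);

    # Record, for each sample string, whether Perl regards it as numeric.
    my %is_numeric =
        map { ($_, looks_like_number($_) ? 1 : 0) }
        ("42", "3.14", "abc", "12abc", "Inf");

```

Here C<"12abc"> is reported as non-numeric even though it starts with
digits, which is exactly the kind of value that triggers the warning.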

=item Argument list not closed for PerlIO layer "%s"

(W layer) When pushing a layer with arguments onto the Perl I/O
system you forgot the ) that closes the argument list.  (Layers
take care of transforming data between external and internal
representations.)  Perl stopped parsing the layer list at this
point and did not attempt to push this layer.  If your program
didn't explicitly request the failing operation, it may be the
result of the value of the environment variable PERLIO.

=item Argument "%s" treated as 0 in increment (++)

(W numeric) The indicated string was fed as an argument to the C<++>
operator which expects either a number or a string matching
C</^[a-zA-Z]*[0-9]*\z/>.  See L<perlop/Auto-increment and
Auto-decrement> for details.
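
The difference can be sketched as follows (variable names are
illustrative).  A string matching the pattern gets the "magical"
string increment, while any other non-numeric string is treated as 0:

```perl

    use strict;
    use warnings;

    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    my $magic = "az";
    $magic++;            # matches the pattern: string increment

    my $plain = "a-b";
    $plain++;            # does not match: warns, then becomes 0 + 1

```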

=item Array passed to stat will be coerced to a scalar%s

(W syntax) You called stat() on an array, but the array will be
coerced to a scalar - the number of elements in the array.

=item A signature parameter must start with '$', '@' or '%'

(F) Each subroutine signature parameter declaration must start with a valid
sigil; for example:

    sub foo ($a, $, $b = 1, @c) {}

=item A slurpy parameter may not have a default value

(F) Only scalar subroutine signature parameters may have a default value;
for example:

    sub foo ($a = 1)        {} # legal
    sub foo (@a = (1))      {} # invalid
    sub foo (%a = (a => b)) {} # invalid

=item assertion botched: %s

(X) The malloc package that comes with Perl had an internal failure.

=item Assertion %s failed: file "%s", line %d

(X) A general assertion failed.  The file in question must be examined.

=item Assigned value is not a reference

(F) You tried to assign something that was not a reference to an lvalue
reference (e.g., C<\$x = $y>).  If you meant to make $x an alias to $y, use
C<\$x = \$y>.

=item Assigned value is not %s reference

(F) You tried to assign a reference to a reference constructor, but the
two references were not of the same type.  You cannot alias a scalar to
an array, or an array to a hash; the two types must match.

    \$x = \@y;  # error
    \@x = \%y;  # error
     $y = [];
    \$x = $y;   # error; did you mean \$y?

=item Assigning non-zero to $[ is no longer possible

(F) When the "array_base" feature is disabled (e.g., under C<use v5.16;>)
the special variable C<$[>, which is deprecated, is now a fixed zero value.

=item Assignment to both a list and a scalar

(F) If you assign to a conditional operator, the 2nd and 3rd arguments
must either both be scalars or both be lists.  Otherwise Perl won't
know which context to supply to the right side.

=item Assuming NOT a POSIX class since %s in regex; marked by S<<-- HERE> in m/%s/

(W regexp) You had something like these:

 [[:alnum]]
 [[:digit:xyz]

They look like they might have been meant to be the POSIX classes
C<[:alnum:]> or C<[:digit:]>.  If so, they should be written:

 [[:alnum:]]
 [[:digit:]xyz]

Since these aren't legal POSIX class specifications, but are legal
bracketed character classes, Perl treats them as the latter.  In the
first example, it matches the characters C<":">, C<"[">, C<"a">, C<"l">,
C<"m">, C<"n">, and C<"u">.

If these weren't meant to be POSIX classes, this warning message is
spurious, and can be suppressed by reordering things, such as

 [[al:num]]

or

 [[:munla]]

=item <> at require-statement should be quotes

(F) You wrote C<< require <file> >> when you should have written
C<require 'file'>.

=item Attempt to access disallowed key '%s' in a restricted hash

(F) The failing code has attempted to get or set a key which is not in
the current set of allowed keys of a restricted hash.
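
A restricted hash can be created with C<lock_keys> from the core
L<Hash::Util> module.  The following sketch (the hash contents are
invented) shows a misspelled key being rejected:

```perl

    use strict;
    use warnings;
    use Hash::Util qw(lock_keys);

    my %config = (host => "localhost", port => 8080);
    lock_keys(%config);      # only "host" and "port" are allowed from now on

    $config{port} = 9090;    # fine: the key is in the allowed set

    # A typo in the key name is now a trappable fatal error.
    my $ok    = eval { $config{prot} = 1; 1 };
    my $error = $ok ? '' : $@;

```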

=item Attempt to bless into a freed package

(F) You wrote C<bless $foo> with one argument after somehow causing
the current package to be freed.  Perl cannot figure out what to
do, so it throws up its hands in despair.

=item Attempt to bless into a reference

(F) The CLASSNAME argument to the bless() operator is expected to be
the name of the package to bless the resulting object into.  You've
supplied instead a reference to something: perhaps you wrote

    bless $self, $proto;

when you intended

    bless $self, ref($proto) || $proto;

If you actually want to bless into the stringified version
of the reference supplied, you need to stringify it yourself, for
example by:

    bless $self, "$proto";

=item Attempt to clear deleted array

(S debugging) An array was assigned to when it was being freed.
Freed values are not supposed to be visible to Perl code.  This
can also happen if XS code calls C<av_clear> from a custom magic
callback on the array.

=item Attempt to delete disallowed key '%s' from a restricted hash

(F) The failing code attempted to delete from a restricted hash a key
which is not in its key set.

=item Attempt to delete readonly key '%s' from a restricted hash

(F) The failing code attempted to delete a key whose value has been
declared readonly from a restricted hash.

=item Attempt to free non-arena SV: 0x%x

(S internal) All SV objects are supposed to be allocated from arenas
that will be garbage collected on exit.  An SV was discovered to be
outside any of those arenas.

=item Attempt to free nonexistent shared string '%s'%s

(S internal) Perl maintains a reference-counted internal table of
strings to optimize the storage and access of hash keys and other
strings.  This indicates someone tried to decrement the reference count
of a string that can no longer be found in the table.

=item Attempt to free temp prematurely: SV 0x%x

(S debugging) Mortalized values are supposed to be freed by the
free_tmps() routine.  This indicates that something else is freeing the
SV before the free_tmps() routine gets a chance, which means that the
free_tmps() routine will be freeing an unreferenced scalar when it does
try to free it.

=item Attempt to free unreferenced glob pointers

(S internal) The reference counts got screwed up on symbol aliases.

=item Attempt to free unreferenced scalar: SV 0x%x

(S internal) Perl went to decrement the reference count of a scalar to
see if it would go to 0, and discovered that it had already gone to 0
earlier, and should have been freed, and in fact, probably was freed.
This could indicate that SvREFCNT_dec() was called too many times, or
that SvREFCNT_inc() was called too few times, or that the SV was
mortalized when it shouldn't have been, or that memory has been
corrupted.

=item Attempt to pack pointer to temporary value

(W pack) You tried to pass a temporary value (like the result of a
function, or a computed expression) to the "p" pack() template.  This
means the result contains a pointer to a location that could become
invalid anytime, even before the end of the current statement.  Use
literals or global values as arguments to the "p" pack() template to
avoid this warning.

=item Attempt to reload %s aborted.

(F) You tried to load a file with C<use> or C<require> that failed to
compile once already.  Perl will not try to compile this file again
unless you delete its entry from %INC.  See L<perlfunc/require> and
L<perlvar/%INC>.
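
The sequence can be sketched like this, using a temporary directory
and a hypothetical module named C<HypoMod> that is first written with
a syntax error and then corrected:

```perl

    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    my $dir  = tempdir(CLEANUP => 1);
    my $path = "$dir/HypoMod.pm";
    unshift @INC, $dir;

    # Helper (hypothetical) that (re)writes the module source.
    sub write_module {
        my ($source) = @_;
        open my $fh, '>', $path or die "Can't write $path: $!";
        print {$fh} $source;
        close $fh or die "Can't close $path: $!";
    }

    write_module("package HypoMod; sub answer { 42 \n");   # missing brace
    my $first  = eval { require HypoMod; 1 } ? '' : $@;    # compile error
    my $second = eval { require HypoMod; 1 } ? '' : $@;    # reload aborted

    # Clear the failed entry, fix the file, and try again.
    delete $INC{'HypoMod.pm'};
    write_module("package HypoMod; sub answer { 42 } 1;\n");
    my $reloaded = eval { require HypoMod; 1 } ? 1 : 0;

```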

=item Attempt to set length of freed array

(W misc) You tried to set the length of an array which has
been freed.  You can do this by storing a reference to the
scalar representing the last index of an array and later
assigning through that reference.  For example

    $r = do {my @a; \$#a};
    $$r = 503;

=item Attempt to use reference as lvalue in substr

(W substr) You supplied a reference as the first argument to substr()
used as an lvalue, which is pretty strange.  Perhaps you forgot to
dereference it first.  See L<perlfunc/substr>.

=item Attribute "locked" is deprecated, and will disappear in Perl 5.28

(D deprecated) You have used the attributes pragma to modify the
"locked" attribute on a code reference.  The :locked attribute is
obsolete, has had no effect since 5005 threads were removed, and
will be removed in Perl 5.28.

=item Attribute prototype(%s) discards earlier prototype attribute in same sub

(W misc) A sub was declared as C<sub foo : prototype(A) : prototype(B) {}>, for
example.  Since each sub can only have one prototype, the earlier
declaration(s) are discarded while the last one is applied.

=item Attribute "unique" is deprecated, and will disappear in Perl 5.28

(D deprecated) You have used the attributes pragma to modify
the "unique" attribute on an array, hash or scalar reference.
The :unique attribute has had no effect since Perl 5.8.8, and
will be removed in Perl 5.28.

=item av_reify called on tied array

(S debugging) This indicates that something went wrong and Perl got I<very>
confused about C<@_> or C<@DB::args> being tied.

=item Bad arg length for %s, is %u, should be %d

(F) You passed a buffer of the wrong size to one of msgctl(), semctl()
or shmctl().  In C parlance, the correct sizes are, respectively,
S<sizeof(struct msqid_ds *)>, S<sizeof(struct semid_ds *)>, and
S<sizeof(struct shmid_ds *)>.

=item Bad evalled substitution pattern

(F) You've used the C</e> switch to evaluate the replacement for a
substitution, but perl found a syntax error in the code to evaluate,
most likely an unexpected right brace '}'.

=item Bad filehandle: %s

(F) A symbol was passed to something wanting a filehandle, but the
symbol has no filehandle associated with it.  Perhaps you didn't do an
open(), or did it in another package.

=item Bad free() ignored

(S malloc) An internal routine called free() on something that had never
been malloc()ed in the first place.  Mandatory, but can be disabled by
setting environment variable C<PERL_BADFREE> to 0.

This message can be seen quite often with DB_File on systems with "hard"
dynamic linking, like C<AIX> and C<OS/2>.  It is a bug of C<Berkeley DB>
which is left unnoticed if C<DB> uses I<forgiving> system malloc().

=item Bad hash

(P) One of the internal hash routines was passed a null HV pointer.

=item Badly placed ()'s

(A) You've accidentally run your script through B<csh> instead
of Perl.  Check the #! line, or manually feed your script into
Perl yourself.

=item Bad name after %s

(F) You started to name a symbol by using a package prefix, and then
didn't finish the symbol.  In particular, you can't interpolate outside
of quotes, so

    $var = 'myvar';
    $sym = mypack::$var;

is not the same as

    $var = 'myvar';
    $sym = "mypack::$var";

=item Bad plugin affecting keyword '%s'

(F) An extension using the keyword plugin mechanism violated the
plugin API.

=item Bad realloc() ignored

(S malloc) An internal routine called realloc() on something that
had never been malloc()ed in the first place.  Mandatory, but can
be disabled by setting the environment variable C<PERL_BADFREE> to 0.

=item Bad symbol for array

(P) An internal request asked to add an array entry to something that
wasn't a symbol table entry.

=item Bad symbol for dirhandle

(P) An internal request asked to add a dirhandle entry to something
that wasn't a symbol table entry.

=item Bad symbol for filehandle

(P) An internal request asked to add a filehandle entry to something
that wasn't a symbol table entry.

=item Bad symbol for hash

(P) An internal request asked to add a hash entry to something that
wasn't a symbol table entry.

=item Bad symbol for scalar

(P) An internal request asked to add a scalar entry to something that
wasn't a symbol table entry.

=item Bareword found in conditional

(W bareword) The compiler found a bareword where it expected a
conditional, which often indicates that an || or && was parsed as part
of the last argument of the previous construct, for example:

    open FOO || die;

It may also indicate a misspelled constant that has been interpreted as
a bareword:

    use constant TYPO => 1;
    if (TYOP) { print "foo" }

The C<strict> pragma is useful in avoiding such errors.

=item Bareword in require contains "%s"

=item Bareword in require maps to disallowed filename "%s"

=item Bareword in require maps to empty filename

(F) The bareword form of require has been invoked with a filename which could
not have been generated by a valid bareword permitted by the parser.  You
shouldn't be able to get this error from Perl code, but XS code may throw it
if it passes an invalid module name to C<Perl_load_module>.

=item Bareword in require must not start with a double-colon: "%s"

(F) In C<require Bare::Word>, the bareword is not allowed to start with a
double-colon.  Write C<require ::Foo::Bar> as C<require Foo::Bar> instead.

=item Bareword "%s" not allowed while "strict subs" in use

(F) With "strict subs" in use, a bareword is only allowed as a
subroutine identifier, in curly brackets or to the left of the "=>"
symbol.  Perhaps you need to predeclare a subroutine?

=item Bareword "%s" refers to nonexistent package

(W bareword) You used a qualified bareword of the form C<Foo::>, but the
compiler saw no other uses of that namespace before that point.  Perhaps
you need to predeclare a package?

=item BEGIN failed--compilation aborted

(F) An untrapped exception was raised while executing a BEGIN
subroutine.  Compilation stops immediately and the interpreter is
exited.

=item BEGIN not safe after errors--compilation aborted

(F) Perl found a C<BEGIN {}> subroutine (or a C<use> directive, which
implies a C<BEGIN {}>) after one or more compilation errors had already
occurred.  Since the intended environment for the C<BEGIN {}> could not
be guaranteed (due to the errors), and since subsequent code likely
depends on its correct operation, Perl just gave up.

=item \%d better written as $%d

(W syntax) Outside of patterns, backreferences live on as variables.
The use of backslashes is grandfathered on the right-hand side of a
substitution, but stylistically it's better to use the variable form
because other Perl programmers will expect it, and it works better if
there are more than 9 backreferences.
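
A short sketch of the preferred forms (the sample strings are made up):
C<\1> remains correct inside the pattern itself, while the replacement
side uses C<$1> and C<$2>:

```perl

    use strict;
    use warnings;

    my $name = "Wall Larry";

    # Replacement side: use $1 and $2 rather than \1 and \2.
    (my $fixed = $name) =~ s/(\w+) (\w+)/$2 $1/;

    # Inside the pattern, a backslashed backreference is still the right form.
    my $has_repeat = ("hello hello" =~ /\b(\w+) \1\b/) ? 1 : 0;

```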

=item Binary number > 0b11111111111111111111111111111111 non-portable

(W portable) The binary number you specified is larger than 2**32-1
(4294967295) and therefore non-portable between systems.  See
L<perlport> for more on portability concerns.

=item bind() on closed socket %s

(W closed) You tried to do a bind on a closed socket.  Did you forget to
check the return value of your socket() call?  See L<perlfunc/bind>.

=item binmode() on closed filehandle %s

(W unopened) You tried binmode() on a filehandle that was never opened.
Check your control flow and number of arguments.

=item Bit vector size > 32 non-portable

(W portable) Using bit vector sizes larger than 32 is non-portable.

=item Bizarre copy of %s

(P) Perl detected an attempt to copy an internal value that is not
copiable.

=item Bizarre SvTYPE [%d]

(P) When starting a new thread or returning values from a thread, Perl
encountered an invalid data type.

=item Both or neither range ends should be Unicode in regex; marked by
S<<-- HERE> in m/%s/

(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)

In a bracketed character class in a regular expression pattern, you
had a range which has exactly one end of it specified using C<\N{}>, and
the other end is specified using a non-portable mechanism.  Perl treats
the range as a Unicode range; that is, all the characters in it are
considered to be Unicode characters, which may have different code
points on some platforms Perl runs on.  For example, C<[\N{U+06}-\x08]>
is treated as if you had instead said C<[\N{U+06}-\N{U+08}]>, that is it
matches the characters whose code points in Unicode are 6, 7, and 8.
But that C<\x08> might indicate that you meant something different, so
the warning gets raised.

=item Buffer overflow in prime_env_iter: %s

(W internal) A warning peculiar to VMS.  While Perl was preparing to
iterate over %ENV, it encountered a logical name or symbol definition
which was too long, so it was truncated to the string shown.

=item Callback called exit

(F) A subroutine invoked from an external package via call_sv()
exited by calling exit.

=item %s() called too early to check prototype

(W prototype) You've called a function that has a prototype before the
parser saw a definition or declaration for it, and Perl could not check
that the call conforms to the prototype.  You need to either add an
early prototype declaration for the subroutine in question, or move the
subroutine definition ahead of the call to get proper prototype
checking.  Alternatively, if you are certain that you're calling the
function correctly, you may put an ampersand before the name to avoid
the warning.  See L<perlsub>.

=item Cannot chr %f

(F) You passed an invalid number (like an infinity or not-a-number) to C<chr>.

=item Cannot compress %f in pack

(F) You tried compressing an infinity or not-a-number as an unsigned
integer with BER, which makes no sense.

=item Cannot compress integer in pack

(F) An argument to pack("w",...) was too large to compress.
The BER compressed integer format can only be used with positive
integers, and you attempted to compress a very large number (> 1e308).
See L<perlfunc/pack>.

=item Cannot compress negative numbers in pack

(F) An argument to pack("w",...) was negative.  The BER compressed integer
format can only be used with positive integers.  See L<perlfunc/pack>.
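
To illustrate the C<"w"> template discussed in the entries above, this
sketch packs a small non-negative integer, round-trips it, and then
traps the error for a negative argument (variable names are
illustrative):

```perl

    use strict;
    use warnings;

    # BER compression stores 7 bits per byte, high bit set on all but the last.
    my $packed = pack("w", 300);
    my $hex    = unpack("H*", $packed);
    my ($back) = unpack("w", $packed);

    # A negative argument is a trappable fatal error.
    my $ok    = eval { my $bytes = pack("w", -1); 1 };
    my $error = $ok ? '' : $@;

```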

=item Cannot convert a reference to %s to typeglob

(F) You manipulated Perl's symbol table directly, stored a reference
in it, then tried to access that symbol via conventional Perl syntax.
The access triggers Perl to autovivify that typeglob, but there is
no legal conversion from that type of reference to a typeglob.

=item Cannot copy to %s

(P) Perl detected an attempt to copy a value to an internal type that cannot
be directly assigned to.

=item Cannot find encoding "%s"

(S io) You tried to apply an encoding that did not exist to a filehandle,
either with open() or binmode().

=item Cannot pack %f with '%c'

(F) You tried converting an infinity or not-a-number to an integer,
which makes no sense.

=item Cannot printf %f with '%c'

(F) You tried printing an infinity or not-a-number as a character (%c),
which makes no sense.  Maybe you meant '%s', or just stringifying it?

=item Cannot set tied @DB::args

(F) C<caller> tried to set C<@DB::args>, but found it tied.  Tying C<@DB::args>
is not supported.  (Before this error was added, it used to crash.)

=item Cannot tie unreifiable array

(P) You somehow managed to call C<tie> on an array that does not
keep a reference count on its arguments and cannot be made to
do so.  Such arrays are not even supposed to be accessible to
Perl code, but are only used internally.

=item Cannot yet reorder sv_catpvfn() arguments from va_list

(F) Some XS code tried to use C<sv_catpvfn()> or a related function with a
format string that specifies explicit indexes for some of the elements, and
using a C-style variable-argument list (a C<va_list>).  This is not currently
supported.  XS authors wanting to do this must instead construct a C array
of C<SV*> scalars containing the arguments.

=item Can only compress unsigned integers in pack

(F) An argument to pack("w",...) was not an integer.  The BER compressed
integer format can only be used with positive integers, and you attempted
to compress something else.  See L<perlfunc/pack>.

=item Can't bless non-reference value

(F) Only hard references may be blessed.  This is how Perl "enforces"
encapsulation of objects.  See L<perlobj>.

=item Can't "break" in a loop topicalizer

(F) You called C<break>, but you're in a C<foreach> block rather than
a C<given> block.  You probably meant to use C<next> or C<last>.

=item Can't "break" outside a given block

(F) You called C<break>, but you're not inside a C<given> block.

=item Can't call method "%s" on an undefined value

(F) You used the syntax of a method call, but the slot filled by the
object reference or package name contains an undefined value.  Something
like this will reproduce the error:

    $BADREF = undef;
    process $BADREF 1,2,3;
    $BADREF->process(1,2,3);

=item Can't call method "%s" on unblessed reference

(F) A method call must know in what package it's supposed to run.  It
ordinarily finds this out from the object reference you supply, but you
didn't supply an object reference in this case.  A reference isn't an
object reference until it has been blessed.  See L<perlobj>.
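
A defensive check can be sketched with C<blessed> from the core
L<Scalar::Util> module.  The C<Hypo::Thing> class and its C<name>
method are hypothetical, defined here only for the example:

```perl

    use strict;
    use warnings;
    use Scalar::Util qw(blessed);

    sub Hypo::Thing::name { return "thing" }

    my $plain  = {};                        # a reference, but not an object
    my $object = bless {}, 'Hypo::Thing';   # blessed, so methods resolve

    # Calling a method on the plain reference is a trappable fatal error.
    my $ok    = eval { $plain->name; 1 };
    my $error = $ok ? '' : $@;

    # Check for blessedness before dispatching.
    my @names = map { blessed($_) ? $_->name : "(not an object)" }
                ($plain, $object);

```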

=item Can't call method "%s" without a package or object reference

(F) You used the syntax of a method call, but the slot filled by the
object reference or package name contains an expression that returns a
defined value which is neither an object reference nor a package name.
Something like this will reproduce the error:

    $BADREF = 42;
    process $BADREF 1,2,3;
    $BADREF->process(1,2,3);

=item Can't call mro_isa_changed_in() on anonymous symbol table

(P) Perl got confused as to whether a hash was a plain hash or a
symbol table hash when trying to update @ISA caches.

=item Can't call mro_method_changed_in() on anonymous symbol table

(F) An XS module tried to call C<mro_method_changed_in> on a hash that was
not attached to the symbol table.

=item Can't chdir to %s

(F) You called C<perl -x/foo/bar>, but F</foo/bar> is not a directory
that you can chdir to, possibly because it doesn't exist.

=item Can't check filesystem of script "%s" for nosuid

(P) For some reason you can't check the filesystem of the script for
nosuid.

=item Can't coerce %s to %s in %s

(F) Certain types of SVs, in particular real symbol table entries
(typeglobs), can't be forced to stop being what they are.  So you can't
say things like:

    *foo += 1;

You CAN say

    $foo = *foo;
    $foo += 1;

but then $foo no longer contains a glob.

=item Can't "continue" outside a when block

(F) You called C<continue>, but you're not inside a C<when>
or C<default> block.

=item Can't create pipe mailbox

(P) An error peculiar to VMS.  The process is suffering from exhausted
quotas or other plumbing problems.

=item Can't declare %s in "%s"

(F) Only scalar, array, and hash variables may be declared as "my", "our" or
"state" variables.  They must have ordinary identifiers as names.

=item Can't "default" outside a topicalizer

(F) You have used a C<default> block that is neither inside a
C<foreach> loop nor a C<given> block.  (Note that this error is
issued on exit from the C<default> block, so you won't get the
error if you use an explicit C<continue>.)

=item Can't determine class of operator %s, assuming BASEOP

(S) This warning indicates something wrong in the internals of perl.
Perl was trying to find the class (e.g. LISTOP) of a particular OP,
and was unable to do so. This is likely to be due to a bug in the perl
internals, or due to a bug in XS code which manipulates perl optrees.

=item Can't do inplace edit: %s is not a regular file

(S inplace) You tried to use the B<-i> switch on a special file, such as
a file in /dev, a FIFO or an uneditable directory.  The file was ignored.

=item Can't do inplace edit on %s: %s

(S inplace) The creation of the new file failed for the indicated
reason.

=item Can't do inplace edit without backup

(F) You're on a system such as MS-DOS that gets confused if you try
reading from a deleted (but still opened) file.  You have to say
C<-i.bak>, or some such.

=item Can't do inplace edit: %s would not be unique

(S inplace) Your filesystem does not support filenames longer than 14
characters and Perl was unable to create a unique filename during
inplace editing with the B<-i> switch.  The file was ignored.

=item Can't do %s("%s") on non-UTF-8 locale; resolved to "%s".

(W locale) You are 1) running under "C<use locale>"; 2) the current
locale is not a UTF-8 one; 3) you tried to do the designated case-change
operation on the specified Unicode character; and 4) the result of this
operation would mix Unicode and locale rules, which likely conflict.
Mixing of different rule types is forbidden, so the operation was not
done; instead the result is the indicated value, which is the best
available that uses entirely Unicode rules.  That turns out to almost
always be the original character, unchanged.

It is generally a bad idea to mix non-UTF-8 locales and Unicode, and
this issue is one of the reasons why.  This warning is raised when
Unicode rules would normally cause the result of this operation to
contain a character that is in the range specified by the locale,
0..255, and hence is subject to the locale's rules, not Unicode's.

If you are using locale purely for its characteristics related to things
like its numeric and time formatting (and not C<LC_CTYPE>), consider
using a restricted form of the locale pragma (see L<perllocale/The "use
locale" pragma>) like "S<C<use locale ':not_characters'>>".

Note that failed case-changing operations done as a result of
case-insensitive C</i> regular expression matching will show up in this
warning as having the C<fc> operation (as that is what the regular
expression engine calls behind the scenes).

=item Can't do waitpid with flags

(F) This machine doesn't have either waitpid() or wait4(), so only
waitpid() without flags is emulated.

=item Can't emulate -%s on #! line

(F) The #! line specifies a switch that doesn't make sense at this
point.  For example, it'd be kind of silly to put a B<-x> on the #!
line.

=item Can't %s %s-endian %ss on this platform

(F) Your platform's byte-order is neither big-endian nor little-endian,
or it has a very strange pointer size.  Packing and unpacking big- or
little-endian floating point values and pointers may not be possible.
See L<perlfunc/pack>.

=item Can't exec "%s": %s

(W exec) A system(), exec(), or piped open call could not execute the
named program for the indicated reason.  Typical reasons include: the
permissions were wrong on the file, the file wasn't found in
C<$ENV{PATH}>, the executable in question was compiled for another
architecture, or the #! line in a script points to an interpreter that
can't be run for similar reasons.  (Or maybe your system doesn't support
#! at all.)

=item Can't exec %s

(F) Perl was trying to execute the indicated program for you because
that's what the #! line said.  If that's not what you wanted, you may
need to mention "perl" on the #! line somewhere.

=item Can't execute %s

(F) You used the B<-S> switch, but the copies of the script to execute
found in the PATH did not have correct permissions.

=item Can't find an opnumber for "%s"

(F) A string of a form C<CORE::word> was given to prototype(), but there
is no builtin with the name C<word>.

=item Can't find label %s

(F) You said to goto a label that isn't mentioned anywhere that it's
possible for us to go to.  See L<perlfunc/goto>.

=item Can't find %s on PATH

(F) You used the B<-S> switch, but the script to execute could not be
found in the PATH.

=item Can't find %s on PATH, '.' not in PATH

(F) You used the B<-S> switch, but the script to execute could not be
found in the PATH, or at least not with the correct permissions.  The
script exists in the current directory, but PATH prohibits running it.

=item Can't find string terminator %s anywhere before EOF

(F) Perl strings can stretch over multiple lines.  This message means
that the closing delimiter was omitted.  Because bracketed quotes count
nesting levels, the following is missing its final parenthesis:

    print q(The character '(' starts a side comment.);

If you're getting this error from a here-document, you may have
included unseen whitespace before or after your closing tag or there
may not be a linebreak after it.  A good programmer's editor will have
a way to help you find these characters (or lack of characters).  See
L<perlop> for the full details on here-documents.
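
For the bracketed-quote case, a minimal sketch (the variable names are
illustrative only) is to pick a delimiter that does not occur in the
text, or to keep the brackets balanced:

    use strict;
    use warnings;

    # An unbalanced "(" inside q(...) hides the real terminator, so
    # use a delimiter that the text does not contain.
    my $note = q{The character '(' starts a side comment.};

    # Balanced brackets nest, so this form is fine as well.
    my $pair = q(Parentheses (like these) may nest.);

    print "$note\n$pair\n";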

=item Can't find Unicode property definition "%s"

=item Can't find Unicode property definition "%s" in regex; marked by <-- HERE in m/%s/

(F) The named property which you specified via C<\p> or C<\P> is not one
known to Perl.  Perhaps you misspelled the name?  See
L<perluniprops/Properties accessible through \p{} and \P{}>
for a complete list of available official
properties.  If it is a
L<user-defined property|perlunicode/User-Defined Character Properties>
it must have been defined by the time the regular expression is
matched.

If you didn't mean to use a Unicode property, escape the C<\p>, either
by C<\\p> (just the C<\p>) or by C<\Q\p> (the rest of the string, or
until C<\E>).

=item Can't fork: %s

(F) A fatal error occurred while trying to fork while opening a
pipeline.

=item Can't fork, trying again in 5 seconds

(W pipe) A fork in a piped open failed with EAGAIN and will be retried
after five seconds.

=item Can't get filespec - stale stat buffer?

(S) A warning peculiar to VMS.  This arises because of the difference
between access checks under VMS and under the Unix model Perl assumes.
Under VMS, access checks are done by filename, rather than by bits in
the stat buffer, so that ACLs and other protections can be taken into
account.  Unfortunately, Perl assumes that the stat buffer contains all
the necessary information, and passes it, instead of the filespec, to
the access-checking routine.  It will try to retrieve the filespec using
the device name and FID present in the stat buffer, but this works only
if you haven't made a subsequent call to the CRTL stat() routine,
because the device name is overwritten with each call.  If this warning
appears, the name lookup failed, and the access-checking routine gave up
and returned FALSE, just to be conservative.  (Note: The access-checking
routine knows about the Perl C<stat> operator and file tests, so you
shouldn't ever see this warning in response to a Perl command; it arises
only if some internal code takes stat buffers lightly.)

=item Can't get pipe mailbox device name

(P) An error peculiar to VMS.  After creating a mailbox to act as a
pipe, Perl can't retrieve its name for later use.

=item Can't get SYSGEN parameter value for MAXBUF

(P) An error peculiar to VMS.  Perl asked $GETSYI how big you want your
mailbox buffers to be, and didn't get an answer.

=item Can't "goto" into the middle of a foreach loop

(F) A "goto" statement was executed to jump into the middle of a foreach
loop.  You can't get there from here.  See L<perlfunc/goto>.

=item Can't "goto" out of a pseudo block

(F) A "goto" statement was executed to jump out of what might look like
a block, except that it isn't a proper block.  This usually occurs if
you tried to jump out of a sort() block or subroutine, which is a no-no.
See L<perlfunc/goto>.

=item Can't goto subroutine from an eval-%s

(F) The "goto subroutine" call can't be used to jump out of an eval
"string" or block.

=item Can't goto subroutine from a sort sub (or similar callback)

(F) The "goto subroutine" call can't be used to jump out of the
comparison sub for a sort(), or from a similar callback (such
as the reduce() function in List::Util).

=item Can't goto subroutine outside a subroutine

(F) The deeply magical "goto subroutine" call can only replace one
subroutine call for another.  It can't manufacture one out of whole
cloth.  In general you should be calling it out of only an AUTOLOAD
routine anyway.  See L<perlfunc/goto>.
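
As a sketch of that AUTOLOAD usage, assuming a made-up package
C<Greeter> and a made-up function name C<wave>, the routine below
builds the missing subroutine, installs it, and then uses
C<goto &$code> to replace its own call frame with the new one:

    use strict;
    use warnings;

    package Greeter;            # hypothetical package for illustration

    our $AUTOLOAD;

    sub AUTOLOAD {
        my $name = $AUTOLOAD;
        $name =~ s/.*:://;              # strip the package qualifier
        return if $name eq 'DESTROY';   # never autoload destructors

        my $code = sub { return "hello from $name" };
        {
            no strict 'refs';
            *{$AUTOLOAD} = $code;       # install it for future calls
        }
        goto &$code;                    # replace this call frame
    }

    package main;

    print Greeter::wave(), "\n";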

=item Can't ignore signal CHLD, forcing to default

(W signal) Perl has detected that it is being run with the SIGCHLD
signal (sometimes known as SIGCLD) disabled.  Since disabling this
signal will interfere with proper determination of exit status of child
processes, Perl has reset the signal to its default value.  This
situation typically indicates that the parent program under which Perl
may be running (e.g. cron) is being very careless.

=item Can't kill a non-numeric process ID

(F) Process identifiers must be (signed) integers.  It is a fatal error to
attempt to kill() an undefined, empty-string or otherwise non-numeric
process identifier.

=item Can't "last" outside a loop block

(F) A "last" statement was executed to break out of the current block,
except that there's this itty bitty problem called there isn't a current
block.  Note that an "if" or "else" block doesn't count as a "loopish"
block, and neither does a block given to sort(), map() or grep().  You can
usually double the curlies to get the same effect though, because the
inner curlies will be considered a block that loops once.  See
L<perlfunc/last>.
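
The doubled-curlies idiom can be sketched as follows (the variable
names are illustrative).  The inner pair of braces is a bare block,
which counts as a loop that runs once, so C<last> has something to
leave:

    use strict;
    use warnings;

    my @log;
    my $ready = 0;

    if (!$ready) {{
        push @log, 'checking';
        last unless $ready;      # exits the inner bare block
        push @log, 'working';    # skipped
    }}
    push @log, 'done';

    print "@log\n";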

=item Can't linearize anonymous symbol table

(F) Perl tried to calculate the method resolution order (MRO) of a
package, but failed because the package stash has no name.

=item Can't load '%s' for module %s

(F) The module you tried to load failed to load a dynamic extension.
This may either mean that you upgraded your version of perl to one
that is incompatible with your old dynamic extensions (which is known
to happen between major versions of perl), or (more likely) that your
dynamic extension was built against an older version of the library
that is installed on your system.  You may need to rebuild your old
dynamic extensions.

=item Can't localize lexical variable %s

(F) You used local on a variable name that was previously declared as a
lexical variable using "my" or "state".  This is not allowed.  If you
want to localize a package variable of the same name, qualify it with
the package name.
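
A small sketch of the qualified form, using an illustrative variable
called C<depth> that exists both as a lexical and as a package
variable:

    use strict;
    use warnings;

    $main::depth = 1;          # package variable
    my $depth = 'lexical';     # lexical variable of the same name

    sub report { return $main::depth }

    sub nested {
        # Fully qualified, so local() applies to the package variable.
        local $main::depth = $main::depth + 1;
        return report();
    }

    print nested(), "\n";      # value while localized
    print report(), "\n";      # original value restored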

=item Can't localize through a reference

(F) You said something like C<local $$ref>, which Perl can't currently
handle, because when it goes to restore the old value of whatever $ref
pointed to after the scope of the local() is finished, it can't be sure
that $ref will still be a reference.

=item Can't locate %s

(F) You said to C<do> (or C<require>, or C<use>) a file that couldn't be found.
Perl looks for the file in all the locations mentioned in @INC, unless
the file name included the full path to the file.  Perhaps you need
to set the PERL5LIB or PERL5OPT environment variable to say where the
extra library is, or maybe the script needs to add the library name
to @INC.  Or maybe you just misspelled the name of the file.  See
L<perlfunc/require> and L<lib>.
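
The following sketch shows one way a script can extend @INC itself
and report a failed load without dying.  The directory and the module
name are made up for illustration:

    use strict;
    use warnings;
    use lib '/opt/myapp/lib';    # hypothetical directory, put first in @INC

    # Attempt the load inside eval so that a missing file is reported
    # rather than fatal.  My::Hypothetical::Module is a made-up name.
    my $loaded = eval { require My::Hypothetical::Module; 1 };

    print "load failed: $@" unless $loaded;
    print "search path:\n", map { "  $_\n" } @INC;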

=item Can't locate auto/%s.al in @INC

(F) A function (or method) was called in a package which allows
autoload, but there is no function to autoload.  Most probable causes
are a misprint in a function/method name or a failure to C<AutoSplit>
the file, say, by doing C<make install>.

=item Can't locate loadable object for module %s in @INC

(F) The module you loaded is trying to load an external library, such
as F<foo.so> or F<bar.dll>, but the L<DynaLoader> module was
unable to locate this library.  See L<DynaLoader>.

=item Can't locate object method "%s" via package "%s"

(F) You called a method correctly, and it correctly indicated a package
functioning as a class, but that package doesn't define that particular
method, nor do any of its base classes.  See L<perlobj>.

=item Can't locate object method "%s" via package "%s" (perhaps you forgot
to load "%s"?)

(F) You called a method on a class that did not exist, and the method
could not be found in UNIVERSAL.  This often means that a method
requires a package that has not been loaded.
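
As a sketch, the first call below names a class that was never loaded
(the class name is made up), while the second uses a class that has
been loaded with C<require>:

    use strict;
    use warnings;

    my $class = 'My::Hypothetical::Widget';   # made-up, never loaded

    my $obj = eval { $class->new };
    print "caught: $@" unless $obj;

    # A class that has been loaded works as expected.
    require File::Spec;
    print File::Spec->catfile('lib', 'Widget.pm'), "\n";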

=item Can't locate package %s for @%s::ISA

(W syntax) The @ISA array contained the name of another package that
doesn't seem to exist.

=item Can't locate PerlIO%s

(F) You tried to use in open() a PerlIO layer that does not exist,
e.g. open(FH, ">:nosuchlayer", "somefile").

=item Can't make list assignment to %ENV on this system

(F) List assignment to %ENV is not supported on some systems, notably
VMS.

=item Can't make loaded symbols global on this platform while loading %s

(S) A module passed the flag 0x01 to DynaLoader::dl_load_file() to request
that symbols from the stated file are made available globally within the
process, but that functionality is not available on this platform.  Whilst
the module likely will still work, this may prevent the perl interpreter
from loading other XS-based extensions which need to link directly to
functions defined in the C or XS code in the stated file.

=item Can't modify %s in %s

(F) You aren't allowed to assign to the item indicated, or otherwise try
to change it, such as with an auto-increment.

=item Can't modify nonexistent substring

(P) The internal routine that does assignment to a substr() was handed
a NULL.

=item Can't modify non-lvalue subroutine call of &%s

(F) Subroutines meant to be used in lvalue context should be declared as
such.  See L<perlsub/"Lvalue subroutines">.
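
A minimal sketch of such a declaration, with an illustrative
subroutine named C<setting>:

    use strict;
    use warnings;

    my $setting = 'off';

    # The :lvalue attribute lets a call to the sub appear on the
    # left-hand side of an assignment.
    sub setting :lvalue { $setting }

    setting() = 'on';
    print "$setting\n";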

=item Can't modify reference to %s in %s assignment

(F) Only a limited number of constructs can be used as the argument to a
reference constructor on the left-hand side of an assignment, and what
you used was not one of them.  See L<perlref/Assigning to References>.

=item Can't modify reference to localized parenthesized array in list
assignment

(F) Assigning to C<\local(@array)> or C<\(local @array)> is not supported, as
it is not clear exactly what it should do.  If you meant to make @array
refer to some other array, use C<\@array = \@other_array>.  If you want to
make the elements of @array aliases of the scalars referenced on the
right-hand side, use C<\(@array) = @scalar_refs>.

=item Can't modify reference to parenthesized hash in list assignment

(F) Assigning to C<\(%hash)> is not supported.  If you meant to make %hash
refer to some other hash, use C<\%hash = \%other_hash>.  If you want to
make the elements of %hash into aliases of the scalars referenced on the
right-hand side, use a hash slice: C<\@hash{@keys} = @those_scalar_refs>.

=item Can't msgrcv to read-only var

(F) The target of a msgrcv must be modifiable to be used as a receive
buffer.

=item Can't "next" outside a loop block

(F) A "next" statement was executed to reiterate the current block, but
there isn't a current block.  Note that an "if" or "else" block doesn't
count as a "loopish" block, and neither does a block given to sort(), map() or
grep().  You can usually double the curlies to get the same effect
though, because the inner curlies will be considered a block that loops
once.  See L<perlfunc/next>.

=item Can't open %s: %s

(S inplace) The implicit opening of a file through use of the C<< <> >>
filehandle, either implicitly under the C<-n> or C<-p> command-line
switches, or explicitly, failed for the indicated reason.  Usually
this is because you don't have read permission for a file which
you named on the command line.

(F) You tried to call perl with the B<-e> switch, but F</dev/null> (or
your operating system's equivalent) could not be opened.

=item Can't open a reference

(W io) You tried to open a scalar reference for reading or writing,
using the 3-arg open() syntax:

    open FH, '>', $ref;

but your version of perl is compiled without perlio, and this form of
open is not supported.

=item Can't open bidirectional pipe

(W pipe) You tried to say C<open(CMD, "|cmd|")>, which is not supported.
You can try any of several modules in the Perl library to do this, such
as IPC::Open2.  Alternately, direct the pipe's output to a file using
">", and then read it in under a different file handle.

=item Can't open error file %s as stderr

(F) An error peculiar to VMS.  Perl does its own command line
redirection, and couldn't open the file specified after '2>' or '2>>' on
the command line for writing.

=item Can't open input file %s as stdin

(F) An error peculiar to VMS.  Perl does its own command line
redirection, and couldn't open the file specified after '<' on the
command line for reading.

=item Can't open output file %s as stdout

(F) An error peculiar to VMS.  Perl does its own command line
redirection, and couldn't open the file specified after '>' or '>>' on
the command line for writing.

=item Can't open output pipe (name: %s)

(P) An error peculiar to VMS.  Perl does its own command line
redirection, and couldn't open the pipe into which to send data destined
for stdout.

=item Can't open perl script "%s": %s

(F) The script you specified can't be opened for the indicated reason.

If you're debugging a script that uses #!, and normally relies on the
shell's $PATH search, the -S option causes perl to do that search, so
you don't have to type the path or C<`which $scriptname`>.

=item Can't read CRTL environ

(S) A warning peculiar to VMS.  Perl tried to read an element of %ENV
from the CRTL's internal environment array and discovered the array was
missing.  You need to figure out where your CRTL misplaced its environ
or define F<PERL_ENV_TABLES> (see L<perlvms>) so that environ is not
searched.

=item Can't redeclare "%s" in "%s"

(F) A "my", "our" or "state" declaration was found within another declaration,
such as C<my ($x, my($y), $z)> or C<our (my $x)>.

=item Can't "redo" outside a loop block

(F) A "redo" statement was executed to restart the current block, but
there isn't a current block.  Note that an "if" or "else" block doesn't
count as a "loopish" block, and neither does a block given to sort(), map()
or grep().  You can usually double the curlies to get the same effect
though, because the inner curlies will be considered a block that
loops once.  See L<perlfunc/redo>.

=item Can't remove %s: %s, skipping file

(S inplace) You requested an inplace edit without creating a backup
file.  Perl was unable to remove the original file to replace it with
the modified file.  The file was left unmodified.

=item Can't rename %s to %s: %s, skipping file

(S inplace) The rename done by the B<-i> switch failed for some reason,
probably because you don't have write permission to the directory.

=item Can't reopen input pipe (name: %s) in binary mode

(P) An error peculiar to VMS.  Perl thought stdin was a pipe, and tried
to reopen it to accept binary data.  Alas, it failed.

=item Can't represent character for Ox%X on this platform

(F) There is a hard limit to how big a character code point can be due
to the fundamental properties of UTF-8, especially on EBCDIC
platforms.  The given code point exceeds that.  The only work-around is
to not use such a large code point.

=item Can't reset %ENV on this system

(F) You called C<reset('E')> or similar, which tried to reset
all variables in the current package beginning with "E".  In
the main package, that includes %ENV.  Resetting %ENV is not
supported on some systems, notably VMS.

=item Can't resolve method "%s" overloading "%s" in package "%s"

(F)(P) Error resolving overloading specified by a method name (as
opposed to a subroutine reference): no such method callable via the
package.  If the method name is C<???>, this is an internal error.

=item Can't return %s from lvalue subroutine

(F) Perl detected an attempt to return illegal lvalues (such as
temporary or readonly values) from a subroutine used as an lvalue.  This
is not allowed.

=item Can't return outside a subroutine

(F) The return statement was executed in mainline code, that is, where
there was no subroutine call to return out of.  See L<perlsub>.

=item Can't return %s to lvalue scalar context

(F) You tried to return a complete array or hash from an lvalue
subroutine, but you called the subroutine in a way that made Perl
think you meant to return only one value.  You probably meant to
write parentheses around the call to the subroutine, which tell
Perl that the call should be in list context.

=item Can't stat script "%s"

(P) For some reason you can't fstat() the script even though you have it
open already.  Bizarre.

=item Can't take log of %g

(F) For ordinary real numbers, you can't take the logarithm of a
negative number or zero.  There's a Math::Complex package that comes
standard with Perl, though, if you really want to do that for the
negative numbers.

=item Can't take sqrt of %g

(F) For ordinary real numbers, you can't take the square root of a
negative number.  There's a Math::Complex package that comes standard
with Perl, though, if you really want to do that.
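
A short sketch of the Math::Complex route.  Loading the module
replaces sqrt() (and log()) in the current package with versions that
return complex results; C<Re> and C<Im> are exported by the module:

    use strict;
    use warnings;
    use Math::Complex;       # overrides sqrt() and log() here

    my $root = sqrt(-4);     # a Math::Complex object, not a fatal error

    printf "real part: %g, imaginary part: %g\n", Re($root), Im($root);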

=item Can't undef active subroutine

(F) You can't undefine a routine that's currently running.  You can,
however, redefine it while it's running, and you can even undef the
redefined subroutine while the old routine is running.  Go figure.

=item Can't upgrade %s (%d) to %d

(P) The internal sv_upgrade routine adds "members" to an SV, making it
into a more specialized kind of SV.  The top several SV types are so
specialized, however, that they cannot be interconverted.  This message
indicates that such a conversion was attempted.

=item Can't use '%c' after -mname

(F) You tried to call perl with the B<-m> switch, but you put something
other than "=" after the module name.

=item Can't use a hash as a reference

(F) You tried to use a hash as a reference, as in
C<< %foo->{"bar"} >> or C<< %$ref->{"hello"} >>.  Versions of perl
<= 5.22.0 used to allow this syntax, but shouldn't
have.  This was deprecated in perl 5.6.1.

=item Can't use an array as a reference

(F) You tried to use an array as a reference, as in
C<< @foo->[23] >> or C<< @$ref->[99] >>.  Versions of perl <= 5.22.0
used to allow this syntax, but shouldn't have.  This
was deprecated in perl 5.6.1.

=item Can't use anonymous symbol table for method lookup

(F) The internal routine that does method lookup was handed a symbol
table that doesn't have a name.  Symbol tables can become anonymous
for example by undefining stashes: C<undef %Some::Package::>.

=item Can't use an undefined value as %s reference

(F) A value used as either a hard reference or a symbolic reference must
be a defined value.  This helps to delurk some insidious errors.
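
One common way to guard against this, sketched here with illustrative
names, is to supply a default before dereferencing (the C<//> operator
needs perl 5.10 or later):

    use strict;
    use warnings;

    my $list;                            # still undef

    # Dereferencing undef for reading is fatal under "use strict".
    my $ok = eval { my @copy = @$list; 1 };
    print "caught: $@" unless $ok;

    # Falling back to an empty array reference avoids the error.
    my @items = @{ $list // [] };
    print scalar(@items), " item(s)\n";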

=item Can't use bareword ("%s") as %s ref while "strict refs" in use

(F) Only hard references are allowed by "strict refs".  Symbolic
references are disallowed.  See L<perlref>.

=item Can't use %! because Errno.pm is not available

(F) The first time the C<%!> hash is used, perl automatically loads the
Errno.pm module.  The Errno module is expected to tie the %! hash to
provide symbolic names for C<$!> errno values.

=item Can't use both '<' and '>' after type '%c' in %s

(F) A type cannot be forced to have both big-endian and little-endian
byte-order at the same time, so this combination of modifiers is not
allowed.  See L<perlfunc/pack>.

=item Can't use 'defined(@array)' (Maybe you should just omit the defined()?)

(F) defined() is not useful on arrays because it
checks for an undefined I<scalar> value.  If you want to see if the
array is empty, just use C<if (@array) { # not empty }> for example.
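
Written out in full, with an illustrative array name, the emptiness
test looks like this:

    use strict;
    use warnings;

    my @queue;

    print "queue is empty\n" unless @queue;

    push @queue, 'job-1';
    print "queue holds ", scalar(@queue), " item(s)\n" if @queue;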

=item Can't use 'defined(%hash)' (Maybe you should just omit the defined()?)

(F) C<defined()> is not usually right on hashes.

Although C<defined %hash> is false on a plain not-yet-used hash, it
becomes true in several non-obvious circumstances, including iterators,
weak references, stash names, even remaining true after C<undef %hash>.
These things make C<defined %hash> fairly useless in practice, so it now
generates a fatal error.

If a check for non-empty is what you wanted then just put it in boolean
context (see L<perldata/Scalar values>):

    if (%hash) {
       # not empty
    }

If you had C<defined %Foo::Bar::QUUX> to check whether such a package
variable exists then that's never really been reliable, and isn't
a good way to enquire about the features of a package, or whether
it's loaded, etc.

=item Can't use %s for loop variable

(P) The parser got confused when trying to parse a C<foreach> loop.

=item Can't use global %s in "%s"

(F) You tried to declare a magical variable as a lexical variable.  This
is not allowed, because the magic can be tied to only one location
(namely the global variable) and it would be incredibly confusing to
have variables in your program that looked like magical variables but
weren't.

=item Can't use '%c' in a group with different byte-order in %s

(F) You attempted to force a different byte-order on a type
that is already inside a group with a byte-order modifier.
For example you cannot force little-endianness on a type that
is inside a big-endian group.

=item Can't use "my %s" in sort comparison

(F) The global variables $a and $b are reserved for sort comparisons.
You mentioned $a or $b in the same line as the <=> or cmp operator,
and the variable had earlier been declared as a lexical variable.
Either qualify the sort variable with the package name, or rename the
lexical variable.
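
A sketch of the package-qualified form, assuming the code runs in
package C<main>, so that C<$::a> and C<$::b> name the variables that
sort() sets:

    use strict;
    use warnings;

    my $a = 'unrelated lexical';   # shadows the sort variable $a

    # A plain { $a <=> $b } would now refer to the lexical, so name
    # the package variables explicitly ($::a is short for $main::a).
    my @sorted = sort { $::a <=> $::b } (10, 9, 100);

    print "@sorted\n";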

=item Can't use %s ref as %s ref

(F) You've mixed up your reference types.  You have to dereference a
reference of the type needed.  You can use the ref() function to
test the type of the reference, if need be.
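
As an illustration, the hypothetical helper C<describe> below checks
ref() first and dereferences only in the branch that matches:

    use strict;
    use warnings;

    # describe() is a made-up helper that dispatches on ref().
    sub describe {
        my ($thing) = @_;
        my $type = ref $thing;
        return 'array of ' . scalar(@$thing)      if $type eq 'ARRAY';
        return 'hash of '  . scalar(keys %$thing) if $type eq 'HASH';
        return 'code'                             if $type eq 'CODE';
        return 'plain scalar';
    }

    print describe($_), "\n" for [1, 2, 3], { a => 1 }, sub { 1 }, 42;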

=item Can't use string ("%s") as %s ref while "strict refs" in use

=item Can't use string ("%s"...) as %s ref while "strict refs" in use

(F) You've told Perl to dereference a string, something which
C<use strict> blocks to prevent it happening accidentally.  See
L<perlref/"Symbolic references">.  This can be triggered by an C<@> or C<$>
in a double-quoted string immediately before interpolating a variable,
for example in C<"user @$twitter_id">, which says to treat the contents
of C<$twitter_id> as an array reference; use a C<\> to have a literal C<@>
symbol followed by the contents of C<$twitter_id>: C<"user \@$twitter_id">.
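
The escaped form in a complete, runnable sketch (the value assigned to
C<$twitter_id> is just an example):

    use strict;
    use warnings;

    my $twitter_id = 'perl_news';      # example value

    my $text = "user \@$twitter_id";   # literal "@", then the contents
    print "$text\n";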

=item Can't use subscript on %s

(F) The compiler tried to interpret a bracketed expression as a
subscript.  But to the left of the brackets was an expression that
didn't look like a hash or array reference, or anything else subscriptable.

=item Can't use \%c to mean $%c in expression

(W syntax) In an ordinary expression, backslash is a unary operator that
creates a reference to its argument.  The use of backslash to indicate a
backreference to a matched substring is valid only as part of a regular
expression pattern.  Trying to do this in ordinary Perl code produces a
value that prints out looking like SCALAR(0xdecaf).  Use the $1 form
instead.
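
For example, in this sketch (the input string is illustrative) the
captured groups are read through C<$1> and C<$2> once the match has
succeeded:

    use strict;
    use warnings;

    my $line = 'colour=blue';

    if ($line =~ /^(\w+)=(\w+)$/) {
        my ($key, $value) = ($1, $2);   # $1 and $2, not \1 and \2
        print "$key is $value\n";
    }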

=item Can't weaken a nonreference

(F) You attempted to weaken something that was not a reference.  Only
references can be weakened.

=item Can't "when" outside a topicalizer

(F) You have used a when() block that is neither inside a C<foreach>
loop nor a C<given> block.  (Note that this error is issued on exit
from the C<when> block, so you won't get the error if the match fails,
or if you use an explicit C<continue>.)

=item Can't x= to read-only value

(F) You tried to repeat a constant value (often the undefined value)
with an assignment operator, which implies modifying the value itself.
Perhaps you need to copy the value to a temporary, and repeat that.

=item Character following "\c" must be printable ASCII

(F) In C<\cI<X>>, I<X> must be a printable (non-control) ASCII character.

Note that ASCII characters that don't map to control characters are
discouraged, and will generate the warning (when enabled)
L</""\c%c" is more clearly written simply as "%s"">.

=item Character following \%c must be '{' or a single-character Unicode property name in regex; marked by <-- HERE in m/%s/

(F) (In the above the C<%c> is replaced by either C<p> or C<P>.)  You
specified something that isn't a legal Unicode property name.  Most
Unicode properties are specified by C<\p{...}>.  But if the name is a
single character one, the braces may be omitted.

=item Character in 'C' format wrapped in pack

(W pack) You said

    pack("C", $x)

where $x is either less than 0 or more than 255; the C<"C"> format is
only for encoding native operating system characters (ASCII, EBCDIC,
and so on) and not for Unicode characters, so Perl behaved as if you meant

    pack("C", $x & 255)

If you actually want to pack Unicode codepoints, use the C<"U"> format
instead.
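
The two choices can be sketched side by side, using an illustrative
value of 300: mask explicitly when a byte is wanted, or use C<"U">
when a character with that code point is wanted:

    use strict;
    use warnings;

    my $x = 300;

    my $byte = pack("C", $x & 255);   # explicit wrap: 300 & 255 == 44
    my $char = pack("U", $x);         # one character, code point 300

    printf "byte value: %d\n", ord $byte;
    printf "code point: %d\n", ord $char;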

=item Character in 'c' format wrapped in pack

(W pack) You said

    pack("c", $x)

where $x is either less than -128 or more than 127; the C<"c"> format
is only for encoding native operating system characters (ASCII, EBCDIC,
and so on) and not for Unicode characters, so Perl behaved as if you meant

    pack("c", $x & 255)

If you actually want to pack Unicode codepoints, use the C<"U"> format
instead.

=item Character in '%c' format wrapped in unpack

(W unpack) You tried something like

   unpack("H", "\x{2a1}")

where the format expects to process a byte (a character with a value
below 256), but a higher value was provided instead.  Perl uses the
value modulo 256 instead, as if you had provided:

   unpack("H", "\x{a1}")

=item Character in 'W' format wrapped in pack

(W pack) You said

    pack("U0W", $x)

where $x is either less than 0 or more than 255.  However, C<U0>-mode
expects all values to fall in the interval [0, 255], so Perl behaved
as if you meant:

    pack("U0W", $x & 255)

=item Character(s) in '%c' format wrapped in pack

(W pack) You tried something like

   pack("u", "\x{1f3}b")

where the format expects to process a sequence of bytes (characters with
a value below 256), but some of the characters had a higher value.  Perl
uses the character values modulo 256 instead, as if you had provided:

   pack("u", "\x{f3}b")

=item Character(s) in '%c' format wrapped in unpack

(W unpack) You tried something like

   unpack("s", "\x{1f3}b")

where the format expects to process a sequence of bytes (characters with
a value below 256), but some of the characters had a higher value.  Perl
uses the character values modulo 256 instead, as if you had provided:

   unpack("s", "\x{f3}b")

=item charnames alias definitions may not contain a sequence of multiple spaces

(F) You defined a character name which had multiple space characters
in a row.  Change them to single spaces.  Usually these names are
defined in the C<:alias> import argument to C<use charnames>, but they
could be defined by a translator installed into C<$^H{charnames}>.  See
L<charnames/CUSTOM ALIASES>.

=item charnames alias definitions may not contain trailing white-space

(F) You defined a character name which ended in a space
character.  Remove the trailing space(s).  Usually these names are
defined in the C<:alias> import argument to C<use charnames>, but they
could be defined by a translator installed into C<$^H{charnames}>.
See L<charnames/CUSTOM ALIASES>.

=item chdir() on unopened filehandle %s

(W unopened) You tried chdir() on a filehandle that was never opened.

=item "\c%c" is more clearly written simply as "%s"

(W syntax) The C<\cI<X>> construct is intended to be a way to specify
non-printable characters.  You used it for a printable one, which
is better written as simply itself, perhaps preceded by a backslash
for non-word characters.  Doing it the way you did is not portable
between ASCII and EBCDIC platforms.

=item Cloning substitution context is unimplemented

(F) Creating a new thread inside the C<s///> operator is not supported.

=item closedir() attempted on invalid dirhandle %s

(W io) The dirhandle you tried to close is either closed or not really
a dirhandle.  Check your control flow.

=item close() on unopened filehandle %s

(W unopened) You tried to close a filehandle that was never opened.

=item Closure prototype called

(F) If a closure has attributes, the subroutine passed to an attribute
handler is the prototype that is cloned when a new closure is created.
This subroutine cannot be called.

=item \C no longer supported in regex; marked by S<<-- HERE> in m/%s/

(F) The \C character class used to allow a match of a single byte
within a multi-byte utf-8 character, but was removed in v5.24 as
it broke encapsulation and its implementation was extremely buggy.
If you really need to process the individual bytes, you probably
want to convert your string to one where each underlying byte is
stored as a character, with utf8::encode().
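
As a minimal sketch of that conversion (the variable names are
illustrative only), copy the string and encode the copy in place.  Each
character of the result then holds one byte of the UTF-8 encoding:

    my $string = "\x{263A}";   # a single character, U+263A
    my $bytes  = $string;      # copy first: utf8::encode works in place
    utf8::encode($bytes);

    printf "%d character(s), %d byte(s)\n",
        length($string), length($bytes);
    printf "%02X\n", ord($_) for split //, $bytes;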

=item Code missing after '/'

(F) You had a (sub-)template that ends with a '/'.  There must be
another template code following the slash.  See L<perlfunc/pack>.

=item Code point 0x%X is not Unicode, and not portable

(S non_unicode) You had a code point that has never been in any
standard, so it is likely that languages other than Perl will NOT
understand it.  At one time, it was legal in some standards to have code
points up to 0x7FFF_FFFF, but not higher, and this code point is higher.

Acceptance of these code points is a Perl extension, and you should
expect that nothing other than Perl can handle them; Perl itself on
EBCDIC platforms before v5.24 does not handle them.

Code points above 0xFFFF_FFFF require a word larger than 32 bits.

Perl also makes no guarantees that the representation of these code
points won't change at some point in the future, say when machines
become available that have larger than a 64-bit word.  At that time,
files written by an older Perl would require conversion before being
readable by a newer Perl.

=item Code point 0x%X is not Unicode, may not be portable

(S non_unicode) You had a code point above the Unicode maximum
of U+10FFFF.

Perl allows strings to contain a superset of Unicode code points, but
these may not be accepted by other languages/systems.  Further, even if
these languages/systems accept these large code points, they may have
chosen a different representation for them than the UTF-8-like one that
Perl has, which would mean files are not exchangeable between them and
Perl.

On EBCDIC platforms, code points above 0x3FFF_FFFF have a different
representation in Perl v5.24 than before, so any file containing these
that was written before that version will require conversion before
being readable by a later Perl.

=item %s: Command not found

(A) You've accidentally run your script through B<csh> or another shell
instead of Perl.  Check the #! line, or manually feed your script into
Perl yourself.  The #! line at the top of your file could look like

  #!/usr/bin/perl

=item %s: command not found

(A) You've accidentally run your script through B<bash> or another shell
instead of Perl.  Check the #! line, or manually feed your script into
Perl yourself.  The #! line at the top of your file could look like

  #!/usr/bin/perl

=item %s: command not found: %s

(A) You've accidentally run your script through B<zsh> or another shell
instead of Perl.  Check the #! line, or manually feed your script into
Perl yourself.  The #! line at the top of your file could look like

  #!/usr/bin/perl

=item Compilation failed in require

(F) Perl could not compile a file specified in a C<require> statement.
Perl uses this generic message when none of the errors that it
encountered were severe enough to halt compilation immediately.

=item Complex regular subexpression recursion limit (%d) exceeded

(W regexp) The regular expression engine uses recursion in complex
situations where back-tracking is required.  Recursion depth is limited
to 32766, or perhaps less in architectures where the stack cannot grow
arbitrarily.  ("Simple" and "medium" situations are handled without
recursion and are not subject to a limit.)  Try shortening the string
under examination; looping in Perl code (e.g. with C<while>) rather than
in the regular expression engine; or rewriting the regular expression so
that it is simpler or backtracks less.  (See L<perlfaq2> for information
on I<Mastering Regular Expressions>.)

=item connect() on closed socket %s

(W closed) You tried to do a connect on a closed socket.  Did you forget
to check the return value of your socket() call?  See
L<perlfunc/connect>.

=item Constant(%s): Call to &{$^H{%s}} did not return a defined value

(F) The subroutine registered to handle constant overloading
(see L<overload>) or a custom charnames handler (see
L<charnames/CUSTOM TRANSLATORS>) returned an undefined value.

=item Constant(%s): $^H{%s} is not defined

(F) The parser found inconsistencies while attempting to define an
overloaded constant.  Perhaps you forgot to load the corresponding
L<overload> pragma?

=item Constant is not %s reference

(F) A constant value (perhaps declared using the C<use constant> pragma)
is being dereferenced, but it amounts to the wrong type of reference.
The message indicates the type of reference that was expected.  This
usually indicates a syntax error in dereferencing the constant value.
See L<perlsub/"Constant Functions"> and L<constant>.

=item Constants from lexical variables potentially modified elsewhere are
deprecated. This will not be allowed in Perl 5.32

(D deprecated) You wrote something like

    my $var;
    $sub = sub () { $var };

but $var is referenced elsewhere and could be modified after the C<sub>
expression is evaluated.  Either it is explicitly modified elsewhere
(C<$var = 3>) or it is passed to a subroutine or to an operator like
C<printf> or C<map>, which may or may not modify the variable.

Traditionally, Perl has captured the value of the variable at that
point and turned the subroutine into a constant eligible for inlining.
In those cases where the variable can be modified elsewhere, this
breaks the behavior of closures, in which the subroutine captures
the variable itself, rather than its value, so future changes to the
variable are reflected in the subroutine's return value.

This usage is deprecated, and will no longer be allowed in Perl 5.32,
making it possible to change the behavior in the future.

If you intended for the subroutine to be eligible for inlining, then
make sure the variable is not referenced elsewhere, possibly by
copying it:

    my $var2 = $var;
    $sub = sub () { $var2 };

If you do want this subroutine to be a closure that reflects future
changes to the variable that it closes over, add an explicit C<return>:

    my $var;
    $sub = sub () { return $var };

=item Constant subroutine %s redefined

(W redefine)(S) You redefined a subroutine which had previously
been eligible for inlining.  See L<perlsub/"Constant Functions">
for commentary and workarounds.

=item Constant subroutine %s undefined

(W misc) You undefined a subroutine which had previously been eligible
for inlining.  See L<perlsub/"Constant Functions"> for commentary and
workarounds.

=item Constant(%s) unknown

(F) The parser found inconsistencies either while attempting
to define an overloaded constant, or when trying to find the
character name specified in the C<\N{...}> escape.  Perhaps you
forgot to load the corresponding L<overload> pragma?

=item :const is experimental

(S experimental::const_attr) The "const" attribute is experimental.
If you want to use the feature, disable the warning with C<no warnings
'experimental::const_attr'>, but know that in doing so you are taking
the risk that your code may break in a future Perl version.

=item :const is not permitted on named subroutines

(F) The "const" attribute causes an anonymous subroutine to be run and
its value captured at the time that it is cloned.  Named subroutines are
not cloned like this, so the attribute does not make sense on them.

=item Copy method did not return a reference

(F) The method which overloads "=" is buggy.  See
L<overload/Copy Constructor>.

=item &CORE::%s cannot be called directly

(F) You tried to call a subroutine in the C<CORE::> namespace
with C<&foo> syntax or through a reference.  Some subroutines
in this package cannot yet be called that way, but must be
called as barewords.  Something like this will work:

    BEGIN { *shove = \&CORE::push; }
    shove @array, 1,2,3; # pushes on to @array

=item CORE::%s is not a keyword

(F) The CORE:: namespace is reserved for Perl keywords.

=item Corrupted regexp opcode %d > %d

(P) This is either an error in Perl, or, if you're using
one, your L<custom regular expression engine|perlreapi>.  If not the
latter, report the problem through the L<perlbug> utility.

=item corrupted regexp pointers

(P) The regular expression engine got confused by what the regular
expression compiler gave it.

=item corrupted regexp program

(P) The regular expression engine got passed a regexp program without a
valid magic number.

=item Corrupt malloc ptr 0x%x at 0x%x

(P) The malloc package that comes with Perl had an internal failure.

=item Count after length/code in unpack

(F) You had an unpack template indicating a counted-length string, but
you have also specified an explicit size for the string.  See
L<perlfunc/pack>.

=item Declaring references is experimental

(S experimental::declared_refs) This warning is emitted if you use
a reference constructor on the right-hand side of C<my>, C<state>, C<our>, or
C<local>.  Simply suppress the warning if you want to use the feature, but
know that in doing so you are taking the risk of using an experimental
feature which may change or be removed in a future Perl version:

    no warnings "experimental::declared_refs";
    use feature "declared_refs";
    $fooref = my \$foo;

=for comment
The following are used in lib/diagnostics.t for testing two =items that
share the same description.  Changes here need to be propagated to there

=item Deep recursion on anonymous subroutine

=item Deep recursion on subroutine "%s"

(W recursion) This subroutine has called itself (directly or indirectly)
100 times more than it has returned.  This probably indicates an
infinite recursion, unless you're writing strange benchmark programs, in
which case it indicates something else.

This threshold can be changed from 100, by recompiling the F<perl> binary,
setting the C pre-processor macro C<PERL_SUB_DEPTH_WARN> to the desired value.

=item (?(DEFINE)....) does not allow branches in regex; marked by
S<<-- HERE> in m/%s/

(F) You used something like C<(?(DEFINE)...|..)> which is illegal.  The
most likely cause of this error is that you left out a parenthesis inside
of the C<....> part.

The S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.

=item %s defines neither package nor VERSION--version check failed

(F) You said something like "use Module 42" but in the Module file
there are neither package declarations nor a C<$VERSION>.

=item delete argument is index/value array slice, use array slice

(F) You used index/value array slice syntax (C<%array[...]>) as
the argument to C<delete>.  You probably meant C<@array[...]> with
an @ symbol instead.

=item delete argument is key/value hash slice, use hash slice

(F) You used key/value hash slice syntax (C<%hash{...}>) as the argument to
C<delete>.  You probably meant C<@hash{...}> with an @ symbol instead.
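
A short sketch of the hash slice form, on a perl that emits this
diagnostic; the hash and its keys are made up for illustration:

    my %hash = (a => 1, b => 2, c => 3);

    # Hash slice with a leading @: deletes both keys and
    # returns the values that were removed.
    my @removed = delete @hash{qw(a b)};

    print "removed: @removed\n";
    print "left: ", join(",", sort keys %hash), "\n";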

=item delete argument is not a HASH or ARRAY element or slice

(F) The argument to C<delete> must be either a hash or array element,
such as:

    $foo{$bar}
    $ref->{"susie"}[12]

or a hash or array slice, such as:

    @foo[$bar, $baz, $xyzzy]
    @{$ref->[12]}{"susie", "queue"}

=item Delimiter for here document is too long

(F) In a here document construct like C<<<FOO>, the label C<FOO> is too
long for Perl to handle.  You have to be seriously twisted to write code
that triggers this error.

=item Deprecated use of my() in false conditional. This will be a fatal error in Perl 5.30

(D deprecated) You used a declaration similar to C<my $x if 0>.  There
has been a long-standing bug in Perl that causes a lexical variable
not to be cleared at scope exit when its declaration includes a false
conditional.  Some people have exploited this bug to achieve a kind of
static variable.  Since we intend to fix this bug, we don't want people
relying on this behavior.  You can achieve a similar static effect by
declaring the variable in a separate block outside the function, e.g.,

    sub f { my $x if 0; return $x++ }

becomes

    { my $x; sub f { return $x++ } }

Beginning with perl 5.10.0, you can also use C<state> variables to have
lexicals that are initialized only once (see L<feature>):

    sub f { state $x; return $x++ }

This use of C<my()> in a false conditional has been deprecated since
Perl 5.10, and it will become a fatal error in Perl 5.30.

=item DESTROY created new reference to dead object '%s'

(F) A DESTROY() method created a new reference to the object which is
just being DESTROYed.  Perl is confused, and prefers to abort rather
than to create a dangling reference.

=item Did not produce a valid header

See L</500 Server error>.

=item %s did not return a true value

(F) A required (or used) file must return a true value to indicate that
it compiled correctly and ran its initialization code correctly.  It's
traditional to end such a file with a "1;", though any true value would
do.  See L<perlfunc/require>.
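
A minimal sketch of such a file; the package name C<My::Module> and its
subroutine are hypothetical:

    package My::Module;
    use strict;
    use warnings;

    sub hello { return "hello" }

    1;   # last statement evaluated in the file; must be true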

=item (Did you mean &%s instead?)

(W misc) You probably referred to an imported subroutine &FOO as $FOO or
some such.

=item (Did you mean "local" instead of "our"?)

(W misc) Remember that "our" does not localize the declared global
variable.  You have declared it again in the same lexical scope, which
seems superfluous.

=item (Did you mean $ or @ instead of %?)

(W) You probably said %hash{$key} when you meant $hash{$key} or
@hash{@keys}.  On the other hand, maybe you just meant %hash and got
carried away.

=item Died

(F) You passed die() an empty string (the equivalent of C<die "">) or
you called it with no args and C<$@> was empty.

=item Document contains no data

See L</500 Server error>.

=item %s does not define %s::VERSION--version check failed

(F) You said something like "use Module 42" but the Module did not
define a C<$VERSION>.

=item '/' does not take a repeat count

(F) You cannot put a repeat count of any kind right after the '/' code.
See L<perlfunc/pack>.

=item do "%s" failed, '.' is no longer in @INC; did you mean do "./%s"?

(D deprecated) Previously C< do "somefile"; > would search the current
directory for the specified file.  Since perl v5.26.0, F<.> has been
removed from C<@INC> by default, so this is no longer true.  To search the
current directory (and only the current directory) you can write
C< do "./somefile"; >.

=item Don't know how to get file name

(P) C<PerlIO_getname>, a perl internal I/O function specific to VMS, was
somehow called on another platform.  This should not happen.

=item Don't know how to handle magic of type \%o

(P) The internal handling of magical variables has been cursed.

=item do_study: out of memory

(P) This should have been caught by safemalloc() instead.

=item (Do you need to predeclare %s?)

(S syntax) This is an educated guess made in conjunction with the message
"%s found where operator expected".  It often means a subroutine or module
name is being referenced that hasn't been declared yet.  This may be
because of ordering problems in your file, or because of a missing
"sub", "package", "require", or "use" statement.  If you're referencing
something that isn't defined yet, you don't actually have to define the
subroutine or package before the current location.  You can use an empty
"sub foo;" or "package FOO;" to enter a "forward" declaration.
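
For instance, assuming a hypothetical subroutine C<greet> whose body
appears later in the file, a forward declaration lets it be called
without parentheses before the definition is seen:

    use strict;
    use warnings;

    sub greet;                      # forward declaration only

    my $message = greet "world";    # parsed as a subroutine call
    print "$message\n";

    sub greet { return "hello, $_[0]" }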

=item dump() better written as CORE::dump(). dump() will no longer be available in Perl 5.30

(D deprecated, misc) You used the obsolescent C<dump()> built-in function,
without fully qualifying it as C<CORE::dump()>. Maybe it's a typo.

Use of an unqualified C<dump()> was deprecated in Perl 5.8.0, and this
will not be available in Perl 5.30.

See L<perlfunc/dump>.

=item dump is not supported

(F) Your machine doesn't support dump/undump.

=item Duplicate free() ignored

(S malloc) An internal routine called free() on something that had
already been freed.

=item Duplicate modifier '%c' after '%c' in %s

(W unpack) You have applied the same modifier more than once after a
type in a pack template.  See L<perlfunc/pack>.

=item elseif should be elsif

(S syntax) There is no keyword "elseif" in Perl because Larry thinks
it's ugly.  Your code will be interpreted as an attempt to call a method
named "elseif" for the class returned by the following block.  This is
unlikely to be what you want.
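
The intended spelling looks like this (the variable names are
illustrative):

    my $n = 5;
    my $sign;

    if    ($n < 0)  { $sign = "negative" }
    elsif ($n == 0) { $sign = "zero" }
    else            { $sign = "positive" }

    print "$sign\n";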

=item Empty \%c in regex; marked by S<<-- HERE> in m/%s/

=item Empty \%c{} in regex; marked by S<<-- HERE> in m/%s/

(F) C<\p> and C<\P> are used to introduce a named Unicode property, as
described in L<perlunicode> and L<perlre>.  You used C<\p> or C<\P> in
a regular expression without specifying the property name.

=item ${^ENCODING} is no longer supported. Its use will be fatal in Perl 5.28

(D deprecated) The special variable C<${^ENCODING}>, formerly used to implement
the C<encoding> pragma, is no longer supported as of Perl 5.26.0.

Setting this variable will become a fatal error in Perl 5.28.

=item entering effective %s failed

(F) While under the C<use filetest> pragma, switching the real and
effective uids or gids failed.

=item %ENV is aliased to %s

(F) You're running under taint mode, and the C<%ENV> variable has been
aliased to another hash, so it no longer reflects the state of the
program's environment.  This is potentially insecure.

=item Error converting file specification %s

(F) An error peculiar to VMS.  Because Perl may have to deal with file
specifications in either VMS or Unix syntax, it converts them to a
single form when it must operate on them directly.  Either you've passed
an invalid file specification to Perl, or you've found a case the
conversion routines don't handle.  Drat.

=item Eval-group in insecure regular expression

(F) Perl detected tainted data when trying to compile a regular
expression that contains the C<(?{ ... })> zero-width assertion, which
is unsafe.  See L<perlre/(?{ code })>, and L<perlsec>.

=item Eval-group not allowed at runtime, use re 'eval' in regex m/%s/

(F) Perl tried to compile a regular expression containing the
C<(?{ ... })> zero-width assertion at run time, as it would when the
pattern contains interpolated values.  Since that is a security risk,
it is not allowed.  If you insist, you may still do this by using the
C<re 'eval'> pragma or by explicitly building the pattern from an
interpolated string at run time and using that in an eval().  See
L<perlre/(?{ code })>.

=item Eval-group not allowed, use re 'eval' in regex m/%s/

(F) A regular expression contained the C<(?{ ... })> zero-width
assertion, but that construct is only allowed when the C<use re 'eval'>
pragma is in effect.  See L<perlre/(?{ code })>.

=item EVAL without pos change exceeded limit in regex; marked by
S<<-- HERE> in m/%s/

(F) You used a pattern that nested too many EVAL calls without consuming
any text.  Restructure the pattern so that text is consumed.

The S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.

=item Excessively long <> operator

(F) The contents of a <> operator may not exceed the maximum size of a
Perl identifier.  If you're just trying to glob a long list of
filenames, try using the glob() operator, or put the filenames into a
variable and glob that.

=item exec? I'm not *that* kind of operating system

(F) The C<exec> function is not implemented on some systems, e.g., Symbian
OS.  See L<perlport>.

=item %sExecution of %s aborted due to compilation errors.

(F) The final summary message when a Perl compilation fails.

=item exists argument is not a HASH or ARRAY element or a subroutine

(F) The argument to C<exists> must be a hash or array element or a
subroutine with an ampersand, such as:

    $foo{$bar}
    $ref->{"susie"}[12]
    &do_something

=item exists argument is not a subroutine name

(F) The argument to C<exists> for C<exists &sub> must be a subroutine name,
and not a subroutine call.  C<exists &sub()> will generate this error.
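
A small sketch of the accepted form, using two hypothetical subroutine
names.  Note that C<exists> is true for a subroutine that has only been
declared, whereas C<defined> also requires a body:

    sub declared_only;          # declared, never given a body
    sub defined_sub { 1 }

    print "declared_only exists\n"  if exists  &declared_only;
    print "declared_only defined\n" if defined &declared_only;
    print "defined_sub defined\n"   if defined &defined_sub;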

=item Exiting eval via %s

(W exiting) You are exiting an eval by unconventional means, such as a
goto, or a loop control statement.

=item Exiting format via %s

(W exiting) You are exiting a format by unconventional means, such as a
goto, or a loop control statement.

=item Exiting pseudo-block via %s

(W exiting) You are exiting a rather special block construct (like a
sort block or subroutine) by unconventional means, such as a goto, or a
loop control statement.  See L<perlfunc/sort>.

=item Exiting subroutine via %s

(W exiting) You are exiting a subroutine by unconventional means, such
as a goto, or a loop control statement.

=item Exiting substitution via %s

(W exiting) You are exiting a substitution by unconventional means, such
as a return, a goto, or a loop control statement.

=item Expecting close bracket in regex; marked by S<<-- HERE> in m/%s/

(F) You wrote something like

 (?13

to denote a capturing group of the form
L<C<(?I<PARNO>)>|perlre/(?PARNO) (?-PARNO) (?+PARNO) (?R) (?0)>,
but omitted the C<")">.

=item Expecting '(?flags:(?[...' in regex; marked by S<<-- HERE> in m/%s/

(F) The C<(?[...])> extended character class regular expression construct
only allows character classes (including character class escapes like
C<\d>), operators, and parentheses.  The one exception is C<(?flags:...)>
containing at least one flag and exactly one C<(?[...])> construct.
This allows a regular expression containing just C<(?[...])> to be
interpolated.  If you see this error message, then you probably
have some other C<(?...)> construct inside your character class.  See
L<perlrecharclass/Extended Bracketed Character Classes>.

=item Experimental aliasing via reference not enabled

(F) To do aliasing via references, you must first enable the feature:

    no warnings "experimental::refaliasing";
    use feature "refaliasing";
    \$x = \$y;

=item Experimental %s on scalar is now forbidden

(F) An experimental feature added in Perl 5.14 allowed C<each>, C<keys>,
C<push>, C<pop>, C<shift>, C<splice>, C<unshift>, and C<values> to be called with a
scalar argument.  This experiment is considered unsuccessful, and
has been removed.  The C<postderef> feature may meet your needs better.

=item Experimental subroutine signatures not enabled

(F) To use subroutine signatures, you must first enable them:

    no warnings "experimental::signatures";
    use feature "signatures";
    sub foo ($left, $right) { ... }

=item Explicit blessing to '' (assuming package main)

(W misc) You are blessing a reference to a zero length string.  This has
the effect of blessing the reference into the package main.  This is
usually not what you want.  Consider providing a default target package,
e.g. C<bless($ref, $p || 'MyPackage');>
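
As a sketch of that default, assuming a hypothetical constructor
C<new_object> and fallback package C<MyPackage>:

    use strict;
    use warnings;

    sub new_object {
        my ($class) = @_;
        # Fall back to a named package when $class is empty or undef.
        return bless {}, $class || 'MyPackage';
    }

    print ref(new_object('')), "\n";
    print ref(new_object('Other::Class')), "\n";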

=item %s: Expression syntax

(A) You've accidentally run your script through B<csh> instead of Perl.
Check the #! line, or manually feed your script into Perl yourself.

=item %s failed--call queue aborted

(F) An untrapped exception was raised while executing a UNITCHECK,
CHECK, INIT, or END subroutine.  Processing of the remainder of the
queue of such routines has been prematurely ended.

=item Failed to close in-place edit file %s: %s

(F) Closing an output file from in-place editing, as with the C<-i>
command-line switch, failed.

=item False [] range "%s" in regex; marked by S<<-- HERE> in m/%s/

(W regexp)(F) A character class range must start and end at a literal
character, not another character class like C<\d> or C<[:alpha:]>.  The "-"
in your false range is interpreted as a literal "-".  In a C<(?[...])>
construct, this is an error, rather than a warning.  Consider quoting
the "-", "\-".  The S<<-- HERE> shows whereabouts in the regular expression
the problem was discovered.  See L<perlre>.

=item Fatal VMS error (status=%d) at %s, line %d

(P) An error peculiar to VMS.  Something untoward happened in a VMS
system service or RTL routine; Perl's exit status should provide more
details.  The filename in "at %s" and the line number in "line %d" tell
you which section of the Perl source code is distressed.

=item fcntl is not implemented

(F) Your machine apparently doesn't implement fcntl().  What is this, a
PDP-11 or something?

=item FETCHSIZE returned a negative value

(F) A tied array claimed to have a negative number of elements, which
is not possible.

=item Field too wide in 'u' format in pack

(W pack) Each line in a uuencoded string starts with a length indicator
which can't encode values above 63.  So there is no point in asking for
a line length bigger than that.  Perl will behave as if you specified
C<u63> as the format.

=item File::Glob::glob() will disappear in perl 5.30. Use File::Glob::bsd_glob() instead.

(D deprecated) C<< File::Glob >> has a function called C<< glob >>, which
just calls C<< bsd_glob >>. However, its prototype is different from the
prototype of C<< CORE::glob >>, and hence, C<< File::Glob::glob >> should
not be used.

C<< File::Glob::glob() >> was deprecated in perl 5.8.0. A deprecation
message was issued from perl 5.26.0 onwards, and the function will
disappear in perl 5.30.0.

Code using C<< File::Glob::glob() >> should call
C<< File::Glob::bsd_glob() >> instead.
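
A minimal sketch of the replacement call; the pattern C<*.c> is only an
example:

    use strict;
    use warnings;
    use File::Glob qw(bsd_glob);

    my @c_files = bsd_glob("*.c");
    print "$_\n" for @c_files;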

=item Filehandle %s opened only for input

(W io) You tried to write on a read-only filehandle.  If you intended
it to be a read-write filehandle, you needed to open it with "+<" or
"+>" or "+>>" instead of with "<" or nothing.  If you intended only to
write the file, use ">" or ">>".  See L<perlfunc/open>.
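
The following sketch opens an existing file for both reading and writing
with C<"+E<lt>">.  The temporary file is created only so that the example
is self-contained; the C<seek> between the read and the write is there
because switching direction on a read-write handle generally needs one:

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    my ($tmp_fh, $path) = tempfile(UNLINK => 1);
    print {$tmp_fh} "first line\n";
    close $tmp_fh or die "can't close $path: $!";

    open(my $fh, "+<", $path) or die "can't open $path: $!";
    my $line = <$fh>;                  # reading works
    seek($fh, 0, 2);                   # move to end of file
    print {$fh} "second line\n";       # and so does writing
    close $fh or die "can't close $path: $!";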

=item Filehandle %s opened only for output

(W io) You tried to read from a filehandle opened only for writing.  If
you intended it to be a read/write filehandle, you needed to open it
with "+<" or "+>" or "+>>" instead of with ">".  If you intended only to
read from the file, use "<".  See L<perlfunc/open>.  Another possibility
is that you attempted to open filedescriptor 0 (also known as STDIN) for
output (maybe you closed STDIN earlier?).

=item Filehandle %s reopened as %s only for input

(W io) You opened for reading a filehandle that got the same filehandle id
as STDOUT or STDERR.  This occurred because you closed STDOUT or STDERR
previously.

=item Filehandle STDIN reopened as %s only for output

(W io) You opened for writing a filehandle that got the same filehandle id
as STDIN.  This occurred because you closed STDIN previously.

=item Final $ should be \$ or $name

(F) You must now decide whether the final $ in a string was meant to be
a literal dollar sign, or was meant to introduce a variable name that
happens to be missing.  So you have to put either the backslash or the
name.

=item flock() on closed filehandle %s

(W closed) The filehandle you're attempting to flock() got itself closed
some time before now.  Check your control flow.  flock() operates on
filehandles.  Are you attempting to call flock() on a dirhandle by the
same name?

=item Format not terminated

(F) A format must be terminated by a line with a solitary dot.  Perl got
to the end of your file without finding such a line.

=item Format %s redefined

(W redefine) You redefined a format.  To suppress this warning, say

    {
	no warnings 'redefine';
	eval "format NAME =...";
    }

=item Found = in conditional, should be ==

(W syntax) You said

    if ($foo = 123)

when you meant

    if ($foo == 123)

(or something like that).

=item %s found where operator expected

(S syntax) The Perl lexer knows whether to expect a term or an operator.
If it sees what it knows to be a term when it was expecting to see an
operator, it gives you this warning.  Usually it indicates that an
operator or delimiter was omitted, such as a semicolon.

=item gdbm store returned %d, errno %d, key "%s"

(S) A warning from the GDBM_File extension that a store failed.

=item gethostent not implemented

(F) Your C library apparently doesn't implement gethostent(), probably
because if it did, it'd feel morally obligated to return every hostname
on the Internet.

=item get%sname() on closed socket %s

(W closed) You tried to get a socket or peer socket name on a closed
socket.  Did you forget to check the return value of your socket() call?

=item getpwnam returned invalid UIC %#o for user "%s"

(S) A warning peculiar to VMS.  The call to C<sys$getuai> underlying the
C<getpwnam> operator returned an invalid UIC.

=item getsockopt() on closed socket %s

(W closed) You tried to get a socket option on a closed socket.  Did you
forget to check the return value of your socket() call?  See
L<perlfunc/getsockopt>.

=item given is experimental

(S experimental::smartmatch) C<given> depends on smartmatch, which
is experimental, so its behavior may change or even be removed
in any future release of perl.  See the explanation under
L<perlsyn/Experimental Details on given and when>.

=item Global symbol "%s" requires explicit package name (did you forget to
declare "my %s"?)

(F) You've said "use strict" or "use strict vars", which indicates 
that all variables must either be lexically scoped (using "my" or "state"), 
declared beforehand using "our", or explicitly qualified to say 
which package the global variable is in (using "::").
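
The three accepted forms can be sketched as follows; the variable names
are illustrative only:

    use strict;

    my  $lexical = 1;        # lexically scoped
    our $shared  = 2;        # package variable declared with "our"
    $main::qualified = 3;    # explicitly package-qualified

    print $lexical + $shared + $main::qualified, "\n";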

=item glob failed (%s)

(S glob) Something went wrong with the external program(s) used
for C<glob> and C<< <*.c> >>.  Usually, this means that you supplied a C<glob>
pattern that caused the external program to fail and exit with a
nonzero status.  If the message indicates that the abnormal exit
resulted in a coredump, this may also mean that your csh (C shell)
is broken.  If so, you should change all of the csh-related variables
in config.sh:  If you have tcsh, make the variables refer to it as
if it were csh (e.g. C<full_csh='/usr/bin/tcsh'>); otherwise, make them
all empty (except that C<d_csh> should be C<'undef'>) so that Perl will
think csh is missing.  In either case, after editing config.sh, run
C<./Configure -S> and rebuild Perl.

=item Glob not terminated

(F) The lexer saw a left angle bracket in a place where it was expecting
a term, so it's looking for the corresponding right angle bracket, and
not finding it.  Chances are you left some needed parentheses out
earlier in the line, and you really meant a "less than".

=item gmtime(%f) failed

(W overflow) You called C<gmtime> with a number that it could not handle:
too large, too small, or NaN.  The returned value is C<undef>.

=item gmtime(%f) too large

(W overflow) You called C<gmtime> with a number that was larger than
it can reliably handle and C<gmtime> probably returned the wrong
date.  This warning is also triggered with NaN (the special
not-a-number value).

=item gmtime(%f) too small

(W overflow) You called C<gmtime> with a number that was smaller than
it can reliably handle and C<gmtime> probably returned the wrong date.

=item Got an error from DosAllocMem

(P) An error peculiar to OS/2.  Most probably you're using an obsolete
version of Perl, and this should not happen anyway.

=item goto must have label

(F) Unlike with "next" or "last", you're not allowed to goto an
unspecified destination.  See L<perlfunc/goto>.

=item Goto undefined subroutine%s

(F) You tried to call a subroutine with C<goto &sub> syntax, but
the indicated subroutine hasn't been defined, or if it was, it
has since been undefined.

=item Group name must start with a non-digit word character in regex; marked by 
S<<-- HERE> in m/%s/

(F) Group names must follow the rules for perl identifiers, meaning
they must start with a non-digit word character.  A common cause of
this error is using (?&0) instead of (?0).  See L<perlre>.

=item ()-group starts with a count

(F) A ()-group started with a count.  A count is supposed to follow
something: a template character or a ()-group.  See L<perlfunc/pack>.

=item %s had compilation errors.

(F) The final summary message when a C<perl -c> fails.

=item Had to create %s unexpectedly

(S internal) A routine asked for a symbol from a symbol table that ought
to have existed already, but for some reason it didn't, and had to be
created on an emergency basis to prevent a core dump.

=item %s has too many errors

(F) The parser has given up trying to parse the program after 10 errors.
Further error messages would likely be uninformative.

=item Hexadecimal float: exponent overflow

(W overflow) The hexadecimal floating point has a larger exponent
than the floating point supports.

=item Hexadecimal float: exponent underflow

(W overflow) The hexadecimal floating point has a smaller exponent
than the floating point supports.  With the IEEE 754 floating point,
this may also mean that the subnormals (formerly known as denormals)
are being used, which may or may not be an error.

=item Hexadecimal float: internal error (%s)

(F) Something went horribly wrong in hexadecimal float handling.

=item Hexadecimal float: mantissa overflow

(W overflow) The hexadecimal floating point literal had more bits in
the mantissa (the part between the 0x and the exponent, also known as
the fraction or the significand) than the floating point supports.

=item Hexadecimal float: precision loss

(W overflow) The hexadecimal floating point had internally more
digits than could be output.  This can be caused by unsupported
long double formats, or by 64-bit integers not being available
(needed to retrieve the digits under some configurations).

=item Hexadecimal float: unsupported long double format

(F) You have configured Perl to use long doubles but
the internals of the long double format are unknown;
therefore the hexadecimal float output is impossible.

=item Hexadecimal number > 0xffffffff non-portable

(W portable) The hexadecimal number you specified is larger than 2**32-1
(4294967295) and therefore non-portable between systems.  See
L<perlport> for more on portability concerns.

=item Identifier too long

(F) Perl limits identifiers (names for variables, functions, etc.) to
about 250 characters for simple names, and somewhat more for compound
names (like C<$A::B>).  You've exceeded Perl's limits.  Future versions
of Perl are likely to eliminate these arbitrary limitations.

=item Ignoring zero length \N{} in character class in regex; marked by
S<<-- HERE> in m/%s/

(W regexp) Named Unicode character escapes (C<\N{...}>) may return a
zero-length sequence.  When such an escape is used in a character
class its behavior is not well defined.  Check that the correct
escape has been used, and the correct charname handler is in scope.

=item Illegal binary digit %s

(F) You used a digit other than 0 or 1 in a binary number.

=item Illegal binary digit %s ignored

(W digit) You may have tried to use a digit other than 0 or 1 in a
binary number.  Interpretation of the binary number stopped before the
offending digit.

=item Illegal character after '_' in prototype for %s : %s

(W illegalproto) An illegal character was found in a prototype
declaration.  The '_' in a prototype must be followed by a ';',
indicating the rest of the parameters are optional, or one of '@'
or '%', since those two will accept 0 or more final parameters.

=item Illegal character \%o (carriage return)

(F) Perl normally treats carriage returns in the program text as
it would any other whitespace, which means you should never see
this error when Perl was built using standard options.  For some
reason, your version of Perl appears to have been built without
this support.  Talk to your Perl administrator.

=item Illegal character following sigil in a subroutine signature

(F) A parameter in a subroutine signature contained an unexpected character
following the C<$>, C<@> or C<%> sigil character.  Normally the sigil
should be followed by the variable name or C<=> etc.  Perhaps you are
trying to use a prototype while in the scope of C<use feature 'signatures'>?
For example:

    sub foo ($$) {}            # legal - a prototype

    use feature 'signatures';
    sub foo ($$) {}            # illegal - was expecting a signature
    sub foo ($a, $b)
            :prototype($$) {}  # legal


=item Illegal character in prototype for %s : %s

(W illegalproto) An illegal character was found in a prototype declaration.
Legal characters in prototypes are $, @, %, *, ;, [, ], &, \, and +.
Perhaps you were trying to write a subroutine signature but didn't enable
that feature first (C<use feature 'signatures'>), so your signature was
instead interpreted as a bad prototype.

=item Illegal declaration of anonymous subroutine

(F) When using the C<sub> keyword to construct an anonymous subroutine,
you must always specify a block of code.  See L<perlsub>.

=item Illegal declaration of subroutine %s

(F) A subroutine was not declared correctly.  See L<perlsub>.

=item Illegal division by zero

(F) You tried to divide a number by 0.  Either something was wrong in
your logic, or you need to put a conditional in to guard against
meaningless input.
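
A minimal sketch of such a guard follows.  The helper name
C<safe_divide> and the variable names are illustrative only and are
not part of Perl.

    use strict;
    use warnings;

    # Hypothetical helper: returns undef instead of dying on a zero divisor.
    sub safe_divide {
        my ($num, $den) = @_;
        return undef if $den == 0;
        return $num / $den;
    }

    my $ratio = safe_divide(10, 0);
    print defined $ratio ? "ratio: $ratio\n" : "divisor was zero\n";

    # Alternatively, trap the fatal error with eval and inspect $@.
    my ($n, $d) = (10, 0);
    my $result = eval { $n / $d };
    print "caught: $@" unless defined $result;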

=item Illegal hexadecimal digit %s ignored

(W digit) You may have tried to use a character other than 0 - 9 or
A - F, a - f in a hexadecimal number.  Interpretation of the hexadecimal
number stopped before the illegal character.

=item Illegal modulus zero

(F) You tried to divide a number by 0 to get the remainder.  Most
numbers don't take to this kindly.

=item Illegal number of bits in vec

(F) The number of bits in vec() (the third argument) must be a power of
two from 1 to 32 (or 64, if your platform supports that).
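
As a short illustration (variable names are arbitrary), a width of 8
is accepted while a width of 3 triggers this error at run time:

    use strict;
    use warnings;

    my $bits = '';
    vec($bits, 0, 8) = 65;    # 8 bits per element: a legal width
    vec($bits, 1, 8) = 66;
    print "$bits\n";          # prints "AB"

    # A width of 3 is not a power of two, so this call dies.
    my $width = 3;
    my $ok = eval { my $x = vec($bits, 0, $width); 1 };
    print "vec failed: $@" unless $ok;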

=item Illegal octal digit %s

(F) You used an 8 or 9 in an octal number.

=item Illegal octal digit %s ignored

(W digit) You may have tried to use an 8 or 9 in an octal number.
Interpretation of the octal number stopped before the 8 or 9.

=item Illegal pattern in regex; marked by S<<-- HERE> in m/%s/

(F) You wrote something like

 (?+foo)

The C<"+"> is valid only when followed by digits, indicating a
capturing group.  See
L<C<(?I<PARNO>)>|perlre/(?PARNO) (?-PARNO) (?+PARNO) (?R) (?0)>.

=item Illegal suidscript

(F) The script run under suidperl was somehow illegal.

=item Illegal switch in PERL5OPT: -%c

(X) The PERL5OPT environment variable may only be used to set the
following switches: B<-[CDIMUdmtw]>.

=item Illegal user-defined property name

(F) You specified a Unicode-like property name in a regular expression
pattern (using C<\p{}> or C<\P{}>) that Perl knows isn't an official
Unicode property, and was likely meant to be a user-defined property
name, but it can't be one of those, as they must begin with either C<In>
or C<Is>.  Check the spelling.  See also
L</Can't find Unicode property definition "%s">.

=item Ill-formed CRTL environ value "%s"

(W internal) A warning peculiar to VMS.  Perl tried to read the CRTL's
internal environ array, and encountered an element without the C<=>
delimiter used to separate keys from values.  The element is ignored.

=item Ill-formed message in prime_env_iter: |%s|

(W internal) A warning peculiar to VMS.  Perl tried to read a logical
name or CLI symbol definition when preparing to iterate over %ENV, and
didn't see the expected delimiter between key and value, so the line was
ignored.

=item (in cleanup) %s

(W misc) This prefix usually indicates that a DESTROY() method raised
the indicated exception.  Since destructors are usually called by the
system at arbitrary points during execution, and often a vast number of
times, the warning is issued only once for any number of failures that
would otherwise result in the same message being repeated.

Failure of user callbacks dispatched using the C<G_KEEPERR> flag could
also result in this warning.  See L<perlcall/G_KEEPERR>.
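
The following sketch shows one way the warning can arise.  The class
C<Noisy> is hypothetical, and the C<$SIG{__WARN__}> handler is there
only to capture the message for display.

    use strict;
    use warnings;

    package Noisy;    # hypothetical class, for illustration only
    sub new     { return bless {}, shift }
    sub DESTROY { die "cleanup failed\n" }

    package main;

    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    {
        my $obj = Noisy->new;
    }   # $obj goes out of scope here; DESTROY dies and Perl warns

    print "program continues\n";
    print "captured: $_" for @warnings;

The exception raised inside C<DESTROY> does not terminate the program;
it is reported as a warning carrying the C<(in cleanup)> prefix.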

=item Incomplete expression within '(?[ ])' in regex; marked by S<<-- HERE>
in m/%s/

(F) There was a syntax error within the C<(?[ ])>.  This can happen if the
expression inside the construct was completely empty, or if there are
too many or few operands for the number of operators.  Perl is not smart
enough to give you a more precise indication as to what is wrong.

=item Inconsistent hierarchy during C3 merge of class '%s': merging failed on 
parent '%s'

(F) The method resolution order (MRO) of the given class is not
C3-consistent, and you have enabled the C3 MRO for this class.  See the C3
documentation in L<mro> for more information.
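
A sketch of a consistent and an inconsistent hierarchy, using
hypothetical class names (C<Base>, C<Left>, C<Right>, C<Bottom> and
C<Broken>):

    use strict;
    use warnings;
    use mro;

    # A diamond: Bottom inherits from Left and Right, both from Base.
    package Base;   our @ISA = ();
    package Left;   our @ISA = ('Base');
    package Right;  our @ISA = ('Base');
    package Bottom; our @ISA = ('Left', 'Right'); use mro 'c3';
    package Broken; use mro 'c3';

    package main;

    # Prints "Bottom Left Right Base".
    print join(' ', @{ mro::get_linear_isa('Bottom') }), "\n";

    # Listing Base before its own subclass Left cannot be linearized.
    my $ok = eval {
        @Broken::ISA = ('Base', 'Left');
        mro::get_linear_isa('Broken');
        1;
    };
    print "C3 merge failed: $@" unless $ok;

Reordering the parents so that the subclass comes first, as in
C<('Left', 'Base')>, makes the hierarchy C3-consistent again.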

=item Indentation on line %d of here-doc doesn't match delimiter

(F) You have an indented here-document where one or more of its lines
have whitespace at the beginning that does not match the closing
delimiter.

For example, line 2 below is wrong because it is indented less than the
closing C<EOF> delimiter, but lines 1 and 3 are fine because they are
indented at least as far as the delimiter:

    if ($something) {
      print <<~EOF;
        Line 1
       Line 2 not
          Line 3
        EOF
    }

Note that tabs and spaces are compared strictly, meaning 1 tab will
not match 8 spaces.

=item Infinite recursion in regex

(F) You used a pattern that references itself without consuming any input
text.  You should check the pattern to ensure that recursive patterns
either consume text or fail.

=item Initialization of state variables in list context currently forbidden

(F) C<state> only permits initializing a single scalar variable, in scalar
context.  So C<state $a = 42> is allowed, but not C<state ($a) = 42>.  To apply
state semantics to a hash or array, store a hash or array reference in a
scalar variable.
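
A minimal sketch of that workaround; the subroutine name C<next_id>
is made up for the example:

    use strict;
    use warnings;
    use feature 'state';

    # Hypothetical counter that remembers every value it has handed out.
    sub next_id {
        state $issued = [];              # array reference in a state scalar
        push @$issued, @$issued + 1;
        return $issued->[-1];
    }

    print next_id(), "\n" for 1 .. 3;    # prints 1, 2 and 3 on separate lines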

=item %%s[%s] in scalar context better written as $%s[%s]

(W syntax) In scalar context, you've used an array index/value slice
(indicated by %) to select a single element of an array.  Generally
it's better to ask for a scalar value (indicated by $).  The difference
is that C<$foo[&bar]> always behaves like a scalar, both in the value it
returns and when evaluating its argument, while C<%foo[&bar]> provides
a list context to its subscript, which can do weird things if you're
expecting only one subscript.  When called in list context, it also
returns the index (what C<&bar> returns) in addition to the value.

=item %%s{%s} in scalar context better written as $%s{%s}

(W syntax) In scalar context, you've used a hash key/value slice
(indicated by %) to select a single element of a hash.  Generally it's
better to ask for a scalar value (indicated by $).  The difference
is that C<$foo{&bar}> always behaves like a scalar, both in the value
it returns and when evaluating its argument, while C<%foo{&bar}>
provides a list context to its subscript, which can do weird things
if you're expecting only one subscript.  When called in list context,
it also returns the key in addition to the value.
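
The difference between the two forms can be seen in list context.
This sketch assumes Perl 5.20 or later, where index/value and
key/value slices are available; the variable names are arbitrary.

    use v5.20;
    use warnings;

    my @foo = (10, 20, 30);
    my %bar = (a => 1, b => 2);

    my $elem  = $foo[1];      # 20
    my @pair  = %foo[1];      # (1, 20): index and value
    my $value = $bar{a};      # 1
    my @kv    = %bar{'a'};    # ('a', 1): key and value

    print "@pair\n";          # prints "1 20"
    print "@kv\n";            # prints "a 1"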

=item Insecure dependency in %s

(F) You tried to do something that the tainting mechanism didn't like.
The tainting mechanism is turned on when you're running setuid or
setgid, or when you specify B<-T> to turn it on explicitly.  The
tainting mechanism labels all data that's derived directly or indirectly
from the user, who is considered to be unworthy of your trust.  If any
such data is used in a "dangerous" operation, you get this error.  See
L<perlsec> for more information.

=item Insecure directory in %s

(F) You can't use system(), exec(), or a piped open in a setuid or
setgid script if C<$ENV{PATH}> contains a directory that is writable by
the world.  Also, the PATH must not contain any relative directory.
See L<perlsec>.

=item Insecure $ENV{%s} while running %s

(F) You can't use system(), exec(), or a piped open in a setuid or
setgid script if any of C<$ENV{PATH}>, C<$ENV{IFS}>, C<$ENV{CDPATH}>,
C<$ENV{ENV}>, C<$ENV{BASH_ENV}> or C<$ENV{TERM}> are derived from data
supplied (or potentially supplied) by the user.  The script must set
the path to a known value, using trustworthy data.  See L<perlsec>.
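
A sketch of the usual remedy, along the lines shown in L<perlsec>.
The directory list and the C<echo> command are assumptions for a
Unix-like system; substitute values appropriate to your platform.

    use strict;
    use warnings;

    # Set PATH to a fixed value and drop the shell-related variables.
    $ENV{PATH} = '/bin:/usr/bin';
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

    # Run a command only after the environment is known to be clean.
    system('echo', 'environment sanitized') == 0
        or die "command failed: $?";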

=item Insecure user-defined property %s

(F) Perl detected tainted data when trying to compile a regular
expression that contains a call to a user-defined character property
function, i.e. C<\p{IsFoo}> or C<\p{InFoo}>.
See L<perlunicode/User-Defined Character Properties> and L<perlsec>.

=item Integer overflow in format string for %s

(F) The indexes and widths specified in the format string of C<printf()>
or C<sprintf()> are too large.  The numbers must not overflow the size of
integers for your architecture.

=item Integer overflow in %s number

(S overflow) The hexadecimal, octal or binary number you have specified
either as a literal or as an argument to hex() or oct() is too big for
your architecture, and has been converted to a floating point number.
On a 32-bit architecture the largest hexadecimal, octal or binary number
representable without overflow is 0xFFFFFFFF, 037777777777, or
0b11111111111111111111111111111111 respectively.  Note that Perl
transparently promotes all numbers to a floating point representation
internally--subject to loss of precision errors in subsequent
operations.

=item Integer overflow in srand

(S overflow) The number you have passed to srand is too big to fit
in your architecture's integer representation.  The number has been
replaced with the largest integer supported (0xFFFFFFFF on 32-bit
architectures).  This means you may be getting less randomness than
you expect, because different random seeds above the maximum will
return the same sequence of random numbers.

=item Integer overflow in version

=item Integer overflow in version %d

(W overflow) Some portion of a version initialization is too large for
the size of integers for your architecture.  This is not a warning
because there is no rational reason for a version to try and use an
element larger than typically 2**32.  This is usually caused by trying
to use some odd mathematical operation as a version, like 100/9.

=item Internal disaster in regex; marked by S<<-- HERE> in m/%s/

(P) Something went badly wrong in the regular expression parser.
The S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.

=item Internal inconsistency in tracking vforks

(S) A warning peculiar to VMS.  Perl keeps track of the number of times
you've called C<fork> and C<exec>, to determine whether the current call
to C<exec> should affect the current script or a subprocess (see
L<perlvms/"exec LIST">).  Somehow, this count has become scrambled, so
Perl is making a guess and treating this C<exec> as a request to
terminate the Perl script and execute the specified command.

=item internal %<num>p might conflict with future printf extensions

(S internal) Perl's internal routine that handles C<printf> and C<sprintf>
formatting follows a slightly different set of rules when called from
C or XS code.  Specifically, formats consisting of digits followed
by "p" (e.g., "%7p") are reserved for future use.  If you see this
message, then an XS module tried to call that routine with one such
reserved format.

=item Internal urp in regex; marked by S<<-- HERE> in m/%s/

(P) Something went badly awry in the regular expression parser.  The
S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.

=item %s (...) interpreted as function

(W syntax) You've run afoul of the rule that says that any list operator
followed by parentheses turns into a function, with all the list
operator's arguments found inside the parentheses.  See
L<perlop/Terms and List Operators (Leftward)>.

=item In '(?...)', the '(' and '?' must be adjacent in regex;
marked by S<<-- HERE> in m/%s/

(F) The two-character sequence C<"(?"> in this context in a regular
expression pattern should be an indivisible token, with nothing
intervening between the C<"("> and the C<"?">, but you separated them
with whitespace.

=item Invalid %s attribute: %s

(F) The indicated attribute for a subroutine or variable was not recognized
by Perl or by a user-supplied handler.  See L<attributes>.

=item Invalid %s attributes: %s

(F) The indicated attributes for a subroutine or variable were not
recognized by Perl or by a user-supplied handler.  See L<attributes>.

=item Invalid character in charnames alias definition; marked by
S<<-- HERE> in '%s

(F) You tried to create a custom alias for a character name, with
the C<:alias> option to C<use charnames> and the specified character in
the indicated name isn't valid.  See L<charnames/CUSTOM ALIASES>.

=item Invalid \0 character in %s for %s: %s\0%s

(W syscalls) Embedded \0 characters in pathnames or other system call
arguments produce a warning as of 5.20.  The parts after the \0 were
formerly ignored by system calls.

=item Invalid character in \N{...}; marked by S<<-- HERE> in \N{%s}

(F) Only certain characters are valid for character names.  The
indicated one isn't.  See L<charnames/CUSTOM ALIASES>.

=item Invalid conversion in %s: "%s"

(W printf) Perl does not understand the given format conversion.  See
L<perlfunc/sprintf>.

=item Invalid escape in the specified encoding in regex; marked by
S<<-- HERE> in m/%s/

(W regexp)(F) The numeric escape (for example C<\xHH>) of value < 256
didn't correspond to a single character through the conversion
from the encoding specified by the encoding pragma.
The escape was replaced with REPLACEMENT CHARACTER (U+FFFD)
instead, except within S<C<(?[   ])>>, where it is a fatal error.
The S<<-- HERE> shows whereabouts in the regular expression the
escape was discovered.

=item Invalid hexadecimal number in \N{U+...}

=item Invalid hexadecimal number in \N{U+...} in regex; marked by
S<<-- HERE> in m/%s/

(F) The character constant represented by C<...> is not a valid hexadecimal
number.  Either it is empty, or you tried to use a character other than
0 - 9 or A - F, a - f in a hexadecimal number.

=item Invalid module name %s with -%c option: contains single ':'

(F) The module argument to perl's B<-m> and B<-M> command-line options
cannot contain single colons in the module name, but only in the
arguments after "=".  In other words, B<-MFoo::Bar=:baz> is ok, but
B<-MFoo:Bar=baz> is not.

=item Invalid mro name: '%s'

(F) You tried to C<mro::set_mro("classname", "foo")> or C<use mro 'foo'>,
where C<foo> is not a valid method resolution order (MRO).  Currently,
the only valid ones supported are C<dfs> and C<c3>, unless you have loaded
a module that is a MRO plugin.  See L<mro> and L<perlmroapi>.

=item Invalid negative number (%s) in chr

(W utf8) You passed a negative number to C<chr>.  Negative numbers are
not valid character numbers, so it returns the Unicode replacement
character (U+FFFD).

=item Invalid number '%s' for -C option.

(F) You supplied a number to the -C option that either has extra leading
zeroes or overflows perl's unsigned integer representation.

=item invalid option -D%c, use -D'' to see choices

(S debugging) Perl was called with invalid debugger flags.  Call perl
with the B<-D> option with no flags to see the list of acceptable values.
See also L<perlrun/-Dletters>.

=item Invalid quantifier in {,} in regex; marked by S<<-- HERE> in m/%s/

(F) The pattern looks like a {min,max} quantifier, but the min or max
could not be parsed as a valid number - either it has leading zeroes,
or it represents too big a number to cope with.  The S<<-- HERE> shows
where in the regular expression the problem was discovered.  See L<perlre>.

=item Invalid [] range "%s" in regex; marked by S<<-- HERE> in m/%s/

(F) The range specified in a character class had a minimum character
greater than the maximum character.  One possibility is that you forgot the
C<{}> from your ending C<\x{}> - C<\x> without the curly braces can go only
up to C<ff>.  The S<<-- HERE> shows whereabouts in the regular expression the
problem was discovered.  See L<perlre>.

=item Invalid range "%s" in transliteration operator

(F) The range specified in the tr/// or y/// operator had a minimum
character greater than the maximum character.  See L<perlop>.

=item Invalid separator character %s in attribute list

(F) Something other than a colon or whitespace was seen between the
elements of an attribute list.  If the previous attribute had a
parenthesised parameter list, perhaps that list was terminated too soon.
See L<attributes>.

=item Invalid separator character %s in PerlIO layer specification %s

(W layer) When pushing layers onto the Perl I/O system, something other
than a colon or whitespace was seen between the elements of a layer list.
If the previous attribute had a parenthesised parameter list, perhaps that
list was terminated too soon.

=item Invalid strict version format (%s)

(F) A version number did not meet the "strict" criteria for versions.
A "strict" version number is a positive decimal number (integer or
decimal-fraction) without exponentiation or else a dotted-decimal
v-string with a leading 'v' character and at least three components.
The parenthesized text indicates which criteria were not met.
See the L<version> module for more details on allowed version formats.

=item Invalid type '%s' in %s

(F) The given character is not a valid pack or unpack type.
See L<perlfunc/pack>.

(W) The given character is not a valid pack or unpack type but used to be
silently ignored.

=item Invalid version format (%s)

(F) A version number did not meet the "lax" criteria for versions.
A "lax" version number is a positive decimal number (integer or
decimal-fraction) without exponentiation or else a dotted-decimal
v-string.  If the v-string has fewer than three components, it
must have a leading 'v' character.  Otherwise, the leading 'v' is
optional.  Both decimal and dotted-decimal versions may have a
trailing "alpha" component separated by an underscore character
after a fractional or dotted-decimal component.  The parenthesized
text indicates which criteria were not met.  See the L<version> module
for more details on allowed version formats.
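
The L<version> module provides C<is_lax> and C<is_strict> for testing
a string against these rules before use.  The candidate strings in
this sketch are arbitrary examples.

    use strict;
    use warnings;
    use version;

    for my $candidate ('1.2', 'v1.2', 'v1.2.3', '1.2.3') {
        printf "%-7s lax=%s strict=%s\n", $candidate,
            (version::is_lax($candidate)    ? 'yes' : 'no'),
            (version::is_strict($candidate) ? 'yes' : 'no');
    }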

=item Invalid version object

(F) The internal structure of the version object was invalid.
Perhaps the internals were modified directly in some way or
an arbitrary reference was blessed into the "version" class.

=item In '(*VERB...)', the '(' and '*' must be adjacent in regex;
marked by S<<-- HERE> in m/%s/

(F) The two-character sequence C<"(*"> in
this context in a regular expression pattern should be an
indivisible token, with nothing intervening between the C<"(">
and the C<"*">, but you separated them.

=item ioctl is not implemented

(F) Your machine apparently doesn't implement ioctl(), which is pretty
strange for a machine that supports C.

=item ioctl() on unopened %s

(W unopened) You tried ioctl() on a filehandle that was never opened.
Check your control flow and number of arguments.

=item IO layers (like '%s') unavailable

(F) Your Perl has not been configured to have PerlIO, and therefore
you cannot use IO layers.  To have PerlIO, Perl must be configured
with 'useperlio'.

=item IO::Socket::atmark not implemented on this architecture

(F) Your machine doesn't implement the sockatmark() functionality,
neither as a system call nor an ioctl call (SIOCATMARK).

=item '%s' is an unknown bound type in regex; marked by S<<-- HERE> in m/%s/

(F) You used C<\b{...}> or C<\B{...}> and the C<...> is not known to
Perl.  The current valid ones are given in
L<perlrebackslash/\b{}, \b, \B{}, \B>.

=item %s() is deprecated on :utf8 handles. This will be a fatal error in Perl 5.30

(D deprecated) The sysread(), recv(), syswrite() and send() operators are
deprecated on handles that have the C<:utf8> layer, either explicitly, or
implicitly, e.g., with the C<:encoding(UTF-16LE)> layer.

Both sysread() and recv() currently use only the C<:utf8> flag for the stream,
ignoring the actual layers.  Since sysread() and recv() do no UTF-8
validation they can end up creating invalidly encoded scalars.

Similarly, syswrite() and send() use only the C<:utf8> flag, otherwise ignoring
any layers.  If the flag is set, both write the value UTF-8 encoded, even if
the layer is some different encoding, such as the example above.

Ideally, all of these operators would completely ignore the C<:utf8> state,
working only with bytes, but this would result in silently breaking existing
code.

In Perl 5.30, it will no longer be possible to use sysread(), recv(),
syswrite() or send() to read or send bytes from/to :utf8 handles.

=item "%s" is more clearly written simply as "%s" in regex; marked by S<<-- HERE> in m/%s/

(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)

You specified a character that has the given plainer way of writing it,
and which is also portable to platforms running with different character
sets.

=item $* is no longer supported. Its use will be fatal in Perl 5.30

(D deprecated, syntax) The special variable C<$*>, deprecated in older
perls, has been removed as of 5.10.0 and is no longer supported.  In
previous versions of perl the use of C<$*> enabled or disabled multi-line
matching within a string.

Instead of using C<$*> you should use the C</m> (and maybe C</s>) regexp
modifiers.  You can enable C</m> for a lexical scope (even a whole file)
with C<use re '/m'>.  (In older versions: when C<$*> was set to a true value
then all regular expressions behaved as if they were written using C</m>.)
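
For illustration, with arbitrary sample text, C</m> lets C<^> match
at the start of every line rather than only at the start of the string:

    use strict;
    use warnings;

    my $text = "alpha\nbeta\n";

    my @with_m    = $text =~ /^(\w)/mg;   # ('a', 'b')
    my @without_m = $text =~ /^(\w)/g;    # ('a')

    print "@with_m\n";       # prints "a b"
    print "@without_m\n";    # prints "a"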

Use of this variable will be a fatal error in Perl 5.30.

=item $# is no longer supported. Its use will be fatal in Perl 5.30

(D deprecated, syntax) The special variable C<$#>, deprecated in older
perls, has been removed as of 5.10.0 and is no longer supported.  You
should use the printf/sprintf functions instead.

Use of this variable will be a fatal error in Perl 5.30.

=item '%s' is not a code reference

(W overload) The second (fourth, sixth, ...) argument of
overload::constant needs to be a code reference: either
an anonymous subroutine or a reference to a named subroutine.

=item '%s' is not an overloadable type

(W overload) You tried to overload a constant type the overload package is
unaware of.

=item -i used with no filenames on the command line, reading from STDIN

(S inplace) The C<-i> option was passed on the command line, indicating
that the script is intended to edit files in place, but no files were
given.  This is usually a mistake, since editing STDIN in place doesn't
make sense, and can be confusing because it can make perl look like
it is hanging when it is really just trying to read from STDIN.  You
should either pass a filename to edit, or remove C<-i> from the command
line.  See L<perlrun> for more details.

=item Junk on end of regexp in regex m/%s/

(P) The regular expression parser is confused.

=item Label not found for "last %s"

(F) You named a loop to break out of, but you're not currently in a loop
of that name, not even if you count where you were called from.  See
L<perlfunc/last>.

=item Label not found for "next %s"

(F) You named a loop to continue, but you're not currently in a loop of
that name, not even if you count where you were called from.  See
L<perlfunc/next>.

=item Label not found for "redo %s"

(F) You named a loop to restart, but you're not currently in a loop of
that name, not even if you count where you were called from.  See
L<perlfunc/redo>.

=item leaving effective %s failed

(F) While under the C<use filetest> pragma, switching the real and
effective uids or gids failed.

=item length/code after end of string in unpack

(F) While unpacking, the string buffer was already used up when an unpack
length/code combination tried to obtain more data.  This results in
an undefined value for the length.  See L<perlfunc/pack>.

=item length() used on %s (did you mean "scalar(%s)"?)

(W syntax) You used length() on either an array or a hash when you
probably wanted a count of the items.

Array size can be obtained by doing:

    scalar(@array);

The number of items in a hash can be obtained by doing:

    scalar(keys %hash);

=item Lexing code attempted to stuff non-Latin-1 character into Latin-1 input

(F) An extension is attempting to insert text into the current parse
(using L<lex_stuff_pvn|perlapi/lex_stuff_pvn> or similar), but tried to insert a character that
couldn't be part of the current input.  This is an inherent pitfall
of the stuffing mechanism, and one of the reasons to avoid it.  Where
it is necessary to stuff, stuffing only plain ASCII is recommended.

=item Lexing code internal error (%s)

(F) Lexing code supplied by an extension violated the lexer's API in a
detectable way.

=item listen() on closed socket %s

(W closed) You tried to do a listen on a closed socket.  Did you forget
to check the return value of your socket() call?  See
L<perlfunc/listen>.

=item List form of piped open not implemented

(F) On some platforms, notably Windows, the three-or-more-arguments
form of C<open> does not support pipes, such as C<open($pipe, '|-', @args)>.
Use the two-argument C<open($pipe, '|prog arg1 arg2...')> form instead.

=item %s: loadable library and perl binaries are mismatched (got handshake key %p, needed %p)

(P) A dynamically loaded library (C<.so> or C<.dll>) was being loaded
into a process running a different build of perl than the one the
library was compiled against.  Reinstalling the XS module will
likely fix this error.

=item Locale '%s' may not work well.%s

(W locale) You are using the named locale, which is a non-UTF-8 one, and
which perl has determined is not fully compatible with what it can
handle.  The second C<%s> gives a reason.

By far the most common reason is that the locale has characters in it
that are represented by more than one byte.  The only such locales that
Perl can handle are the UTF-8 locales.  Most likely the specified locale
is a non-UTF-8 one for an East Asian language such as Chinese or
Japanese.  If the locale is a superset of ASCII, the ASCII portion of it
may work in Perl.

Some essentially obsolete locales that aren't supersets of ASCII, mainly
those in ISO 646 or other 7-bit locales, such as ASMO 449, can also have
problems, depending on what portions of the ASCII character set get
changed by the locale and are also used by the program.
The warning message lists the determinable conflicting characters.

Note that not all incompatibilities are found.

If this happens to you, there's not much you can do except switch to use a
different locale or use L<Encode> to translate from the locale into
UTF-8; if that's impracticable, you have been warned that some things
may break.

This message is output once each time a bad locale is switched into
within the scope of C<S<use locale>>, or on the first possibly-affected
operation if the C<S<use locale>> inherits a bad one.  It is not raised
for any operations from the L<POSIX> module.

=item localtime(%f) failed

(W overflow) You called C<localtime> with a number that it could not handle:
too large, too small, or NaN.  The returned value is C<undef>.

=item localtime(%f) too large

(W overflow) You called C<localtime> with a number that was larger
than it can reliably handle and C<localtime> probably returned the
wrong date.  This warning is also triggered with NaN (the special
not-a-number value).

=item localtime(%f) too small

(W overflow) You called C<localtime> with a number that was smaller
than it can reliably handle and C<localtime> probably returned the
wrong date.

=item Lookbehind longer than %d not implemented in regex m/%s/

(F) There is currently a limit on the length of string which lookbehind can
handle.  This restriction may be eased in a future release. 

=item Lost precision when %s %f by 1

(W imprecision) The value you attempted to increment or decrement by one
is too large for the underlying floating point representation to store
accurately, hence the target of C<++> or C<--> is unchanged.  Perl issues this
warning because it has already switched from integers to floating point
when values are too large for integers, and now even floating point is
insufficient.  You may wish to switch to using L<Math::BigInt> explicitly.
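
A sketch of the problem and of the L<Math::BigInt> alternative.  The
value C<1e300> is an arbitrary example, and the C<$SIG{__WARN__}>
handler is there only to capture the message for display.

    use strict;
    use warnings;
    use Math::BigInt;

    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    my $big = 1e300;
    $big++;                    # too large for ++ to change the stored value
    print "unchanged\n" if $big == 1e300;
    print "warning: $_" for @warnings;

    my $exact = Math::BigInt->new('1' . '0' x 300);
    $exact->binc;              # arbitrary precision keeps the increment
    print "last digit: ", substr($exact->bstr, -1), "\n";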

=item lstat() on filehandle%s

(W io) You tried to do an lstat on a filehandle.  What did you mean
by that?  lstat() makes sense only on filenames.  (Perl did a fstat()
instead on the filehandle.)

=item lvalue attribute %s already-defined subroutine

(W misc) Although L<attributes.pm|attributes> allows this, turning the lvalue
attribute on or off on a Perl subroutine that is already defined
does not always work properly.  It may or may not do what you
want, depending on what code is inside the subroutine, with exact
details subject to change between Perl versions.  Only do this
if you really know what you are doing.

=item lvalue attribute ignored after the subroutine has been defined

(W misc) Using the C<:lvalue> declarative syntax to make a Perl
subroutine an lvalue subroutine after it has been defined is
not permitted.  To make the subroutine an lvalue subroutine,
add the lvalue attribute to the definition, or put the C<sub
foo :lvalue;> declaration before the definition.

See also L<attributes.pm|attributes>.

=item Magical list constants are not supported

(F) You assigned a magical array to a stash element, and then tried
to use the subroutine from the same slot.  You are asking Perl to do
something it cannot do, details subject to change between Perl versions.

=item Malformed integer in [] in pack

(F) Between the brackets enclosing a numeric repeat count only digits
are permitted.  See L<perlfunc/pack>.

=item Malformed integer in [] in unpack

(F) Between the brackets enclosing a numeric repeat count only digits
are permitted.  See L<perlfunc/pack>.

=item Malformed PERLLIB_PREFIX

(F) An error peculiar to OS/2.  PERLLIB_PREFIX should be of the form

    prefix1;prefix2

or

    prefix1 prefix2

with nonempty prefix1 and prefix2.  If C<prefix1> is indeed a prefix of
a builtin library search path, prefix2 is substituted.  The error may
appear if components are not found, or are too long.  See
"PERLLIB_PREFIX" in L<perlos2>.

=item Malformed prototype for %s: %s

(F) You tried to use a function with a malformed prototype.  The
syntax of function prototypes is given a brief compile-time check for
obvious errors like invalid characters.  A more rigorous check is run
when the function is called.
Perhaps the function's author was trying to write a subroutine signature
but didn't enable that feature first (C<use feature 'signatures'>),
so the signature was instead interpreted as a bad prototype.

=item Malformed UTF-8 character%s

(S utf8)(F) Perl detected a string that should be UTF-8, but didn't
comply with UTF-8 encoding rules, or represents a code point whose
ordinal integer value doesn't fit into the word size of the current
platform (overflows).  Details as to the exact malformation are given in
the variable, C<%s>, part of the message.

One possible cause is that you set the UTF8 flag yourself for data that
you thought to be in UTF-8 but it wasn't (it was for example legacy
8-bit data).  To guard against this, you can use C<Encode::decode('UTF-8', ...)>.

If you use the C<:encoding(UTF-8)> PerlIO layer for input, invalid byte
sequences are handled gracefully, but if you use C<:utf8>, the flag is
set without validating the data, possibly resulting in this error
message.

See also L<Encode/"Handling Malformed Data">.
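
As a sketch of the C<Encode::decode> guard mentioned above (the sample
bytes are made up for illustration), passing C<FB_CROAK> asks the
decoder to die on invalid input rather than let bad data through:

    use strict;
    use warnings;
    use Encode qw(decode FB_CROAK);

    # Latin-1 bytes: "\xE9" followed by a space is not valid UTF-8.
    my $bytes = "caf\xE9 au lait";
    my $text  = eval { decode('UTF-8', $bytes, FB_CROAK) };

    print defined $text ? "decoded ok\n" : "rejected: $@";

Wrapping the call in C<eval> lets the program decide how to recover,
for example by retrying with a legacy encoding.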

=item Malformed UTF-8 returned by \N{%s} immediately after '%s'

(F) The charnames handler returned malformed UTF-8.

=item Malformed UTF-8 string in '%c' format in unpack

(F) You tried to unpack something that didn't comply with UTF-8 encoding
rules and perl was unable to guess how to make more progress.

=item Malformed UTF-8 string in pack

(F) You tried to pack something that didn't comply with UTF-8 encoding
rules and perl was unable to guess how to make more progress.

=item Malformed UTF-8 string in unpack

(F) You tried to unpack something that didn't comply with UTF-8 encoding
rules and perl was unable to guess how to make more progress.

=item Malformed UTF-8 string in "%s"

(F) This message indicates a bug either in the Perl core or in XS
code. Such code was trying to find out if a character, allegedly
stored internally encoded as UTF-8, was of a given type, such as
being punctuation or a digit.  But the character was not encoded
in legal UTF-8.  The C<%s> is replaced by a string that can be used
by knowledgeable people to determine what the type being checked
against was.

Passing malformed strings was deprecated in Perl 5.18, and
became fatal in Perl 5.26.

=item Malformed UTF-16 surrogate

(F) Perl thought it was reading UTF-16 encoded character data but while
doing it Perl met a malformed Unicode surrogate.

=item Mandatory parameter follows optional parameter

(F) In a subroutine signature, you wrote something like "$a = undef,
$b", making an earlier parameter optional and a later one mandatory.
Parameters are filled from left to right, so it's impossible for the
caller to omit an earlier one and pass a later one.  If you want to act
as if the parameters are filled from right to left, declare the rightmost
optional and then shuffle the parameters around in the subroutine's body.
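
The shuffle described above might look like the following sketch.  It
assumes Perl 5.20 or later for signatures, and the subroutine name
C<range> is hypothetical:

    use strict;
    use warnings;
    use feature 'signatures';
    no warnings 'experimental::signatures';

    # Callers may write range(10) or range(2, 10).  The optional
    # parameter is declared last and the values are swapped inside.
    sub range ($first, $second = undef) {
        my ($low, $high) = defined $second ? ($first, $second)
                                           : (0, $first);
        return ($low .. $high);
    }

    print join(' ', range(3)), "\n";
    print join(' ', range(2, 4)), "\n";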

=item Matched non-Unicode code point 0x%X against Unicode property; may
not be portable

(S non_unicode) Perl allows strings to contain a superset of
Unicode code points; each code point may be as large as what is storable
in a signed integer on your system, but these may not be accepted by
other languages/systems.  This message occurs when you matched a string
containing such a code point against a regular expression pattern, and
the code point was matched against a Unicode property, C<\p{...}> or
C<\P{...}>.  Unicode properties are only defined on Unicode code points,
so the result of this match is undefined by Unicode, but Perl (starting
in v5.20) treats non-Unicode code points as if they were typical
unassigned Unicode ones, and matched this one accordingly.  Whether a
given property matches these code points or not is specified in
L<perluniprops/Properties accessible through \p{} and \P{}>.

This message is suppressed (unless it has been made fatal) if it is
immaterial to the results of the match if the code point is Unicode or
not.  For example, the property C<\p{ASCII_Hex_Digit}> can match only
the 22 characters C<[0-9A-Fa-f]>, so obviously all other code points,
Unicode or not, won't match it.  (And C<\P{ASCII_Hex_Digit}> will match
every code point except these 22.)

Getting this message indicates that the outcome of the match arguably
should have been the opposite of what actually happened.  If you think
that is the case, you may wish to make the C<non_unicode> warnings
category fatal; if you agree with Perl's decision, you may wish to turn
off this category.

See L<perlunicode/Beyond Unicode code points> for more information.

=item %s matches null string many times in regex; marked by S<<-- HERE> in
m/%s/

(W regexp) The pattern you've specified would be an infinite loop if the
regular expression engine didn't specifically check for that.  The S<<-- HERE>
shows whereabouts in the regular expression the problem was discovered.
See L<perlre>.

=item Maximal count of pending signals (%u) exceeded

(F) Perl aborted due to too high a number of signals pending.  This
usually indicates that your operating system tried to deliver signals
too fast (with a very high priority), starving the perl process from
resources it would need to reach a point where it can process signals
safely.  (See L<perlipc/"Deferred Signals (Safe Signals)">.)

=item "%s" may clash with future reserved word

(W) This warning may be due to running a perl5 script through a perl4
interpreter, especially if the word that is being warned about is
"use" or "my".

=item '%' may not be used in pack

(F) You can't pack a string by supplying a checksum, because the
checksumming process loses information, and you can't go the other way.
See L<perlfunc/unpack>.

=item Method for operation %s not found in package %s during blessing

(F) An attempt was made to specify an entry in an overloading table that
doesn't resolve to a valid subroutine.  See L<overload>.

=item Method %s not permitted

See L</500 Server error>.

=item Might be a runaway multi-line %s string starting on line %d

(S) An advisory indicating that the previous error may have been caused
by a missing delimiter on a string or pattern, because it eventually
ended earlier on the current line.

=item Misplaced _ in number

(W syntax) An underscore (underbar) in a numeric constant did not
separate two digits.

=item Missing argument in %s

(W missing) You called a function with fewer arguments than other
arguments you supplied indicated would be needed.

Currently only emitted when a printf-type format required more
arguments than were supplied, but might be used in the future for
other cases where we can statically determine that arguments to
functions are missing, e.g. for the L<perlfunc/pack> function.
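
A small sketch of the printf-type case, with a C<__WARN__> handler
installed only so the warning can be shown next to the result (the
variable names are illustrative):

    use strict;
    use warnings;

    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    my $name = 'cart';
    # Two conversions but only one argument: %d has nothing to format.
    my $line = sprintf('%s has %d items', $name);

    print "$line\n";
    print "warned: $_" for @warnings;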

=item Missing argument to -%c

(F) The argument to the indicated command line switch must follow
immediately after the switch, without intervening spaces.

=item Missing braces on \N{}

=item Missing braces on \N{} in regex; marked by S<<-- HERE> in m/%s/

(F) Wrong syntax of character name literal C<\N{charname}> within
double-quotish context.  This can also happen when there is a space
(or comment) between the C<\N> and the C<{> in a regex with the C</x> modifier.
This modifier does not change the requirement that the brace immediately
follow the C<\N>.

=item Missing braces on \o{}

(F) A C<\o> must be followed immediately by a C<{> in double-quotish context.

=item Missing comma after first argument to %s function

(F) While certain functions allow you to specify a filehandle or an
"indirect object" before the argument list, this ain't one of them.

=item Missing command in piped open

(W pipe) You used the C<open(FH, "| command")> or
C<open(FH, "command |")> construction, but the command was missing or
blank.

=item Missing control char name in \c

(F) A double-quoted string ended with "\c", without the required control
character name.

=item Missing ']' in prototype for %s : %s

(W illegalproto) A grouping was started with C<[> but never closed with C<]>.

=item Missing name in "%s sub"

(F) The syntax for lexically scoped subroutines requires that
they have a name with which they can be found.

=item Missing $ on loop variable

(F) Apparently you've been programming in B<csh> too much.  Variables
are always mentioned with the $ in Perl, unlike in the shells, where it
can vary from one line to the next.

=item (Missing operator before %s?)

(S syntax) This is an educated guess made in conjunction with the message
"%s found where operator expected".  Often the missing operator is a comma.

=item Missing or undefined argument to %s

(F) You tried to call require or do with no argument or with an undefined
value as an argument.  Require expects either a package name or a
file-specification as an argument; do expects a filename.  See
L<perlfunc/require EXPR> and L<perlfunc/do EXPR>.

=item Missing right brace on \%c{} in regex; marked by S<<-- HERE> in m/%s/

(F) Missing right brace in C<\x{...}>, C<\p{...}>, C<\P{...}>, or C<\N{...}>.

=item Missing right brace on \N{}

=item Missing right brace on \N{} or unescaped left brace after \N

(F) C<\N> has two meanings.

The traditional one has it followed by a name enclosed in braces,
meaning the character (or sequence of characters) given by that
name.  Thus C<\N{ASTERISK}> is another way of writing C<*>, valid in both
double-quoted strings and regular expression patterns.  In patterns,
it doesn't have the meaning an unescaped C<*> does.

Starting in Perl 5.12.0, C<\N> also can have an additional meaning (only)
in patterns, namely to match a non-newline character.  (This is short
for C<[^\n]>, and like C<.> but is not affected by the C</s> regex modifier.)

This can lead to some ambiguities.  When C<\N> is not followed immediately
by a left brace, Perl assumes the C<[^\n]> meaning.  Also, if the braces
form a valid quantifier such as C<\N{3}> or C<\N{5,}>, Perl assumes that this
means to match the given quantity of non-newlines (in these examples,
3; and 5 or more, respectively).  In all other cases, where there is a
C<\N{> and a matching C<}>, Perl assumes that a character name is desired.

However, if there is no matching C<}>, Perl doesn't know if it was
mistakenly omitted, or if C<[^\n]{> was desired, and raises this error.
If you meant the former, add the right brace; if you meant the latter,
escape the brace with a backslash, like so: C<\N\{>.

=item Missing right curly or square bracket

(F) The lexer counted more opening curly or square brackets than closing
ones.  As a general rule, you'll find it's missing near the place you
were last editing.

=item (Missing semicolon on previous line?)

(S syntax) This is an educated guess made in conjunction with the message
"%s found where operator expected".  Don't automatically put a semicolon on
the previous line just because you saw this message.

=item Modification of a read-only value attempted

(F) You tried, directly or indirectly, to change the value of a
constant.  You didn't, of course, try "2 = 1", because the compiler
catches that.  But an easy way to do the same thing is:

    sub mod { $_[0] = 1 }
    mod(2);

Another way is to assign to a substr() that's off the end of the string.

Yet another way is to assign to a C<foreach> loop I<VAR> when I<VAR>
is aliased to a constant in the loop I<LIST>:

    $x = 1;
    foreach my $n ($x, 2) {
        $n *= 2; # modifies the $x, but fails on attempt to
    }            # modify the 2

=item Modification of non-creatable array value attempted, %s

(F) You tried to make an array value spring into existence, and the
subscript was probably negative, even counting from end of the array
backwards.
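
For instance, the following sketch (the array name is illustrative)
triggers the error and traps it with C<eval>:

    use strict;
    use warnings;

    my @list = (1, 2, 3);

    # -5 reaches past the start of a three-element array, so there is
    # no element to create; the assignment dies and eval traps it.
    my $ok = eval { $list[-5] = 0; 1 };
    print "failed: $@" unless $ok;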

=item Modification of non-creatable hash value attempted, %s

(P) You tried to make a hash value spring into existence, and it
couldn't be created for some peculiar reason.

=item Module name must be constant

(F) Only a bare module name is allowed as the first argument to a "use".

=item Module name required with -%c option

(F) The C<-M> or C<-m> options say that Perl should load some module, but
you omitted the name of the module.  Consult L<perlrun> for full details
about C<-M> and C<-m>.

=item More than one argument to '%s' open

(F) The C<open> function has been asked to open multiple files.  This
can happen if you are trying to open a pipe to a command that takes a
list of arguments, but have forgotten to specify a piped open mode.
See L<perlfunc/open> for details.

=item mprotect for COW string %p %u failed with %d

(S) You compiled perl with B<-D>PERL_DEBUG_READONLY_COW (see
L<perlguts/"Copy on Write">), but a shared string buffer
could not be made read-only.

=item mprotect for %p %u failed with %d

(S) You compiled perl with B<-D>PERL_DEBUG_READONLY_OPS (see L<perlhacktips>),
but an op tree could not be made read-only.

=item mprotect RW for COW string %p %u failed with %d

(S) You compiled perl with B<-D>PERL_DEBUG_READONLY_COW (see
L<perlguts/"Copy on Write">), but a read-only shared string
buffer could not be made mutable.

=item mprotect RW for %p %u failed with %d

(S) You compiled perl with B<-D>PERL_DEBUG_READONLY_OPS (see
L<perlhacktips>), but a read-only op tree could not be made
mutable before freeing the ops.

=item msg%s not implemented

(F) You don't have System V message IPC on your system.

=item Multidimensional syntax %s not supported

(W syntax) Multidimensional arrays aren't written like C<$foo[1,2,3]>.
They're written like C<$foo[1][2][3]>, as in C.
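
A short sketch of the chained-subscript form (the array name is
illustrative):

    use strict;
    use warnings;

    my @grid;
    $grid[1][2][3] = 'x';      # autovivifies the nested arrays
    print $grid[1][2][3], "\n";

    # The innermost array now has indexes 0..3.
    print scalar(@{ $grid[1][2] }), "\n";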

=item Multiple slurpy parameters not allowed

(F) In subroutine signatures, a slurpy parameter (C<@> or C<%>) must be
the last parameter, and there must not be more than one of them; for
example:

    sub foo ($a, @b)    {} # legal
    sub foo ($a, @b, %) {} # invalid

=item '/' must follow a numeric type in unpack

(F) You had an unpack template that contained a '/', but this did not
follow some unpack specification producing a numeric value.
See L<perlfunc/pack>.

=item %s must not be a named sequence in transliteration operator

(F) Transliteration (C<tr///> and C<y///>) transliterates individual
characters.  But a named sequence by definition is more than an
individual character, and hence doing this operation on it doesn't make
sense.

=item "my sub" not yet implemented

(F) Lexically scoped subroutines are not yet implemented.  Don't try
that yet.

=item "my" subroutine %s can't be in a package

(F) Lexically scoped subroutines aren't in a package, so it doesn't make
sense to try to declare one with a package qualifier on the front.

=item "my %s" used in sort comparison

(W syntax) The package variables $a and $b are used for sort comparisons.
You used $a or $b as an operand to the C<< <=> >> or C<cmp> operator inside a
sort comparison block, and the variable had earlier been declared as a
lexical variable.  Either qualify the sort variable with the package
name, or rename the lexical variable.
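
A sketch of the first remedy, assuming the code runs in package
C<main> (so C<$::a> names the sort variable); the lexical C<$a> is
there only to set up the clash:

    use strict;
    use warnings;

    my $a = 10;    # lexical that hides the package variable $a

    # Qualify the sort variables so they refer to $main::a and
    # $main::b rather than to the lexical above.
    my @sorted = sort { $::a <=> $::b } (3, 1, 2);
    print "@sorted\n";

Renaming the lexical is usually the cleaner fix when you control the
surrounding code.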

=item "my" variable %s can't be in a package

(F) Lexically scoped variables aren't in a package, so it doesn't make
sense to try to declare one with a package qualifier on the front.  Use
local() if you want to localize a package variable.

=item Name "%s::%s" used only once: possible typo

(W once) Typographical errors often show up as unique variable
names.  If you had a good reason for having a unique name, then
just mention it again somehow to suppress the message.  The C<our>
declaration is also provided for this purpose.

NOTE: This warning detects package symbols that have been used
only once.  This means lexical variables will never trigger this
warning.  It also means that all of the package variables $c, @c,
%c, as well as *c, &c, sub c{}, c(), and c (the filehandle or
format) are considered the same; if a program uses $c only once
but also uses any of the others it will not trigger this warning.
Symbols beginning with an underscore and symbols using special
identifiers (q.v. L<perldata>) are exempt from this warning.

=item Need exactly 3 octal digits in regex; marked by S<<-- HERE> in m/%s/

(F) Within S<C<(?[   ])>>, all constants interpreted as octal need to be
exactly 3 digits long.  This helps catch some ambiguities.  If your
constant is too short, add leading zeros, like

 (?[ [ \078 ] ])     # Syntax error!
 (?[ [ \0078 ] ])    # Works
 (?[ [ \007 8 ] ])   # Clearer

The maximum number this construct can express is C<\777>.  If you
need a larger one, you need to use L<\o{}|perlrebackslash/Octal escapes> instead.  If you meant
two separate things, you need to separate them:

 (?[ [ \7776 ] ])        # Syntax error!
 (?[ [ \o{7776} ] ])     # One meaning
 (?[ [ \777 6 ] ])       # Another meaning
 (?[ [ \777 \006 ] ])    # Still another

=item Negative '/' count in unpack

(F) The length count obtained from a length/code unpack operation was
negative.  See L<perlfunc/pack>.

=item Negative length

(F) You tried to do a read/write/send/recv operation with a buffer
length that is less than 0.  This is difficult to imagine.

=item Negative offset to vec in lvalue context

(F) When C<vec> is called in an lvalue context, the second argument must be
greater than or equal to zero.

=item Negative repeat count does nothing

(W numeric) You tried to execute the
L<C<x>|perlop/Multiplicative Operators> repetition operator fewer than 0
times, which doesn't make sense.
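
One way to avoid the warning is to clamp the count at zero before
repeating.  A sketch, with a hypothetical C<pad_right> helper:

    use strict;
    use warnings;
    use List::Util qw(max);

    # Pad $text with spaces up to $width; never ask x for a
    # negative number of repetitions.
    sub pad_right {
        my ($text, $width) = @_;
        my $fill = max(0, $width - length $text);
        return $text . (' ' x $fill);
    }

    print '[', pad_right('ab', 4), "]\n";
    print '[', pad_right('abcdef', 4), "]\n";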

=item Nested quantifiers in regex; marked by S<<-- HERE> in m/%s/

(F) You can't quantify a quantifier without intervening parentheses.
So things like ** or +* or ?* are illegal.  The S<<-- HERE> shows
whereabouts in the regular expression the problem was discovered.

Note that the minimal matching quantifiers, C<*?>, C<+?>, and
C<??> appear to be nested quantifiers, but aren't.  See L<perlre>.

=item %s never introduced

(S internal) The symbol in question was declared but somehow went out of
scope before it could possibly have been used.

=item next::method/next::can/maybe::next::method cannot find enclosing method

(F) C<next::method> needs to be called within the context of a
real method in a real package, and it could not find such a context.
See L<mro>.

=item \N in a character class must be a named character: \N{...} in regex; 
marked by S<<-- HERE> in m/%s/

(F) The new (as of Perl 5.12) meaning of C<\N> as C<[^\n]> is not valid in a
bracketed character class, for the same reason that C<.> in a character
class loses its specialness: it matches almost everything, which is
probably not what you want.

=item \N{} in inverted character class or as a range end-point is restricted to one character in regex; marked by S<<-- HERE> in m/%s/

(F) Named Unicode character escapes (C<\N{...}>) may return a
multi-character sequence.  Even though a character class is
supposed to match just one character of input, perl will match the
whole thing correctly, except when the class is inverted (C<[^...]>),
or the escape is the beginning or final end point of a range.  The
mathematically logical behavior for what matches when inverting
is very different from what people expect, so we have decided to
forbid it.  Similarly unclear is what should be generated when the
C<\N{...}> is used as one of the end points of the range, such as in

 [\x{41}-\N{ARABIC SEQUENCE YEH WITH HAMZA ABOVE WITH AE}]

What is meant here is unclear, as the C<\N{...}> escape is a sequence
of code points, so this is made an error.

=item \N{NAME} must be resolved by the lexer in regex; marked by
S<<-- HERE> in m/%s/

(F) When compiling a regex pattern, an unresolved named character or
sequence was encountered.  This can happen in any of several ways that
bypass the lexer, such as using single-quotish context, or an extra
backslash in double-quotish:

    $re = '\N{SPACE}';	# Wrong!
    $re = "\\N{SPACE}";	# Wrong!
    /$re/;

Instead, use double-quotes with a single backslash:

    $re = "\N{SPACE}";	# ok
    /$re/;

The lexer can be bypassed as well by creating the pattern from smaller
components:

    $re = '\N';
    /${re}{SPACE}/;	# Wrong!

It's not a good idea to split a construct in the middle like this, and
it doesn't work here.  Instead use the solution above.

Finally, the message also can happen under the C</x> regex modifier when the
C<\N> is separated by spaces from the C<{>, in which case, remove the spaces.

    /\N {SPACE}/x;	# Wrong!
    /\N{SPACE}/x;	# ok

=item No %s allowed while running setuid

(F) Certain operations are deemed to be too insecure for a setuid or
setgid script to even be allowed to attempt.  Generally speaking there
will be another way to do what you want that is, if not secure, at least
securable.  See L<perlsec>.

=item No code specified for -%c

(F) Perl's B<-e> and B<-E> command-line options require an argument.  If
you want to run an empty program, pass the empty string as a separate
argument or run a program consisting of a single 0 or 1:

    perl -e ""
    perl -e0
    perl -e1

=item No comma allowed after %s

(F) A list operator that has a filehandle or "indirect object" is
not allowed to have a comma between that and the following arguments.
Otherwise it'd be just another one of the arguments.

One possible cause for this is that you expected to have imported
a constant to your name space with B<use> or B<import> while no such
importing took place, it may for example be that your operating
system does not support that particular constant.  Hopefully you did
use an explicit import list for the constants you expect to see;
please see L<perlfunc/use> and L<perlfunc/import>.  While an
explicit import list would probably have caught this error earlier
it naturally does not remedy the fact that your operating system
still does not support that constant.  Maybe you have a typo in
the constants of the symbol import list of B<use> or B<import> or in the
constant name at the line where this error was triggered?

=item No command into which to pipe on command line

(F) An error peculiar to VMS.  Perl handles its own command line
redirection, and found a '|' at the end of the command line, so it
doesn't know where you want to pipe the output from this command.

=item No DB::DB routine defined

(F) The currently executing code was compiled with the B<-d> switch, but
for some reason the current debugger (e.g. F<perl5db.pl> or a C<Devel::>
module) didn't define a routine to be called at the beginning of each
statement.

=item No dbm on this machine

(P) This is counted as an internal error, because every machine should
supply dbm nowadays, because Perl comes with SDBM.  See L<SDBM_File>.

=item No DB::sub routine defined

(F) The currently executing code was compiled with the B<-d> switch, but
for some reason the current debugger (e.g. F<perl5db.pl> or a C<Devel::>
module) didn't define a C<DB::sub> routine to be called at the beginning
of each ordinary subroutine call.

=item No directory specified for -I

(F) The B<-I> command-line switch requires a directory name as part of the
I<same> argument.  Use B<-Ilib>, for instance.  B<-I lib> won't work.

=item No error file after 2> or 2>> on command line

(F) An error peculiar to VMS.  Perl handles its own command line
redirection, and found a '2>' or a '2>>' on the command line, but can't
find the name of the file to which to write data destined for stderr.

=item No group ending character '%c' found in template

(F) A pack or unpack template has an opening '(' or '[' without its
matching counterpart.  See L<perlfunc/pack>.

=item No input file after < on command line

(F) An error peculiar to VMS.  Perl handles its own command line
redirection, and found a '<' on the command line, but can't find the
name of the file from which to read data for stdin.

=item No next::method '%s' found for %s

(F) C<next::method> found no further instances of this method name
in the remaining packages of the MRO of this class.  If you don't want
it throwing an exception, use C<maybe::next::method>
or C<next::can>.  See L<mro>.
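
A sketch of the non-throwing alternative, assuming Perl 5.14 or later
for the C<package NAME BLOCK> syntax; the class and method names are
hypothetical:

    use strict;
    use warnings;
    use mro;    # loads next::method and friends

    package Base {
        sub new { return bless {}, shift }
    }

    package Child {
        our @ISA = ('Base');

        sub describe {
            my $self = shift;
            # Base defines no describe(), so next::method would die
            # here; maybe::next::method returns undef instead.
            my $inherited = $self->maybe::next::method();
            return defined $inherited ? "child, $inherited" : 'child';
        }
    }

    print Child->new->describe, "\n";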

=item Non-finite repeat count does nothing

(W numeric) You tried to execute the
L<C<x>|perlop/Multiplicative Operators> repetition operator C<Inf> (or
C<-Inf>) or C<NaN> times, which doesn't make sense.

=item Non-hex character in regex; marked by S<<-- HERE> in m/%s/

(F) In a regular expression, there was a non-hexadecimal character where
a hex one was expected, like

 (?[ [ \xDG ] ])
 (?[ [ \x{DEKA} ] ])

=item Non-octal character in regex; marked by S<<-- HERE> in m/%s/

(F) In a regular expression, there was a non-octal character where
an octal one was expected, like

 (?[ [ \o{1278} ] ])

=item Non-octal character '%c'.  Resolved as "%s"

(W digit) In parsing an octal numeric constant, a character was
unexpectedly encountered that isn't octal.  The resulting value
is as indicated.

=item "no" not allowed in expression

(F) The "no" keyword is recognized and executed at compile time, and
returns no useful value.  See L<perlmod>.

=item Non-string passed as bitmask

(W misc) A number has been passed as a bitmask argument to select().
Use the vec() function to construct the file descriptor bitmasks for
select.  See L<perlfunc/select>.
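
A sketch of building the bitmask with vec(), assuming a Unix-like
system where select() works on pipes; the handle names are
illustrative:

    use strict;
    use warnings;

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    syswrite($writer, "x");      # make the read end ready

    my $rin = '';
    vec($rin, fileno($reader), 1) = 1;   # set this descriptor's bit

    # select() modifies its bitmask arguments, so pass a copy.
    my $nfound = select(my $rout = $rin, undef, undef, 1);
    print "ready handles: $nfound\n";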

=item No output file after > on command line

(F) An error peculiar to VMS.  Perl handles its own command line
redirection, and found a lone '>' at the end of the command line, so it
doesn't know where you wanted to redirect stdout.

=item No output file after > or >> on command line

(F) An error peculiar to VMS.  Perl handles its own command line
redirection, and found a '>' or a '>>' on the command line, but can't
find the name of the file to which to write data destined for stdout.

=item No package name allowed for variable %s in "our"

(F) Fully qualified variable names are not allowed in "our"
declarations, because that doesn't make much sense under existing
rules.  Such syntax is reserved for future extensions.

=item No Perl script found in input

(F) You called C<perl -x>, but no line was found in the file beginning
with #! and containing the word "perl".

=item No setregid available

(F) Configure didn't find anything resembling the setregid() call for
your system.

=item No setreuid available

(F) Configure didn't find anything resembling the setreuid() call for
your system.

=item No such class %s

(F) You provided a class qualifier in a "my", "our" or "state"
declaration, but this class doesn't exist at this point in your program.

=item No such class field "%s" in variable %s of type %s

(F) You tried to access a key from a hash through the indicated typed
variable but that key is not allowed by the package of the same type.
The indicated package has restricted the set of allowed keys using the
L<fields> pragma.

=item No such hook: %s

(F) You specified a signal hook that was not recognized by Perl.
Currently, Perl accepts C<__DIE__> and C<__WARN__> as valid signal hooks.

=item No such pipe open

(P) An error peculiar to VMS.  The internal routine my_pclose() tried to
close a pipe which hadn't been opened.  This should have been caught
earlier as an attempt to close an unopened filehandle.

=item No such signal: SIG%s

(W signal) You specified a signal name as a subscript to %SIG that was
not recognized.  Say C<kill -l> in your shell to see the valid signal
names on your system.
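
The same list is available from within Perl through the L<Config>
module.  A sketch that checks a name before using it as a C<%SIG>
subscript (C<NOSUCH> is a made-up name):

    use strict;
    use warnings;
    use Config;

    # $Config{sig_name} is a space-separated list of signal names.
    my %known = map { $_ => 1 } split ' ', $Config{sig_name};

    for my $name (qw(INT NOSUCH)) {
        my $status = $known{$name} ? 'valid' : 'not recognized';
        print "$name: $status\n";
    }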

=item Not a CODE reference

(F) Perl was trying to evaluate a reference to a code value (that is, a
subroutine), but found a reference to something else instead.  You can
use the ref() function to find out what kind of ref it really was.  See
also L<perlref>.

=item Not a GLOB reference

(F) Perl was trying to evaluate a reference to a "typeglob" (that is, a
symbol table entry that looks like C<*foo>), but found a reference to
something else instead.  You can use the ref() function to find out what
kind of ref it really was.  See L<perlref>.

=item Not a HASH reference

(F) Perl was trying to evaluate a reference to a hash value, but found a
reference to something else instead.  You can use the ref() function to
find out what kind of ref it really was.  See L<perlref>.
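
A sketch of the ref() check, using a hypothetical C<lookup> helper.
Note that ref() returns the class name for a blessed reference, so
this simple test is only suitable for plain hashes:

    use strict;
    use warnings;

    # Dereference only when the argument really is a hash reference.
    sub lookup {
        my ($thing, $key) = @_;
        return ref($thing) eq 'HASH' ? $thing->{$key} : undef;
    }

    my $hit  = lookup({ color => 'red' }, 'color');
    my $miss = lookup([1, 2], 'color');

    print "hit: $hit\n";
    print "miss: ", (defined $miss ? $miss : 'not a hash'), "\n";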

=item '#' not allowed immediately following a sigil in a subroutine signature

(F) In a subroutine signature definition, a comment following a sigil
(C<$>, C<@> or C<%>), needs to be separated by whitespace or a comma etc., in
particular to avoid confusion with the C<$#> variable.  For example:

    # bad
    sub f ($# ignore first arg
           , $b) {}
    # good
    sub f ($, # ignore first arg
           $b) {}

=item Not an ARRAY reference

(F) Perl was trying to evaluate a reference to an array value, but found
a reference to something else instead.  You can use the ref() function
to find out what kind of ref it really was.  See L<perlref>.

=item Not a SCALAR reference

(F) Perl was trying to evaluate a reference to a scalar value, but found
a reference to something else instead.  You can use the ref() function
to find out what kind of ref it really was.  See L<perlref>.

=item Not a subroutine reference

(F) Perl was trying to evaluate a reference to a code value (that is, a
subroutine), but found a reference to something else instead.  You can
use the ref() function to find out what kind of ref it really was.  See
also L<perlref>.

=item Not a subroutine reference in overload table

(F) An attempt was made to specify an entry in an overloading table that
doesn't somehow point to a valid subroutine.  See L<overload>.

=item Not enough arguments for %s

(F) The function requires more arguments than you specified.

=item Not enough format arguments

(W syntax) A format specified more picture fields than the next line
supplied.  See L<perlform>.

=item %s: not found

(A) You've accidentally run your script through the Bourne shell instead
of Perl.  Check the #! line, or manually feed your script into Perl
yourself.

=item (?[...]) not valid in locale in regex; marked by S<<-- HERE> in m/%s/

(F) C<(?[...])> cannot be used within the scope of a C<S<use locale>> or with
an C</l> regular expression modifier, as that would require deferring
to run-time the calculation of what it should evaluate to, and it is
regex compile-time only.

=item no UTC offset information; assuming local time is UTC

(S) A warning peculiar to VMS.  Perl was unable to find the local
timezone offset, so it's assuming that local system time is equivalent
to UTC.  If it's not, define the logical name
F<SYS$TIMEZONE_DIFFERENTIAL> to translate to the number of seconds which
need to be added to UTC to get local time.

=item NULL OP IN RUN

(S debugging) Some internal routine called run() with a null opcode
pointer.

=item Null picture in formline

(F) The first argument to formline must be a valid format picture
specification.  It was found to be empty, which probably means you
supplied it an uninitialized value.  See L<perlform>.

=item Null realloc

(P) An attempt was made to realloc NULL.

=item NULL regexp argument

(P) The internal pattern matching routines blew it big time.

=item NULL regexp parameter

(P) The internal pattern matching routines are out of their gourd.

=item Number too long

(F) Perl limits the representation of decimal numbers in programs to
about 250 characters.  You've exceeded that length.  Future
versions of Perl are likely to eliminate this arbitrary limitation.  In
the meantime, try using scientific notation (e.g. "1e6" instead of
"1_000_000").

=item Number with no digits

(F) Perl was looking for a number but found nothing that looked like
a number.  This happens, for example, with C<\o{}>, which has no number
between the braces.

=item Octal number > 037777777777 non-portable

(W portable) The octal number you specified is larger than 2**32-1
(4294967295) and therefore non-portable between systems.  See
L<perlport> for more on portability concerns.

=item Odd name/value argument for subroutine '%s'

(F) A subroutine using a slurpy hash parameter in its signature
received an odd number of arguments to populate the hash.  It requires
the arguments to be paired, with the same number of keys as values.
The caller of the subroutine is presumably at fault.

The message attempts to include the name of the called subroutine. If the
subroutine has been aliased, the subroutine's original name will be shown,
regardless of what name the caller used.

=item Odd number of arguments for overload::constant

(W overload) The call to overload::constant contained an odd number of
arguments.  The arguments should come in pairs.

=item Odd number of elements in anonymous hash

(W misc) You specified an odd number of elements to initialize a hash,
which is odd, because hashes come in key/value pairs.

=item Odd number of elements in hash assignment

(W misc) You specified an odd number of elements to initialize a hash,
which is odd, because hashes come in key/value pairs.
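
One way to catch the problem before the assignment, sketched here with
hypothetical variable names (C<@pairs>, C<%config>), is to test the
length of the list first:

    use strict;
    use warnings;

    my @pairs = (name => 'perl', version => 5);

    # A hash is built from key, value, key, value ... so the list
    # must hold an even number of elements.
    die "Odd number of elements\n" if @pairs % 2;
    my %config = @pairs;

    print join(', ', map { "$_=$config{$_}" } sort keys %config), "\n";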

=item Offset outside string

(F)(W layer) You tried to do a read/write/send/recv/seek operation
with an offset pointing outside the buffer.  This is difficult to
imagine.  The sole exceptions to this are that zero padding will
take place when going past the end of the string when either
C<sysread()>ing a file, or when seeking past the end of a scalar opened
for I/O (in anticipation of future reads and to imitate the behavior
with real files).

=item %s() on unopened %s

(W unopened) An I/O operation was attempted on a filehandle that was
never initialized.  You need to do an open(), a sysopen(), or a socket()
call, or call a constructor from the FileHandle package.

=item -%s on unopened filehandle %s

(W unopened) You tried to invoke a file test operator on a filehandle
that isn't open.  Check your control flow.  See also L<perlfunc/-X>.

=item oops: oopsAV

(S internal) An internal warning that the grammar is screwed up.

=item oops: oopsHV

(S internal) An internal warning that the grammar is screwed up.

=item Opening dirhandle %s also as a file. This will be a fatal error in Perl 5.28

(D io, deprecated) You used open() to associate a filehandle to
a symbol (glob or scalar) that already holds a dirhandle.
Although legal, this idiom might render your code confusing
and this was deprecated in Perl 5.10. In Perl 5.28, this 
will be a fatal error.

=item Opening filehandle %s also as a directory. This will be a fatal error in Perl 5.28

(D io, deprecated) You used opendir() to associate a dirhandle to
a symbol (glob or scalar) that already holds a filehandle.
Although legal, this idiom might render your code confusing
and this was deprecated in Perl 5.10. In Perl 5.28, this 
will be a fatal error.

=item Operand with no preceding operator in regex; marked by S<<-- HERE> in
m/%s/

(F) You wrote something like

 (?[ \p{Digit} \p{Thai} ])

There are two operands, but no operator giving how you want to combine
them.

=item Operation "%s": no method found, %s

(F) An attempt was made to perform an overloaded operation for which no
handler was defined.  While some handlers can be autogenerated in terms
of other handlers, there is no default handler for any operation, unless
the C<fallback> overloading key is specified to be true.  See L<overload>.

=item Operation "%s" returns its argument for non-Unicode code point 0x%X

(S non_unicode) You performed an operation requiring Unicode rules
on a code point that is not in Unicode, so what it should do is not
defined.  Perl has chosen to have it do nothing, and warn you.

If the operation shown is "ToFold", it means that case-insensitive
matching in a regular expression was done on the code point.

If you know what you are doing you can turn off this warning by
C<no warnings 'non_unicode';>.

=item Operation "%s" returns its argument for UTF-16 surrogate U+%X

(S surrogate) You performed an operation requiring Unicode
rules on a Unicode surrogate.  Unicode frowns upon the use
of surrogates for anything but storing strings in UTF-16, but
rules are (reluctantly) defined for the surrogates, and
they are to do nothing for this operation.  Because the use of
surrogates can be dangerous, Perl warns.

If the operation shown is "ToFold", it means that case-insensitive
matching in a regular expression was done on the code point.

If you know what you are doing you can turn off this warning by
C<no warnings 'surrogate';>.

=item Operator or semicolon missing before %s

(S ambiguous) You used a variable or subroutine call where the parser
was expecting an operator.  The parser has assumed you really meant to
use an operator, but this is highly likely to be incorrect.  For
example, if you say "*foo *foo" it will be interpreted as if you said
"*foo * 'foo'".

=item Optional parameter lacks default expression

(F) In a subroutine signature, you wrote something like "$a =", making a
named optional parameter without a default value.  A nameless optional
parameter is permitted to have no default value, but a named one must
have a specific default.  You probably want "$a = undef".

=item "our" variable %s redeclared

(W misc) You seem to have already declared the same global once before
in the current lexical scope.

=item Out of memory!

(X) The malloc() function returned 0, indicating there was insufficient
remaining memory (or virtual memory) to satisfy the request.  Perl has
no option but to exit immediately.

At least in Unix you may be able to get past this by increasing your
process datasize limits: in csh/tcsh use C<limit> and
C<limit datasize n> (where C<n> is the number of kilobytes) to check
the current limits and change them, and in ksh/bash/zsh use C<ulimit -a>
and C<ulimit -d n>, respectively.

=item Out of memory during %s extend

(X) An attempt was made to extend an array, a list, or a string beyond
the largest possible memory allocation.

=item Out of memory during "large" request for %s

(F) The malloc() function returned 0, indicating there was insufficient
remaining memory (or virtual memory) to satisfy the request.  However,
the request was judged large enough (compile-time default is 64K), so a
possibility to shut down by trapping this error is granted.

=item Out of memory during request for %s

(X)(F) The malloc() function returned 0, indicating there was
insufficient remaining memory (or virtual memory) to satisfy the
request.

The request was judged to be small, so the possibility to trap it
depends on the way perl was compiled.  By default it is not trappable.
However, if compiled for this, Perl may use the contents of C<$^M> as an
emergency pool after die()ing with this message.  In this case the error
is trappable I<once>, and the error message will include the line and file
where the failed request happened.

=item Out of memory during ridiculously large request

(F) You can't allocate more than 2^31+"small amount" bytes.  This error
is most likely to be caused by a typo in the Perl program, e.g.,
C<$arr[time]> instead of C<$arr[$time]>.

=item Out of memory for yacc stack

(F) The yacc parser wanted to grow its stack so it could continue
parsing, but realloc() wouldn't give it more memory, virtual or
otherwise.

=item '.' outside of string in pack

(F) The argument to a '.' in your template tried to move the working
position to before the start of the packed string being built.

=item '@' outside of string in unpack

(F) You had a template that specified an absolute position outside
the string being unpacked.  See L<perlfunc/pack>.

=item '@' outside of string with malformed UTF-8 in unpack

(F) You had a template that specified an absolute position outside
the string being unpacked.  The string being unpacked was also invalid
UTF-8.  See L<perlfunc/pack>.

=item overload arg '%s' is invalid

(W overload) The L<overload> pragma was passed an argument it did not
recognize.  Did you mistype an operator?

=item Overloaded dereference did not return a reference

(F) An object with an overloaded dereference operator was dereferenced,
but the overloaded operation did not return a reference.  See
L<overload>.

=item Overloaded qr did not return a REGEXP

(F) An object with a C<qr> overload was used as part of a match, but the
overloaded operation didn't return a compiled regexp.  See L<overload>.

=item %s package attribute may clash with future reserved word: %s

(W reserved) A lowercase attribute name was used that had a
package-specific handler.  That name might have a meaning to Perl itself
some day, even though it doesn't yet.  Perhaps you should use a
mixed-case attribute name, instead.  See L<attributes>.

=item pack/unpack repeat count overflow

(F) You can't specify a repeat count so large that it overflows your
signed integers.  See L<perlfunc/pack>.

=item page overflow

(W io) A single call to write() produced more lines than can fit on a
page.  See L<perlform>.

=item panic: %s

(P) An internal error.

=item panic: attempt to call %s in %s

(P) One of the file test operators entered a code branch that calls
an ACL-related function, but that function is not available on this
platform.  Earlier checks mean that it should not be possible to
enter this branch on this platform.

=item panic: child pseudo-process was never scheduled

(P) A child pseudo-process in the ithreads implementation on Windows
was not scheduled within the time period allowed and therefore was not
able to initialize properly.

=item panic: ck_grep, type=%u

(P) Failed an internal consistency check trying to compile a grep.

=item panic: corrupt saved stack index %ld

(P) The savestack was requested to restore more localized values than
there are in the savestack.

=item panic: del_backref

(P) Failed an internal consistency check while trying to reset a weak
reference.

=item panic: do_subst

(P) The internal pp_subst() routine was called with invalid operational
data.

=item panic: do_trans_%s

(P) The internal do_trans routines were called with invalid operational
data.

=item panic: fold_constants JMPENV_PUSH returned %d

(P) While attempting to fold constants, an exception other than an
C<eval> failure was caught.

=item panic: frexp: %f

(P) The library function frexp() failed, making printf("%f") impossible.

=item panic: goto, type=%u, ix=%ld

(P) We popped the context stack to a context with the specified label,
and then discovered it wasn't a context we know how to do a goto in.

=item panic: gp_free failed to free glob pointer

(P) The internal routine used to clear a typeglob's entries tried
repeatedly, but each time something re-created entries in the glob.
Most likely the glob contains an object with a reference back to
the glob and a destructor that adds a new object to the glob.

=item panic: INTERPCASEMOD, %s

(P) The lexer got into a bad state at a case modifier.

=item panic: INTERPCONCAT, %s

(P) The lexer got into a bad state parsing a string with brackets.

=item panic: kid popen errno read

(F) A forked child returned an incomprehensible message about its errno.

=item panic: last, type=%u

(P) We popped the context stack to a block context, and then discovered
it wasn't a block context.

=item panic: leave_scope clearsv

(P) A writable lexical variable became read-only somehow within the
scope.

=item panic: leave_scope inconsistency %u

(P) The savestack probably got out of sync.  At least, there was an
invalid enum on the top of it.

=item panic: magic_killbackrefs

(P) Failed an internal consistency check while trying to reset all weak
references to an object.

=item panic: malloc, %s

(P) Something requested a negative number of bytes of malloc.

=item panic: memory wrap

(P) Something tried to allocate either more memory than possible or a
negative amount.

=item panic: pad_alloc, %p!=%p

(P) The compiler got confused about which scratch pad it was allocating
and freeing temporaries and lexicals from.

=item panic: pad_free curpad, %p!=%p

(P) The compiler got confused about which scratch pad it was allocating
and freeing temporaries and lexicals from.

=item panic: pad_free po

(P) A zero scratch pad offset was detected internally.  An attempt was
made to free a target that had not been allocated to begin with.

=item panic: pad_reset curpad, %p!=%p

(P) The compiler got confused about which scratch pad it was allocating
and freeing temporaries and lexicals from.

=item panic: pad_sv po

(P) A zero scratch pad offset was detected internally.  Most likely
an operator needed a target but that target had not been allocated
for whatever reason.

=item panic: pad_swipe curpad, %p!=%p

(P) The compiler got confused about which scratch pad it was allocating
and freeing temporaries and lexicals from.

=item panic: pad_swipe po

(P) An invalid scratch pad offset was detected internally.

=item panic: pp_iter, type=%u

(P) The foreach iterator got called in a non-loop context frame.

=item panic: pp_match%s

(P) The internal pp_match() routine was called with invalid operational
data.

=item panic: realloc, %s

(P) Something requested a negative number of bytes of realloc.

=item panic: reference miscount on nsv in sv_replace() (%d != 1)

(P) The internal sv_replace() function was handed a new SV with a
reference count other than 1.

=item panic: restartop in %s

(P) Some internal routine requested a goto (or something like it), and
didn't supply the destination.

=item panic: return, type=%u

(P) We popped the context stack to a subroutine or eval context, and
then discovered it wasn't a subroutine or eval context.

=item panic: scan_num, %s

(P) scan_num() got called on something that wasn't a number.

=item panic: Sequence (?{...}): no code block found in regex m/%s/

(P) While compiling a pattern that has embedded (?{}) or (??{}) code
blocks, perl couldn't locate the code block that should have already been
seen and compiled by perl before control passed to the regex compiler.

=item panic: strxfrm() gets absurd - a => %u, ab => %u

(P) The interpreter's sanity check of the C function strxfrm() failed.
In your current locale the returned transformation of the string "ab"
is shorter than that of the string "a", which makes no sense.

=item panic: sv_chop %s

(P) The sv_chop() routine was passed a position that is not within the
scalar's string buffer.

=item panic: sv_insert, midend=%p, bigend=%p

(P) The sv_insert() routine was told to remove more string than there
was string.

=item panic: top_env

(P) The compiler attempted to do a goto, or something weird like that.

=item panic: unimplemented op %s (#%d) called

(P) The compiler is screwed up and attempted to use an op that isn't
permitted at run time.

=item panic: unknown OA_*: %x

(P) The internal routine that handles arguments to C<&CORE::foo()>
subroutine calls was unable to determine what type of arguments
were expected.

=item panic: utf16_to_utf8: odd bytelen

(P) Something tried to call utf16_to_utf8 with an odd (as opposed
to even) byte length.

=item panic: utf16_to_utf8_reversed: odd bytelen

(P) Something tried to call utf16_to_utf8_reversed with an odd (as opposed
to even) byte length.

=item panic: yylex, %s

(P) The lexer got into a bad state while processing a case modifier.

=item Parentheses missing around "%s" list

(W parenthesis) You said something like

    my $foo, $bar = @_;

when you meant

    my ($foo, $bar) = @_;

Remember that "my", "our", "local" and "state" bind tighter than comma.
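
A small self-contained sketch of the corrected form follows; the
subroutine name C<first_two> is hypothetical:

    use strict;
    use warnings;

    sub first_two {
        # The parentheses make "my" declare both variables, so the
        # assignment is a list assignment from @_.
        my ($foo, $bar) = @_;
        return "$foo-$bar";
    }

    print first_two('a', 'b'), "\n";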

=item Parsing code internal error (%s)

(F) Parsing code supplied by an extension violated the parser's API in
a detectable way.

=item Pattern subroutine nesting without pos change exceeded limit in regex

(F) You used a pattern that uses too many nested subpattern calls without
consuming any text.  Restructure the pattern so text is consumed before
the nesting limit is exceeded.

=item C<-p> destination: %s

(F) An error occurred during the implicit output invoked by the C<-p>
command-line switch.  (This output goes to STDOUT unless you've
redirected it with select().)

=item Perl API version %s of %s does not match %s

(F) The XS module in question was compiled against a different incompatible
version of Perl than the one that has loaded the XS module.

=item Perl folding rules are not up-to-date for 0x%X; please use the perlbug
utility to report; in regex; marked by S<<-- HERE> in m/%s/

(S regexp) You used a regular expression with case-insensitive matching,
and there is a bug in Perl in which the built-in regular expression
folding rules are not accurate.  This may lead to incorrect results.
Please report this as a bug using the L<perlbug> utility.

=item PerlIO layer ':win32' is experimental

(S experimental::win32_perlio) The C<:win32> PerlIO layer is
experimental.  If you want to take the risk of using this layer,
simply disable this warning:

    no warnings "experimental::win32_perlio";

=item Perl_my_%s() not available

(F) Your platform has very uncommon byte-order and integer size,
so it was not possible to set up some or all fixed-width byte-order
conversion functions.  This is only a problem when you're using the
'<' or '>' modifiers in (un)pack templates.  See L<perlfunc/pack>.

=item Perl %s required (did you mean %s?)--this is only %s, stopped

(F) The code you are trying to run has asked for a newer version of
Perl than you are running.  Perhaps C<use 5.10> was written instead
of C<use 5.010> or C<use v5.10>.  Without the leading C<v>, the number is
interpreted as a decimal, with every three digits after the
decimal point representing a part of the version number.  So 5.10
is equivalent to v5.100.
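
The mapping between the two notations can be inspected with the core
L<version> module.  This is only an illustrative sketch; the message
itself is produced by C<use VERSION>, not by this module:

    use strict;
    use warnings;
    use version;

    # normal() renders a version object in dotted "v" notation.
    print version->parse('5.10')->normal,  "\n";   # decimal 5.10
    print version->parse('5.010')->normal, "\n";   # decimal 5.010

This prints C<v5.100.0> and then C<v5.10.0>, which shows why C<use 5.10>
asks for a much newer perl than intended.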

=item Perl %s required--this is only %s, stopped

(F) The module in question uses features of a version of Perl more
recent than the currently running version.  How long has it been since
you upgraded, anyway?  See L<perlfunc/require>.

=item PERL_SH_DIR too long

(F) An error peculiar to OS/2.  PERL_SH_DIR is the directory to find the
C<sh>-shell in.  See "PERL_SH_DIR" in L<perlos2>.

=item PERL_SIGNALS illegal: "%s"

(X) See L<perlrun/PERL_SIGNALS> for legal values.

=item Perls since %s too modern--this is %s, stopped

(F) The code you are trying to run claims it will not run
on the version of Perl you are using because it is too new.
Maybe the code needs to be updated, or maybe it is simply
wrong and the version check should just be removed.

=item perl: warning: Non hex character in '$ENV{PERL_HASH_SEED}', seed only partially set

(S) PERL_HASH_SEED should match /^\s*(?:0x)?[0-9a-fA-F]+\s*\z/ but it
contained a non hex character.  This could mean you are not using the
hash seed you think you are.

=item perl: warning: Setting locale failed.

(S) The whole warning message will look something like:

	perl: warning: Setting locale failed.
	perl: warning: Please check that your locale settings:
	        LC_ALL = "En_US",
	        LANG = (unset)
	    are supported and installed on your system.
	perl: warning: Falling back to the standard locale ("C").

Exactly which locale settings failed varies.  In the example above,
LC_ALL was "En_US" and LANG had no value.
This error means that Perl detected that you and/or your operating
system supplier and/or system administrator have set up the so-called
locale system but Perl could not use those settings.  This was not
dead serious, fortunately: there is a "default locale" called "C" that
Perl can and will use, and the script will be run.  Before you really
fix the problem, however, you will get the same error message each
time you run Perl.  How to really fix the problem can be found in
L<perllocale> section B<LOCALE PROBLEMS>.

=item perl: warning: strange setting in '$ENV{PERL_PERTURB_KEYS}': '%s'

(S) Perl was run with the environment variable PERL_PERTURB_KEYS defined
but containing an unexpected value.  The legal values of this setting
are as follows.

  Numeric | String        | Result
  --------+---------------+-----------------------------------------
  0       | NO            | Disables key traversal randomization
  1       | RANDOM        | Enables full key traversal randomization
  2       | DETERMINISTIC | Enables repeatable key traversal
          |               | randomization

Both numeric and string values are accepted, but note that string values are
case sensitive.  The default for this setting is "RANDOM" or 1.

=item pid %x not a child

(W exec) A warning peculiar to VMS.  Waitpid() was asked to wait for a
process which isn't a subprocess of the current process.  While this is
fine from VMS' perspective, it's probably not what you intended.

=item 'P' must have an explicit size in unpack

(F) The unpack format P must have an explicit size, not "*".

=item POSIX class [:%s:] unknown in regex; marked by S<<-- HERE> in m/%s/

(F) The class in the character class [: :] syntax is unknown.  The S<<-- HERE>
shows whereabouts in the regular expression the problem was discovered.
Note that the POSIX character classes do B<not> have the C<is> prefix
the corresponding C interfaces have: in other words, it's C<[[:print:]]>,
not C<isprint>.  See L<perlre>.

=item POSIX getpgrp can't take an argument

(F) Your system has POSIX getpgrp(), which takes no argument, unlike
the BSD version, which takes a pid.

=item POSIX syntax [%c %c] belongs inside character classes%s in regex; marked by
S<<-- HERE> in m/%s/

(W regexp) Perl thinks that you intended to write a POSIX character
class, but didn't use enough brackets.  These POSIX class constructs [:
:], [= =], and [. .]  go I<inside> character classes, the [] are part of
the construct, for example: C<qr/[012[:alpha:]345]/>.  What the regular
expression pattern compiled to is probably not what you were intending.
For example, C<qr/[:alpha:]/> compiles to a regular bracketed character
class consisting of the five characters C<":">, C<"a">, C<"l">,
C<"h">, and C<"p">.  To specify the POSIX class, it should have been
written C<qr/[[:alpha:]]/>.

Note that [= =] and [. .] are not currently
implemented; they are simply placeholders for future extensions and
will cause fatal errors.  The S<<-- HERE> shows whereabouts in the regular
expression the problem was discovered.  See L<perlre>.

If the specification of the class was not completely valid, the message
indicates that.
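
The correctly bracketed form can be checked with a short sketch like
this one (the variable name C<$alpha> is arbitrary):

    use strict;
    use warnings;

    # Doubled brackets: the POSIX class sits inside a bracketed
    # character class.
    my $alpha = qr/[[:alpha:]]/;

    print "z" =~ $alpha ? "alpha\n" : "not alpha\n";
    print "7" =~ $alpha ? "alpha\n" : "not alpha\n";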

=item POSIX syntax [. .] is reserved for future extensions in regex; marked by 
S<<-- HERE> in m/%s/

(F) Within regular expression character classes ([]) the syntax beginning
with "[." and ending with ".]" is reserved for future extensions.  If you
need to represent those character sequences inside a regular expression
character class, just quote the square brackets with the backslash: "\[."
and ".\]".  The S<<-- HERE> shows whereabouts in the regular expression the
problem was discovered.  See L<perlre>.

=item POSIX syntax [= =] is reserved for future extensions in regex; marked by 
S<<-- HERE> in m/%s/

(F) Within regular expression character classes ([]) the syntax beginning
with "[=" and ending with "=]" is reserved for future extensions.  If you
need to represent those character sequences inside a regular expression
character class, just quote the square brackets with the backslash: "\[="
and "=\]".  The S<<-- HERE> shows whereabouts in the regular expression the
problem was discovered.  See L<perlre>.

=item Possible attempt to put comments in qw() list

(W qw) qw() lists contain items separated by whitespace; as with literal
strings, comment characters are not ignored, but are instead treated as
literal data.  (You may have used different delimiters than the
parentheses shown here; braces are also frequently used.)

You probably wrote something like this:

    @list = qw(
        a # a comment
        b # another comment
    );

when you should have written this:

    @list = qw(
        a
        b
    );

If you really want comments, build your list the
old-fashioned way, with quotes and commas:

    @list = (
        'a',    # a comment
        'b',    # another comment
    );

=item Possible attempt to separate words with commas

(W qw) qw() lists contain items separated by whitespace; therefore
commas aren't needed to separate the items.  (You may have used
different delimiters than the parentheses shown here; braces are also
frequently used.)

You probably wrote something like this:

    qw! a, b, c !;

which puts literal commas into some of the list items.  Write it without
commas if you don't want them to appear in your data:

    qw! a b c !;

=item Possible memory corruption: %s overflowed 3rd argument

(F) An ioctl() or fcntl() returned more than Perl was bargaining for.
Perl guesses a reasonable buffer size, but puts a sentinel byte at the
end of the buffer just in case.  This sentinel byte got clobbered, and
Perl assumes that memory is now corrupted.  See L<perlfunc/ioctl>.

=item Possible precedence issue with control flow operator

(W syntax) There is a possible problem with the mixing of a control
flow operator (e.g. C<return>) and a low-precedence operator like
C<or>.  Consider:

    sub { return $a or $b; }

This is parsed as:

    sub { (return $a) or $b; }

Which is effectively just:

    sub { return $a; }

Either use parentheses or the high-precedence variant of the operator.

Note that this may also be triggered for constructs like:

    sub { 1 if die; }
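
As a minimal sketch of the high-precedence remedy, C<||> keeps the
fallback inside the returned expression.  The subroutine name C<pick>
and its parameter names are hypothetical:

    use strict;
    use warnings;

    sub pick {
        my ($first, $second) = @_;
        # "||" binds tighter than "return", so this returns
        # $second whenever $first is false.
        return $first || $second;
    }

    print pick(0, 'fallback'), "\n";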

=item Possible precedence problem on bitwise %s operator

(W precedence) Your program uses a bitwise logical operator in conjunction
with a numeric comparison operator, like this:

    if ($x & $y == 0) { ... }

This expression is actually equivalent to C<$x & ($y == 0)>, due to the
higher precedence of C<==>.  This is probably not what you want.  (If you
really meant to write this, disable the warning, or, better, put the
parentheses explicitly and write C<$x & ($y == 0)>).
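
The two readings give different results, as this sketch with arbitrary
sample values shows:

    use strict;
    use warnings;

    my ($x, $y) = (4, 0);

    # Test the result of the bitwise AND against zero.
    my $no_common_bits = (($x & $y) == 0) ? 1 : 0;

    # What the unparenthesized expression actually means:
    # AND $x with the truth value of ($y == 0).
    my $surprise = $x & ($y == 0);

    print "$no_common_bits $surprise\n";

With these values the first form yields 1, while the second ANDs 4 with
a true value of 1 and yields 0.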

=item Possible unintended interpolation of $\ in regex

(W ambiguous) You said something like C<m/$\/> in a regex.
The regex C<m/foo$\s+bar/m> translates to: match the word 'foo', the output
record separator (see L<perlvar/$\>) and the letter 's' (one time or more)
followed by the word 'bar'.

If this is what you intended then you can silence the warning by using 
C<m/${\}/> (for example: C<m/foo${\}s+bar/>).

If instead you intended to match the word 'foo' at the end of the line
followed by whitespace and the word 'bar' on the next line then you can use
C<m/$(?)\/> (for example: C<m/foo$(?)\s+bar/>).

=item Possible unintended interpolation of %s in string

(W ambiguous) You said something like '@foo' in a double-quoted string
but there was no array C<@foo> in scope at the time.  If you wanted a
literal @foo, then write it as \@foo; otherwise find out what happened
to the array you apparently lost track of.
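
A short sketch of the escaped form, using a made-up address:

    use strict;
    use warnings;

    # The backslash stops "@example" from being read as an array.
    my $address = "user\@example.com";
    print "$address\n";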

=item Precedence problem: open %s should be open(%s)

(S precedence) The old irregular construct

    open FOO || die;

is now misinterpreted as

    open(FOO || die);

because of the strict regularization of Perl 5's grammar into unary and
list operators.  (The old open was a little of both.)  You must put
parentheses around the filehandle, or use the new "or" operator instead
of "||".
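
A sketch of the parenthesized form with C<or> follows.  It reads from an
in-memory file (a reference to the hypothetical scalar C<$text>) only so
that the example does not depend on any file on disk:

    use strict;
    use warnings;

    my $text = "line one\n";

    # Parentheses around the arguments, and low-precedence "or",
    # make die() depend on the result of open() itself.
    open(my $fh, '<', \$text) or die "Cannot open: $!";
    my $line = <$fh>;
    close($fh);

    print $line;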

=item Premature end of script headers

See L</500 Server error>.

=item printf() on closed filehandle %s

(W closed) The filehandle you're writing to got itself closed sometime
before now.  Check your control flow.

=item print() on closed filehandle %s

(W closed) The filehandle you're printing on got itself closed sometime
before now.  Check your control flow.

=item Process terminated by SIG%s

(W) This is a standard message issued by OS/2 applications, while *nix
applications die in silence.  It is considered a feature of the OS/2
port.  One can easily disable this by appropriate sighandlers, see
L<perlipc/"Signals">.  See also "Process terminated by SIGTERM/SIGINT"
in L<perlos2>.

=item Prototype after '%c' for %s : %s

(W illegalproto) A character follows % or @ in a prototype.  This is
useless, since % and @ gobble the rest of the subroutine arguments.

=item Prototype mismatch: %s vs %s

(S prototype) The subroutine being declared or defined had previously been
declared or defined with a different function prototype.

=item Prototype not terminated

(F) You've omitted the closing parenthesis in a function prototype
definition.

=item Prototype '%s' overridden by attribute 'prototype(%s)' in %s

(W prototype) A prototype was declared in both the parentheses after
the sub name and via the prototype attribute.  The prototype in
parentheses is useless, since it will be replaced by the prototype
from the attribute before it's ever used.

=item Quantifier follows nothing in regex; marked by S<<-- HERE> in m/%s/

(F) You started a regular expression with a quantifier.  Backslash it if
you meant it literally.  The S<<-- HERE> shows whereabouts in the regular
expression the problem was discovered.  See L<perlre>.

=item Quantifier in {,} bigger than %d in regex; marked by S<<-- HERE> in m/%s/

(F) There is currently a limit to the size of the min and max values of
the {min,max} construct.  The S<<-- HERE> shows whereabouts in the regular
expression the problem was discovered.  See L<perlre>.

=item Quantifier {n,m} with n > m can't match in regex

=item Quantifier {n,m} with n > m can't match in regex; marked by
S<<-- HERE> in m/%s/

(W regexp) Minima should be less than or equal to maxima.  If you really
want your regexp to match something 0 times, just put {0}.

=item Quantifier unexpected on zero-length expression in regex m/%s/

(W regexp) You applied a regular expression quantifier in a place where
it makes no sense, such as on a zero-width assertion.  Try putting the
quantifier inside the assertion instead.  For example, the way to match
"abc" provided that it is followed by three repetitions of "xyz" is
C</abc(?=(?:xyz){3})/>, not C</abc(?=xyz){3}/>.
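
The recommended pattern can be tried with a short sketch such as the
following (the variable name C<$re> is arbitrary):

    use strict;
    use warnings;

    # The quantifier sits inside the lookahead, applied to a
    # non-capturing group.
    my $re = qr/abc(?=(?:xyz){3})/;

    print "abcxyzxyzxyz" =~ $re ? "match\n" : "no match\n";
    print "abcxyzxyz"    =~ $re ? "match\n" : "no match\n";

The first string has three repetitions of "xyz" after "abc" and matches;
the second has only two and does not.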

=item Range iterator outside integer range

(F) One (or both) of the numeric arguments to the range operator ".."
are outside the range which can be represented by integers internally.
One possible workaround is to force Perl to use magical string increment
by prepending "0" to your numbers.

=item Ranges of ASCII printables should be some subset of "0-9", "A-Z", or
"a-z" in regex; marked by S<<-- HERE> in m/%s/

(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)

Stricter rules help to find typos and other errors.  Perhaps you didn't
even intend a range here, if the C<"-"> was meant to be some other
character, or should have been escaped (like C<"\-">).  If you did
intend a range, the one that was used is not portable between ASCII and
EBCDIC platforms, and doesn't have an obvious meaning to a casual
reader.

 [3-7]    # OK; Obvious and portable
 [d-g]    # OK; Obvious and portable
 [A-Y]    # OK; Obvious and portable
 [A-z]    # WRONG; Not portable; not clear what is meant
 [a-Z]    # WRONG; Not portable; not clear what is meant
 [%-.]    # WRONG; Not portable; not clear what is meant
 [\x41-Z] # WRONG; Not portable; not obvious to non-geek

(You can force portability by specifying a Unicode range, which means that
the endpoints are specified by
L<C<\N{...}>|perlrecharclass/Character Ranges>, but the meaning may
still not be obvious.)

The stricter rules require that ranges that start or stop with an ASCII
character that is not a control have all their endpoints be the literal
character, and not some escape sequence (like C<"\x41">), and the ranges
must be all digits, or all uppercase letters, or all lowercase letters.

=item Ranges of digits should be from the same group in regex; marked by
S<<-- HERE> in m/%s/

(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)

Stricter rules help to find typos and other errors.  You included a
range, and at least one of the end points is a decimal digit.  Under the
stricter rules, when this happens, both end points should be digits in
the same group of 10 consecutive digits.

=item readdir() attempted on invalid dirhandle %s

(W io) The dirhandle you're reading from is either closed or not really
a dirhandle.  Check your control flow.

=item readline() on closed filehandle %s

(W closed) The filehandle you're reading from got itself closed sometime
before now.  Check your control flow.

=item read() on closed filehandle %s

(W closed) You tried to read from a closed filehandle.

=item read() on unopened filehandle %s

(W unopened) You tried to read from a filehandle that was never opened.

=item Reallocation too large: %x

(F) You can't allocate more than 64K on an MS-DOS machine.

=item realloc() of freed memory ignored

(S malloc) An internal routine called realloc() on something that had
already been freed.

=item Recompile perl with B<-D>DEBUGGING to use B<-D> switch

(S debugging) You can't use the B<-D> option unless the code to produce
the desired output is compiled into Perl, which entails some overhead,
which is why it's currently left out of your copy.

=item Recursive call to Perl_load_module in PerlIO_find_layer

(P) It is currently not permitted to load modules when creating
a filehandle inside an %INC hook.  This can happen with C<open my
$fh, '<', \$scalar>, which implicitly loads PerlIO::scalar.  Try
loading PerlIO::scalar explicitly first.

=item Recursive inheritance detected in package '%s'

(F) While calculating the method resolution order (MRO) of a package, Perl
believes it found an infinite loop in the C<@ISA> hierarchy.  This is a
crude check that bails out after 100 levels of C<@ISA> depth.

=item Redundant argument in %s

(W redundant) You called a function with more arguments than were
needed, as indicated by the other arguments you supplied.  Currently
this is emitted only when a printf-type format required fewer arguments
than were supplied, but it might be used in the future for
e.g. L<perlfunc/pack>.

=item refcnt_dec: fd %d%s

=item refcnt: fd %d%s

=item refcnt_inc: fd %d%s

(P) Perl's I/O implementation failed an internal consistency check.  If
you see this message, something is very wrong.

=item Reference found where even-sized list expected

(W misc) You gave a single reference where Perl was expecting a list
with an even number of elements (for assignment to a hash).  This
usually means that you used the anon hash constructor when you meant
to use parens.  In any case, a hash requires key/value B<pairs>.

    %hash = { one => 1, two => 2, };    # WRONG
    %hash = [ qw/ an anon array / ];    # WRONG
    %hash = ( one => 1, two => 2, );    # right
    %hash = qw( one 1 two 2 );          # also fine

=item Reference is already weak

(W misc) You have attempted to weaken a reference that is already weak.
Doing so has no effect.

=item Reference to invalid group 0 in regex; marked by S<<-- HERE> in m/%s/

(F) You used C<\g0> or similar in a regular expression.  You may refer
to capturing parentheses only with strictly positive integers
(normal backreferences) or with strictly negative integers (relative
backreferences).  Using 0 does not make sense.
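
As a minimal sketch of the two permitted forms (the sample string is
only an illustration), a backreference may count groups from the left
or backwards from its own position:

    my $s = "abab";
    print "absolute\n" if $s =~ /(ab)\g1/;      # group 1, counted from the left
    print "relative\n" if $s =~ /(ab)\g{-1}/;   # the group just before this point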

=item Reference to nonexistent group in regex; marked by S<<-- HERE> in
m/%s/

(F) You used something like C<\7> in your regular expression, but there are
not at least seven sets of capturing parentheses in the expression.  If
you wanted to have the character with ordinal 7 inserted into the regular
expression, prepend zeroes to make it three digits long: C<\007>.

The S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.
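
For illustration, either of the following should match the character
with ordinal 7 without being read as a backreference.  The sample
string is an assumption, and the C<\o{...}> form assumes Perl 5.14 or
later:

    my $text = "ring" . chr(7);
    print "found via \\007\n"  if $text =~ /\007/;    # three digits: octal escape
    print "found via \\o{7}\n" if $text =~ /\o{7}/;   # explicit octal escape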

=item Reference to nonexistent named group in regex; marked by S<<-- HERE>
in m/%s/

(F) You used something like C<\k'NAME'> or C<< \k<NAME> >> in your regular
expression, but there are no corresponding named capturing parentheses
such as C<(?'NAME'...)> or C<< (?<NAME>...) >>.  Check if the name has been
spelled correctly both in the backreference and the declaration.

The S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.

=item Reference to nonexistent or unclosed group in regex; marked by
S<<-- HERE> in m/%s/

(F) You used something like C<\g{-7}> in your regular expression, but there
are not at least seven sets of closed capturing parentheses in the
expression before where the C<\g{-7}> was located.

The S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.

=item regexp memory corruption

(P) The regular expression engine got confused by what the regular
expression compiler gave it.

=item Regexp modifier "/%c" may appear a maximum of twice

=item Regexp modifier "%c" may appear a maximum of twice in regex; marked
by S<<-- HERE> in m/%s/

(F) The regular expression pattern had too many occurrences
of the specified modifier.  Remove the extraneous ones.

=item Regexp modifier "%c" may not appear after the "-" in regex; marked by
S<<-- HERE> in m/%s/

(F) Turning off the given modifier has the side effect of turning on
another one.  Perl currently doesn't allow this.  Reword the regular
expression to use the modifier you want to turn on (and place it before
the minus), instead of the one you want to turn off.

=item Regexp modifier "/%c" may not appear twice

=item Regexp modifier "%c" may not appear twice in regex; marked by
S<<-- HERE> in m/%s/

(F) The regular expression pattern had too many occurrences
of the specified modifier.  Remove the extraneous ones.

=item Regexp modifiers "/%c" and "/%c" are mutually exclusive

=item Regexp modifiers "%c" and "%c" are mutually exclusive in regex;
marked by S<<-- HERE> in m/%s/

(F) The regular expression pattern had more than one of these
mutually exclusive modifiers.  Retain only the modifier that is
supposed to be there.

=item Regexp out of space in regex m/%s/

(P) A "can't happen" error, because safemalloc() should have caught it
earlier.

=item Repeated format line will never terminate (~~ and @#)

(F) Your format contains the ~~ repeat-until-blank sequence and a
numeric field that will never go blank so that the repetition never
terminates.  You might use ^# instead.  See L<perlform>.

=item Replacement list is longer than search list

(W misc) You have used a replacement list that is longer than the
search list.  So the additional elements in the replacement list
are meaningless.

=item '%s' resolved to '\o{%s}%d'

(W misc, regexp)  You wrote something like C<\08>, or C<\179> in a
double-quotish string.  All but the last digit is treated as a single
character, specified in octal.  The last digit is the next character in
the string.  To tell Perl that this is indeed what you want, you can use
the C<\o{ }> syntax, or use exactly three digits to specify the octal
for the character.
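
A short sketch of the two explicit spellings, either of which should
avoid the warning that a string such as C<"\08"> produces.  The
variable names are illustrative only:

    my $braces = "\o{0}8";    # NUL written as \o{ }, then a literal "8"
    my $padded = "\0008";     # exactly three octal digits, then a literal "8"
    print "same two characters\n" if $braces eq $padded;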

=item Reversed %s= operator

(W syntax) You wrote your assignment operator backwards.  The = must
always come last, to avoid ambiguity with subsequent unary operators.
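
To illustrate the difference (the variables are placeholders), the
compound operator keeps the C<=> last, while a plain assignment of a
negative value is clearer with whitespace around the C<=>:

    my $x = 10;
    $x -= 3;        # compound assignment: subtract 3 from $x
    my $y = 10;
    $y = -3;        # plain assignment of negative three
    print "$x $y\n";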

=item rewinddir() attempted on invalid dirhandle %s

(W io) The dirhandle you tried to do a rewinddir() on is either closed
or not really a dirhandle.  Check your control flow.

=item Scalars leaked: %d

(S internal) Something went wrong in Perl's internal bookkeeping
of scalars: not all scalar variables were deallocated by the time
Perl exited.  What this usually indicates is a memory leak, which
is of course bad, especially if the Perl program is intended to be
long-running.

=item Scalar value @%s[%s] better written as $%s[%s]

(W syntax) You've used an array slice (indicated by @) to select a
single element of an array.  Generally it's better to ask for a scalar
value (indicated by $).  The difference is that C<$foo[&bar]> always
behaves like a scalar, both when assigning to it and when evaluating its
argument, while C<@foo[&bar]> behaves like a list when you assign to it,
and provides a list context to its subscript, which can do weird things
if you're expecting only one subscript.

On the other hand, if you were actually hoping to treat the array
element as a list, you need to look into how references work, because
Perl will not magically convert between scalars and lists for you.  See
L<perlref>.
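
The context difference can be sketched as follows.  The helper
C<idx> is hypothetical and exists only to report which context its
caller supplies:

    my @foo = (10, 20, 30);
    sub idx { return wantarray ? (0, 2) : 1 }   # hypothetical helper
    my $element = $foo[ idx() ];    # scalar context: idx() returns 1
    my @slice   = @foo[ idx() ];    # list context: idx() returns (0, 2)
    print "$element | @slice\n";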

=item Scalar value @%s{%s} better written as $%s{%s}

(W syntax) You've used a hash slice (indicated by @) to select a single
element of a hash.  Generally it's better to ask for a scalar value
(indicated by $).  The difference is that C<$foo{&bar}> always behaves
like a scalar, both when assigning to it and when evaluating its
argument, while C<@foo{&bar}> behaves like a list when you assign to it,
and provides a list context to its subscript, which can do weird things
if you're expecting only one subscript.

On the other hand, if you were actually hoping to treat the hash element
as a list, you need to look into how references work, because Perl will
not magically convert between scalars and lists for you.  See
L<perlref>.

=item Search pattern not terminated

(F) The lexer couldn't find the final delimiter of a // or m{}
construct.  Remember that bracketing delimiters count nesting level.
Missing the leading C<$> from a variable C<$m> may cause this error.

Note that since Perl 5.10.0 a // can also be the I<defined-or>
construct, not just the empty search pattern.  Therefore code written
in Perl 5.10.0 or later that uses the // as the I<defined-or> can be
misparsed by pre-5.10.0 Perls as a non-terminated search pattern.
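
A small sketch of the I<defined-or> use of C<//>, assuming Perl 5.10.0
or later.  The hash and its keys are illustrative:

    my %opt   = (level => 0);
    my $level = $opt{level} // 3;             # 0 is defined, so it is kept
    my $name  = $opt{name}  // "anonymous";   # undef, so the default applies
    print "$level $name\n";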

=item seekdir() attempted on invalid dirhandle %s

(W io) The dirhandle you are doing a seekdir() on is either closed or not
really a dirhandle.  Check your control flow.

=item %sseek() on unopened filehandle

(W unopened) You tried to use the seek() or sysseek() function on a
filehandle that was either never opened or has since been closed.

=item select not implemented

(F) This machine doesn't implement the select() system call.

=item Self-ties of arrays and hashes are not supported

(F) Self-ties of arrays and hashes are not supported in
the current implementation.

=item Semicolon seems to be missing

(W semicolon) A nearby syntax error was probably caused by a missing
semicolon, or possibly some other missing operator, such as a comma.

=item semi-panic: attempt to dup freed string

(S internal) The internal newSVsv() routine was called to duplicate a
scalar that had previously been marked as free.

=item sem%s not implemented

(F) You don't have System V semaphore IPC on your system.

=item send() on closed socket %s

(W closed) The socket you're sending to got itself closed sometime
before now.  Check your control flow.

=item Sequence "\c{" invalid

(F) These three characters may not appear in sequence in a
double-quotish context.  This message is raised only on non-ASCII
platforms (a different error message is output on ASCII ones).  If you
were intending to specify a control character with this sequence, you'll
have to use a different way to specify it.

=item Sequence (? incomplete in regex; marked by S<<-- HERE> in m/%s/

(F) A regular expression ended with an incomplete extension (?.  The
S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.  See L<perlre>.

=item Sequence (?%c...) not implemented in regex; marked by S<<-- HERE> in
m/%s/

(F) A proposed regular expression extension has the character reserved
but has not yet been written.  The S<<-- HERE> shows whereabouts in the
regular expression the problem was discovered.  See L<perlre>.

=item Sequence (?%s...) not recognized in regex; marked by S<<-- HERE> in
m/%s/

(F) You used a regular expression extension that doesn't make sense.
The S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.  This may happen when using the C<(?^...)> construct to tell
Perl to use the default regular expression modifiers, and you
redundantly specify a default modifier.  For other
causes, see L<perlre>.

=item Sequence (?#... not terminated in regex m/%s/

(F) A regular expression comment must be terminated by a closing
parenthesis.  Embedded parentheses aren't allowed.  See
L<perlre>.

=item Sequence (?&... not terminated in regex; marked by S<<-- HERE> in
m/%s/

(F) A named reference of the form C<(?&...)> was missing the final
closing parenthesis after the name.  The S<<-- HERE> shows whereabouts
in the regular expression the problem was discovered.

=item Sequence (?%c... not terminated in regex; marked by S<<-- HERE>
in m/%s/

(F) A named group of the form C<(?'...')> or C<< (?<...>) >> was missing the final
closing quote or angle bracket.  The S<<-- HERE> shows whereabouts in the
regular expression the problem was discovered.

=item Sequence (?(%c... not terminated in regex; marked by S<<-- HERE>
in m/%s/

(F) A named reference of the form C<(?('...')...)> or C<< (?(<...>)...) >> was
missing the final closing quote or angle bracket after the name.  The
S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.

=item Sequence (?... not terminated in regex; marked by S<<-- HERE> in
m/%s/

(F) There was no matching closing parenthesis for the '('.  The
S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.

=item Sequence \%s... not terminated in regex; marked by S<<-- HERE> in
m/%s/

(F) The regular expression expects a mandatory argument following the escape
sequence and this has been omitted or incorrectly written.

=item Sequence (?{...}) not terminated with ')'

(F) The end of the perl code contained within the {...} must be
followed immediately by a ')'.

=item Sequence (?PE<gt>... not terminated in regex; marked by S<<-- HERE> in m/%s/

(F) A named reference of the form C<(?PE<gt>...)> was missing the final
closing parenthesis after the name.  The S<<-- HERE> shows whereabouts
in the regular expression the problem was discovered.

=item Sequence (?PE<lt>... not terminated in regex; marked by S<<-- HERE> in m/%s/

(F) A named group of the form C<(?PE<lt>...E<gt>...)> was missing the final
closing angle bracket.  The S<<-- HERE> shows whereabouts in the
regular expression the problem was discovered.

=item Sequence ?P=... not terminated in regex; marked by S<<-- HERE> in
m/%s/

(F) A named reference of the form C<(?P=...)> was missing the final
closing parenthesis after the name.  The S<<-- HERE> shows whereabouts
in the regular expression the problem was discovered.

=item Sequence (?R) not terminated in regex m/%s/

(F) An C<(?R)> or C<(?0)> sequence in a regular expression was missing the
final parenthesis.

=item Z<>500 Server error

(A) This is the error message generally seen in a browser window
when trying to run a CGI program (including SSI) over the web.  The
actual error text varies widely from server to server.  The most
frequently-seen variants are "500 Server error", "Method (something)
not permitted", "Document contains no data", "Premature end of script
headers", and "Did not produce a valid header".

B<This is a CGI error, not a Perl error>.

You need to make sure your script is executable, is accessible by
the user the CGI server runs the script as (which is probably not the
user account you tested it under), does not rely on any environment
variables (like PATH) from your own account, and is in a location
where the CGI server can find it.  Please see the following for more
information:

	http://www.perl.org/CGI_MetaFAQ.html
	http://www.htmlhelp.org/faq/cgifaq.html
	http://www.w3.org/Security/Faq/

You should also look at L<perlfaq9>.

=item setegid() not implemented

(F) You tried to assign to C<$)>, and your operating system doesn't
support the setegid() system call (or equivalent), or at least Configure
didn't think so.

=item seteuid() not implemented

(F) You tried to assign to C<< $> >>, and your operating system doesn't
support the seteuid() system call (or equivalent), or at least Configure
didn't think so.

=item setpgrp can't take arguments

(F) Your system has the setpgrp() from BSD 4.2, which takes no
arguments, unlike POSIX setpgid(), which takes a process ID and process
group ID.

=item setrgid() not implemented

(F) You tried to assign to C<$(>, and your operating system doesn't
support the setrgid() system call (or equivalent), or at least Configure
didn't think so.

=item setruid() not implemented

(F) You tried to assign to C<$<>, and your operating system doesn't
support the setruid() system call (or equivalent), or at least Configure
didn't think so.

=item setsockopt() on closed socket %s

(W closed) You tried to set a socket option on a closed socket.  Did you
forget to check the return value of your socket() call?  See
L<perlfunc/setsockopt>.

=item Setting $/ to a reference to %s as a form of slurp is deprecated, treating as undef. This will be fatal in Perl 5.28

(D deprecated) You assigned a reference to a scalar to C<$/> where the
referenced item is not a positive integer.  In older perls this B<appeared>
to work the same as setting it to C<undef> but was in fact internally
different, less efficient and with very bad luck could have resulted in
your file being split by a stringified form of the reference.

In Perl 5.20.0 this was changed so that it would be B<exactly> the same as
setting C<$/> to undef, with the exception that this warning would be
thrown.

You are recommended to change your code to set C<$/> to C<undef> explicitly
if you wish to slurp the file.  In Perl 5.28 assigning C<$/> to a 
reference to an integer which isn't positive will throw a fatal error.
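
The two supported settings can be sketched like this, using an
in-memory filehandle so the example is self-contained.  The data and
the record size of 4 are arbitrary choices for illustration:

    my $data = "abcdefghij";
    open my $fh, '<', \$data or die "cannot open in-memory file: $!";
    my $record = do { local $/ = \4;    <$fh> };   # fixed-size record read
    my $rest   = do { local $/ = undef; <$fh> };   # slurp everything left
    close $fh;
    print "$record|$rest\n";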

=item Setting $/ to %s reference is forbidden

(F) You tried to assign a reference to a non-integer to C<$/>.  In older
Perls this would have behaved similarly to setting it to a reference to
a positive integer, where the integer was the address of the reference.
As of Perl 5.20.0 this is a fatal error, to allow future versions of Perl
to use non-integer refs for more interesting purposes.

=item shm%s not implemented

(F) You don't have System V shared memory IPC on your system.

=item !=~ should be !~

(W syntax) The non-matching operator is !~, not !=~.  !=~ will be
interpreted as the != (numeric not equal) and ~ (1's complement)
operators: probably not what you intended.
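
For example, either of these spellings expresses a negated match (the
sample string is illustrative):

    my $line = "status: ok";
    print "no error\n" if     $line !~ /error/;   # negated match operator
    print "no error\n" unless $line =~ /error/;   # equivalent spelling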

=item /%s/ should probably be written as "%s"

(W syntax) You have used a pattern where Perl expected to find a string,
as in the first argument to C<join>.  Perl will treat the true or false
result of matching the pattern against $_ as the string, which is
probably not what you had in mind.
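
A minimal sketch of the string form that C<join> expects, with
illustrative data:

    my @parts = ("a", "b", "c");
    my $csv   = join(",", @parts);    # separator given as a string
    print "$csv\n";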

=item shutdown() on closed socket %s

(W closed) You tried to do a shutdown on a closed socket.  Seems a bit
superfluous.

=item SIG%s handler "%s" not defined

(W signal) The signal handler named in %SIG doesn't, in fact, exist.
Perhaps you put it into the wrong package?

=item Slab leaked from cv %p

(S) If you see this message, then something is seriously wrong with the
internal bookkeeping of op trees.  An op tree needed to be freed after
a compilation error, but could not be found, so it was leaked instead.

=item sleep(%u) too large

(W overflow) You called C<sleep> with a number that was larger than
it can reliably handle and C<sleep> probably slept for less time than
requested.

=item Slurpy parameter not last

(F) In a subroutine signature, you put something after a slurpy (array or
hash) parameter.  The slurpy parameter takes all the available arguments,
so there can't be any left to fill later parameters.
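
A sketch of a signature that places the slurpy array last, assuming
Perl 5.20 or later with the C<signatures> feature enabled.  The
subroutine name C<report> is hypothetical:

    use feature 'signatures';
    no warnings 'experimental::signatures';
    sub report ($label, @values) { return "$label: @values" }
    print report("sizes", 1, 2, 3), "\n";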

=item Smart matching a non-overloaded object breaks encapsulation

(F) You should not use the C<~~> operator on an object that does not
overload it: Perl refuses to use the object's underlying structure
for the smart match.

=item Smartmatch is experimental

(S experimental::smartmatch) This warning is emitted if you
use the smartmatch (C<~~>) operator.  This is currently an experimental
feature, and its details are subject to change in future releases of
Perl.  Particularly, its current behavior is noted for being
unnecessarily complex and unintuitive, and is very likely to be
overhauled.

=item sort is now a reserved word

(F) An ancient error message that almost nobody ever runs into anymore.
But before sort was a keyword, people sometimes used it as a filehandle.

=item Source filters apply only to byte streams

(F) You tried to activate a source filter (usually by loading a
source filter module) within a string passed to C<eval>.  This is
not permitted under the C<unicode_eval> feature.  Consider using
C<evalbytes> instead.  See L<feature>.

=item splice() offset past end of array

(W misc) You attempted to specify an offset that was past the end of
the array passed to splice().  Splicing will instead commence at the
end of the array, rather than past it.  If this isn't what you want,
try explicitly pre-extending the array by assigning $#array = $offset.
See L<perlfunc/splice>.
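
The pre-extension described above might look like this.  The array
contents and the offset are assumptions chosen for illustration:

    my @array  = (1, 2, 3);
    my $offset = 5;
    $#array = $offset if $offset > $#array;   # pad with undef up to $offset
    splice(@array, $offset, 1, "x");          # replace the padded element
    print scalar(@array), " elements\n";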

=item Split loop

(P) The split was looping infinitely.  (Obviously, a split shouldn't
iterate more times than there are characters of input, which is what
happened.)  See L<perlfunc/split>.

=item Statement unlikely to be reached

(W exec) You did an exec() with some statement after it other than a
die().  This is almost always an error, because exec() never returns
unless there was a failure.  You probably wanted to use system()
instead, which does return.  To suppress this warning, put the exec() in
a block by itself.
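
As a sketch of the system() alternative, the following runs a child
perl (via C<$^X>, the path of the running interpreter) and then carries
on.  The child's exit code of 3 is an arbitrary value for illustration:

    my $status = system($^X, "-e", "exit 3");   # returns once the child is done
    my $code   = $status >> 8;                  # high byte holds the exit code
    print "child exited with $code\n";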

=item "state" subroutine %s can't be in a package

(F) Lexically scoped subroutines aren't in a package, so it doesn't make
sense to try to declare one with a package qualifier on the front.

=item "state %s" used in sort comparison

(W syntax) The package variables $a and $b are used for sort comparisons.
You used $a or $b as an operand to the C<< <=> >> or C<cmp> operator inside a
sort comparison block, and the variable had earlier been declared as a
lexical variable.  Either qualify the sort variable with the package
name, or rename the lexical variable.

=item "state" variable %s can't be in a package

(F) Lexically scoped variables aren't in a package, so it doesn't make
sense to try to declare one with a package qualifier on the front.  Use
local() if you want to localize a package variable.

=item stat() on unopened filehandle %s

(W unopened) You tried to use the stat() function on a filehandle that
was either never opened or has since been closed.

=item Strings with code points over 0xFF may not be mapped into in-memory file handles

(W utf8) You tried to open a reference to a scalar for read or append
where the scalar contained code points over 0xFF.  In-memory files
model on-disk files and can only contain bytes.
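
One way to stay within bytes, sketched here with the core Encode
module, is to encode the text before opening it and to decode it again
through an I/O layer.  The sample text is an assumption:

    use Encode qw(encode);
    my $text  = "snowman: \x{2603}";
    my $bytes = encode("UTF-8", $text);    # every element is now 0x00-0xFF
    open my $fh, '<:encoding(UTF-8)', \$bytes or die "cannot open: $!";
    my $line = <$fh>;                      # decoded back to characters
    close $fh;
    print "round trip ok\n" if $line eq $text;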

=item Stub found while resolving method "%s" overloading "%s" in package "%s"

(P) Overloading resolution over the @ISA tree may be broken by importation
stubs.  Stubs should never be implicitly created, but explicit calls to
C<can> may break this.

=item Subroutine "&%s" is not available

(W closure) During compilation, an inner named subroutine or eval is
attempting to capture an outer lexical subroutine that is not currently
available.  This can happen for one of two reasons.  First, the lexical
subroutine may be declared in an outer anonymous subroutine that has
not yet been created.  (Remember that named subs are created at compile
time, while anonymous subs are created at run-time.)  For example,

    sub { my sub a {...} sub f { \&a } }

At the time that f is created, it can't capture the current "a" sub,
since the anonymous subroutine hasn't been created yet.  Conversely, the
following won't give a warning since the anonymous subroutine has by now
been created and is live:

    sub { my sub a {...} eval 'sub f { \&a }' }->();

The second situation is caused by an eval accessing a lexical subroutine
that has gone out of scope, for example,

    sub f {
        my sub a {...}
        sub { eval '\&a' }
    }
    f()->();

Here, when the '\&a' in the eval is being compiled, f() is not currently
being executed, so its &a is not available for capture.

=item "%s" subroutine &%s masks earlier declaration in same %s

(W misc) A "my" or "state" subroutine has been redeclared in the
current scope or statement, effectively eliminating all access to
the previous instance.  This is almost always a typographical error.
Note that the earlier subroutine will still exist until the end of
the scope or until all closure references to it are destroyed.

=item Subroutine %s redefined

(W redefine) You redefined a subroutine.  To suppress this warning, say

    {
        no warnings 'redefine';
        eval "sub name { ... }";
    }

=item Subroutine "%s" will not stay shared

(W closure) An inner (nested) I<named> subroutine is referencing a "my"
subroutine defined in an outer named subroutine.

When the inner subroutine is called, it will see the value of the outer
subroutine's lexical subroutine as it was before and during the *first*
call to the outer subroutine; in this case, after the first call to the
outer subroutine is complete, the inner and outer subroutines will no
longer share a common value for the lexical subroutine.  In other words,
it will no longer be shared.  This will especially make a difference
if the lexical subroutine accesses lexical variables declared in its
surrounding scope.

This problem can usually be solved by making the inner subroutine
anonymous, using the C<sub {}> syntax.  When inner anonymous subs that
reference lexical subroutines in outer subroutines are created, they
are automatically rebound to the current values of such lexical subs.
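
A sketch of the anonymous-sub approach, assuming Perl 5.26 or later,
where lexical subroutines need no feature declaration.  The names
C<outer> and C<helper> are hypothetical:

    use v5.26;
    sub outer {
        my ($n) = @_;
        my sub helper { return $n * 2 }         # closes over this call's $n
        my $inner = sub { return helper() };    # anonymous, so it is rebound
        return $inner->();
    }
    print outer(1), " ", outer(5), "\n";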

=item Substitution loop

(P) The substitution was looping infinitely.  (Obviously, a substitution
shouldn't iterate more times than there are characters of input, which
is what happened.)  See the discussion of substitution in
L<perlop/"Regexp Quote-Like Operators">.

=item Substitution pattern not terminated

(F) The lexer couldn't find the interior delimiter of an s/// or s{}{}
construct.  Remember that bracketing delimiters count nesting level.
Missing the leading C<$> from variable C<$s> may cause this error.

=item Substitution replacement not terminated

(F) The lexer couldn't find the final delimiter of an s/// or s{}{}
construct.  Remember that bracketing delimiters count nesting level.
Missing the leading C<$> from variable C<$s> may cause this error.

=item substr outside of string

(W substr)(F) You tried to reference a substr() that pointed outside of
a string.  That is, the absolute value of the offset was larger than the
length of the string.  See L<perlfunc/substr>.  This warning is fatal if
substr is used in an lvalue context (as the left hand side of an
assignment or as a subroutine argument for example).
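
One possible guard, with illustrative values, is to compare the offset
with the string length before calling substr():

    my $s      = "hello";
    my $offset = 10;
    my $tail   = $offset <= length($s) ? substr($s, $offset) : "";
    print "tail is empty\n" if $tail eq "";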

=item sv_upgrade from type %d down to type %d

(P) Perl tried to force the upgrade of an SV to a type which was actually
inferior to its current type.

=item SWASHNEW didn't return an HV ref

(P) Something went wrong internally when Perl was trying to look up
Unicode characters.

=item Switch (?(condition)... contains too many branches in regex; marked by 
S<<-- HERE> in m/%s/

(F) A (?(condition)if-clause|else-clause) construct can have at most
two branches (the if-clause and the else-clause).  If you want one or
both to contain alternation, such as using C<this|that|other>, enclose
it in clustering parentheses:

    (?(condition)(?:this|that|other)|else-clause)

The S<<-- HERE> shows whereabouts in the regular expression the problem
was discovered.  See L<perlre>.

=item Switch condition not recognized in regex; marked by S<<-- HERE> in
m/%s/

(F) The condition part of a (?(condition)if-clause|else-clause) construct
is not known.  The condition must be one of the following:

 (1) (2) ...        true if 1st, 2nd, etc., capture matched
 (<NAME>) ('NAME')  true if named capture matched
 (?=...) (?<=...)   true if subpattern matches
 (?!...) (?<!...)   true if subpattern fails to match
 (?{ CODE })        true if code returns a true value
 (R)                true if evaluating inside recursion
 (R1) (R2) ...      true if directly inside capture group 1, 2, etc.
 (R&NAME)           true if directly inside named capture
 (DEFINE)           always false; for defining named subpatterns

The S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.  See L<perlre>.
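
As an illustration of the first kind of condition, this hypothetical
pattern requires a closing parenthesis only if an opening one was
captured by group 1:

    my $re = qr/^(\()?x(?(1)\))$/;
    for my $s ("(x)", "x", "(x") {
        printf "%-4s %s\n", $s, ($s =~ $re ? "matches" : "does not match");
    }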

=item Switch (?(condition)... not terminated in regex; marked by
S<<-- HERE> in m/%s/

(F) You omitted to close a (?(condition)...) block somewhere
in the pattern.  Add a closing parenthesis in the appropriate
position.  See L<perlre>.

=item switching effective %s is not implemented

(F) While under the C<use filetest> pragma, we cannot switch the real
and effective uids or gids.

=item syntax error

(F) Probably means you had a syntax error.  Common reasons include:

    A keyword is misspelled.
    A semicolon is missing.
    A comma is missing.
    An opening or closing parenthesis is missing.
    An opening or closing brace is missing.
    A closing quote is missing.

Often there will be another error message associated with the syntax
error giving more information.  (Sometimes it helps to turn on B<-w>.)
The error message itself often tells you where it was in the line when
it decided to give up.  Sometimes the actual error is several tokens
before this, because Perl is good at understanding random input.
Occasionally the line number may be misleading, and once in a blue moon
the only way to figure out what's triggering the error is to call
C<perl -c> repeatedly, chopping away half the program each time to see
if the error went away.  Sort of the cybernetic version of S<20 questions>.

=item syntax error at line %d: '%s' unexpected

(A) You've accidentally run your script through the Bourne shell instead
of Perl.  Check the #! line, or manually feed your script into Perl
yourself.

=item syntax error in file %s at line %d, next 2 tokens "%s"

(F) This error is likely to occur if you run a perl5 script through
a perl4 interpreter, especially if the next 2 tokens are "use strict"
or "my $var" or "our $var".

=item Syntax error in (?[...]) in regex; marked by S<<-- HERE> in m/%s/

(F) Perl could not figure out what you meant inside this construct; this
notifies you that it is giving up trying.

=item %s syntax OK

(F) The final summary message when a C<perl -c> succeeds.

=item sysread() on closed filehandle %s

(W closed) You tried to read from a closed filehandle.

=item sysread() on unopened filehandle %s

(W unopened) You tried to read from a filehandle that was never opened.

=item System V %s is not implemented on this machine

(F) You tried to do something with a function beginning with "sem",
"shm", or "msg", but System V IPC is not implemented on your
machine.  On some machines the functionality can exist but be
unconfigured.  Consult your system support.

=item syswrite() on closed filehandle %s

(W closed) The filehandle you're writing to got itself closed sometime
before now.  Check your control flow.

=item C<-T> and C<-B> not implemented on filehandles

(F) Perl can't peek at the stdio buffer of filehandles when it doesn't
know about your kind of stdio.  You'll have to use a filename instead.

=item Target of goto is too deeply nested

(F) You tried to use C<goto> to reach a label that was too deeply nested
for Perl to reach.  Perl is doing you a favor by refusing.

=item telldir() attempted on invalid dirhandle %s

(W io) The dirhandle you tried to telldir() is either closed or not really
a dirhandle.  Check your control flow.

=item tell() on unopened filehandle

(W unopened) You tried to use the tell() function on a filehandle that
was either never opened or has since been closed.

=item That use of $[ is unsupported

(F) Assignment to C<$[> is now strictly circumscribed, and interpreted
as a compiler directive.  You may say only one of

    $[ = 0;
    $[ = 1;
    ...
    local $[ = 0;
    local $[ = 1;
    ...

This is to prevent the problem of one module changing the array base out
from under another module inadvertently.  See L<perlvar/$[> and L<arybase>.

=item The bitwise feature is experimental

(S experimental::bitwise) This warning is emitted if you use bitwise
operators (C<& | ^ ~ &. |. ^. ~.>) with the "bitwise" feature enabled.
Simply suppress the warning if you want to use the feature, but know
that in doing so you are taking the risk of using an experimental
feature which may change or be removed in a future Perl version:

    no warnings "experimental::bitwise";
    use feature "bitwise";
    $x |.= $y;

=item The crypt() function is unimplemented due to excessive paranoia.

(F) Configure couldn't find the crypt() function on your machine,
probably because your vendor didn't supply it, probably because they
think the U.S. Government thinks it's a secret, or at least that they
will continue to pretend that it is.  And if you quote me on that, I
will deny it.

=item The experimental declared_refs feature is not enabled

(F) To declare references to variables, as in C<my \%x>, you must first enable
the feature:

    no warnings "experimental::declared_refs";
    use feature "declared_refs";

=item The %s function is unimplemented

(F) The function indicated isn't implemented on this architecture,
according to the probings of Configure.

=item The regex_sets feature is experimental

(S experimental::regex_sets) This warning is emitted if you
use the syntax S<C<(?[   ])>> in a regular expression.
The details of this feature are subject to change.
If you want to use it, but know that in doing so you
are taking the risk of using an experimental feature which may
change in a future Perl version, you can do this to silence the
warning:

    no warnings "experimental::regex_sets";

=item The signatures feature is experimental

(S experimental::signatures) This warning is emitted if you unwrap a
subroutine's arguments using a signature.  Simply suppress the warning
if you want to use the feature, but know that in doing so you are taking
the risk of using an experimental feature which may change or be removed
in a future Perl version:

    no warnings "experimental::signatures";
    use feature "signatures";
    sub foo ($left, $right) { ... }

=item The stat preceding %s wasn't an lstat

(F) It makes no sense to test the current stat buffer for symbolic
linkhood if the last stat that wrote to the stat buffer already went
past the symlink to get to the real file.  Use an actual filename
instead.
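
As a minimal sketch of the safe pattern, assuming a hypothetical path
in C<$path>: call lstat() explicitly, so that the buffer consulted by
C<-l _> describes the link itself rather than the file it points to.

    my $path = "/tmp/example-link";    # hypothetical path
    if (lstat $path) {                 # fills the stat buffer without following links
        print "$path is a symlink\n"   if -l _;
        print "$path is a directory\n" if -d _;
    }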

=item The 'unique' attribute may only be applied to 'our' variables

(F) This attribute was never supported on C<my> or C<sub> declarations.

=item This Perl can't reset CRTL environ elements (%s)

=item This Perl can't set CRTL environ elements (%s=%s)

(W internal) Warnings peculiar to VMS.  You tried to change or delete an
element of the CRTL's internal environ array, but your copy of Perl
wasn't built with a CRTL that contained the setenv() function.  You'll
need to rebuild Perl with a CRTL that does, or redefine
F<PERL_ENV_TABLES> (see L<perlvms>) so that the environ array isn't the
target of the change to %ENV which produced the warning.

=item This Perl has not been built with support for randomized hash key traversal but something called Perl_hv_rand_set().

(F) Something has attempted to use an internal API call which
depends on Perl being compiled with the default support for randomized hash
key traversal, but this Perl has been compiled without it.  You should
report this warning to the relevant upstream party, or recompile perl
with default options.

=item times not implemented

(F) Your version of the C library apparently doesn't do times().  I
suspect you're not running on Unix.

=item "-T" is on the #! line, it must also be used on the command line

(X) The #! line (or local equivalent) in a Perl script contains
the B<-T> option (or the B<-t> option), but Perl was not invoked with
B<-T> in its command line.  This is an error because, by the time
Perl discovers a B<-T> in a script, it's too late to properly taint
everything from the environment.  So Perl gives up.

If the Perl script is being executed as a command using the #!
mechanism (or its local equivalent), this error can usually be
fixed by editing the #! line so that the B<-%c> option is a part of
Perl's first argument: e.g. change C<perl -n -%c> to C<perl -%c -n>.

If the Perl script is being executed as C<perl scriptname>, then the
B<-%c> option must appear on the command line: C<perl -%c scriptname>.

=item To%s: illegal mapping '%s'

(F) You tried to define a customized To-mapping for lc(), lcfirst(),
uc(), or ucfirst() (or their string-inlined versions), but you
specified an illegal mapping.
See L<perlunicode/"User-Defined Character Properties">.

=item Too deeply nested ()-groups

(F) Your template contains ()-groups with a ridiculously deep nesting level.

=item Too few args to syscall

(F) There has to be at least one argument to syscall() to specify the
system call to call, silly dilly.

=item Too few arguments for subroutine '%s'

(F) A subroutine using a signature received fewer arguments than
required by the signature.  The caller of the subroutine is presumably
at fault.

The message attempts to include the name of the called subroutine. If the
subroutine has been aliased, the subroutine's original name will be shown,
regardless of what name the caller used.
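
The following sketch shows one way this error can arise and be trapped.
The subroutine C<add> is hypothetical, and the example assumes a perl
recent enough to provide the "signatures" feature.

    use strict;
    use warnings;
    use feature "signatures";
    no warnings "experimental::signatures";

    sub add ($left, $right) { return $left + $right }   # hypothetical sub

    my $sum = eval { add(1) };      # one argument short, so add() dies
    print "call failed: $@" unless defined $sum;

Giving C<$right> a default value in the signature, as in
C<($left, $right = 0)>, would make the second argument optional.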

=item Too late for "-%s" option

(X) The #! line (or local equivalent) in a Perl script contains the
B<-M>, B<-m> or B<-C> option.

In the case of B<-M> and B<-m>, this is an error because those options
are not intended for use inside scripts.  Use the C<use> pragma instead.

The B<-C> option only works if it is specified on the command line as
well (with the same sequence of letters or numbers following).  Either
specify this option on the command line, or, if your system supports
it, make your script executable and run it directly instead of passing
it to perl.

=item Too late to run %s block

(W void) A CHECK or INIT block is being defined during run time proper,
when the opportunity to run them has already passed.  Perhaps you are
loading a file with C<require> or C<do> when you should be using C<use>
instead.  Or perhaps you should put the C<require> or C<do> inside a
BEGIN block.

=item Too many args to syscall

(F) Perl supports a maximum of only 14 args to syscall().

=item Too many arguments for %s

(F) The function requires fewer arguments than you specified.

=item Too many arguments for subroutine '%s'

(F) A subroutine using a signature received more arguments than
required by the signature.  The caller of the subroutine is presumably
at fault.

The message attempts to include the name of the called subroutine. If the
subroutine has been aliased, the subroutine's original name will be shown,
regardless of what name the caller used.

=item Too many )'s

(A) You've accidentally run your script through B<csh> instead of Perl.
Check the #! line, or manually feed your script into Perl yourself.

=item Too many ('s

(A) You've accidentally run your script through B<csh> instead of Perl.
Check the #! line, or manually feed your script into Perl yourself.

=item Trailing \ in regex m/%s/

(F) The regular expression ends with an unbackslashed backslash.
Backslash it.  See L<perlre>.

=item Transliteration pattern not terminated

(F) The lexer couldn't find the interior delimiter of a tr/// or tr[][]
or y/// or y[][] construct.  Missing the leading C<$> from variables
C<$tr> or C<$y> may cause this error.

=item Transliteration replacement not terminated

(F) The lexer couldn't find the final delimiter of a tr///, tr[][],
y/// or y[][] construct.

=item '%s' trapped by operation mask

(F) You tried to use an operator from a Safe compartment in which it's
disallowed.  See L<Safe>.

=item truncate not implemented

(F) Your machine doesn't implement a file truncation mechanism that
Configure knows about.

=item Type of arg %d to &CORE::%s must be %s

(F) The subroutine in question in the CORE package requires its argument
to be a hard reference to data of the specified type.  Overloading is
ignored, so a reference to an object that is not the specified type, but
nonetheless has overloading to handle it, will still not be accepted.

=item Type of arg %d to %s must be %s (not %s)

(F) This function requires the argument in that position to be of a
certain type.  Arrays must be @NAME or C<@{EXPR}>.  Hashes must be
%NAME or C<%{EXPR}>.  No implicit dereferencing is allowed--use the
{EXPR} forms as an explicit dereference.  See L<perlref>.

=item umask not implemented

(F) Your machine doesn't implement the umask function and you tried to
use it to restrict permissions for yourself (EXPR & 0700).

=item Unbalanced context: %d more PUSHes than POPs

(S internal) The exit code detected an internal inconsistency in how
many execution contexts were entered and left.

=item Unbalanced saves: %d more saves than restores

(S internal) The exit code detected an internal inconsistency in how
many values were temporarily localized.

=item Unbalanced scopes: %d more ENTERs than LEAVEs

(S internal) The exit code detected an internal inconsistency in how
many blocks were entered and left.

=item Unbalanced string table refcount: (%d) for "%s"

(S internal) On exit, Perl found some strings remaining in the shared
string table used for copy on write and for hash keys.  The entries
should have been freed, so this indicates a bug somewhere.

=item Unbalanced tmps: %d more allocs than frees

(S internal) The exit code detected an internal inconsistency in how
many mortal scalars were allocated and freed.

=item Undefined format "%s" called

(F) The format indicated doesn't seem to exist.  Perhaps it's really in
another package?  See L<perlform>.

=item Undefined sort subroutine "%s" called

(F) The sort comparison routine specified doesn't seem to exist.
Perhaps it's in a different package?  See L<perlfunc/sort>.

=item Undefined subroutine &%s called

(F) The subroutine indicated hasn't been defined, or if it was, it has
since been undefined.
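
When the name of the subroutine is only known at run time, it may help
to check for it before calling.  A minimal sketch, in which the name
C<frobnicate> is purely hypothetical:

    use strict;
    use warnings;

    my $name = "frobnicate";                 # hypothetical sub name
    if (my $code = main->can($name)) {
        $code->();
    }
    else {
        warn "no such subroutine: $name\n";
    }

    # Alternatively, trap the fatal error itself.
    my $ok = eval { no strict "refs"; &{"main::$name"}(); 1 };
    print "trapped: $@" unless $ok;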

=item Undefined subroutine called

(F) The anonymous subroutine you're trying to call hasn't been defined,
or if it was, it has since been undefined.

=item Undefined subroutine in sort

(F) The sort comparison routine specified is declared but doesn't seem
to have been defined yet.  See L<perlfunc/sort>.

=item Undefined top format "%s" called

(F) The format indicated doesn't seem to exist.  Perhaps it's really in
another package?  See L<perlform>.

=item Undefined value assigned to typeglob

(W misc) An undefined value was assigned to a typeglob, a la
C<*foo = undef>.  This does nothing.  It's possible that you really mean
C<undef *foo>.

=item %s: Undefined variable

(A) You've accidentally run your script through B<csh> instead of Perl.
Check the #! line, or manually feed your script into Perl yourself.

=item Unescaped left brace in regex is deprecated here (and will be fatal in Perl 5.30), passed through in regex; marked by S<<-- HERE> in m/%s/

(D deprecated, regexp)  The simple rule to remember, if you want to
match a literal C<{> character (U+007B C<LEFT CURLY BRACKET>) in a
regular expression pattern, is to escape each literal instance of it in
some way.  Generally easiest is to precede it with a backslash, like
C<\{> or enclose it in square brackets (C<[{]>).  If the pattern
delimiters are also braces, any matching right brace (C<}>) should
also be escaped to avoid confusing the parser, for example,

 qr{abc\{def\}ghi}

Forcing literal C<{> characters to be escaped will enable the Perl
language to be extended in various ways in future releases.  To avoid
needlessly breaking existing code, the restriction is not enforced in
contexts where there are unlikely to ever be extensions that could
conflict with the use there of C<{> as a literal.

In this release of Perl, some literal uses of C<{> are fatal, and some
still just deprecated.  This is because of an oversight:  some uses of a
literal C<{> that should have raised a deprecation warning starting in
v5.20 did not warn until v5.26.  By making the already-warned uses fatal
now, some of the planned extensions can be made to the language sooner.
The cases which are still allowed will be fatal in Perl 5.30.

The contexts where no warnings or errors are raised are:

=over 4

=item *

as the first character in a pattern, or following C<^> indicating to
anchor the match to the beginning of a line.

=item *

as the first character following a C<|> indicating alternation.

=item *

as the first character in a parenthesized grouping like

 /foo({bar)/
 /foo(?:{bar)/

=item *

as the first character following a quantifier

 /\s*{/

=back

=for comment
The text of the message above is duplicated below to allow splain (and
'use diagnostics') to work.  Since one is fatal, and one not, they can't
be combined as one message.  And since the non-fatal one is temporary,
there's no real need to enhance perldiag to handle this transient case.

=item Unescaped left brace in regex is illegal here in regex;
marked by S<<-- HERE> in m/%s/

(F) The simple rule to remember, if you want to
match a literal C<"{"> character (U+007B C<LEFT CURLY BRACKET>) in a
regular expression pattern, is to escape each literal instance of it in
some way.  Generally easiest is to precede it with a backslash, like
C<"\{"> or enclose it in square brackets (C<"[{]">).  If the pattern
delimiters are also braces, any matching right brace (C<"}">) should
also be escaped to avoid confusing the parser, for example,

 qr{abc\{def\}ghi}

Forcing literal C<"{"> characters to be escaped will enable the Perl
language to be extended in various ways in future releases.  To avoid
needlessly breaking existing code, the restriction is not enforced in
contexts where there are unlikely to ever be extensions that could
conflict with the use there of C<"{"> as a literal.

In this release of Perl, some literal uses of C<"{"> are fatal, and some
still just deprecated.  This is because of an oversight:  some uses of a
literal C<"{"> that should have raised a deprecation warning starting in
v5.20 did not warn until v5.26.  By making the already-warned uses fatal
now, some of the planned extensions can be made to the language sooner.

The contexts where no warnings or errors are raised are:

=over 4

=item *

as the first character in a pattern, or following C<"^"> indicating to
anchor the match to the beginning of a line.

=item *

as the first character following a C<"|"> indicating alternation.

=item *

as the first character in a parenthesized grouping like

 /foo({bar)/
 /foo(?:{bar)/

=item *

as the first character following a quantifier

 /\s*{/

=back

=item Unescaped literal '%c' in regex; marked by <-- HERE in m/%s/

(W regexp) (only under C<S<use re 'strict'>>)

Within the scope of C<S<use re 'strict'>> in a regular expression
pattern, you included an unescaped C<}> or C<]> which was interpreted
literally.  These two characters are sometimes metacharacters, and
sometimes literals, depending on what precedes them in the
pattern.  This is unlike the similar C<)> which is always a
metacharacter unless escaped.

This action at a distance, perhaps a large distance, can lead to Perl
silently misinterpreting what you meant, so when you specify that you
want extra checking by C<S<use re 'strict'>>, this warning is generated.
If you meant the character as a literal, simply confirm that to Perl by
preceding the character with a backslash, or make it into a bracketed
character class (like C<[}]>).  If you meant it as closing a
corresponding C<[> or C<{>, you'll need to look back through the pattern
to find out why that isn't happening.

=item unexec of %s into %s failed!

(F) The unexec() routine failed for some reason.  See your local FSF
representative, who probably put it there in the first place.

=item Unexpected ']' with no following ')' in (?[... in regex; marked by <-- HERE in m/%s/

(F) While parsing an extended character class a ']' character was encountered
at a point in the definition where the only legal use of ']' is to close the
character class definition as part of a '])'.  You may have forgotten the close
paren, or otherwise confused the parser.

=item Expecting close paren for nested extended charclass in regex; marked by <-- HERE in m/%s/

(F) While parsing a nested extended character class like:

    (?[ ... (?flags:(?[ ... ])) ... ])
                             ^

we expected to see a close paren ')' (marked by ^) but did not.

=item Expecting close paren for wrapper for nested extended charclass in regex; marked by <-- HERE in m/%s/

(F) While parsing a nested extended character class like:

    (?[ ... (?flags:(?[ ... ])) ... ])
                              ^

we expected to see a close paren ')' (marked by ^) but did not.

=item Unexpected binary operator '%c' with no preceding operand in regex;
marked by S<<-- HERE> in m/%s/

(F) You had something like this:

 (?[ | \p{Digit} ])

where the C<"|"> is a binary operator with an operand on the right, but
no operand on the left.

=item Unexpected character in regex; marked by S<<-- HERE> in m/%s/

(F) You had something like this:

 (?[ z ])

Within C<(?[ ])>, no literal characters are allowed unless they are
within an inner pair of square brackets, like

 (?[ [ z ] ])

Another possibility is that you forgot a backslash.  Perl isn't smart
enough to figure out what you really meant.

=item Unexpected constant lvalue entersub entry via type/targ %d:%d

(P) When compiling a subroutine call in lvalue context, Perl failed an
internal consistency check.  It encountered a malformed op tree.

=item Unexpected exit %u

(S) exit() was called or the script otherwise finished gracefully when
C<PERL_EXIT_WARN> was set in C<PL_exit_flags>.

=item Unexpected exit failure %d

(S) An uncaught die() was called when C<PERL_EXIT_WARN> was set in
C<PL_exit_flags>.

=item Unexpected ')' in regex; marked by S<<-- HERE> in m/%s/

(F) You had something like this:

 (?[ ( \p{Digit} + ) ])

The C<")"> is out-of-place.  Something apparently was supposed to
be combined with the digits, or the C<"+"> shouldn't be there, or
something like that.  Perl can't figure out what was intended.

=item Unexpected '(' with no preceding operator in regex; marked by
S<<-- HERE> in m/%s/

(F) You had something like this:

 (?[ \p{Digit} ( \p{Lao} + \p{Thai} ) ])

There should be an operator before the C<"(">, as there's
no indication as to how the digits are to be combined
with the characters in the Lao and Thai scripts.

=item Unicode non-character U+%X is not recommended for open interchange

(S nonchar) Certain codepoints, such as U+FFFE and U+FFFF, are
defined by the Unicode standard to be non-characters.  Those
are legal codepoints, but are reserved for internal use; so,
applications shouldn't attempt to exchange them.  An application
may not be expecting any of these characters at all, and receiving
them may lead to bugs.  If you know what you are doing you can
turn off this warning by C<no warnings 'nonchar';>.

This is not really a "severe" error, but it is supposed to be
raised by default even if warnings are not enabled, and currently
the only way to do that in Perl is to mark it as serious.

=item Unicode surrogate U+%X is illegal in UTF-8

(S surrogate) You had a UTF-16 surrogate in a context where surrogates are
not considered acceptable.  These code points, between U+D800 and
U+DFFF (inclusive), are used by Unicode only for UTF-16.  However, Perl
internally allows all unsigned integer code points (up to the size limit
available on your platform), including surrogates.  But these can cause
problems when being input or output, which is likely where this message
came from.  If you really really know what you are doing you can turn
off this warning by C<no warnings 'surrogate';>.

=item Unknown charname '%s'

(F) The name you used inside C<\N{}> is unknown to Perl.  Check the
spelling.  You can say C<use charnames ":loose"> to not have to be
so precise about spaces, hyphens, and capitalization on standard Unicode
names.  (Any custom aliases that have been created must be specified
exactly, regardless of whether C<:loose> is used or not.)  This error may
also happen if the C<\N{}> is not in the scope of the corresponding
C<S<use charnames>>.
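
A short sketch of loose matching, assuming the character wanted is
GREEK SMALL LETTER ALPHA:

    use charnames ":loose";

    # Loose matching is not fussy about case or spacing in standard names.
    my $alpha = "\N{greek small letter alpha}";
    printf "U+%04X\n", ord $alpha;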

=item Unknown charname '' is deprecated. Its use will be fatal in Perl 5.28

(D deprecated) You had a C<\N{}> with nothing between the braces.  This
usage was deprecated in Perl 5.24, and will be made a syntax error in
Perl 5.28.

=item Unknown error

(P) Perl was about to print an error message in C<$@>, but the C<$@> variable
did not exist, even after an attempt to create it.

=item Unknown open() mode '%s'

(F) The second argument of 3-argument open() is not among the list
of valid modes: C<< < >>, C<< > >>, C<<< >> >>>, C<< +< >>,
C<< +> >>, C<<< +>> >>>, C<-|>, C<|->, C<< <& >>, C<< >& >>.
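
For illustration only, the sketch below passes the made-up mode string
C<read>, traps the resulting error, and then opens the same hypothetical
file with a valid mode:

    use strict;
    use warnings;

    my $file = "example.txt";                 # hypothetical file name

    # "read" is not a valid mode, so open() dies here.
    my $ok = eval { open(my $fh, "read", $file); 1 };
    print "trapped: $@" unless $ok;

    open(my $in, "<", $file) or warn "cannot read $file: $!\n";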

=item Unknown PerlIO layer "%s"

(W layer) An attempt was made to push an unknown layer onto the Perl I/O
system.  (Layers take care of transforming data between external and
internal representations.)  Note that some layers, such as C<mmap>,
are not supported in all environments.  If your program didn't
explicitly request the failing operation, it may be the result of the
value of the environment variable PERLIO.

=item Unknown process %x sent message to prime_env_iter: %s

(P) An error peculiar to VMS.  Perl was reading values for %ENV before
iterating over it, and someone else stuck a message in the stream of
data Perl expected.  Someone's very confused, or perhaps trying to
subvert Perl's population of %ENV for nefarious purposes.

=item Unknown regex modifier "%s"

(F) Alphanumerics immediately following the closing delimiter
of a regular expression pattern are interpreted by Perl as modifier
flags for the regex.  One of the ones you specified is invalid.  One way
this can happen is if you didn't put in white space between the end of
the regex and a following alphanumeric operator:

 if ($a =~ /foo/and $bar == 3) { ... }

The C<"a"> is a valid modifier flag, but the C<"n"> is not, and raises
this error.  Likely what was meant instead was:

 if ($a =~ /foo/ and $bar == 3) { ... }

=item Unknown "re" subpragma '%s' (known ones are: %s)

(W) You tried to use an unknown subpragma of the "re" pragma.

=item Unknown switch condition (?(...)) in regex; marked by S<<-- HERE> in
m/%s/

(F) The condition part of a (?(condition)if-clause|else-clause) construct
is not known.  The condition must be one of the following:

 (1) (2) ...        true if 1st, 2nd, etc., capture matched
 (<NAME>) ('NAME')  true if named capture matched
 (?=...) (?<=...)   true if subpattern matches
 (?!...) (?<!...)   true if subpattern fails to match
 (?{ CODE })        true if code returns a true value
 (R)                true if evaluating inside recursion
 (R1) (R2) ...      true if directly inside capture group 1, 2, etc.
 (R&NAME)           true if directly inside named capture
 (DEFINE)           always false; for defining named subpatterns

The S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.  See L<perlre>.

=item Unknown Unicode option letter '%c'

(F) You specified an unknown Unicode option.  See L<perlrun> documentation
of the C<-C> switch for the list of known options.

=item Unknown Unicode option value %d

(F) You specified an unknown Unicode option.  See L<perlrun> documentation
of the C<-C> switch for the list of known options.

=item Unknown verb pattern '%s' in regex; marked by S<<-- HERE> in m/%s/

(F) You either made a typo or have incorrectly put a C<*> quantifier
after an open brace in your pattern.  Check the pattern and review
L<perlre> for details on legal verb patterns.

=item Unknown warnings category '%s'

(F) An error issued by the C<warnings> pragma.  You specified a warnings
category that is unknown to perl at this point.

Note that if you want to enable a warnings category registered by a
module (e.g. C<use warnings 'File::Find'>), you must have loaded this
module first.
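
For example, with File::Find the order of the two statements matters:

    use File::Find;               # loading the module registers its category
    no warnings "File::Find";     # accepted only because the module is loaded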

=item Unmatched [ in regex; marked by S<<-- HERE> in m/%s/

(F) The brackets around a character class must match.  If you wish to
include a closing bracket in a character class, backslash it or put it
first.  The S<<-- HERE> shows whereabouts in the regular expression the
problem was discovered.  See L<perlre>.

=item Unmatched ( in regex; marked by S<<-- HERE> in m/%s/

=item Unmatched ) in regex; marked by S<<-- HERE> in m/%s/

(F) Unbackslashed parentheses must always be balanced in regular
expressions.  If you're a vi user, the % key is valuable for finding
the matching parenthesis.  The S<<-- HERE> shows whereabouts in the
regular expression the problem was discovered.  See L<perlre>.

=item Unmatched right %s bracket

(F) The lexer counted more closing curly or square brackets than opening
ones, so you're probably missing a matching opening bracket.  As a
general rule, you'll find the missing one (so to speak) near the place
you were last editing.

=item Unquoted string "%s" may clash with future reserved word

(W reserved) You used a bareword that might someday be claimed as a
reserved word.  It's best to put such a word in quotes, or capitalize it
somehow, or insert an underbar into it.  You might also declare it as a
subroutine.

=item Unrecognized character %s; marked by S<<-- HERE> after %s near column
%d

(F) The Perl parser has no idea what to do with the specified character
in your Perl script (or eval) near the specified column.  Perhaps you
tried to run a compressed script, a binary program, or a directory as
a Perl program.

=item Unrecognized escape \%c in character class in regex; marked by
S<<-- HERE> in m/%s/

(F) You used a backslash-character combination which is not
recognized by Perl inside character classes.  This is a fatal
error when the character class is used within C<(?[ ])>.

=item Unrecognized escape \%c in character class passed through in regex; 
marked by S<<-- HERE> in m/%s/

(W regexp) You used a backslash-character combination which is not
recognized by Perl inside character classes.  The character was
understood literally, but this may change in a future version of Perl.
The S<<-- HERE> shows whereabouts in the regular expression the
escape was discovered.

=item Unrecognized escape \%c passed through

(W misc) You used a backslash-character combination which is not
recognized by Perl.  The character was understood literally, but this may
change in a future version of Perl.

=item Unrecognized escape \%s passed through in regex; marked by
S<<-- HERE> in m/%s/

(W regexp) You used a backslash-character combination which is not
recognized by Perl.  The character(s) were understood literally, but
this may change in a future version of Perl.  The S<<-- HERE> shows
whereabouts in the regular expression the escape was discovered.

=item Unrecognized signal name "%s"

(F) You specified a signal name to the kill() function that was not
recognized.  Say C<kill -l> in your shell to see the valid signal names
on your system.
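
From within Perl, the names known to your build are also listed in
C<$Config{sig_name}>.  A rough sketch that checks a name before use,
and shows the error being trapped; both signal names are arbitrary
choices made for the example:

    use strict;
    use warnings;
    use Config;

    my %known = map { $_ => 1 } split ' ', $Config{sig_name};
    my $sig = "TERM";                        # arbitrary example signal
    die "unknown signal $sig\n" unless $known{$sig};

    # An unknown name is a fatal error, so it can be trapped with eval.
    my $ok = eval { kill "NOSUCHSIGNAL", $$; 1 };
    print "trapped: $@" unless $ok;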

=item Unrecognized switch: -%s  (-h will show valid options)

(F) You specified an illegal option to Perl.  Don't do that.  (If you
think you didn't do that, check the #! line to see if it's supplying the
bad switch on your behalf.)

=item Unsuccessful %s on filename containing newline

(W newline) A file operation was attempted on a filename, and that
operation failed, PROBABLY because the filename contained a newline,
PROBABLY because you forgot to chomp() it off.  See L<perlfunc/chomp>.
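
A minimal sketch of the usual fix, assuming the filename arrived with
its line ending still attached (as it would when read from a file or
from STDIN):

    use strict;
    use warnings;

    my $line = "example.txt\n";          # hypothetical input line
    chomp(my $file = $line);             # strip the trailing newline first
    if (open(my $fh, "<", $file)) {
        close $fh;
    }
    else {
        warn "cannot open '$file': $!\n";
    }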

=item Unsupported directory function "%s" called

(F) Your machine doesn't support opendir() and readdir().

=item Unsupported function %s

(F) This machine doesn't implement the indicated function, apparently.
At least, Configure doesn't think so.

=item Unsupported function fork

(F) Your version of the Perl executable does not support forking.

Note that under some systems, like OS/2, there may be different flavors
of Perl executables, some of which may support fork, some not.  Try
changing the name you call Perl by to C<perl_>, C<perl__>, and so on.

=item Unsupported script encoding %s

(F) Your program file begins with a Unicode Byte Order Mark (BOM) which
declares it to be in a Unicode encoding that Perl cannot read.

=item Unsupported socket function "%s" called

(F) Your machine doesn't support the Berkeley socket mechanism, or at
least that's what Configure thought.

=item Unterminated attribute list

(F) The lexer found something other than a simple identifier at the
start of an attribute, and it wasn't a semicolon or the start of a
block.  Perhaps you terminated the parameter list of the previous
attribute too soon.  See L<attributes>.

=item Unterminated attribute parameter in attribute list

(F) The lexer saw an opening (left) parenthesis character while parsing
an attribute list, but the matching closing (right) parenthesis
character was not found.  You may need to add (or remove) a backslash
character to get your parentheses to balance.  See L<attributes>.

=item Unterminated compressed integer

(F) An argument to unpack("w",...) was incompatible with the BER
compressed integer format and could not be converted to an integer.
See L<perlfunc/pack>.

=item Unterminated delimiter for here document

(F) This message occurs when a here document label has an initial
quotation mark but the final quotation mark is missing.  Perhaps
you wrote:

    <<"foo

instead of:

    <<"foo"

=item Unterminated \g... pattern in regex; marked by S<<-- HERE> in m/%s/

=item Unterminated \g{...} pattern in regex; marked by S<<-- HERE> in m/%s/

(F) In a regular expression, you had a C<\g> that wasn't followed by a
proper group reference.  In the case of C<\g{>, the closing brace is
missing; otherwise the C<\g> must be followed by an integer.  Fix the
pattern and retry.

=item Unterminated <> operator

(F) The lexer saw a left angle bracket in a place where it was expecting
a term, so it's looking for the corresponding right angle bracket, and
not finding it.  Chances are you left some needed parentheses out
earlier in the line, and you really meant a "less than".

=item Unterminated verb pattern argument in regex; marked by S<<-- HERE> in
m/%s/

(F) You used a pattern of the form C<(*VERB:ARG)> but did not terminate
the pattern with a C<)>.  Fix the pattern and retry.

=item Unterminated verb pattern in regex; marked by S<<-- HERE> in m/%s/

(F) You used a pattern of the form C<(*VERB)> but did not terminate
the pattern with a C<)>.  Fix the pattern and retry.

=item untie attempted while %d inner references still exist

(W untie) A copy of the object returned from C<tie> (or C<tied>) was
still valid when C<untie> was called.

=item Usage: POSIX::%s(%s)

(F) You called a POSIX function with incorrect arguments.
See L<POSIX/FUNCTIONS> for more information.

=item Usage: Win32::%s(%s)

(F) You called a Win32 function with incorrect arguments.
See L<Win32> for more information.

=item $[ used in %s (did you mean $] ?)

(W syntax) You used C<$[> in a comparison, such as:

    if ($[ > 5.006) {
	...
    }

You probably meant to use C<$]> instead.  C<$[> is the base for indexing
arrays.  C<$]> is the Perl version number in decimal.

=item Use "%s" instead of "%s"

(F) The second listed construct is no longer legal.  Use the first one
instead.

=item Useless assignment to a temporary

(W misc) You assigned to an lvalue subroutine, but what
the subroutine returned was a temporary scalar about to
be discarded, so the assignment had no effect.

=item Useless (?-%s) - don't use /%s modifier in regex; marked by
S<<-- HERE> in m/%s/

(W regexp) You have used an internal modifier such as (?-o) that has no
meaning unless removed from the entire regexp:

    if ($string =~ /(?-o)$pattern/o) { ... }

must be written as

    if ($string =~ /$pattern/) { ... }

The S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.  See L<perlre>.

=item Useless localization of %s

(W syntax) The localization of lvalues such as C<local($x=10)> is legal,
but in fact the local() currently has no effect.  This may change at
some point in the future, but in the meantime such code is discouraged.

=item Useless (?%s) - use /%s modifier in regex; marked by S<<-- HERE> in
m/%s/

(W regexp) You have used an internal modifier such as (?o) that has no
meaning unless applied to the entire regexp:

    if ($string =~ /(?o)$pattern/) { ... }

must be written as

    if ($string =~ /$pattern/o) { ... }

The S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.  See L<perlre>.

=item Useless use of attribute "const"

(W misc) The C<const> attribute has no effect except
on anonymous closure prototypes.  You applied it to
a subroutine via L<attributes.pm|attributes>.  This is only useful
inside an attribute handler for an anonymous subroutine.

=item Useless use of /d modifier in transliteration operator

(W misc) You have used the /d modifier where the searchlist has the
same length as the replacelist.  See L<perlop> for more information
about the /d modifier.

=item Useless use of \E

(W misc) You have a \E in a double-quotish string without a C<\U>,
C<\L> or C<\Q> preceding it.

=item Useless use of greediness modifier '%c' in regex; marked by S<<-- HERE> in m/%s/

(W regexp) You specified something like these:

 qr/a{3}?/
 qr/b{1,1}+/

The C<"?"> and C<"+"> don't have any effect, as they modify whether to
match more or fewer when there is a choice, and by specifying to match
exactly a given number, there is no room left for a choice.

=item Useless use of %s in void context

(W void) You did something without a side effect in a context that does
nothing with the return value, such as a statement that doesn't return a
value from a block, or the left side of a scalar comma operator.  Very
often this points not to stupidity on your part, but a failure of Perl
to parse your program the way you thought it would.  For example, you'd
get this if you mixed up your C precedence with Python precedence and
said

    $one, $two = 1, 2;

when you meant to say

    ($one, $two) = (1, 2);

Another common error is to use ordinary parentheses to construct a list
reference when you should be using square or curly brackets, for
example, if you say

    $array = (1,2);

when you should have said

    $array = [1,2];

The square brackets explicitly turn a list value into a scalar value,
while parentheses do not.  So when a parenthesized list is evaluated in
a scalar context, the comma is treated like C's comma operator, which
throws away the left argument, which is not what you want.  See
L<perlref> for more on this.

This warning will not be issued for numerical constants equal to 0 or 1
since they are often used in statements like

    1 while sub_with_side_effects();

String constants that would normally evaluate to 0 or 1 are warned
about.

=item Useless use of (?-p) in regex; marked by S<<-- HERE> in m/%s/

(W regexp) The C<p> modifier cannot be turned off once set.  Trying to do
so is futile.

=item Useless use of "re" pragma

(W) You did C<use re;> without any arguments.  That isn't very useful.

=item Useless use of sort in scalar context

(W void) You used sort in scalar context, as in:

    my $x = sort @y;

This is not very useful, and perl currently optimizes this away.

=item Useless use of %s with no values

(W syntax) You used the push() or unshift() function with no arguments
apart from the array, like C<push(@x)> or C<unshift(@foo)>.  That won't
usually have any effect on the array, so is completely useless.  It's
possible in principle that push(@tied_array) could have some effect
if the array is tied to a class which implements a PUSH method.  If so,
you can write it as C<push(@tied_array,())> to avoid this warning.

=item "use" not allowed in expression

(F) The "use" keyword is recognized and executed at compile time, and
returns no useful value.  See L<perlmod>.

=item Use of assignment to $[ is deprecated

(D deprecated) The C<$[> variable (index of the first element in an array)
is deprecated.  See L<perlvar/"$[">.

=item Use of bare << to mean <<"" is deprecated. Its use will be fatal in Perl 5.28

(D deprecated) You are now encouraged to use the explicitly quoted
form if you wish to use an empty line as the terminator of the
here-document.

Use of a bare terminator was deprecated in Perl 5.000, and
will be a fatal error in Perl 5.28.

=item Use of /c modifier is meaningless in s///

(W regexp) You used the /c modifier in a substitution.  The /c
modifier is not presently meaningful in substitutions.

=item Use of /c modifier is meaningless without /g

(W regexp) You used the /c modifier with a regex operand, but didn't
use the /g modifier.  Currently, /c is meaningful only when /g is
used.  (This may change in the future.)

=item Use of code point 0x%s is deprecated; the permissible max is 0x%s. This will be fatal in Perl 5.28

(D deprecated) You used a code point that will not be allowed in a
future perl version, because it is too large.  Unicode only allows code
points up to 0x10FFFF, but Perl allows much larger ones.  However, the
largest possible ones break the perl interpreter in some constructs,
including causing it to hang in a few cases.  The known problem areas
are in C<tr///>, regular expression pattern matching using quantifiers,
as quote delimiters in C<qI<X>...I<X>> (where I<X> is the C<chr()> of a large
code point), and as the upper limits in loops.
There may be other breakages as well.  If you get this warning, and
things aren't working correctly, you probably have found one of these.

If your code is to run on various platforms, keep in mind that the upper
limit depends on the platform.  It is much larger on 64-bit word sizes
than 32-bit ones.

The use of out of range code points was deprecated in Perl 5.24, and
it will be a fatal error in Perl 5.28.

=item Use of comma-less variable list is deprecated. Its use will be fatal in Perl 5.28

(D deprecated) The values you give to a format should be
separated by commas, not just aligned on a line.

This usage will be fatal in Perl 5.28.

=item Use of each() on hash after insertion without resetting hash iterator results in undefined behavior

(S internal) The behavior of C<each()> after insertion is undefined;
it may skip items, or visit items more than once.  Consider using
C<keys()> instead of C<each()>.
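
As a minimal sketch of that advice, assuming a hypothetical hash
C<%count>, you can loop over a snapshot of the keys so that insertions
made inside the loop never depend on the hash's internal iterator:

    my %count = (apple => 1, pear => 2);

    # keys() returns the complete key list up front, so inserting
    # inside the loop cannot disturb the iteration.
    for my $name (keys %count) {
        $count{"$name-copy"} = $count{$name};
    }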

=item Infinite recursion via empty pattern

(F) You tried to use the empty pattern inside of a regex code block,
for instance C</(?{ s!!! })/>.  This re-executes the same pattern,
which would loop forever, so Perl breaks the loop by throwing an
exception.

=item Use of := for an empty attribute list is not allowed

(F) The construction C<my $x := 42> used to parse as equivalent to
C<my $x : = 42> (applying an empty attribute list to C<$x>).
This construct was deprecated in 5.12.0, and has now been made a syntax
error, so C<:=> can be reclaimed as a new operator in the future.

If you need an empty attribute list, for example in a code generator, add
a space before the C<=>.

=item Use of %s for non-UTF-8 locale is wrong.  Assuming a UTF-8 locale

(W locale)  You are matching a regular expression using locale rules,
and the specified construct was encountered.  This construct is only
valid for UTF-8 locales, which the current locale isn't.  This doesn't
make sense.  Perl will continue, assuming a Unicode (UTF-8) locale, but
the results are likely to be wrong.

=item Use of freed value in iteration

(F) Perhaps you modified the iterated array within the loop?
This error is typically caused by code like the following:

    @a = (3,4);
    @a = () for (1,2,@a);

You are not supposed to modify arrays while they are being iterated over.
For speed and efficiency reasons, Perl internally does not do full
reference-counting of iterated items, hence deleting such an item in the
middle of an iteration causes Perl to see a freed value.

=item Use of /g modifier is meaningless in split

(W regexp) You used the /g modifier on the pattern for a C<split>
operator.  Since C<split> always tries to match the pattern
repeatedly, the C</g> has no effect.

=item Use of "goto" to jump into a construct is deprecated

(D deprecated) Using C<goto> to jump from an outer scope into an inner
scope is deprecated and should be avoided.

This was deprecated in Perl 5.12.

=item Use of inherited AUTOLOAD for non-method %s() is deprecated. This will be fatal in Perl 5.28

(D deprecated) As an (ahem) accidental feature, C<AUTOLOAD>
subroutines are looked up as methods (using the C<@ISA> hierarchy)
even when the subroutines to be autoloaded were called as plain
functions (e.g. C<Foo::bar()>), not as methods (e.g. C<< Foo->bar() >> or
C<< $obj->bar() >>).

This bug will be rectified in future by using method lookup only for
methods' C<AUTOLOAD>s.  However, there is a significant base of existing
code that may be using the old behavior.  So, as an interim step, Perl
currently issues an optional warning when non-methods use inherited
C<AUTOLOAD>s.

The simple rule is:  Inheritance will not work when autoloading
non-methods.  The simple fix for old code is:  In any module that used
to depend on inheriting C<AUTOLOAD> for non-methods from a base class
named C<BaseClass>, execute C<*AUTOLOAD = \&BaseClass::AUTOLOAD> during
startup.

In code that currently says C<use AutoLoader; @ISA = qw(AutoLoader);>
you should remove AutoLoader from @ISA and change C<use AutoLoader;> to
C<use AutoLoader 'AUTOLOAD';>.

This feature was deprecated in Perl 5.004, and will be fatal in Perl 5.28.

=item Use of %s in printf format not supported

(F) You attempted to use a feature of printf that is accessible from
only C.  This usually means there's a better way to do it in Perl.

=item Use of -l on filehandle%s

(W io) A filehandle represents an opened file, and when you opened the file
it already went past any symlink you are presumably trying to look for.
The operation returned C<undef>.  Use a filename instead.
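
A minimal sketch of the fix, in which the path is only a placeholder,
is to apply C<-l> to the name itself rather than to a handle opened
on it:

    my $path = '/tmp/example-link';    # hypothetical path

    # Test the name, not a filehandle opened on it.
    if (-l $path) {
        print "$path is a symbolic link\n";
    }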

=item Use of reference "%s" as array index

(W misc) You tried to use a reference as an array index; this probably
isn't what you mean, because references in numerical context tend
to be huge numbers, and so usually indicates programmer error.

If you really do mean it, explicitly numify your reference, like so:
C<$array[0+$ref]>.  This warning is not given for overloaded objects,
however, because you can overload the numification and stringification
operators and then you presumably know what you are doing.

=item Use of state $_ is experimental

(S experimental::lexical_topic) Lexical $_ is an experimental feature and
its behavior may change or even be removed in any future release of perl.
See the explanation under L<perlvar/$_>.

=item Use of strings with code points over 0xFF as arguments to %s
operator is deprecated. This will be a fatal error in Perl 5.28

(D deprecated) You tried to use one of the string bitwise operators
(C<&> or C<|> or C<^> or C<~>) on a string containing a code point over
0xFF.  The string bitwise operators treat their operands as strings of
bytes, and values beyond 0xFF are nonsensical in this context.

Such usage will be a fatal error in Perl 5.28.

=item Use of tainted arguments in %s is deprecated

(W taint, deprecated) You have supplied C<system()> or C<exec()> with multiple
arguments and at least one of them is tainted.  This used to be allowed
but will become a fatal error in a future version of perl.  Untaint your
arguments.  See L<perlsec>.

=item Use of unassigned code point or non-standalone grapheme for a
delimiter will be a fatal error starting in Perl 5.30

(D deprecated)
A grapheme is what appears to a native-speaker of a language to be a
character.  In Unicode (and hence Perl) a grapheme may actually be
several adjacent characters that together form a complete grapheme.  For
example, there can be a base character, like "R" and an accent, like a
circumflex "^", that appear when displayed to be a single character with
the circumflex hovering over the "R".  Perl currently allows things like
that circumflex to be delimiters of strings, patterns, I<etc>.  When
displayed, the circumflex would look like it belongs to the character
just to the left of it.  In order to move the language to be able to
accept graphemes as delimiters, we have to deprecate the use of
delimiters which aren't graphemes by themselves.  Also, a delimiter must
already be assigned (or known to be never going to be assigned) to try
to future-proof code, for otherwise code that works today would fail to
compile if the currently unassigned delimiter ends up being something
that isn't a stand-alone grapheme.  Because Unicode is never going to
assign
L<non-character code points|perlunicode/Noncharacter code points>, nor
L<code points that are above the legal Unicode maximum|
perlunicode/Beyond Unicode code points>, those can be delimiters, and
their use won't raise this warning.

=item Use of uninitialized value%s

(W uninitialized) An undefined value was used as if it were already
defined.  It was interpreted as a "" or a 0, but maybe it was a mistake.
To suppress this warning assign a defined value to your variables.

To help you figure out what was undefined, perl will try to tell you
the name of the variable (if any) that was undefined.  In some cases
it cannot do this, so it also tells you what operation you used the
undefined value in.  Note, however, that perl optimizes your program
and the operation displayed in the warning may not necessarily appear
literally in your program.  For example, C<"that $foo"> is usually
optimized into C<"that " . $foo>, and the warning will refer to the
C<concatenation (.)> operator, even though there is no C<.> in
your program.
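
One way to make sure a value is defined before it is used is the
defined-or operator C<//>, available from Perl 5.10.  The following is
only a sketch; the C<%opt> hash and its C<name> key are invented for
illustration:

    use strict;
    use warnings;

    my %opt;                       # hypothetical options hash, left empty

    # The defined-or operator supplies a default only when the
    # left-hand side is undef.
    my $name = $opt{name} // 'anonymous';
    print "Hello, $name\n";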

=item "use re 'strict'" is experimental

(S experimental::re_strict) The things that are different when a regular
expression pattern is compiled under C<'strict'> are subject to change
in future Perl releases in incompatible ways.  This means that a pattern
that compiles today may not in a future Perl release.  This warning is
to alert you to that risk.

=item Use \x{...} for more than two hex characters in regex; marked by
S<<-- HERE> in m/%s/

(F) In a regular expression, you said something like

 (?[ [ \xBEEF ] ])

Perl isn't sure if you meant this

 (?[ [ \x{BEEF} ] ])

or if you meant this

 (?[ [ \x{BE} E F ] ])

You need to add either braces or blanks to disambiguate.

=item Using just the first character returned by \N{} in character class in 
regex; marked by S<<-- HERE> in m/%s/

(W regexp) Named Unicode character escapes C<(\N{...})> may return
a multi-character sequence.  Even though a character class is
supposed to match just one character of input, perl will match
the whole thing correctly, except when the class is inverted
(C<[^...]>), or the escape is the beginning or final end point of
a range.  For these, what should happen isn't clear at all.  In
these circumstances, Perl discards all but the first character
of the returned sequence, which is not likely what you want.

=item Using /u for '%s' instead of /%s in regex; marked by S<<-- HERE> in m/%s/

(W regexp) You used a Unicode boundary (C<\b{...}> or C<\B{...}>) in a
portion of a regular expression where the character set modifiers C</a>
or C</aa> are in effect.  These two modifiers indicate an ASCII
interpretation, and this doesn't make sense for a Unicode definition.
The generated regular expression will compile so that the boundary uses
all of Unicode.  No other portion of the regular expression is affected.

=item Using !~ with %s doesn't make sense

(F) Using the C<!~> operator with C<s///r>, C<tr///r> or C<y///r> is
currently reserved for future use, as the exact behavior has not
been decided.  (Simply returning the boolean opposite of the
modified string is usually not particularly useful.)

=item UTF-16 surrogate U+%X

(S surrogate) You had a UTF-16 surrogate in a context where they are
not considered acceptable.  These code points, between U+D800 and
U+DFFF (inclusive), are used by Unicode only for UTF-16.  However, Perl
internally allows all unsigned integer code points (up to the size limit
available on your platform), including surrogates.  But these can cause
problems when being input or output, which is likely where this message
came from.  If you really really know what you are doing you can turn
off this warning by C<no warnings 'surrogate';>.

=item Value of %s can be "0"; test with defined()

(W misc) In a conditional expression, you used <HANDLE>, <*> (glob),
C<each()>, or C<readdir()> as a boolean value.  Each of these constructs
can return a value of "0"; that would make the conditional expression
false, which is probably not what you intended.  When using these
constructs in conditional expressions, test their values with the
C<defined> operator.
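
For instance, a directory may contain a file named F<0>.  A sketch of
the explicit test, reading the current directory purely as an example:

    my $dir = '.';                 # hypothetical directory
    opendir(my $dh, $dir) or die "opendir: '$dir': $!";

    # An entry named "0" is false but defined, so test with defined().
    while (defined(my $entry = readdir $dh)) {
        print "$entry\n";
    }
    closedir $dh;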

=item Value of CLI symbol "%s" too long

(W misc) A warning peculiar to VMS.  Perl tried to read the value of an
%ENV element from a CLI symbol table, and found a resultant string
longer than 1024 characters.  The return value has been truncated to
1024 characters.

=item Variable "%s" is not available

(W closure) During compilation, an inner named subroutine or eval is
attempting to capture an outer lexical that is not currently available.
This can happen for one of two reasons.  First, the outer lexical may be
declared in an outer anonymous subroutine that has not yet been created.
(Remember that named subs are created at compile time, while anonymous
subs are created at run-time.)  For example,

    sub { my $a; sub f { $a } }

At the time that f is created, it can't capture the current value of $a,
since the anonymous subroutine hasn't been created yet.  Conversely,
the following won't give a warning since the anonymous subroutine has by
now been created and is live:

    sub { my $a; eval 'sub f { $a }' }->();

The second situation is caused by an eval accessing a variable that has
gone out of scope, for example,

    sub f {
	my $a;
	sub { eval '$a' }
    }
    f()->();

Here, when the '$a' in the eval is being compiled, f() is not currently
being executed, so its $a is not available for capture.

=item Variable "%s" is not imported%s

(S misc) With "use strict" in effect, you referred to a global variable
that you apparently thought was imported from another module, because
something else of the same name (usually a subroutine) is exported by
that module.  It usually means you put the wrong funny character on the
front of your variable.

=item Variable length lookbehind not implemented in regex m/%s/

(F) Lookbehind is allowed only for subexpressions whose length is fixed and
known at compile time.  For positive lookbehind, you can use the C<\K>
regex construct as a way to get the equivalent functionality.  See
L<(?<=pattern) and \K in perlre|perlre/\K>.

There are non-obvious Unicode rules under C</i> that can match variably,
but which you might not think could.  For example, the substring C<"ss">
can match the single character LATIN SMALL LETTER SHARP S.  There are
other sequences of ASCII characters that can match single ligature
characters, such as LATIN SMALL LIGATURE FFI matching C<qr/ffi/i>.
Starting in Perl v5.16, if you only care about ASCII matches, adding the
C</aa> modifier to the regex will exclude all these non-obvious matches,
thus getting rid of this message.  You can also say C<S<use re qw(/aa)>>
to apply C</aa> to all regular expressions compiled within its scope.
See L<re>.
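
As a small sketch of the C<\K> approach (the strings are made up for
illustration), the substitution below replaces C<"bar"> only when it
follows C<"f"> and one or more C<"o">, something that
C<(?E<lt>=fo+)> cannot express as a lookbehind:

    my $s = "foooobar";

    # \K discards everything matched so far from the reported match,
    # so only "bar" is replaced.
    $s =~ s/fo+\Kbar/BAZ/;

    print "$s\n";                  # prints "fooooBAZ"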

=item "%s" variable %s masks earlier declaration in same %s

(W misc) A "my", "our" or "state" variable has been redeclared in the
current scope or statement, effectively eliminating all access to the
previous instance.  This is almost always a typographical error.  Note
that the earlier variable will still exist until the end of the scope
or until all closure references to it are destroyed.

=item Variable syntax

(A) You've accidentally run your script through B<csh> instead
of Perl.  Check the #! line, or manually feed your script into
Perl yourself.

=item Variable "%s" will not stay shared

(W closure) An inner (nested) I<named> subroutine is referencing a
lexical variable defined in an outer named subroutine.

When the inner subroutine is called, it will see the value of
the outer subroutine's variable as it was before and during the *first*
call to the outer subroutine; in this case, after the first call to the
outer subroutine is complete, the inner and outer subroutines will no
longer share a common value for the variable.  In other words, the
variable will no longer be shared.

This problem can usually be solved by making the inner subroutine
anonymous, using the C<sub {}> syntax.  When inner anonymous subs that
reference variables in outer subroutines are created, they
are automatically rebound to the current values of such variables.
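
A minimal sketch of that fix, with subroutine and variable names chosen
only for illustration:

    sub outer {
        my $x = shift;

        # An anonymous sub is created afresh on every call to
        # outer(), so it captures the current $x.
        my $inner = sub { return $x * 2 };

        return $inner->();
    }

    print outer(1), "\n";          # prints 2
    print outer(5), "\n";          # prints 10

Had C<$inner> been a nested named subroutine instead, the second call
would still have seen the C<$x> from the first call.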

=item vector argument not supported with alpha versions

(S printf) The %vd (s)printf format does not support version objects
with alpha parts.

=item Verb pattern '%s' has a mandatory argument in regex; marked by
S<<-- HERE> in m/%s/ 

(F) You used a verb pattern that requires an argument.  Supply an
argument or check that you are using the right verb.

=item Verb pattern '%s' may not have an argument in regex; marked by
S<<-- HERE> in m/%s/ 

(F) You used a verb pattern that is not allowed an argument.  Remove the 
argument or check that you are using the right verb.

=item Version control conflict marker

(F) The parser found a line starting with C<E<lt><<<<<<>,
C<E<gt>E<gt>E<gt>E<gt>E<gt>E<gt>E<gt>>, or C<=======>.  These may be left by a
version control system to mark conflicts after a failed merge operation.

=item Version number must be a constant number

(P) The attempt to translate a C<use Module n.n LIST> statement into
its equivalent C<BEGIN> block found an internal inconsistency with
the version number.

=item Version string '%s' contains invalid data; ignoring: '%s'

(W misc) The version string contains invalid characters at the end, which
are being ignored.

=item Warning: something's wrong

(W) You passed warn() an empty string (the equivalent of C<warn "">) or
you called it with no args and C<$@> was empty.

=item Warning: unable to close filehandle %s properly

(S) The implicit close() done by an open() got an error indication on
the close().  This usually indicates your file system ran out of disk
space.

=item Warning: unable to close filehandle properly: %s

=item Warning: unable to close filehandle %s properly: %s

(S io) There were errors during the implicit close() done on a filehandle
when its reference count reached zero while it was still open, e.g.:

    {
        open my $fh, '>', $file  or die "open: '$file': $!\n";
        print $fh $data or die "print: $!";
    } # implicit close here

Because various errors may only be detected by close() (e.g. buffering could
allow the C<print> in this example to return true even when the disk is full),
it is dangerous to ignore its result.  So when it happens implicitly, perl
will signal errors by warning.

B<Prior to version 5.22.0, perl ignored such errors>, so the common idiom shown
above was liable to cause B<silent data loss>.
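
To avoid relying on the implicit close at all, you can close the
handle yourself and check the result.  A sketch, with a placeholder
file name and data:

    my $file = 'out.txt';          # hypothetical file name
    my $data = "some data\n";

    open my $fh, '>', $file  or die "open: '$file': $!\n";
    print $fh $data          or die "print: $!";

    # Checking close() catches errors that buffering hid from print.
    close $fh                or die "close: '$file': $!\n";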

=item Warning: Use of "%s" without parentheses is ambiguous

(S ambiguous) You wrote a unary operator followed by something that
looks like a binary operator that could also have been interpreted as a
term or unary operator.  For instance, if you know that the rand
function has a default argument of 1.0, and you write

    rand + 5;

you may THINK you wrote the same thing as

    rand() + 5;

but in actual fact, you got

    rand(+5);

So put in parentheses to say what you really mean.

=item when is experimental

(S experimental::smartmatch) C<when> depends on smartmatch, which is
experimental.  Additionally, it has several special cases that may
not be immediately obvious, and their behavior may change or
even be removed in any future release of perl.  See the explanation
under L<perlsyn/Experimental Details on given and when>.

=item Wide character in %s

(S utf8) Perl met a wide character (>255) when it wasn't expecting
one.  This warning is by default on for I/O (like print).  The easiest
way to quiet this warning is simply to add the C<:utf8> layer to the
output, e.g. C<binmode STDOUT, ':utf8'>.  Another way to turn off the
warning is to add C<no warnings 'utf8';> but that is often closer to
cheating.  In general, you are supposed to explicitly mark the
filehandle with an encoding, see L<open> and L<perlfunc/binmode>.
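
A short sketch of marking the handle explicitly.  It uses the stricter
C<:encoding(UTF-8)> layer rather than the C<:utf8> layer mentioned
above, which is a choice made for this example:

    # Declare the encoding of the output handle once, up front.
    binmode(STDOUT, ':encoding(UTF-8)');

    print "\x{263A}\n";            # WHITE SMILING FACE, code point > 255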

=item Wide character (U+%X) in %s

(W locale) While in a single-byte locale (I<i.e.>, a non-UTF-8
one), a multi-byte character was encountered.   Perl considers this
character to be the specified Unicode code point.  Combining non-UTF-8
locales and Unicode is dangerous.  Almost certainly some characters
will have two different representations.  For example, in the ISO 8859-7
(Greek) locale, the code point 0xC3 represents a Capital Gamma.  But so
also does 0x393.  This will make string comparisons unreliable.

You likely need to figure out how this multi-byte character got mixed up
with your single-byte locale (or perhaps you thought you had a UTF-8
locale, but Perl disagrees).

=item Within []-length '%c' not allowed

(F) The count in the (un)pack template may be replaced by C<[TEMPLATE]>
only if C<TEMPLATE> always matches the same amount of packed bytes that
can be determined from the template alone.  This is not possible if
it contains any of the codes @, /, U, u, w or a *-length.  Redesign
the template.

=item %s() with negative argument

(S misc) Certain operations make no sense with negative arguments.
Warning is given and the operation is not done.

=item write() on closed filehandle %s

(W closed) The filehandle you're writing to got itself closed sometime
before now.  Check your control flow.

=item %s "\x%X" does not map to Unicode

(S utf8) When reading in different encodings, Perl tries to
map everything into Unicode characters.  The bytes you read
in are not legal in this encoding.  For example

    utf8 "\xE4" does not map to Unicode

if you try to read the Latin-1 a-diaeresis (byte 0xE4) as UTF-8.

=item 'X' outside of string

(F) You had a (un)pack template that specified a relative position before
the beginning of the string being (un)packed.  See L<perlfunc/pack>.

=item 'x' outside of string in unpack

(F) You had an unpack template that specified a relative position after
the end of the string being unpacked.  See L<perlfunc/pack>.

=item YOU HAVEN'T DISABLED SET-ID SCRIPTS IN THE KERNEL YET!

(F) And you probably never will, because you probably don't have the
sources to your kernel, and your vendor probably doesn't give a rip
about what you want.  Your best bet is to put a setuid C wrapper around
your script.

=item You need to quote "%s"

(W syntax) You assigned a bareword as a signal handler name.
Unfortunately, you already have a subroutine of that name declared,
which means that Perl 5 will try to call the subroutine when the
assignment is executed, which is probably not what you want.  (If it IS
what you want, put an & in front.)

=item Your random numbers are not that random

(F) When trying to initialize the random seed for hashes, Perl could
not get any randomness out of your system.  This usually indicates
Something Very Wrong.

=item Zero length \N{} in regex; marked by S<<-- HERE> in m/%s/

(F) Named Unicode character escapes (C<\N{...}>) may return a zero-length
sequence.  Such an escape was used in an extended character class, i.e.
C<(?[...])>, or under C<use re 'strict'>, which is not permitted.  Check
that the correct escape has been used, and the correct charnames handler
is in scope.  The S<<-- HERE> shows whereabouts in the regular
expression the problem was discovered.

=back

=head1 SEE ALSO

L<warnings>, L<diagnostics>.

=cut
perlre.pod000064400000354105150344123460006553 0ustar00=head1 NAME
X<regular expression> X<regex> X<regexp>

perlre - Perl regular expressions

=head1 DESCRIPTION

This page describes the syntax of regular expressions in Perl.

If you haven't used regular expressions before, a tutorial introduction
is available in L<perlretut>.  If you know just a little about them,
a quick-start introduction is available in L<perlrequick>.

Except for the L</The Basics> section, this page assumes you are familiar
with regular expression basics, such as what a "pattern" is, what it
looks like, and how it is used.  For a reference on how they
are used, plus various examples of the same, see discussions of C<m//>,
C<s///>, C<qr//> and C<"??"> in L<perlop/"Regexp Quote-Like Operators">.

New in v5.22, L<C<use re 'strict'>|re/'strict' mode> applies stricter
rules than otherwise when compiling regular expression patterns.  It can
find things that, while legal, may not be what you intended.

=head2 The Basics
X<regular expression, version 8> X<regex, version 8> X<regexp, version 8>

Regular expressions are strings with the very particular syntax and
meaning described in this document and auxiliary documents referred to
by this one.  The strings are called "patterns".  Patterns are used to
determine if some other string, called the "target", has (or doesn't
have) the characteristics specified by the pattern.  We call this
"matching" the target string against the pattern.  Usually the match is
done by having the target be the first operand, and the pattern be the
second operand, of one of the two binary operators C<=~> and C<!~>,
listed in L<perlop/Binding Operators>; and the pattern will have been
converted from an ordinary string by one of the operators in
L<perlop/"Regexp Quote-Like Operators">, like so:

 $foo =~ m/abc/

This evaluates to true if and only if the string in the variable C<$foo>
contains, somewhere in it, the sequence of characters "a", "b", then "c".
(The C<=~ m>, or match operator, is described in
L<perlop/m/PATTERN/msixpodualngc>.)

Patterns that aren't already stored in some variable must be delimited,
at both ends, by delimiter characters.  These are often, as in the
example above, forward slashes, and the typical way a pattern is written
in documentation is with those slashes.  In most cases, the delimiter
is the same character, fore and aft, but there are a few cases where a
character looks like it has a mirror-image mate, where the opening
version is the beginning delimiter, and the closing one is the ending
delimiter, like

 $foo =~ m<abc>

Most times, the pattern is evaluated in double-quotish context, but it
is possible to choose delimiters to force single-quotish, like

 $foo =~ m'abc'

If the pattern contains its delimiter within it, that delimiter must be
escaped.  Prefixing it with a backslash (I<e.g.>, C<"/foo\/bar/">)
serves this purpose.

Any single character in a pattern matches that same character in the
target string, unless the character is a I<metacharacter> with a special
meaning described in this document.  A sequence of non-metacharacters
matches the same sequence in the target string, as we saw above with
C<m/abc/>.

Only a few characters (all of them being ASCII punctuation characters)
are metacharacters.  The most commonly used one is a dot C<".">, which
normally matches almost any character (including a dot itself).

You can cause characters that normally function as metacharacters to be
interpreted literally by prefixing them with a C<"\">, just like the
pattern's delimiter must be escaped if it also occurs within the
pattern.  Thus, C<"\."> matches just a literal dot, C<"."> instead of
its normal meaning.  This means that the backslash is also a
metacharacter, so C<"\\"> matches a single C<"\">.  And a sequence that
contains an escaped metacharacter matches the same sequence (but without
the escape) in the target string.  So, the pattern C</blur\\fl/> would
match any target string that contains the sequence C<"blur\fl">.
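
The escaping rules above can be sketched as follows.  This is only an
illustration; the variable names are made up for the example and each
flag is simply 1 if the match succeeds and 0 otherwise.

```perl
use strict;
use warnings;

# An unescaped dot matches almost any character; an escaped one only a dot.
my $dot_any     = "abc" =~ /^a.c$/  ? 1 : 0;    # dot matches "b"
my $dot_literal = "abc" =~ /^a\.c$/ ? 1 : 0;    # "b" is not a literal dot
my $dot_exact   = "a.c" =~ /^a\.c$/ ? 1 : 0;    # a literal dot is present

# A doubled backslash in the pattern matches one backslash in the target.
# The single-quoted string holds a real backslash between "blur" and "fl".
my $backslash = 'blur\fl' =~ /blur\\fl/ ? 1 : 0;

# The delimiter itself must be escaped when it occurs inside the pattern.
my $slash = "foo/bar" =~ /foo\/bar/ ? 1 : 0;

print "$dot_any $dot_literal $dot_exact $backslash $slash\n";
```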

The metacharacter C<"|"> is used to match one thing or another.  Thus

 $foo =~ m/this|that/

is TRUE if and only if C<$foo> contains either the sequence C<"this"> or
the sequence C<"that">.  Like all metacharacters, prefixing the C<"|">
with a backslash makes it match the plain punctuation character; in its
case, the VERTICAL LINE.

 $foo =~ m/this\|that/

is TRUE if and only if C<$foo> contains the sequence C<"this|that">.

You aren't limited to just a single C<"|">.

 $foo =~ m/fee|fie|foe|fum/

is TRUE if and only if C<$foo> contains any of those 4 sequences from
the children's story "Jack and the Beanstalk".

As you can see, the C<"|"> binds less tightly than a sequence of
ordinary characters.  We can override this by using the grouping
metacharacters, the parentheses C<"("> and C<")">.

 $foo =~ m/th(is|at) thing/

is TRUE if and only if C<$foo> contains either the sequence S<C<"this
thing">> or the sequence S<C<"that thing">>.  The portions of the string
that match the portions of the pattern enclosed in parentheses are
normally made available separately for use later in the pattern,
substitution, or program.  This is called "capturing", and it can get
complicated.  See L</Capture groups>.

The first alternative includes everything from the last pattern
delimiter (C<"(">, C<"(?:"> (described later), I<etc>. or the beginning
of the pattern) up to the first C<"|">, and the last alternative
contains everything from the last C<"|"> to the next closing pattern
delimiter.  That's why it's common practice to include alternatives in
parentheses: to minimize confusion about where they start and end.

Alternatives are tried from left to right, so the first
alternative found for which the entire expression matches, is the one that
is chosen. This means that alternatives are not necessarily greedy. For
example: when matching C<foo|foot> against C<"barefoot">, only the C<"foo">
part will match, as that is the first alternative tried, and it successfully
matches the target string. (This might not seem important, but it is
important when you are capturing matched text using parentheses.)
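
A small sketch of the left-to-right rule, using capture groups to show
which alternative was chosen.  The variable names are illustrative only.

```perl
use strict;
use warnings;

# Alternatives are tried left to right; the first one that lets the
# whole pattern succeed wins, even if a later one would match more text.
my ($first_wins)  = "barefoot" =~ /(foo|foot)/;
my ($longer_wins) = "barefoot" =~ /(foot|foo)/;

# Anchoring the end makes "foo" fail overall, so the engine backtracks
# and tries the second alternative.
my ($anchored) = "barefoot" =~ /(foo|foot)$/;

print "$first_wins $longer_wins $anchored\n";
```

Reordering the alternatives, or constraining what must follow them, is
usually how one gets the longer alternative when that is what is wanted.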

Besides taking away the special meaning of a metacharacter, a prefixed
backslash changes some letter and digit characters away from matching
just themselves to instead have special meaning.  These are called
"escape sequences", and all such are described in L<perlrebackslash>.  A
backslash sequence (of a letter or digit) that doesn't currently have
special meaning to Perl will raise a warning if warnings are enabled,
as those are reserved for potential future use.

One such sequence is C<\b>, which matches a boundary of some sort.
C<\b{wb}> and a few others give specialized types of boundaries.
(They are all described in detail starting at
L<perlrebackslash/\b{}, \b, \B{}, \B>.)  Note that these don't match
characters, but the zero-width spaces between characters.  They are an
example of a L<zero-width assertion|/Assertions>.  Consider again,

 $foo =~ m/fee|fie|foe|fum/

It evaluates to TRUE if, besides those 4 words, any of the sequences
"feed", "field", "Defoe", "fume", and many others are in C<$foo>.  By
judicious use of C<\b> (or better (because it is designed to handle
natural language) C<\b{wb}>), we can make sure that only the Giant's
words are matched:

 $foo =~ m/\b(fee|fie|foe|fum)\b/
 $foo =~ m/\b{wb}(fee|fie|foe|fum)\b{wb}/

The final example shows that the characters C<"{"> and C<"}"> are
metacharacters.
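
To make the effect of C<\b> concrete, here is a hedged sketch that
collects every match from a sample sentence, with and without the
boundary assertions.  The sample text and variable names are invented
for the illustration.

```perl
use strict;
use warnings;

my $text = "Defoe paid a fee to feed the field";

# Without boundaries, the Giant's words are also found inside other words.
my @loose = $text =~ /(fee|fie|foe|fum)/g;

# With \b on both sides, only whole words qualify.
my @strict = $text =~ /\b(fee|fie|foe|fum)\b/g;

print "loose:  @loose\n";
print "strict: @strict\n";
```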

Another use for escape sequences is to specify characters that cannot
(or which you prefer not to) be written literally.  These are described
in detail in L<perlrebackslash/Character Escapes>, but the next three
paragraphs briefly describe some of them.

Various control characters can be written in C language style: C<"\n">
matches a newline, C<"\t"> a tab, C<"\r"> a carriage return, C<"\f"> a
form feed, I<etc>.

More generally, C<\I<nnn>>, where I<nnn> is a string of three octal
digits, matches the character whose native code point is I<nnn>.  You
can easily run into trouble if you don't have exactly three digits.  So
always use three, or since Perl 5.14, you can use C<\o{...}> to specify
any number of octal digits.

Similarly, C<\xI<nn>>, where I<nn> are hexadecimal digits, matches the
character whose native ordinal is I<nn>.  Again, not using exactly two
digits is a recipe for disaster, but you can use C<\x{...}> to specify
any number of hex digits.
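
As an illustration, the four patterns below are different spellings of
the same single character.  The sketch assumes an ASCII platform, where
code point 0x41 (octal 101) is "A", and Perl 5.14 or later for
C<\o{...}>.  The variable names are hypothetical.

```perl
use strict;
use warnings;

# Four spellings of code point 0x41: three octal digits, braced octal,
# two hex digits, and braced hex.
my @patterns = (qr/^\101$/, qr/^\o{101}$/, qr/^\x41$/, qr/^\x{41}$/);

# Count how many of the patterns match the one-character string "A".
my $hits      = grep { "A" =~ $_ } @patterns;
my $all_match = $hits == @patterns ? 1 : 0;

print "$hits of ", scalar(@patterns), " patterns matched\n";
```

The braced forms are the safer habit, since they cannot be confused with
a backreference or swallow a following digit.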

Besides being a metacharacter, the C<"."> is an example of a "character
class", something that can match any single character of a given set of
them.  In its case, the set is just about all possible characters.  Perl
predefines several character classes besides the C<".">; there is a
separate reference page about just these, L<perlrecharclass>.

You can define your own custom character classes, by putting into your
pattern in the appropriate place(s), a list of all the characters you
want in the set.  You do this by enclosing the list within C<[]> bracket
characters.  These are called "bracketed character classes" when we are
being precise, but often the word "bracketed" is dropped.  (Dropping it
usually doesn't cause confusion.)  This means that the C<"["> character
is another metacharacter.  It doesn't match anything just by itself; it
is used only to tell Perl that what follows it is a bracketed character
class.  If you want to match a literal left square bracket, you must
escape it, like C<"\[">.  The matching C<"]"> is also a metacharacter;
again it doesn't match anything by itself, but just marks the end of
your custom class to Perl.  It is an example of a "sometimes
metacharacter".  It isn't a metacharacter if there is no corresponding
C<"[">, and matches its literal self:

 print "]" =~ /]/;  # prints 1

The list of characters within the character class gives the set of
characters matched by the class.  C<"[abc]"> matches a single "a" or "b"
or "c".  But if the first character after the C<"["> is C<"^">, the
class matches any character not in the list.  Within a list, the C<"-">
character specifies a range of characters, so that C<a-z> represents all
characters between "a" and "z", inclusive.  If you want either C<"-"> or
C<"]"> itself to be a member of a class, put it at the start of the list
(possibly after a C<"^">), or escape it with a backslash.  C<"-"> is
also taken literally when it is at the end of the list, just before the
closing C<"]">.  (The following all specify the same class of three
characters: C<[-az]>, C<[az-]>, and C<[a\-z]>.  All are different from
C<[a-z]>, which specifies a class containing twenty-six characters, even
on EBCDIC-based character sets.)

There is lots more to bracketed character classes; full details are in
L<perlrecharclass/Bracketed Character Classes>.
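
The following sketch exercises the rules for C<"-">, C<"^">, and C<"]">
inside a bracketed character class by filtering a short list of
one-character strings.  The list and the variable names are made up for
the example.

```perl
use strict;
use warnings;

my @candidates = qw(a b z - 7);

# "-" first in the list is literal, so this class is just "a", "z", "-".
my @hyphen_class = grep { /^[-az]$/ } @candidates;

# "-" between two characters denotes a range.
my @range_class = grep { /^[a-z]$/ } @candidates;

# A leading "^" complements the class.
my @negated = grep { /^[^a-z]$/ } @candidates;

# "]" placed first in the list is a member of the class, not its end.
my $bracket_member = "]" =~ /^[]x]$/ ? 1 : 0;

print "hyphen: @hyphen_class\n";
print "range:  @range_class\n";
print "negated: @negated\n";
print "bracket: $bracket_member\n";
```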

=head3 Metacharacters
X<metacharacter>
X<\> X<^> X<.> X<$> X<|> X<(> X<()> X<[> X<[]>

L</The Basics> introduced some of the metacharacters.  This section
gives them all.  Most of them have the same meaning as in the I<egrep>
command.

Only the C<"\"> is always a metacharacter.  The others are metacharacters
just sometimes.  The following table lists all of them, summarizes
their use, and gives the contexts where they are metacharacters.
Outside those contexts or if prefixed by a C<"\">, they match their
corresponding punctuation character.  In some cases, their meaning
varies depending on various pattern modifiers that alter the default
behaviors.  See L</Modifiers>.


            PURPOSE                                  WHERE
 \   Escape the next character                    Always, except when
                                                  escaped by another \
 ^   Match the beginning of the string            Not in []
       (or line, if /m is used)
 ^   Complement the [] class                      At the beginning of []
 .   Match any single character except newline    Not in []
       (under /s, includes newline)
 $   Match the end of the string                  Not in [], but can
       (or before newline at the end of the       mean interpolate a
       string; or before any newline if /m is     scalar
       used)
 |   Alternation                                  Not in []
 ()  Grouping                                     Not in []
 [   Start Bracketed Character class              Not in []
 ]   End Bracketed Character class                Only in [], and
                                                    not first
 *   Matches the preceding element 0 or more      Not in []
       times
 +   Matches the preceding element 1 or more      Not in []
       times
 ?   Matches the preceding element 0 or 1         Not in []
       times
 {   Starts a sequence that gives number(s)       Not in []
       of times the preceding element can be
       matched
 {   when following certain escape sequences
       starts a modifier to the meaning of the
       sequence
 }   End sequence started by {
 -   Indicates a range                            Only in [] interior

Notice that most of the metacharacters lose their special meaning when
they occur in a bracketed character class, except C<"^"> has a different
meaning when it is at the beginning of such a class.  And C<"-"> and C<"]">
are metacharacters only at restricted positions within bracketed
character classes; while C<"}"> is a metacharacter only when closing a
special construct started by C<"{">.

In double-quotish context, as is usually the case, you need to be
careful about C<"$"> and the non-metacharacter C<"@">.  Those could
interpolate variables, which may or may not be what you intended.

These rules were designed for compactness of expression, rather than
legibility and maintainability.  The L</E<sol>x and E<sol>xx> pattern
modifiers allow you to insert white space to improve readability.  And
use of S<C<L<re 'strict'|re/'strict' mode>>> adds extra checking to
catch some typos that might silently compile into something unintended.

By default, the C<"^"> character is guaranteed to match only the
beginning of the string, the C<"$"> character only the end (or before the
newline at the end), and Perl does certain optimizations with the
assumption that the string contains only one line.  Embedded newlines
will not be matched by C<"^"> or C<"$">.  You may, however, wish to treat a
string as a multi-line buffer, such that the C<"^"> will match after any
newline within the string (except if the newline is the last character in
the string), and C<"$"> will match before any newline.  At the
cost of a little more overhead, you can do this by using the
L</C<E<sol>m>> modifier on the pattern match operator.  (Older programs
did this by setting C<$*>, but this option was removed in perl 5.10.)
X<^> X<$> X</m>

To simplify multi-line substitutions, the C<"."> character never matches a
newline unless you use the L<C<E<sol>s>|/s> modifier, which in effect tells
Perl to pretend the string is a single line--even if it isn't.
X<.> X</s>
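
A brief sketch of how C</m> and C</s> change the behavior of C<"^">,
C<"$">, and C<".">.  The two-line buffer and the variable names are
invented for the illustration.

```perl
use strict;
use warnings;

my $buffer = "first line\nsecond line\n";

# Without /m, "^" matches only at the very start of the string.
my @default_starts   = $buffer =~ /^(\w+)/g;
# With /m, "^" also matches after each embedded newline.
my @multiline_starts = $buffer =~ /^(\w+)/mg;

# "$" matches before the final newline by default, but not before an
# embedded one unless /m is given.
my $end_default  = $buffer =~ /second line$/ ? 1 : 0;
my $mid_default  = $buffer =~ /first line$/  ? 1 : 0;
my $mid_with_m   = $buffer =~ /first line$/m ? 1 : 0;

# "." refuses to cross a newline unless /s is given.
my $dot_default = $buffer =~ /first.*second/  ? 1 : 0;
my $dot_with_s  = $buffer =~ /first.*second/s ? 1 : 0;

print "starts: @default_starts / @multiline_starts\n";
print "dollar: $end_default $mid_default $mid_with_m\n";
print "dot:    $dot_default $dot_with_s\n";
```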

=head2 Modifiers

=head3 Overview

The default behavior for matching can be changed, using various
modifiers.  Modifiers that relate to the interpretation of the pattern
are listed just below.  Modifiers that alter the way a pattern is used
by Perl are detailed in L<perlop/"Regexp Quote-Like Operators"> and
L<perlop/"Gory details of parsing quoted constructs">.

=over 4

=item B<C<m>>
X</m> X<regex, multiline> X<regexp, multiline> X<regular expression, multiline>

Treat the string being matched against as multiple lines.  That is,
change C<"^"> and C<"$"> from matching the start of the string's first
line and the end of its last line to matching the start and end of each
line within the string.

=item B<C<s>>
X</s> X<regex, single-line> X<regexp, single-line>
X<regular expression, single-line>

Treat the string as single line.  That is, change C<"."> to match any character
whatsoever, even a newline, which normally it would not match.

Used together, as C</ms>, they let the C<"."> match any character whatsoever,
while still allowing C<"^"> and C<"$"> to match, respectively, just after
and just before newlines within the string.

=item B<C<i>>
X</i> X<regex, case-insensitive> X<regexp, case-insensitive>
X<regular expression, case-insensitive>

Do case-insensitive pattern matching.  For example, "A" will match "a"
under C</i>.

If locale matching rules are in effect, the case map is taken from the
current
locale for code points less than 256, and from Unicode rules for larger
code points.  However, matches that would cross the Unicode
rules/non-Unicode rules boundary (ords 255/256) will not succeed, unless
the locale is a UTF-8 one.  See L<perllocale>.

There are a number of Unicode characters that match a sequence of
multiple characters under C</i>.  For example,
C<LATIN SMALL LIGATURE FI> should match the sequence C<fi>.  Perl is not
currently able to do this when the multiple characters are in the pattern and
are split between groupings, or when one or more are quantified.  Thus

 "\N{LATIN SMALL LIGATURE FI}" =~ /fi/i;          # Matches
 "\N{LATIN SMALL LIGATURE FI}" =~ /[fi][fi]/i;    # Doesn't match!
 "\N{LATIN SMALL LIGATURE FI}" =~ /fi*/i;         # Doesn't match!

 # The below doesn't match, and it isn't clear what $1 and $2 would
 # be even if it did!!
 "\N{LATIN SMALL LIGATURE FI}" =~ /(f)(i)/i;      # Doesn't match!

Perl doesn't match multiple characters in a bracketed
character class unless the character that maps to them is explicitly
mentioned, and it doesn't match them at all if the character class is
inverted, which otherwise could be highly confusing.  See
L<perlrecharclass/Bracketed Character Classes>, and
L<perlrecharclass/Negation>.

=item B<C<x>> and B<C<xx>>
X</x>

Extend your pattern's legibility by permitting whitespace and comments.
Details in L</E<sol>x and  E<sol>xx>.

=item B<C<p>>
X</p> X<regex, preserve> X<regexp, preserve>

Preserve the string matched such that C<${^PREMATCH}>, C<${^MATCH}>, and
C<${^POSTMATCH}> are available for use after matching.

In Perl 5.20 and higher this is ignored. Due to a new copy-on-write
mechanism, C<${^PREMATCH}>, C<${^MATCH}>, and C<${^POSTMATCH}> will be available
after the match regardless of the modifier.

=item B<C<a>>, B<C<d>>, B<C<l>>, and B<C<u>>
X</a> X</d> X</l> X</u>

These modifiers, all new in 5.14, affect which character-set rules
(Unicode, I<etc>.) are used, as described below in
L</Character set modifiers>.

=item B<C<n>>
X</n> X<regex, non-capture> X<regexp, non-capture>
X<regular expression, non-capture>

Prevent the grouping metacharacters C<()> from capturing. This modifier,
new in 5.22, will stop C<$1>, C<$2>, I<etc>... from being filled in.

  "hello" =~ /(hi|hello)/;   # $1 is "hello"
  "hello" =~ /(hi|hello)/n;  # $1 is undef

This is equivalent to putting C<?:> at the beginning of every capturing group:

  "hello" =~ /(?:hi|hello)/; # $1 is undef

C</n> can be negated on a per-group basis. Alternatively, named captures
may still be used.

  "hello" =~ /(?-n:(hi|hello))/n;   # $1 is "hello"
  "hello" =~ /(?<greet>hi|hello)/n; # $1 is "hello", $+{greet} is
                                    # "hello"

=item Other Modifiers

There are a number of flags that can be found at the end of regular
expression constructs that are I<not> generic regular expression flags, but
apply to the operation being performed, like matching or substitution (C<m//>
or C<s///> respectively).

Flags described further in
L<perlretut/"Using regular expressions in Perl"> are:

  c  - keep the current position during repeated matching
  g  - globally match the pattern repeatedly in the string

Substitution-specific modifiers described in
L<perlop/"s/PATTERN/REPLACEMENT/msixpodualngcer"> are:

  e  - evaluate the right-hand side as an expression
  ee - evaluate the right side as a string then eval the result
  o  - pretend to optimize your code, but actually introduce bugs
  r  - perform non-destructive substitution and return the new value

=back

Regular expression modifiers are usually written in documentation
as I<e.g.>, "the C</x> modifier", even though the delimiter
in question might not really be a slash.  The modifiers C</imnsxadlup>
may also be embedded within the regular expression itself using
the C<(?...)> construct, see L</Extended Patterns> below.
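
As a rough sketch of the difference between operation flags and pattern
modifiers, the example below uses C</g>, C</r>, and C</e> on the match
and substitution operators, and then embeds C<i> inside a pattern.  It
assumes Perl 5.14 or later for C</r>, and the variable names are
illustrative only.

```perl
use strict;
use warnings;

my $source = "a1 b22 c333";

# /g in list context returns every capture from every match.
my @numbers = $source =~ /(\d+)/g;

# /r returns the modified copy and leaves $source alone.
my $masked = $source =~ s/\d/#/gr;

# /e evaluates the replacement as Perl code for each match.
(my $doubled = $source) =~ s/(\d+)/$1 * 2/ge;

# A pattern modifier such as /i can instead be embedded in the pattern.
my $embedded = "HELLO" =~ /(?i)hello/ ? 1 : 0;

print "@numbers | $masked | $doubled | $embedded | $source\n";
```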

=head3 Details on some modifiers

Some of the modifiers require more explanation than given in the
L</Overview> above.

=head4 C</x> and  C</xx>

A single C</x> tells
the regular expression parser to ignore most whitespace that is neither
backslashed nor within a bracketed character class.  You can use this to
break up your regular expression into more readable parts.
Also, the C<"#"> character is treated as a metacharacter introducing a
comment that runs up to the pattern's closing delimiter, or to the end
of the current line if the pattern extends onto the next line.  Hence,
this is very much like an ordinary Perl code comment.  (You can include
the closing delimiter within the comment only if you precede it with a
backslash, so be careful!)

Use of C</x> means that if you want real
whitespace or C<"#"> characters in the pattern (outside a bracketed character
class, which is unaffected by C</x>), then you'll either have to
escape them (using backslashes or C<\Q...\E>) or encode them using octal,
hex, or C<\N{}> escapes.
It is ineffective to try to continue a comment onto the next line by
escaping the C<\n> with a backslash or C<\Q>.

You can use L</(?#text)> to create a comment that ends earlier than the
end of the current line, but C<text> also can't contain the closing
delimiter unless escaped with a backslash.

A common pitfall is to forget that C<"#"> characters begin a comment under
C</x> and are not matched literally.  Just keep that in mind when trying
to puzzle out why a particular C</x> pattern isn't working as expected.
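
The pitfall can be sketched like this.  The pattern names
(C<$issue_re>, C<$broken_re>) and the sample strings are hypothetical;
the point is only that an unescaped C<"#"> silently turns the rest of
the pattern into a comment.

```perl
use strict;
use warnings;

# Intended pattern: the word "issue", whitespace, a literal "#", a number.
my $issue_re = qr/
    issue \s+     # literal spaces in the pattern are ignored under x
    \# (\d+)      # the escaped hash sign is matched literally
/x;

# Broken variant: everything from the bare "#" onward is a comment, so
# this pattern is really just  issue\s+  with no capture group at all.
my $broken_re = qr/issue \s+ # (\d+)/x;

my ($number)     = "see issue #42" =~ $issue_re;
my $strict_plain = "issue 42" =~ $issue_re  ? 1 : 0;   # no "#" in target
my $broken_plain = "issue 42" =~ $broken_re ? 1 : 0;   # matches anyway

# Inside a bracketed character class a space is still significant.
my $class_space = "a b" =~ /a [ ] b/x ? 1 : 0;

print "$number $strict_plain $broken_plain $class_space\n";
```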

Starting in Perl v5.26, if the modifier has a second C<"x"> within it,
it does everything that a single C</x> does, but additionally
non-backslashed SPACE and TAB characters within bracketed character
classes are also generally ignored, and hence can be added to make the
classes more readable.

    / [d-e g-i 3-7]/xx
    /[ ! @ " # $ % ^ & * () = ? <> ' ]/xx

may be easier to grasp than the squashed equivalents

    /[d-eg-i3-7]/
    /[!@"#$%^&*()=?<>']/

Taken together, these features go a long way towards
making Perl's regular expressions more readable.  Here's an example:

    # Delete (most) C comments.
    $program =~ s {
	/\*	# Match the opening delimiter.
	.*?	# Match a minimal number of characters.
	\*/	# Match the closing delimiter.
    } []gsx;

Note that anything inside
a C<\Q...\E> stays unaffected by C</x>.  And note that C</x> doesn't affect
space interpretation within a single multi-character construct.  For
example in C<\x{...}>, regardless of the C</x> modifier, there can be no
spaces.  Same for a L<quantifier|/Quantifiers> such as C<{3}> or
C<{5,}>.  Similarly, C<(?:...)> can't have a space between the C<"(">,
C<"?">, and C<":">.  Within any delimiters for such a
construct, allowed spaces are not affected by C</x>, and depend on the
construct.  For example, C<\x{...}> can't have spaces because hexadecimal
numbers don't have spaces in them.  But, Unicode properties can have spaces, so
in C<\p{...}> there can be spaces that follow the Unicode rules, for which see
L<perluniprops/Properties accessible through \p{} and \P{}>.
X</x>

The set of characters that are deemed whitespace are those that Unicode
calls "Pattern White Space", namely:

 U+0009 CHARACTER TABULATION
 U+000A LINE FEED
 U+000B LINE TABULATION
 U+000C FORM FEED
 U+000D CARRIAGE RETURN
 U+0020 SPACE
 U+0085 NEXT LINE
 U+200E LEFT-TO-RIGHT MARK
 U+200F RIGHT-TO-LEFT MARK
 U+2028 LINE SEPARATOR
 U+2029 PARAGRAPH SEPARATOR

=head4 Character set modifiers

C</d>, C</u>, C</a>, and C</l>, available starting in 5.14, are called
the character set modifiers; they affect the character set rules
used for the regular expression.

The C</d>, C</u>, and C</l> modifiers are not likely to be of much use
to you, and so you need not worry about them very much.  They exist for
Perl's internal use, so that complex regular expression data structures
can be automatically serialized and later exactly reconstituted,
including all their nuances.  But, since Perl can't keep a secret, and
there may be rare instances where they are useful, they are documented
here.

The C</a> modifier, on the other hand, may be useful.  Its purpose is to
allow code that is to work mostly on ASCII data to not have to concern
itself with Unicode.

Briefly, C</l> sets the character set to that of whatever B<L>ocale is in
effect at the time of the execution of the pattern match.

C</u> sets the character set to B<U>nicode.

C</a> also sets the character set to Unicode, BUT adds several
restrictions for B<A>SCII-safe matching.

C</d> is the old, problematic, pre-5.14 B<D>efault character set
behavior.  Its only use is to force that old behavior.

At any given time, exactly one of these modifiers is in effect.  Their
existence allows Perl to keep the originally compiled behavior of a
regular expression, regardless of what rules are in effect when it is
actually executed.  And if it is interpolated into a larger regex, the
original's rules continue to apply to it, and only it.

The C</l> and C</u> modifiers are automatically selected for
regular expressions compiled within the scope of various pragmas,
and we recommend that in general, you use those pragmas instead of
specifying these modifiers explicitly.  For one thing, the modifiers
affect only pattern matching, and do not extend to even any replacement
done, whereas using the pragmas gives consistent results for all
appropriate operations within their scopes.  For example,

 s/foo/\Ubar/il

will match "foo" using the locale's rules for case-insensitive matching,
but the C</l> does not affect how the C<\U> operates.  Most likely you
want both of them to use locale rules.  To do this, instead compile the
regular expression within the scope of C<use locale>.  This both
implicitly adds the C</l>, and applies locale rules to the C<\U>.   The
lesson is to C<use locale>, and not C</l> explicitly.

Similarly, it would be better to use C<use feature 'unicode_strings'>
instead of,

 s/foo/\Lbar/iu

to get Unicode rules, as the C<\L> in the former (but not necessarily
the latter) would also use Unicode rules.

More detail on each of the modifiers follows.  Most likely you don't
need to know this detail for C</l>, C</u>, and C</d>, and can skip ahead
to L<E<sol>a|/E<sol>a (and E<sol>aa)>.

=head4 /l

means to use the current locale's rules (see L<perllocale>) when pattern
matching.  For example, C<\w> will match the "word" characters of that
locale, and C<"/i"> case-insensitive matching will match according to
the locale's case folding rules.  The locale used will be the one in
effect at the time of execution of the pattern match.  This may not be
the same as the compilation-time locale, and can differ from one match
to another if there is an intervening call of the
L<setlocale() function|perllocale/The setlocale function>.

Prior to v5.20, Perl did not support multi-byte locales.  Starting then,
UTF-8 locales are supported.  No other multi-byte locales are ever
likely to be supported.  However, in all locales, one can have code
points above 255 and these will always be treated as Unicode no matter
what locale is in effect.

Under Unicode rules, there are a few case-insensitive matches that cross
the 255/256 boundary.  Except for UTF-8 locales in Perls v5.20 and
later, these are disallowed under C</l>.  For example, 0xFF (on ASCII
platforms) does not caselessly match the character at 0x178, C<LATIN
CAPITAL LETTER Y WITH DIAERESIS>, because 0xFF may not be C<LATIN SMALL
LETTER Y WITH DIAERESIS> in the current locale, and Perl has no way of
knowing if that character even exists in the locale, much less what code
point it is.

In a UTF-8 locale in v5.20 and later, the only visible difference
between locale and non-locale in regular expressions should be tainting
(see L<perlsec>).

This modifier may be specified to be the default by C<use locale>, but
see L</Which character set modifier is in effect?>.
X</l>

=head4 /u

means to use Unicode rules when pattern matching.  On ASCII platforms,
this means that the code points between 128 and 255 take on their
Latin-1 (ISO-8859-1) meanings (which are the same as Unicode's).
(Otherwise Perl considers their meanings to be undefined.)  Thus,
under this modifier, the ASCII platform effectively becomes a Unicode
platform; and hence, for example, C<\w> will match any of the more than
100_000 word characters in Unicode.

Unlike most locales, which are specific to a language and country pair,
Unicode classifies all the characters that are letters I<somewhere> in
the world as
C<\w>.  For example, your locale might not think that C<LATIN SMALL
LETTER ETH> is a letter (unless you happen to speak Icelandic), but
Unicode does.  Similarly, all the characters that are decimal digits
somewhere in the world will match C<\d>; this is hundreds, not 10,
possible matches.  And some of those digits look like some of the 10
ASCII digits, but mean a different number, so a human could easily think
a number is a different quantity than it really is.  For example,
C<BENGALI DIGIT FOUR> (U+09EA) looks very much like an
C<ASCII DIGIT EIGHT> (U+0038).  And, C<\d+>, may match strings of digits
that are a mixture from different writing systems, creating a security
issue.  L<Unicode::UCD/num()> can be used to sort
this out.  Or the C</a> modifier can be used to force C<\d> to match
just the ASCII 0 through 9.
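
The digit issue can be sketched as follows, assuming Perl 5.14 or later
for the C</u> and C</a> modifiers.  The variable names are made up for
the example; U+09EA is the C<BENGALI DIGIT FOUR> mentioned above.

```perl
use strict;
use warnings;

my $bengali_four = "\x{09EA}";          # BENGALI DIGIT FOUR
my $mixed        = "4" . $bengali_four; # digits from two writing systems

# Under Unicode rules, \d accepts decimal digits from any script.
my $digit_u = $bengali_four =~ /^\d$/u  ? 1 : 0;
my $mixed_u = $mixed        =~ /^\d+$/u ? 1 : 0;

# Under /a, \d is restricted to the ASCII digits 0 through 9.
my $digit_a = $bengali_four =~ /^\d$/a  ? 1 : 0;
my $mixed_a = $mixed        =~ /^\d+$/a ? 1 : 0;

print "u: $digit_u $mixed_u  a: $digit_a $mixed_a\n";
```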

Also, under this modifier, case-insensitive matching works on the full
set of Unicode
characters.  The C<KELVIN SIGN>, for example matches the letters "k" and
"K"; and C<LATIN SMALL LIGATURE FF> matches the sequence "ff", which,
if you're not prepared, might make it look like a hexadecimal constant,
presenting another potential security issue.  See
L<http://unicode.org/reports/tr36> for a detailed discussion of Unicode
security issues.

This modifier may be specified to be the default by C<use feature
'unicode_strings'>, C<use locale ':not_characters'>, or
C<L<use 5.012|perlfunc/use VERSION>> (or higher),
but see L</Which character set modifier is in effect?>.
X</u>

=head4 /d

This modifier means to use the "Default" native rules of the platform
except when there is cause to use Unicode rules instead, as follows:

=over 4

=item 1

the target string is encoded in UTF-8; or

=item 2

the pattern is encoded in UTF-8; or

=item 3

the pattern explicitly mentions a code point that is above 255 (say by
C<\x{100}>); or

=item 4

the pattern uses a Unicode name (C<\N{...}>);  or

=item 5

the pattern uses a Unicode property (C<\p{...}> or C<\P{...}>); or

=item 6

the pattern uses a Unicode break (C<\b{...}> or C<\B{...}>); or

=item 7

the pattern uses L</C<(?[ ])>>

=back

Another mnemonic for this modifier is "Depends", as the rules actually
used depend on various things, and as a result you can get unexpected
results.  See L<perlunicode/The "Unicode Bug">.  The Unicode Bug has
become rather infamous, leading to yet another (printable) name for this
modifier, "Dodgy".

Unless the pattern or string is encoded in UTF-8, only ASCII characters
can match positively.

Here are some examples of how that works on an ASCII platform:

 $str =  "\xDF";      # $str is not in UTF-8 format.
 $str =~ /^\w/;       # No match, as $str isn't in UTF-8 format.
 $str .= "\x{0e0b}";  # Now $str is in UTF-8 format.
 $str =~ /^\w/;       # Match! $str is now in UTF-8 format.
 chop $str;
 $str =~ /^\w/;       # Still a match! $str remains in UTF-8 format.

This modifier is automatically selected by default when none of the
others are, so yet another name for it is "Default".

Because of the unexpected behaviors associated with this modifier, you
probably should only explicitly use it to maintain weird backward
compatibilities.

=head4 /a (and /aa)

This modifier stands for ASCII-restrict (or ASCII-safe).  This modifier
may be doubled-up to increase its effect.

When it appears singly, it causes the sequences C<\d>, C<\s>, C<\w>, and
the Posix character classes to match only in the ASCII range.  They thus
revert to their pre-5.6, pre-Unicode meanings.  Under C</a>,  C<\d>
always means precisely the digits C<"0"> to C<"9">; C<\s> means the five
characters C<[ \f\n\r\t]>, and starting in Perl v5.18, the vertical tab;
C<\w> means the 63 characters
C<[A-Za-z0-9_]>; and likewise, all the Posix classes such as
C<[[:print:]]> match only the appropriate ASCII-range characters.

This modifier is useful for people who only incidentally use Unicode,
and who do not wish to be burdened with its complexities and security
concerns.

With C</a>, one can write C<\d> with confidence that it will only match
ASCII characters, and should the need arise to match beyond ASCII, you
can instead use C<\p{Digit}> (or C<\p{Word}> for C<\w>).  There are
similar C<\p{...}> constructs that can match beyond ASCII both white
space (see L<perlrecharclass/Whitespace>), and Posix classes (see
L<perlrecharclass/POSIX Character Classes>).  Thus, this modifier
doesn't mean you can't use Unicode, it means that to get Unicode
matching you must explicitly use a construct (C<\p{}>, C<\P{}>) that
signals Unicode.

As you would expect, this modifier causes, for example, C<\D> to mean
the same thing as C<[^0-9]>; in fact, all non-ASCII characters match
C<\D>, C<\S>, and C<\W>.  C<\b> still means to match at the boundary
between C<\w> and C<\W>, using the C</a> definitions of them (similarly
for C<\B>).

Otherwise, C</a> behaves like the C</u> modifier, in that
case-insensitive matching uses Unicode rules; for example, "k" will
match the Unicode C<\N{KELVIN SIGN}> under C</i> matching, and code
points in the Latin1 range, above ASCII, will have Unicode rules when it
comes to case-insensitive matching.

To forbid ASCII/non-ASCII matches (like "k" with C<\N{KELVIN SIGN}>),
specify the C<"a"> twice, for example C</aai> or C</aia>.  (The first
occurrence of C<"a"> restricts the C<\d>, I<etc>., and the second occurrence
adds the C</i> restrictions.)  But, note that code points outside the
ASCII range will use Unicode rules for C</i> matching, so the modifier
doesn't really restrict things to just ASCII; it just forbids the
intermixing of ASCII and non-ASCII.
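
A hedged sketch of the difference between C</a> and C</aa> under
case-insensitive matching, assuming Perl 5.14 or later.  The variable
names are illustrative; C<"\x{212A}"> is the C<KELVIN SIGN>, and
C<"\x{E9}"> and C<\x{C9}> are the lowercase and uppercase forms of
"e with acute" on an ASCII platform.

```perl
use strict;
use warnings;

my $kelvin = "\x{212A}";    # KELVIN SIGN

# With no "a", or a single "a", Unicode case folding lets "k" match it.
my $plain_i  = $kelvin =~ /^k$/i   ? 1 : 0;
my $single_a = $kelvin =~ /^k$/ai  ? 1 : 0;

# Doubling the "a" forbids an ASCII character matching a non-ASCII one.
my $double_a = $kelvin =~ /^k$/aai ? 1 : 0;

# Two non-ASCII characters may still match each other caselessly.
my $non_ascii_pair = "\x{E9}" =~ /^\x{C9}$/aai ? 1 : 0;

print "$plain_i $single_a $double_a $non_ascii_pair\n";
```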

To summarize, this modifier provides protection for applications that
don't wish to be exposed to all of Unicode.  Specifying it twice
gives added protection.

This modifier may be specified to be the default by C<use re '/a'>
or C<use re '/aa'>.  If you do so, you may actually have occasion to use
the C</u> modifier explicitly if there are a few regular expressions
where you do want full Unicode rules (but even here, it's best if
everything were under feature C<"unicode_strings">, along with the
C<use re '/aa'>).  Also see L</Which character set modifier is in
effect?>.
X</a>
X</aa>

=head4 Which character set modifier is in effect?

Which of these modifiers is in effect at any given point in a regular
expression depends on a fairly complex set of interactions.  These have
been designed so that in general you don't have to worry about it, but
this section gives the gory details.  As
explained below in L</Extended Patterns> it is possible to explicitly
specify modifiers that apply only to portions of a regular expression.
The innermost always has priority over any outer ones, and one applying
to the whole expression has priority over any of the default settings that are
described in the remainder of this section.

The C<L<use re 'E<sol>foo'|re/"'/flags' mode">> pragma can be used to set
default modifiers (including these) for regular expressions compiled
within its scope.  This pragma has precedence over the other pragmas
listed below that also change the defaults.

Otherwise, C<L<use locale|perllocale>> sets the default modifier to C</l>;
and C<L<use feature 'unicode_strings|feature>>, or
C<L<use 5.012|perlfunc/use VERSION>> (or higher) set the default to
C</u> when not in the same scope as either C<L<use locale|perllocale>>
or C<L<use bytes|bytes>>.
(C<L<use locale ':not_characters'|perllocale/Unicode and UTF-8>> also
sets the default to C</u>, overriding any plain C<use locale>.)
Unlike the mechanisms mentioned above, these
affect operations besides regular expression pattern matching, and so
give more consistent results with other operators, including using
C<\U>, C<\l>, I<etc>. in substitution replacements.

If none of the above apply, for backwards compatibility reasons, the
C</d> modifier is the one in effect by default.  As this can lead to
unexpected results, it is best to specify which other rule set should be
used.
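
One such surprise can be sketched with explicit modifiers, so that the
outcome does not depend on whichever pragmas happen to be in scope.
This assumes an ASCII platform and a string that Perl is not storing
internally as UTF-8:

    my $e_acute = "\xE9";   # LATIN SMALL LETTER E WITH ACUTE, as a native byte

    print "/d: word character\n" if $e_acute =~ /\w/d;   # not printed
    print "/u: word character\n" if $e_acute =~ /\w/u;   # printed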

=head4 Character set modifier behavior prior to Perl 5.14

Prior to 5.14, there were no explicit modifiers, but C</l> was implied
for regexes compiled within the scope of C<use locale>, and C</d> was
implied otherwise.  However, interpolating a regex into a larger regex
would ignore the original compilation in favor of whatever was in effect
at the time of the second compilation.  There were a number of
inconsistencies (bugs) with the C</d> modifier, where Unicode rules
would be used when inappropriate, and vice versa.  C<\p{}> did not imply
Unicode rules, and neither did all occurrences of C<\N{}>, until 5.12.

=head2 Regular Expressions

=head3 Quantifiers

Quantifiers are used when a particular portion of a pattern needs to
match a certain number (or numbers) of times.  If there isn't a
quantifier, the number of times to match is exactly one.  The following
standard quantifiers are recognized:
X<metacharacter> X<quantifier> X<*> X<+> X<?> X<{n}> X<{n,}> X<{n,m}>

    *           Match 0 or more times
    +           Match 1 or more times
    ?           Match 1 or 0 times
    {n}         Match exactly n times
    {n,}        Match at least n times
    {n,m}       Match at least n but not more than m times

(If a non-escaped curly bracket occurs in a context other than one of
the quantifiers listed above, where it does not form part of a
backslashed sequence like C<\x{...}>, it is either a fatal syntax error,
or treated as a regular character, generally with a deprecation warning
raised.  To escape it, you can precede it with a backslash (C<"\{">) or
enclose it within square brackets (C<"[{]">).
This change will allow for future syntax extensions (like making the
lower bound of a quantifier optional), and better error checking of
quantifiers.)

The C<"*"> quantifier is equivalent to C<{0,}>, the C<"+">
quantifier to C<{1,}>, and the C<"?"> quantifier to C<{0,1}>.  I<n> and I<m> are limited
to non-negative integral values less than a preset limit defined when perl is built.
This is usually 32766 on the most common platforms.  The actual limit can
be seen in the error message generated by code such as this:

    $_ **= $_ , / {$_} / for 2 .. 42;

By default, a quantified subpattern is "greedy", that is, it will match as
many times as possible (given a particular starting location) while still
allowing the rest of the pattern to match.  If you want it to match the
minimum number of times possible, follow the quantifier with a C<"?">.  Note
that the meanings don't change, just the "greediness":
X<metacharacter> X<greedy> X<greediness>
X<?> X<*?> X<+?> X<??> X<{n}?> X<{n,}?> X<{n,m}?>

    *?        Match 0 or more times, not greedily
    +?        Match 1 or more times, not greedily
    ??        Match 0 or 1 time, not greedily
    {n}?      Match exactly n times, not greedily (redundant)
    {n,}?     Match at least n times, not greedily
    {n,m}?    Match at least n but not more than m times, not greedily
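
A small sketch of the difference, using a made-up sample string:

    my $markup = '<b>bold</b>';

    my ($greedy) = $markup =~ /<(.+)>/;    # captures 'b>bold</b'
    my ($lazy)   = $markup =~ /<(.+?)>/;   # captures 'b'

Both patterns start matching at the first C<< "<" >>; only the amount
consumed by the quantified C<"."> differs.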

Normally when a quantified subpattern does not allow the rest of the
overall pattern to match, Perl will backtrack. However, this behaviour is
sometimes undesirable. Thus Perl provides the "possessive" quantifier form
as well.

 *+     Match 0 or more times and give nothing back
 ++     Match 1 or more times and give nothing back
 ?+     Match 0 or 1 time and give nothing back
 {n}+   Match exactly n times and give nothing back (redundant)
 {n,}+  Match at least n times and give nothing back
 {n,m}+ Match at least n but not more than m times and give nothing back

For instance,

   'aaaa' =~ /a++a/

will never match, as the C<a++> will gobble up all the C<"a">'s in the
string and won't leave any for the remaining part of the pattern. This
feature can be extremely useful to give perl hints about where it
shouldn't backtrack. For instance, the typical "match a double-quoted
string" problem can be most efficiently performed when written as:

   /"(?:[^"\\]++|\\.)*+"/

as we know that if the final quote does not match, backtracking will not
help. See the independent subexpression
L</C<< (?>pattern) >>> for more details;
possessive quantifiers are just syntactic sugar for that construct. For
instance the above example could also be written as follows:

   /"(?>(?:(?>[^"\\]+)|\\.)*)"/
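
The equivalence can be checked with a quick sketch on the earlier
C<'aaaa'> example:

    print "a+a matches\n"       if     'aaaa' =~ /a+a/;       # printed
    print "a++a fails\n"        unless 'aaaa' =~ /a++a/;      # printed
    print "(?>a+)a fails too\n" unless 'aaaa' =~ /(?>a+)a/;   # printed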

Note that the possessive quantifier modifier cannot be combined
with the non-greedy modifier, because the combination would make no sense.
Consider the following equivalency table:

    Illegal         Legal
    ------------    ------
    X??+            X{0}
    X+?+            X{1}
    X{min,max}?+    X{min}

=head3 Escape sequences

Because patterns are processed as double-quoted strings, the following
also work:

 \t          tab                   (HT, TAB)
 \n          newline               (LF, NL)
 \r          return                (CR)
 \f          form feed             (FF)
 \a          alarm (bell)          (BEL)
 \e          escape (think troff)  (ESC)
 \cK         control char          (example: VT)
 \x{}, \x00  character whose ordinal is the given hexadecimal number
 \N{name}    named Unicode character or character sequence
 \N{U+263D}  Unicode character     (example: FIRST QUARTER MOON)
 \o{}, \000  character whose ordinal is the given octal number
 \l          lowercase next char (think vi)
 \u          uppercase next char (think vi)
 \L          lowercase until \E (think vi)
 \U          uppercase until \E (think vi)
 \Q          quote (disable) pattern metacharacters until \E
 \E          end either case modification or quoted section, think vi

Details are in L<perlop/Quote and Quote-like Operators>.

=head3 Character Classes and other Special Escapes

In addition, Perl defines the following:
X<\g> X<\k> X<\K> X<backreference>

 Sequence   Note    Description
  [...]     [1]  Match a character according to the rules of the
                   bracketed character class defined by the "...".
                   Example: [a-z] matches "a" or "b" or "c" ... or "z"
  [[:...:]] [2]  Match a character according to the rules of the POSIX
                   character class "..." within the outer bracketed
                   character class.  Example: [[:upper:]] matches any
                   uppercase character.
  (?[...])  [8]  Extended bracketed character class
  \w        [3]  Match a "word" character (alphanumeric plus "_", plus
                   other connector punctuation chars plus Unicode
                   marks)
  \W        [3]  Match a non-"word" character
  \s        [3]  Match a whitespace character
  \S        [3]  Match a non-whitespace character
  \d        [3]  Match a decimal digit character
  \D        [3]  Match a non-digit character
  \pP       [3]  Match P, named property.  Use \p{Prop} for longer names
  \PP       [3]  Match non-P
  \X        [4]  Match Unicode "eXtended grapheme cluster"
  \1        [5]  Backreference to a specific capture group or buffer.
                   '1' may actually be any positive integer.
  \g1       [5]  Backreference to a specific or previous group,
  \g{-1}    [5]  The number may be negative indicating a relative
                   previous group and may optionally be wrapped in
                   curly brackets for safer parsing.
  \g{name}  [5]  Named backreference
  \k<name>  [5]  Named backreference
  \K        [6]  Keep the stuff left of the \K, don't include it in $&
  \N        [7]  Any character but \n.  Not affected by /s modifier
  \v        [3]  Vertical whitespace
  \V        [3]  Not vertical whitespace
  \h        [3]  Horizontal whitespace
  \H        [3]  Not horizontal whitespace
  \R        [4]  Linebreak

=over 4

=item [1]

See L<perlrecharclass/Bracketed Character Classes> for details.

=item [2]

See L<perlrecharclass/POSIX Character Classes> for details.

=item [3]

See L<perlrecharclass/Backslash sequences> for details.

=item [4]

See L<perlrebackslash/Misc> for details.

=item [5]

See L</Capture groups> below for details.

=item [6]

See L</Extended Patterns> below for details.

=item [7]

Note that C<\N> has two meanings.  When of the form C<\N{NAME}>, it matches the
character or character sequence whose name is C<NAME>; and similarly
when of the form C<\N{U+I<hex>}>, it matches the character whose Unicode
code point is I<hex>.  Otherwise it matches any character but C<\n>.

=item [8]

See L<perlrecharclass/Extended Bracketed Character Classes> for details.

=back

=head3 Assertions

Besides L<C<"^"> and C<"$">|/Metacharacters>, Perl defines the following
zero-width assertions:
X<zero-width assertion> X<assertion> X<regex, zero-width assertion>
X<regexp, zero-width assertion>
X<regular expression, zero-width assertion>
X<\b> X<\B> X<\A> X<\Z> X<\z> X<\G>

 \b{}   Match at Unicode boundary of specified type
 \B{}   Match where corresponding \b{} doesn't match
 \b     Match a \w\W or \W\w boundary
 \B     Match except at a \w\W or \W\w boundary
 \A     Match only at beginning of string
 \Z     Match only at end of string, or before newline at the end
 \z     Match only at end of string
 \G     Match only at pos() (e.g. at the end-of-match position
        of prior m//g)

A Unicode boundary (C<\b{}>), available starting in v5.22, is a spot
between two characters, or before the first character in the string, or
after the final character in the string where certain criteria defined
by Unicode are met.  See L<perlrebackslash/\b{}, \b, \B{}, \B> for
details.

A word boundary (C<\b>) is a spot between two characters
that has a C<\w> on one side of it and a C<\W> on the other side
of it (in either order), counting the imaginary characters off the
beginning and end of the string as matching a C<\W>.  (Within
character classes C<\b> represents backspace rather than a word
boundary, just as it normally does in any double-quoted string.)
The C<\A> and C<\Z> are just like C<"^"> and C<"$">, except that they
won't match multiple times when the C</m> modifier is used, while
C<"^"> and C<"$"> will match at every internal line boundary.  To match
the actual end of the string and not ignore an optional trailing
newline, use C<\z>.
X<\b> X<\A> X<\Z> X<\z> X</m>
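
The following sketch, on an illustrative two-line string, shows how
these anchors differ:

    my $text = "one\ntwo\n";

    print "\$ matches\n"  if $text =~ /two$/;    # printed
    print "\\Z matches\n" if $text =~ /two\Z/;   # printed
    print "\\z matches\n" if $text =~ /two\z/;   # not printed

    my @line_starts   = $text =~ /^(\w+)/mg;    # ('one', 'two')
    my @string_starts = $text =~ /\A(\w+)/mg;   # ('one')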

The C<\G> assertion can be used to chain global matches (using
C<m//g>), as described in L<perlop/"Regexp Quote-Like Operators">.
It is also useful when writing C<lex>-like scanners, when you have
several patterns that you want to match against successive substrings
of your string; see the previous reference.  The actual location
where C<\G> will match can also be influenced by using C<pos()> as
an lvalue: see L<perlfunc/pos>. Note that the rule for zero-length
matches (see L</"Repeated Patterns Matching a Zero-length Substring">)
is modified somewhat, in that contents to the left of C<\G> are
not counted when determining the length of the match. Thus the following
will not match forever:
X<\G>

     my $string = 'ABC';
     pos($string) = 1;
     while ($string =~ /(.\G)/g) {
         print $1;
     }

It will print 'A' and then terminate, as it considers the match to
be zero-width, and thus will not match at the same position twice in a
row.

It is worth noting that C<\G> improperly used can result in an infinite
loop. Take care when using patterns that include C<\G> in an alternation.

Note also that C<s///> will refuse to overwrite part of a substitution
that has already been replaced; so for example this will stop after the
first iteration, rather than iterating its way backwards through the
string:

    $_ = "123456789";
    pos = 6;
    s/.(?=.\G)/X/g;
    print; 	# prints 1234X6789, not XXXXX6789


=head3 Capture groups

The grouping construct C<( ... )> creates capture groups (also referred to as
capture buffers). To refer to the current contents of a group later on, within
the same pattern, use C<\g1> (or C<\g{1}>) for the first, C<\g2> (or C<\g{2}>)
for the second, and so on.
This is called a I<backreference>.
X<regex, capture buffer> X<regexp, capture buffer>
X<regex, capture group> X<regexp, capture group>
X<regular expression, capture buffer> X<backreference>
X<regular expression, capture group> X<backreference>
X<\g{1}> X<\g{-1}> X<\g{name}> X<relative backreference> X<named backreference>
X<named capture buffer> X<regular expression, named capture buffer>
X<named capture group> X<regular expression, named capture group>
X<%+> X<$+{name}> X<< \k<name> >>
There is no limit to the number of captured substrings that you may use.
Groups are numbered with the leftmost open parenthesis being number 1, I<etc>.  If
a group did not match, the associated backreference won't match either. (This
can happen if the group is optional, or in a different branch of an
alternation.)
You can omit the C<"g">, and write C<"\1">, I<etc>, but there are some issues with
this form, described below.

You can also refer to capture groups relatively, by using a negative number, so
that C<\g-1> and C<\g{-1}> both refer to the immediately preceding capture
group, and C<\g-2> and C<\g{-2}> both refer to the group before it.  For
example:

        /
         (Y)            # group 1
         (              # group 2
            (X)         # group 3
            \g{-1}      # backref to group 3
            \g{-3}      # backref to group 1
         )
        /x

would match the same as C</(Y) ( (X) \g3 \g1 )/x>.  This allows you to
interpolate regexes into larger regexes and not have to worry about the
capture groups being renumbered.
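
A short sketch of why this matters when interpolating; the variable
names are assumptions made for the example:

    my $doubled_rel = qr/(\w)\g{-1}/;   # "the group just before"
    my $doubled_abs = qr/(\w)\g{1}/;    # always group 1 of the final pattern

    # Inside the larger pattern, (\w) becomes group 3.
    print "relative form matches\n"
        if 'xyzzy' =~ /(x)(y)$doubled_rel/;   # printed
    print "absolute form matches\n"
        if 'xyzzy' =~ /(x)(y)$doubled_abs/;   # not printed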

You can dispense with numbers altogether and create named capture groups.
The notation is C<(?E<lt>I<name>E<gt>...)> to declare and C<\g{I<name>}> to
reference.  (To be compatible with .NET regular expressions, C<\g{I<name>}> may
also be written as C<\k{I<name>}>, C<\kE<lt>I<name>E<gt>> or C<\k'I<name>'>.)
I<name> must not begin with a number, nor contain hyphens.
When different groups within the same pattern have the same name, any reference
to that name assumes the leftmost defined group.  Named groups count in
absolute and relative numbering, and so can also be referred to by those
numbers.
(It's possible to do things with named capture groups that would otherwise
require C<(??{})>.)

Capture group contents are dynamically scoped and available to you outside the
pattern until the end of the enclosing block or until the next successful
match, whichever comes first.  (See L<perlsyn/"Compound Statements">.)
You can refer to them by absolute number (using C<"$1"> instead of C<"\g1">,
I<etc>); or by name via the C<%+> hash, using C<"$+{I<name>}">.
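
For instance, in this sketch (the group names are chosen for
illustration) the same captures are read both ways:

    if ('2024-05-17' =~ /(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)/) {
        print "by name:   $+{year}/$+{month}/$+{day}\n";   # 2024/05/17
        print "by number: $1/$2/$3\n";                     # 2024/05/17
    }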

Braces are required in referring to named capture groups, but are optional for
absolute or relative numbered ones.  Braces are safer when creating a regex by
concatenating smaller strings.  For example if you have C<qr/$a$b/>, and C<$a>
contained C<"\g1">, and C<$b> contained C<"37">, you would get C</\g137/> which
is probably not what you intended.

The C<\g> and C<\k> notations were introduced in Perl 5.10.0.  Prior to that
there were neither named nor relative numbered capture groups.  Absolute numbered
groups were referred to using C<\1>,
C<\2>, I<etc>., and this notation is still
accepted (and likely always will be).  But it leads to some ambiguities if
there are more than 9 capture groups, as C<\10> could mean either the tenth
capture group, or the character whose ordinal in octal is 010 (a backspace in
ASCII).  Perl resolves this ambiguity by interpreting C<\10> as a backreference
only if at least 10 left parentheses have opened before it.  Likewise C<\11> is
a backreference only if at least 11 left parentheses have opened before it.
And so on.  C<\1> through C<\9> are always interpreted as backreferences.
There are several examples below that illustrate these perils.  You can avoid
the ambiguity by always using C<\g{}> or C<\g> if you mean capturing groups;
and for octal constants always using C<\o{}>, or for C<\077> and below, using 3
digits padded with leading zeros, since a leading zero implies an octal
constant.

The C<\I<digit>> notation also works in certain circumstances outside
the pattern.  See L</Warning on \1 Instead of $1> below for details.

Examples:

    s/^([^ ]*) *([^ ]*)/$2 $1/;     # swap first two words

    /(.)\g1/                        # find first doubled char
         and print "'$1' is the first doubled character\n";

    /(?<char>.)\k<char>/            # ... a different way
         and print "'$+{char}' is the first doubled character\n";

    /(?'char'.)\g1/                 # ... mix and match
         and print "'$1' is the first doubled character\n";

    if (/Time: (..):(..):(..)/) {   # parse out values
        $hours = $1;
        $minutes = $2;
        $seconds = $3;
    }

    /(.)(.)(.)(.)(.)(.)(.)(.)(.)\g10/   # \g10 is a backreference
    /(.)(.)(.)(.)(.)(.)(.)(.)(.)\10/    # \10 is octal
    /((.)(.)(.)(.)(.)(.)(.)(.)(.))\10/  # \10 is a backreference
    /((.)(.)(.)(.)(.)(.)(.)(.)(.))\010/ # \010 is octal

    $a = '(.)\1';        # Creates problems when concatenated.
    $b = '(.)\g{1}';     # Avoids the problems.
    "aa" =~ /${a}/;      # True
    "aa" =~ /${b}/;      # True
    "aa0" =~ /${a}0/;    # False!
    "aa0" =~ /${b}0/;    # True
    "aa\x08" =~ /${a}0/;  # True!
    "aa\x08" =~ /${b}0/;  # False

Several special variables also refer back to portions of the previous
match.  C<$+> returns whatever the last bracket match matched.
C<$&> returns the entire matched string.  (At one point C<$0> did
also, but now it returns the name of the program.)  C<$`> returns
everything before the matched string.  C<$'> returns everything
after the matched string. And C<$^N> contains whatever was matched by
the most-recently closed group (submatch). C<$^N> can be used in
extended patterns (see below), for example to assign a submatch to a
variable.
X<$+> X<$^N> X<$&> X<$`> X<$'>

These special variables, like the C<%+> hash and the numbered match variables
(C<$1>, C<$2>, C<$3>, I<etc>.) are dynamically scoped
until the end of the enclosing block or until the next successful
match, whichever comes first.  (See L<perlsyn/"Compound Statements">.)
X<$+> X<$^N> X<$&> X<$`> X<$'>
X<$1> X<$2> X<$3> X<$4> X<$5> X<$6> X<$7> X<$8> X<$9>

B<NOTE>: Failed matches in Perl do not reset the match variables,
which makes it easier to write code that tests for a series of more
specific cases and remembers the best match.
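
A minimal sketch of this behaviour:

    my $text = 'abc';

    $text =~ /(b)/;          # succeeds, sets $1 to 'b'
    $text =~ /(z)/;          # fails, leaves $1 alone
    print "still '$1'\n";    # still 'b'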

B<WARNING>: If your code is to run on Perl 5.16 or earlier,
beware that once Perl sees that you need one of C<$&>, C<$`>, or
C<$'> anywhere in the program, it has to provide them for every
pattern match.  This may substantially slow your program.

Perl uses the same mechanism to produce C<$1>, C<$2>, I<etc>, so you also
pay a price for each pattern that contains capturing parentheses.
(To avoid this cost while retaining the grouping behaviour, use the
extended regular expression C<(?: ... )> instead.)  But if you never
use C<$&>, C<$`> or C<$'>, then patterns I<without> capturing
parentheses will not be penalized.  So avoid C<$&>, C<$'>, and C<$`>
if you can, but if you can't (and some algorithms really appreciate
them), once you've used them once, use them at will, because you've
already paid the price.
X<$&> X<$`> X<$'>

Perl 5.16 introduced a slightly more efficient mechanism that notes
separately whether each of C<$`>, C<$&>, and C<$'> has been seen, and
thus may only need to copy part of the string.  Perl 5.20 introduced a
much more efficient copy-on-write mechanism which eliminates any slowdown.

As another workaround for this problem, Perl 5.10.0 introduced C<${^PREMATCH}>,
C<${^MATCH}> and C<${^POSTMATCH}>, which are equivalent to C<$`>, C<$&>
and C<$'>, B<except> that they are only guaranteed to be defined after a
successful match that was executed with the C</p> (preserve) modifier.
The use of these variables incurs no global performance penalty, unlike
their punctuation character equivalents; the trade-off is that you
have to tell perl when you want to use them.  As of Perl 5.20, these three
variables are equivalent to C<$`>, C<$&> and C<$'>, and C</p> is ignored.
X</p> X<p modifier>
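
A brief sketch of the C</p> variables in use:

    if ('hello world' =~ /o w/p) {
        print "${^PREMATCH}|${^MATCH}|${^POSTMATCH}\n";   # hell|o w|orld
    }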

=head2 Quoting metacharacters

Backslashed metacharacters in Perl are alphanumeric, such as C<\b>,
C<\w>, C<\n>.  Unlike some other regular expression languages, there
are no backslashed symbols that aren't alphanumeric.  So anything
that looks like C<\\>, C<\(>, C<\)>, C<\[>, C<\]>, C<\{>, or C<\}> is
always
interpreted as a literal character, not a metacharacter.  This was
once used in a common idiom to disable or quote the special meanings
of regular expression metacharacters in a string that you want to
use for a pattern. Simply quote all non-"word" characters:

    $pattern =~ s/(\W)/\\$1/g;

(If C<use locale> is set, then this depends on the current locale.)
Today it is more common to use the C<L<quotemeta()|perlfunc/quotemeta>>
function or the C<\Q> metaquoting escape sequence to disable all
metacharacters' special meanings like this:

    /$unquoted\Q$quoted\E$unquoted/

Beware that if you put literal backslashes (those not inside
interpolated variables) between C<\Q> and C<\E>, double-quotish
backslash interpolation may lead to confusing results.  If you
I<need> to use literal backslashes within C<\Q...\E>,
consult L<perlop/"Gory details of parsing quoted constructs">.

C<quotemeta()> and C<\Q> are fully described in L<perlfunc/quotemeta>.
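
As a short sketch, with a sample string chosen to contain
metacharacters:

    my $literal = 'a.b*c';

    print quotemeta($literal), "\n";                            # a\.b\*c
    print "unquoted matches\n" if 'axbbc' =~ /^$literal$/;      # printed
    print "quoted matches\n"   if 'axbbc' =~ /^\Q$literal\E$/;  # not printed
    print "literal matches\n"  if 'a.b*c' =~ /^\Q$literal\E$/;  # printed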

=head2 Extended Patterns

Perl also defines a consistent extension syntax for features not
found in standard tools like B<awk> and
B<lex>.  The syntax for most of these is a
pair of parentheses with a question mark as the first thing within
the parentheses.  The character after the question mark indicates
the extension.

A question mark was chosen for this and for the minimal-matching
construct because 1) question marks are rare in older regular
expressions, and 2) whenever you see one, you should stop and
"question" exactly what is going on.  That's psychology....

=over 4

=item C<(?#text)>
X<(?#)>

A comment.  The text is ignored.
Note that Perl closes
the comment as soon as it sees a C<")">, so there is no way to put a literal
C<")"> in the comment.  The pattern's closing delimiter must be escaped by
a backslash if it appears in the comment.

See L</E<sol>x> for another way to have comments in patterns.

Note that a comment can go just about anywhere, except in the middle of
an escape sequence.   Examples:

 qr/foo(?#comment)bar/  # Matches 'foobar'

 # The pattern below matches 'abcd', 'abccd', or 'abcccd'
 qr/abc(?#comment between literal and its quantifier){1,3}d/

 # The pattern below generates a syntax error, because the '\p' must
 # be followed immediately by a '{'.
 qr/\p(?#comment between \p and its property name){Any}/

 # The pattern below generates a syntax error, because the initial
 # '\(' is a literal opening parenthesis, and so there is nothing
 # for the closing ')' to match
 qr/\(?#the backslash means this isn't a comment)p{Any}/

=item C<(?adlupimnsx-imnsx)>

=item C<(?^alupimnsx)>
X<(?)> X<(?^)>

One or more embedded pattern-match modifiers, to be turned on (or
turned off if preceded by C<"-">) for the remainder of the pattern or
the remainder of the enclosing pattern group (if any).

This is particularly useful for dynamically-generated patterns,
such as those read in from a
configuration file, taken from an argument, or specified in a table
somewhere.  Consider the case where some patterns want to be
case-sensitive and some do not:  The case-insensitive ones merely need to
include C<(?i)> at the front of the pattern.  For example:

    $pattern = "foobar";
    if ( /$pattern/i ) { }

    # more flexible:

    $pattern = "(?i)foobar";
    if ( /$pattern/ ) { }

These modifiers are restored at the end of the enclosing group. For example,

    ( (?i) blah ) \s+ \g1

will match C<blah> in any case, some spaces, and an exact (I<including the case>!)
repetition of the previous word, assuming the C</x> modifier, and no C</i>
modifier outside this group.
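
A quick sketch of that scoping, with sample strings made up for the
purpose:

    print "same case repeated\n"
        if 'BLAH BLAH' =~ /( (?i) blah ) \s+ \g1/x;   # printed
    print "different case repeated\n"
        if 'BLAH blah' =~ /( (?i) blah ) \s+ \g1/x;   # not printed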

These modifiers do not carry over into named subpatterns called in the
enclosing group. In other words, a pattern such as C<((?i)(?&NAME))> does not
change the case-sensitivity of the C<"NAME"> pattern.

A modifier is overridden by later occurrences of this construct in the
same scope containing the same modifier, so that

    /((?im)foo(?-m)bar)/

matches all of C<foobar> case insensitively, but uses C</m> rules for
only the C<foo> portion.  The C<"a"> flag overrides C<aa> as well;
likewise C<aa> overrides C<"a">.  The same goes for C<"x"> and C<xx>.
Hence, in

    /(?-x)foo/xx

both C</x> and C</xx> are turned off during matching C<foo>.  And in

    /(?x)foo/x

C</x> but NOT C</xx> is turned on for matching C<foo>.  (One might
mistakenly think that since the inner C<(?x)> is already in the scope of
C</x>, that the result would effectively be the sum of them, yielding
C</xx>.  It doesn't work that way.)  Similarly, doing something like
C<(?xx-x)foo> turns off all C<"x"> behavior for matching C<foo>; it is not
that you subtract 1 C<"x"> from 2 to get 1 C<"x"> remaining.

Any of these modifiers can be set to apply globally to all regular
expressions compiled within the scope of a C<use re>.  See
L<re/"'/flags' mode">.

Starting in Perl 5.14, a C<"^"> (caret or circumflex accent) immediately
after the C<"?"> is a shorthand equivalent to C<d-imnsx>.  Flags (except
C<"d">) may follow the caret to override it.
But a minus sign is not legal with it.

Note that the C<"a">, C<"d">, C<"l">, C<"p">, and C<"u"> modifiers are special in
that they can only be enabled, not disabled, and the C<"a">, C<"d">, C<"l">, and
C<"u"> modifiers are mutually exclusive: specifying one de-specifies the
others, and a maximum of one (or two C<"a">'s) may appear in the
construct.  Thus, for
example, C<(?-p)> will warn when compiled under C<use warnings>;
C<(?-d:...)> and C<(?dl:...)> are fatal errors.

Note also that the C<"p"> modifier is special in that its presence
anywhere in a pattern has a global effect.

=item C<(?:pattern)>
X<(?:)>

=item C<(?adluimnsx-imnsx:pattern)>

=item C<(?^aluimnsx:pattern)>
X<(?^:)>

This is for clustering, not capturing; it groups subexpressions like
C<"()">, but doesn't make backreferences as C<"()"> does.  So

    @fields = split(/\b(?:a|b|c)\b/)

matches the same field delimiters as

    @fields = split(/\b(a|b|c)\b/)

but doesn't spit out the delimiters themselves as extra fields (even though
that's the behaviour of L<perlfunc/split> when its pattern contains capturing
groups).  It's also cheaper not to capture
characters if you don't need to.
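
For example, with an illustrative input line:

    my $line = 'x a y b z';

    my @plain    = split /\b(?:a|b|c)\b/, $line;
    # ('x ', ' y ', ' z')

    my @captured = split /\b(a|b|c)\b/, $line;
    # ('x ', 'a', ' y ', 'b', ' z')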

Any letters between C<"?"> and C<":"> act as flag modifiers, as with
C<(?adluimnsx-imnsx)>.  For example,

    /(?s-i:more.*than).*million/i

is equivalent to the more verbose

    /(?:(?s-i)more.*than).*million/i

Note that any C<()> constructs enclosed within this one will still
capture unless the C</n> modifier is in effect.

Like the L</(?adlupimnsx-imnsx)> construct, C<aa> and C<"a"> override each
other, as do C<xx> and C<"x">.  They are not additive.  So, doing
something like C<(?xx-x:foo)> turns off all C<"x"> behavior for matching
C<foo>.

Starting in Perl 5.14, a C<"^"> (caret or circumflex accent) immediately
after the C<"?"> is a shorthand equivalent to C<d-imnsx>.  Any positive
flags (except C<"d">) may follow the caret, so

    (?^x:foo)

is equivalent to

    (?x-imns:foo)

The caret tells Perl that this cluster doesn't inherit the flags of any
surrounding pattern, but uses the system defaults (C<d-imnsx>),
modified by any flags specified.

The caret allows for simpler stringification of compiled regular
expressions.  These look like

    (?^:pattern)

with any non-default flags appearing between the caret and the colon.
A test that looks at such stringification thus doesn't need to have the
system default flags hard-coded in it, just the caret.  If new flags are
added to Perl, the meaning of the caret's expansion will change to include
the default for those flags, so the test will still work, unchanged.
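
A small sketch of such a stringification, assuming Perl 5.14 or later
and no pragma in scope that changes the default flags (under
C<use feature 'unicode_strings'>, for example, a C<"u"> would also
appear):

    my $re = qr/foo/i;
    print "$re\n";    # (?^i:foo)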

Specifying a negative flag after the caret is an error, as the flag is
redundant.

Mnemonic for C<(?^...)>:  A fresh beginning since the usual use of a caret is
to match at the beginning.

=item C<(?|pattern)>
X<(?|)> X<Branch reset>

This is the "branch reset" pattern, which has the special property
that the capture groups are numbered from the same starting point
in each alternation branch. It is available starting from perl 5.10.0.

Capture groups are numbered from left to right, but inside this
construct the numbering is restarted for each branch.

The numbering within each branch will be as normal, and any groups
following this construct will be numbered as though the construct
contained only one branch, that being the one with the most capture
groups in it.

This construct is useful when you want to capture one of a
number of alternative matches.

Consider the following pattern.  The numbers underneath show in
which group the captured content will be stored.


    # before  ---------------branch-reset----------- after
    / ( a )  (?| x ( y ) z | (p (q) r) | (t) u (v) ) ( z ) /x
    # 1            2         2  3        2     3     4

Be careful when using the branch reset pattern in combination with
named captures. Named captures are implemented as being aliases to
numbered groups holding the captures, and that interferes with the
implementation of the branch reset pattern. If you are using named
captures in a branch reset pattern, it's best to use the same names,
in the same order, in each of the alternations:

   /(?|  (?<a> x ) (?<b> y )
      |  (?<a> z ) (?<b> w )) /x

Not doing so may lead to surprises:

  "12" =~ /(?| (?<a> \d+ ) | (?<b> \D+))/x;
  say $+{a};    # Prints '12'
  say $+{b};    # *Also* prints '12'.

The problem here is that both the group named C<< a >> and the group
named C<< b >> are aliases for the group belonging to C<< $1 >>.
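
For the unnamed case, a minimal sketch (the input strings are invented
for the example) shows the value landing in C<$1> whichever branch
matches:

    for my $setting ('key=alpha', 'name:beta') {
        if ($setting =~ /(?| key=(\w+) | name:(\w+) )/x) {
            print "value: $1\n";   # alpha, then beta
        }
    }

Without the branch reset, the second alternative would capture into
C<$2> instead.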

=item Lookaround Assertions
X<look-around assertion> X<lookaround assertion> X<look-around> X<lookaround>

Lookaround assertions are zero-width patterns which match a specific
pattern without including it in C<$&>. Positive assertions match when
their subpattern matches; negative assertions match when their subpattern
fails. Lookbehind matches text up to the current match position;
lookahead matches text following the current match position.

=over 4

=item C<(?=pattern)>
X<(?=)> X<look-ahead, positive> X<lookahead, positive>

A zero-width positive lookahead assertion.  For example, C</\w+(?=\t)/>
matches a word followed by a tab, without including the tab in C<$&>.

=item C<(?!pattern)>
X<(?!)> X<look-ahead, negative> X<lookahead, negative>

A zero-width negative lookahead assertion.  For example, C</foo(?!bar)/>
matches any occurrence of "foo" that isn't followed by "bar".  Note
however that lookahead and lookbehind are NOT the same thing.  You cannot
use this for lookbehind.

If you are looking for a "bar" that isn't preceded by a "foo", C</(?!foo)bar/>
will not do what you want.  That's because the C<(?!foo)> is just saying that
the next thing cannot be "foo"--and it's not, it's a "bar", so "foobar" will
match.  Use lookbehind instead (see below).
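
The difference can be checked directly.  In this minimal sketch (the
variable name is arbitrary), only the first statement is expected to
print anything:

    my $str = "foobar";

    # The lookahead is tested where "bar" starts; the text that
    # follows at that point is "bar", not "foo", so it succeeds.
    print "lookahead:  matched\n" if $str =~ /(?!foo)bar/;

    # The lookbehind inspects the text before "bar", which is "foo",
    # so the assertion fails and there is no match.
    print "lookbehind: matched\n" if $str =~ /(?<!foo)bar/;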

=item C<(?<=pattern)>

=item C<\K>
X<(?<=)> X<look-behind, positive> X<lookbehind, positive> X<\K>

A zero-width positive lookbehind assertion.  For example, C</(?<=\t)\w+/>
matches a word that follows a tab, without including the tab in C<$&>.
Works only for fixed-width lookbehind.

There is a special form of this construct, called C<\K> (available since
Perl 5.10.0), which causes the
regex engine to "keep" everything it had matched prior to the C<\K> and
not include it in C<$&>. This effectively provides variable-length
lookbehind. The use of C<\K> inside of another lookaround assertion
is allowed, but the behaviour is currently not well defined.

For various reasons C<\K> may be significantly more efficient than the
equivalent C<< (?<=...) >> construct, and it is especially useful in
situations where you want to efficiently remove something following
something else in a string. For instance

  s/(foo)bar/$1/g;

can be rewritten as the much more efficient

  s/foo\Kbar//g;
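
The following sketch (with made-up data) shows C<\K> after a prefix
whose width varies, which a fixed-width lookbehind cannot express, and
then shows that C<$&> holds only the text matched after the C<\K>:

    my $str = "key   =   value";
    (my $new = $str) =~ s/\w+\s*=\s*\Kvalue/42/;
    print "$new\n";                            # key   =   42

    "foobar" =~ /foo\Kbar/ and print "$&\n";   # bar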

=item C<(?<!pattern)>
X<(?<!)> X<look-behind, negative> X<lookbehind, negative>

A zero-width negative lookbehind assertion.  For example, C</(?<!bar)foo/>
matches any occurrence of "foo" that does not follow "bar".  Works
only for fixed-width lookbehind.

=back

=item C<< (?<NAME>pattern) >>

=item C<(?'NAME'pattern)>
X<< (?<NAME>) >> X<(?'NAME')> X<named capture> X<capture>

A named capture group. Identical in every respect to normal capturing
parentheses C<()> but for the additional fact that the group
can be referred to by name in various regular expression
constructs (like C<\g{NAME}>) and can be accessed by name
after a successful match via C<%+> or C<%->. See L<perlvar>
for more details on the C<%+> and C<%-> hashes.

If multiple distinct capture groups have the same name, then
C<$+{NAME}> will refer to the leftmost defined group in the match.

The forms C<(?'NAME'pattern)> and C<< (?<NAME>pattern) >> are equivalent.

B<NOTE:> While the notation of this construct is the same as the similar
function in .NET regexes, the behavior is not. In Perl the groups are
numbered sequentially regardless of being named or not. Thus in the
pattern

  /(x)(?<foo>y)(z)/

C<$+{I<foo>}> will be the same as C<$2>, and C<$3> will contain 'z' instead of
the opposite which is what a .NET regex hacker might expect.

Currently I<NAME> is restricted to simple identifiers only.
In other words, it must match C</^[_A-Za-z][_A-Za-z0-9]*\z/> or
its Unicode extension (see L<utf8>),
though it isn't extended by the locale (see L<perllocale>).

B<NOTE:> In order to make things easier for programmers with experience
with the Python or PCRE regex engines, the pattern C<< (?PE<lt>NAMEE<gt>pattern) >>
may be used instead of C<< (?<NAME>pattern) >>; however this form does not
support the use of single quotes as a delimiter for the name.
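
A short sketch of named captures in use follows; the group names and the
sample date are chosen purely for illustration.  It also shows that the
named groups remain reachable by number:

    if ("2024-05-17" =~ /^(?<year>\d{4})-(?<month>\d\d)-(?<day>\d\d)$/) {
        print "$+{day}.$+{month}.$+{year}\n";    # 17.05.2024
        print "$1 $2 $3\n";                      # 2024 05 17
    }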

=item C<< \k<NAME> >>

=item C<< \k'NAME' >>

Named backreference. Similar to numeric backreferences, except that
the group is designated by name and not number. If multiple groups
have the same name then it refers to the leftmost defined group in
the current match.

It is an error to refer to a name not defined by a C<< (?<NAME>) >>
earlier in the pattern.

Both forms are equivalent.

B<NOTE:> In order to make things easier for programmers with experience
with the Python or PCRE regex engines, the pattern C<< (?P=NAME) >>
may be used instead of C<< \k<NAME> >>.
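
For instance, a named backreference can require that a string end with
the same quote character it began with.  This is a sketch with
hypothetical group names and inputs:

    for my $str (q{"ok"}, q{'ok'}, q{"bad'}) {
        if ($str =~ /^(?<quote>["'])(?<text>.*?)\k<quote>$/) {
            print "$str: text is $+{text}\n";
        } else {
            print "$str: no match\n";
        }
    }

The first two strings match with C<text> set to C<ok>; the third does
not, because its closing quote differs from the opening one.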

=item C<(?{ code })>
X<(?{})> X<regex, code in> X<regexp, code in> X<regular expression, code in>

B<WARNING>: Using this feature safely requires that you understand its
limitations.  Code executed that has side effects may not perform identically
from version to version due to the effect of future optimisations in the regex
engine.  For more information on this, see L</Embedded Code Execution
Frequency>.

This zero-width assertion executes any embedded Perl code.  It always
succeeds, and its return value is set as C<$^R>.

In literal patterns, the code is parsed at the same time as the
surrounding code. While within the pattern, control is passed temporarily
back to the perl parser, until the logically-balancing closing brace is
encountered. This is similar to the way that an array index expression in
a literal string is handled, for example

    "abc$array[ 1 + f('[') + g()]def"

In particular, braces do not need to be balanced:

    s/abc(?{ f('{'); })/def/

Even in a pattern that is interpolated and compiled at run-time, literal
code blocks will be compiled once, at perl compile time; the following
prints "ABCD":

    print "D";
    my $qr = qr/(?{ BEGIN { print "A" } })/;
    my $foo = "foo";
    /$foo$qr(?{ BEGIN { print "B" } })/;
    BEGIN { print "C" }

In patterns where the text of the code is derived from run-time
information rather than appearing literally in a source code /pattern/,
the code is compiled at the same time that the pattern is compiled, and
for reasons of security, C<use re 'eval'> must be in scope. This is to
stop user-supplied patterns containing code snippets from being
executable.

In situations where you need to enable this with C<use re 'eval'>, you should
also have taint checking enabled.  Better yet, use the carefully
constrained evaluation within a Safe compartment.  See L<perlsec> for
details about both these mechanisms.
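
The effect of C<use re 'eval'> can be sketched as follows.  The variable
C<$src> merely stands in for any pattern text that is only known at run
time; it is not part of any API.  The first attempt is expected to die
when the pattern is compiled, and the second to run:

    use strict;
    use warnings;

    # The code block lives in a string, so the regex compiler sees
    # it only at run time.
    my $src = 'a(?{ 1 })';

    my $ok = eval { "a" =~ /$src/; 1 };
    print "without re 'eval': ", ($ok ? "ran" : "refused"), "\n";

    {
        use re 'eval';    # lexically scoped to this block
        $ok = eval { "a" =~ /$src/; 1 };
        print "with re 'eval':    ", ($ok ? "ran" : "refused"), "\n";
    }

Because the pragma is lexically scoped, it can be confined to the
smallest block that genuinely needs it.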

From the viewpoint of parsing, lexical variable scope and closures,

    /AAA(?{ BBB })CCC/

behaves approximately like

    /AAA/ && do { BBB } && /CCC/

Similarly,

    qr/AAA(?{ BBB })CCC/

behaves approximately like

    sub { /AAA/ && do { BBB } && /CCC/ }

In particular:

    { my $i = 1; $r = qr/(?{ print $i })/ }
    my $i = 2;
    /$r/; # prints "1"

Inside a C<(?{...})> block, C<$_> refers to the string the regular
expression is matching against. You can also use C<pos()> to know what is
the current position of matching within this string.

The code block introduces a new scope from the perspective of lexical
variable declarations, but B<not> from the perspective of C<local> and
similar localizing behaviours. So later code blocks within the same
pattern will still see the values which were localized in earlier blocks.
These accumulated localizations are undone either at the end of a
successful match, or if the assertion is backtracked (compare
L</"Backtracking">). For example,

  $_ = 'a' x 8;
  m<
     (?{ $cnt = 0 })               # Initialize $cnt.
     (
       a
       (?{
           local $cnt = $cnt + 1;  # Update $cnt,
                                   # backtracking-safe.
       })
     )*
     aaaa
     (?{ $res = $cnt })            # On success copy to
                                   # non-localized location.
   >x;

will initially increment C<$cnt> up to 8; then during backtracking, its
value will be unwound back to 4, which is the value assigned to C<$res>.
At the end of the regex execution, C<$cnt> will be wound back to its initial
value of 0.

This assertion may be used as the condition in a

    (?(condition)yes-pattern|no-pattern)

switch.  If I<not> used in this way, the result of evaluation of C<code>
is put into the special variable C<$^R>.  This happens immediately, so
C<$^R> can be used from other C<(?{ code })> assertions inside the same
regular expression.

The assignment to C<$^R> above is properly localized, so the old
value of C<$^R> is restored if the assertion is backtracked; compare
L</"Backtracking">.
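
As a minimal sketch (the variable C<$seen> is introduced only for this
example), a later code block can read the value that an earlier block
left in C<$^R>:

    our $seen;
    "ab" =~ / a (?{ 10 }) b (?{ $seen = $^R + 1 }) /x;
    print "$seen\n";    # 11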

Note that the special variable C<$^N> is particularly useful with code
blocks to capture the results of submatches in variables without having to
keep track of the number of nested parentheses. For example:

  $_ = "The brown fox jumps over the lazy dog";
  /the (\S+)(?{ $color = $^N }) (\S+)(?{ $animal = $^N })/i;
  print "color = $color, animal = $animal\n";


=item C<(??{ code })>
X<(??{})>
X<regex, postponed> X<regexp, postponed> X<regular expression, postponed>

B<WARNING>: Using this feature safely requires that you understand its
limitations.  Code executed that has side effects may not perform
identically from version to version due to the effect of future
optimisations in the regex engine.  For more information on this, see
L</Embedded Code Execution Frequency>.

This is a "postponed" regular subexpression.  It behaves in I<exactly> the
same way as a C<(?{ code })> code block as described above, except that
its return value, rather than being assigned to C<$^R>, is treated as a
pattern, compiled if it's a string (or used as-is if it's a qr// object),
then matched as if it were inserted instead of this construct.

During the matching of this sub-pattern, it has its own set of
captures which are valid during the sub-match, but are discarded once
control returns to the main pattern. For example, the following matches,
with the inner pattern capturing "B" and matching "BB", while the outer
pattern captures "A":

    my $inner = '(.)\1';
    "ABBA" =~ /^(.)(??{ $inner })\1/;
    print $1; # prints "A";

Note that this means that there is no way for the inner pattern to refer
to a capture group defined outside.  (The code block itself can use C<$1>,
I<etc>., to refer to the enclosing pattern's capture groups.)  Thus, although

    ('a' x 100)=~/(??{'(.)' x 100})/

I<will> match, it will I<not> set C<$1> on exit.

The following pattern matches a parenthesized group:

 $re = qr{
            \(
            (?:
               (?> [^()]+ )  # Non-parens without backtracking
             |
               (??{ $re })   # Group with matching parens
            )*
            \)
         }x;
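
One way to exercise such a pattern is sketched below.  It assumes C<$re>
is a package variable, as in the snippet above, so that the code block
can see it; the test strings are invented:

    our $re;
    $re = qr{ \( (?: (?> [^()]+ ) | (??{ $re }) )* \) }x;

    for my $str ('(a(b)c)', '(a(b c)') {
        my $verdict = $str =~ /^$re$/ ? "balanced" : "not balanced";
        print "$str: $verdict\n";
    }

The first string should be reported as balanced and the second as not
balanced, since its outer parenthesis is never closed.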

See also
L<C<(?I<PARNO>)>|/(?PARNO) (?-PARNO) (?+PARNO) (?R) (?0)>
for a different, more efficient way to accomplish
the same task.

Executing a postponed regular expression too many times without
consuming any input string will result in a fatal error.  The depth
at which that happens is compiled into perl, so it can be changed with a
custom build.

=item C<(?I<PARNO>)> C<(?-I<PARNO>)> C<(?+I<PARNO>)> C<(?R)> C<(?0)>
X<(?PARNO)> X<(?1)> X<(?R)> X<(?0)> X<(?-1)> X<(?+1)> X<(?-PARNO)> X<(?+PARNO)>
X<regex, recursive> X<regexp, recursive> X<regular expression, recursive>
X<regex, relative recursion> X<GOSUB> X<GOSTART>

Recursive subpattern. Treat the contents of a given capture buffer in the
current pattern as an independent subpattern and attempt to match it at
the current position in the string. Information about capture state from
the caller for things like backreferences is available to the subpattern,
but capture buffers set by the subpattern are not visible to the caller.

Similar to C<(??{ code })> except that it does not involve executing any
code or potentially compiling a returned pattern string; instead it treats
the part of the current pattern contained within a specified capture group
as an independent pattern that must match at the current position. The
treatment of capture buffers also differs: unlike C<(??{ code })>,
recursive patterns have access to their caller's match state, so one can
use backreferences safely.

I<PARNO> is a sequence of digits (not starting with 0) whose value reflects
the paren-number of the capture group to recurse to. C<(?R)> recurses to
the beginning of the whole pattern. C<(?0)> is an alternate syntax for
C<(?R)>. If I<PARNO> is preceded by a plus or minus sign then it is assumed
to be relative, with negative numbers indicating preceding capture groups
and positive ones following. Thus C<(?-1)> refers to the most recently
declared group, and C<(?+1)> indicates the next group to be declared.
Note that the counting for relative recursion differs from that of
relative backreferences, in that with recursion unclosed groups B<are>
included.

The following pattern matches a function C<foo()> which may contain
balanced parentheses as the argument.

  $re = qr{ (                   # paren group 1 (full function)
              foo
              (                 # paren group 2 (parens)
                \(
                  (             # paren group 3 (contents of parens)
                  (?:
                   (?> [^()]+ ) # Non-parens without backtracking
                  |
                   (?2)         # Recurse to start of paren group 2
                  )*
                  )
                \)
              )
            )
          }x;

If the pattern was used as follows

    'foo(bar(baz)+baz(bop))'=~/$re/
        and print "\$1 = $1\n",
                  "\$2 = $2\n",
                  "\$3 = $3\n";

the output produced should be the following:

    $1 = foo(bar(baz)+baz(bop))
    $2 = (bar(baz)+baz(bop))
    $3 = bar(baz)+baz(bop)

If there is no corresponding capture group defined, then it is a
fatal error.  Recursing deeply without consuming any input string will
also result in a fatal error.  The depth at which that happens is
compiled into perl, so it can be changed with a custom build.

The following shows how using negative indexing can make it
easier to embed recursive patterns inside of a C<qr//> construct
for later use:

    my $parens = qr/(\((?:[^()]++|(?-1))*+\))/;
    if (/foo $parens \s+ \+ \s+ bar $parens/x) {
       # do something here...
    }

B<Note> that this pattern does not behave the same way as the equivalent
PCRE or Python construct of the same form. In Perl you can backtrack into
a recursed group, whereas in PCRE and Python the recursed-into group is
treated as atomic. Also, modifiers are resolved at compile time, so
constructs like C<(?i:(?1))> or C<(?:(?i)(?1))> do not affect how the
sub-pattern will be processed.

=item C<(?&NAME)>
X<(?&NAME)>

Recurse to a named subpattern. Identical to C<(?I<PARNO>)> except that the
parenthesis to recurse to is determined by name. If multiple parentheses have
the same name, then it recurses to the leftmost.

It is an error to refer to a name that is not declared somewhere in the
pattern.

B<NOTE:> In order to make things easier for programmers with experience
with the Python or PCRE regex engines the pattern C<< (?P>NAME) >>
may be used instead of C<< (?&NAME) >>.
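
A sketch of named recursion follows.  The group name C<group> and the
test strings are arbitrary choices for this example:

    my $balanced = qr{
        ^ (?<group> \( (?: [^()]++ | (?&group) )*+ \) ) $
    }x;

    for my $str ('(a(b)(c(d)))', '((a)') {
        my $verdict = $str =~ $balanced ? "balanced" : "not balanced";
        print "$str: $verdict\n";
    }

The first string is reported as balanced; the second is not, because
one opening parenthesis is left unclosed.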

=item C<(?(condition)yes-pattern|no-pattern)>
X<(?()>

=item C<(?(condition)yes-pattern)>

Conditional expression. Matches C<yes-pattern> if C<condition> yields
a true value, matches C<no-pattern> otherwise. A missing pattern always
matches.

C<(condition)> should be one of:

=over 4

=item an integer in parentheses

(which is valid if the corresponding pair of parentheses
matched);

=item a lookahead/lookbehind/evaluate zero-width assertion;

=item a name in angle brackets or single quotes

(which is valid if a group with the given name matched);

=item the special symbol C<(R)>

(true when evaluated inside of recursion or eval).  Additionally, the
C<"R"> may be followed by a number (in which case it will be true when
evaluated while recursing inside of the appropriate group), or by
C<&NAME>, in which case it will be true only when evaluated during
recursion in the named group.

=back

Here's a summary of the possible predicates:

=over 4

=item C<(1)> C<(2)> ...

Checks if the numbered capturing group has matched something.
Full syntax: C<< (?(1)then|else) >>

=item C<(E<lt>I<NAME>E<gt>)> C<('I<NAME>')>

Checks if a group with the given name has matched something.
Full syntax: C<< (?(<name>)then|else) >>

=item C<(?=...)> C<(?!...)> C<(?<=...)> C<(?<!...)>

Checks whether the pattern matches (or does not match, for the C<"!">
variants).
Full syntax: C<< (?(?=lookahead)then|else) >>

=item C<(?{ I<CODE> })>

Treats the return value of the code block as the condition.
Full syntax: C<< (?(?{ code })then|else) >>

=item C<(R)>

Checks if the expression has been evaluated inside of recursion.
Full syntax: C<< (?(R)then|else) >>

=item C<(R1)> C<(R2)> ...

Checks if the expression has been evaluated while executing directly
inside of the n-th capture group. This check is the regex equivalent of

  if ((caller(0))[3] eq 'subname') { ... }

In other words, it does not check the full recursion stack.

Full syntax: C<< (?(R1)then|else) >>

=item C<(R&I<NAME>)>

Similar to C<(R1)>, this predicate checks to see if we're executing
directly inside of the leftmost group with a given name (this is the same
logic used by C<(?&I<NAME>)> to disambiguate). It does not check the full
stack, but only the name of the innermost active recursion.
Full syntax: C<< (?(R&name)then|else) >>

=item C<(DEFINE)>

In this case, the yes-pattern is never directly executed, and no
no-pattern is allowed. Similar in spirit to C<(?{0})> but more efficient.
See below for details.
Full syntax: C<< (?(DEFINE)definitions...) >>

=back

For example:

    m{ ( \( )?
       [^()]+
       (?(1) \) )
     }x

matches a chunk of non-parentheses, possibly included in parentheses
themselves.
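
Anchoring the pattern at both ends makes the effect of the condition
easy to observe.  The anchors and the test strings below are additions
for this sketch and are not part of the pattern above:

    my $re = qr{ ^ ( \( )? [^()]+ (?(1) \) ) $ }x;

    for my $str ('abc', '(abc)', '(abc', 'abc)') {
        printf "%-6s %s\n", $str, $str =~ $re ? "match" : "no match";
    }

Here C<abc> and C<(abc)> match, while the two strings with a single
unpaired parenthesis do not.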

A special form is the C<(DEFINE)> predicate, which never executes its
yes-pattern directly, and does not allow a no-pattern. This allows one to
define subpatterns which will be executed only by the recursion mechanism.
This way, you can define a set of regular expression rules that can be
bundled into any pattern you choose.

It is recommended that for this usage you put the DEFINE block at the
end of the pattern, and that you name any subpatterns defined within it.

Also, it's worth noting that patterns defined this way probably will
not be as efficient, as the optimizer is not very clever about
handling them.

An example of how this might be used is as follows:

  /(?<NAME>(?&NAME_PAT))(?<ADDR>(?&ADDRESS_PAT))
   (?(DEFINE)
     (?<NAME_PAT>....)
     (?<ADDRESS_PAT>....)
   )/x

Note that capture groups matched inside of recursion are not accessible
after the recursion returns, so the extra layer of capturing groups is
necessary. Thus C<$+{NAME_PAT}> would not be defined even though
C<$+{NAME}> would be.
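
A concrete variant of the same idea is sketched below.  The rule names
C<WORD> and C<NUMBER>, the outer names C<KEY> and C<VALUE>, and the
input string are all hypothetical:

    my $re = qr{
        ^ (?<KEY> (?&WORD) ) \s* = \s* (?<VALUE> (?&NUMBER) ) $
        (?(DEFINE)
            (?<WORD>   [A-Za-z_]\w* )
            (?<NUMBER> -? \d+ (?: \. \d+ )? )
        )
    }x;

    if ("timeout = 2.5" =~ $re) {
        print "key=$+{KEY} value=$+{VALUE}\n";   # key=timeout value=2.5
    }

The outer C<KEY> and C<VALUE> groups do the capturing; the rules inside
the DEFINE block only describe what to match.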

Finally, keep in mind that subpatterns created inside a DEFINE block
count towards the absolute and relative number of captures, so this:

    my @captures = "a" =~ /(.)                  # First capture
                           (?(DEFINE)
                               (?<EXAMPLE> 1 )  # Second capture
                           )/x;
    say scalar @captures;

Will output 2, not 1. This is particularly important if you intend to
compile the definitions with the C<qr//> operator, and later
interpolate them in another pattern.

=item C<< (?>pattern) >>
X<backtrack> X<backtracking> X<atomic> X<possessive>

An "independent" subexpression, one which matches the substring
that a I<standalone> C<pattern> would match if anchored at the given
position, and it matches I<nothing other than this substring>.  This
construct is useful for optimizations of what would otherwise be
"eternal" matches, because it will not backtrack (see L</"Backtracking">).
It may also be useful in places where the "grab all you can, and do not
give anything back" semantic is desirable.

For example: C<< ^(?>a*)ab >> will never match, since C<< (?>a*) >>
(anchored at the beginning of string, as above) will match I<all>
characters C<"a"> at the beginning of string, leaving no C<"a"> for
C<ab> to match.  In contrast, C<a*ab> will match the same as C<a+b>,
since the match of the subgroup C<a*> is influenced by the following
group C<ab> (see L</"Backtracking">).  In particular, C<a*> inside
C<a*ab> will match fewer characters than a standalone C<a*>, since
this makes the tail match.

C<< (?>pattern) >> does not disable backtracking altogether once it has
matched. It is still possible to backtrack past the construct, but not
into it. So C<< ((?>a*)|(?>b*))ar >> will still match "bar".
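
These claims can be verified with a few one-line tests (a sketch; the
labels are only for display):

    print "a*ab:     ", ("aaab" =~ /^a*ab/     ? "match" : "no match"), "\n";
    print "(?>a*)ab: ", ("aaab" =~ /^(?>a*)ab/ ? "match" : "no match"), "\n";
    print "past it:  ",
        ("bar" =~ /((?>a*)|(?>b*))ar/ ? "match" : "no match"), "\n";

The first and third tests match.  The second does not, because the
atomic group refuses to give back any of the C<"a">s it consumed.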

An effect similar to C<< (?>pattern) >> may be achieved by writing
C<(?=(pattern))\g{-1}>.  The lookahead matches the same substring as a
standalone C<pattern>, and the following C<\g{-1}> eats the matched string;
it therefore makes a zero-length assertion into an analogue of C<< (?>...) >>.
(The difference between these two constructs is that the second one
uses a capturing group, thus shifting ordinals of backreferences
in the rest of a regular expression.)
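
A sketch of the emulation, using C<a*> as the inner pattern:

    # The lookahead captures "aaa"; \g{-1} then consumes it.
    if ("aaab" =~ /^(?=(a*))\g{-1}b/) {
        print "matched, \$1 = $1\n";        # matched, $1 = aaa
    }

    # As with (?>a*), nothing is given back for the trailing "ab".
    print "no match\n" unless "aaab" =~ /^(?=(a*))\g{-1}ab/;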

Consider this pattern:

    m{ \(
          (
            [^()]+           # x+
          |
            \( [^()]* \)
          )+
       \)
     }x

That will efficiently match a nonempty group with matching parentheses
two levels deep or less.  However, if there is no such group, it
will take virtually forever on a long string.  That's because there
are so many different ways to split a long string into several
substrings.  This is what C<(.+)+> is doing, and C<(.+)+> is similar
to a subpattern of the above pattern.  Consider how the pattern
above detects no-match on C<((()aaaaaaaaaaaaaaaaaa> in several
seconds, but that each extra letter doubles this time.  This
exponential performance will make it appear that your program has
hung.  However, a tiny change to this pattern

    m{ \(
          (
            (?> [^()]+ )        # change x+ above to (?> x+ )
          |
            \( [^()]* \)
          )+
       \)
     }x

which uses C<< (?>...) >> matches exactly when the one above does (verifying
this yourself would be a productive exercise), but finishes in a fourth
of the time when used on a similar string with 1000000 C<"a">s.  Be aware,
however, that, when this construct is followed by a
quantifier, it currently triggers a warning message under
the C<use warnings> pragma or B<-w> switch saying it
C<"matches null string many times in regex">.

On simple groups, such as the pattern C<< (?> [^()]+ ) >>, a comparable
effect may be achieved by negative lookahead, as in C<[^()]+ (?! [^()] )>.
This was only 4 times slower on a string with 1000000 C<"a">s.

The "grab all you can, and do not give anything back" semantic is desirable
in many situations where at first sight a simple C<()*> looks like
the correct solution.  Suppose we parse text with comments being delimited
by C<"#"> followed by some optional (horizontal) whitespace.  Contrary to
its appearance, C<#[ \t]*> I<is not> the correct subexpression to match
the comment delimiter, because it may "give up" some whitespace if
the remainder of the pattern can be made to match that way.  The correct
answer is either one of these:

    (?>#[ \t]*)
    #[ \t]*(?![ \t])

For example, to grab non-empty comments into C<$1>, one should use either
one of these:

    / (?> \# [ \t]* ) (        .+ ) /x;
    /     \# [ \t]*   ( [^ \t] .* ) /x;

Which one you pick depends on which of these expressions better reflects
the above specification of comments.
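
The way the plain form "gives up" whitespace can be seen on a line that
holds only a delimiter and blanks (a sketch with an invented input):

    my $line = "#   ";    # a delimiter followed only by blanks

    # The plain form backtracks and hands one blank to (.+).
    print "plain:  ",
        ($line =~ /\#[ \t]*(.+)/ ? "got '$1'" : "no match"), "\n";

    # The atomic form keeps every blank, so (.+) has nothing left.
    print "atomic: ",
        ($line =~ /(?>\#[ \t]*)(.+)/ ? "got '$1'" : "no match"), "\n";

The plain form reports a "comment" consisting of a single blank, while
the atomic form correctly reports no match.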

In some literature this construct is called "atomic matching" or
"possessive matching".

Possessive quantifiers are equivalent to putting the item they are applied
to inside of one of these constructs. The following equivalences apply:

    Quantifier Form     Bracketing Form
    ---------------     ---------------
    PAT*+               (?>PAT*)
    PAT++               (?>PAT+)
    PAT?+               (?>PAT?)
    PAT{min,max}+       (?>PAT{min,max})
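
A brief sketch of the equivalence, using C<a+> as the item:

    print "a+a:     ", ("aaa" =~ /^a+a/     ? "match" : "no match"), "\n";
    print "a++a:    ", ("aaa" =~ /^a++a/    ? "match" : "no match"), "\n";
    print "(?>a+)a: ", ("aaa" =~ /^(?>a+)a/ ? "match" : "no match"), "\n";

Only the first test matches; the possessive and bracketing forms both
consume all three C<"a">s and leave none for the final C<a>.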

=item C<(?[ ])>

See L<perlrecharclass/Extended Bracketed Character Classes>.

Note that this feature is currently L<experimental|perlpolicy/experimental>;
using it yields a warning in the C<experimental::regex_sets> category.

=back

=head2 Backtracking
X<backtrack> X<backtracking>

NOTE: This section presents an abstract approximation of regular
expression behavior.  For a more rigorous (and complicated) view of
the rules involved in selecting a match among possible alternatives,
see L</Combining RE Pieces>.

A fundamental feature of regular expression matching involves the
notion called I<backtracking>, which is currently used (when needed)
by all non-possessive regular expression quantifiers, namely C<"*">, C<*?>,
C<"+">, C<+?>, C<{n,m}>, and C<{n,m}?>.  Backtracking is often optimized
internally, but the general principle outlined here is valid.

For a regular expression to match, the I<entire> regular expression must
match, not just part of it.  So if the beginning of a pattern containing a
quantifier succeeds in a way that causes later parts in the pattern to
fail, the matching engine backs up and recalculates the beginning
part--that's why it's called backtracking.

Here is an example of backtracking:  Let's say you want to find the
word following "foo" in the string "Food is on the foo table.":

    $_ = "Food is on the foo table.";
    if ( /\b(foo)\s+(\w+)/i ) {
        print "$2 follows $1.\n";
    }

When the match runs, the first part of the regular expression (C<\b(foo)>)
finds a possible match right at the beginning of the string, and loads up
C<$1> with "Foo".  However, as soon as the matching engine sees that there's
no whitespace following the "Foo" that it had saved in C<$1>, it realizes its
mistake and starts over again one character after where it had the
tentative match.  This time it goes all the way until the next occurrence
of "foo". The complete regular expression matches this time, and you get
the expected output of "table follows foo."

Sometimes minimal matching can help a lot.  Imagine you'd like to match
everything between "foo" and "bar".  Initially, you write something
like this:

    $_ =  "The food is under the bar in the barn.";
    if ( /foo(.*)bar/ ) {
        print "got <$1>\n";
    }

Which perhaps unexpectedly yields:

  got <d is under the bar in the >

That's because C<.*> was greedy, so you get everything between the
I<first> "foo" and the I<last> "bar".  Here it's more effective
to use minimal matching to make sure you get the text between a "foo"
and the first "bar" thereafter.

    if ( /foo(.*?)bar/ ) { print "got <$1>\n" }

which prints:

    got <d is under the >

Here's another example. Let's say you'd like to match a number at the end
of a string, and you also want to keep the preceding part of the match.
So you write this:

    $_ = "I have 2 numbers: 53147";
    if ( /(.*)(\d*)/ ) {                                # Wrong!
        print "Beginning is <$1>, number is <$2>.\n";
    }

That won't work at all, because C<.*> was greedy and gobbled up the
whole string. As C<\d*> can match on an empty string the complete
regular expression matched successfully.

    Beginning is <I have 2 numbers: 53147>, number is <>.

Here are some variants, most of which don't work:

    $_ = "I have 2 numbers: 53147";
    @pats = qw{
        (.*)(\d*)
        (.*)(\d+)
        (.*?)(\d*)
        (.*?)(\d+)
        (.*)(\d+)$
        (.*?)(\d+)$
        (.*)\b(\d+)$
        (.*\D)(\d+)$
    };

    for $pat (@pats) {
        printf "%-12s ", $pat;
        if ( /$pat/ ) {
            print "<$1> <$2>\n";
        } else {
            print "FAIL\n";
        }
    }

That will print out:

    (.*)(\d*)    <I have 2 numbers: 53147> <>
    (.*)(\d+)    <I have 2 numbers: 5314> <7>
    (.*?)(\d*)   <> <>
    (.*?)(\d+)   <I have > <2>
    (.*)(\d+)$   <I have 2 numbers: 5314> <7>
    (.*?)(\d+)$  <I have 2 numbers: > <53147>
    (.*)\b(\d+)$ <I have 2 numbers: > <53147>
    (.*\D)(\d+)$ <I have 2 numbers: > <53147>

As you see, this can be a bit tricky.  It's important to realize that a
regular expression is merely a set of assertions that gives a definition
of success.  There may be 0, 1, or several different ways that the
definition might succeed against a particular string.  And if there are
multiple ways it might succeed, you need to understand backtracking to
know which variety of success you will achieve.

When using lookahead assertions and negations, this can all get even
trickier.  Imagine you'd like to find a sequence of non-digits not
followed by "123".  You might try to write that as

    $_ = "ABC123";
    if ( /^\D*(?!123)/ ) {                # Wrong!
        print "Yup, no 123 in $_\n";
    }

But that isn't going to behave the way you're hoping: the pattern matches,
so the code claims that there is no 123 in the string.  Here's a clearer
picture of why that pattern matches, contrary to popular expectations:

    $x = 'ABC123';
    $y = 'ABC445';

    print "1: got $1\n" if $x =~ /^(ABC)(?!123)/;
    print "2: got $1\n" if $y =~ /^(ABC)(?!123)/;

    print "3: got $1\n" if $x =~ /^(\D*)(?!123)/;
    print "4: got $1\n" if $y =~ /^(\D*)(?!123)/;

This prints

    2: got ABC
    3: got AB
    4: got ABC

You might have expected test 3 to fail because it seems to be a more
general-purpose version of test 1.  The important difference between
them is that test 3 contains a quantifier (C<\D*>) and so can use
backtracking, whereas test 1 will not.  What's happening is
that you've asked "Is it true that at the start of C<$x>, following 0 or more
non-digits, you have something that's not 123?"  If the pattern matcher had
let C<\D*> expand to "ABC", this would have caused the whole pattern to
fail.

The search engine will initially match C<\D*> with "ABC".  Then it will
try to match C<(?!123)> with "123", which fails.  But because
a quantifier (C<\D*>) has been used in the regular expression, the
search engine can backtrack and retry the match differently
in the hope of matching the complete regular expression.

The pattern really, I<really> wants to succeed, so it uses the
standard pattern back-off-and-retry and lets C<\D*> expand to just "AB" this
time.  Now there's indeed something following "AB" that is not
"123".  It's "C123", which suffices.

We can deal with this by using both an assertion and a negation.
We'll say that the first part in C<$1> must be followed both by a digit
and by something that's not "123".  Remember that the lookaheads
are zero-width expressions--they only look, but don't consume any
of the string in their match.  So rewriting this way produces what
you'd expect; that is, case 5 will fail, but case 6 succeeds:

    print "5: got $1\n" if $x =~ /^(\D*)(?=\d)(?!123)/;
    print "6: got $1\n" if $y =~ /^(\D*)(?=\d)(?!123)/;

    6: got ABC

In other words, the two zero-width assertions next to each other work as though
they're ANDed together, just as you'd use any built-in assertions:  C</^$/>
matches only if you're at the beginning of the line AND the end of the
line simultaneously.  The deeper underlying truth is that juxtaposition in
regular expressions always means AND, except when you write an explicit OR
using the vertical bar.  C</ab/> means match "a" AND (then) match "b",
although the attempted matches are made at different positions because "a"
is not a zero-width assertion, but a one-width assertion.
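
The ANDing of adjacent assertions can be sketched with two lookaheads
that must both hold at the start of the string (the test strings are
made up):

    for my $str ('abc123', 'abcdef', '123456') {
        # Require a digit somewhere AND a lowercase letter somewhere.
        my $both = $str =~ /^(?=.*\d)(?=.*[a-z])/ ? "yes" : "no";
        print "$str: $both\n";
    }

Only C<abc123> satisfies both assertions; each of the other strings
fails one of them.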

B<WARNING>: Particularly complicated regular expressions can take
exponential time to solve because of the immense number of possible
ways they can use backtracking to try for a match.  For example, without
internal optimizations done by the regular expression engine, this will
take a painfully long time to run:

    'aaaaaaaaaaaa' =~ /((a{0,5}){0,5})*[c]/

And if you used C<"*">'s in the internal groups instead of limiting them
to 0 through 5 matches, then it would take forever--or until you ran
out of stack space.  Moreover, these internal optimizations are not
always applicable.  For example, if you put C<{0,5}> instead of C<"*">
on the external group, no current optimization is applicable, and the
match takes a long time to finish.

A powerful tool for optimizing such beasts is what is known as an
"independent group",
which does not backtrack (see L</C<< (?>pattern) >>>).  Note also that
zero-length lookahead/lookbehind assertions will not backtrack to make
the tail match, since they are in "logical" context: only
whether they match is considered relevant.  For an example
where side-effects of lookahead I<might> have influenced the
following match, see L</C<< (?>pattern) >>>.
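
As a minimal sketch of the difference (the four-character subject string is
made up purely for illustration), an ordinary quantifier gives a character
back so that the tail can match, while an independent group keeps everything
it grabbed and the match fails:

    my $str = 'aaab';

    # a+ first takes "aaa", then backtracks to "aa" so "ab" can match.
    print "plain: matched\n" if $str =~ /a+ab/;

    # (?>a+) takes every "a" it can and never gives one back,
    # so the trailing "ab" cannot match at any starting position.
    print "independent: no match\n" unless $str =~ /(?>a+)ab/;

The same refusal to backtrack is what lets an independent group cut down
the number of paths the engine has to explore in a pathological pattern.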

=head2 Special Backtracking Control Verbs

These special patterns are generally of the form C<(*I<VERB>:I<ARG>)>. Unless
otherwise stated the I<ARG> argument is optional; in some cases, it is
mandatory.

Any pattern containing a special backtracking verb that allows an argument
has the special behaviour that when executed it sets the current package's
C<$REGERROR> and C<$REGMARK> variables. When doing so the following
rules apply:

On failure, the C<$REGERROR> variable will be set to the I<ARG> value of the
verb pattern, if the verb was involved in the failure of the match. If the
I<ARG> part of the pattern was omitted, then C<$REGERROR> will be set to the
name of the last C<(*MARK:NAME)> pattern executed, or to TRUE if there was
none. Also, the C<$REGMARK> variable will be set to FALSE.

On a successful match, the C<$REGERROR> variable will be set to FALSE, and
the C<$REGMARK> variable will be set to the name of the last
C<(*MARK:NAME)> pattern executed.  See the explanation for the
C<(*MARK:NAME)> verb below for more details.

B<NOTE:> C<$REGERROR> and C<$REGMARK> are not magic variables like C<$1>
and most other regex-related variables. They are not local to a scope, nor
readonly, but instead are volatile package variables similar to C<$AUTOLOAD>.
They are set in the package containing the code that I<executed> the regex
(rather than the one that compiled it, where those differ).  If necessary, you
can use C<local> to localize changes to these variables to a specific scope
before executing a regex.

If a pattern does not contain a special backtracking verb that allows an
argument, then C<$REGERROR> and C<$REGMARK> are not touched at all.
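
The success case can be sketched as follows.  This is only an illustration:
the mark names C<num> and C<word> and the two inputs are invented here, and
the C<our> declaration is needed only to keep C<use strict> happy.

    our ($REGMARK, $REGERROR);   # ordinary package variables

    for my $input ('42', 'abc') {
        # Keep changes to the two variables confined to this iteration.
        local ($REGMARK, $REGERROR);
        if ($input =~ /^(?:\d+(*MARK:num)|[a-z]+(*MARK:word))$/) {
            print "$input matched via the '$REGMARK' branch\n";
        }
    }

Here C<$REGMARK> should hold C<num> after the first match and C<word> after
the second, since that is the last mark executed on each successful path.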

=over 3

=item Verbs

=over 4

=item C<(*PRUNE)> C<(*PRUNE:NAME)>
X<(*PRUNE)> X<(*PRUNE:NAME)>

This zero-width pattern prunes the backtracking tree at the current point
when backtracked into on failure. Consider the pattern C</I<A> (*PRUNE) I<B>/>,
where I<A> and I<B> are complex patterns. Until the C<(*PRUNE)> verb is reached,
I<A> may backtrack as necessary to match. Once it is reached, matching
continues in I<B>, which may also backtrack as necessary; however, should I<B>
not match, then no further backtracking will take place, and the pattern
will fail outright at the current starting position.

The following example counts all the possible matching strings in a
pattern (without actually matching any of them).

    'aaab' =~ /a+b?(?{print "$&\n"; $count++})(*FAIL)/;
    print "Count=$count\n";

which produces:

    aaab
    aaa
    aa
    a
    aab
    aa
    a
    ab
    a
    Count=9

If we add a C<(*PRUNE)> before the count like the following

    'aaab' =~ /a+b?(*PRUNE)(?{print "$&\n"; $count++})(*FAIL)/;
    print "Count=$count\n";

we prevent backtracking and find the count of the longest matching string
at each matching starting point like so:

    aaab
    aab
    ab
    Count=3

Any number of C<(*PRUNE)> assertions may be used in a pattern.

See also C<<< L<< /(?>pattern) >> >>> and possessive quantifiers for
other ways to
control backtracking. In some cases, the use of C<(*PRUNE)> can be
replaced with a C<< (?>pattern) >> with no functional difference; however,
C<(*PRUNE)> can be used to handle cases that cannot be expressed using a
C<< (?>pattern) >> alone.

=item C<(*SKIP)> C<(*SKIP:NAME)>
X<(*SKIP)>

This zero-width pattern is similar to C<(*PRUNE)>, except that on
failure it also signifies that whatever text was matched leading up
to the C<(*SKIP)> pattern being executed cannot be part of I<any> match
of this pattern. This effectively means that the regex engine "skips" forward
to this position on failure and tries to match again (assuming that
there is sufficient room to match).

The name of the C<(*SKIP:NAME)> pattern has special significance. If a
C<(*MARK:NAME)> was encountered while matching, then it is that position
which is used as the "skip point". If no C<(*MARK)> of that name was
encountered, then the C<(*SKIP)> operator has no effect. When used
without a name the "skip point" is where the match point was when
executing the C<(*SKIP)> pattern.

Compare the following to the examples in C<(*PRUNE)>; note the string
is twice as long:

 'aaabaaab' =~ /a+b?(*SKIP)(?{print "$&\n"; $count++})(*FAIL)/;
 print "Count=$count\n";

outputs

    aaab
    aaab
    Count=2

Once the 'aaab' at the start of the string has matched, and the C<(*SKIP)>
executed, the next starting point will be where the cursor was when the
C<(*SKIP)> was executed.

=item C<(*MARK:NAME)> C<(*:NAME)>
X<(*MARK)> X<(*MARK:NAME)> X<(*:NAME)>

This zero-width pattern can be used to mark the point reached in a string
when a certain part of the pattern has been successfully matched. This
mark may be given a name. A later C<(*SKIP)> pattern will then skip
forward to that point if backtracked into on failure. Any number of
C<(*MARK)> patterns are allowed, and the I<NAME> portion may be duplicated.

In addition to interacting with the C<(*SKIP)> pattern, C<(*MARK:NAME)>
can be used to "label" a pattern branch, so that after matching, the
program can determine which branches of the pattern were involved in the
match.

When a match is successful, the C<$REGMARK> variable will be set to the
name of the most recently executed C<(*MARK:NAME)> that was involved
in the match.

This can be used to determine which branch of a pattern was matched
without using a separate capture group for each branch, which in turn
can result in a performance improvement, as perl cannot optimize
C</(?:(x)|(y)|(z))/> as efficiently as something like
C</(?:x(*MARK:x)|y(*MARK:y)|z(*MARK:z))/>.

When a match has failed, and unless another verb has been involved in
failing the match and has provided its own name to use, the C<$REGERROR>
variable will be set to the name of the most recently executed
C<(*MARK:NAME)>.

See L</(*SKIP)> for more details.

As a shortcut C<(*MARK:NAME)> can be written C<(*:NAME)>.

=item C<(*THEN)> C<(*THEN:NAME)>

This is similar to the "cut group" operator C<::> from Perl 6.  Like
C<(*PRUNE)>, this verb always matches, and when backtracked into on
failure, it causes the regex engine to try the next alternation in the
innermost enclosing group (capturing or otherwise) that has alternations.
The two branches of a C<(?(condition)yes-pattern|no-pattern)> do not
count as an alternation, as far as C<(*THEN)> is concerned.

Its name comes from the observation that this operation combined with the
alternation operator (C<"|">) can be used to create what is essentially a
pattern-based if/then/else block:

  ( COND (*THEN) FOO | COND2 (*THEN) BAR | COND3 (*THEN) BAZ )

Note that if this operator is used and NOT inside of an alternation then
it acts exactly like the C<(*PRUNE)> operator.

  / A (*PRUNE) B /

is the same as

  / A (*THEN) B /

but

  / ( A (*THEN) B | C ) /

is not the same as

  / ( A (*PRUNE) B | C ) /

as after matching the I<A> but failing on the I<B> the C<(*THEN)> verb will
backtrack and try I<C>; but the C<(*PRUNE)> verb will simply fail.

=item C<(*COMMIT)> C<(*COMMIT:args)>
X<(*COMMIT)>

This is the Perl 6 "commit pattern" C<< <commit> >> or C<:::>. It's a
zero-width pattern similar to C<(*SKIP)>, except that when backtracked
into on failure it causes the match to fail outright. No further attempts
to find a valid match by advancing the start pointer will be made.
For example,

 'aaabaaab' =~ /a+b?(*COMMIT)(?{print "$&\n"; $count++})(*FAIL)/;
 print "Count=$count\n";

outputs

    aaab
    Count=1

In other words, once the C<(*COMMIT)> has been entered, and if the pattern
does not match, the regex engine will not try any further matching on the
rest of the string.

=item C<(*FAIL)> C<(*F)> C<(*FAIL:arg)>
X<(*FAIL)> X<(*F)>

This pattern matches nothing and always fails. It can be used to force the
engine to backtrack. It is equivalent to C<(?!)>, but easier to read. In
fact, C<(?!)> gets optimised into C<(*FAIL)> internally. You can provide
an argument so that if the match fails because of this C<FAIL> directive
the argument can be obtained from C<$REGERROR>.

It is probably useful only when combined with C<(?{})> or C<(??{})>.

=item C<(*ACCEPT)> C<(*ACCEPT:arg)>
X<(*ACCEPT)>

This pattern matches nothing and causes the end of successful matching at
the point at which the C<(*ACCEPT)> pattern was encountered, regardless of
whether there is actually more to match in the string. When inside of a
nested pattern, such as recursion, or in a subpattern dynamically generated
via C<(??{})>, only the innermost pattern is ended immediately.

If the C<(*ACCEPT)> is inside of capturing groups then the groups are
marked as ended at the point at which the C<(*ACCEPT)> was encountered.
For instance:

  'AB' =~ /(A (A|B(*ACCEPT)|C) D)(E)/x;

will match, and C<$1> will be C<"AB">, C<$2> will be C<"B">, and C<$3> will
not be set. If another branch in the inner parentheses was matched, such as in
the string 'ACDE', then the C<"D"> and C<"E"> would have to be matched as well.

You can provide an argument, which will be available in the variable
C<$REGMARK> after the match completes.

=back

=back
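
The difference between C<(*THEN)> and C<(*PRUNE)> inside an alternation,
described above, can be sketched like this.  The subject string and the
literal branches are chosen only for illustration, and the pattern is
anchored so that no later starting position can rescue the match.

    my $str = 'ac';

    # "a" matches and "b" fails; (*THEN) gives up on this branch
    # and lets the engine try the next alternative, "ac".
    print "THEN: matched '$&'\n" if $str =~ /^(?:a(*THEN)b|ac)/;

    # "a" matches and "b" fails; (*PRUNE) fails the whole attempt at
    # this starting position, so the second branch is never tried.
    print "PRUNE: no match\n" unless $str =~ /^(?:a(*PRUNE)b|ac)/;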

=head2 Warning on C<\1> Instead of C<$1>

Some people get too used to writing things like:

    $pattern =~ s/(\W)/\\\1/g;

This is grandfathered (for \1 to \9) for the RHS of a substitute to avoid
shocking the
B<sed> addicts, but it's a dirty habit to get into.  That's because in
PerlThink, the righthand side of an C<s///> is a double-quoted string.  C<\1> in
the usual double-quoted string means a control-A.  The customary Unix
meaning of C<\1> is kludged in for C<s///>.  However, if you get into the habit
of doing that, you get yourself into trouble if you then add an C</e>
modifier.

    s/(\d+)/ \1 + 1 /eg;            # causes warning under -w

Or if you try to do

    s/(\d+)/\1000/;

You can't disambiguate that by saying C<\{1}000>, whereas you can fix it with
C<${1}000>.  The operation of interpolation should not be confused
with the operation of matching a backreference.  Certainly they mean two
different things on the I<left> side of the C<s///>.
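
A short sketch of the safer forms, using made-up input strings:

    my $price = 'item 5';
    (my $scaled = $price) =~ s/(\d+)/${1}000/;   # braces end the variable name
    print "$scaled\n";                           # item 5000

    my $count = 'count 41';
    (my $bumped = $count) =~ s/(\d+)/$1 + 1/e;   # $1 is an ordinary variable
    print "$bumped\n";                           # count 42

With C</e> the replacement is Perl code, where C<$1> behaves like any other
scalar, which is why the variable form carries over cleanly.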

=head2 Repeated Patterns Matching a Zero-length Substring

B<WARNING>: Difficult material (and prose) ahead.  This section needs a rewrite.

Regular expressions provide a terse and powerful programming language.  As
with most other power tools, power comes together with the ability
to wreak havoc.

A common abuse of this power stems from the ability to make infinite
loops using regular expressions, with something as innocuous as:

    'foo' =~ m{ ( o? )* }x;

The C<o?> matches at the beginning of "C<foo>", and since the position
in the string is not moved by the match, C<o?> would match again and again
because of the C<"*"> quantifier.  Another common way to create a similar cycle
is with the looping modifier C</g>:

    @matches = ( 'foo' =~ m{ o? }xg );

or

    print "match: <$&>\n" while 'foo' =~ m{ o? }xg;

or the loop implied by C<split()>.

However, long experience has shown that many programming tasks may
be significantly simplified by using repeated subexpressions that
may match zero-length substrings.  Here's a simple example:

    @chars = split //, $string;           # // is not magic in split
    ($whitewashed = $string) =~ s/()/ /g; # parens avoid magic s// /

Thus Perl allows such constructs, by I<forcefully breaking
the infinite loop>.  The rules for this are different for lower-level
loops given by the greedy quantifiers C<*+{}>, and for higher-level
ones like the C</g> modifier or C<split()> operator.

The lower-level loops are I<interrupted> (that is, the loop is
broken) when Perl detects that a repeated expression matched a
zero-length substring.   Thus

   m{ (?: NON_ZERO_LENGTH | ZERO_LENGTH )* }x;

is made equivalent to

   m{ (?: NON_ZERO_LENGTH )* (?: ZERO_LENGTH )? }x;

For example, this program

   #!perl -l
   "aaaaab" =~ /
     (?:
        a                 # non-zero
        |                 # or
       (?{print "hello"}) # print hello whenever this
                          #    branch is tried
       (?=(b))            # zero-width assertion
     )*  # any number of times
    /x;
   print $&;
   print $1;

prints

   hello
   aaaaa
   b

Notice that "hello" is only printed once, as when Perl sees that the sixth
iteration of the outermost C<(?:)*> matches a zero-length string, it stops
the C<"*">.

The higher-level loops preserve an additional state between iterations:
whether the last match was zero-length.  To break the loop, the match that
follows a zero-length match is prohibited from having a length of zero.
This prohibition interacts with backtracking (see L</"Backtracking">),
and so the I<second best> match is chosen if the I<best> match is of
zero length.

For example:

    $_ = 'bar';
    s/\w??/<$&>/g;

results in C<< <><b><><a><><r><> >>.  At each position of the string the best
match given by non-greedy C<??> is the zero-length match, and the I<second
best> match is what is matched by C<\w>.  Thus zero-length matches
alternate with one-character-long matches.

Similarly, for repeated C<m/()/g> the second-best match is the match at the
position one notch further in the string.

The additional state of being I<matched with zero-length> is associated with
the matched string, and is reset by each assignment to C<pos()>.
Zero-length matches at the end of the previous match are ignored
during C<split>.
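
The effect of these rules on the higher-level loops can be observed with a
few one-liners.  This is only a sketch; the variable names are invented for
the example.

    my @chars = split //, 'abc';             # ('a', 'b', 'c')

    # /x*/ can match only the empty string here.  After each zero-length
    # match, another zero-length match at the same spot is prohibited,
    # so the loop ends up with one empty match at each of positions 0..3.
    my $empties = () = 'abc' =~ /x*/g;       # 4

    (my $marked = 'bar') =~ s/\w??/<$&>/g;   # "<><b><><a><><r><>"

    print scalar(@chars), " $empties $marked\n";

In each case the loop terminates even though the pattern is able to match
the empty string.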

=head2 Combining RE Pieces

Each of the elementary pieces of regular expressions which were described
before (such as C<ab> or C<\Z>) could match at most one substring
at the given position of the input string.  However, in a typical regular
expression these elementary pieces are combined into more complicated
patterns using combining operators C<ST>, C<S|T>, C<S*> I<etc>.
(in these examples C<"S"> and C<"T"> are regular subexpressions).

Such combinations can include alternatives, leading to a problem of choice:
if we match a regular expression C<a|ab> against C<"abc">, will it match
substring C<"a"> or C<"ab">?  One way to describe which substring is
actually matched is the concept of backtracking (see L</"Backtracking">).
However, this description is too low-level and makes you think
in terms of a particular implementation.

Another description starts with notions of "better"/"worse".  All the
substrings which may be matched by the given regular expression can be
sorted from the "best" match to the "worst" match, and it is the "best"
match which is chosen.  This replaces the question of "what is chosen?"
with the question of "which matches are better, and which are worse?".

Again, for elementary pieces there is no such question, since at most
one match at a given position is possible.  This section describes the
notion of better/worse for combining operators.  In the description
below C<"S"> and C<"T"> are regular subexpressions.

=over 4

=item C<ST>

Consider two possible matches, C<AB> and C<A'B'>, where C<"A"> and C<A'> are
substrings which can be matched by C<"S">, and C<"B"> and C<B'> are substrings
which can be matched by C<"T">.

If C<"A"> is a better match for C<"S"> than C<A'>, C<AB> is a better
match than C<A'B'>.

If C<"A"> and C<A'> coincide: C<AB> is a better match than C<AB'> if
C<"B"> is a better match for C<"T"> than C<B'>.

=item C<S|T>

When C<"S"> can match, it is a better match than when only C<"T"> can match.

When two matches both come from C<"S">, their ordering is the same as it is
for C<"S"> alone.  The same holds for two matches that both come from C<"T">.

=item C<S{REPEAT_COUNT}>

Matches as C<SSS...S> (repeated as many times as necessary).

=item C<S{min,max}>

Matches as C<S{max}|S{max-1}|...|S{min+1}|S{min}>.

=item C<S{min,max}?>

Matches as C<S{min}|S{min+1}|...|S{max-1}|S{max}>.

=item C<S?>, C<S*>, C<S+>

Same as C<S{0,1}>, C<S{0,BIG_NUMBER}>, C<S{1,BIG_NUMBER}> respectively.

=item C<S??>, C<S*?>, C<S+?>

Same as C<S{0,1}?>, C<S{0,BIG_NUMBER}?>, C<S{1,BIG_NUMBER}?> respectively.

=item C<< (?>S) >>

Matches the best match for C<"S"> and only that.

=item C<(?=S)>, C<(?<=S)>

Only the best match for C<"S"> is considered.  (This is important only if
C<"S"> has capturing parentheses, and backreferences are used somewhere
else in the whole regular expression.)

=item C<(?!S)>, C<(?<!S)>

For this grouping operator there is no need to describe the ordering, since
only whether or not C<"S"> can match is important.

=item C<(??{ EXPR })>, C<(?I<PARNO>)>

The ordering is the same as for the regular expression which is
the result of EXPR, or the pattern contained by capture group I<PARNO>.

=item C<(?(condition)yes-pattern|no-pattern)>

Recall that which of C<yes-pattern> or C<no-pattern> actually matches is
already determined.  The ordering of the matches is the same as for the
chosen subexpression.

=back

The above recipes describe the ordering of matches I<at a given position>.
One more rule is needed to understand how a match is determined for the
whole regular expression: a match at an earlier position is always better
than a match at a later position.
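
A few of these rules can be checked directly.  The subject strings below are
made up for illustration; each comment shows the substring that ends up in
C<$&>.

    'abc' =~ /a|ab/    and print "1: $&\n";   # a    S beats T in S|T
    'abc' =~ /ab|a/    and print "2: $&\n";   # ab
    'aaa' =~ /a{1,3}/  and print "3: $&\n";   # aaa  S{max} comes first
    'aaa' =~ /a{1,3}?/ and print "4: $&\n";   # a    S{min} comes first
    'xab' =~ /b|ab/    and print "5: $&\n";   # ab   earlier position wins

In the last case the first alternative could match the "b" at offset 2, but
the second alternative matches at offset 1, and a match at an earlier
position is always better.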

=head2 Creating Custom RE Engines

As of Perl 5.10.0, one can create custom regular expression engines.  This
is not for the faint of heart, as they have to plug in at the C level.  See
L<perlreapi> for more details.

As an alternative, overloaded constants (see L<overload>) provide a simple
way to extend the functionality of the RE engine, by substituting one
pattern for another.

Suppose that we want to enable a new RE escape-sequence C<\Y|> which
matches at a boundary between whitespace characters and non-whitespace
characters.  Note that C<(?=\S)(?<!\S)|(?!\S)(?<=\S)> matches exactly
at these positions, so we want to have each C<\Y|> in the place of the
more complicated version.  We can create a module C<customre> to do
this:

    package customre;
    use overload;

    sub import {
      shift;
      die "No argument to customre::import allowed" if @_;
      overload::constant 'qr' => \&convert;
    }

    sub invalid { die "/$_[0]/: invalid escape '\\$_[1]'"}

    # We must also take care of not escaping the legitimate \\Y|
    # sequence, hence the presence of '\\' in the conversion rules.
    my %rules = ( '\\' => '\\\\',
                  'Y|' => qr/(?=\S)(?<!\S)|(?!\S)(?<=\S)/ );
    sub convert {
      my $re = shift;
      $re =~ s{
                \\ ( \\ | Y . )
              }
              { $rules{$1} or invalid($re,$1) }sgex;
      return $re;
    }

Now C<use customre> enables the new escape in constant regular
expressions, I<i.e.>, those without any runtime variable interpolations.
As documented in L<overload>, this conversion will work only over
literal parts of regular expressions.  For C<\Y|$re\Y|> the variable
part of this regular expression needs to be converted explicitly
(but only if the special meaning of C<\Y|> should be enabled inside C<$re>):

    use customre;
    $re = <>;
    chomp $re;
    $re = customre::convert $re;
    /\Y|$re\Y|/;

=head2 Embedded Code Execution Frequency

The exact rules for how often C<(??{})> and C<(?{})> are executed in a pattern
are unspecified.  In the case of a successful match you can assume that
they DWIM and will be executed in left to right order the appropriate
number of times in the accepting path of the pattern as would any other
meta-pattern.  How non-accepting pathways and match failures affect the
number of times a pattern is executed is specifically unspecified and
may vary depending on what optimizations can be applied to the pattern
and is likely to change from version to version.

For instance in

  "aaabcdeeeee"=~/a(?{print "a"})b(?{print "b"})cde/;

the exact number of times "a" or "b" are printed out is unspecified for
failure, but you may assume they will be printed at least once during
a successful match.  Additionally, you may assume that if "b" is printed,
it will be preceded by at least one "a".

In the case of branching constructs like the following:

  /a(b|(?{ print "a" }))c(?{ print "c" })/;

you can assume that the input "ac" will output "ac", and that "abc"
will output only "c".

When embedded code is quantified, successful matches will call the
code once for each matched iteration of the quantifier.  For
example:

  "good" =~ /g(?:o(?{print "o"}))*d/;

will output "o" twice.

=head2 PCRE/Python Support

As of Perl 5.10.0, Perl supports several Python/PCRE-specific extensions
to the regex syntax. While Perl programmers are encouraged to use the
Perl-specific syntax, the following are also accepted:

=over 4

=item C<< (?PE<lt>NAMEE<gt>pattern) >>

Define a named capture group. Equivalent to C<< (?<NAME>pattern) >>.

=item C<< (?P=NAME) >>

Backreference to a named capture group. Equivalent to C<< \g{NAME} >>.

=item C<< (?P>NAME) >>

Subroutine call to a named capture group. Equivalent to C<< (?&NAME) >>.

=back
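
A brief sketch of the three forms together; the group name C<pair> and the
subject strings are invented for the example:

    # (?P<pair>...) captures, and (?P=pair) must match the same text again.
    if ('abab' =~ /^(?P<pair>[a-z]{2})(?P=pair)$/) {
        print "repeated: $+{pair}\n";           # repeated: ab
    }

    # "cd" is not a repeat of "ab", so the backreference fails here...
    print "no repeat\n" unless 'abcd' =~ /^(?P<pair>[a-z]{2})(?P=pair)$/;

    # ...but (?P>pair) re-runs the group's pattern, so any two letters do.
    print "called\n" if 'abcd' =~ /^(?P<pair>[a-z]{2})(?P>pair)$/;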

=head1 BUGS

There are a number of issues with regard to case-insensitive matching
in Unicode rules.  See C<"i"> under L</Modifiers> above.

This document varies from difficult to understand to completely
and utterly opaque.  The wandering prose riddled with jargon is
hard to fathom in several places.

This document needs a rewrite that separates the tutorial content
from the reference content.

=head1 SEE ALSO

The syntax of patterns used in Perl pattern matching evolved from those
supplied in the Bell Labs Research Unix 8th Edition (Version 8) regex
routines.  (The code is actually derived (distantly) from Henry
Spencer's freely redistributable reimplementation of those V8 routines.)

L<perlrequick>.

L<perlretut>.

L<perlop/"Regexp Quote-Like Operators">.

L<perlop/"Gory details of parsing quoted constructs">.

L<perlfaq6>.

L<perlfunc/pos>.

L<perllocale>.

L<perlebcdic>.

I<Mastering Regular Expressions> by Jeffrey Friedl, published
by O'Reilly and Associates.

=head1 NAME

perlperf - Perl Performance and Optimization Techniques

=head1 DESCRIPTION

This is an introduction to the use of performance and optimization techniques
which can be used with particular reference to perl programs.  While many perl
developers have come from other languages, and can use their prior knowledge
where appropriate, there are many other people who might benefit from a few
perl specific pointers.  If you want the condensed version, perhaps the best
advice comes from the renowned Japanese Samurai, Miyamoto Musashi, who said:

 "Do Not Engage in Useless Activity"

in 1645.

=head1 OVERVIEW

Perhaps the most common mistake programmers make is to attempt to optimize
their code before a program actually does anything useful - this is a bad idea.
There's no point in having an extremely fast program that doesn't work.  The
first job is to get a program to I<correctly> do something B<useful>, (not to
mention ensuring the test suite is fully functional), and only then to consider
optimizing it.  Having decided to optimize existing working code, there are
several simple but essential steps to consider which are intrinsic to any
optimization process.

=head2 ONE STEP SIDEWAYS

Firstly, you need to establish a baseline time for the existing code, and that
timing needs to be reliable and repeatable.  You'll probably want to use the
C<Benchmark> or C<Devel::NYTProf> modules, or something similar, for this step,
or perhaps the Unix system C<time> utility, whichever is appropriate.  See the
base of this document for a longer list of benchmarking and profiling modules,
and recommended further reading.

=head2 ONE STEP FORWARD

Next, having examined the program for I<hot spots>, (places where the code
seems to run slowly), change the code with the intention of making it run
faster.  Using version control software, like C<subversion>, will ensure no
changes are irreversible.  It's too easy to fiddle here and fiddle there -
don't change too much at any one time or you might not discover which piece of
code B<really> was the slow bit.

=head2 ANOTHER STEP SIDEWAYS

It's not enough to say: "that will make it run faster", you have to check it.
Rerun the code under control of the benchmarking or profiling modules, from the
first step above, and check that the new code executed the B<same task> in
I<less time>.  Save your work and repeat...

=head1 GENERAL GUIDELINES

The critical thing when considering performance is to remember there is no such
thing as a C<Golden Bullet>, which is why there are no rules, only guidelines.

It is clear that inline code is going to be faster than subroutine or method
calls, because there is less overhead, but this approach has the disadvantage
of being less maintainable and comes at the cost of greater memory usage -
there is no such thing as a free lunch.  If you are searching for an element in
a list, it can be more efficient to store the data in a hash structure, and
then simply look to see whether the key is defined, rather than to loop through
the entire array using grep() for instance.  substr() may be (a lot) faster
than grep() but not as flexible, so you have another trade-off to assess.  Your
code may contain a line which takes 0.01 of a second to execute.  If you call
it 1,000 times, which is quite likely in a program parsing even medium-sized
files, you already have a 10 second delay in just one single code location, and
if you call that line 100,000 times, your entire program will slow down to an
unbearable crawl.
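
The hash lookup mentioned above might look like the following sketch; the
list contents and variable names are made up for illustration.

 my @fruit = qw(apple banana cherry);

 # One pass over the list to build the index...
 my %is_fruit = map { $_ => 1 } @fruit;

 # ...after which each membership test is a single hash lookup,
 # rather than a scan such as:  grep { $_ eq 'banana' } @fruit
 print "found banana\n" if exists $is_fruit{banana};
 print "no durian\n"    unless exists $is_fruit{durian};

Building the hash has its own cost, so this tends to pay off only when the
same list is searched more than once.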

Using a subroutine as part of your sort is a powerful way to get exactly what
you want, but will usually be slower than the built-in I<alphabetic> C<cmp> and
I<numeric> C<E<lt>=E<gt>> sort operators.  It is possible to make multiple
passes over your data, building indices to make the upcoming sort more
efficient, and to use what is known as the C<OM> (Orcish Maneuver) to cache the
sort keys in advance.  The cache lookup, while a good idea, can itself be a
source of slowdown by enforcing a double pass over the data - once to setup the
cache, and once to sort the data.  Using C<pack()> to extract the required sort
key into a consistent string can be an efficient way to build a single string
to compare, instead of using multiple sort keys, which makes it possible to use
the standard, written in C<C> and fast, perl C<sort()> function on the output,
and is the basis of the C<GRT> (Guttman Rossler Transform).  Some string
combinations can slow the C<GRT> down, by just being too plain complex for its
own good.
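
Both techniques can be sketched on a small, made-up word list sorted by
length and then alphabetically.  This is an illustration of the idea rather
than a tuned implementation.

 my @words = qw(pear fig banana kiwi);

 # Orcish Maneuver: compute each key at most once, caching it in a hash.
 # (||= would recompute a key whose value is false; the lengths of
 # these words are never zero, so that does not arise here.)
 my %len;
 my @by_len_om = sort {
     ( $len{$a} ||= length $a ) <=> ( $len{$b} ||= length $b )
         or $a cmp $b
 } @words;

 # GRT: prefix each word with its length packed as a fixed-width
 # big-endian integer, sort the plain strings with the default
 # comparison, then strip the four-byte prefix again.
 my @by_len_grt = map  { substr($_, 4) }
                  sort
                  map  { pack('N', length $_) . $_ } @words;

 print "@by_len_om\n";    # fig kiwi pear banana
 print "@by_len_grt\n";   # fig kiwi pear banana

The packed prefix is big-endian so that plain string comparison orders the
keys the same way a numeric comparison would.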

For applications using database backends, the standard C<DBIx> namespace
tries to help with keeping things nippy, not least because it tries to I<not>
query the database until the latest possible moment, but always read the docs
which come with your choice of libraries.  Among the many issues that
developers dealing with databases should remain aware of are to always use
C<SQL> placeholders and to consider pre-fetching data sets when this might
prove advantageous.  Splitting up a large file by assigning multiple processes
to parsing a single file, using say C<POE>, C<threads> or C<fork> can also be a
useful way of optimizing your usage of the available C<CPU> resources, though
this technique is fraught with concurrency issues and demands high attention to
detail.

Every case has a specific application and one or more exceptions, and there is
no replacement for running a few tests and finding out which method works best
for your particular environment.  This is why writing optimal code is not an
exact science, and why we love using Perl so much - TMTOWTDI.

=head1 BENCHMARKS

Here are a few examples to demonstrate usage of Perl's benchmarking tools.

=head2  Assigning and Dereferencing Variables.

I'm sure most of us have seen code which looks like, (or worse than), this:

 if ( $obj->{_ref}->{_myscore} >= $obj->{_ref}->{_yourscore} ) {
     ...

This sort of code can be a real eyesore to read, as well as being very
sensitive to typos, and it's much clearer to dereference the variable
explicitly.  We're side-stepping the issue of working with object-oriented
programming techniques to encapsulate variable access via methods, only
accessible through an object.  Here we're just discussing the technical
implementation of choice, and whether this has an effect on performance.  We
can see whether this dereferencing operation has any overhead by putting
comparative code in a file and running a C<Benchmark> test.

# dereference

 #!/usr/bin/perl

 use strict;
 use warnings;

 use Benchmark;

 my $ref = {
         'ref'   => {
             _myscore    => '100 + 1',
             _yourscore  => '102 - 1',
         },
 };

 timethese(1000000, {
         'direct'       => sub {
           my $x = $ref->{ref}->{_myscore} . $ref->{ref}->{_yourscore} ;
         },
         'dereference'  => sub {
             my $ref  = $ref->{ref};
             my $myscore = $ref->{_myscore};
             my $yourscore = $ref->{_yourscore};
             my $x = $myscore . $yourscore;
         },
 });

It's essential to run any timing measurements a sufficient number of times so
the numbers settle on a numerical average; otherwise each run will naturally
fluctuate due to variations in the environment, such as contention for C<CPU>
resources and network bandwidth.  Running
the above code for one million iterations, we can take a look at the report
output by the C<Benchmark> module, to see which approach is the most effective.

 $> perl dereference

 Benchmark: timing 1000000 iterations of dereference, direct...
 dereference:  2 wallclock secs ( 1.59 usr +  0.00 sys =  1.59 CPU) @ 628930.82/s (n=1000000)
     direct:  1 wallclock secs ( 1.20 usr +  0.00 sys =  1.20 CPU) @ 833333.33/s (n=1000000)

The difference is clear to see and the dereferencing approach is slower.  While
it managed to execute an average of 628,930 times a second during our test, the
direct approach managed to run an additional 204,403 times, unfortunately.
Unfortunately, because there are many examples of code written using the
multiple layer direct variable access, and it's usually horrible.  It is,
however, very slightly faster.  The question remains whether the minute gain is
actually worth the eyestrain, or the loss of maintainability.

=head2  Search and replace or tr

If we have a string which needs to be modified, while a regex will almost
always be much more flexible, C<tr>, an oft underused tool, can still be
useful.  One scenario might be to replace all vowels with another character.  The
regex solution might look like this:

 $str =~ s/[aeiou]/x/g

The C<tr> alternative might look like this:

 $str =~ tr/aeiou/xxxxx/

We can put that into a test file which we can run to check which approach is
the fastest, using a global C<$STR> variable to assign to the C<my $str>
variable so as to avoid perl trying to optimize any of the work away by
noticing it's assigned only the once.

# regex-transliterate

 #!/usr/bin/perl

 use strict;
 use warnings;

 use Benchmark;

 my $STR = "$$-this and that";

 timethese( 1000000, {
 'sr'  => sub { my $str = $STR; $str =~ s/[aeiou]/x/g; return $str; },
 'tr'  => sub { my $str = $STR; $str =~ tr/aeiou/xxxxx/; return $str; },
 });

Running the code gives us our results:

 $> perl regex-transliterate

 Benchmark: timing 1000000 iterations of sr, tr...
         sr:  2 wallclock secs ( 1.19 usr +  0.00 sys =  1.19 CPU) @ 840336.13/s (n=1000000)
         tr:  0 wallclock secs ( 0.49 usr +  0.00 sys =  0.49 CPU) @ 2040816.33/s (n=1000000)

The C<tr> version is a clear winner.  One solution is flexible, the other is
fast - and it's appropriately the programmer's choice which to use.

Check the C<Benchmark> docs for further useful techniques.
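
One such technique is C<cmpthese>, which the C<Benchmark> module exports on
request and which prints a table comparing the relative speed of each
candidate.  The following is a minimal sketch that reuses the two
approaches from above; the subroutine names are invented for the example.

 use strict;
 use warnings;
 use Benchmark qw(cmpthese);

 my $STR = "$$-this and that";

 sub with_sr { my $str = $STR; $str =~ s/[aeiou]/x/g;   return $str }
 sub with_tr { my $str = $STR; $str =~ tr/aeiou/xxxxx/; return $str }

 # A negative count asks Benchmark to run each sub for at least that
 # many CPU seconds, instead of for a fixed number of iterations.
 cmpthese( -1, {
     'sr' => \&with_sr,
     'tr' => \&with_tr,
 });

The actual figures will vary from machine to machine and from run to run,
so no sample output is shown here.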

=head1 PROFILING TOOLS

A slightly larger piece of code will provide something on which a profiler can
produce more extensive reporting statistics.  This example uses the simplistic
C<wordmatch> program which parses a given input file and spews out a short
report on the contents.

# wordmatch

 #!/usr/bin/perl

 use strict;
 use warnings;

 =head1 NAME

 filewords - word analysis of input file

 =head1 SYNOPSIS

     filewords -f inputfilename [-d]

 =head1 DESCRIPTION

 This program parses the given filename, specified with C<-f>, and
 displays a simple analysis of the words found therein.  Use the C<-d>
 switch to enable debugging messages.

 =cut

 use FileHandle;
 use Getopt::Long;

 my $debug   =  0;
 my $file    = '';

 my $result = GetOptions (
     'debug'         => \$debug,
     'file=s'        => \$file,
 );
 die("invalid args") unless $result;

 unless ( -f $file ) {
     die("Usage: $0 -f filename [-d]");
 }
 my $FH = FileHandle->new("< $file")
                               or die("unable to open file($file): $!");

 my $i_LINES = 0;
 my $i_WORDS = 0;
 my %count   = ();

 my @lines = <$FH>;
 foreach my $line ( @lines ) {
     $i_LINES++;
     $line =~ s/\n//;
     my @words = split(/ +/, $line);
     my $i_words = scalar(@words);
     $i_WORDS = $i_WORDS + $i_words;
     debug("line: $i_LINES supplying $i_words words: @words");
     my $i_word = 0;
     foreach my $word ( @words ) {
         $i_word++;
         $count{$i_LINES}{spec} += matches($i_word, $word,
                                           '[^a-zA-Z0-9]');
         $count{$i_LINES}{only} += matches($i_word, $word,
                                           '^[^a-zA-Z0-9]+$');
         $count{$i_LINES}{cons} += matches($i_word, $word,
                                     '^[(?i:bcdfghjklmnpqrstvwxyz)]+$');
         $count{$i_LINES}{vows} += matches($i_word, $word,
                                           '^[(?i:aeiou)]+$');
         $count{$i_LINES}{caps} += matches($i_word, $word,
                                           '^[(A-Z)]+$');
     }
 }

 print report( %count );

 sub matches {
     my $i_wd  = shift;
     my $word  = shift;
     my $regex = shift;
     my $has = 0;

     if ( $word =~ /($regex)/ ) {
         $has++ if $1;
     }

     debug( "word: $i_wd "
           . ($has ? 'matches' : 'does not match')
           . " chars: /$regex/");

     return $has;
 }

 sub report {
     my %report = @_;
     my %rep;

     foreach my $line ( keys %report ) {
         foreach my $key ( keys %{ $report{$line} } ) {
             $rep{$key} += $report{$line}{$key};
         }
     }

     my $report = qq|
 $0 report for $file:
 lines in file: $i_LINES
 words in file: $i_WORDS
 words with special (non-word) characters: $rep{spec}
 words with only special (non-word) characters: $rep{only}
 words with only consonants: $rep{cons}
 words with only capital letters: $rep{caps}
 words with only vowels: $rep{vows}
 |;

     return $report;
 }

 sub debug {
     my $message = shift;

     if ( $debug ) {
         print STDERR "DBG: $message\n";
     }
 }

 exit 0;

=head2 Devel::DProf

This venerable module has been the de-facto standard for Perl code profiling
for more than a decade, but has been replaced by a number of other modules
which have brought us back to the 21st century.  You are recommended to
evaluate your tool from the several mentioned here and from the CPAN list at
the base of this document (currently L<Devel::NYTProf> seems to be the weapon
of choice - see below).  Even so, we'll take a quick look at the output from
L<Devel::DProf> first, to set a baseline for Perl profiling tools.  Run the
above program under the control of C<Devel::DProf> by using the C<-d> switch on
the command-line.

 $> perl -d:DProf wordmatch -f perl5db.pl

 <...multiple lines snipped...>

 wordmatch report for perl5db.pl:
 lines in file: 9428
 words in file: 50243
 words with special (non-word) characters: 20480
 words with only special (non-word) characters: 7790
 words with only consonants: 4801
 words with only capital letters: 1316
 words with only vowels: 1701

C<Devel::DProf> produces a special file, called F<tmon.out> by default, and
this file is read by the C<dprofpp> program, which is already installed as part
of the C<Devel::DProf> distribution.  If you call C<dprofpp> with no options,
it will read the F<tmon.out> file in the current directory and produce a human
readable statistics report of the run of your program.  Note that this may take
a little time.

 $> dprofpp

 Total Elapsed Time = 2.951677 Seconds
   User+System Time = 2.871677 Seconds
 Exclusive Times
 %Time ExclSec CumulS #Calls sec/call Csec/c  Name
  102.   2.945  3.003 251215   0.0000 0.0000  main::matches
  2.40   0.069  0.069 260643   0.0000 0.0000  main::debug
  1.74   0.050  0.050      1   0.0500 0.0500  main::report
  1.04   0.030  0.049      4   0.0075 0.0123  main::BEGIN
  0.35   0.010  0.010      3   0.0033 0.0033  Exporter::as_heavy
  0.35   0.010  0.010      7   0.0014 0.0014  IO::File::BEGIN
  0.00       - -0.000      1        -      -  Getopt::Long::FindOption
  0.00       - -0.000      1        -      -  Symbol::BEGIN
  0.00       - -0.000      1        -      -  Fcntl::BEGIN
  0.00       - -0.000      1        -      -  Fcntl::bootstrap
  0.00       - -0.000      1        -      -  warnings::BEGIN
  0.00       - -0.000      1        -      -  IO::bootstrap
  0.00       - -0.000      1        -      -  Getopt::Long::ConfigDefaults
  0.00       - -0.000      1        -      -  Getopt::Long::Configure
  0.00       - -0.000      1        -      -  Symbol::gensym

C<dprofpp> will produce some quite detailed reporting on the activity of the
C<wordmatch> program.  The wallclock, user and system times are at the top of
the analysis, and after this come the main columns which define the report.
Check the C<dprofpp> docs for details of the many options it supports.

See also C<L<Apache::DProf>> which hooks C<Devel::DProf> into C<mod_perl>.

=head2 Devel::Profiler

Let's take a look at the same program using a different profiler:
C<Devel::Profiler>, a drop-in Perl-only replacement for C<Devel::DProf>.  The
usage is very slightly different in that instead of using the special C<-d:>
flag, you pull C<Devel::Profiler> in directly as a module using C<-M>.

 $> perl -MDevel::Profiler wordmatch -f perl5db.pl

 <...multiple lines snipped...>

 wordmatch report for perl5db.pl:
 lines in file: 9428
 words in file: 50243
 words with special (non-word) characters: 20480
 words with only special (non-word) characters: 7790
 words with only consonants: 4801
 words with only capital letters: 1316
 words with only vowels: 1701


C<Devel::Profiler> generates a tmon.out file which is compatible with the
C<dprofpp> program, thus saving the construction of a dedicated statistics
reader program.  C<dprofpp> usage is therefore identical to the above example.

 $> dprofpp

 Total Elapsed Time =   20.984 Seconds
   User+System Time =   19.981 Seconds
 Exclusive Times
 %Time ExclSec CumulS #Calls sec/call Csec/c  Name
  49.0   9.792 14.509 251215   0.0000 0.0001  main::matches
  24.4   4.887  4.887 260643   0.0000 0.0000  main::debug
  0.25   0.049  0.049      1   0.0490 0.0490  main::report
  0.00   0.000  0.000      1   0.0000 0.0000  Getopt::Long::GetOptions
  0.00   0.000  0.000      2   0.0000 0.0000  Getopt::Long::ParseOptionSpec
  0.00   0.000  0.000      1   0.0000 0.0000  Getopt::Long::FindOption
  0.00   0.000  0.000      1   0.0000 0.0000  IO::File::new
  0.00   0.000  0.000      1   0.0000 0.0000  IO::Handle::new
  0.00   0.000  0.000      1   0.0000 0.0000  Symbol::gensym
  0.00   0.000  0.000      1   0.0000 0.0000  IO::File::open

Interestingly we get slightly different results, which is mostly because the
algorithm which generates the report is different, even though the output file
format was allegedly identical.  The elapsed, user and system times clearly
show the time it took for C<Devel::Profiler> to execute its own run, but
the column listings feel more accurate somehow than the ones we had earlier
from C<Devel::DProf>.  The 102% figure has disappeared, for example.  This is
where we have to use the tools at our disposal, and recognise their pros and
cons, before using them.  Interestingly, the numbers of calls for each
subroutine are identical in the two reports; it's the percentages which differ.
As the author of C<Devel::Profiler> writes:

 ...running HTML::Template's test suite under Devel::DProf shows
 output() taking NO time but Devel::Profiler shows around 10% of the
 time is in output().  I don't know which to trust but my gut tells me
 something is wrong with Devel::DProf.  HTML::Template::output() is a
 big routine that's called for every test. Either way, something needs
 fixing.

YMMV.

See also C<L<Devel::Apache::Profiler>> which hooks C<Devel::Profiler>
into C<mod_perl>.

=head2 Devel::SmallProf

The C<Devel::SmallProf> profiler examines the runtime of your Perl program and
produces a line-by-line listing to show how many times each line was called,
and how long each line took to execute.  It is called by supplying the familiar
C<-d> flag to Perl at runtime.

 $> perl -d:SmallProf wordmatch -f perl5db.pl

 <...multiple lines snipped...>

 wordmatch report for perl5db.pl:
 lines in file: 9428
 words in file: 50243
 words with special (non-word) characters: 20480
 words with only special (non-word) characters: 7790
 words with only consonants: 4801
 words with only capital letters: 1316
 words with only vowels: 1701

C<Devel::SmallProf> writes its output into a file called F<smallprof.out>, by
default.  The format of the file looks like this:

 <num> <time> <ctime> <line>:<text>

When the program has terminated, the output may be examined and sorted using
any standard text filtering utilities.  Something like the following may be
sufficient:

 $> grep -E '[0-9]+:' smallprof.out | sort -k3,3nr | head -n20

 251215   1.65674   7.68000    75: if ( $word =~ /($regex)/ ) {
 251215   0.03264   4.40000    79: debug("word: $i_wd ".($has ? 'matches' :
 251215   0.02693   4.10000    81: return $has;
 260643   0.02841   4.07000   128: if ( $debug ) {
 260643   0.02601   4.04000   126: my $message = shift;
 251215   0.02641   3.91000    73: my $has = 0;
 251215   0.03311   3.71000    70: my $i_wd  = shift;
 251215   0.02699   3.69000    72: my $regex = shift;
 251215   0.02766   3.68000    71: my $word  = shift;
  50243   0.59726   1.00000    59:  $count{$i_LINES}{cons} =
  50243   0.48175   0.92000    61:  $count{$i_LINES}{spec} =
  50243   0.00644   0.89000    56:  my $i_cons = matches($i_word, $word,
  50243   0.48837   0.88000    63:  $count{$i_LINES}{caps} =
  50243   0.00516   0.88000    58:  my $i_caps = matches($i_word, $word, '^[(A-
  50243   0.00631   0.81000    54:  my $i_spec = matches($i_word, $word, '[^a-
  50243   0.00496   0.80000    57:  my $i_vows = matches($i_word, $word,
  50243   0.00688   0.80000    53:  $i_word++;
  50243   0.48469   0.79000    62:  $count{$i_LINES}{only} =
  50243   0.48928   0.77000    60:  $count{$i_LINES}{vows} =
  50243   0.00683   0.75000    55:  my $i_only = matches($i_word, $word, '^[^a-

You can immediately see a slightly different focus to the subroutine profiling
modules, and we start to see exactly which line of code is taking the most
time.  That regex line is looking a bit suspicious, for example.  Remember that
these tools are supposed to be used together: there is no single best way to
profile your code, and you need to use the best tools for the job.
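
The same filtering can be done in Perl itself, which avoids any doubt about
how the shell tools treat the numeric columns.  This is only a sketch: it
assumes the F<smallprof.out> line format described above, and the file name
C<smallprof-top> and the C<top_lines> helper are invented for this example.

# smallprof-top

 #!/usr/bin/perl

 use strict;
 use warnings;

 # Return the $n most expensive lines, by <ctime>, read from filehandle $fh.
 sub top_lines {
     my ($fh, $n) = @_;
     my @rows;
     while ( my $line = <$fh> ) {
         # <num> <time> <ctime> <line>:<text> - anything else is skipped
         next unless $line =~
             m/^\s*(\d+)\s+([\d.]+)\s+([\d.]+)\s+(\d+):(.*)$/;
         push @rows,
             { num => $1, time => $2, ctime => $3, line => $4, text => $5 };
     }
     my @sorted = sort { $b->{ctime} <=> $a->{ctime} } @rows;
     $n = @sorted if $n > @sorted;
     return @sorted[ 0 .. $n - 1 ];
 }

 my $file = shift(@ARGV) // 'smallprof.out';
 if ( open( my $fh, '<', $file ) ) {
     printf "%8d %10.5f %10.5f %5d:%s\n",
         @{$_}{qw(num time ctime line text)}
             for top_lines( $fh, 20 );
     close($fh);
 }
 else {
     warn "unable to open $file: $!\n";
 }

Sorting on C<< $b->{ctime} <=> $a->{ctime} >> compares the column as a number,
largest first, so no separate reversing step is needed.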

See also C<L<Apache::SmallProf>> which hooks C<Devel::SmallProf> into
C<mod_perl>.

=head2 Devel::FastProf

C<Devel::FastProf> is another Perl line profiler.  It was written with a view
to getting a faster line profiler than is possible with, for example,
C<Devel::SmallProf>, because it's written in C<C>.  To use C<Devel::FastProf>,
supply the C<-d> argument to Perl:

 $> perl -d:FastProf wordmatch -f perl5db.pl

 <...multiple lines snipped...>

 wordmatch report for perl5db.pl:
 lines in file: 9428
 words in file: 50243
 words with special (non-word) characters: 20480
 words with only special (non-word) characters: 7790
 words with only consonants: 4801
 words with only capital letters: 1316
 words with only vowels: 1701

C<Devel::FastProf> writes statistics to the file F<fastprof.out> in the current
directory.  The output file, which can be specified, can be interpreted by using
the C<fprofpp> command-line program.

 $> fprofpp | head -n20

 # fprofpp output format is:
 # filename:line time count: source
 wordmatch:75 3.93338 251215: if ( $word =~ /($regex)/ ) {
 wordmatch:79 1.77774 251215: debug("word: $i_wd ".($has ? 'matches' : 'does not match')." chars: /$regex/");
 wordmatch:81 1.47604 251215: return $has;
 wordmatch:126 1.43441 260643: my $message = shift;
 wordmatch:128 1.42156 260643: if ( $debug ) {
 wordmatch:70 1.36824 251215: my $i_wd  = shift;
 wordmatch:71 1.36739 251215: my $word  = shift;
 wordmatch:72 1.35939 251215: my $regex = shift;

Straightaway we can see that the number of times each line has been called is
identical to the C<Devel::SmallProf> output, and the sequence is only very
slightly different based on the ordering of the amount of time each line took
to execute: C<if ( $debug ) { > and C<my $message = shift;>, for example.  The
differences in the actual times recorded might be due to the algorithm used
internally, or to system resource limitations or contention.

See also L<DBIx::Profile>, which will profile database queries running
under the C<DBIx::*> namespace.

=head2 Devel::NYTProf

C<Devel::NYTProf> is the B<next generation> of Perl code profiler, fixing many
shortcomings in other tools and implementing many cool features.  First of all it
can be used as a I<line>, a I<block> or a I<subroutine> profiler, all at
once.  It can also use sub-microsecond (100ns) resolution on
systems which provide C<clock_gettime()>.  It can be started and stopped even
by the program being profiled.  It's a one-line entry to profile C<mod_perl>
applications.  It's written in C<C> and is probably the fastest profiler
available for Perl.  The list of coolness just goes on.  Enough of that, let's
see how it works - just use the familiar C<-d> switch to plug it in and run
the code.

 $> perl -d:NYTProf wordmatch -f perl5db.pl

 wordmatch report for perl5db.pl:
 lines in file: 9427
 words in file: 50243
 words with special (non-word) characters: 20480
 words with only special (non-word) characters: 7790
 words with only consonants: 4801
 words with only capital letters: 1316
 words with only vowels: 1701

C<NYTProf> will generate a report database into the file F<nytprof.out> by
default.  Human readable reports can be generated from here by using the
supplied C<nytprofhtml> (HTML output) and C<nytprofcsv> (CSV output) programs.
We've used the Unix system C<html2text> utility to convert the
F<nytprof/index.html> file for convenience here.

 $> html2text nytprof/index.html

 Performance Profile Index
 For wordmatch
   Run on Fri Sep 26 13:46:39 2008
 Reported on Fri Sep 26 13:47:23 2008

          Top 15 Subroutines -- ordered by exclusive time
 |Calls |P |F |Inclusive|Exclusive|Subroutine                          |
 |      |  |  |Time     |Time     |                                    |
 |251215|5 |1 |13.09263 |10.47692 |main::              |matches        |
 |260642|2 |1 |2.71199  |2.71199  |main::              |debug          |
 |1     |1 |1 |0.21404  |0.21404  |main::              |report         |
 |2     |2 |2 |0.00511  |0.00511  |XSLoader::          |load (xsub)    |
 |14    |14|7 |0.00304  |0.00298  |Exporter::          |import         |
 |3     |1 |1 |0.00265  |0.00254  |Exporter::          |as_heavy       |
 |10    |10|4 |0.00140  |0.00140  |vars::              |import         |
 |13    |13|1 |0.00129  |0.00109  |constant::          |import         |
 |1     |1 |1 |0.00360  |0.00096  |FileHandle::        |import         |
 |3     |3 |3 |0.00086  |0.00074  |warnings::register::|import         |
 |9     |3 |1 |0.00036  |0.00036  |strict::            |bits           |
 |13    |13|13|0.00032  |0.00029  |strict::            |import         |
 |2     |2 |2 |0.00020  |0.00020  |warnings::          |import         |
 |2     |1 |1 |0.00020  |0.00020  |Getopt::Long::      |ParseOptionSpec|
 |7     |7 |6 |0.00043  |0.00020  |strict::            |unimport       |

 For more information see the full list of 189 subroutines.

The first part of the report already shows the critical information regarding
which subroutines are using the most time.  The next gives some statistics
about the source files profiled.

         Source Code Files -- ordered by exclusive time then name
 |Stmts  |Exclusive|Avg.   |Reports                     |Source File         |
 |       |Time     |       |                            |                    |
 |2699761|15.66654 |6e-06  |line   .    block   .    sub|wordmatch           |
 |35     |0.02187  |0.00062|line   .    block   .    sub|IO/Handle.pm        |
 |274    |0.01525  |0.00006|line   .    block   .    sub|Getopt/Long.pm      |
 |20     |0.00585  |0.00029|line   .    block   .    sub|Fcntl.pm            |
 |128    |0.00340  |0.00003|line   .    block   .    sub|Exporter/Heavy.pm   |
 |42     |0.00332  |0.00008|line   .    block   .    sub|IO/File.pm          |
 |261    |0.00308  |0.00001|line   .    block   .    sub|Exporter.pm         |
 |323    |0.00248  |8e-06  |line   .    block   .    sub|constant.pm         |
 |12     |0.00246  |0.00021|line   .    block   .    sub|File/Spec/Unix.pm   |
 |191    |0.00240  |0.00001|line   .    block   .    sub|vars.pm             |
 |77     |0.00201  |0.00003|line   .    block   .    sub|FileHandle.pm       |
 |12     |0.00198  |0.00016|line   .    block   .    sub|Carp.pm             |
 |14     |0.00175  |0.00013|line   .    block   .    sub|Symbol.pm           |
 |15     |0.00130  |0.00009|line   .    block   .    sub|IO.pm               |
 |22     |0.00120  |0.00005|line   .    block   .    sub|IO/Seekable.pm      |
 |198    |0.00085  |4e-06  |line   .    block   .    sub|warnings/register.pm|
 |114    |0.00080  |7e-06  |line   .    block   .    sub|strict.pm           |
 |47     |0.00068  |0.00001|line   .    block   .    sub|warnings.pm         |
 |27     |0.00054  |0.00002|line   .    block   .    sub|overload.pm         |
 |9      |0.00047  |0.00005|line   .    block   .    sub|SelectSaver.pm      |
 |13     |0.00045  |0.00003|line   .    block   .    sub|File/Spec.pm        |
 |2701595|15.73869 |       |Total                       |
 |128647 |0.74946  |       |Average                     |
 |       |0.00201  |0.00003|Median                      |
 |       |0.00121  |0.00003|Deviation                   |

 Report produced by the NYTProf 2.03 Perl profiler, developed by Tim Bunce and
 Adam Kaplan.

At this point, if you're using the I<html> report, you can click through the
various links to bore down into each subroutine and each line of code.  Because
we're using the text reporting here, and there's a whole directory full of
reports built for each source file, we'll just display a part of the
corresponding F<wordmatch-line.html> file, sufficient to give an idea of the
sort of output you can expect from this cool tool.

 $> html2text nytprof/wordmatch-line.html

 Performance Profile -- -block view-.-line view-.-sub view-
 For wordmatch
 Run on Fri Sep 26 13:46:39 2008
 Reported on Fri Sep 26 13:47:22 2008

 File wordmatch

  Subroutines -- ordered by exclusive time
 |Calls |P|F|Inclusive|Exclusive|Subroutine    |
 |      | | |Time     |Time     |              |
 |251215|5|1|13.09263 |10.47692 |main::|matches|
 |260642|2|1|2.71199  |2.71199  |main::|debug  |
 |1     |1|1|0.21404  |0.21404  |main::|report |
 |0     |0|0|0        |0        |main::|BEGIN  |


 |Line|Stmts.|Exclusive|Avg.   |Code                                           |
 |    |      |Time     |       |                                               |
 |1   |      |         |       |#!/usr/bin/perl                                |
 |2   |      |         |       |                                               |
 |    |      |         |       |use strict;                                    |
 |3   |3     |0.00086  |0.00029|# spent 0.00003s making 1 calls to strict::    |
 |    |      |         |       |import                                         |
 |    |      |         |       |use warnings;                                  |
 |4   |3     |0.01563  |0.00521|# spent 0.00012s making 1 calls to warnings::  |
 |    |      |         |       |import                                         |
 |5   |      |         |       |                                               |
 |6   |      |         |       |=head1 NAME                                    |
 |7   |      |         |       |                                               |
 |8   |      |         |       |filewords - word analysis of input file        |
 <...snip...>
 |62  |1     |0.00445  |0.00445|print report( %count );                        |
 |    |      |         |       |# spent 0.21404s making 1 calls to main::report|
 |63  |      |         |       |                                               |
 |    |      |         |       |# spent 23.56955s (10.47692+2.61571) within    |
 |    |      |         |       |main::matches which was called 251215 times,   |
 |    |      |         |       |avg 0.00005s/call: # 50243 times               |
 |    |      |         |       |(2.12134+0.51939s) at line 57 of wordmatch, avg|
 |    |      |         |       |0.00005s/call # 50243 times (2.17735+0.54550s) |
 |64  |      |         |       |at line 56 of wordmatch, avg 0.00005s/call #   |
 |    |      |         |       |50243 times (2.10992+0.51797s) at line 58 of   |
 |    |      |         |       |wordmatch, avg 0.00005s/call # 50243 times     |
 |    |      |         |       |(2.12696+0.51598s) at line 55 of wordmatch, avg|
 |    |      |         |       |0.00005s/call # 50243 times (1.94134+0.51687s) |
 |    |      |         |       |at line 54 of wordmatch, avg 0.00005s/call     |
 |    |      |         |       |sub matches {                                  |
 <...snip...>
 |102 |      |         |       |                                               |
 |    |      |         |       |# spent 2.71199s within main::debug which was  |
 |    |      |         |       |called 260642 times, avg 0.00001s/call: #      |
 |    |      |         |       |251215 times (2.61571+0s) by main::matches at  |
 |103 |      |         |       |line 74 of wordmatch, avg 0.00001s/call # 9427 |
 |    |      |         |       |times (0.09628+0s) at line 50 of wordmatch, avg|
 |    |      |         |       |0.00001s/call                                  |
 |    |      |         |       |sub debug {                                    |
 |104 |260642|0.58496  |2e-06  |my $message = shift;                           |
 |105 |      |         |       |                                               |
 |106 |260642|1.09917  |4e-06  |if ( $debug ) {                                |
 |107 |      |         |       |print STDERR "DBG: $message\n";                |
 |108 |      |         |       |}                                              |
 |109 |      |         |       |}                                              |
 |110 |      |         |       |                                               |
 |111 |1     |0.01501  |0.01501|exit 0;                                        |
 |112 |      |         |       |                                               |

Oodles of very useful information in there - this seems to be the way forward.

See also C<L<Devel::NYTProf::Apache>> which hooks C<Devel::NYTProf> into
C<mod_perl>.
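
The ability of a program to start and stop its own profiling, mentioned at the
top of this section, can be sketched as follows.  This assumes the
C<DB::enable_profile()> and C<DB::disable_profile()> functions and the
C<start=no> option described in the C<Devel::NYTProf> documentation; the file
name C<selective-profile> and the C<interesting> subroutine are invented for
the example.

# selective-profile

 #!/usr/bin/perl

 use strict;
 use warnings;

 # Hypothetical workload, standing in for the code worth profiling.
 sub interesting {
     my $total = 0;
     $total += $_ for 1 .. 100_000;
     return $total;
 }

 # The DB:: functions only exist when running under Devel::NYTProf,
 # so check first; this keeps the program usable without the profiler.
 my $profiling = defined &DB::enable_profile;

 DB::enable_profile()  if $profiling;
 my $total = interesting();
 DB::disable_profile() if $profiling;

 print "total: $total\n";

Run it with profiling switched off at startup, so that only the section
between the two calls is recorded:

 $> NYTPROF=start=no perl -d:NYTProf selective-profile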

=head1 SORTING

Perl modules are not the only tools a performance analyst has at their
disposal; system tools like C<time> should not be overlooked, as the next
example shows, where we take a quick look at sorting.  Many books, theses and
articles have been written about efficient sorting algorithms, and this is not
the place to repeat such work.  There are several good sorting modules which
deserve a look too: C<Sort::Maker> and C<Sort::Key> spring to mind.
However, it's still possible to make some observations on certain Perl-specific
interpretations of issues relating to sorting data sets, and to give an example
or two of how sorting large data volumes can affect performance.
Firstly, an often overlooked point when sorting large amounts of data: one can
attempt to reduce the data set to be dealt with, and in many cases C<grep()> can
be quite useful as a simple filter:

 @data = sort grep { /$filter/ } @incoming

A command such as this can vastly reduce the volume of material to actually
sort through in the first place, and should not be too lightly disregarded
purely on the basis of its simplicity.  The C<KISS> principle is too often
overlooked - the next example uses the simple system C<time> utility to
demonstrate.  Let's take a look at an actual example of sorting the contents of
a large file; an Apache logfile will do.  This one has over a quarter of a
million lines, is 50M in size, and a snippet of it looks like this:

# logfile

 188.209-65-87.adsl-dyn.isp.belgacom.be - - [08/Feb/2007:12:57:16 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
 188.209-65-87.adsl-dyn.isp.belgacom.be - - [08/Feb/2007:12:57:16 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
 151.56.71.198 - - [08/Feb/2007:12:57:41 +0000] "GET /suse-on-vaio.html HTTP/1.1" 200 2858 "http://www.linux-on-laptops.com/sony.html" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1"
 151.56.71.198 - - [08/Feb/2007:12:57:42 +0000] "GET /data/css HTTP/1.1" 404 206 "http://www.rfi.net/suse-on-vaio.html" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1"
 151.56.71.198 - - [08/Feb/2007:12:57:43 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1"
 217.113.68.60 - - [08/Feb/2007:13:02:15 +0000] "GET / HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
 217.113.68.60 - - [08/Feb/2007:13:02:16 +0000] "GET /data/css HTTP/1.1" 404 206 "http://www.rfi.net/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
 debora.to.isac.cnr.it - - [08/Feb/2007:13:03:58 +0000] "GET /suse-on-vaio.html HTTP/1.1" 200 2858 "http://www.linux-on-laptops.com/sony.html" "Mozilla/5.0 (compatible; Konqueror/3.4; Linux) KHTML/3.4.0 (like Gecko)"
 debora.to.isac.cnr.it - - [08/Feb/2007:13:03:58 +0000] "GET /data/css HTTP/1.1" 404 206 "http://www.rfi.net/suse-on-vaio.html" "Mozilla/5.0 (compatible; Konqueror/3.4; Linux) KHTML/3.4.0 (like Gecko)"
 debora.to.isac.cnr.it - - [08/Feb/2007:13:03:58 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/5.0 (compatible; Konqueror/3.4; Linux) KHTML/3.4.0 (like Gecko)"
 195.24.196.99 - - [08/Feb/2007:13:26:48 +0000] "GET / HTTP/1.0" 200 3309 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9"
 195.24.196.99 - - [08/Feb/2007:13:26:58 +0000] "GET /data/css HTTP/1.0" 404 206 "http://www.rfi.net/" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9"
 195.24.196.99 - - [08/Feb/2007:13:26:59 +0000] "GET /favicon.ico HTTP/1.0" 404 209 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; fr; rv:1.8.0.9) Gecko/20061206 Firefox/1.5.0.9"
 crawl1.cosmixcorp.com - - [08/Feb/2007:13:27:57 +0000] "GET /robots.txt HTTP/1.0" 200 179 "-" "voyager/1.0"
 crawl1.cosmixcorp.com - - [08/Feb/2007:13:28:25 +0000] "GET /links.html HTTP/1.0" 200 3413 "-" "voyager/1.0"
 fhm226.internetdsl.tpnet.pl - - [08/Feb/2007:13:37:32 +0000] "GET /suse-on-vaio.html HTTP/1.1" 200 2858 "http://www.linux-on-laptops.com/sony.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
 fhm226.internetdsl.tpnet.pl - - [08/Feb/2007:13:37:34 +0000] "GET /data/css HTTP/1.1" 404 206 "http://www.rfi.net/suse-on-vaio.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
 80.247.140.134 - - [08/Feb/2007:13:57:35 +0000] "GET / HTTP/1.1" 200 3309 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
 80.247.140.134 - - [08/Feb/2007:13:57:37 +0000] "GET /data/css HTTP/1.1" 404 206 "http://www.rfi.net" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"
 pop.compuscan.co.za - - [08/Feb/2007:14:10:43 +0000] "GET / HTTP/1.1" 200 3309 "-" "www.clamav.net"
 livebot-207-46-98-57.search.live.com - - [08/Feb/2007:14:12:04 +0000] "GET /robots.txt HTTP/1.0" 200 179 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
 livebot-207-46-98-57.search.live.com - - [08/Feb/2007:14:12:04 +0000] "GET /html/oracle.html HTTP/1.0" 404 214 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
 dslb-088-064-005-154.pools.arcor-ip.net - - [08/Feb/2007:14:12:15 +0000] "GET / HTTP/1.1" 200 3309 "-" "www.clamav.net"
 196.201.92.41 - - [08/Feb/2007:14:15:01 +0000] "GET / HTTP/1.1" 200 3309 "-" "MOT-L7/08.B7.DCR MIB/2.2.1 Profile/MIDP-2.0 Configuration/CLDC-1.1"

The specific task here is to sort the 286,525 lines of this file by Response
Code, Query, Browser, Referring Url, and lastly Date.  One solution might be to
use the following code, which iterates over the files given on the
command-line.

# sort-apache-log

 #!/usr/bin/perl

 use strict;
 use warnings;

 my @data;

 LINE:
 while ( <> ) {
     my $line = $_;
     if (
         $line =~ m/^(
             ([\w\.\-]+)             # client
             \s*-\s*-\s*\[
             ([^]]+)                 # date
             \]\s*"\w+\s*
             (\S+)                   # query
             [^"]+"\s*
             (\d+)                   # status
             \s+\S+\s+"[^"]*"\s+"
             ([^"]*)                 # browser
             "
             .*
         )$/x
     ) {
         my @chunks = split(/ +/, $line);
         my $ip      = $1;
         my $date    = $2;
         my $query   = $3;
         my $status  = $4;
         my $browser = $5;

         push(@data, [$ip, $date, $query, $status, $browser, $line]);
     }
 }

 my @sorted = sort {
     $a->[3] cmp $b->[3]
             ||
     $a->[2] cmp $b->[2]
             ||
     $a->[0] cmp $b->[0]
             ||
     $a->[1] cmp $b->[1]
             ||
     $a->[4] cmp $b->[4]
 } @data;

 foreach my $data ( @sorted ) {
     print $data->[5];
 }

 exit 0;

When running this program, redirect C<STDOUT> so that the output can be
compared with that of the following test runs, and use the system C<time>
utility to check the overall runtime.

 $> time ./sort-apache-log logfile > out-sort

 real    0m17.371s
 user    0m15.757s
 sys     0m0.592s

The program took just over 17 wallclock seconds to run.  Note the different
values C<time> outputs: it's important to always use the same one, and not to
confuse what each one means.

=over 4

=item Elapsed Real Time

The overall, or wallclock, time between when C<time> was called, and when it
terminates.  The elapsed time includes both user and system times, and time
spent waiting for other users and processes on the system.  Inevitably, this is
the most approximate of the measurements given.

=item User CPU Time

The user time is the amount of time the entire process spent on behalf of the
user on this system executing this program.

=item System CPU Time

The system time is the amount of time the kernel itself spent executing
routines, or system calls, on behalf of this process user.

=back
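
The same three figures can be collected from inside a Perl program, using the
built-in C<times> function for the CPU times and C<Time::HiRes> for the
wallclock.  This is a rough sketch only; the file name C<time-in-perl> and the
workload are invented for the example.

# time-in-perl

 #!/usr/bin/perl

 use strict;
 use warnings;

 use Time::HiRes qw(time);

 my $wall_start = time();
 my ( $user_start, $sys_start ) = times();

 # Hypothetical workload: sort some pseudo-random numbers.
 my @sorted = sort { $a <=> $b } map { rand() } 1 .. 200_000;

 my ( $user_end, $sys_end ) = times();
 my $wall_end = time();

 printf "real %.3fs\n", $wall_end - $wall_start;
 printf "user %.3fs\n", $user_end - $user_start;
 printf "sys  %.3fs\n", $sys_end  - $sys_start;

Unlike the C<time> utility, this measures only the section between the two
sets of calls, so the cost of starting perl and compiling the program is
excluded.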

Running this same process as a C<Schwarzian Transform>, it is possible to
eliminate the input and output arrays for storing all the data, and to work on
the input directly as it arrives.  Otherwise, the code looks fairly similar:

# sort-apache-log-schwarzian

 #!/usr/bin/perl

 use strict;
 use warnings;

 print

     map $_->[0] =>

     sort {
         $a->[4] cmp $b->[4]
                 ||
         $a->[3] cmp $b->[3]
                 ||
         $a->[1] cmp $b->[1]
                 ||
         $a->[2] cmp $b->[2]
                 ||
         $a->[5] cmp $b->[5]
     }
     map  [ $_, m/^(
         ([\w\.\-]+)             # client
         \s*-\s*-\s*\[
         ([^]]+)                 # date
         \]\s*"\w+\s*
         (\S+)                   # query
         [^"]+"\s*
         (\d+)                   # status
         \s+\S+\s+"[^"]*"\s+"
         ([^"]*)                 # browser
         "
         .*
     )$/xo ]

     => <>;

 exit 0;

Run the new code against the same logfile, as above, to check the new time.

 $> time ./sort-apache-log-schwarzian logfile > out-schwarz

 real    0m9.664s
 user    0m8.873s
 sys     0m0.704s

The time has been cut nearly in half (from 17.4 to 9.7 seconds), which is a
respectable speed improvement by any standard.  Naturally, it is important to
check that the output is consistent with the first program run; this is where
the Unix system C<cksum> utility comes in.

 $> cksum out-sort out-schwarz
 3044173777 52029194 out-sort
 3044173777 52029194 out-schwarz
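
Stripped of the log-parsing regex, the transform used above reduces to three
steps: decorate each item with its precomputed sort key, sort on that key, then
strip the decoration again.  The following is a minimal sketch of that shape
only; the word list and the length-based key are assumptions chosen for
illustration and are not taken from the log-sorting program.

 use strict;
 use warnings;

 my @words = qw(pear fig banana kiwi apple);

 my @sorted =
     map  { $_->[0] }                        # 3. undecorate
     sort { $a->[1] <=> $b->[1]              # 2. sort on the cached key,
                    ||
            $a->[0] cmp $b->[0] }            #    then alphabetically
     map  { [ $_, length $_ ] }              # 1. decorate with the key
     @words;

 print "@sorted\n";    # fig kiwi pear apple banana

The key is computed once per item instead of once per comparison, which is
where the saving comes from when the key is expensive to extract.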

BTW. Beware too of pressure from managers who see you speed a program up by 50%
of the runtime once, only to get a request one month later to do the same again
(true story) - you'll just have to point out you're only human, even if you are a
Perl programmer, and you'll see what you can do...

=head1 LOGGING

An essential part of any good development process is appropriate error handling
with appropriately informative messages.  However, there exists a school of
thought which suggests that log files should be I<chatty>, as if the chain of
unbroken output somehow ensures the survival of the program.  If speed is in
any way an issue, this approach is wrong.

A common sight is code which looks something like this:

 logger->debug( "A logging message via process-id: $$ INC: "
                                                       . Dumper(\%INC) )

The problem is that this code will always be parsed and executed, even when the
debug level set in the logging configuration file is zero.  Once the debug()
subroutine has been entered, and the internal C<$debug> variable confirmed to
be zero, for example, the message which has been sent in will be discarded and
the program will continue.  In the example given though, the C<\%INC> hash will
already have been dumped, and the message string constructed, all of which work
could be bypassed by a debug variable at the statement level, like this:

 logger->debug( "A logging message via process-id: $$ INC: "
                                            . Dumper(\%INC) ) if $DEBUG;

This effect can be demonstrated by setting up a test script with both forms,
including a C<debug()> subroutine to emulate typical C<logger()> functionality.

# ifdebug

 #!/usr/bin/perl

 use strict;
 use warnings;

 use Benchmark;
 use Data::Dumper;
 my $DEBUG = 0;

 sub debug {
     my $msg = shift;

     if ( $DEBUG ) {
         print "DEBUG: $msg\n";
     }
 };

 timethese(100000, {
         'debug'       => sub {
             debug( "A $0 logging message via process-id: $$" . Dumper(\%INC) )
         },
         'ifdebug'  => sub {
             debug( "A $0 logging message via process-id: $$" . Dumper(\%INC) ) if $DEBUG
         },
 });

Let's see what C<Benchmark> makes of this:

 $> perl ifdebug
 Benchmark: timing 100000 iterations of debug, ifdebug...
    ifdebug:  0 wallclock secs ( 0.01 usr +  0.00 sys =  0.01 CPU) @ 10000000.00/s (n=100000)
             (warning: too few iterations for a reliable count)
      debug: 14 wallclock secs (13.18 usr +  0.04 sys = 13.22 CPU) @ 7564.30/s (n=100000)

In the one case the code, which does exactly the same thing as far as
outputting any debugging information is concerned, in other words nothing,
takes 14 seconds, and in the other case the code takes one hundredth of a
second.  Looks fairly definitive.  Use a C<$DEBUG> variable BEFORE you call the
subroutine, rather than relying on the smart functionality inside it.
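
Where a statement modifier on every call is not convenient, another option is
to pass the logging routine a code reference and let it build the message only
when debugging is on.  This is a sketch of that alternative and not something
the benchmark above measures; C<debug_lazy> and the C<$built> counter are
hypothetical names introduced here for illustration.

 use strict;
 use warnings;

 my $DEBUG = 0;
 my $built = 0;    # counts how often a message was actually constructed

 # Hypothetical helper: calls the code reference only when $DEBUG is true.
 sub debug_lazy {
     my $build_msg = shift;
     return unless $DEBUG;
     print "DEBUG: ", $build_msg->(), "\n";
 }

 for my $i ( 1 .. 1000 ) {
     debug_lazy( sub { $built++; "expensive message $i for process-id: $$" } );
 }

 print "message built $built times\n";    # message built 0 times

The subroutine call itself still happens each time, so the statement-level
test remains the cheaper of the two.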

=head2 Logging if DEBUG (constant)

It's possible to take the previous idea a little further, by using a compile
time C<DEBUG> constant.

# ifdebug-constant

 #!/usr/bin/perl

 use strict;
 use warnings;

 use Benchmark;
 use Data::Dumper;
 use constant
     DEBUG => 0
 ;

 sub debug {
     if ( DEBUG ) {
         my $msg = shift;
         print "DEBUG: $msg\n";
     }
 };

 timethese(100000, {
         'sub'         => sub {
             debug( "A $0 logging message via process-id: $$" . Dumper(\%INC) )
         },
         'constant'  => sub {
             debug( "A $0 logging message via process-id: $$" . Dumper(\%INC) ) if DEBUG
         },
 });

Running this program produces the following output:

 $> perl ifdebug-constant
 Benchmark: timing 100000 iterations of constant, sub...
   constant:  0 wallclock secs (-0.00 usr +  0.00 sys = -0.00 CPU) @ -7205759403792793600000.00/s (n=100000)
             (warning: too few iterations for a reliable count)
        sub: 14 wallclock secs (13.09 usr +  0.00 sys = 13.09 CPU) @ 7639.42/s (n=100000)

The C<DEBUG> constant wipes the floor with even the C<$DEBUG> variable,
clocking in at minus zero seconds, and generates a "warning: too few iterations
for a reliable count" message into the bargain.  To see what is really going
on, and why we had too few iterations when we thought we asked for 100000, we
can use the very useful C<B::Deparse> to inspect the new code:

 $> perl -MO=Deparse ifdebug-constant

 use Benchmark;
 use Data::Dumper;
 use constant ('DEBUG', 0);
 sub debug {
     use warnings;
     use strict 'refs';
     0;
 }
 use warnings;
 use strict 'refs';
 timethese(100000, {'sub', sub {
     debug "A $0 logging message via process-id: $$" . Dumper(\%INC);
 }
 , 'constant', sub {
     0;
 }
 });
 ifdebug-constant syntax OK

The output shows the C<constant> test subroutine being replaced with
the value of the C<DEBUG> constant: zero.  The line to be tested has been
completely optimized away, and you can't get much more efficient than that.
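
A compile-time constant does not have to be hard-coded.  As a sketch, and
assuming an environment variable named C<MYAPP_DEBUG> (a hypothetical name used
only for this example), the value can be fixed when the program is compiled
while still being selectable per run:

 use strict;
 use warnings;

 # MYAPP_DEBUG is a hypothetical environment variable name.
 use constant DEBUG => $ENV{MYAPP_DEBUG} ? 1 : 0;

 sub debug {
     my $msg = shift;
     print "DEBUG: $msg\n";
 }

 debug( "A $0 logging message via process-id: $$" ) if DEBUG;

 print "done\n";

Because C<use constant> runs at compile time, the environment variable has to
be set before the program starts; changing it later has no effect.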

=head1 POSTSCRIPT

This document has provided several ways to go about identifying hot-spots, and
checking whether any modifications have improved the runtime of the code.

As a final thought, remember that it's not (at the time of writing) possible to
produce a useful program which will run in zero or negative time.  This basic
principle can be written as: I<useful programs are slow> by their very
definition.  It is of course possible to write a nearly instantaneous program,
but it's not going to do very much.  Here's a very efficient one:

 $> perl -e 0

Optimizing that any further is a job for C<p5p>.

=head1 SEE ALSO

Further reading can be found using the modules and links below.

=head2 PERLDOCS

For example: C<perldoc -f sort>.

L<perlfaq4>.

L<perlfork>, L<perlfunc>, L<perlretut>, L<perlthrtut>.

L<threads>.

=head2 MAN PAGES

C<time>.

=head2 MODULES

It's not possible to individually showcase all the performance related code for
Perl here, naturally, but here's a short list of modules from the CPAN which
deserve further attention.

 Apache::DProf
 Apache::SmallProf
 Benchmark
 DBIx::Profile
 Devel::AutoProfiler
 Devel::DProf
 Devel::DProfLB
 Devel::FastProf
 Devel::GraphVizProf
 Devel::NYTProf
 Devel::NYTProf::Apache
 Devel::Profiler
 Devel::Profile
 Devel::Profit
 Devel::SmallProf
 Devel::WxProf
 POE::Devel::Profiler
 Sort::Key
 Sort::Maker

=head2 URLS

Very useful online reference material:

 http://www.ccl4.org/~nick/P/Fast_Enough/

 http://www-128.ibm.com/developerworks/library/l-optperl.html

 http://perlbuzz.com/2007/11/bind-output-variables-in-dbi-for-speed-and-safety.html

 http://en.wikipedia.org/wiki/Performance_analysis

 http://apache.perl.org/docs/1.0/guide/performance.html

 http://perlgolf.sourceforge.net/

 http://www.sysarch.com/Perl/sort_paper.html

=head1 AUTHOR

Richard Foley <richard.foley@rfi.net> Copyright (c) 2008

=cut
perl5160delta.pod000064400000405023150344123460007546 0ustar00=encoding utf8

=head1 NAME

perl5160delta - what is new for perl v5.16.0

=head1 DESCRIPTION

This document describes differences between the 5.14.0 release and
the 5.16.0 release.

If you are upgrading from an earlier release such as 5.12.0, first read
L<perl5140delta>, which describes differences between 5.12.0 and
5.14.0.

Some bug fixes in this release have been backported to later
releases of 5.14.x.  Those are indicated with the 5.14.x version in
parentheses.

=head1 Notice

With the release of Perl 5.16.0, the 5.12.x series of releases is now out of
its support period.  There may be future 5.12.x releases, but only in the
event of a critical security issue.  Users of Perl 5.12 or earlier should
consider upgrading to a more recent release of Perl.

This policy is described in greater detail in
L<perlpolicy|perlpolicy/MAINTENANCE AND SUPPORT>.

=head1 Core Enhancements

=head2 C<use I<VERSION>>

As of this release, version declarations like C<use v5.16> now disable
all features before enabling the new feature bundle.  This means that
the following holds true:

    use 5.016;
    # only 5.16 features enabled here
    use 5.014;
    # only 5.14 features enabled here (not 5.16)

C<use v5.12> and higher continue to enable strict, but explicit C<use
strict> and C<no strict> now override the version declaration, even
when they come first:

    no strict;
    use 5.012;
    # no strict here

There is a new ":default" feature bundle that represents the set of
features enabled before any version declaration or C<use feature> has
been seen.  Version declarations below 5.10 now enable the ":default"
feature set.  This does not actually change the behavior of C<use
v5.8>, because features added to the ":default" set are those that were
traditionally enabled by default, before they could be turned off.

C<< no feature >> now resets to the default feature set.  To disable all
features (which is likely to be a pretty special-purpose request, since
it presumably won't match any named set of semantics) you can now  
write C<< no feature ':all' >>.

C<$[> is now disabled under C<use v5.16>.  It is part of the default
feature set and can be turned on or off explicitly with C<use feature
'array_base'>.

=head2 C<__SUB__>

The new C<__SUB__> token, available under the C<current_sub> feature
(see L<feature>) or C<use v5.16>, returns a reference to the current
subroutine, making it easier to write recursive closures.
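
As a short illustration (the factorial function is just a convenient example),
an anonymous subroutine can call itself through C<__SUB__> without referring
to the variable that holds it:

    use v5.16;

    my $factorial = sub {
        my $n = shift;
        return $n <= 1 ? 1 : $n * __SUB__->( $n - 1 );
    };

    say $factorial->(5);    # 120

This avoids the reference cycle that arises when the closure captures its own
variable in order to recurse.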

=head2 New and Improved Built-ins

=head3 More consistent C<eval>

The C<eval> operator sometimes treats a string argument as a sequence of
characters and sometimes as a sequence of bytes, depending on the
internal encoding.  The internal encoding is not supposed to make any
difference, but there is code that relies on this inconsistency.

The new C<unicode_eval> and C<evalbytes> features (enabled under C<use
5.16.0>) resolve this.  The C<unicode_eval> feature causes C<eval
$string> to treat the string always as Unicode.  The C<evalbytes>
feature provides a function, itself called C<evalbytes>, which
evaluates its argument always as a string of bytes.

These features also fix oddities with source filters leaking to outer
dynamic scopes.

See L<feature> for more detail.

=head3 C<substr> lvalue revamp

=for comment Does this belong here, or under Incompatible Changes?

When C<substr> is called in lvalue or potential lvalue context with two
or three arguments, a special lvalue scalar is returned that modifies
the original string (the first argument) when assigned to.

Previously, the offsets (the second and third arguments) passed to
C<substr> would be converted immediately to match the string, negative
offsets being translated to positive and offsets beyond the end of the
string being truncated.

Now, the offsets are recorded without modification in the special
lvalue scalar that is returned, and the original string is not even
looked at by C<substr> itself, but only when the returned lvalue is
read or modified.

This results in one incompatible change:

If the original string changes length after the call to C<substr> but
before assignment to its return value, negative offsets will remember
their position from the end of the string, affecting code like this:

    my $string = "string";
    my $lvalue = \substr $string, -4, 2;
    print $$lvalue, "\n"; # prints "ri"
    $string = "bailing twine";
    print $$lvalue, "\n"; # prints "wi"; used to print "il"

The same thing happens with an omitted third argument.  The returned
lvalue will always extend to the end of the string, even if the string
becomes longer.
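
A small sketch of the two-argument case, assuming perl 5.16 or later (the
strings are arbitrary examples):

    use strict;
    use warnings;

    my $string = "abc";
    my $tail   = \substr $string, 1;
    print $$tail, "\n";    # prints "bc"
    $string = "abcdef";
    print $$tail, "\n";    # prints "bcdef" on 5.16 and later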

Since this change also allowed many bugs to be fixed (see
L</The C<substr> operator>), and since the behavior
of negative offsets has never been specified, the
change was deemed acceptable.

=head3 Return value of C<tied>

The value returned by C<tied> on a tied variable is now the actual
scalar that holds the object to which the variable is tied.  This
lets ties be weakened with C<Scalar::Util::weaken(tied
$tied_variable)>.

=head2 Unicode Support

=head3 Supports (I<almost>) Unicode 6.1

Besides the addition of whole new scripts, and new characters in
existing scripts, this new version of Unicode, as always, makes some
changes to existing characters.  One change that may trip up some
applications is that the General Category of two characters in the
Latin-1 range, PILCROW SIGN and SECTION SIGN, has been changed from
Other_Symbol to Other_Punctuation.  The same change has been made for
a character in each of Tibetan, Ethiopic, and Aegean.
The code points U+3248..U+324F (CIRCLED NUMBER TEN ON BLACK SQUARE
through CIRCLED NUMBER EIGHTY ON BLACK SQUARE) have had their General
Category changed from Other_Symbol to Other_Numeric.  The Line Break
property has changes for Hebrew and Japanese; and because of
other changes in 6.1, the Perl regular expression construct C<\X> now
works differently for some characters in Thai and Lao.

New aliases (synonyms) have been defined for many property values;
these, along with the previously existing ones, are all cross-indexed in
L<perluniprops>.

The return value of C<charnames::viacode()> is affected by other
changes:

 Code point      Old Name             New Name
   U+000A    LINE FEED (LF)        LINE FEED
   U+000C    FORM FEED (FF)        FORM FEED
   U+000D    CARRIAGE RETURN (CR)  CARRIAGE RETURN
   U+0085    NEXT LINE (NEL)       NEXT LINE
   U+008E    SINGLE-SHIFT 2        SINGLE-SHIFT-2
   U+008F    SINGLE-SHIFT 3        SINGLE-SHIFT-3
   U+0091    PRIVATE USE 1         PRIVATE USE-1
   U+0092    PRIVATE USE 2         PRIVATE USE-2
   U+2118    SCRIPT CAPITAL P      WEIERSTRASS ELLIPTIC FUNCTION

Perl will accept any of these names as input, but
C<charnames::viacode()> now returns the new name of each pair.  The
change for U+2118 is considered by Unicode to be a correction, that is
the original name was a mistake (but again, it will remain forever valid
to use it to refer to U+2118).  But most of these changes are the
fallout of the mistake Unicode 6.0 made in naming a character used in
Japanese cell phones to be "BELL", which conflicts with the longstanding
industry use of (and Unicode's recommendation to use) that name
to mean the ASCII control character at U+0007.  Therefore, that name
has been deprecated in Perl since v5.14, and any use of it will raise a
warning message (unless turned off).  The name "ALERT" is now the
preferred name for this code point, with "BEL" an acceptable short
form.  The name for the new cell phone character, at code point U+1F514,
remains undefined in this version of Perl (hence we don't 
implement quite all of Unicode 6.1), but starting in v5.18, BELL will mean
this character, and not U+0007.

Unicode has taken steps to make sure that this sort of mistake does not
happen again.  The Standard now includes all generally accepted
names and abbreviations for control characters, whereas previously it
didn't (though there were recommended names for most of them, which Perl
used).  This means that most of those recommended names are now
officially in the Standard.  Unicode did not recommend names for the
four code points listed above between U+008E and U+0092, and in
standardizing them Unicode subtly changed the names that Perl had
previously given them, by replacing the final blank in each name by a
hyphen.  Unicode also officially accepts names that Perl had deprecated,
such as FILE SEPARATOR.  Now the only deprecated name is BELL.
Finally, Perl now uses the new official names instead of the old
(now considered obsolete) names for the first four code points in the
list above (the ones which have the parentheses in them).

Now that the names have been placed in the Unicode standard, these kinds
of changes should not happen again, though corrections, such as to
U+2118, are still possible.

Unicode also added some name abbreviations, which Perl now accepts:
SP for SPACE;
TAB for CHARACTER TABULATION;
NEW LINE, END OF LINE, NL, and EOL for LINE FEED;
LOCKING-SHIFT ONE for SHIFT OUT;
LOCKING-SHIFT ZERO for SHIFT IN;
and ZWNBSP for ZERO WIDTH NO-BREAK SPACE.

More details on this version of Unicode are provided in
L<http://www.unicode.org/versions/Unicode6.1.0/>.

=head3 C<use charnames> is no longer needed for C<\N{I<name>}>

When C<\N{I<name>}> is encountered, the C<charnames> module is now
automatically loaded when needed as if the C<:full> and C<:short>
options had been specified.  See L<charnames> for more information.
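
For example, on perl 5.16 or later the following works without an explicit
C<use charnames> line (the character chosen is arbitrary):

    use strict;
    use warnings;

    my $alpha = "\N{GREEK SMALL LETTER ALPHA}";
    printf "U+%04X\n", ord $alpha;    # U+03B1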

=head3 C<\N{...}> can now have Unicode loose name matching

This is described in the C<charnames> item in
L</Updated Modules and Pragmata> below.

=head3 Unicode Symbol Names

Perl now has proper support for Unicode in symbol names.  It used to be
that C<*{$foo}> would ignore the internal UTF8 flag and use the bytes of
the underlying representation to look up the symbol.  That meant that
C<*{"\x{100}"}> and C<*{"\xc4\x80"}> would return the same thing.  All
these parts of Perl have been fixed to account for Unicode:

=over

=item *

Method names (including those passed to C<use overload>)

=item *

Typeglob names (including names of variables, subroutines, and filehandles)

=item *

Package names

=item *

C<goto>

=item *

Symbolic dereferencing

=item *

Second argument to C<bless()> and C<tie()>

=item *

Return value of C<ref()>

=item *

Subroutine prototypes

=item *

Attributes

=item *

Various warnings and error messages that mention variable names or values,
methods, etc.

=back

In addition, a parsing bug has been fixed that prevented C<*{é}> from
implicitly quoting the name, but instead interpreted it as C<*{+é}>, which
would cause a strict violation.

C<*{"*a::b"}> automatically strips off the * if it is followed by an ASCII
letter.  That has been extended to all Unicode identifier characters.

One-character non-ASCII non-punctuation variables (like C<$é>) are now
subject to "Used only once" warnings.  They used to be exempt, as they
were treated as punctuation variables.

Also, single-character Unicode punctuation variables (like C<$‰>) are now
supported [perl #69032].

=head3 Improved ability to mix locales and Unicode, including UTF-8 locales

An optional parameter has been added to C<use locale>

 use locale ':not_characters';

which tells Perl to use all but the C<LC_CTYPE> and C<LC_COLLATE>
portions of the current locale.  Instead, the character set is assumed
to be Unicode.  This lets locales and Unicode be seamlessly mixed,
including the increasingly frequent UTF-8 locales.  When using this
hybrid form of locales, the C<:locale> layer to the L<open> pragma can
be used to interface with the file system, and there are CPAN modules
available for ARGV and environment variable conversions.

Full details are in L<perllocale>.

=head3 New function C<fc> and corresponding escape sequence C<\F> for Unicode foldcase

Unicode foldcase is an extension to lowercase that gives better results
when comparing two strings case-insensitively.  It has long been used
internally in regular expression C</i> matching.  Now it is available
explicitly through the new C<fc> function call (enabled by
S<C<"use feature 'fc'">>, or C<use v5.16>, or explicitly callable via
C<CORE::fc>) or through the new C<\F> sequence in double-quotish
strings.

Full details are in L<perlfunc/fc>.
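
The difference from C<lc> shows up with characters whose case folding is not
simply their lowercase form.  A minimal sketch, using the German sharp s as
the example character:

    use v5.16;

    my $german = "stra\N{LATIN SMALL LETTER SHARP S}e";
    my $upper  = "STRASSE";

    say lc($german) eq lc($upper) ? "lc: equal" : "lc: different";
    say fc($german) eq fc($upper) ? "fc: equal" : "fc: different";

This prints C<lc: different> followed by C<fc: equal>, because the sharp s
folds to "ss" but lowercases to itself.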

=head3 The Unicode C<Script_Extensions> property is now supported.

New in Unicode 6.0, this is an improved C<Script> property.  Details
are in L<perlunicode/Scripts>.

=head2 XS Changes

=head3 Improved typemaps for Some Builtin Types

Most XS authors will know there is a longstanding bug in the
OUTPUT typemap for T_AVREF (C<AV*>), T_HVREF (C<HV*>), T_CVREF (C<CV*>),
and T_SVREF (C<SVREF> or C<\$foo>) that requires manually decrementing
the reference count of the return value instead of the typemap taking
care of this.  For backwards-compatibility, this cannot be changed in the
default typemaps.  But we now provide additional typemaps
C<T_AVREF_REFCOUNT_FIXED>, etc. that do not exhibit this bug.  Using
them in your extension is as simple as having one line in your
C<TYPEMAP> section:

  HV*	T_HVREF_REFCOUNT_FIXED

=head3 C<is_utf8_char()>

The XS-callable function C<is_utf8_char()>, when presented with
malformed UTF-8 input, can read up to 12 bytes beyond the end of the
string.  This cannot be fixed without changing its API, and so its
use is now deprecated.  Use C<is_utf8_char_buf()> (described just below)
instead.

=head3 Added C<is_utf8_char_buf()>

This function is designed to replace the deprecated L</is_utf8_char()>
function.  It includes an extra parameter to make sure it doesn't read
past the end of the input buffer.

=head3 Other C<is_utf8_foo()> functions, as well as C<utf8_to_foo()>, etc.

Most other XS-callable functions that take UTF-8 encoded input
implicitly assume that the UTF-8 is valid (not malformed) with respect to
buffer length.  Do not do things such as change a character's case or
see if it is alphanumeric without first being sure that it is valid
UTF-8.  This can be safely done for a whole string by using one of the
functions C<is_utf8_string()>, C<is_utf8_string_loc()>, and
C<is_utf8_string_loclen()>.

=head3 New Pad API

Many new functions have been added to the API for manipulating lexical
pads.  See L<perlapi/Pad Data Structures> for more information.

=head2 Changes to Special Variables

=head3 C<$$> can be assigned to

C<$$> was made read-only in Perl 5.8.0.  But only sometimes: C<local $$>
would make it writable again.  Some CPAN modules were using C<local $$> or
XS code to bypass the read-only check, so there is no reason to keep C<$$>
read-only.  (This change also allowed a bug to be fixed while maintaining
backward compatibility.)

=head3 C<$^X> converted to an absolute path on FreeBSD, OS X and Solaris

C<$^X> is now converted to an absolute path on OS X, FreeBSD (without
needing F</proc> mounted) and Solaris 10 and 11.  This augments the
previous approach of using F</proc> on Linux, FreeBSD, and NetBSD
(in all cases, where mounted).

This makes relocatable perl installations more useful on these platforms.
(See "Relocatable @INC" in F<INSTALL>.)

=head2 Debugger Changes

=head3 Features inside the debugger

The current Perl's L<feature> bundle is now enabled for commands entered
in the interactive debugger.

=head3 New option for the debugger's B<t> command

The B<t> command in the debugger, which toggles tracing mode, now
accepts a numeric argument that determines how many levels of subroutine
calls to trace.

=head3 C<enable> and C<disable>

The debugger now has C<disable> and C<enable> commands for disabling
existing breakpoints and re-enabling them.  See L<perldebug>.

=head3 Breakpoints with file names

The debugger's "b" command for setting breakpoints now lets a line
number be prefixed with a file name.  See
L<perldebug/"b [file]:[line] [condition]">.

=head2 The C<CORE> Namespace

=head3 The C<CORE::> prefix

The C<CORE::> prefix can now be used on keywords enabled by
L<feature.pm|feature>, even outside the scope of C<use feature>.

=head3 Subroutines in the C<CORE> namespace

Many Perl keywords are now available as subroutines in the CORE namespace.
This lets them be aliased:

    BEGIN { *entangle = \&CORE::tie }
    entangle $variable, $package, @args;

It also lets prototypes be bypassed:

    sub mytie(\[%$*@]$@) {
        my ($ref, $pack, @args) = @_;
        # ... do something ...
        goto &CORE::tie;
    }

Some of these cannot be called through references or via C<&foo> syntax,
but must be called as barewords.

See L<CORE> for details.
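
As a further sketch, a keyword that is callable through a reference can be
passed around like any other subroutine.  C<lc> is used here on the assumption
that it is not among the bareword-only keywords listed in L<CORE>:

    use strict;
    use warnings;

    my $to_lower = \&CORE::lc;
    my @lowered  = map { $to_lower->($_) } qw(Perl CORE Namespace);

    print "@lowered\n";    # perl core namespace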

=head2 Other Changes

=head3 Anonymous handles

Automatically generated file handles are now named __ANONIO__ when the
variable name cannot be determined, rather than $__ANONIO__.

=head3 Autoloaded sort Subroutines

Custom sort subroutines can now be autoloaded [perl #30661]:

    sub AUTOLOAD { ... }
    @sorted = sort foo @list; # uses AUTOLOAD

=head3 C<continue> no longer requires the "switch" feature

The C<continue> keyword has two meanings.  It can introduce a C<continue>
block after a loop, or it can exit the current C<when> block.  Up to now,
the latter meaning was valid only with the "switch" feature enabled, and
was a syntax error otherwise.  Since the main purpose of feature.pm is to
avoid conflicts with user-defined subroutines, there is no reason for
C<continue> to depend on it.

=head3 DTrace probes for interpreter phase change

The C<phase-change> probes will fire when the interpreter's phase
changes, which tracks the C<${^GLOBAL_PHASE}> variable.  C<arg0> is
the new phase name; C<arg1> is the old one.  This is useful 
for limiting your instrumentation to one or more of: compile time,
run time, or destruct time.

=head3 C<__FILE__()> Syntax

The C<__FILE__>, C<__LINE__> and C<__PACKAGE__> tokens can now be written
with an empty pair of parentheses after them.  This makes them parse the
same way as C<time>, C<fork> and other built-in functions.

=head3 The C<\$> prototype accepts any scalar lvalue

The C<\$> and C<\[$]> subroutine prototypes now accept any scalar lvalue
argument.  Previously they accepted only scalars beginning with C<$> and
hash and array elements.  This change makes them consistent with the way
the built-in C<read> and C<recv> functions (among others) parse their
arguments.  This means that one can override the built-in functions with
custom subroutines that parse their arguments the same way.

=head3 C<_> in subroutine prototypes

The C<_> character in subroutine prototypes is now allowed before C<@> or
C<%>.

=head1 Security

=head2 Use C<is_utf8_char_buf()> and not C<is_utf8_char()>

The latter function is now deprecated because its API is insufficient to
guarantee that it doesn't read (up to 12 bytes in the worst case) beyond
the end of its input string.  See
L<is_utf8_char_buf()|/Added is_utf8_char_buf()>.

=head2 Malformed UTF-8 input could cause attempts to read beyond the end of the buffer

Two new XS-accessible functions, C<utf8_to_uvchr_buf()> and
C<utf8_to_uvuni_buf()> are now available to prevent this, and the Perl
core has been converted to use them.
See L</Internal Changes>.

=head2 C<File::Glob::bsd_glob()> memory error with GLOB_ALTDIRFUNC (CVE-2011-2728).

Calling C<File::Glob::bsd_glob> with the unsupported flag
GLOB_ALTDIRFUNC would cause an access violation / segfault.  A Perl
program that accepts a flags value from an external source could expose
itself to denial of service or arbitrary code execution attacks.  There
are no known exploits in the wild.  The problem has been corrected by
explicitly disabling all unsupported flags and setting unused function
pointers to null.  Bug reported by Clément Lecigne. (5.14.2)

=head2 Privileges are now set correctly when assigning to C<$(>

A hypothetical bug (probably unexploitable in practice) caused by the
incorrect setting of the effective group ID while setting C<$(> has been
fixed.  The bug would have affected only systems that have C<setresgid()>
but not C<setregid()>, but no such systems are known to exist.

=head1 Deprecations

=head2 Don't read the Unicode data base files in F<lib/unicore>

It is now deprecated to directly read the Unicode data base files.
These are stored in the F<lib/unicore> directory.  Instead, you should
use the new functions in L<Unicode::UCD>.  These provide a stable API,
and give complete information.

Perl may at some point in the future change or remove these files.  The
file which applications were most likely to have used is
F<lib/unicore/ToDigit.pl>.  L<Unicode::UCD/prop_invmap()> can be used to
get at its data instead.

=head2 XS functions C<is_utf8_char()>, C<utf8_to_uvchr()> and
C<utf8_to_uvuni()>

These functions are deprecated because they could read beyond the end of the
input string.  Use the new L<is_utf8_char_buf()|/Added is_utf8_char_buf()>,
C<utf8_to_uvchr_buf()> and C<utf8_to_uvuni_buf()> instead.

=head1 Future Deprecations

This section serves as a notice of features that are I<likely> to be
removed or L<deprecated|perlpolicy/deprecated> in the next release of
perl (5.18.0).  If your code depends on these features, you should
contact the Perl 5 Porters via the L<mailing
list|http://lists.perl.org/list/perl5-porters.html> or L<perlbug> to
explain your use case and inform the deprecation process.

=head2 Core Modules

These modules may be marked as deprecated I<from the core>.  This only
means that they will no longer be installed by default with the core
distribution, but will remain available on the CPAN.

=over

=item *

CPANPLUS

=item *

Filter::Simple

=item *

PerlIO::mmap

=item *

Pod::LaTeX

=item *

Pod::Parser

=item *

SelfLoader

=item *

Text::Soundex

=item *

Thread.pm

=back

=head2 Platforms with no supporting programmers

These platforms will probably have their
special build support removed during the
5.17.0 development series.

=over

=item *

BeOS

=item *

djgpp

=item *

dgux

=item *

EPOC

=item *

MPE/iX

=item *

Rhapsody

=item *

UTS

=item *

VM/ESA

=back

=head2 Other Future Deprecations

=over

=item *

Swapping of $< and $>

For more information about this future deprecation, see L<the relevant RT
ticket|https://rt.perl.org/rt3/Ticket/Display.html?id=96212>.

=item *

sfio, stdio

Perl supports being built without PerlIO proper, using a stdio or sfio
wrapper instead.  A perl build like this will not support IO layers and
thus Unicode IO, making it rather handicapped.

PerlIO supports a C<stdio> layer if stdio use is desired, and similarly a
sfio layer could be produced.

=item *

Unescaped literal C<< "{" >> in regular expressions.

Starting with v5.20, it is planned to require a literal C<"{"> to be
escaped, for example by preceding it with a backslash.  In v5.18, a
deprecation warning message will be emitted for all such uses.
This affects only patterns that are to match a literal C<"{">.  Other
uses of this character, such as part of a quantifier or sequence as in
those below, are completely unaffected:

    /foo{3,5}/
    /\p{Alphabetic}/
    /\N{DIGIT ZERO}/

Removing support for the unescaped form will permit extensions to Perl's
pattern syntax and better error checking for existing syntax.  See
L<perlre/Quantifiers> for an example.

=item *

Revamping C<< "\Q" >> semantics in double-quotish strings when combined with other escapes.

There are several bugs and inconsistencies involving combinations
of C<\Q> and escapes like C<\x>, C<\L>, etc., within a C<\Q...\E> pair.
These need to be fixed, and doing so will necessarily change current
behavior.  The changes have not yet been settled.

=back
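
For the unescaped C<"{"> item in the list above, the forward-compatible
spelling is simply to escape a brace that is meant literally.  A minimal
sketch with arbitrary example strings:

    use strict;
    use warnings;

    # A literal brace, escaped so it keeps working under the planned rules.
    print "literal brace matched\n" if 'hash{key}' =~ /hash\{key\}/;

    # A quantifier brace, which is unaffected by the change.
    print "quantifier matched\n"    if 'foooo' =~ /fo{3,5}/;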

=head1 Incompatible Changes

=head2 Special blocks called in void context

Special blocks (C<BEGIN>, C<CHECK>, C<INIT>, C<UNITCHECK>, C<END>) are now
called in void context.  This avoids wasteful copying of the result of the
last statement [perl #108794].

=head2 The C<overloading> pragma and regexp objects

With C<no overloading>, regular expression objects returned by C<qr//> are
now stringified as "Regexp=REGEXP(0xbe600d)" instead of the regular
expression itself [perl #108780].
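
As a rough sketch of what this means in practice (the variable names are
illustrative, and the address in the second string differs from run to
run):

    use strict;
    use warnings;

    my $re = qr/abc/;

    # Normal stringification yields the pattern, flags included.
    my $as_pattern = "$re";

    # Under "no overloading" the object is stringified like any other
    # blessed reference, in the form Regexp=REGEXP(0x...).
    my $as_reference = do { no overloading; "$re" };

    print "$as_pattern\n$as_reference\n";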

=head2 Two XS typemap Entries removed

Two presumably unused XS typemap entries have been removed from the
core typemap: T_DATAUNIT and T_CALLBACK.  If you are, against all odds,
a user of these, please see the instructions on how to restore them
in L<perlxstypemap>.

=head2 Unicode 6.1 has incompatibilities with Unicode 6.0

These are detailed in L</Supports (almost) Unicode 6.1> above.
You can compile this version of Perl to use Unicode 6.0.  See
L<perlunicode/Hacking Perl to work on earlier Unicode versions (for very serious hackers only)>.

=head2 Borland compiler

All support for the Borland compiler has been dropped.  The code had not
worked for a long time anyway.

=head2 Certain deprecated Unicode properties are no longer supported by default

Perl should never have exposed certain Unicode properties that are used
by Unicode internally and not meant to be publicly available.  Use of
these has generated deprecation warning messages since Perl 5.12.  The
removed properties are Other_Alphabetic,
Other_Default_Ignorable_Code_Point, Other_Grapheme_Extend,
Other_ID_Continue, Other_ID_Start, Other_Lowercase, Other_Math, and
Other_Uppercase.

Perl may be recompiled to include any or all of them; instructions are
given in
L<perluniprops/Unicode character properties that are NOT accepted by Perl>.

=head2 Dereferencing IO thingies as typeglobs

The C<*{...}> operator, when passed a reference to an IO thingy (as in
C<*{*STDIN{IO}}>), creates a new typeglob containing just that IO object.
Previously, it would stringify as an empty string, but some operators would
treat it as undefined, producing an "uninitialized" warning.
Now it stringifies as __ANONIO__ [perl #96326].
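
A small sketch of the new behavior follows.  It assumes only that the
resulting name contains C<__ANONIO__>; any package prefix that appears in
the output is not something this text specifies:

    use strict;
    use warnings;

    # *STDIN{IO} is a reference to the IO object behind STDIN;
    # *{...} wraps it in a fresh typeglob holding just that object.
    my $glob = *{ *STDIN{IO} };

    # Stringify the new typeglob and show its name.
    my $name = "$glob";
    print "$name\n";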

=head2 User-defined case-changing operations

This feature was deprecated in Perl 5.14, and has now been removed.
The CPAN module L<Unicode::Casing> provides better functionality without
the drawbacks that this feature had, as are detailed in the 5.14
documentation:
L<http://perldoc.perl.org/5.14.0/perlunicode.html#User-Defined-Case-Mappings-%28for-serious-hackers-only%29>

=head2 XSUBs are now 'static'

XSUB C functions are now 'static', that is, they are not visible from
outside the compilation unit.  Users can use the new C<XS_EXTERNAL(name)>
and C<XS_INTERNAL(name)> macros to pick the desired linking behavior.
The ordinary C<XS(name)> declaration for XSUBs will continue to declare
non-'static' XSUBs for compatibility, but the XS compiler,
L<ExtUtils::ParseXS> (C<xsubpp>) will emit 'static' XSUBs by default.
L<ExtUtils::ParseXS>'s behavior can be reconfigured from XS using the
C<EXPORT_XSUB_SYMBOLS> keyword.  See L<perlxs> for details.

=head2 Weakening read-only references

Weakening read-only references is no longer permitted.  It should never
have worked anyway, and could sometimes result in crashes.

=head2 Tying scalars that hold typeglobs

Attempting to tie a scalar after a typeglob was assigned to it would
instead tie the handle in the typeglob's IO slot.  This meant that it was
impossible to tie the scalar itself.  Similar problems affected C<tied> and
C<untie>: C<tied $scalar> would return false on a tied scalar if the last
thing returned was a typeglob, and C<untie $scalar> on such a tied scalar
would do nothing.

We fixed this problem before Perl 5.14.0, but it caused problems with some
CPAN modules, so we put in a deprecation cycle instead.

Now the deprecation has been removed and this bug has been fixed.  So
C<tie $scalar> will always tie the scalar, not the handle it holds.  To tie
the handle, use C<tie *$scalar> (with an explicit asterisk).  The same
applies to C<tied *$scalar> and C<untie *$scalar>.
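
The following sketch illustrates the fixed behavior.  C<UpperCase> is a
hypothetical tie class invented for this example, not a module shipped
with perl:

    use strict;
    use warnings;

    # Hypothetical minimal tie class, for illustration only.
    package UpperCase;
    sub TIESCALAR {
        my ($class) = @_;
        my $value = '';
        return bless \$value, $class;
    }
    sub FETCH { my ($self) = @_; return uc $$self }
    sub STORE { my ($self, $new) = @_; $$self = $new }

    package main;

    my $scalar = *STDOUT;        # the scalar now holds a typeglob
    tie $scalar, 'UpperCase';    # ties the scalar, not the handle

    $scalar = 'hello';                    # goes through STORE
    my $tied_class = ref(tied($scalar));  # class of the tie object
    my $value      = $scalar;             # goes through FETCH

    untie $scalar;
    print "$tied_class $value\n";

To tie the handle held in the scalar instead, the same calls would be
written with an explicit asterisk, as in C<tie *$scalar, ...>.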

=head2 IPC::Open3 no longer provides C<xfork()>, C<xclose_on_exec()>
and C<xpipe_anon()>

All three functions were private, undocumented, and unexported.  They do
not appear to be used by any code on CPAN.  Two have been inlined and one
deleted entirely.

=head2 C<$$> no longer caches PID

Previously, if one called fork(3) from C, Perl's
notion of C<$$> could go out of sync with what getpid() returns.  By always
fetching the value of C<$$> via getpid(), this potential bug is eliminated.
Code that depends on the caching behavior will break.  As described in
L<Core Enhancements|/C<$$> can be assigned to>,
C<$$> is now writable, but it will be reset during a
fork.
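
A trivial way to observe this, assuming the core L<POSIX> module is
available, is to compare C<$$> with C<POSIX::getpid()>:

    use strict;
    use warnings;
    use POSIX ();

    # $$ is fetched from the OS on each read, so it agrees with
    # getpid() as seen through the POSIX module.
    my $perl_pid = $$;
    my $os_pid   = POSIX::getpid();

    print $perl_pid == $os_pid ? "in sync\n" : "out of sync\n";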

=head2 C<$$> and C<getppid()> no longer emulate POSIX semantics under LinuxThreads

The POSIX emulation of C<$$> and C<getppid()> under the obsolete
LinuxThreads implementation has been removed.
This only impacts users of Linux 2.4 and
users of Debian GNU/kFreeBSD up to and including 6.0, not the vast
majority of Linux installations that use NPTL threads.

This means that C<getppid()>, like C<$$>, is now always guaranteed to
return the OS's idea of the current state of the process, not perl's
cached version of it.

See the documentation for L<$$|perlvar/$$> for details.

=head2 C<< $< >>, C<< $> >>, C<$(> and C<$)> are no longer cached

Similarly to the changes to C<$$> and C<getppid()>, the internal
caching of C<< $< >>, C<< $> >>, C<$(> and C<$)> has been removed.

When we cached these values our idea of what they were would drift out
of sync with reality if someone (e.g., someone embedding perl) called
C<sete?[ug]id()> without updating C<PL_e?[ug]id>.  Having to deal with
this complexity wasn't worth it given how cheap the C<gete?[ug]id()>
system call is.

This change will break a handful of CPAN modules that use the XS-level
C<PL_uid>, C<PL_gid>, C<PL_euid> or C<PL_egid> variables.

The fix for those breakages is to use C<PerlProc_gete?[ug]id()> to
retrieve them (e.g., C<PerlProc_getuid()>), and not to assign to
C<PL_e?[ug]id> if you change the UID/GID/EUID/EGID.  There is no longer
any need to do so since perl will always retrieve the up-to-date
version of those values from the OS.

=head2 Which Non-ASCII characters get quoted by C<quotemeta> and C<\Q> has changed

This is unlikely to result in a real problem, as Perl does not attach
special meaning to any non-ASCII character, so it is currently
irrelevant which are quoted or not.  This change fixes bug [perl #77654] and
brings Perl's behavior more into line with Unicode's recommendations.
See L<perlfunc/quotemeta>.

=head1 Performance Enhancements

=over

=item *

Improved performance for Unicode properties in regular expressions

=for comment Can this be compacted some? -- rjbs, 2012-02-20

Matching a code point against a Unicode property is now done via a
binary search instead of linear.  This means for example that the worst
case for a 1000 item property is 10 probes instead of 1000.  The
inefficiency of the linear search was compensated for in the past by
permanently storing
in a hash the results of a given probe plus the results for the adjacent
64 code points, under the theory that near-by code points are likely to
be searched for.  A separate hash was used for each mention of a Unicode
property in each regular expression.  Thus, C<qr/\p{foo}abc\p{foo}/>
would generate two hashes.  Any probes in one instance would be unknown
to the other, and the hashes could expand separately to be quite large
if the regular expression were used on many different widely-separated
code points.
Now, however, there is just one hash shared by all instances of a given
property.  This means that if C<\p{foo}> is matched against "A" in one
regular expression in a thread, the result will be known immediately to
all regular expressions, and the relentless march of using up memory is
slowed considerably.

=item *

Version declarations with the C<use> keyword (e.g., C<use 5.012>) are now
faster, as they enable features without loading F<feature.pm>.

=item *

C<local $_> is faster now, as it no longer iterates through magic that it
is not going to copy anyway.

=item *

Perl 5.12.0 sped up the destruction of objects whose classes define
empty C<DESTROY> methods (to prevent autoloading), by simply not
calling such empty methods.  This release takes this optimization a
step further, by not calling any C<DESTROY> method that begins with a
C<return> statement.  This can be useful for destructors that are only
used for debugging:

    use constant DEBUG => 1;
    sub DESTROY { return unless DEBUG; ... }

Constant-folding will reduce the first statement to C<return;> if DEBUG
is set to 0, triggering this optimization.

=item *

Assigning to a variable that holds a typeglob or copy-on-write scalar
is now much faster.  Previously the typeglob would be stringified or
the copy-on-write scalar would be copied before being clobbered.

=item *

Assignment to C<substr> in void context is now more than twice its
previous speed.  Instead of creating and returning a special lvalue
scalar that is then assigned to, C<substr> modifies the original string
itself.

=item *

C<substr> no longer calculates a value to return when called in void
context.

=item *

Due to changes in L<File::Glob>, Perl's C<glob> function and its C<<
<...> >> equivalent are now much faster.  The splitting of the pattern
into words has been rewritten in C, resulting in speed-ups of 20% for
some cases.

This does not affect C<glob> on VMS, as it does not use File::Glob.

=item *

The short-circuiting operators C<&&>, C<||>, and C<//>, when chained
(such as C<$a || $b || $c>), are now considerably faster to short-circuit,
due to reduced optree traversal.

=item *

The implementation of C<s///r> makes one fewer copy of the scalar's value.

=item *

Recursive calls to lvalue subroutines in lvalue scalar context use less
memory.

=back
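
Several of the constructs mentioned in the list above are shown together
in the following sketch.  It only illustrates the syntax that benefits;
it does not measure the speed-ups.  The variable names are illustrative,
and C<//> and C<s///r> need perl 5.10 and 5.14 or later respectively:

    use strict;
    use warnings;

    # Assignment to substr in void context modifies the string
    # in place.
    my $greeting = 'hello world';
    substr($greeting, 0, 5) = 'HELLO';

    # A chain of short-circuiting operators stops at the first
    # operand that decides the result.
    my ($first, $second, $third) = (undef, 0, 'fallback');
    my $defined = $first // $second // $third;   # first defined value
    my $true    = $first || $second || $third;   # first true value

    # s///r returns the modified copy and leaves the original alone.
    my $original = 'perl 5.14';
    my $changed  = $original =~ s/5\.14/5.16/r;

    print "$greeting|$defined|$true|$changed|$original\n";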

=head1 Modules and Pragmata

=head2 Deprecated Modules

=over

=item L<Version::Requirements>

Version::Requirements is now DEPRECATED.  Use L<CPAN::Meta::Requirements>
instead, which is a drop-in replacement.  Version::Requirements will be
deleted from perl.git blead in v5.17.0.

=back

=head2 New Modules and Pragmata

=over 4

=item *

L<arybase> -- this new module implements the C<$[> variable.

=item *

L<PerlIO::mmap> 0.010 has been added to the Perl core.

The C<mmap> PerlIO layer is no longer implemented by perl itself, but has
been moved out into the new L<PerlIO::mmap> module.

=back
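
As a hedged sketch, the C<mmap> layer from the last item above is
requested in the same way as any other layer.  The example assumes a
platform on which C<mmap> is available and uses L<File::Temp> only to
create scratch data:

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Create a small scratch file to read back.
    my ($out, $path) = tempfile(UNLINK => 1);
    print {$out} "line one\nline two\n";
    close $out or die "close failed: $!";

    # Request the mmap layer by name when opening the file.
    open my $in, '<:mmap', $path or die "cannot open $path: $!";
    my @lines = <$in>;
    close $in;

    print scalar(@lines), " lines read\n";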

=head2 Updated Modules and Pragmata

This is only an overview of selected module updates.  For a complete list of
updates, run:

    $ corelist --diff 5.14.0 5.16.0

You can substitute your favorite version in place of 5.14.0, too.

=over 4

=item *

L<Archive::Extract> has been upgraded from version 0.48 to 0.58.

Includes a fix for FreeBSD to only use C<unzip> if it is located in
C</usr/local/bin>, as FreeBSD 9.0 will ship with a limited C<unzip> in
C</usr/bin>.

=item *

L<Archive::Tar> has been upgraded from version 1.76 to 1.82.

This release includes adjustments to handle files larger than 8GB
(>0777777777777 octal) and adds a feature to return the MD5SUM of files
in the archive.

=item *

L<base> has been upgraded from version 2.16 to 2.18.

C<base> no longer sets a module's C<$VERSION> to "-1" when a module it
loads does not define a C<$VERSION>.  This change has been made because
"-1" is not a valid version number under the new "lax" criteria used
internally by C<UNIVERSAL::VERSION>.  (See L<version> for more on "lax"
version criteria.)

C<base> no longer internally skips loading modules it has already loaded
and instead relies on C<require> to inspect C<%INC>.  This fixes a bug
when C<base> is used with code that clears C<%INC> to force a module to
be reloaded.

=item *

L<Carp> has been upgraded from version 1.20 to 1.26.

It now includes last read filehandle info and puts a dot after the file
and line number, just like errors from C<die> [perl #106538].

=item *

L<charnames> has been updated from version 1.18 to 1.30.

C<charnames> can now be invoked with a new option, C<:loose>,
which is like the existing C<:full> option, but enables Unicode loose
name matching.  Details are in L<charnames/LOOSE MATCHES>.
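
A minimal sketch of the C<:loose> option, assuming that differences in
case are among the variations that loose matching tolerates:

    use strict;
    use warnings;
    use charnames ':loose';

    # The official spelling and a lower-case spelling of one name.
    my $exact = "\N{LATIN SMALL LETTER A}";
    my $loose = "\N{latin small letter a}";

    print $exact eq $loose ? "same\n" : "different\n";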

=item *

L<B::Deparse> has been upgraded from version 1.03 to 1.14.  This fixes
numerous deparsing bugs.

=item *

L<CGI> has been upgraded from version 3.52 to 3.59.

It uses the public and documented FCGI.pm API in CGI::Fast.  CGI::Fast was
using an FCGI API that was deprecated and removed from documentation
more than ten years ago.  Usage of this deprecated API with FCGI E<gt>=
0.70 and FCGI E<lt>= 0.73 introduces a security issue.
L<https://rt.cpan.org/Public/Bug/Display.html?id=68380>
L<http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2011-2766>

Things that may break your code:

C<url()> was fixed to return C<PATH_INFO> when it is explicitly requested
with either the C<path=E<gt>1> or C<path_info=E<gt>1> flag.

If your code is running under mod_rewrite (or compatible) and you are
calling C<self_url()> or you are calling C<url()> and passing
C<path_info=E<gt>1>, these methods will actually be returning
C<PATH_INFO> now, as you have explicitly requested or C<self_url()>
has requested on your behalf.

The C<PATH_INFO> has been omitted in such URLs since the issue was
introduced in the 3.12 release in December, 2005.

This bug is so old that your application may have come to depend on it or
work around it.  Check your application before upgrading to this release.

Examples of affected method calls:

  $q->url(-absolute => 1, -query => 1, -path_info => 1);
  $q->url(-path=>1);
  $q->url(-full=>1,-path=>1);
  $q->url(-rewrite=>1,-path=>1);
  $q->self_url();

We no longer read from STDIN when the Content-Length is not set,
preventing requests with no Content-Length from sometimes freezing.
This is consistent with the CGI RFC 3875, and is also consistent with
CGI::Simple.  However, the old behavior may have been expected by some
command-line uses of CGI.pm.

In addition, the DELETE HTTP verb is now supported.

=item *

L<Compress::Zlib> has been upgraded from version 2.035 to 2.048.

IO::Compress::Zip and IO::Uncompress::Unzip now have support for LZMA
(method 14).  There is a fix for a CRC issue in IO::Uncompress::Unzip, and
it now supports Streamed Stored content.  A Zip64 issue in
IO::Compress::Zip when the content size was exactly 0xFFFFFFFF has also
been fixed.

=item *

L<Digest::SHA> has been upgraded from version 5.61 to 5.71.

Added BITS mode to the addfile method and shasum.  This makes
partial-byte inputs possible via files/STDIN and lets shasum check
all 8074 NIST Msg vectors, where previously special programming was
required to do this.

=item *

L<Encode> has been upgraded from version 2.42 to 2.44.

Missing aliases added, a deep recursion error fixed and various
documentation updates.

Addressed 'decode_xs n-byte heap-overflow' security bug in Unicode.xs
(CVE-2011-2939). (5.14.2)

=item *

L<ExtUtils::CBuilder> has been upgraded from version 0.280203 to 0.280206.

The new version appends CFLAGS and LDFLAGS to their Config.pm
counterparts.

=item *

L<ExtUtils::ParseXS> has been upgraded from version 2.2210 to 3.16.

Much of L<ExtUtils::ParseXS>, the module behind the XS compiler C<xsubpp>,
was rewritten and cleaned up.  It has been made somewhat more extensible
and now finally uses strictures.

The typemap logic has been moved into a separate module,
L<ExtUtils::Typemaps>.  See L</New Modules and Pragmata>, above.

For a complete set of changes, please see the ExtUtils::ParseXS
changelog, available on the CPAN.

=item *

L<File::Glob> has been upgraded from version 1.12 to 1.17.

On Windows, tilde (~) expansion now checks the C<USERPROFILE> environment
variable, after checking C<HOME>.

It has a new C<:bsd_glob> export tag, intended to replace C<:glob>.  Like
C<:glob> it overrides C<glob> with a function that does not split the glob
pattern into words, but, unlike C<:glob>, it iterates properly in scalar
context, instead of returning the last file.

There are other changes affecting Perl's own C<glob> operator (which uses
File::Glob internally, except on VMS).  See L</Performance Enhancements>
and L</Selected Bug Fixes>.
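
The following sketch shows both properties of C<:bsd_glob>.  The scratch
directory and file name are assumptions made for the example:

    use strict;
    use warnings;
    use File::Glob ':bsd_glob';
    use File::Temp qw(tempdir);

    # Scratch directory holding one file whose name has a space.
    my $dir = tempdir(CLEANUP => 1);
    open my $fh, '>', "$dir/my notes.txt"
        or die "cannot create file: $!";
    close $fh;

    # With :bsd_glob in effect the pattern is not split on
    # whitespace, so "my *.txt" is treated as a single pattern.
    my @found = glob("$dir/my *.txt");

    # In scalar context, glob iterates over the matches one by one.
    my $count = 0;
    while (defined(my $file = glob("$dir/*.txt"))) {
        $count++;
    }

    print scalar(@found), " match, $count iteration\n";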

=item *

L<FindBin> has been upgraded from version 1.50 to 1.51.

It no longer returns a wrong result if a script of the same name as the
current one exists in the path and is executable.

=item *

L<HTTP::Tiny> has been upgraded from version 0.012 to 0.017.

Added support for using C<$ENV{http_proxy}> to set the default proxy host.

Adds additional shorthand methods for all common HTTP verbs,
a C<post_form()> method for POST-ing x-www-form-urlencoded data and
a C<www_form_urlencode()> utility method.
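
For instance, C<www_form_urlencode()> can be tried without any network
access.  The form fields below are made up for the example:

    use strict;
    use warnings;
    use HTTP::Tiny;

    my $http = HTTP::Tiny->new;

    # Encode hypothetical form fields; no request is sent.
    my $body = $http->www_form_urlencode(
        { q => 'perl pod', lang => 'en' }
    );

    print "$body\n";

The same hash reference could be handed to C<post_form()> together with
a URL to send it as the body of a POST request.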

=item *

L<IO> has been upgraded from version 1.25_04 to 1.25_06, and L<IO::Handle>
from version 1.31 to 1.33.

Together, these upgrades fix a problem with IO::Handle's C<getline> and
C<getlines> methods.  When these methods are called on the special ARGV
handle, the next file is automatically opened, as happens with the built-in
C<E<lt>E<gt>> and C<readline> functions.  But, unlike the built-ins, these
methods were not respecting the caller's use of the L<open> pragma and
applying the appropriate I/O layers to the newly-opened file
[rt.cpan.org #66474].

=item *

L<IPC::Cmd> has been upgraded from version 0.70 to 0.76.

Capturing of command output (both C<STDOUT> and C<STDERR>) is now supported
using L<IPC::Open3> on MSWin32 without requiring L<IPC::Run>.

=item *

L<IPC::Open3> has been upgraded from version 1.09 to 1.12.

Fixes a bug which prevented use of C<open3> on Windows when C<*STDIN>,
C<*STDOUT> or C<*STDERR> had been localized.

Fixes a bug which prevented duplicating numeric file descriptors on Windows.

C<open3> with "-" for the program name works once more.  This was broken in
version 1.06 (and hence in Perl 5.14.0) [perl #95748].

=item *

L<Locale::Codes> has been upgraded from version 3.16 to 3.21.

Added Language Extension codes (langext) and Language Variation codes (langvar)
as defined in the IANA language registry.

Added language codes from ISO 639-5.

Added language/script codes from the IANA language subtag registry.

Fixed an uninitialized value warning [rt.cpan.org #67438].

Fixed the return value for the all_XXX_codes and all_XXX_names functions
[rt.cpan.org #69100].

Reorganized modules to move Locale::MODULE to Locale::Codes::MODULE to allow
for cleaner future additions.  The original four modules (Locale::Language,
Locale::Currency, Locale::Country, Locale::Script) will continue to work, but
all new sets of codes will be added in the Locale::Codes namespace.

The code2XXX, XXX2code, all_XXX_codes, and all_XXX_names functions now
support retired codes.  All codesets may be specified by a constant or
by their name now.  Previously, they were specified only by a constant.

The alias_code function exists for backward compatibility.  It has been
replaced by rename_country_code.  The alias_code function will be
removed some time after September, 2013.

All work is now done in the central module (Locale::Codes).  Previously,
some was still done in the wrapper modules (Locale::Codes::*).  Added
Language Family codes (langfam) as defined in ISO 639-5.

=item *

L<Math::BigFloat> has been upgraded from version 1.993 to 1.997.

The C<numify> method has been corrected to return a normalized Perl number
(the result of C<0 + $thing>), instead of a string [rt.cpan.org #66732].

=item *

L<Math::BigInt> has been upgraded from version 1.994 to 1.998.

It provides a new C<bsgn> method that complements the C<babs> method.

It fixes the internal C<objectify> function's handling of "foreign objects"
so they are converted to the appropriate class (Math::BigInt or
Math::BigFloat).
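
A short sketch of how C<bsgn> complements C<babs> (the number is
arbitrary):

    use strict;
    use warnings;
    use Math::BigInt;

    my $n = Math::BigInt->new('-12345678901234567890');

    # Both methods modify their invocant, so work on copies.
    my $magnitude = $n->copy->babs;    # absolute value
    my $sign      = $n->copy->bsgn;    # -1, 0 or 1

    # Magnitude times sign reconstructs the original number.
    print $magnitude * $sign == $n ? "round trip ok\n" : "mismatch\n";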

=item *

L<Math::BigRat> has been upgraded from version 0.2602 to 0.2603.

C<int()> on a Math::BigRat object containing -1/2 now creates a
Math::BigInt containing 0, rather than -0.  L<Math::BigInt> does not even
support negative zero, so the resulting object was actually malformed
[perl #95530].

=item *

L<Math::Complex> has been upgraded from version 1.56 to 1.59
and L<Math::Trig> from version 1.2 to 1.22.

Fixes include: correct copy constructor usage; fix polarwise formatting with
numeric format specifier; and a more stable C<great_circle_direction> algorithm.

=item *

L<Module::CoreList> has been upgraded from version 2.51 to 2.66.

The C<corelist> utility now understands the C<-r> option for displaying
Perl release dates and the C<--diff> option to print the set of modlib
changes between two perl distributions.

=item *

L<Module::Metadata> has been upgraded from version 1.000004 to 1.000009.

Adds C<provides> method to generate a CPAN META provides data structure
correctly; use of C<package_versions_from_directory> is discouraged.

=item *

L<ODBM_File> has been upgraded from version 1.10 to 1.12.

The XS code is now compiled with C<PERL_NO_GET_CONTEXT>, which will aid
performance under ithreads.

=item *

L<open> has been upgraded from version 1.08 to 1.10.

It no longer turns off layers on standard handles when invoked without the
":std" directive.  Similarly, when invoked I<with> the ":std" directive, it
now clears layers on STDERR before applying the new ones, and not just on
STDIN and STDOUT [perl #92728].

=item *

L<overload> has been upgraded from version 1.13 to 1.18.

C<overload::Overloaded> no longer calls C<can> on the class, but uses
another means to determine whether the object has overloading.  It was
never correct for it to call C<can>, as overloading does not respect
AUTOLOAD.  So classes that autoload methods and implement C<can> no longer
have to account for overloading [perl #40333].

A warning is now produced for invalid arguments.  See L</New Diagnostics>.

=item *

L<PerlIO::scalar> has been upgraded from version 0.11 to 0.14.

(This is the module that implements C<< open $fh, '>', \$scalar >>.)

It fixes a problem with C<< open my $fh, ">", \$scalar >> not working if
C<$scalar> is a copy-on-write scalar. (5.14.2)

It also fixes a hang that occurs with C<readline> or C<< <$fh> >> if a
typeglob has been assigned to $scalar [perl #92258].

It no longer assumes during C<seek> that $scalar is a string internally.
If it didn't crash, it was close to doing so [perl #92706].  Also, the
internal print routine no longer assumes that the position set by C<seek>
is valid, but extends the string to that position, filling the intervening
bytes (between the old length and the seek position) with nulls
[perl #78980].

Printing to an in-memory handle now works if the $scalar holds a reference,
stringifying the reference before modifying it.  References used to be
treated as empty strings.

Printing to an in-memory handle no longer crashes if the $scalar happens to
hold a number internally, but no string buffer.

Printing to an in-memory handle no longer creates scalars that confuse
the regular expression engine [perl #108398].
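
The seek-then-print behavior described above can be sketched as follows,
assuming a perl that includes this version of PerlIO::scalar; the buffer
contents are arbitrary:

    use strict;
    use warnings;

    my $buffer = 'abc';

    # Open the scalar for read/write without truncating it.
    open my $fh, '+<', \$buffer
        or die "cannot open in-memory handle: $!";

    # Seek two bytes past the current end, then write one byte.
    seek($fh, 5, 0) or die "seek failed: $!";
    print {$fh} 'x';
    close $fh;

    # The gap between the old end (3) and the seek position (5)
    # is filled with null bytes.
    printf "%d bytes, nulls in gap: %s\n", length($buffer),
        ($buffer eq "abc\0\0x" ? 'yes' : 'no');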

=item *

L<Pod::Functions> has been upgraded from version 1.04 to 1.05.

F<Functions.pm> is now generated at perl build time from annotations in
F<perlfunc.pod>.  This will ensure that L<Pod::Functions> and L<perlfunc>
remain in synchronisation.

=item *

L<Pod::Html> has been upgraded from version 1.11 to 1.1502.

This is an extensive rewrite of Pod::Html to use L<Pod::Simple> under
the hood.  The output has changed significantly.

=item *

L<Pod::Perldoc> has been upgraded from version 3.15_03 to 3.17.

It corrects the search paths on VMS [perl #90640]. (5.14.1)

The B<-v> option now fetches the right section for C<$0>.

This upgrade has numerous significant fixes.  Consult its changelog on
the CPAN for more information.

=item *

L<POSIX> has been upgraded from version 1.24 to 1.30.

L<POSIX> no longer uses L<AutoLoader>.  Any code which was relying on this
implementation detail was buggy, and may fail because of this change.
The module's Perl code has been considerably simplified, roughly halving
the number of lines, with no change in functionality.  The XS code has
been refactored to reduce the size of the shared object by about 12%,
with no change in functionality.  More POSIX functions now have tests.

C<sigsuspend> and C<pause> now run signal handlers before returning, as the
whole point of these two functions is to wait until a signal has
arrived, and then return I<after> it has been triggered.  Delayed, or
"safe", signals were preventing that from happening, possibly resulting in
race conditions [perl #107216].

C<POSIX::sleep> is now a direct call into the underlying OS C<sleep>
function, instead of being a Perl wrapper on C<CORE::sleep>.
C<POSIX::dup2> now returns the correct value on Win32 (I<i.e.>, the file
descriptor).  C<POSIX::SigSet> C<sigsuspend> and C<sigpending> and
C<POSIX::pause> now dispatch safe signals immediately before returning to
their caller.

C<POSIX::Termios::setattr> now defaults the third argument to C<TCSANOW>,
instead of 0. On most platforms C<TCSANOW> is defined to be 0, but on some
0 is not a valid parameter, which caused a call with defaults to fail.

=item *

L<Socket> has been upgraded from version 1.94 to 2.001.

It has new functions and constants for handling IPv6 sockets:

    pack_ipv6_mreq
    unpack_ipv6_mreq
    IPV6_ADD_MEMBERSHIP
    IPV6_DROP_MEMBERSHIP
    IPV6_MTU
    IPV6_MTU_DISCOVER
    IPV6_MULTICAST_HOPS
    IPV6_MULTICAST_IF
    IPV6_MULTICAST_LOOP
    IPV6_UNICAST_HOPS
    IPV6_V6ONLY
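
As an illustration, the two new functions can be used to build and
inspect the structure passed along with C<IPV6_ADD_MEMBERSHIP>.  This
sketch assumes a platform with IPv6 support and does not open a socket:

    use strict;
    use warnings;
    use Socket qw(AF_INET6 inet_pton inet_ntop
                  pack_ipv6_mreq unpack_ipv6_mreq);

    # ff02::1 is the IPv6 all-nodes link-local multicast group.
    my $group = inet_pton(AF_INET6, 'ff02::1');

    # Build the membership request; interface index 0 leaves the
    # choice of interface to the system.
    my $mreq = pack_ipv6_mreq($group, 0);

    # Round-trip the structure to inspect its fields.
    my ($addr, $ifindex) = unpack_ipv6_mreq($mreq);
    print inet_ntop(AF_INET6, $addr), " on interface $ifindex\n";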

=item *

L<Storable> has been upgraded from version 2.27 to 2.34.

It no longer turns copy-on-write scalars into read-only scalars when
freezing and thawing.

=item *

L<Sys::Syslog> has been upgraded from version 0.27 to 0.29.

This upgrade closes many outstanding bugs.

=item *

L<Term::ANSIColor> has been upgraded from version 3.00 to 3.01.

Only interpret an initial array reference as a list of colors, not any initial
reference, allowing the colored function to work properly on objects with
stringification defined.

=item *

L<Term::ReadLine> has been upgraded from version 1.07 to 1.09.

Term::ReadLine now supports any event loop, including unpublished ones and
simple L<IO::Select> loops, without the need to rewrite existing code for
any particular framework [perl #108470].

=item *

L<threads::shared> has been upgraded from version 1.37 to 1.40.

Destructors on shared objects used to be ignored sometimes if the objects
were referenced only by shared data structures.  This has been mostly
fixed, but destructors may still be ignored if the objects still exist at
global destruction time [perl #98204].

=item *

L<Unicode::Collate> has been upgraded from version 0.73 to 0.89.

Updated to CLDR 1.9.1.

Locales updated to CLDR 2.0: mk, mt, nb, nn, ro, ru, sk, sr, sv, uk,
zh__pinyin, zh__stroke.

Newly supported locales: bn, fa, ml, mr, or, pa, sa, si, si__dictionary,
sr_Latn, sv__reformed, ta, te, th, ur, wae.

Tailored compatibility ideographs as well as unified ideographs for the
locales: ja, ko, zh__big5han, zh__gb2312han, zh__pinyin, zh__stroke.

Locale/*.pl files are now searched for in @INC.

=item *

L<Unicode::Normalize> has been upgraded from version 1.10 to 1.14.

Fixes for the removal of F<unicore/CompositionExclusions.txt> from core.

=item *

L<Unicode::UCD> has been upgraded from version 0.32 to 0.43.

This adds four new functions:  C<prop_aliases()> and
C<prop_value_aliases()>, which are used to find all Unicode-approved
synonyms for property names, or to convert from one name to another;
C<prop_invlist> which returns all code points matching a given
Unicode binary property; and C<prop_invmap> which returns the complete
specification of a given Unicode property.
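
A brief sketch of the two alias functions; the property and value names
are chosen only as examples:

    use strict;
    use warnings;
    use Unicode::UCD qw(prop_aliases prop_value_aliases);

    # In list context, prop_aliases returns the short name first,
    # then the full name, then any other synonyms.
    my ($short, $full, @others) = prop_aliases('Script');

    # prop_value_aliases does the same for a value of a property.
    my ($value_short, $value_full)
        = prop_value_aliases('Script', 'Latin');

    print "$short = $full; $value_short = $value_full\n";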

=item *

L<Win32API::File> has been upgraded from version 0.1101 to 0.1200.

Added SetStdHandle and GetStdHandle functions.

=back

=head2 Removed Modules and Pragmata

As promised in Perl 5.14.0's release notes, the following modules have
been removed from the core distribution, and if needed should be installed
from CPAN instead.

=over

=item *

L<Devel::DProf> has been removed from the Perl core.  Prior version was
20110228.00.

=item *

L<Shell> has been removed from the Perl core.  Prior version was 0.72_01.

=item *

Several old perl4-style libraries that were deprecated in 5.14 have now
been removed:

    abbrev.pl assert.pl bigfloat.pl bigint.pl bigrat.pl cacheout.pl
    complete.pl ctime.pl dotsh.pl exceptions.pl fastcwd.pl flush.pl
    getcwd.pl getopt.pl getopts.pl hostname.pl importenv.pl
    lib/find{,depth}.pl look.pl newgetopt.pl open2.pl open3.pl
    pwd.pl shellwords.pl stat.pl tainted.pl termcap.pl timelocal.pl

They can be found on CPAN as L<Perl4::CoreLibs>.

=back

=head1 Documentation

=head2 New Documentation

=head3 L<perldtrace>

L<perldtrace> describes Perl's DTrace support, listing the provided probes
and giving examples of their use.

=head3 L<perlexperiment>

This document is intended to provide a list of experimental features in
Perl.  It is still a work in progress.

=head3 L<perlootut>

This is a new OO tutorial.  It focuses on basic OO concepts, and then recommends
that readers choose an OO framework from CPAN.

=head3 L<perlxstypemap>

The new manual describes the XS typemapping mechanism in unprecedented
detail and combines new documentation with information extracted from
L<perlxs> and the previously unofficial list of all core typemaps.

=head2 Changes to Existing Documentation

=head3 L<perlapi>

=over 4

=item *

The HV API has long accepted negative lengths to show that the key is
in UTF8.  This is now documented.

=item *

The C<boolSV()> macro is now documented.

=back

=head3 L<perlfunc>

=over 4

=item *

C<dbmopen> treats a 0 mode as a special case that prevents a nonexistent
file from being created.  This has been the case since Perl 5.000, but was
never documented anywhere.  Now the perlfunc entry mentions it
[perl #90064].

=item *

As an accident of history, C<open $fh, '<:', ...> applies the default
layers for the platform (C<:raw> on Unix, C<:crlf> on Windows), ignoring
whatever is declared by L<open.pm|open>.  This seems such a useful feature
it has been documented in L<perlfunc|perlfunc/open> and L<open>.

=item *

The entry for C<split> has been rewritten.  It is now far clearer than
before.

=back

=head3 L<perlguts>

=over 4

=item *

A new section, L<Autoloading with XSUBs|perlguts/Autoloading with XSUBs>,
has been added, which explains the two APIs for accessing the name of the
autoloaded sub.

=item *

Some function descriptions in L<perlguts> were confusing, as it was
not clear whether they referred to the function above or below the
description.  This has been clarified [perl #91790].

=back

=head3 L<perlobj>

=over 4

=item *

This document has been rewritten from scratch, and its coverage of various OO
concepts has been expanded.

=back

=head3 L<perlop>

=over 4

=item *

Documentation of the smartmatch operator has been reworked and moved from
perlsyn to perlop where it belongs.

It has also been corrected for the case of C<undef> on the left-hand
side.  The list of different smart match behaviors had an item in the
wrong place.

=item *

Documentation of the ellipsis statement (C<...>) has been reworked and
moved from perlop to perlsyn.

=item *

The explanation of bitwise operators has been expanded to explain how they
work on Unicode strings (5.14.1).

=item *

More examples for C<m//g> have been added (5.14.1).

=item *

The C<<< <<\FOO >>> here-doc syntax has been documented (5.14.1).

=back

=head3 L<perlpragma>

=over 4

=item *

There is now a standard convention for naming keys in C<%^H>,
documented under L<Key naming|perlpragma/Key naming>.

=back

=head3 L<perlsec/Laundering and Detecting Tainted Data>

=over 4

=item *

The example function for checking for taintedness contained a subtle
error.  C<$@> needs to be localized to prevent its changing this
global's value outside the function.  The preferred method to check for
this remains L<Scalar::Util/tainted>.

=back

=head3 L<perllol>

=over

=item *

L<perllol> has been expanded with examples using the new C<push $scalar>
syntax introduced in Perl 5.14.0 (5.14.1).

=back

=head3 L<perlmod>

=over

=item *

L<perlmod> now states explicitly that some types of explicit symbol table
manipulation are not supported.  This codifies what was effectively already
the case [perl #78074].

=back

=head3 L<perlpodstyle>

=over 4

=item *

The tips on which formatting codes to use have been corrected and greatly
expanded.

=item *

There are now a couple of example one-liners for previewing POD files after
they have been edited.

=back

=head3 L<perlre>

=over

=item *

The C<(*COMMIT)> directive is now listed in the right section
(L<Verbs without an argument|perlre/Verbs without an argument>).

=back

=head3 L<perlrun>

=over

=item *

L<perlrun> has undergone a significant clean-up.  Most notably, the
B<-0x...> form of the B<-0> flag has been clarified, and the final section
on environment variables has been corrected and expanded (5.14.1).

=back

=head3 L<perlsub>

=over

=item *

The ($;) prototype syntax, which has existed for rather a long time, is now
documented in L<perlsub>.  It lets a unary function have the same
precedence as a list operator.

=back
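
The precedence difference can be seen in a small sketch.  The two
subroutine names below are made up for the illustration; both subs double
their single argument and differ only in their prototype.

    use strict;
    use warnings;

    sub unary_dbl($)   { $_[0] * 2 }    # parsed like a named unary operator
    sub listop_dbl($;) { $_[0] * 2 }    # parsed like a list operator

    my $unary  = unary_dbl 2 < 5;     # groups as unary_dbl(2) < 5
    my $listop = listop_dbl 2 < 5;    # groups as listop_dbl(2 < 5)

With the C<($)> prototype the comparison is applied to the sub's result,
whereas with C<($;)> the whole comparison becomes the argument.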

=head3 L<perltie>

=over

=item *

The required syntax for tying handles has been documented.

=back

=head3 L<perlvar>

=over

=item *

The documentation for L<$!|perlvar/$!> has been corrected and clarified.
It used to state that $! could be C<undef>, which is not the case.  It was
also unclear whether system calls set C's C<errno> or Perl's C<$!>
[perl #91614].

=item *

Documentation for L<$$|perlvar/$$> has been amended with additional
cautions regarding changing the process ID.

=back

=head3 Other Changes

=over 4

=item *

L<perlxs> was extended with documentation on inline typemaps.

=item *

L<perlref> has a new L<Circular References|perlref/Circular References>
section explaining how circularities may not be freed and how to solve that
with weak references.

=item *

Parts of L<perlapi> were clarified, and Perl equivalents of some C
functions have been added as an additional mode of exposition.

=item *

A few parts of L<perlre> and L<perlrecharclass> were clarified.

=back
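
The circular-reference problem and its weak-reference remedy mentioned
above can be sketched as follows.  The C<Node> class and the counter are
hypothetical and exist only to make the freeing observable; C<weaken>
comes from L<Scalar::Util>.

    use strict;
    use warnings;
    use Scalar::Util qw(weaken);

    my $destroyed = 0;

    {
        package Node;
        sub new     { return bless {}, shift }
        sub DESTROY { $destroyed++ }
    }

    {
        my $parent = Node->new;
        my $child  = Node->new;
        $parent->{child} = $child;
        $child->{parent} = $parent;    # cycle: parent <-> child
        weaken($child->{parent});      # the back-reference no longer counts
    }
    # Both objects have been destroyed by the time the block is left.

If the C<weaken> call is left out, neither object is freed at the end of
the block, because each keeps the other's reference count above zero.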

=head2 Removed Documentation

=head3 Old OO Documentation

The old OO tutorials, perltoot, perltooc, and perlboot, have been
removed.  The perlbot (bag of object tricks) document has been removed
as well.

=head3 Development Deltas

The perldelta files for development releases are no longer packaged with
perl.  These can still be found in the perl source code repository.

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 New Diagnostics

=head3 New Errors

=over 4

=item *

L<Cannot set tied @DB::args|perldiag/"Cannot set tied @DB::args">

This error occurs when C<caller> tries to set C<@DB::args> but finds it
tied.  Before this error was added, it used to crash instead.

=item *

L<Cannot tie unreifiable array|perldiag/"Cannot tie unreifiable array">

This error is part of a safety check that the C<tie> operator does before
tying a special array like C<@_>.  You should never see this message.

=item *

L<&CORE::%s cannot be called directly|perldiag/"&CORE::%s cannot be called directly">

This occurs when a subroutine in the C<CORE::> namespace is called
with C<&foo> syntax or through a reference.  Some subroutines
in this package cannot yet be called that way, but must be
called as barewords.  See L</Subroutines in the C<CORE> namespace>, above.

=item *

L<Source filters apply only to byte streams|perldiag/"Source filters apply only to byte streams">

This new error occurs when you try to activate a source filter (usually by
loading a source filter module) within a string passed to C<eval> under the
C<unicode_eval> feature.

=back

=head3 New Warnings

=over 4

=item *

L<defined(@array) is deprecated|perldiag/"defined(@array) is deprecated">

The long-deprecated C<defined(@array)> now also warns for package variables.
Previously it issued a warning for lexical variables only.

=item *

L<length() used on %s|perldiag/length() used on %s>

This new warning occurs when C<length> is used on an array or hash, instead
of C<scalar(@array)> or C<scalar(keys %hash)>.

=item *

L<lvalue attribute %s already-defined subroutine|perldiag/"lvalue attribute %s already-defined subroutine">

L<attributes.pm|attributes> now emits this warning when the :lvalue
attribute is applied to a Perl subroutine that has already been defined, as
doing so can have unexpected side-effects.

=item *

L<overload arg '%s' is invalid|perldiag/"overload arg '%s' is invalid">

This warning, in the "overload" category, is produced when the overload
pragma is given an argument it doesn't recognize, presumably a mistyped
operator.

=item *

L<$[ used in %s (did you mean $] ?)|perldiag/"$[ used in %s (did you mean $] ?)">

This new warning exists to catch the mistaken use of C<$[> in version
checks.  C<$]>, not C<$[>, contains the version number.

=item *

L<Useless assignment to a temporary|perldiag/"Useless assignment to a temporary">

Assigning to a temporary scalar returned
from an lvalue subroutine now produces this
warning [perl #31946].

=item *

L<Useless use of \E|perldiag/"Useless use of \E">

C<\E> does nothing unless preceded by C<\Q>, C<\L> or C<\U>.

=back
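
Two of the new warnings above steer code toward a different idiom.  A
minimal sketch of the forms they suggest follows; the variable names are
illustrative only.

    use strict;
    use warnings;

    my @queue = qw(a b c);
    my %seen  = (x => 1, y => 2);

    my $items = scalar(@queue);        # element count, not length(@queue)
    my $keys  = scalar(keys %seen);    # key count, not length(%seen)

    # $] holds the interpreter version as a number such as 5.016000.
    my $modern = $] >= 5.016 ? 1 : 0;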

=head2 Removed Errors

=over

=item *

"sort is now a reserved word"

This error used to occur when C<sort> was called without arguments,
followed by C<;> or C<)>.  (E.g., C<sort;> would die, but C<{sort}> was
OK.)  This error message was added in Perl 3 to catch code like
C<close(sort)> which would no longer work.  More than two decades later,
this message is no longer appropriate.  Now C<sort> without arguments is
always allowed, and returns an empty list, as it did in those cases
where it was already allowed [perl #90030].

=back
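
A short sketch of the behavior after this change (the variable names are
illustrative):

    use strict;
    use warnings;

    my @from_parens = (sort);         # this form used to die
    my @from_block  = do { sort };    # this form was already allowed

Both arrays end up empty.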

=head2 Changes to Existing Diagnostics

=over 4

=item *

The "Applying pattern match..." or similar warning produced when an
array or hash is on the left-hand side of the C<=~> operator now
mentions the name of the variable.

=item *

The "Attempt to free non-existent shared string" warning has had the spelling
of "non-existent" corrected to "nonexistent".  It was already listed
with the correct spelling in L<perldiag>.

=item *

The error messages for using C<default> and C<when> outside a
topicalizer have been standardized to match the messages for C<continue>
and loop controls.  They now read 'Can't "default" outside a
topicalizer' and 'Can't "when" outside a topicalizer'.  They both used
to be 'Can't use when() outside a topicalizer' [perl #91514].

=item *

The message, "Code point 0x%X is not Unicode, no properties match it;
all inverse properties do" has been changed to "Code point 0x%X is not
Unicode, all \p{} matches fail; all \P{} matches succeed".

=item *

Redefinition warnings for constant subroutines used to be mandatory,
even occurring under C<no warnings>.  Now they respect the L<warnings>
pragma.

=item *

The "glob failed" warning message is now suppressible via C<no warnings>
[perl #111656].

=item *

The L<Invalid version format|perldiag/"Invalid version format (%s)">
error message now says "negative version number" within the parentheses,
rather than "non-numeric data", for negative numbers.

=item *

The two warnings
L<Possible attempt to put comments in qw() list|perldiag/"Possible attempt to put comments in qw() list">
and
L<Possible attempt to separate words with commas|perldiag/"Possible attempt to separate words with commas">
are no longer mutually exclusive: the same C<qw> construct may produce
both.

=item *

The uninitialized warning for C<y///r> when C<$_> is implicit and
undefined now mentions the variable name, just like the non-/r variation
of the operator.

=item *

The 'Use of "foo" without parentheses is ambiguous' warning has been
extended to apply also to user-defined subroutines with a (;$)
prototype, and not just to built-in functions.

=item *

Warnings that mention the names of lexical (C<my>) variables with
Unicode characters in them now respect the presence or absence of the
C<:utf8> layer on the output handle, instead of outputting UTF8
regardless.  Also, the correct names are included in the strings passed
to C<$SIG{__WARN__}> handlers, rather than the raw UTF8 bytes.

=back

=head1 Utility Changes

=head3 L<h2ph>

=over 4

=item *

L<h2ph> used to generate code of the form

  unless(defined(&FOO)) {
    sub FOO () {42;}
  }

But the subroutine is a compile-time declaration, and is hence unaffected
by the condition.  It has now been corrected to emit a string C<eval>
around the subroutine [perl #99368].

=back
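
The underlying issue can be reproduced outside of L<h2ph>.  The sketch
below uses made-up names (C<declared_anyway>, C<BAR>, C<BAZ>) and is not
the code that L<h2ph> emits; it only shows why a string C<eval> makes the
guard effective.

    use strict;
    use warnings;

    # A named sub is installed at compile time, even in a dead branch.
    if (0) {
        sub declared_anyway { return 'compiled' }
    }
    my $leaked = defined(&declared_anyway) ? 1 : 0;

    sub BAZ () { 7 }    # stands in for a constant defined earlier

    # Deferring the definition to run time lets the condition decide.
    unless (defined(&BAR)) {
        eval 'sub BAR () { 42 } 1' or die $@;
    }
    unless (defined(&BAZ)) {
        eval 'sub BAZ () { 42 } 1' or die $@;    # skipped: BAZ exists
    }
    my ($bar, $baz) = (BAR(), BAZ());

Here C<BAR> is defined by the C<eval> because it did not exist yet, while
the existing C<BAZ> is left alone.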

=head3 L<splain>

=over 4

=item *

F<splain> no longer emits backtraces with the first line number repeated.

This:

    Uncaught exception from user code:
            Cannot fwiddle the fwuddle at -e line 1.
     at -e line 1
            main::baz() called at -e line 1
            main::bar() called at -e line 1
            main::foo() called at -e line 1

has become this:

    Uncaught exception from user code:
            Cannot fwiddle the fwuddle at -e line 1.
            main::baz() called at -e line 1
            main::bar() called at -e line 1
            main::foo() called at -e line 1

=item *

Some error messages consist of multiple lines that are listed as separate
entries in L<perldiag>.  splain has been taught to find the separate
entries in these cases, instead of simply failing to find the message.

=back

=head3 L<zipdetails>

=over 4

=item *

This is a new utility, included as part of an
L<IO::Compress::Base> upgrade.

L<zipdetails> displays information about the internal record structure
of the zip file.  It is not concerned with displaying any details of
the compressed data stored in the zip file.

=back

=head1 Configuration and Compilation

=over 4

=item *

F<regexp.h> has been modified for compatibility with GCC's B<-Werror>
option, as used by some projects that include perl's header files (5.14.1).

=item *

C<USE_LOCALE{,_COLLATE,_CTYPE,_NUMERIC}> have been added to the output of
perl -V, as they affect the behavior of the interpreter binary (albeit
in only a small area).

=item *

The code and tests for L<IPC::Open2> have been moved from F<ext/IPC-Open2>
into F<ext/IPC-Open3>, as C<IPC::Open2::open2()> is implemented as a thin
wrapper around C<IPC::Open3::_open3()>, and hence is very tightly coupled to
it.

=item *

The magic types and magic vtables are now generated from data in a new script
F<regen/mg_vtable.pl>, instead of being maintained by hand.  As different
EBCDIC variants can't agree on the code point for '~', the character to code
point conversion is done at build time by F<generate_uudmap> to a new generated
header F<mg_data.h>.  C<PL_vtbl_bm> and C<PL_vtbl_fm> are now defined by the
pre-processor as C<PL_vtbl_regexp>, instead of being distinct C variables.
C<PL_vtbl_sig> has been removed.

=item *

Building with C<-DPERL_GLOBAL_STRUCT> works again.  This configuration is not
generally used.

=item *

Perl configured with I<MAD> now correctly frees C<MADPROP> structures when
OPs are freed.  C<MADPROP>s are now allocated with C<PerlMemShared_malloc()>.

=item *

F<makedef.pl> has been refactored.  This should have no noticeable effect on
any of the platforms that use it as part of their build (AIX, VMS, Win32).

=item *

C<useperlio> can no longer be disabled.

=item *

The file F<global.sym> is no longer needed, and has been removed.  It
contained a list of all exported functions, one of the files generated by
F<regen/embed.pl> from data in F<embed.fnc> and F<regen/opcodes>.  The code
has been refactored so that the only user of F<global.sym>, F<makedef.pl>,
now reads F<embed.fnc> and F<regen/opcodes> directly, removing the need to
store the list of exported functions in an intermediate file.

As F<global.sym> was never installed, this change should not be visible
outside the build process.

=item *

F<pod/buildtoc>, used by the build process to build L<perltoc>, has been
refactored and simplified.  It now contains only code to build L<perltoc>;
the code to regenerate Makefiles has been moved to F<Porting/pod_rules.pl>.
It's a bug if this change has any material effect on the build process.

=item *

F<pod/roffitall> is now built by F<pod/buildtoc>, instead of being
shipped with the distribution.  Its list of manpages is now generated
(and therefore current).  See also RT #103202 for an unresolved related
issue.

=item *

The man page for C<XS::Typemap> is no longer installed.  C<XS::Typemap>
is a test module which is not installed, hence installing its
documentation makes no sense.

=item *

The -Dusesitecustomize and -Duserelocatableinc options now work
together properly.

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=head3 Cygwin

=over 4

=item *

Since version 1.7, Cygwin supports native UTF-8 paths.  If Perl is built
under that environment, directory and filenames will be UTF-8 encoded.

=item *

Cygwin does not initialize all original Win32 environment variables.  See
F<README.cygwin> for a discussion of the newly-added
C<Cygwin::sync_winenv()> function [perl #110190] and for
further links.

=back

=head3 HP-UX

=over 4

=item *

HP-UX PA-RISC/64 now supports gcc-4.x.

A fix to correct the socketsize now makes the test suite pass on HP-UX
PA-RISC for 64bitall builds. (5.14.2)

=back

=head3 VMS

=over 4

=item *

Unnecessary includes have been removed, miscellaneous compiler warnings
fixed, and some unclosed comments closed in F<vms/vms.c>.

=item *

The sockadapt layer has been removed from the VMS build.

=item *

Explicit support for VMS versions before v7.0 and DEC C versions
before v6.0 has been removed.

=item *

Since Perl 5.10.1, the home-grown C<stat> wrapper has been unable to
distinguish between a directory name containing an underscore and an
otherwise-identical filename containing a dot in the same position
(e.g., t/test_pl as a directory and t/test.pl as a file).  This problem
has been corrected.

=item *

The build on VMS now permits names of the resulting symbols in C code for
Perl longer than 31 characters.  Symbols like
C<Perl__it_was_the_best_of_times_it_was_the_worst_of_times> can now be
created freely without causing the VMS linker to seize up.

=back

=head3 GNU/Hurd

=over 4

=item *

Numerous build and test failures on GNU/Hurd have been resolved with hints
for building DBM modules, detection of the library search path, and enabling
of large file support.

=back

=head3 OpenVOS

=over 4

=item *

Perl is now built with dynamic linking on OpenVOS, the minimum supported
version of which is now Release 17.1.0.

=back

=head3 SunOS

The CC workshop C++ compiler is now detected and used on systems that ship
without cc.

=head1 Internal Changes

=over 4

=item *

The compiled representation of formats is now stored via the C<mg_ptr> of
their C<PERL_MAGIC_fm>.  Previously it was stored in the string buffer,
beyond C<SvLEN()>, the regular end of the string.  C<SvCOMPILED()> and
C<SvCOMPILED_{on,off}()> now exist solely for compatibility with XS code.
The first is always 0; the other two are now no-ops. (5.14.1)

=item *

Some global variables have been marked C<const>, members in the interpreter
structure have been re-ordered, and the opcodes have been re-ordered.  The
op C<OP_AELEMFAST> has been split into C<OP_AELEMFAST> and C<OP_AELEMFAST_LEX>.

=item *

When emptying a hash of its elements (e.g., via undef(%h), or %h=()), the
C<HvARRAY> field is no longer temporarily zeroed.  Any destructors called on
the freed elements see the remaining elements.  Thus, %h=() becomes more like
C<delete $h{$_} for keys %h>.

=item *

Boyer-Moore compiled scalars are now PVMGs, and the Boyer-Moore tables are now
stored via the mg_ptr of their C<PERL_MAGIC_bm>.
Previously they were PVGVs, with the tables stored in
the string buffer, beyond C<SvLEN()>.  This eliminates
the last place where the core stores data beyond C<SvLEN()>.

=item *

Simplified logic in C<Perl_sv_magic()> introduces a small change of
behavior for error cases involving unknown magic types.  Previously, if
C<Perl_sv_magic()> was passed a magic type unknown to it, it would

=over

=item 1.

Croak "Modification of a read-only value attempted" if read only

=item 2.

Return without error if the SV happened to already have this magic

=item 3.

otherwise croak "Don't know how to handle magic of type \\%o"

=back

Now it will always croak "Don't know how to handle magic of type \\%o", even
on read-only values, or SVs which already have the unknown magic type.

=item *

The experimental C<fetch_cop_label> function has been renamed to
C<cop_fetch_label>.

=item *

The C<cop_store_label> function has been added to the API, but is
experimental.

=item *

F<embedvar.h> has been simplified, and one level of macro indirection for
PL_* variables has been removed for the default (non-multiplicity)
configuration.  PERLVAR*() macros now directly expand their arguments to
tokens such as C<PL_defgv>, instead of expanding to C<PL_Idefgv>, with
F<embedvar.h> defining a macro to map C<PL_Idefgv> to C<PL_defgv>.  XS code
which has unwarranted chumminess with the implementation may need updating.

=item *

An API has been added to explicitly choose whether to export XSUB
symbols.  More detail can be found in the comments for commit e64345f8.

=item *

The C<is_gv_magical_sv> function has been eliminated and merged with
C<gv_fetchpvn_flags>.  It used to be called to determine whether a GV
should be autovivified in rvalue context.  Now it has been replaced with a
new C<GV_ADDMG> flag (not part of the API).

=item *

The returned code point from the function C<utf8n_to_uvuni()>
when the input is malformed UTF-8, malformations are allowed, and
C<utf8> warnings are off is now the Unicode REPLACEMENT CHARACTER
whenever the malformation is such that no well-defined code point can be
computed.  Previously the returned value was essentially garbage.  The
only malformations that have well-defined values are a zero-length
string (0 is the return), and overlong UTF-8 sequences.

=item *

Padlists are now marked C<AvREAL>; i.e., reference-counted.  They have
always been reference-counted, but were not marked real, because F<pad.c>
did its own clean-up, instead of using the usual clean-up code in F<sv.c>.
That caused problems in thread cloning, so now the C<AvREAL> flag is on,
but is turned off in F<pad.c> right before the padlist is freed (after
F<pad.c> has done its custom freeing of the pads).

=item *

All C files that make up the Perl core have been converted to UTF-8.

=item *

These new functions have been added as part of the work on Unicode symbols:

    HvNAMELEN
    HvNAMEUTF8
    HvENAMELEN
    HvENAMEUTF8
    gv_init_pv
    gv_init_pvn
    gv_init_pvsv
    gv_fetchmeth_pv
    gv_fetchmeth_pvn
    gv_fetchmeth_sv
    gv_fetchmeth_pv_autoload
    gv_fetchmeth_pvn_autoload
    gv_fetchmeth_sv_autoload
    gv_fetchmethod_pv_flags
    gv_fetchmethod_pvn_flags
    gv_fetchmethod_sv_flags
    gv_autoload_pv
    gv_autoload_pvn
    gv_autoload_sv
    newGVgen_flags
    sv_derived_from_pv
    sv_derived_from_pvn
    sv_derived_from_sv
    sv_does_pv
    sv_does_pvn
    sv_does_sv
    whichsig_pv
    whichsig_pvn
    whichsig_sv
    newCONSTSUB_flags

The gv_fetchmethod_*_flags functions, like gv_fetchmethod_flags, are
experimental and may change in a future release.

=item *

The following functions were added.  These are I<not> part of the API:

    GvNAMEUTF8
    GvENAMELEN
    GvENAME_HEK
    CopSTASH_flags
    CopSTASH_flags_set
    PmopSTASH_flags
    PmopSTASH_flags_set
    sv_sethek
    HEKfARG

There is also a C<HEKf> macro corresponding to C<SVf>, for
interpolating HEKs in formatted strings.

=item *

C<sv_catpvn_flags> takes a couple of new internal-only flags,
C<SV_CATBYTES> and C<SV_CATUTF8>, which tell it whether the char array to
be concatenated is UTF8.  This allows for more efficient concatenation than
creating temporary SVs to pass to C<sv_catsv>.

=item *

For XS AUTOLOAD subs, $AUTOLOAD is set once more, as it was in 5.6.0.  This
is in addition to setting C<SvPVX(cv)>, for compatibility with 5.8 to 5.14.
See L<perlguts/Autoloading with XSUBs>.

=item *

Perl now checks whether the array (the linearized isa) returned by a MRO
plugin begins with the name of the class itself, for which the array was
created, instead of assuming that it does.  This prevents the first element
from being skipped during method lookup.  It also means that
C<mro::get_linear_isa> may return an array with one more element than the
MRO plugin provided [perl #94306].

=item *

C<PL_curstash> is now reference-counted.

=item *

There are now feature bundle hints in C<PL_hints> (C<$^H>) that version
declarations use, to avoid having to load F<feature.pm>.  One setting of
the hint bits indicates a "custom" feature bundle, which means that the
entries in C<%^H> still apply.  F<feature.pm> uses that.

The C<HINT_FEATURE_MASK> macro is defined in F<perl.h> along with other
hints.  Other macros for setting and testing features and bundles are in
the new F<feature.h>.  C<FEATURE_IS_ENABLED> (which has moved to
F<feature.h>) is no longer used throughout the codebase; more specific
macros defined in F<feature.h>, e.g., C<FEATURE_SAY_IS_ENABLED>, are used
instead.

=item *

F<lib/feature.pm> is now a generated file, created by the new
F<regen/feature.pl> script, which also generates F<feature.h>.

=item *

Tied arrays are now always C<AvREAL>.  If C<@_> or C<DB::args> is tied, it
is reified first, to make sure this is always the case.

=item *

Two new functions C<utf8_to_uvchr_buf()> and C<utf8_to_uvuni_buf()> have
been added.  These are the same as C<utf8_to_uvchr> and
C<utf8_to_uvuni> (which are now deprecated), but take an extra parameter
that is used to guard against reading beyond the end of the input
string.
See L<perlapi/utf8_to_uvchr_buf> and L<perlapi/utf8_to_uvuni_buf>.

=item *

The regular expression engine now does TRIE case insensitive matches
under Unicode. This may change the output of C<< use re 'debug'; >>,
and will speed up various things.

=item *

There is a new C<wrap_op_checker()> function, which provides a thread-safe
alternative to writing to C<PL_check> directly.

=back

=head1 Selected Bug Fixes

=head2 Array and hash

=over

=item *

A bug has been fixed that would cause a "Use of freed value in iteration"
error if the next two hash elements that would be iterated over are
deleted [perl #85026]. (5.14.1)

=item *

Deleting the current hash iterator (the hash element that would be returned
by the next call to C<each>) in void context used not to free it
[perl #85026].

=item *

Deletion of methods via C<delete $Class::{method}> syntax used to update
method caches if called in void context, but not scalar or list context.

=item *

When hash elements are deleted in void context, the internal hash entry is
now freed before the value is freed, to prevent destructors called by that
latter freeing from seeing the hash in an inconsistent state.  It was
possible to cause double-frees if the destructor freed the hash itself
[perl #100340].

=item *

A C<keys> optimization in Perl 5.12.0 to make it faster on empty hashes
caused C<each> not to reset the iterator if called after the last element
was deleted.

=item *

Freeing deeply nested hashes no longer crashes [perl #44225].

=item *

It is possible from XS code to create hashes with elements that have no
values.  The hash element and slice operators used to crash
when handling these in lvalue context.  They now
produce a "Modification of non-creatable hash value attempted" error
message.

=item *

If list assignment to a hash or array triggered destructors that freed the
hash or array itself, a crash would ensue.  This is no longer the case
[perl #107440].

=item *

It used to be possible to free the typeglob of a localized array or hash
(e.g., C<local @{"x"}; delete $::{x}>), resulting in a crash on scope exit.

=item *

Some core bugs affecting L<Hash::Util> have been fixed: locking a hash
element that is a glob copy no longer causes the next assignment to it to
corrupt the glob (5.14.2), and unlocking a hash element that holds a
copy-on-write scalar no longer causes modifications to that scalar to
modify other scalars that were sharing the same string buffer.

=back

=head2 C API fixes

=over

=item *

The C<newHVhv> XS function now works on tied hashes, instead of crashing or
returning an empty hash.

=item *

The C<SvIsCOW> C macro now returns false for read-only copies of typeglobs,
such as those created by:

  $hash{elem} = *foo;
  Hash::Util::lock_value %hash, 'elem';

It used to return true.

=item *

The C<SvPVutf8> C function no longer tries to modify its argument,
resulting in errors [perl #108994].

=item *

C<SvPVutf8> now works properly with magical variables.

=item *

C<SvPVbyte> now works properly with non-PVs.

=item *

When presented with malformed UTF-8 input, the XS-callable functions
C<is_utf8_string()>, C<is_utf8_string_loc()>, and
C<is_utf8_string_loclen()> could read beyond the end of the input
string by up to 12 bytes.  This no longer happens [perl #32080].
However, C<is_utf8_char()> currently still has this defect; see
L</is_utf8_char()> above.

=item *

The C-level C<pregcomp> function could become confused about whether the
pattern was in UTF8 if the pattern was an overloaded, tied, or otherwise
magical scalar [perl #101940].

=back

=head2 Compile-time hints

=over

=item *

Tying C<%^H> no longer causes perl to crash or ignore the contents of
C<%^H> when entering a compilation scope [perl #106282].

=item *

C<eval $string> and C<require> used not to
localize C<%^H> during compilation if it
was empty at the time the C<eval> call itself was compiled.  This could
lead to scary side effects, like C<use re "/m"> enabling other flags that
the surrounding code was trying to enable for its caller [perl #68750].

=item *

C<eval $string> and C<require> no longer localize hints (C<$^H> and C<%^H>)
at run time, but only during compilation of the $string or required file.
This makes C<BEGIN { $^H{foo}=7 }> equivalent to
C<BEGIN { eval '$^H{foo}=7' }> [perl #70151].

=item *

Creating a BEGIN block from XS code (via C<newXS> or C<newATTRSUB>) would,
on completion, make the hints of the current compiling code the current
hints.  This could cause warnings to occur in a non-warning scope.

=back
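
The equivalence stated for C<BEGIN { $^H{foo}=7 }> above can be observed
from Perl space by reading the hints hash through C<caller>.  In this
sketch the helper C<hint_in_effect> and the key C<MyPragma/level> are
invented for the illustration.

    use strict;
    use warnings;

    # Report a hint as seen by the statement that calls this sub.
    sub hint_in_effect {
        my $hints = (caller(0))[10];
        return $hints ? $hints->{'MyPragma/level'} : undef;
    }

    my ($direct, $via_eval, $outside);
    {
        BEGIN { $^H{'MyPragma/level'} = 7 }
        $direct = hint_in_effect();
    }
    {
        BEGIN { eval q{$^H{'MyPragma/level'} = 7; 1} or die $@ }
        $via_eval = hint_in_effect();
    }
    $outside = hint_in_effect();    # no hint was set in this scope

Both blocks see the value 7, and the hint does not leak into the
enclosing scope.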

=head2 Copy-on-write scalars

Copy-on-write or shared hash key scalars
were introduced in 5.8.0, but most Perl code
did not encounter them (they were used mostly internally).  Perl
5.10.0 extended them, such that assigning C<__PACKAGE__> or a
hash key to a scalar would make it copy-on-write.  Several parts
of Perl were not updated to account for them, but have now been fixed.

=over

=item *

C<utf8::decode> had a nasty bug that would modify copy-on-write scalars'
string buffers in place (i.e., skipping the copy).  This could result in
hashes having two elements with the same key [perl #91834]. (5.14.2)

=item *

Lvalue subroutines were not allowing COW scalars to be returned.  This was
fixed for lvalue scalar context in Perl 5.12.3 and 5.14.0, but list context
was not fixed until this release.

=item *

Elements of restricted hashes (see the L<fields> pragma) containing
copy-on-write values couldn't be deleted, nor could such hashes be cleared
(C<%hash = ()>). (5.14.2)

=item *

Localizing a tied variable used to make it read-only if it contained a
copy-on-write string. (5.14.2)

=item *

Assigning a copy-on-write string to a stash
element no longer causes a double free.  Regardless of this change, the
results of such assignments are still undefined.

=item *

Assigning a copy-on-write string to a tied variable no longer stops that
variable from being tied if it happens to be a PVMG or PVLV internally.

=item *

Doing a substitution on a tied variable returning a copy-on-write
scalar used to cause an assertion failure or an "Attempt to free
nonexistent shared string" warning.

=item *

This one is a regression from 5.12: In 5.14.0, the bitwise assignment
operators C<|=>, C<^=> and C<&=> started leaving the left-hand side
undefined if it happened to be a copy-on-write string [perl #108480].

=item *

L<Storable>, L<Devel::Peek> and L<PerlIO::scalar> had similar problems.
See L</Updated Modules and Pragmata>, above.

=back
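
As a sketch of the bitwise-assignment case listed above, the following
applies C<|=> to a scalar copied from a hash key, which is one way to
obtain a copy-on-write string.  The names are illustrative, and the
example assumes the C<bitwise> feature is not enabled, so that C<|=>
operates on strings.

    use strict;
    use warnings;

    my %table = (AB => 1);
    my ($key) = keys %table;    # copied from a shared hash key

    $key |= "  ";               # string OR with two spaces sets bit 0x20

The left-hand side is expected to hold the combined string (here the
lower-cased key) rather than being left undefined, and the hash itself is
not modified.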

=head2 The debugger

=over

=item *

F<dumpvar.pl>, and therefore the C<x> command in the debugger, have been
fixed to handle objects blessed into classes whose names contain "=".  The
contents of such objects used not to be dumped [perl #101814].

=item *

The "R" command for restarting a debugger session has been fixed to work on
Windows, or any other system lacking a C<POSIX::_SC_OPEN_MAX> constant
[perl #87740].

=item *

The C<#line 42 foo> directive used not to update the arrays of lines used
by the debugger if it occurred in a string eval.  This was partially fixed
in 5.14, but it worked only for a single C<#line 42 foo> in each eval.  Now
it works for multiple.

=item *

When subroutine calls are intercepted by the debugger, the name of the
subroutine or a reference to it is stored in C<$DB::sub>, for the debugger
to access.  Sometimes (such as C<$foo = *bar; undef *bar; &$foo>)
C<$DB::sub> would be set to a name that could not be used to find the
subroutine, and so the debugger's attempt to call it would fail.  Now the
check to see whether a reference is needed is more robust, so those
problems should not happen anymore [rt.cpan.org #69862].

=item *

Every subroutine has a filename associated with it that the debugger uses.
The one associated with constant subroutines used to be misallocated when
cloned under threads.  Consequently, debugging threaded applications could
result in memory corruption [perl #96126].

=back

=head2 Dereferencing operators

=over

=item *

C<defined(${"..."})>, C<defined(*{"..."})>, etc., used to
return true for most, but not all built-in variables, if
they had not been used yet.  This bug affected C<${^GLOBAL_PHASE}> and
C<${^UTF8CACHE}>, among others.  It also used to return false if the
package name was given as well (C<${"::!"}>) [perl #97978, #97492].

=item *

Perl 5.10.0 introduced a similar bug: C<defined(*{"foo"})> where "foo"
represents the name of a built-in global variable used to return false if
the variable had never been used before, but only on the I<first> call.
This, too, has been fixed.

=item *

Since 5.6.0, C<*{ ... }> has been inconsistent in how it treats undefined
values.  It would die in strict mode or lvalue context for most undefined
values, but would be treated as the empty string (with a warning) for the
specific scalar returned by C<undef()> (C<&PL_sv_undef> internally).  This
has been corrected.  C<undef()> is now treated like other undefined
scalars, as in Perl 5.005.

=back

=head2 Filehandle, last-accessed

Perl has an internal variable that stores the last filehandle to be
accessed.  It is used by C<$.> and by C<tell> and C<eof> without
arguments.
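
A small sketch of how the last-accessed handle shows through C<$.> and a
bare C<eof>.  It reads from in-memory handles so that it is
self-contained; the variable names are illustrative.

    use strict;
    use warnings;

    my $first_text  = "a1\na2\na3\n";
    my $second_text = "b1\n";
    open my $first,  '<', \$first_text  or die "open: $!";
    open my $second, '<', \$second_text or die "open: $!";

    my $line = <$first>;
    $line = <$first>;
    my $after_first = $.;     # line count of $first

    $line = <$second>;
    my $after_second = $.;    # now the line count of $second
    my $at_eof = eof;         # bare eof tests the last-read handle

After the two reads from C<$first>, C<$.> is 2.  Reading from C<$second>
switches the last-accessed handle, so C<$.> becomes 1 and C<eof> reports
the state of C<$second>.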

=over

=item *

It used to be possible to set this internal variable to a glob copy and
then modify that glob copy to be something other than a glob, and still
have the last-accessed filehandle associated with the variable after
assigning a glob to it again:

    my $foo = *STDOUT;  # $foo is a glob copy
    <$foo>;             # $foo is now the last-accessed handle
    $foo = 3;           # no longer a glob
    $foo = *STDERR;     # still the last-accessed handle

Now the C<$foo = 3> assignment unsets that internal variable, so there
is no last-accessed filehandle, just as if C<< <$foo> >> had never
happened.

This also prevents some unrelated handle from becoming the last-accessed
handle if $foo falls out of scope and the same internal SV gets used for
another handle [perl #97988].

=item *

A regression in 5.14 caused these statements not to set that internal
variable:

    my $fh = *STDOUT;
    tell $fh;
    eof  $fh;
    seek $fh, 0,0;
    tell     *$fh;
    eof      *$fh;
    seek     *$fh, 0,0;
    readline *$fh;

This is now fixed, but C<tell *{ *$fh }> still has the problem, and it
is not clear how to fix it [perl #106536].

=back
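
As a minimal sketch of what "last-accessed filehandle" means in practice,
the following reads from an in-memory filehandle (an assumption made only
to keep the example self-contained; the variable names are illustrative)
and then consults C<$.> and argument-less C<eof>, both of which refer to
that handle:

    use strict;
    use warnings;

    # An in-memory file keeps the sketch self-contained.
    my $data = "one\ntwo\nthree\n";
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

    my $lines = 0;
    while (my $line = <$fh>) {      # reading makes $fh the last-accessed handle
        $lines++;
    }

    my $line_number = $.;           # line counter of the last-accessed handle
    my $at_eof      = eof;          # no parentheses: tests the last-read handle
    print "read $lines lines; \$. is $line_number; at eof: ",
          ($at_eof ? "yes" : "no"), "\n";

Note that C<eof> is written without parentheses here; C<eof()> refers to
the C<ARGV> pseudo-file rather than to the last-accessed handle.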

=head2 Filetests and C<stat>

The term "filetests" refers to the operators that consist of a hyphen
followed by a single letter: C<-r>, C<-x>, C<-M>, etc.  The term "stacked"
when applied to filetests means followed by another filetest operator
sharing the same operand, as in C<-r -x -w $foo>.

=over

=item *

C<stat> produces more consistent warnings.  It no longer warns for "_"
[perl #71002] and no longer skips the warning at times for other unopened
handles.  It no longer warns about an unopened handle when the operating
system's C<fstat> function fails.

=item *

C<stat> would sometimes return negative numbers for large inode numbers,
because it was using the wrong internal C type. [perl #84590]

=item *

C<lstat> is documented to fall back to C<stat> (with a warning) when given
a filehandle.  When passed an IO reference, it was actually doing the
equivalent of S<C<stat _>> and ignoring the handle.

=item *

C<-T _> with no preceding C<stat> used to produce a
confusing "uninitialized" warning, even though there
is no visible uninitialized value to speak of.

=item *

C<-T>, C<-B>, C<-l> and C<-t> now work
when stacked with other filetest operators
[perl #77388].

=item *

In 5.14.0, filetest ops (C<-r>, C<-x>, etc.) started calling FETCH on a
tied argument belonging to the previous argument to a list operator, if
called with a bareword argument or no argument at all.  This has been
fixed, so C<push @foo, $tied, -r> no longer calls FETCH on C<$tied>.

=item *

In Perl 5.6, C<-l> followed by anything other than a bareword would treat
its argument as a file name.  That was changed in 5.8 for glob references
(C<\*foo>), but not for globs themselves (C<*foo>).  C<-l> started
returning C<undef> for glob references without setting the last
stat buffer that the "_" handle uses, but only if warnings
were turned on.  With warnings off, it was the same as 5.6.
In other words, it was simply buggy and inconsistent.  Now the 5.6
behavior has been restored.

=item *

C<-l> followed by a bareword no longer "eats" the previous argument to
the list operator in whose argument list it resides.  Hence,
C<print "bar", -l foo> now actually prints "bar", because C<-l>
no longer eats it.

=item *

Perl keeps several internal variables to keep track of the last stat
buffer, from which file(handle) it originated, what type it was, and
whether the last stat succeeded.

There were various cases where these could get out of synch, resulting in
inconsistent or erratic behavior in edge cases (every mention of C<-T>
applies to C<-B> as well):

=over

=item *

C<-T I<HANDLE>>, even though it does a C<stat>, was not resetting the last
stat type, so an C<lstat _> following it would merrily return the wrong
results.  Also, it was not setting the success status.

=item *

Freeing the handle last used by C<stat> or a filetest could result in
S<C<-T _>> using an unrelated handle.

=item *

C<stat> with an IO reference would not reset the stat type or record the
filehandle for S<C<-T _>> to use.

=item *

Fatal warnings could cause the stat buffer not to be reset
for a filetest operator on an unopened filehandle or C<-l> on any handle.
Fatal warnings also stopped C<-T> from setting C<$!>.

=item *

When the last stat was on an unreadable file, C<-T _> is supposed to
return C<undef>, leaving the last stat buffer unchanged.  But it was
setting the stat type, causing C<lstat _> to stop working.

=item *

C<-T I<FILENAME>> was not resetting the internal stat buffers for
unreadable files.

=back

These have all been fixed.

=back
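
The following sketch illustrates the two idioms that the fixes above
concern: stacked filetests, and reuse of the last stat buffer through the
special C<_> handle.  The temporary file and the variable names are
assumptions made for illustration; stacked filetests require Perl 5.10 or
later.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Create a small, readable, non-empty regular file.
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} "plain text\n";
    close $fh or die "Cannot close $path: $!";

    # Stacked form: all three tests share the single operand $path.
    my $stacked = (-f -r -s $path) ? 1 : 0;

    # Equivalent long form: one stat, then "_" reuses the stat buffer.
    my $reused = (-e $path && -f _ && -r _) ? 1 : 0;

    print "stacked: $stacked, reused stat buffer: $reused\n";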

=head2 Formats

=over

=item *

Several edge cases have been fixed with formats and C<formline>;
in particular, where the format itself is potentially variable (such as
with ties and overloading), and where the format and data differ in their
encoding.  In both these cases, it used to be possible for the output to be
corrupted [perl #91032].

=item *

C<formline> no longer converts its argument into a string in-place.  So
passing a reference to C<formline> no longer destroys the reference
[perl #79532].

=item *

Assignment to C<$^A> (the format output accumulator) now recalculates
the number of lines output.

=back

=head2 C<given> and C<when>

=over

=item *

C<given> was not scoping its implicit $_ properly, resulting in memory
leaks or "Variable is not available" warnings [perl #94682].

=item *

C<given> was not calling set-magic on the implicit lexical C<$_> that it
uses.  This meant, for example, that C<pos> would be remembered from one
execution of the same C<given> block to the next, even if the input were a
different variable [perl #84526].

=item *

C<when> blocks are now capable of returning variables declared inside the
enclosing C<given> block [perl #93548].

=back

=head2 The C<glob> operator

=over

=item *

On OSes other than VMS, Perl's C<glob> operator (and the C<< <...> >> form)
uses L<File::Glob> underneath.  L<File::Glob> splits the pattern into words,
before feeding each word to its C<bsd_glob> function.

There were several inconsistencies in the way the split was done.  Now
quotation marks (' and ") are always treated as shell-style word delimiters
(that allow whitespace as part of a word) and backslashes are always
preserved, unless they exist to escape quotation marks.  Before, those
would only sometimes be the case, depending on whether the pattern
contained whitespace.  Also, escaped whitespace at the end of the pattern
is no longer stripped [perl #40470].

=item *

C<CORE::glob> now works as a way to call the default globbing function.  It
used to respect overrides, despite the C<CORE::> prefix.

=item *

Under miniperl (used to configure modules when perl itself is built),
C<glob> now clears %ENV before calling csh, since the latter croaks on some
systems if it does not like the contents of the LS_COLORS environment
variable [perl #98662].

=back

=head2 Lvalue subroutines

=over

=item *

Explicit return now returns the actual argument passed to return, instead
of copying it [perl #72724, #72706].

=item *

Lvalue subroutines used to enforce lvalue syntax (i.e., whatever can go on
the left-hand side of C<=>) for the last statement and the arguments to
return.  Since lvalue subroutines are not always called in lvalue context,
this restriction has been lifted.

=item *

Lvalue subroutines are less restrictive about what values can be returned.
It used to croak on values returned by C<shift> and C<delete> and from
other subroutines, but no longer does so [perl #71172].

=item *

Empty lvalue subroutines (C<sub :lvalue {}>) used to return C<@_> in list
context.  All subroutines used to do this, but regular subs were fixed in
Perl 5.8.2.  Now lvalue subroutines have been likewise fixed.

=item *

Autovivification now works on values returned from lvalue subroutines
[perl #7946], as does returning C<keys> in lvalue context.

=item *

Lvalue subroutines used to copy their return values in rvalue context.  Not
only was this a waste of CPU cycles, but it also caused bugs.  A C<($)>
prototype would cause an lvalue sub to copy its return value [perl #51408],
and C<while(lvalue_sub() =~ m/.../g) { ... }> would loop endlessly
[perl #78680].

=item *

When called in potential lvalue context
(e.g., subroutine arguments or a list
passed to C<for>), lvalue subroutines used to copy
any read-only value that was returned.  E.g., C< sub :lvalue { $] } >
would not return C<$]>, but a copy of it.

=item *

When called in potential lvalue context, an lvalue subroutine returning
arrays or hashes used to bind the arrays or hashes to scalar variables,
resulting in bugs.  This was fixed in 5.14.0 if an array were the first
thing returned from the subroutine (but not for C<$scalar, @array> or
hashes being returned).  Now a more general fix has been applied
[perl #23790].

=item *

Method calls whose arguments were all surrounded with C<my()> or C<our()>
(as in C<< $object->method(my($a,$b)) >>) used to force lvalue context on
the subroutine.  This would prevent lvalue methods from returning certain
values.

=item *

Lvalue sub calls that are not determined to be such at compile time
(C<&$name> or C<&{"name"}>) are no longer exempt from strict refs if they
occur in the last statement of an lvalue subroutine [perl #102486].

=item *

Sub calls whose subs are not visible at compile time, if
they occurred in the last statement of an lvalue subroutine,
would reject non-lvalue subroutines and die with "Can't modify non-lvalue
subroutine call" [perl #102486].

Non-lvalue sub calls whose subs I<are> visible at compile time exhibited
the opposite bug.  If the call occurred in the last statement of an lvalue
subroutine, there would be no error when the lvalue sub was called in
lvalue context.  Perl would blindly assign to the temporary value returned
by the non-lvalue subroutine.

=item *

C<AUTOLOAD> routines used to take precedence over the actual sub being
called (i.e., when autoloading wasn't needed), for sub calls in lvalue or
potential lvalue context, if the subroutine was not visible at compile
time.

=item *

Applying the C<:lvalue> attribute to an XSUB or to an aliased subroutine
stub with C<< sub foo :lvalue; >> syntax stopped working in Perl 5.12.
This has been fixed.

=item *

Applying the :lvalue attribute to a subroutine that is already defined does
not work properly, as the attribute changes the way the sub is compiled.
Hence, Perl 5.12 began warning when an attempt is made to apply the
attribute to an already defined sub.  In such cases, the attribute is
discarded.

But the change in 5.12 missed the case where custom attributes are also
present: that case still silently and ineffectively applied the attribute.
That omission has now been corrected.  C<sub foo :lvalue :Whatever> (when
C<foo> is already defined) now warns about the :lvalue attribute, and does
not apply it.

=item *

A bug affecting lvalue context propagation through nested lvalue subroutine
calls has been fixed.  Previously, returning a value in nested rvalue
context would be treated as lvalue context by the inner subroutine call,
resulting in some values (such as read-only values) being rejected.

=back
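
For readers unfamiliar with the feature, here is a minimal sketch of an
lvalue subroutine called in both lvalue and rvalue context.  The names
C<%settings> and C<depth> are hypothetical and exist only for this
illustration.

    use strict;
    use warnings;

    my %settings = (depth => 1);

    # The :lvalue attribute lets a call to depth() appear on the
    # left-hand side of an assignment.
    sub depth :lvalue { $settings{depth} }

    depth() = 5;            # lvalue context: assigns to $settings{depth}

    my $copy = depth();     # rvalue context: an ordinary read
    $copy++;                # changes only the copy

    print "stored: $settings{depth}, copy: $copy\n";

Because the subroutine body is compiled differently when C<:lvalue> is
present, the attribute is applied at the point of definition here rather
than added to an already defined sub.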

=head2 Overloading

=over

=item *

Arithmetic assignment (C<$left += $right>) involving overloaded objects
that rely on the 'nomethod' override no longer segfaults when the left
operand is not overloaded.

=item *

Errors that occur when methods cannot be found during overloading now
mention the correct package name, as they did in 5.8.x, instead of
erroneously mentioning the "overload" package, as they have since 5.10.0.

=item *

Undefining C<%overload::> no longer causes a crash.

=back

=head2 Prototypes of built-in keywords

=over

=item *

The C<prototype> function no longer dies for the C<__FILE__>, C<__LINE__>
and C<__PACKAGE__> directives.  It now returns an empty-string prototype
for them, because they are syntactically indistinguishable from nullary
functions like C<time>.

=item *

C<prototype> now returns C<undef> for all overridable infix operators,
such as C<eq>, which are not callable in any way resembling functions.
It used to return incorrect prototypes for some and die for others
[perl #94984].

=item *

The prototypes of several built-in functions--C<getprotobynumber>, C<lock>,
C<not> and C<select>--have been corrected, or at least are now closer to
reality than before.

=back
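
The behavior described above can be checked with a short sketch like the
following, which assumes Perl 5.16 or later.  The variable names are
illustrative only.

    use strict;
    use warnings;

    # A nullary built-in has an empty-string prototype.
    my $time_proto = prototype("CORE::time");

    # __FILE__ is now reported the same way instead of dying.
    my $file_proto = prototype("CORE::__FILE__");

    # An infix operator is not callable like a function, so the
    # result is undef.
    my $eq_proto = prototype("CORE::eq");

    for my $pair ([time => $time_proto],
                  [__FILE__ => $file_proto],
                  [eq => $eq_proto]) {
        my ($name, $proto) = @$pair;
        print "$name: ", (defined $proto ? "'$proto'" : "undef"), "\n";
    }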

=head2 Regular expressions

=for comment Is it possible to merge some of these items?

=over 4

=item *

C</[[:ascii:]]/> and C</[[:blank:]]/> now use locale rules under
C<use locale> when the platform supports that.  Previously, they used
the platform's native character set.

=item *

C<m/[[:ascii:]]/i> and C</\p{ASCII}/i> now match identically (when not
under a differing locale).  This fixes a regression introduced in 5.14
in which the first expression could match characters outside of ASCII,
such as the KELVIN SIGN.

=item *

C</.*/g> would sometimes refuse to match at the end of a string that ends
with "\n".  This has been fixed [perl #109206].

=item *

Starting with 5.12.0, Perl used to get its internal bookkeeping muddled up
after assigning C<${ qr// }> to a hash element and locking it with
L<Hash::Util>.  This could result in double frees, crashes, or erratic
behavior.

=item *

The regular expression modifier C</a> (new in 5.14.0), when repeated as
C</aa>, forbids characters outside the ASCII range from matching
characters inside that range under C</i>.  This did not work under some
circumstances, all involving alternation.  For example,

 "\N{KELVIN SIGN}" =~ /k|foo/iaa;

succeeded inappropriately.  This is now fixed.

=item *

5.14.0 introduced some memory leaks in regular expression character
classes such as C<[\w\s]>, which have now been fixed. (5.14.1)

=item *

An edge case in regular expression matching could potentially loop.
This happened only under C</i> in bracketed character classes that have
characters with multi-character folds, and the target string to match
against includes the first portion of the fold, followed by another
character that has a multi-character fold that begins with the remaining
portion of the fold, plus some more.

 "s\N{U+DF}" =~ /[\x{DF}foo]/i

is one such case.  C<\xDF> folds to C<"ss">. (5.14.1)

=item *

A few characters in regular expression pattern matches did not
match correctly in some circumstances, all involving C</i>.  The
affected characters are:
COMBINING GREEK YPOGEGRAMMENI,
GREEK CAPITAL LETTER IOTA,
GREEK CAPITAL LETTER UPSILON,
GREEK PROSGEGRAMMENI,
GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA,
GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS,
GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA,
GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS,
LATIN SMALL LETTER LONG S,
LATIN SMALL LIGATURE LONG S T,
and
LATIN SMALL LIGATURE ST.

=item *

A memory leak regression in regular expression compilation
under threading has been fixed.

=item *

A regression introduced in 5.14.0 has
been fixed.  This involved an inverted
bracketed character class in a regular expression that consisted solely
of a Unicode property.  That property wasn't getting inverted outside the
Latin1 range.

=item *

Three problematic Unicode characters now work better in regex pattern matching under C</i>.

In the past, three Unicode characters:
LATIN SMALL LETTER SHARP S,
GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS,
and
GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS,
along with the sequences that they fold to
(including "ss" for LATIN SMALL LETTER SHARP S),
did not properly match under C</i>.  5.14.0 fixed some of these cases,
but introduced others, including a panic when one of the characters or
sequences was used in the C<(?(DEFINE)...)> regular expression predicate.
The known bugs that were introduced in 5.14 have now been fixed, as have
some other edge cases that had never worked until now.  These all
involve using the characters and sequences outside bracketed character
classes under C</i>.  This closes [perl #98546].

There remain known problems when using certain characters with
multi-character folds inside bracketed character classes, including such
constructs as C<qr/[\N{LATIN SMALL LETTER SHARP S}a-z]/i>.  These
remaining bugs are addressed in [perl #89774].

=item *

RT #78266: The regex engine has been leaking memory when accessing
named captures that weren't matched as part of a regex ever since 5.10
when they were introduced; e.g., this would consume over a hundred MB of
memory:

    for (1..10_000_000) {
        if ("foo" =~ /(foo|(?<capture>bar))?/) {
            my $capture = $+{capture}
        }
    }
    system "ps -o rss $$";

=item *

In 5.14, C</[[:lower:]]/i> and C</[[:upper:]]/i> no longer matched the
opposite case.  This has been fixed [perl #101970].

=item *

A regular expression match with an overloaded object on the right-hand side
would sometimes stringify the object too many times.

=item *

A regression has been fixed that was introduced in 5.14, in C</i>
regular expression matching, in which a match improperly fails if the
pattern is in UTF-8, the target string is not, and a Latin-1 character
precedes a character in the string that should match the pattern.
[perl #101710]

=item *

In case-insensitive regular expression pattern matching on UTF-8 encoded
strings, the scan for the start of a match no longer looks only at the
first possible position.  This used to cause matches such as
C<"f\x{FB00}" =~ /ff/i> to fail.

=item *

The regexp optimizer no longer crashes on debugging builds when merging
fixed-string nodes with inconvenient contents.

=item *

A panic involving the combination of the regular expression modifiers
C</aa> and the C<\b> escape sequence introduced in 5.14.0 has been
fixed [perl #95964]. (5.14.2)

=item *

The combination of the regular expression modifiers C</aa> and the C<\b>
and C<\B> escape sequences did not work properly on UTF-8 encoded
strings.  All non-ASCII characters under C</aa> should be treated as
non-word characters, but what was happening was that Unicode rules were
used to determine wordness/non-wordness for non-ASCII characters.  This
is now fixed [perl #95968].

=item *

C<< (?foo: ...) >> no longer loses the passed-in character set.

=item *

The trie optimization used to have problems with alternations containing
an empty C<(?:)>, causing C<< "x" =~ /\A(?>(?:(?:)A|B|C?x))\z/ >> not to
match, whereas it should [perl #111842].

=item *

Use of lexical (C<my>) variables in code blocks embedded in regular
expressions will no longer result in memory corruption or crashes.

Nevertheless, these code blocks are still experimental, as there are still
problems with the wrong variables being closed over (in loops for instance)
and with abnormal exiting (e.g., C<die>) causing memory corruption.

=item *

The C<\h>, C<\H>, C<\v> and C<\V> regular expression metacharacters used to
cause a panic error message when trying to match at the end of the
string [perl #96354].

=item *

The abbreviations for four C1 control characters C<MW>, C<PM>, C<RI>, and
C<ST> were previously unrecognized by C<\N{}>, vianame(), and
string_vianame().

=item *

Mentioning a variable named "&" other than C<$&> (i.e., C<@&> or C<%&>) no
longer stops C<$&> from working.  The same applies to variables named "'"
and "`" [perl #24237].

=item *

Creating a C<UNIVERSAL::AUTOLOAD> sub no longer stops C<%+>, C<%-> and
C<%!> from working some of the time [perl #105024].

=back
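
As a minimal sketch of the C</aa> behavior discussed above, the following
compares case-insensitive matching of KELVIN SIGN (U+212A) with and
without the modifier.  It assumes Perl 5.14 or later for C</aa>; the
variable names are illustrative only.

    use strict;
    use warnings;

    my $kelvin = "\x{212A}";    # KELVIN SIGN, which case-folds to "k"

    # Under Unicode rules, /i lets the non-ASCII character match "k".
    my $unicode_fold = ($kelvin =~ /k/i) ? 1 : 0;

    # With /aa, a non-ASCII character must not match an ASCII one.
    my $ascii_only = ($kelvin =~ /k/iaa) ? 1 : 0;

    # The alternation case is the one the fix above addresses.
    my $alternation = ($kelvin =~ /k|foo/iaa) ? 1 : 0;

    print "/i: $unicode_fold, /iaa: $ascii_only, ",
          "alternation /iaa: $alternation\n";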

=head2 Smartmatching

=over

=item *

C<~~> now correctly handles the precedence of Any~~Object, and is not tricked
by an overloaded object on the left-hand side.

=item *

In Perl 5.14.0, C<$tainted ~~ @array> stopped working properly.  Sometimes
it would erroneously fail (when C<$tainted> contained a string that occurs
in the array I<after> the first element) or erroneously succeed (when
C<undef> occurred after the first element) [perl #93590].

=back

=head2 The C<sort> operator

=over

=item *

C<sort> was not treating C<sub {}> and C<sub {()}> as equivalent when
such a sub was provided as the comparison routine.  It used to croak on
C<sub {()}>.

=item *

C<sort> now works once more with custom sort routines that are XSUBs.  It
stopped working in 5.10.0.

=item *

C<sort> with a constant for a custom sort routine, although it produces
unsorted results, no longer crashes.  It started crashing in 5.10.0.

=item *

Warnings emitted by C<sort> when a custom comparison routine returns a
non-numeric value now contain "in sort" and show the line number of the
C<sort> operator, rather than the last line of the comparison routine.  The
warnings also now occur only if warnings are enabled in the scope where
C<sort> occurs.  Previously the warnings would occur if enabled in the
comparison routine's scope.

=item *

C<< sort { $a <=> $b } >>, which is optimized internally, now produces
"uninitialized" warnings for NaNs (not-a-number values), since C<< <=> >>
returns C<undef> for those.  This brings it in line with
S<C<< sort { 1; $a <=> $b } >>> and other more complex cases, which are not
optimized [perl #94390].

=back

=head2 The C<substr> operator

=over

=item *

Tied (and otherwise magical) variables are no longer exempt from the
"Attempt to use reference as lvalue in substr" warning.

=item *

That warning now occurs when the returned lvalue is assigned to, not
when C<substr> itself is called.  This makes a difference only if the
return value of C<substr> is referenced and later assigned to.

=item *

Passing a substring of a read-only value or a typeglob to a function
(potential lvalue context) no longer causes an immediate "Can't coerce"
or "Modification of a read-only value" error.  That error occurs only 
if the passed value is assigned to.

The same thing happens with the "substr outside of string" error.  If
the lvalue is only read from, not written to, it is now just a warning, as
with rvalue C<substr>.

=item *

C<substr> assignments no longer call FETCH twice if the first argument
is a tied variable, just once.

=back
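
The two situations mentioned above, a referenced C<substr> lvalue that is
assigned to later and a substring passed to a function that only reads
it, can be sketched as follows.  The example assumes Perl 5.16 or later,
and the names C<peek>, C<$text> and C<$ref> are hypothetical.

    use strict;
    use warnings;

    my $text = "hello world";

    # Take a reference to the lvalue returned by substr; the
    # assignment happens later, through the reference.
    my $ref = \substr($text, 0, 5);
    $$ref = "HELLO";

    # A function argument is potential lvalue context, but this
    # function only reads the value, so a substring of a read-only
    # string is acceptable.
    sub peek { return length $_[0] }
    my $len = peek(substr("constant", 0, 3));

    print "text: $text, length seen by peek: $len\n";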

=head2 Support for embedded nulls

Some parts of Perl did not work correctly with nulls (C<chr 0>) embedded in
strings.  That meant that, for instance, C<< $m = "a\0b"; foo->$m >> would
call the "a" method, instead of the actual method name contained in $m.
These parts of perl have been fixed to support nulls:

=over

=item *

Method names

=item *

Typeglob names (including filehandle and subroutine names)

=item *

Package names, including the return value of C<ref()>

=item *

Typeglob elements (C<*foo{"THING\0stuff"}>)

=item *

Signal names

=item *

Various warnings and error messages that mention variable names or values,
methods, etc.

=back

One side effect of these changes is that blessing into "\0" no longer
causes C<ref()> to return false.

=head2 Threading bugs

=over

=item *

Typeglobs returned from threads are no longer cloned if the parent thread
already has a glob with the same name.  This means that returned
subroutines will now assign to the right package variables [perl #107366].

=item *

Some cases of threads crashing due to memory allocation during cloning have
been fixed [perl #90006].

=item *

Thread joining would sometimes emit "Attempt to free unreferenced scalar"
warnings if C<caller> had been used from the C<DB> package before thread
creation [perl #98092].

=item *

Locking a subroutine (via C<lock &sub>) is no longer a compile-time error
for regular subs.  For lvalue subroutines, it no longer tries to return the
sub as a scalar, resulting in strange side effects like C<ref \$_>
returning "CODE" in some instances.

C<lock &sub> is now a run-time error if L<threads::shared> is loaded (a
no-op otherwise), but that may be rectified in a future version.

=back

=head2 Tied variables

=over

=item *

Various cases in which FETCH was being ignored or called too many times
have been fixed:

=over

=item *

C<PerlIO::get_layers> [perl #97956]

=item *

C<$tied =~ y/a/b/>, C<chop $tied> and C<chomp $tied> when $tied holds a
reference.

=item *

When calling C<local $_> [perl #105912]

=item *

Four-argument C<select>

=item *

A tied buffer passed to C<sysread>

=item *

C<< $tied .= <> >>

=item *

Three-argument C<open>, the third being a tied file handle
(as in C<< open $fh, ">&", $tied >>)

=item *

C<sort> with a reference to a tied glob for the comparison routine.

=item *

C<..> and C<...> in list context [perl #53554].

=item *

C<${$tied}>, C<@{$tied}>, C<%{$tied}> and C<*{$tied}> where the tied
variable returns a string (C<&{}> was unaffected)

=item *

C<defined ${ $tied_variable }>

=item *

Various functions that take a filehandle argument in rvalue context
(C<close>, C<readline>, etc.) [perl #97482]

=item *

Some cases of dereferencing a complex expression, such as
C<${ (), $tied } = 1>, used to call C<FETCH> multiple times, but now call
it once.

=item *

C<$tied-E<gt>method> where $tied returns a package name--even resulting in
a failure to call the method, due to memory corruption

=item *

Assignments like C<*$tied = \&{"..."}> and C<*glob = $tied>

=item *

C<chdir>, C<chmod>, C<chown>, C<utime>, C<truncate>, C<stat>, C<lstat> and
the filetest ops (C<-r>, C<-x>, etc.)

=back

=item *

C<caller> sets C<@DB::args> to the subroutine arguments when called from
the DB package.  It used to crash when doing so if C<@DB::args> happened to
be tied.  Now it croaks instead.

=item *

Tying an element of %ENV or C<%^H> and then deleting that element would
result in a call to the tie object's DELETE method, even though tying the
element itself is supposed to be equivalent to tying a scalar (the element
is, of course, a scalar) [perl #67490].

=item *

When Perl autovivifies an element of a tied array or hash (which entails
calling STORE with a new reference), it now calls FETCH immediately after
the STORE, instead of assuming that FETCH would have returned the same
reference.  This can make it easier to implement tied objects [perl #35865, #43011].

=item *

Four-argument C<select> no longer produces its "Non-string passed as
bitmask" warning on tied or tainted variables that are strings.

=item *

Localizing a tied scalar that returns a typeglob no longer stops it from
being tied till the end of the scope.

=item *

Attempting to C<goto> out of a tied handle method used to cause memory
corruption or crashes.  Now it produces an error message instead
[perl #8611].

=item *

A bug has been fixed that occurs when a tied variable is used as a
subroutine reference:  if the last thing assigned to or returned from the
variable was a reference or typeglob, the C<\&$tied> could either crash or
return the wrong subroutine.  The reference case is a regression introduced
in Perl 5.10.0.  For typeglobs, it has probably never worked till now.

=back
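
Counting FETCH calls is a simple way to observe the kind of behavior that
the fixes above concern.  The following is only a sketch: the
C<CountingScalar> class is hypothetical, and it demonstrates a plain
assignment rather than any of the specific operators listed above.

    use strict;
    use warnings;

    {
        package CountingScalar;   # hypothetical tie class

        sub TIESCALAR {
            my ($class, $value) = @_;
            return bless { value => $value, fetches => 0 }, $class;
        }

        sub FETCH {
            my $self = shift;
            $self->{fetches}++;   # record every read of the variable
            return $self->{value};
        }

        sub STORE {
            my ($self, $value) = @_;
            $self->{value} = $value;
        }
    }

    # tie returns the underlying object, which holds the counter.
    my $object = tie my $tied, 'CountingScalar', 'abc';

    my $copy = $tied;             # a single read of the tied scalar

    print "value: $copy, FETCH calls: $object->{fetches}\n";

The same counter can be placed around any of the constructs listed above
to check how many times a particular Perl version calls FETCH for it.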

=head2 Version objects and vstrings

=over

=item *

The bitwise complement operator (and possibly other operators, too) when
passed a vstring would leave vstring magic attached to the return value,
even though the string had changed.  This meant that
C<< version->new(~v1.2.3) >> would create a version looking like "v1.2.3"
even though the string passed to C<< version->new >> was actually
"\376\375\374".  This also caused L<B::Deparse> to deparse C<~v1.2.3>
incorrectly, without the C<~> [perl #29070].

=item *

Assigning a vstring to a magic (e.g., tied, C<$!>) variable and then
assigning something else used to blow away all magic.  This meant that
tied variables would come undone, C<$!> would stop getting updated on
failed system calls, C<$|> would stop setting autoflush, and other
mischief would take place.  This has been fixed.

=item *

C<< version->new("version") >> and C<printf "%vd", "version"> no longer
crash [perl #102586].

=item *

Version comparisons, such as those that happen implicitly with C<use
v5.43>, no longer cause locale settings to change [perl #105784].

=item *

Version objects no longer cause memory leaks in boolean context
[perl #109762].

=back

=head2 Warnings, redefinition

=over

=item *

Subroutines from the C<autouse> namespace are once more exempt from
redefinition warnings.  This used to work in 5.005, but was broken in
5.6 for most subroutines.  For subs created via XS that redefine
subroutines from the C<autouse> package, this stopped working in 5.10.

=item *

New XSUBs now produce redefinition warnings if they overwrite existing
subs, as they did in 5.8.x.  (The C<autouse> logic was reversed in
5.10-14.  Only subroutines from the C<autouse> namespace would warn
when clobbered.)

=item *

C<newCONSTSUB> used to use compile-time warning hints, instead of
run-time hints.  The following code should never produce a redefinition
warning, but it used to, if C<newCONSTSUB> redefined an existing
subroutine:

    use warnings;
    BEGIN {
        no warnings;
        some_XS_function_that_calls_new_CONSTSUB();
    }

=item *

Redefinition warnings for constant subroutines are on by default (what
are known as severe warnings in L<perldiag>).  This used to be the case
only when the warning was caused by a glob assignment or by the
declaration of a Perl subroutine.  If the creation of an XSUB triggered
the warning, it was not a default warning.  This has been corrected.

=item *

The internal check to see whether a redefinition warning should occur
used to emit "uninitialized" warnings in cases like this:

    use warnings "uninitialized";
    use constant {u => undef, v => undef};
    sub foo(){u}
    sub foo(){v}

=back

=head2 Warnings, "Uninitialized"

=over

=item *

Various functions that take a filehandle argument in rvalue context
(C<close>, C<readline>, etc.) used to warn twice for an undefined handle
[perl #97482].

=item *

C<dbmopen> now only warns once, rather than three times, if the mode
argument is C<undef> [perl #90064].

=item *

The C<+=> operator does not usually warn when the left-hand side is
C<undef>, but it was doing so for tied variables.  This has been fixed
[perl #44895].

=item *

A bug fix in Perl 5.14 introduced a new bug, causing "uninitialized"
warnings to report the wrong variable if the operator in question had
two operands and one was C<%{...}> or C<@{...}>.  This has been fixed
[perl #103766].

=item *

C<..> and C<...> in list context now mention the name of the variable in
"uninitialized" warnings for string (as opposed to numeric) ranges.

=back

=head2 Weak references

=over

=item *

Weakening the first argument to an automatically-invoked C<DESTROY> method
could result in erroneous "DESTROY created new reference" errors or
crashes.  Now it is an error to weaken a read-only reference.

=item *

Weak references to lexical hashes going out of scope were not going stale
(becoming undefined), but continued to point to the hash.

=item *

Weak references to lexical variables going out of scope are now broken
before any magical methods (e.g., DESTROY on a tie object) are called.
This prevents such methods from modifying the variable that will be seen
the next time the scope is entered.

=item *

Creating a weak reference to an @ISA array or accessing the array index
(C<$#ISA>) could result in confused internal bookkeeping for elements
later added to the @ISA array.  For instance, creating a weak
reference to the element itself could push that weak reference on to @ISA;
and elements added after use of C<$#ISA> would be ignored by method lookup
[perl #85670].

=back
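
The lexical-hash case above can be sketched with L<Scalar::Util>'s
C<weaken>.  The example assumes Perl 5.16 or later, and the names
C<%cache> and C<$weak> are illustrative only.

    use strict;
    use warnings;
    use Scalar::Util qw(weaken);

    my $weak;
    my $seen_inside;
    {
        my %cache = (answer => 42);
        $weak = \%cache;
        weaken($weak);                    # does not keep %cache alive
        $seen_inside = $weak->{answer};   # usable while %cache is in scope
    }

    # Once %cache has gone out of scope, the weak reference is stale.
    my $stale = defined $weak ? 0 : 1;

    print "inside scope: $seen_inside, stale afterwards: $stale\n";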

=head2 Other notable fixes

=over

=item *

C<quotemeta> now quotes consistently the same non-ASCII characters under
C<use feature 'unicode_strings'>, regardless of whether the string is
encoded in UTF-8 or not, hence fixing the last vestiges (we hope) of the
notorious L<perlunicode/The "Unicode Bug"> [perl #77654].

Which of these code points is quoted has changed, based on Unicode's
recommendations.  See L<perlfunc/quotemeta> for details.

=item *

C<study> is now a no-op, presumably fixing all outstanding bugs related to
study causing regex matches to behave incorrectly!

=item *

When one writes C<open foo || die>, which used to work in Perl 4, a
"Precedence problem" warning is produced.  This warning used erroneously to
apply to fully-qualified bareword handle names not followed by C<||>.  This
has been corrected.

=item *

After package aliasing (C<*foo:: = *bar::>), C<select> with 0 or 1 argument
would sometimes return a name that could not be used to refer to the
filehandle, or sometimes it would return C<undef> even when a filehandle
was selected.  Now it returns a typeglob reference in such cases.

=item *

C<PerlIO::get_layers> no longer ignores some arguments that it thinks are
numeric, while treating others as filehandle names.  It is now consistent
for flat scalars (i.e., not references).

=item *

Unrecognized switches on C<#!> line

If a switch, such as B<-x>, that cannot occur on the C<#!> line is used
there, perl dies with "Can't emulate...".

It used to produce the same message for switches that perl did not
recognize at all, whether on the command line or the C<#!> line.

Now it produces the "Unrecognized switch" error message [perl #104288].

=item *

C<system> now temporarily blocks the SIGCHLD signal handler, to prevent the
signal handler from stealing the exit status [perl #105700].

=item *

The %n formatting code for C<printf> and C<sprintf>, which causes the number
of characters to be assigned to the next argument, now actually
assigns the number of characters, instead of the number of bytes.

It also works now with special lvalue functions like C<substr> and with
nonexistent hash and array elements [perl #3471, #103492].

=item *

Perl skips copying values returned from a subroutine, for the sake of
speed, if doing so would make no observable difference.  Because of faulty
logic, this would happen with the
result of C<delete>, C<shift> or C<splice>, even if the result was
referenced elsewhere.  It also did so with tied variables about to be freed
[perl #91844, #95548].

=item *

C<utf8::decode> now refuses to modify read-only scalars [perl #91850].

=item *

Freeing $_ inside a C<grep> or C<map> block, a code block embedded in a
regular expression, or an @INC filter (a subroutine returned by a
subroutine in @INC) used to result in double frees or crashes
[perl #91880, #92254, #92256].

=item *

C<eval> returns C<undef> in scalar context or an empty list in list
context when there is a run-time error.  When C<eval> was passed a
string in list context and a syntax error occurred, it used to return a
list containing a single undefined element.  Now it returns an empty
list in list context for all errors [perl #80630].
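
As an illustration, assuming Perl 5.16 or later, the following sketch
passes a string containing a deliberate syntax error to C<eval> in list
context and counts the elements returned:

    use strict;
    use warnings;

    # "(" on its own is a syntax error, so the string eval fails
    # at compile time and sets $@.
    my @result = eval "(";
    my $count  = @result;

    print "eval returned $count element(s)\n";
    print "error was: $@" if $@;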

=item *

C<goto &func> no longer crashes, but produces an error message, when
the unwinding of the current subroutine's scope fires a destructor that
undefines the subroutine being "goneto" [perl #99850].

=item *

Perl now holds an extra reference count on the package that code is
currently compiling in.  This means that the following code no longer
crashes [perl #101486]:

    package Foo;
    BEGIN {*Foo:: = *Bar::}
    sub foo;

=item *

The C<x> repetition operator no longer crashes on 64-bit builds with large
repeat counts [perl #94560].

=item *

Calling C<require> on an implicit C<$_> when C<*CORE::GLOBAL::require> has
been overridden does not segfault anymore, and C<$_> is now passed to the
overriding subroutine [perl #78260].

=item *

C<use> and C<require> are no longer affected by the I/O layers active in
the caller's scope (enabled by L<open.pm|open>) [perl #96008].

=item *

C<our $::é; $é> (which is invalid) no longer produces the "Compilation
error at lib/utf8_heavy.pl..." error message, which it started emitting in
5.10.0 [perl #99984].

=item *

On 64-bit systems, C<read()> now understands large string offsets beyond
the 32-bit range.

=item *

Errors that occur when processing subroutine attributes no longer cause the
subroutine's op tree to leak.

=item *

Passing the same constant subroutine to both C<index> and C<formline> no
longer causes one or the other to fail [perl #89218]. (5.14.1)

=item *

List assignment to lexical variables declared with attributes in the same
statement (C<my ($x,@y) : blimp = (72,94)>) stopped working in Perl 5.8.0.
It has now been fixed.

=item *

Perl 5.10.0 introduced some faulty logic that made "U*" in the middle of
a pack template equivalent to "U0" if the input string was empty.  This has
been fixed [perl #90160]. (5.14.2)

=item *

Destructors on objects were not called during global destruction on objects
that were not referenced by any scalars.  This could happen if an array
element were blessed (e.g., C<bless \$a[0]>) or if a closure referenced a
blessed variable (C<bless \my @a; sub foo { @a }>).

Now there is an extra pass during global destruction to fire destructors on
any objects that might be left after the usual passes that check for
objects referenced by scalars [perl #36347].

=item *

Fixed a case where a freed buffer could be read from when parsing a here
document [perl #90128]. (5.14.1)

=item *

C<each(I<ARRAY>)> is now wrapped in C<defined(...)>, like C<each(I<HASH>)>,
inside a C<while> condition [perl #90888].

=item *

A problem with context propagation when a C<do> block is an argument to
C<return> has been fixed.  It used to cause C<undef> to be returned in
certain cases of a C<return> inside an C<if> block which itself is followed by
another C<return>.

=item *

Calling C<index> with a tainted constant no longer causes constants in
subsequently compiled code to become tainted [perl #64804].

=item *

Infinite loops like C<1 while 1> used to stop C<strict 'subs'> mode from
working for the rest of the block.

=item *

For list assignments like C<($a,$b) = ($b,$a)>, Perl has to make a copy of
the items on the right-hand side before assigning them to the left.  For
efficiency's sake, it assigns the values on the right straight to the items
on the left if no one variable is mentioned on both sides, as in C<($a,$b) =
($c,$d)>.  The logic for determining when it can cheat was faulty, in that
C<&&> and C<||> on the right-hand side could fool it.  So C<($a,$b) =
$some_true_value && ($b,$a)> would end up assigning the value of C<$b> to
both scalars.
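
A small sketch of the corrected behavior.  The names C<$x>, C<$y> and
C<$true> are illustrative and stand in for the variables used above.
With the fix in place the two values are expected to be swapped:

    use strict;
    use warnings;

    my ($x, $y) = (1, 2);
    my $true = 1;

    # $x and $y appear on both sides of the assignment, but on the
    # right they are hidden behind "&&".  Perl must still copy them
    # before assigning.
    ($x, $y) = $true && ($y, $x);

    print "x=$x y=$y\n";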

=item *

Perl no longer tries to apply lvalue context to the string in
C<("string", $variable) ||= 1> (which used to be an error).  Since the
left-hand side of C<||=> is evaluated in scalar context, that's a scalar
comma operator, which gives all but the last item void context.  There is
no such thing as void lvalue context, so it was a mistake for Perl to try
to force it [perl #96942].

=item *

C<caller> no longer leaks memory when called from the DB package if
C<@DB::args> was assigned to after the first call to C<caller>.  L<Carp>
was triggering this bug [perl #97010]. (5.14.2)

=item *

C<close> and similar filehandle functions, when called on built-in global
variables (like C<$+>), used to die if the variable happened to hold the
undefined value, instead of producing the usual "Use of uninitialized
value" warning.

=item *

When autovivified file handles were introduced in Perl 5.6.0, C<readline>
was inadvertently made to autovivify when called as C<readline($foo)> (but
not as C<E<lt>$fooE<gt>>).  It has now been fixed never to autovivify.

=item *

Calling an undefined anonymous subroutine (e.g., what $x holds after
C<undef &{$x = sub{}}>) used to cause a "Not a CODE reference" error, which
has been corrected to "Undefined subroutine called" [perl #71154].

=item *

Causing C<@DB::args> to be freed between uses of C<caller> no longer
results in a crash [perl #93320].

=item *

C<setpgrp($foo)> used to be equivalent to C<($foo, setpgrp)>, because
C<setpgrp> was ignoring its argument if there was just one.  Now it is
equivalent to C<setpgrp($foo,0)>.

=item *

C<shmread> was not setting the scalar flags correctly when reading from
shared memory, causing the existing cached numeric representation in the
scalar to persist [perl #98480].

=item *

C<++> and C<--> now work on copies of globs, instead of dying.

=item *

C<splice()> doesn't warn when truncating

You can now limit the size of an array using C<splice(@a,MAX_LEN)> without
worrying about warnings.
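
For example, assuming Perl 5.16 or later (C<MAX_LEN> here is a
hypothetical constant, as are the array names):

    use strict;
    use warnings;

    use constant MAX_LEN => 3;    # hypothetical size limit

    my @long  = (1 .. 10);
    my @short = (1, 2);

    splice(@long,  MAX_LEN);    # removes everything from index 3 on
    splice(@short, MAX_LEN);    # already short enough; left unchanged

    print scalar(@long), " ", scalar(@short), "\n";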

=item *

C<< $$ >> is no longer tainted.  Since this value comes directly from
C<< getpid() >>, it is always safe.

=item *

The parser no longer leaks a filehandle if STDIN was closed before parsing
started [perl #37033].

=item *

C<< die; >> with a non-reference, non-string, or magical (e.g., tainted)
value in $@ now properly propagates that value [perl #111654].

=back

=head1 Known Problems

=over 4

=item *

On Solaris, we have two kinds of failure.

If F<make> is Sun's F<make>, we get an error about a badly formed macro
assignment in the F<Makefile>.  That happens when F<./Configure> tries to
make depends.  F<Configure> then exits 0, but further F<make>-ing fails.

If F<make> is F<gmake>, F<Configure> completes, then we get errors related
to F</usr/include/stdbool.h>.

=item *

On Win32, a number of tests hang unless STDERR is redirected.  The cause of
this is still under investigation.

=item *

When building as root with a umask that prevents files from being
other-readable, F<t/op/filetest.t> will fail.  This is a test bug, not a
bug in perl's behavior.

=item *

Configuring with a recent gcc and link-time optimization, such as
C<Configure -Doptimize='-O2 -flto'>, fails because the optimizer optimizes
away some of Configure's tests.  A
workaround is to omit the C<-flto> flag when running Configure, but add
it back in while actually building, something like

    sh Configure -Doptimize=-O2
    make OPTIMIZE='-O2 -flto'

=item *

The following CPAN modules have test failures with perl 5.16.  Patches have
been submitted for all of these, so hopefully there will be new releases
soon:

=over

=item *

L<Date::Pcalc> version 6.1

=item *

L<Module::CPANTS::Analyse> version 0.85

This fails due to problems in L<Module::Find> 0.10 and L<File::MMagic>
1.27.

=item *

L<PerlIO::Util> version 0.72

=back

=back

=head1 Acknowledgements

Perl 5.16.0 represents approximately 12 months of development since Perl
5.14.0 and contains approximately 590,000 lines of changes across 2,500
files from 139 authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers.  The following people are known to
have contributed the improvements that became Perl 5.16.0:

Aaron Crane, Abhijit Menon-Sen, Abigail, Alan Haggai Alavi, Alberto
Simões, Alexandr Ciornii, Andreas König, Andy Dougherty, Aristotle
Pagaltzis, Bo Johansson, Bo Lindbergh, Breno G. de Oliveira, brian d
foy, Brian Fraser, Brian Greenfield, Carl Hayter, Chas. Owens,
Chia-liang Kao, Chip Salzenberg, Chris 'BinGOs' Williams, Christian
Hansen, Christopher J. Madsen, chromatic, Claes Jacobsson, Claudio
Ramirez, Craig A. Berry, Damian Conway, Daniel Kahn Gillmor, Darin
McBride, Dave Rolsky, David Cantrell, David Golden, David Leadbeater,
David Mitchell, Dee Newcum, Dennis Kaarsemaker, Dominic Hargreaves,
Douglas Christopher Wilson, Eric Brine, Father Chrysostomos, Florian
Ragwitz, Frederic Briere, George Greer, Gerard Goossen, Gisle Aas,
H.Merijn Brand, Hojung Youn, Ian Goodacre, James E Keenan, Jan Dubois,
Jerry D. Hedden, Jesse Luehrs, Jesse Vincent, Jilles Tjoelker, Jim
Cromie, Jim Meyering, Joel Berger, Johan Vromans, Johannes Plunien, John
Hawkinson, John P. Linderman, John Peacock, Joshua ben Jore, Juerd
Waalboer, Karl Williamson, Karthik Rajagopalan, Keith Thompson, Kevin J.
Woolley, Kevin Ryde, Laurent Dami, Leo Lapworth, Leon Brocard, Leon
Timmermans, Louis Strous, Lukas Mai, Marc Green, Marcel Grünauer, Mark
A.  Stratman, Mark Dootson, Mark Jason Dominus, Martin Hasch, Matthew
Horsfall, Max Maischein, Michael G Schwern, Michael Witten, Mike
Sheldrake, Moritz Lenz, Nicholas Clark, Niko Tyni, Nuno Carvalho, Pau
Amma, Paul Evans, Paul Green, Paul Johnson, Perlover, Peter John Acklam,
Peter Martini, Peter Scott, Phil Monsen, Pino Toscano, Rafael
Garcia-Suarez, Rainer Tammer, Reini Urban, Ricardo Signes, Robin Barker,
Rodolfo Carvalho, Salvador Fandiño, Sam Kimbrel, Samuel Thibault, Shawn
M Moore, Shigeya Suzuki, Shirakata Kentaro, Shlomi Fish, Sisyphus,
Slaven Rezic, Spiros Denaxas, Steffen Müller, Steffen Schwigon, Stephen
Bennett, Stephen Oberholtzer, Stevan Little, Steve Hay, Steve Peters,
Thomas Sibley, Thorsten Glaser, Timothe Litt, Todd Rinaldo, Tom
Christiansen, Tom Hukins, Tony Cook, Vadim Konovalov, Vincent Pit,
Vladimir Timofeev, Walt Mankowski, Yves Orton, Zefram, Zsbán Ambrus,
Ævar Arnfjörð Bjarmason.

The list above is almost certainly incomplete as it is automatically
generated from version control history.  In particular, it does not
include the names of the (very much appreciated) contributors who
reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN
modules included in Perl's core.  We're grateful to the entire CPAN
community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors,
please see the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at L<http://rt.perl.org/perlbug/>.  There may also be
information at L<http://www.perl.org/>, the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please
send it to perl5-security-report@perl.org.  This points to a closed
subscription unarchived mailing list, which includes all core
committers, who will be able to help assess the impact of issues, figure
out a resolution, and help co-ordinate the release of patches to
mitigate or fix the problem across all platforms on which Perl is
supported.  Please use this address only for security issues in the Perl
core, not for modules independently distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
=encoding utf8

=head1 NAME

perlpodspec - Plain Old Documentation: format specification and notes

=head1 DESCRIPTION

This document contains detailed notes on the Pod markup language.  Most
people will only have to read L<perlpod|perlpod> to know how to write
in Pod, but this document may answer some incidental questions to do
with parsing and rendering Pod.

In this document, "must" / "must not", "should" /
"should not", and "may" have their conventional (cf. RFC 2119)
meanings: "X must do Y" means that if X doesn't do Y, it's against
this specification, and should really be fixed.  "X should do Y"
means that it's recommended, but X may fail to do Y, if there's a
good reason.  "X may do Y" is merely a note that X can do Y at
will (although it is up to the reader to detect any connotation of
"and I think it would be I<nice> if X did Y" versus "it wouldn't
really I<bother> me if X did Y").

Notably, when I say "the parser should do Y", the
parser may fail to do Y, if the calling application explicitly
requests that the parser I<not> do Y.  I often phrase this as
"the parser should, by default, do Y."  This doesn't I<require>
the parser to provide an option for turning off whatever
feature Y is (like expanding tabs in verbatim paragraphs), although
it implies that such an option I<may> be provided.

=head1 Pod Definitions

Pod is embedded in files, typically Perl source files, although you
can write a file that's nothing but Pod.

A B<line> in a file consists of zero or more non-newline characters,
terminated by either a newline or the end of the file.

A B<newline sequence> is usually a platform-dependent concept, but
Pod parsers should understand it to mean any of CR (ASCII 13), LF
(ASCII 10), or a CRLF (ASCII 13 followed immediately by ASCII 10), in
addition to any other system-specific meaning.  The first CR/CRLF/LF
sequence in the file may be used as the basis for identifying the
newline sequence for parsing the rest of the file.

A B<blank line> is a line consisting entirely of zero or more spaces
(ASCII 32) or tabs (ASCII 9), and terminated by a newline or end-of-file.
A B<non-blank line> is a line containing one or more characters other
than space or tab (and terminated by a newline or end-of-file).

(I<Note:> Many older Pod parsers did not accept a line consisting of
spaces/tabs and then a newline as a blank line. The only lines they
considered blank were lines consisting of I<no characters at all>,
terminated by a newline.)

B<Whitespace> is used in this document as a blanket term for spaces,
tabs, and newline sequences.  (By itself, this term usually refers
to literal whitespace.  That is, sequences of whitespace characters
in Pod source, as opposed to "EE<lt>32>", which is a formatting
code that I<denotes> a whitespace character.)

A B<Pod parser> is a module meant for parsing Pod (regardless of
whether this involves calling callbacks or building a parse tree or
directly formatting it).  A B<Pod formatter> (or B<Pod translator>)
is a module or program that converts Pod to some other format (HTML,
plaintext, TeX, PostScript, RTF).  A B<Pod processor> might be a
formatter or translator, or might be a program that does something
else with the Pod (like counting words, scanning for index points,
etc.).

Pod content is contained in B<Pod blocks>.  A Pod block starts with a
line that matches C<m/\A=[a-zA-Z]/>, and continues up to the next line
that matches C<m/\A=cut/> or up to the end of the file if there is
no C<m/\A=cut/> line.
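
The definition above can be sketched as a small scanner.  This is an
illustration only, not the code of any existing Pod parser.  The
function name C<pod_blocks> is made up for this example, LF newlines are
assumed, and a stray "=cut" outside a block is simply skipped:

    use strict;
    use warnings;

    # Return the Pod blocks found in a string of source text.  A
    # block runs from a line matching /\A=[a-zA-Z]/ up to and
    # including the next line matching /\A=cut/, or to end of input.
    sub pod_blocks {
        my ($source) = @_;
        my (@blocks, $current);
        for my $line (split /^/, $source) {    # keeps the newlines
            if (defined $current) {
                $current .= $line;
                if ($line =~ m/\A=cut/) {
                    push @blocks, $current;
                    undef $current;
                }
            }
            elsif ($line =~ m/\A=[a-zA-Z]/ && $line !~ m/\A=cut/) {
                $current = $line;              # a new block starts
            }
        }
        push @blocks, $current if defined $current;
        return @blocks;
    }

    my $source = "print 1;\n=head1 NAME\n\nfoo\n\n=cut\n"
               . "print 2;\n=pod\n\ntail\n";
    my @blocks = pod_blocks($source);
    print scalar(@blocks), " block(s) found\n";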

=for comment
 The current perlsyn says:
 [beginquote]
   Note that pod translators should look at only paragraphs beginning
   with a pod directive (it makes parsing easier), whereas the compiler
   actually knows to look for pod escapes even in the middle of a
   paragraph.  This means that the following secret stuff will be ignored
   by both the compiler and the translators.
      $a=3;
      =secret stuff
       warn "Neither POD nor CODE!?"
      =cut back
      print "got $a\n";
   You probably shouldn't rely upon the warn() being podded out forever.
   Not all pod translators are well-behaved in this regard, and perhaps
   the compiler will become pickier.
 [endquote]
 I think that those paragraphs should just be removed; paragraph-based
 parsing  seems to have been largely abandoned, because of the hassle
 with non-empty blank lines messing up what people meant by "paragraph".
 Even if the "it makes parsing easier" bit were especially true,
 it wouldn't be worth the confusion of having perl and pod2whatever
 actually disagree on what can constitute a Pod block.

Within a Pod block, there are B<Pod paragraphs>.  A Pod paragraph
consists of non-blank lines of text, separated by one or more blank
lines.

For purposes of Pod processing, there are four types of paragraphs in
a Pod block:

=over

=item *

A command paragraph (also called a "directive").  The first line of
this paragraph must match C<m/\A=[a-zA-Z]/>.  Command paragraphs are
typically one line, as in:

  =head1 NOTES

  =item *

But they may span several (non-blank) lines:

  =for comment
  Hm, I wonder what it would look like if
  you tried to write a BNF for Pod from this.

  =head3 Dr. Strangelove, or: How I Learned to
  Stop Worrying and Love the Bomb

I<Some> command paragraphs allow formatting codes in their content
(i.e., after the part that matches C<m/\A=[a-zA-Z]\S*\s*/>), as in:

  =head1 Did You Remember to C<use strict;>?

In other words, the Pod processing handler for "head1" will apply the
same processing to "Did You Remember to CE<lt>use strict;>?" that it
would to an ordinary paragraph (i.e., formatting codes like
"CE<lt>...>" are parsed and presumably formatted appropriately, and
whitespace in the form of literal spaces and/or tabs is not
significant).

=item *

A B<verbatim paragraph>.  The first line of this paragraph must be a
literal space or tab, and this paragraph must not be inside a "=begin
I<identifier>", ... "=end I<identifier>" sequence unless
"I<identifier>" begins with a colon (":").  That is, if a paragraph
starts with a literal space or tab, but I<is> inside a
"=begin I<identifier>", ... "=end I<identifier>" region, then it's
a data paragraph, unless "I<identifier>" begins with a colon.

Whitespace I<is> significant in verbatim paragraphs (although, in
processing, tabs are probably expanded).

=item *

An B<ordinary paragraph>.  A paragraph is an ordinary paragraph
if its first line matches neither C<m/\A=[a-zA-Z]/> nor
C<m/\A[ \t]/>, I<and> if it's not inside a "=begin I<identifier>",
... "=end I<identifier>" sequence unless "I<identifier>" begins with
a colon (":").

=item *

A B<data paragraph>.  This is a paragraph that I<is> inside a "=begin
I<identifier>" ... "=end I<identifier>" sequence where
"I<identifier>" does I<not> begin with a literal colon (":").  In
some sense, a data paragraph is not part of Pod at all (i.e.,
effectively it's "out-of-band"), since it's not subject to most kinds
of Pod parsing; but it is specified here, since Pod
parsers need to be able to call an event for it, or store it in some
form in a parse tree, or at least just parse I<around> it.

=back

For example, consider the following paragraphs:

  # <- that's the 0th column

  =head1 Foo

  Stuff

    $foo->bar

  =cut

Here, "=head1 Foo" and "=cut" are command paragraphs because the first
line of each matches C<m/\A=[a-zA-Z]/>.  "Stuff" is an ordinary
paragraph, because its first line matches neither C<m/\A=[a-zA-Z]/> nor
C<m/\A[ \t]/>.  "I<[space][space]>$foo->bar" is a verbatim paragraph,
because its first line starts with a literal whitespace character (and
there's no "=begin"..."=end" region around).

The "=begin I<identifier>" ... "=end I<identifier>" commands stop
paragraphs that they surround from being parsed as ordinary or verbatim
paragraphs, if I<identifier> doesn't begin with a colon.  This
is discussed in detail in the section
L</About Data Paragraphs and "=beginE<sol>=end" Regions>.
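
The four paragraph types can be told apart with a few pattern matches
and a stack of open "=begin" identifiers.  The following is a
simplified sketch under stated assumptions, not a conforming parser.
The function name C<classify_paragraphs> is made up for this example,
the input is the text of a single Pod block with LF newlines, and
errors such as a mismatched "=end" are not detected:

    use strict;
    use warnings;

    # Return a list of [type, text] pairs, where type is one of
    # 'command', 'verbatim', 'ordinary', or 'data'.
    sub classify_paragraphs {
        my ($pod) = @_;
        my @open;      # identifiers of currently open =begin regions
        my @result;
        # Paragraphs are separated by one or more blank lines.
        for my $para (split /\n(?:[ \t]*\n)+/, $pod) {
            next unless $para =~ /\S/;
            my $type;
            if ($para =~ m/\A=([a-zA-Z]\S*)(?:\s+(\S+))?/) {
                $type = 'command';
                my ($cmd, $arg) = ($1, $2);
                push @open, (defined $arg ? $arg : '')
                    if $cmd eq 'begin';
                pop @open if $cmd eq 'end';
            }
            elsif (@open && $open[-1] !~ m/\A:/) {
                $type = 'data';        # region without a colon
            }
            elsif ($para =~ m/\A[ \t]/) {
                $type = 'verbatim';
            }
            else {
                $type = 'ordinary';
            }
            push @result, [ $type, $para ];
        }
        return @result;
    }

    my $pod   = "=head1 Foo\n\nStuff\n\n  \$foo->bar\n\n=cut\n";
    my @types = map { $_->[0] } classify_paragraphs($pod);
    print "@types\n";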

=head1 Pod Commands

This section is intended to supplement and clarify the discussion in
L<perlpod/"Command Paragraph">.  These are the currently recognized
Pod commands:

=over

=item "=head1", "=head2", "=head3", "=head4"

This command indicates that the text in the remainder of the paragraph
is a heading.  That text may contain formatting codes.  Examples:

  =head1 Object Attributes

  =head3 What B<Not> to Do!

=item "=pod"

This command indicates that this paragraph begins a Pod block.  (If we
are already in the middle of a Pod block, this command has no effect at
all.)  If there is any text in this command paragraph after "=pod",
it must be ignored.  Examples:

  =pod

  This is a plain Pod paragraph.

  =pod This text is ignored.

=item "=cut"

This command indicates that this line is the end of this previously
started Pod block.  If there is any text after "=cut" on the line, it must be
ignored.  Examples:

  =cut

  =cut The documentation ends here.

  =cut
  # This is the first line of program text.
  sub foo { # This is the second.

It is an error to try to I<start> a Pod block with a "=cut" command.  In
that case, the Pod processor must halt parsing of the input file, and
must by default emit a warning.

=item "=over"

This command indicates that this is the start of a list/indent
region.  If there is any text following the "=over", it must consist
of only a nonzero positive numeral.  The semantics of this numeral are
explained in the L</"About =over...=back Regions"> section, further
below.  Formatting codes are not expanded.  Examples:

  =over 3

  =over 3.5

  =over

=item "=item"

This command indicates that an item in a list begins here.  Formatting
codes are processed.  The semantics of the (optional) text in the
remainder of this paragraph are
explained in the L</"About =over...=back Regions"> section, further
below.  Examples:

  =item

  =item *

  =item      *    

  =item 14

  =item   3.

  =item C<< $thing->stuff(I<dodad>) >>

  =item For transporting us beyond seas to be tried for pretended
  offenses

  =item He is at this time transporting large armies of foreign
  mercenaries to complete the works of death, desolation and
  tyranny, already begun with circumstances of cruelty and perfidy
  scarcely paralleled in the most barbarous ages, and totally
  unworthy the head of a civilized nation.

=item "=back"

This command indicates that this is the end of the region begun
by the most recent "=over" command.  It permits no text after the
"=back" command.

=item "=begin formatname"

=item "=begin formatname parameter"

This marks the following paragraphs (until the matching "=end
formatname") as being for some special kind of processing.  Unless
"formatname" begins with a colon, the contained non-command
paragraphs are data paragraphs.  But if "formatname" I<does> begin
with a colon, then non-command paragraphs are ordinary paragraphs
or verbatim paragraphs.  This is discussed in detail in the section
L</About Data Paragraphs and "=beginE<sol>=end" Regions>.

It is advised that formatnames match the regexp
C<m/\A:?[-a-zA-Z0-9_]+\z/>.  Everything following whitespace after the
formatname is a parameter that may be used by the formatter when dealing
with this region.  This parameter must not be repeated in the "=end"
paragraph.  Implementors should anticipate future expansion in the
semantics and syntax of the first parameter to "=begin"/"=end"/"=for".
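
As a sketch of how a processor might split a "=begin" line and test the
formatname against the advised pattern (the function name
C<parse_begin> and the sample formatname and parameter are hypothetical,
and the line is assumed to hold the whole command paragraph):

    use strict;
    use warnings;

    # Split a "=begin" line into formatname and optional parameter,
    # and report whether the name matches the advised pattern.
    # Returns an empty list if the line is not a usable "=begin".
    sub parse_begin {
        my ($line) = @_;
        my ($name, $param) =
            $line =~ m/\A=begin\s+(\S+)(?:\s+(.*\S))?\s*\z/s
            or return;
        my $advised = $name =~ m/\A:?[-a-zA-Z0-9_]+\z/ ? 1 : 0;
        return ($name, $param, $advised);
    }

    my ($name, $param, $advised) =
        parse_begin('=begin :biblio style=chicago');
    print "name=$name param=$param advised=$advised\n";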

=item "=end formatname"

This marks the end of the region opened by the matching
"=begin formatname" command.  If "formatname" is not the formatname
of the most recent open "=begin formatname" region, then this
is an error, and must generate an error message.  This
is discussed in detail in the section
L</About Data Paragraphs and "=beginE<sol>=end" Regions>.

=item "=for formatname text..."

This is synonymous with:

     =begin formatname

     text...

     =end formatname

That is, it creates a region consisting of a single paragraph; that
paragraph is to be treated as an ordinary paragraph if "formatname"
begins with a ":"; if "formatname" I<doesn't> begin with a colon,
then "text..." will constitute a data paragraph.  There is no way
to use "=for formatname text..." to express "text..." as a verbatim
paragraph.

=item "=encoding encodingname"

This command, which should occur early in the document (at least
before any non-US-ASCII data!), declares that this document is
encoded in the encoding I<encodingname>, which must be
an encoding name that L<Encode> recognizes.  (Encode's list
of supported encodings, in L<Encode::Supported>, is useful here.)
If the Pod parser cannot decode the declared encoding, it 
should emit a warning and may abort parsing the document
altogether.

A document having more than one "=encoding" line should be
considered an error.  Pod processors may silently tolerate this if
the not-first "=encoding" lines are just duplicates of the
first one (e.g., if there's a "=encoding utf8" line, and later on
another "=encoding utf8" line).  But Pod processors should complain if
there are contradictory "=encoding" lines in the same document
(e.g., if there is a "=encoding utf8" early in the document and
"=encoding big5" later).  Pod processors that recognize BOMs
may also complain if they see an "=encoding" line
that contradicts the BOM (e.g., if a document with a UTF-16LE
BOM has an "=encoding shiftjis" line).

=back

If a Pod processor sees any command other than the ones listed
above (like "=head", or "=haed1", or "=stuff", or "=cuttlefish",
or "=w123"), that processor must by default treat this as an
error.  It must not process the paragraph beginning with that
command, must by default warn of this as an error, and may
abort the parse.  A Pod parser may allow a way for particular
applications to add to the above list of known commands, and to
stipulate, for each additional command, whether formatting
codes should be processed.

Future versions of this specification may add additional
commands.



=head1 Pod Formatting Codes

(Note that in previous drafts of this document and of perlpod,
formatting codes were referred to as "interior sequences", and
this term may still be found in the documentation for Pod parsers,
and in error messages from Pod processors.)

There are two syntaxes for formatting codes:

=over

=item *

A formatting code starts with a capital letter (just US-ASCII [A-Z])
followed by a "<", any number of characters, and ending with the first
matching ">".  Examples:

    That's what I<you> think!

    What's C<dump()> for?

    X<C<chmod> and C<unlink()> Under Different Operating Systems>

=item *

A formatting code starts with a capital letter (just US-ASCII [A-Z])
followed by two or more "<"'s, one or more whitespace characters,
any number of characters, one or more whitespace characters,
and ending with the first matching sequence of two or more ">"'s, where
the number of ">"'s equals the number of "<"'s in the opening of this
formatting code.  Examples:

    That's what I<< you >> think!

    C<<< open(X, ">>thing.dat") || die $! >>>

    B<< $foo->bar(); >>

With this syntax, the whitespace character(s) after the "CE<lt><<"
(or whatever letter) and before the ">>>" are I<not> renderable.  They
do not signify whitespace, but are merely part of the formatting codes
themselves.  That is, these are all synonymous:

    C<thing>
    C<< thing >>
    C<<           thing     >>
    C<<<   thing >>>
    C<<<<
    thing
               >>>>

and so on.

Finally, the multiple-angle-bracket form does I<not> alter the interpretation
of nested formatting codes, meaning that the following four example lines are
identical in meaning:

  B<example: C<$a E<lt>=E<gt> $b>>

  B<example: C<< $a <=> $b >>>

  B<example: C<< $a E<lt>=E<gt> $b >>>

  B<<< example: C<< $a E<lt>=E<gt> $b >> >>>

=back

In parsing Pod, a notably tricky part is the correct parsing of
(potentially nested!) formatting codes.  Implementors should
consult the code in the C<parse_text> routine in Pod::Parser as an
example of a correct implementation.
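
As a much smaller illustration than a full parser, the following sketch
recognizes a single, I<non-nested> formatting code in either syntax and
returns its letter and content.  It is not the Pod::Parser
implementation.  The function name C<parse_code> is made up for this
example, and nested codes are deliberately out of scope:

    use strict;
    use warnings;

    # Parse one non-nested formatting code occupying the whole
    # string.  Returns (letter, content), or an empty list if the
    # string is not a formatting code in either syntax.
    sub parse_code {
        my ($text) = @_;
        # Multiple-bracket form: the closing run of ">" must be as
        # long as the opening run of "<"; the whitespace next to
        # the brackets is part of the delimiters.
        if ($text =~ m/\A([A-Z])(<{2,})\s+(.*?)\s+(>{2,})\z/s
            && length($2) == length($4)) {
            return ($1, $3);
        }
        # Simple form: closed by the first ">".
        if ($text =~ m/\A([A-Z])<([^>]*)>\z/) {
            return ($1, $2);
        }
        return;
    }

    for my $src ('C<thing>', 'C<< thing >>',
                 "C<<<<\nthing\n       >>>>") {
        my ($letter, $content) = parse_code($src);
        print "$letter: [$content]\n";
    }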

=over

=item C<IE<lt>textE<gt>> -- italic text

See the brief discussion in L<perlpod/"Formatting Codes">.

=item C<BE<lt>textE<gt>> -- bold text

See the brief discussion in L<perlpod/"Formatting Codes">.

=item C<CE<lt>codeE<gt>> -- code text

See the brief discussion in L<perlpod/"Formatting Codes">.

=item C<FE<lt>filenameE<gt>> -- style for filenames

See the brief discussion in L<perlpod/"Formatting Codes">.

=item C<XE<lt>topic nameE<gt>> -- an index entry

See the brief discussion in L<perlpod/"Formatting Codes">.

This code is unusual in that most formatters completely discard
this code and its content.  Other formatters will render it with
invisible codes that can be used in building an index of
the current document.

=item C<ZE<lt>E<gt>> -- a null (zero-effect) formatting code

Discussed briefly in L<perlpod/"Formatting Codes">.

This code is unusual in that it should have no content.  That is,
a processor may complain if it sees C<ZE<lt>potatoesE<gt>>.  Whether
or not it complains, the I<potatoes> text should be ignored.

=item C<LE<lt>nameE<gt>> -- a hyperlink

The complicated syntaxes of this code are discussed at length in
L<perlpod/"Formatting Codes">, and implementation details are
discussed below, in L</"About LE<lt>...E<gt> Codes">.  Parsing the
contents of LE<lt>content> is tricky.  Notably, the content has to be
checked for whether it looks like a URL, or whether it has to be split
on literal "|" and/or "/" (in the right order!), and so on,
I<before> EE<lt>...> codes are resolved.

=item C<EE<lt>escapeE<gt>> -- a character escape

See L<perlpod/"Formatting Codes">, and several points in
L</Notes on Implementing Pod Processors>.

=item C<SE<lt>textE<gt>> -- text contains non-breaking spaces

This formatting code is syntactically simple, but semantically
complex.  What it means is that each space in the printable
content of this code signifies a non-breaking space.

Consider:

    C<$x ? $y    :  $z>

    S<C<$x ? $y     :  $z>>

Both signify the monospace (c[ode] style) text consisting of
"$x", one space, "?", one space, "$y", one space, ":", one space, "$z".  The
difference is that in the latter, with the S code, those spaces
are not "normal" spaces, but instead are non-breaking spaces.

=back


If a Pod processor sees any formatting code other than the ones
listed above (as in "NE<lt>...>", or "QE<lt>...>", etc.), that
processor must by default treat this as an error.
A Pod parser may allow a way for particular
applications to add to the above list of known formatting codes;
a Pod parser might even allow a way to stipulate, for each additional
command, whether it requires some form of special processing, as
LE<lt>...> does.

Future versions of this specification may add additional
formatting codes.

Historical note:  A few older Pod processors would not see a ">" as
closing a "CE<lt>" code, if the ">" was immediately preceded by
a "-".  This was so that this:

    C<$foo->bar>

would parse as equivalent to this:

    C<$foo-E<gt>bar>

instead of as equivalent to a "C" formatting code containing 
only "$foo-", and then a "bar>" outside the "C" formatting code.  This
problem has since been solved by the addition of syntaxes like this:

    C<< $foo->bar >>

Compliant parsers must not treat "->" as special.

Formatting codes absolutely cannot span paragraphs.  If a code is
opened in one paragraph, and no closing code is found by the end of
that paragraph, the Pod parser must close that formatting code,
and should complain (as in "Unterminated I code in the paragraph
starting at line 123: 'Time objects are not...'").  So these
two paragraphs:

  I<I told you not to do this!

  Don't make me say it again!>

...must I<not> be parsed as two paragraphs in italics (with the I
code starting in one paragraph and ending in another).  Instead,
the first paragraph should generate a warning, but that aside, the
above code must parse as if it were:

  I<I told you not to do this!>

  Don't make me say it again!E<gt>

(In SGMLish jargon, all Pod commands are like block-level
elements, whereas all Pod formatting codes are like inline-level
elements.)



=head1 Notes on Implementing Pod Processors

The following is a long section of miscellaneous requirements
and suggestions to do with Pod processing.

=over

=item *

Pod formatters should tolerate lines in verbatim blocks that are of
any length, even if that means having to break them (possibly several
times, for very long lines) to avoid text running off the side of the
page.  Pod formatters may warn of such line-breaking.  Such warnings
are particularly appropriate for lines that are over 100 characters long, which
are usually not intentional.

=item *

Pod parsers must recognize I<all> of the three well-known newline
formats: CR, LF, and CRLF.  See L<perlport|perlport>.
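
As a minimal sketch of the requirement above, a parser might normalize
all three newline conventions before it splits the input into lines.
The name C<normalize_newlines> is hypothetical, and the byte values
assume an ASCII platform:

```perl
use strict;
use warnings;

# Hypothetical helper: turn CR, LF, and CRLF line endings into "\n".
# CRLF is listed first so that it is not counted as two line endings.
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\x0D\x0A|\x0D|\x0A/\n/g;
    return $text;
}
```

Normalizing once, up front, means later stages (such as the detection
of blank lines) only ever need to look for "\n".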

=item *

Pod parsers should accept input lines that are of any length.

=item *

Since Perl recognizes a Unicode Byte Order Mark at the start of files
as signaling that the file is Unicode encoded as in UTF-16 (whether
big-endian or little-endian) or UTF-8, Pod parsers should do the
same.  Otherwise, the character encoding should be understood as
being UTF-8 if the first highbit byte sequence in the file seems
valid as a UTF-8 sequence, or otherwise as CP-1252 (earlier versions of
this specification used Latin-1 instead of CP-1252).

Future versions of this specification may specify
how Pod can accept other encodings.  Presumably treatment of other
encodings in Pod parsing would be as in XML parsing: whatever the
encoding declared by a particular Pod file, content is to be
stored in memory as Unicode characters.

=item *

The well known Unicode Byte Order Marks are as follows:  if the
file begins with the two literal byte values 0xFE 0xFF, this is
the BOM for big-endian UTF-16.  If the file begins with the two
literal byte values 0xFF 0xFE, this is the BOM for little-endian
UTF-16.  On an ASCII platform, if the file begins with the three literal
byte values
0xEF 0xBB 0xBF, this is the BOM for UTF-8.
A mechanism portable to EBCDIC platforms is to:

  my $utf8_bom = "\x{FEFF}";
  utf8::encode($utf8_bom);

=for comment
 use bytes; print map sprintf(" 0x%02X", ord $_), split '', "\x{feff}";
 0xEF 0xBB 0xBF

=for comment
 If toke.c is modified to support UTF-32, add mention of those here.
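
The byte patterns listed above could be checked along these lines.
This is only a sketch for ASCII platforms; the C<bom_encoding> name
and the labels it returns are assumptions made for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: given the first raw bytes of a file, return a
# label for the encoding signaled by a leading BOM, or undef if none.
sub bom_encoding {
    my ($bytes) = @_;
    return 'UTF-8'    if $bytes =~ /\A\xEF\xBB\xBF/;
    return 'UTF-16BE' if $bytes =~ /\A\xFE\xFF/;
    return 'UTF-16LE' if $bytes =~ /\A\xFF\xFE/;
    return undef;
}
```

The caller is expected to read the file in binary mode, so that the
bytes are examined before any decoding layer has touched them.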

=item *

A naive, but often sufficient heuristic on ASCII platforms, for testing
the first highbit
byte-sequence in a BOM-less file (whether in code or in Pod!), to see
whether that sequence is valid as UTF-8 (RFC 2279) is to check whether
the first byte in the sequence is in the range 0xC2 - 0xFD
I<and> whether the next byte is in the range
0x80 - 0xBF.  If so, the parser may conclude that this file is in
UTF-8, and all highbit sequences in the file should be assumed to
be UTF-8.  Otherwise the parser should treat the file as being
in CP-1252.  (A better check, which also works on EBCDIC platforms, is
to pass a copy of the sequence to
L<utf8::decode()|utf8> which performs a full validity check on the
sequence and returns TRUE if it is valid UTF-8, FALSE otherwise.  This
function is always pre-loaded, is fast because it is written in C, and
will only get called at most once, so you don't need to avoid it out of
performance concerns.)
In the unlikely circumstance that the first highbit
sequence in a truly non-UTF-8 file happens to appear to be UTF-8, one
can cater to our heuristic (as well as any more intelligent heuristic)
by prefacing that line with a comment line containing a highbit
sequence that is clearly I<not> valid as UTF-8.  A line consisting
of simply "#", an e-acute, and any non-highbit byte,
is sufficient to establish this file's encoding.
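
The better check described above might be sketched as follows.  The
C<guess_encoding> name is hypothetical, and returning "US-ASCII" for
input with no highbit bytes is an assumption of this sketch, not
something this specification requires:

```perl
use strict;
use warnings;

# Hypothetical helper: guess the encoding of BOM-less raw bytes by
# testing only the first run of highbit bytes, as described above.
sub guess_encoding {
    my ($bytes) = @_;

    # No highbit bytes at all: nothing to decide.
    return 'US-ASCII' unless $bytes =~ /([\x80-\xFF]+)/;

    # utf8::decode() works in place, so hand it a copy.
    my $candidate = $1;
    return utf8::decode($candidate) ? 'UTF-8' : 'CP-1252';
}
```

For example, the bytes 0xC3 0xA9 (e-acute in UTF-8) yield "UTF-8",
while a lone 0xE9 (e-acute in CP-1252) followed by a non-highbit byte
yields "CP-1252".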

=for comment
 If/WHEN some brave soul makes these heuristics into a generic
 text-file class (or PerlIO layer?), we can presumably delete
 mention of these icky details from this file, and can instead
 tell people to just use appropriate class/layer.
 Auto-recognition of newline sequences would be another desirable
 feature of such a class/layer.
 HINT HINT HINT.

=for comment
 "The probability that a string of characters
 in any other encoding appears as valid UTF-8 is low" - RFC2279

=item *

Pod processors must treat a "=for [label] [content...]" paragraph as
meaning the same thing as a "=begin [label]" paragraph, content, and
an "=end [label]" paragraph.  (The parser may conflate these two
constructs, or may leave them distinct, in the expectation that the
formatter will nevertheless treat them the same.)

=item *

When rendering Pod to a format that allows comments (i.e., to nearly
any format other than plaintext), a Pod formatter must insert comment
text identifying its name and version number, and the name and
version numbers of any modules it might be using to process the Pod.
Minimal examples:

 %% POD::Pod2PS v3.14159, using POD::Parser v1.92

 <!-- Pod::HTML v3.14159, using POD::Parser v1.92 -->

 {\doccomm generated by Pod::Tree::RTF 3.14159 using Pod::Tree 1.08}

 .\" Pod::Man version 3.14159, using POD::Parser version 1.92

Formatters may also insert additional comments, including: the
release date of the Pod formatter program, the contact address for
the author(s) of the formatter, the current time, the name of input
file, the formatting options in effect, version of Perl used, etc.

Formatters may also choose to note errors/warnings as comments,
besides or instead of emitting them otherwise (as in messages to
STDERR, or C<die>ing).

=item *

Pod parsers I<may> emit warnings or error messages ("Unknown E code
EE<lt>zslig>!") to STDERR (whether through printing to STDERR, or
C<warn>ing/C<carp>ing, or C<die>ing/C<croak>ing), but I<must> allow
suppressing all such STDERR output, and instead allow an option for
reporting errors/warnings
in some other way, whether by triggering a callback, or noting errors
in some attribute of the document object, or some similarly unobtrusive
mechanism -- or even by appending a "Pod Errors" section to the end of
the parsed form of the document.

=item *

In cases of exceptionally aberrant documents, Pod parsers may abort the
parse.  Even then, using C<die>ing/C<croak>ing is to be avoided; where
possible, the parser library may simply close the input file
and add text like "*** Formatting Aborted ***" to the end of the
(partial) in-memory document.

=item *

In paragraphs where formatting codes (like EE<lt>...>, BE<lt>...>)
are understood (i.e., I<not> verbatim paragraphs, but I<including>
ordinary paragraphs, and command paragraphs that produce renderable
text, like "=head1"), literal whitespace should generally be considered
"insignificant", in that one literal space has the same meaning as any
(nonzero) number of literal spaces, literal newlines, and literal tabs
(as long as this produces no blank lines, since those would terminate
the paragraph).  Pod parsers should compact literal whitespace in each
processed paragraph, but may provide an option for overriding this
(since some processing tasks do not require it), or may follow
additional special rules (for example, specially treating
period-space-space or period-newline sequences).
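
A minimal sketch of such compaction, for a paragraph that has already
been isolated from its neighbors (C<compact_whitespace> is a
hypothetical name):

```perl
use strict;
use warnings;

# Hypothetical helper: collapse each run of literal spaces, tabs, and
# newline characters into a single space.  Only these literal
# characters are touched, so a non-breaking space is left alone.
sub compact_whitespace {
    my ($text) = @_;
    $text =~ tr/ \t\x0D\x0A/ /s;
    return $text;
}
```

Using C<tr> with an explicit character list, instead of matching
C<\s+>, is a deliberate choice in this sketch: it avoids any question
of whether other whitespace-like characters are compacted too.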

=item *

Pod parsers should not, by default, try to coerce apostrophe (') and
quote (") into smart quotes (little 9's, 66's, 99's, etc), nor try to
turn backtick (`) into anything else but a single backtick character
(distinct from an open quote character!), nor "--" into anything but
two minus signs.  They I<must never> do any of those things to text
in CE<lt>...> formatting codes, and never I<ever> to text in verbatim
paragraphs.

=item *

When rendering Pod to a format that has two kinds of hyphens (-), one
that's a non-breaking hyphen, and another that's a breakable hyphen
(as in "object-oriented", which can be split across lines as
"object-", newline, "oriented"), formatters are encouraged to
generally translate "-" to non-breaking hyphen, but may apply
heuristics to convert some of these to breaking hyphens.

=item *

Pod formatters should make reasonable efforts to keep words of Perl
code from being broken across lines.  For example, "Foo::Bar" in some
formatting systems is seen as eligible for being broken across lines
as "Foo::" newline "Bar" or even "Foo::-" newline "Bar".  This should
be avoided where possible, either by disabling all line-breaking in
mid-word, or by wrapping particular words with internal punctuation
in "don't break this across lines" codes (which in some formats may
not be a single code, but might be a matter of inserting non-breaking
zero-width spaces between every pair of characters in a word.)

=item *

Pod parsers should, by default, expand tabs in verbatim paragraphs as
they are processed, before passing them to the formatter or other
processor.  Parsers may also allow an option for overriding this.
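
One way to do this, sketched here with the core L<Text::Tabs> module
and an assumed tab width of eight columns:

```perl
use strict;
use warnings;
use Text::Tabs;   # core module; exports expand() and $tabstop

$tabstop = 8;     # the module default, stated here explicitly

# expand() pads each tab out to the next multiple of $tabstop.
my ($indented) = expand("\tuse Foo;");   # eight spaces, then the code
my ($aligned)  = expand("ab\tcd");       # "ab", six spaces, then "cd"
```

An option for overriding the expansion could simply skip the call, or
could set C<$tabstop> to a value chosen by the application.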

=item *

Pod parsers should, by default, remove newlines from the end of
ordinary and verbatim paragraphs before passing them to the
formatter.  For example, while the paragraph you're reading now
could be considered, in Pod source, to end with (and contain)
the newline(s) that end it, it should be processed as ending with
(and containing) the period character that ends this sentence.

=item *

Pod parsers, when reporting errors, should make some effort to report
an approximate line number ("Nested EE<lt>>'s in Paragraph #52, near
line 633 of Thing/Foo.pm!"), instead of merely noting the paragraph
number ("Nested EE<lt>>'s in Paragraph #52 of Thing/Foo.pm!").  Where
this is problematic, the paragraph number should at least be
accompanied by an excerpt from the paragraph ("Nested EE<lt>>'s in
Paragraph #52 of Thing/Foo.pm, which begins 'Read/write accessor for
the CE<lt>interest rate> attribute...'").

=item *

Pod parsers, when processing a series of verbatim paragraphs one
after another, should consider them to be one large verbatim
paragraph that happens to contain blank lines.  I.e., these two
lines, which have a blank line between them:

	use Foo;

	print Foo->VERSION

should be unified into one paragraph ("\tuse Foo;\n\n\tprint
Foo->VERSION") before being passed to the formatter or other
processor.  Parsers may also allow an option for overriding this.

While this might be too cumbersome to implement in event-based Pod
parsers, it is straightforward for parsers that return parse trees.
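
For a parser that holds paragraphs in memory, the unification could be
sketched like this.  The paragraph representation (a hash with C<type>
and C<text> keys) and the C<merge_verbatim> name are assumptions made
for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: merge each run of adjacent verbatim paragraphs
# into one paragraph, joining their texts with a blank line.
sub merge_verbatim {
    my @out;
    for my $para (@_) {
        if (   @out
            && $para->{type} eq 'verbatim'
            && $out[-1]{type} eq 'verbatim')
        {
            $out[-1]{text} .= "\n\n" . $para->{text};
        }
        else {
            push @out, +{ %$para };   # copy, so the input is not modified
        }
    }
    return @out;
}
```

Given the two verbatim paragraphs shown above, this returns a single
paragraph whose text is "\tuse Foo;\n\n\tprint Foo->VERSION".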

=item *

Pod formatters, where feasible, are advised to avoid splitting short
verbatim paragraphs (under twelve lines, say) across pages.

=item *

Pod parsers must treat a line with only spaces and/or tabs on it as a
"blank line" such as separates paragraphs.  (Some older parsers
recognized only two adjacent newlines as a "blank line" but would not
recognize a newline, a space, and a newline, as a blank line.  This
is noncompliant behavior.)
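
A compliant paragraph splitter might look like the following sketch,
which assumes that newlines have already been normalized to "\n"
(C<split_paragraphs> is a hypothetical name):

```perl
use strict;
use warnings;

# Hypothetical helper: split text into paragraphs on one or more blank
# lines, where a blank line may contain spaces and/or tabs.  The
# leading indentation of the following paragraph is preserved, since
# that is what marks a verbatim paragraph.
sub split_paragraphs {
    my ($text) = @_;
    return split /\n(?:[ \t]*\n)+/, $text;
}
```

With this pattern, a newline, a space, and a newline separate two
paragraphs just as two adjacent newlines do.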

=item *

Authors of Pod formatters/processors should make every effort to
avoid writing their own Pod parser.  There are already several in
CPAN, with a wide range of interface styles -- and one of them,
Pod::Simple, comes with modern versions of Perl.

=item *

Characters in Pod documents may be conveyed either as literals, or by
number in EE<lt>n> codes, or by an equivalent mnemonic, as in
EE<lt>eacute> which is exactly equivalent to EE<lt>233>.  The numbers
are the Latin1/Unicode values, even on EBCDIC platforms.

When referring to characters by using a EE<lt>n> numeric code, numbers
in the range 32-126 refer to those well known US-ASCII characters (also
defined there by Unicode, with the same meaning), which all Pod
formatters must render faithfully.  Characters whose EE<lt>E<gt> numbers
are in the ranges 0-31 and 127-159 should not be used (neither as
literals,
nor as EE<lt>number> codes), except for the literal byte-sequences for
newline (ASCII 13, ASCII 13 10, or ASCII 10), and tab (ASCII 9).

Numbers in the range 160-255 refer to Latin-1 characters (also
defined there by Unicode, with the same meaning).  Numbers above
255 should be understood to refer to Unicode characters.

=item *

Be warned
that some formatters cannot reliably render characters outside 32-126;
and many are able to handle 32-126 and 160-255, but nothing above
255.

=item *

Besides the well-known "EE<lt>lt>" and "EE<lt>gt>" codes for
less-than and greater-than, Pod parsers must understand "EE<lt>sol>"
for "/" (solidus, slash), and "EE<lt>verbar>" for "|" (vertical bar,
pipe).  Pod parsers should also understand "EE<lt>lchevron>" and
"EE<lt>rchevron>" as legacy codes for characters 171 and 187, i.e.,
"left-pointing double angle quotation mark" = "left pointing
guillemet" and "right-pointing double angle quotation mark" = "right
pointing guillemet".  (These look like little "<<" and ">>", and they
are now preferably expressed with the HTML/XHTML codes "EE<lt>laquo>"
and "EE<lt>raquo>".)

=item *

Pod parsers should understand all "EE<lt>html>" codes as defined
in the entity declarations in the most recent XHTML specification at
C<www.W3.org>.  Pod parsers must understand at least the entities
that define characters in the range 160-255 (Latin-1).  Pod parsers,
when faced with some unknown "EE<lt>I<identifier>>" code,
shouldn't simply replace it with nullstring (by default, at least),
but may pass it through as a string consisting of the literal characters
E, less-than, I<identifier>, greater-than.  Or Pod parsers may offer the
alternative option of processing such unknown
"EE<lt>I<identifier>>" codes by firing an event especially
for such codes, or by adding a special node-type to the in-memory
document tree.  Such "EE<lt>I<identifier>>" may have special meaning
to some processors, or some processors may choose to add them to
a special error report.

=item *

Pod parsers must also support the XHTML codes "EE<lt>quot>" for
character 34 (doublequote, "), "EE<lt>amp>" for character 38
(ampersand, &), and "EE<lt>apos>" for character 39 (apostrophe, ').

=item *

Note that in all cases of "EE<lt>whateverE<gt>", I<whatever> (whether
an htmlname, or a number in any base) must consist only of
alphanumeric characters -- that is, I<whatever> must match
C<m/\A\w+\z/>.  So S<"EE<lt> 0 1 2 3 E<gt>"> is invalid, because
it contains spaces, which aren't alphanumeric characters.  This
presumably does not I<need> special treatment by a Pod processor;
S<" 0 1 2 3 "> doesn't look like a number in any base, so it would
presumably be looked up in the table of HTML-like names.  Since
there isn't (and cannot be) an HTML-like entity called S<" 0 1 2 3 ">,
this will be treated as an error.  However, Pod processors may
treat S<"EE<lt> 0 1 2 3 E<gt>"> or "EE<lt>e-acute>" as I<syntactically>
invalid, potentially earning a different error message than the
error message (or warning, or event) generated by a merely unknown
(but theoretically valid) htmlname, as in "EE<lt>qacute>"
[sic].  However, Pod parsers are not required to make this
distinction.
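
The optional distinction between syntactically invalid and merely
unknown escapes could be sketched as follows, using the C<e2char>
function from L<Pod::Escapes>.  The C<resolve_escape> name and the
status strings are hypothetical:

```perl
use strict;
use warnings;
use Pod::Escapes qw(e2char);

# Hypothetical helper: classify the content of an E<...> code.
# Returns a status string and the resolved character (or undef).
sub resolve_escape {
    my ($content) = @_;

    # Anything that is not purely alphanumeric is a syntax error.
    return ('syntax error', undef) unless $content =~ m/\A\w+\z/;

    # e2char() handles names and numbers, and returns undef for
    # names that it does not know.
    my $char = e2char($content);
    return defined $char ? ('ok', $char) : ('unknown', undef);
}
```

Under this sketch "eacute" and "233" both resolve to the same
character, "qacute" is reported as unknown, and "e-acute" is reported
as a syntax error.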

=item *

Note that EE<lt>number> I<must not> be interpreted as simply
"codepoint I<number> in the current/native character set".  It always
means only "the character represented by codepoint I<number> in
Unicode."  (This is identical to the semantics of &#I<number>; in XML.)

This will likely require many formatters to have tables mapping from
treatable Unicode codepoints (such as the "\xE9" for the e-acute
character) to the escape sequences or codes necessary for conveying
such sequences in the target output format.  A converter to *roff
would, for example, know that "\xE9" (whether conveyed literally, or via
a EE<lt>...> sequence) is to be conveyed as "e\\*'".
Similarly, a program rendering Pod in a Mac OS application window would
presumably need to know that "\xE9" maps to codepoint 142 in MacRoman
encoding that (at time of writing) is native for Mac OS.  Such
Unicode2whatever mappings are presumably already widely available for
common output formats.  (Such mappings may be incomplete!  Implementers
are not expected to bend over backwards in an attempt to render
Cherokee syllabics, Etruscan runes, Byzantine musical symbols, or any
of the other weird things that Unicode can encode.)  And
if a Pod document uses a character not found in such a mapping, the
formatter should consider it an unrenderable character.

=item *

If, surprisingly, the implementor of a Pod formatter can't find a
satisfactory pre-existing table mapping from Unicode characters to
escapes in the target format (e.g., a decent table of Unicode
characters to *roff escapes), it will be necessary to build such a
table.  If you are in this circumstance, you should begin with the
characters in the range 0x00A0 - 0x00FF, which is mostly the heavily
used accented characters.  Then proceed (as patience permits and
fastidiousness compels) through the characters that the (X)HTML
standards groups judged important enough to merit mnemonics
for.  These are declared in the (X)HTML specifications at the
www.W3.org site.  At time of writing (September 2001), the most recent
entity declaration files are:

  http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent
  http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent
  http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent

Then you can progress through any remaining notable Unicode characters
in the range 0x2000-0x204D (consult the character tables at
www.unicode.org), and whatever else strikes your fancy.  For example,
in F<xhtml-symbol.ent>, there is the entry:

  <!ENTITY infin    "&#8734;"> <!-- infinity, U+221E ISOtech -->

While the mapping "infin" to the character "\x{221E}" will (hopefully)
have been already handled by the Pod parser, the presence of the
character in this file means that it's important enough to
include in a formatter's table that maps from notable Unicode characters
to the codes necessary for rendering them.  So for a Unicode-to-*roff
mapping, for example, this would merit the entry:

  "\x{221E}" => '\(in',

It is eagerly hoped that in the future, increasing numbers of formats
(and formatters) will support Unicode characters directly (as (X)HTML
does with C<&infin;>, C<&#8734;>, or C<&#x221E;>), reducing the need
for idiosyncratic mappings of Unicode-to-I<my_escapes>.
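
A formatter's table of this kind might be applied as in the sketch
below.  The two entries are the ones mentioned in this document; the
C<to_roff> name and the choice of "?" for unmapped characters are
assumptions, and a real converter would also have to escape characters
that are special to *roff itself:

```perl
use strict;
use warnings;

# A deliberately tiny Unicode-to-*roff table.
my %roff_escape = (
    "\x{E9}"   => "e\\*'",   # e-acute
    "\x{221E}" => '\(in',    # infinity
);

# Hypothetical helper: replace each non-ASCII character by its *roff
# escape, or by "?" if the table has no entry for it.
sub to_roff {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/exists $roff_escape{$1} ? $roff_escape{$1} : '?'/ge;
    return $text;
}
```

Keeping the table as plain data makes it easy to extend one character
at a time, in the order of priority suggested above.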

=item *

It is up to individual Pod formatters to display good judgement when
confronted with an unrenderable character (which is distinct from an
unknown EE<lt>thing> sequence that the parser couldn't resolve to
anything, renderable or not).  It is good practice to map Latin letters
with diacritics (like "EE<lt>eacute>"/"EE<lt>233>") to the corresponding
unaccented US-ASCII letters (like a simple character 101, "e"), but
clearly this is often not feasible, and an unrenderable character may
be represented as "?", or the like.  In attempting a sane fallback
(as from EE<lt>233> to "e"), Pod formatters may use the
%Latin1Code_to_fallback table in L<Pod::Escapes|Pod::Escapes>, or
L<Text::Unidecode|Text::Unidecode>, if available.

For example, this Pod text:

  magic is enabled if you set C<$Currency> to 'E<euro>'.

may be rendered as:
"magic is enabled if you set C<$Currency> to 'I<?>'" or as
"magic is enabled if you set C<$Currency> to 'B<[euro]>'", or as
"magic is enabled if you set C<$Currency> to '[x20AC]'", etc.

A Pod formatter may also note, in a comment or warning, a list of what
unrenderable characters were encountered.

=item *

EE<lt>...> may freely appear in any formatting code (other than
in another EE<lt>...> or in a ZE<lt>>).  That is, "XE<lt>The
EE<lt>euro>1,000,000 Solution>" is valid, as is "LE<lt>The
EE<lt>euro>1,000,000 Solution|Million::Euros>".

=item *

Some Pod formatters output to formats that implement non-breaking
spaces as an individual character (which I'll call "NBSP"), and
others output to formats that implement non-breaking spaces just as
spaces wrapped in a "don't break this across lines" code.  Note that
at the level of Pod, both sorts of codes can occur: Pod can contain a
NBSP character (whether as a literal, or as a "EE<lt>160>" or
"EE<lt>nbsp>" code); and Pod can contain "SE<lt>foo
IE<lt>barE<gt> baz>" codes, where "mere spaces" (character 32) in
such codes are taken to represent non-breaking spaces.  Pod
parsers should consider supporting the optional parsing of "SE<lt>foo
IE<lt>barE<gt> baz>" as if it were
"fooI<NBSP>IE<lt>barE<gt>I<NBSP>baz", and, going the other way, the
optional parsing of groups of words joined by NBSP's as if each group
were in a SE<lt>...> code, so that formatters may use the
representation that maps best to what the output format demands.

=item *

Some processors may find that the C<SE<lt>...E<gt>> code is easiest to
implement by replacing each space in the parse tree under the content
of the S, with an NBSP.  But note: the replacement should apply I<not> to
spaces in I<all> text, but I<only> to spaces in I<printable> text.  (This
distinction may or may not be evident in the particular tree/event
model implemented by the Pod parser.)  For example, consider this
unusual case:

   S<L</Autoloaded Functions>>

This means that the space in the middle of the visible link text must
not be broken across lines.  In other words, it's the same as this:

   L<"AutoloadedE<160>Functions"/Autoloaded Functions>

However, a misapplied space-to-NBSP replacement could (wrongly)
produce something equivalent to this:

   L<"AutoloadedE<160>Functions"/AutoloadedE<160>Functions>

...which is almost definitely not going to work as a hyperlink (assuming
this formatter outputs a format supporting hypertext).

Formatters may choose to just not support the S format code,
especially in cases where the output format simply has no NBSP
character/code and no code for "don't break this stuff across lines".
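
The distinction between printable text and other content can be
sketched with a simple tree model.  The node shape used here (a code
name, a hash of attributes, then children) and the C<spaces_to_nbsp>
name are assumptions made for illustration, not the interface of any
particular parser:

```perl
use strict;
use warnings;

# Assumed tree shape: [ $code_name, \%attributes, @children ], where
# each child is either a plain string or another node of this shape.
# Only plain-string children are printable text; attributes such as a
# link's target section are left alone.
sub spaces_to_nbsp {
    my ($node) = @_;
    for my $i (2 .. $#$node) {
        if (ref $node->[$i]) {
            spaces_to_nbsp($node->[$i]);
        }
        else {
            $node->[$i] =~ tr/ /\xA0/;
        }
    }
    return $node;
}

my $tree = [ 'S', {},
    [ 'L', { section => 'Autoloaded Functions' }, 'Autoloaded Functions' ],
];
spaces_to_nbsp($tree);
```

After the call, the visible link text contains an NBSP, while the
C<section> attribute still contains an ordinary space and so still
identifies the link target.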

=item *

Besides the NBSP character discussed above, implementors are reminded
of the existence of the other "special" character in Latin-1, the
"soft hyphen" character, also known as "discretionary hyphen"
(i.e., C<EE<lt>173E<gt>> = C<EE<lt>0xADE<gt>> =
C<EE<lt>shyE<gt>>).  This character expresses an optional hyphenation
point.  That is, it normally renders as nothing, but may render as a
"-" if a formatter breaks the word at that point.  Pod formatters
should, as appropriate, do one of the following:  1) render this with
a code with the same meaning (e.g., "\-" in RTF), 2) pass it through
in the expectation that the formatter understands this character as
such, or 3) delete it.

For example:

  sigE<shy>action
  manuE<shy>script
  JarkE<shy>ko HieE<shy>taE<shy>nieE<shy>mi

These signal to a formatter that if it is to hyphenate "sigaction"
or "manuscript", then it should be done as
"sig-I<[linebreak]>action" or "manu-I<[linebreak]>script"
(and if it doesn't hyphenate it, then the C<EE<lt>shyE<gt>> doesn't
show up at all).  And if it is
to hyphenate "Jarkko" and/or "Hietaniemi", it can do
so only at the points where there is a C<EE<lt>shyE<gt>> code.

In practice, it is anticipated that this character will not be used
often, but formatters should either support it, or delete it.
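
For a formatter that chooses deletion, a minimal sketch is shown
below.  It assumes that C<EE<lt>shyE<gt>> codes have already been
resolved to character 173, and C<strip_soft_hyphens> is a hypothetical
name:

```perl
use strict;
use warnings;

# Hypothetical helper: delete every soft hyphen (character 173, 0xAD).
sub strip_soft_hyphens {
    my ($text) = @_;
    $text =~ tr/\xAD//d;
    return $text;
}
```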

=item *

If you think that you want to add a new command to Pod (like, say, a
"=biblio" command), consider whether you could get the same
effect with a for or begin/end sequence: "=for biblio ..." or "=begin
biblio" ... "=end biblio".  Pod processors that don't understand
"=for biblio", etc, will simply ignore it, whereas they may complain
loudly if they see "=biblio".

=item *

Throughout this document, "Pod" has been the preferred spelling for
the name of the documentation format.  One may also use "POD" or
"pod".  For the documentation that is (typically) in the Pod
format, you may use "pod", or "Pod", or "POD".  Understanding these
distinctions is useful; but obsessing over how to spell them usually
is not.

=back





=head1 About LE<lt>...E<gt> Codes

As you can tell from a glance at L<perlpod|perlpod>, the LE<lt>...>
code is the most complex of the Pod formatting codes.  The points below
will hopefully clarify what it means and how processors should deal
with it.

=over

=item *

In parsing an LE<lt>...> code, Pod parsers must distinguish at least
four attributes:

=over

=item First:

The link-text.  If there is none, this must be C<undef>.  (E.g., in
"LE<lt>Perl Functions|perlfunc>", the link-text is "Perl Functions".
In "LE<lt>Time::HiRes>" and even "LE<lt>|Time::HiRes>", there is no
link text.  Note that link text may contain formatting.)

=item Second:

The possibly inferred link-text; i.e., if there was no real link
text, then this is the text that we'll infer in its place.  (E.g., for
"LE<lt>Getopt::Std>", the inferred link text is "Getopt::Std".)

=item Third:

The name or URL, or C<undef> if none.  (E.g., in "LE<lt>Perl
Functions|perlfunc>", the name (also sometimes called the page)
is "perlfunc".  In "LE<lt>/CAVEATS>", the name is C<undef>.)

=item Fourth:

The section (AKA "item" in older perlpods), or C<undef> if none.  E.g.,
in "LE<lt>Getopt::Std/DESCRIPTIONE<gt>", "DESCRIPTION" is the section.  (Note
that this is not the same as a manpage section like the "5" in "man 5
crontab".  "Section Foo" in the Pod sense means the part of the text
that's introduced by the heading or item whose text is "Foo".)

=back

Pod parsers may also note additional attributes including:

=over

=item Fifth:

A flag for whether item 3 (if present) is a URL (like
"http://lists.perl.org" is), in which case there should be no section
attribute; a Pod name (like "perldoc" and "Getopt::Std" are); or
possibly a man page name (like "crontab(5)" is).

=item Sixth:

The raw original LE<lt>...> content, before text is split on
"|", "/", etc, and before EE<lt>...> codes are expanded.

=back

(The above were numbered only for concise reference below.  It is not
a requirement that these be passed as an actual list or array.)

For example:

  L<Foo::Bar>
    =>  undef,                         # link text
        "Foo::Bar",                    # possibly inferred link text
        "Foo::Bar",                    # name
        undef,                         # section
        'pod',                         # what sort of link
        "Foo::Bar"                     # original content

  L<Perlport's section on NL's|perlport/Newlines>
    =>  "Perlport's section on NL's",  # link text
        "Perlport's section on NL's",  # possibly inferred link text
        "perlport",                    # name
        "Newlines",                    # section
        'pod',                         # what sort of link
        "Perlport's section on NL's|perlport/Newlines"
                                       # original content

  L<perlport/Newlines>
    =>  undef,                         # link text
        '"Newlines" in perlport',      # possibly inferred link text
        "perlport",                    # name
        "Newlines",                    # section
        'pod',                         # what sort of link
        "perlport/Newlines"            # original content

  L<crontab(5)/"DESCRIPTION">
    =>  undef,                         # link text
        '"DESCRIPTION" in crontab(5)', # possibly inferred link text
        "crontab(5)",                  # name
        "DESCRIPTION",                 # section
        'man',                         # what sort of link
        'crontab(5)/"DESCRIPTION"'     # original content

  L</Object Attributes>
    =>  undef,                         # link text
        '"Object Attributes"',         # possibly inferred link text
        undef,                         # name
        "Object Attributes",           # section
        'pod',                         # what sort of link
        "/Object Attributes"           # original content

  L<http://www.perl.org/>
    =>  undef,                         # link text
        "http://www.perl.org/",        # possibly inferred link text
        "http://www.perl.org/",        # name
        undef,                         # section
        'url',                         # what sort of link
        "http://www.perl.org/"         # original content

  L<Perl.org|http://www.perl.org/>
    =>  "Perl.org",                    # link text
        "Perl.org",                    # possibly inferred link text
        "http://www.perl.org/",        # name
        undef,                         # section
        'url',                         # what sort of link
        "Perl.org|http://www.perl.org/" # original content

Note that you can distinguish URL-links from anything else by the
fact that they match C<m/\A\w+:[^:\s]\S*\z/>.  So
C<LE<lt>http://www.perl.comE<gt>> is a URL, but
C<LE<lt>HTTP::ResponseE<gt>> isn't.
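
The six attributes could be derived roughly as in the following
sketch.  It is simplified: it assumes that the content contains no
nested formatting codes, so that the first "|" and the first "/" can
be taken at face value, and the C<parse_link> name is hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: split raw L<...> content into
# (text, inferred text, name, section, type, raw).
sub parse_link {
    my ($raw) = @_;

    # Split off the "text|" part, if any.
    my ($text, $target) =
        $raw =~ /\A([^|]*)\|(.*)\z/s ? ($1, $2) : (undef, $raw);
    $text = undef if defined $text && $text eq '';   # as in L<|Time::HiRes>

    my ($name, $section, $type);
    if ($target =~ m/\A\w+:[^:\s]\S*\z/) {
        # A URL has no section attribute.
        ($name, $type) = ($target, 'url');
    }
    else {
        ($name, $section) = split m{/}, $target, 2;
        $name = undef if defined $name && $name eq '';
        $section =~ s/\A"(.*)"\z/$1/s if defined $section;
        # A parenthesized suffix, as in "crontab(5)", signals a man page.
        $type = (defined $name && $name =~ /\(\S*\)\z/) ? 'man' : 'pod';
    }

    my $inferred = defined $text     ? $text
                 : !defined $section ? $name
                 : defined $name     ? qq{"$section" in $name}
                 :                     qq{"$section"};

    return ($text, $inferred, $name, $section, $type, $raw);
}
```

For C<LE<lt>perlport/NewlinesE<gt>>, this returns an undefined link
text, the inferred text '"Newlines" in perlport', the name "perlport",
the section "Newlines", the type "pod", and the raw content.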

=item *

In case of LE<lt>...> codes with no "text|" part in them,
older formatters have exhibited great variation in actually displaying
the link or cross reference.  For example, LE<lt>crontab(5)> would render
as "the C<crontab(5)> manpage", or "in the C<crontab(5)> manpage"
or just "C<crontab(5)>".

Pod processors must now treat "text|"-less links as follows:

  L<name>         =>  L<name|name>
  L</section>     =>  L<"section"|/section>
  L<name/section> =>  L<"section" in name|name/section>

=item *

Note that section names might contain markup.  I.e., if a section
starts with:

  =head2 About the C<-M> Operator

or with:

  =item About the C<-M> Operator

then a link to it would look like this:

  L<somedoc/About the C<-M> Operator>

Formatters may choose to ignore the markup for purposes of resolving
the link and use only the renderable characters in the section name,
as in:

  <h1><a name="About_the_-M_Operator">About the <code>-M</code>
  Operator</a></h1>

  ...

  <a href="somedoc#About_the_-M_Operator">"About the <code>-M</code>
  Operator" in somedoc</a>
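
An anchor name of the kind used above might be derived as in this
sketch.  It handles only simple single-bracket formatting codes (not
C<EE<lt>...E<gt>> escapes or doubled brackets), and the
C<section_anchor> name and the use of underscores are assumptions:

```perl
use strict;
use warnings;

# Hypothetical helper: reduce a section name to its renderable
# characters, then turn each run of whitespace into an underscore.
sub section_anchor {
    my ($section) = @_;

    # Unwrap formatting codes from the inside out: C<-M> becomes -M.
    1 while $section =~ s/[A-Z]<([^<>]*)>/$1/;

    $section =~ s/\s+/_/g;
    return $section;
}
```

Applying the same function to both the heading and the link's section
part gives the two sides a matching identifier.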

=item *

Previous versions of perlpod distinguished C<LE<lt>name/"section"E<gt>>
links from C<LE<lt>name/itemE<gt>> links (and their targets).  These
have been merged syntactically and semantically in the current
specification, and I<section> can refer either to a "=headI<n> Heading
Content" command or to a "=item Item Content" command.  This
specification does not specify what behavior should be in the case
of a given document having several things all seeming to produce the
same I<section> identifier (e.g., in HTML, several things all producing
the same I<anchorname> in <a name="I<anchorname>">...</a>
elements).  Where Pod processors can control this behavior, they should
use the first such anchor.  That is, C<LE<lt>Foo/BarE<gt>> refers to the
I<first> "Bar" section in Foo.

But for some processors/formats this cannot be easily controlled; as
with the HTML example, the behavior of multiple ambiguous
<a name="I<anchorname>">...</a> is most easily just left up to
browsers to decide.
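
Where a processor does control resolution, the "first such anchor" rule
can be sketched as follows.  The helper name C<first_anchor_index> is
hypothetical, and the sketch assumes the section identifiers have
already been extracted in document order.

```perl
use strict;
use warnings;

# Hypothetical helper: map each section identifier to the position of
# the first heading or item that produces it.  Later duplicates are
# ignored, so a link resolves to the first match.
sub first_anchor_index {
    my (@identifiers) = @_;
    my %first;
    for my $i (0 .. $#identifiers) {
        $first{ $identifiers[$i] } //= $i;
    }
    return \%first;
}

my $index = first_anchor_index(qw(Intro Bar Usage Bar));
print "Bar resolves to entry $index->{Bar}\n";
```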

=item *

In a C<LE<lt>text|...E<gt>> code, text may contain formatting codes
for formatting or for EE<lt>...> escapes, as in:

  L<B<ummE<234>stuff>|...>

For C<LE<lt>...E<gt>> codes without a "text|" part, only
C<EE<lt>...E<gt>> and C<ZE<lt>E<gt>> codes may occur.  That is,
authors should not use "C<LE<lt>BE<lt>Foo::BarE<gt>E<gt>>".

Note, however, that formatting codes and ZE<lt>>'s can occur in any
and all parts of an LE<lt>...> (i.e., in I<name>, I<section>, I<text>,
and I<url>).

Authors must not nest LE<lt>...> codes.  For example, "LE<lt>The
LE<lt>Foo::Bar> man page>" should be treated as an error.

=item *

Note that Pod authors may use formatting codes inside the "text"
part of "LE<lt>text|name>" (and so on for LE<lt>text|/"sec">).

In other words, this is valid:

  Go read L<the docs on C<$.>|perlvar/"$.">

Some output formats that do allow rendering "LE<lt>...>" codes as
hypertext might not allow the link-text to be formatted; in
that case, formatters will have to just ignore that formatting.

=item *

At time of writing, C<LE<lt>nameE<gt>> values are of two types:
either the name of a Pod page like C<LE<lt>Foo::BarE<gt>> (which
might be a real Perl module or program in an @INC / PATH
directory, or a .pod file in those places); or the name of a Unix
man page, like C<LE<lt>crontab(5)E<gt>>.  In theory, C<LE<lt>chmodE<gt>>
is ambiguous between a Pod page called "chmod", or the Unix man page
"chmod" (in whatever man-section).  However, the presence of a string
in parens, as in "crontab(5)", is sufficient to signal that what
is being discussed is not a Pod page, and so is presumably a
Unix man page.  The distinction is of no importance to many
Pod processors, but some processors that render to hypertext formats
may need to distinguish them in order to know how to render a
given C<LE<lt>fooE<gt>> code.
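
A processor that needs this distinction might test for the trailing
parenthesized man-section.  The following is only a sketch: the helper
name C<link_name_kind> and the exact pattern are assumptions, not
something this specification mandates.

```perl
use strict;
use warnings;

# Hypothetical helper: guess what kind of target a link name refers to.
sub link_name_kind {
    my ($name) = @_;
    # A trailing parenthesized man-section, as in "crontab(5)", signals
    # a Unix man page; anything else is assumed to be a Pod page.
    return $name =~ /\S\([^()\s]*\)\z/ ? 'man' : 'pod';
}

printf "%-12s => %s\n", $_, link_name_kind($_)
    for 'Foo::Bar', 'crontab(5)', 'chmod';
```

Note that a bare "chmod" is reported as a Pod page here, which is
exactly the ambiguity described above.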

=item *

Previous versions of perlpod allowed for a C<LE<lt>sectionE<gt>> syntax (as in
C<LE<lt>Object AttributesE<gt>>), which was not easily distinguishable from
C<LE<lt>nameE<gt>> syntax and for C<LE<lt>"section"E<gt>> which was only
slightly less ambiguous.  This syntax is no longer in the specification, and
has been replaced by the C<LE<lt>/sectionE<gt>> syntax (where the slash was
formerly optional).  Pod parsers should tolerate the C<LE<lt>"section"E<gt>>
syntax, for a while at least.  The suggested heuristic for distinguishing
C<LE<lt>sectionE<gt>> from C<LE<lt>nameE<gt>> is that if it contains any
whitespace, it's a I<section>.  Pod processors should warn about this being
deprecated syntax.

=back

=head1 About =over...=back Regions

"=over"..."=back" regions are used for various kinds of list-like
structures.  (I use the term "region" here simply as a collective
term for everything from the "=over" to the matching "=back".)

=over

=item *

The non-zero numeric I<indentlevel> in "=over I<indentlevel>" ...
"=back" is used for giving the formatter a clue as to how many
"spaces" (ems, or roughly equivalent units) it should tab over,
although many formatters will have to convert this to an absolute
measurement that may not exactly match with the size of spaces (or M's)
in the document's base font.  Other formatters may have to completely
ignore the number.  The lack of any explicit I<indentlevel> parameter is
equivalent to an I<indentlevel> value of 4.  Pod processors may
complain if I<indentlevel> is present but is not a positive number
matching C<m/\A(\d*\.)?\d+\z/>.
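
The check described above could look roughly like this.  The helper
name C<over_indentlevel> is hypothetical; it takes whatever text
follows "=over" and returns the effective level, or undef when the
value is present but unacceptable.

```perl
use strict;
use warnings;

# Hypothetical helper: return the effective indent level for the text
# that follows "=over", or undef if it is present but not acceptable.
sub over_indentlevel {
    my ($arg) = @_;
    return 4 if !defined $arg || $arg !~ /\S/;   # no parameter given
    $arg =~ s/\A\s+|\s+\z//g;                     # trim whitespace
    return undef unless $arg =~ m/\A(\d*\.)?\d+\z/;
    return $arg > 0 ? $arg + 0 : undef;           # must be non-zero
}

for my $arg (undef, '8', '.5', '0', 'wide') {
    my $level = over_indentlevel($arg);
    printf "%-8s => %s\n", $arg // '(none)', $level // 'rejected';
}
```

Whether a rejected value produces a warning, or is silently replaced
by the default, is left to the processor.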

=item *

Authors of Pod formatters are reminded that "=over" ... "=back" may
map to several different constructs in your output format.  For
example, in converting Pod to (X)HTML, it can map to any of
<ul>...</ul>, <ol>...</ol>, <dl>...</dl>, or
<blockquote>...</blockquote>.  Similarly, "=item" can map to <li> or
<dt>.

=item *

Each "=over" ... "=back" region should be one of the following:

=over

=item *

An "=over" ... "=back" region containing only "=item *" commands,
each followed by some number of ordinary/verbatim paragraphs, other
nested "=over" ... "=back" regions, "=for..." paragraphs, and
"=begin"..."=end" regions.

(Pod processors must tolerate a bare "=item" as if it were "=item
*".)  Whether "*" is rendered as a literal asterisk, an "o", or as
some kind of real bullet character, is left up to the Pod formatter,
and may depend on the level of nesting.

=item *

An "=over" ... "=back" region containing only
C<m/\A=item\s+\d+\.?\s*\z/> paragraphs, each one (or each group of them)
followed by some number of ordinary/verbatim paragraphs, other nested
"=over" ... "=back" regions, "=for..." paragraphs, and/or
"=begin"..."=end" regions.  Note that the numbers must start at 1
in each section, and must proceed in order and without skipping
numbers.

(Pod processors must tolerate lines like "=item 1" as if they were
"=item 1.", with the period.)

=item *

An "=over" ... "=back" region containing only "=item [text]"
commands, each one (or each group of them) followed by some number of
ordinary/verbatim paragraphs, other nested "=over" ... "=back"
regions, or "=for..." paragraphs, and "=begin"..."=end" regions.

The "=item [text]" paragraph should not match
C<m/\A=item\s+\d+\.?\s*\z/> or C<m/\A=item\s+\*\s*\z/>, nor should it
match just C<m/\A=item\s*\z/>.

=item *

An "=over" ... "=back" region containing no "=item" paragraphs at
all, and containing only some number of 
ordinary/verbatim paragraphs, and possibly also some nested "=over"
... "=back" regions, "=for..." paragraphs, and "=begin"..."=end"
regions.  Such an itemless "=over" ... "=back" region in Pod is
equivalent in meaning to a "<blockquote>...</blockquote>" element in
HTML.

=back

Note that with all the above cases, you can determine which type of
"=over" ... "=back" you have, by examining the first (non-"=cut", 
non-"=pod") Pod paragraph after the "=over" command.
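
A sketch of that determination, using the patterns given above, might
look like this.  The helper name C<over_region_type> and the four
labels it returns are made up for illustration; the caller is assumed
to have skipped any "=cut" and "=pod" paragraphs already.

```perl
use strict;
use warnings;

# Hypothetical helper: classify an "=over" region from the first Pod
# paragraph that follows the "=over" command.
sub over_region_type {
    my ($para) = @_;
    return 'block'  unless $para =~ m/\A=item\b/;          # itemless
    return 'bullet' if $para =~ m/\A=item\s*\z/;           # bare "=item"
    return 'bullet' if $para =~ m/\A=item\s+\*\s*\z/;
    return 'number' if $para =~ m/\A=item\s+\d+\.?\s*\z/;
    return 'text';
}

for my $para ('=item *', '=item 3.', '=item Neque', 'Plain text.') {
    printf "%-14s => %s\n", $para, over_region_type($para);
}
```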

=item *

Pod formatters I<must> tolerate arbitrarily large amounts of text
in the "=item I<text...>" paragraph.  In practice, most such
paragraphs are short, as in:

  =item For cutting off our trade with all parts of the world

But they may be arbitrarily long:

  =item For transporting us beyond seas to be tried for pretended
  offenses

  =item He is at this time transporting large armies of foreign
  mercenaries to complete the works of death, desolation and
  tyranny, already begun with circumstances of cruelty and perfidy
  scarcely paralleled in the most barbarous ages, and totally
  unworthy the head of a civilized nation.

=item *

Pod processors should tolerate "=item *" / "=item I<number>" commands
with no accompanying paragraph.  The middle item is an example:

  =over

  =item 1

  Pick up dry cleaning.

  =item 2

  =item 3

  Stop by the store.  Get Abba Zabas, Stoli, and cheap lawn chairs.

  =back

=item *

No "=over" ... "=back" region can contain headings.  Processors may
treat such a heading as an error.

=item *

Note that an "=over" ... "=back" region should have some
content.  That is, authors should not have an empty region like this:

  =over

  =back

Pod processors seeing such a contentless "=over" ... "=back" region
may ignore it, or may report it as an error.

=item *

Processors must tolerate an "=over" list that goes off the end of the
document (i.e., which has no matching "=back"), but they may warn
about such a list.

=item *

Authors of Pod formatters should note that this construct:

  =item Neque

  =item Porro

  =item Quisquam Est

  Qui dolorem ipsum quia dolor sit amet, consectetur, adipisci 
  velit, sed quia non numquam eius modi tempora incidunt ut
  labore et dolore magnam aliquam quaerat voluptatem.

  =item Ut Enim

is semantically ambiguous, in a way that makes formatting decisions
a bit difficult.  On the one hand, it could be mention of an item
"Neque", mention of another item "Porro", and mention of another
item "Quisquam Est", with just the last one requiring the explanatory
paragraph "Qui dolorem ipsum quia dolor..."; and then an item
"Ut Enim".  In that case, you'd want to format it like so:

  Neque

  Porro

  Quisquam Est
    Qui dolorem ipsum quia dolor sit amet, consectetur, adipisci
    velit, sed quia non numquam eius modi tempora incidunt ut
    labore et dolore magnam aliquam quaerat voluptatem.

  Ut Enim

But it could equally well be a discussion of three (related or equivalent)
items, "Neque", "Porro", and "Quisquam Est", followed by a paragraph
explaining them all, and then a new item "Ut Enim".  In that case, you'd
probably want to format it like so:

  Neque
  Porro
  Quisquam Est
    Qui dolorem ipsum quia dolor sit amet, consectetur, adipisci
    velit, sed quia non numquam eius modi tempora incidunt ut
    labore et dolore magnam aliquam quaerat voluptatem.

  Ut Enim

But (for the foreseeable future), Pod does not provide any way for Pod
authors to distinguish which grouping is meant by the above
"=item"-cluster structure.  So formatters should format it like so:

  Neque

  Porro

  Quisquam Est

    Qui dolorem ipsum quia dolor sit amet, consectetur, adipisci
    velit, sed quia non numquam eius modi tempora incidunt ut
    labore et dolore magnam aliquam quaerat voluptatem.

  Ut Enim

That is, there should be (at least roughly) equal spacing between
items as between paragraphs (although that spacing may well be less
than the full height of a line of text).  This leaves it to the reader
to use (con)textual cues to figure out whether the "Qui dolorem
ipsum..." paragraph applies to the "Quisquam Est" item or to all three
items "Neque", "Porro", and "Quisquam Est".  While not an ideal
situation, this is preferable to providing formatting cues that may
be actually contrary to the author's intent.

=back



=head1 About Data Paragraphs and "=begin/=end" Regions

Data paragraphs are typically used for inlining non-Pod data that is
to be used (typically passed through) when rendering the document to
a specific format:

  =begin rtf

  \par{\pard\qr\sa4500{\i Printed\~\chdate\~\chtime}\par}

  =end rtf

The exact same effect could, incidentally, be achieved with a single
"=for" paragraph:

  =for rtf \par{\pard\qr\sa4500{\i Printed\~\chdate\~\chtime}\par}

(Although that is not formally a data paragraph, it has the same
meaning as one, and Pod parsers may parse it as one.)

Another example of a data paragraph:

  =begin html

  I like <em>PIE</em>!

  <hr>Especially pecan pie!

  =end html

If these were ordinary paragraphs, the Pod parser would try to
expand the "EE<lt>/em>" (in the first paragraph) as a formatting
code, just like "EE<lt>lt>" or "EE<lt>eacute>".  But since this
is in a "=begin I<identifier>"..."=end I<identifier>" region I<and>
the identifier "html" doesn't have a ":" prefix, the contents
of this region are stored as data paragraphs, instead of being
processed as ordinary paragraphs (or if they began with spaces
and/or tabs, as verbatim paragraphs).

As a further example: At time of writing, no "biblio" identifier is
supported, but suppose some processor were written to recognize it as
a way of (say) denoting a bibliographic reference (necessarily
containing formatting codes in ordinary paragraphs).  The fact that
"biblio" paragraphs were meant for ordinary processing would be
indicated by prefacing each "biblio" identifier with a colon:

  =begin :biblio

  Wirth, Niklaus.  1976.  I<Algorithms + Data Structures =
  Programs.>  Prentice-Hall, Englewood Cliffs, NJ.

  =end :biblio

This would signal to the parser that paragraphs in this begin...end
region are subject to normal handling as ordinary/verbatim paragraphs
(while still tagged as meant only for processors that understand the
"biblio" identifier).  The same effect could be had with:

  =for :biblio
  Wirth, Niklaus.  1976.  I<Algorithms + Data Structures =
  Programs.>  Prentice-Hall, Englewood Cliffs, NJ.

The ":" on these identifiers means simply "process this stuff
normally, even though the result will be for some special target".
I suggest that parser APIs report "biblio" as the target identifier,
but also report that it had a ":" prefix.  (And similarly, with the
above "html", report "html" as the target identifier, and note the
I<lack> of a ":" prefix.)

Note that a "=begin I<identifier>"..."=end I<identifier>" region where
I<identifier> begins with a colon, I<can> contain commands.  For example:

  =begin :biblio

  Wirth's classic is available in several editions, including:

  =for comment
   hm, check abebooks.com for how much used copies cost.

  =over

  =item

  Wirth, Niklaus.  1975.  I<Algorithmen und Datenstrukturen.>
  Teubner, Stuttgart.  [Yes, it's in German.]

  =item

  Wirth, Niklaus.  1976.  I<Algorithms + Data Structures =
  Programs.>  Prentice-Hall, Englewood Cliffs, NJ.

  =back

  =end :biblio

Note, however, a "=begin I<identifier>"..."=end I<identifier>"
region where I<identifier> does I<not> begin with a colon, should not
directly contain "=head1" ... "=head4" commands, nor "=over", nor "=back",
nor "=item".  For example, this may be considered invalid:

  =begin somedata

  This is a data paragraph.

  =head1 Don't do this!

  This is a data paragraph too.

  =end somedata

A Pod processor may signal that the above (specifically the "=head1"
paragraph) is an error.  Note, however, that the following should
I<not> be treated as an error:

  =begin somedata

  This is a data paragraph.

  =cut

  # Yup, this isn't Pod anymore.
  sub excl { (rand() > .5) ? "hoo!" : "hah!" }

  =pod

  This is a data paragraph too.

  =end somedata

And this too is valid:

  =begin someformat

  This is a data paragraph.

    And this is a data paragraph.

  =begin someotherformat

  This is a data paragraph too.

    And this is a data paragraph too.

  =begin :yetanotherformat

  =head2 This is a command paragraph!

  This is an ordinary paragraph!

    And this is a verbatim paragraph!

  =end :yetanotherformat

  =end someotherformat

  Another data paragraph!

  =end someformat

The contents of the above "=begin :yetanotherformat" ...
"=end :yetanotherformat" region I<aren't> data paragraphs, because
the immediately containing region's identifier (":yetanotherformat")
begins with a colon.  In practice, most regions that contain
data paragraphs will contain I<only> data paragraphs; however, 
the above nesting is syntactically valid as Pod, even if it is
rare.  However, the handlers for some formats, like "html",
will accept only data paragraphs, not nested regions; and they may
complain if they see (targeted for them) nested regions, or commands,
other than "=end", "=pod", and "=cut".

Also consider this valid structure:

  =begin :biblio

  Wirth's classic is available in several editions, including:

  =over

  =item

  Wirth, Niklaus.  1975.  I<Algorithmen und Datenstrukturen.>
  Teubner, Stuttgart.  [Yes, it's in German.]

  =item

  Wirth, Niklaus.  1976.  I<Algorithms + Data Structures =
  Programs.>  Prentice-Hall, Englewood Cliffs, NJ.

  =back

  Buy buy buy!

  =begin html

  <img src='wirth_spokesmodeling_book.png'>

  <hr>

  =end html

  Now now now!

  =end :biblio

There, the "=begin html"..."=end html" region is nested inside
the larger "=begin :biblio"..."=end :biblio" region.  Note that the
content of the "=begin html"..."=end html" region is data
paragraph(s), because the immediately containing region's identifier
("html") I<doesn't> begin with a colon.

Pod parsers, when processing a series of data paragraphs one
after another (within a single region), should consider them to
be one large data paragraph that happens to contain blank lines.  So
the content of the above "=begin html"..."=end html" I<may> be stored
as two data paragraphs (one consisting of
"<img src='wirth_spokesmodeling_book.png'>\n"
and another consisting of "<hr>\n"), but I<should> be stored as
a single data paragraph (consisting of 
"<img src='wirth_spokesmodeling_book.png'>\n\n<hr>\n").

Pod processors should tolerate empty
"=begin I<something>"..."=end I<something>" regions,
empty "=begin :I<something>"..."=end :I<something>" regions, and
contentless "=for I<something>" and "=for :I<something>"
paragraphs.  I.e., these should be tolerated:

  =for html

  =begin html

  =end html

  =begin :biblio

  =end :biblio

Incidentally, note that there's no easy way to express a data
paragraph starting with something that looks like a command.  Consider:

  =begin stuff

  =shazbot

  =end stuff

There, "=shazbot" will be parsed as a Pod command "shazbot", not as a data
paragraph "=shazbot\n".  However, you can express a data paragraph consisting
of "=shazbot\n" using this code:

  =for stuff =shazbot

Situations where this is necessary are presumably quite rare.

Note that =end commands must match the currently open =begin command.  That
is, they must properly nest.  For example, this is valid:

  =begin outer

  X

  =begin inner

  Y

  =end inner

  Z

  =end outer

while this is invalid:

  =begin outer

  X

  =begin inner

  Y

  =end outer

  Z

  =end inner

This latter is improper because when the "=end outer" command is seen, the
currently open region has the formatname "inner", not "outer".  (It just
happens that "outer" is the format name of a higher-up region.)  This is
an error.  Processors must by default report this as an error, and may halt
processing the document containing that error.  A corollary of this is that
regions cannot "overlap". That is, the latter block above does not represent
a region called "outer" which contains X and Y, overlapping a region called
"inner" which contains Y and Z.  But because it is invalid (as all
apparently overlapping regions would be), it doesn't represent that, or
anything at all.

Similarly, this is invalid:

  =begin thing

  =end hting

This is an error because the region is opened by "thing", and the "=end"
tries to close "hting" [sic].

This is also invalid:

  =begin thing

  =end

This is invalid because every "=end" command must have a formatname
parameter.
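
The nesting rules above amount to keeping a stack of open regions.  The
following is a minimal sketch rather than a real parser: the function
name C<check_begin_end> and its message wording are hypothetical, and
it looks only at a list of command lines that the caller has already
extracted from the document.

```perl
use strict;
use warnings;

# Hypothetical checker: given "=begin"/"=end" command lines in document
# order, return a list of error messages (empty if everything nests).
sub check_begin_end {
    my (@commands) = @_;
    my (@open, @errors);
    for my $cmd (@commands) {
        my ($kind, $target) = $cmd =~ m/\A=(begin|end)\b(?:\s+(\S+))?/
            or next;                      # not a begin/end command
        if ($kind eq 'begin') {
            push @open, $target // '';
        }
        elsif (!defined $target) {
            push @errors, '"=end" without a formatname';
        }
        elsif (!@open) {
            push @errors, qq{"=end $target" with no open region};
        }
        elsif ($open[-1] ne $target) {
            push @errors,
                qq{"=end $target" does not match "=begin $open[-1]"};
        }
        else {
            pop @open;                    # properly closed
        }
    }
    push @errors, qq{"=begin $_" is never closed} for reverse @open;
    return @errors;
}

my @problems = check_begin_end('=begin outer', '=begin inner',
                               '=end outer',   '=end inner');
print "$_\n" for @problems;
```

A mismatched "=end" leaves the stack untouched here, so the region it
failed to close is also reported as unclosed; a real processor may
prefer to halt at the first error instead.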

=head1 SEE ALSO

L<perlpod>, L<perlsyn/"PODs: Embedded Documentation">,
L<podchecker>

=head1 AUTHOR

Sean M. Burke

=cut


=encoding utf8

=head1 NAME

perllocale - Perl locale handling (internationalization and localization)

=head1 DESCRIPTION

In the beginning there was ASCII, the "American Standard Code for
Information Interchange", which works quite well for Americans with
their English alphabet and dollar-denominated currency.  But it doesn't
work so well even for other English speakers, who may use different
currencies, such as the pound sterling (as the symbol for that currency
is not in ASCII); and it's hopelessly inadequate for many of the
thousands of the world's other languages.

To address these deficiencies, the concept of locales was invented
(formally the ISO C, XPG4, POSIX 1.c "locale system").  And applications
were and are being written that use the locale mechanism.  The process of
making such an application take account of its users' preferences in
these kinds of matters is called B<internationalization> (often
abbreviated as B<i18n>); telling such an application about a particular
set of preferences is known as B<localization> (B<l10n>).

Perl has been extended to support the locale system.  This
is controlled per application by using one pragma, one function call,
and several environment variables.

Unfortunately, there are quite a few deficiencies with the design (and
often, the implementations) of locales.  Unicode was invented (see
L<perlunitut> for an introduction to that) in part to address these
design deficiencies, and nowadays, there is a series of "UTF-8
locales", based on Unicode.  These are locales whose character set is
Unicode, encoded in UTF-8.  Starting in v5.20, Perl fully supports
UTF-8 locales, except for sorting and string comparisons like C<lt> and
C<ge>.  Starting in v5.26, Perl can handle these reasonably as well,
depending on the platform's implementation.  However, for earlier
releases or for better control, use L<Unicode::Collate>.  Perl continues to
support the old non-UTF-8 locales as well.  There are currently no UTF-8
locales for EBCDIC platforms.

(Unicode is also creating C<CLDR>, the "Common Locale Data Repository",
L<http://cldr.unicode.org/> which includes more types of information than
are available in the POSIX locale system.  At the time of this writing,
there was no CPAN module that provides access to this XML-encoded data.
However, it is possible to compute the POSIX locale data from them, and
earlier CLDR versions had these already extracted for you as UTF-8 locales
L<http://unicode.org/Public/cldr/2.0.1/>.)

=head1 WHAT IS A LOCALE

A locale is a set of data that describes various aspects of how various
communities in the world categorize their world.  These categories are
broken down into the following types (some of which include a brief
note here):

=over

=item Category C<LC_NUMERIC>: Numeric formatting

This indicates how numbers should be formatted for human readability,
for example the character used as the decimal point.

=item Category C<LC_MONETARY>: Formatting of monetary amounts

=for comment
The nbsp below makes this look better (though not great)

E<160>

=item Category C<LC_TIME>: Date/Time formatting

=for comment
The nbsp below makes this look better (though not great)

E<160>

=item Category C<LC_MESSAGES>: Error and other messages

This is used by Perl itself only for accessing operating system error
messages via L<$!|perlvar/$ERRNO> and L<$^E|perlvar/$EXTENDED_OS_ERROR>.

=item Category C<LC_COLLATE>: Collation

This indicates the ordering of letters for comparison and sorting.
In Latin alphabets, for example, "b" generally follows "a".

=item Category C<LC_CTYPE>: Character Types

This indicates, for example, whether a character is an uppercase letter.

=item Other categories

Some platforms have other categories, dealing with such things as
measurement units and paper sizes.  None of these are used directly by
Perl, but outside operations that Perl interacts with may use
these.  See L</Not within the scope of "use locale"> below.

=back

More details on the categories used by Perl are given below in L</LOCALE
CATEGORIES>.

Together, these categories go a long way towards being able to customize
a single program to run in many different locations.  But there are
deficiencies, so keep reading.

=head1 PREPARING TO USE LOCALES

Perl itself (outside the L<POSIX> module) will not use locales unless
specifically requested to (but
again note that Perl may interact with code that does use them).  Even
if there is such a request, B<all> of the following must be true
for it to work properly:

=over 4

=item *

B<Your operating system must support the locale system>.  If it does,
you should find that the C<setlocale()> function is a documented part of
its C library.

=item *

B<Definitions for locales that you use must be installed>.  You, or
your system administrator, must make sure that this is the case. The
available locales, the location in which they are kept, and the manner
in which they are installed all vary from system to system.  Some systems
provide only a few, hard-wired locales and do not allow more to be
added.  Others allow you to add "canned" locales provided by the system
supplier.  Still others allow you or the system administrator to define
and add arbitrary locales.  (You may have to ask your supplier to
provide canned locales that are not delivered with your operating
system.)  Read your system documentation for further illumination.

=item *

B<Perl must believe that the locale system is supported>.  If it does,
C<perl -V:d_setlocale> will say that the value for C<d_setlocale> is
C<define>.

=back
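
The last of these conditions can also be checked from inside a program,
since the L<Config> module exposes the same values that C<perl -V:name>
reports.  This is only a sketch; the variable name C<$has_setlocale> is
made up for the example.

```perl
use strict;
use warnings;
use Config;

# %Config holds the values that "perl -V:name" reports.
my $has_setlocale = ($Config{d_setlocale} // '') eq 'define';
print "setlocale support: ", ($has_setlocale ? "yes" : "no"), "\n";
```

This says nothing about the first two conditions, which depend on the
operating system and on which locale definitions are installed.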

If you want a Perl application to process and present your data
according to a particular locale, the application code should include
the S<C<use locale>> pragma (see L</The "use locale" pragma>) where
appropriate, and B<at least one> of the following must be true:

=over 4

=item 1

B<The locale-determining environment variables (see L</"ENVIRONMENT">)
must be correctly set up> at the time the application is started, either
by yourself or by whoever set up your system account; or

=item 2

B<The application must set its own locale> using the method described in
L</The setlocale function>.

=back

=head1 USING LOCALES

=head2 The C<"use locale"> pragma

WARNING!  Do NOT use this pragma in scripts that have multiple
L<threads|threads> active.  The locale is not local to a single thread.
Another thread may change the locale at any time, so that, at a
minimum, a given thread may find itself operating in a locale it isn't
expecting.  On some platforms, segfaults can also occur.  The locale
change need not be explicit; some operations cause perl to change the
locale itself.  You are vulnerable simply by having done a C<"use
locale">.

By default, Perl itself (outside the L<POSIX> module)
ignores the current locale.  The S<C<use locale>>
pragma tells Perl to use the current locale for some operations.
Starting in v5.16, there are optional parameters to this pragma,
described below, which restrict which operations are affected by it.

The current locale is set at execution time by
L<setlocale()|/The setlocale function> described below.  If that function
hasn't yet been called in the course of the program's execution, the
current locale is that which was determined by the L</"ENVIRONMENT"> in
effect at the start of the program.
If there is no valid environment, the current locale is whatever the
system default has been set to.   On POSIX systems, it is likely, but
not necessarily, the "C" locale.  On Windows, the default is set via the
computer's S<C<Control Panel-E<gt>Regional and Language Options>> (or its
current equivalent).

The operations that are affected by locale are:

=over 4

=item B<Not within the scope of C<"use locale">>

Only certain operations originating outside Perl should be affected, as
follows:

=over 4

=item *

The current locale is used when going outside of Perl with
operations like L<system()|perlfunc/system LIST> or
L<qxE<sol>E<sol>|perlop/qxE<sol>STRINGE<sol>>, if those operations are
locale-sensitive.

=item *

Also Perl gives access to various C library functions through the
L<POSIX> module.  Some of those functions are always affected by the
current locale.  For example, C<POSIX::strftime()> uses C<LC_TIME>;
C<POSIX::strtod()> uses C<LC_NUMERIC>; C<POSIX::strcoll()> and
C<POSIX::strxfrm()> use C<LC_COLLATE>.  All such functions
will behave according to the current underlying locale, even if that
locale isn't exposed to Perl space.

=item *

XS modules for all categories but C<LC_NUMERIC> get the underlying
locale, and hence any C library functions they call will use that
underlying locale.  For more discussion, see L<perlxs/CAVEATS>.

=back

Note that all C programs (including the perl interpreter, which is
written in C) always have an underlying locale.  That locale is the "C"
locale unless changed by a call to L<setlocale()|/The setlocale
function>.  When Perl starts up, it changes the underlying locale to the
one which is indicated by the L</ENVIRONMENT>.  When using the L<POSIX>
module or writing XS code, it is important to keep in mind that the
underlying locale may be something other than "C", even if the program
hasn't explicitly changed it.

=for comment
The nbsp below makes this look better (though not great)

E<160>

=item B<Lingering effects of C<S<use locale>>>

Certain Perl operations that are set-up within the scope of a
C<use locale> retain that effect even outside the scope.
These include:

=over 4

=item *

The output format of a L<write()|perlfunc/write> is determined by an
earlier format declaration (L<perlfunc/format>), so whether or not the
output is affected by locale is determined by whether the format
declaration is within the scope of a C<use locale>, not by whether the
C<write()> is.

=item *

Regular expression patterns can be compiled using
L<qrE<sol>E<sol>|perlop/qrE<sol>STRINGE<sol>msixpodualn> with actual
matching deferred to later.  Again, it is whether or not the compilation
was done within the scope of C<use locale> that determines the match
behavior, not whether the matches are done within such a scope.

=back

=for comment
The nbsp below makes this look better (though not great)


E<160>

=item B<Under C<"use locale";>>

=over 4

=item *

All the above operations

=item *

B<Format declarations> (L<perlfunc/format>) and hence any subsequent
C<write()>s use C<LC_NUMERIC>.

=item *

B<stringification and output> use C<LC_NUMERIC>.
These include the results of
C<print()>,
C<printf()>,
C<say()>,
and
C<sprintf()>.

=item *

B<The comparison operators> (C<lt>, C<le>, C<cmp>, C<ge>, and C<gt>) use
C<LC_COLLATE>.  C<sort()> is also affected if used without an
explicit comparison function, because it uses C<cmp> by default.

B<Note:> C<eq> and C<ne> are unaffected by locale: they always
perform a char-by-char comparison of their scalar operands.  What's
more, if C<cmp> finds that its operands are equal according to the
collation sequence specified by the current locale, it goes on to
perform a char-by-char comparison, and only returns I<0> (equal) if the
operands are char-for-char identical.  If you really want to know whether
two strings--which C<eq> and C<cmp> may consider different--are equal
as far as collation in the locale is concerned, see the discussion in
L</Category C<LC_COLLATE>: Collation>.

=item *

B<Regular expressions and case-modification functions> (C<uc()>, C<lc()>,
C<ucfirst()>, and C<lcfirst()>) use C<LC_CTYPE>.

=item *

B<The variables L<$!|perlvar/$ERRNO>> (and its synonyms C<$ERRNO> and
C<$OS_ERROR>) B<and L<$^E|perlvar/$EXTENDED_OS_ERROR>> (and its synonym
C<$EXTENDED_OS_ERROR>) when used as strings use C<LC_MESSAGES>.

=back

=back

The default behavior is restored with the S<C<no locale>> pragma, or
upon reaching the end of the block enclosing C<use locale>.
Note that C<use locale> calls may be
nested, and that what is in effect within an inner scope will revert to
the outer scope's rules at the end of the inner scope.
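
A minimal sketch of this scoping follows.  The word list is made up,
and the locale-aware order depends on whatever locale is current when
the program runs, so no output is shown.

```perl
use strict;
use warnings;

my @words = qw(banana Apple cherry apple);

# Outside "use locale": sort compares with cmp, ignoring the locale.
my @plain = sort @words;

my @collated;
{
    use locale;                 # in effect until the end of this block
    # cmp, and hence sort, now uses the current LC_COLLATE locale.
    @collated = sort @words;
}

# Back outside the block, the default behavior applies again.
my @plain_again = sort @words;

print "plain:    @plain\n";
print "collated: @collated\n";
```

Because the pragma is lexically scoped, only the C<sort> inside the
block is affected.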

The string result of any operation that uses locale
information is tainted, as it is possible for a locale to be
untrustworthy.  See L</"SECURITY">.

Starting in Perl v5.16 in a very limited way, and more generally in
v5.22, you can restrict which category or categories are enabled by this
particular instance of the pragma by adding parameters to it.  For
example,

 use locale qw(:ctype :numeric);

enables locale awareness within its scope of only those operations
(listed above) that are affected by C<LC_CTYPE> and C<LC_NUMERIC>.

The possible categories are: C<:collate>, C<:ctype>, C<:messages>,
C<:monetary>, C<:numeric>, C<:time>, and the pseudo category
C<:characters> (described below).

Thus you can say

 use locale ':messages';

and only L<$!|perlvar/$ERRNO> and L<$^E|perlvar/$EXTENDED_OS_ERROR>
will be locale aware.  Everything else is unaffected.
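
For illustration, the following sketch stringifies the same error
number with and without that restricted form of the pragma.  The
function name C<enoent_messages> is hypothetical, the example needs
Perl v5.22 or later, and whether the two strings differ depends on the
current C<LC_MESSAGES> locale.

```perl
use strict;
use warnings;
use Errno qw(ENOENT);

# Hypothetical helper: return the message for ENOENT without and with
# LC_MESSAGES awareness.
sub enoent_messages {
    my ($plain, $localized);
    {
        local $! = ENOENT;
        $plain = "$!";             # no "use locale" in scope here
    }
    {
        use locale ':messages';    # needs Perl v5.22 or later
        local $! = ENOENT;
        $localized = "$!";         # consults the LC_MESSAGES locale
    }
    return ($plain, $localized);
}

my ($plain_msg, $localized_msg) = enoent_messages();
print "plain:     $plain_msg\n";
print "localized: $localized_msg\n";
```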

Since Perl doesn't currently do anything with the C<LC_MONETARY>
category, specifying C<:monetary> does effectively nothing.  Some
systems have other categories, such as C<LC_PAPER_SIZE>, but Perl
also doesn't know anything about them, and there is no way to specify
them in this pragma's arguments.

You can also easily ask for all categories but one, by using either
of the following, for example,

 use locale ':!ctype';
 use locale ':not_ctype';

both of which mean to enable locale awareness of all categories but
C<LC_CTYPE>.  Only one category argument may be specified in a
S<C<use locale>> if it is of the negated form.

Prior to v5.22 only one form of the pragma with arguments is available:

 use locale ':not_characters';

(and you have to say C<not_>; you can't use the bang C<!> form).  This
pseudo category is a shorthand for specifying both C<:collate> and
C<:ctype>.  Hence, in the negated form, it is nearly the same thing as
saying

 use locale qw(:messages :monetary :numeric :time);

We use the term "nearly", because C<:not_characters> also turns on
S<C<use feature 'unicode_strings'>> within its scope.  This form is
less useful in v5.20 and later, and is described fully in
L</Unicode and UTF-8>, but briefly, it tells Perl to not use the
character portions of the locale definition, that is the C<LC_CTYPE> and
C<LC_COLLATE> categories.  Instead it will use the native character set
(extended by Unicode).  When using this parameter, you are responsible
for getting the external character set translated into the
native/Unicode one (which it already will be if it is one of the
increasingly popular UTF-8 locales).  There are convenient ways of doing
this, as described in L</Unicode and UTF-8>.

=head2 The setlocale function

WARNING!  Do NOT use this function in a L<thread|threads>.  The locale
will change in all other threads at the same time, and should your
thread get paused by the operating system, and another started, that
thread will not have the locale it is expecting.  On some platforms,
there can be a race leading to segfaults if two threads call this
function nearly simultaneously.

You can switch locales as often as you wish at run time with the
C<POSIX::setlocale()> function:

        # Import locale-handling tool set from POSIX module.
        # This example uses: setlocale -- the function call
        #                    LC_CTYPE -- explained below
        # (Showing the testing for success/failure of operations is
        # omitted in these examples to avoid distracting from the main
        # point)

        use POSIX qw(locale_h);
        use locale;
        my $old_locale;

        # query and save the old locale
        $old_locale = setlocale(LC_CTYPE);

        setlocale(LC_CTYPE, "fr_CA.ISO8859-1");
        # LC_CTYPE now in locale "French, Canada, codeset ISO 8859-1"

        setlocale(LC_CTYPE, "");
        # LC_CTYPE now reset to the default defined by the
        # LC_ALL/LC_CTYPE/LANG environment variables, or to the system
        # default.  See below for documentation.

        # restore the old locale
        setlocale(LC_CTYPE, $old_locale);

The first argument of C<setlocale()> gives the B<category>, the second the
B<locale>.  The category tells in what aspect of data processing you
want to apply locale-specific rules.  Category names are discussed in
L</LOCALE CATEGORIES> and L</"ENVIRONMENT">.  The locale is the name of a
collection of customization information corresponding to a particular
combination of language, country or territory, and codeset.  Read on for
hints on the naming of locales: not all systems name locales as in the
example.

If no second argument is provided and the category is something other
than C<LC_ALL>, the function returns a string naming the current locale
for the category.  You can use this value as the second argument in a
subsequent call to C<setlocale()>, B<but> on some platforms the string
is opaque, not something that most people would be able to decipher as
to what locale it means.

If no second argument is provided and the category is C<LC_ALL>, the
result is implementation-dependent.  It may be a string of
concatenated locale names (separator also implementation-dependent)
or a single locale name.  Please consult your L<setlocale(3)> man page for
details.

If a second argument is given and it corresponds to a valid locale,
the locale for the category is set to that value, and the function
returns the now-current locale value.  You can then use this in yet
another call to C<setlocale()>.  (In some implementations, the return
value may sometimes differ from the value you gave as the second
argument--think of it as an alias for the value you gave.)

As the example shows, if the second argument is an empty string, the
category's locale is returned to the default specified by the
corresponding environment variables.  Generally, this results in a
return to the default that was in force when Perl started up: changes
to the environment made by the application after startup may or may not
be noticed, depending on your system's C library.

Note that when a form of C<use locale> that doesn't include all
categories is specified, Perl ignores the excluded categories.

If C<setlocale()> fails for some reason (for example, an attempt to set
to a locale unknown to the system), the locale for the category is not
changed, and the function returns C<undef>.
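
Since the earlier examples omit error checking, here is a minimal
sketch of testing the return value.  C<set_first_available()> is a
hypothetical helper, not part of the POSIX module, and the locale
names are placeholders only.  As warned above, do not do this in a
thread.

    use POSIX qw(setlocale LC_CTYPE);

    # Try each candidate in turn; return what setlocale() reports
    # for the first one accepted, or undef if none could be set.
    sub set_first_available {
        my ($category, @candidates) = @_;
        for my $name (@candidates) {
            my $result = setlocale($category, $name);
            return $result if defined $result;
        }
        return;
    }

    my $old = setlocale(LC_CTYPE);
    my $got = set_first_available(LC_CTYPE, "no_SUCH.locale", "C");
    print "LC_CTYPE is now $got\n" if defined $got;

    # restore the old locale
    setlocale(LC_CTYPE, $old) if defined $old;

Listing "C" last gives the helper a fallback that should be accepted
everywhere.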


For further information about the categories, consult L<setlocale(3)>.

=head2 Finding locales

For locales available in your system, consult also L<setlocale(3)> to
see whether it leads to the list of available locales (search for the
I<SEE ALSO> section).  If that fails, try the following command lines:

        locale -a

        nlsinfo

        ls /usr/lib/nls/loc

        ls /usr/lib/locale

        ls /usr/lib/nls

        ls /usr/share/locale

and see whether they list something resembling these

        en_US.ISO8859-1     de_DE.ISO8859-1     ru_RU.ISO8859-5
        en_US.iso88591      de_DE.iso88591      ru_RU.iso88595
        en_US               de_DE               ru_RU
        en                  de                  ru
        english             german              russian
        english.iso88591    german.iso88591     russian.iso88595
        english.roman8                          russian.koi8r

Sadly, even though the calling interface for C<setlocale()> has been
standardized, names of locales and the directories where the
configuration resides have not been.  The basic form of the name is
I<language_territory>B<.>I<codeset>, but the latter parts after
I<language> are not always present.  The I<language> and I<territory>
are usually from the standards B<ISO 639> and B<ISO 3166>, the
two-letter abbreviations for the languages and the countries of the
world, respectively.  The I<codeset> part often mentions some B<ISO
8859> character set, the Latin codesets.  For example, C<ISO 8859-1>
is the so-called "Western European codeset" that can be used to encode
most Western European languages adequately.  Again, there are several
ways to write even the name of that one standard.  Lamentably.

Two special locales are worth particular mention: "C" and "POSIX".
Currently these are effectively the same locale: the difference is
mainly that the first one is defined by the C standard, the second by
the POSIX standard.  They define the B<default locale> in which
every program starts in the absence of locale information in its
environment.  (The I<default> default locale, if you will.)  Its language
is (American) English and its character codeset ASCII or, rarely, a
superset thereof (such as the "DEC Multinational Character Set
(DEC-MCS)").  B<Warning>. The C locale delivered by some vendors
may not actually exactly match what the C standard calls for.  So
beware.

B<NOTE>: Not all systems have the "POSIX" locale (not all systems are
POSIX-conformant), so use "C" when you need explicitly to specify this
default locale.
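
The command-line check can also be scripted.  The following is a
rough sketch, assuming a Unix-like shell and a C<locale> command that
accepts C<-a>; it probes each reported name with C<LC_TIME>, a
category chosen here only because Perl itself makes little use of it.
The variable names are illustrative.

    use POSIX qw(setlocale LC_TIME);

    # Names reported by the system (empty list if the command fails)
    chomp(my @names = `locale -a 2>/dev/null`);

    my $saved  = setlocale(LC_TIME);
    my @usable = grep { defined setlocale(LC_TIME, $_) } @names;
    setlocale(LC_TIME, $saved) if defined $saved;

    print "$_\n" for @usable;

A name that appears in the listing but not in C<@usable> is one that
the C library refused to set.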

=head2 LOCALE PROBLEMS

You may encounter the following warning message at Perl startup:

	perl: warning: Setting locale failed.
	perl: warning: Please check that your locale settings:
	        LC_ALL = "En_US",
	        LANG = (unset)
	    are supported and installed on your system.
	perl: warning: Falling back to the standard locale ("C").

This means that your locale settings had C<LC_ALL> set to "En_US" and
C<LANG> was not set.  Perl tried to believe you but could not.
Instead, Perl gave up and fell back to the "C" locale, the default locale
that is supposed to work no matter what.  (On Windows, it first tries
falling back to the system default locale.)  This usually means your
locale settings were wrong, they mention locales your system has never
heard of, or the locale installation in your system has problems (for
example, some system files are broken or missing).  There are quick and
temporary fixes to these problems, as well as more thorough and lasting
fixes.

=head2 Testing for broken locales

If you are building Perl from source, the Perl test suite file
F<lib/locale.t> can be used to test the locales on your system.
Setting the environment variable C<PERL_DEBUG_FULL_TEST> to 1
will cause it to output detailed results.  For example, on Linux, you
could say

 PERL_DEBUG_FULL_TEST=1 ./perl -T -Ilib lib/locale.t > locale.log 2>&1

Besides many other tests, it will test every locale it finds on your
system to see if they conform to the POSIX standard.  If any have
errors, it will include a summary near the end of the output of which
locales passed all its tests, and which failed, and why.

=head2 Temporarily fixing locale problems

The two quickest fixes are either to render Perl silent about any
locale inconsistencies or to run Perl under the default locale "C".

Perl's moaning about locale problems can be silenced by setting the
environment variable C<PERL_BADLANG> to "0" or "".
This method really just sweeps the problem under the carpet: you tell
Perl to shut up even when Perl sees that something is wrong.  Do not
be surprised if later something locale-dependent misbehaves.

Perl can be run under the "C" locale by setting the environment
variable C<LC_ALL> to "C".  This method is perhaps a bit more civilized
than the C<PERL_BADLANG> approach, but setting C<LC_ALL> (or
other locale variables) may affect other programs as well, not just
Perl.  In particular, external programs run from within Perl will see
these changes.  If you make the new settings permanent (read on), all
programs you run see the changes.  See L</"ENVIRONMENT"> for
the full list of relevant environment variables and L</"USING LOCALES">
for their effects in Perl.  Effects in other programs are
easily deducible.  For example, the variable C<LC_COLLATE> may well affect
your B<sort> program (or whatever the program that arranges "records"
alphabetically in your system is called).
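
To confine such a change to a single external command run from Perl,
one option is to localize the C<%ENV> entry.  This is only a sketch:
it assumes a platform where the list form of a piped C<open> is
available, and it uses a child perl merely to show what the child
sees.

    my $child_value = do {
        local $ENV{LC_ALL} = "C";   # undone at the end of this block
        open(my $fh, '-|', $^X, '-e', 'print $ENV{LC_ALL}')
            or die "cannot run child perl: $!";
        local $/;                   # slurp all of the child's output
        <$fh>;
    };
    print "child saw LC_ALL=$child_value\n";

After the block, C<$ENV{LC_ALL}> in the parent has its previous value
again (or is absent again if it was not set before).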

You can test out changing these variables temporarily, and if the
new settings seem to help, put those settings into your shell startup
files.  Consult your local documentation for the exact details.  For
Bourne-like shells (B<sh>, B<ksh>, B<bash>, B<zsh>):

	LC_ALL=en_US.ISO8859-1
	export LC_ALL

This assumes that the locale "en_US.ISO8859-1" appeared in the output
of the commands discussed above, and that we decided to try it instead
of the faulty locale "En_US".  In Cshish shells (B<csh>, B<tcsh>):

	setenv LC_ALL en_US.ISO8859-1

or if you have the "env" application you can do (in any shell)

	env LC_ALL=en_US.ISO8859-1 perl ...

If you do not know what shell you have, consult your local
helpdesk or the equivalent.

=head2 Permanently fixing locale problems

The slower but superior fix is to correct the misconfiguration of
your own environment variables, which you may be able to do yourself.
The mis(sing)configuration of the whole system's locales usually
requires the help of your friendly system administrator.

First, read L</Finding locales> earlier in this document.  That tells
how to find which locales are really supported--and more importantly,
installed--on your system.  In our example error message, environment
variables affecting the locale are listed in the order of decreasing
importance (and unset variables do not matter).  Therefore, having
LC_ALL set to "En_US" must have been the bad choice, as shown by the
error message.  First try fixing locale settings listed first.

Second, if using the listed commands you see something B<exactly>
(prefix matches do not count and case usually counts) like "En_US"
without the quotes, then you should be okay because you are using a
locale name that should be installed and available in your system.
In this case, see L</Permanently fixing your system's locale configuration>.

=head2 Permanently fixing your system's locale configuration

This is when you see something like:

	perl: warning: Please check that your locale settings:
	        LC_ALL = "En_US",
	        LANG = (unset)
	    are supported and installed on your system.

but then cannot find "En_US" among the names listed by the
above-mentioned commands.  You may see things like "en_US.ISO8859-1",
but that isn't the same.  In this case, try running under a locale
that you can list and which somehow matches what you tried.  The
rules for matching locale names are a bit vague because
standardization is weak in this area.  See again
L</Finding locales> for the general rules.

=head2 Fixing system locale configuration

Contact a system administrator (preferably your own) and report the exact
error message you get, and ask them to read this same documentation you
are now reading.  They should be able to check whether there is something
wrong with the locale configuration of the system.  The L</Finding locales>
section is unfortunately a bit vague about the exact commands and places
because these things are not that standardized.

=head2 The localeconv function

The C<POSIX::localeconv()> function allows you to get particulars of the
locale-dependent numeric formatting information specified by the current
underlying C<LC_NUMERIC> and C<LC_MONETARY> locales (regardless of
whether called from within the scope of C<S<use locale>> or not).  (If
you just want the name of
the current locale for a particular category, use C<POSIX::setlocale()>
with a single parameter--see L</The setlocale function>.)

        use POSIX qw(locale_h);

        # Get a reference to a hash of locale-dependent info
        $locale_values = localeconv();

        # Output sorted list of the values
        for (sort keys %$locale_values) {
            printf "%-20s = %s\n", $_, $locale_values->{$_}
        }

C<localeconv()> takes no arguments, and returns B<a reference to> a hash.
The keys of this hash are variable names for formatting, such as
C<decimal_point> and C<thousands_sep>.  The values are the
corresponding, er, values.  See L<POSIX/localeconv> for a longer
example listing the categories an implementation might be expected to
provide; some provide more and others fewer.  You don't need an
explicit C<use locale>, because C<localeconv()> always observes the
current locale.

Here's a simple-minded example program that rewrites its command-line
parameters as integers correctly formatted in the current locale:

    use POSIX qw(locale_h);

    # Get some of locale's numeric formatting parameters
    my ($thousands_sep, $grouping) =
            @{localeconv()}{'thousands_sep', 'grouping'};

    # Apply defaults if values are missing
    $thousands_sep = ',' unless $thousands_sep;

    # grouping and mon_grouping are packed lists
    # of small integers (characters) telling the
    # grouping (thousands_sep and mon_thousands_sep
    # being the group dividers) of numbers and
    # monetary quantities.  The integers' meanings:
    # 255 means no more grouping, 0 means repeat
    # the previous grouping, 1-254 means use that
    # as the current grouping.  Grouping goes from
    # right to left (low to high digits).  In the
    # below we cheat slightly by never using anything
    # else than the first grouping (whatever that is).
    my @grouping;
    if ($grouping) {
        @grouping = unpack("C*", $grouping);
    } else {
        @grouping = (3);
    }

    # Format command line params for current locale
    for (@ARGV) {
        $_ = int;    # Chop non-integer part
        # \Q...\E keeps a separator such as "." from acting as a
        # regex metacharacter
        1 while
        s/(\d)(\d{$grouping[0]}($|\Q$thousands_sep\E))/$1$thousands_sep$2/;
        print "$_";
    }
    print "\n";

Note that if the platform doesn't have C<LC_NUMERIC> and/or
C<LC_MONETARY> available or enabled, the corresponding elements of the
hash will be missing.
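
One way to cope with missing elements is to merge the returned hash
over a set of fallback values.  This is a minimal sketch; the
fallbacks chosen here are assumptions, not values mandated by any
standard.

    use POSIX qw(localeconv);

    # Fallback values, used only when the locale supplies none
    my %defaults = (decimal_point => ".", thousands_sep => "");
    my %fmt      = (%defaults, %{ localeconv() });

    printf "decimal point: '%s', thousands separator: '%s'\n",
           @fmt{qw(decimal_point thousands_sep)};

Because the locale's entries come last in the list, they override the
defaults whenever they are present.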

=head2 I18N::Langinfo

Another interface for querying locale-dependent information is the
C<I18N::Langinfo::langinfo()> function, available at least in Unix-like
systems and VMS.

The following example will import the C<langinfo()> function itself and
three constants to be used as arguments to C<langinfo()>: a constant for
the abbreviated first day of the week (the numbering starts from
Sunday = 1) and two more constants for the affirmative and negative
answers for a yes/no question in the current locale.

    use I18N::Langinfo qw(langinfo ABDAY_1 YESSTR NOSTR);

    my ($abday_1, $yesstr, $nostr)
                = map { langinfo($_) } (ABDAY_1(), YESSTR(), NOSTR());

    print "$abday_1? [$yesstr/$nostr] ";

In other words, in the "C" (or English) locale the above will probably
print something like:

    Sun? [yes/no]

See L<I18N::Langinfo> for more information.

=head1 LOCALE CATEGORIES

The following subsections describe basic locale categories.  Beyond these,
some combination categories allow manipulation of more than one
basic category at a time.  See L</"ENVIRONMENT"> for a discussion of these.

=head2 Category C<LC_COLLATE>: Collation: Text Comparisons and Sorting

In the scope of a S<C<use locale>> form that includes collation, Perl
looks to the C<LC_COLLATE>
environment variable to determine the application's notions on collation
(ordering) of characters.  For example, "b" follows "a" in Latin
alphabets, but where do "E<aacute>" and "E<aring>" belong?  And while
"color" follows "chocolate" in English, what about in traditional Spanish?

The following collations all make sense and you may meet any of them
if you C<"use locale">.

	A B C D E a b c d e
	A a B b C c D d E e
	a A b B c C d D e E
	a b c d e A B C D E

Here is a code snippet to tell what "word"
characters are in the current locale, in that locale's order:

        use locale;
        print +(sort grep /\w/, map { chr } 0..255), "\n";

Compare this with the characters that you see and their order if you
state explicitly that the locale should be ignored:

        no locale;
        print +(sort grep /\w/, map { chr } 0..255), "\n";

This machine-native collation (which is what you get unless S<C<use
locale>> has appeared earlier in the same block) must be used for
sorting raw binary data, whereas the locale-dependent collation of the
first example is useful for natural text.
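
The same contrast can be seen with whole words.  In this small
sketch (the word list is arbitrary), the first sort uses the native
code point order and the second uses whatever the current
C<LC_COLLATE> locale specifies:

    my @words = qw(banana Apple cherry apple Banana);

    my @native    = do { no locale;  sort @words };
    my @by_locale = do { use locale; sort @words };

    # On ASCII platforms the native order puts capitals first:
    # Apple Banana apple banana cherry
    print "native: @native\n";
    print "locale: @by_locale\n";

In the "C" locale both lines will normally be the same; in many other
locales the second keeps "apple" and "Apple" adjacent.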

As noted in L</USING LOCALES>, C<cmp> compares according to the current
collation locale when C<use locale> is in effect, but falls back to a
char-by-char comparison for strings that the locale says are equal. You
can use C<POSIX::strcoll()> if you don't want this fall-back:

        use POSIX qw(strcoll);
        $equal_in_locale =
            !strcoll("space and case ignored", "SpaceAndCaseIgnored");

C<$equal_in_locale> will be true if the collation locale specifies a
dictionary-like ordering that ignores space characters completely and
which folds case.

Perl uses the platform's C library collation functions C<strcoll()> and
C<strxfrm()>.  That means you get whatever they give.  On some
platforms, these functions work well on UTF-8 locales, giving
a reasonable default collation for the code points that are important in
that locale.  (And if they aren't working well, the problem may only be
that the locale definition is deficient, so can be fixed by using a
better definition file.  Unicode's definitions (see L</Freely available
locale definitions>) provide reasonable UTF-8 locale collation
definitions.)  Starting in Perl v5.26, Perl's use of these functions has
been made more seamless.  This may be sufficient for your needs.  For
more control, and to make sure strings containing any code point (not
just the ones important in the locale) collate properly, the
L<Unicode::Collate> module is suggested.

In non-UTF-8 locales (hence single byte), code points above 0xFF are
technically invalid.  But if present, again starting in v5.26, they will
collate to the same position as the highest valid code point does.  This
generally gives good results, but the collation order may be skewed if
the valid code point gets special treatment when it forms particular
sequences with other characters as defined by the locale.
When two strings collate identically, the code point order is used as a
tie breaker.

If Perl detects that there are problems with the locale collation order,
it reverts to using non-locale collation rules for that locale.

If you have a single string that you want to check for "equality in
locale" against several others, you might think you could gain a little
efficiency by using C<POSIX::strxfrm()> in conjunction with C<eq>:

        use POSIX qw(strxfrm);
        $xfrm_string = strxfrm("Mixed-case string");
        print "locale collation ignores spaces\n"
            if $xfrm_string eq strxfrm("Mixed-casestring");
        print "locale collation ignores hyphens\n"
            if $xfrm_string eq strxfrm("Mixedcase string");
        print "locale collation ignores case\n"
            if $xfrm_string eq strxfrm("mixed-case string");

C<strxfrm()> takes a string and maps it into a transformed string for use
in char-by-char comparisons against other transformed strings during
collation.  "Under the hood", locale-affected Perl comparison operators
call C<strxfrm()> for both operands, then do a char-by-char
comparison of the transformed strings.  By calling C<strxfrm()> explicitly
and using a non locale-affected comparison, the example attempts to save
a couple of transformations.  But in fact, it doesn't save anything: Perl
magic (see L<perlguts/Magic Variables>) creates the transformed version of a
string the first time it's needed in a comparison, then keeps this version around
in case it's needed again.  An example rewritten the easy way with
C<cmp> runs just about as fast.  It also copes with null characters
embedded in strings; if you call C<strxfrm()> directly, it treats the first
null it finds as a terminator.  Don't expect the transformed strings
it produces to be portable across systems--or even from one revision
of your operating system to the next.  In short, don't call C<strxfrm()>
directly: let Perl do it for you.

Note: C<use locale> isn't shown in some of these examples because it isn't
needed: C<strcoll()> and C<strxfrm()> are POSIX functions
which use the standard system-supplied C<libc> functions that
always obey the current C<LC_COLLATE> locale.

=head2 Category C<LC_CTYPE>: Character Types

In the scope of a S<C<use locale>> form that includes C<LC_CTYPE>, Perl
obeys the C<LC_CTYPE> locale
setting.  This controls the application's notion of which characters are
alphabetic, numeric, punctuation, I<etc>.  This affects Perl's C<\w>
regular expression metanotation,
which stands for alphanumeric characters--that is, alphabetic,
numeric, and the platform's native underscore.
(Consult L<perlre> for more information about
regular expressions.)  Thanks to C<LC_CTYPE>, depending on your locale
setting, characters like "E<aelig>", "E<eth>", "E<szlig>", and
"E<oslash>" may be understood as C<\w> characters.
It also affects things like C<\s>, C<\D>, and the POSIX character
classes, like C<[[:graph:]]>.  (See L<perlrecharclass> for more
information on all these.)

The C<LC_CTYPE> locale also provides the map used in transliterating
characters between lower and uppercase.  This affects the case-mapping
functions--C<fc()>, C<lc()>, C<lcfirst()>, C<uc()>, and C<ucfirst()>;
case-mapping
interpolation with C<\F>, C<\l>, C<\L>, C<\u>, or C<\U> in double-quoted
strings and C<s///> substitutions; and case-independent regular expression
pattern matching using the C<i> modifier.

Starting in v5.20, Perl supports UTF-8 locales for C<LC_CTYPE>, but
otherwise Perl only supports single-byte locales, such as the ISO 8859
series.  This means that wide character locales, for example for Asian
languages, are not well-supported.  Use of these locales may cause core
dumps.  If the platform has the capability for Perl to detect such a
locale, starting in Perl v5.22, L<Perl will warn, default
enabled|warnings/Category Hierarchy>, using the C<locale> warning
category, whenever such a locale is switched into.  The UTF-8 locale
support is actually a
superset of POSIX locales, because it is really full Unicode behavior
as if no C<LC_CTYPE> locale were in effect at all (except for tainting;
see L</SECURITY>).  POSIX locales, even UTF-8 ones,
are lacking certain concepts in Unicode, such as the idea that changing
the case of a character could expand to be more than one character.
Perl in a UTF-8 locale will give you that expansion.  Prior to v5.20,
Perl treated a UTF-8 locale on some platforms like an ISO 8859-1 one,
with some restrictions, and on other platforms more like the "C" locale.
For releases v5.16 and v5.18, S<C<use locale ':not_characters'>> could
be used as a workaround for this (see L</Unicode and UTF-8>).

Note that there are quite a few things that are unaffected by the
current locale.  Any literal character is the native character for the
given platform.  Hence 'A' means the character at code point 65 on ASCII
platforms, and 193 on EBCDIC.  That may or may not be an 'A' in the
current locale, if that locale even has an 'A'.
Similarly, all the escape sequences for particular characters,
C<\n> for example, always mean the platform's native one.  This means,
for example, that C<\N> in regular expressions (every character
but new-line) works on the platform character set.

Starting in v5.22, Perl will by default warn when switching into a
locale that redefines any ASCII printable character (plus C<\t> and
C<\n>) into a different class than expected.  This is likely to
happen on modern locales only on EBCDIC platforms, where, for example,
a CCSID 0037 locale on a CCSID 1047 machine moves C<"[">, but it can
happen on ASCII platforms with the ISO 646 and other
7-bit locales that are essentially obsolete.  Things may still work,
depending on what features of Perl are used by the program.  For
example, in the example from above where C<"|"> becomes a C<\w>, and
there are no regular expressions where this matters, the program may
still work properly.  The warning lists all the characters that
it can determine could be adversely affected.

B<Note:> A broken or malicious C<LC_CTYPE> locale definition may result
in clearly ineligible characters being considered to be alphanumeric by
your application.  For strict matching of (mundane) ASCII letters and
digits--for example, in command strings--locale-aware applications
should use C<\w> with the C</a> regular expression modifier.  See L</"SECURITY">.
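
As an illustration of that advice, here is a minimal sketch of a
validator that stays ASCII-only even within the scope of
S<C<use locale>>.  The name C<is_safe_token()> is hypothetical, and
the C</a> modifier requires v5.14 or later.

    use locale;

    # Accept only ASCII letters, digits, and underscore, whatever
    # LC_CTYPE claims about other characters.
    sub is_safe_token {
        my ($text) = @_;
        return $text =~ /\A\w+\z/a ? 1 : 0;
    }

    print is_safe_token("report_2024") ? "ok\n" : "rejected\n";
    print is_safe_token("a|b")         ? "ok\n" : "rejected\n";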

=head2 Category C<LC_NUMERIC>: Numeric Formatting

After a proper C<POSIX::setlocale()> call, and within the scope of
a C<use locale> form that includes numerics, Perl obeys the
C<LC_NUMERIC> locale information, which controls an application's idea
of how numbers should be formatted for human readability.
In most implementations the only effect is to
change the character used for the decimal point--perhaps from "."  to ",".
The functions aren't aware of such niceties as thousands separation and
so on. (See L</The localeconv function> if you care about these things.)

 use POSIX qw(strtod setlocale LC_NUMERIC);
 use locale;

 setlocale LC_NUMERIC, "";

 $n = 5/2;   # Assign numeric 2.5 to $n

 $a = " $n"; # Locale-dependent conversion to string

 print "half five is $n\n";       # Locale-dependent output

 printf "half five is %g\n", $n;  # Locale-dependent output

 print "DECIMAL POINT IS COMMA\n"
          if $n == (strtod("2,5"))[0]; # Locale-dependent conversion

See also L<I18N::Langinfo> and C<RADIXCHAR>.
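
For instance, the decimal point character can be queried directly.
This is only a sketch, and it assumes that C<langinfo()> and
C<RADIXCHAR> are available on your platform:

    use I18N::Langinfo qw(langinfo RADIXCHAR);

    my $radix = langinfo(RADIXCHAR());
    print "decimal point is '$radix'\n";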

=head2 Category C<LC_MONETARY>: Formatting of monetary amounts

The C standard defines the C<LC_MONETARY> category, but not a function
that is affected by its contents.  (Those with experience of standards
committees will recognize that the working group decided to punt on the
issue.)  Consequently, Perl essentially takes no notice of it.  If you
really want to use C<LC_MONETARY>, you can query its contents--see
L</The localeconv function>--and use the information that it returns in your
application's own formatting of currency amounts.  However, you may well
find that the information, voluminous and complex though it may be, still
does not quite meet your requirements: currency formatting is a hard nut
to crack.
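
To give an idea of what such application-level formatting involves,
here is a deliberately simplified sketch.  C<format_amount()> is a
hypothetical helper; it handles only non-negative amounts and ignores
symbol placement, grouping, and sign conventions, all of which a real
application would have to consider.

    use POSIX qw(localeconv);

    # Format an amount using a few of the fields that localeconv()
    # may return, falling back to defaults for missing ones.
    sub format_amount {
        my ($amount, $conv) = @_;
        my $symbol = defined $conv->{currency_symbol}
                   ? $conv->{currency_symbol} : "";
        my $point  = defined $conv->{mon_decimal_point}
                   ? $conv->{mon_decimal_point} : ".";
        my $digits = $conv->{frac_digits};
        $digits = 2
            unless defined $digits && $digits >= 0 && $digits <= 20;

        # With no "use locale" in scope, sprintf emits "." here
        my $text = sprintf("%.*f", $digits, $amount);
        $text =~ s/\./$point/;
        return $symbol . $text;
    }

    print format_amount(1234.5, localeconv()), "\n";

Passing the hash in as an argument, rather than calling
C<localeconv()> inside the helper, makes it easy to try the helper
with hand-built values.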

See also L<I18N::Langinfo> and C<CRNCYSTR>.

=head2 Category C<LC_TIME>: Representation of time

Output produced by C<POSIX::strftime()>, which builds a formatted
human-readable date/time string, is affected by the current C<LC_TIME>
locale.  Thus, in a French locale, the output produced by the C<%B>
format element (full month name) for the first month of the year would
be "janvier".  Here's how to get a list of long month names in the
current locale:

        use POSIX qw(strftime);
        for (0..11) {
            $long_month_name[$_] =
                strftime("%B", 0, 0, 0, 1, $_, 96);
        }

Note: C<use locale> isn't needed in this example: C<strftime()> is a POSIX
function which uses the standard system-supplied C<libc> function that
always obeys the current C<LC_TIME> locale.
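
Conversely, when a date string must not vary with the user's
settings (in a log file or a protocol header, say), one approach is
to switch C<LC_TIME> to "C" around the call.  This is only a sketch,
and the earlier warning about C<setlocale()> in threads applies.

    use POSIX qw(strftime setlocale LC_TIME);

    my $old = setlocale(LC_TIME);
    setlocale(LC_TIME, "C");
    # 1 January 1996, formatted with the "C" locale's English names
    my $stamp = strftime("%a, %d %b %Y", 0, 0, 0, 1, 0, 96);
    setlocale(LC_TIME, $old) if defined $old;

    print "$stamp\n";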

See also L<I18N::Langinfo> and C<ABDAY_1>..C<ABDAY_7>, C<DAY_1>..C<DAY_7>,
C<ABMON_1>..C<ABMON_12>, and C<MON_1>..C<MON_12>.

=head2 Other categories

The remaining locale categories are not currently used by Perl itself.
But again note that things Perl interacts with may use these, including
extensions outside the standard Perl distribution, and by the
operating system and its utilities.  Note especially that the string
value of C<$!> and the error messages given by external utilities may
be changed by C<LC_MESSAGES>.  If you want to have portable error
codes, use C<%!>.  See L<Errno>.
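
A minimal sketch of that advice follows; the path is a placeholder
assumed not to exist.

    use Errno;

    my $path = "/nonexistent/example/file";  # hypothetical path
    my $is_missing = 0;
    if (!open(my $fh, '<', $path)) {
        # "$!" may be translated by the locale; $!{ENOENT} is not
        $is_missing = 1 if $!{ENOENT};
        print "no such file: $path\n" if $is_missing;
    }

Testing C<$!{ENOENT}> gives the same answer whatever language the
text of C<$!> is in.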

=head1 SECURITY

Although the main discussion of Perl security issues can be found in
L<perlsec>, a discussion of Perl's locale handling would be incomplete
if it did not draw your attention to locale-dependent security issues.
Locales--particularly on systems that allow unprivileged users to
build their own locales--are untrustworthy.  A malicious (or just plain
broken) locale can make a locale-aware application give unexpected
results.  Here are a few possibilities:

=over 4

=item *

Regular expression checks for safe file names or mail addresses using
C<\w> may be spoofed by an C<LC_CTYPE> locale that claims that
characters such as C<"E<gt>"> and C<"|"> are alphanumeric.

=item *

String interpolation with case-mapping, as in, say, C<$dest =
"C:\U$name.$ext">, may produce dangerous results if a bogus C<LC_CTYPE>
case-mapping table is in effect.

=item *

A sneaky C<LC_COLLATE> locale could result in the names of students with
"D" grades appearing ahead of those with "A"s.

=item *

An application that takes the trouble to use information in
C<LC_MONETARY> may format debits as if they were credits and vice versa
if that locale has been subverted.  Or it might make payments in US
dollars instead of Hong Kong dollars.

=item *

The date and day names in dates formatted by C<strftime()> could be
manipulated to advantage by a malicious user able to subvert the
C<LC_TIME> locale.  ("Look--it says I wasn't in the building on
Sunday.")

=back

Such dangers are not peculiar to the locale system: any aspect of an
application's environment which may be modified maliciously presents
similar challenges.  Similarly, they are not specific to Perl: any
programming language that allows you to write programs that take
account of their environment exposes you to these issues.

Perl cannot protect you from all possibilities shown in the
examples--there is no substitute for your own vigilance--but, when
C<use locale> is in effect, Perl uses the tainting mechanism (see
L<perlsec>) to mark string results that become locale-dependent, and
which may be untrustworthy in consequence.  Here is a summary of the
tainting behavior of operators and functions that may be affected by
the locale:

=over 4

=item  *

B<Comparison operators> (C<lt>, C<le>, C<ge>, C<gt> and C<cmp>):

Scalar true/false (or less/equal/greater) result is never tainted.

=item  *

B<Case-mapping interpolation> (with C<\l>, C<\L>, C<\u>, C<\U>, or C<\F>)

The result string containing interpolated material is tainted if
a C<use locale> form that includes C<LC_CTYPE> is in effect.

=item  *

B<Matching operator> (C<m//>):

Scalar true/false result is never tainted.

All subpatterns, either delivered as a list-context result or as C<$1>
I<etc>., are tainted if a C<use locale> form that includes
C<LC_CTYPE> is in effect, and the subpattern
regular expression contains a locale-dependent construct.  These
constructs include C<\w> (to match an alphanumeric character), C<\W>
(non-alphanumeric character), C<\b> and C<\B> (word-boundary and
non-boundary, which depend on what C<\w> and C<\W> match), C<\s>
(whitespace character), C<\S> (non whitespace character), C<\d> and
C<\D> (digits and non-digits), and the POSIX character classes, such as
C<[:alpha:]> (see L<perlrecharclass/POSIX Character Classes>).

Tainting is also likely if the pattern is to be matched
case-insensitively (via C</i>).  The exception is if all the code points
to be matched this way are above 255 and do not have folds under Unicode
rules to below 256.  Tainting is not done for these because Perl
only uses Unicode rules for such code points, and those rules are the
same no matter what the current locale.

The matched-pattern variables, C<$&>, C<$`> (pre-match), C<$'>
(post-match), and C<$+> (last match) also are tainted.

=item  *

B<Substitution operator> (C<s///>):

Has the same behavior as the match operator.  Also, the left
operand of C<=~> becomes tainted when a C<use locale>
form that includes C<LC_CTYPE> is in effect, if modified as
a result of a substitution based on a regular
expression match involving any of the things mentioned in the previous
item, or of case-mapping, such as C<\l>, C<\L>, C<\u>, C<\U>, or C<\F>.

=item *

B<Output formatting functions> (C<printf()> and C<write()>):

Results are never tainted because otherwise even output from print,
for example C<print(1/7)>, would have to be tainted if C<use locale> is in
effect.

=item *

B<Case-mapping functions> (C<lc()>, C<lcfirst()>, C<uc()>, C<ucfirst()>):

Results are tainted if a C<use locale> form that includes C<LC_CTYPE> is
in effect.

=item *

B<POSIX locale-dependent functions> (C<localeconv()>, C<strcoll()>,
C<strftime()>, C<strxfrm()>):

Results are never tainted.

=back

Three examples illustrate locale-dependent tainting.
The first program, which ignores its locale, won't run: a value taken
directly from the command line may not be used to name an output file
when taint checks are enabled.

        #!/usr/local/bin/perl -T
        # Run with taint checking

        # Command line sanity check omitted...
        $tainted_output_file = shift;

        open(F, ">$tainted_output_file")
            or warn "Open of $tainted_output_file failed: $!\n";

The program can be made to run by "laundering" the tainted value through
a regular expression: the second example--which still ignores locale
information--runs, creating the file named on its command line
if it can.

        #!/usr/local/bin/perl -T

        $tainted_output_file = shift;
        $tainted_output_file =~ m%[\w/]+%;
        $untainted_output_file = $&;

        open(F, ">$untainted_output_file")
            or warn "Open of $untainted_output_file failed: $!\n";

Compare this with a similar but locale-aware program:

        #!/usr/local/bin/perl -T

        $tainted_output_file = shift;
        use locale;
        $tainted_output_file =~ m%[\w/]+%;
        $localized_output_file = $&;

        open(F, ">$localized_output_file")
            or warn "Open of $localized_output_file failed: $!\n";

This third program fails to run because C<$&> is tainted: it is the result
of a match involving C<\w> while C<use locale> is in effect.
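
A locale-aware program can still launder such a value.  L<perlsec>
suggests putting C<no locale> ahead of the expression in the same block,
so that C<\w> stops depending on C<LC_CTYPE> for that one match.  The
following is only a minimal sketch of that idea: the helper name
C<launder_filename> and the set of characters it accepts are assumptions
made for illustration, not anything Perl provides.

```perl
    use strict;
    use warnings;
    use locale;

    # Hypothetical helper: returns the captured copy of $name, or undef
    # if $name contains anything other than word characters, "/", "."
    # or "-".
    sub launder_filename {
        my ($name) = @_;
        no locale;    # for the rest of this sub, \w ignores LC_CTYPE
        return $name =~ m%\A([\w/.-]+)\z% ? $1 : undef;
    }

    # Run as "perl -T script.pl NAME" to exercise taint checking.
    my $file = launder_filename(shift(@ARGV) // 'report.txt');
    print defined $file ? "accepted: $file\n" : "rejected\n";
```

Anchoring the pattern with C<\A> and C<\z> is a deliberate choice here: an
unanchored match, as in the programs above, silently keeps the first
acceptable run of characters and discards the rest.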

=head1 ENVIRONMENT

=over 12

=item PERL_SKIP_LOCALE_INIT

This environment variable, available starting in Perl v5.20, if set
(to any value), tells Perl not to use the other locale environment
variables for initialization.  Instead, Perl uses whatever
the current locale settings are.  This is particularly useful in
embedded environments; see
L<perlembed/Using embedded Perl with POSIX locales>.

=item PERL_BADLANG

A string that can suppress Perl's warning about failed locale settings
at startup.  Failure can occur if the locale support in the operating
system is lacking (broken) in some way--or if you mistyped the name of
a locale when you set up your environment.  If this environment
variable is absent, or has a value other than "0" or "", Perl will
complain about locale setting failures.

B<NOTE>: C<PERL_BADLANG> only gives you a way to hide the warning message.
The message tells about some problem in your system's locale support,
and you should investigate what the problem is.

=back

The following environment variables are not specific to Perl: They are
part of the standardized (ISO C, XPG4, POSIX 1.c) C<setlocale()> method
for controlling an application's opinion on data.  Windows is non-POSIX,
but Perl arranges for the following to work as described anyway.
If the locale given by an environment variable is not valid, Perl tries
the next lower one in priority.  If none are valid, on Windows, the
system default locale is then tried.  If all else fails, the C<"C">
locale is used.  If even that doesn't work, something is badly broken,
but Perl tries to forge ahead with whatever the locale settings might
be.

=over 12

=item C<LC_ALL>

C<LC_ALL> is the "override-all" locale environment variable. If
set, it overrides all the rest of the locale environment variables.

=item C<LANGUAGE>

B<NOTE>: C<LANGUAGE> is a GNU extension; it affects you only if you
are using the GNU libc.  This is the case if you are using e.g. Linux.
If you are using "commercial" Unixes you are most probably I<not>
using GNU libc and you can ignore C<LANGUAGE>.

However, in the case you are using C<LANGUAGE>: it affects the
language of informational, warning, and error messages output by
commands (in other words, it's like C<LC_MESSAGES>) but it has higher
priority than C<LC_ALL>.  Moreover, it's not a single value but
instead a "path" (":"-separated list) of I<languages> (not locales).
See the GNU C<gettext> library documentation for more information.

=item C<LC_CTYPE>

In the absence of C<LC_ALL>, C<LC_CTYPE> chooses the character type
locale.  In the absence of both C<LC_ALL> and C<LC_CTYPE>, C<LANG>
chooses the character type locale.

=item C<LC_COLLATE>

In the absence of C<LC_ALL>, C<LC_COLLATE> chooses the collation
(sorting) locale.  In the absence of both C<LC_ALL> and C<LC_COLLATE>,
C<LANG> chooses the collation locale.

=item C<LC_MONETARY>

In the absence of C<LC_ALL>, C<LC_MONETARY> chooses the monetary
formatting locale.  In the absence of both C<LC_ALL> and C<LC_MONETARY>,
C<LANG> chooses the monetary formatting locale.

=item C<LC_NUMERIC>

In the absence of C<LC_ALL>, C<LC_NUMERIC> chooses the numeric format
locale.  In the absence of both C<LC_ALL> and C<LC_NUMERIC>, C<LANG>
chooses the numeric format.

=item C<LC_TIME>

In the absence of C<LC_ALL>, C<LC_TIME> chooses the date and time
formatting locale.  In the absence of both C<LC_ALL> and C<LC_TIME>,
C<LANG> chooses the date and time formatting locale.

=item C<LANG>

C<LANG> is the "catch-all" locale environment variable. If it is set, it
is used as the last resort after the overall C<LC_ALL> and the
category-specific C<LC_I<foo>>.

=back
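
The precedence among C<LC_ALL>, the category-specific variables, and
C<LANG> can be modeled in a few lines.  This is a hypothetical sketch of
the lookup order only, not of how Perl or the C library implements it.
The function name C<effective_locale> is made up, the rule that an empty
value counts as unset is an assumption, and the sketch does not check
whether a locale name is valid and ignores C<LANGUAGE>.

```perl
    use strict;
    use warnings;

    # Hypothetical model of the lookup order described above.
    sub effective_locale {
        my ($category, $env) = @_;    # e.g. ('LC_CTYPE', \%ENV)
        for my $name ('LC_ALL', $category, 'LANG') {
            my $value = $env->{$name};
            # Assumption: an empty value is treated as if it were unset.
            return $value if defined $value && length $value;
        }
        return 'C';                   # last resort
    }

    print "LC_COLLATE resolves to: ",
        effective_locale('LC_COLLATE', \%ENV), "\n";
```

Passing the environment in as a hash reference, rather than reading
C<%ENV> directly, keeps the function easy to try out with made-up
settings.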

=head2 Examples

The C<LC_NUMERIC> controls the numeric output:

   use locale;
   use POSIX qw(locale_h); # Imports setlocale() and the LC_ constants.
   setlocale(LC_NUMERIC, "fr_FR") or die "Pardon";
   printf "%g\n", 1.23; # If the "fr_FR" succeeded, probably shows 1,23.

and also how strings are parsed by C<POSIX::strtod()> as numbers:

   use locale;
   use POSIX qw(locale_h strtod);
   setlocale(LC_NUMERIC, "de_DE") or die "Entschuldigung";
   my $x = strtod("2,34") + 5;
   print $x, "\n"; # Probably shows 7,34.
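
To see which decimal point character a locale uses without formatting a
number, C<POSIX::localeconv()> can be consulted.  The following sketch
assumes nothing about which locales are installed beyond C<"C">; the
helper name C<decimal_point> is made up for the example.

```perl
    use strict;
    use warnings;
    use POSIX qw(locale_h localeconv);

    # Returns the radix character of the current LC_NUMERIC locale.
    sub decimal_point {
        my $conv = localeconv();    # a hash reference
        return $conv->{decimal_point};
    }

    setlocale(LC_NUMERIC, "C");
    print "C locale uses '", decimal_point(), "'\n";

    # "fr_FR" may not be installed, so a failure is tolerated here.
    if (setlocale(LC_NUMERIC, "fr_FR")) {
        print "fr_FR uses '", decimal_point(), "'\n";
    }
```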

=head1 NOTES

=head2 String C<eval> and C<LC_NUMERIC>

A string L<eval|perlfunc/eval EXPR> parses its expression as standard
Perl.  It is therefore expecting the decimal point to be a dot.  If
C<LC_NUMERIC> is set to have this be a comma instead, the parsing will
be confused, perhaps silently.

 use locale;
 use POSIX qw(locale_h);
 setlocale(LC_NUMERIC, "fr_FR") or die "Pardon";
 my $a = 1.2;
 print eval "$a + 1.5";
 print "\n";

prints C<13,5>.  This is because in that locale, the comma is the
decimal point character.  The C<eval> thus expands to:

 eval "1,2 + 1.5"

and the result is not what you likely expected.  No warnings are
generated.  If you do string C<eval>'s within the scope of
S<C<use locale>>, you should instead change the C<eval> line to do
something like:

 print eval "no locale; $a + 1.5";

This prints C<2.7>.

You could also exclude C<LC_NUMERIC>, if you don't need it, by

 use locale ':!numeric';

=head2 Backward compatibility

Versions of Perl prior to 5.004 B<mostly> ignored locale information,
generally behaving as if something similar to the C<"C"> locale were
always in force, even if the program environment suggested otherwise
(see L</The setlocale function>).  By default, Perl still behaves this
way for backward compatibility.  If you want a Perl application to pay
attention to locale information, you B<must> use the S<C<use locale>>
pragma (see L</The "use locale" pragma>) or, in the unlikely event
that you want to do so for just pattern matching, the
C</l> regular expression modifier (see L<perlre/Character set
modifiers>) to instruct it to do so.

Versions of Perl from 5.002 to 5.003 did use the C<LC_CTYPE>
information if available; that is, C<\w> did understand what
were the letters according to the locale environment variables.
The problem was that the user had no control over the feature:
if the C library supported locales, Perl used them.

=head2 I18N:Collate obsolete

In versions of Perl prior to 5.004, per-locale collation was possible
using the C<I18N::Collate> library module.  This module is now mildly
obsolete and should be avoided in new applications.  The C<LC_COLLATE>
functionality is now integrated into the Perl core language: One can
use locale-specific scalar data completely normally with C<use locale>,
so there is no longer any need to juggle with the scalar references of
C<I18N::Collate>.

=head2 Sort speed and memory use impacts

Comparing and sorting by locale is usually slower than the default
sorting; slow-downs of two to four times have been observed.  It will
also consume more memory: once a Perl scalar variable has participated
in any string comparison or sorting operation obeying the locale
collation rules, it will take 3-15 times more memory than before.  (The
exact multiplier depends on the string's contents, the operating system
and the locale.) These downsides are dictated more by the operating
system's implementation of the locale system than by Perl.
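
If the cost of repeated locale comparisons matters, one approach worth
measuring is to transform each string once with C<POSIX::strxfrm()> and
then sort on the transformed keys with a plain C<cmp>.  This is a sketch
under that assumption, with a made-up function name; whether it is
faster than C<sort> under C<use locale> depends on the data and on the
platform, so benchmark before relying on it.

```perl
    use strict;
    use warnings;
    use POSIX qw(locale_h strxfrm);

    # Sort strings according to the current LC_COLLATE locale,
    # transforming each string once rather than on every comparison.
    sub sort_by_collation {
        my @keyed = map { [ strxfrm($_), $_ ] } @_;
        return map { $_->[1] } sort { $a->[0] cmp $b->[0] } @keyed;
    }

    setlocale(LC_COLLATE, "");    # adopt the environment's collation locale
    print join(" ", sort_by_collation(qw(pear Apple apple Pear))), "\n";
```

Note that the comparison itself is deliberately outside the scope of
C<use locale>, so that C<cmp> compares the transformed keys as they are.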

=head2 Freely available locale definitions

The Unicode CLDR project extracts the POSIX portion of many of its
locales, available at

  http://unicode.org/Public/cldr/2.0.1/

(Newer versions of CLDR require you to compute the POSIX data yourself.
See L<http://unicode.org/Public/cldr/latest/>.)

There is a large collection of locale definitions at:

  http://std.dkuug.dk/i18n/WG15-collection/locales/

You should be aware that it is
unsupported, and is not claimed to be fit for any purpose.  If your
system allows installation of arbitrary locales, you may find the
definitions useful as they are, or as a basis for the development of
your own locales.

=head2 I18n and l10n

"Internationalization" is often abbreviated as B<i18n> because its first
and last letters are separated by eighteen others.  (You may guess why
the internalin ... internaliti ... i18n tends to get abbreviated.)  In
the same way, "localization" is often abbreviated to B<l10n>.

=head2 An imperfect standard

Internationalization, as defined in the C and POSIX standards, can be
criticized as incomplete, ungainly, and having too large a granularity.
(Locales apply to a whole process, when it would arguably be more useful
to have them apply to a single thread, window group, or whatever.)  They
also have a tendency, like standards groups, to divide the world into
nations, when we all know that the world can equally well be divided
into bankers, bikers, gamers, and so on.

=head1 Unicode and UTF-8

The support of Unicode is new starting from Perl version v5.6, and more fully
implemented in versions v5.8 and later.  See L<perluniintro>.

Starting in Perl v5.20, UTF-8 locales are supported in Perl, except
C<LC_COLLATE> is only partially supported; collation support is improved
in Perl v5.26 to a level that may be sufficient for your needs
(see L</Category C<LC_COLLATE>: Collation: Text Comparisons and Sorting>).

If you have Perl v5.16 or v5.18 and can't upgrade, you can use

    use locale ':not_characters';

When this form of the pragma is used, only the non-character portions of
locales are used by Perl, for example C<LC_NUMERIC>.  Perl assumes that
you have translated all the characters it is to operate on into Unicode
(actually the platform's native character set (ASCII or EBCDIC) plus
Unicode).  For data in files, this can conveniently be done by also
specifying

    use open ':locale';

This pragma arranges for all inputs from files to be translated into
Unicode from the current locale as specified in the environment (see
L</ENVIRONMENT>), and all outputs to files to be translated back
into the locale.  (See L<open>).  On a per-filehandle basis, you can
instead use the L<PerlIO::locale> module, or the L<Encode::Locale>
module, both available from CPAN.  The latter module also has methods to
ease the handling of C<ARGV> and environment variables, and can be used
on individual strings.  If you know that all your locales will be
UTF-8, as many are these days, you can use the L<B<-C>|perlrun/-C>
command line switch.

This form of the pragma allows essentially seamless handling of locales
with Unicode.  The collation order will be by Unicode code point order.
L<Unicode::Collate> can be used to get Unicode rules collation.
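
As a small illustration of the difference between the two orders, the
following sketch sorts the same list twice, once by code point and once
with L<Unicode::Collate> using its default settings.  The sample words
are arbitrary, and no locale is involved at all.

```perl
    use strict;
    use warnings;
    use Unicode::Collate;

    my $collator = Unicode::Collate->new;
    my @names    = ("banana", "apple", "Cherry");

    # Code point order: the capital "C" sorts ahead of lowercase letters.
    print join(" ", sort @names), "\n";

    # Unicode collation order: letters are compared before case.
    print join(" ", $collator->sort(@names)), "\n";
```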

All the modules and switches just described can be used in v5.20 with
just plain C<use locale>, and, should the input locales not be UTF-8,
you'll get the less than ideal behavior, described below, that you get
with pre-v5.16 Perls, or when you use the locale pragma without the
C<:not_characters> parameter in v5.16 and v5.18.  If you are using
exclusively UTF-8 locales in v5.20 and higher, the rest of this section
does not apply to you.

There are two cases, multi-byte and single-byte locales.  First
multi-byte:

The only multi-byte (or wide character) locale that Perl is ever likely
to support is UTF-8.  This is due to the difficulty of implementation,
the fact that high quality UTF-8 locales are now published for every
area of the world (L<http://unicode.org/Public/cldr/2.0.1/> for
ones that are already set-up, but from an earlier version;
L<http://unicode.org/Public/cldr/latest/> for the most up-to-date, but
you have to extract the POSIX information yourself), and that
failing all that you can use the L<Encode> module to translate to/from
your locale.  So, you'll have to do one of those things if you're using
one of these locales, such as Big5 or Shift JIS.  For UTF-8 locales, in
Perls (pre v5.20) that don't have full UTF-8 locale support, they may
work reasonably well (depending on your C library implementation)
simply because both
they and Perl store characters that take up multiple bytes the same way.
However, some, if not most, C library implementations may not process
the characters in the upper half of the Latin-1 range (128 - 255)
properly under C<LC_CTYPE>.  To see if a character is a particular type
under a locale, Perl uses the functions like C<isalnum()>.  Your C
library may not work for UTF-8 locales with those functions, instead
only working under the newer wide library functions like C<iswalnum()>,
which Perl does not use.
These multi-byte locales are treated like single-byte locales, and will
have the restrictions described below.  Starting in Perl v5.22 a warning
message is raised when Perl detects a multi-byte locale that it doesn't
fully support.

For single-byte locales,
Perl generally takes the tack of using locale rules on code points that can fit
in a single byte, and Unicode rules for those that can't (though this
isn't uniformly applied, see the note at the end of this section).  This
prevents many problems in locales that aren't UTF-8.  Suppose the locale
is ISO8859-7, Greek.  The character at 0xD7 there is a capital Chi. But
in the ISO8859-1 locale, Latin1, it is a multiplication sign.  The POSIX
regular expression character class C<[[:alpha:]]> will magically match
0xD7 in the Greek locale but not in the Latin one.

However, there are places where this breaks down.  Certain Perl constructs are
for Unicode only, such as C<\p{Alpha}>.  They assume that 0xD7 always has its
Unicode meaning (or the equivalent on EBCDIC platforms).  Since Latin1 is a
subset of Unicode and 0xD7 is the multiplication sign in both Latin1 and
Unicode, C<\p{Alpha}> will never match it, regardless of locale.  A similar
issue occurs with C<\N{...}>.  Prior to v5.20, it is therefore a bad
idea to use C<\p{}> or
C<\N{}> under plain C<use locale>--I<unless> you can guarantee that the
locale will be ISO8859-1.  Use POSIX character classes instead.

Another problem with this approach is that operations that cross the
single byte/multiple byte boundary are not well-defined, and so are
disallowed.  (This boundary is between the codepoints at 255/256.)
For example, lower casing LATIN CAPITAL LETTER Y WITH DIAERESIS (U+0178)
should return LATIN SMALL LETTER Y WITH DIAERESIS (U+00FF).  But in the
Greek locale, for example, there is no character at 0xFF, and Perl
has no way of knowing what the character at 0xFF is really supposed to
represent.  Thus it disallows the operation.  In this mode, the
lowercase of U+0178 is itself.

The same problems ensue if you enable automatic UTF-8-ification of your
standard file handles, default C<open()> layer, and C<@ARGV> on non-ISO8859-1,
non-UTF-8 locales (by using either the B<-C> command line switch or the
C<PERL_UNICODE> environment variable; see L<perlrun>).
Things are read in as UTF-8, which would normally imply a Unicode
interpretation, but the presence of a locale causes them to be interpreted
in that locale instead.  For example, a 0xD7 code point in the Unicode
input, which should mean the multiplication sign, won't be interpreted by
Perl that way under the Greek locale.  This is not a problem
I<provided> you make certain that all locales will always and only be either
an ISO8859-1, or, if you don't have a deficient C library, a UTF-8 locale.

Still another problem is that this approach can lead to two code
points meaning the same character.  Thus in a Greek locale, both U+03A7
and U+00D7 are GREEK CAPITAL LETTER CHI.

Because of all these problems, starting in v5.22, Perl will raise a
warning if a multi-byte (hence Unicode) code point is used when a
single-byte locale is in effect.  (Although it doesn't check for this if
doing so would unreasonably slow execution down.)

Vendor locales are notoriously buggy, and it is difficult for Perl to test
its locale-handling code because this interacts with code that Perl has no
control over; therefore the locale-handling code in Perl may be buggy as
well.  (However, the Unicode-supplied locales should be better, and
there is a feedback mechanism to correct any problems.  See
L</Freely available locale definitions>.)

If you have Perl v5.16, the problems mentioned above go away if you use
the C<:not_characters> parameter to the locale pragma (except for vendor
bugs in the non-character portions).  If you don't have v5.16, and you
I<do> have locales that work, using them may be worthwhile for certain
specific purposes, as long as you keep in mind the gotchas already
mentioned.  For example, if the collation for your locales works, it
runs faster under locales than under L<Unicode::Collate>; and you gain
access to such things as the local currency symbol and the names of the
months and days of the week.  (But to hammer home the point, in v5.16,
you get this access without the downsides of locales by using the
C<:not_characters> form of the pragma.)

Note: The policy of using locale rules for code points that can fit in a
byte, and Unicode rules for those that can't is not uniformly applied.
Pre-v5.12, it was somewhat haphazard; in v5.12 it was applied fairly
consistently to regular expression matching except for bracketed
character classes; in v5.14 it was extended to all regex matches; and in
v5.16 to the casing operations such as C<\L> and C<uc()>.  For
collation, in all releases so far, the system's C<strxfrm()> function is
called, and whatever it does is what you get.  Starting in v5.26, various
bugs are fixed with the way perl uses this function.

=head1 BUGS

=head2 Collation of strings containing embedded C<NUL> characters

C<NUL> characters will sort the same as the lowest collating control
character does, or to C<"\001"> in the unlikely event that there are no
control characters at all in the locale.  In cases where the strings
don't contain this non-C<NUL> control, the results will be correct, and
in many locales, this control, whatever it might be, will rarely be
encountered.  But there are cases where a C<NUL> should sort before this
control, but doesn't.  If two strings do collate identically, the one
containing the C<NUL> will sort earlier.

=head2 Broken systems

In certain systems, the operating system's locale support
is broken and cannot be fixed or used by Perl.  Such deficiencies can
and will result in mysterious hangs and/or Perl core dumps when
C<use locale> is in effect.  When confronted with such a system,
please report in excruciating detail to <F<perlbug@perl.org>>, and
also contact your vendor: bug fixes may exist for these problems
in your operating system.  Sometimes such bug fixes are called an
operating system upgrade.  If you have the source for Perl, include in
the perlbug email the output of the test described above in L</Testing
for broken locales>.

=head1 SEE ALSO

L<I18N::Langinfo>, L<perluniintro>, L<perlunicode>, L<open>,
L<POSIX/isalnum>, L<POSIX/isalpha>,
L<POSIX/isdigit>, L<POSIX/isgraph>, L<POSIX/islower>,
L<POSIX/isprint>, L<POSIX/ispunct>, L<POSIX/isspace>,
L<POSIX/isupper>, L<POSIX/isxdigit>, L<POSIX/localeconv>,
L<POSIX/setlocale>, L<POSIX/strcoll>, L<POSIX/strftime>,
L<POSIX/strtod>, L<POSIX/strxfrm>.

For special considerations when Perl is embedded in a C program,
see L<perlembed/Using embedded Perl with POSIX locales>.

=head1 HISTORY

Jarkko Hietaniemi's original F<perli18n.pod> heavily hacked by Dominic
Dunlop, assisted by the perl5-porters.  Prose worked over a bit by
Tom Christiansen, and updated by Perl 5 porters.

If you read this file _as_is_, just ignore the funny characters you
see.  It is written in the POD format (see pod/perlpod.pod) which is
specifically designed to be readable as is.

=head1 NAME

perlirix - Perl version 5 on Irix systems

=head1 DESCRIPTION

This document describes various features of Irix that will affect how Perl
version 5 (hereafter just Perl) is compiled and/or runs.

=head2 Building 32-bit Perl in Irix

Use

	sh Configure -Dcc='cc -n32'

to compile Perl 32-bit.  Don't bother with -n32 unless you have 7.1
or later compilers (use cc -version to check).

(Building 'cc -n32' is the default.)

=head2 Building 64-bit Perl in Irix

Use

	sh Configure -Dcc='cc -64' -Duse64bitint

This requires a 64-bit MIPS CPU (R8000, R10000, ...).

You can also use

	sh Configure -Dcc='cc -64' -Duse64bitall

but that makes no difference compared with the -Duse64bitint because
of the C<cc -64>.

You can also do

	sh Configure -Dcc='cc -n32' -Duse64bitint

to use long longs for the 64-bit integer type, in case you don't
have a 64-bit CPU.

If you are using gcc, just

	sh Configure -Dcc=gcc -Duse64bitint

should be enough; Configure should automatically probe for the
correct 64-bit settings.

=head2 About Compiler Versions of Irix

Some Irix cc versions, e.g. 7.3.1.1m (try cc -version) have been known
to have issues (coredumps) when compiling perl.c.  If you've used
-OPT:fast_io=ON and this happens, try removing it.  If that fails, or
you didn't use that, then try adjusting other optimization options
(-LNO, -INLINE, -O3 to -O2, etcetera).  The compiler bug has been
reported to SGI.  (Allen Smith <easmith@beatrice.rutgers.edu>)

=head2 Linker Problems in Irix

If you get complaints about so_locations then search in the file
hints/irix_6.sh for "lddflags" and do the suggested adjustments.
(David Billinghurst <David.Billinghurst@riotinto.com.au>)

=head2 Malloc in Irix

Do not try to use Perl's malloc; doing so leads to very mysterious
errors (especially with -Duse64bitall).

=head2 Building with threads in Irix

Run Configure with -Duseithreads which will configure Perl with
the Perl 5.8.0 "interpreter threads", see L<threads>.

For Irix 6.2 with perl threads, you have to have the following
patches installed:

        1404 Irix 6.2 Posix 1003.1b man pages
        1645 Irix 6.2 & 6.3 POSIX header file updates
        2000 Irix 6.2 Posix 1003.1b support modules
        2254 Pthread library fixes
        2401 6.2 all platform kernel rollup

B<IMPORTANT>: Without patch 2401, a kernel bug in Irix 6.2 will cause
your machine to panic and crash when running threaded perl.  Irix 6.3
and later are okay.

    Thanks to Hannu Napari <Hannu.Napari@hut.fi> for the IRIX
    pthreads patches information.

=head2 Irix 5.3

While running Configure and when building, you are likely to get
quite a few of these warnings:

  ld:
  The shared object /usr/lib/libm.so did not resolve any symbols.
        You may want to remove it from your link line.

Ignore them: in IRIX 5.3 there is no way to quieten ld about this.

During compilation you will see this warning from toke.c:

  uopt: Warning: Perl_yylex: this procedure not optimized because it
        exceeds size threshold; to optimize this procedure, use -Olimit
        option with value >= 4252.

Ignore the warning.

In IRIX 5.3 and with Perl 5.8.1 (Perl 5.8.0 didn't compile in IRIX 5.3)
the following failures are known.

 Failed Test                  Stat Wstat Total Fail  Failed  List of Failed
 -----------------------------------------------------------------------
 ../ext/List/Util/t/shuffle.t    0   139    ??   ??       %  ??
 ../lib/Math/Trig.t            255 65280    29   12  41.38%  24-29
 ../lib/sort.t                   0   138   119   72  60.50%  48-119
 56 tests and 474 subtests skipped.
 Failed 3/811 test scripts, 99.63% okay. 78/75813 subtests failed,
    99.90% okay.

They are suspected to be compiler errors (at least the shuffle.t
failure is known from some IRIX 6 setups) and math library errors
(the Trig.t failure), but since IRIX 5 is long since end-of-lifed,
further fixes for IRIX 5 are unlikely.  If you can get gcc for 5.3,
you could try that, too, since gcc in IRIX 6 is a known workaround for
at least the shuffle.t and sort.t failures.

=head1 AUTHOR

Jarkko Hietaniemi <jhi@iki.fi>

Please report any errors, updates, or suggestions to F<perlbug@perl.org>.


If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perlmacos - Perl under Mac OS (Classic)

=head1 SYNOPSIS

For Mac OS X see README.macosx

Perl under Mac OS Classic has not been supported since before Perl 5.10
(April 2004).

When we say "Mac OS" below, we mean Mac OS 7, 8, and 9, and I<not>
Mac OS X.

=head1 DESCRIPTION

The port of Perl to Mac OS was officially removed as of Perl 5.12,
though the last official production release of MacPerl corresponded to
Perl 5.6.  While Perl 5.10 included the port to Mac OS, ExtUtils::MakeMaker,
a core part of Perl's module installation infrastructure, officially
dropped support for Mac OS in April 2004.

=head1 AUTHOR

Perl was ported to Mac OS by Matthias Neeracher
E<lt>neeracher@mac.comE<gt>. Chris Nandor E<lt>pudge@pobox.comE<gt>
continued development and maintenance for the duration of the port's life.

=head1 NAME

perlembed - how to embed perl in your C program

=head1 DESCRIPTION

=head2 PREAMBLE

Do you want to:

=over 5

=item B<Use C from Perl?>

Read L<perlxstut>, L<perlxs>, L<h2xs>, L<perlguts>, and L<perlapi>.

=item B<Use a Unix program from Perl?>

Read about back-quotes and about C<system> and C<exec> in L<perlfunc>.

=item B<Use Perl from Perl?>

Read about L<perlfunc/do> and L<perlfunc/eval> and L<perlfunc/require>
and L<perlfunc/use>.

=item B<Use C from C?>

Rethink your design.

=item B<Use Perl from C?>

Read on...

=back

=head2 ROADMAP

=over 5

=item *

Compiling your C program

=item *

Adding a Perl interpreter to your C program

=item *

Calling a Perl subroutine from your C program

=item *

Evaluating a Perl statement from your C program

=item *

Performing Perl pattern matches and substitutions from your C program

=item *

Fiddling with the Perl stack from your C program

=item *

Maintaining a persistent interpreter

=item *

Maintaining multiple interpreter instances

=item *

Using Perl modules, which themselves use C libraries, from your C program

=item *

Embedding Perl under Win32

=back

=head2 Compiling your C program

If you have trouble compiling the scripts in this documentation,
you're not alone.  The cardinal rule: COMPILE THE PROGRAMS IN EXACTLY
THE SAME WAY THAT YOUR PERL WAS COMPILED.  (Sorry for yelling.)

Also, every C program that uses Perl must link in the I<perl library>.
What's that, you ask?  Perl is itself written in C; the perl library
is the collection of compiled C programs that were used to create your
perl executable (I</usr/bin/perl> or equivalent).  (Corollary: you
can't use Perl from your C program unless Perl has been compiled on
your machine, or installed properly--that's why you shouldn't blithely
copy Perl executables from machine to machine without also copying the
I<lib> directory.)

When you use Perl from C, your C program will--usually--allocate,
"run", and deallocate a I<PerlInterpreter> object, which is defined by
the perl library.

If your copy of Perl is recent enough to contain this documentation
(version 5.002 or later), then the perl library (and I<EXTERN.h> and
I<perl.h>, which you'll also need) will reside in a directory
that looks like this:

    /usr/local/lib/perl5/your_architecture_here/CORE

or perhaps just

    /usr/local/lib/perl5/CORE

or maybe something like

    /usr/opt/perl5/CORE

Execute this statement for a hint about where to find CORE:

    perl -MConfig -e 'print $Config{archlib}'

Here's how you'd compile the example in the next section,
L</Adding a Perl interpreter to your C program>, on my Linux box:

    % gcc -O2 -Dbool=char -DHAS_BOOL -I/usr/local/include
    -I/usr/local/lib/perl5/i586-linux/5.003/CORE
    -L/usr/local/lib/perl5/i586-linux/5.003/CORE
    -o interp interp.c -lperl -lm

(That's all one line.)  On my DEC Alpha running old 5.003_05, the
incantation is a bit different:

    % cc -O2 -Olimit 2900 -DSTANDARD_C -I/usr/local/include
    -I/usr/local/lib/perl5/alpha-dec_osf/5.00305/CORE
    -L/usr/local/lib/perl5/alpha-dec_osf/5.00305/CORE -L/usr/local/lib
    -D__LANGUAGE_C__ -D_NO_PROTO -o interp interp.c -lperl -lm

How can you figure out what to add?  Assuming your Perl is post-5.001,
execute a C<perl -V> command and pay special attention to the "cc" and
"ccflags" information.

You'll have to choose the appropriate compiler (I<cc>, I<gcc>, et al.) for
your machine: C<perl -MConfig -e 'print $Config{cc}'> will tell you what
to use.

You'll also have to choose the appropriate library directory
(I</usr/local/lib/...>) for your machine.  If your compiler complains
that certain functions are undefined, or that it can't locate
I<-lperl>, then you need to change the path following the C<-L>.  If it
complains that it can't find I<EXTERN.h> and I<perl.h>, you need to
change the path following the C<-I>.

You may have to add extra libraries as well.  Which ones?
Perhaps those printed by

   perl -MConfig -e 'print $Config{libs}'

Provided your perl binary was properly configured and installed, the
B<ExtUtils::Embed> module will determine all of this information for
you:

   % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

If the B<ExtUtils::Embed> module isn't part of your Perl distribution,
you can retrieve it from
L<http://www.perl.com/perl/CPAN/modules/by-module/ExtUtils/>.
(If this documentation came from your Perl distribution, then you're
running 5.004 or better and you already have it.)

The B<ExtUtils::Embed> kit on CPAN also contains all source code for
the examples in this document, tests, additional examples and other
information you may find useful.

=head2 Adding a Perl interpreter to your C program

In a sense, perl (the C program) is a good example of embedding Perl
(the language), so I'll demonstrate embedding with I<miniperlmain.c>,
included in the source distribution.  Here's a bastardized, non-portable
version of I<miniperlmain.c> containing the essentials of embedding:

 #include <EXTERN.h>               /* from the Perl distribution     */
 #include <perl.h>                 /* from the Perl distribution     */

 static PerlInterpreter *my_perl;  /***    The Perl interpreter    ***/

 int main(int argc, char **argv, char **env)
 {
        PERL_SYS_INIT3(&argc,&argv,&env);
        my_perl = perl_alloc();
        perl_construct(my_perl);
        PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
        perl_parse(my_perl, NULL, argc, argv, (char **)NULL);
        perl_run(my_perl);
        perl_destruct(my_perl);
        perl_free(my_perl);
        PERL_SYS_TERM();
 }

Notice that we don't use the C<env> pointer.  Normally handed to
C<perl_parse> as its final argument, C<env> here is replaced by
C<NULL>, which means that the current environment will be used.

The macros PERL_SYS_INIT3() and PERL_SYS_TERM() provide system-specific
tune up of the C runtime environment necessary to run Perl interpreters;
they should only be called once regardless of how many interpreters you
create or destroy. Call PERL_SYS_INIT3() before you create your first
interpreter, and PERL_SYS_TERM() after you free your last interpreter.

Since PERL_SYS_INIT3() may change C<env>, it may be more appropriate to
provide C<env> as an argument to perl_parse().

Also notice that no matter what arguments you pass to perl_parse(),
PERL_SYS_INIT3() must be invoked on the C main() argc, argv and env and
only once.

Note that argv[argc] must be NULL, just as it is for the arguments
passed to a main() function in C.
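
When you build the argument vector yourself instead of forwarding the
one from main(), this means adding an explicit C<NULL> after the last
argument.  Here is a minimal sketch in plain C; the C<embedding> array
and the C<count_args()> helper are hypothetical names chosen for
illustration and are not part of the Perl API:

```c
#include <stddef.h>

/* Hypothetical argument vector for an embedded interpreter.  The
   trailing NULL occupies the argv[argc] slot. */
char *embedding[] = { "", "-e", "0", NULL };

/* Count the arguments that precede the NULL terminator; the result is
   the argc value to hand to perl_parse() together with the vector. */
int count_args(char **argv)
{
    int argc = 0;

    while (argv[argc] != NULL)
        argc++;
    return argc;
}
```

With this layout the call would read
C<perl_parse(my_perl, NULL, count_args(embedding), embedding, NULL)>.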

Now compile this program (I'll call it I<interp.c>) into an executable:

    % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

After a successful compilation, you'll be able to use I<interp> just
like perl itself:

    % interp
    print "Pretty Good Perl \n";
    print "10890 - 9801 is ", 10890 - 9801;
    <CTRL-D>
    Pretty Good Perl
    10890 - 9801 is 1089

or

    % interp -e 'printf("%x", 3735928559)'
    deadbeef

You can also read and execute Perl statements from a file while in the
midst of your C program, by placing the filename in I<argv[1]> before
calling I<perl_parse>, which compiles the file; I<perl_run> then
executes it.

=head2 Calling a Perl subroutine from your C program

To call individual Perl subroutines, you can use any of the B<call_*>
functions documented in L<perlcall>.
In this example we'll use C<call_argv>.

That's shown below, in a program I'll call I<showtime.c>.

    #include <EXTERN.h>
    #include <perl.h>

    static PerlInterpreter *my_perl;

    int main(int argc, char **argv, char **env)
    {
        char *args[] = { NULL };
        PERL_SYS_INIT3(&argc,&argv,&env);
        my_perl = perl_alloc();
        perl_construct(my_perl);

        perl_parse(my_perl, NULL, argc, argv, NULL);
        PL_exit_flags |= PERL_EXIT_DESTRUCT_END;

        /*** skipping perl_run() ***/

        call_argv("showtime", G_DISCARD | G_NOARGS, args);

        perl_destruct(my_perl);
        perl_free(my_perl);
        PERL_SYS_TERM();
    }

where I<showtime> is a Perl subroutine that takes no arguments (that's the
I<G_NOARGS>) and for which I'll ignore the return value (that's the
I<G_DISCARD>).  Those flags, and others, are discussed in L<perlcall>.

I'll define the I<showtime> subroutine in a file called I<showtime.pl>:

 print "I shan't be printed.";

 sub showtime {
     print time;
 }

Simple enough. Now compile and run:

 % cc -o showtime showtime.c \
     `perl -MExtUtils::Embed -e ccopts -e ldopts`
 % showtime showtime.pl
 818284590

yielding the number of seconds that elapsed between January 1, 1970
(the beginning of the Unix epoch), and the moment I began writing this
sentence.

In this particular case we don't have to call I<perl_run>, as we set
the C<PERL_EXIT_DESTRUCT_END> bit in C<PL_exit_flags>, which executes
END blocks in I<perl_destruct>.

If you want to pass arguments to the Perl subroutine, you can add
strings to the C<NULL>-terminated C<args> list passed to
I<call_argv>.  For other data types, or to examine return values,
you'll need to manipulate the Perl stack.  That's demonstrated in
L</Fiddling with the Perl stack from your C program>.

=head2 Evaluating a Perl statement from your C program

Perl provides two API functions to evaluate pieces of Perl code.
These are L<perlapi/eval_sv> and L<perlapi/eval_pv>.

Arguably, these are the only routines you'll ever need to execute
snippets of Perl code from within your C program.  Your code can be as
long as you wish; it can contain multiple statements; it can employ
L<perlfunc/use>, L<perlfunc/require>, and L<perlfunc/do> to
include external Perl files.

I<eval_pv> lets us evaluate individual Perl strings, and then
extract variables for coercion into C types.  The following program,
I<string.c>, executes three Perl strings, extracting an C<int> from
the first, a C<float> from the second, and a C<char *> from the third.

 #include <EXTERN.h>
 #include <perl.h>

 static PerlInterpreter *my_perl;

 int main(int argc, char **argv, char **env)
 {
     char *embedding[] = { "", "-e", "0", NULL };

     PERL_SYS_INIT3(&argc,&argv,&env);
     my_perl = perl_alloc();
     perl_construct( my_perl );

     perl_parse(my_perl, NULL, 3, embedding, NULL);
     PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
     perl_run(my_perl);

     /** Treat $a as an integer **/
     eval_pv("$a = 3; $a **= 2", TRUE);
     printf("a = %d\n", (int)SvIV(get_sv("a", 0)));

     /** Treat $a as a float **/
     eval_pv("$a = 3.14; $a **= 2", TRUE);
     printf("a = %f\n", (double)SvNV(get_sv("a", 0)));

     /** Treat $a as a string **/
     eval_pv(
       "$a = 'rekcaH lreP rehtonA tsuJ'; $a = reverse($a);", TRUE);
     printf("a = %s\n", SvPV_nolen(get_sv("a", 0)));

     perl_destruct(my_perl);
     perl_free(my_perl);
     PERL_SYS_TERM();
 }

All of those strange functions with I<sv> in their names help convert Perl
scalars to C types.  They're described in L<perlguts> and L<perlapi>.

If you compile and run I<string.c>, you'll see the results of using
I<SvIV()> to create an C<int>, I<SvNV()> to create a C<float>, and
I<SvPV_nolen()> to create a string:

   a = 9
   a = 9.859600
   a = Just Another Perl Hacker

In the example above, we've created a global variable to temporarily
store the computed value of our eval'ed expression.  It is also
possible and in most cases a better strategy to fetch the return value
from I<eval_pv()> instead.  Example:

   ...
   SV *val = eval_pv("reverse 'rekcaH lreP rehtonA tsuJ'", TRUE);
   printf("%s\n", SvPV_nolen(val));
   ...

This way, we avoid namespace pollution by not creating global
variables and we've simplified our code as well.

=head2 Performing Perl pattern matches and substitutions from your C program

The I<eval_sv()> function lets us evaluate strings of Perl code, so we can
define some functions that use it to "specialize" in matches and
substitutions: I<match()>, I<substitute()>, and I<matches()>.

   I32 match(SV *string, char *pattern);

Given a string and a pattern (e.g., C<m/clasp/> or C</\b\w*\b/>, which
in your C program might appear as "/\\b\\w*\\b/"), match()
returns 1 if the string matches the pattern and 0 otherwise.

   I32 substitute(SV **string, char *pattern);

Given a pointer to an C<SV> and an C<=~> operation (e.g.,
C<s/bob/robert/g> or C<tr[A-Z][a-z]>), substitute() modifies the string
within the C<SV> according to the operation, returning the number of
substitutions made.

   SSize_t matches(SV *string, char *pattern, AV **matches);

Given an C<SV>, a pattern, and a pointer to an empty C<AV>,
matches() evaluates C<$string =~ $pattern> in a list context, and
fills in I<matches> with the array elements, returning the number of matches
found.

Here's a sample program, I<match.c>, that uses all three (long lines have
been wrapped here):

 #include <EXTERN.h>
 #include <perl.h>

 static PerlInterpreter *my_perl;

 /** my_eval_sv(code, error_check)
 ** kinda like eval_sv(),
 ** but we pop the return value off the stack
 **/
 SV* my_eval_sv(SV *sv, I32 croak_on_error)
 {
     dSP;
     SV* retval;


     PUSHMARK(SP);
     eval_sv(sv, G_SCALAR);

     SPAGAIN;
     retval = POPs;
     PUTBACK;

     if (croak_on_error && SvTRUE(ERRSV))
         croak("%s", SvPVx_nolen(ERRSV)); /* don't use $@ as a format */

     return retval;
 }

 /** match(string, pattern)
 **
 ** Used for matches in a scalar context.
 **
 ** Returns 1 if the match was successful; 0 otherwise.
 **/

 I32 match(SV *string, char *pattern)
 {
     SV *command = newSV(0), *retval;

     sv_setpvf(command, "my $string = '%s'; $string =~ %s",
 	      SvPV_nolen(string), pattern);

     retval = my_eval_sv(command, TRUE);
     SvREFCNT_dec(command);

     return SvIV(retval);
 }

 /** substitute(string, pattern)
 **
 ** Used for =~ operations that
 ** modify their left-hand side (s/// and tr///)
 **
 ** Returns the number of successful matches, and
 ** modifies the input string if there were any.
 **/

 I32 substitute(SV **string, char *pattern)
 {
     SV *command = newSV(0), *retval;

     sv_setpvf(command, "$string = '%s'; ($string =~ %s)",
 	      SvPV_nolen(*string), pattern);

     retval = my_eval_sv(command, TRUE);
     SvREFCNT_dec(command);

     *string = get_sv("string", 0);
     return SvIV(retval);
 }

 /** matches(string, pattern, matches)
 **
 ** Used for matches in a list context.
 **
 ** Returns the number of matches,
 ** and fills in **matches with the matching substrings
 **/

 SSize_t matches(SV *string, char *pattern, AV **match_list)
 {
     SV *command = newSV(0);
     SSize_t num_matches;

     sv_setpvf(command, "my $string = '%s'; @array = ($string =~ %s)",
 	      SvPV_nolen(string), pattern);

     my_eval_sv(command, TRUE);
     SvREFCNT_dec(command);

     *match_list = get_av("array", 0);
     num_matches = av_top_index(*match_list) + 1;

     return num_matches;
 }

 int main(int argc, char **argv, char **env)
 {
     char *embedding[] = { "", "-e", "0", NULL };
     AV *match_list;
     I32 num_matches, i;
     SV *text;

     PERL_SYS_INIT3(&argc,&argv,&env);
     my_perl = perl_alloc();
     perl_construct(my_perl);
     perl_parse(my_perl, NULL, 3, embedding, NULL);
     PL_exit_flags |= PERL_EXIT_DESTRUCT_END;

     text = newSV(0);
     sv_setpv(text, "When he is at a convenience store and the "
	"bill comes to some amount like 76 cents, Maynard is "
	"aware that there is something he *should* do, something "
	"that will enable him to get back a quarter, but he has "
	"no idea *what*.  He fumbles through his red squeezey "
	"changepurse and gives the boy three extra pennies with "
	"his dollar, hoping that he might luck into the correct "
	"amount.  The boy gives him back two of his own pennies "
	"and then the big shiny quarter that is his prize. "
	"-RICHH");

     if (match(text, "m/quarter/")) /** Does text contain 'quarter'? **/
 	printf("match: Text contains the word 'quarter'.\n\n");
     else
 	printf("match: Text doesn't contain the word 'quarter'.\n\n");

     if (match(text, "m/eighth/")) /** Does text contain 'eighth'? **/
 	printf("match: Text contains the word 'eighth'.\n\n");
     else
 	printf("match: Text doesn't contain the word 'eighth'.\n\n");

     /** Match all occurrences of /wi../ **/
     num_matches = matches(text, "m/(wi..)/g", &match_list);
     printf("matches: m/(wi..)/g found %d matches...\n", num_matches);

     for (i = 0; i < num_matches; i++)
         printf("match: %s\n",
                  SvPV_nolen(*av_fetch(match_list, i, FALSE)));
     printf("\n");

     /** Remove all vowels from text **/
     num_matches = substitute(&text, "s/[aeiou]//gi");
     if (num_matches) {
 	printf("substitute: s/[aeiou]//gi...%lu substitutions made.\n",
 	       (unsigned long)num_matches);
 	printf("Now text is: %s\n\n", SvPV_nolen(text));
     }

     /** Attempt a substitution **/
     if (!substitute(&text, "s/Perl/C/")) {
 	printf("substitute: s/Perl/C...No substitution made.\n\n");
     }

     SvREFCNT_dec(text);
     PL_perl_destruct_level = 1;
     perl_destruct(my_perl);
     perl_free(my_perl);
     PERL_SYS_TERM();
 }

which produces the output (again, long lines have been wrapped here):

  match: Text contains the word 'quarter'.

  match: Text doesn't contain the word 'eighth'.

  matches: m/(wi..)/g found 2 matches...
  match: will
  match: with

  substitute: s/[aeiou]//gi...139 substitutions made.
  Now text is: Whn h s t  cnvnnc str nd th bll cms t sm mnt lk 76 cnts,
  Mynrd s wr tht thr s smthng h *shld* d, smthng tht wll nbl hm t gt
  bck qrtr, bt h hs n d *wht*.  H fmbls thrgh hs rd sqzy chngprs nd
  gvs th by thr xtr pnns wth hs dllr, hpng tht h mght lck nt th crrct
  mnt.  Th by gvs hm bck tw f hs wn pnns nd thn th bg shny qrtr tht s
  hs prz. -RCHH

  substitute: s/Perl/C...No substitution made.
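
The match(), substitute(), and matches() helpers above splice the
subject text between single quotes in the generated Perl code, so a
subject containing C<'> or C<\> would produce code that fails to compile
or means something else.  One possible guard is sketched below in plain
C, on the assumption that escaping those two characters is all a
single-quoted Perl literal needs.  The name C<quote_for_perl()> is
hypothetical and not part of the Perl API:

```c
#include <stddef.h>

/* Hypothetical helper: copy src into dst, putting a backslash in front
   of every ' and \ so that the result can be placed between single
   quotes in generated Perl code.  Returns the length written (not
   counting the terminating NUL), or (size_t)-1 if dst is too small. */
size_t quote_for_perl(const char *src, char *dst, size_t dst_size)
{
    size_t out = 0;

    for (; *src != '\0'; src++) {
        if (*src == '\'' || *src == '\\') {
            if (out + 1 >= dst_size)
                return (size_t)-1;
            dst[out++] = '\\';          /* escape character */
        }
        if (out + 1 >= dst_size)        /* keep room for the NUL */
            return (size_t)-1;
        dst[out++] = *src;
    }
    if (out >= dst_size)
        return (size_t)-1;
    dst[out] = '\0';
    return out;
}
```

The escaped copy, rather than the raw C<SvPV_nolen(string)>, would then
be the value handed to sv_setpvf().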

=head2 Fiddling with the Perl stack from your C program

When trying to explain stacks, most computer science textbooks mumble
something about spring-loaded columns of cafeteria plates: the last
thing you pushed on the stack is the first thing you pop off.  That'll
do for our purposes: your C program will push some arguments onto "the Perl
stack", shut its eyes while some magic happens, and then pop the
results--the return value of your Perl subroutine--off the stack.

First you'll need to know how to convert between C types and Perl
types, with newSViv() and sv_setnv() and newAV() and all their
friends.  They're described in L<perlguts> and L<perlapi>.

Then you'll need to know how to manipulate the Perl stack.  That's
described in L<perlcall>.

Once you've understood those, embedding Perl in C is easy.

Because C has no builtin function for integer exponentiation, let's
make Perl's ** operator available to it (this is less useful than it
sounds, because Perl implements ** with C's I<pow()> function).  First
I'll create a stub exponentiation function in I<power.pl>:

    sub expo {
        my ($a, $b) = @_;
        return $a ** $b;
    }

Now I'll create a C program, I<power.c>, with a function
I<PerlPower()> that contains all the perlguts necessary to push the
two arguments into I<expo()> and to pop the return value out.  Take a
deep breath...

 #include <EXTERN.h>
 #include <perl.h>

 static PerlInterpreter *my_perl;

 static void
 PerlPower(int a, int b)
 {
   dSP;                            /* initialize stack pointer      */
   ENTER;                          /* everything created after here */
   SAVETMPS;                       /* ...is a temporary variable.   */
   PUSHMARK(SP);                   /* remember the stack pointer    */
   XPUSHs(sv_2mortal(newSViv(a))); /* push the base onto the stack  */
   XPUSHs(sv_2mortal(newSViv(b))); /* push the exponent onto stack  */
   PUTBACK;                      /* make local stack pointer global */
   call_pv("expo", G_SCALAR);      /* call the function             */
   SPAGAIN;                        /* refresh stack pointer         */
                                 /* pop the return value from stack */
   printf ("%d to the %dth power is %d.\n", a, b, (int)POPi);
   PUTBACK;
   FREETMPS;                       /* free that return value        */
   LEAVE;                       /* ...and the XPUSHed "mortal" args.*/
 }

 int main (int argc, char **argv, char **env)
 {
   char *my_argv[] = { "", "power.pl", NULL };

   PERL_SYS_INIT3(&argc,&argv,&env);
   my_perl = perl_alloc();
   perl_construct( my_perl );

   perl_parse(my_perl, NULL, 2, my_argv, (char **)NULL);
   PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
   perl_run(my_perl);

   PerlPower(3, 4);                      /*** Compute 3 ** 4 ***/

   perl_destruct(my_perl);
   perl_free(my_perl);
   PERL_SYS_TERM();
 }



Compile and run:

    % cc -o power power.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

    % power
    3 to the 4th power is 81.

=head2 Maintaining a persistent interpreter

When developing interactive and/or potentially long-running
applications, it's a good idea to maintain a persistent interpreter
rather than allocating and constructing a new interpreter multiple
times.  The major reason is speed, since Perl will only be loaded into
memory once.

However, you have to be more cautious with namespace and variable
scoping when using a persistent interpreter.  In previous examples
we've been using global variables in the default package C<main>.  We
knew exactly what code would be run, and assumed we could avoid
variable collisions and outrageous symbol table growth.

Let's say your application is a server that will occasionally run Perl
code from some arbitrary file.  Your server has no way of knowing what
code it's going to run.  Very dangerous.

If the file is pulled in by C<perl_parse()>, compiled into a newly
constructed interpreter, and then cleaned out with
C<perl_destruct()>, you're shielded from most namespace
troubles.

One way to avoid namespace collisions in this scenario is to translate
the filename into a guaranteed-unique package name, and then compile
the code into that package using L<perlfunc/eval>.  In the example
below, each file will only be compiled once.  Or, the application
might choose to clean out the symbol table associated with the file
after it's no longer needed.  Using L<perlapi/call_argv>, we'll
call the subroutine C<Embed::Persistent::eval_file> which lives in the
file C<persistent.pl> and pass the filename and boolean cleanup/cache
flag as arguments.

Note that the process will continue to grow for each file that it
uses.  In addition, there might be C<AUTOLOAD>ed subroutines and other
conditions that cause Perl's symbol table to grow.  You might want to
add some logic that keeps track of the process size, or restarts
itself after a certain number of requests, to ensure that memory
consumption is minimized.  You'll also want to scope your variables
with L<perlfunc/my> whenever possible.


 package Embed::Persistent;
 #persistent.pl

 use strict;
 our %Cache;
 use Symbol qw(delete_package);

 sub valid_package_name {
     my($string) = @_;
     $string =~ s/([^A-Za-z0-9\/])/sprintf("_%2x",unpack("C",$1))/eg;
     # second pass only for words starting with a digit
     $string =~ s|/(\d)|sprintf("/_%2x",unpack("C",$1))|eg;

     # Dress it up as a real package name
     $string =~ s|/|::|g;
     return "Embed" . $string;
 }

 sub eval_file {
     my($filename, $delete) = @_;
     my $package = valid_package_name($filename);
     my $mtime = -M $filename;
     if(defined $Cache{$package}{mtime}
        &&
        $Cache{$package}{mtime} <= $mtime)
     {
        # we have compiled this subroutine already,
        # it has not been updated on disk, nothing left to do
        print STDERR "already compiled $package->handler\n";
     }
     else {
        # three-argument open: $filename is never treated as a mode or pipe
        open my $fh, '<', $filename or die "open '$filename' $!";
        local($/) = undef;
        my $sub = <$fh>;
        close $fh;

        #wrap the code into a subroutine inside our unique package
        my $eval = qq{package $package; sub handler { $sub; }};
        {
            # hide our variables within this block
            my($filename,$mtime,$package,$sub);
            eval $eval;
        }
        die $@ if $@;

        #cache it unless we're cleaning out each time
        $Cache{$package}{mtime} = $mtime unless $delete;
     }

     eval {$package->handler;};
     die $@ if $@;

     delete_package($package) if $delete;

     #take a look if you want
     #print Devel::Symdump->rnew($package)->as_string, $/;
 }

 1;

 __END__

 /* persistent.c */
 #include <EXTERN.h>
 #include <perl.h>

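 /* 1 = clean out filename's symbol table after each request,
    0 = don't
 */
 #ifndef DO_CLEAN
 #define DO_CLEAN 0
 #endif

 /* turn the numeric DO_CLEAN into the string "0" or "1" so that it
    can be passed to Perl as an ordinary argument */
 #define CLEAN_STR_(x) #x
 #define CLEAN_STR(x) CLEAN_STR_(x)

 #define BUFFER_SIZE 1024

 static PerlInterpreter *my_perl = NULL;

 int
 main(int argc, char **argv, char **env)
 {
     char *embedding[] = { "", "persistent.pl", NULL };
     char *args[] = { "", CLEAN_STR(DO_CLEAN), NULL };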
     char filename[BUFFER_SIZE];
     int exitstatus = 0;

     PERL_SYS_INIT3(&argc,&argv,&env);
     if((my_perl = perl_alloc()) == NULL) {
        fprintf(stderr, "no memory!");
        exit(1);
     }
     perl_construct(my_perl);

     PL_origalen = 1; /* don't let $0 assignment update the
                         proctitle or embedding[0] */
     exitstatus = perl_parse(my_perl, NULL, 2, embedding, NULL);
     PL_exit_flags |= PERL_EXIT_DESTRUCT_END;
     if(!exitstatus) {
        exitstatus = perl_run(my_perl);

        while(printf("Enter file name: ") &&
              fgets(filename, BUFFER_SIZE, stdin)) {

            filename[strcspn(filename, "\n")] = '\0'; /* strip \n */
            /* call the subroutine,
                     passing it the filename as an argument */
            args[0] = filename;
            call_argv("Embed::Persistent::eval_file",
                           G_DISCARD | G_EVAL, args);

            /* check $@ */
            if(SvTRUE(ERRSV))
                fprintf(stderr, "eval error: %s\n", SvPV_nolen(ERRSV));
        }
     }

     PL_perl_destruct_level = 0;
     perl_destruct(my_perl);
     perl_free(my_perl);
     PERL_SYS_TERM();
     exit(exitstatus);
 }

Now compile:

 % cc -o persistent persistent.c \
        `perl -MExtUtils::Embed -e ccopts -e ldopts`

Here's an example script file:

 #test.pl
 my $string = "hello";
 foo($string);

 sub foo {
     print "foo says: @_\n";
 }

Now run:

 % persistent
 Enter file name: test.pl
 foo says: hello
 Enter file name: test.pl
 already compiled Embed::test_2epl->handler
 foo says: hello
 Enter file name: ^C
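
The earlier advice about restarting after a certain number of requests
can be implemented with a simple counter.  The following is only a
sketch in plain C.  C<request_budget>, budget_init(), and
budget_note_request() are hypothetical names, and the limit is whatever
suits your application:

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for a persistent interpreter: how many
   files have been run, and after how many the host should recycle. */
typedef struct {
    unsigned long served;   /* requests handled so far        */
    unsigned long limit;    /* 0 means "never ask to recycle" */
} request_budget;

void budget_init(request_budget *b, unsigned long limit)
{
    b->served = 0;
    b->limit  = limit;
}

/* Record one handled request.  Returns true once the limit has been
   reached, i.e. when the caller should destroy and rebuild the
   interpreter (or restart the whole process). */
bool budget_note_request(request_budget *b)
{
    b->served++;
    return b->limit != 0 && b->served >= b->limit;
}
```

In F<persistent.c>, budget_note_request() could be called after each
call_argv(), leaving the input loop once it returns true.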

=head2 Execution of END blocks

Traditionally END blocks have been executed at the end of C<perl_run>.
This causes problems for applications that never call C<perl_run>.  Since
perl 5.7.2 you can specify C<PL_exit_flags |= PERL_EXIT_DESTRUCT_END>
to get the new behaviour, in which END blocks are executed in
C<perl_destruct> instead.  This also enables the running of END blocks
if C<perl_parse> fails, and C<perl_destruct> will return the exit value.

=head2 $0 assignments

When a perl script assigns a value to $0 then the perl runtime will
try to make this value show up as the program name reported by "ps" by
updating the memory pointed to by the argv passed to perl_parse() and
also calling API functions like setproctitle() where available.  This
behaviour might not be appropriate when embedding perl and can be
disabled by assigning the value C<1> to the variable C<PL_origalen>
before perl_parse() is called.

The F<persistent.c> example above, for instance, is likely to segfault
when $0 is assigned to if the C<PL_origalen = 1;> assignment is
removed.  This is because perl will try to write to the read-only memory
of the C<embedding[]> strings.

=head2 Maintaining multiple interpreter instances

Some rare applications will need to create more than one interpreter
during a session.  Such an application might sporadically decide to
release any resources associated with the interpreter.

The program must take care to ensure that this takes place I<before>
the next interpreter is constructed.  By default, when perl is not
built with any special options, the global variable
C<PL_perl_destruct_level> is set to C<0>, since extra cleaning isn't
usually needed when a program only ever creates a single interpreter
in its entire lifetime.

Setting C<PL_perl_destruct_level> to C<1> makes everything squeaky clean:

 while(1) {
     ...
     /* reset global variables here with PL_perl_destruct_level = 1 */
     PL_perl_destruct_level = 1;
     perl_construct(my_perl);
     ...
     /* clean and reset _everything_ during perl_destruct */
     PL_perl_destruct_level = 1;
     perl_destruct(my_perl);
     perl_free(my_perl);
     ...
     /* let's go do it again! */
 }

When I<perl_destruct()> is called, the interpreter's syntax parse tree
and symbol tables are cleaned up, and global variables are reset.  The
second assignment to C<PL_perl_destruct_level> is needed because
perl_construct resets it to C<0>.

Now suppose we have more than one interpreter instance running at the
same time.  This is feasible, but only if you used the Configure option
C<-Dusemultiplicity> or the options C<-Dusethreads -Duseithreads> when
building perl.  By default, enabling one of these Configure options
sets the per-interpreter global variable C<PL_perl_destruct_level> to
C<1>, so that thorough cleaning is automatic and interpreter variables
are initialized correctly.  Even if you don't intend to run two or
more interpreters at the same time, but to run them sequentially, as
in the above example, it is recommended to build perl with the
C<-Dusemultiplicity> option; otherwise some interpreter variables may
not be initialized correctly between consecutive runs and your
application may crash.

See also L<perlxs/Thread-aware system interfaces>.

Using C<-Dusethreads -Duseithreads> rather than C<-Dusemultiplicity>
is more appropriate if you intend to run multiple interpreters
concurrently in different threads, because it enables support for
linking in the thread libraries of your system with the interpreter.

Let's give it a try:


 #include <EXTERN.h>
 #include <perl.h>

 /* we're going to embed two interpreters */

 #define SAY_HELLO "-e", "print qq(Hi, I'm $^X\n)"

 int main(int argc, char **argv, char **env)
 {
     PerlInterpreter *one_perl, *two_perl;
     char *one_args[] = { "one_perl", SAY_HELLO, NULL };
     char *two_args[] = { "two_perl", SAY_HELLO, NULL };

     PERL_SYS_INIT3(&argc,&argv,&env);
     one_perl = perl_alloc();
     two_perl = perl_alloc();

     PERL_SET_CONTEXT(one_perl);
     perl_construct(one_perl);
     PERL_SET_CONTEXT(two_perl);
     perl_construct(two_perl);

     PERL_SET_CONTEXT(one_perl);
     perl_parse(one_perl, NULL, 3, one_args, (char **)NULL);
     PERL_SET_CONTEXT(two_perl);
     perl_parse(two_perl, NULL, 3, two_args, (char **)NULL);

     PERL_SET_CONTEXT(one_perl);
     perl_run(one_perl);
     PERL_SET_CONTEXT(two_perl);
     perl_run(two_perl);

     PERL_SET_CONTEXT(one_perl);
     perl_destruct(one_perl);
     PERL_SET_CONTEXT(two_perl);
     perl_destruct(two_perl);

     PERL_SET_CONTEXT(one_perl);
     perl_free(one_perl);
     PERL_SET_CONTEXT(two_perl);
     perl_free(two_perl);
     PERL_SYS_TERM();
 }

Note the calls to PERL_SET_CONTEXT().  These are necessary to initialize
the global state that tracks which interpreter is the "current" one on
the particular process or thread that may be running it.  It should
always be used if you have more than one interpreter and are making
perl API calls on both interpreters in an interleaved fashion.

PERL_SET_CONTEXT(interp) should also be called whenever C<interp> is
used by a thread that did not create it (using either perl_alloc(), or
the more esoteric perl_clone()).

Compile as usual:

 % cc -o multiplicity multiplicity.c \
  `perl -MExtUtils::Embed -e ccopts -e ldopts`

Run it, Run it:

 % multiplicity
 Hi, I'm one_perl
 Hi, I'm two_perl

=head2 Using Perl modules, which themselves use C libraries, from your C program

If you've played with the examples above and tried to embed a script
that I<use()>s a Perl module (such as I<Socket>) which itself uses a C or C++
library, this probably happened:


 Can't load module Socket, dynamic loading not available in this perl.
  (You may need to build a new perl executable which either supports
  dynamic loading or has the Socket module statically linked into it.)


What's wrong?

Your interpreter doesn't know how to communicate with these extensions
on its own.  A little glue will help.  Up until now you've been
calling I<perl_parse()>, handing it NULL for the second argument:

 perl_parse(my_perl, NULL, argc, my_argv, NULL);

That's where the glue code can be inserted to create the initial contact
between Perl and linked C/C++ routines. Let's take a look at some pieces of
I<perlmain.c> to see how Perl does this:

 static void xs_init (pTHX);

 EXTERN_C void boot_DynaLoader (pTHX_ CV* cv);
 EXTERN_C void boot_Socket (pTHX_ CV* cv);


 EXTERN_C void
 xs_init(pTHX)
 {
        char *file = __FILE__;
        /* DynaLoader is a special case */
        newXS("DynaLoader::boot_DynaLoader", boot_DynaLoader, file);
        newXS("Socket::bootstrap", boot_Socket, file);
 }

Simply put: for each extension linked with your Perl executable
(determined during its initial configuration on your
computer or when adding a new extension),
a Perl subroutine is created to incorporate the extension's
routines.  Normally, that subroutine is named
I<Module::bootstrap()> and is invoked when you say I<use Module>.  In
turn, this hooks into an XSUB, I<boot_Module>, which creates a Perl
counterpart for each of the extension's XSUBs.  Don't worry about this
part; leave that to the I<xsubpp> and extension authors.  If your
extension is dynamically loaded, DynaLoader creates I<Module::bootstrap()>
for you on the fly.  In fact, if you have a working DynaLoader then there
is rarely any need to link in any other extensions statically.


Once you have this code, slap it into the second argument of I<perl_parse()>:


 perl_parse(my_perl, xs_init, argc, my_argv, NULL);


Then compile:

 % cc -o interp interp.c `perl -MExtUtils::Embed -e ccopts -e ldopts`

 % interp
   use Socket;
   use SomeDynamicallyLoadedModule;

   print "Now I can use extensions!\n";

B<ExtUtils::Embed> can also automate writing the I<xs_init> glue code.

 % perl -MExtUtils::Embed -e xsinit -- -o perlxsi.c
 % cc -c perlxsi.c `perl -MExtUtils::Embed -e ccopts`
 % cc -c interp.c  `perl -MExtUtils::Embed -e ccopts`
 % cc -o interp perlxsi.o interp.o `perl -MExtUtils::Embed -e ldopts`

Consult L<perlxs>, L<perlguts>, and L<perlapi> for more details.

=head2 Using embedded Perl with POSIX locales

(See L<perllocale> for information about these.)
When a Perl interpreter normally starts up, it tells the system it wants
to use the system's default locale.  This is often, but not necessarily,
the "C" or "POSIX" locale.  Absent a S<C<"use locale">> within the perl
code, this mostly has no effect (but see L<perllocale/Not within the
scope of "use locale">).  Also, there is not a problem if the
locale you want to use in your embedded Perl is the same as the system
default.  However, this doesn't work if you have set up and want to use
a locale that isn't the system default one.  Starting in Perl v5.20, you
can tell the embedded Perl interpreter that the locale is already
properly set up, and to skip doing its own normal initialization.  It
skips if the environment variable C<PERL_SKIP_LOCALE_INIT> is set (even
if set to 0 or C<"">).  A Perl that has this capability will define the
C pre-processor symbol C<HAS_SKIP_LOCALE_INIT>.  This allows code that
has to work with multiple Perl versions to do some sort of work-around
when confronted with an earlier Perl.
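
As a rough illustration, the host program could select its locale and
set the environment variable before creating the interpreter.  This is
a sketch in plain C that assumes a POSIX setenv(); the helper name
C<keep_host_locale()> is hypothetical:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for setenv() */
#endif
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper, meant to be called before the interpreter is
   created: select the locale the host program wants and ask Perl
   (v5.20 or later) to leave it alone.  Returns 0 on success and -1 if
   either step fails. */
int keep_host_locale(const char *locale_name)
{
    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;              /* the system does not know this locale */

    /* Only the presence of the variable matters, not its value. */
    if (setenv("PERL_SKIP_LOCALE_INIT", "1", 1) != 0)
        return -1;

    return 0;
}
```

Code that must also work with Perls older than v5.20 can test
C<HAS_SKIP_LOCALE_INIT> after including F<perl.h> and fall back to a
work-around when it is not defined.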

=head1 Hiding Perl_

If you want to completely hide the short forms of the Perl public API,
add -DPERL_NO_SHORT_NAMES to the compilation flags.  This means that
for example instead of writing

    warn("%d bottles of beer on the wall", bottlecount);

you will have to write the explicit full form

    Perl_warn(aTHX_ "%d bottles of beer on the wall", bottlecount);

(See L<perlguts/"Background and PERL_IMPLICIT_CONTEXT"> for the explanation
of the C<aTHX_>.)  Hiding the short forms is very useful for avoiding
all sorts of nasty (C preprocessor or otherwise) conflicts with other
software packages.  (Perl defines about 2400 APIs with these short names,
give or take a few hundred, so there certainly is room for conflict.)

=head1 MORAL

You can sometimes I<write faster code> in C, but
you can always I<write code faster> in Perl.  Because you can use
each from the other, combine them as you wish.


=head1 AUTHOR

Jon Orwant <F<orwant@media.mit.edu>> and Doug MacEachern
<F<dougm@covalent.net>>, with small contributions from Tim Bunce, Tom
Christiansen, Guy Decoux, Hallvard Furuseth, Dov Grobgeld, and Ilya
Zakharevich.

Doug MacEachern has an article on embedding in Volume 1, Issue 4 of
The Perl Journal ( L<http://www.tpj.com/> ).  Doug is also the developer of the
most widely-used Perl embedding: the mod_perl system
(perl.apache.org), which embeds Perl in the Apache web server.
Oracle, Binary Evolution, ActiveState, and Ben Sugars's nsapi_perl
have used this model for Oracle, Netscape and Internet Information
Server Perl plugins.

=head1 COPYRIGHT

Copyright (C) 1995, 1996, 1997, 1998 Doug MacEachern and Jon Orwant.  All
Rights Reserved.

This document may be distributed under the same terms as Perl itself.
perlnumber.pod000064400000020241150344123460007424 0ustar00=head1 NAME

perlnumber - semantics of numbers and numeric operations in Perl

=head1 SYNOPSIS

    $n = 1234;		    # decimal integer
    $n = 0b1110011;	    # binary integer
    $n = 01234;		    # octal integer
    $n = 0x1234;	    # hexadecimal integer
    $n = 12.34e-56;	    # exponential notation
    $n = "-12.34e56";	    # number specified as a string
    $n = "1234";	    # number specified as a string

=head1 DESCRIPTION

This document describes how Perl internally handles numeric values.

Perl's operator overloading facility is completely ignored here.  Operator
overloading allows user-defined behaviors for numbers, such as operations
over arbitrarily large integers, floating point numbers with arbitrary
precision, operations over "exotic" numbers such as modular arithmetic or
p-adic arithmetic, and so on.  See L<overload> for details.

=head1 Storing numbers

Perl can internally represent numbers in 3 different ways: as native
integers, as native floating point numbers, and as decimal strings.
Decimal strings may have an exponential notation part, as in C<"12.34e-56">.
I<Native> here means "a format supported by the C compiler which was used
to build perl".

The term "native" does not mean quite as much when we talk about native
integers, as it does when native floating point numbers are involved.
The only implication of the term "native" on integers is that the limits for
the maximal and the minimal supported true integral quantities are close to
powers of 2.  However, "native" floats have a most fundamental
restriction: they may represent only those numbers which have a relatively
"short" representation when converted to a binary fraction.  For example,
0.9 cannot be represented by a native float, since the binary fraction
for 0.9 is infinite:

  binary 0.1110011001100...

with the sequence C<1100> repeating again and again.  In addition to this
limitation, the exponent of the binary number is also restricted when it
is represented as a floating point number.  On typical hardware, floating
point values can store numbers with up to 53 binary digits, and with binary
exponents between -1024 and 1024.  In decimal representation this is close
to 16 decimal digits and decimal exponents in the range of -308..308.
The upshot of all this is that Perl cannot store a number like
12345678901234567 as a floating point number on such architectures without
loss of information.
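
Both limits can be checked from C, whose C<double> is the usual native
floating point format.  The sketch below assumes an IEEE 754 C<double>
with a 53-bit significand; the function names are illustrative only and
are not part of Perl.

```c

  #include <stdint.h>

  /* Nonzero if n is unchanged by a round trip through a double. */
  int
  survives_double(uint64_t n)
  {
    double d = (double)n;

    /* A value that rounds up to 2**64 cannot be converted back. */
    if (d >= 18446744073709551616.0)
      return 0;
    return (uint64_t)d == n;
  }

  /* Write the first ndigits binary digits of the fractional part of
   * x (0 <= x < 1) into buf, which must hold ndigits + 1 bytes. */
  void
  binary_fraction(double x, char *buf, int ndigits)
  {
    int i;

    for (i = 0; i < ndigits; i++) {
      x *= 2.0;                  /* exact: only the exponent changes */
      if (x >= 1.0) {
        buf[i] = '1';
        x -= 1.0;                /* exact for values in [1, 2) */
      }
      else {
        buf[i] = '0';
      }
    }
    buf[ndigits] = '\0';
  }

```

Under that assumption C<survives_double(12345678901234567)> reports a
loss, while 2**53 itself survives, and C<binary_fraction(0.9, buf, 13)>
produces the digits shown in the expansion above.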

Similarly, decimal strings can represent only those numbers which have a
finite decimal expansion.  Being strings, and thus of arbitrary length, there
is no practical limit for the exponent or number of decimal digits for these
numbers.  (But realize that we are discussing the rules for just the
I<storage> of these numbers.  The fact that you can store such "large" numbers
does not mean that the I<operations> over these numbers will use all
of the significant digits.
See L</"Numeric operators and numeric conversions"> for details.)

In fact numbers stored in the native integer format may be stored either
in the signed native form, or in the unsigned native form.  Thus the limits
for Perl numbers stored as native integers would typically be -2**31..2**32-1,
with appropriate modifications in the case of 64-bit integers.  Again, this
does not mean that Perl can do operations only over integers in this range:
it is possible to store many more integers in floating point format.

Summing up, Perl numeric values can store only those numbers which have
a finite decimal expansion or a "short" binary expansion.

=head1 Numeric operators and numeric conversions

As mentioned earlier, Perl can store a number in any one of three formats,
but most operators typically understand only one of those formats.  When
a numeric value is passed as an argument to such an operator, it will be
converted to the format understood by the operator.

Six such conversions are possible:

  native integer        --> native floating point	(*)
  native integer        --> decimal string
  native floating_point --> native integer		(*)
  native floating_point --> decimal string		(*)
  decimal string        --> native integer
  decimal string        --> native floating point	(*)

These conversions are governed by the following general rules:

=over 4

=item *

If the source number can be represented in the target form, that
representation is used.

=item *

If the source number is outside of the limits representable in the target form,
a representation of the closest limit is used.  (I<Loss of information>)

=item *

If the source number is between two numbers representable in the target form,
a representation of one of these numbers is used.  (I<Loss of information>)

=item *

In C<< native floating point --> native integer >> conversions the magnitude
of the result is less than or equal to the magnitude of the source.
(I<"Rounding to zero".>)

=item *

If the C<< decimal string --> native integer >> conversion cannot be done
without loss of information, the result is compatible with the conversion
sequence C<< decimal_string --> native_floating_point --> native_integer >>.
In particular, rounding is strongly biased to 0, though a number like
C<"0.99999999999999999999"> has a chance of being rounded to 1.

=back

B<RESTRICTION>: The conversions marked with C<(*)> above involve steps
performed by the C compiler.  In particular, bugs/features of the compiler
used may lead to breakage of some of the above rules.
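
The closest-limit and rounding-to-zero rules can be sketched in C for a
32-bit target.  This is only an illustration of the rules listed above,
not Perl's own conversion code; the function name is hypothetical and
the result chosen for NaN is an assumption.

```c

  #include <stdint.h>

  /* Convert a double to a 32-bit integer following the rules above:
   * out-of-range values go to the closest limit, and everything else
   * is rounded toward zero. */
  int32_t
  double_to_i32_saturating(double d)
  {
    if (d != d)                   /* NaN: returning 0 is an assumption */
      return 0;
    if (d >= 2147483647.0)
      return INT32_MAX;           /* closest representable limit */
    if (d <= -2147483648.0)
      return INT32_MIN;
    return (int32_t)d;            /* C truncates toward zero */
  }

```

The explicit range checks matter because a plain C cast of an
out-of-range floating point value has undefined behavior, which is one
way the compiler can break the rules as noted in the restriction above.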

=head1 Flavors of Perl numeric operations

Perl operations which take a numeric argument treat that argument in one
of four different ways: they may force it to one of the integer/floating/
string formats, or they may behave differently depending on the format of
the operand.  Forcing a numeric value to a particular format does not
change the number stored in the value.

All the operators which need an argument in the integer format treat the
argument as in modular arithmetic, e.g., C<mod 2**32> on a 32-bit
architecture.  C<sprintf "%u", -1> therefore provides the same result as
C<sprintf "%u", ~0>.
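
The same wrap-around can be seen in C, where converting a signed value
to an unsigned type of the same width is defined as reduction modulo
2**32.  The sketch assumes 32-bit integers and uses a hypothetical
function name.

```c

  #include <stdint.h>

  /* Reinterpret a signed 32-bit value as unsigned, modulo 2**32. */
  uint32_t
  as_unsigned32(int32_t v)
  {
    return (uint32_t)v;
  }

```

Here C<as_unsigned32(-1)> and C<~(uint32_t)0> are the same value, which
mirrors the C<sprintf> example above.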

=over 4

=item Arithmetic operators

The binary operators C<+> C<-> C<*> C</> C<%> C<==> C<!=> C<E<gt>> C<E<lt>>
C<E<gt>=> C<E<lt>=> and the unary operators C<-> C<abs> and C<--> will
attempt to convert arguments to integers.  If both conversions are possible
without loss of precision, and the operation can be performed without
loss of precision then the integer result is used.  Otherwise arguments are
converted to floating point format and the floating point result is used.
The caching of conversions (as described above) means that the integer
conversion does not throw away fractional parts on floating point numbers.

=item ++

C<++> behaves as the other operators above, except that if it is a string
matching the format C</^[a-zA-Z]*[0-9]*\z/> the string increment described
in L<perlop> is used.

=item Arithmetic operators during C<use integer>

In scopes where C<use integer;> is in force, nearly all the operators listed
above will force their argument(s) into integer format, and return an integer
result.  The exceptions, C<abs>, C<++> and C<-->, do not change their
behavior with C<use integer;>

=item Other mathematical operators

Operators such as C<**>, C<sin> and C<exp> force arguments to floating point
format.

=item Bitwise operators

Arguments are forced into the integer format if not strings.

=item Bitwise operators during C<use integer>

forces arguments to integer format. Also shift operations internally use
signed integers rather than the default unsigned.

=item Operators which expect an integer

force the argument into the integer format.  This is applicable
to the third and fourth arguments of C<sysread>, for example.

=item Operators which expect a string

force the argument into the string format.  For example, this is
applicable to C<printf "%s", $value>.

=back

Though forcing an argument into a particular form does not change the
stored number, Perl remembers the result of such conversions.  In
particular, though the first such conversion may be time-consuming,
repeated operations will not need to redo the conversion.
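
The "integer if possible, otherwise floating point" behavior of the
arithmetic operators can be sketched for addition as follows.  This
assumes 64-bit signed integers only (Perl also considers unsigned
integers), and the C<number> type and C<add_numbers> function are
hypothetical names, not Perl internals.

```c

  #include <stdint.h>

  typedef struct {
    int     is_integer;   /* 1: result is in .i, 0: result is in .f */
    int64_t i;
    double  f;
  } number;

  /* Add two integers, falling back to floating point on overflow. */
  number
  add_numbers(int64_t a, int64_t b)
  {
    number r;

    r.i = 0;
    r.f = 0.0;
    if ((b > 0 && a > INT64_MAX - b) || (b < 0 && a < INT64_MIN - b)) {
      /* The exact sum does not fit: use the floating point result. */
      r.is_integer = 0;
      r.f = (double)a + (double)b;
    }
    else {
      r.is_integer = 1;
      r.i = a + b;
    }
    return r;
  }

```

The overflow test is done before the addition because signed overflow
itself is undefined in C.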

=head1 AUTHOR

Ilya Zakharevich C<ilya@math.ohio-state.edu>

Editorial adjustments by Gurusamy Sarathy <gsar@ActiveState.com>

Updates for 5.8.0 by Nicholas Clark <nick@ccl4.org>

=head1 SEE ALSO

L<overload>, L<perlop>
perlxstypemap.pod000064400000056701150344123460010200 0ustar00=head1 NAME

perlxstypemap - Perl XS C/Perl type mapping

=head1 DESCRIPTION

The more you think about interfacing between two languages, the more
you'll realize that the majority of programmer effort has to go into
converting between the data structures that are native to either of
the languages involved.  This trumps other matters such as differing
calling conventions because the problem space is so much greater.
There are simply more ways to shove data into memory than there are
ways to implement a function call.

Perl XS' attempt at a solution to this is the concept of typemaps.
At an abstract level, a Perl XS typemap is nothing but a recipe for
converting from a certain Perl data structure to a certain C
data structure and vice versa.  Since there can be C types that
are sufficiently similar to one another to warrant converting with
the same logic, XS typemaps are represented by a unique identifier,
henceforth called an B<XS type> in this document.  You can then tell
the XS compiler that multiple C types are to be mapped with the same
XS typemap.

In your XS code, when you define an argument with a C type or when
you are using a C<CODE:> and an C<OUTPUT:> section together with a
C return type of your XSUB, it'll be the typemapping mechanism that
makes this easy.

=head2 Anatomy of a typemap

In more practical terms, the typemap is a collection of code
fragments which are used by the B<xsubpp> compiler to map C function
parameters and values to Perl values.  The typemap file may consist
of three sections labelled C<TYPEMAP>, C<INPUT>, and C<OUTPUT>.
An unlabelled initial section is assumed to be a C<TYPEMAP> section.
The INPUT section tells the compiler how to translate Perl values
into variables of certain C types.  The OUTPUT section tells the
compiler how to translate the values from certain C types into values
Perl can understand.  The TYPEMAP section tells the compiler which
of the INPUT and OUTPUT code fragments should be used to map a given
C type to a Perl value.  The section labels C<TYPEMAP>, C<INPUT>, or
C<OUTPUT> must begin in the first column on a line by themselves,
and must be in uppercase.

Each type of section can appear an arbitrary number of times
and does not have to appear at all.  For example, a typemap may
commonly lack C<INPUT> and C<OUTPUT> sections if all it needs to
do is associate additional C types with core XS types like T_PTROBJ.
Lines that start with a hash C<#> are considered comments and ignored
in the C<TYPEMAP> section, but are considered significant in C<INPUT>
and C<OUTPUT>. Blank lines are generally ignored.

Traditionally, typemaps needed to be written to a separate file,
conventionally called C<typemap> in a CPAN distribution.  With
ExtUtils::ParseXS (the XS compiler) version 3.12 or better which
comes with perl 5.16, typemaps can also be embedded directly into
XS code using a HERE-doc like syntax:

  TYPEMAP: <<HERE
  ...
  HERE

where C<HERE> can be replaced by other identifiers like with normal
Perl HERE-docs.  All details below about the typemap textual format
remain valid.

The C<TYPEMAP> section should contain one pair of C type and
XS type per line as follows.  An example from the core typemap file:

  TYPEMAP
  # all variants of char* is handled by the T_PV typemap
  char *          T_PV
  const char *    T_PV
  unsigned char * T_PV
  ...

The C<INPUT> and C<OUTPUT> sections have identical formats, that is,
each unindented line starts a new in- or output map respectively.
A new in- or output map must start with the name of the XS type to
map on a line by itself, followed by the code that implements it
indented on the following lines. Example:

  INPUT
  T_PV
    $var = ($type)SvPV_nolen($arg)
  T_PTR
    $var = INT2PTR($type,SvIV($arg))

We'll get to the meaning of those Perlish-looking variables in a
little bit.

Finally, here's an example of the full typemap file for mapping C
strings of the C<char *> type to Perl scalars/strings:

  TYPEMAP
  char *  T_PV

  INPUT
  T_PV
    $var = ($type)SvPV_nolen($arg)

  OUTPUT
  T_PV
    sv_setpv((SV*)$arg, $var);

Here's a more complicated example: suppose that you wanted
C<struct netconfig> to be blessed into the class C<Net::Config>.
One way to do this is to use underscores (_) to separate package
names, as follows:

  typedef struct netconfig * Net_Config;

And then provide a typemap entry C<T_PTROBJ_SPECIAL> that maps
underscores to double-colons (::), and declare C<Net_Config> to be of
that type:

  TYPEMAP
  Net_Config      T_PTROBJ_SPECIAL

  INPUT
  T_PTROBJ_SPECIAL
    if (sv_derived_from($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")){
      IV tmp = SvIV((SV*)SvRV($arg));
      $var = INT2PTR($type, tmp);
    }
    else
      croak(\"$var is not of type ${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")

  OUTPUT
  T_PTROBJ_SPECIAL
    sv_setref_pv($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\",
                 (void*)$var);

The INPUT and OUTPUT sections replace underscores with double-colons
on the fly, giving the desired effect.  This example demonstrates some
of the power and versatility of the typemap facility.

The C<INT2PTR> macro (defined in perl.h) casts an integer to a pointer
of a given type, taking care of the possible different size of integers
and pointers.  There are also C<PTR2IV>, C<PTR2UV>, C<PTR2NV> macros,
to map the other way, which may be useful in OUTPUT sections.
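
The idea behind these macros can be illustrated in plain C with
C<intptr_t>, an integer type wide enough to hold a pointer.  The
functions below are stand-ins written for this illustration; they are
not the Perl macros and do not use Perl's C<IV> type.

```c

  #include <stdint.h>

  /* Pointer to integer, in the spirit of PTR2IV. */
  intptr_t
  pointer_to_integer(void *p)
  {
    return (intptr_t)p;
  }

  /* Integer back to pointer, in the spirit of INT2PTR. */
  void *
  integer_to_pointer(intptr_t n)
  {
    return (void *)n;
  }

```

A pointer converted with C<pointer_to_integer> and back with
C<integer_to_pointer> compares equal to the original, which is the
property the T_PTR family of typemaps relies on.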

=head2 The Role of the typemap File in Your Distribution

The default typemap in the F<lib/ExtUtils> directory of the Perl source
contains many useful types which can be used by Perl extensions.  Some
extensions define additional typemaps which they keep in their own directory.
These additional typemaps may reference INPUT and OUTPUT maps in the main
typemap.  The B<xsubpp> compiler will allow the extension's own typemap to
override any mappings which are in the default typemap.  Instead of using
an additional F<typemap> file, typemaps may be embedded verbatim in XS
with a heredoc-like syntax.  See the documentation on the C<TYPEMAP:> XS
keyword.

For CPAN distributions, you can assume that the XS types defined by
the perl core are already available. Additionally, the core typemap
has default XS types for a large number of C types.  For example, if
you simply return a C<char *> from your XSUB, the core typemap will
have this C type associated with the T_PV XS type.  That means your
C string will be copied into the PV (pointer value) slot of a new scalar
that will be returned from your XSUB to Perl.

If you're developing a CPAN distribution using XS, you may add your own
file called F<typemap> to the distribution.  That file may contain
typemaps that either map types that are specific to your code or that
override the core typemap file's mappings for common C types.

=head2 Sharing typemaps Between CPAN Distributions

Starting with ExtUtils::ParseXS version 3.13_01 (comes with perl 5.16
and better), it is rather easy to share typemap code between multiple
CPAN distributions. The general idea is to share it as a module that
offers a certain API and have the dependent modules declare that as a
build-time requirement and import the typemap into the XS. An example
of such a typemap-sharing module on CPAN is
C<ExtUtils::Typemaps::Basic>. Two steps to getting that module's
typemaps available in your code:

=over 4

=item *

Declare C<ExtUtils::Typemaps::Basic> as a build-time dependency
in C<Makefile.PL> (use C<BUILD_REQUIRES>), or in your C<Build.PL>
(use C<build_requires>).

=item *

Include the following line in the XS section of your XS file:
(don't break the line)

  INCLUDE_COMMAND: $^X -MExtUtils::Typemaps::Cmd
                   -e "print embeddable_typemap(q{Basic})"

=back

=head2 Writing typemap Entries

Each INPUT or OUTPUT typemap entry is a double-quoted Perl string that
will be evaluated in the presence of certain variables to get the
final C code for mapping a certain C type.

This means that you can embed Perl code in your typemap (C) code using
constructs such as
C<${ perl code that evaluates to scalar reference here }>. A common
use case is to generate error messages that refer to the true function
name even when using the ALIAS XS feature:

  ${ $ALIAS ? \q[GvNAME(CvGV(cv))] : \qq[\"$pname\"] }

For many typemap examples, refer to the core typemap file that can be
found in the perl source tree at F<lib/ExtUtils/typemap>.

The Perl variables that are available for interpolation into typemaps
are the following:

=over 4

=item *

I<$var> - the name of the input or output variable, e.g. RETVAL for
return values.

=item *

I<$type> - the raw C type of the parameter, with any C<:> replaced
by C<_>, e.g. for a type of C<Foo::Bar>, I<$type> is C<Foo__Bar>.

=item *

I<$ntype> - the supplied type with C<*> replaced with C<Ptr>.
e.g. for a type of C<Foo*>, I<$ntype> is C<FooPtr>

=item *

I<$arg> - the stack entry, that the parameter is input from or output
to, e.g. C<ST(0)>

=item *

I<$argoff> - the argument stack offset of the argument, i.e. 0 for the
first argument, etc.

=item *

I<$pname> - the full name of the XSUB, including the C<PACKAGE>
name, with any C<PREFIX> stripped.  This is the non-ALIAS name.

=item *

I<$Package> - the package specified by the most recent C<PACKAGE>
keyword.

=item *

I<$ALIAS> - non-zero if the current XSUB has any aliases declared with
C<ALIAS>.

=back
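
The C<$ntype> normalization described in the list above can be sketched
as a small C function.  It assumes that whitespace directly before a
C<*> is dropped, as the C<foo_tPtr> names used later in this document
suggest; the function name is hypothetical and this is not the code
B<xsubpp> itself runs.

```c

  #include <ctype.h>
  #include <stddef.h>
  #include <string.h>

  /* Replace every '*' in a C type name with "Ptr", dropping any
   * whitespace directly before it.  Returns 0 on success or -1 if
   * out is too small. */
  int
  normalize_ntype(const char *type, char *out, size_t outsize)
  {
    size_t n = 0;
    const char *p;

    if (outsize == 0)
      return -1;

    for (p = type; *p != '\0'; p++) {
      if (*p == '*') {
        while (n > 0 && isspace((unsigned char)out[n - 1]))
          n--;
        if (n + 3 >= outsize)
          return -1;
        memcpy(out + n, "Ptr", 3);
        n += 3;
      }
      else {
        if (n + 1 >= outsize)
          return -1;
        out[n++] = *p;
      }
    }
    out[n] = '\0';
    return 0;
  }

```

With this sketch C<Foo*> becomes C<FooPtr> and C<foo_t **> becomes
C<foo_tPtrPtr>.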

=head2 Full Listing of Core Typemaps

Each C type is represented by an entry in the typemap file that
is responsible for converting perl variables (SV, AV, HV, CV, etc.)
to and from that type. The following sections list all XS types
that come with perl by default.

=over 4

=item T_SV

This simply passes the C representation of the Perl variable (an SV*)
in and out of the XS layer. This can be used if the C code wants
to deal directly with the Perl variable.

=item T_SVREF

Used to pass in and return a reference to an SV.

Note that this typemap does not decrement the reference count
when returning the reference to an SV*.
See also: T_SVREF_REFCOUNT_FIXED

=item T_SVREF_REFCOUNT_FIXED

Used to pass in and return a reference to an SV.
This is a fixed
variant of T_SVREF that decrements the refcount appropriately
when returning a reference to an SV*. Introduced in perl 5.15.4.

=item T_AVREF

From the perl level this is a reference to a perl array.
From the C level this is a pointer to an AV.

Note that this typemap does not decrement the reference count
when returning an AV*. See also: T_AVREF_REFCOUNT_FIXED

=item T_AVREF_REFCOUNT_FIXED

From the perl level this is a reference to a perl array.
From the C level this is a pointer to an AV. This is a fixed
variant of T_AVREF that decrements the refcount appropriately
when returning an AV*. Introduced in perl 5.15.4.

=item T_HVREF

From the perl level this is a reference to a perl hash.
From the C level this is a pointer to an HV.

Note that this typemap does not decrement the reference count
when returning an HV*. See also: T_HVREF_REFCOUNT_FIXED

=item T_HVREF_REFCOUNT_FIXED

From the perl level this is a reference to a perl hash.
From the C level this is a pointer to an HV. This is a fixed
variant of T_HVREF that decrements the refcount appropriately
when returning an HV*. Introduced in perl 5.15.4.

=item T_CVREF

From the perl level this is a reference to a perl subroutine
(e.g. $sub = sub { 1 };). From the C level this is a pointer
to a CV.

Note that this typemap does not decrement the reference count
when returning a CV*. See also: T_CVREF_REFCOUNT_FIXED

=item T_CVREF_REFCOUNT_FIXED

From the perl level this is a reference to a perl subroutine
(e.g. $sub = sub { 1 };). From the C level this is a pointer
to a CV.

This is a fixed
variant of T_CVREF that decrements the refcount appropriately
when returning a CV*. Introduced in perl 5.15.4.

=item T_SYSRET

The T_SYSRET typemap is used to process return values from system calls.
It is only meaningful when passing values from C to perl (there is
no concept of passing a system return value from Perl to C).

System calls return -1 on error (setting C<errno> with the reason)
and (usually) 0 on success. If the return value is -1 this typemap
returns C<undef>. If the return value is not -1, this typemap
translates a 0 (perl false) to "0 but true" (which
is perl true) or returns the value itself, to indicate that the
command succeeded.

The L<POSIX|POSIX> module makes extensive use of this type.
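
The mapping just described can be sketched in plain C.  For simplicity
this illustration returns every result as a string and uses C<NULL> to
stand for C<undef>; the real typemap works on Perl scalars, and the
function name is hypothetical.

```c

  #include <stddef.h>
  #include <stdio.h>

  /* Map a system call return value the way T_SYSRET is described:
   * -1 becomes undef (NULL here), 0 becomes "0 but true", and any
   * other value is passed through. */
  const char *
  sysret_to_string(long ret, char *buf, size_t bufsize)
  {
    if (ret == -1)
      return NULL;                /* the Perl side would see undef */
    if (ret == 0)
      return "0 but true";        /* numerically 0, but true in Perl */
    snprintf(buf, bufsize, "%ld", ret);
    return buf;
  }

```

A Perl caller can then test the result for truth to detect failure and
still use it as a number on success.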

=item T_UV

An unsigned integer.

=item T_IV

A signed integer. This is cast to the required integer type when
passed to C and converted to an IV when passed back to Perl.

=item T_INT

A signed integer. This typemap converts the Perl value to a native
integer type (the C<int> type on the current platform). When returning
the value to perl it is processed in the same way as for T_IV.

Its behaviour is identical to using an C<int> type in XS with T_IV.

=item T_ENUM

An enum value. Used to transfer an enum component
from C. There is no reason to pass an enum value to C since
it is stored as an IV inside perl.

=item T_BOOL

A boolean type. This can be used to pass true and false values to and
from C.

=item T_U_INT

This is for unsigned integers. It is equivalent to using T_UV
but explicitly casts the variable to type C<unsigned int>.
The default type for C<unsigned int> is T_UV.

=item T_SHORT

Short integers. This is equivalent to T_IV but explicitly casts
the return to type C<short>. The default typemap for C<short>
is T_IV.

=item T_U_SHORT

Unsigned short integers. This is equivalent to T_UV but explicitly
casts the return to type C<unsigned short>. The default typemap for
C<unsigned short> is T_UV.

T_U_SHORT is used for type C<U16> in the standard typemap.

=item T_LONG

Long integers. This is equivalent to T_IV but explicitly casts
the return to type C<long>. The default typemap for C<long>
is T_IV.

=item T_U_LONG

Unsigned long integers. This is equivalent to T_UV but explicitly
casts the return to type C<unsigned long>. The default typemap for
C<unsigned long> is T_UV.

T_U_LONG is used for type C<U32> in the standard typemap.

=item T_CHAR

Single 8-bit characters.

=item T_U_CHAR

An unsigned byte.

=item T_FLOAT

A floating point number. This typemap guarantees to return a variable
cast to a C<float>.

=item T_NV

A Perl floating point number. Similar to T_IV and T_UV in that the
return type is cast to the requested numeric type rather than
to a specific type.

=item T_DOUBLE

A double precision floating point number. This typemap guarantees to
return a variable cast to a C<double>.

=item T_PV

A string (char *).

=item T_PTR

A memory address (pointer). Typically associated with a C<void *>
type.

=item T_PTRREF

Similar to T_PTR except that the pointer is stored in a scalar and the
reference to that scalar is returned to the caller. This can be used
to hide the actual pointer value from the programmer since it is usually
not required directly from within perl.

The typemap checks that a scalar reference is passed from perl to XS.

=item T_PTROBJ

Similar to T_PTRREF except that the reference is blessed into a class.
This allows the pointer to be used as an object. Most commonly used to
deal with C structs. The typemap checks that the perl object passed
into the XS routine is of the correct class (or part of a subclass).

The pointer is blessed into a class that is derived from the name
of the pointer's type but with all '*' in the name replaced with
'Ptr'.

For C<DESTROY> XSUBs only, a T_PTROBJ is optimized to a T_PTRREF. This means
the class check is skipped.

=item T_REF_IV_REF

NOT YET

=item T_REF_IV_PTR

Similar to T_PTROBJ in that the pointer is blessed into a scalar object.
The difference is that when the object is passed back into XS it must be
of the correct type (inheritance is not supported) while T_PTROBJ supports
inheritance.

The pointer is blessed into a class that is derived from the name
of the pointer's type but with all '*' in the name replaced with
'Ptr'.

For C<DESTROY> XSUBs only, a T_REF_IV_PTR is optimized to a T_PTRREF. This
means the class check is skipped.

=item T_PTRDESC

NOT YET

=item T_REFREF

Similar to T_PTRREF, except the pointer stored in the referenced scalar
is dereferenced and copied to the output variable. This means that
T_REFREF is to T_PTRREF as T_OPAQUE is to T_OPAQUEPTR. All clear?

Only the INPUT part of this is implemented (Perl to XSUB) and there
are no known users in core or on CPAN.

=item T_REFOBJ

Like T_REFREF, except it does strict type checking (inheritance is not
supported).

For C<DESTROY> XSUBs only, a T_REFOBJ is optimized to a T_REFREF. This means
the class check is skipped.

=item T_OPAQUEPTR

This can be used to store bytes in the string component of the
SV. Here the representation of the data is irrelevant to perl and the
bytes themselves are just stored in the SV. It is assumed that the C
variable is a pointer (the bytes are copied from that memory
location).  If the pointer is pointing to something that is
represented by 8 bytes then those 8 bytes are stored in the SV (and
length() will report a value of 8). This entry is similar to T_OPAQUE.

In principle the unpack() command can be used to convert the bytes
back to a number (if the underlying type is known to be a number).

This entry can be used to store a C structure (the number
of bytes to be copied is calculated using the C C<sizeof> function)
and can be used as an alternative to T_PTRREF without having to worry
about a memory leak (since Perl will clean up the SV).

=item T_OPAQUE

This can be used to store data from non-pointer types in the string
part of an SV. It is similar to T_OPAQUEPTR except that the
typemap retrieves the pointer directly rather than assuming it
is being supplied. For example, if an integer is imported into
Perl using T_OPAQUE rather than T_IV the underlying bytes representing
the integer will be stored in the SV but the actual integer value will
not be available. i.e. The data is opaque to perl.

The data may be retrieved using the C<unpack> function if the
underlying type of the byte stream is known.

T_OPAQUE supports input and output of simple types.
T_OPAQUEPTR can be used to pass these bytes back into C if a pointer
is acceptable.

=item Implicit array

xsubpp supports a special syntax for returning
packed C arrays to perl. If the XS return type is given as

  array(type, nelem)

xsubpp will copy the contents of C<nelem * sizeof(type)> bytes from
RETVAL to an SV and push it onto the stack. This is only really useful
if the number of items to be returned is known at compile time and you
don't mind having a string of bytes in your SV.  Use T_ARRAY to push a
variable number of arguments onto the return stack (they won't be
packed as a single string though).

This is similar to using T_OPAQUEPTR but can be used to process more
than one element.

=item T_PACKED

Calls user-supplied functions for conversion. For C<OUTPUT>
(XSUB to Perl), a function named C<XS_pack_$ntype> is called
with the output Perl scalar and the C variable to convert from.
C<$ntype> is the normalized C type that is to be mapped to
Perl. Normalized means that all C<*> are replaced by the
string C<Ptr>. The return value of the function is ignored.

Conversely for C<INPUT> (Perl to XSUB) mapping, the
function named C<XS_unpack_$ntype> is called with the input Perl
scalar as argument and the return value is cast to the mapped
C type and assigned to the output C variable.

An example conversion function for a typemapped struct
C<foo_t *> might be:

  static void
  XS_pack_foo_tPtr(SV *out, foo_t *in)
  {
    dTHX; /* alas, signature does not include pTHX_ */
    HV* hash = newHV();
    hv_stores(hash, "int_member", newSViv(in->int_member));
    hv_stores(hash, "float_member", newSVnv(in->float_member));
    /* ... */

    /* mortalize as thy stack is not refcounted */
    sv_setsv(out, sv_2mortal(newRV_noinc((SV*)hash)));
  }

The conversion from Perl to C is left as an exercise to the reader,
but the prototype would be:

  static foo_t *
  XS_unpack_foo_tPtr(SV *in);

Instead of an actual C function that has to fetch the thread context
using C<dTHX>, you can define macros of the same name and avoid the
overhead. Also, remember to free any memory allocated by
C<XS_unpack_foo_tPtr>.

=item T_PACKEDARRAY

T_PACKEDARRAY is similar to T_PACKED. In fact, the C<INPUT> (Perl
to XSUB) typemap is identical, but the C<OUTPUT> typemap passes
an additional argument to the C<XS_pack_$ntype> function. This
third parameter indicates the number of elements in the output
so that the function can handle C arrays sanely. The variable
needs to be declared by the user and must have the name
C<count_$ntype> where C<$ntype> is the normalized C type name
as explained above. The signature of the function would be for
the example above and C<foo_t **>:

  static void
  XS_pack_foo_tPtrPtr(SV *out, foo_t **in, UV count_foo_tPtrPtr);

The type of the third parameter is arbitrary as far as the typemap
is concerned. It just has to be in line with the declared variable.

Of course, unless you know the number of elements in the
C<sometype **> C array, within your XSUB, the return value from
C<foo_t ** XS_unpack_foo_tPtrPtr(...)> will be hard to decipher.
Since the details are all up to the XS author (the typemap user),
there are several solutions, none of which is particularly elegant.
The most commonly seen solution has been to allocate memory for
N+1 pointers and assign C<NULL> to the (N+1)th to facilitate
iteration.

Alternatively, using a customized typemap for your purposes in
the first place is probably preferable.
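
The N+1 convention mentioned above can be sketched in plain C.  The
function names are hypothetical and generic C<void *> elements are used
in place of a concrete C<foo_t *>; an C<XS_unpack_...> function would
build such an array from its Perl input instead of from another C
array.

```c

  #include <stdlib.h>

  /* Copy n pointers into a new array of n + 1 slots whose last slot
   * is NULL.  The caller must free() the result. */
  void **
  make_null_terminated(void *const *items, size_t n)
  {
    void **out = malloc((n + 1) * sizeof *out);
    size_t i;

    if (out == NULL)
      return NULL;
    for (i = 0; i < n; i++)
      out[i] = items[i];
    out[n] = NULL;                /* sentinel marks the end */
    return out;
  }

  /* Recover the element count by walking to the sentinel. */
  size_t
  count_until_null(void *const *items)
  {
    size_t n = 0;

    while (items[n] != NULL)
      n++;
    return n;
  }

```

The sentinel only works if C<NULL> can never be a valid element, which
is one reason this approach is not especially elegant.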

=item T_DATAUNIT

NOT YET

=item T_CALLBACK

NOT YET

=item T_ARRAY

This is used to convert the perl argument list to a C array
and for pushing the contents of a C array onto the perl
argument stack.

The usual calling signature is

  @out = array_func( @in );

Any number of arguments can occur in the list before the array but
the input and output arrays must be the last elements in the list.

When used to pass a perl list to C the XS writer must provide a
function (named after the array type but with 'Ptr' substituted for
'*') to allocate the memory required to hold the list. A pointer
should be returned. It is up to the XS writer to free the memory on
exit from the function. The variable C<ix_$var> is set to the number
of elements in the new array.

When returning a C array to Perl the XS writer must provide an integer
variable called C<size_$var> containing the number of elements in the
array. This is used to determine how many elements should be pushed
onto the return argument stack. This is not required on input since
Perl knows how many arguments are on the stack when the routine is
called. Ordinarily this variable would be called C<size_RETVAL>.

Additionally, the type of each element is determined from the type of
the array. If the array uses type C<intArray *> xsubpp will
automatically work out that it contains variables of type C<int> and
use that typemap entry to perform the copy of each element. All
pointer '*' and 'Array' tags are removed from the name to determine
the subtype.

=item T_STDIO

This is used for passing perl filehandles to and from C using
C<FILE *> structures.

=item T_INOUT

This is used for passing perl filehandles to and from C using
C<PerlIO *> structures. The file handle can be used for reading and
writing. This corresponds to the C<+E<lt>> mode, see also T_IN
and T_OUT.

See L<perliol> for more information on the Perl IO abstraction
layer. Perl must have been built with C<-Duseperlio>.

There is no check to assert that the filehandle passed from Perl
to C was created with the right C<open()> mode.

Hint: The L<perlxstut> tutorial covers the T_INOUT, T_IN, and T_OUT
XS types nicely.

=item T_IN

Same as T_INOUT, but the filehandle that is returned from C to Perl
can only be used for reading (mode C<E<lt>>).

=item T_OUT

Same as T_INOUT, but the filehandle that is returned from C to Perl
is set to use the open mode C<+E<gt>>.

=back

=encoding utf8

=for comment
Consistent formatting of this file is achieved with:
  perl ./Porting/podtidy pod/perlinterp.pod

=head1 NAME

perlinterp - An overview of the Perl interpreter

=head1 DESCRIPTION

This document provides an overview of how the Perl interpreter works at
the level of C code, along with pointers to the relevant C source code
files.

=head1 ELEMENTS OF THE INTERPRETER

The work of the interpreter has two main stages: compiling the code
into the internal representation, or bytecode, and then executing it.
L<perlguts/Compiled code> explains exactly how the compilation stage
happens.

Here is a short breakdown of perl's operation:

=head2 Startup

The action begins in F<perlmain.c> (or F<miniperlmain.c> for miniperl).
This is very high-level code, enough to fit on a single screen, and it
resembles the code found in L<perlembed>; most of the real action takes
place in F<perl.c>.

F<perlmain.c> is generated by C<ExtUtils::Miniperl> from
F<miniperlmain.c> at make time, so you should make perl to follow this
along.

First, F<perlmain.c> allocates some memory and constructs a Perl
interpreter, along these lines:

    1 PERL_SYS_INIT3(&argc,&argv,&env);
    2
    3 if (!PL_do_undump) {
    4     my_perl = perl_alloc();
    5     if (!my_perl)
    6         exit(1);
    7     perl_construct(my_perl);
    8     PL_perl_destruct_level = 0;
    9 }

Line 1 is a macro, and its definition is dependent on your operating
system. Line 3 references C<PL_do_undump>, a global variable - all
global variables in Perl start with C<PL_>. This tells you whether the
current running program was created with the C<-u> flag to perl and
then F<undump>, which means it's going to be false in any sane context.

Line 4 calls a function in F<perl.c> to allocate memory for a Perl
interpreter. It's quite a simple function, and the guts of it look
like this:

 my_perl = (PerlInterpreter*)PerlMem_malloc(sizeof(PerlInterpreter));

Here you see an example of Perl's system abstraction, which we'll see
later: C<PerlMem_malloc> is either your system's C<malloc>, or Perl's
own C<malloc> as defined in F<malloc.c> if you selected that option at
configure time.

Next, in line 7, we construct the interpreter using C<perl_construct>,
also in F<perl.c>; this sets up all the special variables that Perl
needs, the stacks, and so on.

Now we pass Perl the command line options, and tell it to go:

 exitstatus = perl_parse(my_perl, xs_init, argc, argv, (char **)NULL);
 if (!exitstatus)
     perl_run(my_perl);

 exitstatus = perl_destruct(my_perl);

 perl_free(my_perl);

C<perl_parse> is actually a wrapper around C<S_parse_body>, as defined
in F<perl.c>, which processes the command line options, sets up any
statically linked XS modules, opens the program and calls C<yyparse> to
parse it.

=head2 Parsing

The aim of this stage is to take the Perl source, and turn it into an
op tree. We'll see what one of those looks like later. Strictly
speaking, there are three things going on here.

C<yyparse>, the parser, lives in F<perly.c>, although you're better off
reading the original YACC input in F<perly.y>. (Yes, Virginia, there
B<is> a YACC grammar for Perl!) The job of the parser is to take your
code and "understand" it, splitting it into sentences, deciding which
operands go with which operators and so on.

The parser is nobly assisted by the lexer, which chunks up your input
into tokens, and decides what type of thing each token is: a variable
name, an operator, a bareword, a subroutine, a core function, and so
on. The main point of entry to the lexer is C<yylex>, and that and its
associated routines can be found in F<toke.c>. Perl isn't much like
other computer languages; it's highly context sensitive at times, and it
can be tricky to work out what sort of token something is, or where a
token ends. As such, there's a lot of interplay between the tokeniser
and the parser, which can get pretty frightening if you're not used to
it.

As the parser understands a Perl program, it builds up a tree of
operations for the interpreter to perform during execution. The
routines which construct and link together the various operations are
to be found in F<op.c>, and will be examined later.

=head2 Optimization

Now the parsing stage is complete, and the finished tree represents the
operations that the Perl interpreter needs to perform to execute our
program. Next, Perl does a dry run over the tree looking for
optimisations: constant expressions such as C<3 + 4> will be computed
now, and the optimizer will also see if any multiple operations can be
replaced with a single one. For instance, to fetch the variable
C<$foo>, instead of grabbing the glob C<*foo> and looking at the scalar
component, the optimizer fiddles the op tree to use a function which
directly looks up the scalar in question. The main optimizer is C<peep>
in F<op.c>, and many ops have their own optimizing functions.
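
To make the constant-folding idea concrete, here is a small standalone
sketch. It is not perl's optimizer: the C<toy_node> tree, its three node
types and the C<toy_fold> function are all invented for illustration,
and only addition is handled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression tree; these types are illustrative, not perl's ops. */
typedef enum { TOY_CONST, TOY_VAR, TOY_ADD } toy_type;

typedef struct toy_node {
    toy_type         type;
    int              value;   /* meaningful for TOY_CONST only */
    struct toy_node *left;    /* operands, used by TOY_ADD only */
    struct toy_node *right;
} toy_node;

/* Fold constant additions bottom-up, rewriting nodes in place.
 * Returns the number of additions that were folded away.
 * The sketch does not own the nodes, so folded children are simply
 * unlinked rather than freed. */
int
toy_fold(toy_node *n)
{
    int folded;

    if (n == NULL || n->type != TOY_ADD)
        return 0;
    assert(n->left != NULL && n->right != NULL);

    /* fold the operands first, so that nested constants collapse */
    folded = toy_fold(n->left) + toy_fold(n->right);

    if (n->left->type == TOY_CONST && n->right->type == TOY_CONST) {
        n->value = n->left->value + n->right->value;
        n->type  = TOY_CONST;
        n->left  = NULL;
        n->right = NULL;
        folded++;
    }
    return folded;
}
```

For a tree representing C<$x + (3 + 4)>, the inner addition becomes the
constant 7 while the outer addition is left alone, because one of its
operands is only known at run time.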

=head2 Running

Now we're finally ready to go: we have compiled Perl byte code, and all
that's left to do is run it. The actual execution is done by the
C<runops_standard> function in F<run.c>; more specifically, it's done
by these three innocent looking lines:

    while ((PL_op = PL_op->op_ppaddr(aTHX))) {
        PERL_ASYNC_CHECK();
    }

You may be more comfortable with the Perl version of that:

    PERL_ASYNC_CHECK() while $Perl::op = &{$Perl::op->{function}};

Well, maybe not. Anyway, each op contains a function pointer, which
stipulates the function which will actually carry out the operation.
This function will return the next op in the sequence - this allows for
things like C<if> which choose the next op dynamically at run time. The
C<PERL_ASYNC_CHECK> makes sure that things like signals interrupt
execution if required.
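
The shape of that loop can be reproduced in a few lines of standalone
C. The sketch below is a toy, not perl's code: C<toy_op>, C<toy_acc>
and the two C<pp_> functions are invented names, and a single integer
stands in for all of perl's run-time state. It does show the two points
made above: each op returns its successor, and a branching op picks
that successor at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; perl's real OP struct carries much more than this. */
typedef struct toy_op toy_op;
struct toy_op {
    toy_op *(*op_ppaddr)(toy_op *self); /* does the work, returns next op */
    toy_op *op_next;                    /* usual successor */
    toy_op *op_other;                   /* alternative successor (branches) */
    int     arg;
};

int toy_acc;    /* one "register" that the toy ops operate on */

/* Add this op's argument to the accumulator, then fall through. */
toy_op *
pp_add_arg(toy_op *o)
{
    toy_acc += o->arg;
    return o->op_next;
}

/* Choose the next op dynamically, as a conditional would. */
toy_op *
pp_if_negative(toy_op *o)
{
    return toy_acc < 0 ? o->op_other : o->op_next;
}

/* The run loop: call the current op and continue with what it returns.
 * A NULL return ends the program. */
int
toy_runops(toy_op *start)
{
    toy_op *op = start;

    assert(start != NULL);
    toy_acc = 0;
    while ((op = op->op_ppaddr(op))) {
        /* perl calls PERL_ASYNC_CHECK() at this point */
    }
    return toy_acc;
}
```

Note that the loop itself knows nothing about branching; all control
flow lives in the ops.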

The actual functions called are known as PP code, and they're spread
between four files: F<pp_hot.c> contains the "hot" code, which is most
often used and highly optimized, F<pp_sys.c> contains all the
system-specific functions, F<pp_ctl.c> contains the functions which
implement control structures (C<if>, C<while> and the like) and F<pp.c>
contains everything else. These are, if you like, the C code for Perl's
built-in functions and operators.

Note that each C<pp_> function is expected to return a pointer to the
next op. Calls to perl subs (and eval blocks) are handled within the
same runops loop, and do not consume extra space on the C stack. For
example, C<pp_entersub> and C<pp_entertry> just push a C<CxSUB> or
C<CxEVAL> block struct onto the context stack; that struct contains the
address of the op following the sub call or eval. They then return the
first op of that sub or eval block, and so execution continues with that
sub or block. Later, a C<pp_leavesub> or C<pp_leavetry> op pops the C<CxSUB>
or C<CxEVAL>, retrieves the return op from it, and returns it.

=head2 Exception handling

Perl's exception handling (i.e. C<die> etc.) is built on top of the
low-level C<setjmp()>/C<longjmp()> C-library functions. These basically
provide a way to capture the current PC and SP registers and later
restore them; i.e. a C<longjmp()> continues at the point in code where
a previous C<setjmp()> was done, with anything further up on the C
stack being lost. This is why code should always save values using
C<SAVE_FOO> rather than in auto variables.

The perl core wraps C<setjmp()> etc in the macros C<JMPENV_PUSH> and
C<JMPENV_JUMP>. The basic rule of perl exceptions is that C<exit>, and
C<die> (in the absence of C<eval>) perform a C<JMPENV_JUMP(2)>, while
C<die> within C<eval> does a C<JMPENV_JUMP(3)>.
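
The following standalone sketch shows the underlying mechanism with
plain C<setjmp()>/C<longjmp()>. It is only an analogy: C<toy_jmpenv>,
C<toy_top_env>, C<toy_die> and C<toy_guarded_call> are invented names
that loosely mirror the push/jump pattern, and perl's real macros do
considerably more bookkeeping.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* A chain of jump environments; the innermost guard is at the top. */
typedef struct toy_jmpenv {
    jmp_buf            buf;
    struct toy_jmpenv *prev;
} toy_jmpenv;

toy_jmpenv *toy_top_env = NULL;

/* "Throw": jump to the innermost guard. Every C frame between here
 * and the guard is abandoned without running any cleanup code. */
void
toy_die(int code)
{
    assert(toy_top_env != NULL);
    assert(code != 0);
    longjmp(toy_top_env->buf, code);
}

/* Recurse a few frames deep, then optionally throw with code 3. */
void
toy_deep_work(int depth, int should_die)
{
    assert(depth >= 0);
    if (depth > 0)
        toy_deep_work(depth - 1, should_die);
    else if (should_die)
        toy_die(3);
}

/* Push a guard, run the work, pop the guard.
 * Returns 0 on normal completion, 3 if the work threw 3, -1 otherwise. */
int
toy_guarded_call(int depth, int should_die)
{
    toy_jmpenv env;
    int        status;

    env.prev    = toy_top_env;
    toy_top_env = &env;                 /* push the guard */
    switch (setjmp(env.buf)) {
    case 0:                             /* direct return from setjmp */
        toy_deep_work(depth, should_die);
        status = 0;
        break;
    case 3:                             /* arrived here via longjmp */
        status = 3;
        break;
    default:
        status = -1;
        break;
    }
    toy_top_env = env.prev;             /* pop the guard on every path */
    return status;
}
```

Because the intermediate frames are simply discarded, anything they held
only in automatic variables is lost, which is the reason given above for
saving values with C<SAVE_FOO> instead.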

At entry points to perl, such as C<perl_parse()>, C<perl_run()> and
C<call_sv(cv, G_EVAL)>, each does a C<JMPENV_PUSH>, then enters a runops
loop or whatever, and handles possible exception returns. For a 2
return, final cleanup is performed, such as popping stacks and calling
C<CHECK> or C<END> blocks. Amongst other things, this is how scope
cleanup still occurs during an C<exit>.

If a C<die> can find a C<CxEVAL> block on the context stack, then the
stack is popped to that level and the return op in that block is
assigned to C<PL_restartop>; then a C<JMPENV_JUMP(3)> is performed.
This normally passes control back to the guard. In the case of
C<perl_run> and C<call_sv>, a non-null C<PL_restartop> triggers
re-entry to the runops loop. This is the normal way that C<die> or
C<croak> is handled within an C<eval>.

Sometimes ops are executed within an inner runops loop, such as tie,
sort or overload code. In this case, something like

    sub FETCH { eval { die } }

would cause a longjmp right back to the guard in C<perl_run>, popping
both runops loops, which is clearly incorrect. One way to avoid this is
for the tie code to do a C<JMPENV_PUSH> before executing C<FETCH> in
the inner runops loop, but for efficiency reasons, perl in fact just
sets a flag, using C<CATCH_SET(TRUE)>. The C<pp_require>,
C<pp_entereval> and C<pp_entertry> ops check this flag, and if true,
they call C<docatch>, which does a C<JMPENV_PUSH> and starts a new
runops level to execute the code, rather than doing it on the current
loop.

As a further optimisation, on exit from the eval block in the C<FETCH>,
execution of the code following the block is still carried on in the
inner loop. When an exception is raised, C<docatch> compares the
C<JMPENV> level of the C<CxEVAL> with C<PL_top_env> and if they differ,
just re-throws the exception. In this way any inner loops get popped.

Here's an example.

    1: eval { tie @a, 'A' };
    2: sub A::TIEARRAY {
    3:     eval { die };
    4:     die;
    5: }

To run this code, C<perl_run> is called, which does a C<JMPENV_PUSH>
then enters a runops loop. This loop executes the eval and tie ops on
line 1, with the eval pushing a C<CxEVAL> onto the context stack.

The C<pp_tie> does a C<CATCH_SET(TRUE)>, then starts a second runops
loop to execute the body of C<TIEARRAY>. When it executes the entertry
op on line 3, C<CATCH_GET> is true, so C<pp_entertry> calls C<docatch>
which does a C<JMPENV_PUSH> and starts a third runops loop, which then
executes the die op. At this point the C call stack looks like this:

    Perl_pp_die
    Perl_runops      # third loop
    S_docatch_body
    S_docatch
    Perl_pp_entertry
    Perl_runops      # second loop
    S_call_body
    Perl_call_sv
    Perl_pp_tie
    Perl_runops      # first loop
    S_run_body
    perl_run
    main

and the context and data stacks, as shown by C<-Dstv>, look like:

    STACK 0: MAIN
      CX 0: BLOCK  =>
      CX 1: EVAL   => AV()  PV("A"\0)
      retop=leave
    STACK 1: MAGIC
      CX 0: SUB    =>
      retop=(null)
      CX 1: EVAL   => *
      retop=nextstate

The die pops the first C<CxEVAL> off the context stack, sets
C<PL_restartop> from it, does a C<JMPENV_JUMP(3)>, and control returns
to the top C<docatch>. This then starts another third-level runops
level, which executes the nextstate, pushmark and die ops on line 4. At
the point that the second C<pp_die> is called, the C call stack looks
exactly like that above, even though we are no longer within an inner
eval; this is because of the optimization mentioned earlier. However,
the context stack now looks like this, i.e. with the top C<CxEVAL> popped:

    STACK 0: MAIN
      CX 0: BLOCK  =>
      CX 1: EVAL   => AV()  PV("A"\0)
      retop=leave
    STACK 1: MAGIC
      CX 0: SUB    =>
      retop=(null)

The die on line 4 pops the context stack back down to the CxEVAL,
leaving it as:

    STACK 0: MAIN
      CX 0: BLOCK  =>

As usual, C<PL_restartop> is extracted from the C<CxEVAL>, and a
C<JMPENV_JUMP(3)> done, which pops the C stack back to the docatch:

    S_docatch
    Perl_pp_entertry
    Perl_runops      # second loop
    S_call_body
    Perl_call_sv
    Perl_pp_tie
    Perl_runops      # first loop
    S_run_body
    perl_run
    main

In this case, because the C<JMPENV> level recorded in the C<CxEVAL>
differs from the current one, C<docatch> just does a C<JMPENV_JUMP(3)>
and the C stack unwinds to:

    perl_run
    main

Because C<PL_restartop> is non-null, C<run_body> starts a new runops
loop and execution continues.

=head2 INTERNAL VARIABLE TYPES

You should by now have had a look at L<perlguts>, which tells you about
Perl's internal variable types: SVs, HVs, AVs and the rest. If not, do
that now.

These variables are used not only to represent Perl-space variables,
but also any constants in the code, as well as some structures
completely internal to Perl. The symbol table, for instance, is an
ordinary Perl hash. Your code is represented by an SV as it's read into
the parser; any program files you call are opened via ordinary Perl
filehandles, and so on.

The core L<Devel::Peek|Devel::Peek> module lets us examine SVs from a
Perl program. Let's see, for instance, how Perl treats the constant
C<"hello">.

      % perl -MDevel::Peek -e 'Dump("hello")'
    1 SV = PV(0xa041450) at 0xa04ecbc
    2   REFCNT = 1
    3   FLAGS = (POK,READONLY,pPOK)
    4   PV = 0xa0484e0 "hello"\0
    5   CUR = 5
    6   LEN = 6

Reading C<Devel::Peek> output takes a bit of practice, so let's go
through it line by line.

Line 1 tells us we're looking at an SV which lives at C<0xa04ecbc> in
memory. SVs themselves are very simple structures, but they contain a
pointer to a more complex structure. In this case, it's a PV, a
structure which holds a string value, at location C<0xa041450>. Line 2
is the reference count; there are no other references to this data, so
it's 1.

Line 3 shows the flags for this SV - it's OK to use it as a PV, it's a
read-only SV (because it's a constant) and the data is a PV internally.
Next we've got the contents of the string, starting at location
C<0xa0484e0>.

Line 5 gives us the current length of the string - note that this does
B<not> include the null terminator. Line 6 is not the length of the
string, but the length of the currently allocated buffer; as the string
grows, Perl automatically extends the available storage via a routine
called C<SvGROW>.

You can get at any of these quantities from C very easily; just add
C<Sv> to the name of the field shown in the snippet, and you've got a
macro which will return the value: C<SvCUR(sv)> returns the current
length of the string, C<SvREFCNT(sv)> returns the reference count,
C<SvPV(sv, len)> returns the string itself with its length, and so on.
More macros to manipulate these properties can be found in L<perlguts>.

Let's take an example of manipulating a PV, from C<sv_catpvn>, in
F<sv.c>:

     1  void
     2  Perl_sv_catpvn(pTHX_ SV *sv, const char *ptr, STRLEN len)
     3  {
     4      STRLEN tlen;
     5      char *junk;

     6      junk = SvPV_force(sv, tlen);
     7      SvGROW(sv, tlen + len + 1);
     8      if (ptr == junk)
     9          ptr = SvPVX(sv);
    10      Move(ptr,SvPVX(sv)+tlen,len,char);
    11      SvCUR(sv) += len;
    12      *SvEND(sv) = '\0';
    13      (void)SvPOK_only_UTF8(sv);          /* validate pointer */
    14      SvTAINT(sv);
    15  }

This is a function which adds a string, C<ptr>, of length C<len> onto
the end of the PV stored in C<sv>. The first thing we do in line 6 is
make sure that the SV B<has> a valid PV, by calling the C<SvPV_force>
macro to force a PV. As a side effect, C<tlen> gets set to the current
length of the PV, and the PV itself is returned to C<junk>.

In line 7, we make sure that the SV will have enough room to
accommodate the old string, the new string and the null terminator. If
C<LEN> isn't big enough, C<SvGROW> will reallocate space for us.

Now, if C<junk> is the same as the string we're trying to add, we can
grab the string directly from the SV; C<SvPVX> is the address of the PV
in the SV.

Line 10 does the actual catenation: the C<Move> macro moves a chunk of
memory around: we move the string C<ptr> to the end of the PV - that's
the start of the PV plus its current length. We're moving C<len> bytes
of type C<char>. After doing so, we need to tell Perl we've extended
the string, by altering C<CUR> to reflect the new length. C<SvEND> is a
macro which gives us the end of the string, so that needs to be a
C<"\0">.
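
The grow, move, update and terminate steps can be tried outside perl
with a toy buffer. In the sketch below, C<toy_sv>, C<toy_grow> and
C<toy_catpvn> are invented names that only imitate the C<CUR>/C<LEN>
bookkeeping; flags, UTF-8 and tainting are left out entirely.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy string value: just the three fields discussed in the text. */
typedef struct {
    char  *pv;   /* buffer; NUL-terminated whenever it is non-NULL */
    size_t cur;  /* current string length, excluding the terminator */
    size_t len;  /* number of bytes allocated for the buffer */
} toy_sv;

/* Make sure the buffer holds at least `want` bytes.
 * Returns 0 on success, -1 if the allocation fails. */
int
toy_grow(toy_sv *sv, size_t want)
{
    char *p;

    if (sv->len >= want)
        return 0;
    p = realloc(sv->pv, want);
    if (!p)
        return -1;
    sv->pv  = p;
    sv->len = want;
    return 0;
}

/* Append n bytes from ptr. Returns 0 on success, -1 on failure. */
int
toy_catpvn(toy_sv *sv, const char *ptr, size_t n)
{
    /* If the source is our own buffer, growing may move it, so note
     * that before reallocating and re-fetch the address afterwards. */
    const int self = (ptr == sv->pv);

    assert(ptr != NULL);
    assert(!self || n <= sv->cur);

    if (toy_grow(sv, sv->cur + n + 1) != 0)   /* old + new + terminator */
        return -1;
    if (self)
        ptr = sv->pv;
    memmove(sv->pv + sv->cur, ptr, n);        /* copy to the end */
    sv->cur += n;                             /* record the new length */
    sv->pv[sv->cur] = '\0';                   /* re-terminate */
    return 0;
}
```

Appending a buffer to itself is the case the C<ptr == junk> test in the
real function guards against: without re-fetching the address, the copy
would read from memory that the reallocation may already have released.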

Line 13 manipulates the flags; since we've changed the PV, any IV or NV
values will no longer be valid: if we have C<$a=10; $a.="6";> we don't
want to use the old IV of 10. C<SvPOK_only_UTF8> is a special
UTF-8-aware version of C<SvPOK_only>, a macro which turns off the IOK
and NOK flags and turns on POK. The final C<SvTAINT> is a macro which
marks the SV as tainted if taint mode is on and tainted data was involved.

AVs and HVs are more complicated, but SVs are by far the most common
variable type being thrown around. Having seen something of how we
manipulate these, let's go on and look at how the op tree is
constructed.

=head1 OP TREES

First, what is the op tree, anyway? The op tree is the parsed
representation of your program, as we saw in our section on parsing,
and it's the sequence of operations that Perl goes through to execute
your program, as we saw in L</Running>.

An op is a fundamental operation that Perl can perform: all the
built-in functions and operators are ops, and there are a series of ops
which deal with concepts the interpreter needs internally - entering
and leaving a block, ending a statement, fetching a variable, and so
on.

The op tree is connected in two ways: you can imagine that there are
two "routes" through it, two orders in which you can traverse the tree.
First, parse order reflects how the parser understood the code, and
secondly, execution order tells perl what order to perform the
operations in.

The easiest way to examine the op tree is to stop Perl after it has
finished parsing, and get it to dump out the tree. This is exactly what
the compiler backends L<B::Terse|B::Terse>, L<B::Concise|B::Concise>
and L<B::Debug|B::Debug> do.

Let's have a look at how Perl sees C<$a = $b + $c>:

     % perl -MO=Terse -e '$a=$b+$c'
     1  LISTOP (0x8179888) leave
     2      OP (0x81798b0) enter
     3      COP (0x8179850) nextstate
     4      BINOP (0x8179828) sassign
     5          BINOP (0x8179800) add [1]
     6              UNOP (0x81796e0) null [15]
     7                  SVOP (0x80fafe0) gvsv  GV (0x80fa4cc) *b
     8              UNOP (0x81797e0) null [15]
     9                  SVOP (0x8179700) gvsv  GV (0x80efeb0) *c
    10          UNOP (0x816b4f0) null [15]
    11              SVOP (0x816dcf0) gvsv  GV (0x80fa460) *a

Let's start in the middle, at line 4. This is a BINOP, a binary
operator, which is at location C<0x8179828>. The specific operator in
question is C<sassign> - scalar assignment - and you can find the code
which implements it in the function C<pp_sassign> in F<pp_hot.c>. As a
binary operator, it has two children: the add operator, providing the
result of C<$b+$c>, is uppermost on line 5, and the left hand side is
on line 10.

Line 10 is the null op: this does exactly nothing. What is that doing
there? If you see the null op, it's a sign that something has been
optimized away after parsing. As we mentioned in L</Optimization>, the
optimization stage sometimes converts two operations into one, for
example when fetching a scalar variable. When this happens, instead of
rewriting the op tree and cleaning up the dangling pointers, it's
easier just to replace the redundant operation with the null op.
Originally, the tree would have looked like this:

    10          SVOP (0x816b4f0) rv2sv [15]
    11              SVOP (0x816dcf0) gv  GV (0x80fa460) *a

That is, fetch the C<a> entry from the main symbol table, and then look
at the scalar component of it: C<gvsv> (C<pp_gvsv> in F<pp_hot.c>)
happens to do both these things.

The right hand side, starting at line 5 is similar to what we've just
seen: we have the C<add> op (C<pp_add>, also in F<pp_hot.c>) add
together two C<gvsv>s.

Now, what's this about?

     1  LISTOP (0x8179888) leave
     2      OP (0x81798b0) enter
     3      COP (0x8179850) nextstate

C<enter> and C<leave> are scoping ops, and their job is to perform any
housekeeping every time you enter and leave a block: lexical variables
are tidied up, unreferenced variables are destroyed, and so on. Every
program will have those first three lines: C<leave> is a list, and its
children are all the statements in the block. Statements are delimited
by C<nextstate>, so a block is a collection of C<nextstate> ops, with
the ops to be performed for each statement being the children of
C<nextstate>. C<enter> is a single op which functions as a marker.

That's how Perl parsed the program, from top to bottom:

                        Program
                           |
                       Statement
                           |
                           =
                          / \
                         /   \
                        $a   +
                            / \
                          $b   $c

However, it's impossible to B<perform> the operations in this order:
you have to find the values of C<$b> and C<$c> before you add them
together, for instance. So, the other thread that runs through the op
tree is the execution order: each op has a field C<op_next> which
points to the next op to be run, so following these pointers tells us
how perl executes the code. We can traverse the tree in this order
using the C<exec> option to C<B::Terse>:

     % perl -MO=Terse,exec -e '$a=$b+$c'
     1  OP (0x8179928) enter
     2  COP (0x81798c8) nextstate
     3  SVOP (0x81796c8) gvsv  GV (0x80fa4d4) *b
     4  SVOP (0x8179798) gvsv  GV (0x80efeb0) *c
     5  BINOP (0x8179878) add [1]
     6  SVOP (0x816dd38) gvsv  GV (0x80fa468) *a
     7  BINOP (0x81798a0) sassign
     8  LISTOP (0x8179900) leave

This probably makes more sense for a human: enter a block, start a
statement. Get the values of C<$b> and C<$c>, and add them together.
Find C<$a>, and assign one to the other. Then leave.
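
One way to picture the relationship between the two orders is to thread
a tree yourself. The sketch below links a toy tree in post-order
(children first, then the parent), which reproduces the sequence shown
above for this particular statement. The C<toy_op> struct and
C<toy_thread> are invented for the illustration, and perl's real
execution order is not always a plain post-order walk (conditionals and
loops are obvious exceptions).

```c
#include <assert.h>
#include <stddef.h>

/* Toy op with both sets of links. */
typedef struct toy_op {
    const char    *name;
    struct toy_op *first;    /* first child         (parse order) */
    struct toy_op *sibling;  /* next sibling        (parse order) */
    struct toy_op *next;     /* next op to execute  (filled in below) */
} toy_op;

/* Thread the subtree rooted at `o` in post-order.
 * `prev` is the op executed just before this subtree, or NULL.
 * `*start` receives the very first op to execute.
 * Returns the last op of the subtree, which is `o` itself. */
toy_op *
toy_thread(toy_op *o, toy_op *prev, toy_op **start)
{
    toy_op *kid;

    assert(o != NULL && start != NULL);
    for (kid = o->first; kid != NULL; kid = kid->sibling)
        prev = toy_thread(kid, prev, start);

    if (prev != NULL)
        prev->next = o;      /* run this op after its last operand */
    else
        *start = o;          /* nothing ran before: execution begins here */
    o->next = NULL;
    return o;
}
```

Threading an C<sassign> node whose children are C<add> (itself with
children C<gvsv b> and C<gvsv c>) and C<gvsv a> yields the chain
b, c, add, a, sassign.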

The way Perl builds up these op trees in the parsing process can be
unravelled by examining F<toke.c>, the lexer, and F<perly.y>, the YACC
grammar. Let's look at the code that constructs the tree for C<$a = $b +
$c>.

First, we'll look at the C<Perl_yylex> function in the lexer. We want to
look for C<case 'x'>, where x is the first character of the operator.
(Incidentally, when looking for the code that handles a keyword, you'll
want to search for C<KEY_foo> where "foo" is the keyword.) Here is the code
that handles assignment (there are quite a few operators beginning with
C<=>, so most of it is omitted for brevity):

     1    case '=':
     2        s++;
              ... code that handles == => etc. and pod ...
     3        pl_yylval.ival = 0;
     4        OPERATOR(ASSIGNOP);

We can see on line 4 that our token type is C<ASSIGNOP> (C<OPERATOR> is a
macro, defined in F<toke.c>, that returns the token type, among other
things). And C<+>:

     1     case '+':
     2         {
     3             const char tmp = *s++;
                   ... code for ++ ...
     4             if (PL_expect == XOPERATOR) {
                       ...
     5                 Aop(OP_ADD);
     6             }
                   ...
     7         }

Line 4 checks what type of token we are expecting. C<Aop> returns a token.
If you search for C<Aop> elsewhere in F<toke.c>, you will see that it
returns an C<ADDOP> token.

Now that we know the two token types we want to look for in the parser,
let's take the piece of F<perly.y> we need to construct the tree for
C<$a = $b + $c>

    1 term    :   term ASSIGNOP term
    2                { $$ = newASSIGNOP(OPf_STACKED, $1, $2, $3); }
    3         |   term ADDOP term
    4                { $$ = newBINOP($2, 0, scalar($1), scalar($3)); }

If you're not used to reading BNF grammars, this is how it works:
You're fed certain things by the tokeniser, which generally end up in
upper case. C<ADDOP> and C<ASSIGNOP> are examples of "terminal symbols",
because you can't get any simpler than them.

The grammar, lines one and three of the snippet above, tells you how to
build up more complex forms. These complex forms, "non-terminal
symbols", are generally placed in lower case. C<term> here is a
non-terminal symbol, representing a single expression.

The grammar gives you the following rule: you can make the thing on the
left of the colon if you see all the things on the right in sequence.
This is called a "reduction", and the aim of parsing is to completely
reduce the input. There are several different ways you can perform a
reduction, separated by vertical bars: so, C<term> followed by C<=>
followed by C<term> makes a C<term>, and C<term> followed by C<+>
followed by C<term> can also make a C<term>.

So, if you see two terms with an C<=> or C<+> between them, you can
turn them into a single expression. When you do this, you execute the
code in the block on the next line: if you see C<=>, you'll do the code
in line 2. If you see C<+>, you'll do the code in line 4. It's this
code which contributes to the op tree.

            |   term ADDOP term
            { $$ = newBINOP($2, 0, scalar($1), scalar($3)); }

What this does is create a new binary op, and feed it a number of
variables. The variables refer to the tokens: C<$1> is the first token
in the input, C<$2> the second, and so on - think regular expression
backreferences. C<$$> is the op returned from this reduction. So, we
call C<newBINOP> to create a new binary operator. The first parameter
to C<newBINOP>, a function in F<op.c>, is the op type. It's an addition
operator, so we want the type to be C<ADDOP>. We could specify this
directly, but it's right there as the second token in the input, so we
use C<$2>. The second parameter is the op's flags: 0 means "nothing
special". Then the things to add: the left and right hand side of our
expression, in scalar context.

The functions that create ops, which have names like C<newUNOP> and
C<newBINOP>, call a "check" function associated with each op type, before
returning the op. The check functions can mangle the op as they see fit,
and even replace it with an entirely new one. These functions are defined
in F<op.c>, and have a C<Perl_ck_> prefix. You can find out which
check function is used for a particular op type by looking in
F<regen/opcodes>.  Take C<OP_ADD>, for example. (C<OP_ADD> is the token
value from the C<Aop(OP_ADD)> in F<toke.c> which the parser passes to
C<newBINOP> as its first argument.) Here is the relevant line:

    add             addition (+)            ck_null         IfsT2   S S

The check function in this case is C<Perl_ck_null>, which does nothing.
Let's look at a more interesting case:

    readline        <HANDLE>                ck_readline     t%      F?

And here is the function from F<op.c>:

     1 OP *
     2 Perl_ck_readline(pTHX_ OP *o)
     3 {
     4     PERL_ARGS_ASSERT_CK_READLINE;
     5 
     6     if (o->op_flags & OPf_KIDS) {
     7          OP *kid = cLISTOPo->op_first;
     8          if (kid->op_type == OP_RV2GV)
     9              kid->op_private |= OPpALLOW_FAKE;
    10     }
    11     else {
    12         OP * const newop
    13             = newUNOP(OP_READLINE, 0, newGVOP(OP_GV, 0,
    14                                               PL_argvgv));
    15         op_free(o);
    16         return newop;
    17     }
    18     return o;
    19 }

One particularly interesting aspect is that if the op has no kids (i.e.,
C<readline()> or C<< <> >>) the op is freed and replaced with an entirely
new one that references C<*ARGV> (lines 12-16).

=head1 STACKS

When perl executes something like C<addop>, how does it pass on its
results to the next op? The answer is, through the use of stacks. Perl
has a number of stacks to store things it's currently working on, and
we'll look at the three most important ones here.

=head2 Argument stack

Arguments are passed to PP code and returned from PP code using the
argument stack, C<ST>. The typical way to handle arguments is to pop
them off the stack, deal with them how you wish, and then push the
result back onto the stack. This is how, for instance, the cosine
operator works:

      NV value;
      value = POPn;
      value = Perl_cos(value);
      XPUSHn(value);

We'll see a more tricky example of this when we consider Perl's macros
below. C<POPn> gives you the NV (floating point value) of the top SV on
the stack: the C<$x> in C<cos($x)>. Then we compute the cosine, and
push the result back as an NV. The C<X> in C<XPUSHn> means that the
stack should be extended if necessary - it can't be necessary here,
because we know there's room for one more item on the stack, since
we've just removed one! The C<XPUSH*> macros at least guarantee safety.

Alternatively, you can fiddle with the stack directly: C<SP> gives you
the first element in your portion of the stack, and C<TOP*> gives you
the top SV/IV/NV/etc. on the stack. So, for instance, to do unary
negation of an integer:

     SETi(-TOPi);

Just set the integer value of the top stack entry to its negation.
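
The two styles can be contrasted with a toy stack in standalone C. The
names C<toy_stack>, C<toy_push>, C<toy_pop>, C<toy_pp_double> and
C<toy_pp_negate> are invented for this sketch, and a fixed-size array
of C<long> values stands in for perl's growable stack of SVs.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_MAX 16

/* Toy argument stack holding plain integers instead of SVs. */
typedef struct {
    long   items[TOY_STACK_MAX];
    size_t depth;               /* number of items currently stacked */
} toy_stack;

void
toy_push(toy_stack *s, long v)
{
    assert(s->depth < TOY_STACK_MAX);   /* no automatic extension here */
    s->items[s->depth++] = v;
}

long
toy_pop(toy_stack *s)
{
    assert(s->depth > 0);
    return s->items[--s->depth];
}

/* Pop, compute, push: the same shape as the cosine example. */
void
toy_pp_double(toy_stack *s)
{
    long value = toy_pop(s);

    toy_push(s, 2 * value);
}

/* Modify the top entry in place: the same shape as SETi(-TOPi). */
void
toy_pp_negate(toy_stack *s)
{
    assert(s->depth > 0);
    s->items[s->depth - 1] = -s->items[s->depth - 1];
}
```

Both functions leave the stack depth unchanged; the in-place version
merely avoids moving the stack pointer down and back up again.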

Argument stack manipulation in the core is exactly the same as it is in
XSUBs - see L<perlxstut>, L<perlxs> and L<perlguts> for a longer
description of the macros used in stack manipulation.

=head2 Mark stack

I say "your portion of the stack" above because PP code doesn't
necessarily get the whole stack to itself: if your function calls
another function, you'll only want to expose the arguments aimed for
the called function, and not (necessarily) let it get at your own data.
The way we do this is to have a "virtual" bottom-of-stack, exposed to
each function. The mark stack keeps bookmarks to locations in the
argument stack usable by each function. For instance, when dealing with
a tied variable (internally, something with "P" magic), Perl has to
call methods for accesses to the tied variables. However, we need to
separate the arguments exposed to the method from the arguments exposed to
the original function - the store or fetch or whatever it may be.
Here's roughly how the tied C<push> is implemented; see C<av_push> in
F<av.c>:

     1	PUSHMARK(SP);
     2	EXTEND(SP,2);
     3	PUSHs(SvTIED_obj((SV*)av, mg));
     4	PUSHs(val);
     5	PUTBACK;
     6	ENTER;
     7	call_method("PUSH", G_SCALAR|G_DISCARD);
     8	LEAVE;

Let's examine the whole implementation, for practice:

     1	PUSHMARK(SP);

Push the current state of the stack pointer onto the mark stack. This
is so that when we've finished adding items to the argument stack, Perl
knows how many things we've added recently.

     2	EXTEND(SP,2);
     3	PUSHs(SvTIED_obj((SV*)av, mg));
     4	PUSHs(val);

We're going to add two more items onto the argument stack: when you
have a tied array, the C<PUSH> subroutine receives the object and the
value to be pushed, and that's exactly what we have here - the tied
object, retrieved with C<SvTIED_obj>, and the value, the SV C<val>.

     5	PUTBACK;

Next we tell Perl to update the global stack pointer from our internal
variable: C<dSP> only gave us a local copy, not a reference to the
global.

     6	ENTER;
     7	call_method("PUSH", G_SCALAR|G_DISCARD);
     8	LEAVE;

C<ENTER> and C<LEAVE> localise a block of code - they make sure that
all variables are tidied up, everything that has been localised gets
its previous value returned, and so on. Think of them as the C<{> and
C<}> of a Perl block.

To actually do the magic method call, we have to call a subroutine in
Perl space: C<call_method> takes care of that, and it's described in
L<perlcall>. We call the C<PUSH> method in scalar context, and we're
going to discard its return value. The call_method() function removes
the top element of the mark stack, so there is nothing for the caller
to clean up.
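
The effect of this sequence can be observed from Perl space with a small
tied array class. The following is only a sketch: the C<LoggedArray>
package and its C<@log> variable are hypothetical names invented for
illustration. It shows the C<PUSH> method receiving the tied object and
the value, the same two items that were placed on the argument stack
above.

    use strict;
    use warnings;

    package LoggedArray;    # hypothetical class, for illustration only
    use Tie::Array;         # also provides Tie::StdArray
    our @ISA = ('Tie::StdArray');
    our @log;

    sub PUSH {
        my ($self, @values) = @_;
        push @log, [ ref($self), @values ];   # record what PUSH was given
        return $self->SUPER::PUSH(@values);
    }

    package main;

    tie my @array, 'LoggedArray';
    push @array, 'val';                       # ends up calling PUSH

    my @seen = @{ $LoggedArray::log[0] };
    print "PUSH got: @seen\n";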

=head2 Save stack

C has no equivalent of Perl's dynamic scoping (C<local>), so perl
provides one. We've seen that C<ENTER> and C<LEAVE> are used as scoping
braces; the save stack implements the C equivalent of, for example:

    {
        local $foo = 42;
        ...
    }

See L<perlguts/"Localizing changes"> for how to use the save stack.
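
As a reminder of the Perl-level behaviour that the save stack has to
reproduce, here is a small sketch (the variable and subroutine names are
arbitrary). The old value is saved on entry to the block and restored on
exit, even though C<show> is defined outside the block.

    use strict;
    use warnings;

    our $foo = 1;

    sub show { return $foo }    # sees whatever value is current when called

    my $inside;
    {
        local $foo = 42;        # old value saved here, restored at block exit
        $inside = show();
    }
    my $after = show();

    print "inside: $inside, after: $after\n";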

=head1 MILLIONS OF MACROS

One thing you'll notice about the Perl source is that it's full of
macros. Some have called the pervasive use of macros the hardest thing
to understand; others find it adds to clarity. Let's take an example,
the code which implements the addition operator:

   1  PP(pp_add)
   2  {
   3      dSP; dATARGET; tryAMAGICbin(add,opASSIGN);
   4      {
   5        dPOPTOPnnrl_ul;
   6        SETn( left + right );
   7        RETURN;
   8      }
   9  }

Every line here (apart from the braces, of course) contains a macro.
The first line sets up the function declaration as Perl expects for PP
code; line 3 sets up variable declarations for the argument stack and
the target, the return value of the operation. Finally, it tries to see
if the addition operation is overloaded; if so, the appropriate
subroutine is called.

Line 5 is another variable declaration - all variable declarations
start with C<d> - which pops from the top of the argument stack two NVs
(hence C<nn>) and puts them into the variables C<right> and C<left>,
hence the C<rl>. These are the two operands to the addition operator.
Next, we call C<SETn> to set the NV of the return value to the result
of adding the two values. This done, we return - the C<RETURN> macro
makes sure that our return value is properly handled, and we pass the
next operator to run back to the main run loop.
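
The overloading check on line 3 is what lets C<use overload> intercept
C<+>. A rough Perl-level sketch of the two paths follows; the C<Metres>
class is hypothetical and exists only for this illustration.

    use strict;
    use warnings;

    package Metres;    # hypothetical class, for illustration only
    use overload '+' => \&add;

    sub new { my ($class, $n) = @_; return bless { n => $n }, $class }

    sub add {
        my ($left, $right) = @_;
        my $r = ref($right) ? $right->{n} : $right;
        return Metres->new( $left->{n} + $r );
    }

    package main;

    my $sum   = Metres->new(2) + Metres->new(3);   # overloaded: calls add()
    my $plain = 2 + 3;                             # not overloaded: NV addition

    my $total = $sum->{n};
    print "overloaded: $total, plain: $plain\n";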

Most of these macros are explained in L<perlapi>, and some of the more
important ones are explained in L<perlxs> as well. Pay special
attention to L<perlguts/Background and PERL_IMPLICIT_CONTEXT> for
information on the C<[pad]THX_?> macros.

=head1 FURTHER READING

For more information on the Perl internals, please see the documents
listed at L<perl/Internals and C Language Interface>.

=head1 NAME

perlnewmod - preparing a new module for distribution

=head1 DESCRIPTION

This document gives you some suggestions about how to go about writing
Perl modules, preparing them for distribution, and making them available
via CPAN.

One of the things that makes Perl really powerful is the fact that Perl
hackers tend to want to share the solutions to problems they've faced,
so you and I don't have to battle with the same problem again.

The main way they do this is by abstracting the solution into a Perl
module. If you don't know what one of these is, the rest of this
document isn't going to be much use to you. You're also missing out on
an awful lot of useful code; consider having a look at L<perlmod>,
L<perlmodlib> and L<perlmodinstall> before coming back here.

When you've found that there isn't a module available for what you're
trying to do, and you've had to write the code yourself, consider
packaging up the solution into a module and uploading it to CPAN so that
others can benefit.

You should also take a look at L<perlmodstyle> for best practices in
making a module.

=head2 Warning

We're going to primarily concentrate on Perl-only modules here, rather
than XS modules. XS modules serve a rather different purpose, and
you should consider different things before distributing them - the
popularity of the library you are gluing, the portability to other
operating systems, and so on. However, the notes on preparing the Perl
side of the module and packaging and distributing it will apply equally
well to an XS module as a pure-Perl one.

=head2 What should I make into a module?

You should make a module out of any code that you think is going to be
useful to others. Anything that's likely to fill a hole in the communal
library and which someone else can slot directly into their program. Any
part of your code which you can isolate and extract and plug into
something else is a likely candidate.

Let's take an example. Suppose you're reading in data from a local
format into a hash-of-hashes in Perl, turning that into a tree, walking
the tree and then piping each node to an Acme Transmogrifier Server.

Now, quite a few people have the Acme Transmogrifier, and you've had to
write something to talk the protocol from scratch - you'd almost
certainly want to make that into a module. The level at which you pitch
it is up to you: you might want protocol-level modules analogous to
L<Net::SMTP|Net::SMTP> which then talk to higher level modules analogous
to L<Mail::Send|Mail::Send>. The choice is yours, but you do want to get
a module out for that server protocol.

Nobody else on the planet is going to talk your local data format, so we
can ignore that. But what about the thing in the middle? Building tree
structures from Perl variables and then traversing them is a nice,
general problem, and if nobody's already written a module that does
that, you might want to modularise that code too.

So hopefully you've now got a few ideas about what's good to modularise.
Let's now see how it's done.

=head2 Step-by-step: Preparing the ground

Before we even start scraping out the code, there are a few things we'll
want to do in advance.

=over 3

=item Look around

Dig into a bunch of modules to see how they're written. I'd suggest
starting with L<Text::Tabs|Text::Tabs>, since it's in the standard
library and is nice and simple, and then looking at something a little
more complex like L<File::Copy|File::Copy>.  For object oriented
code, L<WWW::Mechanize> or the C<Email::*> modules provide some good
examples.

These should give you an overall feel for how modules are laid out and
written.

=item Check it's new

There are a lot of modules on CPAN, and it's easy to miss one that's
similar to what you're planning on contributing. Have a good plough
through L<http://metacpan.org> and make sure you're not the one
reinventing the wheel!

=item Discuss the need

You might love it. You might feel that everyone else needs it. But there
might not actually be any real demand for it out there. If you're unsure
about the demand your module will have, consider asking the
C<module-authors@perl.org> mailing list (send an email to
C<module-authors-subscribe@perl.org> to subscribe; see
L<http://lists.perl.org/list/module-authors.html> for more information
and a link to the archives).

=item Choose a name

Perl modules included on CPAN have a naming hierarchy you should try to
fit in with. See L<perlmodlib> for more details on how this works, and
browse around CPAN and the modules list to get a feel of it. At the very
least, remember this: modules should be title capitalised (This::Thing),
fit in with a category, and explain their purpose succinctly.

=item Check again

While you're doing that, make really sure you haven't missed a module
similar to the one you're about to write.

When you've got your name sorted out and you're sure that your module is
wanted and not currently available, it's time to start coding.

=back

=head2 Step-by-step: Making the module

=over 3

=item Start with F<module-starter> or F<h2xs>

The F<module-starter> utility is distributed as part of the
L<Module::Starter|Module::Starter> CPAN package.  It creates a directory
with stubs of all the necessary files to start a new module, according
to recent "best practice" for module development, and is invoked from
the command line, thus:

    module-starter --module=Foo::Bar \
       --author="Your Name" --email=yourname@cpan.org

If you do not wish to install the L<Module::Starter|Module::Starter>
package from CPAN, F<h2xs> is an older tool, originally intended for the
development of XS modules, which comes packaged with the Perl
distribution. 

A typical invocation of L<h2xs|h2xs> for a pure Perl module is:

    h2xs -AX --skip-exporter --use-new-tests -n Foo::Bar 

The C<-A> omits the Autoloader code, C<-X> omits XS elements,
C<--skip-exporter> omits the Exporter code, C<--use-new-tests> sets up a
modern testing environment, and C<-n> specifies the name of the module.

=item Use L<strict|strict> and L<warnings|warnings>

A module's code has to be warning and strict-clean, since you can't
guarantee the conditions that it'll be used under. Besides, you wouldn't
want to distribute code that wasn't warning or strict-clean anyway,
right?

=item Use L<Carp|Carp>

The L<Carp|Carp> module allows you to present your error messages from
the caller's perspective; this gives you a way to signal a problem with
the caller and not your module. For instance, if you say this:

    warn "No hostname given";

the user will see something like this:

 No hostname given at
 /usr/local/lib/perl5/site_perl/5.6.0/Net/Acme.pm line 123.

which looks like your module is doing something wrong. Instead, you want
to put the blame on the user, and say this:

    No hostname given at bad_code line 10.

You do this by using L<Carp|Carp> and replacing your C<warn>s with
C<carp>s. If you need to C<die>, say C<croak> instead. However, keep
C<warn> and C<die> in place for your sanity checks - where it really is
your module at fault.
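
A minimal sketch of this in a single file, assuming a hypothetical
C<Net::Acme> package with an invented C<connect_to> method, might look
like the following. The error message reports the caller's file and
line, not the line inside the module.

    package Net::Acme;    # hypothetical module from the running example
    use strict;
    use warnings;
    use Carp;

    sub connect_to {
        my ($class, $host) = @_;
        # The caller passed bad arguments, so blame the caller's line.
        croak "No hostname given" unless defined($host) && length($host);
        return bless { host => $host }, $class;
    }

    package main;

    my $ok = eval { Net::Acme->connect_to(); 1 };
    print "Error: $@" unless $ok;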

=item Use L<Exporter|Exporter> - wisely!

L<Exporter|Exporter> gives you a standard way of exporting symbols and
subroutines from your module into the caller's namespace. For instance,
saying C<use Net::Acme qw(&frob)> would import the C<frob> subroutine.

The package variable C<@EXPORT> will determine which symbols will get
exported when the caller simply says C<use Net::Acme> - you will hardly
ever want to put anything in there. C<@EXPORT_OK>, on the other hand,
specifies which symbols you're willing to export. If you do want to
export a bunch of symbols, use C<%EXPORT_TAGS> to define a standard
export set - look at L<Exporter> for more details.
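
As a sketch of the three variables together (the C<Net::Acme> package
and its C<frob> and C<twiddle> subroutines are hypothetical), kept in
one file so that it can be run directly:

    package Net::Acme;    # hypothetical module from the running example
    use strict;
    use warnings;
    use Exporter 'import';

    our @EXPORT      = ();                        # nothing by default
    our @EXPORT_OK   = qw(frob twiddle);          # available on request
    our %EXPORT_TAGS = ( all => [@EXPORT_OK] );   # use Net::Acme qw(:all)

    sub frob    { return "frobbed $_[0]" }
    sub twiddle { return "twiddled $_[0]" }

    package main;

    # In a separate file this would be: use Net::Acme qw(frob);
    Net::Acme->import('frob');

    my $result = frob('widget');
    print "$result\n";

Only C<frob> is imported here; C<twiddle> stays out of the caller's
namespace until it is asked for by name or through the C<:all> tag.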

=item Use L<plain old documentation|perlpod>

The work isn't over until the paperwork is done, and you're going to
need to put in some time writing some documentation for your module.
C<module-starter> or C<h2xs> will provide a stub for you to fill in; if
you're not sure about the format, look at L<perlpod> for an
introduction. Provide a good synopsis of how your module is used in
code, a description, and then notes on the syntax and function of the
individual subroutines or methods. Use Perl comments for developer notes
and POD for end-user notes.

=item Write tests

You're encouraged to create self-tests for your module to ensure it's
working as intended on the myriad platforms Perl supports; if you upload
your module to CPAN, a host of testers will build your module and send
you the results of the tests. Again, C<module-starter> and C<h2xs>
provide a test framework which you can extend - you should do something
more than just checking your module will compile.
L<Test::Simple|Test::Simple> and L<Test::More|Test::More> are good
places to start when writing a test suite.
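
A small test script using L<Test::More|Test::More> could look like the
sketch below. The C<frob> subroutine is a hypothetical stand-in for
whatever your module provides; in a real test file you would load your
module with C<use> instead of defining the subroutine inline.

    use strict;
    use warnings;
    use Test::More tests => 3;

    # Stand-in for a function your module would provide (hypothetical).
    sub frob { my ($thing) = @_; return "frobbed $thing" }

    ok( defined &frob, 'frob is defined' );
    is( frob('widget'), 'frobbed widget', 'frob transforms its argument' );
    like( frob('x'), qr/^frobbed/, 'result has the expected prefix' );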

=item Write the F<README>

If you're uploading to CPAN, the automated gremlins will extract the
README file and place that in your CPAN directory. It'll also appear in
the main F<by-module> and F<by-category> directories if you make it onto
the modules list. It's a good idea to put here what the module actually
does in detail.

=item Write F<Changes>

Add any user-visible changes since the last release to your F<Changes>
file.

=back

=head2 Step-by-step: Distributing your module

=over 3

=item Get a CPAN user ID

Every developer publishing modules on CPAN needs a CPAN ID.  Visit
C<L<http://pause.perl.org/>>, select "Request PAUSE Account", and wait for
your request to be approved by the PAUSE administrators.

=item C<perl Makefile.PL; make test; make distcheck; make dist>

Once again, C<module-starter> or C<h2xs> has done all the work for you.
They produce the standard C<Makefile.PL> you see when you download and
install modules, and this produces a Makefile with a C<dist> target.

Once you've ensured that your module passes its own tests - always a
good thing to make sure - you can C<make distcheck> to make sure
everything looks OK, followed by C<make dist>, and the Makefile will
hopefully produce you a nice tarball of your module, ready for upload.

=item Upload the tarball

The email you got when you received your CPAN ID will tell you how to
log in to PAUSE, the Perl Authors Upload SErver. From the menus there,
you can upload your module to CPAN.

Alternatively you can use the F<cpan-upload> script, part of the
L<CPAN::Uploader> distribution on CPAN.

=item Fix bugs!

Once you start accumulating users, they'll send you bug reports. If
you're lucky, they'll even send you patches. Welcome to the joys of
maintaining a software project...

=back

=head1 AUTHOR

Simon Cozens, C<simon@cpan.org>

Updated by Kirrily "Skud" Robert, C<skud@cpan.org>

=head1 SEE ALSO

L<perlmod>, L<perlmodlib>, L<perlmodinstall>, L<h2xs>, L<strict>,
L<Carp>, L<Exporter>, L<perlpod>, L<Test::Simple>, L<Test::More>,
L<ExtUtils::MakeMaker>, L<Module::Build>, L<Module::Starter>,
L<http://www.cpan.org/>, Ken Williams' tutorial on building your own
module at L<http://mathforum.org/~ken/perl_modules.html>

=encoding utf8

=head1 NAME

perl5120delta - what is new for perl v5.12.0

=head1 DESCRIPTION

This document describes differences between the 5.10.0 release and the
5.12.0 release.

Many of the bug fixes in 5.12.0 are already included in the 5.10.1
maintenance release.

You can see the list of those changes in the 5.10.1 release notes
(L<perl5101delta>).


=head1 Core Enhancements

=head2 New C<package NAME VERSION> syntax

This new syntax allows a module author to set the $VERSION of a namespace
when the namespace is declared with 'package'. It eliminates the need
for C<our $VERSION = ...> and similar constructs. E.g.

      package Foo::Bar 1.23;
      # $Foo::Bar::VERSION == 1.23

There are several advantages to this:

=over

=item *

C<$VERSION> is parsed in exactly the same way as C<use NAME VERSION>

=item *

C<$VERSION> is set at compile time

=item *

C<$VERSION> is a version object that provides proper overloading of
comparison operators so comparing C<$VERSION> to decimal (1.23) or
dotted-decimal (v1.2.3) version numbers works correctly.

=item *

Eliminates C<$VERSION = ...> and C<eval $VERSION> clutter

=item *

As it requires VERSION to be a numeric literal or v-string
literal, it can be statically parsed by toolchain modules
without C<eval> the way MM-E<gt>parse_version does for C<$VERSION = ...>

=back

It does not break old code with only C<package NAME>, but code that uses
C<package NAME VERSION> will need to be restricted to perl 5.12.0 or newer.
This is analogous to the change to C<open> from two-args to three-args.
Users requiring the latest Perl will benefit, and perhaps after several
years, it will become a standard practice.


However, C<package NAME VERSION> requires a new, 'strict' version
number format. See L</"Version number formats"> for details.
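
As a brief sketch of how the declared version can then be queried
through C<UNIVERSAL::VERSION> (the package name is taken from the
example above; the variable names are arbitrary):

    use strict;
    use warnings;

    package Foo::Bar 1.23;

    package main;

    my $declared   = Foo::Bar->VERSION;                    # query it
    my $new_enough = eval { Foo::Bar->VERSION(1.20); 1 };  # 1.23 >= 1.20
    my $too_old    = !eval { Foo::Bar->VERSION(2.00); 1 }; # 1.23 <  2.00

    print "declared version: $declared\n";
    print "satisfies 1.20\n"        if $new_enough;
    print "does not satisfy 2.00\n" if $too_old;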


=head2 The C<...> operator

A new operator, C<...>, nicknamed the Yada Yada operator, has been added.
It is intended to mark placeholder code that is not yet implemented.
See L<perlop/"Yada Yada Operator">.
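
For instance, in the following sketch the hypothetical subroutine
C<not_done_yet> compiles without complaint, and an exception is raised
only if it is actually called:

    use strict;
    use warnings;

    sub not_done_yet { ... }    # compiles, but dies when called

    my $ok = eval { not_done_yet(); 1 };
    print "Died with: $@" unless $ok;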

=head2 Implicit strictures

Using the C<use VERSION> syntax with a version number greater than or
equal to 5.11.0 will lexically enable strictures just like C<use strict>
would do (in addition to enabling features). The following:

    use 5.12.0;

means:

    use strict;
    use feature ':5.12';
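
One way to see the difference is to compile two snippets at run time
and compare the results. This is only a sketch; note that the script
itself deliberately does not enable C<strict>, so that the first snippet
is compiled without strictures.

    # No "use strict" here on purpose: the evals below set their own rules.
    my $without = eval q{ $undeclared_a = 1; 1 };
    my $with    = eval q{ use 5.12.0; $undeclared_b = 1; 1 };

    print "without use 5.12.0: ", ( $without ? "compiled" : "failed" ), "\n";
    print "with use 5.12.0:    ", ( $with    ? "compiled" : "failed" ), "\n";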

=head2 Unicode improvements

Perl 5.12 comes with Unicode 5.2, the latest version available to
us at the time of release.  This version of Unicode was released in
October 2009. See L<http://www.unicode.org/versions/Unicode5.2.0> for
further details about what's changed in this version of the standard.
See L<perlunicode> for instructions on installing and using other versions
of Unicode.

Additionally, Perl's developers have significantly improved Perl's Unicode
implementation. For full details, see L</Unicode overhaul> below.

=head2 Y2038 compliance

Perl's core time-related functions are now Y2038 compliant. (It may not
mean much to you, but your kids will love it!)
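
As a quick sketch, C<gmtime> can be given an epoch value one second
beyond what a signed 32-bit counter can hold:

    use strict;
    use warnings;

    my $epoch = 2**31;          # one past the largest signed 32-bit value
    my @t     = gmtime($epoch);

    printf "%04d-%02d-%02d %02d:%02d:%02d UTC\n",
        $t[5] + 1900, $t[4] + 1, $t[3], $t[2], $t[1], $t[0];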

=head2 qr overloading

It is now possible to overload the C<qr//> operator, that is,
conversion to regexp, just as it was already possible to overload
an object's conversion to boolean, string or number. It is invoked when
an object appears on the right hand side of the C<=~> operator or when
it is interpolated into a regexp. See L<overload>.
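
The following sketch shows one possible use. The C<Wildcard> class and
its C<as_regexp> method are hypothetical names chosen for this example;
the only requirement imposed by L<overload> is that the C<qr> handler
returns a compiled regexp.

    use strict;
    use warnings;

    package Wildcard;    # hypothetical class, for illustration only
    use overload 'qr' => \&as_regexp;

    sub new { my ($class, $glob) = @_; return bless { glob => $glob }, $class }

    # Translate a shell-style "*" wildcard into a compiled regexp.
    sub as_regexp {
        my ($self) = @_;
        my @parts   = map { quotemeta($_) } split /\*/, $self->{glob}, -1;
        my $pattern = join '.*', @parts;
        return qr/^$pattern$/;
    }

    package main;

    my $matcher = Wildcard->new('perl*.pod');
    my $hit     = 'perlvos.pod' =~ $matcher ? 1 : 0;
    my $miss    = 'perlvos.txt' =~ $matcher ? 1 : 0;

    print "perlvos.pod: $hit, perlvos.txt: $miss\n";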

=head2 Pluggable keywords

Extension modules can now cleanly hook into the Perl parser to define
new kinds of keyword-headed expression and compound statement. The
syntax following the keyword is defined entirely by the extension. This
allows a completely non-Perl sublanguage to be parsed inline, with the
correct ops cleanly generated.

See L<perlapi/PL_keyword_plugin> for the mechanism. The Perl core
source distribution also includes a new module
L<XS::APItest::KeywordRPN>, which implements reverse Polish notation
arithmetic via pluggable keywords. This module is mainly used for test
purposes, and is not normally installed, but also serves as an example
of how to use the new mechanism.

Perl's developers consider this feature to be experimental. We may remove
it or change it in a backwards-incompatible way in Perl 5.14.

=head2 APIs for more internals

The lowest layers of the lexer and parts of the pad system now have C
APIs available to XS extensions. These are necessary to support proper
use of pluggable keywords, but have other uses too. The new APIs are
experimental, and only cover a small proportion of what would be
necessary to take full advantage of the core's facilities in these
areas. It is intended that the Perl 5.13 development cycle will see the
addition of a full range of clean, supported interfaces.

Perl's developers consider this feature to be experimental. We may remove
it or change it in a backwards-incompatible way in Perl 5.14.

=head2 Overridable function lookup

Where an extension module hooks the creation of rv2cv ops to modify the
subroutine lookup process, this now works correctly for bareword
subroutine calls. This means that prototypes on subroutines referenced
this way will be processed correctly. (Previously bareword subroutine
names were initially looked up, for parsing purposes, by an unhookable
mechanism, so extensions could only properly influence subroutine names
that appeared with an C<&> sigil.)

=head2 A proper interface for pluggable Method Resolution Orders

As of Perl 5.12.0 there is a new interface for plugging and using method
resolution orders other than the default linear depth first search.
The C3 method resolution order added in 5.10.0 has been re-implemented as
a plugin, without changing its Perl-space interface. See L<perlmroapi> for
more information.



=head2 C<\N> experimental regex escape

Perl now supports C<\N>, a new regex escape which you can think of as
the inverse of C<\n>. It will match any character that is not a newline,
independently from the presence or absence of the single line match
modifier C</s>. It is not usable within a character class.  C<\N{3}>
means to match 3 non-newlines; C<\N{5,}> means to match at least 5.
C<\N{NAME}> still means the character or sequence named C<NAME>, but
C<NAME> no longer can be things like C<3>, or C<5,>.
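
A short sketch of the difference between C<.> under C</s> and C<\N>
(the sample text is arbitrary):

    use strict;
    use warnings;

    my $text = "first line\nsecond line\n";

    my ($dot) = $text =~ /(.+)/s;      # under /s, "." also matches newlines
    my ($n)   = $text =~ /(\N+)/s;     # \N never matches a newline
    my ($n3)  = $text =~ /(\N{3})/;    # exactly three non-newlines

    printf "dot: %d chars, \\N: %d chars, \\N{3}: '%s'\n",
        length($dot), length($n), $n3;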

This will break a L<custom charnames translator|charnames/CUSTOM
TRANSLATORS> which allows numbers for character names, as C<\N{3}> will
now mean to match 3 non-newline characters, and not the character whose
name is C<3>. (No name defined by the Unicode standard is a number,
so only custom translators might be affected.)

Perl's developers are somewhat concerned about possible user confusion
with the existing C<\N{...}> construct which matches characters by their
Unicode name. Consequently, this feature is experimental. We may remove
it or change it in a backwards-incompatible way in Perl 5.14.

=head2 DTrace support

Perl now has some support for DTrace. See "DTrace support" in F<INSTALL>.

=head2 Support for C<configure_requires> in CPAN module metadata

Both C<CPAN> and C<CPANPLUS> now support the C<configure_requires>
keyword in the F<META.yml> metadata file included in most recent CPAN
distributions.  This allows distribution authors to specify configuration
prerequisites that must be installed before running F<Makefile.PL>
or F<Build.PL>.

See the documentation for C<ExtUtils::MakeMaker> or C<Module::Build> for
more on how to specify C<configure_requires> when creating a distribution
for CPAN.

=head2 C<each>, C<keys>, C<values> are now more flexible

The C<each>, C<keys> and C<values> functions can now operate on arrays.
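
For example (the array contents are arbitrary), C<keys> returns the
indices, C<values> returns the elements, and C<each> returns
index/element pairs:

    use strict;
    use warnings;

    my @colours = qw(red green blue);

    my @indices = keys @colours;      # the indices
    my @items   = values @colours;    # the elements

    while ( my ($index, $colour) = each @colours ) {
        print "$index: $colour\n";
    }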

=head2 C<when> as a statement modifier

C<when> is now allowed to be used as a statement modifier.

=head2 C<$,> flexibility

The variable C<$,> may now be tied.

=head2 // in when clauses

C<//> now behaves like C<||> in C<when> clauses.

=head2 Enabling warnings from your shell environment

You can now set C<-W> from the C<PERL5OPT> environment variable.

=head2 C<delete local>

C<delete local> now allows you to locally delete a hash entry.
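
A small sketch (the hash and its keys are invented for the example): the
entry is absent for the duration of the enclosing block and reappears
afterwards.

    use strict;
    use warnings;

    our %config = ( debug => 1, colour => 'blue' );

    sub has_debug { return exists( $config{debug} ) ? 'yes' : 'no' }

    my $inside;
    {
        delete local $config{debug};   # absent only within this block
        $inside = has_debug();
    }
    my $after = has_debug();

    print "inside: $inside, after: $after\n";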

=head2 New support for Abstract namespace sockets

Abstract namespace sockets are a Linux-specific socket type that lives in
the AF_UNIX family, slightly abusing it to be able to use arbitrary
character arrays as addresses: they start with a nul byte and are not
terminated by a nul byte, but by the length passed to the socket()
system call.

=head2 32-bit limit on substr arguments removed

The 32-bit limit on C<substr> arguments has now been removed. The full
range of the system's signed and unsigned integers is now available for
the C<pos> and C<len> arguments.

=head1 Potentially Incompatible Changes

=head2 Deprecations warn by default

Over the years, Perl's developers have deprecated a number of language
features for a variety of reasons.  Perl now defaults to issuing a
warning if a deprecated language feature is used. Many of the deprecations
Perl now warns you about have been deprecated for many years.  You can
find a list of what was deprecated in a given release of Perl in the
C<perl5xxdelta.pod> file for that release.

To disable this feature in a given lexical scope, you should use C<no
warnings 'deprecated';>. For information about which language features
are deprecated and explanations of various deprecation warnings, please
see L<perldiag>. See L</Deprecations> below for the list of features
and modules Perl's developers have deprecated as part of this release.

=head2 Version number formats

Acceptable version number formats have been formalized into "strict" and
"lax" rules. C<package NAME VERSION> takes a strict version number.
C<UNIVERSAL::VERSION> and the L<version> object constructors take lax
version numbers. Providing an invalid version will result in a fatal
error. The version argument in C<use NAME VERSION> is first parsed as a
numeric literal or v-string and then passed to C<UNIVERSAL::VERSION>
(and must then pass the "lax" format test).

These formats are documented fully in the L<version> module. To a first
approximation, a "strict" version number is a positive decimal number
(integer or decimal-fraction) without exponentiation or else a
dotted-decimal v-string with a leading 'v' character and at least three
components. A "lax" version number allows v-strings with fewer than
three components or without a leading 'v'. Under "lax" rules, both
decimal and dotted-decimal versions may have a trailing "alpha"
component separated by an underscore character after a fractional or
dotted-decimal component.

The L<version> module adds C<version::is_strict> and C<version::is_lax>
functions to check a scalar against these rules.
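
The sketch below runs a handful of candidate strings through both
checks. It assumes a L<version> module recent enough to provide these
functions, such as the one shipped with perl 5.12; the list of
candidates is arbitrary.

    use strict;
    use warnings;
    use version;

    my %report;
    for my $candidate ( '1.23', 'v1.2.3', '1.2.3', 'v1.2', '1.23_01' ) {
        $report{$candidate} = {
            strict => ( version::is_strict($candidate) ? 'yes' : 'no' ),
            lax    => ( version::is_lax($candidate)    ? 'yes' : 'no' ),
        };
        printf "%-8s strict: %-3s lax: %s\n", $candidate,
            $report{$candidate}{strict}, $report{$candidate}{lax};
    }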

=head2 @INC reorganization

In C<@INC>, C<ARCHLIB> and C<PRIVLIB> now occur after the current
version's C<site_perl> and C<vendor_perl>.  Modules installed into
C<site_perl> and C<vendor_perl> will now be loaded in preference to
those installed in C<ARCHLIB> and C<PRIVLIB>.


=head2 REGEXPs are now first class

Internally, Perl now treats compiled regular expressions (such as
those created with C<qr//>) as first class entities. Perl modules which
serialize, deserialize or otherwise have deep interaction with Perl's
internal data structures need to be updated for this change.  Most
affected CPAN modules have already been updated as of this writing.

=head2 Switch statement changes

The C<given>/C<when> switch statement handles complex statements better
than Perl 5.10.0 did. (These enhancements are also available in
5.10.1 and subsequent 5.10 releases.) There are two new cases where
C<when> now interprets its argument as a boolean, instead of an
expression to be used in a smart match:

=over

=item flip-flop operators

The C<..> and C<...> flip-flop operators are now evaluated in boolean
context, following their usual semantics; see L<perlop/"Range Operators">.

Note that, as in perl 5.10.0, C<when (1..10)> will not work to test
whether a given value is an integer between 1 and 10; you should use
C<when ([1..10])> instead (note the array reference).

However, unlike in 5.10.0, the flip-flop operators are now evaluated in
boolean context, which makes them useful in a C<when()>, notably
for implementing bistable conditions, as in:

    when (/^=begin/ .. /^=end/) {
      # do something
    }

=item defined-or operator

A compound expression involving the defined-or operator, as in
C<when (expr1 // expr2)>, will be treated as boolean if the first
expression is boolean. (This just extends the existing rule that applies
to the regular or operator, as in C<when (expr1 || expr2)>.)

=back

=head2 Smart match changes

Since Perl 5.10.0, Perl's developers have made a number of changes to
the smart match operator. These, of course, also alter the behaviour
of the switch statements where smart matching is implicitly used.
These changes were also made for the 5.10.1 release, and will remain in
subsequent 5.10 releases.

=head3 Changes to type-based dispatch

The smart match operator C<~~> is no longer commutative. The behaviour of
a smart match now depends primarily on the type of its right hand
argument. Moreover, its semantics have been adjusted for greater
consistency or usefulness in several cases. While the general backwards
compatibility is maintained, several changes must be noted:

=over 4

=item *

Code references with an empty prototype are no longer treated specially.
They are passed an argument like the other code references (even if they
choose to ignore it).

=item *

C<%hash ~~ sub {}> and C<@array ~~ sub {}> now test that the subroutine
returns a true value for each key of the hash (or element of the
array), instead of passing the whole hash or array as a reference to
the subroutine.

=item *

Due to the commutativity breakage, code references are no longer
treated specially when appearing on the left of the C<~~> operator,
but like any ordinary scalar.

=item *

C<undef ~~ %hash> is always false (since C<undef> can't be a key in a
hash). No implicit conversion to C<""> is done (as was the case in perl
5.10.0).

=item *

C<$scalar ~~ @array> now always distributes the smart match across the
elements of the array. It's true if at least one element in C<@array>
satisfies C<$scalar ~~ $element>. This is a generalization of the old
behaviour that tested whether the array contained the scalar.

=back

The full dispatch table for the smart match operator is given in
L<perlsyn/"Smart matching in detail">.

=head3 Smart match and overloading

According to the rule of dispatch based on the rightmost argument type,
when an object overloading C<~~> appears on the right side of the
operator, the overload routine will always be called (with a 3rd argument
set to a true value; see L<overload>). However, when the object
appears on the left, the overload routine will be called only when the
rightmost argument is a simple scalar. This way, distributivity of smart
match across arrays is not broken, nor are the other behaviours with
complex types (coderefs, hashes, regexes). Thus, writers of overloading
routines for smart match mostly need to worry only about comparing
against a scalar, and possibly about stringification overloading; the
other common cases will be automatically handled consistently.

C<~~> will now refuse to work on objects that do not overload it (in order
to avoid relying on the object's underlying structure). (However, if the
object overloads the stringification or the numification operators, and
if overload fallback is active, it will be used instead, as usual.)

=head2 Other potentially incompatible changes

=over 4

=item *

The definitions of a number of Unicode properties have changed to match
those of the current Unicode standard. These are listed below under
L</Unicode overhaul>. This change may break code that expects the old
definitions.

=item *

The boolkeys op has moved to the group of hash ops. This breaks binary
compatibility.

=item *

Filehandles are now always blessed into C<IO::File>.

The previous behaviour was to bless Filehandles into L<FileHandle>
(an empty proxy class) if it was loaded into memory and otherwise
to bless them into C<IO::Handle>.

=item *

The semantics of C<use feature :5.10*> have changed slightly.
See L</"Modules and Pragmata"> for more information.

=item *

Perl's developers now use git, rather than Perforce.  This should be
a purely internal change only relevant to people actively working on
the core.  However, you may see minor differences in perl as a consequence
of the change, for example in some of the details of the output of C<perl
-V>. See L<perlrepository> for more information.

=item *

As part of the C<Test::Harness> 2.x to 3.x upgrade, the experimental
C<Test::Harness::Straps> module has been removed.
See L</"Modules and Pragmata"> for more details.

=item *

As part of the C<ExtUtils::MakeMaker> upgrade, the
C<ExtUtils::MakeMaker::bytes> and C<ExtUtils::MakeMaker::vmsish> modules
have been removed from this distribution.

=item *

C<Module::CoreList> no longer contains the C<%:patchlevel> hash.

=item *

C<length undef> now returns undef.

=item *

Unsupported private C API functions are now declared "static" to prevent
leakage to Perl's public API.

=item *

To support the bootstrapping process, F<miniperl> no longer builds with
UTF-8 support in the regexp engine.

This allows a build to complete with PERL_UNICODE set and a UTF-8 locale.
Without this there's a bootstrapping problem, as miniperl can't load
the UTF-8 components of the regexp engine, because they're not yet built.

=item *

F<miniperl>'s @INC is now restricted to just C<-I...>, the split of
C<$ENV{PERL5LIB}>, and "C<.>".

=item *

A space or a newline is now required after a C<"#line XXX"> directive.

=item *

Tied filehandles now have an additional method EOF which provides the
EOF type.

=item *

To better match all other flow control statements, C<foreach> may no
longer be used as an attribute.

=item *

Perl's command-line switch "-P", which was deprecated in version 5.10.0, has
now been removed. The CPAN module C<< Filter::cpp >> can be used as an 
alternative.

=back
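
Two of the changes listed above are easy to observe from a script. The
following is a minimal sketch, assuming it is run under perl 5.12 or
later; the variable names are purely illustrative.

    use strict;
    use warnings;

    # The IO slot of a glob holds the filehandle object itself.
    my $class = ref(*STDOUT{IO});
    print "STDOUT is blessed into $class\n";

    # length() of an undefined value now yields undef.
    my $len = length undef;
    print "length undef is ",
        (defined $len ? "defined ($len)" : "undef"), "\n";

On earlier releases the first line would be expected to report
C<IO::Handle> (or C<FileHandle>, if that module had been loaded), and
the second a defined value.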


=head1 Deprecations

From time to time, Perl's developers find it necessary to deprecate
features or modules we've previously shipped as part of the core
distribution. We are well aware of the pain and frustration that a
backwards-incompatible change to Perl can cause for developers building
or maintaining software in Perl. You can be sure that when we deprecate
functionality or syntax, it isn't a choice we make lightly. Sometimes,
we choose to deprecate functionality or syntax because it was found to
be poorly designed or implemented. Sometimes, it is because it is
holding back other features or causing performance problems. Sometimes,
the reasons are more complex. Wherever possible, we try to keep deprecated
functionality available to developers in its previous form for at least
one major release. So long as a deprecated feature isn't actively
disrupting our ability to maintain and extend Perl, we'll try to leave
it in place as long as possible.

The following items are now deprecated:

=over

=item suidperl

C<suidperl> is no longer part of Perl. It used to provide a mechanism to
emulate setuid permission bits on systems that don't support them properly.

=item Use of C<:=> to mean an empty attribute list

An accident of Perl's parser meant that these constructions were all
equivalent:

    my $pi := 4;
    my $pi : = 4;
    my $pi :  = 4;

with the C<:> being treated as the start of an attribute list, which
ends before the C<=>. As whitespace is not significant here, all are
parsed as an empty attribute list, hence all the above are equivalent
to, and better written as

    my $pi = 4;

because no attribute processing is done for an empty list.

As it stood, this meant that C<:=> could not be used as a new token without
silently changing the meaning of existing code. Hence that particular
form is now deprecated, and will become a syntax error. If it is
absolutely necessary to have empty attribute lists (for example,
because of a code generator) then avoid the warning by adding a space
before the C<=>.

=item C<< UNIVERSAL->import() >>

The method C<< UNIVERSAL->import() >> is now deprecated. Attempting to
pass import arguments to a C<use UNIVERSAL> statement will result in a
deprecation warning.

=item Use of "goto" to jump into a construct

Using C<goto> to jump from an outer scope into an inner scope is now
deprecated. This rare use case was causing problems in the
implementation of scopes.

=item Custom character names in \N{name} that don't look like names

In C<\N{I<name>}>, I<name> can be just about anything. The standard
Unicode names have a very limited domain, but a custom name translator
could create names that are, for example, made up entirely of punctuation
symbols. It is now deprecated to make names that don't begin with an
alphabetic character, or that contain anything other than alphanumerics
and a very few other characters, namely spaces, dashes, parentheses
and colons. Because of the added meaning of C<\N> (see L</C<\N>
experimental regex escape>), names that look like curly-brace-enclosed
quantifiers won't work. For example, C<\N{3,4}> now means to match 3 to
4 non-newlines; previously, a custom name C<3,4> could have been created.

=item Deprecated Modules

The following modules will be removed from the core distribution in a
future release, and should be installed from CPAN instead. Distributions
on CPAN which require these should add them to their prerequisites. The
core versions of these modules will issue a deprecation warning.

If you ship a packaged version of Perl, either alone or as part of a
larger system, then you should carefully consider the repercussions of
core module deprecations. You may want to consider shipping your default
build of Perl with packages for some or all deprecated modules which
install into C<vendor> or C<site> perl library directories. This will
inhibit the deprecation warnings.

Alternatively, you may want to consider patching F<lib/deprecate.pm>
to provide deprecation warnings specific to your packaging system
or distribution of Perl, consistent with how your packaging system
or distribution manages a staged transition from a release where the
installation of a single package provides the given functionality, to
a later release where the system administrator needs to know to install
multiple packages to get that same functionality.

You can silence these deprecation warnings by installing the modules
in question from CPAN.  To install the latest version of all of them,
just install C<Task::Deprecations::5_12>.

=over

=item L<Class::ISA>

=item L<Pod::Plainer>

=item L<Shell>

=item L<Switch>

Switch is buggy and should be avoided. You may find Perl's new
C<given>/C<when> feature a suitable replacement.  See L<perlsyn/"Switch
statements"> for more information.

=back

=item Assignment to $[

=item Use of the attribute :locked on subroutines

=item Use of "locked" with the attributes pragma

=item Use of "unique" with the attributes pragma

=item Perl_pmflag

C<Perl_pmflag> is no longer part of Perl's public API. Calling it now
generates a deprecation warning, and it will be removed in a future
release. Although listed as part of the API, it was never documented,
and only ever used in F<toke.c>, and prior to 5.10, F<regcomp.c>. In
core, it has been replaced by a static function.

=item Numerous Perl 4-era libraries

F<termcap.pl>, F<tainted.pl>, F<stat.pl>, F<shellwords.pl>, F<pwd.pl>,
F<open3.pl>, F<open2.pl>, F<newgetopt.pl>, F<look.pl>, F<find.pl>,
F<finddepth.pl>, F<importenv.pl>, F<hostname.pl>, F<getopts.pl>,
F<getopt.pl>, F<getcwd.pl>, F<flush.pl>, F<fastcwd.pl>, F<exceptions.pl>,
F<ctime.pl>, F<complete.pl>, F<cacheout.pl>, F<bigrat.pl>, F<bigint.pl>,
F<bigfloat.pl>, F<assert.pl>, F<abbrev.pl>, F<dotsh.pl>, and
F<timelocal.pl> are all now deprecated.  Earlier, Perl's developers
intended to remove these libraries from Perl's core for the 5.14.0 release.

During final testing before the release of 5.12.0, several developers
discovered current production code using these ancient libraries, some
inside the Perl core itself.  Accordingly, the pumpking granted them
a stay of execution. They will begin to warn about their deprecation
in the 5.14.0 release and will be removed in the 5.16.0 release.


=back

=head1 Unicode overhaul

Perl's developers have made a concerted effort to update Perl to be in
sync with the latest Unicode standard. Changes for this include:

Perl can now handle every Unicode character property. New documentation,
L<perluniprops>, lists all available non-Unihan character properties. By
default, perl does not expose Unihan, deprecated or Unicode-internal
properties.  See below for more details on these; there is also a section
in the pod listing them, and explaining why they are not exposed.

Perl now fully supports the Unicode compound-style of using C<=>
and C<:> in writing regular expressions: C<\p{property=value}> and
C<\p{property:value}> (both of which mean the same thing).

Perl now fully supports the Unicode loose matching rules for text between
the braces in C<\p{...}> constructs. In addition, Perl allows underscores
between digits of numbers.

Perl now accepts all the Unicode-defined synonyms for properties and
property values.
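
The property syntaxes described above can be compared side by side. This
is only a sketch: the choice of the Greek script and of U+03B1 as the
test character is an assumption made for illustration.

    use strict;
    use warnings;

    my $alpha = "\x{3B1}";    # GREEK SMALL LETTER ALPHA

    # Compound form with "=" and with ":"; both mean the same thing.
    print "property=value matches\n" if $alpha =~ /\p{Script=Greek}/;
    print "property:value matches\n" if $alpha =~ /\p{Script:Greek}/;

    # Loose matching: case is ignored between the braces.
    print "loose form matches\n"     if $alpha =~ /\p{script=greek}/;

    # Unicode-defined short synonyms for the property and its value.
    print "synonym form matches\n"   if $alpha =~ /\p{sc=Grek}/;

All four tests are expected to succeed on perl 5.12 or later.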

C<qr/\X/>, which matches a Unicode logical character, has
been expanded to work better with various Asian languages. It
now is defined as an I<extended grapheme cluster>. (See
L<http://www.unicode.org/reports/tr29/>).  Anything matched previously
and that made sense will continue to be accepted.   Additionally:

=over

=item *

C<\X> will not break apart a C<S<CR LF>> sequence.

=item *

C<\X> will now match a sequence which includes the C<ZWJ> and C<ZWNJ>
characters.

=item *

C<\X> will now always match at least one character, including an initial
mark.  Marks generally come after a base character, but it is possible in
Unicode to have them in isolation, and C<\X> will now handle that case,
for example at the beginning of a line, or after a C<ZWSP>. This is also
the case in which C<\X> no longer matches things that it used to match
but that don't make sense. Formerly, for example, you could have the
nonsensical case of an accented LF.

=item *

C<\X> will now match a (Korean) Hangul syllable sequence, and the Thai
and Lao exception cases.

=back
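
As a rough illustration of the points above, the following sketch counts
the extended grapheme clusters in a short string. The sample string is an
assumption chosen to contain a base character with a combining mark
followed by a C<S<CR LF>> pair.

    use strict;
    use warnings;

    # "e" followed by COMBINING ACUTE ACCENT, then CR and LF.
    my $text = "e\x{301}\r\n";

    # Each \X match consumes one extended grapheme cluster.
    my @clusters = $text =~ /(\X)/g;

    printf "%d code points, %d clusters\n",
        length($text), scalar(@clusters);

With the new definition, the accented letter should form one cluster and
the C<S<CR LF>> pair another, so four code points are expected to yield
two clusters.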

Otherwise, this change should be transparent for the non-affected
languages.

C<\p{...}> matches using the Canonical_Combining_Class property were
completely broken in previous releases of Perl.  They should now work
correctly.

Before Perl 5.12, the Unicode C<Decomposition_Type=Compat> property
and a Perl extension had the same name, which led to neither matching
all the correct values (with more than 100 mistakes in one, and several
thousand in the other). The Perl extension has now been renamed to be
C<Decomposition_Type=Noncanonical> (short: C<dt=noncanon>). It has the
same meaning as was previously intended, namely the union of all the
non-canonical Decomposition types, with Unicode C<Compat> being just
one of those.

C<\p{Decomposition_Type=Canonical}> now includes the Hangul syllables.

C<\p{Uppercase}> and C<\p{Lowercase}> now work as the Unicode standard
says they should.  This means they each match a few more characters than
they used to.

C<\p{Cntrl}> now matches the same characters as C<\p{Control}>. This
means it no longer will match Private Use (gc=co), Surrogates (gc=cs),
nor Format (gc=cf) code points. The Format code points represent the
biggest possible problem. All but 36 of them are either officially
deprecated or strongly discouraged from being used. Of those 36, likely
the most widely used are the soft hyphen (U+00AD), and BOM, ZWSP, ZWNJ,
WJ, and similar characters, plus bidirectional controls.

C<\p{Alpha}> now matches the same characters as C<\p{Alphabetic}>. Before
5.12, Perl's definition included a number of things that aren't
really alpha (all marks) while omitting many that were. The definitions
of C<\p{Alnum}> and C<\p{Word}> depend on Alpha's definition and have
changed accordingly.

C<\p{Word}> no longer incorrectly matches non-word characters such
as fractions.

C<\p{Print}> no longer matches the line control characters: Tab, LF,
CR, FF, VT, and NEL. This brings it in line with standards and the
documentation.

C<\p{XDigit}> now matches the same characters as C<\p{Hex_Digit}>. This
means that in addition to the characters it currently matches,
C<[A-Fa-f0-9]>, it will also match the 22 fullwidth equivalents, for
example U+FF10: FULLWIDTH DIGIT ZERO.
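
The wider C<\p{XDigit}> can be checked with a short loop. This is a
sketch only; the list of sample characters is an arbitrary choice.

    use strict;
    use warnings;

    # Two ASCII hex digits, one non-digit, and FULLWIDTH DIGIT ZERO.
    my @samples = ('7', 'c', 'g', "\x{FF10}");

    for my $char (@samples) {
        printf "U+%04X %s\n", ord($char),
            ($char =~ /\p{XDigit}/ ? "matches" : "does not match");
    }

Code that relies on C<\p{XDigit}> to validate ASCII-only hexadecimal input
may prefer the explicit class C<[A-Fa-f0-9]>.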

The Numeric type property has been extended to include the Unihan
characters.

There is a new Perl extension, the 'Present_In', or simply 'In',
property. This is an extension of the Unicode Age property:
C<\p{In=5.0}> matches any code point whose usage has been determined
I<as of> Unicode version 5.0, whereas C<\p{Age=5.0}> only matches code
points added in I<precisely> version 5.0.
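
The difference between the two properties can be seen with one old and
one newer code point. This sketch assumes that U+0242 was first assigned
in Unicode 5.0, while U+0041 dates from the earliest versions of the
standard; the C<%sample> hash is purely illustrative.

    use strict;
    use warnings;

    my %sample = (
        'U+0041' => "\x{41}",     # LATIN CAPITAL LETTER A
        'U+0242' => "\x{242}",    # LATIN SMALL LETTER GLOTTAL STOP
    );

    for my $label (sort keys %sample) {
        my $char = $sample{$label};
        printf "%s  In=5.0: %s  Age=5.0: %s\n", $label,
            ($char =~ /\p{In=5.0}/  ? "yes" : "no"),
            ($char =~ /\p{Age=5.0}/ ? "yes" : "no");
    }

Under those assumptions both code points should match C<\p{In=5.0}>, but
only U+0242 should match C<\p{Age=5.0}>.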

A number of properties now have the correct values for unassigned
code points. The affected properties are Bidi_Class, East_Asian_Width,
Joining_Type, Decomposition_Type, Hangul_Syllable_Type, Numeric_Type,
and Line_Break.

The Default_Ignorable_Code_Point, ID_Continue, and ID_Start properties
are now up to date with current Unicode definitions.

Earlier versions of Perl erroneously exposed certain properties that
are supposed to be Unicode internal-only.  Use of these in regular
expressions will now generate, if enabled, a deprecation warning message.
The properties are: Other_Alphabetic, Other_Default_Ignorable_Code_Point,
Other_Grapheme_Extend, Other_ID_Continue, Other_ID_Start, Other_Lowercase,
Other_Math, and Other_Uppercase.

It is now possible to change which Unicode properties Perl understands
on a per-installation basis. As mentioned above, certain properties
are turned off by default.  These include all the Unihan properties
(which should be accessible via the CPAN module Unicode::Unihan) and any
deprecated or Unicode internal-only property that Perl has never exposed.

The generated files in the C<lib/unicore/To> directory are now more
clearly marked as being stable, directly usable by applications.  New hash
entries in them give the format of the normal entries, which allows for
easier machine parsing. Perl can generate files in this directory for
any property, though most are suppressed.  You can find instructions
for changing which are written in L<perluniprops>.

=head1 Modules and Pragmata

=head2 New Modules and Pragmata

=over 4

=item C<autodie>

C<autodie> is a new lexically-scoped alternative for the C<Fatal> module.
The bundled version is 2.06_01. Note that in this release, using a string
eval when C<autodie> is in effect can cause the autodie behaviour to leak
into the surrounding scope. See L<autodie/"BUGS"> for more details.

Version 2.06_01 has been added to the Perl core.

=item C<Compress::Raw::Bzip2>

Version 2.024 has been added to the Perl core.

=item C<overloading>

C<overloading> allows you to lexically disable or enable overloading
for some or all operations.

Version 0.001 has been added to the Perl core.

=item C<parent>

C<parent> establishes an ISA relationship with base classes at compile
time. It provides the key feature of C<base> without further unwanted
behaviors.

Version 0.223 has been added to the Perl core.

=item C<Parse::CPAN::Meta>

Version 1.40 has been added to the Perl core.

=item C<VMS::DCLsym>

Version 1.03 has been added to the Perl core.

=item C<VMS::Stdio>

Version 2.4 has been added to the Perl core.

=item C<XS::APItest::KeywordRPN>

Version 0.003 has been added to the Perl core.

=back
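
Two of the new pragmata can be tried together in one file. The following
is a minimal sketch; the C<Counter> and C<LoudCounter> classes are
hypothetical and exist only for this illustration.

    use strict;
    use warnings;

    package Counter;

    # Stringify a counter as "Counter(N)".
    use overload '""' => sub { "Counter(" . $_[0]{n} . ")" };

    sub new {
        my ($class, %args) = @_;
        return bless { n => $args{n} || 0 }, $class;
    }

    package LoudCounter;

    # -norequire: Counter lives in this file, so do not try to load it.
    use parent -norequire, 'Counter';

    package main;

    my $counter = LoudCounter->new(n => 3);
    print "with overloading:    $counter\n";

    {
        # Lexically disable all overloading inside this block.
        no overloading;
        print "without overloading: $counter\n";
    }

Inside the inner block the object should stringify in the default
C<CLASS=HASH(0x...)> style, because C<no overloading> bypasses the
inherited C<""> handler.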

=head2 Updated Pragmata

=over 4

=item C<base>

Upgraded from version 2.13 to 2.15.

=item C<bignum>

Upgraded from version 0.22 to 0.23.

=item C<charnames>

C<charnames> now contains the Unicode F<NameAliases.txt> database file.
This has the effect of adding some extra C<\N> character names that
formerly wouldn't have been recognised; for example, C<"\N{LATIN CAPITAL
LETTER GHA}">.

Upgraded from version 1.06 to 1.07.

=item C<constant>

Upgraded from version 1.13 to 1.20.

=item C<diagnostics>

C<diagnostics> now supports %.0f formatting internally.

C<diagnostics> no longer suppresses C<Use of uninitialized value in range
(or flip)> warnings. [perl #71204]

Upgraded from version 1.17 to 1.19.

=item C<feature>

In C<feature>, the meaning of the C<:5.10> and C<:5.10.X> feature
bundles has changed slightly. The last component, if any (i.e. C<X>) is
simply ignored.  This is predicated on the assumption that new features
will not, in general, be added to maintenance releases. So C<:5.10>
and C<:5.10.X> have identical effect. This is a change to the behaviour
documented for 5.10.0.

C<feature> now includes the C<unicode_strings> feature:

    use feature "unicode_strings";

This pragma turns on Unicode semantics for the case-changing operations
(C<uc>, C<lc>, C<ucfirst>, C<lcfirst>) on strings that don't have the
internal UTF-8 flag set, but that contain single-byte characters between
128 and 255.

Upgraded from version 1.11 to 1.16.

=item C<less>

C<less> now includes the C<stash_name> method to allow subclasses of
C<less> to pick where in %^H to store their stash.

Upgraded from version 0.02 to 0.03.

=item C<lib>

Upgraded from version 0.5565 to 0.62.

=item C<mro>

C<mro> is now implemented as an XS extension. The documented interface has
not changed. Code relying on the implementation detail that some C<mro::>
methods happened to be available at all times gets to "keep both pieces".

Upgraded from version 1.00 to 1.02.

=item C<overload>

C<overload> now allows overloading of 'qr'.

Upgraded from version 1.06 to 1.10.

=item C<threads>

Upgraded from version 1.67 to 1.75.

=item C<threads::shared>

Upgraded from version 1.14 to 1.32.

=item C<version>

C<version> now has support for L</Version number formats> as described
earlier in this document and in its own documentation.

Upgraded from version 0.74 to 0.82.

=item C<warnings>

C<warnings> has a new C<warnings::fatal_enabled()> function.  It also
includes a new C<illegalproto> warning category. See also L</New or
Changed Diagnostics> for this change.

Upgraded from version 1.06 to 1.09.

=back
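
The C<unicode_strings> feature and C<qr> overloading described above can
be sketched as follows. The C<Pattern> class is hypothetical, and the
example assumes perl 5.12 or later.

    use strict;
    use warnings;
    use feature 'unicode_strings';

    # A single-byte string holding U+00E9, without the UTF-8 flag.
    my $lower = "\xE9";
    my $upper = uc $lower;
    printf "uc(U+00E9) gives U+%04X\n", ord($upper);

    package Pattern;

    # Called whenever the object is used as a regular expression.
    use overload 'qr' => sub {
        my $self = shift;
        return qr/\Q$self->{text}\E/i;
    };

    sub new {
        my ($class, $text) = @_;
        return bless { text => $text }, $class;
    }

    package main;

    my $pattern = Pattern->new('perl');
    print "matched\n" if 'I like Perl' =~ $pattern;

With C<unicode_strings> in effect, the first line of output is expected
to show U+00C9; without the feature (and without a locale) the character
would be left unchanged.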

=head2 Updated Modules

=over 4

=item C<Archive::Extract>

Upgraded from version 0.24 to 0.38.

=item C<Archive::Tar>

Upgraded from version 1.38 to 1.54.

=item C<Attribute::Handlers>

Upgraded from version 0.79 to 0.87.

=item C<AutoLoader>

Upgraded from version 5.63 to 5.70.

=item C<B::Concise>

Upgraded from version 0.74 to 0.78.

=item C<B::Debug>

Upgraded from version 1.05 to 1.12.

=item C<B::Deparse>

Upgraded from version 0.83 to 0.96.

=item C<B::Lint>

Upgraded from version 1.09 to 1.11_01.

=item C<CGI>

Upgraded from version 3.29 to 3.48.

=item C<Class::ISA>

Upgraded from version 0.33 to 0.36.

NOTE: C<Class::ISA> is deprecated and may be removed from a future
version of Perl.

=item C<Compress::Raw::Zlib>

Upgraded from version 2.008 to 2.024.

=item C<CPAN>

Upgraded from version 1.9205 to 1.94_56.

=item C<CPANPLUS>

Upgraded from version 0.84 to 0.90.

=item C<CPANPLUS::Dist::Build>

Upgraded from version 0.06_02 to 0.46.

=item C<Data::Dumper>

Upgraded from version 2.121_14 to 2.125.

=item C<DB_File>

Upgraded from version 1.816_1 to 1.820.

=item C<Devel::PPPort>

Upgraded from version 3.13 to 3.19.

=item C<Digest>

Upgraded from version 1.15 to 1.16.

=item C<Digest::MD5>

Upgraded from version 2.36_01 to 2.39.

=item C<Digest::SHA>

Upgraded from version 5.45 to 5.47.

=item C<Encode>

Upgraded from version 2.23 to 2.39.

=item C<Exporter>

Upgraded from version 5.62 to 5.64_01.

=item C<ExtUtils::CBuilder>

Upgraded from version 0.21 to 0.27.

=item C<ExtUtils::Command>

Upgraded from version 1.13 to 1.16.

=item C<ExtUtils::Constant>

Upgraded from version 0.2 to 0.22.

=item C<ExtUtils::Install>

Upgraded from version 1.44 to 1.55.

=item C<ExtUtils::MakeMaker>

Upgraded from version 6.42 to 6.56.

=item C<ExtUtils::Manifest>

Upgraded from version 1.51_01 to 1.57.

=item C<ExtUtils::ParseXS>

Upgraded from version 2.18_02 to 2.21.

=item C<File::Fetch>

Upgraded from version 0.14 to 0.24.

=item C<File::Path>

Upgraded from version 2.04 to 2.08_01.

=item C<File::Temp>

Upgraded from version 0.18 to 0.22.

=item C<Filter::Simple>

Upgraded from version 0.82 to 0.84.

=item C<Filter::Util::Call>

Upgraded from version 1.07 to 1.08.

=item C<Getopt::Long>

Upgraded from version 2.37 to 2.38.

=item C<IO>

Upgraded from version 1.23_01 to 1.25_02.

=item C<IO::Zlib>

Upgraded from version 1.07 to 1.10.

=item C<IPC::Cmd>

Upgraded from version 0.40_1 to 0.54.

=item C<IPC::SysV>

Upgraded from version 1.05 to 2.01.

=item C<Locale::Maketext>

Upgraded from version 1.12 to 1.14.

=item C<Locale::Maketext::Simple>

Upgraded from version 0.18 to 0.21.

=item C<Log::Message>

Upgraded from version 0.01 to 0.02.

=item C<Log::Message::Simple>

Upgraded from version 0.04 to 0.06.

=item C<Math::BigInt>

Upgraded from version 1.88 to 1.89_01.

=item C<Math::BigInt::FastCalc>

Upgraded from version 0.16 to 0.19.

=item C<Math::BigRat>

Upgraded from version 0.21 to 0.24.

=item C<Math::Complex>

Upgraded from version 1.37 to 1.56.

=item C<Memoize>

Upgraded from version 1.01_02 to 1.01_03.

=item C<MIME::Base64>

Upgraded from version 3.07_01 to 3.08.

=item C<Module::Build>

Upgraded from version 0.2808_01 to 0.3603.

=item C<Module::CoreList>

Upgraded from version 2.12 to 2.29.

=item C<Module::Load>

Upgraded from version 0.12 to 0.16.

=item C<Module::Load::Conditional>

Upgraded from version 0.22 to 0.34.

=item C<Module::Loaded>

Upgraded from version 0.01 to 0.06.

=item C<Module::Pluggable>

Upgraded from version 3.6 to 3.9.

=item C<Net::Ping>

Upgraded from version 2.33 to 2.36.

=item C<NEXT>

Upgraded from version 0.60_01 to 0.64.

=item C<Object::Accessor>

Upgraded from version 0.32 to 0.36.

=item C<Package::Constants>

Upgraded from version 0.01 to 0.02.

=item C<PerlIO>

Upgraded from version 1.04 to 1.06.

=item C<Pod::Parser>

Upgraded from version 1.35 to 1.37.

=item C<Pod::Perldoc>

Upgraded from version 3.14_02 to 3.15_02.

=item C<Pod::Plainer>

Upgraded from version 0.01 to 1.02.

NOTE: C<Pod::Plainer> is deprecated and may be removed from a future
version of Perl.

=item C<Pod::Simple>

Upgraded from version 3.05 to 3.13.

=item C<Safe>

Upgraded from version 2.12 to 2.22.

=item C<SelfLoader>

Upgraded from version 1.11 to 1.17.

=item C<Storable>

Upgraded from version 2.18 to 2.22.

=item C<Switch>

Upgraded from version 2.13 to 2.16.

NOTE: C<Switch> is deprecated and may be removed from a future version
of Perl.

=item C<Sys::Syslog>

Upgraded from version 0.22 to 0.27.

=item C<Term::ANSIColor>

Upgraded from version 1.12 to 2.02.

=item C<Term::UI>

Upgraded from version 0.18 to 0.20.

=item C<Test>

Upgraded from version 1.25 to 1.25_02.

=item C<Test::Harness>

Upgraded from version 2.64 to 3.17.

=item C<Test::Simple>

Upgraded from version 0.72 to 0.94.

=item C<Text::Balanced>

Upgraded from version 2.0.0 to 2.02.

=item C<Text::ParseWords>

Upgraded from version 3.26 to 3.27.

=item C<Text::Soundex>

Upgraded from version 3.03 to 3.03_01.

=item C<Thread::Queue>

Upgraded from version 2.00 to 2.11.

=item C<Thread::Semaphore>

Upgraded from version 2.01 to 2.09.

=item C<Tie::RefHash>

Upgraded from version 1.37 to 1.38.

=item C<Time::HiRes>

Upgraded from version 1.9711 to 1.9719.

=item C<Time::Local>

Upgraded from version 1.18 to 1.1901_01.

=item C<Time::Piece>

Upgraded from version 1.12 to 1.15.

=item C<Unicode::Collate>

Upgraded from version 0.52 to 0.52_01.

=item C<Unicode::Normalize>

Upgraded from version 1.02 to 1.03.

=item C<Win32>

Upgraded from version 0.34 to 0.39.

=item C<Win32API::File>

Upgraded from version 0.1001_01 to 0.1101.

=item C<XSLoader>

Upgraded from version 0.08 to 0.10.

=back

=head2 Removed Modules and Pragmata

=over 4

=item C<attrs>

Removed from the Perl core.  Prior version was 1.02.

=item C<CPAN::API::HOWTO>

Removed from the Perl core.  Prior version was 'undef'.

=item C<CPAN::DeferedCode>

Removed from the Perl core.  Prior version was 5.50.

=item C<CPANPLUS::inc>

Removed from the Perl core.  Prior version was 'undef'.

=item C<DCLsym>

Removed from the Perl core.  Prior version was 1.03.

=item C<ExtUtils::MakeMaker::bytes>

Removed from the Perl core.  Prior version was 6.42.

=item C<ExtUtils::MakeMaker::vmsish>

Removed from the Perl core.  Prior version was 6.42.

=item C<Stdio>

Removed from the Perl core.  Prior version was 2.3.

=item C<Test::Harness::Assert>

Removed from the Perl core.  Prior version was 0.02.

=item C<Test::Harness::Iterator>

Removed from the Perl core.  Prior version was 0.02.

=item C<Test::Harness::Point>

Removed from the Perl core.  Prior version was 0.01.

=item C<Test::Harness::Results>

Removed from the Perl core.  Prior version was 0.01.

=item C<Test::Harness::Straps>

Removed from the Perl core.  Prior version was 0.26_01.

=item C<Test::Harness::Util>

Removed from the Perl core.  Prior version was 0.01.

=item C<XSSymSet>

Removed from the Perl core.  Prior version was 1.1.

=back

=head2 Deprecated Modules and Pragmata

See L</Deprecated Modules> above.


=head1 Documentation

=head2 New Documentation

=over 4

=item *

L<perlhaiku> contains instructions on how to build perl for the Haiku
platform.

=item *

L<perlmroapi> describes the new interface for pluggable Method Resolution
Orders.

=item *

L<perlperf>, by Richard Foley, provides an introduction to performance
and optimization techniques, with particular reference to perl
programs.

=item *

L<perlrepository> describes how to access the perl source using the I<git>
version control system.

=item *

L<perlpolicy> extends the "Social contract about contributed modules" into
the beginnings of a document on Perl porting policies.

=back

=head2 Changes to Existing Documentation


=over

=item *

The various large F<Changes*> files (which listed every change made
to perl over the last 18 years) have been removed, and replaced by a
small file, also called F<Changes>, which just explains how that same
information may be extracted from the git version control system.

=item *

F<Porting/patching.pod> has been deleted, as it mainly described
interacting with the old Perforce-based repository, which is now obsolete.
Information still relevant has been moved to L<perlrepository>.

=item *

The syntax C<unless (EXPR) BLOCK else BLOCK> is now documented as valid,
as is the syntax C<unless (EXPR) BLOCK elsif (EXPR) BLOCK ... else
BLOCK>, although actually using the latter may not be the best idea for
the readability of your source code.

=item *

Documented -X overloading.

=item *

Documented that C<when()> treats most of the filetest operators specially.

=item *

Documented C<when> as a syntax modifier.

=item *

Eliminated "Old Perl threads tutorial", which described 5005 threads.

F<pod/perlthrtut.pod> is the same material reworked for ithreads.

=item *

Corrected previous documentation: v-strings are not deprecated.

With version objects, we need them to use MODULE VERSION syntax. This
patch removes the deprecation notice.

=item *

Security contact information is now part of L<perlsec>.

=item *

A significant fraction of the core documentation has been updated to
clarify the behavior of Perl's Unicode handling.

Much of the remaining core documentation has been reviewed and edited
for clarity, consistent use of language, and to fix the spelling of Tom
Christiansen's name.

=item *

The Pod specification (L<perlpodspec>) has been updated to bring the
specification in line with modern usage already supported by most Pod
systems. A parameter string may now follow the format name in a
"begin/end" region. Links to URIs with a text description are now
allowed. The usage of C<LE<lt>"section"E<gt>> has been marked as
deprecated.

=item *

L<if.pm|if> has been documented in L<perlfunc/use> as a means to get
conditional loading of modules despite the implicit BEGIN block around
C<use>.

=item *

The documentation for C<$1> in perlvar.pod has been clarified.

=item *

C<\N{U+I<code point>}> is now documented.

=back
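
Two of the newly documented constructs are shown in the sketch below. The
variable names and messages are illustrative only.

    use strict;
    use warnings;

    my $count = 0;

    # unless ... elsif ... else is valid, if not always readable.
    unless ($count) {
        print "nothing to do\n";
    }
    elsif ($count == 1) {
        print "one item\n";
    }
    else {
        print "several items\n";
    }

    # \N{U+...} names a character by its code point.
    my $smiley = "\N{U+263A}";
    printf "code point is U+%04X\n", ord($smiley);

An equivalent C<if (!$count) ... elsif ... else> chain is usually easier
to follow than the C<unless> form.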

=head1 Selected Performance Enhancements

=over 4

=item *

A new internal cache means that C<isa()> will often be faster.

=item *

The implementation of C<C3> Method Resolution Order has been
optimised - linearisation for classes with single inheritance is 40%
faster. Performance for multiple inheritance is unchanged.

=item *

Under C<use locale>, the locale-relevant information is now cached on
read-only values, such as the list returned by C<keys %hash>. This makes
operations such as C<sort keys %hash> in the scope of C<use locale>
much faster.

=item *

Empty C<DESTROY> methods are no longer called.

=item *

C<Perl_sv_utf8_upgrade()> is now faster.

=item *

C<keys> on an empty hash is now faster.

=item *

C<if (%foo)> has been optimized to be faster than C<if (keys %foo)>.

=item *

The string repetition operator (C<$str x $num>) is now several times
faster when C<$str> has length one or C<$num> is large.

=item *

Reversing an array to itself (as in C<@a = reverse @a>) in void context
now happens in-place and is several orders of magnitude faster than
it used to be. It will also preserve non-existent elements whenever
possible, i.e. for non magical arrays or tied arrays with C<EXISTS>
and C<DELETE> methods.

=back
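
The idioms that benefit from several of these optimisations look like
the following sketch. It only shows the forms in question and makes no
timing claims; the data is arbitrary.

    use strict;
    use warnings;

    my %seen  = (apple => 1);
    my @items = (1 .. 5);

    # Testing the hash directly is now the faster emptiness check.
    print "hash has entries\n" if %seen;

    # In void context this reversal now happens in place.
    @items = reverse @items;
    print "@items\n";

    # Repeating a one-character string is one of the faster cases.
    my $rule = '-' x 40;
    print "$rule\n";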

=head1 Installation and Configuration Improvements

=over 4

=item *

L<perlapi>, L<perlintern>, L<perlmodlib> and L<perltoc> are now all
generated at build time, rather than being shipped as part of the release.

=item *

If C<vendorlib> and C<vendorarch> are the same, then they are only added
to C<@INC> once.

=item *

C<$Config{usedevel}> and the C-level C<PERL_USE_DEVEL> are now defined if
perl is built with C<-Dusedevel>.

=item *

F<Configure> will enable use of C<-fstack-protector>, to provide protection
against stack-smashing attacks, if the compiler supports it.

=item *

F<Configure> will now determine the correct prototypes for re-entrant
functions and for C<gconvert> if you are using a C++ compiler rather
than a C compiler.

=item *

On Unix, if you build from a tree containing a git repository, the
configuration process will note the commit hash you have checked out, for
display in the output of C<perl -v> and C<perl -V>. Unpushed local commits
are automatically added to the list of local patches displayed by
C<perl -V>.

=item *

Perl now supports SystemTap's C<dtrace> compatibility layer and an
issue with linking C<miniperl> has been fixed in the process.

=item *

perldoc now uses C<less -R> instead of C<less> for improved behaviour
in the face of C<groff>'s new usage of ANSI escape codes.

=item *


C<perl -V> now reports use of the compile-time options C<USE_PERL_ATOF> and
C<USE_ATTRIBUTES_FOR_PERLIO>.

=item *

As part of the flattening of F<ext>, all extensions on all platforms are
built by F<make_ext.pl>. This replaces the Unix-specific
F<ext/util/make_ext>, VMS-specific F<make_ext.com> and Win32-specific
F<win32/buildext.pl>.

=back
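
Some of these configuration details can be inspected from Perl itself.
This is a sketch using the core C<Config> module; what it prints depends
entirely on how the running perl was built.

    use strict;
    use warnings;
    use Config;

    # Defined only when perl was configured with -Dusedevel.
    my $usedevel = $Config{usedevel};
    print "usedevel: ", (defined $usedevel ? $usedevel : 'undef'), "\n";

    # Report any directory that appears in @INC more than once.
    my %count;
    $count{$_}++ for @INC;
    my @repeated = grep { $count{$_} > 1 } sort keys %count;
    print "repeated \@INC entries: ",
        (@repeated ? "@repeated" : "none"), "\n";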

=head1 Internal Changes

Each release of Perl sees numerous internal changes which shouldn't
affect day to day usage but may still be notable for developers working
with Perl's source code.

=over

=item *

The J.R.R. Tolkien quotes at the head of C source files have been checked
and proper citations added, thanks to a patch from Tom Christiansen.

=item *

The internal structure of the dual-life modules traditionally found in
the F<lib/> and F<ext/> directories in the perl source has changed
significantly. Where possible, dual-lifed modules have been extracted
from F<lib/> and F<ext/>.

Dual-lifed modules maintained by Perl's developers as part of the Perl
core now live in F<dist/>.  Dual-lifed modules maintained primarily on
CPAN now live in F<cpan/>.  When reporting a bug in a module located
under F<cpan/>, please send your bug report directly to the module's
bug tracker or author, rather than Perl's bug tracker.

=item *

C<\N{...}> now compiles better, always forces UTF-8 internal representation

Perl's developers have fixed several problems with the recognition of
C<\N{...}> constructs.  As part of this, perl will store any scalar
or regex containing C<\N{I<name>}> or C<\N{U+I<code point>}> in its
definition in UTF-8 format. (This was true previously for all occurrences
of C<\N{I<name>}> that did not use a custom translator, but now it's
always true.)

=item *

Perl_magic_setmglob now knows about globs, fixing RT #71254.

=item *

C<SVt_RV> no longer exists. RVs are now stored in IVs.

=item *

C<Perl_vcroak()> now accepts a null first argument. In addition, a full
audit was made of the "not NULL" compiler annotations, and those for
several other internal functions were corrected.

=item *

New macros C<dSAVEDERRNO>, C<dSAVE_ERRNO>, C<SAVE_ERRNO>, C<RESTORE_ERRNO>
have been added to formalise the temporary saving of the C<errno>
variable.

=item *

The function C<Perl_sv_insert_flags> has been added to augment
C<Perl_sv_insert>.

=item *

The function C<Perl_newSV_type(type)> has been added, equivalent to
C<Perl_newSV()> followed by C<Perl_sv_upgrade(type)>.

=item *

The function C<Perl_newSVpvn_flags()> has been added, equivalent to
C<Perl_newSVpvn()> and then performing the action relevant to the flag.

Two flag bits are currently supported.

=over 4

=item *

C<SVf_UTF8> will call C<SvUTF8_on()> for you. (Note that this does
not convert a sequence of ISO 8859-1 characters to UTF-8). A wrapper,
C<newSVpvn_utf8()>, is available for this.

=item *

C<SVs_TEMP> now calls C<Perl_sv_2mortal()> on the new SV.

=back

There is also a wrapper that takes constant strings, C<newSVpvs_flags()>.

=item *

The function C<Perl_croak_xs_usage> has been added as a wrapper to
C<Perl_croak>.

=item *

Perl now exports the functions C<PerlIO_find_layer> and C<PerlIO_list_alloc>.

=item *

C<PL_na> has been exterminated from the core code, replaced by local
STRLEN temporaries, or C<*_nolen()> calls. Either approach is faster than
C<PL_na>, which is a pointer dereference into the interpreter structure
under ithreads, and a global variable otherwise.

=item *

C<Perl_mg_free()> used to leave freed memory accessible via C<SvMAGIC()>
on the scalar. It now updates the linked list to remove each piece of
magic as it is freed.

=item *

Under ithreads, the regex in C<PL_reg_curpm> is now reference
counted. This eliminates a lot of hackish workarounds to cope with it
not being reference counted.

=item *

C<Perl_mg_magical()> would sometimes incorrectly turn on C<SvRMAGICAL()>.
This has been fixed.

=item *

The I<public> IV and NV flags are now not set if the string value has
trailing "garbage". This behaviour is consistent with not setting the
public IV or NV flags if the value is out of range for the type.

=item *

Uses of C<Nullav>, C<Nullcv>, C<Nullhv>, C<Nullop>, C<Nullsv> etc have
been replaced by C<NULL> in the core code, and non-dual-life modules,
as C<NULL> is clearer to those unfamiliar with the core code.

=item *

A macro C<MUTABLE_PTR(p)> has been added, which on (non-pedantic) gcc will
not cast away C<const>, returning a C<void *>. Macros C<MUTABLE_AV(p)>,
C<MUTABLE_CV(p)> etc. build on this, casting to C<AV *>, C<CV *> etc. without
casting away C<const>. This allows proper compile-time auditing of
C<const> correctness in the core, and helped pick up some errors
(now fixed).

=item *

Macros C<mPUSHs()> and C<mXPUSHs()> have been added, for pushing SVs on the
stack and mortalizing them.

=item *

Use of the private structure C<mro_meta> has changed slightly. Nothing
outside the core should be accessing this directly anyway.

=item *

A new tool, F<Porting/expand-macro.pl>, has been added. It allows you
to view how a C preprocessor macro would be expanded when compiled.
This is handy when trying to decode the macro hell that is the perl
guts.

=back

=head1 Testing

=head2 Testing improvements

=over 4

=item Parallel tests

The core distribution can now run its regression tests in parallel on
Unix-like platforms. Instead of running C<make test>, set C<TEST_JOBS> in
your environment to the number of tests to run in parallel, and run
C<make test_harness>. On a Bourne-like shell, this can be done as

    TEST_JOBS=3 make test_harness  # Run 3 tests in parallel

An environment variable is used, rather than parallel make itself, because
L<TAP::Harness> needs to be able to schedule individual non-conflicting test
scripts itself, and there is no standard interface to C<make> utilities to
interact with their job schedulers.

Note that currently some test scripts may fail when run in parallel (most
notably C<ext/IO/t/io_dir.t>). If necessary run just the failing scripts
again sequentially and see if the failures go away.

=item Test harness flexibility

It's now possible to override C<PERL5OPT> and friends in F<t/TEST>.

=item Test watchdog

Several tests that have the potential to hang forever if they fail now
incorporate a "watchdog" functionality that will kill them after a timeout,
which helps ensure that C<make test> and C<make test_harness> run to
completion automatically.


=back
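
The watchdog idea can be sketched in plain Perl. The snippet below is only
an illustration, assuming a platform where C<alarm> is available. The helper
name C<run_with_watchdog> is hypothetical, and this is not the code used by
the core test scripts, which may terminate a hung test differently.

```perl

    use strict;
    use warnings;

    # Hypothetical helper: run $code, but give up after $timeout seconds.
    sub run_with_watchdog {
        my ($timeout, $code) = @_;
        my $finished = eval {
            local $SIG{ALRM} = sub { die "watchdog timeout\n" };
            alarm $timeout;
            $code->();
            alarm 0;
            1;
        };
        my $error = $@;
        alarm 0;    # make sure no alarm is left pending
        return 'finished'  if $finished;
        return 'timed out' if $error eq "watchdog timeout\n";
        die $error;        # some other failure: pass it on
    }

    my $quick = run_with_watchdog(5, sub { return 1 });

    # A long select() timeout stands in for a test that hangs.
    my $hung = run_with_watchdog(1, sub { select undef, undef, undef, 30 });

    print "quick: $quick, hung: $hung\n";

```

Here the slow code is merely interrupted inside the same process; a real
watchdog for a test script would more likely end the whole process.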

=head2 New Tests

Perl's developers have added a number of new tests to the core.
In addition to the items listed below, many modules updated from CPAN
incorporate new tests.

=over 4

=item *

Significant cleanups to core tests to ensure that language and
interpreter features are not used before they're tested.

=item *

C<make test_porting> now runs a number of important pre-commit checks
which might be of use to anyone working on the Perl core.

=item *

F<t/porting/podcheck.t> automatically checks the well-formedness of
POD found in all .pl, .pm and .pod files in the F<MANIFEST>, other than in
dual-lifed modules which are primarily maintained outside the Perl core.

=item *

F<t/porting/manifest.t> now tests that all files listed in MANIFEST
are present.

=item *

F<t/op/while_readdir.t> tests that a bare C<readdir> in a C<while> loop sets C<$_>.

=item *

F<t/comp/retainedlines.t> checks that the debugger can retain source
lines from C<eval>.

=item *

F<t/io/perlio_fail.t> checks that bad layers fail.

=item *

F<t/io/perlio_leaks.t> checks that PerlIO layers are not leaking.

=item *

F<t/io/perlio_open.t> checks that certain special forms of open work.

=item *

F<t/io/perlio.t> includes general PerlIO tests.

=item *

F<t/io/pvbm.t> checks that there is no unexpected interaction between
the internal types C<PVBM> and C<PVGV>.

=item *

F<t/mro/package_aliases.t> checks that mro works properly in the presence
of aliased packages.

=item *

F<t/op/dbm.t> tests C<dbmopen> and C<dbmclose>.

=item *

F<t/op/index_thr.t> tests the interaction of C<index> and threads.

=item *

F<t/op/pat_thr.t> tests the interaction of esoteric patterns and threads.

=item *

F<t/op/qr_gc.t> tests that C<qr> doesn't leak.

=item *

F<t/op/reg_email_thr.t> tests the interaction of regex recursion and threads.

=item *

F<t/op/regexp_qr_embed_thr.t> tests the interaction of patterns with
embedded C<qr//> and threads.

=item *

F<t/op/regexp_unicode_prop.t> tests Unicode properties in regular
expressions.

=item *

F<t/op/regexp_unicode_prop_thr.t> tests the interaction of Unicode
properties and threads.

=item *

F<t/op/reg_nc_tie.t> tests the tied methods of C<Tie::Hash::NamedCapture>.

=item *

F<t/op/reg_posixcc.t> checks that POSIX character classes behave
consistently.

=item *

F<t/op/re.t> checks that exportable C<re> functions in F<universal.c> work.

=item *

F<t/op/setpgrpstack.t> checks that C<setpgrp> works.

=item *

F<t/op/substr_thr.t> tests the interaction of C<substr> and threads.

=item *

F<t/op/upgrade.t> checks that upgrading and assigning scalars works.

=item *

F<t/uni/lex_utf8.t> checks that Unicode in the lexer works.

=item *

F<t/uni/tie.t> checks that Unicode and C<tie> work.

=item *

F<t/comp/final_line_num.t> tests whether line numbers are correct at EOF.

=item *

F<t/comp/form_scope.t> tests format scoping.

=item *

F<t/comp/line_debug.t> tests whether C<< @{"_<$file"} >> works.

=item *

F<t/op/filetest_t.t> tests whether the C<-t> file test works.

=item *

F<t/op/qr.t> tests C<qr>.

=item *

F<t/op/utf8cache.t> tests malfunctions of the utf8 cache.

=item *

F<t/re/uniprops.t> tests Unicode's C<\p{}> regex constructs.

=item *

F<t/op/filehandle.t> tests some suitably portable filetest operators
to check that they work as expected, particularly in the light of some
internal changes made in how filehandles are blessed.

=item *

F<t/op/time_loop.t> tests that Unix times greater than C<2**63>, which
can now be handed to C<gmtime> and C<localtime>, do not cause an internal
overflow or an excessively long loop.

=back


=head1 New or Changed Diagnostics

=head2 New Diagnostics

=over

=item *

SV allocation tracing has been added to the diagnostics enabled by C<-Dm>.
The tracing can alternatively output via the C<PERL_MEM_LOG> mechanism, if
that was enabled when the F<perl> binary was compiled.

=item *

Smartmatch resolution tracing has been added as a new diagnostic. Use
C<-DM> to enable it.

=item *

A new debugging flag C<-DB> now dumps subroutine definitions, leaving
C<-Dx> for its original purpose of dumping syntax trees.

=item *

Perl 5.12 provides a number of new diagnostic messages to help you write
better code.  See L<perldiag> for details of these new messages.

=over 4

=item *

C<Bad plugin affecting keyword '%s'>

=item *

C<gmtime(%.0f) too large>

=item *

C<Lexing code attempted to stuff non-Latin-1 character into Latin-1 input>

=item *

C<Lexing code internal error (%s)>

=item *

C<localtime(%.0f) too large>

=item *

C<Overloaded dereference did not return a reference>

=item *

C<Overloaded qr did not return a REGEXP>

=item *

C<Perl_pmflag() is deprecated, and will be removed from the XS API>

=item *

C<lvalue attribute ignored after the subroutine has been defined>

This new warning is issued when one attempts to mark a subroutine as
lvalue after it has been defined.

=item *

Perl now warns you if C<++> or C<--> are unable to change the value
because it's beyond the limit of representation.

This uses a new warnings category: "imprecision".

=item *

C<lc>, C<uc>, C<lcfirst>, and C<ucfirst> warn when passed undef.

=item *

The warning C<Useless use of a constant in void context> now shows the
constant in question.

=item *

C<Prototype after '%s'>

=item *

C<panic: sv_chop %s>

This new fatal error occurs when the C routine C<Perl_sv_chop()> was
passed a position that is not within the scalar's string buffer. This
could be caused by buggy XS code, and at this point recovery is not
possible.

=item *

The fatal error C<Malformed UTF-8 returned by \N> is now produced if the
C<charnames> handler returns malformed UTF-8.

=item *

If an unresolved named character or sequence was encountered when
compiling a regex pattern then the fatal error C<\N{NAME} must be resolved
by the lexer> is now produced. This can happen, for example, when using a
single-quotish context like C<$re = '\N{SPACE}'; /$re/;>. See L<perldiag>
for more examples of how the lexer can get bypassed.

=item *

C<Invalid hexadecimal number in \N{U+...}> is a new fatal error
triggered when the character constant represented by C<...> is not a
valid hexadecimal number.

=item *

The new meaning of C<\N> as C<[^\n]> is not valid in a bracketed character
class, just like C<.> in a character class loses its special meaning,
and will cause the fatal error C<\N in a character class must be a named
character: \N{...}>.

=item *

The rules on what is legal for the C<...> in C<\N{...}> have been
tightened up so that unless the C<...> begins with an alphabetic
character and continues with a combination of alphanumerics, dashes,
spaces, parentheses or colons then the warning C<Deprecated character(s)
in \N{...} starting at '%s'> is now issued.

=item *

The warning C<Using just the first characters returned by \N{}> will
be issued if the C<charnames> handler returns a sequence of characters
which exceeds the limit of the number of characters that can be used. The
message will indicate which characters were used and which were discarded.

=back

=back
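
The new "imprecision" category can be tried out with a short script. This
is a minimal sketch, assuming a perl whose floating point type cannot
represent C<1e300 + 1> exactly (true for ordinary doubles). The variable
names are made up for the example.

```perl

    use strict;
    use warnings;

    my @seen;
    local $SIG{__WARN__} = sub { push @seen, $_[0] };

    my $big = 1e300;    # far beyond the range where adding 1 has any effect
    $big++;             # warns, because the value cannot change

    {
        no warnings 'imprecision';
        my $quiet = 1e300;
        $quiet++;       # same operation, warning suppressed
    }

    print scalar(@seen), " warning(s) captured\n";
    print $_ for @seen;

```

Only the first increment should be reported, since the second one runs in
a scope where the category is switched off.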

=head2 Changed Diagnostics

A number of existing diagnostic messages have been improved or corrected:

=over

=item *

A new warning category C<illegalproto> allows finer-grained control of
warnings around function prototypes.

The two warnings:

=over

=item C<Illegal character in prototype for %s : %s>

=item C<Prototype after '%c' for %s : %s>

=back

have been moved from the C<syntax> top-level warnings category into a new
first-level category, C<illegalproto>. These two warnings are currently
the only ones emitted during parsing of an invalid/illegal prototype,
so one can now use

  no warnings 'illegalproto';

to suppress only those, but not other syntax-related warnings. Warnings
where prototypes are changed, ignored, or not met are still in the
C<prototype> category as before.

=item *

C<Deep recursion on subroutine "%s">

It is now possible to change the depth threshold for this warning from the
default of 100, by recompiling the F<perl> binary, setting the C
pre-processor macro C<PERL_SUB_DEPTH_WARN> to the desired value.

=item *

The C<Illegal character in prototype> warning is now more precise
when reporting illegal characters after C<_>.

=item *

mro merging error messages are now very similar to those produced by
L<Algorithm::C3>.

=item *

The error message "Unrecognized character %s in column %d" has been
improved.

It now reads "Unrecognized character %s; marked by E<lt>-- HERE after
%sE<lt>-- HERE near column %d". This should make it a little simpler to
spot and correct the suspicious character.

=item *

Perl now explicitly points to C<$.> when it causes an uninitialized
warning for ranges in scalar context.

=item *

C<split> now warns when called in void context.

=item *

C<printf>-style functions called with too few arguments will now issue the
warning C<"Missing argument in %s"> [perl #71000].

=item *

Perl now properly returns a syntax error instead of segfaulting
if C<each>, C<keys>, or C<values> is used without an argument.

=item *

C<tell()> now fails properly if called without an argument and when no
previous file was read.

C<tell()> now returns C<-1>, and sets errno to C<EBADF>, thus restoring
the 5.8.x behaviour.

=item *

C<overload> no longer implicitly unsets fallback on repeated 'use
overload' lines.

=item *

POSIX::strftime() can now handle Unicode characters in the format string.

=item *

The C<syntax> category was removed from 5 warnings that should only be in
C<deprecated>.

=item *

Three fatal C<pack>/C<unpack> error messages have been normalized to
C<panic: %s>.

=item *

C<Unicode character is illegal> has been rephrased to be more accurate.

It now reads C<Unicode non-character is illegal in interchange> and the
perldiag documentation has been expanded a bit.

=item *

Currently, all but the first of the several characters that the
C<charnames> handler may return are discarded when used in a regular
expression pattern bracketed character class. If this happens then the
warning C<Using just the first character returned by \N{} in character
class> will be issued.

=item *

The warning C<Missing right brace on \N{} or unescaped left brace after
\N.  Assuming the latter> will be issued if Perl encounters a C<\N{>
but doesn't find a matching C<}>. In this case Perl doesn't know if it
was mistakenly omitted, or if "match non-newline" followed by "match
a C<{>" was desired.  It assumes the latter because that is actually a
valid interpretation as written, unlike the other case.  If you meant
the former, you need to add the matching right brace.  If you did mean
the latter, you can silence this warning by writing instead C<\N\{>.

=item *

C<gmtime> and C<localtime> called with numbers smaller than they can
reliably handle will now issue the warnings C<gmtime(%.0f) too small>
and C<localtime(%.0f) too small>.

=back
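
As an illustration of one of the changes above, the following sketch
captures the warning that C<sprintf> issues when a format has more
conversions than arguments. The variable names are made up for the
example, and the warning handler is there only to make the message
visible.

```perl

    use strict;
    use warnings;

    my @seen;
    local $SIG{__WARN__} = sub { push @seen, $_[0] };

    # Two conversions in the format, but only one argument supplied.
    my $first = 'one';
    my $text  = sprintf('%s and %s', $first);

    print "result: '$text'\n";
    print "warning: $_" for @seen;

```

The missing argument is treated as an empty string, so the result still
contains the literal text of the format.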

The following diagnostic messages have been removed:

=over 4

=item *

C<Runaway format>

=item *

C<Can't locate package %s for the parents of %s>

In general this warning was only produced in
conjunction with other warnings, and removing it allowed an ISA lookup
optimisation to be added.

=item *

C<v-string in use/require is non-portable>

=back

=head1 Utility Changes

=over 4

=item *

F<h2ph> now looks in C<include-fixed> too, which is a recent addition
to gcc's search path.

=item *

F<h2xs> no longer incorrectly treats enum values like macros.
It also now handles C++ style comments (C<//>) properly in enums.

=item *

F<perl5db.pl> now supports C<LVALUE> subroutines.  Additionally, the
debugger now correctly handles proxy constant subroutines, and
subroutine stubs.

=item *

F<perlbug> now uses C<%Module::CoreList::bug_tracker> to print out
upstream bug tracker URLs.  If a user identifies a particular module
as the topic of their bug report and we're able to divine the URL for
its upstream bug tracker, F<perlbug> now provides a message to the user
explaining that the core copies the CPAN version directly, and provides
the URL for reporting the bug directly to the upstream author.

F<perlbug> no longer reports "Message sent" when it hasn't actually sent
the message.

=item *

F<perlthanks> is a new utility for sending non-bug-reports to the
authors and maintainers of Perl. Getting nothing but bug reports can
become a bit demoralising. If Perl 5.12 works well for you, please try
out F<perlthanks>. It will make the developers smile.

=item *

Perl's developers have fixed bugs in F<a2p> having to do with the
C<match()> operator in list context.  Additionally, F<a2p> no longer
generates code that uses the C<$[> variable.

=back

=head1 Selected Bug Fixes

=over 4

=item *

U+0FFFF is now a legal character in regular expressions.

=item *

pp_qr now always returns a new regexp SV. Resolves RT #69852.

Instead of returning a(nother) reference to the (pre-compiled) regexp
in the optree, C<pp_qr> now uses C<reg_temp_copy()> to create a copy of it
and returns a reference to that. This resolves issues about
C<Regexp::DESTROY> not being called in a timely fashion (the original bug
tracked by RT #69852), as well as bugs related to blessing regexps, and of
assigning to regexps, as described in correspondence added to the ticket.

It transpires that we also need to undo the SvPVX() sharing when ithreads
cloning a Regexp SV, because mother_re is set to NULL, instead of a
cloned copy of the mother_re. This change might fix bugs with regexps
and threads in certain other situations, but as yet neither tests nor
bug reports have indicated any problems, so it might not actually be an
edge case that it's possible to reach.

=item *

Several compilation errors and segfaults when perl was built with C<-Dmad>
were fixed.

=item *

Lexer API changes in 5.11.2 that broke NYTProf's C<savesrc> option have
been fixed.

=item *

C<-t> now only returns TRUE for file handles connected to a TTY.

The Microsoft C version of C<isatty()> returns TRUE for all character mode
devices, including the F</dev/null>-style "nul" device and printers like
"lpt1".

=item *

Fixed a regression caused by commit fafafbaf which caused a panic during
parameter passing [perl #70171].

=item *

On systems which in-place edits without backup files, -i'*' now works as
the documentation says it does [perl #70802]

=item *

Saving and restoring magic flags no longer loses the readonly flag.

=item *

The malformed syntax C<grep EXPR LIST> (note the missing comma) no longer
causes abrupt and total failure.

=item *

Regular expressions compiled with C<qr{}> literals properly set C<$'> when
matching again.

=item *

Using named subroutines with C<sort> should no longer lead to bus errors
[perl #71076].

=item *

Numerous bugfixes catch small issues caused by the recently-added Lexer API.

=item *

Smart match against C<@_> sometimes gave false negatives. [perl #71078]

=item *

C<$@> may now be assigned a read-only value (without error or busting
the stack).

=item *

C<sort> called recursively from within an active comparison subroutine no
longer causes a bus error if run multiple times. [perl #71076]

=item *

C<Tie::Hash::NamedCapture::*> will not abort if passed bad input (RT #71828).

=item *

C<@_> and C<$_> no longer leak under threads (RT #34342 and #41138, also
#70602, #70974).

=item *

C<-I> on the shebang line now adds directories in front of C<@INC>
as documented, just as C<-I> does when specified on the command line.

=item *

C<kill> is now fatal when called on non-numeric process identifiers.
Previously, an C<undef> process identifier would be interpreted as a
request to kill process 0, which would terminate the current process
group on POSIX systems. Since process identifiers are always integers,
killing a non-numeric process is now fatal.

=item *

5.10.0 inadvertently disabled an optimisation, which caused a measurable
performance drop in list assignment, such as is often used to assign
function parameters from C<@_>. The optimisation has been re-instated, and
the performance regression fixed. (This fix is also present in 5.10.1)

=item *

Fixed memory leak on C<while (1) { map 1, 1 }> [RT #53038].

=item *

Some potential coredumps in PerlIO fixed [RT #57322,54828].

=item *

The debugger now works with lvalue subroutines.

=item *

The debugger's C<m> command was broken on modules that defined constants
[RT #61222].

=item *

C<crypt> and string complement could return tainted values for untainted
arguments [RT #59998].

=item *

The C<-i>I<.suffix> command-line switch now recreates the file using
restricted permissions, before changing its mode to match the original
file. This eliminates a potential race condition [RT #60904].

=item *

On some Unix systems, the value in C<$?> would not have the top bit set
(C<$? & 128>) even if the child core dumped.

=item *

Under some circumstances, C<$^R> could incorrectly become undefined
[RT #57042].

=item *

In the XS API, various hash functions, when passed a pre-computed hash where
the key is UTF-8, might result in an incorrect lookup.

=item *

XS code including F<XSUB.h> before F<perl.h> gave a compile-time error
[RT #57176].

=item *

C<< $object-E<gt>isa('Foo') >> would report false if the package C<Foo>
didn't exist, even if the object's C<@ISA> contained C<Foo>.

=item *

Various bugs in the mro code that was new in 5.10.0, triggered by
manipulating C<@ISA>, have been found and fixed.

=item *

Bitwise operations on references could crash the interpreter, e.g.
C<$x=\$y; $x |= "foo"> [RT #54956].

=item *

Patterns including alternation might be sensitive to the internal UTF-8
representation, e.g.

    my $byte = chr(192);
    my $utf8 = chr(192); utf8::upgrade($utf8);
    $utf8 =~ /$byte|X/i;    # failed in 5.10.0

=item *

Within UTF8-encoded Perl source files (i.e. where C<use utf8> is in
effect), double-quoted literal strings could be corrupted where a C<\xNN>,
C<\0NNN> or C<\N{}> is followed by a literal character with ordinal value
greater than 255 [RT #59908].

=item *

C<B::Deparse> failed to correctly deparse various constructs:
C<readpipe STRING> [RT #62428], C<CORE::require(STRING)> [RT #62488],
C<sub foo(_)> [RT #62484].

=item *

Using C<setpgrp> with no arguments could corrupt the perl stack.

=item *

The block form of C<eval> is now specifically trappable by C<Safe> and
C<ops>. Previously it was erroneously treated like string C<eval>.

=item *

In 5.10.0, the two characters C<[~> were sometimes parsed as the smart
match operator (C<~~>) [RT #63854].

=item *

In 5.10.0, the C<*> quantifier in patterns was sometimes treated as
C<{0,32767}> [RT #60034, #60464]. For example, this match would fail:

    ("ab" x 32768) =~ /^(ab)*$/

=item *

C<shmget> was limited to a 32 bit segment size on a 64 bit OS [RT #63924].

=item *

Using C<next> or C<last> to exit a C<given> block no longer produces a
spurious warning like the following:

    Exiting given via last at foo.pl line 123

=item *

Assigning a format to a glob could corrupt the format; e.g.:

     *bar=*foo{FORMAT}; # foo format now bad

=item *

Attempting to coerce a typeglob to a string or number could cause an
assertion failure. The correct error message is now generated,
C<Can't coerce GLOB to I<$type>>.

=item *

Under C<use filetest 'access'>, C<-x> was using the wrong access
mode. This has been fixed [RT #49003].

=item *

C<length> on a tied scalar that returned a Unicode value would not be
correct the first time. This has been fixed.

=item *

Using an array C<tie> inside an array C<tie> could SEGV. This has been
fixed. [RT #51636]

=item *

A race condition inside C<PerlIOStdio_close()> has been identified and
fixed. This used to cause various threading issues, including SEGVs.

=item *

In C<unpack>, the use of C<()> groups in scalar context was internally
placing a list on the interpreter's stack, which manifested in various
ways, including SEGVs. This is now fixed [RT #50256].

=item *

Magic was called twice in C<substr>, C<\&$x>, C<tie $x, $m> and C<chop>.
These have all been fixed.

=item *

A 5.10.0 optimisation to clear the temporary stack within the implicit
loop of C<s///ge> has been reverted, as it turned out to be the cause of
obscure bugs in seemingly unrelated parts of the interpreter [commit
ef0d4e17921ee3de].

=item *

The line numbers for warnings inside C<elsif> are now correct.

=item *

The C<..> operator now works correctly with ranges whose ends are at or
close to the values of the smallest and largest integers.

=item *

C<binmode STDIN, ':raw'> could lead to segmentation faults on some platforms.
This has been fixed [RT #54828].

=item *

An off-by-one error meant that C<index $str, ...> was effectively being
executed as C<index "$str\0", ...>. This has been fixed [RT #53746].

=item *

Various leaks associated with named captures in regexes have been fixed
[RT #57024].

=item *

A weak reference to a hash would leak. This was affecting C<DBI>
[RT #56908].

=item *

Using C<(?|)> in a regex could cause a segfault [RT #59734].

=item *

Use of a UTF-8 C<tr//> within a closure could cause a segfault [RT #61520].

=item *

Calling C<Perl_sv_chop()> or otherwise upgrading an SV could result in an
unaligned 64-bit access on the SPARC architecture [RT #60574].

=item *

In the 5.10.0 release, C<inc_version_list> would incorrectly list
C<5.10.*> after C<5.8.*>; this affected the C<@INC> search order
[RT #67628].

=item *

In 5.10.0, C<pack "a*", $tainted_value> returned a non-tainted value
[RT #52552].

=item *

In 5.10.0, C<printf> and C<sprintf> could produce the fatal error
C<panic: utf8_mg_pos_cache_update> when printing UTF-8 strings
[RT #62666].

=item *

In the 5.10.0 release, a dynamically created C<AUTOLOAD> method might be
missed (method cache issue) [RT #60220,60232].

=item *

In the 5.10.0 release, a combination of C<use feature> and C<//ee> could
cause a memory leak [RT #63110].

=item *

C<-C> on the shebang (C<#!>) line is once more permitted if it is also
specified on the command line. C<-C> on the shebang line used to be a
silent no-op I<if> it was not also on the command line, so perl 5.10.0
disallowed it, which broke some scripts. Now perl checks whether it is
also on the command line and only dies if it is not [RT #67880].

=item *

In 5.10.0, certain types of re-entrant regular expression could crash,
or cause the following assertion failure [RT #60508]:

    Assertion rx->sublen >= (s - rx->subbeg) + i failed

=item *

Perl now includes previously missing files from the Unicode Character
Database.

=item *

Perl now honors C<TMPDIR> when opening an anonymous temporary file.

=back
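
The C<kill> change listed above can be observed without signalling
anything. This is a minimal sketch: signal 0 only asks whether a process
could be signalled, and the string C<'not-a-pid'> is simply a made-up
non-numeric value.

```perl

    use strict;
    use warnings;

    # kill with a non-numeric process identifier now dies, so trap it.
    my $survived = eval { kill 0, 'not-a-pid'; 1 };
    my $error    = $@;

    if ($survived) {
        print "kill returned normally\n";
    }
    else {
        print "kill died: $error";
    }

```

Code that may receive an undefined or otherwise unchecked process
identifier should therefore validate it, or wrap the call in C<eval>, before
calling C<kill>.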


=head1 Platform Specific Changes

Perl is incredibly portable. In general, if a platform has a C compiler,
someone has ported Perl to it (or will soon).  We're happy to announce
that Perl 5.12 includes support for several new platforms.  At the same
time, it's time to bid farewell to some (very) old friends.

=head2 New Platforms

=over

=item Haiku

Perl's developers have merged patches from Haiku's maintainers. Perl
should now build on Haiku.

=item MirOS BSD

Perl should now build on MirOS BSD.

=back

=head2 Discontinued Platforms

=over

=item Domain/OS

=item MiNT

=item Tenon MachTen

=back

=head2 Updated Platforms

=over 4

=item AIX

=over 4

=item *

Removed F<libbsd> for AIX 5L and 6.1. Only C<flock()> was used from
F<libbsd>.

=item *

Removed F<libgdbm> for AIX 5L and 6.1 if F<libgdbm> < 1.8.3-5 is
installed.  The F<libgdbm> is delivered as an optional package with the
AIX Toolbox.  Unfortunately the versions below 1.8.3-5 are broken.

=item *

Hints changes mean that AIX 4.2 should work again.

=back

=item Cygwin

=over 4

=item *

Perl now supports IPv6 on Cygwin 1.7 and newer.

=item *

On Cygwin we now strip the last number from the DLL. This has been the
behaviour in the cygwin.com build for years. The hints files have been
updated.

=back

=item Darwin (Mac OS X)

=over 4

=item *

Skip testing the be_BY.CP1131 locale on Darwin 10 (Mac OS X 10.6),
as it's still buggy.

=item *

Correct infelicities in the regexp used to identify buggy locales
on Darwin 8 and 9 (Mac OS X 10.4 and 10.5, respectively).

=back

=item DragonFly BSD

=over 4

=item *

Fix thread library selection [perl #69686]

=back

=item FreeBSD

=over 4

=item *

The hints files now identify the correct threading libraries on FreeBSD 7
and later.

=back

=item Irix

=over 4

=item *

We now work around a bizarre preprocessor bug in the Irix 6.5 compiler:
C<cc -E -> unfortunately goes into K&R mode, but C<cc -E file.c> doesn't.

=back

=item NetBSD

=over 4

=item *

The hints file now supports NetBSD versions 5.*.

=back

=item OpenVMS

=over 4

=item *

C<-UDEBUGGING> is now the default on VMS.

This matches what has long been the default everywhere else. Command-line
selection of C<-UDEBUGGING> and C<-DDEBUGGING> now also works in
F<configure.com>; previously the only way to turn debugging off was by
answering no to the interactive question.

=item *

The default pipe buffer size on VMS has been updated to 8192 on 64-bit
systems.

=item *

Reads from the in-memory temporary files of C<PerlIO::scalar> used to fail
if C<$/> was set to a numeric reference (to indicate record-style reads).
This is now fixed.

=item *

VMS now supports C<getgrgid>.

=item *

Many improvements and cleanups have been made to the VMS file name handling
and conversion code.

=item *

Enabling the C<PERL_VMS_POSIX_EXIT> logical name now encodes a POSIX exit
status in a VMS condition value for better interaction with GNV's bash
shell and other utilities that depend on POSIX exit values. See
L<perlvms/"$?"> for details.

=item *

C<File::Copy> now detects Unix compatibility mode on VMS.

=back
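
The record-style reads mentioned above are not specific to VMS. The
following sketch shows the general behaviour of setting C<$/> to a numeric
reference while reading from an in-memory file. The data and variable
names are made up for the example.

```perl

    use strict;
    use warnings;

    my $data = 'abcdefgh';
    open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

    my @records;
    {
        local $/ = \3;        # read fixed-size records of 3 bytes
        @records = <$fh>;
    }
    close $fh;

    print join('|', @records), "\n";

```

This prints C<abc|def|gh>: every record has the requested size except the
last one, which holds whatever remains.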

=item Stratus VOS

=over 4

=item *

Various changes from Stratus have been merged in.

=back

=item Symbian

=over 4

=item *

There is now support for Symbian S60 3.2 SDK and S60 5.0 SDK.

=back

=item Windows

=over 4

=item *

Perl 5.12 supports Windows 2000 and later. The supporting code for
legacy versions of Windows is still included, but will be removed
during the next development cycle.

=item *

Initial support for building Perl with MinGW-w64 is now available.

=item *

F<perl.exe> now includes a manifest resource to specify the C<trustInfo>
settings for Windows Vista and later. Without this setting Windows
would treat F<perl.exe> as a legacy application and apply various
heuristics like redirecting access to protected file system areas
(like the "Program Files" folder) to the user's "VirtualStore"
instead of generating a proper "permission denied" error.

The manifest resource also requests the Microsoft Common-Controls
version 6.0 (themed controls introduced in Windows XP).  Check out the
Win32::VisualStyles module on CPAN to switch back to old style
unthemed controls for legacy applications.

=item *

The C<-t> filetest operator now only returns true if the filehandle
is connected to a console window.  In previous versions of Perl it
would return true for all character mode devices, including F<NUL>
and F<LPT1>.

=item *

The C<-p> filetest operator now works correctly, and the
Fcntl::S_IFIFO constant is defined when Perl is compiled with
Microsoft Visual C.  In previous Perl versions C<-p> always
returned a false value, and the Fcntl::S_IFIFO constant
was not defined.

This bug is specific to Microsoft Visual C and never affected
Perl binaries built with MinGW.

=item *

The socket error codes are now more widely supported:  The POSIX
module will define the symbolic names, like POSIX::EWOULDBLOCK,
and stringification of socket error codes in $! works as well
now:

  C:\>perl -MPOSIX -E "$!=POSIX::EWOULDBLOCK; say $!"
  A non-blocking socket operation could not be completed immediately.

=item *

flock() will now set sensible error codes in $!.  Previous Perl versions
copied the value of $^E into $!, which caused much confusion.

=item *

select() now supports all empty C<fd_set>s more correctly.

=item *

C<'.\foo'> and C<'..\foo'> were treated differently than
C<'./foo'> and C<'../foo'> by C<do> and C<require> [RT #63492].

=item *

Improved message window handling means that C<alarm> and C<kill> messages
will no longer be dropped under race conditions.

=item *

Various bits of Perl's build infrastructure are no longer converted to
win32 line endings at release time. If this hurts you, please report the
problem with the L<perlbug> program included with perl.

=back

=back


=head1 Known Problems

This is a list of some significant unfixed bugs, which are regressions
from either 5.10.x or 5.8.x.

=over 4

=item *

Some CPANPLUS tests may fail if there is a functioning file
F<../../cpanp-run-perl> outside your build directory. The failure
shouldn't imply there's a problem with the actual functional
software. The bug is already fixed in [RT #74188] and is scheduled for
inclusion in perl-v5.12.1.

=item *

C<List::Util::first> misbehaves in the presence of a lexical C<$_>
(typically introduced by C<my $_> or implicitly by C<given>). The variable
which gets set for each iteration is the package variable C<$_>, not the
lexical C<$_> [RT #67694].

A similar issue may occur in other modules that provide functions which
take a block as their first argument, like

    foo { ... $_ ...} list

=item *

Some regexes may run much more slowly when run in a child thread compared
with the thread the pattern was compiled into [RT #55600].

=item *

Things like C<"\N{LATIN SMALL LIGATURE FF}" =~ /\N{LATIN SMALL LETTER F}+/>
will appear to hang as they get into a very long running loop [RT #72998].

=item *

Several porters have reported mysterious crashes when Perl's entire
test suite is run after a build on certain Windows 2000 systems. When
run by hand, the individual tests reportedly work fine.

=back

=head1 Errata

=over

=item *

This one is actually a change introduced in 5.10.0, but it was missed
from that release's perldelta, so it is mentioned here instead.

A bugfix related to the handling of the C</m> modifier and C<qr> resulted
in a change of behaviour between 5.8.x and 5.10.0:

    # matches in 5.8.x, doesn't match in 5.10.0
    $re = qr/^bar/; "foo\nbar" =~ /$re/m;

=back
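
The C</m> change described above can be checked with a short script.
This is a minimal sketch, assuming Perl 5.10.0 or later; the variable
names are illustrative only.  Because the flags of a C<qr//> object are
fixed when it is compiled, one way to keep the old behaviour is to put
C</m> on the C<qr//> itself:

```perl
use strict;
use warnings;

my $text = "foo\nbar";

# /m on the outer match does not reach into the compiled qr// object.
my $re_plain = qr/^bar/;
my $outer_m  = ($text =~ /$re_plain/m) ? 1 : 0;

# Putting /m on the qr// itself lets ^ match after the embedded newline.
my $re_multi = qr/^bar/m;
my $inner_m  = ($text =~ /$re_multi/) ? 1 : 0;

print "outer /m: $outer_m, qr//m: $inner_m\n";
```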

=head1 Acknowledgements

Perl 5.12.0 represents approximately two years of development since
Perl 5.10.0 and contains over 750,000 lines of changes across over
3,000 files from over 200 authors and committers.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers.  The following people are known to
have contributed the improvements that became Perl 5.12.0:

Aaron Crane, Abe Timmerman, Abhijit Menon-Sen, Abigail, Adam Russell,
Adriano Ferreira, Ævar Arnfjörð Bjarmason, Alan Grover, Alexandr
Ciornii, Alex Davies, Alex Vandiver, Andreas Koenig, Andrew Rodland,
andrew@sundale.net, Andy Armstrong, Andy Dougherty, Jose AUGUSTE-ETIENNE,
Benjamin Smith, Ben Morrow, bharanee rathna, Bo Borgerson, Bo Lindbergh,
Brad Gilbert, Bram, Brendan O'Dea, brian d foy, Charles Bailey,
Chip Salzenberg, Chris 'BinGOs' Williams, Christoph Lamprecht, Chris
Williams, chromatic, Claes Jakobsson, Craig A. Berry, Dan Dascalescu,
Daniel Frederick Crisman, Daniel M. Quinlan, Dan Jacobson, Dan Kogai,
Dave Mitchell, Dave Rolsky, David Cantrell, David Dick, David Golden,
David Mitchell, David M. Syzdek, David Nicol, David Wheeler, Dennis
Kaarsemaker, Dintelmann, Peter, Dominic Dunlop, Dr.Ruud, Duke Leto,
Enrico Sorcinelli, Eric Brine, Father Chrysostomos, Florian Ragwitz,
Frank Wiegand, Gabor Szabo, Gene Sullivan, Geoffrey T. Dairiki, George
Greer, Gerard Goossen, Gisle Aas, Goro Fuji, Graham Barr, Green, Paul,
Hans Dieter Pearcey, Harmen, H. Merijn Brand, Hugo van der Sanden,
Ian Goodacre, Igor Sutton, Ingo Weinhold, James Bence, James Mastros,
Jan Dubois, Jari Aalto, Jarkko Hietaniemi, Jay Hannah, Jerry Hedden,
Jesse Vincent, Jim Cromie, Jody Belka, John E. Malmberg, John Malmberg,
John Peacock, John Peacock via RT, John P. Linderman, John Wright,
Josh ben Jore, Jos I. Boumans, Karl Williamson, Kenichi Ishigaki, Ken
Williams, Kevin Brintnall, Kevin Ryde, Kurt Starsinic, Leon Brocard,
Lubomir Rintel, Luke Ross, Marcel Grünauer, Marcus Holland-Moritz, Mark
Jason Dominus, Marko Asplund, Martin Hasch, Mashrab Kuvatov, Matt Kraai,
Matt S Trout, Max Maischein, Michael Breen, Michael Cartmell, Michael
G Schwern, Michael Witten, Mike Giroux, Milosz Tanski, Moritz Lenz,
Nicholas Clark, Nick Cleaton, Niko Tyni, Offer Kaye, Osvaldo Villalon,
Paul Fenwick, Paul Gaborit, Paul Green, Paul Johnson, Paul Marquess,
Philip Hazel, Philippe Bruhat, Rafael Garcia-Suarez, Rainer Tammer,
Rajesh Mandalemula, Reini Urban, Renée Bäcker, Ricardo Signes,
Ricardo SIGNES, Richard Foley, Rich Rauenzahn, Rick Delaney, Risto
Kankkunen, Robert May, Roberto C. Sanchez, Robin Barker, SADAHIRO
Tomoyuki, Salvador Ortiz Garcia, Sam Vilain, Scott Lanning, Sébastien
Aperghis-Tramoni, Sérgio Durigan Júnior, Shlomi Fish, Simon 'corecode'
Schubert, Sisyphus, Slaven Rezic, Smylers, Steffen Müller, Steffen
Ullrich, Stepan Kasal, Steve Hay, Steven Schubiger, Steve Peters, Tels,
The Doctor, Tim Bunce, Tim Jenness, Todd Rinaldo, Tom Christiansen,
Tom Hukins, Tom Wyant, Tony Cook, Torsten Schoenfeld, Tye McQueen,
Vadim Konovalov, Vincent Pit, Hio YAMASHINA, Yasuhiro Matsumoto,
Yitzchak Scott-Thoennes, Yuval Kogman, Yves Orton, Zefram, Zsban Ambrus

This is woefully incomplete as it's automatically generated from version
control history.  In particular, it doesn't include the names of the
(very much appreciated) contributors who reported issues in previous
versions of Perl that helped make Perl 5.12.0 better. For a more complete
list of all of Perl's historical contributors, please see the C<AUTHORS>
file in the Perl 5.12.0 distribution.

Our "retired" pumpkings Nicholas Clark and Rafael Garcia-Suarez
deserve special thanks for their brilliant and substantive ongoing
contributions. Nicholas personally authored over 30% of the patches
since 5.10.0. Rafael comes in second in patch authorship with 11%,
but is first by a long shot in committing patches authored by others,
pushing 44% of the commits since 5.10.0 in this category, often after
providing considerable coaching to the patch authors. These statistics
in no way comprise all of their contributions, but express in shorthand
that we couldn't have done it without them.

Many of the changes included in this version originated in the CPAN
modules included in Perl's core. We're grateful to the entire CPAN
community for helping Perl to flourish.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at L<http://rt.perl.org/perlbug/>. There may also be
information at L<http://www.perl.org/>, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release. Be sure to trim your bug down
to a tiny but sufficient test case. Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analyzed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes
all the core committers, who will be able
to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for
security issues in the Perl core, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

L<http://dev.perl.org/perl5/errata.html> for a list of issues
found after this release, as well as a list of CPAN modules known
to be incompatible with this release.

=cut
perl5144delta.pod000064400000014267150344123460007556 0ustar00=encoding utf8

=head1 NAME

perl5144delta - what is new for perl v5.14.4

=head1 DESCRIPTION

This document describes differences between the 5.14.3 release and
the 5.14.4 release.

If you are upgrading from an earlier release such as 5.12.0, first read
L<perl5140delta>, which describes differences between 5.12.0 and
5.14.0.

=head1 Core Enhancements

No changes since 5.14.0.

=head1 Security

This release contains one major, one medium, and a number of minor
security fixes.  The latter are included mainly to allow the test suite to
pass cleanly with the clang compiler's address sanitizer facility.

=head2 CVE-2013-1667: memory exhaustion with arbitrary hash keys

With a carefully crafted set of hash keys (for example arguments on a
URL), it is possible to cause a hash to consume a large amount of memory
and CPU, and thus possibly to achieve a Denial-of-Service.

This problem has been fixed.

=head2 memory leak in Encode

The UTF-8 encoding implementation in Encode.xs had a memory leak which has been
fixed.

=head2 [perl #111594] Socket::unpack_sockaddr_un heap-buffer-overflow

A read buffer overflow could occur when copying C<sockaddr> buffers.
Fairly harmless.

This problem has been fixed.

=head2 [perl #111586] SDBM_File: fix off-by-one access to global ".dir"

An extra byte was being copied for some string literals. Fairly harmless.

This problem has been fixed.

=head2 off-by-two error in List::Util

A string literal was being used that included two bytes beyond the
end of the string. Fairly harmless.

This problem has been fixed.

=head2 [perl #115994] fix segv in regcomp.c:S_join_exact()

Under debugging builds, while marking optimised-out regex nodes as type
C<OPTIMIZED>, it could treat blocks of exact text as if they were nodes,
and thus SEGV. Fairly harmless.

This problem has been fixed.

=head2 [perl #115992] PL_eval_start use-after-free

The statement C<local $[;>, when preceded by an C<eval>, and when not part
of an assignment, could crash. Fairly harmless.

This problem has been fixed.

=head2 wrap-around with IO on long strings

Reading or writing strings greater than 2**31 bytes in size could segfault
due to integer wraparound.

This problem has been fixed.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.14.0. If any
exist, they are bugs and reports are welcome.

=head1 Deprecations

There have been no deprecations since 5.14.0.

=head1 Modules and Pragmata

=head2 New Modules and Pragmata

None.

=head2 Updated Modules and Pragmata

The following modules have just the minor code fixes as listed above in
L</Security> (version numbers have not changed):

=over 4

=item Socket

=item SDBM_File

=item List::Util

=back

L<Encode> has been upgraded from version 2.42_01 to version 2.42_02.

L<Module::CoreList> has been updated to version 2.49_06 to add data for
this release.

=head2 Removed Modules and Pragmata

None.

=head1 Documentation

=head2 New Documentation

None.

=head2 Changes to Existing Documentation

None.

=head1 Diagnostics

No new or changed diagnostics.

=head1 Utility Changes

None.

=head1 Configuration and Compilation

No changes.

=head1 Platform Support

=head2 New Platforms

None.

=head2 Discontinued Platforms

None.

=head2 Platform-Specific Notes

=over 4

=item VMS

5.14.3 failed to compile on VMS due to incomplete application of a patch
series that allowed C<userelocatableinc> and C<usesitecustomize> to be
used simultaneously.  Other platforms were not affected and the problem
has now been corrected.

=back

=head1 Selected Bug Fixes

=over 4

=item *

In Perl 5.14.0, C<$tainted ~~ @array> stopped working properly.  Sometimes
it would erroneously fail (when C<$tainted> contained a string that occurs
in the array I<after> the first element) or erroneously succeed (when
C<undef> occurred after the first element) [perl #93590].

=back

=head1 Known Problems

None.

=head1 Acknowledgements

Perl 5.14.4 represents approximately 5 months of development since Perl 5.14.3
and contains approximately 1,700 lines of changes across 49 files from 12
authors.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers. The following people are known to have contributed the
improvements that became Perl 5.14.4:

Andy Dougherty, Chris 'BinGOs' Williams, Christian Hansen, Craig A. Berry,
Dave Rolsky, David Mitchell, Dominic Hargreaves, Father Chrysostomos,
Florian Ragwitz, Reini Urban, Ricardo Signes, Yves Orton.


The list above is almost certainly incomplete as it is automatically generated
from version control history. In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.


=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be able
to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for
security issues in the Perl core, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlrequick.pod000064400000044101150344123460007600 0ustar00=head1 NAME

perlrequick - Perl regular expressions quick start

=head1 DESCRIPTION

This page covers the very basics of understanding, creating and
using regular expressions ('regexes') in Perl.


=head1 The Guide

This page assumes you already know things, like what a "pattern" is, and
the basic syntax of using them.  If you don't, see L<perlretut>.

=head2 Simple word matching

The simplest regex is simply a word, or more generally, a string of
characters.  A regex consisting of a word matches any string that
contains that word:

    "Hello World" =~ /World/;  # matches

In this statement, C<World> is a regex and the C<//> enclosing
C</World/> tells Perl to search a string for a match.  The operator
C<=~> associates the string with the regex match and produces a true
value if the regex matched, or false if the regex did not match.  In
our case, C<World> matches the second word in C<"Hello World">, so the
expression is true.  This idea has several variations.

Expressions like this are useful in conditionals:

    print "It matches\n" if "Hello World" =~ /World/;

The sense of the match can be reversed by using the C<!~> operator:

    print "It doesn't match\n" if "Hello World" !~ /World/;

The literal string in the regex can be replaced by a variable:

    $greeting = "World";
    print "It matches\n" if "Hello World" =~ /$greeting/;

If you're matching against C<$_>, the C<$_ =~> part can be omitted:

    $_ = "Hello World";
    print "It matches\n" if /World/;

Finally, the C<//> default delimiters for a match can be changed to
arbitrary delimiters by putting an C<'m'> out front:

    "Hello World" =~ m!World!;   # matches, delimited by '!'
    "Hello World" =~ m{World};   # matches, note the matching '{}'
    "/usr/bin/perl" =~ m"/perl"; # matches after '/usr/bin',
                                 # '/' becomes an ordinary char

Regexes must match a part of the string I<exactly> in order for the
statement to be true:

    "Hello World" =~ /world/;  # doesn't match, case sensitive
    "Hello World" =~ /o W/;    # matches, ' ' is an ordinary char
    "Hello World" =~ /World /; # doesn't match, no ' ' at end

Perl will always match at the earliest possible point in the string:

    "Hello World" =~ /o/;       # matches 'o' in 'Hello'
    "That hat is red" =~ /hat/; # matches 'hat' in 'That'

Not all characters can be used 'as is' in a match.  Some characters,
called B<metacharacters>, are reserved for use in regex notation.
The metacharacters are

    {}[]()^$.|*+?\

A metacharacter can be matched by putting a backslash before it:

    "2+2=4" =~ /2+2/;    # doesn't match, + is a metacharacter
    "2+2=4" =~ /2\+2/;   # matches, \+ is treated like an ordinary +
    'C:\WIN32' =~ /C:\\WIN/;                       # matches
    "/usr/bin/perl" =~ /\/usr\/bin\/perl/;  # matches

In the last regex, the forward slash C<'/'> is also backslashed,
because it is used to delimit the regex.
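
When the text to be matched comes from a variable, backslashing each
metacharacter by hand is error-prone.  The following is a minimal
sketch, assuming the built-in C<quotemeta> function and the C<\Q...\E>
escape (neither is otherwise covered on this page); the variable names
are illustrative only:

```perl
use strict;
use warnings;

my $literal = '2+2';
my $string  = '2+2=4';

# Unescaped, the + is a quantifier, so the pattern means '2' one or
# more times followed by '2', which does not occur in '2+2=4'.
my $raw = ($string =~ /$literal/) ? 1 : 0;

# \Q...\E backslashes the metacharacters in the interpolated text.
my $quoted = ($string =~ /\Q$literal\E/) ? 1 : 0;

# quotemeta() performs the same escaping as a function call.
my $escaped = quotemeta $literal;

print "raw: $raw, quoted: $quoted, escaped: $escaped\n";
```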

Non-printable ASCII characters are represented by B<escape sequences>.
Common examples are C<\t> for a tab, C<\n> for a newline, and C<\r>
for a carriage return.  Arbitrary bytes are represented by octal
escape sequences, e.g., C<\033>, or hexadecimal escape sequences,
e.g., C<\x1B>:

    "1000\t2000" =~ m(0\t2)  # matches
    "cat" =~ /\143\x61\x74/  # matches in ASCII, but 
                             # a weird way to spell cat

Regexes are treated mostly as double-quoted strings, so variable
substitution works:

    $foo = 'house';
    'cathouse' =~ /cat$foo/;   # matches
    'housecat' =~ /${foo}cat/; # matches

With all of the regexes above, if the regex matched anywhere in the
string, it was considered a match.  To specify I<where> it should
match, we would use the B<anchor> metacharacters C<^> and C<$>.  The
anchor C<^> means match at the beginning of the string and the anchor
C<$> means match at the end of the string, or before a newline at the
end of the string.  Some examples:

    "housekeeper" =~ /keeper/;         # matches
    "housekeeper" =~ /^keeper/;        # doesn't match
    "housekeeper" =~ /keeper$/;        # matches
    "housekeeper\n" =~ /keeper$/;      # matches
    "housekeeper" =~ /^housekeeper$/;  # matches

=head2 Using character classes

A B<character class> allows a set of possible characters, rather than
just a single character, to match at a particular point in a regex.
Character classes are denoted by brackets C<[...]>, with the set of
characters to be possibly matched inside.  Here are some examples:

    /cat/;            # matches 'cat'
    /[bcr]at/;        # matches 'bat', 'cat', or 'rat'
    "abc" =~ /[cab]/; # matches 'a'

In the last statement, even though C<'c'> is the first character in
the class, the earliest point at which the regex can match is C<'a'>.

    /[yY][eE][sS]/; # match 'yes' in a case-insensitive way
                    # 'yes', 'Yes', 'YES', etc.
    /yes/i;         # also match 'yes' in a case-insensitive way

The last example shows a match with an C<'i'> B<modifier>, which makes
the match case-insensitive.

Character classes also have ordinary and special characters, but the
sets of ordinary and special characters inside a character class are
different than those outside a character class.  The special
characters for a character class are C<-]\^$> and are matched using an
escape:

   /[\]c]def/; # matches ']def' or 'cdef'
   $x = 'bcr';
   /[$x]at/;   # matches 'bat', 'cat', or 'rat'
   /[\$x]at/;  # matches '$at' or 'xat'
   /[\\$x]at/; # matches '\at', 'bat', 'cat', or 'rat'

The special character C<'-'> acts as a range operator within character
classes, so that the unwieldy C<[0123456789]> and C<[abc...xyz]>
become the svelte C<[0-9]> and C<[a-z]>:

    /item[0-9]/;  # matches 'item0' or ... or 'item9'
    /[0-9a-fA-F]/;  # matches a hexadecimal digit

If C<'-'> is the first or last character in a character class, it is
treated as an ordinary character.

The special character C<^> in the first position of a character class
denotes a B<negated character class>, which matches any character but
those in the brackets.  Both C<[...]> and C<[^...]> must match a
character, or the match fails.  Then

    /[^a]at/;  # doesn't match 'aat' or 'at', but matches
               # all other 'bat', 'cat', '0at', '%at', etc.
    /[^0-9]/;  # matches a non-numeric character
    /[a^]at/;  # matches 'aat' or '^at'; here '^' is ordinary

Perl has several abbreviations for common character classes. (These
definitions are those that Perl uses in ASCII-safe mode with the C</a> modifier.
Otherwise they could match many more non-ASCII Unicode characters as
well.  See L<perlrecharclass/Backslash sequences> for details.)

=over 4

=item *

\d is a digit and represents

    [0-9]

=item *

\s is a whitespace character and represents

    [\ \t\r\n\f]

=item *

\w is a word character (alphanumeric or _) and represents

    [0-9a-zA-Z_]

=item *

\D is a negated \d; it represents any character but a digit

    [^0-9]

=item *

\S is a negated \s; it represents any non-whitespace character

    [^\s]

=item *

\W is a negated \w; it represents any non-word character

    [^\w]

=item *

The period '.' matches any character but "\n"

=back

The C<\d\s\w\D\S\W> abbreviations can be used both inside and outside
of character classes.  Here are some in use:

    /\d\d:\d\d:\d\d/; # matches a hh:mm:ss time format
    /[\d\s]/;         # matches any digit or whitespace character
    /\w\W\w/;         # matches a word char, followed by a
                      # non-word char, followed by a word char
    /..rt/;           # matches any two chars, followed by 'rt'
    /end\./;          # matches 'end.'
    /end[.]/;         # same thing, matches 'end.'

The S<B<word anchor> > C<\b> matches a boundary between a word
character and a non-word character C<\w\W> or C<\W\w>:

    $x = "Housecat catenates house and cat";
    $x =~ /\bcat/;  # matches cat in 'catenates'
    $x =~ /cat\b/;  # matches cat in 'Housecat'
    $x =~ /\bcat\b/;  # matches 'cat' at end of string

In the last example, the end of the string is considered a word
boundary.

For natural language processing (so that, for example, apostrophes are
included in words), use C<\b{wb}> instead:

    "don't" =~ / .+? \b{wb} /x;  # matches the whole string

=head2 Matching this or that

We can match different character strings with the B<alternation>
metacharacter C<'|'>.  To match C<dog> or C<cat>, we form the regex
C<dog|cat>.  As before, Perl will try to match the regex at the
earliest possible point in the string.  At each character position,
Perl will first try to match the first alternative, C<dog>.  If
C<dog> doesn't match, Perl will then try the next alternative, C<cat>.
If C<cat> doesn't match either, then the match fails and Perl moves to
the next position in the string.  Some examples:

    "cats and dogs" =~ /cat|dog|bird/;  # matches "cat"
    "cats and dogs" =~ /dog|cat|bird/;  # matches "cat"

Even though C<dog> is the first alternative in the second regex,
C<cat> is able to match earlier in the string.

    "cats"          =~ /c|ca|cat|cats/; # matches "c"
    "cats"          =~ /cats|cat|ca|c/; # matches "cats"

At a given character position, the first alternative that allows the
regex match to succeed will be the one that matches. Here, all the
alternatives match at the first string position, so the first matches.
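
To see which alternative actually matched, the matched text can be
captured and inspected.  This is a minimal sketch; C<matched_text> is a
hypothetical helper written for this illustration, not a Perl built-in:

```perl
use strict;
use warnings;

# Return the text matched by the whole pattern, or undef on failure.
# (Hypothetical helper for illustration only.)
sub matched_text {
    my ($string, $regex) = @_;
    my ($text) = $string =~ /($regex)/;
    return $text;
}

# The earliest match position wins, regardless of alternative order.
my $first = matched_text('cats and dogs', qr/dog|cat|bird/);

# At one position, the first alternative that succeeds wins.
my $short = matched_text('cats', qr/c|ca|cat|cats/);
my $long  = matched_text('cats', qr/cats|cat|ca|c/);

print "$first $short $long\n";
```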

=head2 Grouping things and hierarchical matching

The B<grouping> metacharacters C<()> allow a part of a regex to be
treated as a single unit.  Parts of a regex are grouped by enclosing
them in parentheses.  The regex C<house(cat|keeper)> means match
C<house> followed by either C<cat> or C<keeper>.  Some more examples
are

    /(a|b)b/;    # matches 'ab' or 'bb'
    /(^a|b)c/;   # matches 'ac' at start of string or 'bc' anywhere

    /house(cat|)/;  # matches either 'housecat' or 'house'
    /house(cat(s|)|)/;  # matches either 'housecats' or 'housecat' or
                        # 'house'.  Note groups can be nested.

    "20" =~ /(19|20|)\d\d/;  # matches the null alternative '()\d\d',
                             # because '20\d\d' can't match

=head2 Extracting matches

The grouping metacharacters C<()> also allow the extraction of the
parts of a string that matched.  For each grouping, the part that
matched inside goes into the special variables C<$1>, C<$2>, etc.
They can be used just as ordinary variables:

    # extract hours, minutes, seconds
    $time =~ /(\d\d):(\d\d):(\d\d)/;  # match hh:mm:ss format
    $hours = $1;
    $minutes = $2;
    $seconds = $3;

In list context, a match C</regex/> with groupings will return the
list of matched values C<($1,$2,...)>.  So we could rewrite it as

    ($hours, $minutes, $seconds) = ($time =~ /(\d\d):(\d\d):(\d\d)/);
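
Because a failed match in list context returns an empty list, the
list form is also a convenient way to test for success before using
the extracted values.  A minimal sketch follows; C<parse_time> is a
hypothetical helper, and the added C<^> and C<$> anchors are an
assumption about how strict the check should be:

```perl
use strict;
use warnings;

# parse_time is a hypothetical helper, not part of Perl.
sub parse_time {
    my ($time) = @_;

    # A failed match returns the empty list, so the assignment is
    # false and no stale $1, $2, $3 values are ever consulted.
    if (my ($hours, $minutes, $seconds)
            = $time =~ /^(\d\d):(\d\d):(\d\d)$/) {
        return ($hours, $minutes, $seconds);
    }
    return;
}

my @ok  = parse_time('12:34:56');
my @bad = parse_time('noon');

print "ok: @ok; bad has ", scalar(@bad), " fields\n";
```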

If the groupings in a regex are nested, C<$1> gets the group with the
leftmost opening parenthesis, C<$2> the next opening parenthesis,
etc.  For example, here is a complex regex and the matching variables
indicated below it:

    /(ab(cd|ef)((gi)|j))/;
     1  2      34

Associated with the matching variables C<$1>, C<$2>, ... are
the B<backreferences> C<\g1>, C<\g2>, ...  Backreferences are
matching variables that can be used I<inside> a regex:

    /(\w\w\w)\s\g1/; # find sequences like 'the the' in string

C<$1>, C<$2>, ... should only be used outside of a regex, and C<\g1>,
C<\g2>, ... only inside a regex.
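
As a minimal sketch of a backreference in use, the following looks for
a doubled word.  It assumes Perl 5.10 or later for the C<\g1> syntax;
the sample text and variable names are illustrative only:

```perl
use strict;
use warnings;

my $line = 'Paris in the the spring';

# \g1 must match the same text that group 1 captured.
my ($doubled) = $line =~ /\b(\w+)\s+\g1\b/;

print "doubled word: $doubled\n";
```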

=head2 Matching repetitions

The B<quantifier> metacharacters C<?>, C<*>, C<+>, and C<{}> allow us
to determine the number of repeats of a portion of a regex we
consider to be a match.  Quantifiers are put immediately after the
character, character class, or grouping that we want to specify.  They
have the following meanings:

=over 4

=item *

C<a?> = match 'a' 1 or 0 times

=item *

C<a*> = match 'a' 0 or more times, i.e., any number of times

=item *

C<a+> = match 'a' 1 or more times, i.e., at least once

=item *

C<a{n,m}> = match at least C<n> times, but not more than C<m>
times.

=item *

C<a{n,}> = match at least C<n> times

=item *

C<a{n}> = match exactly C<n> times

=back

Here are some examples:

    /[a-z]+\s+\d*/;  # match a lowercase word, at least some space, and
                     # any number of digits
    /(\w+)\s+\g1/;    # match doubled words of arbitrary length
    $year =~ /^\d{2,4}$/;  # make sure year is at least 2 but not more
                           # than 4 digits
    $year =~ /^\d{4}$|^\d{2}$/; # better match; throw out 3 digit dates

These quantifiers will try to match as much of the string as possible,
while still allowing the regex to match.  So we have

    $x = 'the cat in the hat';
    $x =~ /^(.*)(at)(.*)$/; # matches,
                            # $1 = 'the cat in the h'
                            # $2 = 'at'
                            # $3 = ''   (0 matches)

The first quantifier C<.*> grabs as much of the string as possible
while still having the regex match. The second quantifier C<.*> has
no string left to it, so it matches 0 times.

=head2 More matching

There are a few more things you might want to know about matching
operators.
The global modifier C</g> allows the matching operator to match
within a string as many times as possible.  In scalar context,
successive matches against a string will have C</g> jump from match
to match, keeping track of position in the string as it goes along.
You can get or set the position with the C<pos()> function.
For example,

    $x = "cat dog house"; # 3 words
    while ($x =~ /(\w+)/g) {
        print "Word is $1, ends at position ", pos $x, "\n";
    }

prints

    Word is cat, ends at position 3
    Word is dog, ends at position 7
    Word is house, ends at position 13

A failed match or changing the target string resets the position.  If
you don't want the position reset after failure to match, add the
C</c>, as in C</regex/gc>.
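
The effect of C</c> on the recorded position can be sketched as
follows; the string and variable names are illustrative only:

```perl
use strict;
use warnings;

my $x = 'cat dog';

# A successful /g match in scalar context records where it ended.
my $hit         = $x =~ /cat/g;
my $after_match = pos $x;

# With /c, a failed match leaves the recorded position alone.
my $miss_gc  = $x =~ /bird/gc;
my $after_gc = pos $x;

# Without /c, a failed match resets the position to undef.
my $miss_g  = $x =~ /bird/g;
my $after_g = pos $x;

printf "after match: %d, after failed /gc: %d, after failed /g: %s\n",
    $after_match, $after_gc, defined $after_g ? $after_g : 'undef';
```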

In list context, C</g> returns a list of matched groupings, or if
there are no groupings, a list of matches to the whole regex.  So

    @words = ($x =~ /(\w+)/g);  # matches,
                                # $words[0] = 'cat'
                                # $words[1] = 'dog'
                                # $words[2] = 'house'
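
The list-context behaviour also gives a compact way to count matches.
This is a minimal sketch; the empty-list assignment is a common Perl
idiom rather than anything specific to regexes:

```perl
use strict;
use warnings;

my $x = 'cat dog house';

# List context: every match of the group is returned at once.
my @words = $x =~ /(\w+)/g;

# Assigning through an empty list forces list context, and the scalar
# on the left then receives the number of matches.
my $count = () = $x =~ /\w+/g;

print "$count words: @words\n";
```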

=head2 Search and replace

Search and replace is performed using C<s/regex/replacement/modifiers>.
The C<replacement> is a Perl double-quoted string that replaces in the
string whatever is matched with the C<regex>.  The operator C<=~> is
also used here to associate a string with C<s///>.  If matching
against C<$_>, the S<C<$_ =~>> can be dropped.  If there is a match,
C<s///> returns the number of substitutions made; otherwise it returns
false.  Here are a few examples:

    $x = "Time to feed the cat!";
    $x =~ s/cat/hacker/;   # $x contains "Time to feed the hacker!"
    $y = "'quoted words'";
    $y =~ s/^'(.*)'$/$1/;  # strip single quotes,
                           # $y contains "quoted words"

With the C<s///> operator, the matched variables C<$1>, C<$2>, etc.
are immediately available for use in the replacement expression. With
the global modifier, C<s///g> will search and replace all occurrences
of the regex in the string:

    $x = "I batted 4 for 4";
    $x =~ s/4/four/;   # $x contains "I batted four for 4"
    $x = "I batted 4 for 4";
    $x =~ s/4/four/g;  # $x contains "I batted four for four"

The non-destructive modifier C<s///r> causes the result of the substitution
to be returned instead of modifying C<$_> (or whatever variable the
substitute was bound to with C<=~>):

    $x = "I like dogs.";
    $y = $x =~ s/dogs/cats/r;
    print "$x $y\n"; # prints "I like dogs. I like cats."

    $x = "Cats are great.";
    print $x =~ s/Cats/Dogs/r =~ s/Dogs/Frogs/r =~
        s/Frogs/Hedgehogs/r, "\n";
    # prints "Hedgehogs are great."

    @foo = map { s/[a-z]/X/r } qw(a b c 1 2 3);
    # @foo is now qw(X X X 1 2 3)

The evaluation modifier C<s///e> wraps an C<eval{...}> around the
replacement string and the evaluated result is substituted for the
matched substring.  Some examples:

    # reverse all the words in a string
    $x = "the cat in the hat";
    $x =~ s/(\w+)/reverse $1/ge;   # $x contains "eht tac ni eht tah"

    # convert percentage to decimal
    $x = "A 39% hit rate";
    $x =~ s!(\d+)%!$1/100!e;       # $x contains "A 0.39 hit rate"

The last example shows that C<s///> can use other delimiters, such as
C<s!!!> and C<s{}{}>, and even C<s{}//>.  If single quotes are used
C<s'''>, then the regex and replacement are treated as single-quoted
strings.
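
Alternative delimiters are most useful when the pattern or the
replacement contains slashes.  A minimal sketch, with illustrative
variable names, that copies a string and then edits the copy:

```perl
use strict;
use warnings;

my $unix = 'usr/local/bin';

# Braces avoid having to backslash the '/' being searched for.
(my $dos = $unix) =~ s{/}{\\}g;

# Any other punctuation delimiter works the same way.
(my $pkg = $unix) =~ s!/!::!g;

print "$dos\n$pkg\n";
```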

=head2 The split operator

C<split /regex/, string> splits C<string> into a list of substrings
and returns that list.  The regex determines the character sequence
that C<string> is split with respect to.  For example, to split a
string into words, use

    $x = "Calvin and Hobbes";
    @word = split /\s+/, $x;  # $word[0] = 'Calvin'
                              # $word[1] = 'and'
                              # $word[2] = 'Hobbes'

To extract a comma-delimited list of numbers, use

    $x = "1.618,2.718,   3.142";
    @const = split /,\s*/, $x;  # $const[0] = '1.618'
                                # $const[1] = '2.718'
                                # $const[2] = '3.142'

If the empty regex C<//> is used, the string is split into individual
characters.  If the regex has groupings, then the list produced contains
the matched substrings from the groupings as well:

    $x = "/usr/bin";
    @parts = split m!(/)!, $x;  # $parts[0] = ''
                                # $parts[1] = '/'
                                # $parts[2] = 'usr'
                                # $parts[3] = '/'
                                # $parts[4] = 'bin'

Since the first character of $x matched the regex, C<split> prepended
an empty initial element to the list.
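
To make the last two points concrete, here is a brief sketch (the variable
names are illustrative only):

    my @chars  = split //, "abc";            # ('a', 'b', 'c')
    my @parts  = split m!(/)!, "/usr/bin";   # ('', '/', 'usr', '/', 'bin')
    my $joined = join '', @parts;            # '/usr/bin'

Because the grouping keeps the separators in the list, joining the pieces
with an empty string reproduces the original string.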

=head2 C<use re 'strict'>

New in v5.22, this applies stricter rules than otherwise when compiling
regular expression patterns.  It can find things that, while legal, may
not be what you intended.

See L<'strict' in re|re/'strict' mode>.

=head1 BUGS

None.

=head1 SEE ALSO

This is just a quick start guide.  For a more in-depth tutorial on
regexes, see L<perlretut> and for the reference page, see L<perlre>.

=head1 AUTHOR AND COPYRIGHT

Copyright (c) 2000 Mark Kvale
All rights reserved.

This document may be distributed under the same terms as Perl itself.

=head2 Acknowledgments

The author would like to thank Mark-Jason Dominus, Tom Christiansen,
Ilya Zakharevich, Brad Hughes, and Mike Giroux for all their helpful
comments.

=cut

-*- buffer-read-only: t -*-
!!!!!!!   DO NOT EDIT THIS FILE   !!!!!!!
This file is built by autodoc.pl extracting documentation from the C source
files.
Any changes made here will be lost!

=head1 NAME

perlapi - autogenerated documentation for the perl public API

=head1 DESCRIPTION
X<Perl API> X<API> X<api>

This file contains the documentation of the perl public API generated by
F<embed.pl>, specifically a listing of functions, macros, flags, and variables
that may be used by extension writers.  L<At the end|/Undocumented functions>
is a list of functions which have yet to be documented.  The interfaces of
those are subject to change without notice.  Anything not listed here is
not part of the public API, and should not be used by extension writers at
all.  For these reasons, blindly using functions listed in proto.h is to be
avoided when writing extensions.

In Perl, unlike C, a string of characters may generally contain embedded
C<NUL> characters.  Sometimes in the documentation a Perl string is referred
to as a "buffer" to distinguish it from a C string, but sometimes they are
both just referred to as strings.

Note that all Perl API global variables must be referenced with the C<PL_>
prefix.  Again, those not listed here are not to be used by extension writers,
and can be changed or removed without notice; same with macros.
Some macros are provided for compatibility with the older,
unadorned names, but this support may be disabled in a future release.

Perl was originally written to handle US-ASCII only (that is, characters
whose ordinal numbers are in the range 0 - 127).
Documentation and comments may still use the term ASCII when
sometimes in fact the entire range from 0 - 255 is meant.

The non-ASCII characters below 256 can have various meanings, depending on
various things.  (See, most notably, L<perllocale>.)  But usually the whole
range can be referred to as ISO-8859-1.  Often, the term "Latin-1" (or
"Latin1") is used as an equivalent for ISO-8859-1.  But some people treat
"Latin1" as referring just to the characters in the range 128 through 255, or
sometimes from 160 through 255.
This documentation uses "Latin1" and "Latin-1" to refer to all 256 characters.

Note that Perl can be compiled and run under either ASCII or EBCDIC (see
L<perlebcdic>).  Most of the documentation (and even comments in the code)
ignores the EBCDIC possibility.
For almost all purposes the differences are transparent.
As an example, under EBCDIC,
instead of UTF-8, UTF-EBCDIC is used to encode Unicode strings, and so
whenever this documentation refers to C<utf8>
(and variants of that name, including in function names),
it also (essentially transparently) means C<UTF-EBCDIC>.
But the ordinals of characters differ between ASCII, EBCDIC, and
the UTF- encodings, and a string encoded in UTF-EBCDIC may occupy a different
number of bytes than in UTF-8.

The listing below is alphabetical, case insensitive.


=head1 Array Manipulation Functions

=over 8

=item av_clear
X<av_clear>

Frees all the elements of an array, leaving it empty.
The XS equivalent of C<@array = ()>.  See also L</av_undef>.

Note that it is possible that the actions of a destructor called directly
or indirectly by freeing an element of the array could cause the reference
count of the array itself to be reduced (e.g. by deleting an entry in the
symbol table). So it is a possibility that the AV could have been freed
(or even reallocated) on return from the call unless you hold a reference
to it.

	void	av_clear(AV *av)

=for hackers
Found in file av.c

=item av_create_and_push
X<av_create_and_push>


NOTE: this function is experimental and may change or be
removed without notice.


Push an SV onto the end of the array, creating the array if necessary.
A small internal helper function to remove a commonly duplicated idiom.

	void	av_create_and_push(AV **const avp,
		                   SV *const val)

=for hackers
Found in file av.c

=item av_create_and_unshift_one
X<av_create_and_unshift_one>


NOTE: this function is experimental and may change or be
removed without notice.


Unshifts an SV onto the beginning of the array, creating the array if
necessary.
A small internal helper function to remove a commonly duplicated idiom.

	SV**	av_create_and_unshift_one(AV **const avp,
		                          SV *const val)

=for hackers
Found in file av.c

=item av_delete
X<av_delete>

Deletes the element indexed by C<key> from the array, makes the element
mortal, and returns it.  If C<flags> equals C<G_DISCARD>, the element is
freed and NULL is returned. NULL is also returned if C<key> is out of
range.
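
For orientation only, the C<splice> call that this entry gives as its Perl
equivalent behaves as sketched here.  This is plain Perl rather than XS, and
the variable names are illustrative:

    my @myarray = qw(a b c);
    my $key     = 1;
    my ($removed) = splice(@myarray, $key, 1, undef);
    # $removed is 'b'; @myarray is now ('a', undef, 'c')

In this sketch the array still has three elements; only the slot at C<$key>
has been emptied.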

Perl equivalent: S<C<splice(@myarray, $key, 1, undef)>> (with the
C<splice> in void context if C<G_DISCARD> is present).

	SV*	av_delete(AV *av, SSize_t key, I32 flags)

=for hackers
Found in file av.c

=item av_exists
X<av_exists>

Returns true if the element indexed by C<key> has been initialized.

This relies on the fact that uninitialized array elements are set to
C<NULL>.

Perl equivalent: C<exists($myarray[$key])>.
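
A minimal Perl-level sketch of that equivalent (plain Perl, not XS; the
variable names are illustrative):

    my @myarray;
    $myarray[3] = 'd';                          # slots 0..2 stay uninitialized
    my $has_3 = exists $myarray[3] ? 1 : 0;     # 1
    my $has_1 = exists $myarray[1] ? 1 : 0;     # 0
    my $count = scalar @myarray;                # 4

The array has four slots, but only the one that was assigned to counts as
existing.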

	bool	av_exists(AV *av, SSize_t key)

=for hackers
Found in file av.c

=item av_extend
X<av_extend>

Pre-extend an array.  The C<key> is the index to which the array should be
extended.

	void	av_extend(AV *av, SSize_t key)

=for hackers
Found in file av.c

=item av_fetch
X<av_fetch>

Returns the SV at the specified index in the array.  The C<key> is the
index.  If lval is true, you are guaranteed to get a real SV back (in case
it wasn't real before), which you can then modify.  Check that the return
value is non-null before dereferencing it to a C<SV*>.

See L<perlguts/"Understanding the Magic of Tied Hashes and Arrays"> for
more information on how to use this function on tied arrays. 

The rough perl equivalent is C<$myarray[$key]>.

	SV**	av_fetch(AV *av, SSize_t key, I32 lval)

=for hackers
Found in file av.c

=item AvFILL
X<AvFILL>

Same as C<av_top_index()>.  Deprecated, use C<av_top_index()> instead.

	int	AvFILL(AV* av)

=for hackers
Found in file av.h

=item av_fill
X<av_fill>

Set the highest index in the array to the given number, equivalent to
Perl's S<C<$#array = $fill;>>.

The number of elements in the array will be S<C<fill + 1>> after
C<av_fill()> returns.  If the array was previously shorter, then the
additional elements appended are set to NULL.  If the array
was longer, then the excess elements are freed.  S<C<av_fill(av, -1)>> is
the same as C<av_clear(av)>.
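
The Perl-level assignment that this mirrors can be sketched as follows
(plain Perl, not XS; the variable names are illustrative):

    my @array = (1, 2, 3);
    $#array = 4;                      # extend
    my $extended  = scalar @array;    # 5; $array[3] and $array[4] are undef
    $#array = 1;                      # truncate
    my $truncated = scalar @array;    # 2; only (1, 2) remain
    $#array = -1;                     # empty the array
    my $emptied   = scalar @array;    # 0

In each case the element count ends up one greater than the index that was
assigned.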

	void	av_fill(AV *av, SSize_t fill)

=for hackers
Found in file av.c

=item av_len
X<av_len>

Same as L</av_top_index>.  Note that, unlike what the name implies, it returns
the highest index in the array, so to get the size of the array you need to use
S<C<av_len(av) + 1>>.  This is unlike L</sv_len>, which returns what you would
expect.

	SSize_t	av_len(AV *av)

=for hackers
Found in file av.c

=item av_make
X<av_make>

Creates a new AV and populates it with a list of SVs.  The SVs are copied
into the array, so they may be freed after the call to C<av_make>.  The new AV
will have a reference count of 1.

Perl equivalent: C<my @new_array = ($scalar1, $scalar2, $scalar3...);>

	AV*	av_make(SSize_t size, SV **strp)

=for hackers
Found in file av.c

=item av_pop
X<av_pop>

Removes one SV from the end of the array, reducing its size by one and
returning the SV (transferring control of one reference count) to the
caller.  Returns C<&PL_sv_undef> if the array is empty.

Perl equivalent: C<pop(@myarray);>

	SV*	av_pop(AV *av)

=for hackers
Found in file av.c

=item av_push
X<av_push>

Pushes an SV (transferring control of one reference count) onto the end of the
array.  The array will grow automatically to accommodate the addition.

Perl equivalent: C<push @myarray, $val;>.

	void	av_push(AV *av, SV *val)

=for hackers
Found in file av.c

=item av_shift
X<av_shift>

Removes one SV from the start of the array, reducing its size by one and
returning the SV (transferring control of one reference count) to the
caller.  Returns C<&PL_sv_undef> if the array is empty.

Perl equivalent: C<shift(@myarray);>

	SV*	av_shift(AV *av)

=for hackers
Found in file av.c

=item av_store
X<av_store>

Stores an SV in an array.  The array index is specified as C<key>.  The
return value will be C<NULL> if the operation failed or if the value did not
need to be actually stored within the array (as in the case of tied
arrays).  Otherwise, it can be dereferenced
to get the C<SV*> that was stored
there (= C<val>).

Note that the caller is responsible for suitably incrementing the reference
count of C<val> before the call, and decrementing it if the function
returned C<NULL>.

Approximate Perl equivalent: C<splice(@myarray, $key, 1, $val)>.

See L<perlguts/"Understanding the Magic of Tied Hashes and Arrays"> for
more information on how to use this function on tied arrays.

	SV**	av_store(AV *av, SSize_t key, SV *val)

=for hackers
Found in file av.c

=item av_tindex
X<av_tindex>

Same as C<av_top_index()>.

	int	av_tindex(AV* av)

=for hackers
Found in file av.h

=item av_top_index
X<av_top_index>

Returns the highest index in the array.  The number of elements in the
array is S<C<av_top_index(av) + 1>>.  Returns -1 if the array is empty.

The Perl equivalent for this is C<$#myarray>.

(A slightly shorter form is C<av_tindex>.)
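
As a Perl-level illustration of the relationship between the top index and
the element count (plain Perl, not XS; the variable names are illustrative):

    my @myarray = qw(a b c);
    my $top     = $#myarray;        # 2
    my $count   = $top + 1;         # 3, the same as scalar(@myarray)
    my @empty;
    my $top_empty = $#empty;        # -1

The empty array is the case to watch for: its top index is -1, so adding one
still gives the correct count of zero.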

	SSize_t	av_top_index(AV *av)

=for hackers
Found in file av.c

=item av_undef
X<av_undef>

Undefines the array. The XS equivalent of C<undef(@array)>.

As well as freeing all the elements of the array (like C<av_clear()>), this
also frees the memory used by the av to store its list of scalars.

See L</av_clear> for a note about the array possibly being invalid on
return.

	void	av_undef(AV *av)

=for hackers
Found in file av.c

=item av_unshift
X<av_unshift>

Unshift the given number of C<undef> values onto the beginning of the
array.  The array will grow automatically to accommodate the addition.

Perl equivalent: S<C<unshift @myarray, ((undef) x $num);>>
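
Spelled out as a runnable Perl sketch (plain Perl, not XS; the values are
illustrative):

    my @myarray = (1, 2);
    my $num     = 3;
    unshift @myarray, ((undef) x $num);
    # @myarray is now (undef, undef, undef, 1, 2)

The existing elements move up by C<$num> positions, and the new leading slots
hold C<undef> until something is stored in them.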

	void	av_unshift(AV *av, SSize_t num)

=for hackers
Found in file av.c

=item get_av
X<get_av>

Returns the AV of the specified Perl global or package array with the given
name (so it won't work on lexical variables).  C<flags> are passed 
to C<gv_fetchpv>.  If C<GV_ADD> is set and the
Perl variable does not exist then it will be created.  If C<flags> is zero
and the variable does not exist then NULL is returned.

Perl equivalent: C<@{"$name"}>.
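
A short Perl sketch of that lookup by name, assuming the code runs in package
C<main> (the array and variable names are illustrative):

    use strict;
    our @list = (1, 2, 3);            # package array @main::list
    my $name  = "main::list";
    {
        no strict 'refs';             # symbolic references are needed here
        push @{"$name"}, 4;
    }
    # @list is now (1, 2, 3, 4)

    my @lexical = (1);
    my $seen = do { no strict 'refs'; scalar @{"main::lexical"} };
    # $seen is 0: the lexical @lexical cannot be reached by name

The second half shows why lookup by name does not work on lexical variables:
the name resolves to a separate, empty package array.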

NOTE: the perl_ form of this function is deprecated.

	AV*	get_av(const char *name, I32 flags)

=for hackers
Found in file perl.c

=item newAV
X<newAV>

Creates a new AV.  The reference count is set to 1.

Perl equivalent: C<my @array;>.

	AV*	newAV()

=for hackers
Found in file av.h

=item sortsv
X<sortsv>

In-place sort an array of SV pointers with the given comparison routine.

Currently this always uses mergesort.  See C<L</sortsv_flags>> for a more
flexible routine.

	void	sortsv(SV** array, size_t num_elts,
		       SVCOMPARE_t cmp)

=for hackers
Found in file pp_sort.c

=item sortsv_flags
X<sortsv_flags>

In-place sort an array of SV pointers with the given comparison routine,
with various SORTf_* flag options.

	void	sortsv_flags(SV** array, size_t num_elts,
		             SVCOMPARE_t cmp, U32 flags)

=for hackers
Found in file pp_sort.c


=back

=head1 Callback Functions

=over 8

=item call_argv
X<call_argv>

Performs a callback to the specified named and package-scoped Perl subroutine 
with C<argv> (a C<NULL>-terminated array of strings) as arguments.  See
L<perlcall>.

Approximate Perl equivalent: C<&{"$sub_name"}(@$argv)>.
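
That approximate equivalent can be tried directly in Perl.  The sketch below
assumes package C<main>, and C<greet> is a hypothetical subroutine used only
for illustration:

    use strict;
    sub greet { return "hello @_" }
    my $sub_name = "main::greet";
    my $argv     = [ "a", "b" ];
    my $result   = do {
        no strict 'refs';             # calling a sub through its name
        &{"$sub_name"}(@$argv);
    };
    # $result is 'hello a b'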

NOTE: the perl_ form of this function is deprecated.

	I32	call_argv(const char* sub_name, I32 flags,
		          char** argv)

=for hackers
Found in file perl.c

=item call_method
X<call_method>

Performs a callback to the specified Perl method.  The blessed object must
be on the stack.  See L<perlcall>.

NOTE: the perl_ form of this function is deprecated.

	I32	call_method(const char* methname, I32 flags)

=for hackers
Found in file perl.c

=item call_pv
X<call_pv>

Performs a callback to the specified Perl sub.  See L<perlcall>.

NOTE: the perl_ form of this function is deprecated.

	I32	call_pv(const char* sub_name, I32 flags)

=for hackers
Found in file perl.c

=item call_sv
X<call_sv>

Performs a callback to the Perl sub specified by the SV.

If neither the C<G_METHOD> nor C<G_METHOD_NAMED> flag is supplied, the
SV may be any of a CV, a GV, a reference to a CV, or a reference to a GV;
otherwise C<SvPV(sv)> will be used as the name of the sub to call.

If the C<G_METHOD> flag is supplied, the SV may be a reference to a CV;
otherwise C<SvPV(sv)> will be used as the name of the method to call.

If the C<G_METHOD_NAMED> flag is supplied, C<SvPV(sv)> will be used as
the name of the method to call.

Some other values are treated specially for internal use and should
not be depended on.

See L<perlcall>.

NOTE: the perl_ form of this function is deprecated.

	I32	call_sv(SV* sv, VOL I32 flags)

=for hackers
Found in file perl.c

=item ENTER
X<ENTER>

Opening bracket on a callback.  See C<L</LEAVE>> and L<perlcall>.

		ENTER;

=for hackers
Found in file scope.h

=item ENTER_with_name(name)
X<ENTER_with_name(name)>

Same as C<L</ENTER>>, but when debugging is enabled it also associates the
given literal string with the new scope.

		ENTER_with_name(name);

=for hackers
Found in file scope.h

=item eval_pv
X<eval_pv>

Tells Perl to C<eval> the given string in scalar context and return an SV* result.

NOTE: the perl_ form of this function is deprecated.

	SV*	eval_pv(const char* p, I32 croak_on_error)

=for hackers
Found in file perl.c

=item eval_sv
X<eval_sv>

Tells Perl to C<eval> the string in the SV.  It supports the same flags
as C<call_sv>, with the obvious exception of C<G_EVAL>.  See L<perlcall>.

NOTE: the perl_ form of this function is deprecated.

	I32	eval_sv(SV* sv, I32 flags)

=for hackers
Found in file perl.c

=item FREETMPS
X<FREETMPS>

Closing bracket for temporaries on a callback.  See C<L</SAVETMPS>> and
L<perlcall>.

		FREETMPS;

=for hackers
Found in file scope.h

=item LEAVE
X<LEAVE>

Closing bracket on a callback.  See C<L</ENTER>> and L<perlcall>.

		LEAVE;

=for hackers
Found in file scope.h

=item LEAVE_with_name(name)
X<LEAVE_with_name(name)>

Same as C<L</LEAVE>>, but when debugging is enabled it first checks that the
scope has the given name. C<name> must be a C<NUL>-terminated literal string.

		LEAVE_with_name(name);

=for hackers
Found in file scope.h

=item SAVETMPS
X<SAVETMPS>

Opening bracket for temporaries on a callback.  See C<L</FREETMPS>> and
L<perlcall>.

		SAVETMPS;

=for hackers
Found in file scope.h


=back

=head1 Character case changing

Perl uses "full" Unicode case mappings.  This means that converting a single
character to another case may result in a sequence of more than one character.
For example, the uppercase of C<E<223>> (LATIN SMALL LETTER SHARP S) is the two
character sequence C<SS>.  This presents some complications.  The lowercase of
all characters in the range 0..255 is a single character, and thus
C<L</toLOWER_L1>> is furnished.  But, C<toUPPER_L1> can't exist, as it couldn't
return a valid result for all legal inputs.  Instead C<L</toUPPER_uvchr>> has
an API that does allow every possible legal result to be returned.  Likewise
no other function that is crippled by not being able to give the correct
results for the full range of possible inputs has been implemented here.
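
The same full case mappings are visible from Perl code.  The following sketch
is plain Perl rather than the C API, and assumes an ASCII platform, where
code point 0xDF is LATIN SMALL LETTER SHARP S:

    use feature 'unicode_strings';
    my $upper = uc chr(0xDF);      # 'SS': one character becomes two
    my $len   = length $upper;     # 2
    my $lower = lc chr(0xC0);      # chr(0xE0): still a single character

This is the asymmetry described above: lowercasing within the range 0..255
always fits in one character, while uppercasing may not.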


=over 8

=item toFOLD
X<toFOLD>

Converts the specified character to foldcase.  If the input is anything but an
ASCII uppercase character, that input character itself is returned.  Variant
C<toFOLD_A> is equivalent.  (There is no equivalent C<toFOLD_L1> for the full
Latin1 range, as the full generality of L</toFOLD_uvchr> is needed there.)

	U8	toFOLD(U8 ch)

=for hackers
Found in file handy.h

=item toFOLD_utf8
X<toFOLD_utf8>

This is like C<L</toFOLD_utf8_safe>>, but doesn't have the C<e>
parameter.  The function therefore can't check if it is reading
beyond the end of the string.  Starting in Perl v5.30, it will take the C<e>
parameter, becoming a synonym for C<toFOLD_utf8_safe>.  At that time every
program that uses it will have to be changed to successfully compile.  In the
meantime, the first runtime call to C<toFOLD_utf8> from each call point in the
program will raise a deprecation warning, enabled by default.  You can convert
your program now to use C<toFOLD_utf8_safe>, and avoid the warnings, and get an
extra measure of protection, or you can wait until v5.30, when you'll be forced
to add the C<e> parameter.

	UV	toFOLD_utf8(U8* p, U8* s, STRLEN* lenp)

=for hackers
Found in file handy.h

=item toFOLD_utf8_safe
X<toFOLD_utf8_safe>

Converts the first UTF-8 encoded character in the sequence starting at C<p> and
extending no further than S<C<e - 1>> to its foldcase version, and
stores that in UTF-8 in C<s>, and its length in bytes in C<lenp>.  Note
that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
bytes since the foldcase version may be longer than the original character.

The first code point of the foldcased version is returned
(but note, as explained at L<the top of this section|/Character case
changing>, that there may be more).

The suffix C<_safe> in the function's name indicates that it will not attempt
to read beyond S<C<e - 1>>, provided that the constraint S<C<p E<lt> e>> is
true (this is asserted for in C<-DDEBUGGING> builds).  If the UTF-8 for the
input character is malformed in some way, the program may croak, or the
function may return the REPLACEMENT CHARACTER, at the discretion of the
implementation, and subject to change in future releases.

	UV	toFOLD_utf8_safe(U8* p, U8* e, U8* s,
		                 STRLEN* lenp)

=for hackers
Found in file handy.h

=item toFOLD_uvchr
X<toFOLD_uvchr>

Converts the code point C<cp> to its foldcase version, and
stores that in UTF-8 in C<s>, and its length in bytes in C<lenp>.  The code
point is interpreted as native if less than 256; otherwise as Unicode.  Note
that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
bytes since the foldcase version may be longer than the original character.

The first code point of the foldcased version is returned
(but note, as explained at L<the top of this section|/Character case
changing>, that there may be more).

	UV	toFOLD_uvchr(UV cp, U8* s, STRLEN* lenp)

=for hackers
Found in file handy.h

=item toLOWER
X<toLOWER>

Converts the specified character to lowercase.  If the input is anything but an
ASCII uppercase character, that input character itself is returned.  Variant
C<toLOWER_A> is equivalent.

	U8	toLOWER(U8 ch)

=for hackers
Found in file handy.h

=item toLOWER_L1
X<toLOWER_L1>

Converts the specified Latin1 character to lowercase.  The results are
undefined if the input doesn't fit in a byte.

	U8	toLOWER_L1(U8 ch)

=for hackers
Found in file handy.h

=item toLOWER_LC
X<toLOWER_LC>

Converts the specified character to lowercase using the current locale's rules,
if possible; otherwise returns the input character itself.

	U8	toLOWER_LC(U8 ch)

=for hackers
Found in file handy.h

=item toLOWER_utf8
X<toLOWER_utf8>

This is like C<L</toLOWER_utf8_safe>>, but doesn't have the C<e>
parameter.  The function therefore can't check if it is reading
beyond the end of the string.  Starting in Perl v5.30, it will take the C<e>
parameter, becoming a synonym for C<toLOWER_utf8_safe>.  At that time every
program that uses it will have to be changed to successfully compile.  In the
meantime, the first runtime call to C<toLOWER_utf8> from each call point in the
program will raise a deprecation warning, enabled by default.  You can convert
your program now to use C<toLOWER_utf8_safe>, and avoid the warnings, and get an
extra measure of protection, or you can wait until v5.30, when you'll be forced
to add the C<e> parameter.

	UV	toLOWER_utf8(U8* p, U8* s, STRLEN* lenp)

=for hackers
Found in file handy.h

=item toLOWER_utf8_safe
X<toLOWER_utf8_safe>

Converts the first UTF-8 encoded character in the sequence starting at C<p> and
extending no further than S<C<e - 1>> to its lowercase version, and
stores that in UTF-8 in C<s>, and its length in bytes in C<lenp>.  Note
that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
bytes since the lowercase version may be longer than the original character.

The first code point of the lowercased version is returned
(but note, as explained at L<the top of this section|/Character case
changing>, that there may be more).

The suffix C<_safe> in the function's name indicates that it will not attempt
to read beyond S<C<e - 1>>, provided that the constraint S<C<p E<lt> e>> is
true (this is asserted for in C<-DDEBUGGING> builds).  If the UTF-8 for the
input character is malformed in some way, the program may croak, or the
function may return the REPLACEMENT CHARACTER, at the discretion of the
implementation, and subject to change in future releases.

	UV	toLOWER_utf8_safe(U8* p, U8* e, U8* s,
		                  STRLEN* lenp)

=for hackers
Found in file handy.h

=item toLOWER_uvchr
X<toLOWER_uvchr>

Converts the code point C<cp> to its lowercase version, and
stores that in UTF-8 in C<s>, and its length in bytes in C<lenp>.  The code
point is interpreted as native if less than 256; otherwise as Unicode.  Note
that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
bytes since the lowercase version may be longer than the original character.

The first code point of the lowercased version is returned
(but note, as explained at L<the top of this section|/Character case
changing>, that there may be more).


	UV	toLOWER_uvchr(UV cp, U8* s, STRLEN* lenp)

=for hackers
Found in file handy.h

=item toTITLE
X<toTITLE>

Converts the specified character to titlecase.  If the input is anything but an
ASCII lowercase character, that input character itself is returned.  Variant
C<toTITLE_A> is equivalent.  (There is no C<toTITLE_L1> for the full Latin1
range, as the full generality of L</toTITLE_uvchr> is needed there.  Titlecase is
not a concept used in locale handling, so there is no functionality for that.)

	U8	toTITLE(U8 ch)

=for hackers
Found in file handy.h

=item toTITLE_utf8
X<toTITLE_utf8>

This is like C<L</toTITLE_utf8_safe>>, but doesn't have the C<e>
parameter.  The function therefore can't check if it is reading
beyond the end of the string.  Starting in Perl v5.30, it will take the C<e>
parameter, becoming a synonym for C<toTITLE_utf8_safe>.  At that time every
program that uses it will have to be changed to successfully compile.  In the
meantime, the first runtime call to C<toTITLE_utf8> from each call point in the
program will raise a deprecation warning, enabled by default.  You can convert
your program now to use C<toTITLE_utf8_safe>, and avoid the warnings, and get an
extra measure of protection, or you can wait until v5.30, when you'll be forced
to add the C<e> parameter.

	UV	toTITLE_utf8(U8* p, U8* s, STRLEN* lenp)

=for hackers
Found in file handy.h

=item toTITLE_utf8_safe
X<toTITLE_utf8_safe>

Converts the first UTF-8 encoded character in the sequence starting at C<p> and
extending no further than S<C<e - 1>> to its titlecase version, and
stores that in UTF-8 in C<s>, and its length in bytes in C<lenp>.  Note
that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
bytes since the titlecase version may be longer than the original character.

The first code point of the titlecased version is returned
(but note, as explained at L<the top of this section|/Character case
changing>, that there may be more).

The suffix C<_safe> in the function's name indicates that it will not attempt
to read beyond S<C<e - 1>>, provided that the constraint S<C<p E<lt> e>> is
true (this is asserted for in C<-DDEBUGGING> builds).  If the UTF-8 for the
input character is malformed in some way, the program may croak, or the
function may return the REPLACEMENT CHARACTER, at the discretion of the
implementation, and subject to change in future releases.

	UV	toTITLE_utf8_safe(U8* p, U8* e, U8* s,
		                  STRLEN* lenp)

=for hackers
Found in file handy.h

=item toTITLE_uvchr
X<toTITLE_uvchr>

Converts the code point C<cp> to its titlecase version, and
stores that in UTF-8 in C<s>, and its length in bytes in C<lenp>.  The code
point is interpreted as native if less than 256; otherwise as Unicode.  Note
that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
bytes since the titlecase version may be longer than the original character.

The first code point of the titlecased version is returned
(but note, as explained at L<the top of this section|/Character case
changing>, that there may be more).

	UV	toTITLE_uvchr(UV cp, U8* s, STRLEN* lenp)

=for hackers
Found in file handy.h

=item toUPPER
X<toUPPER>

Converts the specified character to uppercase.  If the input is anything but an
ASCII lowercase character, that input character itself is returned.  Variant
C<toUPPER_A> is equivalent.

	U8	toUPPER(U8 ch)

=for hackers
Found in file handy.h

=item toUPPER_utf8
X<toUPPER_utf8>

This is like C<L</toUPPER_utf8_safe>>, but doesn't have the C<e>
parameter.  The function therefore can't check if it is reading
beyond the end of the string.  Starting in Perl v5.30, it will take the C<e>
parameter, becoming a synonym for C<toUPPER_utf8_safe>.  At that time every
program that uses it will have to be changed to successfully compile.  In the
meantime, the first runtime call to C<toUPPER_utf8> from each call point in the
program will raise a deprecation warning, enabled by default.  You can convert
your program now to use C<toUPPER_utf8_safe>, and avoid the warnings, and get an
extra measure of protection, or you can wait until v5.30, when you'll be forced
to add the C<e> parameter.

	UV	toUPPER_utf8(U8* p, U8* s, STRLEN* lenp)

=for hackers
Found in file handy.h

=item toUPPER_utf8_safe
X<toUPPER_utf8_safe>

Converts the first UTF-8 encoded character in the sequence starting at C<p> and
extending no further than S<C<e - 1>> to its uppercase version, and
stores that in UTF-8 in C<s>, and its length in bytes in C<lenp>.  Note
that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
bytes since the uppercase version may be longer than the original character.

The first code point of the uppercased version is returned
(but note, as explained at L<the top of this section|/Character case
changing>, that there may be more).

The suffix C<_safe> in the function's name indicates that it will not attempt
to read beyond S<C<e - 1>>, provided that the constraint S<C<p E<lt> e>> is
true (this is asserted for in C<-DDEBUGGING> builds).  If the UTF-8 for the
input character is malformed in some way, the program may croak, or the
function may return the REPLACEMENT CHARACTER, at the discretion of the
implementation, and subject to change in future releases.

	UV	toUPPER_utf8_safe(U8* p, U8* e, U8* s,
		                  STRLEN* lenp)

=for hackers
Found in file handy.h

=item toUPPER_uvchr
X<toUPPER_uvchr>

Converts the code point C<cp> to its uppercase version, and
stores that in UTF-8 in C<s>, and its length in bytes in C<lenp>.  The code
point is interpreted as native if less than 256; otherwise as Unicode.  Note
that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
bytes since the uppercase version may be longer than the original character.

The first code point of the uppercased version is returned
(but note, as explained at L<the top of this section|/Character case
changing>, that there may be more).

	UV	toUPPER_uvchr(UV cp, U8* s, STRLEN* lenp)

=for hackers
Found in file handy.h


=back

=head1 Character classification

This section is about functions (really macros) that classify characters
into types, such as punctuation versus alphabetic, etc.  Most of these are
analogous to regular expression character classes.  (See
L<perlrecharclass/POSIX Character Classes>.)  There are several variants for
each class.  (Not all macros have all variants; each item below lists the
ones valid for it.)  None are affected by C<use bytes>, and only the ones
with C<LC> in the name are affected by the current locale.

The base function, e.g., C<isALPHA()>, takes an octet (either a C<char> or a
C<U8>) as input and returns a boolean as to whether or not the character
represented by that octet is (or on non-ASCII platforms, corresponds to) an
ASCII character in the named class based on platform, Unicode, and Perl rules.
If the input is a number that doesn't fit in an octet, FALSE is returned.

Variant C<isI<FOO>_A> (e.g., C<isALPHA_A()>) is identical to the base function
with no suffix C<"_A">.  This variant is used to emphasize by its name that
only ASCII-range characters can return TRUE.

Variant C<isI<FOO>_L1> imposes the Latin-1 (or EBCDIC equivalent) character set
onto the platform.  That is, the code points that are ASCII are unaffected,
since ASCII is a subset of Latin-1.  But the non-ASCII code points are treated
as if they are Latin-1 characters.  For example, C<isWORDCHAR_L1()> will return
true when called with the code point 0xDF, which is a word character on both
ASCII and EBCDIC platforms (though it represents different characters in each).

Variant C<isI<FOO>_uvchr> is like the C<isI<FOO>_L1> variant, but accepts any UV code
point as input.  If the code point is larger than 255, Unicode rules are used
to determine if it is in the character class.  For example,
C<isWORDCHAR_uvchr(0x100)> returns TRUE, since 0x100 is LATIN CAPITAL LETTER A
WITH MACRON in Unicode, and is a word character.
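
Perl's regular expression modifiers offer a rough analogy to these variants.
The sketch below is plain Perl, not the C macros themselves, and assumes an
ASCII platform; C</a> stands in for the C<_A> behavior and C</u> for the
C<_L1> and C<_uvchr> behavior:

    my $latin1 = chr(0xDF);    # LATIN SMALL LETTER SHARP S
    my $wide   = chr(0x100);   # LATIN CAPITAL LETTER A WITH MACRON

    my $a_latin1 = $latin1 =~ /\w/a ? 1 : 0;   # 0: ASCII-only rules
    my $u_latin1 = $latin1 =~ /\w/u ? 1 : 0;   # 1: Latin-1 range is covered
    my $u_wide   = $wide   =~ /\w/u ? 1 : 0;   # 1: Unicode rules above 255
    my $a_wide   = $wide   =~ /\w/a ? 1 : 0;   # 0: not an ASCII word character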

Variant C<isI<FOO>_utf8_safe> is like C<isI<FOO>_uvchr>, but is used for UTF-8
encoded strings.  Each call classifies one character, even if the string
contains many.  This variant takes two parameters.  The first, C<p>, is a
pointer to the first byte of the character to be classified.  (Recall that it
may take more than one byte to represent a character in UTF-8 strings.)  The
second parameter, C<e>, points to anywhere in the string beyond the first
character, up to one byte past the end of the entire string.  The suffix
C<_safe> in the function's name indicates that it will not attempt to read
beyond S<C<e - 1>>, provided that the constraint S<C<p E<lt> e>> is true (this
is asserted for in C<-DDEBUGGING> builds).  If the UTF-8 for the input
character is malformed in some way, the program may croak, or the function may
return FALSE, at the discretion of the implementation, and subject to change in
future releases.

Variant C<isI<FOO>_utf8> is like C<isI<FOO>_utf8_safe>, but takes just a single
parameter, C<p>, which has the same meaning as the corresponding parameter does
in C<isI<FOO>_utf8_safe>.  The function therefore can't check if it is reading
beyond the end of the string.  Starting in Perl v5.30, it will take a second
parameter, becoming a synonym for C<isI<FOO>_utf8_safe>.  At that time every
program that uses it will have to be changed to successfully compile.  In the
meantime, the first runtime call to C<isI<FOO>_utf8> from each call point in the
program will raise a deprecation warning, enabled by default.  You can convert
your program now to use C<isI<FOO>_utf8_safe>, and avoid the warnings, and get an
extra measure of protection, or you can wait until v5.30, when you'll be forced
to add the C<e> parameter.

Variant C<isI<FOO>_LC> is like the C<isI<FOO>_A> and C<isI<FOO>_L1> variants, but the
result is based on the current locale, which is what C<LC> in the name stands
for.  If Perl can determine that the current locale is a UTF-8 locale, it uses
the published Unicode rules; otherwise, it uses the C library function that
gives the named classification.  For example, C<isDIGIT_LC()> when not in a
UTF-8 locale returns the result of calling C<isdigit()>.  FALSE is always
returned if the input won't fit into an octet.  On some platforms where the C
library function is known to be defective, Perl changes its result to follow
the POSIX standard's rules.
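
The non-UTF-8-locale branch described above can be sketched in standalone C.
The helper name C<toy_classify_lc> is hypothetical and is not part of the Perl
API; the sketch leaves out both the UTF-8 locale case and any platform-specific
corrections, and simply defers to whichever C<ctype.h> function the caller
passes in.

```c

    #include <assert.h>
    #include <ctype.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Classify c under the current locale by calling a <ctype.h> function
     * such as isdigit().  Hypothetical helper; not part of the Perl API. */
    bool toy_classify_lc(unsigned long c, int (*libc_fn)(int))
    {
        assert(libc_fn != NULL);

        if (c > 255)
            return false;       /* input won't fit into an octet */

        return libc_fn((int)c) != 0;
    }

```

Under these assumptions, C<toy_classify_lc(c, isdigit)> plays the role that
C<isDIGIT_LC(c)> plays when the locale is not a UTF-8 one.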

Variant C<isI<FOO>_LC_uvchr> is like C<isI<FOO>_LC>, but is defined on any UV.  It
returns the same as C<isI<FOO>_LC> for input code points less than 256, and
returns the hard-coded, not-affected-by-locale, Unicode results for larger ones.

Variant C<isI<FOO>_LC_utf8_safe> is like C<isI<FOO>_LC_uvchr>, but is used for UTF-8
encoded strings.  Each call classifies one character, even if the string
contains many.  This variant takes two parameters.  The first, C<p>, is a
pointer to the first byte of the character to be classified.  (Recall that it
may take more than one byte to represent a character in UTF-8 strings.) The
second parameter, C<e>, points to anywhere in the string beyond the first
character, up to one byte past the end of the entire string.  The suffix
C<_safe> in the function's name indicates that it will not attempt to read
beyond S<C<e - 1>>, provided that the constraint S<C<p E<lt> e>> is true (this
is asserted in C<-DDEBUGGING> builds).  If the UTF-8 for the input
character is malformed in some way, the program may croak, or the function may
return FALSE, at the discretion of the implementation, and subject to change in
future releases.

Variant C<isI<FOO>_LC_utf8> is like C<isI<FOO>_LC_utf8_safe>, but takes just a single
parameter, C<p>, which has the same meaning as the corresponding parameter does
in C<isI<FOO>_LC_utf8_safe>.  The function therefore can't check if it is reading
beyond the end of the string.  Starting in Perl v5.30, it will take a second
parameter, becoming a synonym for C<isI<FOO>_LC_utf8_safe>.  At that time every
program that uses it will have to be changed to successfully compile.  In the
meantime, the first runtime call to C<isI<FOO>_LC_utf8> from each call point in
the program will raise a deprecation warning, enabled by default.  You can
convert your program now to use C<isI<FOO>_LC_utf8_safe>, and avoid the warnings,
and get an extra measure of protection, or you can wait until v5.30, when
you'll be forced to add the C<e> parameter.


=over 8

=item isALPHA
X<isALPHA>

Returns a boolean indicating whether the specified character is an
alphabetic character, analogous to C<m/[[:alpha:]]/>.
See the L<top of this section|/Character classification> for an explanation of
variants
C<isALPHA_A>, C<isALPHA_L1>, C<isALPHA_uvchr>, C<isALPHA_utf8_safe>,
C<isALPHA_LC>, C<isALPHA_LC_uvchr>, and C<isALPHA_LC_utf8_safe>.

	bool	isALPHA(char ch)

=for hackers
Found in file handy.h

=item isALPHANUMERIC
X<isALPHANUMERIC>

Returns a boolean indicating whether the specified character is either an
alphabetic character or decimal digit, analogous to C<m/[[:alnum:]]/>.
See the L<top of this section|/Character classification> for an explanation of
variants
C<isALPHANUMERIC_A>, C<isALPHANUMERIC_L1>, C<isALPHANUMERIC_uvchr>,
C<isALPHANUMERIC_utf8_safe>, C<isALPHANUMERIC_LC>, C<isALPHANUMERIC_LC_uvchr>,
and C<isALPHANUMERIC_LC_utf8_safe>.

	bool	isALPHANUMERIC(char ch)

=for hackers
Found in file handy.h

=item isASCII
X<isASCII>

Returns a boolean indicating whether the specified character is one of the 128
characters in the ASCII character set, analogous to C<m/[[:ascii:]]/>.
On non-ASCII platforms, it returns TRUE iff this
character corresponds to an ASCII character.  Variants C<isASCII_A()> and
C<isASCII_L1()> are identical to C<isASCII()>.
See the L<top of this section|/Character classification> for an explanation of
variants
C<isASCII_uvchr>, C<isASCII_utf8_safe>, C<isASCII_LC>, C<isASCII_LC_uvchr>, and
C<isASCII_LC_utf8_safe>.  Note, however, that some platforms do not have the C
library routine C<isascii()>.  In these cases, the variants whose names contain
C<LC> are the same as the corresponding ones without.

Also note that, because all ASCII characters are UTF-8 invariant (meaning they
have the exact same representation (always a single byte) whether encoded in
UTF-8 or not), C<isASCII> will give the correct results when called with any
byte in any string encoded or not in UTF-8.  And similarly C<isASCII_utf8_safe>
will work properly on any string encoded or not in UTF-8.

	bool	isASCII(char ch)

=for hackers
Found in file handy.h

=item isBLANK
X<isBLANK>

Returns a boolean indicating whether the specified character is a
character considered to be a blank, analogous to C<m/[[:blank:]]/>.
See the L<top of this section|/Character classification> for an explanation of
variants
C<isBLANK_A>, C<isBLANK_L1>, C<isBLANK_uvchr>, C<isBLANK_utf8_safe>,
C<isBLANK_LC>, C<isBLANK_LC_uvchr>, and C<isBLANK_LC_utf8_safe>.  Note,
however, that some platforms do not have the C library routine
C<isblank()>.  In these cases, the variants whose names contain C<LC> are
the same as the corresponding ones without.

	bool	isBLANK(char ch)

=for hackers
Found in file handy.h

=item isCNTRL
X<isCNTRL>

Returns a boolean indicating whether the specified character is a
control character, analogous to C<m/[[:cntrl:]]/>.
See the L<top of this section|/Character classification> for an explanation of
variants
C<isCNTRL_A>, C<isCNTRL_L1>, C<isCNTRL_uvchr>, C<isCNTRL_utf8_safe>,
C<isCNTRL_LC>, C<isCNTRL_LC_uvchr>, and C<isCNTRL_LC_utf8_safe>.  On EBCDIC
platforms, you almost always want to use the C<isCNTRL_L1> variant.

	bool	isCNTRL(char ch)

=for hackers
Found in file handy.h

=item isDIGIT
X<isDIGIT>

Returns a boolean indicating whether the specified character is a
digit, analogous to C<m/[[:digit:]]/>.
Variants C<isDIGIT_A> and C<isDIGIT_L1> are identical to C<isDIGIT>.
See the L<top of this section|/Character classification> for an explanation of
variants
C<isDIGIT_uvchr>, C<isDIGIT_utf8_safe>, C<isDIGIT_LC>, C<isDIGIT_LC_uvchr>, and
C<isDIGIT_LC_utf8_safe>.

	bool	isDIGIT(char ch)

=for hackers
Found in file handy.h

=item isGRAPH
X<isGRAPH>

Returns a boolean indicating whether the specified character is a
graphic character, analogous to C<m/[[:graph:]]/>.
See the L<top of this section|/Character classification> for an explanation of
variants C<isGRAPH_A>, C<isGRAPH_L1>, C<isGRAPH_uvchr>, C<isGRAPH_utf8_safe>,
C<isGRAPH_LC>, C<isGRAPH_LC_uvchr>, and C<isGRAPH_LC_utf8_safe>.

	bool	isGRAPH(char ch)

=for hackers
Found in file handy.h

=item isIDCONT
X<isIDCONT>

Returns a boolean indicating whether the specified character can be the
second or succeeding character of an identifier.  This is very close to, but
not quite the same as the official Unicode property C<XID_Continue>.  The
difference is that this returns true only if the input character also matches
L</isWORDCHAR>.  See the L<top of this section|/Character classification> for
an
explanation of variants C<isIDCONT_A>, C<isIDCONT_L1>, C<isIDCONT_uvchr>,
C<isIDCONT_utf8_safe>, C<isIDCONT_LC>, C<isIDCONT_LC_uvchr>, and
C<isIDCONT_LC_utf8_safe>.

	bool	isIDCONT(char ch)

=for hackers
Found in file handy.h

=item isIDFIRST
X<isIDFIRST>

Returns a boolean indicating whether the specified character can be the first
character of an identifier.  This is very close to, but not quite the same as
the official Unicode property C<XID_Start>.  The difference is that this
returns true only if the input character also matches L</isWORDCHAR>.
See the L<top of this section|/Character classification> for an explanation of
variants
C<isIDFIRST_A>, C<isIDFIRST_L1>, C<isIDFIRST_uvchr>, C<isIDFIRST_utf8_safe>,
C<isIDFIRST_LC>, C<isIDFIRST_LC_uvchr>, and C<isIDFIRST_LC_utf8_safe>.

	bool	isIDFIRST(char ch)

=for hackers
Found in file handy.h

=item isLOWER
X<isLOWER>

Returns a boolean indicating whether the specified character is a
lowercase character, analogous to C<m/[[:lower:]]/>.
See the L<top of this section|/Character classification> for an explanation of
variants
C<isLOWER_A>, C<isLOWER_L1>, C<isLOWER_uvchr>, C<isLOWER_utf8_safe>,
C<isLOWER_LC>, C<isLOWER_LC_uvchr>, and C<isLOWER_LC_utf8_safe>.

	bool	isLOWER(char ch)

=for hackers
Found in file handy.h

=item isOCTAL
X<isOCTAL>

Returns a boolean indicating whether the specified character is an
octal digit, [0-7].
The only two variants are C<isOCTAL_A> and C<isOCTAL_L1>; each is identical to
C<isOCTAL>.

	bool	isOCTAL(char ch)

=for hackers
Found in file handy.h

=item isPRINT
X<isPRINT>

Returns a boolean indicating whether the specified character is a
printable character, analogous to C<m/[[:print:]]/>.
See the L<top of this section|/Character classification> for an explanation of
variants
C<isPRINT_A>, C<isPRINT_L1>, C<isPRINT_uvchr>, C<isPRINT_utf8_safe>,
C<isPRINT_LC>, C<isPRINT_LC_uvchr>, and C<isPRINT_LC_utf8_safe>.

	bool	isPRINT(char ch)

=for hackers
Found in file handy.h

=item isPSXSPC
X<isPSXSPC>

(short for Posix Space)
Starting in 5.18, this is identical in all its forms to the
corresponding C<isSPACE()> macros.
The locale forms of this macro are identical to their corresponding
C<isSPACE()> forms in all Perl releases.  In releases prior to 5.18, the
non-locale forms differ from their C<isSPACE()> forms only in that the
C<isSPACE()> forms don't match a Vertical Tab, and the C<isPSXSPC()> forms do.
Otherwise they are identical.  Thus this macro is analogous to what
C<m/[[:space:]]/> matches in a regular expression.
See the L<top of this section|/Character classification> for an explanation of
variants C<isPSXSPC_A>, C<isPSXSPC_L1>, C<isPSXSPC_uvchr>, C<isPSXSPC_utf8_safe>,
C<isPSXSPC_LC>, C<isPSXSPC_LC_uvchr>, and C<isPSXSPC_LC_utf8_safe>.

	bool	isPSXSPC(char ch)

=for hackers
Found in file handy.h

=item isPUNCT
X<isPUNCT>

Returns a boolean indicating whether the specified character is a
punctuation character, analogous to C<m/[[:punct:]]/>.
Note that the definition of what is punctuation isn't as
straightforward as one might desire.  See L<perlrecharclass/POSIX Character
Classes> for details.
See the L<top of this section|/Character classification> for an explanation of
variants C<isPUNCT_A>, C<isPUNCT_L1>, C<isPUNCT_uvchr>, C<isPUNCT_utf8_safe>,
C<isPUNCT_LC>, C<isPUNCT_LC_uvchr>, and C<isPUNCT_LC_utf8_safe>.

	bool	isPUNCT(char ch)

=for hackers
Found in file handy.h

=item isSPACE
X<isSPACE>

Returns a boolean indicating whether the specified character is a
whitespace character.  This is analogous
to what C<m/\s/> matches in a regular expression.  Starting in Perl 5.18
this also matches what C<m/[[:space:]]/> does.  Prior to 5.18, only the
locale forms of this macro (the ones with C<LC> in their names) matched
precisely what C<m/[[:space:]]/> does.  In those releases, the only difference,
in the non-locale variants, was that C<isSPACE()> did not match a vertical tab.
(See L</isPSXSPC> for a macro that matches a vertical tab in all releases.)
See the L<top of this section|/Character classification> for an explanation of
variants
C<isSPACE_A>, C<isSPACE_L1>, C<isSPACE_uvchr>, C<isSPACE_utf8_safe>,
C<isSPACE_LC>, C<isSPACE_LC_uvchr>, and C<isSPACE_LC_utf8_safe>.

	bool	isSPACE(char ch)

=for hackers
Found in file handy.h

=item isUPPER
X<isUPPER>

Returns a boolean indicating whether the specified character is an
uppercase character, analogous to C<m/[[:upper:]]/>.
See the L<top of this section|/Character classification> for an explanation of
variants C<isUPPER_A>, C<isUPPER_L1>, C<isUPPER_uvchr>, C<isUPPER_utf8_safe>,
C<isUPPER_LC>, C<isUPPER_LC_uvchr>, and C<isUPPER_LC_utf8_safe>.

	bool	isUPPER(char ch)

=for hackers
Found in file handy.h

=item isWORDCHAR
X<isWORDCHAR>

Returns a boolean indicating whether the specified character is a character
that is a word character, analogous to what C<m/\w/> and C<m/[[:word:]]/> match
in a regular expression.  A word character is an alphabetic character, a
decimal digit, a connecting punctuation character (such as an underscore), or
a "mark" character that attaches to one of those (like some sort of accent).
C<isALNUM()> is a synonym provided for backward compatibility, even though a
word character includes more than the standard C language meaning of
alphanumeric.
See the L<top of this section|/Character classification> for an explanation of
variants C<isWORDCHAR_A>, C<isWORDCHAR_L1>, C<isWORDCHAR_uvchr>, and
C<isWORDCHAR_utf8_safe>.  C<isWORDCHAR_LC>, C<isWORDCHAR_LC_uvchr>, and
C<isWORDCHAR_LC_utf8_safe> are also as described there, but additionally
include the platform's native underscore.

	bool	isWORDCHAR(char ch)

=for hackers
Found in file handy.h

=item isXDIGIT
X<isXDIGIT>

Returns a boolean indicating whether the specified character is a hexadecimal
digit.  In the ASCII range these are C<[0-9A-Fa-f]>.  Variants C<isXDIGIT_A()>
and C<isXDIGIT_L1()> are identical to C<isXDIGIT()>.
See the L<top of this section|/Character classification> for an explanation of
variants
C<isXDIGIT_uvchr>, C<isXDIGIT_utf8_safe>, C<isXDIGIT_LC>, C<isXDIGIT_LC_uvchr>,
and C<isXDIGIT_LC_utf8_safe>.

	bool	isXDIGIT(char ch)

=for hackers
Found in file handy.h


=back

=head1 Cloning an interpreter

=over 8

=item perl_clone
X<perl_clone>

Create and return a new interpreter by cloning the current one.

C<perl_clone> takes these flags as parameters:

C<CLONEf_COPY_STACKS> - copies the stacks as well.  Without this flag,
only the data is cloned and the stacks are zeroed.  With it, the stacks
are copied too, and the new perl interpreter is ready to run at exactly
the same point as the previous one.  The pseudo-fork code uses
C<COPY_STACKS>, while C<< threads->create >> doesn't.

C<CLONEf_KEEP_PTR_TABLE> -
C<perl_clone> keeps a ptr_table with the pointer of the old
variable as a key and the new variable as a value.
This allows it to check whether something has already been cloned and,
if so, to use the existing value and increase its refcount rather than
clone it again.  If C<KEEP_PTR_TABLE> is not set then C<perl_clone> will
kill the ptr_table using
C<ptr_table_free(PL_ptr_table); PL_ptr_table = NULL;>.
A reason to keep it around is if you want to dup some of your own
variables that are outside the graph perl scans.  An example of this
is the C<create> code in F<threads.xs>.

C<CLONEf_CLONE_HOST> -
This is a Win32 thing; it is ignored on Unix.  It tells perl's
win32host code (which is C++) to clone itself.  This is needed on
Win32 if you want to run two threads at the same time.
If you just want to do some stuff in a separate perl interpreter
and then throw it away and return to the original one,
you don't need to do anything.

	PerlInterpreter* perl_clone(
	                     PerlInterpreter *proto_perl,
	                     UV flags
	                 )

=for hackers
Found in file sv.c


=back

=head1 Compile-time scope hooks

=over 8

=item BhkDISABLE
X<BhkDISABLE>


NOTE: this function is experimental and may change or be
removed without notice.


Temporarily disable an entry in this BHK structure, by clearing the
appropriate flag.  C<which> is a preprocessor token indicating which
entry to disable.

	void	BhkDISABLE(BHK *hk, which)

=for hackers
Found in file op.h

=item BhkENABLE
X<BhkENABLE>


NOTE: this function is experimental and may change or be
removed without notice.


Re-enable an entry in this BHK structure, by setting the appropriate
flag.  C<which> is a preprocessor token indicating which entry to enable.
This will assert (under -DDEBUGGING) if the entry doesn't contain a valid
pointer.

	void	BhkENABLE(BHK *hk, which)

=for hackers
Found in file op.h

=item BhkENTRY_set
X<BhkENTRY_set>


NOTE: this function is experimental and may change or be
removed without notice.


Set an entry in the BHK structure, and set the flags to indicate it is
valid.  C<which> is a preprocessing token indicating which entry to set.
The type of C<ptr> depends on the entry.

	void	BhkENTRY_set(BHK *hk, which, void *ptr)

=for hackers
Found in file op.h

=item blockhook_register
X<blockhook_register>


NOTE: this function is experimental and may change or be
removed without notice.


Register a set of hooks to be called when the Perl lexical scope changes
at compile time.  See L<perlguts/"Compile-time scope hooks">.

NOTE: this function must be explicitly called as Perl_blockhook_register with an aTHX_ parameter.

	void	Perl_blockhook_register(pTHX_ BHK *hk)

=for hackers
Found in file op.c


=back

=head1 COP Hint Hashes

=over 8

=item cophh_2hv
X<cophh_2hv>


NOTE: this function is experimental and may change or be
removed without notice.


Generates and returns a standard Perl hash representing the full set of
key/value pairs in the cop hints hash C<cophh>.  C<flags> is currently
unused and must be zero.

	HV *	cophh_2hv(const COPHH *cophh, U32 flags)

=for hackers
Found in file cop.h

=item cophh_copy
X<cophh_copy>


NOTE: this function is experimental and may change or be
removed without notice.


Make and return a complete copy of the cop hints hash C<cophh>.

	COPHH *	cophh_copy(COPHH *cophh)

=for hackers
Found in file cop.h

=item cophh_delete_pv
X<cophh_delete_pv>


NOTE: this function is experimental and may change or be
removed without notice.


Like L</cophh_delete_pvn>, but takes a nul-terminated string instead of
a string/length pair.

	COPHH *	cophh_delete_pv(const COPHH *cophh,
		                const char *key, U32 hash,
		                U32 flags)

=for hackers
Found in file cop.h

=item cophh_delete_pvn
X<cophh_delete_pvn>


NOTE: this function is experimental and may change or be
removed without notice.


Deletes a key and its associated value from the cop hints hash C<cophh>,
and returns the modified hash.  The returned hash pointer is in general
not the same as the hash pointer that was passed in.  The input hash is
consumed by the function, and the pointer to it must not be subsequently
used.  Use L</cophh_copy> if you need both hashes.

The key is specified by C<keypv> and C<keylen>.  If C<flags> has the
C<COPHH_KEY_UTF8> bit set, the key octets are interpreted as UTF-8,
otherwise they are interpreted as Latin-1.  C<hash> is a precomputed
hash of the key string, or zero if it has not been precomputed.

	COPHH *	cophh_delete_pvn(COPHH *cophh,
		                 const char *keypv,
		                 STRLEN keylen, U32 hash,
		                 U32 flags)

=for hackers
Found in file cop.h

=item cophh_delete_pvs
X<cophh_delete_pvs>


NOTE: this function is experimental and may change or be
removed without notice.


Like L</cophh_delete_pvn>, but takes a C<NUL>-terminated literal string instead
of a string/length pair, and no precomputed hash.

	COPHH *	cophh_delete_pvs(const COPHH *cophh,
		                 const char *key, U32 flags)

=for hackers
Found in file cop.h

=item cophh_delete_sv
X<cophh_delete_sv>


NOTE: this function is experimental and may change or be
removed without notice.


Like L</cophh_delete_pvn>, but takes a Perl scalar instead of a
string/length pair.

	COPHH *	cophh_delete_sv(const COPHH *cophh, SV *key,
		                U32 hash, U32 flags)

=for hackers
Found in file cop.h

=item cophh_fetch_pv
X<cophh_fetch_pv>


NOTE: this function is experimental and may change or be
removed without notice.


Like L</cophh_fetch_pvn>, but takes a nul-terminated string instead of
a string/length pair.

	SV *	cophh_fetch_pv(const COPHH *cophh,
		               const char *key, U32 hash,
		               U32 flags)

=for hackers
Found in file cop.h

=item cophh_fetch_pvn
X<cophh_fetch_pvn>


NOTE: this function is experimental and may change or be
removed without notice.


Look up the entry in the cop hints hash C<cophh> with the key specified by
C<keypv> and C<keylen>.  If C<flags> has the C<COPHH_KEY_UTF8> bit set,
the key octets are interpreted as UTF-8, otherwise they are interpreted
as Latin-1.  C<hash> is a precomputed hash of the key string, or zero if
it has not been precomputed.  Returns a mortal scalar copy of the value
associated with the key, or C<&PL_sv_placeholder> if there is no value
associated with the key.

	SV *	cophh_fetch_pvn(const COPHH *cophh,
		                const char *keypv,
		                STRLEN keylen, U32 hash,
		                U32 flags)

=for hackers
Found in file cop.h

=item cophh_fetch_pvs
X<cophh_fetch_pvs>


NOTE: this function is experimental and may change or be
removed without notice.


Like L</cophh_fetch_pvn>, but takes a C<NUL>-terminated literal string instead
of a string/length pair, and no precomputed hash.

	SV *	cophh_fetch_pvs(const COPHH *cophh,
		                const char *key, U32 flags)

=for hackers
Found in file cop.h

=item cophh_fetch_sv
X<cophh_fetch_sv>


NOTE: this function is experimental and may change or be
removed without notice.


Like L</cophh_fetch_pvn>, but takes a Perl scalar instead of a
string/length pair.

	SV *	cophh_fetch_sv(const COPHH *cophh, SV *key,
		               U32 hash, U32 flags)

=for hackers
Found in file cop.h

=item cophh_free
X<cophh_free>


NOTE: this function is experimental and may change or be
removed without notice.


Discard the cop hints hash C<cophh>, freeing all resources associated
with it.

	void	cophh_free(COPHH *cophh)

=for hackers
Found in file cop.h

=item cophh_new_empty
X<cophh_new_empty>


NOTE: this function is experimental and may change or be
removed without notice.


Generate and return a fresh cop hints hash containing no entries.

	COPHH *	cophh_new_empty()

=for hackers
Found in file cop.h

=item cophh_store_pv
X<cophh_store_pv>


NOTE: this function is experimental and may change or be
removed without notice.


Like L</cophh_store_pvn>, but takes a nul-terminated string instead of
a string/length pair.

	COPHH *	cophh_store_pv(const COPHH *cophh,
		               const char *key, U32 hash,
		               SV *value, U32 flags)

=for hackers
Found in file cop.h

=item cophh_store_pvn
X<cophh_store_pvn>


NOTE: this function is experimental and may change or be
removed without notice.


Stores a value, associated with a key, in the cop hints hash C<cophh>,
and returns the modified hash.  The returned hash pointer is in general
not the same as the hash pointer that was passed in.  The input hash is
consumed by the function, and the pointer to it must not be subsequently
used.  Use L</cophh_copy> if you need both hashes.

The key is specified by C<keypv> and C<keylen>.  If C<flags> has the
C<COPHH_KEY_UTF8> bit set, the key octets are interpreted as UTF-8,
otherwise they are interpreted as Latin-1.  C<hash> is a precomputed
hash of the key string, or zero if it has not been precomputed.

C<value> is the scalar value to store for this key.  C<value> is copied
by this function, which thus does not take ownership of any reference
to it, and later changes to the scalar will not be reflected in the
value visible in the cop hints hash.  Complex types of scalar will not
be stored with referential integrity, but will be coerced to strings.
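
The ownership rule shared by C<cophh_store_pvn> and C<cophh_delete_pvn> (the
input hash is consumed, so always continue with the returned pointer) can be
sketched with a toy model in standalone C.  Nothing here is the real
C<COPHH> implementation: the C<toy_hh> type and the C<toy_hh_*> functions are
hypothetical, keys are assumed to be string literals that outlive the hash, and
values are plain integers rather than scalars.

```c

    #include <assert.h>
    #include <stdlib.h>
    #include <string.h>

    /* Toy hints hash: a singly linked list of key/value pairs. */
    typedef struct toy_hh {
        const char    *key;     /* borrowed; assumed to outlive the hash */
        int            value;
        struct toy_hh *next;
    } toy_hh;

    /* Returns the value stored for key, or -1 if there is none. */
    int toy_hh_fetch(const toy_hh *hh, const char *key)
    {
        for (; hh != NULL; hh = hh->next)
            if (strcmp(hh->key, key) == 0)
                return hh->value;
        return -1;
    }

    /* Consumes hh and returns the modified hash (a different pointer). */
    toy_hh *toy_hh_store(toy_hh *hh, const char *key, int value)
    {
        toy_hh *node = malloc(sizeof *node);

        assert(node != NULL);
        node->key   = key;
        node->value = value;
        node->next  = hh;       /* a newer entry shadows older ones */
        return node;
    }

    /* Consumes hh and returns the hash with every entry for key removed. */
    toy_hh *toy_hh_delete(toy_hh *hh, const char *key)
    {
        toy_hh **link = &hh;

        while (*link != NULL) {
            if (strcmp((*link)->key, key) == 0) {
                toy_hh *dead = *link;
                *link = dead->next;
                free(dead);     /* the caller's old pointer may now dangle */
            } else {
                link = &(*link)->next;
            }
        }
        return hh;
    }

    /* Discards the whole hash. */
    void toy_hh_free(toy_hh *hh)
    {
        while (hh != NULL) {
            toy_hh *next = hh->next;
            free(hh);
            hh = next;
        }
    }

```

The calling pattern is always C<h = toy_hh_store(h, ...)> or
C<h = toy_hh_delete(h, ...)>; holding on to the old value of C<h> after such a
call is the mistake the documentation above warns against.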

	COPHH *	cophh_store_pvn(COPHH *cophh, const char *keypv,
		                STRLEN keylen, U32 hash,
		                SV *value, U32 flags)

=for hackers
Found in file cop.h

=item cophh_store_pvs
X<cophh_store_pvs>


NOTE: this function is experimental and may change or be
removed without notice.


Like L</cophh_store_pvn>, but takes a C<NUL>-terminated literal string instead
of a string/length pair, and no precomputed hash.

	COPHH *	cophh_store_pvs(const COPHH *cophh,
		                const char *key, SV *value,
		                U32 flags)

=for hackers
Found in file cop.h

=item cophh_store_sv
X<cophh_store_sv>


NOTE: this function is experimental and may change or be
removed without notice.


Like L</cophh_store_pvn>, but takes a Perl scalar instead of a
string/length pair.

	COPHH *	cophh_store_sv(const COPHH *cophh, SV *key,
		               U32 hash, SV *value, U32 flags)

=for hackers
Found in file cop.h


=back

=head1 COP Hint Reading

=over 8

=item cop_hints_2hv
X<cop_hints_2hv>

Generates and returns a standard Perl hash representing the full set of
hint entries in the cop C<cop>.  C<flags> is currently unused and must
be zero.

	HV *	cop_hints_2hv(const COP *cop, U32 flags)

=for hackers
Found in file cop.h

=item cop_hints_fetch_pv
X<cop_hints_fetch_pv>

Like L</cop_hints_fetch_pvn>, but takes a nul-terminated string instead
of a string/length pair.

	SV *	cop_hints_fetch_pv(const COP *cop,
		                   const char *key, U32 hash,
		                   U32 flags)

=for hackers
Found in file cop.h

=item cop_hints_fetch_pvn
X<cop_hints_fetch_pvn>

Look up the hint entry in the cop C<cop> with the key specified by
C<keypv> and C<keylen>.  If C<flags> has the C<COPHH_KEY_UTF8> bit set,
the key octets are interpreted as UTF-8, otherwise they are interpreted
as Latin-1.  C<hash> is a precomputed hash of the key string, or zero if
it has not been precomputed.  Returns a mortal scalar copy of the value
associated with the key, or C<&PL_sv_placeholder> if there is no value
associated with the key.

	SV *	cop_hints_fetch_pvn(const COP *cop,
		                    const char *keypv,
		                    STRLEN keylen, U32 hash,
		                    U32 flags)

=for hackers
Found in file cop.h

=item cop_hints_fetch_pvs
X<cop_hints_fetch_pvs>

Like L</cop_hints_fetch_pvn>, but takes a C<NUL>-terminated literal string
instead of a string/length pair, and no precomputed hash.

	SV *	cop_hints_fetch_pvs(const COP *cop,
		                    const char *key, U32 flags)

=for hackers
Found in file cop.h

=item cop_hints_fetch_sv
X<cop_hints_fetch_sv>

Like L</cop_hints_fetch_pvn>, but takes a Perl scalar instead of a
string/length pair.

	SV *	cop_hints_fetch_sv(const COP *cop, SV *key,
		                   U32 hash, U32 flags)

=for hackers
Found in file cop.h


=back

=head1 Custom Operators

=over 8

=item custom_op_register
X<custom_op_register>

Register a custom op.  See L<perlguts/"Custom Operators">.

NOTE: this function must be explicitly called as Perl_custom_op_register with an aTHX_ parameter.

	void	Perl_custom_op_register(pTHX_ 
		                        Perl_ppaddr_t ppaddr,
		                        const XOP *xop)

=for hackers
Found in file op.c

=item custom_op_xop
X<custom_op_xop>

Return the XOP structure for a given custom op.  This macro should be
considered internal to C<OP_NAME> and the other access macros: use them instead.
This macro does call a function.  Prior
to 5.19.6, this was implemented as a
function.

NOTE: this function must be explicitly called as Perl_custom_op_xop with an aTHX_ parameter.

	const XOP * Perl_custom_op_xop(pTHX_ const OP *o)

=for hackers
Found in file op.c

=item XopDISABLE
X<XopDISABLE>

Temporarily disable a member of the XOP, by clearing the appropriate flag.

	void	XopDISABLE(XOP *xop, which)

=for hackers
Found in file op.h

=item XopENABLE
X<XopENABLE>

Reenable a member of the XOP which has been disabled.

	void	XopENABLE(XOP *xop, which)

=for hackers
Found in file op.h

=item XopENTRY
X<XopENTRY>

Return a member of the XOP structure.  C<which> is a cpp token
indicating which entry to return.  If the member is not set
this will return a default value.  The return type depends
on C<which>.  This macro evaluates its arguments more than
once.  If you are using C<Perl_custom_op_xop> to retrieve a
C<XOP *> from a C<OP *>, use the more efficient L</XopENTRYCUSTOM> instead.
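
Why multiple evaluation matters can be shown with a toy macro in standalone C.
This is a sketch of the general hazard only: C<struct toy_xop>,
C<TOY_ENTRY_NAME> and C<toy_lookup> are hypothetical stand-ins and do not
reflect how C<XopENTRY> is actually defined.

```c

    #include <assert.h>
    #include <stddef.h>

    #define TOY_HAS_NAME 0x1u

    struct toy_xop {
        unsigned    flags;
        const char *name;
    };

    /* Hypothetical accessor: yields the member if its flag is set, otherwise
     * a default.  Note that x appears twice in the expansion. */
    #define TOY_ENTRY_NAME(x) \
        (((x)->flags & TOY_HAS_NAME) ? (x)->name : "(unknown)")

    int toy_lookups = 0;                    /* counts calls to toy_lookup() */
    struct toy_xop toy_op = { TOY_HAS_NAME, "my_op" };

    /* Stand-in for a function call that finds the structure. */
    struct toy_xop *toy_lookup(void)
    {
        toy_lookups++;
        return &toy_op;
    }

    /* The lookup is written once but runs twice when the flag is set. */
    const char *toy_name_naive(void)
    {
        return TOY_ENTRY_NAME(toy_lookup());
    }

    /* Caching the pointer first keeps it to a single lookup. */
    const char *toy_name_cached(void)
    {
        struct toy_xop *x = toy_lookup();

        assert(x != NULL);
        return TOY_ENTRY_NAME(x);
    }

```

Passing a function call as a macro argument repeats that call for every
appearance of the argument in the expansion, which is why a macro that performs
the lookup itself can be cheaper.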

		XopENTRY(XOP *xop, which)

=for hackers
Found in file op.h

=item XopENTRYCUSTOM
X<XopENTRYCUSTOM>

Exactly like C<XopENTRY(Perl_custom_op_xop(aTHX_ o), which)> but more
efficient.  The C<which> parameter is identical to L</XopENTRY>.

		XopENTRYCUSTOM(const OP *o, which)

=for hackers
Found in file op.h

=item XopENTRY_set
X<XopENTRY_set>

Set a member of the XOP structure.  C<which> is a cpp token
indicating which entry to set.  See L<perlguts/"Custom Operators">
for details about the available members and how
they are used.  This macro evaluates its argument
more than once.

	void	XopENTRY_set(XOP *xop, which, value)

=for hackers
Found in file op.h

=item XopFLAGS
X<XopFLAGS>

Return the XOP's flags.

	U32	XopFLAGS(XOP *xop)

=for hackers
Found in file op.h


=back

=head1 CV Manipulation Functions

This section documents functions to manipulate CVs, which are code-values
(that is, subroutines).  For more information, see L<perlguts>.


=over 8

=item caller_cx
X<caller_cx>

The XSUB-writer's equivalent of L<caller()|perlfunc/caller>.  The
returned C<PERL_CONTEXT> structure can be interrogated to find all the
information returned to Perl by C<caller>.  Note that XSUBs don't get a
stack frame, so C<caller_cx(0, NULL)> will return information for the
immediately-surrounding Perl code.

This function skips over the automatic calls to C<&DB::sub> made on the
behalf of the debugger.  If the stack frame requested was a sub called by
C<DB::sub>, the return value will be the frame for the call to
C<DB::sub>, since that has the correct line number/etc. for the call
site.  If I<dbcxp> is non-C<NULL>, it will be set to a pointer to the
frame for the sub call itself.

	const PERL_CONTEXT * caller_cx(
	                         I32 level,
	                         const PERL_CONTEXT **dbcxp
	                     )

=for hackers
Found in file pp_ctl.c

=item CvSTASH
X<CvSTASH>

Returns the stash of the CV.  A stash is the symbol table hash, containing
the package-scoped variables in the package where the subroutine was defined.
For more information, see L<perlguts>.

This also has a special use with XS AUTOLOAD subs.
See L<perlguts/Autoloading with XSUBs>.

	HV*	CvSTASH(CV* cv)

=for hackers
Found in file cv.h

=item find_runcv
X<find_runcv>

Locate the CV corresponding to the currently executing sub or eval.
If C<db_seqp> is non-null, skip CVs that are in the DB package and populate
C<*db_seqp> with the cop sequence number at the point that the DB:: code was
entered.  (This allows debuggers to eval in the scope of the breakpoint
rather than in the scope of the debugger itself.)

	CV*	find_runcv(U32 *db_seqp)

=for hackers
Found in file pp_ctl.c

=item get_cv
X<get_cv>

Uses C<strlen> to get the length of C<name>, then calls C<get_cvn_flags>.

NOTE: the perl_ form of this function is deprecated.

	CV*	get_cv(const char* name, I32 flags)

=for hackers
Found in file perl.c

=item get_cvn_flags
X<get_cvn_flags>

Returns the CV of the specified Perl subroutine.  C<flags> are passed to
C<gv_fetchpvn_flags>.  If C<GV_ADD> is set and the Perl subroutine does not
exist then it will be declared (which has the same effect as saying
C<sub name;>).  If C<GV_ADD> is not set and the subroutine does not exist
then NULL is returned.

NOTE: the perl_ form of this function is deprecated.

	CV*	get_cvn_flags(const char* name, STRLEN len,
		              I32 flags)

=for hackers
Found in file perl.c


=back

=head1 C<xsubpp> variables and internal functions

=over 8

=item ax
X<ax>

Variable which is set up by C<xsubpp> to indicate the stack base offset,
used by the C<ST>, C<XSprePUSH> and C<XSRETURN> macros.  The C<dMARK> macro
must be called beforehand to set up the C<MARK> variable.

	I32	ax

=for hackers
Found in file XSUB.h

=item CLASS
X<CLASS>

Variable which is set up by C<xsubpp> to indicate the
class name for a C++ XS constructor.  This is always a C<char*>.  See
C<L</THIS>>.

	char*	CLASS

=for hackers
Found in file XSUB.h

=item dAX
X<dAX>

Sets up the C<ax> variable.
This is usually handled automatically by C<xsubpp> by calling C<dXSARGS>.

		dAX;

=for hackers
Found in file XSUB.h

=item dAXMARK
X<dAXMARK>

Sets up the C<ax> variable and stack marker variable C<mark>.
This is usually handled automatically by C<xsubpp> by calling C<dXSARGS>.

		dAXMARK;

=for hackers
Found in file XSUB.h

=item dITEMS
X<dITEMS>

Sets up the C<items> variable.
This is usually handled automatically by C<xsubpp> by calling C<dXSARGS>.

		dITEMS;

=for hackers
Found in file XSUB.h

=item dUNDERBAR
X<dUNDERBAR>

Sets up any variable needed by the C<UNDERBAR> macro.  It used to define
C<padoff_du>, but it is currently a no-op.  However, you are still strongly
advised to use it to ensure past and future compatibility.

		dUNDERBAR;

=for hackers
Found in file XSUB.h

=item dXSARGS
X<dXSARGS>

Sets up stack and mark pointers for an XSUB, calling C<dSP> and C<dMARK>.
Sets up the C<ax> and C<items> variables by calling C<dAX> and C<dITEMS>.
This is usually handled automatically by C<xsubpp>.

		dXSARGS;

=for hackers
Found in file XSUB.h

=item dXSI32
X<dXSI32>

Sets up the C<ix> variable for an XSUB which has aliases.  This is usually
handled automatically by C<xsubpp>.

		dXSI32;

=for hackers
Found in file XSUB.h

=item items
X<items>

Variable which is set up by C<xsubpp> to indicate the number of
items on the stack.  See L<perlxs/"Variable-length Parameter Lists">.

	I32	items

=for hackers
Found in file XSUB.h

=item ix
X<ix>

Variable which is set up by C<xsubpp> to indicate which of an
XSUB's aliases was used to invoke it.  See L<perlxs/"The ALIAS: Keyword">.

	I32	ix

=for hackers
Found in file XSUB.h

=item RETVAL
X<RETVAL>

Variable which is set up by C<xsubpp> to hold the return value for an
XSUB.  This is always the proper type for the XSUB.  See 
L<perlxs/"The RETVAL Variable">.

	(whatever)	RETVAL

=for hackers
Found in file XSUB.h

=item ST
X<ST>

Used to access elements on the XSUB's stack.

	SV*	ST(int ix)

=for hackers
Found in file XSUB.h

=item THIS
X<THIS>

Variable which is set up by C<xsubpp> to designate the object in a C++
XSUB.  This is always the proper type for the C++ object.  See C<L</CLASS>> and
L<perlxs/"Using XS With C++">.

	(whatever)	THIS

=for hackers
Found in file XSUB.h

=item UNDERBAR
X<UNDERBAR>

The SV* corresponding to the C<$_> variable.  Works even if there
is a lexical C<$_> in scope.

=for hackers
Found in file XSUB.h

=item XS
X<XS>

Macro to declare an XSUB and its C parameter list.  This is handled by
C<xsubpp>.  It is the same as using the more explicit C<XS_EXTERNAL> macro.

=for hackers
Found in file XSUB.h

=item XS_EXTERNAL
X<XS_EXTERNAL>

Macro to declare an XSUB and its C parameter list, explicitly exporting the
symbols.

=for hackers
Found in file XSUB.h

=item XS_INTERNAL
X<XS_INTERNAL>

Macro to declare an XSUB and its C parameter list without exporting the symbols.
This is handled by C<xsubpp> and is generally preferable to exporting the XSUB
symbols unnecessarily.

=for hackers
Found in file XSUB.h


=back

=head1 Debugging Utilities

=over 8

=item dump_all
X<dump_all>

Dumps the entire optree of the current program starting at C<PL_main_root> to 
C<STDERR>.  Also dumps the optrees for all visible subroutines in
C<PL_defstash>.

	void	dump_all()

=for hackers
Found in file dump.c

=item dump_packsubs
X<dump_packsubs>

Dumps the optrees for all visible subroutines in C<stash>.

	void	dump_packsubs(const HV* stash)

=for hackers
Found in file dump.c

=item op_class
X<op_class>

Given an op, determine what type of struct it has been allocated as.
Returns one of the OPclass enums, such as OPclass_LISTOP.

	OPclass	op_class(const OP *o)

=for hackers
Found in file dump.c

=item op_dump
X<op_dump>

Dumps the optree starting at OP C<o> to C<STDERR>.

	void	op_dump(const OP *o)

=for hackers
Found in file dump.c

=item sv_dump
X<sv_dump>

Dumps the contents of an SV to the C<STDERR> filehandle.

For an example of its output, see L<Devel::Peek>.

	void	sv_dump(SV* sv)

=for hackers
Found in file dump.c


=back

=head1 Display and Dump functions

=over 8

=item pv_display
X<pv_display>

Similar to

  pv_escape(dsv,pv,cur,pvlim,PERL_PV_ESCAPE_QUOTE);

except that an additional C<"\0"> will be appended to the string when
C<< len > cur >> and C<pv[cur]> is C<"\0">.

Note that the final string may be up to 7 chars longer than C<pvlim>.

	char*	pv_display(SV *dsv, const char *pv, STRLEN cur,
		           STRLEN len, STRLEN pvlim)

=for hackers
Found in file dump.c

=item pv_escape
X<pv_escape>

Escapes at most the first C<count> chars of C<pv> and puts the results into
C<dsv> such that the size of the escaped string will not exceed C<max> chars
and will not contain any incomplete escape sequences.  The number of bytes
escaped will be returned in the C<STRLEN *escaped> parameter if it is not null.
When the C<dsv> parameter is null no escaping actually occurs, but the number
of bytes that would be escaped were it not null will be calculated.

If flags contains C<PERL_PV_ESCAPE_QUOTE> then any double quotes in the string
will also be escaped.

Normally the SV will be cleared before the escaped string is prepared,
but when C<PERL_PV_ESCAPE_NOCLEAR> is set this will not occur.

If C<PERL_PV_ESCAPE_UNI> is set then the input string is treated as UTF-8.
If C<PERL_PV_ESCAPE_UNI_DETECT> is set then the input string is scanned
using C<is_utf8_string()> to determine if it is UTF-8.

If C<PERL_PV_ESCAPE_ALL> is set then all input chars will be output
using C<\x01F1> style escapes.  Otherwise, if C<PERL_PV_ESCAPE_NONASCII> is
set, only non-ASCII chars will be escaped using this style; otherwise, only
chars above 255 will be so escaped, and other non-printable chars will use
octal or common escaped patterns like C<\n>.
Otherwise, if C<PERL_PV_ESCAPE_NOBACKSLASH> is set,
then all chars below 255 will be treated as printable and
will be output as literals.

If C<PERL_PV_ESCAPE_FIRSTCHAR> is set then only the first char of the
string will be escaped, regardless of max.  If the output is to be in hex,
then it will be returned as a plain hex
sequence.  Thus the output will either be a single char,
an octal escape sequence, a special escape like C<\n> or a hex value.

If C<PERL_PV_ESCAPE_RE> is set then the escape char used will be a C<"%"> and
not a C<"\\">.  This is because regexes very often contain backslashed
sequences, whereas C<"%"> is not a particularly common character in patterns.

Returns a pointer to the escaped text as held by C<dsv>.
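
The size rule described above (output never exceeds C<max> chars and never
ends in a partial escape) can be illustrated with a small stand-alone model.
This is a sketch only: C<escape_bounded> is a hypothetical helper written in
plain C, it is not Perl's C<pv_escape>, and its choice of escapes is an
assumption made for the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified model, NOT Perl's pv_escape(): escapes at most `count` bytes
 * of `src` into `dst`, writing at most `max` chars plus a trailing NUL and
 * never emitting a partial escape sequence.  `dst` must hold max + 1 bytes.
 * Returns the number of input bytes that were fully escaped. */
size_t escape_bounded(const char *src, size_t count, char *dst, size_t max)
{
    size_t used = 0;      /* output chars written so far */
    size_t consumed = 0;  /* input bytes fully escaped   */

    assert(src != NULL && dst != NULL);

    for (; consumed < count; consumed++) {
        unsigned char c = (unsigned char)src[consumed];
        char piece[8];
        size_t n;

        if (c == '\n')
            n = (size_t)snprintf(piece, sizeof piece, "\\n");
        else if (c == '\\' || c == '"')
            n = (size_t)snprintf(piece, sizeof piece, "\\%c", c);
        else if (c < 0x20 || c > 0x7e)
            n = (size_t)snprintf(piece, sizeof piece, "\\x%02X", c);
        else
            n = (size_t)snprintf(piece, sizeof piece, "%c", c);

        if (used + n > max)   /* the whole sequence would not fit: stop */
            break;
        memcpy(dst + used, piece, n);
        used += n;
    }
    dst[used] = '\0';
    return consumed;
}
```

The return value plays the role of the C<escaped> out-parameter: it tells the
caller how much of the input was actually represented in the output.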

	char*	pv_escape(SV *dsv, char const * const str,
		          const STRLEN count, const STRLEN max,
		          STRLEN * const escaped,
		          const U32 flags)

=for hackers
Found in file dump.c

=item pv_pretty
X<pv_pretty>

Converts a string into something presentable, handling escaping via
C<pv_escape()> and supporting quoting and ellipses.

If the C<PERL_PV_PRETTY_QUOTE> flag is set then the result will be
double quoted with any double quotes in the string escaped.  Otherwise
if the C<PERL_PV_PRETTY_LTGT> flag is set then the result will be wrapped in
angle brackets.

If the C<PERL_PV_PRETTY_ELLIPSES> flag is set and not all characters in
the string were output then an ellipsis C<...> will be appended to the
string.  Note that this happens AFTER it has been quoted.

If C<start_color> is non-null then it will be inserted after the opening
quote (if there is one) but before the escaped text.  If C<end_color>
is non-null then it will be inserted after the escaped text but before
any quotes or ellipses.

Returns a pointer to the prettified text as held by C<dsv>.
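
The ordering of quotes, colour strings and ellipsis described above can be
shown with a tiny stand-alone model.  C<pretty_model> is a hypothetical
helper, not part of the Perl API; it assumes the text has already been
escaped and only demonstrates where each piece lands.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Model of the layout only, NOT Perl's pv_pretty():
 *   opening quote, start_color, text, end_color, closing quote, ellipsis.
 * `truncated` stands in for "not all characters were output".
 * Returns the length snprintf() reports for the formatted result. */
int pretty_model(char *out, size_t outsz, const char *text, int truncated,
                 const char *start_color, const char *end_color)
{
    assert(out != NULL && text != NULL);
    return snprintf(out, outsz, "\"%s%s%s\"%s",
                    start_color ? start_color : "",
                    text,
                    end_color ? end_color : "",
                    truncated ? "..." : "");
}
```

With C<"E<lt>"> and C<"E<gt>"> standing in for colour sequences, the text
C<abc> and a truncated input produce the quote, the start marker, the text,
the end marker, the closing quote, and only then the ellipsis.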

	char*	pv_pretty(SV *dsv, char const * const str,
		          const STRLEN count, const STRLEN max,
		          char const * const start_color,
		          char const * const end_color,
		          const U32 flags)

=for hackers
Found in file dump.c


=back

=head1 Embedding Functions

=over 8

=item cv_clone
X<cv_clone>

Clone a CV, making a lexical closure.  C<proto> supplies the prototype
of the function: its code, pad structure, and other attributes.
The prototype is combined with a capture of outer lexicals to which the
code refers, which are taken from the currently-executing instance of
the immediately surrounding code.

	CV *	cv_clone(CV *proto)

=for hackers
Found in file pad.c

=item cv_name
X<cv_name>

Returns an SV containing the name of the CV, mainly for use in error
reporting.  The CV may actually be a GV instead, in which case the returned
SV holds the GV's name.  Anything other than a GV or CV is treated as a
string already holding the sub name, but this could change in the future.

An SV may be passed as a second argument.  If so, the name will be assigned
to it and it will be returned.  Otherwise the returned SV will be a new
mortal.

If C<flags> has the C<CV_NAME_NOTQUAL> bit set, then the package name will not be
included.  If the first argument is neither a CV nor a GV, this flag is
ignored (subject to change).

	SV *	cv_name(CV *cv, SV *sv, U32 flags)

=for hackers
Found in file pad.c

=item cv_undef
X<cv_undef>

Clear out all the active components of a CV.  This can happen either
by an explicit C<undef &foo>, or by the reference count going to zero.
In the former case, we keep the C<CvOUTSIDE> pointer, so that any anonymous
children can still follow the full lexical scope chain.

	void	cv_undef(CV* cv)

=for hackers
Found in file pad.c

=item find_rundefsv
X<find_rundefsv>

Returns the global variable C<$_>.

	SV *	find_rundefsv()

=for hackers
Found in file pad.c

=item find_rundefsvoffset
X<find_rundefsvoffset>


DEPRECATED!  It is planned to remove this function from a
future release of Perl.  Do not use it for new code; remove it from
existing code.


Until the lexical C<$_> feature was removed, this function would
find the position of the lexical C<$_> in the pad of the
currently-executing function and return the offset in the current pad,
or C<NOT_IN_PAD>.

Now it always returns C<NOT_IN_PAD>.

NOTE: the perl_ form of this function is deprecated.

	PADOFFSET find_rundefsvoffset()

=for hackers
Found in file pad.c

=item intro_my
X<intro_my>

"Introduce" C<my> variables to visible status.  This is called during parsing
at the end of each statement to make lexical variables visible to subsequent
statements.

	U32	intro_my()

=for hackers
Found in file pad.c

=item load_module
X<load_module>

Loads the module whose name is pointed to by the string part of C<name>.
Note that the actual module name, not its filename, should be given.
E.g., "Foo::Bar" instead of "Foo/Bar.pm".  C<ver>, if specified and not NULL,
provides version semantics similar to C<use Foo::Bar VERSION>.  The optional
trailing arguments can be used to specify arguments to the module's C<import()>
method, similar to C<use Foo::Bar VERSION LIST>; their precise handling depends
on the flags.  The C<flags> argument is a bitwise-ORed collection of any of
C<PERL_LOADMOD_DENY>, C<PERL_LOADMOD_NOIMPORT>, or C<PERL_LOADMOD_IMPORT_OPS>
(or 0 for no flags).

If C<PERL_LOADMOD_NOIMPORT> is set, the module is loaded as if with an empty
import list, as in C<use Foo::Bar ()>; this is the only circumstance in which
the trailing optional arguments may be omitted entirely. Otherwise, if
C<PERL_LOADMOD_IMPORT_OPS> is set, the trailing arguments must consist of
exactly one C<OP*>, containing the op tree that produces the relevant import
arguments. Otherwise, the trailing arguments must all be C<SV*> values that
will be used as import arguments; and the list must be terminated with C<(SV*)
NULL>. If neither C<PERL_LOADMOD_NOIMPORT> nor C<PERL_LOADMOD_IMPORT_OPS> is
set, the trailing C<NULL> pointer is needed even if no import arguments are
desired. The reference count for each specified C<SV*> argument is
decremented. In addition, the C<name> argument is modified.

If C<PERL_LOADMOD_DENY> is set, the module is loaded as if with C<no> rather
than C<use>.
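
The requirement for a terminating C<(SV*)NULL> follows from how C variadic
functions work: the callee cannot know how many arguments were passed, so it
reads until it sees a null pointer of the expected type.  The following
stand-alone sketch models that convention with plain strings.
C<count_import_args> is a hypothetical helper, not part of the Perl API.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical stand-in for load_module()'s trailing import arguments:
 * counts the `const char *` arguments that follow `module`, stopping at
 * the terminating null pointer. */
size_t count_import_args(const char *module, ...)
{
    va_list ap;
    size_t n = 0;

    assert(module != NULL);
    va_start(ap, module);
    while (va_arg(ap, const char *) != NULL)
        n++;
    va_end(ap);
    return n;
}
```

A call such as C<count_import_args("Foo::Bar", "a", "b", (const char *)NULL)>
sees two import arguments.  The cast matters because a bare C<NULL> may be a
plain integer zero, which is not guaranteed to be passed as a pointer through
C<...>; the same reasoning applies to C<(SV*)NULL> in C<load_module>.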

	void	load_module(U32 flags, SV* name, SV* ver, ...)

=for hackers
Found in file op.c

=item newPADNAMELIST
X<newPADNAMELIST>


NOTE: this function is experimental and may change or be
removed without notice.


Creates a new pad name list.  C<max> is the highest index for which space
is allocated.

	PADNAMELIST * newPADNAMELIST(size_t max)

=for hackers
Found in file pad.c

=item newPADNAMEouter
X<newPADNAMEouter>


NOTE: this function is experimental and may change or be
removed without notice.


Constructs and returns a new pad name.  Only use this function for names
that refer to outer lexicals.  (See also L</newPADNAMEpvn>.)  C<outer> is
the outer pad name that this one mirrors.  The returned pad name has the
C<PADNAMEt_OUTER> flag already set.

	PADNAME * newPADNAMEouter(PADNAME *outer)

=for hackers
Found in file pad.c

=item newPADNAMEpvn
X<newPADNAMEpvn>


NOTE: this function is experimental and may change or be
removed without notice.


Constructs and returns a new pad name.  C<s> must be a UTF-8 string.  Do not
use this for pad names that point to outer lexicals.  See
C<L</newPADNAMEouter>>.

	PADNAME * newPADNAMEpvn(const char *s, STRLEN len)

=for hackers
Found in file pad.c

=item nothreadhook
X<nothreadhook>

Stub that provides the thread hook for C<perl_destruct> when there are
no threads.

	int	nothreadhook()

=for hackers
Found in file perl.c

=item pad_add_anon
X<pad_add_anon>

Allocates a place in the currently-compiling pad (via L</pad_alloc>)
for an anonymous function that is lexically scoped inside the
currently-compiling function.
The function C<func> is linked into the pad, and its C<CvOUTSIDE> link
to the outer scope is weakened to avoid a reference loop.

One reference count is stolen, so you may need to do C<SvREFCNT_inc(func)>.

C<optype> should be an opcode indicating the type of operation that the
pad entry is to support.  This doesn't affect operational semantics,
but is used for debugging.

	PADOFFSET pad_add_anon(CV *func, I32 optype)

=for hackers
Found in file pad.c

=item pad_add_name_pv
X<pad_add_name_pv>

Exactly like L</pad_add_name_pvn>, but takes a nul-terminated string
instead of a string/length pair.

	PADOFFSET pad_add_name_pv(const char *name, U32 flags,
	                          HV *typestash, HV *ourstash)

=for hackers
Found in file pad.c

=item pad_add_name_pvn
X<pad_add_name_pvn>

Allocates a place in the currently-compiling pad for a named lexical
variable.  Stores the name and other metadata in the name part of the
pad, and makes preparations to manage the variable's lexical scoping.
Returns the offset of the allocated pad slot.

C<namepv>/C<namelen> specify the variable's name, including leading sigil.
If C<typestash> is non-null, the name is for a typed lexical, and this
identifies the type.  If C<ourstash> is non-null, it's a lexical reference
to a package variable, and this identifies the package.  The following
flags can be OR'ed together:

 padadd_OUR          redundantly specifies if it's a package var
 padadd_STATE        variable will retain value persistently
 padadd_NO_DUP_CHECK skip check for lexical shadowing

	PADOFFSET pad_add_name_pvn(const char *namepv,
	                           STRLEN namelen, U32 flags,
	                           HV *typestash, HV *ourstash)

=for hackers
Found in file pad.c

=item pad_add_name_sv
X<pad_add_name_sv>

Exactly like L</pad_add_name_pvn>, but takes the name string in the form
of an SV instead of a string/length pair.

	PADOFFSET pad_add_name_sv(SV *name, U32 flags,
	                          HV *typestash, HV *ourstash)

=for hackers
Found in file pad.c

=item pad_alloc
X<pad_alloc>


NOTE: this function is experimental and may change or be
removed without notice.


Allocates a place in the currently-compiling pad,
returning the offset of the allocated pad slot.
No name is initially attached to the pad slot.
C<tmptype> is a set of flags indicating the kind of pad entry required,
which will be set in the value SV for the allocated pad entry:

    SVs_PADMY    named lexical variable ("my", "our", "state")
    SVs_PADTMP   unnamed temporary store
    SVf_READONLY constant shared between recursion levels

C<SVf_READONLY> has been supported here only since perl 5.20.  To work with
earlier versions as well, use C<SVf_READONLY|SVs_PADTMP>.  C<SVf_READONLY>
does not cause the SV in the pad slot to be marked read-only, but simply
tells C<pad_alloc> that it I<will> be made read-only (by the caller), or at
least should be treated as such.

C<optype> should be an opcode indicating the type of operation that the
pad entry is to support.  This doesn't affect operational semantics,
but is used for debugging.

	PADOFFSET pad_alloc(I32 optype, U32 tmptype)

=for hackers
Found in file pad.c

=item pad_findmy_pv
X<pad_findmy_pv>

Exactly like L</pad_findmy_pvn>, but takes a nul-terminated string
instead of a string/length pair.

	PADOFFSET pad_findmy_pv(const char *name, U32 flags)

=for hackers
Found in file pad.c

=item pad_findmy_pvn
X<pad_findmy_pvn>

Given the name of a lexical variable, find its position in the
currently-compiling pad.
C<namepv>/C<namelen> specify the variable's name, including leading sigil.
C<flags> is reserved and must be zero.
If it is not in the current pad but appears in the pad of any lexically
enclosing scope, then a pseudo-entry for it is added in the current pad.
Returns the offset in the current pad,
or C<NOT_IN_PAD> if no such lexical is in scope.

	PADOFFSET pad_findmy_pvn(const char *namepv,
	                         STRLEN namelen, U32 flags)

=for hackers
Found in file pad.c

=item pad_findmy_sv
X<pad_findmy_sv>

Exactly like L</pad_findmy_pvn>, but takes the name string in the form
of an SV instead of a string/length pair.

	PADOFFSET pad_findmy_sv(SV *name, U32 flags)

=for hackers
Found in file pad.c

=item padnamelist_fetch
X<padnamelist_fetch>


NOTE: this function is experimental and may change or be
removed without notice.


Fetches the pad name from the given index.

	PADNAME * padnamelist_fetch(PADNAMELIST *pnl,
	                            SSize_t key)

=for hackers
Found in file pad.c

=item padnamelist_store
X<padnamelist_store>


NOTE: this function is experimental and may change or be
removed without notice.


Stores the pad name (which may be null) at the given index, freeing any
existing pad name in that slot.

	PADNAME ** padnamelist_store(PADNAMELIST *pnl,
	                             SSize_t key, PADNAME *val)

=for hackers
Found in file pad.c

=item pad_setsv
X<pad_setsv>

Set the value at offset C<po> in the current (compiling or executing) pad.
Use the macro C<PAD_SETSV()> rather than calling this function directly.

	void	pad_setsv(PADOFFSET po, SV *sv)

=for hackers
Found in file pad.c

=item pad_sv
X<pad_sv>

Get the value at offset C<po> in the current (compiling or executing) pad.
Use macro C<PAD_SV> instead of calling this function directly.

	SV *	pad_sv(PADOFFSET po)

=for hackers
Found in file pad.c

=item pad_tidy
X<pad_tidy>


NOTE: this function is experimental and may change or be
removed without notice.


Tidy up a pad at the end of compilation of the code to which it belongs.
Jobs performed here are: remove most stuff from the pads of anonsub
prototypes; give it a C<@_>; mark temporaries as such.  C<type> indicates
the kind of subroutine:

    padtidy_SUB        ordinary subroutine
    padtidy_SUBCLONE   prototype for lexical closure
    padtidy_FORMAT     format

	void	pad_tidy(padtidy_type type)

=for hackers
Found in file pad.c

=item perl_alloc
X<perl_alloc>

Allocates a new Perl interpreter.  See L<perlembed>.

	PerlInterpreter* perl_alloc()

=for hackers
Found in file perl.c

=item perl_construct
X<perl_construct>

Initializes a new Perl interpreter.  See L<perlembed>.

	void	perl_construct(PerlInterpreter *my_perl)

=for hackers
Found in file perl.c

=item perl_destruct
X<perl_destruct>

Shuts down a Perl interpreter.  See L<perlembed>.

	int	perl_destruct(PerlInterpreter *my_perl)

=for hackers
Found in file perl.c

=item perl_free
X<perl_free>

Releases a Perl interpreter.  See L<perlembed>.

	void	perl_free(PerlInterpreter *my_perl)

=for hackers
Found in file perl.c

=item perl_parse
X<perl_parse>

Tells a Perl interpreter to parse a Perl script.  See L<perlembed>.

	int	perl_parse(PerlInterpreter *my_perl,
		           XSINIT_t xsinit, int argc,
		           char** argv, char** env)

=for hackers
Found in file perl.c

=item perl_run
X<perl_run>

Tells a Perl interpreter to run.  See L<perlembed>.

	int	perl_run(PerlInterpreter *my_perl)

=for hackers
Found in file perl.c

=item require_pv
X<require_pv>

Tells Perl to C<require> the file named by the string argument.  It is
analogous to the Perl code C<eval "require '$file'">.  It's even
implemented that way; consider using C<L</load_module>> instead.

NOTE: the perl_ form of this function is deprecated.

	void	require_pv(const char* pv)

=for hackers
Found in file perl.c


=back

=head1 Exception Handling (simple) Macros

=over 8

=item dXCPT
X<dXCPT>

Set up necessary local variables for exception handling.
See L<perlguts/"Exception Handling">.

		dXCPT;

=for hackers
Found in file XSUB.h

=item XCPT_CATCH
X<XCPT_CATCH>

Introduces a catch block.  See L<perlguts/"Exception Handling">.

=for hackers
Found in file XSUB.h

=item XCPT_RETHROW
X<XCPT_RETHROW>

Rethrows a previously caught exception.  See L<perlguts/"Exception Handling">.

		XCPT_RETHROW;

=for hackers
Found in file XSUB.h

=item XCPT_TRY_END
X<XCPT_TRY_END>

Ends a try block.  See L<perlguts/"Exception Handling">.

=for hackers
Found in file XSUB.h

=item XCPT_TRY_START
X<XCPT_TRY_START>

Starts a try block.  See L<perlguts/"Exception Handling">.
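
The control flow these macros provide (try, catch, clean up, rethrow) can be
pictured with a stand-alone model built on C<setjmp>/C<longjmp>.  This is a
sketch under assumptions: the names C<model_try> and C<model_throw> are
hypothetical, and the code does not reproduce the macros' real expansion; see
L<perlguts/"Exception Handling"> for the actual usage.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf *top_env = NULL;   /* innermost active try block, if any */

/* Model of an exception being thrown: unwind to the innermost try block.
 * Calling this with no try block active trips the assertion. */
void model_throw(void)
{
    assert(top_env != NULL);
    longjmp(*top_env, 1);
}

/* Runs fn() inside a try block.  If fn() throws, the catch branch runs;
 * with `rethrow` set, the exception is then passed on to the enclosing
 * try block.  Returns 1 if an exception was caught here, 0 otherwise. */
int model_try(void (*fn)(void), int rethrow)
{
    jmp_buf env;
    jmp_buf *saved = top_env;     /* enclosing try block */
    int caught;

    if (setjmp(env) == 0) {
        top_env = &env;           /* try block starts */
        fn();
        caught = 0;
    } else {
        caught = 1;               /* catch block: cleanup would go here */
    }
    top_env = saved;              /* try block ends */
    if (caught && rethrow)
        model_throw();
    return caught;
}

/* Demo callbacks used to exercise the model. */
void demo_quiet(void) { }
void demo_croaks(void) { model_throw(); }
void demo_cleanup_then_rethrow(void) { (void)model_try(demo_croaks, 1); }
```

In the model, C<demo_cleanup_then_rethrow> catches the inner exception,
restores the enclosing handler, and then rethrows, so an outer C<model_try>
still observes the failure.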

=for hackers
Found in file XSUB.h


=back

=head1 Functions in file scope.c


=over 8

=item save_gp
X<save_gp>

Saves the current GP of C<gv> on the save stack to be restored on scope exit.

If C<empty> is true, replace the GP with a new GP.

If C<empty> is false, mark C<gv> with C<GVf_INTRO> so the next reference
assigned is localized, which is how C< local *foo = $someref; > works.

	void	save_gp(GV* gv, I32 empty)

=for hackers
Found in file scope.c


=back

=head1 Functions in file vutil.c


=over 8

=item new_version
X<new_version>

Returns a new version object based on the passed in SV:

    SV *sv = new_version(SV *ver);

Does not alter the passed-in C<ver> SV.  See C<L</upg_version>> if you
want to upgrade the SV.

	SV*	new_version(SV *ver)

=for hackers
Found in file vutil.c

=item prescan_version
X<prescan_version>

Validates that a given string can be parsed as a version object, but doesn't
actually perform the parsing.  Can use either strict or lax validation rules.
Can optionally set a number of hint variables to save the parsing code
some time when tokenizing.

	const char* prescan_version(const char *s, bool strict,
	                            const char** errstr,
	                            bool *sqv,
	                            int *ssaw_decimal,
	                            int *swidth, bool *salpha)

=for hackers
Found in file vutil.c

=item scan_version
X<scan_version>

Returns a pointer to the next character after the parsed
version string, as well as upgrading the passed in SV to
an RV.

Function must be called with an already existing SV like

    sv = newSV(0);
    s = scan_version(s, SV *sv, bool qv);

Performs some preprocessing of the string to ensure that
it has the correct characteristics of a version.  Flags the
object if it contains an underscore (which denotes that this
is an alpha version).  The boolean C<qv> denotes that the version
should be interpreted as if it had multiple decimals, even if
it doesn't.

	const char* scan_version(const char *s, SV *rv, bool qv)

=for hackers
Found in file vutil.c

=item upg_version
X<upg_version>

In-place upgrade of the supplied SV to a version object.

    SV *sv = upg_version(SV *sv, bool qv);

Returns a pointer to the upgraded SV.  Set the boolean qv if you want
to force this SV to be interpreted as an "extended" version.

	SV*	upg_version(SV *ver, bool qv)

=for hackers
Found in file vutil.c

=item vcmp
X<vcmp>

Version object aware cmp.  Both operands must already have been 
converted into version objects.

	int	vcmp(SV *lhv, SV *rhv)

=for hackers
Found in file vutil.c

=item vnormal
X<vnormal>

Accepts a version object and returns the normalized string
representation.  Call like:

    sv = vnormal(rv);

NOTE: you can pass either the object directly or the SV
contained within the RV.

The SV returned has a refcount of 1.

	SV*	vnormal(SV *vs)

=for hackers
Found in file vutil.c

=item vnumify
X<vnumify>

Accepts a version object and returns the normalized floating
point representation.  Call like:

    sv = vnumify(rv);

NOTE: you can pass either the object directly or the SV
contained within the RV.

The SV returned has a refcount of 1.

	SV*	vnumify(SV *vs)

=for hackers
Found in file vutil.c

=item vstringify
X<vstringify>

In order to maintain maximum compatibility with earlier versions
of Perl, this function will return either the floating point
notation or the multiple dotted notation, depending on whether
the original version contained one dot or more than one dot, respectively.

The SV returned has a refcount of 1.

	SV*	vstringify(SV *vs)

=for hackers
Found in file vutil.c

=item vverify
X<vverify>

Validates that the SV contains valid internal structure for a version object.
It may be passed either the version object (RV) or the hash itself (HV).  If
the structure is valid, it returns the HV.  If the structure is invalid,
it returns NULL.

    SV *hv = vverify(sv);

Note that it only confirms the bare minimum structure (so as not to get
confused by derived classes which may contain additional hash entries):

=over 4

=item * The SV is an HV or a reference to an HV

=item * The hash contains a "version" key

=item * The "version" key has a reference to an AV as its value

=back

	SV*	vverify(SV *vs)

=for hackers
Found in file vutil.c


=back

=head1 "Gimme" Values

=over 8

=item G_ARRAY
X<G_ARRAY>

Used to indicate list context.  See C<L</GIMME_V>>, C<L</GIMME>> and
L<perlcall>.

=for hackers
Found in file cop.h

=item G_DISCARD
X<G_DISCARD>

Indicates that arguments returned from a callback should be discarded.  See
L<perlcall>.

=for hackers
Found in file cop.h

=item G_EVAL
X<G_EVAL>

Used to force a Perl C<eval> wrapper around a callback.  See
L<perlcall>.

=for hackers
Found in file cop.h

=item GIMME
X<GIMME>

A backward-compatible version of C<GIMME_V> which can only return
C<G_SCALAR> or C<G_ARRAY>; in a void context, it returns C<G_SCALAR>.
Deprecated.  Use C<GIMME_V> instead.

	U32	GIMME

=for hackers
Found in file op.h

=item GIMME_V
X<GIMME_V>

The XSUB-writer's equivalent to Perl's C<wantarray>.  Returns C<G_VOID>,
C<G_SCALAR> or C<G_ARRAY> for void, scalar or list context,
respectively.  See L<perlcall> for a usage example.

	U32	GIMME_V

=for hackers
Found in file op.h

=item G_NOARGS
X<G_NOARGS>

Indicates that no arguments are being sent to a callback.  See
L<perlcall>.

=for hackers
Found in file cop.h

=item G_SCALAR
X<G_SCALAR>

Used to indicate scalar context.  See C<L</GIMME_V>>, C<L</GIMME>>, and
L<perlcall>.

=for hackers
Found in file cop.h

=item G_VOID
X<G_VOID>

Used to indicate void context.  See C<L</GIMME_V>> and L<perlcall>.

=for hackers
Found in file cop.h


=back

=head1 Global Variables

These variables are global to an entire process.  They are shared between
all interpreters and all threads in a process.  Any variables not documented
here may be changed or removed without notice, so don't use them!
If you feel you really do need to use an unlisted variable, first send email to
L<perl5-porters@perl.org|mailto:perl5-porters@perl.org>.  It may be that
someone there will point out a way to accomplish what you need without using an
internal variable.  But if not, you should get a go-ahead to document and then
use the variable.


=over 8

=item PL_check
X<PL_check>

Array, indexed by opcode, of functions that will be called for the "check"
phase of optree building during compilation of Perl code.  For most (but
not all) types of op, once the op has been initially built and populated
with child ops it will be filtered through the check function referenced
by the appropriate element of this array.  The new op is passed in as the
sole argument to the check function, and the check function returns the
completed op.  The check function may (as the name suggests) check the op
for validity and signal errors.  It may also initialise or modify parts of
the ops, or perform more radical surgery such as adding or removing child
ops, or even throw the op away and return a different op in its place.

This array of function pointers is a convenient place to hook into the
compilation process.  An XS module can put its own custom check function
in place of any of the standard ones, to influence the compilation of a
particular type of op.  However, a custom check function must never fully
replace a standard check function (or even a custom check function from
another module).  A module modifying checking must instead B<wrap> the
preexisting check function.  A custom check function must be selective
about when to apply its custom behaviour.  In the usual case where
it decides not to do anything special with an op, it must chain to the
preexisting check function.  Check functions are thus linked in a chain,
with the core's base checker at the end.

For thread safety, modules should not write directly to this array.
Instead, use the function L</wrap_op_checker>.
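
The wrapping discipline described above can be sketched with a stand-alone
model.  Everything here is hypothetical (C<model_op>, C<model_check>,
C<model_wrap_checker> and the opcode names are invented for the illustration)
and it does not use the real C<PL_check> array or C<wrap_op_checker>; it only
shows a custom checker that saves its predecessor and always chains to it.

```c
#include <assert.h>
#include <stddef.h>

enum { OP_MODEL_CONST, OP_MODEL_ADD, OP_MODEL_MAX };

typedef struct model_op { int type; int flags; } model_op;
typedef model_op *(*model_check_t)(model_op *);

/* The "core" checker at the end of every chain: returns the op as is. */
static model_op *base_check(model_op *o) { return o; }

/* Array of check functions, indexed by opcode. */
model_check_t model_check[OP_MODEL_MAX] = { base_check, base_check };

/* Installs new_checker for `opcode` and stores the previous checker in
 * *old_p.  Does nothing if *old_p is already set, so repeated calls do
 * not wrap twice. */
void model_wrap_checker(int opcode, model_check_t new_checker,
                        model_check_t *old_p)
{
    assert(opcode >= 0 && opcode < OP_MODEL_MAX);
    if (*old_p != NULL)
        return;
    *old_p = model_check[opcode];
    model_check[opcode] = new_checker;
}

static model_check_t next_add_check = NULL;

/* Custom checker: acts only on ops with flag bit 1 set, then chains. */
model_op *my_add_check(model_op *o)
{
    if (o->flags & 1)
        o->flags |= 2;            /* the selective custom behaviour */
    return next_add_check(o);     /* always chain to the saved checker */
}
```

After C<model_wrap_checker(OP_MODEL_ADD, my_add_check, &next_add_check)>, ops
that the custom checker does not care about pass through unchanged, and the
checker for other opcodes is untouched.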

=for hackers
Found in file perlvars.h

=item PL_keyword_plugin
X<PL_keyword_plugin>


NOTE: this function is experimental and may change or be
removed without notice.


Function pointer, pointing at a function used to handle extended keywords.
The function should be declared as

	int keyword_plugin_function(pTHX_
		char *keyword_ptr, STRLEN keyword_len,
		OP **op_ptr)

The function is called from the tokeniser, whenever a possible keyword
is seen.  C<keyword_ptr> points at the word in the parser's input
buffer, and C<keyword_len> gives its length; it is not null-terminated.
The function is expected to examine the word, and possibly other state
such as L<%^H|perlvar/%^H>, to decide whether it wants to handle it
as an extended keyword.  If it does not, the function should return
C<KEYWORD_PLUGIN_DECLINE>, and the normal parser process will continue.

If the function wants to handle the keyword, it first must
parse anything following the keyword that is part of the syntax
introduced by the keyword.  See L</Lexer interface> for details.

When a keyword is being handled, the plugin function must build
a tree of C<OP> structures, representing the code that was parsed.
The root of the tree must be stored in C<*op_ptr>.  The function then
returns a constant indicating the syntactic role of the construct that
it has parsed: C<KEYWORD_PLUGIN_STMT> if it is a complete statement, or
C<KEYWORD_PLUGIN_EXPR> if it is an expression.  Note that a statement
construct cannot be used inside an expression (except via C<do BLOCK>
and similar), and an expression is not a complete statement (it requires
at least a terminating semicolon).

When a keyword is handled, the plugin function may also have
(compile-time) side effects.  It may modify C<%^H>, define functions, and
so on.  Typically, if side effects are the main purpose of a handler,
it does not wish to generate any ops to be included in the normal
compilation.  In this case it is still required to supply an op tree,
but it suffices to generate a single null op.

That's how the C<*PL_keyword_plugin> function needs to behave overall.
Conventionally, however, one does not completely replace the existing
handler function.  Instead, take a copy of C<PL_keyword_plugin> before
assigning your own function pointer to it.  Your handler function should
look for keywords that it is interested in and handle those.  Where it
is not interested, it should call the saved plugin function, passing on
the arguments it received.  Thus C<PL_keyword_plugin> actually points
at a chain of handler functions, all of which have an opportunity to
handle keywords, and only the last function in the chain (built into
the Perl core) will normally return C<KEYWORD_PLUGIN_DECLINE>.
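
As a hedged sketch of this chaining convention, the fragment below installs
a handler for a hypothetical keyword C<myfoo> that has only compile-time
side effects.  The names C<my_keyword_plugin>, C<next_keyword_plugin> and
C<install_my_keyword> are assumptions made for illustration; a real handler
would normally also consult C<%^H> so that the keyword is active only in
lexical scopes that asked for it.

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    /* Saved copy of the previous handler: the next link in the chain. */
    static Perl_keyword_plugin_t next_keyword_plugin;

    static int
    my_keyword_plugin(pTHX_ char *keyword_ptr, STRLEN keyword_len,
                      OP **op_ptr)
    {
        if (keyword_len == 5 && memEQ(keyword_ptr, "myfoo", 5)) {
            /* Nothing further to parse; supply a single null op. */
            *op_ptr = newOP(OP_NULL, 0);
            return KEYWORD_PLUGIN_STMT;
        }
        /* Not ours: pass the arguments on unchanged. */
        return next_keyword_plugin(aTHX_ keyword_ptr, keyword_len, op_ptr);
    }

    /* Call from the BOOT: section of an XS module. */
    static void
    install_my_keyword(pTHX)
    {
        if (!next_keyword_plugin) {     /* guard against a second call */
            next_keyword_plugin = PL_keyword_plugin;
            PL_keyword_plugin = my_keyword_plugin;
        }
    }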

=for hackers
Found in file perlvars.h


=back

=head1 GV Functions

A GV is a structure which corresponds to a Perl typeglob, i.e. *foo.
It is a structure that holds a pointer to a scalar, an array, a hash etc,
corresponding to $foo, @foo, %foo.

GVs are usually found as values in stashes (symbol table hashes) where
Perl stores its global variables.


=over 8

=item GvAV
X<GvAV>

Return the AV from the GV.

	AV*	GvAV(GV* gv)

=for hackers
Found in file gv.h

=item gv_const_sv
X<gv_const_sv>

If C<gv> is a typeglob whose subroutine entry is a constant sub eligible for
inlining, or C<gv> is a placeholder reference that would be promoted to such
a typeglob, then returns the value returned by the sub.  Otherwise, returns
C<NULL>.

	SV*	gv_const_sv(GV* gv)

=for hackers
Found in file gv.c

=item GvCV
X<GvCV>

Return the CV from the GV.

	CV*	GvCV(GV* gv)

=for hackers
Found in file gv.h

=item gv_fetchmeth
X<gv_fetchmeth>

Like L</gv_fetchmeth_pvn>, but lacks a flags parameter.

	GV*	gv_fetchmeth(HV* stash, const char* name,
		             STRLEN len, I32 level)

=for hackers
Found in file gv.c

=item gv_fetchmethod_autoload
X<gv_fetchmethod_autoload>

Returns the glob which contains the subroutine to call to invoke the method
on the C<stash>.  In fact in the presence of autoloading this may be the
glob for "AUTOLOAD".  In this case the corresponding variable C<$AUTOLOAD> is
already set up.

The third parameter of C<gv_fetchmethod_autoload> determines whether
AUTOLOAD lookup is performed if the given method is not present: non-zero
means yes, look for AUTOLOAD; zero means no, don't look for AUTOLOAD.
Calling C<gv_fetchmethod> is equivalent to calling C<gv_fetchmethod_autoload>
with a non-zero C<autoload> parameter.

These functions recognise a C<"SUPER"> token as a prefix of the method
name.  Note
that if you want to keep the returned glob for a long time, you need to
check for it being "AUTOLOAD", since at the later time the call may load a
different subroutine due to C<$AUTOLOAD> changing its value.  Use the glob
created as a side effect to do this.

These functions have the same side-effects as C<gv_fetchmeth> with
C<level==0>.  The warning against passing the GV returned by
C<gv_fetchmeth> to C<call_sv> applies equally to these functions.

	GV*	gv_fetchmethod_autoload(HV* stash,
		                        const char* name,
		                        I32 autoload)

=for hackers
Found in file gv.c

=item gv_fetchmeth_autoload
X<gv_fetchmeth_autoload>

This is the old form of L</gv_fetchmeth_pvn_autoload>, which has no flags
parameter.

	GV*	gv_fetchmeth_autoload(HV* stash,
		                      const char* name,
		                      STRLEN len, I32 level)

=for hackers
Found in file gv.c

=item gv_fetchmeth_pv
X<gv_fetchmeth_pv>

Exactly like L</gv_fetchmeth_pvn>, but takes a nul-terminated string 
instead of a string/length pair.

	GV*	gv_fetchmeth_pv(HV* stash, const char* name,
		                I32 level, U32 flags)

=for hackers
Found in file gv.c

=item gv_fetchmeth_pvn
X<gv_fetchmeth_pvn>

Returns the glob with the given C<name> and a defined subroutine or
C<NULL>.  The glob lives in the given C<stash>, or in the stashes
accessible via C<@ISA> and C<UNIVERSAL::>.

The argument C<level> should be either 0 or -1.  If C<level==0>, as a
side-effect creates a glob with the given C<name> in the given C<stash>
which in the case of success contains an alias for the subroutine, and sets
up caching info for this glob.

The only significant values for C<flags> are C<GV_SUPER> and C<SVf_UTF8>.

C<GV_SUPER> indicates that we want to look up the method in the superclasses
of the C<stash>.

The GV returned from C<gv_fetchmeth> may be a method cache entry, which is not
visible to Perl code.  So when calling C<call_sv>, you should not use
the GV directly; instead, you should use the method's CV, which can be
obtained from the GV with the C<GvCV> macro.
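
The following is a minimal sketch of that advice, not a definitive
implementation.  The helper name C<call_if_can> is hypothetical; it looks up
a method on a blessed reference and, if one is found, calls it with the
object as its only argument, passing the CV rather than the GV to
C<call_sv>.

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    static bool
    call_if_can(pTHX_ SV *obj, const char *name)
    {
        GV *gv;

        if (!sv_isobject(obj))
            return FALSE;
        gv = gv_fetchmeth_pvn(SvSTASH(SvRV(obj)), name, strlen(name),
                              0, 0);
        if (!gv || !GvCV(gv))
            return FALSE;
        {
            dSP;
            ENTER;
            SAVETMPS;
            PUSHMARK(SP);
            XPUSHs(obj);
            PUTBACK;
            /* Pass the CV, not the (possibly cache-only) GV. */
            call_sv(MUTABLE_SV(GvCV(gv)), G_VOID | G_DISCARD);
            FREETMPS;
            LEAVE;
        }
        return TRUE;
    }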

	GV*	gv_fetchmeth_pvn(HV* stash, const char* name,
		                 STRLEN len, I32 level,
		                 U32 flags)

=for hackers
Found in file gv.c

=item gv_fetchmeth_pvn_autoload
X<gv_fetchmeth_pvn_autoload>

Same as C<gv_fetchmeth_pvn()>, but looks for autoloaded subroutines too.
Returns a glob for the subroutine.

For an autoloaded subroutine without a GV, will create a GV even
if C<level < 0>.  For an autoloaded subroutine without a stub, C<GvCV()>
of the result may be zero.

Currently, the only significant value for C<flags> is C<SVf_UTF8>.

	GV*	gv_fetchmeth_pvn_autoload(HV* stash,
		                          const char* name,
		                          STRLEN len, I32 level,
		                          U32 flags)

=for hackers
Found in file gv.c

=item gv_fetchmeth_pv_autoload
X<gv_fetchmeth_pv_autoload>

Exactly like L</gv_fetchmeth_pvn_autoload>, but takes a nul-terminated string
instead of a string/length pair.

	GV*	gv_fetchmeth_pv_autoload(HV* stash,
		                         const char* name,
		                         I32 level, U32 flags)

=for hackers
Found in file gv.c

=item gv_fetchmeth_sv
X<gv_fetchmeth_sv>

Exactly like L</gv_fetchmeth_pvn>, but takes the name string in the form
of an SV instead of a string/length pair.

	GV*	gv_fetchmeth_sv(HV* stash, SV* namesv,
		                I32 level, U32 flags)

=for hackers
Found in file gv.c

=item gv_fetchmeth_sv_autoload
X<gv_fetchmeth_sv_autoload>

Exactly like L</gv_fetchmeth_pvn_autoload>, but takes the name string in the form
of an SV instead of a string/length pair.

	GV*	gv_fetchmeth_sv_autoload(HV* stash, SV* namesv,
		                         I32 level, U32 flags)

=for hackers
Found in file gv.c

=item GvHV
X<GvHV>

Return the HV from the GV.

	HV*	GvHV(GV* gv)

=for hackers
Found in file gv.h

=item gv_init
X<gv_init>

The old form of C<gv_init_pvn()>.  It does not work with UTF-8 strings, as it
has no flags parameter.  If the C<multi> parameter is set, the
C<GV_ADDMULTI> flag will be passed to C<gv_init_pvn()>.

	void	gv_init(GV* gv, HV* stash, const char* name,
		        STRLEN len, int multi)

=for hackers
Found in file gv.c

=item gv_init_pv
X<gv_init_pv>

Same as C<gv_init_pvn()>, but takes a nul-terminated string for the name
instead of separate char * and length parameters.

	void	gv_init_pv(GV* gv, HV* stash, const char* name,
		           U32 flags)

=for hackers
Found in file gv.c

=item gv_init_pvn
X<gv_init_pvn>

Converts a scalar into a typeglob.  This is an incoercible typeglob;
assigning a reference to it will assign to one of its slots, instead of
overwriting it as happens with typeglobs created by C<SvSetSV>.  Converting
any scalar that is C<SvOK()> may produce unpredictable results and is reserved
for perl's internal use.

C<gv> is the scalar to be converted.

C<stash> is the parent stash/package, if any.

C<name> and C<len> give the name.  The name must be unqualified;
that is, it must not include the package name.  If C<gv> is a
stash element, it is the caller's responsibility to ensure that the name
passed to this function matches the name of the element.  If it does not
match, perl's internal bookkeeping will get out of sync.

C<flags> can be set to C<SVf_UTF8> if C<name> is a UTF-8 string, or
to the return value of C<SvUTF8(sv)>.  It can also take the
C<GV_ADDMULTI> flag, which means to pretend that the GV has been
seen before (i.e., suppress "Used once" warnings).

	void	gv_init_pvn(GV* gv, HV* stash, const char* name,
		            STRLEN len, U32 flags)

=for hackers
Found in file gv.c

=item gv_init_sv
X<gv_init_sv>

Same as C<gv_init_pvn()>, but takes an SV * for the name instead of separate
char * and length parameters.  C<flags> is currently unused.

	void	gv_init_sv(GV* gv, HV* stash, SV* namesv,
		           U32 flags)

=for hackers
Found in file gv.c

=item gv_stashpv
X<gv_stashpv>

Returns a pointer to the stash for a specified package.  Uses C<strlen> to
determine the length of C<name>, then calls C<gv_stashpvn()>.

	HV*	gv_stashpv(const char* name, I32 flags)

=for hackers
Found in file gv.c

=item gv_stashpvn
X<gv_stashpvn>

Returns a pointer to the stash for a specified package.  The C<namelen>
parameter indicates the length of the C<name>, in bytes.  C<flags> is passed
to C<gv_fetchpvn_flags()>, so if set to C<GV_ADD> then the package will be
created if it does not already exist.  If the package does not exist and
C<flags> is 0 (or any other setting that does not create packages) then C<NULL>
is returned.

Flags may be one of:

    GV_ADD
    SVf_UTF8
    GV_NOADD_NOINIT
    GV_NOINIT
    GV_NOEXPAND
    GV_ADDMG

The most important of these are probably C<GV_ADD> and C<SVf_UTF8>.

Note, use of C<gv_stashsv> instead of C<gv_stashpvn> where possible is strongly
recommended for performance reasons.
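
As an illustration of the two most common C<flags> settings, here is a
small sketch; the helper names C<new_object> and C<package_exists> are
assumptions made for this example and are not part of the API.

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    static SV *
    new_object(pTHX_ const char *pkg)
    {
        /* GV_ADD creates the package if it does not exist yet. */
        HV *stash = gv_stashpvn(pkg, (U32)strlen(pkg), GV_ADD);
        SV *ref   = newRV_noinc(MUTABLE_SV(newHV()));

        return sv_bless(ref, stash);
    }

    static bool
    package_exists(pTHX_ const char *pkg)
    {
        /* With flags of 0 a missing package yields NULL. */
        return gv_stashpvn(pkg, (U32)strlen(pkg), 0) != NULL;
    }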

	HV*	gv_stashpvn(const char* name, U32 namelen,
		            I32 flags)

=for hackers
Found in file gv.c

=item gv_stashpvs
X<gv_stashpvs>

Like C<gv_stashpvn>, but takes a C<NUL>-terminated literal string instead of a
string/length pair.

	HV*	gv_stashpvs(const char* name, I32 create)

=for hackers
Found in file handy.h

=item gv_stashsv
X<gv_stashsv>

Returns a pointer to the stash for a specified package.  See
C<L</gv_stashpvn>>.

Note this interface is strongly preferred over C<gv_stashpvn> for performance
reasons.

	HV*	gv_stashsv(SV* sv, I32 flags)

=for hackers
Found in file gv.c

=item GvSV
X<GvSV>

Return the SV from the GV.

	SV*	GvSV(GV* gv)

=for hackers
Found in file gv.h

=item setdefout
X<setdefout>

Sets C<PL_defoutgv>, the default file handle for output, to the passed in
typeglob.  As C<PL_defoutgv> "owns" a reference on its typeglob, the reference
count of the passed in typeglob is increased by one, and the reference count
of the typeglob that C<PL_defoutgv> points to is decreased by one.

	void	setdefout(GV* gv)

=for hackers
Found in file pp_sys.c


=back

=head1 Handy Values

=over 8

=item Nullav
X<Nullav>

Null AV pointer.

(deprecated - use C<(AV *)NULL> instead)

=for hackers
Found in file av.h

=item Nullch
X<Nullch>

Null character pointer.  (No longer available when C<PERL_CORE> is
defined.)

=for hackers
Found in file handy.h

=item Nullcv
X<Nullcv>

Null CV pointer.

(deprecated - use C<(CV *)NULL> instead)

=for hackers
Found in file cv.h

=item Nullhv
X<Nullhv>

Null HV pointer.

(deprecated - use C<(HV *)NULL> instead)

=for hackers
Found in file hv.h

=item Nullsv
X<Nullsv>

Null SV pointer.  (No longer available when C<PERL_CORE> is defined.)

=for hackers
Found in file handy.h


=back

=head1 Hash Manipulation Functions

A HV structure represents a Perl hash.  It consists mainly of an array
of pointers, each of which points to a linked list of HE structures.  The
array is indexed by the hash function of the key, so each linked list
represents all the hash entries with the same hash value.  Each HE contains
a pointer to the actual value, plus a pointer to a HEK structure which
holds the key and hash value.


=over 8

=item cop_fetch_label
X<cop_fetch_label>


NOTE: this function is experimental and may change or be
removed without notice.


Returns the label attached to a cop, and stores its length in bytes into
C<*len>.  On return, C<*flags> is set to either C<SVf_UTF8> or 0.

	const char * cop_fetch_label(COP *const cop,
	                             STRLEN *len, U32 *flags)

=for hackers
Found in file hv.c

=item cop_store_label
X<cop_store_label>


NOTE: this function is experimental and may change or be
removed without notice.


Saves a label into a C<cop_hints_hash>.  You need to set C<flags> to
C<SVf_UTF8> for a UTF-8 label.

	void	cop_store_label(COP *const cop,
		                const char *label, STRLEN len,
		                U32 flags)

=for hackers
Found in file hv.c

=item get_hv
X<get_hv>

Returns the HV of the specified Perl hash.  C<flags> are passed to
C<gv_fetchpv>.  If C<GV_ADD> is set and the
Perl variable does not exist then it will be created.  If C<flags> is zero
and the variable does not exist then C<NULL> is returned.
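
A short sketch of both modes follows.  The variable name
C<%My::Module::config> and the helper names are hypothetical and chosen
only for illustration.

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    static HV *
    get_config(pTHX)
    {
        /* GV_ADD: the hash is created on first use. */
        return get_hv("My::Module::config", GV_ADD);
    }

    static bool
    have_config(pTHX)
    {
        /* flags == 0: NULL is returned if the hash does not exist. */
        return get_hv("My::Module::config", 0) != NULL;
    }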

NOTE: the perl_ form of this function is deprecated.

	HV*	get_hv(const char *name, I32 flags)

=for hackers
Found in file perl.c

=item HEf_SVKEY
X<HEf_SVKEY>

This flag, used in the length slot of hash entries and magic structures,
specifies that the structure contains an C<SV*> pointer where a C<char*>
pointer would otherwise be expected.  (For information only--not to be used).

=for hackers
Found in file hv.h

=item HeHASH
X<HeHASH>

Returns the computed hash stored in the hash entry.

	U32	HeHASH(HE* he)

=for hackers
Found in file hv.h

=item HeKEY
X<HeKEY>

Returns the actual pointer stored in the key slot of the hash entry.  The
pointer may be either C<char*> or C<SV*>, depending on the value of
C<HeKLEN()>.  Can be assigned to.  The C<HePV()> or C<HeSVKEY()> macros are
usually preferable for finding the value of a key.

	void*	HeKEY(HE* he)

=for hackers
Found in file hv.h

=item HeKLEN
X<HeKLEN>

If this is negative, and amounts to C<HEf_SVKEY>, it indicates the entry
holds an C<SV*> key.  Otherwise, holds the actual length of the key.  Can
be assigned to.  The C<HePV()> macro is usually preferable for finding key
lengths.

	STRLEN	HeKLEN(HE* he)

=for hackers
Found in file hv.h

=item HePV
X<HePV>

Returns the key slot of the hash entry as a C<char*> value, doing any
necessary dereferencing of possibly C<SV*> keys.  The length of the string
is placed in C<len> (this is a macro, so do I<not> use C<&len>).  If you do
not care about what the length of the key is, you may use the global
variable C<PL_na>, though this is rather less efficient than using a local
variable.  Remember though, that hash keys in perl are free to contain
embedded nulls, so using C<strlen()> or similar is not a good way to find
the length of hash keys.  This is very similar to the C<SvPV()> macro
described elsewhere in this document.  See also C<L</HeUTF8>>.

If you are using C<HePV> to get values to pass to C<newSVpvn()> to create a
new SV, you should consider using C<newSVhek(HeKEY_hek(he))> as it is more
efficient.

	char*	HePV(HE* he, STRLEN len)

=for hackers
Found in file hv.h

=item HeSVKEY
X<HeSVKEY>

Returns the key as an C<SV*>, or C<NULL> if the hash entry does not
contain an C<SV*> key.

	SV*	HeSVKEY(HE* he)

=for hackers
Found in file hv.h

=item HeSVKEY_force
X<HeSVKEY_force>

Returns the key as an C<SV*>.  Will create and return a temporary mortal
C<SV*> if the hash entry contains only a C<char*> key.

	SV*	HeSVKEY_force(HE* he)

=for hackers
Found in file hv.h

=item HeSVKEY_set
X<HeSVKEY_set>

Sets the key to a given C<SV*>, taking care to set the appropriate flags to
indicate the presence of an C<SV*> key, and returns the same
C<SV*>.

	SV*	HeSVKEY_set(HE* he, SV* sv)

=for hackers
Found in file hv.h

=item HeUTF8
X<HeUTF8>

Returns whether the C<char *> value returned by C<HePV> is encoded in UTF-8,
doing any necessary dereferencing of possibly C<SV*> keys.  The value returned
will be 0 or non-0, not necessarily 1 (or even a value with any low bits set),
so B<do not> blindly assign this to a C<bool> variable, as C<bool> may be a
typedef for C<char>.

	U32	HeUTF8(HE* he)

=for hackers
Found in file hv.h

=item HeVAL
X<HeVAL>

Returns the value slot (type C<SV*>) stored in the hash entry.  Can be
assigned to.

  SV *foo= HeVAL(he);
  HeVAL(he)= sv;


	SV*	HeVAL(HE* he)

=for hackers
Found in file hv.h

=item hv_assert
X<hv_assert>

Check that a hash is in an internally consistent state.

	void	hv_assert(HV *hv)

=for hackers
Found in file hv.c

=item hv_bucket_ratio
X<hv_bucket_ratio>


NOTE: this function is experimental and may change or be
removed without notice.


If the hash is tied, dispatches through to the SCALAR tied method;
otherwise, if the hash contains no keys, returns 0; otherwise returns
a mortal SV containing a string specifying the number of used buckets,
followed by a slash, followed by the number of available buckets.

This function is expensive: it must scan all of the buckets
to determine which are used, and the count is NOT cached.
In a large hash this could be a lot of buckets.

	SV*	hv_bucket_ratio(HV *hv)

=for hackers
Found in file hv.c

=item hv_clear
X<hv_clear>

Frees all the elements of a hash, leaving it empty.
The XS equivalent of C<%hash = ()>.  See also L</hv_undef>.

See L</av_clear> for a note about the hash possibly being invalid on
return.

	void	hv_clear(HV *hv)

=for hackers
Found in file hv.c

=item hv_clear_placeholders
X<hv_clear_placeholders>

Clears any placeholders from a hash.  If a restricted hash has any of its keys
marked as readonly and the key is subsequently deleted, the key is not actually
deleted but is marked by assigning it a value of C<&PL_sv_placeholder>.  This tags
it so it will be ignored by future operations such as iterating over the hash,
but will still allow the hash to have a value reassigned to the key at some
future point.  This function clears any such placeholder keys from the hash.
See C<L<Hash::Util::lock_keys()|Hash::Util/lock_keys>> for an example of its
use.

	void	hv_clear_placeholders(HV *hv)

=for hackers
Found in file hv.c

=item hv_copy_hints_hv
X<hv_copy_hints_hv>

A specialised version of L</newHVhv> for copying C<%^H>.  C<ohv> must be
a pointer to a hash (which may have C<%^H> magic, but should be generally
non-magical), or C<NULL> (interpreted as an empty hash).  The content
of C<ohv> is copied to a new hash, which has the C<%^H>-specific magic
added to it.  A pointer to the new hash is returned.

	HV *	hv_copy_hints_hv(HV *ohv)

=for hackers
Found in file hv.c

=item hv_delete
X<hv_delete>

Deletes a key/value pair in the hash.  The value's SV is removed from
the hash, made mortal, and returned to the caller.  The absolute
value of C<klen> is the length of the key.  If C<klen> is negative the
key is assumed to be in UTF-8-encoded Unicode.  The C<flags> value
will normally be zero; if set to C<G_DISCARD> then C<NULL> will be returned.
C<NULL> will also be returned if the key is not found.
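
The sketch below contrasts the two modes.  The helper names C<take_value>
and C<drop_value> are hypothetical.  Because the returned SV is mortal, a
caller that wants to keep it beyond the current statement must take its own
reference.

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    static SV *
    take_value(pTHX_ HV *hv, const char *key)
    {
        /* The returned SV is mortal; take a reference to keep it. */
        SV *sv = hv_delete(hv, key, (I32)strlen(key), 0);

        return sv ? SvREFCNT_inc_simple_NN(sv) : NULL;
    }

    static void
    drop_value(pTHX_ HV *hv, const char *key)
    {
        /* G_DISCARD: the value is not returned to the caller. */
        (void)hv_delete(hv, key, (I32)strlen(key), G_DISCARD);
    }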

	SV*	hv_delete(HV *hv, const char *key, I32 klen,
		          I32 flags)

=for hackers
Found in file hv.c

=item hv_delete_ent
X<hv_delete_ent>

Deletes a key/value pair in the hash.  The value SV is removed from the hash,
made mortal, and returned to the caller.  The C<flags> value will normally be
zero; if set to C<G_DISCARD> then C<NULL> will be returned.  C<NULL> will also
be returned if the key is not found.  C<hash> can be a valid precomputed hash
value, or 0 to ask for it to be computed.

	SV*	hv_delete_ent(HV *hv, SV *keysv, I32 flags,
		              U32 hash)

=for hackers
Found in file hv.c

=item HvENAME
X<HvENAME>

Returns the effective name of a stash, or C<NULL> if there is none.  The
effective name represents a location in the symbol table where this stash
resides.  It is updated automatically when packages are aliased or deleted.
A stash that is no longer in the symbol table has no effective name.  This
name is preferable to C<HvNAME> for use in MRO linearisations and isa
caches.

	char*	HvENAME(HV* stash)

=for hackers
Found in file hv.h

=item HvENAMELEN
X<HvENAMELEN>

Returns the length of the stash's effective name.

	STRLEN	HvENAMELEN(HV *stash)

=for hackers
Found in file hv.h

=item HvENAMEUTF8
X<HvENAMEUTF8>

Returns true if the effective name is in UTF-8 encoding.

	unsigned char HvENAMEUTF8(HV *stash)

=for hackers
Found in file hv.h

=item hv_exists
X<hv_exists>

Returns a boolean indicating whether the specified hash key exists.  The
absolute value of C<klen> is the length of the key.  If C<klen> is
negative the key is assumed to be in UTF-8-encoded Unicode.

	bool	hv_exists(HV *hv, const char *key, I32 klen)

=for hackers
Found in file hv.c

=item hv_exists_ent
X<hv_exists_ent>

Returns a boolean indicating whether the specified hash key exists.
C<hash> can be a valid precomputed hash value, or 0 to ask for it to be
computed.

	bool	hv_exists_ent(HV *hv, SV *keysv, U32 hash)

=for hackers
Found in file hv.c

=item hv_fetch
X<hv_fetch>

Returns the SV which corresponds to the specified key in the hash.
The absolute value of C<klen> is the length of the key.  If C<klen> is
negative the key is assumed to be in UTF-8-encoded Unicode.  If
C<lval> is set then the fetch will be part of a store.  This means that if
there is no value in the hash associated with the given key, then one is
created and a pointer to it is returned.  The C<SV*> it points to can be
assigned to.  But always check that the
return value is non-null before dereferencing it to an C<SV*>.

See L<perlguts/"Understanding the Magic of Tied Hashes and Arrays"> for more
information on how to use this function on tied hashes.
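
A minimal sketch of a read-only fetch and an lvalue fetch follows.  It
assumes an ordinary (untied) hash and a hypothetical key C<count>; the
helper names are illustrative only.

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    static IV
    get_count(pTHX_ HV *hv)
    {
        SV **svp = hv_fetch(hv, "count", 5, 0);

        /* svp is NULL when the key is absent. */
        return (svp && SvOK(*svp)) ? SvIV(*svp) : 0;
    }

    static void
    bump_count(pTHX_ HV *hv)
    {
        /* lval = 1: the element is created if it does not exist. */
        SV **svp = hv_fetch(hv, "count", 5, 1);

        if (svp)
            sv_setiv(*svp, SvOK(*svp) ? SvIV(*svp) + 1 : 1);
    }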

	SV**	hv_fetch(HV *hv, const char *key, I32 klen,
		         I32 lval)

=for hackers
Found in file hv.c

=item hv_fetchs
X<hv_fetchs>

Like C<hv_fetch>, but takes a C<NUL>-terminated literal string instead of a
string/length pair.

	SV**	hv_fetchs(HV* tb, const char* key, I32 lval)

=for hackers
Found in file handy.h

=item hv_fetch_ent
X<hv_fetch_ent>

Returns the hash entry which corresponds to the specified key in the hash.
C<hash> must be a valid precomputed hash number for the given C<key>, or 0
if you want the function to compute it.  If C<lval> is set then the fetch
will be part of a store.  Make sure the return value is non-null before
accessing it.  The return value when C<hv> is a tied hash is a pointer to a
static location, so be sure to make a copy of the structure if you need to
store it somewhere.

See L<perlguts/"Understanding the Magic of Tied Hashes and Arrays"> for more
information on how to use this function on tied hashes.

	HE*	hv_fetch_ent(HV *hv, SV *keysv, I32 lval,
		             U32 hash)

=for hackers
Found in file hv.c

=item hv_fill
X<hv_fill>

Returns the number of hash buckets that happen to be in use.

This function is wrapped by the macro C<HvFILL>.

As of perl 5.25 this function is used only for debugging
purposes, and the number of used hash buckets is not
in any way cached, thus this function can be costly
to execute as it must iterate over all the buckets in the
hash.

	STRLEN	hv_fill(HV *const hv)

=for hackers
Found in file hv.c

=item hv_iterinit
X<hv_iterinit>

Prepares a starting point to traverse a hash table.  Returns the number of
keys in the hash, including placeholders (i.e. the same as C<HvTOTALKEYS(hv)>).
The return value is currently only meaningful for hashes without tie magic.

NOTE: Before version 5.004_65, C<hv_iterinit> used to return the number of
hash buckets that happen to be in use.  If you still need that esoteric
value, you can get it through the macro C<HvFILL(hv)>.


	I32	hv_iterinit(HV *hv)

=for hackers
Found in file hv.c

=item hv_iterkey
X<hv_iterkey>

Returns the key from the current position of the hash iterator.  See
C<L</hv_iterinit>>.

	char*	hv_iterkey(HE* entry, I32* retlen)

=for hackers
Found in file hv.c

=item hv_iterkeysv
X<hv_iterkeysv>

Returns the key as an C<SV*> from the current position of the hash
iterator.  The return value will always be a mortal copy of the key.  Also
see C<L</hv_iterinit>>.

	SV*	hv_iterkeysv(HE* entry)

=for hackers
Found in file hv.c

=item hv_iternext
X<hv_iternext>

Returns entries from a hash iterator.  See C<L</hv_iterinit>>.

You may call C<hv_delete> or C<hv_delete_ent> on the hash entry that the
iterator currently points to, without losing your place or invalidating your
iterator.  Note that in this case the current entry is deleted from the hash
with your iterator holding the last reference to it.  Your iterator is flagged
to free the entry on the next call to C<hv_iternext>, so you must not discard
your iterator immediately, or else the entry will leak: call C<hv_iternext> to
trigger the resource deallocation.
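
A typical traversal can be sketched as follows.  The helper name
C<dump_hash> is hypothetical, and the output format is chosen only for
illustration.  Note that C<HePV> is given C<klen> itself, not C<&klen>.

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    static void
    dump_hash(pTHX_ HV *hv)
    {
        HE *he;

        (void)hv_iterinit(hv);
        while ((he = hv_iternext(hv)) != NULL) {
            STRLEN klen;
            const char *key = HePV(he, klen);
            SV *val = hv_iterval(hv, he);

            /* %.*s stops at an embedded NUL; fine for a debug dump. */
            PerlIO_printf(PerlIO_stderr(), "%.*s => %s\n",
                          (int)klen, key, SvPV_nolen(val));
        }
    }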

	HE*	hv_iternext(HV *hv)

=for hackers
Found in file hv.c

=item hv_iternextsv
X<hv_iternextsv>

Performs an C<hv_iternext>, C<hv_iterkey>, and C<hv_iterval> in one
operation.

	SV*	hv_iternextsv(HV *hv, char **key, I32 *retlen)

=for hackers
Found in file hv.c

=item hv_iternext_flags
X<hv_iternext_flags>


NOTE: this function is experimental and may change or be
removed without notice.


Returns entries from a hash iterator.  See C<L</hv_iterinit>> and
C<L</hv_iternext>>.
The C<flags> value will normally be zero; if C<HV_ITERNEXT_WANTPLACEHOLDERS> is
set, the placeholder keys (for restricted hashes) will be returned in addition
to normal keys.  By default placeholders are automatically skipped over.
Currently a placeholder is implemented with a value that is
C<&PL_sv_placeholder>.  Note that the implementation of placeholders and
restricted hashes may change, and the implementation currently is
insufficiently abstracted for any change to be tidy.

	HE*	hv_iternext_flags(HV *hv, I32 flags)

=for hackers
Found in file hv.c

=item hv_iterval
X<hv_iterval>

Returns the value from the current position of the hash iterator.  See
C<L</hv_iterkey>>.

	SV*	hv_iterval(HV *hv, HE *entry)

=for hackers
Found in file hv.c

=item hv_magic
X<hv_magic>

Adds magic to a hash.  See C<L</sv_magic>>.

	void	hv_magic(HV *hv, GV *gv, int how)

=for hackers
Found in file hv.c

=item HvNAME
X<HvNAME>

Returns the package name of a stash, or C<NULL> if C<stash> isn't a stash.
See C<L</SvSTASH>>, C<L</CvSTASH>>.

	char*	HvNAME(HV* stash)

=for hackers
Found in file hv.h

=item HvNAMELEN
X<HvNAMELEN>

Returns the length of the stash's name.

	STRLEN	HvNAMELEN(HV *stash)

=for hackers
Found in file hv.h

=item HvNAMEUTF8
X<HvNAMEUTF8>

Returns true if the name is in UTF-8 encoding.

	unsigned char HvNAMEUTF8(HV *stash)

=for hackers
Found in file hv.h

=item hv_scalar
X<hv_scalar>

Evaluates the hash in scalar context and returns the result.

When the hash is tied, dispatches through to the SCALAR method;
otherwise returns a mortal SV containing the number of keys
in the hash.

Note that prior to 5.25 this function returned what is now
returned by the C<hv_bucket_ratio()> function.

	SV*	hv_scalar(HV *hv)

=for hackers
Found in file hv.c

=item hv_store
X<hv_store>

Stores an SV in a hash.  The hash key is specified as C<key> and the
absolute value of C<klen> is the length of the key.  If C<klen> is
negative the key is assumed to be in UTF-8-encoded Unicode.  The
C<hash> parameter is the precomputed hash value; if it is zero then
Perl will compute it.

The return value will be
C<NULL> if the operation failed or if the value did not need to be actually
stored within the hash (as in the case of tied hashes).  Otherwise it can
be dereferenced to get the original C<SV*>.  Note that the caller is
responsible for suitably incrementing the reference count of C<val> before
the call, and decrementing it if the function returned C<NULL>.  Effectively
a successful C<hv_store> takes ownership of one reference to C<val>.  This is
usually what you want; a newly created SV has a reference count of one, so
if all your code does is create SVs then store them in a hash, C<hv_store>
will own the only reference to the new SV, and your code doesn't need to do
anything further to tidy up.  C<hv_store> is not implemented as a call to
C<hv_store_ent>, and does not create a temporary SV for the key, so if your
key data is not already in SV form then use C<hv_store> in preference to
C<hv_store_ent>.

See L<perlguts/"Understanding the Magic of Tied Hashes and Arrays"> for more
information on how to use this function on tied hashes.
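
The reference-count rules above can be sketched like this.  The helper
name C<store_examples> and the keys are hypothetical; C<existing> stands
for an SV that the caller continues to use after the call.

    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    static void
    store_examples(pTHX_ HV *hv, SV *existing)
    {
        /* New SV: its single reference passes to the hash on success. */
        SV *val = newSViv(42);

        if (!hv_store(hv, "answer", 6, val, 0))
            SvREFCNT_dec(val);

        /* SV we also keep: give the hash a reference of its own. */
        SvREFCNT_inc_simple_void_NN(existing);
        if (!hv_store(hv, "shared", 6, existing, 0))
            SvREFCNT_dec(existing);
    }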

	SV**	hv_store(HV *hv, const char *key, I32 klen,
		         SV *val, U32 hash)

=for hackers
Found in file hv.c

=item hv_stores
X<hv_stores>

Like C<hv_store>, but takes a C<NUL>-terminated literal string instead of a
string/length pair, and omits the hash parameter.

	SV**	hv_stores(HV* tb, const char* key,
		          NULLOK SV* val)

=for hackers
Found in file handy.h

=item hv_store_ent
X<hv_store_ent>

Stores C<val> in a hash.  The hash key is specified as C<key>.  The C<hash>
parameter is the precomputed hash value; if it is zero then Perl will
compute it.  The return value is the new hash entry so created.  It will be
C<NULL> if the operation failed or if the value did not need to be actually
stored within the hash (as in the case of tied hashes).  Otherwise the
contents of the return value can be accessed using the C<He?> macros
described here.  Note that the caller is responsible for suitably
incrementing the reference count of C<val> before the call, and
decrementing it if the function returned C<NULL>.  Effectively a successful
C<hv_store_ent> takes ownership of one reference to C<val>.  This is
usually what you want; a newly created SV has a reference count of one, so
if all your code does is create SVs then store them in a hash, C<hv_store_ent>
will own the only reference to the new SV, and your code doesn't need to do
anything further to tidy up.  Note that C<hv_store_ent> only reads the C<key>;
unlike C<val> it does not take ownership of it, so maintaining the correct
reference count on C<key> is entirely the caller's responsibility.  C<hv_store>
is not implemented as a call to C<hv_store_ent>, and does not create a temporary
SV for the key, so if your key data is not already in SV form then use
C<hv_store> in preference to C<hv_store_ent>.

See L<perlguts/"Understanding the Magic of Tied Hashes and Arrays"> for more
information on how to use this function on tied hashes.

	HE*	hv_store_ent(HV *hv, SV *key, SV *val, U32 hash)

=for hackers
Found in file hv.c

=item hv_undef
X<hv_undef>

Undefines the hash.  The XS equivalent of C<undef(%hash)>.

As well as freeing all the elements of the hash (like C<hv_clear()>), this
also frees any auxiliary data and storage associated with the hash.

See L</av_clear> for a note about the hash possibly being invalid on
return.

	void	hv_undef(HV *hv)

=for hackers
Found in file hv.c

=item newHV
X<newHV>

Creates a new HV.  The reference count is set to 1.

	HV*	newHV()

=for hackers
Found in file hv.h


=back

=head1 Hook manipulation

These functions provide convenient and thread-safe means of manipulating
hook variables.


=over 8

=item wrap_op_checker
X<wrap_op_checker>

Puts a C function into the chain of check functions for a specified op
type.  This is the preferred way to manipulate the L</PL_check> array.
C<opcode> specifies which type of op is to be affected.  C<new_checker>
is a pointer to the C function that is to be added to that opcode's
check chain, and C<old_checker_p> points to the storage location where a
pointer to the next function in the chain will be stored.  The value of
C<new_checker> is written into the L</PL_check> array, while the value
previously stored there is written to C<*old_checker_p>.

The function should be defined like this:

    static OP *new_checker(pTHX_ OP *op) { ... }

It is intended to be called in this manner:

    new_checker(aTHX_ op)

C<old_checker_p> should be defined like this:

    static Perl_check_t old_checker_p;

L</PL_check> is global to an entire process, and a module wishing to
hook op checking may find itself invoked more than once per process,
typically in different threads.  To handle that situation, this function
is idempotent.  The location C<*old_checker_p> must initially (once
per process) contain a null pointer.  A C variable of static duration
(declared at file scope, typically also marked C<static> to give
it internal linkage) will be implicitly initialised appropriately,
if it does not have an explicit initialiser.  This function will only
actually modify the check chain if it finds C<*old_checker_p> to be null.
This function is also thread safe on the small scale.  It uses appropriate
locking to avoid race conditions in accessing L</PL_check>.

When this function is called, the function referenced by C<new_checker>
must be ready to be called, except for C<*old_checker_p> being unfilled.
In a threading situation, C<new_checker> may be called immediately,
even before this function has returned.  C<*old_checker_p> will always
be appropriately set before C<new_checker> is called.  If C<new_checker>
decides not to do anything special with an op that it is given (which
is the usual case for most uses of op check hooking), it must chain the
check function referenced by C<*old_checker_p>.
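
The idempotence and chaining rules above can be illustrated with a
self-contained model.  This is only a sketch under stated assumptions:
C<model_op>, C<model_check_t>, C<model_check> and C<model_wrap_op_checker>
are hypothetical stand-ins for the real types, array and function, the
C<pTHX_> context argument is omitted, and no locking is performed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for perl's OP and Perl_check_t; not the real types. */
typedef struct model_op { int type; int visits; } model_op;
typedef model_op *(*model_check_t)(model_op *o);

enum { MODEL_MAXO = 4 };

static model_op *model_default_check(model_op *o) { return o; }

/* Stand-in for the process-global check array. */
static model_check_t model_check[MODEL_MAXO] = {
    model_default_check, model_default_check,
    model_default_check, model_default_check
};

/* Install new_checker once.  If *old_checker_p is already filled, do nothing.
   The real function also takes a lock; this single-threaded model does not. */
static void model_wrap_op_checker(int opcode, model_check_t new_checker,
                                  model_check_t *old_checker_p)
{
    if (*old_checker_p) return;
    *old_checker_p = model_check[opcode];
    model_check[opcode] = new_checker;
}

/* File-scope static without initialiser: starts out as a null pointer. */
static model_check_t my_next_checker;

static model_op *my_checker(model_op *o)
{
    o->visits++;                  /* the hook's own work (assumed trivial) */
    return my_next_checker(o);    /* always chain to the previous checker */
}
```

In this model, a second call with the same C<&my_next_checker> leaves the
chain untouched, so C<my_checker> never ends up chained to itself.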

If you want to influence compilation of calls to a specific subroutine,
then use L</cv_set_call_checker> rather than hooking checking of all
C<entersub> ops.

	void	wrap_op_checker(Optype opcode,
		                Perl_check_t new_checker,
		                Perl_check_t *old_checker_p)

=for hackers
Found in file op.c


=back

=head1 Lexer interface

This is the lower layer of the Perl parser, managing characters and tokens.


=over 8

=item lex_bufutf8
X<lex_bufutf8>


NOTE: this function is experimental and may change or be
removed without notice.


Indicates whether the octets in the lexer buffer
(L</PL_parser-E<gt>linestr>) should be interpreted as the UTF-8 encoding
of Unicode characters.  If not, they should be interpreted as Latin-1
characters.  This is analogous to the C<SvUTF8> flag for scalars.

In UTF-8 mode, it is not guaranteed that the lexer buffer actually
contains valid UTF-8.  Lexing code must be robust in the face of invalid
encoding.

The actual C<SvUTF8> flag of the L</PL_parser-E<gt>linestr> scalar
is significant, but not the whole story regarding the input character
encoding.  Normally, when a file is being read, the scalar contains octets
and its C<SvUTF8> flag is off, but the octets should be interpreted as
UTF-8 if the C<use utf8> pragma is in effect.  During a string eval,
however, the scalar may have the C<SvUTF8> flag on, and in this case its
octets should be interpreted as UTF-8 unless the C<use bytes> pragma
is in effect.  This logic may change in the future; use this function
instead of implementing the logic yourself.

	bool	lex_bufutf8()

=for hackers
Found in file toke.c

=item lex_discard_to
X<lex_discard_to>


NOTE: this function is experimental and may change or be
removed without notice.


Discards the first part of the L</PL_parser-E<gt>linestr> buffer,
up to C<ptr>.  The remaining content of the buffer will be moved, and
all pointers into the buffer updated appropriately.  C<ptr> must not
be later in the buffer than the position of L</PL_parser-E<gt>bufptr>:
it is not permitted to discard text that has yet to be lexed.

Normally it is not necessary to do this directly, because it suffices to
use the implicit discarding behaviour of L</lex_next_chunk> and things
based on it.  However, if a token stretches across multiple lines,
and the lexing code has kept multiple lines of text in the buffer for
that purpose, then after completion of the token it would be wise to
explicitly discard the now-unneeded earlier lines, to avoid future
multi-line tokens growing the buffer without bound.
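
The effect of discarding can be pictured with a small stand-alone model.
It is a sketch only: C<model_lexer> and C<model_discard_to> are hypothetical
names, offsets replace the real C<char*> members, and the real function
updates more pointers than the single C<bufptr> tracked here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical miniature of the lexer state; offsets replace char* members. */
typedef struct {
    char   buf[64];   /* stands in for the lexer buffer */
    size_t len;       /* octets in use, excluding the trailing NUL */
    size_t bufptr;    /* current lexing position */
} model_lexer;

/* Drop the first n octets.  Returns 0 on success, or -1 if n lies beyond
   bufptr, because text that has not been lexed yet must not be discarded. */
static int model_discard_to(model_lexer *lx, size_t n)
{
    if (n > lx->bufptr) return -1;
    memmove(lx->buf, lx->buf + n, lx->len - n + 1);   /* +1 keeps the NUL */
    lx->len    -= n;
    lx->bufptr -= n;
    return 0;
}
```

With the buffer holding two lines and C<bufptr> at the start of the second,
discarding up to C<bufptr> leaves only the second line, with C<bufptr> now
at offset zero.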

	void	lex_discard_to(char *ptr)

=for hackers
Found in file toke.c

=item lex_grow_linestr
X<lex_grow_linestr>


NOTE: this function is experimental and may change or be
removed without notice.


Reallocates the lexer buffer (L</PL_parser-E<gt>linestr>) to accommodate
at least C<len> octets (including terminating C<NUL>).  Returns a
pointer to the reallocated buffer.  This is necessary before making
any direct modification of the buffer that would increase its length.
L</lex_stuff_pvn> provides a more convenient way to insert text into
the buffer.

Do not use C<SvGROW> or C<sv_grow> directly on C<PL_parser-E<gt>linestr>;
this function updates all of the lexer's variables that point directly
into the buffer.

	char *	lex_grow_linestr(STRLEN len)

=for hackers
Found in file toke.c

=item lex_next_chunk
X<lex_next_chunk>


NOTE: this function is experimental and may change or be
removed without notice.


Reads in the next chunk of text to be lexed, appending it to
L</PL_parser-E<gt>linestr>.  This should be called when lexing code has
looked to the end of the current chunk and wants to know more.  It is
usual, but not necessary, for lexing to have consumed the entirety of
the current chunk at this time.

If L</PL_parser-E<gt>bufptr> is pointing to the very end of the current
chunk (i.e., the current chunk has been entirely consumed), normally the
current chunk will be discarded at the same time that the new chunk is
read in.  If C<flags> has the C<LEX_KEEP_PREVIOUS> bit set, the current chunk
will not be discarded.  If the current chunk has not been entirely
consumed, then it will not be discarded regardless of the flag.

Returns true if some new text was added to the buffer, or false if the
buffer has reached the end of the input text.
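
The discard rule described above can be modelled in isolation.  The
following is a sketch, not the implementation in F<toke.c>: C<model_source>,
C<model_next_chunk> and the value of C<MODEL_KEEP_PREVIOUS> are hypothetical,
and a fixed-size array stands in for the growable buffer.

```c
#include <assert.h>
#include <string.h>

#define MODEL_KEEP_PREVIOUS 0x1u   /* hypothetical flag value for this model */

typedef struct {
    const char **chunks;   /* remaining input, one chunk per element */
    size_t nchunks, next;
    char   buf[128];       /* stand-in for the lexer buffer */
    size_t len, bufptr;
} model_source;

/* Append the next chunk.  Returns 1 if text was added, 0 at end of input
   (or, in this fixed-size model only, if the chunk does not fit). */
static int model_next_chunk(model_source *s, unsigned flags)
{
    const char *chunk;
    size_t n;

    /* Discard only a fully consumed chunk, and only if not asked to keep it. */
    if (!(flags & MODEL_KEEP_PREVIOUS) && s->bufptr == s->len) {
        s->len = s->bufptr = 0;
        s->buf[0] = '\0';
    }
    if (s->next == s->nchunks) return 0;
    chunk = s->chunks[s->next];
    n = strlen(chunk);
    if (s->len + n + 1 > sizeof s->buf) return 0;
    memcpy(s->buf + s->len, chunk, n + 1);   /* copies the NUL as well */
    s->len += n;
    s->next++;
    return 1;
}
```

A caller that needs a token to stay contiguous across lines would pass the
keep flag on each call until the token is complete.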

	bool	lex_next_chunk(U32 flags)

=for hackers
Found in file toke.c

=item lex_peek_unichar
X<lex_peek_unichar>


NOTE: this function is experimental and may change or be
removed without notice.


Looks ahead one (Unicode) character in the text currently being lexed.
Returns the codepoint (unsigned integer value) of the next character,
or -1 if lexing has reached the end of the input text.  To consume the
peeked character, use L</lex_read_unichar>.

If the next character is in (or extends into) the next chunk of input
text, the next chunk will be read in.  Normally the current chunk will be
discarded at the same time, but if C<flags> has the C<LEX_KEEP_PREVIOUS>
bit set, then the current chunk will not be discarded.

If the input is being interpreted as UTF-8 and a UTF-8 encoding error
is encountered, an exception is generated.

	I32	lex_peek_unichar(U32 flags)

=for hackers
Found in file toke.c

=item lex_read_space
X<lex_read_space>


NOTE: this function is experimental and may change or be
removed without notice.


Reads optional spaces, in Perl style, in the text currently being
lexed.  The spaces may include ordinary whitespace characters and
Perl-style comments.  C<#line> directives are processed if encountered.
L</PL_parser-E<gt>bufptr> is moved past the spaces, so that it points
at a non-space character (or the end of the input text).

If spaces extend into the next chunk of input text, the next chunk will
be read in.  Normally the current chunk will be discarded at the same
time, but if C<flags> has the C<LEX_KEEP_PREVIOUS> bit set, then the current
chunk will not be discarded.

	void	lex_read_space(U32 flags)

=for hackers
Found in file toke.c

=item lex_read_to
X<lex_read_to>


NOTE: this function is experimental and may change or be
removed without notice.


Consume text in the lexer buffer, from L</PL_parser-E<gt>bufptr> up
to C<ptr>.  This advances L</PL_parser-E<gt>bufptr> to match C<ptr>,
performing the correct bookkeeping whenever a newline character is passed.
This is the normal way to consume lexed text.
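
The newline bookkeeping can be sketched as follows.  This is an
illustrative model, assuming that the bookkeeping amounts to counting lines
and remembering where the current line starts; C<model_pos> and
C<model_read_to> are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *bufptr;      /* current lexing position */
    const char *linestart;   /* start of the current line */
    unsigned    line;        /* current line number */
} model_pos;

/* Advance bufptr to ptr, updating line and linestart for each newline passed. */
static void model_read_to(model_pos *p, const char *ptr)
{
    const char *s;

    for (s = p->bufptr; s != ptr; s++) {
        if (*s == '\n') {
            p->line++;
            p->linestart = s + 1;   /* the next line begins after the newline */
        }
    }
    p->bufptr = ptr;
}
```

After such a call, C<bufptr - linestart> gives the column of the current
position within its line.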

Interpretation of the buffer's octets can be abstracted out by
using the slightly higher-level functions L</lex_peek_unichar> and
L</lex_read_unichar>.

	void	lex_read_to(char *ptr)

=for hackers
Found in file toke.c

=item lex_read_unichar
X<lex_read_unichar>


NOTE: this function is experimental and may change or be
removed without notice.


Reads the next (Unicode) character in the text currently being lexed.
Returns the codepoint (unsigned integer value) of the character read,
and moves L</PL_parser-E<gt>bufptr> past the character, or returns -1
if lexing has reached the end of the input text.  To non-destructively
examine the next character, use L</lex_peek_unichar> instead.

If the next character is in (or extends into) the next chunk of input
text, the next chunk will be read in.  Normally the current chunk will be
discarded at the same time, but if C<flags> has the C<LEX_KEEP_PREVIOUS>
bit set, then the current chunk will not be discarded.

If the input is being interpreted as UTF-8 and a UTF-8 encoding error
is encountered, an exception is generated.
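
The difference between peeking and reading, and between the UTF-8 and
Latin-1 interpretations, is sketched below.  The names are hypothetical,
reading of further chunks is left out, and malformed UTF-8 is reported by
returning -2 where the real functions generate an exception.  Overlong
encodings are not rejected in this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const unsigned char *bufptr;   /* current position */
    const unsigned char *bufend;   /* one past the last octet */
    int utf8;                      /* nonzero: UTF-8 octets; zero: Latin-1 */
} model_input;

/* Decode the character at bufptr without consuming it and store its length
   in *lenp.  Returns -1 at end of input and -2 on malformed UTF-8. */
static long model_peek_unichar(const model_input *in, size_t *lenp)
{
    const unsigned char *s = in->bufptr;
    size_t avail = (size_t)(in->bufend - s), need, i;
    long cp;

    *lenp = 0;
    if (avail == 0) return -1;
    if (!in->utf8 || s[0] < 0x80) { *lenp = 1; return s[0]; }
    if      ((s[0] & 0xE0) == 0xC0) { need = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { need = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { need = 4; cp = s[0] & 0x07; }
    else return -2;
    if (avail < need) return -2;
    for (i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80) return -2;   /* not a continuation octet */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    *lenp = need;
    return cp;
}

/* Like the peek, but also moves bufptr past the character on success. */
static long model_read_unichar(model_input *in)
{
    size_t len;
    long cp = model_peek_unichar(in, &len);

    if (cp >= 0) in->bufptr += len;
    return cp;
}
```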

	I32	lex_read_unichar(U32 flags)

=for hackers
Found in file toke.c

=item lex_start
X<lex_start>


NOTE: this function is experimental and may change or be
removed without notice.


Creates and initialises a new lexer/parser state object, supplying
a context in which to lex and parse from a new source of Perl code.
A pointer to the new state object is placed in L</PL_parser>.  An entry
is made on the save stack so that upon unwinding, the new state object
will be destroyed and the former value of L</PL_parser> will be restored.
Nothing else need be done to clean up the parsing context.

The code to be parsed comes from C<line> and C<rsfp>.  C<line>, if
non-null, provides a string (in SV form) containing code to be parsed.
A copy of the string is made, so subsequent modification of C<line>
does not affect parsing.  C<rsfp>, if non-null, provides an input stream
from which code will be read to be parsed.  If both are non-null, the
code in C<line> comes first and must consist of complete lines of input,
and C<rsfp> supplies the remainder of the source.

The C<flags> parameter is reserved for future use.  Currently it is only
used by perl internally, so extensions should always pass zero.

	void	lex_start(SV *line, PerlIO *rsfp, U32 flags)

=for hackers
Found in file toke.c

=item lex_stuff_pv
X<lex_stuff_pv>


NOTE: this function is experimental and may change or be
removed without notice.


Insert characters into the lexer buffer (L</PL_parser-E<gt>linestr>),
immediately after the current lexing point (L</PL_parser-E<gt>bufptr>),
reallocating the buffer if necessary.  This means that lexing code that
runs later will see the characters as if they had appeared in the input.
It is not recommended to do this as part of normal parsing, and most
uses of this facility run the risk of the inserted characters being
interpreted in an unintended manner.

The string to be inserted is represented by octets starting at C<pv>
and continuing to the first nul.  These octets are interpreted as either
UTF-8 or Latin-1, according to whether the C<LEX_STUFF_UTF8> flag is set
in C<flags>.  The characters are recoded for the lexer buffer, according
to how the buffer is currently being interpreted (L</lex_bufutf8>).
If it is not convenient to nul-terminate a string to be inserted, the
L</lex_stuff_pvn> function is more appropriate.

	void	lex_stuff_pv(const char *pv, U32 flags)

=for hackers
Found in file toke.c

=item lex_stuff_pvn
X<lex_stuff_pvn>


NOTE: this function is experimental and may change or be
removed without notice.


Insert characters into the lexer buffer (L</PL_parser-E<gt>linestr>),
immediately after the current lexing point (L</PL_parser-E<gt>bufptr>),
reallocating the buffer if necessary.  This means that lexing code that
runs later will see the characters as if they had appeared in the input.
It is not recommended to do this as part of normal parsing, and most
uses of this facility run the risk of the inserted characters being
interpreted in an unintended manner.

The string to be inserted is represented by C<len> octets starting
at C<pv>.  These octets are interpreted as either UTF-8 or Latin-1,
according to whether the C<LEX_STUFF_UTF8> flag is set in C<flags>.
The characters are recoded for the lexer buffer, according to how the
buffer is currently being interpreted (L</lex_bufutf8>).  If a string
to be inserted is available as a Perl scalar, the L</lex_stuff_sv>
function is more convenient.
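
One of the recoding cases, Latin-1 input stuffed into a buffer that is
being interpreted as UTF-8, can be sketched in isolation.  The names
C<model_buffer> and C<model_stuff_latin1> are hypothetical, the buffer is a
fixed-size array rather than a reallocated scalar, and the other encoding
combinations are not covered.

```c
#include <assert.h>
#include <string.h>

typedef struct {
    char   buf[64];   /* fixed-size stand-in for the growable lexer buffer */
    size_t len;       /* octets in use, excluding the trailing NUL */
    size_t bufptr;    /* insertion happens here */
    int    utf8;      /* how the buffer is being interpreted */
} model_buffer;

/* Insert len Latin-1 octets at bufptr, recoding them to UTF-8 when the
   buffer is in UTF-8 mode.  Returns 0 on success, or -1 if the fixed buffer
   is too small (the real function reallocates instead). */
static int model_stuff_latin1(model_buffer *b, const char *pv, size_t len)
{
    size_t i, out = 0, need = len;
    char *p;

    if (b->utf8)
        for (i = 0; i < len; i++)
            if ((unsigned char)pv[i] >= 0x80) need++;   /* two octets each */
    if (b->len + need + 1 > sizeof b->buf) return -1;

    p = b->buf + b->bufptr;
    memmove(p + need, p, b->len - b->bufptr + 1);   /* open a gap, keep NUL */
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)pv[i];
        if (b->utf8 && c >= 0x80) {
            p[out++] = (char)(0xC0 | (c >> 6));
            p[out++] = (char)(0x80 | (c & 0x3F));
        } else {
            p[out++] = (char)c;
        }
    }
    b->len += need;   /* bufptr is unchanged: the new text is lexed next */
    return 0;
}
```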

	void	lex_stuff_pvn(const char *pv, STRLEN len,
		              U32 flags)

=for hackers
Found in file toke.c

=item lex_stuff_pvs
X<lex_stuff_pvs>


NOTE: this function is experimental and may change or be
removed without notice.


Like L</lex_stuff_pvn>, but takes a C<NUL>-terminated literal string instead of
a string/length pair.

	void	lex_stuff_pvs(const char *pv, U32 flags)

=for hackers
Found in file handy.h

=item lex_stuff_sv
X<lex_stuff_sv>


NOTE: this function is experimental and may change or be
removed without notice.


Insert characters into the lexer buffer (L</PL_parser-E<gt>linestr>),
immediately after the current lexing point (L</PL_parser-E<gt>bufptr>),
reallocating the buffer if necessary.  This means that lexing code that
runs later will see the characters as if they had appeared in the input.
It is not recommended to do this as part of normal parsing, and most
uses of this facility run the risk of the inserted characters being
interpreted in an unintended manner.

The string to be inserted is the string value of C<sv>.  The characters
are recoded for the lexer buffer, according to how the buffer is currently
being interpreted (L</lex_bufutf8>).  If a string to be inserted is
not already a Perl scalar, the L</lex_stuff_pvn> function avoids the
need to construct a scalar.

	void	lex_stuff_sv(SV *sv, U32 flags)

=for hackers
Found in file toke.c

=item lex_unstuff
X<lex_unstuff>


NOTE: this function is experimental and may change or be
removed without notice.


Discards text about to be lexed, from L</PL_parser-E<gt>bufptr> up to
C<ptr>.  Text following C<ptr> will be moved, and the buffer shortened.
This hides the discarded text from any lexing code that runs later,
as if the text had never appeared.

This is not the normal way to consume lexed text.  For that, use
L</lex_read_to>.

	void	lex_unstuff(char *ptr)

=for hackers
Found in file toke.c

=item parse_arithexpr
X<parse_arithexpr>


NOTE: this function is experimental and may change or be
removed without notice.


Parse a Perl arithmetic expression.  This may contain operators of precedence
down to the bit shift operators.  The expression must be followed (and thus
terminated) either by a comparison or lower-precedence operator or by
something that would normally terminate an expression, such as a semicolon.
If C<flags> has the C<PARSE_OPTIONAL> bit set, then the expression is optional,
otherwise it is mandatory.  It is up to the caller to ensure that the
dynamic parser state (L</PL_parser> et al) is correctly set to reflect
the source of the code to be parsed and the lexical context for the
expression.

The op tree representing the expression is returned.  If an optional
expression is absent, a null pointer is returned, otherwise the pointer
will be non-null.

If an error occurs in parsing or compilation, in most cases a valid op
tree is returned anyway.  The error is reflected in the parser state,
normally resulting in a single exception at the top level of parsing
which covers all the compilation errors that occurred.  Some compilation
errors, however, will throw an exception immediately.

	OP *	parse_arithexpr(U32 flags)

=for hackers
Found in file toke.c

=item parse_barestmt
X<parse_barestmt>


NOTE: this function is experimental and may change or be
removed without notice.


Parse a single unadorned Perl statement.  This may be a normal imperative
statement or a declaration that has compile-time effect.  It does not
include any label or other affixture.  It is up to the caller to ensure
that the dynamic parser state (L</PL_parser> et al) is correctly set to
reflect the source of the code to be parsed and the lexical context for
the statement.

The op tree representing the statement is returned.  This may be a
null pointer if the statement is null, for example if it was actually
a subroutine definition (which has compile-time side effects).  If not
null, it will be ops directly implementing the statement, suitable to
pass to L</newSTATEOP>.  It will not normally include a C<nextstate> or
equivalent op (except for those embedded in a scope contained entirely
within the statement).

If an error occurs in parsing or compilation, in most cases a valid op
tree (most likely null) is returned anyway.  The error is reflected in
the parser state, normally resulting in a single exception at the top
level of parsing which covers all the compilation errors that occurred.
Some compilation errors, however, will throw an exception immediately.

The C<flags> parameter is reserved for future use, and must always
be zero.

	OP *	parse_barestmt(U32 flags)

=for hackers
Found in file toke.c

=item parse_block
X<parse_block>


NOTE: this function is experimental and may change or be
removed without notice.


Parse a single complete Perl code block.  This consists of an opening
brace, a sequence of statements, and a closing brace.  The block
constitutes a lexical scope, so C<my> variables and various compile-time
effects can be contained within it.  It is up to the caller to ensure
that the dynamic parser state (L</PL_parser> et al) is correctly set to
reflect the source of the code to be parsed and the lexical context for
the statement.

The op tree representing the code block is returned.  This is always a
real op, never a null pointer.  It will normally be a C<lineseq> list,
including C<nextstate> or equivalent ops.  No ops to construct any kind
of runtime scope are included by virtue of it being a block.

If an error occurs in parsing or compilation, in most cases a valid op
tree (most likely null) is returned anyway.  The error is reflected in
the parser state, normally resulting in a single exception at the top
level of parsing which covers all the compilation errors that occurred.
Some compilation errors, however, will throw an exception immediately.

The C<flags> parameter is reserved for future use, and must always
be zero.

	OP *	parse_block(U32 flags)

=for hackers
Found in file toke.c

=item parse_fullexpr
X<parse_fullexpr>


NOTE: this function is experimental and may change or be
removed without notice.


Parse a single complete Perl expression.  This allows the full
expression grammar, including the lowest-precedence operators such
as C<or>.  The expression must be followed (and thus terminated) by a
token that an expression would normally be terminated by: end-of-file,
closing bracketing punctuation, semicolon, or one of the keywords that
signals a postfix expression-statement modifier.  If C<flags> has the
C<PARSE_OPTIONAL> bit set, then the expression is optional, otherwise it is
mandatory.  It is up to the caller to ensure that the dynamic parser
state (L</PL_parser> et al) is correctly set to reflect the source of
the code to be parsed and the lexical context for the expression.

The op tree representing the expression is returned.  If an optional
expression is absent, a null pointer is returned, otherwise the pointer
will be non-null.

If an error occurs in parsing or compilation, in most cases a valid op
tree is returned anyway.  The error is reflected in the parser state,
normally resulting in a single exception at the top level of parsing
which covers all the compilation errors that occurred.  Some compilation
errors, however, will throw an exception immediately.

	OP *	parse_fullexpr(U32 flags)

=for hackers
Found in file toke.c

=item parse_fullstmt
X<parse_fullstmt>


NOTE: this function is experimental and may change or be
removed without notice.


Parse a single complete Perl statement.  This may be a normal imperative
statement or a declaration that has compile-time effect, and may include
optional labels.  It is up to the caller to ensure that the dynamic
parser state (L</PL_parser> et al) is correctly set to reflect the source
of the code to be parsed and the lexical context for the statement.

The op tree representing the statement is returned.  This may be a
null pointer if the statement is null, for example if it was actually
a subroutine definition (which has compile-time side effects).  If not
null, it will be the result of a L</newSTATEOP> call, normally including
a C<nextstate> or equivalent op.

If an error occurs in parsing or compilation, in most cases a valid op
tree (most likely null) is returned anyway.  The error is reflected in
the parser state, normally resulting in a single exception at the top
level of parsing which covers all the compilation errors that occurred.
Some compilation errors, however, will throw an exception immediately.

The C<flags> parameter is reserved for future use, and must always
be zero.

	OP *	parse_fullstmt(U32 flags)

=for hackers
Found in file toke.c

=item parse_label
X<parse_label>


NOTE: this function is experimental and may change or be
removed without notice.


Parse a single label, possibly optional, of the type that may prefix a
Perl statement.  It is up to the caller to ensure that the dynamic parser
state (L</PL_parser> et al) is correctly set to reflect the source of
the code to be parsed.  If C<flags> has the C<PARSE_OPTIONAL> bit set, then the
label is optional, otherwise it is mandatory.

The name of the label is returned in the form of a fresh scalar.  If an
optional label is absent, a null pointer is returned.

If an error occurs in parsing, which can only occur if the label is
mandatory, a valid label is returned anyway.  The error is reflected in
the parser state, normally resulting in a single exception at the top
level of parsing which covers all the compilation errors that occurred.

	SV *	parse_label(U32 flags)

=for hackers
Found in file toke.c

=item parse_listexpr
X<parse_listexpr>


NOTE: this function is experimental and may change or be
removed without notice.


Parse a Perl list expression.  This may contain operators of precedence
down to the comma operator.  The expression must be followed (and thus
terminated) either by a low-precedence logic operator such as C<or> or by
something that would normally terminate an expression, such as a semicolon.
If C<flags> has the C<PARSE_OPTIONAL> bit set, then the expression is optional,
otherwise it is mandatory.  It is up to the caller to ensure that the
dynamic parser state (L</PL_parser> et al) is correctly set to reflect
the source of the code to be parsed and the lexical context for the
expression.

The op tree representing the expression is returned.  If an optional
expression is absent, a null pointer is returned, otherwise the pointer
will be non-null.

If an error occurs in parsing or compilation, in most cases a valid op
tree is returned anyway.  The error is reflected in the parser state,
normally resulting in a single exception at the top level of parsing
which covers all the compilation errors that occurred.  Some compilation
errors, however, will throw an exception immediately.

	OP *	parse_listexpr(U32 flags)

=for hackers
Found in file toke.c

=item parse_stmtseq
X<parse_stmtseq>


NOTE: this function is experimental and may change or be
removed without notice.


Parse a sequence of zero or more Perl statements.  These may be normal
imperative statements, including optional labels, or declarations
that have compile-time effect, or any mixture thereof.  The statement
sequence ends when a closing brace or end-of-file is encountered in a
place where a new statement could have validly started.  It is up to
the caller to ensure that the dynamic parser state (L</PL_parser> et al)
is correctly set to reflect the source of the code to be parsed and the
lexical context for the statements.

The op tree representing the statement sequence is returned.  This may
be a null pointer if the statements were all null, for example if there
were no statements or if there were only subroutine definitions (which
have compile-time side effects).  If not null, it will be a C<lineseq>
list, normally including C<nextstate> or equivalent ops.

If an error occurs in parsing or compilation, in most cases a valid op
tree is returned anyway.  The error is reflected in the parser state,
normally resulting in a single exception at the top level of parsing
which covers all the compilation errors that occurred.  Some compilation
errors, however, will throw an exception immediately.

The C<flags> parameter is reserved for future use, and must always
be zero.

	OP *	parse_stmtseq(U32 flags)

=for hackers
Found in file toke.c

=item parse_termexpr
X<parse_termexpr>


NOTE: this function is experimental and may change or be
removed without notice.


Parse a Perl term expression.  This may contain operators of precedence
down to the assignment operators.  The expression must be followed (and thus
terminated) either by a comma or lower-precedence operator or by
something that would normally terminate an expression, such as a semicolon.
If C<flags> has the C<PARSE_OPTIONAL> bit set, then the expression is optional,
otherwise it is mandatory.  It is up to the caller to ensure that the
dynamic parser state (L</PL_parser> et al) is correctly set to reflect
the source of the code to be parsed and the lexical context for the
expression.

The op tree representing the expression is returned.  If an optional
expression is absent, a null pointer is returned, otherwise the pointer
will be non-null.

If an error occurs in parsing or compilation, in most cases a valid op
tree is returned anyway.  The error is reflected in the parser state,
normally resulting in a single exception at the top level of parsing
which covers all the compilation errors that occurred.  Some compilation
errors, however, will throw an exception immediately.

	OP *	parse_termexpr(U32 flags)

=for hackers
Found in file toke.c

=item PL_parser
X<PL_parser>

Pointer to a structure encapsulating the state of the parsing operation
currently in progress.  The pointer can be locally changed to perform
a nested parse without interfering with the state of an outer parse.
Individual members of C<PL_parser> have their own documentation.

=for hackers
Found in file toke.c

=item PL_parser-E<gt>bufend
X<PL_parser-E<gt>bufend>


NOTE: this variable is experimental and may change or be
removed without notice.


Direct pointer to the end of the chunk of text currently being lexed, the
end of the lexer buffer.  This is equal to C<SvPVX(PL_parser-E<gt>linestr)
+ SvCUR(PL_parser-E<gt>linestr)>.  A C<NUL> character (zero octet) is
always located at the end of the buffer, and does not count as part of
the buffer's contents.

=for hackers
Found in file toke.c

=item PL_parser-E<gt>bufptr
X<PL_parser-E<gt>bufptr>


NOTE: this variable is experimental and may change or be
removed without notice.


Points to the current position of lexing inside the lexer buffer.
Characters around this point may be freely examined, within
the range delimited by C<SvPVX(L</PL_parser-E<gt>linestr>)> and
L</PL_parser-E<gt>bufend>.  The octets of the buffer may be intended to be
interpreted as either UTF-8 or Latin-1, as indicated by L</lex_bufutf8>.

Lexing code (whether in the Perl core or not) moves this pointer past
the characters that it consumes.  It is also expected to perform some
bookkeeping whenever a newline character is consumed.  This movement
can be more conveniently performed by the function L</lex_read_to>,
which handles newlines appropriately.

Interpretation of the buffer's octets can be abstracted out by
using the slightly higher-level functions L</lex_peek_unichar> and
L</lex_read_unichar>.

=for hackers
Found in file toke.c

=item PL_parser-E<gt>linestart
X<PL_parser-E<gt>linestart>


NOTE: this variable is experimental and may change or be
removed without notice.


Points to the start of the current line inside the lexer buffer.
This is useful for indicating at which column an error occurred, and
not much else.  This must be updated by any lexing code that consumes
a newline; the function L</lex_read_to> handles this detail.

=for hackers
Found in file toke.c

=item PL_parser-E<gt>linestr
X<PL_parser-E<gt>linestr>


NOTE: this variable is experimental and may change or be
removed without notice.


Buffer scalar containing the chunk currently under consideration of the
text currently being lexed.  This is always a plain string scalar (for
which C<SvPOK> is true).  It is not intended to be used as a scalar by
normal scalar means; instead refer to the buffer directly by the pointer
variables described below.

The lexer maintains various C<char*> pointers to things in the
C<PL_parser-E<gt>linestr> buffer.  If C<PL_parser-E<gt>linestr> is ever
reallocated, all of these pointers must be updated.  Don't attempt to
do this manually, but rather use L</lex_grow_linestr> if you need to
reallocate the buffer.

The content of the text chunk in the buffer is commonly exactly one
complete line of input, up to and including a newline terminator,
but there are situations where it is otherwise.  The octets of the
buffer may be intended to be interpreted as either UTF-8 or Latin-1.
The function L</lex_bufutf8> tells you which.  Do not use the C<SvUTF8>
flag on this scalar, which may disagree with it.

For direct examination of the buffer, the variable
L</PL_parser-E<gt>bufend> points to the end of the buffer.  The current
lexing position is pointed to by L</PL_parser-E<gt>bufptr>.  Direct use
of these pointers is usually preferable to examination of the scalar
through normal scalar means.

=for hackers
Found in file toke.c


=back

=head1 Locale-related functions and macros

=over 8

=item DECLARATION_FOR_LC_NUMERIC_MANIPULATION
X<DECLARATION_FOR_LC_NUMERIC_MANIPULATION>

This macro should be used as a statement.  It declares a private variable
(whose name begins with an underscore) that is needed by the other macros in
this section.  Failing to include this correctly should lead to a syntax error.
For compatibility with C89 C compilers it should be placed in a block before
any executable statements.

	void	DECLARATION_FOR_LC_NUMERIC_MANIPULATION

=for hackers
Found in file perl.h

=item RESTORE_LC_NUMERIC
X<RESTORE_LC_NUMERIC>

This is used in conjunction with one of the macros
L</STORE_LC_NUMERIC_SET_TO_NEEDED>
and
L</STORE_LC_NUMERIC_FORCE_TO_UNDERLYING>
to properly restore the C<LC_NUMERIC> state.

A call to L</DECLARATION_FOR_LC_NUMERIC_MANIPULATION> must have been made to
declare at compile time a private variable used by this macro and the two
C<STORE> ones.  This macro should be called as a single statement, not an
expression, but with an empty argument list, like this:

 {
    DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
     ...
    RESTORE_LC_NUMERIC();
     ...
 }

	void	RESTORE_LC_NUMERIC()

=for hackers
Found in file perl.h

=item STORE_LC_NUMERIC_FORCE_TO_UNDERLYING
X<STORE_LC_NUMERIC_FORCE_TO_UNDERLYING>

This is used by XS code that is C<LC_NUMERIC> locale-aware to force the
locale for category C<LC_NUMERIC> to be what perl thinks is the current
underlying locale.  (The perl interpreter could be wrong about what the
underlying locale actually is if some C or XS code has called the C library
function L<setlocale(3)> behind its back; calling L</sync_locale> before calling
this macro will update perl's records.)

A call to L</DECLARATION_FOR_LC_NUMERIC_MANIPULATION> must have been made to
declare at compile time a private variable used by this macro.  This macro
should be called as a single statement, not an expression, but with an empty
argument list, like this:

 {
    DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
     ...
    STORE_LC_NUMERIC_FORCE_TO_UNDERLYING();
     ...
    RESTORE_LC_NUMERIC();
     ...
 }

The private variable is used to save the current locale state, so
that the requisite matching call to L</RESTORE_LC_NUMERIC> can restore it.

	void	STORE_LC_NUMERIC_FORCE_TO_UNDERLYING()

=for hackers
Found in file perl.h

=item STORE_LC_NUMERIC_SET_TO_NEEDED
X<STORE_LC_NUMERIC_SET_TO_NEEDED>

This is used to help wrap XS or C code that is C<LC_NUMERIC> locale-aware.
This locale category is generally kept set to the C locale by Perl for
backwards compatibility, and because most XS code that reads floating point
values can cope only with the decimal radix character being a dot.

This macro makes sure the current C<LC_NUMERIC> state is set properly, to be
aware of locale if the call to the XS or C code from the Perl program is
from within the scope of a S<C<use locale>>; or to ignore locale if the call is
instead from outside such scope.

This macro is the start of wrapping the C or XS code; the wrap ending is done
by calling the L</RESTORE_LC_NUMERIC> macro after the operation.  Otherwise
the state may be left changed in a way that adversely affects other XS code.
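
The save, set and restore discipline has a plain C analogue, sketched
below with the standard C<setlocale> function only.  It is not what the
macros expand to: C<format_with_dot> is a hypothetical helper, and it forces
the C locale, whereas the macros choose the state according to the calling
scope.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Format a double with a '.' radix regardless of the caller's LC_NUMERIC,
   restoring the caller's setting before returning. */
static int format_with_dot(char *out, size_t size, double value)
{
    char saved[128];
    const char *cur = setlocale(LC_NUMERIC, NULL);   /* query only */
    int n;

    if (!cur || strlen(cur) >= sizeof saved) return -1;
    strcpy(saved, cur);             /* copy: later calls may overwrite *cur */

    setlocale(LC_NUMERIC, "C");     /* "store" step: switch to a known state */
    n = snprintf(out, size, "%.2f", value);
    setlocale(LC_NUMERIC, saved);   /* "restore" step: always paired */
    return n;
}
```

As with the macros, every path that changes the state must reach the
matching restore, or later code will see the wrong radix character.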

A call to L</DECLARATION_FOR_LC_NUMERIC_MANIPULATION> must have been made to
declare at compile time a private variable used by this macro.  This macro
should be called as a single statement, not an expression, but with an empty
argument list, like this:

 {
    DECLARATION_FOR_LC_NUMERIC_MANIPULATION;
     ...
    STORE_LC_NUMERIC_SET_TO_NEEDED();
     ...
    RESTORE_LC_NUMERIC();
     ...
 }

	void	STORE_LC_NUMERIC_SET_TO_NEEDED()

=for hackers
Found in file perl.h

=item sync_locale
X<sync_locale>

Changing the program's locale should be avoided by XS code.  Nevertheless,
certain non-Perl libraries called from XS, such as C<Gtk>, do so.  When this
happens, Perl needs to be told that the locale has changed.  Use this function
to do so, before returning to Perl.

	void	sync_locale()

=for hackers
Found in file locale.c


=back

=head1 Magical Functions

=over 8

=item mg_clear
X<mg_clear>

Clear something magical that the SV represents.  See C<L</sv_magic>>.

	int	mg_clear(SV* sv)

=for hackers
Found in file mg.c

=item mg_copy
X<mg_copy>

Copies the magic from one SV to another.  See C<L</sv_magic>>.

	int	mg_copy(SV *sv, SV *nsv, const char *key,
		        I32 klen)

=for hackers
Found in file mg.c

=item mg_find
X<mg_find>

Finds the magic pointer for C<type> matching the SV.  See C<L</sv_magic>>.

	MAGIC*	mg_find(const SV* sv, int type)

=for hackers
Found in file mg.c

=item mg_findext
X<mg_findext>

Finds the magic pointer of C<type> with the given C<vtbl> for the C<SV>.  See
C<L</sv_magicext>>.

	MAGIC*	mg_findext(const SV* sv, int type,
		           const MGVTBL *vtbl)

=for hackers
Found in file mg.c

=item mg_free
X<mg_free>

Free any magic storage used by the SV.  See C<L</sv_magic>>.

	int	mg_free(SV* sv)

=for hackers
Found in file mg.c

=item mg_free_type
X<mg_free_type>

Remove any magic of type C<how> from the SV C<sv>.  See L</sv_magic>.

	void	mg_free_type(SV *sv, int how)

=for hackers
Found in file mg.c

=item mg_get
X<mg_get>

Do magic before a value is retrieved from the SV.  The type of SV must
be >= C<SVt_PVMG>.  See C<L</sv_magic>>.

	int	mg_get(SV* sv)

=for hackers
Found in file mg.c

=item mg_length
X<mg_length>


DEPRECATED!  It is planned to remove this function from a
future release of Perl.  Do not use it for new code; remove it from
existing code.


Reports on the SV's length in bytes, calling length magic if available,
but does not set the UTF8 flag on C<sv>.  It will fall back to 'get'
magic if there is no 'length' magic, but with no indication as to
whether it called 'get' magic.  It assumes C<sv> is a C<PVMG> or
higher.  Use C<sv_len()> instead.

	U32	mg_length(SV* sv)

=for hackers
Found in file mg.c

=item mg_magical
X<mg_magical>

Turns on the magical status of an SV.  See C<L</sv_magic>>.

	void	mg_magical(SV* sv)

=for hackers
Found in file mg.c

=item mg_set
X<mg_set>

Do magic after a value is assigned to the SV.  See C<L</sv_magic>>.

	int	mg_set(SV* sv)

=for hackers
Found in file mg.c

=item SvGETMAGIC
X<SvGETMAGIC>

Invokes C<mg_get> on an SV if it has 'get' magic.  For example, this
will call C<FETCH> on a tied variable.  This macro evaluates its
argument more than once.

	void	SvGETMAGIC(SV* sv)

=for hackers
Found in file sv.h

=item SvLOCK
X<SvLOCK>

Arranges for a mutual exclusion lock to be obtained on C<sv> if a suitable module
has been loaded.

	void	SvLOCK(SV* sv)

=for hackers
Found in file sv.h

=item SvSETMAGIC
X<SvSETMAGIC>

Invokes C<mg_set> on an SV if it has 'set' magic.  This is necessary
after modifying a scalar, in case it is a magical variable like C<$|>
or a tied variable (it calls C<STORE>).  This macro evaluates its
argument more than once.

	void	SvSETMAGIC(SV* sv)

=for hackers
Found in file sv.h

=item SvSetMagicSV
X<SvSetMagicSV>

Like C<SvSetSV>, but does any set magic required afterwards.

	void	SvSetMagicSV(SV* dsv, SV* ssv)

=for hackers
Found in file sv.h

=item SvSetMagicSV_nosteal
X<SvSetMagicSV_nosteal>

Like C<SvSetSV_nosteal>, but does any set magic required afterwards.

	void	SvSetMagicSV_nosteal(SV* dsv, SV* ssv)

=for hackers
Found in file sv.h

=item SvSetSV
X<SvSetSV>

Calls C<sv_setsv> if C<dsv> is not the same as C<ssv>.  May evaluate arguments
more than once.  Does not handle 'set' magic on the destination SV.

	void	SvSetSV(SV* dsv, SV* ssv)

=for hackers
Found in file sv.h

=item SvSetSV_nosteal
X<SvSetSV_nosteal>

Calls a non-destructive version of C<sv_setsv> if C<dsv> is not the same as
C<ssv>.  May evaluate arguments more than once.

	void	SvSetSV_nosteal(SV* dsv, SV* ssv)

=for hackers
Found in file sv.h

=item SvSHARE
X<SvSHARE>

Arranges for C<sv> to be shared between threads if a suitable module
has been loaded.

	void	SvSHARE(SV* sv)

=for hackers
Found in file sv.h

=item SvUNLOCK
X<SvUNLOCK>

Releases a mutual exclusion lock on C<sv> if a suitable module
has been loaded.

	void	SvUNLOCK(SV* sv)

=for hackers
Found in file sv.h


=back

=head1 Memory Management

=over 8

=item Copy
X<Copy>

The XSUB-writer's interface to the C C<memcpy> function.  The C<src> is the
source, C<dest> is the destination, C<nitems> is the number of items, and
C<type> is the type.  May fail on overlapping copies.  See also C<L</Move>>.

	void	Copy(void* src, void* dest, int nitems, type)

=for hackers
Found in file handy.h

=item CopyD
X<CopyD>

Like C<Copy> but returns C<dest>.  Useful
for encouraging compilers to tail-call
optimise.

	void *	CopyD(void* src, void* dest, int nitems, type)

=for hackers
Found in file handy.h

=item Move
X<Move>

The XSUB-writer's interface to the C C<memmove> function.  The C<src> is the
source, C<dest> is the destination, C<nitems> is the number of items, and
C<type> is the type.  Can do overlapping moves.  See also C<L</Copy>>.
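
Both C<Copy> and C<Move> take the source first, which is the reverse of
C<memcpy> and C<memmove>.  The plain C sketch below mirrors that argument
order with two stand-in macros, C<SKETCH_COPY> and C<SKETCH_MOVE>.  These
names are hypothetical and are not the F<handy.h> definitions; the sketch
only illustrates why an overlapping shift calls for the C<memmove>-style
macro.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real handy.h macros: they only mirror
 * the documented (src, dest, nitems, type) argument order. */
#define SKETCH_COPY(src, dest, nitems, type) \
    ((void)memcpy((dest), (src), (nitems) * sizeof(type)))
#define SKETCH_MOVE(src, dest, nitems, type) \
    ((void)memmove((dest), (src), (nitems) * sizeof(type)))

/* Shift the first n-1 elements one slot to the right.  Source and
 * destination overlap, so the memmove-style macro is the safe choice. */
static void shift_right(int *buf, size_t n)
{
    if (n > 1)
        SKETCH_MOVE(buf, buf + 1, n - 1, int);
}

/* Duplicate n ints into a separate, non-overlapping buffer. */
static void duplicate(const int *from, int *to, size_t n)
{
    SKETCH_COPY(from, to, n, int);
}
```

In real XS code the same two helpers would call C<Move> and C<Copy>
directly with the same argument order.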

	void	Move(void* src, void* dest, int nitems, type)

=for hackers
Found in file handy.h

=item MoveD
X<MoveD>

Like C<Move> but returns C<dest>.  Useful
for encouraging compilers to tail-call
optimise.

	void *	MoveD(void* src, void* dest, int nitems, type)

=for hackers
Found in file handy.h

=item Newx
X<Newx>

The XSUB-writer's interface to the C C<malloc> function.

Memory obtained by this should B<ONLY> be freed with L</"Safefree">.

In 5.9.3, Newx() and friends replace the older New() API, and drop
the first parameter, I<x>, a debug aid which allowed callers to identify
themselves.  This aid has been superseded by a new build option,
PERL_MEM_LOG (see L<perlhacktips/PERL_MEM_LOG>).  The older API is still
there for use in XS modules supporting older perls.

	void	Newx(void* ptr, int nitems, type)

=for hackers
Found in file handy.h

=item Newxc
X<Newxc>

The XSUB-writer's interface to the C C<malloc> function, with
cast.  See also C<L</Newx>>.

Memory obtained by this should B<ONLY> be freed with L</"Safefree">.

	void	Newxc(void* ptr, int nitems, type, cast)

=for hackers
Found in file handy.h

=item Newxz
X<Newxz>

The XSUB-writer's interface to the C C<malloc> function.  The allocated
memory is zeroed with C<memzero>.  See also C<L</Newx>>.

Memory obtained by this should B<ONLY> be freed with L</"Safefree">.

	void	Newxz(void* ptr, int nitems, type)

=for hackers
Found in file handy.h

=item Poison
X<Poison>

PoisonWith(0xEF) for catching access to freed memory.

	void	Poison(void* dest, int nitems, type)

=for hackers
Found in file handy.h

=item PoisonFree
X<PoisonFree>

PoisonWith(0xEF) for catching access to freed memory.

	void	PoisonFree(void* dest, int nitems, type)

=for hackers
Found in file handy.h

=item PoisonNew
X<PoisonNew>

PoisonWith(0xAB) for catching access to allocated but uninitialized memory.

	void	PoisonNew(void* dest, int nitems, type)

=for hackers
Found in file handy.h

=item PoisonWith
X<PoisonWith>

Fill up memory with a byte pattern (a byte repeated over and over
again) that hopefully catches attempts to access uninitialized memory.

	void	PoisonWith(void* dest, int nitems, type,
		           U8 byte)

=for hackers
Found in file handy.h

=item Renew
X<Renew>

The XSUB-writer's interface to the C C<realloc> function.

Memory obtained by this should B<ONLY> be freed with L</"Safefree">.

	void	Renew(void* ptr, int nitems, type)

=for hackers
Found in file handy.h

=item Renewc
X<Renewc>

The XSUB-writer's interface to the C C<realloc> function, with
cast.

Memory obtained by this should B<ONLY> be freed with L</"Safefree">.

	void	Renewc(void* ptr, int nitems, type, cast)

=for hackers
Found in file handy.h

=item Safefree
X<Safefree>

The XSUB-writer's interface to the C C<free> function.

This should B<ONLY> be used on memory obtained using L</"Newx"> and friends.

	void	Safefree(void* ptr)

=for hackers
Found in file handy.h

=item savepv
X<savepv>

Perl's version of C<strdup()>.  Returns a pointer to a newly allocated
string which is a duplicate of C<pv>.  The size of the string is
determined by C<strlen()>, which means it may not contain embedded C<NUL>
characters and must have a trailing C<NUL>.  The memory allocated for the new
string can be freed with the C<Safefree()> function.

On some platforms, Windows for example, all allocated memory owned by a thread
is deallocated when that thread ends.  So if you need that not to happen, you
need to use the shared memory functions, such as C<L</savesharedpv>>.

	char*	savepv(const char* pv)

=for hackers
Found in file util.c

=item savepvn
X<savepvn>

Perl's version of what C<strndup()> would be if it existed.  Returns a
pointer to a newly allocated string which is a duplicate of the first
C<len> bytes from C<pv>, plus a trailing
C<NUL> byte.  The memory allocated for
the new string can be freed with the C<Safefree()> function.

On some platforms, Windows for example, all allocated memory owned by a thread
is deallocated when that thread ends.  So if you need that not to happen, you
need to use the shared memory functions, such as C<L</savesharedpvn>>.

	char*	savepvn(const char* pv, I32 len)

=for hackers
Found in file util.c

=item savepvs
X<savepvs>

Like C<savepvn>, but takes a C<NUL>-terminated literal string instead of a
string/length pair.

	char*	savepvs(const char* s)

=for hackers
Found in file handy.h

=item savesharedpv
X<savesharedpv>

A version of C<savepv()> which allocates the duplicate string in memory
which is shared between threads.

	char*	savesharedpv(const char* pv)

=for hackers
Found in file util.c

=item savesharedpvn
X<savesharedpvn>

A version of C<savepvn()> which allocates the duplicate string in memory
which is shared between threads.  (With the specific difference that a C<NULL>
pointer is not acceptable)

	char*	savesharedpvn(const char *const pv,
		              const STRLEN len)

=for hackers
Found in file util.c

=item savesharedpvs
X<savesharedpvs>

A version of C<savepvs()> which allocates the duplicate string in memory
which is shared between threads.

	char*	savesharedpvs(const char* s)

=for hackers
Found in file handy.h

=item savesharedsvpv
X<savesharedsvpv>

A version of C<savesvpv()> which allocates the duplicate string in
memory which is shared between threads.

	char*	savesharedsvpv(SV *sv)

=for hackers
Found in file util.c

=item savesvpv
X<savesvpv>

A version of C<savepv()>/C<savepvn()> which gets the string to duplicate from
the passed-in SV using C<SvPV()>.

On some platforms, Windows for example, all allocated memory owned by a thread
is deallocated when that thread ends.  So if you need that not to happen, you
need to use the shared memory functions, such as C<L</savesharedsvpv>>.

	char*	savesvpv(SV* sv)

=for hackers
Found in file util.c

=item StructCopy
X<StructCopy>

This is an architecture-independent macro to copy one structure to another.

	void	StructCopy(type *src, type *dest, type)

=for hackers
Found in file handy.h

=item Zero
X<Zero>

The XSUB-writer's interface to the C C<memzero> function.  The C<dest> is the
destination, C<nitems> is the number of items, and C<type> is the type.

	void	Zero(void* dest, int nitems, type)

=for hackers
Found in file handy.h

=item ZeroD
X<ZeroD>

Like C<Zero> but returns dest.  Useful
for encouraging compilers to tail-call
optimise.

	void *	ZeroD(void* dest, int nitems, type)

=for hackers
Found in file handy.h


=back

=head1 Miscellaneous Functions

=over 8

=item dump_c_backtrace
X<dump_c_backtrace>

Dumps the C backtrace to the given C<fp>.

Returns true if a backtrace could be retrieved, false if not.

	bool	dump_c_backtrace(PerlIO* fp, int max_depth,
		                 int skip)

=for hackers
Found in file util.c

=item fbm_compile
X<fbm_compile>

Analyses the string in order to make fast searches on it using C<fbm_instr()>
-- the Boyer-Moore algorithm.

	void	fbm_compile(SV* sv, U32 flags)

=for hackers
Found in file util.c

=item fbm_instr
X<fbm_instr>

Returns the location of the SV in the string delimited by C<big> and
C<bigend> (C<bigend> is the char following the last char).
It returns C<NULL> if the string can't be found.  The C<sv>
does not have to be C<fbm_compiled>, but the search will not be as fast
then.

	char*	fbm_instr(unsigned char* big,
		          unsigned char* bigend, SV* littlestr,
		          U32 flags)

=for hackers
Found in file util.c

=item foldEQ
X<foldEQ>

Returns true if the leading C<len> bytes of the strings C<s1> and C<s2> are the
same case-insensitively; false otherwise.  Uppercase and lowercase ASCII range
bytes match themselves and their opposite case counterparts.  Non-cased and
non-ASCII range bytes match only themselves.
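
The matching rule described above can be sketched in plain C as follows.
The helper names C<ascii_fold> and C<sketch_fold_eq> are hypothetical, the
code assumes an ASCII platform, and it is an illustration of the documented
behaviour rather than the implementation in F<inline.h>.

```c
#include <stddef.h>

/* Lowercase an ASCII uppercase byte; leave every other byte unchanged. */
static unsigned char ascii_fold(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* Hypothetical stand-in for the documented behaviour, not Perl's code:
 * returns 1 if the leading len bytes match ASCII case-insensitively. */
static int sketch_fold_eq(const char *a, const char *b, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (ascii_fold((unsigned char)a[i]) != ascii_fold((unsigned char)b[i]))
            return 0;
    }
    return 1;
}
```

Bytes outside the ASCII range, such as C<0xC9> and C<0xE9>, compare unequal
in this sketch even where a locale would treat them as a case pair.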

	I32	foldEQ(const char* a, const char* b, I32 len)

=for hackers
Found in file inline.h

=item foldEQ_locale
X<foldEQ_locale>

Returns true if the leading C<len> bytes of the strings C<s1> and C<s2> are the
same case-insensitively in the current locale; false otherwise.

	I32	foldEQ_locale(const char* a, const char* b,
		              I32 len)

=for hackers
Found in file inline.h

=item form
X<form>

Takes a sprintf-style format pattern and conventional
(non-SV) arguments and returns the formatted string.

    (char *) Perl_form(pTHX_ const char* pat, ...)

can be used any place a string (char *) is required:

    char * s = Perl_form("%d.%d",major,minor);

Uses a single private buffer so if you want to format several strings you
must explicitly copy the earlier strings away (and free the copies when you
are done).
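
The single-buffer constraint is easiest to see in a small stand-alone
sketch.  C<sketch_form> and C<keep_copy> below are hypothetical names and
use only the C library; in XS code the copy would more likely be made with
C<savepv()> and released with C<Safefree()>.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in that shares form()'s documented constraint: every
 * call formats into the same private buffer and returns a pointer to it. */
static char *sketch_form(const char *pat, ...)
{
    static char buf[128];
    va_list ap;
    va_start(ap, pat);
    vsnprintf(buf, sizeof buf, pat, ap);
    va_end(ap);
    return buf;
}

/* Copy a result away before the next call overwrites the shared buffer.
 * The caller owns the copy and must free() it. */
static char *keep_copy(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy)
        memcpy(copy, s, n);
    return copy;
}
```

Without the call to C<keep_copy>, a pointer kept from the first call would
show the text produced by the second call.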

	char*	form(const char* pat, ...)

=for hackers
Found in file util.c

=item getcwd_sv
X<getcwd_sv>

Fill C<sv> with the current working directory.

	int	getcwd_sv(SV* sv)

=for hackers
Found in file util.c

=item get_c_backtrace_dump
X<get_c_backtrace_dump>

Returns a SV containing a dump of C<depth> frames of the call stack, skipping
the C<skip> innermost ones.  C<depth> of 20 is usually enough.

The appended output looks like:

 ...
 1   10e004812:0082   Perl_croak   util.c:1716    /usr/bin/perl
 2   10df8d6d2:1d72   perl_parse   perl.c:3975    /usr/bin/perl
 ...

The fields are tab-separated.  The first column is the depth (zero
being the innermost non-skipped frame).  In the hex:offset, the hex is
where the program counter was in C<S_parse_body>, and the :offset (might
be missing) tells how much inside the C<S_parse_body> the program counter was.

The C<util.c:1716> is the source code file and line number.

The F</usr/bin/perl> is obvious (hopefully).

Unknowns are C<"-">.  Unknowns can happen unfortunately quite easily:
if the platform doesn't support retrieving the information;
if the binary is missing the debug information;
if the optimizer has transformed the code by for example inlining.

	SV*	get_c_backtrace_dump(int max_depth, int skip)

=for hackers
Found in file util.c

=item ibcmp
X<ibcmp>

This is a synonym for S<C<(! foldEQ())>>.

	I32	ibcmp(const char* a, const char* b, I32 len)

=for hackers
Found in file util.h

=item ibcmp_locale
X<ibcmp_locale>

This is a synonym for S<C<(! foldEQ_locale())>>.

	I32	ibcmp_locale(const char* a, const char* b,
		             I32 len)

=for hackers
Found in file util.h

=item is_safe_syscall
X<is_safe_syscall>

Test that the given C<pv> doesn't contain any internal C<NUL> characters.
If it does, set C<errno> to C<ENOENT>, optionally warn, and return FALSE.

Return TRUE if the name is safe.

Used by the C<IS_SAFE_SYSCALL()> macro.
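
A simplified stand-alone sketch of this check follows.  The name
C<sketch_is_safe_name> is hypothetical, the warning step is omitted, and
the sketch rejects a C<NUL> anywhere in the first C<len> bytes, which is an
assumption and may differ in detail from the function in F<inline.h>.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: reject a name that contains a NUL byte within
 * its first len bytes, setting errno to ENOENT as the text describes. */
static int sketch_is_safe_name(const char *pv, size_t len)
{
    if (len > 0 && memchr(pv, '\0', len) != NULL) {
        errno = ENOENT;
        return 0;
    }
    return 1;
}
```

The point of such a check is that a name like C<"/etc/passwd\0.txt"> would
otherwise be silently truncated at the C<NUL> by the underlying system call.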

	bool	is_safe_syscall(const char *pv, STRLEN len,
		                const char *what,
		                const char *op_name)

=for hackers
Found in file inline.h

=item memEQ
X<memEQ>

Test two buffers (which may contain embedded C<NUL> characters) to see if they
are equal.  The C<len> parameter indicates the number of bytes to compare.
Returns true if the buffers are equal, false otherwise.

	bool	memEQ(char* s1, char* s2, STRLEN len)

=for hackers
Found in file handy.h

=item memNE
X<memNE>

Test two buffers (which may contain embedded C<NUL> characters) to see if they
are not equal.  The C<len> parameter indicates the number of bytes to compare.
Returns true if the buffers differ, false if they are equal.
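
The sketch below assumes that C<memEQ> and C<memNE> behave like truth-valued
wrappers around C<memcmp>.  C<SKETCH_MEM_EQ>, C<SKETCH_MEM_NE> and
C<same_record> are hypothetical names used only for illustration, not the
F<handy.h> definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins built on memcmp(); assumed, not copied from
 * handy.h.  Both yield a truth value rather than memcmp()'s ordering. */
#define SKETCH_MEM_EQ(s1, s2, len) (memcmp((s1), (s2), (len)) == 0)
#define SKETCH_MEM_NE(s1, s2, len) (memcmp((s1), (s2), (len)) != 0)

/* Compare two fixed-size records that may contain embedded NUL bytes. */
static int same_record(const char *a, const char *b, size_t len)
{
    return SKETCH_MEM_EQ(a, b, len);
}
```

Unlike a C<strcmp>-based test, this comparison does not stop at the first
C<NUL>, so records that differ only after an embedded C<NUL> are still told
apart.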

	bool	memNE(char* s1, char* s2, STRLEN len)

=for hackers
Found in file handy.h

=item mess
X<mess>

Take a sprintf-style format pattern and argument list.  These are used to
generate a string message.  If the message does not end with a newline,
then it will be extended with some indication of the current location
in the code, as described for L</mess_sv>.

Normally, the resulting message is returned in a new mortal SV.
During global destruction a single SV may be shared between uses of
this function.

	SV *	mess(const char *pat, ...)

=for hackers
Found in file util.c

=item mess_sv
X<mess_sv>

Expands a message, intended for the user, to include an indication of
the current location in the code, if the message does not already appear
to be complete.

C<basemsg> is the initial message or object.  If it is a reference, it
will be used as-is and will be the result of this function.  Otherwise it
is used as a string, and if it already ends with a newline, it is taken
to be complete, and the result of this function will be the same string.
If the message does not end with a newline, then a segment such as C<at
foo.pl line 37> will be appended, and possibly other clauses indicating
the current state of execution.  The resulting message will end with a
dot and a newline.

Normally, the resulting message is returned in a new mortal SV.
During global destruction a single SV may be shared between uses of this
function.  If C<consume> is true, then the function is permitted (but not
required) to modify and return C<basemsg> instead of allocating a new SV.

	SV *	mess_sv(SV *basemsg, bool consume)

=for hackers
Found in file util.c

=item my_snprintf
X<my_snprintf>

The C library C<snprintf> functionality, if available and
standards-compliant (uses C<vsnprintf>, actually).  However, if the
C<vsnprintf> is not available, will unfortunately use the unsafe
C<vsprintf> which can overrun the buffer (there is an overrun check,
but that may be too late).  Consider using C<sv_vcatpvf> instead, or
getting C<vsnprintf>.

	int	my_snprintf(char *buffer, const Size_t len,
		            const char *format, ...)

=for hackers
Found in file util.c

=item my_sprintf
X<my_sprintf>

The C library C<sprintf>, wrapped if necessary, to ensure that it will return
the length of the string written to the buffer.  Only rare pre-ANSI systems
need the wrapper function - usually this is a direct call to C<sprintf>.

	int	my_sprintf(char *buffer, const char *pat, ...)

=for hackers
Found in file util.c

=item my_strlcat
X<my_strlcat>

The C library C<strlcat> if available, or a Perl implementation of it.
This operates on C C<NUL>-terminated strings.

C<my_strlcat()> appends string C<src> to the end of C<dst>.  It will append at
most S<C<size - strlen(dst) - 1>> characters.  It will then C<NUL>-terminate,
unless C<size> is 0 or the original C<dst> string was longer than C<size> (in
practice this should not happen as it means that either C<size> is incorrect or
that C<dst> is not a proper C<NUL>-terminated string).

Note that C<size> is the full size of the destination buffer and
the result is guaranteed to be C<NUL>-terminated if there is room.  Note that
room for the C<NUL> should be included in C<size>.

The return value is the total length that C<dst> would have if C<size> is
sufficiently large.  Thus it is the initial length of C<dst> plus the length of
C<src>.  If C<size> is smaller than the return, the excess was not appended.
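
The rules above can be sketched in portable C.  C<sketch_strlcat> and
C<append_ok> are hypothetical names; the code follows the description in
this entry and is not the implementation in F<util.c>.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in following the description above; not util.c. */
static size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
    size_t used = 0;
    size_t src_len = strlen(src);

    /* Length of dst, but never look beyond the size bytes we were given. */
    while (used < size && dst[used] != '\0')
        used++;

    /* No terminator inside the buffer: nothing can be appended. */
    if (used == size)
        return size + src_len;

    {
        size_t room = size - used - 1;
        size_t n = src_len < room ? src_len : room;
        memcpy(dst + used, src, n);
        dst[used + n] = '\0';
    }
    return used + src_len;
}

/* A return value of size or more means the result was truncated. */
static int append_ok(char *dst, const char *src, size_t size)
{
    return sketch_strlcat(dst, src, size) < size;
}
```

Comparing the return value against C<size> is the usual way to detect
truncation without a second pass over the strings.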

	Size_t	my_strlcat(char *dst, const char *src,
		           Size_t size)

=for hackers
Found in file util.c

=item my_strlcpy
X<my_strlcpy>

The C library C<strlcpy> if available, or a Perl implementation of it.
This operates on C C<NUL>-terminated strings.

C<my_strlcpy()> copies up to S<C<size - 1>> characters from the string C<src>
to C<dst>, C<NUL>-terminating the result if C<size> is not 0.

The return value is the total length C<src> would be if the copy completely
succeeded.  If it is larger than C<size>, the excess was not copied.
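
A minimal sketch of these semantics in portable C is shown below.
C<sketch_strlcpy> and C<copy_fits> are hypothetical names; the code follows
the description in this entry and is not the implementation in F<util.c>.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in: copy at most size - 1 bytes and NUL-terminate
 * when size is not 0.  Always returns strlen(src). */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len = strlen(src);
    if (size != 0) {
        size_t n = src_len < size - 1 ? src_len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}

/* The copy was complete only if the source, plus its NUL, fit in size. */
static int copy_fits(char *dst, const char *src, size_t size)
{
    return sketch_strlcpy(dst, src, size) < size;
}
```

With a four-byte buffer and the source C<"perl">, the sketch stores C<"per">
and returns 4, which signals that one byte was dropped.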

	Size_t	my_strlcpy(char *dst, const char *src,
		           Size_t size)

=for hackers
Found in file util.c

=item my_vsnprintf
X<my_vsnprintf>

The C library C<vsnprintf> if available and standards-compliant.
However, if the C<vsnprintf> is not available, will unfortunately
use the unsafe C<vsprintf> which can overrun the buffer (there is an
overrun check, but that may be too late).  Consider using
C<sv_vcatpvf> instead, or getting C<vsnprintf>.

	int	my_vsnprintf(char *buffer, const Size_t len,
		             const char *format, va_list ap)

=for hackers
Found in file util.c

=item ninstr
X<ninstr>

Find the first (leftmost) occurrence of a sequence of bytes within another
sequence.  This is the Perl version of C<strstr()>, extended to handle
arbitrary sequences, potentially containing embedded C<NUL> characters (C<NUL>
is what the initial C<n> in the function name stands for; some systems have an
equivalent, C<memmem()>, but with a somewhat different API).

Another way of thinking about this function is finding a needle in a haystack.
C<big> points to the first byte in the haystack.  C<big_end> points to one byte
beyond the final byte in the haystack.  C<little> points to the first byte in
the needle.  C<little_end> points to one byte beyond the final byte in the
needle.  All the parameters must be non-C<NULL>.

The function returns C<NULL> if there is no occurrence of C<little> within
C<big>.  If C<little> is the empty string, C<big> is returned.

Because this function operates at the byte level, and because of the inherent
characteristics of UTF-8 (or UTF-EBCDIC), it will work properly if both the
needle and the haystack are strings with the same UTF-8ness, but not if the
UTF-8ness differs.
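
The needle-in-a-haystack description can be illustrated with a short
stand-alone sketch.  C<sketch_ninstr> is a hypothetical name and uses a
simple byte-by-byte scan, which is an assumption made for clarity and not
the algorithm in F<util.c>.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the documented semantics; not util.c's code. */
static const char *sketch_ninstr(const char *big, const char *big_end,
                                 const char *little, const char *little_end)
{
    size_t big_len = (size_t)(big_end - big);
    size_t little_len = (size_t)(little_end - little);
    size_t i;

    if (little_len == 0)
        return big;             /* empty needle matches at the start */
    if (little_len > big_len)
        return NULL;

    for (i = 0; i + little_len <= big_len; i++) {
        if (memcmp(big + i, little, little_len) == 0)
            return big + i;     /* leftmost occurrence */
    }
    return NULL;
}
```

Because both sequences are delimited by pointers rather than by a
terminator, a needle such as the three bytes C<"\0cd"> can be found inside
a haystack that itself contains C<NUL> bytes.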

	char *	ninstr(char * big, char * bigend, char * little,
		       char * little_end)

=for hackers
Found in file util.c

=item PERL_SYS_INIT
X<PERL_SYS_INIT>

Provides system-specific tune up of the C runtime environment necessary to
run Perl interpreters.  This should be called only once, before creating
any Perl interpreters.

	void	PERL_SYS_INIT(int *argc, char*** argv)

=for hackers
Found in file perl.h

=item PERL_SYS_INIT3
X<PERL_SYS_INIT3>

Provides system-specific tune up of the C runtime environment necessary to
run Perl interpreters.  This should be called only once, before creating
any Perl interpreters.

	void	PERL_SYS_INIT3(int *argc, char*** argv,
		               char*** env)

=for hackers
Found in file perl.h

=item PERL_SYS_TERM
X<PERL_SYS_TERM>

Provides system-specific clean up of the C runtime environment after
running Perl interpreters.  This should be called only once, after
freeing any remaining Perl interpreters.

	void	PERL_SYS_TERM()

=for hackers
Found in file perl.h

=item quadmath_format_needed
X<quadmath_format_needed>

C<quadmath_format_needed()> returns true if the C<format> string seems to
contain at least one non-Q-prefixed C<%[efgaEFGA]> format specifier,
or returns false otherwise.

The format specifier detection is not complete printf-syntax detection,
but it should catch most common cases.

If true is returned, those arguments B<should> in theory be processed
with C<quadmath_snprintf()>, but in case there is more than one such
format specifier (see L</quadmath_format_single>), and if there is
anything else beyond that one (even just a single byte), they
B<cannot> be processed because C<quadmath_snprintf()> is very strict,
accepting only one format spec, and nothing else.
In this case, the code should probably fail.

	bool	quadmath_format_needed(const char* format)

=for hackers
Found in file util.c

=item quadmath_format_single
X<quadmath_format_single>

C<quadmath_snprintf()> is very strict about its C<format> string and will
fail, returning -1, if the format is invalid.  It accepts exactly
one format spec.

C<quadmath_format_single()> checks that the intended single spec looks
sane: begins with C<%>, has only one C<%>, ends with C<[efgaEFGA]>,
and has C<Q> before it.  This is not a full "printf syntax check",
just the basics.

Returns the format if it is valid, NULL if not.

C<quadmath_format_single()> can and will actually patch in the missing
C<Q>, if necessary.  In this case it will return the modified copy of
the format, B<which the caller will need to free.>

See also L</quadmath_format_needed>.

	const char* quadmath_format_single(const char* format)

=for hackers
Found in file util.c

=item READ_XDIGIT
X<READ_XDIGIT>

Returns the value of an ASCII-range hex digit and advances the string pointer.
Behaviour is only well defined when isXDIGIT(*str) is true.
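
The read-and-advance behaviour can be sketched with the C library alone.
C<sketch_read_xdigit> and C<parse_hex> are hypothetical names; the stand-in
is a function taking a pointer to the string pointer, whereas the real
macro is applied to the pointer variable itself.

```c
#include <ctype.h>

/* Hypothetical stand-in, not the handy.h macro: returns the value of the
 * hex digit at *p and advances p.  Only meaningful when isxdigit(*p). */
static unsigned char sketch_read_xdigit(const char **p)
{
    unsigned char c = (unsigned char)*(*p)++;
    if (isdigit(c))
        return (unsigned char)(c - '0');
    return (unsigned char)(tolower(c) - 'a' + 10);
}

/* Accumulate leading hex digits of s; stops at the first non-hex byte. */
static unsigned long parse_hex(const char *s)
{
    unsigned long value = 0;
    while (isxdigit((unsigned char)*s))
        value = (value << 4) | sketch_read_xdigit(&s);
    return value;
}
```

The loop guards every call with a hex-digit test, matching the requirement
that the digit be valid before it is read.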

	U8	READ_XDIGIT(char* str)

=for hackers
Found in file handy.h

=item rninstr
X<rninstr>

Like C<L</ninstr>>, but instead finds the final (rightmost) occurrence of a
sequence of bytes within another sequence, returning C<NULL> if there is no
such occurrence.

	char *	rninstr(char * big, char * bigend,
		        char * little, char * little_end)

=for hackers
Found in file util.c

=item strEQ
X<strEQ>

Test two C<NUL>-terminated strings to see if they are equal.  Returns true or
false.

	bool	strEQ(char* s1, char* s2)

=for hackers
Found in file handy.h

=item strGE
X<strGE>

Test two C<NUL>-terminated strings to see if the first, C<s1>, is greater than
or equal to the second, C<s2>.  Returns true or false.

	bool	strGE(char* s1, char* s2)

=for hackers
Found in file handy.h

=item strGT
X<strGT>

Test two C<NUL>-terminated strings to see if the first, C<s1>, is greater than
the second, C<s2>.  Returns true or false.

	bool	strGT(char* s1, char* s2)

=for hackers
Found in file handy.h

=item strLE
X<strLE>

Test two C<NUL>-terminated strings to see if the first, C<s1>, is less than or
equal to the second, C<s2>.  Returns true or false.

	bool	strLE(char* s1, char* s2)

=for hackers
Found in file handy.h

=item strLT
X<strLT>

Test two C<NUL>-terminated strings to see if the first, C<s1>, is less than the
second, C<s2>.  Returns true or false.

	bool	strLT(char* s1, char* s2)

=for hackers
Found in file handy.h

=item strNE
X<strNE>

Test two C<NUL>-terminated strings to see if they are different.  Returns true
or false.

	bool	strNE(char* s1, char* s2)

=for hackers
Found in file handy.h

=item strnEQ
X<strnEQ>

Test two C<NUL>-terminated strings to see if they are equal.  The C<len>
parameter indicates the number of bytes to compare.  Returns true or false.  (A
wrapper for C<strncmp>).
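
A common use of a length-limited comparison is a prefix test.  The sketch
below assumes the macro behaves like C<strncmp(...) == 0>;
C<SKETCH_STRN_EQ> and C<has_prefix> are hypothetical names, not the
F<handy.h> definitions.

```c
#include <string.h>

/* Hypothetical stand-in, assumed to mirror a strncmp()-based equality test. */
#define SKETCH_STRN_EQ(s1, s2, len) (strncmp((s1), (s2), (len)) == 0)

/* True if name starts with the given prefix. */
static int has_prefix(const char *name, const char *prefix)
{
    return SKETCH_STRN_EQ(name, prefix, strlen(prefix));
}
```

An empty prefix compares zero bytes, so it matches every name.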

	bool	strnEQ(char* s1, char* s2, STRLEN len)

=for hackers
Found in file handy.h

=item strnNE
X<strnNE>

Test two C<NUL>-terminated strings to see if they are different.  The C<len>
parameter indicates the number of bytes to compare.  Returns true or false.  (A
wrapper for C<strncmp>).

	bool	strnNE(char* s1, char* s2, STRLEN len)

=for hackers
Found in file handy.h

=item sv_destroyable
X<sv_destroyable>

Dummy routine which reports that object can be destroyed when there is no
sharing module present.  It ignores its single SV argument, and returns
'true'.  Exists to avoid a test for a C<NULL> function pointer and because it
could potentially warn under some level of strictness.

	bool	sv_destroyable(SV *sv)

=for hackers
Found in file util.c

=item sv_nosharing
X<sv_nosharing>

Dummy routine which "shares" an SV when there is no sharing module present.
Or "locks" it.  Or "unlocks" it.  In other
words, ignores its single SV argument.
Exists to avoid a test for a C<NULL> function pointer and because it could
potentially warn under some level of strictness.

	void	sv_nosharing(SV *sv)

=for hackers
Found in file util.c

=item vmess
X<vmess>

C<pat> and C<args> are a sprintf-style format pattern and encapsulated
argument list, respectively.  These are used to generate a string message.
If the message does not end with a newline, then it will be extended with
some indication of the current location in the code, as described for
L</mess_sv>.

Normally, the resulting message is returned in a new mortal SV.
During global destruction a single SV may be shared between uses of
this function.

	SV *	vmess(const char *pat, va_list *args)

=for hackers
Found in file util.c


=back

=head1 MRO Functions

These functions are related to the method resolution order of perl classes.


=over 8

=item mro_get_linear_isa
X<mro_get_linear_isa>

Returns the mro linearisation for the given stash.  By default, this
will be whatever C<mro_get_linear_isa_dfs> returns unless some
other MRO is in effect for the stash.  The return value is a
read-only AV*.

You are responsible for C<SvREFCNT_inc()> on the
return value if you plan to store it anywhere
semi-permanently (otherwise it might be deleted
out from under you the next time the cache is
invalidated).

	AV*	mro_get_linear_isa(HV* stash)

=for hackers
Found in file mro_core.c

=item mro_method_changed_in
X<mro_method_changed_in>

Invalidates method caching on any child classes
of the given stash, so that they might notice
the changes in this one.

Ideally, all instances of C<PL_sub_generation++> in
perl source outside of F<mro.c> should be
replaced by calls to this.

Perl automatically handles most of the common
ways a method might be redefined.  However, there
are a few ways you could change a method in a stash
without the cache code noticing, in which case you
need to call this method afterwards:

1) Directly manipulating the stash HV entries from
XS code.

2) Assigning a reference to a readonly scalar
constant into a stash entry in order to create
a constant subroutine (like F<constant.pm>
does).

This same method is available from pure perl
via C<mro::method_changed_in(classname)>.

	void	mro_method_changed_in(HV* stash)

=for hackers
Found in file mro_core.c

=item mro_register
X<mro_register>

Registers a custom mro plugin.  See L<perlmroapi> for details.

	void	mro_register(const struct mro_alg *mro)

=for hackers
Found in file mro_core.c


=back

=head1 Multicall Functions

=over 8

=item dMULTICALL
X<dMULTICALL>

Declare local variables for a multicall.  See L<perlcall/LIGHTWEIGHT CALLBACKS>.

		dMULTICALL;

=for hackers
Found in file cop.h

=item MULTICALL
X<MULTICALL>

Make a lightweight callback.  See L<perlcall/LIGHTWEIGHT CALLBACKS>.

		MULTICALL;

=for hackers
Found in file cop.h

=item POP_MULTICALL
X<POP_MULTICALL>

Closing bracket for a lightweight callback.
See L<perlcall/LIGHTWEIGHT CALLBACKS>.

		POP_MULTICALL;

=for hackers
Found in file cop.h

=item PUSH_MULTICALL
X<PUSH_MULTICALL>

Opening bracket for a lightweight callback.
See L<perlcall/LIGHTWEIGHT CALLBACKS>.

		PUSH_MULTICALL;

=for hackers
Found in file cop.h


=back

=head1 Numeric functions

=over 8

=item grok_bin
X<grok_bin>

Converts a string representing a binary number to numeric form.

On entry C<start> and C<*len_p> give the string to scan, C<*flags> gives
conversion flags, and C<result> should be C<NULL> or a pointer to an NV.
The scan stops at the end of the string, or the first invalid character.
Unless C<PERL_SCAN_SILENT_ILLDIGIT> is set in C<*flags>, encountering an
invalid character will also trigger a warning.
On return C<*len_p> is set to the length of the scanned string,
and C<*flags> gives output flags.

If the value is <= C<UV_MAX> it is returned as a UV, the output flags are clear,
and nothing is written to C<*result>.  If the value is > C<UV_MAX>, C<grok_bin>
returns C<UV_MAX>, sets C<PERL_SCAN_GREATER_THAN_UV_MAX> in the output flags,
and writes the value to C<*result> (or the value is discarded if C<result>
is C<NULL>).

The binary number may optionally be prefixed with C<"0b"> or C<"b"> unless
C<PERL_SCAN_DISALLOW_PREFIX> is set in C<*flags> on entry.  If
C<PERL_SCAN_ALLOW_UNDERSCORES> is set in C<*flags> then the binary
number may use C<"_"> characters to separate digits.

	UV	grok_bin(const char* start, STRLEN* len_p,
		         I32* flags, NV *result)

=for hackers
Found in file numeric.c
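
The scanning contract described for C<grok_bin> can be illustrated with a
stand-alone sketch.  It is not Perl's implementation and does not use the
Perl headers: C<sketch_grok_bin>, the C<SK_*> flag names and their values,
and the use of C<unsigned long long> and C<double> in place of UV and NV are
all assumptions made for illustration.  Warnings are omitted, and the rule
that an underscore must be followed by a digit is likewise an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-ins for the PERL_SCAN_* flags; values are arbitrary. */
#define SK_ALLOW_UNDERSCORES 0x01u
#define SK_DISALLOW_PREFIX   0x02u
#define SK_GREATER_THAN_MAX  0x04u   /* output flag only */

/* unsigned long long stands in for UV, double for NV. */
unsigned long long
sketch_grok_bin(const char *start, size_t *len_p, unsigned *flags,
                double *result)
{
    const char *s, *end;
    unsigned in_flags;
    unsigned long long value = 0;
    double big = 0.0;
    int overflowed = 0;

    assert(start != NULL && len_p != NULL && flags != NULL);
    s = start;
    end = start + *len_p;
    in_flags = *flags;

    /* Skip an optional "0b" or "b" prefix unless the caller forbids it. */
    if (!(in_flags & SK_DISALLOW_PREFIX)) {
        if (end - s >= 2 && s[0] == '0' && s[1] == 'b')
            s += 2;
        else if (end - s >= 1 && s[0] == 'b')
            s += 1;
    }

    for (; s < end; s++) {
        if (*s == '0' || *s == '1') {
            unsigned bit = (unsigned)(*s - '0');
            if (!overflowed) {
                if (value <= (ULLONG_MAX >> 1)) {
                    value = (value << 1) | bit;
                    continue;
                }
                /* The next shift would not fit: switch to floating point. */
                overflowed = 1;
                big = (double)value;
            }
            big = big * 2.0 + (double)bit;
            continue;
        }
        /* Assumption: an underscore counts only when a digit follows it. */
        if (*s == '_' && (in_flags & SK_ALLOW_UNDERSCORES)
            && s + 1 < end && (s[1] == '0' || s[1] == '1'))
            continue;
        break;   /* first invalid character ends the scan */
    }

    *len_p = (size_t)(s - start);   /* length actually consumed */
    if (overflowed) {
        *flags = SK_GREATER_THAN_MAX;
        if (result != NULL)
            *result = big;          /* discarded when result is NULL */
        return ULLONG_MAX;
    }
    *flags = 0;                     /* output flags clear */
    return value;
}
```

Called on the five-byte string C<0b101> with no input flags, the sketch
returns 5, leaves the length at 5 and clears the flags.  Called on C<102>,
it returns 2 and sets the length to 2, because the scan stops at the first
invalid character.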

=item grok_hex
X<grok_hex>

Converts a string representing a hex number to numeric form.

On entry C<start> and C<*len_p> give the string to scan, C<*flags> gives
conversion flags, and C<result> should be C<NULL> or a pointer to an NV.
The scan stops at the end of the string, or the first invalid character.
Unless C<PERL_SCAN_SILENT_ILLDIGIT> is set in C<*flags>, encountering an
invalid character will also trigger a warning.
On return C<*len_p> is set to the length of the scanned string,
and C<*flags> gives output flags.

If the value is <= C<UV_MAX> it is returned as a UV, the output flags are clear,
and nothing is written to C<*result>.  If the value is > C<UV_MAX>, C<grok_hex>
returns C<UV_MAX>, sets C<PERL_SCAN_GREATER_THAN_UV_MAX> in the output flags,
and writes the value to C<*result> (or the value is discarded if C<result>
is C<NULL>).

The hex number may optionally be prefixed with C<"0x"> or C<"x"> unless
C<PERL_SCAN_DISALLOW_PREFIX> is set in C<*flags> on entry.  If
C<PERL_SCAN_ALLOW_UNDERSCORES> is set in C<*flags> then the hex
number may use C<"_"> characters to separate digits.

	UV	grok_hex(const char* start, STRLEN* len_p,
		         I32* flags, NV *result)

=for hackers
Found in file numeric.c

=item grok_infnan
X<grok_infnan>

Helper for C<grok_number()>, accepts various ways of spelling "infinity"
or "not a number", and returns one of the following flag combinations:

  IS_NUMBER_INFINITY
  IS_NUMBER_NAN
  IS_NUMBER_INFINITY | IS_NUMBER_NEG
  IS_NUMBER_NAN | IS_NUMBER_NEG
  0

possibly |-ed with C<IS_NUMBER_TRAILING>.

If an infinity or a not-a-number is recognized, C<*sp> will point to
one byte past the end of the recognized string.  If the recognition fails,
zero is returned, and C<*sp> will not move.

	int	grok_infnan(const char** sp, const char *send)

=for hackers
Found in file numeric.c

=item grok_number
X<grok_number>

Identical to C<grok_number_flags()> with C<flags> set to zero.

	int	grok_number(const char *pv, STRLEN len,
		            UV *valuep)

=for hackers
Found in file numeric.c

=item grok_number_flags
X<grok_number_flags>

Recognise (or not) a number.  The return value is 0 if no number is
recognised; otherwise it is a bit-ORed combination of
C<IS_NUMBER_IN_UV>, C<IS_NUMBER_GREATER_THAN_UV_MAX>, C<IS_NUMBER_NOT_INT>,
C<IS_NUMBER_NEG>, C<IS_NUMBER_INFINITY>, C<IS_NUMBER_NAN> (defined in F<perl.h>).

If the value of the number can fit in a UV, it is returned in C<*valuep>,
and C<IS_NUMBER_IN_UV> is set to indicate that C<*valuep> is valid.
C<IS_NUMBER_IN_UV> will never be set unless C<*valuep> is valid, but C<*valuep>
may have been assigned to during processing even though C<IS_NUMBER_IN_UV> is
not set on return.
If C<valuep> is C<NULL>, C<IS_NUMBER_IN_UV> will be set for the same cases as when
C<valuep> is non-C<NULL>, but no actual assignment (or SEGV) will occur.

C<IS_NUMBER_NOT_INT> will be set with C<IS_NUMBER_IN_UV> if trailing decimals were
seen (in which case C<*valuep> gives the true value truncated to an integer), and
C<IS_NUMBER_NEG> if the number is negative (in which case C<*valuep> holds the
absolute value).  C<IS_NUMBER_IN_UV> is not set if e notation was used or the
number is larger than a UV.

C<flags> allows only C<PERL_SCAN_TRAILING>, which allows for trailing
non-numeric text on an otherwise successful I<grok>, setting
C<IS_NUMBER_TRAILING> on the result.

	int	grok_number_flags(const char *pv, STRLEN len,
		                  UV *valuep, U32 flags)

=for hackers
Found in file numeric.c
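
As an illustration of how a caller might interpret these result bits, the
following stand-alone sketch turns a (type, absolute value) pair into a
signed integer.  It does not include the Perl headers: the C<SK_*> macros
are stand-ins for the C<IS_NUMBER_*> bits, C<sketch_exact_integer> is a
hypothetical helper, and C<unsigned long long> stands in for UV.  Real code
should take the constants from F<perl.h>.

```c
#include <limits.h>
#include <stddef.h>

/* Stand-ins for the IS_NUMBER_* bits; real code takes them from perl.h. */
#define SK_IN_UV    0x01
#define SK_NOT_INT  0x04
#define SK_NEG      0x08

/* Returns 1 and stores the signed value in *out when 'type' describes an
 * exact integer that fits in a long long; returns 0 otherwise. */
int
sketch_exact_integer(int type, unsigned long long abs_value, long long *out)
{
    const unsigned long long max_pos = (unsigned long long)LLONG_MAX;

    /* The value is usable only if IN_UV is set; NOT_INT means that
     * abs_value is a truncation of the true value. */
    if (out == NULL || !(type & SK_IN_UV) || (type & SK_NOT_INT))
        return 0;

    if (type & SK_NEG) {
        /* abs_value holds the absolute value of a negative number. */
        if (abs_value > max_pos + 1ULL)
            return 0;
        *out = (abs_value == max_pos + 1ULL) ? LLONG_MIN
                                             : -(long long)abs_value;
        return 1;
    }

    if (abs_value > max_pos)
        return 0;
    *out = (long long)abs_value;
    return 1;
}
```

With C<SK_IN_UV | SK_NEG> and an absolute value of 42 the helper stores -42.
With C<SK_IN_UV | SK_NOT_INT> it reports failure, since the stored value
would only be the integer part.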

=item grok_numeric_radix
X<grok_numeric_radix>

Scans for and skips a numeric decimal separator (radix).

	bool	grok_numeric_radix(const char **sp,
		                   const char *send)

=for hackers
Found in file numeric.c

=item grok_oct
X<grok_oct>

Converts a string representing an octal number to numeric form.

On entry C<start> and C<*len_p> give the string to scan, C<*flags> gives
conversion flags, and C<result> should be C<NULL> or a pointer to an NV.
The scan stops at the end of the string, or the first invalid character.
Unless C<PERL_SCAN_SILENT_ILLDIGIT> is set in C<*flags>, encountering an
8 or 9 will also trigger a warning.
On return C<*len_p> is set to the length of the scanned string,
and C<*flags> gives output flags.

If the value is <= C<UV_MAX> it is returned as a UV, the output flags are clear,
and nothing is written to C<*result>.  If the value is > C<UV_MAX>, C<grok_oct>
returns C<UV_MAX>, sets C<PERL_SCAN_GREATER_THAN_UV_MAX> in the output flags,
and writes the value to C<*result> (or the value is discarded if C<result>
is C<NULL>).

If C<PERL_SCAN_ALLOW_UNDERSCORES> is set in C<*flags> then the octal
number may use C<"_"> characters to separate digits.

	UV	grok_oct(const char* start, STRLEN* len_p,
		         I32* flags, NV *result)

=for hackers
Found in file numeric.c

=item isinfnan
X<isinfnan>

C<Perl_isinfnan()> is a utility function that returns true if the NV
argument is either an infinity or a C<NaN>, false otherwise.  To test
in more detail, use C<Perl_isinf()> and C<Perl_isnan()>.

This is also the logical inverse of C<Perl_isfinite()>.

	bool	isinfnan(NV nv)

=for hackers
Found in file numeric.c
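
The relationship to the finiteness test can be sketched in plain C99.  The
helper name C<sketch_isinfnan> is hypothetical, C<double> stands in for NV,
and the standard F<math.h> classification macros are used in place of the
C<Perl_*> wrappers, so this only illustrates the logic.

```c
#include <math.h>

/* True (1) for an infinity of either sign or for a NaN, false (0) for
 * every finite value; equivalently, the inverse of isfinite(). */
int
sketch_isinfnan(double nv)
{
    return isinf(nv) || isnan(nv);
}
```

For any C<double> value the result of this helper agrees with
C<!isfinite(nv)>.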

=item Perl_signbit
X<Perl_signbit>


NOTE: this function is experimental and may change or be
removed without notice.


Return a non-zero integer if the sign bit on an NV is set, and 0 if
it is not.  

If F<Configure> detects this system has a C<signbit()> that will work with
our NVs, then we just use it via the C<#define> in F<perl.h>.  Otherwise,
fall back on this implementation.  The main use of this function
is catching C<-0.0>.

C<Configure> notes:  This function is called C<'Perl_signbit'> instead of a
plain C<'signbit'> because it is easy to imagine a system having a C<signbit()>
function or macro that doesn't happen to work with our particular choice
of NVs.  We shouldn't just re-C<#define> C<signbit> as C<Perl_signbit> and expect
the standard system headers to be happy.  Also, this is a no-context
function (no C<pTHX_>) because C<Perl_signbit()> is usually re-C<#defined> in
F<perl.h> as a simple macro call to the system's C<signbit()>.
Users should just always call C<Perl_signbit()>.

	int	Perl_signbit(NV f)

=for hackers
Found in file numeric.c
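
To see why a sign-bit test is needed to catch C<-0.0>, consider the
following stand-alone sketch.  It is not the fallback used by Perl: it
assumes that C<double> is a 64-bit IEEE 754 value whose byte order matches
that of C<uint64_t>, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if the sign bit of f is set, 0 otherwise.
 * Assumption: double is a 64-bit IEEE 754 value. */
int
sketch_signbit(double f)
{
    uint64_t bits;

    assert(sizeof bits == sizeof f);
    memcpy(&bits, &f, sizeof bits);   /* inspect the representation */
    return (int)(bits >> 63);         /* the top bit is the sign */
}

/* -0.0 compares equal to 0.0, so only the sign bit tells them apart. */
int
sketch_is_negative_zero(double f)
{
    return f == 0.0 && sketch_signbit(f);
}
```

An ordinary comparison cannot make this distinction, because C<-0.0 == 0.0>
is true; the sign bit is the only difference between the two values.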

=item scan_bin
X<scan_bin>

For backwards compatibility.  Use C<grok_bin> instead.

	NV	scan_bin(const char* start, STRLEN len,
		         STRLEN* retlen)

=for hackers
Found in file numeric.c

=item scan_hex
X<scan_hex>

For backwards compatibility.  Use C<grok_hex> instead.

	NV	scan_hex(const char* start, STRLEN len,
		         STRLEN* retlen)

=for hackers
Found in file numeric.c

=item scan_oct
X<scan_oct>

For backwards compatibility.  Use C<grok_oct> instead.

	NV	scan_oct(const char* start, STRLEN len,
		         STRLEN* retlen)

=for hackers
Found in file numeric.c


=back

=head1 Obsolete backwards compatibility functions

Some of these are also deprecated.  You can exclude these from
your compiled Perl by adding this option to Configure:
C<-Accflags='-DNO_MATHOMS'>


=over 8

=item custom_op_desc
X<custom_op_desc>

Return the description of a given custom op.  This was once used by the
C<OP_DESC> macro, but is no longer: it has only been kept for
compatibility, and should not be used.

	const char * custom_op_desc(const OP *o)

=for hackers
Found in file mathoms.c

=item custom_op_name
X<custom_op_name>

Return the name for a given custom op.  This was once used by the C<OP_NAME>
macro, but is no longer: it has only been kept for compatibility, and
should not be used.

	const char * custom_op_name(const OP *o)

=for hackers
Found in file mathoms.c

=item gv_fetchmethod
X<gv_fetchmethod>

See L</gv_fetchmethod_autoload>.

	GV*	gv_fetchmethod(HV* stash, const char* name)

=for hackers
Found in file mathoms.c

=item is_utf8_char
X<is_utf8_char>


DEPRECATED!  It is planned to remove this function from a
future release of Perl.  Do not use it for new code; remove it from
existing code.


Tests if some arbitrary number of bytes begins in a valid UTF-8
character.  Note that an INVARIANT (i.e. ASCII on non-EBCDIC machines)
character is a valid UTF-8 character.  The actual number of bytes in the UTF-8
character will be returned if it is valid, otherwise 0.

This function is deprecated due to the possibility that malformed input could
cause reading beyond the end of the input buffer.  Use L</isUTF8_CHAR>
instead.

	STRLEN	is_utf8_char(const U8 *s)

=for hackers
Found in file mathoms.c

=item is_utf8_char_buf
X<is_utf8_char_buf>

This is identical to the macro L</isUTF8_CHAR>.

	STRLEN	is_utf8_char_buf(const U8 *buf,
		                 const U8 *buf_end)

=for hackers
Found in file mathoms.c
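
The value of passing the end of the buffer can be shown with a simplified,
stand-alone sketch.  It is not Perl's algorithm: it assumes standard UTF-8
on an ASCII platform with characters of at most four bytes, it does not
reject overlong encodings or surrogates, and C<sketch_utf8_char_len> is a
hypothetical name.  The point is only that the length implied by the first
byte is checked against C<buf_end> before any further byte is read.

```c
#include <stddef.h>

/* Returns the length in bytes of the character starting at buf, or 0 if
 * the bytes before buf_end do not form a complete, well-formed sequence. */
size_t
sketch_utf8_char_len(const unsigned char *buf, const unsigned char *buf_end)
{
    size_t need, i;

    if (buf >= buf_end)
        return 0;                       /* empty input */

    if (buf[0] < 0x80)
        return 1;                       /* invariant (ASCII) character */
    else if ((buf[0] & 0xE0) == 0xC0)
        need = 2;
    else if ((buf[0] & 0xF0) == 0xE0)
        need = 3;
    else if ((buf[0] & 0xF8) == 0xF0)
        need = 4;
    else
        return 0;                       /* not a valid start byte */

    /* Refuse truncated input instead of reading past the buffer. */
    if ((size_t)(buf_end - buf) < need)
        return 0;

    for (i = 1; i < need; i++)
        if ((buf[i] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */

    return need;
}
```

A function that receives only the start pointer has no way to perform the
truncation check, which is the hazard described for C<is_utf8_char>.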

=item pack_cat
X<pack_cat>

The engine implementing the C<pack()> Perl function.  Note: parameters
C<next_in_list> and C<flags> are not used.  This call should not be used; use
C<packlist> instead.

	void	pack_cat(SV *cat, const char *pat,
		         const char *patend, SV **beglist,
		         SV **endlist, SV ***next_in_list,
		         U32 flags)

=for hackers
Found in file mathoms.c

=item pad_compname_type
X<pad_compname_type>

Looks up the type of the lexical variable at position C<po> in the
currently-compiling pad.  If the variable is typed, the stash of the
class to which it is typed is returned.  If not, C<NULL> is returned.

	HV *	pad_compname_type(PADOFFSET po)

=for hackers
Found in file mathoms.c

=item sv_2pvbyte_nolen
X<sv_2pvbyte_nolen>

Return a pointer to the byte-encoded representation of the SV.
May cause the SV to be downgraded from UTF-8 as a side-effect.

Usually accessed via the C<SvPVbyte_nolen> macro.

	char*	sv_2pvbyte_nolen(SV* sv)

=for hackers
Found in file mathoms.c

=item sv_2pvutf8_nolen
X<sv_2pvutf8_nolen>

Return a pointer to the UTF-8-encoded representation of the SV.
May cause the SV to be upgraded to UTF-8 as a side-effect.

Usually accessed via the C<SvPVutf8_nolen> macro.

	char*	sv_2pvutf8_nolen(SV* sv)

=for hackers
Found in file mathoms.c

=item sv_2pv_nolen
X<sv_2pv_nolen>

Like C<sv_2pv()>, but doesn't return the length too.  You should usually
use the macro wrapper C<SvPV_nolen(sv)> instead.

	char*	sv_2pv_nolen(SV* sv)

=for hackers
Found in file mathoms.c

=item sv_catpvn_mg
X<sv_catpvn_mg>

Like C<sv_catpvn>, but also handles 'set' magic.

	void	sv_catpvn_mg(SV *sv, const char *ptr,
		             STRLEN len)

=for hackers
Found in file mathoms.c

=item sv_catsv_mg
X<sv_catsv_mg>

Like C<sv_catsv>, but also handles 'set' magic.

	void	sv_catsv_mg(SV *dsv, SV *ssv)

=for hackers
Found in file mathoms.c

=item sv_force_normal
X<sv_force_normal>

Undo various types of fakery on an SV: if the PV is a shared string, make
a private copy; if we're a ref, stop refing; if we're a glob, downgrade to
an C<xpvmg>.  See also C<L</sv_force_normal_flags>>.

	void	sv_force_normal(SV *sv)

=for hackers
Found in file mathoms.c

=item sv_iv
X<sv_iv>

A private implementation of the C<SvIVx> macro for compilers which can't
cope with complex macro expressions.  Always use the macro instead.

	IV	sv_iv(SV* sv)

=for hackers
Found in file mathoms.c

=item sv_nolocking
X<sv_nolocking>

Dummy routine which "locks" an SV when there is no locking module present.
Exists to avoid a test for a C<NULL> function pointer and because it could
potentially warn under some level of strictness.

"Superseded" by C<sv_nosharing()>.

	void	sv_nolocking(SV *sv)

=for hackers
Found in file mathoms.c

=item sv_nounlocking
X<sv_nounlocking>

Dummy routine which "unlocks" an SV when there is no locking module present.
Exists to avoid a test for a C<NULL> function pointer and because it could
potentially warn under some level of strictness.

"Superseded" by C<sv_nosharing()>.

	void	sv_nounlocking(SV *sv)

=for hackers
Found in file mathoms.c

=item sv_nv
X<sv_nv>

A private implementation of the C<SvNVx> macro for compilers which can't
cope with complex macro expressions.  Always use the macro instead.

	NV	sv_nv(SV* sv)

=for hackers
Found in file mathoms.c

=item sv_pv
X<sv_pv>

Use the C<SvPV_nolen> macro instead.

	char*	sv_pv(SV *sv)

=for hackers
Found in file mathoms.c

=item sv_pvbyte
X<sv_pvbyte>

Use C<SvPVbyte_nolen> instead.

	char*	sv_pvbyte(SV *sv)

=for hackers
Found in file mathoms.c

=item sv_pvbyten
X<sv_pvbyten>

A private implementation of the C<SvPVbyte> macro for compilers
which can't cope with complex macro expressions.  Always use the macro
instead.

	char*	sv_pvbyten(SV *sv, STRLEN *lp)

=for hackers
Found in file mathoms.c

=item sv_pvn
X<sv_pvn>

A private implementation of the C<SvPV> macro for compilers which can't
cope with complex macro expressions.  Always use the macro instead.

	char*	sv_pvn(SV *sv, STRLEN *lp)

=for hackers
Found in file mathoms.c

=item sv_pvutf8
X<sv_pvutf8>

Use the C<SvPVutf8_nolen> macro instead.

	char*	sv_pvutf8(SV *sv)

=for hackers
Found in file mathoms.c

=item sv_pvutf8n
X<sv_pvutf8n>

A private implementation of the C<SvPVutf8> macro for compilers
which can't cope with complex macro expressions.  Always use the macro
instead.

	char*	sv_pvutf8n(SV *sv, STRLEN *lp)

=for hackers
Found in file mathoms.c

=item sv_taint
X<sv_taint>

Taint an SV.  Use C<SvTAINTED_on> instead.

	void	sv_taint(SV* sv)

=for hackers
Found in file mathoms.c

=item sv_unref
X<sv_unref>

Unsets the RV status of the SV, and decrements the reference count of
whatever was being referenced by the RV.  This can almost be thought of
as a reversal of C<newSVrv>.  This is C<sv_unref_flags> with the C<flag>
being zero.  See C<L</SvROK_off>>.

	void	sv_unref(SV* sv)

=for hackers
Found in file mathoms.c

=item sv_usepvn
X<sv_usepvn>

Tells an SV to use C<ptr> to find its string value.  Implemented by
calling C<sv_usepvn_flags> with C<flags> of 0, hence does not handle 'set'
magic.  See C<L</sv_usepvn_flags>>.

	void	sv_usepvn(SV* sv, char* ptr, STRLEN len)

=for hackers
Found in file mathoms.c

=item sv_usepvn_mg
X<sv_usepvn_mg>

Like C<sv_usepvn>, but also handles 'set' magic.

	void	sv_usepvn_mg(SV *sv, char *ptr, STRLEN len)

=for hackers
Found in file mathoms.c

=item sv_uv
X<sv_uv>

A private implementation of the C<SvUVx> macro for compilers which can't
cope with complex macro expressions.  Always use the macro instead.

	UV	sv_uv(SV* sv)

=for hackers
Found in file mathoms.c

=item unpack_str
X<unpack_str>

The engine implementing the C<unpack()> Perl function.  Note: parameters
C<strbeg>, C<new_s> and C<ocnt> are not used.  This call should not be used;
use C<unpackstring> instead.

	I32	unpack_str(const char *pat, const char *patend,
		           const char *s, const char *strbeg,
		           const char *strend, char **new_s,
		           I32 ocnt, U32 flags)

=for hackers
Found in file mathoms.c

=item utf8_to_uvchr
X<utf8_to_uvchr>


DEPRECATED!  It is planned to remove this function from a
future release of Perl.  Do not use it for new code; remove it from
existing code.


Returns the native code point of the first character in the string C<s>
which is assumed to be in UTF-8 encoding; C<retlen> will be set to the
length, in bytes, of that character.

Some, but not all, UTF-8 malformations are detected, and in fact, some
malformed input could cause reading beyond the end of the input buffer, which
is why this function is deprecated.  Use L</utf8_to_uvchr_buf> instead.

If C<s> points to one of the detected malformations, and UTF8 warnings are
enabled, zero is returned and C<*retlen> is set (if C<retlen> isn't
C<NULL>) to -1.  If those warnings are off, the computed value, if well-defined (or
the Unicode REPLACEMENT CHARACTER, if not), is silently returned, and C<*retlen>
is set (if C<retlen> isn't C<NULL>) so that (S<C<s> + C<*retlen>>) is the
next possible position in C<s> that could begin a non-malformed character.
See L</utf8n_to_uvchr> for details on when the REPLACEMENT CHARACTER is returned.

	UV	utf8_to_uvchr(const U8 *s, STRLEN *retlen)

=for hackers
Found in file mathoms.c

=item utf8_to_uvuni
X<utf8_to_uvuni>


DEPRECATED!  It is planned to remove this function from a
future release of Perl.  Do not use it for new code; remove it from
existing code.


Returns the Unicode code point of the first character in the string C<s>
which is assumed to be in UTF-8 encoding; C<retlen> will be set to the
length, in bytes, of that character.

Some, but not all, UTF-8 malformations are detected, and in fact, some
malformed input could cause reading beyond the end of the input buffer, which
is one reason why this function is deprecated.  The other is that only in
extremely limited circumstances should the Unicode versus native code point be
of any interest to you.  See L</utf8_to_uvuni_buf> for alternatives.

If C<s> points to one of the detected malformations, and UTF8 warnings are
enabled, zero is returned and C<*retlen> is set (if C<retlen> isn't
C<NULL>) to -1.  If those warnings are off, the computed value, if well-defined (or
the Unicode REPLACEMENT CHARACTER, if not), is silently returned, and C<*retlen>
is set (if C<retlen> isn't C<NULL>) so that (S<C<s> + C<*retlen>>) is the
next possible position in C<s> that could begin a non-malformed character.
See L</utf8n_to_uvchr> for details on when the REPLACEMENT CHARACTER is returned.

	UV	utf8_to_uvuni(const U8 *s, STRLEN *retlen)

=for hackers
Found in file mathoms.c


=back

=head1 Optree construction

=over 8

=item newASSIGNOP
X<newASSIGNOP>

Constructs, checks, and returns an assignment op.  C<left> and C<right>
supply the parameters of the assignment; they are consumed by this
function and become part of the constructed op tree.

If C<optype> is C<OP_ANDASSIGN>, C<OP_ORASSIGN>, or C<OP_DORASSIGN>, then
a suitable conditional optree is constructed.  If C<optype> is the opcode
of a binary operator, such as C<OP_BIT_OR>, then an op is constructed that
performs the binary operation and assigns the result to the left argument.
Either way, if C<optype> is non-zero then C<flags> has no effect.

If C<optype> is zero, then a plain scalar or list assignment is
constructed.  The type of assignment is determined automatically.
C<flags> gives the eight bits of C<op_flags>, except that C<OPf_KIDS>
will be set automatically, and, shifted up eight bits, the eight bits
of C<op_private>, except that the bit with value 1 or 2 is automatically
set as required.

	OP *	newASSIGNOP(I32 flags, OP *left, I32 optype,
		            OP *right)

=for hackers
Found in file op.c

=item newBINOP
X<newBINOP>

Constructs, checks, and returns an op of any binary type.  C<type>
is the opcode.  C<flags> gives the eight bits of C<op_flags>, except
that C<OPf_KIDS> will be set automatically, and, shifted up eight bits,
the eight bits of C<op_private>, except that the bit with value 1 or
2 is automatically set as required.  C<first> and C<last> supply up to
two ops to be the direct children of the binary op; they are consumed
by this function and become part of the constructed op tree.

	OP *	newBINOP(I32 type, I32 flags, OP *first,
		         OP *last)

=for hackers
Found in file op.c

=item newCONDOP
X<newCONDOP>

Constructs, checks, and returns a conditional-expression (C<cond_expr>)
op.  C<flags> gives the eight bits of C<op_flags>, except that C<OPf_KIDS>
will be set automatically, and, shifted up eight bits, the eight bits of
C<op_private>, except that the bit with value 1 is automatically set.
C<first> supplies the expression selecting between the two branches,
and C<trueop> and C<falseop> supply the branches; they are consumed by
this function and become part of the constructed op tree.

	OP *	newCONDOP(I32 flags, OP *first, OP *trueop,
		          OP *falseop)

=for hackers
Found in file op.c

=item newDEFSVOP
X<newDEFSVOP>

Constructs and returns an op to access C<$_>.

	OP *	newDEFSVOP()

=for hackers
Found in file op.c

=item newFOROP
X<newFOROP>

Constructs, checks, and returns an op tree expressing a C<foreach>
loop (iteration through a list of values).  This is a heavyweight loop,
with structure that allows exiting the loop by C<last> and suchlike.

C<sv> optionally supplies the variable that will be aliased to each
item in turn; if null, it defaults to C<$_>.
C<expr> supplies the list of values to iterate over.  C<block> supplies
the main body of the loop, and C<cont> optionally supplies a C<continue>
block that operates as a second half of the body.  All of these optree
inputs are consumed by this function and become part of the constructed
op tree.

C<flags> gives the eight bits of C<op_flags> for the C<leaveloop>
op and, shifted up eight bits, the eight bits of C<op_private> for
the C<leaveloop> op, except that (in both cases) some bits will be set
automatically.

	OP *	newFOROP(I32 flags, OP *sv, OP *expr, OP *block,
		         OP *cont)

=for hackers
Found in file op.c

=item newGIVENOP
X<newGIVENOP>

Constructs, checks, and returns an op tree expressing a C<given> block.
C<cond> supplies the expression that will be locally assigned to a lexical
variable, and C<block> supplies the body of the C<given> construct; they
are consumed by this function and become part of the constructed op tree.
C<defsv_off> must be zero (it used to identify the pad slot of lexical C<$_>).

	OP *	newGIVENOP(OP *cond, OP *block,
		           PADOFFSET defsv_off)

=for hackers
Found in file op.c

=item newGVOP
X<newGVOP>

Constructs, checks, and returns an op of any type that involves an
embedded reference to a GV.  C<type> is the opcode.  C<flags> gives the
eight bits of C<op_flags>.  C<gv> identifies the GV that the op should
reference; calling this function does not transfer ownership of any
reference to it.

	OP *	newGVOP(I32 type, I32 flags, GV *gv)

=for hackers
Found in file op.c

=item newLISTOP
X<newLISTOP>

Constructs, checks, and returns an op of any list type.  C<type> is
the opcode.  C<flags> gives the eight bits of C<op_flags>, except that
C<OPf_KIDS> will be set automatically if required.  C<first> and C<last>
supply up to two ops to be direct children of the list op; they are
consumed by this function and become part of the constructed op tree.

For most list operators, the check function expects all the kid ops to be
present already, so calling C<newLISTOP(OP_JOIN, ...)> (e.g.) is not
appropriate.  What you want to do in that case is create an op of type
C<OP_LIST>, append more children to it, and then call L</op_convert_list>.
See L</op_convert_list> for more information.


	OP *	newLISTOP(I32 type, I32 flags, OP *first,
		          OP *last)

=for hackers
Found in file op.c

=item newLOGOP
X<newLOGOP>

Constructs, checks, and returns a logical (flow control) op.  C<type>
is the opcode.  C<flags> gives the eight bits of C<op_flags>, except
that C<OPf_KIDS> will be set automatically, and, shifted up eight bits,
the eight bits of C<op_private>, except that the bit with value 1 is
automatically set.  C<first> supplies the expression controlling the
flow, and C<other> supplies the side (alternate) chain of ops; they are
consumed by this function and become part of the constructed op tree.

	OP *	newLOGOP(I32 type, I32 flags, OP *first,
		         OP *other)

=for hackers
Found in file op.c

=item newLOOPEX
X<newLOOPEX>

Constructs, checks, and returns a loop-exiting op (such as C<goto>
or C<last>).  C<type> is the opcode.  C<label> supplies the parameter
determining the target of the op; it is consumed by this function and
becomes part of the constructed op tree.

	OP *	newLOOPEX(I32 type, OP *label)

=for hackers
Found in file op.c

=item newLOOPOP
X<newLOOPOP>

Constructs, checks, and returns an op tree expressing a loop.  This is
only a loop in the control flow through the op tree; it does not have
the heavyweight loop structure that allows exiting the loop by C<last>
and suchlike.  C<flags> gives the eight bits of C<op_flags> for the
top-level op, except that some bits will be set automatically as required.
C<expr> supplies the expression controlling loop iteration, and C<block>
supplies the body of the loop; they are consumed by this function and
become part of the constructed op tree.  C<debuggable> is currently
unused and should always be 1.

	OP *	newLOOPOP(I32 flags, I32 debuggable, OP *expr,
		          OP *block)

=for hackers
Found in file op.c

=item newMETHOP
X<newMETHOP>

Constructs, checks, and returns an op of method type with a method name
evaluated at runtime.  C<type> is the opcode.  C<flags> gives the eight
bits of C<op_flags>, except that C<OPf_KIDS> will be set automatically,
and, shifted up eight bits, the eight bits of C<op_private>, except that
the bit with value 1 is automatically set.  C<dynamic_meth> supplies an
op which evaluates the method name; it is consumed by this function and
becomes part of the constructed op tree.
Supported optypes: C<OP_METHOD>.

	OP *	newMETHOP(I32 type, I32 flags, OP *first)

=for hackers
Found in file op.c

=item newMETHOP_named
X<newMETHOP_named>

Constructs, checks, and returns an op of method type with a constant
method name.  C<type> is the opcode.  C<flags> gives the eight bits of
C<op_flags>, and, shifted up eight bits, the eight bits of
C<op_private>.  C<const_meth> supplies a constant method name;
it must be a shared COW string.
Supported optypes: C<OP_METHOD_NAMED>.

	OP *	newMETHOP_named(I32 type, I32 flags,
		                SV *const_meth)

=for hackers
Found in file op.c

=item newNULLLIST
X<newNULLLIST>

Constructs, checks, and returns a new C<stub> op, which represents an
empty list expression.

	OP *	newNULLLIST()

=for hackers
Found in file op.c

=item newOP
X<newOP>

Constructs, checks, and returns an op of any base type (any type that
has no extra fields).  C<type> is the opcode.  C<flags> gives the
eight bits of C<op_flags>, and, shifted up eight bits, the eight bits
of C<op_private>.

	OP *	newOP(I32 type, I32 flags)

=for hackers
Found in file op.c
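
The layout of the C<flags> argument used by C<newOP> and the other
constructors in this section is plain bit arithmetic, which the following
stand-alone sketch illustrates.  The helper names are hypothetical,
fixed-width integer types stand in for C<I32> and C<U8>, and no real
C<OPf_*> or C<OPp*> values are used.

```c
#include <stdint.h>

/* Combine the two eight-bit fields into one flags argument: op_flags in
 * the low eight bits, op_private shifted up eight bits. */
int32_t
sketch_pack_flags(uint8_t op_flags, uint8_t op_private)
{
    return (int32_t)op_flags | ((int32_t)op_private << 8);
}

/* Recover the low eight bits (the op_flags part). */
uint8_t
sketch_op_flags_of(int32_t flags)
{
    return (uint8_t)(flags & 0xff);
}

/* Recover the next eight bits (the op_private part). */
uint8_t
sketch_op_private_of(int32_t flags)
{
    return (uint8_t)((flags >> 8) & 0xff);
}
```

For instance, an C<op_flags> byte of 0x03 combined with an C<op_private>
byte of 0x40 gives a C<flags> argument of 0x4003.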

=item newPADOP
X<newPADOP>

Constructs, checks, and returns an op of any type that involves a
reference to a pad element.  C<type> is the opcode.  C<flags> gives the
eight bits of C<op_flags>.  A pad slot is automatically allocated, and
is populated with C<sv>; this function takes ownership of one reference
to it.

This function only exists if Perl has been compiled to use ithreads.

	OP *	newPADOP(I32 type, I32 flags, SV *sv)

=for hackers
Found in file op.c

=item newPMOP
X<newPMOP>

Constructs, checks, and returns an op of any pattern matching type.
C<type> is the opcode.  C<flags> gives the eight bits of C<op_flags>
and, shifted up eight bits, the eight bits of C<op_private>.

	OP *	newPMOP(I32 type, I32 flags)

=for hackers
Found in file op.c

=item newPVOP
X<newPVOP>

Constructs, checks, and returns an op of any type that involves an
embedded C-level pointer (PV).  C<type> is the opcode.  C<flags> gives
the eight bits of C<op_flags>.  C<pv> supplies the C-level pointer, which
must have been allocated using C<PerlMemShared_malloc>; the memory will
be freed when the op is destroyed.

	OP *	newPVOP(I32 type, I32 flags, char *pv)

=for hackers
Found in file op.c

=item newRANGE
X<newRANGE>

Constructs and returns a C<range> op, with subordinate C<flip> and
C<flop> ops.  C<flags> gives the eight bits of C<op_flags> for the
C<flip> op and, shifted up eight bits, the eight bits of C<op_private>
for both the C<flip> and C<range> ops, except that the bit with value
1 is automatically set.  C<left> and C<right> supply the expressions
controlling the endpoints of the range; they are consumed by this function
and become part of the constructed op tree.

	OP *	newRANGE(I32 flags, OP *left, OP *right)

=for hackers
Found in file op.c

=item newSLICEOP
X<newSLICEOP>

Constructs, checks, and returns an C<lslice> (list slice) op.  C<flags>
gives the eight bits of C<op_flags>, except that C<OPf_KIDS> will
be set automatically, and, shifted up eight bits, the eight bits of
C<op_private>, except that the bit with value 1 or 2 is automatically
set as required.  C<listval> and C<subscript> supply the parameters of
the slice; they are consumed by this function and become part of the
constructed op tree.

	OP *	newSLICEOP(I32 flags, OP *subscript,
		           OP *listval)

=for hackers
Found in file op.c

=item newSTATEOP
X<newSTATEOP>

Constructs a state op (COP).  The state op is normally a C<nextstate> op,
but will be a C<dbstate> op if debugging is enabled for currently-compiled
code.  The state op is populated from C<PL_curcop> (or C<PL_compiling>).
If C<label> is non-null, it supplies the name of a label to attach to
the state op; this function takes ownership of the memory pointed at by
C<label>, and will free it.  C<flags> gives the eight bits of C<op_flags>
for the state op.

If C<o> is null, the state op is returned.  Otherwise the state op is
combined with C<o> into a C<lineseq> list op, which is returned.  C<o>
is consumed by this function and becomes part of the returned op tree.

	OP *	newSTATEOP(I32 flags, char *label, OP *o)

=for hackers
Found in file op.c

=item newSVOP
X<newSVOP>

Constructs, checks, and returns an op of any type that involves an
embedded SV.  C<type> is the opcode.  C<flags> gives the eight bits
of C<op_flags>.  C<sv> gives the SV to embed in the op; this function
takes ownership of one reference to it.

	OP *	newSVOP(I32 type, I32 flags, SV *sv)

=for hackers
Found in file op.c

=item newUNOP
X<newUNOP>

Constructs, checks, and returns an op of any unary type.  C<type> is
the opcode.  C<flags> gives the eight bits of C<op_flags>, except that
C<OPf_KIDS> will be set automatically if required, and, shifted up eight
bits, the eight bits of C<op_private>, except that the bit with value 1
is automatically set.  C<first> supplies an optional op to be the direct
child of the unary op; it is consumed by this function and becomes part
of the constructed op tree.

	OP *	newUNOP(I32 type, I32 flags, OP *first)

=for hackers
Found in file op.c

=item newUNOP_AUX
X<newUNOP_AUX>

Similar to C<newUNOP>, but creates a C<UNOP_AUX> struct instead, with C<op_aux>
initialised to C<aux>.

	OP*	newUNOP_AUX(I32 type, I32 flags, OP* first,
		            UNOP_AUX_item *aux)

=for hackers
Found in file op.c

=item newWHENOP
X<newWHENOP>

Constructs, checks, and returns an op tree expressing a C<when> block.
C<cond> supplies the test expression, and C<block> supplies the block
that will be executed if the test evaluates to true; they are consumed
by this function and become part of the constructed op tree.  C<cond>
will be interpreted DWIMically, often as a comparison against C<$_>,
and may be null to generate a C<default> block.

	OP *	newWHENOP(OP *cond, OP *block)

=for hackers
Found in file op.c

=item newWHILEOP
X<newWHILEOP>

Constructs, checks, and returns an op tree expressing a C<while> loop.
This is a heavyweight loop, with structure that allows exiting the loop
by C<last> and suchlike.

C<loop> is an optional preconstructed C<enterloop> op to use in the
loop; if it is null then a suitable op will be constructed automatically.
C<expr> supplies the loop's controlling expression.  C<block> supplies the
main body of the loop, and C<cont> optionally supplies a C<continue> block
that operates as a second half of the body.  All of these optree inputs
are consumed by this function and become part of the constructed op tree.

C<flags> gives the eight bits of C<op_flags> for the C<leaveloop>
op and, shifted up eight bits, the eight bits of C<op_private> for
the C<leaveloop> op, except that (in both cases) some bits will be set
automatically.  C<debuggable> is currently unused and should always be 1.
C<has_my> can be supplied as true to force the
loop body to be enclosed in its own scope.

	OP *	newWHILEOP(I32 flags, I32 debuggable,
		           LOOP *loop, OP *expr, OP *block,
		           OP *cont, I32 has_my)

=for hackers
Found in file op.c
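
The packing of C<flags> described above can be sketched in plain C.  This is
only a toy illustration: the C<toy_*> helper names are hypothetical and are
not part of the perl API, and the sketch ignores the bits that perl sets
automatically.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: combine the two 8-bit fields into one flags word,
 * with op_flags in the low eight bits and op_private shifted up by eight. */
int32_t toy_pack_loop_flags(uint8_t op_flags, uint8_t op_private)
{
    return (int32_t)op_flags | ((int32_t)op_private << 8);
}

/* Hypothetical helper: recover the op_flags byte from a flags word. */
uint8_t toy_unpack_op_flags(int32_t flags)
{
    assert(flags >= 0 && flags <= 0xffff);  /* toy model: 16 bits only */
    return (uint8_t)(flags & 0xff);
}

/* Hypothetical helper: recover the op_private byte from a flags word. */
uint8_t toy_unpack_op_private(int32_t flags)
{
    assert(flags >= 0 && flags <= 0xffff);  /* toy model: 16 bits only */
    return (uint8_t)((flags >> 8) & 0xff);
}
```

For instance, packing an C<op_flags> byte of 0x05 with an C<op_private> byte
of 0x02 gives the flags word 0x0205.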


=back

=head1 Optree Manipulation Functions

=over 8

=item alloccopstash
X<alloccopstash>


NOTE: this function is experimental and may change or be
removed without notice.


Available only under threaded builds, this function allocates an entry in
C<PL_stashpad> for the stash passed to it.

	PADOFFSET alloccopstash(HV *hv)

=for hackers
Found in file op.c

=item block_end
X<block_end>

Handles compile-time scope exit.  C<floor>
is the savestack index returned by
C<block_start>, and C<seq> is the body of the block.  Returns the block,
possibly modified.

	OP *	block_end(I32 floor, OP *seq)

=for hackers
Found in file op.c

=item block_start
X<block_start>

Handles compile-time scope entry.
Arranges for hints to be restored on block
exit and also handles pad sequence numbers to make lexical variables scope
right.  Returns a savestack index for use with C<block_end>.

	int	block_start(int full)

=for hackers
Found in file op.c

=item ck_entersub_args_list
X<ck_entersub_args_list>

Performs the default fixup of the arguments part of an C<entersub>
op tree.  This consists of applying list context to each of the
argument ops.  This is the standard treatment used on a call marked
with C<&>, or a method call, or a call through a subroutine reference,
or any other call where the callee can't be identified at compile time,
or a call where the callee has no prototype.

	OP *	ck_entersub_args_list(OP *entersubop)

=for hackers
Found in file op.c

=item ck_entersub_args_proto
X<ck_entersub_args_proto>

Performs the fixup of the arguments part of an C<entersub> op tree
based on a subroutine prototype.  This makes various modifications to
the argument ops, from applying context up to inserting C<refgen> ops,
and checking the number and syntactic types of arguments, as directed by
the prototype.  This is the standard treatment used on a subroutine call,
not marked with C<&>, where the callee can be identified at compile time
and has a prototype.

C<protosv> supplies the subroutine prototype to be applied to the call.
It may be a normal defined scalar, of which the string value will be used.
Alternatively, for convenience, it may be a subroutine object (a C<CV*>
that has been cast to C<SV*>) which has a prototype.  The prototype
supplied, in whichever form, does not need to match the actual callee
referenced by the op tree.

If the argument ops disagree with the prototype, for example by having
an unacceptable number of arguments, a valid op tree is returned anyway.
The error is reflected in the parser state, normally resulting in a single
exception at the top level of parsing which covers all the compilation
errors that occurred.  In the error message, the callee is referred to
by the name defined by the C<namegv> parameter.

	OP *	ck_entersub_args_proto(OP *entersubop,
		                       GV *namegv, SV *protosv)

=for hackers
Found in file op.c

=item ck_entersub_args_proto_or_list
X<ck_entersub_args_proto_or_list>

Performs the fixup of the arguments part of an C<entersub> op tree either
based on a subroutine prototype or using default list-context processing.
This is the standard treatment used on a subroutine call, not marked
with C<&>, where the callee can be identified at compile time.

C<protosv> supplies the subroutine prototype to be applied to the call,
or indicates that there is no prototype.  It may be a normal scalar,
in which case if it is defined then the string value will be used
as a prototype, and if it is undefined then there is no prototype.
Alternatively, for convenience, it may be a subroutine object (a C<CV*>
that has been cast to C<SV*>), of which the prototype will be used if it
has one.  The prototype (or lack thereof) supplied, in whichever form,
does not need to match the actual callee referenced by the op tree.

If the argument ops disagree with the prototype, for example by having
an unacceptable number of arguments, a valid op tree is returned anyway.
The error is reflected in the parser state, normally resulting in a single
exception at the top level of parsing which covers all the compilation
errors that occurred.  In the error message, the callee is referred to
by the name defined by the C<namegv> parameter.

	OP *	ck_entersub_args_proto_or_list(OP *entersubop,
		                               GV *namegv,
		                               SV *protosv)

=for hackers
Found in file op.c

=item cv_const_sv
X<cv_const_sv>

If C<cv> is a constant sub eligible for inlining, returns the constant
value returned by the sub.  Otherwise, returns C<NULL>.

Constant subs can be created with C<newCONSTSUB> or as described in
L<perlsub/"Constant Functions">.

	SV*	cv_const_sv(const CV *const cv)

=for hackers
Found in file op.c

=item cv_get_call_checker
X<cv_get_call_checker>

Retrieves the function that will be used to fix up a call to C<cv>.
Specifically, the function is applied to an C<entersub> op tree for a
subroutine call, not marked with C<&>, where the callee can be identified
at compile time as C<cv>.

The C-level function pointer is returned in C<*ckfun_p>, and an SV
argument for it is returned in C<*ckobj_p>.  The function is intended
to be called in this manner:

 entersubop = (*ckfun_p)(aTHX_ entersubop, namegv, (*ckobj_p));

In this call, C<entersubop> is a pointer to the C<entersub> op,
which may be replaced by the check function, and C<namegv> is a GV
supplying the name that should be used by the check function to refer
to the callee of the C<entersub> op if it needs to emit any diagnostics.
It is permitted to apply the check function in non-standard situations,
such as to a call to a different subroutine or to a method call.

By default, the function is
L<Perl_ck_entersub_args_proto_or_list|/ck_entersub_args_proto_or_list>,
and the SV parameter is C<cv> itself.  This implements standard
prototype processing.  It can be changed, for a particular subroutine,
by L</cv_set_call_checker>.

	void	cv_get_call_checker(CV *cv,
		                    Perl_call_checker *ckfun_p,
		                    SV **ckobj_p)

=for hackers
Found in file op.c

=item cv_set_call_checker
X<cv_set_call_checker>

The original form of L</cv_set_call_checker_flags>, which passes it the
C<CALL_CHECKER_REQUIRE_GV> flag for backward-compatibility.

	void	cv_set_call_checker(CV *cv,
		                    Perl_call_checker ckfun,
		                    SV *ckobj)

=for hackers
Found in file op.c

=item cv_set_call_checker_flags
X<cv_set_call_checker_flags>

Sets the function that will be used to fix up a call to C<cv>.
Specifically, the function is applied to an C<entersub> op tree for a
subroutine call, not marked with C<&>, where the callee can be identified
at compile time as C<cv>.

The C-level function pointer is supplied in C<ckfun>, and an SV argument
for it is supplied in C<ckobj>.  The function should be defined like this:

    STATIC OP * ckfun(pTHX_ OP *op, GV *namegv, SV *ckobj)

It is intended to be called in this manner:

    entersubop = ckfun(aTHX_ entersubop, namegv, ckobj);

In this call, C<entersubop> is a pointer to the C<entersub> op,
which may be replaced by the check function, and C<namegv> supplies
the name that should be used by the check function to refer
to the callee of the C<entersub> op if it needs to emit any diagnostics.
It is permitted to apply the check function in non-standard situations,
such as to a call to a different subroutine or to a method call.

C<namegv> may not actually be a GV.  For efficiency, perl may pass a
CV or other SV instead.  Whatever is passed can be used as the first
argument to L</cv_name>.  You can force perl to pass a GV by including
C<CALL_CHECKER_REQUIRE_GV> in the C<flags>.

The current setting for a particular CV can be retrieved by
L</cv_get_call_checker>.

	void	cv_set_call_checker_flags(
		    CV *cv, Perl_call_checker ckfun, SV *ckobj,
		    U32 flags
		)

=for hackers
Found in file op.c

=item LINKLIST
X<LINKLIST>

Given the root of an optree, link the tree in execution order using the
C<op_next> pointers and return the first op executed.  If this has
already been done, it will not be redone, and C<< o->op_next >> will be
returned.  If C<< o->op_next >> is not already set, C<o> should be at
least an C<UNOP>.

	OP*	LINKLIST(OP *o)

=for hackers
Found in file op.h
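
The linking step can be illustrated with a small stand-alone model.  The
C<ToyOp> struct and C<toy_linklist> below are hypothetical stand-ins, not
the real C<OP> type or the real implementation; the sketch assumes a simple
recursive post-order walk in which every kid's C<next> pointer ends up at
the start of its following sibling's subtree, or at the parent for the last
kid.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an op: NOT the real perl OP structure. */
typedef struct ToyOp {
    char name;
    struct ToyOp *first;    /* first kid, or NULL for a leaf */
    struct ToyOp *sibling;  /* next sibling, or NULL for the last kid */
    struct ToyOp *next;     /* execution-order link, NULL until linked */
} ToyOp;

/* Link the subtree rooted at o in execution (post-order) order and
 * return the first op to execute.  If o is already linked, nothing is
 * redone and o->next is returned. */
ToyOp *toy_linklist(ToyOp *o)
{
    ToyOp *kid;

    assert(o != NULL);
    if (o->next)
        return o->next;

    kid = o->first;
    if (kid) {
        /* Execution starts at the start of the first kid's subtree. */
        o->next = toy_linklist(kid);
        for (; kid; kid = kid->sibling) {
            /* After a kid comes the start of the next sibling's
             * subtree, or the parent itself after the last kid. */
            kid->next = kid->sibling ? toy_linklist(kid->sibling) : o;
        }
    }
    else {
        o->next = o;  /* a leaf starts at itself */
    }
    return o->next;
}
```

For a root C<R> with kids C<A> and C<B>, where C<A> has a single kid C<a>,
the model links C<a>, C<A>, C<B>, C<R> in that order and returns C<a>.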

=item newCONSTSUB
X<newCONSTSUB>

See L</newCONSTSUB_flags>.

	CV*	newCONSTSUB(HV* stash, const char* name, SV* sv)

=for hackers
Found in file op.c

=item newCONSTSUB_flags
X<newCONSTSUB_flags>

Creates a constant sub equivalent to Perl S<C<sub FOO () { 123 }>> which is
eligible for inlining at compile-time.

Currently, the only useful value for C<flags> is C<SVf_UTF8>.

The newly created subroutine takes ownership of a reference to the passed in
SV.

Passing C<NULL> for SV creates a constant sub equivalent to S<C<sub BAR () {}>>,
which won't be called if used as a destructor, but will suppress the overhead
of a call to C<AUTOLOAD>.  (This form, however, isn't eligible for inlining at
compile time.)

	CV*	newCONSTSUB_flags(HV* stash, const char* name,
		                  STRLEN len, U32 flags, SV* sv)

=for hackers
Found in file op.c

=item newXS
X<newXS>

Used by C<xsubpp> to hook up XSUBs as Perl subs.  C<filename> needs to be
static storage, as it is used directly as CvFILE(), without a copy being made.

=for hackers
Found in file op.c

=item op_append_elem
X<op_append_elem>

Append an item to the list of ops contained directly within a list-type
op, returning the lengthened list.  C<first> is the list-type op,
and C<last> is the op to append to the list.  C<optype> specifies the
intended opcode for the list.  If C<first> is not already a list of the
right type, it will be upgraded into one.  If either C<first> or C<last>
is null, the other is returned unchanged.

	OP *	op_append_elem(I32 optype, OP *first, OP *last)

=for hackers
Found in file op.c
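
The null handling and the upgrade rule can be sketched with a toy list
structure.  Everything below (C<ToyOp>, C<toy_new_op>, C<toy_append_elem>
and the C<TOY_*> type codes) is hypothetical and only mimics the behaviour
described above; it is not how perl itself allocates or checks ops.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a list-capable op: NOT the real perl OP structure. */
typedef struct ToyOp {
    int type;
    struct ToyOp *first;    /* first kid */
    struct ToyOp *last;     /* last kid */
    struct ToyOp *sibling;  /* next sibling */
} ToyOp;

enum { TOY_CONST = 1, TOY_LIST = 2 };  /* made-up type codes */

/* Allocate a zeroed toy op of the given type. */
ToyOp *toy_new_op(int type)
{
    ToyOp *o = calloc(1, sizeof *o);
    assert(o != NULL);
    o->type = type;
    return o;
}

/* Append last to the list held by first, upgrading first into a list
 * of type optype when it is not one already. */
ToyOp *toy_append_elem(int optype, ToyOp *first, ToyOp *last)
{
    if (!first)
        return last;   /* nothing to append to */
    if (!last)
        return first;  /* nothing to append */

    if (first->type != optype) {
        /* Upgrade: wrap the lone op in a fresh list op. */
        ToyOp *list = toy_new_op(optype);
        list->first = list->last = first;
        first->sibling = NULL;
        first = list;
    }

    if (first->last)
        first->last->sibling = last;
    else
        first->first = last;  /* the list was empty */
    first->last = last;
    last->sibling = NULL;
    return first;
}
```

Appending to a non-list op returns a new list op holding both items, while
appending to an existing list of the right type returns that same list.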

=item op_append_list
X<op_append_list>

Concatenate the lists of ops contained directly within two list-type ops,
returning the combined list.  C<first> and C<last> are the list-type ops
to concatenate.  C<optype> specifies the intended opcode for the list.
If either C<first> or C<last> is not already a list of the right type,
it will be upgraded into one.  If either C<first> or C<last> is null,
the other is returned unchanged.

	OP *	op_append_list(I32 optype, OP *first, OP *last)

=for hackers
Found in file op.c

=item OP_CLASS
X<OP_CLASS>

Return the class of the provided OP: that is, which of the *OP
structures it uses.  For core ops this currently gets the information out
of C<PL_opargs>, which does not always accurately reflect the type used;
in v5.26 onwards, see also the function C<L</op_class>> which can do a better
job of determining the used type.

For custom ops the type is returned from the registration, and it is up
to the registree to ensure it is accurate.  The value returned will be
one of the C<OA_>* constants from F<op.h>.

	U32	OP_CLASS(OP *o)

=for hackers
Found in file op.h

=item op_contextualize
X<op_contextualize>

Applies a syntactic context to an op tree representing an expression.
C<o> is the op tree, and C<context> must be C<G_SCALAR>, C<G_ARRAY>,
or C<G_VOID> to specify the context to apply.  The modified op tree
is returned.

	OP *	op_contextualize(OP *o, I32 context)

=for hackers
Found in file op.c

=item op_convert_list
X<op_convert_list>

Converts C<o> into a list op if it is not one already, and then converts it
into the specified C<type>, calling its check function, allocating a target if
it needs one, and folding constants.

A list-type op is usually constructed one kid at a time via C<newLISTOP>,
C<op_prepend_elem> and C<op_append_elem>.  Then finally it is passed to
C<op_convert_list> to make it the right type.

	OP *	op_convert_list(I32 type, I32 flags, OP *o)

=for hackers
Found in file op.c

=item OP_DESC
X<OP_DESC>

Return a short description of the provided OP.

	const char * OP_DESC(OP *o)

=for hackers
Found in file op.h

=item op_free
X<op_free>

Free an op.  Only use this when an op is no longer linked to from any
optree.

	void	op_free(OP *o)

=for hackers
Found in file op.c

=item OpHAS_SIBLING
X<OpHAS_SIBLING>

Returns true if C<o> has a sibling.

	bool	OpHAS_SIBLING(OP *o)

=for hackers
Found in file op.h

=item OpLASTSIB_set
X<OpLASTSIB_set>

Marks C<o> as having no further siblings. On C<PERL_OP_PARENT> builds, marks
C<o> as having the specified parent. See also C<L</OpMORESIB_set>> and
C<L</OpMAYBESIB_set>>. For a higher-level interface, see
C<L</op_sibling_splice>>.

	void	OpLASTSIB_set(OP *o, OP *parent)

=for hackers
Found in file op.h

=item op_linklist
X<op_linklist>

This function is the implementation of the L</LINKLIST> macro.  It should
not be called directly.

	OP*	op_linklist(OP *o)

=for hackers
Found in file op.c

=item op_lvalue
X<op_lvalue>


NOTE: this function is experimental and may change or be
removed without notice.


Propagate lvalue ("modifiable") context to an op and its children.
C<type> represents the context type, roughly based on the type of op that
would do the modifying, although C<local()> is represented by C<OP_NULL>,
because it has no op type of its own (it is signalled by a flag on
the lvalue op).

This function detects things that can't be modified, such as C<$x+1>, and
generates errors for them.  For example, C<$x+1 = 2> would cause it to be
called with an op of type C<OP_ADD> and a C<type> argument of C<OP_SASSIGN>.

It also flags things that need to behave specially in an lvalue context,
such as C<$$x = 5> which might have to vivify a reference in C<$x>.

	OP *	op_lvalue(OP *o, I32 type)

=for hackers
Found in file op.c

=item OpMAYBESIB_set
X<OpMAYBESIB_set>

Conditionally does C<OpMORESIB_set> or C<OpLASTSIB_set> depending on whether
C<sib> is non-null. For a higher-level interface, see C<L</op_sibling_splice>>.

	void	OpMAYBESIB_set(OP *o, OP *sib, OP *parent)

=for hackers
Found in file op.h

=item OpMORESIB_set
X<OpMORESIB_set>

Sets the sibling of C<o> to the non-null value C<sib>. See also C<L</OpLASTSIB_set>>
and C<L</OpMAYBESIB_set>>. For a higher-level interface, see
C<L</op_sibling_splice>>.

	void	OpMORESIB_set(OP *o, OP *sib)

=for hackers
Found in file op.h

=item OP_NAME
X<OP_NAME>

Return the name of the provided OP.  For core ops this looks up the name
from the C<op_type>; for custom ops from the C<op_ppaddr>.

	const char * OP_NAME(OP *o)

=for hackers
Found in file op.h

=item op_null
X<op_null>

Neutralizes an op when it is no longer needed, but is still linked to from
other ops.

	void	op_null(OP *o)

=for hackers
Found in file op.c

=item op_parent
X<op_parent>

Returns the parent OP of C<o>, if it has a parent. Returns C<NULL> otherwise.
This function is only available on perls built with C<-DPERL_OP_PARENT>.

	OP*	op_parent(OP *o)

=for hackers
Found in file op.c

=item op_prepend_elem
X<op_prepend_elem>

Prepend an item to the list of ops contained directly within a list-type
op, returning the lengthened list.  C<first> is the op to prepend to the
list, and C<last> is the list-type op.  C<optype> specifies the intended
opcode for the list.  If C<last> is not already a list of the right type,
it will be upgraded into one.  If either C<first> or C<last> is null,
the other is returned unchanged.

	OP *	op_prepend_elem(I32 optype, OP *first, OP *last)

=for hackers
Found in file op.c

=item op_scope
X<op_scope>


NOTE: this function is experimental and may change or be
removed without notice.


Wraps up an op tree with some additional ops so that at runtime a dynamic
scope will be created.  The original ops run in the new dynamic scope,
and then, provided that they exit normally, the scope will be unwound.
The additional ops used to create and unwind the dynamic scope will
normally be an C<enter>/C<leave> pair, but a C<scope> op may be used
instead if the ops are simple enough to not need the full dynamic scope
structure.

	OP *	op_scope(OP *o)

=for hackers
Found in file op.c

=item OpSIBLING
X<OpSIBLING>

Returns the sibling of C<o>, or C<NULL> if there is no sibling.

	OP*	OpSIBLING(OP *o)

=for hackers
Found in file op.h

=item op_sibling_splice
X<op_sibling_splice>

A general function for editing the structure of an existing chain of
op_sibling nodes.  By analogy with the perl-level C<splice()> function, allows
you to delete zero or more sequential nodes, replacing them with zero or
more different nodes.  Performs the necessary op_first/op_last
housekeeping on the parent node and op_sibling manipulation on the
children.  The last deleted node will be marked as the last node by
updating the op_sibling/op_sibparent or op_moresib field as appropriate.

Note that op_next is not manipulated, and nodes are not freed; that is the
responsibility of the caller.  It also won't create a new list op for an
empty list etc; use higher-level functions like op_append_elem() for that.

C<parent> is the parent node of the sibling chain. It may be passed as C<NULL> if
the splicing doesn't affect the first or last op in the chain.

C<start> is the node preceding the first node to be spliced.  Node(s)
following it will be deleted, and ops will be inserted after it.  If it is
C<NULL>, the first node onwards is deleted, and nodes are inserted at the
beginning.

C<del_count> is the number of nodes to delete.  If zero, no nodes are deleted.
If -1 or greater than or equal to the number of remaining kids, all
remaining kids are deleted.

C<insert> is the first of a chain of nodes to be inserted in place of the nodes.
If C<NULL>, no nodes are inserted.

The head of the chain of deleted ops is returned, or C<NULL> if no ops were
deleted.

For example:

    action                    before      after         returns
    ------                    -----       -----         -------

                              P           P
    splice(P, A, 2, X-Y-Z)    |           |             B-C
                              A-B-C-D     A-X-Y-Z-D

                              P           P
    splice(P, NULL, 1, X-Y)   |           |             A
                              A-B-C-D     X-Y-B-C-D

                              P           P
    splice(P, NULL, 3, NULL)  |           |             A-B-C
                              A-B-C-D     D

                              P           P
    splice(P, B, 0, X-Y)      |           |             NULL
                              A-B-C-D     A-B-X-Y-C-D


For lower-level direct manipulation of C<op_sibparent> and C<op_moresib>,
see C<L</OpMORESIB_set>>, C<L</OpLASTSIB_set>>, C<L</OpMAYBESIB_set>>.

	OP*	op_sibling_splice(OP *parent, OP *start,
		                  int del_count, OP* insert)

=for hackers
Found in file op.c
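
The splice semantics can be tried out on a toy sibling chain.  The sketch
below is a simplified model under stated assumptions: C<ToyOp>,
C<toy_set_kids> and C<toy_sibling_splice> are hypothetical names, the
parent must always be supplied, and none of perl's C<op_moresib> or
C<op_sibparent> bookkeeping is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an op: NOT the real perl OP structure. */
typedef struct ToyOp {
    char name;
    struct ToyOp *sibling;  /* next sibling, or NULL for the last kid */
    struct ToyOp *first;    /* first kid (meaningful on parents only) */
    struct ToyOp *last;     /* last kid (meaningful on parents only) */
} ToyOp;

/* Link n kids into a sibling chain under parent. */
void toy_set_kids(ToyOp *parent, ToyOp **kids, size_t n)
{
    size_t i;
    parent->first = n ? kids[0] : NULL;
    parent->last  = n ? kids[n - 1] : NULL;
    for (i = 0; i < n; i++)
        kids[i]->sibling = (i + 1 < n) ? kids[i + 1] : NULL;
}

/* Delete del_count nodes after start (all remaining nodes if -1),
 * insert the chain headed by insert in their place, and return the
 * head of the deleted chain, or NULL if nothing was deleted. */
ToyOp *toy_sibling_splice(ToyOp *parent, ToyOp *start,
                          int del_count, ToyOp *insert)
{
    ToyOp *first, *rest, *last_del = NULL, *last_ins = NULL;

    assert(parent != NULL);  /* toy model: parent is always required */
    first = start ? start->sibling : parent->first;
    rest = first;

    /* Step over the nodes to delete. */
    while (rest && del_count != 0) {
        last_del = rest;
        rest = rest->sibling;
        if (del_count > 0)
            del_count--;
    }
    if (last_del)
        last_del->sibling = NULL;  /* terminate the deleted chain */

    /* Hook the inserted chain in front of the surviving tail. */
    if (insert) {
        for (last_ins = insert; last_ins->sibling; )
            last_ins = last_ins->sibling;
        last_ins->sibling = rest;
    }

    if (start)
        start->sibling = insert ? insert : rest;
    else
        parent->first = insert ? insert : rest;

    if (!rest)  /* the end of the chain changed */
        parent->last = last_ins ? last_ins : start;

    return last_del ? first : NULL;
}
```

With kids C<A-B-C-D> under C<P>, calling the toy splice with start C<A>, a
delete count of 2 and the chain C<X-Y-Z> leaves C<A-X-Y-Z-D> and returns
C<B-C>, matching the first row of the table above.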

=item OP_TYPE_IS
X<OP_TYPE_IS>

Returns true if the given OP is not a C<NULL> pointer
and if it is of the given type.

The negation of this macro, C<OP_TYPE_ISNT>, is also available,
as well as C<OP_TYPE_IS_NN> and C<OP_TYPE_ISNT_NN>, which elide
the C<NULL> pointer check.

	bool	OP_TYPE_IS(OP *o, Optype type)

=for hackers
Found in file op.h

=item OP_TYPE_IS_OR_WAS
X<OP_TYPE_IS_OR_WAS>

Returns true if the given OP is not a C<NULL> pointer and
if it is of the given type or used to be before being
replaced by an OP of type C<OP_NULL>.

The negation of this macro, C<OP_TYPE_ISNT_AND_WASNT>,
is also available, as well as C<OP_TYPE_IS_OR_WAS_NN>
and C<OP_TYPE_ISNT_AND_WASNT_NN>, which elide
the C<NULL> pointer check.

	bool	OP_TYPE_IS_OR_WAS(OP *o, Optype type)

=for hackers
Found in file op.h
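
The "is or was" test can be modelled in a few lines.  The sketch assumes
that nulling an op records its former type in a spare field (called C<targ>
here); C<ToyOp>, the C<TOY_OP_*> codes and the C<toy_*> functions are
hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_OP_NULL = 0, TOY_OP_CONST = 5, TOY_OP_ADD = 7 };  /* made up */

/* Toy stand-in for an op: NOT the real perl OP structure. */
typedef struct ToyOp {
    int type;  /* current type */
    int targ;  /* assumed to hold the former type once nulled */
} ToyOp;

/* Neutralize an op, remembering what it used to be. */
void toy_op_null(ToyOp *o)
{
    assert(o != NULL);
    if (o->type == TOY_OP_NULL)
        return;  /* already nulled */
    o->targ = o->type;
    o->type = TOY_OP_NULL;
}

/* True if o is non-NULL and currently has the given type. */
int toy_type_is(const ToyOp *o, int type)
{
    return o != NULL && o->type == type;
}

/* True if o is non-NULL and has, or had before nulling, the type. */
int toy_type_is_or_was(const ToyOp *o, int type)
{
    if (!o)
        return 0;
    return (o->type == TOY_OP_NULL ? o->targ : o->type) == type;
}
```

After C<toy_op_null> has been applied to an add op, C<toy_type_is> no
longer matches C<TOY_OP_ADD>, but C<toy_type_is_or_was> still does.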

=item rv2cv_op_cv
X<rv2cv_op_cv>

Examines an op, which is expected to identify a subroutine at runtime,
and attempts to determine at compile time which subroutine it identifies.
This is normally used during Perl compilation to determine whether
a prototype can be applied to a function call.  C<cvop> is the op
being considered, normally an C<rv2cv> op.  A pointer to the identified
subroutine is returned, if it could be determined statically, and a null
pointer is returned if it was not possible to determine statically.

Currently, the subroutine can be identified statically if the RV that the
C<rv2cv> is to operate on is provided by a suitable C<gv> or C<const> op.
A C<gv> op is suitable if the GV's CV slot is populated.  A C<const> op is
suitable if the constant value must be an RV pointing to a CV.  Details of
this process may change in future versions of Perl.  If the C<rv2cv> op
has the C<OPpENTERSUB_AMPER> flag set then no attempt is made to identify
the subroutine statically: this flag is used to suppress compile-time
magic on a subroutine call, forcing it to use default runtime behaviour.

If C<flags> has the bit C<RV2CVOPCV_MARK_EARLY> set, then the handling
of a GV reference is modified.  If a GV was examined and its CV slot was
found to be empty, then the C<gv> op has the C<OPpEARLY_CV> flag set.
If the op is not optimised away, and the CV slot is later populated with
a subroutine having a prototype, that flag eventually triggers the warning
"called too early to check prototype".

If C<flags> has the bit C<RV2CVOPCV_RETURN_NAME_GV> set, then instead
of returning a pointer to the subroutine it returns a pointer to the
GV giving the most appropriate name for the subroutine in this context.
Normally this is just the C<CvGV> of the subroutine, but for an anonymous
(C<CvANON>) subroutine that is referenced through a GV it will be the
referencing GV.  The resulting C<GV*> is cast to C<CV*> to be returned.
A null pointer is returned as usual if there is no statically-determinable
subroutine.

	CV *	rv2cv_op_cv(OP *cvop, U32 flags)

=for hackers
Found in file op.c


=back

=head1 Pack and Unpack

=over 8

=item packlist
X<packlist>

The engine implementing the C<pack()> Perl function.

	void	packlist(SV *cat, const char *pat,
		         const char *patend, SV **beglist,
		         SV **endlist)

=for hackers
Found in file pp_pack.c

=item unpackstring
X<unpackstring>

The engine implementing the C<unpack()> Perl function.

Using the template C<pat..patend>, this function unpacks the string
C<s..strend> into a number of mortal SVs, which it pushes onto the perl
argument (C<@_>) stack (so you will need to issue a C<PUTBACK> before and
C<SPAGAIN> after the call to this function).  It returns the number of
pushed elements.

The C<strend> and C<patend> pointers should point to the byte following the
last character of each string.

Although this function returns its values on the perl argument stack, it
doesn't take any parameters from that stack (and thus in particular
there's no need to do a C<PUSHMARK> before calling it, unlike L</call_pv> for
example).

	I32	unpackstring(const char *pat,
		             const char *patend, const char *s,
		             const char *strend, U32 flags)

=for hackers
Found in file pp_pack.c


=back

=head1 Pad Data Structures

=over 8

=item CvPADLIST
X<CvPADLIST>


NOTE: this function is experimental and may change or be
removed without notice.


CVs can have CvPADLIST(cv) set to point to a PADLIST.  This is the CV's
scratchpad, which stores lexical variables and opcode temporary and
per-thread values.

For these purposes "formats" are a kind-of CV; eval""s are too (except they're
not callable at will and are always thrown away after the eval"" is done
executing).  Require'd files are simply evals without any outer lexical
scope.

XSUBs do not have a C<CvPADLIST>.  C<dXSTARG> fetches values from C<PL_curpad>,
but that is really the caller's pad (a slot of which is allocated by
every entersub).  Do not get or set C<CvPADLIST> if a CV is an XSUB (as
determined by C<CvISXSUB()>), because the C<CvPADLIST> slot is reused for a
different internal purpose in XSUBs.

The PADLIST has a C array where pads are stored.

The 0th entry of the PADLIST is a PADNAMELIST
which represents the "names" or rather
the "static type information" for lexicals.  The individual elements of a
PADNAMELIST are PADNAMEs.  Future
refactorings might stop the PADNAMELIST from being stored in the PADLIST's
array, so don't rely on it.  See L</PadlistNAMES>.

The CvDEPTH'th entry of a PADLIST is a PAD (an AV) which is the stack frame
at that depth of recursion into the CV.  The 0th slot of a frame AV is an
AV which is C<@_>.  Other entries are storage for variables and op targets.

Iterating over the PADNAMELIST iterates over all possible pad
items.  Pad slots for targets (C<SVs_PADTMP>)
and GVs end up having C<&PL_padname_undef> "names", while slots for constants
have C<&PL_padname_const> "names" (see C<L</pad_alloc>>).  That
C<&PL_padname_undef>
and C<&PL_padname_const> are used is an implementation detail subject to
change.  To test for them, use C<!PadnamePV(name)> and
S<C<PadnamePV(name) && !PadnameLEN(name)>>, respectively.
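
The two tests just given can be pictured with a stand-alone model.  The
C<ToyPadname> struct and C<toy_classify> are hypothetical; real code would
apply C<PadnamePV> and C<PadnameLEN> to a real C<PADNAME> instead.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a pad name: NOT the real PADNAME structure. */
typedef struct ToyPadname {
    const char *pv;  /* the name, or NULL for a target/GV slot */
    size_t len;      /* length of the name */
} ToyPadname;

typedef enum {
    TOY_SLOT_TARGET_OR_GV,  /* no name at all */
    TOY_SLOT_CONSTANT,      /* a name pointer of zero length */
    TOY_SLOT_NAMED          /* a real variable name */
} ToySlotKind;

/* Classify a slot using the same two checks as the text above. */
ToySlotKind toy_classify(const ToyPadname *pn)
{
    assert(pn != NULL);
    if (!pn->pv)
        return TOY_SLOT_TARGET_OR_GV;  /* !PadnamePV(name) */
    if (pn->len == 0)
        return TOY_SLOT_CONSTANT;      /* PV set but zero length */
    return TOY_SLOT_NAMED;
}
```

Keeping the checks in one helper means a caller does not compare against
the sentinel addresses directly, which the text describes as an
implementation detail.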

Only C<my>/C<our> variable slots get valid names.
The rest are op targets/GVs/constants which are statically allocated
or resolved at compile time.  These don't have names by which they
can be looked up from Perl code at run time through eval"" the way
C<my>/C<our> variables can be.  Since they can't be looked up by "name"
but only by their index allocated at compile time (which is usually
in C<< PL_op->op_targ >>), wasting a name SV for them doesn't make sense.

The pad names in the PADNAMELIST have their PV holding the name of
the variable.  The C<COP_SEQ_RANGE_LOW> and C<_HIGH> fields form a range
(low+1..high inclusive) of cop_seq numbers for which the name is
valid.  During compilation, these fields may hold the special value
PERL_PADSEQ_INTRO to indicate various stages:

 COP_SEQ_RANGE_LOW        _HIGH
 -----------------        -----
 PERL_PADSEQ_INTRO            0   variable not yet introduced:
                                  { my ($x
 valid-seq#   PERL_PADSEQ_INTRO   variable in scope:
                                  { my ($x);
 valid-seq#          valid-seq#   compilation of scope complete:
                                  { my ($x); .... }

When a lexical var hasn't yet been introduced, it already exists from the
perspective of duplicate declarations, but not for variable lookups, e.g.

    my ($x, $x); # '"my" variable $x masks earlier declaration'
    my $x = $x;  # equal to my $x = $::x;
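
The three stages and the resulting visibility rule can be modelled as
follows.  This is a sketch under assumptions: C<TOY_PADSEQ_INTRO> is a
made-up sentinel (here the largest 32-bit value), and C<ToySeqRange>,
C<toy_stage> and C<toy_name_visible> are hypothetical names rather than
perl API.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PADSEQ_INTRO UINT32_MAX  /* assumed sentinel value */

/* Toy stand-in for the low/high cop_seq pair of a pad name. */
typedef struct ToySeqRange {
    uint32_t low;
    uint32_t high;
} ToySeqRange;

typedef enum {
    TOY_NOT_INTRODUCED,  /* low is the sentinel */
    TOY_IN_SCOPE,        /* low is valid, high is the sentinel */
    TOY_SCOPE_COMPLETE   /* both are valid sequence numbers */
} ToyStage;

/* Report which of the three stages in the table a range is in. */
ToyStage toy_stage(ToySeqRange r)
{
    if (r.low == TOY_PADSEQ_INTRO)
        return TOY_NOT_INTRODUCED;
    if (r.high == TOY_PADSEQ_INTRO)
        return TOY_IN_SCOPE;
    return TOY_SCOPE_COMPLETE;
}

/* A name is visible for seq in low+1..high inclusive; an open scope
 * has no upper bound yet, and a pending name is never visible. */
int toy_name_visible(ToySeqRange r, uint32_t seq)
{
    assert(seq != TOY_PADSEQ_INTRO);  /* the sentinel is not a real seq */
    if (r.low == TOY_PADSEQ_INTRO)
        return 0;
    if (r.high == TOY_PADSEQ_INTRO)
        return seq > r.low;
    return seq > r.low && seq <= r.high;
}
```

In this model a name that has not yet been introduced is invisible to
every lookup, which is the behaviour of the second C<$x> in
C<my $x = $x;> above.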

For typed lexicals C<PadnameTYPE> points at the type stash.  For C<our>
lexicals, C<PadnameOURSTASH> points at the stash of the associated global (so
that duplicate C<our> declarations in the same package can be detected).
C<PadnameGEN> is sometimes used to store the generation number during
compilation.

If C<PadnameOUTER> is set on the pad name, then that slot in the frame AV
is a REFCNT'ed reference to a lexical from "outside".  Such entries
are sometimes referred to as 'fake'.  In this case, the name does not
use 'low' and 'high' to store a cop_seq range, since it is in scope
throughout.  Instead 'high' stores some flags containing info about
the real lexical (is it declared in an anon, and is it capable of being
instantiated multiple times?), and for fake ANONs, 'low' contains the index
within the parent's pad where the lexical's value is stored, to make
cloning quicker.

If the 'name' is C<&> the corresponding entry in the PAD
is a CV representing a possible closure.

Note that formats are treated as anon subs, and are cloned each time
write is called (if necessary).

The flag C<SVs_PADSTALE> is cleared on lexicals each time the C<my()> is executed,
and set on scope exit.  This allows the
C<"Variable $x is not available"> warning
to be generated in evals, such as:

    { my $x = 1; sub f { eval '$x'} } f();

For state vars, C<SVs_PADSTALE> is overloaded to mean 'not yet initialised',
but this internal state is stored in a separate pad entry.

	PADLIST * CvPADLIST(CV *cv)

=for hackers
Found in file pad.c

=item pad_add_name_pvs
X<pad_add_name_pvs>

Exactly like L</pad_add_name_pvn>, but takes a C<NUL>-terminated literal string
instead of a string/length pair.

	PADOFFSET pad_add_name_pvs(const char *name, U32 flags,
	                           HV *typestash, HV *ourstash)

=for hackers
Found in file pad.h

=item PadARRAY
X<PadARRAY>


NOTE: this function is experimental and may change or be
removed without notice.


The C array of pad entries.

	SV **	PadARRAY(PAD pad)

=for hackers
Found in file pad.h

=item pad_findmy_pvs
X<pad_findmy_pvs>

Exactly like L</pad_findmy_pvn>, but takes a C<NUL>-terminated literal string
instead of a string/length pair.

	PADOFFSET pad_findmy_pvs(const char *name, U32 flags)

=for hackers
Found in file pad.h

=item PadlistARRAY
X<PadlistARRAY>


NOTE: this function is experimental and may change or be
removed without notice.


The C array of a padlist, containing the pads.  Only subscript it with
numbers >= 1, as the 0th entry is not guaranteed to remain usable.

	PAD **	PadlistARRAY(PADLIST padlist)

=for hackers
Found in file pad.h

=item PadlistMAX
X<PadlistMAX>


NOTE: this function is experimental and may change or be
removed without notice.


The index of the last allocated space in the padlist.  Note that the last
pad may be in an earlier slot.  Any entries following it will be C<NULL> in
that case.

	SSize_t	PadlistMAX(PADLIST padlist)

=for hackers
Found in file pad.h
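
Because the last pad may sit below the last allocated slot, code that
wants the last pad has to scan downwards for a non-C<NULL> entry.  The
sketch below models that on a plain C array; C<ToyPad> and
C<toy_last_pad_index> are hypothetical, and the scan stops at index 1 on
the assumption that the 0th entry is not a pad.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a pad: NOT the real PAD type. */
typedef struct ToyPad {
    int depth;
} ToyPad;

/* Return the index of the last non-NULL pad at or below max, looking
 * only at indices >= 1, or -1 if there is none. */
ptrdiff_t toy_last_pad_index(ToyPad *const *array, ptrdiff_t max)
{
    ptrdiff_t i;

    assert(array != NULL);
    for (i = max; i >= 1; i--) {
        if (array[i])
            return i;
    }
    return -1;
}
```

With five allocated slots of which only indices 1 and 2 hold pads, the
helper reports 2 even though the last allocated index is 4.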

=item PadlistNAMES
X<PadlistNAMES>


NOTE: this function is experimental and may change or be
removed without notice.


The names associated with pad entries.

	PADNAMELIST * PadlistNAMES(PADLIST padlist)

=for hackers
Found in file pad.h

=item PadlistNAMESARRAY
X<PadlistNAMESARRAY>


NOTE: this function is experimental and may change or be
removed without notice.


The C array of pad names.

	PADNAME ** PadlistNAMESARRAY(PADLIST padlist)

=for hackers
Found in file pad.h

=item PadlistNAMESMAX
X<PadlistNAMESMAX>


NOTE: this function is experimental and may change or be
removed without notice.


The index of the last pad name.

	SSize_t	PadlistNAMESMAX(PADLIST padlist)

=for hackers
Found in file pad.h

=item PadlistREFCNT
X<PadlistREFCNT>


NOTE: this function is experimental and may change or be
removed without notice.


The reference count of the padlist.  Currently this is always 1.

	U32	PadlistREFCNT(PADLIST padlist)

=for hackers
Found in file pad.h

=item PadMAX
X<PadMAX>


NOTE: this function is experimental and may change or be
removed without notice.


The index of the last pad entry.

	SSize_t	PadMAX(PAD pad)

=for hackers
Found in file pad.h

=item PadnameLEN
X<PadnameLEN>


NOTE: this function is experimental and may change or be
removed without notice.


The length of the name.

	STRLEN	PadnameLEN(PADNAME pn)

=for hackers
Found in file pad.h

=item PadnamelistARRAY
X<PadnamelistARRAY>


NOTE: this function is experimental and may change or be
removed without notice.


The C array of pad names.

	PADNAME ** PadnamelistARRAY(PADNAMELIST pnl)

=for hackers
Found in file pad.h

=item PadnamelistMAX
X<PadnamelistMAX>


NOTE: this function is experimental and may change or be
removed without notice.


The index of the last pad name.

	SSize_t	PadnamelistMAX(PADNAMELIST pnl)

=for hackers
Found in file pad.h

=item PadnamelistREFCNT
X<PadnamelistREFCNT>


NOTE: this function is experimental and may change or be
removed without notice.


The reference count of the pad name list.

	SSize_t	PadnamelistREFCNT(PADNAMELIST pnl)

=for hackers
Found in file pad.h

=item PadnamelistREFCNT_dec
X<PadnamelistREFCNT_dec>


NOTE: this function is experimental and may change or be
removed without notice.


Lowers the reference count of the pad name list.

	void	PadnamelistREFCNT_dec(PADNAMELIST pnl)

=for hackers
Found in file pad.h

=item PadnamePV
X<PadnamePV>


NOTE: this function is experimental and may change or be
removed without notice.


The name stored in the pad name struct.  This returns C<NULL> for a target
slot.

	char *	PadnamePV(PADNAME pn)

=for hackers
Found in file pad.h

=item PadnameREFCNT
X<PadnameREFCNT>


NOTE: this function is experimental and may change or be
removed without notice.


The reference count of the pad name.

	SSize_t	PadnameREFCNT(PADNAME pn)

=for hackers
Found in file pad.h

=item PadnameREFCNT_dec
X<PadnameREFCNT_dec>


NOTE: this function is experimental and may change or be
removed without notice.


Lowers the reference count of the pad name.


	void	PadnameREFCNT_dec(PADNAME pn)

=for hackers
Found in file pad.h

=item PadnameSV
X<PadnameSV>


NOTE: this function is experimental and may change or be
removed without notice.


Returns the pad name as a mortal SV.

	SV *	PadnameSV(PADNAME pn)

=for hackers
Found in file pad.h

=item PadnameUTF8
X<PadnameUTF8>


NOTE: this function is experimental and may change or be
removed without notice.


Whether C<PadnamePV> is in UTF-8.  Currently, this is always true.

	bool	PadnameUTF8(PADNAME pn)

=for hackers
Found in file pad.h

=item pad_new
X<pad_new>

Create a new padlist, updating the global variables for the
currently-compiling padlist to point to the new padlist.  The following
flags can be OR'ed together:

    padnew_CLONE	this pad is for a cloned CV
    padnew_SAVE		save old globals on the save stack
    padnew_SAVESUB	also save extra stuff for start of sub

	PADLIST * pad_new(int flags)

=for hackers
Found in file pad.c

=item PL_comppad
X<PL_comppad>


NOTE: this variable is experimental and may change or be
removed without notice.


During compilation, this points to the array containing the values
part of the pad for the currently-compiling code.  (At runtime a CV may
have many such value arrays; at compile time just one is constructed.)
At runtime, this points to the array containing the currently-relevant
values for the pad for the currently-executing code.

=for hackers
Found in file pad.c

=item PL_comppad_name
X<PL_comppad_name>


NOTE: this variable is experimental and may change or be
removed without notice.


During compilation, this points to the array containing the names part
of the pad for the currently-compiling code.

=for hackers
Found in file pad.c

=item PL_curpad
X<PL_curpad>


NOTE: this variable is experimental and may change or be
removed without notice.


Points directly to the body of the L</PL_comppad> array.
(I.e., this is C<PAD_ARRAY(PL_comppad)>.)

=for hackers
Found in file pad.c


=back

=head1 Per-Interpreter Variables

=over 8

=item PL_modglobal
X<PL_modglobal>

C<PL_modglobal> is a general purpose, interpreter global HV for use by
extensions that need to keep information on a per-interpreter basis.
In a pinch, it can also be used as a symbol table for extensions
to share data among each other.  It is a good idea to use keys
prefixed by the package name of the extension that owns the data.

	HV*	PL_modglobal

=for hackers
Found in file intrpvar.h

=item PL_na
X<PL_na>

A convenience variable which is typically used with C<SvPV> when one
doesn't care about the length of the string.  It is usually more efficient
to either declare a local variable and use that instead or to use the
C<SvPV_nolen> macro.

	STRLEN	PL_na

=for hackers
Found in file intrpvar.h

=item PL_opfreehook
X<PL_opfreehook>

When non-C<NULL>, the function pointed to by this variable is called each
time an OP is freed, with the corresponding OP as its argument.  This
allows extensions to free any extra attributes they have locally attached
to an OP.  The hook is guaranteed to fire first for the parent OP and then
for its kids.

When you replace this variable, it is considered good practice to save
any previously installed hook and to call it from inside your own.

	Perl_ophook_t	PL_opfreehook
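
The save-and-chain practice described above can be modelled in plain C.
The following is a minimal, self-contained sketch, not code from the perl
core: C<op_t>, C<ophook_t>, C<free_hook>, C<other_hook>, C<my_hook>,
C<install_my_hook>, C<free_op> and C<demo_chain> are hypothetical
stand-ins (C<free_hook> plays the role of C<PL_opfreehook>), chosen so
that the example compiles without the perl headers.

```c
    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical stand-ins for OP and Perl_ophook_t. */
    typedef struct op { int extra_attr; } op_t;
    typedef void (*ophook_t)(op_t *o);

    static ophook_t free_hook = NULL;  /* models PL_opfreehook */
    static ophook_t prev_hook = NULL;  /* hook that was installed before ours */
    static int my_calls = 0;
    static int other_calls = 0;

    /* A hook installed earlier by some other (imaginary) extension. */
    static void other_hook(op_t *o) { (void)o; other_calls++; }

    /* Our hook: release our per-op data, then chain to the saved hook. */
    static void my_hook(op_t *o) {
        my_calls++;
        o->extra_attr = 0;
        if (prev_hook)
            prev_hook(o);
    }

    /* Save whatever is installed, then install our hook on top of it. */
    static void install_my_hook(void) {
        assert(free_hook != my_hook);  /* installing twice would recurse forever */
        prev_hook = free_hook;
        free_hook = my_hook;
    }

    /* Models the core freeing an op: call the hook if one is set. */
    static void free_op(op_t *o) {
        if (free_hook)
            free_hook(o);
    }

    /* Returns 1 if freeing one op ran both hooks exactly once. */
    int demo_chain(void) {
        op_t o = { 42 };
        my_calls = other_calls = 0;
        prev_hook = NULL;
        free_hook = other_hook;
        install_my_hook();
        free_op(&o);
        return my_calls == 1 && other_calls == 1 && o.extra_attr == 0;
    }
```

In this model, skipping the call to C<prev_hook> inside C<my_hook> would
silently disable the earlier extension's cleanup, which is the situation
the advice above is meant to prevent.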

=for hackers
Found in file intrpvar.h

=item PL_peepp
X<PL_peepp>

Pointer to the per-subroutine peephole optimiser.  This is a function
that gets called at the end of compilation of a Perl subroutine (or
equivalently independent piece of Perl code) to perform fixups of
some ops and to perform small-scale optimisations.  The function is
called once for each subroutine that is compiled, and is passed, as sole
parameter, a pointer to the op that is the entry point to the subroutine.
It modifies the op tree in place.

The peephole optimiser should never be completely replaced.  Rather,
add code to it by wrapping the existing optimiser.  The basic way to do
this can be seen in L<perlguts/Compile pass 3: peephole optimization>.
If the new code wishes to operate on ops throughout the subroutine's
structure, rather than just at the top level, it is likely to be more
convenient to wrap the L</PL_rpeepp> hook.

	peep_t	PL_peepp

=for hackers
Found in file intrpvar.h

=item PL_rpeepp
X<PL_rpeepp>

Pointer to the recursive peephole optimiser.  This is a function
that gets called at the end of compilation of a Perl subroutine (or
equivalently independent piece of Perl code) to perform fixups of some
ops and to perform small-scale optimisations.  The function is called
once for each chain of ops linked through their C<op_next> fields;
it is recursively called to handle each side chain.  It is passed, as
sole parameter, a pointer to the op that is at the head of the chain.
It modifies the op tree in place.

The peephole optimiser should never be completely replaced.  Rather,
add code to it by wrapping the existing optimiser.  The basic way to do
this can be seen in L<perlguts/Compile pass 3: peephole optimization>.
If the new code wishes to operate only on ops at a subroutine's top level,
rather than throughout the structure, it is likely to be more convenient
to wrap the L</PL_peepp> hook.

	peep_t	PL_rpeepp

=for hackers
Found in file intrpvar.h

=item PL_sv_no
X<PL_sv_no>

This is the C<false> SV.  See C<L</PL_sv_yes>>.  Always refer to this as
C<&PL_sv_no>.

	SV	PL_sv_no

=for hackers
Found in file intrpvar.h

=item PL_sv_undef
X<PL_sv_undef>

This is the C<undef> SV.  Always refer to this as C<&PL_sv_undef>.

	SV	PL_sv_undef

=for hackers
Found in file intrpvar.h

=item PL_sv_yes
X<PL_sv_yes>

This is the C<true> SV.  See C<L</PL_sv_no>>.  Always refer to this as
C<&PL_sv_yes>.

	SV	PL_sv_yes

=for hackers
Found in file intrpvar.h


=back

=head1 REGEXP Functions

=over 8

=item SvRX
X<SvRX>

Convenience macro to get the REGEXP from a SV.  This is approximately
equivalent to the following snippet:

    if (SvMAGICAL(sv))
        mg_get(sv);
    if (SvROK(sv))
        sv = MUTABLE_SV(SvRV(sv));
    if (SvTYPE(sv) == SVt_REGEXP)
        return (REGEXP*) sv;

C<NULL> will be returned if a REGEXP* is not found.

	REGEXP * SvRX(SV *sv)

=for hackers
Found in file regexp.h

=item SvRXOK
X<SvRXOK>

Returns a boolean indicating whether the SV (or the one it references)
is a REGEXP.

If you want to do something with the C<REGEXP*> later, use C<L</SvRX>>
instead and check for C<NULL>.

	bool	SvRXOK(SV* sv)

=for hackers
Found in file regexp.h


=back

=head1 Stack Manipulation Macros

=over 8

=item dMARK
X<dMARK>

Declare a stack marker variable, C<mark>, for the XSUB.  See C<L</MARK>> and
C<L</dORIGMARK>>.

		dMARK;

=for hackers
Found in file pp.h

=item dORIGMARK
X<dORIGMARK>

Saves the original stack mark for the XSUB.  See C<L</ORIGMARK>>.

		dORIGMARK;

=for hackers
Found in file pp.h

=item dSP
X<dSP>

Declares a local copy of perl's stack pointer for the XSUB, available via
the C<SP> macro.  See C<L</SP>>.

		dSP;

=for hackers
Found in file pp.h

=item EXTEND
X<EXTEND>

Used to extend the argument stack for an XSUB's return values.  Once
used, guarantees that there is room for at least C<nitems> to be pushed
onto the stack.

	void	EXTEND(SP, SSize_t nitems)

=for hackers
Found in file pp.h

=item MARK
X<MARK>

Stack marker variable for the XSUB.  See C<L</dMARK>>.

=for hackers
Found in file pp.h

=item mPUSHi
X<mPUSHi>

Push an integer onto the stack.  The stack must have room for this element.
Does not use C<TARG>.  See also C<L</PUSHi>>, C<L</mXPUSHi>> and C<L</XPUSHi>>.

	void	mPUSHi(IV iv)

=for hackers
Found in file pp.h

=item mPUSHn
X<mPUSHn>

Push a double onto the stack.  The stack must have room for this element.
Does not use C<TARG>.  See also C<L</PUSHn>>, C<L</mXPUSHn>> and C<L</XPUSHn>>.

	void	mPUSHn(NV nv)

=for hackers
Found in file pp.h

=item mPUSHp
X<mPUSHp>

Push a string onto the stack.  The stack must have room for this element.
The C<len> indicates the length of the string.  Does not use C<TARG>.
See also C<L</PUSHp>>, C<L</mXPUSHp>> and C<L</XPUSHp>>.

	void	mPUSHp(char* str, STRLEN len)

=for hackers
Found in file pp.h

=item mPUSHs
X<mPUSHs>

Push an SV onto the stack and mortalize it.  The stack must have room
for this element.  Does not use C<TARG>.  See also C<L</PUSHs>> and
C<L</mXPUSHs>>.

	void	mPUSHs(SV* sv)

=for hackers
Found in file pp.h

=item mPUSHu
X<mPUSHu>

Push an unsigned integer onto the stack.  The stack must have room for this
element.  Does not use C<TARG>.  See also C<L</PUSHu>>, C<L</mXPUSHu>> and
C<L</XPUSHu>>.

	void	mPUSHu(UV uv)

=for hackers
Found in file pp.h

=item mXPUSHi
X<mXPUSHi>

Push an integer onto the stack, extending the stack if necessary.
Does not use C<TARG>.  See also C<L</XPUSHi>>, C<L</mPUSHi>> and C<L</PUSHi>>.

	void	mXPUSHi(IV iv)

=for hackers
Found in file pp.h

=item mXPUSHn
X<mXPUSHn>

Push a double onto the stack, extending the stack if necessary.
Does not use C<TARG>.  See also C<L</XPUSHn>>, C<L</mPUSHn>> and C<L</PUSHn>>.

	void	mXPUSHn(NV nv)

=for hackers
Found in file pp.h

=item mXPUSHp
X<mXPUSHp>

Push a string onto the stack, extending the stack if necessary.  The C<len>
indicates the length of the string.  Does not use C<TARG>.  See also
C<L</XPUSHp>>, C<L</mPUSHp>> and C<L</PUSHp>>.

	void	mXPUSHp(char* str, STRLEN len)

=for hackers
Found in file pp.h

=item mXPUSHs
X<mXPUSHs>

Push an SV onto the stack, extending the stack if necessary, and mortalize
the SV.  Does not use C<TARG>.  See also C<L</XPUSHs>> and C<L</mPUSHs>>.

	void	mXPUSHs(SV* sv)

=for hackers
Found in file pp.h

=item mXPUSHu
X<mXPUSHu>

Push an unsigned integer onto the stack, extending the stack if necessary.
Does not use C<TARG>.  See also C<L</XPUSHu>>, C<L</mPUSHu>> and C<L</PUSHu>>.

	void	mXPUSHu(UV uv)

=for hackers
Found in file pp.h

=item ORIGMARK
X<ORIGMARK>

The original stack mark for the XSUB.  See C<L</dORIGMARK>>.

=for hackers
Found in file pp.h

=item POPi
X<POPi>

Pops an integer off the stack.

	IV	POPi

=for hackers
Found in file pp.h

=item POPl
X<POPl>

Pops a long off the stack.

	long	POPl

=for hackers
Found in file pp.h

=item POPn
X<POPn>

Pops a double off the stack.

	NV	POPn

=for hackers
Found in file pp.h

=item POPp
X<POPp>

Pops a string off the stack.

	char*	POPp

=for hackers
Found in file pp.h

=item POPpbytex
X<POPpbytex>

Pops a string off the stack.  The string must consist of bytes, i.e.
characters < 256.

	char*	POPpbytex

=for hackers
Found in file pp.h

=item POPpx
X<POPpx>

Pops a string off the stack.  Identical to POPp.  There are two names for
historical reasons.

	char*	POPpx

=for hackers
Found in file pp.h

=item POPs
X<POPs>

Pops an SV off the stack.

	SV*	POPs

=for hackers
Found in file pp.h

=item POPu
X<POPu>

Pops an unsigned integer off the stack.

	UV	POPu

=for hackers
Found in file pp.h

=item POPul
X<POPul>

Pops an unsigned long off the stack.

	unsigned long	POPul

=for hackers
Found in file pp.h

=item PUSHi
X<PUSHi>

Push an integer onto the stack.  The stack must have room for this element.
Handles 'set' magic.  Uses C<TARG>, so C<dTARGET> or C<dXSTARG> should be
called to declare it.  Do not call multiple C<TARG>-oriented macros to
return lists from XSUBs; see C<L</mPUSHi>> instead.  See also C<L</XPUSHi>>
and C<L</mXPUSHi>>.

	void	PUSHi(IV iv)
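
The warning about C<TARG>-oriented macros can be illustrated with a small
plain-C model.  This is only a sketch under simplifying assumptions:
C<sv_t>, C<stack>, C<targ>, C<push_targ> and C<push_mortal> are
hypothetical stand-ins for the SV type, the argument stack, C<TARG>,
C<PUSHi> and C<mPUSHi>; none of them is the real implementation.

```c
    #include <assert.h>

    /* Hypothetical stand-in for an SV that holds an integer. */
    typedef struct { long iv; } sv_t;

    enum { STACK_MAX = 8 };
    static sv_t *stack[STACK_MAX];    /* models the argument stack */
    static int sp = 0;                /* index of the next free slot */
    static sv_t targ;                 /* models TARG: one target per call */
    static sv_t mortals[STACK_MAX];   /* models freshly created mortal SVs */
    static int n_mortals = 0;

    /* Models PUSHi: store the value in the single target, push the target. */
    static void push_targ(long iv) {
        assert(sp < STACK_MAX);
        targ.iv = iv;
        stack[sp++] = &targ;
    }

    /* Models mPUSHi: every push gets its own new value object. */
    static void push_mortal(long iv) {
        assert(sp < STACK_MAX && n_mortals < STACK_MAX);
        mortals[n_mortals].iv = iv;
        stack[sp++] = &mortals[n_mortals++];
    }

    /* Push 10 then 20 through the shared target; report the first slot. */
    long first_after_targ_pushes(void) {
        sp = 0;
        push_targ(10);
        push_targ(20);
        return stack[0]->iv;
    }

    /* Push 10 then 20 as separate objects; report the first slot. */
    long first_after_mortal_pushes(void) {
        sp = 0;
        n_mortals = 0;
        push_mortal(10);
        push_mortal(20);
        return stack[0]->iv;
    }
```

In the model, both slots filled by C<push_targ> point at the same object,
so the first element reads back as the last value pushed, whereas
C<push_mortal> keeps the two values distinct.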

=for hackers
Found in file pp.h

=item PUSHMARK
X<PUSHMARK>

Opening bracket for arguments on a callback.  See C<L</PUTBACK>> and
L<perlcall>.

	void	PUSHMARK(SP)

=for hackers
Found in file pp.h

=item PUSHmortal
X<PUSHmortal>

Push a new mortal SV onto the stack.  The stack must have room for this
element.  Does not use C<TARG>.  See also C<L</PUSHs>>, C<L</XPUSHmortal>> and
C<L</XPUSHs>>.

	void	PUSHmortal()

=for hackers
Found in file pp.h

=item PUSHn
X<PUSHn>

Push a double onto the stack.  The stack must have room for this element.
Handles 'set' magic.  Uses C<TARG>, so C<dTARGET> or C<dXSTARG> should be
called to declare it.  Do not call multiple C<TARG>-oriented macros to
return lists from XSUBs; see C<L</mPUSHn>> instead.  See also C<L</XPUSHn>>
and C<L</mXPUSHn>>.

	void	PUSHn(NV nv)

=for hackers
Found in file pp.h

=item PUSHp
X<PUSHp>

Push a string onto the stack.  The stack must have room for this element.
The C<len> indicates the length of the string.  Handles 'set' magic.  Uses
C<TARG>, so C<dTARGET> or C<dXSTARG> should be called to declare it.  Do not
call multiple C<TARG>-oriented macros to return lists from XSUBs; see
C<L</mPUSHp>> instead.  See also C<L</XPUSHp>> and C<L</mXPUSHp>>.

	void	PUSHp(char* str, STRLEN len)

=for hackers
Found in file pp.h

=item PUSHs
X<PUSHs>

Push an SV onto the stack.  The stack must have room for this element.
Does not handle 'set' magic.  Does not use C<TARG>.  See also
C<L</PUSHmortal>>, C<L</XPUSHs>>, and C<L</XPUSHmortal>>.

	void	PUSHs(SV* sv)

=for hackers
Found in file pp.h

=item PUSHu
X<PUSHu>

Push an unsigned integer onto the stack.  The stack must have room for this
element.  Handles 'set' magic.  Uses C<TARG>, so C<dTARGET> or C<dXSTARG>
should be called to declare it.  Do not call multiple C<TARG>-oriented
macros to return lists from XSUBs; see C<L</mPUSHu>> instead.  See also
C<L</XPUSHu>> and C<L</mXPUSHu>>.

	void	PUSHu(UV uv)

=for hackers
Found in file pp.h

=item PUTBACK
X<PUTBACK>

Closing bracket for XSUB arguments.  This is usually handled by C<xsubpp>.
See C<L</PUSHMARK>> and L<perlcall> for other uses.

		PUTBACK;

=for hackers
Found in file pp.h

=item SP
X<SP>

Stack pointer.  This is usually handled by C<xsubpp>.  See C<L</dSP>> and
C<L</SPAGAIN>>.

=for hackers
Found in file pp.h

=item SPAGAIN
X<SPAGAIN>

Refetch the stack pointer.  Used after a callback.  See L<perlcall>.

		SPAGAIN;

=for hackers
Found in file pp.h

=item XPUSHi
X<XPUSHi>

Push an integer onto the stack, extending the stack if necessary.  Handles
'set' magic.  Uses C<TARG>, so C<dTARGET> or C<dXSTARG> should be called to
declare it.  Do not call multiple C<TARG>-oriented macros to return lists
from XSUBs; see C<L</mXPUSHi>> instead.  See also C<L</PUSHi>> and
C<L</mPUSHi>>.

	void	XPUSHi(IV iv)

=for hackers
Found in file pp.h

=item XPUSHmortal
X<XPUSHmortal>

Push a new mortal SV onto the stack, extending the stack if necessary.
Does not use C<TARG>.  See also C<L</XPUSHs>>, C<L</PUSHmortal>> and
C<L</PUSHs>>.

	void	XPUSHmortal()

=for hackers
Found in file pp.h

=item XPUSHn
X<XPUSHn>

Push a double onto the stack, extending the stack if necessary.  Handles
'set' magic.  Uses C<TARG>, so C<dTARGET> or C<dXSTARG> should be called to
declare it.  Do not call multiple C<TARG>-oriented macros to return lists
from XSUBs; see C<L</mXPUSHn>> instead.  See also C<L</PUSHn>> and
C<L</mPUSHn>>.

	void	XPUSHn(NV nv)

=for hackers
Found in file pp.h

=item XPUSHp
X<XPUSHp>

Push a string onto the stack, extending the stack if necessary.  The C<len>
indicates the length of the string.  Handles 'set' magic.  Uses C<TARG>, so
C<dTARGET> or C<dXSTARG> should be called to declare it.  Do not call
multiple C<TARG>-oriented macros to return lists from XSUBs; see
C<L</mXPUSHp>> instead.  See also C<L</PUSHp>> and C<L</mPUSHp>>.

	void	XPUSHp(char* str, STRLEN len)

=for hackers
Found in file pp.h

=item XPUSHs
X<XPUSHs>

Push an SV onto the stack, extending the stack if necessary.  Does not
handle 'set' magic.  Does not use C<TARG>.  See also C<L</XPUSHmortal>>,
C<L</PUSHs>> and C<L</PUSHmortal>>.

	void	XPUSHs(SV* sv)

=for hackers
Found in file pp.h

=item XPUSHu
X<XPUSHu>

Push an unsigned integer onto the stack, extending the stack if necessary.
Handles 'set' magic.  Uses C<TARG>, so C<dTARGET> or C<dXSTARG> should be
called to declare it.  Do not call multiple C<TARG>-oriented macros to
return lists from XSUBs; see C<L</mXPUSHu>> instead.  See also C<L</PUSHu>> and
C<L</mPUSHu>>.

	void	XPUSHu(UV uv)

=for hackers
Found in file pp.h

=item XSRETURN
X<XSRETURN>

Return from XSUB, indicating number of items on the stack.  This is usually
handled by C<xsubpp>.

	void	XSRETURN(int nitems)

=for hackers
Found in file XSUB.h

=item XSRETURN_EMPTY
X<XSRETURN_EMPTY>

Return an empty list from an XSUB immediately.

		XSRETURN_EMPTY;

=for hackers
Found in file XSUB.h

=item XSRETURN_IV
X<XSRETURN_IV>

Return an integer from an XSUB immediately.  Uses C<XST_mIV>.

	void	XSRETURN_IV(IV iv)

=for hackers
Found in file XSUB.h

=item XSRETURN_NO
X<XSRETURN_NO>

Return C<&PL_sv_no> from an XSUB immediately.  Uses C<XST_mNO>.

		XSRETURN_NO;

=for hackers
Found in file XSUB.h

=item XSRETURN_NV
X<XSRETURN_NV>

Return a double from an XSUB immediately.  Uses C<XST_mNV>.

	void	XSRETURN_NV(NV nv)

=for hackers
Found in file XSUB.h

=item XSRETURN_PV
X<XSRETURN_PV>

Return a copy of a string from an XSUB immediately.  Uses C<XST_mPV>.

	void	XSRETURN_PV(char* str)

=for hackers
Found in file XSUB.h

=item XSRETURN_UNDEF
X<XSRETURN_UNDEF>

Return C<&PL_sv_undef> from an XSUB immediately.  Uses C<XST_mUNDEF>.

		XSRETURN_UNDEF;

=for hackers
Found in file XSUB.h

=item XSRETURN_UV
X<XSRETURN_UV>

Return an unsigned integer from an XSUB immediately.  Uses C<XST_mUV>.

	void	XSRETURN_UV(UV uv)

=for hackers
Found in file XSUB.h

=item XSRETURN_YES
X<XSRETURN_YES>

Return C<&PL_sv_yes> from an XSUB immediately.  Uses C<XST_mYES>.

		XSRETURN_YES;

=for hackers
Found in file XSUB.h

=item XST_mIV
X<XST_mIV>

Place an integer into the specified position C<pos> on the stack.  The
value is stored in a new mortal SV.

	void	XST_mIV(int pos, IV iv)

=for hackers
Found in file XSUB.h

=item XST_mNO
X<XST_mNO>

Place C<&PL_sv_no> into the specified position C<pos> on the
stack.

	void	XST_mNO(int pos)

=for hackers
Found in file XSUB.h

=item XST_mNV
X<XST_mNV>

Place a double into the specified position C<pos> on the stack.  The value
is stored in a new mortal SV.

	void	XST_mNV(int pos, NV nv)

=for hackers
Found in file XSUB.h

=item XST_mPV
X<XST_mPV>

Place a copy of a string into the specified position C<pos> on the stack. 
The value is stored in a new mortal SV.

	void	XST_mPV(int pos, char* str)

=for hackers
Found in file XSUB.h

=item XST_mUNDEF
X<XST_mUNDEF>

Place C<&PL_sv_undef> into the specified position C<pos> on the
stack.

	void	XST_mUNDEF(int pos)

=for hackers
Found in file XSUB.h

=item XST_mYES
X<XST_mYES>

Place C<&PL_sv_yes> into the specified position C<pos> on the
stack.

	void	XST_mYES(int pos)

=for hackers
Found in file XSUB.h


=back

=head1 SV-Body Allocation

=over 8

=item looks_like_number
X<looks_like_number>

Test if the content of an SV looks like a number (or is a number).
C<Inf> and C<Infinity> are treated as numbers (so will not issue a
non-numeric warning), even if your C<atof()> doesn't grok them.  Get-magic is
ignored.

	I32	looks_like_number(SV *const sv)

=for hackers
Found in file sv.c

=item newRV_noinc
X<newRV_noinc>

Creates an RV wrapper for an SV.  The reference count for the original
SV is B<not> incremented.

	SV*	newRV_noinc(SV *const tmpRef)

=for hackers
Found in file sv.c

=item newSV
X<newSV>

Creates a new SV.  A non-zero C<len> parameter indicates the number of
bytes of preallocated string space the SV should have.  An extra byte for a
trailing C<NUL> is also reserved.  (C<SvPOK> is not set for the SV even if string
space is allocated.)  The reference count for the new SV is set to 1.

In 5.9.3, C<newSV()> replaces the older C<NEWSV()> API, and drops the first
parameter, I<x>, a debug aid which allowed callers to identify themselves.
This aid has been superseded by a new build option, C<PERL_MEM_LOG> (see
L<perlhacktips/PERL_MEM_LOG>).  The older API is still there for use in XS
modules supporting older perls.

	SV*	newSV(const STRLEN len)

=for hackers
Found in file sv.c

=item newSVhek
X<newSVhek>

Creates a new SV from the hash key structure.  It will generate scalars that
point to the shared string table where possible.  Returns a new (undefined)
SV if C<hek> is NULL.

	SV*	newSVhek(const HEK *const hek)

=for hackers
Found in file sv.c

=item newSViv
X<newSViv>

Creates a new SV and copies an integer into it.  The reference count for the
SV is set to 1.

	SV*	newSViv(const IV i)

=for hackers
Found in file sv.c

=item newSVnv
X<newSVnv>

Creates a new SV and copies a floating point value into it.
The reference count for the SV is set to 1.

	SV*	newSVnv(const NV n)

=for hackers
Found in file sv.c

=item newSVpv
X<newSVpv>

Creates a new SV and copies a string (which may contain C<NUL> (C<\0>)
characters) into it.  The reference count for the
SV is set to 1.  If C<len> is zero, Perl will compute the length using
C<strlen()>; in that case C<s> cannot contain embedded C<NUL> characters
and must end with a terminating C<NUL> byte.

This function can cause reliability issues if you are likely to pass in
empty strings that are not C<NUL>-terminated, because it will run
C<strlen> on the string and potentially read past valid memory.

L</newSVpvn> is a safer alternative for strings that are not
C<NUL>-terminated.  For string literals use L</newSVpvs> instead.  This
function works fine for C<NUL>-terminated strings, but if you want to avoid
the check that decides whether to call C<strlen>, use C<newSVpvn> instead
(calling C<strlen> yourself).

	SV*	newSVpv(const char *const s, const STRLEN len)
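
The length rule described above (a zero C<len> means "use C<strlen>") can
be sketched in standalone C.  C<effective_len> and the two demo functions
are hypothetical helpers written for this illustration, not part of the
perl API, and the sketch assumes C<s> is never C<NULL>.

```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical helper: the number of bytes that would be copied. */
    size_t effective_len(const char *s, size_t len) {
        assert(s != NULL);                /* simplification for this sketch */
        return len ? len : strlen(s);     /* zero length falls back to strlen */
    }

    /* Length seen when len is 0 and the data has an embedded NUL. */
    size_t demo_len_strlen(void) {
        static const char buf[] = "ab\0cd";
        return effective_len(buf, 0);
    }

    /* Length seen when the caller passes the real byte count. */
    size_t demo_len_explicit(void) {
        static const char buf[] = "ab\0cd";
        return effective_len(buf, sizeof(buf) - 1);
    }
```

With a zero length the model stops at the embedded C<NUL> and sees only
two bytes; passing the byte count explicitly keeps all five, which is the
case the C<newSVpvn> form is meant for.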

=for hackers
Found in file sv.c

=item newSVpvf
X<newSVpvf>

Creates a new SV and initializes it with the string formatted like
C<sv_catpvf>.

	SV*	newSVpvf(const char *const pat, ...)

=for hackers
Found in file sv.c

=item newSVpvn
X<newSVpvn>

Creates a new SV and copies a string into it, which may contain C<NUL> characters
(C<\0>) and other binary data.  The reference count for the SV is set to 1.
Note that if C<len> is zero, Perl will create a zero length (Perl) string.  You
are responsible for ensuring that the source buffer is at least
C<len> bytes long.  If the C<s> argument is NULL the new SV will be
undefined.

	SV*	newSVpvn(const char *const s, const STRLEN len)

=for hackers
Found in file sv.c

=item newSVpvn_flags
X<newSVpvn_flags>

Creates a new SV and copies a string (which may contain C<NUL> (C<\0>)
characters) into it.  The reference count for the
SV is set to 1.  Note that if C<len> is zero, Perl will create a zero length
string.  You are responsible for ensuring that the source string is at least
C<len> bytes long.  If the C<s> argument is NULL the new SV will be undefined.
Currently the only flag bits accepted are C<SVf_UTF8> and C<SVs_TEMP>.
If C<SVs_TEMP> is set, then C<sv_2mortal()> is called on the result before
returning.  If C<SVf_UTF8> is set, C<s>
is considered to be in UTF-8 and the
C<SVf_UTF8> flag will be set on the new SV.
C<newSVpvn_utf8()> is a convenience wrapper for this function, defined as

    #define newSVpvn_utf8(s, len, u)			\
	newSVpvn_flags((s), (len), (u) ? SVf_UTF8 : 0)

	SV*	newSVpvn_flags(const char *const s,
		               const STRLEN len,
		               const U32 flags)

=for hackers
Found in file sv.c

=item newSVpvn_share
X<newSVpvn_share>

Creates a new SV with its C<SvPVX_const> pointing to a shared string in the string
table.  If the string does not already exist in the table, it is
created first.  Turns on the C<SvIsCOW> flag (or C<READONLY>
and C<FAKE> in 5.16 and earlier).  If the C<hash> parameter
is non-zero, that value is used; otherwise the hash is computed.
The string's hash can later be retrieved from the SV
with the C<SvSHARED_HASH()> macro.  The idea here is that, because the
string table is used for shared hash keys, these strings will have
C<SvPVX_const == HeKEY> and hash lookup will avoid a string compare.

	SV*	newSVpvn_share(const char* s, I32 len, U32 hash)
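
The "look up, create if missing, then share the buffer" behaviour
described above can be modelled in a few lines of standalone C.  This is a
deliberately simplified sketch: C<share_string>, C<same_buffer> and the
fixed-size C<table> are hypothetical, the lookup is a linear scan rather
than a hash lookup, and no reference counting or freeing is modelled.

```c
    #include <assert.h>
    #include <stdlib.h>
    #include <string.h>

    enum { TABLE_MAX = 16 };
    static char *table[TABLE_MAX];       /* models the shared string table */
    static size_t table_len[TABLE_MAX];
    static int table_used = 0;

    /* Return the table's copy of s, adding it first if it is not there. */
    const char *share_string(const char *s, size_t len) {
        char *copy;
        int i;
        for (i = 0; i < table_used; i++)
            if (table_len[i] == len && memcmp(table[i], s, len) == 0)
                return table[i];          /* already present: share it */
        assert(table_used < TABLE_MAX);
        copy = malloc(len + 1);
        assert(copy != NULL);
        memcpy(copy, s, len);
        copy[len] = '\0';
        table[table_used] = copy;
        table_len[table_used] = len;
        return table[table_used++];
    }

    /* True if both strings end up pointing at the same shared buffer. */
    int same_buffer(const char *a, const char *b) {
        return share_string(a, strlen(a)) == share_string(b, strlen(b));
    }
```

Because equal strings resolve to the same buffer in this model, comparing
two shared strings reduces to comparing pointers, which mirrors the
C<SvPVX_const == HeKEY> shortcut mentioned above.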

=for hackers
Found in file sv.c

=item newSVpvs
X<newSVpvs>

Like C<newSVpvn>, but takes a C<NUL>-terminated literal string instead of a
string/length pair.

	SV*	newSVpvs(const char* s)

=for hackers
Found in file handy.h

=item newSVpvs_flags
X<newSVpvs_flags>

Like C<newSVpvn_flags>, but takes a C<NUL>-terminated literal string instead of
a string/length pair.

	SV*	newSVpvs_flags(const char* s, U32 flags)

=for hackers
Found in file handy.h

=item newSVpv_share
X<newSVpv_share>

Like C<newSVpvn_share>, but takes a C<NUL>-terminated string instead of a
string/length pair.

	SV*	newSVpv_share(const char* s, U32 hash)

=for hackers
Found in file sv.c

=item newSVpvs_share
X<newSVpvs_share>

Like C<newSVpvn_share>, but takes a C<NUL>-terminated literal string instead of
a string/length pair and omits the hash parameter.

	SV*	newSVpvs_share(const char* s)

=for hackers
Found in file handy.h

=item newSVrv
X<newSVrv>

Creates a new SV for the existing RV, C<rv>, to point to.  If C<rv> is not an
RV then it will be upgraded to one.  If C<classname> is non-null then the new
SV will be blessed in the specified package.  The new SV is returned and its
reference count is 1.  The reference count 1 is owned by C<rv>.

	SV*	newSVrv(SV *const rv,
		        const char *const classname)

=for hackers
Found in file sv.c

=item newSVsv
X<newSVsv>

Creates a new SV which is an exact duplicate of the original SV.
(Uses C<sv_setsv>.)

	SV*	newSVsv(SV *const old)

=for hackers
Found in file sv.c

=item newSV_type
X<newSV_type>

Creates a new SV, of the type specified.  The reference count for the new SV
is set to 1.

	SV*	newSV_type(const svtype type)

=for hackers
Found in file sv.c

=item newSVuv
X<newSVuv>

Creates a new SV and copies an unsigned integer into it.
The reference count for the SV is set to 1.

	SV*	newSVuv(const UV u)

=for hackers
Found in file sv.c

=item sv_2bool
X<sv_2bool>

This macro is only used by C<sv_true()> or its macro equivalent, and only if
the latter's argument is neither C<SvPOK>, C<SvIOK> nor C<SvNOK>.
It calls C<sv_2bool_flags> with the C<SV_GMAGIC> flag.

	bool	sv_2bool(SV *const sv)

=for hackers
Found in file sv.c

=item sv_2bool_flags
X<sv_2bool_flags>

This function is only used by C<sv_true()> and friends, and only if
the latter's argument is neither C<SvPOK>, C<SvIOK> nor C<SvNOK>.  If the flags
contain C<SV_GMAGIC>, then it does an C<mg_get()> first.


	bool	sv_2bool_flags(SV *sv, I32 flags)

=for hackers
Found in file sv.c

=item sv_2cv
X<sv_2cv>

Using various gambits, try to get a CV from an SV; in addition, try if
possible to set C<*st> and C<*gvp> to the stash and GV associated with it.
The flags in C<lref> are passed to C<gv_fetchsv>.

	CV*	sv_2cv(SV* sv, HV **const st, GV **const gvp,
		       const I32 lref)

=for hackers
Found in file sv.c

=item sv_2io
X<sv_2io>

Using various gambits, try to get an IO from an SV: the IO slot if it's a
GV; or the recursive result if we're an RV; or the IO slot of the symbol
named after the PV if we're a string.

'Get' magic is ignored on the C<sv> passed in, but will be called on
C<SvRV(sv)> if C<sv> is an RV.

	IO*	sv_2io(SV *const sv)

=for hackers
Found in file sv.c

=item sv_2iv_flags
X<sv_2iv_flags>

Return the integer value of an SV, doing any necessary string
conversion.  If C<flags> has the C<SV_GMAGIC> bit set, does an C<mg_get()> first.
Normally used via the C<SvIV(sv)> and C<SvIVx(sv)> macros.

	IV	sv_2iv_flags(SV *const sv, const I32 flags)

=for hackers
Found in file sv.c

=item sv_2mortal
X<sv_2mortal>

Marks an existing SV as mortal.  The SV will be destroyed "soon", either
by an explicit call to C<FREETMPS>, or by an implicit call at places such as
statement boundaries.  C<SvTEMP()> is turned on which means that the SV's
string buffer can be "stolen" if this SV is copied.  See also
C<L</sv_newmortal>> and C<L</sv_mortalcopy>>.

	SV*	sv_2mortal(SV *const sv)

=for hackers
Found in file sv.c

=item sv_2nv_flags
X<sv_2nv_flags>

Return the numeric value of an SV, doing any necessary string or integer
conversion.  If C<flags> has the C<SV_GMAGIC> bit set, does an C<mg_get()> first.
Normally used via the C<SvNV(sv)> and C<SvNVx(sv)> macros.

	NV	sv_2nv_flags(SV *const sv, const I32 flags)

=for hackers
Found in file sv.c

=item sv_2pvbyte
X<sv_2pvbyte>

Return a pointer to the byte-encoded representation of the SV, and set C<*lp>
to its length.  May cause the SV to be downgraded from UTF-8 as a
side-effect.

Usually accessed via the C<SvPVbyte> macro.

	char*	sv_2pvbyte(SV *sv, STRLEN *const lp)

=for hackers
Found in file sv.c

=item sv_2pvutf8
X<sv_2pvutf8>

Return a pointer to the UTF-8-encoded representation of the SV, and set C<*lp>
to its length.  May cause the SV to be upgraded to UTF-8 as a side-effect.

Usually accessed via the C<SvPVutf8> macro.

	char*	sv_2pvutf8(SV *sv, STRLEN *const lp)

=for hackers
Found in file sv.c

=item sv_2pv_flags
X<sv_2pv_flags>

Returns a pointer to the string value of an SV, and sets C<*lp> to its length.
If C<flags> has the C<SV_GMAGIC> bit set, does an C<mg_get()> first.  Coerces
C<sv> to a string if necessary.  Normally invoked via the C<SvPV_flags> macro.
C<sv_2pv()> and C<sv_2pv_nomg()> usually end up here too.

	char*	sv_2pv_flags(SV *const sv, STRLEN *const lp,
		             const I32 flags)

=for hackers
Found in file sv.c

=item sv_2uv_flags
X<sv_2uv_flags>

Return the unsigned integer value of an SV, doing any necessary string
conversion.  If C<flags> has the C<SV_GMAGIC> bit set, does an C<mg_get()> first.
Normally used via the C<SvUV(sv)> and C<SvUVx(sv)> macros.

	UV	sv_2uv_flags(SV *const sv, const I32 flags)

=for hackers
Found in file sv.c

=item sv_backoff
X<sv_backoff>

Remove any string offset.  You should normally use the C<SvOOK_off> macro
wrapper instead.

	void	sv_backoff(SV *const sv)

=for hackers
Found in file sv.c

=item sv_bless
X<sv_bless>

Blesses an SV into a specified package.  The SV must be an RV.  The package
must be designated by its stash (see C<L</gv_stashpv>>).  The reference count
of the SV is unaffected.

	SV*	sv_bless(SV *const sv, HV *const stash)

=for hackers
Found in file sv.c

=item sv_catpv
X<sv_catpv>

Concatenates the C<NUL>-terminated string onto the end of the string which is
in the SV.
If the SV has the UTF-8 status set, then the bytes appended should be
valid UTF-8.  Handles 'get' magic, but not 'set' magic.  See
C<L</sv_catpv_mg>>.

	void	sv_catpv(SV *const sv, const char* ptr)

=for hackers
Found in file sv.c

=item sv_catpvf
X<sv_catpvf>

Processes its arguments like C<sv_catpvfn>, and appends the formatted
output to an SV.  As with C<sv_catpvfn> called with a non-null C-style
variable argument list, argument reordering is not supported.
If the appended data contains "wide" characters
(including, but not limited to, SVs with a UTF-8 PV formatted with C<%s>,
and characters >255 formatted with C<%c>), the original SV might get
upgraded to UTF-8.  Handles 'get' magic, but not 'set' magic.  See
C<L</sv_catpvf_mg>>.  If the original SV was UTF-8, the pattern should be
valid UTF-8; if the original SV was bytes, the pattern should be too.

	void	sv_catpvf(SV *const sv, const char *const pat,
		          ...)

=for hackers
Found in file sv.c

=item sv_catpvf_mg
X<sv_catpvf_mg>

Like C<sv_catpvf>, but also handles 'set' magic.

	void	sv_catpvf_mg(SV *const sv,
		             const char *const pat, ...)

=for hackers
Found in file sv.c

=item sv_catpvn
X<sv_catpvn>

Concatenates the string onto the end of the string which is in the SV.
C<len> indicates number of bytes to copy.  If the SV has the UTF-8
status set, then the bytes appended should be valid UTF-8.
Handles 'get' magic, but not 'set' magic.  See C<L</sv_catpvn_mg>>.

	void	sv_catpvn(SV *dsv, const char *sstr, STRLEN len)

=for hackers
Found in file sv.c

=item sv_catpvn_flags
X<sv_catpvn_flags>

Concatenates the string onto the end of the string which is in the SV.  The
C<len> indicates number of bytes to copy.

By default, the string appended is assumed to be valid UTF-8 if the SV has
the UTF-8 status set, and a string of bytes otherwise.  One can force the
appended string to be interpreted as UTF-8 by supplying the C<SV_CATUTF8>
flag, and as bytes by supplying the C<SV_CATBYTES> flag; the SV or the
string appended will be upgraded to UTF-8 if necessary.

If C<flags> has the C<SV_SMAGIC> bit set, C<mg_set> is called on C<dstr>
afterwards if appropriate.
C<sv_catpvn> and C<sv_catpvn_nomg> are implemented
in terms of this function.

	void	sv_catpvn_flags(SV *const dstr,
		                const char *sstr,
		                const STRLEN len,
		                const I32 flags)

=for hackers
Found in file sv.c

=item sv_catpvs
X<sv_catpvs>

Like C<sv_catpvn>, but takes a C<NUL>-terminated literal string instead of a
string/length pair.

	void	sv_catpvs(SV* sv, const char* s)

=for hackers
Found in file handy.h

=item sv_catpvs_flags
X<sv_catpvs_flags>

Like C<sv_catpvn_flags>, but takes a C<NUL>-terminated literal string instead
of a string/length pair.

	void	sv_catpvs_flags(SV* sv, const char* s,
		                I32 flags)

=for hackers
Found in file handy.h

=item sv_catpvs_mg
X<sv_catpvs_mg>

Like C<sv_catpvn_mg>, but takes a C<NUL>-terminated literal string instead of a
string/length pair.

	void	sv_catpvs_mg(SV* sv, const char* s)

=for hackers
Found in file handy.h

=item sv_catpvs_nomg
X<sv_catpvs_nomg>

Like C<sv_catpvn_nomg>, but takes a C<NUL>-terminated literal string instead of
a string/length pair.

	void	sv_catpvs_nomg(SV* sv, const char* s)

=for hackers
Found in file handy.h

=item sv_catpv_flags
X<sv_catpv_flags>

Concatenates the C<NUL>-terminated string onto the end of the string which is
in the SV.
If the SV has the UTF-8 status set, then the bytes appended should
be valid UTF-8.  If C<flags> has the C<SV_SMAGIC> bit set, calls C<mg_set>
on the modified SV if appropriate.

	void	sv_catpv_flags(SV *dstr, const char *sstr,
		               const I32 flags)

=for hackers
Found in file sv.c

=item sv_catpv_mg
X<sv_catpv_mg>

Like C<sv_catpv>, but also handles 'set' magic.

	void	sv_catpv_mg(SV *const sv, const char *const ptr)

=for hackers
Found in file sv.c

=item sv_catsv
X<sv_catsv>

Concatenates the string from SV C<ssv> onto the end of the string in SV
C<dsv>.  If C<ssv> is null, does nothing; otherwise modifies only C<dsv>.
Handles 'get' magic on both SVs, but no 'set' magic.  See C<L</sv_catsv_mg>>
and C<L</sv_catsv_nomg>>.

	void	sv_catsv(SV *dstr, SV *sstr)

=for hackers
Found in file sv.c

=item sv_catsv_flags
X<sv_catsv_flags>

Concatenates the string from SV C<ssv> onto the end of the string in SV
C<dsv>.  If C<ssv> is null, does nothing; otherwise modifies only C<dsv>.
If C<flags> has the C<SV_GMAGIC> bit set, will call C<mg_get> on both SVs if
appropriate.  If C<flags> has the C<SV_SMAGIC> bit set, C<mg_set> will be called on
the modified SV afterward, if appropriate.  C<sv_catsv>, C<sv_catsv_nomg>,
and C<sv_catsv_mg> are implemented in terms of this function.

	void	sv_catsv_flags(SV *const dsv, SV *const ssv,
		               const I32 flags)

=for hackers
Found in file sv.c

=item sv_chop
X<sv_chop>

Efficient removal of characters from the beginning of the string buffer.
C<SvPOK(sv)>, or at least C<SvPOKp(sv)>, must be true and C<ptr> must be a
pointer to somewhere inside the string buffer.  C<ptr> becomes the first
character of the adjusted string.  Uses the C<OOK> hack.  On return, only
C<SvPOK(sv)> and C<SvPOKp(sv)> among the C<OK> flags will be true.

Beware: after this function returns, C<ptr> and C<SvPVX_const(sv)> may no
longer refer to the same chunk of data.

The unfortunate similarity of this function's name to that of Perl's C<chop>
operator is strictly coincidental.  This function works from the left;
C<chop> works from the right.

	void	sv_chop(SV *const sv, const char *const ptr)

=for hackers
Found in file sv.c
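
As a rough illustration of the offset trick described above, the following
standalone C sketch models a string buffer whose start can be advanced
without moving any bytes.  The C<strbuf> structure and the C<strbuf_chop>
function are hypothetical names invented for this example; they are not part
of the Perl API and do not mirror the real SV layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SV's string slots; not Perl's real layout. */
struct strbuf {
    char  *pv;     /* start of the visible string */
    size_t cur;    /* bytes in use, counted from pv */
    size_t len;    /* bytes allocated, counted from pv */
    size_t offset; /* bytes hidden in front of pv (the "offset" hack) */
};

/* Drop everything before ptr without moving any bytes: the start pointer
 * advances and the skipped distance is remembered in offset. */
void strbuf_chop(struct strbuf *sb, const char *ptr)
{
    size_t delta;

    /* ptr must point somewhere inside the visible string. */
    assert(ptr >= sb->pv && ptr <= sb->pv + sb->cur);

    delta = (size_t)(ptr - sb->pv);
    sb->pv     += delta;
    sb->cur    -= delta;
    sb->len    -= delta;
    sb->offset += delta;
}
```

In this model the original allocation can still be recovered as
C<sb-E<gt>pv - sb-E<gt>offset>, which is why removal from the front is cheap.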

=item sv_clear
X<sv_clear>

Clear an SV: call any destructors, free up any memory used by the body,
and free the body itself.  The SV's head is I<not> freed, although
its type is set to all 1's so that it won't inadvertently be assumed
to be live during global destruction etc.
This function should only be called when C<REFCNT> is zero.  Most of the time
you'll want to call C<sv_free()> (or its macro wrapper C<SvREFCNT_dec>)
instead.

	void	sv_clear(SV *const orig_sv)

=for hackers
Found in file sv.c

=item sv_cmp
X<sv_cmp>

Compares the strings in two SVs.  Returns -1, 0, or 1 indicating whether the
string in C<sv1> is less than, equal to, or greater than the string in
C<sv2>.  Is UTF-8 and S<C<'use bytes'>> aware, handles get magic, and will
coerce its args to strings if necessary.  See also C<L</sv_cmp_locale>>.

	I32	sv_cmp(SV *const sv1, SV *const sv2)

=for hackers
Found in file sv.c
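
The -1, 0, 1 return convention can be sketched in plain C as below.  This is
only a byte-wise model under simplifying assumptions: it ignores UTF-8,
S<C<'use bytes'>>, magic and string coercion, and C<bytes_cmp> is a
hypothetical name, not a Perl API function.

```c
#include <stddef.h>
#include <string.h>

/* Byte-wise three-way comparison of two length-delimited strings.
 * Returns -1, 0, or 1.  Embedded NUL bytes are compared like any other. */
int bytes_cmp(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int r = n ? memcmp(a, b, n) : 0;

    if (r != 0)
        return r < 0 ? -1 : 1;   /* normalise memcmp's result */
    if (alen == blen)
        return 0;
    return alen < blen ? -1 : 1; /* the shorter string sorts first */
}
```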

=item sv_cmp_flags
X<sv_cmp_flags>

Compares the strings in two SVs.  Returns -1, 0, or 1 indicating whether the
string in C<sv1> is less than, equal to, or greater than the string in
C<sv2>.  Is UTF-8 and S<C<'use bytes'>> aware and will coerce its args to strings
if necessary.  If C<flags> has the C<SV_GMAGIC> bit set, it handles get magic.  See
also C<L</sv_cmp_locale_flags>>.

	I32	sv_cmp_flags(SV *const sv1, SV *const sv2,
		             const U32 flags)

=for hackers
Found in file sv.c

=item sv_cmp_locale
X<sv_cmp_locale>

Compares the strings in two SVs in a locale-aware manner.  Is UTF-8 and
S<C<'use bytes'>> aware, handles get magic, and will coerce its args to strings
if necessary.  See also C<L</sv_cmp>>.

	I32	sv_cmp_locale(SV *const sv1, SV *const sv2)

=for hackers
Found in file sv.c

=item sv_cmp_locale_flags
X<sv_cmp_locale_flags>

Compares the strings in two SVs in a locale-aware manner.  Is UTF-8 and
S<C<'use bytes'>> aware and will coerce its args to strings if necessary.  If
C<flags> has the C<SV_GMAGIC> bit set, it handles get magic.  See also
C<L</sv_cmp_flags>>.

	I32	sv_cmp_locale_flags(SV *const sv1,
		                    SV *const sv2,
		                    const U32 flags)

=for hackers
Found in file sv.c

=item sv_collxfrm
X<sv_collxfrm>

This calls C<sv_collxfrm_flags> with the C<SV_GMAGIC> flag.  See
C<L</sv_collxfrm_flags>>.

	char*	sv_collxfrm(SV *const sv, STRLEN *const nxp)

=for hackers
Found in file sv.c

=item sv_collxfrm_flags
X<sv_collxfrm_flags>

Add Collate Transform magic to an SV if it doesn't already have it.  If
C<flags> has the C<SV_GMAGIC> bit set, it handles get magic.

Any scalar variable may carry C<PERL_MAGIC_collxfrm> magic that contains the
scalar data of the variable, but transformed to such a format that a normal
memory comparison can be used to compare the data according to the locale
settings.

	char*	sv_collxfrm_flags(SV *const sv,
		                  STRLEN *const nxp,
		                  I32 const flags)

=for hackers
Found in file sv.c

=item sv_copypv
X<sv_copypv>

Copies a stringified representation of the source SV into the
destination SV.  Automatically performs any necessary C<mg_get> and
coercion of numeric values into strings.  Guaranteed to preserve
C<UTF8> flag even from overloaded objects.  Similar in nature to
C<sv_2pv[_flags]> but operates directly on an SV instead of just the
string.  Mostly uses C<sv_2pv_flags> to do its work, except when that
would lose the UTF-8'ness of the PV.

	void	sv_copypv(SV *const dsv, SV *const ssv)

=for hackers
Found in file sv.c

=item sv_copypv_flags
X<sv_copypv_flags>

Implementation of C<sv_copypv> and C<sv_copypv_nomg>.  Calls get magic if and
only if C<flags> has the C<SV_GMAGIC> bit set.

	void	sv_copypv_flags(SV *const dsv, SV *const ssv,
		                const I32 flags)

=for hackers
Found in file sv.c

=item sv_copypv_nomg
X<sv_copypv_nomg>

Like C<sv_copypv>, but doesn't invoke get magic first.

	void	sv_copypv_nomg(SV *const dsv, SV *const ssv)

=for hackers
Found in file sv.c

=item sv_dec
X<sv_dec>

Auto-decrement of the value in the SV, doing string to numeric conversion
if necessary.  Handles 'get' magic and operator overloading.

	void	sv_dec(SV *const sv)

=for hackers
Found in file sv.c

=item sv_dec_nomg
X<sv_dec_nomg>

Auto-decrement of the value in the SV, doing string to numeric conversion
if necessary.  Handles operator overloading.  Skips handling 'get' magic.

	void	sv_dec_nomg(SV *const sv)

=for hackers
Found in file sv.c

=item sv_eq
X<sv_eq>

Returns a boolean indicating whether the strings in the two SVs are
identical.  Is UTF-8 and S<C<'use bytes'>> aware, handles get magic, and will
coerce its args to strings if necessary.

	I32	sv_eq(SV* sv1, SV* sv2)

=for hackers
Found in file sv.c

=item sv_eq_flags
X<sv_eq_flags>

Returns a boolean indicating whether the strings in the two SVs are
identical.  Is UTF-8 and S<C<'use bytes'>> aware and coerces its args to strings
if necessary.  If C<flags> has the C<SV_GMAGIC> bit set, it handles get magic, too.

	I32	sv_eq_flags(SV* sv1, SV* sv2, const U32 flags)

=for hackers
Found in file sv.c

=item sv_force_normal_flags
X<sv_force_normal_flags>

Undo various types of fakery on an SV, where fakery means
"more than" a string: if the PV is a shared string, make
a private copy; if the SV is a reference, unreference it; if it is a glob,
downgrade it to an C<xpvmg>; if it is a copy-on-write scalar, this is the
on-write moment when the copy is made (this is also used locally); if this is a
vstring, drop the vstring magic.  If C<SV_COW_DROP_PV> is set
then a copy-on-write scalar drops its PV buffer (if any) and becomes
C<SvPOK_off> rather than making a copy.  (Used where this
scalar is about to be set to some other value.)  In addition,
the C<flags> parameter gets passed to C<sv_unref_flags()>
when unreffing.  C<sv_force_normal> calls this function
with flags set to 0.

This function is expected to be used to signal to perl that this SV is
about to be written to, and any extra book-keeping needs to be taken care
of.  Hence, it croaks on read-only values.

	void	sv_force_normal_flags(SV *const sv,
		                      const U32 flags)

=for hackers
Found in file sv.c

=item sv_free
X<sv_free>

Decrement an SV's reference count, and if it drops to zero, call
C<sv_clear> to invoke destructors and free up any memory used by
the body, then deallocate the SV's head itself.
Normally called via a wrapper macro C<SvREFCNT_dec>.

	void	sv_free(SV *const sv)

=for hackers
Found in file sv.c
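
The split between clearing the body and freeing the head can be modelled in
standalone C roughly as follows.  Everything here (C<struct obj>,
C<obj_clear>, C<obj_free>) is a hypothetical illustration of the
reference-counting pattern, not Perl's implementation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical refcounted object; the names are illustrative only. */
struct obj {
    unsigned refcnt;
    char    *body;                    /* separately allocated payload */
    void   (*destroy)(struct obj *);  /* optional destructor hook */
};

/* Counterpart of "clear": run the destructor and release the body only.
 * The head (the struct itself) is left allocated. */
void obj_clear(struct obj *o)
{
    if (o->destroy)
        o->destroy(o);
    free(o->body);
    o->body = NULL;
}

/* Counterpart of "free": drop one reference; on the last one, clear the
 * body and then release the head.  Returns 1 if the object was destroyed. */
int obj_free(struct obj *o)
{
    if (o == NULL)
        return 0;
    if (o->refcnt > 1) {
        o->refcnt--;
        return 0;
    }
    obj_clear(o);
    free(o);
    return 1;
}
```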

=item sv_gets
X<sv_gets>

Get a line from the filehandle and store it into the SV, optionally
appending to the currently-stored string.  If C<append> is not 0, the
line is appended to the SV instead of overwriting it.  C<append> should
be set to the byte offset that the appended string should start at
in the SV (typically, C<SvCUR(sv)> is a suitable choice).

	char*	sv_gets(SV *const sv, PerlIO *const fp,
		        I32 append)

=for hackers
Found in file sv.c

=item sv_get_backrefs
X<sv_get_backrefs>


NOTE: this function is experimental and may change or be
removed without notice.


If C<sv> is the target of a weak reference, returns the back-references
structure associated with it; otherwise returns C<NULL>.

When returning a non-null result the type of the return is relevant. If it
is an AV then the elements of the AV are the weak reference RVs which
point at this item. If it is any other type then the item itself is the
weak reference.

See also C<Perl_sv_add_backref()>, C<Perl_sv_del_backref()>, and
C<Perl_sv_kill_backrefs()>.

	SV*	sv_get_backrefs(SV *const sv)

=for hackers
Found in file sv.c

=item sv_grow
X<sv_grow>

Expands the character buffer in the SV.  If necessary, uses C<sv_unref> and
upgrades the SV to C<SVt_PV>.  Returns a pointer to the character buffer.
Use the C<SvGROW> wrapper instead.

	char*	sv_grow(SV *const sv, STRLEN newlen)

=for hackers
Found in file sv.c

=item sv_inc
X<sv_inc>

Auto-increment of the value in the SV, doing string to numeric conversion
if necessary.  Handles 'get' magic and operator overloading.

	void	sv_inc(SV *const sv)

=for hackers
Found in file sv.c

=item sv_inc_nomg
X<sv_inc_nomg>

Auto-increment of the value in the SV, doing string to numeric conversion
if necessary.  Handles operator overloading.  Skips handling 'get' magic.

	void	sv_inc_nomg(SV *const sv)

=for hackers
Found in file sv.c

=item sv_insert
X<sv_insert>

Inserts a string at the specified offset/length within the SV.  Similar to
the Perl C<substr()> function.  Handles get magic.

	void	sv_insert(SV *const bigstr, const STRLEN offset,
		          const STRLEN len,
		          const char *const little,
		          const STRLEN littlelen)

=for hackers
Found in file sv.c
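
The offset/length replacement described above behaves much like four-argument
C<substr()>.  A minimal standalone model on a fixed-capacity C<char> buffer
might look like this; C<buf_insert> and its error convention are assumptions
made for the example and are not part of the Perl API.

```c
#include <stddef.h>
#include <string.h>

/* Replace `len` bytes at `offset` in `big` (which holds `biglen` bytes and
 * has room for `cap`) with the `littlelen` bytes at `little`.  Returns the
 * new length, or (size_t)-1 if the arguments are out of range or the
 * result, plus its trailing NUL, would not fit. */
size_t buf_insert(char *big, size_t biglen, size_t cap,
                  size_t offset, size_t len,
                  const char *little, size_t littlelen)
{
    size_t taillen, newlen;

    if (offset > biglen || len > biglen - offset)
        return (size_t)-1;
    taillen = biglen - offset - len;
    newlen  = offset + littlelen + taillen;
    if (newlen + 1 > cap)
        return (size_t)-1;

    /* Move the tail first (the regions may overlap), then copy in the
     * replacement bytes. */
    memmove(big + offset + littlelen, big + offset + len, taillen);
    if (littlelen)
        memcpy(big + offset, little, littlelen);
    big[newlen] = '\0';
    return newlen;
}
```

With C<len> set to 0 this is a pure insertion; with C<littlelen> set to 0 it
is a deletion.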

=item sv_insert_flags
X<sv_insert_flags>

Same as C<sv_insert>, but the extra C<flags> are passed to the
C<SvPV_force_flags> that applies to C<bigstr>.

	void	sv_insert_flags(SV *const bigstr,
		                const STRLEN offset,
		                const STRLEN len,
		                const char *little,
		                const STRLEN littlelen,
		                const U32 flags)

=for hackers
Found in file sv.c

=item sv_isa
X<sv_isa>

Returns a boolean indicating whether the SV is blessed into the specified
class.  This does not check for subtypes; use C<sv_derived_from> to verify
an inheritance relationship.

	int	sv_isa(SV* sv, const char *const name)

=for hackers
Found in file sv.c

=item sv_isobject
X<sv_isobject>

Returns a boolean indicating whether the SV is an RV pointing to a blessed
object.  If the SV is not an RV, or if the object is not blessed, then this
will return false.

	int	sv_isobject(SV* sv)

=for hackers
Found in file sv.c

=item sv_len
X<sv_len>

Returns the length of the string in the SV.  Handles magic and type
coercion and sets the UTF8 flag appropriately.  See also C<L</SvCUR>>, which
gives raw access to the C<xpv_cur> slot.

	STRLEN	sv_len(SV *const sv)

=for hackers
Found in file sv.c

=item sv_len_utf8
X<sv_len_utf8>

Returns the number of characters in the string in an SV, counting each
multi-byte UTF-8 sequence as a single character.  Handles magic and type coercion.

	STRLEN	sv_len_utf8(SV *const sv)

=for hackers
Found in file sv.c
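
The difference between a byte length and a character length can be shown
with a small standalone C function.  This is a sketch that assumes
well-formed UTF-8 input; C<utf8_char_count> is a hypothetical helper, not the
routine Perl uses.

```c
#include <stddef.h>

/* Count characters in a well-formed UTF-8 byte string: every byte that is
 * not a continuation byte (bit pattern 10xxxxxx) starts a new character. */
size_t utf8_char_count(const char *s, size_t nbytes)
{
    size_t i, chars = 0;

    for (i = 0; i < nbytes; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```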

=item sv_magic
X<sv_magic>

Adds magic to an SV.  First upgrades C<sv> to type C<SVt_PVMG> if
necessary, then adds a new magic item of type C<how> to the head of the
magic list.

See C<L</sv_magicext>> (which C<sv_magic> now calls) for a description of the
handling of the C<name> and C<namlen> arguments.

You need to use C<sv_magicext> to add magic to C<SvREADONLY> SVs and also
to add more than one instance of the same C<how>.

	void	sv_magic(SV *const sv, SV *const obj,
		         const int how, const char *const name,
		         const I32 namlen)

=for hackers
Found in file sv.c

=item sv_magicext
X<sv_magicext>

Adds magic to an SV, upgrading it if necessary.  Applies the
supplied C<vtable> and returns a pointer to the magic added.

Note that C<sv_magicext> will allow things that C<sv_magic> will not.
In particular, you can add magic to C<SvREADONLY> SVs, and add more than
one instance of the same C<how>.

If C<namlen> is greater than zero then a C<savepvn> I<copy> of C<name> is
stored, if C<namlen> is zero then C<name> is stored as-is and - as another
special case - if C<(name && namlen == HEf_SVKEY)> then C<name> is assumed
to contain an SV* and is stored as-is with its C<REFCNT> incremented.

(This is now used as a subroutine by C<sv_magic>.)

	MAGIC *	sv_magicext(SV *const sv, SV *const obj,
		            const int how,
		            const MGVTBL *const vtbl,
		            const char *const name,
		            const I32 namlen)

=for hackers
Found in file sv.c

=item sv_mortalcopy
X<sv_mortalcopy>

Creates a new SV which is a copy of the original SV (using C<sv_setsv>).
The new SV is marked as mortal.  It will be destroyed "soon", either by an
explicit call to C<FREETMPS>, or by an implicit call at places such as
statement boundaries.  See also C<L</sv_newmortal>> and C<L</sv_2mortal>>.

	SV*	sv_mortalcopy(SV *const oldsv)

=for hackers
Found in file sv.c

=item sv_newmortal
X<sv_newmortal>

Creates a new null SV which is mortal.  The reference count of the SV is
set to 1.  It will be destroyed "soon", either by an explicit call to
C<FREETMPS>, or by an implicit call at places such as statement boundaries.
See also C<L</sv_mortalcopy>> and C<L</sv_2mortal>>.

	SV*	sv_newmortal()

=for hackers
Found in file sv.c

=item sv_newref
X<sv_newref>

Increment an SV's reference count.  Use the C<SvREFCNT_inc()> wrapper
instead.

	SV*	sv_newref(SV *const sv)

=for hackers
Found in file sv.c

=item sv_pos_b2u
X<sv_pos_b2u>

Converts the value pointed to by C<offsetp> from a count of bytes from the
start of the string, to a count of the equivalent number of UTF-8 chars.
Handles magic and type coercion.

Use C<sv_pos_b2u_flags> in preference, which correctly handles strings
longer than 2GB.

	void	sv_pos_b2u(SV *const sv, I32 *const offsetp)

=for hackers
Found in file sv.c

=item sv_pos_b2u_flags
X<sv_pos_b2u_flags>

Converts C<offset> from a count of bytes from the start of the string, to
a count of the equivalent number of UTF-8 chars.  Handles type coercion.
C<flags> is passed to C<SvPV_flags>, and usually should be
C<SV_GMAGIC|SV_CONST_RETURN> to handle magic.

	STRLEN	sv_pos_b2u_flags(SV *const sv,
		                 STRLEN const offset, U32 flags)

=for hackers
Found in file sv.c

=item sv_pos_u2b
X<sv_pos_u2b>

Converts the value pointed to by C<offsetp> from a count of UTF-8 chars from
the start of the string, to a count of the equivalent number of bytes; if
C<lenp> is non-zero, it does the same to C<lenp>, but this time starting from
the offset, rather than from the start of the string.  Handles magic and
type coercion.

Use C<sv_pos_u2b_flags> in preference, which correctly handles strings longer
than 2GB.

	void	sv_pos_u2b(SV *const sv, I32 *const offsetp,
		           I32 *const lenp)

=for hackers
Found in file sv.c

=item sv_pos_u2b_flags
X<sv_pos_u2b_flags>

Converts the offset from a count of UTF-8 chars from
the start of the string, to a count of the equivalent number of bytes; if
C<lenp> is non-zero, it does the same to C<lenp>, but this time starting from
C<offset>, rather than from the start
of the string.  Handles type coercion.
C<flags> is passed to C<SvPV_flags>, and usually should be
C<SV_GMAGIC|SV_CONST_RETURN> to handle magic.

	STRLEN	sv_pos_u2b_flags(SV *const sv, STRLEN uoffset,
		                 STRLEN *const lenp, U32 flags)

=for hackers
Found in file sv.c
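
The character-to-byte conversion can be modelled in standalone C as below.
The sketch assumes well-formed UTF-8 and clamps at the end of the buffer;
C<utf8_char_to_byte_offset> is a hypothetical name, and the real function
additionally handles C<lenp>, type coercion and offset caching.

```c
#include <stddef.h>

/* Convert a character offset into a byte offset in well-formed UTF-8.
 * Stops at the end of the buffer if it holds fewer than `uoffset` chars. */
size_t utf8_char_to_byte_offset(const char *s, size_t nbytes, size_t uoffset)
{
    size_t i = 0;

    while (i < nbytes && uoffset > 0) {
        i++;                         /* step over the lead byte */
        while (i < nbytes && ((unsigned char)s[i] & 0xC0) == 0x80)
            i++;                     /* and over its continuation bytes */
        uoffset--;
    }
    return i;
}
```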

=item sv_pvbyten_force
X<sv_pvbyten_force>

The backend for the C<SvPVbytex_force> macro.  Always use the macro
instead.

	char*	sv_pvbyten_force(SV *const sv, STRLEN *const lp)

=for hackers
Found in file sv.c

=item sv_pvn_force
X<sv_pvn_force>

Get a sensible string out of the SV somehow.
A private implementation of the C<SvPV_force> macro for compilers which
can't cope with complex macro expressions.  Always use the macro instead.

	char*	sv_pvn_force(SV* sv, STRLEN* lp)

=for hackers
Found in file sv.c

=item sv_pvn_force_flags
X<sv_pvn_force_flags>

Get a sensible string out of the SV somehow.
If C<flags> has the C<SV_GMAGIC> bit set, calls C<mg_get> on C<sv> if
appropriate, else not.  C<sv_pvn_force> and C<sv_pvn_force_nomg> are
implemented in terms of this function.
You normally want to use the various wrapper macros instead: see
C<L</SvPV_force>> and C<L</SvPV_force_nomg>>.

	char*	sv_pvn_force_flags(SV *const sv,
		                   STRLEN *const lp,
		                   const I32 flags)

=for hackers
Found in file sv.c

=item sv_pvutf8n_force
X<sv_pvutf8n_force>

The backend for the C<SvPVutf8x_force> macro.  Always use the macro
instead.

	char*	sv_pvutf8n_force(SV *const sv, STRLEN *const lp)

=for hackers
Found in file sv.c

=item sv_ref
X<sv_ref>

Returns an SV describing what the SV passed in is a reference to.

C<dst> can be an SV to be set to the description or C<NULL>, in which case a
mortal SV is returned.

If C<ob> is true and the SV is blessed, the description is the class
name, otherwise it is the type of the SV, "SCALAR", "ARRAY" etc.

	SV*	sv_ref(SV *dst, const SV *const sv,
		       const int ob)

=for hackers
Found in file sv.c

=item sv_reftype
X<sv_reftype>

Returns a string describing what the SV is a reference to.

If C<ob> is true and the SV is blessed, the string is the class name,
otherwise it is the type of the SV, "SCALAR", "ARRAY" etc.

	const char* sv_reftype(const SV *const sv, const int ob)

=for hackers
Found in file sv.c

=item sv_replace
X<sv_replace>

Make the first argument a copy of the second, then delete the original.
The target SV physically takes over ownership of the body of the source SV
and inherits its flags; however, the target keeps any magic it owns,
and any magic in the source is discarded.
Note that this is a rather specialist SV copying operation; most of the
time you'll want to use C<sv_setsv> or one of its many macro front-ends.

	void	sv_replace(SV *const sv, SV *const nsv)

=for hackers
Found in file sv.c

=item sv_reset
X<sv_reset>

Underlying implementation for the C<reset> Perl function.
Note that the perl-level function is vaguely deprecated.

	void	sv_reset(const char* s, HV *const stash)

=for hackers
Found in file sv.c

=item sv_rvweaken
X<sv_rvweaken>

Weaken a reference: set the C<SvWEAKREF> flag on this RV; give the
referred-to SV C<PERL_MAGIC_backref> magic if it hasn't already; and
push a back-reference to this RV onto the array of backreferences
associated with that magic.  If the RV is magical, set magic will be
called after the RV is cleared.

	SV*	sv_rvweaken(SV *const sv)

=for hackers
Found in file sv.c

=item sv_setiv
X<sv_setiv>

Copies an integer into the given SV, upgrading first if necessary.
Does not handle 'set' magic.  See also C<L</sv_setiv_mg>>.

	void	sv_setiv(SV *const sv, const IV num)

=for hackers
Found in file sv.c

=item sv_setiv_mg
X<sv_setiv_mg>

Like C<sv_setiv>, but also handles 'set' magic.

	void	sv_setiv_mg(SV *const sv, const IV i)

=for hackers
Found in file sv.c

=item sv_setnv
X<sv_setnv>

Copies a double into the given SV, upgrading first if necessary.
Does not handle 'set' magic.  See also C<L</sv_setnv_mg>>.

	void	sv_setnv(SV *const sv, const NV num)

=for hackers
Found in file sv.c

=item sv_setnv_mg
X<sv_setnv_mg>

Like C<sv_setnv>, but also handles 'set' magic.

	void	sv_setnv_mg(SV *const sv, const NV num)

=for hackers
Found in file sv.c

=item sv_setpv
X<sv_setpv>

Copies a string into an SV.  The string must be terminated with a C<NUL>
character, and must not contain embedded C<NUL>s.
Does not handle 'set' magic.  See C<L</sv_setpv_mg>>.

	void	sv_setpv(SV *const sv, const char *const ptr)

=for hackers
Found in file sv.c

=item sv_setpvf
X<sv_setpvf>

Works like C<sv_catpvf> but copies the text into the SV instead of
appending it.  Does not handle 'set' magic.  See C<L</sv_setpvf_mg>>.

	void	sv_setpvf(SV *const sv, const char *const pat,
		          ...)

=for hackers
Found in file sv.c

=item sv_setpvf_mg
X<sv_setpvf_mg>

Like C<sv_setpvf>, but also handles 'set' magic.

	void	sv_setpvf_mg(SV *const sv,
		             const char *const pat, ...)

=for hackers
Found in file sv.c

=item sv_setpviv
X<sv_setpviv>

Copies an integer into the given SV, also updating its string value.
Does not handle 'set' magic.  See C<L</sv_setpviv_mg>>.

	void	sv_setpviv(SV *const sv, const IV num)

=for hackers
Found in file sv.c

=item sv_setpviv_mg
X<sv_setpviv_mg>

Like C<sv_setpviv>, but also handles 'set' magic.

	void	sv_setpviv_mg(SV *const sv, const IV iv)

=for hackers
Found in file sv.c

=item sv_setpvn
X<sv_setpvn>

Copies a string (possibly containing embedded C<NUL> characters) into an SV.
The C<len> parameter indicates the number of
bytes to be copied.  If the C<ptr> argument is NULL the SV will become
undefined.  Does not handle 'set' magic.  See C<L</sv_setpvn_mg>>.

	void	sv_setpvn(SV *const sv, const char *const ptr,
		          const STRLEN len)

=for hackers
Found in file sv.c

=item sv_setpvn_mg
X<sv_setpvn_mg>

Like C<sv_setpvn>, but also handles 'set' magic.

	void	sv_setpvn_mg(SV *const sv,
		             const char *const ptr,
		             const STRLEN len)

=for hackers
Found in file sv.c

=item sv_setpvs
X<sv_setpvs>

Like C<sv_setpvn>, but takes a C<NUL>-terminated literal string instead of a
string/length pair.

	void	sv_setpvs(SV* sv, const char* s)

=for hackers
Found in file handy.h

=item sv_setpvs_mg
X<sv_setpvs_mg>

Like C<sv_setpvn_mg>, but takes a C<NUL>-terminated literal string instead of a
string/length pair.

	void	sv_setpvs_mg(SV* sv, const char* s)

=for hackers
Found in file handy.h

=item sv_setpv_bufsize
X<sv_setpv_bufsize>

Sets the SV to be a string of C<cur> bytes in length, with at least
C<len> bytes available.  Ensures that there is a C<NUL> byte at C<SvEND>.
Returns a C<char *> pointer to the C<SvPV> buffer.

	char  *	sv_setpv_bufsize(SV *const sv, const STRLEN cur,
		                 const STRLEN len)

=for hackers
Found in file sv.c

=item sv_setpv_mg
X<sv_setpv_mg>

Like C<sv_setpv>, but also handles 'set' magic.

	void	sv_setpv_mg(SV *const sv, const char *const ptr)

=for hackers
Found in file sv.c

=item sv_setref_iv
X<sv_setref_iv>

Copies an integer into a new SV, optionally blessing the SV.  The C<rv>
argument will be upgraded to an RV.  That RV will be modified to point to
the new SV.  The C<classname> argument indicates the package for the
blessing.  Set C<classname> to C<NULL> to avoid the blessing.  The new SV
will have a reference count of 1, and the RV will be returned.

	SV*	sv_setref_iv(SV *const rv,
		             const char *const classname,
		             const IV iv)

=for hackers
Found in file sv.c

=item sv_setref_nv
X<sv_setref_nv>

Copies a double into a new SV, optionally blessing the SV.  The C<rv>
argument will be upgraded to an RV.  That RV will be modified to point to
the new SV.  The C<classname> argument indicates the package for the
blessing.  Set C<classname> to C<NULL> to avoid the blessing.  The new SV
will have a reference count of 1, and the RV will be returned.

	SV*	sv_setref_nv(SV *const rv,
		             const char *const classname,
		             const NV nv)

=for hackers
Found in file sv.c

=item sv_setref_pv
X<sv_setref_pv>

Copies a pointer into a new SV, optionally blessing the SV.  The C<rv>
argument will be upgraded to an RV.  That RV will be modified to point to
the new SV.  If the C<pv> argument is C<NULL>, then C<PL_sv_undef> will be placed
into the SV.  The C<classname> argument indicates the package for the
blessing.  Set C<classname> to C<NULL> to avoid the blessing.  The new SV
will have a reference count of 1, and the RV will be returned.

Do not use with other Perl types such as HV, AV, SV, CV, because those
objects will become corrupted by the pointer copy process.

Note that C<sv_setref_pvn> copies the string while this copies the pointer.

	SV*	sv_setref_pv(SV *const rv,
		             const char *const classname,
		             void *const pv)

=for hackers
Found in file sv.c

=item sv_setref_pvn
X<sv_setref_pvn>

Copies a string into a new SV, optionally blessing the SV.  The length of the
string must be specified with C<n>.  The C<rv> argument will be upgraded to
an RV.  That RV will be modified to point to the new SV.  The C<classname>
argument indicates the package for the blessing.  Set C<classname> to
C<NULL> to avoid the blessing.  The new SV will have a reference count
of 1, and the RV will be returned.

Note that C<sv_setref_pv> copies the pointer while this copies the string.

	SV*	sv_setref_pvn(SV *const rv,
		              const char *const classname,
		              const char *const pv,
		              const STRLEN n)

=for hackers
Found in file sv.c

=item sv_setref_pvs
X<sv_setref_pvs>

Like C<sv_setref_pvn>, but takes a C<NUL>-terminated literal string instead of
a string/length pair.

	SV *	sv_setref_pvs(const char* s)

=for hackers
Found in file handy.h

=item sv_setref_uv
X<sv_setref_uv>

Copies an unsigned integer into a new SV, optionally blessing the SV.  The C<rv>
argument will be upgraded to an RV.  That RV will be modified to point to
the new SV.  The C<classname> argument indicates the package for the
blessing.  Set C<classname> to C<NULL> to avoid the blessing.  The new SV
will have a reference count of 1, and the RV will be returned.

	SV*	sv_setref_uv(SV *const rv,
		             const char *const classname,
		             const UV uv)

=for hackers
Found in file sv.c

=item sv_setsv
X<sv_setsv>

Copies the contents of the source SV C<ssv> into the destination SV
C<dsv>.  The source SV may be destroyed if it is mortal, so don't use this
function if the source SV needs to be reused.  Does not handle 'set' magic on
destination SV.  Calls 'get' magic on source SV.  Loosely speaking, it
performs a copy-by-value, obliterating any previous content of the
destination.

You probably want to use one of the assortment of wrappers, such as
C<SvSetSV>, C<SvSetSV_nosteal>, C<SvSetMagicSV> and
C<SvSetMagicSV_nosteal>.

	void	sv_setsv(SV *dstr, SV *sstr)

=for hackers
Found in file sv.c

=item sv_setsv_flags
X<sv_setsv_flags>

Copies the contents of the source SV C<ssv> into the destination SV
C<dsv>.  The source SV may be destroyed if it is mortal, so don't use this
function if the source SV needs to be reused.  Does not handle 'set' magic.
Loosely speaking, it performs a copy-by-value, obliterating any previous
content of the destination.
If the C<flags> parameter has the C<SV_GMAGIC> bit set, calls C<mg_get> on
C<ssv> if appropriate, else not.  If the C<flags>
parameter has the C<SV_NOSTEAL> bit set then the
buffers of temps will not be stolen.  C<sv_setsv>
and C<sv_setsv_nomg> are implemented in terms of this function.

You probably want to use one of the assortment of wrappers, such as
C<SvSetSV>, C<SvSetSV_nosteal>, C<SvSetMagicSV> and
C<SvSetMagicSV_nosteal>.

This is the primary function for copying scalars, and most other
copy-ish functions and macros use this underneath.

	void	sv_setsv_flags(SV *dstr, SV *sstr,
		               const I32 flags)

=for hackers
Found in file sv.c

=item sv_setsv_mg
X<sv_setsv_mg>

Like C<sv_setsv>, but also handles 'set' magic.

	void	sv_setsv_mg(SV *const dstr, SV *const sstr)

=for hackers
Found in file sv.c

=item sv_setuv
X<sv_setuv>

Copies an unsigned integer into the given SV, upgrading first if necessary.
Does not handle 'set' magic.  See also C<L</sv_setuv_mg>>.

	void	sv_setuv(SV *const sv, const UV num)

=for hackers
Found in file sv.c

=item sv_setuv_mg
X<sv_setuv_mg>

Like C<sv_setuv>, but also handles 'set' magic.

	void	sv_setuv_mg(SV *const sv, const UV u)

=for hackers
Found in file sv.c

=item sv_set_undef
X<sv_set_undef>

Equivalent to C<sv_setsv(sv, &PL_sv_undef)>, but more efficient.
Doesn't handle set magic.

The perl equivalent is C<$sv = undef;>. Note that it doesn't free any string
buffer, unlike C<undef $sv>.

Introduced in perl 5.26.0.

	void	sv_set_undef(SV *sv)

=for hackers
Found in file sv.c

=item sv_tainted
X<sv_tainted>

Test an SV for taintedness.  Use C<SvTAINTED> instead.

	bool	sv_tainted(SV *const sv)

=for hackers
Found in file sv.c

=item sv_true
X<sv_true>

Returns true if the SV has a true value by Perl's rules.
Use the C<SvTRUE> macro instead, which may call C<sv_true()> or may
instead use an in-line version.

	I32	sv_true(SV *const sv)

=for hackers
Found in file sv.c

=item sv_unmagic
X<sv_unmagic>

Removes all magic of type C<type> from an SV.

	int	sv_unmagic(SV *const sv, const int type)

=for hackers
Found in file sv.c

=item sv_unmagicext
X<sv_unmagicext>

Removes all magic of type C<type> with the specified C<vtbl> from an SV.

	int	sv_unmagicext(SV *const sv, const int type,
		              MGVTBL *vtbl)

=for hackers
Found in file sv.c

=item sv_unref_flags
X<sv_unref_flags>

Unsets the RV status of the SV, and decrements the reference count of
whatever was being referenced by the RV.  This can almost be thought of
as a reversal of C<newSVrv>.  The C<flags> argument can contain
C<SV_IMMEDIATE_UNREF> to force the reference count to be decremented
(otherwise the decrementing is conditional on the reference count being
different from one or the reference being a readonly SV).
See C<L</SvROK_off>>.

	void	sv_unref_flags(SV *const ref, const U32 flags)

=for hackers
Found in file sv.c

=item sv_untaint
X<sv_untaint>

Untaint an SV.  Use C<SvTAINTED_off> instead.

	void	sv_untaint(SV *const sv)

=for hackers
Found in file sv.c

=item sv_upgrade
X<sv_upgrade>

Upgrade an SV to a more complex form.  Generally adds a new body type to the
SV, then copies across as much information as possible from the old body.
It croaks if the SV is already in a more complex form than requested.  You
generally want to use the C<SvUPGRADE> macro wrapper, which checks the type
before calling C<sv_upgrade>, and hence does not croak.  See also
C<L</svtype>>.

	void	sv_upgrade(SV *const sv, svtype new_type)

=for hackers
Found in file sv.c

=item sv_usepvn_flags
X<sv_usepvn_flags>

Tells an SV to use C<ptr> to find its string value.  Normally the
string is stored inside the SV, but C<sv_usepvn> allows the SV to use an
outside string.  C<ptr> should point to memory that was allocated
by L<C<Newx>|perlclib/Memory Management and String Handling>.  It must be
the start of a C<Newx>-ed block of memory, and not a pointer to the
middle of it (beware of L<C<OOK>|perlguts/Offsets> and copy-on-write),
and not be from a non-C<Newx> memory allocator like C<malloc>.  The
string length, C<len>, must be supplied.  By default this function
will C<Renew> (i.e. realloc, move) the memory pointed to by C<ptr>,
so that pointer should not be freed or used by the programmer after
giving it to C<sv_usepvn>, and neither should any pointers from "behind"
that pointer (e.g. ptr + 1) be used.

If S<C<flags & SV_SMAGIC>> is true, will call C<SvSETMAGIC>.  If
S<C<flags & SV_HAS_TRAILING_NUL>> is true, then C<ptr[len]> must be C<NUL>,
and the realloc will be skipped (i.e. the buffer is actually at least 1 byte
longer than C<len>, and already meets the requirements for storing in
C<SvPVX>).

	void	sv_usepvn_flags(SV *const sv, char* ptr,
		                const STRLEN len,
		                const U32 flags)

=for hackers
Found in file sv.c

=item sv_utf8_decode
X<sv_utf8_decode>


NOTE: this function is experimental and may change or be
removed without notice.


If the PV of the SV is an octet sequence in Perl's extended UTF-8
and contains a multiple-byte character, the C<SvUTF8> flag is turned on
so that it looks like a character.  If the PV contains only single-byte
characters, the C<SvUTF8> flag stays off.
Scans PV for validity and returns FALSE if the PV is invalid UTF-8.

	bool	sv_utf8_decode(SV *const sv)

=for hackers
Found in file sv.c

=item sv_utf8_downgrade
X<sv_utf8_downgrade>


NOTE: this function is experimental and may change or be
removed without notice.


Attempts to convert the PV of an SV from characters to bytes.
If the PV contains a character that cannot fit
in a byte, this conversion will fail;
in this case, either returns false or, if C<fail_ok> is not
true, croaks.

This is not a general purpose Unicode to byte encoding interface:
use the C<Encode> extension for that.

	bool	sv_utf8_downgrade(SV *const sv,
		                  const bool fail_ok)

=for hackers
Found in file sv.c
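
The failure case described for C<sv_utf8_downgrade> can be pictured with a
small standalone model.  This is only a sketch under stated assumptions: it
does not call the Perl API, C<toy_utf8_downgrade> is a hypothetical name
invented for illustration, and it assumes an ASCII platform where code points
0x80 to 0xFF are encoded with the lead bytes 0xC2 or 0xC3.

```c
#include <assert.h>
#include <stddef.h>

/* Toy downgrade (hypothetical helper, not part of perl): decode UTF-8
 * from in[0..n) into single bytes in out, which must hold at least n
 * bytes.  Returns 1 on success and stores the byte count in *outlen.
 * Returns 0 if a character does not fit in a byte or the input is
 * malformed; out may be partly written in that case. */
int toy_utf8_downgrade(const unsigned char *in, size_t n,
                       unsigned char *out, size_t *outlen)
{
    size_t i = 0, o = 0;

    assert(in != NULL && out != NULL && outlen != NULL);
    while (i < n) {
        unsigned char c = in[i];

        if (c < 0x80) {
            out[o++] = c;                       /* same byte in both forms */
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < n
                   && (in[i + 1] & 0xC0) == 0x80) {
            /* Two-byte sequence for a code point in 0x80..0xFF. */
            out[o++] = (unsigned char)(((c & 0x03) << 6)
                                       | (in[i + 1] & 0x3F));
            i += 2;
        } else {
            return 0;                           /* will not fit in a byte */
        }
    }
    *outlen = o;
    return 1;
}
```

Returning 0 here plays the role of the C<fail_ok> path; the real function
croaks instead when C<fail_ok> is not true.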

=item sv_utf8_encode
X<sv_utf8_encode>

Converts the PV of an SV to UTF-8, but then turns the C<SvUTF8>
flag off so that it looks like octets again.

	void	sv_utf8_encode(SV *const sv)

=for hackers
Found in file sv.c

=item sv_utf8_upgrade
X<sv_utf8_upgrade>

Converts the PV of an SV to its UTF-8-encoded form.
Forces the SV to string form if it is not already.
Will C<mg_get> on C<sv> if appropriate.
Always sets the C<SvUTF8> flag to avoid future validity checks even
if the whole string is the same in UTF-8 as not.
Returns the number of bytes in the converted string.

This is not a general purpose byte encoding to Unicode interface:
use the Encode extension for that.

	STRLEN	sv_utf8_upgrade(SV *sv)

=for hackers
Found in file sv.c
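
To see why the returned byte count can exceed the original length, the
conversion can be modelled outside of perl.  The sketch below is an
illustration only: C<toy_utf8_upgrade> is a hypothetical helper, it works on
a plain byte array rather than an SV, and it assumes an ASCII platform where
each input byte is taken as a code point in the range 0 to 255.

```c
#include <assert.h>
#include <stddef.h>

/* Toy upgrade (hypothetical helper, not part of perl): encode each
 * input byte, treated as a code point 0..255, as UTF-8 into out, which
 * must hold at least 2*n + 1 bytes.  Returns the number of bytes
 * written, not counting the trailing NUL. */
size_t toy_utf8_upgrade(const unsigned char *in, size_t n,
                        unsigned char *out)
{
    size_t i, o = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        if (in[i] < 0x80) {
            out[o++] = in[i];                   /* invariant under UTF-8 */
        } else {
            /* Lead byte is 0xC2 or 0xC3, followed by one continuation. */
            out[o++] = (unsigned char)(0xC0 | (in[i] >> 6));
            out[o++] = (unsigned char)(0x80 | (in[i] & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

A string made only of bytes below 0x80 comes out unchanged, which is the case
where the real function still sets the C<SvUTF8> flag.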

=item sv_utf8_upgrade_flags
X<sv_utf8_upgrade_flags>

Converts the PV of an SV to its UTF-8-encoded form.
Forces the SV to string form if it is not already.
Always sets the C<SvUTF8> flag to avoid future validity checks even
if all the bytes are invariant in UTF-8.
If C<flags> has C<SV_GMAGIC> bit set,
will C<mg_get> on C<sv> if appropriate, else not.

If C<flags> has C<SV_FORCE_UTF8_UPGRADE> set, this function assumes that the PV
will expand when converted to UTF-8, and skips the extra work of checking for
that.  Typically this flag is used by a routine that has already parsed the
string and found such characters, and passes this information on so that the
work doesn't have to be repeated.

Returns the number of bytes in the converted string.

This is not a general purpose byte encoding to Unicode interface:
use the Encode extension for that.

	STRLEN	sv_utf8_upgrade_flags(SV *const sv,
		                      const I32 flags)

=for hackers
Found in file sv.c

=item sv_utf8_upgrade_flags_grow
X<sv_utf8_upgrade_flags_grow>

Like C<sv_utf8_upgrade_flags>, but has an additional parameter C<extra>, which is
the number of unused bytes the string of C<sv> is guaranteed to have free after
it upon return.  This allows the caller to reserve extra space that it intends
to fill, to avoid extra grows.

C<sv_utf8_upgrade>, C<sv_utf8_upgrade_nomg>, and C<sv_utf8_upgrade_flags>
are implemented in terms of this function.

Returns the number of bytes in the converted string (not including the spares).

	STRLEN	sv_utf8_upgrade_flags_grow(SV *const sv,
		                           const I32 flags,
		                           STRLEN extra)

=for hackers
Found in file sv.c

=item sv_utf8_upgrade_nomg
X<sv_utf8_upgrade_nomg>

Like C<sv_utf8_upgrade>, but doesn't do magic on C<sv>.

	STRLEN	sv_utf8_upgrade_nomg(SV *sv)

=for hackers
Found in file sv.c

=item sv_vcatpvf
X<sv_vcatpvf>

Processes its arguments like C<sv_catpvfn> called with a non-null C-style
variable argument list, and appends the formatted output
to an SV.  Does not handle 'set' magic.  See C<L</sv_vcatpvf_mg>>.

Usually used via its frontend C<sv_catpvf>.

	void	sv_vcatpvf(SV *const sv, const char *const pat,
		           va_list *const args)

=for hackers
Found in file sv.c
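
The split between a C<va_list> worker and its variadic frontend, as with
C<sv_vcatpvf> and C<sv_catpvf>, is a common C pattern.  The following is a
minimal sketch of that pattern using only the standard library; C<toy_buf>,
C<toy_vcatf> and C<toy_catf> are hypothetical names, and the fixed-size
buffer is a simplification that a real SV does not share.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Toy fixed-size string; cur is the number of bytes in use. */
typedef struct {
    char   pv[64];
    size_t cur;
} toy_buf;

/* Worker: formats from a va_list pointer and appends to the buffer.
 * Output that does not fit is truncated. */
void toy_vcatf(toy_buf *b, const char *pat, va_list *args)
{
    size_t room;
    int n;

    assert(b != NULL && pat != NULL && args != NULL);
    room = sizeof b->pv - b->cur;       /* includes space for the NUL */
    n = vsnprintf(b->pv + b->cur, room, pat, *args);
    if (n < 0)
        return;                         /* formatting error: cur unchanged */
    b->cur += ((size_t)n < room) ? (size_t)n : room - 1;
}

/* Frontend: collects the variadic arguments and calls the worker. */
void toy_catf(toy_buf *b, const char *pat, ...)
{
    va_list args;

    va_start(args, pat);
    toy_vcatf(b, pat, &args);
    va_end(args);
}
```

Code that already holds a C<va_list> calls the worker directly; everything
else goes through the frontend.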

=item sv_vcatpvfn
X<sv_vcatpvfn>

Documented together with C<L</sv_vcatpvfn_flags>>, which describes the
arguments.  Calling this form always invokes get magic.

	void	sv_vcatpvfn(SV *const sv, const char *const pat,
		            const STRLEN patlen,
		            va_list *const args,
		            SV **const svargs, const I32 svmax,
		            bool *const maybe_tainted)

=for hackers
Found in file sv.c

=item sv_vcatpvfn_flags
X<sv_vcatpvfn_flags>

Processes its arguments like C<vsprintf> and appends the formatted output
to an SV.  Uses an array of SVs if the C-style variable argument list is
missing (C<NULL>). Argument reordering (using format specifiers like C<%2$d>
or C<%*2$d>) is supported only when using an array of SVs; using a C-style
C<va_list> argument list with a format string that uses argument reordering
will yield an exception.

When running with taint checks enabled, indicates via
C<maybe_tainted> if results are untrustworthy (often due to the use of
locales).

If called as C<sv_vcatpvfn> or C<flags> has the C<SV_GMAGIC> bit set, calls get magic.

Usually used via one of its frontends C<sv_vcatpvf> and C<sv_vcatpvf_mg>.

	void	sv_vcatpvfn_flags(SV *const sv,
		                  const char *const pat,
		                  const STRLEN patlen,
		                  va_list *const args,
		                  SV **const svargs,
		                  const I32 svmax,
		                  bool *const maybe_tainted,
		                  const U32 flags)

=for hackers
Found in file sv.c

=item sv_vcatpvf_mg
X<sv_vcatpvf_mg>

Like C<sv_vcatpvf>, but also handles 'set' magic.

Usually used via its frontend C<sv_catpvf_mg>.

	void	sv_vcatpvf_mg(SV *const sv,
		              const char *const pat,
		              va_list *const args)

=for hackers
Found in file sv.c

=item sv_vsetpvf
X<sv_vsetpvf>

Works like C<sv_vcatpvf> but copies the text into the SV instead of
appending it.  Does not handle 'set' magic.  See C<L</sv_vsetpvf_mg>>.

Usually used via its frontend C<sv_setpvf>.

	void	sv_vsetpvf(SV *const sv, const char *const pat,
		           va_list *const args)

=for hackers
Found in file sv.c

=item sv_vsetpvfn
X<sv_vsetpvfn>

Works like C<sv_vcatpvfn> but copies the text into the SV instead of
appending it.

Usually used via one of its frontends C<sv_vsetpvf> and C<sv_vsetpvf_mg>.

	void	sv_vsetpvfn(SV *const sv, const char *const pat,
		            const STRLEN patlen,
		            va_list *const args,
		            SV **const svargs, const I32 svmax,
		            bool *const maybe_tainted)

=for hackers
Found in file sv.c

=item sv_vsetpvf_mg
X<sv_vsetpvf_mg>

Like C<sv_vsetpvf>, but also handles 'set' magic.

Usually used via its frontend C<sv_setpvf_mg>.

	void	sv_vsetpvf_mg(SV *const sv,
		              const char *const pat,
		              va_list *const args)

=for hackers
Found in file sv.c


=back

=head1 SV Flags

=over 8

=item SVt_INVLIST
X<SVt_INVLIST>

Type flag for scalars.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_IV
X<SVt_IV>

Type flag for scalars.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_NULL
X<SVt_NULL>

Type flag for scalars.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_NV
X<SVt_NV>

Type flag for scalars.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_PV
X<SVt_PV>

Type flag for scalars.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_PVAV
X<SVt_PVAV>

Type flag for arrays.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_PVCV
X<SVt_PVCV>

Type flag for subroutines.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_PVFM
X<SVt_PVFM>

Type flag for formats.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_PVGV
X<SVt_PVGV>

Type flag for typeglobs.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_PVHV
X<SVt_PVHV>

Type flag for hashes.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_PVIO
X<SVt_PVIO>

Type flag for I/O objects.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_PVIV
X<SVt_PVIV>

Type flag for scalars.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_PVLV
X<SVt_PVLV>

Type flag for scalars.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_PVMG
X<SVt_PVMG>

Type flag for scalars.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_PVNV
X<SVt_PVNV>

Type flag for scalars.  See L</svtype>.

=for hackers
Found in file sv.h

=item SVt_REGEXP
X<SVt_REGEXP>

Type flag for regular expressions.  See L</svtype>.

=for hackers
Found in file sv.h

=item svtype
X<svtype>

An enum of flags for Perl types.  These are found in the file F<sv.h>
in the C<svtype> enum.  Test these flags with the C<SvTYPE> macro.

The types are:

    SVt_NULL
    SVt_IV
    SVt_NV
    SVt_RV
    SVt_PV
    SVt_PVIV
    SVt_PVNV
    SVt_PVMG
    SVt_INVLIST
    SVt_REGEXP
    SVt_PVGV
    SVt_PVLV
    SVt_PVAV
    SVt_PVHV
    SVt_PVCV
    SVt_PVFM
    SVt_PVIO

These are most easily explained from the bottom up.

C<SVt_PVIO> is for I/O objects, C<SVt_PVFM> for formats, C<SVt_PVCV> for
subroutines, C<SVt_PVHV> for hashes and C<SVt_PVAV> for arrays.

All the others are scalar types, that is, things that can be bound to a
C<$> variable.  For these, the internal types are mostly orthogonal to
types in the Perl language.

Hence, checking C<< SvTYPE(sv) < SVt_PVAV >> is the best way to see whether
something is a scalar.

C<SVt_PVGV> represents a typeglob.  If C<!SvFAKE(sv)>, then it is a real,
incoercible typeglob.  If C<SvFAKE(sv)>, then it is a scalar to which a
typeglob has been assigned.  Assigning to it again will stop it from being
a typeglob.  C<SVt_PVLV> represents a scalar that delegates to another scalar
behind the scenes.  It is used, e.g., for the return value of C<substr> and
for tied hash and array elements.  It can hold any scalar value, including
a typeglob.  C<SVt_REGEXP> is for regular
expressions.  C<SVt_INVLIST> is for Perl
core internal use only.

C<SVt_PVMG> represents a "normal" scalar (not a typeglob, regular expression,
or delegate).  Since most scalars do not need all the internal fields of a
PVMG, we save memory by allocating smaller structs when possible.  All the
other types are just simpler forms of C<SVt_PVMG>, with fewer internal fields.
C<SVt_NULL> can only hold undef.  C<SVt_IV> can hold undef, an integer, or a
reference.  (C<SVt_RV> is an alias for C<SVt_IV>, which exists for backward
compatibility.)  C<SVt_NV> can hold any of those or a double.  C<SVt_PV> can only
hold C<undef> or a string.  C<SVt_PVIV> is a superset of C<SVt_PV> and C<SVt_IV>.
C<SVt_PVNV> is similar.  C<SVt_PVMG> can hold anything C<SVt_PVNV> can hold, but it
can, but does not have to, be blessed or magical.

=for hackers
Found in file sv.h
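
The scalar test described above relies only on the ordering of the enum.  As
a rough standalone illustration, the sketch below copies the names from the
list in the same order into a toy enum; the C<TOY_> prefix and
C<toy_is_scalar> are hypothetical, and the numeric values are not claimed to
match those in F<sv.h>.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy copy of the type names listed above, in the same order.
 * Illustration only: the real enum lives in sv.h. */
typedef enum {
    TOY_SVt_NULL, TOY_SVt_IV, TOY_SVt_NV, TOY_SVt_PV, TOY_SVt_PVIV,
    TOY_SVt_PVNV, TOY_SVt_PVMG, TOY_SVt_INVLIST, TOY_SVt_REGEXP,
    TOY_SVt_PVGV, TOY_SVt_PVLV, TOY_SVt_PVAV, TOY_SVt_PVHV,
    TOY_SVt_PVCV, TOY_SVt_PVFM, TOY_SVt_PVIO
} toy_svtype;

/* SVt_RV is described above as an alias for SVt_IV. */
#define TOY_SVt_RV TOY_SVt_IV

/* Scalars are every type ordered before the array type. */
bool toy_is_scalar(toy_svtype t)
{
    assert(t <= TOY_SVt_PVIO);
    return t < TOY_SVt_PVAV;
}
```

In real XS code the same comparison would be written against
C<SvTYPE(sv)> and C<SVt_PVAV>.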


=back

=head1 SV Manipulation Functions

=over 8

=item boolSV
X<boolSV>

Returns a true SV if C<b> is a true value, or a false SV if C<b> is 0.

See also C<L</PL_sv_yes>> and C<L</PL_sv_no>>.

	SV *	boolSV(bool b)

=for hackers
Found in file sv.h

=item croak_xs_usage
X<croak_xs_usage>

A specialised variant of C<croak()> for emitting the usage message for xsubs.

    croak_xs_usage(cv, "eee_yow");

works out the package name and subroutine name from C<cv>, and then calls
C<croak()>.  Hence if C<cv> is C<&ouch::awk>, it would call C<croak> as:

 Perl_croak(aTHX_ "Usage: %" SVf "::%" SVf "(%s)", "ouch", "awk",
                                                     "eee_yow");

	void	croak_xs_usage(const CV *const cv,
		               const char *const params)

=for hackers
Found in file universal.c

=item get_sv
X<get_sv>

Returns the SV of the specified Perl scalar.  C<flags> are passed to
C<gv_fetchpv>.  If C<GV_ADD> is set and the
Perl variable does not exist then it will be created.  If C<flags> is zero
and the variable does not exist then NULL is returned.

NOTE: the perl_ form of this function is deprecated.

	SV*	get_sv(const char *name, I32 flags)

=for hackers
Found in file perl.c

=item newRV_inc
X<newRV_inc>

Creates an RV wrapper for an SV.  The reference count for the original SV is
incremented.

	SV*	newRV_inc(SV* sv)

=for hackers
Found in file sv.h

=item newSVpadname
X<newSVpadname>


NOTE: this function is experimental and may change or be
removed without notice.


Creates a new SV containing the pad name.

	SV*	newSVpadname(PADNAME *pn)

=for hackers
Found in file sv.h

=item newSVpvn_utf8
X<newSVpvn_utf8>

Creates a new SV and copies a string (which may contain C<NUL> (C<\0>)
characters) into it.  If C<utf8> is true, calls
C<SvUTF8_on> on the new SV.  Implemented as a wrapper around C<newSVpvn_flags>.

	SV*	newSVpvn_utf8(NULLOK const char* s, STRLEN len,
		              U32 utf8)

=for hackers
Found in file sv.h

=item sv_catpvn_nomg
X<sv_catpvn_nomg>

Like C<sv_catpvn> but doesn't process magic.

	void	sv_catpvn_nomg(SV* sv, const char* ptr,
		               STRLEN len)

=for hackers
Found in file sv.h

=item sv_catpv_nomg
X<sv_catpv_nomg>

Like C<sv_catpv> but doesn't process magic.

	void	sv_catpv_nomg(SV* sv, const char* ptr)

=for hackers
Found in file sv.h

=item sv_catsv_nomg
X<sv_catsv_nomg>

Like C<sv_catsv> but doesn't process magic.

	void	sv_catsv_nomg(SV* dsv, SV* ssv)

=for hackers
Found in file sv.h

=item SvCUR
X<SvCUR>

Returns the length of the string which is in the SV.  See C<L</SvLEN>>.

	STRLEN	SvCUR(SV* sv)

=for hackers
Found in file sv.h

=item SvCUR_set
X<SvCUR_set>

Set the current length of the string which is in the SV.  See C<L</SvCUR>>
and C<L</SvIV_set>>.

	void	SvCUR_set(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item sv_derived_from
X<sv_derived_from>

Exactly like L</sv_derived_from_pv>, but doesn't take a C<flags> parameter.

	bool	sv_derived_from(SV* sv, const char *const name)

=for hackers
Found in file universal.c

=item sv_derived_from_pv
X<sv_derived_from_pv>

Exactly like L</sv_derived_from_pvn>, but takes a nul-terminated string 
instead of a string/length pair.

	bool	sv_derived_from_pv(SV* sv,
		                   const char *const name,
		                   U32 flags)

=for hackers
Found in file universal.c

=item sv_derived_from_pvn
X<sv_derived_from_pvn>

Returns a boolean indicating whether the SV is derived from the specified class
I<at the C level>.  To check derivation at the Perl level, call C<isa()> as a
normal Perl method.

Currently, the only significant value for C<flags> is C<SVf_UTF8>.

	bool	sv_derived_from_pvn(SV* sv,
		                    const char *const name,
		                    const STRLEN len, U32 flags)

=for hackers
Found in file universal.c

=item sv_derived_from_sv
X<sv_derived_from_sv>

Exactly like L</sv_derived_from_pvn>, but takes the name string in the form
of an SV instead of a string/length pair.

	bool	sv_derived_from_sv(SV* sv, SV *namesv,
		                   U32 flags)

=for hackers
Found in file universal.c

=item sv_does
X<sv_does>

Like L</sv_does_pv>, but doesn't take a C<flags> parameter.

	bool	sv_does(SV* sv, const char *const name)

=for hackers
Found in file universal.c

=item sv_does_pv
X<sv_does_pv>

Like L</sv_does_sv>, but takes a nul-terminated string instead of an SV.

	bool	sv_does_pv(SV* sv, const char *const name,
		           U32 flags)

=for hackers
Found in file universal.c

=item sv_does_pvn
X<sv_does_pvn>

Like L</sv_does_sv>, but takes a string/length pair instead of an SV.

	bool	sv_does_pvn(SV* sv, const char *const name,
		            const STRLEN len, U32 flags)

=for hackers
Found in file universal.c

=item sv_does_sv
X<sv_does_sv>

Returns a boolean indicating whether the SV performs a specific, named role.
The SV can be a Perl object or the name of a Perl class.

	bool	sv_does_sv(SV* sv, SV* namesv, U32 flags)

=for hackers
Found in file universal.c

=item SvEND
X<SvEND>

Returns a pointer to the spot just after the last character in
the string which is in the SV, where there is usually a trailing
C<NUL> character (even though Perl scalars do not strictly require it).
See C<L</SvCUR>>.  Access the character as C<*(SvEND(sv))>.

Warning: If C<SvCUR> is equal to C<SvLEN>, then C<SvEND> points to
unallocated memory.

	char*	SvEND(SV* sv)

=for hackers
Found in file sv.h

=item SvGAMAGIC
X<SvGAMAGIC>

Returns true if the SV has get magic or
overloading.  If either is true then
the scalar is active data, and has the potential to return a new value every
time it is accessed.  Hence you must be careful to
only read it once per user logical operation and work
with that returned value.  If neither is true then
the scalar's value cannot change unless written to.

	U32	SvGAMAGIC(SV* sv)

=for hackers
Found in file sv.h

=item SvGROW
X<SvGROW>

Expands the character buffer in the SV so that it has room for the
indicated number of bytes (remember to reserve space for an extra trailing
C<NUL> character).  Calls C<sv_grow> to perform the expansion if necessary.
Returns a pointer to the character
buffer.  SV must be of type >= C<SVt_PV>.  One
alternative is to call C<sv_grow> if you are not sure of the type of SV.

You might mistakenly think that C<len> is the number of bytes to add to the
existing size, but instead it is the total size C<sv> should be.

	char *	SvGROW(SV* sv, STRLEN len)

=for hackers
Found in file sv.h
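
The point that C<len> is a total size, not an increment, is easy to get wrong
when appending.  The sketch below models it with a plain heap buffer rather
than an SV; C<toy_sv>, C<toy_grow> and C<toy_append> are hypothetical names,
and unlike a real allocator the toy never rounds the request up.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy string buffer: cur = bytes in use, len = bytes allocated. */
typedef struct {
    char  *pv;
    size_t cur;
    size_t len;
} toy_sv;

/* Ensure at least newlen bytes are allocated in total (not newlen
 * more).  Returns the buffer, or NULL on allocation failure. */
char *toy_grow(toy_sv *sv, size_t newlen)
{
    assert(sv != NULL);
    if (newlen > sv->len) {
        char *p = realloc(sv->pv, newlen);

        if (p == NULL)
            return NULL;
        sv->pv  = p;
        sv->len = newlen;
    }
    return sv->pv;
}

/* Append n bytes: request current length + n, plus 1 for the NUL. */
int toy_append(toy_sv *sv, const char *src, size_t n)
{
    assert(sv != NULL && src != NULL);
    if (toy_grow(sv, sv->cur + n + 1) == NULL)
        return -1;
    memcpy(sv->pv + sv->cur, src, n);
    sv->cur += n;
    sv->pv[sv->cur] = '\0';
    return 0;
}
```

Passing only C<n> to the grow step would under-allocate as soon as the buffer
already held data, which is the mistake the note above warns about.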

=item SvIOK
X<SvIOK>

Returns a U32 value indicating whether the SV contains an integer.

	U32	SvIOK(SV* sv)

=for hackers
Found in file sv.h

=item SvIOK_notUV
X<SvIOK_notUV>

Returns a boolean indicating whether the SV contains a signed integer.

	bool	SvIOK_notUV(SV* sv)

=for hackers
Found in file sv.h

=item SvIOK_off
X<SvIOK_off>

Unsets the IV status of an SV.

	void	SvIOK_off(SV* sv)

=for hackers
Found in file sv.h

=item SvIOK_on
X<SvIOK_on>

Tells an SV that it is an integer.

	void	SvIOK_on(SV* sv)

=for hackers
Found in file sv.h

=item SvIOK_only
X<SvIOK_only>

Tells an SV that it is an integer and disables all other C<OK> bits.

	void	SvIOK_only(SV* sv)

=for hackers
Found in file sv.h

=item SvIOK_only_UV
X<SvIOK_only_UV>

Tells an SV that it is an unsigned integer and disables all other C<OK> bits.

	void	SvIOK_only_UV(SV* sv)

=for hackers
Found in file sv.h

=item SvIOKp
X<SvIOKp>

Returns a U32 value indicating whether the SV contains an integer.  Checks
the B<private> setting.  Use C<SvIOK> instead.

	U32	SvIOKp(SV* sv)

=for hackers
Found in file sv.h

=item SvIOK_UV
X<SvIOK_UV>

Returns a boolean indicating whether the SV contains an integer that must be
interpreted as unsigned.  A non-negative integer whose value is within the
range of both an IV and a UV may be flagged as either C<SvUOK> or C<SvIOK>.

	bool	SvIOK_UV(SV* sv)

=for hackers
Found in file sv.h

=item SvIsCOW
X<SvIsCOW>

Returns a U32 value indicating whether the SV is Copy-On-Write (either shared
hash key scalars, or full Copy On Write scalars if 5.9.0 is configured for
COW).

	U32	SvIsCOW(SV* sv)

=for hackers
Found in file sv.h

=item SvIsCOW_shared_hash
X<SvIsCOW_shared_hash>

Returns a boolean indicating whether the SV is a Copy-On-Write shared hash key
scalar.

	bool	SvIsCOW_shared_hash(SV* sv)

=for hackers
Found in file sv.h

=item SvIV
X<SvIV>

Coerces the given SV to IV and returns it.  The returned value in many
circumstances will get stored in C<sv>'s IV slot, but not in all cases.  (Use
C<L</sv_setiv>> to make sure it does).

See C<L</SvIVx>> for a version which guarantees to evaluate C<sv> only once.

	IV	SvIV(SV* sv)

=for hackers
Found in file sv.h

=item SvIV_nomg
X<SvIV_nomg>

Like C<SvIV> but doesn't process magic.

	IV	SvIV_nomg(SV* sv)

=for hackers
Found in file sv.h

=item SvIV_set
X<SvIV_set>

Set the value of the IV pointer in C<sv> to C<val>.  It is possible to perform
the same function of this macro with an lvalue assignment to C<SvIVX>.
With future Perls, however, it will be more efficient to use
C<SvIV_set> instead of the lvalue assignment to C<SvIVX>.

	void	SvIV_set(SV* sv, IV val)

=for hackers
Found in file sv.h

=item SvIVX
X<SvIVX>

Returns the raw value in the SV's IV slot, without checks or conversions.
Only use when you are sure C<SvIOK> is true.  See also C<L</SvIV>>.

	IV	SvIVX(SV* sv)

=for hackers
Found in file sv.h

=item SvIVx
X<SvIVx>

Coerces the given SV to IV and returns it.  The returned value in many
circumstances will get stored in C<sv>'s IV slot, but not in all cases.  (Use
C<L</sv_setiv>> to make sure it does).

This form guarantees to evaluate C<sv> only once.  Only use this if C<sv> is an
expression with side effects, otherwise use the more efficient C<SvIV>.

	IV	SvIVx(SV* sv)

=for hackers
Found in file sv.h
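
The "evaluate C<sv> only once" guarantee matters because a macro may expand
its argument more than once.  The sketch below shows the hazard with toy
definitions; C<TOY_IV>, C<toy_ivx>, C<toy_fetch> and the struct layout are
hypothetical and only assume that the plain macro has the general shape
"check a flag, then read a slot or convert".

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int  iok;   /* non-zero when iv holds a valid integer */
    long iv;
} toy_sv;

/* Stand-in for a conversion routine; always yields 0 in this toy. */
long toy_coerce(const toy_sv *sv)
{
    (void)sv;
    return 0;
}

/* Macro form: the argument appears in the test and again in whichever
 * branch is taken, so it is evaluated twice. */
#define TOY_IV(sv)  ((sv)->iok ? (sv)->iv : toy_coerce(sv))

/* Function form: the argument expression is evaluated exactly once. */
long toy_ivx(const toy_sv *sv)
{
    assert(sv != NULL);
    return sv->iok ? sv->iv : toy_coerce(sv);
}

/* An argument expression with a side effect: counts its evaluations. */
int toy_fetches = 0;

const toy_sv *toy_fetch(const toy_sv *sv)
{
    toy_fetches++;
    return sv;
}
```

With an argument such as C<toy_fetch(&x)>, the macro form runs the side
effect twice and the function form runs it once, which is the difference
between C<SvIV> and C<SvIVx> that the text describes.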

=item SvLEN
X<SvLEN>

Returns the size of the string buffer in the SV, not including any part
attributable to C<SvOOK>.  See C<L</SvCUR>>.

	STRLEN	SvLEN(SV* sv)

=for hackers
Found in file sv.h

=item SvLEN_set
X<SvLEN_set>

Set the size of the string buffer for the SV. See C<L</SvLEN>>.

	void	SvLEN_set(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvMAGIC_set
X<SvMAGIC_set>

Set the value of the MAGIC pointer in C<sv> to C<val>.  See C<L</SvIV_set>>.

	void	SvMAGIC_set(SV* sv, MAGIC* val)

=for hackers
Found in file sv.h

=item SvNIOK
X<SvNIOK>

Returns a U32 value indicating whether the SV contains a number, integer or
double.

	U32	SvNIOK(SV* sv)

=for hackers
Found in file sv.h

=item SvNIOK_off
X<SvNIOK_off>

Unsets the NV/IV status of an SV.

	void	SvNIOK_off(SV* sv)

=for hackers
Found in file sv.h

=item SvNIOKp
X<SvNIOKp>

Returns a U32 value indicating whether the SV contains a number, integer or
double.  Checks the B<private> setting.  Use C<SvNIOK> instead.

	U32	SvNIOKp(SV* sv)

=for hackers
Found in file sv.h

=item SvNOK
X<SvNOK>

Returns a U32 value indicating whether the SV contains a double.

	U32	SvNOK(SV* sv)

=for hackers
Found in file sv.h

=item SvNOK_off
X<SvNOK_off>

Unsets the NV status of an SV.

	void	SvNOK_off(SV* sv)

=for hackers
Found in file sv.h

=item SvNOK_on
X<SvNOK_on>

Tells an SV that it is a double.

	void	SvNOK_on(SV* sv)

=for hackers
Found in file sv.h

=item SvNOK_only
X<SvNOK_only>

Tells an SV that it is a double and disables all other OK bits.

	void	SvNOK_only(SV* sv)

=for hackers
Found in file sv.h

=item SvNOKp
X<SvNOKp>

Returns a U32 value indicating whether the SV contains a double.  Checks the
B<private> setting.  Use C<SvNOK> instead.

	U32	SvNOKp(SV* sv)

=for hackers
Found in file sv.h

=item SvNV
X<SvNV>

Coerces the given SV to NV and returns it.  The returned value in many
circumstances will get stored in C<sv>'s NV slot, but not in all cases.  (Use
C<L</sv_setnv>> to make sure it does).

See C<L</SvNVx>> for a version which guarantees to evaluate C<sv> only once.

	NV	SvNV(SV* sv)

=for hackers
Found in file sv.h

=item SvNV_nomg
X<SvNV_nomg>

Like C<SvNV> but doesn't process magic.

	NV	SvNV_nomg(SV* sv)

=for hackers
Found in file sv.h

=item SvNV_set
X<SvNV_set>

Set the value of the NV pointer in C<sv> to C<val>.  See C<L</SvIV_set>>.

	void	SvNV_set(SV* sv, NV val)

=for hackers
Found in file sv.h

=item SvNVX
X<SvNVX>

Returns the raw value in the SV's NV slot, without checks or conversions.
Only use when you are sure C<SvNOK> is true.  See also C<L</SvNV>>.

	NV	SvNVX(SV* sv)

=for hackers
Found in file sv.h

=item SvNVx
X<SvNVx>

Coerces the given SV to NV and returns it.  The returned value in many
circumstances will get stored in C<sv>'s NV slot, but not in all cases.  (Use
C<L</sv_setnv>> to make sure it does).

This form guarantees to evaluate C<sv> only once.  Only use this if C<sv> is an
expression with side effects, otherwise use the more efficient C<SvNV>.

	NV	SvNVx(SV* sv)

=for hackers
Found in file sv.h

=item SvOK
X<SvOK>

Returns a U32 value indicating whether the value is defined.  This is
only meaningful for scalars.

	U32	SvOK(SV* sv)

=for hackers
Found in file sv.h

=item SvOOK
X<SvOOK>

Returns a U32 indicating whether the pointer to the string buffer is offset.
This hack is used internally to speed up removal of characters from the
beginning of a C<SvPV>.  When C<SvOOK> is true, then the start of the
allocated string buffer is actually C<SvOOK_offset()> bytes before C<SvPVX>.
This offset used to be stored in C<SvIVX>, but is now stored within the spare
part of the buffer.

	U32	SvOOK(SV* sv)

=for hackers
Found in file sv.h

=item SvOOK_offset
X<SvOOK_offset>

Reads into C<len> the offset from C<SvPVX> back to the true start of the
allocated buffer, which will be non-zero if C<sv_chop> has been used to
efficiently remove characters from start of the buffer.  Implemented as a
macro, which takes the address of C<len>, which must be of type C<STRLEN>.
Evaluates C<sv> more than once.  Sets C<len> to 0 if C<SvOOK(sv)> is false.

	void	SvOOK_offset(NN SV*sv, STRLEN len)

=for hackers
Found in file sv.h
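
The offset trick behind C<SvOOK> can be pictured as "advance the string
pointer instead of moving the bytes".  The model below is a sketch only:
C<toy_sv>, C<toy_chop> and C<toy_alloc_start> are hypothetical, and the
offset is kept in an ordinary struct field for simplicity, whereas the text
above says perl stores it in the spare part of the buffer.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    char  *pv;   /* current start of the string */
    size_t cur;  /* bytes in use from pv */
    size_t len;  /* bytes available from pv */
    size_t ook;  /* distance from the true allocation start to pv */
} toy_sv;

/* Drop n bytes from the front without copying the remainder. */
void toy_chop(toy_sv *sv, size_t n)
{
    assert(sv != NULL && n <= sv->cur);
    sv->pv  += n;
    sv->cur -= n;
    sv->len -= n;   /* usable size no longer counts the skipped part */
    sv->ook += n;
}

/* Recover the pointer that was originally allocated. */
char *toy_alloc_start(const toy_sv *sv)
{
    assert(sv != NULL);
    return sv->pv - sv->ook;
}
```

The recovered pointer is the one that would have to be handed back to the
allocator, which is why code that frees or replaces a buffer needs to account
for the offset first.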

=item SvPOK
X<SvPOK>

Returns a U32 value indicating whether the SV contains a character
string.

	U32	SvPOK(SV* sv)

=for hackers
Found in file sv.h

=item SvPOK_off
X<SvPOK_off>

Unsets the PV status of an SV.

	void	SvPOK_off(SV* sv)

=for hackers
Found in file sv.h

=item SvPOK_on
X<SvPOK_on>

Tells an SV that it is a string.

	void	SvPOK_on(SV* sv)

=for hackers
Found in file sv.h

=item SvPOK_only
X<SvPOK_only>

Tells an SV that it is a string and disables all other C<OK> bits.
Will also turn off the UTF-8 status.

	void	SvPOK_only(SV* sv)

=for hackers
Found in file sv.h

=item SvPOK_only_UTF8
X<SvPOK_only_UTF8>

Tells an SV that it is a string and disables all other C<OK> bits,
and leaves the UTF-8 status as it was.

	void	SvPOK_only_UTF8(SV* sv)

=for hackers
Found in file sv.h

=item SvPOKp
X<SvPOKp>

Returns a U32 value indicating whether the SV contains a character string.
Checks the B<private> setting.  Use C<SvPOK> instead.

	U32	SvPOKp(SV* sv)

=for hackers
Found in file sv.h

=item SvPV
X<SvPV>

Returns a pointer to the string in the SV, or a stringified form of
the SV if the SV does not contain a string.  The SV may cache the
stringified version, becoming C<SvPOK>.  Handles 'get' magic.  The
C<len> variable will be set to the length of the string (this is a macro, so
don't use C<&len>).  See also C<L</SvPVx>> for a version which guarantees to
evaluate C<sv> only once.

Note that there is no guarantee that the return value of C<SvPV()> is
equal to C<SvPVX(sv)>, or that C<SvPVX(sv)> contains valid data, or that
successive calls to C<SvPV(sv)> will return the same pointer value each
time.  This is due to the way that things like overloading and
Copy-On-Write are handled.  In these cases, the return value may point to
a temporary buffer or similar.  If you absolutely need the C<SvPVX> field to
be valid (for example, if you intend to write to it), then see
C<L</SvPV_force>>.

	char*	SvPV(SV* sv, STRLEN len)

=for hackers
Found in file sv.h
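
The remark that C<len> is passed by name rather than as C<&len> follows from
C<SvPV> being a macro that assigns to its argument.  The sketch below
contrasts a toy macro of that kind with an ordinary function; C<TOY_PV>,
C<toy_pv_fn> and C<toy_sv> are hypothetical and ignore magic and
stringification entirely.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *pv;
    size_t      cur;
} toy_sv;

/* Macro form: assigns to its second argument by name, so the caller
 * passes the variable itself, not its address. */
#define TOY_PV(sv, len)  ((len) = (sv)->cur, (sv)->pv)

/* Function form, for contrast: here the caller must pass &len. */
const char *toy_pv_fn(const toy_sv *sv, size_t *len)
{
    assert(sv != NULL && len != NULL);
    *len = sv->cur;
    return sv->pv;
}
```

A call then reads C<TOY_PV(&s, n)> with a plain C<size_t n>, mirroring how
C<SvPV(sv, len)> is written in XS code.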

=item SvPVbyte
X<SvPVbyte>

Like C<SvPV>, but converts C<sv> to byte representation first if necessary.

	char*	SvPVbyte(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvPVbyte_force
X<SvPVbyte_force>

Like C<SvPV_force>, but converts C<sv> to byte representation first if necessary.

	char*	SvPVbyte_force(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvPVbyte_nolen
X<SvPVbyte_nolen>

Like C<SvPV_nolen>, but converts C<sv> to byte representation first if necessary.

	char*	SvPVbyte_nolen(SV* sv)

=for hackers
Found in file sv.h

=item SvPVbytex
X<SvPVbytex>

Like C<SvPV>, but converts C<sv> to byte representation first if necessary.
Guarantees to evaluate C<sv> only once; use the more efficient C<SvPVbyte>
otherwise.

	char*	SvPVbytex(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvPVbytex_force
X<SvPVbytex_force>

Like C<SvPV_force>, but converts C<sv> to byte representation first if necessary.
Guarantees to evaluate C<sv> only once; use the more efficient C<SvPVbyte_force>
otherwise.

	char*	SvPVbytex_force(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvPVCLEAR
X<SvPVCLEAR>

Ensures that C<sv> is an C<SVt_PV> and that its C<SvCUR> is 0, and that it is
properly C<NUL>-terminated.  Equivalent to C<sv_setpvs(sv, "")>, but more
efficient.

	char *	SvPVCLEAR(SV* sv)

=for hackers
Found in file sv.h

=item SvPV_force
X<SvPV_force>

Like C<SvPV> but will force the SV into containing a string (C<SvPOK>), and
only a string (C<SvPOK_only>), by hook or by crook.  You need force if you are
going to update the C<SvPVX> directly.  Processes get magic.

Note that coercing an arbitrary scalar into a plain PV will potentially
strip useful data from it.  For example if the SV was C<SvROK>, then the
referent will have its reference count decremented, and the SV itself may
be converted to an C<SvPOK> scalar with a string buffer containing a value
such as C<"ARRAY(0x1234)">.

	char*	SvPV_force(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvPV_force_nomg
X<SvPV_force_nomg>

Like C<SvPV_force>, but doesn't process get magic.

	char*	SvPV_force_nomg(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvPV_nolen
X<SvPV_nolen>

Like C<SvPV> but doesn't set a length variable.

	char*	SvPV_nolen(SV* sv)

=for hackers
Found in file sv.h

=item SvPV_nomg
X<SvPV_nomg>

Like C<SvPV> but doesn't process magic.

	char*	SvPV_nomg(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvPV_nomg_nolen
X<SvPV_nomg_nolen>

Like C<SvPV_nolen> but doesn't process magic.

	char*	SvPV_nomg_nolen(SV* sv)

=for hackers
Found in file sv.h

=item SvPV_set
X<SvPV_set>

This is probably not what you want to use; you probably wanted
L</sv_usepvn_flags> or L</sv_setpvn> or L</sv_setpvs>.

Set the value of the PV pointer in C<sv> to the Perl allocated
C<NUL>-terminated string C<val>.  See also C<L</SvIV_set>>.

Remember to free the previous PV buffer. There are many things to check.
Beware that the existing pointer may be involved in copy-on-write or other
mischief, so do C<SvOOK_off(sv)> and use C<sv_force_normal> or
C<SvPV_force> (or check the C<SvIsCOW> flag) first to make sure this
modification is safe. Then finally, if it is not a COW, call C<SvPV_free> to
free the previous PV buffer.

	void	SvPV_set(SV* sv, char* val)

=for hackers
Found in file sv.h

=item SvPVutf8
X<SvPVutf8>

Like C<SvPV>, but converts C<sv> to UTF-8 first if necessary.

	char*	SvPVutf8(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvPVutf8x
X<SvPVutf8x>

Like C<SvPV>, but converts C<sv> to UTF-8 first if necessary.
Guarantees to evaluate C<sv> only once; use the more efficient C<SvPVutf8>
otherwise.

	char*	SvPVutf8x(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvPVutf8x_force
X<SvPVutf8x_force>

Like C<SvPV_force>, but converts C<sv> to UTF-8 first if necessary.
Guarantees to evaluate C<sv> only once; use the more efficient C<SvPVutf8_force>
otherwise.

	char*	SvPVutf8x_force(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvPVutf8_force
X<SvPVutf8_force>

Like C<SvPV_force>, but converts C<sv> to UTF-8 first if necessary.

	char*	SvPVutf8_force(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvPVutf8_nolen
X<SvPVutf8_nolen>

Like C<SvPV_nolen>, but converts C<sv> to UTF-8 first if necessary.

	char*	SvPVutf8_nolen(SV* sv)

=for hackers
Found in file sv.h

=item SvPVX
X<SvPVX>

Returns a pointer to the physical string in the SV.  The SV must contain a
string.  Prior to 5.9.3 it is not safe
to execute this macro unless the SV's
type >= C<SVt_PV>.

This is also used to store the name of an autoloaded subroutine in an XS
AUTOLOAD routine.  See L<perlguts/Autoloading with XSUBs>.

	char*	SvPVX(SV* sv)

=for hackers
Found in file sv.h

=item SvPVx
X<SvPVx>

A version of C<SvPV> which guarantees to evaluate C<sv> only once.
Only use this if C<sv> is an expression with side effects, otherwise use the
more efficient C<SvPV>.

	char*	SvPVx(SV* sv, STRLEN len)

=for hackers
Found in file sv.h

=item SvREADONLY
X<SvREADONLY>

Returns true if the argument is readonly, otherwise returns false.
Exposed to perl code via Internals::SvREADONLY().

	U32	SvREADONLY(SV* sv)

=for hackers
Found in file sv.h

=item SvREADONLY_off
X<SvREADONLY_off>

Mark an object as not-readonly. Exactly what this means depends on the
object type. Exposed to perl code via Internals::SvREADONLY().

	U32	SvREADONLY_off(SV* sv)

=for hackers
Found in file sv.h

=item SvREADONLY_on
X<SvREADONLY_on>

Mark an object as readonly. Exactly what this means depends on the object
type. Exposed to perl code via Internals::SvREADONLY().

	U32	SvREADONLY_on(SV* sv)

=for hackers
Found in file sv.h

=item SvREFCNT
X<SvREFCNT>

Returns the value of the object's reference count. Exposed
to perl code via Internals::SvREFCNT().

	U32	SvREFCNT(SV* sv)

=for hackers
Found in file sv.h

=item SvREFCNT_dec
X<SvREFCNT_dec>

Decrements the reference count of the given SV.  C<sv> may be C<NULL>.

	void	SvREFCNT_dec(SV* sv)

=for hackers
Found in file sv.h

=item SvREFCNT_dec_NN
X<SvREFCNT_dec_NN>

Same as C<SvREFCNT_dec>, but can only be used if you know C<sv>
is not C<NULL>.  Since we don't have to check the NULLness, it's faster
and smaller.

	void	SvREFCNT_dec_NN(SV* sv)

=for hackers
Found in file sv.h

=item SvREFCNT_inc
X<SvREFCNT_inc>

Increments the reference count of the given SV, returning the SV.

All of the following C<SvREFCNT_inc>* macros are optimized versions of
C<SvREFCNT_inc>, and can be replaced with C<SvREFCNT_inc>.

	SV*	SvREFCNT_inc(SV* sv)

=for hackers
Found in file sv.h

=item SvREFCNT_inc_NN
X<SvREFCNT_inc_NN>

Same as C<SvREFCNT_inc>, but can only be used if you know C<sv>
is not C<NULL>.  Since we don't have to check the NULLness, it's faster
and smaller.

	SV*	SvREFCNT_inc_NN(SV* sv)

=for hackers
Found in file sv.h

=item SvREFCNT_inc_simple
X<SvREFCNT_inc_simple>

Same as C<SvREFCNT_inc>, but can only be used with expressions without side
effects.  Since we don't have to store a temporary value, it's faster.
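
The restriction to side-effect-free expressions follows from how a macro can
repeat its argument.  The following is a minimal standalone sketch, not Perl's
definition: C<ToySV>, C<TOY_REFCNT_INC_SIMPLE>, C<toy_refcnt_inc> and
C<toy_fetch> are hypothetical names invented for illustration.  It counts how
often an argument expression is evaluated by a macro form versus a function
form.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an SV; only the reference count matters here. */
typedef struct { unsigned refcnt; } ToySV;

/* Macro form (hypothetical): no temporary is stored, so the argument
 * text appears, and is evaluated, three times. */
#define TOY_REFCNT_INC_SIMPLE(sv) \
    ((sv) ? (++(sv)->refcnt, (sv)) : NULL)

/* Function form (hypothetical): the argument is evaluated once, at the
 * call site, and held in the parameter. */
static ToySV *toy_refcnt_inc(ToySV *sv)
{
    if (sv)
        ++sv->refcnt;
    return sv;
}

static ToySV toy_sv = { 1 };
static int   toy_fetch_calls = 0;

/* An expression with a side effect: every call is counted. */
static ToySV *toy_fetch(void)
{
    ++toy_fetch_calls;
    return &toy_sv;
}

/* How many times does the macro evaluate its argument? */
static int toy_calls_via_macro(void)
{
    toy_fetch_calls = 0;
    (void)TOY_REFCNT_INC_SIMPLE(toy_fetch());
    return toy_fetch_calls;
}

/* How many times does the function evaluate its argument? */
static int toy_calls_via_function(void)
{
    toy_fetch_calls = 0;
    (void)toy_refcnt_inc(toy_fetch());
    assert(toy_fetch_calls >= 1);   /* sanity check only */
    return toy_fetch_calls;
}
```

In this sketch the macro form calls C<toy_fetch> three times and the function
form calls it once, which is why a "simple" variant is only appropriate when
the argument is a plain variable or another expression without side effects.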

	SV*	SvREFCNT_inc_simple(SV* sv)

=for hackers
Found in file sv.h

=item SvREFCNT_inc_simple_NN
X<SvREFCNT_inc_simple_NN>

Same as C<SvREFCNT_inc_simple>, but can only be used if you know C<sv>
is not C<NULL>.  Since we don't have to check the NULLness, it's faster
and smaller.

	SV*	SvREFCNT_inc_simple_NN(SV* sv)

=for hackers
Found in file sv.h

=item SvREFCNT_inc_simple_void
X<SvREFCNT_inc_simple_void>

Same as C<SvREFCNT_inc_simple>, but can only be used if you don't need the
return value.  The macro doesn't need to return a meaningful value.

	void	SvREFCNT_inc_simple_void(SV* sv)

=for hackers
Found in file sv.h

=item SvREFCNT_inc_simple_void_NN
X<SvREFCNT_inc_simple_void_NN>

Same as C<SvREFCNT_inc>, but can only be used if you don't need the return
value, and you know that C<sv> is not C<NULL>.  The macro doesn't need
to return a meaningful value, or check for NULLness, so it's smaller
and faster.

	void	SvREFCNT_inc_simple_void_NN(SV* sv)

=for hackers
Found in file sv.h

=item SvREFCNT_inc_void
X<SvREFCNT_inc_void>

Same as C<SvREFCNT_inc>, but can only be used if you don't need the
return value.  The macro doesn't need to return a meaningful value.

	void	SvREFCNT_inc_void(SV* sv)

=for hackers
Found in file sv.h

=item SvREFCNT_inc_void_NN
X<SvREFCNT_inc_void_NN>

Same as C<SvREFCNT_inc>, but can only be used if you don't need the return
value, and you know that C<sv> is not C<NULL>.  The macro doesn't need
to return a meaningful value, or check for NULLness, so it's smaller
and faster.

	void	SvREFCNT_inc_void_NN(SV* sv)

=for hackers
Found in file sv.h

=item sv_report_used
X<sv_report_used>

Dump the contents of all SVs not yet freed (debugging aid).

	void	sv_report_used()

=for hackers
Found in file sv.c

=item SvROK
X<SvROK>

Tests if the SV is an RV.

	U32	SvROK(SV* sv)

=for hackers
Found in file sv.h

=item SvROK_off
X<SvROK_off>

Unsets the RV status of an SV.

	void	SvROK_off(SV* sv)

=for hackers
Found in file sv.h

=item SvROK_on
X<SvROK_on>

Tells an SV that it is an RV.

	void	SvROK_on(SV* sv)

=for hackers
Found in file sv.h

=item SvRV
X<SvRV>

Dereferences an RV to return the SV.

	SV*	SvRV(SV* sv)

=for hackers
Found in file sv.h

=item SvRV_set
X<SvRV_set>

Set the value of the RV pointer in C<sv> to val.  See C<L</SvIV_set>>.

	void	SvRV_set(SV* sv, SV* val)

=for hackers
Found in file sv.h

=item sv_setsv_nomg
X<sv_setsv_nomg>

Like C<sv_setsv> but doesn't process magic.

	void	sv_setsv_nomg(SV* dsv, SV* ssv)

=for hackers
Found in file sv.h

=item SvSTASH
X<SvSTASH>

Returns the stash of the SV.

	HV*	SvSTASH(SV* sv)

=for hackers
Found in file sv.h

=item SvSTASH_set
X<SvSTASH_set>

Set the value of the STASH pointer in C<sv> to val.  See C<L</SvIV_set>>.

	void	SvSTASH_set(SV* sv, HV* val)

=for hackers
Found in file sv.h

=item SvTAINT
X<SvTAINT>

Taints an SV if tainting is enabled, and if some input to the current
expression is tainted--usually a variable, but possibly also implicit
inputs such as locale settings.  C<SvTAINT> propagates that taintedness to
the outputs of an expression in a pessimistic fashion; i.e., without paying
attention to precisely which outputs are influenced by which inputs.

	void	SvTAINT(SV* sv)

=for hackers
Found in file sv.h

=item SvTAINTED
X<SvTAINTED>

Checks to see if an SV is tainted.  Returns TRUE if it is, FALSE if
not.

	bool	SvTAINTED(SV* sv)

=for hackers
Found in file sv.h

=item SvTAINTED_off
X<SvTAINTED_off>

Untaints an SV.  Be I<very> careful with this routine, as it short-circuits
some of Perl's fundamental security features.  XS module authors should not
use this function unless they fully understand all the implications of
unconditionally untainting the value.  Untainting should be done in the
standard perl fashion, via a carefully crafted regexp, rather than directly
untainting variables.

	void	SvTAINTED_off(SV* sv)

=for hackers
Found in file sv.h

=item SvTAINTED_on
X<SvTAINTED_on>

Marks an SV as tainted if tainting is enabled.

	void	SvTAINTED_on(SV* sv)

=for hackers
Found in file sv.h

=item SvTRUE
X<SvTRUE>

Returns a boolean indicating whether Perl would evaluate the SV as true or
false.  See C<L</SvOK>> for a defined/undefined test.  Handles 'get' magic
unless the scalar is already C<SvPOK>, C<SvIOK> or C<SvNOK> (the public, not the
private flags).

	bool	SvTRUE(SV* sv)

=for hackers
Found in file sv.h

=item SvTRUE_nomg
X<SvTRUE_nomg>

Returns a boolean indicating whether Perl would evaluate the SV as true or
false.  See C<L</SvOK>> for a defined/undefined test.  Does not handle 'get' magic.

	bool	SvTRUE_nomg(SV* sv)

=for hackers
Found in file sv.h

=item SvTYPE
X<SvTYPE>

Returns the type of the SV.  See C<L</svtype>>.

	svtype	SvTYPE(SV* sv)

=for hackers
Found in file sv.h

=item SvUOK
X<SvUOK>

Returns a boolean indicating whether the SV contains an integer that must be
interpreted as unsigned.  A non-negative integer whose value is within the
range of both an IV and a UV may be flagged as either C<SvUOK> or C<SvIOK>.

	bool	SvUOK(SV* sv)

=for hackers
Found in file sv.h

=item SvUPGRADE
X<SvUPGRADE>

Used to upgrade an SV to a more complex form.  Uses C<sv_upgrade> to
perform the upgrade if necessary.  See C<L</svtype>>.

	void	SvUPGRADE(SV* sv, svtype type)

=for hackers
Found in file sv.h

=item SvUTF8
X<SvUTF8>

Returns a U32 value indicating the UTF-8 status of an SV.  If things are set up
properly, this indicates whether or not the SV contains UTF-8 encoded data.
You should use this I<after> a call to C<SvPV()> or one of its variants, in
case any call to string overloading updates the internal flag.

If you want to take into account the L<bytes> pragma, use C<L</DO_UTF8>>
instead.

	U32	SvUTF8(SV* sv)

=for hackers
Found in file sv.h

=item sv_utf8_upgrade_nomg
X<sv_utf8_upgrade_nomg>

Like C<sv_utf8_upgrade>, but doesn't do magic on C<sv>.

	STRLEN	sv_utf8_upgrade_nomg(NN SV *sv)

=for hackers
Found in file sv.h

=item SvUTF8_off
X<SvUTF8_off>

Unsets the UTF-8 status of an SV (the data is not changed, just the flag).
Do not use frivolously.

	void	SvUTF8_off(SV *sv)

=for hackers
Found in file sv.h

=item SvUTF8_on
X<SvUTF8_on>

Turn on the UTF-8 status of an SV (the data is not changed, just the flag).
Do not use frivolously.

	void	SvUTF8_on(SV *sv)

=for hackers
Found in file sv.h

=item SvUV
X<SvUV>

Coerces the given SV to UV and returns it.  The returned value in many
circumstances will get stored in C<sv>'s UV slot, but not in all cases.  (Use
C<L</sv_setuv>> to make sure it does).

See C<L</SvUVx>> for a version which guarantees to evaluate C<sv> only once.

	UV	SvUV(SV* sv)

=for hackers
Found in file sv.h

=item SvUV_nomg
X<SvUV_nomg>

Like C<SvUV> but doesn't process magic.

	UV	SvUV_nomg(SV* sv)

=for hackers
Found in file sv.h

=item SvUV_set
X<SvUV_set>

Set the value of the UV pointer in C<sv> to val.  See C<L</SvIV_set>>.

	void	SvUV_set(SV* sv, UV val)

=for hackers
Found in file sv.h

=item SvUVX
X<SvUVX>

Returns the raw value in the SV's UV slot, without checks or conversions.
Only use when you are sure C<SvIOK> is true.  See also C<L</SvUV>>.

	UV	SvUVX(SV* sv)

=for hackers
Found in file sv.h

=item SvUVx
X<SvUVx>

Coerces the given SV to UV and returns it.  The returned value in many
circumstances will get stored in C<sv>'s UV slot, but not in all cases.  (Use
C<L</sv_setuv>> to make sure it does).

This form guarantees to evaluate C<sv> only once.  Only use this if C<sv> is an
expression with side effects, otherwise use the more efficient C<SvUV>.

	UV	SvUVx(SV* sv)

=for hackers
Found in file sv.h

=item SvVOK
X<SvVOK>

Returns a boolean indicating whether the SV contains a v-string.

	bool	SvVOK(SV* sv)

=for hackers
Found in file sv.h


=back

=head1 Unicode Support

L<perlguts/Unicode Support> has an introduction to this API.

See also L</Character classification>,
and L</Character case changing>.
Various functions outside this section also work specially with Unicode.
Search for the string "utf8" in this document.


=over 8

=item BOM_UTF8
X<BOM_UTF8>

This is a macro that evaluates to a string constant of the UTF-8 bytes that
define the Unicode BYTE ORDER MARK (U+FEFF) for the platform that perl
is compiled on.  This allows code to use a mnemonic for this character that
works on both ASCII and EBCDIC platforms.
S<C<sizeof(BOM_UTF8) - 1>> can be used to get its length in
bytes.
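
As an illustration of the C<sizeof(...) - 1> idiom, here is a minimal
standalone sketch that skips a leading byte order mark.  It does not include
Perl's headers: C<TOY_BOM_UTF8> and C<toy_skip_bom> are hypothetical names, and
the byte values assume an ASCII platform, where U+FEFF is encoded in UTF-8 as
C<EF BB BF>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: the UTF-8 encoding of U+FEFF on an ASCII platform. */
#define TOY_BOM_UTF8 "\xEF\xBB\xBF"

/* Return a pointer just past a leading BOM, or buf itself if the
 * buffer does not start with one. */
static const char *toy_skip_bom(const char *buf, size_t len)
{
    /* sizeof counts the trailing NUL of the literal, hence the "- 1". */
    const size_t bom_len = sizeof(TOY_BOM_UTF8) - 1;

    assert(buf != NULL);
    if (len >= bom_len && memcmp(buf, TOY_BOM_UTF8, bom_len) == 0)
        return buf + bom_len;
    return buf;
}
```

Because the macro is a string constant, the length is computed at compile time
and no call to C<strlen> is needed.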

=for hackers
Found in file unicode_constants.h

=item bytes_cmp_utf8
X<bytes_cmp_utf8>

Compares the sequence of characters (stored as octets) in C<b>, C<blen> with the
sequence of characters (stored as UTF-8)
in C<u>, C<ulen>.  Returns 0 if they are
equal, -1 or -2 if the first string is less than the second string, +1 or +2
if the first string is greater than the second string.

-1 or +1 is returned if the shorter string was identical to the start of the
longer string.  -2 or +2 is returned if
there was a difference between characters
within the strings.
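
The return convention can be sketched in standalone C.  This is a simplified
model and not the code in F<utf8.c>: C<toy_bytes_cmp_utf8> is a hypothetical
name, an ASCII platform is assumed, and malformed or non-downgradeable UTF-8 is
simply treated as making the first string the lesser one.

```c
#include <assert.h>
#include <stddef.h>

/* Compare native (Latin-1) bytes in b against UTF-8 in u.
 * Returns 0, -1/+1 (one string is a prefix of the other) or
 * -2/+2 (a character differs). */
static int toy_bytes_cmp_utf8(const unsigned char *b, size_t blen,
                              const unsigned char *u, size_t ulen)
{
    const unsigned char *bend = b + blen;
    const unsigned char *uend = u + ulen;

    assert(b != NULL && u != NULL);
    while (b < bend && u < uend) {
        unsigned c = *u++;

        if (c >= 0x80) {
            /* Only lead bytes C2 and C3 encode code points below
             * 0x100; anything else cannot equal a single byte. */
            if ((c != 0xC2 && c != 0xC3)
                || u >= uend || (*u & 0xC0) != 0x80)
                return -2;
            c = ((c & 0x03u) << 6) | (*u++ & 0x3Fu);
        }
        if (*b != c)
            return *b < c ? -2 : 2;
        ++b;
    }
    if (b == bend && u == uend)
        return 0;
    /* One side ran out first: the shorter string is a prefix. */
    return b < bend ? 1 : -1;
}
```

Callers that only need equality can test the result against 0; the magnitude
distinguishes a prefix relationship from a genuine character difference.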

	int	bytes_cmp_utf8(const U8 *b, STRLEN blen,
		               const U8 *u, STRLEN ulen)

=for hackers
Found in file utf8.c

=item bytes_from_utf8
X<bytes_from_utf8>


NOTE: this function is experimental and may change or be
removed without notice.


Converts a string C<s> of length C<len> from UTF-8 into native byte encoding.
Unlike L</utf8_to_bytes> but like L</bytes_to_utf8>, returns a pointer to
the newly-created string, and updates C<len> to contain the new
length.  Returns the original string if no conversion occurs, in which case
C<len> is unchanged.  Does nothing if C<is_utf8> points to 0.  Sets C<is_utf8> to
0 if C<s> is converted or consisted entirely of characters that are invariant
in UTF-8 (i.e., US-ASCII on non-EBCDIC machines).

	U8*	bytes_from_utf8(const U8 *s, STRLEN *len,
		                bool *is_utf8)

=for hackers
Found in file utf8.c

=item bytes_to_utf8
X<bytes_to_utf8>


NOTE: this function is experimental and may change or be
removed without notice.


Converts a string C<s> of length C<len> bytes from the native encoding into
UTF-8.
Returns a pointer to the newly-created string, and sets C<len> to
reflect the new length in bytes.

A C<NUL> character will be written after the end of the string.

If you want to convert to UTF-8 from encodings other than
the native (Latin1 or EBCDIC),
see L</sv_recode_to_utf8>().
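
The conversion itself is mechanical on an ASCII platform, as the following
standalone sketch suggests.  It is not Perl's implementation:
C<toy_latin1_to_utf8> is a hypothetical name, the input is assumed to be
Latin-1, and the buffer comes from C<malloc> here rather than from Perl's own
allocator, so the caller releases it with C<free>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Convert *len Latin-1 bytes at s into a new NUL-terminated UTF-8
 * buffer.  On success *len is updated to the new length in bytes.
 * Returns NULL (leaving *len untouched) if allocation fails. */
static unsigned char *toy_latin1_to_utf8(const unsigned char *s,
                                         size_t *len)
{
    size_t i, out = 0;
    unsigned char *d;

    assert(s != NULL && len != NULL);
    /* Worst case: every byte expands to two, plus the trailing NUL. */
    d = (unsigned char *)malloc(*len * 2 + 1);
    if (d == NULL)
        return NULL;

    for (i = 0; i < *len; i++) {
        if (s[i] < 0x80) {
            d[out++] = s[i];                 /* invariant: copy as is */
        } else {
            d[out++] = (unsigned char)(0xC0 | (s[i] >> 6));
            d[out++] = (unsigned char)(0x80 | (s[i] & 0x3F));
        }
    }
    d[out] = '\0';
    *len = out;
    return d;
}
```

Note that the length is both an input and an output, which mirrors the
C<STRLEN *len> parameter of the documented function.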

	U8*	bytes_to_utf8(const U8 *s, STRLEN *len)

=for hackers
Found in file utf8.c

=item DO_UTF8
X<DO_UTF8>

Returns a bool giving whether or not the PV in C<sv> is to be treated as being
encoded in UTF-8.

You should use this I<after> a call to C<SvPV()> or one of its variants, in
case any call to string overloading updates the internal UTF-8 encoding flag.

	bool	DO_UTF8(SV* sv)

=for hackers
Found in file utf8.h

=item foldEQ_utf8
X<foldEQ_utf8>

Returns true if the leading portions of the strings C<s1> and C<s2> (either or both
of which may be in UTF-8) are the same case-insensitively; false otherwise.
How far into the strings to compare is determined by other input parameters.

If C<u1> is true, the string C<s1> is assumed to be in UTF-8-encoded Unicode;
otherwise it is assumed to be in native 8-bit encoding.  Correspondingly for C<u2>
with respect to C<s2>.

If the byte length C<l1> is non-zero, it says how far into C<s1> to check for fold
equality.  In other words, C<s1>+C<l1> will be used as a goal to reach.  The
scan will not be considered to be a match unless the goal is reached, and
scanning won't continue past that goal.  Correspondingly for C<l2> with respect to
C<s2>.

If C<pe1> is non-C<NULL> and the pointer it points to is not C<NULL>, that pointer is
considered an end pointer to the position 1 byte past the maximum point
in C<s1> beyond which scanning will not continue under any circumstances.
(This routine assumes that UTF-8 encoded input strings are not malformed;
malformed input can cause it to read past C<pe1>).
This means that if both C<l1> and C<pe1> are specified, and C<pe1>
is less than C<s1>+C<l1>, the match will never be successful because it can
never
get as far as its goal (and in fact is asserted against).  Correspondingly for
C<pe2> with respect to C<s2>.

At least one of C<s1> and C<s2> must have a goal (at least one of C<l1> and
C<l2> must be non-zero), and if both do, both have to be
reached for a successful match.   Also, if the fold of a character is multiple
characters, all of them must be matched (see tr21 reference below for
'folding').

Upon a successful match, if C<pe1> is non-C<NULL>,
it will be set to point to the beginning of the I<next> character of C<s1>
beyond what was matched.  Correspondingly for C<pe2> and C<s2>.

For case-insensitiveness, the "casefolding" of Unicode is used
instead of upper/lowercasing both the characters, see
L<http://www.unicode.org/unicode/reports/tr21/> (Case Mappings).

	I32	foldEQ_utf8(const char *s1, char **pe1, UV l1,
		            bool u1, const char *s2, char **pe2,
		            UV l2, bool u2)

=for hackers
Found in file utf8.c

=item is_ascii_string
X<is_ascii_string>

This is a misleadingly-named synonym for L</is_utf8_invariant_string>.
On ASCII-ish platforms, the name isn't misleading: the ASCII-range characters
are exactly the UTF-8 invariants.  But EBCDIC machines have more invariants
than just the ASCII characters, so C<is_utf8_invariant_string> is preferred.

	bool	is_ascii_string(const U8* const s,
		                const STRLEN len)

=for hackers
Found in file utf8.h

=item is_c9strict_utf8_string
X<is_c9strict_utf8_string>

Returns TRUE if the first C<len> bytes of string C<s> form a valid
UTF-8-encoded string that conforms to
L<Unicode Corrigendum #9|http://www.unicode.org/versions/corrigendum9.html>;
otherwise it returns FALSE.  If C<len> is 0, it will be calculated using
C<strlen(s)> (which means if you use this option, that C<s> can't have embedded
C<NUL> characters and has to have a terminating C<NUL> byte).  Note that all
characters being ASCII constitute 'a valid UTF-8 string'.

This function returns FALSE for strings containing any code points above the
Unicode max of 0x10FFFF or surrogate code points, but accepts non-character
code points per
L<Corrigendum #9|http://www.unicode.org/versions/corrigendum9.html>.

See also
C<L</is_utf8_invariant_string>>,
C<L</is_utf8_string>>,
C<L</is_utf8_string_flags>>,
C<L</is_utf8_string_loc>>,
C<L</is_utf8_string_loc_flags>>,
C<L</is_utf8_string_loclen>>,
C<L</is_utf8_string_loclen_flags>>,
C<L</is_utf8_fixed_width_buf_flags>>,
C<L</is_utf8_fixed_width_buf_loc_flags>>,
C<L</is_utf8_fixed_width_buf_loclen_flags>>,
C<L</is_strict_utf8_string>>,
C<L</is_strict_utf8_string_loc>>,
C<L</is_strict_utf8_string_loclen>>,
C<L</is_c9strict_utf8_string_loc>>,
and
C<L</is_c9strict_utf8_string_loclen>>.

	bool	is_c9strict_utf8_string(const U8 *s,
		                        const STRLEN len)

=for hackers
Found in file inline.h

=item is_c9strict_utf8_string_loc
X<is_c9strict_utf8_string_loc>

Like C<L</is_c9strict_utf8_string>> but stores the location of the failure (in
the case of "utf8ness failure") or the location C<s>+C<len> (in the case of
"utf8ness success") in the C<ep> pointer.

See also C<L</is_c9strict_utf8_string_loclen>>.

	bool	is_c9strict_utf8_string_loc(const U8 *s,
		                            const STRLEN len,
		                            const U8 **ep)

=for hackers
Found in file inline.h

=item is_c9strict_utf8_string_loclen
X<is_c9strict_utf8_string_loclen>

Like C<L</is_c9strict_utf8_string>> but stores the location of the failure (in
the case of "utf8ness failure") or the location C<s>+C<len> (in the case of
"utf8ness success") in the C<ep> pointer, and the number of UTF-8 encoded
characters in the C<el> pointer.
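
The acceptance rule described here (no surrogates, nothing above 0x10FFFF,
non-characters allowed) coincides with the Unicode Standard's definition of
well-formed UTF-8, so it can be modelled in a few lines of standalone C.  This
is only a sketch under that assumption, for ASCII platforms:
C<toy_c9_char_len> and C<toy_c9strict_loclen> are hypothetical names, not
Perl functions, and the special meaning of a zero C<len> is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Length in bytes of the well-formed character starting at s and
 * ending before e, or 0 if there is none. */
static size_t toy_c9_char_len(const unsigned char *s,
                              const unsigned char *e)
{
    size_t avail = (size_t)(e - s);
    unsigned char b0, lo = 0x80, hi = 0xBF;

    if (avail == 0)
        return 0;
    b0 = s[0];
    if (b0 < 0x80)
        return 1;
    if (b0 >= 0xC2 && b0 <= 0xDF)
        return (avail >= 2 && (s[1] & 0xC0) == 0x80) ? 2 : 0;
    if (b0 >= 0xE0 && b0 <= 0xEF) {
        if (b0 == 0xE0) lo = 0xA0;      /* forbid overlongs   */
        if (b0 == 0xED) hi = 0x9F;      /* forbid surrogates  */
        if (avail < 3 || s[1] < lo || s[1] > hi
            || (s[2] & 0xC0) != 0x80)
            return 0;
        return 3;
    }
    if (b0 >= 0xF0 && b0 <= 0xF4) {
        if (b0 == 0xF0) lo = 0x90;      /* forbid overlongs   */
        if (b0 == 0xF4) hi = 0x8F;      /* forbid > 0x10FFFF  */
        if (avail < 4 || s[1] < lo || s[1] > hi
            || (s[2] & 0xC0) != 0x80 || (s[3] & 0xC0) != 0x80)
            return 0;
        return 4;
    }
    return 0;
}

/* Validate len bytes at s.  *ep receives the failure position, or
 * s + len on success; *el receives the count of characters accepted
 * before that position. */
static bool toy_c9strict_loclen(const unsigned char *s, size_t len,
                                const unsigned char **ep, size_t *el)
{
    const unsigned char *p = s, *end = s + len;
    size_t chars = 0;
    bool ok = true;

    assert(s != NULL);
    while (p < end) {
        size_t n = toy_c9_char_len(p, end);
        if (n == 0) { ok = false; break; }
        p += n;
        chars++;
    }
    if (ep) *ep = p;
    if (el) *el = chars;
    return ok;
}
```

Reporting the failure position lets a caller salvage the valid prefix or point
at the offending byte in a diagnostic.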

See also C<L</is_c9strict_utf8_string_loc>>.

	bool	is_c9strict_utf8_string_loclen(
		    const U8 *s, const STRLEN len,
		    const U8 **ep, STRLEN *el
		)

=for hackers
Found in file inline.h

=item isC9_STRICT_UTF8_CHAR
X<isC9_STRICT_UTF8_CHAR>

Evaluates to non-zero if the first few bytes of the string starting at C<s> and
looking no further than S<C<e - 1>> are well-formed UTF-8 that represents some
Unicode non-surrogate code point; otherwise it evaluates to 0.  If non-zero,
the value gives how many bytes starting at C<s> comprise the code point's
representation.  Any bytes remaining before C<e>, but beyond the ones needed to
form the first code point in C<s>, are not examined.

The largest acceptable code point is the Unicode maximum 0x10FFFF.  This
differs from C<L</isSTRICT_UTF8_CHAR>> only in that it accepts non-character
code points.  This corresponds to
L<Unicode Corrigendum #9|http://www.unicode.org/versions/corrigendum9.html>,
which said that non-character code points are merely discouraged rather than
completely forbidden in open interchange.  See
L<perlunicode/Noncharacter code points>.

Use C<L</isUTF8_CHAR>> to check for Perl's extended UTF-8; and
C<L</isUTF8_CHAR_flags>> for a more customized definition.

Use C<L</is_c9strict_utf8_string>>, C<L</is_c9strict_utf8_string_loc>>, and
C<L</is_c9strict_utf8_string_loclen>> to check entire strings.

	STRLEN	isC9_STRICT_UTF8_CHAR(const U8 *s, const U8 *e)

=for hackers
Found in file utf8.h

=item is_invariant_string
X<is_invariant_string>

This is a somewhat misleadingly-named synonym for L</is_utf8_invariant_string>.
C<is_utf8_invariant_string> is preferred, as it indicates under what conditions
the string is invariant.

	bool	is_invariant_string(const U8* const s,
		                    const STRLEN len)

=for hackers
Found in file utf8.h

=item isSTRICT_UTF8_CHAR
X<isSTRICT_UTF8_CHAR>

Evaluates to non-zero if the first few bytes of the string starting at C<s> and
looking no further than S<C<e - 1>> are well-formed UTF-8 that represents some
Unicode code point completely acceptable for open interchange between all
applications; otherwise it evaluates to 0.  If non-zero, the value gives how
many bytes starting at C<s> comprise the code point's representation.  Any
bytes remaining before C<e>, but beyond the ones needed to form the first code
point in C<s>, are not examined.

The largest acceptable code point is the Unicode maximum 0x10FFFF, and must not
be a surrogate nor a non-character code point.  Thus this excludes any code
point from Perl's extended UTF-8.

This is used to efficiently decide if the next few bytes in C<s> are
legal Unicode-acceptable UTF-8 for a single character.

Use C<L</isC9_STRICT_UTF8_CHAR>> to use the L<Unicode Corrigendum
#9|http://www.unicode.org/versions/corrigendum9.html> definition of allowable
code points; C<L</isUTF8_CHAR>> to check for Perl's extended UTF-8;
and C<L</isUTF8_CHAR_flags>> for a more customized definition.

Use C<L</is_strict_utf8_string>>, C<L</is_strict_utf8_string_loc>>, and
C<L</is_strict_utf8_string_loclen>> to check entire strings.

	STRLEN	isSTRICT_UTF8_CHAR(const U8 *s, const U8 *e)

=for hackers
Found in file utf8.h

=item is_strict_utf8_string
X<is_strict_utf8_string>

Returns TRUE if the first C<len> bytes of string C<s> form a valid
UTF-8-encoded string that is fully interchangeable by any application using
Unicode rules; otherwise it returns FALSE.  If C<len> is 0, it will be
calculated using C<strlen(s)> (which means if you use this option, that C<s>
can't have embedded C<NUL> characters and has to have a terminating C<NUL>
byte).  Note that all characters being ASCII constitute 'a valid UTF-8 string'.

This function returns FALSE for strings containing any
code points above the Unicode max of 0x10FFFF, surrogate code points, or
non-character code points.

See also
C<L</is_utf8_invariant_string>>,
C<L</is_utf8_string>>,
C<L</is_utf8_string_flags>>,
C<L</is_utf8_string_loc>>,
C<L</is_utf8_string_loc_flags>>,
C<L</is_utf8_string_loclen>>,
C<L</is_utf8_string_loclen_flags>>,
C<L</is_utf8_fixed_width_buf_flags>>,
C<L</is_utf8_fixed_width_buf_loc_flags>>,
C<L</is_utf8_fixed_width_buf_loclen_flags>>,
C<L</is_strict_utf8_string_loc>>,
C<L</is_strict_utf8_string_loclen>>,
C<L</is_c9strict_utf8_string>>,
C<L</is_c9strict_utf8_string_loc>>,
and
C<L</is_c9strict_utf8_string_loclen>>.

	bool	is_strict_utf8_string(const U8 *s,
		                      const STRLEN len)

=for hackers
Found in file inline.h

=item is_strict_utf8_string_loc
X<is_strict_utf8_string_loc>

Like C<L</is_strict_utf8_string>> but stores the location of the failure (in the
case of "utf8ness failure") or the location C<s>+C<len> (in the case of
"utf8ness success") in the C<ep> pointer.

See also C<L</is_strict_utf8_string_loclen>>.

	bool	is_strict_utf8_string_loc(const U8 *s,
		                          const STRLEN len,
		                          const U8 **ep)

=for hackers
Found in file inline.h

=item is_strict_utf8_string_loclen
X<is_strict_utf8_string_loclen>

Like C<L</is_strict_utf8_string>> but stores the location of the failure (in the
case of "utf8ness failure") or the location C<s>+C<len> (in the case of
"utf8ness success") in the C<ep> pointer, and the number of UTF-8
encoded characters in the C<el> pointer.

See also C<L</is_strict_utf8_string_loc>>.

	bool	is_strict_utf8_string_loclen(const U8 *s,
		                             const STRLEN len,
		                             const U8 **ep,
		                             STRLEN *el)

=for hackers
Found in file inline.h

=item is_utf8_fixed_width_buf_flags
X<is_utf8_fixed_width_buf_flags>

Returns TRUE if the fixed-width buffer starting at C<s> with length C<len>
is entirely valid UTF-8, subject to the restrictions given by C<flags>;
otherwise it returns FALSE.

If C<flags> is 0, any well-formed UTF-8, as extended by Perl, is accepted
without restriction.  If the final few bytes of the buffer do not form a
complete code point, this will return TRUE anyway, provided that
C<L</is_utf8_valid_partial_char_flags>> returns TRUE for them.

If C<flags> is non-zero, it can be any combination of the
C<UTF8_DISALLOW_I<foo>> flags accepted by C<L</utf8n_to_uvchr>>, and with the
same meanings.

This function differs from C<L</is_utf8_string_flags>> only in that the latter
returns FALSE if the final few bytes of the string don't form a complete code
point.

	bool	is_utf8_fixed_width_buf_flags(
		    const U8 * const s, const STRLEN len,
		    const U32 flags
		)

=for hackers
Found in file inline.h

=item is_utf8_fixed_width_buf_loclen_flags
X<is_utf8_fixed_width_buf_loclen_flags>

Like C<L</is_utf8_fixed_width_buf_loc_flags>> but stores the number of
complete, valid characters found in the C<el> pointer.

	bool	is_utf8_fixed_width_buf_loclen_flags(
		    const U8 * const s, const STRLEN len,
		    const U8 **ep, STRLEN *el, const U32 flags
		)

=for hackers
Found in file inline.h

=item is_utf8_fixed_width_buf_loc_flags
X<is_utf8_fixed_width_buf_loc_flags>

Like C<L</is_utf8_fixed_width_buf_flags>> but stores the location of the
failure in the C<ep> pointer.  If the function returns TRUE, C<*ep> will point
to the beginning of any partial character at the end of the buffer; if there is
no partial character C<*ep> will contain C<s>+C<len>.

See also C<L</is_utf8_fixed_width_buf_loclen_flags>>.

	bool	is_utf8_fixed_width_buf_loc_flags(
		    const U8 * const s, const STRLEN len,
		    const U8 **ep, const U32 flags
		)

=for hackers
Found in file inline.h

=item is_utf8_invariant_string
X<is_utf8_invariant_string>

Returns TRUE if the first C<len> bytes of the string C<s> are the same
regardless of the UTF-8 encoding of the string (or UTF-EBCDIC encoding on
EBCDIC machines); otherwise it returns FALSE.  That is, it returns TRUE if they
are UTF-8 invariant.  On ASCII-ish machines, all the ASCII characters and only
the ASCII characters fit this definition.  On EBCDIC machines, the ASCII-range
characters are invariant, but so also are the C1 controls.

If C<len> is 0, it will be calculated using C<strlen(s)>, (which means if you
use this option, that C<s> can't have embedded C<NUL> characters and has to
have a terminating C<NUL> byte).
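
On an ASCII platform the test amounts to checking that no byte has its high bit
set.  The standalone sketch below assumes exactly that; C<toy_is_invariant> is
a hypothetical name, and the EBCDIC case (where the C1 controls are also
invariant) is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if every one of the len bytes at s is below 0x80, i.e. is
 * represented identically whether or not the string is UTF-8.
 * A len of 0 means "use strlen(s)", as in the documented function. */
static bool toy_is_invariant(const unsigned char *s, size_t len)
{
    size_t i;

    assert(s != NULL);
    if (len == 0)
        len = strlen((const char *)s);
    for (i = 0; i < len; i++) {
        if (s[i] >= 0x80)
            return false;
    }
    return true;
}
```

A string that passes this test can be flagged as UTF-8 or not without changing
a single byte, which is what makes the check useful before an upgrade or
downgrade.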

See also
C<L</is_utf8_string>>,
C<L</is_utf8_string_flags>>,
C<L</is_utf8_string_loc>>,
C<L</is_utf8_string_loc_flags>>,
C<L</is_utf8_string_loclen>>,
C<L</is_utf8_string_loclen_flags>>,
C<L</is_utf8_fixed_width_buf_flags>>,
C<L</is_utf8_fixed_width_buf_loc_flags>>,
C<L</is_utf8_fixed_width_buf_loclen_flags>>,
C<L</is_strict_utf8_string>>,
C<L</is_strict_utf8_string_loc>>,
C<L</is_strict_utf8_string_loclen>>,
C<L</is_c9strict_utf8_string>>,
C<L</is_c9strict_utf8_string_loc>>,
and
C<L</is_c9strict_utf8_string_loclen>>.

	bool	is_utf8_invariant_string(const U8* const s,
		                         STRLEN const len)

=for hackers
Found in file inline.h

=item is_utf8_string
X<is_utf8_string>

Returns TRUE if the first C<len> bytes of string C<s> form a valid
Perl-extended-UTF-8 string; returns FALSE otherwise.  If C<len> is 0, it will
be calculated using C<strlen(s)> (which means if you use this option, that C<s>
can't have embedded C<NUL> characters and has to have a terminating C<NUL>
byte).  Note that all characters being ASCII constitute 'a valid UTF-8 string'.

This function considers Perl's extended UTF-8 to be valid.  That means that
code points above Unicode, surrogates, and non-character code points are
considered valid by this function.  Use C<L</is_strict_utf8_string>>,
C<L</is_c9strict_utf8_string>>, or C<L</is_utf8_string_flags>> to restrict what
code points are considered valid.

See also
C<L</is_utf8_invariant_string>>,
C<L</is_utf8_string_loc>>,
C<L</is_utf8_string_loclen>>,
C<L</is_utf8_fixed_width_buf_flags>>,
C<L</is_utf8_fixed_width_buf_loc_flags>>,
and
C<L</is_utf8_fixed_width_buf_loclen_flags>>.

	bool	is_utf8_string(const U8 *s, const STRLEN len)

=for hackers
Found in file inline.h

=item is_utf8_string_flags
X<is_utf8_string_flags>

Returns TRUE if the first C<len> bytes of string C<s> form a valid
UTF-8 string, subject to the restrictions imposed by C<flags>;
returns FALSE otherwise.  If C<len> is 0, it will be calculated
using C<strlen(s)> (which means if you use this option, that C<s> can't have
embedded C<NUL> characters and has to have a terminating C<NUL> byte).  Note
that all characters being ASCII constitute 'a valid UTF-8 string'.

If C<flags> is 0, this gives the same results as C<L</is_utf8_string>>; if
C<flags> is C<UTF8_DISALLOW_ILLEGAL_INTERCHANGE>, this gives the same results
as C<L</is_strict_utf8_string>>; and if C<flags> is
C<UTF8_DISALLOW_ILLEGAL_C9_INTERCHANGE>, this gives the same results as
C<L</is_c9strict_utf8_string>>.  Otherwise C<flags> may be any
combination of the C<UTF8_DISALLOW_I<foo>> flags understood by
C<L</utf8n_to_uvchr>>, with the same meanings.

See also
C<L</is_utf8_invariant_string>>,
C<L</is_utf8_string>>,
C<L</is_utf8_string_loc>>,
C<L</is_utf8_string_loc_flags>>,
C<L</is_utf8_string_loclen>>,
C<L</is_utf8_string_loclen_flags>>,
C<L</is_utf8_fixed_width_buf_flags>>,
C<L</is_utf8_fixed_width_buf_loc_flags>>,
C<L</is_utf8_fixed_width_buf_loclen_flags>>,
C<L</is_strict_utf8_string>>,
C<L</is_strict_utf8_string_loc>>,
C<L</is_strict_utf8_string_loclen>>,
C<L</is_c9strict_utf8_string>>,
C<L</is_c9strict_utf8_string_loc>>,
and
C<L</is_c9strict_utf8_string_loclen>>.

	bool	is_utf8_string_flags(const U8 *s,
		                     const STRLEN len,
		                     const U32 flags)

=for hackers
Found in file inline.h

=item is_utf8_string_loc
X<is_utf8_string_loc>

Like C<L</is_utf8_string>> but stores the location of the failure (in the
case of "utf8ness failure") or the location C<s>+C<len> (in the case of
"utf8ness success") in the C<ep> pointer.

See also C<L</is_utf8_string_loclen>>.

	bool	is_utf8_string_loc(const U8 *s,
		                   const STRLEN len,
		                   const U8 **ep)

=for hackers
Found in file inline.h

=item is_utf8_string_loclen
X<is_utf8_string_loclen>

Like C<L</is_utf8_string>> but stores the location of the failure (in the
case of "utf8ness failure") or the location C<s>+C<len> (in the case of
"utf8ness success") in the C<ep> pointer, and the number of UTF-8
encoded characters in the C<el> pointer.

See also C<L</is_utf8_string_loc>>.

	bool	is_utf8_string_loclen(const U8 *s,
		                      const STRLEN len,
		                      const U8 **ep, STRLEN *el)

=for hackers
Found in file inline.h

=item is_utf8_string_loclen_flags
X<is_utf8_string_loclen_flags>

Like C<L</is_utf8_string_flags>> but stores the location of the failure (in the
case of "utf8ness failure") or the location C<s>+C<len> (in the case of
"utf8ness success") in the C<ep> pointer, and the number of UTF-8
encoded characters in the C<el> pointer.

See also C<L</is_utf8_string_loc_flags>>.

	bool	is_utf8_string_loclen_flags(const U8 *s,
		                            const STRLEN len,
		                            const U8 **ep,
		                            STRLEN *el,
		                            const U32 flags)

=for hackers
Found in file inline.h

=item is_utf8_string_loc_flags
X<is_utf8_string_loc_flags>

Like C<L</is_utf8_string_flags>> but stores the location of the failure (in the
case of "utf8ness failure") or the location C<s>+C<len> (in the case of
"utf8ness success") in the C<ep> pointer.

See also C<L</is_utf8_string_loclen_flags>>.

	bool	is_utf8_string_loc_flags(const U8 *s,
		                         const STRLEN len,
		                         const U8 **ep,
		                         const U32 flags)

=for hackers
Found in file inline.h

=item is_utf8_valid_partial_char
X<is_utf8_valid_partial_char>

Returns 0 if the sequence of bytes starting at C<s> and looking no further than
S<C<e - 1>> is the UTF-8 encoding, as extended by Perl, for one or more code
points.  Otherwise, it returns 1 if there exists at least one non-empty
sequence of bytes that, when appended to sequence C<s> starting at position
C<e>, causes the entire sequence to be the well-formed UTF-8 of some code point;
otherwise it returns 0.

In other words this returns TRUE if C<s> points to a partial UTF-8-encoded code
point.

This is useful when a fixed-length buffer is being tested for being well-formed
UTF-8, but the final few bytes in it don't comprise a full character; that is,
it is split somewhere in the middle of the final code point's UTF-8
representation.  (Presumably when the buffer is refreshed with the next chunk
of data, the new first bytes will complete the partial code point.)  This
function is used to verify that the final bytes in the current buffer are in
fact the legal beginning of some code point, so that if they aren't, the
failure can be signalled without having to wait for the next read.
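
The idea of checking the tail of a fixed-length buffer can be sketched in
standalone C.  This is only an illustration under stated assumptions, not
Perl's implementation: C<sketch_is_partial_char> is a hypothetical name, the
sketch assumes an ASCII platform, and it accepts only the Unicode range
0..0x10FFFF rather than Perl's extended UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not Perl's implementation.  Returns 1 if the bytes
 * s .. e-1 are a proper prefix of some standard UTF-8 sequence, else 0.
 * Assumption: an ASCII platform and the Unicode range 0..0x10FFFF only. */
int sketch_is_partial_char(const unsigned char *s, const unsigned char *e)
{
    size_t have, need, i;

    assert(s != NULL && e != NULL && s <= e);
    have = (size_t)(e - s);
    if (have == 0)
        return 0;

    if (s[0] >= 0xC2 && s[0] <= 0xDF)      need = 2;
    else if ((s[0] & 0xF0) == 0xE0)        need = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF4) need = 4;
    else
        return 0;   /* ASCII, continuation byte, or invalid start byte */

    if (have >= need)
        return 0;   /* already a full character (or more): not partial */

    for (i = 1; i < have; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;   /* only continuation bytes may follow the start */

    if (have >= 2) {
        if (s[0] == 0xE0 && s[1] < 0xA0) return 0;  /* overlong */
        if (s[0] == 0xF0 && s[1] < 0x90) return 0;  /* overlong */
        if (s[0] == 0xF4 && s[1] > 0x8F) return 0;  /* above 0x10FFFF */
    }
    return 1;
}
```

A reader loop would call such a check on the bytes left over at the end of
each chunk and report an error at once if it returns 0.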

	bool	is_utf8_valid_partial_char(const U8 * const s,
		                           const U8 * const e)

=for hackers
Found in file inline.h

=item is_utf8_valid_partial_char_flags
X<is_utf8_valid_partial_char_flags>

Like C<L</is_utf8_valid_partial_char>>, it returns a boolean giving whether
or not the input is a valid UTF-8 encoded partial character, but it takes an
extra parameter, C<flags>, which can further restrict which code points are
considered valid.

If C<flags> is 0, this behaves identically to
C<L</is_utf8_valid_partial_char>>.  Otherwise C<flags> can be any combination
of the C<UTF8_DISALLOW_I<foo>> flags accepted by C<L</utf8n_to_uvchr>>.  If
there is any sequence of bytes that can complete the input partial character in
such a way that a non-prohibited character is formed, the function returns
TRUE; otherwise FALSE.  Non-character code points cannot be determined based on
partial character input.  But many of the other possible excluded types can be
determined from just the first one or two bytes.

	bool	is_utf8_valid_partial_char_flags(
		    const U8 * const s, const U8 * const e,
		    const U32 flags
		)

=for hackers
Found in file inline.h

=item isUTF8_CHAR
X<isUTF8_CHAR>

Evaluates to non-zero if the first few bytes of the string starting at C<s> and
looking no further than S<C<e - 1>> are well-formed UTF-8, as extended by Perl,
that represents some code point; otherwise it evaluates to 0.  If non-zero, the
value gives how many bytes starting at C<s> comprise the code point's
representation.  Any bytes remaining before C<e>, but beyond the ones needed to
form the first code point in C<s>, are not examined.

The code point can be any that will fit in a UV on this machine, using Perl's
extension to official UTF-8 to represent those higher than the Unicode maximum
of 0x10FFFF.  That means that this macro is used to efficiently decide if the
next few bytes in C<s> are legal UTF-8 for a single character.

Use C<L</isSTRICT_UTF8_CHAR>> to restrict the acceptable code points to those
defined by Unicode to be fully interchangeable across applications;
C<L</isC9_STRICT_UTF8_CHAR>> to use the L<Unicode Corrigendum
#9|http://www.unicode.org/versions/corrigendum9.html> definition of allowable
code points; and C<L</isUTF8_CHAR_flags>> for a more customized definition.

Use C<L</is_utf8_string>>, C<L</is_utf8_string_loc>>, and
C<L</is_utf8_string_loclen>> to check entire strings.

Note that it is deprecated to use code points higher than what will fit in an
IV.  This macro does not raise any warnings for such code points, treating them
as valid.

Note also that a UTF-8 INVARIANT character (i.e. ASCII on non-EBCDIC machines)
is a valid UTF-8 character.

	STRLEN	isUTF8_CHAR(const U8 *s, const U8 *e)

=for hackers
Found in file utf8.h

=item isUTF8_CHAR_flags
X<isUTF8_CHAR_flags>

Evaluates to non-zero if the first few bytes of the string starting at C<s> and
looking no further than S<C<e - 1>> are well-formed UTF-8, as extended by Perl,
that represents some code point, subject to the restrictions given by C<flags>;
otherwise it evaluates to 0.  If non-zero, the value gives how many bytes
starting at C<s> comprise the code point's representation.  Any bytes remaining
before C<e>, but beyond the ones needed to form the first code point in C<s>,
are not examined.

If C<flags> is 0, this gives the same results as C<L</isUTF8_CHAR>>;
if C<flags> is C<UTF8_DISALLOW_ILLEGAL_INTERCHANGE>, this gives the same results
as C<L</isSTRICT_UTF8_CHAR>>;
and if C<flags> is C<UTF8_DISALLOW_ILLEGAL_C9_INTERCHANGE>, this gives
the same results as C<L</isC9_STRICT_UTF8_CHAR>>.
Otherwise C<flags> may be any combination of the C<UTF8_DISALLOW_I<foo>> flags
understood by C<L</utf8n_to_uvchr>>, with the same meanings.

The three alternative macros are for the most commonly needed validations; they
are likely to run somewhat faster than this more general one, as they can be
inlined into your code.

Use L</is_utf8_string_flags>, L</is_utf8_string_loc_flags>, and
L</is_utf8_string_loclen_flags> to check entire strings.

	STRLEN	isUTF8_CHAR_flags(const U8 *s, const U8 *e,
		                   const U32 flags)

=for hackers
Found in file utf8.h

=item pv_uni_display
X<pv_uni_display>

Build to the scalar C<dsv> a displayable version of the string C<spv>,
length C<len>, the displayable version being at most C<pvlim> bytes long
(if longer, the rest is truncated and C<"..."> will be appended).

The C<flags> argument can have C<UNI_DISPLAY_ISPRINT> set to display
C<isPRINT()>able characters as themselves, C<UNI_DISPLAY_BACKSLASH>
to display the C<\\[nrfta\\]> as the backslashed versions (like C<"\n">)
(C<UNI_DISPLAY_BACKSLASH> is preferred over C<UNI_DISPLAY_ISPRINT> for C<"\\">).
C<UNI_DISPLAY_QQ> (and its alias C<UNI_DISPLAY_REGEX>) have both
C<UNI_DISPLAY_BACKSLASH> and C<UNI_DISPLAY_ISPRINT> turned on.

The pointer to the PV of the C<dsv> is returned.

See also L</sv_uni_display>.

	char*	pv_uni_display(SV *dsv, const U8 *spv,
		               STRLEN len, STRLEN pvlim,
		               UV flags)

=for hackers
Found in file utf8.c

=item REPLACEMENT_CHARACTER_UTF8
X<REPLACEMENT_CHARACTER_UTF8>

This is a macro that evaluates to a string constant of the UTF-8 bytes that
define the Unicode REPLACEMENT CHARACTER (U+FFFD) for the platform that perl
is compiled on.  This allows code to use a mnemonic for this character that
works on both ASCII and EBCDIC platforms.
S<C<sizeof(REPLACEMENT_CHARACTER_UTF8) - 1>> can be used to get its length in
bytes.

=for hackers
Found in file unicode_constants.h

=item sv_cat_decode
X<sv_cat_decode>

C<encoding> is assumed to be an C<Encode> object, the PV of C<ssv> is
assumed to be octets in that encoding and decoding the input starts
from the position which S<C<(PV + *offset)>> pointed to.  C<dsv> will be
concatenated with the decoded UTF-8 string from C<ssv>.  Decoding will terminate
when the string C<tstr> appears in decoding output or the input ends on
the PV of C<ssv>.  The value which C<offset> points to will be modified
to the last input position on C<ssv>.

Returns TRUE if the terminator was found, else returns FALSE.

	bool	sv_cat_decode(SV* dsv, SV *encoding, SV *ssv,
		              int *offset, char* tstr, int tlen)

=for hackers
Found in file sv.c

=item sv_recode_to_utf8
X<sv_recode_to_utf8>

C<encoding> is assumed to be an C<Encode> object, on entry the PV
of C<sv> is assumed to be octets in that encoding, and C<sv>
will be converted into Unicode (and UTF-8).

If C<sv> already is UTF-8 (or if it is not C<POK>), or if C<encoding>
is not a reference, nothing is done to C<sv>.  If C<encoding> is not
an C<Encode::XS> Encoding object, bad things will happen.
(See F<cpan/Encode/encoding.pm> and L<Encode>.)

The PV of C<sv> is returned.

	char*	sv_recode_to_utf8(SV* sv, SV *encoding)

=for hackers
Found in file sv.c

=item sv_uni_display
X<sv_uni_display>

Build to the scalar C<dsv> a displayable version of the scalar C<sv>,
the displayable version being at most C<pvlim> bytes long
(if longer, the rest is truncated and C<"..."> will be appended).

The C<flags> argument is as in L</pv_uni_display>().

The pointer to the PV of the C<dsv> is returned.

	char*	sv_uni_display(SV *dsv, SV *ssv, STRLEN pvlim,
		               UV flags)

=for hackers
Found in file utf8.c

=item to_utf8_case
X<to_utf8_case>


DEPRECATED!  It is planned to remove this function from a
future release of Perl.  Do not use it for new code; remove it from
existing code.


Instead use the appropriate one of L</toUPPER_utf8_safe>,
L</toTITLE_utf8_safe>,
L</toLOWER_utf8_safe>,
or L</toFOLD_utf8_safe>.

This function will be removed in Perl v5.28.

C<p> contains the pointer to the UTF-8 string encoding
the character that is being converted.  This routine assumes that the character
at C<p> is well-formed.

C<ustrp> is a pointer to the character buffer to put the
conversion result to.  C<lenp> is a pointer to the length
of the result.

C<swashp> is a pointer to the swash to use.

Both the special and normal mappings are stored in F<lib/unicore/To/Foo.pl>,
and loaded by C<SWASHNEW>, using F<lib/utf8_heavy.pl>.  C<special> (usually,
but not always, a multicharacter mapping), is tried first.

C<special> is a string, normally C<NULL> or C<"">.  C<NULL> means to not use
any special mappings; C<""> means to use the special mappings.  Values other
than these two are treated as the name of the hash containing the special
mappings, like C<"utf8::ToSpecLower">.

C<normal> is a string like C<"ToLower"> which means the swash
C<%utf8::ToLower>.

Code points above the platform's C<IV_MAX> will raise a deprecation warning,
unless those are turned off.

	UV	to_utf8_case(const U8 *p, U8* ustrp,
		             STRLEN *lenp, SV **swashp,
		             const char *normal,
		             const char *special)

=for hackers
Found in file utf8.c

=item to_utf8_fold
X<to_utf8_fold>


DEPRECATED!  It is planned to remove this function from a
future release of Perl.  Do not use it for new code; remove it from
existing code.


Instead use L</toFOLD_utf8_safe>.

	UV	to_utf8_fold(const U8 *p, U8* ustrp,
		             STRLEN *lenp)

=for hackers
Found in file utf8.c

=item to_utf8_lower
X<to_utf8_lower>


DEPRECATED!  It is planned to remove this function from a
future release of Perl.  Do not use it for new code; remove it from
existing code.


Instead use L</toLOWER_utf8_safe>.

	UV	to_utf8_lower(const U8 *p, U8* ustrp,
		              STRLEN *lenp)

=for hackers
Found in file utf8.c

=item to_utf8_title
X<to_utf8_title>


DEPRECATED!  It is planned to remove this function from a
future release of Perl.  Do not use it for new code; remove it from
existing code.


Instead use L</toTITLE_utf8_safe>.

	UV	to_utf8_title(const U8 *p, U8* ustrp,
		              STRLEN *lenp)

=for hackers
Found in file utf8.c

=item to_utf8_upper
X<to_utf8_upper>


DEPRECATED!  It is planned to remove this function from a
future release of Perl.  Do not use it for new code; remove it from
existing code.


Instead use L</toUPPER_utf8_safe>.

	UV	to_utf8_upper(const U8 *p, U8* ustrp,
		              STRLEN *lenp)

=for hackers
Found in file utf8.c

=item utf8n_to_uvchr
X<utf8n_to_uvchr>

THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED CIRCUMSTANCES.
Most code should use L</utf8_to_uvchr_buf>() rather than call this directly.

Bottom level UTF-8 decode routine.
Returns the native code point value of the first character in the string C<s>,
which is assumed to be in UTF-8 (or UTF-EBCDIC) encoding, and no longer than
C<curlen> bytes; C<*retlen> (if C<retlen> isn't NULL) will be set to
the length, in bytes, of that character.

The value of C<flags> determines the behavior when C<s> does not point to a
well-formed UTF-8 character.  If C<flags> is 0, encountering a malformation
causes zero to be returned and C<*retlen> is set so that (S<C<s> + C<*retlen>>)
is the next possible position in C<s> that could begin a non-malformed
character.  Also, if UTF-8 warnings haven't been lexically disabled, a warning
is raised.  Some UTF-8 input sequences may contain multiple malformations.
This function tries to find every possible one in each call, so multiple
warnings can be raised for each sequence.

Various ALLOW flags can be set in C<flags> to allow (and not warn on)
individual types of malformations, such as the sequence being overlong (that
is, when there is a shorter sequence that can express the same code point;
overlong sequences are expressly forbidden in the UTF-8 standard due to
potential security issues).  Another malformation example is the first byte of
a character not being a legal first byte.  See F<utf8.h> for the list of such
flags.  Even if allowed, this function generally returns the Unicode
REPLACEMENT CHARACTER when it encounters a malformation.  There are flags in
F<utf8.h> to override this behavior for the overlong malformations, but don't
do that except for very specialized purposes.

The C<UTF8_CHECK_ONLY> flag overrides the behavior when a non-allowed (by other
flags) malformation is found.  If this flag is set, the routine assumes that
the caller will raise a warning, and this function will silently just set
C<retlen> to C<-1> (cast to C<STRLEN>) and return zero.

Note that this API requires disambiguation between successfully decoding a C<NUL>
character, and an error return (unless the C<UTF8_CHECK_ONLY> flag is set), as
in both cases, 0 is returned, and, depending on the malformation, C<retlen> may
be set to 1.  To disambiguate, upon a zero return, see if the first byte of
C<s> is 0 as well.  If so, the input was a C<NUL>; if not, the input had an
error.  Or you can use C<L</utf8n_to_uvchr_error>>.
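
The disambiguation rule can be sketched in standalone C.  This is a sketch
under stated assumptions: C<sketch_decode> is a hypothetical stand-in that
handles only one- and two-byte sequences and merely mimics the zero return
described above; it is not the Perl function, and C<sketch_zero_is_error> is
likewise an illustrative name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in decoder (one- and two-byte sequences only).  Like
 * the API described above, it returns 0 both for a decoded NUL and for
 * malformed input, so the return value alone is ambiguous. */
unsigned long sketch_decode(const unsigned char *s, size_t curlen,
                            size_t *retlen)
{
    assert(s != NULL && retlen != NULL);
    if (curlen >= 1 && s[0] < 0x80) {
        *retlen = 1;
        return s[0];
    }
    if (curlen >= 2 && s[0] >= 0xC2 && s[0] <= 0xDF
        && (s[1] & 0xC0) == 0x80) {
        *retlen = 2;
        return ((unsigned long)(s[0] & 0x1F) << 6) | (s[1] & 0x3F);
    }
    *retlen = 1;    /* malformed: next candidate start is one byte on */
    return 0;
}

/* A zero result is an error unless the first input byte is itself 0.
 * The caller must ensure at least one byte is readable at s. */
int sketch_zero_is_error(const unsigned char *s, unsigned long cp)
{
    assert(s != NULL);
    return cp == 0 && s[0] != 0;
}
```

The second helper is the whole trick: it looks at the input byte rather than
at the decoded value.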

Certain code points are considered problematic.  These are Unicode surrogates,
Unicode non-characters, and code points above the Unicode maximum of 0x10FFFF.
By default these are considered regular code points, but certain situations
warrant special handling for them, which can be specified using the C<flags>
parameter.  If C<flags> contains C<UTF8_DISALLOW_ILLEGAL_INTERCHANGE>, all
three classes are treated as malformations and handled as such.  The flags
C<UTF8_DISALLOW_SURROGATE>, C<UTF8_DISALLOW_NONCHAR>, and
C<UTF8_DISALLOW_SUPER> (meaning above the legal Unicode maximum) can be set to
disallow these categories individually.  C<UTF8_DISALLOW_ILLEGAL_INTERCHANGE>
restricts the allowed inputs to the strict UTF-8 traditionally defined by
Unicode.  Use C<UTF8_DISALLOW_ILLEGAL_C9_INTERCHANGE> to use the strictness
definition given by
L<Unicode Corrigendum #9|http://www.unicode.org/versions/corrigendum9.html>.
The difference between traditional strictness and C9 strictness is that the
latter does not forbid non-character code points.  (They are still discouraged,
however.)  For more discussion see L<perlunicode/Noncharacter code points>.

The flags C<UTF8_WARN_ILLEGAL_INTERCHANGE>,
C<UTF8_WARN_ILLEGAL_C9_INTERCHANGE>, C<UTF8_WARN_SURROGATE>,
C<UTF8_WARN_NONCHAR>, and C<UTF8_WARN_SUPER> will cause warning messages to be
raised for their respective categories, but otherwise the code points are
considered valid (not malformations).  To get a category to both be treated as
a malformation and raise a warning, specify both the WARN and DISALLOW flags.
(But note that warnings are not raised if lexically disabled nor if
C<UTF8_CHECK_ONLY> is also specified.)

It is now deprecated to have very high code points (above C<IV_MAX> on the
platform) and this function will raise a deprecation warning for these (unless
such warnings are turned off).  This value is typically 0x7FFF_FFFF (2**31 - 1)
in a 32-bit word.

Code points above 0x7FFF_FFFF (2**31 - 1) were never specified in any standard,
so using them is more problematic than other above-Unicode code points.  Perl
invented an extension to UTF-8 to represent the ones above 2**36-1, so it is
likely that non-Perl languages will not be able to read files that contain
these; nor would Perl understand files
written by something that uses a different extension.  For these reasons, there
is a separate set of flags that can warn and/or disallow these extremely high
code points, even if other above-Unicode ones are accepted.  These are the
C<UTF8_WARN_ABOVE_31_BIT> and C<UTF8_DISALLOW_ABOVE_31_BIT> flags.  These
are entirely independent from the deprecation warning for code points above
C<IV_MAX>.  On 32-bit machines, it will eventually be forbidden to have any
code point that needs more than 31 bits to represent.  When that happens,
effectively the C<UTF8_DISALLOW_ABOVE_31_BIT> flag will always be set on
32-bit machines.  (Of course C<UTF8_DISALLOW_SUPER> will treat all
above-Unicode code points, including these, as malformations; and
C<UTF8_WARN_SUPER> warns on these.)

On EBCDIC platforms starting in Perl v5.24, the Perl extension for representing
extremely high code points kicks in at 0x3FFF_FFFF (2**30 - 1), which is lower
than on ASCII.  Prior to that, code points 2**31 and higher were simply
unrepresentable, and a different, incompatible method was used to represent
code points between 2**30 and 2**31 - 1.  The flags C<UTF8_WARN_ABOVE_31_BIT>
and C<UTF8_DISALLOW_ABOVE_31_BIT> have the same function as on ASCII
platforms, warning and disallowing 2**31 and higher.

All other code points corresponding to Unicode characters, including private
use and those yet to be assigned, are never considered malformed and never
warn.

	UV	utf8n_to_uvchr(const U8 *s, STRLEN curlen,
		               STRLEN *retlen, const U32 flags)

=for hackers
Found in file utf8.c

=item utf8n_to_uvchr_error
X<utf8n_to_uvchr_error>

THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED CIRCUMSTANCES.
Most code should use L</utf8_to_uvchr_buf>() rather than call this directly.

This function is for code that needs to know what the precise malformation(s)
are when an error is found.

It is like C<L</utf8n_to_uvchr>> but it takes an extra parameter placed after
all the others, C<errors>.  If this parameter is 0, this function behaves
identically to C<L</utf8n_to_uvchr>>.  Otherwise, C<errors> should be a pointer
to a C<U32> variable, which this function sets to indicate any errors found.
Upon return, if C<*errors> is 0, there were no errors found.  Otherwise,
C<*errors> is the bit-wise C<OR> of the bits described in the list below.  Some
of these bits will be set if a malformation is found, even if the input
C<flags> parameter indicates that the given malformation is allowed; those
exceptions are noted:

=over 4

=item C<UTF8_GOT_ABOVE_31_BIT>

The code point represented by the input UTF-8 sequence occupies more than 31
bits.
This bit is set only if the input C<flags> parameter contains either the
C<UTF8_DISALLOW_ABOVE_31_BIT> or the C<UTF8_WARN_ABOVE_31_BIT> flags.

=item C<UTF8_GOT_CONTINUATION>

The input sequence was malformed in that the first byte was a UTF-8
continuation byte.

=item C<UTF8_GOT_EMPTY>

The input C<curlen> parameter was 0.

=item C<UTF8_GOT_LONG>

The input sequence was malformed in that there is some other sequence that
evaluates to the same code point, but that sequence is shorter than this one.

=item C<UTF8_GOT_NONCHAR>

The code point represented by the input UTF-8 sequence is for a Unicode
non-character code point.
This bit is set only if the input C<flags> parameter contains either the
C<UTF8_DISALLOW_NONCHAR> or the C<UTF8_WARN_NONCHAR> flags.

=item C<UTF8_GOT_NON_CONTINUATION>

The input sequence was malformed in that a non-continuation type byte was found
in a position where only a continuation type one should be.

=item C<UTF8_GOT_OVERFLOW>

The input sequence was malformed in that it is for a code point that is not
representable in the number of bits available in a UV on the current platform.

=item C<UTF8_GOT_SHORT>

The input sequence was malformed in that C<curlen> is smaller than required for
a complete sequence.  In other words, the input is for a partial character
sequence.

=item C<UTF8_GOT_SUPER>

The input sequence was malformed in that it is for a non-Unicode code point;
that is, one above the legal Unicode maximum.
This bit is set only if the input C<flags> parameter contains either the
C<UTF8_DISALLOW_SUPER> or the C<UTF8_WARN_SUPER> flags.

=item C<UTF8_GOT_SURROGATE>

The input sequence was malformed in that it is for a Unicode UTF-16 surrogate
code point.
This bit is set only if the input C<flags> parameter contains either the
C<UTF8_DISALLOW_SURROGATE> or the C<UTF8_WARN_SURROGATE> flags.

=back

To do your own error handling, call this function with the C<UTF8_CHECK_ONLY>
flag to suppress any warnings, and then examine the C<*errors> return.

	UV	utf8n_to_uvchr_error(const U8 *s, STRLEN curlen,
		                     STRLEN *retlen,
		                     const U32 flags,
		                     U32 * errors)

=for hackers
Found in file utf8.c

=item utf8n_to_uvuni
X<utf8n_to_uvuni>

Instead use L</utf8_to_uvchr_buf>, or rarely, L</utf8n_to_uvchr>.

This function was useful for code that wanted to handle both EBCDIC and
ASCII platforms with Unicode properties, but starting in Perl v5.20, the
distinctions between the platforms have mostly been made invisible to most
code, so this function is quite unlikely to be what you want.  If you do need
this precise functionality, use instead
C<L<NATIVE_TO_UNI(utf8_to_uvchr_buf(...))|/utf8_to_uvchr_buf>>
or C<L<NATIVE_TO_UNI(utf8n_to_uvchr(...))|/utf8n_to_uvchr>>.

	UV	utf8n_to_uvuni(const U8 *s, STRLEN curlen,
		               STRLEN *retlen, U32 flags)

=for hackers
Found in file utf8.c

=item UTF8SKIP
X<UTF8SKIP>

Returns the number of bytes in the UTF-8 encoded character whose first (perhaps
only) byte is pointed to by C<s>.
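
As a rough illustration of how a length can be read off the first byte, here
is a standalone C sketch.  It is an assumption-laden stand-in, not the macro
itself: C<sketch_utf8_skip> is a hypothetical name, the sketch assumes an
ASCII platform, and it covers only the start bytes of standard one- to
four-byte UTF-8, not Perl's extended start bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in, ASCII platforms only.  Maps the first byte of a
 * standard UTF-8 sequence to the sequence length in bytes. */
size_t sketch_utf8_skip(const unsigned char *s)
{
    assert(s != NULL);
    if (*s < 0x80)           return 1;  /* 0xxxxxxx: single byte */
    if ((*s & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((*s & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((*s & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 1;   /* continuation or unsupported byte: step a single byte */
}
```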

	STRLEN	UTF8SKIP(char* s)

=for hackers
Found in file utf8.h

=item utf8_distance
X<utf8_distance>

Returns the number of UTF-8 characters between the UTF-8 pointers C<a>
and C<b>.

WARNING: use only if you *know* that the pointers point inside the
same UTF-8 buffer.

	IV	utf8_distance(const U8 *a, const U8 *b)

=for hackers
Found in file inline.h

=item utf8_hop
X<utf8_hop>

Return the UTF-8 pointer C<s> displaced by C<off> characters, either
forward or backward.

WARNING: do not use the following unless you *know* C<off> is within
the UTF-8 data pointed to by C<s> *and* that on entry C<s> is aligned
on the first byte of a character or just after the last byte of a character.

	U8*	utf8_hop(const U8 *s, SSize_t off)

=for hackers
Found in file inline.h

=item utf8_hop_back
X<utf8_hop_back>

Return the UTF-8 pointer C<s> displaced by up to C<off> characters,
backward.

C<off> must be non-positive.

C<s> must be after or equal to C<start>.

When moving backward it will not move before C<start>.

Will not exceed this limit even if the string is not valid "UTF-8".

	U8*	utf8_hop_back(const U8 *s, SSize_t off,
		              const U8 *start)

=for hackers
Found in file inline.h

=item utf8_hop_forward
X<utf8_hop_forward>

Return the UTF-8 pointer C<s> displaced by up to C<off> characters,
forward.

C<off> must be non-negative.

C<s> must be before or equal to C<end>.

When moving forward it will not move beyond C<end>.

Will not exceed this limit even if the string is not valid "UTF-8".

	U8*	utf8_hop_forward(const U8 *s, SSize_t off,
		                 const U8 *end)

=for hackers
Found in file inline.h

=item utf8_hop_safe
X<utf8_hop_safe>

Return the UTF-8 pointer C<s> displaced by up to C<off> characters,
either forward or backward.

When moving backward it will not move before C<start>.

When moving forward it will not move beyond C<end>.

Will not exceed those limits even if the string is not valid "UTF-8".
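
The clamping behavior can be sketched in standalone C.  This is only a sketch
of the idea, not Perl's code: C<sketch_hop_safe> is a hypothetical name, and
the sketch assumes an ASCII platform where continuation bytes match the bit
pattern C<10xxxxxx>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: move s by up to off characters, never leaving the
 * range [start, end], even if the bytes are not valid UTF-8. */
const unsigned char *sketch_hop_safe(const unsigned char *s, ptrdiff_t off,
                                     const unsigned char *start,
                                     const unsigned char *end)
{
    assert(start <= s && s <= end);

    while (off > 0 && s < end) {
        s++;                                    /* leave the start byte */
        while (s < end && (*s & 0xC0) == 0x80)  /* skip continuations */
            s++;
        off--;
    }
    while (off < 0 && s > start) {
        s--;                                    /* step back one byte */
        while (s > start && (*s & 0xC0) == 0x80)
            s--;                                /* back up to a start byte */
        off++;
    }
    return s;
}
```

Because each step re-checks the bounds, an oversized C<off> simply stops at
C<start> or C<end> in this sketch.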

	U8*	utf8_hop_safe(const U8 *s, SSize_t off,
		              const U8 *start, const U8 *end)

=for hackers
Found in file inline.h

=item UTF8_IS_INVARIANT
X<UTF8_IS_INVARIANT>

Evaluates to 1 if the byte C<c> represents the same character when encoded in
UTF-8 as when not; otherwise evaluates to 0.  UTF-8 invariant characters can be
copied as-is when converting to/from UTF-8, saving time.

In spite of the name, this macro gives the correct result if the input string
from which C<c> comes is not encoded in UTF-8.

See C<L</UVCHR_IS_INVARIANT>> for checking if a UV is invariant.
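
On ASCII platforms the invariant bytes are exactly those below 0x80, which
allows a quick pre-scan before any conversion.  The following standalone C
sketch illustrates that; the function names are hypothetical, and the test
used here does not hold on EBCDIC platforms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ASCII-platform equivalent: a byte is invariant when it is
 * the same whether or not the string is UTF-8 encoded. */
int sketch_is_invariant(unsigned char c)
{
    return c < 0x80;
}

/* Returns 1 if every byte is invariant, so the buffer could be copied
 * as-is when converting to or from UTF-8. */
int sketch_all_invariant(const unsigned char *s, size_t len)
{
    size_t i;

    assert(s != NULL || len == 0);
    for (i = 0; i < len; i++)
        if (!sketch_is_invariant(s[i]))
            return 0;
    return 1;
}
```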

	bool	UTF8_IS_INVARIANT(char c)

=for hackers
Found in file utf8.h

=item UTF8_IS_NONCHAR
X<UTF8_IS_NONCHAR>

Evaluates to non-zero if the first few bytes of the string starting at C<s> and
looking no further than S<C<e - 1>> are well-formed UTF-8 that represents one
of the Unicode non-character code points; otherwise it evaluates to 0.  If
non-zero, the value gives how many bytes starting at C<s> comprise the code
point's representation.

	bool	UTF8_IS_NONCHAR(const U8 *s, const U8 *e)

=for hackers
Found in file utf8.h

=item UTF8_IS_SUPER
X<UTF8_IS_SUPER>

Recall that Perl recognizes an extension to UTF-8 that can encode code
points larger than the ones defined by Unicode, which are 0..0x10FFFF.

This macro evaluates to non-zero if the first few bytes of the string starting
at C<s> and looking no further than S<C<e - 1>> are from this UTF-8 extension;
otherwise it evaluates to 0.  If non-zero, the value gives how many bytes
starting at C<s> comprise the code point's representation.

0 is returned if the bytes are not well-formed extended UTF-8, or if they
represent a code point that cannot fit in a UV on the current platform.  Hence
this macro can give different results when run on a 64-bit word machine than on
one with a 32-bit word size.

Note that it is deprecated to have code points that are larger than what can
fit in an IV on the current machine.

	bool	UTF8_IS_SUPER(const U8 *s, const U8 *e)

=for hackers
Found in file utf8.h

=item UTF8_IS_SURROGATE
X<UTF8_IS_SURROGATE>

Evaluates to non-zero if the first few bytes of the string starting at C<s> and
looking no further than S<C<e - 1>> are well-formed UTF-8 that represents one
of the Unicode surrogate code points; otherwise it evaluates to 0.  If
non-zero, the value gives how many bytes starting at C<s> comprise the code
point's representation.

	bool	UTF8_IS_SURROGATE(const U8 *s, const U8 *e)

=for hackers
Found in file utf8.h

=item utf8_length
X<utf8_length>

Return the length of the UTF-8 char encoded string C<s> in characters.
Stops at C<e> (inclusive).  If C<e E<lt> s> or if the scan would end
up past C<e>, croaks.
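
Counting characters rather than bytes can be sketched in standalone C as
follows.  This is an illustration under assumptions, not the function itself:
C<sketch_utf8_length> is a hypothetical name, the input is assumed to be
well-formed UTF-8 on an ASCII platform, the scan covers the bytes before
C<e>, and the sketch counts start bytes instead of croaking on bad input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: count the characters in the bytes s .. e-1 by
 * counting every byte that is not a continuation byte (10xxxxxx). */
size_t sketch_utf8_length(const unsigned char *s, const unsigned char *e)
{
    size_t count = 0;

    assert(s != NULL && e != NULL && s <= e);
    for (; s < e; s++)
        if ((*s & 0xC0) != 0x80)
            count++;
    return count;
}
```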

	STRLEN	utf8_length(const U8* s, const U8 *e)

=for hackers
Found in file utf8.c

=item utf8_to_bytes
X<utf8_to_bytes>


NOTE: this function is experimental and may change or be
removed without notice.


Converts a string C<s> of length C<len> from UTF-8 into native byte encoding.
Unlike L</bytes_to_utf8>, this overwrites the original string, and
updates C<len> to contain the new length.
Returns zero on failure, setting C<len> to -1.

If you need a copy of the string, see L</bytes_from_utf8>.
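
The in-place conversion and its failure convention can be sketched in
standalone C.  This is a sketch under stated assumptions, not Perl's code:
C<sketch_utf8_to_bytes> is a hypothetical name, an ASCII platform is assumed,
and the sketch validates the whole string before modifying anything.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: downgrade UTF-8 to single bytes in place.  Fails
 * (returns NULL, *len set to (size_t)-1) if any character is above 0xFF. */
unsigned char *sketch_utf8_to_bytes(unsigned char *s, size_t *len)
{
    unsigned char *src, *dst, *end;

    assert(s != NULL && len != NULL);
    end = s + *len;

    /* Pass 1: every character must be ASCII or a two-byte sequence that
     * encodes a code point below 256 (start byte 0xC2 or 0xC3). */
    for (src = s; src < end; src++) {
        if (*src < 0x80)
            continue;
        if ((*src == 0xC2 || *src == 0xC3)
            && src + 1 < end && (src[1] & 0xC0) == 0x80) {
            src++;
            continue;
        }
        *len = (size_t)-1;
        return NULL;
    }

    /* Pass 2: rewrite in place; the output never outruns the input. */
    for (src = dst = s; src < end; ) {
        if (*src < 0x80) {
            *dst++ = *src++;
        } else {
            *dst++ = (unsigned char)(((src[0] & 0x03) << 6)
                                     | (src[1] & 0x3F));
            src += 2;
        }
    }
    *len = (size_t)(dst - s);
    return s;
}
```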

	U8*	utf8_to_bytes(U8 *s, STRLEN *len)

=for hackers
Found in file utf8.c

=item utf8_to_uvchr_buf
X<utf8_to_uvchr_buf>

Returns the native code point of the first character in the string C<s> which
is assumed to be in UTF-8 encoding; C<send> points to 1 beyond the end of C<s>.
C<*retlen> will be set to the length, in bytes, of that character.

If C<s> does not point to a well-formed UTF-8 character and UTF8 warnings are
enabled, zero is returned and C<*retlen> is set (if C<retlen> isn't
C<NULL>) to -1.  If those warnings are off, the computed value, if well-defined
(or the Unicode REPLACEMENT CHARACTER if not), is silently returned, and
C<*retlen> is set (if C<retlen> isn't C<NULL>) so that (S<C<s> + C<*retlen>>) is
the next possible position in C<s> that could begin a non-malformed character.
See L</utf8n_to_uvchr> for details on when the REPLACEMENT CHARACTER is
returned.

Code points above the platform's C<IV_MAX> will raise a deprecation warning,
unless those are turned off.

	UV	utf8_to_uvchr_buf(const U8 *s, const U8 *send,
		                  STRLEN *retlen)

=for hackers
Found in file utf8.c

=item utf8_to_uvuni_buf
X<utf8_to_uvuni_buf>


DEPRECATED!  It is planned to remove this function from a
future release of Perl.  Do not use it for new code; remove it from
existing code.


Only in very rare circumstances should code need to be dealing in Unicode
(as opposed to native) code points.  In those few cases, use
C<L<NATIVE_TO_UNI(utf8_to_uvchr_buf(...))|/utf8_to_uvchr_buf>> instead.

Returns the Unicode (not-native) code point of the first character in the
string C<s> which
is assumed to be in UTF-8 encoding; C<send> points to 1 beyond the end of C<s>.
C<*retlen> will be set to the length, in bytes, of that character.

If C<s> does not point to a well-formed UTF-8 character and UTF8 warnings are
enabled, zero is returned and C<*retlen> is set (if C<retlen> isn't
NULL) to -1.  If those warnings are off, the computed value if well-defined (or
the Unicode REPLACEMENT CHARACTER, if not) is silently returned, and C<*retlen>
is set (if C<retlen> isn't NULL) so that (S<C<s> + C<*retlen>>) is the
next possible position in C<s> that could begin a non-malformed character.
See L</utf8n_to_uvchr> for details on when the REPLACEMENT CHARACTER is returned.

Code points above the platform's C<IV_MAX> will raise a deprecation warning,
unless those are turned off.

	UV	utf8_to_uvuni_buf(const U8 *s, const U8 *send,
		                  STRLEN *retlen)

=for hackers
Found in file utf8.c

=item UVCHR_IS_INVARIANT
X<UVCHR_IS_INVARIANT>

Evaluates to 1 if the representation of code point C<cp> is the same whether or
not it is encoded in UTF-8; otherwise evaluates to 0.  UTF-8 invariant
characters can be copied as-is when converting to/from UTF-8, saving time.
C<cp> is Unicode if above 255; otherwise is platform-native.

	bool	UVCHR_IS_INVARIANT(UV cp)

=for hackers
Found in file utf8.h

=item UVCHR_SKIP
X<UVCHR_SKIP>

Returns the number of bytes required to represent the code point C<cp> when
encoded as UTF-8.  C<cp> is a native (ASCII or EBCDIC) code point if less than
256; a Unicode code point otherwise.
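
For the Unicode range on an ASCII platform, the byte count follows directly
from the size of the code point.  The standalone C sketch below illustrates
this; C<sketch_uvchr_skip> is a hypothetical name, and the sketch deliberately
stops at 0x10FFFF instead of covering Perl's extended UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: bytes needed to encode cp as standard UTF-8.
 * Assumption: ASCII platform and cp no greater than 0x10FFFF. */
size_t sketch_uvchr_skip(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80)    return 1;   /* 7 bits fit in one byte */
    if (cp < 0x800)   return 2;   /* 11 bits */
    if (cp < 0x10000) return 3;   /* 16 bits */
    return 4;                     /* up to 21 bits */
}
```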

	STRLEN	UVCHR_SKIP(UV cp)

=for hackers
Found in file utf8.h

=item uvchr_to_utf8
X<uvchr_to_utf8>

Adds the UTF-8 representation of the native code point C<uv> to the end
of the string C<d>; C<d> should have at least C<UVCHR_SKIP(uv)+1> (up to
C<UTF8_MAXBYTES+1>) free bytes available.  The return value is the pointer to
the byte after the end of the new character.  In other words,

    d = uvchr_to_utf8(d, uv);

is the recommended wide native character-aware way of saying

    *(d++) = uv;

This function accepts any UV as input, but very high code points (above
C<IV_MAX> on the platform) will raise a deprecation warning.  This is
typically 0x7FFF_FFFF in a 32-bit word.

It is possible to forbid or warn on non-Unicode code points, or those that may
be problematic by using L</uvchr_to_utf8_flags>.
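
The append-and-advance pattern can be sketched in standalone C.  This is a
minimal illustration, not Perl's implementation: C<sketch_append_utf8> is a
hypothetical name, an ASCII platform is assumed, and the sketch handles only
code points up to 0x10FFFF, so four free bytes are always enough for it.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical sketch: write the UTF-8 encoding of cp at d and return the
 * pointer just past the last byte written.  The caller must provide at
 * least four free bytes at d. */
unsigned char *sketch_append_utf8(unsigned char *d, uint32_t cp)
{
    assert(d != NULL && cp <= 0x10FFFF);
    if (cp < 0x80) {
        *d++ = (unsigned char)cp;
    } else if (cp < 0x800) {
        *d++ = (unsigned char)(0xC0 | (cp >> 6));
        *d++ = (unsigned char)(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        *d++ = (unsigned char)(0xE0 | (cp >> 12));
        *d++ = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        *d++ = (unsigned char)(0x80 | (cp & 0x3F));
    } else {
        *d++ = (unsigned char)(0xF0 | (cp >> 18));
        *d++ = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        *d++ = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        *d++ = (unsigned char)(0x80 | (cp & 0x3F));
    }
    return d;
}
```

Returning the advanced pointer is what lets successive calls be chained, in
the same way as the C<d = ...> idiom shown above.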

	U8*	uvchr_to_utf8(U8 *d, UV uv)

=for hackers
Found in file utf8.c

=item uvchr_to_utf8_flags
X<uvchr_to_utf8_flags>

Adds the UTF-8 representation of the native code point C<uv> to the end
of the string C<d>; C<d> should have at least C<UVCHR_SKIP(uv)+1> (up to
C<UTF8_MAXBYTES+1>) free bytes available.  The return value is the pointer to
the byte after the end of the new character.  In other words,

    d = uvchr_to_utf8_flags(d, uv, flags);

or, in most cases,

    d = uvchr_to_utf8_flags(d, uv, 0);

This is the Unicode-aware way of saying

    *(d++) = uv;

If C<flags> is 0, this function accepts any UV as input, but very high code
points (above C<IV_MAX> for the platform) will raise a deprecation warning.
C<IV_MAX> is typically 0x7FFF_FFFF in a 32-bit word.

Specifying C<flags> can further restrict what is allowed and not warned on, as
follows:

If C<uv> is a Unicode surrogate code point and C<UNICODE_WARN_SURROGATE> is set,
the function will raise a warning, provided UTF8 warnings are enabled.  If
instead C<UNICODE_DISALLOW_SURROGATE> is set, the function will fail and return
NULL.  If both flags are set, the function will both warn and return NULL.

Similarly, the C<UNICODE_WARN_NONCHAR> and C<UNICODE_DISALLOW_NONCHAR> flags
affect how the function handles a Unicode non-character.

And likewise, the C<UNICODE_WARN_SUPER> and C<UNICODE_DISALLOW_SUPER> flags
affect the handling of code points that are above the Unicode maximum of
0x10FFFF.  Languages other than Perl may not be able to accept files that
contain these.

The flag C<UNICODE_WARN_ILLEGAL_INTERCHANGE> selects all three of
the above WARN flags; and C<UNICODE_DISALLOW_ILLEGAL_INTERCHANGE> selects all
three DISALLOW flags.  C<UNICODE_DISALLOW_ILLEGAL_INTERCHANGE> restricts the
allowed inputs to the strict UTF-8 traditionally defined by Unicode.
Similarly, C<UNICODE_WARN_ILLEGAL_C9_INTERCHANGE> and
C<UNICODE_DISALLOW_ILLEGAL_C9_INTERCHANGE> are shortcuts to select the
above-Unicode and surrogate flags, but not the non-character ones, as
defined in
L<Unicode Corrigendum #9|http://www.unicode.org/versions/corrigendum9.html>.
See L<perlunicode/Noncharacter code points>.

Code points above 0x7FFF_FFFF (2**31 - 1) were never specified in any standard,
so using them is more problematic than other above-Unicode code points.  Perl
invented an extension to UTF-8 to represent the ones above 2**36-1, so it is
likely that non-Perl languages will not be able to read files containing
these that were written by the perl interpreter; nor would Perl understand files
written by something that uses a different extension.  For these reasons, there
is a separate set of flags that can warn and/or disallow these extremely high
code points, even if other above-Unicode ones are accepted.  These are the
C<UNICODE_WARN_ABOVE_31_BIT> and C<UNICODE_DISALLOW_ABOVE_31_BIT> flags.  These
are entirely independent from the deprecation warning for code points above
C<IV_MAX>.  On 32-bit machines, it will eventually be forbidden to have any
code point that needs more than 31 bits to represent.  When that happens,
effectively the C<UNICODE_DISALLOW_ABOVE_31_BIT> flag will always be set on
32-bit machines.  (Of course C<UNICODE_DISALLOW_SUPER> will treat all
above-Unicode code points, including these, as malformations; and
C<UNICODE_WARN_SUPER> warns on these.)

On EBCDIC platforms starting in Perl v5.24, the Perl extension for representing
extremely high code points kicks in at 0x3FFF_FFFF (2**30 - 1), which is lower
than on ASCII.  Prior to that, code points 2**31 and higher were simply
unrepresentable, and a different, incompatible method was used to represent
code points between 2**30 and 2**31 - 1.  The flags C<UNICODE_WARN_ABOVE_31_BIT>
and C<UNICODE_DISALLOW_ABOVE_31_BIT> have the same function as on ASCII
platforms, warning and disallowing 2**31 and higher.

	U8*	uvchr_to_utf8_flags(U8 *d, UV uv, UV flags)

=for hackers
Found in file utf8.c
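
How the three DISALLOW classes combine can be sketched as follows.  The
C<DEMO_*> flag names and values and the function C<code_point_allowed>
are hypothetical; they only mirror the grouping described above and are
not the values of Perl's real C<UNICODE_DISALLOW_*> constants.  Warning
behaviour and the above-31-bit flags are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the values of Perl's real UNICODE_DISALLOW_*
 * constants are not assumed here. */
enum {
    DEMO_DISALLOW_SURROGATE = 1 << 0,
    DEMO_DISALLOW_NONCHAR   = 1 << 1,
    DEMO_DISALLOW_SUPER     = 1 << 2,
    DEMO_DISALLOW_ILLEGAL_INTERCHANGE =
        DEMO_DISALLOW_SURROGATE | DEMO_DISALLOW_NONCHAR | DEMO_DISALLOW_SUPER,
    DEMO_DISALLOW_ILLEGAL_C9_INTERCHANGE =
        DEMO_DISALLOW_SURROGATE | DEMO_DISALLOW_SUPER
};

static bool is_surrogate(uint64_t cp) { return cp >= 0xD800 && cp <= 0xDFFF; }
static bool is_super(uint64_t cp)     { return cp > 0x10FFFF; }

/* U+FDD0..U+FDEF plus the last two code points of every plane. */
static bool is_nonchar(uint64_t cp)
{
    return (cp >= 0xFDD0 && cp <= 0xFDEF)
        || (cp <= 0x10FFFF && (cp & 0xFFFE) == 0xFFFE);
}

/* Returns true if cp may be encoded under the given DEMO_DISALLOW_* flags. */
static bool code_point_allowed(uint64_t cp, unsigned flags)
{
    if ((flags & DEMO_DISALLOW_SURROGATE) && is_surrogate(cp)) return false;
    if ((flags & DEMO_DISALLOW_NONCHAR) && is_nonchar(cp))     return false;
    if ((flags & DEMO_DISALLOW_SUPER) && is_super(cp))         return false;
    return true;
}
```

With no flags set every code point passes, which corresponds to the
C<flags> of 0 case described above; the C9 grouping lets non-characters
through while still rejecting surrogates and above-Unicode code points.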

=item uvoffuni_to_utf8_flags
X<uvoffuni_to_utf8_flags>

THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED CIRCUMSTANCES.
Instead, B<Almost all code should use L</uvchr_to_utf8> or
L</uvchr_to_utf8_flags>>.

This function is like them, but the input is a strict Unicode
(as opposed to native) code point.  Only in very rare circumstances should code
not be using the native code point.

For details, see the description for L</uvchr_to_utf8_flags>.

	U8*	uvoffuni_to_utf8_flags(U8 *d, UV uv,
		                       const UV flags)

=for hackers
Found in file utf8.c

=item uvuni_to_utf8_flags
X<uvuni_to_utf8_flags>

You almost certainly want to use L</uvchr_to_utf8> or
L</uvchr_to_utf8_flags> instead.

This function is a deprecated synonym for L</uvoffuni_to_utf8_flags>,
which itself, while not deprecated, should be used only in isolated
circumstances.  These functions were useful for code that wanted to handle
both EBCDIC and ASCII platforms with Unicode properties, but starting in Perl
v5.20, the distinctions between the platforms have mostly been made invisible
to most code, so this function is quite unlikely to be what you want.

	U8*	uvuni_to_utf8_flags(U8 *d, UV uv, UV flags)

=for hackers
Found in file utf8.c

=item valid_utf8_to_uvchr
X<valid_utf8_to_uvchr>

Like C<L</utf8_to_uvchr_buf>>, but should only be called when it is known that
the next character in the input UTF-8 string C<s> is well-formed (I<e.g.>,
it passes C<L</isUTF8_CHAR>>).  Surrogates, non-character code points, and
non-Unicode code points are allowed.

	UV	valid_utf8_to_uvchr(const U8 *s, STRLEN *retlen)

=for hackers
Found in file inline.h
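
What "known to be well-formed" buys can be seen in a decoder that does
no checking at all.  The following is a sketch under stated
assumptions, not Perl's code: C<decode_valid_utf8> is a hypothetical
name, an ASCII platform is assumed, and sequences are limited to four
bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder for ASCII platforms.  It assumes the bytes at s
 * are already known to be well-formed UTF-8 of at most four bytes and
 * performs no validation.  Stores the sequence length in *retlen if
 * retlen is non-NULL. */
static uint32_t decode_valid_utf8(const uint8_t *s, size_t *retlen)
{
    size_t len;
    uint32_t cp;

    if (s[0] < 0x80)      { len = 1; cp = s[0]; }
    else if (s[0] < 0xE0) { len = 2; cp = s[0] & 0x1F; }
    else if (s[0] < 0xF0) { len = 3; cp = s[0] & 0x0F; }
    else                  { len = 4; cp = s[0] & 0x07; }

    /* Each continuation byte contributes its low six bits. */
    for (size_t i = 1; i < len; i++)
        cp = (cp << 6) | (s[i] & 0x3F);

    if (retlen)
        *retlen = len;
    return cp;
}
```

Passing malformed input to a routine like this reads past the intended
bytes or yields a meaningless value, which is why the caller must
validate first.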


=back

=head1 Variables created by C<xsubpp> and C<xsubpp> internal functions

=over 8

=item newXSproto
X<newXSproto>

Used by C<xsubpp> to hook up XSUBs as Perl subs.  Adds Perl prototypes to
the subs.

=for hackers
Found in file XSUB.h

=item XS_APIVERSION_BOOTCHECK
X<XS_APIVERSION_BOOTCHECK>

Macro to verify that the perl api version an XS module has been compiled against
matches the api version of the perl interpreter it's being loaded into.

		XS_APIVERSION_BOOTCHECK;

=for hackers
Found in file XSUB.h

=item XS_VERSION
X<XS_VERSION>

The version identifier for an XS module.  This is usually
handled automatically by C<ExtUtils::MakeMaker>.  See
C<L</XS_VERSION_BOOTCHECK>>.

=for hackers
Found in file XSUB.h

=item XS_VERSION_BOOTCHECK
X<XS_VERSION_BOOTCHECK>

Macro to verify that a PM module's C<$VERSION> variable matches the XS
module's C<XS_VERSION> variable.  This is usually handled automatically by
C<xsubpp>.  See L<perlxs/"The VERSIONCHECK: Keyword">.

		XS_VERSION_BOOTCHECK;

=for hackers
Found in file XSUB.h


=back

=head1 Warning and Dieing

=over 8

=item ckWARN
X<ckWARN>

Returns a boolean as to whether or not warnings are enabled for the warning
category C<w>.  If the category is by default enabled even if not within the
scope of S<C<use warnings>>, instead use the L</ckWARN_d> macro.

	bool	ckWARN(U32 w)

=for hackers
Found in file warnings.h

=item ckWARN2
X<ckWARN2>

Like C<L</ckWARN>>, but takes two warnings categories as input, and returns
TRUE if either is enabled.  If either category is by default enabled even if
not within the scope of S<C<use warnings>>, instead use the L</ckWARN2_d>
macro.  The categories must be completely independent; one may not be
subclassed from the other.

	bool	ckWARN2(U32 w1, U32 w2)

=for hackers
Found in file warnings.h

=item ckWARN3
X<ckWARN3>

Like C<L</ckWARN2>>, but takes three warnings categories as input, and returns
TRUE if any is enabled.  If any of the categories is by default enabled even
if not within the scope of S<C<use warnings>>, instead use the L</ckWARN3_d>
macro.  The categories must be completely independent; one may not be
subclassed from any other.

	bool	ckWARN3(U32 w1, U32 w2, U32 w3)

=for hackers
Found in file warnings.h

=item ckWARN4
X<ckWARN4>

Like C<L</ckWARN3>>, but takes four warnings categories as input, and returns
TRUE if any is enabled.  If any of the categories is by default enabled even
if not within the scope of S<C<use warnings>>, instead use the L</ckWARN4_d>
macro.  The categories must be completely independent; one may not be
subclassed from any other.

	bool	ckWARN4(U32 w1, U32 w2, U32 w3, U32 w4)

=for hackers
Found in file warnings.h

=item ckWARN_d
X<ckWARN_d>

Like C<L</ckWARN>>, but for use if and only if the warning category is by
default enabled even if not within the scope of S<C<use warnings>>.

	bool	ckWARN_d(U32 w)

=for hackers
Found in file warnings.h

=item ckWARN2_d
X<ckWARN2_d>

Like C<L</ckWARN2>>, but for use if and only if either warning category is by
default enabled even if not within the scope of S<C<use warnings>>.

	bool	ckWARN2_d(U32 w1, U32 w2)

=for hackers
Found in file warnings.h

=item ckWARN3_d
X<ckWARN3_d>

Like C<L</ckWARN3>>, but for use if and only if any of the warning categories
is by default enabled even if not within the scope of S<C<use warnings>>.

	bool	ckWARN3_d(U32 w1, U32 w2, U32 w3)

=for hackers
Found in file warnings.h

=item ckWARN4_d
X<ckWARN4_d>

Like C<L</ckWARN4>>, but for use if and only if any of the warning categories
is by default enabled even if not within the scope of S<C<use warnings>>.

	bool	ckWARN4_d(U32 w1, U32 w2, U32 w3, U32 w4)

=for hackers
Found in file warnings.h

=item croak
X<croak>

This is an XS interface to Perl's C<die> function.

Takes a sprintf-style format pattern and argument list.  These are used to
generate a string message.  If the message does not end with a newline,
then it will be extended with some indication of the current location
in the code, as described for L</mess_sv>.

The error message will be used as an exception, by default
returning control to the nearest enclosing C<eval>, but subject to
modification by a C<$SIG{__DIE__}> handler.  In any case, the C<croak>
function never returns normally.

For historical reasons, if C<pat> is null then the contents of C<ERRSV>
(C<$@>) will be used as an error message or object instead of building an
error message from arguments.  If you want to throw a non-string object,
or build an error message in an SV yourself, it is preferable to use
the L</croak_sv> function, which does not involve clobbering C<ERRSV>.

	void	croak(const char *pat, ...)

=for hackers
Found in file util.c
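
The newline rule described above can be sketched in isolation.  This is
not L</mess_sv>: the helper C<format_exception> is hypothetical, and the
suffix " at FILE line N." is a simplification of the location text Perl
actually appends, which may carry more detail.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not Perl's mess_sv: copies msg into buf and, if
 * msg does not end with a newline, appends " at FILE line N.\n".
 * Returns the value snprintf reports. */
static int format_exception(char *buf, size_t size, const char *msg,
                            const char *file, int line)
{
    size_t n = strlen(msg);

    if (n > 0 && msg[n - 1] == '\n')
        return snprintf(buf, size, "%s", msg);
    return snprintf(buf, size, "%s at %s line %d.\n", msg, file, line);
}
```

In practice this means an XS author who wants the message reported
verbatim ends the format pattern with a newline.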

=item croak_no_modify
X<croak_no_modify>

Exactly equivalent to C<Perl_croak(aTHX_ "%s", PL_no_modify)>, but generates
terser object code than using C<Perl_croak>.  Less code used on exception code
paths reduces CPU cache pressure.

	void	croak_no_modify()

=for hackers
Found in file util.c

=item croak_sv
X<croak_sv>

This is an XS interface to Perl's C<die> function.

C<baseex> is the error message or object.  If it is a reference, it
will be used as-is.  Otherwise it is used as a string, and if it does
not end with a newline then it will be extended with some indication of
the current location in the code, as described for L</mess_sv>.

The error message or object will be used as an exception, by default
returning control to the nearest enclosing C<eval>, but subject to
modification by a C<$SIG{__DIE__}> handler.  In any case, the C<croak_sv>
function never returns normally.

To die with a simple string message, the L</croak> function may be
more convenient.

	void	croak_sv(SV *baseex)

=for hackers
Found in file util.c

=item die
X<die>

Behaves the same as L</croak>, except for the return type.
It should be used only where the C<OP *> return type is required.
The function never actually returns.

	OP *	die(const char *pat, ...)

=for hackers
Found in file util.c

=item die_sv
X<die_sv>

Behaves the same as L</croak_sv>, except for the return type.
It should be used only where the C<OP *> return type is required.
The function never actually returns.

	OP *	die_sv(SV *baseex)

=for hackers
Found in file util.c

=item vcroak
X<vcroak>

This is an XS interface to Perl's C<die> function.

C<pat> and C<args> are a sprintf-style format pattern and encapsulated
argument list.  These are used to generate a string message.  If the
message does not end with a newline, then it will be extended with
some indication of the current location in the code, as described for
L</mess_sv>.

The error message will be used as an exception, by default
returning control to the nearest enclosing C<eval>, but subject to
modification by a C<$SIG{__DIE__}> handler.  In any case, the C<vcroak>
function never returns normally.

For historical reasons, if C<pat> is null then the contents of C<ERRSV>
(C<$@>) will be used as an error message or object instead of building an
error message from arguments.  If you want to throw a non-string object,
or build an error message in an SV yourself, it is preferable to use
the L</croak_sv> function, which does not involve clobbering C<ERRSV>.

	void	vcroak(const char *pat, va_list *args)

=for hackers
Found in file util.c

=item vwarn
X<vwarn>

This is an XS interface to Perl's C<warn> function.

C<pat> and C<args> are a sprintf-style format pattern and encapsulated
argument list.  These are used to generate a string message.  If the
message does not end with a newline, then it will be extended with
some indication of the current location in the code, as described for
L</mess_sv>.

The error message or object will by default be written to standard error,
but this is subject to modification by a C<$SIG{__WARN__}> handler.

Unlike with L</vcroak>, C<pat> is not permitted to be null.

	void	vwarn(const char *pat, va_list *args)

=for hackers
Found in file util.c

=item warn
X<warn>

This is an XS interface to Perl's C<warn> function.

Takes a sprintf-style format pattern and argument list.  These are used to
generate a string message.  If the message does not end with a newline,
then it will be extended with some indication of the current location
in the code, as described for L</mess_sv>.

The error message or object will by default be written to standard error,
but this is subject to modification by a C<$SIG{__WARN__}> handler.

Unlike with L</croak>, C<pat> is not permitted to be null.

	void	warn(const char *pat, ...)

=for hackers
Found in file util.c

=item warn_sv
X<warn_sv>

This is an XS interface to Perl's C<warn> function.

C<baseex> is the error message or object.  If it is a reference, it
will be used as-is.  Otherwise it is used as a string, and if it does
not end with a newline then it will be extended with some indication of
the current location in the code, as described for L</mess_sv>.

The error message or object will by default be written to standard error,
but this is subject to modification by a C<$SIG{__WARN__}> handler.

To warn with a simple string message, the L</warn> function may be
more convenient.

	void	warn_sv(SV *baseex)

=for hackers
Found in file util.c


=back

=head1 Undocumented functions

The following functions have been flagged as part of the public API,
but are currently undocumented.  Use them at your own risk, as the
interfaces are subject to change.  Functions that are not listed in this
document are not intended for public use, and should NOT be used under any
circumstances.

If you feel you need to use one of these functions, first send email to
L<perl5-porters@perl.org|mailto:perl5-porters@perl.org>.  It may be
that there is a good reason for the function not being documented, and it
should be removed from this list; or it may just be that no one has gotten
around to documenting it.  In the latter case, you will be asked to submit a
patch to document the function.  Once your patch is accepted, it will indicate
that the interface is stable (unless it is explicitly marked otherwise) and
usable by you.

=over

=item GetVars
X<GetVars>

=item Gv_AMupdate
X<Gv_AMupdate>

=item PerlIO_clearerr
X<PerlIO_clearerr>

=item PerlIO_close
X<PerlIO_close>

=item PerlIO_context_layers
X<PerlIO_context_layers>

=item PerlIO_eof
X<PerlIO_eof>

=item PerlIO_error
X<PerlIO_error>

=item PerlIO_fileno
X<PerlIO_fileno>

=item PerlIO_fill
X<PerlIO_fill>

=item PerlIO_flush
X<PerlIO_flush>

=item PerlIO_get_base
X<PerlIO_get_base>

=item PerlIO_get_bufsiz
X<PerlIO_get_bufsiz>

=item PerlIO_get_cnt
X<PerlIO_get_cnt>

=item PerlIO_get_ptr
X<PerlIO_get_ptr>

=item PerlIO_read
X<PerlIO_read>

=item PerlIO_seek
X<PerlIO_seek>

=item PerlIO_set_cnt
X<PerlIO_set_cnt>

=item PerlIO_set_ptrcnt
X<PerlIO_set_ptrcnt>

=item PerlIO_setlinebuf
X<PerlIO_setlinebuf>

=item PerlIO_stderr
X<PerlIO_stderr>

=item PerlIO_stdin
X<PerlIO_stdin>

=item PerlIO_stdout
X<PerlIO_stdout>

=item PerlIO_tell
X<PerlIO_tell>

=item PerlIO_unread
X<PerlIO_unread>

=item PerlIO_write
X<PerlIO_write>

=item amagic_call
X<amagic_call>

=item amagic_deref_call
X<amagic_deref_call>

=item any_dup
X<any_dup>

=item atfork_lock
X<atfork_lock>

=item atfork_unlock
X<atfork_unlock>

=item av_arylen_p
X<av_arylen_p>

=item av_iter_p
X<av_iter_p>

=item block_gimme
X<block_gimme>

=item call_atexit
X<call_atexit>

=item call_list
X<call_list>

=item calloc
X<calloc>

=item cast_i32
X<cast_i32>

=item cast_iv
X<cast_iv>

=item cast_ulong
X<cast_ulong>

=item cast_uv
X<cast_uv>

=item ck_warner
X<ck_warner>

=item ck_warner_d
X<ck_warner_d>

=item ckwarn
X<ckwarn>

=item ckwarn_d
X<ckwarn_d>

=item clear_defarray
X<clear_defarray>

=item clone_params_del
X<clone_params_del>

=item clone_params_new
X<clone_params_new>

=item croak_memory_wrap
X<croak_memory_wrap>

=item croak_nocontext
X<croak_nocontext>

=item csighandler
X<csighandler>

=item cx_dump
X<cx_dump>

=item cx_dup
X<cx_dup>

=item cxinc
X<cxinc>

=item deb
X<deb>

=item deb_nocontext
X<deb_nocontext>

=item debop
X<debop>

=item debprofdump
X<debprofdump>

=item debstack
X<debstack>

=item debstackptrs
X<debstackptrs>

=item delimcpy
X<delimcpy>

=item despatch_signals
X<despatch_signals>

=item die_nocontext
X<die_nocontext>

=item dirp_dup
X<dirp_dup>

=item do_aspawn
X<do_aspawn>

=item do_binmode
X<do_binmode>

=item do_close
X<do_close>

=item do_gv_dump
X<do_gv_dump>

=item do_gvgv_dump
X<do_gvgv_dump>

=item do_hv_dump
X<do_hv_dump>

=item do_join
X<do_join>

=item do_magic_dump
X<do_magic_dump>

=item do_op_dump
X<do_op_dump>

=item do_open
X<do_open>

=item do_open9
X<do_open9>

=item do_openn
X<do_openn>

=item do_pmop_dump
X<do_pmop_dump>

=item do_spawn
X<do_spawn>

=item do_spawn_nowait
X<do_spawn_nowait>

=item do_sprintf
X<do_sprintf>

=item do_sv_dump
X<do_sv_dump>

=item doing_taint
X<doing_taint>

=item doref
X<doref>

=item dounwind
X<dounwind>

=item dowantarray
X<dowantarray>

=item dump_eval
X<dump_eval>

=item dump_form
X<dump_form>

=item dump_indent
X<dump_indent>

=item dump_mstats
X<dump_mstats>

=item dump_sub
X<dump_sub>

=item dump_vindent
X<dump_vindent>

=item filter_add
X<filter_add>

=item filter_del
X<filter_del>

=item filter_read
X<filter_read>

=item foldEQ_latin1
X<foldEQ_latin1>

=item form_nocontext
X<form_nocontext>

=item fp_dup
X<fp_dup>

=item fprintf_nocontext
X<fprintf_nocontext>

=item free_global_struct
X<free_global_struct>

=item free_tmps
X<free_tmps>

=item get_context
X<get_context>

=item get_mstats
X<get_mstats>

=item get_op_descs
X<get_op_descs>

=item get_op_names
X<get_op_names>

=item get_ppaddr
X<get_ppaddr>

=item get_vtbl
X<get_vtbl>

=item gp_dup
X<gp_dup>

=item gp_free
X<gp_free>

=item gp_ref
X<gp_ref>

=item gv_AVadd
X<gv_AVadd>

=item gv_HVadd
X<gv_HVadd>

=item gv_IOadd
X<gv_IOadd>

=item gv_SVadd
X<gv_SVadd>

=item gv_add_by_type
X<gv_add_by_type>

=item gv_autoload4
X<gv_autoload4>

=item gv_autoload_pv
X<gv_autoload_pv>

=item gv_autoload_pvn
X<gv_autoload_pvn>

=item gv_autoload_sv
X<gv_autoload_sv>

=item gv_check
X<gv_check>

=item gv_dump
X<gv_dump>

=item gv_efullname
X<gv_efullname>

=item gv_efullname3
X<gv_efullname3>

=item gv_efullname4
X<gv_efullname4>

=item gv_fetchfile
X<gv_fetchfile>

=item gv_fetchfile_flags
X<gv_fetchfile_flags>

=item gv_fetchpv
X<gv_fetchpv>

=item gv_fetchpvn_flags
X<gv_fetchpvn_flags>

=item gv_fetchsv
X<gv_fetchsv>

=item gv_fullname
X<gv_fullname>

=item gv_fullname3
X<gv_fullname3>

=item gv_fullname4
X<gv_fullname4>

=item gv_handler
X<gv_handler>

=item gv_name_set
X<gv_name_set>

=item he_dup
X<he_dup>

=item hek_dup
X<hek_dup>

=item hv_common
X<hv_common>

=item hv_common_key_len
X<hv_common_key_len>

=item hv_delayfree_ent
X<hv_delayfree_ent>

=item hv_eiter_p
X<hv_eiter_p>

=item hv_eiter_set
X<hv_eiter_set>

=item hv_free_ent
X<hv_free_ent>

=item hv_ksplit
X<hv_ksplit>

=item hv_name_set
X<hv_name_set>

=item hv_placeholders_get
X<hv_placeholders_get>

=item hv_placeholders_set
X<hv_placeholders_set>

=item hv_rand_set
X<hv_rand_set>

=item hv_riter_p
X<hv_riter_p>

=item hv_riter_set
X<hv_riter_set>

=item ibcmp_utf8
X<ibcmp_utf8>

=item init_global_struct
X<init_global_struct>

=item init_stacks
X<init_stacks>

=item init_tm
X<init_tm>

=item instr
X<instr>

=item is_lvalue_sub
X<is_lvalue_sub>

=item leave_scope
X<leave_scope>

=item load_module_nocontext
X<load_module_nocontext>

=item magic_dump
X<magic_dump>

=item malloc
X<malloc>

=item markstack_grow
X<markstack_grow>

=item mess_nocontext
X<mess_nocontext>

=item mfree
X<mfree>

=item mg_dup
X<mg_dup>

=item mg_size
X<mg_size>

=item mini_mktime
X<mini_mktime>

=item moreswitches
X<moreswitches>

=item mro_get_from_name
X<mro_get_from_name>

=item mro_get_private_data
X<mro_get_private_data>

=item mro_set_mro
X<mro_set_mro>

=item mro_set_private_data
X<mro_set_private_data>

=item my_atof
X<my_atof>

=item my_atof2
X<my_atof2>

=item my_bcopy
X<my_bcopy>

=item my_bzero
X<my_bzero>

=item my_chsize
X<my_chsize>

=item my_cxt_index
X<my_cxt_index>

=item my_cxt_init
X<my_cxt_init>

=item my_dirfd
X<my_dirfd>

=item my_exit
X<my_exit>

=item my_failure_exit
X<my_failure_exit>

=item my_fflush_all
X<my_fflush_all>

=item my_fork
X<my_fork>

=item my_lstat
X<my_lstat>

=item my_memcmp
X<my_memcmp>

=item my_memset
X<my_memset>

=item my_pclose
X<my_pclose>

=item my_popen
X<my_popen>

=item my_popen_list
X<my_popen_list>

=item my_setenv
X<my_setenv>

=item my_socketpair
X<my_socketpair>

=item my_stat
X<my_stat>

=item my_strftime
X<my_strftime>

=item newANONATTRSUB
X<newANONATTRSUB>

=item newANONHASH
X<newANONHASH>

=item newANONLIST
X<newANONLIST>

=item newANONSUB
X<newANONSUB>

=item newATTRSUB
X<newATTRSUB>

=item newAVREF
X<newAVREF>

=item newCVREF
X<newCVREF>

=item newFORM
X<newFORM>

=item newGVREF
X<newGVREF>

=item newGVgen
X<newGVgen>

=item newGVgen_flags
X<newGVgen_flags>

=item newHVREF
X<newHVREF>

=item newHVhv
X<newHVhv>

=item newIO
X<newIO>

=item newMYSUB
X<newMYSUB>

=item newPROG
X<newPROG>

=item newRV
X<newRV>

=item newSUB
X<newSUB>

=item newSVREF
X<newSVREF>

=item newSVpvf_nocontext
X<newSVpvf_nocontext>

=item new_stackinfo
X<new_stackinfo>

=item op_refcnt_lock
X<op_refcnt_lock>

=item op_refcnt_unlock
X<op_refcnt_unlock>

=item parser_dup
X<parser_dup>

=item perl_alloc_using
X<perl_alloc_using>

=item perl_clone_using
X<perl_clone_using>

=item pmop_dump
X<pmop_dump>

=item pop_scope
X<pop_scope>

=item pregcomp
X<pregcomp>

=item pregexec
X<pregexec>

=item pregfree
X<pregfree>

=item pregfree2
X<pregfree2>

=item printf_nocontext
X<printf_nocontext>

=item ptr_table_fetch
X<ptr_table_fetch>

=item ptr_table_free
X<ptr_table_free>

=item ptr_table_new
X<ptr_table_new>

=item ptr_table_split
X<ptr_table_split>

=item ptr_table_store
X<ptr_table_store>

=item push_scope
X<push_scope>

=item re_compile
X<re_compile>

=item re_dup_guts
X<re_dup_guts>

=item re_intuit_start
X<re_intuit_start>

=item re_intuit_string
X<re_intuit_string>

=item realloc
X<realloc>

=item reentrant_free
X<reentrant_free>

=item reentrant_init
X<reentrant_init>

=item reentrant_retry
X<reentrant_retry>

=item reentrant_size
X<reentrant_size>

=item ref
X<ref>

=item reg_named_buff_all
X<reg_named_buff_all>

=item reg_named_buff_exists
X<reg_named_buff_exists>

=item reg_named_buff_fetch
X<reg_named_buff_fetch>

=item reg_named_buff_firstkey
X<reg_named_buff_firstkey>

=item reg_named_buff_nextkey
X<reg_named_buff_nextkey>

=item reg_named_buff_scalar
X<reg_named_buff_scalar>

=item regdump
X<regdump>

=item regdupe_internal
X<regdupe_internal>

=item regexec_flags
X<regexec_flags>

=item regfree_internal
X<regfree_internal>

=item reginitcolors
X<reginitcolors>

=item regnext
X<regnext>

=item repeatcpy
X<repeatcpy>

=item rsignal
X<rsignal>

=item rsignal_state
X<rsignal_state>

=item runops_debug
X<runops_debug>

=item runops_standard
X<runops_standard>

=item rvpv_dup
X<rvpv_dup>

=item safesyscalloc
X<safesyscalloc>

=item safesysfree
X<safesysfree>

=item safesysmalloc
X<safesysmalloc>

=item safesysrealloc
X<safesysrealloc>

=item save_I16
X<save_I16>

=item save_I32
X<save_I32>

=item save_I8
X<save_I8>

=item save_adelete
X<save_adelete>

=item save_aelem
X<save_aelem>

=item save_aelem_flags
X<save_aelem_flags>

=item save_alloc
X<save_alloc>

=item save_aptr
X<save_aptr>

=item save_ary
X<save_ary>

=item save_bool
X<save_bool>

=item save_clearsv
X<save_clearsv>

=item save_delete
X<save_delete>

=item save_destructor
X<save_destructor>

=item save_destructor_x
X<save_destructor_x>

=item save_freeop
X<save_freeop>

=item save_freepv
X<save_freepv>

=item save_freesv
X<save_freesv>

=item save_generic_pvref
X<save_generic_pvref>

=item save_generic_svref
X<save_generic_svref>

=item save_hash
X<save_hash>

=item save_hdelete
X<save_hdelete>

=item save_helem
X<save_helem>

=item save_helem_flags
X<save_helem_flags>

=item save_hints
X<save_hints>

=item save_hptr
X<save_hptr>

=item save_int
X<save_int>

=item save_item
X<save_item>

=item save_iv
X<save_iv>

=item save_list
X<save_list>

=item save_long
X<save_long>

=item save_mortalizesv
X<save_mortalizesv>

=item save_nogv
X<save_nogv>

=item save_op
X<save_op>

=item save_padsv_and_mortalize
X<save_padsv_and_mortalize>

=item save_pptr
X<save_pptr>

=item save_pushi32ptr
X<save_pushi32ptr>

=item save_pushptr
X<save_pushptr>

=item save_pushptrptr
X<save_pushptrptr>

=item save_re_context
X<save_re_context>

=item save_scalar
X<save_scalar>

=item save_set_svflags
X<save_set_svflags>

=item save_shared_pvref
X<save_shared_pvref>

=item save_sptr
X<save_sptr>

=item save_svref
X<save_svref>

=item save_vptr
X<save_vptr>

=item savestack_grow
X<savestack_grow>

=item savestack_grow_cnt
X<savestack_grow_cnt>

=item scan_num
X<scan_num>

=item scan_vstring
X<scan_vstring>

=item seed
X<seed>

=item set_context
X<set_context>

=item set_numeric_local
X<set_numeric_local>

=item set_numeric_radix
X<set_numeric_radix>

=item set_numeric_standard
X<set_numeric_standard>

=item share_hek
X<share_hek>

=item si_dup
X<si_dup>

=item ss_dup
X<ss_dup>

=item stack_grow
X<stack_grow>

=item start_subparse
X<start_subparse>

=item str_to_version
X<str_to_version>

=item sv_2iv
X<sv_2iv>

=item sv_2pv
X<sv_2pv>

=item sv_2uv
X<sv_2uv>

=item sv_catpvf_mg_nocontext
X<sv_catpvf_mg_nocontext>

=item sv_catpvf_nocontext
X<sv_catpvf_nocontext>

=item sv_dup
X<sv_dup>

=item sv_dup_inc
X<sv_dup_inc>

=item sv_peek
X<sv_peek>

=item sv_pvn_nomg
X<sv_pvn_nomg>

=item sv_setpvf_mg_nocontext
X<sv_setpvf_mg_nocontext>

=item sv_setpvf_nocontext
X<sv_setpvf_nocontext>

=item sys_init
X<sys_init>

=item sys_init3
X<sys_init3>

=item sys_intern_clear
X<sys_intern_clear>

=item sys_intern_dup
X<sys_intern_dup>

=item sys_intern_init
X<sys_intern_init>

=item sys_term
X<sys_term>

=item taint_env
X<taint_env>

=item taint_proper
X<taint_proper>

=item unlnk
X<unlnk>

=item unsharepvn
X<unsharepvn>

=item utf16_to_utf8
X<utf16_to_utf8>

=item utf16_to_utf8_reversed
X<utf16_to_utf8_reversed>

=item uvuni_to_utf8
X<uvuni_to_utf8>

=item vdeb
X<vdeb>

=item vform
X<vform>

=item vload_module
X<vload_module>

=item vnewSVpvf
X<vnewSVpvf>

=item vwarner
X<vwarner>

=item warn_nocontext
X<warn_nocontext>

=item warner
X<warner>

=item warner_nocontext
X<warner_nocontext>

=item whichsig
X<whichsig>

=item whichsig_pv
X<whichsig_pv>

=item whichsig_pvn
X<whichsig_pvn>

=item whichsig_sv
X<whichsig_sv>

=back


=head1 AUTHORS

Until May 1997, this document was maintained by Jeff Okamoto
<okamoto@corp.hp.com>.  It is now maintained as part of Perl itself.

With lots of help and suggestions from Dean Roehrich, Malcolm Beattie,
Andreas Koenig, Paul Hudson, Ilya Zakharevich, Paul Marquess, Neil
Bowers, Matthew Green, Tim Bunce, Spider Boardman, Ulrich Pfeifer,
Stephen McCamant, and Gurusamy Sarathy.

API Listing originally by Dean Roehrich <roehrich@cray.com>.

Updated to be autogenerated from comments in the source by Benjamin Stuhl.

=head1 SEE ALSO

L<perlguts>, L<perlxs>, L<perlxstut>, L<perlintern>

=cut

ex: set ro:
If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perltru64 - Perl version 5 on Tru64 (formerly known as Digital UNIX, formerly known as DEC OSF/1) systems

=head1 DESCRIPTION

This document describes various features of HP's (formerly Compaq's,
formerly Digital's) Unix operating system (Tru64) that will affect
how Perl version 5 (hereafter just Perl) is configured, compiled
and/or runs.

=head2 Compiling Perl 5 on Tru64

The recommended compiler to use in Tru64 is the native C compiler.
The native compiler produces much faster code (the speed difference is
noticeable: several tens of percent) and also more correct code.  If
you are considering using the GNU C compiler, you should use at the
very least release 2.95.3, since all older gcc releases are
known to produce broken code when compiling Perl.  One manifestation
of this brokenness is the lib/sdbm test dumping core; another is many
of the op/regexp and op/pat, or ext/Storable tests dumping core
(the exact pattern of failures depending on the GCC release and
optimization flags).

Both the native cc and gcc seem to consume lots of memory when
building Perl.  toke.c is a known trouble spot when optimizing:
256 megabytes of data section seems to be enough.  Another known
trouble spot is the mktables script which builds the Unicode support
tables.  The default setting of the process data section in Tru64
should be one gigabyte, but some sites/setups might have lowered that.
The configuration process of Perl checks for too low process limits,
lowers the optimization for toke.c if necessary, and also
gives advice on how to raise the process limits
(for example: C<ulimit -d 262144>).

Also, Configure might abort with

 Build a threading Perl? [n]
 Configure[2437]: Syntax error at line 1 : 'config.sh' is not expected.

This indicates that Configure is being run with a broken Korn shell
(even though you think you are using a Bourne shell by using
"sh Configure" or "./Configure").  The Korn shell bug was reported
to Compaq in February 1999.  In the meantime, note that the reason ksh is
being used is that you have the environment variable BIN_SH set to
'xpg4'.  This causes /bin/sh to delegate its duties to /bin/posix/sh
(a ksh).  Unset the environment variable and rerun Configure.

=head2 Using Large Files with Perl on Tru64

In Tru64 Perl is automatically able to use large files, that is,
files larger than 2 gigabytes.  There is no need to use the Configure
-Duselargefiles option as described in INSTALL (though using the option
is harmless).

=head2 Threaded Perl on Tru64

If you want to use threads, you should primarily use the Perl
5.8.0 threads model by running Configure with -Duseithreads.

Perl threading is going to work only in Tru64 4.0 and newer releases;
older operating system releases like 3.2 probably aren't going to work
properly with threads.

In Tru64 V5 (at least V5.1A, V5.1B) you cannot build threaded Perl with gcc
because the system header <pthread.h> explicitly checks for supported
C compilers, gcc (at least 3.2.2) not being one of them.  But the
system C compiler should work just fine.

=head2 Long Doubles on Tru64

You cannot Configure Perl to use long doubles unless you have at least
Tru64 V5.0; the long double support simply wasn't functional enough
before that.  Perl's Configure will override attempts to use long
doubles (you can notice this by Configure finding out that the modfl()
function does not work as it should).

At the time of this writing (June 2002), there is a known bug in the
Tru64 libc printing of long doubles when not using "e" notation.
The values are correct and usable, but you only get a limited number
of digits displayed unless you force the issue by using C<printf
"%.33e",$num> or the like.  For Tru64 versions V5.0A through V5.1A, a
patch is expected sometime after perl 5.8.0 is released.  If your libc
has not yet been patched, you'll get a warning from Configure when
selecting long doubles.

=head2 DB_File tests failing on Tru64

The DB_File tests (db-btree.t, db-hash.t, db-recno.t) may fail if you
have installed a newer version of Berkeley DB into the system and the
-I and -L compiler and linker flags introduce version conflicts with
the DB 1.85 headers and libraries that came with Tru64.  For example,
mixing a DB v2 library with the DB v1 headers is a bad idea.  The first
option is to avoid the mix: watch out for the Configure options
-Dlocincpth and -Dloclibpth, and check your /usr/local/include and
/usr/local/lib since they are included by default.

The second option is to explicitly instruct Configure to detect the
newer Berkeley DB installation, by supplying the right directories with
C<-Dlocincpth=/some/include> and C<-Dloclibpth=/some/lib> B<and> before
running "make test" setting your LD_LIBRARY_PATH to F</some/lib>.

The third option is to work around the problem by disabling
DB_File completely when building Perl by specifying -Ui_db to Configure,
and then using the BerkeleyDB module from CPAN instead of DB_File.
The BerkeleyDB module works with Berkeley DB versions 2.* or greater.

Berkeley DB 4.1.25 has been tested with Tru64 V5.1A and found
to work.  The latest Berkeley DB can be found at L<http://www.sleepycat.com>.

=head2 64-bit Perl on Tru64

In Tru64, Perl's integers are automatically 64 bits wide, so there is
no need to use the Configure -Duse64bitint option as described
in INSTALL.  Similarly, there is no need for -Duse64bitall
since pointers are automatically 64 bits wide.

=head2 Warnings about floating-point overflow when compiling Perl on Tru64

When compiling Perl in Tru64 you may (depending on the compiler
release) see two warnings like this

 cc: Warning: numeric.c, line 104: In this statement, floating-point
 overflow occurs in evaluating the expression "1.8e308". (floatoverfl)
     return HUGE_VAL;
 -----------^

and when compiling the POSIX extension

 cc: Warning: const-c.inc, line 2007: In this statement, floating-point
 overflow occurs in evaluating the expression "1.8e308". (floatoverfl)
             return HUGE_VAL;
 -------------------^

The exact line numbers may vary between Perl releases.  The warnings
are benign and can be ignored: in later C compiler releases the warnings
should be gone.

When the file F<pp_sys.c> is being compiled you may (depending on the
operating system release) see an additional compiler flag being used:
C<-DNO_EFF_ONLY_OK>.  This is normal and refers to a feature that is
relevant only if you use the C<filetest> pragma.  In older releases of
the operating system the feature was broken, and the NO_EFF_ONLY_OK
flag instructs Perl not to use the feature.

=head1 Testing Perl on Tru64

During "make test" the C<comp/cpp> test will be skipped because on Tru64 it
cannot be run before Perl has been installed.  The test refers to
the use of the C<-P> option of Perl.

=head1 ext/ODBM_File/odbm Test Failing With Static Builds

The ext/ODBM_File/odbm test is known to fail with static builds
(Configure -Uusedl) due to a known bug in Tru64's static libdbm
library.  The good news is that you very probably don't ever need to
use the ODBM_File extension, since the more advanced NDBM_File works fine,
not to mention the even more advanced DB_File.

=head1 Perl Fails Because Of Unresolved Symbol sockatmark

If you get an error like

    Can't load '.../OSF1/lib/perl5/5.8.0/alpha-dec_osf/auto/IO/IO.so' for module IO: Unresolved symbol in .../lib/perl5/5.8.0/alpha-dec_osf/auto/IO/IO.so: sockatmark at .../lib/perl5/5.8.0/alpha-dec_osf/XSLoader.pm line 75.

you need to either recompile your Perl in Tru64 4.0D or upgrade your
Tru64 4.0D to at least 4.0F: the sockatmark() system call was
added in Tru64 4.0F, and the IO extension refers to that symbol.

=head1 read_cur_obj_info: bad file magic number

You may be mixing the Tru64 cc/ar/ld with the GNU gcc/ar/ld.
That may work, but sometimes it doesn't (your gcc or GNU utils
may have been compiled for an incompatible OS release).

Try 'which ar' and 'which ld' (or try 'ar --version' and 'ld --version',
which work only for the GNU tools, and will announce themselves to be such),
and adjust your PATH so that you are consistently using either
the native tools or the GNU tools.  After fixing your PATH, you should
do 'make distclean' and start all the way from running Configure,
since you may have quite a confused situation.

=head1 AUTHOR

Jarkko Hietaniemi <jhi@iki.fi>

=cut

=head1 NAME

perlmodinstall - Installing CPAN Modules

=head1 DESCRIPTION

You can think of a module as the fundamental unit of reusable Perl
code; see L<perlmod> for details.  Whenever anyone creates a chunk of
Perl code that they think will be useful to the world, they register
as a Perl developer at L<http://www.cpan.org/modules/04pause.html>
so that they can then upload their code to the CPAN.  The CPAN is the
Comprehensive Perl Archive Network and can be accessed at
L<http://www.cpan.org/>, and searched at L<http://search.cpan.org/>.

This documentation is for people who want to download CPAN modules
and install them on their own computer.

=head2 PREAMBLE

First, are you sure that the module isn't already on your system?  Try
C<perl -MFoo -e 1>.  (Replace "Foo" with the name of the module; for
instance, C<perl -MCGI::Carp -e 1>.)

If you don't see an error message, you have the module.  (If you do
see an error message, it's still possible you have the module, but
that it's not in your path, which you can display with C<perl -e
"print qq(@INC)">.)  For the remainder of this document, we'll assume
that you really honestly truly lack an installed module, but have
found it on the CPAN.
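
The same check can be scripted.  What follows is only a minimal sketch:
the helper name C<have_module> is invented for illustration and is not
part of Perl or the CPAN.  It tries to load a module with C<require>
inside an C<eval> and reports whether that worked.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical helper: true if the named module can be loaded.
    sub have_module {
        my ($module) = @_;
        # Accept only plain module names before building a file path.
        return 0 unless $module =~ /\A\w+(?:::\w+)*\z/;
        (my $file = "$module.pm") =~ s{::}{/}g;
        return eval { require $file; 1 } ? 1 : 0;
    }

    for my $module (qw(CGI::Carp Foo)) {
        printf "%-10s %s\n", $module,
            have_module($module) ? "installed" : "not found in \@INC";
    }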

So now you have a file ending in .tar.gz (or, less often, .zip).  You
know there's a tasty module inside.  There are four steps you must now
take:

=over 5

=item B<DECOMPRESS> the file

=item B<UNPACK> the file into a directory

=item B<BUILD> the module (sometimes unnecessary)

=item B<INSTALL> the module.

=back

Here's how to perform each step for each operating system.  This is
I<not> a substitute for reading the README and INSTALL files that
might have come with your module!

Also note that these instructions are tailored for installing the
module into your system's repository of Perl modules, but you can
install modules into any directory you wish.  For instance, where I
say C<perl Makefile.PL>, you can substitute C<perl Makefile.PL
PREFIX=/my/perl_directory> to install the modules into
F</my/perl_directory>.  Then you can use the modules from your Perl
programs with C<use lib "/my/perl_directory/lib/site_perl";> or
sometimes just C<use lib "/my/perl_directory";>.  If you're on a system
that requires superuser/root access to install modules into the
directories you see when you type C<perl -e "print qq(@INC)">, you'll
want to install them into a local directory (such as your home
directory) and use this approach.

=over 4

=item *

B<If you're on a Unix or Unix-like system,>

You can use Andreas Koenig's CPAN module
( L<http://www.cpan.org/modules/by-module/CPAN> )
to automate the following steps, from DECOMPRESS through INSTALL.

A. DECOMPRESS

Decompress the file with C<gzip -d yourmodule.tar.gz>.

You can get gzip from L<ftp://prep.ai.mit.edu/pub/gnu/>.

Or, you can combine this step with the next to save disk space:

     gzip -dc yourmodule.tar.gz | tar -xof -

B. UNPACK

Unpack the result with C<tar -xof yourmodule.tar>

C. BUILD

Go into the newly-created directory and type:

      perl Makefile.PL
      make test

or

      perl Makefile.PL PREFIX=/my/perl_directory

to install it locally.  (Remember that if you do this, you'll have to
put C<use lib "/my/perl_directory";> near the top of the program that
is to use this module.)

D. INSTALL

While still in that directory, type:

      make install

Make sure you have the appropriate permissions to install the module
in your Perl 5 library directory.  Often, you'll need to be root.

That's all you need to do on Unix systems with dynamic linking.
Most Unix systems have dynamic linking. If yours doesn't, or if for
another reason you have a statically-linked perl, B<and> the
module requires compilation, you'll need to build a new Perl binary
that includes the module.  Again, you'll probably need to be root.

=item *

B<If you're running ActivePerl (Win95/98/2K/NT/XP, Linux, Solaris),>

First, type C<ppm> from a shell and see whether ActiveState's PPM
repository has your module.  If so, you can install it with C<ppm> and
you won't have to bother with any of the other steps here.  You might
be able to use the CPAN instructions from the "Unix or Unix-like
system" section above as well; give it a try.  Otherwise, you'll have
to follow the steps below.

   A. DECOMPRESS

You can use the shareware WinZip ( L<http://www.winzip.com> ) to
decompress and unpack modules.

   B. UNPACK

If you used WinZip, this was already done for you.

   C. BUILD

You'll need the C<nmake> utility, available at
L<http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/nmake15.exe>,
or C<dmake>, available on CPAN at
L<http://search.cpan.org/dist/dmake/>.

Does the module require compilation (i.e. does it have files that end
in .xs, .c, .h, .y, .cc, .cxx, or .C)?  If it does, life is now
officially tough for you, because you have to compile the module
yourself (no easy feat on Windows).  You'll need a compiler such as
Visual C++.  Alternatively, you can download a pre-built PPM package
from ActiveState at
L<http://aspn.activestate.com/ASPN/Downloads/ActivePerl/PPM/>.

Go into the newly-created directory and type:

      perl Makefile.PL
      nmake test


   D. INSTALL

While still in that directory, type:

      nmake install

=item *

B<If you're using a Macintosh with "Classic" MacOS and MacPerl,>


A. DECOMPRESS

First, make sure you have the latest B<cpan-mac> distribution (
L<http://www.cpan.org/authors/id/CNANDOR/> ), which has utilities for
doing all of the steps.  Read the cpan-mac directions carefully and
install it.  If you choose not to use cpan-mac for some reason, there
are alternatives listed here.

After installing cpan-mac, drop the module archive on the
B<untarzipme> droplet, which will decompress and unpack for you.

B<Or>, you can either use the shareware B<StuffIt Expander> program
( L<http://my.smithmicro.com/mac/stuffit/> )
or the freeware B<MacGzip> program (
L<http://persephone.cps.unizar.es/general/gente/spd/gzip/gzip.html> ).

B. UNPACK

If you're using untarzipme or StuffIt, the archive should be extracted
now.  B<Or>, you can use the freeware B<suntar> or I<Tar> (
L<http://hyperarchive.lcs.mit.edu/HyperArchive/Archive/cmp/> ).

C. BUILD

Check the contents of the distribution.
Read the module's documentation, looking for
reasons why you might have trouble using it with MacPerl.  Look for
F<.xs> and F<.c> files, which normally denote that the distribution
must be compiled, and you cannot install it "out of the box."
(See L</"PORTABILITY">.)

D. INSTALL

If you are using cpan-mac, just drop the folder on the
B<installme> droplet, and use the module.

B<Or>, if you aren't using cpan-mac, do some manual labor.

Make sure the newlines for the modules are in Mac format, not Unix format.
If they are not, then you might have decompressed them incorrectly.  Check
the settings of your decompression and unpacking utilities to make sure
they are translating text files properly.

As a last resort, you can use the perl one-liner:

    perl -i.bak -pe 's/(?:\015)?\012/\015/g' <filenames>

on the source files.

Then move the files (probably just the F<.pm> files, though there
may be some additional ones, too; check the module documentation)
to their final destination: This will
most likely be in C<$ENV{MACPERL}site_lib:> (i.e.,
C<HD:MacPerl folder:site_lib:>).  You can add new paths to
the default C<@INC> in the Preferences menu item in the
MacPerl application (C<$ENV{MACPERL}site_lib:> is added
automagically).  Create whatever directory structures are required
(i.e., for C<Some::Module>, create
C<$ENV{MACPERL}site_lib:Some:> and put
C<Module.pm> in that directory).

Then run the following script (or something like it):

     #!perl -w
     use AutoSplit;
     my $dir = "$ENV{MACPERL}site_lib";
     autosplit("$dir:Some:Module.pm", "$dir:auto", 0, 1, 1);

=item *

B<If you're on the DJGPP port of DOS,>

   A. DECOMPRESS

djtarx ( L<ftp://ftp.delorie.com/pub/djgpp/current/v2/> )
will both uncompress and unpack.

   B. UNPACK

See above.

   C. BUILD

Go into the newly-created directory and type:

      perl Makefile.PL
      make test

You will need the packages mentioned in F<README.dos>
in the Perl distribution.

   D. INSTALL

While still in that directory, type:

     make install	

You will need the packages mentioned in F<README.dos> in the Perl distribution.

=item *

B<If you're on OS/2,>

Get the EMX development suite and gzip/tar, from either Hobbes (
L<http://hobbes.nmsu.edu> ) or Leo ( L<http://www.leo.org> ), and then follow
the instructions for Unix.

=item *

B<If you're on VMS,>

When downloading from CPAN, save your file with a C<.tgz>
extension instead of C<.tar.gz>.  All other periods in the
filename should be replaced with underscores.  For example,
C<Your-Module-1.33.tar.gz> should be downloaded as
C<Your-Module-1_33.tgz>.

A. DECOMPRESS

Type

    gzip -d Your-Module.tgz

or, for zipped modules, type

    unzip Your-Module.zip

Executables for gzip, zip, and VMStar:

    http://www.hp.com/go/openvms/freeware/

and their source code:

    http://www.fsf.org/order/ftp.html

Note that GNU's gzip/gunzip is not the same as Info-ZIP's zip/unzip
package.  The former is a simple compression tool; the latter permits
creation of multi-file archives.

B. UNPACK

If you're using VMStar:

     VMStar xf Your-Module.tar

Or, if you're fond of VMS command syntax:

     tar/extract/verbose Your-Module.tar

C. BUILD

Make sure you have MMS (from Digital) or the freeware MMK (available
from MadGoat at L<http://www.madgoat.com> ).  Then type this to create
the DESCRIP.MMS for the module:

    perl Makefile.PL

Now you're ready to build:

    mms test

Substitute C<mmk> for C<mms> above if you're using MMK.

D. INSTALL

Type

    mms install

Substitute C<mmk> for C<mms> above if you're using MMK.

=item *

B<If you're on MVS>,

Introduce the F<.tar.gz> file into an HFS as binary; don't translate from
ASCII to EBCDIC.

A. DECOMPRESS

Decompress the file with C<gzip -d yourmodule.tar.gz>

You can get gzip from
L<http://www.s390.ibm.com/products/oe/bpxqp1.html>

B. UNPACK

Unpack the result with

     pax -o to=IBM-1047,from=ISO8859-1 -r < yourmodule.tar

The BUILD and INSTALL steps are identical to those for Unix.  Some
modules generate Makefiles that work better with GNU make, which is
available from L<http://www.mks.com/s390/gnu/>

=back

=head1 PORTABILITY

Note that not all modules will work on all platforms.
See L<perlport> for more information on portability issues.
Read the documentation to see if the module will work on your
system.  There are basically three categories
of modules that will not work "out of the box" with all
platforms (with some possibility of overlap):

=over 4

=item *

B<Those that should, but don't.>  These need to be fixed; consider
contacting the author and possibly writing a patch.

=item *

B<Those that need to be compiled, where the target platform
doesn't have compilers readily available.>  (These modules contain
F<.xs> or F<.c> files, usually.)  You might be able to find
existing binaries on the CPAN or elsewhere, or you might
want to try getting compilers and building it yourself, and then
release the binary for other poor souls to use.

=item *

B<Those that are targeted at a specific platform.>
(Such as the Win32:: modules.)  If the module is targeted
specifically at a platform other than yours, you're out
of luck, most likely.

=back



Check the CPAN Testers if a module should work with your platform
but it doesn't behave as you'd expect, or you aren't sure whether or
not a module will work under your platform.  If the module you want
isn't listed there, you can test it yourself and let CPAN Testers know,
you can join CPAN Testers, or you can request it be tested.

    http://testers.cpan.org/


=head1 HEY

If you have any suggested changes for this page, let me know.  Please
don't send me mail asking for help on how to install your modules.
There are too many modules, and too few Orwants, for me to be able to
answer or even acknowledge all your questions.  Contact the module
author instead, ask someone familiar with Perl on your operating
system, or if all else fails, file a ticket at http://rt.cpan.org/.

=head1 AUTHOR

Jon Orwant

orwant@medita.mit.edu

with invaluable help from Chris Nandor, and valuable help from Brandon
Allbery, Charles Bailey, Graham Barr, Dominic Dunlop, Jarkko
Hietaniemi, Ben Holzman, Tom Horsley, Nick Ing-Simmons, Tuomas
J. Lukka, Laszlo Molnar, Alan Olsen, Peter Prymmer, Gurusamy Sarathy,
Christoph Spalinger, Dan Sugalski, Larry Virden, and Ilya Zakharevich.

First version July 22, 1998; last revised November 21, 2001.

=head1 COPYRIGHT

Copyright (C) 1998, 2002, 2003 Jon Orwant.  All Rights Reserved.

This document may be distributed under the same terms as Perl itself.

=encoding utf8

=head1 NAME

perl5140delta - what is new for perl v5.14.0

=head1 DESCRIPTION

This document describes differences between the 5.12.0 release and
the 5.14.0 release.

If you are upgrading from an earlier release such as 5.10.0, first read
L<perl5120delta>, which describes differences between 5.10.0 and
5.12.0.

Some of the bug fixes in this release have been backported to subsequent
releases of 5.12.x.  Those are indicated with the 5.12.x version in
parentheses.

=head1 Notice

As described in L<perlpolicy>, the release of Perl 5.14.0 marks the
official end of support for Perl 5.10.  Users of Perl 5.10 or earlier
should consider upgrading to a more recent release of Perl.

=head1 Core Enhancements

=head2 Unicode

=head3 Unicode Version 6.0 is now supported (mostly)

Perl comes with the Unicode 6.0 data base updated with
L<Corrigendum #8|http://www.unicode.org/versions/corrigendum8.html>,
with one exception noted below.
See L<http://unicode.org/versions/Unicode6.0.0/> for details on the new
release.  Perl does not support any Unicode provisional properties,
including the new ones for this release.

Unicode 6.0 has chosen to use the name C<BELL> for the character at U+1F514,
which is a symbol that looks like a bell, and is used in Japanese cell
phones.  This conflicts with the long-standing Perl usage of having
C<BELL> mean the ASCII C<BEL> character, U+0007.  In Perl 5.14,
C<\N{BELL}> continues to mean U+0007, but its use generates a
deprecation warning message unless such warnings are turned off.  The
new name for U+0007 in Perl is C<ALERT>, which corresponds nicely
with the existing shorthand sequence for it, C<"\a">.  C<\N{BEL}>
means U+0007, with no warning given.  The character at U+1F514 has no
name in 5.14, but can be referred to by C<\N{U+1F514}>. 
In Perl 5.16, C<\N{BELL}> will refer to U+1F514; all code
that uses C<\N{BELL}> should be converted to use C<\N{ALERT}>,
C<\N{BEL}>, or C<"\a"> before upgrading.

=head3 Full functionality for C<use feature 'unicode_strings'>

This release provides full functionality for C<use feature
'unicode_strings'>.  Under its scope, all string operations executed and
regular expressions compiled (even if executed outside its scope) have
Unicode semantics.  See L<feature/"the 'unicode_strings' feature">.
However, see L</Inverted bracketed character classes and multi-character folds>,
below.

This feature avoids most forms of the "Unicode Bug" (see
L<perlunicode/The "Unicode Bug"> for details).  If there is any
possibility that your code will process Unicode strings, you are
I<strongly> encouraged to use this subpragma to avoid nasty surprises.

=head3 C<\N{I<NAME>}> and C<charnames> enhancements

=over

=item *

C<\N{I<NAME>}> and C<charnames::vianame> now know about the abbreviated
character names listed by Unicode, such as NBSP, SHY, LRO, ZWJ, etc.; all
customary abbreviations for the C0 and C1 control characters (such as
ACK, BEL, CAN, etc.); and a few new variants of some C1 full names that
are in common usage.

=item *

Unicode has several I<named character sequences>, in which particular sequences
of code points are given names.  C<\N{I<NAME>}> now recognizes these.

=item *

C<\N{I<NAME>}>, C<charnames::vianame>, and C<charnames::viacode>
now know about every character in Unicode.  In earlier releases of
Perl, they didn't know about the Hangul syllables nor several
CJK (Chinese/Japanese/Korean) characters.

=item *

It is now possible to override Perl's abbreviations with your own custom aliases.

=item *

You can now create a custom alias of the ordinal of a
character, known by C<\N{I<NAME>}>, C<charnames::vianame()>, and
C<charnames::viacode()>.  Previously, aliases had to be to official
Unicode character names.  This made it impossible to create an alias for
unnamed code points, such as those reserved for private
use.

=item *

The new function charnames::string_vianame() is a run-time version
of C<\N{I<NAME>}>, returning the string of characters whose Unicode
name is its parameter.  It can handle Unicode named character
sequences, whereas the pre-existing charnames::vianame() cannot,
as the latter returns a single code point.

=back

See L<charnames> for details on all these changes.
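
As a rough sketch of the difference between the two run-time lookups
(assuming perl 5.14 or later; the sequence name used in the last call is
believed to be one of Unicode's named sequences, and the code prints a
fallback message if your Unicode data does not know it):

    use strict;
    use warnings;
    use charnames ':full';

    # vianame() returns a code point, that is, a number.
    my $code = charnames::vianame("LATIN SMALL LETTER A");
    printf "U+%04X\n", $code;                       # U+0061

    # string_vianame() returns the character string itself.
    my $str = charnames::string_vianame("LATIN SMALL LETTER A");
    print "$str\n";                                 # a

    # A named sequence expands to more than one code point, so only
    # string_vianame() can return it.
    my $seq = charnames::string_vianame(
        "LATIN CAPITAL LETTER A WITH MACRON AND GRAVE");
    print defined $seq ? length($seq) : "unknown name", "\n";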

=head3 New warnings categories for problematic (non-)Unicode code points.

Three new warnings subcategories of "utf8" have been added.  These
allow you to turn off some "utf8" warnings, while allowing
other warnings to remain on.  The three categories are:
C<surrogate> when UTF-16 surrogates are encountered;
C<nonchar> when Unicode non-character code points are encountered;
and C<non_unicode> when code points above the legal Unicode
maximum of 0x10FFFF are encountered.
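
A small sketch of turning off just one of these subcategories follows.
It assumes perl 5.14 or later and writes to an in-memory file so that
nothing reaches the terminal; the variable names are illustrative only.
On such a perl, only the first C<print> should produce a warning.

    use strict;
    use warnings;

    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    open my $out, '>:utf8', \my $buffer
        or die "Cannot open in-memory file: $!";

    print {$out} chr(0x110000);       # above the Unicode maximum
    {
        no warnings 'non_unicode';    # silence only this subcategory
        print {$out} chr(0x110001);   # same kind of code point, no warning
    }
    close $out;

    print scalar(@warnings), " warning(s) seen\n";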

=head3 Any unsigned value can be encoded as a character

With this release, Perl is adopting a model that any unsigned value
can be treated as a code point and encoded internally (as utf8)
without warnings, not just the code points that are legal in Unicode.
However, unless utf8 or the corresponding sub-category (see previous
item) of lexical warnings have been explicitly turned off, outputting
or executing a Unicode-defined operation such as upper-casing
on such a code point generates a warning.  Attempting to input these
using strict rules (such as with the C<:encoding(UTF-8)> layer)
will continue to fail.  Prior to this release, handling was
inconsistent and in places, incorrect.

Unicode non-characters, some of which previously were erroneously
considered illegal in places by Perl, contrary to the Unicode Standard,
are now always legal internally.  Inputting or outputting them 
works the same as with the non-legal Unicode code points, because the Unicode
Standard says they are (only) illegal for "open interchange".

=head3 Unicode database files not installed

The Unicode database files are no longer installed with Perl.  This
doesn't affect any functionality in Perl and saves significant disk
space.  If you need these files, you can download them from
L<http://www.unicode.org/Public/zipped/6.0.0/>.

=head2 Regular Expressions

=head3 C<(?^...)> construct signifies default modifiers

An ASCII caret C<"^"> immediately following a C<"(?"> in a regular
expression now means that the subexpression does not inherit surrounding
modifiers such as C</i>, but reverts to the Perl defaults.  Any modifiers
following the caret override the defaults.

Stringification of regular expressions now uses this notation.  
For example, C<qr/hlagh/i> would previously be stringified as
C<(?i-xsm:hlagh)>, but now it's stringified as C<(?^i:hlagh)>.

The main purpose of this change is to allow tests that rely on the
stringification I<not> to have to change whenever new modifiers are added.
See L<perlre/Extended Patterns>.

This change is likely to break code that compares stringified regular
expressions with fixed strings containing C<?-xism>.
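
Code that needs to inspect a compiled pattern can avoid depending on the
stringified form.  One possible approach, sketched here, is the
C<regexp_pattern> function that the C<re> module exports on request; it
returns the pattern and its modifiers separately.

    use strict;
    use warnings;
    use re qw(regexp_pattern);

    my $re = qr/hlagh/i;

    # Stringified form: "(?^i:hlagh)" on 5.14 and later,
    # "(?i-xsm:hlagh)" on earlier releases.
    print "$re\n";

    # Independent of the stringification: pattern and flags as a list.
    my ($pattern, $flags) = regexp_pattern($re);
    print "pattern=$pattern flags=$flags\n";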

=head3 C</d>, C</l>, C</u>, and C</a> modifiers

Four new regular expression modifiers have been added.  These are mutually
exclusive: only one can be turned on at a time.

=over 

=item *

The C</l> modifier says to compile the regular expression as if it were
in the scope of C<use locale>, even if it is not.

=item *

The C</u> modifier says to compile the regular expression as if it were
in the scope of a C<use feature 'unicode_strings'> pragma.

=item *

The C</d> (default) modifier is used to override any C<use locale> and
C<use feature 'unicode_strings'> pragmas in effect at the time
of compiling the regular expression.

=item *

The C</a> regular expression modifier restricts C<\s>, C<\d> and C<\w> and
the POSIX (C<[[:posix:]]>) character classes to the ASCII range.  Their
complements and C<\b> and C<\B> are correspondingly
affected.  Otherwise, C</a> behaves like the C</u> modifier, in that
case-insensitive matching uses Unicode semantics.

If the C</a> modifier is repeated, then additionally in case-insensitive
matching, no ASCII character can match a non-ASCII character.
For example,

    "k"     =~ /\N{KELVIN SIGN}/ai
    "\xDF" =~ /ss/ai

match but 

    "k"    =~ /\N{KELVIN SIGN}/aai
    "\xDF" =~ /ss/aai

do not match.

=back

See L<perlre/Modifiers> for more detail.
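
To illustrate the effect of C</a> on C<\d>, here is a short sketch
(assuming perl 5.14 or later).  The test string holds a single non-ASCII
decimal digit, so it matches C<\d> under Unicode rules but not once the
class is restricted to ASCII.

    use strict;
    use warnings;

    my $digit = "\x{0661}";    # ARABIC-INDIC DIGIT ONE

    print "Unicode rules: ",
        ($digit =~ /\d/  ? "match" : "no match"), "\n";   # match
    print "/a modifier:   ",
        ($digit =~ /\d/a ? "match" : "no match"), "\n";   # no match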

=head3 Non-destructive substitution

The substitution (C<s///>) and transliteration
(C<y///>) operators now support an C</r> option that
copies the input variable, carries out the substitution on
the copy, and returns the result.  The original remains unmodified.

  my $old = "cat";
  my $new = $old =~ s/cat/dog/r;
  # $old is "cat" and $new is "dog"

This is particularly useful with C<map>.  See L<perlop> for more examples.
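
For instance, a sketch along these lines (the file names are made up
for the example) derives a new list without touching the original:

    use strict;
    use warnings;

    my @files = ("notes.txt", "todo.txt");

    # Each element is copied, modified, and returned; @files is unchanged.
    my @backups = map { s/\.txt\z/.bak/r } @files;
    print "@files -> @backups\n";
    # notes.txt todo.txt -> notes.bak todo.bak

    # The same option works for transliteration.
    my $name  = "perl";
    my $upper = $name =~ tr/a-z/A-Z/r;
    print "$name $upper\n";                 # perl PERL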

=head3 Re-entrant regular expression engine

It is now safe to use regular expressions within C<(?{...})> and
C<(??{...})> code blocks inside regular expressions.

These blocks are still experimental, however, and still have problems with
lexical (C<my>) variables and abnormal exiting.

=head3 C<use re '/flags'>

The C<re> pragma now has the ability to turn on regular expression flags
till the end of the lexical scope:

    use re "/x";
    "foo" =~ / (.+) /;  # /x implied

See L<re/"'/flags' mode"> for details.

=head3 \o{...} for octals

There is a new octal escape sequence, C<"\o">, in doublequote-like
contexts.  This construct allows large octal ordinals beyond the
current max of 0777 to be represented.  It also allows you to specify a
character in octal which can safely be concatenated with other regex
snippets and which won't be confused with being a backreference to
a regex capture group.  See L<perlre/Capture groups>.
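
A brief sketch of the new escape, assuming perl 5.14 or later:

    use strict;
    use warnings;

    # Octal 101 is code point 65, the letter "A".
    print "\o{101}\n";                      # A

    # Ordinals above 0777 can be written directly.
    printf "%d\n", ord("\o{1000}");         # 512

    # In a pattern, \1 would be a backreference, but \o{1} is always
    # the character with ordinal 1, whatever follows it.
    my $re = qr/(x)\o{1}2/;
    print "matched\n" if "x\x{01}2" =~ $re;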

=head3 Add C<\p{Titlecase}> as a synonym for C<\p{Title}>

This synonym is added for symmetry with the Unicode property names
C<\p{Uppercase}> and C<\p{Lowercase}>.

=head3 Regular expression debugging output improvement

Regular expression debugging output (turned on by C<use re 'debug'>) now
uses hexadecimal when escaping non-ASCII characters, instead of octal.

=head3 Return value of C<delete $+{...}>

Custom regular expression engines can now determine the return value of
C<delete> on an entry of C<%+> or C<%->.

=head2 Syntactical Enhancements

=head3 Array and hash container functions accept references

B<Warning:> This feature is considered experimental, as the exact behaviour
may change in a future version of Perl.

All builtin functions that operate directly on array or hash
containers now also accept unblessed hard references to arrays
or hashes:

  |----------------------------+---------------------------|
  | Traditional syntax         | Terse syntax              |
  |----------------------------+---------------------------|
  | push @$arrayref, @stuff    | push $arrayref, @stuff    |
  | unshift @$arrayref, @stuff | unshift $arrayref, @stuff |
  | pop @$arrayref             | pop $arrayref             |
  | shift @$arrayref           | shift $arrayref           |
  | splice @$arrayref, 0, 2    | splice $arrayref, 0, 2    |
  | keys %$hashref             | keys $hashref             |
  | keys @$arrayref            | keys $arrayref            |
  | values %$hashref           | values $hashref           |
  | values @$arrayref          | values $arrayref          |
  | ($k,$v) = each %$hashref   | ($k,$v) = each $hashref   |
  | ($k,$v) = each @$arrayref  | ($k,$v) = each $arrayref  |
  |----------------------------+---------------------------|

This allows these builtin functions to act on long dereferencing chains
or on the return value of subroutines without needing to wrap them in
C<@{}> or C<%{}>:

  push @{$obj->tags}, $new_tag;  # old way
  push $obj->tags,    $new_tag;  # new way

  for ( keys %{$hoh->{genres}{artists}} ) {...} # old way 
  for ( keys $hoh->{genres}{artists}    ) {...} # new way 

=head3 Single term prototype

The C<+> prototype is a special alternative to C<$> that acts like
C<\[@%]> when given a literal array or hash variable, but will otherwise
force scalar context on the argument.  See L<perlsub/Prototypes>.
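
The behaviour can be seen with a small illustrative subroutine (the name
C<describe> is hypothetical; perl 5.14 or later is assumed).  Note that
the subroutine must be declared before the calls for the prototype to
take effect.

    use strict;
    use warnings;

    # Hypothetical helper declared with the "+" prototype.
    sub describe(+) {
        my ($arg) = @_;
        return ref($arg) ? ref($arg) . " reference"
                         : "plain scalar '$arg'";
    }

    my @list = (1, 2, 3);
    my %map  = (a => 1);

    print describe(@list),   "\n";    # ARRAY reference
    print describe(%map),    "\n";    # HASH reference
    print describe([4, 5]),  "\n";    # ARRAY reference
    print describe("hello"), "\n";    # plain scalar 'hello'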

=head3 C<package> block syntax

A package declaration can now contain a code block, in which case the
declaration is in scope inside that block only.  So C<package Foo { ... }>
is precisely equivalent to C<{ package Foo; ... }>.  It also works with
a version number in the declaration, as in C<package Foo 1.2 { ... }>, 
which is its most attractive feature.  See L<perlfunc>.
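
A minimal sketch of the block form with a version number follows; the
package name C<Counter> and its contents are invented for the example,
and perl 5.14 or later is assumed.

    use strict;
    use warnings;

    package Counter 1.2 {
        my $count = 0;
        sub next_value { return ++$count }
    }

    # Outside the block the enclosing package is in effect again.
    my $package = __PACKAGE__;
    my $version = Counter->VERSION;
    my $value   = Counter::next_value();

    print "$package $version $value\n";     # main 1.2 1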

=head3 Statement labels can appear in more places

Statement labels can now occur before any type of statement or declaration,
such as C<package>.

=head3 Stacked labels

Multiple statement labels can now appear before a single statement.

=head3 Uppercase X/B allowed in hexadecimal/binary literals

Literals may now use either upper case C<0X...> or C<0B...> prefixes,
in addition to the already supported C<0x...> and C<0b...>
syntax [perl #76296].

C, Ruby, Python, and PHP already support this syntax, and it makes
Perl more internally consistent: a round-trip with C<eval sprintf
"%#X", 0x10> now returns C<16>, just like C<eval sprintf "%#x", 0x10>.

=head3 Overridable tie functions

C<tie>, C<tied> and C<untie> can now be overridden [perl #75902].

=head2 Exception Handling

To make them more reliable and consistent, several changes have been made
to how C<die>, C<warn>, and C<$@> behave.

=over

=item * 

When an exception is thrown inside an C<eval>, the exception is no
longer at risk of being clobbered by destructor code running during unwinding.
Previously, the exception was written into C<$@>
early in the throwing process, and would be overwritten if C<eval> was
used internally in the destructor for an object that had to be freed
while exiting from the outer C<eval>.  Now the exception is written
into C<$@> last thing before exiting the outer C<eval>, so the code
running immediately thereafter can rely on the value in C<$@> correctly
corresponding to that C<eval>.  (C<$@> is still also set before exiting the
C<eval>, for the sake of destructors that rely on this.)

Likewise, a C<local $@> inside an C<eval> no longer clobbers any
exception thrown in its scope.  Previously, the restoration of C<$@> upon
unwinding would overwrite any exception being thrown.  Now the exception
gets to the C<eval> anyway.  So C<local $@> is safe before a C<die>.

Exceptions thrown from object destructors no longer modify the C<$@>
of the surrounding context.  (If the surrounding context was exception
unwinding, this used to be another way to clobber the exception being
thrown.)  Previously such an exception was
sometimes emitted as a warning, and then either was
string-appended to the surrounding C<$@> or completely replaced the
surrounding C<$@>, depending on whether that exception and the surrounding
C<$@> were strings or objects.  Now, an exception in this situation is
always emitted as a warning, leaving the surrounding C<$@> untouched.
In addition to object destructors, this also affects any function call
run by XS code using the C<G_KEEPERR> flag.

=item * 

Warnings for C<warn> can now be objects in the same way as exceptions
for C<die>.  If an object-based warning gets the default handling
of writing to standard error, it is stringified as before with the
filename and line number appended.  But a C<$SIG{__WARN__}> handler now
receives an object-based warning as an object, where previously it
was passed the result of stringifying the object.

=back
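
The following sketch illustrates two of the points above.  The C<Guard>
and C<My::Warning> classes are hypothetical and exist only for this
example.  The destructor uses C<eval> internally, which formerly could
wipe out the exception being thrown, and the C<$SIG{__WARN__}> handler
receives the warning object itself rather than its stringification:

    package Guard;
    sub new     { return bless {}, shift }
    sub DESTROY { eval { 1 } }       # a successful eval resets $@

    package main;

    eval {
        my $guard = Guard->new;
        die "outer failure\n";       # $guard is destroyed while unwinding
    };
    my $caught = $@;
    print "caught: $caught";         # the exception survives the destructor

    my $class_seen;
    {
        local $SIG{__WARN__} = sub { $class_seen = ref $_[0] };
        warn bless({ code => 42 }, 'My::Warning');
    }
    print "handler received a $class_seen object\n";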

=head2 Other Enhancements

=head3 Assignment to C<$0> sets the legacy process name with prctl() on Linux

On Linux the legacy process name is now set with L<prctl(2)>, in
addition to altering the POSIX name via C<argv[0]>, as Perl has done
since version 4.000.  Now system utilities that read the legacy process
name such as I<ps>, I<top>, and I<killall> recognize the name you set when
assigning to C<$0>.  The string you supply is truncated at 16 bytes;
this limitation is imposed by Linux.

=head3 srand() now returns the seed

This lets programs that need repeatable results avoid having to come
up with their own seed-generating mechanism.  Instead, they can use srand()
and stash the return value for future use.  One example is a test program with
too many combinations to test comprehensively in the time available for
each run.  It can test a random subset each time and, should there be a failure,
log the seed used for that run so this can later be used to produce the same results.
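
A minimal sketch of that pattern follows; the variable names and the range
passed to rand() are arbitrary choices for the example:

    my $seed  = srand();                        # Perl picks the seed and returns it
    my @first = map { int rand 1000 } 1 .. 5;

    srand($seed);                               # replay the same sequence later
    my @replay = map { int rand 1000 } 1 .. 5;

    print "seed $seed reproduces the run\n" if "@first" eq "@replay";

In a real test program the seed would be written to a log on failure and
passed back to srand() in a later run.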

=head3 printf-like functions understand post-1980 size modifiers

Perl's printf and sprintf operators, and Perl's internal printf replacement
function, now understand the C99 size modifiers "hh" (C<char>), "z"
(C<size_t>), and "t" (C<ptrdiff_t>).  Also, when compiled with a C99
compiler, Perl now understands the size modifier "j" (C<intmax_t>)
(but this is not portable).

So, for example, on any modern machine, C<sprintf("%hhd", 257)> returns "1".

=head3 New global variable C<${^GLOBAL_PHASE}>

A new global variable, C<${^GLOBAL_PHASE}>, has been added to allow
introspection of the current phase of the Perl interpreter.  It's explained in
detail in L<perlvar/"${^GLOBAL_PHASE}"> and in
L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END">.
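
For instance, assuming the code below is run as a standalone script (code
compiled at run time through a string C<eval> would see different values),
the variable reports a different phase at compile time and at run time:

    my @phases;
    BEGIN { push @phases, ${^GLOBAL_PHASE} }   # compile time: "START"
    push @phases, ${^GLOBAL_PHASE};            # run time: "RUN"
    print "@phases\n";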

=head3 C<-d:-foo> calls C<Devel::foo::unimport>

The syntax B<-d:foo> was extended in 5.6.1 to make B<-d:foo=bar>
equivalent to B<-MDevel::foo=bar>, which expands
internally to C<use Devel::foo 'bar'>.
Perl now allows prefixing the module name with B<->, with the same
semantics as B<-M>; that is:

=over 4

=item C<-d:-foo>

Equivalent to B<-M-Devel::foo>: expands to
C<no Devel::foo> and calls C<< Devel::foo->unimport() >>
if that method exists.

=item C<-d:-foo=bar>

Equivalent to B<-M-Devel::foo=bar>: expands to C<no Devel::foo 'bar'>,
and calls C<< Devel::foo->unimport("bar") >> if that method exists.

=back

This is particularly useful for suppressing the default actions of a
C<Devel::*> module's C<import> method whilst still loading it for debugging.

=head3 Filehandle method calls load L<IO::File> on demand

When a method call on a filehandle would die because the method cannot
be resolved and L<IO::File> has not been loaded, Perl now loads L<IO::File>
via C<require> and attempts method resolution again:

  open my $fh, ">", $file;
  $fh->binmode(":raw");     # loads IO::File and succeeds

This also works for globs like C<STDOUT>, C<STDERR>, and C<STDIN>:

  STDOUT->autoflush(1);

Because this on-demand load happens only if method resolution fails, the
legacy approach of manually loading an L<IO::File> parent class for partial
method support still works as expected:

  use IO::Handle;
  open my $fh, ">", $file;
  $fh->autoflush(1);        # IO::File not loaded

=head3 Improved IPv6 support

The C<Socket> module provides new affordances for IPv6,
including implementations of the C<Socket::getaddrinfo()> and
C<Socket::getnameinfo()> functions, along with related constants and a
handful of new functions.  See L<Socket>.
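
The sketch below shows one way the two functions can be combined.  It uses
the numeric loopback address C<::1> and port C<80> purely as sample input,
and passes C<AI_NUMERICHOST> so that no name lookup is attempted; treat it
as an illustration rather than a recommended calling pattern:

    use Socket qw(getaddrinfo getnameinfo AF_INET6 SOCK_STREAM
                  AI_NUMERICHOST NI_NUMERICHOST NI_NUMERICSERV);

    my ($err, @results) = getaddrinfo("::1", "80", {
        family   => AF_INET6,
        socktype => SOCK_STREAM,
        flags    => AI_NUMERICHOST,
    });
    die "getaddrinfo failed: $err\n" if $err;

    # Turn the packed address back into numeric host and service strings.
    my ($nerr, $host, $service) =
        getnameinfo($results[0]{addr}, NI_NUMERICHOST | NI_NUMERICSERV);
    die "getnameinfo failed: $nerr\n" if $nerr;

    print "numeric host $host, service $service\n";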

=head3 DTrace probes now include package name

The C<DTrace> probes now include an additional argument, C<arg3>, which contains
the package the subroutine being entered or left was compiled in.

For example, using the following DTrace script:

  perl$target:::sub-entry
  {
      printf("%s::%s\n", copyinstr(arg0), copyinstr(arg3));
  }

and then running:

  $ perl -e 'sub test { }; test'

C<DTrace> will print:

  main::test

=head2 New C APIs

See L</Internal Changes>.

=head1 Security

=head2 User-defined regular expression properties

L<perlunicode/"User-Defined Character Properties"> documented that you can
create custom properties by defining subroutines whose names begin with
"In" or "Is".  However, Perl did not actually enforce that naming
restriction, so C<\p{foo::bar}> could call foo::bar() if it existed.  The documented
convention is now enforced.

Also, Perl no longer allows tainted regular expressions to invoke a
user-defined property.  It simply dies instead [perl #82616].
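
A property that follows the documented convention might look like this.
C<IsVowel> is a hypothetical name chosen for the example; the sub returns
one hexadecimal code point per line, as described in L<perlunicode>:

    # The sub name must begin with "In" or "Is".
    sub IsVowel {
        return "0061\n0065\n0069\n006F\n0075\n";   # a e i o u
    }

    my @vowels = grep { /\p{IsVowel}/ } qw(a b e z u);
    print "@vowels\n";                             # prints "a e u"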

=head1 Incompatible Changes

Perl 5.14.0 is not binary-compatible with any previous stable release.

In addition to the sections that follow, see L</C API Changes>.

=head2 Regular Expressions and String Escapes

=head3 Inverted bracketed character classes and multi-character folds

Some characters match a sequence of two or three characters in C</i>
regular expression matching under Unicode rules.  One example is
C<LATIN SMALL LETTER SHARP S> which matches the sequence C<ss>.

 'ss' =~ /\A[\N{LATIN SMALL LETTER SHARP S}]\z/i  # Matches

This, however, can lead to very counter-intuitive results, especially
when inverted.  Because of this, Perl 5.14 does not use multi-character C</i>
matching in inverted character classes.

 'ss' =~ /\A[^\N{LATIN SMALL LETTER SHARP S}]+\z/i  # ???

This should match any sequence of characters that are neither C<SHARP S>
nor what C<SHARP S> matches under C</i>.  C<"s"> isn't C<SHARP S>, but
Unicode says that C<"ss"> is what C<SHARP S> matches under C</i>.  So
which one "wins"? Do you fail the match because the string has C<ss> or
accept it because it has an C<s> followed by another C<s>?

Earlier releases of Perl did allow this multi-character matching,
but due to bugs, it mostly did not work.

=head3 \400-\777

In certain circumstances, C<\400>-C<\777> in regexes have behaved
differently than they behave in all other doublequote-like contexts.
Since 5.10.1, Perl has issued a deprecation warning when this happens.
Now, these literals behave the same in all doublequote-like contexts,
namely to be equivalent to C<\x{100}>-C<\x{1FF}>, with no deprecation
warning.

Uses of C<\400>-C<\777> in the command-line option B<-0> retain their
conventional meaning.  They slurp whole input files; previously, this
was documented only for B<-0777>.

Because of various ambiguities, you should use the new
C<\o{...}> construct to represent characters in octal instead.
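
For example, octal 400 is decimal 256, so the following two spellings
denote the same character (the variable names are illustrative only):

    my $from_octal = "\o{400}";
    my $from_hex   = "\x{100}";
    print "same character\n" if $from_octal eq $from_hex;
    print "matches\n"        if $from_hex =~ /\A\o{400}\z/;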

=head3 Most C<\p{}> properties are now immune to case-insensitive matching

For most Unicode properties, it doesn't make sense to have them match
differently under C</i> case-insensitive matching.  Doing so can lead
to unexpected results and potential security holes.  For example

 m/\p{ASCII_Hex_Digit}+/i

could previously match non-ASCII characters because of the Unicode
matching rules (although there were several bugs with this).  Now
matching under C</i> gives the same results as non-C</i> matching except
for those few properties where people have come to expect differences,
namely the ones where casing is an integral part of their meaning, such
as C<m/\p{Uppercase}/i> and C<m/\p{Lowercase}/i>, both of which match
the same code points as matched by C<m/\p{Cased}/i>.
Details are in L<perlrecharclass/Unicode Properties>.

User-defined property handlers that need to match differently under C</i>
must be changed to read the new boolean parameter passed to them, which
is non-zero if case-insensitive matching is in effect and 0 otherwise.
See L<perlunicode/User-Defined Character Properties>.
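
As a sketch of such a handler (C<IsAbc> is a hypothetical property invented
for this example), the sub can return a wider set of code points when the
caseless flag is set:

    sub IsAbc {
        my ($caseless) = @_;
        return $caseless ? "0041\t0043\n0061\t0063\n"   # A-C and a-c
                         : "0061\t0063\n";              # a-c only
    }

    my $sensitive   = "B" =~ /\A\p{IsAbc}\z/  ? 1 : 0;
    my $insensitive = "B" =~ /\A\p{IsAbc}\z/i ? 1 : 0;
    print "without /i: $sensitive, with /i: $insensitive\n";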

=head3 \p{} implies Unicode semantics

Specifying a Unicode property in the pattern indicates
that the pattern is meant for matching according to Unicode rules, the way
C<\N{I<NAME>}> does.

=head3 Regular expressions retain their localeness when interpolated

Regular expressions compiled under C<use locale> now retain this when
interpolated into a new regular expression compiled outside a
C<use locale>, and vice-versa.

Previously, one regular expression interpolated into another inherited
the localeness of the surrounding regex, losing whatever state it
originally had.  This is considered a bug fix, but may trip up code that
has come to rely on the incorrect behaviour.

=head3 Stringification of regexes has changed

Default regular expression modifiers are now notated using
C<(?^...)>.  Code relying on the old stringification will fail.  
The new notation means that when further modifiers are added, such code
will not have to change again, because the stringification
automatically incorporates the new modifiers.

Code that needs to work properly with both old- and new-style regexes
can avoid the whole issue by using (for perls since 5.9.5; see L<re>):

 use re qw(regexp_pattern);
 my ($pat, $mods) = regexp_pattern($re_ref);

If the actual stringification is important or older Perls need to be
supported, you can use something like the following:

    # Accept both old and new-style stringification
    my $modifiers = (qr/foobar/ =~ /\Q(?^/) ? "^" : "-xism";

And then use C<$modifiers> instead of C<-xism>.

=head3 Run-time code blocks in regular expressions inherit pragmata

Code blocks in regular expressions (C<(?{...})> and C<(??{...})>) previously
did not inherit pragmata (strict, warnings, etc.) if the regular expression
was compiled at run time as happens in cases like these two:

  use re "eval";
  $foo =~ $bar; # when $bar contains (?{...})
  $foo =~ /$bar(?{ $finished = 1 })/;

This bug has now been fixed, but code that relied on the buggy behaviour
may need to be fixed to account for the correct behaviour.

=head2 Stashes and Package Variables

=head3 Localised tied hashes and arrays are no longer tied

In the following:

    tie @a, ...;
    {
        local @a;
        # here, @a is now a new, untied array
    }
    # here, @a refers again to the old, tied array

Earlier versions of Perl incorrectly tied the new local array.  This has
now been fixed.  This fix could however potentially cause a change in
behaviour of some code.
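
The effect can be observed with C<tied>.  This sketch assumes the standard
C<Tie::StdArray> class (supplied by L<Tie::Array>) as a stand-in for a real
tie class:

    use Tie::Array;                    # supplies Tie::StdArray
    our @a;
    tie @a, 'Tie::StdArray';

    my ($inside, $outside);
    {
        local @a;
        $inside = tied(@a) ? "tied" : "untied";
    }
    $outside = tied(@a) ? "tied" : "untied";
    print "inside: $inside, outside: $outside\n";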

=head3 Stashes are now always defined

C<defined %Foo::> now always returns true, even when no symbols have yet been
defined in that package.

This is a side-effect of removing a special-case kludge in the tokeniser,
added for 5.10.0, to hide side-effects of changes to the internal storage of
hashes.  The fix drastically reduces hashes' memory overhead.

Calling defined on a stash has been deprecated since 5.6.0, warned on
lexicals since 5.6.0, and warned for stashes and other package
variables since 5.12.0.  C<defined %hash> has always exposed an
implementation detail: emptying a hash by deleting all entries from it does
not make C<defined %hash> false.  Hence C<defined %hash> is not valid code to
determine whether an arbitrary hash is empty.  Instead, use the behaviour
of an empty C<%hash> always returning false in scalar context.
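
For example (the hash name is arbitrary), either of the following tests
reports correctly that a hash has been emptied:

    my %seen = (apple => 1);
    delete $seen{apple};

    my $is_empty = %seen ? 0 : 1;              # hash in boolean context
    print "hash is empty\n" if $is_empty;
    print "hash is empty\n" if !keys %seen;    # equivalent test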

=head3 Clearing stashes

Stash list assignment C<%foo:: = ()> used to make the stash temporarily 
anonymous while it was being emptied.  Consequently, any of its
subroutines referenced elsewhere would become anonymous, showing up as
"(unknown)" in C<caller>.  They now retain their package names such that
C<caller> returns the original sub name if there is still a reference
to its typeglob and "foo::__ANON__" otherwise [perl #79208].

=head3 Dereferencing typeglobs

If you assign a typeglob to a scalar variable:

    $glob = *foo;

the glob that is copied to C<$glob> is marked with a special flag
indicating that the glob is just a copy.  This allows subsequent
assignments to C<$glob> to overwrite the glob.  The original glob,
however, is immutable.

Some Perl operators did not distinguish between these two types of globs.
This would result in strange behaviour in edge cases: C<untie $scalar>
would not untie the scalar if the last thing assigned to it was a glob
(because it treated it as C<untie *$scalar>, which unties a handle).
Assignment to a glob slot (such as C<*$glob = \@some_array>) would simply
assign C<\@some_array> to C<$glob>.

To fix this, the C<*{}> operator (including its C<*foo> and C<*$foo> forms)
has been modified to make a new immutable glob if its operand is a glob
copy.  This allows operators that make a distinction between globs and
scalars to be modified to treat only immutable globs as globs.  (C<tie>,
C<tied> and C<untie> have been left as they are for compatibility's sake,
but will warn.  See L</Deprecations>.)

This causes an incompatible change in code that assigns a glob to the
return value of C<*{}> when that operator was passed a glob copy.  Take the
following code, for instance:

    $glob = *foo;
    *$glob = *bar;

The C<*$glob> on the second line returns a new immutable glob.  That new
glob is made an alias to C<*bar>.  Then it is discarded.  So the second
assignment has no effect.

See L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=77810> for
more detail.

=head3 Magic variables outside the main package

In previous versions of Perl, magic variables like C<$!>, C<%SIG>, etc. would
"leak" into other packages.  So C<%foo::SIG> could be used to access signals,
C<${"foo::!"}> (with strict mode off) to access C's C<errno>, etc.

This was a bug, or an "unintentional" feature, which caused various ill effects,
such as signal handlers being wiped when modules were loaded, etc.

This has been fixed (or the feature has been removed, depending on how you see
it).

=head3 local($_) strips all magic from $_

local() on scalar variables gives them a new value but keeps all
their magic intact.  This has proven problematic for the default
scalar variable $_, where L<perlsub> recommends that any subroutine
that assigns to $_ should first localize it.  This would throw an
exception if $_ is aliased to a read-only variable, and could in general have
various unintentional side-effects.

Therefore, as an exception to the general rule, local($_) will not
only assign a new value to $_, but also remove all existing magic from
it as well.

=head3 Parsing of package and variable names

Parsing the names of packages and package variables has changed: 
multiple adjacent pairs of colons, as in C<foo::::bar>, are now all 
treated as package separators.

Regardless of this change, the exact parsing of package separators has
never been guaranteed and is subject to change in future Perl versions.

=head2 Changes to Syntax or to Perl Operators

=head3 C<given> return values

C<given> blocks now return the last evaluated
expression, or an empty list if the block was exited by C<break>.  Thus you
can now write:

    my $type = do {
     given ($num) {
      break     when undef;
      "integer" when /^[+-]?[0-9]+$/;
      "float"   when /^[+-]?[0-9]+(?:\.[0-9]+)?$/;
      "unknown";
     }
    };

See L<perlsyn/Return value> for details.

=head3 Change in parsing of certain prototypes

Functions declared with the following prototypes now behave correctly as unary
functions:

  *
  \$ \% \@ \* \&
  \[...]
  ;$ ;*
  ;\$ ;\% etc.
  ;\[...]

Due to this bug fix [perl #75904], functions
using the C<(*)>, C<(;$)> and C<(;*)> prototypes
are parsed with higher precedence than before.  So
in the following example:

  sub foo(;$);
  foo $a < $b;

the second line is now parsed correctly as C<< foo($a) < $b >>, rather than
C<< foo($a < $b) >>.  This happens when one of these operators is used in
an unparenthesised argument:

  < > <= >= lt gt le ge
  == != <=> eq ne cmp ~~
  &
  | ^
  &&
  || //
  .. ...
  ?:
  = += -= *= etc.
  , =>
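
The difference can be made visible with a sub whose return value does not
depend on its argument.  The sub C<ten> below is a hypothetical helper
written only for this demonstration:

    sub ten(;$) { return 10 }

    # Now parsed as ten(1) < 5, that is 10 < 5, which is false.
    my $result = ten 1 < 5;
    print $result ? "parsed as ten(1 < 5)\n" : "parsed as ten(1) < 5\n";

    # Parenthesise to pass the comparison itself as the argument.
    my $explicit = ten(1 < 5);

Adding explicit parentheses, as in the last line, keeps the meaning the
same on older and newer perls.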

=head3 Smart-matching against array slices

Previously, the following code resulted in a successful match:

    my @a = qw(a y0 z);
    my @b = qw(a x0 z);
    @a[0 .. $#b] ~~ @b;

This odd behaviour has now been fixed [perl #77468].

=head3 Negation treats strings differently from before

The unary negation operator, C<->, now treats strings that look like numbers
as numbers [perl #57706].
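
A short sketch of the resulting behaviour, using made-up variable names:

    my $negated  = -"10";      # numeric negation: -10
    my $positive = -"-10";     # numeric negation: 10
    my $word     = -"foo";     # not a number: the string "-foo"
    print "$negated $positive $word\n";   # prints "-10 10 -foo"

Strings that do not look like numbers, such as C<"foo">, keep the string
behaviour described in L<perlop>.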

=head3 Negative zero

Negative zero (-0.0), when converted to a string, now becomes "0" on all
platforms.  It used to become "-0" on some, but "0" on others.

If you still need to determine whether a zero is negative, use
C<sprintf("%g", $zero) =~ /^-/> or the L<Data::Float> module on CPAN.

=head3 C<:=> is now a syntax error

Previously C<my $pi := 4> was exactly equivalent to C<my $pi : = 4>,
with the C<:> being treated as the start of an attribute list, ending before
the C<=>.  The use of C<:=> to mean C<: => was deprecated in 5.12.0, and is
now a syntax error.  This allows future use of C<:=> as a new token.

Outside the core's tests for it, we find no Perl 5 code on CPAN
using this construction, so we believe that this change will have
little impact on real-world codebases.

If it is absolutely necessary to have empty attribute lists (for example,
because of a code generator), simply avoid the error by adding a space before
the C<=>.

=head3 Change in the parsing of identifiers

Characters outside the Unicode "XIDStart" set are no longer allowed at the
beginning of an identifier.  This means that certain accents and marks
that normally follow an alphabetic character may no longer be the first
character of an identifier.

=head2 Threads and Processes

=head3 Directory handles not copied to threads

On systems other than Windows that do not have
a C<fchdir> function, newly-created threads no
longer inherit directory handles from their parent threads.  Such programs
would usually have crashed anyway [perl #75154].

=head3 C<close> on shared pipes

To avoid deadlocks, the C<close> function no longer waits for the
child process to exit if the underlying file descriptor is still
in use by another thread.  It returns true in such cases.

=head3 fork() emulation will not wait for signalled children

On Windows, parent processes would not terminate until all forked
children had terminated first.  However, C<kill("KILL", ...)> is
inherently unstable on pseudo-processes, and C<kill("TERM", ...)>
might not get delivered if the child is blocked in a system call.

To avoid the deadlock and still provide a safe mechanism to terminate
the hosting process, Perl now no longer waits for children that
have been sent a SIGTERM signal.  It is up to the parent process to
waitpid() for these children if child-cleanup processing must be
allowed to finish.  However, it is also then the responsibility of the
parent to avoid the deadlock by making sure the child process
can't be blocked on I/O.

See L<perlfork> for more information about the fork() emulation on
Windows.

=head2 Configuration

=head3 Naming fixes in Policy_sh.SH may invalidate Policy.sh

Several long-standing typos and naming confusions in F<Policy_sh.SH> have
been fixed, standardizing on the variable names used in F<config.sh>.

This will change the behaviour of F<Policy.sh> if you happen to have been
accidentally relying on its incorrect behaviour.

=head3 Perl source code is read in text mode on Windows

Perl scripts used to be read in binary mode on Windows for the benefit
of the L<ByteLoader> module (which is no longer part of core Perl).  This
had the side-effect of breaking various operations on the C<DATA> filehandle,
including seek()/tell(), and even simply reading from C<DATA> after filehandles
have been flushed by a call to system(), backticks, fork() etc.

The default build options for Windows have been changed to read Perl source
code on Windows in text mode now.  L<ByteLoader> will (hopefully) be updated on
CPAN to automatically handle this situation [perl #28106].

=head1 Deprecations

See also L</Deprecated C APIs>.

=head2 Omitting a space between a regular expression and subsequent word

Omitting the space between a regular expression operator or
its modifiers and the following word is deprecated.  For
example, C<< m/foo/sand $bar >> is for now still parsed
as C<< m/foo/s and $bar >>, but will now issue a warning.

=head2 C<\cI<X>>

The backslash-c construct was designed as a way of specifying
non-printable characters, but there were no restrictions (on ASCII
platforms) on what the character following the C<c> could be.  Now,
a deprecation warning is raised if that character isn't an ASCII character.
Also, a deprecation warning is raised for C<"\c{"> (which is the same
as simply saying C<";">).

=head2 C<"\b{"> and C<"\B{">

In regular expressions, a literal C<"{"> immediately following a C<"\b">
(not in a bracketed character class) or a C<"\B"> is now deprecated
to allow for its future use by Perl itself.

=head2 Perl 4-era .pl libraries

Perl bundles a handful of library files that predate Perl 5.
This bundling is now deprecated for most of these files, which are now
available from CPAN.  The affected files now warn when run, if they were
installed as part of the core.

This is a mandatory warning, not obeying B<-X> or lexical warning bits.
The warning is modelled on that supplied by F<deprecate.pm> for
deprecated-in-core F<.pm> libraries.  It points to the specific CPAN
distribution that contains the F<.pl> libraries.  The CPAN versions, of
course, do not generate the warning.

=head2 List assignment to C<$[>

Assignment to C<$[> was deprecated and started to give warnings in
Perl version 5.12.0.  This version of Perl (5.14) now also emits a warning 
when assigning to C<$[> in list context.  This fixes an oversight in 5.12.0.

=head2 Use of qw(...) as parentheses

Historically the parser fooled itself into thinking that C<qw(...)> literals
were always enclosed in parentheses, and as a result you could sometimes omit
parentheses around them:

    for $x qw(a b c) { ... }

The parser no longer lies to itself in this way.  Wrap the list literal in
parentheses like this:

    for $x (qw(a b c)) { ... }

This is being deprecated because the parentheses in C<for $i (1,2,3) { ... }>
are not part of expression syntax.  They are part of the statement
syntax, with the C<for> statement wanting literal parentheses.
The synthetic parentheses that a C<qw> expression acquired were only
intended to be treated as part of expression syntax.

Note that this does not change the behaviour of cases like:

    use POSIX qw(setlocale localeconv);
    our @EXPORT = qw(foo bar baz);

where parentheses were never required around the expression.

=head2 C<\N{BELL}>

Use of C<\N{BELL}> is deprecated because Unicode is using that name for a
different character.
See L</Unicode Version 6.0 is now supported (mostly)> for more
explanation.

=head2 C<?PATTERN?>

C<?PATTERN?> (without the initial C<m>) has been deprecated and now produces
a warning.  This is to allow future use of C<?> in new operators.
The match-once functionality is still available as C<m?PATTERN?>.
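
As a reminder of what match-once means, the following sketch (with an
arbitrary word list) uses the non-deprecated C<m?PATTERN?> form; the
pattern succeeds only the first time it matches, until C<reset> is called:

    my @hits;
    for my $word (qw(apple avocado banana apricot)) {
        push @hits, $word if $word =~ m?^a?;   # succeeds only once
    }
    print "@hits\n";                           # prints "apple"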

=head2 Tie functions on scalars holding typeglobs

Calling a tie function (C<tie>, C<tied>, C<untie>) with a scalar argument
acts on a filehandle if the scalar happens to hold a typeglob.

This is a long-standing bug that will be removed in Perl 5.16, as
there is currently no way to tie the scalar itself when it holds
a typeglob, and no way to untie a scalar that has had a typeglob
assigned to it.

Now there is a deprecation warning whenever a tie
function is used on a handle without an explicit C<*>.

=head2 User-defined case-mapping

This feature is being deprecated due to its many issues, as documented in
L<perlunicode/User-Defined Case Mappings (for serious hackers only)>.
This feature will be removed in Perl 5.16.  Instead use the CPAN module
L<Unicode::Casing>, which provides improved functionality.

=head2 Deprecated modules

The following module will be removed from the core distribution in a
future release, and should be installed from CPAN instead.  Distributions
on CPAN that require this module should add it to their prerequisites.  The
core version of this module now issues a deprecation warning.

If you ship a packaged version of Perl, either alone or as part of a
larger system, then you should carefully consider the repercussions of
core module deprecations.  You may want to consider shipping your default
build of Perl with a package for the deprecated module that
installs into C<vendor> or C<site> Perl library directories.  This will
inhibit the deprecation warnings.

Alternatively, you may want to consider patching F<lib/deprecate.pm>
to provide deprecation warnings specific to your packaging system
or distribution of Perl, consistent with how your packaging system
or distribution manages a staged transition from a release where the
installation of a single package provides the given functionality, to
a later release where the system administrator needs to know to install
multiple packages to get that same functionality.

You can silence these deprecation warnings by installing the module
in question from CPAN.  To install the latest version of it by role
rather than by name, just install C<Task::Deprecations::5_14>.

=over

=item L<Devel::DProf>

We strongly recommend that you install and use L<Devel::NYTProf> instead
of L<Devel::DProf>, as L<Devel::NYTProf> offers significantly
improved profiling and reporting.

=back

=head1 Performance Enhancements

=head2 "Safe signals" optimisation

Signal dispatch has been moved from the runloop into control ops.
This should give a few percent speed increase, and eliminates nearly
all the speed penalty caused by the introduction of "safe signals"
in 5.8.0.  Signals should still be dispatched within the same
statement as they were previously.  If this does I<not> happen, or
if you find it possible to create uninterruptible loops, this is a
bug, and reports are encouraged of how to recreate such issues.

=head2 Optimisation of shift() and pop() calls without arguments

Two fewer OPs are used for shift() and pop() calls with no argument (with
implicit C<@_>).  This change makes shift() 5% faster than C<shift @_>
on non-threaded perls, and 25% faster on threaded ones.

=head2 Optimisation of regexp engine string comparison work

The C<foldEQ_utf8> API function for case-insensitive comparison of strings (which
is used heavily by the regexp engine) was substantially refactored and
optimised -- and its documentation much improved as a free bonus.

=head2 Regular expression compilation speed-up

Compiling regular expressions has been made faster when upgrading
the regex to utf8 is necessary but this isn't known when the compilation begins.

=head2 String appending is 100 times faster

When doing a lot of string appending, perls built to use the system's
C<malloc> could end up allocating a lot more memory than needed in an
inefficient way.

C<sv_grow>, the function used to allocate more memory if necessary
when appending to a string, has been taught to round up the memory
it requests to a certain geometric progression, making it much faster on
certain platforms and configurations.  On Win32, it's now about 100 times
faster.

=head2 Eliminate C<PL_*> accessor functions under ithreads

When C<MULTIPLICITY> was first developed, and interpreter state moved into
an interpreter struct, thread- and interpreter-local C<PL_*> variables
were defined as macros that called accessor functions (returning the
address of the value) outside the Perl core.  The intent was to allow
members within the interpreter struct to change size without breaking
binary compatibility, so that bug fixes could be merged to a maintenance
branch that necessitated such a size change.  This mechanism was redundant
and penalised well-behaved code.  It has been removed.

=head2 Freeing weak references

When there are many weak references to an object, freeing that object
can under some circumstances take O(I<N*N>) time, where
I<N> is the number of references.  The circumstances in which this can happen
have been reduced [perl #75254].

=head2 Lexical array and hash assignments

An earlier optimisation to speed up C<my @array = ...> and
C<my %hash = ...> assignments caused a bug and was disabled in Perl 5.12.0.

Now we have found another way to speed up these assignments [perl #82110].

=head2 C<@_> uses less memory

Previously, C<@_> was allocated for every subroutine at compile time with
enough space for four entries.  Now this allocation is done on demand when
the subroutine is called [perl #72416].

=head2 Size optimisations to SV and HV structures

C<xhv_fill> has been eliminated from C<struct xpvhv>, saving 1 IV per hash and,
on some systems, causing C<struct xpvhv> to become cache-aligned.  To keep
this memory saving from causing a slowdown elsewhere, boolean use of C<HvFILL>
now calls C<HvTOTALKEYS> instead (which is equivalent).  The fill data are now
calculated on demand when actually required, but cases where this needs to be
done should be rare.

The order of structure elements in SV bodies has changed.  Effectively,
the NV slot has swapped location with STASH and MAGIC.  As all access to
SV members is via macros, this should be completely transparent.  This
change allows the space saving for PVHVs documented above, and may reduce
the memory allocation needed for PVIVs on some architectures.

C<XPV>, C<XPVIV>, and C<XPVNV> now allocate only the parts of the C<SV> body
they actually use, saving some space.

Scalars containing regular expressions now allocate only the part of the C<SV>
body they actually use, saving some space.

=head2 Memory consumption improvements to Exporter

The C<@EXPORT_FAIL> AV is no longer created unless needed, hence neither is
the typeglob backing it.  This saves about 200 bytes for every package that
uses Exporter but doesn't use this functionality.

=head2 Memory savings for weak references

For weak references, the common case of just a single weak reference
per referent has been optimised to reduce the storage required.  In this
case it saves the equivalent of one small Perl array per referent.

=head2 C<%+> and C<%-> use less memory

The bulk of the C<Tie::Hash::NamedCapture> module used to be in the Perl
core.  It has now been moved to an XS module to reduce overhead for
programs that do not use C<%+> or C<%->.

=head2 Multiple small improvements to threads

The internal structures of threading now make fewer API calls and fewer
allocations, resulting in noticeably smaller object code.  Additionally,
many thread context checks have been deferred so they're done only 
as needed (although this is only possible for non-debugging builds).

=head2 Adjacent pairs of nextstate opcodes are now optimized away

Previously, in code such as

    use constant DEBUG => 0;

    sub GAK {
        warn if DEBUG;
        print "stuff\n";
    }

the ops for C<warn if DEBUG> would be folded to a C<null> op (C<ex-const>), but
the C<nextstate> op would remain, resulting in a runtime op dispatch of
C<nextstate>, C<nextstate>, etc.

The execution of a sequence of C<nextstate> ops is indistinguishable from just
the last C<nextstate> op so the peephole optimizer now eliminates the first of
a pair of C<nextstate> ops except when the first carries a label, since labels
must not be eliminated by the optimizer, and label usage isn't conclusively known
at compile time.

=head1 Modules and Pragmata

=head2 New Modules and Pragmata

=over 4

=item *

L<CPAN::Meta::YAML> 0.003 has been added as a dual-life module.  It supports a
subset of YAML sufficient for reading and writing F<META.yml> and F<MYMETA.yml> files
included with CPAN distributions or generated by the module installation
toolchain.  It should not be used for any other general YAML parsing or
generation task.

=item *

L<CPAN::Meta> version 2.110440 has been added as a dual-life module.  It
provides a standard library to read, interpret and write CPAN distribution
metadata files (like F<META.json> and F<META.yml>) that describe a
distribution, its contents, and the requirements for building it and
installing it.  The latest CPAN distribution metadata specification is
included as L<CPAN::Meta::Spec> and notes on changes in the specification
over time are given in L<CPAN::Meta::History>.

=item *

L<HTTP::Tiny> 0.012 has been added as a dual-life module.  It is a very
small, simple HTTP/1.1 client designed for simple GET requests and file
mirroring.  It has been added so that F<CPAN.pm> and L<CPANPLUS> can
"bootstrap" HTTP access to CPAN using pure Perl without relying on external
binaries like L<curl(1)> or L<wget(1)>.

=item *

L<JSON::PP> 2.27105 has been added as a dual-life module to allow CPAN
clients to read F<META.json> files in CPAN distributions.

=item *

L<Module::Metadata> 1.000004 has been added as a dual-life module.  It gathers
package and POD information from Perl module files.  It is a standalone module
based on L<Module::Build::ModuleInfo> for use by other module installation
toolchain components.  L<Module::Build::ModuleInfo> has been deprecated in
favor of this module.

=item *

L<Perl::OSType> 1.002 has been added as a dual-life module.  It maps Perl
operating system names (like "dragonfly" or "MSWin32") to more generic types
with standardized names (like "Unix" or "Windows").  It has been refactored
out of L<Module::Build> and L<ExtUtils::CBuilder> and consolidates such mappings into
a single location for easier maintenance.

=item *

The following modules were added by the L<Unicode::Collate> 
upgrade.  See below for details.

L<Unicode::Collate::CJK::Big5>

L<Unicode::Collate::CJK::GB2312>

L<Unicode::Collate::CJK::JISX0208>

L<Unicode::Collate::CJK::Korean>

L<Unicode::Collate::CJK::Pinyin>

L<Unicode::Collate::CJK::Stroke>

=item *

L<Version::Requirements> version 0.101020 has been added as a dual-life
module.  It provides a standard library to model and manipulate module
prerequisites and version constraints defined in L<CPAN::Meta::Spec>.

=back
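
As a rough sketch of the mapping that the L<Perl::OSType> entry above
describes, the following uses its os_type() and is_os_type() functions.
The variable names are illustrative only.

    use strict;
    use warnings;
    use Perl::OSType qw(os_type is_os_type);

    # Map specific $^O-style names to generic families.
    my $dragonfly = os_type('dragonfly');
    my $windows   = os_type('MSWin32');

    # With no argument, os_type() describes the running platform ($^O).
    my $here = os_type();

    print "dragonfly => $dragonfly, MSWin32 => $windows, here => $here\n";
    print "dragonfly counts as Unix\n" if is_os_type( 'Unix', 'dragonfly' );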

=head2 Updated Modules and Pragmata

=over 4

=item *

L<attributes> has been upgraded from version 0.12 to 0.14.

=item *

L<Archive::Extract> has been upgraded from version 0.38 to 0.48.

Updates since 0.38 include: a safe print method that guards
L<Archive::Extract> from changes to C<$\>; a fix to the tests when run in core
Perl; support for TZ files; a modification for the lzma
logic to favour L<IO::Uncompress::Unlzma>; and a fix
for an issue with NetBSD-current and its new L<unzip(1)>
executable.

=item *

L<Archive::Tar> has been upgraded from version 1.54 to 1.76.

Important changes since 1.54 include the following:

=over

=item *

Compatibility with busybox implementations of L<tar(1)>.

=item *

A fix so that write() and create_archive()
close only filehandles they themselves opened.

=item *

A bug was fixed regarding the exit code of extract_archive.

=item *

The L<ptar(1)> utility has a new option to allow safe creation of
tarballs without world-writable files on Windows, allowing those
archives to be uploaded to CPAN.

=item *

A new L<ptargrep(1)> utility for using regular expressions against 
the contents of files in a tar archive.

=item *

L<pax> extended headers are now skipped.

=back

=item *

L<Attribute::Handlers> has been upgraded from version 0.87 to 0.89.

=item *

L<autodie> has been upgraded from version 2.06_01 to 2.1001.

=item *

L<AutoLoader> has been upgraded from version 5.70 to 5.71.

=item *

The L<B> module has been upgraded from version 1.23 to 1.29.

It no longer crashes when taking apart a C<y///> containing characters
outside the octet range or compiled in a C<use utf8> scope.

The size of the shared object has been reduced by about 40%, with no
reduction in functionality.

=item *

L<B::Concise> has been upgraded from version 0.78 to 0.83.

L<B::Concise> marks rv2sv(), rv2av(), and rv2hv() ops with the new
C<OPpDEREF> flag as "DREFed".

It no longer produces mangled output with the B<-tree> option
[perl #80632].

=item *

L<B::Debug> has been upgraded from version 1.12 to 1.16.

=item *

L<B::Deparse> has been upgraded from version 0.96 to 1.03.

The deparsing of a C<nextstate> op has changed when it has both a label
and a change of package (relative to the previous nextstate), of C<%^H>,
or of other state.  The label was previously emitted first, but is now
emitted last (5.12.1).

The C<no 5.13.2> or similar form is now correctly handled by L<B::Deparse>
(5.12.3).

L<B::Deparse> now properly handles the code that applies a conditional
pattern match against implicit C<$_> as it was fixed in [perl #20444].

Deparsing of C<our> followed by a variable with funny characters
(as permitted under the C<use utf8> pragma) has also been fixed [perl #33752].

=item *

L<B::Lint> has been upgraded from version 1.11_01 to 1.13.

=item *

L<base> has been upgraded from version 2.15 to 2.16.

=item *

L<Benchmark> has been upgraded from version 1.11 to 1.12.

=item *

L<bignum> has been upgraded from version 0.23 to 0.27.

=item *

L<Carp> has been upgraded from version 1.15 to 1.20.

L<Carp> now detects incomplete L<caller()|perlfunc/"caller EXPR">
overrides and avoids using bogus C<@DB::args>.  To provide backtraces,
Carp relies on particular behaviour of the caller() builtin.
L<Carp> now detects if other code has overridden this with an
incomplete implementation, and modifies its backtrace accordingly.
Previously incomplete overrides would cause incorrect values in
backtraces (best case), or obscure fatal errors (worst case).

This fixes certain cases of "Bizarre copy of ARRAY" caused by modules
overriding caller() incorrectly (5.12.2).

It now also avoids using regular expressions that cause Perl to
load its Unicode tables, so as to avoid the "BEGIN not safe after
errors" error that ensues if there has been a syntax error
[perl #82854].

=item *

L<CGI> has been upgraded from version 3.48 to 3.52.

This provides the following security fixes: the MIME boundary in 
multipart_init() is now random and the handling of 
newlines embedded in header values has been improved.

=item *

L<Compress::Raw::Bzip2> has been upgraded from version 2.024 to 2.033.

It has been updated to use L<bzip2(1)> 1.0.6.

=item *

L<Compress::Raw::Zlib> has been upgraded from version 2.024 to 2.033.

=item *

L<constant> has been upgraded from version 1.20 to 1.21.

Unicode constants work once more.  They had been broken since Perl 5.10.0
[CPAN RT #67525].

=item *

L<CPAN> has been upgraded from version 1.94_56 to 1.9600.

Major highlights:

=over 4

=item * much less configuration dialog hassle

=item * support for F<META/MYMETA.json>

=item * support for L<local::lib>

=item * support for L<HTTP::Tiny> to reduce the dependency on FTP sites 

=item * automatic mirror selection

=item * iron out all known bugs in configure_requires

=item * support for distributions compressed with L<bzip2(1)>

=item * allow F<Foo/Bar.pm> on the command line to mean C<Foo::Bar>

=back

=item *

L<CPANPLUS> has been upgraded from version 0.90 to 0.9103.

A change to F<cpanp-run-perl>
resolves L<RT #55964|http://rt.cpan.org/Public/Bug/Display.html?id=55964>
and L<RT #57106|http://rt.cpan.org/Public/Bug/Display.html?id=57106>, both
of which related to failures to install distributions that use
C<Module::Install::DSL> (5.12.2).

A dependency on L<Config> was not recognised as a
core module dependency.  This has been fixed.

L<CPANPLUS> now includes support for F<META.json> and F<MYMETA.json>.

=item *

L<CPANPLUS::Dist::Build> has been upgraded from version 0.46 to 0.54.

=item *

L<Data::Dumper> has been upgraded from version 2.125 to 2.130_02.

The indentation used to be off when C<$Data::Dumper::Terse> was set.  This
has been fixed [perl #73604].

This upgrade also fixes a crash when using custom sort functions that might
cause the stack to change [perl #74170].

C<Dumpxs> no longer crashes with globs returned by C<*$io_ref>
[perl #72332].

=item *

L<DB_File> has been upgraded from version 1.820 to 1.821.

=item *

L<DBM_Filter> has been upgraded from version 0.03 to 0.04.

=item *

L<Devel::DProf> has been upgraded from version 20080331.00 to 20110228.00.

Merely loading L<Devel::DProf> no longer triggers profiling to start.
Both C<use Devel::DProf> and C<perl -d:DProf ...> behave as before and start
the profiler.

B<NOTE>: L<Devel::DProf> is deprecated and will be removed from a future
version of Perl.  We strongly recommend that you install and use
L<Devel::NYTProf> instead, as it offers significantly improved
profiling and reporting.

=item *

L<Devel::Peek> has been upgraded from version 1.04 to 1.07.

=item *

L<Devel::SelfStubber> has been upgraded from version 1.03 to 1.05.

=item *

L<diagnostics> has been upgraded from version 1.19 to 1.22.

It now renders pod links slightly better, and has been taught to find
descriptions for messages that share their descriptions with other
messages.

=item *

L<Digest::MD5> has been upgraded from version 2.39 to 2.51.

It is now safe to use this module in combination with threads.

=item *

L<Digest::SHA> has been upgraded from version 5.47 to 5.61.

C<shasum> now more closely mimics L<sha1sum(1)>/L<md5sum(1)>.

C<addfile> accepts all POSIX filenames.

It adds new SHA-512/224 and SHA-512/256 transforms (ref. NIST Draft
FIPS 180-4 [February 2011]).

=item *

L<DirHandle> has been upgraded from version 1.03 to 1.04.

=item *

L<Dumpvalue> has been upgraded from version 1.13 to 1.16.

=item *

L<DynaLoader> has been upgraded from version 1.10 to 1.13.

It fixes a buffer overflow when passed a very long file name.

It no longer inherits from L<AutoLoader>; hence it no longer
produces weird error messages for unsuccessful method calls on classes that
inherit from L<DynaLoader> [perl #84358].

=item *

L<Encode> has been upgraded from version 2.39 to 2.42.

Now, all 66 Unicode non-characters are treated the same way U+FFFF has
always been treated: in cases when it was disallowed, all 66 are
disallowed, and in cases where it warned, all 66 warn.

=item *

L<Env> has been upgraded from version 1.01 to 1.02.

=item *

L<Errno> has been upgraded from version 1.11 to 1.13.

The implementation of L<Errno> has been refactored to use about 55% less memory.

On some platforms with unusual header files, like Win32 L<gcc(1)> using C<mingw64>
headers, some constants that weren't actually error numbers used to be exposed
by L<Errno>.  This has been fixed [perl #77416].

=item *

L<Exporter> has been upgraded from version 5.64_01 to 5.64_03.

Exporter no longer overrides C<$SIG{__WARN__}> [perl #74472].

=item *

L<ExtUtils::CBuilder> has been upgraded from version 0.27 to 0.280203.

=item *

L<ExtUtils::Command> has been upgraded from version 1.16 to 1.17.

=item *

L<ExtUtils::Constant> has been upgraded from 0.22 to 0.23.

The C<AUTOLOAD> helper code generated by C<ExtUtils::Constant::ProxySubs>
can now croak() for missing constants, or generate a complete C<AUTOLOAD>
subroutine in XS, allowing simplification of many modules that use it
(L<Fcntl>, L<File::Glob>, L<GDBM_File>, L<I18N::Langinfo>, L<POSIX>,
L<Socket>).

L<ExtUtils::Constant::ProxySubs> can now optionally push the names of all
constants onto the package's C<@EXPORT_OK>.

=item *

L<ExtUtils::Install> has been upgraded from version 1.55 to 1.56.

=item *

L<ExtUtils::MakeMaker> has been upgraded from version 6.56 to 6.57_05.

=item *

L<ExtUtils::Manifest> has been upgraded from version 1.57 to 1.58.

=item *

L<ExtUtils::ParseXS> has been upgraded from version 2.21 to 2.2210.

=item *

L<Fcntl> has been upgraded from version 1.06 to 1.11.

=item *

L<File::Basename> has been upgraded from version 2.78 to 2.82.

=item *

L<File::CheckTree> has been upgraded from version 4.4 to 4.41.

=item *

L<File::Copy> has been upgraded from version 2.17 to 2.21.

=item *

L<File::DosGlob> has been upgraded from version 1.01 to 1.04.

It allows patterns containing literal parentheses: they no longer need to
be escaped.  On Windows, it no longer adds an extra F<./> to file names
returned when the pattern is a relative glob with a drive specification,
like F<C:*.pl> [perl #71712].

=item *

L<File::Fetch> has been upgraded from version 0.24 to 0.32.

L<HTTP::Lite> is now supported for the "http" scheme.

The L<fetch(1)> utility is supported on FreeBSD, NetBSD, and
Dragonfly BSD for the C<http> and C<ftp> schemes.

=item *

L<File::Find> has been upgraded from version 1.15 to 1.19.

It improves handling of backslashes on Windows, so that paths like
F<C:\dir\/file> are no longer generated [perl #71710].

=item *

L<File::Glob> has been upgraded from version 1.07 to 1.12.

=item *

L<File::Spec> has been upgraded from version 3.31 to 3.33.

Several portability fixes were made in L<File::Spec::VMS>: a colon is now
recognized as a delimiter in native filespecs; caret-escaped delimiters are
recognized for better handling of extended filespecs; catpath() returns
an empty directory rather than the current directory if the input directory
name is empty; and abs2rel() properly handles Unix-style input (5.12.2).

=item *

L<File::stat> has been upgraded from 1.02 to 1.05.

The C<-x> and C<-X> file test operators now work correctly when run
by the superuser.

=item *

L<Filter::Simple> has been upgraded from version 0.84 to 0.86.

=item *

L<GDBM_File> has been upgraded from 1.10 to 1.14.

This fixes a memory leak when DBM filters are used.

=item *

L<Hash::Util> has been upgraded from 0.07 to 0.11.

L<Hash::Util> no longer emits spurious "uninitialized" warnings when
recursively locking hashes that have undefined values [perl #74280].

=item *

L<Hash::Util::FieldHash> has been upgraded from version 1.04 to 1.09.

=item *

L<I18N::Collate> has been upgraded from version 1.01 to 1.02.

=item *

L<I18N::Langinfo> has been upgraded from version 0.03 to 0.08.

langinfo() now defaults to using C<$_> if there is no argument given, just
as the documentation has always claimed.

=item *

L<I18N::LangTags> has been upgraded from version 0.35 to 0.35_01.

=item *

L<if> has been upgraded from version 0.05 to 0.0601.

=item *

L<IO> has been upgraded from version 1.25_02 to 1.25_04.

This version of L<IO> includes a new L<IO::Select>, which now allows L<IO::Handle>
objects (and objects in derived classes) to be removed from an L<IO::Select> set
even if the underlying file descriptor is closed or invalid.

=item *

L<IPC::Cmd> has been upgraded from version 0.54 to 0.70.

Resolves an issue with splitting Win32 command lines.  An argument
consisting of the single character "0" used to be omitted (CPAN RT #62961).

=item *

L<IPC::Open3> has been upgraded from 1.05 to 1.09.

open3() now produces an error if the C<exec> call fails, allowing this
condition to be distinguished from a child process that exited with a
non-zero status [perl #72016].

The internal xclose() routine now knows how to handle file descriptors as
documented, so duplicating C<STDIN> in a child process using its file
descriptor now works [perl #76474].

=item *

L<IPC::SysV> has been upgraded from version 2.01 to 2.03.

=item *

L<lib> has been upgraded from version 0.62 to 0.63.

=item *

L<Locale::Maketext> has been upgraded from version 1.14 to 1.19.

L<Locale::Maketext> now supports external caches.

This upgrade also fixes an infinite loop in
C<Locale::Maketext::Guts::_compile()> when
working with tainted values (CPAN RT #40727).

C<< ->maketext >> calls now back up and restore C<$@> so error
messages are not suppressed (CPAN RT #34182).

=item *

L<Log::Message> has been upgraded from version 0.02 to 0.04.

=item *

L<Log::Message::Simple> has been upgraded from version 0.06 to 0.08.

=item *

L<Math::BigInt> has been upgraded from version 1.89_01 to 1.994.

This fixes, among other things, incorrect results when computing binomial
coefficients [perl #77640].

It also prevents C<sqrt($int)> from crashing under C<use bigrat>
[perl #73534].

=item *

L<Math::BigInt::FastCalc> has been upgraded from version 0.19 to 0.28.

=item *

L<Math::BigRat> has been upgraded from version 0.24 to 0.26_02.

=item *

L<Memoize> has been upgraded from version 1.01_03 to 1.02.

=item *

L<MIME::Base64> has been upgraded from 3.08 to 3.13.

Includes new functions to calculate the length of encoded and decoded
base64 strings.

Now provides encode_base64url() and decode_base64url() functions to process
the base64 scheme for "URL applications".

=item *

L<Module::Build> has been upgraded from version 0.3603 to 0.3800.

A notable change is the deprecation of several modules.
L<Module::Build::Version> has been deprecated and L<Module::Build> now
relies on the L<version> pragma directly.  L<Module::Build::ModuleInfo> has
been deprecated in favor of a standalone copy called L<Module::Metadata>.
L<Module::Build::YAML> has been deprecated in favor of L<CPAN::Meta::YAML>.

L<Module::Build> now also generates F<META.json> and F<MYMETA.json> files
in accordance with version 2 of the CPAN distribution metadata specification,
L<CPAN::Meta::Spec>.  The older format F<META.yml> and F<MYMETA.yml> files are
still generated.

=item *

L<Module::CoreList> has been upgraded from version 2.29 to 2.47.

Besides listing the updated core modules of this release, it also stops listing
the C<Filespec> module.  That module never existed in core.  The scripts
generating L<Module::CoreList> confused it with L<VMS::Filespec>, which actually
is a core module as of Perl 5.8.7.

=item *

L<Module::Load> has been upgraded from version 0.16 to 0.18.

=item *

L<Module::Load::Conditional> has been upgraded from version 0.34 to 0.44.

=item *

The L<mro> pragma has been upgraded from version 1.02 to 1.07.

=item *

L<NDBM_File> has been upgraded from version 1.08 to 1.12.

This fixes a memory leak when DBM filters are used.

=item *

L<Net::Ping> has been upgraded from version 2.36 to 2.38.

=item *

L<NEXT> has been upgraded from version 0.64 to 0.65.

=item *

L<Object::Accessor> has been upgraded from version 0.36 to 0.38.

=item *

L<ODBM_File> has been upgraded from version 1.07 to 1.10.

This fixes a memory leak when DBM filters are used.

=item *

L<Opcode> has been upgraded from version 1.15 to 1.18.

=item *

The L<overload> pragma has been upgraded from 1.10 to 1.13.

C<overload::Method> can now handle subroutines that are themselves blessed
into overloaded classes [perl #71998].

The documentation has greatly improved.  See L</Documentation> below.

=item *

L<Params::Check> has been upgraded from version 0.26 to 0.28.

=item *

The L<parent> pragma has been upgraded from version 0.223 to 0.225.

=item *

L<Parse::CPAN::Meta> has been upgraded from version 1.40 to 1.4401.

The latest Parse::CPAN::Meta can now read YAML and JSON files using
L<CPAN::Meta::YAML> and L<JSON::PP>, which are now part of the Perl core.

=item *

L<PerlIO::encoding> has been upgraded from version 0.12 to 0.14.

=item *

L<PerlIO::scalar> has been upgraded from 0.07 to 0.11.

A read() after a seek() beyond the end of the string no longer thinks it
has data to read [perl #78716].

=item *

L<PerlIO::via> has been upgraded from version 0.09 to 0.11.

=item *

L<Pod::Html> has been upgraded from version 1.09 to 1.11.

=item *

L<Pod::LaTeX> has been upgraded from version 0.58 to 0.59.

=item *

L<Pod::Perldoc> has been upgraded from version 3.15_02 to 3.15_03.

=item *

L<Pod::Simple> has been upgraded from version 3.13 to 3.16.

=item *

L<POSIX> has been upgraded from 1.19 to 1.24.

It now includes constants for POSIX signals.

=item *

The L<re> pragma has been upgraded from version 0.11 to 0.18.

The C<use re '/flags'> subpragma is new.

The regmust() function used to crash when called on a regular expression
belonging to a pluggable engine.  Now it croaks instead.

regmust() no longer leaks memory.

=item *

L<Safe> has been upgraded from version 2.25 to 2.29.

Coderefs returned by reval() and rdo() are now wrapped via
wrap_code_refs() (5.12.1).

This fixes a possible infinite loop when looking for coderefs.

It adds several C<version::vxs::*> routines to the default share.

=item *

L<SDBM_File> has been upgraded from version 1.06 to 1.09.

=item *

L<SelfLoader> has been upgraded from 1.17 to 1.18.

It now works in taint mode [perl #72062].

=item *

The L<sigtrap> pragma has been upgraded from version 1.04 to 1.05.

It no longer tries to modify read-only arguments when generating a
backtrace [perl #72340].

=item *

L<Socket> has been upgraded from version 1.87 to 1.94.

See L</Improved IPv6 support> above.

=item *

L<Storable> has been upgraded from version 2.22 to 2.27.

This includes a performance improvement for overloaded classes.

This adds support for serialising code references that contain UTF-8 strings
correctly.  The L<Storable> minor version number changed as a result, meaning
that L<Storable> users who set C<$Storable::accept_future_minor> to a C<FALSE>
value will see errors (see L<Storable/FORWARD COMPATIBILITY> for more details).

Freezing no longer gets confused if the Perl stack gets reallocated
during freezing [perl #80074].

=item *

L<Sys::Hostname> has been upgraded from version 1.11 to 1.16.

=item *

L<Term::ANSIColor> has been upgraded from version 2.02 to 3.00.

=item *

L<Term::UI> has been upgraded from version 0.20 to 0.26.

=item *

L<Test::Harness> has been upgraded from version 3.17 to 3.23.

=item *

L<Test::Simple> has been upgraded from version 0.94 to 0.98.

Among many other things, subtests without a C<plan> or C<no_plan> now have an
implicit done_testing() added to them.

=item *

L<Thread::Semaphore> has been upgraded from version 2.09 to 2.12.

It provides two new methods that give more control over the decrementing of
semaphores: C<down_nb> and C<down_force>.

=item *

L<Thread::Queue> has been upgraded from version 2.11 to 2.12.

=item *

The L<threads> pragma has been upgraded from version 1.75 to 1.83.

=item *

The L<threads::shared> pragma has been upgraded from version 1.32 to 1.37.

=item *

L<Tie::Hash> has been upgraded from version 1.03 to 1.04.

Calling C<< Tie::Hash->TIEHASH() >> used to loop forever.  Now it C<croak>s.

=item *

L<Tie::Hash::NamedCapture> has been upgraded from version 0.06 to 0.08.

=item *

L<Tie::RefHash> has been upgraded from version 1.38 to 1.39.

=item *

L<Time::HiRes> has been upgraded from version 1.9719 to 1.9721_01.

=item *

L<Time::Local> has been upgraded from version 1.1901_01 to 1.2000.

=item *

L<Time::Piece> has been upgraded from version 1.15_01 to 1.20_01.

=item *

L<Unicode::Collate> has been upgraded from version 0.52_01 to 0.73.

L<Unicode::Collate> has been updated to use Unicode 6.0.0.

L<Unicode::Collate::Locale> now supports a plethora of new locales: I<ar, be,
bg, de__phonebook, hu, hy, kk, mk, nso, om, tn, vi, hr, ig, ja, ko, ru, sq, 
se, sr, to, uk, zh, zh__big5han, zh__gb2312han, zh__pinyin>, and I<zh__stroke>.

The following modules have been added:

L<Unicode::Collate::CJK::Big5> for C<zh__big5han>, which tailors
CJK Unified Ideographs in the order of CLDR's big5han ordering.

L<Unicode::Collate::CJK::GB2312> for C<zh__gb2312han>, which tailors
CJK Unified Ideographs in the order of CLDR's gb2312han ordering.

L<Unicode::Collate::CJK::JISX0208>, which tailors 6355 kanji
(CJK Unified Ideographs) in the JIS X 0208 order.

L<Unicode::Collate::CJK::Korean>, which tailors CJK Unified Ideographs
in the order of CLDR's Korean ordering.

L<Unicode::Collate::CJK::Pinyin> for C<zh__pinyin>, which tailors
CJK Unified Ideographs in the order of CLDR's pinyin ordering.

L<Unicode::Collate::CJK::Stroke> for C<zh__stroke>, which tailors
CJK Unified Ideographs in the order of CLDR's stroke ordering.

This also sees the switch from using the pure-Perl version of this
module to the XS version.

=item *

L<Unicode::Normalize> has been upgraded from version 1.03 to 1.10.

=item *

L<Unicode::UCD> has been upgraded from version 0.27 to 0.32.

A new function, Unicode::UCD::num(), has been added.  This function
returns the numeric value of the string passed it or C<undef> if the string
in its entirety has no "safe" numeric value.  (For more detail, and for the
definition of "safe", see L<Unicode::UCD/num()>.)

This upgrade also includes several bug fixes:

=over 4

=item charinfo()

=over 4

=item *

It is now updated to Unicode Version 6.0.0 with I<Corrigendum #8>, 
excepting that, just as with Perl 5.14, the code point at U+1F514 has no name.

=item *

Hangul syllable code points have the correct names, and their
decompositions are always output without requiring L<Lingua::KO::Hangul::Util>
to be installed.

=item *

CJK (Chinese-Japanese-Korean) code points U+2A700 to U+2B734
and U+2B740 to U+2B81D are now properly handled.

=item *

Numeric values are now output for those CJK code points that have them.

=item *

Names output for code points with multiple aliases are now the
corrected ones.

=back

=item charscript()

This now correctly returns "Unknown" instead of C<undef> for the script
of a code point that hasn't been assigned another one.

=item charblock()

This now correctly returns "No_Block" instead of C<undef> for the block
of a code point that hasn't been assigned to another one.

=back
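
As a small sketch of the functions discussed in this entry, the following
calls num(), charscript() and charblock() on ordinary ASCII input.  The
sample strings and variable names are illustrative only.

    use strict;
    use warnings;
    use Unicode::UCD qw(num charscript charblock);

    # num() returns the numeric value of a string made up of digits...
    my $value = num('42');

    # ...or undef when the string as a whole has no "safe" numeric value.
    my $unsafe = num('4a');

    # charscript() and charblock() take a code point.
    my $script = charscript(0x41);
    my $block  = charblock(0x41);

    printf "num('42') = %s\n", $value;
    printf "num('4a') is %s\n", defined $unsafe ? 'defined' : 'undef';
    print  "U+0041: script=$script block=$block\n";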

=item *

The L<version> pragma has been upgraded from 0.82 to 0.88.

Because of a bug, now fixed, the is_strict() and is_lax() functions did not
work when exported (5.12.1).

=item *

The L<warnings> pragma has been upgraded from version 1.09 to 1.12.

Calling C<use warnings> without arguments is now significantly more efficient.

=item *

The L<warnings::register> pragma has been upgraded from version 1.01 to 1.02.

It is now possible to register warning categories other than the names of
packages using L<warnings::register>.  See L<perllexwarn(1)> for more information.

=item *

L<XSLoader> has been upgraded from version 0.10 to 0.13.

=item *

L<VMS::DCLsym> has been upgraded from version 1.03 to 1.05.

Two bugs have been fixed [perl #84086]:

The symbol table name was lost when tying a hash, due to a thinko in
C<TIEHASH>.  The result was that all tied hashes interacted with the
local symbol table.

Unless a symbol table name had been explicitly specified in the call
to the constructor, querying the special key C<:LOCAL> failed to
identify objects connected to the local symbol table.

=item *

The L<Win32> module has been upgraded from version 0.39 to 0.44.

This release has several new functions: Win32::GetSystemMetrics(),
Win32::GetProductInfo(), Win32::GetOSDisplayName().

The names returned by Win32::GetOSName() and Win32::GetOSDisplayName()
have been corrected.

=item *

L<XS::Typemap> has been upgraded from version 0.03 to 0.05.

=back
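
As one illustration of the additions listed above, here is a minimal
sketch of the URL-safe and length functions noted in the L<MIME::Base64>
entry.  The two-byte input is arbitrary, chosen because its standard
encoding contains characters that the URL variant replaces.

    use strict;
    use warnings;
    use MIME::Base64 qw(encode_base64 encode_base64url decode_base64url
                        encoded_base64_length);

    my $bytes = "\xff\xfe";

    # Standard base64 may contain '+', '/' and '=' padding...
    my $plain = encode_base64( $bytes, '' );

    # ...while the URL-safe variant uses '-' and '_' and drops the padding.
    my $url = encode_base64url($bytes);

    my $round_trip = decode_base64url($url);
    my $length     = encoded_base64_length( $bytes, '' );

    print "plain=$plain url=$url length=$length\n";

Passing an empty string as the second argument suppresses the trailing
newline that encode_base64() would otherwise append.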

=head2 Removed Modules and Pragmata

As promised in Perl 5.12.0's release notes, the following modules have
been removed from the core distribution, and if needed should be installed
from CPAN instead.

=over

=item *

L<Class::ISA> has been removed from the Perl core.  Prior version was 0.36.

=item *

L<Pod::Plainer> has been removed from the Perl core.  Prior version was 1.02.

=item *

L<Switch> has been removed from the Perl core.  Prior version was 2.16.

=back

The removal of L<Shell> has been deferred until after 5.14, as the
implementation of L<Shell> shipped with 5.12.0 did not correctly issue the
warning that it was to be removed from core.

=head1 Documentation

=head2 New Documentation

=head3 L<perlgpl>

L<perlgpl> has been updated to contain GPL version 1, as is included in the
F<README> distributed with Perl (5.12.1).

=head3 Perl 5.12.x delta files

The perldelta files for Perl 5.12.1 to 5.12.3 have been added from the
maintenance branch: L<perl5121delta>, L<perl5122delta>, L<perl5123delta>.

=head3 L<perlpodstyle>

New style guide for POD documentation,
split mostly from the NOTES section of the L<pod2man(1)> manpage.

=head3 L<perlsource>, L<perlinterp>, L<perlhacktut>, and L<perlhacktips>

See L</perlhack and perlrepository revamp>, below.

=head2 Changes to Existing Documentation

=head3 L<perlmodlib> is now complete

The L<perlmodlib> manpage that came with Perl 5.12.0 was missing several
modules due to a bug in the script that generates the list.  This has been
fixed [perl #74332] (5.12.1).

=head3 Replace incorrect tr/// table in L<perlebcdic>

L<perlebcdic> contains a helpful table to use in C<tr///> to convert
between EBCDIC and Latin1/ASCII.  The table was the inverse of the one
it describes, though the code that used the table worked correctly for
the specific example given.

The table has been corrected and the sample code changed to correspond.

The table has also been changed to hex from octal, and the recipes in the
pod have been altered to print out leading zeros to make all values
the same length.

=head3 Tricks for user-defined casing

L<perlunicode> now contains an explanation of how to override, mangle
and otherwise tweak the way Perl handles upper-, lower- and other-case
conversions on Unicode data, and how to provide scoped changes to alter
one's own code's behaviour without stomping on anybody else's.

=head3 INSTALL explicitly states that Perl requires a C89 compiler

This was already true, but it's now Officially Stated For The Record
(5.12.2).

=head3 Explanation of C<\xI<HH>> and C<\oI<OOO>> escapes

L<perlop> has been updated with more detailed explanation of these two
character escapes.
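
As a brief illustration, assuming a perl recent enough to accept the
braced C<\o{...}> form (5.14 or later), the following spells the same
character several ways.  The variable names are illustrative only.

    use strict;
    use warnings;

    # Each of these spells the letter "A" (hex 41, octal 101).
    my @spellings = ( "\x41", "\x{41}", "\101", "\o{101}" );

    # \x{...} also reaches code points beyond 0xFF.
    my $smiley = "\x{263A}";

    printf "%s is U+%04X\n", $_, ord($_) for @spellings;
    printf "smiley is U+%04X\n", ord($smiley);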

=head3 B<-0I<NNN>> switch

In L<perlrun>, the behaviour of the B<-0I<NNN>> switch for B<-0400> or higher
has been clarified (5.12.2).

=head3 Maintenance policy

L<perlpolicy> now contains the policy on what patches are acceptable for
maintenance branches (5.12.1).

=head3 Deprecation policy

L<perlpolicy> now contains the policy on compatibility and deprecation
along with definitions of terms like "deprecation" (5.12.2).

=head3 New descriptions in L<perldiag>

The following existing diagnostics are now documented:

=over 4

=item *

L<Ambiguous use of %c resolved as operator %c|perldiag/"Ambiguous use of %c resolved as operator %c">

=item *

L<Ambiguous use of %c{%s} resolved to %c%s|perldiag/"Ambiguous use of %c{%s} resolved to %c%s">

=item *

L<Ambiguous use of %c{%s[...]} resolved to %c%s[...]|perldiag/"Ambiguous use of %c{%s[...]} resolved to %c%s[...]">

=item *

L<Ambiguous use of %c{%s{...}} resolved to %c%s{...}|perldiag/"Ambiguous use of %c{%s{...}} resolved to %c%s{...}">

=item *

L<Ambiguous use of -%s resolved as -&%s()|perldiag/"Ambiguous use of -%s resolved as -&%s()">

=item *

L<Invalid strict version format (%s)|perldiag/"Invalid strict version format (%s)">

=item *

L<Invalid version format (%s)|perldiag/"Invalid version format (%s)">

=item *

L<Invalid version object|perldiag/"Invalid version object">

=back

=head3 L<perlbook>

L<perlbook> has been expanded to cover many more popular books.

=head3 C<SvTRUE> macro

The documentation for the C<SvTRUE> macro in
L<perlapi> was simply wrong in stating that
get-magic is not processed.  It has been corrected.

=head3 op manipulation functions

Several API functions that process optrees have been newly documented.

=head3 L<perlvar> revamp

L<perlvar> reorders the variables and groups them by topic.  Each variable
introduced after Perl 5.000 notes the first version in which it is 
available.  L<perlvar> also has a new section for deprecated variables to
note when they were removed.

=head3 Array and hash slices in scalar context

These are now documented in L<perldata>.
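
A minimal sketch of the behaviour in question, using made-up data: a
slice evaluated in scalar context yields the last element of the slice.

    use strict;
    use warnings;

    my @letters = qw(a b c d);
    my %number  = ( one => 1, two => 2, three => 3 );

    # Scalar assignment puts the slice in scalar context, so each
    # variable receives the last element of its slice.
    my $from_array = @letters[ 0, 1 ];
    my $from_hash  = @number{qw(one two)};

    print "array slice: $from_array, hash slice: $from_hash\n";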

=head3 C<use locale> and formats

L<perlform> and L<perllocale> have been corrected to state that
C<use locale> affects formats.

=head3 L<overload>

L<overload>'s documentation has practically undergone a rewrite.  It
is now much more straightforward and clear.

=head3 perlhack and perlrepository revamp

The L<perlhack> document is now much shorter, and focuses on the Perl 5
development process and submitting patches to Perl.  The technical content
has been moved to several new documents, L<perlsource>, L<perlinterp>,
L<perlhacktut>, and L<perlhacktips>.  This technical content has 
been only lightly edited.

The perlrepository document has been renamed to L<perlgit>.  This new
document is just a how-to on using git with the Perl source code.
Any other content that used to be in perlrepository has been moved
to L<perlhack>.

=head3 Time::Piece examples

Examples in L<perlfaq4> have been updated to show the use of
L<Time::Piece>.
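
The following is a minimal sketch of the kind of usage involved; it is
not copied from L<perlfaq4>, and the date and variable names are chosen
purely for illustration:

    use strict;
    use warnings;
    use Time::Piece;

    # Parse a fixed date so the result does not depend on the clock.
    my $t = Time::Piece->strptime("2011-05-14", "%Y-%m-%d");

    my $year = $t->year;    # four-digit year
    my $mon  = $t->mon;     # month number, 1..12
    my $ymd  = $t->ymd;     # date as YYYY-MM-DD

    print "$ymd is in month $mon of $year\n";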

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 New Diagnostics

=head3 New Errors

=over

=item Closure prototype called

This error occurs when a subroutine reference passed to an attribute
handler is called, if the subroutine is a closure [perl #68560].

=item Insecure user-defined property %s

Perl detected tainted data when trying to compile a regular
expression that contains a call to a user-defined character property
function, meaning C<\p{IsFoo}> or C<\p{InFoo}>.
See L<perlunicode/User-Defined Character Properties> and L<perlsec>.

=item panic: gp_free failed to free glob pointer - something is repeatedly re-creating entries

This new error is triggered if a destructor called on an object in a
typeglob that is being freed creates a new typeglob entry containing an
object with a destructor that creates a new entry containing an object etc.

=item Parsing code internal error (%s)

This new fatal error is produced when parsing
code supplied by an extension violates the
parser's API in a detectable way.

=item refcnt: fd %d%s

This new error only occurs if an internal consistency check fails when a
pipe is about to be closed.

=item Regexp modifier "/%c" may not appear twice

The regular expression pattern has one of the
mutually exclusive modifiers repeated.

=item Regexp modifiers "/%c" and "/%c" are mutually exclusive

The regular expression pattern has more than one of the mutually
exclusive modifiers.

=item Using !~ with %s doesn't make sense

This error occurs when C<!~> is used with C<s///r> or C<y///r>.

=back
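
The C<!~> error listed above can be illustrated with a short sketch.
The variable names are hypothetical, and a string C<eval> is used only
so that the compile-time failure can be observed without aborting the
surrounding program:

    use strict;
    use warnings;

    my $text = "hello world";

    # Non-destructive substitution: $text is untouched and the
    # modified copy is returned.
    my $copy = $text =~ s/world/perl/r;

    # Negated binding with /r is rejected when the code is compiled,
    # so the string eval fails and $@ holds the message.
    my $compiled = eval q{ my $t = "x"; my $u = ($t !~ s/x/y/r); 1 };
    my $error    = $@;

    print "copy: $copy\n";
    print "error: $error" unless $compiled;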

=head3 New Warnings

=over

=item "\b{" is deprecated; use "\b\{" instead

=item "\B{" is deprecated; use "\B\{" instead

Use of an unescaped "{" immediately following a C<\b> or C<\B> is now
deprecated in order to reserve its use for Perl itself in a future release.

=item Operation "%s" returns its argument for ...

Performing an operation requiring Unicode semantics (such as case-folding)
on a Unicode surrogate or a non-Unicode character now triggers this
warning.

=item Use of qw(...) as parentheses is deprecated

See L</"Use of qw(...) as parentheses">, above, for details.

=back
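
A sketch of code that avoids both deprecations listed above might look
like this (the data and variable names are made up for the example):

    use strict;
    use warnings;

    # Wrap the qw(...) list in real parentheses instead of relying
    # on qw(...) to act as the parentheses of the loop.
    my @seen;
    for my $name (qw(alpha beta gamma)) {
        push @seen, $name;
    }

    # Escape a literal "{" that directly follows \b.
    my $matched = "a{b" =~ /a\b\{b/ ? 1 : 0;

    print "@seen $matched\n";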

=head2 Changes to Existing Diagnostics

=over 4

=item *

The "Variable $foo is not imported" warning that precedes a
C<strict 'vars'> error has now been assigned the "misc" category, so that
C<no warnings> will suppress it [perl #73712].

=item *

warn() and die() now produce "Wide character" warnings when fed a
character outside the byte range if C<STDERR> is a byte-sized handle.

=item *

The "Layer does not match this perl" error message has been replaced with
these more helpful messages [perl #73754]:

=over 4

=item *

PerlIO layer function table size (%d) does not match size expected by this
perl (%d)

=item *

PerlIO layer instance size (%d) does not match size expected by this perl
(%d)

=back

=item *

The "Found = in conditional" warning that is emitted when a constant is
assigned to a variable in a condition is now withheld if the constant is
actually a subroutine or one generated by C<use constant>, since the value
of the constant may not be known at the time the program is written
[perl #77762].

=item *

Previously, if none of the gethostbyaddr(), gethostbyname() and
gethostent() functions were implemented on a given platform, they would
all die with the message "Unsupported socket function 'gethostent' called",
with analogous messages for getnet*() and getserv*().  This has been
corrected.

=item *

The warning message about unrecognized regular expression escapes passed
through has been changed to include any literal "{" following the
two-character escape.  For example, "\q{" is now emitted instead of "\q".

=back

=head1 Utility Changes

=head3 L<perlbug(1)>

=over 4

=item *

L<perlbug> now looks in the EMAIL environment variable for a return address
if the REPLY-TO and REPLYTO variables are empty.

=item *

L<perlbug> did not previously generate a "From:" header, potentially
resulting in dropped mail; it now includes that header.

=item *

The user's address is now used as the Return-Path.

Many systems these days don't have a valid Internet domain name, and
perlbug@perl.org does not accept email with a return-path that does
not resolve.  So the user's address is now passed to sendmail so it's
less likely to get stuck in a mail queue somewhere [perl #82996].

=item *

L<perlbug> now always gives the reporter a chance to change the email
address it guesses for them (5.12.2).

=item *

L<perlbug> should no longer warn about uninitialized values when using the B<-d>
and B<-v> options (5.12.2).

=back

=head3 L<perl5db.pl>

=over

=item *

The remote terminal works after forking and spawns new sessions, one
per forked process.

=back

=head3 L<ptargrep>

=over 4

=item *

L<ptargrep> is a new utility to apply pattern matching to the contents of
files in a tar archive.  It comes with C<Archive::Tar>.

=back

=head1 Configuration and Compilation

See also L</"Naming fixes in Policy_sh.SH may invalidate Policy.sh">,
above.

=over 4

=item *

CCINCDIR and CCLIBDIR for the mingw64 cross-compiler are now correctly
under F<$(CCHOME)\mingw\include> and F<\lib> rather than immediately below
F<$(CCHOME)>.

This means the "incpath", "libpth", "ldflags", "lddlflags" and
"ldflags_nolargefiles" values in F<Config.pm> and F<Config_heavy.pl> are now
set correctly.

=item *

C<make test.valgrind> has been adjusted to account for F<cpan/dist/ext>
separation.

=item *

On compilers that support it, B<-Wwrite-strings> is now added to cflags by
default.

=item *

The L<Encode> module can now (once again) be included in a static Perl
build.  The special-case handling for this situation got broken in Perl
5.11.0, and has now been repaired.

=item *

The previous default size of a PerlIO buffer (4096 bytes) has been increased
to the larger of 8192 bytes and your local BUFSIZ.  Benchmarks show that doubling
this decade-old default increases read and write performance by around
25% to 50% when using the default layers of perlio on top of unix.  To choose
a non-default size, such as to get back the old value or to obtain an even
larger value, configure with:

     ./Configure -Accflags=-DPERLIOBUF_DEFAULT_BUFSIZ=N

where N is the desired size in bytes; it should probably be a multiple of
your page size.

=item *

An "incompatible operand types" error in ternary expressions when building
with C<clang> has been fixed (5.12.2).

=item *

Perl now skips setuid L<File::Copy> tests on partitions it detects mounted
as C<nosuid> (5.12.2).

=back

=head1 Platform Support

=head2 New Platforms

=over 4

=item AIX

Perl now builds on AIX 4.2 (5.12.1).

=back

=head2 Discontinued Platforms

=over 4

=item Apollo DomainOS

The last vestiges of support for this platform have been excised from
the Perl distribution.  It was officially discontinued in version 5.12.0.
It had not worked for years before that.

=item MacOS Classic

The last vestiges of support for this platform have been excised from the
Perl distribution.  It was officially discontinued in an earlier version.

=back

=head2 Platform-Specific Notes

=head3 AIX

=over

=item *

F<README.aix> has been updated with information about the XL C/C++ V11 compiler
suite (5.12.2).

=back

=head3 ARM

=over

=item *

The C<d_u32align> configuration probe on ARM has been fixed (5.12.2).

=back

=head3 Cygwin

=over 4

=item *

L<MakeMaker> has been updated to build manpages on cygwin.

=item *

Improved rebase behaviour

If a DLL is updated on cygwin, the old imagebase address is reused.
This solves most rebase errors, especially when updating core DLLs.
See L<http://www.tishler.net/jason/software/rebase/rebase-2.4.2.README> 
for more information.

=item *

Support for the standard cygwin dll prefix (needed for FFIs)

=item *

Updated build hints file

=back

=head3 FreeBSD 7

=over

=item * 

FreeBSD 7 no longer contains F</usr/bin/objformat>.  At build time,
Perl now skips the F<objformat> check for versions 7 and higher and
assumes ELF (5.12.1).

=back

=head3 HP-UX

=over 

=item *

Perl now allows B<-Duse64bitint> without promoting to C<use64bitall> on HP-UX
(5.12.1).

=back

=head3 IRIX

=over

=item *

Conversion of strings to floating-point numbers is now more accurate on
IRIX systems [perl #32380].

=back

=head3 Mac OS X

=over

=item *

Early versions of Mac OS X (Darwin) had buggy implementations of the
setregid(), setreuid(), setrgid(), and setruid() functions, so Perl
would pretend they did not exist.

These functions are now recognised on Mac OS 10.5 (Leopard; Darwin 9) and
higher, as they have been fixed [perl #72990].

=back

=head3 MirBSD

=over

=item *

Previously if you built Perl with a shared F<libperl.so> on MirBSD (the
default config), it would work up to the installation; however, once
installed, it would be unable to find F<libperl>.  Path handling is now
treated as in the other BSD dialects.

=back

=head3 NetBSD

=over

=item *

The NetBSD hints file has been changed to make the system malloc the
default.

=back

=head3 OpenBSD

=over

=item *

OpenBSD E<gt> 3.7 has a new malloc implementation which is I<mmap>-based,
and as such can release memory back to the OS; however, Perl's use of
this malloc causes a substantial slowdown, so we now default to using
Perl's malloc instead [perl #75742].

=back

=head3 OpenVOS

=over

=item *

Perl now builds again with OpenVOS (formerly known as Stratus VOS)
[perl #78132] (5.12.3).

=back

=head3 Solaris

=over

=item *

DTrace is now supported on Solaris.  There used to be build failures, but
these have been fixed [perl #73630] (5.12.3).

=back

=head3 VMS

=over

=item *

Extension building on older (pre 7.3-2) VMS systems was broken because
configure.com hit the DCL symbol length limit of 1K.  We now work within
this limit when assembling the list of extensions in the core build (5.12.1).

=item *

We fixed configuring and building Perl with B<-Uuseperlio> (5.12.1).

=item *

C<PerlIOUnix_open> now honours the default permissions on VMS.

When C<perlio> became the default and C<unix> became the default bottom layer,
the most common path for creating files from Perl became C<PerlIOUnix_open>,
which has always explicitly used C<0666> as the permission mask.  This prevents
inheriting permissions from RMS defaults and ACLs, so to avoid that problem,
we now pass C<0777> to open().  In the VMS CRTL, C<0777> has a special
meaning over and above intersecting with the current umask; specifically, it
allows Unix syscalls to preserve native default permissions (5.12.3).

=item *

The shortening of symbols longer than 31 characters in the core C sources
and in extensions is now by default done by the C compiler rather than by
xsubpp (which could only do so for generated symbols in XS code).  You can
reenable xsubpp's symbol shortening by configuring with -Uuseshortenedsymbols,
but you'll have some work to do to get the core sources to compile.

=item *

Record-oriented files (record format variable or variable with fixed control)
opened for write by the C<perlio> layer will now be line-buffered to prevent the
introduction of spurious line breaks whenever the perlio buffer fills up.

=item *

F<git_version.h> is now installed on VMS.  This was an oversight in v5.12.0 which
caused some extensions to fail to build (5.12.2).

=item *

Several memory leaks in L<stat()|perlfunc/"stat FILEHANDLE"> have been fixed (5.12.2).

=item *

A memory leak in Perl_rename() due to a double allocation has been
fixed (5.12.2).

=item *

A memory leak in vms_fid_to_name() (used by realpath() and
realname()) has been fixed (5.12.2).

=back

=head3 Windows

See also L</"fork() emulation will not wait for signalled children"> and
L</"Perl source code is read in text mode on Windows">, above.

=over 4

=item *

Fixed build process for SDK2003SP1 compilers.

=item *

Compilation with Visual Studio 2010 is now supported.

=item *

When using old 32-bit compilers, the define C<_USE_32BIT_TIME_T> is now
set in C<$Config{ccflags}>.  This improves portability when XS extensions
are compiled with newer compilers for a Perl that was itself built with an
old 32-bit compiler.

=item *

C<$Config{gccversion}> is now set correctly when Perl is built using the
mingw64 compiler from L<http://mingw64.org> [perl #73754].

=item *

When building Perl with the mingw64 x64 cross-compiler, the C<incpath>,
C<libpth>, C<ldflags>, C<lddlflags> and C<ldflags_nolargefiles> values
in F<Config.pm> and F<Config_heavy.pl> were not previously being set
correctly because, with that compiler, the include and lib directories
are not immediately below C<$(CCHOME)> (5.12.2).

=item *

The build process proceeds more smoothly with mingw and dmake when
F<C:\MSYS\bin> is in the PATH, due to a C<Cwd> fix.

=item *

Support for building with Visual C++ 2010 is now underway, but is not yet
complete.  See F<README.win32> or L<perlwin32> for more details.

=item *

The option to use an externally-supplied crypt(), or to build with no
crypt() at all, has been removed.  Perl supplies its own crypt()
implementation for Windows, and the political situation that required
this part of the distribution to sometimes be omitted is long gone.

=back
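
The C<%Config> values mentioned in the list above can be inspected from
Perl itself.  The following is only a sketch with hypothetical variable
names; what it prints depends entirely on how the local perl was built:

    use strict;
    use warnings;
    use Config;

    # %Config is a read-only hash exported by the core Config module.
    my $ccflags    = defined $Config{ccflags}    ? $Config{ccflags}    : '';
    my $gccversion = defined $Config{gccversion} ? $Config{gccversion} : '';

    # True only on builds where the define was added to ccflags.
    my $has_32bit_time_t =
        index($ccflags, '_USE_32BIT_TIME_T') >= 0 ? 1 : 0;

    print "ccflags:          $ccflags\n";
    print "gccversion:       $gccversion\n";
    print "32-bit time_t on: $has_32bit_time_t\n";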

=head1 Internal Changes

=head2 New APIs

=head3 CLONE_PARAMS structure added to ease correct thread creation

Modules that create threads should now create C<CLONE_PARAMS> structures
by calling the new function Perl_clone_params_new(), and free them with
Perl_clone_params_del().  This will ensure compatibility with any future
changes to the internals of the C<CLONE_PARAMS> structure layout, and that
it is correctly allocated and initialised.

=head3 New parsing functions

Several functions have been added for parsing Perl statements and
expressions.  These functions are meant to be used by XS code invoked
during Perl parsing, in a recursive-descent manner, to allow modules to
augment the standard Perl syntax.

=over

=item *

L<parse_stmtseq()|perlapi/parse_stmtseq>
parses a sequence of statements, up to closing brace or EOF.

=item *

L<parse_fullstmt()|perlapi/parse_fullstmt>
parses a complete Perl statement, including optional label.

=item *

L<parse_barestmt()|perlapi/parse_barestmt>
parses a statement without a label.

=item *

L<parse_block()|perlapi/parse_block>
parses a code block.

=item *

L<parse_label()|perlapi/parse_label>
parses a statement label, separate from statements.

=item *

L<C<parse_fullexpr()>|perlapi/parse_fullexpr>,
L<C<parse_listexpr()>|perlapi/parse_listexpr>,
L<C<parse_termexpr()>|perlapi/parse_termexpr>, and
L<C<parse_arithexpr()>|perlapi/parse_arithexpr>
parse expressions at various precedence levels.

=back

=head3 Hints hash API

A new C API for introspecting the hinthash C<%^H> at runtime has been
added.  See C<cop_hints_2hv>, C<cop_hints_fetchpvn>, C<cop_hints_fetchpvs>,
C<cop_hints_fetchsv>, and C<hv_copy_hints_hv> in L<perlapi> for details.

A new, experimental API has been added for accessing the internal
structure that Perl uses for C<%^H>.  See the functions beginning with
C<cophh_> in L<perlapi>.

=head3 C interface to caller()

The C<caller_cx> function has been added as an XSUB-writer's equivalent of
caller().  See L<perlapi> for details.

=head3 Custom per-subroutine check hooks

XS code in an extension module can now annotate a subroutine (whether
implemented in XS or in Perl) so that nominated XS code will be called
at compile time (specifically as part of op checking) to change the op
tree of that subroutine.  The compile-time check function (supplied by
the extension module) can implement argument processing that can't be
expressed as a prototype, generate customised compile-time warnings,
perform constant folding for a pure function, inline a subroutine
consisting of sufficiently simple ops, replace the whole call with a
custom op, and so on.  This was previously all possible by hooking the
C<entersub> op checker, but the new mechanism makes it easy to tie the
hook to a specific subroutine.  See L<perlapi/cv_set_call_checker>.

To help in writing custom check hooks, several subtasks within standard
C<entersub> op checking have been separated out and exposed in the API.

=head3 Improved support for custom OPs

Custom ops can now be registered with the new C<custom_op_register> C
function and the C<XOP> structure.  This will make it easier to add new
properties of custom ops in the future.  Two new properties have been added
already, C<xop_class> and C<xop_peep>.

C<xop_class> is one of the OA_*OP constants.  It allows L<B> and other
introspection mechanisms to work with custom ops
that aren't BASEOPs.  C<xop_peep> is a pointer to
a function that will be called for ops of this
type from C<Perl_rpeep>.

See L<perlguts/Custom Operators> and L<perlapi/Custom Operators> for more
detail.

The old C<PL_custom_op_names>/C<PL_custom_op_descs> interface is still
supported but discouraged.

=head3 Scope hooks

It is now possible for XS code to hook into Perl's lexical scope
mechanism at compile time, using the new C<Perl_blockhook_register>
function.  See L<perlguts/"Compile-time scope hooks">.

=head3 The recursive part of the peephole optimizer is now hookable

In addition to C<PL_peepp>, for hooking into the toplevel peephole optimizer, a
C<PL_rpeepp> is now available to hook into the optimizer recursing into
side-chains of the optree.

=head3 New non-magical variants of existing functions

The following functions/macros have been added to the API.  The C<*_nomg>
macros are equivalent to their non-C<_nomg> variants, except that they ignore
get-magic.  Those ending in C<_flags> allow one to specify whether
get-magic is processed.

  sv_2bool_flags
  SvTRUE_nomg
  sv_2nv_flags
  SvNV_nomg
  sv_cmp_flags
  sv_cmp_locale_flags
  sv_eq_flags
  sv_collxfrm_flags

In some of these cases, the non-C<_flags> functions have
been replaced with wrappers around the new functions. 

=head3 pv/pvs/sv versions of existing functions

Many functions ending with pvn now have equivalent C<pv/pvs/sv> versions.

=head3 List op-building functions

List op-building functions have been added to the
API.  See L<op_append_elem|perlapi/op_append_elem>,
L<op_append_list|perlapi/op_append_list>, and
L<op_prepend_elem|perlapi/op_prepend_elem> in L<perlapi>.

=head3 C<LINKLIST>

The L<LINKLIST|perlapi/LINKLIST> macro, part of op building that
constructs the execution-order op chain, has been added to the API.

=head3 Localisation functions

The C<save_freeop>, C<save_op>, C<save_pushi32ptr> and C<save_pushptrptr>
functions have been added to the API.

=head3 Stash names

A stash can now have a list of effective names in addition to its usual
name.  The first effective name can be accessed via the C<HvENAME> macro,
which is now the recommended name to use in MRO linearisations (C<HvNAME>
being a fallback if there is no C<HvENAME>).

These names are added and deleted via C<hv_ename_add> and
C<hv_ename_delete>.  These two functions are I<not> part of the API.

=head3 New functions for finding and removing magic

The L<C<mg_findext()>|perlapi/mg_findext> and
L<C<sv_unmagicext()>|perlapi/sv_unmagicext>
functions have been added to the API.
They allow extension authors to find and remove magic attached to
scalars based on both the magic type and the magic virtual table, similar to how
sv_magicext() attaches magic of a certain type and with a given virtual table
to a scalar.  This eliminates the need for extensions to walk the list of
C<MAGIC> pointers of an C<SV> to find the magic that belongs to them.

=head3 C<find_rundefsv>

This function returns the SV representing C<$_>, whether it's lexical
or dynamic.

=head3 C<Perl_croak_no_modify>

Perl_croak_no_modify() is short-hand for
C<Perl_croak("%s", PL_no_modify)>.

=head3 C<PERL_STATIC_INLINE> define

The C<PERL_STATIC_INLINE> define has been added to provide the best-guess
incantation to use for static inline functions, if the C compiler supports
C99-style static inline.  If it doesn't, it'll give a plain C<static>.

C<HAS_STATIC_INLINE> can be used to check if the compiler actually supports
inline functions.

=head3 New C<pv_escape> option for hexadecimal escapes

A new option, C<PERL_PV_ESCAPE_NONASCII>, has been added to C<pv_escape> to
dump all characters above ASCII in hexadecimal.  Before, one could get all
characters as hexadecimal or the Latin1 non-ASCII as octal.

=head3 C<lex_start>

C<lex_start> has been added to the API, but is considered experimental.

=head3 op_scope() and op_lvalue()

The op_scope() and op_lvalue() functions have been added to the API,
but are considered experimental.

=head2 C API Changes

=head3 C<PERL_POLLUTE> has been removed

The option to define C<PERL_POLLUTE> to expose older 5.005 symbols for
backwards compatibility has been removed.  Its use was always discouraged,
and MakeMaker contains a more specific escape hatch:

    perl Makefile.PL POLLUTE=1

This can be used for modules that have not been upgraded to 5.6 naming
conventions (and really should be completely obsolete by now).

=head3 Check API compatibility when loading XS modules

When Perl's API changes in incompatible ways (which usually happens between
major releases), XS modules compiled for previous versions of Perl will no
longer work.  They need to be recompiled against the new Perl.

The C<XS_APIVERSION_BOOTCHECK> macro has been added to ensure that modules
are recompiled and to prevent users from accidentally loading modules
compiled for old perls into newer perls.  That macro, which is called when
loading every newly compiled extension, compares the API version of the
running perl with the version a module has been compiled for and raises an
exception if they don't match.

=head3 Perl_fetch_cop_label

The first argument of the C API function C<Perl_fetch_cop_label> has changed
from C<struct refcounted_he *> to C<COP *>, to insulate the user from
implementation details.

This API function was marked as "may change", and likely isn't in use outside
the core.  (Neither an unpacked CPAN nor Google's codesearch finds any other
references to it.)

=head3 GvCV() and GvGP() are no longer lvalues

The new GvCV_set() and GvGP_set() macros are now provided to replace
assignment to those two macros.

This allows a future commit to eliminate some backref magic between GV
and CVs, which will require complete control over assignment to the
C<gp_cv> slot.

=head3 CvGV() is no longer an lvalue

Under some circumstances, the CvGV() field of a CV is now
reference-counted.  To ensure consistent behaviour, direct assignment to
it, for example C<CvGV(cv) = gv>, is now a compile-time error.  A new macro,
C<CvGV_set(cv,gv)>, has been introduced to perform this operation
safely.  Note that modification of this field is not part of the public
API, regardless of this new macro (and despite its being listed in this section).

=head3 CvSTASH() is no longer an lvalue

The CvSTASH() macro can now only be used as an rvalue.  CvSTASH_set()
has been added to replace assignment to CvSTASH().  This is to ensure
that backreferences are handled properly.  These macros are not part of the
API.

=head3 Calling conventions for C<newFOROP> and C<newWHILEOP>

The way the parser handles labels has been cleaned up and refactored.  As a
result, the newFOROP() constructor function no longer takes a parameter
stating what label is to go in the state op.

The newWHILEOP() and newFOROP() functions no longer accept a line
number as a parameter.

=head3 Flags passed to C<uvuni_to_utf8_flags> and C<utf8n_to_uvuni>

Some of the flags parameters to uvuni_to_utf8_flags() and
utf8n_to_uvuni() have changed.  This is a result of Perl's now allowing
internal storage and manipulation of code points that are problematic
in some situations.  Hence, the default actions for these functions have
been complemented to allow these code points.  The new flags are
documented in L<perlapi>.  Code that requires the problematic code
points to be rejected needs to change to use the new flags.  Some flag
names are retained for backward source compatibility, though they do
nothing, as they are now the default.  However the flags
C<UNICODE_ALLOW_FDD0>, C<UNICODE_ALLOW_FFFF>, C<UNICODE_ILLEGAL>, and
C<UNICODE_IS_ILLEGAL> have been removed, as they stem from a
fundamentally broken model of how the Unicode non-character code points
should be handled, which is now described in
L<perlunicode/Non-character code points>.  See also the Unicode section
under L</Selected Bug Fixes>.

=head2 Deprecated C APIs

=over

=item C<Perl_ptr_table_clear>

C<Perl_ptr_table_clear> is no longer part of Perl's public API.  Calling it
now generates a deprecation warning, and it will be removed in a future
release.

=item C<sv_compile_2op>

The sv_compile_2op() API function is now deprecated.  Searches suggest
that nothing on CPAN is using it, so this should have zero impact.

It attempted to provide an API to compile code down to an optree, but failed
to bind correctly to lexicals in the enclosing scope.  It's not possible to
fix this problem within the constraints of its parameters and return value.

=item C<find_rundefsvoffset>

The C<find_rundefsvoffset> function has been deprecated.  It appeared that
its design was insufficient for reliably getting the lexical C<$_> at
run-time.

Use the new C<find_rundefsv> function or the C<UNDERBAR> macro
instead.  They directly return the right SV
representing C<$_>, whether it's
lexical or dynamic.

=item C<CALL_FPTR> and C<CPERLscope>

These macros are left over from an old implementation of C<MULTIPLICITY> using C++ objects,
which was removed in Perl 5.8.  Nowadays these macros do exactly nothing, so
they shouldn't be used anymore.

For compatibility, they are still defined for external C<XS> code.  Only
extensions defining C<PERL_CORE> must be updated now.

=back

=head2 Other Internal Changes

=head3 Stack unwinding

The protocol for unwinding the C stack at the last stage of a C<die>
has changed how it identifies the target stack frame.  This now uses
a separate variable C<PL_restartjmpenv>, where previously it relied on
the C<blk_eval.cur_top_env> pointer in the C<eval> context frame that
has nominally just been discarded.  This change means that code running
during various stages of Perl-level unwinding no longer needs to take
care to avoid destroying the ghost frame.

=head3 Scope stack entries

The format of entries on the scope stack has been changed, resulting in a
reduction of memory usage of about 10%.  In particular, the memory used by
the scope stack to record each active lexical variable has been halved.

=head3 Memory allocation for pointer tables

Memory allocation for pointer tables has been changed.  Previously
C<Perl_ptr_table_store> allocated memory from the same arena system as
C<SV> bodies and C<HE>s, with freed memory remaining bound to those arenas
until interpreter exit.  Now it allocates memory from arenas private to the
specific pointer table, and that memory is returned to the system when
C<Perl_ptr_table_free> is called.  Additionally, allocation and release are
both less CPU intensive.

=head3 C<UNDERBAR>

The C<UNDERBAR> macro now calls C<find_rundefsv>.  C<dUNDERBAR> is now a
noop but should still be used to ensure past and future compatibility.

=head3 String comparison routines renamed

The C<ibcmp_*> functions have been renamed and are now called C<foldEQ>,
C<foldEQ_locale>, and C<foldEQ_utf8>.  The old names are still available as
macros.

=head3 C<chop> and C<chomp> implementations merged

The opcode bodies for C<chop> and C<chomp> and for C<schop> and C<schomp>
have been merged.  The implementation functions Perl_do_chop() and
Perl_do_chomp(), never part of the public API, have been merged and
moved to a static function in F<pp.c>.  This shrinks the Perl binary
slightly, and should not affect any code outside the core (unless it is
relying on the order of side-effects when C<chomp> is passed a I<list> of
values).

=head1 Selected Bug Fixes

=head2 I/O

=over 4

=item *

Perl no longer produces this warning:

    $ perl -we 'open(my $f, ">", \my $x); binmode($f, "scalar")'
    Use of uninitialized value in binmode at -e line 1.

=item *

Opening a glob reference via C<< open($fh, ">", \*glob) >> no longer
causes the glob to be corrupted when the filehandle is printed to.  This would
cause Perl to crash whenever the glob's contents were accessed
[perl #77492].

=item *

PerlIO no longer crashes when called recursively, such as from a signal
handler.  Now it just leaks memory [perl #75556].

=item *

Most I/O functions were not warning for unopened handles unless the
"closed" and "unopened" warnings categories were both enabled.  Now only
C<use warnings 'unopened'> is necessary to trigger these warnings, as
had always been the intention.

=item *

There have been several fixes to PerlIO layers:

When C<binmode(FH, ":crlf")> pushes the C<:crlf> layer on top of the stack,
it no longer enables crlf layers lower in the stack so as to avoid
unexpected results [perl #38456].

Opening a file in C<:raw> mode now does what it advertises to do (first
open the file, then C<binmode> it), instead of simply leaving off the top
layer [perl #80764].

The three layers C<:pop>, C<:utf8>, and C<:bytes> didn't allow stacking when
opening a file.  For example, this:

    open(FH, ">:pop:perlio", "some.file") or die $!;

would throw an "Invalid argument" error.  This has been fixed in this
release [perl #82484].

=back
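
To make the layer behaviour above more concrete, here is a small sketch
that pushes C<:crlf> with C<binmode> and then reads the file back with
C<:raw>.  The file is a temporary one and all names are hypothetical.
On a typical Unix build, where the default layers do no line-ending
translation, the stored bytes should end in CR LF:

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Write one line through a :crlf layer pushed with binmode().
    my ($out, $path) = tempfile(UNLINK => 1);
    binmode($out, ":crlf") or die "binmode failed: $!";
    print {$out} "line\n";
    close($out) or die "close failed: $!";

    # Read the file back untranslated to see the stored bytes.
    open(my $in, "<:raw", $path) or die "open failed: $!";
    my $bytes = do { local $/; <$in> };
    close($in);

    my $hex = join " ", map { sprintf "%02x", ord } split //, $bytes;
    print "$hex\n";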

=head2 Regular Expression Bug Fixes

=over

=item *

The regular expression engine no longer loops when matching
C<"\N{LATIN SMALL LIGATURE FF}" =~ /f+/i> and similar expressions
[perl #72998] (5.12.1).

=item *

The trie runtime code should no longer allocate massive amounts of memory
[perl #74484].

=item *

Syntax errors in C<< (?{...}) >> blocks no longer cause panic messages
[perl #2353].

=item *

A pattern like C<(?:(o){2})?> no longer causes a "panic" error
[perl #39233].

=item *

A fatal error in regular expressions containing C<(.*?)> when processing
UTF-8 data has been fixed [perl #75680] (5.12.2).

=item *

An erroneous regular expression engine optimisation that caused regex verbs like
C<*COMMIT> sometimes to be ignored has been removed.

=item *

The regular expression bracketed character class C<[\8\9]> was effectively the
same as C<[89\000]>, incorrectly matching a NULL character.  It also gave
incorrect warnings that the C<8> and C<9> were ignored.  Now C<[\8\9]> is the
same as C<[89]> and gives legitimate warnings that C<\8> and C<\9> are
unrecognized escape sequences, passed-through.

=item *

A regular expression match in the right-hand side of a global substitution
(C<s///g>) that is in the same scope will no longer cause match variables
to have the wrong values on subsequent iterations.  This can happen when an
array or hash subscript is interpolated in the right-hand side, as in
C<s|(.)|@a{ print($1), /./ }|g> [perl #19078].

=item *

Several cases in which characters in the Latin-1 non-ASCII range (0x80 to
0xFF) used not to match themselves, or used to match both a character class
and its complement, have been fixed.  For instance, U+00E2 could match both
C<\w> and C<\W> [perl #78464] [perl #18281] [perl #60156].

=item *

Matching a Unicode character against an alternation containing characters
that happened to match continuation bytes in the former's UTF8
representation (like C<qq{\x{30ab}} =~ /\xab|\xa9/>) would cause erroneous
warnings [perl #70998].

=item *

The trie optimisation was not taking empty groups into account, preventing
"foo" from matching C</\A(?:(?:)foo|bar|zot)\z/> [perl #78356].

=item *

A pattern containing a C<+> inside a lookahead would sometimes cause an
incorrect match failure in a global match (for example, C</(?=(\S+))/g>)
[perl #68564].

=item *

A regular expression optimisation would sometimes cause a match with a
C<{n,m}> quantifier to fail when it should have matched [perl #79152].

=item *

Case-insensitive matching in regular expressions compiled under
C<use locale> now works much more sanely when the pattern or target
string is internally encoded in UTF8.  Previously, under these
conditions the localeness was completely lost.  Now, code points
above 255 are treated as Unicode, but code points between 0 and 255
are treated using the current locale rules, regardless of whether
the pattern or the string is encoded in UTF8.  The few case-insensitive
matches that cross the 255/256 boundary are not allowed.  For
example, 0xFF does not caselessly match the character at 0x178,
LATIN CAPITAL LETTER Y WITH DIAERESIS, because 0xFF may not be LATIN
SMALL LETTER Y in the current locale, and Perl has no way of knowing
if that character even exists in the locale, much less what code
point it is.

=item *

The C<(?|...)> regular expression construct no longer crashes if the final
branch has more sets of capturing parentheses than any other branch.  This
was fixed in Perl 5.10.1 for the case of a single branch, but that fix did
not take multiple branches into account [perl #84746].

=item *

A bug has been fixed in the implementation of C<{...}> quantifiers in
regular expressions that prevented the code block in
C</((\w+)(?{ print $2 })){2}/> from seeing the C<$2> sometimes
[perl #84294].

=back
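
The empty-group trie fix listed above [perl #78356] can be checked with a
short script.  This is only a sketch: the variable names are illustrative,
and the expected output assumes Perl 5.14.0 or later.

    use strict;
    use warnings;

    # An alternation whose first branch begins with an empty group.
    my $re = qr/\A(?:(?:)foo|bar|zot)\z/;

    # Before the fix, "foo" was wrongly rejected by this pattern.
    my @matched = grep { $_ =~ $re } qw(foo bar zot baz);
    print "@matched\n";    # foo bar zot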

=head2 Syntax/Parsing Bugs

=over

=item *

C<when (scalar) {...}> no longer crashes, but produces a syntax error
[perl #74114] (5.12.1).

=item *

A label right before a string eval (C<foo: eval $string>) no longer causes
the label to be associated also with the first statement inside the eval
[perl #74290] (5.12.1).

=item *

The C<no 5.13.2> form of C<no> no longer tries to turn on features or
pragmata (like L<strict>) [perl #70075] (5.12.2).

=item *

C<BEGIN {require 5.12.0}> now behaves as documented, rather than behaving
identically to C<use 5.12.0>.  Previously, C<require> in a C<BEGIN> block
was erroneously executing the C<use feature ':5.12.0'> and
C<use strict> behaviour, which only C<use> was documented to
provide [perl #69050].

=item *

A regression introduced in Perl 5.12.0, which made
C<< my $x = 3; $x = length(undef) >> leave C<$x> set to C<3>, has been
fixed.  C<$x> will now be C<undef> [perl #85508] (5.12.2).

=item *

When strict "refs" mode is off, C<%{...}> in rvalue context returns
C<undef> if its argument is undefined.  An optimisation introduced in Perl
5.12.0 to make C<keys %{...}> faster when used as a boolean did not take
this into account, causing C<keys %{+undef}> (and C<keys %$foo> when
C<$foo> is undefined) to be an error, which it should be in strict
mode only [perl #81750].

=item *

Constant-folding used to cause

  $text =~ ( 1 ? /phoo/ : /bear/)

to turn into

  $text =~ /phoo/

at compile time.  Now it correctly matches against C<$_> [perl #20444].

=item *

Parsing Perl code (either with string C<eval> or by loading modules) from
within a C<UNITCHECK> block no longer causes the interpreter to crash
[perl #70614].

=item *

String C<eval>s no longer fail after 2 billion scopes have been
compiled [perl #83364].

=item *

The parser no longer hangs when encountering certain Unicode characters,
such as U+387 [perl #74022].

=item *

Defining a constant with the same name as one of Perl's special blocks
(like C<INIT>) stopped working in 5.12.0, but has now been fixed
[perl #78634].

=item *

A reference to a literal value used as a hash key (C<$hash{\"foo"}>) used
to be stringified, even if the hash was tied [perl #79178].

=item *

A closure containing an C<if> statement followed by a constant or variable
is no longer treated as a constant [perl #63540].

=item *

C<state> can now be used with attributes.  It
used to mean the same thing as
C<my> if any attributes were present [perl #68658].

=item *

Expressions like C<< @$a > 3 >> no longer cause C<$a> to be mentioned in
the "Use of uninitialized value in numeric gt" warning when C<$a> is
undefined (since it is not part of the C<< > >> expression, but the operand
of the C<@>) [perl #72090].

=item *

Accessing an element of a package array with a hard-coded number (as
opposed to an arbitrary expression) would crash if the array did not exist.
Usually the array would be autovivified during compilation, but typeglob
manipulation could remove it, as in these two cases which used to crash:

  *d = *a;  print $d[0];
  undef *d; print $d[0];

=item *

The B<-C> command-line option, when used on the shebang line, can now be
followed by other options [perl #72434].

=item *

The C<B> module was returning C<B::OP>s instead of C<B::LOGOP>s for
C<entertry> [perl #80622].  This was due to a bug in the Perl core,
not in C<B> itself.

=back
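
As a small illustration of the C<length(undef)> fix noted above, the
following sketch (variable name chosen for illustration) shows the
assignment overwriting the old value, assuming Perl 5.12.2 or later:

    use strict;
    use warnings;

    my $x = 3;
    $x = length(undef);    # length of undef is undef, not 0 or 3

    print defined $x ? "defined ($x)\n" : "undef\n";    # undef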

=head2 Stashes, Globs and Method Lookup

Perl 5.10.0 introduced a new internal mechanism for caching MROs (method
resolution orders, or lists of parent classes; aka "isa" caches) to make
method lookup faster (so C<@ISA> arrays would not have to be searched
repeatedly).  Unfortunately, this brought with it quite a few bugs.  Almost
all of these have been fixed now, along with a few MRO-related bugs that
existed before 5.10.0:

=over

=item *

The following used to have erratic effects on method resolution, because
the "isa" caches were not reset or otherwise ended up listing the wrong
classes.  These have been fixed.

=over

=item Aliasing packages by assigning to globs [perl #77358]

=item Deleting packages by deleting their containing stash elements

=item Undefining the glob containing a package (C<undef *Foo::>)

=item Undefining an ISA glob (C<undef *Foo::ISA>)

=item Deleting an ISA stash element (C<delete $Foo::{ISA}>)

=item Sharing @ISA arrays between classes (via C<*Foo::ISA = \@Bar::ISA> or
C<*Foo::ISA = *Bar::ISA>) [perl #77238]

=back

C<undef *Foo::ISA> would even stop a new C<@Foo::ISA> array from updating
caches.

=item *

Typeglob assignments would crash if the glob's stash no longer existed, so
long as the glob assigned to were named C<ISA> or the glob on either side of
the assignment contained a subroutine.

=item *

C<PL_isarev>, which is accessible to Perl via C<mro::get_isarev>, is now
updated properly when packages are deleted or removed from the C<@ISA> of
other classes.  This allows many packages to be created and deleted without
causing a memory leak [perl #75176].

=back
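
One of the cache-invalidation cases above, C<undef *Foo::ISA> followed by a
fresh C<@Foo::ISA>, might be exercised like this.  The package names
C<Base1>, C<Base2>, and C<Child> are hypothetical, and the expected output
assumes the fixed behaviour of Perl 5.14.0 or later:

    use strict;
    use warnings;

    { package Base1; sub hello { return 'Base1' } }
    { package Base2; sub hello { return 'Base2' } }
    { package Child; our @ISA = ('Base1'); }

    my $before = Child->hello;    # resolved through Base1

    undef *Child::ISA;            # throws the old @ISA away
    @Child::ISA = ('Base2');      # a brand-new @Child::ISA

    my $after = Child->hello;     # must not use the stale cache

    print "$before -> $after\n";  # Base1 -> Base2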

In addition, various other bugs related to typeglobs and stashes have been
fixed:

=over 

=item *

Some work has been done on the internal pointers that link between symbol
tables (stashes), typeglobs, and subroutines.  This has the effect that
various edge cases related to deleting stashes or stash entries (for example,
C<%FOO:: = ()>), and complex typeglob or code-reference aliasing, will no
longer crash the interpreter.

=item *

Assigning a reference to a glob copy now assigns to a glob slot instead of
overwriting the glob with a scalar [perl #1804] [perl #77508].

=item *

A bug when replacing the glob of a loop variable within the loop has been
fixed [perl #21469].  This means the following code will no longer crash:

    for $x (...) {
        *x = *y;
    }

=item *

Assigning a glob to a PVLV used to convert it to a plain string.  Now it
works correctly, and a PVLV can hold a glob.  This would happen when a
nonexistent hash or array element was passed to a subroutine:

  sub { $_[0] = *foo }->($hash{key});
  # $_[0] would have been the string "*main::foo"

It also happened when a glob was assigned to, or returned from, an element
of a tied array or hash [perl #36051].

=item *

When trying to report C<Use of uninitialized value $Foo::BAR>, crashes could
occur if the glob holding the global variable in question had been detached
from its original stash by, for example, C<delete $::{"Foo::"}>.  This has
been fixed by disabling the reporting of variable names in those
cases.

=item *

During the restoration of a localised typeglob on scope exit, any
destructors called as a result would be able to see the typeglob in an
inconsistent state, containing freed entries, which could result in a
crash.  This would affect code like this:

  local *@;
  eval { die bless [] }; # puts an object in $@
  sub DESTROY {
    local $@; # boom
  }

Now the glob entries are cleared before any destructors are called.  This
also means that destructors can vivify entries in the glob.  So Perl tries
again and, if the entries are re-created too many times, dies with a
"panic: gp_free ..." error message.

=item *

If a typeglob is freed while a subroutine attached to it is still
referenced elsewhere, the subroutine is renamed to C<__ANON__> in the same
package, unless the package has been undefined, in which case the C<__ANON__>
package is used.  This could sometimes cause packages to be autovivified,
for example if the package had been deleted.  Now this no longer occurs.
The C<__ANON__> package is also now used when the original package is
no longer attached to the symbol table.  This avoids memory leaks in some
cases [perl #87664].

=item *

Subroutines and package variables inside a package whose name ends with
C<::> can now be accessed with a fully qualified name.

=back
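
The PVLV change described above can be observed from Perl code.  In this
sketch (hash and key names are illustrative), a nonexistent hash element is
passed to a subroutine that assigns a glob to it; on Perl 5.14.0 or later
the element should end up holding a real glob rather than a string:

    use strict;
    use warnings;

    my %hash;

    # $hash{key} does not exist yet, so the sub receives a deferred
    # element (a PVLV) in $_[0].
    sub { $_[0] = *STDOUT }->($hash{key});

    print ref(\$hash{key}), "\n";    # GLOB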

=head2 Unicode

=over

=item *

What has become known as "the Unicode Bug" is almost completely resolved in
this release.  Under C<use feature 'unicode_strings'> (which is
automatically selected by C<use 5.012> and above), the internal
storage format of a string no longer affects the external semantics
[perl #58182].

There are two known exceptions:

=over

=item 1

The now-deprecated, user-defined case-changing
functions require utf8-encoded strings to operate.  The CPAN module
L<Unicode::Casing> has been written to replace this feature without its
drawbacks, and the feature is scheduled to be removed in 5.16.

=item 2

quotemeta() (and its in-line equivalent C<\Q>) can also give different
results depending on whether a string is encoded in UTF-8.  See
L<perlunicode/The "Unicode Bug">.

=back

=item *

Handling of Unicode non-character code points has changed.
Previously they were mostly considered illegal, except that in some
places only one of the 66 of them was known.  The Unicode Standard
considers them all legal, but forbids their "open interchange".
This is part of the change to allow internal use of any code
point (see L</Core Enhancements>).  Together, these changes resolve
[perl #38722], [perl #51918], [perl #51936], and [perl #63446].

=item *

Case-insensitive C<"/i"> regular expression matching of Unicode
characters that match multiple characters now works much more as
intended.  For example

 "\N{LATIN SMALL LIGATURE FFI}" =~ /ffi/ui

and

 "ffi" =~ /\N{LATIN SMALL LIGATURE FFI}/ui

are both true.  Previously, there were many bugs with this feature.
What hasn't been fixed are the places where the pattern contains the
multiple characters, but the characters are split up by other things,
such as in

 "\N{LATIN SMALL LIGATURE FFI}" =~ /(f)(f)i/ui

or

 "\N{LATIN SMALL LIGATURE FFI}" =~ /ffi*/ui

or

 "\N{LATIN SMALL LIGATURE FFI}" =~ /[a-f][f-m][g-z]/ui

None of these match.

Also, this matching doesn't fully conform to the current Unicode
Standard, which asks that the matching be made upon the NFD
(Normalization Form Decomposed) of the text.  However, as of this
writing (April 2010), the Unicode Standard is in flux about
what it will recommend with regard to such scenarios.  It may be
that the whole concept of multi-character matches will be thrown out
[perl #71736].

=item *

Naming a deprecated character in C<\N{I<NAME>}> no longer leaks memory.

=item *

We fixed a bug that could cause C<\N{I<NAME>}> constructs followed by
a single C<"."> to be parsed incorrectly [perl #74978] (5.12.1).

=item *

C<chop> now correctly handles characters above C<"\x{7fffffff}">
[perl #73246].

=item *

Passing to C<index> an offset beyond the end of the string when the string
is encoded internally in UTF8 no longer causes panics [perl #75898].

=item *

warn() and die() now respect utf8-encoded scalars [perl #45549].

=item *

Sometimes the UTF8 length cache would not be reset on a value
returned by substr, causing C<length(substr($uni_string, ...))> to give
wrong answers.  With C<${^UTF8CACHE}> set to -1, it would also produce
a "panic" error message [perl #77692].

=back
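
To make the "Unicode Bug" item above concrete, here is a minimal sketch
(variable names are illustrative) comparing two strings that contain the
same character but differ in internal storage.  Under
C<use feature 'unicode_strings'> both should upper-case identically:

    use strict;
    use warnings;
    use feature 'unicode_strings';

    my $native   = "\xE9";       # LATIN SMALL LETTER E WITH ACUTE
    my $upgraded = "\xE9";
    utf8::upgrade($upgraded);    # same character, UTF-8 internally

    my $same = uc($native) eq uc($upgraded);
    print $same ? "same\n" : "different\n";    # same
    printf "%vX\n", uc($native);               # C9

Without the feature in scope, the natively stored string would typically be
left unchanged by C<uc>, which is the inconsistency this release addresses.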

=head2 Ties, Overloading and Other Magic

=over

=item *

Overloading now works properly in conjunction with tied
variables.  What formerly happened was that most ops checked their
arguments for overloading I<before> checking for magic, so for example
an overloaded object returned by a tied array access would usually be
treated as not overloaded [perl #57012].

=item *

Various instances of magic (like tie methods) being called on tied variables
too many or too few times have been fixed:

=over

=item *

C<< $tied->() >> did not always call FETCH [perl #8438].

=item *

Filetest operators and C<y///> and C<tr///> were calling FETCH too
many times.

=item *

The C<=> operator used to ignore magic on its right-hand side if the
scalar happened to hold a typeglob (if a typeglob was the last thing
returned from or assigned to a tied scalar) [perl #77498].

=item *

Dereference operators used to ignore magic if the argument was a
reference already (such as from a previous FETCH) [perl #72144].

=item *

C<splice> now calls set-magic (so changes made
by C<splice @ISA> are respected by method calls) [perl #78400].

=item *

In-memory files created by C<< open($fh, ">", \$buffer) >> were not calling
FETCH/STORE at all [perl #43789] (5.12.2).

=item *

utf8::is_utf8() now respects get-magic (like C<$1>) (5.12.1).

=back

=item *

Non-commutative binary operators used to swap their operands if the same
tied scalar was used for both operands and returned a different value for
each FETCH.  For instance, if C<$t> returned 2 the first time and 3 the
second, then C<$t/$t> would evaluate to 1.5.  This has been fixed
[perl #87708].

=item *

String C<eval> now detects taintedness of overloaded or tied
arguments [perl #75716].

=item *

String C<eval> and regular expression matches against objects with string
overloading no longer cause memory corruption or crashes [perl #77084].

=item *

L<readline|perlfunc/"readline EXPR"> now honors C<< <> >> overloading on tied
arguments.

=item *

C<< <expr> >> always respects overloading now if the expression is
overloaded.

Because "S<< <> as >> glob" was parsed differently from
"S<< <> as >> filehandle" from 5.6 onwards, something like C<< <$foo[0]> >> did
not handle overloading, even if C<$foo[0]> was an overloaded object.  This
was contrary to the documentation for L<overload>, and meant that C<< <> >>
could not be used as a general overloaded iterator operator.

=item *

The fallback behaviour of overloading on binary operators was asymmetric
[perl #71286].

=item *

Magic applied to variables in the main package no longer affects other packages.
See L</Magic variables outside the main package> above [perl #76138].

=item *

Sometimes magic (ties, taintedness, etc.) attached to variables could cause
an object to last longer than it should, or cause a crash if a tied
variable were freed from within a tie method.  These have been fixed
[perl #81230].

=item *

DESTROY methods of objects implementing ties are no longer able to crash by
accessing the tied variable through a weak reference [perl #86328].

=item *

Fixed a regression of kill() when a match variable is used for the
process ID to kill [perl #75812].

=item *

C<$AUTOLOAD> used to remain tainted forever if it ever became tainted.  Now
it is correctly untainted if an autoloaded method is called and the method
name was not tainted.

=item *

C<sprintf> now dies when passed a tainted scalar for the format.  It did
already die for arbitrary expressions, but not for simple scalars
[perl #82250].

=item *

C<lc>, C<uc>, C<lcfirst>, and C<ucfirst> no longer return untainted strings
when the argument is tainted.  This has been broken since perl 5.8.9
[perl #87336].

=back
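
The operand-order fix for tied scalars [perl #87708] can be sketched with a
tie class whose FETCH returns a new value on every call.  C<Counter> is a
hypothetical class written for this example, and the expected output assumes
Perl 5.14.0 or later:

    use strict;
    use warnings;

    {
        package Counter;    # hypothetical tie class for illustration
        sub TIESCALAR {
            my ($class, $start) = @_;
            return bless { n => $start }, $class;
        }
        sub FETCH {
            my ($self) = @_;
            return $self->{n}++;    # 2, then 3, then 4, ...
        }
    }

    tie my $t, 'Counter', 2;

    # FETCH yields 2 for the left operand and 3 for the right one.
    my $ratio = $t / $t;
    printf "%.3f\n", $ratio;    # 0.667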

=head2 The Debugger

=over

=item *

The Perl debugger now also works in taint mode [perl #76872].

=item *

Subroutine redefinition works once more in the debugger [perl #48332].

=item *

When B<-d> is used on the shebang (C<#!>) line, the debugger now has access
to the lines of the main program.  In the past, this sometimes worked and
sometimes did not, depending on the order in which things happened to be
arranged in memory [perl #71806].

=item *

A possible memory leak when using L<caller()|perlfunc/"caller EXPR"> to set
C<@DB::args> has been fixed (5.12.2).

=item *

Perl no longer stomps on C<$DB::single>, C<$DB::trace>, and C<$DB::signal> 
if these variables already have values when C<$^P> is assigned to [perl #72422].

=item *

C<#line> directives in string evals were not properly updating the arrays
of lines of code (C<< @{"_< ..."} >>) that the debugger (or any debugging or
profiling module) uses.  In threaded builds, they were not being updated at
all.  In non-threaded builds, the line number was ignored, so any change to
the existing line number would cause the lines to be misnumbered
[perl #79442].

=back

=head2 Threads

=over

=item *

Perl no longer accidentally clones lexicals in scope within active stack
frames in the parent when creating a child thread [perl #73086].

=item *

Several memory leaks in cloning and freeing threaded Perl interpreters have been
fixed [perl #77352].

=item *

Creating a new thread when directory handles were open used to cause a
crash, because the handles were not cloned, but simply passed to the new
thread, resulting in a double free.

Now directory handles are cloned properly on Windows
and on systems that have a C<fchdir> function.  On other
systems, new threads simply do not inherit directory
handles from their parent threads [perl #75154].

=item *

The typeglob C<*,>, which holds the scalar variable C<$,> (output field
separator), had the wrong reference count in child threads.

=item *

When pipes are shared between threads, the C<close> function
(and any implicit close, such as on thread exit) no longer blocks
[perl #78494].

=item *

Perl now does a timely cleanup of SVs that are cloned into a new
thread but then discovered to be orphaned (that is, their owners
are I<not> cloned).  This eliminates several "scalars leaked"
warnings when joining threads.

=back

=head2 Scoping and Subroutines

=over

=item *

Lvalue subroutines are again able to return copy-on-write scalars.  This
had been broken since version 5.10.0 [perl #75656] (5.12.3).

=item *

C<require> no longer causes C<caller> to return the wrong file name for
the scope that called C<require> and other scopes higher up that had the
same file name [perl #68712].

=item *

C<sort> with a C<($$)>-prototyped comparison routine used to cause the value
of C<@_> to leak out of the sort.  Taking a reference to C<@_> within the
sorting routine could cause a crash [perl #72334].

=item *

Match variables (like C<$1>) no longer persist between calls to a sort
subroutine [perl #76026].

=item *

Iterating with C<foreach> over an array returned by an lvalue sub now works
[perl #23790].

=item *

C<$@> is now localised during calls to C<binmode> to prevent action at a
distance [perl #78844].

=item *

Calling a closure prototype (what is passed to an attribute handler for a
closure) now results in a "Closure prototype called" error message instead
of a crash [perl #68560].

=item *

Mentioning a read-only lexical variable from the enclosing scope in a
string C<eval> no longer causes the variable to become writable
[perl #19135].

=back
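
One plausible way to observe the read-only lexical item above
[perl #19135] is to alias a lexical to a constant, mention it in a string
C<eval>, and then try to modify it.  This is a sketch under that reading of
the item; the variable names are illustrative:

    use strict;
    use warnings;

    my $status;
    for my $x (1) {    # $x aliases the read-only constant 1
        # Merely mention $x inside a string eval.
        eval q{ my $copy = $x; 1 } or die $@;

        # The assignment should still fail afterwards.
        $status = eval { $x = 2; 1 } ? "writable" : "read-only";
    }
    print "$status\n";    # read-only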

=head2 Signals

=over

=item *

Within signal handlers, C<$!> is now implicitly localized.

=item *

CHLD signals are no longer unblocked after a signal handler is called if
they were blocked before by C<POSIX::sigprocmask> [perl #82040].

=item *

A signal handler called within a signal handler could cause leaks or
double-frees.  Now fixed [perl #76248].

=back
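
The implicit localization of C<$!> noted above can be sketched as follows.
This assumes a POSIX-like platform that provides C<SIGUSR1>; the handler
deliberately clobbers C<$!>, and the value set beforehand should survive:

    use strict;
    use warnings;

    my $handled = 0;
    $SIG{USR1} = sub { $! = 0; $handled = 1 };    # clobbers $! on purpose

    $! = 2;             # an arbitrary non-zero errno value
    kill 'USR1', $$;    # signal ourselves

    # Give the deferred ("safe") signal a chance to be dispatched.
    for (1 .. 1_000_000) { last if $handled }

    my $errno_after = $! + 0;
    print "handled=$handled errno=$errno_after\n";    # handled=1 errno=2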

=head2 Miscellaneous Memory Leaks

=over

=item *

Several memory leaks when loading XS modules were fixed (5.12.2).

=item *

L<substr()|perlfunc/"substr EXPR,OFFSET,LENGTH,REPLACEMENT">,
L<pos()|perlfunc/"pos SCALAR">, L<keys()|perlfunc/"keys HASH">,
and L<vec()|perlfunc/"vec EXPR,OFFSET,BITS"> could, when used in combination
with lvalues, result in leaking the scalar value they operate on, and cause its
destruction to happen too late.  This has now been fixed.

=item *

The postincrement and postdecrement operators, C<++> and C<-->, used to cause
leaks when used on references.  This has now been fixed.

=item *

Nested C<map> and C<grep> blocks no longer leak memory when processing
large lists [perl #48004].

=item *

C<use I<VERSION>> and C<no I<VERSION>> no longer leak memory [perl #78436]
[perl #69050].

=item *

C<.=> followed by C<< <> >> or C<readline> would leak memory if C<$/>
contained characters beyond the octet range and the scalar assigned to
happened to be encoded as UTF8 internally [perl #72246].

=item *

C<eval 'BEGIN{die}'> no longer leaks memory on non-threaded builds.

=back

=head2 Memory Corruption and Crashes

=over

=item *

glob() no longer crashes when C<%File::Glob::> is empty and
C<CORE::GLOBAL::glob> isn't present [perl #75464] (5.12.2).

=item *

readline() has been fixed when interrupted by signals so it no longer
returns the "same thing" as before or random memory.

=item *

When assigning a list with duplicated keys to a hash, the assignment used to
return garbage and/or freed values:

    @a = %h = (list with some duplicate keys);

This has now been fixed [perl #31865].

=item *

The mechanism for freeing objects in globs used to leave dangling
pointers to freed SVs, meaning Perl users could see corrupted state
during destruction.

Perl now frees only the affected slots of the GV, rather than freeing
the GV itself.  This makes sure that there are no dangling refs or
corrupted state during destruction.

=item *

The interpreter no longer crashes when freeing deeply-nested arrays of
arrays.  Hashes have not been fixed yet [perl #44225].

=item *

Concatenating long strings under C<use encoding> no longer causes Perl to
crash [perl #78674].

=item *

Calling C<< ->import >> on a class lacking an import method could corrupt
the stack, resulting in strange behaviour.  For instance,

  push @a, "foo", $b = bar->import;

would assign "foo" to C<$b> [perl #63790].

=item *

The C<recv> function could crash when called with the MSG_TRUNC flag
[perl #75082].

=item *

C<formline> no longer crashes when passed a tainted format picture.  It also
taints C<$^A> now if its arguments are tainted [perl #79138].

=item *

A bug in how we process filetest operations could cause a segfault.
Filetests don't always expect an op on the stack, so we now use
TOPs only if we're sure that we're not C<stat>ing the C<_> filehandle.
This is indicated by C<OPf_KIDS> (as checked in ck_ftst) [perl #74542]
(5.12.1).

=item *

unpack() now handles scalar context correctly for C<%32H> and C<%32u>,
fixing a potential crash.  split() would crash because the third item
on the stack wasn't the regular expression it expected.  C<unpack("%2H",
...)> would return both the unpacked result and the checksum on the stack,
as would C<unpack("%2u", ...)> [perl #73814] (5.12.2).

=back
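
The C<< ->import >> stack fix above [perl #63790] can be checked with a
runnable variant of the example given there.  C<NoSuchClass> is a
hypothetical class name with no C<import> method, and the other names are
illustrative:

    use strict;
    use warnings;

    my @list;
    my $result;

    # The missing import method is silently ignored; in scalar
    # context the call yields undef instead of a stray stack item.
    push @list, "foo", $result = NoSuchClass->import;

    print defined $result ? "result=$result\n" : "result is undef\n";
    print scalar(@list), "\n";    # 2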

=head2 Fixes to Various Perl Operators

=over

=item *

The C<&>, C<|>, and C<^> bitwise operators no longer coerce read-only arguments
[perl #20661].

=item *

Stringifying a scalar containing "-0.0" no longer has the effect of turning
false into true [perl #45133].

=item *

Some numeric operators were converting integers to floating point,
resulting in loss of precision on 64-bit platforms [perl #77456].

=item *

sprintf() was ignoring locales when called with constant arguments
[perl #78632].

=item *

Combining the vector (C<%v>) flag and dynamic precision would
cause C<sprintf> to confuse the order of its arguments, making it 
treat the string as the precision and vice-versa [perl #83194].

=back

=head2 Bugs Relating to the C API

=over

=item *

The C-level C<lex_stuff_pvn> function would sometimes cause a spurious
syntax error on the last line of the file if it lacked a final semicolon
[perl #74006] (5.12.1).

=item *

The C<eval_sv> and C<eval_pv> C functions now set C<$@> correctly when
there is a syntax error and no C<G_KEEPERR> flag, and never set it if the
C<G_KEEPERR> flag is present [perl #3719].

=item *

The XS multicall API no longer causes subroutines to lose reference counts
if called via the multicall interface from within those very subroutines.
This affects modules like L<List::Util>.  Calling one of its functions with an
active subroutine as the first argument could cause a crash [perl #78070].

=item *

The C<SvPVbyte> function available to XS modules now calls magic before
downgrading the SV, to avoid warnings about wide characters [perl #72398].

=item *

The ref types in the typemap for XS bindings now support magical variables
[perl #72684].

=item *

C<sv_catsv_flags> no longer calls C<mg_get> on its second argument (the
source string) if the flags passed to it do not include SV_GMAGIC.  So it
now matches the documentation.

=item *

C<my_strftime> no longer leaks memory.  This fixes a memory leak in
C<POSIX::strftime> [perl #73520].

=item *

F<XSUB.h> now correctly redefines fgets under PERL_IMPLICIT_SYS [perl #55049]
(5.12.1).

=item *

XS code using fputc() or fputs() on Windows could cause an error
due to their arguments being swapped [perl #72704] (5.12.1).

=item *

A possible segfault in the C<T_PTROBJ> default typemap has been fixed
(5.12.2).

=item *

A bug that could cause "Unknown error" messages when
C<call_sv(code, G_EVAL)> is called from an XS destructor has been fixed
(5.12.2).

=back

=head1 Known Problems

This is a list of significant unresolved issues which are regressions
from earlier versions of Perl or which affect widely-used CPAN modules.

=over 4

=item *

C<List::Util::first> misbehaves in the presence of a lexical C<$_>
(typically introduced by C<my $_> or implicitly by C<given>).  The variable
that gets set for each iteration is the package variable C<$_>, not the
lexical C<$_>.

A similar issue may occur in other modules that provide functions which
take a block as their first argument, like

    foo { ... $_ ...} list

See also: L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=67694>

=item *

readline() returns an empty string instead of a cached previous value
when it is interrupted by a signal.

=item *

The changes in prototype handling break L<Switch>.  A patch has been sent
upstream and will hopefully appear on CPAN soon.

=item *

The upgrade to F<ExtUtils-MakeMaker-6.57_05> has caused
some tests in the F<Module-Install> distribution on CPAN to
fail. (Specifically, F<02_mymeta.t> tests 5 and 21; F<18_all_from.t>
tests 6 and 15; F<19_authors.t> tests 5, 13, 21, and 29; and
F<20_authors_with_special_characters.t> tests 6, 15, and 23 in version
1.00 of that distribution now fail.)

=item *

On VMS, C<Time::HiRes> tests will fail due to a bug in the CRTL's
implementation of C<setitimer>: previous timer values would be cleared
if a timer expired but not if the timer was reset before expiring.  HP
OpenVMS Engineering have corrected the problem and will release a patch
in due course (Quix case # QXCM1001115136).

=item *

On VMS, there were a handful of C<Module::Build> test failures we didn't
get to before the release; please watch CPAN for updates.

=back

=head1 Errata

=head2 keys(), values(), and each() work on arrays

You can now use the keys(), values(), and each() builtins on arrays;
previously you could use them only on hashes.  See L<perlfunc> for details.
This is actually a change introduced in perl 5.12.0, but it was missed from
that release's L<perl5120delta>.
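
A brief sketch of the three builtins applied to an array (the array name
and contents are illustrative):

    use strict;
    use warnings;

    my @colors = qw(red green blue);

    my @indices = keys @colors;      # array indices
    my @items   = values @colors;    # array elements

    print join(",", @indices), "\n";    # 0,1,2
    print join(",", @items), "\n";      # red,green,blue

    # each() returns an (index, element) pair per iteration.
    while (my ($index, $color) = each @colors) {
        print "$index=$color\n";
    }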

=head2 split() and C<@_>

split() no longer modifies C<@_> when called in scalar or void context.
In void context it now produces a "Useless use of split" warning.
This was also a perl 5.12.0 change that missed the perldelta.
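
The following sketch (subroutine and variable names are illustrative) shows
a scalar-context split() returning the field count while leaving the
caller-supplied C<@_> alone:

    use strict;
    use warnings;

    sub count_fields {
        my $count = split /,/, "a,b,c";    # scalar context
        return ($count, scalar @_);        # @_ still holds our argument
    }

    my ($fields, $args) = count_fields("untouched");
    print "$fields $args\n";    # 3 1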

=head1 Obituary

Randy Kobes, creator of http://kobesearch.cpan.org/ and
contributor/maintainer to several core Perl toolchain modules, passed
away on September 18, 2010 after a battle with lung cancer.  The community
was richer for his involvement.  He will be missed.

=head1 Acknowledgements

Perl 5.14.0 represents one year of development since
Perl 5.12.0 and contains nearly 550,000 lines of changes across nearly
3,000 files from 150 authors and committers.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers.  The following people are known to
have contributed the improvements that became Perl 5.14.0:

Aaron Crane, Abhijit Menon-Sen, Abigail, Ævar Arnfjörð Bjarmason,
Alastair Douglas, Alexander Alekseev, Alexander Hartmaier, Alexandr
Ciornii, Alex Davies, Alex Vandiver, Ali Polatel, Allen Smith, Andreas
König, Andrew Rodland, Andy Armstrong, Andy Dougherty, Aristotle
Pagaltzis, Arkturuz, Arvan, A. Sinan Unur, Ben Morrow, Bo Lindbergh,
Boris Ratner, Brad Gilbert, Bram, brian d foy, Brian Phillips, Casey
West, Charles Bailey, Chas. Owens, Chip Salzenberg, Chris 'BinGOs'
Williams, chromatic, Craig A. Berry, Curtis Jewell, Dagfinn Ilmari
Mannsåker, Dan Dascalescu, Dave Rolsky, David Caldwell, David Cantrell,
David Golden, David Leadbeater, David Mitchell, David Wheeler, Eric
Brine, Father Chrysostomos, Fingle Nark, Florian Ragwitz, Frank Wiegand,
Franz Fasching, Gene Sullivan, George Greer, Gerard Goossen, Gisle Aas,
Goro Fuji, Grant McLean, gregor herrmann, H.Merijn Brand, Hongwen Qiu,
Hugo van der Sanden, Ian Goodacre, James E Keenan, James Mastros, Jan
Dubois, Jay Hannah, Jerry D. Hedden, Jesse Vincent, Jim Cromie, Jirka
Hruška, John Peacock, Joshua ben Jore, Joshua Pritikin, Karl Williamson,
Kevin Ryde, kmx, Lars Dɪᴇᴄᴋᴏᴡ 迪拉斯, Larwan Berke, Leon Brocard, Leon
Timmermans, Lubomir Rintel, Lukas Mai, Maik Hentsche, Marty Pauley,
Marvin Humphrey, Matt Johnson, Matt S Trout, Max Maischein, Michael
Breen, Michael Fig, Michael G Schwern, Michael Parker, Michael Stevens,
Michael Witten, Mike Kelly, Moritz Lenz, Nicholas Clark, Nick Cleaton,
Nick Johnston, Nicolas Kaiser, Niko Tyni, Noirin Shirley, Nuno Carvalho,
Paul Evans, Paul Green, Paul Johnson, Paul Marquess, Peter J. Holzer,
Peter John Acklam, Peter Martini, Philippe Bruhat (BooK), Piotr Fusik,
Rafael Garcia-Suarez, Rainer Tammer, Reini Urban, Renee Baecker, Ricardo
Signes, Richard Möhn, Richard Soderberg, Rob Hoelz, Robin Barker, Ruslan
Zakirov, Salvador Fandiño, Salvador Ortiz Garcia, Shlomi Fish, Sinan
Unur, Sisyphus, Slaven Rezic, Steffen Müller, Steve Hay, Steven
Schubiger, Steve Peters, Sullivan Beck, Tatsuhiko Miyagawa, Tim Bunce,
Todd Rinaldo, Tom Christiansen, Tom Hukins, Tony Cook, Tye McQueen,
Vadim Konovalov, Vernon Lyon, Vincent Pit, Walt Mankowski, Wolfram
Humann, Yves Orton, Zefram, and Zsbán Ambrus.

This is woefully incomplete as it's automatically generated from version
control history.  In particular, it doesn't include the names of the
(very much appreciated) contributors who reported issues in previous
versions of Perl that helped make Perl 5.14.0 better. For a more complete
list of all of Perl's historical contributors, please see the C<AUTHORS>
file in the Perl 5.14.0 distribution.

Many of the changes included in this version originated in the CPAN
modules included in Perl's core. We're grateful to the entire CPAN
community for helping Perl to flourish.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the Perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who are able
to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please use this address for
security issues in the Perl core I<only>, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=head1 NAME

perl585delta - what is new for perl v5.8.5

=head1 DESCRIPTION

This document describes differences between the 5.8.4 release and
the 5.8.5 release.

=head1 Incompatible Changes

There are no changes incompatible with 5.8.4.

=head1 Core Enhancements

Perl's regular expression engine now contains support for matching on the
intersection of two Unicode character classes. You can also now refer to
user-defined character classes from within other user-defined character
classes.

=head1 Modules and Pragmata

=over 4

=item *

Carp improved to work nicely with Safe. Carp's message reporting should now
be anomaly free - it will always print out line number information.

=item *

CGI upgraded to version 3.05

=item *

charnames now avoids clobbering $_

=item *

Digest upgraded to version 1.08

=item *

Encode upgraded to version 2.01

=item *

FileCache upgraded to version 1.04

=item *

libnet upgraded to version 1.19

=item *

Pod::Parser upgraded to version 1.28

=item *

Pod::Perldoc upgraded to version 3.13

=item *

Pod::LaTeX upgraded to version 0.57

=item *

Safe now works properly with Carp

=item *

Scalar-List-Utils upgraded to version 1.14

=item *

Shell's documentation has been re-written, and its historical partial
auto-quoting of command arguments can now be disabled.

=item *

Test upgraded to version 1.25

=item *

Test::Harness upgraded to version 2.42

=item *

Time::Local upgraded to version 1.10

=item *

Unicode::Collate upgraded to version 0.40

=item *

Unicode::Normalize upgraded to version 0.30

=back

=head1 Utility Changes

=head2 Perl's debugger

The debugger can now emulate stepping backwards, by restarting and rerunning
all bar the last command from a saved command history.

=head2 h2ph

F<h2ph> is now able to understand a very limited set of C inline functions
-- basically, the inline functions that look like CPP macros. This has
been introduced to deal with some of the headers of the newest versions of
the glibc. The standard warning still applies; to quote F<h2ph>'s
documentation, I<you may need to dicker with the files produced>.

=head1 Installation and Configuration Improvements

Perl 5.8.5 should build cleanly from source on LynxOS.

=head1 Selected Bug Fixes

=over 4

=item *

The in-place sort optimisation introduced in 5.8.4 had a bug. For example,
in code such as

    @a = sort ($b, @a)

the result would omit the value $b. This is now fixed.

=item *

The optimisation for unnecessary assignments introduced in 5.8.4 could give
spurious warnings. This has been fixed.

=item *

Perl should now correctly detect and read BOM-marked and (BOMless) UTF-16
scripts of either endianness.

=item *

Creating a new thread when weak references exist was buggy, and would often
cause warnings at interpreter destruction time. The known bug is now fixed.

=item *

Several obscure bugs involving manipulating Unicode strings with C<substr> have
been fixed.

=item *

Previously if Perl's file globbing function encountered a directory that it
did not have permission to open it would return immediately, leading to
unexpected truncation of the list of results. This has been fixed, to be
consistent with Unix shells' globbing behaviour.

=item *

Thread creation time could vary wildly between identical runs. This was caused
by a poor hashing algorithm in the thread cloning routines, which has now
been fixed.

=item *

The internals of the ithreads implementation were not checking if OS-level
thread creation had failed. C<< threads->create() >> now returns C<undef> if
thread creation fails instead of crashing perl.

=back

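The first fix in the list above can be checked with a few lines.  This is
only a sketch: the names C<$extra> and C<@items> are invented for the example
(the entry itself uses C<$b> and C<@a>), and on any perl from 5.8.5 onwards
the extra value should be kept in the sorted result.

```perl
use strict;
use warnings;

# Hypothetical data: one extra scalar plus an existing array.
my $extra = 'b';
my @items = ('c', 'a');

# Sorting a list that includes the array being assigned to.
@items = sort ($extra, @items);

print "@items\n";    # prints: a b c
```

If the output were missing C<b>, the perl in use would still have the 5.8.4
behaviour described in that entry.
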
=head1 New or Changed Diagnostics

=over 4

=item *

C<perl -V> has several improvements:

=over 4

=item  *

correctly outputs local patch names that contain embedded code snippets
or other characters that used to confuse it.

=item * 

arguments to -V that look like regexps will give multiple lines of output.

=item *

a trailing colon suppresses the linefeed and ';'  terminator, allowing
embedding of queries into shell commands.

=item *

a leading colon removes the 'name=' part of the response, allowing mapping to
any name.

=back

=item *

When perl fails to find the specified script, it now outputs a second line
suggesting that the user use the C<-S> flag:

    $ perl5.8.5 missing.pl
    Can't open perl script "missing.pl": No such file or directory.
    Use -S to search $PATH for it.

=back

=head1 Changed Internals

The Unicode character class files used by the regular expression engine are
now built at build time from the supplied Unicode consortium data files,
instead of being shipped prebuilt. This makes the compressed Perl source
tarball about 200K smaller. A side effect is that the layout of files inside
lib/unicore has changed.

=head1 Known Problems

The regression test F<t/uni/class.t> is now performing considerably more
tests, and can take several minutes to run even on a fast machine.

=head1 Platform Specific Problems

This release is known not to build on Windows 95.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org.  There may also be
information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=head1 NAME

perldebtut - Perl debugging tutorial

=head1 DESCRIPTION

A (very) lightweight introduction to the use of the perl debugger, and a
pointer to existing, deeper sources of information on the subject of debugging
perl programs.

There's an extraordinary number of people out there who don't appear to know
anything about using the perl debugger, though they use the language every
day.  
This is for them.  


=head1 use strict

First of all, there are a few things you can do to make your life a lot more
straightforward when it comes to debugging perl programs, without using the
debugger at all.  To demonstrate, here's a simple script, named "hello", with
a problem:

	#!/usr/bin/perl

	$var1 = 'Hello World'; # always wanted to do that :-)
	$var2 = "$varl\n";

	print $var2; 
	exit;

While this compiles and runs happily, it probably won't do what's expected:
it doesn't print "Hello World\n" at all.  It will, on the other hand, do
exactly what it was told to do, computers being a bit that way inclined.  That
is, it will print out a newline character, and you'll get what looks like a
blank line.  It looks like there are 2 variables when (because of the typo)
there are really 3:

	$var1 = 'Hello World';
	$varl = undef;
	$var2 = "\n";

To catch this kind of problem, we can force each variable to be declared
before use by pulling in the strict module, by putting 'use strict;' after the
first line of the script.

Now when you run it, perl complains about the 3 undeclared variables and we
get four error messages because one variable is referenced twice:

 Global symbol "$var1" requires explicit package name at ./hello line 4.
 Global symbol "$var2" requires explicit package name at ./hello line 5.
 Global symbol "$varl" requires explicit package name at ./hello line 5.
 Global symbol "$var2" requires explicit package name at ./hello line 7.
 Execution of ./hello aborted due to compilation errors.

Luvverly! and to fix this we declare all variables explicitly and now our
script looks like this:	

	#!/usr/bin/perl
	use strict;

	my $var1 = 'Hello World';
	my $varl = undef;
	my $var2 = "$varl\n";

	print $var2; 
	exit;

We then do (always a good idea) a syntax check before we try to run it again:

	> perl -c hello
	hello syntax OK 

And now when we run it, we get "\n" still, but at least we know why.  Just
getting this script to compile has exposed the '$varl' (with the letter 'l')
variable, and simply changing $varl to $var1 solves the problem.

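As a sketch of the end result, the corrected script might look like the
following once C<$varl> is changed to C<$var1>.  Enabling C<warnings>
alongside C<strict> is an addition not made in the text above; it is assumed
here only because it would also have reported the undefined value in the
original.

```perl
#!/usr/bin/perl
use strict;
use warnings;    # assumption: not in the original script, added for illustration

my $var1 = 'Hello World';
my $var2 = "$var1\n";    # digit one, not the letter 'l'

print $var2;
```

With the typo gone there are only two variables left, and both are declared
with C<my>, so the script compiles cleanly under C<strict>.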

=head1 Looking at data and -w and v

Ok, but how about when you want to really see your data, what's in that
dynamic variable, just before using it?

	#!/usr/bin/perl 
	use strict;

	my $key = 'welcome';
	my %data = (
		'this' => qw(that), 
		'tom' => qw(and jerry),
		'welcome' => q(Hello World),
		'zip' => q(welcome),
	);
	my @data = keys %data;

	print "$data{$key}\n";
	exit;                               

Looks OK, after it's been through the syntax check (perl -c scriptname), we
run it and all we get is a blank line again!  Hmmmm.

One common debugging approach here would be to liberally sprinkle a few print
statements: one to add a check just before we print out our data, and another
just after:

	print "All OK\n" if grep($key, keys %data);
	print "$data{$key}\n";
	print "done: '$data{$key}'\n";

And try again:

	> perl data
	All OK     

	done: ''

After much staring at the same piece of code and not seeing the wood for the
trees for some time, we get a cup of coffee and try another approach.  That
is, we bring in the cavalry by giving perl the 'B<-d>' switch on the command
line:

	> perl -d data 
	Default die handler restored.

	Loading DB routines from perl5db.pl version 1.07
	Editor support available.

	Enter h or `h h' for help, or `man perldebug' for more help.

	main::(./data:4):     my $key = 'welcome';   

Now, what we've done here is to launch the built-in perl debugger on our
script.  It's stopped at the first line of executable code and is waiting for
input.

Before we go any further, you'll want to know how to quit the debugger: use
just the letter 'B<q>', not the words 'quit' or 'exit':

	DB<1> q
	>

That's it, you're back on home turf again.


=head1 help

Fire the debugger up again on your script and we'll look at the help menu.
There are a couple of ways of calling help: a simple 'B<h>' will get the summary
help list, 'B<|h>' (pipe-h) will pipe the help through your pager (which is
probably 'more' or 'less'), and finally, 'B<h h>' (h-space-h) will give you
the entire help screen.  Here is the summary page:

 DB<1>h

 List/search source lines:               Control script execution:
  l [ln|sub]  List source code            T           Stack trace
  - or .      List previous/current line  s [expr]    Single step
                                                               [in expr]
  v [line]    View around line            n [expr]    Next, steps over
                                                                    subs
  f filename  View source in file         <CR/Enter>  Repeat last n or s
  /pattern/ ?patt?   Search forw/backw    r           Return from
                                                              subroutine
  M           Show module versions        c [ln|sub]  Continue until
                                                                position
 Debugger controls:                       L           List break/watch/
                                                                 actions
  o [...]     Set debugger options        t [expr]    Toggle trace
                                                            [trace expr]
  <[<]|{[{]|>[>] [cmd] Do pre/post-prompt b [ln|event|sub] [cnd] Set
                                                              breakpoint
  ! [N|pat]   Redo a previous command     B ln|*      Delete a/all
                                                             breakpoints
  H [-num]    Display last num commands   a [ln] cmd  Do cmd before line
  = [a val]   Define/list an alias        A ln|*      Delete a/all
                                                                 actions
  h [db_cmd]  Get help on command         w expr      Add a watch
                                                              expression
  h h         Complete help page          W expr|*    Delete a/all watch
                                                                   exprs
  |[|]db_cmd  Send output to pager        ![!] syscmd Run cmd in a
                                                              subprocess
  q or ^D     Quit                        R           Attempt a restart
 Data Examination:     expr     Execute perl code, also see: s,n,t expr
  x|m expr       Evals expr in list context, dumps the result or lists
                                                                methods.
  p expr         Print expression (uses script's current package).
  S [[!]pat]     List subroutine names [not] matching pattern
  V [Pk [Vars]]  List Variables in Package.  Vars can be ~pattern or
                                                               !pattern.
  X [Vars]       Same as "V current_package [Vars]".
  y [n [Vars]]   List lexicals in higher scope <n>.  Vars same as V.
 For more help, type h cmd_letter, or run man perldebug for all docs. 

More confusing options than you can shake a big stick at!  It's not as bad as
it looks and it's very useful to know more about all of it, and fun too!

There's a couple of useful ones to know about straight away.  You wouldn't
think we're using any libraries at all at the moment, but 'B<M>' will show
which modules are currently loaded, and their version number, while 'B<m>' 
will show the methods, and 'B<S>' shows all subroutines (by pattern) as 
shown below.  'B<V>' and 'B<X>' show variables in the program by package 
scope and can be constrained by pattern. 

	DB<2>S str 
	dumpvar::stringify
	strict::bits
	strict::import
	strict::unimport  

Using 'X' and cousins requires you not to use the type identifiers ($@%), just
the 'name':

	DB<3>X ~err
	FileHandle(stderr) => fileno(2)    

Remember we're in our tiny program with a problem, we should have a look at
where we are, and what our data looks like. First of all let's view some code 
at our present position (the first line of code in this case), via 'B<v>':

	DB<4> v
	1       #!/usr/bin/perl
	2:      use strict;
	3
	4==>    my $key = 'welcome';
	5:      my %data = (
	6               'this' => qw(that),
	7               'tom' => qw(and jerry),
	8               'welcome' => q(Hello World),
	9               'zip' => q(welcome),
	10      );                                 

At line number 4 is a helpful pointer, that tells you where you are now.  To
see more code, type 'v' again:

	DB<4> v
	8               'welcome' => q(Hello World),
	9               'zip' => q(welcome),
	10      );
	11:     my @data = keys %data;
	12:     print "All OK\n" if grep($key, keys %data);
	13:     print "$data{$key}\n";
	14:     print "done: '$data{$key}'\n";
	15:     exit;      

And if you wanted to list line 5 again, type 'l 5', (note the space):

	DB<4> l 5
	5:      my %data = (

In this case, there's not much to see, but of course normally there's pages of
stuff to wade through, and 'l' can be very useful.  To reset your view to the
line we're about to execute, type a lone period '.':

	DB<5> .
	main::(./data_a:4):     my $key = 'welcome';  

The line shown is the one that is about to be executed B<next>, it hasn't
happened yet.  So while we can print a variable with the letter 'B<p>', at
this point all we'd get is an empty (undefined) value back.  What we need to
do is to step through the next executable statement with an 'B<s>':

	DB<6> s
	main::(./data_a:5):     my %data = (
	main::(./data_a:6):             'this' => qw(that),
	main::(./data_a:7):             'tom' => qw(and jerry),
	main::(./data_a:8):             'welcome' => q(Hello World),
	main::(./data_a:9):             'zip' => q(welcome),
	main::(./data_a:10):    );   

Now we can have a look at that first ($key) variable:

	DB<7> p $key 
	welcome 

Line 13 is where the action is, so let's continue down to there via the letter
'B<c>', which, by the way, inserts a 'one-time-only' breakpoint at the given
line or subroutine:

	DB<8> c 13
	All OK
	main::(./data_a:13):    print "$data{$key}\n";

We've gone past our check (where 'All OK' was printed) and have stopped just
before the meat of our task.  We could try to print out a couple of variables
to see what is happening:

	DB<9> p $data{$key}

Not much in there, let's have a look at our hash:

	DB<10> p %data
	Hello Worldziptomandwelcomejerrywelcomethisthat 

	DB<11> p keys %data
	Hello Worldtomwelcomejerrythis  

Well, this isn't very easy to read, and using the helpful manual (B<h h>), the
'B<x>' command looks promising:

	DB<12> x %data
	0  'Hello World'
	1  'zip'
	2  'tom'
	3  'and'
	4  'welcome'
	5  undef
	6  'jerry'
	7  'welcome'
	8  'this'
	9  'that'     

That's not much help: there are a couple of welcomes in there, but no
indication of which are keys and which are values.  It's just a listed array
dump and, in this case, not particularly helpful.  The trick here is to use a
B<reference> to the data structure:

	DB<13> x \%data
	0  HASH(0x8194bc4)
	   'Hello World' => 'zip'
	   'jerry' => 'welcome'
	   'this' => 'that'
	   'tom' => 'and'
	   'welcome' => undef  

The reference is truly dumped and we can finally see what we're dealing with. 
Our quoting was perfectly valid but wrong for our purposes, with 'and jerry'
being treated as 2 separate words rather than a phrase, thus throwing the
evenly paired hash structure out of alignment.

The 'B<-w>' switch would have told us about this, had we used it at the start,
and saved us a lot of trouble: 

	> perl -w data
	Odd number of elements in hash assignment at ./data line 5.    

We fix our quoting: 'tom' => q(and jerry), and run it again, this time we get
our expected output:

	> perl -w data
	Hello World

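To look at the repaired structure outside the debugger, one option is the
core L<Data::Dumper> module.  The following is a minimal sketch, assuming the
corrected quoting; setting C<$Data::Dumper::Sortkeys> is an illustrative
choice to make the output order stable, not something this tutorial relies on.

```perl
use strict;
use warnings;
use Data::Dumper;

# The hash from the example, with 'and jerry' quoted as a single value.
my %data = (
    'this'    => 'that',
    'tom'     => 'and jerry',
    'welcome' => 'Hello World',
    'zip'     => 'welcome',
);

# Sort the keys so that repeated runs print in the same order.
$Data::Dumper::Sortkeys = 1;

# Pass a reference, for the same reason 'x \%data' is clearer than 'x %data'.
print Dumper(\%data);

print "$data{'welcome'}\n";
```

As with the 'B<x>' command, dumping a reference rather than the flattened hash
shows which items are keys and which are values.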

While we're here, take a closer look at the 'B<x>' command, it's really useful
and will merrily dump out nested references, complete objects, partial objects
- just about whatever you throw at it:

Let's make a quick object and x-plode it, first we'll start the debugger:
it wants some form of input from STDIN, so we give it something non-committal,
a zero:

 > perl -de 0
 Default die handler restored.

 Loading DB routines from perl5db.pl version 1.07
 Editor support available.

 Enter h or `h h' for help, or `man perldebug' for more help.

 main::(-e:1):   0

Now build an on-the-fly object over a couple of lines (note the backslash):

 DB<1> $obj = bless({'unique_id'=>'123', 'attr'=> \
 cont: 	{'col' => 'black', 'things' => [qw(this that etc)]}}, 'MY_class')

And let's have a look at it:

 DB<2> x $obj
 0  MY_class=HASH(0x828ad98)
    'attr' => HASH(0x828ad68)
       'col' => 'black'
       'things' => ARRAY(0x828abb8)
          0  'this'
          1  'that'
          2  'etc'
    'unique_id' => 123
 DB<3>

Useful, huh?  You can eval nearly anything in there, and experiment with bits
of code or regexes until the cows come home:

 DB<3> @data = qw(this that the other atheism leather theory scythe)

 DB<4> p 'saw -> '.($cnt += map { print "\t:\t$_\n" } grep(/the/, sort @data))
 atheism
 leather
 other
 scythe
 the
 theory
 saw -> 6

If you want to see the command History, type an 'B<H>':

 DB<5> H
 4: p 'saw -> '.($cnt += map { print "\t:\t$_\n" } grep(/the/, sort @data))
 3: @data = qw(this that the other atheism leather theory scythe)
 2: x $obj
 1: $obj = bless({'unique_id'=>'123', 'attr'=>
 {'col' => 'black', 'things' => [qw(this that etc)]}}, 'MY_class')
 DB<5>

And if you want to repeat any previous command, use the exclamation: 'B<!>':

 DB<5> !4
 p 'saw -> '.($cnt += map { print "\t:\t$_\n" } grep(/the/, sort @data))
 atheism
 leather
 other
 scythe
 the
 theory
 saw -> 12

For more on references see L<perlref> and L<perlreftut>.


=head1 Stepping through code

Here's a simple program which converts between Celsius and Fahrenheit, it too
has a problem:

 #!/usr/bin/perl -w
 use strict;

 my $arg = $ARGV[0] || '-c20';

 if ($arg =~ /^\-(c|f)((\-|\+)*\d+(\.\d+)*)$/) {
	my ($deg, $num) = ($1, $2);
	my ($in, $out) = ($num, $num);
	if ($deg eq 'c') {
		$deg = 'f';
		$out = &c2f($num);
	} else {
		$deg = 'c';
		$out = &f2c($num);
	}
	$out = sprintf('%0.2f', $out);
	$out =~ s/^((\-|\+)*\d+)\.0+$/$1/;
	print "$out $deg\n";
 } else {
	print "Usage: $0 -[c|f] num\n";
 }
 exit;

 sub f2c {
	my $f = shift;
	my $c = 5 * $f - 32 / 9;
	return $c;
 }

 sub c2f {
	my $c = shift;
	my $f = 9 * $c / 5 + 32;
	return $f;
 }


For some reason, the Fahrenheit to Celsius conversion fails to return the
expected output.  This is what it does:

 > temp -c0.72
 33.30 f

 > temp -f33.3
 162.94 c

Not very consistent!  We'll set a breakpoint in the code manually and run it
under the debugger to see what's going on.  A breakpoint is a flag to which
the debugger will run without interruption; when it reaches the breakpoint, it
will stop execution and offer a prompt for further interaction.  In normal
use, these debugger commands are completely ignored, and they are safe, if a
little messy, to leave in production code.

	my ($in, $out) = ($num, $num);
	$DB::single=2; # insert at line 9!
	if ($deg eq 'c') 
		...

	> perl -d temp -f33.3
	Default die handler restored.

	Loading DB routines from perl5db.pl version 1.07
	Editor support available.

	Enter h or `h h' for help, or `man perldebug' for more help.

	main::(temp:4): my $arg = $ARGV[0] || '-c20';

We'll simply continue down to our pre-set breakpoint with a 'B<c>':

  	DB<1> c
	main::(temp:10):                if ($deg eq 'c') {   

Followed by a view command to see where we are:

	DB<1> v
	7:              my ($deg, $num) = ($1, $2);
	8:              my ($in, $out) = ($num, $num);
	9:              $DB::single=2;
	10==>           if ($deg eq 'c') {
	11:                     $deg = 'f';
	12:                     $out = &c2f($num);
	13              } else {
	14:                     $deg = 'c';
	15:                     $out = &f2c($num);
	16              }                             

And a print to show what values we're currently using:

	DB<1> p $deg, $num
	f33.3

We can put another breakpoint on any line beginning with a colon.  We'll use
line 17, as that's just as we come out of the subroutine, and we'd like to
pause there later on:

	DB<2> b 17

There's no feedback from this, but you can see what breakpoints are set by
using the list 'L' command:

	DB<3> L
	temp:
 		17:            print "$out $deg\n";
   		break if (1)     

Note that to delete a breakpoint you use 'B'.

Now we'll continue down into our subroutine, this time rather than by line
number, we'll use the subroutine name, followed by the now familiar 'v':

	DB<3> c f2c
	main::f2c(temp:27):             my $f = shift;

	DB<4> v
	24:     exit;
	25
	26      sub f2c {
	27==>           my $f = shift;
	28:             my $c = 5 * $f - 32 / 9; 
	29:             return $c;
	30      }
	31
	32      sub c2f {
	33:             my $c = shift;   


Note that if there was a subroutine call between us and line 29, and we wanted
to B<single-step> through it, we could use the 'B<s>' command, and to step
over it we would use 'B<n>' which would execute the sub, but not descend into
it for inspection.  In this case though, we simply continue down to line 29:

	DB<4> c 29  
	main::f2c(temp:29):             return $c;

And have a look at the return value:

	DB<5> p $c
	162.944444444444

This is not the right answer at all, but the sum looks correct.  I wonder if
it's anything to do with operator precedence?  We'll try a couple of other
possibilities with our sum:

	DB<6> p (5 * $f - 32 / 9)
	162.944444444444

	DB<7> p 5 * $f - (32 / 9) 
	162.944444444444

	DB<8> p (5 * $f) - 32 / 9
	162.944444444444

	DB<9> p 5 * ($f - 32) / 9
	0.722222222222221

:-) that's more like it!  Ok, now we can set our return variable and we'll
return out of the sub with an 'r':

	DB<10> $c = 5 * ($f - 32) / 9

	DB<11> r
	scalar context return from main::f2c: 0.722222222222221

Looks good, let's just continue off the end of the script:

	DB<12> c
	0.72 c 
	Debugged program terminated.  Use q to quit or R to restart,
  	use O inhibit_exit to avoid stopping after program termination,
  	h q, h R or h O to get additional info.   

A quick fix to the offending line (insert the missing parentheses) in the
actual program and we're finished.

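For reference, here is a sketch of the two conversion routines with the
parentheses in place.  The C<printf> calls at the end are added for
illustration only and are not part of the original program.

```perl
use strict;
use warnings;

# Fahrenheit to Celsius: subtract 32 first, then scale by 5/9.
sub f2c {
    my $f = shift;
    my $c = 5 * ($f - 32) / 9;
    return $c;
}

# Celsius to Fahrenheit: this one was already correct, because
# multiplication and division bind more tightly than addition.
sub c2f {
    my $c = shift;
    my $f = 9 * $c / 5 + 32;
    return $f;
}

printf "%.2f c\n", f2c(33.3);    # prints: 0.72 c
printf "%.2f f\n", c2f(0.72);    # prints: 33.30 f
```

The two results now round-trip, which is the consistency that was missing in
the original output.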

=head1 Placeholder for a, w, t, T

Actions, watch variables, stack traces etc.: on the TODO list.

	a 

	w 

	t 

	T


=head1 REGULAR EXPRESSIONS

Ever wanted to know what a regex looked like?  You'll need perl compiled with
the DEBUGGING flag for this one:

  > perl -Dr -e '/^pe(a)*rl$/i'
  Compiling REx `^pe(a)*rl$'
  size 17 first at 2
  rarest char
   at 0
     1: BOL(2)
     2: EXACTF <pe>(4)
     4: CURLYN[1] {0,32767}(14)
     6:   NOTHING(8)
     8:   EXACTF <a>(0)
    12:   WHILEM(0)
    13: NOTHING(14)
    14: EXACTF <rl>(16)
    16: EOL(17)
    17: END(0)
  floating `'$ at 4..2147483647 (checking floating) stclass
    `EXACTF <pe>' anchored(BOL) minlen 4
  Omitting $` $& $' support.

  EXECUTING...

  Freeing REx: `^pe(a)*rl$'

Did you really want to know? :-)
For more gory details on getting regular expressions to work, have a look at
L<perlre>, L<perlretut>, and to decode the mysterious labels (BOL and CURLYN,
etc. above), see L<perldebguts>.


=head1 OUTPUT TIPS

To get all the output from your error log, and not miss any messages via
helpful operating system buffering, insert a line like this, at the start of
your script:

	$|=1;	

To watch the tail of a dynamically growing logfile (from the command line):

	tail -f $error_log

Wrapping all die calls in a handler routine can be useful to see how, and from
where, they're being called.  L<perlvar> has more information:

    BEGIN { $SIG{__DIE__} = sub { require Carp; Carp::confess(@_) } }

Various useful techniques for the redirection of STDOUT and STDERR filehandles
are explained in L<perlopentut> and L<perlfaq8>.


=head1 CGI

Just a quick hint here for all those CGI programmers who can't figure out how
on earth to get past that 'waiting for input' prompt when running their CGI
script from the command line.  Try something like this:

	> perl -d my_cgi.pl -nodebug 

Of course L<CGI> and L<perlfaq9> will tell you more.


=head1 GUIs

The command line interface is tightly integrated with an B<emacs> extension
and there's a B<vi> interface too.  

You don't have to do this all on the command line, though, there are a few GUI
options out there.  The nice thing about these is you can wave a mouse over a
variable and a dump of its data will appear in an appropriate window, or in a
popup balloon, no more tiresome typing of 'x $varname' :-)

In particular have a hunt around for the following:

B<ptkdb> Perl/Tk based wrapper for the built-in debugger

B<ddd> data display debugger

B<PerlDevKit> and B<PerlBuilder> are NT specific

NB. (more info on these and others would be appreciated).


=head1 SUMMARY

We've seen how to encourage good coding practices with B<use strict> and
B<-w>.  We can run the perl debugger with B<perl -d scriptname> and inspect
our data from within it using the B<p> and B<x> commands.  We can walk
through our code, set breakpoints with B<b>, step through that code with
B<s> or B<n>, continue with B<c> and return from a sub with B<r>.  Fairly
intuitive stuff when you get down to it.

There is of course lots more to find out about; this has just scratched the
surface.  The best way to learn more is to use perldoc to find out more about
the language, to read the on-line help (L<perldebug> is probably the next
place to go), and of course, experiment.


=head1 SEE ALSO

L<perldebug>, 
L<perldebguts>, 
L<perldiag>,
L<perlrun>


=head1 AUTHOR

Richard Foley <richard.foley@rfi.net> Copyright (c) 2000


=head1 CONTRIBUTORS

Various people have made helpful suggestions and contributions, in particular:

Ronald J Kimball <rjk@linguist.dartmouth.edu>

Hugo van der Sanden <hv@crypt0.demon.co.uk>

Peter Scott <Peter@PSDT.com>

=encoding utf8

=for comment
Consistent formatting of this file is achieved with:
  perl ./Porting/podtidy pod/perlsource.pod

=head1 NAME

perlsource - A guide to the Perl source tree

=head1 DESCRIPTION

This document describes the layout of the Perl source tree. If you're
hacking on the Perl core, this will help you find what you're looking
for.

=head1 FINDING YOUR WAY AROUND

The Perl source tree is big. Here are some of the things you'll find in
it:

=head2 C code

The C source code and header files mostly live in the root of the
source tree. There are a few platform-specific directories which
contain C code. In addition, some of the modules shipped with Perl
include C or XS code.

See L<perlinterp> for more details on the files that make up the Perl
interpreter, as well as details on how it works.

=head2 Core modules

Modules shipped as part of the Perl core live in four subdirectories.
Two of these directories contain modules that live in the core, and two
contain modules that can also be released separately on CPAN. Modules
which can be released on CPAN are known as "dual-life" modules.

=over 4

=item * F<lib/>

This directory contains pure-Perl modules which are only released as
part of the core. This directory contains I<all> of the modules and
their tests, unlike other core modules.

=item * F<ext/>

Like F<lib/>, this directory contains modules which are only released
as part of the core.  Unlike F<lib/>, however, a module under F<ext/>
generally has a CPAN-style directory- and file-layout and its own
F<Makefile.PL>.  There is no expectation that a module under F<ext/>
will work with earlier versions of Perl 5.  Hence, such a module may
take full advantage of syntactical and other improvements in Perl 5
blead.

=item * F<dist/>

This directory is for dual-life modules where the blead source is
canonical. Note that some modules in this directory may not yet have
been released separately on CPAN.  Modules under F<dist/> should make
an effort to work with earlier versions of Perl 5.

=item * F<cpan/>

This directory contains dual-life modules where the CPAN module is
canonical. Do not patch these modules directly! Changes to these
modules should be submitted to the maintainer of the CPAN module. Once
those changes are applied and released, the new version of the module
will be incorporated into the core.

=back

For some dual-life modules, it has not yet been determined if the CPAN
version or the blead source is canonical. Until that is done, those
modules should be in F<cpan/>.

=head2 Tests

The Perl core has an extensive test suite. If you add new tests (or new
modules with tests), you may need to update the F<t/TEST> file so that
the tests are run.

=over 4

=item * Module tests

Tests for core modules in the F<lib/> directory are right next to the
module itself. For example, we have F<lib/strict.pm> and
F<lib/strict.t>.

Tests for modules in F<ext/> and the dual-life modules are in F<t/>
subdirectories for each module, like a standard CPAN distribution.

=item * F<t/base/>

Tests for the absolute basic functionality of Perl. This includes
C<if>, basic file reads and writes, simple regexes, etc. These are run
first in the test suite and if any of them fail, something is I<really>
broken.

=item * F<t/cmd/>

Tests for basic control structures, C<if>/C<else>, C<while>, subroutines,
etc.

=item * F<t/comp/>

Tests for basic issues of how Perl parses and compiles itself.

=item * F<t/io/>

Tests for built-in IO functions, including command line arguments.

=item * F<t/mro/>

Tests for perl's method resolution order implementations (see L<mro>).

=item * F<t/op/>

Tests for perl's built in functions that don't fit into any of the
other directories.

=item * F<t/opbasic/>

Tests for perl's built in functions which, like those in F<t/op/>, do
not fit into any of the other directories, but which, in addition,
cannot use F<t/test.pl>, as that program depends on functionality which
the test file itself is testing.

=item * F<t/re/>

Tests for regex related functions or behaviour. (These used to live in
F<t/op/>.)

=item * F<t/run/>

Tests for features of how perl actually runs, including exit codes and
handling of PERL* environment variables.

=item * F<t/uni/>

Tests for the core support of Unicode.

=item * F<t/win32/>

Windows-specific tests.

=item * F<t/porting/>

Tests the state of the source tree for various common errors. For
example, it tests that everyone who is listed in the git log has a
corresponding entry in the F<AUTHORS> file.

=item * F<t/lib/>

The old home for the module tests. You shouldn't put anything new in
here. There are still some bits and pieces hanging around in here that
need to be moved. Perhaps you could move them?  Thanks!

=back

=head2 Documentation

All of the core documentation intended for end users lives in F<pod/>.
Individual modules in F<lib/>, F<ext/>, F<dist/>, and F<cpan/> usually
have their own documentation, either in the F<Module.pm> file or an
accompanying F<Module.pod> file.

Finally, documentation intended for core Perl developers lives in the
F<Porting/> directory.

=head2 Hacking tools and documentation

The F<Porting> directory contains a grab bag of code and documentation
intended to help porters work on Perl. Some of the highlights include:

=over 4

=item * F<check*>

These are scripts which will check the source for things like ANSI C
violations, POD encoding issues, etc.

=item * F<Maintainers>, F<Maintainers.pl>, and F<Maintainers.pm>

These files contain information on who maintains which modules. Run
C<perl Porting/Maintainers -M Module::Name> to find out more
information about a dual-life module.

=item * F<podtidy>

Tidies a pod file. It's a good idea to run this on a pod file you've
patched.

=back

=head2 Build system

The Perl build system starts with the F<Configure> script in the root
directory.

Platform-specific pieces of the build system also live in
platform-specific directories like F<win32/>, F<vms/>, etc.

The F<Configure> script is ultimately responsible for generating a
F<Makefile>.

The build system that Perl uses is called metaconfig. This system is
maintained separately from the Perl core.

The metaconfig system has its own git repository. Please see its README
file in L<http://perl5.git.perl.org/metaconfig.git/> for more details.

The F<Cross> directory contains various files related to
cross-compiling Perl. See F<Cross/README> for more details.

=head2 F<AUTHORS>

This file lists everyone who's contributed to Perl. If you submit a
patch, you should add your name to this file as part of the patch.

=head2 F<MANIFEST>

The F<MANIFEST> file in the root of the source tree contains a list of
every file in the Perl core, as well as a brief description of each
file.

You can get an overview of all the files with this command:

  % perl -lne 'print if /^[^\/]+\.[ch]\s+/' MANIFEST

=encoding utf8

=for comment
Consistent formatting of this file is achieved with:
  perl ./Porting/podtidy pod/perlobj.pod

=head1 NAME
X<object> X<OOP>

perlobj - Perl object reference

=head1 DESCRIPTION

This document provides a reference for Perl's object orientation
features. If you're looking for an introduction to object-oriented
programming in Perl, please see L<perlootut>.

In order to understand Perl objects, you first need to understand
references in Perl. See L<perlref> for details.

This document describes all of Perl's object-oriented (OO) features
from the ground up. If you're just looking to write some
object-oriented code of your own, you are probably better served by
using one of the object systems from CPAN described in L<perlootut>.

If you're looking to write your own object system, or you need to
maintain code which implements objects from scratch then this document
will help you understand exactly how Perl does object orientation.

There are a few basic principles which define object oriented Perl:

=over 4

=item 1.

An object is simply a data structure that knows to which class it
belongs.

=item 2.

A class is simply a package. A class provides methods that expect to
operate on objects.

=item 3.

A method is simply a subroutine that expects a reference to an object
(or a package name, for class methods) as the first argument.

=back

Let's look at each of these principles in depth.

=head2 An Object is Simply a Data Structure
X<object> X<bless> X<constructor> X<new>

Unlike many other languages which support object orientation, Perl does
not provide any special syntax for constructing an object. Objects are
merely Perl data structures (hashes, arrays, scalars, filehandles,
etc.) that have been explicitly associated with a particular class.

That explicit association is created by the built-in C<bless> function,
which is typically used within the I<constructor> subroutine of the
class.

Here is a simple constructor:

  package File;

  sub new {
      my $class = shift;

      return bless {}, $class;
  }

The name C<new> isn't special. We could name our constructor something
else:

  package File;

  sub load {
      my $class = shift;

      return bless {}, $class;
  }

The modern convention for OO modules is to always use C<new> as the
name for the constructor, but there is no requirement to do so. Any
subroutine that blesses a data structure into a class is a valid
constructor in Perl.

In the previous examples, the C<{}> code creates a reference to an
empty anonymous hash. The C<bless> function then takes that reference
and associates the hash with the class in C<$class>. In the simplest
case, the C<$class> variable will end up containing the string "File".

We can also use a variable to store a reference to the data structure
that is being blessed as our object:

  sub new {
      my $class = shift;

      my $self = {};
      bless $self, $class;

      return $self;
  }

Once we've blessed the hash referred to by C<$self> we can start
calling methods on it. This is useful if you want to put object
initialization in its own separate method:

  sub new {
      my $class = shift;

      my $self = {};
      bless $self, $class;

      $self->_initialize();

      return $self;
  }

Since the object is also a hash, you can treat it as one, using it to
store data associated with the object. Typically, code inside the class
can treat the hash as an accessible data structure, while code outside
the class should always treat the object as opaque. This is called
B<encapsulation>. Encapsulation means that the user of an object does
not have to know how it is implemented. The user simply calls
documented methods on the object.

Note, however, that (unlike most other OO languages) Perl does not
ensure or enforce encapsulation in any way. If you want objects to
actually I<be> opaque you need to arrange for that yourself. This can
be done in a variety of ways, including using L</"Inside-Out objects">
or modules from CPAN.

=head3 Objects Are Blessed; Variables Are Not

When we bless something, we are not blessing the variable which
contains a reference to that thing, nor are we blessing the reference
that the variable stores; we are blessing the thing that the variable
refers to (sometimes known as the I<referent>). This is best
demonstrated with this code:

  use Scalar::Util 'blessed';

  my $foo = {};
  my $bar = $foo;

  bless $foo, 'Class';
  print blessed( $bar ) // 'not blessed';    # prints "Class"

  $bar = "some other value";
  print blessed( $bar ) // 'not blessed';    # prints "not blessed"

When we call C<bless> on a variable, we are actually blessing the
underlying data structure that the variable refers to. We are not
blessing the reference itself, nor the variable that contains that
reference. That's why the second call to C<blessed( $bar )> returns
false. At that point C<$bar> is no longer storing a reference to an
object.

You will sometimes see older books or documentation mention "blessing a
reference" or describe an object as a "blessed reference", but this is
incorrect. It isn't the reference that is blessed as an object; it's
the thing the reference refers to (i.e. the referent).

=head2 A Class is Simply a Package
X<class> X<package> X<@ISA> X<inheritance>

Perl does not provide any special syntax for class definitions. A
package is simply a namespace containing variables and subroutines. The
only difference is that in a class, the subroutines may expect a
reference to an object or the name of a class as the first argument.
This is purely a matter of convention, so a class may contain both
methods and subroutines which I<don't> operate on an object or class.

Each package contains a special array called C<@ISA>. The C<@ISA> array
contains a list of that class's parent classes, if any. This array is
examined when Perl does method resolution, which we will cover later.

Calling methods from a package means it must be loaded, of course, so
you will often want to load a module and add it to C<@ISA> at the same
time. You can do so in a single step using the L<parent> pragma.
(In older code you may encounter the L<base> pragma, which is nowadays
discouraged except when you have to work with the equally discouraged
L<fields> pragma.)

However the parent classes are set, the package's C<@ISA> variable will
contain a list of those parents. This is simply a list of scalars, each
of which is a string that corresponds to a package name.

All classes inherit from the L<UNIVERSAL> class implicitly. The
L<UNIVERSAL> class is implemented by the Perl core, and provides
several default methods, such as C<isa()>, C<can()>, and C<VERSION()>.
The C<UNIVERSAL> class will I<never> appear in a package's C<@ISA>
variable.
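
As a rough illustration of the points above, the following sketch uses
two hypothetical classes, C<Animal> and C<Dog>, defined in a single file
(hence the C<-norequire> flag). It shows that C<@ISA> holds plain
strings and that C<UNIVERSAL> methods are available without being
listed there:

  package Animal;

  sub new { return bless {}, shift }

  package Dog;

  use parent -norequire, 'Animal';    # pushes 'Animal' onto @Dog::ISA

  package main;

  print "@Dog::ISA\n";                         # prints "Animal"
  print Dog->can('isa') ? "yes\n" : "no\n";    # prints "yes"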

Perl I<only> provides method inheritance as a built-in feature.
Attribute inheritance is left up to the class to implement. See the
L</Writing Accessors> section for details.

=head2 A Method is Simply a Subroutine
X<method>

Perl does not provide any special syntax for defining a method. A
method is simply a regular subroutine, and is declared with C<sub>.
What makes a method special is that it expects to receive either an
object or a class name as its first argument.

Perl I<does> provide special syntax for method invocation, the C<< ->
>> operator. We will cover this in more detail later.

Most methods you write will expect to operate on objects:

  sub save {
      my $self = shift;

      open my $fh, '>', $self->path() or die $!;
      print {$fh} $self->data()       or die $!;
      close $fh                       or die $!;
  }

=head2 Method Invocation
X<invocation> X<method> X<arrow> X<< -> >>

Calling a method on an object is written as C<< $object->method >>.

The left hand side of the method invocation (or arrow) operator is the
object (or class name), and the right hand side is the method name.

  my $pod = File->new( 'perlobj.pod', $data );
  $pod->save();

The C<< -> >> syntax is also used when dereferencing a reference. It
looks like the same operator, but these are two different operations.

When you call a method, the thing on the left side of the arrow is
passed as the first argument to the method. That means when we call C<<
Critter->new() >>, the C<new()> method receives the string C<"Critter">
as its first argument. When we call C<< $fred->speak() >>, the C<$fred>
variable is passed as the first argument to C<speak()>.

Just as with any Perl subroutine, all of the arguments passed in C<@_>
are aliases to the original argument. This includes the object itself.
If you assign directly to C<$_[0]> you will change the contents of the
variable that holds the reference to the object. We recommend that you
don't do this unless you know exactly what you're doing.
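
To make the aliasing concrete, here is a small sketch. The C<clobber>
method is hypothetical and exists only to show the effect; it is not
something you would normally write:

  package Critter;

  sub new { return bless {}, shift }

  # Assigning to $_[0] changes the caller's variable, not the object.
  sub clobber { $_[0] = 'gone' }

  package main;

  my $fred = Critter->new();
  $fred->clobber();

  print "$fred\n";    # prints "gone"; $fred no longer holds an object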

Perl knows what package the method is in by looking at the left side of
the arrow. If the left hand side is a package name, it looks for the
method in that package. If the left hand side is an object, then Perl
looks for the method in the package that the object has been blessed
into.

If the left hand side is neither a package name nor an object, then the
method call will cause an error, but see the section on L</Method Call
Variations> for more nuances.

=head2 Inheritance
X<inheritance>

We already talked about the special C<@ISA> array and the L<parent>
pragma.

When a class inherits from another class, any methods defined in the
parent class are available to the child class. If you attempt to call a
method on an object that isn't defined in its own class, Perl will also
look for that method in any parent classes it may have.

  package File::MP3;
  use parent 'File';    # sets @File::MP3::ISA = ('File');

  my $mp3 = File::MP3->new( 'Andvari.mp3', $data );
  $mp3->save();

Since we didn't define a C<save()> method in the C<File::MP3> class,
Perl will look at the C<File::MP3> class's parent classes to find the
C<save()> method. If Perl cannot find a C<save()> method anywhere in
the inheritance hierarchy, it will die.

In this case, it finds a C<save()> method in the C<File> class. Note
that the object passed to C<save()> in this case is still a
C<File::MP3> object, even though the method is found in the C<File>
class.

We can override a parent's method in a child class. When we do so, we
can still call the parent class's method with the C<SUPER>
pseudo-class.

  sub save {
      my $self = shift;

      say 'Prepare to rock';
      $self->SUPER::save();
  }

The C<SUPER> modifier can I<only> be used for method calls. You can't
use it for regular subroutine calls or class methods:

  SUPER::save($thing);     # FAIL: looks for save() sub in package SUPER

  SUPER->save($thing);     # FAIL: looks for save() method in class
                           #       SUPER

  $thing->SUPER::save();   # Okay: looks for save() method in parent
                           #       classes


=head3 How SUPER is Resolved
X<SUPER>

The C<SUPER> pseudo-class is resolved from the package where the call
is made. It is I<not> resolved based on the object's class. This is
important, because it lets methods at different levels within a deep
inheritance hierarchy each correctly call their respective parent
methods.

  package A;

  sub new {
      return bless {}, shift;
  }

  sub speak {
      my $self = shift;

      say 'A';
  }

  package B;

  use parent -norequire, 'A';

  sub speak {
      my $self = shift;

      $self->SUPER::speak();

      say 'B';
  }

  package C;

  use parent -norequire, 'B';

  sub speak {
      my $self = shift;

      $self->SUPER::speak();

      say 'C';
  }

  my $c = C->new();
  $c->speak();

In this example, we will get the following output:

  A
  B
  C

This demonstrates how C<SUPER> is resolved. Even though the object is
blessed into the C<C> class, the C<speak()> method in the C<B> class
can still call C<SUPER::speak()> and expect it to correctly look in the
parent class of C<B> (i.e. the class the method call is in), not in the
parent class of C<C> (i.e. the class the object belongs to).

There are rare cases where this package-based resolution can be a
problem. If you copy a subroutine from one package to another, C<SUPER>
resolution will be done based on the original package.

=head3 Multiple Inheritance
X<multiple inheritance>

Multiple inheritance often indicates a design problem, but Perl always
gives you enough rope to hang yourself with if you ask for it.

To declare multiple parents, you simply need to pass multiple class
names to C<use parent>:

  package MultiChild;

  use parent 'Parent1', 'Parent2';

=head3 Method Resolution Order
X<method resolution order> X<mro>

Method resolution order only matters in the case of multiple
inheritance. In the case of single inheritance, Perl simply looks up
the inheritance chain to find a method:

  Grandparent
    |
  Parent
    |
  Child

If we call a method on a C<Child> object and that method is not defined
in the C<Child> class, Perl will look for that method in the C<Parent>
class and then, if necessary, in the C<Grandparent> class.

If Perl cannot find the method in any of these classes, it will die
with an error message.

When a class has multiple parents, the method lookup order becomes more
complicated.

By default, Perl does a depth-first left-to-right search for a method.
That means it starts with the first parent in the C<@ISA> array, and
then searches all of its parents, grandparents, etc. If it fails to
find the method, it then goes to the next parent in the original
class's C<@ISA> array and searches from there.

            SharedGreatGrandParent
            /                    \
  PaternalGrandparent       MaternalGrandparent
            \                    /
             Father        Mother
                   \      /
                    Child

So given the diagram above, Perl will search C<Child>, C<Father>,
C<PaternalGrandparent>, C<SharedGreatGrandParent>, C<Mother>, and
finally C<MaternalGrandparent>. This may be a problem because now we're
looking in C<SharedGreatGrandParent> I<before> we've checked all its
derived classes (i.e. before we tried C<Mother> and
C<MaternalGrandparent>).

It is possible to ask for a different method resolution order with the
L<mro> pragma.

  package Child;

  use mro 'c3';
  use parent 'Father', 'Mother';

This pragma lets you switch to the "C3" resolution order. In simple
terms, "C3" order ensures that shared parent classes are never searched
before child classes, so Perl will now search: C<Child>, C<Father>,
C<PaternalGrandparent>, C<Mother>, C<MaternalGrandparent>, and finally
C<SharedGreatGrandParent>. Note however that this is not
"breadth-first" searching: All the C<Father> ancestors (except the
common ancestor) are searched before any of the C<Mother> ancestors are
considered.
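
One way to see both orders side by side is to ask the L<mro> module for
the linearized C<@ISA> of a class. The sketch below sets up the
hierarchy from the diagram by assigning to each C<@ISA> directly, which
is done here only to keep the example short:

  use mro;

  @PaternalGrandparent::ISA = ('SharedGreatGrandParent');
  @MaternalGrandparent::ISA = ('SharedGreatGrandParent');
  @Father::ISA              = ('PaternalGrandparent');
  @Mother::ISA              = ('MaternalGrandparent');
  @Child::ISA               = ( 'Father', 'Mother' );

  # Default depth-first order: the shared ancestor comes fourth.
  print join( ' ', @{ mro::get_linear_isa( 'Child', 'dfs' ) } ), "\n";

  # C3 order: the shared ancestor comes last.
  print join( ' ', @{ mro::get_linear_isa( 'Child', 'c3' ) } ), "\n";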

The C3 order also lets you call methods in sibling classes with the
C<next> pseudo-class. See the L<mro> documentation for more details on
this feature.

=head3 Method Resolution Caching

When Perl searches for a method, it caches the lookup so that future
calls to the method do not need to search for it again. Changing a
class's parent class or adding subroutines to a class will invalidate
the cache for that class.

The L<mro> pragma provides some functions for manipulating the method
cache directly.

=head2 Writing Constructors
X<constructor>

As we mentioned earlier, Perl provides no special constructor syntax.
This means that a class must implement its own constructor. A
constructor is simply a class method that returns a reference to a new
object.

The constructor can also accept additional parameters that define the
object. Let's write a real constructor for the C<File> class we used
earlier:

  package File;

  sub new {
      my $class = shift;
      my ( $path, $data ) = @_;

      my $self = bless {
          path => $path,
          data => $data,
      }, $class;

      return $self;
  }

As you can see, we've stored the path and file data in the object
itself. Remember, under the hood, this object is still just a hash.
Later, we'll write accessors to manipulate this data.

For our C<File::MP3> class, we can check to make sure that the path
we're given ends with ".mp3":

  package File::MP3;

  sub new {
      my $class = shift;
      my ( $path, $data ) = @_;

      die "You cannot create a File::MP3 without an mp3 extension\n"
          unless $path =~ /\.mp3\z/;

      return $class->SUPER::new(@_);
  }

This constructor lets its parent class do the actual object
construction.

=head2 Attributes
X<attribute>

An attribute is a piece of data belonging to a particular object.
Unlike most object-oriented languages, Perl provides no special syntax
or support for declaring and manipulating attributes.

Attributes are often stored in the object itself. For example, if the
object is an anonymous hash, we can store the attribute values in the
hash using the attribute name as the key.

While it's possible to refer directly to these hash keys outside of the
class, it's considered a best practice to wrap all access to the
attribute with accessor methods.

This has several advantages. Accessors make it easier to change the
implementation of an object later while still preserving the original
API.

An accessor lets you add additional code around attribute access. For
example, you could apply a default to an attribute that wasn't set in
the constructor, or you could validate that a new value for the
attribute is acceptable.

Finally, using accessors makes inheritance much simpler. Subclasses can
use the accessors rather than having to know how a parent class is
implemented internally.

=head3 Writing Accessors
X<accessor>

As with constructors, Perl provides no special accessor declaration
syntax, so classes must provide explicitly written accessor methods.
There are two common types of accessors, read-only and read-write.

A simple read-only accessor simply gets the value of a single
attribute:

  sub path {
      my $self = shift;

      return $self->{path};
  }

A read-write accessor will allow the caller to set the value as well as
get it:

  sub path {
      my $self = shift;

      if (@_) {
          $self->{path} = shift;
      }

      return $self->{path};
  }

=head2 An Aside About Smarter and Safer Code

Our constructor and accessors are not very smart. They don't check that
a C<$path> is defined, nor do they check that a C<$path> is a valid
filesystem path.
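
For illustration only, a hand-written check in the constructor might
look like the following sketch. The error message and the use of
C<croak> from L<Carp> are assumptions made for this example, not part
of the C<File> class shown earlier:

  package File;

  use Carp 'croak';

  sub new {
      my $class = shift;
      my ( $path, $data ) = @_;

      # Reject a missing or empty path before building the object.
      croak 'path is required' unless defined $path && length $path;

      return bless {
          path => $path,
          data => $data,
      }, $class;
  }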

Doing these checks by hand can quickly become tedious. Writing a bunch
of accessors by hand is also incredibly tedious. There are a lot of
modules on CPAN that can help you write safer and more concise code,
including the modules we recommend in L<perlootut>.

=head2 Method Call Variations
X<method>

Perl supports several other ways to call methods besides the C<<
$object->method() >> usage we've seen so far.

=head3 Method Names with a Fully Qualified Name

Perl allows you to call methods using their fully qualified name (the
package and method name):

  my $mp3 = File::MP3->new( 'Regin.mp3', $data );
  $mp3->File::save();

When you call a fully qualified method name like C<File::save>, the method
resolution search for the C<save> method starts in the C<File> class,
skipping any C<save> method the C<File::MP3> class may have defined. It
still searches the C<File> class's parents if necessary.

While this feature is most commonly used to explicitly call methods
inherited from an ancestor class, there is no technical restriction
that enforces this:

  my $obj = Tree->new();
  $obj->Dog::bark();

This calls the C<bark> method from class C<Dog> on an object of class
C<Tree>, even if the two classes are completely unrelated. Use this
with great care.

The C<SUPER> pseudo-class that was described earlier is I<not> the same
as calling a method with a fully-qualified name. See the earlier
L</Inheritance> section for details.

=head3 Method Names as Strings

Perl lets you use a scalar variable containing a string as a method
name:

  my $file = File->new( $path, $data );

  my $method = 'save';
  $file->$method();

This works exactly like calling C<< $file->save() >>. This can be very
useful for writing dynamic code. For example, it allows you to pass a
method name to be called as a parameter to another method.
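
As a minimal sketch of that kind of dynamic code, the loop below calls
several methods by name. The C<Greeter> class and its methods are
hypothetical and exist only for this example:

  package Greeter;

  sub new   { return bless {}, shift }
  sub hello { return 'hello' }
  sub bye   { return 'bye' }

  package main;

  my $greeter = Greeter->new();

  # Each string in the list is used as a method name in turn.
  for my $method (qw( hello bye )) {
      print $greeter->$method(), "\n";
  }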

=head3 Class Names as Strings

Perl also lets you use a scalar containing a string as a class name:

  my $class = 'File';

  my $file = $class->new( $path, $data );

Again, this allows for very dynamic code.

=head3 Subroutine References as Methods

You can also use a subroutine reference as a method:

  my $sub = sub {
      my $self = shift;

      $self->save();
  };

  $file->$sub();

This is exactly equivalent to writing C<< $sub->($file) >>. You may see
this idiom in the wild combined with a call to C<can>:

  if ( my $meth = $object->can('foo') ) {
      $object->$meth();
  }

=head3 Dereferencing Method Call

Perl also lets you use a dereferenced scalar reference in a method
call. That's a mouthful, so let's look at some code:

  $file->${ \'save' };
  $file->${ returns_scalar_ref() };
  $file->${ \( returns_scalar() ) };
  $file->${ returns_ref_to_sub_ref() };

This works if the dereference produces a string I<or> a subroutine
reference.

=head3 Method Calls on Filehandles

Under the hood, Perl filehandles are instances of the C<IO::Handle> or
C<IO::File> class. Once you have an open filehandle, you can call
methods on it. Additionally, you can call methods on the C<STDIN>,
C<STDOUT>, and C<STDERR> filehandles.

  open my $fh, '>', 'path/to/file';
  $fh->autoflush();
  $fh->print('content');

  STDOUT->autoflush();

=head2 Invoking Class Methods
X<invocation>

Because Perl allows you to use barewords for package names and
subroutine names, it sometimes interprets a bareword's meaning
incorrectly. For example, the construct C<< Class->new() >> can be
interpreted as either C<< 'Class'->new() >> or C<< Class()->new() >>.
In English, that second interpretation reads as "call a subroutine
named Class(), then call new() as a method on the return value of
Class()". If there is a subroutine named C<Class()> in the current
namespace, Perl will always interpret C<< Class->new() >> as the second
alternative: a call to C<new()> on the object returned by a call to
C<Class()>.

You can force Perl to use the first interpretation (i.e. as a method
call on the class named "Class") in two ways. First, you can append a
C<::> to the class name:

    Class::->new()

Perl will always interpret this as a method call.

Alternatively, you can quote the class name:

    'Class'->new()

Of course, if the class name is in a scalar Perl will do the right
thing as well:

    my $class = 'Class';
    $class->new();
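
The following sketch puts the three spellings side by side. The
C<Other> package and the C<Class()> subroutine are hypothetical; they
are there only to provoke the ambiguity described above:

  package Class;

  sub new { return bless {}, shift }

  package Other;

  sub new { return bless {}, shift }

  package main;

  # A subroutine that happens to share its name with a class.
  sub Class { return 'Other' }

  my $ambiguous = Class->new();      # parsed as Class()->new()
  my $suffixed  = Class::->new();    # always the class named "Class"
  my $quoted    = 'Class'->new();    # always the class named "Class"

  print ref($ambiguous), "\n";       # prints "Other"
  print ref($suffixed),  "\n";       # prints "Class"
  print ref($quoted),    "\n";       # prints "Class"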

=head3 Indirect Object Syntax
X<indirect object>

B<Outside of the file handle case, use of this syntax is discouraged as
it can confuse the Perl interpreter. See below for more details.>

Perl supports another method invocation syntax called "indirect object"
notation. This syntax is called "indirect" because the method comes
before the object it is being invoked on.

This syntax can be used with any class or object method:

    my $file = new File $path, $data;
    save $file;

We recommend that you avoid this syntax, for several reasons.

First, it can be confusing to read. In the above example, it's not
clear if C<save> is a method provided by the C<File> class or simply a
subroutine that expects a file object as its first argument.

When used with class methods, the problem is even worse. Because Perl
allows subroutine names to be written as barewords, Perl has to guess
whether the bareword after the method is a class name or subroutine
name. In other words, Perl can resolve the syntax as either C<<
File->new( $path, $data ) >> B<or> C<< new( File( $path, $data ) ) >>.

To parse this code, Perl uses a heuristic based on what package names
it has seen, what subroutines exist in the current package, what
barewords it has previously seen, and other input. Needless to say,
heuristics can produce very surprising results!

Older documentation (and some CPAN modules) encouraged this syntax,
particularly for constructors, so you may still find it in the wild.
However, we encourage you to avoid using it in new code.

You can force Perl to interpret the bareword as a class name by
appending "::" to it, like we saw earlier:

  my $file = new File:: $path, $data;

=head2 C<bless>, C<blessed>, and C<ref>

As we saw earlier, an object is simply a data structure that has been
blessed into a class via the C<bless> function. The C<bless> function
can take either one or two arguments:

  my $object = bless {}, $class;
  my $object = bless {};

In the first form, the anonymous hash is being blessed into the class
in C<$class>. In the second form, the anonymous hash is blessed into
the current package.

The second form is strongly discouraged, because it breaks the ability
of a subclass to reuse the parent's constructor, but you may still run
across it in existing code.

If you want to know whether a particular scalar refers to an object,
you can use the C<blessed> function exported by L<Scalar::Util>, which
is shipped with the Perl core.

  use Scalar::Util 'blessed';

  if ( defined blessed($thing) ) { ... }

If C<$thing> refers to an object, then this function returns the name
of the package the object has been blessed into. If C<$thing> doesn't
contain a reference to a blessed object, the C<blessed> function
returns C<undef>.

Note that C<blessed($thing)> will also return false if C<$thing> has
been blessed into a class named "0". This is possible, but quite
pathological. Don't create a class named "0" unless you know what
you're doing.

Similarly, Perl's built-in C<ref> function treats a reference to a
blessed object specially. If you call C<ref($thing)> and C<$thing>
holds a reference to an object, it will return the name of the class
that the object has been blessed into.

If you simply want to check that a variable contains an object
reference, we recommend that you use C<defined blessed($object)>, since
C<ref> returns true values for all references, not just objects.
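
A short sketch of the difference, using an unblessed hash reference and
an object blessed into a hypothetical C<File> class:

  use Scalar::Util 'blessed';

  my $plain  = {};
  my $object = bless {}, 'File';

  print ref($plain),  "\n";    # prints "HASH"
  print ref($object), "\n";    # prints "File"

  # ref() is true for both, so only blessed() tells them apart.
  print defined( blessed($plain) )  ? "object\n" : "not an object\n";
  print defined( blessed($object) ) ? "object\n" : "not an object\n";

The last two lines print "not an object" and then "object".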

=head2 The UNIVERSAL Class
X<UNIVERSAL>

All classes automatically inherit from the L<UNIVERSAL> class, which is
built into the Perl core. This class provides a number of methods, all
of which can be called on either a class or an object. You can also
choose to override some of these methods in your class. If you do so,
we recommend that you follow the built-in semantics described below.

=over 4

=item isa($class)
X<isa>

The C<isa> method returns I<true> if the object is a member of the
class in C<$class>, or a member of a subclass of C<$class>.

If you override this method, it should never throw an exception.

=item DOES($role)
X<DOES>

The C<DOES> method returns I<true> if its object claims to perform the
role C<$role>. By default, this is equivalent to C<isa>. This method is
provided for use by object system extensions that implement roles, like
C<Moose> and C<Role::Tiny>.

You can also override C<DOES> directly in your own classes. If you
override this method, it should never throw an exception.

=item can($method)
X<can>

The C<can> method checks to see if the class or object it was called on
has a method named C<$method>. This checks for the method in the class
and all of its parents. If the method exists, then a reference to the
subroutine is returned. If it does not then C<undef> is returned.

If your class responds to method calls via C<AUTOLOAD>, you may want to
override C<can> to return a subroutine reference for methods which your
C<AUTOLOAD> method handles.

If you override this method, it should never throw an exception.

=item VERSION($need)
X<VERSION>

The C<VERSION> method returns the version number of the class
(package).

If the C<$need> argument is given then it will check that the current
version (as defined by the C<$VERSION> variable in the package) is greater
than or equal to C<$need>; it will die if this is not the case. This
method is called automatically by the C<VERSION> form of C<use>.

    use Package 1.2 qw(some imported subs);
    # implies:
    Package->VERSION(1.2);

We recommend that you use this method to access another package's
version, rather than looking directly at C<$Package::VERSION>. The
package you are looking at could have overridden the C<VERSION> method.

We also recommend using this method to check whether a module has a
sufficient version. The internal implementation uses the L<version>
module to make sure that different types of version numbers are
compared correctly.

=back
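
The following sketch exercises each of these methods. The C<Animal>
and C<Dog> classes and the version number are hypothetical and exist
only to give the methods something to report on.

  use strict;
  use warnings;

  {
      package Animal;
      our $VERSION = '1.02';
      sub new   { my $class = shift; return bless {}, $class }
      sub speak { return 'generic noise' }
  }
  {
      package Dog;
      our @ISA = ('Animal');
  }

  my $dog = Dog->new;

  print "a Dog isa Animal\n"  if $dog->isa('Animal');
  print "a Dog DOES Animal\n" if $dog->DOES('Animal');

  # can() returns a code reference that may be called as a method.
  if ( my $method = $dog->can('speak') ) {
      print 'speak: ', $dog->$method(), "\n";
  }
  print "Dog cannot fetch\n" unless Dog->can('fetch');

  # Without an argument VERSION reports; with one it checks and may die.
  print 'Animal version: ', Animal->VERSION(), "\n";
  my $ok = eval { Animal->VERSION(2); 1 };
  print "version check failed: $@" unless $ok;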

=head2 AUTOLOAD
X<AUTOLOAD>

If you call a method that doesn't exist in a class, Perl will throw an
error. However, if that class or any of its parent classes defines an
C<AUTOLOAD> method, that C<AUTOLOAD> method is called instead.

C<AUTOLOAD> is called as a regular method, and the caller will not know
the difference. Whatever value your C<AUTOLOAD> method returns is
returned to the caller.

The fully qualified method name that was called is available in the
C<$AUTOLOAD> package global for your class. Since this is a global, if
you want to refer to it without a package name prefix under C<strict
'vars'>, you need to declare it.

  # XXX - this is a terrible way to implement accessors, but it makes
  # for a simple example.
  our $AUTOLOAD;
  sub AUTOLOAD {
      my $self = shift;

      # Remove qualifier from original method name...
      my $called =  $AUTOLOAD =~ s/.*:://r;

      # Is there an attribute of that name?
      die "No such attribute: $called"
          unless exists $self->{$called};

      # If so, return it...
      return $self->{$called};
  }

  sub DESTROY { } # see below

Without the C<our $AUTOLOAD> declaration, this code will not compile
under the L<strict> pragma.

As the comment says, this is not a good way to implement accessors.
It's slow and too clever by far. However, you may see this as a way to
provide accessors in older Perl code. See L<perlootut> for
recommendations on OO coding in Perl.

If your class does have an C<AUTOLOAD> method, we strongly recommend
that you override C<can> in your class as well. Your overridden C<can>
method should return a subroutine reference for any method that your
C<AUTOLOAD> responds to.
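
One possible way to keep C<can> and C<AUTOLOAD> in agreement is
sketched below. The C<Record> class is hypothetical; it makes
C<AUTOLOAD> delegate to the overridden C<can>, so the two can never
disagree about which methods exist.

  package Record;

  use strict;
  use warnings;

  our $AUTOLOAD;

  sub new {
      my ( $class, %args ) = @_;
      return bless { %args }, $class;
  }

  sub can {
      my ( $self, $method ) = @_;

      # Real methods (new, can, DESTROY, ...) are found the normal way.
      my $code = $self->SUPER::can($method);
      return $code if $code;

      # Autoloaded accessors exist only for attributes of an instance.
      return unless ref($self) && exists $self->{$method};
      return sub { my $obj = shift; return $obj->{$method} };
  }

  sub AUTOLOAD {
      my $self   = shift;
      my $called = $AUTOLOAD =~ s/.*:://r;

      my $code = $self->can($called)
          or die "No such attribute: $called";
      return $self->$code(@_);
  }

  sub DESTROY { }    # keep DESTROY out of AUTOLOAD

With this in place, C<< Record->new( name => 'x' )->can('name') >>
returns a code reference, while C<can> on an unknown attribute returns
false.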

=head2 Destructors
X<destructor> X<DESTROY>

When the last reference to an object goes away, the object is
destroyed. If you only have one reference to an object stored in a
lexical scalar, the object is destroyed when that scalar goes out of
scope. If you store the object in a package global, that object may not
go out of scope until the program exits.

If you want to do something when the object is destroyed, you can
define a C<DESTROY> method in your class. This method will always be
called by Perl at the appropriate time, unless the method is empty.

This is called just like any other method, with the object as the first
argument. It does not receive any additional arguments. However, the
C<$_[0]> variable will be read-only in the destructor, so you cannot
assign a value to it.

If your C<DESTROY> method throws an error, this error will be ignored.
It will not be sent to C<STDERR> and it will not cause the program to
die. However, if your destructor is running inside an C<eval {}> block,
then the error will change the value of C<$@>.

Because C<DESTROY> methods can be called at any time, you should
localize any global variables you might update in your C<DESTROY>. In
particular, if you use C<eval {}> you should localize C<$@>, and if you
use C<system> or backticks you should localize C<$?>.
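
A minimal sketch of this advice follows. The C<Guard> class and its
C<cleanup> attribute are invented for the example; the point is only
that the destructor's own C<eval {}> does not disturb the error the
caller is about to inspect.

  use strict;
  use warnings;

  {
      package Guard;

      sub new {
          my ( $class, $cleanup ) = @_;
          return bless { cleanup => $cleanup }, $class;
      }

      sub DESTROY {
          my $self = shift;

          # Keep the caller's $@ and $? intact whatever happens below.
          local ( $@, $? );

          eval { $self->{cleanup}->(); 1 }
              or warn "cleanup failed: $@";
      }
  }

  my $ok = eval {
      my $guard = Guard->new( sub { die "cleanup problem\n" } );
      die "original error\n";
  };

  # The destructor ran while the eval was unwinding, yet $@ still
  # holds the error raised by the block itself.
  print "caught: $@" unless $ok;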

If you define an C<AUTOLOAD> in your class, then Perl will call your
C<AUTOLOAD> to handle the C<DESTROY> method. You can prevent this by
defining an empty C<DESTROY>, like we did in the autoloading example.
You can also check the value of C<$AUTOLOAD> and return without doing
anything when called to handle C<DESTROY>.

=head3 Global Destruction

The order in which objects are destroyed during the global destruction
phase, before the program exits, is unpredictable. This means that any objects
contained by your object may already have been destroyed. You should
check that a contained object is defined before calling a method on it:

  sub DESTROY {
      my $self = shift;

      $self->{handle}->close() if $self->{handle};
  }

You can use the C<${^GLOBAL_PHASE}> variable to detect if you are
currently in the global destruction phase:

  sub DESTROY {
      my $self = shift;

      return if ${^GLOBAL_PHASE} eq 'DESTRUCT';

      $self->{handle}->close();
  }

Note that this variable was added in Perl 5.14.0. If you want to detect
the global destruction phase on older versions of Perl, you can use the
C<Devel::GlobalDestruction> module on CPAN.

If your C<DESTROY> method issues a warning during global destruction,
the Perl interpreter will append the string " during global
destruction" to the warning.

During global destruction, Perl will always garbage collect objects
before unblessed references. See L<perlhacktips/PERL_DESTRUCT_LEVEL>
for more information about global destruction.

=head2 Non-Hash Objects

All the examples so far have shown objects based on a blessed hash.
However, it's possible to bless any type of data structure or referent,
including scalars, globs, and subroutines. You may see this sort of
thing when looking at code in the wild.

Here's an example of a module as a blessed scalar:

  package Time;

  use strict;
  use warnings;

  sub new {
      my $class = shift;

      my $time = time;
      return bless \$time, $class;
  }

  sub epoch {
      my $self = shift;
      return ${ $self };
  }

  my $time = Time->new();
  print $time->epoch();

=head2 Inside-Out objects

In the past, the Perl community experimented with a technique called
"inside-out objects". An inside-out object stores its data outside of
the object's reference, indexed on a unique property of the object,
such as its memory address, rather than in the object itself. This has
the advantage of enforcing the encapsulation of object attributes,
since their data is not stored in the object itself.

This technique was popular for a while (and was recommended in Damian
Conway's I<Perl Best Practices>), but never achieved universal
adoption. The L<Object::InsideOut> module on CPAN provides a
comprehensive implementation of this technique, and you may see it or
other inside-out modules in the wild.

Here is a simple example of the technique, using the
L<Hash::Util::FieldHash> core module. This module was added to the core
to support inside-out object implementations.

  package Time;

  use strict;
  use warnings;

  use Hash::Util::FieldHash 'fieldhash';

  fieldhash my %time_for;

  sub new {
      my $class = shift;

      my $self = bless \( my $object ), $class;

      $time_for{$self} = time;

      return $self;
  }

  sub epoch {
      my $self = shift;

      return $time_for{$self};
  }

  my $time = Time->new;
  print $time->epoch;

=head2 Pseudo-hashes

The pseudo-hash feature was an experimental feature introduced in
earlier versions of Perl and removed in 5.10.0. A pseudo-hash is an
array reference which can be accessed using named keys like a hash. You
may run into some code in the wild which uses it. See the L<fields>
pragma for more information.

=head1 SEE ALSO

A kinder, gentler tutorial on object-oriented programming in Perl can
be found in L<perlootut>. You should also check out L<perlmodlib> for
some style guides on constructing both modules and classes.

=encoding utf8

=head1 NAME

perl5220delta - what is new for perl v5.22.0

=head1 DESCRIPTION

This document describes differences between the 5.20.0 release and the 5.22.0
release.

If you are upgrading from an earlier release such as 5.18.0, first read
L<perl5200delta>, which describes differences between 5.18.0 and 5.20.0.

=head1 Core Enhancements

=head2 New bitwise operators

A new experimental facility has been added that makes the four standard
bitwise operators (C<& | ^ ~>) treat their operands consistently as
numbers, and introduces four new dotted operators (C<&. |. ^. ~.>) that
treat their operands consistently as strings.  The same applies to the
assignment variants (C<&= |= ^= &.= |.= ^.=>).

To use this, enable the "bitwise" feature and disable the
"experimental::bitwise" warnings category.  See L<perlop/Bitwise String
Operators> for details.
L<[perl #123466]|https://rt.perl.org/Ticket/Display.html?id=123466>.
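
As a small sketch (requiring Perl 5.22 or later), the following shows
the same string operands giving a numeric result with C<|> and a
string result with C<|.>:

    use strict;
    use warnings;
    use feature 'bitwise';
    no warnings 'experimental::bitwise';

    # Plain operators always treat their operands as numbers...
    my $number = "3" | "12";      # 3 | 12 == 15

    # ...while the dotted operators always treat them as strings.
    my $string = "3" |. "12";     # bytewise OR of "3" and "12" is "32"
    my $lower  = "AB" |. "  ";    # OR with spaces sets bit 0x20: "ab"

    print "$number $string $lower\n";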

=head2 New double-diamond operator

C<<< <<>> >>> is like C<< <> >> but uses three-argument C<open> to open
each file in C<@ARGV>.  This means that each element of C<@ARGV> will be treated
as an actual file name, and C<"|foo"> won't be treated as a pipe open.
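
A minimal, self-contained sketch follows (Perl 5.22 or later). It
writes a temporary file purely so that there is something to read; in
a real program C<@ARGV> would come from the command line.

    use strict;
    use warnings;
    use File::Temp 'tempfile';

    # Create a small input file so the sketch is self-contained.
    my ( $out, $path ) = tempfile( UNLINK => 1 );
    print {$out} "first\nsecond\n";
    close $out or die "close failed: $!";

    # Every element of @ARGV is opened as a literal file name.
    local @ARGV = ($path);

    my @lines;
    while ( defined( my $line = <<>> ) ) {
        push @lines, $line;
    }
    print scalar(@lines), " lines read\n";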

=head2 New C<\b> boundaries in regular expressions

=head3 C<qr/\b{gcb}/>

C<gcb> stands for Grapheme Cluster Boundary.  It is a Unicode property
that finds the boundary between sequences of characters that look like a
single character to a native speaker of a language.  Perl has long had
the ability to deal with these through the C<\X> regular expression escape
sequence.  Now, there is an alternative way of handling these.  See
L<perlrebackslash/\b{}, \b, \B{}, \B> for details.
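
As an illustrative sketch (Perl 5.22 or later), both approaches can be
used to break a string into grapheme clusters. The sample string is an
assumption chosen to contain one combining character.

    use strict;
    use warnings;

    # "e" followed by COMBINING ACUTE ACCENT, then "a": three code
    # points, but two user-perceived characters.
    my $text = "e\x{301}a";

    my @by_x   = $text =~ /\X/g;
    my @by_gcb = split /\b{gcb}/, $text;

    printf "%d code points, %d via \\X, %d via \\b{gcb}\n",
        length($text), scalar(@by_x), scalar(@by_gcb);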

=head3 C<qr/\b{wb}/>

C<wb> stands for Word Boundary.  It is a Unicode property
that finds the boundary between words.  This is similar to the plain
C<\b> (without braces) but is more suitable for natural language
processing.  It knows, for example, that apostrophes can occur in the
middle of words.  See L<perlrebackslash/\b{}, \b, \B{}, \B> for details.

=head3 C<qr/\b{sb}/>

C<sb> stands for Sentence Boundary.  It is a Unicode property
to aid in parsing natural language sentences.
See L<perlrebackslash/\b{}, \b, \B{}, \B> for details.

=head2 Non-Capturing Regular Expression Flag

Regular expressions now support a C</n> flag that disables capturing
and filling in C<$1>, C<$2>, etc. inside of groups:

  "hello" =~ /(hi|hello)/n; # $1 is not set

This is equivalent to putting C<?:> at the beginning of every capturing group.

See L<perlre/"n"> for more information.
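
Named groups still capture under C</n>, which makes it convenient to
capture only the groups you name. A short sketch, assuming Perl 5.22
or later:

    use strict;
    use warnings;

    if ( "hello world" =~ /(hi|hello) (?<what>\w+)/n ) {
        # The plain group did not capture; the named group still did.
        print "greeted: $+{what}\n";
    }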

=head2 C<use re 'strict'>

This applies stricter syntax rules to regular expression patterns
compiled within its scope. This will hopefully alert you to typos and
other unintentional behavior that backwards-compatibility issues prevent
us from reporting in normal regular expression compilations.  Because the
behavior of this is subject to change in future Perl releases as we gain
experience, using this pragma will raise a warning of category
C<experimental::re_strict>.
See L<'strict' in re|re/'strict' mode>.

=head2 Unicode 7.0 (with correction) is now supported

For details on what is in this release, see
L<http://www.unicode.org/versions/Unicode7.0.0/>.
The version of Unicode 7.0 that comes with Perl includes
a correction dealing with glyph shaping in Arabic
(see L<http://www.unicode.org/errata/#current_errata>).


=head2 S<C<use locale>> can restrict which locale categories are affected

It is now possible to pass a parameter to S<C<use locale>> to specify
a subset of locale categories to be locale-aware, with the remaining
ones unaffected.  See L<perllocale/The "use locale" pragma> for details.

=head2 Perl now supports POSIX 2008 locale currency additions

On platforms that are able to handle POSIX.1-2008, the
hash returned by
L<C<POSIX::localeconv()>|perllocale/The localeconv function>
includes the international currency fields added by that version of the
POSIX standard.  These are
C<int_n_cs_precedes>,
C<int_n_sep_by_space>,
C<int_n_sign_posn>,
C<int_p_cs_precedes>,
C<int_p_sep_by_space>,
and
C<int_p_sign_posn>.
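
The sketch below simply reports whichever of these fields the current
locale and platform provide. A field may be absent from the hash, so
the code checks with C<exists> rather than assuming it is there.

    use strict;
    use warnings;
    use POSIX ();

    my $conv = POSIX::localeconv();

    for my $field (
        qw( int_n_cs_precedes int_n_sep_by_space int_n_sign_posn
            int_p_cs_precedes int_p_sep_by_space int_p_sign_posn )
      )
    {
        my $value = exists $conv->{$field} ? $conv->{$field} : '(not set)';
        printf "%-20s %s\n", $field, $value;
    }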

=head2 Better heuristics on older platforms for determining locale UTF-8ness

On platforms that implement neither the C99 standard nor the POSIX 2001
standard, determining if the current locale is UTF-8 or not depends on
heuristics.  These are improved in this release.

=head2 Aliasing via reference

Variables and subroutines can now be aliased by assigning to a reference:

    \$c = \$d;
    \&x = \&y;

Aliasing can also be accomplished
by using a backslash before a C<foreach> iterator variable; this is
perhaps the most useful idiom this feature provides:

    foreach \%hash (@array_of_hash_refs) { ... }

This feature is experimental and must be enabled via S<C<use feature
'refaliasing'>>.  It will warn unless the C<experimental::refaliasing>
warnings category is disabled.
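
A short sketch of both forms, with made-up variable names:

    use strict;
    use warnings;
    use feature 'refaliasing';
    no warnings 'experimental::refaliasing';

    my $value = 1;
    my $alias;
    \$alias = \$value;    # $alias is now another name for $value
    $alias = 42;
    print "value is $value\n";

    my @rows  = ( { n => 1 }, { n => 2 }, { n => 3 } );
    my $total = 0;
    foreach \my %row (@rows) {
        $total += $row{n};    # no $row->{n} dereference needed
    }
    print "total is $total\n";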

See L<perlref/Assigning to References>.

=head2 C<prototype> with no arguments

C<prototype()> with no arguments now infers C<$_>.
L<[perl #123514]|https://rt.perl.org/Ticket/Display.html?id=123514>.

=head2 New C<:const> subroutine attribute

The C<const> attribute can be applied to an anonymous subroutine.  It
causes the new sub to be executed immediately whenever one is created
(I<i.e.> when the C<sub> expression is evaluated).  Its value is captured
and used to create a new constant subroutine that is returned.  This
feature is experimental.  See L<perlsub/Constant Functions>.

=head2 C<fileno> now works on directory handles

When the relevant support is available in the operating system, the
C<fileno> builtin now works on directory handles, yielding the
underlying file descriptor in the same way as for filehandles. On
operating systems without such support, C<fileno> on a directory handle
continues to return the undefined value, as before, but also sets C<$!> to
indicate that the operation is not supported.

Currently, this uses either a C<dd_fd> member in the OS C<DIR>
structure, or a C<dirfd(3)> function as specified by POSIX.1-2008.
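
Because support depends on the operating system, portable code should
check the result. A minimal sketch:

    use strict;
    use warnings;

    opendir my $dh, '.' or die "Cannot open current directory: $!";

    my $fd = fileno $dh;
    if ( defined $fd ) {
        print "directory handle uses file descriptor $fd\n";
    }
    else {
        print "fileno is not supported on directory handles here: $!\n";
    }

    closedir $dh;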

=head2 List form of pipe open implemented for Win32

The list form of pipe open:

  open my $fh, "-|", "program", @arguments;

is now implemented on Win32.  It has the same limitations as C<system
LIST> on Win32, since the Win32 API doesn't accept program arguments
as a list.
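
A self-contained sketch follows. It runs the current perl binary
(C<$^X>) as the child program only so that the example does not depend
on any other installed command.

    use strict;
    use warnings;

    # Each list element is handed to the child as one argument.
    open my $fh, '-|', $^X, '-e', 'print "hello from child\n"'
        or die "Cannot start child: $!";

    my @output = <$fh>;
    close $fh or die "Child failed: status $?";

    print @output;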

=head2 Assignment to list repetition

C<(...) x ...> can now be used within a list that is assigned to, as long
as the left-hand side is a valid lvalue.  This allows S<C<(undef,undef,$foo)
= that_function()>> to be written as S<C<((undef)x2, $foo) = that_function()>>.
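
For instance, with a stand-in C<that_function> defined just for this
sketch (Perl 5.22 or later):

    use strict;
    use warnings;

    sub that_function { return ( 'skip', 'skip too', 'keep' ) }

    my $foo;
    ( (undef) x 2, $foo ) = that_function();
    print "$foo\n";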

=head2 Infinity and NaN (not-a-number) handling improved

Floating point values are able to hold the special values infinity, negative
infinity, and NaN (not-a-number).  Perl now recognizes and propagates
these values more robustly in computations, and on output normalizes them to the strings
C<Inf>, C<-Inf>, and C<NaN>.

See also the L<POSIX> enhancements.
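
A small sketch of how these values arise and print. On Perl 5.22 or
later the first line of output should use the normalized spellings
described above.

    use strict;
    use warnings;

    my $inf = 9**9**9;        # overflows to infinity
    my $nan = $inf - $inf;    # infinity minus infinity is not a number

    print "$inf ", -$inf, " $nan\n";
    print "NaN is never equal to itself\n" if $nan != $nan;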

=head2 Floating point parsing has been improved

Parsing and printing of floating point values has been improved.

As a completely new feature, hexadecimal floating point literals
(like C<0x1.23p-4>)  are now supported, and they can be output with
S<C<printf "%a">>. See L<perldata/Scalar value constructors> for more
details.
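
For example (Perl 5.22 or later):

    use strict;
    use warnings;

    my $x = 0x1.8p1;    # hexadecimal 1.8 is 1.5; times 2**1 gives 3
    print "$x\n";

    # %a writes a value back out in hexadecimal floating point form.
    printf "%a\n", 0.5;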

=head2 Packing infinity or not-a-number into a character is now fatal

Before, when trying to pack infinity or not-a-number into a
(signed) character, Perl would warn, and assumed you tried to
pack C<< 0xFF >>; if you gave it as an argument to C<< chr >>,
C<< U+FFFD >> was returned.

But now, all such actions (C<< pack >>, C<< chr >>, and C<< printf '%c' >>)
result in a fatal error.
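
Code that may receive such values can trap the error with C<eval>. A
minimal sketch:

    use strict;
    use warnings;

    my $inf = 9**9**9;

    my $ok = eval { my $char = chr $inf; 1 };
    print $ok
        ? "no fatal error (behavior before 5.22)\n"
        : "fatal: $@";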

=head2 Experimental C Backtrace API

Perl now supports (via a C level API) retrieving
the C level backtrace (similar to what symbolic debuggers like gdb do).

The backtrace returns the stack trace of the C call frames,
with the symbol names (function names), the object names (like "perl"),
and if it can, also the source code locations (file:line).

The supported platforms are Linux and OS X (some *BSD might work at
least partly, but they have not yet been tested).

The feature needs to be enabled with C<Configure -Dusecbacktrace>.

See L<perlhacktips/"C backtrace"> for more information.

=head1 Security

=head2 Perl is now compiled with C<-fstack-protector-strong> if available

Perl has been compiled with the anti-stack-smashing option
C<-fstack-protector> since 5.10.1.  Now Perl uses the newer variant
called C<-fstack-protector-strong>, if available.

=head2 The L<Safe> module could allow outside packages to be replaced

Critical bugfix: outside packages could be replaced.  L<Safe> has
been patched to 2.38 to address this.

=head2 Perl is now always compiled with C<-D_FORTIFY_SOURCE=2> if available

The 'code hardening' option called C<_FORTIFY_SOURCE>, available in
gcc 4.*, is now always used for compiling Perl, if available.

Note that this isn't necessarily a huge step since on many platforms
the step had already been taken several years ago: many Linux
distributions (like Fedora) have been using this option for Perl,
and OS X has enforced the same for many years.

=head1 Incompatible Changes

=head2 Subroutine signatures moved before attributes

The experimental sub signatures feature, as introduced in 5.20, parsed
signatures after attributes. In this release, following feedback from users
of the experimental feature, the positioning has been moved such that
signatures occur after the subroutine name (if any) and before the attribute
list (if any).

=head2 C<&> and C<\&> prototypes accept only subs

The C<&> prototype character now accepts only anonymous subs (C<sub
{...}>), things beginning with C<\&>, or an explicit C<undef>.  Formerly
it erroneously also allowed references to arrays, hashes, and lists.
L<[perl #4539]|https://rt.perl.org/Ticket/Display.html?id=4539>.
L<[perl #123062]|https://rt.perl.org/Ticket/Display.html?id=123062>.
L<[perl #123475]|https://rt.perl.org/Ticket/Display.html?id=123475>.

In addition, the C<\&> prototype was allowing subroutine calls, whereas
now it only allows subroutines: C<&foo> is still permitted as an argument,
while C<&foo()> and C<foo()> no longer are.
L<[perl #77860]|https://rt.perl.org/Ticket/Display.html?id=77860>.

=head2 C<use encoding> is now lexical

The L<encoding> pragma's effect is now limited to lexical scope.  This
pragma is deprecated, but in the meantime, it could adversely affect
unrelated modules that are included in the same program; this change
fixes that.

=head2 List slices returning empty lists

List slices now return an empty list only if the original list was empty
(or if there are no indices).  Formerly, a list slice would return an empty
list if all indices fell outside the original list; now it returns a list
of C<undef> values in that case.
L<[perl #114498]|https://rt.perl.org/Ticket/Display.html?id=114498>.
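
The two cases can be compared directly. On Perl 5.22 or later this
sketch should report two elements for the first slice and none for the
second:

    use strict;
    use warnings;

    my @beyond = ( 'a', 'b', 'c' )[ 5, 6 ];    # indices past the end
    my @empty  = ()[ 0, 1 ];                   # empty original list

    printf "%d %d\n", scalar(@beyond), scalar(@empty);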

=head2 C<\N{}> with a sequence of multiple spaces is now a fatal error

E.g. S<C<\N{TOOE<nbsp>E<nbsp>MANY SPACES}>> or S<C<\N{TRAILING SPACE }>>.
This has been deprecated since v5.18.

=head2 S<C<use UNIVERSAL '...'>> is now a fatal error

Importing functions from C<UNIVERSAL> has been deprecated since v5.12, and
is now a fatal error.  S<C<use UNIVERSAL>> without any arguments is still
allowed.

=head2 In double-quotish C<\cI<X>>, I<X> must now be a printable ASCII character

In prior releases, failure to do this raised a deprecation warning.

=head2 Splitting the tokens C<(?> and C<(*> in regular expressions is now a fatal compilation error.

These had been deprecated since v5.18.

=head2 C<qr/foo/x> now ignores all Unicode pattern white space

The C</x> regular expression modifier allows the pattern to contain
white space and comments (both of which are ignored) for improved
readability.  Until now, not all the white space characters that Unicode
designates for this purpose were handled.  The additional ones now
recognized are:

    U+0085 NEXT LINE
    U+200E LEFT-TO-RIGHT MARK
    U+200F RIGHT-TO-LEFT MARK
    U+2028 LINE SEPARATOR
    U+2029 PARAGRAPH SEPARATOR

The use of these characters with C</x> outside bracketed character
classes and when not preceded by a backslash has raised a deprecation
warning since v5.18.  Now they will be ignored.

=head2 Comment lines within S<C<(?[ ])>> are now ended only by a C<\n>

S<C<(?[ ])>>  is an experimental feature, introduced in v5.18.  It operates
as if C</x> is always enabled.  But there was a difference: comment
lines (following a C<#> character) were terminated by anything matching
C<\R> which includes all vertical whitespace, such as form feeds.  For
consistency, this is now changed to match what terminates comment lines
outside S<C<(?[ ])>>, namely a C<\n> (even if escaped), which is the
same as what terminates a heredoc string and formats.

=head2 C<(?[...])> operators now follow standard Perl precedence

This experimental feature allows set operations in regular expression patterns.
Prior to this, the intersection operator had the same precedence as the other
binary operators.  Now it has higher precedence.  This could lead to different
outcomes than existing code expects (though the documentation has always noted
that this change might happen, recommending fully parenthesizing the
expressions).  See L<perlrecharclass/Extended Bracketed Character Classes>.

=head2 Omitting C<%> and C<@> on hash and array names is no longer permitted

Really old Perl let you omit the C<@> on array names and the C<%> on hash
names in some spots.  This has issued a deprecation warning since Perl
5.000, and is no longer permitted.

=head2 C<"$!"> text is now in English outside the scope of C<use locale>

Previously, the text, unlike almost everything else, always came out
based on the current underlying locale of the program.  (Also affected
on some systems is C<"$^E">.)  For programs that are unprepared to
handle locale differences, this can cause garbage text to be displayed.
It's better to display text that is translatable via some tool than
garbage text which is much harder to figure out.

=head2 C<"$!"> text will be returned in UTF-8 when appropriate

The stringification of C<$!> and C<$^E> will have the UTF-8 flag set
when the text is actually non-ASCII UTF-8.  This will enable programs
that are set up to be locale-aware to properly output messages in the
user's native language.  Code that needs to continue the 5.20 and
earlier behavior can do the stringification within the scopes of both
S<C<use bytes>> and S<C<use locale ":messages">>.  Within these two
scopes, no other Perl operations will
be affected by locale; only C<$!> and C<$^E> stringification.  The
C<bytes> pragma causes the UTF-8 flag to not be set, just as in previous
Perl releases.  This resolves
L<[perl #112208]|https://rt.perl.org/Ticket/Display.html?id=112208>.

=head2 Support for C<?PATTERN?> without explicit operator has been removed

The C<m?PATTERN?> construct, which allows matching a regex only once,
previously had an alternative form that was written directly with a question
mark delimiter, omitting the explicit C<m> operator.  This usage has produced
a deprecation warning since 5.14.0.  It is now a syntax error, so that the
question mark can be available for use in new operators.
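
The explicit form keeps working as before. In this sketch the pattern
matches only the first qualifying word, because C<m?PATTERN?> succeeds
just once until C<reset> is called:

    use strict;
    use warnings;

    my @hits;
    for my $word (qw( apple avocado banana )) {
        push @hits, $word if $word =~ m?^a?;
    }
    print "@hits\n";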

=head2 C<defined(@array)> and C<defined(%hash)> are now fatal errors

These have been deprecated since v5.6.1 and have raised deprecation
warnings since v5.16.
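
The usual replacement is to test the aggregate itself in boolean
context, as in this short sketch:

    use strict;
    use warnings;

    my @queue;
    my %seen;

    print "queue is empty\n"   unless @queue;
    print "nothing seen yet\n" unless %seen;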

=head2 Using a hash or an array as a reference is now a fatal error

For example, C<< %foo->{"bar"} >> now causes a fatal compilation
error.  These have been deprecated since before v5.8, and have raised
deprecation warnings since then.

=head2 Changes to the C<*> prototype

The C<*> character in a subroutine's prototype used to allow barewords to take
precedence over most, but not all, subroutine names.  It was never
consistent and exhibited buggy behavior.

Now it has been changed, so subroutines always take precedence over barewords,
which brings it into conformity with similarly prototyped built-in functions:

    sub splat(*) { ... }
    sub foo { ... }
    splat(foo); # now always splat(foo())
    splat(bar); # still splat('bar') as before
    close(foo); # close(foo())
    close(bar); # close('bar')

=head1 Deprecations

=head2 Setting C<${^ENCODING}> to anything but C<undef>

This variable allows Perl scripts to be written in an encoding other than
ASCII or UTF-8.  However, it affects all modules globally, leading
to wrong answers and segmentation faults.  New scripts should be written
in UTF-8; old scripts should be converted to UTF-8, which is easily done
with the L<piconv> utility.

=head2 Use of non-graphic characters in single-character variable names

The syntax for single-character variable names is more lenient than
for longer variable names, allowing the one-character name to be a
punctuation character or even invisible (a non-graphic).  Perl v5.20
deprecated the ASCII-range controls as such a name.  Now, all
non-graphic characters that formerly were allowed are deprecated.
The practical effect of this occurs only when not under
S<C<use utf8>>, and affects just the C1 controls (code points 0x80
through 0x9F), NO-BREAK SPACE, and SOFT HYPHEN.

=head2 Inlining of C<sub () { $var }> with observable side-effects

In many cases Perl makes S<C<sub () { $var }>> into an inlinable constant
subroutine, capturing the value of C<$var> at the time the C<sub> expression
is evaluated.  This can break the closure behavior in those cases where
C<$var> is subsequently modified, since the subroutine won't return the
changed value. (Note that this all only applies to anonymous subroutines
with an empty prototype (S<C<sub ()>>).)

This usage is now deprecated in those cases where the variable could be
modified elsewhere.  Perl detects those cases and emits a deprecation
warning.  Such code will likely change in the future and stop producing a
constant.

If your variable is only modified in the place where it is declared, then
Perl will continue to make the sub inlinable with no warnings.

    sub make_constant {
        my $var = shift;
        return sub () { $var }; # fine
    }

    sub make_constant_deprecated {
        my $var;
        $var = shift;
        return sub () { $var }; # deprecated
    }

    sub make_constant_deprecated2 {
        my $var = shift;
        log_that_value($var); # could modify $var
        return sub () { $var }; # deprecated
    }

In the second example above, determining that C<$var> is assigned to only once
is too hard.  That it happens in a spot other than the C<my>
declaration is enough for Perl to find it suspicious.

This deprecation warning happens only when a simple variable forms the body of
the sub.  (A C<BEGIN> block or C<use> statement inside the sub is ignored,
because it does not become part of the sub's body.)  For more complex
cases, such as S<C<sub () { do_something() if 0; $var }>> the behavior has
changed such that inlining does not happen if the variable is modifiable
elsewhere.  Such cases should be rare.

=head2 Use of multiple C</x> regexp modifiers

It is now deprecated to say something like any of the following:

    qr/foo/xx;
    /(?xax:foo)/;
    use re qw(/amxx);

That is, now C<x> should only occur once in any string of contiguous
regular expression pattern modifiers.  We do not believe there are any
occurrences of this in all of CPAN.  This is in preparation for a future
Perl release having C</xx> permit white-space for readability in
bracketed character classes (those enclosed in square brackets:
C<[...]>).

=head2 Using a NO-BREAK space in a character alias for C<\N{...}> is now deprecated

This non-graphic character is essentially indistinguishable from a
regular space, and so should not be allowed.  See
L<charnames/CUSTOM ALIASES>.

=head2 A literal C<"{"> should now be escaped in a pattern

If you want a literal left curly bracket (also called a left brace) in a
regular expression pattern, you should now escape it by
preceding it with a backslash (C<"\{">), by enclosing it within square
brackets (C<"[{]">), or by using C<\Q>; otherwise a deprecation warning
will be raised.  This was first announced as forthcoming in the v5.16
release; it will allow future extensions to the language to happen.
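
All three spellings match a literal brace, as this sketch with made-up
sample data shows:

    use strict;
    use warnings;

    my $text    = 'a{b';
    my $literal = 'a{b';

    print "backslash\n" if $text =~ /a\{b/;
    print "bracketed\n" if $text =~ /a[{]b/;
    print "quoted\n"    if $text =~ /\Q$literal\E/;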

=head2 Making all warnings fatal is discouraged

The documentation for L<fatal warnings|warnings/Fatal Warnings> notes that
C<< use warnings FATAL => 'all' >> is discouraged, and provides stronger
language about the risks of fatal warnings in general.

=head1 Performance Enhancements

=over 4

=item *

If a method or class name is known at compile time, a hash is precomputed
to speed up run-time method lookup.  Also, compound method names like
C<SUPER::new> are parsed at compile time, to save having to parse them at
run time.

=item *

Array and hash lookups (especially nested ones) that use only constants
or simple variables as keys, are now considerably faster. See
L</Internal Changes> for more details.

=item *

C<(...)x1>, C<("constant")x0> and C<($scalar)x0> are now optimised in list
context.  If the right-hand argument is a constant 1, the repetition
operator disappears.  If the right-hand argument is a constant 0, the whole
expression is optimised to the empty list, so long as the left-hand
argument is a simple scalar or constant.  (That is, C<(foo())x0> is not
subject to this optimisation.)

=item *

C<substr> assignment is now optimised into 4-argument C<substr> at the end
of a subroutine (or as the argument to C<return>).  Previously, this
optimisation only happened in void context.

=item *

In C<"\L...">, C<"\Q...">, etc., the extra "stringify" op is now optimised
away, making these just as fast as C<lcfirst>, C<quotemeta>, etc.

=item *

Assignment to an empty list is now sometimes faster.  In particular, it
never calls C<FETCH> on tied arguments on the right-hand side, whereas it
used to sometimes.

=item *

There is a performance improvement of up to 20% when C<length> is applied to
a non-magical, non-tied string, and either C<use bytes> is in scope or the
string doesn't use UTF-8 internally.

=item *

On most perl builds with 64-bit integers, memory usage for non-magical,
non-tied scalars containing only a floating point value has been reduced
by between 8 and 32 bytes, depending on OS.

=item *

In C<@array = split>, the assignment can be optimized away, so that C<split>
writes directly to the array.  This optimisation was happening only for
package arrays other than C<@_>, and only sometimes.  Now this
optimisation happens almost all the time.

=item *

C<join> is now subject to constant folding.  So for example
S<C<join "-", "a", "b">> is converted at compile-time to C<"a-b">.
Moreover, C<join> with a scalar or constant for the separator and a
single-item list to join is simplified to a stringification, and the
separator doesn't even get evaluated.

=item *

C<qq(@array)> is implemented using two ops: a stringify op and a join op.
If the C<qq> contains nothing but a single array, the stringification is
optimized away.

=item *

S<C<our $var>> and S<C<our($s,@a,%h)>> in void context are no longer evaluated at
run time.  Even a whole sequence of S<C<our $foo;>> statements will simply be
skipped over.  The same applies to C<state> variables.

=item *

Many internal functions have been refactored to improve performance and reduce
their memory footprints.
L<[perl #121436]|https://rt.perl.org/Ticket/Display.html?id=121436>
L<[perl #121906]|https://rt.perl.org/Ticket/Display.html?id=121906>
L<[perl #121969]|https://rt.perl.org/Ticket/Display.html?id=121969>

=item *

C<-T> and C<-B> filetests will return sooner when an empty file is detected.
L<[perl #121489]|https://rt.perl.org/Ticket/Display.html?id=121489>

=item *

Hash lookups where the key is a constant are faster.

=item *

Subroutines with an empty prototype and a body containing just C<undef> are now
eligible for inlining.
L<[perl #122728]|https://rt.perl.org/Ticket/Display.html?id=122728>

=item *

Subroutines in packages no longer need to be stored in typeglobs:
declaring a subroutine will now put a simple sub reference directly in the
stash if possible, saving memory.  The typeglob still notionally exists,
so accessing it will cause the stash entry to be upgraded to a typeglob
(I<i.e.> this is just an internal implementation detail).
This optimization does not currently apply to XSUBs or exported
subroutines, and method calls will undo it, since they cache things in
typeglobs.
L<[perl #120441]|https://rt.perl.org/Ticket/Display.html?id=120441>

=item *

The functions C<utf8::native_to_unicode()> and C<utf8::unicode_to_native()>
(see L<utf8>) are now optimized out on ASCII platforms.  There is now not even
a minimal performance hit in writing code portable between ASCII and EBCDIC
platforms.

=item *

Win32 Perl uses 8 KB less of per-process memory than before for every perl
process, because some data is now memory mapped from disk and shared
between processes from the same perl binary.

=back
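
As a small illustration of the C<join> and C<split> items above, the
following sketch shows constructs that are candidates for these
optimisations.  The variable names are made up for this example.  The
optimisations themselves are not visible from Perl code; only the results,
which are unchanged, can be checked.

```perl

    use strict;
    use warnings;

    # All-constant arguments: a candidate for compile-time folding to "a-b".
    my $folded = join "-", "a", "b";

    # Scalar separator and a single-item list: the result equals "$item".
    my $sep    = ", ";
    my $item   = 42;
    my $single = join $sep, $item;

    # split assigned straight to an array.
    my @fields;
    @fields = split /,/, "x,y,z";

    print "$folded | $single | @fields\n";

```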

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

Many of the libraries distributed with perl have been upgraded since v5.20.0.
For a complete list of changes, run:

  corelist --diff 5.20.0 5.22.0

You can substitute your favorite version in place of 5.20.0, too.

Some notable changes include:

=over 4

=item *

L<Archive::Tar> has been upgraded to version 2.04.

Tests can now be run in parallel.

=item *

L<attributes> has been upgraded to version 0.27.

The usage of C<memEQs> in the XS has been corrected.
L<[perl #122701]|https://rt.perl.org/Ticket/Display.html?id=122701>

Avoid reading beyond the end of a buffer. [perl #122629]

=item *

L<B> has been upgraded to version 1.58.

It provides a new C<B::safename> function, based on the existing
C<< B::GV->SAFENAME >>, that converts C<\cOPEN> to C<^OPEN>.

Nulled COPs are now of class C<B::COP>, rather than C<B::OP>.

C<B::REGEXP> objects now provide a C<qr_anoncv> method for accessing the
implicit CV associated with C<qr//> things containing code blocks, and a
C<compflags> method that returns the pertinent flags originating from the
C<qr//blahblah> op.

C<B::PMOP> now provides a C<pmregexp> method returning a C<B::REGEXP> object.
Two new classes, C<B::PADNAME> and C<B::PADNAMELIST>, have been introduced.

A bug where, after an ithread creation or pseudofork, special/immortal SVs in
the child ithread/pseudoprocess did not have the correct class of
C<B::SPECIAL>, has been fixed.
The C<id> and C<outid> PADLIST methods have been added.

=item *

L<B::Concise> has been upgraded to version 0.996.

Null ops that are part of the execution chain are now given sequence
numbers.

Private flags for nulled ops are now dumped with mnemonics as they would be
for the non-nulled counterparts.

=item *

L<B::Deparse> has been upgraded to version 1.35.

It now deparses C<+sub : attr { ... }> correctly at the start of a
statement.  Without the initial C<+>, C<sub> would be a statement label.

C<BEGIN> blocks are now emitted in the right place most of the time, but
the change unfortunately introduced a regression, in that C<BEGIN> blocks
occurring just before the end of the enclosing block may appear below it
instead.

C<B::Deparse> no longer puts erroneous C<local> here and there, such as for
C<LIST = tr/a//d>.  [perl #119815]

Adjacent C<use> statements are no longer accidentally nested if one
contains a C<do> block.  [perl #115066]

Parenthesised arrays in lists passed to C<\> are now correctly deparsed
with parentheses (I<e.g.>, C<\(@a, (@b), @c)> now retains the parentheses
around @b), thus preserving the flattening behavior of referenced
parenthesised arrays.  Formerly, it only worked for one array: C<\(@a)>.

C<local our> is now deparsed correctly, with the C<our> included.

C<for($foo; !$bar; $baz) {...}> was deparsed without the C<!> (or C<not>).
This has been fixed.

Core keywords that conflict with lexical subroutines are now deparsed with
the C<CORE::> prefix.

C<foreach state $x (...) {...}> now deparses correctly with C<state> and
not C<my>.

C<our @array = split(...)> now deparses correctly with C<our> in those
cases where the assignment is optimized away.

It now deparses C<our(I<LIST>)> and typed lexical (C<my Dog $spot>) correctly.

C<$#_> is now deparsed as such, instead of as C<$#{_}>.
L<[perl #123947]|https://rt.perl.org/Ticket/Display.html?id=123947>

BEGIN blocks at the end of the enclosing scope are now deparsed in the
right place.  [perl #77452]

BEGIN blocks were sometimes deparsed as __ANON__, but are now always called
BEGIN.

Lexical subroutines are now fully deparsed.  [perl #116553]

C<Anything =~ y///r> with C</r> no longer omits the left-hand operand.

The op trees that make up regexp code blocks are now deparsed for real.
Formerly, the original string that made up the regular expression was used.
That caused problems with C<qr/(?{E<lt>E<lt>heredoc})/> and multiline code blocks,
which were deparsed incorrectly.  [perl #123217] [perl #115256]

C<$;> at the end of a statement no longer loses its semicolon.
[perl #123357]

Some cases of subroutine declarations stored in the stash in shorthand form
were being omitted.

Non-ASCII characters are now consistently escaped in strings, instead of
some of the time.  (There are still outstanding problems with regular
expressions and identifiers that have not been fixed.)

When prototype sub calls are deparsed with C<&> (I<e.g.>, under the B<-P>
option), C<scalar> is now added where appropriate, to force the scalar
context implied by the prototype.

C<require(foo())>, C<do(foo())>, C<goto(foo())> and similar constructs with
loop controls are now deparsed correctly.  The outer parentheses are not
optional.

Whitespace is no longer escaped in regular expressions, because it was
getting erroneously escaped within C<(?x:...)> sections.

C<sub foo { foo() }> is now deparsed with those mandatory parentheses.

C</@array/> is now deparsed as a regular expression, and not just
C<@array>.

C</@{-}/>, C</@{+}/> and C<$#{1}> are now deparsed with the braces, which
are mandatory in these cases.

In deparsing feature bundles, C<B::Deparse> was emitting C<no feature;> first
instead of C<no feature ':all';>.  This has been fixed.

C<chdir FH> is now deparsed without quotation marks.

C<\my @a> is now deparsed without parentheses.  (Parentheses would flatten
the array.)

C<system> and C<exec> followed by a block are now deparsed correctly.
Formerly there was an erroneous C<do> before the block.

C<< use constant QR =E<gt> qr/.../flags >> followed by C<"" =~ QR> is no longer
deparsed without the flags.

Deparsing C<BEGIN { undef &foo }> with the B<-w> switch enabled started to
emit 'uninitialized' warnings in Perl 5.14.  This has been fixed.

Deparsing calls to subs with a C<(;+)> prototype resulted in an infinite
loop.  The C<(;$)>, C<(_)> and C<(;_)> prototypes were given the wrong
precedence, causing C<foo($aE<lt>$b)> to be deparsed without the parentheses.

Deparse now provides a defined state sub in inner subs.

=item *

L<B::Op_private> has been added.

L<B::Op_private> provides detailed information about the flags used in the
C<op_private> field of perl opcodes.

=item *

L<bigint>, L<bignum>, L<bigrat> have been upgraded to version 0.39.

Document in CAVEATS that using strings as numbers won't always invoke
the big number overloading, and how to invoke it.  [rt.perl.org #123064]

=item *

L<Carp> has been upgraded to version 1.36.

C<Carp::Heavy> now ignores version mismatches with Carp if Carp is newer
than 1.12, since C<Carp::Heavy>'s guts were merged into Carp at that
point.
L<[perl #121574]|https://rt.perl.org/Ticket/Display.html?id=121574>

Carp now handles non-ASCII platforms better.

Off-by-one error fix for Perl E<lt> 5.14.

=item *

L<constant> has been upgraded to version 1.33.

It now accepts fully-qualified constant names, allowing constants to be defined
in packages other than the caller.

=item *

L<CPAN> has been upgraded to version 2.11.

Add support for C<Cwd::getdcwd()> and introduce workaround for a misbehavior
seen on Strawberry Perl 5.20.1.

Fix C<chdir()> after building dependencies bug.

Introduce experimental support for plugins/hooks.

Integrate the C<App::Cpan> sources.

Do not check recursion on optional dependencies.

Sanity check F<META.yml> to contain a hash.
L<[cpan #95271]|https://rt.cpan.org/Ticket/Display.html?id=95271>

=item *

L<CPAN::Meta::Requirements> has been upgraded to version 2.132.

Works around limitations in C<version::vpp> detecting v-string magic and adds
support for forthcoming L<ExtUtils::MakeMaker> bootstrap F<version.pm> for
Perls older than 5.10.0.

=item *

L<Data::Dumper> has been upgraded to version 2.158.

Fixes CVE-2014-4330 by adding a configuration variable/option to limit
recursion when dumping deep data structures.

Changes to resolve Coverity issues.
XS dumps incorrectly stored the name of code references stored in a
GLOB.
L<[perl #122070]|https://rt.perl.org/Ticket/Display.html?id=122070>

=item *

L<DynaLoader> has been upgraded to version 1.32.

Remove C<dl_nonlazy> global if unused in DynaLoader. [perl #122926]

=item *

L<Encode> has been upgraded to version 2.72.

C<piconv> now has better error handling when the encoding name is nonexistent,
and a build breakage when upgrading L<Encode> in perl-5.8.2 and earlier has
been fixed.

Building in C++ mode on Windows now works.

=item *

L<Errno> has been upgraded to version 1.23.

Add C<-P> to the preprocessor command-line on GCC 5.  GCC added extra
line directives, breaking parsing of error code definitions.  [rt.perl.org
#123784]

=item *

L<experimental> has been upgraded to version 0.013.

Hardcodes features for Perls older than 5.15.7.

=item *

L<ExtUtils::CBuilder> has been upgraded to version 0.280221.

Fixes a regression on Android.
L<[perl #122675]|https://rt.perl.org/Ticket/Display.html?id=122675>

=item *

L<ExtUtils::Manifest> has been upgraded to version 1.70.

Fixes a bug with C<maniread()>'s handling of quoted filenames and improves
C<manifind()> to follow symlinks.
L<[perl #122415]|https://rt.perl.org/Ticket/Display.html?id=122415>

=item *

L<ExtUtils::ParseXS> has been upgraded to version 3.28.

Only declare C<file> unused if we actually define it.
Improve C<RETVAL> code generation to avoid repeated
references to C<ST(0)>.  [perl #123278]
Broaden and document the C</OBJ$/> to C</REF$/> typemap optimization
for the C<DESTROY> method.  [perl #123418]

=item *

L<Fcntl> has been upgraded to version 1.13.

Add support for the Linux pipe buffer size C<fcntl()> commands.

=item *

L<File::Find> has been upgraded to version 1.29.

C<find()> and C<finddepth()> will now warn if passed inappropriate or
misspelled options.

=item *

L<File::Glob> has been upgraded to version 1.24.

Avoid C<SvIV()> expanding to call C<get_sv()> three times in a few
places. [perl #123606]

=item *

L<HTTP::Tiny> has been upgraded to version 0.054.

C<keep_alive> is now fork-safe and thread-safe.

=item *

L<IO> has been upgraded to version 1.35.

The XS implementation has been fixed for the sake of older Perls.

=item *

L<IO::Socket> has been upgraded to version 1.38.

Document the limitations of the C<connected()> method.  [perl #123096]

=item *

L<IO::Socket::IP> has been upgraded to version 0.37.

A better fix for subclassing C<connect()>.
L<[cpan #95983]|https://rt.cpan.org/Ticket/Display.html?id=95983>
L<[cpan #97050]|https://rt.cpan.org/Ticket/Display.html?id=97050>

Implements Timeout for C<connect()>.
L<[cpan #92075]|https://rt.cpan.org/Ticket/Display.html?id=92075>

=item *

The libnet collection of modules has been upgraded to version 3.05.

Support for IPv6 and SSL has been added to C<Net::FTP>, C<Net::NNTP>, C<Net::POP3> and C<Net::SMTP>.
Improvements in C<Net::SMTP> authentication.

=item *

L<Locale::Codes> has been upgraded to version 3.34.

Fixed a bug in the scripts used to extract data from spreadsheets that
prevented the SHP currency code from being found.
L<[cpan #94229]|https://rt.cpan.org/Ticket/Display.html?id=94229>

New codes have been added.

=item *

L<Math::BigInt> has been upgraded to version 1.9997.

Synchronize POD changes from the CPAN release.
C<< Math::BigFloat->blog(x) >> would sometimes return C<blog(2*x)> when
the accuracy was greater than 70 digits.
The result of C<< Math::BigFloat->bdiv() >> in list context now
satisfies C<< x = quotient * divisor + remainder >>.

Correct handling of subclasses.
L<[cpan #96254]|https://rt.cpan.org/Ticket/Display.html?id=96254>
L<[cpan #96329]|https://rt.cpan.org/Ticket/Display.html?id=96329>

=item *

L<Module::Metadata> has been upgraded to version 1.000026.

Support installations on older perls with an L<ExtUtils::MakeMaker> earlier
than 6.63_03.

=item *

L<overload> has been upgraded to version 1.26.

A redundant C<ref $sub> check has been removed.

=item *

The PathTools module collection has been upgraded to version 3.56.

A warning from the B<gcc> compiler is now avoided when building the XS.

Don't turn leading C<//> into C</> on Cygwin. [perl #122635]

=item *

L<perl5db.pl> has been upgraded to version 1.49.

An assertion failure caused by the debugger has been fixed.
L<[perl #124127]|https://rt.perl.org/Ticket/Display.html?id=124127>

C<fork()> in the debugger under C<tmux> will now create a new window for
the forked process.
L<[perl #121333]|https://rt.perl.org/Ticket/Display.html?id=121333>

The debugger now saves the current working directory on startup and
restores it when you restart your program with C<R> or C<rerun>.
L<[perl #121509]|https://rt.perl.org/Ticket/Display.html?id=121509>

=item *

L<PerlIO::scalar> has been upgraded to version 0.22.

Reading from a position well past the end of the scalar now correctly
returns end of file.  [perl #123443]

Seeking to a negative position still fails, but no longer leaves the
file position set to a negative location.

C<eof()> on a C<PerlIO::scalar> handle now properly returns true when
the file position is past the 2GB mark on 32-bit systems.

Attempting to write at file positions impossible for the platform now
fails early rather than wrapping at 4GB.

=item *

L<Pod::Perldoc> has been upgraded to version 3.25.

Filehandles opened for reading or writing now have C<:encoding(UTF-8)> set.
L<[cpan #98019]|https://rt.cpan.org/Ticket/Display.html?id=98019>

=item *

L<POSIX> has been upgraded to version 1.53.

The C99 math functions and constants (for example C<acosh>, C<isinf>, C<isnan>, C<round>,
C<trunc>; C<M_E>, C<M_SQRT2>, C<M_PI>) have been added.

C<POSIX::tmpnam()> now produces a deprecation warning.  [perl #122005]

=item *

L<Safe> has been upgraded to version 2.39.

C<reval> was not propagating void context properly.

=item *

Scalar-List-Utils has been upgraded to version 1.41.

A new module, L<Sub::Util>, has been added, containing functions related to
CODE refs, including C<subname> (inspired by C<Sub::Identity>) and C<set_subname>
(copied and renamed from C<Sub::Name>).
The use of C<GetMagic> in C<List::Util::reduce()> has also been fixed.
L<[cpan #63211]|https://rt.cpan.org/Ticket/Display.html?id=63211>

=item *

L<SDBM_File> has been upgraded to version 1.13.

Simplified the build process.  [perl #123413]

=item *

L<Time::Piece> has been upgraded to version 1.29.

When pretty printing negative C<Time::Seconds>, the "minus" is no longer lost.

=item *

L<Unicode::Collate> has been upgraded to version 1.12.

The improved handling of discontiguous contractions introduced in version
0.67 is now disabled by default, and is supported through the
C<long_contraction> parameter.

=item *

L<Unicode::Normalize> has been upgraded to version 1.18.

The XSUB implementation has been removed in favor of pure Perl.

=item *

L<Unicode::UCD> has been upgraded to version 0.61.

A new function L<prop_values()|Unicode::UCD/prop_values()>
has been added to return a given property's possible values.

A new function L<charprop()|Unicode::UCD/charprop()>
has been added to return the value of a given property for a given code
point.

A new function L<charprops_all()|Unicode::UCD/charprops_all()>
has been added to return the values of all Unicode properties for a
given code point.

A bug has been fixed so that L<prop_aliases()|Unicode::UCD/prop_aliases()>
returns the correct short and long names for the Perl extensions where
it was incorrect.

A bug has been fixed so that
L<prop_value_aliases()|Unicode::UCD/prop_value_aliases()>
returns C<undef> instead of a wrong result for properties that are Perl
extensions.

This module now works on EBCDIC platforms.

=item *

L<utf8> has been upgraded to version 1.17.

A mismatch between the documentation and the code in C<utf8::downgrade()>
was fixed in favor of the documentation. The optional second argument
is now correctly treated as a perl boolean (true/false semantics) and
not as an integer.

=item *

L<version> has been upgraded to version 0.9909.

Numerous changes.  See the F<Changes> file in the CPAN distribution for
details.

=item *

L<Win32> has been upgraded to version 0.51.

C<GetOSName()> now supports Windows 8.1, and building in C++ mode now works.

=item *

L<Win32API::File> has been upgraded to version 0.1202.

Building in C++ mode now works.

=item *

L<XSLoader> has been upgraded to version 0.20.

Allow XSLoader to load modules from a different namespace.
[perl #122455]

=back
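
A few of the additions listed above can be tried out directly.  The sketch
below assumes a perl that ships at least the module versions named in this
section.  C<My::Config>, C<RETRIES> and C<handler> are hypothetical names
chosen only for illustration.

```perl

    use strict;
    use warnings;

    use POSIX ();
    use Sub::Util qw(subname set_subname);

    # constant: define a constant in a package other than the caller.
    # "My::Config" is a hypothetical package name used only here.
    use constant 'My::Config::RETRIES' => 3;
    my $retries = My::Config::RETRIES();

    # POSIX: two of the C99 math functions.
    my $rounded   = POSIX::round(2.5);
    my $truncated = POSIX::trunc(-2.7);

    # Sub::Util: name an anonymous sub, then read the name back.
    my $code = set_subname('My::Config::handler', sub { return 1 });
    my $name = subname($code);

    print "$retries $rounded $truncated $name\n";

```

The functions are called with their full package names here so that the
example does not depend on what each module exports by default.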

=head2 Removed Modules and Pragmata

The following modules (and associated modules) have been removed from the core
perl distribution:

=over 4

=item *

L<CGI>

=item *

L<Module::Build>

=back

=head1 Documentation

=head2 New Documentation

=head3 L<perlunicook>

This document, by Tom Christiansen, provides examples of handling Unicode in
Perl.

=head2 Changes to Existing Documentation

=head3 L<perlaix>

=over 4

=item *

A note on long doubles has been added.

=back


=head3 L<perlapi>

=over 4

=item *

Note that C<SvSetSV> doesn't do set magic.

=item *

C<sv_usepvn_flags> - fix documentation to mention the use of C<Newx> instead of
C<malloc>.

L<[perl #121869]|https://rt.perl.org/Ticket/Display.html?id=121869>

=item *

Clarify where C<NUL> may be embedded or is required to terminate a string.

=item *

Some documentation that was previously missing due to formatting errors is
now included.

=item *

Entries are now organized into groups rather than by the file where they
are found.

=item *

Alphabetical sorting of entries is now done consistently (automatically
by the POD generator) to make entries easier to find when scanning.

=back

=head3 L<perldata>

=over 4

=item *

The syntax of single-character variable names has been brought
up-to-date and more fully explained.

=item *

Hexadecimal floating point numbers are described, as are infinity and
NaN.

=back
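
For readers unfamiliar with the notation, here is a minimal sketch of a
hexadecimal floating point literal together with infinity and NaN.  The way
infinity and NaN are produced here is only one common idiom and assumes
IEEE 754 doubles; the variable names are illustrative.

```perl

    use strict;
    use warnings;

    # Hexadecimal floating point literal: 0x1.8 is 1.5, and p1 scales by 2**1.
    my $three = 0x1.8p1;

    # One common way to obtain infinity and NaN (assumes IEEE 754 doubles).
    my $inf = 9**9**9;
    my $nan = $inf - $inf;

    printf "%g %a\n", $three, $three;
    print "inf is larger than any finite number\n" if $inf > 1e308;
    print "NaN is not equal to itself\n"          if $nan != $nan;

```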

=head3 L<perlebcdic>

=over 4

=item *

This document has been significantly updated in the light of recent
improvements to EBCDIC support.

=back

=head3 L<perlfilter>

=over 4

=item *

Added a L<LIMITATIONS|perlfilter/LIMITATIONS> section.

=back


=head3 L<perlfunc>

=over 4

=item *

Mention that C<study()> is currently a no-op.

=item *

Calling C<delete> or C<exists> on array values is now described as "strongly
discouraged" rather than "deprecated".

=item *

Improve documentation of C<< our >>.

=item *

C<-l> now notes that it will return false if symlinks aren't supported by the
file system.
L<[perl #121523]|https://rt.perl.org/Ticket/Display.html?id=121523>

=item *

Note that C<exec LIST> and C<system LIST> may fall back to the shell on
Win32. Only the indirect-object syntax C<exec PROGRAM LIST> and
C<system PROGRAM LIST> will reliably avoid using the shell.

This has also been noted in L<perlport>.

L<[perl #122046]|https://rt.perl.org/Ticket/Display.html?id=122046>

=back
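
The indirect-object form mentioned in the C<exec>/C<system> note above can
be sketched as follows.  The running perl (C<$^X>) stands in for an
arbitrary program so that the example is self-contained; the variable names
are illustrative.

```perl

    use strict;
    use warnings;

    # $^X is the path of the running perl; it stands in for any program.
    my @command = ($^X, '-e', 'exit 3');

    # Indirect-object form: the block names the program, the list is argv.
    my $status = system { $command[0] } @command;

    my $exit_code = $status >> 8;
    print "child exited with code $exit_code\n";

```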

=head3 L<perlguts>

=over 4

=item *

The OOK example has been updated to account for COW changes and a change in the
storage of the offset.

=item *

Details on C-level symbols and F<libperl.t> have been added.

=item *

Information on Unicode handling has been added.

=item *

Information on EBCDIC handling has been added.

=back

=head3 L<perlhack>

=over 4

=item *

A note has been added about running on platforms with non-ASCII
character sets.

=item *

A note has been added about performance testing.

=back

=head3 L<perlhacktips>

=over 4

=item *

Documentation has been added illustrating the perils of assuming that
there is no change to the contents of static memory pointed to by the
return values of Perl's wrappers for C library functions.

=item *

Replacements for C<tmpfile>, C<atoi>, C<strtol>, and C<strtoul> are now
recommended.

=item *

Updated documentation for the C<test.valgrind> C<make> target.
L<[perl #121431]|https://rt.perl.org/Ticket/Display.html?id=121431>

=item *

Information is given about writing test files portably to non-ASCII
platforms.

=item *

A note has been added about how to get a C language stack backtrace.

=back

=head3 L<perlhpux>

=over 4

=item *

Note that the message "Redeclaration of "sendpath" with a different
storage class specifier" is harmless.

=back

=head3 L<perllocale>

=over 4

=item *

Updated for the enhancements in v5.22, along with some clarifications.

=back

=head3 L<perlmodstyle>

=over 4

=item *

Instead of pointing to the module list, we are now pointing to
L<PrePAN|http://prepan.org/>.

=back

=head3 L<perlop>

=over 4

=item *

Updated for the enhancements in v5.22, along with some clarifications.

=back

=head3 L<perlpodspec>

=over 4

=item *

The specification of the pod language is changing so that the default
encoding of pods that aren't in UTF-8 (unless otherwise indicated) is
CP1252 instead of ISO 8859-1 (Latin1).

=back

=head3 L<perlpolicy>

=over 4

=item *

We now have a code of conduct for the I<< p5p >> mailing list, as documented
in L<< perlpolicy/STANDARDS OF CONDUCT >>.

=item *

The conditions for marking an experimental feature as non-experimental are now
set out.

=item *

Clarification has been made as to what sorts of changes are permissible in
maintenance releases.

=back

=head3 L<perlport>

=over 4

=item *

Out-of-date VMS-specific information has been fixed and/or simplified.

=item *

Notes about EBCDIC have been added.

=back

=head3 L<perlre>

=over 4

=item *

The description of the C</x> modifier has been clarified to note that
comments cannot be continued onto the next line by escaping them; and
there is now a list of all the characters that are considered whitespace
by this modifier.

=item *

The new C</n> modifier is described.

=item *

A note has been added on how to make bracketed character class ranges
portable to non-ASCII machines.

=back
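
A minimal sketch of the C</n> and C</x> modifiers used together follows.
The pattern and variable names are made up for this example.  Note that a
comment under C</x> ends at the end of its line and must not contain the
pattern delimiter.

```perl

    use strict;
    use warnings;

    my $re = qr{
        (\d{4})          # year: with the n flag this group does not capture
        -
        (?<month>\d\d)   # named groups still capture
    }xn;

    my $date    = "2015-06-01";
    my $matched = $date =~ $re ? 1 : 0;
    my $month   = $+{month};
    my $first   = $1;    # the named group is the only capturing group

    print "$matched $month $first\n";

```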

=head3 L<perlrebackslash>

=over 4

=item *

Added documentation of C<\b{sb}>, C<\b{wb}>, C<\b{gcb}>, and C<\b{g}>.

=back
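
To give a feel for the difference between plain C<\b> and the Unicode word
boundary C<\b{wb}>, here is a small sketch.  The sample text and variable
names are arbitrary.

```perl

    use strict;
    use warnings;

    my $text = "I don't know";

    # Plain \b sees a boundary between "n" and the apostrophe ...
    my $plain_breaks = $text =~ /n\b'/ ? 1 : 0;

    # ... while the Unicode word boundary keeps "don't" together.
    my $wb_breaks = $text =~ /n\b{wb}'/ ? 1 : 0;

    # Splitting on word boundaries, then dropping the whitespace pieces.
    my @words = grep { /\w/ } split /\b{wb}/, $text;

    print "$plain_breaks $wb_breaks [@words]\n";

```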

=head3 L<perlrecharclass>

=over 4

=item *

Clarifications have been added to L<perlrecharclass/Character Ranges>
to the effect that C<[A-Z]>, C<[a-z]>, C<[0-9]> and
any subranges thereof in regular expression bracketed character classes
are guaranteed to match exactly what a naive English speaker would
expect them to match, even on platforms (such as EBCDIC) where perl
has to do extra work to accomplish this.

=item *

The documentation of Bracketed Character Classes has been expanded to cover the
improvements in C<qr/[\N{named sequence}]/> (see under L</Selected Bug Fixes>).

=back

=head3 L<perlref>

=over 4

=item *

A new section has been added:
L<Assigning to References|perlref/Assigning to References>.

=back
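
That section covers the experimental C<refaliasing> feature.  A minimal
sketch of what assigning to a reference looks like is given below.  The
variable names are illustrative, and because the feature is experimental
its warning category is disabled explicitly.

```perl

    use strict;
    use warnings;
    use feature 'refaliasing';
    no warnings 'experimental::refaliasing';

    my @original = (1, 2, 3);

    # Make @alias another name for @original rather than a copy.
    \my @alias = \@original;

    push @alias, 4;
    my $count = @original;
    my $same  = \@alias == \@original ? 1 : 0;

    print "$count $same\n";

```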

=head3 L<perlsec>

=over 4

=item *

Comments added on algorithmic complexity and tied hashes.

=back

=head3 L<perlsyn>

=over 4

=item *

An ambiguity in the documentation of the C<...> statement has been corrected.
L<[perl #122661]|https://rt.perl.org/Ticket/Display.html?id=122661>

=item *

The empty conditional in C<< for >> and C<< while >> is now documented
in L<< perlsyn >>.

=back
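
Both constructs mentioned above can be shown in a few lines.  This is only
a sketch; the counter and the sub name are invented for the example.

```perl

    use strict;
    use warnings;

    my $n = 0;

    # An empty condition is treated as true, so both loops need an explicit exit.
    while () { last if ++$n >= 3 }
    for (;;) { last if ++$n >= 6 }

    # The "..." statement compiles, but dies with "Unimplemented" when run.
    sub not_written_yet { ... }
    my $died = eval { not_written_yet(); 1 } ? 0 : 1;

    print "$n $died\n";

```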

=head3 L<perlunicode>

=over 4

=item *

This has had extensive revisions to bring it up-to-date with current
Unicode support and to make it more readable.  Notable is that Unicode
7.0 changed what it should do with non-characters.  Perl retains the old
way of handling for reasons of backward compatibility.  See
L<perlunicode/Noncharacter code points>.

=back

=head3 L<perluniintro>

=over 4

=item *

Advice for how to make sure your strings and regular expression patterns are
interpreted as Unicode has been updated.

=back

=head3 L<perlvar>

=over 4

=item *

C<$]> is no longer listed as being deprecated.  Instead, discussion has
been added on the advantages and disadvantages of using it versus
C<$^V>.  C<$OLD_PERL_VERSION> was re-added to the documentation as the long
form of C<$]>.

=item *

C<${^ENCODING}> is now marked as deprecated.

=item *

The entry for C<%^H> has been clarified to indicate it can only handle
simple values.

=back
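
The two version variables can be compared side by side.  The following
sketch checks for v5.22 or later in both styles; the variable names are
illustrative.

```perl

    use strict;
    use warnings;

    # Numeric comparison with $] (5.022 means 5.22.0).
    my $by_number = $] >= 5.022 ? 1 : 0;

    # The same check with $^V, compared against a v-string literal.
    my $by_vstring = $^V ge v5.22.0 ? 1 : 0;

    print "running perl ", $^V, " (numeric form ", $], ")\n";
    print "5.22 or later: $by_number / $by_vstring\n";

```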

=head3 L<perlvms>

=over 4

=item *

Out-of-date and/or incorrect material has been removed.

=item *

Updated documentation on environment and shell interaction in VMS.

=back

=head3 L<perlxs>

=over 4

=item *

Added a discussion of locale issues in XS code.

=back

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 New Diagnostics

=head3 New Errors

=over 4

=item *

L<Bad symbol for scalar|perldiag/"Bad symbol for scalar">

(P) An internal request asked to add a scalar entry to something that
wasn't a symbol table entry.

=item *

L<Can't use a hash as a reference|perldiag/"Can't use a hash as a reference">

(F) You tried to use a hash as a reference, as in
C<< %foo->{"bar"} >> or C<< %$ref->{"hello"} >>.  Versions of perl E<lt>= 5.6.1
used to allow this syntax, but shouldn't have.

=item *

L<Can't use an array as a reference|perldiag/"Can't use an array as a reference">

(F) You tried to use an array as a reference, as in
C<< @foo->[23] >> or C<< @$ref->[99] >>.  Versions of perl E<lt>= 5.6.1 used to
allow this syntax, but shouldn't have.

=item *

L<Can't use 'defined(@array)' (Maybe you should just omit the defined()?)|perldiag/"Can't use 'defined(@array)' (Maybe you should just omit the defined()?)">

(F) C<defined()> is not useful on arrays because it
checks for an undefined I<scalar> value.  If you want to see if the
array is empty, just use S<C<if (@array) { # not empty }>> for example.

=item *

L<Can't use 'defined(%hash)' (Maybe you should just omit the defined()?)|perldiag/"Can't use 'defined(%hash)' (Maybe you should just omit the defined()?)">

(F) C<defined()> is not usually right on hashes.

Although S<C<defined %hash>> is false on a plain not-yet-used hash, it
becomes true in several non-obvious circumstances, including iterators,
weak references, stash names, even remaining true after S<C<undef %hash>>.
These things make S<C<defined %hash>> fairly useless in practice, so it now
generates a fatal error.

If a check for non-empty is what you wanted then just put it in boolean
context (see L<perldata/Scalar values>):

    if (%hash) {
       # not empty
    }

If you had S<C<defined %Foo::Bar::QUUX>> to check whether such a package
variable exists then that's never really been reliable, and isn't
a good way to enquire about the features of a package, or whether
it's loaded, etc.

=item *

L<Cannot chr %f|perldiag/"Cannot chr %f">

(F) You passed an invalid number (like an infinity or not-a-number) to
C<chr>.

=item *

L<Cannot compress %f in pack|perldiag/"Cannot compress %f in pack">

(F) You tried converting an infinity or not-a-number to an unsigned
character, which makes no sense.

=item *

L<Cannot pack %f with '%c'|perldiag/"Cannot pack %f with '%c'">

(F) You tried converting an infinity or not-a-number to a character,
which makes no sense.

=item *

L<Cannot printf %f with '%c'|perldiag/"Cannot printf %f with '%c'">

(F) You tried printing an infinity or not-a-number as a character (C<%c>),
which makes no sense.  Maybe you meant C<'%s'>, or just stringifying it?

=item *

L<charnames alias definitions may not contain a sequence of multiple spaces|perldiag/"charnames alias definitions may not contain a sequence of multiple spaces">

(F) You defined a character name which had multiple space
characters in a row.  Change them to single spaces.  Usually these
names are defined in the C<:alias> import argument to C<use charnames>, but
they could be defined by a translator installed into C<$^H{charnames}>.
See L<charnames/CUSTOM ALIASES>.

=item *

L<charnames alias definitions may not contain trailing white-space|perldiag/"charnames alias definitions may not contain trailing white-space">

(F) You defined a character name which ended in a space
character.  Remove the trailing space(s).  Usually these names are
defined in the C<:alias> import argument to C<use charnames>, but they
could be defined by a translator installed into C<$^H{charnames}>.
See L<charnames/CUSTOM ALIASES>.

=item *

L<:const is not permitted on named subroutines|perldiag/":const is not permitted on named subroutines">

(F) The C<const> attribute causes an anonymous subroutine to be run and
its value captured at the time that it is cloned.  Named subroutines are
not cloned like this, so the attribute does not make sense on them.

=item *

L<Hexadecimal float: internal error|perldiag/"Hexadecimal float: internal error">

(F) Something went horribly bad in hexadecimal float handling.

=item *

L<Hexadecimal float: unsupported long double format|perldiag/"Hexadecimal float: unsupported long double format">

(F) You have configured Perl to use long doubles but
the internals of the long double format are unknown,
therefore the hexadecimal float output is impossible.

=item *

L<Illegal suidscript|perldiag/"Illegal suidscript">

(F) The script run under suidperl was somehow illegal.

=item *

L<In '(?...)', the '(' and '?' must be adjacent in regex; marked by S<<-- HERE> in mE<sol>%sE<sol>|perldiag/"In '(?...)', the '(' and '?' must be adjacent in regex; marked by <-- HERE in m/%s/">

(F) The two-character sequence C<"(?"> in
this context in a regular expression pattern should be an
indivisible token, with nothing intervening between the C<"(">
and the C<"?">, but you separated them.

=item *

L<In '(*VERB...)', the '(' and '*' must be adjacent in regex; marked by S<<-- HERE> in mE<sol>%sE<sol>|perldiag/"In '(*VERB...)', the '(' and '*' must be adjacent in regex; marked by <-- HERE in m/%s/">

(F) The two-character sequence C<"(*"> in
this context in a regular expression pattern should be an
indivisible token, with nothing intervening between the C<"(">
and the C<"*">, but you separated them.

=item *

L<Invalid quantifier in {,} in regex; marked by <-- HERE in mE<sol>%sE<sol>|perldiag/"Invalid quantifier in {,} in regex; marked by <-- HERE in m/%s/">

(F) The pattern looks like a {min,max} quantifier, but the min or max could not
be parsed as a valid number: either it has leading zeroes, or it represents
too big a number to cope with.  The S<<-- HERE> shows where in the regular
expression the problem was discovered.  See L<perlre>.

=item *

L<'%s' is an unknown bound type in regex|perldiag/"'%s' is an unknown bound type in regex; marked by <-- HERE in m/%s/">

(F) You used C<\b{...}> or C<\B{...}> and the C<...> is not known to
Perl.  The current valid ones are given in
L<perlrebackslash/\b{}, \b, \B{}, \B>.

=item *

L<Missing or undefined argument to require|perldiag/Missing or undefined argument to require>

(F) You tried to call C<require> with no argument or with an undefined
value as an argument.  C<require> expects either a package name or a
file-specification as an argument.  See L<perlfunc/require>.

Formerly, C<require> with no argument or C<undef> warned about a Null filename.
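
A minimal sketch of how this error can arise and be trapped is shown
here.  The variable C<$module> is hypothetical; it stands for a module
name that was expected to be set but never was.

```perl
use strict;
use warnings;

# $module is a hypothetical variable that was never assigned.
my $module;

my $ok    = eval { require $module; 1 };
my $error = $@;

print "require failed: $error" unless $ok;
```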

=back

=head3 New Warnings

=over 4

=item *

L<\C is deprecated in regex|perldiag/"\C is deprecated in regex; marked by <-- HERE in m/%s/">

(D deprecated) The C<< /\C/ >> character class was deprecated in v5.20, and
now emits a warning. It is intended that it will become an error in v5.24.
This character class matches a single byte even if it appears within a
multi-byte character, breaks encapsulation, and can corrupt UTF-8
strings.

=item *

L<"%s" is more clearly written simply as "%s" in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"%s" is more clearly written simply as "%s" in regex; marked by <-- HERE in mE<sol>%sE<sol>>

(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)

You specified a character that has the given plainer way of writing it,
and which is also portable to platforms running with different character
sets.

=item *

L<Argument "%s" treated as 0 in increment (++)|perldiag/"Argument "%s" treated as 0 in increment (++)">

(W numeric) The indicated string was fed as an argument to the C<++> operator
which expects either a number or a string matching C</^[a-zA-Z]*[0-9]*\z/>.
See L<perlop/Auto-increment and Auto-decrement> for details.
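
The difference between the two cases might be sketched as follows.  The
strings are arbitrary examples, and the C<$SIG{__WARN__}> handler is
used only to collect the warning for display.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $magic = "az";      # matches /^[a-zA-Z]*[0-9]*\z/
$magic++;              # string increment, no warning

my $odd = "ab-1";      # the "-" breaks the pattern
$odd++;                # treated as 0, then incremented; warns

print "magic: $magic\n";
print "odd:   $odd\n";
print "warning: $_" for @warnings;
```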

=item *

L<Both or neither range ends should be Unicode in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Both or neither range ends should be Unicode in regex; marked by <-- HERE in m/%s/">

(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)

In a bracketed character class in a regular expression pattern, you
had a range which has exactly one end of it specified using C<\N{}>, and
the other end is specified using a non-portable mechanism.  Perl treats
the range as a Unicode range, that is, all the characters in it are
considered to be the Unicode characters, and which may be different code
points on some platforms Perl runs on.  For example, C<[\N{U+06}-\x08]>
is treated as if you had instead said C<[\N{U+06}-\N{U+08}]>, that is it
matches the characters whose code points in Unicode are 6, 7, and 8.
But that C<\x08> might indicate that you meant something different, so
the warning gets raised.

=item *

L<Can't do %s("%s") on non-UTF-8 locale; resolved to "%s".|perldiag/Can't do %s("%s") on non-UTF-8 locale; resolved to "%s".>

(W locale) You are 1) running under "C<use locale>"; 2) the current
locale is not a UTF-8 one; 3) you tried to do the designated case-change
operation on the specified Unicode character; and 4) the result of this
operation would mix Unicode and locale rules, which likely conflict.

The warnings category C<locale> is new.

=item *

L<:const is experimental|perldiag/":const is experimental">

(S experimental::const_attr) The C<const> attribute is experimental.
If you want to use the feature, disable the warning with C<no warnings
'experimental::const_attr'>, but know that in doing so you are taking
the risk that your code may break in a future Perl version.

=item *

L<gmtime(%f) failed|perldiag/"gmtime(%f) failed">

(W overflow) You called C<gmtime> with a number that it could not handle:
too large, too small, or NaN.  The returned value is C<undef>.

=item *

L<Hexadecimal float: exponent overflow|perldiag/"Hexadecimal float: exponent overflow">

(W overflow) The hexadecimal floating point has a larger exponent
than the floating point supports.

=item *

L<Hexadecimal float: exponent underflow|perldiag/"Hexadecimal float: exponent underflow">

(W overflow) The hexadecimal floating point has a smaller exponent
than the floating point supports.

=item *

L<Hexadecimal float: mantissa overflow|perldiag/"Hexadecimal float: mantissa overflow">

(W overflow) The hexadecimal floating point literal had more bits in
the mantissa (the part between the C<0x> and the exponent, also known as
the fraction or the significand) than the floating point supports.
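
For orientation, this small sketch shows which part of a hexadecimal
floating point literal is the mantissa and which is the exponent.  The
values are arbitrary and are small enough that none of the overflow
warnings are triggered.

```perl
use strict;
use warnings;

# 0x1.8p3: mantissa 1.8 (hexadecimal) is 1.5, exponent is 3,
# so the value is 1.5 * 2**3.
my $twelve = 0x1.8p3;

# 0x1p-1: mantissa 1, exponent -1, so the value is 1 * 2**-1.
my $half = 0x1p-1;

print "$twelve $half\n";
```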

=item *

L<Hexadecimal float: precision loss|perldiag/"Hexadecimal float: precision loss">

(W overflow) The hexadecimal floating point had internally more
digits than could be output.  This can be caused by unsupported
long double formats, or by 64-bit integers not being available
(needed to retrieve the digits under some configurations).

=item *

L<Locale '%s' may not work well.%s|perldiag/Locale '%s' may not work well.%s>

(W locale) You are using the named locale, which is a non-UTF-8 one, and
which perl has determined is not fully compatible with what it can
handle.  The second C<%s> gives a reason.

The warnings category C<locale> is new.

=item *

L<localtime(%f) failed|perldiag/"localtime(%f) failed">

(W overflow) You called C<localtime> with a number that it could not handle:
too large, too small, or NaN.  The returned value is C<undef>.

=item *

L<Negative repeat count does nothing|perldiag/"Negative repeat count does nothing">

(W numeric) You tried to execute the
L<C<x>|perlop/Multiplicative Operators> repetition operator fewer than 0
times, which doesn't make sense.

=item *

L<NO-BREAK SPACE in a charnames alias definition is deprecated|perldiag/"NO-BREAK SPACE in a charnames alias definition is deprecated">

(D deprecated) You defined a character name which contained a no-break
space character.  Change it to a regular space.  Usually these names are
defined in the C<:alias> import argument to C<use charnames>, but they
could be defined by a translator installed into C<$^H{charnames}>.  See
L<charnames/CUSTOM ALIASES>.

=item *

L<Non-finite repeat count does nothing|perldiag/"Non-finite repeat count does nothing">

(W numeric) You tried to execute the
L<C<x>|perlop/Multiplicative Operators> repetition operator C<Inf> (or
C<-Inf>) or NaN times, which doesn't make sense.
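
Both this warning and the negative repeat count warning can be
reproduced with a sketch like the one below.  The counts are kept in
variables so that the repetition happens at run time, and the
C<$SIG{__WARN__}> handler is there only to collect the messages.  It
assumes a platform where C<9**9**9> overflows to infinity.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Counts are held in variables so that the operations happen at run time.
my $negative = -1;
my $infinite = 9**9**9;    # overflows to Inf on IEEE 754 platforms

my $from_negative = "ab" x $negative;
my $from_infinite = "ab" x $infinite;

printf "lengths: %d and %d\n",
    length($from_negative), length($from_infinite);
print "warning: $_" for @warnings;
```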

=item *

L<PerlIO layer ':win32' is experimental|perldiag/"PerlIO layer ':win32' is experimental">

(S experimental::win32_perlio) The C<:win32> PerlIO layer is
experimental.  If you want to take the risk of using this layer,
simply disable this warning:

    no warnings "experimental::win32_perlio";

=item *

L<Ranges of ASCII printables should be some subset of "0-9", "A-Z", or "a-z" in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Ranges of ASCII printables should be some subset of "0-9", "A-Z", or "a-z" in regex; marked by <-- HERE in mE<sol>%sE<sol>">

(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)

Stricter rules help to find typos and other errors.  Perhaps you didn't
even intend a range here, if the C<"-"> was meant to be some other
character, or should have been escaped (like C<"\-">).  If you did
intend a range, the one that was used is not portable between ASCII and
EBCDIC platforms, and doesn't have an obvious meaning to a casual
reader.

 [3-7]    # OK; Obvious and portable
 [d-g]    # OK; Obvious and portable
 [A-Y]    # OK; Obvious and portable
 [A-z]    # WRONG; Not portable; not clear what is meant
 [a-Z]    # WRONG; Not portable; not clear what is meant
 [%-.]    # WRONG; Not portable; not clear what is meant
 [\x41-Z] # WRONG; Not portable; not obvious to non-geek

(You can force portability by specifying a Unicode range, which means that
the endpoints are specified by
L<C<\N{...}>|perlrecharclass/Character Ranges>, but the meaning may
still not be obvious.)
The stricter rules require that ranges that start or stop with an ASCII
character that is not a control have all their endpoints be a literal
character, and not some escape sequence (like C<"\x41">), and the ranges
must be all digits, or all uppercase letters, or all lowercase letters.
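
To see why a range such as C<[A-z]> is unclear, the following sketch
lists the characters it matches beyond the letters.  It assumes an ASCII
platform and does not enable C<use re 'strict'>, so no warning is
raised; it only shows what the range covers.

```perl
use strict;
use warnings;

# Assumes an ASCII platform, where "A" is 0x41 and "z" is 0x7A.
my @candidates = map { chr($_) } 0x41 .. 0x7A;
my @surprises  = grep { /[A-z]/ && !/[A-Za-z]/ } @candidates;

print "also matched by [A-z]: @surprises\n";
```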

=item *

L<Ranges of digits should be from the same group in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Ranges of digits should be from the same group in regex; marked by <-- HERE in m/%s/">

(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)

Stricter rules help to find typos and other errors.  You included a
range, and at least one of the end points is a decimal digit.  Under the
stricter rules, when this happens, both end points should be digits in
the same group of 10 consecutive digits.

=item *

L<Redundant argument in %s|perldiag/Redundant argument in %s>

(W redundant) You called a function with more arguments than were
needed, as indicated by information within other arguments you supplied
(I<e.g.> a printf format). Currently only emitted when a printf-type format
required fewer arguments than were supplied, but might be used in the
future for I<e.g.> L<perlfunc/pack>.

The warnings category C<< redundant >> is new. See also
L<[perl #121025]|https://rt.perl.org/Ticket/Display.html?id=121025>.
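
A short sketch of the warning and of how to silence it follows.  The
format string and values are invented for illustration, and the
C<$SIG{__WARN__}> handler only collects the message.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $format = "%s scored %d";
my @values = ("alice", 42, "unused");   # one more than the format consumes

my $text = sprintf($format, @values);   # warns about the extra argument

{
    no warnings 'redundant';
    my $quiet = sprintf($format, @values);   # same call, no warning
}

print "$text\n";
print "warning: $_" for @warnings;
```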

=item *

L<Replacement list is longer than search list|perldiag/Replacement list is longer than search list>

This is not a new diagnostic, but in earlier releases was accidentally
not displayed if the transliteration contained wide characters.  This is
now fixed, so that you may see this diagnostic in places where you
previously didn't (but should have).

=item *

L<Use of \b{} for non-UTF-8 locale is wrong.  Assuming a UTF-8 locale|perldiag/"Use of \b{} for non-UTF-8 locale is wrong.  Assuming a UTF-8 locale">

(W locale) You are matching a regular expression using locale rules,
and a Unicode boundary is being matched, but the locale is not a Unicode
one.  This doesn't make sense.  Perl will continue, assuming a Unicode
(UTF-8) locale, but the results could well be wrong except if the locale
happens to be ISO-8859-1 (Latin1) where this message is spurious and can
be ignored.

The warnings category C<locale> is new.

=item *

L<< Using E<sol>u for '%s' instead of E<sol>%s in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Using E<sol>u for '%s' instead of E<sol>%s in regex; marked by <-- HERE in mE<sol>%sE<sol>" >>

(W regexp) You used a Unicode boundary (C<\b{...}> or C<\B{...}>) in a
portion of a regular expression where the character set modifiers C</a>
or C</aa> are in effect.  These two modifiers indicate an ASCII
interpretation, and this doesn't make sense for a Unicode definition.
The generated regular expression will compile so that the boundary uses
all of Unicode.  No other portion of the regular expression is affected.

=item *

L<The bitwise feature is experimental|perldiag/"The bitwise feature is experimental">

(S experimental::bitwise) This warning is emitted if you use bitwise
operators (C<& | ^ ~ &. |. ^. ~.>) with the "bitwise" feature enabled.
Simply suppress the warning if you want to use the feature, but know
that in doing so you are taking the risk of using an experimental
feature which may change or be removed in a future Perl version:

    no warnings "experimental::bitwise";
    use feature "bitwise";
    $x |.= $y;
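
As a further sketch, with the feature enabled the plain operators treat
their operands as numbers, while the dotted operators treat them as
strings.  The operand values here are arbitrary.

```perl
use strict;
use warnings;
use feature 'bitwise';
no warnings 'experimental::bitwise';

my $numeric = "12" | "3";     # operands treated as the numbers 12 and 3
my $stringy = "12" |. "3";    # operands treated as strings, ORed per character

print "numeric: $numeric\n";
print "string:  $stringy\n";
```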

=item *

L<Unescaped left brace in regex is deprecated, passed through in regex; marked by <-- HERE in mE<sol>%sE<sol>|perldiag/"Unescaped left brace in regex is deprecated, passed through in regex; marked by <-- HERE in m/%s/">

(D deprecated, regexp) You used a literal C<"{"> character in a regular
expression pattern. You should change to use C<"\{"> instead, because a future
version of Perl (tentatively v5.26) will consider this to be a syntax error.  If
the pattern delimiters are also braces, any matching right brace
(C<"}">) should also be escaped to avoid confusing the parser, for
example,

    qr{abc\{def\}ghi}

=item *

L<Use of literal non-graphic characters in variable names is deprecated|perldiag/"Use of literal non-graphic characters in variable names is deprecated">

(D deprecated) Using literal non-graphic (including control)
characters in the source to refer to the I<^FOO> variables, like C<$^X> and
C<${^GLOBAL_PHASE}> is now deprecated.

=item *

L<Useless use of attribute "const"|perldiag/Useless use of attribute "const">

(W misc) The C<const> attribute has no effect except
on anonymous closure prototypes.  You applied it to
a subroutine via L<attributes.pm|attributes>.  This is only useful
inside an attribute handler for an anonymous subroutine.

=item *

L<Useless use of E<sol>d modifier in transliteration operator|perldiag/"Useless use of /d modifier in transliteration operator">

This is not a new diagnostic, but in earlier releases was accidentally
not displayed if the transliteration contained wide characters.  This is
now fixed, so that you may see this diagnostic in places where you
previously didn't (but should have).

=item *

L<E<quot>use re 'strict'E<quot> is experimental|perldiag/"use re 'strict'" is experimental>

(S experimental::re_strict) The things that are different when a regular
expression pattern is compiled under C<'strict'> are subject to change
in future Perl releases in incompatible ways; there are also proposals
to change how to enable strict checking instead of using this subpragma.
This means that a pattern that compiles today may not in a future Perl
release.  This warning is to alert you to that risk.

=item *

L<Warning: unable to close filehandle properly: %s|perldiag/"Warning: unable to close filehandle properly: %s">

L<Warning: unable to close filehandle %s properly: %s|perldiag/"Warning: unable to close filehandle %s properly: %s">

(S io) Previously, perl silently ignored any errors when doing an implicit
close of a filehandle, I<i.e.> where the reference count of the filehandle
reached zero and the user's code hadn't already called C<close()>; I<e.g.>

    {
        open my $fh, '>', $file  or die "open: '$file': $!\n";
        print $fh $data  or die;
    } # implicit close here

In a situation such as disk full, due to buffering, the error may only be
detected during the final close, so not checking the result of the close is
dangerous.

So perl now warns in such situations.
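
One way to avoid depending on the implicit close is to close the handle
explicitly and check the result, as in this sketch.  The temporary file
and the contents of C<$data> are placeholders chosen for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# $file and $data are placeholders for illustration.
my ($tmp_fh, $file) = tempfile(UNLINK => 1);
close $tmp_fh or die "close: '$file': $!\n";
my $data = "some data\n";

{
    open my $fh, '>', $file  or die "open: '$file': $!\n";
    print {$fh} $data        or die "print: '$file': $!\n";
    close $fh                or die "close: '$file': $!\n";
}

printf "wrote %d bytes\n", -s $file;
```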

=item *

L<Wide character (U+%X) in %s|perldiag/"Wide character (U+%X) in %s">

(W locale) While in a single-byte locale (I<i.e.>, a non-UTF-8
one), a multi-byte character was encountered.   Perl considers this
character to be the specified Unicode code point.  Combining non-UTF-8
locales and Unicode is dangerous.  Almost certainly some characters
will have two different representations.  For example, in the ISO 8859-7
(Greek) locale, the code point 0xC3 represents a Capital Gamma.  But so
also does 0x393.  This will make string comparisons unreliable.

You likely need to figure out how this multi-byte character got mixed up
with your single-byte locale (or perhaps you thought you had a UTF-8
locale, but Perl disagrees).

The warnings category C<locale> is new.

=back

=head2 Changes to Existing Diagnostics

=over 4

=item *

<> should be quotes

This warning has been changed to
L<< <> at require-statement should be quotes|perldiag/"<> at require-statement should be quotes" >>
to make the issue more identifiable.

=item *

L<Argument "%s" isn't numeric%s|perldiag/"Argument "%s" isn't numeric%s">

The L<perldiag> entry for this warning has added this clarifying note:

 Note that for the Inf and NaN (infinity and not-a-number) the
 definition of "numeric" is somewhat unusual: the strings themselves
 (like "Inf") are considered numeric, and anything following them is
 considered non-numeric.

=item *

L<Global symbol "%s" requires explicit package name|perldiag/"Global symbol "%s" requires explicit package name (did you forget to declare "my %s"?)">

This message has had '(did you forget to declare "my %s"?)' appended to it, to
make it more helpful to new Perl programmers.
L<[perl #121638]|https://rt.perl.org/Ticket/Display.html?id=121638>

=item *

'"my" variable &foo::bar can't be in a package' has been reworded to say
'subroutine' instead of 'variable'.

=item *

L<<< \N{} in character class restricted to one character in regex; marked by
S<< <-- HERE >> in mE<sol>%sE<sol>|perldiag/"\N{} in inverted character
class or as a range end-point is restricted to one character in regex;
marked by <-- HERE in m/%s/" >>>

This message has had I<character class> changed to I<inverted character
class or as a range end-point is> to reflect improvements in
C<qr/[\N{named sequence}]/> (see under L</Selected Bug Fixes>).

=item *

L<panic: frexp|perldiag/"panic: frexp: %f">

This message has had ': C<%f>' appended to it, to show what the offending
floating point number is.

=item *

I<Possible precedence problem on bitwise %c operator> reworded as
L<Possible precedence problem on bitwise %s operator|perldiag/"Possible precedence problem on bitwise %s operator">.

=item *

L<Unsuccessful %s on filename containing newline|perldiag/"Unsuccessful %s on filename containing newline">

This warning is now only produced when the newline is at the end of
the filename.

=item *

"Variable C<%s> will not stay shared" has been changed to say "Subroutine"
when it is actually a lexical sub that will not stay shared.

=item *

L<Variable length lookbehind not implemented in regex mE<sol>%sE<sol>|perldiag/"Variable length lookbehind not implemented in regex m/%s/">

The L<perldiag> entry for this warning has had information about Unicode
behavior added.

=back

=head2 Diagnostic Removals

=over

=item *

"Ambiguous use of -foo resolved as -&foo()"

There is actually no ambiguity here, and this impedes the use of negated
constants; I<e.g.>, C<-Inf>.

=item *

"Constant is not a FOO reference"

Compile-time checking of constant dereferencing (I<e.g.>, C<< my_constant->() >>)
has been removed, since it was not taking overloading into account.
L<[perl #69456]|https://rt.perl.org/Ticket/Display.html?id=69456>
L<[perl #122607]|https://rt.perl.org/Ticket/Display.html?id=122607>

=back

=head1 Utility Changes

=head2 F<find2perl>, F<s2p> and F<a2p> removal

=over 4

=item *

The F<x2p/> directory has been removed from the Perl core.

This removes find2perl, s2p and a2p. They have all been released to CPAN as
separate distributions (C<App::find2perl>, C<App::s2p>, C<App::a2p>).

=back

=head2 L<h2ph>

=over 4

=item *

F<h2ph> now handles hexadecimal constants in the compiler's predefined
macro definitions, as visible in C<$Config{cppsymbols}>.
L<[perl #123784]|https://rt.perl.org/Ticket/Display.html?id=123784>.

=back

=head2 L<encguess>

=over 4

=item *

No longer depends on non-core modules.

=back

=head1 Configuration and Compilation

=over 4

=item *

F<Configure> now checks for C<lrintl()>, C<lroundl()>, C<llrintl()>, and
C<llroundl()>.

=item *

F<Configure> with C<-Dmksymlinks> should now be faster.
L<[perl #122002]|https://rt.perl.org/Ticket/Display.html?id=122002>.

=item *

The C<pthreads> and C<cl> libraries will be linked by default if present.
This allows XS modules that require threading to work on non-threaded
perls. Note that you must still pass C<-Dusethreads> if you want a
threaded perl.

=item *

To get more precision and range for floating point numbers one can now
use the GCC quadmath library which implements the quadruple precision
floating point numbers on x86 and IA-64 platforms.  See F<INSTALL> for
details.

=item *

MurmurHash64A and MurmurHash64B can now be configured as the internal hash
function.

=item *

C<make test.valgrind> now supports parallel testing.

For example:

    TEST_JOBS=9 make test.valgrind

See L<perlhacktips/valgrind> for more information.

L<[perl #121431]|https://rt.perl.org/Ticket/Display.html?id=121431>

=item *

The MAD (Misc Attribute Decoration) build option has been removed.

This was an unmaintained attempt at preserving
the Perl parse tree more faithfully so that automatic conversion of
Perl 5 to Perl 6 would have been easier.

This build-time configuration option had been unmaintained for years,
and had probably seriously diverged on both Perl 5 and Perl 6 sides.

=item *

A new compilation flag, C<< -DPERL_OP_PARENT >> is available. For details,
see the discussion below at L<< /Internal Changes >>.

=item *

Pathtools no longer tries to load XS on miniperl. This speeds up building perl
slightly.

=back

=head1 Testing

=over 4

=item *

F<t/porting/re_context.t> has been added to test that L<utf8> and its
dependencies only use the subset of the C<$1..$n> capture vars that
C<Perl_save_re_context()> is hard-coded to localize, because that function
has no efficient way of determining at runtime what vars to localize.

=item *

Tests for performance issues have been added in the file F<t/perf/taint.t>.

=item *

Some regular expression tests are written in such a way that they will
run very slowly if certain optimizations break. These tests have been
moved into new files, F<< t/re/speed.t >> and F<< t/re/speed_thr.t >>,
and are run with a C<< watchdog() >>.

=item *

C<< test.pl >> now allows C<< plan skip_all => $reason >>, to make it
more compatible with C<< Test::More >>.

=item *

A new test script, F<op/infnan.t>, has been added to test if infinity and NaN are
working correctly.  See L</Infinity and NaN (not-a-number) handling improved>.

=back

=head1 Platform Support

=head2 Regained Platforms

=over 4

=item IRIX and Tru64 platforms are working again.

Some C<make test> failures remain:
L<[perl #123977]|https://rt.perl.org/Ticket/Display.html?id=123977>
and L<[perl #125298]|https://rt.perl.org/Ticket/Display.html?id=125298>
for IRIX; L<[perl #124212]|https://rt.perl.org/Ticket/Display.html?id=124212>,
L<[cpan #99605]|https://rt.cpan.org/Public/Bug/Display.html?id=99605>, and
L<[cpan #104836]|https://rt.cpan.org/Ticket/Display.html?id=104836> for Tru64.

=item z/OS running EBCDIC Code Page 1047

Core perl now works on this EBCDIC platform.  Earlier perls also worked, but,
even though support wasn't officially withdrawn, recent perls would not compile
and run well.  Perl 5.20 would work, but had many bugs which have now been
fixed.  Many CPAN modules that ship with Perl still fail tests, including
C<Pod::Simple>.  However the version of C<Pod::Simple> currently on CPAN should work;
it was fixed too late to include in Perl 5.22.  Work is under way to fix many
of the still-broken CPAN modules, which likely will be installed on CPAN when
completed, so that you may not have to wait until Perl 5.24 to get a working
version.

=back

=head2 Discontinued Platforms

=over 4

=item NeXTSTEP/OPENSTEP

NeXTSTEP was a proprietary operating system bundled with NeXT's
workstations in the early to mid 90s; OPENSTEP was an API specification
that provided a NeXTSTEP-like environment on a non-NeXTSTEP system.  Both
are now long dead, so support for building Perl on them has been removed.

=back

=head2 Platform-Specific Notes

=over 4

=item EBCDIC

Special handling is required of the perl interpreter on EBCDIC platforms
to get C<qr/[i-j]/> to match only C<"i"> and C<"j">, since there are 7
characters between the code points for C<"i"> and C<"j">.  Previously,
this special handling was invoked only when both ends of the range were
literals.  Now it is also
invoked if any of the C<\N{...}> forms for specifying a character by
name or Unicode code point is used instead of a literal.  See
L<perlrecharclass/Character Ranges>.

=item HP-UX

The archname now distinguishes use64bitint from use64bitall.

=item Android

Build support has been improved for cross-compiling in general and for
Android in particular.

=item VMS

=over 4

=item *

When spawning a subprocess without waiting, the return value is now
the correct PID.

=item *

Fix a prototype so linking doesn't fail under the VMS C++ compiler.

=item *

Detection of C<finite>, C<finitel>, and C<isfinite> has been added to
C<configure.com>, environment handling has had some minor changes, and
a fix has been made to legacy feature checking status.

=back

=item Win32

=over 4

=item *

F<miniperl.exe> is now built with C<-fno-strict-aliasing>, allowing 64-bit
builds to complete on GCC 4.8.
L<[perl #123976]|https://rt.perl.org/Ticket/Display.html?id=123976>

=item *

C<nmake minitest> now works on Win32.  Due to dependency issues you
need to build C<nmake test-prep> first, and a small number of the
tests fail.
L<[perl #123394]|https://rt.perl.org/Ticket/Display.html?id=123394>

=item *

Perl can now be built in C++ mode on Windows by setting the makefile macro
C<USE_CPLUSPLUS> to the value "define".

=item *

The list form of piped open has been implemented for Win32.  Note: unlike
C<system LIST> this does not fall back to the shell.
L<[perl #121159]|https://rt.perl.org/Ticket/Display.html?id=121159>
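
A minimal sketch of the list form is shown here.  It runs the current
perl (C<$^X>) as the child, and the one-line child program is invented
for illustration.

```perl
use strict;
use warnings;

# $^X is the path of the running perl; the one-liner is an illustrative child.
my @command = ($^X, '-e', 'print qq(hello from child\n)');

# List form: each element is passed to the child as a separate argument.
open(my $pipe, '-|', @command) or die "cannot start child: $!\n";
my $line = <$pipe>;
close $pipe or die "child failed: status $?\n";

print $line;
```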

=item *

New C<DebugSymbols> and C<DebugFull> configuration options added to
Windows makefiles.

=item *

Previously, compiling XS modules (including CPAN ones) using Visual C++ for
Win64 resulted in around a dozen warnings per file from F<hv_func.h>.  These
warnings have been silenced.

=item *

Support for building without PerlIO has been removed from the Windows
makefiles.  Non-PerlIO builds were all but deprecated in Perl 5.18.0 and are
already not supported by F<Configure> on POSIX systems.

=item *

Between 2 and 6 milliseconds and seven I/O calls have been saved per attempt
to open a perl module for each path in C<@INC>.

=item *

Intel C builds are now always built with C99 mode on.

=item *

C<%I64d> is now being used instead of C<%lld> for MinGW.

=item *

In the experimental C<:win32> layer, a crash in C<open> was fixed. Also
opening F</dev/null> (which works under Win32 Perl's default C<:unix>
layer) was implemented for C<:win32>.
L<[perl #122224]|https://rt.perl.org/Ticket/Display.html?id=122224>

=item *

A new makefile option, C<USE_LONG_DOUBLE>, has been added to the Windows
dmake makefile for gcc builds only.  Set this to "define" if you want perl to
use long doubles to give more accuracy and range for floating point numbers.

=back

=item OpenBSD

On OpenBSD, Perl will now default to using the system C<malloc> due to the
security features it provides. Perl's own malloc wrapper has been in use
since v5.14 due to performance reasons, but the OpenBSD project believes
the tradeoff is worth it and would prefer that users who need the speed
specifically ask for it.

L<[perl #122000]|https://rt.perl.org/Ticket/Display.html?id=122000>.

=item Solaris

=over 4

=item *

We now look for the Sun Studio compiler in both F</opt/solstudio*> and
F</opt/solarisstudio*>.

=item *

Builds on Solaris 10 with C<-Dusedtrace> would fail early since make
didn't follow implied dependencies to build C<perldtrace.h>.  An
explicit dependency has been added to C<depend>.
L<[perl #120120]|https://rt.perl.org/Ticket/Display.html?id=120120>

=item *

C99 options have been cleaned up; hints look for C<solstudio>
as well as C<SUNWspro>; and support for native C<setenv> has been added.

=back

=back

=head1 Internal Changes

=over 4

=item *

Experimental support has been added to allow ops in the optree to locate
their parent, if any. This is enabled by the non-default build option
C<-DPERL_OP_PARENT>. It is envisaged that this will eventually become
enabled by default, so XS code which directly accesses the C<op_sibling>
field of ops should be updated to be future-proofed.

On C<PERL_OP_PARENT> builds, the C<op_sibling> field has been renamed
C<op_sibparent> and a new flag, C<op_moresib>, added. On the last op in a
sibling chain, C<op_moresib> is false and C<op_sibparent> points to the
parent (if any) rather than being C<NULL>.

To make existing code work transparently whether using C<PERL_OP_PARENT>
or not, a number of new macros and functions have been added that should
be used, rather than directly manipulating C<op_sibling>.

For the case of just reading C<op_sibling> to determine the next sibling,
two new macros have been added. A simple scan through a sibling chain
like this:

    for (; kid->op_sibling; kid = kid->op_sibling) { ... }

should now be written as:

    for (; OpHAS_SIBLING(kid); kid = OpSIBLING(kid)) { ... }

For altering optrees, a general-purpose function C<op_sibling_splice()>
has been added, which allows for manipulation of a chain of sibling ops.
By analogy with the Perl function C<splice()>, it allows you to cut out
zero or more ops from a sibling chain and replace them with zero or more
new ops.  It transparently handles all the updating of sibling, parent,
op_last pointers etc.

If you need to manipulate ops at a lower level, then three new macros,
C<OpMORESIB_set>, C<OpLASTSIB_set> and C<OpMAYBESIB_set> are intended to
be a low-level portable way to set C<op_sibling> / C<op_sibparent> while
also updating C<op_moresib>.  The first sets the sibling pointer to a new
sibling, the second makes the op the last sibling, and the third
conditionally does the first or second action.  Note that unlike
C<op_sibling_splice()> these macros won't maintain consistency in the
parent at the same time (I<e.g.> by updating C<op_first> and C<op_last> where
appropriate).

A C-level C<Perl_op_parent()> function and a Perl-level C<B::OP::parent()>
method have been added. The C function only exists under
C<PERL_OP_PARENT> builds (using it is a build-time error on vanilla
perls).  C<B::OP::parent()> exists always, but on a vanilla build it
always returns C<NULL>. Under C<PERL_OP_PARENT>, they return the parent
of the current op, if any. The variable C<$B::OP::does_parent> allows you
to determine whether C<B> supports retrieving an op's parent.
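
From Perl code, the check might look like the following sketch.  The
anonymous subroutine is just something to inspect, and the fallback
string C<unavailable> is an arbitrary choice for builds where C<B>
cannot retrieve parents.

```perl
use strict;
use warnings;
use B ();

# An arbitrary subroutine whose optree is inspected.
my $code = sub { my $x = 1; return $x + 1 };

my $root = B::svref_2object($code)->ROOT;   # top-level op of the sub
my $kid  = $root->first;                    # its only child

# Only ask for the parent when B reports that it can supply one.
my $parent_name = 'unavailable';
if ($B::OP::does_parent) {
    $parent_name = $kid->parent->name;
}

printf "%s has parent: %s\n", $kid->name, $parent_name;
```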

C<PERL_OP_PARENT> was introduced in 5.21.2, but the interface was
changed considerably in 5.21.11. If you updated your code before the
5.21.11 changes, it may require further revision. The main changes after
5.21.2 were:

=over 4

=item *

The C<OP_SIBLING> and C<OP_HAS_SIBLING> macros have been renamed
C<OpSIBLING> and C<OpHAS_SIBLING> for consistency with other
op-manipulating macros.

=item *

The C<op_lastsib> field has been renamed C<op_moresib>, and its meaning
inverted.

=item *

The macro C<OpSIBLING_set> has been removed, and has been superseded by
C<OpMORESIB_set> I<et al>.

=item *

The C<op_sibling_splice()> function now accepts a null C<parent> argument
where the splicing doesn't affect the first or last ops in the sibling
chain.

=back

=item *

Macros have been created to allow XS code to better manipulate the POSIX locale
category C<LC_NUMERIC>.  See L<perlapi/Locale-related functions and macros>.

=item *

The previous C<atoi> I<et al> replacement function, C<grok_atou>, has now been
superseded by C<grok_atoUV>.  See L<perlclib> for details.

=item *

A new function, C<Perl_sv_get_backrefs()>, has been added which allows you
to retrieve the weak references, if any, which point at an SV.

=item *

The C<screaminstr()> function has been removed. Although marked as
public API, it was undocumented and had no usage in CPAN modules. Calling
it has been fatal since 5.17.0.

=item *

The C<newDEFSVOP()>, C<block_start()>, C<block_end()> and C<intro_my()>
functions have been added to the API.

=item *

The internal C<convert> function in F<op.c> has been renamed
C<op_convert_list> and added to the API.

=item *

The C<sv_magic()> function no longer forbids "ext" magic on read-only
values.  After all, perl can't know whether the custom magic will modify
the SV or not.
L<[perl #123103]|https://rt.perl.org/Ticket/Display.html?id=123103>.

=item *

Accessing L<perlapi/CvPADLIST> on an XSUB is now forbidden.

The C<CvPADLIST> field has been reused for a different internal purpose
for XSUBs. So in particular, you can no longer rely on it being NULL as a
test of whether a CV is an XSUB. Use C<CvISXSUB()> instead.

=item *

SVs of type C<SVt_NV> are now sometimes bodiless when the build
configuration and platform allow it: specifically, when C<< sizeof(NV) <=
sizeof(IV) >>. "Bodiless" means that the NV value is stored directly in
the head of an SV, without requiring a separate body to be allocated. This
trick has already been used for IVs since 5.9.2 (though in the case of
IVs, it is always used, regardless of platform and build configuration).

=item *

The C<$DB::single>, C<$DB::signal> and C<$DB::trace> variables now have set- and
get-magic that stores their values as IVs, and those IVs are used when
testing their values in C<pp_dbstate()>.  This prevents perl from
recursing infinitely if an overloaded object is assigned to any of those
variables.
L<[perl #122445]|https://rt.perl.org/Ticket/Display.html?id=122445>.

=item *

C<Perl_tmps_grow()>, which is marked as public API but is undocumented, has
been removed from the public API. This change does not affect XS code that
uses the C<EXTEND_MORTAL> macro to pre-extend the mortal stack.

=item *

Perl's internals no longer set or use the C<SVs_PADMY> flag.
C<SvPADMY()> now returns a true value for anything not marked C<PADTMP>
and C<SVs_PADMY> is now defined as 0.

=item *

The macros C<SETsv> and C<SETsvUN> have been removed. They were no longer used
in the core since commit 6f1401dc2a five years ago, and have not been
found present on CPAN.

=item *

The C<< SvFAKE >> bit (unused on HVs) has been informally reserved by
David Mitchell for future work on vtables.

=item *

The C<sv_catpvn_flags()> function accepts C<SV_CATBYTES> and C<SV_CATUTF8>
flags, which specify whether the appended string is bytes or UTF-8,
respectively. (These flags have in fact been present since 5.16.0, but
were formerly not regarded as part of the API.)

=item *

A new opcode class, C<< METHOP >>, has been introduced. It holds
information used at runtime to improve the performance
of class/object method calls.

C<< OP_METHOD >> and C<< OP_METHOD_NAMED >> have changed from being
C<< UNOP/SVOP >> to being C<< METHOP >>.

=item *

C<cv_name()> is a new API function that can be passed a CV or GV.  It
returns an SV containing the name of the subroutine, for use in
diagnostics.

L<[perl #116735]|https://rt.perl.org/Ticket/Display.html?id=116735>
L<[perl #120441]|https://rt.perl.org/Ticket/Display.html?id=120441>

=item *

C<cv_set_call_checker_flags()> is a new API function that works like
C<cv_set_call_checker()>, except that it allows the caller to specify
whether the call checker requires a full GV for reporting the subroutine's
name, or whether it could be passed a CV instead.  Whatever value is
passed will be acceptable to C<cv_name()>.  C<cv_set_call_checker()>
guarantees there will be a GV, but it may have to create one on the fly,
which is inefficient.
L<[perl #116735]|https://rt.perl.org/Ticket/Display.html?id=116735>

=item *

C<CvGV> (which is not part of the API) is now a more complex macro, which may
call a function and reify a GV.  For those cases where it has been used as a
boolean, C<CvHASGV> has been added, which will return true for CVs that
notionally have GVs, but without reifying the GV.  C<CvGV> also returns a GV
now for lexical subs.
L<[perl #120441]|https://rt.perl.org/Ticket/Display.html?id=120441>

=item *

The L<perlapi/sync_locale> function has been added to the public API.
Changing the program's locale should be avoided by XS code. Nevertheless,
certain non-Perl libraries called from XS need to do so, such as C<Gtk>.
When this happens, Perl needs to be told that the locale has
changed.  Use this function to do so, before returning to Perl.

=item *

The defines and labels for the flags in the C<op_private> field of OPs are now
auto-generated from data in F<regen/op_private>.  The noticeable effect of this
is that some of the flag output of C<Concise> might differ slightly, and the
flag output of S<C<perl -Dx>> may differ considerably (they both use the same set
of labels now).  Also, debugging builds now have a new assertion in
C<op_free()> to ensure that the op doesn't have any unrecognized flags set in
C<op_private>.

=item *

The deprecated variable C<PL_sv_objcount> has been removed.

=item *

Perl now tries to keep the locale category C<LC_NUMERIC> set to "C"
except around operations that need it to be set to the program's
underlying locale.  This protects the many XS modules that cannot cope
with the decimal radix character not being a dot.  Prior to this
release, Perl initialized this category to "C", but a call to
C<POSIX::setlocale()> would change it.  Now such a call will change the
underlying locale of the C<LC_NUMERIC> category for the program, but the
locale exposed to XS code will remain "C".  There are new macros
to manipulate the LC_NUMERIC locale, including
C<STORE_LC_NUMERIC_SET_TO_NEEDED> and
C<STORE_LC_NUMERIC_FORCE_TO_UNDERLYING>.
See L<perlapi/Locale-related functions and macros>.

=item *

A new macro L<C<isUTF8_CHAR>|perlapi/isUTF8_CHAR> has been written which
efficiently determines if the string given by its parameters begins
with a well-formed UTF-8 encoded character.

=item *

The following private API functions had their context parameter removed:
C<Perl_cast_ulong>,  C<Perl_cast_i32>, C<Perl_cast_iv>,    C<Perl_cast_uv>,
C<Perl_cv_const_sv>, C<Perl_mg_find>,  C<Perl_mg_findext>, C<Perl_mg_magical>,
C<Perl_mini_mktime>, C<Perl_my_dirfd>, C<Perl_sv_backoff>, C<Perl_utf8_hop>.

Note that the prefix-less versions of those functions that are part of the
public API, such as C<cast_i32()>, remain unaffected.

=item *

The C<PADNAME> and C<PADNAMELIST> types are now separate types, and no
longer simply aliases for SV and AV.
L<[perl #123223]|https://rt.perl.org/Ticket/Display.html?id=123223>.

=item *

Pad names are now always UTF-8.  The C<PadnameUTF8> macro always returns
true.  Previously, this was effectively the case already, but any support
for two different internal representations of pad names has now been
removed.

=item *

A new op class, C<UNOP_AUX>, has been added. This is a subclass of
C<UNOP> with an C<op_aux> field added, which points to an array of unions
of UV, SV* etc. It is intended for where an op needs to store more data
than a simple C<op_sv> or whatever. Currently the only op of this type is
C<OP_MULTIDEREF> (see next item).

=item *

A new op has been added, C<OP_MULTIDEREF>, which performs one or more
nested array and hash lookups where the key is a constant or simple
variable. For example the expression C<$a[0]{$k}[$i]>, which previously
involved ten C<rv2Xv>, C<Xelem>, C<gvsv> and C<const> ops is now performed
by a single C<multideref> op. It can also handle C<local>, C<exists> and
C<delete>. A non-simple index expression, such as C<[$i+1]> is still done
using C<aelem>/C<helem>, and single-level array lookup with a small constant
index is still done using C<aelemfast>.

=back

=head1 Selected Bug Fixes

=over 4

=item *

C<close> now sets C<$!>

When an I/O error occurs, the fact that there has been an error is recorded
in the handle.  C<close> returns false for such a handle.  Previously, the
value of C<$!> would be untouched by C<close>, so the common convention of
writing S<C<close $fh or die $!>> did not work reliably.  Now the handle
records the value of C<$!>, too, and C<close> restores it.

=item *

C<no re> now can turn off everything that C<use re> enables

Previously, running C<no re> would turn off only a few things. Now it
can turn off all the enabled things. For example, the only way to
stop debugging, once enabled, was to exit the enclosing block; that is
now fixed.

=item *

C<pack("D", $x)> and C<pack("F", $x)> now zero the padding on x86 long
double builds.  Under some build options on GCC 4.8 and later, they used
to either overwrite the zero-initialized padding, or bypass the
initialized buffer entirely.  This caused F<op/pack.t> to fail.
L<[perl #123971]|https://rt.perl.org/Ticket/Display.html?id=123971>

=item *

Extending an array cloned from a parent thread could result in "Modification of
a read-only value attempted" errors when attempting to modify the new elements.
L<[perl #124127]|https://rt.perl.org/Ticket/Display.html?id=124127>

=item *

An assertion failure and subsequent crash with C<< *x=<y> >> has been fixed.
L<[perl #123790]|https://rt.perl.org/Ticket/Display.html?id=123790>

=item *

A possible crashing/looping bug related to compiling lexical subs has been
fixed.
L<[perl #124099]|https://rt.perl.org/Ticket/Display.html?id=124099>

=item *

UTF-8 now works correctly in function names, in unquoted HERE-document
terminators, and in variable names used as array indexes.
L<[perl #124113]|https://rt.perl.org/Ticket/Display.html?id=124113>

=item *

Repeated global pattern matches in scalar context on large tainted strings were
exponentially slow depending on the current match position in the string.
L<[perl #123202]|https://rt.perl.org/Ticket/Display.html?id=123202>

=item *

Various crashes due to the parser getting confused by syntax errors have been
fixed.
L<[perl #123801]|https://rt.perl.org/Ticket/Display.html?id=123801>
L<[perl #123802]|https://rt.perl.org/Ticket/Display.html?id=123802>
L<[perl #123955]|https://rt.perl.org/Ticket/Display.html?id=123955>
L<[perl #123995]|https://rt.perl.org/Ticket/Display.html?id=123995>

=item *

C<split> in the scope of lexical C<$_> has been fixed not to fail assertions.
L<[perl #123763]|https://rt.perl.org/Ticket/Display.html?id=123763>

=item *

C<my $x : attr> syntax inside various list operators no longer fails
assertions.
L<[perl #123817]|https://rt.perl.org/Ticket/Display.html?id=123817>

=item *

An C<@> sign in quotes followed by a non-ASCII digit (which is not a valid
identifier) would cause the parser to crash, instead of simply treating the
C<@> as a literal.  This has been fixed.
L<[perl #123963]|https://rt.perl.org/Ticket/Display.html?id=123963>

=item *

C<*bar::=*foo::=*glob_with_hash> has been crashing since Perl 5.14, but no
longer does.
L<[perl #123847]|https://rt.perl.org/Ticket/Display.html?id=123847>

=item *

C<foreach> in scalar context was not pushing an item on to the stack, resulting
in bugs.  (S<C<print 4, scalar do { foreach(@x){} } + 1>> would print 5.)
It has been fixed to return C<undef>.
L<[perl #124004]|https://rt.perl.org/Ticket/Display.html?id=124004>

=item *

Several cases of data used to store environment variable contents in core C
code being potentially overwritten before being used have been fixed.
L<[perl #123748]|https://rt.perl.org/Ticket/Display.html?id=123748>

=item *

Some patterns starting with C</.*..../> matched against long strings have
been slow since v5.8, and some of the form C</.*..../i> have been slow
since v5.18. They are now all fast again.
L<[perl #123743]|https://rt.perl.org/Ticket/Display.html?id=123743>.

=item *

The original visible value of C<$/> is now preserved when it is set to
an invalid value.  Previously if you set C<$/> to a reference to an
array, for example, perl would produce a runtime error and not set
C<PL_rs>, but Perl code that checked C<$/> would see the array
reference.
L<[perl #123218]|https://rt.perl.org/Ticket/Display.html?id=123218>.
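
A minimal sketch of the behavior described above, assuming a perl that
includes this fix.  The variable names are illustrative only.

    my $before = $/;                  # normally "\n"

    # An array reference is not a valid value for $/, so the assignment
    # raises a runtime error, which eval catches.
    my $ok = eval { $/ = []; 1 };
    print "assignment rejected: $@" unless $ok;

    # Perl code inspecting $/ afterwards still sees the earlier value.
    print "\$/ is unchanged\n" if defined $/ && $/ eq $before;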

=item *

In a regular expression pattern, a POSIX class, like C<[:ascii:]>, must
be inside a bracketed character class, like C<qr/[[:ascii:]]/>.  A
warning is issued when something looking like a POSIX class is not
inside a bracketed class.  That warning wasn't getting generated when
the POSIX class was negated: C<[:^ascii:]>.  This is now fixed.

=item *

Perl 5.14.0 introduced a bug whereby S<C<eval { LABEL: }>> would crash.  This
has been fixed.
L<[perl #123652]|https://rt.perl.org/Ticket/Display.html?id=123652>.

=item *

Various crashes due to the parser getting confused by syntax errors have
been fixed.
L<[perl #123617]|https://rt.perl.org/Ticket/Display.html?id=123617>.
L<[perl #123737]|https://rt.perl.org/Ticket/Display.html?id=123737>.
L<[perl #123753]|https://rt.perl.org/Ticket/Display.html?id=123753>.
L<[perl #123677]|https://rt.perl.org/Ticket/Display.html?id=123677>.

=item *

Code like C</$a[/> used to read the next line of input and treat it as
though it came immediately after the opening bracket.  Some invalid code
consequently would parse and run, but some code caused crashes, so this is
now disallowed.
L<[perl #123712]|https://rt.perl.org/Ticket/Display.html?id=123712>.

=item *

Fix argument underflow for C<pack>.
L<[perl #123874]|https://rt.perl.org/Ticket/Display.html?id=123874>.

=item *

Fix handling of non-strict C<\x{}>. Now C<\x{}> is equivalent to C<\x{0}>
instead of faulting.

=item *

C<stat -t> is no longer treated as stackable, just like C<-t stat>.
L<[perl #123816]|https://rt.perl.org/Ticket/Display.html?id=123816>.

=item *

The following no longer causes a SEGV: C<qr{x+(y(?0))*}>.

=item *

Fixed infinite loop in parsing backrefs in regexp patterns.

=item *

Several minor bug fixes in behavior of Infinity and NaN, including
warnings when stringifying Infinity-like or NaN-like strings. For example,
"NaNcy" doesn't numify to NaN anymore.

=item *

A bug in regular expression patterns that could lead to segfaults and
other crashes has been fixed.  This occurred only in patterns compiled
with C</i> while taking into account the current POSIX locale (which usually
means they have to be compiled within the scope of C<S<use locale>>),
and there must be a string of at least 128 consecutive bytes to match.
L<[perl #123539]|https://rt.perl.org/Ticket/Display.html?id=123539>.

=item *

C<s///g> now works on very long strings (where there are more than 2
billion iterations) instead of dying with 'Substitution loop'.
L<[perl #103260]|https://rt.perl.org/Ticket/Display.html?id=103260>.
L<[perl #123071]|https://rt.perl.org/Ticket/Display.html?id=123071>.

=item *

C<gmtime> no longer crashes with not-a-number values.
L<[perl #123495]|https://rt.perl.org/Ticket/Display.html?id=123495>.

=item *

C<\()> (a reference to an empty list), and C<y///> with lexical C<$_> in
scope, could both do a bad write past the end of the stack.  They have
both been fixed to extend the stack first.

=item *

C<prototype()> with no arguments used to read the previous item on the
stack, so S<C<print "foo", prototype()>> would print foo's prototype.
It has been fixed to infer C<$_> instead.
L<[perl #123514]|https://rt.perl.org/Ticket/Display.html?id=123514>.
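
As an illustration, the following sketch calls C<prototype()> with no
argument while C<$_> holds a subroutine name.  The sub C<pair> is
hypothetical and exists only for this example.

    sub pair ($$) { return "$_[0]=$_[1]" }   # hypothetical sub with a prototype

    my $proto;
    for ("pair") {
        $proto = prototype();   # no argument: looks up the sub named in $_
    }
    print "prototype of pair is $proto\n";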

=item *

Some cases of lexical state subs declared inside predeclared subs could
crash, for example when evalling a string including the name of an outer
variable, but no longer do.

=item *

Some cases of nested lexical state subs inside anonymous subs could cause
'Bizarre copy' errors or possibly even crashes.

=item *

When trying to emit warnings, perl's default debugger (F<perl5db.pl>) was
sometimes giving 'Undefined subroutine &DB::db_warn called' instead.  This
bug, which started to occur in Perl 5.18, has been fixed.
L<[perl #123553]|https://rt.perl.org/Ticket/Display.html?id=123553>.

=item *

Certain syntax errors in substitutions, such as C<< s/${<>{})// >>, would
crash, and had done so since Perl 5.10.  (In some cases the crash did not
start happening till 5.16.)  The crash has, of course, been fixed.
L<[perl #123542]|https://rt.perl.org/Ticket/Display.html?id=123542>.

=item *

Fix a couple of string grow size calculation overflows; in particular,
a repeat expression like S<C<33 x ~3>> could cause a large buffer
overflow since the new output buffer size was not correctly handled by
C<SvGROW()>.  An expression like this now properly produces a memory wrap
panic.
L<[perl #123554]|https://rt.perl.org/Ticket/Display.html?id=123554>.

=item *

C<< formline("@...", "a"); >> would crash.  The C<FF_CHECKNL> case in
C<pp_formline()> didn't set the pointer used to mark the chop position,
which led to the C<FF_MORE> case crashing with a segmentation fault.
This has been fixed.
L<[perl #123538]|https://rt.perl.org/Ticket/Display.html?id=123538>.

=item *

A possible buffer overrun and crash when parsing a literal pattern during
regular expression compilation has been fixed.
L<[perl #123604]|https://rt.perl.org/Ticket/Display.html?id=123604>.

=item *

C<fchmod()> and C<futimes()> now set C<$!> when they fail due to being
passed a closed file handle.
L<[perl #122703]|https://rt.perl.org/Ticket/Display.html?id=122703>.

=item *

C<op_free()> and C<scalarvoid()> no longer crash due to a stack overflow
when freeing a deeply recursive op tree.
L<[perl #108276]|https://rt.perl.org/Ticket/Display.html?id=108276>.

=item *

In Perl 5.20.0, C<$^N> accidentally had the internal UTF-8 flag turned off
if accessed from a code block within a regular expression, effectively
UTF-8-encoding the value.  This has been fixed.
L<[perl #123135]|https://rt.perl.org/Ticket/Display.html?id=123135>.

=item *

A failed C<semctl> call no longer overwrites existing items on the stack,
which means that C<(semctl(-1,0,0,0))[0]> no longer gives an
"uninitialized" warning.

=item *

C<else{foo()}> with no space before C<foo> is now better at assigning the
right line number to that statement.
L<[perl #122695]|https://rt.perl.org/Ticket/Display.html?id=122695>.

=item *

Sometimes the assignment in C<@array = split> gets optimised so that C<split>
itself writes directly to the array.  This caused a bug, preventing this
assignment from being used in lvalue context.  So
C<(@a=split//,"foo")=bar()> was an error.  (This bug probably goes back to
Perl 3, when the optimisation was added.) It has now been fixed.
L<[perl #123057]|https://rt.perl.org/Ticket/Display.html?id=123057>.

=item *

When an argument list fails the checks specified by a subroutine
signature (which is still an experimental feature), the resulting error
messages now give the file and line number of the caller, not of the
called subroutine.
L<[perl #121374]|https://rt.perl.org/Ticket/Display.html?id=121374>.
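
The following sketch shows one way to observe this, assuming a perl
with the experimental signatures feature available.  The sub C<greet>
is hypothetical, and the exact wording of the error message may differ
between releases, so only the line number is checked.

    use feature 'signatures';
    no warnings 'experimental::signatures';

    sub greet ($name) { return "hello, $name" }   # hypothetical sub

    my $call_line = __LINE__ + 1;
    my $ok = eval { greet(); 1 };                 # too few arguments
    my $err = $@;

    print "error: $err" unless $ok;
    print "reported at the caller's line\n" if $err =~ /line $call_line\b/;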

=item *

The flip-flop operators (C<..> and C<...> in scalar context) used to maintain
a separate state for each recursion level (the number of times the
enclosing sub was called recursively), contrary to the documentation.  Now
each closure has one internal state for each flip-flop.
L<[perl #122829]|https://rt.perl.org/Ticket/Display.html?id=122829>.

=item *

The flip-flop operator (C<..> in scalar context) would return the same
scalar each time, unless the containing subroutine was called recursively.
Now it always returns a new scalar.
L<[perl #122829]|https://rt.perl.org/Ticket/Display.html?id=122829>.

=item *

C<use>, C<no>, statement labels, special blocks (C<BEGIN>) and pod are now
permitted as the first thing in a C<map> or C<grep> block, the block after
C<print> or C<say> (or other functions) returning a handle, and within
C<${...}>, C<@{...}>, etc.
L<[perl #122782]|https://rt.perl.org/Ticket/Display.html?id=122782>.

=item *

The repetition operator C<x> now propagates lvalue context to its left-hand
argument when used in contexts like C<foreach>.  That allows
S<C<for(($#that_array)x2) { ... }>> to work as expected if the loop modifies
C<$_>.

=item *

C<(...) x ...> in scalar context used to corrupt the stack if one operand
was an object with "x" overloading, causing erratic behavior.
L<[perl #121827]|https://rt.perl.org/Ticket/Display.html?id=121827>.

=item *

Assignment to a lexical scalar is often optimised away; for example in
C<my $x; $x = $y + $z>, the assign operator is optimised away and the add
operator writes its result directly to C<$x>.  Various bugs related to
this optimisation have been fixed.  Certain operators on the right-hand
side would sometimes fail to assign the value at all or assign the wrong
value, or would call STORE twice or not at all on tied variables.  The
operators affected were C<$foo++>, C<$foo-->, and C<-$foo> under C<use
integer>, C<chomp>, C<chr> and C<setpgrp>.

=item *

List assignments were sometimes buggy if the same scalar ended up on both
sides of the assignment due to use of C<tied>, C<values> or C<each>.  The
result would be the wrong value getting assigned.

=item *

C<setpgrp($nonzero)> (with one argument) was accidentally changed in 5.16
to mean C<setpgrp(0)>.  This has been fixed.

=item *

C<__SUB__> could return the wrong value or even corrupt memory under the
debugger (the C<-d> switch) and in subs containing C<eval $string>.

=item *

When S<C<sub () { $var }>> becomes inlinable, it now returns a different
scalar each time, just as a non-inlinable sub would, though Perl still
optimises the copy away in cases where it would make no observable
difference.

=item *

S<C<my sub f () { $var }>> and S<C<sub () : attr { $var }>> are no longer
eligible for inlining.  The former would crash; the latter would just
throw the attributes away.  An exception is made for the little-known
C<:method> attribute, which does nothing much.

=item *

Inlining of subs with an empty prototype is now more consistent than
before. Previously, a sub with multiple statements, of which all but the last
were optimised away, would be inlinable only if it were an anonymous sub
containing a string C<eval> or C<state> declaration or closing over an
outer lexical variable (or any anonymous sub under the debugger).  Now any
sub that gets folded to a single constant after statements have been
optimised away is eligible for inlining.  This applies to things like C<sub
() { jabber() if DEBUG; 42 }>.

Some subroutines with an explicit C<return> were being made inlinable,
contrary to the documentation.  Now C<return> always prevents inlining.

=item *

On some systems, such as VMS, C<crypt> can return a non-ASCII string.  If a
scalar assigned to had contained a UTF-8 string previously, then C<crypt>
would not turn off the UTF-8 flag, thus corrupting the return value.  This
would happen with S<C<$lexical = crypt ...>>.

=item *

C<crypt> no longer calls C<FETCH> twice on a tied first argument.

=item *

An unterminated here-doc on the last line of a quote-like operator
(C<qq[${ <<END }]>, C</(?{ <<END })/>) no longer causes a double free.  It
started doing so in 5.18.

=item *

C<index()> and C<rindex()> no longer crash when used on strings over 2GB in
size.
L<[perl #121562]|https://rt.perl.org/Ticket/Display.html?id=121562>.

=item *

A small, previously intentional, memory leak in
C<PERL_SYS_INIT>/C<PERL_SYS_INIT3> on Win32 builds was fixed. This might
affect embedders who repeatedly create and destroy perl engines within
the same process.

=item *

C<POSIX::localeconv()> now returns the data for the program's underlying
locale even when called from outside the scope of S<C<use locale>>.

=item *

C<POSIX::localeconv()> now works properly on platforms which don't have
C<LC_NUMERIC> and/or C<LC_MONETARY>, or for which Perl has been compiled
to disregard either or both of these locale categories.  In such
circumstances, there are now no entries for the corresponding values in
the hash returned by C<localeconv()>.

=item *

C<POSIX::localeconv()> now marks appropriately the values it returns as
UTF-8 or not.  Previously they were always returned as bytes, even if
they were supposed to be encoded as UTF-8.
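
A short sketch of inspecting the hash returned by
C<POSIX::localeconv()>.  The three field names used here are standard
C<localeconv> fields, but which of them are present depends on the
platform and locale, so the code checks with C<exists> before reading
each one.

    use POSIX ();

    my $conv = POSIX::localeconv();   # returns a hash reference
    for my $field (qw(decimal_point thousands_sep currency_symbol)) {
        if (exists $conv->{$field}) {
            my $value = $conv->{$field};
            printf "%-16s '%s'%s\n", $field, $value,
                utf8::is_utf8($value) ? " (UTF-8 flagged)" : "";
        }
        else {
            print "$field: no entry for this platform or locale\n";
        }
    }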

=item *

On Microsoft Windows, within the scope of C<S<use locale>>, the following
POSIX character classes gave results for many locales that did not
conform to the POSIX standard:
C<[[:alnum:]]>,
C<[[:alpha:]]>,
C<[[:blank:]]>,
C<[[:digit:]]>,
C<[[:graph:]]>,
C<[[:lower:]]>,
C<[[:print:]]>,
C<[[:punct:]]>,
C<[[:upper:]]>,
C<[[:word:]]>,
and
C<[[:xdigit:]]>.
This was because the underlying Microsoft implementation does not
follow the standard.  Perl now takes special precautions to correct for
this.

=item *

Many issues have been detected by L<Coverity|http://www.coverity.com/> and
fixed.

=item *

C<system()> and friends should now work properly on more Android builds.

Due to an oversight, the value specified through C<-Dtargetsh> to F<Configure>
would end up being ignored by some of the build process.  This caused perls
cross-compiled for Android to end up with defective versions of C<system()>,
C<exec()> and backticks: the commands would end up looking for C</bin/sh>
instead of C</system/bin/sh>, and so would fail for the vast majority
of devices, leaving C<$!> as C<ENOENT>.

=item *

C<qr(...\(...\)...)>,
C<qr[...\[...\]...]>,
and
C<qr{...\{...\}...}>
now work.  Previously, when the pattern's opening delimiter was one of
these three left-hand characters and the closing delimiter was its mirror
character, it was impossible to use a backslash to escape that character
within the pattern in places where it would otherwise be considered a
metacharacter.
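
For example, the sketch below escapes the delimiter characters so that
they match literally.  It assumes a perl that includes this fix, and
the variable names are illustrative only.

    my $call  = qr(f\(x\));    # intended to match the literal text "f(x)"
    my $quant = qr{a\{2\}};    # intended to match the literal text "a{2}"

    print "parentheses matched literally\n" if "f(x)" =~ $call;
    print "braces matched literally\n"
        if "a{2}" =~ $quant && "aa" !~ $quant;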

=item *

C<< s///e >> on tainted UTF-8 strings corrupted C<< pos() >>. This bug,
introduced in 5.20, is now fixed.
L<[perl #122148]|https://rt.perl.org/Ticket/Display.html?id=122148>.

=item *

A non-word boundary in a regular expression (C<< \B >>) did not always
match the end of the string; in particular C<< q{} =~ /\B/ >> did not
match. This bug, introduced in perl 5.14, is now fixed.
L<[perl #122090]|https://rt.perl.org/Ticket/Display.html?id=122090>.

=item *

C<< " P" =~ /(?=.*P)P/ >> should match, but did not. This is now fixed.
L<[perl #122171]|https://rt.perl.org/Ticket/Display.html?id=122171>.

=item *

Failing to compile C<use Foo> in an C<eval> could leave a spurious
C<BEGIN> subroutine definition, which would produce a "Subroutine
BEGIN redefined" warning on the next use of C<use>, or other C<BEGIN>
block.
L<[perl #122107]|https://rt.perl.org/Ticket/Display.html?id=122107>.

=item *

C<method { BLOCK } ARGS> syntax now correctly parses the arguments if they
begin with an opening brace.
L<[perl #46947]|https://rt.perl.org/Ticket/Display.html?id=46947>.

=item *

External libraries and Perl may have different ideas of what the locale is.
This is problematic when parsing version strings if the locale's numeric
separator has been changed.  Version parsing has been patched to ensure
it handles the locales correctly.
L<[perl #121930]|https://rt.perl.org/Ticket/Display.html?id=121930>.

=item *

A bug has been fixed where zero-length assertions and code blocks inside of a
regex could cause C<pos> to see an incorrect value.
L<[perl #122460]|https://rt.perl.org/Ticket/Display.html?id=122460>.

=item *

Dereferencing of constants now works correctly for typeglob constants.  Previously
the glob was stringified and its name looked up.  Now the glob itself is used.
L<[perl #69456]|https://rt.perl.org/Ticket/Display.html?id=69456>

=item *

When parsing a sigil (C<$>, C<@>, C<%> or C<&>) followed by braces, the
parser no longer tries to guess whether it is a block or a hash constructor
(causing a syntax error when it guesses the latter), since it can only be a
block.

=item *

S<C<undef $reference>> now frees the referent immediately, instead of hanging on
to it until the next statement.
L<[perl #122556]|https://rt.perl.org/Ticket/Display.html?id=122556>
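
One way to observe the timing is to record when a destructor runs
relative to code later in the same statement.  This is only a sketch:
the C<Tracker> class and the C<@events> array are hypothetical names
introduced for the example.

    package Tracker {
        sub new     { return bless {}, shift }
        sub DESTROY { push @main::events, "destroyed" }
    }

    our @events;
    my $ref = Tracker->new;

    # Both operations are part of a single statement.
    undef($ref), push(@events, "after undef");

    print join(" -> ", @events), "\n";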

=item *

Various cases where the name of a sub is used (autoload, overloading, error
messages) used to crash for lexical subs, but have been fixed.

=item *

Bareword lookup now tries to avoid vivifying packages if it turns out the
bareword is not going to be a subroutine name.

=item *

Compilation of anonymous constants (I<e.g.>, C<sub () { 3 }>) no longer deletes
any subroutine named C<__ANON__> in the current package.  Not only was
C<*__ANON__{CODE}> cleared, but there was a memory leak, too.  This bug goes
back to Perl 5.8.0.

=item *

Stub declarations like C<sub f;> and C<sub f ();> no longer wipe out constants
of the same name declared by C<use constant>.  This bug was introduced in Perl
5.10.0.

=item *

C<qr/[\N{named sequence}]/> now works properly in many instances.

Some names
known to C<\N{...}> refer to a sequence of multiple characters, instead of the
usual single character.  Bracketed character classes generally only match
single characters, but now special handling has been added so that they can
match named sequences, but not if the class is inverted or the sequence is
specified as the beginning or end of a range.  In these cases, the only
behavior change from before is a slight rewording of the fatal error message
given when this class is part of a C<(?[...])> construct.  When the C<[...]>
stands alone, the same non-fatal warning as before is raised, and only the
first character in the sequence is used, again just as before.

=item *

Tainted constants evaluated at compile time no longer cause unrelated
statements to become tainted.
L<[perl #122669]|https://rt.perl.org/Ticket/Display.html?id=122669>

=item *

S<C<open $$fh, ...>>, which vivifies a handle with a name like
C<"main::_GEN_0">, was not giving the handle the right reference count, so
a double free could happen.

=item *

When deciding that a bareword was a method name, the parser would get confused
if an C<our> sub with the same name existed, and look up the method in the
package of the C<our> sub, instead of the package of the invocant.

=item *

The parser no longer gets confused by C<\U=> within a double-quoted string.  It
used to produce a syntax error, but now compiles it correctly.
L<[perl #80368]|https://rt.perl.org/Ticket/Display.html?id=80368>

=item *

It has always been the intention for the C<-B> and C<-T> file test operators to
treat UTF-8 encoded files as text.  (L<perlfunc|perlfunc/-X FILEHANDLE> has
been updated to say this.)  Previously, it was possible for some files to be
considered UTF-8 that actually weren't valid UTF-8.  This is now fixed.  The
operators now work on EBCDIC platforms as well.

=item *

Under some conditions warning messages raised during regular expression pattern
compilation were being output more than once.  This has now been fixed.

=item *

Perl 5.20.0 introduced a regression in which a UTF-8 encoded regular
expression pattern that contains a single ASCII lowercase letter did not
match its uppercase counterpart. That has been fixed in both 5.20.1 and
5.22.0.
L<[perl #122655]|https://rt.perl.org/Ticket/Display.html?id=122655>

=item *

Constant folding could incorrectly suppress warnings if lexical warnings
(C<use warnings> or C<no warnings>) were not in effect and C<$^W> were
false at compile time and true at run time.

=item *

Loading Unicode tables during a regular expression match could cause assertion
failures under debugging builds if the previous match used the very same
regular expression.
L<[perl #122747]|https://rt.perl.org/Ticket/Display.html?id=122747>

=item *

Thread cloning used to work incorrectly for lexical subs, possibly causing
crashes or double frees on exit.

=item *

Since Perl 5.14.0, deleting C<$SomePackage::{__ANON__}> and then undefining an
anonymous subroutine could corrupt things internally, resulting in
L<Devel::Peek> crashing or L<B.pm|B> giving nonsensical data.  This has been
fixed.

=item *

S<C<(caller $n)[3]>> now reports names of lexical subs, instead of
treating them as C<"(unknown)">.

=item *

C<sort subname LIST> now supports using a lexical sub as the comparison
routine.
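
A minimal sketch, assuming the C<lexical_subs> feature is available
(it was still experimental in this release).  The name C<by_number> is
hypothetical.

    use feature 'lexical_subs';
    no warnings 'experimental::lexical_subs';

    my sub by_number { $a <=> $b }   # hypothetical comparison routine

    my @sorted = sort by_number 10, 9, 100;
    print "@sorted\n";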

=item *

Aliasing (I<e.g.>, via S<C<*x = *y>>) could confuse list assignments that mention the
two names for the same variable on either side, causing wrong values to be
assigned.
L<[perl #15667]|https://rt.perl.org/Ticket/Display.html?id=15667>

=item *

Long here-doc terminators could cause a bad read on short lines of input.  This
has been fixed.  It is doubtful that any crash could have occurred.  This bug
goes back to when here-docs were introduced in Perl 3.000 twenty-five years
ago.

=item *

An optimization in C<split> to treat S<C<split /^/>> like S<C<split /^/m>> had the
unfortunate side-effect of also treating S<C<split /\A/>> like S<C<split /^/m>>,
which it should not.  This has been fixed.  (Note, however, that S<C<split /^x/>>
does not behave like S<C<split /^x/m>>, which is also considered to be a bug and
will be fixed in a future version.)
L<[perl #122761]|https://rt.perl.org/Ticket/Display.html?id=122761>
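
The difference can be seen with a short sketch like the following,
where the input string is arbitrary:

    my $text = "a\nb\nc\n";

    my @by_line  = split /^/,  $text;   # treated like /^/m: one field per line
    my @by_start = split /\A/, $text;   # \A matches only at the very start

    printf "split /^/ gives %d field(s)\n",   scalar @by_line;
    printf "split /\\A/ gives %d field(s)\n", scalar @by_start;

With the fix, the first call splits the string into its three lines,
while the second returns the whole string as a single field.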

=item *

The little-known S<C<my Class $var>> syntax (see L<fields> and L<attributes>)
could get confused in the scope of C<use utf8> if C<Class> were a constant
whose value contained Latin-1 characters.

=item *

Locking and unlocking values via L<Hash::Util> or C<Internals::SvREADONLY>
no longer has any effect on values that were read-only to begin with.
Previously, unlocking such values could result in crashes, hangs or
other erratic behavior.

=item *

Some unterminated C<(?(...)...)> constructs in regular expressions would
either crash or give erroneous error messages.  C</(?(1)/> is one such
example.

=item *

S<C<pack "w", $tied>> no longer calls FETCH twice.

=item *

List assignments like S<C<($x, $z) = (1, $y)>> now work correctly if C<$x> and
C<$y> have been aliased by C<foreach>.

=item *

Some patterns including code blocks with syntax errors, such as
S<C</ (?{(^{})/>>, would hang or fail assertions on debugging builds.  Now
they produce errors.

=item *

An assertion failure when parsing C<sort> with debugging enabled has been
fixed.
L<[perl #122771]|https://rt.perl.org/Ticket/Display.html?id=122771>.

=item *

S<C<*a = *b; @a = split //, $b[1]>> could do a bad read and produce junk
results.

=item *

In S<C<() = @array = split>>, the S<C<() =>> at the beginning no longer confuses
the optimizer into assuming a limit of 1.

=item *

Fatal warnings no longer prevent the output of syntax errors.
L<[perl #122966]|https://rt.perl.org/Ticket/Display.html?id=122966>.

=item *

Fixed a NaN double-to-long-double conversion error on VMS. For quiet NaNs
(and only on Itanium, not Alpha) negative infinity instead of NaN was
produced.

=item *

Fixed the issue that caused C<< make distclean >> to incorrectly leave some
files behind.
L<[perl #122820]|https://rt.perl.org/Ticket/Display.html?id=122820>.

=item *

AIX now sets the length in C<< getsockopt >> correctly.
L<[perl #120835]|https://rt.perl.org/Ticket/Display.html?id=120835>.
L<[cpan #91183]|https://rt.cpan.org/Ticket/Display.html?id=91183>.
L<[cpan #85570]|https://rt.cpan.org/Ticket/Display.html?id=85570>.

=item *

The optimization phase of a regexp compilation could run "forever" and
exhaust all memory under certain circumstances; now fixed.
L<[perl #122283]|https://rt.perl.org/Ticket/Display.html?id=122283>.

=item *

The test script F<< t/op/crypt.t >> now uses the SHA-256 algorithm if the
default one is disabled, rather than giving failures.
L<[perl #121591]|https://rt.perl.org/Ticket/Display.html?id=121591>.

=item *

Fixed an off-by-one error when setting the size of a shared array.
L<[perl #122950]|https://rt.perl.org/Ticket/Display.html?id=122950>.

=item *

Fixed a bug that could cause perl to enter an infinite loop during
compilation. In particular, a C<while(1)> within a sublist, I<e.g.>

    sub foo { () = ($a, my $b, ($c, do { while(1) {} })) }

The bug was introduced in 5.20.0.
L<[perl #122995]|https://rt.perl.org/Ticket/Display.html?id=122995>.

=item *

On Win32, if a variable was C<local>-ized in a pseudo-process that later
forked, restoring the original value in the child pseudo-process caused
memory corruption and a crash in the child pseudo-process (and therefore the
OS process).
L<[perl #40565]|https://rt.perl.org/Ticket/Display.html?id=40565>.

=item *

Calling C<write> on a format with a C<^**> field could produce a panic
in C<sv_chop()> if there were insufficient arguments or if the variable
used to fill the field was empty.
L<[perl #123245]|https://rt.perl.org/Ticket/Display.html?id=123245>.

=item *

Non-ASCII lexical sub names now appear without trailing junk when they
appear in error messages.

=item *

The C<\@> subroutine prototype no longer flattens parenthesized arrays
(taking a reference to each element), but takes a reference to the array
itself.
L<[perl #47363]|https://rt.perl.org/Ticket/Display.html?id=47363>.

=item *

A block containing nothing except a C-style C<for> loop could corrupt the
stack, causing lists outside the block to lose elements or have elements
overwritten.  This could happen with C<map { for(...){...} } ...> and with
lists containing C<do { for(...){...} }>.
L<[perl #123286]|https://rt.perl.org/Ticket/Display.html?id=123286>.

=item *

C<scalar()> now propagates lvalue context, so that
S<C<for(scalar($#foo)) { ... }>> can modify C<$#foo> through C<$_>.

=item *

C<qr/@array(?{block})/> no longer dies with "Bizarre copy of ARRAY".
L<[perl #123344]|https://rt.perl.org/Ticket/Display.html?id=123344>.

=item *

S<C<eval '$variable'>> in nested named subroutines would sometimes look up a
global variable even with a lexical variable in scope.

=item *

In perl 5.20.0, C<sort CORE::fake> where 'fake' is anything other than a
keyword, started chopping off the last 6 characters and treating the result
as a sort sub name.  The previous behavior of treating C<CORE::fake> as a
sort sub name has been restored.
L<[perl #123410]|https://rt.perl.org/Ticket/Display.html?id=123410>.

=item *

Outside of C<use utf8>, a single-character Latin-1 lexical variable is
disallowed.  The error message for it, "Can't use global C<$foo>...", was
giving garbage instead of the variable name.

=item *

C<readline> on a nonexistent handle was causing C<${^LAST_FH}> to produce a
reference to an undefined scalar (or fail an assertion).  Now
C<${^LAST_FH}> ends up undefined.

=item *

C<(...) x ...> in void context now applies scalar context to the left-hand
argument, instead of the context the current sub was called in.
L<[perl #123020]|https://rt.perl.org/Ticket/Display.html?id=123020>.

=back

=head1 Known Problems

=over 4

=item *

C<pack>-ing a NaN on a perl compiled with Visual C 6 does not behave properly,
leading to a test failure in F<t/op/infnan.t>.
L<[perl #125203]|https://rt.perl.org/Ticket/Display.html?id=125203>

=item *

A goal is for Perl to be able to be recompiled to work reasonably well on any
Unicode version.  In Perl 5.22, though, the earliest such version is Unicode
5.1 (current is 7.0).

=item *

EBCDIC platforms

=over 4

=item *

The C<cmp> (and hence C<sort>) operators do not necessarily give the
correct results when both operands are UTF-EBCDIC encoded strings and
there is a mixture of ASCII and/or control characters, along with other
characters.

=item *

Ranges containing C<\N{...}> in the C<tr///> (and C<y///>)
transliteration operators are treated differently than the equivalent
ranges in regular expression patterns.  They should, but don't, cause
the values in the ranges to all be treated as Unicode code points, and
not native ones.  (L<perlre/Version 8 Regular Expressions> gives
details as to how it should work.)

=item *

Encode and encoding are mostly broken.

=item *

Many CPAN modules that are shipped with core show failing tests.

=item *

C<pack>/C<unpack> with C<"U0"> format may not work properly.

=back

=item *

The following modules are known to have test failures with this version of
Perl.  In many cases, patches have been submitted, so there will hopefully be
new releases soon:

=over

=item *

L<B::Generate> version 1.50

=item *

L<B::Utils> version 0.25

=item *

L<Coro> version 6.42

=item *

L<Dancer> version 1.3130

=item *

L<Data::Alias> version 1.18

=item *

L<Data::Dump::Streamer> version 2.38

=item *

L<Data::Util> version 0.63

=item *

L<Devel::Spy> version 0.07

=item *

L<invoker> version 0.34

=item *

L<Lexical::Var> version 0.009

=item *

L<LWP::ConsoleLogger> version 0.000018

=item *

L<Mason> version 2.22

=item *

L<NgxQueue> version 0.02

=item *

L<Padre> version 1.00

=item *

L<Parse::Keyword> 0.08

=back

=back

=head1 Obituary

Brian McCauley died on May 8, 2015.  He was a frequent poster to Usenet, Perl
Monks, and other Perl forums, and made several CPAN contributions under the
nick NOBULL, including to the Perl FAQ.  He attended almost every
YAPC::Europe, and indeed, helped organise YAPC::Europe 2006 and the QA
Hackathon 2009.  His wit and his delight in intricate systems were
particularly apparent in his love of board games; many Perl mongers will
have fond memories of playing Fluxx and other games with Brian.  He will be
missed.

=head1 Acknowledgements

Perl 5.22.0 represents approximately 12 months of development since Perl 5.20.0
and contains approximately 590,000 lines of changes across 2,400 files from 94
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 370,000 lines of changes to 1,500 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers. The following people are known to have contributed the
improvements that became Perl 5.22.0:

Aaron Crane, Abhijit Menon-Sen, Abigail, Alberto Simões, Alex Solovey, Alex
Vandiver, Alexandr Ciornii, Alexandre (Midnite) Jousset, Andreas König,
Andreas Voegele, Andrew Fresh, Andy Dougherty, Anthony Heading, Aristotle
Pagaltzis, brian d foy, Brian Fraser, Chad Granum, Chris 'BinGOs' Williams,
Craig A. Berry, Dagfinn Ilmari Mannsåker, Daniel Dragan, Darin McBride, Dave
Rolsky, David Golden, David Mitchell, David Wheeler, Dmitri Tikhonov, Doug
Bell, E. Choroba, Ed J, Eric Herman, Father Chrysostomos, George Greer, Glenn
D. Golden, Graham Knop, H.Merijn Brand, Herbert Breunung, Hugo van der Sanden,
James E Keenan, James McCoy, James Raspass, Jan Dubois, Jarkko Hietaniemi,
Jasmine Ngan, Jerry D. Hedden, Jim Cromie, John Goodyear, kafka, Karen
Etheridge, Karl Williamson, Kent Fredric, kmx, Lajos Veres, Leon Timmermans,
Lukas Mai, Mathieu Arnold, Matthew Horsfall, Max Maischein, Michael Bunk,
Nicholas Clark, Niels Thykier, Niko Tyni, Norman Koch, Olivier Mengué, Peter
John Acklam, Peter Martini, Petr Písař, Philippe Bruhat (BooK), Pierre
Bogossian, Rafael Garcia-Suarez, Randy Stauner, Reini Urban, Ricardo Signes,
Rob Hoelz, Rostislav Skudnov, Sawyer X, Shirakata Kentaro, Shlomi Fish,
Sisyphus, Slaven Rezic, Smylers, Steffen Müller, Steve Hay, Sullivan Beck,
syber, Tadeusz Sośnierz, Thomas Sibley, Todd Rinaldo, Tony Cook, Vincent Pit,
Vladimir Marek, Yaroslav Kuzmin, Yves Orton, Ævar Arnfjörð Bjarmason.

The list above is almost certainly incomplete as it is automatically generated
from version control history. In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core. We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
L<https://rt.perl.org/>.  There may also be information at
L<http://www.perl.org/>, the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlandroid.pod000064400000017277150344123470007574 0ustar00If you read this file _as_is_, just ignore the funny characters you
see. It is written in the POD format (see pod/perlpod.pod) which is
specially designed to be readable as is.

=head1 NAME

perlandroid - Perl under Android

=head1 SYNOPSIS

The first portion of this document contains instructions
to cross-compile Perl for Android 2.0 and later, using the
binaries provided by Google.  The latter portion describes how to build
perl native using one of the toolchains available on the Play Store.

=head1 DESCRIPTION

This document describes how to set up your host environment when
attempting to build Perl for Android.

=head1 Cross-compilation

These instructions assume a Unixish build environment on your host system;
they've been tested on Linux and OS X, and may work on Cygwin and MSYS.
While Google also provides an NDK for Windows, these steps won't work
native there, although it may be possible to cross-compile through different
means.

If your host system's architecture is 32 bits, remember to change the
C<x86_64>'s below to C<x86>'s.  In a similar vein, the examples below
use the 4.8 toolchain; if you want to use something older or newer (for
example, the 4.4.3 toolchain included in the 8th revision of the NDK), just
change those to the relevant version.

=head2 Get the Android Native Development Kit (NDK)

You can download the NDK from L<https://developer.android.com/tools/sdk/ndk/index.html>.
You'll want the normal, non-legacy version.

=head2 Determine the architecture you'll be cross-compiling for

There are three possible options: arm-linux-androideabi for ARM,
mipsel-linux-android for MIPS, and simply x86 for x86.
As of 2014, most Android devices run on ARM, so that is generally a safe bet.

With those two in hand, you should add

  $ANDROID_NDK/toolchains/$TARGETARCH-4.8/prebuilt/`uname | tr '[A-Z]' '[a-z]'`-x86_64/bin

to your C<PATH>, where C<$ANDROID_NDK> is the location where you unpacked the
NDK, and C<$TARGETARCH> is your target's architecture.
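
As a rough sketch of the path described above, the directory can be
assembled by a small shell function.  The name C<ndk_toolchain_bin> is
made up for this illustration and is not part of the NDK; the default
of 4.8 simply mirrors the toolchain version used in these examples.

```shell
# Sketch only: ndk_toolchain_bin is a hypothetical helper, not an NDK tool.
# Usage: ndk_toolchain_bin NDK_DIR TARGETARCH [GCC_VERSION]
ndk_toolchain_bin() {
    # Lower-cased kernel name, e.g. "linux" or "darwin".
    host_os=$(uname | tr '[A-Z]' '[a-z]')
    printf '%s/toolchains/%s-%s/prebuilt/%s-x86_64/bin\n' \
        "$1" "$2" "${3:-4.8}" "$host_os"
}

# Example (the variables are the ones used in this document):
#   PATH="$(ndk_toolchain_bin "$ANDROID_NDK" "$TARGETARCH"):$PATH"
#   export PATH
```

On a 32-bit host the C<x86_64> in the format string would need the same
change to C<x86> as everywhere else.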

=head2 Set up a standalone toolchain

This creates a working sysroot that we can feed to Configure later.

    $ export ANDROID_TOOLCHAIN=/tmp/my-toolchain-$TARGETARCH
    $ export SYSROOT=$ANDROID_TOOLCHAIN/sysroot
    $ $ANDROID_NDK/build/tools/make-standalone-toolchain.sh \
            --platform=android-9 \
            --install-dir=$ANDROID_TOOLCHAIN \
            --system=`uname | tr '[A-Z]' '[a-z]'`-x86_64 \
            --toolchain=$TARGETARCH-4.8

=head2 adb or ssh?

adb is the Android Debug Bridge.  For our purposes, it's basically a way
of establishing an ssh connection to an Android device without having to
install anything on the device itself, as long as the device is either on
the same local network as the host, or it is connected to the host through
USB.

Perl can be cross-compiled using either adb or a normal ssh connection;
in general, if you can connect your device to the host using a USB port,
or if you don't feel like installing an sshd app on your device,
you may want to use adb, although you may be forced to switch to ssh if
your device is not rooted and you're unlucky -- more on that later.
Alternatively, if you're cross-compiling to an emulator, you'll have to
use adb.

=head3 adb

To use adb, download the Android SDK from L<https://developer.android.com/sdk/index.html>.
The "SDK Tools Only" version should suffice -- if you downloaded the ADT
Bundle, you can find the sdk under F<$ADT_BUNDLE/sdk/>.

Add F<$ANDROID_SDK/platform-tools> to your C<PATH>, which should give you access
to adb.  You'll now have to find your device's name using C<adb devices>,
and later pass that to Configure through C<-Dtargethost=$DEVICE>.

However, before calling Configure, you need to check if using adb is a
viable choice in the first place.  Because Android doesn't have a F</tmp>,
nor does it allow executables in the sdcard, we need to find somewhere in
the device for Configure to put some files in, as well as for the tests
to run in. If your device is rooted, then you're good.  Try running these:

    $ export TARGETDIR=/mnt/asec/perl
    $ adb -s $DEVICE shell "echo sh -c '\"mkdir $TARGETDIR\"' | su --"

This will create the directory we need, and you can move on to the next
step.  F</mnt/asec> is mounted as a tmpfs in Android, but it's only
accessible to root.

If your device is not rooted, you may still be in luck. Try running this:

    $ export TARGETDIR=/data/local/tmp/perl
    $ adb -s $DEVICE shell "mkdir $TARGETDIR"

If the command works, you can move to the next step, but beware:
B<You'll have to remove the directory from the device once you are done!
Unlike F</mnt/asec>, F</data/local/tmp> may not get automatically garbage
collected once you shut off the phone>.

If neither of those works, then you can't use adb to cross-compile to your
device.  Either try rooting it, or go for the ssh route.
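
For the non-rooted case, the cleanup of F</data/local/tmp> mentioned
above could be scripted along these lines.  This is only a sketch:
C<cleanup_targetdir> is a hypothetical helper, and it assumes the
device's C<rm> accepts C<-r>.

```shell
# Sketch only: cleanup_targetdir is a hypothetical helper.
# Usage: cleanup_targetdir DEVICE TARGETDIR
cleanup_targetdir() {
    case "$2" in
        *..*)
            # Do not follow ".." out of the temporary area.
            echo "refusing to remove '$2'" >&2
            return 1
            ;;
        /data/local/tmp/?*)
            # Only touch directories strictly below /data/local/tmp.
            adb -s "$1" shell "rm -r $2"
            ;;
        *)
            echo "refusing to remove '$2'" >&2
            return 1
            ;;
    esac
}

# Example: cleanup_targetdir "$DEVICE" "$TARGETDIR"
```

The guard is there because a mistyped C<$TARGETDIR> would otherwise be
removed without any further question.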

=head3 ssh

To use ssh, you'll need to install and run an sshd app and set it up
properly.  There are several paid and free apps that do this rather
easily, so you should be able to spot one on the store.
Remember that Perl requires a passwordless connection, so set up a
public key.
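
One way to check that the connection really is passwordless before
running Configure is sketched here.  C<check_passwordless> is a
hypothetical helper; C<BatchMode> is a standard OpenSSH client option,
and the sketch assumes an OpenSSH-compatible C<ssh> on the host.

```shell
# Sketch only: check_passwordless is a hypothetical helper.
# Usage: check_passwordless TARGETHOST TARGETUSER TARGETPORT
check_passwordless() {
    # BatchMode=yes makes ssh fail instead of prompting for a password;
    # stderr is discarded because some sshd apps print noise on connect.
    ssh -o BatchMode=yes -p "$3" "$2@$1" 'exit 0' 2>/dev/null
}

# Example:
#   check_passwordless "$TARGETHOST" "$TARGETUSER" "$TARGETPORT" ||
#       echo "key-based login is not working yet" >&2
```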

Note that several apps spew crap to stderr every time you
connect, which can throw off Configure.  You may need to monkeypatch
the part of Configure that creates C<run-ssh> to have it discard stderr.

Since you're using ssh, you'll have to pass some extra arguments to
Configure:

  -Dtargetrun=ssh -Dtargethost=$TARGETHOST -Dtargetuser=$TARGETUSER -Dtargetport=$TARGETPORT

=head2 Configure and beyond

With all of the previous done, you're now ready to call Configure.

If using adb, a "basic" Configure line will look like this:

  $ ./Configure -des -Dusedevel -Dusecrosscompile -Dtargetrun=adb \
      -Dcc=$TARGETARCH-gcc   \
      -Dsysroot=$SYSROOT     \
      -Dtargetdir=$TARGETDIR \
      -Dtargethost=$DEVICE

If using ssh, it's not too different -- we just change targetrun to ssh,
and pass in targetuser and targetport.  It ends up looking like this:

  $ ./Configure -des -Dusedevel -Dusecrosscompile -Dtargetrun=ssh \
      -Dcc=$TARGETARCH-gcc        \
      -Dsysroot=$SYSROOT          \
      -Dtargetdir=$TARGETDIR      \
      -Dtargethost="$TARGETHOST"  \
      -Dtargetuser=$TARGETUSER    \
      -Dtargetport=$TARGETPORT

Now you're ready to run C<make> and C<make test>!

As a final word of warning, if you're using adb, C<make test> may appear to
hang; this is because it doesn't output anything until it finishes
running all tests.  You can check its progress by logging into the
device, moving to F<$TARGETDIR>, and looking at the file F<output.stdout>.

=head3 Notes

=over

=item *

If you are targeting x86 Android, you will have to change C<$TARGETARCH-gcc>
to C<i686-linux-android-gcc>.

=item *

On some older low-end devices -- think early 2.2 era -- some tests,
particularly F<t/re/uniprops.t>, may crash the phone, causing it to turn
itself off once, and then back on again.

=back
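
The x86 special case from the notes above can be folded into a small
helper so that the same Configure line works for every target.  This
is a sketch only; C<cross_cc> is a made-up name, not something
Configure provides.

```shell
# Sketch only: cross_cc is a hypothetical helper.
# Prints the compiler name to pass as -Dcc= for a given $TARGETARCH.
cross_cc() {
    case "$1" in
        x86) echo "i686-linux-android-gcc" ;;   # special case from the notes
        *)   echo "$1-gcc" ;;
    esac
}

# Example: ./Configure ... -Dcc="$(cross_cc "$TARGETARCH")" ...
```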

=head1 Native Builds

While Google doesn't provide a native toolchain for Android,
you can still get one from the Play Store; for example, there's the CCTools
app which you can get for free.
Keep in mind that you want a full
toolchain; some apps tend to default to installing only a barebones
version without some important utilities, like ar or nm.
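
A quick way to spot a barebones install is to look for the utilities
by name.  The following is a sketch only: C<missing_tools> is a
hypothetical helper, and it assumes the device's shell supports
C<command -v> and that the toolchain's F<bin> directory is already on
C<PATH>.

```shell
# Sketch only: missing_tools is a hypothetical helper.
# Prints each named tool that is not found on PATH; fails if any is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return $status
}

# Example: missing_tools ar nm
```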

Once you have the toolchain set up properly, the only
remaining hurdle is locating where on the device it was
installed.  For example, CCTools installs its toolchain in
F</data/data/com.pdaxrom.cctools/root/cctools>.  With the path in hand,
compiling perl is little more than:

 export SYSROOT=<location of the native toolchain>
 export LD_LIBRARY_PATH="$SYSROOT/lib:`pwd`:`pwd`/lib:`pwd`/lib/auto:$LD_LIBRARY_PATH"
 sh Configure -des -Dsysroot=$SYSROOT -Alibpth="/system/lib /vendor/lib"

=head1 AUTHOR

Brian Fraser <fraserbn@gmail.com>

=cut
If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perlaix - Perl version 5 on IBM AIX (UNIX) systems

=head1 DESCRIPTION

This document describes various features of IBM's UNIX operating
system AIX that will affect how Perl version 5 (hereafter just Perl)
is compiled and/or runs.

=head2 Compiling Perl 5 on AIX

For information on compilers on older versions of AIX, see L</Compiling
Perl 5 on older AIX versions up to 4.3.3>.

When compiling Perl, you must use an ANSI C compiler. AIX does not ship
an ANSI compliant C compiler with AIX by default, but binary builds of
gcc for AIX are widely available. A version of gcc is also included in
the AIX Toolbox which is shipped with AIX.

=head2 Supported Compilers

Currently all versions of IBM's "xlc", "xlc_r", "cc", "cc_r" or
"vac" ANSI/C compiler will work for building Perl if that compiler
works on your system.

If you plan to link Perl to any module that requires thread-support,
like DBD::Oracle, it is better to use the _r version of the compiler.
This will not build a threaded Perl, but a thread-enabled Perl. See
also L</Threaded Perl> later on.

As of writing (2010-09) only the I<IBM XL C for AIX> or I<IBM XL C/C++
for AIX> compiler is supported by IBM on AIX 5L/6.1/7.1.

The following compiler versions are currently supported by IBM:

    IBM XL C and IBM XL C/C++ V8, V9, V10, V11

The XL C for AIX is integrated in the XL C/C++ for AIX compiler and
therefore also supported.

If you choose XL C/C++ V9 you need APAR IZ35785 installed
otherwise the integrated SDBM_File does not compile correctly due
to an optimization bug. You can circumvent this problem by
adding -qipa to the optimization flags (-Doptimize='-O -qipa').
The PTF for APAR IZ35785 which solves this problem is available
from IBM (April 2009 PTF for XL C/C++ Enterprise Edition for AIX, V9.0).

If you choose XL C/C++ V11 you need the April 2010 PTF (or newer)
installed otherwise you will not get a working Perl version.

Perl can be compiled with either IBM's ANSI C compiler or with gcc.
The former is recommended, as not only can it compile Perl with no
difficulty, but also can take advantage of features listed later
that require the use of IBM compiler-specific command-line flags.

If you decide to use gcc, make sure your installation is recent and
complete, and be sure to read the Perl INSTALL file for more gcc-specific
details. Please report any hoops you had to jump through to the
development team.

=head2 Incompatibility with AIX Toolbox lib gdbm

If the AIX Toolbox version of lib gdbm < 1.8.3-5 is installed on your
system then Perl will not work. This library contains the header files
/opt/freeware/include/gdbm/dbm.h|ndbm.h which conflict with the AIX
system versions. The lib gdbm will be automatically removed from the
wanted libraries if the presence of one of these two header files is
detected. If you want to build Perl with GDBM support then please install
at least gdbm-devel-1.8.3-5 (or higher).
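
To see in advance whether the conflicting headers are present, a check
such as the following can be used.  It is a sketch only:
C<gdbm_header_conflict> is a hypothetical helper that looks for the two
files named above and does not reproduce Configure's own detection.

```shell
# Sketch only: gdbm_header_conflict is a hypothetical helper.
# Usage: gdbm_header_conflict [INCLUDE_DIR]
# Succeeds if either of the conflicting header files exists.
gdbm_header_conflict() {
    dir=${1:-/opt/freeware/include/gdbm}
    [ -f "$dir/dbm.h" ] || [ -f "$dir/ndbm.h" ]
}

# Example:
#   gdbm_header_conflict && echo "old AIX Toolbox gdbm headers found" >&2
```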

=head2 Perl 5 was successfully compiled and tested on:

 Perl   | AIX Level           | Compiler Level          | w th | w/o th
 -------+---------------------+-------------------------+------+-------
 5.12.2 |5.1 TL9 32 bit       | XL C/C++ V7             | OK   | OK
 5.12.2 |5.1 TL9 64 bit       | XL C/C++ V7             | OK   | OK
 5.12.2 |5.2 TL10 SP8 32 bit  | XL C/C++ V8             | OK   | OK
 5.12.2 |5.2 TL10 SP8 32 bit  | gcc 3.2.2               | OK   | OK
 5.12.2 |5.2 TL10 SP8 64 bit  | XL C/C++ V8             | OK   | OK
 5.12.2 |5.3 TL8 SP8 32 bit   | XL C/C++ V9 + IZ35785   | OK   | OK
 5.12.2 |5.3 TL8 SP8 32 bit   | gcc 4.2.4               | OK   | OK
 5.12.2 |5.3 TL8 SP8 64 bit   | XL C/C++ V9 + IZ35785   | OK   | OK
 5.12.2 |5.3 TL10 SP3 32 bit  | XL C/C++ V11 + Apr 2010 | OK   | OK
 5.12.2 |5.3 TL10 SP3 64 bit  | XL C/C++ V11 + Apr 2010 | OK   | OK
 5.12.2 |6.1 TL1 SP7 32 bit   | XL C/C++ V10            | OK   | OK
 5.12.2 |6.1 TL1 SP7 64 bit   | XL C/C++ V10            | OK   | OK
 5.13   |7.1 TL0 SP1 32 bit   | XL C/C++ V11 + Jul 2010 | OK   | OK
 5.13   |7.1 TL0 SP1 64 bit   | XL C/C++ V11 + Jul 2010 | OK   | OK

 w th   = with thread support
 w/o th = without thread support
 OK     = tested

Successfully tested means that all "make test" runs finish with a
result of 100% OK. All tests were conducted with -Duseshrplib set.

All tests were conducted on the oldest supported AIX technology level
with the latest support package applied. If the tested AIX version is
out of support (AIX 4.3.3, 5.1, 5.2) then the last available support
level was used.

=head2 Building Dynamic Extensions on AIX

Starting from Perl 5.7.2 (and consequently 5.8.x / 5.10.x / 5.12.x)
and AIX 4.3 or newer Perl uses the AIX native dynamic loading interface
in the so called runtime linking mode instead of the emulated interface
that was used in Perl releases 5.6.1 and earlier or, for AIX releases
4.2 and earlier. This change does break backward compatibility with
compiled modules from earlier Perl releases. The change was made to make
Perl more compliant with other applications like Apache/mod_perl which are
using the AIX native interface. This change also enables the use of
C++ code with static constructors and destructors in Perl extensions,
which was not possible using the emulated interface.

It is highly recommended to use the new interface.

=head2 Using Large Files with Perl

Should yield no problems.

=head2 Threaded Perl

Should yield no problems with AIX 5.1 / 5.2 / 5.3 / 6.1 / 7.1.

IBM uses the AIX system Perl (V5.6.0 on AIX 5.1 and V5.8.2 on
AIX 5.2 / 5.3 and 6.1; V5.8.8 on AIX 5.3 TL11 and AIX 6.1 TL4; V5.10.1
on AIX 7.1) for some AIX system scripts. If you switch the links in
/usr/bin from the AIX system Perl (/usr/opt/perl5) to the newly built
Perl then you get the same features as with the IBM AIX system Perl if
the threaded options are used.

The threaded Perl build works also on AIX 5.1 but the IBM Perl
build (Perl v5.6.0) is not threaded on AIX 5.1.

Perl 5.12 and newer is not compatible with the IBM fileset perl.libext.

=head2 64-bit Perl

If your AIX system is installed with 64-bit support, you can expect 64-bit
configurations to work. If you want to use 64-bit Perl on AIX 6.1
you need an APAR for a libc.a bug which affects (n)dbm_XXX functions.
The APAR number for this problem is IZ39077.

If you need more memory (larger data segment) for your Perl programs you
can set:

    /etc/security/limits
    default:                    (or your user)
        data = -1               (default is 262144 * 512 byte)

With the default setting the size is limited to 128MB.
The -1 removes this limit. If the "make test" fails please change
your /etc/security/limits as stated above.
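
The 128MB figure follows directly from the default of 262144 blocks of
512 bytes each.  The arithmetic, written out as a small shell sketch
(the variable names are only illustrative):

```shell
# Default data limit in /etc/security/limits: 262144 blocks of 512 bytes.
blocks=262144
block_size=512
limit_mb=$((blocks * block_size / 1024 / 1024))
echo "default data segment limit: ${limit_mb} MB"   # prints: default data segment limit: 128 MB
```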

=head2 Long doubles

IBM calls its implementation of long doubles 128-bit, but it is not
the IEEE 128-bit ("quadruple precision") which would give 113 bits of
mantissa (nor is it implemented in hardware); instead it's a special
software implementation called "double-double", which gives 106 bits
of mantissa.

There seem to be various problems in this long double implementation.
If Configure detects this brokenness, it will disable the long double support.
This can be overridden with an explicit C<-Duselongdouble> (or C<-Dusemorebits>,
which enables both long doubles and 64 bit integers).  If you decide to
enable long doubles, for most of the broken things Perl has implemented
workarounds, but the handling of the special values infinity and NaN
remains badly broken: for example infinity plus zero results in NaN.

=head2 Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (threaded/32-bit)

With the following options you get a threaded Perl version which
passes all make tests in threaded 32-bit mode, which is the default
configuration for the Perl builds that AIX ships with.

    rm config.sh
    ./Configure \
    -d \
    -Dcc=cc_r \
    -Duseshrplib \
    -Dusethreads \
    -Dprefix=/usr/opt/perl5_32

The -Dprefix option will install Perl in a directory parallel to the 
IBM AIX system Perl installation.

=head2 Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (32-bit)

With the following options you get a Perl version which passes 
all make tests in 32-bit mode.

    rm config.sh
    ./Configure \
    -d \
    -Dcc=cc_r \
    -Duseshrplib \
    -Dprefix=/usr/opt/perl5_32

The -Dprefix option will install Perl in a directory parallel to the
IBM AIX system Perl installation.

=head2 Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (threaded/64-bit)

With the following options you get a threaded Perl version which
passes all make tests in 64-bit mode.

 export OBJECT_MODE=64 / setenv OBJECT_MODE 64 (depending on your shell)

 rm config.sh
 ./Configure \
 -d \
 -Dcc=cc_r \
 -Duseshrplib \
 -Dusethreads \
 -Duse64bitall \
 -Dprefix=/usr/opt/perl5_64

=head2 Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (64-bit)

With the following options you get a Perl version which passes all
make tests in 64-bit mode. 

 export OBJECT_MODE=64 / setenv OBJECT_MODE 64 (depending on your shell)

 rm config.sh
 ./Configure \
 -d \
 -Dcc=cc_r \
 -Duseshrplib \
 -Duse64bitall \
 -Dprefix=/usr/opt/perl5_64

The -Dprefix option will install Perl in a directory parallel to the
IBM AIX system Perl installation.

If you choose gcc to compile 64-bit Perl then you need to add the
following option:

    -Dcc='gcc -maix64'
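
After the build, it can be worth confirming that the resulting binary
really was configured with C<-Duse64bitall>.  The sketch below relies
on C<perl -V:name>, which prints a configuration variable in the form
C<name='value';>.  The helper name C<is_64bitall_perl> is made up for
this example.

```shell
# Sketch only: is_64bitall_perl is a hypothetical helper.
# Usage: is_64bitall_perl /path/to/perl
is_64bitall_perl() {
    # perl -V:use64bitall prints e.g.  use64bitall='define';
    [ "$("$1" -V:use64bitall)" = "use64bitall='define';" ]
}

# Example:
#   is_64bitall_perl /usr/opt/perl5_64/bin/perl || echo "not a 64-bit build" >&2
```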


=head2 Compiling Perl 5 on AIX 7.1.0

A regression in AIX 7 causes a failure in make test in Time::Piece during
daylight savings time.  APAR IV16514 provides the fix for this.  A quick
test to see if it's required, assuming it is currently daylight savings
in Eastern Time, would be to run C< TZ=EST5 date +%Z >.  This will come
back with C<EST> normally, but nothing if you have the problem.
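
The quick test above can be wrapped in a function for use in a build
script.  This is a sketch only: C<needs_apar_iv16514> is a hypothetical
name, and, as stated above, the result is only meaningful while
daylight saving time is in effect in Eastern Time.

```shell
# Sketch only: needs_apar_iv16514 is a hypothetical helper.
# Succeeds if the zone name comes back empty, i.e. the problem is present.
needs_apar_iv16514() {
    [ -z "$(TZ=EST5 date +%Z)" ]
}

# Example:
#   needs_apar_iv16514 && echo "Time::Piece tests may fail; see APAR IV16514" >&2
```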


=head2 Compiling Perl 5 on older AIX versions up to 4.3.3

Due to the fact that AIX 4.3.3 reached end-of-service on December 31,
2003 this information is provided as is. The Perl versions prior to
Perl 5.8.9 could be compiled on AIX up to 4.3.3 with the following
settings (your mileage may vary):

When compiling Perl, you must use an ANSI C compiler. AIX does not ship
an ANSI compliant C-compiler with AIX by default, but binary builds of
gcc for AIX are widely available.

At the moment of writing, AIX supports two different native C compilers,
for which you have to pay: B<xlC> and B<vac>. If you decide to use either
of these two (which is quite a lot easier than using gcc), be sure to
upgrade to the latest available patch level. Currently:

    xlC.C     3.1.4.10 or 3.6.6.0 or 4.0.2.2 or 5.0.2.9 or 6.0.0.3
    vac.C     4.4.0.3  or 5.0.2.6 or 6.0.0.1

Note that xlC has the OS version in the name as of version 4.0.2.0, so
you will find xlC.C for AIX-5.0 as package

    xlC.aix50.rte   5.0.2.0 or 6.0.0.3

Subversions are not the same "latest" on all OS versions. For example,
the latest xlC-5 on aix41 is 5.0.2.9, while on aix43, it is 5.0.2.7.

Perl can be compiled with either IBM's ANSI C compiler or with gcc.
The former is recommended, as not only can it compile Perl with no
difficulty, but also can take advantage of features listed later that
require the use of IBM compiler-specific command-line flags.

IBM's compiler patch levels 5.0.0.0 and 5.0.1.0 have compiler
optimization bugs that affect compiling perl.c and regcomp.c,
respectively.  If Perl's configuration detects those compiler patch
levels, optimization is turned off for the said source code files.
Upgrading to at least 5.0.2.0 is recommended.

If you decide to use gcc, make sure your installation is recent and
complete, and be sure to read the Perl INSTALL file for more gcc-specific
details. Please report any hoops you had to jump through to the development
team.

=head2 OS level

Before installing the patches to the IBM C-compiler you need to know the
level of patching for the Operating System. IBM's command 'oslevel' will
show the base, but is not always complete (in this example oslevel shows
4.3.NULL, whereas the system might run most of 4.3.THREE):

    # oslevel
    4.3.0.0
    # lslpp -l | grep 'bos.rte '
    bos.rte           4.3.3.75  COMMITTED  Base Operating System Runtime
    bos.rte            4.3.2.0  COMMITTED  Base Operating System Runtime
    #

The same might happen to AIX 5.1 or other OS levels. As a side note, Perl
cannot be built without bos.adt.syscalls and bos.adt.libm installed:

    # lslpp -l | egrep "syscalls|libm"
    bos.adt.libm      5.1.0.25  COMMITTED  Base Application Development
    bos.adt.syscalls  5.1.0.36  COMMITTED  System Calls Application
    #
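
Since C<oslevel> may under-report, the highest installed C<bos.rte>
level can be picked out of the C<lslpp -l> output shown earlier.  The
following is a sketch only: C<highest_bos_rte> is a hypothetical helper
and assumes the column layout of that example output.

```shell
# Sketch only: highest_bos_rte is a hypothetical helper.
# Reads `lslpp -l` output on stdin and prints the highest bos.rte level.
highest_bos_rte() {
    awk '$1 == "bos.rte" { print $2 }' |
        sort -t . -k 1,1n -k 2,2n -k 3,3n -k 4,4n |
        tail -n 1
}

# Example on AIX: lslpp -l | highest_bos_rte
```

Fed the two C<bos.rte> lines from the earlier example, this prints
C<4.3.3.75>, because the four dot-separated fields are compared
numerically one after another.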

=head2 Building Dynamic Extensions on AIX E<lt> 5L

AIX supports dynamically loadable objects as well as shared libraries.
Shared libraries by convention end with the suffix .a, which is a bit
misleading, as an archive can contain static as well as dynamic members.
For Perl dynamically loaded objects we use the .so suffix also used on
many other platforms.

Note that starting from Perl 5.7.2 (and consequently 5.8.0) and AIX 4.3
or newer, Perl uses the AIX native dynamic loading interface in the
so-called runtime linking mode instead of the emulated interface that was
used in Perl releases 5.6.1 and earlier or on AIX releases 4.2 and
earlier.  This change does break backward compatibility with compiled
modules from earlier Perl releases.  The change was made to make Perl
more compliant with other applications like Apache/mod_perl which are
using the AIX native interface. This change also enables the use of C++
code with static constructors and destructors in Perl extensions, which
was not possible using the emulated interface.

=head2 The IBM ANSI C Compiler

All defaults for Configure can be used.

If you've chosen to use vac 4, be sure to run 4.4.0.3. Older versions
will cause nasty problems later on. For vac 5 be sure to run at least 5.0.1.0,
but vac 5.0.2.6 or up is highly recommended. Note that since IBM has
removed vac 5.0.2.1 through 5.0.2.5 from the software depot, these
versions should be considered obsolete.

Here's a brief guide on how to upgrade the compiler to the latest
level.  Of course this is subject to change.  You can only upgrade
versions from ftp-available updates if the first three digit groups
are the same (in which case you can skip intermediate versions, unlike
the patches in the developer snapshots of Perl), or to one version up
where the "base" is available.  In other words, the AIX compiler
patches are cumulative.

 vac.C.4.4.0.1 => vac.C.4.4.0.3  is OK     (vac.C.4.4.0.2 not needed)
 xlC.C.3.1.3.3 => xlC.C.3.1.4.10 is NOT OK (xlC.C.3.1.4.0 is not
                                                              available)

 # ftp ftp.software.ibm.com
 Connected to service.boulder.ibm.com.
 : welcome message ...
 Name (ftp.software.ibm.com:merijn): anonymous
 331 Guest login ok, send your complete e-mail address as password.
 Password:
 ... accepted login stuff
 ftp> cd /aix/fixes/v4/
 ftp> dir other other.ll
 output to local-file: other.ll? y
 200 PORT command successful.
 150 Opening ASCII mode data connection for /bin/ls.
 226 Transfer complete.
 ftp> dir xlc xlc.ll
 output to local-file: xlc.ll? y
 200 PORT command successful.
 150 Opening ASCII mode data connection for /bin/ls.
 226 Transfer complete.
 ftp> bye
 ... goodbye messages
 # ls -l *.ll
 -rw-rw-rw-   1 merijn   system    1169432 Nov  2 17:29 other.ll
 -rw-rw-rw-   1 merijn   system      29170 Nov  2 17:29 xlc.ll

On AIX 4.2 using xlC, we continue:

 # lslpp -l | fgrep 'xlC.C '
   xlC.C                     3.1.4.9  COMMITTED  C for AIX Compiler
   xlC.C                     3.1.4.0  COMMITTED  C for AIX Compiler
 # grep 'xlC.C.3.1.4.*.bff' xlc.ll
 -rw-r--r--   1 45776101 1       6286336 Jul 22 1996  xlC.C.3.1.4.1.bff
 -rw-rw-r--   1 45776101 1       6173696 Aug 24 1998  xlC.C.3.1.4.10.bff
 -rw-r--r--   1 45776101 1       6319104 Aug 14 1996  xlC.C.3.1.4.2.bff
 -rw-r--r--   1 45776101 1       6316032 Oct 21 1996  xlC.C.3.1.4.3.bff
 -rw-r--r--   1 45776101 1       6315008 Dec 20 1996  xlC.C.3.1.4.4.bff
 -rw-rw-r--   1 45776101 1       6178816 Mar 28 1997  xlC.C.3.1.4.5.bff
 -rw-rw-r--   1 45776101 1       6188032 May 22 1997  xlC.C.3.1.4.6.bff
 -rw-rw-r--   1 45776101 1       6191104 Sep  5 1997  xlC.C.3.1.4.7.bff
 -rw-rw-r--   1 45776101 1       6185984 Jan 13 1998  xlC.C.3.1.4.8.bff
 -rw-rw-r--   1 45776101 1       6169600 May 27 1998  xlC.C.3.1.4.9.bff
 # wget ftp://ftp.software.ibm.com/aix/fixes/v4/xlc/xlC.C.3.1.4.10.bff
 #

On AIX 4.3 using vac, we continue:

 # lslpp -l | grep 'vac.C '
  vac.C                      5.0.2.2  COMMITTED  C for AIX Compiler
  vac.C                      5.0.2.0  COMMITTED  C for AIX Compiler
 # grep 'vac.C.5.0.2.*.bff' other.ll
 -rw-rw-r--   1 45776101 1       13592576 Apr 16 2001  vac.C.5.0.2.0.bff
 -rw-rw-r--   1 45776101 1       14133248 Apr  9 2002  vac.C.5.0.2.3.bff
 -rw-rw-r--   1 45776101 1       14173184 May 20 2002  vac.C.5.0.2.4.bff
 -rw-rw-r--   1 45776101 1       14192640 Nov 22 2002  vac.C.5.0.2.6.bff
 # wget ftp://ftp.software.ibm.com/aix/fixes/v4/other/vac.C.5.0.2.6.bff
 #

Likewise on all other OS levels. Then execute the following command, and
fill in its choices:

 # smit install_update
  -> Install and Update from LATEST Available Software
  * INPUT device / directory for software [ vac.C.5.0.2.6.bff    ]
  [ OK ]
  [ OK ]

Follow the messages ... and you're done.

If you prefer a more web-like approach, a good starting point is
L<http://www14.software.ibm.com/webapp/download/downloadaz.jsp>: click
"C for AIX" and follow the instructions.

=head2 The usenm option

If linking miniperl

 cc -o miniperl ... miniperlmain.o opmini.o perl.o ... -lm -lc ...

causes errors like this

 ld: 0711-317 ERROR: Undefined symbol: .aintl
 ld: 0711-317 ERROR: Undefined symbol: .copysignl
 ld: 0711-317 ERROR: Undefined symbol: .syscall
 ld: 0711-317 ERROR: Undefined symbol: .eaccess
 ld: 0711-317 ERROR: Undefined symbol: .setresuid
 ld: 0711-317 ERROR: Undefined symbol: .setresgid
 ld: 0711-317 ERROR: Undefined symbol: .setproctitle
 ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more
                                                            information.

you could retry with

 make realclean
 rm config.sh
 ./Configure -Dusenm ...

which makes Configure use the C<nm> tool when scanning for library
symbols, which usually is not done in AIX.

Related to this, you probably should not use the C<-r> option of
Configure in AIX, because that affects how the C<nm> tool is used.

=head2 Using GNU's gcc for building Perl

Using gcc-3.x (tested with 3.0.4, 3.1, and 3.2) now works out of the box,
as do recent gcc-2.9 builds available directly from IBM as part of their
Linux compatibility packages, available here:

  http://www.ibm.com/servers/aix/products/aixos/linux/

=head2 Using Large Files with Perl E<lt> 5L

Should yield no problems.

=head2 Threaded Perl E<lt> 5L

Threads seem to work OK, though at the moment not all tests pass when
threads are used in combination with 64-bit configurations.

You may get a warning when doing a threaded build:

  "pp_sys.c", line 4640.39: 1506-280 (W) Function argument assignment 
  between types "unsigned char*" and "const void*" is not allowed.

The exact line number may vary, but if the warning (W) comes from a line
like this

  hent = PerlSock_gethostbyaddr(addr, (Netdb_hlen_t) addrlen, addrtype);

in the "pp_ghostent" function, you may ignore it safely.  The warning
is caused by the reentrant variant of gethostbyaddr() having a slightly
different prototype than its non-reentrant variant, but the difference
is not really significant here.

=head2 64-bit Perl E<lt> 5L

If your AIX is installed with 64-bit support, you can expect 64-bit
configurations to work. In combination with threads some tests might
still fail.

=head2 AIX 4.2 and extensions using C++ with statics

In AIX 4.2 Perl extensions that use C++ functions that use statics
may have problems in that the statics are not getting initialized.
In newer AIX releases this has been solved by linking Perl with
the libC_r library, but unfortunately in AIX 4.2 the said library
has an obscure bug where the various functions related to time
(such as time() and gettimeofday()) return broken values, and
therefore in AIX 4.2 Perl is not linked against the libC_r.

=head1 AUTHORS

Rainer Tammer <tammer@tammer.net>

=cut
perldebug.pod000064400000114532150344123470007232 0ustar00=head1 NAME
X<debug> X<debugger>

perldebug - Perl debugging

=head1 DESCRIPTION

First of all, have you tried using L<C<use strict;>|strict> and
L<C<use warnings;>|warnings>?


If you're new to the Perl debugger, you may prefer to read
L<perldebtut>, which is a tutorial introduction to the debugger.

=head1 The Perl Debugger

If you invoke Perl with the B<-d> switch, your script runs under the
Perl source debugger.  This works like an interactive Perl
environment, prompting for debugger commands that let you examine
source code, set breakpoints, get stack backtraces, change the values of
variables, etc.  This is so convenient that you often fire up
the debugger all by itself just to test out Perl constructs
interactively to see what they do.  For example:
X<-d>

    $ perl -d -e 42

In Perl, the debugger is not a separate program the way it usually is in the
typical compiled environment.  Instead, the B<-d> flag tells the compiler
to insert source information into the parse trees it's about to hand off
to the interpreter.  That means your code must first compile correctly
for the debugger to work on it.  Then when the interpreter starts up, it
preloads a special Perl library file containing the debugger.

The program will halt I<right before> the first run-time executable
statement (but see below regarding compile-time statements) and ask you
to enter a debugger command.  Contrary to popular expectations, whenever
the debugger halts and shows you a line of code, it always displays the
line it's I<about> to execute, rather than the one it has just executed.

Any command not recognized by the debugger is directly executed
(C<eval>'d) as Perl code in the current package.  (The debugger
uses the DB package for keeping its own state information.)

Note that the said C<eval> is bound by an implicit scope. As a
result any newly introduced lexical variable or any modified
capture buffer content is lost after the eval. The debugger is a
nice environment to learn Perl, but if you interactively experiment using
material which should be in the same scope, stuff it in one line.
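
A short sketch of this scoping rule, typed into a session started with
C<perl -d -e 42> (the variable names are arbitrary and output is omitted).
The lexical declared on the first prompt line is no longer visible on the
second, whereas the third line keeps the declaration and its use within
one C<eval>:

    DB> my $x = 5
    DB> p defined $x ? "visible" : "gone"
    DB> my $y = 5; print "$y\n"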

For any text entered at the debugger prompt, leading and trailing whitespace
is first stripped before further processing.  If a debugger command
coincides with some function in your own program, merely precede the
function with something that doesn't look like a debugger command, such
as a leading C<;> or perhaps a C<+>, or by wrapping it with parentheses
or braces.

=head2 Calling the Debugger

There are several ways to call the debugger:

=over 4

=item perl -d program_name

On the given program identified by C<program_name>.

=item perl -d -e 0 

Interactively supply an arbitrary C<expression> using C<-e>.

=item perl -d:ptkdb program_name

Debug a given program via the C<Devel::ptkdb> GUI.

=item perl -dt threaded_program_name

Debug a given program using threads (experimental).

=back

=head2 Debugger Commands

The interactive debugger understands the following commands:

=over 12

=item h
X<debugger command, h>

Prints out a summary help message.

=item h [command]

Prints out a help message for the given debugger command.

=item h h

The special argument of C<h h> produces the entire help page, which is quite long.

If the output of the C<h h> command (or any command, for that matter) scrolls
past your screen, precede the command with a leading pipe symbol so
that it's run through your pager, as in

    DB> |h h

You may change the pager which is used via the C<o pager=...> command.

=item p expr
X<debugger command, p>

Same as C<print {$DB::OUT} expr> in the current package.  In particular,
because this is just Perl's own C<print> function, this means that nested
data structures and objects are not dumped, unlike with the C<x> command.

The C<DB::OUT> filehandle is opened to F</dev/tty>, regardless of
where STDOUT may be redirected to.

=item x [maxdepth] expr
X<debugger command, x>

Evaluates its expression in list context and dumps out the result in a
pretty-printed fashion.  Nested data structures are printed out
recursively, unlike the real C<print> function in Perl.  When dumping
hashes, you'll probably prefer 'x \%h' rather than 'x %h'.
See L<Dumpvalue> if you'd like to do this yourself.

The output format is governed by multiple options described under
L</"Configurable Options">.

If the C<maxdepth> is included, it must be a numeral I<N>; the value is
dumped only I<N> levels deep, as if the C<dumpDepth> option had been
temporarily set to I<N>.

=item V [pkg [vars]]
X<debugger command, V>

Display all (or some) variables in package (defaulting to C<main>)
using a data pretty-printer (hashes show their keys and values so
you see what's what, control characters are made printable, etc.).
Make sure you don't put the type specifier (like C<$>) there, just
the symbol names, like this:

    V DB filename line

Use C<~pattern> and C<!pattern> for positive and negative regexes.

This is similar to calling the C<x> command on each applicable var.

=item X [vars]
X<debugger command, X>

Same as C<V currentpackage [vars]>.

=item y [level [vars]]
X<debugger command, y>

Display all (or some) lexical variables (mnemonic: C<mY> variables)
in the current scope or I<level> scopes higher.  You can limit the
variables that you see with I<vars> which works exactly as it does
for the C<V> and C<X> commands.  Requires the C<PadWalker> module
version 0.08 or higher; will warn if this isn't installed.  Output
is pretty-printed in the same style as for C<V> and the format is
controlled by the same options.

=item T
X<debugger command, T> X<backtrace> X<stack, backtrace>

Produce a stack backtrace.  See below for details on its output.

=item s [expr]
X<debugger command, s> X<step>

Single step.  Executes until the beginning of another
statement, descending into subroutine calls.  If an expression is
supplied that includes function calls, it too will be single-stepped.

=item n [expr]
X<debugger command, n>

Next.  Executes over subroutine calls, until the beginning
of the next statement.  If an expression is supplied that includes
function calls, those functions will be executed with stops before
each statement.

=item r
X<debugger command, r>

Continue until the return from the current subroutine.
Dump the return value if the C<PrintRet> option is set (default).

=item <CR>

Repeat last C<n> or C<s> command.

=item c [line|sub]
X<debugger command, c>

Continue, optionally inserting a one-time-only breakpoint
at the specified line or subroutine.

=item l
X<debugger command, l>

List next window of lines.

=item l min+incr

List C<incr+1> lines starting at C<min>.

=item l min-max

List lines C<min> through C<max>.  C<l -> is synonymous to C<->.

=item l line

List a single line.

=item l subname

List first window of lines from subroutine.  I<subname> may
be a variable that contains a code reference.

=item -
X<debugger command, ->

List previous window of lines.

=item v [line]
X<debugger command, v>

View a few lines of code around the current line.

=item .
X<debugger command, .>

Return the internal debugger pointer to the line last
executed, and print out that line.

=item f filename
X<debugger command, f>

Switch to viewing a different file or C<eval> statement.  If I<filename>
is not a full pathname found in the values of %INC, it is considered
a regex.

C<eval>ed strings (when accessible) are considered to be filenames:
C<f (eval 7)> and C<f eval 7\b> access the body of the 7th C<eval>ed string
(in the order of execution).  The bodies of the currently executed C<eval>
and of C<eval>ed strings that define subroutines are saved and thus
accessible.

=item /pattern/

Search forwards for pattern (a Perl regex); final / is optional.
The search is case-insensitive by default.

=item ?pattern?

Search backwards for pattern; final ? is optional.
The search is case-insensitive by default.

=item L [abw]
X<debugger command, L>

List (default all) actions, breakpoints and watch expressions.

=item S [[!]regex]
X<debugger command, S>

List subroutine names [not] matching the regex.

=item t [n]
X<debugger command, t>

Toggle trace mode (see also the C<AutoTrace> option).
Optional argument is the maximum number of levels to trace below
the current one; anything deeper than that will be silent.

=item t [n] expr
X<debugger command, t>

Trace through execution of C<expr>.
Optional first argument is the maximum number of levels to trace below
the current one; anything deeper than that will be silent.
See L<perldebguts/"Frame Listing Output Examples"> for examples.

=item b
X<breakpoint>
X<debugger command, b>

Set a breakpoint on the current line.

=item b [line] [condition]
X<breakpoint>
X<debugger command, b>

Set a breakpoint before the given line.  If a condition
is specified, it's evaluated each time the statement is reached: a
breakpoint is taken only if the condition is true.  Breakpoints may
only be set on lines that begin an executable statement.  Conditions
don't use C<if>:

    b 237 $x > 30
    b 237 ++$count237 < 11
    b 33 /pattern/i

If the line number is C<.>, sets a breakpoint on the current line:

    b . $n > 100

=item b [file]:[line] [condition]
X<breakpoint>
X<debugger command, b>

Set a breakpoint before the given line in a (possibly different) file.  If a
condition is specified, it's evaluated each time the statement is reached: a
breakpoint is taken only if the condition is true.  Breakpoints may only be set
on lines that begin an executable statement.  Conditions don't use C<if>:

    b lib/MyModule.pm:237 $x > 30
    b /usr/lib/perl5/site_perl/CGI.pm:100 ++$count100 < 11

=item b subname [condition]
X<breakpoint>
X<debugger command, b>

Set a breakpoint before the first line of the named subroutine.  I<subname> may
be a variable containing a code reference (in this case I<condition>
is not supported).

=item b postpone subname [condition]
X<breakpoint>
X<debugger command, b>

Set a breakpoint at the first line of the subroutine after it is compiled.

=item b load filename
X<breakpoint>
X<debugger command, b>

Set a breakpoint before the first executed line of the I<filename>,
which should be a full pathname found amongst the %INC values.

=item b compile subname
X<breakpoint>
X<debugger command, b>

Sets a breakpoint before the first statement executed after the specified
subroutine is compiled.

=item B line
X<breakpoint>
X<debugger command, B>

Delete a breakpoint from the specified I<line>.

=item B *
X<breakpoint>
X<debugger command, B>

Delete all installed breakpoints.

=item disable [file]:[line]
X<breakpoint>
X<debugger command, disable>
X<disable>

Disable the breakpoint so it won't stop the execution of the program. 
Breakpoints are enabled by default and can be re-enabled using the C<enable>
command.

=item disable [line]
X<breakpoint>
X<debugger command, disable>
X<disable>

Disable the breakpoint so it won't stop the execution of the program. 
Breakpoints are enabled by default and can be re-enabled using the C<enable>
command.

This is done for a breakpoint in the current file.

=item enable [file]:[line]
X<breakpoint>
X<debugger command, enable>
X<enable>

Enable the breakpoint so it will stop the execution of the program. 

=item enable [line]
X<breakpoint>
X<debugger command, enable>
X<enable>

Enable the breakpoint so it will stop the execution of the program. 

This is done for a breakpoint in the current file.

=item a [line] command
X<debugger command, a>

Set an action to be done before the line is executed.  If I<line> is
omitted, set an action on the line about to be executed.
The sequence of steps taken by the debugger is

  1. check for a breakpoint at this line
  2. print the line if necessary (tracing)
  3. do any actions associated with that line
  4. prompt user if at a breakpoint or in single-step
  5. evaluate line

For example, this will print out $foo every time line
53 is passed:

    a 53 print "DB FOUND $foo\n"

=item A line
X<debugger command, A>

Delete an action from the specified line.

=item A *
X<debugger command, A>

Delete all installed actions.

=item w expr
X<debugger command, w>

Add a global watch-expression. Whenever a watched global changes the
debugger will stop and display the old and new values.

=item W expr
X<debugger command, W>

Delete the given watch-expression.

=item W *
X<debugger command, W>

Delete all watch-expressions.

=item o
X<debugger command, o>

Display all options.

=item o booloption ...
X<debugger command, o>

Set each listed Boolean option to the value C<1>.

=item o anyoption? ...
X<debugger command, o>

Print out the value of one or more options.

=item o option=value ...
X<debugger command, o>

Set the value of one or more options.  If the value has internal
whitespace, it should be quoted.  For example, you could set C<o
pager="less -MQeicsNfr"> to call B<less> with those specific options.
You may use either single or double quotes, but if you do, you must
escape any embedded instances of the same sort of quote you began with,
as well as any escapes that immediately precede that quote but which
are not meant to escape the quote itself.  In other words, you follow
single-quoting rules irrespective of the quote; e.g.
C<o option='this isn\'t bad'> or C<o option="She said, \"Isn't
it?\"">.

For historical reasons, the C<=value> is optional, but defaults to
1 only where it is safe to do so--that is, mostly for Boolean
options.  It is always better to assign a specific value using C<=>.
The C<option> can be abbreviated, but for clarity probably should
not be.  Several options can be set together.  See L</"Configurable Options">
for a list of these.

=item < ?
X<< debugger command, < >>

List out all pre-prompt Perl command actions.

=item < [ command ]
X<< debugger command, < >>

Set an action (Perl command) to happen before every debugger prompt.
A multi-line command may be entered by backslashing the newlines.

=item < *
X<< debugger command, < >>

Delete all pre-prompt Perl command actions.

=item << command
X<< debugger command, << >>

Add an action (Perl command) to happen before every debugger prompt.
A multi-line command may be entered by backwhacking the newlines.

=item > ?
X<< debugger command, > >>

List out post-prompt Perl command actions.

=item > command
X<< debugger command, > >>

Set an action (Perl command) to happen after the prompt when you've
just given a command to return to executing the script.  A multi-line
command may be entered by backslashing the newlines (we bet you
couldn't have guessed this by now).

=item > *
X<< debugger command, > >>

Delete all post-prompt Perl command actions.

=item >> command
X<<< debugger command, >> >>>

Adds an action (Perl command) to happen after the prompt when you've
just given a command to return to executing the script.  A multi-line
command may be entered by backslashing the newlines.

=item { ?
X<debugger command, {>

List out pre-prompt debugger commands.

=item { [ command ]

Set an action (debugger command) to happen before every debugger prompt.
A multi-line command may be entered in the customary fashion.

Because this command is in some senses new, a warning is issued if
you appear to have accidentally entered a block instead.  If that's
what you mean to do, write it as C<;{ ... }> or even
C<do { ... }>.

=item { *
X<debugger command, {>

Delete all pre-prompt debugger commands.

=item {{ command
X<debugger command, {{>

Add an action (debugger command) to happen before every debugger prompt.
A multi-line command may be entered, if you can guess how: see above.

=item ! number
X<debugger command, !>

Redo a previous command (defaults to the previous command).

=item ! -number
X<debugger command, !>

Redo number'th previous command.

=item ! pattern
X<debugger command, !>

Redo last command that started with pattern.
See C<o recallCommand>, too.

=item !! cmd
X<debugger command, !!>

Run cmd in a subprocess (reads from DB::IN, writes to DB::OUT).  See
C<o shellBang>, also.  Note that the user's current shell (well,
their C<$ENV{SHELL}> variable) will be used, which can interfere
with proper interpretation of exit status or signal and coredump
information.

=item source file
X<debugger command, source>

Read and execute debugger commands from I<file>.
I<file> may itself contain C<source> commands.

=item H -number
X<debugger command, H>

Display the last I<number> commands.  Only commands longer than one
character are listed.  If I<number> is omitted, list them all.

=item q or ^D
X<debugger command, q>
X<debugger command, ^D>

Quit.  ("quit" doesn't work for this, unless you've made an alias)
This is the only supported way to exit the debugger, though typing
C<exit> twice might work.

Set the C<inhibit_exit> option to 0 if you want to be able to step
off the end of the script.  You may also need to set $finished to 0
if you want to step through global destruction.

=item R
X<debugger command, R>

Restart the debugger by C<exec()>ing a new session.  We try to maintain
your history across this, but internal settings and command-line options
may be lost.

The following settings are currently preserved: history, breakpoints,
actions, debugger options, and the Perl command-line
options B<-w>, B<-I>, and B<-e>.

=item |dbcmd
X<debugger command, |>

Run the debugger command, piping DB::OUT into your current pager.

=item ||dbcmd
X<debugger command, ||>

Same as C<|dbcmd> but DB::OUT is temporarily C<select>ed as well.

=item = [alias value]
X<debugger command, =>

Define a command alias, like

    = quit q

or list current aliases.

=item command

Execute command as a Perl statement.  A trailing semicolon will be
supplied.  If the Perl statement would otherwise be confused for a
Perl debugger command, use a leading semicolon, too.

=item m expr
X<debugger command, m>

List which methods may be called on the result of the evaluated
expression.  The expression may evaluate to a reference to a
blessed object, or to a package name.

=item M
X<debugger command, M>

Display all loaded modules and their versions.

=item man [manpage]
X<debugger command, man>

Despite its name, this calls your system's default documentation
viewer on the given page, or on the viewer itself if I<manpage> is
omitted.  If that viewer is B<man>, the current C<Config> information
is used to invoke B<man> using the proper MANPATH or S<B<-M>
I<manpath>> option.  Failed lookups of the form C<XXX> that match
known manpages of the form I<perlXXX> will be retried.  This lets
you type C<man debug> or C<man op> from the debugger.

On systems traditionally bereft of a usable B<man> command, the
debugger invokes B<perldoc>.  Occasionally this determination is
incorrect due to recalcitrant vendors or, rather more felicitously,
to enterprising users.  If you fall into either category, just
manually set the $DB::doccmd variable to whatever viewer you use to
view the Perl documentation on your system.  This may be set in an rc
file, or through direct assignment.  We're still waiting for a
working example of something along the lines of:

    $DB::doccmd = 'netscape -remote http://something.here/';

=back
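
As a rough illustration of how several of the commands above combine,
consider the following small script.  The file name F<demo.pl>, the
subroutine C<total> and the C<%prices> hash are invented for this sketch
and are not part of the debugger:

    use strict;
    use warnings;

    # Sum a list of numbers; a convenient place for a breakpoint.
    sub total {
        my @nums = @_;
        my $sum  = 0;
        $sum += $_ for @nums;
        return $sum;
    }

    my %prices = (apple => 3, pear => 4);
    print total(values %prices), "\n";

One possible session (output omitted) sets a breakpoint on the
subroutine, continues to it, steps over the first statement, dumps the
argument array, shows the backtrace, runs to the end of the subroutine,
and quits:

    $ perl -d demo.pl
    DB> b total
    DB> c
    DB> n
    DB> x \@nums
    DB> T
    DB> r
    DB> q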

=head2 Configurable Options

The debugger has numerous options settable using the C<o> command,
either interactively or from the environment or an rc file
(F<./.perldb> or F<~/.perldb> under Unix).


=over 12

=item C<recallCommand>, C<ShellBang>
X<debugger option, recallCommand>
X<debugger option, ShellBang>

The characters used to recall a command or spawn a shell.  By
default, both are set to C<!>, which is unfortunate.

=item C<pager>
X<debugger option, pager>

Program to use for output of pager-piped commands (those beginning
with a C<|> character.)  By default, C<$ENV{PAGER}> will be used.
Because the debugger uses your current terminal characteristics
for bold and underlining, if the chosen pager does not pass escape
sequences through unchanged, the output of some debugger commands
will not be readable when sent through the pager.

=item C<tkRunning>
X<debugger option, tkRunning>

Run Tk while prompting (with ReadLine).

=item C<signalLevel>, C<warnLevel>, C<dieLevel>
X<debugger option, signalLevel> X<debugger option, warnLevel>
X<debugger option, dieLevel>

Level of verbosity.  By default, the debugger leaves your exceptions
and warnings alone, because altering them can break correctly running
programs.  It will attempt to print a message when uncaught INT, BUS, or
SEGV signals arrive.  (But see the mention of signals in L</BUGS> below.)

To disable this default safe mode, set these values to something higher
than 0.  At a level of 1, you get backtraces upon receiving any kind
of warning (this is often annoying) or exception (this is
often valuable).  Unfortunately, the debugger cannot discern fatal
exceptions from non-fatal ones.  If C<dieLevel> is even 1, then your
non-fatal exceptions are also traced and unceremoniously altered if they
came from C<eval'ed> strings or from any kind of C<eval> within modules
you're attempting to load.  If C<dieLevel> is 2, the debugger doesn't
care where they came from:  It usurps your exception handler and prints
out a trace, then modifies all exceptions with its own embellishments.
This may perhaps be useful for some tracing purposes, but tends to hopelessly
destroy any program that takes its exception handling seriously.

=item C<AutoTrace>
X<debugger option, AutoTrace>

Trace mode (similar to C<t> command, but can be put into
C<PERLDB_OPTS>).

=item C<LineInfo>
X<debugger option, LineInfo>

File or pipe to print line number info to.  If it is a pipe (say,
C<|visual_perl_db>), then a short message is used.  This is the
mechanism used to interact with a slave editor or visual debugger,
such as the special C<vi> or C<emacs> hooks, or the C<ddd> graphical
debugger.

=item C<inhibit_exit>
X<debugger option, inhibit_exit>

If 0, allows I<stepping off> the end of the script.

=item C<PrintRet>
X<debugger option, PrintRet>

Print return value after C<r> command if set (default).

=item C<ornaments>
X<debugger option, ornaments>

Affects screen appearance of the command line (see L<Term::ReadLine>).
There is currently no way to disable these, which can render
some output illegible on some displays, or with some pagers.
This is considered a bug.

=item C<frame>
X<debugger option, frame>

Affects the printing of messages upon entry and exit from subroutines.  If
C<frame & 2> is false, messages are printed on entry only. (Printing
on exit might be useful if interspersed with other messages.)

If C<frame & 4>, arguments to functions are printed, plus context
and caller info.  If C<frame & 8>, overloaded C<stringify> and
C<tie>d C<FETCH> are enabled on the printed arguments.  If C<frame
& 16>, the return value from the subroutine is printed.

The length at which the argument list is truncated is governed by the
next option:

=item C<maxTraceLen>
X<debugger option, maxTraceLen>

Length to truncate the argument list when the C<frame> option's
bit 4 is set.

=item C<windowSize>
X<debugger option, windowSize>

Change the size of code list window (default is 10 lines).

=back
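
The bit tests described for the C<frame> option above can be read as a
small lookup table.  The following is only an illustrative sketch, not
code taken from the debugger; the names C<@FRAME_BITS> and
C<describe_frame> are made up here, and only the bits mentioned above
are covered:

    use strict;
    use warnings;

    # Bits of the "frame" option as described in the text above.
    my @FRAME_BITS = (
        [  2 => 'print messages on exit as well as on entry' ],
        [  4 => 'print arguments, context and caller info' ],
        [  8 => 'enable overloaded stringify and tied FETCH' ],
        [ 16 => 'print the return value of the subroutine' ],
    );

    # Return a description for every listed bit set in $frame.
    sub describe_frame {
        my ($frame) = @_;
        return map  { $_->[1] }
               grep { $frame & $_->[0] } @FRAME_BITS;
    }

    print "frame=6: $_\n" for describe_frame(6);

Here C<frame=6> combines bits 2 and 4, so the sketch reports the first
two descriptions.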

The following options affect what happens with C<V>, C<X>, and C<x>
commands:

=over 12

=item C<arrayDepth>, C<hashDepth>
X<debugger option, arrayDepth> X<debugger option, hashDepth>

Print only first N elements ('' for all).

=item C<dumpDepth>
X<debugger option, dumpDepth>

Limit recursion depth to N levels when dumping structures.
Negative values are interpreted as infinity.  Default: infinity.

=item C<compactDump>, C<veryCompact>
X<debugger option, compactDump> X<debugger option, veryCompact>

Change the style of array and hash output.  If C<compactDump>, short arrays
may be printed on one line.

=item C<globPrint>
X<debugger option, globPrint>

Whether to print contents of globs.

=item C<DumpDBFiles>
X<debugger option, DumpDBFiles>

Dump arrays holding debugged files.

=item C<DumpPackages>
X<debugger option, DumpPackages>

Dump symbol tables of packages.

=item C<DumpReused>
X<debugger option, DumpReused>

Dump contents of "reused" addresses.

=item C<quote>, C<HighBit>, C<undefPrint>
X<debugger option, quote> X<debugger option, HighBit>
X<debugger option, undefPrint>

Change the style of string dump.  The default value for C<quote>
is C<auto>; one can enable double-quotish or single-quotish format
by setting it to C<"> or C<'>, respectively.  By default, characters
with their high bit set are printed verbatim.

=item C<UsageOnly>
X<debugger option, UsageOnly>

Rudimentary per-package memory usage dump.  Calculates total
size of strings found in variables in the package.  This does not
include lexicals in a module's file scope, or lost in closures.

=item C<HistFile>
X<debugger option, history, HistFile>

The path of the file from which the history (assuming a usable
Term::ReadLine backend) will be read on the debugger's startup, and to which
it will be saved on shutdown (for persistence across sessions). Similar in
concept to Bash's C<.bash_history> file.

=item C<HistSize>
X<debugger option, history, HistSize>

The count of the saved lines in the history (assuming C<HistFile> above).

=back

After the rc file is read, the debugger reads the C<$ENV{PERLDB_OPTS}>
environment variable and parses this as the remainder of an "o ..."
line as one might enter at the debugger prompt.  You may place the
initialization options C<TTY>, C<noTTY>, C<ReadLine>, and C<NonStop>
there.

If your rc file contains:

  parse_options("NonStop=1 LineInfo=db.out AutoTrace");

then your script will run without human intervention, putting trace
information into the file I<db.out>.  (If you interrupt it, you'd
better reset C<LineInfo> to F</dev/tty> if you expect to see anything.)

=over 12

=item C<TTY>
X<debugger option, TTY>

The TTY to use for debugging I/O.

=item C<noTTY>
X<debugger option, noTTY>

If set, the debugger goes into C<NonStop> mode and will not connect to a TTY.  If
interrupted (or if control goes to the debugger via explicit setting of
$DB::signal or $DB::single from the Perl script), it connects to a TTY
specified in the C<TTY> option at startup, or to a tty found at
runtime using the C<Term::Rendezvous> module of your choice.

This module should implement a method named C<new> that returns an object
with two methods: C<IN> and C<OUT>.  These should return filehandles to use
for debugging input and output correspondingly.  The C<new> method should
inspect an argument containing the value of C<$ENV{PERLDB_NOTTY}> at
startup, or C<"$ENV{HOME}/.perldbtty$$"> otherwise.  This file is not
inspected for proper ownership, so security hazards are theoretically
possible.

=item C<ReadLine>
X<debugger option, ReadLine>

If false, readline support in the debugger is disabled in order
to debug applications that themselves use ReadLine.

=item C<NonStop>
X<debugger option, NonStop>

If set, the debugger goes into non-interactive mode until interrupted, or
until control is handed to it programmatically by setting $DB::signal or
$DB::single.

=back

Here's an example of using the C<$ENV{PERLDB_OPTS}> variable:

    $ PERLDB_OPTS="NonStop frame=2" perl -d myprogram

That will run the script B<myprogram> without human intervention,
printing out the call tree with entry and exit points.  Note that
C<NonStop=1 frame=2> is equivalent to C<N f=2>, and that originally,
options could be uniquely abbreviated by the first letter (modulo
the C<Dump*> options).  It is nevertheless recommended that you
always spell them out in full for legibility and future compatibility.

Other examples include

    $ PERLDB_OPTS="NonStop LineInfo=listing frame=2" perl -d myprogram

which runs the script non-interactively, printing info on each entry
into a subroutine and each executed line into the file named F<listing>.
(If you interrupt it, you had better reset C<LineInfo> to something
"interactive"!)

Other examples include (using standard shell syntax to show environment
variable settings):

  $ ( PERLDB_OPTS="NonStop frame=1 AutoTrace LineInfo=tperl.out"
      perl -d myprogram )

which may be useful for debugging a program that uses C<Term::ReadLine>
itself.  Do not forget to detach your shell from the TTY in the window that
corresponds to F</dev/ttyXX>, say, by issuing a command like

  $ sleep 1000000

See L<perldebguts/"Debugger Internals"> for details.

=head2 Debugger Input/Output

=over 8

=item Prompt

The debugger prompt is something like

    DB<8>

or even

    DB<<17>>

where that number is the command number, which you'd use to recall
the command with the built-in B<csh>-like history mechanism.  For example,
C<!17> would repeat command number 17.  The depth of the angle
brackets indicates the nesting depth of the debugger.  You could
get more than one set of brackets, for example, if you were already
at a breakpoint and then printed the result of a function call that
itself has a breakpoint, or if you stepped into an expression via the
C<s/n/t expression> command.

=item Multiline commands

If you want to enter a multi-line command, such as a subroutine
definition with several statements or a format, escape the newline
that would normally end the debugger command with a backslash.
Here's an example:

      DB<1> for (1..4) {         \
      cont:     print "ok\n";   \
      cont: }
      ok
      ok
      ok
      ok

Note that this business of escaping a newline is specific to interactive
commands typed into the debugger.

=item Stack backtrace
X<backtrace> X<stack, backtrace>

Here's an example of what a stack backtrace via C<T> command might
look like:

 $ = main::infested called from file 'Ambulation.pm' line 10
 @ = Ambulation::legs(1, 2, 3, 4) called from file 'camel_flea'
                                                          line 7
 $ = main::pests('bactrian', 4) called from file 'camel_flea'
                                                          line 4

The left-hand character up there indicates the context in which the
function was called, with C<$> and C<@> meaning scalar or list
contexts respectively, and C<.> meaning void context (which is
actually a sort of scalar context).  The display above says
that you were in the function C<main::infested> when you ran the
stack dump, and that it was called in scalar context from line
10 of the file I<Ambulation.pm>, but without any arguments at all,
meaning it was called as C<&infested>.  The next stack frame shows
that the function C<Ambulation::legs> was called in list context
from the I<camel_flea> file with four arguments.  The last stack
frame shows that C<main::pests> was called in scalar context,
also from I<camel_flea>, but from line 4.

If you execute the C<T> command from inside an active C<use>
statement, the backtrace will contain both a C<require> frame and
an C<eval> frame.

=item Line Listing Format

This shows the sorts of output the C<l> command can produce:

   DB<<13>> l
 101:        @i{@i} = ();
 102:b       @isa{@i,$pack} = ()
 103             if(exists $i{$prevpack} || exists $isa{$pack});
 104     }
 105
 106     next
 107==>      if(exists $isa{$pack});
 108
 109:a   if ($extra-- > 0) {
 110:        %isa = ($pack,1);

Breakable lines are marked with C<:>.  Lines with breakpoints are
marked by C<b> and those with actions by C<a>.  The line that's
about to be executed is marked by C<< ==> >>.

Please be aware that code in debugger listings may not look the same
as your original source code.  Line directives and external source
filters can alter the code before Perl sees it, causing code to move
from its original positions or take on entirely different forms.

=item Frame listing

When the C<frame> option is set, the debugger prints entered (and
optionally exited) subroutines in different styles.  See L<perldebguts>
for incredibly long examples of these.

=back

=head2 Debugging Compile-Time Statements

If you have compile-time executable statements (such as code within
BEGIN, UNITCHECK and CHECK blocks or C<use> statements), these will
I<not> be stopped by the debugger, although C<require>s and INIT blocks
will, and compile-time statements can be traced with the C<AutoTrace>
option set in C<PERLDB_OPTS>.  From your own Perl code, however, you
can transfer control back to the debugger using the following
statement, which is harmless if the debugger is not running:

    $DB::single = 1;

If you set C<$DB::single> to 2, it's equivalent to having
just typed the C<n> command, whereas a value of 1 means the C<s>
command.  The C<$DB::trace>  variable should be set to 1 to simulate
having typed the C<t> command.
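
As a minimal sketch of this technique, the following hypothetical script
forces a stop inside a C<BEGIN> block when run under C<perl -d> and
behaves like an ordinary program otherwise.  The C<%config> hash and its
contents are assumptions made up for illustration; they are not part of
the debugger:

    use strict;
    use warnings;

    our %config;

    BEGIN {
        # Silence a possible "used only once" warning when the
        # debugger is not loaded.
        no warnings 'once';

        # Under perl -d, the debugger stops at the next statement,
        # as if "s" had been typed.  Without -d this only sets an
        # ordinary package variable.
        $DB::single = 1;

        %config = (verbose => 1);
    }

    print "verbose is $config{verbose}\n";

Running the file with plain C<perl> prints the message and exits;
running it with C<perl -d> should stop inside the C<BEGIN> block first.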

Another way to debug compile-time code is to start the debugger, set a
breakpoint on the I<load> of some module:

    DB<7> b load f:/perllib/lib/Carp.pm
  Will stop on load of 'f:/perllib/lib/Carp.pm'.

and then restart the debugger using the C<R> command (if possible).  One can use C<b
compile subname> for the same purpose.

=head2 Debugger Customization

The debugger probably contains enough configuration hooks that you
won't ever have to modify it yourself.  You may change the behaviour
of the debugger from within the debugger using its C<o> command, from
the command line via the C<PERLDB_OPTS> environment variable, and
from customization files.

You can do some customization by setting up a F<.perldb> file, which
contains initialization code.  For instance, you could make aliases
like these (the last one is one people expect to be there):

    $DB::alias{'len'}  = 's/^len(.*)/p length($1)/';
    $DB::alias{'stop'} = 's/^stop (at|in)/b/';
    $DB::alias{'ps'}   = 's/^ps\b/p scalar /';
    $DB::alias{'quit'} = 's/^quit(\s*)/exit/';

You can change options from F<.perldb> by using calls like this one:

    parse_options("NonStop=1 LineInfo=db.out AutoTrace=1 frame=2");

The code is executed in the package C<DB>.  Note that F<.perldb> is
processed before processing C<PERLDB_OPTS>.  If F<.perldb> defines the
subroutine C<afterinit>, that function is called after debugger
initialization ends.  F<.perldb> may be contained in the current
directory, or in the home directory.  Because this file is sourced
in by Perl and may contain arbitrary commands, for security reasons,
it must be owned by the superuser or the current user, and writable
by no one but its owner.

You can mock TTY input to the debugger by adding arbitrary commands to
@DB::typeahead.  For example, your F<.perldb> file might contain:

    sub afterinit { push @DB::typeahead, "b 4", "b 6"; }

This would attempt to set breakpoints on lines 4 and 6 immediately
after debugger initialization.  Note that @DB::typeahead is not a supported
interface and is subject to change in future releases.

If you want to modify the debugger, copy F<perl5db.pl> from the
Perl library to another name and hack it to your heart's content.
You'll then want to set your C<PERL5DB> environment variable to say
something like this:

    BEGIN { require "myperl5db.pl" }

As a last resort, you could also use C<PERL5DB> to customize the debugger
by directly setting internal variables or calling debugger functions.

Note that any variables and functions that are not documented in
this document (or in L<perldebguts>) are considered for internal
use only, and as such are subject to change without notice.

=head2 Readline Support / History in the Debugger

As shipped, the only command-line history supplied is a simplistic one
that checks for leading exclamation points.  However, if you install
the Term::ReadKey and Term::ReadLine modules from CPAN (such as
Term::ReadLine::Gnu, Term::ReadLine::Perl, ...) you will
have full editing capabilities much like those GNU I<readline>(3) provides.
Look for these in the F<modules/by-module/Term> directory on CPAN.
These do not support normal B<vi> command-line editing, however.

A rudimentary command-line completion is also available, including
lexical variables in the current scope if the C<PadWalker> module
is installed.

Without Readline support you may see the symbols "^[[A", "^[[C", "^[[B",
"^[[D", "^H", ... when using the arrow keys and/or the backspace key.

=head2 Editor Support for Debugging

If you have GNU's version of B<emacs> installed on your system,
it can interact with the Perl debugger to provide an integrated
software development environment reminiscent of its interactions
with C debuggers.

Recent versions of Emacs come with a
start file for making B<emacs> act like a
syntax-directed editor that understands (some of) Perl's syntax.
See L<perlfaq3>.

Users of B<vi> should also look into B<vim> and B<gvim>, the mousey
and windy version, for coloring of Perl keywords.

Note that only perl can truly parse Perl, so all such CASE tools
fall somewhat short of the mark, especially if you don't program
your Perl as a C programmer might.

=head2 The Perl Profiler
X<profile> X<profiling> X<profiler>

If you wish to supply an alternative debugger for Perl to run,
invoke your script with a colon and a package argument given to the
B<-d> flag.  Perl's alternative debuggers include a Perl profiler,
L<Devel::NYTProf>, which is available separately as a CPAN
distribution.  To profile your Perl program in the file F<mycode.pl>,
just type:

    $ perl -d:NYTProf mycode.pl

When the script terminates the profiler will create a database of the
profile information that you can turn into reports using the profiler's
tools.  See L<perlperf> for details.

=head1 Debugging Regular Expressions
X<regular expression, debugging>
X<regex, debugging> X<regexp, debugging>

C<use re 'debug'> enables you to see the gory details of how the Perl
regular expression engine works. In order to understand this typically
voluminous output, one must not only have some idea about how regular
expression matching works in general, but also know how Perl's regular
expressions are internally compiled into an automaton. These matters
are explored in some detail in
L<perldebguts/"Debugging Regular Expressions">.

=head1 Debugging Memory Usage
X<memory usage>

Perl contains internal support for reporting its own memory usage,
but this is a fairly advanced concept that requires some understanding
of how memory allocation works.
See L<perldebguts/"Debugging Perl Memory Usage"> for the details.

=head1 SEE ALSO

You do have C<use strict> and C<use warnings> enabled, don't you?

L<perldebtut>,
L<perldebguts>,
L<re>,
L<DB>,
L<Devel::NYTProf>,
L<Dumpvalue>,
and
L<perlrun>.

When debugging a script that uses #! and is thus normally found in
$PATH, the -S option causes perl to search $PATH for it, so you don't
have to type the path or C<which $scriptname>.

  $ perl -Sd foo.pl

=head1 BUGS

You cannot get stack frame information or in any fashion debug functions
that were not compiled by Perl, such as those from C or C++ extensions.

If you alter your @_ arguments in a subroutine (such as with C<shift>
or C<pop>), the stack backtrace will not show the original values.

The debugger does not currently work in conjunction with the B<-W>
command-line switch, because it itself is not free of warnings.

If you're in a slow syscall (like C<wait>ing, C<accept>ing, or C<read>ing
from your keyboard or a socket) and haven't set up your own C<$SIG{INT}>
handler, then you won't be able to CTRL-C your way back to the debugger,
because the debugger's own C<$SIG{INT}> handler doesn't understand that
it needs to raise an exception to longjmp(3) out of slow syscalls.

=head1 NAME

perldata - Perl data types

=head1 DESCRIPTION

=head2 Variable names
X<variable, name> X<variable name> X<data type> X<type>

Perl has three built-in data types: scalars, arrays of scalars, and
associative arrays of scalars, known as "hashes".  A scalar is a 
single string (of any size, limited only by the available memory),
number, or a reference to something (which will be discussed
in L<perlref>).  Normal arrays are ordered lists of scalars indexed
by number, starting with 0.  Hashes are unordered collections of scalar 
values indexed by their associated string key.

Values are usually referred to by name, or through a named reference.
The first character of the name tells you to what sort of data
structure it refers.  The rest of the name tells you the particular
value to which it refers.  Usually this name is a single I<identifier>,
that is, a string beginning with a letter or underscore, and
containing letters, underscores, and digits.  In some cases, it may
be a chain of identifiers, separated by C<::> (or by the slightly
archaic C<'>); all but the last are interpreted as names of packages,
to locate the namespace in which to look up the final identifier
(see L<perlmod/Packages> for details).  For a more in-depth discussion
on identifiers, see L</Identifier parsing>.  It's possible to
substitute for a simple identifier, an expression that produces a reference
to the value at runtime.   This is described in more detail below
and in L<perlref>.
X<identifier>

Perl also has its own built-in variables whose names don't follow
these rules.  They have strange names so they don't accidentally
collide with one of your normal variables.  Strings that match
parenthesized parts of a regular expression are saved under names
containing only digits after the C<$> (see L<perlop> and L<perlre>).
In addition, several special variables that provide windows into
the inner working of Perl have names containing punctuation characters.
These are documented in L<perlvar>.
X<variable, built-in>

Scalar values are always named with '$', even when referring to a
scalar that is part of an array or a hash.  The '$' symbol works
semantically like the English word "the" in that it indicates a
single value is expected.
X<scalar>

    $days		# the simple scalar value "days"
    $days[28]		# the 29th element of array @days
    $days{'Feb'}	# the 'Feb' value from hash %days
    $#days		# the last index of array @days

Entire arrays (and slices of arrays and hashes) are denoted by '@',
which works much as the word "these" or "those" does in English,
in that it indicates multiple values are expected.
X<array>

    @days		# ($days[0], $days[1],... $days[n])
    @days[3,4,5]	# same as ($days[3],$days[4],$days[5])
    @days{'a','c'}	# same as ($days{'a'},$days{'c'})

Entire hashes are denoted by '%':
X<hash>

    %days		# (key1, val1, key2, val2 ...)

In addition, subroutines are named with an initial '&', though this
is optional when unambiguous, just as the word "do" is often redundant
in English.  Symbol table entries can be named with an initial '*',
but you don't really care about that yet (if ever :-).

Every variable type has its own namespace, as do several
non-variable identifiers.  This means that you can, without fear
of conflict, use the same name for a scalar variable, an array, or
a hash--or, for that matter, for a filehandle, a directory handle, a
subroutine name, a format name, or a label.  This means that $foo
and @foo are two different variables.  It also means that C<$foo[1]>
is a part of @foo, not a part of $foo.  This may seem a bit weird,
but that's okay, because it is weird.
X<namespace>
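
A short sketch of the separate namespaces follows.  The variable names
and values are arbitrary; the point is only that the three variables
called C<foo> do not interfere with one another:

    use strict;
    use warnings;

    my $foo = "a scalar";
    my @foo = ("zero", "one");
    my %foo = (one => 1);

    print "$foo\n";         # the scalar $foo
    print "$foo[1]\n";      # element 1 of @foo, unrelated to $foo
    print "$foo{one}\n";    # the value for key 'one' in %foo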

Because variable references always start with '$', '@', or '%', the
"reserved" words aren't in fact reserved with respect to variable
names.  They I<are> reserved with respect to labels and filehandles,
however, which don't have an initial special character.  You can't
have a filehandle named "log", for instance.  Hint: you could say
C<open(LOG,'logfile')> rather than C<open(log,'logfile')>.  Using
uppercase filehandles also improves readability and protects you
from conflict with future reserved words.  Case I<is> significant--"FOO",
"Foo", and "foo" are all different names.  Names that start with a
letter or underscore may also contain digits and underscores.
X<identifier, case sensitivity>
X<case>

It is possible to replace such an alphanumeric name with an expression
that returns a reference to the appropriate type.  For a description
of this, see L<perlref>.

Names that start with a digit may contain only more digits.  Names
that do not start with a letter, underscore, digit or a caret are
limited to one character, e.g.,  C<$%> or
C<$$>.  (Most of these one character names have a predefined
significance to Perl.  For instance, C<$$> is the current process
id.  And all such names are reserved for Perl's possible use.)

=head2 Identifier parsing
X<identifiers>

Up until Perl 5.18, the actual rules of what a valid identifier
was were a bit fuzzy.  However, in general, anything defined here should
work on previous versions of Perl, while the opposite -- edge cases
that work in previous versions, but aren't defined here -- probably
won't work on newer versions.
As an important side note, the following only applies
to bareword identifiers as found in Perl source code, not identifiers
introduced through symbolic references, which have far fewer
restrictions.
If working under the effect of the C<use utf8;> pragma, the following
rules apply:

    / (?[ ( \p{Word} & \p{XID_Start} ) + [_] ])
      (?[ ( \p{Word} & \p{XID_Continue} ) ]) *    /x

That is, a "start" character followed by any number of "continue"
characters.  Perl requires every character in an identifier to also
match C<\w> (this prevents some problematic cases); and Perl
additionally accepts identifier names beginning with an underscore.

If not under C<use utf8>, the source is treated as ASCII + 128 extra
generic characters, and identifiers should match

    / (?aa) (?!\d) \w+ /x

That is, any word character in the ASCII range, as long as the first
character is not a digit.
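
The non-utf8 rule can be tried out directly.  The sketch below wraps the
pattern in a helper named C<is_ascii_identifier> (a hypothetical name,
not a Perl built-in) and anchors it with C<\A> and C<\z> so that the
whole string must match:

    use strict;
    use warnings;

    # Returns 1 if $name is a valid basic identifier under the
    # non-utf8 rule quoted above, and 0 otherwise.
    sub is_ascii_identifier {
        my ($name) = @_;
        return $name =~ / \A (?aa) (?!\d) \w+ \z /x ? 1 : 0;
    }

    for my $name ("foo", "_bar9", "9lives", "two words") {
        printf "%-12s %s\n", "'$name'",
            is_ascii_identifier($name) ? "ok" : "not an identifier";
    }

This checks only a single basic identifier; package separators and
sigils are covered by the fuller grammar later in this section.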

There are two package separators in Perl: A double colon (C<::>) and a single
quote (C<'>).  Normal identifiers can start or end with a double colon, and
can contain several parts delimited by double colons.
Single quotes have similar rules, but with the exception that they are not
legal at the end of an identifier: That is, C<$'foo> and C<$foo'bar> are
legal, but C<$foo'bar'> is not.

Additionally, if the identifier is preceded by a sigil --
that is, if the identifier is part of a variable name -- it
may optionally be enclosed in braces.

While you can mix double colons with single quotes, the quotes must come
after the colons: C<$::::'foo> and C<$foo::'bar> are legal, but C<$::'::foo>
and C<$foo'::bar> are not.

Put together, a grammar to match a basic identifier becomes

 /
  (?(DEFINE)
      (?<variable>
          (?&sigil)
          (?:
                  (?&normal_identifier)
              |   \{ \s* (?&normal_identifier) \s* \}
          )
      )
      (?<normal_identifier>
          (?: :: )* '?
           (?&basic_identifier)
           (?: (?= (?: :: )+ '? | (?: :: )* ' ) (?&normal_identifier) )?
          (?: :: )*
      )
      (?<basic_identifier>
        # is use utf8 on?
          (?(?{ (caller(0))[8] & $utf8::hint_bits })
              (?&Perl_XIDS) (?&Perl_XIDC)*
            | (?aa) (?!\d) \w+
          )
      )
      (?<sigil> [&*\$\@\%])
      (?<Perl_XIDS> (?[ ( \p{Word} & \p{XID_Start} ) + [_] ]) )
      (?<Perl_XIDC> (?[ \p{Word} & \p{XID_Continue} ]) )
  )
 /x

Meanwhile, special identifiers don't follow the above rules.  For the most
part, all of the identifiers in this category have a special meaning given
by Perl.  Because they have special parsing rules, these generally can't be
fully-qualified.  They come in six forms (but don't use forms 5 and 6):

=over

=item 1.

A sigil, followed solely by digits matching C<\p{POSIX_Digit}>, like
C<$0>, C<$1>, or C<$10000>.

=item 2.

A sigil followed by a single character matching the C<\p{POSIX_Punct}>
property, like C<$!> or C<%+>, except the character C<"{"> doesn't work.

=item 3.

A sigil, followed by a caret and any one of the characters
C<[][A-Z^_?\]>, like C<$^V> or C<$^]>.

=item 4.

Similar to the above, a sigil, followed by bareword text in braces,
where the first character is a caret.  The next character is any one of
the characters C<[][A-Z^_?\]>, followed by ASCII word characters.  An
example is C<${^GLOBAL_PHASE}>.

=item 5.

A sigil, followed by any single character in the range C<[\xA1-\xAC\xAE-\xFF]>
when not under C<S<"use utf8">>.  (Under C<S<"use utf8">>, the normal
identifier rules given earlier in this section apply.)  Use of
non-graphic characters (the C1 controls, the NO-BREAK SPACE, and the
SOFT HYPHEN) has been disallowed since v5.26.0.
The use of the other characters is unwise, as these are all
reserved to have special meaning to Perl, and none of them currently
do have special meaning, though this could change without notice.

Note that an implication of this form is that there are identifiers only
legal under C<S<"use utf8">>, and vice-versa, for example the identifier
C<$E<233>tat> is legal under C<S<"use utf8">>, but is otherwise
considered to be the single character variable C<$E<233>> followed by
the bareword C<"tat">, the combination of which is a syntax error.

=item 6.

This is a combination of the previous two forms.  It is valid only when
not under S<C<"use utf8">> (normal identifier rules apply when under
S<C<"use utf8">>).  The form is a sigil, followed by text in braces,
where the first character is any one of the characters in the range
C<[\x80-\xFF]> followed by ASCII word characters up to the trailing
brace.

The same caveats as the previous form apply:  The non-graphic
characters are no longer allowed with S<"use utf8">, it is unwise
to use this form at all, and utf8ness makes a big difference.

=back

Prior to Perl v5.24, non-graphical ASCII control characters were also
allowed in some situations; this had been deprecated since v5.20.

=head2 Context
X<context> X<scalar context> X<list context>

The interpretation of operations and values in Perl sometimes depends
on the requirements of the context around the operation or value.
There are two major contexts: list and scalar.  Certain operations
return list values in contexts wanting a list, and scalar values
otherwise.  If this is true of an operation it will be mentioned in
the documentation for that operation.  In other words, Perl overloads
certain operations based on whether the expected return value is
singular or plural.  Some words in English work this way, like "fish"
and "sheep".

In a reciprocal fashion, an operation provides either a scalar or a
list context to each of its arguments.  For example, if you say

    int( <STDIN> )

the integer operation provides scalar context for the <>
operator, which responds by reading one line from STDIN and passing it
back to the integer operation, which will then find the integer value
of that line and return that.  If, on the other hand, you say

    sort( <STDIN> )

then the sort operation provides list context for <>, which
will proceed to read every line available up to the end of file, and
pass that list of lines back to the sort routine, which will then
sort those lines and return them as a list to whatever the context
of the sort was.

Assignment is a little bit special in that it uses its left argument
to determine the context for the right argument.  Assignment to a
scalar evaluates the right-hand side in scalar context, while
assignment to an array or hash evaluates the right-hand side in list
context.  Assignment to a list (or slice, which is just a list
anyway) also evaluates the right-hand side in list context.
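
The effect of the left-hand side can be seen by assigning the same array
in three ways.  This is only an illustrative sketch; the array contents
are arbitrary:

    use strict;
    use warnings;

    my @days = qw(Mon Tue Wed);

    my $count   = @days;    # scalar assignment: the array's length
    my ($first) = @days;    # list assignment: the first element
    my @copy    = @days;    # list assignment: all elements

    print "$count $first @copy\n";

The parentheses around C<$first> are what turn the second assignment
into a list assignment.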

When you use the C<use warnings> pragma or Perl's B<-w> command-line 
option, you may see warnings
about useless uses of constants or functions in "void context".
Void context just means the value has been discarded, such as a
statement containing only C<"fred";> or C<getpwuid(0);>.  It still
counts as scalar context for functions that care whether or not
they're being called in list context.

User-defined subroutines may choose to care whether they are being
called in a void, scalar, or list context.  Most subroutines do not
need to bother, though.  That's because both scalars and lists are
automatically interpolated into lists.  See L<perlfunc/wantarray>
for how you would dynamically discern your function's calling
context.
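
A minimal sketch of a context-aware subroutine follows.  The names
C<context> and C<$seen> are invented for the example; only C<wantarray>
is a Perl built-in.  It returns true in list context, a defined false
value in scalar context, and C<undef> in void context:

    use strict;
    use warnings;

    my $seen;    # records the context of the most recent call

    sub context {
        my $w = wantarray;
        $seen = $w         ? "list"
              : defined $w ? "scalar"
              :              "void";
        return $seen;
    }

    my @list   = context();    # list context
    my $scalar = context();    # scalar context
    context();                 # void context; the result is discarded

    print "$list[0] $scalar $seen\n";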

=head2 Scalar values
X<scalar> X<number> X<string> X<reference>

All data in Perl is a scalar, an array of scalars, or a hash of
scalars.  A scalar may contain one single value in any of three
different flavors: a number, a string, or a reference.  In general,
conversion from one form to another is transparent.  Although a
scalar may not directly hold multiple values, it may contain a
reference to an array or hash which in turn contains multiple values.

Scalars aren't necessarily one thing or another.  There's no place
to declare a scalar variable to be of type "string", type "number",
type "reference", or anything else.  Because of the automatic
conversion of scalars, operations that return scalars don't need
to care (and in fact, cannot care) whether their caller is looking
for a string, a number, or a reference.  Perl is a contextually
polymorphic language whose scalars can be strings, numbers, or
references (which includes objects).  Although strings and numbers
are considered pretty much the same thing for nearly all purposes,
references are strongly-typed, uncastable pointers with builtin
reference-counting and destructor invocation.

A scalar value is interpreted as FALSE in the Boolean sense
if it is undefined, the null string or the number 0 (or its
string equivalent, "0"), and TRUE if it is anything else.  The
Boolean context is just a special kind of scalar context where no 
conversion to a string or a number is ever performed.
X<boolean> X<bool> X<true> X<false> X<truth>
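
The rule can be checked with a short loop.  In this sketch the first
four values are the false ones; the rest are true, including strings
such as C<"0.0"> and C<"00"> that are numerically zero but are not the
string C<"0">:

    use strict;
    use warnings;

    my @values = (undef, "", 0, "0", "0.0", "00", " ", "a", 1);

    for my $v (@values) {
        my $shown = defined $v ? "'$v'" : "undef";
        printf "%-6s is %s\n", $shown, $v ? "TRUE" : "FALSE";
    }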

There are actually two varieties of null strings (sometimes referred
to as "empty" strings), a defined one and an undefined one.  The
defined version is just a string of length zero, such as C<"">.
The undefined version is the value that indicates that there is
no real value for something, such as when there was an error, or
at end of file, or when you refer to an uninitialized variable or
element of an array or hash.  Although in early versions of Perl,
an undefined scalar could become defined when first used in a
place expecting a defined value, this no longer happens except for
rare cases of autovivification as explained in L<perlref>.  You can
use the defined() operator to determine whether a scalar value is
defined (this has no meaning on arrays or hashes), and the undef()
operator to produce an undefined value.
X<defined> X<undefined> X<undef> X<null> X<string, null>
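
The difference between the two kinds of null string shows up only when
you ask with C<defined>.  A small sketch, with variable names chosen
purely for illustration:

    use strict;
    use warnings;

    my $empty = "";      # defined, zero-length string
    my $unset;           # undefined

    print "empty is defined\n"       if     defined $empty;
    print "unset is not defined\n"   unless defined $unset;

    $empty = undef;      # make it undefined again
    print "empty is now undefined\n" unless defined $empty;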

To find out whether a given string is a valid non-zero number, it's
sometimes enough to test it against both numeric 0 and also lexical
"0" (although this will cause noises if warnings are on).  That's 
because strings that aren't numbers count as 0, just as they do in B<awk>:

    if ($str == 0 && $str ne "0")  {
	warn "That doesn't look like a number";
    }

That method may be best because otherwise you won't treat IEEE
notations like C<NaN> or C<Infinity> properly.  At other times, you
might prefer to determine whether string data can be used numerically
by calling the POSIX::strtod() function or by inspecting your string
with a regular expression (as documented in L<perlre>).

    warn "has nondigits"	if     /\D/;
    warn "not a natural number" unless /^\d+$/;             # rejects -3
    warn "not an integer"       unless /^-?\d+$/;           # rejects +3
    warn "not an integer"       unless /^[+-]?\d+$/;
    warn "not a decimal number" unless /^-?\d+\.?\d*$/;     # rejects .2
    warn "not a decimal number" unless /^-?(?:\d+(?:\.\d*)?|\.\d+)$/;
    warn "not a C float"
	unless /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/;
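
For the POSIX::strtod() approach, a sketch might look like the
following.  In list context strtod() returns the parsed number and the
count of characters it could not parse, so a string is numeric when that
count is zero.  The helper name C<is_numeric> is an assumption for this
example, and the result depends on the current locale and on what the
underlying C library accepts:

    use strict;
    use warnings;
    use POSIX qw(strtod);

    # Returns 1 if strtod() consumes the entire string, else 0.
    sub is_numeric {
        my ($str) = @_;
        return 0 unless defined $str && length $str;
        my ($num, $unparsed) = strtod($str);
        return $unparsed == 0 ? 1 : 0;
    }

    for my $str ("3.14", "-7", "1e5", "12abc", "abc") {
        printf "%-6s %s\n", $str,
            is_numeric($str) ? "numeric" : "not numeric";
    }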

The length of an array is a scalar value.  You may find the length
of array @days by evaluating C<$#days>, as in B<csh>.  However, this
isn't the length of the array; it's the subscript of the last element,
which is a different value since there is ordinarily a 0th element.
Assigning to C<$#days> actually changes the length of the array.
Shortening an array this way destroys intervening values.  Lengthening
an array that was previously shortened does not recover values
that were in those elements.
X<$#> X<array, length>
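
The following sketch shows both effects of assigning to C<$#days>:
shortening discards elements, and lengthening again leaves the new
slots undefined.  The array contents are arbitrary:

    use strict;
    use warnings;

    my @days = qw(Mon Tue Wed Thu Fri);

    $#days = 2;                   # truncate: keeps indices 0..2
    print "@days\n";              # Mon Tue Wed

    $#days = 4;                   # lengthen again: 5 elements
    print scalar(@days), "\n";    # 5
    print defined $days[3] ? "kept" : "gone", "\n";    # gone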

You can also gain some minuscule measure of efficiency by pre-extending
an array that is going to get big.  You can also extend an array
by assigning to an element that is off the end of the array.  You
can truncate an array down to nothing by assigning the null list
() to it.  The following are equivalent:

    @whatever = ();
    $#whatever = -1;

If you evaluate an array in scalar context, it returns the length
of the array.  (Note that this is not true of lists, which return
the last value, like the C comma operator, nor of built-in functions,
which return whatever they feel like returning.)  The following is
always true:
X<array, length>

    scalar(@whatever) == $#whatever + 1;

Some programmers choose to use an explicit conversion so as to 
leave nothing to doubt:

    $element_count = scalar(@whatever);

If you evaluate a hash in scalar context, it returns false if the
hash is empty.  If there are any key/value pairs, it returns true.
A more precise definition is version dependent.

Prior to Perl 5.25 the value returned was a string consisting of the
number of used buckets and the number of allocated buckets, separated
by a slash.  This is pretty much useful only to find out whether
Perl's internal hashing algorithm is performing poorly on your data
set.  For example, you stick 10,000 things in a hash, but evaluating
%HASH in scalar context reveals C<"1/16">, which means only one out
of sixteen buckets has been touched, and presumably contains all
10,000 of your items.  This isn't supposed to happen.

As of Perl 5.25 the return was changed to be the count of keys in the
hash. If you need access to the old behavior you can use
C<Hash::Util::bucket_ratio()> instead.

If a tied hash is evaluated in scalar context, the C<SCALAR> method is
called (with a fallback to C<FIRSTKEY>).
X<hash, scalar context> X<hash, bucket> X<bucket>
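
If all you need is the number of keys, evaluating keys() in scalar
context does not depend on the version in use.  A brief sketch, with a
made-up C<%stock> hash:

    my %stock = (apples => 3, pears => 0);
    my $count = keys %stock;            # 2, whichever Perl 5 you run
    print "hash is not empty\n" if %stock;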

You can preallocate space for a hash by assigning to the keys() function.
This rounds up the allocated buckets to the next power of two:

    keys(%users) = 1000;		# allocate 1024 buckets

=head2 Scalar value constructors
X<scalar, literal> X<scalar, constant>

Numeric literals are specified in any of the following floating point or
integer formats:

 12345
 12345.67
 .23E-10             # a very small number
 3.14_15_92          # a very important number
 4_294_967_296       # underscore for legibility
 0xff                # hex
 0xdead_beef         # more hex
 0377                # octal (only numbers, begins with 0)
 0b011011            # binary
 0x1.999ap-4         # hexadecimal floating point (the 'p' is required)

You are allowed to use underscores (underbars) in numeric literals
between digits for legibility (but not multiple underscores in a row:
C<23__500> is not legal; C<23_500> is).
You could, for example, group binary
digits by threes (as for a Unix-style mode argument such as 0b110_100_100)
or by fours (to represent nibbles, as in 0b1010_0110) or in other groups.
X<number, literal>
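
As a short illustration (the variable names are arbitrary), the
underscores make no difference to the value of a literal, but they are
not recognized when a string is converted to a number:

    my $big   = 4_294_967_296;          # same as 4294967296
    my $mode  = 0b110_100_100;          # same as 0644
    my $magic = 0xdead_beef;            # same as 3735928559

    # In a string, the underscore ends the numeric part.
    my $from_string = do { no warnings 'numeric'; "1_000" + 0 };   # 1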

String literals are usually delimited by either single or double
quotes.  They work much like quotes in the standard Unix shells:
double-quoted string literals are subject to backslash and variable
substitution; single-quoted strings are not (except for C<\'> and
C<\\>).  The usual C-style backslash rules apply for making
characters such as newline, tab, etc., as well as some more exotic
forms.  See L<perlop/"Quote and Quote-like Operators"> for a list.
X<string, literal>

Hexadecimal, octal, or binary representations in string literals
(e.g. '0xff') are not automatically converted to their integer
representation.  The hex() and oct() functions make these conversions
for you.  See L<perlfunc/hex> and L<perlfunc/oct> for more details.
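
For instance, a minimal sketch of converting such strings (the
variable names are arbitrary):

    my $n1 = hex("0xff");           # 255
    my $n2 = hex("ff");             # 255; the prefix is optional
    my $n3 = oct("0377");           # 255
    my $n4 = oct("0xff");           # 255; oct() also understands 0x
    my $n5 = oct("0b11111111");     # 255; and 0b
    my $n6 = do { no warnings 'numeric'; "0xff" + 0 };  # 0, not 255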

Hexadecimal floating point can start just like a hexadecimal literal,
and it can be followed by an optional fractional hexadecimal part,
but it must be followed by C<p>, an optional sign, and a power of two.
The format is useful for accurately presenting floating point values,
avoiding conversions to or from decimal floating point, and therefore
avoiding possible loss in precision.  Notice that while most current
platforms use the 64-bit IEEE 754 floating point, not all do.  Another
potential source of (low-order) differences is the floating point
rounding mode, which can differ between CPUs, operating systems,
and compilers, and which Perl doesn't control.

You can also embed newlines directly in your strings, i.e., they can end
on a different line than they begin.  This is nice, but if you forget
your trailing quote, the error will not be reported until Perl finds
another line containing the quote character, which may be much further
on in the script.  Variable substitution inside strings is limited to
scalar variables, arrays, and array or hash slices.  (In other words,
names beginning with $ or @, followed by an optional bracketed
expression as a subscript.)  The following code segment prints out "The
price is $Z<>100."
X<interpolation>

    $Price = '$100';	# not interpolated
    print "The price is $Price.\n";	# interpolated

There is no double interpolation in Perl, so the C<$100> is left as is.

By default floating point numbers substituted inside strings use the
dot (".")  as the decimal separator.  If C<use locale> is in effect,
and POSIX::setlocale() has been called, the character used for the
decimal separator is affected by the LC_NUMERIC locale.
See L<perllocale> and L<POSIX>.

As in some shells, you can enclose the variable name in braces to
disambiguate it from following alphanumerics (and underscores).
You must also do
this when interpolating a variable into a string to separate the
variable name from a following double-colon or an apostrophe, since
these would be otherwise treated as a package separator:
X<interpolation>

    $who = "Larry";
    print PASSWD "${who}::0:0:Superuser:/:/bin/perl\n";
    print "We use ${who}speak when ${who}'s here.\n";

Without the braces, Perl would have looked for a $whospeak, a
C<$who::0>, and a C<$who's> variable.  The last two would be the
$0 and the $s variables in the (presumably) non-existent package
C<who>.

In fact, a simple identifier within such curlies is forced to be
a string, and likewise within a hash subscript.  Neither need
quoting.  Our earlier example, C<$days{'Feb'}> can be written as
C<$days{Feb}> and the quotes will be assumed automatically.  But
anything more complicated in the subscript will be interpreted as an
expression.  This means for example that C<$version{2.0}++> is
equivalent to C<$version{2}++>, not to C<$version{'2.0'}++>.

=head3 Special floating point: infinity (Inf) and not-a-number (NaN)

Floating point values include the special values C<Inf> and C<NaN>,
for infinity and not-a-number.  Infinity can also be negative.

Infinity is the result of certain math operations that overflow
the floating point range, like 9**9**9.  Not-a-number is the
result of an operation whose value is undefined or unrepresentable.
Note, though, that you cannot get C<NaN> from some common "undefined"
or "out-of-range" operations like dividing by zero, or square root of
a negative number, since Perl generates fatal errors for those.

The infinity and not-a-number have their own special arithmetic rules.
The general rule is that they are "contagious": C<Inf> plus one is
C<Inf>, and C<NaN> plus one is C<NaN>.  Where things get interesting
is when you combine infinities and not-a-numbers: C<Inf> minus C<Inf>
and C<Inf> divided by C<Inf> are C<NaN> (while C<Inf> plus C<Inf> is
C<Inf> and C<Inf> times C<Inf> is C<Inf>).  C<NaN> is also curious
in that it does not equal any number, I<including> itself:
C<NaN> != C<NaN>.

Perl doesn't understand C<Inf> and C<NaN> as numeric literals, but
you can have them as strings, and Perl will convert them as needed:
"Inf" + 1.  (You can, however, import them from the POSIX extension;
C<use POSIX qw(Inf NaN);> and then use them as literals.)

Note that on input (string to number) Perl accepts C<Inf> and C<NaN>
in many forms.   Case is ignored, and the Win32-specific forms like
C<1.#INF> are understood, but on output the values are normalized to
C<Inf> and C<NaN>.
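
A brief sketch of these rules, assuming a platform with IEEE 754
floating point (the variable names are arbitrary):

    my $inf = 9**9**9;          # overflows to infinity
    my $nan = $inf - $inf;      # not-a-number

    print "contagious\n"    if $inf + 1 == $inf;
    print "NaN != NaN\n"    if $nan != $nan;
    print "from a string\n" if "Inf" + 1 == $inf;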

=head3 Version Strings
X<version string> X<vstring> X<v-string>

A literal of the form C<v1.20.300.4000> is parsed as a string composed
of characters with the specified ordinals.  This form, known as
v-strings, provides an alternative, more readable way to construct
strings, rather than use the somewhat less readable interpolation form
C<"\x{1}\x{14}\x{12c}\x{fa0}">.  This is useful for representing
Unicode strings, and for comparing version "numbers" using the string
comparison operators, C<cmp>, C<gt>, C<lt> etc.  If there are two or
more dots in the literal, the leading C<v> may be omitted.

    print v9786;              # prints SMILEY, "\x{263a}"
    print v102.111.111;       # prints "foo"
    print 102.111.111;        # same

Such literals are accepted by both C<require> and C<use> for
doing a version check.  Note that using the v-strings for IPv4
addresses is not portable unless you also use the
inet_aton()/inet_ntoa() routines of the Socket package.

Note that since Perl 5.8.1 the single-number v-strings (like C<v65>)
are not v-strings before the C<< => >> operator (which is usually used
to separate a hash key from a hash value); instead they are interpreted
as literal strings ('v65').  They were v-strings from Perl 5.6.0 to
Perl 5.8.0, but that caused more confusion and breakage than good.
Multi-number v-strings like C<v65.66> and C<65.66.67> continue to
be v-strings always.
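
To illustrate why v-strings help when comparing versions, here is a
small sketch (the version numbers are arbitrary).  It also uses the
C<%vd> format of sprintf() to turn a v-string back into readable form:

    my $as_vstring = (v1.10.0  gt v1.9.0)  ? 1 : 0;   # 1: 10 sorts after 9
    my $as_string  = ("1.10.0" gt "1.9.0") ? 1 : 0;   # 0: "1" sorts before "9"
    my $readable   = sprintf("%vd", v1.22.333);       # "1.22.333"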

=head3 Special Literals
X<special literal> X<__END__> X<__DATA__> X<END> X<DATA>
X<end> X<data> X<^D> X<^Z>

The special literals __FILE__, __LINE__, and __PACKAGE__
represent the current filename, line number, and package name at that
point in your program.  __SUB__ gives a reference to the current
subroutine.  They may be used only as separate tokens; they
will not be interpolated into strings.  If there is no current package
(due to an empty C<package;> directive), __PACKAGE__ is the undefined
value.  (But the empty C<package;> is no longer supported, as of version
5.10.)  Outside of a subroutine, __SUB__ is the undefined value.  __SUB__
is only available in 5.16 or higher, and only with a C<use v5.16> or
C<use feature "current_sub"> declaration.
X<__FILE__> X<__LINE__> X<__PACKAGE__> X<__SUB__>
X<line> X<file> X<package>
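
As an illustration, here is a sketch that reports where it was
compiled; the package name C<My::Module> and the sub C<where> are made
up for the example:

    package My::Module;

    sub where {
        return sprintf "%s line %d, package %s",
            __FILE__, __LINE__, __PACKAGE__;
    }

    print where(), "\n";
    print "__PACKAGE__ is not interpolated\n";  # prints the text as written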

The two control characters ^D and ^Z, and the tokens __END__ and __DATA__
may be used to indicate the logical end of the script before the actual
end of file.  Any following text is ignored.

Text after __DATA__ may be read via the filehandle C<PACKNAME::DATA>,
where C<PACKNAME> is the package that was current when the __DATA__
token was encountered.  The filehandle is left open pointing to the
line after __DATA__.  The program should C<close DATA> when it is done
reading from it.  (Leaving it open leaks filehandles if the module is
reloaded for any reason, so it's a safer practice to close it.)  For
compatibility with older scripts written before __DATA__ was
introduced, __END__ behaves like __DATA__ in the top level script (but
not in files loaded with C<require> or C<do>) and leaves the remaining
contents of the file accessible via C<main::DATA>.

See L<SelfLoader> for more description of __DATA__, and
an example of its use.  Note that you cannot read from the DATA
filehandle in a BEGIN block: the BEGIN block is executed as soon
as it is seen (during compilation), at which point the corresponding
__DATA__ (or __END__) token has not yet been seen.

=head3 Barewords
X<bareword>

A word that has no other interpretation in the grammar will
be treated as if it were a quoted string.  These are known as
"barewords".  As with filehandles and labels, a bareword that consists
entirely of lowercase letters risks conflict with future reserved
words, and if you use the C<use warnings> pragma or the B<-w> switch, 
Perl will warn you about any such words.  Perl limits barewords (like
identifiers) to about 250 characters.  Future versions of Perl are likely
to eliminate these arbitrary limitations.

Some people may wish to outlaw barewords entirely.  If you
say

    use strict 'subs';

then any bareword that would NOT be interpreted as a subroutine call
produces a compile-time error instead.  The restriction lasts to the
end of the enclosing block.  An inner block may countermand this
by saying C<no strict 'subs'>.

=head3 Array Interpolation
X<array, interpolation> X<interpolation, array> X<$">

Arrays and slices are interpolated into double-quoted strings
by joining the elements with the delimiter specified in the C<$">
variable (C<$LIST_SEPARATOR> if "use English;" is specified), 
space by default.  The following are equivalent:

    $temp = join($", @ARGV);
    system "echo $temp";

    system "echo @ARGV";

Within search patterns (which also undergo double-quotish substitution)
there is an unfortunate ambiguity:  Is C</$foo[bar]/> to be interpreted as
C</${foo}[bar]/> (where C<[bar]> is a character class for the regular
expression) or as C</${foo[bar]}/> (where C<[bar]> is the subscript to array
@foo)?  If @foo doesn't otherwise exist, then it's obviously a
character class.  If @foo exists, Perl takes a good guess about C<[bar]>,
and is almost always right.  If it does guess wrong, or if you're just
plain paranoid, you can force the correct interpretation with curly
braces as above.

If you're looking for the information on how to use here-documents,
which used to be here, that's been moved to
L<perlop/Quote and Quote-like Operators>.

=head2 List value constructors
X<list>

List values are denoted by separating individual values by commas
(and enclosing the list in parentheses where precedence requires it):

    (LIST)

In a context not requiring a list value, the value of what appears
to be a list literal is simply the value of the final element, as
with the C comma operator.  For example,

    @foo = ('cc', '-E', $bar);

assigns the entire list value to array @foo, but

    $foo = ('cc', '-E', $bar);

assigns the value of variable $bar to the scalar variable $foo.
Note that the value of an actual array in scalar context is the
length of the array; the following assigns the value 3 to $foo:

    @foo = ('cc', '-E', $bar);
    $foo = @foo;                # $foo gets 3

You may have an optional comma before the closing parenthesis of a
list literal, so that you can say:

    @foo = (
        1,
        2,
        3,
    );

To use a here-document to assign an array, one line per element,
you might use an approach like this:

    @sauces = <<End_Lines =~ m/(\S.*\S)/g;
        normal tomato
        spicy tomato
        green chile
        pesto
        white wine
    End_Lines

LISTs do automatic interpolation of sublists.  That is, when a LIST is
evaluated, each element of the list is evaluated in list context, and
the resulting list value is interpolated into LIST just as if each
individual element were a member of LIST.  Thus arrays and hashes lose their
identity in a LIST--the list

    (@foo,@bar,&SomeSub,%glarch)

contains all the elements of @foo followed by all the elements of @bar,
followed by all the elements returned by the subroutine named SomeSub 
called in list context, followed by the key/value pairs of %glarch.
To make a list reference that does I<NOT> interpolate, see L<perlref>.
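
The difference between the two can be sketched like this (the variable
names and contents are arbitrary):

    my @foo    = (1, 2);
    my @bar    = (3);
    my %glarch = (k => 'v');

    my @flat   = (@foo, @bar, %glarch);     # (1, 2, 3, 'k', 'v')
    my @nested = (\@foo, \@bar, \%glarch);  # three references
    print scalar(@flat), " ", scalar(@nested), "\n";    # prints "5 3"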

The null list is represented by ().  Interpolating it in a list
has no effect.  Thus ((),(),()) is equivalent to ().  Similarly,
interpolating an array with no elements is the same as if no
array had been interpolated at that point.

This interpolation combines with the facts that the opening
and closing parentheses are optional (except when necessary for
precedence) and lists may end with an optional comma to mean that
multiple commas within lists are legal syntax.  The list C<1,,3> is a
concatenation of two lists, C<1,> and C<3>, the first of which ends
with that optional comma.  C<1,,3> is C<(1,),(3)> is C<1,3>.  (And
similarly, C<1,,,3> is C<(1,),(,),3> is C<1,3>, and so on.)  Not that
we'd advise you to use this obfuscation.

A list value may also be subscripted like a normal array.  You must
put the list in parentheses to avoid ambiguity.  For example:

    # Stat returns list value.
    $time = (stat($file))[8];

    # SYNTAX ERROR HERE.
    $time = stat($file)[8];  # OOPS, FORGOT PARENTHESES

    # Find a hex digit.
    $hexdigit = ('a','b','c','d','e','f')[$digit-10];

    # A "reverse comma operator".
    return (pop(@foo),pop(@foo))[0];

Lists may be assigned to only when each element of the list
is itself legal to assign to:

    ($a, $b, $c) = (1, 2, 3);

    ($map{'red'}, $map{'blue'}, $map{'green'}) = (0x00f, 0x0f0, 0xf00);

An exception to this is that you may assign to C<undef> in a list.
This is useful for throwing away some of the return values of a
function:

    ($dev, $ino, undef, undef, $uid, $gid) = stat($file);

As of Perl 5.22, you can also use C<(undef)x2> instead of C<undef, undef>.
(You can also do C<($x) x 2>, which is less useful, because it assigns to
the same variable twice, clobbering the first value assigned.)

List assignment in scalar context returns the number of elements
produced by the expression on the right side of the assignment:

    $x = (($foo,$bar) = (3,2,1));       # set $x to 3, not 2
    $x = (($foo,$bar) = f());           # set $x to f()'s return count

This is handy when you want to do a list assignment in a Boolean
context, because most list functions return a null list when finished,
which when assigned produces a 0, which is interpreted as FALSE.

It's also the source of a useful idiom for executing a function or
performing an operation in list context and then counting the number of
return values, by assigning to an empty list and then using that
assignment in scalar context.  For example, this code:

    $count = () = $string =~ /\d+/g;

will place into $count the number of digit groups found in $string.
This happens because the pattern match is in list context (since it
is being assigned to the empty list), and will therefore return a list
of all matching parts of the string.  The list assignment in scalar
context will translate that into the number of elements (here, the
number of times the pattern matched) and assign that to $count.  Note
that simply using

    $count = $string =~ /\d+/g;

would not have worked, since a pattern match in scalar context will
only return true or false, rather than a count of matches.

The final element of a list assignment may be an array or a hash:

    ($a, $b, @rest) = split;
    my($a, $b, %rest) = @_;

You can actually put an array or hash anywhere in the list, but the first one
in the list will soak up all the values, and anything after it will become
undefined.  This may be useful in a my() or local().
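
A small sketch of that soaking-up behavior (the values are arbitrary):

    my ($first, @middle, $last) = (1, 2, 3, 4);
    # $first is 1, @middle is (2, 3, 4), $last is undef

    # To peel a value off the end, take it from the array afterward:
    $last = pop @middle;        # $last is 4, @middle is (2, 3)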

A hash can be initialized using a literal list holding pairs of
items to be interpreted as a key and a value:

    # same as map assignment above
    %map = ('red',0x00f,'blue',0x0f0,'green',0xf00);

While literal lists and named arrays are often interchangeable, that's
not the case for hashes.  Just because you can subscript a list value like
a normal array does not mean that you can subscript a list value as a
hash.  Likewise, hashes included as parts of other lists (including
parameters lists and return lists from functions) always flatten out into
key/value pairs.  That's why it's good to use references sometimes.

It is often more readable to use the C<< => >> operator between key/value
pairs.  The C<< => >> operator is mostly just a more visually distinctive
synonym for a comma, but it also arranges for its left-hand operand to be
interpreted as a string if it's a bareword that would be a legal simple
identifier.  C<< => >> doesn't quote compound identifiers that contain
double colons.  This makes it nice for initializing hashes:

    %map = (
                 red   => 0x00f,
                 blue  => 0x0f0,
                 green => 0xf00,
    );

or for initializing hash references to be used as records:

    $rec = {
                witch => 'Mable the Merciless',
                cat   => 'Fluffy the Ferocious',
                date  => '10/31/1776',
    };

or for using call-by-named-parameter to complicated functions:

   $field = $query->radio_group(
               name      => 'group_name',
               values    => ['eenie','meenie','minie'],
               default   => 'meenie',
               linebreak => 'true',
               labels    => \%labels
   );

Note that just because a hash is initialized in that order doesn't
mean that it comes out in that order.  See L<perlfunc/sort> for examples
of how to arrange for an output ordering.

If a key appears more than once in the initializer list of a hash, the last
occurrence wins:

    %circle = (
                  center => [5, 10],
                  center => [27, 9],
                  radius => 100,
                  color => [0xDF, 0xFF, 0x00],
                  radius => 54,
    );

    # same as
    %circle = (
                  center => [27, 9],
                  color => [0xDF, 0xFF, 0x00],
                  radius => 54,
    );

This can be used to provide overridable configuration defaults:

    # values in %args take priority over %config_defaults
    %config = (%config_defaults, %args);
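
Inside a function taking named parameters, this might look like the
following sketch; C<connect_to> and its keys are hypothetical:

    sub connect_to {
        my %args = @_;
        my %config_defaults = (host => 'localhost', port => 5432);
        my %config = (%config_defaults, %args);
        return "$config{host}:$config{port}";
    }

    print connect_to(port => 6543), "\n";   # prints "localhost:6543"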

=head2 Subscripts

An array can be accessed one scalar at a
time by specifying a dollar sign (C<$>), then the
name of the array (without the leading C<@>), then the subscript inside
square brackets.  For example:

    @myarray = (5, 50, 500, 5000);
    print "The Third Element is ", $myarray[2], "\n";

The array indices start with 0.  A negative subscript retrieves its 
value from the end.  In our example, C<$myarray[-1]> would have been 
5000, and C<$myarray[-2]> would have been 500.

Hash subscripts are similar, only instead of square brackets curly brackets
are used.  For example:

    %scientists = 
    (
        "Newton" => "Isaac",
        "Einstein" => "Albert",
        "Darwin" => "Charles",
        "Feynman" => "Richard",
    );

    print "Darwin's First Name is ", $scientists{"Darwin"}, "\n";

You can also subscript a list to get a single element from it:

    $dir = (getpwnam("daemon"))[7];

=head2 Multi-dimensional array emulation

Multidimensional arrays may be emulated by subscripting a hash with a
list.  The elements of the list are joined with the subscript separator
(see L<perlvar/$;>).

    $foo{$a,$b,$c}

is equivalent to

    $foo{join($;, $a, $b, $c)}

The default subscript separator is "\034", the same as SUBSEP in B<awk>.
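
As a sketch, the individual subscripts can be recovered by splitting
the key on the separator; this assumes none of the subscripts
themselves contain it.  The C<%grid> hash is made up for the example:

    my %grid;
    $grid{2,3} = 'treasure';

    for my $key (keys %grid) {
        my $sep = $;;                           # "\034" unless changed
        my ($row, $col) = split /\Q$sep\E/, $key;
        print "($row,$col) holds $grid{$key}\n";
    }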

=head2 Slices
X<slice> X<array, slice> X<hash, slice>

A slice accesses several elements of a list, an array, or a hash
simultaneously using a list of subscripts.  It's more convenient
than writing out the individual elements as a list of separate
scalar values.

    ($him, $her)   = @folks[0,-1];              # array slice
    @them          = @folks[0 .. 3];            # array slice
    ($who, $home)  = @ENV{"USER", "HOME"};      # hash slice
    ($uid, $dir)   = (getpwnam("daemon"))[2,7]; # list slice

Since you can assign to a list of variables, you can also assign to
an array or hash slice.

    @days[3..5]    = qw/Wed Thu Fri/;
    @colors{'red','blue','green'} 
                   = (0xff0000, 0x0000ff, 0x00ff00);
    @folks[0, -1]  = @folks[-1, 0];

The previous assignments are exactly equivalent to

    ($days[3], $days[4], $days[5]) = qw/Wed Thu Fri/;
    ($colors{'red'}, $colors{'blue'}, $colors{'green'})
                   = (0xff0000, 0x0000ff, 0x00ff00);
    ($folks[0], $folks[-1]) = ($folks[-1], $folks[0]);

Since changing a slice changes the original array or hash that it's
slicing, a C<foreach> construct will alter some--or even all--of the
values of the array or hash.

    foreach (@array[ 4 .. 10 ]) { s/peter/paul/ } 

    foreach (@hash{qw[key1 key2]}) {
        s/^\s+//;           # trim leading whitespace
        s/\s+$//;           # trim trailing whitespace
        s/(\w+)/\u\L$1/g;   # "titlecase" words
    }
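
A self-contained sketch showing that only the sliced elements are
affected (the data is arbitrary):

    my @names = qw(peter peter peter);
    s/peter/paul/ foreach @names[1 .. 2];
    print "@names\n";           # prints "peter paul paul"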

As a special exception, when you slice a list (but not an array or a hash),
if the list evaluates to empty, then taking a slice of that empty list will
always yield the empty list in turn.  Thus:

    @a = ()[0,1];          # @a has no elements
    @b = (@a)[0,1];        # @b has no elements
    @c = (sub{}->())[0,1]; # @c has no elements
    @d = ('a','b')[0,1];   # @d has two elements
    @e = (@d)[0,1,8,9];    # @e has four elements
    @f = (@d)[8,9];        # @f has two elements

This makes it easy to write loops that terminate when a null list
is returned:

    while ( ($home, $user) = (getpwent)[7,0] ) {
        printf "%-8s %s\n", $user, $home;
    }

As noted earlier in this document, the scalar sense of list assignment
is the number of elements on the right-hand side of the assignment.
The null list contains no elements, so when the password file is
exhausted, the result is 0, not 2.

Slices in scalar context return the last item of the slice.

    @a = qw/first second third/;
    %h = (first => 'A', second => 'B');
    $t = @a[0, 1];                  # $t is now 'second'
    $u = @h{'first', 'second'};     # $u is now 'B'

If you're confused about why you use an '@' there on a hash slice
instead of a '%', think of it like this.  The type of bracket (square
or curly) governs whether it's an array or a hash being looked at.
On the other hand, the leading symbol ('$' or '@') on the array or
hash indicates whether you are getting back a singular value (a
scalar) or a plural one (a list).

=head3 Key/Value Hash Slices

Starting in Perl 5.20, a hash slice operation
with the % symbol is a variant of the slice operation
that returns a list of key/value pairs rather than just values:

    %h = (blonk => 2, foo => 3, squink => 5, bar => 8);
    %subset = %h{'foo', 'bar'}; # key/value hash slice
    # %subset is now (foo => 3, bar => 8)

However, the result of such a slice cannot be localized, deleted or used
in assignment.  These are otherwise very much consistent with hash slices
using the @ symbol.

=head3 Index/Value Array Slices

Similar to key/value hash slices (and also introduced
in Perl 5.20), the % array slice syntax returns a list
of index/value pairs:

    @a = "a".."z";
    @list = %a[3,4,6];
    # @list is now (3, "d", 4, "e", 6, "g")

=head2 Typeglobs and Filehandles
X<typeglob> X<filehandle> X<*>

Perl uses an internal type called a I<typeglob> to hold an entire
symbol table entry.  The type prefix of a typeglob is a C<*>, because
it represents all types.  This used to be the preferred way to
pass arrays and hashes by reference into a function, but now that
we have real references, this is seldom needed.  

The main use of typeglobs in modern Perl is to create symbol table aliases.
This assignment:

    *this = *that;

makes $this an alias for $that, @this an alias for @that, %this an alias
for %that, &this an alias for &that, etc.  Much safer is to use a reference.
This:

    local *Here::blue = \$There::green;

temporarily makes $Here::blue an alias for $There::green, but doesn't
make @Here::blue an alias for @There::green, or %Here::blue an alias for
%There::green, etc.  See L<perlmod/"Symbol Tables"> for more examples
of this.  Strange though this may seem, this is the basis for the whole
module import/export system.
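
A sketch of the reference form at work, reusing the package names from
the example above; the variable contents and the sub C<peek> are made
up:

    package There;
    our $green = 'emerald';
    our @green = (1, 2, 3);

    package Here;
    our ($blue, @blue);

    sub peek {
        local *Here::blue = \$There::green;  # alias the scalar slot only
        $blue = uc $blue;                    # really modifies $There::green
        return scalar(@blue);                # @Here::blue is not aliased
    }

    my $n = peek();
    print "$There::green, $n\n";             # prints "EMERALD, 0"

Once peek() returns, the local() is undone and C<$Here::blue> no
longer refers to C<$There::green>.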

Another use for typeglobs is to pass filehandles into a function or
to create new filehandles.  If you need to use a typeglob to save away
a filehandle, do it this way:

    $fh = *STDOUT;

or perhaps as a real reference, like this:

    $fh = \*STDOUT;

See L<perlsub> for examples of using these as indirect filehandles
in functions.

Typeglobs are also a way to create a local filehandle using the local()
operator.  These last until their block is exited, but may be passed back.
For example:

    sub newopen {
        my $path = shift;
        local  *FH;  # not my!
        open   (FH, $path)          or  return undef;
        return *FH;
    }
    $fh = newopen('/etc/passwd');

Now that we have the C<*foo{THING}> notation, typeglobs aren't used as much
for filehandle manipulations, although they're still needed to pass brand
new file and directory handles into or out of functions.  That's because
C<*HANDLE{IO}> only works if HANDLE has already been used as a handle.
In other words, C<*FH> must be used to create new symbol table entries;
C<*foo{THING}> cannot.  When in doubt, use C<*FH>.

All functions that are capable of creating filehandles (open(),
opendir(), pipe(), socketpair(), sysopen(), socket(), and accept())
automatically create an anonymous filehandle if the handle passed to
them is an uninitialized scalar variable.  This allows the constructs
such as C<open(my $fh, ...)> and C<open(local $fh,...)> to be used to
create filehandles that will conveniently be closed automatically when
the scope ends, provided there are no other references to them.  This
largely eliminates the need for typeglobs when opening filehandles
that must be passed around, as in the following example:

    sub myopen {
        open my $fh, "@_"
             or die "Can't open '@_': $!";
        return $fh;
    }

    {
        my $f = myopen("</etc/motd");
        print <$f>;
        # $f implicitly closed here
    }
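
The same pattern works with the three-argument form of open().  The
following sketch reads from an in-memory string so that it does not
depend on any file being present; C<open_text> is a hypothetical
helper:

    sub open_text {
        my $text = shift;
        open my $fh, '<', \$text
            or die "Can't open in-memory file: $!";
        return $fh;
    }

    {
        my $fh = open_text("one\ntwo\n");
        my @lines = <$fh>;              # ("one\n", "two\n")
        # $fh implicitly closed at the end of this block
    }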

Note that if an initialized scalar variable is used instead, the
result is different: C<my $fh='zzz'; open($fh, ...)> is equivalent
to C<open( *{'zzz'}, ...)>.
C<use strict 'refs'> forbids such practice.

Another way to create anonymous filehandles is with the Symbol
module or with the IO::Handle module and its ilk.  These modules
have the advantage of not hiding different types of the same name
during the local().  See the bottom of L<perlfunc/open> for an
example.

=head1 SEE ALSO

See L<perlvar> for a description of Perl's built-in variables and
a discussion of legal variable names.  See L<perlref>, L<perlsub>,
and L<perlmod/"Symbol Tables"> for more discussion on typeglobs and
the C<*foo{THING}> syntax.

=head1 NAME

perllexwarn - Perl Lexical Warnings

=head1 DESCRIPTION

Perl v5.6.0 introduced lexical control over the handling of warnings by
category.  The C<warnings> pragma generally replaces the command line flag
B<-w>.  Documentation on the use of lexical warnings, once partly found in
this document, is now found in the L<warnings> documentation.

=encoding utf8

=head1 NAME

perlpolicy - Various and sundry policies and commitments related to the Perl core

=head1 DESCRIPTION

This document is the master document which records all written
policies about how the Perl 5 Porters collectively develop and maintain
the Perl core.

=head1 GOVERNANCE

=head2 Perl 5 Porters

Subscribers to perl5-porters (the porters themselves) come in several flavours.
Some are quiet curious lurkers, who rarely pitch in and instead watch
the ongoing development to ensure they're forewarned of new changes or
features in Perl.  Some are representatives of vendors, who are there
to make sure that Perl continues to compile and work on their
platforms.  Some patch any reported bug that they know how to fix,
some are actively patching their pet area (threads, Win32, the regexp
engine), while others seem to do nothing but complain.  In other
words, it's your usual mix of technical people.

Over this group of porters presides Larry Wall.  He has the final word
in what does and does not change in any of the Perl programming languages.
These days, Larry spends most of his time on Perl 6, while Perl 5 is
shepherded by a "pumpking", a porter responsible for deciding what
goes into each release and ensuring that releases happen on a regular
basis.

Larry sees Perl development along the lines of the US government:
there's the Legislature (the porters), the Executive branch (the
pumpking), and the Supreme Court (Larry).  The legislature can
discuss and submit patches to the executive branch all they like, but
the executive branch is free to veto them.  Rarely, the Supreme Court
will side with the executive branch over the legislature, or the
legislature over the executive branch.  Mostly, however, the
legislature and the executive branch are supposed to get along and
work out their differences without impeachment or court cases.

You might sometimes see reference to Rule 1 and Rule 2.  Larry's power
as Supreme Court is expressed in The Rules:

=over 4

=item 1

Larry is always by definition right about how Perl should behave.
This means he has final veto power on the core functionality.

=item 2

Larry is allowed to change his mind about any matter at a later date,
regardless of whether he previously invoked Rule 1.

=back

Got that?  Larry is always right, even when he was wrong.  It's rare
to see either Rule exercised, but they are often alluded to.

=head1 MAINTENANCE AND SUPPORT

Perl 5 is developed by a community, not a corporate entity. Every change
contributed to the Perl core is the result of a donation. Typically, these
donations are contributions of code or time by individual members of our
community. On occasion, these donations come in the form of corporate
or organizational sponsorship of a particular individual or project.

As a volunteer organization, the commitments we make are heavily dependent
on the goodwill and hard work of individuals who have no obligation to
contribute to Perl.

That being said, we value Perl's stability and security and have long
had an unwritten covenant with the broader Perl community to support
and maintain releases of Perl.

This document codifies the support and maintenance commitments that
the Perl community should expect from Perl's developers:

=over

=item *

We "officially" support the two most recent stable release series.  5.20.x
and earlier are now out of support.  As of the release of 5.26.0, we will
"officially" end support for Perl 5.22.x, other than providing security
updates as described below.

=item *

To the best of our ability, we will attempt to fix critical issues
in the two most recent stable 5.x release series.  Fixes for the
current release series take precedence over fixes for the previous
release series.

=item *

To the best of our ability, we will provide "critical" security patches
/ releases for any major version of Perl whose 5.x.0 release was within
the past three years.  We can only commit to providing these for the
most recent .y release in any 5.x.y series.

=item *

We will not provide security updates or bug fixes for development
releases of Perl.

=item *

We encourage vendors to ship the most recent supported release of
Perl at the time of their code freeze.

=item *

As a vendor, you may have a requirement to backport security fixes
beyond our three-year support commitment.  We can provide limited support and
advice to you as you do so and, where possible, will try to apply
those patches to the relevant -maint branches in git, though we may or
may not choose to make numbered releases or "official" patches
available. See L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION>
for details on how to begin that process.

=back
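
As a rough illustration of the first two commitments, the sketch below
derives the two supported stable series from the minor number of the latest
stable release.  It assumes the long-standing convention that stable series
have even minor numbers.  The helper name C<supported_series> is hypothetical
and is not part of any Perl tool.

```perl

    use strict;
    use warnings;

    # Hypothetical helper: given the minor number of the latest stable
    # release (24 for 5.24.x), return the two supported release series.
    sub supported_series {
        my ($latest_minor) = @_;
        die "stable series have even minor numbers\n" if $latest_minor % 2;
        return map { "5.$_" } ($latest_minor, $latest_minor - 2);
    }

    my @supported = supported_series(24);
    print "supported: @supported\n";

```

With 5.24 as the latest stable series this lists 5.24 and 5.22, which matches
the statement that 5.20.x and earlier are out of support.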

=head1 BACKWARD COMPATIBILITY AND DEPRECATION

Our community has a long-held belief that backward-compatibility is a
virtue, even when the functionality in question is a design flaw.

We would all love to unmake some mistakes we've made over the past
decades.  Living with every design error we've ever made can lead
to painful stagnation.  Unwinding our mistakes is very, very
difficult.  Doing so without actively harming our users is
nearly impossible.

Lately, ignoring or actively opposing compatibility with earlier versions
of Perl has come into vogue.  Sometimes, a change is proposed which
wants to usurp syntax which previously had another meaning.  Sometimes,
a change wants to improve previously-crazy semantics.

Down this road lies madness.

Requiring end-user programmers to change just a few language constructs,
even language constructs which no well-educated developer would ever
intentionally use, is tantamount to saying "you should not upgrade to
a new release of Perl unless you have 100% test coverage and can do a
full manual audit of your codebase."  If we were to have tools capable of
reliably upgrading Perl source code from one version of Perl to another,
this concern could be significantly mitigated.

We want to ensure that Perl continues to grow and flourish in the coming
years and decades, but not at the expense of our user community.

Existing syntax and semantics should only be marked for destruction in
very limited circumstances.  If they are believed to be very rarely used,
stand in the way of actual improvement to the Perl language or perl
interpreter, and if affected code can be easily updated to continue
working, they may be considered for removal.  When in doubt, caution
dictates that we will favor backward compatibility.  When a feature is
deprecated, a statement of reasoning describing the decision process
will be posted, and a link to it will be provided in the relevant
perldelta documents.

Using a lexical pragma to enable or disable legacy behavior should be
considered when appropriate, and in the absence of any pragma legacy
behavior should be enabled.  Which backward-incompatible changes are
controlled implicitly by a 'use v5.x.y' is a decision which should be
made by the pumpking in consultation with the community.
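
The C<say> keyword is one existing instance of this approach, and the
following minimal sketch uses it for illustration.  Without any pragma the
legacy parse applies and C<say> is not a keyword.  Inside the lexical scope
of C<use v5.10> it is.  The helper name C<compiles> is hypothetical, and it
only checks whether the parser accepts a snippet under C<use strict>.

```perl

    use strict;
    use warnings;

    # Return true if the source compiles as the body of an anonymous sub.
    # The sub is never called, so nothing in the snippet is executed.
    sub compiles {
        my ($source) = @_;
        local $@;
        my $code = eval "sub { $source }";
        return defined $code ? 1 : 0;
    }

    my $legacy = compiles(q{ say 'hello' });             # no pragma in scope
    my $modern = compiles(q{ use v5.10; say 'hello' });  # feature enabled

    print "without pragma: $legacy, with use v5.10: $modern\n";

```

Because the pragma is lexical, code elsewhere in the same program that never
asked for the new behavior keeps the old parse.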

Historically, we've held ourselves to a far higher standard than
backward-compatibility -- bugward-compatibility.  Any accident of
implementation or unintentional side-effect of running some bit of code
has been considered to be a feature of the language to be defended with
the same zeal as any other feature or functionality.  No matter how
frustrating these unintentional features may be to us as we continue
to improve Perl, these unintentional features often deserve our
protection.  It is very important that existing software written in
Perl continue to work correctly.  If end-user developers have adopted a
bug as a feature, we need to treat it as such.

New syntax and semantics which don't break existing language constructs
and syntax have a much lower bar.  They merely need to prove themselves
to be useful, elegant, well designed, and well tested.  In most cases,
these additions will be marked as I<experimental> for some time.  See
below for more on that.

=head2 Terminology

To make sure we're talking about the same thing when we discuss the removal
of features or functionality from the Perl core, we have specific definitions
for a few words and phrases.

=over

=item experimental

If something in the Perl core is marked as B<experimental>, we may change
its behaviour, deprecate or remove it without notice. While we'll always
do our best to smooth the transition path for users of experimental
features, you should contact the perl5-porters mailing list if you find
an experimental feature useful and want to help shape its future.

Experimental features must be experimental in two stable releases before being
marked non-experimental.  Experimental features will only have their
experimental status revoked when they no longer have any design-changing bugs
open against them and when they have remained unchanged in behavior for the
entire length of a development cycle.  In other words, a feature present in
v5.20.0 may be marked no longer experimental in v5.22.0 if and only if its
behavior is unchanged throughout all of v5.21.

=item deprecated

If something in the Perl core is marked as B<deprecated>, we may remove it
from the core in the future, though we might not.  Generally, backward
incompatible changes will have deprecation warnings for two release
cycles before being removed, but may be removed after just one cycle if
the risk seems quite low or the benefits quite high.

As of Perl 5.12, deprecated features and modules warn the user as they're
used.  When a module is deprecated, it will also be made available on CPAN.
Installing it from CPAN will silence deprecation warnings for that module.

If you use a deprecated feature or module and believe that its removal from
the Perl core would be a mistake, please contact the perl5-porters
mailing list and plead your case.  We don't deprecate things without a good
reason, but sometimes there's a counterargument we haven't considered.
Historically, we did not distinguish between "deprecated" and "discouraged"
features.

=item discouraged

From time to time, we may mark language constructs and features which we
consider to have been mistakes as B<discouraged>.  Discouraged features
aren't currently candidates for removal, but
we may later deprecate them if they're found to stand in the way of a
significant improvement to the Perl core.

=item removed

Once a feature, construct or module has been marked as deprecated, we
may remove it from the Perl core.  Unsurprisingly,
we say we've B<removed> these things.  When a module is removed, it will
no longer ship with Perl, but will continue to be available on CPAN.

=back
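
Deprecation warnings belong to the C<deprecated> warnings category, which
callers can switch off lexically.  The sketch below shows a module emitting
its own warning in that category with C<warnings::warnif>.  The package
C<Legacy::Util> and its functions are hypothetical and exist only for
illustration.

```perl

    use strict;
    use warnings;

    {
        package Legacy::Util;

        # Hypothetical deprecated entry point that forwards to its
        # replacement.  The warning is issued only if the caller has the
        # 'deprecated' category enabled.
        sub old_sum {
            warnings::warnif('deprecated',
                'old_sum is deprecated, use new_sum instead');
            return new_sum(@_);
        }

        sub new_sum {
            my $total = 0;
            $total += $_ for @_;
            return $total;
        }
    }

    # Count the warnings raised while running a piece of code.
    sub count_warnings {
        my ($code) = @_;
        my $count = 0;
        local $SIG{__WARN__} = sub { $count++ };
        $code->();
        return $count;
    }

    my $loud  = count_warnings(sub { Legacy::Util::old_sum(1, 2) });
    my $quiet = count_warnings(sub {
        no warnings 'deprecated';
        Legacy::Util::old_sum(1, 2);
    });

    print "warnings with category on: $loud, off: $quiet\n";

```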

=head1 MAINTENANCE BRANCHES

New releases of maintenance branches should only contain changes that fall into
one of the "acceptable" categories set out below, but must not contain any
changes that fall into one of the "unacceptable" categories.  (For example, a
fix for a crashing bug must not be included if it breaks binary compatibility.)

It is not necessary to include every change meeting these criteria, and in
general the focus should be on addressing security issues, crashing bugs,
regressions and serious installation issues.  The temptation to include a
plethora of minor changes that don't affect the installation or execution of
perl (e.g. spelling corrections in documentation) should be resisted in order
to reduce the overall risk of overlooking something.  The intention is to
create maintenance releases that are both worthwhile and stable enough for
users to have full confidence in them.  (A secondary concern is to avoid burning
out the maint-pumpking or overwhelming other committers voting on changes to be
included (see L</"Getting changes into a maint branch"> below).)

The following types of change may be considered acceptable, as long as they do
not also fall into any of the "unacceptable" categories set out below:

=over

=item *

Patches that fix CVEs or security issues.  These changes should
be passed using the security reporting mechanism rather than applied
directly; see L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION>.

=item *

Patches that fix crashing bugs, assertion failures and
memory corruption but which do not otherwise change perl's
functionality or negatively impact performance.

=item *

Patches that fix regressions in perl's behavior relative to previous
releases, no matter how old the regression, since some people may
upgrade from very old versions of perl to the latest version.

=item *

Patches that fix bugs in features that were new in the corresponding 5.x.0
stable release.

=item *

Patches that fix anything which prevents or seriously impacts the build
or installation of perl.

=item *

Portability fixes, such as changes to Configure and the files in
the hints/ folder.

=item *

Minimal patches that fix platform-specific test failures.

=item *

Documentation updates that correct factual errors, explain significant
bugs or deficiencies in the current implementation, or fix broken markup.

=item *

Updates to dual-life modules should consist of minimal patches to
fix crashing bugs or security issues (as above).  Any changes made to
dual-life modules for which CPAN is canonical should be coordinated with
the upstream author.

=back

The following types of change are NOT acceptable:

=over

=item *

Patches that break binary compatibility.  (Please talk to a pumpking.)

=item *

Patches that add or remove features.

=item *

Patches that add new warnings or errors or deprecate features.

=item *

Ports of Perl to a new platform, architecture or OS release that
involve changes to the implementation.

=item *

New versions of dual-life modules should NOT be imported into maint.
Those belong in the next stable series.

=back

If there is any question about whether a given patch might merit
inclusion in a maint release, then it almost certainly should not
be included.

=head2 Getting changes into a maint branch

Historically, only the pumpking cherry-picked changes from bleadperl
into maintperl.  This has scaling problems.  At the same time,
maintenance branches of stable versions of Perl need to be treated with
great care. To that end, as of Perl 5.12, we have a new process for
maint branches.

Any committer may cherry-pick any commit from blead to a maint branch if
they send mail to perl5-porters announcing their intent to cherry-pick
a specific commit along with a rationale for doing so and at least two
other committers respond to the list giving their assent. (This policy
applies to current and former pumpkings, as well as other committers.)
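
The approval rule can be stated as a small check.  This is only a sketch of
the counting logic: the function name C<cherry_pick_approved> and the
committer names are hypothetical, and the real process happens on the
mailing list, not in code.

```perl

    use strict;
    use warnings;

    # Hypothetical check: a cherry-pick needs assent from at least two
    # committers other than the one who proposed it.  Duplicate replies
    # from the same committer count once.
    sub cherry_pick_approved {
        my ($proposer, @assenting) = @_;
        my %others = map { $_ => 1 } grep { $_ ne $proposer } @assenting;
        return keys(%others) >= 2 ? 1 : 0;
    }

    print cherry_pick_approved('alice', 'bob', 'carol'), "\n";  # two others
    print cherry_pick_approved('alice', 'alice', 'bob'), "\n";  # one other

```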

Other voting mechanisms may be used instead, as long as the same number of
votes is gathered in a transparent manner.  Specifically, proposals of
which changes to cherry-pick must be visible to everyone on perl5-porters
so that the views of everyone interested may be heard.

It is not necessary for voting to be held on cherry-picking perldelta
entries associated with changes that have already been cherry-picked, nor
for the maint-pumpking to obtain votes on changes required by the
F<Porting/release_managers_guide.pod> where such changes can be applied by
the means of cherry-picking from blead.

=head1 CONTRIBUTED MODULES


=head2 A Social Contract about Artistic Control

What follows is a statement about artistic control, defined as the ability
of authors of packages to guide the future of their code and maintain
control over their work.  It is a recognition that authors should have
control over their work, and that it is a responsibility of the rest of
the Perl community to ensure that they retain this control.  It is an
attempt to document the standards to which we, as Perl developers, intend
to hold ourselves.  It is an attempt to write down rough guidelines about
the respect we owe each other as Perl developers.

This statement is not a legal contract.  This statement is not a legal
document in any way, shape, or form.  Perl is distributed under the GNU
General Public License and under the Artistic License; those are the precise legal
terms.  This statement isn't about the law or licenses.  It's about
community, mutual respect, trust, and good-faith cooperation.

We recognize that the Perl core, defined as the software distributed with
the heart of Perl itself, is a joint project on the part of all of us.
From time to time, a script, module, or set of modules (hereafter referred
to simply as a "module") will prove so widely useful and/or so integral to
the correct functioning of Perl itself that it should be distributed with
the Perl core.  This should never be done without the author's explicit
consent, and a clear recognition on all parts that this means the module
is being distributed under the same terms as Perl itself.  A module author
should realize that inclusion of a module into the Perl core will
necessarily mean some loss of control over it, since changes may
occasionally have to be made on short notice or for consistency with the
rest of Perl.

Once a module has been included in the Perl core, however, everyone
involved in maintaining Perl should be aware that the module is still the
property of the original author unless the original author explicitly
gives up their ownership of it.  In particular:

=over

=item *

The version of the module in the Perl core should still be considered the
work of the original author.  All patches, bug reports, and so
forth should be fed back to them.  Their development directions
should be respected whenever possible.

=item *

Patches may be applied by the pumpkin holder without the explicit
cooperation of the module author if and only if they are very minor,
time-critical in some fashion (such as urgent security fixes), or if
the module author cannot be reached.  Those patches must still be
given back to the author when possible, and if the author decides on
an alternate fix in their version, that fix should be strongly
preferred unless there is a serious problem with it.  Any changes not
endorsed by the author should be marked as such, and the contributor
of the change acknowledged.

=item *

The version of the module distributed with Perl should, whenever
possible, be the latest version of the module as distributed by the
author (the latest non-beta version in the case of public Perl
releases), although the pumpkin holder may hold off on upgrading the
version of the module distributed with Perl to the latest version
until the latest version has had sufficient testing.

=back

In other words, the author of a module should be considered to have final
say on modifications to their module whenever possible (bearing in mind
that it's expected that everyone involved will work together and arrive at
reasonable compromises when there are disagreements).

As a last resort, however:


If the author's vision of the future of their module is sufficiently
different from the vision of the pumpkin holder and perl5-porters as a
whole so as to cause serious problems for Perl, the pumpkin holder may
choose to formally fork the version of the module in the Perl core from the
one maintained by the author.  This should not be done lightly and
should B<always> if at all possible be done only after direct input
from Larry.  If this is done, it must then be made explicit in the
module as distributed with the Perl core that it is a forked version and
that while it is based on the original author's work, it is no longer
maintained by them.  This must be noted in both the documentation and
in the comments in the source of the module.

Again, this should be a last resort only.  Ideally, this should never
happen, and every possible effort at cooperation and compromise should be
made before doing this.  If it does prove necessary to fork a module for
the overall health of Perl, proper credit must be given to the original
author in perpetuity and the decision should be constantly re-evaluated to
see if a remerging of the two branches is possible down the road.

In all dealings with contributed modules, everyone maintaining Perl should
keep in mind that the code belongs to the original author, that they may
not be on perl5-porters at any given time, and that a patch is not
official unless it has been integrated into the author's copy of the
module.  To aid with this, and with the three bulleted points above, contact
information for the authors of all contributed modules should be kept with
the Perl distribution.

Finally, the Perl community as a whole recognizes that respect for
ownership of code, respect for artistic control, proper credit, and active
effort to prevent unintentional code skew or communication gaps are vital
to the health of the community and Perl itself.  Members of a community
should not normally have to resort to rules and laws to deal with each
other, and this document, although it contains rules so as to be clear, is
about an attitude and general approach.  The first step in any dispute
should be open communication, respect for opposing views, and an attempt
at a compromise.  In nearly every circumstance nothing more will be
necessary, and certainly no more drastic measure should be used until
every avenue of communication and discussion has failed.


=head1 DOCUMENTATION

Perl's documentation is an important resource for our users. It's
incredibly important for Perl's documentation to be reasonably coherent
and to accurately reflect the current implementation.

Just as P5P collectively maintains the codebase, we collectively
maintain the documentation.  Writing a particular bit of documentation
doesn't give an author control of the future of that documentation.
At the same time, just as source code changes should match the style
of their surrounding blocks, so should documentation changes.

Examples in documentation should be illustrative of the concept
they're explaining.  Sometimes, the best way to show how a
language feature works is with a small program the reader can
run without modification.  More often, examples will consist
of a snippet of code containing only the "important" bits.
The definition of "important" varies from snippet to snippet.
Sometimes it's important to declare C<use strict> and C<use warnings>,
initialize all variables and fully catch every error condition.
More often than not, though, those things obscure the lesson
the example was intended to teach.

As Perl is developed by a global team of volunteers, our
documentation often contains spellings which look funny
to I<somebody>.  Choice of American/British/Other spellings
is left as an exercise for the author of each bit of
documentation.  When patching documentation, try to emulate
the documentation around you, rather than changing the existing
prose.

In general, documentation should describe what Perl does "now" rather
than what it used to do.  It's perfectly reasonable to include notes
in documentation about how behaviour has changed from previous releases,
but, with very few exceptions, documentation isn't "dual-life" --
it doesn't need to fully describe how all old versions used to work.

=head1 STANDARDS OF CONDUCT

The official forum for the development of perl is the perl5-porters mailing
list, mentioned above, and its bugtracker at rt.perl.org.  Posting to the
list and the bugtracker is not a right: all participants in discussion are
expected to adhere to a standard of conduct.

=over 4

=item *

Always be civil.

=item *

Heed the moderators.

=back

Civility is simple: stick to the facts while avoiding demeaning remarks,
belittling other individuals, sarcasm, or a presumption of bad faith. It is
not enough to be factual.  You must also be civil.  Responding in kind to
incivility is not acceptable.  If you relay otherwise-unposted comments to
the list from a third party, you take responsibility for the content of
those comments, and you must therefore ensure that they are civil.

While civility is required, kindness is encouraged; if you have any doubt about
whether you are being civil, simply ask yourself, "Am I being kind?" and aspire
to that.

If the list moderators tell you that you are not being civil, carefully
consider how your words have appeared before responding in any way.  Were they
kind?  You may protest, but repeated protest in the face of a repeatedly
reaffirmed decision is not acceptable.  Repeatedly protesting about the
moderators' decisions regarding a third party is also unacceptable, as is
continuing to initiate off-list contact with the moderators about their
decisions.

Unacceptable behavior will result in a public and clearly identified
warning.  A second instance of unacceptable behavior from the same
individual will result in removal from the mailing list and rt.perl.org,
for a period of one calendar month.  The rationale for this is to
provide an opportunity for the person to change the way they act.

After the time-limited ban has been lifted, a third instance of
unacceptable behavior will result in a further public warning.  A fourth
or subsequent instance will result in an indefinite ban.  The rationale
is that, in the face of an apparent refusal to change behavior, we must
protect other community members from future unacceptable actions.  The
moderators may choose to lift an indefinite ban if the person in
question affirms they will not transgress again.

Removals, like warnings, are public.

The list of moderators will be public knowledge.  At present, it is:
Aaron Crane, Andy Dougherty, Karen Etheridge, Ricardo Signes, Sawyer X,
Steffen Müller, Todd Rinaldo.

=head1 CREDITS

"Social Contract about Contributed Modules" originally by Russ Allbery E<lt>rra@stanford.eduE<gt> and the perl5-porters.

=encoding utf8

=head1 NAME

perl5201delta - what is new for perl v5.20.1

=head1 DESCRIPTION

This document describes differences between the 5.20.0 release and the 5.20.1
release.

If you are upgrading from an earlier release such as 5.18.0, first read
L<perl5200delta>, which describes differences between 5.18.0 and 5.20.0.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.20.0.  If any exist,
they are bugs, and we request that you submit a report.  See L</Reporting Bugs>
below.

=head1 Performance Enhancements

=over 4

=item *

An optimization to avoid problems with COW and deliberately overallocated PVs
has been disabled because it interfered with another, more important,
optimization, causing a slowdown on some platforms.
L<[perl #121975]|https://rt.perl.org/Ticket/Display.html?id=121975>

=item *

Returning a string from a lexical variable could be slow in some cases.  This
has now been fixed.
L<[perl #121977]|https://rt.perl.org/Ticket/Display.html?id=121977>

=back

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Config::Perl::V> has been upgraded from version 0.20 to 0.22.

The list of Perl versions covered has been updated and some flaws in the
parsing have been fixed.

=item *

L<Exporter> has been upgraded from version 5.70 to 5.71.

Illegal POD syntax in the documentation has been corrected.

=item *

L<ExtUtils::CBuilder> has been upgraded from version 0.280216 to 0.280217.

Android builds now link to both B<-lperl> and C<$Config::Config{perllibs}>.

=item *

L<File::Copy> has been upgraded from version 2.29 to 2.30.

The documentation now notes that C<copy> will not overwrite read-only files.

=item *

L<Module::CoreList> has been upgraded from version 3.11 to 5.020001.

The list of Perl versions covered has been updated.

=item *

The PathTools module collection has been upgraded from version 3.47 to 3.48.

Fallbacks are now in place when cross-compiling for Android and
C<$Config::Config{sh}> is not yet defined.
L<[perl #121963]|https://rt.perl.org/Ticket/Display.html?id=121963>

=item *

L<PerlIO::via> has been upgraded from version 0.14 to 0.15.

A minor portability improvement has been made to the XS implementation.

=item *

L<Unicode::UCD> has been upgraded from version 0.57 to 0.58.

The documentation includes many clarifications and fixes.

=item *

L<utf8> has been upgraded from version 1.13 to 1.13_01.

The documentation has some minor formatting improvements.

=item *

L<version> has been upgraded from version 0.9908 to 0.9909.

External libraries and Perl may have different ideas of what the locale is.
This is problematic when parsing version strings if the locale's numeric
separator has been changed.  Version parsing has been patched to ensure it
handles the locales correctly.
L<[perl #121930]|https://rt.perl.org/Ticket/Display.html?id=121930>

=back

=head1 Documentation

=head2 Changes to Existing Documentation

=head3 L<perlapi>

=over 4

=item *

C<av_len> - Emphasize that this returns the highest index in the array, not the
size of the array.
L<[perl #120386]|https://rt.perl.org/Ticket/Display.html?id=120386>

=item *

Note that C<SvSetSV> doesn't do set magic.

=item *

C<sv_usepvn_flags> - Fix documentation to mention the use of C<NewX> instead of
C<malloc>.
L<[perl #121869]|https://rt.perl.org/Ticket/Display.html?id=121869>

=item *

Clarify where C<NUL> may be embedded or is required to terminate a string.

=back
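
The C<av_len> clarification has a Perl-level counterpart: C<$#array> gives
the highest index, while an array in scalar context gives the number of
elements.  A short illustration in Perl follows.  The C API call itself is
not shown, and the variable names are arbitrary.

```perl

    use strict;
    use warnings;

    my @queue = qw(a b c);

    my $highest_index = $#queue;         # what av_len reports at the C level
    my $size          = scalar @queue;   # number of elements

    print "highest index: $highest_index, size: $size\n";

    # For an empty array the highest index is -1, not 0.
    my @empty;
    my $empty_highest = $#empty;
    print "empty array highest index: $empty_highest\n";

```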

=head3 L<perlfunc>

=over 4

=item *

Clarify the meaning of C<-B> and C<-T>.

=item *

C<-l> now notes that it will return false if symlinks aren't supported by the
file system.
L<[perl #121523]|https://rt.perl.org/Ticket/Display.html?id=121523>

=item *

Note that C<each>, C<keys> and C<values> may produce different orderings for
tied hashes compared to other perl hashes.
L<[perl #121404]|https://rt.perl.org/Ticket/Display.html?id=121404>

=item *

Note that C<exec LIST> and C<system LIST> may fall back to the shell on Win32.
Only C<exec PROGRAM LIST> and C<system PROGRAM LIST> indirect object syntax
will reliably avoid using the shell.  This has also been noted in L<perlport>.
L<[perl #122046]|https://rt.perl.org/Ticket/Display.html?id=122046>

=item *

Clarify the meaning of C<our>.
L<[perl #122132]|https://rt.perl.org/Ticket/Display.html?id=122132>

=back
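
The indirect object form of C<system> mentioned above looks like this.  The
sketch runs the current interpreter (C<$^X>) with a one-line program so that
it needs no external command.  The variable names are arbitrary.

```perl

    use strict;
    use warnings;

    # Run the current perl interpreter with a tiny one-liner.
    my @command = ($^X, '-e', 'exit 3');

    # Indirect object syntax: the block names the program to run and the
    # list supplies its arguments, so no shell is involved.
    my $status    = system { $command[0] } @command;
    my $exit_code = $status >> 8;

    print "child exited with $exit_code\n";

```

The same block form works with C<exec>.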

=head3 L<perlguts>

=over 4

=item *

Explain various ways of modifying an existing SV's buffer.
L<[perl #116925]|https://rt.perl.org/Ticket/Display.html?id=116925>

=back

=head3 L<perlpolicy>

=over 4

=item *

We now have a code of conduct for the I<< p5p >> mailing list, as documented in
L<< perlpolicy/STANDARDS OF CONDUCT >>.

=back

=head3 L<perlre>

=over 4

=item *

The C</x> modifier has been clarified to note that comments cannot be continued
onto the next line by escaping them.

=back
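
A minimal sketch of the C</x> behaviour described above: the trailing
backslash in the first comment does not carry the comment onto the next
line, so C<bar> remains part of the pattern.

```perl

    use strict;
    use warnings;

    my $re = qr{
        foo    # this comment ends in a backslash \
        bar    # so this line is still part of the pattern
    }x;

    my $matches_foobar = 'foobar' =~ $re ? 1 : 0;
    my $matches_foo    = 'foo'    =~ $re ? 1 : 0;

    print "foobar: $matches_foobar, foo: $matches_foo\n";

```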

=head3 L<perlsyn>

=over 4

=item *

Mention the use of empty conditionals in C<for>/C<while> loops for infinite
loops.

=back
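
Both loop forms with an empty conditional look like this.  Each loop runs
until C<last> is reached, and the counters are there only to make the sketch
terminate.

```perl

    use strict;
    use warnings;

    my $for_count = 0;
    for (;;) {                  # empty condition: loops until 'last'
        last if ++$for_count >= 3;
    }

    my $while_count = 0;
    while () {                  # empty condition is treated as true
        last if ++$while_count >= 5;
    }

    print "for: $for_count, while: $while_count\n";

```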

=head3 L<perlxs>

=over 4

=item *

Added a discussion of locale issues in XS code.

=back

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 Changes to Existing Diagnostics

=over 4

=item *

L<Variable length lookbehind not implemented in regex mE<sol>%sE<sol>|perldiag/"Variable length lookbehind not implemented in regex m/%s/">

Information about Unicode behaviour has been added.

=back

=head1 Configuration and Compilation

=over 4

=item *

Building Perl no longer writes to the source tree when configured with
F<Configure>'s B<-Dmksymlinks> option.
L<[perl #121585]|https://rt.perl.org/Ticket/Display.html?id=121585>

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item Android

Build support has been improved for cross-compiling in general and for Android
in particular.

=item OpenBSD

Corrected architectures and version numbers used in configuration hints when
building Perl.

=item Solaris

B<c99> options have been cleaned up, hints look for B<solstudio> as well as
B<SUNWspro>, and support for native C<setenv> has been added.

=item VMS

An old bug in feature checking, mainly affecting pre-7.3 systems, has been
fixed.

=item Windows

C<%I64d> is now being used instead of C<%lld> for MinGW.

=back

=head1 Internal Changes

=over 4

=item *

Added L<perlapi/sync_locale>.
Changing the program's locale should be avoided by XS code.  Nevertheless,
certain non-Perl libraries called from XS, such as C<Gtk>, do so.  When this
happens, Perl needs to be told that the locale has changed.  Use this function
to do so, before returning to Perl.

=back

=head1 Selected Bug Fixes

=over 4

=item *

A bug has been fixed where zero-length assertions and code blocks inside of a
regex could cause C<pos> to see an incorrect value.
L<[perl #122460]|https://rt.perl.org/Ticket/Display.html?id=122460>

=item *

Using C<s///e> on tainted utf8 strings could issue bogus "Malformed UTF-8
character (unexpected end of string)" warnings.  This has now been fixed.
L<[perl #122148]|https://rt.perl.org/Ticket/Display.html?id=122148>

=item *

C<system> and friends should now work properly on more Android builds.

Due to an oversight, the value specified through B<-Dtargetsh> to F<Configure>
would end up being ignored by some of the build process.  This caused perls
cross-compiled for Android to end up with defective versions of C<system>,
C<exec> and backticks: the commands would end up looking for F</bin/sh> instead
of F</system/bin/sh>, and so would fail for the vast majority of devices,
leaving C<$!> as C<ENOENT>.

=item *

Many issues have been detected by L<Coverity|http://www.coverity.com/> and
fixed.

=back

=head1 Acknowledgements

Perl 5.20.1 represents approximately 4 months of development since Perl 5.20.0
and contains approximately 12,000 lines of changes across 170 files from 36
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 2,600 lines of changes to 110 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.20.1:

Aaron Crane, Abigail, Alberto Simões, Alexandr Ciornii, Alexandre (Midnite)
Jousset, Andrew Fresh, Andy Dougherty, Brian Fraser, Chris 'BinGOs' Williams,
Craig A. Berry, Daniel Dragan, David Golden, David Mitchell, H.Merijn Brand,
James E Keenan, Jan Dubois, Jarkko Hietaniemi, John Peacock, kafka, Karen
Etheridge, Karl Williamson, Lukas Mai, Matthew Horsfall, Michael Bunk, Peter
Martini, Rafael Garcia-Suarez, Reini Urban, Ricardo Signes, Shirakata Kentaro,
Smylers, Steve Hay, Thomas Sibley, Todd Rinaldo, Tony Cook, Vladimir Marek,
Yves Orton.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
https://rt.perl.org/ .  There may also be information at http://www.perl.org/ ,
the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perltooc.pod000064400000000446150344123470007106 0ustar00=encoding utf8

=head1 NAME

perltooc - Links to information on object-oriented programming in Perl

=head1 DESCRIPTION

For information on OO programming with Perl, please see L<perlootut>
and L<perlobj>.

(The above documents supersede the tutorial that was formerly here in
perltooc.)

=cut
perlsec.pod000064400000063110150344123470006711 0ustar00=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs.  Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags.  Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

=head1 SECURITY VULNERABILITY CONTACT INFORMATION

If you believe you have found a security vulnerability in Perl, please
email the details to perl5-security-report@perl.org. This creates a new
Request Tracker ticket in a special queue which isn't initially publicly
accessible. The email will also be copied to a closed subscription
unarchived mailing list which includes all the core committers, who will
be able to help assess the impact of issues, figure out a resolution, and
help co-ordinate the release of patches to mitigate or fix the problem
across all platforms on which Perl is supported. Please only use this
address for security issues in the Perl core, not for modules
independently distributed on CPAN.

When sending an initial request to the security email address, please
don't Cc any other parties, because if they reply to all, the reply will
generate yet another new ticket. Once you have received an initial reply
with a C<[perl #NNNNNN]> ticket number in the headline, it's okay to Cc
subsequent replies to third parties: all emails to the
perl5-security-report address with the ticket number in the subject line
will be added to the ticket; without it, a new ticket will be created.

=head1 SECURITY MECHANISMS AND CONCERNS

=head2 Taint mode

Perl automatically enables a set of special security checks, called I<taint
mode>, when it detects its program running with differing real and effective
user or group IDs.  The setuid bit in Unix permissions is mode 04000, the
setgid bit mode 02000; either or both may be set.  You can also enable taint
mode explicitly by using the B<-T> command line flag.  This flag is
I<strongly> suggested for server programs and any program run on behalf of
someone else, such as a CGI script.  Once taint mode is on, it's on for
the remainder of your script.
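
As a minimal sketch, a program can check at run time whether taint mode
is in effect by reading the special variable C<${^TAINT}> (see
L<perlvar>).  The subroutine name C<taint_status> is hypothetical and
used only for illustration.  Run the program as C<perl -T script.pl> to
see the "on" case.

    use strict;
    use warnings;

    # ${^TAINT} is 1 under -T, -1 when only taint warnings are
    # enabled (-t or -TU), and 0 otherwise.
    sub taint_status {
        return ${^TAINT} ? "on" : "off";
    }

    print "taint mode is ", taint_status(), "\n";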

While in this mode, Perl takes special precautions called I<taint
checks> to prevent both obvious and subtle traps.  Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these.  Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect
something else outside your program--at least, not by accident.  All
command line arguments, environment variables, locale information (see
L<perllocale>), results of certain system calls (C<readdir()>,
C<readlink()>, the variable of C<shmread()>, the messages returned by
C<msgrcv()>, the password, gcos and shell fields returned by the
C<getpwxxx()> calls), and all file input are marked as "tainted".
Tainted data may not be used directly or indirectly in any command
that invokes a sub-shell, nor in any command that modifies files,
directories, or processes, B<with the following exceptions>:

=over 4

=item *

Arguments to C<print> and C<syswrite> are B<not> checked for taintedness.

=item *

Symbolic methods

    $obj->$method(@args);

and symbolic sub references

    &{$foo}(@args);
    $foo->(@args);

are not checked for taintedness.  This requires extra care
unless you want external data to affect your control flow.  Unless
you carefully limit what these symbolic values are, people are able
to call functions B<outside> your Perl code, such as POSIX::system,
in which case they are able to run arbitrary external code.

=item *

Hash keys are B<never> tainted.

=back

For efficiency reasons, Perl takes a conservative view of
whether data is tainted.  If an expression contains tainted data,
any subexpression may be considered tainted, even if the value
of the subexpression is not itself affected by the tainted data.

Because taintedness is associated with each scalar value, some
elements of an array or hash can be tainted and others not.
The keys of a hash are B<never> tainted.

For example:

    $arg = shift;		# $arg is tainted
    $hid = $arg . 'bar';	# $hid is also tainted
    $line = <>;			# Tainted
    $line = <STDIN>;		# Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;		# Still tainted
    $path = $ENV{'PATH'};	# Tainted, but see below
    $data = 'abc';		# Not tainted

    system "echo $arg";		# Insecure
    system "/bin/echo", $arg;	# Considered insecure
				# (Perl doesn't know about /bin/echo)
    system "echo $hid";		# Insecure
    system "echo $data";	# Insecure until PATH set

    $path = $ENV{'PATH'};	# $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};	# $path now NOT tainted
    system "echo $data";	# Is secure now!

    open(FOO, "< $arg");	# OK - read-only file
    open(FOO, "> $arg"); 	# Not OK - trying to write

    open(FOO,"echo $arg|");	# Not OK
    open(FOO,"-|")
	or exec 'echo', $arg;	# Also not OK

    $shout = `echo $arg`;	# Insecure, $shout now tainted

    unlink $data, $arg;		# Insecure
    umask $arg;			# Insecure

    exec "echo $arg";		# Insecure
    exec "echo", $arg;		# Insecure
    exec "sh", '-c', $arg;	# Very insecure!

    @files = <*.c>;		# insecure (uses readdir() or similar)
    @files = glob('*.c');	# insecure (uses readdir() or similar)

    # In either case, the results of glob are tainted, since the list of
    # filenames comes from outside of the program.

    $bad = ($arg, 23);		# $bad will be tainted
    $arg, `true`;		# Insecure (although it isn't really)

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure $ENV{PATH}".

The exception to the principle of "one tainted value taints the whole
expression" is with the ternary conditional operator C<?:>.  Since code
with a ternary conditional

    $result = $tainted_value ? "Untainted" : "Also untainted";

is effectively

    if ( $tainted_value ) {
        $result = "Untainted";
    } else {
        $result = "Also untainted";
    }

it doesn't make sense for C<$result> to be tainted.

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, and whether its use would
thus trigger an "Insecure dependency" message, you can use the
C<tainted()> function of the Scalar::Util module, available in your
nearby CPAN mirror, and included in Perl starting from the release 5.8.0.
Or you may be able to use the following C<is_tainted()> function.

    sub is_tainted {
        local $@;   # Don't pollute caller's value.
        return ! eval { eval("#" . substr(join("", @_), 0, 0)); 1 };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted.  It
would be inefficient for every operator to test every argument for
taintedness.  Instead, the slightly more efficient and conservative
approach is used that if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.
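
A short sketch of the C<Scalar::Util> approach follows.  The helper name
C<describe> is made up for illustration.  Run it as
C<perl -T script.pl some_argument> to see a tainted value; without B<-T>,
C<tainted()> returns false for everything.

    use strict;
    use warnings;
    use Scalar::Util qw(tainted);

    # Report whether a single scalar carries the taint flag.
    sub describe {
        my ($name, $value) = @_;
        return sprintf "%s is %s", $name,
            tainted($value) ? "tainted" : "not tainted";
    }

    my $literal = 'abc';                        # from the program text
    my $arg     = @ARGV ? $ARGV[0] : $literal;  # tainted under -T if from @ARGV

    print describe('$literal', $literal), "\n";
    print describe('$arg',     $arg),     "\n";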

But testing for taintedness gets you only so far.  Sometimes you just have
to clear your data's taintedness.  Values may be untainted by using them
as keys in a hash; otherwise the only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc. in a
non-tainting pattern, that
you knew what you were doing when you wrote that pattern.  That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
entire mechanism.  It's better to verify that the variable has only good
characters (for certain values of "good") rather than checking whether it
has any bad characters.  That's because it's far too easy to miss bad
characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
	$data = $1; 			# $data now untainted
    } else {
	die "Bad data in '$data'"; 	# log this somewhere
    }

This is fairly secure because C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell.  Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that.  The lesson
is that when untainting, you must be exceedingly careful with your patterns.
Laundering data using a regular expression is the I<only> mechanism for
untainting dirty data, unless you use the strategy detailed below to fork
a child of lesser privilege.

The example does not untaint C<$data> if C<use locale> is in effect,
because the characters matched by C<\w> are determined by the locale.
Perl considers that locale definitions are untrustworthy because they
contain data from outside the program.  If you are writing a
locale-aware program, and want to launder data with a regular expression
containing C<\w>, put C<no locale> ahead of the expression in the same
block.  See L<perllocale/SECURITY> for further discussion and examples.
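
The points above can be combined into a small helper.  This is a sketch
only: the name C<launder_token> is hypothetical, and the set of permitted
characters is an assumption that must be adapted to what your program
actually accepts.

    use strict;
    use warnings;

    # Return an untainted copy of $data if it consists solely of word
    # characters, hyphens, at signs and dots; return undef otherwise.
    sub launder_token {
        my ($data) = @_;
        return unless defined $data;
        no locale;    # keep \w independent of the (untrusted) locale
        # \A and \z rather than ^ and $, so that a trailing newline
        # is rejected as well.
        if ($data =~ /\A([-\@\w.]+)\z/) {
            return $1;    # the captured text is untainted
        }
        return;
    }

    my $name = launder_token('user@example.com');
    print defined $name ? "accepted: $name\n" : "rejected\n";

Returning C<undef> for bad input, rather than dying, leaves the decision
about logging and error handling to the caller.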

=head2 Switches On the "#!" Line

When you make a script executable, in order to make it usable as a
command, the system will pass switches to perl from the script's #!
line.  Perl checks that any command line switches given to a setuid
(or setgid) script actually match the ones set on the #! line.  Some
Unix and Unix-like environments impose a one-switch limit on the #!
line, so you may need to use something like C<-wU> instead of C<-w -U>
under such systems.  (This issue should arise only in Unix or
Unix-like environments that support #! and setuid or setgid scripts.)

=head2 Taint mode and @INC

When the taint mode (C<-T>) is in effect, the "." directory is removed
from C<@INC>, and the environment variables C<PERL5LIB> and C<PERLLIB>
are ignored by Perl.  You can still adjust C<@INC> from outside the
program by using the C<-I> command line option as explained in
L<perlrun>.  The two environment variables are ignored because
they are obscured, and a user running a program could be unaware that
they are set, whereas the C<-I> option is clearly visible and
therefore permitted.

Another way to modify C<@INC> without modifying the program is to use
the C<lib> pragma, e.g.:

  perl -Mlib=/foo program

The benefit of using C<-Mlib=/foo> over C<-I/foo> is that the former
will automagically remove any duplicated directories, while the latter
will not.

Note that if a tainted string is added to C<@INC>, the following
problem will be reported:

  Insecure dependency in require while running with -T switch

=head2 Cleaning Up Your Path

For "Insecure C<$ENV{PATH}>" messages, you need to set C<$ENV{'PATH'}> to
a known value, and each directory in the path must be absolute and
not writable by anyone other than its owner and group.  You may be surprised to
get this message even if the pathname to your executable is fully
qualified.  This is I<not> generated because you didn't supply a full path
to the program; instead, it's generated because you never set your PATH
environment variable, or you didn't set it to something that was safe.
Because Perl can't guarantee that the executable in question isn't itself
going to turn around and execute some other program that is dependent on
your PATH, it makes sure you set the PATH.

The PATH isn't the only environment variable which can cause problems.
Because some shells may use the variables IFS, CDPATH, ENV, and
BASH_ENV, Perl checks that those are either empty or untainted when
starting subprocesses.  You may wish to add something like this to your
set-id and taint-checking scripts.

    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};   # Make %ENV safer

It's also possible to get into trouble with other operations that don't
care whether they use tainted values.  Make judicious use of the file
tests in dealing with any user-supplied filenames.  When possible, do
opens and such B<after> properly dropping any special user (or group!)
privileges.  Perl doesn't prevent you from
opening tainted filenames for reading,
so be careful what you print out.  The tainting mechanism is intended to
prevent stupid mistakes, not to remove the need for thought.

Perl does not call the shell to expand wild cards when you pass C<system>
and C<exec> explicit parameter lists instead of strings with possible shell
wildcards in them.  Unfortunately, the C<open>, C<glob>, and
backtick functions provide no such alternate calling convention, so more
subterfuge will be required.
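
As an illustration of the list form, here is a minimal sketch.  The
wrapper name C<run_without_shell> is hypothetical; the indirect-object
syntax C<system { $prog } @args> is the form described in
L<perlfunc/exec> that never invokes a shell, even for a one-element
list.  Under B<-T> you must still clean C<$ENV{PATH}> before calling it.

    use strict;
    use warnings;

    # Run a program directly, bypassing the shell entirely.
    # Returns the child's exit code, or -1 if it could not be started.
    sub run_without_shell {
        my ($prog, @args) = @_;
        my $rc = system { $prog } $prog, @args;
        return $rc == -1 ? -1 : $rc >> 8;
    }

    # The '*' reaches the child as a literal character; no shell
    # is involved, so nothing expands it.
    my $status = run_without_shell($^X, '-e', 'print "@ARGV\n"', '*');
    print "exit status: $status\n";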

Perl provides a reasonably safe way to open a file or pipe from a setuid
or setgid program: just create a child process with reduced privilege that
does the dirty work for you.  First, fork a child using the special
C<open> syntax that connects the parent and child by a pipe.  Now the
child resets its ID set and any other per-process attributes, like
environment variables, umasks, current working directories, back to the
originals or known safe values.  Then the child process, which no longer
has any special permissions, does the C<open> or other system call.
Finally, the child passes the data it managed to access back to the
parent.  Because the file or pipe was opened in the child while running
under less privilege than the parent, it's not apt to be tricked into
doing something it shouldn't.

Here's a way to do backticks reasonably safely.  Notice how the C<exec> is
not called with a string that the shell could expand.  This is by far the
best way to call something that might be subjected to shell escapes: just
never call the shell at all.  

        use English;
        die "Can't fork: $!" unless defined($pid = open(KID, "-|"));
        if ($pid) {           # parent
            while (<KID>) {
                # do something
            }
            close KID;
        } else {
            my @temp     = ($EUID, $EGID);
            my $orig_uid = $UID;
            my $orig_gid = $GID;
            $EUID = $UID;
            $EGID = $GID;
            # Drop privileges
            $UID  = $orig_uid;
            $GID  = $orig_gid;
            # Make sure privs are really gone
            ($EUID, $EGID) = @temp;
            die "Can't drop privileges"
                unless $UID == $EUID  && $GID eq $EGID;
            $ENV{PATH} = "/bin:/usr/bin"; # Minimal PATH.
	    # Consider sanitizing the environment even more.
            exec 'myprog', 'arg1', 'arg2'
                or die "can't exec myprog: $!";
        }

A similar strategy would work for wildcard expansion via C<glob>, although
you can use C<readdir> instead.
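
For instance, the following sketch lists the C<.c> files in a directory
with C<opendir> and C<readdir>, so neither a shell nor C<glob> is
involved.  The function name C<list_c_files> is an assumption made for
illustration.  The names returned by C<readdir> are still tainted under
B<-T>.

    use strict;
    use warnings;

    # Return the sorted names of the plain *.c files directly in $dir.
    sub list_c_files {
        my ($dir) = @_;
        opendir(my $dh, $dir) or die "Can't opendir $dir: $!";
        my @files = sort grep { /\.c\z/ && -f "$dir/$_" } readdir($dh);
        closedir($dh);
        return @files;
    }

    print "$_\n" for list_c_files('.');

Unlike C<glob('*.c')>, this also returns names that begin with a dot.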

Taint checking is most useful when, although you trust yourself not to have
written a program to give away the farm, you don't necessarily trust those
who end up using it not to try to trick it into doing something bad.  This
is the kind of security checking that's useful for set-id programs and
programs launched on someone else's behalf, like CGI programs.

This is quite different, however, from not even trusting the writer of the
code not to try to do something evil.  That's the kind of trust needed
when someone hands you a program you've never seen before and says, "Here,
run this."  For that kind of safety, you might want to check out the Safe
module, included standard in the Perl distribution.  This module allows the
programmer to set up special compartments in which all system operations
are trapped and namespace access is carefully controlled.  Safe should
not be considered bullet-proof, though: it will not prevent the foreign
code from setting up infinite loops, allocating gigabytes of memory, or even
abusing perl bugs to make the host interpreter crash or behave in
unpredictable ways.  In any case it's better avoided completely if you're
really concerned about security.
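
To give an idea of what a compartment looks like, here is a minimal
sketch that relies on the default operator mask of the Safe module.  The
variable names are invented for illustration, and the caveats above
still apply: this shows the mechanism, not a recommendation.

    use strict;
    use warnings;
    use Safe;

    # A compartment with the default operator mask, which traps
    # operators such as system and backticks.
    my $compartment = Safe->new;

    # Harmless code is evaluated and its value returned.
    my $sum = $compartment->reval('2 + 3');
    print "2 + 3 = $sum\n";

    # Code using a masked operator fails to compile; reval returns
    # undef and leaves the reason in $@.
    my $out = $compartment->reval('system("echo hello")');
    print "system was blocked: $@" if $@;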

=head2 Security Bugs

Beyond the obvious problems that stem from giving special privileges to
systems as flexible as scripts, on many versions of Unix, set-id scripts
are inherently insecure right from the start.  The problem is a race
condition in the kernel.  Between the time the kernel opens the file to
see which interpreter to run and when the (now-set-id) interpreter turns
around and reopens the file to interpret it, the file in question may have
changed, especially if you have symbolic links on your system.

Fortunately, sometimes this kernel "feature" can be disabled.
Unfortunately, there are two ways to disable it.  The system can simply
outlaw scripts with any set-id bit set, which doesn't help much.
Alternately, it can simply ignore the set-id bits on scripts.

However, if the kernel set-id script feature isn't disabled, Perl will
complain loudly that your set-id script is insecure.  You'll need to
either disable the kernel set-id script feature, or put a C wrapper around
the script.  A C wrapper is just a compiled program that does nothing
except call your Perl program.   Compiled programs are not subject to the
kernel bug that plagues set-id scripts.  Here's a simple wrapper, written
in C:

    #include <unistd.h>
    #include <stdio.h>
    #include <string.h>
    #include <errno.h>

    #define REAL_PATH "/path/to/script"

    int main(int argc, char **argv)
    {
        execv(REAL_PATH, argv);
        fprintf(stderr, "%s: %s: %s\n",
                        argv[0], REAL_PATH, strerror(errno));
        return 127;
    }

Compile this wrapper into a binary executable and then make I<it> rather
than your script setuid or setgid.

In recent years, vendors have begun to supply systems free of this
inherent security bug.  On such systems, when the kernel passes the name
of the set-id script to open to the interpreter, rather than using a
pathname subject to meddling, it instead passes I</dev/fd/3>.  This is a
special file already opened on the script, so that there can be no race
condition for evil scripts to exploit.  On these systems, Perl should be
compiled with C<-DSETUID_SCRIPTS_ARE_SECURE_NOW>.  The F<Configure>
program that builds Perl tries to figure this out for itself, so you
should never have to specify this yourself.  Most modern releases of
SysVr4 and BSD 4.4 use this approach to avoid the kernel race condition.
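
To see what F<Configure> decided for a particular perl, you can look at
the corresponding entry in the C<Config> module.  This is a small sketch;
it assumes that the C<d_suidsafe> entry described in L<Config> is the
one that records this decision on your build.

    use strict;
    use warnings;
    use Config;

    # d_suidsafe is 'define' when Configure concluded that set-id
    # scripts are safe here (SETUID_SCRIPTS_ARE_SECURE_NOW).
    my $safe = ($Config{d_suidsafe} // '') eq 'define';
    print "set-id scripts considered secure: ",
        ($safe ? "yes" : "no"), "\n";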

=head2 Protecting Your Programs

There are a number of ways to hide the source to your Perl programs,
with varying levels of "security".

First of all, however, you I<can't> take away read permission, because
the source code has to be readable in order to be compiled and
interpreted.  (That doesn't mean that a CGI script's source is
readable by people on the web, though.)  So you have to leave the
permissions at the socially friendly 0755 level.  This lets only
people on your local system see your source.

Some people mistakenly regard this as a security problem.  If your program does
insecure things, and relies on people not knowing how to exploit those
insecurities, it is not secure.  It is often possible for someone to
determine the insecure things and exploit them without viewing the
source.  Security through obscurity, the name for hiding your bugs
instead of fixing them, is little security indeed.

You can try using encryption via source filters (Filter::* from CPAN,
or Filter::Util::Call and Filter::Simple since Perl 5.8).
But crackers might be able to decrypt it.  You can try using the byte
code compiler and interpreter described below, but crackers might be
able to de-compile it.  You can try using the native-code compiler
described below, but crackers might be able to disassemble it.  These
pose varying degrees of difficulty to people wanting to get at your
code, but none can definitively conceal it (this is true of every
language, not just Perl).

If you're concerned about people profiting from your code, then the
bottom line is that nothing but a restrictive license will give you
legal security.  License your software and pepper it with threatening
statements like "This is unpublished proprietary software of XYZ Corp.
Your access to it does not give you permission to use it blah blah
blah."  You should see a lawyer to be sure your license's wording will
stand up in court.

=head2 Unicode

Unicode is a new and complex technology and one may easily overlook
certain security pitfalls.  See L<perluniintro> for an overview and
L<perlunicode> for details, and L<perlunicode/"Security Implications
of Unicode"> for security implications in particular.

=head2 Algorithmic Complexity Attacks

Certain internal algorithms used in the implementation of Perl can
be attacked by choosing the input carefully to consume large amounts
of either time or space or both.  This can lead to so-called
I<Denial of Service> (DoS) attacks.

=over 4

=item *

Hash Algorithm - Hash algorithms like the one used in Perl are well
known to be vulnerable to collision attacks on their hash function.
Such attacks involve constructing a set of keys which collide into
the same bucket producing inefficient behavior.  Such attacks often
depend on discovering the seed of the hash function used to map the
keys to buckets.  That seed is then used to brute-force a key set which
can be used to mount a denial of service attack.  In Perl 5.8.1 changes
were introduced to harden Perl against such attacks, and then later in
Perl 5.18.0 these features were enhanced and additional protections
added.

At the time of this writing, Perl 5.18.0 is considered to be
well-hardened against algorithmic complexity attacks on its hash
implementation.  This is largely owing to the following measures, which
mitigate attacks:

=over 4

=item Hash Seed Randomization

In order to make it impossible to know what seed to generate an attack
key set for, this seed is randomly initialized at process start.  This
may be overridden by using the PERL_HASH_SEED environment variable, see
L<perlrun/PERL_HASH_SEED>.  This environment variable controls how
items are actually stored, not how they are presented via
C<keys>, C<values> and C<each>.

=item Hash Traversal Randomization

Independent of which seed is used in the hash function, C<keys>,
C<values>, and C<each> return items in a per-hash randomized order.
Modifying a hash by insertion will change the iteration order of that hash.
This behavior can be overridden by using C<hash_traversal_mask()> from
L<Hash::Util> or by using the PERL_PERTURB_KEYS environment variable,
see L<perlrun/PERL_PERTURB_KEYS>.  Note that this feature controls the
"visible" order of the keys, and not the actual order they are stored in.

=item Bucket Order Perturbance

When items collide into a given hash bucket the order they are stored in
the chain is no longer predictable in Perl 5.18.  This
is intended to make it harder to observe a
collision.  This behavior can be overridden by using
the PERL_PERTURB_KEYS environment variable, see L<perlrun/PERL_PERTURB_KEYS>.

=item New Default Hash Function

The default hash function has been modified with the intention of making
it harder to infer the hash seed.

=item Alternative Hash Functions

The source code includes multiple hash algorithms to choose from.  While we
believe that the default perl hash is robust to attack, we have included the
hash function Siphash as a fall-back option.  At the time of release of
Perl 5.18.0 Siphash is believed to be of cryptographic strength.  This is
not the default as it is much slower than the default hash.

=back

Without compiling a special Perl, there is no way to get the exact same
behavior of any versions prior to Perl 5.18.0.  The closest one can get
is by setting PERL_PERTURB_KEYS to 0 and setting the PERL_HASH_SEED
to a known value.  We do not advise those settings for production use
due to the above security considerations.

B<Perl has never guaranteed any ordering of the hash keys>, and
the ordering has already changed several times during the lifetime of
Perl 5.  Also, the ordering of hash keys has always been, and continues
to be, affected by the insertion order and the history of changes made
to the hash over its lifetime.

Also note that while the order of the hash elements might be
randomized, this "pseudo-ordering" should B<not> be used for
applications like shuffling a list randomly (use C<List::Util::shuffle()>
for that, see L<List::Util>, a standard core module since Perl 5.8.0;
or the CPAN module C<Algorithm::Numerical::Shuffle>), or for generating
permutations (use e.g. the CPAN modules C<Algorithm::Permute> or
C<Algorithm::FastPermute>), or for any cryptographic applications.

Tied hashes may have their own ordering and algorithmic complexity
attacks.

=item *

Regular expressions - Perl's regular expression engine is a so-called NFA
(Non-deterministic Finite Automaton), which among other things means that
it can rather easily consume large amounts of both time and space if the
regular expression may match in several ways.  Careful crafting of the
regular expressions can help but quite often there really isn't much
one can do (the book "Mastering Regular Expressions" is required
reading, see L<perlfaq2>).  Running out of space manifests itself by
Perl running out of memory.

=item *

Sorting - the quicksort algorithm used in Perls before 5.8.0 to
implement the sort() function is very easy to trick into misbehaving
so that it consumes a lot of time.  Starting from Perl 5.8.0 a different
sorting algorithm, mergesort, is used by default.  Mergesort cannot
misbehave on any input.

=back

See L<https://www.usenix.org/legacy/events/sec03/tech/full_papers/crosby/crosby.pdf> for more information,
and any computer science textbook on algorithmic complexity.
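
A practical consequence of the hash measures described above is that
code should never depend on the order in which C<keys> returns its
results.  The following sketch, with invented data and variable names,
sorts the keys when a stable order is needed and uses
C<List::Util::shuffle> when a random order is wanted.

    use strict;
    use warnings;
    use List::Util qw(shuffle);

    my %port_for = (ssh => 22, smtp => 25, http => 80, https => 443);

    # Stable order: sort explicitly instead of relying on keys().
    my @stable = sort keys %port_for;
    print "stable:   @stable\n";

    # Random order: shuffle a list; do not rely on hash ordering.
    my @random = shuffle(@stable);
    print "shuffled: @random\n";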

=head1 SEE ALSO

L<perlrun> for its description of cleaning up environment variables.
perltoc.pod000064400002513633150344123470006740 0ustar00
# !!!!!!!   DO NOT EDIT THIS FILE   !!!!!!!
# This file is autogenerated by buildtoc from all the other pods.
# Edit those files and run pod/buildtoc to effect changes.

=encoding UTF-8

=head1 NAME

perltoc - perl documentation table of contents

=head1 DESCRIPTION

This page provides a brief table of contents for the rest of the Perl
documentation set.  It is meant to be scanned quickly or grepped
through to locate the proper section you're looking for.

=head1 BASIC DOCUMENTATION

=head2 perl - The Perl 5 language interpreter

=over 4

=item SYNOPSIS

=item GETTING HELP

=over 4

=item Overview

=item Tutorials

=item Reference Manual

=item Internals and C Language Interface

=item Miscellaneous

=item Language-Specific

=item Platform-Specific

=item Stubs for Deleted Documents

=back

=item DESCRIPTION

=item AVAILABILITY

=item ENVIRONMENT

=item AUTHOR

=item FILES

=item SEE ALSO

=item DIAGNOSTICS

=item BUGS

=item NOTES

=back

=head2 perlintro -- a brief introduction and overview of Perl

=over 4

=item DESCRIPTION

=over 4

=item What is Perl?

=item Running Perl programs

=item Safety net

=item Basic syntax overview

=item Perl variable types

Scalars, Arrays, Hashes

=item Variable scoping

=item Conditional and looping constructs

if, while, for, foreach

=item Builtin operators and functions

Arithmetic, Numeric comparison, String comparison, Boolean logic,
Miscellaneous

=item Files and I/O

=item Regular expressions

Simple matching, Simple substitution, More complex regular expressions,
Parentheses for capturing, Other regexp features

=item Writing subroutines

=item OO Perl

=item Using Perl modules

=back

=item AUTHOR

=back

=head2 perlrun - how to execute the Perl interpreter

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item #! and quoting on non-Unix systems
X<hashbang> X<#!>

OS/2, MS-DOS, Win95/NT, VMS

=item Location of Perl
X<perl, location of interpreter>

=item Command Switches
X<perl, command switches> X<command switches>

B<-0>[I<octal/hexadecimal>] X<-0> X<$/>, B<-a> X<-a> X<autosplit>, B<-C
[I<number/list>]> X<-C>, B<-c> X<-c>, B<-d> X<-d> X<-dt>, B<-dt>,
B<-d:>I<MOD[=bar,baz]> X<-d> X<-dt>, B<-dt:>I<MOD[=bar,baz]>,
B<-D>I<letters> X<-D> X<DEBUGGING> X<-DDEBUGGING>, B<-D>I<number>, B<-e>
I<commandline> X<-e>, B<-E> I<commandline> X<-E>, B<-f> X<-f>
X<sitecustomize> X<sitecustomize.pl>, B<-F>I<pattern> X<-F>, B<-h> X<-h>,
B<-i>[I<extension>] X<-i> X<in-place>, B<-I>I<directory> X<-I> X<@INC>,
B<-l>[I<octnum>] X<-l> X<$/> X<$\>, B<-m>[B<->]I<module> X<-m> X<-M>,
B<-M>[B<->]I<module>, B<-M>[B<->]I<'module ...'>,
B<-[mM]>[B<->]I<module=arg[,arg]...>, B<-n> X<-n>, B<-p> X<-p>, B<-s>
X<-s>, B<-S> X<-S>, B<-t> X<-t>, B<-T> X<-T>, B<-u> X<-u>, B<-U> X<-U>,
B<-v> X<-v>, B<-V> X<-V>, B<-V:>I<configvar>, B<-w> X<-w>, B<-W> X<-W>,
B<-X> X<-X>, B<-x> X<-x>, B<-x>I<directory>

=back

=item ENVIRONMENT
X<perl, environment variables>

HOME X<HOME>, LOGDIR X<LOGDIR>, PATH X<PATH>, PERL5LIB X<PERL5LIB>,
PERL5OPT X<PERL5OPT>, PERLIO X<PERLIO>, :bytes X<:bytes>, :crlf X<:crlf>,
:mmap X<:mmap>, :perlio X<:perlio>, :pop X<:pop>, :raw X<:raw>, :stdio
X<:stdio>, :unix X<:unix>, :utf8 X<:utf8>, :win32 X<:win32>, PERLIO_DEBUG
X<PERLIO_DEBUG>, PERLLIB X<PERLLIB>, PERL5DB X<PERL5DB>, PERL5DB_THREADED
X<PERL5DB_THREADED>, PERL5SHELL (specific to the Win32 port) X<PERL5SHELL>,
PERL_ALLOW_NON_IFS_LSP (specific to the Win32 port)
X<PERL_ALLOW_NON_IFS_LSP>, PERL_DEBUG_MSTATS X<PERL_DEBUG_MSTATS>,
PERL_DESTRUCT_LEVEL X<PERL_DESTRUCT_LEVEL>, PERL_DL_NONLAZY
X<PERL_DL_NONLAZY>, PERL_ENCODING X<PERL_ENCODING>, PERL_HASH_SEED
X<PERL_HASH_SEED>, PERL_PERTURB_KEYS X<PERL_PERTURB_KEYS>,
PERL_HASH_SEED_DEBUG X<PERL_HASH_SEED_DEBUG>, PERL_MEM_LOG X<PERL_MEM_LOG>,
PERL_ROOT (specific to the VMS port) X<PERL_ROOT>, PERL_SIGNALS
X<PERL_SIGNALS>, PERL_UNICODE X<PERL_UNICODE>, PERL_USE_UNSAFE_INC
X<PERL_USE_UNSAFE_INC>, SYS$LOGIN (specific to the VMS port) X<SYS$LOGIN>

=back

=head2 perlreftut - Mark's very short tutorial about references

=over 4

=item DESCRIPTION

=item Who Needs Complicated Data Structures?

=item The Solution

=item Syntax

=over 4

=item Making References

=item Using References

=item An Example

=item Arrow Rule

=back

=item Solution

=item The Rest

=item Summary

=item Credits

=over 4

=item Distribution Conditions

=back

=back

=head2 perldsc - Perl Data Structures Cookbook

=over 4

=item DESCRIPTION

arrays of arrays, hashes of arrays, arrays of hashes, hashes of hashes,
more elaborate constructs

=item REFERENCES
X<reference> X<dereference> X<dereferencing> X<pointer>

=item COMMON MISTAKES

=item CAVEAT ON PRECEDENCE
X<dereference, precedence> X<dereferencing, precedence>

=item WHY YOU SHOULD ALWAYS C<use strict>

=item DEBUGGING
X<data structure, debugging> X<complex data structure, debugging>
X<AoA, debugging> X<HoA, debugging> X<AoH, debugging> X<HoH, debugging>
X<array of arrays, debugging> X<hash of arrays, debugging>
X<array of hashes, debugging> X<hash of hashes, debugging>

=item CODE EXAMPLES

=item ARRAYS OF ARRAYS
X<array of arrays> X<AoA>

=over 4

=item Declaration of an ARRAY OF ARRAYS

=item Generation of an ARRAY OF ARRAYS

=item Access and Printing of an ARRAY OF ARRAYS

=back

=item HASHES OF ARRAYS
X<hash of arrays> X<HoA>

=over 4

=item Declaration of a HASH OF ARRAYS

=item Generation of a HASH OF ARRAYS

=item Access and Printing of a HASH OF ARRAYS

=back

=item ARRAYS OF HASHES
X<array of hashes> X<AoH>

=over 4

=item Declaration of an ARRAY OF HASHES

=item Generation of an ARRAY OF HASHES

=item Access and Printing of an ARRAY OF HASHES

=back

=item HASHES OF HASHES
X<hash of hashes> X<HoH>

=over 4

=item Declaration of a HASH OF HASHES

=item Generation of a HASH OF HASHES

=item Access and Printing of a HASH OF HASHES

=back

=item MORE ELABORATE RECORDS
X<record> X<structure> X<struct>

=over 4

=item Declaration of MORE ELABORATE RECORDS

=item Declaration of a HASH OF COMPLEX RECORDS

=item Generation of a HASH OF COMPLEX RECORDS

=back

=item Database Ties

=item SEE ALSO

=item AUTHOR

=back

=head2 perllol - Manipulating Arrays of Arrays in Perl

=over 4

=item DESCRIPTION

=over 4

=item Declaration and Access of Arrays of Arrays

=item Growing Your Own

=item Access and Printing

=item Slices

=back

=item SEE ALSO

=item AUTHOR

=back

=head2 perlrequick - Perl regular expressions quick start

=over 4

=item DESCRIPTION

=item The Guide

=over 4

=item Simple word matching

=item Using character classes

=item Matching this or that

=item Grouping things and hierarchical matching

=item Extracting matches

=item Matching repetitions

=item More matching

=item Search and replace

=item The split operator

=item C<use re 'strict'>

=back

=item BUGS

=item SEE ALSO

=item AUTHOR AND COPYRIGHT

=over 4

=item Acknowledgments

=back

=back

=head2 perlretut - Perl regular expressions tutorial

=over 4

=item DESCRIPTION

=item Part 1: The basics

=over 4

=item Simple word matching

=item Using character classes

=item Matching this or that

=item Grouping things and hierarchical matching

Z<>0. Start with the first letter in the string C<'a'>, Z<>1. Try the first
alternative in the first group C<'abd'>, Z<>2. Match C<'a'> followed by
C<'b'>. So far so good, Z<>3. C<'d'> in the regexp doesn't match C<'c'> in
the string - a dead end.  So backtrack two characters and pick the second
alternative in the first group C<'abc'>, Z<>4. Match C<'a'> followed by
C<'b'> followed by C<'c'>.  We are on a roll and have satisfied the first
group. Set C<$1> to C<'abc'>, Z<>5. Move on to the second group and pick the
first alternative C<'df'>, Z<>6. Match the C<'d'>, Z<>7. C<'f'> in the
regexp doesn't match C<'e'> in the string, so a dead end.  Backtrack one
character and pick the second alternative in the second group C<'d'>, Z<>8.
C<'d'> matches. The second grouping is satisfied, so set C<$2> to C<'d'>,
Z<>9. We are at the end of the regexp, so we are done! We have matched
C<'abcd'> out of the string C<"abcde">

=item Extracting matches

=item Backreferences

=item Relative backreferences

=item Named backreferences

=item Alternative capture group numbering

=item Position information

=item Non-capturing groupings

=item Matching repetitions

Z<>0. Start with the first letter in the string C<'t'>, Z<>1. The first
quantifier C<'.*'> starts out by matching the whole string "C<the cat in
the hat>", Z<>2. C<'a'> in the regexp element C<'at'> doesn't match the
end of the string.  Backtrack one character, Z<>3. C<'a'> in the regexp
element C<'at'> still doesn't match the last letter of the string C<'t'>,
so backtrack one more character, Z<>4. Now we can match the C<'a'> and the
C<'t'>, Z<>5. Move on to the third element C<'.*'>.  Since we are at the
end of the string and C<'.*'> can match 0 times, assign it the empty
string, Z<>6. We are done!

=item Possessive quantifiers

=item Building a regexp

=item Using regular expressions in Perl

=back

=item Part 2: Power tools

=over 4

=item More on characters, strings, and character classes

=item Compiling and saving regular expressions

=item Composing regular expressions at runtime

=item Embedding comments and modifiers in a regular expression

=item Looking ahead and looking behind

=item Using independent subexpressions to prevent backtracking

=item Conditional expressions

=item Defining named patterns

=item Recursive patterns

=item A bit of magic: executing Perl code in a regular expression

=item Backtracking control verbs

=item Pragmas and debugging

=back

=item SEE ALSO

=item AUTHOR AND COPYRIGHT

=over 4

=item Acknowledgments

=back

=back
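
The two backtracking walk-throughs listed above (under "Grouping things and
hierarchical matching" and "Matching repetitions") can be replayed with a
short script.  This is a minimal sketch, not text taken from the tutorial:
the patterns C<< /(abd|abc)(df|d|de)/ >> and C<< /^(.*)(at)(.*)$/ >> are
assumed from the steps described, since this table of contents does not
show them.

    use strict;
    use warnings;

    # Alternation: the engine tries 'abd', hits a dead end, backtracks
    # and settles on 'abc'; the second group then tries 'df' before 'd'.
    # (The pattern is an assumption reconstructed from the listed steps.)
    if ("abcde" =~ /(abd|abc)(df|d|de)/) {
        print "group 1: '$1', group 2: '$2'\n";
    }

    # Greedy quantifier: the first '.*' grabs the whole string, then
    # gives characters back until 'at' can match.
    # (Again, the pattern is assumed rather than quoted.)
    if ("the cat in the hat" =~ /^(.*)(at)(.*)$/) {
        print "1: '$1', 2: '$2', 3: '$3'\n";
    }

If the assumed patterns are the right ones, the first match should report
C<'abc'> and C<'d'>, as steps 4 and 8 state, and the third group of the
second match should be empty, as step 5 of that walk-through states.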

=head2 perlootut - Object-Oriented Programming in Perl Tutorial

=over 4

=item DATE

=item DESCRIPTION

=item OBJECT-ORIENTED FUNDAMENTALS

=over 4

=item Object

=item Class

=item Methods

=item Attributes

=item Polymorphism

=item Inheritance

=item Encapsulation

=item Composition

=item Roles

=item When to Use OO

=back

=item PERL OO SYSTEMS

=over 4

=item Moose

Declarative sugar, Roles built-in, A miniature type system, Full
introspection and manipulation, Self-hosted and extensible, Rich ecosystem,
Many more features

=item Class::Accessor

=item Class::Tiny

=item Role::Tiny

=item OO System Summary

L<Moose>, L<Class::Accessor>, L<Class::Tiny>, L<Role::Tiny>

=item Other OO Systems

=back

=item CONCLUSION

=back

=head2 perlperf - Perl Performance and Optimization Techniques

=over 4

=item DESCRIPTION

=item OVERVIEW

=over 4

=item ONE STEP SIDEWAYS

=item ONE STEP FORWARD

=item ANOTHER STEP SIDEWAYS

=back

=item GENERAL GUIDELINES

=item BENCHMARKS

=over 4

=item Assigning and Dereferencing Variables.

=item Search and replace or tr

=back

=item PROFILING TOOLS

=over 4

=item Devel::DProf

=item Devel::Profiler

=item Devel::SmallProf

=item Devel::FastProf

=item Devel::NYTProf

=back

=item SORTING

Elapsed Real Time, User CPU Time, System CPU Time

=item LOGGING

=over 4

=item Logging if DEBUG (constant)

=back

=item POSTSCRIPT

=item SEE ALSO

=over 4

=item PERLDOCS

=item MAN PAGES

=item MODULES

=item URLS

=back

=item AUTHOR

=back

=head2 perlstyle - Perl style guide

=over 4

=item DESCRIPTION

=back

=head2 perlcheat - Perl 5 Cheat Sheet

=over 4

=item DESCRIPTION

=over 4

=item The sheet

=back

=item ACKNOWLEDGEMENTS

=item AUTHOR

=item SEE ALSO

=back

=head2 perltrap - Perl traps for the unwary

=over 4

=item DESCRIPTION

=over 4

=item Awk Traps

=item C/C++ Traps

=item JavaScript Traps

=item Sed Traps

=item Shell Traps

=item Perl Traps

=back

=back

=head2 perldebtut - Perl debugging tutorial

=over 4

=item DESCRIPTION

=item use strict

=item Looking at data and -w and v

=item help

=item Stepping through code

=item Placeholder for a, w, t, T

=item REGULAR EXPRESSIONS

=item OUTPUT TIPS

=item CGI

=item GUIs

=item SUMMARY

=item SEE ALSO

=item AUTHOR

=item CONTRIBUTORS

=back

=head2 perlfaq - frequently asked questions about Perl

=over 4

=item VERSION

=item DESCRIPTION

=over 4

=item Where to find the perlfaq

=item How to use the perlfaq

=item How to contribute to the perlfaq

=item What if my question isn't answered in the FAQ?

=back

=item TABLE OF CONTENTS

perlfaq1 - General Questions About Perl, perlfaq2 - Obtaining and Learning
about Perl, perlfaq3 - Programming Tools, perlfaq4 - Data Manipulation,
perlfaq5 - Files and Formats, perlfaq6 - Regular Expressions, perlfaq7 -
General Perl Language Issues, perlfaq8 - System Interaction, perlfaq9 -
Web, Email and Networking

=item THE QUESTIONS

=over 4

=item L<perlfaq1>: General Questions About Perl

=item L<perlfaq2>: Obtaining and Learning about Perl

=item L<perlfaq3>: Programming Tools

=item L<perlfaq4>: Data Manipulation

=item L<perlfaq5>: Files and Formats

=item L<perlfaq6>: Regular Expressions

=item L<perlfaq7>: General Perl Language Issues

=item L<perlfaq8>: System Interaction

=item L<perlfaq9>: Web, Email and Networking

=back

=item CREDITS

=item AUTHOR AND COPYRIGHT

=back

=head2 perlfaq1 - General Questions About Perl

=over 4

=item VERSION

=item DESCRIPTION

=over 4

=item What is Perl?

=item Who supports Perl? Who develops it? Why is it free?

=item Which version of Perl should I use?

=item What are Perl 4, Perl 5, or Perl 6?

=item What is Perl 6?

=item How stable is Perl?

=item How often are new versions of Perl released?

=item Is Perl difficult to learn?

=item How does Perl compare with other languages like Java, Python, REXX,
Scheme, or Tcl?

=item Can I do [task] in Perl?

=item When shouldn't I program in Perl?

=item What's the difference between "perl" and "Perl"?

=item What is a JAPH?

=item How can I convince others to use Perl?

L<http://www.perl.org/about.html>,
L<http://perltraining.com.au/whyperl.html>

=back

=item AUTHOR AND COPYRIGHT

=back

=head2 perlfaq2 - Obtaining and Learning about Perl

=over 4

=item VERSION

=item DESCRIPTION

=over 4

=item What machines support Perl? Where do I get it?

=item How can I get a binary version of Perl?

=item I don't have a C compiler. How can I build my own Perl interpreter?

=item I copied the Perl binary from one machine to another, but scripts
don't work.

=item I grabbed the sources and tried to compile but gdbm/dynamic
loading/malloc/linking/... failed. How do I make it work?

=item What modules and extensions are available for Perl? What is CPAN?

=item Where can I get information on Perl?

L<http://www.perl.org/>, L<http://perldoc.perl.org/>,
L<http://learn.perl.org/>

=item What is perl.com? Perl Mongers? pm.org? perl.org? cpan.org?

L<http://www.perl.org/>, L<http://learn.perl.org/>,
L<http://jobs.perl.org/>, L<http://lists.perl.org/>

=item Where can I post questions?

=item Perl Books

=item Which magazines have Perl content?

=item Which Perl blogs should I read?

=item What mailing lists are there for Perl?

=item Where can I buy a commercial version of Perl?

=item Where do I send bug reports?

=back

=item AUTHOR AND COPYRIGHT

=back

=head2 perlfaq3 - Programming Tools

=over 4

=item VERSION

=item DESCRIPTION

=over 4

=item How do I do (anything)?

Basics, L<perldata> - Perl data types, L<perlvar> - Perl pre-defined
variables, L<perlsyn> - Perl syntax, L<perlop> - Perl operators and
precedence, L<perlsub> - Perl subroutines, Execution, L<perlrun> - how to
execute the Perl interpreter, L<perldebug> - Perl debugging, Functions,
L<perlfunc> - Perl builtin functions, Objects, L<perlref> - Perl references
and nested data structures, L<perlmod> - Perl modules (packages and symbol
tables), L<perlobj> - Perl objects, L<perltie> - how to hide an object
class in a simple variable, Data Structures, L<perlref> - Perl references
and nested data structures, L<perllol> - Manipulating arrays of arrays in
Perl, L<perldsc> - Perl Data Structures Cookbook, Modules, L<perlmod> -
Perl modules (packages and symbol tables), L<perlmodlib> - constructing new
Perl modules and finding existing ones, Regexes, L<perlre> - Perl regular
expressions, L<perlfunc> - Perl builtin functions, L<perlop> - Perl
operators and precedence, L<perllocale> - Perl locale handling
(internationalization and localization), Moving to perl5, L<perltrap> -
Perl traps for the unwary, L<perl>, Linking with C, L<perlxstut> - Tutorial
for writing XSUBs, L<perlxs> - XS language reference manual, L<perlcall> -
Perl calling conventions from C, L<perlguts> - Introduction to the Perl
API, L<perlembed> - how to embed perl in your C program, Various

=item How can I use Perl interactively?

=item How do I find which modules are installed on my system?

=item How do I debug my Perl programs?

=item How do I profile my Perl programs?

=item How do I cross-reference my Perl programs?

=item Is there a pretty-printer (formatter) for Perl?

=item Is there an IDE or Windows Perl Editor?

Eclipse, Enginsite, IntelliJ IDEA, Kephra, Komodo, Notepad++, Open Perl
IDE, OptiPerl, Padre, PerlBuilder, visiPerl+, Visual Perl, Zeus, GNU Emacs,
MicroEMACS, XEmacs, Jed, Vim, Vile, MultiEdit, SlickEdit, ConTEXT, bash,
zsh, BBEdit and TextWrangler

=item Where can I get Perl macros for vi?

=item Where can I get perl-mode or cperl-mode for emacs?
X<emacs>

=item How can I use curses with Perl?

=item How can I write a GUI (X, Tk, Gtk, etc.) in Perl?
X<GUI> X<Tk> X<Wx> X<WxWidgets> X<Gtk> X<Gtk2> X<CamelBones> X<Qt>

Tk, Wx, Gtk and Gtk2, Win32::GUI, CamelBones, Qt, Athena

=item How can I make my Perl program run faster?

=item How can I make my Perl program take less memory?

Don't slurp!, Use map and grep selectively, Avoid unnecessary quotes and
stringification, Pass by reference, Tie large variables to disk

=item Is it safe to return a reference to local or lexical data?

=item How can I free an array or hash so my program shrinks?

=item How can I make my CGI script more efficient?

=item How can I hide the source for my Perl program?

=item How can I compile my Perl program into byte code or C?

=item How can I get C<#!perl> to work on [MS-DOS,NT,...]?

=item Can I write useful Perl programs on the command line?

=item Why don't Perl one-liners work on my DOS/Mac/VMS system?

=item Where can I learn about CGI or Web programming in Perl?

=item Where can I learn about object-oriented Perl programming?

=item Where can I learn about linking C with Perl?

=item I've read perlembed, perlguts, etc., but I can't embed perl in my C
program; what am I doing wrong?

=item When I tried to run my script, I got this message. What does it mean?

=item What's MakeMaker?

=back

=item AUTHOR AND COPYRIGHT

=back

=head2 perlfaq4 - Data Manipulation

=over 4

=item VERSION

=item DESCRIPTION

=item Data: Numbers

=over 4

=item Why am I getting long decimals (eg, 19.9499999999999) instead of the
numbers I should be getting (eg, 19.95)?

=item Why is int() broken?

=item Why isn't my octal data interpreted correctly?

=item Does Perl have a round() function? What about ceil() and floor()?
Trig functions?

=item How do I convert between numeric representations/bases/radixes?

How do I convert hexadecimal into decimal, How do I convert from decimal to
hexadecimal, How do I convert from octal to decimal, How do I convert from
decimal to octal, How do I convert from binary to decimal, How do I convert
from decimal to binary

=item Why doesn't & work the way I want it to?

=item How do I multiply matrices?

=item How do I perform an operation on a series of integers?

=item How can I output Roman numerals?

=item Why aren't my random numbers random?

=item How do I get a random number between X and Y?

=back

=item Data: Dates

=over 4

=item How do I find the day or week of the year?

=item How do I find the current century or millennium?

=item How can I compare two dates and find the difference?

=item How can I take a string and turn it into epoch seconds?

=item How can I find the Julian Day?

=item How do I find yesterday's date?
X<date> X<yesterday> X<DateTime> X<Date::Calc> X<Time::Local>
X<daylight saving time> X<day> X<Today_and_Now> X<localtime>
X<timelocal>

=item Does Perl have a Year 2000 or 2038 problem? Is Perl Y2K compliant?

=back

=item Data: Strings

=over 4

=item How do I validate input?

=item How do I unescape a string?

=item How do I remove consecutive pairs of characters?

=item How do I expand function calls in a string?

=item How do I find matching/nesting anything?

=item How do I reverse a string?

=item How do I expand tabs in a string?

=item How do I reformat a paragraph?

=item How can I access or change N characters of a string?

=item How do I change the Nth occurrence of something?

=item How can I count the number of occurrences of a substring within a
string?

=item How do I capitalize all the words on one line?
X<Text::Autoformat> X<capitalize> X<case, title> X<case, sentence>

=item How can I split a [character]-delimited string except when inside
[character]?

=item How do I strip blank space from the beginning/end of a string?

=item How do I pad a string with blanks or pad a number with zeroes?

=item How do I extract selected columns from a string?

=item How do I find the soundex value of a string?

=item How can I expand variables in text strings?

=item What's wrong with always quoting "$vars"?

=item Why don't my E<lt>E<lt>HERE documents work?

There must be no space after the E<lt>E<lt> part, There (probably) should
be a semicolon at the end of the opening token, You can't (easily) have any
space in front of the tag, There needs to be at least a line separator
after the end token

=back

=item Data: Arrays

=over 4

=item What is the difference between a list and an array?

=item What is the difference between $array[1] and @array[1]?

=item How can I remove duplicate elements from a list or array?

=item How can I tell whether a certain element is contained in a list or
array?

=item How do I compute the difference of two arrays? How do I compute the
intersection of two arrays?

=item How do I test whether two arrays or hashes are equal?

=item How do I find the first array element for which a condition is true?

=item How do I handle linked lists?

=item How do I handle circular lists?
X<circular> X<array> X<Tie::Cycle> X<Array::Iterator::Circular>
X<cycle> X<modulus>

=item How do I shuffle an array randomly?

=item How do I process/modify each element of an array?

=item How do I select a random element from an array?

=item How do I permute N elements of a list?
X<List::Permutor> X<permute> X<Algorithm::Loops> X<Knuth>
X<The Art of Computer Programming> X<Fischer-Krause>

=item How do I sort an array by (anything)?

=item How do I manipulate arrays of bits?

=item Why does defined() return true on empty arrays and hashes?

=back

=item Data: Hashes (Associative Arrays)

=over 4

=item How do I process an entire hash?

=item How do I merge two hashes?
X<hash> X<merge> X<slice, hash>

=item What happens if I add or remove keys from a hash while iterating over
it?

=item How do I look up a hash element by value?

=item How can I know how many entries are in a hash?

=item How do I sort a hash (optionally by value instead of key)?

=item How can I always keep my hash sorted?
X<hash tie sort DB_File Tie::IxHash>

=item What's the difference between "delete" and "undef" with hashes?

=item Why don't my tied hashes make the defined/exists distinction?

=item How do I reset an each() operation part-way through?

=item How can I get the unique keys from two hashes?

=item How can I store a multidimensional array in a DBM file?

=item How can I make my hash remember the order I put elements into it?

=item Why does passing a subroutine an undefined element in a hash create
it?

=item How can I make the Perl equivalent of a C structure/C++ class/hash or
array of hashes or arrays?

=item How can I use a reference as a hash key?

=item How can I check if a key exists in a multilevel hash?

=item How can I prevent addition of unwanted keys into a hash?

=back

=item Data: Misc

=over 4

=item How do I handle binary data correctly?

=item How do I determine whether a scalar is a number/whole/integer/float?

=item How do I keep persistent data across program calls?

=item How do I print out or copy a recursive data structure?

=item How do I define methods for every class/object?

=item How do I verify a credit card checksum?

=item How do I pack arrays of doubles or floats for XS code?

=back

=item AUTHOR AND COPYRIGHT

=back

=head2 perlfaq5 - Files and Formats

=over 4

=item VERSION

=item DESCRIPTION

=over 4

=item How do I flush/unbuffer an output filehandle? Why must I do this?
X<flush> X<buffer> X<unbuffer> X<autoflush>

=item How do I change, delete, or insert a line in a file, or append to the
beginning of a file?
X<file, editing>

=item How do I count the number of lines in a file?
X<file, counting lines> X<lines> X<line>

=item How do I delete the last N lines from a file?
X<lines> X<file>

=item How can I use Perl's C<-i> option from within a program?
X<-i> X<in-place>

=item How can I copy a file?
X<copy> X<file, copy> X<File::Copy>

=item How do I make a temporary file name?
X<file, temporary>

=item How can I manipulate fixed-record-length files?
X<fixed-length> X<file, fixed-length records>

=item How can I make a filehandle local to a subroutine? How do I pass
filehandles between subroutines? How do I make an array of filehandles?
X<filehandle, local> X<filehandle, passing> X<filehandle, reference>

=item How can I use a filehandle indirectly?
X<filehandle, indirect>

=item How can I set up a footer format to be used with write()?
X<footer>

=item How can I write() into a string?
X<write, into a string>

=item How can I open a filehandle to a string?
X<string> X<open> X<IO::String> X<filehandle>

=item How can I output my numbers with commas added?
X<number, commify>

=item How can I translate tildes (~) in a filename?
X<tilde> X<tilde expansion>

=item How come when I open a file read-write it wipes it out?
X<clobber> X<read-write> X<clobbering> X<truncate> X<truncating>

=item Why do I sometimes get an "Argument list too long" when I use
E<lt>*E<gt>?
X<argument list too long>

=item How can I open a file with a leading ">" or trailing blanks?
X<filename, special characters>

=item How can I reliably rename a file?
X<rename> X<mv> X<move> X<file, rename>

=item How can I lock a file?
X<lock> X<file, lock> X<flock>

=item Why can't I just open(FH, "E<gt>file.lock")?
X<lock, lockfile race condition>

=item I still don't get locking. I just want to increment the number in the
file. How can I do this?
X<counter> X<file, counter>

=item All I want to do is append a small amount of text to the end of a
file. Do I still have to use locking?
X<append> X<file, append>

=item How do I randomly update a binary file?
X<file, binary patch>

=item How do I get a file's timestamp in perl?
X<timestamp> X<file, timestamp>

=item How do I set a file's timestamp in perl?
X<timestamp> X<file, timestamp>

=item How do I print to more than one file at once?
X<print, to multiple files>

=item How can I read in an entire file all at once?
X<slurp> X<file, slurping>

=item How can I read in a file by paragraphs?
X<file, reading by paragraphs>

=item How can I read a single character from a file? From the keyboard?
X<getc> X<file, reading one character at a time>

=item How can I tell whether there's a character waiting on a filehandle?

=item How do I do a C<tail -f> in perl?
X<tail> X<IO::Handle> X<File::Tail> X<clearerr>

=item How do I dup() a filehandle in Perl?
X<dup>

=item How do I close a file descriptor by number?
X<file, closing file descriptors> X<POSIX> X<close>

=item Why can't I use "C:\temp\foo" in DOS paths? Why doesn't
`C:\temp\foo.exe` work?
X<filename, DOS issues>

=item Why doesn't glob("*.*") get all the files?
X<glob>

=item Why does Perl let me delete read-only files? Why does C<-i> clobber
protected files? Isn't this a bug in Perl?

=item How do I select a random line from a file?
X<file, selecting a random line>

=item Why do I get weird spaces when I print an array of lines?

=item How do I traverse a directory tree?

=item How do I delete a directory tree?

=item How do I copy an entire directory?

=back

=item AUTHOR AND COPYRIGHT

=back

=head2 perlfaq6 - Regular Expressions

=over 4

=item VERSION

=item DESCRIPTION

=over 4

=item How can I hope to use regular expressions without creating illegible
and unmaintainable code?
X<regex, legibility> X<regexp, legibility>
X<regular expression, legibility> X</x>

Comments Outside the Regex, Comments Inside the Regex, Different Delimiters

=item I'm having trouble matching over more than one line. What's wrong?
X<regex, multiline> X<regexp, multiline> X<regular expression, multiline>

=item How can I pull out lines between two patterns that are themselves on
different lines?
X<..>

=item How do I match XML, HTML, or other nasty, ugly things with a regex?
X<regex, XML> X<regex, HTML> X<XML> X<HTML> X<pain> X<frustration>
X<sucking out, will to live>

=item I put a regular expression into $/ but it didn't work. What's wrong?
X<$/, regexes in> X<$INPUT_RECORD_SEPARATOR, regexes in>
X<$RS, regexes in>

=item How do I substitute case-insensitively on the LHS while preserving
case on the RHS?
X<replace, case preserving> X<substitute, case preserving>
X<substitution, case preserving> X<s, case preserving>

=item How can I make C<\w> match national character sets?
X<\w>

=item How can I match a locale-smart version of C</[a-zA-Z]/>?
X<alpha>

=item How can I quote a variable to use in a regex?
X<regex, escaping> X<regexp, escaping> X<regular expression, escaping>

=item What is C</o> really for?
X</o, regular expressions> X<compile, regular expressions>

=item How do I use a regular expression to strip C-style comments from a
file?

=item Can I use Perl regular expressions to match balanced text?
X<regex, matching balanced text> X<regexp, matching balanced text>
X<regular expression, matching balanced text> X<possessive> X<PARNO>
X<Text::Balanced> X<Regexp::Common> X<backtracking> X<recursion>

=item What does it mean that regexes are greedy? How can I get around it?
X<greedy> X<greediness>

=item How do I process each word on each line?
X<word>

=item How can I print out a word-frequency or line-frequency summary?

=item How can I do approximate matching?
X<match, approximate> X<matching, approximate>

=item How do I efficiently match many regular expressions at once?
X<regex, efficiency> X<regexp, efficiency>
X<regular expression, efficiency>

=item Why don't word-boundary searches with C<\b> work for me?
X<\b>

=item Why does using $&, $`, or $' slow my program down?
X<$MATCH> X<$&> X<$POSTMATCH> X<$'> X<$PREMATCH> X<$`>

=item What good is C<\G> in a regular expression?
X<\G>

=item Are Perl regexes DFAs or NFAs? Are they POSIX compliant?
X<DFA> X<NFA> X<POSIX>

=item What's wrong with using grep in a void context?
X<grep>

=item How can I match strings with multibyte characters?
X<regex, and multibyte characters> X<regexp, and multibyte characters>
X<regular expression, and multibyte characters> X<martian> X<encoding,
Martian>

=item How do I match a regular expression that's in a variable?
X<regex, in variable> X<eval> X<regex> X<quotemeta> X<\Q, regex>
X<\E, regex> X<qr//>

=back

=item AUTHOR AND COPYRIGHT

=back

=head2 perlfaq7 - General Perl Language Issues

=over 4

=item VERSION

=item DESCRIPTION

=over 4

=item Can I get a BNF/yacc/RE for the Perl language?

=item What are all these $@%&* punctuation signs, and how do I know when to
use them?

=item Do I always/never have to quote my strings or use semicolons and
commas?

=item How do I skip some return values?

=item How do I temporarily block warnings?

=item What's an extension?

=item Why do Perl operators have different precedence than C operators?

=item How do I declare/create a structure?

=item How do I create a module?

=item How do I adopt or take over a module already on CPAN?

=item How do I create a class?
X<class, creation> X<package>

=item How can I tell if a variable is tainted?

=item What's a closure?

=item What is variable suicide and how can I prevent it?

=item How can I pass/return a {Function, FileHandle, Array, Hash, Method,
Regex}?

Passing Variables and Functions, Passing Filehandles, Passing Regexes,
Passing Methods

=item How do I create a static variable?

=item What's the difference between dynamic and lexical (static) scoping?
Between local() and my()?

=item How can I access a dynamic variable while a similarly named lexical
is in scope?

=item What's the difference between deep and shallow binding?

=item Why doesn't "my($foo) = E<lt>$fhE<gt>;" work right?

=item How do I redefine a builtin function, operator, or method?

=item What's the difference between calling a function as &foo and foo()?

=item How do I create a switch or case statement?

=item How can I catch accesses to undefined variables, functions, or
methods?

=item Why can't a method included in this same file be found?

=item How can I find out my current or calling package?

=item How can I comment out a large block of Perl code?

=item How do I clear a package?

=item How can I use a variable as a variable name?

=item What does "bad interpreter" mean?

=item Do I need to recompile XS modules when there is a change in the C
library?

=back

=item AUTHOR AND COPYRIGHT

=back

=head2 perlfaq8 - System Interaction

=over 4

=item VERSION

=item DESCRIPTION

=over 4

=item How do I find out which operating system I'm running under?

=item How come exec() doesn't return?
X<exec> X<system> X<fork> X<open> X<pipe>

=item How do I do fancy stuff with the keyboard/screen/mouse?

Keyboard, Screen, Mouse

=item How do I print something out in color?

=item How do I read just one key without waiting for a return key?

=item How do I check whether input is ready on the keyboard?

=item How do I clear the screen?

=item How do I get the screen size?

=item How do I ask the user for a password?

=item How do I read and write the serial port?

lockfiles, open mode, end of line, flushing output, non-blocking input

=item How do I decode encrypted password files?

=item How do I start a process in the background?

STDIN, STDOUT, and STDERR are shared, Signals, Zombies

=item How do I trap control characters/signals?

=item How do I modify the shadow password file on a Unix system?

=item How do I set the time and date?

=item How can I sleep() or alarm() for under a second?
X<Time::HiRes> X<BSD::Itimer> X<sleep> X<select>

=item How can I measure time under a second?
X<Time::HiRes> X<BSD::Itimer> X<sleep> X<select>

=item How can I do an atexit() or setjmp()/longjmp()? (Exception handling)

=item Why doesn't my sockets program work under System V (Solaris)? What
does the error message "Protocol not supported" mean?

=item How can I call my system's unique C functions from Perl?

=item Where do I get the include files to do ioctl() or syscall()?

=item Why do setuid perl scripts complain about kernel problems?

=item How can I open a pipe both to and from a command?

=item Why can't I get the output of a command with system()?

=item How can I capture STDERR from an external command?

=item Why doesn't open() return an error when a pipe open fails?

=item What's wrong with using backticks in a void context?

=item How can I call backticks without shell processing?

=item Why can't my script read from STDIN after I gave it EOF (^D on Unix,
^Z on MS-DOS)?

=item How can I convert my shell script to perl?

=item Can I use perl to run a telnet or ftp session?

=item How can I write expect in Perl?

=item Is there a way to hide perl's command line from programs such as
"ps"?

=item I {changed directory, modified my environment} in a perl script. How
come the change disappeared when I exited the script? How do I get my
changes to be visible?

Unix

=item How do I close a process's filehandle without waiting for it to
complete?

=item How do I fork a daemon process?

=item How do I find out if I'm running interactively or not?

=item How do I timeout a slow event?

=item How do I set CPU limits?
X<BSD::Resource> X<limit> X<CPU>

=item How do I avoid zombies on a Unix system?

=item How do I use an SQL database?

=item How do I make a system() exit on control-C?

=item How do I open a file without blocking?

=item How do I tell the difference between errors from the shell and perl?

=item How do I install a module from CPAN?

=item What's the difference between require and use?

=item How do I keep my own module/library directory?

=item How do I add the directory my program lives in to the module/library
search path?

=item How do I add a directory to my include path (@INC) at runtime?

the C<PERLLIB> environment variable, the C<PERL5LIB> environment variable,
the C<perl -Idir> command line flag, the C<lib> pragma, the L<local::lib>
module

=item Where are modules installed?

=item What is socket.ph and where do I get it?

=back

=item AUTHOR AND COPYRIGHT

=back

=head2 perlfaq9 - Web, Email and Networking

=over 4

=item VERSION

=item DESCRIPTION

=over 4

=item Should I use a web framework?

=item Which web framework should I use?
X<framework> X<CGI.pm> X<CGI> X<Catalyst> X<Dancer>

L<Catalyst>, L<Dancer>, L<Mojolicious>, L<Web::Simple>

=item What are Plack and PSGI?

=item How do I remove HTML from a string?

=item How do I extract URLs?

=item How do I fetch an HTML file?

=item How do I automate an HTML form submission?

=item How do I decode or create those %-encodings on the web?
X<URI> X<URI::Escape> X<RFC 2396>

=item How do I redirect to another page?

=item How do I put a password on my web pages?

=item How do I make sure users can't enter values into a form that cause
my CGI script to do bad things?

=item How do I parse a mail header?

=item How do I check a valid mail address?

=item How do I decode a MIME/BASE64 string?

=item How do I find the user's mail address?

=item How do I send email?

L<Email::Sender::Transport::Sendmail>, L<Email::Sender::Transport::SMTP>,
L<Email::Sender::Transport::SMTP::TLS>

=item How do I use MIME to make an attachment to a mail message?

=item How do I read email?

=item How do I find out my hostname, domainname, or IP address?
X<hostname, domainname, IP address, host, domain, hostfqdn, inet_ntoa,
gethostbyname, Socket, Net::Domain, Sys::Hostname>

=item How do I fetch/put an (S)FTP file?

=item How can I do RPC in Perl?

=back

=item AUTHOR AND COPYRIGHT

=back

=head2 perlsyn - Perl syntax

=over 4

=item DESCRIPTION

=over 4

=item Declarations
X<declaration> X<undef> X<undefined> X<uninitialized>

=item Comments
X<comment> X<#>

=item Simple Statements
X<statement> X<semicolon> X<expression> X<;>

=item Truth and Falsehood
X<truth> X<falsehood> X<true> X<false> X<!> X<not> X<negation> X<0>

=item Statement Modifiers
X<statement modifier> X<modifier> X<if> X<unless> X<while>
X<until> X<when> X<foreach> X<for>

=item Compound Statements
X<statement, compound> X<block> X<bracket, curly> X<curly bracket> X<brace>
X<{> X<}> X<if> X<unless> X<given> X<while> X<until> X<foreach> X<for>
X<continue>

=item Loop Control
X<loop control> X<loop, control> X<next> X<last> X<redo> X<continue>

=item For Loops
X<for> X<foreach>

=item Foreach Loops
X<for> X<foreach>

=item Basic BLOCKs
X<block>

=item Switch Statements

=item Goto
X<goto>

=item The Ellipsis Statement
X<...>
X<... statement>
X<ellipsis operator>
X<elliptical statement>
X<unimplemented statement>
X<unimplemented operator>
X<yada-yada>
X<yada-yada operator>
X<... operator>
X<whatever operator>
X<triple-dot operator>

=item PODs: Embedded Documentation
X<POD> X<documentation>

=item Plain Old Comments (Not!)
X<comment> X<line> X<#> X<preprocessor> X<eval>

=item Experimental Details on given and when

Z<>1, Z<>2, Z<>3, Z<>4, Z<>5, Z<>6, Z<>7, Z<>8, Z<>9, Z<>10

=back

=back

=head2 perldata - Perl data types

=over 4

=item DESCRIPTION

=over 4

=item Variable names
X<variable, name> X<variable name> X<data type> X<type>

=item Identifier parsing
X<identifiers>

=item Context
X<context> X<scalar context> X<list context>

=item Scalar values
X<scalar> X<number> X<string> X<reference>

=item Scalar value constructors
X<scalar, literal> X<scalar, constant>

=item List value constructors
X<list>

=item Subscripts

=item Multi-dimensional array emulation

=item Slices
X<slice> X<array, slice> X<hash, slice>

=item Typeglobs and Filehandles
X<typeglob> X<filehandle> X<*>

=back

=item SEE ALSO

=back

=head2 perlop - Perl operators and precedence

=over 4

=item DESCRIPTION

=over 4

=item Operator Precedence and Associativity
X<operator, precedence> X<precedence> X<associativity>

=item Terms and List Operators (Leftward)
X<list operator> X<operator, list> X<term>

=item The Arrow Operator
X<arrow> X<dereference> X<< -> >>

=item Auto-increment and Auto-decrement
X<increment> X<auto-increment> X<++> X<decrement> X<auto-decrement> X<-->

=item Exponentiation
X<**> X<exponentiation> X<power>

=item Symbolic Unary Operators
X<unary operator> X<operator, unary>

=item Binding Operators
X<binding> X<operator, binding> X<=~> X<!~>

=item Multiplicative Operators
X<operator, multiplicative>

=item Additive Operators
X<operator, additive>

=item Shift Operators
X<shift operator> X<operator, shift> X<<< << >>>
X<<< >> >>> X<right shift> X<left shift> X<bitwise shift>
X<shl> X<shr> X<shift, right> X<shift, left>

=item Named Unary Operators
X<operator, named unary>

=item Relational Operators
X<relational operator> X<operator, relational>

=item Equality Operators
X<equality> X<equal> X<equals> X<operator, equality>

=item Smartmatch Operator

1. Empty hashes or arrays match, 2. That is, each element smartmatches the
element of the same index in the other array.[3], 3. If a circular
reference is found, fall back to referential equality, 4. Either an actual
number, or a string that looks like one

=item Bitwise And
X<operator, bitwise, and> X<bitwise and> X<&>

=item Bitwise Or and Exclusive Or
X<operator, bitwise, or> X<bitwise or> X<|> X<operator, bitwise, xor>
X<bitwise xor> X<^>

=item C-style Logical And
X<&&> X<logical and> X<operator, logical, and>

=item C-style Logical Or
X<||> X<operator, logical, or>

=item Logical Defined-Or
X<//> X<operator, logical, defined-or>

=item Range Operators
X<operator, range> X<range> X<..> X<...>

=item Conditional Operator
X<operator, conditional> X<operator, ternary> X<ternary> X<?:>

=item Assignment Operators
X<assignment> X<operator, assignment> X<=> X<**=> X<+=> X<*=> X<&=>
X<<< <<= >>> X<&&=> X<-=> X</=> X<|=> X<<< >>= >>> X<||=> X<//=> X<.=>
X<%=> X<^=> X<x=> X<&.=> X<|.=> X<^.=>

=item Comma Operator
X<comma> X<operator, comma> X<,>

=item List Operators (Rightward)
X<operator, list, rightward> X<list operator>

=item Logical Not
X<operator, logical, not> X<not>

=item Logical And
X<operator, logical, and> X<and>

=item Logical Or and Exclusive Or
X<operator, logical, or> X<operator, logical, xor>
X<operator, logical, exclusive or>
X<or> X<xor>

=item C Operators Missing From Perl
X<operator, missing from perl> X<&> X<*>
X<typecasting> X<(TYPE)>

unary &, unary *, (TYPE)

=item Quote and Quote-like Operators
X<operator, quote> X<operator, quote-like> X<q> X<qq> X<qx> X<qw> X<m>
X<qr> X<s> X<tr> X<'> X<''> X<"> X<""> X<//> X<`> X<``> X<<< << >>>
X<escape sequence> X<escape>

[1], [2], [3], [4], [5], [6], [7], [8]

=item Regexp Quote-Like Operators
X<operator, regexp>

C<qr/I<STRING>/msixpodualn> X<qr> X</i> X</m> X</o> X</s> X</x> X</p>,
C<m/I<PATTERN>/msixpodualngc> X<m> X<operator, match> X<regexp, options>
X<regexp> X<regex, options> X<regex> X</m> X</s> X</i> X</x> X</p> X</o>
X</g> X</c>, C</I<PATTERN>/msixpodualngc>, The empty pattern C<//>,
Matching in list context, C<\G I<assertion>>, C<m?I<PATTERN>?msixpodualngc>
X<?> X<operator, match-once>,
C<s/I<PATTERN>/I<REPLACEMENT>/msixpodualngcer> X<s> X<substitute>
X<substitution> X<replace> X<regexp, replace> X<regexp, substitute> X</m>
X</s> X</i> X</x> X</p> X</o> X</g> X</c> X</e> X</r>

=item Quote-Like Operators
X<operator, quote-like>

C<q/I<STRING>/> X<q> X<quote, single> X<'> X<''>, C<'I<STRING>'>,
C<qq/I<STRING>/> X<qq> X<quote, double> X<"> X<"">, "I<STRING>",
C<qx/I<STRING>/> X<qx> X<`> X<``> X<backtick>, C<`I<STRING>`>,
C<qw/I<STRING>/> X<qw> X<quote, list> X<quote, words>,
C<tr/I<SEARCHLIST>/I<REPLACEMENTLIST>/cdsr> X<tr> X<y> X<transliterate>
X</c> X</d> X</s>, C<y/I<SEARCHLIST>/I<REPLACEMENTLIST>/cdsr>, C<< <<I<EOF>
>> X<here-doc> X<heredoc> X<here-document> X<<< << >>>, Double Quotes,
Single Quotes, Backticks, Indented Here-docs

=item Gory details of parsing quoted constructs
X<quote, gory details>

Finding the end, Interpolation X<interpolation>, C<<<'EOF'>, C<m''>, the
pattern of C<s'''>, C<''>, C<q//>, C<tr'''>, C<y'''>, the replacement of
C<s'''>, C<tr///>, C<y///>, C<"">, C<``>, C<qq//>, C<qx//>, C<< <file*glob>
>>, C<<<"EOF">, the replacement of C<s///>, C<RE> in C<m?RE?>, C</RE/>,
C<m/RE/>, C<s/RE/foo/>, parsing regular expressions X<regexp, parse>,
Optimization of regular expressions X<regexp, optimization>

=item I/O Operators
X<operator, i/o> X<operator, io> X<io> X<while> X<filehandle>
X<< <> >> X<< <<>> >> X<@ARGV>

=item Constant Folding
X<constant folding> X<folding>

=item No-ops
X<no-op> X<nop>

=item Bitwise String Operators
X<operator, bitwise, string> X<&.> X<|.> X<^.> X<~.>

=item Integer Arithmetic
X<integer>

=item Floating-point Arithmetic

=item Bigger Numbers
X<number, arbitrary precision>

=back

=back

=head2 perlsub - Perl subroutines

=over 4

=item SYNOPSIS

=item DESCRIPTION

documented later in this document, documented in L<perlmod>, documented in
L<perlobj>, documented in L<perltie>, documented in L<PerlIO::via>,
documented in L<perlfunc>, documented in L<UNIVERSAL>, documented in
L<perldebguts>, undocumented, used internally by the L<overload> feature

=over 4

=item Signatures

=item Private Variables via my()
X<my> X<variable, lexical> X<lexical> X<lexical variable> X<scope, lexical>
X<lexical scope> X<attributes, my>

=item Persistent Private Variables
X<state> X<state variable> X<static> X<variable, persistent> X<variable,
static> X<closure>

=item Temporary Values via local()
X<local> X<scope, dynamic> X<dynamic scope> X<variable, local>
X<variable, temporary>

=item Lvalue subroutines
X<lvalue> X<subroutine, lvalue>

=item Lexical Subroutines
X<my sub> X<state sub> X<our sub> X<subroutine, lexical>

=item Passing Symbol Table Entries (typeglobs)
X<typeglob> X<*>

=item When to Still Use local()
X<local> X<variable, local>

=item Pass by Reference
X<pass by reference> X<pass-by-reference> X<reference>

=item Prototypes
X<prototype> X<subroutine, prototype>

=item Constant Functions
X<constant>

=item Overriding Built-in Functions
X<built-in> X<override> X<CORE> X<CORE::GLOBAL>

=item Autoloading
X<autoloading> X<AUTOLOAD>

=item Subroutine Attributes
X<attribute> X<subroutine, attribute> X<attrs>

=back

=item SEE ALSO

=back

=head2 perlfunc - Perl builtin functions

=over 4

=item DESCRIPTION

=over 4

=item Perl Functions by Category
X<function>

Functions for SCALARs or strings X<scalar> X<string> X<character>, Regular
expressions and pattern matching X<regular expression> X<regex> X<regexp>,
Numeric functions X<numeric> X<number> X<trigonometric> X<trigonometry>,
Functions for real @ARRAYs X<array>, Functions for list data X<list>,
Functions for real %HASHes X<hash>, Input and output functions X<I/O>
X<input> X<output> X<dbm>, Functions for fixed-length data or records,
Functions for filehandles, files, or directories X<file> X<filehandle>
X<directory> X<pipe> X<link> X<symlink>, Keywords related to the control
flow of your Perl program X<control flow>, Keywords related to scoping,
Miscellaneous functions, Functions for processes and process groups
X<process> X<pid> X<process id>, Keywords related to Perl modules
X<module>, Keywords related to classes and object-orientation X<object>
X<class> X<package>, Low-level socket functions X<socket> X<sock>, System V
interprocess communication functions X<IPC> X<System V> X<semaphore>
X<shared memory> X<memory> X<message>, Fetching user and group info X<user>
X<group> X<password> X<uid> X<gid> X<passwd> X</etc/passwd>, Fetching
network info X<network> X<protocol> X<host> X<hostname> X<IP> X<address>
X<service>, Time-related functions X<time> X<date>, Non-function keywords

=item Portability
X<portability> X<Unix> X<portable>

=item Alphabetical Listing of Perl Functions

-I<X> FILEHANDLE
X<-r>X<-w>X<-x>X<-o>X<-R>X<-W>X<-X>X<-O>X<-e>X<-z>X<-s>X<-f>X<-d>X<-l>X<-p>
X<-S>X<-b>X<-c>X<-t>X<-u>X<-g>X<-k>X<-T>X<-B>X<-M>X<-A>X<-C>, -I<X> EXPR,
-I<X> DIRHANDLE, -I<X>, abs VALUE X<abs> X<absolute>, abs, accept
NEWSOCKET,GENERICSOCKET X<accept>, alarm SECONDS X<alarm> X<SIGALRM>
X<timer>, alarm, atan2 Y,X X<atan2> X<arctangent> X<tan> X<tangent>, bind
SOCKET,NAME X<bind>, binmode FILEHANDLE, LAYER X<binmode> X<binary> X<text>
X<DOS> X<Windows>, binmode FILEHANDLE, bless REF,CLASSNAME X<bless>, bless
REF, break, caller EXPR X<caller> X<call stack> X<stack> X<stack trace>,
caller, chdir EXPR X<chdir> X<cd> X<directory, change>, chdir FILEHANDLE,
chdir DIRHANDLE, chdir, chmod LIST X<chmod> X<permission> X<mode>, chomp
VARIABLE X<chomp> X<INPUT_RECORD_SEPARATOR> X<$/> X<newline> X<eol>, chomp(
LIST ), chomp, chop VARIABLE X<chop>, chop( LIST ), chop, chown LIST
X<chown> X<owner> X<user> X<group>, chr NUMBER X<chr> X<character> X<ASCII>
X<Unicode>, chr, chroot FILENAME X<chroot> X<root>, chroot, close
FILEHANDLE X<close>, close, closedir DIRHANDLE X<closedir>, connect
SOCKET,NAME X<connect>, continue BLOCK X<continue>, continue, cos EXPR
X<cos> X<cosine> X<acos> X<arccosine>, cos, crypt PLAINTEXT,SALT X<crypt>
X<digest> X<hash> X<salt> X<plaintext> X<password> X<decrypt>
X<cryptography> X<passwd> X<encrypt>, dbmclose HASH X<dbmclose>, dbmopen
HASH,DBNAME,MASK X<dbmopen> X<dbm> X<ndbm> X<sdbm> X<gdbm>, defined EXPR
X<defined> X<undef> X<undefined>, defined, delete EXPR X<delete>, die LIST
X<die> X<throw> X<exception> X<raise> X<$@> X<abort>, do BLOCK X<do>
X<block>, do EXPR X<do>, dump LABEL X<dump> X<core> X<undump>, dump EXPR,
dump, each HASH X<each> X<hash, iterator>, each ARRAY X<array, iterator>,
eof FILEHANDLE X<eof> X<end of file> X<end-of-file>, eof (), eof, eval EXPR
X<eval> X<try> X<catch> X<evaluate> X<parse> X<execute> X<error, handling>
X<exception, handling>, eval BLOCK, eval, String eval, Under the
L<C<"unicode_eval"> feature|feature/The 'unicode_eval' and 'evalbytes'
features>, Outside the C<"unicode_eval"> feature, Block eval, evalbytes
EXPR X<evalbytes>, evalbytes, exec LIST X<exec> X<execute>, exec PROGRAM
LIST, exists EXPR X<exists> X<autovivification>, exit EXPR X<exit>
X<terminate> X<abort>, exit, exp EXPR X<exp> X<exponential> X<antilog>
X<antilogarithm> X<e>, exp, fc EXPR X<fc> X<foldcase> X<casefold>
X<fold-case> X<case-fold>, fc, fcntl FILEHANDLE,FUNCTION,SCALAR X<fcntl>,
__FILE__ X<__FILE__>, fileno FILEHANDLE X<fileno>, flock
FILEHANDLE,OPERATION X<flock> X<lock> X<locking>, fork X<fork> X<child>
X<parent>, format X<format>, formline PICTURE,LIST X<formline>, getc
FILEHANDLE X<getc> X<getchar> X<character> X<file, read>, getc, getlogin
X<getlogin> X<login>, getpeername SOCKET X<getpeername> X<peer>, getpgrp
PID X<getpgrp> X<group>, getppid X<getppid> X<parent> X<pid>, getpriority
WHICH,WHO X<getpriority> X<priority> X<nice>, getpwnam NAME X<getpwnam>
X<getgrnam> X<gethostbyname> X<getnetbyname> X<getprotobyname> X<getpwuid>
X<getgrgid> X<getservbyname> X<gethostbyaddr> X<getnetbyaddr>
X<getprotobynumber> X<getservbyport> X<getpwent> X<getgrent> X<gethostent>
X<getnetent> X<getprotoent> X<getservent> X<setpwent> X<setgrent>
X<sethostent> X<setnetent> X<setprotoent> X<setservent> X<endpwent>
X<endgrent> X<endhostent> X<endnetent> X<endprotoent> X<endservent>,
getgrnam NAME, gethostbyname NAME, getnetbyname NAME, getprotobyname NAME,
getpwuid UID, getgrgid GID, getservbyname NAME,PROTO, gethostbyaddr
ADDR,ADDRTYPE, getnetbyaddr ADDR,ADDRTYPE, getprotobynumber NUMBER,
getservbyport PORT,PROTO, getpwent, getgrent, gethostent, getnetent,
getprotoent, getservent, setpwent, setgrent, sethostent STAYOPEN, setnetent
STAYOPEN, setprotoent STAYOPEN, setservent STAYOPEN, endpwent, endgrent,
endhostent, endnetent, endprotoent, endservent, getsockname SOCKET
X<getsockname>, getsockopt SOCKET,LEVEL,OPTNAME X<getsockopt>, glob EXPR
X<glob> X<wildcard> X<filename, expansion> X<expand>, glob, gmtime EXPR
X<gmtime> X<UTC> X<Greenwich>, gmtime, goto LABEL X<goto> X<jump> X<jmp>,
goto EXPR, goto &NAME, grep BLOCK LIST X<grep>, grep EXPR,LIST, hex EXPR
X<hex> X<hexadecimal>, hex, import LIST X<import>, index
STR,SUBSTR,POSITION X<index> X<indexOf> X<InStr>, index STR,SUBSTR, int
EXPR X<int> X<integer> X<truncate> X<trunc> X<floor>, int, ioctl
FILEHANDLE,FUNCTION,SCALAR X<ioctl>, join EXPR,LIST X<join>, keys HASH
X<keys> X<key>, keys ARRAY, kill SIGNAL, LIST, kill SIGNAL X<kill>
X<signal>, last LABEL X<last> X<break>, last EXPR, last, lc EXPR X<lc>
X<lowercase>, lc, If C<use bytes> is in effect:, Otherwise, if C<use
locale> for C<LC_CTYPE> is in effect:, Otherwise, If EXPR has the UTF8 flag
set:, Otherwise, if C<use feature 'unicode_strings'> or C<use locale
':not_characters'> is in effect:, Otherwise:, lcfirst EXPR X<lcfirst>
X<lowercase>, lcfirst, length EXPR X<length> X<size>, length, __LINE__
X<__LINE__>, link OLDFILE,NEWFILE X<link>, listen SOCKET,QUEUESIZE
X<listen>, local EXPR X<local>, localtime EXPR X<localtime> X<ctime>,
localtime, lock THING X<lock>, log EXPR X<log> X<logarithm> X<e> X<ln>
X<base>, log, lstat FILEHANDLE X<lstat>, lstat EXPR, lstat DIRHANDLE,
lstat, m//, map BLOCK LIST X<map>, map EXPR,LIST, mkdir FILENAME,MASK
X<mkdir> X<md> X<directory, create>, mkdir FILENAME, mkdir, msgctl
ID,CMD,ARG X<msgctl>, msgget KEY,FLAGS X<msgget>, msgrcv
ID,VAR,SIZE,TYPE,FLAGS X<msgrcv>, msgsnd ID,MSG,FLAGS X<msgsnd>, my VARLIST
X<my>, my TYPE VARLIST, my VARLIST : ATTRS, my TYPE VARLIST : ATTRS, next
LABEL X<next> X<continue>, next EXPR, next, no MODULE VERSION LIST X<no
declarations> X<unimporting>, no MODULE VERSION, no MODULE LIST, no MODULE,
no VERSION, oct EXPR X<oct> X<octal> X<hex> X<hexadecimal> X<binary>
X<bin>, oct, open FILEHANDLE,EXPR X<open> X<pipe> X<file, open> X<fopen>,
open FILEHANDLE,MODE,EXPR, open FILEHANDLE,MODE,EXPR,LIST, open
FILEHANDLE,MODE,REFERENCE, open FILEHANDLE, opendir DIRHANDLE,EXPR
X<opendir>, ord EXPR X<ord> X<encoding>, ord, our VARLIST X<our> X<global>,
our TYPE VARLIST, our VARLIST : ATTRS, our TYPE VARLIST : ATTRS, pack
TEMPLATE,LIST X<pack>, package NAMESPACE, package NAMESPACE VERSION
X<package> X<module> X<namespace> X<version>, package NAMESPACE BLOCK,
package NAMESPACE VERSION BLOCK X<package> X<module> X<namespace>
X<version>, __PACKAGE__ X<__PACKAGE__>, pipe READHANDLE,WRITEHANDLE
X<pipe>, pop ARRAY X<pop> X<stack>, pop, pos SCALAR X<pos> X<match,
position>, pos, print FILEHANDLE LIST X<print>, print FILEHANDLE, print
LIST, print, printf FILEHANDLE FORMAT, LIST X<printf>, printf FILEHANDLE,
printf FORMAT, LIST, printf, prototype FUNCTION X<prototype>, prototype,
push ARRAY,LIST X<push> X<stack>, q/STRING/, qq/STRING/, qw/STRING/,
qx/STRING/, qr/STRING/, quotemeta EXPR X<quotemeta> X<metacharacter>,
quotemeta, rand EXPR X<rand> X<random>, rand, read
FILEHANDLE,SCALAR,LENGTH,OFFSET X<read> X<file, read>, read
FILEHANDLE,SCALAR,LENGTH, readdir DIRHANDLE X<readdir>, readline EXPR,
readline X<readline> X<gets> X<fgets>, readlink EXPR X<readlink>, readlink,
readpipe EXPR, readpipe X<readpipe>, recv SOCKET,SCALAR,LENGTH,FLAGS
X<recv>, redo LABEL X<redo>, redo EXPR, redo, ref EXPR X<ref> X<reference>,
ref, rename OLDNAME,NEWNAME X<rename> X<move> X<mv> X<ren>, require VERSION
X<require>, require EXPR, require, reset EXPR X<reset>, reset, return EXPR
X<return>, return, reverse LIST X<reverse> X<rev> X<invert>, rewinddir
DIRHANDLE X<rewinddir>, rindex STR,SUBSTR,POSITION X<rindex>, rindex
STR,SUBSTR, rmdir FILENAME X<rmdir> X<rd> X<directory, remove>, rmdir,
s///, say FILEHANDLE LIST X<say>, say FILEHANDLE, say LIST, say, scalar
EXPR X<scalar> X<context>, seek FILEHANDLE,POSITION,WHENCE X<seek> X<fseek>
X<filehandle, position>, seekdir DIRHANDLE,POS X<seekdir>, select
FILEHANDLE X<select> X<filehandle, default>, select, select
RBITS,WBITS,EBITS,TIMEOUT X<select>, semctl ID,SEMNUM,CMD,ARG X<semctl>,
semget KEY,NSEMS,FLAGS X<semget>, semop KEY,OPSTRING X<semop>, send
SOCKET,MSG,FLAGS,TO X<send>, send SOCKET,MSG,FLAGS, setpgrp PID,PGRP
X<setpgrp> X<group>, setpriority WHICH,WHO,PRIORITY X<setpriority>
X<priority> X<nice> X<renice>, setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL
X<setsockopt>, shift ARRAY X<shift>, shift, shmctl ID,CMD,ARG X<shmctl>,
shmget KEY,SIZE,FLAGS X<shmget>, shmread ID,VAR,POS,SIZE X<shmread>
X<shmwrite>, shmwrite ID,STRING,POS,SIZE, shutdown SOCKET,HOW X<shutdown>,
sin EXPR X<sin> X<sine> X<asin> X<arcsine>, sin, sleep EXPR X<sleep>
X<pause>, sleep, socket SOCKET,DOMAIN,TYPE,PROTOCOL X<socket>, socketpair
SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL X<socketpair>, sort SUBNAME LIST
X<sort> X<qsort> X<quicksort> X<mergesort>, sort BLOCK LIST, sort LIST,
splice ARRAY,OFFSET,LENGTH,LIST X<splice>, splice ARRAY,OFFSET,LENGTH,
splice ARRAY,OFFSET, splice ARRAY, split /PATTERN/,EXPR,LIMIT X<split>,
split /PATTERN/,EXPR, split /PATTERN/, split, sprintf FORMAT, LIST
X<sprintf>, format parameter index, flags, vector flag, (minimum) width,
precision, or maximum width X<precision>, size, order of arguments, sqrt
EXPR X<sqrt> X<root> X<square root>, sqrt, srand EXPR X<srand> X<seed>
X<randseed>, srand, stat FILEHANDLE X<stat> X<file, status> X<ctime>, stat
EXPR, stat DIRHANDLE, stat, state VARLIST X<state>, state TYPE VARLIST,
state VARLIST : ATTRS, state TYPE VARLIST : ATTRS, study SCALAR X<study>,
study, sub NAME BLOCK X<sub>, sub NAME (PROTO) BLOCK, sub NAME : ATTRS
BLOCK, sub NAME (PROTO) : ATTRS BLOCK, __SUB__ X<__SUB__>, substr
EXPR,OFFSET,LENGTH,REPLACEMENT X<substr> X<substring> X<mid> X<left>
X<right>, substr EXPR,OFFSET,LENGTH, substr EXPR,OFFSET, symlink
OLDFILE,NEWFILE X<symlink> X<link> X<symbolic link> X<link, symbolic>,
syscall NUMBER, LIST X<syscall> X<system call>, sysopen
FILEHANDLE,FILENAME,MODE X<sysopen>, sysopen
FILEHANDLE,FILENAME,MODE,PERMS, sysread FILEHANDLE,SCALAR,LENGTH,OFFSET
X<sysread>, sysread FILEHANDLE,SCALAR,LENGTH, sysseek
FILEHANDLE,POSITION,WHENCE X<sysseek> X<lseek>, system LIST X<system>
X<shell>, system PROGRAM LIST, syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET
X<syswrite>, syswrite FILEHANDLE,SCALAR,LENGTH, syswrite FILEHANDLE,SCALAR,
tell FILEHANDLE X<tell>, tell, telldir DIRHANDLE X<telldir>, tie
VARIABLE,CLASSNAME,LIST X<tie>, tied VARIABLE X<tied>, time X<time>
X<epoch>, times X<times>, tr///, truncate FILEHANDLE,LENGTH X<truncate>,
truncate EXPR,LENGTH, uc EXPR X<uc> X<uppercase> X<toupper>, uc, ucfirst
EXPR X<ucfirst> X<uppercase>, ucfirst, umask EXPR X<umask>, umask, undef
EXPR X<undef> X<undefine>, undef, unlink LIST X<unlink> X<delete> X<remove>
X<rm> X<del>, unlink, unpack TEMPLATE,EXPR X<unpack>, unpack TEMPLATE,
unshift ARRAY,LIST X<unshift>, untie VARIABLE X<untie>, use Module VERSION
LIST X<use> X<module> X<import>, use Module VERSION, use Module LIST, use
Module, use VERSION, utime LIST X<utime>, values HASH X<values>, values
ARRAY, vec EXPR,OFFSET,BITS X<vec> X<bit> X<bit vector>, wait X<wait>,
waitpid PID,FLAGS X<waitpid>, wantarray X<wantarray> X<context>, warn LIST
X<warn> X<warning> X<STDERR>, write FILEHANDLE X<write>, write EXPR, write,
y///

=item Non-function Keywords by Cross-reference

__DATA__, __END__, BEGIN, CHECK, END, INIT, UNITCHECK, DESTROY, and, cmp,
eq, ge, gt, le, lt, ne, not, or, x, xor, AUTOLOAD, else, elsif, for,
foreach, if, unless, until, while, elseif, default, given, when

=back

=back

=head2 perlopentut - simple recipes for opening files and pipes in Perl

=over 4

=item DESCRIPTION

I<OK>, I<HANDLE>, I<MODE>, I<PATHNAME>

=item Opening Text Files

=over 4

=item Opening Text Files for Reading

=item Opening Text Files for Writing

=back

=item Opening Binary Files

=item Opening Pipes

=item Low-level File Opens via sysopen

=item SEE ALSO

=item AUTHOR and COPYRIGHT

=back

=head2 perlpacktut - tutorial on C<pack> and C<unpack>

=over 4

=item DESCRIPTION

=item The Basic Principle

=item Packing Text

=item Packing Numbers

=over 4

=item Integers

=item Unpacking a Stack Frame

=item How to Eat an Egg on a Net

=item Byte-order modifiers

=item Floating point Numbers

=back

=item Exotic Templates

=over 4

=item Bit Strings

=item Uuencoding

=item Doing Sums

=item Unicode

=item Another Portable Binary Encoding

=back

=item Template Grouping

=item Lengths and Widths

=over 4

=item String Lengths

=item Dynamic Templates

=item Counting Repetitions

=item Intel HEX

=back

=item Packing and Unpacking C Structures

=over 4

=item The Alignment Pit

=item Dealing with Endian-ness

=item Alignment, Take 2

=item Alignment, Take 3

=item Pointers for How to Use Them

=back

=item Pack Recipes

=item Funnies Section

=item Authors

=back

=head2 perlpod - the Plain Old Documentation format

=over 4

=item DESCRIPTION

=over 4

=item Ordinary Paragraph
X<POD, ordinary paragraph>

=item Verbatim Paragraph
X<POD, verbatim paragraph> X<verbatim>

=item Command Paragraph
X<POD, command>

C<=head1 I<Heading Text>> X<=head1> X<=head2> X<=head3> X<=head4> X<head1>
X<head2> X<head3> X<head4>, C<=head2 I<Heading Text>>, C<=head3 I<Heading
Text>>, C<=head4 I<Heading Text>>, C<=over I<indentlevel>> X<=over>
X<=item> X<=back> X<over> X<item> X<back>, C<=item I<stuff...>>, C<=back>,
C<=cut> X<=cut> X<cut>, C<=pod> X<=pod> X<pod>, C<=begin I<formatname>>
X<=begin> X<=end> X<=for> X<begin> X<end> X<for>, C<=end I<formatname>>,
C<=for I<formatname> I<text...>>, C<=encoding I<encodingname>> X<=encoding>
X<encoding>

=item Formatting Codes
X<POD, formatting code> X<formatting code>
X<POD, interior sequence> X<interior sequence>

C<IE<lt>textE<gt>> -- italic text X<I> X<< IZ<><> >> X<POD, formatting
code, italic> X<italic>, C<BE<lt>textE<gt>> -- bold text X<B> X<< BZ<><> >>
X<POD, formatting code, bold> X<bold>, C<CE<lt>codeE<gt>> -- code text X<C>
X<< CZ<><> >> X<POD, formatting code, code> X<code>, C<LE<lt>nameE<gt>> --
a hyperlink X<L> X<< LZ<><> >> X<POD, formatting code, hyperlink>
X<hyperlink>, C<EE<lt>escapeE<gt>> -- a character escape X<E> X<< EZ<><> >>
X<POD, formatting code, escape> X<escape>, C<FE<lt>filenameE<gt>> -- used
for filenames X<F> X<< FZ<><> >> X<POD, formatting code, filename>
X<filename>, C<SE<lt>textE<gt>> -- text contains non-breaking spaces X<S>
X<< SZ<><> >> X<POD, formatting code, non-breaking space> X<non-breaking
space>, C<XE<lt>topic nameE<gt>> -- an index entry X<X> X<< XZ<><> >>
X<POD, formatting code, index entry> X<index entry>, C<ZE<lt>E<gt>> -- a
null (zero-effect) formatting code X<Z> X<< ZZ<><> >> X<POD, formatting
code, null> X<null>

=item The Intent
X<POD, intent of>

=item Embedding Pods in Perl Modules
X<POD, embedding>

=item Hints for Writing Pod

X<podchecker> X<POD, validating>

=back

=item SEE ALSO

=item AUTHOR

=back

=head2 perlpodspec - Plain Old Documentation: format specification and
notes

=over 4

=item DESCRIPTION

=item Pod Definitions

=item Pod Commands

"=head1", "=head2", "=head3", "=head4", "=pod", "=cut", "=over", "=item",
"=back", "=begin formatname", "=begin formatname parameter", "=end
formatname", "=for formatname text...", "=encoding encodingname"

=item Pod Formatting Codes

C<IE<lt>textE<gt>> -- italic text, C<BE<lt>textE<gt>> -- bold text,
C<CE<lt>codeE<gt>> -- code text, C<FE<lt>filenameE<gt>> -- style for
filenames, C<XE<lt>topic nameE<gt>> -- an index entry, C<ZE<lt>E<gt>> -- a
null (zero-effect) formatting code, C<LE<lt>nameE<gt>> -- a hyperlink,
C<EE<lt>escapeE<gt>> -- a character escape, C<SE<lt>textE<gt>> -- text
contains non-breaking spaces

=item Notes on Implementing Pod Processors

=item About LE<lt>...E<gt> Codes

First:, Second:, Third:, Fourth:, Fifth:, Sixth:

=item About =over...=back Regions

=item About Data Paragraphs and "=begin/=end" Regions

=item SEE ALSO

=item AUTHOR

=back

=head2 perlpodstyle - Perl POD style guide

=over 4

=item DESCRIPTION

NAME, SYNOPSIS, DESCRIPTION, OPTIONS, RETURN VALUE, ERRORS, DIAGNOSTICS,
EXAMPLES, ENVIRONMENT, FILES, CAVEATS, BUGS, RESTRICTIONS, NOTES, AUTHOR,
HISTORY, COPYRIGHT AND LICENSE, SEE ALSO

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 perldiag - various Perl diagnostics

=over 4

=item DESCRIPTION

=item SEE ALSO

=back

=head2 perldeprecation - list Perl deprecations

=over 4

=item DESCRIPTION

=over 4

=item Perl 5.32

=item Perl 5.30

=item Perl 5.28

=item Perl 5.26

=item Perl 5.24

=item Perl 5.16

=back

=item SEE ALSO

=back

=head2 perllexwarn - Perl Lexical Warnings

=over 4

=item DESCRIPTION

=back

=head2 perldebug - Perl debugging

=over 4

=item DESCRIPTION

=item The Perl Debugger

=over 4

=item Calling the Debugger

perl -d program_name, perl -d -e 0, perl -d:ptkdb program_name, perl -dt
threaded_program_name

=item Debugger Commands

h X<debugger command, h>, h [command], h h, p expr X<debugger command, p>,
x [maxdepth] expr X<debugger command, x>, V [pkg [vars]] X<debugger
command, V>, X [vars] X<debugger command, X>, y [level [vars]] X<debugger
command, y>, T X<debugger command, T> X<backtrace> X<stack, backtrace>, s
[expr] X<debugger command, s> X<step>, n [expr] X<debugger command, n>, r
X<debugger command, r>, <CR>, c [line|sub] X<debugger command, c>, l
X<debugger command, l>, l min+incr, l min-max, l line, l subname, -
X<debugger command, ->, v [line] X<debugger command, v>, . X<debugger
command, .>, f filename X<debugger command, f>, /pattern/, ?pattern?, L
[abw] X<debugger command, L>, S [[!]regex] X<debugger command, S>, t [n]
X<debugger command, t>, t [n] expr X<debugger command, t>, b X<breakpoint>
X<debugger command, b>, b [line] [condition] X<breakpoint> X<debugger
command, b>, b [file]:[line] [condition] X<breakpoint> X<debugger command,
b>, b subname [condition] X<breakpoint> X<debugger command, b>, b postpone
subname [condition] X<breakpoint> X<debugger command, b>, b load filename
X<breakpoint> X<debugger command, b>, b compile subname X<breakpoint>
X<debugger command, b>, B line X<breakpoint> X<debugger command, B>, B *
X<breakpoint> X<debugger command, B>, disable [file]:[line] X<breakpoint>
X<debugger command, disable> X<disable>, disable [line] X<breakpoint>
X<debugger command, disable> X<disable>, enable [file]:[line] X<breakpoint>
X<debugger command, enable> X<enable>, enable [line] X<breakpoint>
X<debugger command, enable> X<enable>, a [line] command X<debugger
command, a>, A line X<debugger command, A>, A * X<debugger command, A>, w
expr X<debugger command, w>, W expr X<debugger command, W>, W * X<debugger
command, W>, o X<debugger command, o>, o booloption ... X<debugger command,
o>, o anyoption? ... X<debugger command, o>, o option=value ... X<debugger
command, o>, < ? X<< debugger command, < >>, < [ command ] X<< debugger
command, < >>, < * X<< debugger command, < >>, << command X<< debugger
command, << >>, > ? X<< debugger command, > >>, > command X<< debugger
command, > >>, > * X<< debugger command, > >>, >> command X<<< debugger
command, >> >>>, { ? X<debugger command, {>, { [ command ], { * X<debugger
command, {>, {{ command X<debugger command, {{>, ! number X<debugger
command, !>, ! -number X<debugger command, !>, ! pattern X<debugger
command, !>, !! cmd X<debugger command, !!>, source file X<debugger
command, source>, H -number X<debugger command, H>, q or ^D X<debugger
command, q> X<debugger command, ^D>, R X<debugger command, R>, |dbcmd
X<debugger command, |>, ||dbcmd X<debugger command, ||>, command, m expr
X<debugger command, m>, M X<debugger command, M>, man [manpage] X<debugger
command, man>

=item Configurable Options

C<recallCommand>, C<ShellBang> X<debugger option, recallCommand> X<debugger
option, ShellBang>, C<pager> X<debugger option, pager>, C<tkRunning>
X<debugger option, tkRunning>, C<signalLevel>, C<warnLevel>, C<dieLevel>
X<debugger option, signalLevel> X<debugger option, warnLevel> X<debugger
option, dieLevel>, C<AutoTrace> X<debugger option, AutoTrace>, C<LineInfo>
X<debugger option, LineInfo>, C<inhibit_exit> X<debugger option,
inhibit_exit>, C<PrintRet> X<debugger option, PrintRet>, C<ornaments>
X<debugger option, ornaments>, C<frame> X<debugger option, frame>,
C<maxTraceLen> X<debugger option, maxTraceLen>, C<windowSize> X<debugger
option, windowSize>, C<arrayDepth>, C<hashDepth> X<debugger option,
arrayDepth> X<debugger option, hashDepth>, C<dumpDepth> X<debugger option,
dumpDepth>, C<compactDump>, C<veryCompact> X<debugger option, compactDump>
X<debugger option, veryCompact>, C<globPrint> X<debugger option,
globPrint>, C<DumpDBFiles> X<debugger option, DumpDBFiles>, C<DumpPackages>
X<debugger option, DumpPackages>, C<DumpReused> X<debugger option,
DumpReused>, C<quote>, C<HighBit>, C<undefPrint> X<debugger option, quote>
X<debugger option, HighBit> X<debugger option, undefPrint>, C<UsageOnly>
X<debugger option, UsageOnly>, C<HistFile> X<debugger option, history,
HistFile>, C<HistSize> X<debugger option, history, HistSize>, C<TTY>
X<debugger option, TTY>, C<noTTY> X<debugger option, noTTY>, C<ReadLine>
X<debugger option, ReadLine>, C<NonStop> X<debugger option, NonStop>

=item Debugger Input/Output

Prompt, Multiline commands, Stack backtrace X<backtrace> X<stack,
backtrace>, Line Listing Format, Frame listing

=item Debugging Compile-Time Statements

=item Debugger Customization

=item Readline Support / History in the Debugger

=item Editor Support for Debugging

=item The Perl Profiler
X<profile> X<profiling> X<profiler>

=back

=item Debugging Regular Expressions
X<regular expression, debugging>
X<regex, debugging> X<regexp, debugging>

=item Debugging Memory Usage
X<memory usage>

=item SEE ALSO

=item BUGS

=back

=head2 perlvar - Perl predefined variables

=over 4

=item DESCRIPTION

=over 4

=item The Syntax of Variable Names

=back

=item SPECIAL VARIABLES

=over 4

=item General Variables

$ARG, $_ X<$_> X<$ARG>, @ARG, @_ X<@_> X<@ARG>, $LIST_SEPARATOR, $" X<$">
X<$LIST_SEPARATOR>, $PROCESS_ID, $PID, $$ X<$$> X<$PID> X<$PROCESS_ID>,
$PROGRAM_NAME, $0 X<$0> X<$PROGRAM_NAME>, $REAL_GROUP_ID, $GID, $( X<$(>
X<$GID> X<$REAL_GROUP_ID>, $EFFECTIVE_GROUP_ID, $EGID, $) X<$)> X<$EGID>
X<$EFFECTIVE_GROUP_ID>, $REAL_USER_ID, $UID, $< X<< $< >> X<$UID>
X<$REAL_USER_ID>, $EFFECTIVE_USER_ID, $EUID, $> X<< $> >> X<$EUID>
X<$EFFECTIVE_USER_ID>, $SUBSCRIPT_SEPARATOR, $SUBSEP, $; X<$;> X<$SUBSEP>
X<$SUBSCRIPT_SEPARATOR>, $a, $b X<$a> X<$b>, %ENV X<%ENV>,
$OLD_PERL_VERSION, $] X<$]> X<$OLD_PERL_VERSION>, $SYSTEM_FD_MAX, $^F
X<$^F> X<$SYSTEM_FD_MAX>, @F X<@F>, @INC X<@INC>, %INC X<%INC>,
$INPLACE_EDIT, $^I X<$^I> X<$INPLACE_EDIT>, @ISA X<@ISA>, $^M X<$^M>,
$OSNAME, $^O X<$^O> X<$OSNAME>, %SIG X<%SIG>, $BASETIME, $^T X<$^T>
X<$BASETIME>, $PERL_VERSION, $^V X<$^V> X<$PERL_VERSION>,
${^WIN32_SLOPPY_STAT} X<${^WIN32_SLOPPY_STAT}> X<sitecustomize>
X<sitecustomize.pl>, $EXECUTABLE_NAME, $^X X<$^X> X<$EXECUTABLE_NAME>

=item Variables related to regular expressions

$<I<digits>> ($1, $2, ...) X<$1> X<$2> X<$3> X<$I<digits>>, @{^CAPTURE}
X<@{^CAPTURE}> X<@^CAPTURE>, $MATCH, $& X<$&> X<$MATCH>, ${^MATCH}
X<${^MATCH}>, $PREMATCH, $` X<$`> X<$PREMATCH> X<${^PREMATCH}>,
${^PREMATCH} X<$`> X<${^PREMATCH}>, $POSTMATCH, $' X<$'> X<$POSTMATCH>
X<${^POSTMATCH}> X<@->, ${^POSTMATCH} X<${^POSTMATCH}> X<$'> X<$POSTMATCH>,
$LAST_PAREN_MATCH, $+ X<$+> X<$LAST_PAREN_MATCH>, $LAST_SUBMATCH_RESULT,
$^N X<$^N> X<$LAST_SUBMATCH_RESULT>, @LAST_MATCH_END, @+ X<@+>
X<@LAST_MATCH_END>, %{^CAPTURE}, %LAST_PAREN_MATCH, %+ X<%+>
X<%LAST_PAREN_MATCH> X<%{^CAPTURE}>, @LAST_MATCH_START, @- X<@->
X<@LAST_MATCH_START>, C<$`> is the same as C<substr($var, 0, $-[0])>, C<$&>
is the same as C<substr($var, $-[0], $+[0] - $-[0])>, C<$'> is the same as
C<substr($var, $+[0])>, C<$1> is the same as C<substr($var, $-[1], $+[1] -
$-[1])>, C<$2> is the same as C<substr($var, $-[2], $+[2] - $-[2])>, C<$3>
is the same as C<substr($var, $-[3], $+[3] - $-[3])>, %{^CAPTURE_ALL}
X<%{^CAPTURE_ALL}>, %- X<%->, $LAST_REGEXP_CODE_RESULT, $^R X<$^R>
X<$LAST_REGEXP_CODE_RESULT>, ${^RE_DEBUG_FLAGS} X<${^RE_DEBUG_FLAGS}>,
${^RE_TRIE_MAXBUF} X<${^RE_TRIE_MAXBUF}>

=item Variables related to filehandles

$ARGV X<$ARGV>, @ARGV X<@ARGV>, ARGV X<ARGV>, ARGVOUT X<ARGVOUT>,
IO::Handle->output_field_separator( EXPR ), $OUTPUT_FIELD_SEPARATOR, $OFS,
$, X<$,> X<$OFS> X<$OUTPUT_FIELD_SEPARATOR>, HANDLE->input_line_number(
EXPR ), $INPUT_LINE_NUMBER, $NR, $. X<$.> X<$NR> X<$INPUT_LINE_NUMBER>
X<line number>, IO::Handle->input_record_separator( EXPR ),
$INPUT_RECORD_SEPARATOR, $RS, $/ X<$/> X<$RS> X<$INPUT_RECORD_SEPARATOR>,
IO::Handle->output_record_separator( EXPR ), $OUTPUT_RECORD_SEPARATOR,
$ORS, $\ X<$\> X<$ORS> X<$OUTPUT_RECORD_SEPARATOR>, HANDLE->autoflush( EXPR
), $OUTPUT_AUTOFLUSH, $| X<$|> X<autoflush> X<flush> X<$OUTPUT_AUTOFLUSH>,
${^LAST_FH} X<${^LAST_FH}>, $ACCUMULATOR, $^A X<$^A> X<$ACCUMULATOR>,
IO::Handle->format_formfeed(EXPR), $FORMAT_FORMFEED, $^L X<$^L>
X<$FORMAT_FORMFEED>, HANDLE->format_page_number(EXPR), $FORMAT_PAGE_NUMBER,
$% X<$%> X<$FORMAT_PAGE_NUMBER>, HANDLE->format_lines_left(EXPR),
$FORMAT_LINES_LEFT, $- X<$-> X<$FORMAT_LINES_LEFT>,
IO::Handle->format_line_break_characters EXPR,
$FORMAT_LINE_BREAK_CHARACTERS, $: X<$:> X<$FORMAT_LINE_BREAK_CHARACTERS>,
HANDLE->format_lines_per_page(EXPR), $FORMAT_LINES_PER_PAGE, $= X<$=>
X<$FORMAT_LINES_PER_PAGE>, HANDLE->format_top_name(EXPR), $FORMAT_TOP_NAME,
$^ X<$^> X<$FORMAT_TOP_NAME>, HANDLE->format_name(EXPR), $FORMAT_NAME, $~
X<$~> X<$FORMAT_NAME>

=item Error Variables
X<error> X<exception>

${^CHILD_ERROR_NATIVE} X<${^CHILD_ERROR_NATIVE}>, $EXTENDED_OS_ERROR, $^E
X<$^E> X<$EXTENDED_OS_ERROR>, $EXCEPTIONS_BEING_CAUGHT, $^S X<$^S>
X<$EXCEPTIONS_BEING_CAUGHT>, $WARNING, $^W X<$^W> X<$WARNING>,
${^WARNING_BITS} X<${^WARNING_BITS}>, $OS_ERROR, $ERRNO, $! X<$!> X<$ERRNO>
X<$OS_ERROR>, %OS_ERROR, %ERRNO, %! X<%!> X<%OS_ERROR> X<%ERRNO>,
$CHILD_ERROR, $? X<$?> X<$CHILD_ERROR>, $EVAL_ERROR, $@ X<$@>
X<$EVAL_ERROR>

=item Variables related to the interpreter state

$COMPILING, $^C X<$^C> X<$COMPILING>, $DEBUGGING, $^D X<$^D> X<$DEBUGGING>,
${^ENCODING} X<${^ENCODING}>, ${^GLOBAL_PHASE} X<${^GLOBAL_PHASE}>,
CONSTRUCT, START, CHECK, INIT, RUN, END, DESTRUCT, $^H X<$^H>, %^H X<%^H>,
${^OPEN} X<${^OPEN}>, $PERLDB, $^P X<$^P> X<$PERLDB>, 0x01, 0x02, 0x04,
0x08, 0x10, 0x20, 0x40, 0x80, 0x100, 0x200, 0x400, 0x800, 0x1000, ${^TAINT}
X<${^TAINT}>, ${^UNICODE} X<${^UNICODE}>, ${^UTF8CACHE} X<${^UTF8CACHE}>,
${^UTF8LOCALE} X<${^UTF8LOCALE}>

=item Deprecated and removed variables

$# X<$#>, $* X<$*>, $[ X<$[>

=back

=back

=head2 perlre - Perl regular expressions

=over 4

=item DESCRIPTION

=over 4

=item The Basics
X<regular expression, version 8> X<regex, version 8> X<regexp, version 8>

=item Modifiers

B<C<m>> X</m> X<regex, multiline> X<regexp, multiline> X<regular
expression, multiline>, B<C<s>> X</s> X<regex, single-line> X<regexp,
single-line> X<regular expression, single-line>, B<C<i>> X</i> X<regex,
case-insensitive> X<regexp, case-insensitive> X<regular expression,
case-insensitive>, B<C<x>> and B<C<xx>> X</x>, B<C<p>> X</p> X<regex,
preserve> X<regexp, preserve>, B<C<a>>, B<C<d>>, B<C<l>>, and B<C<u>> X</a>
X</d> X</l> X</u>, B<C<n>> X</n> X<regex, non-capture> X<regexp,
non-capture> X<regular expression, non-capture>, Other Modifiers

=item Regular Expressions

[1], [2], [3], [4], [5], [6], [7], [8]

=item Quoting metacharacters

=item Extended Patterns

C<(?#text)> X<(?#)>, C<(?adlupimnsx-imnsx)>, C<(?^alupimnsx)> X<(?)>
X<(?^)>, C<(?:pattern)> X<(?:)>, C<(?adluimnsx-imnsx:pattern)>,
C<(?^aluimnsx:pattern)> X<(?^:)>, C<(?|pattern)> X<(?|)> X<Branch reset>,
Lookaround Assertions X<look-around assertion> X<lookaround assertion>
X<look-around> X<lookaround>, C<(?=pattern)> X<(?=)> X<look-ahead,
positive> X<lookahead, positive>, C<(?!pattern)> X<(?!)> X<look-ahead,
negative> X<lookahead, negative>, C<(?<=pattern)>, C<\K> X<(?<=)>
X<look-behind, positive> X<lookbehind, positive> X<\K>, C<(?<!pattern)>
X<(?<!)> X<look-behind, negative> X<lookbehind, negative>, C<<
(?<NAME>pattern) >>, C<(?'NAME'pattern)> X<< (?<NAME>) >> X<(?'NAME')>
X<named capture> X<capture>, C<< \k<NAME> >>, C<< \k'NAME' >>, C<(?{ code
})> X<(?{})> X<regex, code in> X<regexp, code in> X<regular expression,
code in>, C<(??{ code })> X<(??{})> X<regex, postponed> X<regexp,
postponed> X<regular expression, postponed>, C<(?I<PARNO>)> C<(?-I<PARNO>)>
C<(?+I<PARNO>)> C<(?R)> C<(?0)> X<(?PARNO)> X<(?1)> X<(?R)> X<(?0)>
X<(?-1)> X<(?+1)> X<(?-PARNO)> X<(?+PARNO)> X<regex, recursive> X<regexp,
recursive> X<regular expression, recursive> X<regex, relative recursion>
X<GOSUB> X<GOSTART>, C<(?&NAME)> X<(?&NAME)>,
C<(?(condition)yes-pattern|no-pattern)> X<(?()>,
C<(?(condition)yes-pattern)>, an integer in parentheses, a
lookahead/lookbehind/evaluate zero-width assertion, a name in angle
brackets or single quotes, the special symbol C<(R)>, C<(1)> C<(2)> ..,
C<(E<lt>I<NAME>E<gt>)> C<('I<NAME>')>, C<(?=...)> C<(?!...)> C<(?<=...)>
C<(?<!...)>, C<(?{ I<CODE> })>, C<(R)>, C<(R1)> C<(R2)> .., C<(R&I<NAME>)>,
C<(DEFINE)>, C<< (?>pattern) >> X<backtrack> X<backtracking> X<atomic>
X<possessive>, C<(?[ ])>

=item Backtracking
X<backtrack> X<backtracking>

=item Special Backtracking Control Verbs

Verbs, C<(*PRUNE)> C<(*PRUNE:NAME)> X<(*PRUNE)> X<(*PRUNE:NAME)>,
C<(*SKIP)> C<(*SKIP:NAME)> X<(*SKIP)>, C<(*MARK:NAME)> C<(*:NAME)>
X<(*MARK)> X<(*MARK:NAME)> X<(*:NAME)>, C<(*THEN)> C<(*THEN:NAME)>,
C<(*COMMIT)> C<(*COMMIT:args)> X<(*COMMIT)>, C<(*FAIL)> C<(*F)>
C<(*FAIL:arg)> X<(*FAIL)> X<(*F)>, C<(*ACCEPT)> C<(*ACCEPT:arg)>
X<(*ACCEPT)>

=item Warning on C<\1> Instead of C<$1>

=item Repeated Patterns Matching a Zero-length Substring

=item Combining RE Pieces

C<ST>, C<S|T>, C<S{REPEAT_COUNT}>, C<S{min,max}>, C<S{min,max}?>, C<S?>,
C<S*>, C<S+>, C<S??>, C<S*?>, C<S+?>, C<< (?>S) >>, C<(?=S)>, C<(?<=S)>,
C<(?!S)>, C<(?<!S)>, C<(??{ EXPR })>, C<(?I<PARNO>)>,
C<(?(condition)yes-pattern|no-pattern)>

=item Creating Custom RE Engines

=item Embedded Code Execution Frequency

=item PCRE/Python Support

C<< (?PE<lt>NAMEE<gt>pattern) >>, C<< (?P=NAME) >>, C<< (?P>NAME) >>

=back

=item BUGS

=item SEE ALSO

=back

=head2 perlrebackslash - Perl Regular Expression Backslash Sequences and
Escapes

=over 4

=item DESCRIPTION

=over 4

=item The backslash

[1]

=item All the sequences and escapes

=item Character Escapes

[1], [2]

=item Modifiers

=item Character classes

=item Referencing

=item Assertions

\A, \z, \Z, \G, \b{}, \b, \B{}, \B, C<\b{gcb}> or C<\b{g}>, C<\b{lb}>,
C<\b{sb}>, C<\b{wb}>

=item Misc

\K, \N, \R X<\R>, \X X<\X>

=back

=back

=head2 perlrecharclass - Perl Regular Expression Character Classes

=over 4

=item DESCRIPTION

=over 4

=item The dot

=item Backslash sequences
X<\w> X<\W> X<\s> X<\S> X<\d> X<\D> X<\p> X<\P>
X<\N> X<\v> X<\V> X<\h> X<\H>
X<word> X<whitespace>

If the C</a> modifier is in effect .., otherwise .., For code points above
255 .., For code points below 256 .., if locale rules are in effect .., if,
instead, Unicode rules are in effect .., otherwise .., If the C</a>
modifier is in effect .., otherwise .., For code points above 255 .., For
code points below 256 .., if locale rules are in effect .., if, instead,
Unicode rules are in effect .., otherwise .., [1], [2]

=item Bracketed Character Classes

[1], [2], [3], [4], [5], [6], If the C</a> modifier is in effect ..,
otherwise .., For code points above 255 .., For code points below 256 ..,
if locale rules are in effect .., C<word>, C<ascii>, C<blank>, if, instead,
Unicode rules are in effect .., otherwise ..

=back

=back

=head2 perlreref - Perl Regular Expressions Reference

=over 4

=item DESCRIPTION

=over 4

=item OPERATORS

=item SYNTAX

=item ESCAPE SEQUENCES

=item CHARACTER CLASSES

=item ANCHORS

=item QUANTIFIERS

=item EXTENDED CONSTRUCTS

=item VARIABLES

=item FUNCTIONS

=item TERMINOLOGY

=back

=item AUTHOR

=item SEE ALSO

=item THANKS

=back

=head2 perlref - Perl references and nested data structures

=over 4

=item NOTE

=item DESCRIPTION

=over 4

=item Making References
X<reference, creation> X<referencing>

1. X<\> X<backslash>, 2. X<array, anonymous> X<[> X<[]> X<square bracket>
X<bracket, square> X<arrayref> X<array reference> X<reference, array>, 3.
X<hash, anonymous> X<{> X<{}> X<curly bracket> X<bracket, curly> X<brace>
X<hashref> X<hash reference> X<reference, hash>, 4. X<subroutine,
anonymous> X<subroutine, reference> X<reference, subroutine> X<scope,
lexical> X<closure> X<lexical> X<lexical scope>, 5. X<constructor> X<new>,
6. X<autovivification>, 7. X<*foo{THING}> X<*>

=item Using References
X<reference, use> X<dereferencing> X<dereference>

=item Circular References
X<circular reference> X<reference, circular>

=item Symbolic references
X<reference, symbolic> X<reference, soft>
X<symbolic reference> X<soft reference>

=item Not-so-symbolic references

=item Pseudo-hashes: Using an array as a hash
X<pseudo-hash> X<pseudo hash> X<pseudohash>

=item Function Templates
X<scope, lexical> X<closure> X<lexical> X<lexical scope>
X<subroutine, nested> X<sub, nested> X<subroutine, local> X<sub, local>

=back

=item WARNING: Don't use references as hash keys
X<reference, string context> X<reference, use as hash key>

=over 4

=item Postfix Dereference Syntax

=item Postfix Reference Slicing

=item Assigning to References

=back

=item Declaring a Reference to a Variable

=item SEE ALSO

=back

=head2 perlform - Perl formats

=over 4

=item DESCRIPTION

=over 4

=item Text Fields
X<format, text field>

=item Numeric Fields
X<#> X<format, numeric field>

=item The Field @* for Variable-Width Multi-Line Text
X<@*>

=item The Field ^* for Variable-Width One-line-at-a-time Text
X<^*>

=item Specifying Values
X<format, specifying values>

=item Using Fill Mode
X<format, fill mode>

=item Suppressing Lines Where All Fields Are Void
X<format, suppressing lines>

=item Repeating Format Lines
X<format, repeating lines>

=item Top of Form Processing
X<format, top of form> X<top> X<header>

=item Format Variables
X<format variables>
X<format, variables>

=back

=item NOTES

=over 4

=item Footers
X<format, footer> X<footer>

=item Accessing Formatting Internals
X<format, internals>

=back

=item WARNINGS

=back

=head2 perlobj - Perl object reference

=over 4

=item DESCRIPTION

=over 4

=item An Object is Simply a Data Structure
X<object> X<bless> X<constructor> X<new>

=item A Class is Simply a Package
X<class> X<package> X<@ISA> X<inheritance>

=item A Method is Simply a Subroutine
X<method>

=item Method Invocation
X<invocation> X<method> X<arrow> X<< -> >>

=item Inheritance
X<inheritance>

=item Writing Constructors
X<constructor>

=item Attributes
X<attribute>

=item An Aside About Smarter and Safer Code

=item Method Call Variations
X<method>

=item Invoking Class Methods
X<invocation>

=item C<bless>, C<blessed>, and C<ref>

=item The UNIVERSAL Class
X<UNIVERSAL>

isa($class) X<isa>, DOES($role) X<DOES>, can($method) X<can>,
VERSION($need) X<VERSION>

=item AUTOLOAD
X<AUTOLOAD>

=item Destructors
X<destructor> X<DESTROY>

=item Non-Hash Objects

=item Inside-Out objects

=item Pseudo-hashes

=back

=item SEE ALSO

=back

=head2 perltie - how to hide an object class in a simple variable

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Tying Scalars
X<scalar, tying>

TIESCALAR classname, LIST X<TIESCALAR>, FETCH this X<FETCH>, STORE this,
value X<STORE>, UNTIE this X<UNTIE>, DESTROY this X<DESTROY>

=item Tying Arrays
X<array, tying>

TIEARRAY classname, LIST X<TIEARRAY>, FETCH this, index X<FETCH>, STORE
this, index, value X<STORE>, FETCHSIZE this X<FETCHSIZE>, STORESIZE this,
count X<STORESIZE>, EXTEND this, count X<EXTEND>, EXISTS this, key
X<EXISTS>, DELETE this, key X<DELETE>, CLEAR this X<CLEAR>, PUSH this, LIST
X<PUSH>, POP this X<POP>, SHIFT this X<SHIFT>, UNSHIFT this, LIST
X<UNSHIFT>, SPLICE this, offset, length, LIST X<SPLICE>, UNTIE this
X<UNTIE>, DESTROY this X<DESTROY>

=item Tying Hashes
X<hash, tying>

USER, HOME, CLOBBER, LIST, TIEHASH classname, LIST X<TIEHASH>, FETCH this,
key X<FETCH>, STORE this, key, value X<STORE>, DELETE this, key X<DELETE>,
CLEAR this X<CLEAR>, EXISTS this, key X<EXISTS>, FIRSTKEY this X<FIRSTKEY>,
NEXTKEY this, lastkey X<NEXTKEY>, SCALAR this X<SCALAR>, UNTIE this
X<UNTIE>, DESTROY this X<DESTROY>

=item Tying FileHandles
X<filehandle, tying>

TIEHANDLE classname, LIST X<TIEHANDLE>, WRITE this, LIST X<WRITE>, PRINT
this, LIST X<PRINT>, PRINTF this, LIST X<PRINTF>, READ this, LIST X<READ>,
READLINE this X<READLINE>, GETC this X<GETC>, EOF this X<EOF>, CLOSE this
X<CLOSE>, UNTIE this X<UNTIE>, DESTROY this X<DESTROY>

=item UNTIE this
X<UNTIE>

=item The C<untie> Gotcha
X<untie>

=back

=item SEE ALSO

=item BUGS

=item AUTHOR

=back

=head2 perldbmfilter - Perl DBM Filters

=over 4

=item SYNOPSIS

=item DESCRIPTION

B<filter_store_key>, B<filter_store_value>, B<filter_fetch_key>,
B<filter_fetch_value>

=over 4

=item The Filter

=item An Example: the NULL termination problem.

=item Another Example: Key is a C int.

=back

=item SEE ALSO

=item AUTHOR

=back

=head2 perlipc - Perl interprocess communication (signals, fifos, pipes,
safe subprocesses, sockets, and semaphores)

=over 4

=item DESCRIPTION

=item Signals

=over 4

=item Handling the SIGHUP Signal in Daemons

=item Deferred Signals (Safe Signals)

Long-running opcodes, Interrupting IO, Restartable system calls, Signals as
"faults", Signals triggered by operating system state

=back

=item Named Pipes

=item Using open() for IPC

=over 4

=item Filehandles

=item Background Processes

=item Complete Dissociation of Child from Parent

=item Safe Pipe Opens

=item Avoiding Pipe Deadlocks

=item Bidirectional Communication with Another Process

=item Bidirectional Communication with Yourself

=back

=item Sockets: Client/Server Communication

=over 4

=item Internet Line Terminators

=item Internet TCP Clients and Servers

=item Unix-Domain TCP Clients and Servers

=back

=item TCP Clients with IO::Socket

=over 4

=item A Simple Client

C<Proto>, C<PeerAddr>, C<PeerPort>

=item A Webget Client

=item Interactive Client with IO::Socket

=back

=item TCP Servers with IO::Socket

Proto, LocalPort, Listen, Reuse

=item UDP: Message Passing

=item SysV IPC

=item NOTES

=item BUGS

=item AUTHOR

=item SEE ALSO

=back

=head2 perlfork - Perl's fork() emulation

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Behavior of other Perl features in forked pseudo-processes

$$ or $PROCESS_ID, %ENV, chdir() and all other builtins that accept
filenames, wait() and waitpid(), kill(), exec(), exit(), Open handles to
files, directories and network sockets

=item Resource limits

=item Killing the parent process

=item Lifetime of the parent process and pseudo-processes

=back

=item CAVEATS AND LIMITATIONS

BEGIN blocks, Open filehandles, Open directory handles, Forking pipe open()
not yet implemented, Global state maintained by XSUBs, Interpreter embedded
in larger application, Thread-safety of extensions

=item PORTABILITY CAVEATS

=item BUGS

=item AUTHOR

=item SEE ALSO

=back

=head2 perlnumber - semantics of numbers and numeric operations in Perl

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Storing numbers

=item Numeric operators and numeric conversions

=item Flavors of Perl numeric operations

Arithmetic operators, ++, Arithmetic operators during C<use integer>, Other
mathematical operators, Bitwise operators, Bitwise operators during C<use
integer>, Operators which expect an integer, Operators which expect a
string

=item AUTHOR

=item SEE ALSO

=back

=head2 perlthrtut - Tutorial on threads in Perl

=over 4

=item DESCRIPTION

=item What Is A Thread Anyway?

=item Threaded Program Models

=over 4

=item Boss/Worker

=item Work Crew

=item Pipeline

=back

=item What kind of threads are Perl threads?

=item Thread-Safe Modules

=item Thread Basics

=over 4

=item Basic Thread Support

=item A Note about the Examples

=item Creating Threads

=item Waiting For A Thread To Exit

=item Ignoring A Thread

=item Process and Thread Termination

=back

=item Threads And Data

=over 4

=item Shared And Unshared Data

=item Thread Pitfalls: Races

=back

=item Synchronization and control

=over 4

=item Controlling access: lock()

=item A Thread Pitfall: Deadlocks

=item Queues: Passing Data Around

=item Semaphores: Synchronizing Data Access

=item Basic semaphores

=item Advanced Semaphores

=item Waiting for a Condition

=item Giving up control

=back

=item General Thread Utility Routines

=over 4

=item What Thread Am I In?

=item Thread IDs

=item Are These Threads The Same?

=item What Threads Are Running?

=back

=item A Complete Example

=item Different implementations of threads

=item Performance considerations

=item Process-scope Changes

=item Thread-Safety of System Libraries

=item Conclusion

=item SEE ALSO

=item Bibliography

=over 4

=item Introductory Texts

=item OS-Related References

=item Other References

=back

=item Acknowledgements

=item AUTHOR

=item Copyrights

=back

=head2 perlport - Writing portable Perl

=over 4

=item DESCRIPTION

Not all Perl programs have to be portable, Nearly all of Perl already I<is>
portable

=item ISSUES

=over 4

=item Newlines

=item Numbers endianness and Width

=item Files and Filesystems

=item System Interaction

=item Command names versus file pathnames

=item Networking

=item Interprocess Communication (IPC)

=item External Subroutines (XS)

=item Standard Modules

=item Time and Date

=item Character sets and character encoding

=item Internationalisation

=item System Resources

=item Security

=item Style

=back

=item CPAN Testers

=item PLATFORMS

=over 4

=item Unix

=item DOS and Derivatives

=item VMS

=item VOS

=item EBCDIC Platforms

=item Acorn RISC OS

=item Other perls

=back

=item FUNCTION IMPLEMENTATIONS

=over 4

=item Alphabetical Listing of Perl Functions

-I<X>, alarm, atan2, binmode, chmod, chown, chroot, crypt, dbmclose,
dbmopen, dump, exec, exit, fcntl, flock, fork, getlogin, getpgrp, getppid,
getpriority, getpwnam, getgrnam, getnetbyname, getpwuid, getgrgid,
getnetbyaddr, getprotobynumber, getpwent, getgrent, gethostbyname,
gethostent, getnetent, getprotoent, getservent, seekdir, sethostent,
setnetent, setprotoent, setservent, endpwent, endgrent, endhostent,
endnetent, endprotoent, endservent, getsockopt, glob, gmtime, ioctl, kill,
link, localtime, lstat, msgctl, msgget, msgsnd, msgrcv, open, readlink,
rename, rewinddir, select, semctl, semget, semop, setgrent, setpgrp,
setpriority, setpwent, setsockopt, shmctl, shmget, shmread, shmwrite,
sleep, socketpair, stat, symlink, syscall, sysopen, system, telldir, times,
truncate, umask, utime, wait, waitpid

=back

=item Supported Platforms

Linux (x86, ARM, IA64), HP-UX, AIX, Win32, Windows 2000, Windows XP,
Windows Server 2003, Windows Vista, Windows Server 2008, Windows 7, Cygwin,
Solaris (x86, SPARC), OpenVMS, Alpha (7.2 and later), I64 (8.2 and later),
Symbian, NetBSD, FreeBSD, Debian GNU/kFreeBSD, Haiku, Irix (6.5. What
else?), OpenBSD, Dragonfly BSD, Midnight BSD, QNX Neutrino RTOS (6.5.0),
MirOS BSD, Stratus OpenVOS (17.0 or later), time_t issues that may or may
not be fixed, Symbian (Series 60 v3, 3.2 and 5 - what else?), Stratus VOS /
OpenVOS, AIX, Android, FreeMINT

=item EOL Platforms

=over 4

=item (Perl 5.20)

AT&T 3b1

=item (Perl 5.14)

Windows 95, Windows 98, Windows ME, Windows NT4

=item (Perl 5.12)

Atari MiNT, Apollo Domain/OS, Apple Mac OS 8/9, Tenon Machten

=back

=item Supported Platforms (Perl 5.8)

=item SEE ALSO

=item AUTHORS / CONTRIBUTORS

=back

=head2 perllocale - Perl locale handling (internationalization and
localization)

=over 4

=item DESCRIPTION

=item WHAT IS A LOCALE

Category C<LC_NUMERIC>: Numeric formatting, Category C<LC_MONETARY>:
Formatting of monetary amounts, Category C<LC_TIME>: Date/Time formatting,
Category C<LC_MESSAGES>: Error and other messages, Category C<LC_COLLATE>:
Collation, Category C<LC_CTYPE>: Character Types, Other categories

=item PREPARING TO USE LOCALES

=item USING LOCALES

=over 4

=item The C<"use locale"> pragma

B<Not within the scope of C<"use locale">>, B<Lingering effects of C<S<use
locale>>>, B<Under C<"use locale";>>

=item The setlocale function

=item Finding locales

=item LOCALE PROBLEMS

=item Testing for broken locales

=item Temporarily fixing locale problems

=item Permanently fixing locale problems

=item Permanently fixing your system's locale configuration

=item Fixing system locale configuration

=item The localeconv function

=item I18N::Langinfo

=back

=item LOCALE CATEGORIES

=over 4

=item Category C<LC_COLLATE>: Collation: Text Comparisons and Sorting

=item Category C<LC_CTYPE>: Character Types

=item Category C<LC_NUMERIC>: Numeric Formatting

=item Category C<LC_MONETARY>: Formatting of monetary amounts

=item Category C<LC_TIME>: Representation of time

=item Other categories

=back

=item SECURITY

=item ENVIRONMENT

PERL_SKIP_LOCALE_INIT, PERL_BADLANG, C<LC_ALL>, C<LANGUAGE>, C<LC_CTYPE>,
C<LC_COLLATE>, C<LC_MONETARY>, C<LC_NUMERIC>, C<LC_TIME>, C<LANG>

=over 4

=item Examples

=back

=item NOTES

=over 4

=item String C<eval> and C<LC_NUMERIC>

=item Backward compatibility

=item I18N::Collate obsolete

=item Sort speed and memory use impacts

=item Freely available locale definitions

=item I18n and l10n

=item An imperfect standard

=back

=item Unicode and UTF-8

=item BUGS

=over 4

=item Collation of strings containing embedded C<NUL> characters

=item Broken systems

=back

=item SEE ALSO

=item HISTORY

=back

=head2 perluniintro - Perl Unicode introduction

=over 4

=item DESCRIPTION

=over 4

=item Unicode

=item Perl's Unicode Support

=item Perl's Unicode Model

=item Unicode and EBCDIC

=item Creating Unicode

=item Handling Unicode

=item Legacy Encodings

=item Unicode I/O

=item Displaying Unicode As Text

=item Special Cases

=item Advanced Topics

=item Miscellaneous

=item Questions With Answers

=item Hexadecimal Notation

=item Further Resources

=back

=item UNICODE IN OLDER PERLS

=item SEE ALSO

=item ACKNOWLEDGMENTS

=item AUTHOR, COPYRIGHT, AND LICENSE

=back

=head2 perlunicode - Unicode support in Perl

=over 4

=item DESCRIPTION

=over 4

=item Important Caveats

Safest if you C<use feature 'unicode_strings'>, Input and Output Layers,
You should convert your non-ASCII, non-UTF-8 Perl scripts to be UTF-8,
C<use utf8> still needed to enable L<UTF-8|/Unicode Encodings> in scripts,
L<UTF-16|/Unicode Encodings> scripts autodetected

=item Byte and Character Semantics

=item ASCII Rules versus Unicode Rules

When the string has been upgraded to UTF-8, There are additional methods
for regular expression patterns

=item Extended Grapheme Clusters (Logical characters)

=item Unicode Character Properties

B<C<\p{All}>>, B<C<\p{Alnum}>>, B<C<\p{Any}>>, B<C<\p{ASCII}>>,
B<C<\p{Assigned}>>, B<C<\p{Blank}>>, B<C<\p{Decomposition_Type:
Non_Canonical}>> (Short: C<\p{Dt=NonCanon}>), B<C<\p{Graph}>>,
B<C<\p{HorizSpace}>>, B<C<\p{In=*}>>, B<C<\p{PerlSpace}>>,
B<C<\p{PerlWord}>>, B<C<\p{Posix...}>>, B<C<\p{Present_In: *}>> (Short:
C<\p{In=*}>), B<C<\p{Print}>>, B<C<\p{SpacePerl}>>, B<C<\p{Title}>> and
B<C<\p{Titlecase}>>, B<C<\p{Unicode}>>, B<C<\p{VertSpace}>>,
B<C<\p{Word}>>, B<C<\p{XPosix...}>>

=item User-Defined Character Properties

=item User-Defined Case Mappings (for serious hackers only)

=item Character Encodings for Input and Output

=item Unicode Regular Expression Support Level

[1] C<\N{U+...}> and C<\x{...}>, [2] C<\p{...}> C<\P{...}>.  This
requirement is for a minimal list of properties.  Perl supports these and
all other Unicode character properties, as R2.7 asks (see L</"Unicode
Character Properties"> above), [3] Perl has C<\d> C<\D> C<\s> C<\S> C<\w>
C<\W> C<\X> C<[:I<prop>:]> C<[:^I<prop>:]>, plus all the properties
specified by
L<http://www.unicode.org/reports/tr18/#Compatibility_Properties>.  These
are described above in L</Other Properties>, [4], Regular expression
lookahead, [5] C<\b> C<\B> meet most, but not all, the details of this
requirement, but C<\b{wb}> and C<\B{wb}> do, as well as the stricter R2.3,
[6], [7], [8] UTF-8/UTF-EBCDIC used in Perl allows not only C<U+10000> to
C<U+10FFFF> but also beyond C<U+10FFFF>, [9] Unicode has rewritten this
portion of UTS#18 to say that getting canonical equivalence (see UAX#15
L<"Unicode Normalization Forms"|http://www.unicode.org/reports/tr15>) is
basically to be done at the programmer level.  Use NFD to write both your
regular expressions and text to match them against (you can use
L<Unicode::Normalize>), [10] Perl has C<\X> and C<\b{gcb}> but we don't
have a "Grapheme Cluster Mode", [11] see L<UAX#29 "Unicode Text
Segmentation"|http://www.unicode.org/reports/tr29>, [12] Perl has
L<Unicode::Collate>, but it isn't integrated with regular expressions.  See
L<UTS#10 "Unicode Collation
Algorithms"|http://www.unicode.org/reports/tr10>, [13] Perl has C<(?<=x)>
and C<(?=x)>, but lookaheads or lookbehinds should see outside of the
target substring

=item Unicode Encodings

=item Noncharacter code points

=item Beyond Unicode code points

=item Security Implications of Unicode

=item Unicode in Perl on EBCDIC

=item Locales

=item When Unicode Does Not Happen

=item The "Unicode Bug"

=item Forcing Unicode in Perl (Or Unforcing Unicode in Perl)

=item Using Unicode in XS

=item Hacking Perl to work on earlier Unicode versions (for very serious
hackers only)

=item Porting code from perl-5.6.X

=back

=item BUGS

=over 4

=item Interaction with Extensions

=item Speed

=back

=item SEE ALSO

=back

=head2 perlunicook - cookbookish examples of handling Unicode in Perl

=over 4

=item DESCRIPTION

=item EXAMPLES

=over 4

=item ℞ 0: Standard preamble

=item ℞ 1: Generic Unicode-savvy filter

=item ℞ 2: Fine-tuning Unicode warnings

=item ℞ 3: Declare source in utf8 for identifiers and literals

=item ℞ 4: Characters and their numbers

=item ℞ 5: Unicode literals by character number

=item ℞ 6: Get character name by number

=item ℞ 7: Get character number by name

=item ℞ 8: Unicode named characters

=item ℞ 9: Unicode named sequences

=item ℞ 10: Custom named characters

=item ℞ 11: Names of CJK codepoints

=item ℞ 12: Explicit encode/decode

=item ℞ 13: Decode program arguments as utf8

=item ℞ 14: Decode program arguments as locale encoding

=item ℞ 15: Declare STD{IN,OUT,ERR} to be utf8

=item ℞ 16: Declare STD{IN,OUT,ERR} to be in locale encoding

=item ℞ 17: Make file I/O default to utf8

=item ℞ 18: Make all I/O and args default to utf8

=item ℞ 19: Open file with specific encoding

=item ℞ 20: Unicode casing

=item ℞ 21: Unicode case-insensitive comparisons

=item ℞ 22: Match Unicode linebreak sequence in regex

=item ℞ 23: Get character category

=item ℞ 24: Disabling Unicode-awareness in builtin charclasses

=item ℞ 25: Match Unicode properties in regex with \p, \P

=item ℞ 26: Custom character properties

=item ℞ 27: Unicode normalization

=item ℞ 28: Convert non-ASCII Unicode numerics

=item ℞ 29: Match Unicode grapheme cluster in regex

=item ℞ 30: Extract by grapheme instead of by codepoint (regex)

=item ℞ 31: Extract by grapheme instead of by codepoint (substr)

=item ℞ 32: Reverse string by grapheme

=item ℞ 33: String length in graphemes

=item ℞ 34: Unicode column-width for printing

=item ℞ 35: Unicode collation

=item ℞ 36: Case- I<and> accent-insensitive Unicode sort

=item ℞ 37: Unicode locale collation

=item ℞ 38: Making C<cmp> work on text instead of codepoints

=item ℞ 39: Case- I<and> accent-insensitive comparisons

=item ℞ 40: Case- I<and> accent-insensitive locale comparisons

=item ℞ 41: Unicode linebreaking

=item ℞ 42: Unicode text in DBM hashes, the tedious way

=item ℞ 43: Unicode text in DBM hashes, the easy way

=item ℞ 44: PROGRAM: Demo of Unicode collation and printing

=back

=item SEE ALSO

§3.13 Default Case Algorithms, page 113; §4.2 Case, pages 120–122;
Case Mappings, pages 166–172, especially Caseless Matching starting on
page 170, UAX #44: Unicode Character Database, UTS #18: Unicode Regular
Expressions, UAX #15: Unicode Normalization Forms, UTS #10: Unicode
Collation Algorithm, UAX #29: Unicode Text Segmentation, UAX #14: Unicode
Line Breaking Algorithm, UAX #11: East Asian Width

=item AUTHOR

=item COPYRIGHT AND LICENCE

=item REVISION HISTORY

=back

=head2 perlunifaq - Perl Unicode FAQ

=over 4

=item Q and A

=over 4

=item perlunitut isn't really a Unicode tutorial, is it?

=item What character encodings does Perl support?

=item Which version of perl should I use?

=item What about binary data, like images?

=item When should I decode or encode?

=item What if I don't decode?

=item What if I don't encode?

=item Is there a way to automatically decode or encode?

=item What if I don't know which encoding was used?

=item Can I use Unicode in my Perl sources?

=item Data::Dumper doesn't restore the UTF8 flag; is it broken?

=item Why do regex character classes sometimes match only in the ASCII
range?

=item Why do some characters not uppercase or lowercase correctly?

=item How can I determine if a string is a text string or a binary string?

=item How do I convert from encoding FOO to encoding BAR?

=item What are C<decode_utf8> and C<encode_utf8>?

=item What is a "wide character"?

=back

=item INTERNALS

=over 4

=item What is "the UTF8 flag"?

=item What about the C<use bytes> pragma?

=item What about the C<use encoding> pragma?

=item What is the difference between C<:encoding> and C<:utf8>?

=item What's the difference between C<UTF-8> and C<utf8>?

=item I lost track; what encoding is the internal format really?

=back

=item AUTHOR

=item SEE ALSO

=back

=head2 perluniprops - Index of Unicode Version 9.0.0 character properties
in Perl

=over 4

=item DESCRIPTION

=item Properties accessible through C<\p{}> and C<\P{}>

Single form (C<\p{name}>) tighter rules:, white space adjacent to a
non-word character, underscores separating digits in numbers, Compound form
(C<\p{name=value}> or C<\p{name:value}>) tighter rules:, Stabilized,
Deprecated, Obsolete, Discouraged, Z<>B<*> is a wild-card, B<(\d+)> in the
info column gives the number of Unicode code points matched by this
property, B<D> means this is deprecated, B<O> means this is obsolete, B<S>
means this is stabilized, B<T> means tighter (stricter) name matching
applies, B<X> means use of this form is discouraged, and may not be stable

=over 4

=item Legal C<\p{}> and C<\P{}> constructs that match no characters

\p{Canonical_Combining_Class=Attached_Below_Left},
\p{Canonical_Combining_Class=CCC133}

=back

=item Properties accessible through Unicode::UCD

=item Properties accessible through other means

=item Unicode character properties that are NOT accepted by Perl

I<Expands_On_NFC> (XO_NFC), I<Expands_On_NFD> (XO_NFD), I<Expands_On_NFKC>
(XO_NFKC), I<Expands_On_NFKD> (XO_NFKD), I<Grapheme_Link> (Gr_Link),
I<Jamo_Short_Name> (JSN), I<Other_Alphabetic> (OAlpha),
I<Other_Default_Ignorable_Code_Point> (ODI), I<Other_Grapheme_Extend>
(OGr_Ext), I<Other_ID_Continue> (OIDC), I<Other_ID_Start> (OIDS),
I<Other_Lowercase> (OLower), I<Other_Math> (OMath), I<Other_Uppercase>
(OUpper), I<Script=Katakana_Or_Hiragana> (sc=Hrkt),
I<Script_Extensions=Katakana_Or_Hiragana> (scx=Hrkt)

=item Other information in the Unicode data base

F<auxiliary/GraphemeBreakTest.html>, F<auxiliary/LineBreakTest.html>,
F<auxiliary/SentenceBreakTest.html>, F<auxiliary/WordBreakTest.html>,
F<BidiCharacterTest.txt>, F<BidiTest.txt>, F<NormTest.txt>,
F<CJKRadicals.txt>, F<EmojiSources.txt>, F<Index.txt>, F<NamedSqProv.txt>,
F<NamesList.html>, F<NamesList.txt>, F<NormalizationCorrections.txt>,
F<ReadMe.txt>, F<StandardizedVariants.html>, F<StandardizedVariants.txt>,
F<TangutSources.txt>, F<USourceData.txt>, F<USourceGlyphs.pdf>

=item SEE ALSO

=back

=head2 perlunitut - Perl Unicode Tutorial

=over 4

=item DESCRIPTION

=over 4

=item Definitions

=item Your new toolkit

=item I/O flow (the actual 5 minute tutorial)

=back

=item SUMMARY

=item Q and A (or FAQ)

=item ACKNOWLEDGEMENTS

=item AUTHOR

=item SEE ALSO

=back

=head2 perlebcdic - Considerations for running Perl on EBCDIC platforms

=over 4

=item DESCRIPTION

=item COMMON CHARACTER CODE SETS

=over 4

=item ASCII

=item ISO 8859

=item Latin 1 (ISO 8859-1)

=item EBCDIC

B<0037>, B<1047>, B<POSIX-BC>

=item Unicode code points versus EBCDIC code points

=item Unicode and UTF

=item Using Encode

=back

=item SINGLE OCTET TABLES

recipe 0, recipe 1, recipe 2, recipe 3, recipe 4, recipe 5, recipe 6

=over 4

=item Table in hex, sorted in 1047 order

=back

=item IDENTIFYING CHARACTER CODE SETS

=item CONVERSIONS

=over 4

=item C<utf8::unicode_to_native()> and C<utf8::native_to_unicode()>

=item tr///

=item iconv

=item C RTL

=back

=item OPERATOR DIFFERENCES

=item FUNCTION DIFFERENCES

C<chr()>, C<ord()>, C<pack()>, C<print()>, C<printf()>, C<sort()>,
C<sprintf()>, C<unpack()>

=item REGULAR EXPRESSION DIFFERENCES

=item SOCKETS

=item SORTING

=over 4

=item Ignore ASCII vs. EBCDIC sort differences.

=item Use a sort helper function

=item MONO CASE then sort data (for non-digits, non-underscore)

=item Perform sorting on one type of platform only.

=back

=item TRANSFORMATION FORMATS

=over 4

=item URL decoding and encoding

=item uu encoding and decoding

=item Quoted-Printable encoding and decoding

=item Caesarean ciphers

=back

=item Hashing order and checksums

=item I18N AND L10N

=item MULTI-OCTET CHARACTER SETS

=item OS ISSUES

=over 4

=item OS/400

PASE, IFS access

=item OS/390, z/OS

C<sigaction>, C<chcp>, dataset access, C<iconv>, locales

=item POSIX-BC?

=back

=item BUGS

=item SEE ALSO

=item REFERENCES

=item HISTORY

=item AUTHOR

=back

=head2 perlsec - Perl security

=over 4

=item DESCRIPTION

=item SECURITY VULNERABILITY CONTACT INFORMATION

=item SECURITY MECHANISMS AND CONCERNS

=over 4

=item Taint mode

=item Laundering and Detecting Tainted Data

=item Switches On the "#!" Line

=item Taint mode and @INC

=item Cleaning Up Your Path

=item Security Bugs

=item Protecting Your Programs

=item Unicode

=item Algorithmic Complexity Attacks

Hash Seed Randomization, Hash Traversal Randomization, Bucket Order
Perturbance, New Default Hash Function, Alternative Hash Functions

=back

=item SEE ALSO

=back

=head2 perlmod - Perl modules (packages and symbol tables)

=over 4

=item DESCRIPTION

=over 4

=item Is this the document you were after?

This doc, L<perlnewmod>, L<perlmodstyle>

=item Packages
X<package> X<namespace> X<variable, global> X<global variable> X<global>

=item Symbol Tables
X<symbol table> X<stash> X<%::> X<%main::> X<typeglob> X<glob> X<alias>

=item BEGIN, UNITCHECK, CHECK, INIT and END
X<BEGIN> X<UNITCHECK> X<CHECK> X<INIT> X<END>

=item Perl Classes
X<class> X<@ISA>

=item Perl Modules
X<module>

=item Making your module threadsafe
X<threadsafe> X<thread safe>
X<module, threadsafe> X<module, thread safe>
X<CLONE> X<CLONE_SKIP> X<thread> X<threads> X<ithread>

=back

=item SEE ALSO

=back

=head2 perlmodlib - constructing new Perl modules and finding existing ones

=over 4

=item THE PERL MODULE LIBRARY

=over 4

=item Pragmatic Modules

arybase, attributes, autodie, autodie::exception,
autodie::exception::system, autodie::hints, autodie::skip, autouse, base,
bigint, bignum, bigrat, blib, bytes, charnames, constant, deprecate,
diagnostics, encoding, encoding::warnings, experimental, feature, fields,
filetest, if, integer, less, lib, locale, mro, ok, open, ops, overload,
overloading, parent, re, sigtrap, sort, strict, subs, threads,
threads::shared, utf8, vars, version, vmsish, warnings::register

=item Standard Modules

Amiga::ARexx, Amiga::Exec, AnyDBM_File, App::Cpan, App::Prove,
App::Prove::State, App::Prove::State::Result,
App::Prove::State::Result::Test, Archive::Tar, Archive::Tar::File,
Attribute::Handlers, AutoLoader, AutoSplit, B, B::Concise, B::Debug,
B::Deparse, B::Op_private, B::Showlex, B::Terse, B::Xref, Benchmark,
C<IO::Socket::IP>, C<Socket>, CORE, CPAN, CPAN::API::HOWTO, CPAN::Debug,
CPAN::Distroprefs, CPAN::FirstTime, CPAN::HandleConfig, CPAN::Kwalify,
CPAN::Meta, CPAN::Meta::Converter, CPAN::Meta::Feature,
CPAN::Meta::History, CPAN::Meta::History::Meta_1_0,
CPAN::Meta::History::Meta_1_1, CPAN::Meta::History::Meta_1_2,
CPAN::Meta::History::Meta_1_3, CPAN::Meta::History::Meta_1_4,
CPAN::Meta::Merge, CPAN::Meta::Prereqs, CPAN::Meta::Requirements,
CPAN::Meta::Spec, CPAN::Meta::Validator, CPAN::Meta::YAML, CPAN::Nox,
CPAN::Plugin, CPAN::Plugin::Specfile, CPAN::Queue, CPAN::Tarzip,
CPAN::Version, Carp, Class::Struct, Compress::Raw::Bzip2,
Compress::Raw::Zlib, Compress::Zlib, Config, Config::Perl::V, Cwd, DB,
DBM_Filter, DBM_Filter::compress, DBM_Filter::encode, DBM_Filter::int32,
DBM_Filter::null, DBM_Filter::utf8, DB_File, Data::Dumper, Devel::PPPort,
Devel::Peek, Devel::SelfStubber, Digest, Digest::MD5, Digest::SHA,
Digest::base, Digest::file, DirHandle, Dumpvalue, DynaLoader, Encode,
Encode::Alias, Encode::Byte, Encode::CJKConstants, Encode::CN,
Encode::CN::HZ, Encode::Config, Encode::EBCDIC, Encode::Encoder,
Encode::Encoding, Encode::GSM0338, Encode::Guess, Encode::JP,
Encode::JP::H2Z, Encode::JP::JIS7, Encode::KR, Encode::KR::2022_KR,
Encode::MIME::Header, Encode::MIME::Name, Encode::PerlIO,
Encode::Supported, Encode::Symbol, Encode::TW, Encode::Unicode,
Encode::Unicode::UTF7, English, Env, Errno, Exporter, Exporter::Heavy,
ExtUtils::CBuilder, ExtUtils::CBuilder::Platform::Windows,
ExtUtils::Command, ExtUtils::Command::MM, ExtUtils::Constant,
ExtUtils::Constant::Base, ExtUtils::Constant::Utils,
ExtUtils::Constant::XS, ExtUtils::Embed, ExtUtils::Install,
ExtUtils::Installed, ExtUtils::Liblist, ExtUtils::MM, ExtUtils::MM::Utils,
ExtUtils::MM_AIX, ExtUtils::MM_Any, ExtUtils::MM_BeOS, ExtUtils::MM_Cygwin,
ExtUtils::MM_DOS, ExtUtils::MM_Darwin, ExtUtils::MM_MacOS,
ExtUtils::MM_NW5, ExtUtils::MM_OS2, ExtUtils::MM_QNX, ExtUtils::MM_UWIN,
ExtUtils::MM_Unix, ExtUtils::MM_VMS, ExtUtils::MM_VOS, ExtUtils::MM_Win32,
ExtUtils::MM_Win95, ExtUtils::MY, ExtUtils::MakeMaker,
ExtUtils::MakeMaker::Config, ExtUtils::MakeMaker::FAQ,
ExtUtils::MakeMaker::Locale, ExtUtils::MakeMaker::Tutorial,
ExtUtils::Manifest, ExtUtils::Miniperl, ExtUtils::Mkbootstrap,
ExtUtils::Mksymlists, ExtUtils::Packlist, ExtUtils::ParseXS,
ExtUtils::ParseXS::Constants, ExtUtils::ParseXS::Eval,
ExtUtils::ParseXS::Utilities, ExtUtils::Typemaps, ExtUtils::Typemaps::Cmd,
ExtUtils::Typemaps::InputMap, ExtUtils::Typemaps::OutputMap,
ExtUtils::Typemaps::Type, ExtUtils::XSSymSet, ExtUtils::testlib, Fatal,
Fcntl, File::Basename, File::Compare, File::Copy, File::DosGlob,
File::Fetch, File::Find, File::Glob, File::GlobMapper, File::Path,
File::Spec, File::Spec::AmigaOS, File::Spec::Cygwin, File::Spec::Epoc,
File::Spec::Functions, File::Spec::Mac, File::Spec::OS2, File::Spec::Unix,
File::Spec::VMS, File::Spec::Win32, File::Temp, File::stat, FileCache,
FileHandle, Filter::Simple, Filter::Util::Call, FindBin, GDBM_File,
Getopt::Long, Getopt::Std, HTTP::Tiny, Hash::Util, Hash::Util::FieldHash,
I18N::Collate, I18N::LangTags, I18N::LangTags::Detect,
I18N::LangTags::List, I18N::Langinfo, IO, IO::Compress::Base,
IO::Compress::Bzip2, IO::Compress::Deflate, IO::Compress::FAQ,
IO::Compress::Gzip, IO::Compress::RawDeflate, IO::Compress::Zip, IO::Dir,
IO::File, IO::Handle, IO::Pipe, IO::Poll, IO::Seekable, IO::Select,
IO::Socket, IO::Socket::INET, IO::Socket::UNIX, IO::Uncompress::AnyInflate,
IO::Uncompress::AnyUncompress, IO::Uncompress::Base,
IO::Uncompress::Bunzip2, IO::Uncompress::Gunzip, IO::Uncompress::Inflate,
IO::Uncompress::RawInflate, IO::Uncompress::Unzip, IO::Zlib, IPC::Cmd,
IPC::Msg, IPC::Open2, IPC::Open3, IPC::Semaphore, IPC::SharedMem,
IPC::SysV, Internals, JSON::PP, JSON::PP::Boolean, List::Util,
List::Util::XS, Locale::Codes, Locale::Codes::API, Locale::Codes::Changes,
Locale::Codes::Country, Locale::Codes::Currency, Locale::Codes::LangExt,
Locale::Codes::LangFam, Locale::Codes::LangVar, Locale::Codes::Language,
Locale::Codes::Script, Locale::Country, Locale::Currency, Locale::Language,
Locale::Maketext, Locale::Maketext::Cookbook, Locale::Maketext::Guts,
Locale::Maketext::GutsLoader, Locale::Maketext::Simple,
Locale::Maketext::TPJ13, Locale::Script, MIME::Base64, MIME::QuotedPrint,
Math::BigFloat, Math::BigInt, Math::BigInt::Calc, Math::BigInt::CalcEmu,
Math::BigInt::FastCalc, Math::BigInt::Lib, Math::BigRat, Math::Complex,
Math::Trig, Memoize, Memoize::AnyDBM_File, Memoize::Expire,
Memoize::ExpireFile, Memoize::ExpireTest, Memoize::NDBM_File,
Memoize::SDBM_File, Memoize::Storable, Module::CoreList,
Module::CoreList::Utils, Module::Load, Module::Load::Conditional,
Module::Loaded, Module::Metadata, NDBM_File, NEXT, Net::Cmd, Net::Config,
Net::Domain, Net::FTP, Net::FTP::dataconn, Net::NNTP, Net::Netrc,
Net::POP3, Net::Ping, Net::SMTP, Net::Time, Net::hostent, Net::libnetFAQ,
Net::netent, Net::protoent, Net::servent, O, ODBM_File, Opcode, POSIX,
Params::Check, Parse::CPAN::Meta, Perl::OSType, PerlIO, PerlIO::encoding,
PerlIO::mmap, PerlIO::scalar, PerlIO::via, PerlIO::via::QuotedPrint,
Pod::Checker, Pod::Escapes, Pod::Find, Pod::Functions, Pod::Html,
Pod::InputObjects, Pod::Man, Pod::ParseLink, Pod::ParseUtils, Pod::Parser,
Pod::Perldoc, Pod::Perldoc::BaseTo, Pod::Perldoc::GetOptsOO,
Pod::Perldoc::ToANSI, Pod::Perldoc::ToChecker, Pod::Perldoc::ToMan,
Pod::Perldoc::ToNroff, Pod::Perldoc::ToPod, Pod::Perldoc::ToRtf,
Pod::Perldoc::ToTerm, Pod::Perldoc::ToText, Pod::Perldoc::ToTk,
Pod::Perldoc::ToXml, Pod::PlainText, Pod::Select, Pod::Simple,
Pod::Simple::Checker, Pod::Simple::Debug, Pod::Simple::DumpAsText,
Pod::Simple::DumpAsXML, Pod::Simple::HTML, Pod::Simple::HTMLBatch,
Pod::Simple::LinkSection, Pod::Simple::Methody, Pod::Simple::PullParser,
Pod::Simple::PullParserEndToken, Pod::Simple::PullParserStartToken,
Pod::Simple::PullParserTextToken, Pod::Simple::PullParserToken,
Pod::Simple::RTF, Pod::Simple::Search, Pod::Simple::SimpleTree,
Pod::Simple::Subclassing, Pod::Simple::Text, Pod::Simple::TextContent,
Pod::Simple::XHTML, Pod::Simple::XMLOutStream, Pod::Text, Pod::Text::Color,
Pod::Text::Termcap, Pod::Usage, SDBM_File, Safe, Scalar::Util,
Search::Dict, SelectSaver, SelfLoader, Storable, Sub::Util, Symbol,
Sys::Hostname, Sys::Syslog, Sys::Syslog::Win32, TAP::Base,
TAP::Formatter::Base, TAP::Formatter::Color, TAP::Formatter::Console,
TAP::Formatter::Console::ParallelSession, TAP::Formatter::Console::Session,
TAP::Formatter::File, TAP::Formatter::File::Session,
TAP::Formatter::Session, TAP::Harness, TAP::Harness::Env, TAP::Object,
TAP::Parser, TAP::Parser::Aggregator, TAP::Parser::Grammar,
TAP::Parser::Iterator, TAP::Parser::Iterator::Array,
TAP::Parser::Iterator::Process, TAP::Parser::Iterator::Stream,
TAP::Parser::IteratorFactory, TAP::Parser::Multiplexer,
TAP::Parser::Result, TAP::Parser::Result::Bailout,
TAP::Parser::Result::Comment, TAP::Parser::Result::Plan,
TAP::Parser::Result::Pragma, TAP::Parser::Result::Test,
TAP::Parser::Result::Unknown, TAP::Parser::Result::Version,
TAP::Parser::Result::YAML, TAP::Parser::ResultFactory,
TAP::Parser::Scheduler, TAP::Parser::Scheduler::Job,
TAP::Parser::Scheduler::Spinner, TAP::Parser::Source,
TAP::Parser::SourceHandler, TAP::Parser::SourceHandler::Executable,
TAP::Parser::SourceHandler::File, TAP::Parser::SourceHandler::Handle,
TAP::Parser::SourceHandler::Perl, TAP::Parser::SourceHandler::RawTAP,
TAP::Parser::YAMLish::Reader, TAP::Parser::YAMLish::Writer,
Term::ANSIColor, Term::Cap, Term::Complete, Term::ReadLine, Test, Test2,
Test2::API, Test2::API::Breakage, Test2::API::Context,
Test2::API::Instance, Test2::API::Stack, Test2::Event, Test2::Event::Bail,
Test2::Event::Diag, Test2::Event::Encoding, Test2::Event::Exception,
Test2::Event::Generic, Test2::Event::Info, Test2::Event::Note,
Test2::Event::Ok, Test2::Event::Plan, Test2::Event::Skip,
Test2::Event::Subtest, Test2::Event::TAP::Version, Test2::Event::Waiting,
Test2::Formatter, Test2::Formatter::TAP, Test2::Hub,
Test2::Hub::Interceptor, Test2::Hub::Interceptor::Terminator,
Test2::Hub::Subtest, Test2::IPC, Test2::IPC::Driver,
Test2::IPC::Driver::Files, Test2::Tools::Tiny, Test2::Transition,
Test2::Util, Test2::Util::ExternalMeta, Test2::Util::HashBase,
Test2::Util::Trace, Test::Builder, Test::Builder::Formatter,
Test::Builder::IO::Scalar, Test::Builder::Module, Test::Builder::Tester,
Test::Builder::Tester::Color, Test::Builder::TodoDiag, Test::Harness,
Test::Harness::Beyond, Test::More, Test::Simple, Test::Tester,
Test::Tester::Capture, Test::Tester::CaptureRunner, Test::Tutorial,
Test::use::ok, Text::Abbrev, Text::Balanced, Text::ParseWords, Text::Tabs,
Text::Wrap, Thread, Thread::Queue, Thread::Semaphore, Tie::Array,
Tie::File, Tie::Handle, Tie::Hash, Tie::Hash::NamedCapture, Tie::Memoize,
Tie::RefHash, Tie::Scalar, Tie::StdHandle, Tie::SubstrHash, Time::HiRes,
Time::Local, Time::Piece, Time::Seconds, Time::gmtime, Time::localtime,
Time::tm, UNIVERSAL, Unicode::Collate, Unicode::Collate::CJK::Big5,
Unicode::Collate::CJK::GB2312, Unicode::Collate::CJK::JISX0208,
Unicode::Collate::CJK::Korean, Unicode::Collate::CJK::Pinyin,
Unicode::Collate::CJK::Stroke, Unicode::Collate::CJK::Zhuyin,
Unicode::Collate::Locale, Unicode::Normalize, Unicode::UCD, User::grent,
User::pwent, VMS::DCLsym, VMS::Filespec, VMS::Stdio, Win32, Win32API::File,
Win32CORE, XS::APItest, XS::Typemap, XSLoader, autodie::Scope::Guard,
autodie::Scope::GuardStack, autodie::Util, version::Internals

=item Extension Modules

=back

=item CPAN

=over 4

=item Africa

South Africa, Uganda, Zimbabwe

=item Asia

Bangladesh, China, India, Indonesia, Iran, Israel, Japan, Kazakhstan,
Philippines, Qatar, Republic of Korea, Singapore, Taiwan, Turkey, Viet Nam

=item Europe

Austria, Belarus, Belgium, Bosnia and Herzegovina, Bulgaria, Croatia, Czech
Republic, Denmark, Finland, France, Germany, Greece, Hungary, Ireland,
Italy, Latvia, Lithuania, Moldova, Netherlands, Norway, Poland, Portugal,
Romania, Russian Federation, Serbia, Slovakia, Slovenia, Spain, Sweden,
Switzerland, Ukraine, United Kingdom

=item North America

Canada, Costa Rica, Mexico, United States, Alabama, Arizona, California,
Idaho, Illinois, Indiana, Kansas, Massachusetts, Michigan, New Hampshire,
New Jersey, New York, North Carolina, Oregon, Pennsylvania, South Carolina,
Texas, Utah, Virginia, Washington, Wisconsin

=item Oceania

Australia, New Caledonia, New Zealand

=item South America

Argentina, Brazil, Chile

=item RSYNC Mirrors

=back

=item Modules: Creation, Use, and Abuse

=over 4

=item Guidelines for Module Creation

=item Guidelines for Converting Perl 4 Library Scripts into Modules

=item Guidelines for Reusing Application Code

=back

=item NOTE

=back

=head2 perlmodstyle - Perl module style guide

=over 4

=item INTRODUCTION

=item QUICK CHECKLIST

=over 4

=item Before you start

=item The API

=item Stability

=item Documentation

=item Release considerations

=back

=item BEFORE YOU START WRITING A MODULE

=over 4

=item Has it been done before?

=item Do one thing and do it well

=item What's in a name?

=item Get feedback before publishing

=back

=item DESIGNING AND WRITING YOUR MODULE

=over 4

=item To OO or not to OO?

=item Designing your API

Write simple routines to do simple things, Separate functionality from
output, Provide sensible shortcuts and defaults, Naming conventions,
Parameter passing

=item Strictness and warnings

=item Backwards compatibility

=item Error handling and messages

=back

=item DOCUMENTING YOUR MODULE

=over 4

=item POD

=item README, INSTALL, release notes, changelogs

perl Makefile.PL, make, make test, make install, perl Build.PL, perl Build,
perl Build test, perl Build install

=back

=item RELEASE CONSIDERATIONS

=over 4

=item Version numbering

=item Pre-requisites

=item Testing

=item Packaging

=item Licensing

=back

=item COMMON PITFALLS

=over 4

=item Reinventing the wheel

=item Trying to do too much

=item Inappropriate documentation

=back

=item SEE ALSO

L<perlstyle>, L<perlnewmod>, L<perlpod>, L<podchecker>, Packaging Tools,
Testing tools, L<http://pause.perl.org/>, Any good book on software
engineering

=item AUTHOR

=back

=head2 perlmodinstall - Installing CPAN Modules

=over 4

=item DESCRIPTION

=over 4

=item PREAMBLE

B<DECOMPRESS> the file, B<UNPACK> the file into a directory, B<BUILD> the
module (sometimes unnecessary), B<INSTALL> the module

=back

=item PORTABILITY

=item HEY

=item AUTHOR

=item COPYRIGHT

=back

=head2 perlnewmod - preparing a new module for distribution

=over 4

=item DESCRIPTION

=over 4

=item Warning

=item What should I make into a module?

=item Step-by-step: Preparing the ground

Look around, Check it's new, Discuss the need, Choose a name, Check again

=item Step-by-step: Making the module

Start with F<module-starter> or F<h2xs>, Use L<strict|strict> and
L<warnings|warnings>, Use L<Carp|Carp>, Use L<Exporter|Exporter> - wisely!,
Use L<plain old documentation|perlpod>, Write tests, Write the F<README>,
Write F<Changes>

=item Step-by-step: Distributing your module

Get a CPAN user ID, C<perl Makefile.PL; make test; make distcheck; make
dist>, Upload the tarball, Fix bugs!

=back

=item AUTHOR

=item SEE ALSO

=back

=head2 perlpragma - how to write a user pragma

=over 4

=item DESCRIPTION

=item A basic example

=item Key naming

=item Implementation details

=back

=head2 perlutil - utilities packaged with the Perl distribution

=over 4

=item DESCRIPTION

=item LIST OF UTILITIES

=over 4

=item Documentation

L<perldoc|perldoc>, L<pod2man|pod2man> and L<pod2text|pod2text>,
L<pod2html|pod2html>, L<pod2usage|pod2usage>, L<podselect|podselect>,
L<podchecker|podchecker>, L<splain|splain>, C<roffitall>

=item Converters

=item Administration

L<libnetcfg|libnetcfg>, L<perlivp>

=item Development

L<perlbug|perlbug>, L<perlthanks|perlbug>, L<h2ph|h2ph>, L<h2xs|h2xs>,
L<enc2xs>, L<xsubpp>, L<prove>, L<corelist>

=item General tools

L<piconv>, L<ptar>, L<ptardiff>, L<ptargrep>, L<shasum>, L<zipdetails>

=item Installation

L<cpan>, L<instmodsh>

=back

=item SEE ALSO

=back

=head2 perlfilter - Source Filters

=over 4

=item DESCRIPTION

=item CONCEPTS

=item USING FILTERS

=item WRITING A SOURCE FILTER

=item WRITING A SOURCE FILTER IN C

B<Decryption Filters>

=item CREATING A SOURCE FILTER AS A SEPARATE EXECUTABLE

=item WRITING A SOURCE FILTER IN PERL

=item USING CONTEXT: THE DEBUG FILTER

=item CONCLUSION

=item LIMITATIONS

=item THINGS TO LOOK OUT FOR

Some Filters Clobber the C<DATA> Handle

=item REQUIREMENTS

=item AUTHOR

=item Copyrights

=back

=head2 perldtrace - Perl's support for DTrace

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item HISTORY

=item PROBES

sub-entry(SUBNAME, FILE, LINE, PACKAGE), sub-return(SUBNAME, FILE, LINE,
PACKAGE), phase-change(NEWPHASE, OLDPHASE), op-entry(OPNAME),
loading-file(FILENAME), loaded-file(FILENAME)

=item EXAMPLES

Most frequently called functions, Trace function calls, Function calls
during interpreter cleanup, System calls at compile time, Perl functions
that execute the most opcodes

=item REFERENCES

DTrace Dynamic Tracing Guide, DTrace: Dynamic Tracing in Oracle Solaris,
Mac OS X and FreeBSD

=item SEE ALSO

L<Devel::DTrace::Provider>

=item AUTHORS

=back

=head2 perlglossary - Perl Glossary

=over 4

=item VERSION

=item DESCRIPTION

=over 4

=item A

accessor methods, actual arguments, address operator, algorithm, alias,
alphabetic, alternatives, anonymous, application, architecture, argument,
ARGV, arithmetical operator, array, array context, Artistic License, ASCII,
assertion, assignment, assignment operator, associative array,
associativity, asynchronous, atom, atomic operation, attribute,
autogeneration, autoincrement, autoload, autosplit, autovivification, AV,
awk

=item B

backreference, backtracking, backward compatibility, bareword, base class,
big-endian, binary, binary operator, bind, bit, bit shift, bit string,
bless, block, BLOCK, block buffering, Boolean, Boolean context, breakpoint,
broadcast, BSD, bucket, buffer, built-in, bundle, byte, bytecode

=item C

C, cache, callback, call by reference, call by value, canonical, capture
variables, capturing, cargo cult, case, casefolding, casemapping,
character, character class, character property, circumfix operator, class,
class method, client, closure, cluster, CODE, code generator, codepoint,
code subpattern, collating sequence, co-maintainer, combining character,
command, command buffering, command-line arguments, command name, comment,
compilation unit, compile, compile phase, compiler, compile time, composer,
concatenation, conditional, connection, construct, constructor, context,
continuation, core dump, CPAN, C preprocessor, cracker, currently selected
output channel, current package, current working directory, CV

=item D

dangling statement, datagram, data structure, data type, DBM, declaration,
declarator, decrement, default, defined, delimiter, dereference, derived
class, descriptor, destroy, destructor, device, directive, directory,
directory handle, discipline, dispatch, distribution, dual-lived, dweomer,
dwimmer, dynamic scoping

=item E

eclectic, element, embedding, empty subclass test, encapsulation, endian,
en passant, environment, environment variable, EOF, errno, error, escape
sequence, exception, exception handling, exec, executable file, execute,
execute bit, exit status, exploit, export, expression, extension

=item F

false, FAQ, fatal error, feeping creaturism, field, FIFO, file, file
descriptor, fileglob, filehandle, filename, filesystem, file test operator,
filter, first-come, flag, floating point, flush, FMTEYEWTK, foldcase, fork,
formal arguments, format, freely available, freely redistributable,
freeware, function, funny character

=item G

garbage collection, GID, glob, global, global destruction, glue language,
granularity, grapheme, greedy, grep, group, GV

=item H

hacker, handler, hard reference, hash, hash table, header file, here
document, hexadecimal, home directory, host, hubris, HV

=item I

identifier, impatience, implementation, import, increment, indexing,
indirect filehandle, indirection, indirect object, indirect object slot,
infix, inheritance, instance, instance data, instance method, instance
variable, integer, interface, interpolation, interpreter, invocant,
invocation, I/O, IO, I/O layer, IPA, IP, IPC, is-a, iteration, iterator, IV

=item J

JAPH

=item K

key, keyword

=item L

label, laziness, leftmost longest, left shift, lexeme, lexer, lexical
analysis, lexical scoping, lexical variable, library, LIFO, line,
linebreak, line buffering, line number, link, LIST, list, list context,
list operator, list value, literal, little-endian, local, logical operator,
lookahead, lookbehind, loop, loop control statement, loop label, lowercase,
lvaluable, lvalue, lvalue modifier

=item M

magic, magical increment, magical variables, Makefile, man, manpage,
matching, member data, memory, metacharacter, metasymbol, method, method
resolution order, minicpan, minimalism, mode, modifier, module, modulus,
mojibake, monger, mortal, mro, multidimensional array, multiple inheritance

=item N

named pipe, namespace, NaN, network address, newline, NFS, normalization,
null character, null list, null string, numeric context, numification, NV,
nybble

=item O

object, octal, offset, one-liner, open source software, operand, operating
system, operator, operator overloading, options, ordinal, overloading,
overriding, owner

=item P

package, pad, parameter, parent class, parse tree, parsing, patch, PATH,
pathname, pattern, pattern matching, PAUSE, Perl mongers, permission bits,
Pern, pipe, pipeline, platform, pod, pod command, pointer, polymorphism,
port, portable, porter, possessive, POSIX, postfix, pp, pragma, precedence,
prefix, preprocessing, primary maintainer, procedure, process, program,
program generator, progressive matching, property, protocol, prototype,
pseudofunction, pseudohash, pseudoliteral, public domain, pumpkin,
pumpking, PV

=item Q

qualified, quantifier

=item R

race condition, readable, reaping, record, recursion, reference, referent,
regex, regular expression, regular expression modifier, regular file,
relational operator, reserved words, return value, RFC, right shift, role,
root, RTFM, run phase, runtime, runtime pattern, RV, rvalue

=item S

sandbox, scalar, scalar context, scalar literal, scalar value, scalar
variable, scope, scratchpad, script, script kiddie, sed, semaphore,
separator, serialization, server, service, setgid, setuid, shared memory,
shebang, shell, side effects, sigil, signal, signal handler, single
inheritance, slice, slurp, socket, soft reference, source filter, stack,
standard, standard error, standard input, standard I/O, Standard Library,
standard output, statement, statement modifier, static, static method,
static scoping, static variable, stat structure, status, STDERR, STDIN,
STDIO, STDOUT, stream, string, string context, stringification, struct,
structure, subclass, subpattern, subroutine, subscript, substitution,
substring, superclass, superuser, SV, switch, switch cluster, switch
statement, symbol, symbolic debugger, symbolic link, symbolic reference,
symbol table, synchronous, syntactic sugar, syntax, syntax tree, syscall

=item T

taint checks, tainted, taint mode, TCP, term, terminator, ternary, text,
thread, tie, titlecase, TMTOWTDI, token, tokener, tokenizing, toolbox
approach, topic, transliterate, trigger, trinary, troff, true, truncating,
type, type casting, typedef, typed lexical, typeglob, typemap

=item U

UDP, UID, umask, unary operator, Unicode, Unix, uppercase

=item V

value, variable, variable interpolation, variadic, vector, virtual, void
context, v-string

=item W

warning, watch expression, weak reference, whitespace, word, working
directory, wrapper, WYSIWYG

=item X

XS, XSUB

=item Y

yacc

=item Z

zero width, zombie

=back

=item AUTHOR AND COPYRIGHT

=back

=head2 perlembed - how to embed perl in your C program

=over 4

=item DESCRIPTION

=over 4

=item PREAMBLE

B<Use C from Perl?>, B<Use a Unix program from Perl?>, B<Use Perl from
Perl?>, B<Use C from C?>, B<Use Perl from C?>

=item ROADMAP

=item Compiling your C program

=item Adding a Perl interpreter to your C program

=item Calling a Perl subroutine from your C program

=item Evaluating a Perl statement from your C program

=item Performing Perl pattern matches and substitutions from your C program

=item Fiddling with the Perl stack from your C program

=item Maintaining a persistent interpreter

=item Execution of END blocks

=item $0 assignments

=item Maintaining multiple interpreter instances

=item Using Perl modules, which themselves use C libraries, from your C
program

=item Using embedded Perl with POSIX locales

=back

=item Hiding Perl_

=item MORAL

=item AUTHOR

=item COPYRIGHT

=back

=head2 perldebguts - Guts of Perl debugging

=over 4

=item DESCRIPTION

=item Debugger Internals

=over 4

=item Writing Your Own Debugger

=back

=item Frame Listing Output Examples

=item Debugging Regular Expressions

=over 4

=item Compile-time Output

C<anchored> I<STRING> C<at> I<POS>, C<floating> I<STRING> C<at>
I<POS1..POS2>, C<matching floating/anchored>, C<minlen>, C<stclass>
I<TYPE>, C<noscan>, C<isall>, C<GPOS>, C<plus>, C<implicit>, C<with eval>,
C<anchored(TYPE)>

=item Types of Nodes

=item Run-time Output

=back

=item Debugging Perl Memory Usage

=over 4

=item Using C<$ENV{PERL_DEBUG_MSTATS}>

C<buckets SMALLEST(APPROX)..GREATEST(APPROX)>, Free/Used, C<Total sbrk():
SBRKed/SBRKs:CONTINUOUS>, C<pad: 0>, C<heads: 2192>, C<chain: 0>, C<tail:
6144>

=back

=item SEE ALSO

=back

=head2 perlxstut - Tutorial for writing XSUBs

=over 4

=item DESCRIPTION

=item SPECIAL NOTES

=over 4

=item make

=item Version caveat

=item Dynamic Loading versus Static Loading

=item Threads and PERL_NO_GET_CONTEXT

=back

=item TUTORIAL

=over 4

=item EXAMPLE 1

=item EXAMPLE 2

=item What has gone on?

=item Writing good test scripts

=item EXAMPLE 3

=item What's new here?

=item Input and Output Parameters

=item The XSUBPP Program

=item The TYPEMAP file

=item Warning about Output Arguments

=item EXAMPLE 4

=item What has happened here?

=item Anatomy of .xs file

=item Getting the fat out of XSUBs

=item More about XSUB arguments

=item The Argument Stack

=item Extending your Extension

=item Documenting your Extension

=item Installing your Extension

=item EXAMPLE 5

=item New Things in this Example

=item EXAMPLE 6

=item New Things in this Example

=item EXAMPLE 7 (Coming Soon)

=item EXAMPLE 8 (Coming Soon)

=item EXAMPLE 9 Passing open files to XSes

=item Troubleshooting these Examples

=back

=item See also

=item Author

=over 4

=item Last Changed

=back

=back

=head2 perlxs - XS language reference manual

=over 4

=item DESCRIPTION

=over 4

=item Introduction

=item On The Road

=item The Anatomy of an XSUB

=item The Argument Stack

=item The RETVAL Variable

=item Returning SVs, AVs and HVs through RETVAL

=item The MODULE Keyword

=item The PACKAGE Keyword

=item The PREFIX Keyword

=item The OUTPUT: Keyword

=item The NO_OUTPUT Keyword

=item The CODE: Keyword

=item The INIT: Keyword

=item The NO_INIT Keyword

=item The TYPEMAP: Keyword

=item Initializing Function Parameters

=item Default Parameter Values

=item The PREINIT: Keyword

=item The SCOPE: Keyword

=item The INPUT: Keyword

=item The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords

=item The C<length(NAME)> Keyword

=item Variable-length Parameter Lists

=item The C_ARGS: Keyword

=item The PPCODE: Keyword

=item Returning Undef And Empty Lists

=item The REQUIRE: Keyword

=item The CLEANUP: Keyword

=item The POSTCALL: Keyword

=item The BOOT: Keyword

=item The VERSIONCHECK: Keyword

=item The PROTOTYPES: Keyword

=item The PROTOTYPE: Keyword

=item The ALIAS: Keyword

=item The OVERLOAD: Keyword

=item The FALLBACK: Keyword

=item The INTERFACE: Keyword

=item The INTERFACE_MACRO: Keyword

=item The INCLUDE: Keyword

=item The INCLUDE_COMMAND: Keyword

=item The CASE: Keyword

=item The EXPORT_XSUB_SYMBOLS: Keyword

=item The & Unary Operator

=item Inserting POD, Comments and C Preprocessor Directives

=item Using XS With C++

=item Interface Strategy

=item Perl Objects And C Structures

=item Safely Storing Static Data in XS

MY_CXT_KEY, typedef my_cxt_t, START_MY_CXT, MY_CXT_INIT, dMY_CXT, MY_CXT,
aMY_CXT/pMY_CXT, MY_CXT_CLONE, MY_CXT_INIT_INTERP(my_perl),
dMY_CXT_INTERP(my_perl)

=item Thread-aware system interfaces

=back

=item EXAMPLES

=item CAVEATS

Non-locale-aware XS code, Locale-aware XS code

=item XS VERSION

=item AUTHOR

=back

=head2 perlxstypemap - Perl XS C/Perl type mapping

=over 4

=item DESCRIPTION

=over 4

=item Anatomy of a typemap

=item The Role of the typemap File in Your Distribution

=item Sharing typemaps Between CPAN Distributions

=item Writing typemap Entries

=item Full Listing of Core Typemaps

T_SV, T_SVREF, T_SVREF_FIXED, T_AVREF, T_AVREF_REFCOUNT_FIXED, T_HVREF,
T_HVREF_REFCOUNT_FIXED, T_CVREF, T_CVREF_REFCOUNT_FIXED, T_SYSRET, T_UV,
T_IV, T_INT, T_ENUM, T_BOOL, T_U_INT, T_SHORT, T_U_SHORT, T_LONG, T_U_LONG,
T_CHAR, T_U_CHAR, T_FLOAT, T_NV, T_DOUBLE, T_PV, T_PTR, T_PTRREF, T_PTROBJ,
T_REF_IV_REF, T_REF_IV_PTR, T_PTRDESC, T_REFREF, T_REFOBJ, T_OPAQUEPTR,
T_OPAQUE, Implicit array, T_PACKED, T_PACKEDARRAY, T_DATAUNIT, T_CALLBACK,
T_ARRAY, T_STDIO, T_INOUT, T_IN, T_OUT

=back

=back

=head2 perlclib - Internal replacements for standard C library functions

=over 4

=item DESCRIPTION

=over 4

=item Conventions

C<t>, C<p>, C<n>, C<s>

=item File Operations

=item File Input and Output

=item File Positioning

=item Memory Management and String Handling

=item Character Class Tests

=item F<stdlib.h> functions

=item Miscellaneous functions

=back

=item SEE ALSO

=back

=head2 perlguts - Introduction to the Perl API

=over 4

=item DESCRIPTION

=item Variables

=over 4

=item Datatypes

=item What is an "IV"?

=item Working with SVs

=item Offsets

=item What's Really Stored in an SV?

=item Working with AVs

=item Working with HVs

=item Hash API Extensions

=item AVs, HVs and undefined values

=item References

=item Blessed References and Class Objects

=item Creating New Variables

GV_ADDMULTI, GV_ADDWARN

=item Reference Counts and Mortality

=item Stashes and Globs

=item Double-Typed SVs

=item Read-Only Values

=item Copy on Write

=item Magic Variables

=item Assigning Magic

=item Magic Virtual Tables

=item Finding Magic

=item Understanding the Magic of Tied Hashes and Arrays

=item Localizing changes

C<SAVEINT(int i)>, C<SAVEIV(IV i)>, C<SAVEI32(I32 i)>, C<SAVELONG(long i)>,
C<SAVESPTR(s)>, C<SAVEPPTR(p)>, C<SAVEFREESV(SV *sv)>, C<SAVEMORTALIZESV(SV
*sv)>, C<SAVEFREEOP(OP *op)>, C<SAVEFREEPV(p)>, C<SAVECLEARSV(SV *sv)>,
C<SAVEDELETE(HV *hv, char *key, I32 length)>,
C<SAVEDESTRUCTOR(DESTRUCTORFUNC_NOCONTEXT_t f, void *p)>,
C<SAVEDESTRUCTOR_X(DESTRUCTORFUNC_t f, void *p)>, C<SAVESTACK_POS()>, C<SV*
save_scalar(GV *gv)>, C<AV* save_ary(GV *gv)>, C<HV* save_hash(GV *gv)>,
C<void save_item(SV *item)>, C<void save_list(SV **sarg, I32 maxsarg)>,
C<SV* save_svref(SV **sptr)>, C<void save_aptr(AV **aptr)>, C<void
save_hptr(HV **hptr)>

=back

=item Subroutines

=over 4

=item XSUBs and the Argument Stack

=item Autoloading with XSUBs

=item Calling Perl Routines from within C Programs

=item Putting a C value on Perl stack

=item Scratchpads

=item Scratchpads and recursion

=back

=item Memory Allocation

=over 4

=item Allocation

=item Reallocation

=item Moving

=back

=item PerlIO

=item Compiled code

=over 4

=item Code tree

=item Examining the tree

=item Compile pass 1: check routines

=item Compile pass 1a: constant folding

=item Compile pass 2: context propagation

=item Compile pass 3: peephole optimization

=item Pluggable runops

=item Compile-time scope hooks

C<void bhk_start(pTHX_ int full)>, C<void bhk_pre_end(pTHX_ OP **o)>,
C<void bhk_post_end(pTHX_ OP **o)>, C<void bhk_eval(pTHX_ OP *const o)>

=back

=item Examining internal data structures with the C<dump> functions

=item How multiple interpreters and concurrency are supported

=over 4

=item Background and PERL_IMPLICIT_CONTEXT

=item So what happened to dTHR?

=item How do I use all this in extensions?

=item Should I do anything special if I call perl from multiple threads?

=item Future Plans and PERL_IMPLICIT_SYS

=back

=item Internal Functions

A, p, d, s, n, r, f, M, o, x, m, X, E, b, others

=over 4

=item Formatted Printing of IVs, UVs, and NVs

=item Formatted Printing of Size_t and SSize_t

=item Pointer-To-Integer and Integer-To-Pointer

=item Exception Handling

=item Source Documentation

=item Backwards compatibility

=back

=item Unicode Support

=over 4

=item What B<is> Unicode, anyway?

=item How can I recognise a UTF-8 string?

=item How does UTF-8 represent Unicode characters?

=item How does Perl store UTF-8 strings?

=item How do I convert a string to UTF-8?

=item How do I compare strings?

=item Is there anything else I need to know?

=back

=item Custom Operators

xop_name, xop_desc, xop_class, OA_BASEOP, OA_UNOP, OA_BINOP, OA_LOGOP,
OA_LISTOP, OA_PMOP, OA_SVOP, OA_PADOP, OA_PVOP_OR_SVOP, OA_LOOP, OA_COP,
xop_peep

=item Dynamic Scope and the Context Stack

=over 4

=item Introduction to the context stack

=item Pushing contexts

=item Popping contexts

=item Redoing contexts

=back

=item AUTHORS

=item SEE ALSO

=back

=head2 perlcall - Perl calling conventions from C

=over 4

=item DESCRIPTION

An Error Handler, An Event-Driven Program

=item THE CALL_ FUNCTIONS

call_sv, call_pv, call_method, call_argv

=item FLAG VALUES

=over 4

=item G_VOID

=item G_SCALAR

=item G_ARRAY

=item G_DISCARD

=item G_NOARGS

=item G_EVAL

=item G_KEEPERR

=item Determining the Context

=back

=item EXAMPLES

=over 4

=item No Parameters, Nothing Returned

=item Passing Parameters

=item Returning a Scalar

=item Returning a List of Values

=item Returning a List in Scalar Context

=item Returning Data from Perl via the Parameter List

=item Using G_EVAL

=item Using G_KEEPERR

=item Using call_sv

=item Using call_argv

=item Using call_method

=item Using GIMME_V

=item Using Perl to Dispose of Temporaries

=item Strategies for Storing Callback Context Information

1. Ignore the problem - Allow only 1 callback, 2. Create a sequence of
callbacks - hard wired limit, 3. Use a parameter to map to the Perl
callback

=item Alternate Stack Manipulation

=item Creating and Calling an Anonymous Subroutine in C

=back

=item LIGHTWEIGHT CALLBACKS

=item SEE ALSO

=item AUTHOR

=item DATE

=back

=head2 perlmroapi - Perl method resolution plugin interface

=over 4

=item DESCRIPTION

resolve, name, length, kflags, hash

=item Callbacks

=item Caching

=item Examples

=item AUTHORS

=back

=head2 perlreapi - Perl regular expression plugin interface

=over 4

=item DESCRIPTION

=item Callbacks

=over 4

=item comp

C</m> - RXf_PMf_MULTILINE, C</s> - RXf_PMf_SINGLELINE, C</i> -
RXf_PMf_FOLD, C</x> - RXf_PMf_EXTENDED, C</p> - RXf_PMf_KEEPCOPY, Character
set, RXf_SPLIT, RXf_SKIPWHITE, RXf_START_ONLY, RXf_WHITE, RXf_NULL,
RXf_NO_INPLACE_SUBST

=item exec

rx, sv, strbeg, strend, stringarg, minend, data, flags

=item intuit

=item checkstr

=item free

=item Numbered capture callbacks

=item Named capture callbacks

=item qr_package

=item dupe

=item op_comp

=back

=item The REGEXP structure

=over 4

=item C<engine>

=item C<mother_re>

=item C<extflags>

=item C<minlen> C<minlenret>

=item C<gofs>

=item C<substrs>

=item C<nparens>, C<lastparen>, and C<lastcloseparen>

=item C<intflags>

=item C<pprivate>

=item C<swap>

=item C<offs>

=item C<precomp> C<prelen>

=item C<paren_names>

=item C<substrs>

=item C<subbeg> C<sublen> C<saved_copy> C<suboffset> C<subcoffset>

=item C<wrapped> C<wraplen>

=item C<seen_evals>

=item C<refcnt>

=back

=item HISTORY

=item AUTHORS

=item LICENSE

=back

=head2 perlreguts - Description of the Perl regular expression engine.

=over 4

=item DESCRIPTION

=item OVERVIEW

=over 4

=item A quick note on terms

=item What is a regular expression engine?

=item Structure of a Regexp Program

C<regnode_1>, C<regnode_2>, C<regnode_string>, C<regnode_charclass>,
C<regnode_charclass_posixl>

=back

=item Process Overview

A. Compilation, 1. Parsing for size, 2. Parsing for construction, 3.
Peep-hole optimisation and analysis, B. Execution, 4. Start position and
no-match optimisations, 5. Program execution

=over 4

=item Compilation

anchored fixed strings, floating fixed strings, minimum and maximum length
requirements, start class, Beginning/End of line positions

=item Execution

=back

=item MISCELLANEOUS

=over 4

=item Unicode and Localisation Support

=item Base Structures

C<offsets>, C<regstclass>, C<data>, C<program>

=back

=item SEE ALSO

=item AUTHOR

=item LICENCE

=item REFERENCES

=back

=head2 perlapi - autogenerated documentation for the perl public API

=over 4

=item DESCRIPTION
X<Perl API> X<API> X<api>

=item Array Manipulation Functions

av_clear X<av_clear>, av_create_and_push X<av_create_and_push>,
av_create_and_unshift_one X<av_create_and_unshift_one>, av_delete
X<av_delete>, av_exists X<av_exists>, av_extend X<av_extend>, av_fetch
X<av_fetch>, AvFILL X<AvFILL>, av_fill X<av_fill>, av_len X<av_len>,
av_make X<av_make>, av_pop X<av_pop>, av_push X<av_push>, av_shift
X<av_shift>, av_store X<av_store>, av_tindex X<av_tindex>, av_top_index
X<av_top_index>, av_undef X<av_undef>, av_unshift X<av_unshift>, get_av
X<get_av>, newAV X<newAV>, sortsv X<sortsv>, sortsv_flags X<sortsv_flags>

=item Callback Functions

call_argv X<call_argv>, call_method X<call_method>, call_pv X<call_pv>,
call_sv X<call_sv>, ENTER X<ENTER>, ENTER_with_name(name)
X<ENTER_with_name(name)>, eval_pv X<eval_pv>, eval_sv X<eval_sv>, FREETMPS
X<FREETMPS>, LEAVE X<LEAVE>, LEAVE_with_name(name)
X<LEAVE_with_name(name)>, SAVETMPS X<SAVETMPS>

=item Character case changing

toFOLD X<toFOLD>, toFOLD_utf8 X<toFOLD_utf8>, toFOLD_utf8_safe
X<toFOLD_utf8_safe>, toFOLD_uvchr X<toFOLD_uvchr>, toLOWER X<toLOWER>,
toLOWER_L1 X<toLOWER_L1>, toLOWER_LC X<toLOWER_LC>, toLOWER_utf8
X<toLOWER_utf8>, toLOWER_utf8_safe X<toLOWER_utf8_safe>, toLOWER_uvchr
X<toLOWER_uvchr>, toTITLE X<toTITLE>, toTITLE_utf8 X<toTITLE_utf8>,
toTITLE_utf8_safe X<toTITLE_utf8_safe>, toTITLE_uvchr X<toTITLE_uvchr>,
toUPPER X<toUPPER>, toUPPER_utf8 X<toUPPER_utf8>, toUPPER_utf8_safe
X<toUPPER_utf8_safe>, toUPPER_uvchr X<toUPPER_uvchr>

=item Character classification

isALPHA X<isALPHA>, isALPHANUMERIC X<isALPHANUMERIC>, isASCII X<isASCII>,
isBLANK X<isBLANK>, isCNTRL X<isCNTRL>, isDIGIT X<isDIGIT>, isGRAPH
X<isGRAPH>, isIDCONT X<isIDCONT>, isIDFIRST X<isIDFIRST>, isLOWER
X<isLOWER>, isOCTAL X<isOCTAL>, isPRINT X<isPRINT>, isPSXSPC X<isPSXSPC>,
isPUNCT X<isPUNCT>, isSPACE X<isSPACE>, isUPPER X<isUPPER>, isWORDCHAR
X<isWORDCHAR>, isXDIGIT X<isXDIGIT>

=item Cloning an interpreter

perl_clone X<perl_clone>

=item Compile-time scope hooks

BhkDISABLE X<BhkDISABLE>, BhkENABLE X<BhkENABLE>, BhkENTRY_set
X<BhkENTRY_set>, blockhook_register X<blockhook_register>

=item COP Hint Hashes

cophh_2hv X<cophh_2hv>, cophh_copy X<cophh_copy>, cophh_delete_pv
X<cophh_delete_pv>, cophh_delete_pvn X<cophh_delete_pvn>, cophh_delete_pvs
X<cophh_delete_pvs>, cophh_delete_sv X<cophh_delete_sv>, cophh_fetch_pv
X<cophh_fetch_pv>, cophh_fetch_pvn X<cophh_fetch_pvn>, cophh_fetch_pvs
X<cophh_fetch_pvs>, cophh_fetch_sv X<cophh_fetch_sv>, cophh_free
X<cophh_free>, cophh_new_empty X<cophh_new_empty>, cophh_store_pv
X<cophh_store_pv>, cophh_store_pvn X<cophh_store_pvn>, cophh_store_pvs
X<cophh_store_pvs>, cophh_store_sv X<cophh_store_sv>

=item COP Hint Reading

cop_hints_2hv X<cop_hints_2hv>, cop_hints_fetch_pv X<cop_hints_fetch_pv>,
cop_hints_fetch_pvn X<cop_hints_fetch_pvn>, cop_hints_fetch_pvs
X<cop_hints_fetch_pvs>, cop_hints_fetch_sv X<cop_hints_fetch_sv>

=item Custom Operators

custom_op_register X<custom_op_register>, custom_op_xop X<custom_op_xop>,
XopDISABLE X<XopDISABLE>, XopENABLE X<XopENABLE>, XopENTRY X<XopENTRY>,
XopENTRYCUSTOM X<XopENTRYCUSTOM>, XopENTRY_set X<XopENTRY_set>, XopFLAGS
X<XopFLAGS>

=item CV Manipulation Functions

caller_cx X<caller_cx>, CvSTASH X<CvSTASH>, find_runcv X<find_runcv>,
get_cv X<get_cv>, get_cvn_flags X<get_cvn_flags>

=item C<xsubpp> variables and internal functions

ax X<ax>, CLASS X<CLASS>, dAX X<dAX>, dAXMARK X<dAXMARK>, dITEMS X<dITEMS>,
dUNDERBAR X<dUNDERBAR>, dXSARGS X<dXSARGS>, dXSI32 X<dXSI32>, items
X<items>, ix X<ix>, RETVAL X<RETVAL>, ST X<ST>, THIS X<THIS>, UNDERBAR
X<UNDERBAR>, XS X<XS>, XS_EXTERNAL X<XS_EXTERNAL>, XS_INTERNAL
X<XS_INTERNAL>

=item Debugging Utilities

dump_all X<dump_all>, dump_packsubs X<dump_packsubs>, op_class X<op_class>,
op_dump X<op_dump>, sv_dump X<sv_dump>

=item Display and Dump functions

pv_display X<pv_display>, pv_escape X<pv_escape>, pv_pretty X<pv_pretty>

=item Embedding Functions

cv_clone X<cv_clone>, cv_name X<cv_name>, cv_undef X<cv_undef>,
find_rundefsv X<find_rundefsv>, find_rundefsvoffset X<find_rundefsvoffset>,
intro_my X<intro_my>, load_module X<load_module>, newPADNAMELIST
X<newPADNAMELIST>, newPADNAMEouter X<newPADNAMEouter>, newPADNAMEpvn
X<newPADNAMEpvn>, nothreadhook X<nothreadhook>, pad_add_anon
X<pad_add_anon>, pad_add_name_pv X<pad_add_name_pv>, pad_add_name_pvn
X<pad_add_name_pvn>, pad_add_name_sv X<pad_add_name_sv>, pad_alloc
X<pad_alloc>, pad_findmy_pv X<pad_findmy_pv>, pad_findmy_pvn
X<pad_findmy_pvn>, pad_findmy_sv X<pad_findmy_sv>, padnamelist_fetch
X<padnamelist_fetch>, padnamelist_store X<padnamelist_store>, pad_setsv
X<pad_setsv>, pad_sv X<pad_sv>, pad_tidy X<pad_tidy>, perl_alloc
X<perl_alloc>, perl_construct X<perl_construct>, perl_destruct
X<perl_destruct>, perl_free X<perl_free>, perl_parse X<perl_parse>,
perl_run X<perl_run>, require_pv X<require_pv>

=item Exception Handling (simple) Macros

dXCPT X<dXCPT>, XCPT_CATCH X<XCPT_CATCH>, XCPT_RETHROW X<XCPT_RETHROW>,
XCPT_TRY_END X<XCPT_TRY_END>, XCPT_TRY_START X<XCPT_TRY_START>

=item Functions in file scope.c

save_gp X<save_gp>

=item Functions in file vutil.c

new_version X<new_version>, prescan_version X<prescan_version>,
scan_version X<scan_version>, upg_version X<upg_version>, vcmp X<vcmp>,
vnormal X<vnormal>, vnumify X<vnumify>, vstringify X<vstringify>, vverify
X<vverify>, The SV is an HV or a reference to an HV, The hash contains a
"version" key, The "version" key has a reference to an AV as its value

=item "Gimme" Values

G_ARRAY X<G_ARRAY>, G_DISCARD X<G_DISCARD>, G_EVAL X<G_EVAL>, GIMME
X<GIMME>, GIMME_V X<GIMME_V>, G_NOARGS X<G_NOARGS>, G_SCALAR X<G_SCALAR>,
G_VOID X<G_VOID>

=item Global Variables

PL_check X<PL_check>, PL_keyword_plugin X<PL_keyword_plugin>

=item GV Functions

GvAV X<GvAV>, gv_const_sv X<gv_const_sv>, GvCV X<GvCV>, gv_fetchmeth
X<gv_fetchmeth>, gv_fetchmethod_autoload X<gv_fetchmethod_autoload>,
gv_fetchmeth_autoload X<gv_fetchmeth_autoload>, gv_fetchmeth_pv
X<gv_fetchmeth_pv>, gv_fetchmeth_pvn X<gv_fetchmeth_pvn>,
gv_fetchmeth_pvn_autoload X<gv_fetchmeth_pvn_autoload>,
gv_fetchmeth_pv_autoload X<gv_fetchmeth_pv_autoload>, gv_fetchmeth_sv
X<gv_fetchmeth_sv>, gv_fetchmeth_sv_autoload X<gv_fetchmeth_sv_autoload>,
GvHV X<GvHV>, gv_init X<gv_init>, gv_init_pv X<gv_init_pv>, gv_init_pvn
X<gv_init_pvn>, gv_init_sv X<gv_init_sv>, gv_stashpv X<gv_stashpv>,
gv_stashpvn X<gv_stashpvn>, gv_stashpvs X<gv_stashpvs>, gv_stashsv
X<gv_stashsv>, GvSV X<GvSV>, setdefout X<setdefout>

=item Handy Values

Nullav X<Nullav>, Nullch X<Nullch>, Nullcv X<Nullcv>, Nullhv X<Nullhv>,
Nullsv X<Nullsv>

=item Hash Manipulation Functions

cop_fetch_label X<cop_fetch_label>, cop_store_label X<cop_store_label>,
get_hv X<get_hv>, HEf_SVKEY X<HEf_SVKEY>, HeHASH X<HeHASH>, HeKEY X<HeKEY>,
HeKLEN X<HeKLEN>, HePV X<HePV>, HeSVKEY X<HeSVKEY>, HeSVKEY_force
X<HeSVKEY_force>, HeSVKEY_set X<HeSVKEY_set>, HeUTF8 X<HeUTF8>, HeVAL
X<HeVAL>, hv_assert X<hv_assert>, hv_bucket_ratio X<hv_bucket_ratio>,
hv_clear X<hv_clear>, hv_clear_placeholders X<hv_clear_placeholders>,
hv_copy_hints_hv X<hv_copy_hints_hv>, hv_delete X<hv_delete>, hv_delete_ent
X<hv_delete_ent>, HvENAME X<HvENAME>, HvENAMELEN X<HvENAMELEN>, HvENAMEUTF8
X<HvENAMEUTF8>, hv_exists X<hv_exists>, hv_exists_ent X<hv_exists_ent>,
hv_fetch X<hv_fetch>, hv_fetchs X<hv_fetchs>, hv_fetch_ent X<hv_fetch_ent>,
hv_fill X<hv_fill>, hv_iterinit X<hv_iterinit>, hv_iterkey X<hv_iterkey>,
hv_iterkeysv X<hv_iterkeysv>, hv_iternext X<hv_iternext>, hv_iternextsv
X<hv_iternextsv>, hv_iternext_flags X<hv_iternext_flags>, hv_iterval
X<hv_iterval>, hv_magic X<hv_magic>, HvNAME X<HvNAME>, HvNAMELEN
X<HvNAMELEN>, HvNAMEUTF8 X<HvNAMEUTF8>, hv_scalar X<hv_scalar>, hv_store
X<hv_store>, hv_stores X<hv_stores>, hv_store_ent X<hv_store_ent>, hv_undef
X<hv_undef>, newHV X<newHV>

=item Hook manipulation

wrap_op_checker X<wrap_op_checker>

=item Lexer interface

lex_bufutf8 X<lex_bufutf8>, lex_discard_to X<lex_discard_to>,
lex_grow_linestr X<lex_grow_linestr>, lex_next_chunk X<lex_next_chunk>,
lex_peek_unichar X<lex_peek_unichar>, lex_read_space X<lex_read_space>,
lex_read_to X<lex_read_to>, lex_read_unichar X<lex_read_unichar>, lex_start
X<lex_start>, lex_stuff_pv X<lex_stuff_pv>, lex_stuff_pvn X<lex_stuff_pvn>,
lex_stuff_pvs X<lex_stuff_pvs>, lex_stuff_sv X<lex_stuff_sv>, lex_unstuff
X<lex_unstuff>, parse_arithexpr X<parse_arithexpr>, parse_barestmt
X<parse_barestmt>, parse_block X<parse_block>, parse_fullexpr
X<parse_fullexpr>, parse_fullstmt X<parse_fullstmt>, parse_label
X<parse_label>, parse_listexpr X<parse_listexpr>, parse_stmtseq
X<parse_stmtseq>, parse_termexpr X<parse_termexpr>, PL_parser X<PL_parser>,
PL_parser-E<gt>bufend X<PL_parser-E<gt>bufend>, PL_parser-E<gt>bufptr
X<PL_parser-E<gt>bufptr>, PL_parser-E<gt>linestart
X<PL_parser-E<gt>linestart>, PL_parser-E<gt>linestr
X<PL_parser-E<gt>linestr>

=item Locale-related functions and macros

DECLARATION_FOR_LC_NUMERIC_MANIPULATION
X<DECLARATION_FOR_LC_NUMERIC_MANIPULATION>, RESTORE_LC_NUMERIC
X<RESTORE_LC_NUMERIC>, STORE_LC_NUMERIC_FORCE_TO_UNDERLYING
X<STORE_LC_NUMERIC_FORCE_TO_UNDERLYING>, STORE_LC_NUMERIC_SET_TO_NEEDED
X<STORE_LC_NUMERIC_SET_TO_NEEDED>, sync_locale X<sync_locale>

=item Magical Functions

mg_clear X<mg_clear>, mg_copy X<mg_copy>, mg_find X<mg_find>, mg_findext
X<mg_findext>, mg_free X<mg_free>, mg_free_type X<mg_free_type>, mg_get
X<mg_get>, mg_length X<mg_length>, mg_magical X<mg_magical>, mg_set
X<mg_set>, SvGETMAGIC X<SvGETMAGIC>, SvLOCK X<SvLOCK>, SvSETMAGIC
X<SvSETMAGIC>, SvSetMagicSV X<SvSetMagicSV>, SvSetMagicSV_nosteal
X<SvSetMagicSV_nosteal>, SvSetSV X<SvSetSV>, SvSetSV_nosteal
X<SvSetSV_nosteal>, SvSHARE X<SvSHARE>, SvUNLOCK X<SvUNLOCK>

=item Memory Management

Copy X<Copy>, CopyD X<CopyD>, Move X<Move>, MoveD X<MoveD>, Newx X<Newx>,
Newxc X<Newxc>, Newxz X<Newxz>, Poison X<Poison>, PoisonFree X<PoisonFree>,
PoisonNew X<PoisonNew>, PoisonWith X<PoisonWith>, Renew X<Renew>, Renewc
X<Renewc>, Safefree X<Safefree>, savepv X<savepv>, savepvn X<savepvn>,
savepvs X<savepvs>, savesharedpv X<savesharedpv>, savesharedpvn
X<savesharedpvn>, savesharedpvs X<savesharedpvs>, savesharedsvpv
X<savesharedsvpv>, savesvpv X<savesvpv>, StructCopy X<StructCopy>, Zero
X<Zero>, ZeroD X<ZeroD>

=item Miscellaneous Functions

dump_c_backtrace X<dump_c_backtrace>, fbm_compile X<fbm_compile>, fbm_instr
X<fbm_instr>, foldEQ X<foldEQ>, foldEQ_locale X<foldEQ_locale>, form
X<form>, getcwd_sv X<getcwd_sv>, get_c_backtrace_dump
X<get_c_backtrace_dump>, ibcmp X<ibcmp>, ibcmp_locale X<ibcmp_locale>,
is_safe_syscall X<is_safe_syscall>, memEQ X<memEQ>, memNE X<memNE>, mess
X<mess>, mess_sv X<mess_sv>, my_snprintf X<my_snprintf>, my_sprintf
X<my_sprintf>, my_strlcat X<my_strlcat>, my_strlcpy X<my_strlcpy>,
my_vsnprintf X<my_vsnprintf>, ninstr X<ninstr>, PERL_SYS_INIT
X<PERL_SYS_INIT>, PERL_SYS_INIT3 X<PERL_SYS_INIT3>, PERL_SYS_TERM
X<PERL_SYS_TERM>, quadmath_format_needed X<quadmath_format_needed>,
quadmath_format_single X<quadmath_format_single>, READ_XDIGIT
X<READ_XDIGIT>, rninstr X<rninstr>, strEQ X<strEQ>, strGE X<strGE>, strGT
X<strGT>, strLE X<strLE>, strLT X<strLT>, strNE X<strNE>, strnEQ X<strnEQ>,
strnNE X<strnNE>, sv_destroyable X<sv_destroyable>, sv_nosharing
X<sv_nosharing>, vmess X<vmess>

=item MRO Functions

mro_get_linear_isa X<mro_get_linear_isa>, mro_method_changed_in
X<mro_method_changed_in>, mro_register X<mro_register>

=item Multicall Functions

dMULTICALL X<dMULTICALL>, MULTICALL X<MULTICALL>, POP_MULTICALL
X<POP_MULTICALL>, PUSH_MULTICALL X<PUSH_MULTICALL>

=item Numeric functions

grok_bin X<grok_bin>, grok_hex X<grok_hex>, grok_infnan X<grok_infnan>,
grok_number X<grok_number>, grok_number_flags X<grok_number_flags>,
grok_numeric_radix X<grok_numeric_radix>, grok_oct X<grok_oct>, isinfnan
X<isinfnan>, Perl_signbit X<Perl_signbit>, scan_bin X<scan_bin>, scan_hex
X<scan_hex>, scan_oct X<scan_oct>

=item Obsolete backwards compatibility functions

custom_op_desc X<custom_op_desc>, custom_op_name X<custom_op_name>,
gv_fetchmethod X<gv_fetchmethod>, is_utf8_char X<is_utf8_char>,
is_utf8_char_buf X<is_utf8_char_buf>, pack_cat X<pack_cat>,
pad_compname_type X<pad_compname_type>, sv_2pvbyte_nolen
X<sv_2pvbyte_nolen>, sv_2pvutf8_nolen X<sv_2pvutf8_nolen>, sv_2pv_nolen
X<sv_2pv_nolen>, sv_catpvn_mg X<sv_catpvn_mg>, sv_catsv_mg X<sv_catsv_mg>,
sv_force_normal X<sv_force_normal>, sv_iv X<sv_iv>, sv_nolocking
X<sv_nolocking>, sv_nounlocking X<sv_nounlocking>, sv_nv X<sv_nv>, sv_pv
X<sv_pv>, sv_pvbyte X<sv_pvbyte>, sv_pvbyten X<sv_pvbyten>, sv_pvn
X<sv_pvn>, sv_pvutf8 X<sv_pvutf8>, sv_pvutf8n X<sv_pvutf8n>, sv_taint
X<sv_taint>, sv_unref X<sv_unref>, sv_usepvn X<sv_usepvn>, sv_usepvn_mg
X<sv_usepvn_mg>, sv_uv X<sv_uv>, unpack_str X<unpack_str>, utf8_to_uvchr
X<utf8_to_uvchr>, utf8_to_uvuni X<utf8_to_uvuni>

=item Optree construction

newASSIGNOP X<newASSIGNOP>, newBINOP X<newBINOP>, newCONDOP X<newCONDOP>,
newDEFSVOP X<newDEFSVOP>, newFOROP X<newFOROP>, newGIVENOP X<newGIVENOP>,
newGVOP X<newGVOP>, newLISTOP X<newLISTOP>, newLOGOP X<newLOGOP>, newLOOPEX
X<newLOOPEX>, newLOOPOP X<newLOOPOP>, newMETHOP X<newMETHOP>,
newMETHOP_named X<newMETHOP_named>, newNULLLIST X<newNULLLIST>, newOP
X<newOP>, newPADOP X<newPADOP>, newPMOP X<newPMOP>, newPVOP X<newPVOP>,
newRANGE X<newRANGE>, newSLICEOP X<newSLICEOP>, newSTATEOP X<newSTATEOP>,
newSVOP X<newSVOP>, newUNOP X<newUNOP>, newUNOP_AUX X<newUNOP_AUX>,
newWHENOP X<newWHENOP>, newWHILEOP X<newWHILEOP>

=item Optree Manipulation Functions

alloccopstash X<alloccopstash>, block_end X<block_end>, block_start
X<block_start>, ck_entersub_args_list X<ck_entersub_args_list>,
ck_entersub_args_proto X<ck_entersub_args_proto>,
ck_entersub_args_proto_or_list X<ck_entersub_args_proto_or_list>,
cv_const_sv X<cv_const_sv>, cv_get_call_checker X<cv_get_call_checker>,
cv_set_call_checker X<cv_set_call_checker>, cv_set_call_checker_flags
X<cv_set_call_checker_flags>, LINKLIST X<LINKLIST>, newCONSTSUB
X<newCONSTSUB>, newCONSTSUB_flags X<newCONSTSUB_flags>, newXS X<newXS>,
op_append_elem X<op_append_elem>, op_append_list X<op_append_list>,
OP_CLASS X<OP_CLASS>, op_contextualize X<op_contextualize>, op_convert_list
X<op_convert_list>, OP_DESC X<OP_DESC>, op_free X<op_free>, OpHAS_SIBLING
X<OpHAS_SIBLING>, OpLASTSIB_set X<OpLASTSIB_set>, op_linklist
X<op_linklist>, op_lvalue X<op_lvalue>, OpMAYBESIB_set X<OpMAYBESIB_set>,
OpMORESIB_set X<OpMORESIB_set>, OP_NAME X<OP_NAME>, op_null X<op_null>,
op_parent X<op_parent>, op_prepend_elem X<op_prepend_elem>, op_scope
X<op_scope>, OpSIBLING X<OpSIBLING>, op_sibling_splice
X<op_sibling_splice>, OP_TYPE_IS X<OP_TYPE_IS>, OP_TYPE_IS_OR_WAS
X<OP_TYPE_IS_OR_WAS>, rv2cv_op_cv X<rv2cv_op_cv>

=item Pack and Unpack

packlist X<packlist>, unpackstring X<unpackstring>

=item Pad Data Structures

CvPADLIST X<CvPADLIST>, pad_add_name_pvs X<pad_add_name_pvs>, PadARRAY
X<PadARRAY>, pad_findmy_pvs X<pad_findmy_pvs>, PadlistARRAY
X<PadlistARRAY>, PadlistMAX X<PadlistMAX>, PadlistNAMES X<PadlistNAMES>,
PadlistNAMESARRAY X<PadlistNAMESARRAY>, PadlistNAMESMAX X<PadlistNAMESMAX>,
PadlistREFCNT X<PadlistREFCNT>, PadMAX X<PadMAX>, PadnameLEN X<PadnameLEN>,
PadnamelistARRAY X<PadnamelistARRAY>, PadnamelistMAX X<PadnamelistMAX>,
PadnamelistREFCNT X<PadnamelistREFCNT>, PadnamelistREFCNT_dec
X<PadnamelistREFCNT_dec>, PadnamePV X<PadnamePV>, PadnameREFCNT
X<PadnameREFCNT>, PadnameREFCNT_dec X<PadnameREFCNT_dec>, PadnameSV
X<PadnameSV>, PadnameUTF8 X<PadnameUTF8>, pad_new X<pad_new>, PL_comppad
X<PL_comppad>, PL_comppad_name X<PL_comppad_name>, PL_curpad X<PL_curpad>

=item Per-Interpreter Variables

PL_modglobal X<PL_modglobal>, PL_na X<PL_na>, PL_opfreehook
X<PL_opfreehook>, PL_peepp X<PL_peepp>, PL_rpeepp X<PL_rpeepp>, PL_sv_no
X<PL_sv_no>, PL_sv_undef X<PL_sv_undef>, PL_sv_yes X<PL_sv_yes>

=item REGEXP Functions

SvRX X<SvRX>, SvRXOK X<SvRXOK>

=item Stack Manipulation Macros

dMARK X<dMARK>, dORIGMARK X<dORIGMARK>, dSP X<dSP>, EXTEND X<EXTEND>, MARK
X<MARK>, mPUSHi X<mPUSHi>, mPUSHn X<mPUSHn>, mPUSHp X<mPUSHp>, mPUSHs
X<mPUSHs>, mPUSHu X<mPUSHu>, mXPUSHi X<mXPUSHi>, mXPUSHn X<mXPUSHn>,
mXPUSHp X<mXPUSHp>, mXPUSHs X<mXPUSHs>, mXPUSHu X<mXPUSHu>, ORIGMARK
X<ORIGMARK>, POPi X<POPi>, POPl X<POPl>, POPn X<POPn>, POPp X<POPp>,
POPpbytex X<POPpbytex>, POPpx X<POPpx>, POPs X<POPs>, POPu X<POPu>, POPul
X<POPul>, PUSHi X<PUSHi>, PUSHMARK X<PUSHMARK>, PUSHmortal X<PUSHmortal>,
PUSHn X<PUSHn>, PUSHp X<PUSHp>, PUSHs X<PUSHs>, PUSHu X<PUSHu>, PUTBACK
X<PUTBACK>, SP X<SP>, SPAGAIN X<SPAGAIN>, XPUSHi X<XPUSHi>, XPUSHmortal
X<XPUSHmortal>, XPUSHn X<XPUSHn>, XPUSHp X<XPUSHp>, XPUSHs X<XPUSHs>,
XPUSHu X<XPUSHu>, XSRETURN X<XSRETURN>, XSRETURN_EMPTY X<XSRETURN_EMPTY>,
XSRETURN_IV X<XSRETURN_IV>, XSRETURN_NO X<XSRETURN_NO>, XSRETURN_NV
X<XSRETURN_NV>, XSRETURN_PV X<XSRETURN_PV>, XSRETURN_UNDEF
X<XSRETURN_UNDEF>, XSRETURN_UV X<XSRETURN_UV>, XSRETURN_YES
X<XSRETURN_YES>, XST_mIV X<XST_mIV>, XST_mNO X<XST_mNO>, XST_mNV
X<XST_mNV>, XST_mPV X<XST_mPV>, XST_mUNDEF X<XST_mUNDEF>, XST_mYES
X<XST_mYES>

=item SV-Body Allocation

looks_like_number X<looks_like_number>, newRV_noinc X<newRV_noinc>, newSV
X<newSV>, newSVhek X<newSVhek>, newSViv X<newSViv>, newSVnv X<newSVnv>,
newSVpv X<newSVpv>, newSVpvf X<newSVpvf>, newSVpvn X<newSVpvn>,
newSVpvn_flags X<newSVpvn_flags>, newSVpvn_share X<newSVpvn_share>,
newSVpvs X<newSVpvs>, newSVpvs_flags X<newSVpvs_flags>, newSVpv_share
X<newSVpv_share>, newSVpvs_share X<newSVpvs_share>, newSVrv X<newSVrv>,
newSVsv X<newSVsv>, newSV_type X<newSV_type>, newSVuv X<newSVuv>, sv_2bool
X<sv_2bool>, sv_2bool_flags X<sv_2bool_flags>, sv_2cv X<sv_2cv>, sv_2io
X<sv_2io>, sv_2iv_flags X<sv_2iv_flags>, sv_2mortal X<sv_2mortal>,
sv_2nv_flags X<sv_2nv_flags>, sv_2pvbyte X<sv_2pvbyte>, sv_2pvutf8
X<sv_2pvutf8>, sv_2pv_flags X<sv_2pv_flags>, sv_2uv_flags X<sv_2uv_flags>,
sv_backoff X<sv_backoff>, sv_bless X<sv_bless>, sv_catpv X<sv_catpv>,
sv_catpvf X<sv_catpvf>, sv_catpvf_mg X<sv_catpvf_mg>, sv_catpvn
X<sv_catpvn>, sv_catpvn_flags X<sv_catpvn_flags>, sv_catpvs X<sv_catpvs>,
sv_catpvs_flags X<sv_catpvs_flags>, sv_catpvs_mg X<sv_catpvs_mg>,
sv_catpvs_nomg X<sv_catpvs_nomg>, sv_catpv_flags X<sv_catpv_flags>,
sv_catpv_mg X<sv_catpv_mg>, sv_catsv X<sv_catsv>, sv_catsv_flags
X<sv_catsv_flags>, sv_chop X<sv_chop>, sv_clear X<sv_clear>, sv_cmp
X<sv_cmp>, sv_cmp_flags X<sv_cmp_flags>, sv_cmp_locale X<sv_cmp_locale>,
sv_cmp_locale_flags X<sv_cmp_locale_flags>, sv_collxfrm X<sv_collxfrm>,
sv_collxfrm_flags X<sv_collxfrm_flags>, sv_copypv X<sv_copypv>,
sv_copypv_flags X<sv_copypv_flags>, sv_copypv_nomg X<sv_copypv_nomg>,
sv_dec X<sv_dec>, sv_dec_nomg X<sv_dec_nomg>, sv_eq X<sv_eq>, sv_eq_flags
X<sv_eq_flags>, sv_force_normal_flags X<sv_force_normal_flags>, sv_free
X<sv_free>, sv_gets X<sv_gets>, sv_get_backrefs X<sv_get_backrefs>, sv_grow
X<sv_grow>, sv_inc X<sv_inc>, sv_inc_nomg X<sv_inc_nomg>, sv_insert
X<sv_insert>, sv_insert_flags X<sv_insert_flags>, sv_isa X<sv_isa>,
sv_isobject X<sv_isobject>, sv_len X<sv_len>, sv_len_utf8 X<sv_len_utf8>,
sv_magic X<sv_magic>, sv_magicext X<sv_magicext>, sv_mortalcopy
X<sv_mortalcopy>, sv_newmortal X<sv_newmortal>, sv_newref X<sv_newref>,
sv_pos_b2u X<sv_pos_b2u>, sv_pos_b2u_flags X<sv_pos_b2u_flags>, sv_pos_u2b
X<sv_pos_u2b>, sv_pos_u2b_flags X<sv_pos_u2b_flags>, sv_pvbyten_force
X<sv_pvbyten_force>, sv_pvn_force X<sv_pvn_force>, sv_pvn_force_flags
X<sv_pvn_force_flags>, sv_pvutf8n_force X<sv_pvutf8n_force>, sv_ref
X<sv_ref>, sv_reftype X<sv_reftype>, sv_replace X<sv_replace>, sv_reset
X<sv_reset>, sv_rvweaken X<sv_rvweaken>, sv_setiv X<sv_setiv>, sv_setiv_mg
X<sv_setiv_mg>, sv_setnv X<sv_setnv>, sv_setnv_mg X<sv_setnv_mg>, sv_setpv
X<sv_setpv>, sv_setpvf X<sv_setpvf>, sv_setpvf_mg X<sv_setpvf_mg>,
sv_setpviv X<sv_setpviv>, sv_setpviv_mg X<sv_setpviv_mg>, sv_setpvn
X<sv_setpvn>, sv_setpvn_mg X<sv_setpvn_mg>, sv_setpvs X<sv_setpvs>,
sv_setpvs_mg X<sv_setpvs_mg>, sv_setpv_bufsize X<sv_setpv_bufsize>,
sv_setpv_mg X<sv_setpv_mg>, sv_setref_iv X<sv_setref_iv>, sv_setref_nv
X<sv_setref_nv>, sv_setref_pv X<sv_setref_pv>, sv_setref_pvn
X<sv_setref_pvn>, sv_setref_pvs X<sv_setref_pvs>, sv_setref_uv
X<sv_setref_uv>, sv_setsv X<sv_setsv>, sv_setsv_flags X<sv_setsv_flags>,
sv_setsv_mg X<sv_setsv_mg>, sv_setuv X<sv_setuv>, sv_setuv_mg
X<sv_setuv_mg>, sv_set_undef X<sv_set_undef>, sv_tainted X<sv_tainted>,
sv_true X<sv_true>, sv_unmagic X<sv_unmagic>, sv_unmagicext
X<sv_unmagicext>, sv_unref_flags X<sv_unref_flags>, sv_untaint
X<sv_untaint>, sv_upgrade X<sv_upgrade>, sv_usepvn_flags
X<sv_usepvn_flags>, sv_utf8_decode X<sv_utf8_decode>, sv_utf8_downgrade
X<sv_utf8_downgrade>, sv_utf8_encode X<sv_utf8_encode>, sv_utf8_upgrade
X<sv_utf8_upgrade>, sv_utf8_upgrade_flags X<sv_utf8_upgrade_flags>,
sv_utf8_upgrade_flags_grow X<sv_utf8_upgrade_flags_grow>,
sv_utf8_upgrade_nomg X<sv_utf8_upgrade_nomg>, sv_vcatpvf X<sv_vcatpvf>,
sv_vcatpvfn X<sv_vcatpvfn>, sv_vcatpvfn_flags X<sv_vcatpvfn_flags>,
sv_vcatpvf_mg X<sv_vcatpvf_mg>, sv_vsetpvf X<sv_vsetpvf>, sv_vsetpvfn
X<sv_vsetpvfn>, sv_vsetpvf_mg X<sv_vsetpvf_mg>

=item SV Flags

SVt_INVLIST X<SVt_INVLIST>, SVt_IV X<SVt_IV>, SVt_NULL X<SVt_NULL>, SVt_NV
X<SVt_NV>, SVt_PV X<SVt_PV>, SVt_PVAV X<SVt_PVAV>, SVt_PVCV X<SVt_PVCV>,
SVt_PVFM X<SVt_PVFM>, SVt_PVGV X<SVt_PVGV>, SVt_PVHV X<SVt_PVHV>, SVt_PVIO
X<SVt_PVIO>, SVt_PVIV X<SVt_PVIV>, SVt_PVLV X<SVt_PVLV>, SVt_PVMG
X<SVt_PVMG>, SVt_PVNV X<SVt_PVNV>, SVt_REGEXP X<SVt_REGEXP>, svtype
X<svtype>

=item SV Manipulation Functions

boolSV X<boolSV>, croak_xs_usage X<croak_xs_usage>, get_sv X<get_sv>,
newRV_inc X<newRV_inc>, newSVpadname X<newSVpadname>, newSVpvn_utf8
X<newSVpvn_utf8>, sv_catpvn_nomg X<sv_catpvn_nomg>, sv_catpv_nomg
X<sv_catpv_nomg>, sv_catsv_nomg X<sv_catsv_nomg>, SvCUR X<SvCUR>, SvCUR_set
X<SvCUR_set>, sv_derived_from X<sv_derived_from>, sv_derived_from_pv
X<sv_derived_from_pv>, sv_derived_from_pvn X<sv_derived_from_pvn>,
sv_derived_from_sv X<sv_derived_from_sv>, sv_does X<sv_does>, sv_does_pv
X<sv_does_pv>, sv_does_pvn X<sv_does_pvn>, sv_does_sv X<sv_does_sv>, SvEND
X<SvEND>, SvGAMAGIC X<SvGAMAGIC>, SvGROW X<SvGROW>, SvIOK X<SvIOK>,
SvIOK_notUV X<SvIOK_notUV>, SvIOK_off X<SvIOK_off>, SvIOK_on X<SvIOK_on>,
SvIOK_only X<SvIOK_only>, SvIOK_only_UV X<SvIOK_only_UV>, SvIOKp X<SvIOKp>,
SvIOK_UV X<SvIOK_UV>, SvIsCOW X<SvIsCOW>, SvIsCOW_shared_hash
X<SvIsCOW_shared_hash>, SvIV X<SvIV>, SvIV_nomg X<SvIV_nomg>, SvIV_set
X<SvIV_set>, SvIVX X<SvIVX>, SvIVx X<SvIVx>, SvLEN X<SvLEN>, SvLEN_set
X<SvLEN_set>, SvMAGIC_set X<SvMAGIC_set>, SvNIOK X<SvNIOK>, SvNIOK_off
X<SvNIOK_off>, SvNIOKp X<SvNIOKp>, SvNOK X<SvNOK>, SvNOK_off X<SvNOK_off>,
SvNOK_on X<SvNOK_on>, SvNOK_only X<SvNOK_only>, SvNOKp X<SvNOKp>, SvNV
X<SvNV>, SvNV_nomg X<SvNV_nomg>, SvNV_set X<SvNV_set>, SvNVX X<SvNVX>,
SvNVx X<SvNVx>, SvOK X<SvOK>, SvOOK X<SvOOK>, SvOOK_offset X<SvOOK_offset>,
SvPOK X<SvPOK>, SvPOK_off X<SvPOK_off>, SvPOK_on X<SvPOK_on>, SvPOK_only
X<SvPOK_only>, SvPOK_only_UTF8 X<SvPOK_only_UTF8>, SvPOKp X<SvPOKp>, SvPV
X<SvPV>, SvPVbyte X<SvPVbyte>, SvPVbyte_force X<SvPVbyte_force>,
SvPVbyte_nolen X<SvPVbyte_nolen>, SvPVbytex X<SvPVbytex>, SvPVbytex_force
X<SvPVbytex_force>, SvPVCLEAR X<SvPVCLEAR>, SvPV_force X<SvPV_force>,
SvPV_force_nomg X<SvPV_force_nomg>, SvPV_nolen X<SvPV_nolen>, SvPV_nomg
X<SvPV_nomg>, SvPV_nomg_nolen X<SvPV_nomg_nolen>, SvPV_set X<SvPV_set>,
SvPVutf8 X<SvPVutf8>, SvPVutf8x X<SvPVutf8x>, SvPVutf8x_force
X<SvPVutf8x_force>, SvPVutf8_force X<SvPVutf8_force>, SvPVutf8_nolen
X<SvPVutf8_nolen>, SvPVX X<SvPVX>, SvPVx X<SvPVx>, SvREADONLY
X<SvREADONLY>, SvREADONLY_off X<SvREADONLY_off>, SvREADONLY_on
X<SvREADONLY_on>, SvREFCNT X<SvREFCNT>, SvREFCNT_dec X<SvREFCNT_dec>,
SvREFCNT_dec_NN X<SvREFCNT_dec_NN>, SvREFCNT_inc X<SvREFCNT_inc>,
SvREFCNT_inc_NN X<SvREFCNT_inc_NN>, SvREFCNT_inc_simple
X<SvREFCNT_inc_simple>, SvREFCNT_inc_simple_NN X<SvREFCNT_inc_simple_NN>,
SvREFCNT_inc_simple_void X<SvREFCNT_inc_simple_void>,
SvREFCNT_inc_simple_void_NN X<SvREFCNT_inc_simple_void_NN>,
SvREFCNT_inc_void X<SvREFCNT_inc_void>, SvREFCNT_inc_void_NN
X<SvREFCNT_inc_void_NN>, sv_report_used X<sv_report_used>, SvROK X<SvROK>,
SvROK_off X<SvROK_off>, SvROK_on X<SvROK_on>, SvRV X<SvRV>, SvRV_set
X<SvRV_set>, sv_setsv_nomg X<sv_setsv_nomg>, SvSTASH X<SvSTASH>,
SvSTASH_set X<SvSTASH_set>, SvTAINT X<SvTAINT>, SvTAINTED X<SvTAINTED>,
SvTAINTED_off X<SvTAINTED_off>, SvTAINTED_on X<SvTAINTED_on>, SvTRUE
X<SvTRUE>, SvTRUE_nomg X<SvTRUE_nomg>, SvTYPE X<SvTYPE>, SvUOK X<SvUOK>,
SvUPGRADE X<SvUPGRADE>, SvUTF8 X<SvUTF8>, sv_utf8_upgrade_nomg
X<sv_utf8_upgrade_nomg>, SvUTF8_off X<SvUTF8_off>, SvUTF8_on X<SvUTF8_on>,
SvUV X<SvUV>, SvUV_nomg X<SvUV_nomg>, SvUV_set X<SvUV_set>, SvUVX X<SvUVX>,
SvUVx X<SvUVx>, SvVOK X<SvVOK>

=item Unicode Support

BOM_UTF8 X<BOM_UTF8>, bytes_cmp_utf8 X<bytes_cmp_utf8>, bytes_from_utf8
X<bytes_from_utf8>, bytes_to_utf8 X<bytes_to_utf8>, DO_UTF8 X<DO_UTF8>,
foldEQ_utf8 X<foldEQ_utf8>, is_ascii_string X<is_ascii_string>,
is_c9strict_utf8_string X<is_c9strict_utf8_string>,
is_c9strict_utf8_string_loc X<is_c9strict_utf8_string_loc>,
is_c9strict_utf8_string_loclen X<is_c9strict_utf8_string_loclen>,
isC9_STRICT_UTF8_CHAR X<isC9_STRICT_UTF8_CHAR>, is_invariant_string
X<is_invariant_string>, isSTRICT_UTF8_CHAR X<isSTRICT_UTF8_CHAR>,
is_strict_utf8_string X<is_strict_utf8_string>, is_strict_utf8_string_loc
X<is_strict_utf8_string_loc>, is_strict_utf8_string_loclen
X<is_strict_utf8_string_loclen>, is_utf8_fixed_width_buf_flags
X<is_utf8_fixed_width_buf_flags>, is_utf8_fixed_width_buf_loclen_flags
X<is_utf8_fixed_width_buf_loclen_flags>, is_utf8_fixed_width_buf_loc_flags
X<is_utf8_fixed_width_buf_loc_flags>, is_utf8_invariant_string
X<is_utf8_invariant_string>, is_utf8_string X<is_utf8_string>,
is_utf8_string_flags X<is_utf8_string_flags>, is_utf8_string_loc
X<is_utf8_string_loc>, is_utf8_string_loclen X<is_utf8_string_loclen>,
is_utf8_string_loclen_flags X<is_utf8_string_loclen_flags>,
is_utf8_string_loc_flags X<is_utf8_string_loc_flags>,
is_utf8_valid_partial_char X<is_utf8_valid_partial_char>,
is_utf8_valid_partial_char_flags X<is_utf8_valid_partial_char_flags>,
isUTF8_CHAR X<isUTF8_CHAR>, isUTF8_CHAR_flags X<isUTF8_CHAR_flags>,
pv_uni_display X<pv_uni_display>, REPLACEMENT_CHARACTER_UTF8
X<REPLACEMENT_CHARACTER_UTF8>, sv_cat_decode X<sv_cat_decode>,
sv_recode_to_utf8 X<sv_recode_to_utf8>, sv_uni_display X<sv_uni_display>,
to_utf8_case X<to_utf8_case>, to_utf8_fold X<to_utf8_fold>, to_utf8_lower
X<to_utf8_lower>, to_utf8_title X<to_utf8_title>, to_utf8_upper
X<to_utf8_upper>, utf8n_to_uvchr X<utf8n_to_uvchr>, utf8n_to_uvchr_error
X<utf8n_to_uvchr_error>, C<UTF8_GOT_ABOVE_31_BIT>,
C<UTF8_GOT_CONTINUATION>, C<UTF8_GOT_EMPTY>, C<UTF8_GOT_LONG>,
C<UTF8_GOT_NONCHAR>, C<UTF8_GOT_NON_CONTINUATION>, C<UTF8_GOT_OVERFLOW>,
C<UTF8_GOT_SHORT>, C<UTF8_GOT_SUPER>, C<UTF8_GOT_SURROGATE>, utf8n_to_uvuni
X<utf8n_to_uvuni>, UTF8SKIP X<UTF8SKIP>, utf8_distance X<utf8_distance>,
utf8_hop X<utf8_hop>, utf8_hop_back X<utf8_hop_back>, utf8_hop_forward
X<utf8_hop_forward>, utf8_hop_safe X<utf8_hop_safe>, UTF8_IS_INVARIANT
X<UTF8_IS_INVARIANT>, UTF8_IS_NONCHAR X<UTF8_IS_NONCHAR>, UTF8_IS_SUPER
X<UTF8_IS_SUPER>, UTF8_IS_SURROGATE X<UTF8_IS_SURROGATE>, utf8_length
X<utf8_length>, utf8_to_bytes X<utf8_to_bytes>, utf8_to_uvchr_buf
X<utf8_to_uvchr_buf>, utf8_to_uvuni_buf X<utf8_to_uvuni_buf>,
UVCHR_IS_INVARIANT X<UVCHR_IS_INVARIANT>, UVCHR_SKIP X<UVCHR_SKIP>,
uvchr_to_utf8 X<uvchr_to_utf8>, uvchr_to_utf8_flags X<uvchr_to_utf8_flags>,
uvoffuni_to_utf8_flags X<uvoffuni_to_utf8_flags>, uvuni_to_utf8_flags
X<uvuni_to_utf8_flags>, valid_utf8_to_uvchr X<valid_utf8_to_uvchr>

=item Variables created by C<xsubpp> and C<xsubpp> internal functions

newXSproto X<newXSproto>, XS_APIVERSION_BOOTCHECK
X<XS_APIVERSION_BOOTCHECK>, XS_VERSION X<XS_VERSION>, XS_VERSION_BOOTCHECK
X<XS_VERSION_BOOTCHECK>

=item Warning and Dying

ckWARN X<ckWARN>, ckWARN2 X<ckWARN2>, ckWARN3 X<ckWARN3>, ckWARN4
X<ckWARN4>, ckWARN_d X<ckWARN_d>, ckWARN2_d X<ckWARN2_d>, ckWARN3_d
X<ckWARN3_d>, ckWARN4_d X<ckWARN4_d>, croak X<croak>, croak_no_modify
X<croak_no_modify>, croak_sv X<croak_sv>, die X<die>, die_sv X<die_sv>,
vcroak X<vcroak>, vwarn X<vwarn>, warn X<warn>, warn_sv X<warn_sv>

=item Undocumented functions

GetVars X<GetVars>, Gv_AMupdate X<Gv_AMupdate>, PerlIO_clearerr
X<PerlIO_clearerr>, PerlIO_close X<PerlIO_close>, PerlIO_context_layers
X<PerlIO_context_layers>, PerlIO_eof X<PerlIO_eof>, PerlIO_error
X<PerlIO_error>, PerlIO_fileno X<PerlIO_fileno>, PerlIO_fill
X<PerlIO_fill>, PerlIO_flush X<PerlIO_flush>, PerlIO_get_base
X<PerlIO_get_base>, PerlIO_get_bufsiz X<PerlIO_get_bufsiz>, PerlIO_get_cnt
X<PerlIO_get_cnt>, PerlIO_get_ptr X<PerlIO_get_ptr>, PerlIO_read
X<PerlIO_read>, PerlIO_seek X<PerlIO_seek>, PerlIO_set_cnt
X<PerlIO_set_cnt>, PerlIO_set_ptrcnt X<PerlIO_set_ptrcnt>,
PerlIO_setlinebuf X<PerlIO_setlinebuf>, PerlIO_stderr X<PerlIO_stderr>,
PerlIO_stdin X<PerlIO_stdin>, PerlIO_stdout X<PerlIO_stdout>, PerlIO_tell
X<PerlIO_tell>, PerlIO_unread X<PerlIO_unread>, PerlIO_write
X<PerlIO_write>, amagic_call X<amagic_call>, amagic_deref_call
X<amagic_deref_call>, any_dup X<any_dup>, atfork_lock X<atfork_lock>,
atfork_unlock X<atfork_unlock>, av_arylen_p X<av_arylen_p>, av_iter_p
X<av_iter_p>, block_gimme X<block_gimme>, call_atexit X<call_atexit>,
call_list X<call_list>, calloc X<calloc>, cast_i32 X<cast_i32>, cast_iv
X<cast_iv>, cast_ulong X<cast_ulong>, cast_uv X<cast_uv>, ck_warner
X<ck_warner>, ck_warner_d X<ck_warner_d>, ckwarn X<ckwarn>, ckwarn_d
X<ckwarn_d>, clear_defarray X<clear_defarray>, clone_params_del
X<clone_params_del>, clone_params_new X<clone_params_new>,
croak_memory_wrap X<croak_memory_wrap>, croak_nocontext X<croak_nocontext>,
csighandler X<csighandler>, cx_dump X<cx_dump>, cx_dup X<cx_dup>, cxinc
X<cxinc>, deb X<deb>, deb_nocontext X<deb_nocontext>, debop X<debop>,
debprofdump X<debprofdump>, debstack X<debstack>, debstackptrs
X<debstackptrs>, delimcpy X<delimcpy>, despatch_signals
X<despatch_signals>, die_nocontext X<die_nocontext>, dirp_dup X<dirp_dup>,
do_aspawn X<do_aspawn>, do_binmode X<do_binmode>, do_close X<do_close>,
do_gv_dump X<do_gv_dump>, do_gvgv_dump X<do_gvgv_dump>, do_hv_dump
X<do_hv_dump>, do_join X<do_join>, do_magic_dump X<do_magic_dump>,
do_op_dump X<do_op_dump>, do_open X<do_open>, do_open9 X<do_open9>,
do_openn X<do_openn>, do_pmop_dump X<do_pmop_dump>, do_spawn X<do_spawn>,
do_spawn_nowait X<do_spawn_nowait>, do_sprintf X<do_sprintf>, do_sv_dump
X<do_sv_dump>, doing_taint X<doing_taint>, doref X<doref>, dounwind
X<dounwind>, dowantarray X<dowantarray>, dump_eval X<dump_eval>, dump_form
X<dump_form>, dump_indent X<dump_indent>, dump_mstats X<dump_mstats>,
dump_sub X<dump_sub>, dump_vindent X<dump_vindent>, filter_add
X<filter_add>, filter_del X<filter_del>, filter_read X<filter_read>,
foldEQ_latin1 X<foldEQ_latin1>, form_nocontext X<form_nocontext>, fp_dup
X<fp_dup>, fprintf_nocontext X<fprintf_nocontext>, free_global_struct
X<free_global_struct>, free_tmps X<free_tmps>, get_context X<get_context>,
get_mstats X<get_mstats>, get_op_descs X<get_op_descs>, get_op_names
X<get_op_names>, get_ppaddr X<get_ppaddr>, get_vtbl X<get_vtbl>, gp_dup
X<gp_dup>, gp_free X<gp_free>, gp_ref X<gp_ref>, gv_AVadd X<gv_AVadd>,
gv_HVadd X<gv_HVadd>, gv_IOadd X<gv_IOadd>, gv_SVadd X<gv_SVadd>,
gv_add_by_type X<gv_add_by_type>, gv_autoload4 X<gv_autoload4>,
gv_autoload_pv X<gv_autoload_pv>, gv_autoload_pvn X<gv_autoload_pvn>,
gv_autoload_sv X<gv_autoload_sv>, gv_check X<gv_check>, gv_dump X<gv_dump>,
gv_efullname X<gv_efullname>, gv_efullname3 X<gv_efullname3>, gv_efullname4
X<gv_efullname4>, gv_fetchfile X<gv_fetchfile>, gv_fetchfile_flags
X<gv_fetchfile_flags>, gv_fetchpv X<gv_fetchpv>, gv_fetchpvn_flags
X<gv_fetchpvn_flags>, gv_fetchsv X<gv_fetchsv>, gv_fullname X<gv_fullname>,
gv_fullname3 X<gv_fullname3>, gv_fullname4 X<gv_fullname4>, gv_handler
X<gv_handler>, gv_name_set X<gv_name_set>, he_dup X<he_dup>, hek_dup
X<hek_dup>, hv_common X<hv_common>, hv_common_key_len X<hv_common_key_len>,
hv_delayfree_ent X<hv_delayfree_ent>, hv_eiter_p X<hv_eiter_p>,
hv_eiter_set X<hv_eiter_set>, hv_free_ent X<hv_free_ent>, hv_ksplit
X<hv_ksplit>, hv_name_set X<hv_name_set>, hv_placeholders_get
X<hv_placeholders_get>, hv_placeholders_set X<hv_placeholders_set>,
hv_rand_set X<hv_rand_set>, hv_riter_p X<hv_riter_p>, hv_riter_set
X<hv_riter_set>, ibcmp_utf8 X<ibcmp_utf8>, init_global_struct
X<init_global_struct>, init_stacks X<init_stacks>, init_tm X<init_tm>,
instr X<instr>, is_lvalue_sub X<is_lvalue_sub>, leave_scope X<leave_scope>,
load_module_nocontext X<load_module_nocontext>, magic_dump X<magic_dump>,
malloc X<malloc>, markstack_grow X<markstack_grow>, mess_nocontext
X<mess_nocontext>, mfree X<mfree>, mg_dup X<mg_dup>, mg_size X<mg_size>,
mini_mktime X<mini_mktime>, moreswitches X<moreswitches>, mro_get_from_name
X<mro_get_from_name>, mro_get_private_data X<mro_get_private_data>,
mro_set_mro X<mro_set_mro>, mro_set_private_data X<mro_set_private_data>,
my_atof X<my_atof>, my_atof2 X<my_atof2>, my_bcopy X<my_bcopy>, my_bzero
X<my_bzero>, my_chsize X<my_chsize>, my_cxt_index X<my_cxt_index>,
my_cxt_init X<my_cxt_init>, my_dirfd X<my_dirfd>, my_exit X<my_exit>,
my_failure_exit X<my_failure_exit>, my_fflush_all X<my_fflush_all>, my_fork
X<my_fork>, my_lstat X<my_lstat>, my_memcmp X<my_memcmp>, my_memset
X<my_memset>, my_pclose X<my_pclose>, my_popen X<my_popen>, my_popen_list
X<my_popen_list>, my_setenv X<my_setenv>, my_socketpair X<my_socketpair>,
my_stat X<my_stat>, my_strftime X<my_strftime>, newANONATTRSUB
X<newANONATTRSUB>, newANONHASH X<newANONHASH>, newANONLIST X<newANONLIST>,
newANONSUB X<newANONSUB>, newATTRSUB X<newATTRSUB>, newAVREF X<newAVREF>,
newCVREF X<newCVREF>, newFORM X<newFORM>, newGVREF X<newGVREF>, newGVgen
X<newGVgen>, newGVgen_flags X<newGVgen_flags>, newHVREF X<newHVREF>,
newHVhv X<newHVhv>, newIO X<newIO>, newMYSUB X<newMYSUB>, newPROG
X<newPROG>, newRV X<newRV>, newSUB X<newSUB>, newSVREF X<newSVREF>,
newSVpvf_nocontext X<newSVpvf_nocontext>, new_stackinfo X<new_stackinfo>,
op_refcnt_lock X<op_refcnt_lock>, op_refcnt_unlock X<op_refcnt_unlock>,
parser_dup X<parser_dup>, perl_alloc_using X<perl_alloc_using>,
perl_clone_using X<perl_clone_using>, pmop_dump X<pmop_dump>, pop_scope
X<pop_scope>, pregcomp X<pregcomp>, pregexec X<pregexec>, pregfree
X<pregfree>, pregfree2 X<pregfree2>, printf_nocontext X<printf_nocontext>,
ptr_table_fetch X<ptr_table_fetch>, ptr_table_free X<ptr_table_free>,
ptr_table_new X<ptr_table_new>, ptr_table_split X<ptr_table_split>,
ptr_table_store X<ptr_table_store>, push_scope X<push_scope>, re_compile
X<re_compile>, re_dup_guts X<re_dup_guts>, re_intuit_start
X<re_intuit_start>, re_intuit_string X<re_intuit_string>, realloc
X<realloc>, reentrant_free X<reentrant_free>, reentrant_init
X<reentrant_init>, reentrant_retry X<reentrant_retry>, reentrant_size
X<reentrant_size>, ref X<ref>, reg_named_buff_all X<reg_named_buff_all>,
reg_named_buff_exists X<reg_named_buff_exists>, reg_named_buff_fetch
X<reg_named_buff_fetch>, reg_named_buff_firstkey
X<reg_named_buff_firstkey>, reg_named_buff_nextkey
X<reg_named_buff_nextkey>, reg_named_buff_scalar X<reg_named_buff_scalar>,
regdump X<regdump>, regdupe_internal X<regdupe_internal>, regexec_flags
X<regexec_flags>, regfree_internal X<regfree_internal>, reginitcolors
X<reginitcolors>, regnext X<regnext>, repeatcpy X<repeatcpy>, rsignal
X<rsignal>, rsignal_state X<rsignal_state>, runops_debug X<runops_debug>,
runops_standard X<runops_standard>, rvpv_dup X<rvpv_dup>, safesyscalloc
X<safesyscalloc>, safesysfree X<safesysfree>, safesysmalloc
X<safesysmalloc>, safesysrealloc X<safesysrealloc>, save_I16 X<save_I16>,
save_I32 X<save_I32>, save_I8 X<save_I8>, save_adelete X<save_adelete>,
save_aelem X<save_aelem>, save_aelem_flags X<save_aelem_flags>, save_alloc
X<save_alloc>, save_aptr X<save_aptr>, save_ary X<save_ary>, save_bool
X<save_bool>, save_clearsv X<save_clearsv>, save_delete X<save_delete>,
save_destructor X<save_destructor>, save_destructor_x X<save_destructor_x>,
save_freeop X<save_freeop>, save_freepv X<save_freepv>, save_freesv
X<save_freesv>, save_generic_pvref X<save_generic_pvref>,
save_generic_svref X<save_generic_svref>, save_hash X<save_hash>,
save_hdelete X<save_hdelete>, save_helem X<save_helem>, save_helem_flags
X<save_helem_flags>, save_hints X<save_hints>, save_hptr X<save_hptr>,
save_int X<save_int>, save_item X<save_item>, save_iv X<save_iv>, save_list
X<save_list>, save_long X<save_long>, save_mortalizesv X<save_mortalizesv>,
save_nogv X<save_nogv>, save_op X<save_op>, save_padsv_and_mortalize
X<save_padsv_and_mortalize>, save_pptr X<save_pptr>, save_pushi32ptr
X<save_pushi32ptr>, save_pushptr X<save_pushptr>, save_pushptrptr
X<save_pushptrptr>, save_re_context X<save_re_context>, save_scalar
X<save_scalar>, save_set_svflags X<save_set_svflags>, save_shared_pvref
X<save_shared_pvref>, save_sptr X<save_sptr>, save_svref X<save_svref>,
save_vptr X<save_vptr>, savestack_grow X<savestack_grow>,
savestack_grow_cnt X<savestack_grow_cnt>, scan_num X<scan_num>,
scan_vstring X<scan_vstring>, seed X<seed>, set_context X<set_context>,
set_numeric_local X<set_numeric_local>, set_numeric_radix
X<set_numeric_radix>, set_numeric_standard X<set_numeric_standard>,
share_hek X<share_hek>, si_dup X<si_dup>, ss_dup X<ss_dup>, stack_grow
X<stack_grow>, start_subparse X<start_subparse>, str_to_version
X<str_to_version>, sv_2iv X<sv_2iv>, sv_2pv X<sv_2pv>, sv_2uv X<sv_2uv>,
sv_catpvf_mg_nocontext X<sv_catpvf_mg_nocontext>, sv_catpvf_nocontext
X<sv_catpvf_nocontext>, sv_dup X<sv_dup>, sv_dup_inc X<sv_dup_inc>, sv_peek
X<sv_peek>, sv_pvn_nomg X<sv_pvn_nomg>, sv_setpvf_mg_nocontext
X<sv_setpvf_mg_nocontext>, sv_setpvf_nocontext X<sv_setpvf_nocontext>,
sys_init X<sys_init>, sys_init3 X<sys_init3>, sys_intern_clear
X<sys_intern_clear>, sys_intern_dup X<sys_intern_dup>, sys_intern_init
X<sys_intern_init>, sys_term X<sys_term>, taint_env X<taint_env>,
taint_proper X<taint_proper>, unlnk X<unlnk>, unsharepvn X<unsharepvn>,
utf16_to_utf8 X<utf16_to_utf8>, utf16_to_utf8_reversed
X<utf16_to_utf8_reversed>, uvuni_to_utf8 X<uvuni_to_utf8>, vdeb X<vdeb>,
vform X<vform>, vload_module X<vload_module>, vnewSVpvf X<vnewSVpvf>,
vwarner X<vwarner>, warn_nocontext X<warn_nocontext>, warner X<warner>,
warner_nocontext X<warner_nocontext>, whichsig X<whichsig>, whichsig_pv
X<whichsig_pv>, whichsig_pvn X<whichsig_pvn>, whichsig_sv X<whichsig_sv>

=item AUTHORS

=item SEE ALSO

=back

=head2 perlintern - autogenerated documentation of purely B<internal> Perl functions

=over 4

=item DESCRIPTION
X<internal Perl functions> X<interpreter functions>

=item Compile-time scope hooks

BhkENTRY X<BhkENTRY>, BhkFLAGS X<BhkFLAGS>, CALL_BLOCK_HOOKS
X<CALL_BLOCK_HOOKS>

=item Custom Operators

core_prototype X<core_prototype>

=item CV Manipulation Functions

docatch X<docatch>

=item CV reference counts and CvOUTSIDE

CvWEAKOUTSIDE X<CvWEAKOUTSIDE>

=item Embedding Functions

cv_dump X<cv_dump>, cv_forget_slab X<cv_forget_slab>, do_dump_pad
X<do_dump_pad>, pad_alloc_name X<pad_alloc_name>, pad_block_start
X<pad_block_start>, pad_check_dup X<pad_check_dup>, pad_findlex
X<pad_findlex>, pad_fixup_inner_anons X<pad_fixup_inner_anons>, pad_free
X<pad_free>, pad_leavemy X<pad_leavemy>, padlist_dup X<padlist_dup>,
padname_dup X<padname_dup>, padnamelist_dup X<padnamelist_dup>, pad_push
X<pad_push>, pad_reset X<pad_reset>, pad_swipe X<pad_swipe>

=item GV Functions

gv_try_downgrade X<gv_try_downgrade>

=item Hash Manipulation Functions

hv_ename_add X<hv_ename_add>, hv_ename_delete X<hv_ename_delete>,
refcounted_he_chain_2hv X<refcounted_he_chain_2hv>, refcounted_he_fetch_pv
X<refcounted_he_fetch_pv>, refcounted_he_fetch_pvn
X<refcounted_he_fetch_pvn>, refcounted_he_fetch_pvs
X<refcounted_he_fetch_pvs>, refcounted_he_fetch_sv
X<refcounted_he_fetch_sv>, refcounted_he_free X<refcounted_he_free>,
refcounted_he_inc X<refcounted_he_inc>, refcounted_he_new_pv
X<refcounted_he_new_pv>, refcounted_he_new_pvn X<refcounted_he_new_pvn>,
refcounted_he_new_pvs X<refcounted_he_new_pvs>, refcounted_he_new_sv
X<refcounted_he_new_sv>

=item IO Functions

start_glob X<start_glob>

=item Lexer interface

validate_proto X<validate_proto>

=item Magical Functions

magic_clearhint X<magic_clearhint>, magic_clearhints X<magic_clearhints>,
magic_methcall X<magic_methcall>, magic_sethint X<magic_sethint>,
mg_localize X<mg_localize>

=item Miscellaneous Functions

free_c_backtrace X<free_c_backtrace>, get_c_backtrace X<get_c_backtrace>

=item MRO Functions

mro_get_linear_isa_dfs X<mro_get_linear_isa_dfs>, mro_isa_changed_in
X<mro_isa_changed_in>, mro_package_moved X<mro_package_moved>

=item Optree Manipulation Functions

finalize_optree X<finalize_optree>

=item Pad Data Structures

CX_CURPAD_SAVE X<CX_CURPAD_SAVE>, CX_CURPAD_SV X<CX_CURPAD_SV>, PAD_BASE_SV
X<PAD_BASE_SV>, PAD_CLONE_VARS X<PAD_CLONE_VARS>, PAD_COMPNAME_FLAGS
X<PAD_COMPNAME_FLAGS>, PAD_COMPNAME_GEN X<PAD_COMPNAME_GEN>,
PAD_COMPNAME_GEN_set X<PAD_COMPNAME_GEN_set>, PAD_COMPNAME_OURSTASH
X<PAD_COMPNAME_OURSTASH>, PAD_COMPNAME_PV X<PAD_COMPNAME_PV>,
PAD_COMPNAME_TYPE X<PAD_COMPNAME_TYPE>, PadnameIsOUR X<PadnameIsOUR>,
PadnameIsSTATE X<PadnameIsSTATE>, PadnameOURSTASH X<PadnameOURSTASH>,
PadnameOUTER X<PadnameOUTER>, PadnameTYPE X<PadnameTYPE>, PAD_RESTORE_LOCAL
X<PAD_RESTORE_LOCAL>, PAD_SAVE_LOCAL X<PAD_SAVE_LOCAL>, PAD_SAVE_SETNULLPAD
X<PAD_SAVE_SETNULLPAD>, PAD_SETSV X<PAD_SETSV>, PAD_SET_CUR X<PAD_SET_CUR>,
PAD_SET_CUR_NOSAVE X<PAD_SET_CUR_NOSAVE>, PAD_SV X<PAD_SV>, PAD_SVl
X<PAD_SVl>, SAVECLEARSV X<SAVECLEARSV>, SAVECOMPPAD X<SAVECOMPPAD>,
SAVEPADSV X<SAVEPADSV>

=item Per-Interpreter Variables

PL_DBsingle X<PL_DBsingle>, PL_DBsub X<PL_DBsub>, PL_DBtrace X<PL_DBtrace>,
PL_dowarn X<PL_dowarn>, PL_last_in_gv X<PL_last_in_gv>, PL_ofsgv
X<PL_ofsgv>, PL_rs X<PL_rs>

=item Stack Manipulation Macros

djSP X<djSP>, LVRET X<LVRET>

=item SV-Body Allocation

sv_2num X<sv_2num>

=item SV Manipulation Functions

sv_add_arena X<sv_add_arena>, sv_clean_all X<sv_clean_all>, sv_clean_objs
X<sv_clean_objs>, sv_free_arenas X<sv_free_arenas>, SvTHINKFIRST
X<SvTHINKFIRST>

=item Unicode Support

find_uninit_var X<find_uninit_var>, report_uninit X<report_uninit>

=item Undocumented functions

PerlIO_restore_errno X<PerlIO_restore_errno>, PerlIO_save_errno
X<PerlIO_save_errno>, Slab_Alloc X<Slab_Alloc>, Slab_Free X<Slab_Free>,
Slab_to_ro X<Slab_to_ro>, Slab_to_rw X<Slab_to_rw>, _add_range_to_invlist
X<_add_range_to_invlist>, _byte_dump_string X<_byte_dump_string>,
_core_swash_init X<_core_swash_init>, _get_regclass_nonbitmap_data
X<_get_regclass_nonbitmap_data>, _get_swash_invlist X<_get_swash_invlist>,
_invlistEQ X<_invlistEQ>, _invlist_array_init X<_invlist_array_init>,
_invlist_contains_cp X<_invlist_contains_cp>, _invlist_dump
X<_invlist_dump>, _invlist_intersection X<_invlist_intersection>,
_invlist_intersection_maybe_complement_2nd
X<_invlist_intersection_maybe_complement_2nd>, _invlist_invert
X<_invlist_invert>, _invlist_len X<_invlist_len>, _invlist_populate_swatch
X<_invlist_populate_swatch>, _invlist_search X<_invlist_search>,
_invlist_subtract X<_invlist_subtract>, _invlist_union X<_invlist_union>,
_invlist_union_maybe_complement_2nd X<_invlist_union_maybe_complement_2nd>,
_is_grapheme X<_is_grapheme>, _load_PL_utf8_foldclosures
X<_load_PL_utf8_foldclosures>, _mem_collxfrm X<_mem_collxfrm>, _new_invlist
X<_new_invlist>, _new_invlist_C_array X<_new_invlist_C_array>,
_setup_canned_invlist X<_setup_canned_invlist>, _swash_inversion_hash
X<_swash_inversion_hash>, _swash_to_invlist X<_swash_to_invlist>,
_to_fold_latin1 X<_to_fold_latin1>, _to_upper_title_latin1
X<_to_upper_title_latin1>, _warn_problematic_locale
X<_warn_problematic_locale>, abort_execution X<abort_execution>,
add_cp_to_invlist X<add_cp_to_invlist>, alloc_LOGOP X<alloc_LOGOP>,
alloc_maybe_populate_EXACT X<alloc_maybe_populate_EXACT>, allocmy
X<allocmy>, amagic_is_enabled X<amagic_is_enabled>,
append_utf8_from_native_byte X<append_utf8_from_native_byte>, apply
X<apply>, av_extend_guts X<av_extend_guts>, av_reify X<av_reify>,
bind_match X<bind_match>, boot_core_PerlIO X<boot_core_PerlIO>,
boot_core_UNIVERSAL X<boot_core_UNIVERSAL>, boot_core_mro X<boot_core_mro>,
cando X<cando>, check_utf8_print X<check_utf8_print>, ck_anoncode
X<ck_anoncode>, ck_backtick X<ck_backtick>, ck_bitop X<ck_bitop>, ck_cmp
X<ck_cmp>, ck_concat X<ck_concat>, ck_defined X<ck_defined>, ck_delete
X<ck_delete>, ck_each X<ck_each>, ck_entersub_args_core
X<ck_entersub_args_core>, ck_eof X<ck_eof>, ck_eval X<ck_eval>, ck_exec
X<ck_exec>, ck_exists X<ck_exists>, ck_ftst X<ck_ftst>, ck_fun X<ck_fun>,
ck_glob X<ck_glob>, ck_grep X<ck_grep>, ck_index X<ck_index>, ck_join
X<ck_join>, ck_length X<ck_length>, ck_lfun X<ck_lfun>, ck_listiob
X<ck_listiob>, ck_match X<ck_match>, ck_method X<ck_method>, ck_null
X<ck_null>, ck_open X<ck_open>, ck_prototype X<ck_prototype>, ck_readline
X<ck_readline>, ck_refassign X<ck_refassign>, ck_repeat X<ck_repeat>,
ck_require X<ck_require>, ck_return X<ck_return>, ck_rfun X<ck_rfun>,
ck_rvconst X<ck_rvconst>, ck_sassign X<ck_sassign>, ck_select X<ck_select>,
ck_shift X<ck_shift>, ck_smartmatch X<ck_smartmatch>, ck_sort X<ck_sort>,
ck_spair X<ck_spair>, ck_split X<ck_split>, ck_stringify X<ck_stringify>,
ck_subr X<ck_subr>, ck_substr X<ck_substr>, ck_svconst X<ck_svconst>,
ck_tell X<ck_tell>, ck_trunc X<ck_trunc>, closest_cop X<closest_cop>,
compute_EXACTish X<compute_EXACTish>, coresub_op X<coresub_op>,
create_eval_scope X<create_eval_scope>, croak_caller X<croak_caller>,
croak_no_mem X<croak_no_mem>, croak_popstack X<croak_popstack>,
current_re_engine X<current_re_engine>, custom_op_get_field
X<custom_op_get_field>, cv_ckproto_len_flags X<cv_ckproto_len_flags>,
cv_clone_into X<cv_clone_into>, cv_const_sv_or_av X<cv_const_sv_or_av>,
cv_undef_flags X<cv_undef_flags>, cvgv_from_hek X<cvgv_from_hek>, cvgv_set
X<cvgv_set>, cvstash_set X<cvstash_set>, deb_stack_all X<deb_stack_all>,
defelem_target X<defelem_target>, delete_eval_scope X<delete_eval_scope>,
delimcpy_no_escape X<delimcpy_no_escape>, die_unwind X<die_unwind>,
do_aexec X<do_aexec>, do_aexec5 X<do_aexec5>, do_eof X<do_eof>, do_exec
X<do_exec>, do_exec3 X<do_exec3>, do_execfree X<do_execfree>, do_ipcctl
X<do_ipcctl>, do_ipcget X<do_ipcget>, do_msgrcv X<do_msgrcv>, do_msgsnd
X<do_msgsnd>, do_ncmp X<do_ncmp>, do_open6 X<do_open6>, do_open_raw
X<do_open_raw>, do_print X<do_print>, do_readline X<do_readline>, do_seek
X<do_seek>, do_semop X<do_semop>, do_shmio X<do_shmio>, do_sysseek
X<do_sysseek>, do_tell X<do_tell>, do_trans X<do_trans>, do_vecget
X<do_vecget>, do_vecset X<do_vecset>, do_vop X<do_vop>, does_utf8_overflow
X<does_utf8_overflow>, dofile X<dofile>, drand48_init_r X<drand48_init_r>,
drand48_r X<drand48_r>, dtrace_probe_call X<dtrace_probe_call>,
dtrace_probe_load X<dtrace_probe_load>, dtrace_probe_op X<dtrace_probe_op>,
dtrace_probe_phase X<dtrace_probe_phase>, dump_all_perl X<dump_all_perl>,
dump_packsubs_perl X<dump_packsubs_perl>, dump_sub_perl X<dump_sub_perl>,
dump_sv_child X<dump_sv_child>, emulate_cop_io X<emulate_cop_io>,
feature_is_enabled X<feature_is_enabled>, find_lexical_cv
X<find_lexical_cv>, find_runcv_where X<find_runcv_where>, find_script
X<find_script>, form_short_octal_warning X<form_short_octal_warning>,
free_tied_hv_pool X<free_tied_hv_pool>, get_db_sub X<get_db_sub>,
get_debug_opts X<get_debug_opts>, get_hash_seed X<get_hash_seed>,
get_invlist_iter_addr X<get_invlist_iter_addr>, get_invlist_offset_addr
X<get_invlist_offset_addr>, get_invlist_previous_index_addr
X<get_invlist_previous_index_addr>, get_no_modify X<get_no_modify>,
get_opargs X<get_opargs>, get_re_arg X<get_re_arg>, getenv_len
X<getenv_len>, grok_atoUV X<grok_atoUV>, grok_bslash_c X<grok_bslash_c>,
grok_bslash_o X<grok_bslash_o>, grok_bslash_x X<grok_bslash_x>,
gv_fetchmeth_internal X<gv_fetchmeth_internal>, gv_override X<gv_override>,
gv_setref X<gv_setref>, gv_stashpvn_internal X<gv_stashpvn_internal>,
gv_stashsvpvn_cached X<gv_stashsvpvn_cached>, handle_named_backref
X<handle_named_backref>, hfree_next_entry X<hfree_next_entry>,
hv_backreferences_p X<hv_backreferences_p>, hv_kill_backrefs
X<hv_kill_backrefs>, hv_placeholders_p X<hv_placeholders_p>, hv_undef_flags
X<hv_undef_flags>, init_argv_symbols X<init_argv_symbols>, init_constants
X<init_constants>, init_dbargs X<init_dbargs>, init_debugger
X<init_debugger>, invert X<invert>, invlist_array X<invlist_array>,
invlist_clear X<invlist_clear>, invlist_clone X<invlist_clone>,
invlist_highest X<invlist_highest>, invlist_is_iterating
X<invlist_is_iterating>, invlist_iterfinish X<invlist_iterfinish>,
invlist_iterinit X<invlist_iterinit>, invlist_max X<invlist_max>,
invlist_previous_index X<invlist_previous_index>, invlist_set_len
X<invlist_set_len>, invlist_set_previous_index
X<invlist_set_previous_index>, invlist_trim X<invlist_trim>, io_close
X<io_close>, isFF_OVERLONG X<isFF_OVERLONG>, isFOO_lc X<isFOO_lc>,
is_utf8_common X<is_utf8_common>, is_utf8_common_with_len
X<is_utf8_common_with_len>, is_utf8_cp_above_31_bits
X<is_utf8_cp_above_31_bits>, is_utf8_overlong_given_start_byte_ok
X<is_utf8_overlong_given_start_byte_ok>, isinfnansv X<isinfnansv>, jmaybe
X<jmaybe>, keyword X<keyword>, keyword_plugin_standard
X<keyword_plugin_standard>, list X<list>, localize X<localize>,
magic_clear_all_env X<magic_clear_all_env>, magic_cleararylen_p
X<magic_cleararylen_p>, magic_clearenv X<magic_clearenv>, magic_clearisa
X<magic_clearisa>, magic_clearpack X<magic_clearpack>, magic_clearsig
X<magic_clearsig>, magic_copycallchecker X<magic_copycallchecker>,
magic_existspack X<magic_existspack>, magic_freearylen_p
X<magic_freearylen_p>, magic_freeovrld X<magic_freeovrld>, magic_get
X<magic_get>, magic_getarylen X<magic_getarylen>, magic_getdebugvar
X<magic_getdebugvar>, magic_getdefelem X<magic_getdefelem>, magic_getnkeys
X<magic_getnkeys>, magic_getpack X<magic_getpack>, magic_getpos
X<magic_getpos>, magic_getsig X<magic_getsig>, magic_getsubstr
X<magic_getsubstr>, magic_gettaint X<magic_gettaint>, magic_getuvar
X<magic_getuvar>, magic_getvec X<magic_getvec>, magic_killbackrefs
X<magic_killbackrefs>, magic_nextpack X<magic_nextpack>, magic_regdata_cnt
X<magic_regdata_cnt>, magic_regdatum_get X<magic_regdatum_get>,
magic_regdatum_set X<magic_regdatum_set>, magic_scalarpack
X<magic_scalarpack>, magic_set X<magic_set>, magic_set_all_env
X<magic_set_all_env>, magic_setarylen X<magic_setarylen>, magic_setcollxfrm
X<magic_setcollxfrm>, magic_setdbline X<magic_setdbline>, magic_setdebugvar
X<magic_setdebugvar>, magic_setdefelem X<magic_setdefelem>, magic_setenv
X<magic_setenv>, magic_setisa X<magic_setisa>, magic_setlvref
X<magic_setlvref>, magic_setmglob X<magic_setmglob>, magic_setnkeys
X<magic_setnkeys>, magic_setpack X<magic_setpack>, magic_setpos
X<magic_setpos>, magic_setregexp X<magic_setregexp>, magic_setsig
X<magic_setsig>, magic_setsubstr X<magic_setsubstr>, magic_settaint
X<magic_settaint>, magic_setutf8 X<magic_setutf8>, magic_setuvar
X<magic_setuvar>, magic_setvec X<magic_setvec>, magic_sizepack
X<magic_sizepack>, magic_wipepack X<magic_wipepack>, malloc_good_size
X<malloc_good_size>, malloced_size X<malloced_size>, mem_collxfrm
X<mem_collxfrm>, mem_log_alloc X<mem_log_alloc>, mem_log_free
X<mem_log_free>, mem_log_realloc X<mem_log_realloc>, mg_find_mglob
X<mg_find_mglob>, mode_from_discipline X<mode_from_discipline>, more_bodies
X<more_bodies>, mro_meta_dup X<mro_meta_dup>, mro_meta_init
X<mro_meta_init>, multideref_stringify X<multideref_stringify>, my_attrs
X<my_attrs>, my_clearenv X<my_clearenv>, my_lstat_flags X<my_lstat_flags>,
my_stat_flags X<my_stat_flags>, my_unexec X<my_unexec>, newATTRSUB_x
X<newATTRSUB_x>, newGP X<newGP>, newMETHOP_internal X<newMETHOP_internal>,
newSTUB X<newSTUB>, newSVavdefelem X<newSVavdefelem>, newXS_deffile
X<newXS_deffile>, newXS_len_flags X<newXS_len_flags>, new_warnings_bitfield
X<new_warnings_bitfield>, nextargv X<nextargv>, noperl_die X<noperl_die>,
notify_parser_that_changed_to_utf8 X<notify_parser_that_changed_to_utf8>,
oopsAV X<oopsAV>, oopsHV X<oopsHV>, op_clear X<op_clear>, op_integerize
X<op_integerize>, op_lvalue_flags X<op_lvalue_flags>, op_refcnt_dec
X<op_refcnt_dec>, op_refcnt_inc X<op_refcnt_inc>, op_relocate_sv
X<op_relocate_sv>, op_std_init X<op_std_init>, op_unscope X<op_unscope>,
opmethod_stash X<opmethod_stash>, opslab_force_free X<opslab_force_free>,
opslab_free X<opslab_free>, opslab_free_nopad X<opslab_free_nopad>, package
X<package>, package_version X<package_version>, pad_add_weakref
X<pad_add_weakref>, padlist_store X<padlist_store>, padname_free
X<padname_free>, padnamelist_free X<padnamelist_free>, parse_unicode_opts
X<parse_unicode_opts>, parser_free X<parser_free>, parser_free_nexttoke_ops
X<parser_free_nexttoke_ops>, path_is_searchable X<path_is_searchable>, peep
X<peep>, pmruntime X<pmruntime>, populate_isa X<populate_isa>, ptr_hash
X<ptr_hash>, qerror X<qerror>, re_exec_indentf X<re_exec_indentf>,
re_indentf X<re_indentf>, re_op_compile X<re_op_compile>, re_printf
X<re_printf>, reg_named_buff X<reg_named_buff>, reg_named_buff_iter
X<reg_named_buff_iter>, reg_numbered_buff_fetch X<reg_numbered_buff_fetch>,
reg_numbered_buff_length X<reg_numbered_buff_length>,
reg_numbered_buff_store X<reg_numbered_buff_store>, reg_qr_package
X<reg_qr_package>, reg_skipcomment X<reg_skipcomment>, reg_temp_copy
X<reg_temp_copy>, regcurly X<regcurly>, regprop X<regprop>, report_evil_fh
X<report_evil_fh>, report_redefined_cv X<report_redefined_cv>,
report_wrongway_fh X<report_wrongway_fh>, rpeep X<rpeep>, rsignal_restore
X<rsignal_restore>, rsignal_save X<rsignal_save>, rxres_save X<rxres_save>,
same_dirent X<same_dirent>, save_strlen X<save_strlen>, sawparens
X<sawparens>, scalar X<scalar>, scalarvoid X<scalarvoid>, set_caret_X
X<set_caret_X>, set_padlist X<set_padlist>, should_warn_nl
X<should_warn_nl>, sighandler X<sighandler>, softref2xv X<softref2xv>,
ssc_add_range X<ssc_add_range>, ssc_clear_locale X<ssc_clear_locale>,
ssc_cp_and X<ssc_cp_and>, ssc_intersection X<ssc_intersection>, ssc_union
X<ssc_union>, sub_crush_depth X<sub_crush_depth>, sv_add_backref
X<sv_add_backref>, sv_buf_to_ro X<sv_buf_to_ro>, sv_del_backref
X<sv_del_backref>, sv_free2 X<sv_free2>, sv_kill_backrefs
X<sv_kill_backrefs>, sv_len_utf8_nomg X<sv_len_utf8_nomg>,
sv_magicext_mglob X<sv_magicext_mglob>, sv_mortalcopy_flags
X<sv_mortalcopy_flags>, sv_only_taint_gmagic X<sv_only_taint_gmagic>,
sv_or_pv_pos_u2b X<sv_or_pv_pos_u2b>, sv_resetpvn X<sv_resetpvn>, sv_sethek
X<sv_sethek>, sv_setsv_cow X<sv_setsv_cow>, sv_unglob X<sv_unglob>,
swash_fetch X<swash_fetch>, swash_init X<swash_init>, tied_method
X<tied_method>, tmps_grow_p X<tmps_grow_p>, translate_substr_offsets
X<translate_substr_offsets>, try_amagic_bin X<try_amagic_bin>,
try_amagic_un X<try_amagic_un>, unshare_hek X<unshare_hek>, utilize
X<utilize>, varname X<varname>, vivify_defelem X<vivify_defelem>,
vivify_ref X<vivify_ref>, wait4pid X<wait4pid>, was_lvalue_sub
X<was_lvalue_sub>, watch X<watch>, win32_croak_not_implemented
X<win32_croak_not_implemented>, write_to_stderr X<write_to_stderr>,
xs_boot_epilog X<xs_boot_epilog>, xs_handshake X<xs_handshake>, yyerror
X<yyerror>, yyerror_pv X<yyerror_pv>, yyerror_pvn X<yyerror_pvn>, yylex
X<yylex>, yyparse X<yyparse>, yyquit X<yyquit>, yyunlex X<yyunlex>

=item AUTHORS

=item SEE ALSO

=back

=head2 perliol - C API for Perl's implementation of IO in Layers.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item History and Background

=item Basic Structure

=item Layers vs Disciplines

=item Data Structures

=item Functions and Attributes

=item Per-instance Data

=item Layers in action.

=item Per-instance flag bits

PERLIO_F_EOF, PERLIO_F_CANWRITE, PERLIO_F_CANREAD, PERLIO_F_ERROR,
PERLIO_F_TRUNCATE, PERLIO_F_APPEND, PERLIO_F_CRLF, PERLIO_F_UTF8,
PERLIO_F_UNBUF, PERLIO_F_WRBUF, PERLIO_F_RDBUF, PERLIO_F_LINEBUF,
PERLIO_F_TEMP, PERLIO_F_OPEN, PERLIO_F_FASTGETS

=item Methods in Detail

fsize, name, size, kind, PERLIO_K_BUFFERED, PERLIO_K_RAW, PERLIO_K_CANCRLF,
PERLIO_K_FASTGETS, PERLIO_K_MULTIARG, Pushed, Popped, Open, Binmode,
Getarg, Fileno, Dup, Read, Write, Seek, Tell, Close, Flush, Fill, Eof,
Error, Clearerr, Setlinebuf, Get_base, Get_bufsiz, Get_ptr, Get_cnt,
Set_ptrcnt

=item Utilities

=item Implementing PerlIO Layers

C implementations, Perl implementations

=item Core Layers

"unix", "perlio", "stdio", "crlf", "mmap", "pending", "raw", "utf8"

=item Extension Layers

":encoding", ":scalar", ":via"

=back

=item TODO

=back

=head2 perlapio - perl's IO abstraction interface.

=over 4

=item SYNOPSIS

=item DESCRIPTION

1. USE_STDIO, 2. USE_PERLIO, B<PerlIO_stdin()>, B<PerlIO_stdout()>,
B<PerlIO_stderr()>, B<PerlIO_open(path, mode)>, B<PerlIO_fdopen(fd,mode)>,
B<PerlIO_reopen(path,mode,f)>, B<PerlIO_printf(f,fmt,...)>,
B<PerlIO_vprintf(f,fmt,a)>, B<PerlIO_stdoutf(fmt,...)>,
B<PerlIO_read(f,buf,count)>, B<PerlIO_write(f,buf,count)>,
B<PerlIO_close(f)>, B<PerlIO_puts(f,s)>, B<PerlIO_putc(f,c)>,
B<PerlIO_ungetc(f,c)>, B<PerlIO_getc(f)>, B<PerlIO_eof(f)>,
B<PerlIO_error(f)>, B<PerlIO_fileno(f)>, B<PerlIO_clearerr(f)>,
B<PerlIO_flush(f)>, B<PerlIO_seek(f,offset,whence)>, B<PerlIO_tell(f)>,
B<PerlIO_getpos(f,p)>, B<PerlIO_setpos(f,p)>, B<PerlIO_rewind(f)>,
B<PerlIO_tmpfile()>, B<PerlIO_setlinebuf(f)>

=over 4

=item Co-existence with stdio

B<PerlIO_importFILE(f,mode)>, B<PerlIO_exportFILE(f,mode)>,
B<PerlIO_releaseFILE(p,f)>, B<PerlIO_findFILE(f)>

=item "Fast gets" Functions

B<PerlIO_fast_gets(f)>, B<PerlIO_has_cntptr(f)>, B<PerlIO_get_cnt(f)>,
B<PerlIO_get_ptr(f)>, B<PerlIO_set_ptrcnt(f,p,c)>, B<PerlIO_canset_cnt(f)>,
B<PerlIO_set_cnt(f,c)>, B<PerlIO_has_base(f)>, B<PerlIO_get_base(f)>,
B<PerlIO_get_bufsiz(f)>

=item Other Functions

PerlIO_apply_layers(f,mode,layers), PerlIO_binmode(f,ptype,imode,layers),
'E<lt>' read, 'E<gt>' write, '+' read/write, PerlIO_debug(fmt,...)

=back

=back

=head2 perlhack - How to hack on Perl

=over 4

=item DESCRIPTION

=item SUPER QUICK PATCH GUIDE

Check out the source repository, Ensure you're following the latest advice,
Make your change, Test your change, Commit your change, Send your change to
perlbug, Thank you, Next time

=item BUG REPORTING

=item PERL 5 PORTERS

=over 4

=item perl-changes mailing list

=item #p5p on IRC

=back

=item GETTING THE PERL SOURCE

=over 4

=item Read access via Git

=item Read access via the web

=item Read access via rsync

=item Write access via git

=back

=item PATCHING PERL

=over 4

=item Submitting patches

=item Getting your patch accepted

Why, What, How

=item Patching a core module

=item Updating perldelta

=item What makes for a good patch?

=back

=item TESTING

F<t/base>, F<t/comp> and F<t/opbasic>, F<t/cmd>, F<t/run>, F<t/io> and
F<t/op>, Everything else

=over 4

=item Special C<make test> targets

test_porting, minitest, test.valgrind check.valgrind, test_harness,
test-notty test_notty

=item Parallel tests

=item Running tests by hand

=item Using F<t/harness> for testing

-v, -torture, -re=PATTERN, -re LIST OF PATTERNS, PERL_CORE=1,
PERL_DESTRUCT_LEVEL=2, PERL, PERL_SKIP_TTY_TEST, PERL_TEST_Net_Ping,
PERL_TEST_NOVREXX, PERL_TEST_NUMCONVERTS, PERL_TEST_MEMORY

=item Performance testing

=back

=item MORE READING FOR GUTS HACKERS

L<perlsource>, L<perlinterp>, L<perlhacktut>, L<perlhacktips>, L<perlguts>,
L<perlxstut> and L<perlxs>, L<perlapi>, F<Porting/pumpkin.pod>

=item CPAN TESTERS AND PERL SMOKERS

=item WHAT NEXT?

=over 4

=item "The Road goes ever on and on, down from the door where it began."

=item Metaphoric Quotations

=back

=item AUTHOR

=back

=head2 perlsource - A guide to the Perl source tree

=over 4

=item DESCRIPTION

=item FINDING YOUR WAY AROUND

=over 4

=item C code

=item Core modules

F<lib/>, F<ext/>, F<dist/>, F<cpan/>

=item Tests

Module tests, F<t/base/>, F<t/cmd/>, F<t/comp/>, F<t/io/>, F<t/mro/>,
F<t/op/>, F<t/opbasic/>, F<t/re/>, F<t/run/>, F<t/uni/>, F<t/win32/>,
F<t/porting/>, F<t/lib/>

=item Documentation

=item Hacking tools and documentation

F<check*>, F<Maintainers>, F<Maintainers.pl>, and F<Maintainers.pm>,
F<podtidy>

=item Build system

=item F<AUTHORS>

=item F<MANIFEST>

=back

=back

=head2 perlinterp - An overview of the Perl interpreter

=over 4

=item DESCRIPTION

=item ELEMENTS OF THE INTERPRETER

=over 4

=item Startup

=item Parsing

=item Optimization

=item Running

=item Exception handling

=item INTERNAL VARIABLE TYPES

=back

=item OP TREES

=item STACKS

=over 4

=item Argument stack

=item Mark stack

=item Save stack

=back

=item MILLIONS OF MACROS

=item FURTHER READING

=back

=head2 perlhacktut - Walk through the creation of a simple C code patch

=over 4

=item DESCRIPTION

=item EXAMPLE OF A SIMPLE PATCH

=over 4

=item Writing the patch

=item Testing the patch

=item Documenting the patch

=item Submit

=back

=item AUTHOR

=back

=head2 perlhacktips - Tips for Perl core C code hacking

=over 4

=item DESCRIPTION

=item COMMON PROBLEMS

=over 4

=item Perl environment problems

=item Portability problems

=item Problematic System Interfaces

=item Security problems

=back

=item DEBUGGING

=over 4

=item Poking at Perl

=item Using a source-level debugger

run [args], break function_name, break source.c:xxx, step, next, continue,
finish, 'enter', ptype, print

=item gdb macro support

=item Dumping Perl Data Structures

=item Using gdb to look at specific parts of a program

=item Using gdb to look at what the parser/lexer are doing

=back

=item SOURCE CODE STATIC ANALYSIS

=over 4

=item lint

=item Coverity

=item HP-UX cadvise (Code Advisor)

=item cpd (cut-and-paste detector)

=item gcc warnings

=item Warnings of other C compilers

=back

=item MEMORY DEBUGGERS

=over 4

=item valgrind

=item AddressSanitizer

-Dcc=clang, -Accflags=-faddress-sanitizer, -Aldflags=-faddress-sanitizer,
-Alddlflags=-shared\ -faddress-sanitizer

=back

=item PROFILING

=over 4

=item Gprof Profiling

-a, -b, -e routine, -f routine, -s, -z

=item GCC gcov Profiling

=back

=item MISCELLANEOUS TRICKS

=over 4

=item PERL_DESTRUCT_LEVEL

=item PERL_MEM_LOG

=item DDD over gdb

=item C backtrace

Linux, OS X, get_c_backtrace, free_c_backtrace, get_c_backtrace_dump,
dump_c_backtrace

=item Poison

=item Read-only optrees

=item When is a bool not a bool?

=item The .i Targets

=back

=item AUTHOR

=back

=head2 perlpolicy - Various and sundry policies and commitments related to
the Perl core

=over 4

=item DESCRIPTION

=item GOVERNANCE

=over 4

=item Perl 5 Porters

=back

=item MAINTENANCE AND SUPPORT

=item BACKWARD COMPATIBILITY AND DEPRECATION

=over 4

=item Terminology

experimental, deprecated, discouraged, removed

=back

=item MAINTENANCE BRANCHES

=over 4

=item Getting changes into a maint branch

=back

=item CONTRIBUTED MODULES

=over 4

=item A Social Contract about Artistic Control

=back

=item DOCUMENTATION

=item STANDARDS OF CONDUCT

=item CREDITS

=back

=head2 perlgit - Detailed information about git and the Perl repository

=over 4

=item DESCRIPTION

=item CLONING THE REPOSITORY

=item WORKING WITH THE REPOSITORY

=over 4

=item Finding out your status

=item Patch workflow

=item Committing your changes

=item Sending patch emails

=item A note on derived files

=item Cleaning a working directory

=item Bisecting

=item Topic branches and rewriting history

=item Grafts

=back

=item WRITE ACCESS TO THE GIT REPOSITORY

=over 4

=item Accepting a patch

=item Committing to blead

=item On merging and rebasing

=item Committing to maintenance versions

=item Merging from a branch via GitHub

=item Using a smoke-me branch to test changes

=item A note on camel and dromedary

=back

=back

=head2 perlbook - Books about and related to Perl

=over 4

=item DESCRIPTION

=over 4

=item The most popular books

I<Programming Perl> (the "Camel Book"), I<The Perl Cookbook> (the "Ram
Book"), I<Learning Perl> (the "Llama Book"), I<Intermediate Perl> (the
"Alpaca Book")

=item References

I<Perl 5 Pocket Reference>, I<Perl Debugger Pocket Reference>, I<Regular
Expression Pocket Reference>

=item Tutorials

I<Beginning Perl>, I<Learning Perl> (the "Llama Book"), I<Intermediate
Perl> (the "Alpaca Book"), I<Mastering Perl>, I<Effective Perl Programming>

=item Task-Oriented

I<Writing Perl Modules for CPAN>, I<The Perl Cookbook>, I<Automating System
Administration with Perl>, I<Real World SQL Server Administration with
Perl>

=item Special Topics

I<Regular Expressions Cookbook>, I<Programming the Perl DBI>, I<Perl Best
Practices>, I<Higher-Order Perl>, I<Mastering Regular Expressions>,
I<Network Programming with Perl>, I<Perl Template Toolkit>, I<Object
Oriented Perl>, I<Data Munging with Perl>, I<Mastering Perl/Tk>,
I<Extending and Embedding Perl>, I<Pro Perl Debugging>

=item Free (as in beer) books

=item Other interesting, non-Perl books

I<Programming Pearls>, I<More Programming Pearls>

=item A note on freshness

=item Get your book listed

=back

=back

=head2 perlcommunity - a brief overview of the Perl community

=over 4

=item DESCRIPTION

=over 4

=item Where to Find the Community

=item Mailing Lists and Newsgroups

=item IRC

=item Websites

L<http://perl.com/>, L<http://blogs.perl.org/>, L<http://perlsphere.net/>,
L<http://perlweekly.com/>, L<http://use.perl.org/>,
L<http://www.perlmonks.org/>, L<http://stackoverflow.com/>,
L<http://prepan.org/>

=item User Groups

=item Workshops

=item Hackathons

=item Conventions

=item Calendar of Perl Events

=back

=item AUTHOR

=back

=head2 perldoc - Look up Perl documentation in Pod format.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item OPTIONS

B<-h>, B<-D>, B<-t>, B<-u>, B<-m> I<module>, B<-l>, B<-U>, B<-F>, B<-f>
I<perlfunc>, B<-q> I<perlfaq-search-regexp>, B<-a> I<perlapifunc>, B<-v>
I<perlvar>, B<-T>, B<-d> I<destination-filename>, B<-o>
I<output-formatname>, B<-M> I<module-name>, B<-w> I<option:value> or B<-w>
I<option>, B<-X>, B<-L> I<language_code>,
B<PageName|ModuleName|ProgramName|URL>, B<-n> I<some-formatter>, B<-r>,
B<-i>, B<-V>

=item SECURITY

=item ENVIRONMENT

=item CHANGES

=item SEE ALSO

=item AUTHOR

=back

=head2 perlhist - the Perl history records

=over 4

=item DESCRIPTION

=item INTRODUCTION

=item THE KEEPERS OF THE PUMPKIN

=over 4

=item PUMPKIN?

=back

=item THE RECORDS

=over 4

=item SELECTED RELEASE SIZES

=item SELECTED PATCH SIZES

=back

=item THE KEEPERS OF THE RECORDS

=back

=head2 perldelta - what is new for perl v5.26.3

=over 4

=item DESCRIPTION

=item Security

=over 4

=item [CVE-2018-12015] Directory traversal in module Archive::Tar

=item [CVE-2018-18311] Integer overflow leading to buffer overflow and
segmentation fault

=item [CVE-2018-18312] Heap-buffer-overflow write in S_regatom (regcomp.c)

=item [CVE-2018-18313] Heap-buffer-overflow read in S_grok_bslash_N
(regcomp.c)

=item [CVE-2018-18314] Heap-buffer-overflow write in S_regatom (regcomp.c)

=back

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Diagnostics

=over 4

=item New Diagnostics

=item Changes to Existing Diagnostics

=back

=item Acknowledgements

=item Reporting Bugs

=item Give Thanks

=item SEE ALSO

=back

=head2 perl5263delta, perldelta - what is new for perl v5.26.3

=over 4

=item DESCRIPTION

=item Security

=over 4

=item [CVE-2018-12015] Directory traversal in module Archive::Tar

=item [CVE-2018-18311] Integer overflow leading to buffer overflow and
segmentation fault

=item [CVE-2018-18312] Heap-buffer-overflow write in S_regatom (regcomp.c)

=item [CVE-2018-18313] Heap-buffer-overflow read in S_grok_bslash_N
(regcomp.c)

=item [CVE-2018-18314] Heap-buffer-overflow write in S_regatom (regcomp.c)

=back

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Diagnostics

=over 4

=item New Diagnostics

=item Changes to Existing Diagnostics

=back

=item Acknowledgements

=item Reporting Bugs

=item Give Thanks

=item SEE ALSO

=back

=head2 perl5280delta - what is new for perl v5.28.0

=over 4

=item DESCRIPTION

=item Core Enhancements

=over 4

=item Unicode 10.0 is supported

=item L<C<delete>|perlfunc/delete EXPR> on key/value hash slices

=item Experimentally, there are now alphabetic synonyms for some regular
expression assertions

=item Mixed Unicode scripts are now detectable

=item In-place editing with C<perl -i> is now safer

=item Initialisation of aggregate state variables

=item Full-size inode numbers

=item The C<sprintf> C<%j> format size modifier is now available with
pre-C99 compilers

=item Close-on-exec flag set atomically

=item String- and number-specific bitwise ops are no longer experimental

=item Locales are now thread-safe on systems that support them

=item New read-only predefined variable C<${^SAFE_LOCALES}>

=back

=item Security

=over 4

=item [CVE-2017-12837] Heap buffer overflow in regular expression compiler

=item [CVE-2017-12883] Buffer over-read in regular expression parser

=item [CVE-2017-12814] C<$ENV{$key}> stack buffer overflow on Windows

=item Default Hash Function Change

=back

=item Incompatible Changes

=over 4

=item Subroutine attribute and signature order

=item Comma-less variable lists in formats are no longer allowed

=item The C<:locked> and C<:unique> attributes have been removed

=item C<\N{}> with nothing between the braces is now illegal

=item Opening the same symbol as both a file and directory handle is no
longer allowed

=item Use of bare C<< << >> to mean C<< <<"" >> is no longer allowed

=item Setting $/ to a reference to a non-positive integer no longer allowed

=item Unicode code points with values exceeding C<IV_MAX> are now fatal

=item The C<B::OP::terse> method has been removed

=item Use of inherited AUTOLOAD for non-methods is no longer allowed

=item Use of strings with code points over 0xFF is not allowed for bitwise
string operators

=item Setting C<${^ENCODING}> to a defined value is now illegal

=item Backslash no longer escapes colon in PATH for the C<-S> switch

=item the -DH (DEBUG_H) misfeature has been removed

=item Yada-yada is now strictly a statement

=item Sort algorithm can no longer be specified

=item Over-radix digits in floating point literals

=item Return type of C<unpackstring()>

=back

=item Deprecations

=over 4

=item Use of L<C<vec>|perlfunc/vec EXPR,OFFSET,BITS> on strings with code
points above 0xFF is deprecated

=item Some uses of unescaped C<"{"> in regexes are no longer fatal

=item Use of unescaped C<"{"> immediately after a C<"("> in regular
expression patterns is deprecated

=item Assignment to C<$[> will be fatal in Perl 5.30

=item hostname() won't accept arguments in Perl 5.32

=item Module removals

B::Debug, L<Locale::Codes> and its associated Country, Currency and
Language modules

=back

=item Performance Enhancements

=item Modules and Pragmata

=over 4

=item Removal of use vars

=item Use of DynaLoader changed to XSLoader in many modules

=item Updated Modules and Pragmata

=item Removed Modules and Pragmata

=back

=item Documentation

=over 4

=item Changes to Existing Documentation

L<perldiag/Variable length lookbehind not implemented in regex
mE<sol>%sE<sol>>, "Use of state $_ is experimental" in L<perldiag>

=back

=item Diagnostics

=over 4

=item New Diagnostics

=item Changes to Existing Diagnostics

=back

=item Utility Changes

=over 4

=item L<perlbug>

=back

=item Configuration and Compilation

C89 requirement, New probes, HAS_BUILTIN_ADD_OVERFLOW,
HAS_BUILTIN_MUL_OVERFLOW, HAS_BUILTIN_SUB_OVERFLOW,
HAS_THREAD_SAFE_NL_LANGINFO_L, HAS_LOCALECONV_L, HAS_MBRLEN, HAS_MBRTOWC,
HAS_MEMRCHR, HAS_NANOSLEEP, HAS_STRNLEN, HAS_STRTOLD_L, I_WCHAR

=item Testing

=item Packaging

=item Platform Support

=over 4

=item Discontinued Platforms

PowerUX / Power MAX OS

=item Platform-Specific Notes

CentOS, Cygwin, Darwin, FreeBSD, VMS, Windows

=back

=item Internal Changes

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item Give Thanks

=item SEE ALSO

=back

=head2 perl5262delta - what is new for perl v5.26.2

=over 4

=item DESCRIPTION

=item Security

=over 4

=item [CVE-2018-6797] heap-buffer-overflow (WRITE of size 1) in S_regatom
(regcomp.c)

=item [CVE-2018-6798] Heap-buffer-overflow in Perl__byte_dump_string
(utf8.c)

=item [CVE-2018-6913] heap-buffer-overflow in S_pack_rec

=item Assertion failure in Perl__core_swash_init (utf8.c)

=back

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Documentation

=over 4

=item Changes to Existing Documentation

=back

=item Platform Support

=over 4

=item Platform-Specific Notes

Windows

=back

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item Give Thanks

=item SEE ALSO

=back

=head2 perl5261delta - what is new for perl v5.26.1

=over 4

=item DESCRIPTION

=item Security

=over 4

=item [CVE-2017-12837] Heap buffer overflow in regular expression compiler

=item [CVE-2017-12883] Buffer over-read in regular expression parser

=item [CVE-2017-12814] C<$ENV{$key}> stack buffer overflow on Windows

=back

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Platform Support

=over 4

=item Platform-Specific Notes

FreeBSD, Windows

=back

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item Give Thanks

=item SEE ALSO

=back

=head2 perl5260delta - what is new for perl v5.26.0

=over 4

=item DESCRIPTION

=item Notice

C<"."> no longer in C<@INC>, C<do> may now warn, In regular expression
patterns, a literal left brace C<"{"> should be escaped

=item Core Enhancements

=over 4

=item Lexical subroutines are no longer experimental

=item Indented Here-documents

=item New regular expression modifier C</xx>

=item C<@{^CAPTURE}>, C<%{^CAPTURE}>, and C<%{^CAPTURE_ALL}>

=item Declaring a reference to a variable

=item Unicode 9.0 is now supported

=item Use of C<\p{I<script>}> uses the improved Script_Extensions property

=item Perl can now do default collation in UTF-8 locales on platforms
that support it

=item Better locale collation of strings containing embedded C<NUL>
characters

=item C<CORE> subroutines for hash and array functions callable via
reference

=item New Hash Function For 64-bit Builds

=back

=item Security

=over 4

=item Removal of the current directory (C<".">) from C<@INC>

F<Configure -Udefault_inc_excludes_dot>, C<PERL_USE_UNSAFE_INC>, A new
deprecation warning issued by C<do>, Script authors, Installing and using
CPAN modules, Module Authors

=item Escaped colons and relative paths in PATH

=item New C<-Di> switch is now required for PerlIO debugging output

=back

=item Incompatible Changes

=over 4

=item Unescaped literal C<"{"> characters in regular expression
patterns are no longer permissible

=item C<scalar(%hash)> return signature changed

=item C<keys> returned from an lvalue subroutine

=item The C<${^ENCODING}> facility has been removed

=item C<POSIX::tmpnam()> has been removed

=item require ::Foo::Bar is now illegal.

=item Literal control character variable names are no longer permissible

=item C<NBSP> is no longer permissible in C<\N{...}>

=back

=item Deprecations

=over 4

=item String delimiters that aren't stand-alone graphemes are now
deprecated

=item C<\cI<X>> that maps to a printable is no longer deprecated

=back

=item Performance Enhancements

New Faster Hash Function on 64 bit builds, readline is faster

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Documentation

=over 4

=item New Documentation

=item Changes to Existing Documentation

=back

=item Diagnostics

=over 4

=item New Diagnostics

=item Changes to Existing Diagnostics

=back

=item Utility Changes

=over 4

=item F<c2ph> and F<pstruct>

=item F<Porting/pod_lib.pl>

=item F<Porting/sync-with-cpan>

=item F<perf/benchmarks>

=item F<Porting/checkAUTHORS.pl>

=item F<t/porting/regen.t>

=item F<utils/h2xs.PL>

=item L<perlbug>

=back

=item Configuration and Compilation

=item Testing

=item Platform Support

=over 4

=item New Platforms

NetBSD/VAX

=item Platform-Specific Notes

Darwin, EBCDIC, HP-UX, Hurd, VAX, VMS, Windows, Linux, OpenBSD 6, FreeBSD,
DragonFly BSD

=back

=item Internal Changes

=item Selected Bug Fixes

=item Known Problems

=item Errata From Previous Releases

=item Obituary

=item Acknowledgements

=item Reporting Bugs

=item Give Thanks

=item SEE ALSO

=back

=head2 perl5244delta - what is new for perl v5.24.4

=over 4

=item DESCRIPTION

=item Security

=over 4

=item [CVE-2018-6797] heap-buffer-overflow (WRITE of size 1) in S_regatom
(regcomp.c)

=item [CVE-2018-6798] Heap-buffer-overflow in Perl__byte_dump_string
(utf8.c)

=item [CVE-2018-6913] heap-buffer-overflow in S_pack_rec

=item Assertion failure in Perl__core_swash_init (utf8.c)

=back

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5243delta - what is new for perl v5.24.3

=over 4

=item DESCRIPTION

=item Security

=over 4

=item [CVE-2017-12837] Heap buffer overflow in regular expression compiler

=item [CVE-2017-12883] Buffer over-read in regular expression parser

=item [CVE-2017-12814] C<$ENV{$key}> stack buffer overflow on Windows

=back

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Configuration and Compilation

=item Platform Support

=over 4

=item Platform-Specific Notes

VMS, Windows

=back

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5242delta - what is new for perl v5.24.2

=over 4

=item DESCRIPTION

=item Security

=over 4

=item Improved handling of '.' in @INC in base.pm

=item "Escaped" colons and relative paths in PATH

=back

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5241delta - what is new for perl v5.24.1

=over 4

=item DESCRIPTION

=item Security

=over 4

=item B<-Di> switch is now required for PerlIO debugging output

=item Core modules and tools no longer search F<"."> for optional modules

=back

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Documentation

=over 4

=item Changes to Existing Documentation

=back

=item Testing

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5240delta - what is new for perl v5.24.0

=over 4

=item DESCRIPTION

=item Core Enhancements

=over 4

=item Postfix dereferencing is no longer experimental

=item Unicode 8.0 is now supported

=item perl will now croak when closing an in-place output file fails

=item New C<\b{lb}> boundary in regular expressions

=item C<qr/(?[ ])/> now works in UTF-8 locales

=item Integer shift (C<< << >> and C<< >> >>) now more explicitly defined

=item printf and sprintf now allow reordered precision arguments

=item More fields provided to C<sigaction> callback with C<SA_SIGINFO>

=item Hashbang redirection to Perl 6

=back

=item Security

=over 4

=item Set proper umask before calling C<mkstemp(3)>

=item Fix out of boundary access in Win32 path handling

=item Fix loss of taint in canonpath

=item Avoid accessing uninitialized memory in win32 C<crypt()>

=item Remove duplicate environment variables from C<environ>

=back

=item Incompatible Changes

=over 4

=item The C<autoderef> feature has been removed

=item Lexical $_ has been removed

=item C<qr/\b{wb}/> is now tailored to Perl expectations

=item Regular expression compilation errors

=item C<qr/\N{}/> now disallowed under C<use re "strict">

=item Nested declarations are now disallowed

=item The C</\C/> character class has been removed.

=item C<chdir('')> no longer chdirs home

=item ASCII characters in variable names must now be all visible

=item An off by one issue in C<$Carp::MaxArgNums> has been fixed

=item Only blanks and tabs are now allowed within C<[...]> within
C<(?[...])>.

=back

=item Deprecations

=over 4

=item Using code points above the platform's C<IV_MAX> is now deprecated

=item Doing bitwise operations on strings containing code points above
0xFF is deprecated

=item C<sysread()>, C<syswrite()>, C<recv()> and C<send()> are deprecated
on :utf8 handles

=back

=item Performance Enhancements

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Documentation

=over 4

=item Changes to Existing Documentation

=back

=item Diagnostics

=over 4

=item New Diagnostics

=item Changes to Existing Diagnostics

=back

=item Configuration and Compilation

=item Testing

=item Platform Support

=over 4

=item Platform-Specific Notes

AmigaOS, Cygwin, EBCDIC, UTF-EBCDIC extended, EBCDIC C<cmp()> and C<sort()>
fixed for UTF-EBCDIC strings, EBCDIC C<tr///> and C<y///> fixed for
C<\N{}>, and C<S<use utf8>> ranges, FreeBSD, IRIX, MacOS X, Solaris, Tru64,
VMS, Win32, ppc64el, floating point

=back

=item Internal Changes

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5224delta - what is new for perl v5.22.4

=over 4

=item DESCRIPTION

=item Security

=over 4

=item Improved handling of '.' in @INC in base.pm

=item "Escaped" colons and relative paths in PATH

=back

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5223delta - what is new for perl v5.22.3

=over 4

=item DESCRIPTION

=item Security

=over 4

=item B<-Di> switch is now required for PerlIO debugging output

=item Core modules and tools no longer search F<"."> for optional modules

=back

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Documentation

=over 4

=item Changes to Existing Documentation

=back

=item Testing

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5222delta - what is new for perl v5.22.2

=over 4

=item DESCRIPTION

=item Security

=over 4

=item Fix out of boundary access in Win32 path handling

=item Fix loss of taint in C<canonpath()>

=item Set proper umask before calling C<mkstemp(3)>

=item Avoid accessing uninitialized memory in Win32 C<crypt()>

=item Remove duplicate environment variables from C<environ>

=back

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Documentation

=over 4

=item Changes to Existing Documentation

=back

=item Configuration and Compilation

=item Platform Support

=over 4

=item Platform-Specific Notes

Darwin, OS X/Darwin, ppc64el, Tru64

=back

=item Internal Changes

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5221delta - what is new for perl v5.22.1

=over 4

=item DESCRIPTION

=item Incompatible Changes

=over 4

=item Bounds Checking Constructs

=back

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Documentation

=over 4

=item Changes to Existing Documentation

=back

=item Diagnostics

=over 4

=item Changes to Existing Diagnostics

=back

=item Configuration and Compilation

=item Platform Support

=over 4

=item Platform-Specific Notes

IRIX

=back

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5220delta - what is new for perl v5.22.0

=over 4

=item DESCRIPTION

=item Core Enhancements

=over 4

=item New bitwise operators

=item New double-diamond operator

=item New C<\b> boundaries in regular expressions

=item Non-Capturing Regular Expression Flag

=item C<use re 'strict'>

=item Unicode 7.0 (with correction) is now supported

=item S<C<use locale>> can restrict which locale categories are affected

=item Perl now supports POSIX 2008 locale currency additions

=item Better heuristics on older platforms for determining locale UTF-8ness

=item Aliasing via reference

=item C<prototype> with no arguments

=item New C<:const> subroutine attribute

=item C<fileno> now works on directory handles

=item List form of pipe open implemented for Win32

=item Assignment to list repetition

=item Infinity and NaN (not-a-number) handling improved

=item Floating point parsing has been improved

=item Packing infinity or not-a-number into a character is now fatal

=item Experimental C Backtrace API

=back

=item Security

=over 4

=item Perl is now compiled with C<-fstack-protector-strong> if available

=item The L<Safe> module could allow outside packages to be replaced

=item Perl is now always compiled with C<-D_FORTIFY_SOURCE=2> if available

=back

=item Incompatible Changes

=over 4

=item Subroutine signatures moved before attributes

=item C<&> and C<\&> prototypes accept only subs

=item C<use encoding> is now lexical

=item List slices returning empty lists

=item C<\N{}> with a sequence of multiple spaces is now a fatal error

=item S<C<use UNIVERSAL '...'>> is now a fatal error

=item In double-quotish C<\cI<X>>, I<X> must now be a printable ASCII
character

=item Splitting the tokens C<(?> and C<(*> in regular expressions is now a
fatal compilation error.

=item C<qr/foo/x> now ignores all Unicode pattern white space

=item Comment lines within S<C<(?[ ])>> are now ended only by a C<\n>

=item C<(?[...])> operators now follow standard Perl precedence

=item Omitting C<%> and C<@> on hash and array names is no longer permitted

=item C<"$!"> text is now in English outside the scope of C<use locale>

=item C<"$!"> text will be returned in UTF-8 when appropriate

=item Support for C<?PATTERN?> without explicit operator has been removed

=item C<defined(@array)> and C<defined(%hash)> are now fatal errors

=item Using a hash or an array as a reference is now a fatal error

=item Changes to the C<*> prototype

=back

=item Deprecations

=over 4

=item Setting C<${^ENCODING}> to anything but C<undef>

=item Use of non-graphic characters in single-character variable names

=item Inlining of C<sub () { $var }> with observable side-effects

=item Use of multiple C</x> regexp modifiers

=item Using a NO-BREAK space in a character alias for C<\N{...}> is now
deprecated

=item A literal C<"{"> should now be escaped in a pattern

=item Making all warnings fatal is discouraged

=back

=item Performance Enhancements

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=item Removed Modules and Pragmata

=back

=item Documentation

=over 4

=item New Documentation

=item Changes to Existing Documentation

=back

=item Diagnostics

=over 4

=item New Diagnostics

=item Changes to Existing Diagnostics

=item Diagnostic Removals

=back

=item Utility Changes

=over 4

=item F<find2perl>, F<s2p> and F<a2p> removal

=item L<h2ph>

=item L<encguess>

=back

=item Configuration and Compilation

=item Testing

=item Platform Support

=over 4

=item Regained Platforms

IRIX and Tru64 platforms are working again, z/OS running EBCDIC Code Page
1047

=item Discontinued Platforms

NeXTSTEP/OPENSTEP

=item Platform-Specific Notes

EBCDIC, HP-UX, Android, VMS, Win32, OpenBSD, Solaris

=back

=item Internal Changes

=item Selected Bug Fixes

=item Known Problems

=item Obituary

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5203delta - what is new for perl v5.20.3

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Documentation

=over 4

=item Changes to Existing Documentation

=back

=item Utility Changes

=over 4

=item L<h2ph>

=back

=item Testing

=item Platform Support

=over 4

=item Platform-Specific Notes

Win32

=back

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5202delta - what is new for perl v5.20.2

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Documentation

=over 4

=item New Documentation

=item Changes to Existing Documentation

=back

=item Diagnostics

=over 4

=item Changes to Existing Diagnostics

=back

=item Testing

=item Platform Support

=over 4

=item Regained Platforms

=back

=item Selected Bug Fixes

=item Known Problems

=item Errata From Previous Releases

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5201delta - what is new for perl v5.20.1

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Performance Enhancements

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Documentation

=over 4

=item Changes to Existing Documentation

=back

=item Diagnostics

=over 4

=item Changes to Existing Diagnostics

=back

=item Configuration and Compilation

=item Platform Support

=over 4

=item Platform-Specific Notes

Android, OpenBSD, Solaris, VMS, Windows

=back

=item Internal Changes

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5200delta - what is new for perl v5.20.0

=over 4

=item DESCRIPTION

=item Core Enhancements

=over 4

=item Experimental Subroutine signatures

=item C<sub>s now take a C<prototype> attribute

=item More consistent prototype parsing

=item C<rand> now uses a consistent random number generator

=item New slice syntax

=item Experimental Postfix Dereferencing

=item Unicode 6.3 now supported

=item New C<\p{Unicode}> regular expression pattern property

=item Better 64-bit support

=item C<S<use locale>> now works on UTF-8 locales

=item C<S<use locale>> now compiles on systems without locale ability

=item More locale initialization fallback options

=item C<-DL> runtime option now added for tracing locale setting

=item B<-F> now implies B<-a> and B<-a> implies B<-n>

=item $a and $b warnings exemption

=back

=item Security

=over 4

=item Avoid possible read of free()d memory during parsing

=back

=item Incompatible Changes

=over 4

=item C<do> can no longer be used to call subroutines

=item Quote-like escape changes

=item Tainting happens under more circumstances; now conforms to
documentation

=item C<\p{}>, C<\P{}> matching has changed for non-Unicode code
points.

=item C<\p{All}> has been expanded to match all possible code points

=item Data::Dumper's output may change

=item Locale decimal point character no longer leaks outside of S<C<use
locale>> scope

=item Assignments of Windows sockets error codes to $! now prefer
F<errno.h> values over WSAGetLastError() values

=item Functions C<PerlIO_vsprintf> and C<PerlIO_sprintf> have been removed

=back

=item Deprecations

=over 4

=item The C</\C/> character class

=item Literal control characters in variable names

=item References to non-integers and non-positive integers in C<$/>

=item Character matching routines in POSIX

=item Interpreter-based threads are now I<discouraged>

=item Module removals

L<CGI> and its associated CGI:: packages, L<inc::latest>,
L<Package::Constants>, L<Module::Build> and its associated Module::Build::
packages

=item Utility removals

L<find2perl>, L<s2p>, L<a2p>

=back

=item Performance Enhancements

=item Modules and Pragmata

=over 4

=item New Modules and Pragmata

=item Updated Modules and Pragmata

=back

=item Documentation

=over 4

=item New Documentation

=item Changes to Existing Documentation

=back

=item Diagnostics

=over 4

=item New Diagnostics

=item Changes to Existing Diagnostics

=back

=item Utility Changes

=item Configuration and Compilation

=item Testing

=item Platform Support

=over 4

=item New Platforms

Android, Bitrig, FreeMiNT, Synology

=item Discontinued Platforms

C<sfio>, AT&T 3b1, DG/UX, EBCDIC

=item Platform-Specific Notes

Cygwin, GNU/Hurd, Linux, Mac OS, MidnightBSD, Mixed-endian platforms, VMS,
Win32, WinCE

=back

=item Internal Changes

=item Selected Bug Fixes

=over 4

=item Regular Expressions

=item Perl 5 Debugger and -d

=item Lexical Subroutines

=item Everything Else

=back

=item Known Problems

=item Obituary

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5184delta - what is new for perl v5.18.4

=over 4

=item DESCRIPTION

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Platform Support

=over 4

=item Platform-Specific Notes

Win32

=back

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5182delta - what is new for perl v5.18.2

=over 4

=item DESCRIPTION

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Documentation

=over 4

=item Changes to Existing Documentation

=back

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5181delta - what is new for perl v5.18.1

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Platform Support

=over 4

=item Platform-Specific Notes

AIX, MidnightBSD

=back

=item Selected Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5180delta - what is new for perl v5.18.0

=over 4

=item DESCRIPTION

=item Core Enhancements

=over 4

=item New mechanism for experimental features

=item Hash overhaul

=item Upgrade to Unicode 6.2

=item Character name aliases may now include non-Latin1-range characters

=item New DTrace probes

=item C<${^LAST_FH}>

=item Regular Expression Set Operations

=item Lexical subroutines

=item Computed Labels

=item More CORE:: subs

=item C<kill> with negative signal names

=back

=item Security

=over 4

=item See also: hash overhaul

=item C<Storable> security warning in documentation

=item C<Locale::Maketext> allowed code injection via a malicious template

=item Avoid calling memset with a negative count

=back

=item Incompatible Changes

=over 4

=item See also: hash overhaul

=item An unknown character name in C<\N{...}> is now a syntax error

=item Formerly deprecated characters in C<\N{}> character name aliases are
now errors.

=item C<\N{BELL}> now refers to U+1F514 instead of U+0007

=item New Restrictions in Multi-Character Case-Insensitive Matching in
Regular Expression Bracketed Character Classes

=item Explicit rules for variable names and identifiers

=item Vertical tabs are now whitespace

=item C</(?{})/> and C</(??{})/> have been heavily reworked

=item Stricter parsing of substitution replacement

=item C<given> now aliases the global C<$_>

=item The smartmatch family of features are now experimental

=item Lexical C<$_> is now experimental

=item readline() with C<$/ = \N> now reads N characters, not N bytes

=item Overridden C<glob> is now passed one argument

=item Here doc parsing

=item Alphanumeric operators must now be separated from the closing
delimiter of regular expressions

=item qw(...) can no longer be used as parentheses

=item Interaction of lexical and default warnings

=item C<state sub> and C<our sub>

=item Defined values stored in environment are forced to byte strings

=item C<require> dies for unreadable files

=item C<gv_fetchmeth_*> and SUPER

=item C<split>'s first argument is more consistently interpreted

=back

=item Deprecations

=over 4

=item Module removals

L<encoding>, L<Archive::Extract>, L<B::Lint>, L<B::Lint::Debug>,
L<CPANPLUS> and all included C<CPANPLUS::*> modules,
L<Devel::InnerPackage>, L<Log::Message>, L<Log::Message::Config>,
L<Log::Message::Handlers>, L<Log::Message::Item>, L<Log::Message::Simple>,
L<Module::Pluggable>, L<Module::Pluggable::Object>, L<Object::Accessor>,
L<Pod::LaTeX>, L<Term::UI>, L<Term::UI::History>

=item Deprecated Utilities

L<cpanp>, C<cpanp-run-perl>, L<cpan2dist>, L<pod2latex>

=item PL_sv_objcount

=item Five additional characters should be escaped in patterns with C</x>

=item User-defined charnames with surprising whitespace

=item Various XS-callable functions are now deprecated

=item Certain rare uses of backslashes within regexes are now deprecated

=item Splitting the tokens C<(?> and C<(*> in regular expressions

=item Pre-PerlIO IO implementations

=back

=item Future Deprecations

DG/UX, NeXT

=item Performance Enhancements

=item Modules and Pragmata

=over 4

=item New Modules and Pragmata

=item Updated Modules and Pragmata

=item Removed Modules and Pragmata

=back

=item Documentation

=over 4

=item Changes to Existing Documentation

=item New Diagnostics

=item Changes to Existing Diagnostics

=back

=item Utility Changes

=item Configuration and Compilation

=item Testing

=item Platform Support

=over 4

=item Discontinued Platforms

BeOS, UTS Global, VM/ESA, MPE/IX, EPOC, Rhapsody

=item Platform-Specific Notes

=back

=item Internal Changes

=item Selected Bug Fixes

=item Known Problems

=item Obituary

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5163delta - what is new for perl v5.16.3

=over 4

=item DESCRIPTION

=item Core Enhancements

=item Security

=over 4

=item CVE-2013-1667: memory exhaustion with arbitrary hash keys

=item wrap-around with IO on long strings

=item memory leak in Encode

=back

=item Incompatible Changes

=item Deprecations

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Known Problems

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5162delta - what is new for perl v5.16.2

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Configuration and Compilation

Configuration should no longer be confused by ls colorization

=item Platform Support

=over 4

=item Platform-Specific Notes

AIX

=back

=item Selected Bug Fixes

Fix /\h/ equivalence with /[\h]/

=item Known Problems

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5161delta - what is new for perl v5.16.1

=over 4

=item DESCRIPTION

=item Security

=over 4

=item an off-by-two error in Scalar-List-Util has been fixed

=back

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules and Pragmata

=back

=item Configuration and Compilation

=item Platform Support

=over 4

=item Platform-Specific Notes

VMS

=back

=item Selected Bug Fixes

=item Known Problems

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5160delta - what is new for perl v5.16.0

=over 4

=item DESCRIPTION

=item Notice

=item Core Enhancements

=over 4

=item C<use I<VERSION>>

=item C<__SUB__>

=item New and Improved Built-ins

=item Unicode Support

=item XS Changes

=item Changes to Special Variables

=item Debugger Changes

=item The C<CORE> Namespace

=item Other Changes

=back

=item Security

=over 4

=item Use C<is_utf8_char_buf()> and not C<is_utf8_char()>

=item Malformed UTF-8 input could cause attempts to read beyond the end of
the buffer

=item C<File::Glob::bsd_glob()> memory error with GLOB_ALTDIRFUNC
(CVE-2011-2728).

=item Privileges are now set correctly when assigning to C<$(>

=back

=item Deprecations

=over 4

=item Don't read the Unicode database files in F<lib/unicore>

=item XS functions C<is_utf8_char()>, C<utf8_to_uvchr()> and
C<utf8_to_uvuni()>

=back

=item Future Deprecations

=over 4

=item Core Modules

=item Platforms with no supporting programmers

=item Other Future Deprecations

=back

=item Incompatible Changes

=over 4

=item Special blocks called in void context

=item The C<overloading> pragma and regexp objects

=item Two XS typemap Entries removed

=item Unicode 6.1 has incompatibilities with Unicode 6.0

=item Borland compiler

=item Certain deprecated Unicode properties are no longer supported by
default

=item Dereferencing IO thingies as typeglobs

=item User-defined case-changing operations

=item XSUBs are now 'static'

=item Weakening read-only references

=item Tying scalars that hold typeglobs

=item IPC::Open3 no longer provides C<xfork()>, C<xclose_on_exec()>
and C<xpipe_anon()>

=item C<$$> no longer caches PID

=item C<$$> and C<getppid()> no longer emulate POSIX semantics under
LinuxThreads

=item C<< $< >>, C<< $> >>, C<$(> and C<$)> are no longer cached

=item Which Non-ASCII characters get quoted by C<quotemeta> and C<\Q> has
changed

=back

=item Performance Enhancements

=item Modules and Pragmata

=over 4

=item Deprecated Modules

L<Version::Requirements>

=item New Modules and Pragmata

=item Updated Modules and Pragmata

=item Removed Modules and Pragmata

=back

=item Documentation

=over 4

=item New Documentation

=item Changes to Existing Documentation

=item Removed Documentation

=back

=item Diagnostics

=over 4

=item New Diagnostics

=item Removed Errors

=item Changes to Existing Diagnostics

=back

=item Utility Changes

=item Configuration and Compilation

=item Platform Support

=over 4

=item Platform-Specific Notes

=back

=item Internal Changes

=item Selected Bug Fixes

=over 4

=item Array and hash

=item C API fixes

=item Compile-time hints

=item Copy-on-write scalars

=item The debugger

=item Dereferencing operators

=item Filehandle, last-accessed

=item Filetests and C<stat>

=item Formats

=item C<given> and C<when>

=item The C<glob> operator

=item Lvalue subroutines

=item Overloading

=item Prototypes of built-in keywords

=item Regular expressions

=item Smartmatching

=item The C<sort> operator

=item The C<substr> operator

=item Support for embedded nulls

=item Threading bugs

=item Tied variables

=item Version objects and vstrings

=item Warnings, redefinition

=item Warnings, "Uninitialized"

=item Weak references

=item Other notable fixes

=back

=item Known Problems

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5144delta - what is new for perl v5.14.4

=over 4

=item DESCRIPTION

=item Core Enhancements

=item Security

=over 4

=item CVE-2013-1667: memory exhaustion with arbitrary hash keys

=item memory leak in Encode

=item [perl #111594] Socket::unpack_sockaddr_un heap-buffer-overflow

=item [perl #111586] SDBM_File: fix off-by-one access to global ".dir"

=item off-by-two error in List::Util

=item [perl #115994] fix segv in regcomp.c:S_join_exact()

=item [perl #115992] PL_eval_start use-after-free

=item wrap-around with IO on long strings

=back

=item Incompatible Changes

=item Deprecations

=item Modules and Pragmata

=over 4

=item New Modules and Pragmata

=item Updated Modules and Pragmata

Socket, SDBM_File, List::Util

=item Removed Modules and Pragmata

=back

=item Documentation

=over 4

=item New Documentation

=item Changes to Existing Documentation

=back

=item Diagnostics

=item Utility Changes

=item Configuration and Compilation

=item Platform Support

=over 4

=item New Platforms

=item Discontinued Platforms

=item Platform-Specific Notes

VMS

=back

=item Selected Bug Fixes

=item Known Problems

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5143delta - what is new for perl v5.14.3

=over 4

=item DESCRIPTION

=item Core Enhancements

=item Security

=over 4

=item C<Digest> unsafe use of eval (CVE-2011-3597)

=item Heap buffer overrun in 'x' string repeat operator (CVE-2012-5195)

=back

=item Incompatible Changes

=item Deprecations

=item Modules and Pragmata

=over 4

=item New Modules and Pragmata

=item Updated Modules and Pragmata

=item Removed Modules and Pragmata

=back

=item Documentation

=over 4

=item New Documentation

=item Changes to Existing Documentation

=back

=item Configuration and Compilation

=item Platform Support

=over 4

=item New Platforms

=item Discontinued Platforms

=item Platform-Specific Notes

FreeBSD, Solaris and NetBSD, HP-UX, Linux, Mac OS X, GNU/Hurd, NetBSD

=back

=item Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5142delta - what is new for perl v5.14.2

=over 4

=item DESCRIPTION

=item Core Enhancements

=item Security

=over 4

=item C<File::Glob::bsd_glob()> memory error with GLOB_ALTDIRFUNC
(CVE-2011-2728).

=item C<Encode> decode_xs n-byte heap-overflow (CVE-2011-2939)

=back

=item Incompatible Changes

=item Deprecations

=item Modules and Pragmata

=over 4

=item New Modules and Pragmata

=item Updated Modules and Pragmata

=item Removed Modules and Pragmata

=back

=item Platform Support

=over 4

=item New Platforms

=item Discontinued Platforms

=item Platform-Specific Notes

HP-UX PA-RISC/64 now supports gcc-4.x, Building on OS X 10.7 Lion and Xcode
4 works again

=back

=item Bug Fixes

=item Known Problems

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5141delta - what is new for perl v5.14.1

=over 4

=item DESCRIPTION

=item Core Enhancements

=item Security

=item Incompatible Changes

=item Deprecations

=item Modules and Pragmata

=over 4

=item New Modules and Pragmata

=item Updated Modules and Pragmata

=item Removed Modules and Pragmata

=back

=item Documentation

=over 4

=item New Documentation

=item Changes to Existing Documentation

=back

=item Diagnostics

=over 4

=item New Diagnostics

=item Changes to Existing Diagnostics

=back

=item Utility Changes

=item Configuration and Compilation

=item Testing

=item Platform Support

=over 4

=item New Platforms

=item Discontinued Platforms

=item Platform-Specific Notes

=back

=item Internal Changes

=item Bug Fixes

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5140delta - what is new for perl v5.14.0

=over 4

=item DESCRIPTION

=item Notice

=item Core Enhancements

=over 4

=item Unicode

=item Regular Expressions

=item Syntactical Enhancements

=item Exception Handling

=item Other Enhancements

C<-d:-foo>, C<-d:-foo=bar>

=item New C APIs

=back

=item Security

=over 4

=item User-defined regular expression properties

=back

=item Incompatible Changes

=over 4

=item Regular Expressions and String Escapes

=item Stashes and Package Variables

=item Changes to Syntax or to Perl Operators

=item Threads and Processes

=item Configuration

=back

=item Deprecations

=over 4

=item Omitting a space between a regular expression and subsequent word

=item C<\cI<X>>

=item C<"\b{"> and C<"\B{">

=item Perl 4-era .pl libraries

=item List assignment to C<$[>

=item Use of qw(...) as parentheses

=item C<\N{BELL}>

=item C<?PATTERN?>

=item Tie functions on scalars holding typeglobs

=item User-defined case-mapping

=item Deprecated modules

L<Devel::DProf>

=back

=item Performance Enhancements

=over 4

=item "Safe signals" optimisation

=item Optimisation of shift() and pop() calls without arguments

=item Optimisation of regexp engine string comparison work

=item Regular expression compilation speed-up

=item String appending is 100 times faster

=item Eliminate C<PL_*> accessor functions under ithreads

=item Freeing weak references

=item Lexical array and hash assignments

=item C<@_> uses less memory

=item Size optimisations to SV and HV structures

=item Memory consumption improvements to Exporter

=item Memory savings for weak references

=item C<%+> and C<%-> use less memory

=item Multiple small improvements to threads

=item Adjacent pairs of nextstate opcodes are now optimized away

=back

=item Modules and Pragmata

=over 4

=item New Modules and Pragmata

=item Updated Modules and Pragmata

much less configuration dialog hassle, support for F<META/MYMETA.json>,
support for L<local::lib>, support for L<HTTP::Tiny> to reduce the
dependency on FTP sites, automatic mirror selection, iron out all known
bugs in configure_requires, support for distributions compressed with
L<bzip2(1)>, allow F<Foo/Bar.pm> on the command line to mean C<Foo::Bar>,
charinfo(), charscript(), charblock()

=item Removed Modules and Pragmata

=back

=item Documentation

=over 4

=item New Documentation

=item Changes to Existing Documentation

=back

=item Diagnostics

=over 4

=item New Diagnostics

Closure prototype called, Insecure user-defined property %s, panic: gp_free
failed to free glob pointer - something is repeatedly re-creating entries,
Parsing code internal error (%s), refcnt: fd %d%s, Regexp modifier "/%c"
may not appear twice, Regexp modifiers "/%c" and "/%c" are mutually
exclusive, Using !~ with %s doesn't make sense, "\b{" is deprecated; use
"\b\{" instead, "\B{" is deprecated; use "\B\{" instead, Operation "%s"
returns its argument for .., Use of qw(...) as parentheses is deprecated

=item Changes to Existing Diagnostics

=back

=item Utility Changes

=item Configuration and Compilation

=item Platform Support

=over 4

=item New Platforms

AIX

=item Discontinued Platforms

Apollo DomainOS, MacOS Classic

=item Platform-Specific Notes

=back

=item Internal Changes

=over 4

=item New APIs

=item C API Changes

=item Deprecated C APIs

C<Perl_ptr_table_clear>, C<sv_compile_2op>, C<find_rundefsvoffset>,
C<CALL_FPTR> and C<CPERLscope>

=item Other Internal Changes

=back

=item Selected Bug Fixes

=over 4

=item I/O

=item Regular Expression Bug Fixes

=item Syntax/Parsing Bugs

=item Stashes, Globs and Method Lookup

Aliasing packages by assigning to globs [perl #77358], Deleting packages by
deleting their containing stash elements, Undefining the glob containing a
package (C<undef *Foo::>), Undefining an ISA glob (C<undef *Foo::ISA>),
Deleting an ISA stash element (C<delete $Foo::{ISA}>), Sharing @ISA arrays
between classes (via C<*Foo::ISA = \@Bar::ISA> or C<*Foo::ISA = *Bar::ISA>)
[perl #77238]

=item Unicode

=item Ties, Overloading and Other Magic

=item The Debugger

=item Threads

=item Scoping and Subroutines

=item Signals

=item Miscellaneous Memory Leaks

=item Memory Corruption and Crashes

=item Fixes to Various Perl Operators

=item Bugs Relating to the C API

=back

=item Known Problems

=item Errata

=over 4

=item keys(), values(), and each() work on arrays

=item split() and C<@_>

=back

=item Obituary

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5125delta - what is new for perl v5.12.5

=over 4

=item DESCRIPTION

=item Security

=over 4

=item C<Encode> decode_xs n-byte heap-overflow (CVE-2011-2939)

=item C<File::Glob::bsd_glob()> memory error with GLOB_ALTDIRFUNC
(CVE-2011-2728).

=item Heap buffer overrun in 'x' string repeat operator (CVE-2012-5195)

=back

=item Incompatible Changes

=item Modules and Pragmata

=over 4

=item Updated Modules

=back

=item Changes to Existing Documentation

=over 4

=item L<perlebcdic>

=item L<perlunicode>

=item L<perluniprops>

=back

=item Installation and Configuration Improvements

=over 4

=item Platform Specific Changes

Mac OS X, NetBSD

=back

=item Selected Bug Fixes

=item Errata

=over 4

=item split() and C<@_>

=back

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5124delta - what is new for perl v5.12.4

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Selected Bug Fixes

=item Modules and Pragmata

=item Testing

=item Documentation

=item Platform Specific Notes

Linux

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5123delta - what is new for perl v5.12.3

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Core Enhancements

=over 4

=item C<keys>, C<values> work on arrays

=back

=item Bug Fixes

=item Platform Specific Notes

Solaris, VMS, VOS

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5122delta - what is new for perl v5.12.2

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Core Enhancements

=item Modules and Pragmata

=over 4

=item New Modules and Pragmata

=item Pragmata Changes

=item Updated Modules

C<Carp>, C<CPANPLUS>, C<File::Glob>, C<File::Copy>, C<File::Spec>

=back

=item Utility Changes

=item Changes to Existing Documentation

=item Installation and Configuration Improvements

=over 4

=item Configuration improvements

=item Compilation improvements

=back

=item Selected Bug Fixes

=item Platform Specific Notes

=over 4

=item AIX

=item Windows

=item VMS

=back

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5121delta - what is new for perl v5.12.1

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Core Enhancements

=item Modules and Pragmata

=over 4

=item Pragmata Changes

=item Updated Modules

=back

=item Changes to Existing Documentation

=item Testing

=over 4

=item Testing Improvements

=back

=item Installation and Configuration Improvements

=over 4

=item Configuration improvements

=back

=item Bug Fixes

=item Platform Specific Notes

=over 4

=item HP-UX

=item AIX

=item FreeBSD 7

=item VMS

=back

=item Known Problems

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5120delta - what is new for perl v5.12.0

=over 4

=item DESCRIPTION

=item Core Enhancements

=over 4

=item New C<package NAME VERSION> syntax

=item The C<...> operator

=item Implicit strictures

=item Unicode improvements

=item Y2038 compliance

=item qr overloading

=item Pluggable keywords

=item APIs for more internals

=item Overridable function lookup

=item A proper interface for pluggable Method Resolution Orders

=item C<\N> experimental regex escape

=item DTrace support

=item Support for C<configure_requires> in CPAN module metadata

=item C<each>, C<keys>, C<values> are now more flexible

=item C<when> as a statement modifier

=item C<$,> flexibility

=item // in when clauses

=item Enabling warnings from your shell environment

=item C<delete local>

=item New support for Abstract namespace sockets

=item 32-bit limit on substr arguments removed

=back

=item Potentially Incompatible Changes

=over 4

=item Deprecations warn by default

=item Version number formats

=item @INC reorganization

=item REGEXPs are now first class

=item Switch statement changes

flip-flop operators, defined-or operator

=item Smart match changes

=item Other potentially incompatible changes

=back

=item Deprecations

suidperl, Use of C<:=> to mean an empty attribute list, C<<
UNIVERSAL->import() >>, Use of "goto" to jump into a construct, Custom
character names in \N{name} that don't look like names, Deprecated Modules,
L<Class::ISA>, L<Pod::Plainer>, L<Shell>, L<Switch>, Assignment to $[, Use
of the attribute :locked on subroutines, Use of "locked" with the
attributes pragma, Use of "unique" with the attributes pragma, Perl_pmflag,
Numerous Perl 4-era libraries

=item Unicode overhaul

=item Modules and Pragmata

=over 4

=item New Modules and Pragmata

C<autodie>, C<Compress::Raw::Bzip2>, C<overloading>, C<parent>,
C<Parse::CPAN::Meta>, C<VMS::DCLsym>, C<VMS::Stdio>,
C<XS::APItest::KeywordRPN>

=item Updated Pragmata

C<base>, C<bignum>, C<charnames>, C<constant>, C<diagnostics>, C<feature>,
C<less>, C<lib>, C<mro>, C<overload>, C<threads>, C<threads::shared>,
C<version>, C<warnings>

=item Updated Modules

C<Archive::Extract>, C<Archive::Tar>, C<Attribute::Handlers>,
C<AutoLoader>, C<B::Concise>, C<B::Debug>, C<B::Deparse>, C<B::Lint>,
C<CGI>, C<Class::ISA>, C<Compress::Raw::Zlib>, C<CPAN>, C<CPANPLUS>,
C<CPANPLUS::Dist::Build>, C<Data::Dumper>, C<DB_File>, C<Devel::PPPort>,
C<Digest>, C<Digest::MD5>, C<Digest::SHA>, C<Encode>, C<Exporter>,
C<ExtUtils::CBuilder>, C<ExtUtils::Command>, C<ExtUtils::Constant>,
C<ExtUtils::Install>, C<ExtUtils::MakeMaker>, C<ExtUtils::Manifest>,
C<ExtUtils::ParseXS>, C<File::Fetch>, C<File::Path>, C<File::Temp>,
C<Filter::Simple>, C<Filter::Util::Call>, C<Getopt::Long>, C<IO>,
C<IO::Zlib>, C<IPC::Cmd>, C<IPC::SysV>, C<Locale::Maketext>,
C<Locale::Maketext::Simple>, C<Log::Message>, C<Log::Message::Simple>,
C<Math::BigInt>, C<Math::BigInt::FastCalc>, C<Math::BigRat>,
C<Math::Complex>, C<Memoize>, C<MIME::Base64>, C<Module::Build>,
C<Module::CoreList>, C<Module::Load>, C<Module::Load::Conditional>,
C<Module::Loaded>, C<Module::Pluggable>, C<Net::Ping>, C<NEXT>,
C<Object::Accessor>, C<Package::Constants>, C<PerlIO>, C<Pod::Parser>,
C<Pod::Perldoc>, C<Pod::Plainer>, C<Pod::Simple>, C<Safe>, C<SelfLoader>,
C<Storable>, C<Switch>, C<Sys::Syslog>, C<Term::ANSIColor>, C<Term::UI>,
C<Test>, C<Test::Harness>, C<Test::Simple>, C<Text::Balanced>,
C<Text::ParseWords>, C<Text::Soundex>, C<Thread::Queue>,
C<Thread::Semaphore>, C<Tie::RefHash>, C<Time::HiRes>, C<Time::Local>,
C<Time::Piece>, C<Unicode::Collate>, C<Unicode::Normalize>, C<Win32>,
C<Win32API::File>, C<XSLoader>

=item Removed Modules and Pragmata

C<attrs>, C<CPAN::API::HOWTO>, C<CPAN::DeferedCode>, C<CPANPLUS::inc>,
C<DCLsym>, C<ExtUtils::MakeMaker::bytes>, C<ExtUtils::MakeMaker::vmsish>,
C<Stdio>, C<Test::Harness::Assert>, C<Test::Harness::Iterator>,
C<Test::Harness::Point>, C<Test::Harness::Results>,
C<Test::Harness::Straps>, C<Test::Harness::Util>, C<XSSymSet>

=item Deprecated Modules and Pragmata

=back

=item Documentation

=over 4

=item New Documentation

=item Changes to Existing Documentation

=back

=item Selected Performance Enhancements

=item Installation and Configuration Improvements

=item Internal Changes

=item Testing

=over 4

=item Testing improvements

Parallel tests, Test harness flexibility, Test watchdog

=item New Tests

=back

=item New or Changed Diagnostics

=over 4

=item New Diagnostics

=item Changed Diagnostics

C<Illegal character in prototype for %s : %s>, C<Prototype after '%c' for
%s : %s>

=back

=item Utility Changes

=item Selected Bug Fixes

=item Platform Specific Changes

=over 4

=item New Platforms

Haiku, MirOS BSD

=item Discontinued Platforms

Domain/OS, MiNT, Tenon MachTen

=item Updated Platforms

AIX, Cygwin, Darwin (Mac OS X), DragonFly BSD, FreeBSD, Irix, NetBSD,
OpenVMS, Stratus VOS, Symbian, Windows

=back

=item Known Problems

=item Errata

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5101delta - what is new for perl v5.10.1

=over 4

=item DESCRIPTION

=item Incompatible Changes

=over 4

=item Switch statement changes

flip-flop operators, defined-or operator

=item Smart match changes

=item Other incompatible changes

=back

=item Core Enhancements

=over 4

=item Unicode Character Database 5.1.0

=item A proper interface for pluggable Method Resolution Orders

=item The C<overloading> pragma

=item Parallel tests

=item DTrace support

=item Support for C<configure_requires> in CPAN module metadata

=back

=item Modules and Pragmata

=over 4

=item New Modules and Pragmata

C<autodie>, C<Compress::Raw::Bzip2>, C<parent>, C<Parse::CPAN::Meta>

=item Pragmata Changes

C<attributes>, C<attrs>, C<base>, C<bigint>, C<bignum>, C<bigrat>,
C<charnames>, C<constant>, C<feature>, C<fields>, C<lib>, C<open>,
C<overload>, C<overloading>, C<version>

=item Updated Modules

C<Archive::Extract>, C<Archive::Tar>, C<Attribute::Handlers>,
C<AutoLoader>, C<AutoSplit>, C<B>, C<B::Debug>, C<B::Deparse>, C<B::Lint>,
C<B::Xref>, C<Benchmark>, C<Carp>, C<CGI>, C<Compress::Zlib>, C<CPAN>,
C<CPANPLUS>, C<CPANPLUS::Dist::Build>, C<Cwd>, C<Data::Dumper>, C<DB>,
C<DB_File>, C<Devel::PPPort>, C<Digest::MD5>, C<Digest::SHA>, C<DirHandle>,
C<Dumpvalue>, C<DynaLoader>, C<Encode>, C<Errno>, C<Exporter>,
C<ExtUtils::CBuilder>, C<ExtUtils::Command>, C<ExtUtils::Constant>,
C<ExtUtils::Embed>, C<ExtUtils::Install>, C<ExtUtils::MakeMaker>,
C<ExtUtils::Manifest>, C<ExtUtils::ParseXS>, C<Fatal>, C<File::Basename>,
C<File::Compare>, C<File::Copy>, C<File::Fetch>, C<File::Find>,
C<File::Path>, C<File::Spec>, C<File::stat>, C<File::Temp>, C<FileCache>,
C<FileHandle>, C<Filter::Simple>, C<Filter::Util::Call>, C<FindBin>,
C<GDBM_File>, C<Getopt::Long>, C<Hash::Util::FieldHash>, C<I18N::Collate>,
C<IO>, C<IO::Compress::*>, C<IO::Dir>, C<IO::Handle>, C<IO::Socket>,
C<IO::Zlib>, C<IPC::Cmd>, C<IPC::Open3>, C<IPC::SysV>, C<lib>,
C<List::Util>, C<Locale::Maketext>, C<Log::Message>, C<Math::BigFloat>,
C<Math::BigInt>, C<Math::BigInt::FastCalc>, C<Math::BigRat>,
C<Math::Complex>, C<Math::Trig>, C<Memoize>, C<Module::Build>,
C<Module::CoreList>, C<Module::Load>, C<Module::Load::Conditional>,
C<Module::Loaded>, C<Module::Pluggable>, C<NDBM_File>, C<Net::Ping>,
C<NEXT>, C<Object::Accessor>, C<OS2::REXX>, C<Package::Constants>,
C<PerlIO>, C<PerlIO::via>, C<Pod::Man>, C<Pod::Parser>, C<Pod::Simple>,
C<Pod::Text>, C<POSIX>, C<Safe>, C<Scalar::Util>, C<SelectSaver>,
C<SelfLoader>, C<Socket>, C<Storable>, C<Switch>, C<Symbol>,
C<Sys::Syslog>, C<Term::ANSIColor>, C<Term::ReadLine>, C<Term::UI>,
C<Test::Harness>, C<Test::Simple>, C<Text::ParseWords>, C<Text::Tabs>,
C<Text::Wrap>, C<Thread::Queue>, C<Thread::Semaphore>, C<threads>,
C<threads::shared>, C<Tie::RefHash>, C<Tie::StdHandle>, C<Time::HiRes>,
C<Time::Local>, C<Time::Piece>, C<Unicode::Normalize>, C<Unicode::UCD>,
C<UNIVERSAL>, C<Win32>, C<Win32API::File>, C<XSLoader>

=back

=item Utility Changes

F<h2ph>, F<h2xs>, F<perl5db.pl>, F<perlthanks>

=item New Documentation

L<perlhaiku>, L<perlmroapi>, L<perlperf>, L<perlrepository>, L<perlthanks>

=item Changes to Existing Documentation

=item Performance Enhancements

=item Installation and Configuration Improvements

=over 4

=item F<ext/> reorganisation

=item Configuration improvements

=item Compilation improvements

=item Platform Specific Changes

AIX, Cygwin, FreeBSD, Irix, Haiku, MirOS BSD, NetBSD, Stratus VOS, Symbian,
Win32, VMS

=back

=item Selected Bug Fixes

=item New or Changed Diagnostics

C<panic: sv_chop %s>, C<Can't locate package %s for the parents of %s>,
C<v-string in use/require is non-portable>, C<Deep recursion on subroutine
"%s">

=item Changed Internals

C<SVf_UTF8>, C<SVs_TEMP>

=item New Tests

t/comp/retainedlines.t, t/io/perlio_fail.t, t/io/perlio_leaks.t,
t/io/perlio_open.t, t/io/perlio.t, t/io/pvbm.t, t/mro/package_aliases.t,
t/op/dbm.t, t/op/index_thr.t, t/op/pat_thr.t, t/op/qr_gc.t,
t/op/reg_email_thr.t, t/op/regexp_qr_embed_thr.t,
t/op/regexp_unicode_prop.t, t/op/regexp_unicode_prop_thr.t,
t/op/reg_nc_tie.t, t/op/reg_posixcc.t, t/op/re.t, t/op/setpgrpstack.t,
t/op/substr_thr.t, t/op/upgrade.t, t/uni/lex_utf8.t, t/uni/tie.t

=item Known Problems

=item Deprecations

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl5100delta - what is new for perl v5.10.0

=over 4

=item DESCRIPTION

=item Core Enhancements

=over 4

=item The C<feature> pragma

=item New B<-E> command-line switch

=item Defined-or operator

=item Switch and Smart Match operator

=item Regular expressions

Recursive Patterns, Named Capture Buffers, Possessive Quantifiers,
Backtracking control verbs, Relative backreferences, C<\K> escape, Vertical
and horizontal whitespace, and linebreak, Optional pre-match and post-match
captures with the /p flag

=item C<say()>

=item Lexical C<$_>

=item The C<_> prototype

=item UNITCHECK blocks

=item New Pragma, C<mro>

=item readdir() may return a "short filename" on Windows

=item readpipe() is now overridable

=item Default argument for readline()

=item state() variables

=item Stacked filetest operators

=item UNIVERSAL::DOES()

=item Formats

=item Byte-order modifiers for pack() and unpack()

=item C<no VERSION>

=item C<chdir>, C<chmod> and C<chown> on filehandles

=item OS groups

=item Recursive sort subs

=item Exceptions in constant folding

=item Source filters in @INC

=item New internal variables

C<${^RE_DEBUG_FLAGS}>, C<${^CHILD_ERROR_NATIVE}>, C<${^RE_TRIE_MAXBUF}>,
C<${^WIN32_SLOPPY_STAT}>

=item Miscellaneous

=item UCD 5.0.0

=item MAD

=item kill() on Windows

=back

=item Incompatible Changes

=over 4

=item Packing and UTF-8 strings

=item Byte/character count feature in unpack()

=item The C<$*> and C<$#> variables have been removed

=item substr() lvalues are no longer fixed-length

=item Parsing of C<-f _>

=item C<:unique>

=item Effect of pragmas in eval

=item chdir FOO

=item Handling of .pmc files

=item $^V is now a C<version> object instead of a v-string

=item @- and @+ in patterns

=item $AUTOLOAD can now be tainted

=item Tainting and printf

=item undef and signal handlers

=item strictures and dereferencing in defined()

=item C<(?p{})> has been removed

=item Pseudo-hashes have been removed

=item Removal of the bytecode compiler and of perlcc

=item Removal of the JPL

=item Recursive inheritance detected earlier

=item warnings::enabled and warnings::warnif changed to favor users of
modules

=back

=item Modules and Pragmata

=over 4

=item Upgrading individual core modules

=item Pragmata Changes

C<feature>, C<mro>, Scoping of the C<sort> pragma, Scoping of C<bignum>,
C<bigint>, C<bigrat>, C<base>, C<strict> and C<warnings>, C<version>,
C<warnings>, C<less>

=item New modules

=item Selected Changes to Core Modules

C<Attribute::Handlers>, C<B::Lint>, C<B>, C<Thread>

=back

=item Utility Changes

perl -d, ptar, ptardiff, shasum, corelist, h2ph and h2xs, perlivp,
find2perl, config_data, cpanp, cpan2dist, pod2html

=item New Documentation

=item Performance Enhancements

=over 4

=item In-place sorting

=item Lexical array access

=item XS-assisted SWASHGET

=item Constant subroutines

=item C<PERL_DONT_CREATE_GVSV>

=item Weak references are cheaper

=item sort() enhancements

=item Memory optimisations

=item UTF-8 cache optimisation

=item Sloppy stat on Windows

=item Regular expressions optimisations

Engine de-recursivised, Single char char-classes treated as literals, Trie
optimisation of literal string alternations, Aho-Corasick start-point
optimisation

=back

=item Installation and Configuration Improvements

=over 4

=item Configuration improvements

C<-Dusesitecustomize>, Relocatable installations, strlcat() and strlcpy(),
C<d_pseudofork> and C<d_printf_format_null>, Configure help

=item Compilation improvements

Parallel build, Borland's compilers support, Static build on Windows,
ppport.h files, C++ compatibility, Support for Microsoft 64-bit compiler,
Visual C++, Win32 builds

=item Installation improvements

Module auxiliary files

=item New Or Improved Platforms

=back

=item Selected Bug Fixes

strictures in regexp-eval blocks, Calling CORE::require(), Subscripts of
slices, C<no warnings 'category'> works correctly with -w, threads
improvements, chr() and negative values, PERL5SHELL and tainting, Using
*FILE{IO}, Overloading and reblessing, Overloading and UTF-8, eval memory
leaks fixed, Random device on Windows, PERLIO_DEBUG, PerlIO::scalar and
read-only scalars, study() and UTF-8, Critical signals, @INC-hook fix,
C<-t> switch fix, Duping UTF-8 filehandles, Localisation of hash elements

=item New or Changed Diagnostics

Use of uninitialized value, Deprecated use of my() in false conditional,
!=~ should be !~, Newline in left-justified string, Too late for "-T"
option, "%s" variable %s masks earlier declaration,
readdir()/closedir()/etc. attempted on invalid dirhandle, Opening
dirhandle/filehandle %s also as a file/directory, Use of -P is deprecated,
v-string in use/require is non-portable, perl -V

=item Changed Internals

=over 4

=item Reordering of SVt_* constants

=item Elimination of SVt_PVBM

=item New type SVt_BIND

=item Removal of CPP symbols

=item Less space is used by ops

=item New parser

=item Use of C<const>

=item Mathoms

=item C<AvFLAGS> has been removed

=item C<av_*> changes

=item $^H and %^H

=item B:: modules inheritance changed

=item Anonymous hash and array constructors

=back

=item Known Problems

=over 4

=item UTF-8 problems

=back

=item Platform Specific Problems

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl589delta - what is new for perl v5.8.9

=over 4

=item DESCRIPTION

=item Notice

=item Incompatible Changes

=item Core Enhancements

=over 4

=item Unicode Character Database 5.1.0

=item stat and -X on directory handles

=item Source filters in @INC

=item Exceptions in constant folding

=item C<no VERSION>

=item Improved internal UTF-8 caching code

=item Runtime relocatable installations

=item New internal variables

C<${^CHILD_ERROR_NATIVE}>, C<${^UTF8CACHE}>

=item C<readpipe> is now overridable

=item simple exception handling macros

=item -D option enhancements

=item XS-assisted SWASHGET

=item Constant subroutines

=back

=item New Platforms

=item Modules and Pragmata

=over 4

=item New Modules

=item Updated Modules

=back

=item Utility Changes

=over 4

=item debugger upgraded to version 1.31

=item F<perlthanks>

=item F<perlbug>

=item F<h2xs>

=item F<h2ph>

=back

=item New Documentation

=item Changes to Existing Documentation

=item Performance Enhancements

=item Installation and Configuration Improvements

=over 4

=item Relocatable installations

=item Configuration improvements

=item Compilation improvements

=item Installation improvements

=item Platform Specific Changes

=back

=item Selected Bug Fixes

=over 4

=item Unicode

=item PerlIO

=item Magic

=item Reblessing overloaded objects now works

=item C<strict> now propagates correctly into string evals

=item Other fixes

=item Platform Specific Fixes

=item Smaller fixes

=back

=item New or Changed Diagnostics

=over 4

=item panic: sv_chop %s

=item Maximal count of pending signals (%s) exceeded

=item panic: attempt to call %s in %s

=item FETCHSIZE returned a negative value

=item Can't upgrade %s (%d) to %d

=item %s argument is not a HASH or ARRAY element or a subroutine

=item Cannot make the non-overridable builtin %s fatal

=item Unrecognized character '%s' in column %d

=item Offset outside string

=item Invalid escape in the specified encoding in regexp; marked by <--
HERE in m/%s/

=item Your machine doesn't support dump/undump.

=back

=item Changed Internals

=over 4

=item Macro cleanups

=back

=item New Tests

ext/DynaLoader/t/DynaLoader.t, t/comp/fold.t, t/io/pvbm.t,
t/lib/proxy_constant_subs.t, t/op/attrhand.t, t/op/dbm.t,
t/op/inccode-tie.t, t/op/incfilter.t, t/op/kill0.t, t/op/qrstack.t,
t/op/qr.t, t/op/regexp_qr_embed.t, t/op/regexp_qr.t, t/op/rxcode.t,
t/op/studytied.t, t/op/substT.t, t/op/symbolcache.t, t/op/upgrade.t,
t/mro/package_aliases.t, t/pod/twice.t, t/run/cloexec.t, t/uni/cache.t,
t/uni/chr.t, t/uni/greek.t, t/uni/latin2.t, t/uni/overload.t, t/uni/tie.t

=item Known Problems

=item Platform Specific Notes

=over 4

=item Win32

=item OS/2

=item VMS

=back

=item Obituary

=item Acknowledgements

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl588delta - what is new for perl v5.8.8

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Core Enhancements

=item Modules and Pragmata

=item Utility Changes

=over 4

=item C<h2xs> enhancements

=item C<perlivp> enhancements

=back

=item New Documentation

=item Performance Enhancements

=item Installation and Configuration Improvements

=item Selected Bug Fixes

=over 4

=item no warnings 'category' works correctly with -w

=item Remove over-optimisation

=item sprintf() fixes

=item Debugger and Unicode slowdown

=item Smaller fixes

=back

=item New or Changed Diagnostics

=over 4

=item Attempt to set length of freed array

=item Non-string passed as bitmask

=item Search pattern not terminated or ternary operator parsed as search
pattern

=back

=item Changed Internals

=item Platform Specific Problems

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl587delta - what is new for perl v5.8.7

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Core Enhancements

=over 4

=item Unicode Character Database 4.1.0

=item suidperl less insecure

=item Optional site customization script

=item C<Config.pm> is now much smaller.

=back

=item Modules and Pragmata

=item Utility Changes

=over 4

=item find2perl enhancements

=back

=item Performance Enhancements

=item Installation and Configuration Improvements

=item Selected Bug Fixes

=item New or Changed Diagnostics

=item Changed Internals

=item Known Problems

=item Platform Specific Problems

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl586delta - what is new for perl v5.8.6

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Core Enhancements

=item Modules and Pragmata

=item Utility Changes

=item Performance Enhancements

=item Selected Bug Fixes

=item New or Changed Diagnostics

=item Changed Internals

=item New Tests

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl585delta - what is new for perl v5.8.5

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Core Enhancements

=item Modules and Pragmata

=item Utility Changes

=over 4

=item Perl's debugger

=item h2ph

=back

=item Installation and Configuration Improvements

=item Selected Bug Fixes

=item New or Changed Diagnostics

=item Changed Internals

=item Known Problems

=item Platform Specific Problems

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl584delta - what is new for perl v5.8.4

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Core Enhancements

=over 4

=item Malloc wrapping

=item Unicode Character Database 4.0.1

=item suidperl less insecure

=item format

=back

=item Modules and Pragmata

=over 4

=item Updated modules

Attribute::Handlers, B, Benchmark, CGI, Carp, Cwd, Exporter, File::Find,
IO, IPC::Open3, Locale::Maketext, Math::BigFloat, Math::BigInt,
Math::BigRat, MIME::Base64, ODBM_File, POSIX, Shell, Socket, Storable,
Switch, Sys::Syslog, Term::ANSIColor, Time::HiRes, Unicode::UCD, Win32,
base, open, threads, utf8

=back

=item Performance Enhancements

=item Utility Changes

=item Installation and Configuration Improvements

=item Selected Bug Fixes

=item New or Changed Diagnostics

=item Changed Internals

=item Future Directions

=item Platform Specific Problems

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl583delta - what is new for perl v5.8.3

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Core Enhancements

=item Modules and Pragmata

CGI, Cwd, Digest, Digest::MD5, Encode, File::Spec, FindBin, List::Util,
Math::BigInt, PodParser, Pod::Perldoc, POSIX, Unicode::Collate,
Unicode::Normalize, Test::Harness, threads::shared

=item Utility Changes

=item New Documentation

=item Installation and Configuration Improvements

=item Selected Bug Fixes

=item New or Changed Diagnostics

=item Changed Internals

=item Configuration and Building

=item Platform Specific Problems

=item Known Problems

=item Future Directions

=item Obituary

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl582delta - what is new for perl v5.8.2

=over 4

=item DESCRIPTION

=item Incompatible Changes

=item Core Enhancements

=over 4

=item Hash Randomisation

=item Threading

=back

=item Modules and Pragmata

=over 4

=item Updated Modules And Pragmata

Devel::PPPort, Digest::MD5, I18N::LangTags, libnet, MIME::Base64,
Pod::Perldoc, strict, Tie::Hash, Time::HiRes, Unicode::Collate,
Unicode::Normalize, UNIVERSAL

=back

=item Selected Bug Fixes

=item Changed Internals

=item Platform Specific Problems

=item Future Directions

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl581delta - what is new for perl v5.8.1

=over 4

=item DESCRIPTION

=item Incompatible Changes

=over 4

=item Hash Randomisation

=item UTF-8 On Filehandles No Longer Activated By Locale

=item Single-number v-strings are no longer v-strings before "=>"

=item (Win32) The -C Switch Has Been Repurposed

=item (Win32) The /d Switch Of cmd.exe

=back

=item Core Enhancements

=over 4

=item UTF-8 no longer default under UTF-8 locales

=item Unsafe signals again available

=item Tied Arrays with Negative Array Indices

=item local ${$x}

=item Unicode Character Database 4.0.0

=item Deprecation Warnings

=item Miscellaneous Enhancements

=back

=item Modules and Pragmata

=over 4

=item Updated Modules And Pragmata

base, B::Bytecode, B::Concise, B::Deparse, Benchmark, ByteLoader, bytes,
CGI, charnames, CPAN, Data::Dumper, DB_File, Devel::PPPort, Digest::MD5,
Encode, fields, libnet, Math::BigInt, MIME::Base64, NEXT, Net::Ping,
PerlIO::scalar, podlators, Pod::LaTeX, PodParser, Pod::Perldoc,
Scalar::Util, Storable, strict, Term::ANSIColor, Test::Harness, Test::More,
Test::Simple, Text::Balanced, Time::HiRes, threads, threads::shared,
Unicode::Collate, Unicode::Normalize, Win32::GetFolderPath,
Win32::GetOSVersion

=back

=item Utility Changes

=item New Documentation

=item Installation and Configuration Improvements

=over 4

=item Platform-specific enhancements

=back

=item Selected Bug Fixes

=over 4

=item Closures, eval and lexicals

=item Generic fixes

=item Platform-specific fixes

=back

=item New or Changed Diagnostics

=over 4

=item Changed "A thread exited while %d threads were running"

=item Removed "Attempt to clear a restricted hash"

=item New "Illegal declaration of anonymous subroutine"

=item Changed "Invalid range "%s" in transliteration operator"

=item New "Missing control char name in \c"

=item New "Newline in left-justified string for %s"

=item New "Possible precedence problem on bitwise %c operator"

=item New "Pseudo-hashes are deprecated"

=item New "read() on %s filehandle %s"

=item New "5.005 threads are deprecated"

=item New "Tied variable freed while still in use"

=item New "To%s: illegal mapping '%s'"

=item New "Use of freed value in iteration"

=back

=item Changed Internals

=item New Tests

=item Known Problems

=over 4

=item Tied hashes in scalar context

=item Net::Ping 450_service and 510_ping_udp failures

=item B::C

=back

=item Platform Specific Problems

=over 4

=item EBCDIC Platforms

=item Cygwin 1.5 problems

=item HP-UX: HP cc warnings about sendfile and sendpath

=item IRIX: t/uni/tr_7jis.t falsely failing

=item Mac OS X: no usemymalloc

=item Tru64: No threaded builds with GNU cc (gcc)

=item Win32: sysopen, sysread, syswrite

=back

=item Future Directions

=item Reporting Bugs

=item SEE ALSO

=back

=head2 perl58delta - what is new for perl v5.8.0

=over 4

=item DESCRIPTION

=item Highlights In 5.8.0

=item Incompatible Changes

=over 4

=item Binary Incompatibility

=item 64-bit platforms and malloc

=item AIX Dynaloading

=item Attributes for C<my> variables now handled at run-time

=item Socket Extension Dynamic in VMS

=item IEEE-format Floating Point Default on OpenVMS Alpha

=item New Unicode Semantics (no more C<use utf8>, almost)

=item New Unicode Properties

=item REF(...) Instead Of SCALAR(...)

=item pack/unpack D/F recycled

=item glob() now returns filenames in alphabetical order

=item Deprecations

=back

=item Core Enhancements

=over 4

=item Unicode Overhaul

=item PerlIO is Now The Default

=item ithreads

=item Restricted Hashes

=item Safe Signals

=item Understanding of Numbers

=item Arrays now always interpolate into double-quoted strings [561]

=item Miscellaneous Changes

=back

=item Modules and Pragmata

=over 4

=item New Modules and Pragmata

=item Updated And Improved Modules and Pragmata

=back

=item Utility Changes

=item New Documentation

=item Performance Enhancements

=item Installation and Configuration Improvements

=over 4

=item Generic Improvements

=item New Or Improved Platforms

=back

=item Selected Bug Fixes

=over 4

=item Platform Specific Changes and Fixes

=back

=item New or Changed Diagnostics

=item Changed Internals

=item Security Vulnerability Closed [561]

=item New Tests

=item Known Problems

=over 4

=item The Compiler Suite Is Still Very Experimental

=item Localising Tied Arrays and Hashes Is Broken

=item Building Extensions Can Fail Because Of Largefiles

=item Modifying $_ Inside for(..)

=item mod_perl 1.26 Doesn't Build With Threaded Perl

=item lib/ftmp-security tests warn 'system possibly insecure'

=item libwww-perl (LWP) fails base/date #51

=item PDL failing some tests

=item Perl_get_sv

=item Self-tying Problems

=item ext/threads/t/libc

=item Failure of Thread (5.005-style) tests

=item Timing problems

=item Tied/Magical Array/Hash Elements Do Not Autovivify

=item Unicode in package/class and subroutine names does not work

=back

=item Platform Specific Problems

=over 4

=item AIX

=item Alpha systems with old gccs fail several tests

=item AmigaOS

=item BeOS

=item Cygwin "unable to remap"

=item Cygwin ndbm tests fail on FAT

=item DJGPP Failures

=item FreeBSD built with ithreads coredumps reading large directories

=item FreeBSD Failing locale Test 117 For ISO 8859-15 Locales

=item IRIX fails ext/List/Util/t/shuffle.t or Digest::MD5

=item HP-UX lib/posix Subtest 9 Fails When LP64-Configured

=item Linux with glibc 2.2.5 fails t/op/int subtest #6 with -Duse64bitint

=item Linux With Sfio Fails op/misc Test 48

=item Mac OS X

=item Mac OS X dyld undefined symbols

=item OS/2 Test Failures

=item op/sprintf tests 91, 129, and 130

=item SCO

=item Solaris 2.5

=item Solaris x86 Fails Tests With -Duse64bitint

=item SUPER-UX (NEC SX)

=item Term::ReadKey not working on Win32

=item UNICOS/mk

=item UTS

=item VOS (Stratus)

=item VMS

=item Win32

=item XML::Parser not working

=item z/OS (OS/390)

=item Unicode Support on EBCDIC Still Spotty

=item Seen In Perl 5.7 But Gone Now

=back

=item Reporting Bugs

=item SEE ALSO

=item HISTORY

=back

=head2 perl561delta - what is new for perl v5.6.1

=over 4

=item DESCRIPTION

=item Summary of changes between 5.6.0 and 5.6.1

=over 4

=item Security Issues

=item Core bug fixes

C<UNIVERSAL::isa()>, Memory leaks, Numeric conversions, qw(a\\b), caller(),
Bugs in regular expressions, "slurp" mode, Autovivification of symbolic
references to special variables, Lexical warnings, Spurious warnings and
errors, glob(), Tainting, sort(), #line directives, Subroutine prototypes,
map(), Debugger, PERL5OPT, chop(), Unicode support, 64-bit support,
Compiler, Lvalue subroutines, IO::Socket, File::Find, xsubpp, C<no
Module;>, Tests

=item Core features

=item Configuration issues

=item Documentation

=item Bundled modules

B::Concise, File::Temp, Pod::LaTeX, Pod::Text::Overstrike, CGI, CPAN,
Class::Struct, DB_File, Devel::Peek, File::Find, Getopt::Long, IO::Poll,
IPC::Open3, Math::BigFloat, Math::Complex, Net::Ping, Opcode, Pod::Parser,
Pod::Text, SDBM_File, Sys::Syslog, Tie::RefHash, Tie::SubstrHash

=item Platform-specific improvements

NCR MP-RAS, NonStop-UX

=back

=item Core Enhancements

=over 4

=item Interpreter cloning, threads, and concurrency

=item Lexically scoped warning categories

=item Unicode and UTF-8 support

=item Support for interpolating named characters

=item "our" declarations

=item Support for strings represented as a vector of ordinals

=item Improved Perl version numbering system

=item New syntax for declaring subroutine attributes

=item File and directory handles can be autovivified

=item open() with more than two arguments

=item 64-bit support

=item Large file support

=item Long doubles

=item "more bits"

=item Enhanced support for sort() subroutines

=item C<sort $coderef @foo> allowed

=item File globbing implemented internally

=item Support for CHECK blocks

=item POSIX character class syntax [: :] supported

=item Better pseudo-random number generator

=item Improved C<qw//> operator

=item Better worst-case behavior of hashes

=item pack() format 'Z' supported

=item pack() format modifier '!' supported

=item pack() and unpack() support counted strings

=item Comments in pack() templates

=item Weak references

=item Binary numbers supported

=item Lvalue subroutines

=item Some arrows may be omitted in calls through references

=item Boolean assignment operators are legal lvalues

=item exists() is supported on subroutine names

=item exists() and delete() are supported on array elements

=item Pseudo-hashes work better

=item Automatic flushing of output buffers

=item Better diagnostics on meaningless filehandle operations

=item Where possible, buffered data discarded from duped input filehandle

=item eof() has the same old magic as <>

=item binmode() can be used to set :crlf and :raw modes

=item C<-T> filetest recognizes UTF-8 encoded files as "text"

=item system(), backticks and pipe open now reflect exec() failure

=item Improved diagnostics

=item Diagnostics follow STDERR

=item More consistent close-on-exec behavior

=item syswrite() ease-of-use

=item Better syntax checks on parenthesized unary operators

=item Bit operators support full native integer width

=item Improved security features

=item More functional bareword prototype (*)

=item C<require> and C<do> may be overridden

=item $^X variables may now have names longer than one character

=item New variable $^C reflects C<-c> switch

=item New variable $^V contains Perl version as a string

=item Optional Y2K warnings

=item Arrays now always interpolate into double-quoted strings

=item @- and @+ provide starting/ending offsets of regex submatches

=back

=item Modules and Pragmata

=over 4

=item Modules

attributes, B, Benchmark, ByteLoader, constant, charnames, Data::Dumper,
DB, DB_File, Devel::DProf, Devel::Peek, Dumpvalue, DynaLoader, English,
Env, Fcntl, File::Compare, File::Find, File::Glob, File::Spec,
File::Spec::Functions, Getopt::Long, IO, JPL, lib, Math::BigInt,
Math::Complex, Math::Trig, Pod::Parser, Pod::InputObjects, Pod::Checker,
podchecker, Pod::ParseUtils, Pod::Find, Pod::Select, podselect, Pod::Usage,
pod2usage, Pod::Text and Pod::Man, SDBM_File, Sys::Syslog, Sys::Hostname,
Term::ANSIColor, Time::Local, Win32, XSLoader, DBM Filters

=item Pragmata

=back

=item Utility Changes

=over 4

=item dprofpp

=item find2perl

=item h2xs

=item perlcc

=item perldoc

=item The Perl Debugger

=back

=item Improved Documentation

perlapi.pod, perlboot.pod, perlcompile.pod, perldbmfilter.pod,
perldebug.pod, perldebguts.pod, perlfork.pod, perlfilter.pod, perlhack.pod,
perlintern.pod, perllexwarn.pod, perlnumber.pod, perlopentut.pod,
perlreftut.pod, perltootc.pod, perltodo.pod, perlunicode.pod

=item Performance enhancements

=over 4

=item Simple sort() using { $a <=> $b } and the like are optimized

=item Optimized assignments to lexical variables

=item Faster subroutine calls

=item delete(), each(), values() and hash iteration are faster

=back

=item Installation and Configuration Improvements

=over 4

=item -Dusethreads means something different

=item New Configure flags

=item Threadedness and 64-bitness now more daring

=item Long Doubles

=item -Dusemorebits

=item -Duselargefiles

=item installusrbinperl

=item SOCKS support

=item C<-A> flag

=item Enhanced Installation Directories

=item gcc automatically tried if 'cc' does not seem to be working

=back

=item Platform specific changes

=over 4

=item Supported platforms

=item DOS

=item OS390 (OpenEdition MVS)

=item VMS

=item Win32

=back

=item Significant bug fixes

=over 4

=item <HANDLE> on empty files

=item C<eval '...'> improvements

=item All compilation errors are true errors

=item Implicitly closed filehandles are safer

=item Behavior of list slices is more consistent

=item C<(\$)> prototype and C<$foo{a}>

=item C<goto &sub> and AUTOLOAD

=item C<-bareword> allowed under C<use integer>

=item Failures in DESTROY()

=item Locale bugs fixed

=item Memory leaks

=item Spurious subroutine stubs after failed subroutine calls

=item Taint failures under C<-U>

=item END blocks and the C<-c> switch

=item Potential to leak DATA filehandles

=back

=item New or Changed Diagnostics

"%s" variable %s masks earlier declaration in same %s, "my sub" not yet
implemented, "our" variable %s redeclared, '!' allowed only after types %s,
/ cannot take a count, / must be followed by a, A or Z, / must be followed
by a*, A* or Z*, / must follow a numeric type, /%s/: Unrecognized escape
\\%c passed through, /%s/: Unrecognized escape \\%c in character class
passed through, /%s/ should probably be written as "%s", %s() called too
early to check prototype, %s argument is not a HASH or ARRAY element, %s
argument is not a HASH or ARRAY element or slice, %s argument is not a
subroutine name, %s package attribute may clash with future reserved word:
%s, (in cleanup) %s, <> should be quotes, Attempt to join self, Bad evalled
substitution pattern, Bad realloc() ignored, Bareword found in conditional,
Binary number > 0b11111111111111111111111111111111 non-portable, Bit vector
size > 32 non-portable, Buffer overflow in prime_env_iter: %s, Can't check
filesystem of script "%s", Can't declare class for non-scalar %s in "%s",
Can't declare %s in "%s", Can't ignore signal CHLD, forcing to default,
Can't modify non-lvalue subroutine call, Can't read CRTL environ, Can't
remove %s: %s, skipping file, Can't return %s from lvalue subroutine, Can't
weaken a nonreference, Character class [:%s:] unknown, Character class
syntax [%s] belongs inside character classes, Constant is not %s reference,
constant(%s): %s, CORE::%s is not a keyword, defined(@array) is deprecated,
defined(%hash) is deprecated, Did not produce a valid header, (Did you mean
"local" instead of "our"?), Document contains no data, entering effective
%s failed, false [] range "%s" in regexp, Filehandle %s opened only for
output, flock() on closed filehandle %s, Global symbol "%s" requires
explicit package name, Hexadecimal number > 0xffffffff non-portable,
Ill-formed CRTL environ value "%s", Ill-formed message in prime_env_iter:
|%s|, Illegal binary digit %s, Illegal binary digit %s ignored, Illegal
number of bits in vec, Integer overflow in %s number, Invalid %s attribute:
%s, Invalid %s attributes: %s, invalid [] range "%s" in regexp, Invalid
separator character %s in attribute list, Invalid separator character %s in
subroutine attribute list, leaving effective %s failed, Lvalue subs
returning %s not implemented yet, Method %s not permitted, Missing
%sbrace%s on \N{}, Missing command in piped open, Missing name in "my sub",
No %s specified for -%c, No package name allowed for variable %s in "our",
No space allowed after -%c, no UTC offset information; assuming local time
is UTC, Octal number > 037777777777 non-portable, panic: del_backref,
panic: kid popen errno read, panic: magic_killbackrefs, Parentheses missing
around "%s" list, Possible unintended interpolation of %s in string,
Possible Y2K bug: %s, pragma "attrs" is deprecated, use "sub NAME : ATTRS"
instead, Premature end of script headers, Repeat count in pack overflows,
Repeat count in unpack overflows, realloc() of freed memory ignored,
Reference is already weak, setpgrp can't take arguments, Strange *+?{} on
zero-length expression, switching effective %s is not implemented, This
Perl can't reset CRTL environ elements (%s), This Perl can't set CRTL
environ elements (%s=%s), Too late to run %s block, Unknown open() mode
'%s', Unknown process %x sent message to prime_env_iter: %s, Unrecognized
escape \\%c passed through, Unterminated attribute parameter in attribute
list, Unterminated attribute list, Unterminated attribute parameter in
subroutine attribute list, Unterminated subroutine attribute list, Value of
CLI symbol "%s" too long, Version number must be a constant number

=item New tests

=item Incompatible Changes

=over 4

=item Perl Source Incompatibilities

CHECK is a new keyword, Treatment of list slices of undef has changed,
Format of $English::PERL_VERSION is different, Literals of the form
C<1.2.3> parse differently, Possibly changed pseudo-random number
generator, Hashing function for hash keys has changed, C<undef> fails on
read only values, Close-on-exec bit may be set on pipe and socket handles,
Writing C<"$$1"> to mean C<"${$}1"> is unsupported, delete(), each(),
values() and C<\(%h)>, vec(EXPR,OFFSET,BITS) enforces powers-of-two BITS,
Text of some diagnostic output has changed, C<%@> has been removed,
Parenthesized not() behaves like a list operator, Semantics of bareword
prototype C<(*)> have changed, Semantics of bit operators may have changed
on 64-bit platforms, More builtins taint their results

=item C Source Incompatibilities

C<PERL_POLLUTE>, C<PERL_IMPLICIT_CONTEXT>, C<PERL_POLLUTE_MALLOC>

=item Compatible C Source API Changes

C<PATCHLEVEL> is now C<PERL_VERSION>

=item Binary Incompatibilities

=back

=item Known Problems

=over 4

=item Localizing a tied hash element may leak memory

=item Known test failures

=item EBCDIC platforms not fully supported

=item UNICOS/mk CC failures during Configure run

=item Arrow operator and arrays

=item Experimental features

Threads, Unicode, 64-bit support, Lvalue subroutines, Weak references, The
pseudo-hash data type, The Compiler suite, Internal implementation of file
globbing, The DB module, The regular expression code constructs:

=back

=item Obsolete Diagnostics

Character class syntax [: :] is reserved for future extensions, Ill-formed
logical name |%s| in prime_env_iter, In string, @%s now must be written as
\@%s, Probable precedence problem on %s, regexp too big, Use of "$$<digit>"
to mean "${$}<digit>" is deprecated

=item Reporting Bugs

=item SEE ALSO

=item HISTORY

=back

=head2 perl56delta - what's new for perl v5.6.0

=over 4

=item DESCRIPTION

=item Core Enhancements

=over 4

=item Interpreter cloning, threads, and concurrency

=item Lexically scoped warning categories

=item Unicode and UTF-8 support

=item Support for interpolating named characters

=item "our" declarations

=item Support for strings represented as a vector of ordinals

=item Improved Perl version numbering system

=item New syntax for declaring subroutine attributes

=item File and directory handles can be autovivified

=item open() with more than two arguments

=item 64-bit support

=item Large file support

=item Long doubles

=item "more bits"

=item Enhanced support for sort() subroutines

=item C<sort $coderef @foo> allowed

=item File globbing implemented internally

=item Support for CHECK blocks

=item POSIX character class syntax [: :] supported

=item Better pseudo-random number generator

=item Improved C<qw//> operator

=item Better worst-case behavior of hashes

=item pack() format 'Z' supported

=item pack() format modifier '!' supported

=item pack() and unpack() support counted strings

=item Comments in pack() templates

=item Weak references

=item Binary numbers supported

=item Lvalue subroutines

=item Some arrows may be omitted in calls through references

=item Boolean assignment operators are legal lvalues

=item exists() is supported on subroutine names

=item exists() and delete() are supported on array elements

=item Pseudo-hashes work better

=item Automatic flushing of output buffers

=item Better diagnostics on meaningless filehandle operations

=item Where possible, buffered data discarded from duped input filehandle

=item eof() has the same old magic as <>

=item binmode() can be used to set :crlf and :raw modes

=item C<-T> filetest recognizes UTF-8 encoded files as "text"

=item system(), backticks and pipe open now reflect exec() failure

=item Improved diagnostics

=item Diagnostics follow STDERR

=item More consistent close-on-exec behavior

=item syswrite() ease-of-use

=item Better syntax checks on parenthesized unary operators

=item Bit operators support full native integer width

=item Improved security features

=item More functional bareword prototype (*)

=item C<require> and C<do> may be overridden

=item $^X variables may now have names longer than one character

=item New variable $^C reflects C<-c> switch

=item New variable $^V contains Perl version as a string

=item Optional Y2K warnings

=item Arrays now always interpolate into double-quoted strings

=item @- and @+ provide starting/ending offsets of regex matches

=back

=item Modules and Pragmata

=over 4

=item Modules

attributes, B, Benchmark, ByteLoader, constant, charnames, Data::Dumper,
DB, DB_File, Devel::DProf, Devel::Peek, Dumpvalue, DynaLoader, English,
Env, Fcntl, File::Compare, File::Find, File::Glob, File::Spec,
File::Spec::Functions, Getopt::Long, IO, JPL, lib, Math::BigInt,
Math::Complex, Math::Trig, Pod::Parser, Pod::InputObjects, Pod::Checker,
podchecker, Pod::ParseUtils, Pod::Find, Pod::Select, podselect, Pod::Usage,
pod2usage, Pod::Text and Pod::Man, SDBM_File, Sys::Syslog, Sys::Hostname,
Term::ANSIColor, Time::Local, Win32, XSLoader, DBM Filters

=item Pragmata

=back

=item Utility Changes

=over 4

=item dprofpp

=item find2perl

=item h2xs

=item perlcc

=item perldoc

=item The Perl Debugger

=back

=item Improved Documentation

perlapi.pod, perlboot.pod, perlcompile.pod, perldbmfilter.pod,
perldebug.pod, perldebguts.pod, perlfork.pod, perlfilter.pod, perlhack.pod,
perlintern.pod, perllexwarn.pod, perlnumber.pod, perlopentut.pod,
perlreftut.pod, perltootc.pod, perltodo.pod, perlunicode.pod

=item Performance enhancements

=over 4

=item Simple sort() using { $a <=> $b } and the like are optimized

=item Optimized assignments to lexical variables

=item Faster subroutine calls

=item delete(), each(), values() and hash iteration are faster

=back

=item Installation and Configuration Improvements

=over 4

=item -Dusethreads means something different

=item New Configure flags

=item Threadedness and 64-bitness now more daring

=item Long Doubles

=item -Dusemorebits

=item -Duselargefiles

=item installusrbinperl

=item SOCKS support

=item C<-A> flag

=item Enhanced Installation Directories

=back

=item Platform specific changes

=over 4

=item Supported platforms

=item DOS

=item OS390 (OpenEdition MVS)

=item VMS

=item Win32

=back

=item Significant bug fixes

=over 4

=item <HANDLE> on empty files

=item C<eval '...'> improvements

=item All compilation errors are true errors

=item Implicitly closed filehandles are safer

=item Behavior of list slices is more consistent

=item C<(\$)> prototype and C<$foo{a}>

=item C<goto &sub> and AUTOLOAD

=item C<-bareword> allowed under C<use integer>

=item Failures in DESTROY()

=item Locale bugs fixed

=item Memory leaks

=item Spurious subroutine stubs after failed subroutine calls

=item Taint failures under C<-U>

=item END blocks and the C<-c> switch

=item Potential to leak DATA filehandles

=back

=item New or Changed Diagnostics

"%s" variable %s masks earlier declaration in same %s, "my sub" not yet
implemented, "our" variable %s redeclared, '!' allowed only after types %s,
/ cannot take a count, / must be followed by a, A or Z, / must be followed
by a*, A* or Z*, / must follow a numeric type, /%s/: Unrecognized escape
\\%c passed through, /%s/: Unrecognized escape \\%c in character class
passed through, /%s/ should probably be written as "%s", %s() called too
early to check prototype, %s argument is not a HASH or ARRAY element, %s
argument is not a HASH or ARRAY element or slice, %s argument is not a
subroutine name, %s package attribute may clash with future reserved word:
%s, (in cleanup) %s, <> should be quotes, Attempt to join self, Bad evalled
substitution pattern, Bad realloc() ignored, Bareword found in conditional,
Binary number > 0b11111111111111111111111111111111 non-portable, Bit vector
size > 32 non-portable, Buffer overflow in prime_env_iter: %s, Can't check
filesystem of script "%s", Can't declare class for non-scalar %s in "%s",
Can't declare %s in "%s", Can't ignore signal CHLD, forcing to default,
Can't modify non-lvalue subroutine call, Can't read CRTL environ, Can't
remove %s: %s, skipping file, Can't return %s from lvalue subroutine, Can't
weaken a nonreference, Character class [:%s:] unknown, Character class
syntax [%s] belongs inside character classes, Constant is not %s reference,
constant(%s): %s, CORE::%s is not a keyword, defined(@array) is deprecated,
defined(%hash) is deprecated, Did not produce a valid header, (Did you mean
"local" instead of "our"?), Document contains no data, entering effective
%s failed, false [] range "%s" in regexp, Filehandle %s opened only for
output, flock() on closed filehandle %s, Global symbol "%s" requires
explicit package name, Hexadecimal number > 0xffffffff non-portable,
Ill-formed CRTL environ value "%s", Ill-formed message in prime_env_iter:
|%s|, Illegal binary digit %s, Illegal binary digit %s ignored, Illegal
number of bits in vec, Integer overflow in %s number, Invalid %s attribute:
%s, Invalid %s attributes: %s, invalid [] range "%s" in regexp, Invalid
separator character %s in attribute list, Invalid separator character %s in
subroutine attribute list, leaving effective %s failed, Lvalue subs
returning %s not implemented yet, Method %s not permitted, Missing
%sbrace%s on \N{}, Missing command in piped open, Missing name in "my sub",
No %s specified for -%c, No package name allowed for variable %s in "our",
No space allowed after -%c, no UTC offset information; assuming local time
is UTC, Octal number > 037777777777 non-portable, panic: del_backref,
panic: kid popen errno read, panic: magic_killbackrefs, Parentheses missing
around "%s" list, Possible unintended interpolation of %s in string,
Possible Y2K bug: %s, pragma "attrs" is deprecated, use "sub NAME : ATTRS"
instead, Premature end of script headers, Repeat count in pack overflows,
Repeat count in unpack overflows, realloc() of freed memory ignored,
Reference is already weak, setpgrp can't take arguments, Strange *+?{} on
zero-length expression, switching effective %s is not implemented, This
Perl can't reset CRTL environ elements (%s), This Perl can't set CRTL
environ elements (%s=%s), Too late to run %s block, Unknown open() mode
'%s', Unknown process %x sent message to prime_env_iter: %s, Unrecognized
escape \\%c passed through, Unterminated attribute parameter in attribute
list, Unterminated attribute list, Unterminated attribute parameter in
subroutine attribute list, Unterminated subroutine attribute list, Value of
CLI symbol "%s" too long, Version number must be a constant number

=item New tests

=item Incompatible Changes

=over 4

=item Perl Source Incompatibilities

CHECK is a new keyword, Treatment of list slices of undef has changed,
Format of $English::PERL_VERSION is different, Literals of the form
C<1.2.3> parse differently, Possibly changed pseudo-random number
generator, Hashing function for hash keys has changed, C<undef> fails on
read only values, Close-on-exec bit may be set on pipe and socket handles,
Writing C<"$$1"> to mean C<"${$}1"> is unsupported, delete(), each(),
values() and C<\(%h)>, vec(EXPR,OFFSET,BITS) enforces powers-of-two BITS,
Text of some diagnostic output has changed, C<%@> has been removed,
Parenthesized not() behaves like a list operator, Semantics of bareword
prototype C<(*)> have changed, Semantics of bit operators may have changed
on 64-bit platforms, More builtins taint their results

=item C Source Incompatibilities

C<PERL_POLLUTE>, C<PERL_IMPLICIT_CONTEXT>, C<PERL_POLLUTE_MALLOC>

=item Compatible C Source API Changes

C<PATCHLEVEL> is now C<PERL_VERSION>

=item Binary Incompatibilities

=back

=item Known Problems

=over 4

=item Thread test failures

=item EBCDIC platforms not supported

=item In 64-bit HP-UX the lib/io_multihomed test may hang

=item NEXTSTEP 3.3 POSIX test failure

=item Tru64 (aka Digital UNIX, aka DEC OSF/1) lib/sdbm test failure with
gcc

=item UNICOS/mk CC failures during Configure run

=item Arrow operator and arrays

=item Experimental features

Threads, Unicode, 64-bit support, Lvalue subroutines, Weak references, The
pseudo-hash data type, The Compiler suite, Internal implementation of file
globbing, The DB module, The regular expression code constructs:

=back

=item Obsolete Diagnostics

Character class syntax [: :] is reserved for future extensions, Ill-formed
logical name |%s| in prime_env_iter, In string, @%s now must be written as
\@%s, Probable precedence problem on %s, regexp too big, Use of "$$<digit>"
to mean "${$}<digit>" is deprecated

=item Reporting Bugs

=item SEE ALSO

=item HISTORY

=back

=head2 perl5005delta - what's new for perl5.005

=over 4

=item DESCRIPTION

=item About the new versioning system

=item Incompatible Changes

=over 4

=item WARNING: This version is not binary compatible with Perl 5.004.

=item Default installation structure has changed

=item Perl Source Compatibility

=item C Source Compatibility

=item Binary Compatibility

=item Security fixes may affect compatibility

=item Relaxed new mandatory warnings introduced in 5.004

=item Licensing

=back

=item Core Changes

=over 4

=item Threads

=item Compiler

=item Regular Expressions

Many new and improved optimizations, Many bug fixes, New regular expression
constructs, New operator for precompiled regular expressions, Other
improvements, Incompatible changes

=item Improved malloc()

=item Quicksort is internally implemented

=item Reliable signals

=item Reliable stack pointers

=item More generous treatment of carriage returns

=item Memory leaks

=item Better support for multiple interpreters

=item Behavior of local() on array and hash elements is now well-defined

=item C<%!> is transparently tied to the L<Errno> module

=item Pseudo-hashes are supported

=item C<EXPR foreach EXPR> is supported

=item Keywords can be globally overridden

=item C<$^E> is meaningful on Win32

=item C<foreach (1..1000000)> optimized

=item C<Foo::> can be used as implicitly quoted package name

=item C<exists $Foo::{Bar::}> tests existence of a package

=item Better locale support

=item Experimental support for 64-bit platforms

=item prototype() returns useful results on builtins

=item Extended support for exception handling

=item Re-blessing in DESTROY() supported for chaining DESTROY() methods

=item All C<printf> format conversions are handled internally

=item New C<INIT> keyword

=item New C<lock> keyword

=item New C<qr//> operator

=item C<our> is now a reserved word

=item Tied arrays are now fully supported

=item Tied handles support is better

=item 4th argument to substr

=item Negative LENGTH argument to splice

=item Magic lvalues are now more magical

=item <> now reads in records

=back

=item Supported Platforms

=over 4

=item New Platforms

=item Changes in existing support

=back

=item Modules and Pragmata

=over 4

=item New Modules

B, Data::Dumper, Dumpvalue, Errno, File::Spec, ExtUtils::Installed,
ExtUtils::Packlist, Fatal, IPC::SysV, Test, Tie::Array, Tie::Handle,
Thread, attrs, fields, re

=item Changes in existing modules

Benchmark, Carp, CGI, Fcntl, Math::Complex, Math::Trig, POSIX, DB_File,
MakeMaker, CPAN, Cwd

=back

=item Utility Changes

=item Documentation Changes

=item New Diagnostics

Ambiguous call resolved as CORE::%s(), qualify as such or use &, Bad index
while coercing array into hash, Bareword "%s" refers to nonexistent
package, Can't call method "%s" on an undefined value, Can't check
filesystem of script "%s" for nosuid, Can't coerce array into hash, Can't
goto subroutine from an eval-string, Can't localize pseudo-hash element,
Can't use %%! because Errno.pm is not available, Cannot find an opnumber
for "%s", Character class syntax [. .] is reserved for future extensions,
Character class syntax [: :] is reserved for future extensions, Character
class syntax [= =] is reserved for future extensions, %s: Eval-group in
insecure regular expression, %s: Eval-group not allowed, use re 'eval', %s:
Eval-group not allowed at run time, Explicit blessing to '' (assuming
package main), Illegal hex digit ignored, No such array field, No such
field "%s" in variable %s of type %s, Out of memory during ridiculously
large request, Range iterator outside integer range, Recursive inheritance
detected while looking for method '%s' %s, Reference found where even-sized
list expected, Undefined value assigned to typeglob, Use of reserved word
"%s" is deprecated, perl: warning: Setting locale failed

=item Obsolete Diagnostics

Can't mktemp(), Can't write to temp file for B<-e>: %s, Cannot open
temporary file, regexp too big

=item Configuration Changes

=item BUGS

=item SEE ALSO

=item HISTORY

=back

=head2 perl5004delta - what's new for perl5.004

=over 4

=item DESCRIPTION

=item Supported Environments

=item Core Changes

=over 4

=item List assignment to %ENV works

=item Change to "Can't locate Foo.pm in @INC" error

=item Compilation option: Binary compatibility with 5.003

=item $PERL5OPT environment variable

=item Limitations on B<-M>, B<-m>, and B<-T> options

=item More precise warnings

=item Deprecated: Inherited C<AUTOLOAD> for non-methods

=item Previously deprecated %OVERLOAD is no longer usable

=item Subroutine arguments created only when they're modified

=item Group vector changeable with C<$)>

=item Fixed parsing of $$<digit>, &$<digit>, etc.

=item Fixed localization of $<digit>, $&, etc.

=item No resetting of $. on implicit close

=item C<wantarray> may return undef

=item C<eval EXPR> determines value of EXPR in scalar context

=item Changes to tainting checks

No glob() or <*>, No spawning if tainted $CDPATH, $ENV, $BASH_ENV, No
spawning if tainted $TERM doesn't look like a terminal name

=item New Opcode module and revised Safe module

=item Embedding improvements

=item Internal change: FileHandle class based on IO::* classes

=item Internal change: PerlIO abstraction interface

=item New and changed syntax

$coderef->(PARAMS)

=item New and changed builtin constants

__PACKAGE__

=item New and changed builtin variables

$^E, $^H, $^M

=item New and changed builtin functions

delete on slices, flock, printf and sprintf, keys as an lvalue, my() in
Control Structures, pack() and unpack(), sysseek(), use VERSION, use Module
VERSION LIST, prototype(FUNCTION), srand, $_ as Default, C<m//gc> does not
reset search position on failure, C<m//x> ignores whitespace before ?*+{},
nested C<sub{}> closures work now, formats work right on changing lexicals

=item New builtin methods

isa(CLASS), can(METHOD), VERSION( [NEED] )

=item TIEHANDLE now supported

TIEHANDLE classname, LIST, PRINT this, LIST, PRINTF this, LIST, READ this
LIST, READLINE this, GETC this, DESTROY this

=item Malloc enhancements

-DPERL_EMERGENCY_SBRK, -DPACK_MALLOC, -DTWO_POT_OPTIMIZE

=item Miscellaneous efficiency enhancements

=back

=item Support for More Operating Systems

=over 4

=item Win32

=item Plan 9

=item QNX

=item AmigaOS

=back

=item Pragmata

use autouse MODULE => qw(sub1 sub2 sub3), use blib, use blib 'dir', use
constant NAME => VALUE, use locale, use ops, use vmsish

=item Modules

=over 4

=item Required Updates

=item Installation directories

=item Module information summary

=item Fcntl

=item IO

=item Math::Complex

=item Math::Trig

=item DB_File

=item Net::Ping

=item Object-oriented overrides for builtin operators

=back

=item Utility Changes

=over 4

=item pod2html

Sends converted HTML to standard output

=item xsubpp

C<void> XSUBs now default to returning nothing

=back

=item C Language API Changes

C<gv_fetchmethod> and C<perl_call_sv>, C<perl_eval_pv>, Extended API for
manipulating hashes

=item Documentation Changes

L<perldelta>, L<perlfaq>, L<perllocale>, L<perltoot>, L<perlapio>,
L<perlmodlib>, L<perldebug>, L<perlsec>

=item New Diagnostics

"my" variable %s masks earlier declaration in same scope, %s argument is
not a HASH element or slice, Allocation too large: %lx, Allocation too
large, Applying %s to %s will act on scalar(%s), Attempt to free
nonexistent shared string, Attempt to use reference as lvalue in substr,
Bareword "%s" refers to nonexistent package, Can't redefine active sort
subroutine %s, Can't use bareword ("%s") as %s ref while "strict refs" in
use, Cannot resolve method `%s' overloading `%s' in package `%s', Constant
subroutine %s redefined, Constant subroutine %s undefined, Copy method did
not return a reference, Died, Exiting pseudo-block via %s, Identifier too
long, Illegal character %s (carriage return), Illegal switch in PERL5OPT:
%s, Integer overflow in hex number, Integer overflow in octal number,
internal error: glob failed, Invalid conversion in %s: "%s", Invalid type
in pack: '%s', Invalid type in unpack: '%s', Name "%s::%s" used only once:
possible typo, Null picture in formline, Offset outside string, Out of
memory!, Out of memory during request for %s, panic: frexp, Possible
attempt to put comments in qw() list, Possible attempt to separate words
with commas, Scalar value @%s{%s} better written as $%s{%s}, Stub found
while resolving method `%s' overloading `%s' in %s, Too late for "B<-T>"
option, untie attempted while %d inner references still exist, Unrecognized
character %s, Unsupported function fork, Use of "$$<digit>" to mean
"${$}<digit>" is deprecated, Value of %s can be "0"; test with defined(),
Variable "%s" may be unavailable, Variable "%s" will not stay shared,
Warning: something's wrong, Ill-formed logical name |%s| in prime_env_iter,
Got an error from DosAllocMem, Malformed PERLLIB_PREFIX, PERL_SH_DIR too
long, Process terminated by SIG%s

=item BUGS

=item SEE ALSO

=item HISTORY

=back

=head2 perlexperiment - A listing of experimental features in Perl

=over 4

=item DESCRIPTION

=over 4

=item Current experiments

C<our> can now have an experimental optional attribute C<unique>, Smart
match (C<~~>), Pluggable keywords, Regular Expression Set Operations,
Subroutine signatures, Aliasing via reference, The "const" attribute, use
re 'strict';, String- and number-specific bitwise operators, The <:win32>
IO pseudolayer, Declaring a reference to a variable, There is an
C<installhtml> target in the Makefile, Unicode in Perl on EBCDIC

=item Accepted features

64-bit support, die accepts a reference, DB module, Weak references,
Internal file glob, fork() emulation, -Dusemultiplicity -Duseithreads,
Support for long doubles, The C<\N> regex character class, C<(?{code})> and
C<(??{ code })>, Linux abstract Unix domain sockets, Lvalue subroutines,
Backtracking control verbs, The <:pop> IO pseudolayer, C<\s> in regexp
matches vertical tab, Postfix dereference syntax, Lexical subroutines

=item Removed features

5.005-style threading, perlcc, The pseudo-hash data type, Getopt::Long
Options can now take multiple values at once (experimental), Assertions,
Test::Harness::Straps, C<legacy>, Lexical C<$_>, Array and hash container
functions accept references

=back

=item SEE ALSO

=item AUTHORS

=item COPYRIGHT

=item LICENSE

=back

=head2 perlartistic - the Perl Artistic License

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item The "Artistic License"

=over 4

=item Preamble

=item Definitions

"Package", "Standard Version", "Copyright Holder", "You", "Reasonable
copying fee", "Freely Available"

=item Conditions

a), b), c), d), a), b), c), d)

=back

=back

=head2 perlgpl - the GNU General Public License, version 1

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=over 4

=item GNU GENERAL PUBLIC LICENSE

=back

=head2 perlaix - Perl version 5 on IBM AIX (UNIX) systems

=over 4

=item DESCRIPTION

=over 4

=item Compiling Perl 5 on AIX

=item Supported Compilers

=item Incompatibility with AIX Toolbox lib gdbm

=item Perl 5 was successfully compiled and tested on:

=item Building Dynamic Extensions on AIX

=item Using Large Files with Perl

=item Threaded Perl

=item 64-bit Perl

=item Long doubles

=item Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (threaded/32-bit)

=item Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (32-bit)

=item Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (threaded/64-bit)

=item Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (64-bit)

=item Compiling Perl 5 on AIX 7.1.0

=item Compiling Perl 5 on older AIX versions up to 4.3.3

=item OS level

=item Building Dynamic Extensions on AIX E<lt> 5L

=item The IBM ANSI C Compiler

=item The usenm option

=item Using GNU's gcc for building Perl

=item Using Large Files with Perl E<lt> 5L

=item Threaded Perl E<lt> 5L

=item 64-bit Perl E<lt> 5L

=item AIX 4.2 and extensions using C++ with statics

=back

=item AUTHORS

=back

=head2 perlamiga - Perl under AmigaOS 4.1

=over 4

=item NOTE

=item SYNOPSIS

=back

=over 4

=item DESCRIPTION

=over 4

=item Prerequisites for running Perl 5.22.1 under AmigaOS 4.1

B<AmigaOS 4.1 update 6 with all updates applied as of 9th October 2013>,
B<newlib.library version 53.28 or greater>, B<AmigaOS SDK>, B<abc-shell>

=item Starting Perl programs under AmigaOS 4.1

=item Limitations of Perl under AmigaOS 4.1

B<Nested Piped programs can crash when run from older abc-shells>,
B<Incorrect or unexpected command line unescaping>, B<Starting subprocesses
via open has limitations>, If you find any other limitations or bugs then
let me know

=back

=item INSTALLATION

=item Amiga Specific Modules

=over 4

=item Amiga::ARexx

=item Amiga::Exec

=back

=item BUILDING

=item CHANGES

B<August 2015>, Port to Perl 5.22, Add handling of NIL: to afstat(), Fix
inheritance of environment variables by subprocesses, Fix exec, and exit in
"forked" subprocesses, Fix issue with newlib's unlink, which could cause
infinite loops, Add flock() emulation using IDOS->LockRecord thanks to Tony
Cook for the suggestion, Fix issue where kill was using the wrong kind of
process ID, B<27th November 2013>, Create new installation system based on
installperl links and Amiga protection bits now set correctly, Pod now
defaults to text, File::Spec should now recognise an Amiga style absolute
path as well as a Unix style one. Relative paths must always be Unix
style, B<20th November 2013>, Configured to use SDK:Local/C/perl to start
standard scripts, Added Amiga::Exec module with support for Wait() and
AmigaOS signal numbers, B<10th October 13>

=item SEE ALSO

=back

=head2 perlandroid - Perl under Android

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Cross-compilation

=over 4

=item Get the Android Native Development Kit (NDK)

=item Determine the architecture you'll be cross-compiling for

=item Set up a standalone toolchain

=item adb or ssh?

=item Configure and beyond

=back

=item Native Builds

=item AUTHOR

=back

=head2 perlbs2000 - building and installing Perl for BS2000.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item gzip on BS2000

=item bison on BS2000

=item Unpacking Perl Distribution on BS2000

=item Compiling Perl on BS2000

=item Testing Perl on BS2000

=item Installing Perl on BS2000

=item Using Perl in the Posix-Shell of BS2000

=item Using Perl in "native" BS2000

=item Floating point anomalies on BS2000

=item Using PerlIO and different encodings on ASCII and EBCDIC partitions

=back

=item AUTHORS

=item SEE ALSO

=over 4

=item Mailing list

=back

=item HISTORY

=back

=head2 perlce - Perl for WinCE

=over 4

=item Building Perl for WinCE

=over 4

=item WARNING

=item DESCRIPTION

=item General explanations on cross-compiling WinCE

=item CURRENT BUILD INSTRUCTIONS

=item OLD BUILD INSTRUCTIONS

Microsoft Embedded Visual Tools, Microsoft Visual C++, Rainer Keuchel's
celib-sources, Rainer Keuchel's console-sources, go to F<./win32>
subdirectory, edit file F<./win32/ce-helpers/compile.bat>, run
compile.bat, run compile.bat dist

=back

=item Using Perl on WinCE

=over 4

=item DESCRIPTION

=item LIMITATIONS

=item ENVIRONMENT

PERL5LIB, PATH, TMP, UNIXROOTPATH, ROWS/COLS, HOME, CONSOLEFONTSIZE

=item REGISTRY

=item XS

=item BUGS

=item INSTALLATION

=back

=item ACKNOWLEDGEMENTS

=item History of WinCE port

=item AUTHORS

Rainer Keuchel <coyxc@rainer-keuchel.de>, Vadim Konovalov, Daniel Dragan

=back

=head2 perlcygwin - Perl for Cygwin

=over 4

=item SYNOPSIS

=item PREREQUISITES FOR COMPILING PERL ON CYGWIN

=over 4

=item Cygwin = GNU+Cygnus+Windows (Don't leave UNIX without it)

=item Cygwin Configuration

C<PATH>, I<nroff>

=back

=item CONFIGURE PERL ON CYGWIN

=over 4

=item Stripping Perl Binaries on Cygwin

=item Optional Libraries for Perl on Cygwin

C<-lcrypt>, C<-lgdbm_compat> (C<use GDBM_File>), C<-ldb> (C<use DB_File>),
C<cygserver> (C<use IPC::SysV>), C<-lutil>

=item Configure-time Options for Perl on Cygwin

C<-Uusedl>, C<-Dusemymalloc>, C<-Uuseperlio>, C<-Dusemultiplicity>,
C<-Uuse64bitint>, C<-Duselongdouble>, C<-Uuseithreads>, C<-Duselargefiles>,
C<-Dmksymlinks>

=item Suspicious Warnings on Cygwin

Win9x and C<d_eofnblk>, Compiler/Preprocessor defines

=back

=item MAKE ON CYGWIN

=item TEST ON CYGWIN

=over 4

=item File Permissions on Cygwin

=item NDBM_File and ODBM_File do not work on FAT filesystems

=item C<fork()> failures in io_* tests

=back

=item Specific features of the Cygwin port

=over 4

=item Script Portability on Cygwin

Pathnames, Text/Binary, PerlIO, F<.exe>, Cygwin vs. Windows process ids,
Cygwin vs. Windows errors, rebase errors on fork or system, C<chown()>,
Miscellaneous

=item Prebuilt methods:

C<Cwd::cwd>, C<Cygwin::pid_to_winpid>, C<Cygwin::winpid_to_pid>,
C<Cygwin::win_to_posix_path>, C<Cygwin::posix_to_win_path>,
C<Cygwin::mount_table()>, C<Cygwin::mount_flags>, C<Cygwin::is_binmount>,
C<Cygwin::sync_winenv>

=back

=item INSTALL PERL ON CYGWIN

=item MANIFEST ON CYGWIN

Documentation, Build, Configure, Make, Install, Tests, Compiled Perl
Source, Compiled Module Source, Perl Modules/Scripts, Perl Module Tests

=item BUGS ON CYGWIN

=item AUTHORS

=item HISTORY

=back

=head2 perldos - Perl under DOS, W31, W95.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Prerequisites for Compiling Perl on DOS

DJGPP, Pthreads

=item Shortcomings of Perl under DOS

=item Building Perl on DOS

=item Testing Perl on DOS

=item Installation of Perl on DOS

=back

=item BUILDING AND INSTALLING MODULES ON DOS

=over 4

=item Building Prerequisites for Perl on DOS

=item Unpacking CPAN Modules on DOS

=item Building Non-XS Modules on DOS

=item Building XS Modules on DOS

=back

=item AUTHOR

=item SEE ALSO

=back

=head2 perlfreebsd - Perl version 5 on FreeBSD systems

=over 4

=item DESCRIPTION

=over 4

=item FreeBSD core dumps from readdir_r with ithreads

=item C<$^X> doesn't always contain a full path in FreeBSD

=back

=item AUTHOR

=back

=head2 perlhaiku - Perl version 5.10+ on Haiku

=over 4

=item DESCRIPTION

=item BUILD AND INSTALL

=item KNOWN PROBLEMS

=item CONTACT

=back

=head2 perlhpux - Perl version 5 on Hewlett-Packard Unix (HP-UX) systems

=over 4

=item DESCRIPTION

=over 4

=item Using perl as shipped with HP-UX

=item Using perl from HP's porting centre

=item Other prebuilt perl binaries

=item Compiling Perl 5 on HP-UX

=item PA-RISC

=item Portability Between PA-RISC Versions

=item PA-RISC 1.0

=item PA-RISC 1.1

=item PA-RISC 2.0

=item Itanium Processor Family (IPF) and HP-UX

=item Itanium, Itanium 2 & Madison 6

=item HP-UX versions

=item Building Dynamic Extensions on HP-UX

=item The HP ANSI C Compiler

=item The GNU C Compiler

=item Using Large Files with Perl on HP-UX

=item Threaded Perl on HP-UX

=item 64-bit Perl on HP-UX

=item Oracle on HP-UX

=item GDBM and Threads on HP-UX

=item NFS filesystems and utime(2) on HP-UX

=item HP-UX Kernel Parameters (maxdsiz) for Compiling Perl

=back

=item nss_delete core dump from op/pwent or op/grent

=item error: pasting ")" and "l" does not give a valid preprocessing token

=item Redeclaration of "sendpath" with a different storage class specifier

=item Miscellaneous

=item AUTHOR

=back

=head2 perlhurd - Perl version 5 on Hurd

=over 4

=item DESCRIPTION

=over 4

=item Known Problems with Perl on Hurd

=back

=item AUTHOR

=back

=head2 perlirix - Perl version 5 on Irix systems

=over 4

=item DESCRIPTION

=over 4

=item Building 32-bit Perl in Irix

=item Building 64-bit Perl in Irix

=item About Compiler Versions of Irix

=item Linker Problems in Irix

=item Malloc in Irix

=item Building with threads in Irix

=item Irix 5.3

=back

=item AUTHOR

=back

=head2 perllinux - Perl version 5 on Linux systems

=over 4

=item DESCRIPTION

=over 4

=item Experimental Support for Sun Studio Compilers for Linux OS

=back

=item AUTHOR

=back

=head2 perlmacos - Perl under Mac OS (Classic)

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item AUTHOR

=back

=head2 perlmacosx - Perl under Mac OS X

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Installation Prefix

=item SDK support

=item Universal Binary support

=item 64-bit PPC support

=item libperl and Prebinding

=item Updating Apple's Perl

=item Known problems

=item Cocoa

=back

=item Starting From Scratch

=item AUTHOR

=item DATE

=back

=head2 perlnetware - Perl for NetWare

=over 4

=item DESCRIPTION

=item BUILD

=over 4

=item Tools & SDK

=item Setup

SetNWBld.bat, Buildtype.bat

=item Make

=item Interpreter

=item Extensions

=back

=item INSTALL

=item BUILD NEW EXTENSIONS

=item ACKNOWLEDGEMENTS

=item AUTHORS

=item DATE

=back

=head2 perlopenbsd - Perl version 5 on OpenBSD systems

=over 4

=item DESCRIPTION

=over 4

=item OpenBSD core dumps from getprotobyname_r and getservbyname_r with
ithreads

=back

=item AUTHOR

=back

=head2 perlos2 - Perl under OS/2, DOS, Win0.3*, Win0.95 and WinNT.

=over 4

=item SYNOPSIS

=back

=over 4

=item DESCRIPTION

=over 4

=item Target

=item Other OSes

=item Prerequisites

EMX, RSX, HPFS, pdksh

=item Starting Perl programs under OS/2 (and DOS and...)

=item Starting OS/2 (and DOS) programs under Perl

=back

=item Frequently asked questions

=over 4

=item "It does not work"

=item I cannot run external programs

=item I cannot embed perl into my program, or use F<perl.dll> from my
program.

Is your program EMX-compiled with C<-Zmt -Zcrtdll>?, Did you use
L<ExtUtils::Embed>?

=item C<``> and pipe-C<open> do not work under DOS.

=item Cannot start C<find.exe "pattern" file>

=back

=item INSTALLATION

=over 4

=item Automatic binary installation

C<PERL_BADLANG>, C<PERL_BADFREE>, F<Config.pm>

=item Manual binary installation

Perl VIO and PM executables (dynamically linked), Perl_ VIO executable
(statically linked), Executables for Perl utilities, Main Perl library,
Additional Perl modules, Tools to compile Perl modules, Manpages for Perl
and utilities, Manpages for Perl modules, Source for Perl documentation,
Perl manual in F<.INF> format, Pdksh

=item B<Warning>

=back

=item Accessing documentation

=over 4

=item OS/2 F<.INF> file

=item Plain text

=item Manpages

=item HTML

=item GNU C<info> files

=item F<PDF> files

=item C<LaTeX> docs

=back

=item BUILD

=over 4

=item The short story

=item Prerequisites

=item Getting perl source

=item Application of the patches

=item Hand-editing

=item Making

=item Testing

A lot of C<bad free>, Process terminated by SIGTERM/SIGINT, F<op/fs.t>,
Z<>18, Z<>25, F<op/stat.t>

=item Installing the built perl

=item C<a.out>-style build

=back

=item Building a binary distribution

=item Building custom F<.EXE> files

=over 4

=item Making executables with a custom collection of statically loaded
extensions

=item Making executables with custom search paths

=back

=item Build FAQ

=over 4

=item Some C</> became C<\> in pdksh.

=item C<'errno'> - unresolved external

=item Problems with tr or sed

=item Some problem (forget which ;-)

=item Library ... not found

=item Segfault in make

=item op/sprintf test failure

=back

=item Specific (mis)features of OS/2 port

=over 4

=item C<setpriority>, C<getpriority>

=item C<system()>

=item C<extproc> on the first line

=item Additional modules:

=item Prebuilt methods:

C<File::Copy::syscopy>, C<DynaLoader::mod2fname>, C<Cwd::current_drive()>,
C<Cwd::sys_chdir(name)>, C<Cwd::change_drive(name)>,
C<Cwd::sys_is_absolute(name)>, C<Cwd::sys_is_rooted(name)>,
C<Cwd::sys_is_relative(name)>, C<Cwd::sys_cwd(name)>,
C<Cwd::sys_abspath(name, dir)>, C<Cwd::extLibpath([type])>,
C<Cwd::extLibpath_set( path [, type ] )>,
C<OS2::Error(do_harderror,do_exception)>, C<OS2::Errors2Drive(drive)>,
OS2::SysInfo(), OS2::BootDrive(), C<OS2::MorphPM(serve)>,
C<OS2::UnMorphPM(serve)>, C<OS2::Serve_Messages(force)>,
C<OS2::Process_Messages(force [, cnt])>, C<OS2::_control87(new,mask)>,
OS2::get_control87(), C<OS2::set_control87_em(new=MCW_EM,mask=MCW_EM)>,
C<OS2::DLLname([how [, \&xsub]])>

=item Prebuilt variables:

$OS2::emx_rev, $OS2::emx_env, $OS2::os_ver, $OS2::is_aout, $OS2::can_fork,
$OS2::nsyserror

=item Misfeatures

=item Modifications

C<popen>, C<tmpnam>, C<tmpfile>, C<ctermid>, C<stat>, C<mkdir>, C<rmdir>,
C<flock>

=item Identifying DLLs

=item Centralized management of resources

C<HAB>, C<HMQ>, Treating errors reported by OS/2 API,
C<CheckOSError(expr)>, C<CheckWinError(expr)>, C<SaveWinError(expr)>,
C<SaveCroakWinError(expr,die,name1,name2)>, C<WinError_2_Perl_rc>,
C<FillWinError>, C<FillOSError(rc)>, Loading DLLs and ordinals in DLLs

=back

=item Perl flavors

=over 4

=item F<perl.exe>

=item F<perl_.exe>

=item F<perl__.exe>

=item F<perl___.exe>

=item Why strange names?

=item Why dynamic linking?

=item Why chimera build?

=back

=item ENVIRONMENT

=over 4

=item C<PERLLIB_PREFIX>

=item C<PERL_BADLANG>

=item C<PERL_BADFREE>

=item C<PERL_SH_DIR>

=item C<USE_PERL_FLOCK>

=item C<TMP> or C<TEMP>

=back

=item Evolution

=over 4

=item Text-mode filehandles

=item Priorities

=item DLL name mangling: pre 5.6.2

=item DLL name mangling: 5.6.2 and beyond

Global DLLs, specific DLLs, C<BEGINLIBPATH> and C<ENDLIBPATH>, F<.> from
C<LIBPATH>

=item DLL forwarder generation

=item Threading

=item Calls to external programs

=item Memory allocation

=item Threads

C<COND_WAIT>, F<os2.c>

=back

=item BUGS

=back

=over 4

=item AUTHOR

=item SEE ALSO

=back

=head2 perlos390 - building and installing Perl for OS/390 and z/OS

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Tools

=item Unpacking Perl distribution on OS/390

=item Setup and utilities for Perl on OS/390

=item Configure Perl on OS/390

=item Build, Test, Install Perl on OS/390

=item Build Anomalies with Perl on OS/390

=item Testing Anomalies with Perl on OS/390

=item Installation Anomalies with Perl on OS/390

=item Usage Hints for Perl on OS/390

=item Floating Point Anomalies with Perl on OS/390

=item Modules and Extensions for Perl on OS/390

=back

=item AUTHORS

=item SEE ALSO

=over 4

=item Mailing list for Perl on OS/390

=back

=item HISTORY

=back

=head2 perlos400 - Perl version 5 on OS/400

=over 4

=item DESCRIPTION

=over 4

=item Compiling Perl for OS/400 PASE

=item Installing Perl in OS/400 PASE

=item Using Perl in OS/400 PASE

=item Known Problems

=item Perl on ILE

=back

=item AUTHORS

=back

=head2 perlplan9 - Plan 9-specific documentation for Perl

=over 4

=item DESCRIPTION

=over 4

=item Invoking Perl

=item What's in Plan 9 Perl

=item What's not in Plan 9 Perl

=item Perl5 Functions not currently supported in Plan 9 Perl

=item Signals in Plan 9 Perl

=back

=item COMPILING AND INSTALLING PERL ON PLAN 9

=over 4

=item Installing Perl Documentation on Plan 9

=back

=item BUGS

=item Revision date

=item AUTHOR

=back

=head2 perlqnx - Perl version 5 on QNX

=over 4

=item DESCRIPTION

=over 4

=item Required Software for Compiling Perl on QNX4

/bin/sh, ar, nm, cpp, make

=item Outstanding Issues with Perl on QNX4

=item QNX auxiliary files

qnx/ar, qnx/cpp

=item Outstanding issues with perl under QNX6

=item Cross-compilation

=back

=item AUTHOR

=back

=head2 perlriscos - Perl version 5 for RISC OS

=over 4

=item DESCRIPTION

=item BUILD

=item AUTHOR

=back

=head2 perlsolaris - Perl version 5 on Solaris systems

=over 4

=item DESCRIPTION

=over 4

=item Solaris Version Numbers.

=back

=item RESOURCES

Solaris FAQ, Precompiled Binaries, Solaris Documentation

=item SETTING UP

=over 4

=item File Extraction Problems on Solaris.

=item Compiler and Related Tools on Solaris.

=item Environment for Compiling perl on Solaris

=back

=item RUN CONFIGURE.

=over 4

=item 64-bit perl on Solaris.

=item Threads in perl on Solaris.

=item Malloc Issues with perl on Solaris.

=back

=item MAKE PROBLEMS.

Dynamic Loading Problems With GNU as and GNU ld, ld.so.1: ./perl: fatal:
relocation error:, dlopen: stub interception failed, #error "No
DATAMODEL_NATIVE specified", sh: ar: not found

=item MAKE TEST

=over 4

=item op/stat.t test 4 in Solaris

=item nss_delete core dump from op/pwent or op/grent

=back

=item CROSS-COMPILATION

=item PREBUILT BINARIES OF PERL FOR SOLARIS.

=item RUNTIME ISSUES FOR PERL ON SOLARIS.

=over 4

=item Limits on Numbers of Open Files on Solaris.

=back

=item SOLARIS-SPECIFIC MODULES.

=item SOLARIS-SPECIFIC PROBLEMS WITH MODULES.

=over 4

=item Proc::ProcessTable on Solaris

=item BSD::Resource on Solaris

=item Net::SSLeay on Solaris

=back

=item SunOS 4.x

=item AUTHOR

=back

=head2 perlsymbian - Perl version 5 on Symbian OS

=over 4

=item DESCRIPTION

=over 4

=item Compiling Perl on Symbian

=item Compilation problems

=item PerlApp

=item sisify.pl

=item Using Perl in Symbian

=back

=item TO DO

=item WARNING

=item NOTE

=item AUTHOR

=item COPYRIGHT

=item LICENSE

=item HISTORY

=back

=head2 perlsynology - Perl 5 on Synology DSM systems

=over 4

=item DESCRIPTION

=over 4

=item Setting up the build environment

=item Compiling Perl 5

=item Known problems

Error message "No error definitions found",
F<ext/DynaLoader/t/DynaLoader.t>

=item Smoke testing Perl 5

=item Adding libraries

=back

=item REVISION

=item AUTHOR

=back

=head2 perltru64 - Perl version 5 on Tru64 (formerly known as Digital UNIX
formerly known as DEC OSF/1) systems

=over 4

=item DESCRIPTION

=over 4

=item Compiling Perl 5 on Tru64

=item Using Large Files with Perl on Tru64

=item Threaded Perl on Tru64

=item Long Doubles on Tru64

=item DB_File tests failing on Tru64

=item 64-bit Perl on Tru64

=item Warnings about floating-point overflow when compiling Perl on Tru64

=back

=item Testing Perl on Tru64

=item ext/ODBM_File/odbm Test Failing With Static Builds

=item Perl Fails Because Of Unresolved Symbol sockatmark

=item read_cur_obj_info: bad file magic number

=item AUTHOR

=back

=head2 perlvms - VMS-specific documentation for Perl

=over 4

=item DESCRIPTION

=item Installation

=item Organization of Perl Images

=over 4

=item Core Images

=item Perl Extensions

=item Installing static extensions

=item Installing dynamic extensions

=back

=item File specifications

=over 4

=item Syntax

=item Filename Case

=item Symbolic Links

=item Wildcard expansion

=item Pipes

=back

=item PERL5LIB and PERLLIB

=item The Perl Forked Debugger

=item PERL_VMS_EXCEPTION_DEBUG

=item Command line

=over 4

=item I/O redirection and backgrounding

=item Command line switches

-i, -S, -u

=back

=item Perl functions

File tests, backticks, binmode FILEHANDLE, crypt PLAINTEXT, USER, die,
dump, exec LIST, fork, getpwent, getpwnam, getpwuid, gmtime, kill, qx//,
select (system call), stat EXPR, system LIST, time, times, unlink LIST,
utime LIST, waitpid PID,FLAGS

=item Perl variables

%ENV, CRTL_ENV, CLISYM_[LOCAL], Any other string, $!, $^E, $?, $|

=item Standard modules with VMS-specific differences

=over 4

=item SDBM_File

=back

=item Revision date

=item AUTHOR

=back

=head2 perlvos - Perl for Stratus OpenVOS

=over 4

=item SYNOPSIS

=item BUILDING PERL FOR OPENVOS

=item INSTALLING PERL IN OPENVOS

=item USING PERL IN OPENVOS

=over 4

=item Restrictions of Perl on OpenVOS

=back

=item TEST STATUS

=item SUPPORT STATUS

=item AUTHOR

=item LAST UPDATE

=back

=head2 perlwin32 - Perl under Windows

=over 4

=item SYNOPSIS

=item DESCRIPTION

L<http://mingw.org>, L<http://mingw-w64.org>

=over 4

=item Setting Up Perl on Windows

Make, Command Shell, Microsoft Visual C++, Microsoft Visual C++ 2008-2017
Express/Community Edition, Microsoft Visual C++ 2005 Express Edition,
Microsoft Visual C++ Toolkit 2003, Microsoft Platform SDK 64-bit Compiler,
MinGW release 3 with gcc, Intel C++ Compiler

=item Building

=item Testing Perl on Windows

=item Installation of Perl on Windows

=item Usage Hints for Perl on Windows

Environment Variables, File Globbing, Using perl from the command line,
Building Extensions, Command-line Wildcard Expansion, Notes on 64-bit
Windows

=item Running Perl Scripts

=item Miscellaneous Things

=back

=item BUGS AND CAVEATS

=item ACKNOWLEDGEMENTS

=item AUTHORS

Gary Ng E<lt>71564.1743@CompuServe.COME<gt>, Gurusamy Sarathy
E<lt>gsar@activestate.comE<gt>, Nick Ing-Simmons
E<lt>nick@ing-simmons.netE<gt>, Jan Dubois E<lt>jand@activestate.comE<gt>,
Steve Hay E<lt>steve.m.hay@googlemail.comE<gt>

=item SEE ALSO

=item HISTORY

=back

=head2 perlboot - Links to information on object-oriented programming in
Perl

=over 4

=item DESCRIPTION

=back

=head2 perlbot - Links to information on object-oriented programming in
Perl

=over 4

=item DESCRIPTION

=back

=head2 perlrepository - Links to current information on the Perl source
repository

=over 4

=item DESCRIPTION

=back

=head2 perltodo - Link to the Perl to-do list

=over 4

=item DESCRIPTION

=back

=head2 perltooc - Links to information on object-oriented programming in
Perl

=over 4

=item DESCRIPTION

=back

=head2 perltoot - Links to information on object-oriented programming in
Perl

=over 4

=item DESCRIPTION

=back

=head1 PRAGMA DOCUMENTATION

=head2 arybase - Set indexing base via $[

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item HISTORY

=item BUGS

=item SEE ALSO

=back

=head2 attributes - get/set subroutine or variable attributes

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item What C<import> does

=item Built-in Attributes

lvalue, method, prototype(..), locked, const, shared, unique

=item Available Subroutines

get, reftype

=item Package-specific Attribute Handling

FETCH_I<type>_ATTRIBUTES, MODIFY_I<type>_ATTRIBUTES

=item Syntax of Attribute Lists

=back

=item EXPORTS

=over 4

=item Default exports

=item Available exports

=item Export tags defined

=back

=item EXAMPLES

=item MORE EXAMPLES

=item SEE ALSO

=back

=head2 autodie - Replace functions with ones that succeed or die with
lexical scope

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXCEPTIONS

=item CATEGORIES

=item FUNCTION SPECIFIC NOTES

=over 4

=item print

=item flock

=item system/exec

=back

=item GOTCHAS

=item DIAGNOSTICS

:void cannot be used with lexical scope, No user hints defined for %s

=item BUGS

=over 4

=item autodie and string eval

=item REPORTING BUGS

=back

=item FEEDBACK

=item AUTHOR

=item LICENSE

=item SEE ALSO

=item ACKNOWLEDGEMENTS

=back

=head2 autodie::Scope::Guard - Wrapper class for calling subs at end of
scope

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Methods

=back

=item AUTHOR

=item LICENSE

=back

=head2 autodie::Scope::GuardStack - Hook stack for managing scopes via %^H

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Methods

=back

=item AUTHOR

=item LICENSE

=back

=head2 autodie::Util - Internal Utility subroutines for autodie and Fatal

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Methods

=back

=item AUTHOR

=item LICENSE

=back

=head2 autodie::exception - Exceptions from autodying functions.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Common Methods

=back

=back

=over 4

=item Advanced methods

=back

=over 4

=item SEE ALSO

=item LICENSE

=item AUTHOR

=back

=head2 autodie::exception::system - Exceptions from autodying system().

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=over 4

=item stringify

=back

=over 4

=item LICENSE

=item AUTHOR

=back

=head2 autodie::hints - Provide hints about user subroutines to autodie

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Introduction

=item What are hints?

=item Example hints

=back

=item Manually setting hints from within your program

=item Adding hints to your module

=item Insisting on hints

=back

=over 4

=item Diagnostics

Attempts to set_hints_for unidentifiable subroutine, fail hints cannot be
provided with either scalar or list hints for %s, %s hint missing for %s

=item ACKNOWLEDGEMENTS

=item AUTHOR

=item LICENSE

=item SEE ALSO

=back

=head2 autodie::skip - Skip a package when throwing autodie exceptions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item AUTHOR

=item LICENSE

=item SEE ALSO

=back

=head2 autouse - postpone load of modules until a function is used

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item WARNING

=item AUTHOR

=item SEE ALSO

=back

=head2 base - Establish an ISA relationship with base classes at compile
time

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item DIAGNOSTICS

Base class package "%s" is empty, Class 'Foo' tried to inherit from itself

=item HISTORY

=item CAVEATS

=item SEE ALSO

=back

=head2 bigint - Transparent BigInteger support for Perl

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item use integer vs. use bigint

=item Options

a or accuracy, p or precision, t or trace, hex, oct, l, lib, try or only, v
or version

=item Math Library

=item Internal Format

=item Sign

=item Method calls

=item Methods

inf(), NaN(), e, PI, bexp(), bpi(), upgrade(), in_effect()

=back

=item CAVEATS

Operator vs literal overloading, ranges, in_effect(), hex()/oct()

=item MODULES USED

=item EXAMPLES

=item BUGS

=item SUPPORT

=item LICENSE

=item SEE ALSO

=item AUTHORS

=back

=head2 bignum - Transparent BigNumber support for Perl

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Options

a or accuracy, p or precision, t or trace, l or lib, hex, oct, v or version

=item Methods

=item Caveats

inf(), NaN(), e, PI(), bexp(), bpi(), upgrade(), in_effect()

=item Math Library

=item INTERNAL FORMAT

=item SIGN

=back

=item CAVEATS

Operator vs literal overloading, in_effect(), hex()/oct()

=item MODULES USED

=item EXAMPLES

=item BUGS

=item SUPPORT

RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN
Ratings, Search CPAN, CPAN Testers Matrix

=item LICENSE

=item SEE ALSO

=item AUTHORS

=back

=head2 bigrat - Transparent BigNumber/BigRational support for Perl

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Modules Used

=item Math Library

=item Sign

=item Methods

inf(), NaN(), e, PI, bexp(), bpi(), upgrade(), in_effect()

=item MATH LIBRARY

=item Caveat

=item Options

a or accuracy, p or precision, t or trace, l or lib, hex, oct, v or version

=back

=item CAVEATS

Operator vs literal overloading, in_effect(), hex()/oct()

=item EXAMPLES

=item BUGS

=item SUPPORT

=item LICENSE

=item SEE ALSO

=item AUTHORS

=back

=head2 blib - Use MakeMaker's uninstalled version of a package

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item BUGS

=item AUTHOR

=back

=head2 bytes - Perl pragma to expose the individual bytes of characters

=over 4

=item NOTICE

=item SYNOPSIS

=item DESCRIPTION

=item LIMITATIONS

=item SEE ALSO

=back

=head2 charnames - access to Unicode character names and named character
sequences; also define character names

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item LOOSE MATCHES

=item ALIASES

=item CUSTOM ALIASES

=item charnames::string_vianame(I<name>)

=item charnames::vianame(I<name>)

=item charnames::viacode(I<code>)

=item CUSTOM TRANSLATORS

=item BUGS

=back

=head2 constant - Perl pragma to declare constants

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item NOTES

=over 4

=item List constants

=item Defining multiple constants at once

=item Magic constants

=back

=item TECHNICAL NOTES

=item CAVEATS

=item SEE ALSO

=item BUGS

=item AUTHORS

=item COPYRIGHT & LICENSE

=back

=head2 deprecate - Perl pragma for deprecating the core version of a module

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item EXPORT

=back

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 diagnostics, splain - produce verbose warning diagnostics

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item The C<diagnostics> Pragma

=item The I<splain> Program

=back

=item EXAMPLES

=item INTERNALS

=item BUGS

=item AUTHOR

=back

=head2 encoding - allows you to write your script in non-ASCII and
non-UTF-8

=over 4

=item WARNING

=item SYNOPSIS

=item DESCRIPTION

C<use encoding ['I<ENCNAME>'] ;>, C<use encoding I<ENCNAME>
Filter=E<gt>1;>, C<no encoding;>

=item OPTIONS

=over 4

=item Setting C<STDIN> and/or C<STDOUT> individually

=item The C<:locale> sub-pragma

=back

=item CAVEATS

=over 4

=item SIDE EFFECTS

=item DO NOT MIX MULTIPLE ENCODINGS

=item Prior to Perl v5.22

=item Prior to Encode version 1.87

=item Prior to Perl v5.8.1

"NON-EUC" doublebyte encodings, C<tr///>, Legend of characters above

=back

=item EXAMPLE - Greekperl

=item BUGS

Thread safety, Can't be used by more than one module in a single program,
Other modules using C<STDIN> and C<STDOUT> get the encoded stream, literals
in regex that are longer than 127 bytes, EBCDIC, C<format>, See also
L</CAVEATS>

=item HISTORY

=item SEE ALSO

=back

=head2 encoding::warnings - Warn on implicit encoding conversions

=over 4

=item VERSION

=item NOTICE

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Overview of the problem

=item Detecting the problem

=item Solving the problem

Upgrade both sides to unicode-strings, Downgrade both sides to
byte-strings, Specify the encoding for implicit byte-string upgrading,
PerlIO layers for B<STDIN> and B<STDOUT>, Literal conversions, Implicit
upgrading for byte-strings

=back

=item CAVEATS

=back

=over 4

=item SEE ALSO

=item AUTHORS

=item COPYRIGHT

=back

=head2 experimental - Experimental features made easy

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Ordering matters

=item Disclaimer

=back

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 feature - Perl pragma to enable new features

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Lexical effect

=item C<no feature>

=back

=item AVAILABLE FEATURES

=over 4

=item The 'say' feature

=item The 'state' feature

=item The 'switch' feature

=item The 'unicode_strings' feature

=item The 'unicode_eval' and 'evalbytes' features

=item The 'current_sub' feature

=item The 'array_base' feature

=item The 'fc' feature

=item The 'lexical_subs' feature

=item The 'postderef' and 'postderef_qq' features

=item The 'signatures' feature

=item The 'refaliasing' feature

=item The 'bitwise' feature

=item The 'declared_refs' feature

=back

=item FEATURE BUNDLES

=item IMPLICIT LOADING

=back

=head2 fields - compile-time class fields

=over 4

=item SYNOPSIS

=item DESCRIPTION

new, phash

=item SEE ALSO

=back

=head2 filetest - Perl pragma to control the filetest permission operators

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Consider this carefully

=item The "access" sub-pragma

=item Limitation with regard to C<_>

=back

=back

=head2 if - C<use> a Perl module if a condition holds (also can C<no> a
module)

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item EXAMPLES

=back

=item BUGS

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENCE

=back

=head2 integer - Perl pragma to use integer arithmetic instead of floating
point

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 less - perl pragma to request less of something

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item FOR MODULE AUTHORS

=over 4

=item C<< BOOLEAN = less->of( FEATURE ) >>

=item C<< FEATURES = less->of() >>

=back

=item CAVEATS

This probably does nothing, This works only on 5.10+

=back

=head2 lib - manipulate @INC at compile time

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Adding directories to @INC

=item Deleting directories from @INC

=item Restoring original @INC

=back

=item CAVEATS

=item NOTES

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 locale - Perl pragma to use or avoid POSIX locales for built-in
operations

=over 4

=item WARNING

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 mro - Method Resolution Order

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item OVERVIEW

=item The C3 MRO

=over 4

=item What is C3?

=item How does C3 work

=back

=item Functions

=over 4

=item mro::get_linear_isa($classname[, $type])

=item mro::set_mro ($classname, $type)

=item mro::get_mro($classname)

=item mro::get_isarev($classname)

=item mro::is_universal($classname)

=item mro::invalidate_all_method_caches()

=item mro::method_changed_in($classname)

=item mro::get_pkg_gen($classname)

=item next::method

=item next::can

=item maybe::next::method

=back

=item SEE ALSO

=over 4

=item The original Dylan paper

L<http://haahr.tempdomainname.com/dylan/linearization-oopsla96.html>

=item Pugs

=item Parrot

L<http://use.perl.org/~autrijus/journal/25768>

=item Python 2.3 MRO related links

L<http://www.python.org/2.3/mro.html>,
L<http://www.python.org/2.2.2/descrintro.html#mro>

=item Class::C3

L<Class::C3>

=back

=item AUTHOR

=back

=head2 ok - Alternative to Test::More::use_ok

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CC0 1.0 Universal

=back

=head2 open - perl pragma to set default PerlIO layers for input and output

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item NONPERLIO FUNCTIONALITY

=item IMPLEMENTATION DETAILS

=item SEE ALSO

=back

=head2 ops - Perl pragma to restrict unsafe operations when compiling

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=back

=head2 overload - Package for overloading Perl operations

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Fundamentals

=item Overloadable Operations

C<not>, C<neg>, C<++>, C<-->, I<Assignments>, I<Non-mutators with a mutator
variant>, C<int>, I<String, numeric, boolean, and regexp conversions>,
I<Iteration>, I<File tests>, I<Matching>, I<Dereferencing>, I<Special>

=item Magic Autogeneration

=item Special Keys for C<use overload>

defined, but FALSE, C<undef>, TRUE

=item How Perl Chooses an Operator Implementation

=item Losing Overloading

=item Inheritance and Overloading

Method names in the C<use overload> directive, Overloading of an operation
is inherited by derived classes

=item Run-time Overloading

=item Public Functions

overload::StrVal(arg), overload::Overloaded(arg), overload::Method(obj,op)

=item Overloading Constants

integer, float, binary, q, qr

=back

=item IMPLEMENTATION

=item COOKBOOK

=over 4

=item Two-face Scalars

=item Two-face References

=item Symbolic Calculator

=item I<Really> Symbolic Calculator

=back

=item AUTHOR

=item SEE ALSO

=item DIAGNOSTICS

Odd number of arguments for overload::constant, '%s' is not an overloadable
type, '%s' is not a code reference, overload arg '%s' is invalid

=item BUGS AND PITFALLS

=back

=head2 overloading - perl pragma to lexically control overloading

=over 4

=item SYNOPSIS

=item DESCRIPTION

C<no overloading>, C<no overloading @ops>, C<use overloading>, C<use
overloading @ops>

=back

=head2 parent - Establish an ISA relationship with base classes at compile
time

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item HISTORY

=item CAVEATS

=item SEE ALSO

=item AUTHORS AND CONTRIBUTORS

=item MAINTAINER

=item LICENSE

=back

=head2 re - Perl pragma to alter regular expression behaviour

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item 'taint' mode

=item 'eval' mode

=item 'strict' mode

=item '/flags' mode

=item 'debug' mode

=item 'Debug' mode

Compile related options, COMPILE, PARSE, OPTIMISE, TRIEC, DUMP, FLAGS,
TEST, Execute related options, EXECUTE, MATCH, TRIEE, INTUIT, Extra
debugging options, EXTRA, BUFFERS, TRIEM, STATE, STACK, GPOS, OPTIMISEM,
OFFSETS, OFFSETSDBG, Other useful flags, ALL, All, MORE, More

=item Exportable Functions

is_regexp($ref), regexp_pattern($ref), regmust($ref), regname($name,$all),
regnames($all), regnames_count()

=back

=item SEE ALSO

=back

=head2 sigtrap - Perl pragma to enable simple signal handling

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item OPTIONS

=over 4

=item SIGNAL HANDLERS

B<stack-trace>, B<die>, B<handler> I<your-handler>

=item SIGNAL LISTS

B<normal-signals>, B<error-signals>, B<old-interface-signals>

=item OTHER

B<untrapped>, B<any>, I<signal>, I<number>

=back

=item EXAMPLES

=back

=head2 sort - perl pragma to control sort() behaviour

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEATS

=back

=head2 strict - Perl pragma to restrict unsafe constructs

=over 4

=item SYNOPSIS

=item DESCRIPTION

C<strict refs>, C<strict vars>, C<strict subs>

=item HISTORY

=back

=head2 subs - Perl pragma to predeclare sub names

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 threads - Perl interpreter-based threads

=over 4

=item VERSION

=item WARNING

=item SYNOPSIS

=item DESCRIPTION

$thr = threads->create(FUNCTION, ARGS), $thr->join(), $thr->detach(),
threads->detach(), threads->self(), $thr->tid(), threads->tid(), "$thr",
threads->object($tid), threads->yield(), threads->list(),
threads->list(threads::all), threads->list(threads::running),
threads->list(threads::joinable), $thr1->equal($thr2), async BLOCK;,
$thr->error(), $thr->_handle(), threads->_handle()

=item EXITING A THREAD

threads->exit(), threads->exit(status), die(), exit(status), use threads
'exit' => 'threads_only', threads->create({'exit' => 'thread_only'}, ...),
$thr->set_thread_exit_only(boolean), threads->set_thread_exit_only(boolean)

=item THREAD STATE

$thr->is_running(), $thr->is_joinable(), $thr->is_detached(),
threads->is_detached()

=item THREAD CONTEXT

=over 4

=item Explicit context

=item Implicit context

=item $thr->wantarray()

=item threads->wantarray()

=back

=item THREAD STACK SIZE

threads->get_stack_size();, $size = $thr->get_stack_size();, $old_size =
threads->set_stack_size($new_size);, use threads ('stack_size' => VALUE);,
$ENV{'PERL5_ITHREADS_STACK_SIZE'}, threads->create({'stack_size' => VALUE},
FUNCTION, ARGS), $thr2 = $thr1->create(FUNCTION, ARGS)

=item THREAD SIGNALLING

$thr->kill('SIG...');

=item WARNINGS

Perl exited with active threads:, Thread creation failed: pthread_create
returned #, Thread # terminated abnormally: .., Using minimum thread stack
size of #, Thread creation failed: pthread_attr_setstacksize(I<SIZE>)
returned 22

=item ERRORS

This Perl not built to support threads, Cannot change stack size of an
existing thread, Cannot signal threads without safe signals, Unrecognized
signal name: ..

=item BUGS AND LIMITATIONS

Thread-safe modules, Using non-thread-safe modules, Memory consumption,
Current working directory, Environment variables, Catching signals,
Parent-child threads, Creating threads inside special blocks, Unsafe
signals, Perl has been built with C<PERL_OLD_SIGNALS> (see C<perl -V>), The
environment variable C<PERL_SIGNALS> is set to C<unsafe> (see
L<perlrun/"PERL_SIGNALS">), The module L<Perl::Unsafe::Signals> is used,
Returning closures from threads, Returning objects from threads, END blocks
in threads, Open directory handles, Detached threads and global
destruction, Perl Bugs and the CPAN Version of L<threads>

=item REQUIREMENTS

=item SEE ALSO

=item AUTHOR

=item LICENSE

=item ACKNOWLEDGEMENTS

=back

=head2 threads::shared - Perl extension for sharing data structures between
threads

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item EXPORT

=item FUNCTIONS

share VARIABLE, shared_clone REF, is_shared VARIABLE, lock VARIABLE,
cond_wait VARIABLE, cond_wait CONDVAR, LOCKVAR, cond_timedwait VARIABLE,
ABS_TIMEOUT, cond_timedwait CONDVAR, ABS_TIMEOUT, LOCKVAR, cond_signal
VARIABLE, cond_broadcast VARIABLE

=item OBJECTS

=item NOTES

=item WARNINGS

cond_broadcast() called on unlocked variable, cond_signal() called on
unlocked variable

=item BUGS AND LIMITATIONS

=item SEE ALSO

=item AUTHOR

=item LICENSE

=back

=head2 utf8 - Perl pragma to enable/disable UTF-8 (or UTF-EBCDIC) in source
code

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Utility functions

C<$num_octets = utf8::upgrade($string)>, C<$success =
utf8::downgrade($string[, $fail_ok])>, C<utf8::encode($string)>, C<$success
= utf8::decode($string)>, C<$unicode =
utf8::native_to_unicode($code_point)>, C<$native =
utf8::unicode_to_native($code_point)>, C<$flag = utf8::is_utf8($string)>,
C<$flag = utf8::valid($string)>

=back

=item BUGS

=item SEE ALSO

=back

=head2 vars - Perl pragma to predeclare global variable names

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 version - Perl extension for Version Objects

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item TYPES OF VERSION OBJECTS

Decimal Versions, Dotted Decimal Versions

=item DECLARING VERSIONS

=over 4

=item How to convert a module from decimal to dotted-decimal

=item How to C<declare()> a dotted-decimal version

=back

=item PARSING AND COMPARING VERSIONS

=over 4

=item How to C<parse()> a version

=item How to check for a legal version string

C<is_lax()>, C<is_strict()>

=item How to compare version objects

=back

=item OBJECT METHODS

=over 4

=item is_alpha()

=item is_qv()

=item normal()

=item numify()

=item stringify()

=back

=item EXPORTED FUNCTIONS

=over 4

=item qv()

=item is_lax()

=item is_strict()

=back

=item AUTHOR

=item SEE ALSO

=back

=head2 version::Internals - Perl extension for Version Objects

=over 4

=item DESCRIPTION

=item WHAT IS A VERSION?

Decimal versions, Dotted-Decimal versions

=over 4

=item Decimal Versions

=item Dotted-Decimal Versions

=item Alpha Versions

=item Regular Expressions for Version Parsing

C<$version::LAX>, C<$version::STRICT>, v1.234.5

=back

=item IMPLEMENTATION DETAILS

=over 4

=item Equivalence between Decimal and Dotted-Decimal Versions

=item Quoting Rules

=item What about v-strings?

=item Version Object Internals

original, qv, alpha, version

=item Replacement UNIVERSAL::VERSION

=back

=item USAGE DETAILS

=over 4

=item Using modules that use version.pm

Decimal versions always work, Dotted-Decimal versions work sometimes

=item Object Methods

new(), qv(), Normal Form, Numification, Stringification, Comparison
operators, Logical Operators

=back

=item AUTHOR

=item SEE ALSO

=back

=head2 vmsish - Perl pragma to control VMS-specific language features

=over 4

=item SYNOPSIS

=item DESCRIPTION

C<vmsish status>, C<vmsish exit>, C<vmsish time>, C<vmsish hushed>

=back

=head2 warnings - Perl pragma to control optional warnings

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Default Warnings and Optional Warnings

=item What's wrong with B<-w> and C<$^W>

=item Controlling Warnings from the Command Line

B<-w> X<-w>, B<-W> X<-W>, B<-X> X<-X>

=item Backward Compatibility

=item Category Hierarchy
X<warning, categories>

=item Fatal Warnings
X<warning, fatal>

=item Reporting Warnings from a Module
X<warning, reporting> X<warning, registering>

=back

=item FUNCTIONS

use warnings::register, warnings::enabled(), warnings::enabled($category),
warnings::enabled($object), warnings::fatal_enabled(),
warnings::fatal_enabled($category), warnings::fatal_enabled($object),
warnings::warn($message), warnings::warn($category, $message),
warnings::warn($object, $message), warnings::warnif($message),
warnings::warnif($category, $message), warnings::warnif($object, $message),
warnings::register_categories(@names)

=back

=head2 warnings::register - warnings import function

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head1 MODULE DOCUMENTATION

=head2 AnyDBM_File - provide framework for multiple DBMs

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item DBM Comparisons

[0], [1], [2], [3]

=back

=item SEE ALSO

=back

=head2 App::Cpan - easily interact with CPAN from the command line

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Options

-a, -A module [ module ... ], -c module, -C module [ module ... ], -D
module [ module ... ], -f, -F, -g module [ module ... ], -G module [ module
... ], -h, -i module [ module ... ], -I, -j Config.pm, -J, -l, -L author [
author ... ], -m, -M mirror1,mirror2,.., -n, -O, -p, -P, -r, -s, -t module
[ module ... ], -T, -u, -v, -V, -w, -x module [ module ... ], -I<X>

=item Examples

=item Environment variables

NONINTERACTIVE_TESTING, PERL_MM_USE_DEFAULT, CPAN_OPTS,
CPANSCRIPT_LOGLEVEL, GIT_COMMAND

=item Methods

=back

=back

run()

=over 4

=item EXIT VALUES

=item TO DO

=item BUGS

=item SEE ALSO

=item SOURCE AVAILABILITY

=item CREDITS

=item AUTHOR

=item COPYRIGHT

=back

=head2 App::Prove - Implements the C<prove> command.

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item SYNOPSIS

=back

=over 4

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item Attributes

C<archive>, C<argv>, C<backwards>, C<blib>, C<color>, C<directives>,
C<dry>, C<exec>, C<extensions>, C<failures>, C<comments>, C<formatter>,
C<harness>, C<ignore_exit>, C<includes>, C<jobs>, C<lib>, C<merge>,
C<modules>, C<parse>, C<plugins>, C<quiet>, C<really_quiet>, C<recurse>,
C<rules>, C<show_count>, C<show_help>, C<show_man>, C<show_version>,
C<shuffle>, C<state>, C<state_class>, C<taint_fail>, C<taint_warn>,
C<test_args>, C<timer>, C<verbose>, C<warnings_fail>, C<warnings_warn>,
C<tapversion>, C<trap>

=back

=over 4

=item PLUGINS

=over 4

=item Sample Plugin

=back

=item SEE ALSO

=back

=head2 App::Prove::State - State storage for the C<prove> command.

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item SYNOPSIS

=back

=over 4

=item METHODS

=over 4

=item Class Methods

C<store>, C<extensions> (optional), C<result_class> (optional)

=back

=back

=over 4

=item C<result_class>

=back

=over 4

=item C<extensions>

=back

=over 4

=item C<results>

=back

=over 4

=item C<commit>

=back

=over 4

=item Instance Methods

C<last>, C<failed>, C<passed>, C<all>, C<hot>, C<todo>, C<slow>, C<fast>,
C<new>, C<old>, C<save>

=back

=head2 App::Prove::State::Result - Individual test suite results.

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item SYNOPSIS

=back

=over 4

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item C<state_version>

=back

=over 4

=item C<test_class>

=back

=head2 App::Prove::State::Result::Test - Individual test results.

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item SYNOPSIS

=back

=over 4

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item Instance Methods

=back

=head2 Archive::Tar - module for manipulations of tar archives

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Object Methods

=over 4

=item Archive::Tar->new( [$file, $compressed] )

=back

=back

=over 4

=item $tar->read ( $filename|$handle, [$compressed, {opt => 'val'}] )

limit, filter, md5, extract

=back

=over 4

=item $tar->contains_file( $filename )

=back

=over 4

=item $tar->extract( [@filenames] )

=back

=over 4

=item $tar->extract_file( $file, [$extract_path] )

=back

=over 4

=item $tar->list_files( [\@properties] )

=back

=over 4

=item $tar->get_files( [@filenames] )

=back

=over 4

=item $tar->get_content( $file )

=back

=over 4

=item $tar->replace_content( $file, $content )

=back

=over 4

=item $tar->rename( $file, $new_name )

=back

=over 4

=item $tar->chmod( $file, $mode )

=back

=over 4

=item $tar->chown( $file, $uname [, $gname] )

=back

=over 4

=item $tar->remove (@filenamelist)

=back

=over 4

=item $tar->clear

=back

=over 4

=item $tar->write ( [$file, $compressed, $prefix] )

=back

=over 4

=item $tar->add_files( @filenamelist )

=back

=over 4

=item $tar->add_data ( $filename, $data, [$opthashref] )

FILE, HARDLINK, SYMLINK, CHARDEV, BLOCKDEV, DIR, FIFO, SOCKET

=back

=over 4

=item $tar->error( [$BOOL] )

=back

=over 4

=item $tar->setcwd( $cwd );

=back

=over 4

=item Class Methods

=over 4

=item Archive::Tar->create_archive($file, $compressed, @filelist)

=back

=back

=over 4

=item Archive::Tar->iter( $filename, [ $compressed, {opt => $val} ] )

=back

=over 4

=item Archive::Tar->list_archive($file, $compressed, [\@properties])

=back

=over 4

=item Archive::Tar->extract_archive($file, $compressed)

=back

=over 4

=item $bool = Archive::Tar->has_io_string

=back

=over 4

=item $bool = Archive::Tar->has_perlio

=back

=over 4

=item $bool = Archive::Tar->has_zlib_support

=back

=over 4

=item $bool = Archive::Tar->has_bzip2_support

=back

=over 4

=item Archive::Tar->can_handle_compressed_files

=back

=over 4

=item GLOBAL VARIABLES

=over 4

=item $Archive::Tar::FOLLOW_SYMLINK

=item $Archive::Tar::CHOWN

=item $Archive::Tar::CHMOD

=item $Archive::Tar::SAME_PERMISSIONS

=item $Archive::Tar::DO_NOT_USE_PREFIX

=item $Archive::Tar::DEBUG

=item $Archive::Tar::WARN

=item $Archive::Tar::error

=item $Archive::Tar::INSECURE_EXTRACT_MODE

=item $Archive::Tar::HAS_PERLIO

=item $Archive::Tar::HAS_IO_STRING

=item $Archive::Tar::ZERO_PAD_NUMBERS

=item Tuning the way RESOLVE_SYMLINK will work

=back

=back

=over 4

=item FAQ

What's the minimum perl version required to run Archive::Tar?, Isn't
Archive::Tar slow?, Isn't Archive::Tar heavier on memory than /bin/tar?,
Can you lazy-load data instead?, How much memory will an X kb tar file
need?, What do you do with unsupported filetypes in an archive?, I'm using
WinZip, or some other non-POSIX client, and files are not being extracted
properly!, How do I extract only files that have property X from an
archive?, How do I access .tar.Z files?, How do I handle Unicode strings?

=item CAVEATS

=item TODO

Check if passed in handles are open for read/write, Allow archives to be
passed in as string, Facilitate processing an opened filehandle of a
compressed archive

=item SEE ALSO

The GNU tar specification, The PAX format specification, A comparison of
GNU and POSIX tar standards;
C<http://www.delorie.com/gnu/docs/tar/tar_114.html>, GNU tar intends to
switch to POSIX compatibility, A Comparison between various tar
implementations

=item AUTHOR

=item ACKNOWLEDGEMENTS

=item COPYRIGHT

=back

=head2 Archive::Tar::File - a subclass for in-memory extracted file from
Archive::Tar

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Accessors

name, mode, uid, gid, size, mtime, chksum, type, linkname, magic, version,
uname, gname, devmajor, devminor, prefix, raw

=back

=item Methods

=over 4

=item Archive::Tar::File->new( file => $path )

=item Archive::Tar::File->new( data => $path, $data, $opt )

=item Archive::Tar::File->new( chunk => $chunk )

=back

=back

=over 4

=item $bool = $file->extract( [ $alternative_name ] )

=back

=over 4

=item $path = $file->full_path

=back

=over 4

=item $bool = $file->validate

=back

=over 4

=item $bool = $file->has_content

=back

=over 4

=item $content = $file->get_content

=back

=over 4

=item $cref = $file->get_content_by_ref

=back

=over 4

=item $bool = $file->replace_content( $content )

=back

=over 4

=item $bool = $file->rename( $new_name )

=back

=over 4

=item $bool = $file->chmod( $mode )

=back

=over 4

=item $bool = $file->chown( $user [, $group])

=back

=over 4

=item Convenience methods

$file->is_file, $file->is_dir, $file->is_hardlink, $file->is_symlink,
$file->is_chardev, $file->is_blockdev, $file->is_fifo, $file->is_socket,
$file->is_longlink, $file->is_label, $file->is_unknown

=back

=head2 Attribute::Handlers - Simpler definition of attribute handlers

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

[0], [1], [2], [3], [4], [5], [6], [7]

=over 4

=item Typed lexicals

=item Type-specific attribute handlers

=item Non-interpretive attribute handlers

=item Phase-specific attribute handlers

=item Attributes as C<tie> interfaces

=back

=item EXAMPLES

=item UTILITY FUNCTIONS

findsym

=item DIAGNOSTICS

C<Bad attribute type: ATTR(%s)>, C<Attribute handler %s doesn't handle %s
attributes>, C<Declaration of %s attribute in package %s may clash with
future reserved word>, C<Can't have two ATTR specifiers on one subroutine>,
C<Can't autotie a %s>, C<Internal error: %s symbol went missing>, C<Won't
be able to apply END handler>

=item AUTHOR

=item BUGS

=item COPYRIGHT AND LICENSE

=back

=head2 AutoLoader - load subroutines only on demand

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Subroutine Stubs

=item Using B<AutoLoader>'s AUTOLOAD Subroutine

=item Overriding B<AutoLoader>'s AUTOLOAD Subroutine

=item Package Lexicals

=item Not Using AutoLoader

=item B<AutoLoader> vs. B<SelfLoader>

=item Forcing AutoLoader to Load a Function

=back

=item CAVEATS

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 AutoSplit - split a package for autoloading

=over 4

=item SYNOPSIS

=item DESCRIPTION

$keep, $check, $modtime

=over 4

=item Multiple packages

=back

=item DIAGNOSTICS

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 B - The Perl Compiler Backend

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item OVERVIEW

=item Utility Functions

=over 4

=item Functions Returning C<B::SV>, C<B::AV>, C<B::HV>, and C<B::CV>
objects

sv_undef, sv_yes, sv_no, svref_2object(SVREF), amagic_generation, init_av,
check_av, unitcheck_av, begin_av, end_av, comppadlist, regex_padav, main_cv

=item Functions for Examining the Symbol Table

walksymtable(SYMREF, METHOD, RECURSE, PREFIX)

=item Functions Returning C<B::OP> objects or for walking op trees

main_root, main_start, walkoptree(OP, METHOD), walkoptree_debug(DEBUG)

=item Miscellaneous Utility Functions

ppname(OPNUM), hash(STR), cast_I32(I), minus_c, cstring(STR),
perlstring(STR), safename(STR), class(OBJ), threadsv_names

=item Exported utility variables

@optype, @specialsv_name

=back

=item OVERVIEW OF CLASSES

=over 4

=item SV-RELATED CLASSES

=item B::SV Methods

REFCNT, FLAGS, object_2svref

=item B::IV Methods

IV, IVX, UVX, int_value, needs64bits, packiv

=item B::NV Methods

NV, NVX, COP_SEQ_RANGE_LOW, COP_SEQ_RANGE_HIGH

=item B::RV Methods

RV

=item B::PV Methods

PV, RV, PVX, CUR, LEN

=item B::PVMG Methods

MAGIC, SvSTASH

=item B::MAGIC Methods

MOREMAGIC, precomp, PRIVATE, TYPE, FLAGS, OBJ, PTR, REGEX

=item B::PVLV Methods

TARGOFF, TARGLEN, TYPE, TARG

=item B::BM Methods

USEFUL, PREVIOUS, RARE, TABLE

=item B::REGEXP Methods

REGEX, precomp, qr_anoncv, compflags

=item B::GV Methods

is_empty, NAME, SAFENAME, STASH, SV, IO, FORM, AV, HV, EGV, CV, CVGEN,
LINE, FILE, FILEGV, GvREFCNT, FLAGS, GPFLAGS

=item B::IO Methods

LINES, PAGE, PAGE_LEN, LINES_LEFT, TOP_NAME, TOP_GV, FMT_NAME, FMT_GV,
BOTTOM_NAME, BOTTOM_GV, SUBPROCESS, IoTYPE, IoFLAGS, IsSTD

=item B::AV Methods

FILL, MAX, ARRAY, ARRAYelt, OFF, AvFLAGS

=item B::CV Methods

STASH, START, ROOT, GV, FILE, DEPTH, PADLIST, OUTSIDE, OUTSIDE_SEQ, XSUB,
XSUBANY, CvFLAGS, const_sv, NAME_HEK

=item B::HV Methods

FILL, MAX, KEYS, RITER, NAME, ARRAY, PMROOT

=item OP-RELATED CLASSES

=item B::OP Methods

next, sibling, parent, name, ppaddr, desc, targ, type, opt, flags, private,
spare

=item B::UNOP Method

first

=item B::UNOP_AUX Methods (since Perl 5.22)

aux_list(cv), string(cv)

=item B::BINOP Method

last

=item B::LOGOP Method

other

=item B::LISTOP Method

children

=item B::PMOP Methods

pmreplroot, pmreplstart, pmnext, pmflags, extflags, precomp, pmoffset,
code_list, pmregexp

=item B::SVOP Methods

sv, gv

=item B::PADOP Method

padix

=item B::PVOP Method

pv

=item B::LOOP Methods

redoop, nextop, lastop

=item B::COP Methods

label, stash, stashpv, stashoff (threaded only), file, cop_seq, arybase,
line, warnings, io, hints, hints_hash

=item B::METHOP Methods (Since Perl 5.22)

first, meth_sv

=item PAD-RELATED CLASSES

=item B::PADLIST Methods

MAX, ARRAY, ARRAYelt, NAMES, REFCNT, id, outid

=item B::PADNAMELIST Methods

MAX, ARRAY, ARRAYelt, REFCNT

=item B::PADNAME Methods

PV, PVX, LEN, REFCNT, FLAGS, TYPE, SvSTASH, OURSTASH, PROTOCV,
COP_SEQ_RANGE_LOW, COP_SEQ_RANGE_HIGH, PARENT_PAD_INDEX,
PARENT_FAKELEX_FLAGS

=item $B::overlay

=back

=item AUTHOR

=back

=head2 B::Concise - Walk Perl syntax tree, printing concise info about ops

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXAMPLE

=item OPTIONS

=over 4

=item Options for Opcode Ordering

B<-basic>, B<-exec>, B<-tree>

=item Options for Line-Style

B<-concise>, B<-terse>, B<-linenoise>, B<-debug>, B<-env>

=item Options for tree-specific formatting

B<-compact>, B<-loose>, B<-vt>, B<-ascii>

=item Options controlling sequence numbering

B<-base>I<n>, B<-bigendian>, B<-littleendian>

=item Other options

B<-src>, B<-stash="somepackage">, B<-main>, B<-nomain>, B<-nobanner>,
B<-banner>, B<-banneris> => subref

=item Option Stickiness

=back

=item ABBREVIATIONS

=over 4

=item OP class abbreviations

=item OP flags abbreviations

=back

=item FORMATTING SPECIFICATIONS

=over 4

=item Special Patterns

B<(x(>I<exec_text>B<;>I<basic_text>B<)x)>, B<(*(>I<text>B<)*)>,
B<(*(>I<text1>B<;>I<text2>B<)*)>, B<(?(>I<text1>B<#>I<var>I<Text2>B<)?)>,
B<~>

=item # Variables

B<#>I<var>, B<#>I<var>I<N>, B<#>I<Var>, B<#addr>, B<#arg>, B<#class>,
B<#classsym>, B<#coplabel>, B<#exname>, B<#extarg>, B<#firstaddr>,
B<#flags>, B<#flagval>, B<#hints>, B<#hintsval>, B<#hyphseq>, B<#label>,
B<#lastaddr>, B<#name>, B<#NAME>, B<#next>, B<#nextaddr>, B<#noise>,
B<#private>, B<#privval>, B<#seq>, B<#seqnum>, B<#opt>, B<#sibaddr>,
B<#svaddr>, B<#svclass>, B<#svval>, B<#targ>, B<#targarg>, B<#targarglife>,
B<#typenum>

=back

=item One-Liner Command tips

perl -MO=Concise,bar foo.pl, perl -MDigest::MD5=md5 -MO=Concise,md5 -e1,
perl -MPOSIX -MO=Concise,_POSIX_ARG_MAX -e1, perl -MPOSIX -MO=Concise,a -e
'print _POSIX_SAVED_IDS', perl -MPOSIX -MO=Concise,a -e 'sub
a{_POSIX_SAVED_IDS}', perl -MB::Concise -e
'B::Concise::compile("-exec","-src", \%B::Concise::)->()'

=item Using B::Concise outside of the O framework

=over 4

=item Example: Altering Concise Renderings

=item set_style()

=item set_style_standard($name)

=item add_style ()

=item add_callback ()

=item Running B::Concise::compile()

=item B::Concise::reset_sequence()

=item Errors

=back

=item AUTHOR

=back

=head2 B::Debug - Walk Perl syntax tree, printing debug info about ops

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item OPTIONS

=item AUTHOR

=item LICENSE

=back

=head2 B::Deparse - Perl compiler backend to produce perl code

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item OPTIONS

B<-d>, B<-f>I<FILE>, B<-l>, B<-p>, B<-P>, B<-q>, B<-s>I<LETTERS>, B<C>,
B<i>I<NUMBER>, B<T>, B<v>I<STRING>B<.>, B<-x>I<LEVEL>

=item USING B::Deparse AS A MODULE

=over 4

=item Synopsis

=item Description

=item new

=item ambient_pragmas

strict, $[, bytes, utf8, integer, re, warnings, hint_bits, warning_bits,
%^H

=item coderef2text

=back

=item BUGS

=item AUTHOR

=back

=head2 B::Op_private - OP op_private flag definitions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item C<%bits>

=item C<%defines>

=item C<%labels>

=item C<%ops_using>

=back

=back

=head2 B::Showlex - Show lexical variables used in functions or files

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXAMPLES

=over 4

=item OPTIONS

=back

=item SEE ALSO

=item TODO

=item AUTHOR

=back

=head2 B::Terse - Walk Perl syntax tree, printing terse info about ops

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item AUTHOR

=back

=head2 B::Xref - Generates cross reference reports for Perl programs

=over 4

=item SYNOPSIS

=item DESCRIPTION

i, &, s, r

=item OPTIONS

C<-oFILENAME>, C<-r>, C<-d>, C<-D[tO]>

=item BUGS

=item AUTHOR

=back

=head2 Benchmark - benchmark running times of Perl code

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Methods

new, debug, iters

=item Standard Exports

timeit(COUNT, CODE), timethis ( COUNT, CODE, [ TITLE, [ STYLE ]] ),
timethese ( COUNT, CODEHASHREF, [ STYLE ] ), timediff ( T1, T2 ), timestr (
TIMEDIFF, [ STYLE, [ FORMAT ] ] )

=item Optional Exports

clearcache ( COUNT ), clearallcache ( ), cmpthese ( COUNT, CODEHASHREF, [
STYLE ] ), cmpthese ( RESULTSHASHREF, [ STYLE ] ), countit(TIME, CODE),
disablecache ( ), enablecache ( ), timesum ( T1, T2 )

=item :hireswallclock

=back

=item Benchmark Object

cpu_p, cpu_c, cpu_a, real, iters

=item NOTES

=item EXAMPLES

=item INHERITANCE

=item CAVEATS

=item SEE ALSO

=item AUTHORS

=item MODIFICATION HISTORY

=back

=head2 CORE - Namespace for Perl's core routines

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item OVERRIDING CORE FUNCTIONS

=item AUTHOR

=item SEE ALSO

=back

=head2 CPAN - query, download and build perl modules from CPAN sites

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item CPAN::shell([$prompt, $command]) Starting Interactive Mode

Searching for authors, bundles, distribution files and modules, C<get>,
C<make>, C<test>, C<install>, C<clean> modules or distributions, C<readme>,
C<perldoc>, C<look> module or distribution, C<ls> author, C<ls>
globbing_expression, C<failed>, Persistence between sessions, The C<force>
and the C<fforce> pragma, Lockfile, Signals

=item CPAN::Shell

=item autobundle

=item hosts

install_tested, is_tested

=item mkmyconfig

=item r [Module|/Regexp/]...

=item recent ***EXPERIMENTAL COMMAND***

=item recompile

=item report Bundle|Distribution|Module

=item smoke ***EXPERIMENTAL COMMAND***

=item upgrade [Module|/Regexp/]...

=item The four C<CPAN::*> Classes: Author, Bundle, Module, Distribution

=item Integrating local directories

=item Redirection

=item Plugin support ***EXPERIMENTAL***

=back

=item CONFIGURATION

completion support, displaying some help: o conf help, displaying current
values: o conf [KEY], changing of scalar values: o conf KEY VALUE, changing
of list values: o conf KEY SHIFT|UNSHIFT|PUSH|POP|SPLICE|LIST, reverting to
saved: o conf defaults, saving the config: o conf commit

=over 4

=item Config Variables

C<o conf E<lt>scalar optionE<gt>>, C<o conf E<lt>scalar optionE<gt>
E<lt>valueE<gt>>, C<o conf E<lt>list optionE<gt>>, C<o conf E<lt>list
optionE<gt> [shift|pop]>, C<o conf E<lt>list optionE<gt>
[unshift|push|splice] E<lt>listE<gt>>, interactive editing: o conf init
[MATCH|LIST]

=item CPAN::anycwd($path): Note on config variable getcwd

cwd, getcwd, fastcwd, getdcwd, backtickcwd

=item Note on the format of the urllist parameter

=item The urllist parameter has CD-ROM support

=item Maintaining the urllist parameter

=item The C<requires> and C<build_requires> dependency declarations

=item Configuration for individual distributions (I<Distroprefs>)

=item Filenames

=item Fallback Data::Dumper and Storable

=item Blueprint

=item Language Specs

comment [scalar], cpanconfig [hash], depends [hash] *** EXPERIMENTAL
FEATURE ***, disabled [boolean], features [array] *** EXPERIMENTAL FEATURE
***, goto [string], install [hash], make [hash], match [hash], patches
[array], pl [hash], test [hash]

=item Processing Instructions

args [array], commandline, eexpect [hash], env [hash], expect [array]

=item Schema verification with C<Kwalify>

=item Example Distroprefs Files

=back

=item PROGRAMMER'S INTERFACE

expand($type,@things), expandany(@things), Programming Examples

=over 4

=item Methods in the other Classes

CPAN::Author::as_glimpse(), CPAN::Author::as_string(),
CPAN::Author::email(), CPAN::Author::fullname(), CPAN::Author::name(),
CPAN::Bundle::as_glimpse(), CPAN::Bundle::as_string(),
CPAN::Bundle::clean(), CPAN::Bundle::contains(),
CPAN::Bundle::force($method,@args), CPAN::Bundle::get(),
CPAN::Bundle::inst_file(), CPAN::Bundle::inst_version(),
CPAN::Bundle::uptodate(), CPAN::Bundle::install(), CPAN::Bundle::make(),
CPAN::Bundle::readme(), CPAN::Bundle::test(),
CPAN::Distribution::as_glimpse(), CPAN::Distribution::as_string(),
CPAN::Distribution::author, CPAN::Distribution::pretty_id(),
CPAN::Distribution::base_id(), CPAN::Distribution::clean(),
CPAN::Distribution::containsmods(), CPAN::Distribution::cvs_import(),
CPAN::Distribution::dir(), CPAN::Distribution::force($method,@args),
CPAN::Distribution::get(), CPAN::Distribution::install(),
CPAN::Distribution::isa_perl(), CPAN::Distribution::look(),
CPAN::Distribution::make(), CPAN::Distribution::perldoc(),
CPAN::Distribution::prefs(), CPAN::Distribution::prereq_pm(),
CPAN::Distribution::readme(), CPAN::Distribution::reports(),
CPAN::Distribution::read_yaml(), CPAN::Distribution::test(),
CPAN::Distribution::uptodate(), CPAN::Index::force_reload(),
CPAN::Index::reload(), CPAN::InfoObj::dump(), CPAN::Module::as_glimpse(),
CPAN::Module::as_string(), CPAN::Module::clean(),
CPAN::Module::cpan_file(), CPAN::Module::cpan_version(),
CPAN::Module::cvs_import(), CPAN::Module::description(),
CPAN::Module::distribution(), CPAN::Module::dslip_status(),
CPAN::Module::force($method,@args), CPAN::Module::get(),
CPAN::Module::inst_file(), CPAN::Module::available_file(),
CPAN::Module::inst_version(), CPAN::Module::available_version(),
CPAN::Module::install(), CPAN::Module::look(), CPAN::Module::make(),
CPAN::Module::manpage_headline(), CPAN::Module::perldoc(),
CPAN::Module::readme(), CPAN::Module::reports(), CPAN::Module::test(),
CPAN::Module::uptodate(), CPAN::Module::userid()

=item Cache Manager

=item Bundles

=back

=item PREREQUISITES

=item UTILITIES

=over 4

=item Finding packages and VERSION

=item Debugging

o debug package.., o debug -package.., o debug all, o debug number

=item Floppy, Zip, Offline Mode

=item Basic Utilities for Programmers

has_inst($module), use_inst($module), has_usable($module),
instance($module), frontend(), frontend($new_frontend)

=back

=item SECURITY

=over 4

=item Cryptographically signed modules

=back

=item EXPORT

=item ENVIRONMENT

=item POPULATE AN INSTALLATION WITH LOTS OF MODULES

=item WORKING WITH CPAN.pm BEHIND FIREWALLS

=over 4

=item Three basic types of firewalls

http firewall, ftp firewall, One-way visibility, SOCKS, IP Masquerade

=item Configuring lynx or ncftp for going through a firewall

=back

=item FAQ

1), 2), 3), 4), 5), 6), 7), 8), 9), 10), 11), 12), 13), 14), 15), 16), 17),
18)

=item COMPATIBILITY

=over 4

=item OLD PERL VERSIONS

=item CPANPLUS

=item CPANMINUS

=back

=item SECURITY ADVICE

=item BUGS

=item AUTHOR

=item LICENSE

=item TRANSLATIONS

=item SEE ALSO

=back

=head2 CPAN::API::HOWTO - a recipe book for programming with CPAN.pm

=over 4

=item RECIPES

=over 4

=item What distribution contains a particular module?

=item What modules does a particular distribution contain?

=back

=item SEE ALSO

=item LICENSE

=item AUTHOR

=back

=head2 CPAN::Debug - internal debugging for CPAN.pm

=over 4

=item LICENSE

=back

=head2 CPAN::Distroprefs - read and match distroprefs

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item INTERFACE

a CPAN::Distroprefs::Result object, C<undef>, indicating that no prefs
files remain to be found

=item RESULTS

=over 4

=item Common

=item Errors

=item Successes

=back

=item PREFS

=item LICENSE

=back

=head2 CPAN::FirstTime - Utility for CPAN::Config file Initialization

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

auto_commit, build_cache, build_dir, build_dir_reuse,
build_requires_install_policy, cache_metadata, check_sigs,
cleanup_after_install, colorize_output, colorize_print, colorize_warn,
colorize_debug, commandnumber_in_prompt, connect_to_internet_ok,
ftp_passive, ftpstats_period, ftpstats_size, getcwd, halt_on_failure,
histfile, histsize, inactivity_timeout, index_expire,
inhibit_startup_message, keep_source_where, load_module_verbosity,
makepl_arg, make_arg, make_install_arg, make_install_make_command,
mbuildpl_arg, mbuild_arg, mbuild_install_arg, mbuild_install_build_command,
pager, prefer_installer, prefs_dir, prerequisites_policy,
randomize_urllist, recommends_policy, scan_cache, shell,
show_unparsable_versions, show_upload_date, show_zero_versions,
suggests_policy, tar_verbosity, term_is_latin, term_ornaments, test_report,
perl5lib_verbosity, prefer_external_tar, trust_test_report_history,
use_prompt_default, use_sqlite, version_timeout, yaml_load_code,
yaml_module

=over 4

=item LICENSE

=back

=head2 CPAN::HandleConfig - internal configuration handling for CPAN.pm

=over 4

=item C<< CLASS->safe_quote ITEM >>

=back

=over 4

=item LICENSE

=back

=head2 CPAN::Kwalify - Interface between CPAN.pm and Kwalify.pm

=over 4

=item SYNOPSIS

=item DESCRIPTION

_validate($schema_name, $data, $file, $doc), yaml($schema_name)

=item AUTHOR

=item LICENSE

=back

=head2 CPAN::Meta - the distribution metadata for a CPAN dist

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item new

=item create

=item load_file

=item load_yaml_string

=item load_json_string

=item load_string

=item save

=item meta_spec_version

=item effective_prereqs

=item should_index_file

=item should_index_package

=item features

=item feature

=item as_struct

=item as_string

=back

=item STRING DATA

=item LIST DATA

=item MAP DATA

=item CUSTOM DATA

=item BUGS

=item SEE ALSO

=item SUPPORT

=over 4

=item Bugs / Feature Requests

=item Source Code

=back

=item AUTHORS

=item CONTRIBUTORS

=item COPYRIGHT AND LICENSE

=back

=head2 CPAN::Meta::Converter - Convert CPAN distribution metadata
structures

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item new

=item convert

=item upgrade_fragment

=back

=item BUGS

=item AUTHORS

=item COPYRIGHT AND LICENSE

=back

=head2 CPAN::Meta::Feature - an optional feature provided by a CPAN
distribution

=over 4

=item VERSION

=item DESCRIPTION

=item METHODS

=over 4

=item new

=item identifier

=item description

=item prereqs

=back

=item BUGS

=item AUTHORS

=item COPYRIGHT AND LICENSE

=back

=head2 CPAN::Meta::History - history of CPAN Meta Spec changes

=over 4

=item VERSION

=item DESCRIPTION

=item HISTORY

=over 4

=item Version 2

=item Version 1.4

=item Version 1.3

=item Version 1.2

=item Version 1.1

=item Version 1.0

=back

=item AUTHORS

=item COPYRIGHT AND LICENSE

=back

=head2 CPAN::Meta::History::Meta_1_0 - Version 1.0 metadata specification
for META.yml

=over 4

=item PREFACE

=item DESCRIPTION

=item Format

=item Fields

name, version, license, perl, gpl, lgpl, artistic, bsd, open_source,
unrestricted, restrictive, distribution_type, requires, recommends,
build_requires, conflicts, dynamic_config, generated_by

=item Related Projects

DOAP

=item History

=back

=head2 CPAN::Meta::History::Meta_1_1 - Version 1.1 metadata specification
for META.yml

=over 4

=item PREFACE

=item DESCRIPTION

=item Format

=item Fields

name, version, license, perl, gpl, lgpl, artistic, bsd, open_source,
unrestricted, restrictive, license_uri, distribution_type, private,
requires, recommends, build_requires, conflicts, dynamic_config,
generated_by

=over 4

=item Ingy's suggestions

short_description, description, maturity, author_id, owner_id,
categorization, keyword, chapter_id, URL for further information,
namespaces

=back

=item History

=back

=head2 CPAN::Meta::History::Meta_1_2 - Version 1.2 metadata specification
for META.yml

=over 4

=item PREFACE

=item SYNOPSIS

=item DESCRIPTION

=item FORMAT

=item TERMINOLOGY

distribution, module

=item VERSION SPECIFICATIONS

=item HEADER

=item FIELDS

=over 4

=item meta-spec

=item name

=item version

=item abstract

=item author

=item license

perl, gpl, lgpl, artistic, bsd, open_source, unrestricted, restrictive

=item distribution_type

=item requires

=item recommends

=item build_requires

=item conflicts

=item dynamic_config

=item private

=item provides

=item no_index

=item keywords

=item resources

homepage, license, bugtracker

=item generated_by

=back

=item SEE ALSO

=item HISTORY

March 14, 2003 (Pi day), May 8, 2003, November 13, 2003, November 16, 2003,
December 9, 2003, December 15, 2003, July 26, 2005, August 23, 2005

=back

=head2 CPAN::Meta::History::Meta_1_3 - Version 1.3 metadata specification
for META.yml

=over 4

=item PREFACE

=item SYNOPSIS

=item DESCRIPTION

=item FORMAT

=item TERMINOLOGY

distribution, module

=item HEADER

=item FIELDS

=over 4

=item meta-spec

=item name

=item version

=item abstract

=item author

=item license

apache, artistic, bsd, gpl, lgpl, mit, mozilla, open_source, perl,
restrictive, unrestricted

=item distribution_type

=item requires

=item recommends

=item build_requires

=item conflicts

=item dynamic_config

=item private

=item provides

=item no_index

=item keywords

=item resources

homepage, license, bugtracker

=item generated_by

=back

=item VERSION SPECIFICATIONS

=item SEE ALSO

=item HISTORY

March 14, 2003 (Pi day), May 8, 2003, November 13, 2003, November 16, 2003,
December 9, 2003, December 15, 2003, July 26, 2005, August 23, 2005

=back

=head2 CPAN::Meta::History::Meta_1_4 - Version 1.4 metadata specification
for META.yml

=over 4

=item PREFACE

=item SYNOPSIS

=item DESCRIPTION

=item FORMAT

=item TERMINOLOGY

distribution, module

=item HEADER

=item FIELDS

=over 4

=item meta-spec

=item name

=item version

=item abstract

=item author

=item license

apache, artistic, bsd, gpl, lgpl, mit, mozilla, open_source, perl,
restrictive, unrestricted

=item distribution_type

=item requires

=item recommends

=item build_requires

=item configure_requires

=item conflicts

=item dynamic_config

=item private

=item provides

=item no_index

=item keywords

=item resources

homepage, license, bugtracker

=item generated_by

=back

=item VERSION SPECIFICATIONS

=item SEE ALSO

=item HISTORY

March 14, 2003 (Pi day), May 8, 2003, November 13, 2003, November 16, 2003,
December 9, 2003, December 15, 2003, July 26, 2005, August 23, 2005, June
12, 2007

=back

=head2 CPAN::Meta::Merge - Merging CPAN Meta fragments

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item new

=item merge(@fragments)

=back

=item MERGE STRATEGIES

identical, set_addition, uniq_map, improvise

=item AUTHORS

=item COPYRIGHT AND LICENSE

=back

=head2 CPAN::Meta::Prereqs - a set of distribution prerequisites by phase
and type

=over 4

=item VERSION

=item DESCRIPTION

=item METHODS

=over 4

=item new

=item requirements_for

=item phases

=item types_in

=item with_merged_prereqs

=item merged_requirements

=item as_string_hash

=item is_finalized

=item finalize

=item clone

=back

=item BUGS

=item AUTHORS

=item COPYRIGHT AND LICENSE

=back

=head2 CPAN::Meta::Requirements - a set of version requirements for a CPAN
dist

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item new

=item add_minimum

=item add_maximum

=item add_exclusion

=item exact_version

=item add_requirements

=item accepts_module

=item clear_requirement

=item requirements_for_module

=item structured_requirements_for_module

=item required_modules

=item clone

=item is_simple

=item is_finalized

=item finalize

=item as_string_hash

=item add_string_requirement

>= 1.3, <= 1.3, != 1.3, > 1.3, < 1.3, >= 1.3, != 1.5, <= 2.0

=item from_string_hash

=back

=item SUPPORT

=over 4

=item Bugs / Feature Requests

=item Source Code

=back

=item AUTHORS

=item CONTRIBUTORS

=item COPYRIGHT AND LICENSE

=back

=head2 CPAN::Meta::Spec - specification for CPAN distribution metadata

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item TERMINOLOGY

distribution, module, package, consumer, producer, must, should, may, etc

=item DATA TYPES

=over 4

=item Boolean

=item String

=item List

=item Map

=item License String

=item URL

=item Version

=item Version Range

=back

=item STRUCTURE

=over 4

=item REQUIRED FIELDS

version, url, stable, testing, unstable

=item OPTIONAL FIELDS

file, directory, package, namespace, description, prereqs, file, version,
homepage, license, bugtracker, repository

=item DEPRECATED FIELDS

=back

=item VERSION NUMBERS

=over 4

=item Version Formats

Decimal versions, Dotted-integer versions

=item Version Ranges

=back

=item PREREQUISITES

=over 4

=item Prereq Spec

configure, build, test, runtime, develop, requires, recommends, suggests,
conflicts

=item Merging and Resolving Prerequisites

=back

=item SERIALIZATION

=item NOTES FOR IMPLEMENTORS

=over 4

=item Extracting Version Numbers from Perl Modules

=item Comparing Version Numbers

=item Prerequisites for dynamically configured distributions

=item Indexing distributions a la PAUSE

=back

=item SEE ALSO

=item HISTORY

=item AUTHORS

=item COPYRIGHT AND LICENSE

=back

=head2 CPAN::Meta::Validator - validate CPAN distribution metadata
structures

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item new

=item is_valid

=item errors

=item Check Methods

=item Validator Methods

=back

=item BUGS

=item AUTHORS

=item COPYRIGHT AND LICENSE

=back

=head2 CPAN::Meta::YAML - Read and write a subset of YAML for CPAN Meta
files

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item SUPPORT

=item SEE ALSO

=item AUTHORS

=item COPYRIGHT AND LICENSE

=back

=head2 CPAN::Mirrors - Get CPAN mirror information and select a fast one

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

new( LOCAL_FILE_NAME )

continents()

countries( [CONTINENTS] )

mirrors( [COUNTRIES] )

get_mirrors_by_countries( [COUNTRIES] )

get_mirrors_by_continents( [CONTINENTS] )

get_countries_by_continents( [CONTINENTS] )

default_mirror

best_mirrors

get_n_random_mirrors_by_continents( N, [CONTINENTS] )

get_mirrors_timings( MIRROR_LIST, SEEN, CALLBACK )

find_best_continents( HASH_REF )

=over 4

=item AUTHOR

=item LICENSE

=back

=head2 CPAN::Nox - Wrapper around CPAN.pm without using any XS module

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item LICENSE

=item SEE ALSO

=back

=head2 CPAN::Plugin - Base class for CPAN shell extensions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Alpha Status

=item How Plugins work?

=back

=item METHODS

=over 4

=item plugin_requires

=item distribution_object

=item distribution

=item distribution_info

=item build_dir

=item is_xs

=back

=item AUTHOR

=back

=head2 CPAN::Plugin::Specfile - Proof of concept implementation of a
trivial CPAN::Plugin

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item OPTIONS

=back

=item AUTHOR

=back

=head2 CPAN::Queue - internal queue support for CPAN.pm

=over 4

=item LICENSE

=back

=head2 CPAN::Tarzip - internal handling of tar archives for CPAN.pm

=over 4

=item LICENSE

=back

=head2 CPAN::Version - utility functions to compare CPAN versions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item LICENSE

=back

=head2 Carp - alternative warn and die for modules

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Forcing a Stack Trace

=item Stack Trace formatting

=back

=item GLOBAL VARIABLES

=over 4

=item $Carp::MaxEvalLen

=item $Carp::MaxArgLen

=item $Carp::MaxArgNums

=item $Carp::Verbose

=item $Carp::RefArgFormatter

=item @CARP_NOT

=item %Carp::Internal

=item %Carp::CarpInternal

=item $Carp::CarpLevel

=back

=item BUGS

=item SEE ALSO

=item CONTRIBUTING

=item AUTHOR

=item COPYRIGHT

=item LICENSE

=back

=head2 Class::Struct - declare struct-like datatypes as Perl classes

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item The C<struct()> function

=item Class Creation at Compile Time

=item Element Types and Accessor Methods

Scalar (C<'$'> or C<'*$'>), Array (C<'@'> or C<'*@'>), Hash (C<'%'> or
C<'*%'>), Class (C<'Class_Name'> or C<'*Class_Name'>)

=item Initializing with C<new>

=back

=item EXAMPLES

Example 1, Example 2, Example 3

=item Author and Modification History

=back

=head2 Compress::Raw::Bzip2 - Low-Level Interface to bzip2 compression
library

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Compression

=over 4

=item ($z, $status) = new Compress::Raw::Bzip2 $appendOutput,
$blockSize100k, $workfactor;

B<$appendOutput>, B<$blockSize100k>, B<$workfactor>

=item $status = $bz->bzdeflate($input, $output);

=item $status = $bz->bzflush($output);

=item $status = $bz->bzclose($output);

=item Example

=back

=item Uncompression

=over 4

=item ($z, $status) = new Compress::Raw::Bunzip2 $appendOutput,
$consumeInput, $small, $verbosity, $limitOutput;

B<$appendOutput>, B<$consumeInput>, B<$small>, B<$limitOutput>,
B<$verbosity>

=item $status = $z->bzinflate($input, $output);

=back

=item Misc

=over 4

=item my $version = Compress::Raw::Bzip2::bzlibversion();

=back

=item Constants

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 Compress::Raw::Zlib - Low-Level Interface to zlib compression
library

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Compress::Raw::Zlib::Deflate

=over 4

=item B<($d, $status) = new Compress::Raw::Zlib::Deflate( [OPT] )>

B<-Level>, B<-Method>, B<-WindowBits>, B<-MemLevel>, B<-Strategy>,
B<-Dictionary>, B<-Bufsize>, B<-AppendOutput>, B<-CRC32>, B<-ADLER32>

=item B<$status = $d-E<gt>deflate($input, $output)>

=item B<$status = $d-E<gt>flush($output [, $flush_type])>

=item B<$status = $d-E<gt>deflateReset()>

=item B<$status = $d-E<gt>deflateParams([OPT])>

B<-Level>, B<-Strategy>, B<-BufSize>

=item B<$status = $d-E<gt>deflateTune($good_length, $max_lazy,
$nice_length, $max_chain)>

=item B<$d-E<gt>dict_adler()>

=item B<$d-E<gt>crc32()>

=item B<$d-E<gt>adler32()>

=item B<$d-E<gt>msg()>

=item B<$d-E<gt>total_in()>

=item B<$d-E<gt>total_out()>

=item B<$d-E<gt>get_Strategy()>

=item B<$d-E<gt>get_Level()>

=item B<$d-E<gt>get_BufSize()>

=item Example

=back

=item Compress::Raw::Zlib::Inflate

=over 4

=item B<($i, $status) = new Compress::Raw::Zlib::Inflate( [OPT] )>

B<-WindowBits>, B<-Bufsize>, B<-Dictionary>, B<-AppendOutput>, B<-CRC32>,
B<-ADLER32>, B<-ConsumeInput>, B<-LimitOutput>

=item B<$status = $i-E<gt>inflate($input, $output [,$eof])>

=item B<$status = $i-E<gt>inflateSync($input)>

=item B<$status = $i-E<gt>inflateReset()>

=item B<$i-E<gt>dict_adler()>

=item B<$i-E<gt>crc32()>

=item B<$i-E<gt>adler32()>

=item B<$i-E<gt>msg()>

=item B<$i-E<gt>total_in()>

=item B<$i-E<gt>total_out()>

=item B<$i-E<gt>get_BufSize()>

=item Examples

=back

=item CHECKSUM FUNCTIONS

=item Misc

=over 4

=item my $version = Compress::Raw::Zlib::zlib_version();

=item my $flags = Compress::Raw::Zlib::zlibCompileFlags();

=back

=item The LimitOutput option.

=item ACCESSING ZIP FILES

=item FAQ

=over 4

=item Compatibility with Unix compress/uncompress.

=item Accessing .tar.Z files

=item Zlib Library Version Support

=back

=item CONSTANTS

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 Compress::Zlib - Interface to zlib compression library

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Notes for users of Compress::Zlib version 1

=back

=item GZIP INTERFACE

B<$gz = gzopen($filename, $mode)>, B<$gz = gzopen($filehandle, $mode)>,
B<$bytesread = $gz-E<gt>gzread($buffer [, $size]) ;>, B<$bytesread =
$gz-E<gt>gzreadline($line) ;>, B<$byteswritten = $gz-E<gt>gzwrite($buffer)
;>, B<$status = $gz-E<gt>gzflush($flush_type) ;>, B<$offset =
$gz-E<gt>gztell() ;>, B<$status = $gz-E<gt>gzseek($offset, $whence) ;>,
B<$gz-E<gt>gzclose>, B<$gz-E<gt>gzsetparams($level, $strategy)>, B<$level>,
B<$strategy>, B<$gz-E<gt>gzerror>, B<$gzerrno>

=over 4

=item Examples

=item Compress::Zlib::memGzip

=item Compress::Zlib::memGunzip

=back

=item COMPRESS/UNCOMPRESS

B<$dest = compress($source [, $level] ) ;>, B<$dest = uncompress($source)
;>

=item Deflate Interface

=over 4

=item B<($d, $status) = deflateInit( [OPT] )>

B<-Level>, B<-Method>, B<-WindowBits>, B<-MemLevel>, B<-Strategy>,
B<-Dictionary>, B<-Bufsize>

=item B<($out, $status) = $d-E<gt>deflate($buffer)>

=item B<($out, $status) = $d-E<gt>flush()>

=item B<($out, $status) = $d-E<gt>flush($flush_type)>

=item B<$status = $d-E<gt>deflateParams([OPT])>

B<-Level>, B<-Strategy>

=item B<$d-E<gt>dict_adler()>

=item B<$d-E<gt>msg()>

=item B<$d-E<gt>total_in()>

=item B<$d-E<gt>total_out()>

=item Example

=back

=item Inflate Interface

=over 4

=item B<($i, $status) = inflateInit()>

B<-WindowBits>, B<-Bufsize>, B<-Dictionary>

=item B<($out, $status) = $i-E<gt>inflate($buffer)>

=item B<$status = $i-E<gt>inflateSync($buffer)>

=item B<$i-E<gt>dict_adler()>

=item B<$i-E<gt>msg()>

=item B<$i-E<gt>total_in()>

=item B<$i-E<gt>total_out()>

=item Example

=back

=item CHECKSUM FUNCTIONS

=item Misc

=over 4

=item my $version = Compress::Zlib::zlib_version();

=back

=item CONSTANTS

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 Config - access Perl configuration information

=over 4

=item SYNOPSIS

=item DESCRIPTION

myconfig(), config_sh(), config_re($regex), config_vars(@names),
bincompat_options(), non_bincompat_options(), compile_date(),
local_patches(), header_files()

=item EXAMPLE

=item WARNING

=item GLOSSARY

=back

=over 4

=item _

=back

C<_a>, C<_exe>, C<_o>

=over 4

=item a

=back

C<afs>, C<afsroot>, C<alignbytes>, C<ansi2knr>, C<aphostname>,
C<api_revision>, C<api_subversion>, C<api_version>, C<api_versionstring>,
C<ar>, C<archlib>, C<archlibexp>, C<archname>, C<archname64>, C<archobjs>,
C<asctime_r_proto>, C<awk>

=over 4

=item b

=back

C<baserev>, C<bash>, C<bin>, C<bin_ELF>, C<binexp>, C<bison>, C<byacc>,
C<byteorder>

=over 4

=item c

=back

C<c>, C<castflags>, C<cat>, C<cc>, C<cccdlflags>, C<ccdlflags>, C<ccflags>,
C<ccflags_uselargefiles>, C<ccname>, C<ccsymbols>, C<ccversion>, C<cf_by>,
C<cf_email>, C<cf_time>, C<charbits>, C<charsize>, C<chgrp>, C<chmod>,
C<chown>, C<clocktype>, C<comm>, C<compress>, C<config_arg0>,
C<config_argc>, C<config_args>, C<contains>, C<cp>, C<cpio>, C<cpp>,
C<cpp_stuff>, C<cppccsymbols>, C<cppflags>, C<cpplast>, C<cppminus>,
C<cpprun>, C<cppstdin>, C<cppsymbols>, C<crypt_r_proto>, C<cryptlib>,
C<csh>, C<ctermid_r_proto>, C<ctime_r_proto>

=over 4

=item d

=back

C<d__fwalk>, C<d_access>, C<d_accessx>, C<d_acosh>, C<d_aintl>, C<d_alarm>,
C<d_archlib>, C<d_asctime64>, C<d_asctime_r>, C<d_asinh>, C<d_atanh>,
C<d_atolf>, C<d_atoll>, C<d_attribute_deprecated>, C<d_attribute_format>,
C<d_attribute_malloc>, C<d_attribute_nonnull>, C<d_attribute_noreturn>,
C<d_attribute_pure>, C<d_attribute_unused>,
C<d_attribute_warn_unused_result>, C<d_backtrace>, C<d_bcmp>, C<d_bcopy>,
C<d_bsd>, C<d_bsdgetpgrp>, C<d_bsdsetpgrp>, C<d_builtin_choose_expr>,
C<d_builtin_expect>, C<d_bzero>, C<d_c99_variadic_macros>, C<d_casti32>,
C<d_castneg>, C<d_cbrt>, C<d_charvspr>, C<d_chown>, C<d_chroot>,
C<d_chsize>, C<d_class>, C<d_clearenv>, C<d_closedir>, C<d_cmsghdr_s>,
C<d_const>, C<d_copysign>, C<d_copysignl>, C<d_cplusplus>, C<d_crypt>,
C<d_crypt_r>, C<d_csh>, C<d_ctermid>, C<d_ctermid_r>, C<d_ctime64>,
C<d_ctime_r>, C<d_cuserid>, C<d_dbl_dig>, C<d_dbminitproto>, C<d_difftime>,
C<d_difftime64>, C<d_dir_dd_fd>, C<d_dirfd>, C<d_dirnamlen>, C<d_dladdr>,
C<d_dlerror>, C<d_dlopen>, C<d_dlsymun>, C<d_double_has_inf>,
C<d_double_has_nan>, C<d_double_has_negative_zero>,
C<d_double_has_subnormals>, C<d_double_style_cray>, C<d_double_style_ibm>,
C<d_double_style_ieee>, C<d_double_style_vax>, C<d_dosuid>, C<d_drand48_r>,
C<d_drand48proto>, C<d_dup2>, C<d_duplocale>, C<d_eaccess>, C<d_endgrent>,
C<d_endgrent_r>, C<d_endhent>, C<d_endhostent_r>, C<d_endnent>,
C<d_endnetent_r>, C<d_endpent>, C<d_endprotoent_r>, C<d_endpwent>,
C<d_endpwent_r>, C<d_endsent>, C<d_endservent_r>, C<d_eofnblk>, C<d_erf>,
C<d_erfc>, C<d_eunice>, C<d_exp2>, C<d_expm1>, C<d_faststdio>, C<d_fchdir>,
C<d_fchmod>, C<d_fchown>, C<d_fcntl>, C<d_fcntl_can_lock>, C<d_fd_macros>,
C<d_fd_set>, C<d_fdclose>, C<d_fdim>, C<d_fds_bits>, C<d_fegetround>,
C<d_fgetpos>, C<d_finite>, C<d_finitel>, C<d_flexfnam>, C<d_flock>,
C<d_flockproto>, C<d_fma>, C<d_fmax>, C<d_fmin>, C<d_fdopendir>, C<d_fork>,
C<d_fp_class>, C<d_fp_classify>, C<d_fp_classl>, C<d_fpathconf>,
C<d_fpclass>, C<d_fpclassify>, C<d_fpclassl>, C<d_fpgetround>,
C<d_fpos64_t>, C<d_freelocale>, C<d_frexpl>, C<d_fs_data_s>, C<d_fseeko>,
C<d_fsetpos>, C<d_fstatfs>, C<d_fstatvfs>, C<d_fsync>, C<d_ftello>,
C<d_ftime>, C<d_futimes>, C<d_gai_strerror>, C<d_Gconvert>,
C<d_gdbm_ndbm_h_uses_prototypes>, C<d_gdbmndbm_h_uses_prototypes>,
C<d_getaddrinfo>, C<d_getcwd>, C<d_getespwnam>, C<d_getfsstat>,
C<d_getgrent>, C<d_getgrent_r>, C<d_getgrgid_r>, C<d_getgrnam_r>,
C<d_getgrps>, C<d_gethbyaddr>, C<d_gethbyname>, C<d_gethent>,
C<d_gethname>, C<d_gethostbyaddr_r>, C<d_gethostbyname_r>,
C<d_gethostent_r>, C<d_gethostprotos>, C<d_getitimer>, C<d_getlogin>,
C<d_getlogin_r>, C<d_getmnt>, C<d_getmntent>, C<d_getnameinfo>,
C<d_getnbyaddr>, C<d_getnbyname>, C<d_getnent>, C<d_getnetbyaddr_r>,
C<d_getnetbyname_r>, C<d_getnetent_r>, C<d_getnetprotos>, C<d_getpagsz>,
C<d_getpbyname>, C<d_getpbynumber>, C<d_getpent>, C<d_getpgid>,
C<d_getpgrp>, C<d_getpgrp2>, C<d_getppid>, C<d_getprior>,
C<d_getprotobyname_r>, C<d_getprotobynumber_r>, C<d_getprotoent_r>,
C<d_getprotoprotos>, C<d_getprpwnam>, C<d_getpwent>, C<d_getpwent_r>,
C<d_getpwnam_r>, C<d_getpwuid_r>, C<d_getsbyname>, C<d_getsbyport>,
C<d_getsent>, C<d_getservbyname_r>, C<d_getservbyport_r>,
C<d_getservent_r>, C<d_getservprotos>, C<d_getspnam>, C<d_getspnam_r>,
C<d_gettimeod>, C<d_gmtime64>, C<d_gmtime_r>, C<d_gnulibc>, C<d_grpasswd>,
C<d_hasmntopt>, C<d_htonl>, C<d_hypot>, C<d_ilogb>, C<d_ilogbl>,
C<d_inc_version_list>, C<d_index>, C<d_inetaton>, C<d_inetntop>,
C<d_inetpton>, C<d_int64_t>, C<d_ip_mreq>, C<d_ip_mreq_source>,
C<d_ipv6_mreq>, C<d_ipv6_mreq_source>, C<d_isascii>, C<d_isblank>,
C<d_isfinite>, C<d_isfinitel>, C<d_isinf>, C<d_isinfl>, C<d_isless>,
C<d_isnan>, C<d_isnanl>, C<d_isnormal>, C<d_j0>, C<d_j0l>, C<d_killpg>,
C<d_lc_monetary_2008>, C<d_lchown>, C<d_ldbl_dig>, C<d_ldexpl>,
C<d_lgamma>, C<d_lgamma_r>, C<d_libm_lib_version>, C<d_libname_unique>,
C<d_link>, C<d_llrint>, C<d_llrintl>, C<d_llround>, C<d_llroundl>,
C<d_localtime64>, C<d_localtime_r>, C<d_localtime_r_needs_tzset>,
C<d_locconv>, C<d_lockf>, C<d_log1p>, C<d_log2>, C<d_logb>, C<d_longdbl>,
C<d_long_double_style_ieee>, C<d_long_double_style_ieee_doubledouble>,
C<d_long_double_style_ieee_extended>, C<d_long_double_style_ieee_std>,
C<d_long_double_style_vax>, C<d_longlong>, C<d_lrint>, C<d_lrintl>,
C<d_lround>, C<d_lroundl>, C<d_lseekproto>, C<d_lstat>, C<d_madvise>,
C<d_malloc_good_size>, C<d_malloc_size>, C<d_mblen>, C<d_mbstowcs>,
C<d_mbtowc>, C<d_memchr>, C<d_memcmp>, C<d_memcpy>, C<d_memmem>,
C<d_memmove>, C<d_memset>, C<d_mkdir>, C<d_mkdtemp>, C<d_mkfifo>,
C<d_mkstemp>, C<d_mkstemps>, C<d_mktime>, C<d_mktime64>, C<d_mmap>,
C<d_modfl>, C<d_modflproto>, C<d_mprotect>, C<d_msg>, C<d_msg_ctrunc>,
C<d_msg_dontroute>, C<d_msg_oob>, C<d_msg_peek>, C<d_msg_proxy>,
C<d_msgctl>, C<d_msgget>, C<d_msghdr_s>, C<d_msgrcv>, C<d_msgsnd>,
C<d_msync>, C<d_munmap>, C<d_mymalloc>, C<d_nan>, C<d_ndbm>,
C<d_ndbm_h_uses_prototypes>, C<d_nearbyint>, C<d_newlocale>,
C<d_nextafter>, C<d_nexttoward>, C<d_nice>, C<d_nl_langinfo>,
C<d_nv_preserves_uv>, C<d_nv_zero_is_allbits_zero>, C<d_off64_t>,
C<d_old_pthread_create_joinable>, C<d_oldpthreads>, C<d_oldsock>,
C<d_open3>, C<d_pathconf>, C<d_pause>, C<d_perl_otherlibdirs>,
C<d_phostname>, C<d_pipe>, C<d_poll>, C<d_portable>, C<d_prctl>,
C<d_prctl_set_name>, C<d_PRId64>, C<d_PRIeldbl>, C<d_PRIEUldbl>,
C<d_PRIfldbl>, C<d_PRIFUldbl>, C<d_PRIgldbl>, C<d_PRIGUldbl>, C<d_PRIi64>,
C<d_printf_format_null>, C<d_PRIo64>, C<d_PRIu64>, C<d_PRIx64>,
C<d_PRIXU64>, C<d_procselfexe>, C<d_pseudofork>, C<d_pthread_atfork>,
C<d_pthread_attr_setscope>, C<d_pthread_yield>, C<d_ptrdiff_t>, C<d_pwage>,
C<d_pwchange>, C<d_pwclass>, C<d_pwcomment>, C<d_pwexpire>, C<d_pwgecos>,
C<d_pwpasswd>, C<d_pwquota>, C<d_qgcvt>, C<d_quad>, C<d_querylocale>,
C<d_random_r>, C<d_re_comp>, C<d_readdir>, C<d_readdir64_r>,
C<d_readdir_r>, C<d_readlink>, C<d_readv>, C<d_recvmsg>, C<d_regcmp>,
C<d_regcomp>, C<d_remainder>, C<d_remquo>, C<d_rename>, C<d_rewinddir>,
C<d_rint>, C<d_rmdir>, C<d_round>, C<d_safebcpy>, C<d_safemcpy>,
C<d_sanemcmp>, C<d_sbrkproto>, C<d_scalbn>, C<d_scalbnl>, C<d_sched_yield>,
C<d_scm_rights>, C<d_SCNfldbl>, C<d_seekdir>, C<d_select>, C<d_sem>,
C<d_semctl>, C<d_semctl_semid_ds>, C<d_semctl_semun>, C<d_semget>,
C<d_semop>, C<d_sendmsg>, C<d_setegid>, C<d_seteuid>, C<d_setgrent>,
C<d_setgrent_r>, C<d_setgrps>, C<d_sethent>, C<d_sethostent_r>,
C<d_setitimer>, C<d_setlinebuf>, C<d_setlocale>, C<d_setlocale_r>,
C<d_setnent>, C<d_setnetent_r>, C<d_setpent>, C<d_setpgid>, C<d_setpgrp>,
C<d_setpgrp2>, C<d_setprior>, C<d_setproctitle>, C<d_setprotoent_r>,
C<d_setpwent>, C<d_setpwent_r>, C<d_setregid>, C<d_setresgid>,
C<d_setresuid>, C<d_setreuid>, C<d_setrgid>, C<d_setruid>, C<d_setsent>,
C<d_setservent_r>, C<d_setsid>, C<d_setvbuf>, C<d_shm>, C<d_shmat>,
C<d_shmatprototype>, C<d_shmctl>, C<d_shmdt>, C<d_shmget>, C<d_sigaction>,
C<d_siginfo_si_addr>, C<d_siginfo_si_band>, C<d_siginfo_si_errno>,
C<d_siginfo_si_fd>, C<d_siginfo_si_pid>, C<d_siginfo_si_status>,
C<d_siginfo_si_uid>, C<d_siginfo_si_value>, C<d_signbit>, C<d_sigprocmask>,
C<d_sigsetjmp>, C<d_sin6_scope_id>, C<d_sitearch>, C<d_snprintf>,
C<d_sockaddr_in6>, C<d_sockaddr_sa_len>, C<d_sockatmark>,
C<d_sockatmarkproto>, C<d_socket>, C<d_socklen_t>, C<d_sockpair>,
C<d_socks5_init>, C<d_sprintf_returns_strlen>, C<d_sqrtl>, C<d_srand48_r>,
C<d_srandom_r>, C<d_sresgproto>, C<d_sresuproto>, C<d_stat>, C<d_statblks>,
C<d_statfs_f_flags>, C<d_statfs_s>, C<d_static_inline>, C<d_statvfs>,
C<d_stdio_cnt_lval>, C<d_stdio_ptr_lval>, C<d_stdio_ptr_lval_nochange_cnt>,
C<d_stdio_ptr_lval_sets_cnt>, C<d_stdio_stream_array>, C<d_stdiobase>,
C<d_stdstdio>, C<d_strchr>, C<d_strcoll>, C<d_strctcpy>, C<d_strerrm>,
C<d_strerror>, C<d_strerror_l>, C<d_strerror_r>, C<d_strftime>,
C<d_strlcat>, C<d_strlcpy>, C<d_strtod>, C<d_strtol>, C<d_strtold>,
C<d_strtoll>, C<d_strtoq>, C<d_strtoul>, C<d_strtoull>, C<d_strtouq>,
C<d_strxfrm>, C<d_suidsafe>, C<d_symlink>, C<d_syscall>, C<d_syscallproto>,
C<d_sysconf>, C<d_sysernlst>, C<d_syserrlst>, C<d_system>, C<d_tcgetpgrp>,
C<d_tcsetpgrp>, C<d_telldir>, C<d_telldirproto>, C<d_tgamma>, C<d_time>,
C<d_timegm>, C<d_times>, C<d_tm_tm_gmtoff>, C<d_tm_tm_zone>, C<d_tmpnam_r>,
C<d_trunc>, C<d_truncate>, C<d_truncl>, C<d_ttyname_r>, C<d_tzname>,
C<d_u32align>, C<d_ualarm>, C<d_umask>, C<d_uname>, C<d_union_semun>,
C<d_unordered>, C<d_unsetenv>, C<d_uselocale>, C<d_usleep>,
C<d_usleepproto>, C<d_ustat>, C<d_vendorarch>, C<d_vendorbin>,
C<d_vendorlib>, C<d_vendorscript>, C<d_vfork>, C<d_void_closedir>,
C<d_voidsig>, C<d_voidtty>, C<d_volatile>, C<d_vprintf>, C<d_vsnprintf>,
C<d_wait4>, C<d_waitpid>, C<d_wcscmp>, C<d_wcstombs>, C<d_wcsxfrm>,
C<d_wctomb>, C<d_writev>, C<d_xenix>, C<date>, C<db_hashtype>,
C<db_prefixtype>, C<db_version_major>, C<db_version_minor>,
C<db_version_patch>, C<default_inc_excludes_dot>, C<direntrytype>,
C<dlext>, C<dlsrc>, C<doubleinfbytes>, C<doublekind>, C<doublemantbits>,
C<doublenanbytes>, C<doublesize>, C<drand01>, C<drand48_r_proto>,
C<dtrace>, C<dtraceobject>, C<dtracexnolibs>, C<dynamic_ext>

=over 4

=item e

=back

C<eagain>, C<ebcdic>, C<echo>, C<egrep>, C<emacs>, C<endgrent_r_proto>,
C<endhostent_r_proto>, C<endnetent_r_proto>, C<endprotoent_r_proto>,
C<endpwent_r_proto>, C<endservent_r_proto>, C<eunicefix>, C<exe_ext>,
C<expr>, C<extensions>, C<extern_C>, C<extras>

=over 4

=item f

=back

C<fflushall>, C<fflushNULL>, C<find>, C<firstmakefile>, C<flex>,
C<fpossize>, C<fpostype>, C<freetype>, C<from>, C<full_ar>, C<full_csh>,
C<full_sed>

=over 4

=item g

=back

C<gccansipedantic>, C<gccosandvers>, C<gccversion>, C<getgrent_r_proto>,
C<getgrgid_r_proto>, C<getgrnam_r_proto>, C<gethostbyaddr_r_proto>,
C<gethostbyname_r_proto>, C<gethostent_r_proto>, C<getlogin_r_proto>,
C<getnetbyaddr_r_proto>, C<getnetbyname_r_proto>, C<getnetent_r_proto>,
C<getprotobyname_r_proto>, C<getprotobynumber_r_proto>,
C<getprotoent_r_proto>, C<getpwent_r_proto>, C<getpwnam_r_proto>,
C<getpwuid_r_proto>, C<getservbyname_r_proto>, C<getservbyport_r_proto>,
C<getservent_r_proto>, C<getspnam_r_proto>, C<gidformat>, C<gidsign>,
C<gidsize>, C<gidtype>, C<glibpth>, C<gmake>, C<gmtime_r_proto>,
C<gnulibc_version>, C<grep>, C<groupcat>, C<groupstype>, C<gzip>

=over 4

=item h

=back

C<h_fcntl>, C<h_sysfile>, C<hint>, C<hostcat>, C<hostgenerate>,
C<hostosname>, C<hostperl>, C<html1dir>, C<html1direxp>, C<html3dir>,
C<html3direxp>

=over 4

=item i

=back

C<i16size>, C<i16type>, C<i32size>, C<i32type>, C<i64size>, C<i64type>,
C<i8size>, C<i8type>, C<i_arpainet>, C<i_assert>, C<i_bfd>, C<i_bsdioctl>,
C<i_crypt>, C<i_db>, C<i_dbm>, C<i_dirent>, C<i_dlfcn>, C<i_execinfo>,
C<i_fcntl>, C<i_fenv>, C<i_float>, C<i_fp>, C<i_fp_class>, C<i_gdbm>,
C<i_gdbm_ndbm>, C<i_gdbmndbm>, C<i_grp>, C<i_ieeefp>, C<i_inttypes>,
C<i_langinfo>, C<i_libutil>, C<i_limits>, C<i_locale>, C<i_machcthr>,
C<i_malloc>, C<i_mallocmalloc>, C<i_math>, C<i_memory>, C<i_mntent>,
C<i_ndbm>, C<i_netdb>, C<i_neterrno>, C<i_netinettcp>, C<i_niin>,
C<i_poll>, C<i_prot>, C<i_pthread>, C<i_pwd>, C<i_quadmath>,
C<i_rpcsvcdbm>, C<i_sgtty>, C<i_shadow>, C<i_socks>, C<i_stdarg>,
C<i_stdbool>, C<i_stddef>, C<i_stdint>, C<i_stdlib>, C<i_string>,
C<i_sunmath>, C<i_sysaccess>, C<i_sysdir>, C<i_sysfile>, C<i_sysfilio>,
C<i_sysin>, C<i_sysioctl>, C<i_syslog>, C<i_sysmman>, C<i_sysmode>,
C<i_sysmount>, C<i_sysndir>, C<i_sysparam>, C<i_syspoll>, C<i_sysresrc>,
C<i_syssecrt>, C<i_sysselct>, C<i_syssockio>, C<i_sysstat>, C<i_sysstatfs>,
C<i_sysstatvfs>, C<i_systime>, C<i_systimek>, C<i_systimes>, C<i_systypes>,
C<i_sysuio>, C<i_sysun>, C<i_sysutsname>, C<i_sysvfs>, C<i_syswait>,
C<i_termio>, C<i_termios>, C<i_time>, C<i_unistd>, C<i_ustat>, C<i_utime>,
C<i_values>, C<i_varargs>, C<i_varhdr>, C<i_vfork>, C<i_xlocale>,
C<ignore_versioned_solibs>, C<inc_version_list>, C<inc_version_list_init>,
C<incpath>, C<incpth>, C<inews>, C<initialinstalllocation>,
C<installarchlib>, C<installbin>, C<installhtml1dir>, C<installhtml3dir>,
C<installman1dir>, C<installman3dir>, C<installprefix>,
C<installprefixexp>, C<installprivlib>, C<installscript>,
C<installsitearch>, C<installsitebin>, C<installsitehtml1dir>,
C<installsitehtml3dir>, C<installsitelib>, C<installsiteman1dir>,
C<installsiteman3dir>, C<installsitescript>, C<installstyle>,
C<installusrbinperl>, C<installvendorarch>, C<installvendorbin>,
C<installvendorhtml1dir>, C<installvendorhtml3dir>, C<installvendorlib>,
C<installvendorman1dir>, C<installvendorman3dir>, C<installvendorscript>,
C<intsize>, C<issymlink>, C<ivdformat>, C<ivsize>, C<ivtype>

=over 4

=item k

=back

C<known_extensions>, C<ksh>

=over 4

=item l

=back

C<ld>, C<ld_can_script>, C<lddlflags>, C<ldflags>,
C<ldflags_uselargefiles>, C<ldlibpthname>, C<less>, C<lib_ext>, C<libc>,
C<libperl>, C<libpth>, C<libs>, C<libsdirs>, C<libsfiles>, C<libsfound>,
C<libspath>, C<libswanted>, C<libswanted_uselargefiles>, C<line>, C<lint>,
C<lkflags>, C<ln>, C<lns>, C<localtime_r_proto>, C<locincpth>,
C<loclibpth>, C<longdblinfbytes>, C<longdblkind>, C<longdblmantbits>,
C<longdblnanbytes>, C<longdblsize>, C<longlongsize>, C<longsize>, C<lp>,
C<lpr>, C<ls>, C<lseeksize>, C<lseektype>

=over 4

=item m

=back

C<mail>, C<mailx>, C<make>, C<make_set_make>, C<mallocobj>, C<mallocsrc>,
C<malloctype>, C<man1dir>, C<man1direxp>, C<man1ext>, C<man3dir>,
C<man3direxp>, C<man3ext>, C<mips_type>, C<mistrustnm>, C<mkdir>,
C<mmaptype>, C<modetype>, C<more>, C<multiarch>, C<mv>, C<myarchname>,
C<mydomain>, C<myhostname>, C<myuname>

=over 4

=item n

=back

C<n>, C<need_va_copy>, C<netdb_hlen_type>, C<netdb_host_type>,
C<netdb_name_type>, C<netdb_net_type>, C<nm>, C<nm_opt>, C<nm_so_opt>,
C<nonxs_ext>, C<nroff>, C<nv_overflows_integers_at>,
C<nv_preserves_uv_bits>, C<nveformat>, C<nvEUformat>, C<nvfformat>,
C<nvFUformat>, C<nvgformat>, C<nvGUformat>, C<nvmantbits>, C<nvsize>,
C<nvtype>

=over 4

=item o

=back

C<o_nonblock>, C<obj_ext>, C<old_pthread_create_joinable>, C<optimize>,
C<orderlib>, C<osname>, C<osvers>, C<otherlibdirs>

=over 4

=item p

=back

C<package>, C<pager>, C<passcat>, C<patchlevel>, C<path_sep>, C<perl>,
C<perl5>

=over 4

=item P

=back

C<PERL_API_REVISION>, C<PERL_API_SUBVERSION>, C<PERL_API_VERSION>,
C<PERL_CONFIG_SH>, C<PERL_PATCHLEVEL>, C<perl_patchlevel>,
C<PERL_REVISION>, C<perl_static_inline>, C<PERL_SUBVERSION>,
C<PERL_VERSION>, C<perladmin>, C<perllibs>, C<perlpath>, C<pg>,
C<phostname>, C<pidtype>, C<plibpth>, C<pmake>, C<pr>, C<prefix>,
C<prefixexp>, C<privlib>, C<privlibexp>, C<procselfexe>, C<prototype>,
C<ptrsize>

=over 4

=item q

=back

C<quadkind>, C<quadtype>

=over 4

=item r

=back

C<randbits>, C<randfunc>, C<random_r_proto>, C<randseedtype>, C<ranlib>,
C<rd_nodata>, C<readdir64_r_proto>, C<readdir_r_proto>, C<revision>, C<rm>,
C<rm_try>, C<rmail>, C<run>, C<runnm>

=over 4

=item s

=back

C<sched_yield>, C<scriptdir>, C<scriptdirexp>, C<sed>, C<seedfunc>,
C<selectminbits>, C<selecttype>, C<sendmail>, C<setgrent_r_proto>,
C<sethostent_r_proto>, C<setlocale_r_proto>, C<setnetent_r_proto>,
C<setprotoent_r_proto>, C<setpwent_r_proto>, C<setservent_r_proto>,
C<sGMTIME_max>, C<sGMTIME_min>, C<sh>, C<shar>, C<sharpbang>, C<shmattype>,
C<shortsize>, C<shrpenv>, C<shsharp>, C<sig_count>, C<sig_name>,
C<sig_name_init>, C<sig_num>, C<sig_num_init>, C<sig_size>, C<signal_t>,
C<sitearch>, C<sitearchexp>, C<sitebin>, C<sitebinexp>, C<sitehtml1dir>,
C<sitehtml1direxp>, C<sitehtml3dir>, C<sitehtml3direxp>, C<sitelib>,
C<sitelib_stem>, C<sitelibexp>, C<siteman1dir>, C<siteman1direxp>,
C<siteman3dir>, C<siteman3direxp>, C<siteprefix>, C<siteprefixexp>,
C<sitescript>, C<sitescriptexp>, C<sizesize>, C<sizetype>, C<sleep>,
C<sLOCALTIME_max>, C<sLOCALTIME_min>, C<smail>, C<so>, C<sockethdr>,
C<socketlib>, C<socksizetype>, C<sort>, C<spackage>, C<spitshell>,
C<sPRId64>, C<sPRIeldbl>, C<sPRIEUldbl>, C<sPRIfldbl>, C<sPRIFUldbl>,
C<sPRIgldbl>, C<sPRIGUldbl>, C<sPRIi64>, C<sPRIo64>, C<sPRIu64>,
C<sPRIx64>, C<sPRIXU64>, C<srand48_r_proto>, C<srandom_r_proto>, C<src>,
C<sSCNfldbl>, C<ssizetype>, C<st_ino_sign>, C<st_ino_size>, C<startperl>,
C<startsh>, C<static_ext>, C<stdchar>, C<stdio_base>, C<stdio_bufsiz>,
C<stdio_cnt>, C<stdio_filbuf>, C<stdio_ptr>, C<stdio_stream_array>,
C<strerror_r_proto>, C<strings>, C<submit>, C<subversion>, C<sysman>,
C<sysroot>

=over 4

=item t

=back

C<tail>, C<tar>, C<targetarch>, C<targetdir>, C<targetenv>, C<targethost>,
C<targetmkdir>, C<targetport>, C<targetsh>, C<tbl>, C<tee>, C<test>,
C<timeincl>, C<timetype>, C<tmpnam_r_proto>, C<to>, C<touch>, C<tr>,
C<trnl>, C<troff>, C<ttyname_r_proto>

=over 4

=item u

=back

C<u16size>, C<u16type>, C<u32size>, C<u32type>, C<u64size>, C<u64type>,
C<u8size>, C<u8type>, C<uidformat>, C<uidsign>, C<uidsize>, C<uidtype>,
C<uname>, C<uniq>, C<uquadtype>, C<use5005threads>, C<use64bitall>,
C<use64bitint>, C<usecbacktrace>, C<usecrosscompile>, C<usedevel>,
C<usedl>, C<usedtrace>, C<usefaststdio>, C<useithreads>,
C<usekernprocpathname>, C<uselargefiles>, C<uselongdouble>,
C<usemallocwrap>, C<usemorebits>, C<usemultiplicity>, C<usemymalloc>,
C<usenm>, C<usensgetexecutablepath>, C<useopcode>, C<useperlio>,
C<useposix>, C<usequadmath>, C<usereentrant>, C<userelocatableinc>,
C<useshrplib>, C<usesitecustomize>, C<usesocks>, C<usethreads>,
C<usevendorprefix>, C<useversionedarchname>, C<usevfork>, C<usrinc>,
C<uuname>, C<uvoformat>, C<uvsize>, C<uvtype>, C<uvuformat>, C<uvxformat>,
C<uvXUformat>

=over 4

=item v

=back

C<vaproto>, C<vendorarch>, C<vendorarchexp>, C<vendorbin>, C<vendorbinexp>,
C<vendorhtml1dir>, C<vendorhtml1direxp>, C<vendorhtml3dir>,
C<vendorhtml3direxp>, C<vendorlib>, C<vendorlib_stem>, C<vendorlibexp>,
C<vendorman1dir>, C<vendorman1direxp>, C<vendorman3dir>,
C<vendorman3direxp>, C<vendorprefix>, C<vendorprefixexp>, C<vendorscript>,
C<vendorscriptexp>, C<version>, C<version_patchlevel_string>,
C<versiononly>, C<vi>

=over 4

=item x

=back

C<xlibpth>

=over 4

=item y

=back

C<yacc>, C<yaccflags>

=over 4

=item z

=back

C<zcat>, C<zip>

=over 4

=item GIT DATA

=item NOTE

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

dynamic, nonxs, static

=item AUTHOR

=back
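
As a brief illustration of how the configuration variables indexed above
are read at run time, the following sketch looks them up through the
C<%Config> hash exported by the Config module.  The three keys chosen here
are arbitrary picks from the list above, used only for demonstration.

```perl
use strict;
use warnings;
use Config;    # exports the read-only %Config hash

# Pick a few of the keys listed above; any other key is read the same way.
my @keys = qw(osname archname ivsize);
my %seen = map { $_ => $Config{$_} } @keys;

# Print each key with its value, or a placeholder if it is not set.
printf "%-10s %s\n", $_, $seen{$_} // '(undef)' for @keys;
```

Values differ from one build of perl to another, so no particular output
should be expected.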

=head2 Config::Perl::V - Structured data retrieval of perl -V output

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item $conf = myconfig ()

=item $conf = plv2hash ($text [, ...])

=item $info = summary ([$conf])

=item $md5 = signature ([$conf])

=item The hash structure

build, osname, stamp, options, derived, patches, environment, config, inc

=back

=item REASONING

=item BUGS

=item TODO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 Cwd - get pathname of current working directory

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item getcwd and friends

getcwd, cwd, fastcwd, fastgetcwd, getdcwd

=item abs_path and friends

abs_path, realpath, fast_abs_path

=item $ENV{PWD}

=back

=item NOTES

=item AUTHOR

=item COPYRIGHT

=item SEE ALSO

=back
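
The two function families listed for Cwd can be sketched as follows.  This
is a minimal example that only assumes the script is started from a
directory that still exists.

```perl
use strict;
use warnings;
use Cwd qw(getcwd abs_path);

my $cwd  = getcwd();        # current working directory
my $real = abs_path('.');   # canonical form of '.', symbolic links resolved

print "getcwd:   $cwd\n";
print "abs_path: $real\n";
```

The two values are usually the same, but they can differ when the current
directory was reached through a symbolic link.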

=head2 DB - programmatic interface to the Perl debugging API

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Global Variables

$DB::sub, %DB::sub, $DB::single, $DB::signal, $DB::trace, @DB::args,
@DB::dbline, %DB::dbline, $DB::package, $DB::filename, $DB::subname,
$DB::lineno

=item API Methods

CLIENT->register(), CLIENT->evalcode(STRING), CLIENT->skippkg('D::hide'),
CLIENT->run(), CLIENT->step(), CLIENT->next(), CLIENT->done()

=item Client Callback Methods

CLIENT->init(), CLIENT->prestop([STRING]), CLIENT->stop(), CLIENT->idle(),
CLIENT->poststop([STRING]), CLIENT->evalcode(STRING), CLIENT->cleanup(),
CLIENT->output(LIST)

=back

=item BUGS

=item AUTHOR

=back

=head2 DBM_Filter -- Filter DBM keys/values 

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item What is a DBM Filter?

=over 4

=item So what's new?

=back

=item METHODS

=over 4

=item $db->Filter_Push() / $db->Filter_Key_Push() /
$db->Filter_Value_Push()

Filter_Push, Filter_Key_Push, Filter_Value_Push

=item $db->Filter_Pop()

=item $db->Filtered()

=back

=item Writing a Filter

=over 4

=item Immediate Filters

=item Canned Filters

"name", params

=back

=item Filters Included

utf8, encode, compress, int32, null

=item NOTES

=over 4

=item Maintain Round Trip Integrity

=item Don't mix filtered & non-filtered data in the same database file. 

=back

=item EXAMPLE

=item SEE ALSO

=item AUTHOR

=back

=head2 DBM_Filter::compress - filter for DBM_Filter

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item AUTHOR

=back

=head2 DBM_Filter::encode - filter for DBM_Filter

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item AUTHOR

=back

=head2 DBM_Filter::int32 - filter for DBM_Filter

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item AUTHOR

=back

=head2 DBM_Filter::null - filter for DBM_Filter

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item AUTHOR

=back

=head2 DBM_Filter::utf8 - filter for DBM_Filter

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item AUTHOR

=back

=head2 DB_File - Perl5 access to Berkeley DB version 1.x

=over 4

=item SYNOPSIS

=item DESCRIPTION

B<DB_HASH>, B<DB_BTREE>, B<DB_RECNO>

=over 4

=item Using DB_File with Berkeley DB version 2 or greater

=item Interface to Berkeley DB

=item Opening a Berkeley DB Database File

=item Default Parameters

=item In Memory Databases

=back

=item DB_HASH

=over 4

=item A Simple Example

=back

=item DB_BTREE

=over 4

=item Changing the BTREE sort order

=item Handling Duplicate Keys 

=item The get_dup() Method

=item The find_dup() Method

=item The del_dup() Method

=item Matching Partial Keys 

=back

=item DB_RECNO

=over 4

=item The 'bval' Option

=item A Simple Example

=item Extra RECNO Methods

B<$X-E<gt>push(list) ;>, B<$value = $X-E<gt>pop ;>, B<$X-E<gt>shift>,
B<$X-E<gt>unshift(list) ;>, B<$X-E<gt>length>, B<$X-E<gt>splice(offset,
length, elements);>

=item Another Example

=back

=item THE API INTERFACE

B<$status = $X-E<gt>get($key, $value [, $flags]) ;>, B<$status =
$X-E<gt>put($key, $value [, $flags]) ;>, B<$status = $X-E<gt>del($key [,
$flags]) ;>, B<$status = $X-E<gt>fd ;>, B<$status = $X-E<gt>seq($key,
$value, $flags) ;>, B<$status = $X-E<gt>sync([$flags]) ;>

=item DBM FILTERS

=over 4

=item DBM Filter Low-level API

B<filter_store_key>, B<filter_store_value>, B<filter_fetch_key>,
B<filter_fetch_value>

=item The Filter

=item An Example -- the NULL termination problem.

=item Another Example -- Key is a C int.

=back

=item HINTS AND TIPS 

=over 4

=item Locking: The Trouble with fd

=item Safe ways to lock a database

B<Tie::DB_Lock>, B<Tie::DB_LockFile>, B<DB_File::Lock>

=item Sharing Databases With C Applications

=item The untie() Gotcha

=back

=item COMMON QUESTIONS

=over 4

=item Why is there Perl source in my database?

=item How do I store complex data structures with DB_File?

=item What does "wide character in subroutine entry" mean?

=item What does "Invalid Argument" mean?

=item What does "Bareword 'DB_File' not allowed" mean? 

=back

=item REFERENCES

=item HISTORY

=item BUGS

=item AVAILABILITY

=item COPYRIGHT

=item SEE ALSO

=item AUTHOR

=back

=head2 Data::Dumper - stringified perl data structures, suitable for both
printing and C<eval>

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Methods

I<PACKAGE>->new(I<ARRAYREF [>, I<ARRAYREF]>), I<$OBJ>->Dump  I<or> 
I<PACKAGE>->Dump(I<ARRAYREF [>, I<ARRAYREF]>), I<$OBJ>->Seen(I<[HASHREF]>),
I<$OBJ>->Values(I<[ARRAYREF]>), I<$OBJ>->Names(I<[ARRAYREF]>),
I<$OBJ>->Reset

=item Functions

Dumper(I<LIST>)

=item Configuration Variables or Methods

=item Exports

Dumper

=back

=item EXAMPLES

=item BUGS

=over 4

=item NOTE

=back

=item AUTHOR

=item VERSION

=item SEE ALSO

=back
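
A short sketch of the "suitable for both printing and C<eval>" idea from
the Data::Dumper heading: a structure is dumped to text and then rebuilt
with a string C<eval>.  The sample data is made up for the example, and the
round trip should only be attempted on text you produced yourself.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sort hash keys so the dump is stable from run to run.
local $Data::Dumper::Sortkeys = 1;

my $data = { name => 'perl', versions => [ '5.22', '5.24' ] };
my $text = Dumper($data);    # text of the form "$VAR1 = { ... };"
print $text;

# Rebuild the structure from the dumped text.
my $copy = do {
    my $VAR1;                # Dumper's default output assigns to $VAR1
    eval $text;
    die $@ if $@;
    $VAR1;
};
```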

=head2 Devel::PPPort - Perl/Pollution/Portability

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Why use ppport.h?

=item How to use ppport.h

=item Running ppport.h

=back

=item FUNCTIONS

=over 4

=item WriteFile

=item GetFileContents

=back

=item COMPATIBILITY

=over 4

=item Provided Perl compatibility API

=item Perl API not supported by ppport.h

perl 5.24.0, perl 5.23.9, perl 5.23.8, perl 5.22.0, perl 5.21.10, perl
5.21.8, perl 5.21.7, perl 5.21.6, perl 5.21.5, perl 5.21.4, perl 5.21.2,
perl 5.21.1, perl 5.19.10, perl 5.19.9, perl 5.19.7, perl 5.19.4, perl
5.19.3, perl 5.19.2, perl 5.19.1, perl 5.18.0, perl 5.17.9, perl 5.17.8,
perl 5.17.7, perl 5.17.6, perl 5.17.4, perl 5.17.2, perl 5.15.9, perl
5.15.8, perl 5.15.7, perl 5.15.6, perl 5.15.4, perl 5.15.1, perl 5.14.0,
perl 5.13.10, perl 5.13.8, perl 5.13.7, perl 5.13.6, perl 5.13.5, perl
5.13.3, perl 5.13.2, perl 5.13.1, perl 5.11.5, perl 5.11.4, perl 5.11.2,
perl 5.11.1, perl 5.11.0, perl 5.10.1, perl 5.10.0, perl 5.9.5, perl 5.9.4,
perl 5.9.3, perl 5.9.2, perl 5.9.1, perl 5.9.0, perl 5.8.3, perl 5.8.1,
perl 5.8.0, perl 5.7.3, perl 5.7.2, perl 5.7.1, perl 5.6.1, perl 5.6.0,
perl 5.005_03, perl 5.005, perl 5.004_05, perl 5.004, perl 5.003_07

=back

=item BUGS

=item AUTHORS

=item COPYRIGHT

=item SEE ALSO

=back

=head2 Devel::Peek - A data debugging tool for the XS programmer

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Runtime debugging

=item Memory footprint debugging

=back

=item EXAMPLES

=over 4

=item A simple scalar string

=item A simple scalar number

=item A simple scalar with an extra reference

=item A reference to a simple scalar

=item A reference to an array

=item A reference to a hash

=item Dumping a large array or hash

=item A reference to an SV which holds a C pointer

=item A reference to a subroutine

=back

=item EXPORTS

=item BUGS

=item AUTHOR

=item SEE ALSO

=back

=head2 Devel::SelfStubber - generate stubs for a SelfLoading module

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 Digest - Modules that calculate message digests

=over 4

=item SYNOPSIS

=item DESCRIPTION

I<binary>, I<hex>, I<base64>

=item OO INTERFACE

$ctx = Digest->XXX($arg,...), $ctx = Digest->new(XXX => $arg,...), $ctx =
Digest::XXX->new($arg,...), $other_ctx = $ctx->clone, $ctx->reset,
$ctx->add( $data ), $ctx->add( $chunk1, $chunk2, ... ), $ctx->addfile(
$io_handle ), $ctx->add_bits( $data, $nbits ), $ctx->add_bits( $bitstring
), $ctx->digest, $ctx->hexdigest, $ctx->b64digest

=item Digest speed

=item SEE ALSO

=item AUTHOR

=back

=head2 Digest::MD5 - Perl interface to the MD5 Algorithm

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item FUNCTIONS

md5($data,...), md5_hex($data,...), md5_base64($data,...)

=item METHODS

$md5 = Digest::MD5->new, $md5->reset, $md5->clone, $md5->add($data,...),
$md5->addfile($io_handle), $md5->add_bits($data, $nbits),
$md5->add_bits($bitstring), $md5->digest, $md5->hexdigest, $md5->b64digest,
@ctx = $md5->context, $md5->context(@ctx)

=item EXAMPLES

=item SEE ALSO

=item COPYRIGHT

=item AUTHORS

=back
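
The functional and object-oriented interfaces listed for Digest::MD5 give
the same result for the same input.  A minimal sketch, with an arbitrary
sample string:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# One-shot functional interface.
my $one_shot = md5_hex('hello world');

# Incremental interface: feed the same data in two pieces.
my $ctx = Digest::MD5->new;
$ctx->add('hello ');
$ctx->add('world');
my $incremental = $ctx->hexdigest;

print "$one_shot\n$incremental\n";
```

The incremental form is the one to reach for when the data does not fit in
memory or arrives in chunks.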

=head2 Digest::SHA - Perl extension for SHA-1/224/256/384/512

=over 4

=item SYNOPSIS

=item SYNOPSIS (HMAC-SHA)

=item ABSTRACT

=item DESCRIPTION

=item UNICODE AND SIDE EFFECTS

=item NIST STATEMENT ON SHA-1

=item PADDING OF BASE64 DIGESTS

=item EXPORT

=item EXPORTABLE FUNCTIONS

B<sha1($data, ...)>, B<sha224($data, ...)>, B<sha256($data, ...)>,
B<sha384($data, ...)>, B<sha512($data, ...)>, B<sha512224($data, ...)>,
B<sha512256($data, ...)>, B<sha1_hex($data, ...)>, B<sha224_hex($data,
...)>, B<sha256_hex($data, ...)>, B<sha384_hex($data, ...)>,
B<sha512_hex($data, ...)>, B<sha512224_hex($data, ...)>,
B<sha512256_hex($data, ...)>, B<sha1_base64($data, ...)>,
B<sha224_base64($data, ...)>, B<sha256_base64($data, ...)>,
B<sha384_base64($data, ...)>, B<sha512_base64($data, ...)>,
B<sha512224_base64($data, ...)>, B<sha512256_base64($data, ...)>,
B<new($alg)>, B<reset($alg)>, B<hashsize>, B<algorithm>, B<clone>,
B<add($data, ...)>, B<add_bits($data, $nbits)>, B<add_bits($bits)>,
B<addfile(*FILE)>, B<addfile($filename [, $mode])>, B<getstate>,
B<putstate($str)>, B<dump($filename)>, B<load($filename)>, B<digest>,
B<hexdigest>, B<b64digest>, B<hmac_sha1($data, $key)>, B<hmac_sha224($data,
$key)>, B<hmac_sha256($data, $key)>, B<hmac_sha384($data, $key)>,
B<hmac_sha512($data, $key)>, B<hmac_sha512224($data, $key)>,
B<hmac_sha512256($data, $key)>, B<hmac_sha1_hex($data, $key)>,
B<hmac_sha224_hex($data, $key)>, B<hmac_sha256_hex($data, $key)>,
B<hmac_sha384_hex($data, $key)>, B<hmac_sha512_hex($data, $key)>,
B<hmac_sha512224_hex($data, $key)>, B<hmac_sha512256_hex($data, $key)>,
B<hmac_sha1_base64($data, $key)>, B<hmac_sha224_base64($data, $key)>,
B<hmac_sha256_base64($data, $key)>, B<hmac_sha384_base64($data, $key)>,
B<hmac_sha512_base64($data, $key)>, B<hmac_sha512224_base64($data, $key)>,
B<hmac_sha512256_base64($data, $key)>

=item SEE ALSO

=item AUTHOR

=item ACKNOWLEDGMENTS

=item COPYRIGHT AND LICENSE

=back
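
A few of the Digest::SHA entry points listed above, sketched together.  The
input string and the HMAC key are placeholders chosen for the example.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex hmac_sha256_hex);

my $data = 'abc';

# Functional interface.
my $direct = sha256_hex($data);

# Object interface: new() takes the algorithm, add() may be called repeatedly.
my $sha = Digest::SHA->new(256);
$sha->add('a');
$sha->add('bc');
my $streamed = $sha->hexdigest;

# Keyed digest; 'secret-key' is a hypothetical key for illustration only.
my $mac = hmac_sha256_hex($data, 'secret-key');

print "$direct\n$streamed\n$mac\n";
```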

=head2 Digest::base - Digest base class

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=back

=head2 Digest::file - Calculate digests of files

=over 4

=item SYNOPSIS

=item DESCRIPTION

digest_file( $file, $algorithm, [$arg,...] ), digest_file_hex( $file,
$algorithm, [$arg,...] ), digest_file_base64( $file, $algorithm, [$arg,...]
)

=item SEE ALSO

=back
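
A minimal sketch of C<digest_file_hex> from Digest::file.  It writes a
temporary file with File::Temp so that it is self-contained; the file
contents and the choice of the MD5 algorithm are assumptions made for the
example.

```perl
use strict;
use warnings;
use Digest::file qw(digest_file_hex);
use File::Temp qw(tempfile);

# Create a scratch file that is removed when the program exits.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} 'hello world';
close $fh or die "close failed: $!";

# Second argument names the algorithm, as accepted by Digest->new.
my $hex = digest_file_hex($path, 'MD5');
print "$hex\n";
```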

=head2 DirHandle - supply object methods for directory handles

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 Dumpvalue - provides screen dump of Perl data.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Creation

C<arrayDepth>, C<hashDepth>, C<compactDump>, C<veryCompact>, C<globPrint>,
C<dumpDBFiles>, C<dumpPackages>, C<dumpReused>, C<tick>, C<quoteHighBit>,
C<printUndef>, C<usageOnly>, unctrl, subdump, bareStringify, quoteHighBit,
stopDbSignal

=item Methods

dumpValue, dumpValues, stringify, dumpvars, set_quote, set_unctrl,
compactDump, veryCompact, set, get

=back

=back

=head2 DynaLoader - Dynamically load C libraries into Perl code

=over 4

=item SYNOPSIS

=item DESCRIPTION

@dl_library_path, @dl_resolve_using, @dl_require_symbols, @dl_librefs,
@dl_modules, @dl_shared_objects, dl_error(), $dl_debug, $dl_dlext,
dl_findfile(), dl_expandspec(), dl_load_file(), dl_unload_file(),
dl_load_flags(), dl_find_symbol(), dl_find_symbol_anywhere(),
dl_undef_symbols(), dl_install_xsub(), bootstrap()

=item AUTHOR

=back

=head2 Encode - character encodings in Perl

=over 4

=item SYNOPSIS

=over 4

=item Table of Contents

L<Encode::Alias> - Alias definitions to encodings, L<Encode::Encoding> -
Encode Implementation Base Class, L<Encode::Supported> - List of Supported
Encodings, L<Encode::CN> - Simplified Chinese Encodings, L<Encode::JP> -
Japanese Encodings, L<Encode::KR> - Korean Encodings, L<Encode::TW> -
Traditional Chinese Encodings

=back

=item DESCRIPTION

=over 4

=item TERMINOLOGY

=back

=item THE PERL ENCODING API

=over 4

=item Basic methods

=item Listing available encodings

=item Defining Aliases

=item Finding IANA Character Set Registry names

=back

=item Encoding via PerlIO

=item Handling Malformed Data

=over 4

=item List of I<CHECK> values

perlqq mode (I<CHECK> = Encode::FB_PERLQQ), HTML charref mode (I<CHECK> =
Encode::FB_HTMLCREF), XML charref mode (I<CHECK> = Encode::FB_XMLCREF)

=item coderef for CHECK

=back

=item Defining Encodings

=item The UTF8 flag

Goal #1:, Goal #2:, Goal #3:, Goal #4:

=over 4

=item Messing with Perl's Internals

=back

=item UTF-8 vs. utf8 vs. UTF8

=item SEE ALSO

=item MAINTAINER

=item COPYRIGHT

=back
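
The basic methods and the I<CHECK> argument indexed under Encode can be
sketched as below.  The sample strings are made up; C<LEAVE_SRC> is added
so that the input is not modified while it is being checked.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $chars  = "caf\x{e9}";                 # four characters
my $octets = encode('UTF-8', $chars);     # five bytes
my $check  = Encode::FB_CROAK | Encode::LEAVE_SRC;
my $back   = decode('UTF-8', $octets, $check);

# Malformed input: with FB_CROAK the call dies instead of substituting.
my $bad = "\xff\xfe";
my $ok  = eval { decode('UTF-8', $bad, $check); 1 };

printf "chars=%d octets=%d malformed_ok=%s\n",
    length $chars, length $octets, $ok ? 'yes' : 'no';
```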

=head2 Encode::Alias - alias definitions to encodings

=over 4

=item SYNOPSIS

=item DESCRIPTION

As a simple string, As a qr// compiled regular expression, e.g.:, As a code
reference, e.g.:

=over 4

=item Alias overloading

=back

=item SEE ALSO

=back

=head2 Encode::Byte - Single Byte Encodings

=over 4

=item SYNOPSIS

=item ABSTRACT

=item DESCRIPTION

=item SEE ALSO

=back

=head2 Encode::CJKConstants -- Internally used by Encode::??::ISO_2022_*

=head2 Encode::CN - China-based Chinese Encodings

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item NOTES

=item BUGS

=item SEE ALSO

=back

=head2 Encode::CN::HZ -- internally used by Encode::CN

=head2 Encode::Config -- internally used by Encode

=head2 Encode::EBCDIC - EBCDIC Encodings

=over 4

=item SYNOPSIS

=item ABSTRACT

=item DESCRIPTION

=item SEE ALSO

=back

=head2 Encode::Encoder -- Object Oriented Encoder

=over 4

=item SYNOPSIS

=item ABSTRACT

=item Description

=over 4

=item Predefined Methods

$e = Encode::Encoder-E<gt>new([$data, $encoding]);, encoder(),
$e-E<gt>data([$data]), $e-E<gt>encoding([$encoding]),
$e-E<gt>bytes([$encoding])

=item Example: base64 transcoder

=item Operator Overloading

=back

=item SEE ALSO

=back

=head2 Encode::Encoding - Encode Implementation Base Class

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Methods you should implement

-E<gt>encode($string [,$check]), -E<gt>decode($octets [,$check]),
-E<gt>cat_decode($destination, $octets, $offset, $terminator [,$check])

=item Other methods defined in Encode::Encoding

-E<gt>name, -E<gt>mime_name, -E<gt>renew, -E<gt>renewed, -E<gt>perlio_ok(),
-E<gt>needs_lines()

=item Example: Encode::ROT13

=back

=item Why the heck Encode API is different?

=over 4

=item Compiled Encodings

=back

=item SEE ALSO

Scheme 1, Scheme 2, Other Schemes

=back

=head2 Encode::GSM0338 -- ETSI GSM 03.38 Encoding

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item NOTES

=item BUGS

=item SEE ALSO

=back

=head2 Encode::Guess -- Guesses encoding from data

=over 4

=item SYNOPSIS

=item ABSTRACT

=item DESCRIPTION

Encode::Guess->set_suspects, Encode::Guess->add_suspects,
Encode::decode("Guess" ...), Encode::Guess->guess($data),
guess_encoding($data, [, I<list of suspects>])

=item CAVEATS

=item TO DO

=item SEE ALSO

=back

=head2 Encode::JP - Japanese Encodings

=over 4

=item SYNOPSIS

=item ABSTRACT

=item DESCRIPTION

=item Note on ISO-2022-JP(-1)?

=item BUGS

=item SEE ALSO

=back

=head2 Encode::JP::H2Z -- internally used by Encode::JP::2022_JP*

=head2 Encode::JP::JIS7 -- internally used by Encode::JP

=head2 Encode::KR - Korean Encodings

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item BUGS

=item SEE ALSO

=back

=head2 Encode::KR::2022_KR -- internally used by Encode::KR

=head2 Encode::MIME::Header -- MIME encoding for an unstructured email
header

=over 4

=item SYNOPSIS

=item ABSTRACT

=item DESCRIPTION

=item BUGS

=item AUTHORS

=item SEE ALSO

=back

=head2 Encode::MIME::Name, Encode::MIME::NAME -- internally used by Encode

=over 4

=item SEE ALSO

=back

=head2 Encode::PerlIO -- a detailed document on Encode and PerlIO

=over 4

=item Overview

=item How does it work?

=item Line Buffering

=over 4

=item How can I tell whether my encoding fully supports PerlIO?

=back

=item SEE ALSO

=back

=head2 Encode::Supported -- Encodings supported by Encode

=over 4

=item DESCRIPTION

=over 4

=item Encoding Names

=back

=item Supported Encodings

=over 4

=item Built-in Encodings

=item Encode::Unicode -- other Unicode encodings

=item Encode::Byte -- Extended ASCII

ISO-8859 and corresponding vendor mappings, KOI8 - De Facto Standard for
the Cyrillic world

=item gsm0338 - Hentai Latin 1

gsm0338 support before 2.19

=item CJK: Chinese, Japanese, Korean (Multibyte)

Encode::CN -- Continental China, Encode::JP -- Japan, Encode::KR -- Korea,
Encode::TW -- Taiwan, Encode::HanExtra -- More Chinese via CPAN,
Encode::JIS2K -- JIS X 0213 encodings via CPAN

=item Miscellaneous encodings

Encode::EBCDIC, Encode::Symbols, Encode::MIME::Header, Encode::Guess

=back

=item Unsupported encodings

ISO-2022-JP-2 [RFC1554], ISO-2022-CN [RFC1922], Various HP-UX encodings,
Cyrillic encoding ISO-IR-111, ISO-8859-8-1 [Hebrew], ISIRI 3342, Iran
System, ISIRI 2900 [Farsi], Thai encoding TCVN, Vietnamese encodings VPS,
Various Mac encodings, (Mac) Indic encodings

=item Encoding vs. Charset -- terminology

=item Encoding Classification (by Anton Tagunov and Dan Kogai)

=over 4

=item Microsoft-related naming mess

KS_C_5601-1987, GB2312, Big5, Shift_JIS

=back

=item Glossary

character repertoire, coded character set (CCS), character encoding scheme
(CES), charset (in MIME context), EUC, ISO-2022, UCS, UCS-2, Unicode, UTF,
UTF-16

=item See Also

=item References

ECMA, ECMA-035 (eq C<ISO-2022>), IANA, Assigned Charset Names by IANA, ISO,
RFC, UC, Unicode Glossary

=over 4

=item Other Notable Sites

czyborra.com, CJK.inf, Jungshik Shin's Hangul FAQ, debian.org:
"Introduction to i18n"

=item Offline sources

C<CJKV Information Processing> by Ken Lunde

=back

=back

=head2 Encode::Symbol - Symbol Encodings

=over 4

=item SYNOPSIS

=item ABSTRACT

=item DESCRIPTION

=item SEE ALSO

=back

=head2 Encode::TW - Taiwan-based Chinese Encodings

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item NOTES

=item BUGS

=item SEE ALSO

=back

=head2 Encode::Unicode -- Various Unicode Transformation Formats

=over 4

=item SYNOPSIS

=item ABSTRACT

L<http://www.unicode.org/glossary/> says:, Quick Reference

=item Size, Endianness, and BOM

=over 4

=item by size

=item by endianness

BOM as integer when fetched in network byte order

=back

=item Surrogate Pairs

=item Error Checking

=item SEE ALSO

=back

=head2 Encode::Unicode::UTF7 -- UTF-7 encoding

=over 4

=item SYNOPSIS

=item ABSTRACT

=item In Practice

=item SEE ALSO

=back

=head2 English - use nice English (or awk) names for ugly punctuation
variables

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item PERFORMANCE

=back
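
A small sketch of the long names provided by English.  The
C<-no_match_vars> import is used here as a conservative default, since the
PERFORMANCE entry above concerns the match variables.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

my $pid  = $PROCESS_ID;      # same value as $$
my $prog = $PROGRAM_NAME;    # same value as $0

print "running $prog as process $pid\n";
```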

=head2 Env - perl module that imports environment variables as scalars or
arrays

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item LIMITATIONS

=item AUTHOR

=back

=head2 Errno - System errno constants

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEATS

=item AUTHOR

=item COPYRIGHT

=back
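
A minimal sketch of testing C<$!> against an Errno constant.  The path is a
hypothetical one that is assumed not to exist on the machine running the
example.

```perl
use strict;
use warnings;
use Errno qw(ENOENT);

my $missing   = '/no/such/dir/no-such-file';    # assumed not to exist
my $is_enoent = 0;

unless (open my $fh, '<', $missing) {
    # Numeric comparison and the %! hash are two views of the same error.
    $is_enoent = 1 if $! == ENOENT && $!{ENOENT};
}

print "ENOENT seen: $is_enoent\n";
```

Comparing against the constant is more portable than matching the text of
C<$!>, which varies with locale and platform.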

=head2 Exporter - Implements default import method for modules

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item How to Export

=item Selecting What to Export

=item How to Import

C<use YourModule;>, C<use YourModule ();>, C<use YourModule qw(...);>

=back

=item Advanced Features

=over 4

=item Specialised Import Lists

=item Exporting Without Using Exporter's import Method

=item Exporting Without Inheriting from Exporter

=item Module Version Checking

=item Managing Unknown Symbols

=item Tag Handling Utility Functions

=item Generating Combined Tags

=item C<AUTOLOAD>ed Constants

=back

=item Good Practices

=over 4

=item Declaring C<@EXPORT_OK> and Friends

=item Playing Safe

=item What Not to Export

=back

=item SEE ALSO

=item LICENSE

=back
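
The export and import steps indexed above can be sketched in a single file.
C<My::Util> and its two functions are hypothetical names invented for the
example.  Because the package is defined inline rather than loaded with
C<use>, C<import> is called explicitly at run time.

```perl
use strict;
use warnings;

package My::Util;               # hypothetical module
use Exporter 'import';          # borrow Exporter's import method
our @EXPORT_OK = qw(double triple);

sub double { return 2 * $_[0] }
sub triple { return 3 * $_[0] }

package main;

# Equivalent to "use My::Util qw(double);" for a module kept in its own file.
My::Util->import('double');

my $result = double(21);
print "$result\n";
```

Only C<double> is requested, so C<triple> stays out of the caller's
namespace, which is the point of listing symbols in C<@EXPORT_OK> rather
than C<@EXPORT>.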

=head2 Exporter::Heavy - Exporter guts

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 ExtUtils::CBuilder - Compile and link C code for Perl modules

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

new, have_compiler, have_cplusplus, compile, C<object_file>,
C<include_dirs>, C<extra_compiler_flags>, C<C++>, link, lib_file,
module_name, extra_linker_flags, link_executable, exe_file, object_file,
lib_file, exe_file, prelink, need_prelink, extra_link_args_after_prelink

=item TO DO

=item HISTORY

=item SUPPORT

=item AUTHOR

=item COPYRIGHT

=item SEE ALSO

=back

=head2 ExtUtils::CBuilder::Platform::Windows - Builder class for Windows
platforms

=over 4

=item DESCRIPTION

=item AUTHOR

=item SEE ALSO

=back

=head2 ExtUtils::Command - utilities to replace common UNIX commands in
Makefiles etc.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item FUNCTIONS

=back

=back

cat

eqtime

rm_rf

rm_f

touch

mv

cp

chmod

mkpath

test_f

test_d

dos2unix

=over 4

=item SEE ALSO

=item AUTHOR

=back

=head2 ExtUtils::Command::MM - Commands for the MM's to use in Makefiles

=over 4

=item SYNOPSIS

=item DESCRIPTION

B<test_harness>

=back

B<pod2man>

B<warn_if_old_packlist>

B<perllocal_install>

B<uninstall>

B<test_s>

B<cp_nonempty>

=head2 ExtUtils::Constant - generate XS code to import C header constants

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item USAGE

IV, UV, NV, PV, PVN, SV, YES, NO, UNDEF

=item FUNCTIONS

=back

constant_types

XS_constant PACKAGE, TYPES, XS_SUBNAME, C_SUBNAME

autoload PACKAGE, VERSION, AUTOLOADER

WriteMakefileSnippet

WriteConstants ATTRIBUTE =E<gt> VALUE [, ...], NAME, DEFAULT_TYPE,
BREAKOUT_AT, NAMES, PROXYSUBS, C_FH, C_FILE, XS_FH, XS_FILE, XS_SUBNAME,
C_SUBNAME

=over 4

=item AUTHOR

=back

=head2 ExtUtils::Constant::Base - base class for ExtUtils::Constant objects

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item USAGE

=back

header

memEQ_clause args_hashref

dump_names arg_hashref, ITEM..

assign arg_hashref, VALUE..

return_clause arg_hashref, ITEM

switch_clause arg_hashref, NAMELEN, ITEMHASH, ITEM..

params WHAT

dogfood arg_hashref, ITEM..

normalise_items args, default_type, seen_types, seen_items, ITEM..

C_constant arg_hashref, ITEM.., name, type, value, macro, default, pre,
post, def_pre, def_post, utf8, weight

=over 4

=item BUGS

=item AUTHOR

=back

=head2 ExtUtils::Constant::Utils - helper functions for ExtUtils::Constant

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item USAGE

C_stringify NAME

=back

perl_stringify NAME

=over 4

=item AUTHOR

=back

=head2 ExtUtils::Constant::XS - generate C code for XS modules' constants.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item BUGS

=item AUTHOR

=back

=head2 ExtUtils::Embed - Utilities for embedding Perl in C/C++ applications

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item @EXPORT

=item FUNCTIONS

xsinit(), Examples, ldopts(), Examples, perl_inc(), ccflags(), ccdlflags(),
ccopts(), xsi_header(), xsi_protos(@modules), xsi_body(@modules)

=item EXAMPLES

=item SEE ALSO

=item AUTHOR

=back

=head2 ExtUtils::Install - install files from here to there

=over 4

=item SYNOPSIS

=item VERSION

=back

=over 4

=item DESCRIPTION

_chmod($$;$), _warnonce(@), _choke(@)

=back

_move_file_at_boot( $file, $target, $moan  )

_unlink_or_rename( $file, $tryhard, $installing )

=over 4

=item Functions

_get_install_skip

=back

_have_write_access

_can_write_dir(C<$dir>)

_mkpath($dir,$show,$mode,$verbose,$dry_run)

_copy($from,$to,$verbose,$dry_run)

_chdir($from)

B<install>

_do_cleanup

install_rooted_file( $file ), install_rooted_dir( $dir )

forceunlink( $file, $tryhard )

directory_not_empty( $dir )

B<install_default> I<DISCOURAGED>

B<uninstall>

inc_uninstall($filepath,$libdir,$verbose,$dry_run,$ignore,$results)

run_filter($cmd,$src,$dest)

B<pm_to_blib>

_autosplit

_invokant

=over 4

=item ENVIRONMENT

B<PERL_INSTALL_ROOT>, B<EU_INSTALL_IGNORE_SKIP>,
B<EU_INSTALL_SITE_SKIPFILE>, B<EU_INSTALL_ALWAYS_COPY>

=item AUTHOR

=item LICENSE

=back

=head2 ExtUtils::Installed - Inventory management of installed modules

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item USAGE

=item METHODS

new(), modules(), files(), directories(), directory_tree(), validate(),
packlist(), version()

=item EXAMPLE

=item AUTHOR

=back

=head2 ExtUtils::Liblist - determine libraries to use and how to use them

=over 4

=item SYNOPSIS

=item DESCRIPTION

For static extensions, For dynamic extensions at build/link time, For
dynamic extensions at load time

=over 4

=item EXTRALIBS

=item LDLOADLIBS and LD_RUN_PATH

=item BSLOADLIBS

=back

=item PORTABILITY

=over 4

=item VMS implementation

=item Win32 implementation

=back

=item SEE ALSO

=back

=head2 ExtUtils::MM - OS adjusted ExtUtils::MakeMaker subclass

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 ExtUtils::MM::Utils - ExtUtils::MM methods without dependency on
ExtUtils::MakeMaker

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

maybe_command

=back

=over 4

=item BUGS

=item SEE ALSO

=back

=head2 ExtUtils::MM_AIX - AIX specific subclass of ExtUtils::MM_Unix

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Overridden methods

=back

=back

=over 4

=item AUTHOR

=item SEE ALSO

=back

=head2 ExtUtils::MM_Any - Platform-agnostic MM methods

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Cross-platform helper methods

=back

=back

=over 4

=item Targets

=back

=over 4

=item Init methods

=back

=over 4

=item Tools

=back

=over 4

=item File::Spec wrappers

=back

=over 4

=item Misc

=back

=over 4

=item AUTHOR

=back

=head2 ExtUtils::MM_BeOS - methods to override UN*X behaviour in
ExtUtils::MakeMaker

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

os_flavor

init_linker

=head2 ExtUtils::MM_Cygwin - methods to override UN*X behaviour in
ExtUtils::MakeMaker

=over 4

=item SYNOPSIS

=item DESCRIPTION

os_flavor

=back

cflags

replace_manpage_separator

init_linker

maybe_command

dynamic_lib

install

all_target

=head2 ExtUtils::MM_DOS - DOS specific subclass of ExtUtils::MM_Unix

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Overridden methods

os_flavor

=back

=back

B<replace_manpage_separator>

=over 4

=item AUTHOR

=item SEE ALSO

=back

=head2 ExtUtils::MM_Darwin - special behaviors for OS X

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Overridden Methods

=back

=back

=head2 ExtUtils::MM_MacOS - once produced Makefiles for MacOS Classic

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 ExtUtils::MM_NW5 - methods to override UN*X behaviour in
ExtUtils::MakeMaker

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

os_flavor

init_platform, platform_constants

static_lib_pure_cmd

dynamic_lib

=head2 ExtUtils::MM_OS2 - methods to override UN*X behaviour in
ExtUtils::MakeMaker

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

init_dist

=back

init_linker

os_flavor

=head2 ExtUtils::MM_QNX - QNX specific subclass of ExtUtils::MM_Unix

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Overridden methods

=back

=back

=over 4

=item AUTHOR

=item SEE ALSO

=back

=head2 ExtUtils::MM_UWIN - U/WIN specific subclass of ExtUtils::MM_Unix

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Overridden methods

os_flavor

=back

=back

B<replace_manpage_separator>

=over 4

=item AUTHOR

=item SEE ALSO

=back

=head2 ExtUtils::MM_Unix - methods used by ExtUtils::MakeMaker

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=back

=over 4

=item Methods

os_flavor

=back

c_o (o)

xs_obj_opt

cflags (o)

const_cccmd (o)

const_config (o)

const_loadlibs (o)

constants (o)

depend (o)

init_DEST

init_dist

dist (o)

dist_basics (o)

dist_ci (o)

dist_core (o)

B<dist_target>

B<tardist_target>

B<zipdist_target>

B<tarfile_target>

zipfile_target

uutardist_target

shdist_target

dlsyms (o)

dynamic_bs (o)

dynamic_lib (o)

xs_dynamic_lib_macros

xs_make_dynamic_lib

exescan

extliblist

find_perl

fixin

force (o)

guess_name

has_link_code

init_dirscan

init_MANPODS

init_MAN1PODS

init_MAN3PODS

init_PM

init_DIRFILESEP

init_main

init_tools

init_linker

init_lib2arch

init_PERL

init_platform, platform_constants

init_PERM

init_xs

install (o)

installbin (o)

linkext (o)

lsdir

macro (o)

makeaperl (o)

makefile (o)

maybe_command

needs_linking (o)

parse_abstract

parse_version

pasthru (o)

perl_script

perldepend (o)

pm_to_blib

ppd

prefixify

processPL (o)

specify_shell

quote_paren

replace_manpage_separator

cd

oneliner

quote_literal

escape_newlines

max_exec_len

static (o)

xs_make_static_lib

static_lib_closures

static_lib_fixtures

static_lib_pure_cmd

staticmake (o)

subdir_x (o)

subdirs (o)

test (o)

test_via_harness (override)

test_via_script (override)

tool_xsubpp (o)

all_target

top_targets (o)

writedoc

xs_c (o)

xs_cpp (o)

xs_o (o)

=over 4

=item SEE ALSO

=back

=head2 ExtUtils::MM_VMS - methods to override UN*X behaviour in
ExtUtils::MakeMaker

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Methods always loaded

wraplist

=back

=back

=over 4

=item Methods

guess_name (override)

=back

find_perl (override)

_fixin_replace_shebang (override)

maybe_command (override)

pasthru (override)

pm_to_blib (override)

perl_script (override)

replace_manpage_separator

init_DEST

init_DIRFILESEP

init_main (override)

init_tools (override)

init_platform (override)

platform_constants

init_VERSION (override)

constants (override)

special_targets

cflags (override)

const_cccmd (override)

tools_other (override)

init_dist (override)

c_o (override)

xs_c (override)

xs_o (override)

_xsbuild_replace_macro (override)

_xsbuild_value (override)

dlsyms (override)

xs_obj_opt

dynamic_lib (override)

xs_make_static_lib (override)

extra_clean_files

zipfile_target, tarfile_target, shdist_target

install (override)

perldepend (override)

makeaperl (override)

maketext_filter (override)

prefixify (override)

cd

oneliner

B<echo>

quote_literal

escape_dollarsigns

escape_all_dollarsigns

escape_newlines

max_exec_len

init_linker

catdir (override), catfile (override)

eliminate_macros

fixpath

os_flavor

is_make_type (override)

make_type (override)

=over 4

=item AUTHOR

=back

=head2 ExtUtils::MM_VOS - VOS specific subclass of ExtUtils::MM_Unix

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Overridden methods

=back

=back

=over 4

=item AUTHOR

=item SEE ALSO

=back

=head2 ExtUtils::MM_Win32 - methods to override UN*X behaviour in
ExtUtils::MakeMaker

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=over 4

=item Overridden methods

B<dlsyms>

=back

xs_dlsyms_ext

replace_manpage_separator

B<maybe_command>

B<init_DIRFILESEP>

init_tools

init_others

init_platform, platform_constants

specify_shell

constants

special_targets

static_lib_pure_cmd

dynamic_lib

extra_clean_files

init_linker

perl_script

quote_dep

xs_obj_opt

pasthru

arch_check (override)

oneliner

cd

max_exec_len

os_flavor

cflags

make_type

=head2 ExtUtils::MM_Win95 - method to customize MakeMaker for Win9X

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Overridden methods

max_exec_len

=back

=back

os_flavor

=over 4

=item AUTHOR

=back

=head2 ExtUtils::MY - ExtUtils::MakeMaker subclass for customization

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 ExtUtils::MakeMaker - Create a module Makefile

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item How To Write A Makefile.PL

=item Default Makefile Behaviour

=item make test

=item make testdb

=item make install

=item INSTALL_BASE

=item PREFIX and LIB attribute

=item AFS users

=item Static Linking of a new Perl Binary

=item Determination of Perl Library and Installation Locations

=item Which architecture dependent directory?

=item Using Attributes and Parameters

ABSTRACT, ABSTRACT_FROM, AUTHOR, BINARY_LOCATION, BUILD_REQUIRES, C,
CCFLAGS, CONFIG, CONFIGURE, CONFIGURE_REQUIRES, DEFINE, DESTDIR, DIR,
DISTNAME, DISTVNAME, DLEXT, DL_FUNCS, DL_VARS, EXCLUDE_EXT, EXE_FILES,
FIRST_MAKEFILE, FULLPERL, FULLPERLRUN, FULLPERLRUNINST, FUNCLIST, H,
IMPORTS, INC, INCLUDE_EXT, INSTALLARCHLIB, INSTALLBIN, INSTALLDIRS,
INSTALLMAN1DIR, INSTALLMAN3DIR, INSTALLPRIVLIB, INSTALLSCRIPT,
INSTALLSITEARCH, INSTALLSITEBIN, INSTALLSITELIB, INSTALLSITEMAN1DIR,
INSTALLSITEMAN3DIR, INSTALLSITESCRIPT, INSTALLVENDORARCH, INSTALLVENDORBIN,
INSTALLVENDORLIB, INSTALLVENDORMAN1DIR, INSTALLVENDORMAN3DIR,
INSTALLVENDORSCRIPT, INST_ARCHLIB, INST_BIN, INST_LIB, INST_MAN1DIR,
INST_MAN3DIR, INST_SCRIPT, LD, LDDLFLAGS, LDFROM, LIB, LIBPERL_A, LIBS,
LICENSE, LINKTYPE, MAGICXS, MAKE, MAKEAPERL, MAKEFILE_OLD, MAN1PODS,
MAN3PODS, MAP_TARGET, META_ADD, META_MERGE, MIN_PERL_VERSION, MYEXTLIB,
NAME, NEEDS_LINKING, NOECHO, NORECURS, NO_META, NO_MYMETA, NO_PACKLIST,
NO_PERLLOCAL, NO_VC, OBJECT, OPTIMIZE, PERL, PERL_CORE, PERLMAINCC,
PERL_ARCHLIB, PERL_LIB, PERL_MALLOC_OK, PERLPREFIX, PERLRUN, PERLRUNINST,
PERL_SRC, PERM_DIR, PERM_RW, PERM_RWX, PL_FILES, PM, PMLIBDIRS, PM_FILTER,
POLLUTE, PPM_INSTALL_EXEC, PPM_INSTALL_SCRIPT, PPM_UNINSTALL_EXEC,
PPM_UNINSTALL_SCRIPT, PREFIX, PREREQ_FATAL, PREREQ_PM, PREREQ_PRINT,
PRINT_PREREQ, SITEPREFIX, SIGN, SKIP, TEST_REQUIRES, TYPEMAPS,
USE_MM_LD_RUN_PATH, VENDORPREFIX, VERBINST, VERSION, VERSION_FROM,
VERSION_SYM, XS, XSBUILD, XSMULTI, XSOPT, XSPROTOARG, XS_VERSION

=item Additional lowercase attributes

clean, depend, dist, dynamic_lib, linkext, macro, postamble, realclean,
test, tool_autosplit

=item Overriding MakeMaker Methods

=item The End Of Cargo Cult Programming

C<< MAN3PODS => ' ' >>

=item Hintsfile support

=item Distribution Support

make distcheck, make skipcheck, make distclean, make veryclean, make
manifest, make distdir, make disttest, make tardist, make dist, make
uutardist, make shdist, make zipdist, make ci

=item Module Meta-Data (META and MYMETA)

=item Disabling an extension

=item Other Handy Functions

prompt

=item Supported versions of Perl

=back

=item ENVIRONMENT

PERL_MM_OPT, PERL_MM_USE_DEFAULT, PERL_CORE

=item SEE ALSO

=item AUTHORS

=item LICENSE

=back

=head2 ExtUtils::MakeMaker::Config - Wrapper around Config.pm

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 ExtUtils::MakeMaker::FAQ - Frequently Asked Questions About
MakeMaker

=over 4

=item DESCRIPTION

=over 4

=item Module Installation

How do I install a module into my home directory?, How do I get MakeMaker
and Module::Build to install to the same place?, How do I keep from
installing man pages?, How do I use a module without installing it?, How
can I organize tests into subdirectories and have them run?, PREFIX vs
INSTALL_BASE from Module::Build::Cookbook, Generating *.pm files with
substitutions, e.g. of $VERSION

=item Common errors and problems

"No rule to make target `/usr/lib/perl5/CORE/config.h', needed by
`Makefile'"

=item Philosophy and History

Why not just use <insert other build config tool here>?, What is
Module::Build and how does it relate to MakeMaker?, pure perl. no make, no
shell commands, easier to customize, cleaner internals, less cruft

=item Module Writing

How do I keep my $VERSION up to date without resetting it manually?, What's
this F<META.yml> thing and how did it get in my F<MANIFEST>?!, How do I
delete everything not in my F<MANIFEST>?, Which tar should I use on
Windows?, Which zip should I use on Windows for '[ndg]make zipdist'?

=item XS

=back

=item DESIGN

=over 4

=item MakeMaker object hierarchy (simplified)

=item MakeMaker object hierarchy (real)

=item The MM_* hierarchy

=back

=item PATCHING

make a pull request on the MakeMaker GitHub repository, raise an issue on
the MakeMaker GitHub repository, file an RT ticket, email
makemaker@perl.org

=item AUTHOR

=item SEE ALSO

=back

=head2 ExtUtils::MakeMaker::Locale - bundled Encode::Locale

=over 4

=item SYNOPSIS

=item DESCRIPTION

decode_argv( ), decode_argv( Encode::FB_CROAK ), env( $uni_key ), env(
$uni_key => $uni_value ), reinit( ), reinit( $encoding ), $ENCODING_LOCALE,
$ENCODING_LOCALE_FS, $ENCODING_CONSOLE_IN, $ENCODING_CONSOLE_OUT

=item NOTES

=over 4

=item Windows

=item Mac OS X

=item POSIX (Linux and other Unixes)

=back

=item SEE ALSO

=item AUTHOR

=back

=head2 ExtUtils::MakeMaker::Tutorial - Writing a module with MakeMaker

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item The Mantra

=item The Layout

Makefile.PL, MANIFEST, lib/, t/, Changes, README, INSTALL, MANIFEST.SKIP,
bin/

=back

=item SEE ALSO

=back

=head2 ExtUtils::Manifest - utilities to write and check a MANIFEST file

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Functions

mkmanifest

=back

=back

manifind

manicheck

filecheck

fullcheck

skipcheck

maniread

maniskip

manicopy

maniadd

=over 4

=item MANIFEST

=item MANIFEST.SKIP

#!include_default, #!include /Path/to/another/manifest.skip

=item EXPORT_OK

=item GLOBAL VARIABLES

=back

=over 4

=item DIAGNOSTICS

C<Not in MANIFEST:> I<file>, C<Skipping> I<file>, C<No such file:> I<file>,
C<MANIFEST:> I<$!>, C<Added to MANIFEST:> I<file>

=item ENVIRONMENT

B<PERL_MM_MANIFEST_DEBUG>

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 ExtUtils::Miniperl - write the C code for miniperlmain.c and
perlmain.c

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=back

=head2 ExtUtils::Mkbootstrap - make a bootstrap file for use by DynaLoader

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 ExtUtils::Mksymlists - write linker options files for dynamic
extension

=over 4

=item SYNOPSIS

=item DESCRIPTION

DLBASE, DL_FUNCS, DL_VARS, FILE, FUNCLIST, IMPORTS, NAME

=item AUTHOR

=item REVISION

mkfh()

=back

__find_relocations

=head2 ExtUtils::Packlist - manage .packlist files

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item USAGE

=item FUNCTIONS

new(), read(), write(), validate(), packlist_file()

=item EXAMPLE

=item AUTHOR

=back

=head2 ExtUtils::ParseXS - converts Perl XS code into C code

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXPORT

=item METHODS

$pxs->new(), $pxs->process_file(), B<C++>, B<hiertype>, B<except>,
B<typemap>, B<prototypes>, B<versioncheck>, B<linenumbers>, B<optimize>,
B<inout>, B<argtypes>, B<s>, $pxs->report_error_count()

=item AUTHOR

=item COPYRIGHT

=item SEE ALSO

=back

=head2 ExtUtils::ParseXS::Constants - Initialization values for some
globals

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 ExtUtils::ParseXS::Eval - Clean package to evaluate code in

=over 4

=item SYNOPSIS

=item SUBROUTINES

=over 4

=item $pxs->eval_output_typemap_code($typemapcode, $other_hashref)

=back

=back

=over 4

=item $pxs->eval_input_typemap_code($typemapcode, $other_hashref)

=back

=over 4

=item TODO

=back

=head2 ExtUtils::ParseXS::Utilities - Subroutines used with
ExtUtils::ParseXS

=over 4

=item SYNOPSIS

=item SUBROUTINES

=over 4

=item C<standard_typemap_locations()>

Purpose, Arguments, Return Value

=back

=back

=over 4

=item C<trim_whitespace()>

Purpose, Argument, Return Value

=back

=over 4

=item C<C_string()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<valid_proto_string()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<process_typemaps()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<map_type()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<standard_XS_defs()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<assign_func_args()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<analyze_preprocessor_statements()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<set_cond()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<current_line_number()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<Warn()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<blurt()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<death()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<check_conditional_preprocessor_statements()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<escape_file_for_line_directive()>

Purpose, Arguments, Return Value

=back

=over 4

=item C<report_typemap_failure>

Purpose, Arguments, Return Value

=back

=head2 ExtUtils::Typemaps - Read/Write/Modify Perl/XS typemap files

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=back

=over 4

=item new

=back

=over 4

=item file

=back

=over 4

=item add_typemap

=back

=over 4

=item add_inputmap

=back

=over 4

=item add_outputmap

=back

=over 4

=item add_string

=back

=over 4

=item remove_typemap

=back

=over 4

=item remove_inputmap

=back

=over 4

=item remove_outputmap

=back

=over 4

=item get_typemap

=back

=over 4

=item get_inputmap

=back

=over 4

=item get_outputmap

=back

=over 4

=item write

=back

=over 4

=item as_string

=back

=over 4

=item as_embedded_typemap

=back

=over 4

=item merge

=back

=over 4

=item is_empty

=back

=over 4

=item list_mapped_ctypes

=back

=over 4

=item _get_typemap_hash

=back

=over 4

=item _get_inputmap_hash

=back

=over 4

=item _get_outputmap_hash

=back

=over 4

=item _get_prototype_hash

=back

=over 4

=item clone

=back

=over 4

=item tidy_type

=back

=over 4

=item CAVEATS

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT & LICENSE

=back

=head2 ExtUtils::Typemaps::Cmd - Quick commands for handling typemaps

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXPORTED FUNCTIONS

=over 4

=item embeddable_typemap

=back

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT & LICENSE

=back

=head2 ExtUtils::Typemaps::InputMap - Entry in the INPUT section of a
typemap

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=back

=over 4

=item new

=back

=over 4

=item code

=back

=over 4

=item xstype

=back

=over 4

=item cleaned_code

=back

=over 4

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT & LICENSE

=back

=head2 ExtUtils::Typemaps::OutputMap - Entry in the OUTPUT section of a
typemap

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=back

=over 4

=item new

=back

=over 4

=item code

=back

=over 4

=item xstype

=back

=over 4

=item cleaned_code

=back

=over 4

=item targetable

=back

=over 4

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT & LICENSE

=back

=head2 ExtUtils::Typemaps::Type - Entry in the TYPEMAP section of a typemap

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=back

=over 4

=item new

=back

=over 4

=item proto

=back

=over 4

=item xstype

=back

=over 4

=item ctype

=back

=over 4

=item tidy_ctype

=back

=over 4

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT & LICENSE

=back

=head2 ExtUtils::XSSymSet - keep sets of symbol names palatable to the VMS
linker

=over 4

=item SYNOPSIS

=item DESCRIPTION

new([$maxlen[,$silent]]), addsym($name[,$maxlen[,$silent]]),
trimsym($name[,$maxlen[,$silent]]), delsym($name), get_orig($trimmed),
get_trimmed($name), all_orig(), all_trimmed()

=item AUTHOR

=item REVISION

=back

=head2 ExtUtils::testlib - add blib/* directories to @INC

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 Fatal - Replace functions with equivalents which succeed or die

=over 4

=item SYNOPSIS

=item BEST PRACTICE

=item DESCRIPTION

=item DIAGNOSTICS

Bad subroutine name for Fatal: %s, %s is not a Perl subroutine, %s is
neither a builtin, nor a Perl subroutine, Cannot make the non-overridable
%s fatal, Internal error: %s

=item BUGS

=item AUTHOR

=item LICENSE

=item SEE ALSO

=back

=head2 Fcntl - load the C Fcntl.h defines

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item NOTE

=item EXPORTED SYMBOLS

=back

=head2 File::Basename - Parse file paths into directory, filename and
suffix.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

C<fileparse> X<fileparse>

C<basename> X<basename> X<filename>

C<dirname> X<dirname>

C<fileparse_set_fstype> X<filesystem>

=over 4

=item SEE ALSO

=back

=head2 File::Compare - Compare files or filehandles

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item RETURN

=item AUTHOR

=back

=head2 File::Copy - Copy files or filehandles

=over 4

=item SYNOPSIS

=item DESCRIPTION

copy X<copy> X<cp>, move X<move> X<mv> X<rename>, syscopy X<syscopy>,
rmscopy($from,$to[,$date_flag]) X<rmscopy>

=item RETURN

=item NOTES

=item AUTHOR

=back

=head2 File::DosGlob - DOS like globbing and then some

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXPORTS (by request only)

=item BUGS

=item AUTHOR

=item HISTORY

=item SEE ALSO

=back

=head2 File::Fetch - A generic file fetching mechanism

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item ACCESSORS

$ff->uri, $ff->scheme, $ff->host, $ff->vol, $ff->share, $ff->path,
$ff->file, $ff->file_default

=back

$ff->output_file

=over 4

=item METHODS

=over 4

=item $ff = File::Fetch->new( uri => 'http://some.where.com/dir/file.txt'
);

=back

=back

=over 4

=item $where = $ff->fetch( [to => /my/output/dir/ | \$scalar] )

=back

=over 4

=item $ff->error([BOOL])

=back

=over 4

=item HOW IT WORKS

=item GLOBAL VARIABLES

=over 4

=item $File::Fetch::FROM_EMAIL

=item $File::Fetch::USER_AGENT

=item $File::Fetch::FTP_PASSIVE

=item $File::Fetch::TIMEOUT

=item $File::Fetch::WARN

=item $File::Fetch::DEBUG

=item $File::Fetch::BLACKLIST

=item $File::Fetch::METHOD_FAIL

=back

=item MAPPING

=item FREQUENTLY ASKED QUESTIONS

=over 4

=item So how do I use a proxy with File::Fetch?

=item I used 'lynx' to fetch a file, but its contents are all wrong!

=item Files I'm trying to fetch have reserved characters or non-ASCII
characters in them. What do I do?

=back

=item TODO

Implement $PREFER_BIN

=item BUG REPORTS

=item AUTHOR

=item COPYRIGHT

=back

=head2 File::Find - Traverse a directory tree.

=over 4

=item SYNOPSIS

=item DESCRIPTION

B<find>, B<finddepth>

=over 4

=item %options

C<wanted>, C<bydepth>, C<preprocess>, C<postprocess>, C<follow>,
C<follow_fast>, C<follow_skip>, C<dangling_symlinks>, C<no_chdir>,
C<untaint>, C<untaint_pattern>, C<untaint_skip>

=item The wanted function

C<$File::Find::dir> is the current directory name, C<$_> is the current
filename within that directory, C<$File::Find::name> is the complete
pathname to the file

=back

=item WARNINGS

=item CAVEAT

$dont_use_nlink, symlinks

=item BUGS AND CAVEATS

=item HISTORY

=item SEE ALSO

=back

=head2 File::Glob - Perl extension for BSD glob routine

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item META CHARACTERS

=item EXPORTS

=item POSIX FLAGS

C<GLOB_ERR>, C<GLOB_LIMIT>, C<GLOB_MARK>, C<GLOB_NOCASE>, C<GLOB_NOCHECK>,
C<GLOB_NOSORT>, C<GLOB_BRACE>, C<GLOB_NOMAGIC>, C<GLOB_QUOTE>,
C<GLOB_TILDE>, C<GLOB_CSH>, C<GLOB_ALPHASORT>

=back

=item DIAGNOSTICS

C<GLOB_NOSPACE>, C<GLOB_ABEND>

=item NOTES

=item SEE ALSO

=item AUTHOR

=back

=head2 File::GlobMapper - Extend File Glob to Allow Input and Output Files

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Behind The Scenes

=item Limitations

=item Input File Glob

B<~>, B<~user>, B<.>, B<*>, B<?>, B<\>, B<[]>, B<{,}>, B<()>

=item Output File Glob

"*", #1

=item Returned Data

=back

=item EXAMPLES

=over 4

=item A Rename script

=item A few example globmaps

=back

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 File::Path - Create or remove directory trees

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

make_path( $dir1, $dir2, .... ), make_path( $dir1, $dir2, ...., \%opts ),
mode => $num, chmod => $num, verbose => $bool, error => \$err, owner =>
$owner, user => $owner, uid => $owner, group => $group, mkpath( $dir ),
mkpath( $dir, $verbose, $mode ), mkpath( [$dir1, $dir2,...], $verbose,
$mode ), mkpath( $dir1, $dir2,..., \%opt ), remove_tree( $dir1, $dir2, ....
), remove_tree( $dir1, $dir2, ...., \%opts ), verbose => $bool, safe =>
$bool, keep_root => $bool, result => \$res, error => \$err, rmtree( $dir ),
rmtree( $dir, $verbose, $safe ), rmtree( [$dir1, $dir2,...], $verbose,
$safe ), rmtree( $dir1, $dir2,..., \%opt )

=over 4

=item ERROR HANDLING

B<NOTE:>

=item NOTES

=back

=item DIAGNOSTICS

mkdir [path]: [errmsg] (SEVERE), No root path(s) specified, No such file or
directory, cannot fetch initial working directory: [errmsg], cannot stat
initial working directory: [errmsg], cannot chdir to [dir]: [errmsg],
directory [dir] changed before chdir, expected dev=[n] ino=[n], actual
dev=[n] ino=[n], aborting. (FATAL), cannot make directory [dir]
read+writeable: [errmsg], cannot read [dir]: [errmsg], cannot reset chmod
[dir]: [errmsg], cannot remove [dir] when cwd is [dir], cannot chdir to
[parent-dir] from [child-dir]: [errmsg], aborting. (FATAL), cannot stat
prior working directory [dir]: [errmsg], aborting. (FATAL), previous
directory [parent-dir] changed before entering [child-dir], expected
dev=[n] ino=[n], actual dev=[n] ino=[n], aborting. (FATAL), cannot make
directory [dir] writeable: [errmsg], cannot remove directory [dir]:
[errmsg], cannot restore permissions of [dir] to [0nnn]: [errmsg], cannot
make file [file] writeable: [errmsg], cannot unlink file [file]: [errmsg],
cannot restore permissions of [file] to [0nnn]: [errmsg], unable to map
[owner] to a uid, ownership not changed, unable to map [group] to a gid,
group ownership not changed

=item SEE ALSO

=item BUGS AND LIMITATIONS

=over 4

=item MULTITHREAD APPLICATIONS

=item NFS Mount Points

=item REPORTING BUGS

=back

=item ACKNOWLEDGEMENTS

=item AUTHORS

=item CONTRIBUTORS

<F<bulkdd@cpan.org>>, Craig A. Berry <F<craigberry@mac.com>>, Richard
Elberger <F<riche@cpan.org>>, Ryan Yee <F<ryee@cpan.org>>, Skye Shaw
<F<shaw@cpan.org>>, Tom Lutz <F<tommylutz@gmail.com>>

=item COPYRIGHT

=item LICENSE

=back

=head2 File::Spec - portably perform operations on file names

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

canonpath X<canonpath>, catdir X<catdir>, catfile X<catfile>, curdir
X<curdir>, devnull X<devnull>, rootdir X<rootdir>, tmpdir X<tmpdir>, updir
X<updir>, no_upwards, case_tolerant, file_name_is_absolute, path X<path>,
join X<join, path>, splitpath X<splitpath> X<split, path>, splitdir
X<splitdir> X<split, dir>, catpath(), abs2rel X<abs2rel> X<absolute, path>
X<relative, path>, rel2abs() X<rel2abs> X<absolute, path> X<relative, path>

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT

=back

=head2 File::Spec::AmigaOS - File::Spec for AmigaOS

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

tmpdir

=back

file_name_is_absolute

=head2 File::Spec::Cygwin - methods for Cygwin file specs

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

canonpath

file_name_is_absolute

tmpdir (override)

case_tolerant

=over 4

=item COPYRIGHT

=back

=head2 File::Spec::Epoc - methods for Epoc file specs

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

canonpath()

=over 4

=item AUTHOR

=item COPYRIGHT

=item SEE ALSO

=back

=head2 File::Spec::Functions - portably perform operations on file names

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Exports

=back

=item COPYRIGHT

=item SEE ALSO

=back

=head2 File::Spec::Mac - File::Spec for Mac OS (Classic)

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

canonpath

=back

catdir()

catfile

curdir

devnull

rootdir

tmpdir

updir

file_name_is_absolute

path

splitpath

splitdir

catpath

abs2rel

rel2abs

=over 4

=item AUTHORS

=item COPYRIGHT

=item SEE ALSO

=back

=head2 File::Spec::OS2 - methods for OS/2 file specs

=over 4

=item SYNOPSIS

=item DESCRIPTION

tmpdir, splitpath

=item COPYRIGHT

=back

=head2 File::Spec::Unix - File::Spec for Unix, base for other File::Spec
modules

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

canonpath()

=back

catdir()

catfile

curdir

devnull

rootdir

tmpdir

updir

no_upwards

case_tolerant

file_name_is_absolute

path

join

splitpath

splitdir

catpath()

abs2rel

rel2abs()

=over 4

=item COPYRIGHT

=item SEE ALSO

=back

=head2 File::Spec::VMS - methods for VMS file specs

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

canonpath (override)

catdir (override)

catfile (override)

curdir (override)

devnull (override)

rootdir (override)

tmpdir (override)

updir (override)

case_tolerant (override)

path (override)

file_name_is_absolute (override)

splitpath (override)

splitdir (override)

catpath (override)

abs2rel (override)

rel2abs (override)

=over 4

=item COPYRIGHT

=item SEE ALSO

=back

=head2 File::Spec::Win32 - methods for Win32 file specs

=over 4

=item SYNOPSIS

=item DESCRIPTION

devnull

=back

tmpdir

case_tolerant

file_name_is_absolute

catfile

canonpath

splitpath

splitdir

catpath

=over 4

=item Note For File::Spec::Win32 Maintainers

=back

=over 4

=item COPYRIGHT

=item SEE ALSO

=back

=head2 File::Temp - return name and handle of a temporary file safely

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item PORTABILITY

=item OBJECT-ORIENTED INTERFACE

B<new>, B<newdir>, B<filename>, B<dirname>, B<unlink_on_destroy>,
B<DESTROY>

=item FUNCTIONS

B<tempfile>, B<tempdir>

=item MKTEMP FUNCTIONS

B<mkstemp>, B<mkstemps>, B<mkdtemp>, B<mktemp>

=item POSIX FUNCTIONS

B<tmpnam>, B<tmpfile>

=item ADDITIONAL FUNCTIONS

B<tempnam>

=item UTILITY FUNCTIONS

B<unlink0>, B<cmpstat>, B<unlink1>, B<cleanup>

=item PACKAGE VARIABLES

B<safe_level>, STANDARD, MEDIUM, HIGH, TopSystemUID, B<$KEEP_ALL>,
B<$DEBUG>

=item WARNING

=over 4

=item Temporary files and NFS

=item Forking

=item Directory removal

=item Taint mode

=item BINMODE

=back

=item HISTORY

=item SEE ALSO

=item SUPPORT

=over 4

=item Bugs / Feature Requests

=item Source Code

=back

=item AUTHOR

=item CONTRIBUTORS

=item COPYRIGHT AND LICENSE

=back

=head2 File::stat - by-name interface to Perl's built-in stat() functions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item BUGS

=item ERRORS

-%s is not implemented on a File::stat object

=item WARNINGS

File::stat ignores use filetest 'access', File::stat ignores VMS ACLs

=item NOTE

=item AUTHOR

=back

=head2 FileCache - keep more files open than the system permits

=over 4

=item SYNOPSIS

=item DESCRIPTION

cacheout EXPR, cacheout MODE, EXPR

=item CAVEATS

=item BUGS

=back

=head2 FileHandle - supply object methods for filehandles

=over 4

=item SYNOPSIS

=item DESCRIPTION

$fh->print, $fh->printf, $fh->getline, $fh->getlines

=item SEE ALSO

=back

=head2 Filter::Simple - Simplified source filtering

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item The Problem

=item A Solution

=item Disabling or changing C<no> behaviour

=item All-in-one interface

=item Filtering only specific components of source code

C<"code">, C<"code_no_comments">, C<"executable">,
C<"executable_no_comments">, C<"quotelike">, C<"string">, C<"regex">,
C<"all">

=item Filtering only the code parts of source code

=item Using Filter::Simple with an explicit C<import> subroutine

=item Using Filter::Simple and Exporter together

=item How it works

=back

=item AUTHOR

=item CONTACT

=item COPYRIGHT AND LICENSE

=back

=head2 Filter::Util::Call - Perl Source Filter Utility Module

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item B<use Filter::Util::Call>

=item B<import()>

=item B<filter_add()>

=item B<filter() and anonymous sub>

B<$_>, B<$status>, B<filter_read> and B<filter_read_exact>, B<filter_del>,
I<real_import>, I<unimport()>

=back

=item LIMITATIONS

__DATA__ is ignored, Max. codesize limited to 32-bit

=item EXAMPLES

=over 4

=item Example 1: A simple filter.

=item Example 2: Using the context

=item Example 3: Using the context within the filter

=item Example 4: Using filter_del

=back

=item Filter::Simple

=item AUTHOR

=item DATE

=item LICENSE

=back

=head2 FindBin - Locate directory of original perl script

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXPORTABLE VARIABLES

=item KNOWN ISSUES

=item AUTHORS

=item COPYRIGHT

=back

=head2 GDBM_File - Perl5 access to the gdbm library.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item AVAILABILITY

=item BUGS

=item SEE ALSO

=back

=head2 Getopt::Long - Extended processing of command line options

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Command Line Options, an Introduction

=item Getting Started with Getopt::Long

=over 4

=item Simple options

=item A little bit less simple options

=item Mixing command line option with other arguments

=item Options with values

=item Options with multiple values

=item Options with hash values

=item User-defined subroutines to handle options

=item Options with multiple names

=item Case and abbreviations

=item Summary of Option Specifications

!, +, s, i, o, f, : I<type> [ I<desttype> ], : I<number> [ I<desttype> ], :
+ [ I<desttype> ]

=back

=item Advanced Possibilities

=over 4

=item Object oriented interface

=item Thread Safety

=item Documentation and help texts

=item Parsing options from an arbitrary array

=item Parsing options from an arbitrary string

=item Storing options values in a hash

=item Bundling

=item The lonesome dash

=item Argument callback

=back

=item Configuring Getopt::Long

default, posix_default, auto_abbrev, getopt_compat, gnu_compat, gnu_getopt,
require_order, permute, bundling (default: disabled), bundling_override
(default: disabled), ignore_case (default: enabled), ignore_case_always
(default: disabled), auto_version (default: disabled), auto_help
(default: disabled), pass_through (default: disabled), prefix,
prefix_pattern, long_prefix_pattern, debug (default: disabled)

=item Exportable Methods

VersionMessage, C<-message>, C<-msg>, C<-exitval>, C<-output>, HelpMessage

=item Return values and Errors

=item Legacy

=over 4

=item Default destinations

=item Alternative option starters

=item Configuration variables

=back

=item Tips and Techniques

=over 4

=item Pushing multiple values in a hash option

=back

=item Troubleshooting

=over 4

=item GetOptions does not return a false result when an option is not
supplied

=item GetOptions does not split the command line correctly

=item Undefined subroutine &main::GetOptions called

=item How do I put a "-?" option into a Getopt::Long?

=back

=item AUTHOR

=item COPYRIGHT AND DISCLAIMER

=back

=head2 Getopt::Std - Process single-character switches with switch
clustering

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item C<--help> and C<--version>

=back

=head2 HTTP::Tiny - A small, simple, correct HTTP/1.1 client

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item new

=item get|head|put|post|delete

=item post_form

=item mirror

=item request

=item www_form_urlencode

=item can_ssl

=item connected

=back

=item SSL SUPPORT

=item PROXY SUPPORT

=item LIMITATIONS

=item SEE ALSO

=item SUPPORT

=over 4

=item Bugs / Feature Requests

=item Source Code

=back

=item AUTHORS

=item CONTRIBUTORS

=item COPYRIGHT AND LICENSE

=back

=head2 Hash::Util - A selection of general-utility hash subroutines

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Restricted hashes

B<lock_keys>, B<unlock_keys>

=back

=back

B<lock_keys_plus>

B<lock_value>, B<unlock_value>

B<lock_hash>, B<unlock_hash>

B<lock_hash_recurse>, B<unlock_hash_recurse>

B<hashref_locked>, B<hash_locked>

B<hashref_unlocked>, B<hash_unlocked>

B<legal_keys>, B<hidden_keys>, B<all_keys>, B<hash_seed>, B<hash_value>,
B<bucket_info>, B<bucket_stats>, B<bucket_array>

B<bucket_stats_formatted>

B<hv_store>, B<hash_traversal_mask>, B<bucket_ratio>, B<used_buckets>,
B<num_buckets>

=over 4

=item Operating on references to hashes.

lock_ref_keys, unlock_ref_keys, lock_ref_keys_plus, lock_ref_value,
unlock_ref_value, lock_hashref, unlock_hashref, lock_hashref_recurse,
unlock_hashref_recurse, hash_ref_unlocked, legal_ref_keys, hidden_ref_keys

=back

=over 4

=item CAVEATS

=item BUGS

=item AUTHOR

=item SEE ALSO

=back

=head2 Hash::Util::FieldHash - Support for Inside-Out Classes

=over 4

=item SYNOPSIS

=item FUNCTIONS

id, id_2obj, register, idhash, idhashes, fieldhash, fieldhashes

=item DESCRIPTION

=over 4

=item The Inside-out Technique

=item Problems of Inside-out

=item Solutions

=item More Problems

=item The Generic Object

=item How to use Field Hashes

=item Garbage-Collected Hashes

=back

=item EXAMPLES

C<init()>, C<first()>, C<last()>, C<name()>, C<Name_hash>, C<Name_id>,
C<Name_idhash>, C<Name_id_reg>, C<Name_idhash_reg>, C<Name_fieldhash>

=over 4

=item Example 1

=item Example 2

=back

=item GUTS

=over 4

=item The C<PERL_MAGIC_uvar> interface for hashes

=item Weakrefs call uvar magic

=item How field hashes work

=item Internal function Hash::Util::FieldHash::_fieldhash

=back

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 I18N::Collate - compare 8-bit scalar data according to the current
locale

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 I18N::LangTags - functions for dealing with RFC3066-style language
tags

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

the function is_language_tag($lang1)

the function extract_language_tags($whatever)

the function same_language_tag($lang1, $lang2)

the function similarity_language_tag($lang1, $lang2)

the function is_dialect_of($lang1, $lang2)

the function super_languages($lang1)

the function locale2language_tag($locale_identifier)

the function encode_language_tag($lang1)

the function alternate_language_tags($lang1)

the function @langs = panic_languages(@accept_languages)

the function implicate_supers( ...languages... ), the function
implicate_supers_strictly( ...languages... )

=over 4

=item ABOUT LOWERCASING

=item ABOUT UNICODE PLAINTEXT LANGUAGE TAGS

=item SEE ALSO

=item COPYRIGHT

=item AUTHOR

=back

=head2 I18N::LangTags::Detect - detect the user's language preferences

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item FUNCTIONS

=item ENVIRONMENT

=item SEE ALSO

=item COPYRIGHT

=item AUTHOR

=back

=head2 I18N::LangTags::List -- tags and names for human languages

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item ABOUT LANGUAGE TAGS

=item LIST OF LANGUAGES

{ab} : Abkhazian, {ace} : Achinese, {ach} : Acoli, {ada} : Adangme, {ady} :
Adyghe, {aa} : Afar, {afh} : Afrihili, {af} : Afrikaans, [{afa} :
Afro-Asiatic (Other)], {ak} : Akan, {akk} : Akkadian, {sq} : Albanian,
{ale} : Aleut, [{alg} : Algonquian languages], [{tut} : Altaic (Other)],
{am} : Amharic, {i-ami} : Ami, [{apa} : Apache languages], {ar} : Arabic,
{arc} : Aramaic, {arp} : Arapaho, {arn} : Araucanian, {arw} : Arawak, {hy}
: Armenian, {an} : Aragonese, [{art} : Artificial (Other)], {ast} :
Asturian, {as} : Assamese, [{ath} : Athapascan languages], [{aus} :
Australian languages], [{map} : Austronesian (Other)], {av} : Avaric, {ae}
: Avestan, {awa} : Awadhi, {ay} : Aymara, {az} : Azerbaijani, {ban} :
Balinese, [{bat} : Baltic (Other)], {bal} : Baluchi, {bm} : Bambara, [{bai}
: Bamileke languages], {bad} : Banda, [{bnt} : Bantu (Other)], {bas} :
Basa, {ba} : Bashkir, {eu} : Basque, {btk} : Batak (Indonesia), {bej} :
Beja, {be} : Belarusian, {bem} : Bemba, {bn} : Bengali, [{ber} : Berber
(Other)], {bho} : Bhojpuri, {bh} : Bihari, {bik} : Bikol, {bin} : Bini,
{bi} : Bislama, {bs} : Bosnian, {bra} : Braj, {br} : Breton, {bug} :
Buginese, {bg} : Bulgarian, {i-bnn} : Bunun, {bua} : Buriat, {my} :
Burmese, {cad} : Caddo, {car} : Carib, {ca} : Catalan, [{cau} : Caucasian
(Other)], {ceb} : Cebuano, [{cel} : Celtic (Other)], [{cai} : Central
American Indian (Other)], {chg} : Chagatai, [{cmc} : Chamic languages],
{ch} : Chamorro, {ce} : Chechen, {chr} : Cherokee, {chy} : Cheyenne, {chb}
: Chibcha, {ny} : Chichewa, {zh} : Chinese, {chn} : Chinook Jargon, {chp} :
Chipewyan, {cho} : Choctaw, {cu} : Church Slavic, {chk} : Chuukese, {cv} :
Chuvash, {cop} : Coptic, {kw} : Cornish, {co} : Corsican, {cr} : Cree,
{mus} : Creek, [{cpe} : English-based Creoles and pidgins (Other)], [{cpf}
: French-based Creoles and pidgins (Other)], [{cpp} : Portuguese-based
Creoles and pidgins (Other)], [{crp} : Creoles and pidgins (Other)], {hr} :
Croatian, [{cus} : Cushitic (Other)], {cs} : Czech, {dak} : Dakota, {da} :
Danish, {dar} : Dargwa, {day} : Dayak, {i-default} : Default (Fallthru)
Language, {del} : Delaware, {din} : Dinka, {dv} : Divehi, {doi} : Dogri,
{dgr} : Dogrib, [{dra} : Dravidian (Other)], {dua} : Duala, {nl} : Dutch,
{dum} : Middle Dutch (ca.1050-1350), {dyu} : Dyula, {dz} : Dzongkha, {efi}
: Efik, {egy} : Ancient Egyptian, {eka} : Ekajuk, {elx} : Elamite, {en} :
English, {enm} : Middle English (1100-1500), {ang} : Old English
(ca.450-1100), {i-enochian} : Enochian (Artificial), {myv} : Erzya, {eo} :
Esperanto, {et} : Estonian, {ee} : Ewe, {ewo} : Ewondo, {fan} : Fang, {fat}
: Fanti, {fo} : Faroese, {fj} : Fijian, {fi} : Finnish, [{fiu} :
Finno-Ugrian (Other)], {fon} : Fon, {fr} : French, {frm} : Middle French
(ca.1400-1600), {fro} : Old French (842-ca.1400), {fy} : Frisian, {fur} :
Friulian, {ff} : Fulah, {gaa} : Ga, {gd} : Scots Gaelic, {gl} : Gallegan,
{lg} : Ganda, {gay} : Gayo, {gba} : Gbaya, {gez} : Geez, {ka} : Georgian,
{de} : German, {gmh} : Middle High German (ca.1050-1500), {goh} : Old High
German (ca.750-1050), [{gem} : Germanic (Other)], {gil} : Gilbertese, {gon}
: Gondi, {gor} : Gorontalo, {got} : Gothic, {grb} : Grebo, {grc} : Ancient
Greek, {el} : Modern Greek, {gn} : Guarani, {gu} : Gujarati, {gwi} :
Gwich'in, {hai} : Haida, {ht} : Haitian, {ha} : Hausa, {haw} : Hawaiian,
{he} : Hebrew, {hz} : Herero, {hil} : Hiligaynon, {him} : Himachali, {hi} :
Hindi, {ho} : Hiri Motu, {hit} : Hittite, {hmn} : Hmong, {hu} : Hungarian,
{hup} : Hupa, {iba} : Iban, {is} : Icelandic, {io} : Ido, {ig} : Igbo,
{ijo} : Ijo, {ilo} : Iloko, [{inc} : Indic (Other)], [{ine} : Indo-European
(Other)], {id} : Indonesian, {inh} : Ingush, {ia} : Interlingua
(International Auxiliary Language Association), {ie} : Interlingue, {iu} :
Inuktitut, {ik} : Inupiaq, [{ira} : Iranian (Other)], {ga} : Irish, {mga} :
Middle Irish (900-1200), {sga} : Old Irish (to 900), [{iro} : Iroquoian
languages], {it} : Italian, {ja} : Japanese, {jv} : Javanese, {jrb} :
Judeo-Arabic, {jpr} : Judeo-Persian, {kbd} : Kabardian, {kab} : Kabyle,
{kac} : Kachin, {kl} : Kalaallisut, {xal} : Kalmyk, {kam} : Kamba, {kn} :
Kannada, {kr} : Kanuri, {krc} : Karachay-Balkar, {kaa} : Kara-Kalpak, {kar}
: Karen, {ks} : Kashmiri, {csb} : Kashubian, {kaw} : Kawi, {kk} : Kazakh,
{kha} : Khasi, {km} : Khmer, [{khi} : Khoisan (Other)], {kho} : Khotanese,
{ki} : Kikuyu, {kmb} : Kimbundu, {rw} : Kinyarwanda, {ky} : Kirghiz,
{i-klingon} : Klingon, {kv} : Komi, {kg} : Kongo, {kok} : Konkani, {ko} :
Korean, {kos} : Kosraean, {kpe} : Kpelle, {kro} : Kru, {kj} : Kuanyama,
{kum} : Kumyk, {ku} : Kurdish, {kru} : Kurukh, {kut} : Kutenai, {lad} :
Ladino, {lah} : Lahnda, {lam} : Lamba, {lo} : Lao, {la} : Latin, {lv} :
Latvian, {lb} : Letzeburgesch, {lez} : Lezghian, {li} : Limburgish, {ln} :
Lingala, {lt} : Lithuanian, {nds} : Low German, {art-lojban} : Lojban
(Artificial), {loz} : Lozi, {lu} : Luba-Katanga, {lua} : Luba-Lulua, {lui}
: Luiseno, {lun} : Lunda, {luo} : Luo (Kenya and Tanzania), {lus} : Lushai,
{mk} : Macedonian, {mad} : Madurese, {mag} : Magahi, {mai} : Maithili,
{mak} : Makasar, {mg} : Malagasy, {ms} : Malay, {ml} : Malayalam, {mt} :
Maltese, {mnc} : Manchu, {mdr} : Mandar, {man} : Mandingo, {mni} :
Manipuri, [{mno} : Manobo languages], {gv} : Manx, {mi} : Maori, {mr} :
Marathi, {chm} : Mari, {mh} : Marshall, {mwr} : Marwari, {mas} : Masai,
[{myn} : Mayan languages], {men} : Mende, {mic} : Micmac, {min} :
Minangkabau, {i-mingo} : Mingo, [{mis} : Miscellaneous languages], {moh} :
Mohawk, {mdf} : Moksha, {mo} : Moldavian, [{mkh} : Mon-Khmer (Other)],
{lol} : Mongo, {mn} : Mongolian, {mos} : Mossi, [{mul} : Multiple
languages], [{mun} : Munda languages], {nah} : Nahuatl, {nap} : Neapolitan,
{na} : Nauru, {nv} : Navajo, {nd} : North Ndebele, {nr} : South Ndebele,
{ng} : Ndonga, {ne} : Nepali, {new} : Newari, {nia} : Nias, [{nic} :
Niger-Kordofanian (Other)], [{ssa} : Nilo-Saharan (Other)], {niu} : Niuean,
{nog} : Nogai, {non} : Old Norse, [{nai} : North American Indian], {no} :
Norwegian, {nb} : Norwegian Bokmal, {nn} : Norwegian Nynorsk, [{nub} :
Nubian languages], {nym} : Nyamwezi, {nyn} : Nyankole, {nyo} : Nyoro, {nzi}
: Nzima, {oc} : Occitan (post 1500), {oj} : Ojibwa, {or} : Oriya, {om} :
Oromo, {osa} : Osage, {os} : Ossetian; Ossetic, [{oto} : Otomian
languages], {pal} : Pahlavi, {i-pwn} : Paiwan, {pau} : Palauan, {pi} :
Pali, {pam} : Pampanga, {pag} : Pangasinan, {pa} : Panjabi, {pap} :
Papiamento, [{paa} : Papuan (Other)], {fa} : Persian, {peo} : Old Persian
(ca.600-400 B.C.), [{phi} : Philippine (Other)], {phn} : Phoenician, {pon}
: Pohnpeian, {pl} : Polish, {pt} : Portuguese, [{pra} : Prakrit languages],
{pro} : Old Provencal (to 1500), {ps} : Pushto, {qu} : Quechua, {rm} :
Raeto-Romance, {raj} : Rajasthani, {rap} : Rapanui, {rar} : Rarotongan,
[{qaa - qtz} : Reserved for local use.], [{roa} : Romance (Other)], {ro} :
Romanian, {rom} : Romany, {rn} : Rundi, {ru} : Russian, [{sal} : Salishan
languages], {sam} : Samaritan Aramaic, {se} : Northern Sami, {sma} :
Southern Sami, {smn} : Inari Sami, {smj} : Lule Sami, {sms} : Skolt Sami,
[{smi} : Sami languages (Other)], {sm} : Samoan, {sad} : Sandawe, {sg} :
Sango, {sa} : Sanskrit, {sat} : Santali, {sc} : Sardinian, {sas} : Sasak,
{sco} : Scots, {sel} : Selkup, [{sem} : Semitic (Other)], {sr} : Serbian,
{srr} : Serer, {shn} : Shan, {sn} : Shona, {sid} : Sidamo, {sgn-...} : Sign
Languages, {bla} : Siksika, {sd} : Sindhi, {si} : Sinhalese, [{sit} :
Sino-Tibetan (Other)], [{sio} : Siouan languages], {den} : Slave
(Athapascan), [{sla} : Slavic (Other)], {sk} : Slovak, {sl} : Slovenian,
{sog} : Sogdian, {so} : Somali, {son} : Songhai, {snk} : Soninke, {wen} :
Sorbian languages, {nso} : Northern Sotho, {st} : Southern Sotho, [{sai} :
South American Indian (Other)], {es} : Spanish, {suk} : Sukuma, {sux} :
Sumerian, {su} : Sundanese, {sus} : Susu, {sw} : Swahili, {ss} : Swati,
{sv} : Swedish, {syr} : Syriac, {tl} : Tagalog, {ty} : Tahitian, [{tai} :
Tai (Other)], {tg} : Tajik, {tmh} : Tamashek, {ta} : Tamil, {i-tao} : Tao,
{tt} : Tatar, {i-tay} : Tayal, {te} : Telugu, {ter} : Tereno, {tet} :
Tetum, {th} : Thai, {bo} : Tibetan, {tig} : Tigre, {ti} : Tigrinya, {tem} :
Timne, {tiv} : Tiv, {tli} : Tlingit, {tpi} : Tok Pisin, {tkl} : Tokelau,
{tog} : Tonga (Nyasa), {to} : Tonga (Tonga Islands), {tsi} : Tsimshian,
{ts} : Tsonga, {i-tsu} : Tsou, {tn} : Tswana, {tum} : Tumbuka, [{tup} :
Tupi languages], {tr} : Turkish, {ota} : Ottoman Turkish (1500-1928), {crh}
: Crimean Turkish, {tk} : Turkmen, {tvl} : Tuvalu, {tyv} : Tuvinian, {tw} :
Twi, {udm} : Udmurt, {uga} : Ugaritic, {ug} : Uighur, {uk} : Ukrainian,
{umb} : Umbundu, {und} : Undetermined, {ur} : Urdu, {uz} : Uzbek, {vai} :
Vai, {ve} : Venda, {vi} : Vietnamese, {vo} : Volapuk, {vot} : Votic, [{wak}
: Wakashan languages], {wa} : Walloon, {wal} : Walamo, {war} : Waray, {was}
: Washo, {cy} : Welsh, {wo} : Wolof, {x-...} : Unregistered (Semi-Private
Use), {xh} : Xhosa, {sah} : Yakut, {yao} : Yao, {yap} : Yapese, {ii} :
Sichuan Yi, {yi} : Yiddish, {yo} : Yoruba, [{ypk} : Yupik languages], {znd}
: Zande, [{zap} : Zapotec], {zen} : Zenaga, {za} : Zhuang, {zu} : Zulu,
{zun} : Zuni

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMER

=item AUTHOR

=back

=head2 I18N::Langinfo - query locale information

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item EXPORT

=back

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 IO - load various IO modules

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item DEPRECATED

=back

=head2 IO::Compress::Base - Base Class for IO::Compress modules 

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Compress::Bzip2 - Write bzip2 files/buffers

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Functional Interface

=over 4

=item bzip2 $input_filename_or_reference => $output_filename_or_reference
[, OPTS]

A filename, A filehandle, A scalar reference, An array reference, An Input
FileGlob string, A filename, A filehandle, A scalar reference, An Array
Reference, An Output FileGlob

=item Notes

=item Optional Parameters

C<< AutoClose => 0|1 >>, C<< BinModeIn => 0|1 >>, C<< Append => 0|1 >>, A
Buffer, A Filename, A Filehandle

=item Examples

=back

=item OO Interface

=over 4

=item Constructor

A filename, A filehandle, A scalar reference

=item Constructor Options

C<< AutoClose => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A
Filehandle, C<< BlockSize100K => number >>, C<< WorkFactor => number >>,
C<< Strict => 0|1 >>

=item Examples

=back

=item Methods 

=over 4

=item print

=item printf

=item syswrite

=item write

=item flush

=item tell

=item eof

=item seek

=item binmode

=item opened

=item autoflush

=item input_line_number

=item fileno

=item close

=item newStream([OPTS])

=back

=item Importing 

:all

=item EXAMPLES

=over 4

=item Apache::GZip Revisited

=item Working with Net::FTP

=back

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Compress::Deflate - Write RFC 1950 files/buffers

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Functional Interface

=over 4

=item deflate $input_filename_or_reference => $output_filename_or_reference
[, OPTS]

A filename, A filehandle, A scalar reference, An array reference, An Input
FileGlob string, A filename, A filehandle, A scalar reference, An Array
Reference, An Output FileGlob

=item Notes

=item Optional Parameters

C<< AutoClose => 0|1 >>, C<< BinModeIn => 0|1 >>, C<< Append => 0|1 >>, A
Buffer, A Filename, A Filehandle

=item Examples

=back

=item OO Interface

=over 4

=item Constructor

A filename, A filehandle, A scalar reference

=item Constructor Options

C<< AutoClose => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A
Filehandle, C<< Merge => 0|1 >>, -Level, -Strategy, C<< Strict => 0|1 >>

=item Examples

=back

=item Methods 

=over 4

=item print

=item printf

=item syswrite

=item write

=item flush

=item tell

=item eof

=item seek

=item binmode

=item opened

=item autoflush

=item input_line_number

=item fileno

=item close

=item newStream([OPTS])

=item deflateParams

=back

=item Importing 

:all, :constants, :flush, :level, :strategy

=item EXAMPLES

=over 4

=item Apache::GZip Revisited

=item Working with Net::FTP

=back

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Compress::FAQ -- Frequently Asked Questions about IO::Compress

=over 4

=item DESCRIPTION

=item GENERAL 

=over 4

=item Compatibility with Unix compress/uncompress.

=item Accessing .tar.Z files

=item How do I recompress using a different compression?

=back

=item ZIP

=over 4

=item What Compression Types do IO::Compress::Zip & IO::Uncompress::Unzip
support?

Store (method 0), Deflate (method 8), Bzip2 (method 12), Lzma (method 14)

=item Can I Read/Write Zip files larger than 4 Gig?

=item Can I write more than 64K entries in a Zip file?

=item Zip Resources

=back

=item GZIP

=over 4

=item Gzip Resources

=item Dealing with concatenated gzip files

=item Reading bgzip files with IO::Uncompress::Gunzip

=back

=item ZLIB

=over 4

=item Zlib Resources

=back

=item Bzip2

=over 4

=item Bzip2 Resources

=item Dealing with Concatenated bzip2 files

=item Interoperating with Pbzip2

=back

=item HTTP & NETWORK

=over 4

=item Apache::GZip Revisited

=item Compressed files and Net::FTP

=back

=item MISC

=over 4

=item Using C<InputLength> to uncompress data embedded in a larger
file/buffer.

=back

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Compress::Gzip - Write RFC 1952 files/buffers

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Functional Interface

=over 4

=item gzip $input_filename_or_reference => $output_filename_or_reference [,
OPTS]

A filename, A filehandle, A scalar reference, An array reference, An Input
FileGlob string, A filename, A filehandle, A scalar reference, An Array
Reference, An Output FileGlob

=item Notes

=item Optional Parameters

C<< AutoClose => 0|1 >>, C<< BinModeIn => 0|1 >>, C<< Append => 0|1 >>, A
Buffer, A Filename, A Filehandle

=item Examples

=back

=item OO Interface

=over 4

=item Constructor

A filename, A filehandle, A scalar reference

=item Constructor Options

C<< AutoClose => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A
Filehandle, C<< Merge => 0|1 >>, -Level, -Strategy, C<< Minimal => 0|1 >>,
C<< Comment => $comment >>, C<< Name => $string >>, C<< Time => $number >>,
C<< TextFlag => 0|1 >>, C<< HeaderCRC => 0|1 >>, C<< OS_Code => $value >>,
C<< ExtraField => $data >>, C<< ExtraFlags => $value >>, C<< Strict => 0|1
>>

=item Examples

=back

=item Methods 

=over 4

=item print

=item printf

=item syswrite

=item write

=item flush

=item tell

=item eof

=item seek

=item binmode

=item opened

=item autoflush

=item input_line_number

=item fileno

=item close

=item newStream([OPTS])

=item deflateParams

=back

=item Importing 

:all, :constants, :flush, :level, :strategy

=item EXAMPLES

=over 4

=item Apache::GZip Revisited

=item Working with Net::FTP

=back

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Compress::RawDeflate - Write RFC 1951 files/buffers

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Functional Interface

=over 4

=item rawdeflate $input_filename_or_reference =>
$output_filename_or_reference [, OPTS]

A filename, A filehandle, A scalar reference, An array reference, An Input
FileGlob string, A filename, A filehandle, A scalar reference, An Array
Reference, An Output FileGlob

=item Notes

=item Optional Parameters

C<< AutoClose => 0|1 >>, C<< BinModeIn => 0|1 >>, C<< Append => 0|1 >>, A
Buffer, A Filename, A Filehandle

=item Examples

=back

=item OO Interface

=over 4

=item Constructor

A filename, A filehandle, A scalar reference

=item Constructor Options

C<< AutoClose => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A
Filehandle, C<< Merge => 0|1 >>, -Level, -Strategy, C<< Strict => 0|1 >>

=item Examples

=back

=item Methods 

=over 4

=item print

=item printf

=item syswrite

=item write

=item flush

=item tell

=item eof

=item seek

=item binmode

=item opened

=item autoflush

=item input_line_number

=item fileno

=item close

=item newStream([OPTS])

=item deflateParams

=back

=item Importing 

:all, :constants, :flush, :level, :strategy

=item EXAMPLES

=over 4

=item Apache::GZip Revisited

=item Working with Net::FTP

=back

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Compress::Zip - Write zip files/buffers

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Functional Interface

=over 4

=item zip $input_filename_or_reference => $output_filename_or_reference [,
OPTS]

A filename, A filehandle, A scalar reference, An array reference, An Input
FileGlob string, A filename, A filehandle, A scalar reference, An Array
Reference, An Output FileGlob

=item Notes

=item Optional Parameters

C<< AutoClose => 0|1 >>, C<< BinModeIn => 0|1 >>, C<< Append => 0|1 >>, A
Buffer, A Filename, A Filehandle

=item Examples

=back

=item OO Interface

=over 4

=item Constructor

A filename, A filehandle, A scalar reference

=item Constructor Options

C<< AutoClose => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A
Filehandle, C<< Name => $string >>, C<< CanonicalName => 0|1 >>, C<<
FilterName => sub { ... }  >>, C<< Time => $number >>, C<< ExtAttr => $attr
>>, C<< exTime => [$atime, $mtime, $ctime] >>, C<< exUnix2 => [$uid, $gid]
>>, C<< exUnixN => [$uid, $gid] >>, C<< Comment => $comment >>, C<<
ZipComment => $comment >>, C<< Method => $method >>, C<< Stream => 0|1 >>,
C<< Zip64 => 0|1 >>, C<< TextFlag => 0|1 >>, C<< ExtraFieldLocal => $data
>>, C<< ExtraFieldCentral => $data >>, C<< Minimal => 1|0 >>, C<<
BlockSize100K => number >>, C<< WorkFactor => number >>, C<< Preset =>
number >>, C<< Extreme => 0|1 >>, -Level, -Strategy, C<< Strict => 0|1 >>

=item Examples

=back

=item Methods 

=over 4

=item print

=item printf

=item syswrite

=item write

=item flush

=item tell

=item eof

=item seek

=item binmode

=item opened

=item autoflush

=item input_line_number

=item fileno

=item close

=item newStream([OPTS])

=item deflateParams

=back

=item Importing 

:all, :constants, :flush, :level, :strategy, :zip_method

=item EXAMPLES

=over 4

=item Apache::GZip Revisited

=item Working with Net::FTP

=back

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Dir - supply object methods for directory handles

=over 4

=item SYNOPSIS

=item DESCRIPTION

new ( [ DIRNAME ] ), open ( DIRNAME ), read (), seek ( POS ), tell (),
rewind (), close (), tie %hash, 'IO::Dir', DIRNAME [, OPTIONS ]

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT

=back

=head2 IO::File - supply object methods for filehandles

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CONSTRUCTOR

new ( FILENAME [,MODE [,PERMS]] ), new_tmpfile

=item METHODS

open( FILENAME [,MODE [,PERMS]] ), open( FILENAME, IOLAYERS ), binmode(
[LAYER] )

=item NOTE

=item SEE ALSO

=item HISTORY

=back

=head2 IO::Handle - supply object methods for I/O handles

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CONSTRUCTOR

new (), new_from_fd ( FD, MODE )

=item METHODS

$io->fdopen ( FD, MODE ), $io->opened, $io->getline, $io->getlines,
$io->ungetc ( ORD ), $io->write ( BUF, LEN [, OFFSET ] ), $io->error,
$io->clearerr, $io->sync, $io->flush, $io->printflush ( ARGS ),
$io->blocking ( [ BOOL ] ), $io->untaint

=item NOTE

=item SEE ALSO

=item BUGS

=item HISTORY

=back

=head2 IO::Pipe - supply object methods for pipes

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CONSTRUCTOR

new ( [READER, WRITER] )

=item METHODS

reader ([ARGS]), writer ([ARGS]), handles ()

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT

=back

=head2 IO::Poll - Object interface to system poll call

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

mask ( IO [, EVENT_MASK ] ), poll ( [ TIMEOUT ] ), events ( IO ), remove (
IO ), handles( [ EVENT_MASK ] )

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT

=back

=head2 IO::Seekable - supply seek based methods for I/O objects

=over 4

=item SYNOPSIS

=item DESCRIPTION

$io->getpos, $io->setpos, $io->seek ( POS, WHENCE ), WHENCE=0 (SEEK_SET),
WHENCE=1 (SEEK_CUR), WHENCE=2 (SEEK_END), $io->sysseek( POS, WHENCE ),
$io->tell

=item SEE ALSO

=item HISTORY

=back

=head2 IO::Select - OO interface to the select system call

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CONSTRUCTOR

new ( [ HANDLES ] )

=item METHODS

add ( HANDLES ), remove ( HANDLES ), exists ( HANDLE ), handles, can_read (
[ TIMEOUT ] ), can_write ( [ TIMEOUT ] ), has_exception ( [ TIMEOUT ] ),
count (), bits(), select ( READ, WRITE, EXCEPTION [, TIMEOUT ] )

=item EXAMPLE

=item AUTHOR

=item COPYRIGHT

=back

=head2 IO::Socket - Object interface to socket communications

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CONSTRUCTOR

new ( [ARGS] )

=item METHODS

accept([PKG]), socketpair(DOMAIN, TYPE, PROTOCOL), atmark, connected,
protocol, sockdomain, sockopt(OPT [, VAL]), getsockopt(LEVEL, OPT),
setsockopt(LEVEL, OPT, VAL), socktype, timeout([VAL])

=item LIMITATIONS

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT

=back

=head2 IO::Socket::INET - Object interface for AF_INET domain sockets

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CONSTRUCTOR

new ( [ARGS] )

=over 4

=item METHODS

sockaddr (), sockport (), sockhost (), peeraddr (), peerport (), peerhost
()

=back

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT

=back

=head2 IO::Socket::IP - Family-neutral IP socket supporting both IPv4 and
IPv6

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item REPLACING C<IO::Socket> DEFAULT BEHAVIOUR

=back

=over 4

=item CONSTRUCTORS

=back

=over 4

=item $sock = IO::Socket::IP->new( %args )

PeerHost => STRING, PeerService => STRING, PeerAddr => STRING, PeerPort =>
STRING, PeerAddrInfo => ARRAY, LocalHost => STRING, LocalService => STRING,
LocalAddr => STRING, LocalPort => STRING, LocalAddrInfo => ARRAY, Family =>
INT, Type => INT, Proto => STRING or INT, GetAddrInfoFlags => INT, Listen
=> INT, ReuseAddr => BOOL, ReusePort => BOOL, Broadcast => BOOL, Sockopts
=> ARRAY, V6Only => BOOL, MultiHomed, Blocking => BOOL, Timeout => NUM

=item $sock = IO::Socket::IP->new( $peeraddr )

=back

=over 4

=item METHODS

=back

=over 4

=item ( $host, $service ) = $sock->sockhost_service( $numeric )

=back

=over 4

=item $addr = $sock->sockhost

=item $port = $sock->sockport

=item $host = $sock->sockhostname

=item $service = $sock->sockservice

=back

=over 4

=item $addr = $sock->sockaddr

=back

=over 4

=item ( $host, $service ) = $sock->peerhost_service( $numeric )

=back

=over 4

=item $addr = $sock->peerhost

=item $port = $sock->peerport

=item $host = $sock->peerhostname

=item $service = $sock->peerservice

=back

=over 4

=item $addr = $sock->peeraddr

=back

=over 4

=item $inet = $sock->as_inet

=back

=over 4

=item NON-BLOCKING

=back

=over 4

=item C<PeerHost> AND C<LocalHost> PARSING

=over 4

=item ( $host, $port ) = IO::Socket::IP->split_addr( $addr )

=back

=back

=over 4

=item $addr = IO::Socket::IP->join_addr( $host, $port )

=back

=over 4

=item C<IO::Socket::INET> INCOMPATIBILITIES

=back

=over 4

=item TODO

=item AUTHOR

=back

=head2 IO::Socket::UNIX - Object interface for AF_UNIX domain sockets

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CONSTRUCTOR

new ( [ARGS] )

=item METHODS

hostpath(), peerpath()

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT

=back

=head2 IO::Uncompress::AnyInflate - Uncompress zlib-based (zip, gzip)
file/buffer

=over 4

=item SYNOPSIS

=item DESCRIPTION

RFC 1950, RFC 1951 (optionally), gzip (RFC 1952), zip

=item Functional Interface

=over 4

=item anyinflate $input_filename_or_reference =>
$output_filename_or_reference [, OPTS]

A filename, A filehandle, A scalar reference, An array reference, An Input
FileGlob string, A filename, A filehandle, A scalar reference, An Array
Reference, An Output FileGlob

=item Notes

=item Optional Parameters

C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A
Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<<
TrailingData => $scalar >>

=item Examples

=back

=item OO Interface

=over 4

=item Constructor

A filename, A filehandle, A scalar reference

=item Constructor Options

C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string
>>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength =>
$size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >>, C<< RawInflate => 0|1
>>, C<< ParseExtra => 0|1 >> If the gzip FEXTRA header field is present and
this option is set, it will force the module to check that it conforms to
the sub-field structure as defined in RFC 1952

=item Examples

=back

=item Methods 

=over 4

=item read

=item getline

=item getc

=item ungetc

=item inflateSync

=item getHeaderInfo

=item tell

=item eof

=item seek

=item binmode

=item opened

=item autoflush

=item input_line_number

=item fileno

=item close

=item nextStream

=item trailingData

=back

=item Importing 

:all

=item EXAMPLES

=over 4

=item Working with Net::FTP

=back

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Uncompress::AnyUncompress - Uncompress gzip, zip, bzip2 or lzop
file/buffer

=over 4

=item SYNOPSIS

=item DESCRIPTION

RFC 1950, RFC 1951 (optionally), gzip (RFC 1952), zip, bzip2, lzop, lzf,
lzma, xz

=item Functional Interface

=over 4

=item anyuncompress $input_filename_or_reference =>
$output_filename_or_reference [, OPTS]

A filename, A filehandle, A scalar reference, An array reference, An Input
FileGlob string, A filename, A filehandle, A scalar reference, An Array
Reference, An Output FileGlob

=item Notes

=item Optional Parameters

C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A
Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<<
TrailingData => $scalar >>

=item Examples

=back

=item OO Interface

=over 4

=item Constructor

A filename, A filehandle, A scalar reference

=item Constructor Options

C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string
>>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength =>
$size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >>, C<< RawInflate => 0|1
>>, C<< UnLzma => 0|1 >>

=item Examples

=back

=item Methods 

=over 4

=item read

=item getline

=item getc

=item ungetc

=item getHeaderInfo

=item tell

=item eof

=item seek

=item binmode

=item opened

=item autoflush

=item input_line_number

=item fileno

=item close

=item nextStream

=item trailingData

=back

=item Importing 

:all

=item EXAMPLES

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Uncompress::Base - Base Class for IO::Uncompress modules 

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Uncompress::Bunzip2 - Read bzip2 files/buffers

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Functional Interface

=over 4

=item bunzip2 $input_filename_or_reference => $output_filename_or_reference
[, OPTS]

A filename, A filehandle, A scalar reference, An array reference, An Input
FileGlob string, A filename, A filehandle, A scalar reference, An Array
Reference, An Output FileGlob

=item Notes

=item Optional Parameters

C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A
Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<<
TrailingData => $scalar >>

=item Examples

=back

=item OO Interface

=over 4

=item Constructor

A filename, A filehandle, A scalar reference

=item Constructor Options

C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string
>>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength =>
$size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >>, C<< Small => 0|1 >>

=item Examples

=back

=item Methods 

=over 4

=item read

=item getline

=item getc

=item ungetc

=item getHeaderInfo

=item tell

=item eof

=item seek

=item binmode

=item opened

=item autoflush

=item input_line_number

=item fileno

=item close

=item nextStream

=item trailingData

=back

=item Importing 

:all

=item EXAMPLES

=over 4

=item Working with Net::FTP

=back

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Uncompress::Gunzip - Read RFC 1952 files/buffers

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Functional Interface

=over 4

=item gunzip $input_filename_or_reference => $output_filename_or_reference
[, OPTS]

A filename, A filehandle, A scalar reference, An array reference, An Input
FileGlob string, A filename, A filehandle, A scalar reference, An Array
Reference, An Output FileGlob

=item Notes

=item Optional Parameters

C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A
Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<<
TrailingData => $scalar >>

=item Examples

=back

=item OO Interface

=over 4

=item Constructor

A filename, A filehandle, A scalar reference

=item Constructor Options

C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string
>>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength =>
$size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >>, C<< ParseExtra => 0|1
>> If the gzip FEXTRA header field is present and this option is set, it
will force the module to check that it conforms to the sub-field structure
as defined in RFC 1952

=item Examples

=back

=item Methods 

=over 4

=item read

=item getline

=item getc

=item ungetc

=item inflateSync

=item getHeaderInfo

Name, Comment

=item tell

=item eof

=item seek

=item binmode

=item opened

=item autoflush

=item input_line_number

=item fileno

=item close

=item nextStream

=item trailingData

=back
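
The functional and OO interfaces listed above can be sketched as a small in-memory round trip. This is only an illustration: it assumes the companion IO::Compress::Gzip module is installed to produce the compressed input, and the variable names are made up for the example.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Illustrative payload: three identical lines.
my $plain = "hello, world\n" x 3;

# Compress scalar to scalar so the example needs no files on disk.
my $gz;
gzip \$plain => \$gz
    or die "gzip failed: $GzipError";

# Functional interface: input scalar reference => output scalar reference.
my $out;
gunzip \$gz => \$out
    or die "gunzip failed: $GunzipError";

# OO interface: construct from a scalar reference and read line by line.
my $z = IO::Uncompress::Gunzip->new(\$gz)
    or die "cannot open buffer: $GunzipError";
my @lines;
while (defined(my $line = $z->getline)) {
    push @lines, $line;
}
$z->close;

print "round trip ok\n" if $out eq $plain;
print scalar(@lines), " lines read\n";
```

The same calls accept filenames or filehandles in place of the scalar references.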

=item Importing 

:all

=item EXAMPLES

=over 4

=item Working with Net::FTP

=back

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Uncompress::Inflate - Read RFC 1950 files/buffers

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Functional Interface

=over 4

=item inflate $input_filename_or_reference => $output_filename_or_reference
[, OPTS]

Input: A filename, A filehandle, A scalar reference, An array reference, An
Input FileGlob string. Output: A filename, A filehandle, A scalar reference,
An Array Reference, An Output FileGlob

=item Notes

=item Optional Parameters

C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A
Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<<
TrailingData => $scalar >>

=item Examples

=back

=item OO Interface

=over 4

=item Constructor

A filename, A filehandle, A scalar reference

=item Constructor Options

C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string
>>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength =>
$size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >>

=item Examples

=back

=item Methods 

=over 4

=item read

=item getline

=item getc

=item ungetc

=item inflateSync

=item getHeaderInfo

=item tell

=item eof

=item seek

=item binmode

=item opened

=item autoflush

=item input_line_number

=item fileno

=item close

=item nextStream

=item trailingData

=back

=item Importing 

:all

=item EXAMPLES

=over 4

=item Working with Net::FTP

=back

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Uncompress::RawInflate - Read RFC 1951 files/buffers

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Functional Interface

=over 4

=item rawinflate $input_filename_or_reference =>
$output_filename_or_reference [, OPTS]

Input: A filename, A filehandle, A scalar reference, An array reference, An
Input FileGlob string. Output: A filename, A filehandle, A scalar reference,
An Array Reference, An Output FileGlob

=item Notes

=item Optional Parameters

C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A
Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<<
TrailingData => $scalar >>

=item Examples

=back

=item OO Interface

=over 4

=item Constructor

A filename, A filehandle, A scalar reference

=item Constructor Options

C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string
>>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength =>
$size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >>

=item Examples

=back

=item Methods 

=over 4

=item read

=item getline

=item getc

=item ungetc

=item inflateSync

=item getHeaderInfo

=item tell

=item eof

=item seek

=item binmode

=item opened

=item autoflush

=item input_line_number

=item fileno

=item close

=item nextStream

=item trailingData

=back

=item Importing 

:all

=item EXAMPLES

=over 4

=item Working with Net::FTP

=back

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Uncompress::Unzip - Read zip files/buffers

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Functional Interface

=over 4

=item unzip $input_filename_or_reference => $output_filename_or_reference
[, OPTS]

Input: A filename, A filehandle, A scalar reference, An array reference, An
Input FileGlob string. Output: A filename, A filehandle, A scalar reference,
An Array Reference, An Output FileGlob

=item Notes

=item Optional Parameters

C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A
Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<<
TrailingData => $scalar >>

=item Examples

=back

=item OO Interface

=over 4

=item Constructor

A filename, A filehandle, A scalar reference

=item Constructor Options

C<< Name => "membername" >>, C<< AutoClose => 0|1 >>, C<< MultiStream =>
0|1 >>, C<< Prime => $string >>, C<< Transparent => 0|1 >>, C<< BlockSize
=> $num >>, C<< InputLength => $size >>, C<< Append => 0|1 >>, C<< Strict
=> 0|1 >>

=item Examples

=back

=item Methods 

=over 4

=item read

=item getline

=item getc

=item ungetc

=item inflateSync

=item getHeaderInfo

=item tell

=item eof

=item seek

=item binmode

=item opened

=item autoflush

=item input_line_number

=item fileno

=item close

=item nextStream

=item trailingData

=back

=item Importing 

:all

=item EXAMPLES

=over 4

=item Working with Net::FTP

=item Walking through a zip file

=item Unzipping a complete zip file to disk

=back

=item SEE ALSO

=item AUTHOR

=item MODIFICATION HISTORY

=item COPYRIGHT AND LICENSE

=back

=head2 IO::Zlib - IO:: style interface to L<Compress::Zlib>

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CONSTRUCTOR

new ( [ARGS] )

=item OBJECT METHODS

open ( FILENAME, MODE ), opened, close, getc, getline, getlines, print (
ARGS... ), read ( BUF, NBYTES, [OFFSET] ), eof, seek ( OFFSET, WHENCE ),
tell, setpos ( POS ), getpos ( POS )

=item USING THE EXTERNAL GZIP

=item CLASS METHODS

has_Compress_Zlib, gzip_external, gzip_used, gzip_read_open,
gzip_write_open

=item DIAGNOSTICS

IO::Zlib::getlines: must be called in list context,
IO::Zlib::gzopen_external: mode '...' is illegal, IO::Zlib::import: '...'
is illegal, IO::Zlib::import: ':gzip_external' requires an argument,
IO::Zlib::import: 'gzip_read_open' requires an argument, IO::Zlib::import:
'gzip_read_open' '...' is illegal, IO::Zlib::import: 'gzip_write_open' requires
an argument, IO::Zlib::import: 'gzip_write_open' '...' is illegal,
IO::Zlib::import: no Compress::Zlib and no external gzip, IO::Zlib::open:
needs a filename, IO::Zlib::READ: NBYTES must be specified,
IO::Zlib::WRITE: too long LENGTH

=item SEE ALSO

=item HISTORY

=item COPYRIGHT

=back

=head2 IPC::Cmd - finding and running system commands made easy

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CLASS METHODS

=over 4

=item $ipc_run_version = IPC::Cmd->can_use_ipc_run( [VERBOSE] )

=back

=back

=over 4

=item $ipc_open3_version = IPC::Cmd->can_use_ipc_open3( [VERBOSE] )

=back

=over 4

=item $bool = IPC::Cmd->can_capture_buffer

=back

=over 4

=item $bool = IPC::Cmd->can_use_run_forked

=back

=over 4

=item FUNCTIONS

=over 4

=item $path = can_run( PROGRAM );

=back

=back

=over 4

=item $ok | ($ok, $err, $full_buf, $stdout_buff, $stderr_buff) = run(
command => COMMAND, [verbose => BOOL, buffer => \$SCALAR, timeout => DIGIT]
);

Arguments: command, verbose, buffer, timeout. Return values: success, error
message, full_buffer, out_buffer, error_buffer

=back

=over 4

=item $hashref = run_forked( COMMAND, { child_stdin => SCALAR, timeout =>
DIGIT, stdout_handler => CODEREF, stderr_handler => CODEREF} );

Options: C<timeout>, C<child_stdin>, C<stdout_handler>, C<stderr_handler>,
C<discard_output>, C<terminate_on_parent_sudden_death>. Result keys:
C<exit_code>, C<timeout>, C<stdout>, C<stderr>, C<merged>, C<err_msg>

=back

=over 4

=item $q = QUOTE

=back

=over 4

=item HOW IT WORKS

=item Global Variables

=over 4

=item $IPC::Cmd::VERBOSE

=item $IPC::Cmd::USE_IPC_RUN

=item $IPC::Cmd::USE_IPC_OPEN3

=item $IPC::Cmd::WARN

=item $IPC::Cmd::INSTANCES

=item $IPC::Cmd::ALLOW_NULL_ARGS

=back

=item Caveats

Whitespace and IPC::Open3 / system(), Whitespace and IPC::Run, IO Redirect,
Interleaving STDOUT/STDERR

=item See Also

=item ACKNOWLEDGEMENTS

=item BUG REPORTS

=item AUTHOR

=item COPYRIGHT

=back

=head2 IPC::Msg - SysV Msg IPC object class

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

new ( KEY , FLAGS ), id, rcv ( BUF, LEN [, TYPE [, FLAGS ]] ), remove, set
( STAT ), set ( NAME => VALUE [, NAME => VALUE ...] ), snd ( TYPE, MSG [,
FLAGS ] ), stat

=item SEE ALSO

=item AUTHORS

=item COPYRIGHT

=back

=head2 IPC::Open2 - open a process for both reading and writing using
open2()

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item WARNING 

=item SEE ALSO

=back

=head2 IPC::Open3 - open a process for reading, writing, and error handling
using open3()

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item See Also

L<IPC::Open2>, L<IPC::Run>

=item WARNING

=back

=head2 IPC::Semaphore - SysV Semaphore IPC object class

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

new ( KEY , NSEMS , FLAGS ), getall, getncnt ( SEM ), getpid ( SEM ),
getval ( SEM ), getzcnt ( SEM ), id, op ( OPLIST ), remove, set ( STAT ),
set ( NAME => VALUE [, NAME => VALUE ...] ), setall ( VALUES ), setval ( N
, VALUE ), stat

=item SEE ALSO

=item AUTHORS

=item COPYRIGHT

=back

=head2 IPC::SharedMem - SysV Shared Memory IPC object class

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

new ( KEY , SIZE , FLAGS ), id, read ( POS, SIZE ), write ( STRING, POS,
SIZE ), remove, is_removed, stat, attach ( [FLAG] ), detach, addr

=item SEE ALSO

=item AUTHORS

=item COPYRIGHT

=back

=head2 IPC::SysV - System V IPC constants and system calls

=over 4

=item SYNOPSIS

=item DESCRIPTION

ftok( PATH ), ftok( PATH, ID ), shmat( ID, ADDR, FLAG ), shmdt( ADDR ),
memread( ADDR, VAR, POS, SIZE ), memwrite( ADDR, STRING, POS, SIZE )

=item SEE ALSO

=item AUTHORS

=item COPYRIGHT

=back

=head2 Internals - Reserved special namespace for internals related
functions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item FUNCTIONS

SvREFCNT(THING [, $value]), SvREADONLY(THING [, $value]),
hv_clear_placeholders(%hash)

=back

=item AUTHOR

=item SEE ALSO

=back

=head2 JSON::PP - JSON::XS compatible pure-Perl module.

=over 4

=item SYNOPSIS

=item VERSION

=item NOTE

=item DESCRIPTION

=over 4

=item FEATURES

correct unicode handling, round-trip integrity, strict checking of JSON
correctness

=back

=item FUNCTIONAL INTERFACE

=over 4

=item encode_json

=item decode_json

=item JSON::PP::is_bool

=item JSON::PP::true

=item JSON::PP::false

=item JSON::PP::null

=back

=item HOW DO I DECODE A DATA FROM OUTER AND ENCODE TO OUTER

=item METHODS

=over 4

=item new

=item ascii

=item latin1

=item utf8

=item pretty

=item indent

=item space_before

=item space_after

=item relaxed

list items can have an end-comma, shell-style '#'-comments

=item canonical

=item allow_nonref

=item allow_unknown

=item allow_blessed

=item convert_blessed

=item filter_json_object

=item filter_json_single_key_object

=item shrink

=item max_depth

=item max_size

=item encode

=item decode

=item decode_prefix

=back
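
A short sketch of how the functional interface and a few of the methods above fit together. The sample data and variable names are invented for illustration, and only a handful of the listed options are shown.

```perl
use strict;
use warnings;
use JSON::PP;

# canonical() sorts hash keys, so the encoded text is reproducible.
my $compact = JSON::PP->new->canonical->encode({ b => 1, a => [1, 2] });

# decode_json() expects UTF-8 encoded JSON text and returns Perl data.
my $data    = decode_json('{"ok":true,"list":[1,2,3]}');
my $is_bool = JSON::PP::is_bool($data->{ok}) ? 1 : 0;
my $count   = scalar @{ $data->{list} };

# relaxed() tolerates a trailing comma at the end of a list.
my $relaxed = JSON::PP->new->relaxed->decode('[1, 2, ]');

# pretty() enables indent, space_before and space_after in one call.
my $pretty = JSON::PP->new->canonical->pretty->encode($data);

print "$compact\n";
print $pretty;
```

Chaining works because each option method returns the JSON::PP object itself.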

=item INCREMENTAL PARSING

=over 4

=item incr_parse

=item incr_text

=item incr_skip

=item incr_reset

=back

=item JSON::PP OWN METHODS

=over 4

=item allow_singlequote

=item allow_barekey

=item allow_bignum

=item loose

=item escape_slash

=item indent_length

=item sort_by

=back

=item INTERNAL

PP_encode_box, PP_decode_box

=item MAPPING

=over 4

=item JSON -> PERL

object, array, string, number, true, false, null

=item PERL -> JSON

hash references, array references, other references, JSON::PP::true,
JSON::PP::false, JSON::PP::null, blessed objects, simple scalars, Big
Number

=back

=item UNICODE HANDLING ON PERLS

=over 4

=item Perl 5.8 and later

=item Perl 5.6

=item Perl 5.005

=back

=item TODO

speed, memory saving

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 JSON::PP::Boolean - dummy module providing JSON::PP::Boolean

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=over 4

=item AUTHOR

=back

=head2 List::Util - A selection of general-utility list subroutines

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=over 4

=item LIST-REDUCTION FUNCTIONS

=back

=over 4

=item reduce

=item any

=item all

=item none

=item notall

=item first

=item max

=item maxstr

=item min

=item minstr

=item product

=item sum

=item sum0

=back
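
As an illustration of the list-reduction functions named above, the following sketch applies several of them to one small list. The input values are arbitrary, and C<any> and C<all> assume a List::Util recent enough to provide them.

```perl
use strict;
use warnings;
use List::Util qw(reduce first any all sum sum0 max min);

my @n = (3, 1, 4, 1, 5);

my $total     = sum @n;                    # arithmetic sum
my $largest   = max @n;
my $smallest  = min @n;
my $product   = reduce { $a * $b } @n;     # fold the list with a block
my $first_big = first { $_ > 3 } @n;       # first element matching the test
my $has_even  = any { $_ % 2 == 0 } @n;    # true if any element matches
my $all_pos   = all { $_ > 0 } @n;         # true if every element matches
my $empty_sum = sum0();                    # 0 for an empty list; sum() gives undef

print "$total $largest $smallest $product $first_big $empty_sum\n";
```

C<reduce> is the general form: C<sum>, C<max> and the others could each be written as a C<reduce> block, but the dedicated functions are clearer.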

=over 4

=item KEY/VALUE PAIR LIST FUNCTIONS

=back

=over 4

=item pairs

=item unpairs

=item pairkeys

=item pairvalues

=item pairgrep

=item pairfirst

=item pairmap

=back
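
The key/value pair functions treat an even-sized list as a sequence of pairs. A minimal sketch, using an invented list and assuming a List::Util version that ships these functions:

```perl
use strict;
use warnings;
use List::Util qw(pairs pairkeys pairvalues pairgrep pairmap);

my @kv = (one => 1, two => 2, three => 3);

my @keys   = pairkeys @kv;                 # every first element of a pair
my @values = pairvalues @kv;               # every second element of a pair
my @big    = pairgrep { $b > 1 } @kv;      # keep pairs whose value exceeds 1
my @labels = pairmap { "$a=$b" } @kv;      # one string per pair

# pairs() returns array references, which keeps the original order
# (a hash built from the same list would not).
my @seen;
for my $pair (pairs @kv) {
    my ($key, $value) = @$pair;
    push @seen, "$key:$value";
}

print "@keys | @values | @big | @labels | @seen\n";
```

Inside the C<pairgrep> and C<pairmap> blocks, C<$a> holds the key and C<$b> the value.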

=over 4

=item OTHER FUNCTIONS

=back

=over 4

=item shuffle

=item uniq

=item uniqnum

=item uniqstr

=back

=over 4

=item KNOWN BUGS

=over 4

=item RT #95409

=item uniqnum() on oversized bignums

=back

=item SUGGESTED ADDITIONS

=item SEE ALSO

=item COPYRIGHT

=back

=head2 List::Util::XS - Indicate if List::Util was compiled with a C
compiler

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item COPYRIGHT

=back

=head2 Locale::Codes - a distribution of modules to handle locale codes

=over 4

=item DESCRIPTION

L<Locale::Codes::Country>, L<Locale::Country>, L<Locale::Codes::Language>,
L<Locale::Language>, L<Locale::Codes::Currency>, L<Locale::Currency>,
L<Locale::Codes::Script>, L<Locale::Script>, L<Locale::Codes::LangExt>,
L<Locale::Codes::LangVar>, L<Locale::Codes::LangFam>, B<Locale::Codes>,
B<Locale::Codes::Constants>, B<Locale::Codes::Country_codes>,
B<Locale::Codes::Language_codes>, B<Locale::Codes::Currency_codes>,
B<Locale::Codes::Script_codes>, B<Locale::Codes::LangExt_codes>,
B<Locale::Codes::LangVar_codes>, B<Locale::Codes::LangFam_codes>

=item NEW CODE SETS

B<General-use code set>, B<An official source of data>, B<A free source of
the data>, B<A reliable source of data>

=item COMMON ALIASES

=item DEPRECATED CODES

=item SEE ALSO

L<Locale::Codes::API>, L<Locale::Codes::Country>,
L<Locale::Codes::Language>, L<Locale::Codes::Script>,
L<Locale::Codes::Currency>, L<Locale::Codes::LangExt>,
L<Locale::Codes::LangVar>, L<Locale::Codes::LangFam>,
L<Locale::Codes::Changes>

=item BUGS AND QUESTIONS

Direct email, CPAN Bug Tracking, GitHub, B<Locale::Codes version>

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Codes::API - a description of the callable function in each
module

=over 4

=item DESCRIPTION

=item ROUTINES

B<code2XXX ( CODE [,CODESET] [,'retired'] )>, B<XXX2code ( NAME [,CODESET]
[,'retired'] )>, B<XXX_code2code ( CODE ,CODESET ,CODESET2 )>,
B<all_XXX_codes ( [CODESET] [,'retired'] )>, B<all_XXX_names ( [CODESET]
[,'retired'] )>

=item SEMI-PRIVATE ROUTINES

B<MODULE::rename_XXX  ( CODE ,NEW_NAME [,CODESET] )>, B<MODULE::add_XXX  (
CODE ,NAME [,CODESET] )>, B<MODULE::delete_XXX  ( CODE [,CODESET] )>,
B<MODULE::add_XXX_alias  ( NAME ,NEW_NAME )>, B<MODULE::delete_XXX_alias  (
NAME )>, B<MODULE::rename_XXX_code  ( CODE ,NEW_CODE [,CODESET] )>,
B<MODULE::add_XXX_code_alias  ( CODE ,NEW_CODE [,CODESET] )>,
B<MODULE::delete_XXX_code_alias  ( CODE [,CODESET] )>

=item KNOWN BUGS AND LIMITATIONS

B<Relationship between code sets>, B<Non-ASCII characters not supported>

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Codes::Changes - details changes to Locale::Codes

=over 4

=item SYNOPSIS

=item VERSION 3.46  (planned 2017-12-01; sbeck)

=item VERSION 3.45  (planned 2017-09-01; sbeck)

=item VERSION 3.44  (planned 2017-06-01; sbeck)

=item VERSION 3.43  (planned 2017-03-01; sbeck)

=item VERSION 3.42  (2016-11-30; sbeck)

B<Added Czech republic aliases back in>

=item VERSION 3.41  (2016-11-18; sbeck)

=item VERSION 3.40  (2016-09-01; sbeck)

=item VERSION 3.39  (2016-05-31; sbeck)

B<Added UN codes back in>, B<Added GENC codes>

=item VERSION 3.38  (2016-03-02; sbeck)

B<Tests reworked>

=item VERSION 3.37  (2015-12-01; sbeck)

=item VERSION 3.36  (2015-09-01; sbeck)

B<(!) Removed alias_code function>

=item VERSION 3.35  (2015-06-01; sbeck)

B<Documentation improvements>

=item VERSION 3.34  (2015-03-01; sbeck)

=item VERSION 3.33  (2014-12-01; sbeck)

B<Filled out LOCALE_LANG_TERM codeset>, B<Moved repository to GitHub>

=item VERSION 3.32  (2014-09-01; sbeck)

=item VERSION 3.31  (2014-06-01; sbeck)

B<Bug fixes>

=item VERSION 3.30  (2014-03-04; sbeck)

B<alias_code remove date set>, B<Bug fixes>

=item VERSION 3.29  (2014-01-27; sbeck)

B<ISO 3166 country codes improved>, B<Bug fixes>

=item VERSION 3.28  (2013-12-02; sbeck)

=item VERSION 3.27  (2013-09-03; sbeck)

B<* FIPS-10 country codes removed>

=item VERSION 3.26  (2013-06-03; sbeck)

B<Documentation fixes>

=item VERSION 3.25  (2013-03-01; sbeck)

=item VERSION 3.24  (2012-12-03; sbeck)

B<Syria alias>, B<FIPS-10 country codes deprecated>, B<Domain country codes
now come from ISO 3166>

=item VERSION 3.23  (2012-09-01; sbeck)

=item VERSION 3.22  (2012-06-01; sbeck)

B<Updated perl version required>, B<Sorted deprecated codes>

=item VERSION 3.21  (2012-03-01; sbeck)

=item VERSION 3.20  (2011-12-01; sbeck)

B<Added limited support for deprecated codes>, B<Fixed capitalization>,
B<Pod tests off by default>, B<Codesets may be specified by name>,
B<alias_code deprecated>, B<Code cleanup>, B<Added LangFam module>

=item VERSION 3.18  (2011-08-31; sbeck)

B<No longer use CIA data>

=item VERSION 3.17  (2011-06-28; sbeck)

B<Added new types of codes>, B<Added new codeset(s)>, B<Bug fixes>,
B<Reorganized code>

=item VERSION 3.16  (2011-03-01; sbeck)

=item VERSION 3.15  (2010-12-02; sbeck)

B<Minor fixes>

=item VERSION 3.14  (2010-09-28; sbeck)

B<Bug fixes>

=item VERSION 3.13  (2010-06-04; sbeck)

=item VERSION 3.12  (2010-04-06; sbeck)

B<Reorganized code>

=item VERSION 3.11  (2010-03-01; sbeck)

B<Added new codeset(s)>, B<Bug fixes>

=item VERSION 3.10  (2010-02-18; sbeck)

B<Reorganized code>, B<(!) Changed XXX_code2code behavior slightly>,
B<Added many semi-private routines>, B<New aliases>

=item VERSION 3.01  (2010-02-15; sbeck)

B<Fixed Makefile.PL and Build.PL>

=item VERSION 3.00  (2010-02-10; sbeck)

B<(*) New maintainer>, B<(*) (!) All codes are generated from standards>,
B<Added new codeset(s)>, B<(*) (!) Locale::Script changed>, B<Added missing
functions>, B<(!) Dropped support for _alias_code>, B<(!) All functions
return the standard value>, B<(!) rename_country function altered>

=item VERSION 2.07  (2004-06-10; neilb)

=item VERSION 2.06  (2002-07-15; neilb)

=item VERSION 2.05  (2002-07-08; neilb)

=item VERSION 2.04  (2002-05-23; neilb)

=item VERSION 2.03  (2002-03-24; neilb)

=item VERSION 2.02  (2002-03-09; neilb)

=item VERSION 2.01  (2002-02-18; neilb)

=item VERSION 2.00  (2002-02-17; neilb)

=item VERSION 1.06  (2001-03-04; neilb)

=item VERSION 1.05  (2001-02-13; neilb)

=item VERSION 1.04  (2000-12-21; neilb)

=item VERSION 1.03  (2000-12-??; neilb)

=item VERSION 1.02  (2000-05-04; neilb)

=item VERSION 1.00  (1998-03-09; neilb)

=item VERSION 0.003  (1997-05-09; neilb)

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Codes::Country - standard codes for country identification

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SUPPORTED CODE SETS

B<alpha-2, LOCALE_CODE_ALPHA_2>, B<alpha-3, LOCALE_CODE_ALPHA_3>,
B<numeric, LOCALE_CODE_NUMERIC>, B<dom, LOCALE_CODE_DOM>, B<un-alpha-3,
LOCALE_CODE_UN_ALPHA_3>, B<un-numeric, LOCALE_CODE_UN_NUMERIC>,
B<genc-alpha-2, LOCALE_CODE_GENC_ALPHA_2>, B<genc-alpha-3,
LOCALE_CODE_GENC_ALPHA_3>, B<genc-numeric, LOCALE_CODE_GENC_NUMERIC>

=item ROUTINES

B<code2country(CODE [,CODESET] [,'retired'])>, B<country2code(NAME
[,CODESET] [,'retired'])>, B<country_code2code(CODE ,CODESET ,CODESET2)>,
B<all_country_codes([CODESET] [,'retired'])>, B<all_country_names([CODESET]
[,'retired'])>, B<Locale::Codes::Country::rename_country(CODE ,NEW_NAME
[,CODESET])>, B<Locale::Codes::Country::add_country(CODE ,NAME
[,CODESET])>, B<Locale::Codes::Country::delete_country(CODE [,CODESET])>,
B<Locale::Codes::Country::add_country_alias(NAME ,NEW_NAME)>,
B<Locale::Codes::Country::delete_country_alias(NAME)>,
B<Locale::Codes::Country::rename_country_code(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Codes::Country::add_country_code_alias(CODE ,NEW_CODE
[,CODESET])>, B<Locale::Codes::Country::delete_country_code_alias(CODE
[,CODESET])>

=item SEE ALSO

L<Locale::Codes>, L<Locale::Codes::API>, L<Locale::SubCountry>,
L<http://www.iso.org/iso/home/standards/country_codes.htm>,
L<http://www.iso.org/iso/home/standards/country_codes/iso-3166-1_decoding_table.htm>,
L<http://www.iana.org/domains/root/db/>,
L<http://unstats.un.org/unsd/methods/m49/m49alpha.htm>,
L<https://nsgreg.nga.mil/genc/discovery>,
L<https://www.cia.gov/library/publications/the-world-factbook/appendix/print_appendix-d.html>,
L<http://www.statoids.com/wab.html>

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Codes::Currency - standard codes for currency identification

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SUPPORTED CODE SETS

B<alpha, LOCALE_CURR_ALPHA>, B<num, LOCALE_CURR_NUMERIC>

=item ROUTINES

B<code2currency(CODE [,CODESET] [,'retired'])>, B<currency2code(NAME
[,CODESET] [,'retired'])>, B<currency_code2code(CODE ,CODESET ,CODESET2)>,
B<all_currency_codes([CODESET] [,'retired'])>,
B<all_currency_names([CODESET] [,'retired'])>,
B<Locale::Codes::Currency::rename_currency(CODE ,NEW_NAME [,CODESET])>,
B<Locale::Codes::Currency::add_currency(CODE ,NAME [,CODESET])>,
B<Locale::Codes::Currency::delete_currency(CODE [,CODESET])>,
B<Locale::Codes::Currency::add_currency_alias(NAME ,NEW_NAME)>,
B<Locale::Codes::Currency::delete_currency_alias(NAME)>,
B<Locale::Codes::Currency::rename_currency_code(CODE ,NEW_CODE
[,CODESET])>, B<Locale::Codes::Currency::add_currency_code_alias(CODE
,NEW_CODE [,CODESET])>,
B<Locale::Codes::Currency::delete_currency_code_alias(CODE [,CODESET])>

=item SEE ALSO

L<Locale::Codes>, L<Locale::Codes::API>,
L<http://www.iso.org/iso/support/currency_codes_list-1.htm>

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Codes::LangExt - standard codes for language extension
identification

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SUPPORTED CODE SETS

B<alpha>

=item ROUTINES

B<code2langext(CODE [,CODESET] [,'retired'])>, B<langext2code(NAME
[,CODESET] [,'retired'])>, B<langext_code2code(CODE ,CODESET ,CODESET2)>,
B<all_langext_codes([CODESET] [,'retired'])>, B<all_langext_names([CODESET]
[,'retired'])>, B<Locale::Codes::LangExt::rename_langext(CODE ,NEW_NAME
[,CODESET])>, B<Locale::Codes::LangExt::add_langext(CODE ,NAME
[,CODESET])>, B<Locale::Codes::LangExt::delete_langext(CODE [,CODESET])>,
B<Locale::Codes::LangExt::add_langext_alias(NAME ,NEW_NAME)>,
B<Locale::Codes::LangExt::delete_langext_alias(NAME)>,
B<Locale::Codes::LangExt::rename_langext_code(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Codes::LangExt::add_langext_code_alias(CODE ,NEW_CODE
[,CODESET])>, B<Locale::Codes::LangExt::delete_langext_code_alias(CODE
[,CODESET])>

=item SEE ALSO

L<Locale::Codes>, L<Locale::Codes::API>,
L<http://www.iana.org/assignments/language-subtag-registry>

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Codes::LangFam - standard codes for language family
identification

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SUPPORTED CODE SETS

B<alpha>

=item ROUTINES

B<code2langfam(CODE [,CODESET] [,'retired'])>, B<langfam2code(NAME
[,CODESET] [,'retired'])>, B<langfam_code2code(CODE ,CODESET ,CODESET2)>,
B<all_langfam_codes([CODESET] [,'retired'])>, B<all_langfam_names([CODESET]
[,'retired'])>, B<Locale::Codes::LangFam::rename_langfam(CODE ,NEW_NAME
[,CODESET])>, B<Locale::Codes::LangFam::add_langfam(CODE ,NAME
[,CODESET])>, B<Locale::Codes::LangFam::delete_langfam(CODE [,CODESET])>,
B<Locale::Codes::LangFam::add_langfam_alias(NAME ,NEW_NAME)>,
B<Locale::Codes::LangFam::delete_langfam_alias(NAME)>,
B<Locale::Codes::LangFam::rename_langfam_code(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Codes::LangFam::add_langfam_code_alias(CODE ,NEW_CODE
[,CODESET])>, B<Locale::Codes::LangFam::delete_langfam_code_alias(CODE
[,CODESET])>

=item SEE ALSO

L<Locale::Codes>, L<Locale::Codes::API>,
L<http://www.loc.gov/standards/iso639-5/id.php>

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Codes::LangVar - standard codes for language variation
identification

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SUPPORTED CODE SETS

B<alpha>

=item ROUTINES

B<code2langvar(CODE [,CODESET] [,'retired'])>, B<langvar2code(NAME
[,CODESET] [,'retired'])>, B<langvar_code2code(CODE ,CODESET ,CODESET2)>,
B<all_langvar_codes([CODESET] [,'retired'])>, B<all_langvar_names([CODESET]
[,'retired'])>, B<Locale::Codes::LangVar::rename_langvar(CODE ,NEW_NAME
[,CODESET])>, B<Locale::Codes::LangVar::add_langvar(CODE ,NAME
[,CODESET])>, B<Locale::Codes::LangVar::delete_langvar(CODE [,CODESET])>,
B<Locale::Codes::LangVar::add_langvar_alias(NAME ,NEW_NAME)>,
B<Locale::Codes::LangVar::delete_langvar_alias(NAME)>,
B<Locale::Codes::LangVar::rename_langvar_code(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Codes::LangVar::add_langvar_code_alias(CODE ,NEW_CODE
[,CODESET])>, B<Locale::Codes::LangVar::delete_langvar_code_alias(CODE
[,CODESET])>

=item SEE ALSO

L<Locale::Codes>, L<Locale::Codes::API>,
L<http://www.iana.org/assignments/language-subtag-registry>

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Codes::Language - standard codes for language identification

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SUPPORTED CODE SETS

B<alpha-2, LOCALE_LANG_ALPHA_2>, B<alpha-3, LOCALE_LANG_ALPHA_3>, B<term,
LOCALE_LANG_TERM>

=item ROUTINES

B<code2language(CODE [,CODESET] [,'retired'])>, B<language2code(NAME
[,CODESET] [,'retired'])>, B<language_code2code(CODE ,CODESET ,CODESET2)>,
B<all_language_codes([CODESET] [,'retired'])>,
B<all_language_names([CODESET] [,'retired'])>,
B<Locale::Codes::Language::rename_language(CODE ,NEW_NAME [,CODESET])>,
B<Locale::Codes::Language::add_language(CODE ,NAME [,CODESET])>,
B<Locale::Codes::Language::delete_language(CODE [,CODESET])>,
B<Locale::Codes::Language::add_language_alias(NAME ,NEW_NAME)>,
B<Locale::Codes::Language::delete_language_alias(NAME)>,
B<Locale::Codes::Language::rename_language_code(CODE ,NEW_CODE
[,CODESET])>, B<Locale::Codes::Language::add_language_code_alias(CODE
,NEW_CODE [,CODESET])>,
B<Locale::Codes::Language::delete_language_code_alias(CODE [,CODESET])>

=item SEE ALSO

L<Locale::Codes>, L<Locale::Codes::API>,
L<http://www.loc.gov/standards/iso639-2/>,
L<http://www.loc.gov/standards/iso639-5/>,
L<http://www.iana.org/assignments/language-subtag-registry>

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Codes::Script - standard codes for script identification

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SUPPORTED CODE SETS

B<alpha, LOCALE_SCRIPT_ALPHA>, B<num, LOCALE_SCRIPT_NUMERIC>

=item ROUTINES

B<code2script(CODE [,CODESET] [,'retired'])>, B<script2code(NAME [,CODESET]
[,'retired'])>, B<script_code2code(CODE ,CODESET ,CODESET2)>,
B<all_script_codes([CODESET] [,'retired'])>, B<all_script_names([CODESET]
[,'retired'])>, B<Locale::Codes::Script::rename_script(CODE ,NEW_NAME
[,CODESET])>, B<Locale::Codes::Script::add_script(CODE ,NAME [,CODESET])>,
B<Locale::Codes::Script::delete_script(CODE [,CODESET])>,
B<Locale::Codes::Script::add_script_alias(NAME ,NEW_NAME)>,
B<Locale::Codes::Script::delete_script_alias(NAME)>,
B<Locale::Codes::Script::rename_script_code(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Codes::Script::add_script_code_alias(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Codes::Script::delete_script_code_alias(CODE [,CODESET])>

=item SEE ALSO

L<Locale::Codes>, L<Locale::Codes::API>,
L<http://www.unicode.org/iso15924/>,
L<http://www.iana.org/assignments/language-subtag-registry>

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Country - standard codes for country identification

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SUPPORTED CODE SETS

B<alpha-2, LOCALE_CODE_ALPHA_2>, B<alpha-3, LOCALE_CODE_ALPHA_3>,
B<numeric, LOCALE_CODE_NUMERIC>, B<dom, LOCALE_CODE_DOM>, B<un-alpha-3,
LOCALE_CODE_UN_ALPHA_3>, B<un-numeric, LOCALE_CODE_UN_NUMERIC>,
B<genc-alpha-2, LOCALE_CODE_GENC_ALPHA_2>, B<genc-alpha-3,
LOCALE_CODE_GENC_ALPHA_3>, B<genc-numeric, LOCALE_CODE_GENC_NUMERIC>

=item ROUTINES

B<code2country(CODE [,CODESET] [,'retired'])>, B<country2code(NAME
[,CODESET] [,'retired'])>, B<country_code2code(CODE ,CODESET ,CODESET2)>,
B<all_country_codes([CODESET] [,'retired'])>, B<all_country_names([CODESET]
[,'retired'])>, B<Locale::Country::rename_country(CODE ,NEW_NAME
[,CODESET])>, B<Locale::Country::add_country(CODE ,NAME [,CODESET])>,
B<Locale::Country::delete_country(CODE [,CODESET])>,
B<Locale::Country::add_country_alias(NAME ,NEW_NAME)>,
B<Locale::Country::delete_country_alias(NAME)>,
B<Locale::Country::rename_country_code(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Country::add_country_code_alias(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Country::delete_country_code_alias(CODE [,CODESET])>

=item SEE ALSO

L<Locale::Codes>, L<Locale::Codes::API>, L<Locale::SubCountry>,
L<http://www.iso.org/iso/home/standards/country_codes.htm>,
L<http://www.iso.org/iso/home/standards/country_codes/iso-3166-1_decoding_table.htm>,
L<http://www.iana.org/domains/root/db/>,
L<http://unstats.un.org/unsd/methods/m49/m49alpha.htm>,
L<https://nsgreg.nga.mil/genc/discovery>,
L<https://www.cia.gov/library/publications/the-world-factbook/appendix/print_appendix-d.html>,
L<http://www.statoids.com/wab.html>

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Currency - standard codes for currency identification

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SUPPORTED CODE SETS

B<alpha, LOCALE_CURR_ALPHA>, B<num, LOCALE_CURR_NUMERIC>

=item ROUTINES

B<code2currency(CODE [,CODESET] [,'retired'])>, B<currency2code(NAME
[,CODESET] [,'retired'])>, B<currency_code2code(CODE ,CODESET ,CODESET2)>,
B<all_currency_codes([CODESET] [,'retired'])>,
B<all_currency_names([CODESET] [,'retired'])>,
B<Locale::Currency::rename_currency(CODE ,NEW_NAME [,CODESET])>,
B<Locale::Currency::add_currency(CODE ,NAME [,CODESET])>,
B<Locale::Currency::delete_currency(CODE [,CODESET])>,
B<Locale::Currency::add_currency_alias(NAME ,NEW_NAME)>,
B<Locale::Currency::delete_currency_alias(NAME)>,
B<Locale::Currency::rename_currency_code(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Currency::add_currency_code_alias(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Currency::delete_currency_code_alias(CODE [,CODESET])>

=item SEE ALSO

L<Locale::Codes>, L<Locale::Codes::API>,
L<http://www.iso.org/iso/support/currency_codes_list-1.htm>

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Language - standard codes for language identification

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SUPPORTED CODE SETS

B<alpha-2, LOCALE_LANG_ALPHA_2>, B<alpha-3, LOCALE_LANG_ALPHA_3>, B<term,
LOCALE_LANG_TERM>

=item ROUTINES

B<code2language(CODE [,CODESET] [,'retired'])>, B<language2code(NAME
[,CODESET] [,'retired'])>, B<language_code2code(CODE ,CODESET ,CODESET2)>,
B<all_language_codes([CODESET] [,'retired'])>,
B<all_language_names([CODESET] [,'retired'])>,
B<Locale::Language::rename_language(CODE ,NEW_NAME [,CODESET])>,
B<Locale::Language::add_language(CODE ,NAME [,CODESET])>,
B<Locale::Language::delete_language(CODE [,CODESET])>,
B<Locale::Language::add_language_alias(NAME ,NEW_NAME)>,
B<Locale::Language::delete_language_alias(NAME)>,
B<Locale::Language::rename_language_code(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Language::add_language_code_alias(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Language::delete_language_code_alias(CODE [,CODESET])>

=item SEE ALSO

L<Locale::Codes>, L<Locale::Codes::API>,
L<http://www.loc.gov/standards/iso639-2/>,
L<http://www.loc.gov/standards/iso639-5/>,
L<http://www.iana.org/assignments/language-subtag-registry>

=item AUTHOR

=item COPYRIGHT

=back

=head2 Locale::Maketext - framework for localization

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item QUICK OVERVIEW

=item METHODS

=over 4

=item Construction Methods

=item The "maketext" Method

$lh->fail_with I<or> $lh->fail_with(I<PARAM>), $lh->failure_handler_auto,
$lh->blacklist(@list), $lh->whitelist(@list)

=item Utility Methods

$language->quant($number, $singular), $language->quant($number, $singular,
$plural), $language->quant($number, $singular, $plural, $negative),
$language->numf($number), $language->numerate($number, $singular, $plural,
$negative), $language->sprintf($format, @items), $language->language_tag(),
$language->encoding()

=item Language Handle Attributes and Internals

=back

=item LANGUAGE CLASS HIERARCHIES

=item ENTRIES IN EACH LEXICON

=item BRACKET NOTATION

=item BRACKET NOTATION SECURITY

=item AUTO LEXICONS

=item READONLY LEXICONS

=item CONTROLLING LOOKUP FAILURE

=item HOW TO USE MAKETEXT

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMER

=item AUTHOR

=back

=head2 Locale::Maketext::Cookbook - recipes for using Locale::Maketext

=over 4

=item INTRODUCTION

=item ONESIDED LEXICONS

=item DECIMAL PLACES IN NUMBER FORMATTING

=back

=head2 Locale::Maketext::Guts - Deprecated module to load Locale::Maketext
utf8 code

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 Locale::Maketext::GutsLoader - Deprecated module to load
Locale::Maketext utf8 code

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 Locale::Maketext::Simple - Simple interface to
Locale::Maketext::Lexicon

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item OPTIONS

=over 4

=item Class

=item Path

=item Style

=item Export

=item Subclass

=item Decode

=item Encoding

=back

=back

=over 4

=item ACKNOWLEDGMENTS

=item SEE ALSO

=item AUTHORS

=item COPYRIGHT

=over 4

=item The "MIT" License

=back

=back

=head2 Locale::Maketext::TPJ13 - article about software localization

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Localization and Perl: gettext breaks, Maketext fixes

=over 4

=item A Localization Horror Story: It Could Happen To You

=item The Linguistic View

=item Breaking gettext

=item Replacing gettext

=item Buzzwords: Abstraction and Encapsulation

=item Buzzword: Isomorphism

=item Buzzword: Inheritance

=item Buzzword: Concision

=item The Devil in the Details

=item The Proof in the Pudding: Localizing Web Sites

=item References

=back

=back

=head2 Locale::Script - standard codes for script identification

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SUPPORTED CODE SETS

B<alpha, LOCALE_SCRIPT_ALPHA>, B<num, LOCALE_SCRIPT_NUMERIC>

=item ROUTINES

B<code2script(CODE [,CODESET] [,'retired'])>, B<script2code(NAME [,CODESET]
[,'retired'])>, B<script_code2code(CODE ,CODESET ,CODESET2)>,
B<all_script_codes([CODESET] [,'retired'])>, B<all_script_names([CODESET]
[,'retired'])>, B<Locale::Script::rename_script(CODE ,NEW_NAME
[,CODESET])>, B<Locale::Script::add_script(CODE ,NAME [,CODESET])>,
B<Locale::Script::delete_script(CODE [,CODESET])>,
B<Locale::Script::add_script_alias(NAME ,NEW_NAME)>,
B<Locale::Script::delete_script_alias(NAME)>,
B<Locale::Script::rename_script_code(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Script::add_script_code_alias(CODE ,NEW_CODE [,CODESET])>,
B<Locale::Script::delete_script_code_alias(CODE [,CODESET])>

=item SEE ALSO

L<Locale::Codes>, L<Locale::Codes::API>,
L<http://www.unicode.org/iso15924/>,
L<http://www.iana.org/assignments/language-subtag-registry>

=item AUTHOR

=item COPYRIGHT

=back

=head2 MIME::Base64 - Encoding and decoding of base64 strings

=over 4

=item SYNOPSIS

=item DESCRIPTION

encode_base64( $bytes ), encode_base64( $bytes, $eol ), decode_base64(
$str ), encode_base64url( $bytes ), decode_base64url( $str ),
encoded_base64_length( $bytes ), encoded_base64_length( $bytes, $eol ),
decoded_base64_length( $str )

=item EXAMPLES

=item COPYRIGHT

=item SEE ALSO

=back
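As a rough illustration of the C<encode_base64> and C<decode_base64> routines listed above, the following minimal sketch round-trips a short string. The sample text and variable names are illustrative assumptions, not taken from the module's documentation.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);

# Passing an empty string as $eol suppresses the trailing newline
# that encode_base64 would otherwise append.
my $encoded = encode_base64("Hello, world!", "");
my $decoded = decode_base64($encoded);

print "$encoded\n";   # prints SGVsbG8sIHdvcmxkIQ==
print "$decoded\n";   # prints Hello, world!
```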

=head2 MIME::QuotedPrint - Encoding and decoding of quoted-printable
strings

=over 4

=item SYNOPSIS

=item DESCRIPTION

encode_qp( $str), encode_qp( $str, $eol), encode_qp( $str, $eol, $binmode
), decode_qp( $str )

=item COPYRIGHT

=item SEE ALSO

=back

=head2 Math::BigFloat - Arbitrary size floating point math package

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Input

=item Output

=back

=item METHODS

=over 4

=item Configuration methods

accuracy(), precision()

=item Constructor methods

from_hex(), from_oct(), from_bin(), bpi()

=item Arithmetic methods

bmuladd(), bdiv(), bmod(), bexp(), bnok(), bsin(), bcos(), batan(),
batan2(), as_float()

=item ACCURACY AND PRECISION

=item Rounding

bfround ( +$scale ), bfround ( -$scale ), bfround ( 0 ), bround ( +$scale
), bround ( -$scale ) and bround ( 0 )

=back

=item Autocreating constants

=over 4

=item Math library

=item Using Math::BigInt::Lite

=back

=item EXPORTS

=item CAVEATS

stringify, bstr(), brsft(), Modifying and =, precision() vs. accuracy()

=item BUGS

=item SUPPORT

RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN
Ratings, Search CPAN, CPAN Testers Matrix, The Bignum mailing list, Post to
mailing list, View mailing list, Subscribe/Unsubscribe

=item LICENSE

=item SEE ALSO

=item AUTHORS

=back
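The accuracy handling referred to above can be sketched as follows. This is a minimal example that assumes the second argument to C<bdiv> is read as an accuracy (number of significant digits); the variable name is illustrative.

```perl
use strict;
use warnings;
use Math::BigFloat;

# Divide 1 by 3, asking for 20 significant digits in the result.
my $third = Math::BigFloat->new(1)->bdiv(3, 20);

print "$third\n";
```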

=head2 Math::BigInt - Arbitrary size integer/float math package

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Input

=item Output

=back

=item METHODS

=over 4

=item Configuration methods

accuracy(), precision(), div_scale(), round_mode(), upgrade(), downgrade(),
modify(), config()

=item Constructor methods

new(), from_hex(), from_oct(), from_bin(), from_bytes(), bzero(), bone(),
binf(), bnan(), bpi(), copy(), as_int(), as_number()

=item Boolean methods

is_zero(), is_one( [ SIGN ]), is_finite(), is_inf( [ SIGN ] ), is_nan(),
is_positive(), is_pos(), is_negative(), is_neg(), is_odd(), is_even(),
is_int()

=item Comparison methods

bcmp(), bacmp(), beq(), bne(), blt(), ble(), bgt(), bge()

=item Arithmetic methods

bneg(), babs(), bsgn(), bnorm(), binc(), bdec(), badd(), bsub(), bmul(),
bmuladd(), bdiv(), btdiv(), bmod(), btmod(), bmodinv(), bmodpow(), bpow(),
blog(), bexp(), bnok(), bsin(), bcos(), batan(), batan2(), bsqrt(),
broot(), bfac(), brsft(), blsft()

=item Bitwise methods

band(), bior(), bxor(), bnot()

=item Rounding methods

round(), bround(), bfround(), bfloor(), bceil(), bint()

=item Other mathematical methods

bgcd(), blcm()

=item Object property methods

sign(), digit(), length(), mantissa(), exponent(), parts(), sparts(),
nparts(), eparts(), dparts()

=item String conversion methods

bstr(), bsstr(), bnstr(), bestr(), bdstr(), as_hex(), as_bin(), as_oct(),
as_bytes()

=item Other conversion methods

numify()

=back

=item ACCURACY and PRECISION

=over 4

=item Precision P

=item Accuracy A

=item Fallback F

=item Rounding mode R

'trunc', 'even', 'odd', '+inf', '-inf', 'zero', 'common', Precision,
Accuracy (significant digits), Setting/Accessing, Creating numbers, Usage,
Precedence, Overriding globals, Local settings, Rounding, Default values,
Remarks

=back

=item Infinity and Not a Number

oct()/hex()

=item INTERNALS

=over 4

=item MATH LIBRARY

=item SIGN

=back

=item EXAMPLES

=item Autocreating constants

=item PERFORMANCE

=over 4

=item Alternative math libraries

=back

=item SUBCLASSING

=over 4

=item Subclassing Math::BigInt

=back

=item UPGRADING

=over 4

=item Auto-upgrade

=back

=item EXPORTS

=item CAVEATS

Comparing numbers as strings, int(), Modifying and =, Overloading -$x,
Mixing different object types

=item BUGS

=item SUPPORT

RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN
Ratings, Search CPAN, CPAN Testers Matrix, The Bignum mailing list, Post to
mailing list, View mailing list, Subscribe/Unsubscribe

=item LICENSE

=item SEE ALSO

=item AUTHORS

=back
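A minimal sketch of a few of the arithmetic methods listed above. Most Math::BigInt arithmetic methods modify their invocant in place, so the example calls C<copy()> first where the original value should be kept. The sample values and variable names are illustrative assumptions.

```perl
use strict;
use warnings;
use Math::BigInt;

my $x = Math::BigInt->new("123456789012345678901234567890");

# badd() changes its invocant, so work on a copy to keep $x intact.
my $sum  = $x->copy->badd(1);
my $fact = Math::BigInt->new(20)->bfac;      # 20 factorial
my $gcd  = Math::BigInt->new(12)->bgcd(18);  # greatest common divisor

print "$sum\n";    # prints 123456789012345678901234567891
print "$fact\n";   # prints 2432902008176640000
print "$gcd\n";    # prints 6
```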

=head2 Math::BigInt::Calc - Pure Perl module to support Math::BigInt

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=back

=head2 Math::BigInt::CalcEmu - Emulate low-level math with BigInt code

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

__emu_bxor, __emu_band, __emu_bior

=item BUGS

=item SUPPORT

RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN
Ratings, Search CPAN, CPAN Testers Matrix, The Bignum mailing list, Post to
mailing list, View mailing list, Subscribe/Unsubscribe

=item LICENSE

=item AUTHORS

=item SEE ALSO

=back

=head2 Math::BigInt::FastCalc - Math::BigInt::Calc with some XS for more
speed

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item STORAGE

=item METHODS

=item BUGS

=item SUPPORT

RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN
Ratings, Search CPAN, CPAN Testers Matrix, The Bignum mailing list, Post to
mailing list, View mailing list, Subscribe/Unsubscribe

=item LICENSE

=item AUTHORS

=item SEE ALSO

=back

=head2 Math::BigInt::Lib - virtual parent class for Math::BigInt libraries

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item General Notes

api_version(), _new(STR), _zero(), _one(), _two(), _ten(), _from_bin(STR),
_from_oct(STR), _from_hex(STR), _from_bytes(STR), _add(OBJ1, OBJ2),
_mul(OBJ1, OBJ2), _div(OBJ1, OBJ2), _sub(OBJ1, OBJ2, FLAG), _sub(OBJ1,
OBJ2), _dec(OBJ), _inc(OBJ), _mod(OBJ1, OBJ2), _sqrt(OBJ), _root(OBJ, N),
_fac(OBJ), _pow(OBJ1, OBJ2), _modinv(OBJ1, OBJ2), _modpow(OBJ1, OBJ2,
OBJ3), _rsft(OBJ, N, B), _lsft(OBJ, N, B), _log_int(OBJ, B), _gcd(OBJ1,
OBJ2), _lcm(OBJ1, OBJ2), _and(OBJ1, OBJ2), _or(OBJ1, OBJ2), _xor(OBJ1,
OBJ2), _is_zero(OBJ), _is_one(OBJ), _is_two(OBJ), _is_ten(OBJ),
_is_even(OBJ), _is_odd(OBJ), _acmp(OBJ1, OBJ2), _str(OBJ), _as_bin(OBJ),
_as_oct(OBJ), _as_hex(OBJ), _as_bytes(OBJ), _num(OBJ), _copy(OBJ),
_len(OBJ), _zeros(OBJ), _digit(OBJ, N), _check(OBJ)

=item API version 2

_1ex(N), _nok(OBJ1, OBJ2), _alen(OBJ)

=item API optional methods

_signed_or(OBJ1, OBJ2, SIGN1, SIGN2), _signed_and(OBJ1, OBJ2, SIGN1,
SIGN2), _signed_xor(OBJ1, OBJ2, SIGN1, SIGN2)

=back

=item WRAP YOUR OWN

=item BUGS

=item SUPPORT

RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN
Ratings, Search CPAN, CPAN Testers Matrix, The Bignum mailing list, Post to
mailing list, View mailing list, Subscribe/Unsubscribe

=item LICENSE

=item AUTHOR

=item SEE ALSO

=back

=head2 Math::BigRat - Arbitrary big rational numbers

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item MATH LIBRARY

=back

=item METHODS

new(), numerator(), denominator(), parts(), numify(), as_int()/as_number(),
as_float(), as_hex(), as_bin(), as_oct(), from_hex(), from_oct(),
from_bin(), bnan(), bzero(), binf(), bone(), length(), digit(), bnorm(),
bfac(), bround()/round()/bfround(), bmod(), bmodinv(), bmodpow(), bneg(),
is_one(), is_zero(), is_pos()/is_positive(), is_neg()/is_negative(),
is_int(), is_odd(), is_even(), bceil(), bfloor(), bint(), bsqrt(), broot(),
badd(), bmul(), bsub(), bdiv(), bdec(), binc(), copy(), bstr()/bsstr(),
bcmp(), bacmp(), beq(), bne(), blt(), ble(), bgt(), bge(), blsft()/brsft(),
band(), bior(), bxor(), bnot(), bpow(), blog(), bexp(), bnok(), config()

=item BUGS

=item SUPPORT

RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN
Ratings, Search CPAN, CPAN Testers Matrix, The Bignum mailing list, Post to
mailing list, View mailing list, Subscribe/Unsubscribe

=item LICENSE

=item SEE ALSO

=item AUTHORS

=back

=head2 Math::Complex - complex numbers and associated mathematical
functions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item OPERATIONS

=item CREATION

=item DISPLAYING

=over 4

=item CHANGED IN PERL 5.6

=back

=item USAGE

=item CONSTANTS

=over 4

=item PI

=item Inf

=back

=item ERRORS DUE TO DIVISION BY ZERO OR LOGARITHM OF ZERO

=item ERRORS DUE TO INDIGESTIBLE ARGUMENTS

=item BUGS

=item SEE ALSO

=item AUTHORS

=item LICENSE

=back

=head2 Math::Trig - trigonometric functions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item TRIGONOMETRIC FUNCTIONS

B<tan>

=over 4

=item ERRORS DUE TO DIVISION BY ZERO

=item SIMPLE (REAL) ARGUMENTS, COMPLEX RESULTS

=back

=item PLANE ANGLE CONVERSIONS

deg2rad, grad2rad, rad2deg, grad2deg, deg2grad, rad2grad, rad2rad, deg2deg,
grad2grad

=item RADIAL COORDINATE CONVERSIONS

=over 4

=item COORDINATE SYSTEMS

=item 3-D ANGLE CONVERSIONS

cartesian_to_cylindrical, cartesian_to_spherical, cylindrical_to_cartesian,
cylindrical_to_spherical, spherical_to_cartesian, spherical_to_cylindrical

=back

=item GREAT CIRCLE DISTANCES AND DIRECTIONS

=over 4

=item great_circle_distance

=item great_circle_direction

=item great_circle_bearing

=item great_circle_destination

=item great_circle_midpoint

=item great_circle_waypoint

=back

=item EXAMPLES

=over 4

=item CAVEAT FOR GREAT CIRCLE FORMULAS

=item Real-valued asin and acos

asin_real, acos_real

=back

=item BUGS

=item AUTHORS

=item LICENSE

=back
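The angle conversions and great-circle routines listed above can be combined roughly as follows. This sketch assumes the convention that C<great_circle_distance> takes a longitude-like angle and a polar angle (90 degrees minus latitude), both in radians. The helper C<to_spherical> and the approximate London and Tokyo coordinates are illustrative assumptions.

```perl
use strict;
use warnings;
use Math::Trig qw(deg2rad pi great_circle_distance);

# Hypothetical helper: convert (longitude, latitude) in degrees to the
# (theta, phi) pair in radians expected by great_circle_distance.
sub to_spherical {
    my ($lon, $lat) = @_;
    return (deg2rad($lon), deg2rad(90 - $lat));
}

my $quarter_turn = deg2rad(90);            # pi/2 radians

my @london = to_spherical(-0.5, 51.3);
my @tokyo  = to_spherical(139.8, 35.7);

# The last argument is the sphere radius; 6378 gives a result in kilometres.
my $km = great_circle_distance(@london, @tokyo, 6378);

printf "%.0f km\n", $km;
```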

=head2 Memoize - Make functions faster by trading space for time

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item DETAILS

=item OPTIONS

=over 4

=item INSTALL

=item NORMALIZER

=item C<SCALAR_CACHE>, C<LIST_CACHE>

C<MEMORY>, C<HASH>, C<TIE>, C<FAULT>, C<MERGE>

=back

=item OTHER FACILITIES

=over 4

=item C<unmemoize>

=item C<flush_cache>

=back

=item CAVEATS

=item PERSISTENT CACHE SUPPORT

=item EXPIRATION SUPPORT

=item BUGS

=item MAILING LIST

=item AUTHOR

=item COPYRIGHT AND LICENSE

=item THANK YOU

=back
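A minimal sketch of the space-for-time trade that Memoize makes, using a naive recursive Fibonacci function. The function name C<fib> and the call counter are illustrative assumptions.

```perl
use strict;
use warnings;
use Memoize;

my $calls = 0;

# Naive recursive Fibonacci; exponential time without caching.
sub fib {
    my $n = shift;
    $calls++;
    return $n if $n < 2;
    return fib($n - 1) + fib($n - 2);
}

# Replace fib with a caching wrapper. The recursive calls go through
# the same name, so they hit the cache as well.
memoize('fib');

my $f30 = fib(30);
print "fib(30) = $f30 after $calls calls to the original function\n";
```

With the cache in place the original function body runs once per distinct argument, rather than more than a million times.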

=head2 Memoize::AnyDBM_File - glue to provide EXISTS for AnyDBM_File for
Storable use

=over 4

=item DESCRIPTION

=back

=head2 Memoize::Expire - Plug-in module for automatic expiration of
memoized values

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item INTERFACE

TIEHASH, EXISTS, STORE

=item ALTERNATIVES

=item CAVEATS

=item AUTHOR

=item SEE ALSO

=back

=head2 Memoize::ExpireFile - test for Memoize expiration semantics

=over 4

=item DESCRIPTION

=back

=head2 Memoize::ExpireTest - test for Memoize expiration semantics

=over 4

=item DESCRIPTION

=back

=head2 Memoize::NDBM_File - glue to provide EXISTS for NDBM_File for
Storable use

=over 4

=item DESCRIPTION

=back

=head2 Memoize::SDBM_File - glue to provide EXISTS for SDBM_File for
Storable use

=over 4

=item DESCRIPTION

=back

=head2 Memoize::Storable - store Memoized data in Storable database

=over 4

=item DESCRIPTION

=back

=head2 Module::CoreList - what modules shipped with versions of perl

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item FUNCTIONS API

C<first_release( MODULE )>, C<first_release_by_date( MODULE )>,
C<find_modules( REGEX, [ LIST OF PERLS ] )>, C<find_version( PERL_VERSION
)>, C<is_core( MODULE, [ MODULE_VERSION, [ PERL_VERSION ] ] )>,
C<is_deprecated( MODULE, PERL_VERSION )>, C<deprecated_in( MODULE )>,
C<removed_from( MODULE )>, C<removed_from_by_date( MODULE )>,
C<changes_between( PERL_VERSION, PERL_VERSION )>

=item DATA STRUCTURES

C<%Module::CoreList::version>, C<%Module::CoreList::delta>,
C<%Module::CoreList::released>, C<%Module::CoreList::families>,
C<%Module::CoreList::deprecated>, C<%Module::CoreList::upstream>,
C<%Module::CoreList::bug_tracker>

=item CAVEATS

=item HISTORY

=item AUTHOR

=item LICENSE

=item SEE ALSO

=back

=head2 Module::CoreList::Utils - what utilities shipped with versions of
perl

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item FUNCTIONS API

C<utilities>, C<first_release( UTILITY )>, C<first_release_by_date( UTILITY
)>, C<removed_from( UTILITY )>, C<removed_from_by_date( UTILITY )>

=item DATA STRUCTURES

C<%Module::CoreList::Utils::utilities>

=item AUTHOR

=item LICENSE

=item SEE ALSO

=back

=head2 Module::Load - runtime require of both modules and files

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Difference between C<load> and C<autoload>

=back

=item FUNCTIONS

load, autoload, load_remote, autoload_remote

=item Rules

=item IMPORTS THE FUNCTIONS

"load", "autoload", "load_remote", "autoload_remote", 'all', '', 'none',
undef

=item Caveats

=item ACKNOWLEDGEMENTS

=item BUG REPORTS

=item AUTHOR

=item COPYRIGHT

=back
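As a small illustration of C<load>, the sketch below requires a module whose name is only known at runtime. List::Util is chosen arbitrarily as a core module; the variable names are illustrative assumptions.

```perl
use strict;
use warnings;
use Module::Load;

# The module name is held in a variable, so a bareword "require" would
# not work here; load() accepts the name as a string.
my $module = 'List::Util';
load $module;

# Call the function by its fully qualified name, without relying on import.
my $total = List::Util::sum(1, 2, 3);
print "$total\n";   # prints 6
```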

=head2 Module::Load::Conditional - Looking up module information / loading
at runtime

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Methods

=over 4

=item $href = check_install( module => NAME [, version => VERSION, verbose
=> BOOL ] );

module, version, verbose, file, dir, version, uptodate

=back

=back

=over 4

=item $bool = can_load( modules => { NAME => VERSION [,NAME => VERSION] },
[verbose => BOOL, nocache => BOOL, autoload => BOOL] )

modules, verbose, nocache, autoload

=back

=over 4

=item @list = requires( MODULE );

=back

=over 4

=item Global Variables

=over 4

=item $Module::Load::Conditional::VERBOSE

=item $Module::Load::Conditional::FIND_VERSION

=item $Module::Load::Conditional::CHECK_INC_HASH

=item $Module::Load::Conditional::FORCE_SAFE_INC

=item $Module::Load::Conditional::CACHE

=item $Module::Load::Conditional::ERROR

=item $Module::Load::Conditional::DEPRECATED

=back

=item See Also

=item BUG REPORTS

=item AUTHOR

=item COPYRIGHT

=back

=head2 Module::Loaded - mark modules as loaded or unloaded

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item FUNCTIONS

=over 4

=item $bool = mark_as_loaded( PACKAGE );

=back

=back

=over 4

=item $bool = mark_as_unloaded( PACKAGE );

=back

=over 4

=item $loc = is_loaded( PACKAGE );

=back

=over 4

=item BUG REPORTS

=item AUTHOR

=item COPYRIGHT

=back
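The three functions listed above can be exercised roughly as follows. The package name C<My::Fake::Module> is a hypothetical placeholder that does not exist on disk.

```perl
use strict;
use warnings;
use Module::Loaded;

# Pretend the (hypothetical) package has already been loaded.
mark_as_loaded('My::Fake::Module');
my $loc = is_loaded('My::Fake::Module');   # true: the recorded location

# Undo the marking again.
mark_as_unloaded('My::Fake::Module');
my $still_loaded = is_loaded('My::Fake::Module');

print "was loaded from: $loc\n";
print "still loaded: ", ($still_loaded ? "yes" : "no"), "\n";
```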

=head2 Module::Metadata - Gather package and POD information from perl
module files

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item CLASS METHODS

=over 4

=item C<< new_from_file($filename, collect_pod => 1) >>

=item C<< new_from_handle($handle, $filename, collect_pod => 1) >>

=item C<< new_from_module($module, collect_pod => 1, inc => \@dirs) >>

=item C<< find_module_by_name($module, \@dirs) >>

=item C<< find_module_dir_by_name($module, \@dirs) >>

=item C<< provides( %options ) >>

version B<(required)>, dir, files, prefix

=item C<< package_versions_from_directory($dir, \@files?) >>

=item C<< log_info (internal) >>

=back

=item OBJECT METHODS

=over 4

=item C<< name() >>

=item C<< version($package) >>

=item C<< filename() >>

=item C<< packages_inside() >>

=item C<< pod_inside() >>

=item C<< contains_pod() >>

=item C<< pod($section) >>

=item C<< is_indexable($package) >> or C<< is_indexable() >>

=back

=item SUPPORT

=item AUTHOR

=item CONTRIBUTORS

=item COPYRIGHT & LICENSE

=back

=head2 NDBM_File - Tied access to ndbm files

=over 4

=item SYNOPSIS

=item DESCRIPTION

C<O_RDONLY>, C<O_WRONLY>, C<O_RDWR>

=item DIAGNOSTICS

=over 4

=item C<ndbm store returned -1, errno 22, key "..." at ...>

=back

=item BUGS AND WARNINGS

=back

=head2 NEXT - Provide a pseudo-class NEXT (et al) that allows method
redispatch

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Enforcing redispatch

=item Avoiding repetitions

=item Invoking all versions of a method with a single call

=item Using C<EVERY> methods

=back

=item SEE ALSO

=item AUTHOR

=item BUGS AND IRRITATIONS

=item COPYRIGHT

=back

=head2 Net::Cmd - Network Command class (as used by FTP, SMTP etc)

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item USER METHODS

debug ( VALUE ), message (), code (), ok (), status (), datasend ( DATA ),
dataend ()

=item CLASS METHODS

debug_print ( DIR, TEXT ), debug_text ( DIR, TEXT ), command ( CMD [, ARGS,
... ]), unsupported (), response (), parse_response ( TEXT ), getline (),
ungetline ( TEXT ), rawdatasend ( DATA ), read_until_dot (), tied_fh ()

=item PSEUDO RESPONSES

Initial value, Connection closed, Timeout

=item EXPORTS

=item AUTHOR

=item COPYRIGHT

=back

=head2 Net::Config - Local configuration data for libnet

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

requires_firewall ( HOST )

=item NetConfig VALUES

nntp_hosts, snpp_hosts, pop3_hosts, smtp_hosts, ph_hosts, daytime_hosts,
time_hosts, inet_domain, ftp_firewall, ftp_firewall_type, 0Z<>, 1Z<>, 2Z<>,
3Z<>, 4Z<>, 5Z<>, 6Z<>, 7Z<>, ftp_ext_passive, ftp_int_passive,
local_netmask, test_hosts, test_exists

=item AUTHOR

=item COPYRIGHT

=back

=head2 Net::Domain - Attempt to evaluate the current host's internet name
and domain

=over 4

=item SYNOPSIS

=item DESCRIPTION

hostfqdn (), domainname (), hostname (), hostdomain ()

=item AUTHOR

=item COPYRIGHT

=back

=head2 Net::FTP - FTP Client class

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item OVERVIEW

=item CONSTRUCTOR

new ([ HOST ] [, OPTIONS ])

=item METHODS

login ([LOGIN [,PASSWORD [, ACCOUNT] ] ]), starttls (), stoptls (), prot (
LEVEL ), host (), account( ACCT ), authorize ( [AUTH [, RESP]]), site
(ARGS), ascii (), binary (), type ( [ TYPE ] ), rename ( OLDNAME, NEWNAME
), delete ( FILENAME ), cwd ( [ DIR ] ), cdup (), passive ( [ PASSIVE ] ),
pwd (), restart ( WHERE ), rmdir ( DIR [, RECURSE ]), mkdir ( DIR [,
RECURSE ]), alloc ( SIZE [, RECORD_SIZE] ), ls ( [ DIR ] ), dir ( [ DIR ]
), get ( REMOTE_FILE [, LOCAL_FILE [, WHERE]] ), put ( LOCAL_FILE [,
REMOTE_FILE ] ), put_unique ( LOCAL_FILE [, REMOTE_FILE ] ), append (
LOCAL_FILE [, REMOTE_FILE ] ), unique_name (), mdtm ( FILE ), size ( FILE
), supported ( CMD ), hash ( [FILEHANDLE_GLOB_REF],[ BYTES_PER_HASH_MARK]
), feature ( NAME ), nlst ( [ DIR ] ), list ( [ DIR ] ), retr ( FILE ),
stor ( FILE ), stou ( FILE ), appe ( FILE ), port ( [ PORT ] ), eprt ( [
PORT ] ), pasv (), epsv (), pasv_xfer ( SRC_FILE, DEST_SERVER [, DEST_FILE
] ), pasv_xfer_unique ( SRC_FILE, DEST_SERVER [, DEST_FILE ] ), pasv_wait (
NON_PASV_SERVER ), abort (), quit ()

=over 4

=item Methods for the adventurous

quot (CMD [,ARGS]), can_inet6 (), can_ssl ()

=back

=item THE dataconn CLASS

=item UNIMPLEMENTED

B<SMNT>, B<HELP>, B<MODE>, B<SYST>, B<STAT>, B<STRU>, B<REIN>

=item REPORTING BUGS

=item AUTHOR

=item SEE ALSO

=item USE EXAMPLES

http://www.csh.rit.edu/~adam/Progs/

=item CREDITS

=item COPYRIGHT

=back

=head2 Net::NNTP - NNTP Client class

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CONSTRUCTOR

new ( [ HOST ] [, OPTIONS ])

=item METHODS

host (), starttls (), article ( [ MSGID|MSGNUM ], [FH] ), body ( [
MSGID|MSGNUM ], [FH] ), head ( [ MSGID|MSGNUM ], [FH] ), articlefh ( [
MSGID|MSGNUM ] ), bodyfh ( [ MSGID|MSGNUM ] ), headfh ( [ MSGID|MSGNUM ] ),
nntpstat ( [ MSGID|MSGNUM ] ), group ( [ GROUP ] ), help ( ), ihave ( MSGID
[, MESSAGE ]), last (), date (), postok (), authinfo ( USER, PASS ),
authinfo_simple ( USER, PASS ), list (), newgroups ( SINCE [, DISTRIBUTIONS
]), newnews ( SINCE [, GROUPS [, DISTRIBUTIONS ]]), next (), post ( [
MESSAGE ] ), postfh (), slave (), quit (), can_inet6 (), can_ssl ()

=over 4

=item Extension methods

newsgroups ( [ PATTERN ] ), distributions (), distribution_patterns (),
subscriptions (), overview_fmt (), active_times (), active ( [ PATTERN ] ),
xgtitle ( PATTERN ), xhdr ( HEADER, MESSAGE-SPEC ), xover ( MESSAGE-SPEC ),
xpath ( MESSAGE-ID ), xpat ( HEADER, PATTERN, MESSAGE-SPEC), xrover (),
listgroup ( [ GROUP ] ), reader ()

=back

=item UNSUPPORTED

=item DEFINITIONS

MESSAGE-SPEC, PATTERN, Examples, C<[^]-]>, C<*bdc>, C<[0-9a-zA-Z]>, C<a??d>

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT

=back

=head2 Net::Netrc - OO interface to user's netrc file

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item THE .netrc FILE

machine name, default, login name, password string, account string, macdef
name

=item CONSTRUCTOR

lookup ( MACHINE [, LOGIN ])

=item METHODS

login (), password (), account (), lpa ()

=item AUTHOR

=item SEE ALSO

=item COPYRIGHT

=back

=head2 Net::POP3 - Post Office Protocol 3 Client class (RFC1939)

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CONSTRUCTOR

new ( [ HOST ] [, OPTIONS ] )

=item METHODS

host (), auth ( USERNAME, PASSWORD ), user ( USER ), pass ( PASS ), login (
[ USER [, PASS ]] ), starttls ( SSLARGS ), apop ( [ USER [, PASS ]] ),
banner (), capa (), capabilities (), top ( MSGNUM [, NUMLINES ] ), list (
[ MSGNUM ] ), get ( MSGNUM [, FH ] ), getfh ( MSGNUM ), last (), popstat
(), ping ( USER ), uidl ( [ MSGNUM ] ), delete ( MSGNUM ), reset (), quit
(), can_inet6 (), can_ssl ()

=item NOTES

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT

=back

=head2 Net::Ping - check a remote host for reachability

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Functions

Net::Ping->new([proto, timeout, bytes, device, tos, ttl, family, host,
port, bind, gateway, retrans, pingstring, source_verify econnrefused
dontfrag IPV6_USE_MIN_MTU IPV6_RECVPATHMTU]), $p->ping($host [, $timeout [,
$family]]), $p->source_verify( { 0 | 1 } ), $p->service_check( { 0 | 1 }
), $p->tcp_service_check( { 0 | 1 } ), $p->hires( { 0 | 1 } ), $p->time,
$p->socket_blocking_mode( $fh, $mode ), $p->IPV6_USE_MIN_MTU,
$p->IPV6_RECVPATHMTU, $p->IPV6_HOPLIMIT, $p->IPV6_REACHCONF I<NYI>,
$p->bind($local_addr), $p->open($host), $p->ack( [ $host ] ), $p->nack(
$failed_ack_host ), $p->ack_unfork($host), $p->ping_icmp([$host, $timeout,
$family]), $p->ping_icmpv6([$host, $timeout, $family]) I<NYI>,
$p->ping_stream([$host, $timeout, $family]), $p->ping_syn([$host, $ip,
$start_time, $stop_time]), $p->ping_syn_fork([$host, $timeout, $family]),
$p->ping_tcp([$host, $timeout, $family]), $p->ping_udp([$host, $timeout,
$family]), $p->ping_external([$host, $timeout, $family]),
$p->tcp_connect([$ip, $timeout]), $p->tcp_echo([$ip, $timeout,
$pingstring]), $p->close(), $p->port_number([$port_number]), $p->mselect,
$p->ntop, $p->checksum($msg), $p->icmp_result, pingecho($host [,
$timeout]), wakeonlan($mac, [$host, [$port]])

=back

=item NOTES

=item INSTALL

=item BUGS

=item AUTHORS

=item COPYRIGHT

=back

=head2 Net::SMTP - Simple Mail Transfer Protocol Client

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXAMPLES

=item CONSTRUCTOR

new ( [ HOST ] [, OPTIONS ] )

=item METHODS

banner (), domain (), hello ( DOMAIN ), host (), etrn ( DOMAIN ), starttls
( SSLARGS ), auth ( USERNAME, PASSWORD ), auth ( SASL ), mail ( ADDRESS [,
OPTIONS] ), send ( ADDRESS ), send_or_mail ( ADDRESS ), send_and_mail (
ADDRESS ), reset (), recipient ( ADDRESS [, ADDRESS, [...]] [, OPTIONS ] ),
to ( ADDRESS [, ADDRESS [...]] ), cc ( ADDRESS [, ADDRESS [...]] ), bcc (
ADDRESS [, ADDRESS [...]] ), data ( [ DATA ] ), bdat ( DATA ), bdatlast (
DATA ), expand ( ADDRESS ), verify ( ADDRESS ), help ( [ $subject ] ), quit
(), can_inet6 (), can_ssl ()

=item ADDRESSES

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT

=back

=head2 Net::Time - time and daytime network client interface

=over 4

=item SYNOPSIS

=item DESCRIPTION

inet_time ( [HOST [, PROTOCOL [, TIMEOUT]]]), inet_daytime ( [HOST [,
PROTOCOL [, TIMEOUT]]])

=item AUTHOR

=item COPYRIGHT

=back

=head2 Net::hostent - by-name interface to Perl's built-in gethost*()
functions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXAMPLES

=item NOTE

=item AUTHOR

=back

=head2 Net::libnetFAQ, libnetFAQ - libnet Frequently Asked Questions

=over 4

=item DESCRIPTION

=over 4

=item Where to get this document

=item How to contribute to this document

=back

=item Author and Copyright Information

=over 4

=item Disclaimer

=back

=item Obtaining and installing libnet

=over 4

=item What is libnet ?

=item Which version of perl do I need ?

=item What other modules do I need ?

=item What machines support libnet ?

=item Where can I get the latest libnet release

=back

=item Using Net::FTP

=over 4

=item How do I download files from an FTP server ?

=item How do I transfer files in binary mode ?

=item How can I get the size of a file on a remote FTP server ?

=item How can I get the modification time of a file on a remote FTP server
?

=item How can I change the permissions of a file on a remote server ?

=item Can I do a reget operation like the ftp command ?

=item How do I get a directory listing from an FTP server ?

=item Changing directory to "" does not fail ?

=item I am behind a SOCKS firewall, but the Firewall option does not work ?

=item I am behind an FTP proxy firewall, but cannot access machines outside
?

=item My ftp proxy firewall does not listen on port 21

=item Is it possible to change the file permissions of a file on an FTP
server ?

=item I have seen scripts call a method message, but cannot find it
documented ?

=item Why does Net::FTP not implement mput and mget methods

=back

=item Using Net::SMTP

=over 4

=item Why can't the part of an Email address after the @ be used as the
hostname ?

=item Why does Net::SMTP not do DNS MX lookups ?

=item The verify method always returns true ?

=back

=item Debugging scripts

=over 4

=item How can I debug my scripts that use Net::* modules ?

=back

=item AUTHOR AND COPYRIGHT

=back

=head2 Net::netent - by-name interface to Perl's built-in getnet*()
functions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXAMPLES

=item NOTE

=item AUTHOR

=back

=head2 Net::protoent - by-name interface to Perl's built-in getproto*()
functions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item NOTE

=item AUTHOR

=back

=head2 Net::servent - by-name interface to Perl's built-in getserv*()
functions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXAMPLES

=item NOTE

=item AUTHOR

=back

=head2 O - Generic interface to Perl Compiler backends

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CONVENTIONS

=item IMPLEMENTATION

=item BUGS

=item AUTHOR

=back

=head2 ODBM_File - Tied access to odbm files

=over 4

=item SYNOPSIS

=item DESCRIPTION

C<O_RDONLY>, C<O_WRONLY>, C<O_RDWR>

=item DIAGNOSTICS

=over 4

=item C<odbm store returned -1, errno 22, key "..." at ...>

=back

=item BUGS AND WARNINGS

=back

=head2 Opcode - Disable named opcodes when compiling perl code

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item NOTE

=item WARNING

=item Operator Names and Operator Lists

an operator name (opname), an operator tag name (optag), a negated opname
or optag, an operator set (opset)

=item Opcode Functions

opcodes, opset (OP, ...), opset_to_ops (OPSET), opset_to_hex (OPSET),
full_opset, empty_opset, invert_opset (OPSET), verify_opset (OPSET, ...),
define_optag (OPTAG, OPSET), opmask_add (OPSET), opmask, opdesc (OP, ...),
opdump (PAT)

=item Manipulating Opsets

=item TO DO (maybe)

=back

=over 4

=item Predefined Opcode Tags

:base_core, :base_mem, :base_loop, :base_io, :base_orig, :base_math,
:base_thread, :default, :filesys_read, :sys_db, :browse, :filesys_open,
:filesys_write, :subprocess, :ownprocess, :others, :load,
:still_to_be_decided, :dangerous

=item SEE ALSO

=item AUTHORS

=back

=head2 POSIX - Perl interface to IEEE Std 1003.1

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEATS

=item FUNCTIONS

C<_exit>, C<abort>, C<abs>, C<access>, C<acos>, C<acosh>, C<alarm>,
C<asctime>, C<asin>, C<asinh>, C<assert>, C<atan>, C<atanh>, C<atan2>,
C<atexit>, C<atof>, C<atoi>, C<atol>, C<bsearch>, C<calloc>, C<cbrt>,
C<ceil>, C<chdir>, C<chmod>, C<chown>, C<clearerr>, C<clock>, C<close>,
C<closedir>, C<cos>, C<cosh>, C<copysign>, C<creat>, C<ctermid>, C<ctime>,
C<cuserid>, C<difftime>, C<div>, C<dup>, C<dup2>, C<erf>, C<erfc>,
C<errno>, C<execl>, C<execle>, C<execlp>, C<execv>, C<execve>, C<execvp>,
C<exit>, C<exp>, C<expm1>, C<fabs>, C<fclose>, C<fcntl>, C<fdopen>,
C<feof>, C<ferror>, C<fflush>, C<fgetc>, C<fgetpos>, C<fgets>, C<fileno>,
C<floor>, C<fdim>, C<fegetround>, C<fesetround>, C<fma>, C<fmax>, C<fmin>,
C<fmod>, C<fopen>, C<fork>, C<fpathconf>, C<fpclassify>, C<fprintf>,
C<fputc>, C<fputs>, C<fread>, C<free>, C<freopen>, C<frexp>, C<fscanf>,
C<fseek>, C<fsetpos>, C<fstat>, C<fsync>, C<ftell>, C<fwrite>, C<getc>,
C<getchar>, C<getcwd>, C<getegid>, C<getenv>, C<geteuid>, C<getgid>,
C<getgrgid>, C<getgrnam>, C<getgroups>, C<getlogin>, C<getpayload>,
C<getpgrp>, C<getpid>, C<getppid>, C<getpwnam>, C<getpwuid>, C<gets>,
C<getuid>, C<gmtime>, C<hypot>, C<ilogb>, C<Inf>, C<isalnum>, C<isalpha>,
C<isatty>, C<iscntrl>, C<isdigit>, C<isfinite>, C<isgraph>, C<isgreater>,
C<isinf>, C<islower>, C<isnan>, C<isnormal>, C<isprint>, C<ispunct>,
C<issignaling>, C<isspace>, C<isupper>, C<isxdigit>, C<j0>, C<j1>, C<jn>,
C<y0>, C<y1>, C<yn>, C<kill>, C<labs>, C<lchown>, C<ldexp>, C<ldiv>,
C<lgamma>, C<log1p>, C<log2>, C<logb>, C<link>, C<localeconv>,
C<localtime>, C<log>, C<log10>, C<longjmp>, C<lseek>, C<lrint>, C<lround>,
C<malloc>, C<mblen>, C<mbstowcs>, C<mbtowc>, C<memchr>, C<memcmp>,
C<memcpy>, C<memmove>, C<memset>, C<mkdir>, C<mkfifo>, C<mktime>, C<modf>,
C<NaN>, C<nan>, C<nearbyint>, C<nextafter>, C<nexttoward>, C<nice>,
C<offsetof>, C<open>, C<opendir>, C<pathconf>, C<pause>, C<perror>,
C<pipe>, C<pow>, C<printf>, C<putc>, C<putchar>, C<puts>, C<qsort>,
C<raise>, C<rand>, C<read>, C<readdir>, C<realloc>, C<remainder>,
C<remove>, C<remquo>, C<rename>, C<rewind>, C<rewinddir>, C<rint>,
C<rmdir>, C<round>, C<scalbn>, C<scanf>, C<setgid>, C<setjmp>,
C<setlocale>, C<setpayload>, C<setpayloadsig>, C<setpgid>, C<setsid>,
C<setuid>, C<sigaction>, C<siglongjmp>, C<signbit>, C<sigpending>,
C<sigprocmask>, C<sigsetjmp>, C<sigsuspend>, C<sin>, C<sinh>, C<sleep>,
C<sprintf>, C<sqrt>, C<srand>, C<sscanf>, C<stat>, C<strcat>, C<strchr>,
C<strcmp>, C<strcoll>, C<strcpy>, C<strcspn>, C<strerror>, C<strftime>,
C<strlen>, C<strncat>, C<strncmp>, C<strncpy>, C<strpbrk>, C<strrchr>,
C<strspn>, C<strstr>, C<strtod>, C<strtok>, C<strtol>, C<strtold>,
C<strtoul>, C<strxfrm>, C<sysconf>, C<system>, C<tan>, C<tanh>, C<tcdrain>,
C<tcflow>, C<tcflush>, C<tcgetpgrp>, C<tcsendbreak>, C<tcsetpgrp>,
C<tgamma>, C<time>, C<times>, C<tmpfile>, C<tmpnam>, C<tolower>,
C<toupper>, C<trunc>, C<ttyname>, C<tzname>, C<tzset>, C<umask>, C<uname>,
C<ungetc>, C<unlink>, C<utime>, C<vfprintf>, C<vprintf>, C<vsprintf>,
C<wait>, C<waitpid>, C<wcstombs>, C<wctomb>, C<write>

=item CLASSES

=over 4

=item C<POSIX::SigAction>

C<new>, C<handler>, C<mask>, C<flags>, C<safe>

=item C<POSIX::SigRt>

C<%SIGRT>, C<SIGRTMIN>, C<SIGRTMAX>

=item C<POSIX::SigSet>

C<new>, C<addset>, C<delset>, C<emptyset>, C<fillset>, C<ismember>

=item C<POSIX::Termios>

C<new>, C<getattr>, C<getcc>, C<getcflag>, C<getiflag>, C<getispeed>,
C<getlflag>, C<getoflag>, C<getospeed>, C<setattr>, C<setcc>, C<setcflag>,
C<setiflag>, C<setispeed>, C<setlflag>, C<setoflag>, C<setospeed>, Baud
rate values, Terminal interface values, C<c_cc> field values, C<c_cflag>
field values, C<c_iflag> field values, C<c_lflag> field values, C<c_oflag>
field values

=back

=item PATHNAME CONSTANTS

Constants

=item POSIX CONSTANTS

Constants

=item SYSTEM CONFIGURATION

Constants

=item ERRNO

Constants

=item FCNTL

Constants

=item FLOAT

Constants

=item FLOATING-POINT ENVIRONMENT

Constants

=item LIMITS

Constants

=item LOCALE

Constants

=item MATH

Constants

=item SIGNAL

Constants

=item STAT

Constants, Macros

=item STDLIB

Constants

=item STDIO

Constants

=item TIME

Constants

=item UNISTD

Constants

=item WAIT

Constants, C<WNOHANG>, C<WUNTRACED>, Macros, C<WIFEXITED>, C<WEXITSTATUS>,
C<WIFSIGNALED>, C<WTERMSIG>, C<WIFSTOPPED>, C<WSTOPSIG>

=item WINSOCK

Constants

=back

=head2 Params::Check - A generic input parsing/checking mechanism.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Template

default, required, strict_type, defined, no_override, store, allow

=item Functions

=over 4

=item check( \%tmpl, \%args, [$verbose] );

Template, Arguments, Verbose

=back

=back

=over 4

=item allow( $test_me, \@criteria );

string, regexp, subroutine, array ref

=back

=over 4

=item last_error()

=back

=over 4

=item Global Variables

=over 4

=item $Params::Check::VERBOSE

=item $Params::Check::STRICT_TYPE

=item $Params::Check::ALLOW_UNKNOWN

=item $Params::Check::STRIP_LEADING_DASHES

=item $Params::Check::NO_DUPLICATES

=item $Params::Check::PRESERVE_CASE

=item $Params::Check::ONLY_ALLOW_DEFINED

=item $Params::Check::SANITY_CHECK_TEMPLATE

=item $Params::Check::WARNINGS_FATAL

=item $Params::Check::CALLER_DEPTH

=back

=item Acknowledgements

=item BUG REPORTS

=item AUTHOR

=item COPYRIGHT

=back

=head2 Parse::CPAN::Meta - Parse META.yml and META.json CPAN metadata files

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item load_file

=item load_yaml_string

=item load_json_string

=item load_string

=item yaml_backend

=item json_backend

=item json_decoder

=back

=item FUNCTIONS

=over 4

=item Load

=item LoadFile

=back

=item ENVIRONMENT

=over 4

=item CPAN_META_JSON_DECODER

=item CPAN_META_JSON_BACKEND

=item PERL_JSON_BACKEND

=item PERL_YAML_BACKEND

=back

=item AUTHORS

=item COPYRIGHT AND LICENSE

=back

=head2 Perl::OSType - Map Perl operating system names to generic types

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item USAGE

=over 4

=item os_type()

=item is_os_type()

=back

=item SEE ALSO

=item SUPPORT

=over 4

=item Bugs / Feature Requests

=item Source Code

=back

=item AUTHOR

=item CONTRIBUTORS

=item COPYRIGHT AND LICENSE

=back

=head2 PerlIO - On demand loader for PerlIO layers and root of PerlIO::*
name space

=over 4

=item SYNOPSIS

=item DESCRIPTION

:unix, :stdio, :perlio, :crlf, :utf8, :bytes, :raw, :pop, :win32

=over 4

=item Custom Layers

:encoding, :mmap, :via

=item Alternatives to raw

=item Defaults and how to override them

=item Querying the layers of filehandles

=back

=item AUTHOR

=item SEE ALSO

=back

=head2 PerlIO::encoding - encoding layer

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=back

=head2 PerlIO::mmap - Memory mapped IO

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item IMPLEMENTATION NOTE

=back

=head2 PerlIO::scalar - in-memory IO, scalar IO

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item IMPLEMENTATION NOTE

=back

=head2 PerlIO::via - Helper class for PerlIO layers implemented in perl

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXPECTED METHODS

$class->PUSHED([$mode,[$fh]]), $obj->POPPED([$fh]),
$obj->UTF8($belowFlag,[$fh]), $obj->OPEN($path,$mode,[$fh]),
$obj->BINMODE([$fh]), $obj->FDOPEN($fd,[$fh]),
$obj->SYSOPEN($path,$imode,$perm,[$fh]), $obj->FILENO($fh),
$obj->READ($buffer,$len,$fh), $obj->WRITE($buffer,$fh), $obj->FILL($fh),
$obj->CLOSE($fh), $obj->SEEK($posn,$whence,$fh), $obj->TELL($fh),
$obj->UNREAD($buffer,$fh), $obj->FLUSH($fh), $obj->SETLINEBUF($fh),
$obj->CLEARERR($fh), $obj->ERROR($fh), $obj->EOF($fh)

=item EXAMPLES

=over 4

=item Example - a Hexadecimal Handle

=back

=back

=head2 PerlIO::via::QuotedPrint - PerlIO layer for quoted-printable strings

=over 4

=item SYNOPSIS

=item VERSION

=item DESCRIPTION

=item REQUIRED MODULES

=item SEE ALSO

=item ACKNOWLEDGEMENTS

=item COPYRIGHT

=back

=head2 Pod::Checker - check pod documents for syntax errors

=over 4

=item SYNOPSIS

=item OPTIONS/ARGUMENTS

=over 4

=item podchecker()

B<-warnings> =E<gt> I<val>, B<-quiet> =E<gt> I<val>

=back

=item DESCRIPTION

=item DIAGNOSTICS

=over 4

=item Errors

empty =headn, =over on line I<N> without closing =back, You forgot a
'=back' before '=headI<N>', =over is the last thing in the document?!,
'=item' outside of any '=over', =back without =over, Can't have a 0 in
=over I<N>, =over should be: '=over' or '=over positive_number', =begin
I<TARGET> without matching =end I<TARGET>, =begin without a target?, =end
I<TARGET> without matching =begin, '=end' without a target?, '=end
I<TARGET>' is invalid, =end I<CONTENT> doesn't match =begin I<TARGET>, =for
without a target?, unresolved internal link I<NAME>, Unknown directive:
I<CMD>, Deleting unknown formatting code I<SEQ>, Unterminated
I<SEQ>E<lt>E<gt> sequence, An EE<lt>...E<gt> surrounding strange content,
An empty EE<lt>E<gt>, An empty C<< LE<lt>E<gt> >>, An empty XE<lt>E<gt>, A
non-empty ZE<lt>E<gt>, Spurious text after =pod / =cut, =back doesn't take
any parameters, but you said =back I<ARGUMENT>, =pod directives shouldn't
be over one line long! Ignoring all I<N> lines of content, =cut found
outside a pod block, Invalid =encoding syntax: I<CONTENT>

=item Warnings

nested commands I<CMD>E<lt>...I<CMD>E<lt>...E<gt>...E<gt>, multiple
occurrences (I<N>) of link target I<name>, line containing nothing but
whitespace in paragraph, =item has no contents, You can't have =items (as
at line I<N>) unless the first thing after the =over is an =item, Expected
'=item I<EXPECTED VALUE>', Expected '=item *', Possible =item type
mismatch: 'I<x>' found leading a supposed definition =item, You have '=item
x' instead of the expected '=item I<N>', Unknown E content in
EE<lt>I<CONTENT>E<gt>, empty =over/=back block, empty section in previous
paragraph, Verbatim paragraph in NAME section, =headI<n> without preceding
higher level

=item Hyperlinks

ignoring leading/trailing whitespace in link, alternative text/node '%s'
contains non-escaped | or /

=back

=item RETURN VALUE

=item EXAMPLES

=item SCRIPTS

=item INTERFACE

=back

C<Pod::Checker-E<gt>new( %options )>

C<$checker-E<gt>poderror( @args )>, C<$checker-E<gt>poderror( {%opts},
@args )>

C<$checker-E<gt>num_errors()>

C<$checker-E<gt>num_warnings()>

C<$checker-E<gt>name()>

C<$checker-E<gt>node()>

C<$checker-E<gt>idx()>

C<$checker-E<gt>hyperlinks()>

line()

type()

page()

node()

=over 4

=item AUTHOR

=back

=head2 Pod::Escapes - for resolving Pod EE<lt>...E<gt> sequences

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item GOODIES

e2char($e_content), e2charnum($e_content), $Name2character{I<name>},
$Name2character_number{I<name>}, $Latin1Code_to_fallback{I<integer>},
$Latin1Char_to_fallback{I<character>}, $Code2USASCII{I<integer>}

=item CAVEATS

=item SEE ALSO

=item REPOSITORY

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back

=head2 Pod::Find - find POD documents in directory trees

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=over 4

=item C<pod_find( { %opts } , @directories )>

C<-verbose =E<gt> 1>, C<-perl =E<gt> 1>, C<-script =E<gt> 1>, C<-inc =E<gt>
1>

=back

=over 4

=item C<simplify_name( $str )>

=back

=over 4

=item C<pod_where( { %opts }, $pod )>

C<-inc =E<gt> 1>, C<-dirs =E<gt> [ $dir1, $dir2, ... ]>, C<-verbose =E<gt>
1>

=back

=over 4

=item C<contains_pod( $file , $verbose )>

=back

=over 4

=item AUTHOR

=item SEE ALSO

=back

=head2 Pod::Html - module to convert pod files to HTML

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item FUNCTIONS

=over 4

=item pod2html

backlink, cachedir, css, flush, header, help, htmldir, htmlroot, index,
infile, outfile, poderrors, podpath, podroot, quiet, recurse, title,
verbose

=item htmlify

=item anchorify

=back

=item ENVIRONMENT

=item AUTHOR

=item SEE ALSO

=item COPYRIGHT

=back

=head2 Pod::InputObjects - objects representing POD input paragraphs,
commands, etc.

=over 4

=item SYNOPSIS

=item REQUIRES

=item EXPORTS

=item DESCRIPTION

package B<Pod::InputSource>, package B<Pod::Paragraph>, package
B<Pod::InteriorSequence>, package B<Pod::ParseTree>

=back

=over 4

=item B<Pod::InputSource>

=back

=over 4

=item B<new()>

=back

=over 4

=item B<name()>

=back

=over 4

=item B<handle()>

=back

=over 4

=item B<was_cutting()>

=back

=over 4

=item B<Pod::Paragraph>

=back

=over 4

=item Pod::Paragraph-E<gt>B<new()>

=back

=over 4

=item $pod_para-E<gt>B<cmd_name()>

=back

=over 4

=item $pod_para-E<gt>B<text()>

=back

=over 4

=item $pod_para-E<gt>B<raw_text()>

=back

=over 4

=item $pod_para-E<gt>B<cmd_prefix()>

=back

=over 4

=item $pod_para-E<gt>B<cmd_separator()>

=back

=over 4

=item $pod_para-E<gt>B<parse_tree()>

=back

=over 4

=item $pod_para-E<gt>B<file_line()>

=back

=over 4

=item B<Pod::InteriorSequence>

=back

=over 4

=item Pod::InteriorSequence-E<gt>B<new()>

=back

=over 4

=item $pod_seq-E<gt>B<cmd_name()>

=back

=over 4

=item $pod_seq-E<gt>B<prepend()>

=back

=over 4

=item $pod_seq-E<gt>B<append()>

=back

=over 4

=item $pod_seq-E<gt>B<nested()>

=back

=over 4

=item $pod_seq-E<gt>B<raw_text()>

=back

=over 4

=item $pod_seq-E<gt>B<left_delimiter()>

=back

=over 4

=item $pod_seq-E<gt>B<right_delimiter()>

=back

=over 4

=item $pod_seq-E<gt>B<parse_tree()>

=back

=over 4

=item $pod_seq-E<gt>B<file_line()>

=back

=over 4

=item Pod::InteriorSequence::B<DESTROY()>

=back

=over 4

=item B<Pod::ParseTree>

=back

=over 4

=item Pod::ParseTree-E<gt>B<new()>

=back

=over 4

=item $ptree-E<gt>B<top()>

=back

=over 4

=item $ptree-E<gt>B<children()>

=back

=over 4

=item $ptree-E<gt>B<prepend()>

=back

=over 4

=item $ptree-E<gt>B<append()>

=back

=over 4

=item $ptree-E<gt>B<raw_text()>

=back

=over 4

=item Pod::ParseTree::B<DESTROY()>

=back

=over 4

=item SEE ALSO

=item AUTHOR

=back

=head2 Pod::Man - Convert POD data to formatted *roff input

=over 4

=item SYNOPSIS

=item DESCRIPTION

center, date, errors, fixed, fixedbold, fixeditalic, fixedbolditalic,
lquote, rquote, name, nourls, quotes, release, section, stderr, utf8

=item DIAGNOSTICS

roff font should be 1 or 2 chars, not "%s", Invalid errors setting "%s",
Invalid quote specification "%s", POD document had syntax errors

=item ENVIRONMENT

PERL_CORE, POD_MAN_DATE, SOURCE_DATE_EPOCH

=item BUGS

=item CAVEATS

=item AUTHOR

=item COPYRIGHT AND LICENSE

=item SEE ALSO

=back

=head2 Pod::ParseLink - Parse an LE<lt>E<gt> formatting code in POD text

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 Pod::ParseUtils - helpers for POD parsing and conversion

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=over 4

=item Pod::List

Pod::List-E<gt>new()

=back

$list-E<gt>file()

$list-E<gt>start()

$list-E<gt>indent()

$list-E<gt>type()

$list-E<gt>rx()

$list-E<gt>item()

$list-E<gt>parent()

$list-E<gt>tag()

=over 4

=item Pod::Hyperlink

Pod::Hyperlink-E<gt>new()

=back

$link-E<gt>parse($string)

$link-E<gt>markup($string)

$link-E<gt>text()

$link-E<gt>warning()

$link-E<gt>file(), $link-E<gt>line()

$link-E<gt>page()

$link-E<gt>node()

$link-E<gt>alttext()

$link-E<gt>type()

$link-E<gt>link()

=over 4

=item Pod::Cache

Pod::Cache-E<gt>new()

=back

$cache-E<gt>item()

$cache-E<gt>find_page($name)

=over 4

=item Pod::Cache::Item

Pod::Cache::Item-E<gt>new()

=back

$cacheitem-E<gt>page()

$cacheitem-E<gt>description()

$cacheitem-E<gt>path()

$cacheitem-E<gt>file()

$cacheitem-E<gt>nodes()

$cacheitem-E<gt>find_node($name)

$cacheitem-E<gt>idx()

=over 4

=item AUTHOR

=item SEE ALSO

=back

=head2 Pod::Parser - base class for creating POD filters and translators

=over 4

=item SYNOPSIS

=item REQUIRES

=item EXPORTS

=item DESCRIPTION

=item QUICK OVERVIEW

=item PARSING OPTIONS

B<-want_nonPODs> (default: unset), B<-process_cut_cmd> (default: unset),
B<-warnings> (default: unset)

=back

=over 4

=item RECOMMENDED SUBROUTINE/METHOD OVERRIDES

=back

=over 4

=item B<command()>

C<$cmd>, C<$text>, C<$line_num>, C<$pod_para>

=back

=over 4

=item B<verbatim()>

C<$text>, C<$line_num>, C<$pod_para>

=back

=over 4

=item B<textblock()>

C<$text>, C<$line_num>, C<$pod_para>

=back

=over 4

=item B<interior_sequence()>

=back

=over 4

=item OPTIONAL SUBROUTINE/METHOD OVERRIDES

=back

=over 4

=item B<new()>

=back

=over 4

=item B<initialize()>

=back

=over 4

=item B<begin_pod()>

=back

=over 4

=item B<begin_input()>

=back

=over 4

=item B<end_input()>

=back

=over 4

=item B<end_pod()>

=back

=over 4

=item B<preprocess_line()>

=back

=over 4

=item B<preprocess_paragraph()>

=back

=over 4

=item METHODS FOR PARSING AND PROCESSING

=back

=over 4

=item B<parse_text()>

B<-expand_seq> =E<gt> I<code-ref>|I<method-name>, B<-expand_text> =E<gt>
I<code-ref>|I<method-name>, B<-expand_ptree> =E<gt>
I<code-ref>|I<method-name>

=back

=over 4

=item B<interpolate()>

=back

=over 4

=item B<parse_paragraph()>

=back

=over 4

=item B<parse_from_filehandle()>

=back

=over 4

=item B<parse_from_file()>

=back

=over 4

=item ACCESSOR METHODS

=back

=over 4

=item B<errorsub()>

=back

=over 4

=item B<cutting()>

=back

=over 4

=item B<parseopts()>

=back

=over 4

=item B<output_file()>

=back

=over 4

=item B<output_handle()>

=back

=over 4

=item B<input_file()>

=back

=over 4

=item B<input_handle()>

=back

=over 4

=item B<input_streams()>

=back

=over 4

=item B<top_stream()>

=back

=over 4

=item PRIVATE METHODS AND DATA

=back

=over 4

=item B<_push_input_stream()>

=back

=over 4

=item B<_pop_input_stream()>

=back

=over 4

=item TREE-BASED PARSING

=item CAVEATS

=item SEE ALSO

=item AUTHOR

=item LICENSE

=back

=head2 Pod::Perldoc - Look up Perl documentation in Pod format.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back

=head2 Pod::Perldoc::BaseTo - Base for Pod::Perldoc formatters

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back

=head2 Pod::Perldoc::GetOptsOO - Customized option parser for Pod::Perldoc

=over 4

=item SYNOPSIS

=item DESCRIPTION

Call Pod::Perldoc::GetOptsOO::getopts($object, \@ARGV, $truth), Given -n,
if there's an opt_n_with, it'll call $object->opt_n_with( ARGUMENT )
(e.g., "-n foo" => $object->opt_n_with('foo'). Ditto "-nfoo"), Otherwise
(given -n) if there's an opt_n, we'll call it $object->opt_n($truth)
(Truth defaults to 1), Otherwise we try calling
$object->handle_unknown_option('n') (and we increment the error count by
the return value of it), If there's no handle_unknown_option, then we just
warn, and then increment the error counter

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back
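
The dispatch rules listed under DESCRIPTION above can be sketched in a few
lines of Perl.  This is a simplified illustration, not the module's own
code: the C<My::Handler> class and the C<dispatch_switches> function are
hypothetical names, and the sketch does not handle bundled switches such as
C<-vx> or the C<--> terminator.

    use strict;
    use warnings;

    # Hypothetical handler class following the opt_X / opt_X_with
    # naming convention described above.
    package My::Handler;
    sub new        { return bless { seen => [] }, shift }
    sub opt_n_with { my ($self, $arg)   = @_; push @{ $self->{seen} }, "n=$arg" }
    sub opt_v      { my ($self, $truth) = @_; push @{ $self->{seen} }, "v=$truth" }
    sub handle_unknown_option {
        my ($self, $opt) = @_;
        push @{ $self->{seen} }, "unknown=$opt";
        return 1;    # counted as one error by the dispatcher
    }

    package main;

    # Simplified dispatcher (an assumption-laden sketch): consumes leading
    # switches from @$args and returns the number of errors seen.
    sub dispatch_switches {
        my ($object, $args, $truth) = @_;
        $truth = 1 unless defined $truth;
        my $errors = 0;
        while (@$args and $args->[0] =~ /^-(.)(.*)/s) {
            my ($letter, $rest) = ($1, $2);
            shift @$args;
            if (my $with = $object->can("opt_${letter}_with")) {
                $rest = shift @$args if $rest eq '';    # "-n foo" form
                $object->$with($rest);                  # "-nfoo" form
            }
            elsif (my $plain = $object->can("opt_$letter")) {
                $object->$plain($truth);
            }
            elsif (my $unknown = $object->can('handle_unknown_option')) {
                $errors += $unknown->($object, $letter) || 0;
            }
            else {
                warn "Unknown option: $letter\n";
                $errors++;
            }
        }
        return $errors;
    }

    my $handler = My::Handler->new;
    my @args    = ('-n', 'foo', '-nbar', '-v', '-x', 'file.pod');
    my $errors  = dispatch_switches($handler, \@args);
    print join(', ', @{ $handler->{seen} }), "\n";
    print "errors=$errors, remaining=@args\n";

Using C<can> to look up each handler keeps the dispatcher free of any
hard-coded option table: adding a switch only requires adding a method
with the matching name to the handler class.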

=head2 Pod::Perldoc::ToANSI - render Pod with ANSI color escapes

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEAT

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back

=head2 Pod::Perldoc::ToChecker - let Perldoc check Pod for errors

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back

=head2 Pod::Perldoc::ToMan - let Perldoc render Pod as man pages

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEAT

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back

=head2 Pod::Perldoc::ToNroff - let Perldoc convert Pod to nroff

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEAT

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back

=head2 Pod::Perldoc::ToPod - let Perldoc render Pod as ... Pod!

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back

=head2 Pod::Perldoc::ToRtf - let Perldoc render Pod as RTF

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back

=head2 Pod::Perldoc::ToTerm - render Pod with terminal escapes

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item PAGER FORMATTING

=item CAVEAT

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back

=head2 Pod::Perldoc::ToText - let Perldoc render Pod as plaintext

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEAT

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back

=head2 Pod::Perldoc::ToTk - let Perldoc use Tk::Pod to render Pod

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item AUTHOR

=back

=head2 Pod::Perldoc::ToXml - let Perldoc render Pod as XML

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

=back

=head2 Pod::PlainText - Convert POD data to formatted ASCII text

=over 4

=item SYNOPSIS

=item DESCRIPTION

alt, indent, loose, sentence, width

=item DIAGNOSTICS

Bizarre space in item, Can't open %s for reading: %s, Unknown escape: %s,
Unknown sequence: %s, Unmatched =back

=item RESTRICTIONS

=item NOTES

=item SEE ALSO

=item AUTHOR

=back

=head2 Pod::Select, podselect() - extract selected sections of POD from
input

=over 4

=item SYNOPSIS

=item REQUIRES

=item EXPORTS

=item DESCRIPTION

=item SECTION SPECIFICATIONS

=item RANGE SPECIFICATIONS

=back

=over 4

=item OBJECT METHODS

=back

=over 4

=item B<curr_headings()>

=back

=over 4

=item B<select()>

=back

=over 4

=item B<add_selection()>

=back

=over 4

=item B<clear_selections()>

=back

=over 4

=item B<match_section()>

=back

=over 4

=item B<is_selected()>

=back

=over 4

=item EXPORTED FUNCTIONS

=back

=over 4

=item B<podselect()>

B<-output>, B<-sections>, B<-ranges>

=back

=over 4

=item PRIVATE METHODS AND DATA

=back

=over 4

=item B<_compile_section_spec()>

=back

=over 4

=item $self->{_SECTION_HEADINGS}

=back

=over 4

=item $self->{_SELECTED_SECTIONS}

=back

=over 4

=item SEE ALSO

=item AUTHOR

=back

=head2 Pod::Simple - framework for parsing Pod

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item MAIN METHODS

C<< $parser = I<SomeClass>->new(); >>, C<< $parser->output_fh( *OUT ); >>,
C<< $parser->output_string( \$somestring ); >>, C<< $parser->parse_file(
I<$some_filename> ); >>, C<< $parser->parse_file( *INPUT_FH ); >>, C<<
$parser->parse_string_document( I<$all_content> ); >>, C<<
$parser->parse_lines( I<...@lines...>, undef ); >>, C<<
$parser->content_seen >>, C<< I<SomeClass>->filter( I<$filename> ); >>, C<<
I<SomeClass>->filter( I<*INPUT_FH> ); >>, C<< I<SomeClass>->filter(
I<\$document_content> ); >>

=item SECONDARY METHODS

C<< $parser->parse_characters( I<SOMEVALUE> ) >>, C<< $parser->no_whining(
I<SOMEVALUE> ) >>, C<< $parser->no_errata_section( I<SOMEVALUE> ) >>, C<<
$parser->complain_stderr( I<SOMEVALUE> ) >>, C<< $parser->source_filename
>>, C<< $parser->doc_has_started >>, C<< $parser->source_dead >>, C<<
$parser->strip_verbatim_indent( I<SOMEVALUE> ) >>

=item TERTIARY METHODS

C<< $parser->abandon_output_fh() >>X<abandon_output_fh>, C<<
$parser->abandon_output_string() >>X<abandon_output_string>, C<<
$parser->accept_code( @codes ) >>X<accept_code>, C<< $parser->accept_codes(
@codes ) >>X<accept_codes>, C<< $parser->accept_directive_as_data(
@directives ) >>X<accept_directive_as_data>, C<<
$parser->accept_directive_as_processed( @directives )
>>X<accept_directive_as_processed>, C<<
$parser->accept_directive_as_verbatim( @directives )
>>X<accept_directive_as_verbatim>, C<< $parser->accept_target( @targets )
>>X<accept_target>, C<< $parser->accept_target_as_text( @targets )
>>X<accept_target_as_text>, C<< $parser->accept_targets( @targets )
>>X<accept_targets>, C<< $parser->accept_targets_as_text( @targets )
>>X<accept_targets_as_text>, C<< $parser->any_errata_seen()
>>X<any_errata_seen>, C<< $parser->errata_seen() >>X<errata_seen>, C<<
$parser->detected_encoding() >>X<detected_encoding>, C<<
$parser->encoding() >>X<encoding>, C<< $parser->parse_from_file( $source,
$to ) >>X<parse_from_file>, C<< $parser->scream( @error_messages )
>>X<scream>, C<< $parser->unaccept_code( @codes ) >>X<unaccept_code>, C<<
$parser->unaccept_codes( @codes ) >>X<unaccept_codes>, C<<
$parser->unaccept_directive( @directives ) >>X<unaccept_directive>, C<<
$parser->unaccept_directives( @directives ) >>X<unaccept_directives>, C<<
$parser->unaccept_target( @targets ) >>X<unaccept_target>, C<<
$parser->unaccept_targets( @targets ) >>X<unaccept_targets>, C<<
$parser->version_report() >>X<version_report>, C<< $parser->whine(
@error_messages ) >>X<whine>

=item ENCODING

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>, Gabor Szabo C<szabgab@gmail.com>,
Shawn H Corey C<SHCOREY at cpan.org>

=back

=head2 Pod::Simple::Checker -- check the Pod syntax of a document

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::Debug -- put Pod::Simple into trace/debug mode

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEATS

=item GUTS

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::DumpAsText -- dump Pod-parsing events as text

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::DumpAsXML -- turn Pod into XML

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::HTML - convert Pod to HTML

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CALLING FROM THE COMMAND LINE

=item CALLING FROM PERL

=over 4

=item Minimal code

=item More detailed example

=back

=item METHODS

=over 4

=item html_css

=item html_javascript

=item title_prefix

=item title_postfix

=item html_header_before_title

=item top_anchor

=item html_h_level

=item index

=item html_header_after_title

=item html_footer

=back

=item SUBCLASSING

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item ACKNOWLEDGEMENTS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::HTMLBatch - convert several Pod files to several HTML
files

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item FROM THE COMMAND LINE

=back

=item MAIN METHODS

$batchconv = Pod::Simple::HTMLBatch->new;, $batchconv->batch_convert(
I<indirs>, I<outdir> );, $batchconv->batch_convert( undef, ...);,
$batchconv->batch_convert( q{@INC}, ...);, $batchconv->batch_convert(
\@dirs , ...);, $batchconv->batch_convert( "somedir" , ...);,
$batchconv->batch_convert( 'somedir:someother:also' , ...);,
$batchconv->batch_convert( ... , undef );, $batchconv->batch_convert( ... ,
'somedir' );

=over 4

=item ACCESSOR METHODS

$batchconv->verbose( I<nonnegative_integer> );, $batchconv->index(
I<true-or-false> );, $batchconv->contents_file( I<filename> );,
$batchconv->contents_page_start( I<HTML_string> );,
$batchconv->contents_page_end( I<HTML_string> );, $batchconv->add_css( $url
);, $batchconv->add_javascript( $url );, $batchconv->css_flurry(
I<true-or-false> );, $batchconv->javascript_flurry( I<true-or-false> );,
$batchconv->no_contents_links( I<true-or-false> );,
$batchconv->html_render_class( I<classname> );, $batchconv->search_class(
I<classname> );

=back

=item NOTES ON CUSTOMIZATION

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::LinkSection -- represent "section" attributes of L
codes

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::Methody -- turn Pod::Simple events into method calls

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHOD CALLING

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::PullParser -- a pull-parser interface to parsing Pod

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

my $token = $parser->get_token, $parser->unget_token( $token ),
$parser->unget_token( $token1, $token2, ... ), $parser->set_source(
$filename ), $parser->set_source( $filehandle_object ),
$parser->set_source( \$document_source ), $parser->set_source(
\@document_lines ), $parser->parse_file(...),
$parser->parse_string_document(...), $parser->filter(...),
$parser->parse_from_file(...), my $title_string = $parser->get_title, my
$title_string = $parser->get_short_title, $author_name =
$parser->get_author, $description_name = $parser->get_description,
$version_block = $parser->get_version

=item NOTE

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::PullParserEndToken -- end-tokens from
Pod::Simple::PullParser

=over 4

=item SYNOPSIS

=item DESCRIPTION

$token->tagname, $token->tagname(I<somestring>), $token->tag(...),
$token->is_tag(I<somestring>) or $token->is_tagname(I<somestring>)

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::PullParserStartToken -- start-tokens from
Pod::Simple::PullParser

=over 4

=item SYNOPSIS

=item DESCRIPTION

$token->tagname, $token->tagname(I<somestring>), $token->tag(...),
$token->is_tag(I<somestring>) or $token->is_tagname(I<somestring>),
$token->attr(I<attrname>), $token->attr(I<attrname>, I<newvalue>),
$token->attr_hash

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::PullParserTextToken -- text-tokens from
Pod::Simple::PullParser

=over 4

=item SYNOPSIS

=item DESCRIPTION

$token->text, $token->text(I<somestring>), $token->text_r()

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::PullParserToken -- tokens from Pod::Simple::PullParser

=over 4

=item SYNOPSIS

=item DESCRIPTION

$token->type, $token->is_start, $token->is_text, $token->is_end,
$token->dump

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::RTF -- format Pod as RTF

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item FORMAT CONTROL ATTRIBUTES

$parser->head1_halfpoint_size( I<halfpoint_integer> );,
$parser->head2_halfpoint_size( I<halfpoint_integer> );,
$parser->head3_halfpoint_size( I<halfpoint_integer> );,
$parser->head4_halfpoint_size( I<halfpoint_integer> );,
$parser->codeblock_halfpoint_size( I<halfpoint_integer> );,
$parser->header_halfpoint_size( I<halfpoint_integer> );,
$parser->normal_halfpoint_size( I<halfpoint_integer> );,
$parser->no_proofing_exemptions( I<true_or_false> );, $parser->doc_lang(
I<microsoft_decimal_language_code> )

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::Search - find POD documents in directory trees

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CONSTRUCTOR

=item ACCESSORS

$search->inc( I<true-or-false> );, $search->verbose( I<nonnegative-number>
);, $search->limit_glob( I<some-glob-string> );, $search->callback(
I<\&some_routine> );, $search->laborious( I<true-or-false> );,
$search->recurse( I<true-or-false> );, $search->shadows( I<true-or-false>
);, $search->limit_re( I<some-regexp> );, $search->dir_prefix(
I<some-string-value> );, $search->progress( I<some-progress-object> );,
$name2path = $self->name2path;, $path2name = $self->path2name;

=item MAIN SEARCH METHODS

=over 4

=item C<< $search->survey( @directories ) >>

C<name2path>, C<path2name>

=item C<< $search->simplify_name( $str ) >>

=item C<< $search->find( $pod ) >>

=item C<< $search->find( $pod, @search_dirs ) >>

=item C<< $self->contains_pod( $file ) >>

=back

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::SimpleTree -- parse Pod into a simple parse tree

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=item Tree Contents

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::Subclassing -- write a formatter as a Pod::Simple
subclass

=over 4

=item SYNOPSIS

=item DESCRIPTION

Pod::Simple, Pod::Simple::Methody, Pod::Simple::PullParser,
Pod::Simple::SimpleTree

=item Events

C<< $parser->_handle_element_start( I<element_name>, I<attr_hashref> ) >>,
C<< $parser->_handle_element_end( I<element_name> ) >>, C<<
$parser->_handle_text( I<text_string> ) >>, events with an element_name
of Document, events with an element_name of Para, events with an
element_name of B, C, F, or I, events with an element_name of S, events
with an element_name of X, events with an element_name of L, events with an
element_name of E or Z, events with an element_name of Verbatim, events
with an element_name of head1 .. head4, events with an element_name of
encoding, events with an element_name of over-bullet, events with an
element_name of over-number, events with an element_name of over-text,
events with an element_name of over-block, events with an element_name of
over-empty, events with an element_name of item-bullet, events with an
element_name of item-number, events with an element_name of item-text,
events with an element_name of for, events with an element_name of Data

=item More Pod::Simple Methods

C<< $parser->accept_targets( I<SOMEVALUE> ) >>, C<<
$parser->accept_targets_as_text( I<SOMEVALUE> ) >>, C<<
$parser->accept_codes( I<Codename>, I<Codename>... ) >>, C<<
$parser->accept_directive_as_data( I<directive_name> ) >>, C<<
$parser->accept_directive_as_verbatim( I<directive_name> ) >>, C<<
$parser->accept_directive_as_processed( I<directive_name> ) >>, C<<
$parser->nbsp_for_S( I<BOOLEAN> ); >>, C<< $parser->version_report() >>,
C<< $parser->pod_para_count() >>, C<< $parser->line_count() >>, C<<
$parser->nix_X_codes( I<SOMEVALUE> ) >>, C<<
$parser->keep_encoding_directive( I<SOMEVALUE> ) >>, C<<
$parser->merge_text( I<SOMEVALUE> ) >>, C<< $parser->code_handler(
I<CODE_REF> ) >>, C<< $parser->cut_handler( I<CODE_REF> ) >>, C<<
$parser->pod_handler( I<CODE_REF> ) >>, C<< $parser->whiteline_handler(
I<CODE_REF> ) >>, C<< $parser->whine( I<linenumber>, I<complaint string> )
>>, C<< $parser->scream( I<linenumber>, I<complaint string> ) >>, C<<
$parser->source_dead(1) >>, C<< $parser->hide_line_numbers( I<SOMEVALUE> )
>>, C<< $parser->no_whining( I<SOMEVALUE> ) >>, C<<
$parser->no_errata_section( I<SOMEVALUE> ) >>, C<<
$parser->complain_stderr( I<SOMEVALUE> ) >>, C<< $parser->bare_output(
I<SOMEVALUE> ) >>, C<< $parser->preserve_whitespace( I<SOMEVALUE> ) >>, C<<
$parser->parse_empty_lists( I<SOMEVALUE> ) >>

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::Text -- format Pod as plaintext

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::TextContent -- get the text content of Pod

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::XHTML -- format Pod as validating XHTML

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Minimal code

=back

=back

=over 4

=item METHODS

=over 4

=item perldoc_url_prefix

=item perldoc_url_postfix

=item man_url_prefix

=item man_url_postfix

=item title_prefix, title_postfix

=item html_css

=item html_javascript

=item html_doctype

=item html_charset

=item html_header_tags

=item html_h_level

=item default_title

=item force_title

=item html_header, html_footer

=item index

=item anchor_items

=item backlink

=back

=back

=over 4

=item SUBCLASSING

=back

=over 4

=item handle_text

=item handle_code

=item accept_targets_as_html

=back

=over 4

=item resolve_pod_page_link

=back

=over 4

=item resolve_man_page_link

=back

=over 4

=item idify

=back

=over 4

=item batch_mode_page_object_init

=back

=over 4

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item ACKNOWLEDGEMENTS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Simple::XMLOutStream -- turn Pod into XML

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item ABOUT EXTENDING POD

=item SEE ALSO

=item SUPPORT

=item COPYRIGHT AND DISCLAIMERS

=item AUTHOR

Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>,
David E. Wheeler C<dwheeler@cpan.org>

=back

=head2 Pod::Text - Convert POD data to formatted text

=over 4

=item SYNOPSIS

=item DESCRIPTION

alt, code, errors, indent, loose, margin, nourls, quotes, sentence, stderr,
utf8, width

=item DIAGNOSTICS

Bizarre space in item, Item called without tag, Can't open %s for reading:
%s, Invalid errors setting "%s", Invalid quote specification "%s", POD
document had syntax errors

=item BUGS

=item CAVEATS

=item NOTES

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 Pod::Text::Color - Convert POD data to formatted color ASCII text

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item BUGS

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 Pod::Text::Overstrike - Convert POD data to formatted overstrike
text

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item BUGS

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 Pod::Text::Termcap - Convert POD data to ASCII text with format
escapes

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item ENVIRONMENT

=item NOTES

=item SEE ALSO

=item AUTHOR

=item COPYRIGHT AND LICENSE

=back

=head2 Pod::Usage - print a usage message from embedded pod documentation

=over 4

=item SYNOPSIS

=item ARGUMENTS

C<-message> I<string>, C<-msg> I<string>, C<-exitval> I<value>, C<-verbose>
I<value>, C<-sections> I<spec>, C<-output> I<handle>, C<-input> I<handle>,
C<-pathlist> I<string>, C<-noperldoc>, C<-perlcmd>, C<-perldoc>
I<path-to-perldoc>, C<-perldocopt> I<string>

=over 4

=item Formatting base class

=item Pass-through options

=back

=item DESCRIPTION

=over 4

=item Scripts

=back

=item EXAMPLES

=over 4

=item Recommended Use

=back

=item CAVEATS

=item AUTHOR

=item ACKNOWLEDGMENTS

=item SEE ALSO

=back

=head2 SDBM_File - Tied access to sdbm files

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Tie

=back

=item EXPORTS

=item DIAGNOSTICS

=over 4

=item C<sdbm store returned -1, errno 22, key "..." at ...>

=back

=item BUGS AND WARNINGS

=back

=head2 Safe - Compile and execute code in restricted compartments

=over 4

=item SYNOPSIS

=item DESCRIPTION

a new namespace, an operator mask

=item WARNING

=item METHODS

=over 4

=item permit (OP, ...)

=item permit_only (OP, ...)

=item deny (OP, ...)

=item deny_only (OP, ...)

=item trap (OP, ...), untrap (OP, ...)

=item share (NAME, ...)

=item share_from (PACKAGE, ARRAYREF)

=item varglob (VARNAME)

=item reval (STRING, STRICT)

=item rdo (FILENAME)

=item root (NAMESPACE)

=item mask (MASK)

=item wrap_code_ref (CODEREF)

=item wrap_code_refs_within (...)

=back

=item RISKS

Memory, CPU, Snooping, Signals, State Changes

=item AUTHOR

=back

=head2 Scalar::Util - A selection of general-utility scalar subroutines

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=over 4

=item FUNCTIONS FOR REFERENCES

=over 4

=item blessed

=item refaddr

=item reftype

=item weaken

=item unweaken

=item isweak

=back

=item OTHER FUNCTIONS

=over 4

=item dualvar

=item isdual

=item isvstring

=item looks_like_number

=item openhandle

=item readonly

=item set_prototype

=item tainted

=back

=item DIAGNOSTICS

Weak references are not implemented in the version of perl, Vstrings are
not implemented in the version of perl

=item KNOWN BUGS

=item SEE ALSO

=item COPYRIGHT

=back

=head2 Search::Dict - look - search for key in dictionary file

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 SelectSaver - save and restore selected file handle

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 SelfLoader - load functions only on demand

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item The __DATA__ token

=item SelfLoader autoloading

=item Autoloading and package lexicals

=item SelfLoader and AutoLoader

=item __DATA__, __END__, and the FOOBAR::DATA filehandle.

=item Classes and inherited methods.

=back

=item Multiple packages and fully qualified subroutine names

=item AUTHOR

=item COPYRIGHT AND LICENSE

a), b)

=back

=head2 Socket - networking constants and support functions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=over 4

=item CONSTANTS

=back

=over 4

=item PF_INET, PF_INET6, PF_UNIX, ...

=item AF_INET, AF_INET6, AF_UNIX, ...

=item SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, ...

=item SOCK_NONBLOCK, SOCK_CLOEXEC

=item SOL_SOCKET

=item SO_ACCEPTCONN, SO_BROADCAST, SO_ERROR, ...

=item IP_OPTIONS, IP_TOS, IP_TTL, ...

=item IPTOS_LOWDELAY, IPTOS_THROUGHPUT, IPTOS_RELIABILITY, ...

=item MSG_BCAST, MSG_OOB, MSG_TRUNC, ...

=item SHUT_RD, SHUT_RDWR, SHUT_WR

=item INADDR_ANY, INADDR_BROADCAST, INADDR_LOOPBACK, INADDR_NONE

=item IPPROTO_IP, IPPROTO_IPV6, IPPROTO_TCP, ...

=item TCP_CORK, TCP_KEEPALIVE, TCP_NODELAY, ...

=item IN6ADDR_ANY, IN6ADDR_LOOPBACK

=item IPV6_ADD_MEMBERSHIP, IPV6_MTU, IPV6_V6ONLY, ...

=back

=over 4

=item STRUCTURE MANIPULATORS

=back

=over 4

=item $family = sockaddr_family $sockaddr

=item $sockaddr = pack_sockaddr_in $port, $ip_address

=item ($port, $ip_address) = unpack_sockaddr_in $sockaddr

=item $sockaddr = sockaddr_in $port, $ip_address

=item ($port, $ip_address) = sockaddr_in $sockaddr

=item $sockaddr = pack_sockaddr_in6 $port, $ip6_address, [$scope_id,
[$flowinfo]]

=item ($port, $ip6_address, $scope_id, $flowinfo) = unpack_sockaddr_in6
$sockaddr

=item $sockaddr = sockaddr_in6 $port, $ip6_address, [$scope_id,
[$flowinfo]]

=item ($port, $ip6_address, $scope_id, $flowinfo) = sockaddr_in6 $sockaddr

=item $sockaddr = pack_sockaddr_un $path

=item ($path) = unpack_sockaddr_un $sockaddr

=item $sockaddr = sockaddr_un $path

=item ($path) = sockaddr_un $sockaddr

=item $ip_mreq = pack_ip_mreq $multiaddr, $interface

=item ($multiaddr, $interface) = unpack_ip_mreq $ip_mreq

=item $ip_mreq_source = pack_ip_mreq_source $multiaddr, $source, $interface

=item ($multiaddr, $source, $interface) = unpack_ip_mreq_source $ip_mreq

=item $ipv6_mreq = pack_ipv6_mreq $multiaddr6, $ifindex

=item ($multiaddr6, $ifindex) = unpack_ipv6_mreq $ipv6_mreq

=back

=over 4

=item FUNCTIONS

=back

=over 4

=item $ip_address = inet_aton $string

=item $string = inet_ntoa $ip_address

=item $address = inet_pton $family, $string

=item $string = inet_ntop $family, $address

=item ($err, @result) = getaddrinfo $host, $service, [$hints]

flags => INT, family => INT, socktype => INT, protocol => INT, family =>
INT, socktype => INT, protocol => INT, addr => STRING, canonname => STRING,
AI_PASSIVE, AI_CANONNAME, AI_NUMERICHOST

=item ($err, $hostname, $servicename) = getnameinfo $sockaddr, [$flags,
[$xflags]]

NI_NUMERICHOST, NI_NUMERICSERV, NI_NAMEREQD, NI_DGRAM, NIx_NOHOST,
NIx_NOSERV

=back

=over 4

=item getaddrinfo() / getnameinfo() ERROR CONSTANTS

EAI_AGAIN, EAI_BADFLAGS, EAI_FAMILY, EAI_NODATA, EAI_NONAME, EAI_SERVICE

=back

=over 4

=item EXAMPLES

=over 4

=item Lookup for connect()

=item Making a human-readable string out of an address

=item Resolving hostnames into IP addresses

=item Accessing socket options

=back

=back

=over 4

=item AUTHOR

=back

=head2 Storable - persistence for Perl data structures

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item MEMORY STORE

=item ADVISORY LOCKING

=item SPEED

=item CANONICAL REPRESENTATION

=item CODE REFERENCES

=item FORWARD COMPATIBILITY

utf8 data, restricted hashes, files from future versions of Storable

=item ERROR REPORTING

=item WIZARDS ONLY

=over 4

=item Hooks

C<STORABLE_freeze> I<obj>, I<cloning>, C<STORABLE_thaw> I<obj>, I<cloning>,
I<serialized>, .., C<STORABLE_attach> I<class>, I<cloning>, I<serialized>

=item Predicates

C<Storable::last_op_in_netorder>, C<Storable::is_storing>,
C<Storable::is_retrieving>

=item Recursion

=item Deep Cloning

=back

=item Storable magic

$info = Storable::file_magic( $filename ), C<version>, C<version_nv>,
C<major>, C<minor>, C<hdrsize>, C<netorder>, C<byteorder>, C<intsize>,
C<longsize>, C<ptrsize>, C<nvsize>, C<file>, $info = Storable::read_magic(
$buffer ), $info = Storable::read_magic( $buffer, $must_be_file )

=item EXAMPLES

=item SECURITY WARNING

=item WARNING

=item BUGS

=over 4

=item 64 bit data in perl 5.6.0 and 5.6.1

=back

=item CREDITS

=item AUTHOR

=item SEE ALSO

=back

=head2 Sub::Util - A selection of utility subroutines for subs and CODE
references

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=over 4

=item FUNCTIONS

=back

=over 4

=item prototype

=back

=over 4

=item set_prototype

=back

=over 4

=item subname

=back

=over 4

=item set_subname

=back

=over 4

=item AUTHOR

=back

=head2 Symbol - manipulate Perl symbols and their names

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item BUGS

=back

=head2 Sys::Hostname - Try every conceivable way to get hostname

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item AUTHOR

=back

=head2 Sys::Syslog - Perl interface to the UNIX syslog(3) calls

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item EXPORTS

=item FUNCTIONS

B<openlog($ident, $logopt, $facility)>, B<syslog($priority, $message)>,
B<syslog($priority, $format, @args)>, B<Note>,
B<setlogmask($mask_priority)>, B<setlogsock()>, B<Note>, B<closelog()>

=item THE RULES OF SYS::SYSLOG

=item EXAMPLES

=item CONSTANTS

=over 4

=item Facilities

=item Levels

=back

=item DIAGNOSTICS

C<Invalid argument passed to setlogsock>, C<eventlog passed to setlogsock,
but no Win32 API available>, C<no connection to syslog available>, C<stream
passed to setlogsock, but %s is not writable>, C<stream passed to
setlogsock, but could not find any device>, C<tcp passed to setlogsock, but
tcp service unavailable>, C<syslog: expecting argument %s>, C<syslog:
invalid level/facility: %s>, C<syslog: too many levels given: %s>,
C<syslog: too many facilities given: %s>, C<syslog: level must be given>,
C<udp passed to setlogsock, but udp service unavailable>, C<unix passed to
setlogsock, but path not available>

=item HISTORY

=item SEE ALSO

=over 4

=item Other modules

=item Manual Pages

=item RFCs

=item Articles

=item Event Log

=back

=item AUTHORS & ACKNOWLEDGEMENTS

=item BUGS

=item SUPPORT

Perl Documentation, MetaCPAN, Search CPAN, AnnoCPAN: Annotated CPAN
documentation, CPAN Ratings, RT: CPAN's request tracker

=item COPYRIGHT

=item LICENSE

=back

=head2 TAP::Base - Base class that provides common functionality to
L<TAP::Parser> and L<TAP::Harness>

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=head2 TAP::Formatter::Base - Base class for harness output delegates

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item SYNOPSIS

=back

=over 4

=item METHODS

=over 4

=item Class Methods

C<verbosity>, C<verbose>, C<timer>, C<failures>, C<comments>, C<quiet>,
C<really_quiet>, C<silent>, C<errors>, C<directives>, C<stdout>, C<color>,
C<jobs>, C<show_count>

=back

=back

=head2 TAP::Formatter::Color - Run Perl test scripts with color

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item METHODS

=over 4

=item Class Methods

=back

=back

=head2 TAP::Formatter::Console - Harness output delegate for default
console output

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item SYNOPSIS

=over 4

=item C<< open_test >>

=back

=back

=head2 TAP::Formatter::Console::ParallelSession - Harness output delegate
for parallel console output

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item SYNOPSIS

=back

=over 4

=item METHODS

=over 4

=item Class Methods

=back

=back

=head2 TAP::Formatter::Console::Session - Harness output delegate for
default console output

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=back

=over 4

=item C<< clear_for_close >>

=item C<< close_test >>

=item C<< header >>

=item C<< result >>

=back

=head2 TAP::Formatter::File - Harness output delegate for file output

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item SYNOPSIS

=over 4

=item C<< open_test >>

=back

=back

=head2 TAP::Formatter::File::Session - Harness output delegate for file
output

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=back

=over 4

=item METHODS

=over 4

=item result

=back

=back

=over 4

=item close_test

=back

=head2 TAP::Formatter::Session - Abstract base class for harness output
delegate

=over 4

=item VERSION

=back

=over 4

=item METHODS

=over 4

=item Class Methods

C<formatter>, C<parser>, C<name>, C<show_count>

=back

=back

=head2 TAP::Harness - Run test scripts with statistics

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item SYNOPSIS

=back

=over 4

=item METHODS

=over 4

=item Class Methods

C<verbosity>, C<timer>, C<failures>, C<comments>, C<show_count>,
C<normalize>, C<lib>, C<switches>, C<test_args>, C<color>, C<exec>,
C<merge>, C<sources>, C<aggregator_class>, C<version>, C<formatter_class>,
C<multiplexer_class>, C<parser_class>, C<scheduler_class>, C<formatter>,
C<errors>, C<directives>, C<ignore_exit>, C<jobs>, C<rules>, C<rulesfiles>,
C<stdout>, C<trap>

=back

=back

=over 4

=item Instance Methods

=back

the source name of a test to run, a reference to a [ source name, display
name ] array

=over 4

=item CONFIGURING

=over 4

=item Plugins

=item C<Module::Build>

=item C<ExtUtils::MakeMaker>

=item C<prove>

=back

=item WRITING PLUGINS

Customize how TAP gets into the parser, Customize how TAP results are
output from the parser

=item SUBCLASSING

=over 4

=item Methods

L</new>, L</runtests>, L</summary>

=back

=back

=over 4

=item REPLACING

=item SEE ALSO

=back

=head2 TAP::Harness::Beyond, Test::Harness::Beyond - Beyond make test

=over 4

=item Beyond make test

=over 4

=item Saved State

=item Parallel Testing

=item Non-Perl Tests

=item Mixing it up

=item Rolling My Own

=item Deeper Customisation

=item Callbacks

=item Parsing TAP

=item Getting Support

=back

=back

=head2 TAP::Harness::Env - Parsing harness related environmental variables
where appropriate

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

create( \%args )

=item ENVIRONMENTAL VARIABLES

C<HARNESS_PERL_SWITCHES>, C<HARNESS_VERBOSE>, C<HARNESS_SUBCLASS>,
C<HARNESS_OPTIONS>, C<< j<n> >>, C<< c >>, C<< a<file.tgz> >>, C<<
fPackage-With-Dashes >>, C<HARNESS_TIMER>, C<HARNESS_COLOR>,
C<HARNESS_IGNORE_EXIT>

=back

=head2 TAP::Object - Base class that provides common functionality to all
C<TAP::*> modules

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item Instance Methods

=back

=head2 TAP::Parser - Parse L<TAP|Test::Harness::TAP> output

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

C<source>, C<tap>, C<exec>, C<sources>, C<callback>, C<switches>,
C<test_args>, C<spool>, C<merge>, C<grammar_class>,
C<result_factory_class>, C<iterator_factory_class>

=back

=back

=over 4

=item Instance Methods

=back

=over 4

=item INDIVIDUAL RESULTS

=over 4

=item Result types

Version, Plan, Pragma, Test, Comment, Bailout, Unknown

=item Common type methods

=item C<plan> methods

=item C<pragma> methods

=item C<comment> methods

=item C<bailout> methods

=item C<unknown> methods

=item C<test> methods

=back

=item TOTAL RESULTS

=over 4

=item Individual Results

=back

=back

=over 4

=item Pragmas

=back

=over 4

=item Summary Results

=back

=over 4

=item C<ignore_exit>

=back

Misplaced plan, No plan, More than one plan, Test numbers out of sequence

=over 4

=item CALLBACKS

C<test>, C<version>, C<plan>, C<comment>, C<bailout>, C<yaml>, C<unknown>,
C<ELSE>, C<ALL>, C<EOF>

=item TAP GRAMMAR

=item BACKWARDS COMPATIBILITY

=over 4

=item Differences

TODO plans, 'Missing' tests

=back

=item SUBCLASSING

=over 4

=item Parser Components

option 1, option 2

=back

=item ACKNOWLEDGMENTS

Michael Schwern, Andy Lester, chromatic, GEOFFR, Shlomi Fish, Torsten
Schoenfeld, Jerry Gay, Aristotle, Adam Kennedy, Yves Orton, Adrian Howard,
Sean & Lil, Andreas J. Koenig, Florian Ragwitz, Corion, Mark Stosberg, Matt
Kraai, David Wheeler, Alex Vandiver, Cosimo Streppone, Ville Skyttä

=item AUTHORS

=item BUGS

=item COPYRIGHT & LICENSE

=back

=head2 TAP::Parser::Aggregator - Aggregate TAP::Parser results

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item Instance Methods

=back

=over 4

=item Summary methods

failed, parse_errors, passed, planned, skipped, todo, todo_passed, wait,
exit

=back

Failed tests, Parse errors, Bad exit or wait status

=over 4

=item See Also

=back

=head2 TAP::Parser::Grammar - A grammar for the Test Anything Protocol.

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item Instance Methods

=back

=over 4

=item TAP GRAMMAR

=item SUBCLASSING

=item SEE ALSO

=back

=head2 TAP::Parser::Iterator - Base class for TAP source iterators

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=item Instance Methods

=back

=back

=over 4

=item SUBCLASSING

=over 4

=item Example

=back

=item SEE ALSO

=back

=head2 TAP::Parser::Iterator::Array - Iterator for array-based TAP sources

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=item Instance Methods

=back

=back

=over 4

=item ATTRIBUTION

=item SEE ALSO

=back

=head2 TAP::Parser::Iterator::Process - Iterator for process-based TAP
sources

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=item Instance Methods

=back

=back

=over 4

=item ATTRIBUTION

=item SEE ALSO

=back

=head2 TAP::Parser::Iterator::Stream - Iterator for filehandle-based TAP
sources

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item Instance Methods

=back

=over 4

=item ATTRIBUTION

=item SEE ALSO

=back

=head2 TAP::Parser::IteratorFactory - Figures out which SourceHandler
objects to use for a given Source

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item Instance Methods

=back

=over 4

=item SUBCLASSING

=over 4

=item Example

=back

=item AUTHORS

=item ATTRIBUTION

=item SEE ALSO

=back

=head2 TAP::Parser::Multiplexer - Multiplex multiple TAP::Parsers

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item Instance Methods

=back

=over 4

=item See Also

=back

=head2 TAP::Parser::Result - Base class for TAP::Parser output objects

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=over 4

=item DESCRIPTION

=item METHODS

=back

=back

=over 4

=item Boolean methods

C<is_plan>, C<is_pragma>, C<is_test>, C<is_comment>, C<is_bailout>,
C<is_version>, C<is_unknown>, C<is_yaml>

=back

=over 4

=item SUBCLASSING

=over 4

=item Example

=back

=item SEE ALSO

=back

=head2 TAP::Parser::Result::Bailout - Bailout result token.

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item OVERRIDDEN METHODS

C<as_string>

=back

=over 4

=item Instance Methods

=back

=head2 TAP::Parser::Result::Comment - Comment result token.

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item OVERRIDDEN METHODS

C<as_string>

=back

=over 4

=item Instance Methods

=back

=head2 TAP::Parser::Result::Plan - Plan result token.

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item OVERRIDDEN METHODS

C<as_string>, C<raw>

=back

=over 4

=item Instance Methods

=back

=head2 TAP::Parser::Result::Pragma - TAP pragma token.

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item OVERRIDDEN METHODS

C<as_string>, C<raw>

=back

=over 4

=item Instance Methods

=back

=head2 TAP::Parser::Result::Test - Test result token.

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item OVERRIDDEN METHODS

=over 4

=item Instance Methods

=back

=back

=head2 TAP::Parser::Result::Unknown - Unknown result token.

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item OVERRIDDEN METHODS

C<as_string>, C<raw>

=back

=head2 TAP::Parser::Result::Version - TAP syntax version token.

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item OVERRIDDEN METHODS

C<as_string>, C<raw>

=back

=over 4

=item Instance Methods

=back

=head2 TAP::Parser::Result::YAML - YAML result token.

=over 4

=item VERSION

=back

=over 4

=item DESCRIPTION

=item OVERRIDDEN METHODS

C<as_string>, C<raw>

=back

=over 4

=item Instance Methods

=back

=head2 TAP::Parser::ResultFactory - Factory for creating TAP::Parser output
objects

=over 4

=item SYNOPSIS

=item VERSION

=back

=over 4

=item DESCRIPTION

=item METHODS

=item Class Methods

=back

=over 4

=item SUBCLASSING

=over 4

=item Example

=back

=item SEE ALSO

=back

=head2 TAP::Parser::Scheduler - Schedule tests during parallel testing

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=item Rules data structure

By default, all tests are eligible to be run in parallel. Specifying any of
your own rules removes this one, "First match wins". The first rule that
matches a test will be the one that applies, Any test which does not match
a rule will be run in sequence at the end of the run, The existence of a
rule does not imply selecting a test. You must still specify the tests to
run, Specifying a rule to allow tests to run in parallel does not make them
run in parallel. You still need to specify the number of parallel C<jobs> in
your Harness object

=back

=back

=over 4

=item Instance Methods

=back

=head2 TAP::Parser::Scheduler::Job - A single testing job.

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item Instance Methods

=back

=over 4

=item Attributes

=back

=head2 TAP::Parser::Scheduler::Spinner - A no-op job.

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item Instance Methods

=back

=over 4

=item SEE ALSO

=back

=head2 TAP::Parser::Source - a TAP source & meta data about it

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item Instance Methods

=back

=over 4

=item AUTHORS

=item SEE ALSO

=back

=head2 TAP::Parser::SourceHandler - Base class for different TAP source
handlers

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item SUBCLASSING

=over 4

=item Example

=back

=item AUTHORS

=item SEE ALSO

=back

=head2 TAP::Parser::SourceHandler::Executable - Stream output from an
executable TAP source

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item SUBCLASSING

=over 4

=item Example

=back

=item SEE ALSO

=back

=head2 TAP::Parser::SourceHandler::File - Stream TAP from a text file.

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item CONFIGURATION

=item SUBCLASSING

=item SEE ALSO

=back

=head2 TAP::Parser::SourceHandler::Handle - Stream TAP from an IO::Handle
or a GLOB.

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item SUBCLASSING

=item SEE ALSO

=back

=head2 TAP::Parser::SourceHandler::Perl - Stream TAP from a Perl executable

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item SUBCLASSING

=over 4

=item Example

=back

=item SEE ALSO

=back

=head2 TAP::Parser::SourceHandler::RawTAP - Stream output from raw TAP in a
scalar/array ref.

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=back

=back

=over 4

=item SUBCLASSING

=item SEE ALSO

=back

=head2 TAP::Parser::YAMLish::Reader - Read YAMLish data from iterator

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=item Instance Methods

=back

=item AUTHOR

=item SEE ALSO

=item COPYRIGHT

=back

=head2 TAP::Parser::YAMLish::Writer - Write YAMLish data

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=over 4

=item Class Methods

=item Instance Methods

a reference to a scalar to append YAML to, the handle of an open file, a
reference to an array into which YAML will be pushed, a code reference

=back

=item AUTHOR

=item SEE ALSO

=item COPYRIGHT

=back

=head2 Term::ANSIColor - Color screen output using ANSI escape sequences

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Supported Colors

=item Function Interface

color(ATTR[, ATTR ...]), colored(STRING, ATTR[, ATTR ...]),
colored(ATTR-REF, STRING[, STRING...]), uncolor(ESCAPE),
colorstrip(STRING[, STRING ...]), colorvalid(ATTR[, ATTR ...]),
coloralias(ALIAS[, ATTR])

=item Constant Interface

=item The Color Stack

=back

=item DIAGNOSTICS

Bad color mapping %s, Bad escape sequence %s, Bareword "%s" not allowed
while "strict subs" in use, Cannot alias standard color %s, Cannot alias
standard color %s in %s, Invalid alias name %s, Invalid alias name %s in
%s, Invalid attribute name %s, Invalid attribute name %s in %s, Name "%s"
used only once: possible typo, No comma allowed after filehandle, No name
for escape sequence %s

=item ENVIRONMENT

ANSI_COLORS_ALIASES, ANSI_COLORS_DISABLED

=item COMPATIBILITY

=item RESTRICTIONS

=item NOTES

=item AUTHORS

=item COPYRIGHT AND LICENSE

=item SEE ALSO

=back

=head2 Term::Cap - Perl termcap interface

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item METHODS

=back

=back

B<Tgetent>, OSPEED, TERM

B<Tpad>, B<$string>, B<$cnt>, B<$FH>

B<Tputs>, B<$cap>, B<$cnt>, B<$FH>

B<Tgoto>, B<$cap>, B<$col>, B<$row>, B<$FH>

B<Trequire>

=over 4

=item EXAMPLES

=item COPYRIGHT AND LICENSE

=item AUTHOR

=item SEE ALSO

=back

=head2 Term::Complete - Perl word completion module

=over 4

=item SYNOPSIS

=item DESCRIPTION

E<lt>tabE<gt>, ^D, ^U, E<lt>delE<gt>, E<lt>bsE<gt>

=item DIAGNOSTICS

=item BUGS

=item AUTHOR

=back

=head2 Term::ReadLine - Perl interface to various C<readline> packages.
If no real package is found, substitutes stubs instead of basic functions.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Minimal set of supported functions

C<ReadLine>, C<new>, C<readline>, C<addhistory>, C<IN>, C<OUT>, C<MinLine>,
C<findConsole>, C<Attribs>, C<Features>

=item Additional supported functions

C<tkRunning>, C<event_loop>, C<ornaments>, C<newTTY>

=item EXPORTS

=item ENVIRONMENT

=back

=head2 Test - provides a simple framework for writing test scripts

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item QUICK START GUIDE

=over 4

=item Functions

C<plan(...)>, C<tests =E<gt> I<number>>, C<todo =E<gt> [I<1,5,14>]>,
C<onfail =E<gt> sub { ... }>, C<onfail =E<gt> \&some_sub>

=back

=back

B<_to_value>

C<ok(...)>

C<skip(I<skip_if_true>, I<args...>)>

=over 4

=item TEST TYPES

NORMAL TESTS, SKIPPED TESTS, TODO TESTS

=item ONFAIL

=item BUGS and CAVEATS

=item ENVIRONMENT

=item NOTE

=item SEE ALSO

=item AUTHOR

=back

=head2 Test2 - Framework for writing test tools that all work together.

=over 4

=item DESCRIPTION

=over 4

=item WHAT IS NEW?

Easier to test new testing tools, Better diagnostics capabilities, Event
driven, More complete API, Support for output other than TAP, Subtest
implementation is more sane, Support for threading/forking

=back

=item GETTING STARTED

=back

=head2 Test2, This describes the namespace layout for the Test2 ecosystem.
Not all the namespaces listed here are part of the Test2 distribution; some
are implemented in L<Test2::Suite>.

=over 4

=item Test2::Tools::

=item Test2::Plugin::

=item Test2::Bundle::

=item Test2::Require::

=item Test2::Formatter::

=item Test2::Event::

=item Test2::Hub::

=item Test2::IPC::

=item Test2::Util::

=item Test2::API::

=item Test2::

=back

=over 4

=item SEE ALSO

=item CONTACTING US

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::API - Primary interface for writing Test2 based testing
tools.

=over 4

=item ***INTERNALS NOTE***

=item DESCRIPTION

=item SYNOPSIS

=over 4

=item WRITING A TOOL

=item TESTING YOUR TOOLS

=item OTHER API FUNCTIONS

=back

=item MAIN API EXPORTS

=over 4

=item context(...)

$ctx = context(), $ctx = context(%params), level => $int, wrapped => $int,
stack => $stack, hub => $hub, on_init => sub { ... }, on_release => sub {
... }

=item release($;$)

release $ctx;, release $ctx, ...;

=item context_do(&;@)

=item no_context(&;$)

no_context { ... };, no_context { ... } $hid;

=item intercept(&)

=item run_subtest(...)

$NAME, \&CODE, $BUFFERED or \%PARAMS, 'buffered' => $bool, 'inherit_trace'
=> $bool, @ARGS, Things not affected by this flag, Things that are affected
by this flag, Things that are formatter dependent

=back

=item OTHER API EXPORTS

=over 4

=item STATUS AND INITIALIZATION STATE

$bool = test2_init_done(), $bool = test2_load_done(), test2_set_is_end(),
test2_set_is_end($bool), $bool = test2_get_is_end(), $stack =
test2_stack(), $bool = test2_no_wait(), test2_no_wait($bool)

=item BEHAVIOR HOOKS

test2_add_callback_exit(sub { ... }), test2_add_callback_post_load(sub {
... }), test2_add_callback_context_acquire(sub { ... }),
test2_add_callback_context_init(sub { ... }),
test2_add_callback_context_release(sub { ... }), @list =
test2_list_context_acquire_callbacks(), @list =
test2_list_context_init_callbacks(), @list =
test2_list_context_release_callbacks(), @list =
test2_list_exit_callbacks(), @list = test2_list_post_load_callbacks()

=item IPC AND CONCURRENCY

$ipc = test2_ipc(), test2_ipc_add_driver($DRIVER), @drivers =
test2_ipc_drivers(), $bool = test2_ipc_polling(),
test2_ipc_enable_polling(), test2_ipc_disable_polling(),
test2_ipc_enable_shm(), test2_ipc_set_pending($uniq_val), $pending =
test2_ipc_get_pending()

=item MANAGING FORMATTERS

$formatter = test2_formatter, test2_formatter_set($class_or_instance),
@formatters = test2_formatters(), test2_formatter_add($class_or_instance)

=back

=item OTHER EXAMPLES

=item SEE ALSO

=item MAGIC

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::API::Breakage - What breaks at what version

=over 4

=item DESCRIPTION

=item FUNCTIONS

%mod_ver = upgrade_suggested(), %mod_ver =
Test2::API::Breakage->upgrade_suggested(), %mod_ver = upgrade_required(),
%mod_ver = Test2::API::Breakage->upgrade_required(), %mod_ver =
known_broken(), %mod_ver = Test2::API::Breakage->known_broken()

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::API::Context - Object to represent a testing context.

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item CRITICAL DETAILS

You MUST always use the context() sub from Test2::API, You MUST always
release the context when done with it, You MUST NOT pass context objects
around, You MUST NOT store or cache a context for later, You SHOULD obtain
your context as soon as possible in a given tool

=item METHODS

$ctx->done_testing;, $clone = $ctx->snapshot(), $ctx->release(),
$ctx->throw($message), $ctx->alert($message), $stack = $ctx->stack(), $hub
= $ctx->hub(), $dbg = $ctx->trace(), $ctx->do_in_context(\&code, @args);,
$ctx->restore_error_vars(), $! = $ctx->errno(), $? = $ctx->child_error(),
$@ = $ctx->eval_error()

=over 4

=item EVENT PRODUCTION METHODS

$event = $ctx->ok($bool, $name), $event = $ctx->ok($bool, $name,
\@on_fail), $event = $ctx->info($renderer, diagnostics => $bool,
%other_params), $event = $ctx->note($message), $event =
$ctx->diag($message), $event = $ctx->plan($max), $event = $ctx->plan(0,
'SKIP', $reason), $event = $ctx->skip($name, $reason);, $event =
$ctx->bail($reason), $event = $ctx->send_event($Type, %parameters), $event
= $ctx->build_event($Type, %parameters)

=back

=item HOOKS

=over 4

=item INIT HOOKS

=item RELEASE HOOKS

=back

=item THIRD PARTY META-DATA

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>, Kent Fredric
E<lt>kentnl@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::API::Instance - Object used by Test2::API under the hood

=over 4

=item DESCRIPTION

=item SYNOPSIS

$pid = $obj->pid, $obj->tid, $obj->reset(), $obj->load(), $bool =
$obj->loaded, $arrayref = $obj->post_load_callbacks,
$obj->add_post_load_callback(sub { ... }), $hashref = $obj->contexts(),
$arrayref = $obj->context_acquire_callbacks, $arrayref =
$obj->context_init_callbacks, $arrayref = $obj->context_release_callbacks,
$obj->add_context_init_callback(sub { ... }),
$obj->add_context_release_callback(sub { ... }), $obj->set_exit(),
$obj->ipc_enable_shm(), $shm_id = $obj->ipc_shm_id(), $shm_size =
$obj->ipc_shm_size(), $shm_last_val = $obj->ipc_shm_last(),
$obj->set_ipc_pending($val), $pending = $obj->get_ipc_pending(), $drivers =
$obj->ipc_drivers, $obj->add_ipc_driver($DRIVER_CLASS), $bool =
$obj->ipc_polling, $obj->enable_ipc_polling, $obj->disable_ipc_polling,
$bool = $obj->no_wait, $bool = $obj->set_no_wait($bool), $arrayref =
$obj->exit_callbacks, $obj->add_exit_callback(sub { ... }), $bool =
$obj->finalized, $ipc = $obj->ipc, $stack = $obj->stack, $formatter =
$obj->formatter, $bool = $obj->formatter_set(),
$obj->add_formatter($class), $obj->add_formatter($obj)

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::API::Stack - Object to manage a stack of L<Test2::Hub>
instances.

=over 4

=item ***INTERNALS NOTE***

=item DESCRIPTION

=item SYNOPSIS

=item METHODS

$stack = Test2::API::Stack->new(), $hub = $stack->new_hub(), $hub =
$stack->new_hub(%params), $hub = $stack->new_hub(%params, class => $class),
$hub = $stack->top(), $hub = $stack->peek(), $stack->cull, @hubs =
$stack->all, $stack->clear, $stack->push($hub), $stack->pop($hub)

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event - Base class for events

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item METHODS

$trace = $e->trace, $bool = $e->causes_fail, $bool = $e->increments_count,
$e->callback($hub), $call = $e->created, $num = $e->nested, $bool =
$e->global, $code = $e->terminate, $todo = $e->todo, $e->set_todo($todo),
$bool = $e->diag_todo, $e->diag_todo($todo), $msg = $e->summary, ($count,
$directive, $reason) = $e->sets_plan(), $bool = $e->diagnostics, $bool =
$e->no_display, $id = $e->in_subtest, $id = $e->subtest_id, $hashref =
$e->TO_JSON, $e = Test2::Event->from_json(%$hashref)

=item THIRD PARTY META-DATA

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::Bail - Bailout!

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item METHODS

$reason = $e->reason

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::Diag - Diag event type

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item ACCESSORS

$diag->message

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::Encoding - Set the encoding for the output stream

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item METHODS

$encoding = $e->encoding

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::Exception - Exception event

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item METHODS

$reason = $e->error

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::Generic - Generic event type.

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item METHODS

$e->callback($hub), $e->set_callback(sub { ... }), $bool = $e->causes_fail,
$e->set_causes_fail($bool), $bool = $e->diagnostics,
$e->set_diagnostics($bool), $bool_or_undef = $e->global, @bool_or_empty =
$e->global, $e->set_global($bool_or_undef), $bool = $e->increments_count,
$e->set_increments_count($bool), $bool = $e->no_display,
$e->set_no_display($bool), @plan = $e->sets_plan,
$e->set_sets_plan(\@plan), $summary = $e->summary,
$e->set_summary($summary_or_undef), $int_or_undef = $e->terminate,
@int_or_empty = $e->terminate, $e->set_terminate($int_or_undef)

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::Info - Info event base class

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item FORMATS

'text', 'ansi', 'html'

=item ACCESSORS

$bool = $info->diagnostics(), $info->set_diagnostics($bool)

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::Note - Note event type

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item ACCESSORS

$note->message

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::Ok - Ok event type

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item ACCESSORS

$rb = $e->pass, $name = $e->name, $b = $e->effective_pass, $b =
$e->allow_bad_name

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::Plan - The event of a plan

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item ACCESSORS

$num = $plan->max, $dir = $plan->directive, $reason = $plan->reason

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::Skip - Skip event type

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item ACCESSORS

$reason = $e->reason

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::Subtest - Event for subtest types

=over 4

=item DESCRIPTION

=item ACCESSORS

$arrayref = $e->subevents, $bool = $e->buffered

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::TAP::Version - Event for TAP version.

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item METHODS

$version = $e->version

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Event::Waiting - Tell all procs/threads it is time to be done

=over 4

=item DESCRIPTION

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Formatter - Namespace for formatters.

=over 4

=item DESCRIPTION

=item CREATING FORMATTERS

The number of tests that were planned, The number of tests actually seen,
The number of tests which failed, A boolean indicating whether or not the
test suite passed, A boolean indicating whether or not this call is for a
subtest

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Formatter::TAP - Standard TAP formatter

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item METHODS

$bool = $tap->no_numbers, $tap->set_no_numbers($bool), $arrayref =
$tap->handles, $tap->set_handles(\@handles);, $encoding = $tap->encoding,
$tap->encoding($encoding), $tap->write($e, $num),
Test2::Formatter::TAP->register_event($pkg, sub { ... });

=over 4

=item EVENT METHODS

@out = $TAP->event_ok($e), @out = $TAP->event_ok($e, $num), @out =
$TAP->event_plan($e), @out = $TAP->event_plan($e, $num), @out =
$TAP->event_note($e), @out = $TAP->event_note($e, $num), @out =
$TAP->event_diag($e), @out = $TAP->event_diag($e, $num), @out =
$TAP->event_bail($e), @out = $TAP->event_bail($e, $num), @out =
$TAP->event_exception($e), @out = $TAP->event_exception($e, $num), @out =
$TAP->event_skip($e), @out = $TAP->event_skip($e, $num), @out =
$TAP->event_subtest($e), @out = $TAP->event_subtest($e, $num), @out =
$TAP->event_other($e, $num)

=back

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>, Kent Fredric
E<lt>kentnl@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Hub - The conduit through which all events flow.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item COMMON TASKS

=over 4

=item SENDING EVENTS

=item ALTERING OR REMOVING EVENTS

=item LISTENING FOR EVENTS

=item POST-TEST BEHAVIORS

=item SETTING THE FORMATTER

=back

=item METHODS

$hub->send($event), $hub->process($event), $old = $hub->format($formatter),
$sub = $hub->listen(sub { ... }, %optional_params), $hub->unlisten($sub),
$sub = $hub->filter(sub { ... }, %optional_params), $sub =
$hub->pre_filter(sub { ... }, %optional_params), $hub->unfilter($sub),
$hub->pre_unfilter($sub), $hub->follow_op(sub { ... }), $sub =
$hub->add_context_acquire(sub { ... });,
$hub->remove_context_acquire($sub);, $sub = $hub->add_context_init(sub {
... });, $hub->remove_context_init($sub);, $sub =
$hub->add_context_release(sub { ... });,
$hub->remove_context_release($sub);, $hub->cull(), $pid = $hub->pid(), $tid
= $hub->tid(), $hid = $hub->hid(), $ipc = $hub->ipc(),
$hub->set_no_ending($bool), $bool = $hub->no_ending, $bool = $hub->active,
$hub->set_active($bool)

=over 4

=item STATE METHODS

$hub->reset_state(), $num = $hub->count, $num = $hub->failed, $bool =
$hub->ended, $bool = $hub->is_passing, $hub->is_passing($bool),
$hub->plan($plan), $plan = $hub->plan, $bool = $hub->check_plan

=back

=item THIRD PARTY META-DATA

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Hub::Interceptor - Hub used by interceptor to grab results.

=over 4

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Hub::Interceptor::Terminator - Exception class used by
Test2::Hub::Interceptor

=over 4

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Hub::Subtest - Hub used by subtests

=over 4

=item DESCRIPTION

=item TOGGLES

$bool = $hub->manual_skip_all, $hub->set_manual_skip_all($bool)

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::IPC - Turn on IPC for threading or forking support.

=over 4

=item SYNOPSIS

=item EXPORTS

cull()

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::IPC::Driver - Base class for Test2 IPC drivers.

=over 4

=item SYNOPSIS

=item METHODS

$self->abort($msg), $self->abort_trace($msg), $false = $self->use_shm

=item LOADING DRIVERS

=item WRITING DRIVERS

=over 4

=item METHODS SUBCLASSES MUST IMPLEMENT

$ipc->is_viable, $ipc->add_hub($hid), $ipc->drop_hub($hid),
$ipc->send($hid, $event);, $ipc->send($hid, $event, $global);, @events =
$ipc->cull($hid), $ipc->waiting()

=item METHODS SUBCLASSES MAY IMPLEMENT OR OVERRIDE

$bool = $ipc->use_shm(), $bytes = $ipc->shm_size()

=back

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::IPC::Driver::Files - Temp dir + Files concurrency model.

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item ENVIRONMENT VARIABLES

T2_KEEP_TEMPDIR=0, T2_TEMPDIR_TEMPLATE='test2-XXXXXX'

=item SEE ALSO

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Tools::Tiny - Tiny set of tools for unfortunate souls who
cannot use L<Test2::Suite>.

=over 4

=item DESCRIPTION

=item USE Test2::Suite INSTEAD

=item EXPORTS

ok($bool, $name), ok($bool, $name, @diag), is($got, $want, $name), is($got,
$want, $name, @diag), isnt($got, $do_not_want, $name), isnt($got,
$do_not_want, $name, @diag), like($got, $regex, $name), like($got, $regex,
$name, @diag), unlike($got, $regex, $name), unlike($got, $regex, $name,
@diag), is_deeply($got, $want, $name), is_deeply($got, $want, $name,
@diag), diag($msg), note($msg), skip_all($reason), todo $reason => sub {
... }, plan($count), done_testing(), $warnings = warnings { ... },
$exception = exception { ... }, tests $name => sub { ... }, $output =
capture { ... }

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Transition - Transition notes when upgrading to Test2

=over 4

=item DESCRIPTION

=item THINGS THAT BREAK

=over 4

=item Test::Builder1.5/2 conditionals

=item Replacing the Test::Builder singleton

=item Directly Accessing Hash Elements

=item Subtest indentation

=back

=item DISTRIBUTIONS THAT BREAK OR NEED TO BE UPGRADED

=over 4

=item WORKS BUT TESTS WILL FAIL

Test::DBIx::Class::Schema, Test::Kit, Device::Chip

=item UPGRADE SUGGESTED

Test::Exception, Data::Peek, circular::require, Test::Module::Used,
Test::Moose::More, Test::FITesque, autouse

=item NEED TO UPGRADE

Test::SharedFork, Test::Builder::Clutch, Test::Dist::VersionSync,
Test::Modern, Test::UseAllModules

=item STILL BROKEN

Test::Aggregate, Test::Wrapper, Test::ParallelSubtest, Test::Pretty,
Test::More::Prefix, Net::BitTorrent, Test::Group, Test::Flatten,
Log::Dispatch::Config::TestLog, Test::Able

=back

=item MAKE ASSERTIONS -> SEND EVENTS

=over 4

=item LEGACY

=item TEST2

ok($bool, $name), diag(@messages), note(@messages), subtest($name, $code)

=back

=item WRAP EXISTING TOOLS

=over 4

=item LEGACY

=item TEST2

=back

=item USING UTF8

=over 4

=item LEGACY

=item TEST2

=back

=item AUTHORS, CONTRIBUTORS AND REVIEWERS

Chad Granum (EXODIST) E<lt>exodist@cpan.orgE<gt>

=item SOURCE

=item MAINTAINER

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Util - Tools used by Test2 and friends.

=over 4

=item DESCRIPTION

=item EXPORTS

($success, $error) = try { ... }, protect { ... }, CAN_FORK,
CAN_REALLY_FORK, CAN_THREAD, USE_THREADS, get_tid, my $file =
pkg_to_file($package)

=item NOTES && CAVEATS

Devel::Cover

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>, Kent Fredric
E<lt>kentnl@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Util::ExternalMeta - Allow third party tools to safely attach
meta-data to your instances.

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item WHERE IS THE DATA STORED?

=item EXPORTS

$val = $obj->meta($key), $val = $obj->meta($key, $default), $val =
$obj->get_meta($key), $val = $obj->delete_meta($key), $obj->set_meta($key,
$val)

=item META-KEY RESTRICTIONS

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Util::HashBase - Build hash based classes.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item THIS IS A BUNDLED COPY OF HASHBASE

=item METHODS

=over 4

=item PROVIDED BY HASH BASE

$it = $class->new(@VALUES)

=item HOOKS

$self->init()

=back

=item ACCESSORS

foo(), set_foo(), FOO()

=item SUBCLASSING

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test2::Util::Trace - Debug information for events

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item METHODS

$trace->set_detail($msg), $msg = $trace->detail, $str = $trace->debug,
$trace->alert($MESSAGE), $trace->throw($MESSAGE), $frame = $trace->frame(),
($package, $file, $line, $subname) = $trace->call(), $pkg =
$trace->package, $file = $trace->file, $line = $trace->line, $subname =
$trace->subname, $hashref = $t->TO_JSON, $t =
Test2::Util::Trace->from_json(%$hashref)

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test::Builder - Backend for building test libraries

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Construction

B<new>, B<create>, B<subtest>, B<name>, B<reset>

=item Setting up tests

B<plan>, B<expected_tests>, B<no_plan>, B<done_testing>, B<has_plan>,
B<skip_all>, B<exported_to>

=item Running tests

B<ok>, B<is_eq>, B<is_num>, B<isnt_eq>, B<isnt_num>, B<like>, B<unlike>,
B<cmp_ok>

=item Other Testing Methods

B<BAIL_OUT>, B<skip>, B<todo_skip>, B<skip_rest>

=item Test building utility methods

B<maybe_regex>, B<is_fh>

=back

=back

=over 4

=item Test style

B<level>, B<use_numbers>, B<no_diag>, B<no_ending>, B<no_header>

=item Output

B<diag>, B<note>, B<explain>, B<output>, B<failure_output>, B<todo_output>,
reset_outputs, carp, croak

=item Test Status and Info

B<current_test>, B<is_passing>, B<summary>, B<details>, B<todo>,
B<find_TODO>, B<in_todo>, B<todo_start>, B<todo_end>, B<caller>

=back

=over 4

=item EXIT CODES

=item THREADS

=item MEMORY

=item EXAMPLES

=item SEE ALSO

=item AUTHORS

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test::Builder::Formatter - Test::Builder subclass of
Test2::Formatter::TAP

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item METHODS

$f->event_todo_diag, $f->event_diag, $f->event_plan

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test::Builder::IO::Scalar - A copy of IO::Scalar for Test::Builder

=over 4

=item DESCRIPTION

=item COPYRIGHT and LICENSE

=back

=over 4

=item Construction

=back

new [ARGS...]

open [SCALARREF]

opened

close

=over 4

=item Input and output

=back

flush

getc

getline

getlines

print ARGS...

read BUF, NBYTES, [OFFSET]

write BUF, NBYTES, [OFFSET]

sysread BUF, LEN, [OFFSET]

syswrite BUF, NBYTES, [OFFSET]

=over 4

=item Seeking/telling and other attributes

=back

autoflush

binmode

clearerr

eof

seek OFFSET, WHENCE

sysseek OFFSET, WHENCE

tell

use_RS [YESNO]

setpos POS

getpos

sref

=over 4

=item WARNINGS

=item VERSION

=item AUTHORS

=over 4

=item Primary Maintainer

=item Principal author

=item Other contributors

=back

=item SEE ALSO

=back

=head2 Test::Builder::Module - Base class for test modules

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Importing

=back

=back

=over 4

=item Builder

=back

=head2 Test::Builder::Tester - test testsuites that have been built with
Test::Builder

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=over 4

=item Functions

test_out, test_err

=back

test_fail

test_diag

test_test, title (synonym 'name', 'label'), skip_out, skip_err

line_num

color

=over 4

=item BUGS

=item AUTHOR

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item NOTES

=item SEE ALSO

=back

=head2 Test::Builder::Tester::Color - turn on colour in
Test::Builder::Tester

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=over 4

=item AUTHOR

=item BUGS

=item SEE ALSO

=back

=head2 Test::Builder::TodoDiag - Test::Builder subclass of
Test2::Event::Diag

=over 4

=item DESCRIPTION

=item SYNOPSIS

=item SOURCE

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item AUTHORS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test::Harness - Run Perl standard test scripts with statistics

=over 4

=item VERSION

=back

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item FUNCTIONS

=over 4

=item runtests( @test_files )

=back

=back

=over 4

=item execute_tests( tests => \@test_files, out => \*FH )

=back

=over 4

=item EXPORT

=item ENVIRONMENT VARIABLES THAT TAP::HARNESS::COMPATIBLE SETS

C<HARNESS_ACTIVE>, C<HARNESS_VERSION>

=item ENVIRONMENT VARIABLES THAT AFFECT TEST::HARNESS

C<HARNESS_PERL_SWITCHES>, C<HARNESS_TIMER>, C<HARNESS_VERBOSE>,
C<HARNESS_OPTIONS>, C<< j<n> >>, C<< c >>, C<< a<file.tgz> >>, C<<
fPackage-With-Dashes >>, C<HARNESS_SUBCLASS>,
C<HARNESS_SUMMARY_COLOR_SUCCESS>, C<HARNESS_SUMMARY_COLOR_FAIL>

=item Taint Mode

=item SEE ALSO

=item BUGS

=item AUTHORS

=item LICENCE AND COPYRIGHT

=back

=head2 Test::More - yet another framework for writing test scripts

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item I love it when a plan comes together

=back

=back

B<done_testing>

=over 4

=item Test names

=item I'm ok, you're not ok.

B<ok>

=back

B<is>, B<isnt>

B<like>

B<unlike>

B<cmp_ok>

B<can_ok>

B<isa_ok>

B<new_ok>

B<subtest>

B<pass>, B<fail>

=over 4

=item Module tests

B<require_ok>

=back

B<use_ok>

=over 4

=item Complex data structures

B<is_deeply>

=back

=over 4

=item Diagnostics

B<diag>, B<note>

=back

B<explain>

=over 4

=item Conditional tests

B<SKIP: BLOCK>

=back

B<TODO: BLOCK>, B<todo_skip>

When do I use SKIP vs. TODO?

=over 4

=item Test control

B<BAIL_OUT>

=back

=over 4

=item Discouraged comparison functions

B<eq_array>

=back

B<eq_hash>

B<eq_set>

=over 4

=item Extending and Embedding Test::More

B<builder>

=back

=over 4

=item EXIT CODES

=item COMPATIBILITY

subtests, C<done_testing()>, C<cmp_ok()>, C<new_ok()>, C<note()> and
C<explain()>

=item CAVEATS and NOTES

utf8 / "Wide character in print", Overloaded objects, Threads

=item HISTORY

=item SEE ALSO

=over 4

=item ALTERNATIVES

=item TESTING FRAMEWORKS

=item ADDITIONAL LIBRARIES

=item OTHER COMPONENTS

=item BUNDLES

=back

=item AUTHORS

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item BUGS

=item SOURCE

=item COPYRIGHT

=back

=head2 Test::Simple - Basic utilities for writing tests.

=over 4

=item SYNOPSIS

=item DESCRIPTION

B<ok>

=back

=over 4

=item EXAMPLE

=item CAVEATS

=item NOTES

=item HISTORY

=item SEE ALSO

L<Test::More>

=item AUTHORS

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test::Tester - Ease testing test modules built with Test::Builder

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item HOW TO USE (THE EASY WAY)

=item HOW TO USE (THE HARD WAY)

=item TEST RESULTS

ok, actual_ok, name, type, reason, diag, depth

=item SPACES AND TABS

=item COLOUR

=item EXPORTED FUNCTIONS

=item HOW IT WORKS

=item CAVEATS

=item SEE ALSO

=item AUTHOR

=item LICENSE

=back

=head2 Test::Tester::Capture - Help testing test modules built with
Test::Builder

=over 4

=item DESCRIPTION

=item AUTHOR

=item LICENSE

=back

=head2 Test::Tester::CaptureRunner - Help testing test modules built with
Test::Builder

=over 4

=item DESCRIPTION

=item AUTHOR

=item LICENSE

=back

=head2 Test::Tutorial - A tutorial about writing really basic tests

=over 4

=item DESCRIPTION

=over 4

=item Nuts and bolts of testing.

=item Where to start?

=item Names

=item Test the manual

=item Sometimes the tests are wrong

=item Testing lots of values

=item Informative names

=item Skipping tests

=item Todo tests

=item Testing with taint mode.

=back

=item FOOTNOTES

=item AUTHORS

=item MAINTAINERS

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item COPYRIGHT

=back

=head2 Test::use::ok - Alternative to Test::More::use_ok

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=item MAINTAINER

Chad Granum E<lt>exodist@cpan.orgE<gt>

=item CC0 1.0 Universal

=back

=head2 Text::Abbrev - abbrev - create an abbreviation table from a list

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXAMPLE

=back

=head2 Text::Balanced - Extract delimited text sequences from strings.

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item General behaviour in list contexts

[0], [1], [2]

=item General behaviour in scalar and void contexts

=item A note about prefixes

=item C<extract_delimited>

=item C<extract_bracketed>

=item C<extract_variable>

[0], [1], [2]

=item C<extract_tagged>

C<reject =E<gt> $listref>, C<ignore =E<gt> $listref>, C<fail =E<gt> $str>,
[0], [1], [2], [3], [4], [5]

=item C<gen_extract_tagged>

=item C<extract_quotelike>

[0], [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]

=item C<extract_quotelike> and "here documents"

[0], [1], [2], [3], [4], [5], [6], [7..10]

=item C<extract_codeblock>

=item C<extract_multiple>

=item C<gen_delimited_pat>

=item C<delimited_pat>

=back

=item DIAGNOSTICS

C<Did not find a suitable bracket: "%s">, C<Did not find prefix: /%s/>,
C<Did not find opening bracket after prefix: "%s">, C<No quotelike
operator found after prefix: "%s">, C<Unmatched closing bracket: "%c">,
C<Unmatched opening bracket(s): "%s">, C<Unmatched embedded quote (%s)>,
C<Did not find closing delimiter to match '%s'>, C<Mismatched closing
bracket: expected "%c" but found "%s">, C<No block delimiter found after
quotelike "%s">, C<Did not find leading dereferencer>, C<Bad identifier
after dereferencer>, C<Did not find expected opening bracket at %s>,
C<Improperly nested codeblock at %s>, C<Missing second block for quotelike
"%s">, C<No match found for opening bracket>, C<Did not find opening tag:
/%s/>, C<Unable to construct closing tag to match: /%s/>, C<Found invalid
nested tag: %s>, C<Found unbalanced nested tag: %s>, C<Did not find closing
tag>

=item AUTHOR

=item BUGS AND IRRITATIONS

=item COPYRIGHT

=back

=head2 Text::ParseWords - parse text into an array of tokens or array of
arrays

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXAMPLES

0Z<>, 1Z<>, 2Z<>, 3Z<>, 4Z<>, 5Z<>

=item SEE ALSO

=item AUTHORS

=item COPYRIGHT AND LICENSE

=back

=head2 Text::Tabs - expand and unexpand tabs like unix expand(1) and
unexpand(1)

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXPORTS

expand, unexpand, $tabstop

=item EXAMPLE

=item SUBVERSION

=item BUGS

=item LICENSE

=back

=head2 Text::Wrap - line wrapping to form simple paragraphs

=over 4

=item SYNOPSIS 

=item DESCRIPTION

=item OVERRIDES

=item EXAMPLES

=item SUBVERSION

=item SEE ALSO

=item AUTHOR

=item LICENSE

=back

=head2 Thread - Manipulate threads in Perl (for old code only)

=over 4

=item DEPRECATED

=item HISTORY

=item SYNOPSIS

=item DESCRIPTION

=item FUNCTIONS

$thread = Thread->new(\&start_sub), $thread = Thread->new(\&start_sub,
LIST), lock VARIABLE, async BLOCK;, Thread->self, Thread->list, cond_wait
VARIABLE, cond_signal VARIABLE, cond_broadcast VARIABLE, yield

=item METHODS

join, detach, equal, tid, done

=item DEFUNCT

lock(\&sub), eval, flags

=item SEE ALSO

=back

=head2 Thread::Queue - Thread-safe queues

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

Ordinary scalars, Array refs, Hash refs, Scalar refs, Objects based on the
above

=item QUEUE CREATION

->new(), ->new(LIST)

=item BASIC METHODS

->enqueue(LIST), ->dequeue(), ->dequeue(COUNT), ->dequeue_nb(),
->dequeue_nb(COUNT), ->dequeue_timed(TIMEOUT), ->dequeue_timed(TIMEOUT,
COUNT), ->pending(), ->limit, ->end()

=item ADVANCED METHODS

->peek(), ->peek(INDEX), ->insert(INDEX, LIST), ->extract(),
->extract(INDEX), ->extract(INDEX, COUNT)

=item NOTES

=item LIMITATIONS

=item SEE ALSO

=item MAINTAINER

=item LICENSE

=back

=head2 Thread::Semaphore - Thread-safe semaphores

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

->new(), ->new(NUMBER), ->down(), ->down(NUMBER), ->down_nb(),
->down_nb(NUMBER), ->down_force(), ->down_force(NUMBER),
->down_timed(TIMEOUT), ->down_timed(TIMEOUT, NUMBER), ->up(), ->up(NUMBER)

=item NOTES

=item SEE ALSO

=item MAINTAINER

=item LICENSE

=back

=head2 Tie::Array - base class for tied arrays

=over 4

=item SYNOPSIS

=item DESCRIPTION

TIEARRAY classname, LIST, STORE this, index, value, FETCH this, index,
FETCHSIZE this, STORESIZE this, count, EXTEND this, count, EXISTS this,
key, DELETE this, key, CLEAR this, DESTROY this, PUSH this, LIST, POP this,
SHIFT this, UNSHIFT this, LIST, SPLICE this, offset, length, LIST

=item CAVEATS

=item AUTHOR

=back

=head2 Tie::File - Access the lines of a disk file via a Perl array

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item C<recsep>

=item C<autochomp>

=item C<mode>

=item C<memory>

=item C<dw_size>

=item Option Format

=back

=item Public Methods

=over 4

=item C<flock>

=item C<autochomp>

=item C<defer>, C<flush>, C<discard>, and C<autodefer>

=item C<offset>

=back

=item Tying to an already-opened filehandle

=item Deferred Writing

=over 4

=item Autodeferring

=back

=item CONCURRENT ACCESS TO FILES

=item CAVEATS

=item SUBCLASSING

=item WHAT ABOUT C<DB_File>?

=item AUTHOR

=item LICENSE

=item WARRANTY

=item THANKS

=item TODO

=back

=head2 Tie::Handle - base class definitions for tied handles

=over 4

=item SYNOPSIS

=item DESCRIPTION

TIEHANDLE classname, LIST, WRITE this, scalar, length, offset, PRINT this,
LIST, PRINTF this, format, LIST, READ this, scalar, length, offset,
READLINE this, GETC this, CLOSE this, OPEN this, filename, BINMODE this,
EOF this, TELL this, SEEK this, offset, whence, DESTROY this

=item MORE INFORMATION

=item COMPATIBILITY

=back

=head2 Tie::Hash, Tie::StdHash, Tie::ExtraHash - base class definitions for
tied hashes

=over 4

=item SYNOPSIS

=item DESCRIPTION

TIEHASH classname, LIST, STORE this, key, value, FETCH this, key, FIRSTKEY
this, NEXTKEY this, lastkey, EXISTS this, key, DELETE this, key, CLEAR
this, SCALAR this

=item Inheriting from B<Tie::StdHash>

=item Inheriting from B<Tie::ExtraHash>

=item C<SCALAR>, C<UNTIE> and C<DESTROY>

=item MORE INFORMATION

=back

=head2 Tie::Hash::NamedCapture - Named regexp capture buffers

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

=back

=head2 Tie::Memoize - add data to hash when needed

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item Inheriting from B<Tie::Memoize>

=item EXAMPLE

=item BUGS

=item AUTHOR

=back

=head2 Tie::RefHash - use references as hash keys

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item EXAMPLE

=item THREAD SUPPORT

=item STORABLE SUPPORT

=item RELIC SUPPORT

=item LICENSE

=item MAINTAINER

=item AUTHOR

=item SEE ALSO

=back

=head2 Tie::Scalar, Tie::StdScalar - base class definitions for tied
scalars

=over 4

=item SYNOPSIS

=item DESCRIPTION

TIESCALAR classname, LIST, FETCH this, STORE this, value, DESTROY this

=over 4

=item Tie::Scalar vs Tie::StdScalar

=back

=item MORE INFORMATION

=back

=head2 Tie::StdHandle - base class definitions for tied handles

=over 4

=item SYNOPSIS

=item DESCRIPTION

=back

=head2 Tie::SubstrHash - Fixed-table-size, fixed-key-length hashing

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEATS

=back

=head2 Time::HiRes - High resolution alarm, sleep, gettimeofday, interval
timers

=over 4

=item SYNOPSIS

=item DESCRIPTION

gettimeofday (), usleep ( $useconds ), nanosleep ( $nanoseconds ), ualarm (
$useconds [, $interval_useconds ] ), tv_interval, time (), sleep (
$floating_seconds ), alarm ( $floating_seconds [,
$interval_floating_seconds ] ), setitimer ( $which, $floating_seconds [,
$interval_floating_seconds ] ), getitimer ( $which ), clock_gettime (
$which ), clock_getres ( $which ), clock_nanosleep ( $which, $nanoseconds,
$flags = 0), clock(), stat, stat FH, stat EXPR, lstat, lstat FH, lstat
EXPR, utime LIST

=item EXAMPLES

=item C API

=item DIAGNOSTICS

=over 4

=item useconds or interval more than ...

=item negative time not invented yet

=item internal error: useconds < 0 (unsigned ... signed ...)

=item useconds or uinterval equal to or more than 1000000

=item unimplemented in this platform

=back

=item CAVEATS

=item SEE ALSO

=item AUTHORS

=item COPYRIGHT AND LICENSE

=back

=head2 Time::Local - Efficiently compute time from local and GMT time

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=item FUNCTIONS

=over 4

=item C<timelocal()> and C<timegm()>

=item C<timelocal_nocheck()> and C<timegm_nocheck()>

=item Year Value Interpretation

=item Limits of time_t

=item Ambiguous Local Times (DST)

=item Non-Existent Local Times (DST)

=item Negative Epoch Values

=back

=item IMPLEMENTATION

=item AUTHORS EMERITUS

=item BUGS

=item AUTHOR

=item CONTRIBUTORS

=item COPYRIGHT AND LICENSE

=back

=head2 Time::Piece - Object Oriented time objects

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item USAGE

=over 4

=item Local Locales

=item Date Calculations

=item Date Comparisons

=item Date Parsing

=item YYYY-MM-DDThh:mm:ss

=item Week Number

=item Global Overriding

=back

=item CAVEATS

=over 4

=item Setting $ENV{TZ} in Threads on Win32

=item Use of epoch seconds

=back

=item AUTHOR

=item COPYRIGHT AND LICENSE

=item SEE ALSO

=item BUGS

=back

=head2 Time::Seconds - a simple API to convert seconds to other date values

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item METHODS

=item AUTHOR

=item COPYRIGHT AND LICENSE

=item Bugs

=back

=head2 Time::gmtime - by-name interface to Perl's built-in gmtime()
function

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item NOTE

=item AUTHOR

=back

=head2 Time::localtime - by-name interface to Perl's built-in localtime()
function

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item NOTE

=item AUTHOR

=back

=head2 Time::tm - internal object used by Time::gmtime and Time::localtime

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item AUTHOR

=back

=head2 UNIVERSAL - base class for ALL classes (blessed references)

=over 4

=item SYNOPSIS

=item DESCRIPTION

C<< $obj->isa( TYPE ) >>, C<< CLASS->isa( TYPE ) >>, C<< eval { VAL->isa(
TYPE ) } >>, C<TYPE>, C<$obj>, C<CLASS>, C<VAL>, C<< $obj->DOES( ROLE ) >>,
C<< CLASS->DOES( ROLE ) >>, C<< $obj->can( METHOD ) >>, C<< CLASS->can(
METHOD ) >>, C<< eval { VAL->can( METHOD ) } >>, C<VERSION ( [ REQUIRE ] )>

=item WARNINGS

=item EXPORTS

=back

=head2 Unicode::Collate - Unicode Collation Algorithm

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Constructor and Tailoring

UCA_Version, alternate, backwards, entry, hangul_terminator, highestFFFF,
identical, ignoreChar, ignoreName, ignore_level2, katakana_before_hiragana,
level, long_contraction, minimalFFFE, normalization, overrideCJK,
overrideHangul, overrideOut, preprocess, rearrange, rewrite, suppress,
table, undefChar, undefName, upper_before_lower, variable

=item Methods for Collation

C<@sorted = $Collator-E<gt>sort(@not_sorted)>, C<$result =
$Collator-E<gt>cmp($a, $b)>, C<$result = $Collator-E<gt>eq($a, $b)>,
C<$result = $Collator-E<gt>ne($a, $b)>, C<$result = $Collator-E<gt>lt($a,
$b)>, C<$result = $Collator-E<gt>le($a, $b)>, C<$result =
$Collator-E<gt>gt($a, $b)>, C<$result = $Collator-E<gt>ge($a, $b)>,
C<$sortKey = $Collator-E<gt>getSortKey($string)>, C<$sortKeyForm =
$Collator-E<gt>viewSortKey($string)>

=item Methods for Searching

C<$position = $Collator-E<gt>index($string, $substring[, $position])>,
C<($position, $length) = $Collator-E<gt>index($string, $substring[,
$position])>, C<$match_ref = $Collator-E<gt>match($string, $substring)>,
C<($match)   = $Collator-E<gt>match($string, $substring)>, C<@match =
$Collator-E<gt>gmatch($string, $substring)>, C<$count =
$Collator-E<gt>subst($string, $substring, $replacement)>, C<$count =
$Collator-E<gt>gsubst($string, $substring, $replacement)>

=item Other Methods

C<%old_tailoring = $Collator-E<gt>change(%new_tailoring)>,
C<$modified_collator = $Collator-E<gt>change(%new_tailoring)>, C<$version =
$Collator-E<gt>version()>, C<UCA_Version()>, C<Base_Unicode_Version()>

=back

=item EXPORT

=item INSTALL

=item CAVEATS

Normalization, Conformance Test

=item AUTHOR, COPYRIGHT AND LICENSE

=item SEE ALSO

Unicode Collation Algorithm - UTS #10, The Default Unicode Collation
Element Table (DUCET), The conformance test for the UCA, Hangul Syllable
Type, Unicode Normalization Forms - UAX #15, Unicode Locale Data Markup
Language (LDML) - UTS #35

=back

=head2 Unicode::Collate::CJK::Big5 - weighting CJK Unified Ideographs
for Unicode::Collate

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

CLDR - Unicode Common Locale Data Repository, Unicode Locale Data Markup
Language (LDML) - UTS #35, L<Unicode::Collate>, L<Unicode::Collate::Locale>

=back

=head2 Unicode::Collate::CJK::GB2312 - weighting CJK Unified Ideographs
for Unicode::Collate

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEAT

=item SEE ALSO

CLDR - Unicode Common Locale Data Repository, Unicode Locale Data Markup
Language (LDML) - UTS #35, L<Unicode::Collate>, L<Unicode::Collate::Locale>

=back

=head2 Unicode::Collate::CJK::JISX0208 - weighting JIS KANJI for
Unicode::Collate

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

L<Unicode::Collate>, L<Unicode::Collate::Locale>

=back

=head2 Unicode::Collate::CJK::Korean - weighting CJK Unified Ideographs
for Unicode::Collate

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item SEE ALSO

CLDR - Unicode Common Locale Data Repository, Unicode Locale Data Markup
Language (LDML) - UTS #35, L<Unicode::Collate>, L<Unicode::Collate::Locale>

=back

=head2 Unicode::Collate::CJK::Pinyin - weighting CJK Unified Ideographs
for Unicode::Collate

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEAT

=item SEE ALSO

CLDR - Unicode Common Locale Data Repository, Unicode Locale Data Markup
Language (LDML) - UTS #35, L<Unicode::Collate>, L<Unicode::Collate::Locale>

=back

=head2 Unicode::Collate::CJK::Stroke - weighting CJK Unified Ideographs
for Unicode::Collate

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEAT

=item SEE ALSO

CLDR - Unicode Common Locale Data Repository, Unicode Locale Data Markup
Language (LDML) - UTS #35, L<Unicode::Collate>, L<Unicode::Collate::Locale>

=back

=head2 Unicode::Collate::CJK::Zhuyin - weighting CJK Unified Ideographs
for Unicode::Collate

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item CAVEAT

=item SEE ALSO

CLDR - Unicode Common Locale Data Repository, Unicode Locale Data Markup
Language (LDML) - UTS #35, L<Unicode::Collate>, L<Unicode::Collate::Locale>

=back

=head2 Unicode::Collate::Locale - Linguistic tailoring for DUCET via
Unicode::Collate

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Constructor

=item Methods

C<$Collator-E<gt>getlocale>, C<$Collator-E<gt>locale_version>

=item A list of tailorable locales

=back

=item INSTALL

=item CAVEAT

Tailoring is not maximum, Collation reordering is not supported

=over 4

=item Reference

=back

=item AUTHOR

=item SEE ALSO

Unicode Collation Algorithm - UTS #10, The Default Unicode Collation
Element Table (DUCET), Unicode Locale Data Markup Language (LDML) - UTS
#35, CLDR - Unicode Common Locale Data Repository, L<Unicode::Collate>,
L<Unicode::Normalize>

=back

=head2 Unicode::Normalize - Unicode Normalization Forms

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Normalization Forms

C<$NFD_string = NFD($string)>, C<$NFC_string = NFC($string)>,
C<$NFKD_string = NFKD($string)>, C<$NFKC_string = NFKC($string)>,
C<$FCD_string = FCD($string)>, C<$FCC_string = FCC($string)>,
C<$normalized_string = normalize($form_name, $string)>

=item Decomposition and Composition

C<$decomposed_string = decompose($string [, $useCompatMapping])>,
C<$reordered_string = reorder($string)>, C<$composed_string =
compose($string)>, C<($processed, $unprocessed) =
splitOnLastStarter($normalized)>, C<$processed = normalize_partial($form,
$unprocessed)>, C<$processed = NFD_partial($unprocessed)>, C<$processed =
NFC_partial($unprocessed)>, C<$processed = NFKD_partial($unprocessed)>,
C<$processed = NFKC_partial($unprocessed)>

=item Quick Check

C<$result = checkNFD($string)>, C<$result = checkNFC($string)>, C<$result =
checkNFKD($string)>, C<$result = checkNFKC($string)>, C<$result =
checkFCD($string)>, C<$result = checkFCC($string)>, C<$result =
check($form_name, $string)>

=item Character Data

C<$canonical_decomposition = getCanon($code_point)>,
C<$compatibility_decomposition = getCompat($code_point)>,
C<$code_point_composite = getComposite($code_point_here,
$code_point_next)>, C<$combining_class = getCombinClass($code_point)>,
C<$may_be_composed_with_prev_char = isComp2nd($code_point)>,
C<$is_exclusion = isExclusion($code_point)>, C<$is_singleton =
isSingleton($code_point)>, C<$is_non_starter_decomposition =
isNonStDecomp($code_point)>, C<$is_Full_Composition_Exclusion =
isComp_Ex($code_point)>, C<$NFD_is_NO = isNFD_NO($code_point)>,
C<$NFC_is_NO = isNFC_NO($code_point)>, C<$NFC_is_MAYBE =
isNFC_MAYBE($code_point)>, C<$NFKD_is_NO = isNFKD_NO($code_point)>,
C<$NFKC_is_NO = isNFKC_NO($code_point)>, C<$NFKC_is_MAYBE =
isNFKC_MAYBE($code_point)>

=back

=item EXPORT

=item CAVEATS

Perl's version vs. Unicode version, Correction of decomposition mapping,
Revised definition of canonical composition

=item AUTHOR

=item LICENSE

=item SEE ALSO

http://www.unicode.org/reports/tr15/,
http://www.unicode.org/Public/UNIDATA/CompositionExclusions.txt,
http://www.unicode.org/Public/UNIDATA/DerivedNormalizationProps.txt,
http://www.unicode.org/Public/UNIDATA/NormalizationCorrections.txt,
http://www.unicode.org/review/pr-29.html, http://www.unicode.org/notes/tn5/

=back

=head2 Unicode::UCD - Unicode character database

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item code point argument

=back

=back

=over 4

=item B<charinfo()>

B<code>, B<name>, B<category>, B<combining>, B<bidi>, B<decomposition>,
B<decimal>, B<digit>, B<numeric>, B<mirrored>, B<unicode10>, B<comment>,
B<upper>, B<lower>, B<title>, B<block>, B<script>

=back

=over 4

=item B<charprop()>

Block, Decomposition_Mapping, Name_Alias, Numeric_Value, Script_Extensions

=back

=over 4

=item B<charprops_all()>

=back

=over 4

=item B<charblock()>

=back

=over 4

=item B<charscript()>

=back

=over 4

=item B<charblocks()>

=back

=over 4

=item B<charscripts()>

=back

=over 4

=item B<charinrange()>

=back

=over 4

=item B<general_categories()>

=back

=over 4

=item B<bidi_types()>

=back

=over 4

=item B<compexcl()>

=back

=over 4

=item B<casefold()>

B<code>, B<full>, B<simple>, B<mapping>, B<status>, Z<>B<*> If you use this
C<I> mapping, Z<>B<*> If you exclude this C<I> mapping, B<turkic>

=back

=over 4

=item B<all_casefolds()>

=back

=over 4

=item B<casespec()>

B<code>, B<lower>, B<title>, B<upper>, B<condition>

=back

=over 4

=item B<namedseq()>

=back

=over 4

=item B<num()>

=back

=over 4

=item B<prop_aliases()>

=back

=over 4

=item B<prop_values()>

=back

=over 4

=item B<prop_value_aliases()>

=back

=over 4

=item B<prop_invlist()>

=back

=over 4

=item B<prop_invmap()>

B<C<s>>, B<C<sl>>, C<correction>, C<control>, C<alternate>, C<figment>,
C<abbreviation>, B<C<a>>, B<C<al>>, B<C<ae>>, B<C<ale>>, B<C<ar>>, B<C<n>>,
B<C<ad>>

=back

=over 4

=item B<search_invlist()>

=back

=over 4

=item Unicode::UCD::UnicodeVersion

=back

=over 4

=item B<Blocks versus Scripts>

=item B<Matching Scripts and Blocks>

=item Old-style versus new-style block names

=item Use with older Unicode versions

=back

=over 4

=item AUTHOR

=back

=head2 User::grent - by-name interface to Perl's built-in getgr*()
functions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=item NOTE

=item AUTHOR

=back

=head2 User::pwent - by-name interface to Perl's built-in getpw*()
functions

=over 4

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item System Specifics

=back

=item NOTE

=item AUTHOR

=item HISTORY

March 18th, 2000

=back

=head2 XSLoader - Dynamically load C libraries into Perl code

=over 4

=item VERSION

=item SYNOPSIS

=item DESCRIPTION

=over 4

=item Migration from C<DynaLoader>

=item Backward compatible boilerplate

=back

=item Order of initialization: early load()

=over 4

=item The most hairy case

=back

=item DIAGNOSTICS

C<Can't find '%s' symbol in %s>, C<Can't load '%s' for module %s: %s>,
C<Undefined symbols present after loading %s: %s>

=item LIMITATIONS

=item KNOWN BUGS

=item BUGS

=item SEE ALSO

=item AUTHORS

=item COPYRIGHT & LICENSE

=back

=head1 AUXILIARY DOCUMENTATION

The documentation for all the extra programs should be listed here, but
not all of them have manual pages yet:

=over 4

=item h2ph

=item h2xs

=item perlbug

=item pl2pm

=item pod2html

=item pod2man

=item splain

=item xsubpp

=back

=head1 AUTHOR

Larry Wall <F<larry@wall.org>>, with the help of oodles
of other folks.

perl5122delta.pod000064400000022603150344123470007544 0ustar00=encoding utf8

=head1 NAME

perl5122delta - what is new for perl v5.12.2

=head1 DESCRIPTION

This document describes differences between the 5.12.1 release and
the 5.12.2 release.

If you are upgrading from an earlier major version, such as 5.10.1,
first read L<perl5120delta>, which describes differences between 5.10.1
and 5.12.0, as well as L<perl5121delta>, which describes earlier changes
in the 5.12 stable release series.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.12.1. If any exist, they
are bugs and reports are welcome.

=head1 Core Enhancements

Other than the bug fixes listed below, there should be no user-visible
changes to the core language in this release.

=head1 Modules and Pragmata

=head2 New Modules and Pragmata

This release does not introduce any new modules or pragmata.

=head2 Pragmata Changes

In the previous release, C<no I<VERSION>;> statements triggered a bug
which could cause L<feature> bundles to be loaded and L<strict> mode to
be enabled unintentionally.  This has been fixed.

=head2 Updated Modules

=over 4

=item C<Carp>

Upgraded from version 1.16 to 1.17.

L<Carp> now detects incomplete L<caller()|perlfunc/"caller EXPR">
overrides and avoids using bogus C<@DB::args>. To provide backtraces, Carp
relies on particular behaviour of the caller built-in. Carp now detects
if other code has overridden this with an incomplete implementation, and
modifies its backtrace accordingly. Previously, incomplete overrides would
cause incorrect values in backtraces (best case), or obscure fatal errors
(worst case).

This fixes certain cases of C<Bizarre copy of ARRAY> caused by modules
overriding C<caller()> incorrectly.

=item C<CPANPLUS>

A patch to F<cpanp-run-perl> has been backported from CPANPLUS C<0.9004>. This
resolves L<RT #55964|http://rt.cpan.org/Public/Bug/Display.html?id=55964>
and L<RT #57106|http://rt.cpan.org/Public/Bug/Display.html?id=57106>, both
of which related to failures to install distributions that use
C<Module::Install::DSL>.

=item C<File::Glob>

A regression that caused a crash when C<CORE::GLOBAL::glob> could not be
found after loading C<File::Glob> has been fixed.  Perl now correctly falls
back to external globbing via C<pp_glob>.

=item C<File::Copy>

C<File::Copy::copy(FILE, DIR)> is now documented.

=item C<File::Spec>

Upgraded from version 3.31 to 3.31_01.

Several portability fixes were made in C<File::Spec::VMS>: a colon is now
recognized as a delimiter in native filespecs; caret-escaped delimiters are
recognized for better handling of extended filespecs; C<catpath()> returns
an empty directory rather than the current directory if the input directory
name is empty; C<abs2rel()> properly handles Unix-style input.

=back

=head1 Utility Changes

=over

=item *

F<perlbug> now always gives the reporter a chance to change the email address it
guesses for them.

=item *

F<perlbug> should no longer warn about uninitialized values when using the C<-d>
and C<-v> options.

=back

=head1 Changes to Existing Documentation

=over

=item *

The existing policy on backward-compatibility and deprecation has
been added to L<perlpolicy>, along with definitions of terms like
I<deprecation>.

=item *

L<perlfunc/srand>'s usage has been clarified.

=item *

The entry for L<perlfunc/die> was reorganized to emphasize its
role in the exception mechanism.

=item *

Perl's L<INSTALL> file has been clarified to explicitly state that Perl
requires a C89-compliant ANSI C compiler.

=item *

L<IO::Socket>'s C<getsockopt()> and C<setsockopt()> have been documented.

=item *

C<alarm()>'s inability to interrupt blocking IO on Windows has been documented.

=item *

L<Math::TrulyRandom> hasn't been updated since 1996 and has been removed
as a recommended solution for random number generation.

=item *

L<perlrun> has been updated to clarify the behaviour of octal flags to F<perl>.

=item *

To ease user confusion, C<$#> and C<$*>, two special variables that were
removed in earlier versions of Perl, have been documented.

=item *

The version of L<perlfaq> shipped with the Perl core has been updated from the
official FAQ version, which is now maintained in the C<briandfoy/perlfaq>
branch of the Perl repository at L<git://perl5.git.perl.org/perl.git>.

=back
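
As an illustration of the exception role of C<die> mentioned above, here
is a minimal sketch; it is not taken from L<perlfunc>.  The subroutine
names C<checked_divide> and C<try_divide> are hypothetical and exist only
for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: raises an exception with die() on bad input.
sub checked_divide {
    my ($numerator, $denominator) = @_;
    # A trailing newline stops perl from appending "at FILE line N."
    die "division by zero\n" if $denominator == 0;
    return $numerator / $denominator;
}

# Hypothetical wrapper: eval BLOCK traps the exception; $@ holds the message.
sub try_divide {
    my ($numerator, $denominator) = @_;
    my $result;
    my $ok = eval { $result = checked_divide($numerator, $denominator); 1 };
    return $ok ? "ok: $result" : "error: $@";
}

print try_divide(6, 3), "\n";    # the call succeeds
print try_divide(1, 0);          # the call dies and the error is caught
```

Ending the C<eval> block with a true value and testing that value, rather
than testing C<$@> directly, is one common way to tell success from failure.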

=head1 Installation and Configuration Improvements

=head2 Configuration improvements

=over

=item *

The C<d_u32align> configuration probe on ARM has been fixed.

=back

=head2 Compilation improvements

=over

=item *

An "C<incompatible operand types>" error in ternary expressions when building
with C<clang> has been fixed.

=item *

Perl now skips setuid C<File::Copy> tests on partitions it detects to be mounted
as C<nosuid>.

=back

=head1 Selected Bug Fixes

=over 4

=item *

A possible segfault in the C<T_PRTOBJ> default typemap has been fixed.

=item *

A possible memory leak when using L<caller()|perlfunc/"caller EXPR"> to set
C<@DB::args> has been fixed.

=item *

Several memory leaks when loading XS modules were fixed.

=item *

C<unpack()> now handles scalar context correctly for C<%32H> and C<%32u>,
fixing a potential crash.  Previously, C<split()> would crash because the
third item on the stack wasn't the regular expression it expected.
C<unpack("%2H", ...)> would return both the unpacked result and the
checksum on the stack, as would C<unpack("%2u", ...)>.
L<[perl #73814]|http://rt.perl.org/rt3/Ticket/Display.html?id=73814>

=item *

Perl now avoids using memory after calling C<free()> in F<pp_require>
when there are CODEREFs in C<@INC>.

=item *

A bug that could cause "C<Unknown error>" messages when
"C<call_sv(code, G_EVAL)>" is called from an XS destructor has been fixed.

=item *

The implementation of the C<open $fh, 'E<gt>', \$buffer> feature
now supports get/set magic and thus tied buffers correctly.

=item *

The C<pp_getc>, C<pp_tell>, and C<pp_eof> opcodes now make room on the
stack for their return values in cases where no argument was passed in.

=item *

When matching Unicode strings, inappropriate backtracking under some
conditions would result in a C<Malformed UTF-8 character (fatal)> error.
This should no longer occur.
See L<[perl #75680]|http://rt.perl.org/rt3/Public/Bug/Display.html?id=75680>.

=back
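
Two of the features touched by the fixes above, in-memory filehandles and
C<unpack()> checksums, can be exercised as in the sketch below.  It only
shows ordinary use and does not reproduce the bugs themselves; the
subroutine names C<lines_to_string> and C<byte_checksum> are hypothetical.

```perl
use strict;
use warnings;

# Collect printed lines in a scalar by opening a filehandle on a reference.
sub lines_to_string {
    my @lines  = @_;
    my $buffer = '';
    open(my $fh, '>', \$buffer) or die "Cannot open in-memory file: $!";
    print {$fh} "$_\n" for @lines;
    close($fh) or die "Cannot close in-memory file: $!";
    return $buffer;
}

# Sum the byte values of a string as a 32-bit checksum, in scalar context.
sub byte_checksum {
    my ($data) = @_;
    my $sum = unpack("%32C*", $data);
    return $sum;
}

my $text = lines_to_string('alpha', 'beta');
print $text;
print "checksum: ", byte_checksum($text), "\n";
```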

=head1 Platform Specific Notes

=head2 AIX

=over

=item *

F<README.aix> has been updated with information about the XL C/C++ V11 compiler
suite.

=back

=head2 Windows

=over

=item *

When building Perl with the mingw64 x64 cross-compiler, the C<incpath>,
C<libpth>, C<ldflags>, C<lddlflags> and C<ldflags_nolargefiles> values
in F<Config.pm> and F<Config_heavy.pl> were not previously being set
correctly because, with that compiler, the include and lib directories
are not immediately below C<$(CCHOME)>.

=back

=head2 VMS

=over

=item *

F<git_version.h> is now installed on VMS. This was an oversight in v5.12.0 which
caused some extensions to fail to build.

=item *

Several memory leaks in L<stat()|perlfunc/"stat FILEHANDLE"> have been fixed.

=item *

A memory leak in C<Perl_rename()> due to a double allocation has been
fixed.

=item *

A memory leak in C<vms_fid_to_name()> (used by C<realpath()> and
C<realname()>) has been fixed.

=back

=head1 Acknowledgements

Perl 5.12.2 represents approximately three months of development since
Perl 5.12.1 and contains approximately 2,000 lines of changes across
100 files from 36 authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers.  The following people are known to
have contributed the improvements that became Perl 5.12.2:

Abigail, Ævar Arnfjörð Bjarmason, Ben Morrow, brian d foy, Brian
Phillips, Chas. Owens, Chris 'BinGOs' Williams, Chris Williams,
Craig A. Berry, Curtis Jewell, Dan Dascalescu, David Golden, David
Mitchell, Father Chrysostomos, Florian Ragwitz, George Greer, H.Merijn
Brand, Jan Dubois, Jesse Vincent, Jim Cromie, Karl Williamson, Lars
Dɪᴇᴄᴋᴏᴡ 迪拉斯, Leon Brocard, Maik Hentsche, Matt S Trout,
Nicholas Clark, Rafael Garcia-Suarez, Rainer Tammer, Ricardo Signes,
Salvador Ortiz Garcia, Sisyphus, Slaven Rezic, Steffen Mueller, Tony Cook,
Vincent Pit and Yves Orton.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will
be able to help assess the impact of issues, figure out a resolution, and
help co-ordinate the release of patches to mitigate or fix the problem
across all platforms on which Perl is supported. Please only use this
address for security issues in the Perl core, not for modules
independently distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlintro.pod000064400000053147150344123500007275 0ustar00=head1 NAME

perlintro -- a brief introduction and overview of Perl

=head1 DESCRIPTION

This document is intended to give you a quick overview of the Perl
programming language, along with pointers to further documentation.  It
is intended as a "bootstrap" guide for those who are new to the
language, and provides just enough information for you to be able to
read other people's Perl and understand roughly what it's doing, or
write your own simple scripts.

This introductory document does not aim to be complete.  It does not
even aim to be entirely accurate.  In some cases perfection has been
sacrificed in the goal of getting the general idea across.  You are
I<strongly> advised to follow this introduction with more information
from the full Perl manual, the table of contents to which can be found
in L<perltoc>.

Throughout this document you'll see references to other parts of the
Perl documentation.  You can read that documentation using the C<perldoc>
command or whatever method you're using to read this document.

Throughout Perl's documentation, you'll find numerous examples intended
to help explain the discussed features.  Please keep in mind that many
of them are code fragments rather than complete programs.

These examples often reflect the style and preference of the author of
that piece of the documentation, and may be briefer than a corresponding
line of code in a real program.  Except where otherwise noted, you
should assume that C<use strict> and C<use warnings> statements
appear earlier in the "program", and that any variables used have
already been declared, even if those declarations have been omitted
to make the example easier to read.

Do note that the examples have been written by many different authors over
a period of several decades.  Styles and techniques will therefore differ,
although some effort has been made to not vary styles too widely in the
same sections.  Do not consider one style to be better than others - "There's
More Than One Way To Do It" is one of Perl's mottos.  After all, in your
journey as a programmer, you are likely to encounter different styles.

=head2 What is Perl?

Perl is a general-purpose programming language originally developed for
text manipulation and now used for a wide range of tasks including
system administration, web development, network programming, GUI
development, and more.

The language is intended to be practical (easy to use, efficient,
complete) rather than beautiful (tiny, elegant, minimal).  Its major
features are that it's easy to use, supports both procedural and
object-oriented (OO) programming, has powerful built-in support for text
processing, and has one of the world's most impressive collections of
third-party modules.

Different definitions of Perl are given in L<perl>, L<perlfaq1> and
no doubt other places.  From this we can determine that Perl is different
things to different people, but that lots of people think it's at least
worth writing about.

=head2 Running Perl programs

To run a Perl program from the Unix command line:

 perl progname.pl

Alternatively, put this as the first line of your script:

 #!/usr/bin/env perl

... and run the script as F</path/to/script.pl>.  Of course, it'll need
to be executable first, so C<chmod 755 script.pl> (under Unix).

(This start line assumes you have the B<env> program.  You can also put
the path to your perl executable there directly, as in C<#!/usr/bin/perl>.)

For more information, including instructions for other platforms such as
Windows and Mac OS, read L<perlrun>.

=head2 Safety net

Perl by default is very forgiving.  In order to make it more robust
it is recommended to start every program with the following lines:

 #!/usr/bin/perl
 use strict;
 use warnings;

The two additional lines ask perl to catch various common
problems in your code.  They check different things, so you need both.  A
potential problem caught by C<use strict;> will cause your code to stop
immediately when it is encountered, while C<use warnings;> will merely
give a warning (like the command-line switch B<-w>) and let your code run.
To read more about them check their respective manual pages at L<strict>
and L<warnings>.

=head2 Basic syntax overview

A Perl script or program consists of one or more statements.  These
statements are simply written in the script in a straightforward
fashion.  There is no need to have a C<main()> function or anything of
that kind.

Perl statements end in a semi-colon:

 print "Hello, world";

Comments start with a hash symbol and run to the end of the line:

 # This is a comment

Whitespace is irrelevant:

 print
     "Hello, world"
     ;

... except inside quoted strings:

 # this would print with a linebreak in the middle
 print "Hello
 world";

Double quotes or single quotes may be used around literal strings:

 print "Hello, world";
 print 'Hello, world';

However, only double quotes "interpolate" variables and special
characters such as newlines (C<\n>):

 print "Hello, $name\n";     # works fine
 print 'Hello, $name\n';     # prints $name\n literally

Numbers don't need quotes around them:

 print 42;

You can use parentheses for functions' arguments or omit them
according to your personal taste.  They are only required
occasionally to clarify issues of precedence.

 print("Hello, world\n");
 print "Hello, world\n";

More detailed information about Perl syntax can be found in L<perlsyn>.

=head2 Perl variable types

Perl has three main variable types: scalars, arrays, and hashes.

=over 4

=item Scalars

A scalar represents a single value:

 my $animal = "camel";
 my $answer = 42;

Scalar values can be strings, integers or floating point numbers, and Perl
will automatically convert between them as required.  There is no need
to pre-declare your variable types, but you have to declare them using
the C<my> keyword the first time you use them.  (This is one of the
requirements of C<use strict;>.)

Scalar values can be used in various ways:

 print $animal;
 print "The animal is $animal\n";
 print "The square of $answer is ", $answer * $answer, "\n";

There are a number of "magic" scalars with names that look like
punctuation or line noise.  These special variables are used for all
kinds of purposes, and are documented in L<perlvar>.  The only one you
need to know about for now is C<$_> which is the "default variable".
It's used as the default argument to a number of functions in Perl, and
it's set implicitly by certain looping constructs.

 print;          # prints contents of $_ by default

=item Arrays

An array represents a list of values:

 my @animals = ("camel", "llama", "owl");
 my @numbers = (23, 42, 69);
 my @mixed   = ("camel", 42, 1.23);

Arrays are zero-indexed.  Here's how you get at elements in an array:

 print $animals[0];              # prints "camel"
 print $animals[1];              # prints "llama"

The special variable C<$#array> tells you the index of the last element
of an array:

 print $mixed[$#mixed];       # last element, prints 1.23

You might be tempted to use C<$#array + 1> to tell you how many items there
are in an array.  Don't bother.  As it happens, using C<@array> where Perl
expects to find a scalar value ("in scalar context") will give you the number
of elements in the array:

 if (@animals < 5) { ... }

The elements we're getting from the array start with a C<$> because
we're getting just a single value out of the array; you ask for a scalar,
you get a scalar.

To get multiple values from an array:

 @animals[0,1];                 # gives ("camel", "llama");
 @animals[0..2];                # gives ("camel", "llama", "owl");
 @animals[1..$#animals];        # gives all except the first element

This is called an "array slice".

You can do various useful things to lists:

 my @sorted    = sort @animals;
 my @backwards = reverse @numbers;

There are a couple of special arrays too, such as C<@ARGV> (the command
line arguments to your script) and C<@_> (the arguments passed to a
subroutine).  These are documented in L<perlvar>.

=item Hashes

A hash represents a set of key/value pairs:

 my %fruit_color = ("apple", "red", "banana", "yellow");

You can use whitespace and the C<< => >> operator to lay them out more
nicely:

 my %fruit_color = (
     apple  => "red",
     banana => "yellow",
 );

To get at hash elements:

 $fruit_color{"apple"};           # gives "red"

You can get at lists of keys and values with C<keys()> and
C<values()>.

 my @fruits = keys %fruit_color;
 my @colors = values %fruit_color;

Hashes have no particular internal order, though you can sort the keys
and loop through them.

Just like special scalars and arrays, there are also special hashes.
The most well known of these is C<%ENV> which contains environment
variables.  Read all about it (and other special variables) in
L<perlvar>.

=back

Scalars, arrays and hashes are documented more fully in L<perldata>.
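
As a minimal sketch of the points above, the following program counts the
elements of an array in scalar context and walks a hash in sorted key
order.  The names C<$count> and C<$last_index> are chosen only for this
illustration.

 use strict;
 use warnings;

 my @animals     = ("camel", "llama", "owl");
 my %fruit_color = (apple => "red", banana => "yellow");

 my $count      = @animals;      # scalar context gives the element count
 my $last_index = $#animals;     # index of the last element

 print "$count animals, last index is $last_index\n";

 # hashes are unordered, so sort the keys for predictable output
 foreach my $fruit (sort keys %fruit_color) {
     print "$fruit is $fruit_color{$fruit}\n";
 }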

More complex data types can be constructed using references, which allow
you to build lists and hashes within lists and hashes.

A reference is a scalar value and can refer to any other Perl data
type.  So by storing a reference as the value of an array or hash
element, you can easily create lists and hashes within lists and
hashes.  The following example shows a 2 level hash of hash
structure using anonymous hash references.

 my $variables = {
     scalar  =>  {
                  description => "single item",
                  sigil => '$',
                 },
     array   =>  {
                  description => "ordered list of items",
                  sigil => '@',
                 },
     hash    =>  {
                  description => "key/value pairs",
                  sigil => '%',
                 },
 };

 print "Scalars begin with a $variables->{'scalar'}->{'sigil'}\n";

Exhaustive information on the topic of references can be found in
L<perlreftut>, L<perllol>, L<perlref> and L<perldsc>.
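
The same idea works for a list of hashes.  The sketch below is
illustrative only (the animals and their C<legs> values are made up for
the example); it stores anonymous hash references in an array and reads
them back with the arrow operator:

 use strict;
 use warnings;

 my @animals = (
     { name => "camel", legs => 4 },
     { name => "owl",   legs => 2 },
 );

 # each element is a reference to a hash
 foreach my $animal (@animals) {
     print "$animal->{name} has $animal->{legs} legs\n";
 }

 my $total = 0;
 $total += $_->{legs} foreach @animals;
 print "Total legs: $total\n";          # prints "Total legs: 6"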

=head2 Variable scoping

Throughout the previous section all the examples have used the syntax:

 my $var = "value";

The C<my> is actually not required; you could just use:

 $var = "value";

However, the above usage will create global variables throughout your
program, which is bad programming practice.  C<my> creates lexically
scoped variables instead.  The variables are scoped to the block
(i.e. a bunch of statements surrounded by curly-braces) in which they
are defined.

 my $x = "foo";
 my $some_condition = 1;
 if ($some_condition) {
     my $y = "bar";
     print $x;           # prints "foo"
     print $y;           # prints "bar"
 }
 print $x;               # prints "foo"
 print $y;               # prints nothing; $y has fallen out of scope

Using C<my> in combination with a C<use strict;> at the top of
your Perl scripts means that the interpreter will pick up certain common
programming errors.  For instance, in the example above, the final
C<print $y> would cause a compile-time error and prevent you from
running the program.  Using C<strict> is highly recommended.
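
Because C<my> variables belong to their enclosing block, an inner block
can declare its own variable with the same name without disturbing the
outer one.  A small sketch:

 use strict;
 use warnings;

 my $x = "outer";
 {
     my $x = "inner";        # a new variable that hides the outer $x
     print "$x\n";           # prints "inner"
 }
 print "$x\n";               # prints "outer"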

=head2 Conditional and looping constructs

Perl has most of the usual conditional and looping constructs.  As of Perl
5.10, it even has a case/switch statement (spelled C<given>/C<when>),
although that feature has since been marked experimental.  See
L<perlsyn/"Switch Statements"> for more details.

The conditions can be any Perl expression.  See the list of operators in
the next section for information on comparison and boolean logic operators,
which are commonly used in conditional statements.

=over 4

=item if

 if ( condition ) {
     ...
 } elsif ( other condition ) {
     ...
 } else {
     ...
 }

There's also a negated version of it:

 unless ( condition ) {
     ...
 }

This is provided as a more readable version of C<if (!I<condition>)>.

Note that the braces are required in Perl, even if you've only got one
line in the block.  However, there is a clever way of making your one-line
conditional blocks more English-like:

 # the traditional way
 if ($zippy) {
     print "Yow!";
 }

 # the Perlish post-condition way
 print "Yow!" if $zippy;
 print "We have no bananas" unless $bananas;

=item while

 while ( condition ) {
     ...
 }

There's also a negated version, for the same reason we have C<unless>:

 until ( condition ) {
     ...
 }

You can also use C<while> in a post-condition:

 print "LA LA LA\n" while 1;          # loops forever

=item for

Exactly like C:

 for ($i = 0; $i <= $max; $i++) {
     ...
 }

The C-style C<for> loop is rarely needed in Perl since Perl provides
the friendlier list-scanning C<foreach> loop.

=item foreach

 foreach (@array) {
     print "This element is $_\n";
 }

 print $list[$_] foreach 0 .. $max;

 # you don't have to use the default $_ either...
 foreach my $key (keys %hash) {
     print "The value of $key is $hash{$key}\n";
 }

The C<foreach> keyword is actually a synonym for the C<for>
keyword.  See C<L<perlsyn/"Foreach Loops">>.

=back

For more detail on looping constructs (and some that weren't mentioned in
this overview) see L<perlsyn>.
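
To see several of these constructs working together, here is a short
sketch.  It uses the C<%> (modulus) operator, which is not listed in this
overview but is described in L<perlop>, and the variable names are only
for illustration.

 use strict;
 use warnings;

 my @numbers = (23, 42, 69);

 foreach my $n (@numbers) {
     if ($n % 2 == 0) {
         print "$n is even\n";
     } else {
         print "$n is odd\n";
     }
 }

 # post-condition form of until: find the smallest $i with $i * $i > 50
 my $i = 0;
 $i++ until $i * $i > 50;
 print "Smallest i is $i\n";      # prints "Smallest i is 8"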

=head2 Builtin operators and functions

Perl comes with a wide selection of builtin functions.  Some of the ones
we've already seen include C<print>, C<sort> and C<reverse>.  A list of
them is given at the start of L<perlfunc> and you can easily read
about any given function by using C<perldoc -f I<functionname>>.

Perl operators are documented in full in L<perlop>, but here are a few
of the most common ones:

=over 4

=item Arithmetic

 +   addition
 -   subtraction
 *   multiplication
 /   division

=item Numeric comparison

 ==  equality
 !=  inequality
 <   less than
 >   greater than
 <=  less than or equal
 >=  greater than or equal

=item String comparison

 eq  equality
 ne  inequality
 lt  less than
 gt  greater than
 le  less than or equal
 ge  greater than or equal

(Why do we have separate numeric and string comparisons?  Because we don't
have special variable types, and Perl needs to know whether to sort
numerically (where 99 is less than 100) or alphabetically (where 100 comes
before 99).)

=item Boolean logic

 &&  and
 ||  or
 !   not

(C<and>, C<or> and C<not> aren't just in the above table as descriptions
of the operators.  They're also supported as operators in their own
right.  They're more readable than the C-style operators, but have
different precedence to C<&&> and friends.  Check L<perlop> for more
detail.)

=item Miscellaneous

 =   assignment
 .   string concatenation
 x   string multiplication
 ..  range operator (creates a list of numbers or strings)

=back

Many operators can be combined with a C<=> as follows:

 $a += 1;        # same as $a = $a + 1
 $a -= 1;        # same as $a = $a - 1
 $a .= "\n";     # same as $a = $a . "\n";
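
The difference between numeric and string comparison is easiest to see by
trying it.  The following sketch also uses the sorting comparators
C<< <=> >> and C<cmp>, which are not in the table above but are
documented in L<perlop>:

 use strict;
 use warnings;

 print "numeric: 99 is less than 100\n"     if 99 < 100;
 print "string:  '100' sorts before '99'\n" if "100" lt "99";

 my @by_number = sort { $a <=> $b } (100, 99, 9);
 my @by_string = sort { $a cmp $b } (100, 99, 9);
 print "by number: @by_number\n";     # prints "by number: 9 99 100"
 print "by string: @by_string\n";     # prints "by string: 100 9 99"

 my $line = "-" x 10;                 # string multiplication
 my $list = join ",", 1 .. 5;         # range operator gives 1,2,3,4,5
 $line .= "\n";                       # same as $line = $line . "\n"
 print "$list\n", $line;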

=head2 Files and I/O

You can open a file for input or output using the C<open()> function.
It's documented in extravagant detail in L<perlfunc> and L<perlopentut>,
but in short:

 open(my $in,  "<",  "input.txt")  or die "Can't open input.txt: $!";
 open(my $out, ">",  "output.txt") or die "Can't open output.txt: $!";
 open(my $log, ">>", "my.log")     or die "Can't open my.log: $!";

You can read from an open filehandle using the C<< <> >> operator.  In
scalar context it reads a single line from the filehandle, and in list
context it reads the whole file in, assigning each line to an element of
the list:

 my $line  = <$in>;
 my @lines = <$in>;

Reading in the whole file at one time is called slurping.  It can
be useful but it may be a memory hog.  Most text file processing
can be done a line at a time with Perl's looping constructs.

The C<< <> >> operator is most often seen in a C<while> loop:

 while (<$in>) {     # assigns each line in turn to $_
     print "Just read in this line: $_";
 }

We've already seen how to print to standard output using C<print()>.
However, C<print()> can also take an optional first argument specifying
which filehandle to print to:

 print STDERR "This is your final warning.\n";
 print $out $record;
 print $log $logmessage;

When you're done with your filehandles, you should C<close()> them
(though to be honest, Perl will clean up after you if you forget):

 close $in or die "$in: $!";
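
Putting those pieces together, here is a small self-contained sketch that
writes two lines to a file and reads them back one line at a time.  It
assumes the core C<File::Temp> module is available to supply a scratch
file name; in a real program you would normally open a file of your own.

 use strict;
 use warnings;
 use File::Temp qw(tempfile);

 # tempfile() returns an open filehandle and the name of a new file
 my ($out, $path) = tempfile(UNLINK => 1);
 print $out "first line\n", "second line\n";
 close $out or die "Can't close $path: $!";

 open(my $in, "<", $path) or die "Can't open $path: $!";
 my $count = 0;
 while (my $line = <$in>) {       # one line per iteration
     chomp $line;                 # remove the trailing newline
     $count++;
     print "$count: $line\n";
 }
 close $in or die "Can't close $path: $!";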

=head2 Regular expressions

Perl's regular expression support is both broad and deep, and is the
subject of lengthy documentation in L<perlrequick>, L<perlretut>, and
elsewhere.  However, in short:

=over 4

=item Simple matching

 if (/foo/)       { ... }  # true if $_ contains "foo"
 if ($a =~ /foo/) { ... }  # true if $a contains "foo"

The C<//> matching operator is documented in L<perlop>.  It operates on
C<$_> by default, or can be bound to another variable using the C<=~>
binding operator (also documented in L<perlop>).

=item Simple substitution

 s/foo/bar/;               # replaces foo with bar in $_
 $a =~ s/foo/bar/;         # replaces foo with bar in $a
 $a =~ s/foo/bar/g;        # replaces ALL INSTANCES of foo with bar
                           # in $a

The C<s///> substitution operator is documented in L<perlop>.

=item More complex regular expressions

You don't just have to match on fixed strings.  In fact, you can match
on just about anything you could dream of by using more complex regular
expressions.  These are documented at great length in L<perlre>, but for
the meantime, here's a quick cheat sheet:

 .                   any single character (except newline, by default)
 \s                  a whitespace character (space, tab, newline,
                     ...)
 \S                  non-whitespace character
 \d                  a digit (0-9)
 \D                  a non-digit
 \w                  a word character (a-z, A-Z, 0-9, _)
 \W                  a non-word character
 [aeiou]             matches a single character in the given set
 [^aeiou]            matches a single character outside the given
                     set
 (foo|bar|baz)       matches any of the alternatives specified

 ^                   start of string
 $                   end of string

Quantifiers can be used to specify how many of the previous thing you
want to match on, where "thing" means either a literal character, one
of the metacharacters listed above, or a group of characters or
metacharacters in parentheses.

 *                   zero or more of the previous thing
 +                   one or more of the previous thing
 ?                   zero or one of the previous thing
 {3}                 matches exactly 3 of the previous thing
 {3,6}               matches between 3 and 6 of the previous thing
 {3,}                matches 3 or more of the previous thing

Some brief examples:

 /^\d+/              string starts with one or more digits
 /^$/                nothing in the string (start and end are
                     adjacent)
 /(\d\s){3}/         three digits, each followed by a whitespace
                     character (eg "3 4 5 ")
 /(a.)+/             matches a string in which every odd-numbered
                     letter is a (eg "abacadaf")

 # This loop reads from STDIN, and prints non-blank lines:
 while (<>) {
     next if /^$/;
     print;
 }

=item Parentheses for capturing

As well as grouping, parentheses serve a second purpose.  They can be
used to capture the results of parts of the regexp match for later use.
The results end up in C<$1>, C<$2> and so on.

 # a cheap and nasty way to break an email address up into parts

 if ($email =~ /([^@]+)@(.+)/) {
     print "Username is $1\n";
     print "Hostname is $2\n";
 }

=item Other regexp features

Perl regexps also support backreferences, lookaheads, and all kinds of
other complex details.  Read all about them in L<perlrequick>,
L<perlretut>, and L<perlre>.

=back
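
As a short sketch combining matching, capturing and substitution, the
following program picks apart a log entry.  The entry format and the
variable names are invented for the example.

 use strict;
 use warnings;

 my $entry = "2024-01-15 ERROR disk full";

 if ($entry =~ /^(\d{4})-(\d\d)-(\d\d)\s+(\w+)\s+(.+)$/) {
     print "Year $1, level $4, message: $5\n";
 }

 (my $masked = $entry) =~ s/\d/#/g;   # copy, then replace every digit
 print "$masked\n";                   # prints "####-##-## ERROR disk full"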

=head2 Writing subroutines

Writing subroutines is easy:

 sub logger {
    my $logmessage = shift;
    open my $logfile, ">>", "my.log" or die "Could not open my.log: $!";
    print $logfile $logmessage;
 }

Now we can use the subroutine just as any other built-in function:

 logger("We have a logger subroutine!");

What's that C<shift>?  Well, the arguments to a subroutine are available
to us as a special array called C<@_> (see L<perlvar> for more on that).
The default argument to the C<shift> function just happens to be C<@_>.
So C<my $logmessage = shift;> shifts the first item off the list of
arguments and assigns it to C<$logmessage>.

We can manipulate C<@_> in other ways too:

 my ($logmessage, $priority) = @_;       # common
 my $logmessage = $_[0];                 # uncommon, and ugly

Subroutines can also return values:

 sub square {
     my $num = shift;
     my $result = $num * $num;
     return $result;
 }

Then use it like:

 $sq = square(8);

For more information on writing subroutines, see L<perlsub>.
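
A subroutine can take several arguments and return a list, too.  Here is
a minimal sketch; C<min_max> is a name made up for this example, not a
built-in function.

 use strict;
 use warnings;

 sub min_max {
     my @numbers = @_;                        # copy all the arguments
     my ($min, $max) = ($numbers[0], $numbers[0]);
     foreach my $n (@numbers) {
         $min = $n if $n < $min;
         $max = $n if $n > $max;
     }
     return ($min, $max);                     # return a two-element list
 }

 my ($lowest, $highest) = min_max(23, 42, 69);
 print "min=$lowest max=$highest\n";          # prints "min=23 max=69"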

=head2 OO Perl

OO Perl is relatively simple and is implemented using references which
know what sort of object they are based on Perl's concept of packages.
However, OO Perl is largely beyond the scope of this document.
Read L<perlootut> and L<perlobj>.

As a beginning Perl programmer, your most common use of OO Perl will be
in using third-party modules, which are documented below.

=head2 Using Perl modules

Perl modules provide a range of features to help you avoid reinventing
the wheel, and can be downloaded from CPAN ( L<http://www.cpan.org/> ).  A
number of popular modules are included with the Perl distribution
itself.

Categories of modules range from text manipulation to network protocols
to database integration to graphics.  A categorized list of modules is
also available from CPAN.

To learn how to install modules you download from CPAN, read
L<perlmodinstall>.

To learn how to use a particular module, use C<perldoc I<Module::Name>>.
Typically you will want to C<use I<Module::Name>>, which will then give
you access to exported functions or an OO interface to the module.
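
For instance, the C<List::Util> module is included with the Perl
distribution and exports functions on request.  A minimal sketch:

 use strict;
 use warnings;
 use List::Util qw(sum max);      # import just these two functions

 my @numbers = (23, 42, 69);
 print "Sum: ", sum(@numbers), "\n";     # prints "Sum: 134"
 print "Max: ", max(@numbers), "\n";     # prints "Max: 69"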

L<perlfaq> contains questions and answers related to many common
tasks, and often provides suggestions for good CPAN modules to use.

L<perlmod> describes Perl modules in general.  L<perlmodlib> lists the
modules which came with your Perl installation.

If you feel the urge to write Perl modules, L<perlnewmod> will give you
good advice.

=head1 AUTHOR

Kirrily "Skud" Robert <skud@cpan.org>

=head1 NAME

perlfork - Perl's fork() emulation

=head1 SYNOPSIS

    NOTE:  As of the 5.8.0 release, fork() emulation has considerably
    matured.  However, there are still a few known bugs and differences
    from real fork() that might affect you.  See the "BUGS" and
    "CAVEATS AND LIMITATIONS" sections below.

Perl provides a fork() keyword that corresponds to the Unix system call
of the same name.  On most Unix-like platforms where the fork() system
call is available, Perl's fork() simply calls it.

On some platforms such as Windows where the fork() system call is not
available, Perl can be built to emulate fork() at the interpreter level.
While the emulation is designed to be as compatible as possible with the
real fork() at the level of the Perl program, there are certain
important differences that stem from the fact that all the pseudo child
"processes" created this way live in the same real process as far as the
operating system is concerned.

This document provides a general overview of the capabilities and
limitations of the fork() emulation.  Note that the issues discussed here
are not applicable to platforms where a real fork() is available and Perl
has been configured to use it.

=head1 DESCRIPTION

The fork() emulation is implemented at the level of the Perl interpreter.
What this means in general is that running fork() will actually clone the
running interpreter and all its state, and run the cloned interpreter in
a separate thread, beginning execution in the new thread just after the
point where the fork() was called in the parent.  We will refer to the
thread that implements this child "process" as the pseudo-process.

To the Perl program that called fork(), all this is designed to be
transparent.  The parent returns from the fork() with a pseudo-process
ID that can be subsequently used in any process-manipulation functions;
the child returns from the fork() with a value of C<0> to signify that
it is the child pseudo-process.
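
The following sketch illustrates the two return values.  It is a minimal
example rather than a recommended pattern, and the exit status of C<7> is
arbitrary.  The same code is intended to run both with a real fork() and
under the emulation.

    use strict;
    use warnings;

    my $pid = fork();
    die "fork() failed: $!" unless defined $pid;

    if ($pid == 0) {
        # child (or pseudo-process): fork() returned 0
        exit(7);
    }

    # parent: $pid is the ID of the child
    waitpid($pid, 0);
    my $status = $? >> 8;            # exit status is in the high byte of $?
    print "child exited with status $status\n";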

=head2 Behavior of other Perl features in forked pseudo-processes

Most Perl features behave in a natural way within pseudo-processes.

=over 8

=item $$ or $PROCESS_ID

This special variable is correctly set to the pseudo-process ID.
It can be used to identify pseudo-processes within a particular
session.  Note that this value is subject to recycling if any
pseudo-processes are launched after others have been wait()-ed on.

=item %ENV

Each pseudo-process maintains its own virtual environment.  Modifications
to %ENV affect the virtual environment, and are only visible within that
pseudo-process, and in any processes (or pseudo-processes) launched from
it.

=item chdir() and all other builtins that accept filenames

Each pseudo-process maintains its own virtual idea of the current directory.
Modifications to the current directory using chdir() are only visible within
that pseudo-process, and in any processes (or pseudo-processes) launched from
it.  All file and directory accesses from the pseudo-process will correctly
map the virtual working directory to the real working directory.

=item wait() and waitpid()

wait() and waitpid() can be passed a pseudo-process ID returned by fork().
These calls will properly wait for the termination of the pseudo-process
and return its status.

=item kill()

C<kill('KILL', ...)> can be used to terminate a pseudo-process by
passing it the ID returned by fork().  The outcome of kill() on a
pseudo-process is unpredictable, and it should not be used except
under dire circumstances, because the operating system may not
guarantee the integrity of the process resources when a running thread
is terminated.  The process that implements the pseudo-processes can
become blocked, causing the Perl interpreter to hang.  Note that using
C<kill('KILL', ...)> on a pseudo-process may cause memory leaks,
because the thread that implements the pseudo-process does not get a
chance to clean up its resources.

C<kill('TERM', ...)> can also be used on pseudo-processes, but the
signal will not be delivered while the pseudo-process is blocked by a
system call, e.g. waiting for a socket to connect, or trying to read
from a socket with no data available.  Starting in Perl 5.14 the
parent process will not wait for children to exit once they have been
signalled with C<kill('TERM', ...)> to avoid deadlock during process
exit.  You will have to explicitly call waitpid() to make sure the
child has time to clean up after itself, but you are then also responsible
for ensuring that the child is not blocked on I/O.

=item exec()

Calling exec() within a pseudo-process actually spawns the requested
executable in a separate process and waits for it to complete before
exiting with the same exit status as that process.  This means that the
process ID reported within the running executable will be different from
what the earlier Perl fork() might have returned.  Similarly, any process
manipulation functions applied to the ID returned by fork() will affect the
waiting pseudo-process that called exec(), not the real process it is
waiting for after the exec().

When exec() is called inside a pseudo-process then DESTROY methods and
END blocks will still be called after the external process returns.

=item exit()

exit() always exits just the executing pseudo-process, after automatically
wait()-ing for any outstanding child pseudo-processes.  Note that this means
that the process as a whole will not exit unless all running pseudo-processes
have exited.  See below for some limitations with open filehandles.

=item Open handles to files, directories and network sockets

All open handles are dup()-ed in pseudo-processes, so that closing
any handles in one process does not affect the others.  See below for
some limitations.

=back
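
As a rough illustration of the separate environments described under
C<%ENV> above, the sketch below changes a variable in the child and then
checks it in the parent.  The variable name C<DEMO_SETTING> is
hypothetical and has no special meaning.

    use strict;
    use warnings;

    $ENV{DEMO_SETTING} = "parent";

    my $pid = fork();
    die "fork() failed: $!" unless defined $pid;

    if ($pid == 0) {
        # this change is visible only in the child
        $ENV{DEMO_SETTING} = "child";
        exit(0);
    }

    waitpid($pid, 0);
    print "parent still sees: $ENV{DEMO_SETTING}\n";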

=head2 Resource limits

In the eyes of the operating system, pseudo-processes created via the fork()
emulation are simply threads in the same process.  This means that any
process-level limits imposed by the operating system apply to all
pseudo-processes taken together.  This includes any limits imposed by the
operating system on the number of open file, directory and socket handles,
limits on disk space usage, limits on memory size, limits on CPU utilization
etc.

=head2 Killing the parent process

If the parent process is killed (either using Perl's kill() builtin, or
using some external means) all the pseudo-processes are killed as well,
and the whole process exits.

=head2 Lifetime of the parent process and pseudo-processes

During the normal course of events, the parent process and every
pseudo-process started by it will wait for their respective pseudo-children
to complete before they exit.  This means that the parent and every
pseudo-child created by it that is also a pseudo-parent will only exit
after their pseudo-children have exited.

Starting with Perl 5.14 a parent will not wait() automatically
for any child that has been signalled with C<kill('TERM', ...)>
to avoid a deadlock in case the child is blocking on I/O and
never receives the signal.

=head1 CAVEATS AND LIMITATIONS

=over 8

=item BEGIN blocks

The fork() emulation will not work entirely correctly when called from
within a BEGIN block.  The forked copy will run the contents of the
BEGIN block, but will not continue parsing the source stream after the
BEGIN block.  For example, consider the following code:

    BEGIN {
        fork and exit;          # fork child and exit the parent
        print "inner\n";
    }
    print "outer\n";

This will print:

    inner

rather than the expected:

    inner
    outer

This limitation arises from fundamental technical difficulties in
cloning and restarting the stacks used by the Perl parser in the
middle of a parse.

=item Open filehandles

Any filehandles open at the time of the fork() will be dup()-ed.  Thus,
the files can be closed independently in the parent and child, but beware
that the dup()-ed handles will still share the same seek pointer.  Changing
the seek position in the parent will change it in the child and vice-versa.
One can avoid this by opening files that need distinct seek pointers
separately in the child.

On some operating systems, notably Solaris and Unixware, calling C<exit()>
from a child process will flush and close open filehandles in the parent,
thereby corrupting the filehandles.  On these systems, calling C<_exit()>
is suggested instead.  C<_exit()> is available in Perl through the
C<POSIX> module.  Please consult your system's manpages for more information
on this.

=item Open directory handles

Perl will completely read from all open directory handles until they
reach the end of the stream.  It will then seekdir() back to the
original location and all future readdir() requests will be fulfilled
from the cache buffer.  That means that neither the directory handle held
by the parent process nor the one held by the child process will see
any changes made to the directory after the fork() call.

Note that rewinddir() has a similar limitation on Windows and will not
force readdir() to read the directory again either.  Only a newly
opened directory handle will reflect changes to the directory.

=item Forking pipe open() not yet implemented

The C<open(FOO, "|-")> and C<open(BAR, "-|")> constructs are not yet
implemented.  This limitation can be easily worked around in new code
by creating a pipe explicitly.  The following example shows how to
write to a forked child:

    # simulate open(FOO, "|-")
    sub pipe_to_fork ($) {
        my $parent = shift;
        pipe my $child, $parent or die;
        my $pid = fork();
        die "fork() failed: $!" unless defined $pid;
        if ($pid) {
            close $child;
        }
        else {
            close $parent;
            open(STDIN, "<&=" . fileno($child)) or die;
        }
        $pid;
    }

    if (pipe_to_fork('FOO')) {
        # parent
        print FOO "pipe_to_fork\n";
        close FOO;
    }
    else {
        # child
        while (<STDIN>) { print; }
        exit(0);
    }

And this one reads from the child:

    # simulate open(FOO, "-|")
    sub pipe_from_fork ($) {
        my $parent = shift;
        pipe $parent, my $child or die;
        my $pid = fork();
        die "fork() failed: $!" unless defined $pid;
        if ($pid) {
            close $child;
        }
        else {
            close $parent;
            open(STDOUT, ">&=" . fileno($child)) or die;
        }
        $pid;
    }

    if (pipe_from_fork('BAR')) {
        # parent
        while (<BAR>) { print; }
        close BAR;
    }
    else {
        # child
        print "pipe_from_fork\n";
        exit(0);
    }

Forking pipe open() constructs will be supported in future.

=item Global state maintained by XSUBs

External subroutines (XSUBs) that maintain their own global state may
not work correctly.  Such XSUBs will either need to maintain locks to
protect simultaneous access to global data from different pseudo-processes,
or maintain all their state on the Perl symbol table, which is copied
naturally when fork() is called.  A callback mechanism that provides
extensions an opportunity to clone their state will be provided in the
near future.

=item Interpreter embedded in larger application

The fork() emulation may not behave as expected when it is executed in an
application which embeds a Perl interpreter and calls Perl APIs that can
evaluate bits of Perl code.  This stems from the fact that the emulation
only has knowledge about the Perl interpreter's own data structures and
knows nothing about the containing application's state.  For example, any
state carried on the application's own call stack is out of reach.

=item Thread-safety of extensions

Since the fork() emulation runs code in multiple threads, extensions
calling into non-thread-safe libraries may not work reliably when
calling fork().  As Perl's threading support gradually becomes more
widely adopted even on platforms with a native fork(), such extensions
are expected to be fixed for thread-safety.

=back
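
The shared seek pointer described under "Open filehandles" above can be
observed with a sketch along the following lines.  It assumes the core
C<File::Temp> and C<Fcntl> modules, and it uses sysread() and sysseek() so
that Perl's own buffering does not obscure the result.  On a system where
the handle is shared in this way, the parent is expected to report the
position reached by the child, which is 5 here.

    use strict;
    use warnings;
    use Fcntl qw(SEEK_CUR);
    use File::Temp qw(tempfile);

    # create a scratch file holding ten bytes
    my ($tmp, $path) = tempfile();
    print $tmp "0123456789";
    close $tmp or die "Can't close $path: $!";

    open(my $fh, "<", $path) or die "Can't open $path: $!";

    my $pid = fork();
    die "fork() failed: $!" unless defined $pid;

    if ($pid == 0) {
        # child: advance the seek pointer by reading 5 bytes
        sysread($fh, my $buffer, 5);
        exit(0);
    }

    waitpid($pid, 0);
    my $position = sysseek($fh, 0, SEEK_CUR);   # where is the parent now?
    print "parent sees position $position\n";

    close $fh;
    unlink $path;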

=head1 PORTABILITY CAVEATS

In portable Perl code, C<kill(9, $child)> must not be used on forked processes.
Killing a forked process is unsafe and has unpredictable results.
See L</kill()>, above.

=head1 BUGS

=over 8

=item *

Having pseudo-process IDs be negative integers breaks down for the integer
C<-1> because the wait() and waitpid() functions treat this number as
being special.  The tacit assumption in the current implementation is that
the system never allocates a thread ID of C<1> for user threads.  A better
representation for pseudo-process IDs will be implemented in future.

=item *

In certain cases, the OS-level handles created by the pipe(), socket(),
and accept() operators are apparently not duplicated accurately in
pseudo-processes.  This only happens in some situations, but where it
does happen, it may result in deadlocks between the read and write ends
of pipe handles, or inability to send or receive data across socket
handles.

=item *

This document may be incomplete in some respects.

=back

=head1 AUTHOR

Support for concurrent interpreters and the fork() emulation was implemented
by ActiveState, with funding from Microsoft Corporation.

This document is authored and maintained by Gurusamy Sarathy
E<lt>gsar@activestate.comE<gt>.

=head1 SEE ALSO

L<perlfunc/"fork">, L<perlipc>

=cut
perlunifaq.pod000064400000032517150344123500007423 0ustar00=head1 NAME

perlunifaq - Perl Unicode FAQ

=head1 Q and A

This is a list of questions and answers about Unicode in Perl, intended to be
read after L<perlunitut>.

=head2 perlunitut isn't really a Unicode tutorial, is it?

No, and this isn't really a Unicode FAQ.

Perl has an abstracted interface for all supported character encodings, so this
is actually a generic C<Encode> tutorial and C<Encode> FAQ. But many people
think that Unicode is special and magical, and I didn't want to disappoint
them, so I decided to call the document a Unicode tutorial.

=head2 What character encodings does Perl support?

To find out which character encodings your Perl supports, run:

    perl -MEncode -le "print for Encode->encodings(':all')"

=head2 Which version of perl should I use?

Well, if you can, upgrade to the most recent, but certainly C<5.8.1> or newer.
The tutorial and FAQ assume the latest release.

You should also check your modules, and upgrade them if necessary. For example,
HTML::Entities requires version >= 1.32 to function correctly, even though the
changelog is silent about this.

=head2 What about binary data, like images?

Well, apart from a bare C<binmode $fh>, you shouldn't treat them specially.
(The binmode is needed because otherwise Perl may convert line endings on Win32
systems.)

Be careful, though, to never combine text strings with binary strings. If you
need text in a binary stream, encode your text strings first using the
appropriate encoding, then join them with binary strings. See also
L</What if I don't encode?>.
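
As a minimal sketch of that rule, assume a hypothetical record format (not part
of any real protocol) in which a two-byte length prefix is followed by UTF-8
text. The text is encoded first, so that only byte strings are joined:

    use strict;
    use warnings;
    use Encode qw(encode);

    my $text    = "caf\x{e9}";                 # text string: 4 characters
    my $payload = encode('UTF-8', $text);      # byte string: 5 bytes

    # Hypothetical layout: 16-bit big-endian length, then the payload.
    my $record  = pack('n', length $payload) . $payload;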

=head2 When should I decode or encode?

Whenever you're communicating text with anything that is external to your perl
process, like a database, a text file, a socket, or another program. Even if
the thing you're communicating with is also written in Perl.

=head2 What if I don't decode?

Whenever your encoded, binary string is used together with a text string, Perl
will assume that your binary string was encoded with ISO-8859-1, also known as
latin-1. If it wasn't latin-1, then your data is unpleasantly converted. For
example, if it was UTF-8, the individual bytes of multibyte characters are seen
as separate characters, and then again converted to UTF-8. Such double encoding
can be compared to double HTML encoding (C<&amp;gt;>), or double URI encoding
(C<%253E>).

This silent implicit decoding is known as "upgrading". That may sound
positive, but it's best to avoid it.
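
The effect can be reproduced in a few lines. This is only an illustrative
sketch, and the variable names are made up for the example:

    use strict;
    use warnings;
    use Encode qw(decode encode);

    my $bytes = "\xC3\xA9";    # UTF-8 bytes for U+00E9, not yet decoded
    my $text  = "\x{263A}";    # a text string

    # Wrong: the two bytes are taken to be two latin-1 characters, so
    # they are encoded a second time (7 bytes in total).
    my $wrong = encode('UTF-8', $bytes . $text);

    # Right: decode first, then combine, then encode once (5 bytes).
    my $right = encode('UTF-8', decode('UTF-8', $bytes) . $text);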

=head2 What if I don't encode?

Your text string will be sent using the bytes in Perl's internal format. In
some cases, Perl will warn you that you're doing something wrong, with a
friendly warning:

    Wide character in print at example.pl line 2.

Because the internal format is often UTF-8, and UTF-8 is usually the encoding
you wanted, these bugs are hard to spot. But don't be lazy, and don't rely on
Perl's internal format being UTF-8. Encode explicitly to avoid weird bugs, and
to show maintenance programmers that you thought this through.
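
A small sketch of the explicit approach, assuming that the receiving end
expects UTF-8:

    use strict;
    use warnings;
    use Encode qw(encode);

    my $text  = "snowman: \x{2603}\n";     # text string
    my $bytes = encode('UTF-8', $text);    # the encoding is chosen here

    binmode STDOUT;                        # the handle carries raw bytes
    print $bytes;                          # no "Wide character" warning

The explicit C<encode> call also documents which encoding the output is meant
to have.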

=head2 Is there a way to automatically decode or encode?

If all data that comes from a certain handle is encoded in exactly the same
way, you can tell the PerlIO system to automatically decode everything, with
the C<encoding> layer. If you do this, you can't accidentally forget to decode
or encode anymore, on things that use the layered handle.

You can provide this layer when C<open>ing the file:

  open my $fh, '>:encoding(UTF-8)', $filename;  # auto encoding on write
  open my $fh, '<:encoding(UTF-8)', $filename;  # auto decoding on read

Or if you already have an open filehandle:

  binmode $fh, ':encoding(UTF-8)';

Some database drivers for DBI can also automatically encode and decode, but
that is sometimes limited to the UTF-8 encoding.

=head2 What if I don't know which encoding was used?

Do whatever you can to find out, and if you have to: guess. (Don't forget to
document your guess with a comment.)

You could open the document in a web browser, and change the character set or
character encoding until you can visually confirm that all characters look the
way they should.

There is no way to reliably detect the encoding automatically, so if people
keep sending you data without charset indication, you may have to educate them.
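
If you do have to guess, one common heuristic is to try a strict UTF-8 decode
first and fall back to a documented guess. The sketch below assumes that the
data is either UTF-8 or ISO-8859-1; C<decode_guess> is a hypothetical helper
name. Valid UTF-8 is a strong hint, not proof, so this does not detect the
encoding reliably.

    use strict;
    use warnings;
    use Encode qw(decode FB_CROAK LEAVE_SRC);

    sub decode_guess {
        my ($bytes) = @_;

        # Strict decode: dies on byte sequences that are not valid UTF-8.
        my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
        return $text if defined $text;

        # Guess: anything else from this source is assumed to be latin-1.
        return decode('ISO-8859-1', $bytes);
    }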

=head2 Can I use Unicode in my Perl sources?

Yes, you can! If your sources are UTF-8 encoded, you can indicate that with the
C<use utf8> pragma.

    use utf8;

This doesn't do anything to your input, or to your output. It only influences
the way your sources are read. You can use Unicode in string literals, in
identifiers (but they still have to be "word characters" according to C<\w>),
and even in custom delimiters.

=head2 Data::Dumper doesn't restore the UTF8 flag; is it broken?

No, Data::Dumper's Unicode abilities are as they should be. There have been
some complaints that it should restore the UTF8 flag when the data is read
again with C<eval>. However, you should really not look at the flag, and
nothing indicates that Data::Dumper should break this rule.

Here's what happens: when Perl reads in a string literal, it sticks to 8 bit
encoding as long as it can. (But perhaps originally it was internally encoded
as UTF-8, when you dumped it.) When it has to give that up because other
characters are added to the text string, it silently upgrades the string to
UTF-8.

If you properly encode your strings for output, none of this is of your
concern, and you can just C<eval> dumped data as always.
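
A short sketch of such a round trip. It compares the strings as text and never
looks at the flag:

    use strict;
    use warnings;
    use Data::Dumper;

    $Data::Dumper::Terse = 1;       # dump the bare value, without "$VAR1 ="

    my $orig = "caf\x{e9}";
    utf8::upgrade($orig);           # force the UTF-8 internal representation

    my $copy = eval Dumper($orig);  # read the dumped text back in
    print "same text\n" if $copy eq $orig;

Whatever the internal representation of C<$copy> turns out to be, C<eq> sees
the same four characters.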

=head2 Why do regex character classes sometimes match only in the ASCII range?

Starting in Perl 5.14 (and partially in Perl 5.12), just put a
C<use feature 'unicode_strings'> near the beginning of your program.
Within its lexical scope you shouldn't have this problem.  It also is
automatically enabled under C<use feature ':5.12'> or C<use v5.12> or
using C<-E> on the command line for Perl 5.12 or higher.

The rationale for requiring this is to not break older programs that
rely on the way things worked before Unicode came along.  Those older
programs knew only about the ASCII character set, and so may not work
properly for additional characters.  When a string is encoded in UTF-8,
Perl assumes that the program is prepared to deal with Unicode, but when
the string isn't, Perl assumes that only ASCII is wanted, and so
non-ASCII characters aren't given the meaning they have in Unicode.
C<use feature 'unicode_strings'> tells Perl to treat all characters as
Unicode, whether the string is encoded in UTF-8 or not, thus avoiding
the problem.

However, on earlier Perls, or if you pass strings to subroutines outside
the feature's scope, you can force Unicode rules by changing the
encoding to UTF-8 by doing C<utf8::upgrade($string)>. This can be used
safely on any string, as it checks and does not change strings that have
already been upgraded.

For a more detailed discussion, see L<Unicode::Semantics> on CPAN.
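
The difference can be seen in a small sketch, assuming Perl 5.14 or later and
no C<use locale> in effect:

    use strict;
    use warnings;

    my $e_acute = "\xE9";     # U+00E9, held in the one-byte representation

    my $outside = $e_acute =~ /\w/ ? 1 : 0;     # ASCII rules: no match

    my $inside;
    {
        use feature 'unicode_strings';
        $inside = $e_acute =~ /\w/ ? 1 : 0;     # Unicode rules: match
    }

    my $copy = $e_acute;
    utf8::upgrade($copy);                       # same text, UTF-8 inside
    my $upgraded = $copy =~ /\w/ ? 1 : 0;       # Unicode rules: match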

=head2 Why do some characters not uppercase or lowercase correctly?

See the answer to the previous question.
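
For case mapping, the same situation looks like this (a sketch, again assuming
a Perl that supports C<unicode_strings> and no C<use locale>):

    use strict;
    use warnings;

    my $s     = "\xE9";       # LATIN SMALL LETTER E WITH ACUTE
    my $plain = uc $s;        # ASCII rules: left unchanged

    # Within the feature's scope, uc() follows Unicode rules.
    my $fixed = do { use feature 'unicode_strings'; uc $s };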

=head2 How can I determine if a string is a text string or a binary string?

You can't. Some use the UTF8 flag for this, but that's misuse, and makes well
behaved modules like Data::Dumper look bad. The flag is useless for this
purpose, because it's off when an 8 bit encoding (by default ISO-8859-1) is
used to store the string.

This is something you, the programmer, have to keep track of; sorry. You could
consider adopting a kind of "Hungarian notation" to help with this.

=head2 How do I convert from encoding FOO to encoding BAR?

By first converting the FOO-encoded byte string to a text string, and then the
text string to a BAR-encoded byte string:

    use Encode qw(decode encode);
    my $text_string = decode('FOO', $foo_string);
    my $bar_string  = encode('BAR', $text_string);

or by skipping the text string part, and going directly from one binary
encoding to the other:

    use Encode qw(from_to);
    from_to($string, 'FOO', 'BAR');  # changes contents of $string

or by letting automatic decoding and encoding do all the work:

    open my $foofh, '<:encoding(FOO)', 'example.foo.txt'
        or die "Cannot read example.foo.txt: $!";
    open my $barfh, '>:encoding(BAR)', 'example.bar.txt'
        or die "Cannot write example.bar.txt: $!";
    print { $barfh } $_ while <$foofh>;

=head2 What are C<decode_utf8> and C<encode_utf8>?

These are alternate syntaxes for C<decode('utf8', ...)> and C<encode('utf8',
...)>. Do not use these functions for data exchange. Instead use
C<decode('UTF-8', ...)> and C<encode('UTF-8', ...)>; see
L</What's the difference between UTF-8 and utf8?> below.

=head2 What is a "wide character"?

This is a term used for characters occupying more than one byte.

The Perl warning "Wide character in ..." is caused by such a character.
With no specified encoding layer, Perl tries to
fit things into a single byte.  When it can't, it
emits this warning (if warnings are enabled), and uses UTF-8 encoded data
instead.

To avoid this warning and to avoid having different output encodings in a single
stream, always specify an encoding explicitly, for example with a PerlIO layer:

    binmode STDOUT, ":encoding(UTF-8)";

=head1 INTERNALS

=head2 What is "the UTF8 flag"?

Please, unless you're hacking the internals, or debugging weirdness, don't
think about the UTF8 flag at all. That means that you very probably shouldn't
use C<is_utf8>, C<_utf8_on> or C<_utf8_off> at all.

The UTF8 flag, also called SvUTF8, is an internal flag that indicates that the
current internal representation is UTF-8. Without the flag, it is assumed to be
ISO-8859-1. Perl converts between these automatically.  (Actually Perl usually
assumes the representation is ASCII; see L</Why do regex character classes
sometimes match only in the ASCII range?> above.)

One of Perl's internal formats happens to be UTF-8. Unfortunately, Perl can't
keep a secret, so everyone knows about this. That is the source of much
confusion. It's better to pretend that the internal format is some unknown
encoding, and that you always have to encode and decode explicitly.

=head2 What about the C<use bytes> pragma?

Don't use it. It makes no sense to deal with bytes in a text string, and it
makes no sense to deal with characters in a byte string. Do the proper
conversions (by decoding/encoding), and things will work out well: you get
character counts for decoded data, and byte counts for encoded data.

C<use bytes> is usually a failed attempt to do something useful. Just forget
about it.
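
A sketch of what the proper conversions give you, without C<use bytes>:

    use strict;
    use warnings;
    use Encode qw(encode);

    my $text  = "\x{20AC}100";             # EURO SIGN followed by "100"
    my $bytes = encode('UTF-8', $text);

    my $chars  = length $text;             # character count: 4
    my $octets = length $bytes;            # byte count: 6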

=head2 What about the C<use encoding> pragma?

Don't use it. Unfortunately, it assumes that the programmer's environment and
that of the user will use the same encoding. It will use the same encoding for
the source code and for STDIN and STDOUT. When a program is copied to another
machine, the source code does not change, but the STDIO environment might.

If you need non-ASCII characters in your source code, make it a UTF-8 encoded
file and C<use utf8>.

If you need to set the encoding for STDIN, STDOUT, and STDERR, for example
based on the user's locale, C<use open>.
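
As a sketch, the following fixes the encoding at UTF-8 rather than taking it
from the locale (the C<:locale> form is described in L<open>). The temporary
file exists only to make the example self-contained.

    use strict;
    use warnings;
    use open qw(:std :encoding(UTF-8));    # default layers for this scope
    use File::Temp qw(tempfile);

    my ($tmp, $path) = tempfile(UNLINK => 1);
    close $tmp;

    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} "\x{263A}\n";             # encoded on the way out
    close $out or die "Cannot close $path: $!";

    open my $in, '<', $path or die "Cannot read $path: $!";
    chomp(my $line = <$in>);               # decoded on the way in
    close $in;

Neither C<open> call names a layer; both pick up the default set by the pragma.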

=head2 What is the difference between C<:encoding> and C<:utf8>?

Because UTF-8 is one of Perl's internal formats, you can often just skip the
encoding or decoding step, and manipulate the UTF8 flag directly.

Instead of C<:encoding(UTF-8)>, you can simply use C<:utf8>, which skips the
encoding step if the data was already represented as UTF8 internally. This is
widely accepted as good behavior when you're writing, but it can be dangerous
when reading, because it causes internal inconsistency when you have invalid
byte sequences. Using C<:utf8> for input can sometimes result in security
breaches, so please use C<:encoding(UTF-8)> instead.

Instead of C<decode> and C<encode>, you could use C<_utf8_on> and C<_utf8_off>,
but this is considered bad style. Especially C<_utf8_on> can be dangerous, for
the same reason that C<:utf8> can.

There are some shortcuts for one-liners;
see L<-C|perlrun/-C [numberE<sol>list]> in L<perlrun>.

=head2 What's the difference between C<UTF-8> and C<utf8>?

C<UTF-8> is the official standard. C<utf8> is Perl's way of being liberal in
what it accepts. If you have to communicate with things that aren't so liberal,
you may want to consider using C<UTF-8>. If you have to communicate with things
that are too liberal, you may have to use C<utf8>. The full explanation is in
L<Encode/"UTF-8 vs. utf8 vs. UTF8">.

C<UTF-8> is internally known as C<utf-8-strict>. The tutorial uses UTF-8
consistently, even where utf8 is actually used internally, because the
distinction can be hard to make, and is mostly irrelevant.

For example, utf8 can be used for code points that don't exist in Unicode, like
9999999, but if you encode that to UTF-8, you get a substitution character (by
default; see L<Encode/"Handling Malformed Data"> for more ways of dealing with
this.)
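
A sketch of that example. The C<utf8> warnings category is switched off only to
keep the example quiet about the non-Unicode code point:

    use strict;
    use warnings;
    no warnings 'utf8';
    use Encode qw(encode);

    my $beyond = chr(9_999_999);             # above U+10FFFF: not Unicode

    my $lax    = encode('utf8',  $beyond);   # Perl's liberal extension
    my $strict = encode('UTF-8', $beyond);   # substitution character instead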

Okay, if you insist: the "internal format" is utf8, not UTF-8. (When it's not
some other encoding.)

=head2 I lost track; what encoding is the internal format really?

It's good that you lost track, because you shouldn't depend on the internal
format being any specific encoding. But since you asked: by default, the
internal format is either ISO-8859-1 (latin-1), or utf8, depending on the
history of the string. On EBCDIC platforms, this may even be different.

Perl knows how it stored the string internally, and will use that knowledge
when you C<encode>. In other words: don't try to find out what the internal
encoding for a certain string is, but instead just encode it into the encoding
that you want.
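
To illustrate, here is a sketch with two strings that hold the same text but
have a different history:

    use strict;
    use warnings;
    use Encode qw(encode);

    my $narrow = "\xE9";         # stored as a single byte internally
    my $wide   = "\xE9";
    utf8::upgrade($wide);        # same text, now stored as UTF-8 internally

    # The internal storage differs, but the encoded result does not.
    my $same = encode('UTF-8', $narrow) eq encode('UTF-8', $wide);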

=head1 AUTHOR

Juerd Waalboer <#####@juerd.nl>

=head1 SEE ALSO

L<perlunicode>, L<perluniintro>, L<Encode>

perl.pod000064400000037616150344123500006224 0ustar00=head1 NAME

perl - The Perl 5 language interpreter

=head1 SYNOPSIS

B<perl>	S<[ B<-sTtuUWX> ]>
	S<[ B<-hv> ] [ B<-V>[:I<configvar>] ]>
	S<[ B<-cw> ] [ B<-d>[B<t>][:I<debugger>] ] [ B<-D>[I<number/list>] ]>
	S<[ B<-pna> ] [ B<-F>I<pattern> ] [ B<-l>[I<octal>] ] [ B<-0>[I<octal/hexadecimal>] ]>
	S<[ B<-I>I<dir> ] [ B<-m>[B<->]I<module> ] [ B<-M>[B<->]I<'module...'> ] [ B<-f> ]>
	S<[ B<-C [I<number/list>] >]>
	S<[ B<-S> ]>
	S<[ B<-x>[I<dir>] ]>
	S<[ B<-i>[I<extension>] ]>
	S<[ [B<-e>|B<-E>] I<'command'> ] [ B<--> ] [ I<programfile> ] [ I<argument> ]...>

For more information on these options, you can run C<perldoc perlrun>.

=head1 GETTING HELP

The F<perldoc> program gives you access to all the documentation that comes
with Perl.  You can get more documentation, tutorials and community support
online at L<http://www.perl.org/>.

If you're new to Perl, you should start by running C<perldoc perlintro>,
which is a general intro for beginners and provides some background to help
you navigate the rest of Perl's extensive documentation.  Run C<perldoc
perldoc> to learn more things you can do with F<perldoc>.

For ease of access, the Perl manual has been split up into several sections.

=begin buildtoc

# This section is parsed by Porting/pod_lib.pl for use by pod/buildtoc etc

flag =g  perluniprops perlmodlib perlapi perlintern
flag =go perltoc
flag =ro perlcn perljp perlko perltw
flag =   perlvms

path perlfaq.*               cpan/perlfaq/lib/
path perlglossary            cpan/perlfaq/lib/
path perlxs(?:tut|typemap)?  dist/ExtUtils-ParseXS/lib/
path perldoc                 cpan/Pod-Perldoc/

aux h2ph h2xs perlbug pl2pm pod2html pod2man splain xsubpp

=end buildtoc

=head2 Overview

    perl		Perl overview (this section)
    perlintro		Perl introduction for beginners
    perlrun		Perl execution and options
    perltoc		Perl documentation table of contents

=head2 Tutorials

    perlreftut		Perl references short introduction
    perldsc		Perl data structures intro
    perllol		Perl data structures: arrays of arrays

    perlrequick 	Perl regular expressions quick start
    perlretut		Perl regular expressions tutorial

    perlootut		Perl OO tutorial for beginners

    perlperf		Perl Performance and Optimization Techniques

    perlstyle		Perl style guide

    perlcheat		Perl cheat sheet
    perltrap		Perl traps for the unwary
    perldebtut		Perl debugging tutorial

    perlfaq		Perl frequently asked questions
      perlfaq1		General Questions About Perl
      perlfaq2		Obtaining and Learning about Perl
      perlfaq3		Programming Tools
      perlfaq4		Data Manipulation
      perlfaq5		Files and Formats
      perlfaq6		Regexes
      perlfaq7		Perl Language Issues
      perlfaq8		System Interaction
      perlfaq9		Networking

=head2 Reference Manual

    perlsyn		Perl syntax
    perldata		Perl data structures
    perlop		Perl operators and precedence
    perlsub		Perl subroutines
    perlfunc		Perl built-in functions
      perlopentut	Perl open() tutorial
      perlpacktut	Perl pack() and unpack() tutorial
    perlpod		Perl plain old documentation
    perlpodspec 	Perl plain old documentation format specification
    perlpodstyle	Perl POD style guide
    perldiag		Perl diagnostic messages
    perldeprecation     Perl deprecations
    perllexwarn 	Perl warnings and their control
    perldebug		Perl debugging
    perlvar		Perl predefined variables
    perlre		Perl regular expressions, the rest of the story
    perlrebackslash	Perl regular expression backslash sequences
    perlrecharclass	Perl regular expression character classes
    perlreref		Perl regular expressions quick reference
    perlref		Perl references, the rest of the story
    perlform		Perl formats
    perlobj		Perl objects
    perltie		Perl objects hidden behind simple variables
      perldbmfilter	Perl DBM filters

    perlipc		Perl interprocess communication
    perlfork		Perl fork() information
    perlnumber		Perl number semantics

    perlthrtut		Perl threads tutorial

    perlport		Perl portability guide
    perllocale		Perl locale support
    perluniintro	Perl Unicode introduction
    perlunicode 	Perl Unicode support
    perlunicook 	Perl Unicode cookbook
    perlunifaq		Perl Unicode FAQ
    perluniprops	Index of Unicode properties in Perl
    perlunitut		Perl Unicode tutorial
    perlebcdic		Considerations for running Perl on EBCDIC platforms

    perlsec		Perl security

    perlmod		Perl modules: how they work
    perlmodlib		Perl modules: how to write and use
    perlmodstyle	Perl modules: how to write modules with style
    perlmodinstall	Perl modules: how to install from CPAN
    perlnewmod		Perl modules: preparing a new module for distribution
    perlpragma		Perl modules: writing a user pragma

    perlutil		utilities packaged with the Perl distribution

    perlfilter		Perl source filters

    perldtrace		Perl's support for DTrace

    perlglossary	Perl Glossary

=head2 Internals and C Language Interface

    perlembed		Perl ways to embed perl in your C or C++ application
    perldebguts 	Perl debugging guts and tips
    perlxstut		Perl XS tutorial
    perlxs		Perl XS application programming interface
    perlxstypemap	Perl XS C/Perl type conversion tools
    perlclib		Internal replacements for standard C library functions
    perlguts		Perl internal functions for those doing extensions
    perlcall		Perl calling conventions from C
    perlmroapi		Perl method resolution plugin interface
    perlreapi		Perl regular expression plugin interface
    perlreguts		Perl regular expression engine internals

    perlapi		Perl API listing (autogenerated)
    perlintern		Perl internal functions (autogenerated)
    perliol		C API for Perl's implementation of IO in Layers
    perlapio		Perl internal IO abstraction interface

    perlhack		Perl hackers guide
    perlsource		Guide to the Perl source tree
    perlinterp		Overview of the Perl interpreter source and how it works
    perlhacktut 	Walk through the creation of a simple C code patch
    perlhacktips	Tips for Perl core C code hacking
    perlpolicy		Perl development policies
    perlgit		Using git with the Perl repository

=head2 Miscellaneous

    perlbook		Perl book information
    perlcommunity	Perl community information

    perldoc		Look up Perl documentation in Pod format

    perlhist		Perl history records
    perldelta		Perl changes since previous version
    perl5280delta	Perl changes in version 5.28.0
    perl5262delta	Perl changes in version 5.26.2
    perl5261delta	Perl changes in version 5.26.1
    perl5260delta	Perl changes in version 5.26.0
    perl5244delta	Perl changes in version 5.24.4
    perl5243delta	Perl changes in version 5.24.3
    perl5242delta	Perl changes in version 5.24.2
    perl5241delta	Perl changes in version 5.24.1
    perl5240delta	Perl changes in version 5.24.0
    perl5224delta	Perl changes in version 5.22.4
    perl5223delta	Perl changes in version 5.22.3
    perl5222delta	Perl changes in version 5.22.2
    perl5221delta	Perl changes in version 5.22.1
    perl5220delta	Perl changes in version 5.22.0
    perl5203delta	Perl changes in version 5.20.3
    perl5202delta	Perl changes in version 5.20.2
    perl5201delta	Perl changes in version 5.20.1
    perl5200delta	Perl changes in version 5.20.0
    perl5184delta	Perl changes in version 5.18.4
    perl5182delta	Perl changes in version 5.18.2
    perl5181delta	Perl changes in version 5.18.1
    perl5180delta	Perl changes in version 5.18.0
    perl5163delta	Perl changes in version 5.16.3
    perl5162delta	Perl changes in version 5.16.2
    perl5161delta	Perl changes in version 5.16.1
    perl5160delta	Perl changes in version 5.16.0
    perl5144delta	Perl changes in version 5.14.4
    perl5143delta	Perl changes in version 5.14.3
    perl5142delta	Perl changes in version 5.14.2
    perl5141delta	Perl changes in version 5.14.1
    perl5140delta	Perl changes in version 5.14.0
    perl5125delta	Perl changes in version 5.12.5
    perl5124delta	Perl changes in version 5.12.4
    perl5123delta	Perl changes in version 5.12.3
    perl5122delta	Perl changes in version 5.12.2
    perl5121delta	Perl changes in version 5.12.1
    perl5120delta	Perl changes in version 5.12.0
    perl5101delta	Perl changes in version 5.10.1
    perl5100delta	Perl changes in version 5.10.0
    perl589delta	Perl changes in version 5.8.9
    perl588delta	Perl changes in version 5.8.8
    perl587delta	Perl changes in version 5.8.7
    perl586delta	Perl changes in version 5.8.6
    perl585delta	Perl changes in version 5.8.5
    perl584delta	Perl changes in version 5.8.4
    perl583delta	Perl changes in version 5.8.3
    perl582delta	Perl changes in version 5.8.2
    perl581delta	Perl changes in version 5.8.1
    perl58delta 	Perl changes in version 5.8.0
    perl561delta	Perl changes in version 5.6.1
    perl56delta 	Perl changes in version 5.6
    perl5005delta	Perl changes in version 5.005
    perl5004delta	Perl changes in version 5.004

    perlexperiment	A listing of experimental features in Perl

    perlartistic	Perl Artistic License
    perlgpl		GNU General Public License

=head2 Language-Specific

=for buildtoc flag +r

    perlcn		Perl for Simplified Chinese (in EUC-CN)
    perljp		Perl for Japanese (in EUC-JP)
    perlko		Perl for Korean (in EUC-KR)
    perltw		Perl for Traditional Chinese (in Big5)

=head2 Platform-Specific

    perlaix		Perl notes for AIX
    perlamiga		Perl notes for AmigaOS
    perlandroid		Perl notes for Android
    perlbs2000		Perl notes for POSIX-BC BS2000
    perlce		Perl notes for WinCE
    perlcygwin		Perl notes for Cygwin
    perldos		Perl notes for DOS
    perlfreebsd 	Perl notes for FreeBSD
    perlhaiku		Perl notes for Haiku
    perlhpux		Perl notes for HP-UX
    perlhurd		Perl notes for Hurd
    perlirix		Perl notes for Irix
    perllinux		Perl notes for Linux
    perlmacos		Perl notes for Mac OS (Classic)
    perlmacosx		Perl notes for Mac OS X
    perlnetware 	Perl notes for NetWare
    perlopenbsd 	Perl notes for OpenBSD
    perlos2		Perl notes for OS/2
    perlos390		Perl notes for OS/390
    perlos400		Perl notes for OS/400
    perlplan9		Perl notes for Plan 9
    perlqnx		Perl notes for QNX
    perlriscos		Perl notes for RISC OS
    perlsolaris 	Perl notes for Solaris
    perlsymbian 	Perl notes for Symbian
    perlsynology 	Perl notes for Synology
    perltru64		Perl notes for Tru64
    perlvms		Perl notes for VMS
    perlvos		Perl notes for Stratus VOS
    perlwin32		Perl notes for Windows

=for buildtoc flag -r

=head2 Stubs for Deleted Documents

    perlboot
    perlbot
    perlrepository
    perltodo
    perltooc
    perltoot

=for buildtoc __END__

On a Unix-like system, these documentation files will usually also be
available as manpages for use with the F<man> program.

Some documentation is not available as man pages, so if a
cross-reference is not found by man, try it with L<perldoc>.  Perldoc can
also take you directly to documentation for functions (with the B<-f>
switch). See C<perldoc --help> (or C<perldoc perldoc> or C<man perldoc>)
for other helpful options L<perldoc> has to offer.

In general, if something strange has gone wrong with your program and you're
not sure where you should look for help, try making your code comply with
B<use strict> and B<use warnings>.  These will often point out exactly
where the trouble is.

=head1 DESCRIPTION

Perl officially stands for Practical Extraction and Report Language,
except when it doesn't.

Perl was originally a language optimized for scanning arbitrary
text files, extracting information from those text files, and printing
reports based on that information.  It quickly became a good language
for many system management tasks. Over the years, Perl has grown into
a general-purpose programming language. It's widely used for everything
from quick "one-liners" to full-scale application development.

The language is intended to be practical (easy to use, efficient,
complete) rather than beautiful (tiny, elegant, minimal).  It combines
(in the author's opinion, anyway) some of the best features of B<sed>,
B<awk>, and B<sh>, making it familiar to Unix users and easy to use for
whipping up quick solutions to annoying problems.  Its general-purpose
programming facilities support procedural, functional, and
object-oriented programming paradigms, making Perl a comfortable
language for the long haul on major projects, whatever your bent.

Perl's roots in text processing haven't been forgotten over the years.
It still boasts some of the most powerful regular expressions to be
found anywhere, and its support for Unicode text is world-class.  It
handles all kinds of structured text, too, through an extensive
collection of extensions.  Those libraries, collected in the CPAN,
provide ready-made solutions to an astounding array of problems.  When
they haven't set the standard themselves, they steal from the best
-- just like Perl itself.

=head1 AVAILABILITY

Perl is available for most operating systems, including virtually
all Unix-like platforms.  See L<perlport/"Supported Platforms">
for a listing.

=head1 ENVIRONMENT

See L<perlrun>.

=head1 AUTHOR

Larry Wall <larry@wall.org>, with the help of oodles of other folks.

If your Perl success stories and testimonials may be of help to others
who wish to advocate the use of Perl in their applications,
or if you wish to simply express your gratitude to Larry and the
Perl developers, please write to perl-thanks@perl.org .

=head1 FILES

 "@INC"			locations of perl libraries

"@INC" above is a reference to the built-in variable of the same name;
see L<perlvar> for more information.

=head1 SEE ALSO

 http://www.perl.org/       the Perl homepage
 http://www.perl.com/       Perl articles (O'Reilly)
 http://www.cpan.org/       the Comprehensive Perl Archive
 http://www.pm.org/         the Perl Mongers

=head1 DIAGNOSTICS

Using the C<use strict> pragma ensures that all variables are properly
declared and prevents other misuses of legacy Perl features.
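
As a small sketch, a misspelled variable name is caught at compile time. The
string C<eval> is used here only so that the example can show the message
instead of aborting:

    use strict;
    use warnings;

    my $total = 0;

    # "$totl" is a typo for "$total"; under strict this does not compile.
    my $ok = eval q{ $totl = $total + 1; 1 };
    print "strict says: $@" unless $ok;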

The C<use warnings> pragma produces some lovely diagnostics. One can
also use the B<-w> flag, but its use is normally discouraged, because
it gets applied to all executed Perl code, including that not under
your control.

See L<perldiag> for explanations of all Perl's diagnostics.  The C<use
diagnostics> pragma automatically turns Perl's normally terse warnings
and errors into these longer forms.

Compilation errors will tell you the line number of the error, with an
indication of the next token or token type that was to be examined.
(In a script passed to Perl via B<-e> switches, each
B<-e> is counted as one line.)

Setuid scripts have additional constraints that can produce error
messages such as "Insecure dependency".  See L<perlsec>.

Did we mention that you should definitely consider using the B<use warnings>
pragma?

=head1 BUGS

The behavior implied by the B<use warnings> pragma is not mandatory.

Perl is at the mercy of your machine's definitions of various
operations such as type casting, atof(), and floating-point
output with sprintf().

If your stdio requires a seek or eof between reads and writes on a
particular stream, so does Perl.  (This doesn't apply to sysread()
and syswrite().)

While none of the built-in data types have any arbitrary size limits
(apart from memory size), there are still a few arbitrary limits:  a
given variable name may not be longer than 251 characters.  Line numbers
displayed by diagnostics are internally stored as short integers,
so they are limited to a maximum of 65535 (higher numbers usually being
affected by wraparound).

You may mail your bug reports (be sure to include full configuration
information as output by the myconfig program in the perl source
tree, or by C<perl -V>) to perlbug@perl.org .  If you've succeeded
in compiling perl, the L<perlbug> script in the F<utils/> subdirectory
can be used to help mail in a bug report.

Perl actually stands for Pathologically Eclectic Rubbish Lister, but
don't tell anyone I said that.

=head1 NOTES

The Perl motto is "There's more than one way to do it."  Divining
how many more is left as an exercise to the reader.

The three principal virtues of a programmer are Laziness,
Impatience, and Hubris.  See the Camel Book for why.

perl5202delta.pod000064400000030335150344123500007536 0ustar00=encoding utf8

=head1 NAME

perl5202delta - what is new for perl v5.20.2

=head1 DESCRIPTION

This document describes differences between the 5.20.1 release and the 5.20.2
release.

If you are upgrading from an earlier release such as 5.20.0, first read
L<perl5201delta>, which describes differences between 5.20.0 and 5.20.1.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.20.1.  If any exist,
they are bugs, and we request that you submit a report.  See L</Reporting Bugs>
below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<attributes> has been upgraded from version 0.22 to 0.23.

The usage of C<memEQs> in the XS has been corrected.
L<[perl #122701]|https://rt.perl.org/Ticket/Display.html?id=122701>

=item *

L<Data::Dumper> has been upgraded from version 2.151 to 2.151_01.

Fixes CVE-2014-4330 by adding a configuration variable/option to limit
recursion when dumping deep data structures.

=item *

L<Errno> has been upgraded from version 1.20_03 to 1.20_05.

Warnings when building the XS on Windows with the Visual C++ compiler are now
avoided.

=item *

L<feature> has been upgraded from version 1.36 to 1.36_01.

The C<postderef> feature has now been documented.  This feature was actually
added in Perl 5.20.0 but was accidentally omitted from the feature
documentation until now.

=item *

L<IO::Socket> has been upgraded from version 1.37 to 1.38.

The limitations of the connected() method are now documented.
L<[perl #123096]|https://rt.perl.org/Ticket/Display.html?id=123096>

=item *

L<Module::CoreList> has been upgraded from version 5.020001 to 5.20150214.

The list of Perl versions covered has been updated.

=item *

PathTools has been upgraded from version 3.48 to 3.48_01.

A warning from the B<gcc> compiler is now avoided when building the XS.

=item *

L<PerlIO::scalar> has been upgraded from version 0.18 to 0.18_01.

Reading from a position well past the end of the scalar now correctly returns
end of file.
L<[perl #123443]|https://rt.perl.org/Ticket/Display.html?id=123443>

Seeking to a negative position still fails, but no longer leaves the file
position set to a negative location.

C<eof()> on a C<PerlIO::scalar> handle now properly returns true when the file
position is past the 2GB mark on 32-bit systems.

=item *

L<Storable> has been upgraded from version 2.49 to 2.49_01.

Minor grammatical change to the documentation only.

=item *

L<VMS::DCLsym> has been upgraded from version 1.05 to 1.05_01.

Minor formatting change to the documentation only.

=item *

L<VMS::Stdio> has been upgraded from version 2.4 to 2.41.

Minor formatting change to the documentation only.

=back

=head1 Documentation

=head2 New Documentation

=head3 L<perlunicook>

This document, by Tom Christiansen, provides examples of handling Unicode in
Perl.

=head2 Changes to Existing Documentation

=head3 L<perlexperiment>

=over 4

=item *

Added reference to subroutine signatures.  This feature was actually added in
Perl 5.20.0 but was accidentally omitted from the experimental feature
documentation until now.

=back

=head3 L<perlpolicy>

=over 4

=item *

The process whereby features may graduate from experimental status has now been
formally documented.

=back

=head3 L<perlsyn>

=over 4

=item *

An ambiguity in the documentation of the ellipsis statement has been corrected.
L<[perl #122661]|https://rt.perl.org/Ticket/Display.html?id=122661>

=back

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 Changes to Existing Diagnostics

=over 4

=item *

L<Bad symbol for scalar|perldiag/"Bad symbol for scalar"> is now documented.
This error is not new, but was not previously documented here.

=item *

L<Missing right brace on \N{}|perldiag/"Missing right brace on \N{}"> is now
documented.  This error is not new, but was not previously documented here.

=back

=head1 Testing

=over 4

=item *

The test script F<re/rt122747.t> has been added to verify that
L<perl #122747|https://rt.perl.org/Ticket/Display.html?id=122747> remains
fixed.

=back

=head1 Platform Support

=head2 Regained Platforms

IRIX and Tru64 platforms are working again.  (Some C<make test> failures
remain.)

=head1 Selected Bug Fixes

=over 4

=item *

AIX now sets the length in C<< getsockopt >> correctly.
L<[perl #120835]|https://rt.perl.org/Ticket/Display.html?id=120835>,
L<[cpan #91183]|https://rt.cpan.org/Ticket/Display.html?id=91183>,
L<[cpan #85570]|https://rt.cpan.org/Ticket/Display.html?id=85570>

=item *

In Perl 5.20.0, C<$^N> accidentally had the internal UTF8 flag turned off if
accessed from a code block within a regular expression, effectively
UTF8-encoding the value.  This has been fixed.
L<[perl #123135]|https://rt.perl.org/Ticket/Display.html?id=123135>

=item *

Various cases where the name of a sub is used (autoload, overloading, error
messages) used to crash for lexical subs, but have been fixed.

=item *

An assertion failure when parsing C<sort> with debugging enabled has been
fixed.
L<[perl #122771]|https://rt.perl.org/Ticket/Display.html?id=122771>

=item *

Loading UTF8 tables during a regular expression match could cause assertion
failures under debugging builds if the previous match used the very same
regular expression.
L<[perl #122747]|https://rt.perl.org/Ticket/Display.html?id=122747>

=item *

Due to a mistake in the string-copying logic, copying the value of a state
variable could instead steal the value and undefine the variable.  This bug,
introduced in Perl 5.20, would happen mostly for long strings (1250 chars or
more), but could happen for any strings under builds with copy-on-write
disabled.
L<[perl #123029]|https://rt.perl.org/Ticket/Display.html?id=123029>

=item *

Fixed a bug that could cause perl to execute an infinite loop during
compilation.
L<[perl #122995]|https://rt.perl.org/Ticket/Display.html?id=122995>

=item *

On Win32, restoring in a child pseudo-process a variable that was C<local()>ed
in a parent pseudo-process before the C<fork> happened caused memory corruption
and a crash in the child pseudo-process (and therefore OS process).
L<[perl #40565]|https://rt.perl.org/Ticket/Display.html?id=40565>

=item *

Tainted constants evaluated at compile time no longer cause unrelated
statements to become tainted.
L<[perl #122669]|https://rt.perl.org/Ticket/Display.html?id=122669>

=item *

Calling C<write> on a format with a C<^**> field could produce a panic in
sv_chop() if there were insufficient arguments or if the variable used to fill
the field was empty.
L<[perl #123245]|https://rt.perl.org/Ticket/Display.html?id=123245>

=item *

In Perl 5.20.0, C<sort CORE::fake> where 'fake' is anything other than a
keyword started chopping off the last 6 characters and treating the result as a
sort sub name.  The previous behaviour of treating "CORE::fake" as a sort sub
name has been restored.
L<[perl #123410]|https://rt.perl.org/Ticket/Display.html?id=123410>

=item *

A bug in regular expression patterns that could lead to segfaults and other
crashes has been fixed.  This occurred only in patterns compiled with C<"/i">,
while taking into account the current POSIX locale (this usually means they
have to be compiled within the scope of C<S<"use locale">>), and there must be
a string of at least 128 consecutive bytes to match.
L<[perl #123539]|https://rt.perl.org/Ticket/Display.html?id=123539>

=item *

C<qr/@array(?{block})/> no longer dies with "Bizarre copy of ARRAY".
L<[perl #123344]|https://rt.perl.org/Ticket/Display.html?id=123344>

=item *

C<gmtime> no longer crashes with not-a-number values.
L<[perl #123495]|https://rt.perl.org/Ticket/Display.html?id=123495>

=item *

Certain syntax errors in substitutions, such as C<< s/${<>{})// >>, would
crash, and had done so since Perl 5.10.  (In some cases the crash did not start
happening until Perl 5.16.)  The crash has, of course, been fixed.
L<[perl #123542]|https://rt.perl.org/Ticket/Display.html?id=123542>

=item *

A memory leak in some regular expressions, introduced in Perl 5.20.1, has been
fixed.
L<[perl #123198]|https://rt.perl.org/Ticket/Display.html?id=123198>

=item *

C<< formline("@...", "a"); >> would crash.  The C<FF_CHECKNL> case in
pp_formline() didn't set the pointer used to mark the chop position, which led
to the C<FF_MORE> case crashing with a segmentation fault.  This has been
fixed.
L<[perl #123538]|https://rt.perl.org/Ticket/Display.html?id=123538>
L<[perl #123622]|https://rt.perl.org/Ticket/Display.html?id=123622>

=item *

A possible buffer overrun and crash when parsing a literal pattern during
regular expression compilation has been fixed.
L<[perl #123604]|https://rt.perl.org/Ticket/Display.html?id=123604>

=back

=head1 Known Problems

=over 4

=item *

It is a known bug that lexical subroutines cannot be used as the C<SUBNAME>
argument to C<sort>.  This will be fixed in a future version of Perl.

=back

=head1 Errata From Previous Releases

=over 4

=item *

A regression has been fixed that was introduced in Perl 5.20.0 (fixed in Perl
5.20.1 as well as here) in which a UTF-8 encoded regular expression pattern
that contains a single ASCII lowercase letter does not match its uppercase
counterpart.
L<[perl #122655]|https://rt.perl.org/Ticket/Display.html?id=122655>

=back

=head1 Acknowledgements

Perl 5.20.2 represents approximately 5 months of development since Perl 5.20.1
and contains approximately 6,300 lines of changes across 170 files from 34
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 1,900 lines of changes to 80 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.20.2:

Aaron Crane, Abigail, Andreas Voegele, Andy Dougherty, Anthony Heading,
Aristotle Pagaltzis, Chris 'BinGOs' Williams, Craig A. Berry, Daniel Dragan,
Doug Bell, Ed J, Father Chrysostomos, Glenn D. Golden, H.Merijn Brand, Hugo van
der Sanden, James E Keenan, Jarkko Hietaniemi, Jim Cromie, Karen Etheridge,
Karl Williamson, kmx, Matthew Horsfall, Max Maischein, Peter Martini, Rafael
Garcia-Suarez, Ricardo Signes, Shlomi Fish, Slaven Rezic, Steffen Müller,
Steve Hay, Tadeusz Sośnierz, Tony Cook, Yves Orton, Ævar Arnfjörð
Bjarmason.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
https://rt.perl.org/ .  There may also be information at http://www.perl.org/ ,
the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlsub.pod000064400000216407150344123500006733 0ustar00=head1 NAME
X<subroutine> X<function>

perlsub - Perl subroutines

=head1 SYNOPSIS

To declare subroutines:
X<subroutine, declaration> X<sub>

    sub NAME;			  # A "forward" declaration.
    sub NAME(PROTO);		  #  ditto, but with prototypes
    sub NAME : ATTRS;		  #  with attributes
    sub NAME(PROTO) : ATTRS;	  #  with attributes and prototypes

    sub NAME BLOCK		  # A declaration and a definition.
    sub NAME(PROTO) BLOCK	  #  ditto, but with prototypes
    sub NAME(SIG) BLOCK           #  with a signature instead
    sub NAME : ATTRS BLOCK	  #  with attributes
    sub NAME(PROTO) : ATTRS BLOCK #  with prototypes and attributes
    sub NAME(SIG) : ATTRS BLOCK   #  with a signature and attributes

To define an anonymous subroutine at runtime:
X<subroutine, anonymous>

    $subref = sub BLOCK;		 # no proto
    $subref = sub (PROTO) BLOCK;	 # with proto
    $subref = sub (SIG) BLOCK;           # with signature
    $subref = sub : ATTRS BLOCK;	 # with attributes
    $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes
    $subref = sub (SIG) : ATTRS BLOCK;   # with signature and attributes

To import subroutines:
X<import>

    use MODULE qw(NAME1 NAME2 NAME3);

To call subroutines:
X<subroutine, call> X<call>

    NAME(LIST);	   # & is optional with parentheses.
    NAME LIST;	   # Parentheses optional if predeclared/imported.
    &NAME(LIST);   # Circumvent prototypes.
    &NAME;	   # Makes current @_ visible to called subroutine.

=head1 DESCRIPTION

Like many languages, Perl provides for user-defined subroutines.
These may be located anywhere in the main program, loaded in from
other files via the C<do>, C<require>, or C<use> keywords, or
generated on the fly using C<eval> or anonymous subroutines.
You can even call a function indirectly using a variable containing
its name or a CODE reference.
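
As a minimal sketch of such indirect calls (the C<greet> subroutine and
the variable names are made up for illustration), a CODE reference can be
invoked with the arrow notation, while calling through a name held in a
string is a symbolic reference, which C<use strict> forbids unless
C<strict 'refs'> is switched off for that scope:

    use strict;
    use warnings;

    sub greet { return "hello, $_[0]" }   # hypothetical example sub

    my $code_ref = \&greet;               # a CODE reference
    my $by_ref   = $code_ref->("world");

    my $name    = "greet";                # only the sub's name
    my $by_name = do {
        no strict 'refs';                 # symbolic call needs this
        &$name("world");
    };

    print "$by_ref\n$by_name\n";

The CODE reference form is usually the safer choice, because it keeps
working under C<use strict> without any local exemption.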

The Perl model for function call and return values is simple: all
functions are passed as parameters one single flat list of scalars, and
all functions likewise return to their caller one single flat list of
scalars.  Any arrays or hashes in these call and return lists will
collapse, losing their identities--but you may always use
pass-by-reference instead to avoid this.  Both call and return lists may
contain as many or as few scalar elements as you'd like.  (Often a
function without an explicit return statement is called a subroutine, but
there's really no difference from Perl's perspective.)
X<subroutine, parameter> X<parameter>
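
The flattening described above can be sketched as follows.  The helper
names C<count_args> and C<sizes> are assumptions chosen for this
illustration only:

    use strict;
    use warnings;

    sub count_args { return scalar @_ }   # hypothetical helper

    my @first  = (1, 2, 3);
    my @second = (4, 5);

    # Both arrays collapse into one five-element list.
    my $flat = count_args(@first, @second);

    # References keep the arrays distinct: two scalars arrive.
    sub sizes {
        my ($x, $y) = @_;
        return (scalar @$x, scalar @$y);
    }
    my @sizes = sizes(\@first, \@second);

    print "$flat @sizes\n";               # prints "5 3 2"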

Any arguments passed in show up in the array C<@_>.
(They may also show up in lexical variables introduced by a signature;
see L</Signatures> below.)  Therefore, if
you called a function with two arguments, those would be stored in
C<$_[0]> and C<$_[1]>.  The array C<@_> is a local array, but its
elements are aliases for the actual scalar parameters.  In particular,
if an element C<$_[0]> is updated, the corresponding argument is
updated (or an error occurs if it is not updatable).  If an argument
is an array or hash element which did not exist when the function
was called, that element is created only when (and if) it is modified
or a reference to it is taken.  (Some earlier versions of Perl
created the element whether or not the element was assigned to.)
Assigning to the whole array C<@_> removes that aliasing, and does
not update any arguments.
X<subroutine, argument> X<argument> X<@_>
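
A small sketch of these aliasing rules, using three throwaway subroutines
invented for the purpose: a hash element passed as an argument springs
into existence only once it is written through C<@_>, and assigning to
C<@_> as a whole leaves the caller's variable alone.

    use strict;
    use warnings;

    sub touch_nothing { return }          # never modifies $_[0]
    sub set_first     { $_[0] = "set" }   # writes through the alias
    sub replace_args  { @_ = ("new") }    # replaces the local @_ only

    my %h;
    touch_nothing($h{absent});
    my $after_read = exists $h{absent} ? 1 : 0;    # still absent

    set_first($h{absent});
    my $after_write = exists $h{absent} ? 1 : 0;   # now created

    my $v = "old";
    replace_args($v);                              # $v is unchanged

    print "$after_read $after_write $v\n";         # prints "0 1 old"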

A C<return> statement may be used to exit a subroutine, optionally
specifying the returned value, which will be evaluated in the
appropriate context (list, scalar, or void) depending on the context of
the subroutine call.  If you specify no return value, the subroutine
returns an empty list in list context, the undefined value in scalar
context, or nothing in void context.  If you return one or more
aggregates (arrays and hashes), these will be flattened together into
one large indistinguishable list.
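
The following sketch (with subroutine names made up for illustration)
shows a bare C<return> in list and scalar context, and an array and a
hash being flattened into one return list:

    use strict;
    use warnings;

    sub nothing { return }                # bare return

    my @list   = nothing();               # empty list
    my $scalar = nothing();               # undefined value

    sub array_and_hash {
        my @a = (1, 2);
        my %h = (k => 'v');
        return (@a, %h);                  # flattened together
    }
    my @flat = array_and_hash();          # four plain scalars

    printf "%d %s %d\n", scalar(@list),
        defined($scalar) ? "defined" : "undef", scalar(@flat);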

If no C<return> is found and if the last statement is an expression, its
value is returned.  If the last statement is a loop control structure
like a C<foreach> or a C<while>, the returned value is unspecified.  The
empty sub returns the empty list.
X<subroutine, return value> X<return value> X<return>

Aside from an experimental facility (see L</Signatures> below),
Perl does not have named formal parameters.  In practice all you do is
assign C<@_> to a C<my()> list of variables.  Variables that aren't
declared to be private are global variables.  For gory details
on creating private variables, see L</"Private Variables via my()">
and L</"Temporary Values via local()">.  To create protected
environments for a set of functions in a separate package (and
probably a separate file), see L<perlmod/"Packages">.
X<formal parameter> X<parameter, formal>

Example:

    sub max {
	my $max = shift(@_);
	foreach $foo (@_) {
	    $max = $foo if $max < $foo;
	}
	return $max;
    }
    $bestday = max($mon,$tue,$wed,$thu,$fri);

Example:

    # get a line, combining continuation lines
    #  that start with whitespace

    sub get_line {
	$thisline = $lookahead;  # global variables!
	LINE: while (defined($lookahead = <STDIN>)) {
	    if ($lookahead =~ /^[ \t]/) {
		$thisline .= $lookahead;
	    }
	    else {
		last LINE;
	    }
	}
	return $thisline;
    }

    $lookahead = <STDIN>;	# get first line
    while (defined($line = get_line())) {
	...
    }

Assigning to a list of private variables to name your arguments:

    sub maybeset {
	my($key, $value) = @_;
	$Foo{$key} = $value unless $Foo{$key};
    }

Because the assignment copies the values, this also has the effect
of turning call-by-reference into call-by-value.  Otherwise a
function is free to do in-place modifications of C<@_> and change
its caller's values.
X<call-by-reference> X<call-by-value>

    upcase_in($v1, $v2);  # this changes $v1 and $v2
    sub upcase_in {
	for (@_) { tr/a-z/A-Z/ }
    }

You aren't allowed to modify constants in this way, of course.  If an
argument were actually literal and you tried to change it, you'd take a
(presumably fatal) exception.   For example, this won't work:
X<call-by-reference> X<call-by-value>

    upcase_in("frederick");

It would be much safer if the C<upcase_in()> function
were written to return a copy of its parameters instead
of changing them in place:

    ($v3, $v4) = upcase($v1, $v2);  # this doesn't change $v1 and $v2
    sub upcase {
	return unless defined wantarray;  # void context, do nothing
	my @parms = @_;
	for (@parms) { tr/a-z/A-Z/ }
	return wantarray ? @parms : $parms[0];
    }

Notice how this (unprototyped) function doesn't care whether it was
passed real scalars or arrays.  Perl sees all arguments as one big,
long, flat parameter list in C<@_>.  This is one area where
Perl's simple argument-passing style shines.  The C<upcase()>
function would work perfectly well without changing the C<upcase()>
definition even if we fed it things like this:

    @newlist   = upcase(@list1, @list2);
    @newlist   = upcase( split /:/, $var );

Do not, however, be tempted to do this:

    (@a, @b)   = upcase(@list1, @list2);

Like the flattened incoming parameter list, the return list is also
flattened on return.  So all you have managed to do here is store
everything in C<@a> and make C<@b> empty.  See
L</Pass by Reference> for alternatives.

A subroutine may be called using an explicit C<&> prefix.  The
C<&> is optional in modern Perl, as are parentheses if the
subroutine has been predeclared.  The C<&> is I<not> optional
when just naming the subroutine, such as when it's used as
an argument to defined() or undef().  Nor is it optional when you
want to do an indirect subroutine call with a subroutine name or
reference using the C<&$subref()> or C<&{$subref}()> constructs,
although the C<< $subref->() >> notation solves that problem.
See L<perlref> for more about all that.
X<&>
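
To make the cases above concrete, here is a short sketch.  The C<double>
subroutine is a hypothetical example, not something Perl provides:

    use strict;
    use warnings;

    sub double { return 2 * $_[0] }       # hypothetical example sub

    # Naming the sub, rather than calling it, needs the & sigil.
    my $is_defined = defined(&double) ? 1 : 0;

    my $subref = \&double;
    my $r1 = &$subref(21);                # & is required here
    my $r2 = &{$subref}(21);              # the same, with braces
    my $r3 = $subref->(21);               # arrow form, no & needed

    print "$is_defined $r1 $r2 $r3\n";    # prints "1 42 42 42"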

Subroutines may be called recursively.  If a subroutine is called
using the C<&> form, the argument list is optional, and if omitted,
no C<@_> array is set up for the subroutine: the C<@_> array at the
time of the call is visible to the subroutine instead.  This is an
efficiency mechanism that new users may wish to avoid.
X<recursion>

    &foo(1,2,3);	# pass three arguments
    foo(1,2,3);		# the same

    foo();		# pass a null list
    &foo();		# the same

    &foo;		# foo() gets current args, like foo(@_) !!
    foo;		# like foo() IFF sub foo predeclared, else "foo"

Not only does the C<&> form make the argument list optional, it also
disables any prototype checking on arguments you do provide.  This
is partly for historical reasons, and partly for having a convenient way
to cheat if you know what you're doing.  See L</Prototypes> below.
X<&>
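
As a rough illustration of the prototype bypass, assume a subroutine
C<arg_count> (a name invented here) with a C<($)> prototype, which
imposes scalar context on its single argument:

    use strict;
    use warnings;

    # Hypothetical sub; the ($) prototype is honoured only for
    # calls made without the & prefix.
    sub arg_count ($) { return scalar @_ }

    my @items = ('a', 'b', 'c');

    my $with_proto    = arg_count(@items);    # one arg: the count 3
    my $without_proto = &arg_count(@items);   # three separate args

    print "$with_proto $without_proto\n";     # prints "1 3"

The same array therefore arrives as one argument or as three, depending
only on whether the call was written with C<&>.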

Since Perl 5.16.0, the C<__SUB__> token is available under C<use feature
'current_sub'> and C<use 5.16.0>.  It will evaluate to a reference to the
currently-running sub, which allows for recursive calls without knowing
your subroutine's name.

    use 5.16.0;
    my $factorial = sub {
      my ($x) = @_;
      return 1 if $x <= 1;
      return($x * __SUB__->( $x - 1 ) );
    };

The behavior of C<__SUB__> within a regex code block (such as C</(?{...})/>)
is subject to change.

Subroutines whose names are in all upper case are reserved to the Perl
core, as are modules whose names are in all lower case.  A subroutine in
all capitals is a loosely-held convention meaning it will be called
indirectly by the run-time system itself, usually due to a triggered event.
Subroutines whose names start with a left parenthesis are also reserved the
same way.  The following is a list of some subroutines that currently do 
special, pre-defined things.

=over

=item documented later in this document

C<AUTOLOAD>

=item documented in L<perlmod>

C<CLONE>, C<CLONE_SKIP>

=item documented in L<perlobj>

C<DESTROY>, C<DOES>

=item documented in L<perltie>

C<BINMODE>, C<CLEAR>, C<CLOSE>, C<DELETE>, C<DESTROY>, C<EOF>, C<EXISTS>, 
C<EXTEND>, C<FETCH>, C<FETCHSIZE>, C<FILENO>, C<FIRSTKEY>, C<GETC>, 
C<NEXTKEY>, C<OPEN>, C<POP>, C<PRINT>, C<PRINTF>, C<PUSH>, C<READ>, 
C<READLINE>, C<SCALAR>, C<SEEK>, C<SHIFT>, C<SPLICE>, C<STORE>, 
C<STORESIZE>, C<TELL>, C<TIEARRAY>, C<TIEHANDLE>, C<TIEHASH>, 
C<TIESCALAR>, C<UNSHIFT>, C<UNTIE>, C<WRITE>

=item documented in L<PerlIO::via>

C<BINMODE>, C<CLEARERR>, C<CLOSE>, C<EOF>, C<ERROR>, C<FDOPEN>, C<FILENO>, 
C<FILL>, C<FLUSH>, C<OPEN>, C<POPPED>, C<PUSHED>, C<READ>, C<SEEK>, 
C<SETLINEBUF>, C<SYSOPEN>, C<TELL>, C<UNREAD>, C<UTF8>, C<WRITE>

=item documented in L<perlfunc>

L<< C<import> | perlfunc/use >>, L<< C<unimport> | perlfunc/use >>,
L<< C<INC> | perlfunc/require >>

=item documented in L<UNIVERSAL>

C<VERSION>

=item documented in L<perldebguts>

C<DB::DB>, C<DB::sub>, C<DB::lsub>, C<DB::goto>, C<DB::postponed>

=item undocumented, used internally by the L<overload> feature

any starting with C<(>

=back

The C<BEGIN>, C<UNITCHECK>, C<CHECK>, C<INIT> and C<END> subroutines
are not so much subroutines as named special code blocks, of which you
can have more than one in a package, and which you can B<not> call
explicitly.  See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END">.
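
As a small sketch of having more than one such block in a package, the
following records the order in which two C<BEGIN> blocks and the ordinary
code run.  The C<@order> array is only a device for this illustration:

    use strict;
    use warnings;

    our @order;

    BEGIN { push @order, "first BEGIN" }
    push @order, "main code";
    BEGIN { push @order, "second BEGIN" }

    print join(", ", @order), "\n";

Both C<BEGIN> blocks run while the file is being compiled, so their
entries are recorded before the one from the ordinary statement.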

=head2 Signatures

B<WARNING>: Subroutine signatures are experimental.  The feature may be
modified or removed in future versions of Perl.

Perl has an experimental facility to allow a subroutine's formal
parameters to be introduced by special syntax, separate from the
procedural code of the subroutine body.  The formal parameter list
is known as a I<signature>.  The facility must be enabled first by a
pragmatic declaration, C<use feature 'signatures'>, and it will produce
a warning unless the "experimental::signatures" warnings category is
disabled.

The signature is part of a subroutine's body.  Normally the body of a
subroutine is simply a braced block of code.  When using a signature,
the signature is a parenthesised list that goes immediately after
the subroutine name (or, for anonymous subroutines, immediately after
the C<sub> keyword).  The signature declares lexical variables that are
in scope for the block.  When the subroutine is called, the signature
takes control first.  It populates the signature variables from the
list of arguments that were passed.  If the argument list doesn't meet
the requirements of the signature, then it will throw an exception.
When the signature processing is complete, control passes to the block.
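
A minimal sketch of enabling the feature and of the exception just
described might look like this.  The C<add> subroutine is a hypothetical
example, and the code assumes a perl recent enough to support
signatures:

    use strict;
    use warnings;
    use feature 'signatures';
    no warnings 'experimental::signatures';

    sub add ($left, $right) { return $left + $right }

    my $sum = add(2, 3);

    # A wrong argument count is reported as an exception,
    # which eval can trap.
    my $ok     = eval { add(1); 1 };
    my $failed = $ok ? 0 : 1;

    print "$sum $failed\n";               # prints "5 1"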

Positional parameters are handled by simply naming scalar variables in
the signature.  For example,

    sub foo ($left, $right) {
	return $left + $right;
    }

takes two positional parameters, which must be filled at runtime by
two arguments.  By default the parameters are mandatory, and it is
not permitted to pass more arguments than expected.  So the above is
equivalent to

    sub foo {
	die "Too many arguments for subroutine" unless @_ <= 2;
	die "Too few arguments for subroutine" unless @_ >= 2;
	my $left = $_[0];
	my $right = $_[1];
	return $left + $right;
    }

An argument can be ignored by omitting the main part of the name from
a parameter declaration, leaving just a bare C<$> sigil.  For example,

    sub foo ($first, $, $third) {
	return "first=$first, third=$third";
    }

Although the ignored argument doesn't go into a variable, it is still
mandatory for the caller to pass it.

A positional parameter is made optional by giving a default value,
separated from the parameter name by C<=>:

    sub foo ($left, $right = 0) {
	return $left + $right;
    }

The above subroutine may be called with either one or two arguments.
The default value expression is evaluated when the subroutine is called,
so it may provide different default values for different calls.  It is
only evaluated if the argument was actually omitted from the call.
For example,

    my $auto_id = 0;
    sub foo ($thing, $id = $auto_id++) {
	print "$thing has ID $id";
    }

automatically assigns distinct sequential IDs to things for which no
ID was supplied by the caller.  A default value expression may also
refer to parameters earlier in the signature, making the default for
one parameter vary according to the earlier parameters.  For example,

    sub foo ($first_name, $surname, $nickname = $first_name) {
	print "$first_name $surname is known as \"$nickname\"";
    }

An optional parameter can be nameless just like a mandatory parameter.
For example,

    sub foo ($thing, $ = 1) {
	print $thing;
    }

The parameter's default value will still be evaluated if the corresponding
argument isn't supplied, even though the value won't be stored anywhere.
This is in case evaluating it has important side effects.  However, it
will be evaluated in void context, so if it doesn't have side effects
and is not trivial it will generate a warning if the "void" warning
category is enabled.  If a nameless optional parameter's default value
is not important, it may be omitted just as the parameter's name was:

    sub foo ($thing, $=) {
	print $thing;
    }

Optional positional parameters must come after all mandatory positional
parameters.  (If there are no mandatory positional parameters then an
optional positional parameter can be the first thing in the signature.)
If there are multiple optional positional parameters and not enough
arguments are supplied to fill them all, they will be filled from left
to right.
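
For instance, in the following sketch (with an invented C<describe>
subroutine) a single extra argument fills C<$colour>, the leftmost
optional parameter, and C<$size> falls back to its default:

    use strict;
    use warnings;
    use feature 'signatures';
    no warnings 'experimental::signatures';

    sub describe ($name, $colour = "red", $size = "medium") {
        return "$name/$colour/$size";
    }

    my $none = describe("box");               # box/red/medium
    my $one  = describe("box", "blue");       # box/blue/medium
    my $both = describe("box", "blue", "xl"); # box/blue/xl

    print "$none $one $both\n";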

After positional parameters, additional arguments may be captured in a
slurpy parameter.  The simplest form of this is just an array variable:

    sub foo ($filter, @inputs) {
	print $filter->($_) foreach @inputs;
    }

With a slurpy parameter in the signature, there is no upper limit on how
many arguments may be passed.  A slurpy array parameter may be nameless
just like a positional parameter, in which case its only effect is to
turn off the argument limit that would otherwise apply:

    sub foo ($thing, @) {
	print $thing;
    }

A slurpy parameter may instead be a hash, in which case the arguments
available to it are interpreted as alternating keys and values.
There must be as many keys as values: if there is an odd argument then
an exception will be thrown.  Keys will be stringified, and if there are
duplicates then the later instance takes precedence over the earlier,
as with standard hash construction.

    sub foo ($filter, %inputs) {
	print $filter->($_, $inputs{$_}) foreach sort keys %inputs;
    }

A slurpy hash parameter may be nameless just like other kinds of
parameter.  It still insists that the number of arguments available to
it be even, even though they're not being put into a variable.

    sub foo ($thing, %) {
	print $thing;
    }

A slurpy parameter, either array or hash, must be the last thing in the
signature.  It may follow mandatory and optional positional parameters;
it may also be the only thing in the signature.  Slurpy parameters cannot
have default values: if no arguments are supplied for them then you get
an empty array or empty hash.
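
The slurpy rules above can be sketched as follows.  The subroutines
C<tally> and C<options> are assumptions made up for the example:

    use strict;
    use warnings;
    use feature 'signatures';
    no warnings 'experimental::signatures';

    sub tally ($label, @rest) {
        return "$label:" . scalar(@rest);
    }

    sub options ($label, %opt) {
        return join ",", map { $_ . "=" . $opt{$_} } sort keys %opt;
    }

    my $empty = tally("none");            # slurpy array is empty
    my $two   = tally("two", 1, 2);

    # The later duplicate key wins, as in ordinary hash construction.
    my $opts = options("x", b => 2, a => 1, b => 3);

    # An odd number of arguments for the hash is an exception.
    my $odd_ok = eval { options("x", "lonely"); 1 } ? 1 : 0;

    print "$empty $two $opts $odd_ok\n";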

A signature may be entirely empty, in which case all it does is check
that the caller passed no arguments:

    sub foo () {
	return 123;
    }

When using a signature, the arguments are still available in the special
array variable C<@_>, in addition to the lexical variables of the
signature.  There is a difference between the two ways of accessing the
arguments: C<@_> I<aliases> the arguments, but the signature variables
get I<copies> of the arguments.  So writing to a signature variable
only changes that variable, and has no effect on the caller's variables,
but writing to an element of C<@_> modifies whatever the caller used to
supply that argument.
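
The difference can be seen in a sketch like the one below, where C<bump>
is a hypothetical subroutine.  It follows the behaviour described in this
section; newer perl releases may warn about using C<@_> inside a
subroutine that has a signature.

    use strict;
    use warnings;
    use feature 'signatures';
    no warnings 'experimental::signatures';

    sub bump ($n) {
        $n++;             # changes only the signature's copy
        $_[0] += 10;      # changes the caller's variable
        return $n;
    }

    my $value  = 1;
    my $result = bump($value);

    print "$result $value\n";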

There is a potential syntactic ambiguity between signatures and prototypes
(see L</Prototypes>), because both start with an opening parenthesis and
both can appear in some of the same places, such as just after the name
in a subroutine declaration.  For historical reasons, when signatures
are not enabled, any opening parenthesis in such a context will trigger
very forgiving prototype parsing.  Most signatures will be interpreted
as prototypes in those circumstances, but won't be valid prototypes.
(A valid prototype cannot contain any alphabetic character.)  This will
lead to somewhat confusing error messages.

To avoid ambiguity, when signatures are enabled the special syntax
for prototypes is disabled.  There is no attempt to guess whether a
parenthesised group was intended to be a prototype or a signature.
To give a subroutine a prototype under these circumstances, use a
L<prototype attribute|attributes/Built-in Attributes>.  For example,

    sub foo :prototype($) { $_[0] }

It is entirely possible for a subroutine to have both a prototype and
a signature.  They do different jobs: the prototype affects compilation
of calls to the subroutine, and the signature puts argument values into
lexical variables at runtime.  You can therefore write

    sub foo ($left, $right) : prototype($$) {
	return $left + $right;
    }

The prototype attribute, and any other attributes, come after 
the signature.

=head2 Private Variables via my()
X<my> X<variable, lexical> X<lexical> X<lexical variable> X<scope, lexical>
X<lexical scope> X<attributes, my>

Synopsis:

    my $foo;	    	# declare $foo lexically local
    my (@wid, %get); 	# declare list of variables local
    my $foo = "flurp";	# declare $foo lexical, and init it
    my @oof = @bar;	# declare @oof lexical, and init it
    my $x : Foo = $y;	# similar, with an attribute applied

B<WARNING>: The use of attribute lists on C<my> declarations is still
evolving.  The current semantics and interface are subject to change.
See L<attributes> and L<Attribute::Handlers>.

The C<my> operator declares the listed variables to be lexically
confined to the enclosing block, conditional
(C<if>/C<unless>/C<elsif>/C<else>), loop
(C<for>/C<foreach>/C<while>/C<until>/C<continue>), subroutine, C<eval>,
or C<do>/C<require>/C<use>'d file.  If more than one value is listed, the
list must be placed in parentheses.  All listed elements must be
legal lvalues.  Only alphanumeric identifiers may be lexically
scoped--magical built-ins like C<$/> must currently be C<local>ized
with C<local> instead.

Unlike dynamic variables created by the C<local> operator, lexical
variables declared with C<my> are totally hidden from the outside
world, including any called subroutines.  This is true if it's the
same subroutine called from itself or elsewhere--every call gets
its own copy.
X<local>
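
The contrast with C<local> can be sketched as follows.  All names here
are invented for the illustration:

    use strict;
    use warnings;

    our $dynamic = "global";
    my  $lexical = "file";

    sub show { return "$dynamic/$lexical" }

    sub caller_sub {
        local $dynamic = "localized";   # seen by called subroutines
        my    $lexical = "private";     # hidden from show()
        return show();
    }

    my $inside  = caller_sub();         # "localized/file"
    my $outside = show();               # "global/file"

    print "$inside $outside\n";

Inside C<show()>, C<$lexical> always refers to the file-scoped variable
that was in scope where C<show()> was compiled, whereas C<$dynamic>
reflects whatever value is in effect at the time of the call.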

This doesn't mean that a C<my> variable declared in a statically
enclosing lexical scope would be invisible.  Only dynamic scopes
are cut off.   For example, the C<bumpx()> function below has access
to the lexical $x variable because both the C<my> and the C<sub>
occurred at the same scope, presumably file scope.

    my $x = 10;
    sub bumpx { $x++ } 

An C<eval()>, however, can see lexical variables of the scope it is
being evaluated in, so long as the names aren't hidden by declarations within
the C<eval()> itself.  See L<perlref>.
X<eval, scope of>
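
A two-line sketch of this rule, using string C<eval> and variable names
chosen only for illustration:

    use strict;
    use warnings;

    my $x = 10;

    my $seen   = eval '$x + 1';             # sees the enclosing $x
    my $hidden = eval 'my $x = 1; $x + 1';  # its own $x hides it

    print "$seen $hidden $x\n";             # prints "11 2 10"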

The parameter list to my() may be assigned to if desired, which allows you
to initialize your variables.  (If no initializer is given for a
particular variable, it is created with the undefined value.)  Commonly
this is used to name input parameters to a subroutine.  Examples:

    $arg = "fred";	  # "global" variable
    $n = cube_root(27);
    print "$arg thinks the root is $n\n";
 fred thinks the root is 3

    sub cube_root {
	my $arg = shift;  # name doesn't matter
	$arg **= 1/3;
	return $arg;
    }

The C<my> is simply a modifier on something you might assign to.  So when
you do assign to variables in its argument list, C<my> doesn't
change whether those variables are viewed as a scalar or an array.  So

    my ($foo) = <STDIN>;		# WRONG?
    my @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    my $foo = <STDIN>;

supplies a scalar context.  But the following declares only one variable:

    my $foo, $bar = 1;			# WRONG

That has the same effect as

    my $foo;
    $bar = 1;

The declared variable is not introduced (is not visible) until after
the current statement.  Thus,

    my $x = $x;

can be used to initialize a new $x with the value of the old $x, and
the expression

    my $x = 123 and $x == 123

is false unless the old $x happened to have the value C<123>.
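
A short sketch of the first case, assuming an outer lexical C<$x>
already exists (the C<@log> array is only there to record what each
scope saw):

    use strict;
    use warnings;

    my $x = 10;
    my @log;
    {
        my $x = $x + 1;         # the right-hand $x is still the outer one
        push @log, "inner $x";  # inner 11
    }
    push @log, "outer $x";      # outer 10
    print "$_\n" for @log;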

Lexical scopes of control structures are not bounded precisely by the
braces that delimit their controlled blocks; control expressions are
part of that scope, too.  Thus in the loop

    while (my $line = <>) {
        $line = lc $line;
    } continue {
        print $line;
    }

the scope of $line extends from its declaration throughout the rest of
the loop construct (including the C<continue> clause), but not beyond
it.  Similarly, in the conditional

    if ((my $answer = <STDIN>) =~ /^yes$/i) {
        user_agrees();
    } elsif ($answer =~ /^no$/i) {
        user_disagrees();
    } else {
	chomp $answer;
        die "'$answer' is neither 'yes' nor 'no'";
    }

the scope of $answer extends from its declaration through the rest
of that conditional, including any C<elsif> and C<else> clauses, 
but not beyond it.  See L<perlsyn/"Simple Statements"> for information
on the scope of variables in statements with modifiers.

The C<foreach> loop defaults to scoping its index variable dynamically
in the manner of C<local>.  However, if the index variable is
prefixed with the keyword C<my>, or if there is already a lexical
by that name in scope, then a new lexical is created instead.  Thus
in the loop
X<foreach> X<for>

    for my $i (1, 2, 3) {
        some_function();
    }

the scope of $i extends to the end of the loop, but not beyond it,
rendering the value of $i inaccessible within C<some_function()>.
X<foreach> X<for>

Some users may wish to encourage the use of lexically scoped variables.
As an aid to catching implicit uses of package variables,
which are always global, if you say

    use strict 'vars';

then any variable mentioned from there to the end of the enclosing
block must either refer to a lexical variable, be predeclared via
C<our> or C<use vars>, or else must be fully qualified with the package name.
A compilation error results otherwise.  An inner block may countermand
this with C<no strict 'vars'>.
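
To see the check in action without aborting the whole program, one
possibility is to compile the offending code with a string C<eval>,
which inherits the pragma from the enclosing scope.  This is only a
sketch; C<$total> and the misspelled C<$totl> are invented names:

    use strict;
    use warnings;

    # A declared lexical compiles and runs normally.
    my $ok = eval q{ my $total = 1; $total + 1 };

    # A misspelled, undeclared name fails to compile under strict vars.
    my $bad   = eval q{ $totl = 1; 1 };
    my $error = $@;

    print "ok is $ok\n";
    print "compile error: $error" unless defined $bad;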

A C<my> has both a compile-time and a run-time effect.  At compile
time, the compiler takes notice of it.  The principal usefulness
of this is to quiet C<use strict 'vars'>, but it is also essential
for generation of closures as detailed in L<perlref>.  Actual
initialization is delayed until run time, though, so it gets executed
at the appropriate time, such as each time through a loop, for
example.
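
The run-time half matters most when closures are created in a loop.
In this sketch (variable names are illustrative), each iteration
executes the C<my> again, so every anonymous subroutine captures its
own C<$square>:

    use strict;
    use warnings;

    my @subs;
    for my $n (1 .. 3) {
        my $square = $n * $n;               # a fresh $square each time
        push @subs, sub { return $square };
    }
    my @results = map { $_->() } @subs;     # 1, 4, 9
    print "@results\n";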

Variables declared with C<my> are not part of any package and are therefore
never fully qualified with the package name.  In particular, you're not
allowed to try to make a package variable (or other global) lexical:

    my $pack::var;	# ERROR!  Illegal syntax

In fact, dynamic variables (also known as package or global variables)
are still accessible using the fully qualified C<::> notation even while a
lexical of the same name is also visible:

    package main;
    local $x = 10;
    my    $x = 20;
    print "$x and $::x\n";

That will print out C<20> and C<10>.

You may declare C<my> variables at the outermost scope of a file
to hide any such identifiers from the world outside that file.  This
is similar in spirit to C's static variables when they are used at
the file level.  To do this with a subroutine requires the use of
a closure (an anonymous function that accesses enclosing lexicals).
If you want to create a private subroutine that cannot be called
from outside that block, you can declare a lexical variable containing
an anonymous sub reference:

    my $secret_version = '1.001-beta';
    my $secret_sub = sub { print $secret_version };
    &$secret_sub();

As long as the reference is never returned by any function within the
module, no outside module can see the subroutine, because its name is not in
any package's symbol table.  Remember that it's not I<REALLY> called
C<$some_pack::secret_version> or anything; it's just $secret_version,
unqualified and unqualifiable.
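
Here is the same idea inside a package, as a rough sketch.  The
package C<Counter> and its C<next_id> function are hypothetical; the
point is only that the helper held in C<$format> never appears in the
symbol table:

    use strict;
    use warnings;

    package Counter;

    my $count  = 0;
    my $format = sub { return sprintf "%03d", $_[0] };   # private helper

    sub next_id { return $format->(++$count) }           # public

    package main;

    my $first   = Counter::next_id();                    # "001"
    my $visible = Counter->can('format') ? 1 : 0;        # 0: no such sub
    print "first id $first, helper visible: $visible\n";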

This does not work with object methods, however; all object methods
have to be in the symbol table of some package to be found.  See
L<perlref/"Function Templates"> for something of a work-around to
this.

=head2 Persistent Private Variables
X<state> X<state variable> X<static> X<variable, persistent> X<variable, static> X<closure>

There are two ways to build persistent private variables in Perl 5.10.
First, you can simply use the C<state> feature.  Or, you can use closures,
if you want to stay compatible with releases older than 5.10.

=head3 Persistent variables via state()

Beginning with Perl 5.10.0, you can declare variables with the C<state>
keyword in place of C<my>.  For that to work, though, you must have
enabled that feature beforehand, either by using the C<feature> pragma, or
by using C<-E> on one-liners (see L<feature>).  Beginning with Perl 5.16,
the C<CORE::state> form does not require the
C<feature> pragma.

The C<state> keyword creates a lexical variable (following the same scoping
rules as C<my>) that persists from one subroutine call to the next.  If a
state variable resides inside an anonymous subroutine, then each copy of
the subroutine has its own copy of the state variable.  However, the value
of the state variable will still persist between calls to the same copy of
the anonymous subroutine.  (Don't forget that C<sub { ... }> creates a new
subroutine each time it is executed.)

For example, the following code maintains a private counter, incremented
each time the gimme_another() function is called:

    use feature 'state';
    sub gimme_another { state $x; return ++$x }

And this example uses anonymous subroutines to create separate counters:

    use feature 'state';
    sub create_counter {
	return sub { state $x; return ++$x }
    }

Also, since C<$x> is lexical, it can't be reached or modified by any Perl
code outside.

When combined with variable declaration, simple scalar assignment to C<state>
variables (as in C<state $x = 42>) is executed only the first time.  When such
statements are evaluated subsequent times, the assignment is ignored.  The
behavior of this sort of assignment to non-scalar variables is undefined.
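
The following sketch shows both points: an initializer that runs only
once, and per-copy state in anonymous subroutines.  The function names
are invented for the example, and the anonymous subroutine deliberately
closes over C<$step> so that each call to C<make_stepper> is certain to
produce a distinct copy:

    use strict;
    use warnings;
    use feature 'state';

    sub next_ticket {
        state $n = 100;         # assigned on the first call only
        return $n++;
    }
    my @tickets = map { next_ticket() } 1 .. 3;     # 100, 101, 102

    sub make_stepper {
        my ($step) = @_;
        return sub { state $total = 0; return $total += $step };
    }
    my ($by_one, $by_ten) = (make_stepper(1), make_stepper(10));
    my @totals = ($by_one->(), $by_one->(), $by_ten->());   # 1, 2, 10

    print "@tickets\n@totals\n";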

=head3 Persistent variables with closures

Just because a lexical variable is lexically (also called statically)
scoped to its enclosing block, C<eval>, or C<do> FILE, this doesn't mean that
within a function it works like a C static.  It normally works more
like a C auto, but with implicit garbage collection.  

Unlike local variables in C or C++, Perl's lexical variables don't
necessarily get recycled just because their scope has exited.
If something more permanent is still aware of the lexical, it will
stick around.  So long as something else references a lexical, that
lexical won't be freed--which is as it should be.  You wouldn't want
memory being freed until you were done using it, or kept around once you
were done.  Automatic garbage collection takes care of this for you.

This means that you can pass back or save away references to lexical
variables, whereas to return a pointer to a C auto is a grave error.
It also gives us a way to simulate C's function statics.  Here's a
mechanism for giving a function private variables with both lexical
scoping and a static lifetime.  If you do want to create something like
C's static variables, just enclose the whole function in an extra block,
and put the static variable outside the function but in the block.

    {
	my $secret_val = 0;
	sub gimme_another {
	    return ++$secret_val;
	}
    }
    # $secret_val now becomes unreachable by the outside
    # world, but retains its value between calls to gimme_another

If this function is being sourced in from a separate file
via C<require> or C<use>, then this is probably just fine.  If it's
all in the main program, you'll need to arrange for the C<my>
to be executed early, either by putting the whole block above
your main program, or more likely, placing merely a C<BEGIN>
code block around it to make sure it gets executed before your program
starts to run:

    BEGIN {
	my $secret_val = 0;
	sub gimme_another {
	    return ++$secret_val;
	}
    }

See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END"> about the
special triggered code blocks, C<BEGIN>, C<UNITCHECK>, C<CHECK>,
C<INIT> and C<END>.

If declared at the outermost scope (the file scope), then lexicals
work somewhat like C's file statics.  They are available to all
functions in that same file declared below them, but are inaccessible
from outside that file.  This strategy is sometimes used in modules
to create private variables that the whole module can see.
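
The enclosing-block technique also lets several functions share one
private variable.  A minimal sketch, with made-up names C<deposit> and
C<balance>, and with the block placed above the code that calls the
functions so that the C<my> has already run:

    use strict;
    use warnings;

    {
        my $balance = 0;                    # shared by the two subs only

        sub deposit { $balance += $_[0]; return }
        sub balance { return $balance }
    }

    deposit(5);
    deposit(7);
    print "balance is ", balance(), "\n";   # balance is 12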

=head2 Temporary Values via local()
X<local> X<scope, dynamic> X<dynamic scope> X<variable, local>
X<variable, temporary>

B<WARNING>: In general, you should be using C<my> instead of C<local>, because
it's faster and safer.  Exceptions to this include the global punctuation
variables, global filehandles and formats, and direct manipulation of the
Perl symbol table itself.  C<local> is mostly used when the current value
of a variable must be visible to called subroutines.

Synopsis:

    # localization of values

    local $foo;		       # make $foo dynamically local
    local (@wid, %get);	       # make list of variables local
    local $foo = "flurp";      # make $foo dynamic, and init it
    local @oof = @bar;	       # make @oof dynamic, and init it

    local $hash{key} = "val";  # sets a local value for this hash entry
    delete local $hash{key};   # delete this entry for the current block
    local ($cond ? $v1 : $v2); # several types of lvalues support
			       # localization

    # localization of symbols

    local *FH;		       # localize $FH, @FH, %FH, &FH  ...
    local *merlyn = *randal;   # now $merlyn is really $randal, plus
                               #     @merlyn is really @randal, etc
    local *merlyn = 'randal';  # SAME THING: promote 'randal' to *randal
    local *merlyn = \$randal;  # just alias $merlyn, not @merlyn etc

A C<local> modifies its listed variables to be "local" to the
enclosing block, C<eval>, or C<do FILE>--and to I<any subroutine
called from within that block>.  A C<local> just gives temporary
values to global (meaning package) variables.  It does I<not> create
a local variable.  This is known as dynamic scoping.  Lexical scoping
is done with C<my>, which works more like C's auto declarations.

Some types of lvalues can be localized as well: hash and array elements
and slices, conditionals (provided that their result is always
localizable), and symbolic references.  As for simple variables, this
creates new, dynamically scoped values.

If more than one variable or expression is given to C<local>, they must be
placed in parentheses.  This operator works
by saving the current values of those variables in its argument list on a
hidden stack and restoring them upon exiting the block, subroutine, or
eval.  This means that called subroutines can also reference the local
variable, but not the global one.  The argument list may be assigned to if
desired, which allows you to initialize your local variables.  (If no
initializer is given for a particular variable, it is created with an
undefined value.)

Because C<local> is a run-time operator, it gets executed each time
through a loop.  Consequently, it's more efficient to localize your
variables outside the loop.
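
A brief sketch of the save-and-restore behavior, using an invented
package variable C<$indent>.  The temporary value is visible to the
subroutine that C<nested> calls, and the old value returns as soon as
C<nested> exits:

    use strict;
    use warnings;

    our $indent = 0;

    sub report {
        my ($text) = @_;
        my $pad = " " x $indent;        # reads the current dynamic value
        return $pad . $text;
    }

    sub nested {
        local $indent = $indent + 2;    # undone when nested() returns
        return report("inner");
    }

    my @lines = (report("outer"), nested(), report("outer again"));
    print "$_\n" for @lines;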

=head3 Grammatical note on local()
X<local, context>

A C<local> is simply a modifier on an lvalue expression.  When you assign to
a C<local>ized variable, the C<local> doesn't change whether its list is viewed
as a scalar or an array.  So

    local($foo) = <STDIN>;
    local @FOO = <STDIN>;

both supply a list context to the right-hand side, while

    local $foo = <STDIN>;

supplies a scalar context.

=head3 Localization of special variables
X<local, special variable>

If you localize a special variable, you'll be giving a new value to it,
but its magic won't go away.  That means that all side-effects related
to this magic still work with the localized value.

This feature allows code like this to work:

    # Read the whole contents of FILE in $slurp
    { local $/ = undef; $slurp = <FILE>; }

Note, however, that this restricts localization of some values; for
example, the following statement dies, as of perl 5.10.0, with an error
I<Modification of a read-only value attempted>, because the $1 variable is
magical and read-only:

    local $1 = 2;

One exception is the default scalar variable: starting with perl 5.14
C<local($_)> will always strip all magic from $_, to make it possible
to safely reuse $_ in a subroutine.

B<WARNING>: Localization of tied arrays and hashes does not currently
work as described.
This will be fixed in a future release of Perl; in the meantime, avoid
code that relies on any particular behavior of localising tied arrays
or hashes (localising individual elements is still okay).
See L<perl58delta/"Localising Tied Arrays and Hashes Is Broken"> for more
details.
X<local, tie>

=head3 Localization of globs
X<local, glob> X<glob>

The construct

    local *name;

creates a whole new symbol table entry for the glob C<name> in the
current package.  That means that all variables in its glob slot ($name,
@name, %name, &name, and the C<name> filehandle) are dynamically reset.

This implies, among other things, that any magic those variables may
carry is locally lost.  In other words, saying C<local */>
will not have any effect on the internal value of the input record
separator.

=head3 Localization of elements of composite types
X<local, composite type element> X<local, array element> X<local, hash element>

It's also worth taking a moment to explain what happens when you
C<local>ize a member of a composite type (i.e. an array or hash element).
In this case, the element is C<local>ized I<by name>.  This means that
when the scope of the C<local()> ends, the saved value will be
restored to the hash element whose key was named in the C<local()>, or
the array element whose index was named in the C<local()>.  If that
element was deleted while the C<local()> was in effect (e.g. by a
C<delete()> from a hash or a C<shift()> of an array), it will spring
back into existence, possibly extending an array and filling in the
skipped elements with C<undef>.  For instance, if you say

    %hash = ( 'This' => 'is', 'a' => 'test' );
    @ary  = ( 0..5 );
    {
         local($ary[5]) = 6;
         local($hash{'a'}) = 'drill';
         while (my $e = pop(@ary)) {
             print "$e . . .\n";
             last unless $e > 3;
         }
         if (@ary) {
             $hash{'only a'} = 'test';
             delete $hash{'a'};
         }
    }
    print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
    print "The array has ",scalar(@ary)," elements: ",
          join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";

Perl will print

    6 . . .
    4 . . .
    3 . . .
    This is a test only a test.
    The array has 6 elements: 0, 1, 2, undef, undef, 5

The behavior of local() on non-existent members of composite
types is subject to change in the future.
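
A smaller sketch of the common case may help.  The C<%config> hash and
the function names here are hypothetical.  Only the element named in
the C<local> is restored; other changes made to the hash in the
meantime are permanent:

    use strict;
    use warnings;

    our %config = (debug => 0, level => 1);

    sub debug_on { return $config{debug} }

    sub with_debug {
        local $config{debug} = 1;   # restored by name on return
        delete $config{level};      # not localized: stays deleted
        return debug_on();
    }

    my $during = with_debug();      # 1
    my $after  = $config{debug};    # 0 again
    print "during $during, after $after\n";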

=head3 Localized deletion of elements of composite types
X<delete> X<local, composite type element> X<local, array element> X<local, hash element>

You can use the C<delete local $array[$idx]> and C<delete local $hash{key}>
constructs to delete a composite type entry for the current block and restore
it when it ends.  They return the array/hash value before the localization,
which means that they are respectively equivalent to

    do {
        my $val = $array[$idx];
        local  $array[$idx];
        delete $array[$idx];
        $val
    }

and

    do {
        my $val = $hash{key};
        local  $hash{key};
        delete $hash{key};
        $val
    }

except that for those the C<local> is
scoped to the C<do> block.  Slices are
also accepted.

    my %hash = (
     a => [ 7, 8, 9 ],
     b => 1,
    );

    {
     my $a = delete local $hash{a};
     # $a is [ 7, 8, 9 ]
     # %hash is (b => 1)

     {
      my @nums = delete local @$a[0, 2];
      # @nums is (7, 9)
      # $a is [ undef, 8 ]

      $a->[0] = 999; # will be erased when the scope ends
     }
     # $a is back to [ 7, 8, 9 ]

    }
    # %hash is back to its original state

=head2 Lvalue subroutines
X<lvalue> X<subroutine, lvalue>

It is possible to return a modifiable value from a subroutine.
To do this, you have to declare the subroutine to return an lvalue.

    my $val;
    sub canmod : lvalue {
	$val;  # or:  return $val;
    }
    sub nomod {
	$val;
    }

    canmod() = 5;   # assigns to $val
    nomod()  = 5;   # ERROR

The scalar/list context for the subroutine and for the right-hand
side of assignment is determined as if the subroutine call is replaced
by a scalar.  For example, consider:

    data(2,3) = get_data(3,4);

Both subroutines here are called in a scalar context, while in:

    (data(2,3)) = get_data(3,4);

and in:

    (data(2),data(3)) = get_data(3,4);

all the subroutines are called in a list context.

Lvalue subroutines are convenient, but you have to keep in mind that,
when used with objects, they may violate encapsulation.  A normal
mutator can check the supplied argument before setting the attribute
it is protecting; an lvalue subroutine cannot.  If you require any
special processing when storing and retrieving the values, consider
using the CPAN module Sentinel or something similar.
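
As a sketch of that trade-off, here is an lvalue subroutine that
exposes entries of a private hash.  The names C<%setting> and
C<setting> are invented.  Note that the last assignment succeeds even
though a checking mutator might have rejected it:

    use strict;
    use warnings;

    my %setting = (volume => 3);

    sub setting : lvalue { $setting{ $_[0] } }

    setting('volume')  = 7;         # assigns to $setting{volume}
    setting('balance') = -1;        # creates a new entry, unchecked

    print "$_ = $setting{$_}\n" for sort keys %setting;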

=head2 Lexical Subroutines
X<my sub> X<state sub> X<our sub> X<subroutine, lexical>

Beginning with Perl 5.18, you can declare a private subroutine with C<my>
or C<state>.  As with state variables, the C<state> keyword is only
available under C<use feature 'state'> or C<use 5.010> or higher.

Prior to Perl 5.26, lexical subroutines were deemed experimental and were
available only under the C<use feature 'lexical_subs'> pragma.  They also
produced a warning unless the "experimental::lexical_subs" warnings
category was disabled.

These subroutines are only visible within the block in which they are
declared, and only after that declaration:

    # Include these two lines if your code is intended to run under Perl
    # versions earlier than 5.26.
    no warnings "experimental::lexical_subs";
    use feature 'lexical_subs';

    foo();		# calls the package/global subroutine
    state sub foo {
	foo();		# also calls the package subroutine
    }
    foo();		# calls "state" sub
    my $ref = \&foo;	# take a reference to "state" sub

    my sub bar { ... }
    bar();		# calls "my" sub

To use a lexical subroutine from inside the subroutine itself, you must
predeclare it.  The C<sub foo {...}> subroutine definition syntax respects
any previous C<my sub;> or C<state sub;> declaration.

    my sub baz;		# predeclaration
    sub baz {		# define the "my" sub
	baz();		# recursive call
    }

=head3 C<state sub> vs C<my sub>

What is the difference between "state" subs and "my" subs?  Each time that
execution enters a block in which "my" subs are declared, a new copy of each
sub is created.  "State" subroutines persist from one execution of the
containing block to the next.

So, in general, "state" subroutines are faster.  But "my" subs are
necessary if you want to create closures:

    sub whatever {
	my $x = shift;
	my sub inner {
	    ... do something with $x ...
	}
	inner();
    }

In this example, a new C<$x> is created when C<whatever> is called, and
also a new C<inner>, which can see the new C<$x>.  A "state" sub will only
see the C<$x> from the first call to C<whatever>.
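
A runnable variant of that example, with invented names.  The two
pragma lines are only needed on Perl versions earlier than 5.26, and
the code assumes Perl 5.18 or later:

    use strict;
    use warnings;
    use feature 'lexical_subs';
    no warnings 'experimental::lexical_subs';

    sub scale_all {
        my ($factor, @values) = @_;
        my sub scale { return $_[0] * $factor }   # sees this call's $factor
        return map { scale($_) } @values;
    }

    my @doubled = scale_all(2, 1, 2, 3);    # 2, 4, 6
    my @tripled = scale_all(3, 1, 2, 3);    # 3, 6, 9
    print "@doubled\n@tripled\n";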

=head3 C<our> subroutines

Like C<our $variable>, C<our sub> creates a lexical alias to the package
subroutine of the same name.

The two main uses for this are to switch back to using the package sub
inside an inner scope:

    sub foo { ... }

    sub bar {
	my sub foo { ... }
	{
	    # need to use the outer foo here
	    our sub foo;
	    foo();
	}
    }

and to make a subroutine visible to other packages in the same scope:

    package MySneakyModule;

    our sub do_something { ... }

    sub do_something_with_caller {
	package DB;
	() = caller 1;		# sets @DB::args
	do_something(@args);	# uses MySneakyModule::do_something
    }

=head2 Passing Symbol Table Entries (typeglobs)
X<typeglob> X<*>

B<WARNING>: The mechanism described in this section was originally
the only way to simulate pass-by-reference in older versions of
Perl.  While it still works fine in modern versions, the new reference
mechanism is generally easier to work with.  See below.

Sometimes you don't want to pass the value of an array to a subroutine
but rather the name of it, so that the subroutine can modify the global
copy of it rather than working with a local copy.  In perl you can
refer to all objects of a particular name by prefixing the name
with a star: C<*foo>.  This is often known as a "typeglob", because the
star on the front can be thought of as a wildcard match for all the
funny prefix characters on variables and subroutines and such.

When evaluated, the typeglob produces a scalar value that represents
all the objects of that name, including any filehandle, format, or
subroutine.  When assigned to, it causes the name mentioned to refer to
whatever C<*> value was assigned to it.  Example:

    sub doubleary {
	local(*someary) = @_;
	foreach $elem (@someary) {
	    $elem *= 2;
	}
    }
    doubleary(*foo);
    doubleary(*bar);

Scalars are already passed by reference, so you can modify
scalar arguments without using this mechanism by referring explicitly
to C<$_[0]> etc.  You can modify all the elements of an array by passing
all the elements as scalars, but you have to use the C<*> mechanism (or
the equivalent reference mechanism) to C<push>, C<pop>, or change the size of
an array.  It will certainly be faster to pass the typeglob (or reference).
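
For comparison, here is a sketch of the equivalent reference
mechanism.  The function names are made up.  Unlike the typeglob
version, it also works on C<my> arrays:

    use strict;
    use warnings;

    sub double_in_place {
        my ($aref) = @_;
        $_ *= 2 for @$aref;         # modifies the caller's elements
        return;
    }

    sub append_total {
        my ($aref) = @_;
        my $total = 0;
        $total += $_ for @$aref;
        push @$aref, $total;        # changes the caller's array size
        return;
    }

    my @nums = (1, 2, 3);
    double_in_place(\@nums);        # @nums is now (2, 4, 6)
    append_total(\@nums);           # @nums is now (2, 4, 6, 12)
    print "@nums\n";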

Even if you don't want to modify an array, this mechanism is useful for
passing multiple arrays in a single LIST, because normally the LIST
mechanism will merge all the array values so that you can't extract out
the individual arrays.  For more on typeglobs, see
L<perldata/"Typeglobs and Filehandles">.

=head2 When to Still Use local()
X<local> X<variable, local>

Despite the existence of C<my>, there are three places where the
C<local> operator still shines.  In fact, in these three places, you
I<must> use C<local> instead of C<my>.

=over 4

=item 1.

You need to give a global variable a temporary value, especially $_.

The global variables, like C<@ARGV> or the punctuation variables, must be 
C<local>ized with C<local()>.  This block reads in F</etc/motd>, and splits
it up into chunks separated by lines of equal signs, which are placed
in C<@Fields>.

    {
	local @ARGV = ("/etc/motd");
        local $/ = undef;
        local $_ = <>;	
	@Fields = split /^\s*=+\s*$/;
    } 

In particular, it's important to C<local>ize $_ in any routine that assigns
to it.  Look out for implicit assignments in C<while> conditionals.

=item 2.

You need to create a local file or directory handle or a local function.

A function that needs a filehandle of its own must use
C<local()> on a complete typeglob.   This can be used to create new symbol
table entries:

    sub ioqueue {
        local  (*READER, *WRITER);    # not my!
        pipe    (READER,  WRITER)     or die "pipe: $!";
        return (*READER, *WRITER);
    }
    ($head, $tail) = ioqueue();

See the Symbol module for a way to create anonymous symbol table
entries.

Because assignment of a reference to a typeglob creates an alias, this
can be used to create what is effectively a local function, or at least,
a local alias.

    {
        local *grow = \&shrink; # only until this block exits
        grow();                # really calls shrink()
	move();		       # if move() grow()s, it shrink()s too
    }
    grow();		       # get the real grow() again

See L<perlref/"Function Templates"> for more about manipulating
functions by name in this way.

=item 3.

You want to temporarily change just one element of an array or hash.

You can C<local>ize just one element of an aggregate.  Usually this
is done on dynamics:

    {
	local $SIG{INT} = 'IGNORE';
	funct();			    # uninterruptible
    } 
    # interruptibility automatically restored here

But it also works on lexically declared aggregates.

=back

=head2 Pass by Reference
X<pass by reference> X<pass-by-reference> X<reference>

If you want to pass more than one array or hash into a function--or
return them from it--and have them maintain their integrity, then
you're going to have to use an explicit pass-by-reference.  Before you
do that, you need to understand references as detailed in L<perlref>.
This section may not make much sense to you otherwise.

Here are a few simple examples.  First, let's pass in several arrays
to a function and have it C<pop> all of them, returning a new list
of all their former last elements:

    @tailings = popmany ( \@a, \@b, \@c, \@d );

    sub popmany {
	my $aref;
	my @retlist;
	foreach $aref ( @_ ) {
	    push @retlist, pop @$aref;
	}
	return @retlist;
    }

Here's how you might write a function that returns a
list of keys occurring in all the hashes passed to it:

    @common = inter( \%foo, \%bar, \%joe );
    sub inter {
	my ($k, $href, %seen); # locals
	foreach $href (@_) {
	    while ( $k = each %$href ) {
		$seen{$k}++;
	    }
	}
	return grep { $seen{$_} == @_ } keys %seen;
    }

So far, we're using just the normal list return mechanism.
What happens if you want to pass or return a hash?  Well,
if you're using only one of them, or you don't mind them
concatenating, then the normal calling convention is ok, although
a little expensive.

Where people get into trouble is here:

    (@a, @b) = func(@c, @d);
or
    (%a, %b) = func(%c, %d);

That syntax simply won't work.  It sets just C<@a> or C<%a> and
clears the C<@b> or C<%b>.  Plus, the function wasn't passed
two separate arrays or hashes: it got one long list in C<@_>,
as always.
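
The flattening is easy to observe with a sketch like this one, where
C<count_args> is a throwaway helper that reports how many arguments
it received:

    use strict;
    use warnings;

    sub count_args { return scalar @_ }

    my @c = (1, 2);
    my @d = (3, 4, 5);

    my $flat = count_args(@c, @d);      # 5: one merged list
    my $refs = count_args(\@c, \@d);    # 2: two array references

    my (@a, @b);
    (@a, @b) = (@c, @d);                # @a takes everything
    print "flat $flat, refs $refs, a has ", scalar(@a),
          ", b has ", scalar(@b), "\n";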

If you can arrange for everyone to deal with this through references, it's
cleaner code, although not so nice to look at.  Here's a function that
takes two array references as arguments, returning the two array references
in order of how many elements the arrays have in them:

    ($aref, $bref) = func(\@c, \@d);
    print "@$aref has more than @$bref\n";
    sub func {
	my ($cref, $dref) = @_;
	if (@$cref > @$dref) {
	    return ($cref, $dref);
	} else {
	    return ($dref, $cref);
	}
    }

It turns out that you can actually do this also:

    (*a, *b) = func(\@c, \@d);
    print "@a has more than @b\n";
    sub func {
	local (*c, *d) = @_;
	if (@c > @d) {
	    return (\@c, \@d);
	} else {
	    return (\@d, \@c);
	}
    }

Here we're using the typeglobs to do symbol table aliasing.  It's
a tad subtle, though, and also won't work if you're using C<my>
variables, because only globals (even in disguise as C<local>s)
are in the symbol table.

If you're passing around filehandles, you could usually just use the bare
typeglob, like C<*STDOUT>, but typeglob references work, too.
For example:

    splutter(\*STDOUT);
    sub splutter {
	my $fh = shift;
	print $fh "her um well a hmmm\n";
    }

    $rec = get_rec(\*STDIN);
    sub get_rec {
	my $fh = shift;
	return scalar <$fh>;
    }

If you're planning on generating new filehandles, you could do this.
Note that it passes back just the bare *FH, not a reference to it.

    sub openit {
	my $path = shift;
	local *FH;
	return open (FH, $path) ? *FH : undef;
    }

=head2 Prototypes
X<prototype> X<subroutine, prototype>

Perl supports a very limited kind of compile-time argument checking
using function prototyping.  This can be declared in either the PROTO
section or with a L<prototype attribute|attributes/Built-in Attributes>.
If you declare either of

    sub mypush (\@@)
    sub mypush :prototype(\@@)

then C<mypush()> takes arguments exactly like C<push()> does.

If subroutine signatures are enabled (see L</Signatures>), then
the shorter PROTO syntax is unavailable, because it would clash with
signatures.  In that case, a prototype can only be declared in the form
of an attribute.

The function declaration must be visible at compile time.  The prototype
affects only interpretation of new-style calls to the function,
where new-style is defined as not using the C<&> character.  In
other words, if you call it like a built-in function, then it behaves
like a built-in function.  If you call it like an old-fashioned
subroutine, then it behaves like an old-fashioned subroutine.  It
naturally falls out from this rule that prototypes have no influence
on subroutine references like C<\&foo> or on indirect subroutine
calls like C<&{$subref}> or C<< $subref->() >>.
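
A sketch of both calling styles, using a possible definition of the
C<mypush> function declared above (the body is an assumption made for
this example).  The new-style call passes a reference to C<@stack>
automatically; the C<&> call ignores the prototype, so the reference
has to be passed by hand:

    use strict;
    use warnings;

    sub mypush (\@@) {
        my ($aref, @items) = @_;
        push @$aref, @items;
        return scalar @$aref;
    }

    my @stack = (1);
    mypush @stack, 2, 3;        # prototype applies: \@stack is passed
    &mypush(\@stack, 4);        # prototype bypassed: pass the ref yourself
    print "@stack\n";           # 1 2 3 4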

Method calls are not influenced by prototypes either, because the
function to be called is indeterminate at compile time, since
the exact code called depends on inheritance.

Because the intent of this feature is primarily to let you define
subroutines that work like built-in functions, here are prototypes
for some other functions that parse almost exactly like the
corresponding built-in.

   Declared as		   Called as

   sub mylink ($$)	   mylink $old, $new
   sub myvec ($$$)	   myvec $var, $offset, 1
   sub myindex ($$;$)	   myindex &getstring, "substr"
   sub mysyswrite ($$$;$)  mysyswrite $buf, 0, length($buf) - $off, $off
   sub myreverse (@)	   myreverse $a, $b, $c
   sub myjoin ($@)	   myjoin ":", $a, $b, $c
   sub mypop (\@)	   mypop @array
   sub mysplice (\@$$@)	   mysplice @array, 0, 2, @pushme
   sub mykeys (\[%@])	   mykeys %{$hashref}
   sub myopen (*;$)	   myopen HANDLE, $name
   sub mypipe (**)	   mypipe READHANDLE, WRITEHANDLE
   sub mygrep (&@)	   mygrep { /foo/ } $a, $b, $c
   sub myrand (;$)	   myrand 42
   sub mytime ()	   mytime

Any backslashed prototype character represents an actual argument
that must start with that character (optionally preceded by C<my>,
C<our> or C<local>), with the exception of C<$>, which will
accept any scalar lvalue expression, such as C<$foo = 7> or
C<< my_function()->[0] >>.  The value passed as part of C<@_> will be a
reference to the actual argument given in the subroutine call,
obtained by applying C<\> to that argument.

You can use the C<\[]> backslash group notation to specify more than one
allowed argument type.  For example:

    sub myref (\[$@%&*])

will allow calling myref() as

    myref $var
    myref @array
    myref %hash
    myref &sub
    myref *glob

and the first argument of myref() will be a reference to
a scalar, an array, a hash, a code, or a glob.
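
A reduced sketch of the same idea, accepting only scalars, arrays and
hashes.  The function name C<kind_of> is invented; it just reports
what kind of reference it received:

    use strict;
    use warnings;

    sub kind_of (\[$@%]) { return ref $_[0] }

    my ($s, @arr, %h);
    my @kinds = (kind_of($s), kind_of(@arr), kind_of(%h));
    print "@kinds\n";           # SCALAR ARRAY HASH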

Unbackslashed prototype characters have special meanings.  Any
unbackslashed C<@> or C<%> eats all remaining arguments, and forces
list context.  An argument represented by C<$> forces scalar context.  An
C<&> requires an anonymous subroutine, which, if passed as the first
argument, does not require the C<sub> keyword or a subsequent comma.

A C<*> allows the subroutine to accept a bareword, constant, scalar expression,
typeglob, or a reference to a typeglob in that slot.  The value will be
available to the subroutine either as a simple scalar, or (in the latter
two cases) as a reference to the typeglob.  If you wish to always convert
such arguments to a typeglob reference, use Symbol::qualify_to_ref() as
follows:

    use Symbol 'qualify_to_ref';

    sub foo (*) {
	my $fh = qualify_to_ref(shift, caller);
	...
    }

The C<+> prototype is a special alternative to C<$> that will act like
C<\[@%]> when given a literal array or hash variable, but will otherwise
force scalar context on the argument.  This is useful for functions which
should accept either a literal array or an array reference as the argument:

    sub mypush (+@) {
        my $aref = shift;
        die "Not an array or arrayref" unless ref $aref eq 'ARRAY';
        push @$aref, @_;
    }

When using the C<+> prototype, your function must check that the argument
is of an acceptable type.
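
A short usage sketch for a C<+> prototype follows.  It repeats the
C<mypush()> definition so that it runs on its own (Perl 5.14 or later is
assumed); the variable names are only illustrative:

    use strict;
    use warnings;

    sub mypush (+@) {
        my $aref = shift;
        die "Not an array or arrayref" unless ref $aref eq 'ARRAY';
        push @$aref, @_;
    }

    my @list = (1, 2);
    my $aref = \@list;

    mypush @list, 3, 4;     # literal array: passed as \@list
    mypush $aref, 5;        # scalar that already holds an array reference
    print "@list\n";        # 1 2 3 4 5

    # A plain string gets past the prototype, so the run-time check
    # inside mypush() is what rejects it.
    eval { mypush "oops", 6 };
    print "rejected\n" if $@;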

A semicolon (C<;>) separates mandatory arguments from optional arguments.
It is redundant before C<@> or C<%>, which gobble up everything else.

As the last character of a prototype, or just before a semicolon, a C<@>
or a C<%>, you can use C<_> in place of C<$>: if this argument is not
provided, C<$_> will be used instead.
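
For instance, a sketch of a one-argument function that falls back to
C<$_> might look like this (C<shout()> is a made-up name, and Perl 5.10
or later is assumed):

    use strict;
    use warnings;

    # Hypothetical helper: uppercase its argument, or $_ if none is given.
    sub shout (_) { return uc $_[0] }

    print shout("quiet"), "\n";     # QUIET

    for (qw(alpha beta)) {
        print shout(), "\n";        # ALPHA, then BETA
    }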

Note how the last three examples in the table above are treated
specially by the parser.  C<mygrep()> is parsed as a true list
operator, C<myrand()> is parsed as a true unary operator with unary
precedence the same as C<rand()>, and C<mytime()> is truly without
arguments, just like C<time()>.  That is, if you say

    mytime +2;

you'll get C<mytime() + 2>, not C<mytime(2)>, which is how it would be parsed
without a prototype.  If you want to force a unary function to have the
same precedence as a list operator, add C<;> to the end of the prototype:

    sub mygetprotobynumber($;);
    mygetprotobynumber $a > $b; # parsed as mygetprotobynumber($a > $b)

The interesting thing about C<&> is that you can generate new syntax with it,
provided it's in the initial position:
X<&>

    sub try (&@) {
	my($try,$catch) = @_;
	eval { &$try };
	if ($@) {
	    local $_ = $@;
	    &$catch;
	}
    }
    sub catch (&) { $_[0] }

    try {
	die "phooey";
    } catch {
	/phooey/ and print "unphooey\n";
    };

That prints C<"unphooey">.  (Yes, there are still unresolved
issues having to do with visibility of C<@_>.  I'm ignoring that
question for the moment.  (But note that if we make C<@_> lexically
scoped, those anonymous subroutines can act like closures... (Gee,
is this sounding a little Lispish?  (Never mind.))))

And here's a reimplementation of the Perl C<grep> operator:
X<grep>

    sub mygrep (&@) {
	my $code = shift;
	my @result;
	foreach $_ (@_) {
	    push(@result, $_) if &$code;
	}
	@result;
    }
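
Called the same way as the built-in, that reimplementation could be
exercised as follows.  The definition is repeated so the sketch is
self-contained; the data is arbitrary:

    use strict;
    use warnings;

    sub mygrep (&@) {
        my $code = shift;
        my @result;
        foreach $_ (@_) {
            push(@result, $_) if &$code;
        }
        @result;
    }

    # No "sub" keyword and no comma after the block, thanks to the
    # leading & in the prototype.
    my @even = mygrep { $_ % 2 == 0 } 1 .. 6;
    print "@even\n";    # 2 4 6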

Some folks would prefer full alphanumeric prototypes.  Alphanumerics have
been intentionally left out of prototypes for the express purpose of
someday in the future adding named, formal parameters.  The current
mechanism's main goal is to let module writers provide better diagnostics
for module users.  Larry feels the notation quite understandable to Perl
programmers, and that it will not intrude greatly upon the meat of the
module, nor make it harder to read.  The line noise is visually
encapsulated into a small pill that's easy to swallow.

If you try to use an alphanumeric sequence in a prototype you will
generate an optional warning - "Illegal character in prototype...".
Unfortunately earlier versions of Perl allowed the prototype to be
used as long as its prefix was a valid prototype.  The warning may be
upgraded to a fatal error in a future version of Perl once the
majority of offending code is fixed.

It's probably best to prototype new functions, not retrofit prototyping
into older ones.  That's because you must be especially careful about
silent impositions of differing list versus scalar contexts.  For example,
if you decide that a function should take just one parameter, like this:

    sub func ($) {
	my $n = shift;
	print "you gave me $n\n";
    }

and someone has been calling it with an array or expression
returning a list:

    func(@foo);
    func( $text =~ /\w+/g );

Then you've just supplied an automatic C<scalar> in front of their
argument, which can be more than a bit surprising.  The old C<@foo>
which used to hold one thing doesn't get passed in.  Instead,
C<func()> now gets passed in a C<1>; that is, the number of elements
in C<@foo>.  And the C<m//g> gets called in scalar context so instead of a
list of words it returns a boolean result and advances C<pos($text)>.  Ouch!
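
The effect can be seen in a small sketch.  Here C<func()> returns its
message instead of printing it, purely to make the result easy to
inspect; the data is made up:

    use strict;
    use warnings;

    sub func ($) {
        my $n = shift;
        return "you gave me $n";
    }

    my @foo  = ('apple');
    my $text = "hello world";

    # The ($) prototype puts each argument in scalar context.
    print func(@foo), "\n";                 # you gave me 1
    print func($text =~ /\w+/g), "\n";      # you gave me 1
    print pos($text), "\n";                 # 5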

If a sub has both a PROTO and a BLOCK, the prototype is not applied
until after the BLOCK is completely defined.  This means that a recursive
function with a prototype has to be predeclared for the prototype to take
effect, like so:

	sub foo($$);
	sub foo($$) {
		foo 1, 2;
	}

This is all very powerful, of course, and should be used only in moderation
to make the world a better place.

=head2 Constant Functions
X<constant>

Functions with a prototype of C<()> are potential candidates for
inlining.  If the result after optimization and constant folding
is either a constant or a lexically-scoped scalar which has no other
references, then it will be used in place of function calls made
without C<&>.  Calls made using C<&> are never inlined.  (See
F<constant.pm> for an easy way to declare most constants.)
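
As a brief illustration of that easier route, the C<constant> pragma
creates subs with an empty prototype for you.  The names below are
arbitrary examples:

    use strict;
    use warnings;

    use constant PI => 4 * atan2(1, 1);
    use constant {
        ST_DEV => 0,
        ST_INO => 1,
    };

    printf "%.5f\n", PI;        # 3.14159
    my $next = ST_INO + 1;      # parsed as ST_INO() + 1
    print "$next\n";            # 2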

The following functions would all be inlined:

    sub pi ()		{ 3.14159 }		# Not exact, but close.
    sub PI ()		{ 4 * atan2 1, 1 }	# As good as it gets,
						# and it's inlined, too!
    sub ST_DEV ()	{ 0 }
    sub ST_INO ()	{ 1 }

    sub FLAG_FOO ()	{ 1 << 8 }
    sub FLAG_BAR ()	{ 1 << 9 }
    sub FLAG_MASK ()	{ FLAG_FOO | FLAG_BAR }

    sub OPT_BAZ ()	{ not (0x1B58 & FLAG_MASK) }

    sub N () { int(OPT_BAZ) / 3 }

    sub FOO_SET () { 1 if FLAG_MASK & FLAG_FOO }
    sub FOO_SET2 () { if (FLAG_MASK & FLAG_FOO) { 1 } }

(Be aware that the last example was not always inlined in Perl 5.20 and
earlier, which did not behave consistently with subroutines containing
inner scopes.)  You can countermand inlining by using an explicit
C<return>:

    sub baz_val () {
	if (OPT_BAZ) {
	    return 23;
	}
	else {
	    return 42;
	}
    }
    sub bonk_val () { return 12345 }

As alluded to earlier you can also declare inlined subs dynamically at
BEGIN time if their body consists of a lexically-scoped scalar which
has no other references.  Only the first example here will be inlined:

    BEGIN {
        my $var = 1;
        no strict 'refs';
        *INLINED = sub () { $var };
    }

    BEGIN {
        my $var = 1;
        my $ref = \$var;
        no strict 'refs';
        *NOT_INLINED = sub () { $var };
    }

A not so obvious caveat with this (see [RT #79908]) is that the
variable will be immediately inlined, and will stop behaving like a
normal lexical variable, e.g. this will print C<79907>, not C<79908>:

    BEGIN {
        my $x = 79907;
        *RT_79908 = sub () { $x };
        $x++;
    }
    print RT_79908(); # prints 79907

As of Perl 5.22, this buggy behavior, while preserved for backward
compatibility, is detected and emits a deprecation warning.  If you want
the subroutine to be inlined (with no warning), make sure the variable is
not used in a context where it could be modified aside from where it is
declared.

    # Fine, no warning
    BEGIN {
        my $x = 54321;
        *INLINED = sub () { $x };
    }
    # Warns.  Future Perl versions will stop inlining it.
    BEGIN {
        my $x;
        $x = 54321;
        *ALSO_INLINED = sub () { $x };
    }

Perl 5.22 also introduces the experimental "const" attribute as an
alternative.  (Disable the "experimental::const_attr" warnings if you want
to use it.)  When applied to an anonymous subroutine, it forces the sub to
be called when the C<sub> expression is evaluated.  The return value is
captured and turned into a constant subroutine:

    my $x = 54321;
    *INLINED = sub : const { $x };
    $x++;

The return value of C<INLINED> in this example will always be 54321,
regardless of later modifications to $x.  You can also put any arbitrary
code inside the sub, and it will be executed immediately and its return
value captured the same way.

If you really want a subroutine with a C<()> prototype that returns a
lexical variable you can easily force it to not be inlined by adding
an explicit C<return>:

    BEGIN {
        my $x = 79907;
        *RT_79908 = sub () { return $x };
        $x++;
    }
    print RT_79908(); # prints 79908

The easiest way to tell if a subroutine was inlined is by using
L<B::Deparse>.  Consider this example of two subroutines returning
C<1>, one with a C<()> prototype causing it to be inlined, and one
without (with deparse output truncated for clarity):

 $ perl -MO=Deparse -le 'sub ONE { 1 } if (ONE) { print ONE if ONE }'
 sub ONE {
     1;
 }
 if (ONE ) {
     print ONE() if ONE ;
 }
 $ perl -MO=Deparse -le 'sub ONE () { 1 } if (ONE) { print ONE if ONE }'
 sub ONE () { 1 }
 do {
     print 1
 };

If you redefine a subroutine that was eligible for inlining, you'll
get a warning by default.  You can use this warning to tell whether or
not a particular subroutine is considered inlinable, since it's
different than the warning for overriding non-inlined subroutines:

    $ perl -e 'sub one () {1} sub one () {2}'
    Constant subroutine one redefined at -e line 1.
    $ perl -we 'sub one {1} sub one {2}'
    Subroutine one redefined at -e line 1.

The warning is considered severe enough not to be affected by the
B<-w> switch (or its absence) because previously compiled invocations
of the function will still be using the old value of the function.  If
you need to be able to redefine the subroutine, you need to ensure
that it isn't inlined, either by dropping the C<()> prototype (which
changes calling semantics, so beware) or by thwarting the inlining
mechanism in some other way, e.g. by adding an explicit C<return>, as
mentioned above:

    sub not_inlined () { return 23 }

=head2 Overriding Built-in Functions
X<built-in> X<override> X<CORE> X<CORE::GLOBAL>

Many built-in functions may be overridden, though this should be tried
only occasionally and for good reason.  Typically this might be
done by a package attempting to emulate missing built-in functionality
on a non-Unix system.

Overriding may be done only by importing the name from a module at
compile time--ordinary predeclaration isn't good enough.  However, the
C<use subs> pragma lets you, in effect, predeclare subs
via the import syntax, and these names may then override built-in ones:

    use subs 'chdir', 'chroot', 'chmod', 'chown';
    chdir $somewhere;
    sub chdir { ... }

To unambiguously refer to the built-in form, precede the
built-in name with the special package qualifier C<CORE::>.  For example,
saying C<CORE::open()> always refers to the built-in C<open()>, even
if the current package has imported some other subroutine called
C<&open()> from elsewhere.  Even though it looks like a regular
function call, it isn't: the CORE:: prefix in that case is part of Perl's
syntax, and works for any keyword, regardless of what is in the CORE
package.  Taking a reference to it, that is, C<\&CORE::open>, only works
for some keywords.  See L<CORE>.
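
A minimal sketch of an override living alongside the C<CORE::> form is
shown below.  It overrides C<hex> only because that built-in is harmless
to experiment with; the replacement body is an illustrative assumption:

    use strict;
    use warnings;

    use subs 'hex';     # predeclare, so our hex() overrides the built-in

    sub hex { return "mine:" . CORE::hex($_[0]) }

    print hex("ff"), "\n";          # mine:255
    print CORE::hex("ff"), "\n";    # 255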

Library modules should not in general export built-in names like C<open>
or C<chdir> as part of their default C<@EXPORT> list, because these may
sneak into someone else's namespace and change the semantics unexpectedly.
Instead, if the module adds that name to C<@EXPORT_OK>, then it's
possible for a user to import the name explicitly, but not implicitly.
That is, they could say

    use Module 'open';

and it would import the C<open> override.  But if they said

    use Module;

they would get the default imports without overrides.

The foregoing mechanism for overriding built-ins is restricted, quite
deliberately, to the package that requests the import.  There is a second
method that is sometimes applicable when you wish to override a built-in
everywhere, without regard to namespace boundaries.  This is achieved by
importing a sub into the special namespace C<CORE::GLOBAL::>.  Here is an
example that quite brazenly replaces the C<glob> operator with something
that understands regular expressions.

    package REGlob;
    require Exporter;
    @ISA = 'Exporter';
    @EXPORT_OK = 'glob';

    sub import {
	my $pkg = shift;
	return unless @_;
	my $sym = shift;
	my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
	$pkg->export($where, $sym, @_);
    }

    sub glob {
	my $pat = shift;
	my @got;
	if (opendir my $d, '.') { 
	    @got = grep /$pat/, readdir $d; 
	    closedir $d;   
	}
	return @got;
    }
    1;

And here's how it could be (ab)used:

    #use REGlob 'GLOBAL_glob';	    # override glob() in ALL namespaces
    package Foo;
    use REGlob 'glob';		    # override glob() in Foo:: only
    print for <^[a-z_]+\.pm\$>;	    # show all pragmatic modules

The initial comment shows a contrived, even dangerous example.
By overriding C<glob> globally, you would be forcing the new (and
subversive) behavior for the C<glob> operator for I<every> namespace,
without the complete cognizance or cooperation of the modules that own
those namespaces.  Naturally, this should be done with extreme caution--if
it must be done at all.

The C<REGlob> example above does not implement all the support needed to
cleanly override perl's C<glob> operator.  The built-in C<glob> has
different behaviors depending on whether it appears in a scalar or list
context, but our C<REGlob> doesn't.  Indeed, many perl built-ins have such
context-sensitive behaviors, and these must be adequately supported by
a properly written override.  For a fully functional example of overriding
C<glob>, study the implementation of C<File::DosGlob> in the standard
library.

When you override a built-in, your replacement should be consistent (if
possible) with the built-in native syntax.  You can achieve this by using
a suitable prototype.  To get the prototype of an overridable built-in,
use the C<prototype> function with an argument of C<"CORE::builtin_name">
(see L<perlfunc/prototype>).
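
A quick way to look up those prototypes is sketched below.  The exact
strings printed depend on the version of perl, so no output is shown;
the choice of built-ins is arbitrary:

    use strict;
    use warnings;

    for my $name (qw(push open rename system)) {
        my $proto = prototype("CORE::$name");
        printf "%-8s %s\n", $name,
            defined $proto ? $proto : '(not expressible as a prototype)';
    }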

Note however that some built-ins can't have their syntax expressed by a
prototype (such as C<system> or C<chomp>).  If you override them you won't
be able to fully mimic their original syntax.

The built-ins C<do>, C<require> and C<glob> can also be overridden, but due
to special magic, their original syntax is preserved, and you don't have
to define a prototype for their replacements.  (You can't override the
C<do BLOCK> syntax, though).

C<require> has special additional dark magic: if you invoke your
C<require> replacement as C<require Foo::Bar>, it will actually receive
the argument C<"Foo/Bar.pm"> in @_.  See L<perlfunc/require>.

And, as you'll have noticed from the previous example, if you override
C<glob>, the C<< <*> >> glob operator is overridden as well.

In a similar fashion, overriding the C<readline> function also overrides
the equivalent I/O operator C<< <FILEHANDLE> >>, and overriding
C<readpipe> also overrides the operators C<``> and C<qx//>.

Finally, some built-ins (e.g. C<exists> or C<grep>) can't be overridden.

=head2 Autoloading
X<autoloading> X<AUTOLOAD>

If you call a subroutine that is undefined, you would ordinarily
get an immediate, fatal error complaining that the subroutine doesn't
exist.  (Likewise for subroutines being used as methods, when the
method doesn't exist in any base class of the class's package.)
However, if an C<AUTOLOAD> subroutine is defined in the package or
packages used to locate the original subroutine, then that
C<AUTOLOAD> subroutine is called with the arguments that would have
been passed to the original subroutine.  The fully qualified name
of the original subroutine magically appears in the global $AUTOLOAD
variable of the same package as the C<AUTOLOAD> routine.  The name
is not passed as an ordinary argument because, er, well, just
because, that's why.  (As an exception, a method call to a nonexistent
C<import> or C<unimport> method is just skipped instead.  Also, if
the AUTOLOAD subroutine is an XSUB, there are other ways to retrieve the
subroutine name.  See L<perlguts/Autoloading with XSUBs> for details.)


Many C<AUTOLOAD> routines load in a definition for the requested
subroutine using eval(), then execute that subroutine using a special
form of goto() that erases the stack frame of the C<AUTOLOAD> routine
without a trace.  (See the source to the standard module documented
in L<AutoLoader>, for example.)  But an C<AUTOLOAD> routine can
also just emulate the routine and never define it.   For example,
let's pretend that a function that wasn't defined should just invoke
C<system> with those arguments.  All you'd do is:

    sub AUTOLOAD {
	my $program = $AUTOLOAD;
	$program =~ s/.*:://;
	system($program, @_);
    }
    date();
    who('am', 'i');
    ls('-l');

In fact, if you predeclare functions you want to call that way, you don't
even need parentheses:

    use subs qw(date who ls);
    date;
    who "am", "i";
    ls '-l';

A more complete example of this is the Shell module on CPAN, which
can treat undefined subroutine calls as calls to external programs.
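
The define-then-C<goto> approach described earlier can be sketched as
follows.  This version installs a closure rather than using eval(), and
the C<Counter> class with its C<hits> and C<misses> fields is entirely
hypothetical:

    use strict;
    use warnings;

    package Counter;

    our $AUTOLOAD;

    sub new { my $class = shift; return bless { hits => 0, misses => 0 }, $class }

    sub AUTOLOAD {
        my $name = $AUTOLOAD;
        $name =~ s/.*:://;
        return if $name eq 'DESTROY';       # never autoload destructors
        die "No such method: $name\n"
            unless ref $_[0] && exists $_[0]{$name};

        no strict 'refs';
        # Define the accessor once, so later calls bypass AUTOLOAD.
        *{$AUTOLOAD} = sub {
            my $self = shift;
            $self->{$name} = shift if @_;
            return $self->{$name};
        };
        goto &$AUTOLOAD;                    # replace this frame with the new sub
    }

    package main;

    my $c = Counter->new;
    $c->hits(3);
    print $c->hits, "\n";                                   # 3
    print Counter->can('hits') ? "defined\n" : "missing\n"; # defined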

Mechanisms are available to help module writers split their modules
into autoloadable files.  See the standard AutoLoader module
described in L<AutoLoader> and in L<AutoSplit>, the standard
SelfLoader module in L<SelfLoader>, and the document on adding C
functions to Perl code in L<perlxs>.

=head2 Subroutine Attributes
X<attribute> X<subroutine, attribute> X<attrs>

A subroutine declaration or definition may have a list of attributes
associated with it.  If such an attribute list is present, it is
broken up at space or colon boundaries and treated as though a
C<use attributes> had been seen.  See L<attributes> for details
about what attributes are currently supported.
Unlike the limitation with the obsolescent C<use attrs>, the
C<sub : ATTRLIST> syntax works to associate the attributes with
a pre-declaration, and not just with a subroutine definition.

The attributes must be valid as simple identifier names (without any
punctuation other than the '_' character).  They may have a parameter
list appended, which is only checked for whether its parentheses ('(',')')
nest properly.

Examples of valid syntax (even though the attributes are unknown):

    sub fnord (&\%) : switch(10,foo(7,3))  :  expensive;
    sub plugh () : Ugly('\(") :Bad;
    sub xyzzy : _5x5 { ... }

Examples of invalid syntax:

    sub fnord : switch(10,foo(); # ()-string not balanced
    sub snoid : Ugly('(');	  # ()-string not balanced
    sub xyzzy : 5x5;		  # "5x5" not a valid identifier
    sub plugh : Y2::north;	  # "Y2::north" not a simple identifier
    sub snurt : foo + bar;	  # "+" not a colon or space

The attribute list is passed as a list of constant strings to the code
which associates them with the subroutine.  In particular, the second example
of valid syntax above currently looks like this in terms of how it's
parsed and invoked:

    use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';

For further details on attribute lists and their manipulation,
see L<attributes> and L<Attribute::Handlers>.
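
For a runnable taste of the syntax, the sketch below uses the built-in
C<lvalue> attribute and then asks the C<attributes> module what is
attached to the sub.  The sub and variable names are made up for
illustration:

    use strict;
    use warnings;
    use attributes ();

    my $stored = 0;

    # The lvalue attribute lets the call appear on the left of "=".
    sub stored : lvalue { $stored }

    stored() = 42;
    print "$stored\n";      # 42

    my @attrs = attributes::get(\&stored);
    print "attributes: @attrs\n";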

=head1 SEE ALSO

See L<perlref/"Function Templates"> for more about references and closures.
See L<perlxs> if you'd like to learn about calling C subroutines from Perl.  
See L<perlembed> if you'd like to learn about calling Perl subroutines from C.  
See L<perlmod> to learn about bundling up your functions in separate files.
See L<perlmodlib> to learn what library modules come standard on your system.
See L<perlootut> to learn how to make object method calls.
perl584delta.pod000064400000016303150344123500007465 0ustar00=head1 NAME

perl584delta - what is new for perl v5.8.4

=head1 DESCRIPTION

This document describes differences between the 5.8.3 release and
the 5.8.4 release.

=head1 Incompatible Changes

Many minor bugs have been fixed. Scripts which happen to rely on previously
erroneous behaviour will consider these fixes as incompatible changes :-)
You are advised to perform sufficient acceptance testing on this release
to satisfy yourself that this does not affect you, before putting this
release into production.

The diagnostic output of Carp has been changed slightly, to add a space after
the comma between arguments. This makes it much easier for tools such as
web browsers to wrap it, but might confuse any automatic tools which perform
detailed parsing of Carp output.

The internal dump output has been improved, so that non-printable characters
such as newline and backspace are output in C<\x> notation, rather than
octal. This might just confuse non-robust tools which parse the output of
modules such as Devel::Peek.

=head1 Core Enhancements

=head2 Malloc wrapping

Perl can now be built to detect attempts to assign pathologically large chunks
of memory.  Previously such assignments would suffer from integer wrap-around
during size calculations causing a misallocation, which would crash perl, and
could theoretically be used for "stack smashing" attacks.  The wrapping
defaults to enabled on platforms where we know it works (most AIX
configurations, BSDi, Darwin, DEC OSF/1, FreeBSD, HP/UX, GNU Linux, OpenBSD,
Solaris, VMS and most Win32 compilers) and defaults to disabled on other
platforms.

=head2 Unicode Character Database 4.0.1

The copy of the Unicode Character Database included in Perl 5.8 has
been updated to 4.0.1 from 4.0.0.

=head2 suidperl less insecure

Paul Szabo has analysed and patched C<suidperl> to remove existing known
insecurities. Currently there are no known holes in C<suidperl>, but previous
experience shows that we cannot be confident that these were the last. You may
no longer invoke the set uid perl directly, so to preserve backwards
compatibility with scripts that invoke #!/usr/bin/suidperl the only set uid
binary is now C<sperl5.8.>I<n> (C<sperl5.8.4> for this release). C<suidperl>
is installed as a hard link to C<perl>; both C<suidperl> and C<perl> will
automatically invoke the set uid binary C<sperl5.8.4>, so this change should
be completely transparent.

For new projects the core perl team would strongly recommend that you use
dedicated, single purpose security tools such as C<sudo> in preference to
C<suidperl>.

=head2 format

In addition to bug fixes, C<format>'s features have been enhanced. See
L<perlform>.

=head1 Modules and Pragmata

The (mis)use of C</tmp> in core modules and documentation has been tidied up.
Some modules available both within the perl core and independently from CPAN
("dual-life modules") have not yet had these changes applied; the changes
will be integrated into future stable perl releases as the modules are
updated on CPAN.

=head2 Updated modules

=over 4

=item Attribute::Handlers

=item B

=item Benchmark

=item CGI

=item Carp

=item Cwd

=item Exporter

=item File::Find

=item IO

=item IPC::Open3

=item Locale::Maketext

=item Math::BigFloat

=item Math::BigInt

=item Math::BigRat

=item MIME::Base64

=item ODBM_File

=item POSIX

=item Shell

=item Socket

There is experimental support for Linux abstract Unix domain sockets.

=item Storable

=item Switch

Synced with its CPAN version 2.10.

=item Sys::Syslog

C<syslog()> can now use numeric constants for facility names and priorities,
in addition to strings.

=item Term::ANSIColor

=item Time::HiRes

=item Unicode::UCD

=item Win32

Win32.pm/Win32.xs has moved from the libwin32 module to core Perl.

=item base

=item open

=item threads

Detached threads are now also supported on Windows.

=item utf8

=back

=head1 Performance Enhancements

=over 4

=item *

Accelerated Unicode case mappings (C</i>, C<lc>, C<uc>, etc).

=item *

In place sort optimised (eg C<@a = sort @a>)

=item *

Unnecessary assignment optimised away in

  my $s = undef;
  my @a = ();
  my %h = ();

=item *

Optimised C<map> in scalar context

=back

=head1 Utility Changes

The Perl debugger (F<lib/perl5db.pl>) can now save all debugger commands for
sourcing later, and can display the parent inheritance tree of a given class.

=head1 Installation and Configuration Improvements

The build process on both VMS and Windows has had several minor improvements
made. On Windows Borland's C compiler can now compile perl with PerlIO and/or
USE_LARGE_FILES enabled.

C<perl.exe> on Windows now has a "Camel" logo icon. The use of a camel with
the topic of Perl is a trademark of O'Reilly and Associates Inc., and is used
with their permission (ie distribution of the source, compiling a Windows
executable from it, and using that executable locally). Use of the supplied
camel for anything other than a perl executable's icon is specifically not
covered, and anyone wishing to redistribute perl binaries I<with> the icon
should check directly with O'Reilly beforehand.

Perl should build cleanly on Stratus VOS once more.

=head1 Selected Bug Fixes

More utf8 bugs fixed, notably in how C<chomp>, C<chop>, C<send>, and
C<syswrite> interact with utf8 data. Concatenation now works correctly
when C<use bytes;> is in scope.

Pragmata are now correctly propagated into (?{...}) constructions in regexps.
Code such as

   my $x = qr{ ... (??{ $x }) ... };

will now (correctly) fail under use strict. (The inner C<$x> refers, and
has always referred, to C<$::x>.)

The "const in void context" warning has been suppressed for a constant in an
optimised-away boolean expression such as C<5 || print;>.

C<perl -i> could C<fchmod(stdin)> by mistake. This is serious if stdin is
attached to a terminal, and perl is running as root. Now fixed.

=head1 New or Changed Diagnostics

C<Carp> and the internal diagnostic routines used by C<Devel::Peek> have been
made clearer, as described in L</Incompatible Changes>.

=head1 Changed Internals

Some bugs have been fixed in the hash internals. Restricted hashes and
their place holders are now allocated and deleted at slightly different times,
but this should not be visible to user code.

=head1 Future Directions

Code freeze for the next maintenance release (5.8.5) will be on 30th June
2004, with release by mid July.

=head1 Platform Specific Problems

This release is known not to build on Windows 95.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org.  There may also be
information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlreref.pod000064400000034630150344123500007241 0ustar00=head1 NAME

perlreref - Perl Regular Expressions Reference

=head1 DESCRIPTION

This is a quick reference to Perl's regular expressions.
For full information see L<perlre> and L<perlop>, as well
as the L</"SEE ALSO"> section in this document.

=head2 OPERATORS

C<=~> determines to which variable the regex is applied.
In its absence, $_ is used.

    $var =~ /foo/;

C<!~> determines to which variable the regex is applied,
and negates the result of the match; it returns
false if the match succeeds, and true if it fails.

    $var !~ /foo/;

C<m/pattern/msixpogcdualn> searches a string for a pattern match,
applying the given options.

    m  Multiline mode - ^ and $ match internal lines
    s  match as a Single line - . matches \n
    i  case-Insensitive
    x  eXtended legibility - free whitespace and comments
    p  Preserve a copy of the matched string -
       ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} will be defined.
    o  compile pattern Once
    g  Global - all occurrences
    c  don't reset pos on failed matches when using /g
    a  restrict \d, \s, \w and [:posix:] to match ASCII only
    aa (two a's) also /i matches exclude ASCII/non-ASCII
    l  match according to current locale
    u  match according to Unicode rules
    d  match according to native rules unless something indicates
       Unicode
    n  Non-capture mode. Don't let () fill in $1, $2, etc...

If 'pattern' is an empty string, the last I<successfully> matched
regex is used. Delimiters other than '/' may be used for both this
operator and the following ones. The leading C<m> can be omitted
if the delimiter is '/'.
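
A small sketch combining a few of these modifiers is given below; the
sample strings are arbitrary:

    use strict;
    use warnings;

    my $text = "Foo\nbar\nFOO";

    # /m lets ^ and $ match at line boundaries, /i ignores case, and /g
    # in list context returns every match.
    my @hits = $text =~ /^foo$/mgi;
    print scalar(@hits), "\n";              # 2

    # /x allows whitespace inside the pattern for legibility.
    my $date_re = qr/ (\d{4}) - (\d{2}) /x;
    if ("on 2004-06" =~ $date_re) {
        print "$1 $2\n";                    # 2004 06
    }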

C<qr/pattern/msixpodualn> lets you store a regex in a variable,
or pass one around. Modifiers as for C<m//>, and are stored
within the regex.

C<s/pattern/replacement/msixpogcedual> substitutes matches of
'pattern' with 'replacement'. Modifiers as for C<m//>,
with two additions:

    e  Evaluate 'replacement' as an expression
    r  Return substitution and leave the original string untouched.

'e' may be specified multiple times. 'replacement' is interpreted
as a double quoted string unless a single-quote (C<'>) is the delimiter.
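
The two extra modifiers can be combined, as in this sketch (the C</r>
modifier assumes Perl 5.14 or later; the strings are arbitrary):

    use strict;
    use warnings;

    my $s = "price: 10";

    # /e evaluates the replacement as code; /r returns the result and
    # leaves $s alone.
    my $t = $s =~ s/(\d+)/$1 * 2/er;

    print "$s\n";   # price: 10
    print "$t\n";   # price: 20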

C<m?pattern?> is like C<m/pattern/> but matches only once. No alternate
delimiters can be used.  Must be reset with reset().

=head2 SYNTAX

 \       Escapes the character immediately following it
 .       Matches any single character except a newline (unless /s is
           used)
 ^       Matches at the beginning of the string (or line, if /m is used)
 $       Matches at the end of the string (or line, if /m is used)
 *       Matches the preceding element 0 or more times
 +       Matches the preceding element 1 or more times
 ?       Matches the preceding element 0 or 1 times
 {...}   Specifies a range of occurrences for the element preceding it
 [...]   Matches any one of the characters contained within the brackets
 (...)   Groups subexpressions for capturing to $1, $2...
 (?:...) Groups subexpressions without capturing (cluster)
 |       Matches either the subexpression preceding or following it
 \g1 or \g{1}, \g2 ...    Matches the text from the Nth group
 \1, \2, \3 ...           Matches the text from the Nth group
 \g-1 or \g{-1}, \g-2 ... Matches the text from the Nth previous group
 \g{name}     Named backreference
 \k<name>     Named backreference
 \k'name'     Named backreference
 (?P=name)    Named backreference (python syntax)
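
The backreference forms can be tried out with a sketch like the one
below.  The named group is declared with C<< (?<name>...) >>, which is
described in L<perlre>; Perl 5.10 or later is assumed:

    use strict;
    use warnings;

    print "numbered\n" if "xyzzy" =~ /(z)\1/;           # numbered
    print "relative\n" if "abab"  =~ /(ab)\g{-1}/;      # relative

    if ("2004-2004" =~ /(?<year>\d{4})-\k<year>/) {
        print "$+{year}\n";                             # 2004
    }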

=head2 ESCAPE SEQUENCES

These work as in normal strings.

   \a       Alarm (beep)
   \e       Escape
   \f       Formfeed
   \n       Newline
   \r       Carriage return
   \t       Tab
   \037     Char whose ordinal is the 3 octal digits, max \777
   \o{2307} Char whose ordinal is the octal number, unrestricted
   \x7f     Char whose ordinal is the 2 hex digits, max \xFF
   \x{263a} Char whose ordinal is the hex number, unrestricted
   \cx      Control-x
   \N{name} A named Unicode character or character sequence
   \N{U+263D} A Unicode character by hex ordinal

   \l  Lowercase next character
   \u  Titlecase next character
   \L  Lowercase until \E
   \U  Uppercase until \E
   \F  Foldcase until \E
   \Q  Disable pattern metacharacters until \E
   \E  End modification

For Titlecase, see L</Titlecase>.

This one works differently from normal strings:

   \b  An assertion, not backspace, except in a character class

=head2 CHARACTER CLASSES

   [amy]    Match 'a', 'm' or 'y'
   [f-j]    Dash specifies "range"
   [f-j-]   Dash escaped or at start or end means 'dash'
   [^f-j]   Caret indicates "match any character _except_ these"

The following sequences (except C<\N>) work within or without a character class.
The first six are locale aware, all are Unicode aware. See L<perllocale>
and L<perlunicode> for details.

   \d      A digit
   \D      A nondigit
   \w      A word character
   \W      A non-word character
   \s      A whitespace character
   \S      A non-whitespace character
   \h      A horizontal whitespace
   \H      A non horizontal whitespace
   \N      A non newline (when not followed by '{NAME}';
           not valid in a character class; equivalent to [^\n]; it's
           like '.' without /s modifier)
   \v      A vertical whitespace
   \V      A non vertical whitespace
   \R      A generic newline           (?>\v|\x0D\x0A)

   \pP     Match P-named (Unicode) property
   \p{...} Match Unicode property with name longer than 1 character
   \PP     Match non-P
   \P{...} Match lack of Unicode property with name longer than 1 char
   \X      Match Unicode extended grapheme cluster

POSIX character classes and their Unicode and Perl equivalents:

            ASCII-         Full-
   POSIX    range          range    backslash
 [[:...:]]  \p{...}        \p{...}   sequence    Description

 -----------------------------------------------------------------------
 alnum   PosixAlnum       XPosixAlnum            'alpha' plus 'digit'
 alpha   PosixAlpha       XPosixAlpha            Alphabetic characters
 ascii   ASCII                                   Any ASCII character
 blank   PosixBlank       XPosixBlank   \h       Horizontal whitespace;
                                                   full-range also
                                                   written as
                                                   \p{HorizSpace} (GNU
                                                   extension)
 cntrl   PosixCntrl       XPosixCntrl            Control characters
 digit   PosixDigit       XPosixDigit   \d       Decimal digits
 graph   PosixGraph       XPosixGraph            'alnum' plus 'punct'
 lower   PosixLower       XPosixLower            Lowercase characters
 print   PosixPrint       XPosixPrint            'graph' plus 'space',
                                                   but not any Controls
 punct   PosixPunct       XPosixPunct            Punctuation and Symbols
                                                   in ASCII-range; just
                                                   punct outside it
 space   PosixSpace       XPosixSpace   \s       Whitespace
 upper   PosixUpper       XPosixUpper            Uppercase characters
 word    PosixWord        XPosixWord    \w       'alnum' + Unicode marks
                                                    + connectors, like
                                                    '_' (Perl extension)
 xdigit  ASCII_Hex_Digit  XPosixXDigit           Hexadecimal digit,
                                                    ASCII-range is
                                                    [0-9A-Fa-f]

Also, various synonyms like C<\p{Alpha}> for C<\p{XPosixAlpha}>; all listed
in L<perluniprops/Properties accessible through \p{} and \P{}>.

Within a character class:

    POSIX      traditional   Unicode
  [:digit:]       \d        \p{Digit}
  [:^digit:]      \D        \P{Digit}
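
As a rough sketch of how the three notations line up (the sample strings
are invented for illustration), the following extracts runs of digits in
each style and counts horizontal whitespace with C<\h>:

    use strict;
    use warnings;

    my $s = "Order 66, aisle 7";

    my @posix = $s =~ /[[:digit:]]+/g;   # POSIX class inside a bracketed class
    my @trad  = $s =~ /\d+/g;            # traditional backslash sequence
    my @prop  = $s =~ /\p{Digit}+/g;     # Unicode property

    # Count horizontal whitespace: the space and the tab, not the newline.
    my $hspace = () = "a b\tc\n" =~ /\h/g;

    print "@posix | @trad | @prop | $hspace\n";

For this ASCII-only input all three digit forms find the same matches;
they can differ on non-ASCII text depending on the modifiers in effect.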

=head2 ANCHORS

All are zero-width assertions.

   ^  Match string start (or line, if /m is used)
   $  Match string end (or line, if /m is used) or before newline
   \b{} Match boundary of type specified within the braces
   \B{} Match wherever \b{} doesn't match
   \b Match word boundary (between \w and \W)
   \B Match except at word boundary (between \w and \w or \W and \W)
   \A Match string start (regardless of /m)
   \Z Match string end (before optional newline)
   \z Match absolute string end
   \G Match where previous m//g left off
   \K Keep the stuff left of the \K, don't include it in $&
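
The difference between C<$>, C<\Z> and C<\z>, and the effect of C<\K>,
can be seen in a sketch like this one (the strings and the file name are
made-up examples):

    use strict;
    use warnings;

    my $line = "foo\n";

    my $dollar = ($line =~ /foo$/)  ? 1 : 0;   # 1: may match before a final newline
    my $big_z  = ($line =~ /foo\Z/) ? 1 : 0;   # 1: same allowance
    my $path_z = ($line =~ /foo\z/) ? 1 : 0;   # 0: only the absolute end

    # \K keeps the dot out of the matched text, so only the
    # extension is replaced.
    my $path = "dir/file.txt";
    $path =~ s/\.\K\w+\z/bak/;                 # "dir/file.bak"

    print "$dollar $big_z $path_z $path\n";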

=head2 QUANTIFIERS

Quantifiers are greedy by default and match the B<longest> leftmost.

   Maximal Minimal Possessive Allowed range
   ------- ------- ---------- -------------
   {n,m}   {n,m}?  {n,m}+     Must occur at least n times
                              but no more than m times
   {n,}    {n,}?   {n,}+      Must occur at least n times
   {n}     {n}?    {n}+       Must occur exactly n times
   *       *?      *+         0 or more times (same as {0,})
   +       +?      ++         1 or more times (same as {1,})
   ?       ??      ?+         0 or 1 time (same as {0,1})

The possessive forms (new in Perl 5.10) prevent backtracking: what gets
matched by a pattern with a possessive quantifier will not be backtracked
into, even if that causes the whole match to fail.

There is no quantifier C<{,n}>. That's interpreted as a literal string.
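
A minimal sketch of the three quantifier styles, using invented sample
strings, might look like this:

    use strict;
    use warnings;

    my $html = "<b>one</b><b>two</b>";

    my ($greedy) = $html =~ m{<b>(.*)</b>};    # "one</b><b>two"
    my ($lazy)   = $html =~ m{<b>(.*?)</b>};   # "one"

    # a++ takes every 'a' and refuses to give one back, so the final
    # 'a' in the pattern can never match.
    my $backtracking = ("aaa" =~ /a+a/)  ? 1 : 0;   # 1
    my $possessive   = ("aaa" =~ /a++a/) ? 1 : 0;   # 0

    print "greedy: $greedy\nlazy: $lazy\n";
    print "a+a: $backtracking, a++a: $possessive\n";

The possessive form is a way to fail fast: when giving characters back
could never help, it saves the engine from trying.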

=head2 EXTENDED CONSTRUCTS

   (?#text)          A comment
   (?:...)           Groups subexpressions without capturing (cluster)
   (?pimsx-imsx:...) Enable/disable option (as per m// modifiers)
   (?=...)           Zero-width positive lookahead assertion
   (?!...)           Zero-width negative lookahead assertion
   (?<=...)          Zero-width positive lookbehind assertion
   (?<!...)          Zero-width negative lookbehind assertion
   (?>...)           Grab what we can, prohibit backtracking
   (?|...)           Branch reset
   (?<name>...)      Named capture
   (?'name'...)      Named capture
   (?P<name>...)     Named capture (python syntax)
   (?[...])          Extended bracketed character class
   (?{ code })       Embedded code, return value becomes $^R
   (??{ code })      Dynamic regex, return value used as regex
   (?N)              Recurse into subpattern number N
   (?-N), (?+N)      Recurse into Nth previous/next subpattern
   (?R), (?0)        Recurse at the beginning of the whole pattern
   (?&name)          Recurse into a named subpattern
   (?P>name)         Recurse into a named subpattern (python syntax)
   (?(cond)yes|no)
   (?(cond)yes)      Conditional expression, where "cond" can be:
                     (?=pat)   lookahead
                     (?!pat)   negative lookahead
                     (?<=pat)  lookbehind
                     (?<!pat)  negative lookbehind
                     (N)       subpattern N has matched something
                     (<name>)  named subpattern has matched something
                     ('name')  named subpattern has matched something
                     (?{code}) code condition
                     (R)       true if recursing
                     (RN)      true if recursing into Nth subpattern
                     (R&name)  true if recursing into named subpattern
                     (DEFINE)  always false, no no-pattern allowed
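
Two of the constructs above, lookbehind and recursion, are sketched below.
The input strings are illustrative only, and the balanced-parentheses
pattern is one common way to write such a match, not the only one:

    use strict;
    use warnings;

    # Lookbehind: take digits only when preceded by a dollar sign.
    my ($amount) = 'total: $42' =~ /(?<=\$)(\d+)/;    # 42

    # Recursion: (?1) re-enters group 1 to match nested parentheses.
    my $balanced = qr/^(\((?:[^()]++|(?1))*\))\z/;
    my $ok  = ("(a(b)c)" =~ $balanced) ? 1 : 0;       # 1
    my $bad = ("(a(b c)" =~ $balanced) ? 1 : 0;       # 0

    print "$amount $ok $bad\n";

Note that C<(?1)> refers to the group's number within the whole pattern,
so the number may need adjusting if the pattern is embedded in a larger
one.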

=head2 VARIABLES

   $_    Default variable for operators to use

   $`    Everything prior to matched string
   $&    Entire matched string
   $'    Everything after matched string

   ${^PREMATCH}   Everything prior to matched string
   ${^MATCH}      Entire matched string
   ${^POSTMATCH}  Everything after matched string

Note to those still using Perl 5.18 or earlier:
The use of C<$`>, C<$&> or C<$'> will slow down B<all> regex use
within your program. Consult L<perlvar> for C<@->
to see equivalent expressions that won't cause a slowdown.
See also L<Devel::SawAmpersand>. Starting with Perl 5.10, you
can also use the equivalent variables C<${^PREMATCH}>, C<${^MATCH}>
and C<${^POSTMATCH}>, but for them to be defined, you have to
specify the C</p> (preserve) modifier on your regular expression.
In Perl 5.20, the use of C<$`>, C<$&> and C<$'> makes no speed difference.

   $1, $2 ...  hold the Xth captured expr
   $+    Last parenthesized pattern match
   $^N   Holds the most recently closed capture
   $^R   Holds the result of the last (?{...}) expr
   @-    Offsets of starts of groups. $-[0] holds start of whole match
   @+    Offsets of ends of groups. $+[0] holds end of whole match
   %+    Named capture groups
   %-    Named capture groups, as array refs

Captured groups are numbered according to their I<opening> paren.
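
The following sketch shows several of these variables after one match.
The input string and the group names C<k> and C<v> are assumptions made
for the example:

    use strict;
    use warnings;

    my $str = "set key=value now";
    my %seen;

    # /p makes ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} available.
    if ($str =~ /(?<k>\w+)=(?<v>\w+)/p) {
        %seen = (
            pre     => ${^PREMATCH},    # "set "
            match   => ${^MATCH},       # "key=value"
            post    => ${^POSTMATCH},   # " now"
            key     => $+{k},           # "key"
            v_start => $-[2],           # 8: offset where group 2 starts
            v_end   => $+[2],           # 13: offset just past group 2
        );
    }

    print "$_ => '$seen{$_}'\n" for sort keys %seen;

C<substr($str, $-[2], $+[2] - $-[2])> gives the same text as C<$2>.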

=head2 FUNCTIONS

   lc          Lowercase a string
   lcfirst     Lowercase first char of a string
   uc          Uppercase a string
   ucfirst     Titlecase first char of a string
   fc          Foldcase a string

   pos         Return or set current match position
   quotemeta   Quote metacharacters
   reset       Reset m?pattern? status
   study       Analyze string for optimizing matching

   split       Use a regex to split a string into parts

The first five of these are like the escape sequences C<\L>, C<\l>,
C<\U>, C<\u>, and C<\F>.  For Titlecase, see L</Titlecase>.  For
Foldcase, see L</Foldcase>.
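
A brief sketch of a few of these functions follows; the sample data is
invented for illustration:

    use strict;
    use warnings;

    my $word  = ucfirst(lc("pERL"));             # "Perl"
    my @parts = split /\s*,\s*/, "a , b,c";      # ("a", "b", "c")
    my $safe  = quotemeta("a.b*c");              # a\.b\*c

    # After a successful scalar m//g, pos() is the offset just past
    # the match.
    my $text  = "aXbXc";
    my $found = $text =~ /X/g;
    my $where = pos($text);                      # 2

    print "$word | @parts | $safe | $where\n";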

=head2 TERMINOLOGY

=head3 Titlecase

Unicode concept which most often is equal to uppercase, but for
certain characters like the German "sharp s" there is a difference.

=head3 Foldcase

Unicode form that is useful when comparing strings regardless of case,
as certain characters have complex one-to-many case mappings. Primarily a
variant of lowercase.

=head1 AUTHOR

Iain Truskett. Updated by the Perl 5 Porters.

This document may be distributed under the same terms as Perl itself.

=head1 SEE ALSO

=over 4

=item *

L<perlretut> for a tutorial on regular expressions.

=item *

L<perlrequick> for a rapid tutorial.

=item *

L<perlre> for more details.

=item *

L<perlvar> for details on the variables.

=item *

L<perlop> for details on the operators.

=item *

L<perlfunc> for details on the functions.

=item *

L<perlfaq6> for FAQs on regular expressions.

=item *

L<perlrebackslash> for a reference on backslash sequences.

=item *

L<perlrecharclass> for a reference on character classes.

=item *

The L<re> module to alter behaviour and aid
debugging.

=item *

L<perldebug/"Debugging Regular Expressions">

=item *

L<perluniintro>, L<perlunicode>, L<charnames> and L<perllocale>
for details on regexes and internationalisation.

=item *

I<Mastering Regular Expressions> by Jeffrey Friedl
(L<http://oreilly.com/catalog/9780596528126/>) for a thorough grounding and
reference on the topic.

=back

=head1 THANKS

David P.C. Wollmann,
Richard Soderberg,
Sean M. Burke,
Tom Christiansen,
Jim Cromie,
and
Jeffrey Goff
for useful advice.

=cut
=encoding utf8

=head1 NAME

perl5141delta - what is new for perl v5.14.1

=head1 DESCRIPTION

This document describes differences between the 5.14.0 release and
the 5.14.1 release.

If you are upgrading from an earlier release such as 5.12.0, first read
L<perl5140delta>, which describes differences between 5.12.0 and
5.14.0.

=head1 Core Enhancements

No changes since 5.14.0.

=head1 Security

No changes since 5.14.0.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.14.0. If any
exist, they are bugs and reports are welcome.

=head1 Deprecations

There have been no deprecations since 5.14.0.

=head1 Modules and Pragmata

=head2 New Modules and Pragmata

None

=head2 Updated Modules and Pragmata

=over 4

=item *

L<B::Deparse> has been upgraded from version 1.03 to 1.04, to address two
regressions in Perl 5.14.0:

Deparsing of the C<glob> operator and its diamond (C<< <> >>) form now
works again. [perl #90898]

The presence of subroutines named C<::::> or C<::::::> no longer causes
B::Deparse to hang.

=item *

L<Pod::Perldoc> has been upgraded from version 3.15_03 to 3.15_04.

It corrects the search paths on VMS. [perl #90640]

=back

=head2 Removed Modules and Pragmata

None

=head1 Documentation

=head2 New Documentation

None

=head2 Changes to Existing Documentation

=head3 L<perlfunc>

=over

=item *

C<given>, C<when> and C<default> are now listed in L<perlfunc>.

=item *

Documentation for C<use> now includes a pointer to F<if.pm>.

=back

=head3 L<perllol>

=over

=item *

L<perllol> has been expanded with examples using the new C<push $scalar>
syntax introduced in Perl 5.14.0.

=back

=head3 L<perlop>

=over 4

=item *

The explanation of bitwise operators has been expanded to explain how they
work on Unicode strings.

=item *

The section on the triple-dot or yada-yada operator has been moved up, as
it used to separate two closely related sections about the comma operator.

=item *

More examples for C<m//g> have been added.

=item *

The C<<< <<\FOO >>> here-doc syntax has been documented.

=back

=head3 L<perlrun>

=over

=item *

L<perlrun> has undergone a significant clean-up.  Most notably, the
B<-0x...> form of the B<-0> flag has been clarified, and the final section
on environment variables has been corrected and expanded.

=back

=head3 L<POSIX>

=over 

=item *

The invocation documentation for C<WIFEXITED>, C<WEXITSTATUS>,
C<WIFSIGNALED>, C<WTERMSIG>, C<WIFSTOPPED>, and C<WSTOPSIG> was corrected.

=back


=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 New Diagnostics

None

=head2 Changes to Existing Diagnostics

None

=head1 Utility Changes

None

=head1 Configuration and Compilation

=over 4

=item *

F<regexp.h> has been modified for compatibility with GCC's C<-Werror>
option, as used by some projects that include perl's header files.

=back

=head1 Testing

=over 4

=item *

Some test failures in F<dist/Locale-Maketext/t/09_compile.t> that could
occur depending on the environment have been fixed. [perl #89896]

=item * 

A watchdog timer for F<t/re/re.t> was lengthened to accommodate SH-4 systems
which were unable to complete the tests before the previous timer ran out.


=back

=head1 Platform Support

=head2 New Platforms

None

=head2 Discontinued Platforms

None

=head2 Platform-Specific Notes

=head3 Solaris

=over 

=item *

Documentation listing the Solaris packages required to build Perl on
Solaris 9 and Solaris 10 has been corrected.

=back

=head3 Mac OS X

=over

=item * 

The F<lib/locale.t> test script has been updated to work on the upcoming
Lion release.

=item * 

Mac OS X specific compilation instructions have been clarified.

=back

=head3 Ubuntu Linux

=over 

=item *

The L<ODBM_File> installation process has been updated with the new library
paths on Ubuntu natty.

=back

=head1 Internal Changes

=over 

=item *

The compiled representation of formats is now stored via the mg_ptr of
their PERL_MAGIC_fm. Previously it was stored in the string buffer,
beyond SvLEN(), the regular end of the string. SvCOMPILED() and
SvCOMPILED_{on,off}() now exist solely for compatibility with XS code.
The first is always 0; the other two are now no-ops.

=back

=head1 Bug Fixes

=over 4

=item *

A bug has been fixed that would cause a "Use of freed value in iteration"
error if the next two hash elements that would be iterated over are
deleted. [perl #85026]

=item *

Passing the same constant subroutine to both C<index> and C<formline> no
longer causes one or the other to fail. [perl #89218]

=item *

5.14.0 introduced some memory leaks in regular expression character
classes such as C<[\w\s]>, which have now been fixed.

=item *

An edge case in regular expression matching could potentially loop.
This happened only under C</i> in bracketed character classes that have
characters with multi-character folds, and the target string to match
against includes the first portion of the fold, followed by another
character that has a multi-character fold that begins with the remaining
portion of the fold, plus some more.

 "s\N{U+DF}" =~ /[\x{DF}foo]/i

is one such case.  C<\xDF> folds to C<"ss">.

=item * 

Several Unicode case-folding bugs have been fixed.

=item *

The new (in 5.14.0) regular expression modifier C</a>, when repeated as
C</aa>, forbids characters outside the ASCII range from matching
characters inside that range under C</i>.  This did not work under some
circumstances, all involving alternation.  For example,

 "\N{KELVIN SIGN}" =~ /k|foo/iaa;

succeeded inappropriately.  This is now fixed.

=item *

Fixed a case where a freed buffer could have been read from when parsing
a here document.

=back
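
The intended C</aa> behaviour described in the bug fixes above can be
checked with a short sketch such as this one.  It assumes Perl 5.14.1 or
later and writes KELVIN SIGN as C<\N{U+212A}> so that no character-name
lookup is needed:

    use 5.014;
    use warnings;

    my $kelvin = "\N{U+212A}";    # KELVIN SIGN, which case-folds to "k"

    my $plain_i = ($kelvin =~ /k/i)        ? 1 : 0;   # 1: Unicode fold applies
    my $with_aa = ($kelvin =~ /k/iaa)      ? 1 : 0;   # 0: non-ASCII may not match ASCII
    my $alt_aa  = ($kelvin =~ /k|foo/iaa)  ? 1 : 0;   # 0 once the fix is in place

    print "$plain_i $with_aa $alt_aa\n";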

=head1 Acknowledgements

Perl 5.14.1 represents approximately four weeks of development since
Perl 5.14.0 and contains approximately 3500 lines of changes
across 38 files from 17 authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers.  The following people are known to
have contributed the improvements that became Perl 5.14.1:

Bo Lindbergh, Claudio Ramirez, Craig A. Berry, David Leadbeater, Father
Chrysostomos, Jesse Vincent, Jim Cromie, Justin Case, Karl Williamson,
Leo Lapworth, Nicholas Clark, Nobuhiro Iwamatsu, smash, Tom Christiansen,
Ton Hospel, Vladimir Timofeev, and Zsbán Ambrus.


=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be able
to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for
security issues in the Perl core, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
=head1 NAME

perlmroapi - Perl method resolution plugin interface

=head1 DESCRIPTION

As of Perl 5.10.1 there is a new interface for plugging and using method
resolution orders other than the default (linear depth first search).
The C3 method resolution order added in 5.10.0 has been re-implemented as
a plugin, without changing its Perl-space interface.
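
From Perl space, the difference between the default order and C3 can be
seen with the core L<mro> module.  The sketch below uses a made-up
diamond hierarchy (C<Top>, C<Left>, C<Right>, C<Bottom>); it illustrates
the Perl-level interface only, not how a plugin is written:

    use strict;
    use warnings;
    use mro;

    # A hypothetical diamond: Bottom inherits from Left and Right,
    # which both inherit from Top.
    { package Top;    our @ISA = (); }
    { package Left;   our @ISA = ('Top'); }
    { package Right;  our @ISA = ('Top'); }
    { package Bottom; our @ISA = ('Left', 'Right'); }

    my $dfs = join ' ', @{ mro::get_linear_isa('Bottom', 'dfs') };
    my $c3  = join ' ', @{ mro::get_linear_isa('Bottom', 'c3')  };

    print "dfs: $dfs\n";   # dfs: Bottom Left Top Right
    print "c3:  $c3\n";    # c3:  Bottom Left Right Top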

Each plugin should register itself by providing
the following structure

    struct mro_alg {
        AV *(*resolve)(pTHX_ HV *stash, U32 level);
        const char *name;
        U16 length;
        U16 kflags;
        U32 hash;
    };

and calling C<Perl_mro_register>:

    Perl_mro_register(aTHX_ &my_mro_alg);

=over 4

=item resolve

Pointer to the linearisation function, described below.

=item name

Name of the MRO, either in ISO-8859-1 or UTF-8.

=item length

Length of the name.

=item kflags

If the name is given in UTF-8, set this to C<HVhek_UTF8>. The value is passed
directly as the parameter I<kflags> to C<hv_common()>.

=item hash

A precomputed hash value for the MRO's name, or 0.

=back

=head1 Callbacks

The C<resolve> function is called to generate a linearised ISA for the
given stash, using this MRO. It is called with a pointer to the stash, and
a I<level> of 0. The core always sets I<level> to 0 when it calls your
function - the parameter is provided to allow your implementation to track
depth if it needs to recurse.

The function should return a reference to an array containing the parent
classes in order. The names of the classes should be the result of calling
C<HvENAME()> on the stash. In those cases where C<HvENAME()> returns null,
C<HvNAME()> should be used instead.

The caller is responsible for incrementing the reference count of the array
returned if it wants to keep the structure. Hence, if you have created a
temporary value that you keep no pointer to, call C<sv_2mortal()> on it to
ensure that it is disposed of correctly. If you have cached your return
value, then return a pointer to it without changing the reference count.

=head1 Caching

Computing MROs can be expensive. The implementation provides a cache, in
which you can store a single C<SV *>, or anything that can be cast to
C<SV *>, such as C<AV *>. To read your private value, use the macro
C<MRO_GET_PRIVATE_DATA()>, passing it the C<mro_meta> structure from the
stash, and a pointer to your C<mro_alg> structure:

    meta = HvMROMETA(stash);
    private_sv = MRO_GET_PRIVATE_DATA(meta, &my_mro_alg);

To set your private value, call C<Perl_mro_set_private_data()>:

    Perl_mro_set_private_data(aTHX_ meta, &my_mro_alg, private_sv);

The private data cache will take ownership of a reference to private_sv,
much the same way that C<hv_store()> takes ownership of a reference to the
value that you pass it.

=head1 Examples

For examples of MRO implementations, see C<S_mro_get_linear_isa_c3()>
and the C<BOOT:> section of F<ext/mro/mro.xs>, and
C<S_mro_get_linear_isa_dfs()> in F<mro_core.c>.

=head1 AUTHORS

The implementation of the C3 MRO and switchable MROs within the perl core was
written by Brandon L Black. Nicholas Clark created the pluggable interface, 
refactored Brandon's implementation to work with it, and wrote this document.

=cut
This document is written in pod format hence there are punctuation
characters in odd places.  Do not worry, you've apparently got the
ASCII->EBCDIC translation worked out correctly.  You can read more
about pod in pod/perlpod.pod or the short summary in the INSTALL file.

=head1 NAME

perlos390 - building and installing Perl for OS/390 and z/OS

=head1 SYNOPSIS

This document will help you Configure, build, test and install Perl
on OS/390 (aka z/OS) Unix System Services.

B<This document needs to be updated, but we don't know what it should say.
Please email comments to L<perlbug@perl.org|mailto:perlbug@perl.org>.>

=head1 DESCRIPTION

This is a fully ported Perl for OS/390 Version 2 Release 3, 5, 6, 7,
8, and 9.  It may work on other versions or releases, but those are
the ones we've tested it on.

You may need to carry out some system configuration tasks before
running the Configure script for Perl.


=head2 Tools

The z/OS Unix Tools and Toys list may prove helpful and contains links
to ports of much of the software helpful for building Perl.
L<http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1toy.html>


=head2 Unpacking Perl distribution on OS/390

If using ftp remember to transfer the distribution in binary format.

Gunzip/gzip for OS/390 is discussed at:

  http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1ty1.html

To extract an ASCII tar archive on OS/390, try this:

   pax -o to=IBM-1047,from=ISO8859-1 -r < latest.tar

or

   zcat latest.tar.Z | pax -o to=IBM-1047,from=ISO8859-1 -r

If you get lots of errors of the form

 tar: FSUM7171 ...: cannot set uid/gid: EDC5139I Operation not permitted

you didn't read the above and tried to use tar instead of pax.  You'll
first have to remove the (now corrupt) perl directory

   rm -rf perl-...

and then use pax.

=head2 Setup and utilities for Perl on OS/390

Be sure that your yacc installation is in place including any necessary
parser template files. If you have not already done so then be sure to:

  cp /samples/yyparse.c /etc

This may also be a good time to ensure that your /etc/protocol file
and either your /etc/resolv.conf or /etc/hosts files are in place.
The IBM document that described such USS system setup issues was
SC28-1890-07 "OS/390 UNIX System Services Planning", in particular
Chapter 6 on customizing the OE shell.

GNU make for OS/390, which is recommended for the build of perl (as
well as building CPAN modules and extensions), is available from the
L</Tools>.

Some people have reported encountering "Out of memory!" errors while
trying to build Perl using GNU make binaries.  If you encounter such
trouble then try to download the source code kit and build GNU make
from source to eliminate any such trouble.  You might also find GNU make
(as well as Perl and Apache) in the red-piece/book "Open Source Software
for OS/390 UNIX", SG24-5944-00 from IBM.

If instead of the recommended GNU make you would like to use the system
supplied make program then be sure to install the default rules file
properly via the shell command:

    cp /samples/startup.mk /etc

and be sure to also set the environment variable _C89_CCMODE=1 (exporting
_C89_CCMODE=1 is also a good idea for users of GNU make).

You might also want to have GNU groff for OS/390 installed before
running the "make install" step for Perl.

There is a syntax error in the /usr/include/sys/socket.h header file
that IBM supplies with USS V2R7, V2R8, and possibly V2R9.  The problem with
the header file is that near the definition of the SO_REUSEPORT constant
there is a spurious extra '/' character outside of a comment like so:

 #define SO_REUSEPORT    0x0200    /* allow local address & port
                                      reuse */                    /

You could edit that header yourself to remove that last '/'.  Alternatively,
note that Language Environment (LE) APAR PQ39997 describes the problem, and
apply PTFs UQ46272 and UQ46271, which are the fixes (for R8 at least).
If left unattended, that syntax error will turn up as an inability for Perl
to build its "Socket" extension.

For successful testing you may need to turn on the sticky bit for your
world-writable /tmp directory if you have not already done so (see man chmod).
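
Once a perl is available, the state of the sticky bit can be checked with
a sketch like the one below.  The path C</tmp> and the octal masks are
the conventional Unix values and are assumptions about your system:

    use strict;
    use warnings;

    my $dir  = "/tmp";                  # directory to inspect (assumed path)
    my $mode = (stat $dir)[2];
    die "cannot stat $dir: $!\n" unless defined $mode;

    my $world_writable = ($mode & 0002)  ? 1 : 0;   # "other" write bit
    my $sticky         = ($mode & 01000) ? 1 : 0;   # sticky bit

    if ($world_writable && !$sticky) {
        print "$dir is world writable without the sticky bit;",
              " try: chmod a+t $dir\n";
    }
    else {
        printf "%s looks fine (mode %04o)\n", $dir, $mode & 07777;
    }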

=head2 Configure Perl on OS/390

Once you've unpacked the distribution, run "sh Configure" (see INSTALL
for a full discussion of the Configure options).  There is a "hints" file
for os390 that specifies the correct values for most things.  Some things
to watch out for include:

=over 4

=item *

A message of the form:

 (I see you are using the Korn shell.  Some ksh's blow up on
 Configure, mainly on older exotic systems.  If yours does, try the
 Bourne shell instead.)

is nothing to worry about at all.

=item *

Some of the parser default template files in /samples are needed in /etc.
In particular be sure that you at least copy /samples/yyparse.c to /etc
before running Perl's Configure.  This step ensures successful extraction
of EBCDIC versions of parser files such as perly.c and perly.h.
This has to be done before running Configure the first time.  If you failed
to do so then the easiest way to re-Configure Perl is to delete your
misconfigured build root and re-extract the source from the tar ball.
Then you must ensure that /etc/yyparse.c is properly in place before
attempting to re-run Configure.

=item *

This port will support dynamic loading, but it is not selected by
default.  If you would like to experiment with dynamic loading then
be sure to specify -Dusedl in the arguments to the Configure script.
See the comments in hints/os390.sh for more information on dynamic loading.
If you build with dynamic loading then you will need to add the
$archlibexp/CORE directory to your LIBPATH environment variable in order
for perl to work.  See the config.sh file for the value of $archlibexp.
If in trying to use Perl you see an error message similar to:

 CEE3501S The module libperl.dll was not found.
   From entry point __dllstaticinit at compile unit offset +00000194
   at

then your LIBPATH does not have the location of libperl.x and either
libperl.dll or libperl.so in it.  Add that directory to your LIBPATH and
proceed.

=item *

Do not turn on the compiler optimization flag "-O".  There is
a bug in either the optimizer or perl that causes perl to
not work correctly when the optimizer is on.

=item *

Some of the configuration files in /etc used by the
networking APIs are either missing or have the wrong
names.  In particular, make sure that there's either
an /etc/resolv.conf or an /etc/hosts, so that
gethostbyname() works, and make sure that the file
/etc/proto has been renamed to /etc/protocol (NOT
/etc/protocols, as used by other Unix systems).
You may have to look for things like HOSTNAME and DOMAINORIGIN
in the "//'SYS1.TCPPARMS(TCPDATA)'" PDS member in order to
properly set up your /etc networking files.

=back

=head2 Build, Test, Install Perl on OS/390

Simply put:

    sh Configure
    make
    make test

If everything looks OK (see the next section for test/IVP diagnosis) then:

    make install

This last step may or may not require UID=0 privileges depending
on how you answered the questions that Configure asked and whether
or not you have write access to the directories you specified.

=head2 Build Anomalies with Perl on OS/390

"Out of memory!" messages during the build of Perl are most often fixed
by rebuilding the GNU make utility for OS/390 from a source code kit.

Another memory limiting item to check is your MAXASSIZE parameter in your
'SYS1.PARMLIB(BPXPRMxx)' data set (note too that as of V2R8 address space
limits can be set on a per user ID basis in the USS segment of a RACF
profile).  People have reported successful builds of Perl with MAXASSIZE
parameters as small as 503316480 (and it may be possible to build Perl
with a MAXASSIZE smaller than that).

Within USS your /etc/profile or $HOME/.profile may limit your ulimit
settings.  Check that the following command returns reasonable values:

    ulimit -a

To conserve memory you should have your compiler modules loaded into the
Link Pack Area (LPA/ELPA) rather than in a link list or step lib.

If the c89 compiler complains of syntax errors during the build of the
Socket extension then be sure to fix the syntax error in the system
header /usr/include/sys/socket.h.

=head2 Testing Anomalies with Perl on OS/390

The "make test" step runs a Perl Verification Procedure, usually before
installation.  You might encounter STDERR messages even during a successful
run of "make test".  Here is a guide to some of the more commonly seen
anomalies:

=over 4

=item *

A message of the form:

 io/openpid...........CEE5210S The signal SIGHUP was received.
 CEE5210S The signal SIGHUP was received.
 CEE5210S The signal SIGHUP was received.
 ok

indicates that the t/io/openpid.t test of Perl has passed but done so
with extraneous messages on stderr from CEE.

=item *

A message of the form:

 lib/ftmp-security....File::Temp::_gettemp: Parent directory (/tmp/)
 is not safe (sticky bit not set when world writable?) at
 lib/ftmp-security.t line 100
 File::Temp::_gettemp: Parent directory (/tmp/) is not safe (sticky
 bit not set when world writable?) at lib/ftmp-security.t line 100
 ok

indicates a problem with the permissions on your /tmp directory within the HFS.
To correct that problem issue the command:

     chmod a+t /tmp

from an account with write access to the directory entry for /tmp.

=item *

Out of Memory!

The recent perl test suite is quite memory hungry. In addition to the comments
above on memory limitations it is also worth checking for _CEE_RUNOPTS
in your environment. Perl now has (in miniperlmain.c) a C #pragma
to set CEE run options, but the environment variable wins.

The C code asks for:

 #pragma runopts(HEAP(2M,500K,ANYWHERE,KEEP,8K,4K) STACK(,,ANY,) ALL31(ON))

The important parts of that are the second argument (the increment) to HEAP,
and allowing the stack to be "Above the (16M) line". If the heap
increment is too small, then when perl tries to create a "big" (400K+)
string (for example, when loading unicode/Name.pl), the string cannot fit
in a single segment and you get "Out of Memory!" - even if there is still
plenty of memory available.

A related issue arises when using perl's malloc. Perl's malloc uses C<sbrk()>
to get memory, and C<sbrk()> is limited to the first allocation, so in this
case something like:

  HEAP(8M,500K,ANYWHERE,KEEP,8K,4K)

is needed to get through the test suite.


=back

=head2 Installation Anomalies with Perl on OS/390

The installman script will try to run on OS/390.  There will be fewer errors
if you have a roff utility installed.  You can obtain GNU groff from the
Redbook SG24-5944-00 ftp site.

=head2 Usage Hints for Perl on OS/390

When using perl on OS/390 please keep in mind that the EBCDIC and ASCII
character sets are different.  See perlebcdic.pod for more on such character
set issues.  Perl builtin functions that may behave differently under
EBCDIC are also mentioned in the perlport.pod document.

Open Edition (UNIX System Services) from V2R8 onward does support
#!/path/to/perl script invocation.  There is a PTF available from
IBM for V2R7 that will allow shell/kernel support for #!.  USS
releases prior to V2R7 did not support the #! means of script invocation.
If you are running V2R6 or earlier then see:

    head `whence perldoc`

for an example of how to use the "eval exec" trick to ask the shell to
have Perl run your scripts on those older releases of Unix System Services.
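
The trick looks roughly like the sketch below. The interpreter path
C</usr/bin/perl> is an assumption; substitute the location of perl on your
system. A shell that does not understand C<#!> treats the first line as a
comment and runs the C<eval>, which re-executes the script under perl.
Perl itself parses the next two lines as one statement whose C<if 0>
condition is false, so it skips the C<exec>.

```perl
#!/usr/bin/perl
eval 'exec /usr/bin/perl -S $0 ${1+"$@"}'
    if 0;    # false under perl, so perl never runs the exec

# Ordinary Perl code follows.
my $greeting = "running under perl $]";
print "$greeting\n";
```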

If you are having trouble with square brackets then consider switching your
rlogin or telnet client.  Try to avoid older 3270 emulators and ISHELL for
working with Perl on USS.

=head2 Floating Point Anomalies with Perl on OS/390

There appears to be a bug in the floating point implementation on S/390
systems such that calling int() on the product of a number and a small
magnitude number is not the same as calling int() on the quotient of
that number and a large magnitude number.  For example, in the following
Perl code:

    my $x = 100000.0;
    my $y = int($x * 1e-5) * 1e5; # '0'
    my $z = int($x / 1e+5) * 1e5;  # '100000'
    print "\$y is $y and \$z is $z\n"; # $y is 0 and $z is 100000

Although one would expect the quantities $y and $z to be the same and equal
to 100000 they will differ and instead will be 0 and 100000 respectively.
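
If your code only needs the nearest whole multiple, one defensive approach
is to round rather than truncate, so that a product landing just below an
integer still gives the expected result. This is a sketch of a workaround,
not a fix for the underlying behaviour; the helper name C<round_nearest>
is hypothetical and it assumes non-negative input.

```perl
use strict;
use warnings;

# Hypothetical helper: round a non-negative number to the nearest
# integer instead of truncating it with a bare int().
sub round_nearest {
    my ($n) = @_;
    return int($n + 0.5);
}

my $x = 100000.0;
my $y = round_nearest($x * 1e-5) * 1e5;
my $z = round_nearest($x / 1e+5) * 1e5;
print "\$y is $y and \$z is $z\n";
```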

The problem can be further examined in a roughly equivalent C program:

    #include <stdio.h>
    #include <math.h>
    main()
    {
    double r1,r2;
    double x = 100000.0;
    double y = 0.0;
    double z = 0.0;
    x = 100000.0 * 1e-5;
    r1 = modf (x,&y);
    x = 100000.0 / 1e+5;
    r2 = modf (x,&z);
    printf("y is %e and z is %e\n",y*1e5,z*1e5);
    /* y is 0.000000e+00 and z is 1.000000e+05 (with c89) */
    }

=head2 Modules and Extensions for Perl on OS/390

Pure Perl (that is, non-xs) modules may be installed via the usual:

    perl Makefile.PL
    make
    make test
    make install

If you built perl with dynamic loading capability then that would also
be the way to build xs based extensions.  However, if you built perl with
the default static linking you can still build xs based extensions for OS/390
but you will need to follow the instructions in ExtUtils::MakeMaker for
building statically linked perl binaries.  In the simplest configurations
building a static perl + xs extension boils down to:

    perl Makefile.PL
    make
    make perl
    make test
    make install
    make -f Makefile.aperl inst_perl MAP_TARGET=perl

In most cases people have reported better results with GNU make rather
than the system's /bin/make program, whether for plain modules or for
xs based extensions.

If the make process encounters trouble with either compilation or
linking then try setting the _C89_CCMODE to 1.  Assuming sh is your
login shell then run:

    export _C89_CCMODE=1

If tcsh is your login shell then use the setenv command.

=head1 AUTHORS

David Fiander and Peter Prymmer with thanks to Dennis Longnecker
and William Raffloer for valuable reports, LPAR and PTF feedback.
Thanks to Mike MacIsaac and Egon Terwedow for SG24-5944-00.
Thanks to Ignasi Roca for pointing out the floating point problems.
Thanks to John Goodyear for dynamic loading help.

=head1 SEE ALSO

L<INSTALL>, L<perlport>, L<perlebcdic>, L<ExtUtils::MakeMaker>.

 http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1toy.html

 http://www.redbooks.ibm.com/redbooks/SG245944.html

 http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1ty1.html#opensrc

 http://www.xray.mpe.mpg.de/mailing-lists/perl-mvs/

 http://publibz.boulder.ibm.com:80/cgi-bin/bookmgr_OS390/BOOKS/ceea3030/

 http://publibz.boulder.ibm.com:80/cgi-bin/bookmgr_OS390/BOOKS/CBCUG030/

=head2 Mailing list for Perl on OS/390

If you are interested in the z/OS (formerly known as OS/390)
and POSIX-BC (BS2000) ports of Perl then see the perl-mvs mailing list.
To subscribe, send an empty message to perl-mvs-subscribe@perl.org.

See also:

    http://lists.perl.org/list/perl-mvs.html

There are web archives of the mailing list at:

    http://www.xray.mpe.mpg.de/mailing-lists/perl-mvs/
    http://archive.develooper.com/perl-mvs@perl.org/

=head1 HISTORY

This document was originally written by David Fiander for the 5.005
release of Perl.

This document was podified for the 5.005_03 release of Perl 11 March 1999.

Updated 12 November 2000 for the 5.7.1 release of Perl.

Updated 15 January 2001 for the 5.7.1 release of Perl.

Updated 24 January 2001 to mention dynamic loading.

Updated 12 March 2001 to mention //'SYS1.TCPPARMS(TCPDATA)'.

Updated 28 November 2001 for broken URLs.

=cut

=head1 NAME

perlpacktut - tutorial on C<pack> and C<unpack>

=head1 DESCRIPTION

C<pack> and C<unpack> are two functions for transforming data according
to a user-defined template, between the guarded way Perl stores values
and some well-defined representation as might be required in the 
environment of a Perl program. Unfortunately, they're also two of 
the most misunderstood and most often overlooked functions that Perl
provides. This tutorial will demystify them for you.


=head1 The Basic Principle

Most programming languages don't shelter the memory where variables are
stored. In C, for instance, you can take the address of some variable,
and the C<sizeof> operator tells you how many bytes are allocated to
the variable. Using the address and the size, you may access the storage
to your heart's content.

In Perl, you just can't access memory at random, but the structural and
representational conversion provided by C<pack> and C<unpack> is an
excellent alternative. The C<pack> function converts values to a byte
sequence containing representations according to a given specification,
the so-called "template" argument. C<unpack> is the reverse process,
deriving some values from the contents of a string of bytes. (Be cautioned,
however, that not all that has been packed together can be neatly unpacked - 
a very common experience as seasoned travellers are likely to confirm.)

Why, you may ask, would you need a chunk of memory containing some values
in binary representation? One good reason is input and output accessing
some file, a device, or a network connection, whereby this binary
representation is either forced on you or will give you some benefit
in processing. Another cause is passing data to some system call that
is not available as a Perl function: C<syscall> requires you to provide
parameters stored in the way it happens in a C program. Even text processing 
(as shown in the next section) may be simplified with judicious usage 
of these two functions.

To see how (un)packing works, we'll start with a simple template
code where the conversion is in low gear: between the contents of a byte
sequence and a string of hexadecimal digits. Let's use C<unpack>, since
this is likely to remind you of a dump program, or some desperate last
message unfortunate programs are wont to throw at you before they expire
into the wild blue yonder. Assuming that the variable C<$mem> holds a 
sequence of bytes that we'd like to inspect without assuming anything 
about its meaning, we can write

   my( $hex ) = unpack( 'H*', $mem );
   print "$hex\n";

whereupon we might see something like this, with each pair of hex digits
corresponding to a byte:

   41204d414e204120504c414e20412043414e414c2050414e414d41

What was in this chunk of memory? Numbers, characters, or a mixture of
both? Assuming that we're on a computer where ASCII (or some similar)
encoding is used: hexadecimal values in the range C<0x41> - C<0x5A>
indicate an uppercase letter, and C<0x20> encodes a space. So we might
assume it is a piece of text, which some are able to read like a tabloid;
but others will have to get hold of an ASCII table and relive that
first-grader feeling. Not caring too much about which way to read this,
we note that C<unpack> with the template code C<H> converts the contents
of a sequence of bytes into the customary hexadecimal notation. Since
"a sequence of" is a pretty vague indication of quantity, C<H> has been
defined to convert just a single hexadecimal digit unless it is followed
by a repeat count. An asterisk for the repeat count means to use whatever
remains.

The inverse operation - packing byte contents from a string of hexadecimal
digits - is just as easily written. For instance:

   my $s = pack( 'H2' x 10, 30..39 );
   print "$s\n";

Since we feed a list of ten 2-digit hexadecimal strings to C<pack>, the
pack template should contain ten pack codes. If this is run on a computer
with ASCII character coding, it will print C<0123456789>.
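
As a quick check, the two directions can be combined into a round trip.
The sketch below assumes an ASCII platform; the sample text is the one
encoded in the hex dump shown earlier.

```perl
use strict;
use warnings;

my $mem = 'A MAN A PLAN A CANAL PANAMA';

# Bytes to hex digits: 'H*' consumes the whole string, high nybble first.
my ($hex) = unpack('H*', $mem);
print "$hex\n";

# Hex digits back to bytes: a single 'H*' takes all digits of one item.
my $copy = pack('H*', $hex);
print "$copy\n";
```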

=head1 Packing Text

Let's suppose you've got to read in a data file like this:

    Date      |Description                | Income|Expenditure
    01/24/2001 Zed's Camel Emporium                    1147.99
    01/28/2001 Flea spray                                24.99
    01/29/2001 Camel rides to tourists     1235.00

How do we do it? You might think first to use C<split>; however, since
C<split> collapses blank fields, you'll never know whether a record was
income or expenditure. Oops. Well, you could always use C<substr>:

    while (<>) { 
        my $date   = substr($_,  0, 11);
        my $desc   = substr($_, 12, 27);
        my $income = substr($_, 40,  7);
        my $expend = substr($_, 52,  7);
        ...
    }

It's not really a barrel of laughs, is it? In fact, it's worse than it
may seem; the eagle-eyed may notice that the first field should only be
10 characters wide, and the error has propagated right through the other
numbers - which we've had to count by hand. So it's error-prone as well
as horribly unfriendly.

Or maybe we could use regular expressions:

    while (<>) { 
        my($date, $desc, $income, $expend) = 
            m|(\d\d/\d\d/\d{4}) (.{27}) (.{7})(.*)|;
        ...
    }

Urgh. Well, it's a bit better, but - well, would you want to maintain
that?

Hey, isn't Perl supposed to make this sort of thing easy? Well, it does,
if you use the right tools. C<pack> and C<unpack> are designed to help
you out when dealing with fixed-width data like the above. Let's have a
look at a solution with C<unpack>:

    while (<>) { 
        my($date, $desc, $income, $expend) = unpack("A10xA27xA7A*", $_);
        ...
    }

That looks a bit nicer; but we've got to take apart that weird template.
Where did I pull that out of? 

OK, let's have a look at some of our data again; in fact, we'll include
the headers, and a handy ruler so we can keep track of where we are.

             1         2         3         4         5        
    1234567890123456789012345678901234567890123456789012345678
    Date      |Description                | Income|Expenditure
    01/28/2001 Flea spray                                24.99
    01/29/2001 Camel rides to tourists     1235.00

From this, we can see that the date column stretches from column 1 to
column 10 - ten characters wide. The C<pack>-ese for "character" is
C<A>, and ten of them are C<A10>. So if we just wanted to extract the
dates, we could say this:

    my($date) = unpack("A10", $_);

OK, what's next? Between the date and the description is a blank column;
we want to skip over that. The C<x> template means "skip forward", so we
want one of those. Next, we have another batch of characters, from 12 to
38. That's 27 more characters, hence C<A27>. (Don't make the fencepost
error - there are 27 characters between 12 and 38, not 26. Count 'em!)

Now we skip another character and pick up the next 7 characters:

    my($date,$description,$income) = unpack("A10xA27xA7", $_);

Now comes the clever bit. Lines in our ledger which are just income and
not expenditure might end at column 46. Hence, we don't want to tell our
C<unpack> pattern that we B<need> to find another 12 characters; we'll
just say "if there's anything left, take it". As you might guess from
regular expressions, that's what the C<*> means: "use everything
remaining".

=over 3

=item *

Be warned, though, that unlike regular expressions, if the C<unpack>
template doesn't match the incoming data, Perl will scream and die.

=back


Hence, putting it all together:

    my ($date, $description, $income, $expend) =
        unpack("A10xA27xA7xA*", $_);
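
To try this without a data file, the sketch below builds two records in
the ledger layout with C<sprintf> and then takes them apart again. The
sample values are illustrative; the trailing newline mimics a line read
with C<< <> >>, and it is what the last C<x> skips on a record that ends
after the income column.

```perl
use strict;
use warnings;

# Build two records in the ledger layout: date in columns 1-10,
# description in 12-38, income in 40-46, expenditure in 48-58.
my @lines = (
    sprintf("%-10s %-27s %7s %11s\n",
            '01/28/2001', 'Flea spray', '', '24.99'),
    sprintf("%-10s %-27s %7s\n",
            '01/29/2001', 'Camel rides to tourists', '1235.00'),
);

my @rows;
for my $line (@lines) {
    my ($date, $desc, $income, $expend) = unpack("A10xA27xA7xA*", $line);
    $expend =~ s/^\s+//;    # A strips trailing whitespace only
    push @rows, [ $date, $desc, $income, $expend ];
    print join('|', $date, $desc, $income, $expend), "\n";
}
```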

Now, that's our data parsed. I suppose what we might want to do now is
total up our income and expenditure, and add another line to the end of
our ledger - in the same format - saying how much we've brought in and
how much we've spent:

    use POSIX ();    # for POSIX::strftime

    my ($tot_income, $tot_expend) = (0, 0);

    while (<>) {
        my ($date, $desc, $income, $expend) =
            unpack("A10xA27xA7xA*", $_);
        $tot_income += $income;
        $tot_expend += $expend;
    }

    $tot_income = sprintf("%.2f", $tot_income); # Get them into
    $tot_expend = sprintf("%.2f", $tot_expend); # "financial" format

    my $date = POSIX::strftime("%m/%d/%Y", localtime);

    # OK, let's go:

    print pack("A10xA27xA7xA*", $date, "Totals",
        $tot_income, $tot_expend);

Oh, hmm. That didn't quite work. Let's see what happened:

    01/24/2001 Zed's Camel Emporium                     1147.99
    01/28/2001 Flea spray                                 24.99
    01/29/2001 Camel rides to tourists     1235.00
    03/23/2001Totals                     1235.001172.98

OK, it's a start, but what happened to the spaces? We put C<x>, didn't
we? Shouldn't it skip forward? Let's look at what L<perlfunc/pack> says:

    x   A null byte.

Urgh. No wonder. There's a big difference between "a null byte",
character zero, and "a space", character 32. Perl's put something
between the date and the description - but unfortunately, we can't see
it! 

What we actually need to do is expand the width of the fields. The C<A>
format pads any non-existent characters with spaces, so we can use the
additional spaces to line up our fields, like this:

    print pack("A11 A28 A8 A*", $date, "Totals",
        $tot_income, $tot_expend);

(Note that you can put spaces in the template to make it more readable,
but they don't translate to spaces in the output.) Here's what we got
this time:

    01/24/2001 Zed's Camel Emporium                     1147.99
    01/28/2001 Flea spray                                 24.99
    01/29/2001 Camel rides to tourists     1235.00
    03/23/2001 Totals                      1235.00 1172.98

That's a bit better, but we still have that last column which needs to
be moved further over. There's an easy way to fix this up:
unfortunately, we can't get C<pack> to right-justify our fields, but we
can get C<sprintf> to do it:

    $tot_income = sprintf("%.2f", $tot_income); 
    $tot_expend = sprintf("%12.2f", $tot_expend);
    $date = POSIX::strftime("%m/%d/%Y", localtime); 
    print pack("A11 A28 A8 A*", $date, "Totals",
        $tot_income, $tot_expend);

This time we get the right answer:

    01/28/2001 Flea spray                                 24.99
    01/29/2001 Camel rides to tourists     1235.00
    03/23/2001 Totals                      1235.00      1172.98

So that's how we consume and produce fixed-width data. Let's recap what
we've seen of C<pack> and C<unpack> so far:

=over 3

=item *

Use C<pack> to go from several pieces of data to one fixed-width
version; use C<unpack> to turn a fixed-width-format string into several
pieces of data. 

=item *

The pack format C<A> means "any character"; if you're C<pack>ing and
you've run out of things to pack, C<pack> will fill the rest up with
spaces.

=item *

C<x> means "skip a byte" when C<unpack>ing; when C<pack>ing, it means
"introduce a null byte" - that's probably not what you mean if you're
dealing with plain text.

=item *

You can follow the formats with numbers to say how many characters
should be affected by that format: C<A12> means "take 12 characters";
C<x6> means "skip 6 bytes" or "character 0, 6 times".

=item *

Instead of a number, you can use C<*> to mean "consume everything else
left". 

B<Warning>: when packing multiple pieces of data, C<*> only means
"consume all of the current piece of data". That's to say

    pack("A*A*", $one, $two)

packs all of C<$one> into the first C<A*> and then all of C<$two> into
the second. This is a general principle: each format character
corresponds to one piece of data to be C<pack>ed.

=back
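
The points in this recap can be checked with a few short calls. The
sketch below uses made-up strings and prints the packed results as hex
digits so that null bytes and padding spaces are visible.

```perl
use strict;
use warnings;

# 'x' inserts a null byte when packing; 'A' pads with spaces.
my $with_nulls  = pack("A3xA3", "ab", "cd");
my $with_spaces = pack("A4 A4", "ab", "cd");

# Show the bytes as hex so that the difference is visible.
print unpack("H*", $with_nulls),  "\n";
print unpack("H*", $with_spaces), "\n";

# '*' takes all of the current item, however long it is.
my $joined = pack("A*A*", "camel", "flea");
print "$joined\n";
```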



=head1 Packing Numbers

So much for textual data. Let's get onto the meaty stuff that C<pack>
and C<unpack> are best at: handling binary formats for numbers. There is,
of course, not just one binary format - life would be too simple - but
Perl will do all the finicky labor for you.


=head2 Integers

Packing and unpacking numbers implies conversion to and from some
I<specific> binary representation. Leaving floating point numbers
aside for the moment, the salient properties of any such representation
are:

=over 4

=item *

the number of bytes used for storing the integer,

=item *

whether the contents are interpreted as a signed or unsigned number,

=item *

the byte ordering: whether the first byte is the least or most
significant byte (or: little-endian or big-endian, respectively).

=back

So, for instance, to pack 20302 to a signed 16 bit integer in your
computer's representation you write

   my $ps = pack( 's', 20302 );

Again, the result is a string, now containing 2 bytes. If you print 
this string (which is, generally, not recommended) you might see
C<ON> or C<NO> (depending on your system's byte ordering) - or something
entirely different if your computer doesn't use ASCII character encoding.
Unpacking C<$ps> with the same template returns the original integer value:

   my( $s ) = unpack( 's', $ps );

This is true for all numeric template codes. But don't expect miracles:
if the packed value exceeds the allotted byte capacity, high order bits
are silently discarded, and unpack certainly won't be able to pull them
back out of some magic hat. And, when you pack using a signed template
code such as C<s>, an excess value may result in the sign bit
getting set, and unpacking this will smartly return a negative value.
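
Both effects are easy to reproduce. In this sketch the values 40000 and
70000 are arbitrary numbers chosen because they do not fit into a signed
and an unsigned 16-bit integer respectively.

```perl
use strict;
use warnings;

# 40000 needs the 16th bit, which 's' interprets as the sign bit.
my ($signed) = unpack('s', pack('s', 40000));

# 70000 needs 17 bits; the high-order bit is discarded by 'S'.
my ($unsigned) = unpack('S', pack('S', 70000));

print "$signed $unsigned\n";
```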

16 bits won't get you too far with integers, but there are C<l> and C<L>
for signed and unsigned 32-bit integers. And if this is not enough and
your system supports 64 bit integers you can push the limits much closer
to infinity with pack codes C<q> and C<Q>. A notable exception is provided
by pack codes C<i> and C<I> for signed and unsigned integers of the 
"local custom" variety: Such an integer will take up as many bytes as
a local C compiler returns for C<sizeof(int)>, but it'll use I<at least>
32 bits.

Each of the integer pack codes C<sSlLqQ> results in a fixed number of bytes,
no matter where you execute your program. This may be useful for some 
applications, but it does not provide for a portable way to pass data 
structures between Perl and C programs (bound to happen when you call 
XS extensions or the Perl function C<syscall>), or when you read or
write binary files. What you'll need in this case are template codes that
depend on what your local C compiler compiles when you code C<short> or
C<unsigned long>, for instance. These codes and their corresponding
byte lengths are shown in the table below.  Since the C standard leaves
much leeway with respect to the relative sizes of these data types, actual
values may vary, and that's why the values are given as expressions in
C and Perl. (If you'd like to use values from C<%Config> in your program
you have to import it with C<use Config>.)

   signed unsigned  byte length in C   byte length in Perl       
     s!     S!      sizeof(short)      $Config{shortsize}
     i!     I!      sizeof(int)        $Config{intsize}
     l!     L!      sizeof(long)       $Config{longsize}
     q!     Q!      sizeof(long long)  $Config{longlongsize}

The C<i!> and C<I!> codes aren't different from C<i> and C<I>; they are
tolerated for completeness' sake.
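
To see what your own perl uses, you can compare the length of a packed
value with the corresponding C<%Config> entry. This sketch leaves out
C<q!> because that code is only available when perl supports 64-bit
integers.

```perl
use strict;
use warnings;
use Config;

# Compare the packed length of each native code with %Config.
my %size_key = ('s!' => 'shortsize', 'i!' => 'intsize', 'l!' => 'longsize');
for my $code (sort keys %size_key) {
    my $packed = length(pack($code, 0));
    printf "%-2s packs into %d bytes (Config says %d)\n",
        $code, $packed, $Config{ $size_key{$code} };
}

# The fixed-size codes do not vary between platforms.
printf "s=%d l=%d\n", length(pack('s', 0)), length(pack('l', 0));
```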


=head2 Unpacking a Stack Frame

Requesting a particular byte ordering may be necessary when you work with
binary data coming from some specific architecture whereas your program could
run on a totally different system. As an example, assume you have 24 bytes
containing a stack frame as it happens on an Intel 8086:

      +---------+        +----+----+               +---------+
 TOS: |   IP    |  TOS+4:| FL | FH | FLAGS  TOS+14:|   SI    |
      +---------+        +----+----+               +---------+
      |   CS    |        | AL | AH | AX            |   DI    |
      +---------+        +----+----+               +---------+
                         | BL | BH | BX            |   BP    |
                         +----+----+               +---------+
                         | CL | CH | CX            |   DS    |
                         +----+----+               +---------+
                         | DL | DH | DX            |   ES    |
                         +----+----+               +---------+

First, we note that this time-honored 16-bit CPU uses little-endian order,
and that's why the low order byte is stored at the lower address. To
unpack such a (unsigned) short we'll have to use code C<v>. A repeat
count unpacks all 12 shorts:

   my( $ip, $cs, $flags, $ax, $bx, $cx, $dx, $si, $di, $bp, $ds, $es ) =
     unpack( 'v12', $frame );

Alternatively, we could have used C<C> to unpack the individually
accessible byte registers FL, FH, AL, AH, etc.:

   my( $fl, $fh, $al, $ah, $bl, $bh, $cl, $ch, $dl, $dh ) =
     unpack( 'C10', substr( $frame, 4, 10 ) );

It would be nice if we could do this in one fell swoop: unpack a short,
back up a little, and then unpack 2 bytes. Since Perl I<is> nice, it
proffers the template code C<X> to back up one byte. Putting this all
together, we may now write:

   my( $ip, $cs,
       $flags,$fl,$fh,
       $ax,$al,$ah, $bx,$bl,$bh, $cx,$cl,$ch, $dx,$dl,$dh, 
       $si, $di, $bp, $ds, $es ) =
   unpack( 'v2' . ('vXXCC' x 5) . 'v5', $frame );

(The clumsy construction of the template can be avoided - just read on!)  

We've taken some pains to construct the template so that it matches
the contents of our frame buffer. Otherwise we'd either get undefined values,
or C<unpack> could not unpack all. If C<pack> runs out of items, it will
supply null strings (which are coerced into zeroes whenever the pack code
says so).
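
The combined template can be tried on a frame built with C<pack>. The
register contents in this sketch are invented purely for illustration.

```perl
use strict;
use warnings;

# Invented register contents, in frame order:
# IP CS FLAGS AX BX CX DX SI DI BP DS ES
my @regs  = (0x0100, 0x2000, 0x0246, 0x1234, 0x5678, 0x9ABC, 0xDEF0,
             0x0001, 0x0002, 0x0003, 0x0004, 0x0005);
my $frame = pack('v12', @regs);    # 24 bytes, little-endian

my( $ip, $cs,
    $flags,$fl,$fh,
    $ax,$al,$ah, $bx,$bl,$bh, $cx,$cl,$ch, $dx,$dl,$dh,
    $si, $di, $bp, $ds, $es ) =
  unpack( 'v2' . ('vXXCC' x 5) . 'v5', $frame );

printf "AX=%04X AL=%02X AH=%02X\n", $ax, $al, $ah;
```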


=head2 How to Eat an Egg on a Net

The pack code for big-endian (high order byte at the lowest address) is
C<n> for 16 bit and C<N> for 32 bit integers. You use these codes
if you know that your data comes from a compliant architecture, but,
surprisingly enough, you should also use these pack codes if you
exchange binary data, across the network, with some system that you
know next to nothing about. The simple reason is that this
order has been chosen as the I<network order>, and all standard-fearing
programs ought to follow this convention. (This is, of course, a stern
backing for one of the Lilliputian parties and may well influence the
political development there.) So, if the protocol expects you to send
a message by sending the length first, followed by just so many bytes,
you could write:

   my $buf = pack( 'N', length( $msg ) ) . $msg;

or even:

   my $buf = pack( 'NA*', length( $msg ), $msg );

and pass C<$buf> to your send routine. Some protocols demand that the
count should include the length of the count itself: then just add 4
to the data length. (But make sure to read L</"Lengths and Widths"> before
you really code this!)
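
For the receiving side, the same code reads the count back. This is a
minimal sketch with an invented ASCII message, so that the number of
characters equals the number of bytes.

```perl
use strict;
use warnings;

my $msg = 'Hello, camel';    # plain ASCII, so characters equal bytes

# Sender: 4-byte big-endian length, then the payload.
my $buf = pack('N', length($msg)) . $msg;

# Receiver: read the length, then take that many bytes after it.
my ($len)   = unpack('N', $buf);
my $payload = substr($buf, 4, $len);
print "$len: $payload\n";
```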


=head2 Byte-order modifiers

In the previous sections we've learned how to use C<n>, C<N>, C<v> and
C<V> to pack and unpack integers with big- or little-endian byte-order.
While this is nice, it's still rather limited because it leaves out all
kinds of signed integers as well as 64-bit integers. For example, if you
wanted to unpack a sequence of signed big-endian 16-bit integers in a
platform-independent way, you would have to write:

   my @data = unpack 's*', pack 'S*', unpack 'n*', $buf;

This is ugly. As of Perl 5.9.2, there's a much nicer way to express your
desire for a certain byte-order: the C<E<gt>> and C<E<lt>> modifiers.
C<E<gt>> is the big-endian modifier, while C<E<lt>> is the little-endian
modifier. Using them, we could rewrite the above code as:

   my @data = unpack 's>*', $buf;

As you can see, the "big end" of the arrow touches the C<s>, which is a
nice way to remember that C<E<gt>> is the big-endian modifier. The same
obviously works for C<E<lt>>, where the "little end" touches the code.

You will probably find these modifiers even more useful if you have
to deal with big- or little-endian C structures. Be sure to read
L</"Packing and Unpacking C Structures"> for more on that.
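
The two forms can be compared directly. The sketch below needs a perl
that supports the modifiers (5.10 or later); the buffer contents are
sample values chosen to include negative numbers.

```perl
use 5.010;
use strict;
use warnings;

# Three big-endian 16-bit values: 1, 0xFFFF and 0x8000.
my $buf = pack('n*', 1, 0xFFFF, 0x8000);

my @old = unpack 's*', pack 'S*', unpack 'n*', $buf;
my @new = unpack 's>*', $buf;

print "@old\n";
print "@new\n";
```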


=head2 Floating point Numbers

For packing floating point numbers you have the choice between the
pack codes C<f>, C<d>, C<F> and C<D>. C<f> and C<d> pack into (or unpack
from) single-precision or double-precision representation as it is provided
by your system. If your system supports it, C<D> can be used to pack and
unpack (C<long double>) values, which can offer even more resolution
than C<f> or C<d>.  B<Note that there are different long double formats.>

C<F> packs an C<NV>, which is the floating point type used by Perl
internally.

There is no such thing as a network representation for reals, so if
you want to send your real numbers across computer boundaries, you'd
better stick to text representation, possibly using the hexadecimal
float format (avoiding the decimal conversion loss), unless you're
absolutely sure what's on the other end of the line. For the even more
adventuresome, you can use the byte-order modifiers from the previous
section also on floating point codes.
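
A short experiment shows the difference in precision and the effect of
the byte-order modifiers. It is only a sketch: the exact digits printed
depend on your platform's floating point formats, and the modifiers
require perl 5.10 or later.

```perl
use strict;
use warnings;

my $value = 0.1;

# Round trip through the system's double-precision format.
my ($d) = unpack('d', pack('d', $value));

# A single-precision float keeps fewer bits, so the value changes.
my ($f) = unpack('f', pack('f', $value));

printf "d: %.17g\nf: %.17g\n", $d, $f;

# With byte-order modifiers the two packed forms are byte-reversed.
my $same = pack('d>', $value) eq scalar reverse pack('d<', $value);
print $same ? "reversed\n" : "not reversed\n";
```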



=head1 Exotic Templates


=head2 Bit Strings

Bits are the atoms in the memory world. Access to individual bits may
have to be used either as a last resort or because it is the most
convenient way to handle your data. Bit string (un)packing converts
between strings containing a series of C<0> and C<1> characters and
a sequence of bytes each containing a group of 8 bits. This is almost
as simple as it sounds, except that there are two ways the contents of
a byte may be written as a bit string. Let's have a look at an annotated
byte:

     7 6 5 4 3 2 1 0
   +-----------------+
   | 1 0 0 0 1 1 0 0 |
   +-----------------+
    MSB           LSB

It's egg-eating all over again: Some think that as a bit string this should
be written "10001100" i.e. beginning with the most significant bit, others
insist on "00110001". Well, Perl isn't biased, so that's why we have two bit
string codes:

   $byte = pack( 'B8', '10001100' ); # start with MSB
   $byte = pack( 'b8', '00110001' ); # start with LSB

It is not possible to pack or unpack bit fields - just integral bytes.
C<pack> always starts at the next byte boundary and "rounds up" to the
next multiple of 8 by adding zero bits as required. (If you do want bit
fields, there is L<perlfunc/vec>. Or you could implement bit field 
handling at the character string level, using split, substr, and
concatenation on unpacked bit strings.)
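
Both points can be verified in a few lines. The sketch below reuses the
annotated byte from above and adds a three-bit string to show the padding.

```perl
use strict;
use warnings;

# Both templates describe the same byte, 0x8C.
my $msb_first = pack('B8', '10001100');
my $lsb_first = pack('b8', '00110001');
printf "%02X %02X\n", ord($msb_first), ord($lsb_first);

# Fewer than 8 bits are padded with zero bits up to a whole byte.
my $padded = pack('B3', '101');
printf "%s (%d byte)\n", unpack('B8', $padded), length($padded);
```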

To illustrate unpacking for bit strings, we'll decompose a simple
status register (a "-" stands for a "reserved" bit):

   +-----------------+-----------------+
   | S Z - A - P - C | - - - - O D I T |
   +-----------------+-----------------+
    MSB           LSB MSB           LSB

Converting these two bytes to a string can be done with the unpack 
template C<'b16'>. To obtain the individual bit values from the bit
string we use C<split> with the "empty" separator pattern which dissects
into individual characters. Bit values from the "reserved" positions are
simply assigned to C<undef>, a convenient notation for "I don't care where
this goes".

   ($carry, undef, $parity, undef, $auxcarry, undef, $zero, $sign,
    $trace, $interrupt, $direction, $overflow) =
      split( //, unpack( 'b16', $status ) );

We could have used an unpack template C<'b12'> just as well, since the
last 4 bits can be ignored anyway. 
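
Here is the same decomposition applied to a concrete value, using the
shorter C<'b12'> template. The register contents are invented for the
sketch: Z and C are set in the first byte, O and I in the second.

```perl
use strict;
use warnings;

# Invented register contents: first byte 01000001, second 00001010.
my $status = "\x41\x0A";

my ($carry, undef, $parity, undef, $auxcarry, undef, $zero, $sign,
    $trace, $interrupt, $direction, $overflow) =
      split( //, unpack( 'b12', $status ) );

print "carry=$carry zero=$zero interrupt=$interrupt overflow=$overflow\n";
```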


=head2 Uuencoding

Another odd-man-out in the template alphabet is C<u>, which packs a
"uuencoded string". ("uu" is short for Unix-to-Unix.) Chances are that
you won't ever need this encoding technique which was invented to overcome
the shortcomings of old-fashioned transmission media that supported nothing
but simple ASCII data. The essential recipe is simple: Take three
bytes, or 24 bits. Split them into 4 six-packs, adding a space (0x20) to 
each. Repeat until all of the data is blended. Fold groups of 4 bytes into 
lines no longer than 60 and garnish them in front with the original byte count 
(incremented by 0x20) and a C<"\n"> at the end. - The C<pack> chef will
prepare this for you, a la minute, when you select pack code C<u> on the menu:

   my $uubuf = pack( 'u', $bindat );

A repeat count after C<u> sets the number of bytes to put into an
uuencoded line, which is the maximum of 45 by default, but could be
set to some (smaller) integer multiple of three. C<unpack> simply ignores
the repeat count.
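
A small round trip illustrates both the encoding and the repeat count.
The input strings are arbitrary sample data.

```perl
use strict;
use warnings;

my $bindat = "abc";
my $uubuf  = pack('u', $bindat);
print $uubuf;                      # one line, ending in "\n"

# unpack reverses the encoding.
my ($decoded) = unpack('u', $uubuf);

# A repeat count of 3 puts three input bytes on each output line.
my @lines = split /\n/, pack('u3', "abcdef");
print scalar(@lines), " lines\n";
```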


=head2 Doing Sums

An even stranger template code is C<%>E<lt>I<number>E<gt>. First, because 
it's used as a prefix to some other template code. Second, because it
cannot be used in C<pack> at all. Third, because in C<unpack> it doesn't return
the data as defined by the template code it precedes. Instead it'll give you an
integer of I<number> bits that is computed from the data value by 
doing sums. For numeric unpack codes, no big feat is achieved:

    my $buf = pack( 'iii', 100, 20, 3 );
    print unpack( '%32i3', $buf ), "\n";  # prints 123

For string values, C<%> returns the sum of the byte values saving
you the trouble of a sum loop with C<substr> and C<ord>:

    print unpack( '%32A*', "\x01\x10" ), "\n";  # prints 17

Although the C<%> code is documented as returning a "checksum":
don't put your trust in such values! Even when applied to a small number
of bytes, they won't guarantee a noticeable Hamming distance.

In connection with C<b> or C<B>, C<%> simply adds bits, and this can be put
to good use to count set bits efficiently:

    my $bitcount = unpack( '%32b*', $mask );

And an even parity bit can be determined like this:

    my $evenparity = unpack( '%1b*', $mask );
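
For a concrete mask the three kinds of sums look like this. The mask
value is made up for the sketch and has nine bits set.

```perl
use strict;
use warnings;

my $mask = "\xFF\x01";    # nine bits set in total

my $bitcount   = unpack('%32b*', $mask);   # number of set bits
my $evenparity = unpack('%1b*',  $mask);   # bit count modulo 2
my $bytesum    = unpack('%32C*', $mask);   # sum of the byte values

print "bits=$bitcount parity=$evenparity sum=$bytesum\n";
```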


=head2 Unicode

Unicode is a character set that can represent most characters in most of
the world's languages, providing room for over one million different
characters. Unicode 3.1 specifies 94,140 characters: The Basic Latin
characters are assigned to the numbers 0 - 127. The Latin-1 Supplement with
characters that are used in several European languages is in the next
range, up to 255. After some more Latin extensions we find the character
sets from languages using non-Roman alphabets, interspersed with a
variety of symbol sets such as currency symbols, Zapf Dingbats or Braille.
(You might want to visit L<http://www.unicode.org/> for a look at some of
them - my personal favourites are Telugu and Kannada.)

The Unicode character set associates characters with integers. Encoding
these numbers in an equal number of bytes would more than double the
requirements for storing texts written in Latin alphabets.
The UTF-8 encoding avoids this by storing the most common (from a western
point of view) characters in a single byte while encoding the rarer
ones in three or more bytes.

Perl uses UTF-8, internally, for most Unicode strings.

So what has this got to do with C<pack>? Well, if you want to compose a
Unicode string (that is internally encoded as UTF-8), you can do so by
using template code C<U>. As an example, let's produce the Euro currency
symbol (code number 0x20AC):

   $UTF8{Euro} = pack( 'U', 0x20AC );
   # Equivalent to: $UTF8{Euro} = "\x{20ac}";

Inspecting C<$UTF8{Euro}> shows that it contains 3 bytes:
"\xe2\x82\xac". However, it contains only 1 character, number 0x20AC.
The round trip can be completed with C<unpack>:

   $Unicode{Euro} = unpack( 'U', $UTF8{Euro} );

Unpacking using the C<U> template code also works on UTF-8 encoded byte
strings.
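
The difference between characters and bytes can be made visible with a
short sketch. It assumes nothing beyond core Perl; the variable names
C<$euro> and C<$bytes> are only illustrative, and C<utf8::encode> is used
here merely to obtain a byte-string copy for inspection:

   use strict;
   use warnings;

   my $euro  = pack( 'U', 0x20AC );     # one-character Unicode string
   my $bytes = $euro;
   utf8::encode( $bytes );              # copy converted to UTF-8 bytes

   printf "%d character, %d bytes (%s)\n",
          length( $euro ), length( $bytes ), unpack( 'H*', $bytes );
   printf "code point: 0x%04X\n", unpack( 'U', $euro );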

Usually you'll want to pack or unpack UTF-8 strings:

   # pack and unpack the Hebrew alphabet
   my $alefbet = pack( 'U*', 0x05d0..0x05ea );
   my @hebrew = unpack( 'U*', $alefbet );

Please note: in the general case, you're better off using
L<C<Encode::decode('UTF-8', $utf)>|Encode/decode> to decode a UTF-8
encoded byte string to a Perl Unicode string, and
L<C<Encode::encode('UTF-8', $str)>|Encode/encode> to encode a Perl Unicode
string to UTF-8 bytes. These functions provide means of handling invalid byte
sequences and generally have a friendlier interface.
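
For comparison, a round trip through the L<Encode> functions might look
like the following sketch. The variable names are illustrative only:

   use strict;
   use warnings;
   use Encode ();

   my $str   = "\x{20ac}";                          # Perl Unicode string
   my $bytes = Encode::encode( 'UTF-8', $str );     # UTF-8 encoded bytes
   my $back  = Encode::decode( 'UTF-8', $bytes );   # Unicode string again

   print length( $bytes ), " bytes, ", length( $back ), " character\n";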

=head2 Another Portable Binary Encoding

The pack code C<w> has been added to support a portable binary data
encoding scheme that goes way beyond simple integers. (Details can
be found at L<http://Casbah.org/>, the Scarab project.)  A BER (Binary Encoded
Representation) compressed unsigned integer stores base 128
digits, most significant digit first, with as few digits as possible.
Bit eight (the high bit) is set on each byte except the last. There
is no size limit to BER encoding, but Perl won't go to extremes.

   my $berbuf = pack( 'w*', 1, 128, 128+1, 128*128+127 );

A hex dump of C<$berbuf>, with spaces inserted at the right places,
shows 01 8100 8101 81807F. Since the last byte is always less than
128, C<unpack> knows where to stop.
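
The following sketch reproduces that hex dump (without the spaces) and
completes the round trip with C<unpack>. The array name C<@numbers> is
chosen for illustration:

   use strict;
   use warnings;

   my $berbuf = pack( 'w*', 1, 128, 128+1, 128*128+127 );
   my $hex    = unpack( 'H*', $berbuf );    # 018100810181807f
   print "$hex\n";

   my @numbers = unpack( 'w*', $berbuf );   # back to the four integers
   print "@numbers\n";                      # 1 128 129 16511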


=head1 Template Grouping

Prior to Perl 5.8, repetitions of templates had to be made by
C<x>-multiplication of template strings. Now there is a better way as
we may use the pack codes C<(> and C<)> combined with a repeat count.
The C<unpack> template from the Stack Frame example can simply
be written like this:

   unpack( 'v2 (vXXCC)5 v5', $frame )

Let's explore this feature a little more. We'll begin with the equivalent of

   join( '', map( substr( $_, 0, 1 ), @str ) )

which returns a string consisting of the first character from each string.
Using pack, we can write

   pack( '(A)'.@str, @str )

or, because a repeat count C<*> means "repeat as often as required",
simply

   pack( '(A)*', @str )

(Note that the template C<A*> would only have packed C<$str[0]> in full
length.)

To pack dates stored as triplets ( day, month, year ) in an array C<@dates>
into a sequence of byte, byte, short integer we can write

   $pd = pack( '(CCS)*', map( @$_, @dates ) );
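
To see the group repetition at work, here is a small sketch with made-up
sample data in C<@str> and C<@dates>:

   use strict;
   use warnings;

   my @str    = qw( alpha beta gamma );
   my $firsts = pack( '(A)*', @str );   # first character of each string
   my $whole  = pack( 'A*',   @str );   # only $str[0], in full length
   print "$firsts $whole\n";            # abg alpha

   my @dates = ( [ 24, 12, 1999 ], [ 1, 1, 2000 ] );
   my $pd    = pack( '(CCS)*', map( @$_, @dates ) );
   my @flat  = unpack( '(CCS)*', $pd );
   print length( $pd ), " bytes: @flat\n";

Each date occupies four bytes, because C<S> is always exactly 16 bits
wide.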

To swap pairs of characters in a string (with even length) one could use
several techniques. First, let's use C<x> and C<X> to skip forward and back:

   $s = pack( '(A)*', unpack( '(xAXXAx)*', $s ) );

We can also use C<@> to jump to an offset, with 0 being the position where
we were when the last C<(> was encountered:

   $s = pack( '(A)*', unpack( '(@1A @0A @2)*', $s ) );

Finally, there is also an entirely different approach by unpacking big
endian shorts and packing them in the reverse byte order:

   $s = pack( '(v)*', unpack( '(n)*', $s ) );
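
A short sketch can confirm that the three techniques agree. The sample
string and the variable names are assumptions made for illustration:

   use strict;
   use warnings;

   my $s = 'abcdef';                    # even length, no whitespace

   my $skip   = pack( '(A)*', unpack( '(xAXXAx)*',     $s ) );
   my $offset = pack( '(A)*', unpack( '(@1A @0A @2)*', $s ) );
   my $endian = pack( '(v)*', unpack( '(n)*',          $s ) );

   print "$skip $offset $endian\n";     # badcfe badcfe badcfe

Keep in mind that C<A> strips trailing whitespace when unpacking, so the
first two variants are only suitable for strings without spaces; the
third one swaps raw bytes and has no such restriction.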


=head1 Lengths and Widths

=head2 String Lengths

In the previous section we've seen a network message that was constructed
by prefixing the binary message length to the actual message. You'll find
that packing a length followed by so many bytes of data is a 
frequently used recipe since appending a null byte won't work
if a null byte may be part of the data. Here is an example where both
techniques are used: after two null terminated strings with source and
destination address, a Short Message (to a mobile phone) is sent after
a length byte:

   my $msg = pack( 'Z*Z*CA*', $src, $dst, length( $sm ), $sm );

Unpacking this message can be done with the same template:

   ( $src, $dst, $len, $sm ) = unpack( 'Z*Z*CA*', $msg );

There's a subtle trap lurking in the offing: Adding another field after
the Short Message (in variable C<$sm>) is all right when packing, but this
cannot be unpacked naively:

   # pack a message
   my $msg = pack( 'Z*Z*CA*C', $src, $dst, length( $sm ), $sm, $prio );

   # unpack fails - $prio remains undefined!
   ( $src, $dst, $len, $sm, $prio ) = unpack( 'Z*Z*CA*C', $msg );

The pack code C<A*> gobbles up all remaining bytes, and C<$prio> remains
undefined! Before we let disappointment dampen the morale: Perl's got
the trump card to make this trick too, just a little further up the sleeve.
Watch this:

   # pack a message: ASCIIZ, ASCIIZ, length/string, byte
   my $msg = pack( 'Z* Z* C/A* C', $src, $dst, $sm, $prio );

   # unpack
   ( $src, $dst, $sm, $prio ) = unpack( 'Z* Z* C/A* C', $msg );

Combining two pack codes with a slash (C</>) associates them with a single
value from the argument list. In C<pack>, the length of the argument is
taken and packed according to the first code while the argument itself
is added after being converted with the template code after the slash.
This saves us the trouble of inserting the C<length> call, but it is 
in C<unpack> where we really score: The value of the length byte marks the
end of the string to be taken from the buffer. Since this combination
doesn't make sense unless the second pack code is C<a*>, C<A*>
or C<Z*>, Perl won't let you.
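
Here is the same round trip as a self-contained sketch. The addresses,
the message text and the priority are invented sample values:

   use strict;
   use warnings;

   my ( $src, $dst, $sm, $prio ) = ( '4711', '0815', 'Hello, world', 3 );

   # two ASCIIZ strings, a length byte plus string, and a final byte
   my $msg    = pack( 'Z* Z* C/A* C', $src, $dst, $sm, $prio );
   my @fields = unpack( 'Z* Z* C/A* C', $msg );

   print join( '|', @fields ), "\n";    # 4711|0815|Hello, world|3
   print length( $msg ), " bytes\n";    # 24 bytes

Note that the length byte does not show up in the unpacked list: it is
consumed by the C</> and only the string itself is returned.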

The pack code preceding C</> may be anything that's fit to represent a
number: All the numeric binary pack codes, and even text codes such as
C<A4> or C<Z*>:

   # pack/unpack a string preceded by its length in ASCII
   my $buf = pack( 'A4/A*', "Humpty-Dumpty" );
   # unpack $buf: '13  Humpty-Dumpty'
   my $txt = unpack( 'A4/A*', $buf );

C</> is not implemented in Perls before 5.6, so if your code is required to
work on older Perls you'll need to C<unpack( 'Z* Z* C')> to get the length,
then use it to make a new unpack string. For example

   # pack a message: ASCIIZ, ASCIIZ, length, string, byte
   # (5.005 compatible)
   my $msg = pack( 'Z* Z* C A* C', $src, $dst, length $sm, $sm, $prio );

   # unpack
   ( undef, undef, $len) = unpack( 'Z* Z* C', $msg );
   ($src, $dst, $sm, $prio) = unpack ( "Z* Z* x A$len C", $msg );

But that second C<unpack> is rushing ahead. It isn't using a simple literal
string for the template. So maybe we should introduce...

=head2 Dynamic Templates

So far, we've seen literals used as templates. If the list of pack
items doesn't have fixed length, an expression constructing the
template is required (whenever, for some reason, C<()*> cannot be used).
Here's an example: To store named string values in a way that can be
conveniently parsed by a C program, we create a sequence of names and
null terminated ASCII strings, with C<=> between the name and the value,
followed by an additional delimiting null byte. Here's how:

   my $env = pack( '(A*A*Z*)' . keys( %Env ) . 'C',
                   map( { ( $_, '=', $Env{$_} ) } keys( %Env ) ), 0 );

Let's examine the cogs of this byte mill, one by one. There's the C<map>
call, creating the items we intend to stuff into the C<$env> buffer:
to each key (in C<$_>) it adds the C<=> separator and the hash entry value.
Each triplet is packed with the template code sequence C<A*A*Z*> that
is repeated according to the number of keys. (Yes, that's what the C<keys>
function returns in scalar context.) To get the very last null byte,
we add a C<0> at the end of the C<pack> list, to be packed with C<C>.
(Attentive readers may have noticed that we could have omitted the 0.)

For the reverse operation, we'll have to determine the number of items
in the buffer before we can let C<unpack> rip it apart:

   my $n = $env =~ tr/\0// - 1;
   my %env = map( split( /=/, $_ ), unpack( "(Z*)$n", $env ) );

The C<tr> counts the null bytes. The C<unpack> call returns a list of
name-value pairs each of which is taken apart in the C<map> block. 
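
Put together, the round trip might look like the sketch below. The
contents of C<%Env> are made up, and the limit of 2 passed to C<split>
is an addition not found in the snippet above; it merely keeps values
containing a C<=> in one piece:

   use strict;
   use warnings;

   my %Env = ( HOME => '/home/alice', SHELL => '/bin/sh' );

   my $env = pack( '(A*A*Z*)' . keys( %Env ) . 'C',
                   map( { ( $_, '=', $Env{$_} ) } keys( %Env ) ), 0 );

   my $n   = $env =~ tr/\0// - 1;       # null bytes, minus the delimiter
   my %env = map( split( /=/, $_, 2 ), unpack( "(Z*)$n", $env ) );

   print "$_=$env{$_}\n" for sort keys %env;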


=head2 Counting Repetitions

Rather than storing a sentinel at the end of a data item (or a list of items),
we could precede the data with a count. Again, we pack keys and values of
a hash, preceding each with an unsigned short length count, and up front
we store the number of pairs:

   my $env = pack( 'S(S/A* S/A*)*', scalar keys( %Env ), %Env );

This simplifies the reverse operation as the number of repetitions can be
unpacked with the C</> code:

   my %env = unpack( 'S/(S/A* S/A*)', $env );

Note that this is one of the rare cases where you cannot use the same
template for C<pack> and C<unpack> because C<pack> can't determine
a repeat count for a C<()>-group.
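
The two templates can be tried out with a sketch like this one, again
using an invented C<%Env>:

   use strict;
   use warnings;

   my %Env = ( HOME => '/home/alice', SHELL => '/bin/sh' );

   # number of pairs first, then each key and value with its own length
   my $env = pack( 'S(S/A* S/A*)*', scalar keys( %Env ), %Env );

   # the leading count becomes the repeat count of the group
   my %env = unpack( 'S/(S/A* S/A*)', $env );

   print "$_=$env{$_}\n" for sort keys %env;

Since C<A*> strips trailing whitespace when unpacking, C<a*> would be the
safer choice if the values may end in spaces.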


=head2 Intel HEX

Intel HEX is a file format for representing binary data, mostly for
programming various chips, as a text file. (See
L<http://en.wikipedia.org/wiki/.hex> for a detailed description, and
L<http://en.wikipedia.org/wiki/SREC_(file_format)> for the Motorola
S-record format, which can be unravelled using the same technique.)
Each line begins with a colon (':') and is followed by a sequence of
hexadecimal characters, specifying a byte count I<n> (8 bit),
an address (16 bit, big endian), a record type (8 bit), I<n> data bytes
and a checksum (8 bit) computed as the least significant byte of the two's
complement sum of the preceding bytes. Example: C<:0300300002337A1E>.

The first step of processing such a line is the conversion, to binary,
of the hexadecimal data, to obtain the four fields, while checking the
checksum. No surprise here: we'll start with a simple C<pack> call to 
convert everything to binary:

   my $binrec = pack( 'H*', substr( $hexrec, 1 ) );

The resulting byte sequence is most convenient for checking the checksum.
Don't slow your program down with a for loop adding the C<ord> values
of this string's bytes - the C<unpack> code C<%> is the thing to use
for computing the 8-bit sum of all bytes, which must be equal to zero:

   die unless unpack( "%8C*", $binrec ) == 0;

Finally, let's get those four fields. By now, you shouldn't have any
problems with the first three fields - but how can we use the byte count
of the data in the first field as a length for the data field? Here
the codes C<x> and C<X> come to the rescue, as they permit jumping
back and forth in the string to unpack.

   my( $addr, $type, $data ) = unpack( "x n C X4 C x3 /a", $binrec );

Code C<x> skips a byte, since we don't need the count yet. Code C<n> takes
care of the 16-bit big-endian integer address, and C<C> unpacks the
record type. Being at offset 4, where the data begins, we need the count.
C<X4> brings us back to square one, which is the byte at offset 0.
Now we pick up the count, and zoom forth to offset 4, where we are
now fully furnished to extract the exact number of data bytes, leaving
the trailing checksum byte alone.
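
Applied to the example record from above, the three steps can be combined
into the following sketch:

   use strict;
   use warnings;

   my $hexrec = ':0300300002337A1E';

   # strip the colon and convert the hex digits to bytes
   my $binrec = pack( 'H*', substr( $hexrec, 1 ) );

   # the 8-bit sum over all bytes, checksum included, must be zero
   die "bad checksum\n" unless unpack( '%8C*', $binrec ) == 0;

   my( $addr, $type, $data ) = unpack( 'x n C X4 C x3 /a', $binrec );
   printf "addr=0x%04X type=%d data=%s\n",
          $addr, $type, unpack( 'H*', $data );

For this record the address is 0x0030, the record type is 0, and the
three data bytes are 02, 33 and 7A.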



=head1 Packing and Unpacking C Structures

In previous sections we have seen how to pack numbers and character
strings. If it were not for a couple of snags we could conclude this
section right away with the terse remark that C structures don't
contain anything else, and therefore you already know all there is to it.
Sorry, no: read on, please.

If you have to deal with a lot of C structures, and don't want to
hack all your template strings manually, you'll probably want to have
a look at the CPAN module C<Convert::Binary::C>. Not only can it parse
your C source directly, but it also has built-in support for all the
odds and ends described further on in this section.

=head2 The Alignment Pit

In the consideration of speed against memory requirements the balance
has been tilted in favor of faster execution. This has influenced the
way C compilers allocate memory for structures: On architectures
where a 16-bit or 32-bit operand can be moved faster between places in
memory, or to or from a CPU register, if it is aligned at an even or 
multiple-of-four or even at a multiple-of-eight address, a C compiler
will give you this speed benefit by stuffing extra bytes into structures.
If you don't cross the C shoreline this is not likely to cause you any
grief (although you should care when you design large data structures,
or you want your code to be portable between architectures (you do want
that, don't you?)).

To see how this affects C<pack> and C<unpack>, we'll compare these two
C structures:

   typedef struct {
     char     c1;
     short    s;
     char     c2;
     long     l;
   } gappy_t;

   typedef struct {
     long     l;
     short    s;
     char     c1;
     char     c2;
   } dense_t;

Typically, a C compiler allocates 12 bytes to a C<gappy_t> variable, but
requires only 8 bytes for a C<dense_t>. After investigating this further,
we can draw memory maps, showing where the extra 4 bytes are hidden:

   0           +4          +8          +12
   +--+--+--+--+--+--+--+--+--+--+--+--+
   |c1|xx|  s  |c2|xx|xx|xx|     l     |    xx = fill byte
   +--+--+--+--+--+--+--+--+--+--+--+--+
   gappy_t

   0           +4          +8
   +--+--+--+--+--+--+--+--+
   |     l     |  s  |c1|c2|
   +--+--+--+--+--+--+--+--+
   dense_t

And that's where the first quirk strikes: C<pack> and C<unpack>
templates have to be stuffed with C<x> codes to get those extra fill bytes.

The natural question: "Why can't Perl compensate for the gaps?" warrants
an answer. One good reason is that C compilers might provide (non-ANSI)
extensions permitting all sorts of fancy control over the way structures
are aligned, even at the level of an individual structure field. And, if
this were not enough, there is an insidious thing called C<union> where
the number of fill bytes cannot be derived from the alignment of the next
item alone.

OK, so let's bite the bullet. Here's one way to get the alignment right
by inserting template codes C<x>, which don't take a corresponding item 
from the list:

  my $gappy = pack( 'cxs cxxx l!', $c1, $s, $c2, $l );

Note the C<!> after C<l>: We want to make sure that we pack a long
integer as it is compiled by our C compiler. And even now, it will only
work for the platforms where the compiler aligns things as above.
And somebody somewhere has a platform where it doesn't.
[Probably a Cray, where C<short>s, C<int>s and C<long>s are all 8 bytes. :-)]

Counting bytes and watching alignments in lengthy structures is bound to 
be a drag. Isn't there a way we can create the template with a simple
program? Here's a C program that does the trick:

   #include <stdio.h>
   #include <stddef.h>

   typedef struct {
     char     fc1;
     short    fs;
     char     fc2;
     long     fl;
   } gappy_t;

   #define Pt(struct,field,tchar) \
     printf( "@%d%s ", (int)offsetof(struct,field), # tchar );

   int main() {
     Pt( gappy_t, fc1, c  );
     Pt( gappy_t, fs,  s! );
     Pt( gappy_t, fc2, c  );
     Pt( gappy_t, fl,  l! );
     printf( "\n" );
   }

The output line can be used as a template in a C<pack> or C<unpack> call:

  my $gappy = pack( '@0c @2s! @4c @8l!', $c1, $s, $c2, $l );

Gee, yet another template code - as if we hadn't plenty. But 
C<@> saves our day by enabling us to specify the offset from the beginning
of the pack buffer to the next item: This is just the value
the C<offsetof> macro (defined in C<E<lt>stddef.hE<gt>>) returns when
given a C<struct> type and one of its field names ("member-designator" in 
C standardese).

Neither using offsets nor adding C<x>'s to bridge the gaps is satisfactory.
(Just imagine what happens if the structure changes.) What we really need
is a way of saying "skip as many bytes as required to the next multiple of N".
In fluent Templatese, you say this with C<x!N> where N is replaced by the
appropriate value. Here's the next version of our struct packaging:

  my $gappy = pack( 'c x!2 s c x!4 l!', $c1, $s, $c2, $l );

That's certainly better, but we still have to know how long all the
integers are, and portability is far away. Rather than C<2>,
for instance, we want to say "however long a short is". But this can be
done by enclosing the appropriate pack code in brackets: C<[s]>. So, here's
the very best we can do:

  my $gappy = pack( 'c x![s] s c x![l!] l!', $c1, $s, $c2, $l );
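
The effect of the alignment codes can be observed with a sketch like the
one below. The field values are arbitrary, and the byte counts it prints
depend on the platform, so none are given here:

  use strict;
  use warnings;

  my ( $c1, $s, $c2, $l ) = ( 1, 2, 3, 4 );

  my $template = 'c x![s] s c x![l!] l!';
  my $tight    = pack( 'c s c l!', $c1, $s, $c2, $l );   # no fill bytes
  my $gappy    = pack( $template,  $c1, $s, $c2, $l );   # with fill bytes

  printf "without padding: %d bytes, with padding: %d bytes\n",
         length( $tight ), length( $gappy );

  # the same template skips the fill bytes again when unpacking
  my @back = unpack( $template, $gappy );
  print "@back\n";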


=head2 Dealing with Endian-ness

Now, imagine that we want to pack the data for a machine with a
different byte-order. First, we'll have to figure out how big the data
types on the target machine really are. Let's assume that the longs are
32 bits wide and the shorts are 16 bits wide. You can then rewrite the
template as:

  my $gappy = pack( 'c x![s] s c x![l] l', $c1, $s, $c2, $l );

If the target machine is little-endian, we could write:

  my $gappy = pack( 'c x![s] s< c x![l] l<', $c1, $s, $c2, $l );

This forces the short and the long members to be little-endian, and is
just fine if you don't have too many struct members. But we could also
use the byte-order modifier on a group and write the following:

  my $gappy = pack( '( c x![s] s c x![l] l )<', $c1, $s, $c2, $l );

This is not as short as before, but it makes it more obvious that we
intend to have little-endian byte-order for a whole group, not only
for individual template codes. It can also be more readable and easier
to maintain.
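
A hex dump makes the effect of the group modifier visible. This sketch
uses a shortened template and arbitrary values, and assumes a Perl recent
enough to support the C<E<lt>> and C<E<gt>> modifiers:

  use strict;
  use warnings;

  my @values = ( 1, 2 );

  # 's' is always 16 bits and 'l' always 32 bits wide
  my $le = unpack( 'H*', pack( '( s l )<', @values ) );
  my $be = unpack( 'H*', pack( '( s l )>', @values ) );

  print "little-endian: $le\n";    # 010002000000
  print "big-endian:    $be\n";    # 000100000002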


=head2 Alignment, Take 2

I'm afraid that we're not quite through with the alignment catch yet. The
hydra raises another ugly head when you pack arrays of structures:

   typedef struct {
     short    count;
     char     glyph;
   } cell_t;

   typedef cell_t buffer_t[BUFLEN];

Where's the catch? Padding is neither required before the first field C<count>,
nor between this and the next field C<glyph>, so why can't we simply pack
like this:

   # something goes wrong here:
   pack( 's!a' x @buffer,
         map{ ( $_->{count}, $_->{glyph} ) } @buffer );

This packs C<3*@buffer> bytes, but it turns out that the size of 
C<buffer_t> is four times C<BUFLEN>! The moral of the story is that
the required alignment of a structure or array is propagated to the
next higher level where we have to consider padding I<at the end>
of each component as well. Thus the correct template is:

   pack( 's!ax' x @buffer,
         map{ ( $_->{count}, $_->{glyph} ) } @buffer );
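
The difference between the two templates shows up as one byte per array
element, as this sketch with an invented two-element C<@buffer>
illustrates:

   use strict;
   use warnings;

   my @buffer = ( { count => 3, glyph => 'a' },
                  { count => 7, glyph => 'b' } );
   my @items  = map { ( $_->{count}, $_->{glyph} ) } @buffer;

   my $wrong = pack( 's!a'  x @buffer, @items );   # no trailing padding
   my $right = pack( 's!ax' x @buffer, @items );   # one fill byte per cell

   printf "%d bytes versus %d bytes\n", length( $wrong ), length( $right );

   my @cells = unpack( 's!ax' x @buffer, $right );
   print "@cells\n";                               # 3 a 7 b

The single C<x> is only right where a C<short> is two bytes wide and
C<cell_t> is aligned accordingly.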

=head2 Alignment, Take 3

And even if you take all the above into account, ANSI still lets this:

   typedef struct {
     char     foo[2];
   } foo_t;

vary in size. The alignment constraint of the structure can be greater than
any of its elements. [And if you think that this doesn't affect anything
common, dismember the next cellphone that you see. Many have ARM cores, and
the ARM structure rules make C<sizeof (foo_t)> == 4]

=head2 Pointers for How to Use Them

The title of this section indicates the second problem you may run into
sooner or later when you pack C structures. If the function you intend
to call expects a, say, C<void *> value, you I<cannot> simply take
a reference to a Perl variable. (Although that value certainly is a
memory address, it's not the address where the variable's contents are
stored.)

Template code C<P> promises to pack a "pointer to a fixed length string".
Isn't this what we want? Let's try:

    # allocate some storage and pack a pointer to it
    my $memory = "\x00" x $size;
    my $memptr = pack( 'P', $memory );

But wait: doesn't C<pack> just return a sequence of bytes? How can we pass this
string of bytes to some C code expecting a pointer which is, after all,
nothing but a number? The answer is simple: We have to obtain the numeric
address from the bytes returned by C<pack>.

    my $ptr = unpack( 'L!', $memptr );

Obviously this assumes that it is possible to typecast a pointer
to an unsigned long and vice versa, which frequently works but should not
be taken as a universal law. - Now that we have this pointer the next question
is: How can we put it to good use? We need a call to some C function
where a pointer is expected. The read(2) system call comes to mind:

    ssize_t read(int fd, void *buf, size_t count);

After reading L<perlfunc> explaining how to use C<syscall> we can write
this Perl function copying a file to standard output:

    require 'syscall.ph'; # run h2ph to generate this file
    sub cat($){
        my $path = shift();
        my $size = -s $path;
        my $memory = "\x00" x $size;  # allocate some memory
        my $ptr = unpack( 'L!', pack( 'P', $memory ) );
        open( F, $path ) || die( "$path: cannot open ($!)\n" );
        my $fd = fileno(F);
        my $res = syscall( &SYS_read, $fd, $ptr, $size );
        print $memory;
        close( F );
    }

This is neither a specimen of simplicity nor a paragon of portability but
it illustrates the point: We are able to sneak behind the scenes and
access Perl's otherwise well-guarded memory! (Important note: Perl's
C<syscall> does I<not> require you to construct pointers in this roundabout
way. You simply pass a string variable, and Perl forwards the address.) 

How does C<unpack> with C<P> work? Imagine some pointer in the buffer
about to be unpacked: If it isn't the null pointer (which will smartly
produce the C<undef> value) we have a start address - but then what?
Perl has no way of knowing how long this "fixed length string" is, so
it's up to you to specify the actual size as an explicit length after C<P>.

   my $mem = "abcdefghijklmn";
   print unpack( 'P5', pack( 'P', $mem ) ); # prints "abcde"

As a consequence, C<pack> ignores any number or C<*> after C<P>.


Now that we have seen C<P> at work, we might as well give C<p> a whirl.
Why do we need a second template code for packing pointers at all? The 
answer lies behind the simple fact that an C<unpack> with C<p> promises
a null-terminated string starting at the address taken from the buffer,
and that implies a length for the data item to be returned:

   my $buf = pack( 'p', "abc\x00efhijklmn" );
   print unpack( 'p', $buf );    # prints "abc"



This is apt to be confusing, though: as a consequence of the length being
implied by the string's length, a number after pack code C<p> is a repeat
count, not a length as after C<P>.


Using C<pack(..., $x)> with C<P> or C<p> to get the address where C<$x> is
actually stored must be used with circumspection. Perl's internal machinery
considers the relation between a variable and that address as its very own 
private matter and doesn't really care that we have obtained a copy. Therefore:

=over 4

=item * 

Do not use C<pack> with C<p> or C<P> to obtain the address of a variable
that's bound to go out of scope (thereby freeing its memory) before you
are done with using the memory at that address.

=item * 

Be very careful with Perl operations that change the value of the
variable. Appending something to the variable, for instance, might require
reallocation of its storage, leaving you with a pointer into no-man's land.

=item * 

Don't think that you can get the address of a Perl variable
when it is stored as an integer or double number! C<pack('P', $x)> will
force the variable's internal representation to string, just as if you
had written something like C<$x .= ''>.

=back

It's safe, however, to P- or p-pack a string literal, because Perl simply
allocates an anonymous variable.



=head1 Pack Recipes

Here is a collection of (possibly) useful canned recipes for C<pack>
and C<unpack>:

    # Convert IP address for socket functions
    pack( "C4", split /\./, "123.4.5.6" ); 

    # Count the bits in a chunk of memory (e.g. a select vector)
    unpack( '%32b*', $mask );

    # Determine the endianness of your system
    $is_little_endian = unpack( 'c', pack( 's', 1 ) );
    $is_big_endian = unpack( 'xc', pack( 's', 1 ) );

    # Determine the number of bits in a native integer
    $bits = unpack( '%32b*', pack( 'I!', ~0 ) );

    # Prepare argument for the nanosleep system call
    my $timespec = pack( 'L!L!', $secs, $nanosecs );

For a simple memory dump we unpack some bytes into just as 
many pairs of hex digits, and use C<map> to handle the traditional
spacing - 16 bytes to a line:

    my $i;
    print map( ++$i % 16 ? "$_ " : "$_\n",
               unpack( 'H2' x length( $mem ), $mem ) ),
          length( $mem ) % 16 ? "\n" : '';


=head1 Funnies Section

    # Pulling digits out of nowhere...
    print unpack( 'C', pack( 'x' ) ),
          unpack( '%B*', pack( 'A' ) ),
          unpack( 'H', pack( 'A' ) ),
          unpack( 'A', unpack( 'C', pack( 'A' ) ) ), "\n";

    # One for the road ;-)
    my $advice = pack( 'all u can in a van' );


=head1 Authors

Simon Cozens and Wolfgang Laun.

=encoding utf8

=head1 NAME

perl5223delta - what is new for perl v5.22.3

=head1 DESCRIPTION

This document describes differences between the 5.22.2 release and the 5.22.3
release.

If you are upgrading from an earlier release such as 5.22.1, first read
L<perl5222delta>, which describes differences between 5.22.1 and 5.22.2.

=head1 Security

=head2 B<-Di> switch is now required for PerlIO debugging output

Previously PerlIO debugging output would be sent to the file specified by the
C<PERLIO_DEBUG> environment variable if perl wasn't running setuid and the
B<-T> or B<-t> switches hadn't been parsed yet.

If perl performed output at a point where it hadn't yet parsed its switches
this could result in perl creating or overwriting the file named by
C<PERLIO_DEBUG> even when the B<-T> switch had been supplied.

Perl now requires the B<-Di> switch to produce PerlIO debugging output.  By
default this is written to C<stderr>, but can optionally be redirected to a
file by setting the C<PERLIO_DEBUG> environment variable.

If perl is running setuid or the B<-T> switch was supplied C<PERLIO_DEBUG> is
ignored and the debugging output is sent to C<stderr> as for any other B<-D>
switch.

=head2 Core modules and tools no longer search F<"."> for optional modules

The tools and many modules supplied in core no longer search the default
current directory entry in L<C<@INC>|perlvar/@INC> for optional modules.  For
example, L<Storable> will remove the final F<"."> from C<@INC> before trying to
load L<Log::Agent>.

This prevents an attacker injecting an optional module into a process run by
another user where the current directory is writable by the attacker, e.g. the
F</tmp> directory.

In most cases this removal should not cause problems, but difficulties were
encountered with L<base>, which treats every module name supplied as optional.
These difficulties have not yet been resolved, so for this release there are no
changes to L<base>.  We hope to have a fix for L<base> in Perl 5.22.4.

To protect your own code from this attack, either remove the default F<".">
entry from C<@INC> at the start of your script, so:

  #!/usr/bin/perl
  use strict;
  ...

becomes:

  #!/usr/bin/perl
  BEGIN { pop @INC if $INC[-1] eq '.' }
  use strict;
  ...

or for modules, remove F<"."> from a localized C<@INC>, so:

  my $can_foo = eval { require Foo; };

becomes:

  my $can_foo = eval {
      local @INC = @INC;
      pop @INC if $INC[-1] eq '.';
      require Foo;
  };

=head1 Incompatible Changes

Other than the security changes above there are no changes intentionally
incompatible with Perl 5.22.2.  If any exist, they are bugs, and we request
that you submit a report.  See L</Reporting Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Archive::Tar> has been upgraded from version 2.04 to 2.04_01.

=item *

L<bignum> has been upgraded from version 0.39 to 0.39_01.

=item *

L<CPAN> has been upgraded from version 2.11 to 2.11_01.

=item *

L<Digest> has been upgraded from version 1.17 to 1.17_01.

=item *

L<Digest::SHA> has been upgraded from version 5.95 to 5.95_01.

=item *

L<Encode> has been upgraded from version 2.72 to 2.72_01.

=item *

L<ExtUtils::Command> has been upgraded from version 1.20 to 1.20_01.

=item *

L<ExtUtils::MakeMaker> has been upgraded from version 7.04_01 to 7.04_02.

=item *

L<File::Fetch> has been upgraded from version 0.48 to 0.48_01.

=item *

L<File::Spec> has been upgraded from version 3.56_01 to 3.56_02.

=item *

L<HTTP::Tiny> has been upgraded from version 0.054 to 0.054_01.

=item *

L<IO> has been upgraded from version 1.35 to 1.35_01.

=item *

The IO-Compress modules have been upgraded from version 2.068 to 2.068_001.

=item *

L<IPC::Cmd> has been upgraded from version 0.92 to 0.92_01.

=item *

L<JSON::PP> has been upgraded from version 2.27300 to 2.27300_01.

=item *

L<Locale::Maketext> has been upgraded from version 1.26 to 1.26_01.

=item *

L<Locale::Maketext::Simple> has been upgraded from version 0.21 to 0.21_01.

=item *

L<Memoize> has been upgraded from version 1.03 to 1.03_01.

=item *

L<Module::CoreList> has been upgraded from version 5.20160429 to 5.20170114_22.

=item *

L<Net::Ping> has been upgraded from version 2.43 to 2.43_01.

=item *

L<Parse::CPAN::Meta> has been upgraded from version 1.4414 to 1.4414_001.

=item *

L<Pod::Html> has been upgraded from version 1.22 to 1.2201.

=item *

L<Pod::Perldoc> has been upgraded from version 3.25 to 3.25_01.

=item *

L<Storable> has been upgraded from version 2.53_01 to 2.53_02.

=item *

L<Sys::Syslog> has been upgraded from version 0.33 to 0.33_01.

=item *

L<Test> has been upgraded from version 1.26 to 1.26_01.

=item *

L<Test::Harness> has been upgraded from version 3.35 to 3.35_01.

=item *

L<XSLoader> has been upgraded from version 0.20 to 0.20_01, fixing a security
hole in which binary files could be loaded from a path outside of C<@INC>.
L<[perl #128528]|https://rt.perl.org/Public/Bug/Display.html?id=128528>

=back

=head1 Documentation

=head2 Changes to Existing Documentation

=head3 L<perlapio>

=over 4

=item *

The documentation of C<PERLIO_DEBUG> has been updated.

=back

=head3 L<perlrun>

=over 4

=item *

The new B<-Di> switch has been documented, and the documentation of
C<PERLIO_DEBUG> has been updated.

=back

=head1 Testing

=over 4

=item *

A new test script, F<t/run/switchDx.t>, has been added to test that the new
B<-Di> switch is working correctly.

=back

=head1 Selected Bug Fixes

=over 4

=item *

The C<PadlistNAMES> macro is an lvalue again.

=back

=head1 Acknowledgements

Perl 5.22.3 represents approximately 9 months of development since Perl 5.22.2
and contains approximately 4,400 lines of changes across 240 files from 20
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 2,200 lines of changes to 170 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.22.3:

Aaron Crane, Abigail, Alex Vandiver, Aristotle Pagaltzis, Chad Granum, Chris
'BinGOs' Williams, Craig A. Berry, David Mitchell, Father Chrysostomos, James E
Keenan, Jarkko Hietaniemi, Karen Etheridge, Karl Williamson, Matthew Horsfall,
Niko Tyni, Ricardo Signes, Sawyer X, Stevan Little, Steve Hay, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the Perl bug database at
https://rt.perl.org/ .  There may also be information at http://www.perl.org/ ,
the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlopenbsd.pod000064400000002264150344123500007566 0ustar00If you read this file _as_is_, just ignore the funny characters you
see.  It is written in the POD format (see pod/perlpod.pod) which is
specifically designed to be readable as is.

=head1 NAME

perlopenbsd - Perl version 5 on OpenBSD systems

=head1 DESCRIPTION

This document describes various features of OpenBSD that will affect how Perl
version 5 (hereafter just Perl) is compiled and/or runs.

=head2 OpenBSD core dumps from getprotobyname_r and getservbyname_r with ithreads

When Perl is configured to use ithreads, it will use re-entrant library calls
in preference to non-re-entrant versions.  There is an incompatibility in
OpenBSD's C<getprotobyname_r> and C<getservbyname_r> functions in versions 3.7
and later that will cause a SEGV when called without doing a C<bzero> on
their return structs prior to calling these functions.  Current Perls
should handle this problem correctly.  Older threaded Perls (5.8.6 or earlier)
will run into this problem.  If you want to run a threaded Perl on OpenBSD
3.7 or higher, you will need to upgrade to at least Perl 5.8.7.

=head1 AUTHOR

Steve Peters <steve@fisharerojo.org>

Please report any errors, updates, or suggestions to F<perlbug@perl.org>.

perltoot.pod000064400000000446150344123500007121 0ustar00=encoding utf8

=head1 NAME

perltoot - Links to information on object-oriented programming in Perl

=head1 DESCRIPTION

For information on OO programming with Perl, please see L<perlootut>
and L<perlobj>.

(The above documents supersede the tutorial that was formerly here in
perltoot.)

=cut
perlgit.pod000064400000101345150344123500006717 0ustar00=encoding utf8

=for comment
Consistent formatting of this file is achieved with:
  perl ./Porting/podtidy pod/perlgit.pod

=head1 NAME

perlgit - Detailed information about git and the Perl repository

=head1 DESCRIPTION

This document provides details on using git to develop Perl. If you are
just interested in working on a quick patch, see L<perlhack> first.
This document is intended for people who are regular contributors to
Perl, including those with write access to the git repository.

=head1 CLONING THE REPOSITORY

All of Perl's source code is kept centrally in a Git repository at
I<perl5.git.perl.org>.

You can make a read-only clone of the repository by running:

  % git clone git://perl5.git.perl.org/perl.git perl

This uses the git protocol (port 9418).

If you cannot use the git protocol for firewall reasons, you can also
clone via http, though this is much slower:

  % git clone http://perl5.git.perl.org/perl.git perl

=head1 WORKING WITH THE REPOSITORY

Once you have changed into the repository directory, you can inspect
it. After a clone the repository will contain a single local branch,
which will be the current branch as well, as indicated by the asterisk.

  % git branch
  * blead

Using the -a switch to C<branch> will also show the remote tracking
branches in the repository:

  % git branch -a
  * blead
    origin/HEAD
    origin/blead
  ...

The branches that begin with "origin" correspond to the "git remote"
that you cloned from (which is named "origin"). Each branch on the
remote will be exactly tracked by these branches. You should NEVER do
work on these remote tracking branches. You only ever do work in a
local branch. Local branches can be configured to automerge (on pull)
from a designated remote tracking branch. This is the case with the
default branch C<blead> which will be configured to merge from the
remote tracking branch C<origin/blead>.

You can see recent commits:

  % git log

You can also pull new changes from the repository and update your local
repository (which must be clean first):

  % git pull

Assuming we are on the branch C<blead> immediately after a pull, this
command would be more or less equivalent to:

  % git fetch
  % git merge origin/blead

In fact if you want to update your local repository without touching
your working directory you do:

  % git fetch

And if you want to update your remote-tracking branches for all defined
remotes simultaneously you can do:

  % git remote update

Neither of these last two commands will update your working directory,
however both will update the remote-tracking branches in your
repository.
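
The relationship between C<git fetch>, C<git merge> and C<git pull>
described above can be tried out safely in a throwaway sandbox. The
following is only a sketch: it assumes git is installed, and the
directory names C<upstream> and C<work> and the C<Demo> identity are
made up for illustration and have nothing to do with the real Perl
repository.

```shell
set -eu
# Hypothetical identity so that commits work without any global config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# A stand-in "upstream" repository with one commit on a branch named blead.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/blead
git -C upstream commit -q --allow-empty -m 'first commit'

# Clone it, then let upstream move ahead by one commit.
git clone -q upstream work
git -C upstream commit -q --allow-empty -m 'second commit'

cd work
before=$(git rev-parse blead)

# Step 1: fetch updates origin/blead but leaves the local blead alone.
git fetch -q
after_fetch=$(git rev-parse blead)
remote_tip=$(git rev-parse origin/blead)

# Step 2: merging origin/blead brings the local branch up to date.
git merge -q origin/blead
after_merge=$(git rev-parse blead)

echo "before fetch: $before"
echo "after fetch:  $after_fetch"
echo "origin/blead: $remote_tip"
echo "after merge:  $after_merge"
```

The first two hashes printed should be identical, and the last two
should be identical, which is the behaviour the text above describes.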

To make a local branch of a remote branch:

  % git checkout -b maint-5.10 origin/maint-5.10

To switch back to blead:

  % git checkout blead

=head2 Finding out your status

The most common git command you will use will probably be:

  % git status

This command will produce as output a description of the current state
of the repository, including modified files and unignored untracked
files, and in addition it will show things like what files have been
staged for the next commit, and usually some useful information about
how to change things. For instance the following:

 % git status
 On branch blead
 Your branch is ahead of 'origin/blead' by 1 commit.

 Changes to be committed:
   (use "git reset HEAD <file>..." to unstage)

       modified:   pod/perlgit.pod

 Changes not staged for commit:
   (use "git add <file>..." to update what will be committed)
   (use "git checkout -- <file>..." to discard changes in working
                                                              directory)

       modified:   pod/perlgit.pod

 Untracked files:
   (use "git add <file>..." to include in what will be committed)

       deliberate.untracked

This shows that there were changes to this document staged for commit,
and that there were further changes in the working directory not yet
staged. It also shows that there was an untracked file in the working
directory, and as you can see shows how to change all of this. It also
shows that there is one commit on the working branch C<blead> which has
not been pushed to the C<origin> remote yet. B<NOTE>: This output
is also what you see as a template if you do not provide a message to
C<git commit>.

=head2 Patch workflow

First, please read L<perlhack> for details on hacking the Perl core.
That document covers many details on how to create a good patch.

If you already have a Perl repository, you should ensure that you're on
the I<blead> branch, and your repository is up to date:

  % git checkout blead
  % git pull

It's preferable to patch against the latest blead version, since this
is where new development occurs for all changes other than critical bug
fixes. Critical bug fix patches should be made against the relevant
maint branches, or should be submitted with a note indicating all the
branches where the fix should be applied.

Now that we have everything up to date, we need to create a temporary
new branch for these changes and switch into it:

  % git checkout -b orange

which is the short form of

  % git branch orange
  % git checkout orange

Creating a topic branch makes it easier for the maintainers to rebase
or merge back into the master blead for a more linear history. If you
don't work on a topic branch the maintainer has to manually cherry pick
your changes onto blead before they can be applied.

That'll get you scolded on perl5-porters, so don't do that. Be Awesome.

Then make your changes. For example, if Leon Brocard changes his name
to Orange Brocard, we should change his name in the AUTHORS file:

  % perl -pi -e 's{Leon Brocard}{Orange Brocard}' AUTHORS

You can see what files are changed:

  % git status
  On branch orange
  Changes to be committed:
    (use "git reset HEAD <file>..." to unstage)

     modified:   AUTHORS

And you can see the changes:

 % git diff
 diff --git a/AUTHORS b/AUTHORS
 index 293dd70..722c93e 100644
 --- a/AUTHORS
 +++ b/AUTHORS
 @@ -541,7 +541,7 @@    Lars Hecking              <lhecking@nmrc.ucc.ie>
  Laszlo Molnar                  <laszlo.molnar@eth.ericsson.se>
  Leif Huhn                      <leif@hale.dkstat.com>
  Len Johnson                    <lenjay@ibm.net>
 -Leon Brocard                   <acme@astray.com>
 +Orange Brocard                 <acme@astray.com>
  Les Peters                     <lpeters@aol.net>
  Lesley Binks                   <lesley.binks@gmail.com>
  Lincoln D. Stein               <lstein@cshl.org>

Now commit your change locally:

 % git commit -a -m 'Rename Leon Brocard to Orange Brocard'
 Created commit 6196c1d: Rename Leon Brocard to Orange Brocard
  1 files changed, 1 insertions(+), 1 deletions(-)

The C<-a> option is used to include all files that git tracks that you
have changed. If at this time, you only want to commit some of the
files you have worked on, you can omit the C<-a> and use the command
C<S<git add I<FILE ...>>> before doing the commit. C<S<git add
--interactive>> allows you to even just commit portions of files
instead of all the changes in them.

The C<-m> option is used to specify the commit message. If you omit it,
git will open a text editor for you to compose the message
interactively. This is useful when the changes are more complex than
the sample given here, and, depending on the editor, to know that the
first line of the commit message doesn't exceed the 50 character legal
maximum.

Once you've finished writing your commit message and exited your
editor, git will write your change to disk and tell you something like
this:

 Created commit daf8e63: explain git status and stuff about remotes
  1 files changed, 83 insertions(+), 3 deletions(-)

If you re-run C<git status>, you should see something like this:

 % git status
 On branch orange
 Untracked files:
   (use "git add <file>..." to include in what will be committed)

       deliberate.untracked

 nothing added to commit but untracked files present (use "git add" to
                                                                  track)

When in doubt, before you do anything else, check your status and read
it carefully; many questions are answered directly by the git status
output.

You can examine your last commit with:

  % git show HEAD

and if you are not happy with either the description or the patch
itself you can fix it up by editing the files once more and then issue:

  % git commit -a --amend

Now you should create a patch file for all your local changes:

  % git format-patch -M blead..
  0001-Rename-Leon-Brocard-to-Orange-Brocard.patch

Or for a lot of changes, e.g. from a topic branch:

  % git format-patch --stdout -M blead.. > topic-branch-changes.patch

You should now send an email to
L<perlbug@perl.org|mailto:perlbug@perl.org> with a description of your
changes, and include this patch file as an attachment. In addition to
being tracked by RT, mail to perlbug will automatically be forwarded to
perl5-porters (with manual moderation, so please be patient). You
should only send patches to
L<perl5-porters@perl.org|mailto:perl5-porters@perl.org> directly if the
patch is not ready to be applied, but intended for discussion.
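
To see what C<git format-patch> produces without touching a real
checkout, the topic-branch steps from this section can be replayed in a
scratch repository. This is a minimal sketch under stated assumptions:
git is installed, the repository name C<perl-demo>, the one-line
F<AUTHORS> file and the C<Demo> identity are invented for the example,
and C<printf> stands in for the C<perl -pi> edit shown earlier.

```shell
set -eu
# Hypothetical identity so that commits work without any global config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# A scratch repository whose main branch is called blead.
git init -q perl-demo
cd perl-demo
git symbolic-ref HEAD refs/heads/blead
printf 'Leon Brocard\n' > AUTHORS
git add AUTHORS
git commit -q -m 'Add AUTHORS'

# Do the work on a topic branch, as recommended above.
git checkout -q -b orange
printf 'Orange Brocard\n' > AUTHORS
git commit -q -a -m 'Rename Leon Brocard to Orange Brocard'

# One patch file is written per commit that is on orange but not on blead.
patch=$(git format-patch -M blead..)
echo "$patch"
```

The patch file name is derived from the first line of the commit
message, which is one more reason to keep that line short and
descriptive.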

Please do not use git-send-email(1) to send your patch. See L<Sending
patch emails|/Sending patch emails> for more information.

If you want to delete your temporary branch, you may do so with:

 % git checkout blead
 % git branch -d orange
 error: The branch 'orange' is not an ancestor of your current HEAD.
 If you are sure you want to delete it, run 'git branch -D orange'.
 % git branch -D orange
 Deleted branch orange.

=head2 Committing your changes

Assuming that you'd like to commit all the changes you've made as a
single atomic unit, run this command:

  % git commit -a

(That C<-a> tells git to add every file you've changed to this commit.
New files aren't automatically added to your commit when you use
C<commit -a>. If you want to add files or to commit some, but not all of
your changes, have a look at the documentation for C<git add>.)

Git will start up your favorite text editor, so that you can craft a
commit message for your change. See L<perlhack/Commit message> for more
information about what makes a good commit message.

Once you've finished writing your commit message and exited your
editor, git will write your change to disk and tell you something like
this:

 Created commit daf8e63: explain git status and stuff about remotes
  1 files changed, 83 insertions(+), 3 deletions(-)

If you re-run C<git status>, you should see something like this:

 % git status
 On branch blead
 Your branch is ahead of 'origin/blead' by 2 commits.
   (use "git push" to publish your local commits)
 Untracked files:
   (use "git add <file>..." to include in what will be committed)

       deliberate.untracked

 nothing added to commit but untracked files present (use "git add" to
                                                                  track)

When in doubt, before you do anything else, check your status and read
it carefully; many questions are answered directly by the git status
output.

=head2 Sending patch emails

After you've generated your patch you should send it
to L<perlbug@perl.org|mailto:perlbug@perl.org> (as discussed L<in the
previous section|/"Patch workflow">) with a normal mail client as an
attachment, along with a description of the patch.

You B<must not> use git-send-email(1) to send patches generated with
git-format-patch(1). The RT ticketing system living behind
L<perlbug@perl.org|mailto:perlbug@perl.org> does not respect the inline
contents of emails; sending an inline patch to RT guarantees that your
patch will be destroyed.

Someone may download your patch from RT, which will result in the
subject (the first line of the commit message) being omitted.  See
L<RT #74192|https://rt.perl.org/Ticket/Display.html?id=74192> and
L<commit a4583001|http://perl5.git.perl.org/perl.git/commitdiff/a4583001>
for an example. Alternatively someone may
apply your patch from RT after it arrived in their mailbox, by which
time RT will have modified the inline content of the message.  See
L<RT #74532|https://rt.perl.org/Ticket/Display.html?id=74532> and
L<commit f9bcfeac|http://perl5.git.perl.org/perl.git/commitdiff/f9bcfeac>
for a bad example of this failure mode.

=head2 A note on derived files

Be aware that many files in the distribution are derivative--avoid
patching them, because git won't see the changes to them, and the build
process will overwrite them. Patch the originals instead. Most
utilities (like perldoc) are in this category, i.e. patch
F<utils/perldoc.PL> rather than F<utils/perldoc>. Similarly, don't
create patches for files under F<$src_root/ext> from their copies found
in F<$install_root/lib>. If you are unsure about the proper location of
a file that may have gotten copied while building the source
distribution, consult the F<MANIFEST>.

=head2 Cleaning a working directory

The command C<git clean> can with varying arguments be used as a
replacement for C<make clean>.

To reset your working directory to a pristine condition you can do:

  % git clean -dxf

However, be aware this will delete ALL untracked content. You can use

  % git clean -Xf

to remove all ignored untracked files, such as build and test
byproducts, but leave any manually created files alone.

If you only want to cancel some uncommitted edits, you can use C<git
checkout> and give it a list of files to be reverted, or C<git checkout
-f> to revert them all.

If you want to cancel one or several commits, you can use C<git reset>.
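
The difference between the two C<git clean> invocations described in
this section is easiest to see in a scratch repository. The sketch
below assumes git is installed; the repository name C<clean-demo>, the
file names and the C<Demo> identity are hypothetical.

```shell
set -eu
# Hypothetical identity so that commits work without any global config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q clean-demo
cd clean-demo

# Ignore object files, the way build byproducts usually are.
printf '*.o\n' > .gitignore
git add .gitignore
git commit -q -m 'Ignore object files'

# One ignored build byproduct and one hand-made untracked file.
touch build.o notes.txt

# -X removes only ignored files: build.o goes, notes.txt stays.
git clean -q -Xf
after_ignored_only=$(ls)
echo "after 'git clean -Xf':  $after_ignored_only"

# -dxf removes ALL untracked content, including notes.txt.
git clean -q -dxf
after_everything=$(ls)
echo "after 'git clean -dxf': $after_everything"
```

Tracked files such as F<.gitignore> survive both commands; only
untracked content is ever removed.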

=head2 Bisecting

C<git> provides a built-in way to determine which commit should be blamed
for introducing a given bug. C<git bisect> performs a binary search of
history to locate the first failing commit. It is fast, powerful and
flexible, but requires some setup, and automating the process requires an
auxiliary shell script.

The core provides a wrapper program, F<Porting/bisect.pl>, which attempts to
simplify as much as possible, making bisecting as simple as running a Perl
one-liner. For example, if you want to know when this became an error:

    perl -e 'my $a := 2'

you simply run this:

    .../Porting/bisect.pl -e 'my $a := 2;'

Using F<Porting/bisect.pl>, with one command (and no other files) it's easy to
find out

=over 4

=item *

Which commit caused this example code to break?

=item *

Which commit caused this example code to start working?

=item *

Which commit added the first file to match this regex?

=item *

Which commit removed the last file to match this regex?

=back

usually without needing to know which versions of perl to use as start and
end revisions, as F<Porting/bisect.pl> automatically searches to find the
earliest stable version for which the test case passes. Run
C<Porting/bisect.pl --help> for the full documentation, including how to
set the C<Configure> and build time options.

If you require more flexibility than F<Porting/bisect.pl> has to offer, you'll
need to run C<git bisect> yourself. It's most useful to use C<git bisect run>
to automate the building and testing of perl revisions. For this you'll need
a shell script for C<git> to call to test a particular revision. An example
script is F<Porting/bisect-example.sh>, which you should copy B<outside> of
the repository, as the bisect process will reset the state to a clean checkout
as it runs. The instructions below assume that you copied it as F<~/run> and
then edited it as appropriate.

You first enter bisect mode with:

  % git bisect start

For example, if the bug is present on C<HEAD> but wasn't in 5.10.0,
C<git> will learn about this when you enter:

  % git bisect bad
  % git bisect good perl-5.10.0
  Bisecting: 853 revisions left to test after this

This results in checking out the median commit between C<HEAD> and
C<perl-5.10.0>. You can then run the bisecting process with:

  % git bisect run ~/run

When the first bad commit is isolated, C<git bisect> will tell you so:

  ca4cfd28534303b82a216cfe83a1c80cbc3b9dc5 is first bad commit
  commit ca4cfd28534303b82a216cfe83a1c80cbc3b9dc5
  Author: Dave Mitchell <davem@fdisolutions.com>
  Date:   Sat Feb 9 14:56:23 2008 +0000

      [perl #49472] Attributes + Unknown Error
      ...

  bisect run success

You can peek into the bisecting process with C<git bisect log> and
C<git bisect visualize>. C<git bisect reset> will get you out of bisect
mode.

Please note that the first C<good> state must be an ancestor of the
first C<bad> state. If you want to search for the commit that I<solved>
some bug, you have to negate your test case (i.e. exit with C<1> if OK
and C<0> if not) and still mark the lower bound as C<good> and the
upper as C<bad>. The "first bad commit" has then to be understood as
the "first commit where the bug is solved".

C<git help bisect> has much more information on how you can tweak your
binary searches.

=head2 Topic branches and rewriting history

Individual committers should create topic branches under
B<yourname>/B<some_descriptive_name>:

  % branch="$yourname/$some_descriptive_name"
  % git checkout -b $branch
  ... do local edits, commits etc ...
  % git push origin -u $branch

Should you be stuck with an ancient version of git (prior to 1.7), then
C<git push> will not have the C<-u> switch, and you have to replace the
last step with the following sequence:

  % git push origin $branch:refs/heads/$branch
  % git config branch.$branch.remote origin
  % git config branch.$branch.merge refs/heads/$branch

If you want to make changes to someone else's topic branch, you should
check with its creator before making any change to it.

You might sometimes find that the original author has edited the
branch's history. There are lots of good reasons for this. Sometimes, an
author might simply be rebasing the branch onto a newer source point.
Sometimes, an author might have found an error in an early commit which
they wanted to fix before merging the branch to blead.

Currently the master repository is configured to forbid
non-fast-forward merges. This means that the branches within can not be
rebased and pushed as a single step.

The only way you will ever be allowed to rebase or modify the history
of a pushed branch is to delete it and push it as a new branch under
the same name. Please think carefully about doing this. It may be
better to sequentially rename your branches so that it is easier for
others working with you to cherry-pick their local changes onto the new
version. (XXX: needs explanation).

If you want to rebase a personal topic branch, you will have to delete
your existing topic branch and push as a new version of it. You can do
this via the following formula (see the explanation about C<refspec>'s
in the git push documentation for details) after you have rebased your
branch:

  # first rebase
  % git checkout $user/$topic
  % git fetch
  % git rebase origin/blead

  # then "delete-and-push"
  % git push origin :$user/$topic
  % git push origin $user/$topic

B<NOTE:> it is forbidden at the repository level to delete any of the
"primary" branches. That is any branch matching
C<m!^(blead|maint|perl)!>. Any attempt to do so will result in git
producing an error like this:

  % git push origin :blead
  *** It is forbidden to delete blead/maint branches in this repository
  error: hooks/update exited with error code 1
  error: hook declined to update refs/heads/blead
  To ssh://perl5.git.perl.org/perl
   ! [remote rejected] blead (hook declined)
   error: failed to push some refs to 'ssh://perl5.git.perl.org/perl'

As a matter of policy we do B<not> edit the history of the blead and
maint-* branches. If a typo (or worse) sneaks into a commit to blead or
maint-*, we'll fix it in another commit. The only types of updates
allowed on these branches are "fast-forwards", where all history is
preserved.

Annotated tags in the canonical perl.git repository will never be
deleted or modified. Think long and hard about whether you want to push
a local tag to perl.git before doing so. (Pushing simple tags is
not allowed.)

=head2 Grafts

The perl history contains one mistake which was not caught in the
conversion: a merge was recorded in the history between blead and
maint-5.10 where no merge actually occurred. Due to the nature of git,
this is now impossible to fix in the public repository. You can remove
this mis-merge locally by adding the following line to your
C<.git/info/grafts> file:

 296f12bbbbaa06de9be9d09d3dcf8f4528898a49 434946e0cb7a32589ed92d18008aaa1d88515930

It is particularly important to have this graft line if any bisecting
is done in the area of the "merge" in question.

=head1 WRITE ACCESS TO THE GIT REPOSITORY

Once you have write access, you will need to modify the URL for the
origin remote to enable pushing. Edit F<.git/config> with the
git-config(1) command:

  % git config remote.origin.url ssh://perl5.git.perl.org/perl.git

You can also set up your user name and e-mail address. Most people do
this once globally in their F<~/.gitconfig> by doing something like:

  % git config --global user.name "Ævar Arnfjörð Bjarmason"
  % git config --global user.email avarab@gmail.com

However, if you'd like to override that just for perl,
execute something like the following in F<perl>:

  % git config user.email avar@cpan.org

It is also possible to keep C<origin> as a git remote, and add a new
remote for ssh access:

  % git remote add camel perl5.git.perl.org:/perl.git

This allows you to update your local repository by pulling from
C<origin>, which is faster and doesn't require you to authenticate, and
to push your changes back with the C<camel> remote:

  % git fetch camel
  % git push camel

The C<fetch> command just updates the C<camel> refs, as the objects
themselves should have been fetched when pulling from C<origin>.

=head2 Accepting a patch

If you have received a patch file generated using the above section,
you should try out the patch.

First we need to create a temporary new branch for these changes and
switch into it:

 % git checkout -b experimental

Patches that were formatted by C<git format-patch> are applied with
C<git am>:

 % git am 0001-Rename-Leon-Brocard-to-Orange-Brocard.patch
 Applying Rename Leon Brocard to Orange Brocard

Note that some UNIX mail systems can mess with text attachments containing
'From '. This will fix them up:

 % perl -pi -e's/^>From /From /' \
                        0001-Rename-Leon-Brocard-to-Orange-Brocard.patch

If just a raw diff is provided, it is also possible to use this two-step
process:

 % git apply bugfix.diff
 % git commit -a -m "Some fixing" \
                            --author="That Guy <that.guy@internets.com>"

Now we can inspect the change:

 % git show HEAD
 commit b1b3dab48344cff6de4087efca3dbd63548ab5e2
 Author: Leon Brocard <acme@astray.com>
 Date:   Fri Dec 19 17:02:59 2008 +0000

   Rename Leon Brocard to Orange Brocard

 diff --git a/AUTHORS b/AUTHORS
 index 293dd70..722c93e 100644
 --- a/AUTHORS
 +++ b/AUTHORS
 @@ -541,7 +541,7 @@ Lars Hecking                 <lhecking@nmrc.ucc.ie>
  Laszlo Molnar                  <laszlo.molnar@eth.ericsson.se>
  Leif Huhn                      <leif@hale.dkstat.com>
  Len Johnson                    <lenjay@ibm.net>
 -Leon Brocard                   <acme@astray.com>
 +Orange Brocard                 <acme@astray.com>
  Les Peters                     <lpeters@aol.net>
  Lesley Binks                   <lesley.binks@gmail.com>
  Lincoln D. Stein               <lstein@cshl.org>

If you are a committer to Perl and you think the patch is good, you can
then merge it into blead and push it out to the main repository:

  % git checkout blead
  % git merge experimental
  % git push origin blead

If you want to delete your temporary branch, you may do so with:

 % git checkout blead
 % git branch -d experimental
 error: The branch 'experimental' is not an ancestor of your current
 HEAD.  If you are sure you want to delete it, run 'git branch -D
 experimental'.
 % git branch -D experimental
 Deleted branch experimental.

=head2 Committing to blead

The 'blead' branch will become the next production release of Perl.

Before pushing I<any> local change to blead, it's incredibly important
that you do a few things, lest other committers come after you with
pitchforks and torches:

=over

=item *

Make sure you have a good commit message. See L<perlhack/Commit
message> for details.

=item *

Run the test suite. You might not think that one typo fix would break a
test file. You'd be wrong. Here's an example of where not running the
suite caused problems. A patch was submitted that added a couple of
tests to an existing F<.t>. It couldn't possibly affect anything else, so
no need to test beyond the single affected F<.t>, right?  But, the
submitter's email address had changed since the last of their
submissions, and this caused other tests to fail. Running the test
target given in the next item would have caught this problem.

=item *

If you don't run the full test suite, at least run C<make test_porting>.
This will run basic sanity checks. To see which sanity checks, have a
look in F<t/porting>.

=item *

If you make any changes that affect miniperl or core routines that have
different code paths for miniperl, be sure to run C<make minitest>.
This will catch problems that even the full test suite will not catch
because it runs a subset of tests under miniperl rather than perl.

=back

=head2 On merging and rebasing

Simple, one-off commits pushed to the 'blead' branch should be simple
commits that apply cleanly.  In other words, you should make sure your
work is committed against the current position of blead, so that you can
push back to the master repository without merging.

Sometimes, blead will move while you're building or testing your
changes.  When this happens, your push will be rejected with a message
like this:

 To ssh://perl5.git.perl.org/perl.git
  ! [rejected]        blead -> blead (non-fast-forward)
 error: failed to push some refs to 'ssh://perl5.git.perl.org/perl.git'
 To prevent you from losing history, non-fast-forward updates were
 rejected Merge the remote changes (e.g. 'git pull') before pushing
 again.  See the 'Note about fast-forwards' section of 'git push --help'
 for details.

When this happens, you can just I<rebase> your work against the new
position of blead, like this (assuming your remote for the master
repository is "p5p"):

  % git fetch p5p
  % git rebase p5p/blead

You will see your commits being re-applied, and you will then be able to
push safely.  More information about rebasing can be found in the
documentation for the git-rebase(1) command.

For larger sets of commits that only make sense together, or that would
benefit from a summary of the set's purpose, you should use a merge
commit.  You should perform your work on a L<topic branch|/Topic
branches and rewriting history>, which you should regularly rebase
against blead to ensure that your code is not broken by blead moving.
When you have finished your work, please perform a final rebase and
test.  Linear history is something that gets lost with every
commit on blead, but a final rebase makes the history linear
again, making it easier for future maintainers to see what has
happened.  Rebase as follows (assuming your work was on the
branch C<< committer/somework >>):

  % git checkout committer/somework
  % git rebase blead

Then you can merge it into blead like this:

  % git checkout blead
  % git merge --no-ff --no-commit committer/somework
  % git commit -a

The switches above deserve explanation.  C<--no-ff> indicates that even
if all your work can be applied linearly against blead, a merge commit
should still be prepared.  This ensures that all your work will be shown
as a side branch, with all its commits merged into the mainstream blead
by the merge commit.

C<--no-commit> means that the merge commit will be I<prepared> but not
I<committed>.  The commit is then actually performed when you run the
next command, which will bring up your editor to describe the commit.
Without C<--no-commit>, the commit would be made with nearly no useful
message, which would greatly diminish the value of the merge commit as a
placeholder for the work's description.

When describing the merge commit, explain the purpose of the branch, and
keep in mind that this description will probably be used by the
eventual release engineer when reviewing the next perldelta document.
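
The effect of C<--no-ff> can be checked in a scratch repository: even
though the topic branch could be fast-forwarded, the result is a merge
commit with two parents. This is a sketch only. It assumes git is
installed; the repository name C<merge-demo>, the file names and the
C<Demo> identity are hypothetical, and C<-m> is used in place of the
editor so that the example runs unattended.

```shell
set -eu
# Hypothetical identity so that commits work without any global config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q merge-demo
cd merge-demo
git symbolic-ref HEAD refs/heads/blead

echo base > base.txt
git add base.txt
git commit -q -m 'Base commit'

# Two commits on a topic branch that sits directly on top of blead.
git checkout -q -b committer/somework
echo one > one.txt
git add one.txt
git commit -q -m 'Work, part 1'
echo two > two.txt
git add two.txt
git commit -q -m 'Work, part 2'

# Prepare the merge without committing, then commit with a real message.
git checkout -q blead
git merge -q --no-ff --no-commit committer/somework > /dev/null
git commit -q -m 'Merge committer/somework: describe the purpose here'

# A merge commit records one "parent" header per merged line of history.
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents of the new blead tip: $parent_count"

# Following only first parents shows the merge as a single step on blead.
git log --first-parent --format=%s
```

Without C<--no-ff>, the same merge would simply move C<blead> forward
and no merge commit (and so no summary message) would exist.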

=head2 Committing to maintenance versions

Maintenance versions should only be altered to add critical bug fixes,
see L<perlpolicy>.

To commit to a maintenance version of perl, you need to create a local
tracking branch:

  % git checkout --track -b maint-5.005 origin/maint-5.005

This creates a local branch named C<maint-5.005>, which tracks the
remote branch C<origin/maint-5.005>. Then you can pull, commit, merge
and push as before.
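
Whether a local branch really tracks the intended remote branch can be
confirmed with the C<@{upstream}> notation. The following sketch
assumes git is installed; the branch C<maint-demo> is a made-up
stand-in for a real maint branch, and the directory names and C<Demo>
identity are likewise hypothetical.

```shell
set -eu
# Hypothetical identity so that commits work without any global config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"

# A stand-in upstream with a blead branch and a maintenance branch.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/blead
git -C upstream commit -q --allow-empty -m 'Base commit'
git -C upstream branch maint-demo

git clone -q upstream work
cd work

# Create the local tracking branch, as in the command shown above.
git checkout -q --track -b maint-demo origin/maint-demo

# Ask git which remote-tracking branch the local branch follows.
tracked=$(git rev-parse --abbrev-ref 'maint-demo@{upstream}')
echo "maint-demo tracks: $tracked"   # prints "maint-demo tracks: origin/maint-demo"
```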

You can also cherry-pick commits from blead and another branch, by
using the C<git cherry-pick> command. It is recommended to use the
B<-x> option to C<git cherry-pick> in order to record the SHA1 of the
original commit in the new commit message.
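
What C<-x> adds to the commit message can be seen in a scratch
repository. This is a sketch under stated assumptions: git is
installed, and the repository C<pick-demo>, the branch C<maint-demo>,
the file names and the C<Demo> identity are invented for the example.

```shell
set -eu
# Hypothetical identity so that commits work without any global config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)
cd "$sandbox"
git init -q pick-demo
cd pick-demo
git symbolic-ref HEAD refs/heads/blead

echo base > base.txt
git add base.txt
git commit -q -m 'Base commit'

# The maintenance branch forks off here.
git branch maint-demo

# A fix lands on blead afterwards.
echo fix > fix.txt
git add fix.txt
git commit -q -m 'Fix a critical bug'
fix=$(git rev-parse HEAD)

# Bring just that fix over to the maintenance branch, recording its origin.
git checkout -q maint-demo
git cherry-pick -x "$fix" > /dev/null

message=$(git log -1 --format=%B)
printf '%s\n' "$message"
```

The last line of the printed message has the form
C<(cherry picked from commit ...)> and names the original commit on
blead, which makes it possible to trace the backport later.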

Before pushing any change to a maint version, make sure you've
satisfied the steps in L</Committing to blead> above.

=head2 Merging from a branch via GitHub

While we don't encourage the submission of patches via GitHub, that
will still happen. Here is a guide to merging patches from a GitHub
repository.

  % git remote add avar git://github.com/avar/perl.git
  % git fetch avar

Now you can see the differences between the branch and blead:

  % git diff avar/orange

And you can see the commits:

  % git log avar/orange

If you approve of a specific commit, you can cherry pick it:

  % git cherry-pick 0c24b290ae02b2ab3304f51d5e11e85eb3659eae

Or you could just merge the whole branch if you like it all:

  % git merge avar/orange

And then push back to the repository:

  % git push origin blead

=head2 Using a smoke-me branch to test changes

Sometimes a change affects code paths which you cannot test on the OSes
which are directly available to you and it would be wise to have users
on other OSes test the change before you commit it to blead.

Fortunately, there is a way to get your change smoke-tested on various
OSes: push it to a "smoke-me" branch and wait for certain automated
smoke-testers to report the results from their OSes.

The procedure for doing this is roughly as follows (using the example of
tonyc's smoke-me branch called win32stat):

First, make a local branch and switch to it:

  % git checkout -b win32stat

Make some changes, build perl and test your changes, then commit them to
your local branch. Then push your local branch to a remote smoke-me
branch:

  % git push origin win32stat:smoke-me/tonyc/win32stat

Now you can switch back to blead locally:

  % git checkout blead

and continue working on other things while you wait a day or two,
keeping an eye on the results reported for your smoke-me branch at
L<http://perl.develop-help.com/?b=smoke-me/tonyc/win32stat>.

If all is well then update your blead branch:

  % git pull

then check out your smoke-me branch once more and rebase it on blead:

  % git rebase blead win32stat

Now switch back to blead and merge your smoke-me branch into it:

  % git checkout blead
  % git merge win32stat

As described earlier, if there are many changes on your smoke-me branch
then you should prepare a merge commit in which to give an overview of
those changes by using the following command instead of the last
command above:

  % git merge win32stat --no-ff --no-commit

You should now build perl and test your (merged) changes one last time
(ideally run the whole test suite, but failing that at least run the
F<t/porting/*.t> tests) before pushing your changes as usual:

  % git push origin blead

Finally, you should then delete the remote smoke-me branch:

  % git push origin :smoke-me/tonyc/win32stat

(which is likely to produce a warning like this, which can be ignored:

 remote: fatal: ambiguous argument
                                  'refs/heads/smoke-me/tonyc/win32stat':
 unknown revision or path not in the working tree.
 remote: Use '--' to separate paths from revisions

) and then delete your local branch:

  % git branch -d win32stat

=head2 A note on camel and dromedary

The committers have SSH access to the two servers that serve
C<perl5.git.perl.org>. One is C<perl5.git.perl.org> itself (I<camel>),
which is the 'master' repository. The second one is
C<users.perl5.git.perl.org> (I<dromedary>), which can be used for
general testing and development. Dromedary syncs the git tree from
camel every few minutes; you should not push there. Both machines also
have a full CPAN mirror in F</srv/CPAN>; please use this. To share files
with the general public, dromedary serves your F<~/public_html/> as
C<L<http://users.perl5.git.perl.org/~yourlogin/>>.

These hosts have fairly strict firewalls to the outside. Outgoing, only
rsync, ssh and git are allowed. For http and ftp, you can use
L<http://webproxy:3128> as proxy. Incoming, the firewall tries to detect
attacks and blocks IP addresses with suspicious activity. This
sometimes (but very rarely) has false positives and you might get
blocked. The quickest way to get unblocked is to notify the admins.

These two boxes are owned, hosted, and operated by booking.com. You can
reach the sysadmins in #p5p on irc.perl.org or via mail to
L<perl5-porters@perl.org|mailto:perl5-porters@perl.org>.
perl589delta.pod000064400000151214150344123500007473 0ustar00=head1 NAME

perl589delta - what is new for perl v5.8.9

=head1 DESCRIPTION

This document describes differences between the 5.8.8 release and
the 5.8.9 release.

=head1 Notice

The 5.8.9 release will be the last significant release of the 5.8.x
series. Any future releases of 5.8.x will likely only be to deal with
security issues and platform build failures. Hence you should plan to
migrate to 5.10.x, if you have not started already.
See L</"Known Problems"> for more information.

=head1 Incompatible Changes

A particular construction in the source code of extensions written in C++
may need changing. See L</"Changed Internals"> for more details. All
extensions written in C, most written in C++, and all existing compiled
extensions are unaffected. This was necessary to improve C++ support.

Other than this, there are no changes intentionally incompatible with 5.8.8.
If any exist, they are bugs and reports are welcome.

=head1 Core Enhancements

=head2 Unicode Character Database 5.1.0

The copy of the Unicode Character Database included in Perl 5.8 has
been updated to 5.1.0 from 4.1.0. See
L<http://www.unicode.org/versions/Unicode5.1.0/#NotableChanges> for the
notable changes.

=head2 stat and -X on directory handles

It is now possible to call C<stat> and the C<-X> filestat operators on
directory handles. As both directory and file handles are barewords, there
can be ambiguities over which was intended. In these situations the file
handle semantics are preferred. Both also treat C<*FILE{IO}> filehandles
like C<*FILE> filehandles.
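
A minimal sketch of the above, assuming a platform where perl can obtain a
file descriptor for an open directory (the directory C<.> and the variable
names are purely illustrative):

    use strict;
    use warnings;

    opendir(my $dh, '.') or die "Cannot open current directory: $!";

    # stat and -d are applied to the directory handle itself, not to a path
    my @fields = stat($dh) or die "stat on directory handle failed: $!";
    my $is_dir = -d $dh ? 1 : 0;

    printf "inode %s, directory: %s\n", $fields[1], $is_dir ? 'yes' : 'no';
    closedir($dh);

On platforms without such support the C<stat> call is expected to fail, in
which case the sketch dies with the reason in C<$!>.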

=head2 Source filters in @INC

It's possible to enhance the mechanism of subroutine hooks in @INC by
adding a source filter on top of the filehandle opened and returned by the
hook. This feature was planned a long time ago, but wasn't quite working
until now. See L<perlfunc/require> for details. (Nicholas Clark)
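
The following is only a sketch of such a hook, under the assumption that
the interface is the one documented in L<perlfunc/require>: the hook
returns a filehandle followed by a subroutine, and that subroutine sees
each line in C<$_>. The module name C<Shout> and its source are made up
for the example.

    use strict;
    use warnings;

    # Source of a hypothetical module, kept in memory for the example.
    my $source = join "\n",
        'package Shout;',
        q{sub greet { return 'hello' }},
        '1;',
        '';

    unshift @INC, sub {
        my ($self, $file) = @_;
        return unless $file eq 'Shout.pm';    # decline anything else

        open my $fh, '<', \$source
            or die "Cannot open in-memory file: $!";

        # The source filter: called with each line in $_.  It returns 1
        # for a line of code and 0 once the input is exhausted.
        my $filter = sub {
            s/hello/HELLO/;
            return length($_) ? 1 : 0;
        };
        return ($fh, $filter);
    };

    require Shout;
    my $greeting = Shout::greet();
    print "$greeting\n";

The filter rewrites the module text before perl compiles it, so the
subroutine that ends up defined differs from the one in C<$source>.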

=head2 Exceptions in constant folding

The constant folding routine is now wrapped in an exception handler, and
if folding throws an exception (such as attempting to evaluate 0/0), perl
now retains the current optree, rather than aborting the whole program.
Without this change, programs would not compile if they had expressions that
happened to generate exceptions, even though those expressions were in code
that could never be reached at runtime. (Nicholas Clark, Dave Mitchell)
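
As an illustration (the subroutine name is hypothetical), the following
program compiles even though it contains a constant expression that
cannot be folded; the error only surfaces if the code is actually run:

    use strict;
    use warnings;

    # 0/0 is a constant expression that perl tries to fold at compile
    # time.  The sub still compiles; the error is raised when it is called.
    sub never_needed { return 0 / 0 }

    my $survived = eval { never_needed(); 1 };
    my $error    = $@;
    print "compiled fine; calling it failed with: $error" unless $survived;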

=head2 C<no VERSION>

You can now use C<no> followed by a version number to specify that you
want to use a version of perl older than the specified one.
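
For example, assuming the running perl is a 5.x release at or after 5.6.0
(the version numbers are chosen only for illustration):

    use strict;
    use warnings;

    # Compiles on any perl 5: the running perl is older than 6.0.
    no 6.0;

    # Rejected inside the string eval, because the running perl is not
    # older than 5.6.0.
    my $accepted  = eval "no 5.006; 1";
    my $complaint = $@;
    print "no 5.006 was rejected: $complaint" unless $accepted;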

=head2 Improved internal UTF-8 caching code

The code that caches calculated UTF-8 byte offsets for character offsets for
a string has been re-written. Several bugs have been located and eliminated,
and the code now makes better use of the information it has, so should be
faster. In particular, it doesn't scan to the end of a string before
calculating an offset within the string, which should speed up some operations
on long strings. It is now possible to disable the caching code at run time,
to verify that it is not the cause of suspected problems.

=head2 Runtime relocatable installations

There is now F<Configure> support for creating a perl tree that is relocatable
at run time. See L</Relocatable installations>.

=head2 New internal variables

=over 4

=item C<${^CHILD_ERROR_NATIVE}>

This variable gives the native status returned by the last pipe close,
backtick command, successful call to C<wait> or C<waitpid>, or from the
C<system> operator. See L<perlvar> for details. (Contributed by Gisle Aas.)

=item C<${^UTF8CACHE}>

This variable controls the state of the internal UTF-8 offset caching code.
1 for on (the default), 0 for off, -1 to debug the caching code by checking
all its results against linear scans, and panicking on any discrepancy.

=back
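
A short sketch of both variables in use. The child command and the string
being indexed are arbitrary, and how the native status relates to C<$?> is
platform-dependent, so the sketch only prints it:

    use strict;
    use warnings;

    # Run a child perl that exits with status 3.
    system($^X, '-e', 'exit 3');
    my $exit_code = $? >> 8;
    my $native    = ${^CHILD_ERROR_NATIVE};
    print "exit code $exit_code, native status $native\n";

    # Switch the UTF-8 offset cache off around a suspect operation...
    ${^UTF8CACHE} = 0;
    my $string = "\x{263A}" x 1000;
    my $char   = substr($string, 500, 1);
    # ...and back on again afterwards.
    ${^UTF8CACHE} = 1;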

=head2 C<readpipe> is now overridable

The built-in function C<readpipe> is now overridable. Overriding it also
overrides its operator counterpart, C<qx//> (also known as C<``>).
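
A minimal sketch, assuming the usual C<CORE::GLOBAL> mechanism for
overriding built-ins; the replacement subroutine here is invented for the
example and runs no command at all:

    use strict;
    use warnings;

    BEGIN {
        # Install the override before the calls below are compiled.
        no warnings 'once';
        *CORE::GLOBAL::readpipe = sub {
            my ($command) = @_;
            return "intercepted: $command\n";
        };
    }

    my $via_function = readpipe('echo hello');
    my $via_operator = `echo hello`;
    print $via_function, $via_operator;

Both calls should reach the replacement, so nothing is passed to the
shell.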

=head2 Simple exception handling macros

Perl 5.8.9 (and 5.10.0 onwards) now provides a couple of macros to do very
basic exception handling in XS modules. You can use these macros if you call
code that may C<croak>, but you need to do some cleanup before giving control
back to Perl. See L<perlguts/Exception Handling> for more details.

=head2 -D option enhancements 

=over

=item *

C<-Dq> suppresses the I<EXECUTING...> message when running under C<-D>

=item *

C<-Dl> logs runops loop entry and exit, and jump level popping.

=item *

C<-Dv> displays the process id as part of the trace output.

=back

=head2 XS-assisted SWASHGET

Some pure-perl code that the regexp engine was using to retrieve Unicode
properties and transliteration mappings has been reimplemented in XS
for faster execution.
(SADAHIRO Tomoyuki)

=head2 Constant subroutines

The interpreter internals now support a far more memory efficient form of
inlineable constants. Storing a reference to a constant value in a symbol
table is equivalent to a full typeglob referencing a constant subroutine,
but using about 400 bytes less memory. This proxy constant subroutine is
automatically upgraded to a real typeglob with subroutine if necessary.
The approach taken is analogous to the existing space optimisation for
subroutine stub declarations, which are stored as plain scalars in place
of the full typeglob.

However, to aid backwards compatibility of existing code, which (wrongly)
does not expect anything other than typeglobs in symbol tables, nothing in
core uses this feature, other than the regression tests.

Stubs for prototyped subroutines have been stored in symbol tables as plain
strings, and stubs for unprototyped subroutines as the number -1, since 5.005,
so code which assumes that the core only places typeglobs in symbol tables
has been making incorrect assumptions for over 10 years.
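
As a rough sketch of the proxy form described above (the package C<Demo>
and the constant C<ANSWER> are hypothetical names), a reference to a value
can be stored straight into a symbol table and then called like any other
constant subroutine:

    use strict;
    use warnings;

    BEGIN {
        # Store a reference to the value directly in the symbol table of
        # the package Demo, instead of creating a typeglob and a sub.
        $Demo::{ANSWER} = \42;
    }

    # The call makes perl upgrade the entry to a real constant sub.
    my $answer = Demo::ANSWER();
    print "The answer is $answer\n";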

=head1 New Platforms

Compile support added for:

=over

=item *

DragonFlyBSD

=item *

MidnightBSD

=item *

MirOS BSD

=item *

RISC OS 

=item *

Cray XT4/Catamount

=back

=head1 Modules and Pragmata

=head2 New Modules

=over

=item *

C<Module::Pluggable> is a simple framework to create modules that accept
pluggable sub-modules. The bundled version is 3.8

=item *

C<Module::CoreList> is a hash of hashes that is keyed on perl version as
indicated in C<$]>. The bundled version is 2.17

=item *

C<Win32API::File> is now available in core on Microsoft Windows. The bundled
version is 0.1001_01

=item * 

C<Devel::InnerPackage> finds all the packages defined by a single file. It is
part of the C<Module::Pluggable> distribution. The bundled version is 0.3

=back

=head2 Updated Modules

=over

=item *

C<attributes> upgraded to version 0.09

=item *

C<AutoLoader> upgraded to version 5.67

=item *

C<AutoSplit> upgraded to 1.06

=item *

C<autouse> upgraded to version 1.06

=item *

C<B> upgraded from 1.09_01 to 1.19

=over

=item *

provides new pad related abstraction macros C<B::NV::COP_SEQ_RANGE_LOW>,
C<B::NV::COP_SEQ_RANGE_HIGH>, C<B::NV::PARENT_PAD_INDEX>,
C<B::NV::PARENT_FAKELEX_FLAGS>, which hide the differences in storage in
5.10.0 and later.

=item *

provides C<B::sub_generation>, which exposes C<PL_sub_generation>

=item *

provides C<B::GV::isGV_with_GP>, which on pre-5.10 perls always returns true.

=item *

New type C<B::HE> added with methods C<VAL>, C<HASH> and C<SVKEY_force>

=item *

The C<B::GVf_IMPORTED_CV> flag is now set correctly when a proxy
constant subroutine is imported.

=item *

bugs fixed in the handling of C<PMOP>s.

=item *

C<B::BM::PREVIOUS> now returns C<U32>, not C<U16>.
C<B::CV::START> and C<B::CV::ROOT> now return C<NULL> on an XSUB,
C<B::CV::XSUB> and C<B::CV::XSUBANY> return 0 on a non-XSUB.

=back

=item *

C<B::C> upgraded to 1.05

=item *

C<B::Concise> upgraded to 0.76

=over

=item *

new option C<-src> causes the rendering of each statement (starting with
the nextstate OP) to be preceded by the first line of source code that
generates it.

=item *

new option C<-stash="somepackage"> C<require>s "somepackage" and then renders
each function defined in its namespace.

=item *

now has documentation of detailed hint symbols.

=back

=item *

C<B::Debug> upgraded to version 1.05

=item *

C<B::Deparse> upgraded to version 0.87

=over 4

=item *

properly deparse C<print readpipe $x, $y>.

=item *

now handles C<< ''->() >>, C<::()>, C<sub :: {}>, I<etc.> correctly [RT #43010].
All bugs in parsing these kinds of syntax are now fixed:

    perl -MO=Deparse -e '"my %h = "->()'
    perl -MO=Deparse -e '::->()'
    perl -MO=Deparse -e 'sub :: {}'
    perl -MO=Deparse -e 'package a; sub a::b::c {}'
    perl -MO=Deparse -e 'sub the::main::road {}'

=item *

does B<not> deparse C<$^H{v_string}>, which is automatically set by the
internals.

=back

=item *

C<B::Lint> upgraded to version 1.11

=item *

C<B::Terse> upgraded to version 1.05

=item *

C<base> upgraded to version 2.13

=over 4

=item *

fixes a bug where loading a module via F<base.pm> would mask a global
C<$SIG{__DIE__}> in that module.

=item *

pushes all classes onto C<@ISA> at once

=back

=item *

C<Benchmark> upgraded to version 1.10

=item *

C<bigint> upgraded to 0.23

=item *

C<bignum> upgraded to 0.23

=item *

C<bigrat> upgraded to 0.23

=item *

C<blib> upgraded to 0.04

=item *

C<Carp> upgraded to version 1.10

The argument backtrace code now shows C<undef> as C<undef>,
instead of a string I<"undef">.

=item *

C<CGI> upgraded to version 3.42

=item *

C<charnames> upgraded to 1.06

=item *

C<constant> upgraded to version 1.17

=item *

C<CPAN> upgraded to version 1.9301

=item *

C<Cwd> upgraded to version 3.29 with some platform specific
improvements (including for VMS).

=item *

C<Data::Dumper> upgraded to version 2.121_17

=over

=item *

Fixes hash iterator current position with the pure Perl version [RT #40668]

=item *

Performance enhancements, which will be most evident on platforms where
repeated calls to C's C<realloc()> are slow, such as Win32.

=back

=item *

C<DB_File> upgraded to version 1.817

=item *

C<DBM_Filter> upgraded to version 0.02

=item *

C<Devel::DProf> upgraded to version 20080331.00

=item *

C<Devel::Peek> upgraded to version 1.04

=item *

C<Devel::PPPort> upgraded to version 3.14

=item *

C<diagnostics> upgraded to version 1.16

=item *

C<Digest> upgraded to version 1.15

=item *

C<Digest::MD5> upgraded to version 2.37

=item *

C<DirHandle> upgraded to version 1.02

=over

=item *

now localises C<$.>, C<$@>, C<$!>, C<$^E>, and C<$?> before closing the
directory handle to suppress leaking any side effects of warnings about it
already being closed.

=back

=item *

C<DynaLoader> upgraded to version 1.09

C<DynaLoader> can now dynamically load a loadable object from a file with a
non-default file extension.

=item *

C<Encode> upgraded to version 2.26

C<Encode::Alias> includes a fix for encoding "646" on Solaris (better known as
ASCII).

=item *

C<English> upgraded to version 1.03

=item *

C<Errno> upgraded to version 1.10

=item *

C<Exporter> upgraded to version 5.63

=item *

C<ExtUtils::Command> upgraded to version 1.15

=item *

C<ExtUtils::Constant> upgraded to version 0.21

=item *

C<ExtUtils::Embed> upgraded to version 1.28

=item *

C<ExtUtils::Install> upgraded to version 1.50_01

=item *

C<ExtUtils::Installed> upgraded to version 1.43

=item *

C<ExtUtils::MakeMaker> upgraded to version 6.48

=over

=item *

support for C<INSTALLSITESCRIPT> and C<INSTALLVENDORSCRIPT>
configuration.

=back

=item *

C<ExtUtils::Manifest> upgraded to version 1.55

=item *

C<ExtUtils::ParseXS> upgraded to version 2.19

=item *

C<Fatal> upgraded to version 1.06

=over

=item *

allows built-ins in C<CORE::GLOBAL> to be made fatal.

=back

=item *

C<Fcntl> upgraded to version 1.06

=item *

C<fields> upgraded to version 2.12

=item *

C<File::Basename> upgraded to version 2.77

=item *

C<FileCache> upgraded to version 1.07

=item *

C<File::Compare> upgraded to 1.1005

=item *

C<File::Copy> upgraded to 2.13

=over 4

=item *

now uses 3-arg open.

=back

=item *

C<File::DosGlob> upgraded to 1.01

=item *

C<File::Find> upgraded to version 1.13

=item *

C<File::Glob> upgraded to version 1.06

=over

=item *

fixes spurious results with brackets inside braces.

=back

=item *

C<File::Path> upgraded to version 2.07_02

=item *

C<File::Spec> upgraded to version 3.29

=over 4

=item *

improved handling of bad arguments.

=item *

some platform specific improvements (including for VMS and Cygwin), with
an optimisation on C<abs2rel> when handling both relative arguments.

=back

=item *

C<File::stat> upgraded to version 1.01

=item *

C<File::Temp> upgraded to version 0.20

=item *

C<filetest> upgraded to version 1.02

=item *

C<Filter::Util::Call> upgraded to version 1.07

=item *

C<Filter::Simple> upgraded to version 0.83

=item * 

C<FindBin> upgraded to version 1.49

=item *

C<GDBM_File> upgraded to version 1.09

=item *

C<Getopt::Long> upgraded to version 2.37

=item *

C<Getopt::Std> upgraded to version 1.06

=item *

C<Hash::Util> upgraded to version 0.06

=item *

C<if> upgraded to version 0.05

=item *

C<IO> upgraded to version 1.23

Reduced number of calls to C<getpeername> in C<IO::Socket>

=item *

C<IPC::Open2> upgraded to version 1.03

=item *

C<IPC::Open3> upgraded to version 1.03

=item *

C<IPC::SysV> upgraded to version 2.00

=item *

C<lib> upgraded to version 0.61

=over

=item *

avoid warning about loading F<.par> files.

=back

=item *

C<libnet> upgraded to version 1.22

=item *

C<List::Util> upgraded to 1.19

=item *

C<Locale::Maketext> upgraded to 1.13

=item *

C<Math::BigFloat> upgraded to version 1.60

=item *

C<Math::BigInt> upgraded to version 1.89

=item *

C<Math::BigRat> upgraded to version 0.22

=over 4

=item *

implements new C<as_float> method.

=back

=item *

C<Math::Complex> upgraded to version 1.54.

=item *

C<Math::Trig> upgraded to version 1.18.

=item *

C<NDBM_File> upgraded to version 1.07

=over

=item *

improve F<g++> handling for systems using GDBM compatibility headers.

=back

=item *

C<Net::Ping> upgraded to version 2.35

=item *

C<NEXT> upgraded to version 0.61

=over

=item *

fix several bugs with C<NEXT> when working with C<AUTOLOAD>, C<eval> block, and
within overloaded stringification.

=back

=item *

C<ODBM_File> upgraded to 1.07

=item *

C<open> upgraded to 1.06

=item *

C<ops> upgraded to 1.02

=item *

C<PerlIO::encoding> upgraded to version 0.11

=item *

C<PerlIO::scalar> upgraded to version 0.06

=over 4

=item *

C<PerlIO::scalar> now respects readonly-ness [RT #40267].

=back

=item *

C<PerlIO::via> upgraded to version 0.05

=item *

C<Pod::Html> upgraded to version 1.09

=item *

C<Pod::Parser> upgraded to version 1.35

=item * 

C<Pod::Usage> upgraded to version 1.35

=item *

C<POSIX> upgraded to version 1.15

=over

=item *

C<POSIX> constants that duplicate those in C<Fcntl> are now imported from
C<Fcntl> and re-exported, rather than being duplicated by C<POSIX>

=item *

C<POSIX::remove> can remove empty directories.

=item *

C<POSIX::setlocale> safer to call multiple times.

=item *

C<POSIX::SigRt> added, which provides access to POSIX realtime signal
functionality on systems that support it.

=back

=item *

C<re> upgraded to version 0.06_01

=item *

C<Safe> upgraded to version 2.16

=item *

C<Scalar::Util> upgraded to 1.19

=item *

C<SDBM_File> upgraded to version 1.06

=item *

C<SelfLoader> upgraded to version 1.17

=item *

C<Shell> upgraded to version 0.72

=item *

C<sigtrap> upgraded to version 1.04

=item *

C<Socket> upgraded to version 1.81

=over

=item *

this fixes an optimistic use of C<gethostbyname>

=back

=item *

C<Storable> upgraded to 2.19

=item *

C<Switch> upgraded to version 2.13

=item *

C<Sys::Syslog> upgraded to version 0.27

=item *

C<Term::ANSIColor> upgraded to version 1.12

=item *

C<Term::Cap> upgraded to version 1.12

=item *

C<Term::ReadLine> upgraded to version 1.03

=item *

C<Test::Builder> upgraded to version 0.80

=item *

C<Test::Harness> upgraded to version 2.64

=over

=item *

this makes it able to handle newlines.

=back

=item *

C<Test::More> upgraded to version 0.80

=item *

C<Test::Simple> upgraded to version 0.80

=item *

C<Text::Balanced> upgraded to version 1.98

=item *

C<Text::ParseWords> upgraded to version 3.27

=item *

C<Text::Soundex> upgraded to version 3.03

=item *

C<Text::Tabs> upgraded to version 2007.1117

=item *

C<Text::Wrap> upgraded to version 2006.1117

=item *

C<Thread> upgraded to version 2.01

=item *

C<Thread::Semaphore> upgraded to version 2.09

=item *

C<Thread::Queue> upgraded to version 2.11

=over

=item *

added capability to add complex structures (e.g., hash of hashes) to queues.

=item *

added capability to dequeue multiple items at once.

=item *

added new methods to inspect and manipulate queues:  C<peek>, C<insert> and
C<extract>

=back

=item *

C<Tie::Handle> upgraded to version 4.2

=item *

C<Tie::Hash> upgraded to version 1.03

=item *

C<Tie::Memoize> upgraded to version 1.1

=over

=item *

C<Tie::Memoize::EXISTS> now correctly caches its results.

=back

=item *

C<Tie::RefHash> upgraded to version 1.38

=item *

C<Tie::Scalar> upgraded to version 1.01

=item *

C<Tie::StdHandle> upgraded to version 4.2

=item *

C<Time::gmtime> upgraded to version 1.03

=item *

C<Time::Local> upgraded to version 1.1901

=item *

C<Time::HiRes> upgraded to version 1.9715 with various build improvements 
(including VMS) and minor platform-specific bug fixes (including
for HP-UX 11 ia64).

=item *

C<threads> upgraded to 1.71

=over

=item *

new thread state information methods: C<is_running>, C<is_detached>
and C<is_joinable>.  C<list> method enhanced to return running or joinable
threads.

=item *

new thread signal method: C<kill>

=item *

added capability to specify thread stack size.

=item *

added capability to control thread exiting behavior.  Added a new C<exit>
method.

=back

=item *

C<threads::shared> upgraded to version 1.27

=over

=item *

smaller and faster implementation that eliminates one internal structure and
the consequent level of indirection.

=item *

user locks are now stored in a safer manner.

=item *

new function C<shared_clone> creates a copy of an object leaving
shared elements as-is and deep-cloning non-shared elements.

=item *

added new C<is_shared> method.

=back

=item *

C<Unicode::Normalize> upgraded to version 1.02

=item *

C<Unicode::UCD> upgraded to version 0.25

=item *

C<warnings> upgraded to version 1.05_01

=item *

C<Win32> upgraded to version 0.38

=over 4

=item *

added new function C<GetCurrentProcessId> which returns the regular Windows
process identifier of the current process, even when called from within a fork.

=back

=item *

C<XSLoader> upgraded to version 0.10

=item *

C<XS::APItest> and C<XS::Typemap> are for internal use only and hence
no longer installed. Many more tests have been added to C<XS::APItest>.

=back

=head1 Utility Changes

=head2 debugger upgraded to version 1.31

=over 4

=item *

Andreas KE<ouml>nig contributed two functions to save and load the debugger
history.

=item *

C<NEXT::AUTOLOAD> no longer emits warnings under the debugger.

=item *

The debugger should now correctly find the tty device on OS X 10.5 and VMS
when the program C<fork>s.

=item *

LVALUE subs now work inside the debugger.

=back

=head2 F<perlthanks>

Perl 5.8.9 adds a new utility F<perlthanks>, which is a variant of F<perlbug>,
but for sending non-bug-reports to the authors and maintainers of Perl.
Getting nothing but bug reports can become a bit demoralising - we'll see if
this changes things.

=head2 F<perlbug>

F<perlbug> now checks if you're reporting about a non-core module and suggests
you report it to the CPAN author instead.

=head2 F<h2xs>

=over

=item *

won't define an empty string as a constant [RT #25366]

=item *

has examples for C<h2xs -X>

=back

=head2 F<h2ph>

=over 4

=item *

now attempts to deal sensibly with the difference in path implications
between C<""> and C<< E<lt>E<gt> >> quoting in C<#include> statements.

=item *

now generates correct code for C<#if defined A || defined B>
[RT #39130]

=back

=head1 New Documentation

As usual, the documentation received its share of corrections, clarifications
and other nitfixes. More C<< X<...> >> tags were added for indexing.

L<perlunitut> is a tutorial written by Juerd Waalboer on Unicode-related
terminology and how to correctly handle Unicode in Perl scripts.

The section of L<perlunicode> on user-defined properties has been updated.

L<perluniintro> has been updated in the example of detecting data that is not
valid in a particular encoding.

L<perlcommunity> provides an overview of the Perl Community along with further
resources.

L<CORE> documents the pseudo-namespace for Perl's core routines.

=head1 Changes to Existing Documentation

L<perlglossary> adds I<deprecated modules and features> and I<to be dropped modules>.

L<perlhack> has been updated with resources on smoke testing.

The Perl FAQs (F<perlfaq1>..F<perlfaq9>) have been updated.

L<perlcheat> is updated with better details on C<\w>, C<\d>, and C<\s>.

L<perldebug> is updated with information on how to call the debugger.

L<perldiag> documentation updated with I<subroutine with an ampersand> on the
argument to C<exists> and C<delete> and also several terminology updates on
warnings.

L<perlfork> documents the limitation of C<exec> inside pseudo-processes.

L<perlfunc>:

=over

=item *

The documentation of C<caller> and C<pop> has been fixed.

=item *

Function C<alarm> now mentions C<Time::HiRes::ualarm> in preference
to C<select>.

=item *

Regarding precedence in C<-X>, filetest operators are the same as unary
operators, but not regarding parsing and parentheses (spotted by Eirik Berg
Hanssen).

=item *

C<reverse> function documentation received scalar context examples.

=back

L<perllocale> documentation is adjusted for number localization and
C<POSIX::setlocale> to fix Debian bug #379463.

L<perlmodlib> is updated with C<CPAN::API::HOWTO> and
C<Sys::Syslog::win32::Win32> 

L<perlre> documentation updated to reflect the differences between
C<[[:xxxxx:]]> and C<\p{IsXxxxx}> matches. Also added section on C</g> and
C</c> modifiers.

L<perlreguts> describes the internals of the regular expressions engine. It
was contributed by Yves Orton.

L<perlrebackslash> describes all perl regular expression backslash and escape
sequences.

L<perlrecharclass> describes the syntax and use of character classes in
Perl Regular Expressions.

L<perlrun> is updated to clarify the hash seed I<PERL_HASH_SEED>, and gives
more information on the options C<-x> and C<-u>.

L<perlsub> example is updated to use a lexical variable for C<opendir> syntax.

L<perlvar> fixes confusion about real GID C<$(> and effective GID C<$)>. 

Perl thread tutorial example is fixed in section
L<perlthrtut/Queues: Passing Data Around> and L<perlthrtut>.

L<perlhack> documentation extensively improved by Jarkko Hietaniemi and others.

L<perltoot> provides information on modifying C<@UNIVERSAL::ISA>.

L<perlport> documentation extended to include different C<kill(-9, ...)>
semantics on Windows. It also clearly states C<dump> is not supported on Win32
and cygwin.

F<INSTALL> has been updated and modernised.

=head1 Performance Enhancements

=over

=item *

The default since perl 5.000 has been for perl to create an empty scalar
with every new typeglob. The increased use of lexical variables means that
most are now unused. Thanks to Nicholas Clark's efforts, Perl can now be
compiled with C<-DPERL_DONT_CREATE_GVSV> to avoid creating these empty scalars.
This will significantly decrease the number of scalars allocated for all
configurations, and the number of scalars that need to be copied for ithread
creation. Whilst this option is binary compatible with existing perl
installations, it does change a long-standing assumption about the
internals, hence it is not enabled by default, as some third party code may
rely on the old behaviour.

We would recommend testing with this configuration on new deployments of
perl, particularly for multi-threaded servers, to see whether all third party
code is compatible with it, as this configuration may give useful performance
improvements. For existing installations we would not recommend changing to
this configuration unless thorough testing is performed before deployment.

=item *

C<diagnostics> no longer uses C<$&>, which results in large speedups
for regexp matching in all code using it.

=item *

Regular expressions classes of a single character are now treated the same as
if the character had been used as a literal, meaning that code that uses
char-classes as an escaping mechanism will see a speedup. (Yves Orton)

=item *

Creating anonymous array and hash references (i.e. C<[]> and C<{}>) now incurs
no more overhead than creating an anonymous list or hash. Nicholas Clark
provided changes with a saving of two ops and one stack push, which was measured
as a slightly better than 5% improvement for these operations.

=item *

Many calls to C<strlen()> have been eliminated, either because the length was
already known, or by adopting or enhancing APIs that pass lengths. This has
been aided by the adoption of a C<my_sprintf()> wrapper, which returns the
correct C89 value - the length of the formatted string. Previously we could
not rely on the return value of C<sprintf()>, because on some ancient but
extant platforms it still returns C<char *>.

=item * 

C<index> is now faster if the search string is stored in UTF-8 but only contains
characters in the Latin-1 range.

=item *

The Unicode swatch cache inside the regexp engine is now used (the lookup had
a key mismatch, present since the initial implementation). [RT #42839]

=back

=head1 Installation and Configuration Improvements

=head2 Relocatable installations

There is now F<Configure> support for creating a relocatable perl tree. If
you F<Configure> with C<-Duserelocatableinc>, then the paths in C<@INC> (and
everything else in C<%Config>) can be optionally located via the path of the
F<perl> executable.

At start time, any paths in C<@INC> or C<Config> that F<Configure> marked
as relocatable (by starting them with C<".../">) are prefixed with the
directory of C<$^X>. This allows the relocation to be configured on a
per-directory basis, although the default with C<-Duserelocatableinc> is that
everything is relocated. The initial install is done to the original configured
prefix.
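
The prefixing rule can be pictured with the following sketch. It is not
perl's own implementation, which lives in the interpreter and may do more
(such as tidying up the resulting path); C<relocate_path> is a
hypothetical helper, and the paths are invented Unix-style examples.

    use strict;
    use warnings;
    use File::Basename qw(dirname);

    # Hypothetical helper, for illustration only: a path marked with a
    # leading ".../" is re-rooted at the directory holding the perl
    # executable; any other path is returned unchanged.
    sub relocate_path {
        my ($path, $perl_path) = @_;
        return $path unless $path =~ s{^\.\.\./}{};
        return dirname($perl_path) . '/' . $path;
    }

    my $relocated = relocate_path('.../../lib/perl5', '/opt/perl/bin/perl');
    my $untouched = relocate_path('/usr/lib/perl5',   '/opt/perl/bin/perl');
    print "$relocated\n$untouched\n";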

=head2 Configuration improvements

F<Configure> is now better at removing temporary files. Tom Callaway
(from RedHat) also contributed patches that complete the set of flags
passed to the compiler and the linker, in particular that C<-fPIC> is now
enabled on Linux. It will also croak when your F</dev/null> isn't a device.

A new configuration variable C<d_pseudofork> has been added to F<Configure>,
and is available as C<$Config{d_pseudofork}> in the C<Config> module. This
distinguishes real C<fork> support from the pseudofork emulation used on
Windows platforms.
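
A script might consult the new variable along these lines (a sketch; the
wording of the three outcomes is ours, and C<d_fork> is the existing
configuration variable for native C<fork>):

    use strict;
    use warnings;
    use Config;

    # d_pseudofork is true where fork is emulated; d_fork is true where
    # the operating system provides a real fork.
    my $fork_kind = $Config{d_pseudofork} ? 'emulated (pseudofork)'
                  : $Config{d_fork}       ? 'native'
                  :                         'unavailable';
    print "fork support on this perl: $fork_kind\n";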

F<Config.pod> and F<config.sh> are now placed correctly for cross-compilation.

C<$Config{useshrplib}> is now 'true' rather than 'yes' when using a shared perl
library.

=head2 Compilation improvements

Parallel makes should work properly now, although there may still be problems
if C<make test> is instructed to run in parallel.

Many compilation warnings have been cleaned up. A very stubborn compiler
warning in C<S_emulate_eaccess()> was killed after six attempts.
F<g++> support has been tuned, especially for FreeBSD.

F<mkppport> has been integrated, and all F<ppport.h> files in the core will now
be autogenerated at build time (and removed during cleanup).

=head2 Installation improvements

F<installman> now works with C<-Duserelocatableinc> and C<DESTDIR>.

F<installperl> no longer installs:

=over 4

=item *

static library files of statically linked extensions when a shared perl library
is being used. (They are not needed. See L</Windows> below).

=item *

F<SIGNATURE> and F<PAUSE*.pub> (CPAN files)

=item *

F<NOTES> and F<PATCHING> (ExtUtils files)

=item *

F<perlld> and F<ld2> (Cygwin files)

=back

=head2 Platform Specific Changes

There are improved hints for AIX, Cygwin, DEC/OSF, FreeBSD, HP/UX, Irix 6,
Linux, MachTen, NetBSD, OS/390, QNX, SCO, Solaris, SunOS, System V Release 5.x
(UnixWare 7, OpenUNIX 8), Ultrix, UMIPS, uts and VOS.

=head3 FreeBSD

=over 4

=item *

Drop C<-std=c89> and C<-ansi> if using C<long long> as the main integral type,
else in FreeBSD 6.2 (and perhaps other releases), system headers do not
declare some functions required by perl.

=back

=head3 Solaris

=over 4

=item *

Starting with Solaris 10, we do not want versioned shared libraries, because
those often indicate a private-use-only library. These problems could often
be triggered when C<SUNWbdb> (Berkeley DB) was installed. Hence if Solaris 10
is detected, set C<ignore_versioned_solibs=y>.

=back

=head3 VMS

=over 4

=item *

Allow IEEE math to be deselected on OpenVMS I64 (but it remains the default).

=item *

Record IEEE usage in C<config.h>

=item *

Help older VMS compilers by using C<ccflags> when building C<munchconfig.exe>.

=item * 

Don't try to build old C<Thread> extension on VMS when C<-Duseithreads> has
been chosen.

=item *

Passing a raw string of "NaN" to F<nawk> causes a core dump - so the string
has been changed to "*NaN*"

=item *

F<t/op/stat.t> tests will now test hard links on VMS if they are supported.

=back

=head3 Windows

=over 4

=item *

When using a shared perl library F<installperl> no longer installs static
library files, import library files and export library files (of statically
linked extensions) and empty bootstrap files (of dynamically linked
extensions). This fixes a problem building PAR-Packer on Win32 with a debug
build of perl.

=item *

Various improvements to the win32 build process, including support for Visual
C++ 2005 Express Edition (aka Visual C++ 8.x).

=item * 

F<perl.exe> will now have an icon if built with MinGW or Borland. 

=item *

Improvements to the perl-static.exe build process.

=item *

Add Win32 makefile option to link all extensions statically.

=item *

The F<WinCE> directory has been merged into the F<Win32> directory.

=item *

C<setlocale> tests have been re-enabled for Windows XP onwards.

=back

=head1 Selected Bug Fixes

=head2 Unicode

Many many bugs related to the internal Unicode implementation (UTF-8) have
been fixed. In particular, long standing bugs related to returning Unicode
via C<tie>, overloading or C<$@> are now gone, some of which were never
reported.

C<unpack> will internally convert the string back from UTF-8 on numeric types.
This is a compromise between the full consistency now in 5.10, and the current
behaviour, which is often used as a "feature" on string types.

Using C<:crlf> and C<UTF-16> IO layers together will now work.

Fixed problems with C<split>, Unicode C</\s+/> and C</ \0/>.

Fixed bug RT #40641 - encoding of Unicode characters in regular expressions.

Fixed a bug where using certain patterns in a regexp led to a panic.
[RT #45337]

Perl no longer segfaults (due to infinite internal recursion) if the locale's
character encoding is not UTF-8 [RT #41442]:

    use open ':locale';
    print STDERR "\x{201e}"; # &bdquo;

=head2 PerlIO

Inconsistencies have been fixed in the reference counting PerlIO uses to keep
track of Unix file descriptors, and the API used by XS code to manage getting
and releasing C<FILE *>s.

=head2 Magic

Several bugs have been fixed in Magic, the internal system used to implement
features such as C<tie>, tainting and threads sharing.

C<undef @array> on a tied array now correctly calls the C<CLEAR> method.

Some of the bitwise ops were not checking whether their arguments were magical
before using them. [RT #24816]

Magic is no longer invoked twice by the expression C<\&$x>.

A bug with assigning large numbers and tainting has been resolved.
[RT #40708]

A new entry has been added to the MAGIC vtable - C<svt_local>. This is used
when copying magic to the new value during C<local>, allowing certain problems
with localising shared variables to be resolved.

For the implementation details, see L<perlguts/Magic Virtual Tables>.

=head2 Reblessing overloaded objects now works

Internally, perl object-ness is on the referent, not the reference, even
though methods can only be called via a reference. However, the original
implementation of overloading stored flags related to overloading on the
reference, relying on the flags being copied when the reference was copied,
or set at the creation of a new reference. This manifests in a bug - if you
rebless an object from a class that has overloading, into one that does not,
then any other existing references think that they (still) point to an
overloaded object, choose these C code paths, and then throw errors.
Analogously, blessing into an overloaded class when other references exist will
result in them not using overloading.

The implementation has been fixed for 5.10, but this fix changes the semantics
of flag bits, so is not binary compatible, so can't be applied to 5.8.9.
However, 5.8.9 has a work-around that implements the same bug fix. If the
referent has multiple references, then all the other references are located and
corrected. A full search is avoided whenever possible by scanning lexicals
outwards from the current subroutine, and the argument stack.
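
The scenario described above can be reproduced with a few lines. In this
sketch the class names C<Loud> and C<Quiet> are invented for the example;
the second reference, C<$other>, is the one that used to keep behaving as if
it still pointed to an overloaded object.

```perl
use strict;
use warnings;

package Loud;
use overload '""' => sub { 'LOUD' };

package Quiet;    # deliberately has no overloading

package main;

my $object = bless {}, 'Loud';
my $other  = $object;      # a second reference to the same referent

my $before = "$other";     # uses Loud's overloaded stringification

bless $object, 'Quiet';    # rebless through the first reference

my $after = "$other";      # should now use the default stringification

print "before: $before\n";
print "after:  $after\n";
```

On a perl with the fix, the second line of output should show the default
C<Quiet=HASH(0x...)> form rather than an error or the old overloaded result.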

A certain well known Linux vendor applied incomplete versions of this bug fix
to their F</usr/bin/perl> and then prematurely closed bug reports about
performance issues without consulting back upstream. This not being enough,
they then proceeded to ignore the necessary fixes to these unreleased changes
for 11 months, until massive pressure was applied by their long-suffering
paying customers, catalysed by the failings being featured on a prominent blog
and Slashdot.

=head2 C<strict> now propagates correctly into string evals

Under 5.8.8 and earlier:

    $ perl5.8.8 -e 'use strict; eval "use foo bar" or die $@'
    Can't locate foo.pm in @INC (@INC contains: ... .) at (eval 1) line 2.
    BEGIN failed--compilation aborted at (eval 1) line 2.

Under 5.8.9 and later:

    $ perl5.8.9 -e 'use strict; eval "use foo bar" or die $@'
    Bareword "bar" not allowed while "strict subs" in use at (eval 1) line 1.

This may cause problems with programs that parse the error message and rely
on the buggy behaviour.

=head2 Other fixes

=over

=item *

The tokenizer no longer treats C<=cute> (and other words beginning
with C<=cut>) as a synonym for C<=cut>.

=item *

Calling C<CORE::require>

C<CORE::require> and C<CORE::do> were always parsed as C<require> and C<do>
when they were overridden. This is now fixed.

=item *

Stopped memory leak on long F</etc/groups> entries.

=item *

C<while (my $x ...) { ...; redo }> shouldn't C<undef $x>.

In the presence of C<my> in the conditional of a C<while()>, C<until()>,
or C<for(;;)> loop, we now add an extra scope to the body so that C<redo>
doesn't C<undef> the lexical.

=item *

The C<encoding> pragma now correctly ignores anything following an C<@>
character in the C<LC_ALL> and C<LANG> environment variables. [RT #49646]

=item *

A segfault observed with some F<gcc> 3.3 optimisations is resolved.

=item *

A possible segfault when C<unpack> is used in scalar context with C<()> groups
is resolved. [RT #50256]

=item *

Resolved issue where C<$!> could be changed by a signal handler interrupting
a C<system> call.

=item *

Fixed bug RT #37886, symbolic dereferencing was allowed in the argument of
C<defined> even under the influence of C<use strict 'refs'>.

=item *

Fixed bug RT #43207, where C<lc>/C<uc> inside C<sort> affected the return
value.

=item *

Fixed bug RT #45607, where C<*{"BONK"} = \&{"BONK"}> didn't work correctly.

=item *

Fixed bug RT #35878, croaking from an XSUB called via C<goto &xsub> corrupts perl
internals.

=item *

Fixed bug RT #32539, F<DynaLoader.o> is moved into F<libperl.so> to avoid the
need to statically link DynaLoader into the stub perl executable. With this
F<libperl.so> provides everything needed to get a functional embedded perl
interpreter to run.

=item *

Fix bug RT #36267 so that assigning to a tied hash doesn't change the
underlying hash.

=item *

Fix bug RT #6006, regexp replaces using large replacement variables
fail some of the time, I<i.e.> when substitution contains something
like C<${10}> (note the bracket) instead of just C<$10>.

=item *

Fix bug RT #45053, C<Perl_newCONSTSUB()> is now thread safe.

=back
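
The C<redo> item in the list above can be illustrated with a short loop. This
is a sketch with made-up variable names: the body is re-run once for the item
C<2>, and because C<redo> skips the loop condition, C<$item> has to keep its
value for the second pass.

```perl
use strict;
use warnings;

my @queue   = (1, 2, 3);
my @seen;
my $retried = 0;

while (my $item = shift @queue) {
    push @seen, $item;

    # Re-run the body once for item 2. The condition is not
    # re-evaluated, so $item must still hold 2 afterwards.
    if ($item == 2 && !$retried) {
        $retried = 1;
        redo;
    }
}

print "@seen\n";
```

With the fix in place, the item C<2> is recorded twice and no undefined value
is pushed onto C<@seen>.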

=head2 Platform Specific Fixes

=head3 Darwin / MacOS X

=over 4

=item *

Various improvements to 64 bit builds.

=item *

Mutex protection added in C<PerlIOStdio_close()> to avoid race conditions.
Hopefully this fixes failures in the threads tests F<free.t> and F<blocks.t>.

=item *

Added forked terminal support to the debugger, with the ability to update the
window title.

=back

=head3 OS/2

=over 4

=item *

A build problem with specifying C<USE_MULTI> and C<USE_ITHREADS> but without
C<USE_IMP_SYS> has been fixed.

=item *

C<OS2::REXX> upgraded to version 1.04

=back

=head3 Tru64

=over 4

=item *

Aligned floating point build policies for F<cc> and F<gcc>.

=back

=head3 RedHat Linux

=over 4

=item *

Revisited a patch from 5.6.1 for RH7.2 for Intel's F<icc> [RT #7916], added an
additional check for C<$Config{gccversion}>.

=back

=head3 Solaris/i386

=over 4

=item *

Use C<-DPTR_IS_LONG> when using 64 bit integers

=back

=head3 VMS

=over 4

=item *

Fixed C<PerlIO::Scalar> in-memory file record-style reads.

=item *

Pipe shutdown at process exit should now be more robust.

=item *

Bugs in VMS exit handling tickled by C<Test::Harness> 2.64 have been fixed.

=item *

Fix C<fcntl()> locking capability test in F<configure.com>.

=item *

Replaced C<shrplib='define'> with C<useshrplib='true'> on VMS.

=back

=head3 Windows

=over 4

=item *

C<File::Find> used to fail when the target directory is a bare drive letter and
C<no_chdir> is 1 (the default is 0). [RT #41555]

=item *

A build problem with specifying C<USE_MULTI> and C<USE_ITHREADS> but without
C<USE_IMP_SYS> has been fixed.

=item *

The process id is no longer truncated to 16 bits on some Windows platforms
( http://bugs.activestate.com/show_bug.cgi?id=72443 )

=item *

Fixed bug RT #54828 in F<perlio.c> where calling C<binmode> on Win32 and Cygwin
may cause a segmentation fault.

=back

=head2 Smaller fixes

=over 4

=item *

It is now possible to overload C<eq> when using C<nomethod>.

=item *

Various problems using C<overload> with 64 bit integers corrected.

=item *

The reference count of C<PerlIO> file descriptors is now correctly handled.

=item *

On VMS, escaped dots will be preserved when converted to Unix syntax.

=item *

C<keys %+> no longer throws an C<'ambiguous'> warning.

=item * 

Using C<#!perl -d> could trigger an assertion, which has been fixed.

=item *

Don't stringify tied code references in C<@INC> when calling C<require>.

=item *

Code references in C<@INC> report the correct file name when C<__FILE__> is
used.

=item *

Width and precision in sprintf didn't handle characters above 255 correctly.
[RT #40473]

=item *

List slices with indices out of range now work more consistently.
[RT #39882]

=item *

A change introduced with perl 5.8.1 broke the parsing of arguments of the form
C<-foo=bar> with the C<-s> switch on the C<#!> line. This has been fixed. See
http://bugs.activestate.com/show_bug.cgi?id=43483

=item *

C<tr///> is now threadsafe. Previously it was storing a swash inside its OP,
rather than in a pad.

=item *

F<pod2html> labels anchors more consistently and handles nested definition
lists better.

=item *

C<threads> cleanup veto has been extended to include C<perl_free()> and
C<perl_destruct()>

=item *

On some systems, changes to C<$ENV{TZ}> would not always be
respected by the underlying calls to C<localtime_r()>.  Perl now
forces the inspection of the environment on these systems.

=item *

The special variable C<$^R> is now more consistently set when executing
regexps using the C<(?{...})> construct.  In particular, it will still
be set even if backreferences or optional sub-patterns C<(?:...)?> are
used.

=back
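
The C<sprintf> item in the list above concerns strings such as the ones in
this sketch, where width and precision should count characters rather than
bytes. The variable names are chosen for illustration only.

```perl
use strict;
use warnings;

my $smiley = "\x{263A}";    # a single character above 255

# Width: pad one wide character to five characters in total.
my $padded = sprintf('%5s', $smiley);

# Precision: keep only the first two of three wide characters.
my $clipped = sprintf('%.2s', $smiley x 3);

printf "padded length:  %d\n", length $padded;
printf "clipped length: %d\n", length $clipped;
```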

=head1 New or Changed Diagnostics

=head2 panic: sv_chop %s

This new fatal error occurs when the C routine C<Perl_sv_chop()> was passed a
position that is not within the scalar's string buffer. This is caused by
buggy XS code, and at this point recovery is not possible.

=head2 Maximal count of pending signals (%s) exceeded

This new fatal error occurs when the perl process has to abort due to
too many pending signals, which is bound to prevent perl from being
able to handle further incoming signals safely.

=head2 panic: attempt to call %s in %s

This new fatal error occurs when the ACL version file test operator is used
where it is not available on the current platform. Earlier checks mean that
it should never be possible to get this.

=head2 FETCHSIZE returned a negative value

New error indicating that a tied array has claimed to have a negative
number of elements.
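
One way to see this diagnostic is a tied array class whose C<FETCHSIZE> is
deliberately wrong. In the sketch below, C<BrokenArray> is a made-up class
that inherits everything else from C<Tie::StdArray>.

```perl
use strict;
use warnings;

package BrokenArray;
use Tie::Array;                  # also provides Tie::StdArray
our @ISA = ('Tie::StdArray');

sub FETCHSIZE { return -1 }      # deliberately invalid

package main;

tie my @array, 'BrokenArray';

# Asking for the element count triggers FETCHSIZE.
my $count = eval { scalar @array };
my $error = $@;

print "error: $error";
```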

=head2 Can't upgrade %s (%d) to %d

Previously the internal error from the SV upgrade code was the less informative
I<Can't upgrade that kind of scalar>. It now reports the current internal type,
and the new type requested.

=head2 %s argument is not a HASH or ARRAY element or a subroutine

This error, thrown if an invalid argument is provided to C<exists> now
correctly includes "or a subroutine". [RT #38955]

=head2 Cannot make the non-overridable builtin %s fatal

This error in C<Fatal> previously did not show the name of the builtin in
question (now represented by %s above).

=head2 Unrecognized character '%s' in column %d

This error previously did not state the column.

=head2 Offset outside string

This can now also be generated by a C<seek> on a file handle using
C<PerlIO::scalar>.

=head2 Invalid escape in the specified encoding in regexp; marked by <-- HERE in m/%s/

New error, introduced as part of the fix to RT #40641 to handle encoding
of Unicode characters in regular expression comments.

=head2 Your machine doesn't support dump/undump.

A more informative fatal error issued when calling C<dump> on Win32 and
Cygwin. (Given that the purpose of C<dump> is to abort with a core dump,
and core dumps can't be produced on these platforms, this is more useful than
silently exiting.)

=head1 Changed Internals

The perl sources can now be compiled with a C++ compiler instead of a C
compiler. A necessary implementation detail is that under C++, the macro
C<XS> used to define XSUBs now includes an C<extern "C"> definition. A side
effect of this is that B<C++> code that used the construction

    typedef XS(SwigPerlWrapper);

now needs to be written

    typedef XSPROTO(SwigPerlWrapper);

using the new C<XSPROTO> macro, in order to compile. C extensions are
unaffected, although C extensions are encouraged to use C<XSPROTO> too.
This change was present in the 5.10.0 release of perl, so any actively
maintained code that happened to use this construction should already have
been adapted. Code that needs changing will fail with a compilation error.

C<set> magic on localizing/assigning to a magic variable will now only
trigger for I<container magics>, i.e. it will for C<%ENV> or C<%SIG>
but not for C<$#array>.

The new API macro C<newSVpvs()> can be used in place of constructions such as
C<newSVpvn("ISA", 3)>. It takes a single string constant, and at C compile
time determines its length.

The new API function C<Perl_newSV_type()> can be used as a more efficient
replacement of the common idiom

    sv = newSV(0);
    sv_upgrade(sv, type);

Similarly C<Perl_newSVpvn_flags()> can be used to combine
C<Perl_newSVpv()> with C<Perl_sv_2mortal()> or the equivalent
C<Perl_sv_newmortal()> with C<Perl_sv_setpvn()>.

Two new macros C<mPUSHs()> and C<mXPUSHs()> are added, to make it easier to
push mortal SVs onto the stack. They were then used to fix several bugs where
values on the stack had not been mortalised.

A C<Perl_signbit()> function was added to test the sign of an C<NV>. It 
maps to the system one when available.

C<Perl_av_reify()>, C<Perl_lex_end()>, C<Perl_mod()>, C<Perl_op_clear()>,
C<Perl_pop_return()>, C<Perl_qerror()>, C<Perl_setdefout()>,
C<Perl_vivify_defelem()> and C<Perl_yylex()> are now visible to extensions.
This was required to allow C<Data::Alias> to work on Windows.

C<Perl_find_runcv()> is now visible to perl core extensions. This was required
to allow C<Sub::Current> to work on Windows.

C<ptr_table*> functions are now available in unthreaded perl. C<Storable>
takes advantage of this.

There have been many small cleanups made to the internals. In particular,
C<Perl_sv_upgrade()> has been simplified considerably, with a straight-through
code path that uses C<memset()> and C<memcpy()> to initialise the new body,
rather than assignment via multiple temporary variables. It has also
benefited from simplification and de-duplication of the arena management
code.

A lot of small improvements in the code base were made due to reports from
the Coverity static code analyzer.

Corrected use and documentation of C<Perl_gv_stashpv()>, C<Perl_gv_stashpvn()>,
C<Perl_gv_stashsv()> functions (last parameter is a bitmask, not boolean).

C<PERL_SYS_INIT>, C<PERL_SYS_INIT3> and C<PERL_SYS_TERM> macros have been
changed into functions.

C<PERL_SYS_TERM> no longer requires a context. C<PerlIO_teardown()>
is now called without a context, and debugging output in this function has
been disabled because that required that an interpreter was present, an invalid
assumption at termination time.

All compile time options which affect binary compatibility have been grouped
together into a global variable (C<PL_bincompat_options>).

The values of C<PERL_REVISION>, C<PERL_VERSION> and C<PERL_SUBVERSION> are
now baked into global variables (and hence into any shared perl library).
Additionally under C<MULTIPLICITY>, the perl executable now records the size of
the interpreter structure (total, and for this version). Coupled with
C<PL_bincompat_options> this will allow 5.8.10 (and later), when compiled with a
shared perl library, to perform sanity checks in C<main()> to verify that the
shared library is indeed binary compatible.

Symbolic references can now have embedded NULs. The new public function
C<Perl_get_cvn_flags()> can be used in extensions if you have to handle them.

=head2 Macro cleanups

The core code, and XS code in F<ext> that is not dual-lived on CPAN, no longer
uses the macros C<PL_na>, C<NEWSV()>, C<Null()>, C<Nullav>, C<Nullcv>,
C<Nullhv> I<etc>. Their use is discouraged in new code,
particularly C<PL_na>, which is a small performance hit.

=head1 New Tests

Many modules updated from CPAN incorporate new tests. Some core specific
tests have been added:

=over 4

=item ext/DynaLoader/t/DynaLoader.t

Tests for the C<DynaLoader> module.

=item t/comp/fold.t

Tests for compile-time constant folding.

=item t/io/pvbm.t

Tests incorporated from 5.10.0 which check that there is no unexpected
interaction between the internal types C<PVBM> and C<PVGV>.

=item t/lib/proxy_constant_subs.t

Tests for the new form of constant subroutines.

=item t/op/attrhand.t

Tests for C<Attribute::Handlers>.

=item t/op/dbm.t

Tests for C<dbmopen>.

=item t/op/inccode-tie.t

Calls all tests in F<t/op/inccode.t> after first tying C<@INC>.

=item t/op/incfilter.t

Tests for source filters returned from code references in C<@INC>.

=item t/op/kill0.t

Tests for RT #30970.

=item t/op/qrstack.t

Tests for RT #41484.

=item t/op/qr.t

Tests for the C<qr//> construct.

=item t/op/regexp_qr_embed.t

Tests for the C<qr//> construct within another regexp.

=item t/op/regexp_qr.t

Tests for the C<qr//> construct.

=item t/op/rxcode.t

Tests for RT #32840.

=item t/op/studytied.t

Tests for C<study> on tied scalars.

=item t/op/substT.t

Tests for C<subst> run under C<-T> mode.

=item t/op/symbolcache.t

Tests for C<undef> and C<delete> on stash entries that are bound to
subroutines or methods.

=item t/op/upgrade.t

Tests for C<Perl_sv_upgrade()>.

=item t/mro/package_aliases.t

MRO tests for C<isa> and package aliases.

=item t/pod/twice.t

Tests for calling C<Pod::Parser> twice.

=item t/run/cloexec.t

Tests for inheriting file descriptors across C<exec> (close-on-exec).

=item t/uni/cache.t

Tests for the UTF-8 caching code.

=item t/uni/chr.t

Test that strange encodings do not upset C<Perl_pp_chr()>.

=item t/uni/greek.t

Tests for RT #40641.

=item t/uni/latin2.t

Tests for RT #40641.

=item t/uni/overload.t

Tests for returning Unicode from overloaded values.

=item t/uni/tie.t

Tests for returning Unicode from tied variables.

=back

=head1 Known Problems

There are no known new bugs.

However, programs that rely on bugs that have been fixed will have problems.
Also, many bug fixes present in 5.10.0 can't be back-ported to the 5.8.x
branch, because they require changes that are binary incompatible, or because
the code changes are too large and hence too risky to incorporate.

We have only limited volunteer labour, and the maintenance burden is
getting increasingly complex. Hence this will be the last significant
release of the 5.8.x series. Any future releases of 5.8.x will likely
only be to deal with security issues, and platform build
failures. Hence you should look to migrating to 5.10.x, if you have
not started already. Alternatively, if business requirements constrain
you to continue to use 5.8.x, you may wish to consider commercial
support from firms such as ActiveState.

=head1 Platform Specific Notes

=head2 Win32

C<readdir()>, C<cwd()>, C<$^X> and C<@INC> now use the alternate (short)
filename if the long name is outside the current codepage (Jan Dubois).

=head3 Updated Modules

=over 4

=item *

C<Win32> upgraded to version 0.38. Now has a documented 'WinVista' response
from C<GetOSName> and support for Vista's privilege elevation in C<IsAdminUser>.
Support for Unicode characters in path names. Improved cygwin and Win64
compatibility. 

=item *

C<Win32API> updated to 0.1001_01

=item *

C<killpg()> support added to C<MSWin32> (Jan Dubois).

=item *

C<File::Spec::Win32> upgraded to version 3.2701

=back

=head2 OS/2

=head3 Updated Modules

=over 4

=item *

C<OS2::Process> upgraded to 1.03

Ilya Zakharevich has added and documented several C<Window*> and C<Clipbrd*>
functions.

=item *

C<OS2::REXX::DLL>, C<OS2::REXX> updated to version 1.03

=back

=head2 VMS

=head3 Updated Modules

=over 4

=item *

C<DCLsym> upgraded to version 1.03

=item *

C<Stdio> upgraded to version 2.4

=item *

C<VMS::XSSymSet> upgraded to 1.1.

=back

=head1 Obituary

Nick Ing-Simmons, long time Perl hacker, author of the C<Tk> and C<Encode>
modules, F<perlio.c> in the core, and 5.003_02 pumpking, died of a heart
attack on 25th September 2006. He will be missed.

=head1 Acknowledgements

Some of the work in this release was funded by a TPF grant.

Steve Hay worked behind the scenes working out the causes of the differences
between core modules, their CPAN releases, and previous core releases, and
the best way to rectify them. He doesn't want to do it again. I know this
feeling, and I'm very glad he did it this time, instead of me.

Paul Fenwick assembled a team of 18 volunteers, who broke the back of writing
this document. In particular, Bradley Dean, Eddy Tan, and Vincent Pit
provided half the team's contribution.

Schwern verified the list of updated module versions, correcting quite a few
errors that I (and everyone else) had missed, both wrongly stated module
versions, and changed modules that had not been listed.

The crack Berlin-based QA team of Andreas KE<ouml>nig and Slaven Rezic
tirelessly re-built snapshots, tested most everything CPAN against
them, and then identified the changes responsible for any module regressions,
ensuring that several show-stopper bugs were stomped before the first release
candidate was cut.

The other core committers contributed most of the changes, and applied most
of the patches sent in by the hundreds of contributors listed in F<AUTHORS>.

And obviously, Larry Wall, without whom we wouldn't have Perl.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org.  There may also be
information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes
all the core committers, who will be able
to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for security
issues in the Perl core, not for modules independently distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlootut.pod000064400000064237150344123500007316 0ustar00=encoding utf8

=for comment
Consistent formatting of this file is achieved with:
  perl ./Porting/podtidy pod/perlootut.pod

=head1 NAME

perlootut - Object-Oriented Programming in Perl Tutorial

=head1 DATE

This document was created in February, 2011, and the last major
revision was in February, 2013.

If you are reading this in the future then it's possible that the state
of the art has changed. We recommend you start by reading the perlootut
document in the latest stable release of Perl, rather than this
version.

=head1 DESCRIPTION

This document provides an introduction to object-oriented programming
in Perl. It begins with a brief overview of the concepts behind object
oriented design. Then it introduces several different OO systems from
L<CPAN|http://search.cpan.org> which build on top of what Perl
provides.

By default, Perl's built-in OO system is very minimal, leaving you to
do most of the work. This minimalism made a lot of sense in 1994, but
in the years since Perl 5.0 we've seen a number of common patterns
emerge in Perl OO. Fortunately, Perl's flexibility has allowed a rich
ecosystem of Perl OO systems to flourish.

If you want to know how Perl OO works under the hood, the L<perlobj>
document explains the nitty gritty details.

This document assumes that you already understand the basics of Perl
syntax, variable types, operators, and subroutine calls. If you don't
understand these concepts yet, please read L<perlintro> first. You
should also read the L<perlsyn>, L<perlop>, and L<perlsub> documents.

=head1 OBJECT-ORIENTED FUNDAMENTALS

Most object systems share a number of common concepts. You've probably
heard terms like "class", "object", "method", and "attribute" before.
Understanding the concepts will make it much easier to read and write
object-oriented code. If you're already familiar with these terms, you
should still skim this section, since it explains each concept in terms
of Perl's OO implementation.

Perl's OO system is class-based. Class-based OO is fairly common. It's
used by Java, C++, C#, Python, Ruby, and many other languages. There
are other object orientation paradigms as well. JavaScript is the most
popular language to use another paradigm. JavaScript's OO system is
prototype-based.

=head2 Object

An B<object> is a data structure that bundles together data and
subroutines which operate on that data. An object's data is called
B<attributes>, and its subroutines are called B<methods>. An object can
be thought of as a noun (a person, a web service, a computer).

An object represents a single discrete thing. For example, an object
might represent a file. The attributes for a file object might include
its path, content, and last modification time. If we created an object
to represent F</etc/hostname> on a machine named "foo.example.com",
that object's path would be "/etc/hostname", its content would be
"foo\n", and its last modification time would be 1304974868 seconds
since the beginning of the epoch.

The methods associated with a file might include C<rename()> and
C<write()>.

In Perl most objects are hashes, but the OO systems we recommend keep
you from having to worry about this. In practice, it's best to consider
an object's internal data structure opaque.

=head2 Class

A B<class> defines the behavior of a category of objects. A class is a
name for a category (like "File"), and a class also defines the
behavior of objects in that category.

All objects belong to a specific class. For example, our
F</etc/hostname> object belongs to the C<File> class. When we want to
create a specific object, we start with its class, and B<construct> or
B<instantiate> an object. A specific object is often referred to as an
B<instance> of a class.

In Perl, any package can be a class. The difference between a package
which is a class and one which isn't is based on how the package is
used. Here's our "class declaration" for the C<File> class:

  package File;

In Perl, there is no special keyword for constructing an object.
However, most OO modules on CPAN use a method named C<new()> to
construct a new object:

  my $hostname = File->new(
      path          => '/etc/hostname',
      content       => "foo\n",
      last_mod_time => 1304974868,
  );

(Don't worry about that C<< -> >> operator, it will be explained
later.)

=head3 Blessing

As we said earlier, most Perl objects are hashes, but an object can be
an instance of any Perl data type (scalar, array, etc.). Turning a
plain data structure into an object is done by B<blessing> that data
structure using Perl's C<bless> function.

While we strongly suggest you don't build your objects from scratch,
you should know the term B<bless>. A B<blessed> data structure (aka "a
referent") is an object. We sometimes say that an object has been
"blessed into a class".

Once a referent has been blessed, the C<blessed> function from the
L<Scalar::Util> core module can tell us its class name. This subroutine
returns an object's class when passed an object, and false otherwise.

  use Scalar::Util 'blessed';

  print blessed($hash);      # undef
  print blessed($hostname);  # File
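
A self-contained version of this comparison might look like the sketch
below. It blesses a hash by hand purely to show what C<blessed> reports; the
attribute values are the ones from the C</etc/hostname> example.

```perl
use strict;
use warnings;
use Scalar::Util 'blessed';

# A plain hash reference: not an object.
my $hash = { path => '/etc/hostname' };
my $before = blessed($hash);

# The same kind of data, blessed into the File class.
my $hostname = bless {
    path          => '/etc/hostname',
    content       => "foo\n",
    last_mod_time => 1304974868,
}, 'File';
my $after = blessed($hostname);

print defined $before ? $before : 'undef', "\n";
print "$after\n";
```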

=head3 Constructor

A B<constructor> creates a new object. In Perl, a class's constructor
is just another method, unlike some other languages, which provide
syntax for constructors. Most Perl classes use C<new> as the name for
their constructor:

  my $file = File->new(...);
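
For illustration only, a hand-written C<new> could be as small as the sketch
below. It assumes the object is a hash that simply stores whatever named
arguments it is given, which is one common convention rather than a rule.

```perl
use strict;
use warnings;

package File;

# The constructor is an ordinary method; the class name arrives as
# the first argument.
sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

package main;

my $file = File->new(
    path          => '/etc/hostname',
    content       => "foo\n",
    last_mod_time => 1304974868,
);

print ref($file), "\n";
```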

=head2 Methods

You already learned that a B<method> is a subroutine that operates on
an object. You can think of a method as the things that an object can
I<do>. If an object is a noun, then methods are its verbs (save, print,
open).

In Perl, methods are simply subroutines that live in a class's package.
Methods are always written to receive the object as their first
argument:

  sub print_info {
      my $self = shift;

      print "This file is at ", $self->path, "\n";
  }

  $file->print_info;
  # This file is at /etc/hostname

What makes a method special is I<how it's called>. The arrow operator
(C<< -> >>) tells Perl that we are calling a method.

When we make a method call, Perl arranges for the method's B<invocant>
to be passed as the first argument. B<Invocant> is a fancy name for the
thing on the left side of the arrow. The invocant can either be a class
name or an object. We can also pass additional arguments to the method:

  sub print_info {
      my $self   = shift;
      my $prefix = shift // "This file is at ";

      print $prefix, $self->path, "\n";
  }

  $file->print_info("The file is located at ");
  # The file is located at /etc/hostname
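
To make the idea of an invocant concrete, here is a small sketch. The
hand-written constructor and the C<invocant_kind()> method are
hypothetical and exist only for this illustration.

  use strict;
  use warnings;

  {
      package File;

      # Hand-written constructor, for illustration only.
      sub new {
          my ($class, %args) = @_;
          return bless {%args}, $class;
      }

      # Reports what kind of invocant the method received.
      sub invocant_kind {
          my $invocant = shift;
          return ref $invocant ? 'object' : 'class name';
      }
  }

  my $file = File->new( path => '/etc/hostname' );

  my $from_class  = File->invocant_kind;     # invocant is the string 'File'
  my $from_object = $file->invocant_kind;    # invocant is the object

The same subroutine runs in both calls; only the first argument
differs.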

=head2 Attributes

Each class can define its B<attributes>. When we instantiate an object,
we assign values to those attributes. For example, every C<File> object
has a path. Attributes are sometimes called B<properties>.

Perl has no special syntax for attributes. Under the hood, attributes
are often stored as keys in the object's underlying hash, but don't
worry about this.

We recommend that you only access attributes via B<accessor> methods.
These are methods that can get or set the value of each attribute. We
saw this earlier in the C<print_info()> example, which calls C<<
$self->path >>.

You might also see the terms B<getter> and B<setter>. These are two
types of accessors. A getter gets the attribute's value, while a setter
sets it. Another term for a setter is B<mutator>.

Attributes are typically defined as read-only or read-write. Read-only
attributes can only be set when the object is first created, while
read-write attributes can be altered at any time.
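
If you are curious what accessors look like when written by hand, the
following is a rough sketch. It assumes the object is a hash reference
and that C<path> is read-only while C<content> is read-write; the
object systems recommended later generate equivalent code for you.

  use strict;
  use warnings;

  {
      package File;

      sub new {
          my ($class, %args) = @_;
          return bless {%args}, $class;
      }

      # Read-only accessor (a getter): extra arguments are ignored.
      sub path { $_[0]{path} }

      # Read-write accessor: acts as a setter when given a value.
      sub content {
          my $self = shift;
          $self->{content} = shift if @_;
          return $self->{content};
      }
  }

  my $file = File->new( path => '/etc/hostname', content => "foo\n" );
  $file->content("bar\n");    # setter call
  my $path = $file->path;     # getter call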

The value of an attribute may itself be another object. For example,
instead of returning its last mod time as a number, the C<File> class
could return a L<DateTime> object representing that value.

It's possible to have a class that does not expose any publicly
settable attributes. Not every class has attributes and methods.

=head2 Polymorphism

B<Polymorphism> is a fancy way of saying that objects from two
different classes share an API. For example, we could have C<File> and
C<WebPage> classes which both have a C<print_content()> method. This
method might produce different output for each class, but they share a
common interface.

While the two classes may differ in many ways, when it comes to the
C<print_content()> method, they are the same. This means that we can
try to call the C<print_content()> method on an object of either class,
and B<we don't have to know what class the object belongs to!>

Polymorphism is one of the key concepts of object-oriented design.
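
A minimal sketch of this idea follows. Both classes are hand-written
stand-ins, and the C<content> and C<html> keys are assumptions made for
the example.

  use strict;
  use warnings;

  {
      package File;
      sub new { my ($class, %args) = @_; return bless {%args}, $class }
      sub print_content { my $self = shift; print $self->{content} }
  }

  {
      package WebPage;
      sub new { my ($class, %args) = @_; return bless {%args}, $class }

      # Same method name, different behavior: strip the markup first.
      sub print_content {
          my $self = shift;
          (my $text = $self->{html}) =~ s/<[^>]+>//g;
          print $text;
      }
  }

  my @items = (
      File->new( content => "plain text\n" ),
      WebPage->new( html => "<p>page text</p>\n" ),
  );

  # The caller never checks which class each object belongs to.
  $_->print_content for @items;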

=head2 Inheritance

B<Inheritance> lets you create a specialized version of an existing
class. Inheritance lets the new class reuse the methods and attributes
of another class.

For example, we could create a C<File::MP3> class which B<inherits>
from C<File>. A C<File::MP3> B<is-a> I<more specific> type of C<File>.
All mp3 files are files, but not all files are mp3 files.

We often refer to inheritance relationships as B<parent-child> or
C<superclass>/C<subclass> relationships. Sometimes we say that the
child has an B<is-a> relationship with its parent class.

C<File> is a B<superclass> of C<File::MP3>, and C<File::MP3> is a
B<subclass> of C<File>.

  package File::MP3;

  use parent 'File';

The L<parent> module is one of several ways that Perl lets you define
inheritance relationships.

Perl allows multiple inheritance, which means that a class can inherit
from multiple parents. While this is possible, we strongly recommend
against it. Generally, you can use B<roles> to do everything you can do
with multiple inheritance, but in a cleaner way.

Note that there's nothing wrong with defining multiple subclasses of a
given class. This is both common and safe. For example, we might define
C<File::MP3::FixedBitrate> and C<File::MP3::VariableBitrate> classes to
distinguish between different types of mp3 file.

=head3 Overriding methods and method resolution

Inheritance allows two classes to share code. By default, every method
in the parent class is also available in the child. The child can
explicitly B<override> a parent's method to provide its own
implementation. For example, if we have a C<File::MP3> object, it has
the C<print_info()> method from C<File>:

  my $cage = File::MP3->new(
      path          => 'mp3s/My-Body-Is-a-Cage.mp3',
      content       => $mp3_data,
      last_mod_time => 1304974868,
      title         => 'My Body Is a Cage',
  );

  $cage->print_info;
  # This file is at mp3s/My-Body-Is-a-Cage.mp3

If we wanted to include the mp3's title in the output, we could
override the method:

  package File::MP3;

  use parent 'File';

  sub print_info {
      my $self = shift;

      print "This file is at ", $self->path, "\n";
      print "Its title is ", $self->title, "\n";
  }

  $cage->print_info;
  # This file is at mp3s/My-Body-Is-a-Cage.mp3
  # Its title is My Body Is a Cage

The process of determining what method should be used is called
B<method resolution>. What Perl does is look at the object's class
first (C<File::MP3> in this case). If that class defines the method,
then that class's version of the method is called. If not, Perl looks
at each parent class in turn. For C<File::MP3>, its only parent is
C<File>. If C<File::MP3> does not define the method, but C<File> does,
then Perl calls the method in C<File>.

If C<File> inherited from C<DataSource>, which inherited from C<Thing>,
then Perl would keep looking "up the chain" if necessary.
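
The lookup order can be inspected with the core L<mro> module. The
sketch below builds the hypothetical chain described above with empty
classes; only C<Thing> defines the C<describe()> method, which is
invented for this example.

  use strict;
  use warnings;
  use mro;

  { package Thing;      sub describe { return 'a thing' } }
  { package DataSource; use parent -norequire, 'Thing'; }
  { package File;       use parent -norequire, 'DataSource'; }
  { package File::MP3;  use parent -norequire, 'File'; }

  # Array reference listing the classes in the order Perl searches them.
  my $order = mro::get_linear_isa('File::MP3');

  # Not defined in File::MP3, File or DataSource, so Thing's is used.
  my $found = File::MP3->describe;

The C<-norequire> option tells L<parent> not to load a file for the
parent class, which is needed here only because every class lives in
the same file.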

It is possible to explicitly call a parent method from a child:

  package File::MP3;

  use parent 'File';

  sub print_info {
      my $self = shift;

      $self->SUPER::print_info();
      print "Its title is ", $self->title, "\n";
  }

The C<SUPER::> bit tells Perl to look for the C<print_info()> method in the
C<File::MP3> class's inheritance chain. When it finds the parent class
that implements this method, the method is called.

We mentioned multiple inheritance earlier. The main problem with
multiple inheritance is that it greatly complicates method resolution.
See L<perlobj> for more details.

=head2 Encapsulation

B<Encapsulation> is the idea that an object is opaque. When another
developer uses your class, they don't need to know I<how> it is
implemented, they just need to know I<what> it does.

Encapsulation is important for several reasons. First, it allows you to
separate the public API from the private implementation. This means you
can change that implementation without breaking the API.

Second, when classes are well encapsulated, they become easier to
subclass. Ideally, a subclass uses the same APIs to access object data
that its parent class uses. In reality, subclassing sometimes involves
violating encapsulation, but a good API can minimize the need to do
this.

We mentioned earlier that most Perl objects are implemented as hashes
under the hood. The principle of encapsulation tells us that we should
not rely on this. Instead, we should use accessor methods to access the
data in that hash. The object systems that we recommend below all
automate the generation of accessor methods. If you use one of them,
you should never have to access the object as a hash directly.

=head2 Composition

In object-oriented code, we often find that one object references
another object. This is called B<composition>, or a B<has-a>
relationship.

Earlier, we mentioned that the C<File> class's C<last_mod_time>
accessor could return a L<DateTime> object. This is a perfect example
of composition. We could go even further, and make the C<path> and
C<content> accessors return objects as well. The C<File> class would
then be B<composed> of several other objects.
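
As a rough sketch of composition, the example below uses a small
hypothetical C<Timestamp> class in place of L<DateTime>, so that it
relies only on Perl built-ins. The constructors are hand-written for
illustration.

  use strict;
  use warnings;

  {
      package Timestamp;    # hypothetical stand-in for DateTime

      sub new   { my ($class, $epoch) = @_; return bless { epoch => $epoch }, $class }
      sub epoch { $_[0]{epoch} }
      sub year  { return ( gmtime( $_[0]{epoch} ) )[5] + 1900 }
  }

  {
      package File;

      sub new {
          my ($class, %args) = @_;

          # Wrap the raw number in an object: a File has-a Timestamp.
          $args{last_mod_time} = Timestamp->new( $args{last_mod_time} );
          return bless {%args}, $class;
      }

      sub last_mod_time { $_[0]{last_mod_time} }
  }

  my $file = File->new(
      path          => '/etc/hostname',
      last_mod_time => 1304974868,
  );

  # Method calls can be chained through the contained object.
  my $year = $file->last_mod_time->year;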

=head2 Roles

B<Roles> are something that a class I<does>, rather than something that
it I<is>. Roles are relatively new to Perl, but have become rather
popular. Roles are B<applied> to classes. Sometimes we say that classes
B<consume> roles.

Roles are an alternative to inheritance for providing polymorphism.
Let's assume we have two classes, C<Radio> and C<Computer>. Both of
these things have on/off switches. We want to model that in our class
definitions.

We could have both classes inherit from a common parent, like
C<Machine>, but not all machines have on/off switches. We could create
a parent class called C<HasOnOffSwitch>, but that is very artificial.
Radios and computers are not specializations of this parent. This
parent is really a rather ridiculous creation.

This is where roles come in. It makes a lot of sense to create a
C<HasOnOffSwitch> role and apply it to both classes. This role would
define a known API like providing C<turn_on()> and C<turn_off()>
methods.

Perl does not have any built-in way to express roles. In the past,
people just bit the bullet and used multiple inheritance. Nowadays,
there are several good choices on CPAN for using roles.

=head2 When to Use OO

Object Orientation is not the best solution to every problem. In I<Perl
Best Practices> (copyright 2004, Published by O'Reilly Media, Inc.),
Damian Conway provides a list of criteria to use when deciding if OO is
the right fit for your problem:

=over 4

=item *

The system being designed is large, or is likely to become large.

=item *

The data can be aggregated into obvious structures, especially if
there's a large amount of data in each aggregate.

=item *

The various types of data aggregate form a natural hierarchy that
facilitates the use of inheritance and polymorphism.

=item *

You have a piece of data on which many different operations are
applied.

=item *

You need to perform the same general operations on related types of
data, but with slight variations depending on the specific type of data
the operations are applied to.

=item *

It's likely you'll have to add new data types later.

=item *

The typical interactions between pieces of data are best represented by
operators.

=item *

The implementation of individual components of the system is likely to
change over time.

=item *

The system design is already object-oriented.

=item *

Large numbers of other programmers will be using your code modules.

=back

=head1 PERL OO SYSTEMS

As we mentioned before, Perl's built-in OO system is very minimal, but
also quite flexible. Over the years, many people have developed systems
which build on top of Perl's built-in system to provide more features
and convenience.

We strongly recommend that you use one of these systems. Even the most
minimal of them eliminates a lot of repetitive boilerplate. There's
really no good reason to write your classes from scratch in Perl.

If you are interested in the guts underlying these systems, check out
L<perlobj>.

=head2 Moose

L<Moose> bills itself as a "postmodern object system for Perl 5". Don't
be scared, the "postmodern" label is a callback to Larry's description
of Perl as "the first postmodern computer language".

C<Moose> provides a complete, modern OO system. Its biggest influence
is the Common Lisp Object System, but it also borrows ideas from
Smalltalk and several other languages. C<Moose> was created by Stevan
Little, and draws heavily from his work on the Perl 6 OO design.

Here is our C<File> class using C<Moose>:

  package File;
  use Moose;

  has path          => ( is => 'ro' );
  has content       => ( is => 'ro' );
  has last_mod_time => ( is => 'ro' );

  sub print_info {
      my $self = shift;

      print "This file is at ", $self->path, "\n";
  }

C<Moose> provides a number of features:

=over 4

=item * Declarative sugar

C<Moose> provides a layer of declarative "sugar" for defining classes.
That sugar is just a set of exported functions that make declaring how
your class works simpler and more palatable.  This lets you describe
I<what> your class is, rather than having to tell Perl I<how> to
implement your class.

The C<has()> subroutine declares an attribute, and C<Moose>
automatically creates accessors for these attributes. It also takes
care of creating a C<new()> method for you. This constructor knows
about the attributes you declared, so you can set them when creating a
new C<File>.

=item * Roles built-in

C<Moose> lets you define roles the same way you define classes:

  package HasOnOffSwitch;
  use Moose::Role;

  has is_on => (
      is  => 'rw',
      isa => 'Bool',
  );

  sub turn_on {
      my $self = shift;
      $self->is_on(1);
  }

  sub turn_off {
      my $self = shift;
      $self->is_on(0);
  }

=item * A miniature type system

In the example above, you can see that we passed C<< isa => 'Bool' >>
to C<has()> when creating our C<is_on> attribute. This tells C<Moose>
that this attribute must be a boolean value. If we try to set it to an
invalid value, our code will throw an error.

=item * Full introspection and manipulation

Perl's built-in introspection features are fairly minimal. C<Moose>
builds on top of them and creates a full introspection layer for your
classes. This lets you ask questions like "what methods does the File
class implement?" It also lets you modify your classes
programmatically.

=item * Self-hosted and extensible

C<Moose> describes itself using its own introspection API. Besides
being a cool trick, this means that you can extend C<Moose> using
C<Moose> itself.

=item * Rich ecosystem

There is a rich ecosystem of C<Moose> extensions on CPAN under the
L<MooseX|http://search.cpan.org/search?query=MooseX&mode=dist>
namespace. In addition, many modules on CPAN already use C<Moose>,
providing you with lots of examples to learn from.

=item * Many more features

C<Moose> is a very powerful tool, and we can't cover all of its
features here. We encourage you to learn more by reading the C<Moose>
documentation, starting with
L<Moose::Manual|http://search.cpan.org/perldoc?Moose::Manual>.

=back

Of course, C<Moose> isn't perfect.

C<Moose> can make your code slower to load. C<Moose> itself is not
small, and it does a I<lot> of code generation when you define your
class. This code generation means that your runtime code is as fast as
it can be, but you pay for this when your modules are first loaded.

This load time hit can be a problem when startup speed is important,
such as with a command-line script or a "plain vanilla" CGI script that
must be loaded each time it is executed.

Before you panic, know that many people do use C<Moose> for
command-line tools and other startup-sensitive code. We encourage you
to try C<Moose> out first before worrying about startup speed.

C<Moose> also has several dependencies on other modules. Most of these
are small stand-alone modules, a number of which have been spun off
from C<Moose>. C<Moose> itself, and some of its dependencies, require a
compiler. If you need to install your software on a system without a
compiler, or if having I<any> dependencies is a problem, then C<Moose>
may not be right for you.

=head3 Moo

If you try C<Moose> and find that one of these issues is preventing you
from using C<Moose>, we encourage you to consider L<Moo> next. C<Moo>
implements a subset of C<Moose>'s functionality in a simpler package.
For most features that it does implement, the end-user API is
I<identical> to C<Moose>, meaning you can switch from C<Moo> to
C<Moose> quite easily.

C<Moo> does not implement most of C<Moose>'s introspection API, so it's
often faster when loading your modules. Additionally, none of its
dependencies require XS, so it can be installed on machines without a
compiler.

One of C<Moo>'s most compelling features is its interoperability with
C<Moose>. When someone tries to use C<Moose>'s introspection API on a
C<Moo> class or role, it is transparently inflated into a C<Moose>
class or role. This makes it easier to incorporate C<Moo>-using code
into a C<Moose> code base and vice versa.

For example, a C<Moose> class can subclass a C<Moo> class using
C<extends> or consume a C<Moo> role using C<with>.

The C<Moose> authors hope that one day C<Moo> can be made obsolete by
improving C<Moose> enough, but for now it provides a worthwhile
alternative to C<Moose>.

=head2 Class::Accessor

L<Class::Accessor> is the polar opposite of C<Moose>. It provides very
few features, nor is it self-hosting.

It is, however, very simple, pure Perl, and it has no non-core
dependencies. It also provides a "Moose-like" API on demand for the
features it supports.

Even though it doesn't do much, it is still preferable to writing your
own classes from scratch.

Here's our C<File> class with C<Class::Accessor>:

  package File;
  use Class::Accessor 'antlers';

  has path          => ( is => 'ro' );
  has content       => ( is => 'ro' );
  has last_mod_time => ( is => 'ro' );

  sub print_info {
      my $self = shift;

      print "This file is at ", $self->path, "\n";
  }

The C<antlers> import flag tells C<Class::Accessor> that you want to
define your attributes using C<Moose>-like syntax. The only parameter
that you can pass to C<has> is C<is>. We recommend that you use this
Moose-like syntax if you choose C<Class::Accessor> since it means you
will have a smoother upgrade path if you later decide to move to
C<Moose>.

Like C<Moose>, C<Class::Accessor> generates accessor methods and a
constructor for your class.

=head2 Class::Tiny

Finally, we have L<Class::Tiny>. This module truly lives up to its
name. It has an incredibly minimal API and absolutely no dependencies
on any recent Perl. Still, we think it's a lot easier to use than
writing your own OO code from scratch.

Here's our C<File> class once more:

  package File;
  use Class::Tiny qw( path content last_mod_time );

  sub print_info {
      my $self = shift;

      print "This file is at ", $self->path, "\n";
  }

That's it!

With C<Class::Tiny>, all accessors are read-write. It generates a
constructor for you, as well as the accessors you define.

You can also use L<Class::Tiny::Antlers> for C<Moose>-like syntax.

=head2 Role::Tiny

As we mentioned before, roles provide an alternative to inheritance,
but Perl does not have any built-in role support. If you choose to use
Moose, it comes with a full-fledged role implementation. However, if
you use one of our other recommended OO modules, you can still use
roles with L<Role::Tiny>.

C<Role::Tiny> provides some of the same features as Moose's role
system, but in a much smaller package. Most notably, it doesn't support
any sort of attribute declaration, so you have to do that by hand.
Still, it's useful, and works well with C<Class::Accessor> and
C<Class::Tiny>.
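
Here is a sketch of the C<HasOnOffSwitch> role from earlier written
with C<Role::Tiny>, assuming the module is installed from CPAN. The
C<Radio> class and its hand-written constructor are illustrative, and
the C<is_on> state is stored by hand in the object's hash because
C<Role::Tiny> has no attribute declarations.

  use strict;
  use warnings;

  {
      package HasOnOffSwitch;
      use Role::Tiny;

      # State is kept by hand in the consuming object's hash.
      sub is_on    { $_[0]{is_on} ? 1 : 0 }
      sub turn_on  { $_[0]{is_on} = 1 }
      sub turn_off { $_[0]{is_on} = 0 }
  }

  {
      package Radio;
      use Role::Tiny::With;

      sub new { return bless {}, shift }

      # Compose the role's methods into this class.
      with 'HasOnOffSwitch';
  }

  my $radio = Radio->new;
  $radio->turn_on;

  my $state = $radio->is_on;
  my $does  = Role::Tiny::does_role( $radio, 'HasOnOffSwitch' );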

=head2 OO System Summary

Here's a brief recap of the options we covered:

=over 4

=item * L<Moose>

C<Moose> is the maximal option. It has a lot of features, a big
ecosystem, and a thriving user base. We also covered L<Moo> briefly.
C<Moo> is C<Moose> lite, and a reasonable alternative when Moose
doesn't work for your application.

=item * L<Class::Accessor>

C<Class::Accessor> does a lot less than C<Moose>, and is a nice
alternative if you find C<Moose> overwhelming. It's been around a long
time and is well battle-tested. It also has a minimal C<Moose>
compatibility mode which makes moving from C<Class::Accessor> to
C<Moose> easy.

=item * L<Class::Tiny>

C<Class::Tiny> is the absolute minimal option. It has no dependencies,
and almost no syntax to learn. It's a good option for a super minimal
environment and for throwing something together quickly without having
to worry about details.

=item * L<Role::Tiny>

Use C<Role::Tiny> with C<Class::Accessor> or C<Class::Tiny> if you find
yourself considering multiple inheritance. If you go with C<Moose>, it
comes with its own role implementation.

=back

=head2 Other OO Systems

There are literally dozens of other OO-related modules on CPAN besides
those covered here, and you're likely to run across one or more of them
if you work with other people's code.

In addition, plenty of code in the wild does all of its OO "by hand",
using just the Perl built-in OO features. If you need to maintain such
code, you should read L<perlobj> to understand exactly how Perl's
built-in OO works.

=head1 CONCLUSION

As we said before, Perl's minimal OO system has led to a profusion of
OO systems on CPAN. While you can still drop down to the bare metal and
write your classes by hand, there's really no reason to do that with
modern Perl.

For small systems, L<Class::Tiny> and L<Class::Accessor> both provide
minimal object systems that take care of basic boilerplate for you.

For bigger projects, L<Moose> provides a rich set of features that will
let you focus on implementing your business logic. L<Moo> provides a
nice alternative to L<Moose> when you want a lot of features but need
faster compile time or to avoid XS.

We encourage you to play with and evaluate L<Moose>, L<Moo>,
L<Class::Accessor>, and L<Class::Tiny> to see which OO system is right
for you.

=cut
=encoding utf8

=head1 NAME

perl5240delta - what is new for perl v5.24.0

=head1 DESCRIPTION

This document describes the differences between the 5.22.0 release and the
5.24.0 release.

=head1 Core Enhancements

=head2 Postfix dereferencing is no longer experimental

Using the C<postderef> and C<postderef_qq> features no longer emits a
warning. Existing code that disables the C<experimental::postderef> warning
category that they previously used will continue to work. The C<postderef>
feature has no effect; all Perl code can use postfix dereferencing,
regardless of what feature declarations are in scope. The C<5.24> feature
bundle now includes the C<postderef_qq> feature.
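
As a brief illustration, the following sketch shows a few postfix
dereference forms next to their traditional equivalents. The variable
names are arbitrary.

  use v5.24;    # enables strict and the postderef_qq feature
  use warnings;

  my $aref = [ 1, 2, 3 ];
  my $href = { b => 2, a => 1 };

  my @items = $aref->@*;              # same as @{$aref}
  my $last  = $aref->$#*;             # same as $#{$aref}
  my @keys  = sort keys $href->%*;    # same as keys %{$href}
  my @slice = $aref->@[ 0, 1 ];       # same as @{$aref}[ 0, 1 ]

  # Interpolation works because the 5.24 bundle includes postderef_qq.
  my $text = "items: $aref->@*";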

=head2 Unicode 8.0 is now supported

For details on what is in this release, see
L<http://www.unicode.org/versions/Unicode8.0.0/>.

=head2 perl will now croak when closing an in-place output file fails

Until now, failure to close the output file for an in-place edit was not
detected, meaning that the input file could be clobbered without the edit being
successfully completed.  Now, when the output file cannot be closed
successfully, an exception is raised.

=head2 New C<\b{lb}> boundary in regular expressions

C<lb> stands for Line Break.  It is a Unicode property
that determines where a line of text is suitable to break (typically so
that it can be output without overflowing the available horizontal
space).  This capability has long been furnished by the
L<Unicode::LineBreak> module, but now a light-weight, non-customizable
version that is suitable for many purposes is in core Perl.
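
One possible use, sketched here with a made-up sample string, is to
split text into chunks at each permitted break position. For plain
space-separated words the break opportunities are expected to fall
after the spaces, but the exact positions are decided by the Unicode
rules.

  use v5.24;
  use warnings;

  my $text = "The quick brown fox";

  # Split at each position where a line break is permitted.
  # The match is zero-width, so no characters are lost.
  my @chunks = split /\b{lb}/, $text;

  # Joining the chunks gives back the original text.
  my $rejoined = join '', @chunks;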

=head2 C<qr/(?[ ])/> now works in UTF-8 locales

L<Extended Bracketed Character Classes|perlrecharclass/Extended Bracketed Character Classes>
now will successfully compile when S<C<use locale>> is in effect.  The compiled
pattern will use standard Unicode rules.  If the runtime locale is not a
UTF-8 one, a warning is raised and standard Unicode rules are used
anyway.  No tainting is done since the outcome does not actually depend
on the locale.

=head2 Integer shift (C<< << >> and C<< >> >>) now more explicitly defined

Negative shifts are reverse shifts: left shift becomes right shift,
and right shift becomes left shift.

Shifting by the number of bits in a native integer (or more) is zero,
except when the "overshift" is right shifting a negative value under
C<use integer>, in which case the result is -1 (arithmetic shift).

Until now negative shifting and overshifting have been undefined
because they have relied on whatever the C implementation happens
to do.  For example, for the overshift a common C behavior is
"modulo shift":

  1 >> 64 == 1 >> (64 % 64) == 1 >> 0 == 1  # Common C behavior.

  # And the same for <<, while Perl now produces 0 for both.

Now these behaviors are well-defined under Perl, regardless of what
the underlying C implementation does.  Note, however, that you are still
constrained by the native integer width: you need to know how far left you
can go.  You can use for example:

  use Config;
  my $wordbits = $Config{uvsize} * 8;  # Or $Config{uvsize} << 3.

If you need more bits on the left shift, you can use for example
the C<bigint> pragma, or the C<Bit::Vector> module from CPAN.
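
The rules above can be checked with a short sketch such as the
following. The variable names are arbitrary, and the results in the
comments follow directly from the rules stated in this section.

  use v5.24;
  use warnings;
  use Config;

  my $wordbits = $Config{uvsize} * 8;

  my $reverse_left  = 8 << -1;           # acts like 8 >> 1, giving 4
  my $reverse_right = 8 >> -1;           # acts like 8 << 1, giving 16
  my $over_left     = 1 << $wordbits;    # overshift gives 0
  my $over_right    = 1 >> $wordbits;    # overshift gives 0

  # Right overshift of a negative value under "use integer" gives -1.
  my $arithmetic = do {
      use integer;
      -8 >> $wordbits;
  };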

=head2 printf and sprintf now allow reordered precision arguments

That is, C<< sprintf '|%.*2$d|', 2, 3 >> now returns C<|002|>. This extends
the existing reordering mechanism (which allows reordering for arguments
that are used as format fields, widths, and vector separators).
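
A short sketch of the new form follows. The first call is the example
from the text; the second applies the same mechanism to a
floating-point value and is added here only for illustration.

  use v5.24;
  use warnings;

  # The value comes from the next unused argument (2);
  # the precision comes from argument number 2 (3).
  my $padded = sprintf '|%.*2$d|', 2, 3;         # |002|

  # Same idea with a float: the precision (2) is the second argument.
  my $rounded = sprintf '%.*2$f', 3.14159, 2;    # 3.14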

=head2 More fields provided to C<sigaction> callback with C<SA_SIGINFO>

When passing the C<SA_SIGINFO> flag to L<sigaction|POSIX/sigaction>, the
C<errno>, C<status>, C<uid>, C<pid>, C<addr> and C<band> fields are now
included in the hash passed to the handler, if supported by the
platform.

=head2 Hashbang redirection to Perl 6

Previously perl would redirect to another interpreter if it found a
hashbang path unless the path contains "perl" (see L<perlrun>). To improve
compatibility with Perl 6 this behavior has been extended to also redirect
if "perl" is followed by "6".

=head1 Security

=head2 Set proper umask before calling C<mkstemp(3)>

In 5.22 perl started setting umask to 0600 before calling C<mkstemp(3)>
and restoring it afterwards. This wrongfully tells C<open(2)> to strip
the owner read and write bits from the given mode before applying it,
rather than the intended negation of leaving only those bits in place.

Systems that use mode 0666 in C<mkstemp(3)> (like old versions of
glibc) create a file with permissions 0066, leaving world read and
write permissions regardless of current umask.

This has been fixed by using umask 0177 instead. [perl #127322]

=head2 Fix out of boundary access in Win32 path handling

This is CVE-2015-8608.  For more information see
L<[perl #126755]|https://rt.perl.org/Ticket/Display.html?id=126755>

=head2 Fix loss of taint in canonpath

This is CVE-2015-8607.  For more information see
L<[perl #126862]|https://rt.perl.org/Ticket/Display.html?id=126862>

=head2 Avoid accessing uninitialized memory in win32 C<crypt()>

Added validation that will detect both a short salt and invalid characters
in the salt.
L<[perl #126922]|https://rt.perl.org/Ticket/Display.html?id=126922>

=head2 Remove duplicate environment variables from C<environ>

Previously, if an environment variable appeared more than once in
C<environ[]>, C<%ENV> would contain the last entry for that name,
while a typical C<getenv()> would return the first entry. We now
make sure C<%ENV> contains the same as what C<getenv> returns.

Second, we remove duplicates from C<environ[]>, so if a setting
with that name is set in C<%ENV>, we won't pass an unsafe value
to a child process.

[CVE-2016-2381]

=head1 Incompatible Changes

=head2 The C<autoderef> feature has been removed

The experimental C<autoderef> feature (which allowed calling C<push>,
C<pop>, C<shift>, C<unshift>, C<splice>, C<keys>, C<values>, and C<each> on
a scalar argument) has been deemed unsuccessful. It has now been removed;
trying to use the feature (or to disable the C<experimental::autoderef>
warning it previously triggered) now yields an exception.

=head2 Lexical $_ has been removed

C<my $_> was introduced in Perl 5.10, and subsequently caused much confusion
with no obvious solution.  In Perl 5.18.0, it was made experimental on the
theory that it would either be removed or redesigned in a less confusing (but
backward-incompatible) way.  Over the following years, no alternatives were
proposed.  The feature has now been removed and will fail to compile.

=head2 C<qr/\b{wb}/> is now tailored to Perl expectations

This is now more suited to be a drop-in replacement for plain C<\b>, but
giving better results for parsing natural language.  Previously it
strictly followed the current Unicode rules which call for it to match
between each white space character.  Now it doesn't generally match
within spans of white space, behaving like C<\b> does.  See
L<perlrebackslash/\b{wb}>.

=head2 Regular expression compilation errors

Some regular expression patterns that had runtime errors now
don't compile at all.

Almost all Unicode properties using the C<\p{}> and C<\P{}> regular
expression pattern constructs are now checked for validity at pattern
compilation time, and invalid ones will cause the program to not
compile.  In earlier releases, this check was often deferred until run
time.  Whenever an error check is moved from run- to compile time,
erroneous code is caught 100% of the time, whereas before it would only
get caught if and when the offending portion actually gets executed,
which for unreachable code might be never.
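
The effect can be observed with a sketch like the one below.
C<NoSuchProperty> is a made-up name, and the pattern is built from a
string at run time so that the example file itself still compiles; the
error is then raised when the pattern is compiled, not when it is first
used in a match.

  use v5.24;
  use warnings;

  # Kept in a string so that this file itself compiles.
  my $source = '\p{NoSuchProperty}';

  # The invalid property is rejected as soon as the pattern is compiled.
  my $compiled = eval { qr/$source/ };
  my $error    = $@;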

=head2 C<qr/\N{}/> now disallowed under C<use re "strict">

An empty C<\N{}> makes no sense, but for backwards compatibility is
accepted as doing nothing, though a deprecation warning is raised by
default.  But now this is a fatal error under the experimental feature
L<re/'strict' mode>.

=head2 Nested declarations are now disallowed

A C<my>, C<our>, or C<state> declaration is no longer allowed inside
of another C<my>, C<our>, or C<state> declaration.

For example, these are now fatal:

   my ($x, my($y));
   our (my $x);

L<[perl #125587]|https://rt.perl.org/Ticket/Display.html?id=125587>

L<[perl #121058]|https://rt.perl.org/Ticket/Display.html?id=121058>

=head2 The C</\C/> character class has been removed.

This regular expression character class was deprecated in v5.20.0 and has
produced a deprecation warning since v5.22.0. It is now a compile-time
error. If you need to examine the individual bytes that make up a
UTF8-encoded character, then use C<utf8::encode()> on the string (or a
copy) first.
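
A minimal sketch of that approach, using an arbitrary sample string:

  use strict;
  use warnings;

  my $string = "caf\x{e9}";    # four characters; the last is U+00E9

  # Work on a copy so the original character string is left untouched.
  my $bytes = $string;
  utf8::encode($bytes);        # in place: now a string of UTF-8 bytes

  # Each element is one byte of the encoded form, in hexadecimal.
  my @hex = map { sprintf '%02X', ord } split //, $bytes;
  # @hex is now ('63', '61', '66', 'C3', 'A9')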

=head2 C<chdir('')> no longer chdirs home

Using C<chdir('')> or C<chdir(undef)> to chdir home has been deprecated since
perl v5.8, and will now fail.  Use C<chdir()> instead.

=head2 ASCII characters in variable names must now be all visible

It was legal until now on ASCII platforms for variable names to contain
non-graphical ASCII control characters (ordinals 0 through 31, and 127,
which are the C0 controls and C<DELETE>).  This usage has been
deprecated since v5.20, and as of now causes a syntax error.  The
variables these names referred to are special, reserved by Perl for
whatever use it may choose, now, or in the future.  Each such variable
has an alternative way of spelling it.  Instead of the single
non-graphic control character, a two character sequence beginning with a
caret is used, like C<$^]> and C<${^GLOBAL_PHASE}>.  Details are at
L<perlvar>.   It remains legal, though unwise and deprecated (raising a
deprecation warning), to use certain non-graphic non-ASCII characters in
variable names when not under S<C<use utf8>>.  No code should do this,
as all such variables are reserved by Perl, and Perl doesn't currently
define any of them (but could at any time, without notice).

=head2 An off by one issue in C<$Carp::MaxArgNums> has been fixed

C<$Carp::MaxArgNums> is supposed to be the number of arguments to display.
Prior to this version, it was instead showing C<$Carp::MaxArgNums> + 1 arguments,
contrary to the documentation.

=head2 Only blanks and tabs are now allowed within C<[...]> within C<(?[...])>.

The experimental Extended Bracketed Character Classes can contain regular
bracketed character classes within them.  These differ from regular ones in
that white space is generally ignored, unless escaped by preceding it with a
backslash.  The white space that is ignored is now limited to just tab C<\t>
and SPACE characters.  Previously, it was any white space.  See
L<perlrecharclass/Extended Bracketed Character Classes>.

=head1 Deprecations

=head2 Using code points above the platform's C<IV_MAX> is now deprecated

Unicode defines code points in the range C<0..0x10FFFF>.  Some standards
at one time defined them up to 2**31 - 1, but Perl has allowed them to
be as high as anything that will fit in a word on the platform being
used.  However, use of those above the platform's C<IV_MAX> is broken in
some constructs, notably C<tr///>, regular expression patterns involving
quantifiers, and in some arithmetic and comparison operations, such as
being the upper limit of a loop.  Now the use of such code points raises
a deprecation warning, unless that warning category is turned off.
C<IV_MAX> is typically 2**31 - 1 on 32-bit platforms, and 2**63 - 1 on
64-bit ones.
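
One way to find the limit on a given build, shown here only as a sketch, is
to shift the all-ones unsigned value right by one bit.  The helper names
C<iv_max> and C<exceeds_iv_max> are hypothetical.

```perl
use strict;
use warnings;

# ~0 is the largest unsigned integer the platform supports; shifting it
# right by one bit gives the largest signed integer, i.e. IV_MAX.
sub iv_max { return ~0 >> 1 }

# Hypothetical check: would using this code point now be deprecated?
sub exceeds_iv_max {
    my ($code_point) = @_;
    return $code_point > iv_max() ? 1 : 0;
}

printf "IV_MAX on this perl: %d\n", iv_max();
printf "U+10FFFF above IV_MAX? %d\n", exceeds_iv_max(0x10FFFF);
```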

=head2 Doing bitwise operations on strings containing code points above
0xFF is deprecated

The string bitwise operators treat their operands as strings of bytes,
and values beyond 0xFF are nonsensical in this context.  To operate on
encoded bytes, first encode the strings.  To operate on code points'
numeric values, use C<split> and C<map ord>.  In the future, this
warning will be replaced by an exception.
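
The two alternatives can be sketched as follows.  The helper names
C<xor_encoded_bytes> and C<xor_code_points> are hypothetical, and UTF-8 is
chosen as the encoding purely for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Alternative 1: encode first, then operate on the resulting bytes.
sub xor_encoded_bytes {
    my ($x, $y) = @_;
    return encode('UTF-8', $x) ^ encode('UTF-8', $y);
}

# Alternative 2: operate on the numeric code point values, pairwise,
# stopping at the end of the shorter string.
sub xor_code_points {
    my ($x, $y) = @_;
    my @x = map ord, split //, $x;
    my @y = map ord, split //, $y;
    my $n = @x < @y ? @x : @y;
    return map { $x[$_] ^ $y[$_] } 0 .. $n - 1;
}

print join(',', xor_code_points("\x{100}A", "\x{101}B")), "\n";
```

Neither helper applies a string bitwise operator to a string containing a
code point above 0xFF, so neither should trigger the deprecation.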

=head2 C<sysread()>, C<syswrite()>, C<recv()> and C<send()> are deprecated on
:utf8 handles

The C<sysread()>, C<recv()>, C<syswrite()> and C<send()> operators
are deprecated on handles that have the C<:utf8> layer, either
explicitly, or implicitly, e.g., with the C<:encoding(UTF-16LE)> layer.

Both C<sysread()> and C<recv()> currently use only the C<:utf8> flag for the
stream, ignoring the actual layers.  Since C<sysread()> and C<recv()> do no
UTF-8 validation they can end up creating invalidly encoded scalars.

Similarly, C<syswrite()> and C<send()> use only the C<:utf8> flag, otherwise
ignoring any layers.  If the flag is set, both write the value UTF-8
encoded, even if the layer is some different encoding, such as the
example above.

Ideally, all of these operators would completely ignore the C<:utf8>
state, working only with bytes, but this would result in silently
breaking existing code.  To avoid this, a future version of perl will
throw an exception when any of C<sysread()>, C<recv()>, C<syswrite()> or C<send()>
is called on a handle with the C<:utf8> layer.
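
Code that needs unbuffered I/O on encoded data can avoid the deprecated
combination by keeping the handle in C<:raw> mode and converting explicitly
with L<Encode>.  The following is a sketch under that assumption; the helper
names C<write_encoded> and C<read_encoded> are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use File::Temp qw(tempfile);

# Encode the text ourselves, then write the bytes to a :raw handle.
sub write_encoded {
    my ($path, $encoding, $text) = @_;
    open(my $fh, '>:raw', $path) or die "Cannot open $path: $!";
    my $bytes  = encode($encoding, $text);
    my $offset = 0;
    while ($offset < length $bytes) {
        my $n = syswrite($fh, $bytes, length($bytes) - $offset, $offset);
        die "syswrite failed: $!" unless defined $n;
        $offset += $n;
    }
    close($fh) or die "Cannot close $path: $!";
    return;
}

# Read all bytes from a :raw handle, then decode them in one step.
sub read_encoded {
    my ($path, $encoding) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    my $bytes = '';
    while (1) {
        my $n = sysread($fh, $bytes, 4096, length $bytes);
        die "sysread failed: $!" unless defined $n;
        last if $n == 0;    # end of file
    }
    close($fh);
    return decode($encoding, $bytes);
}

my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
close $tmp_fh;
write_encoded($tmp_path, 'UTF-16LE', "caf\x{e9}");
print length(read_encoded($tmp_path, 'UTF-16LE')), " characters read\n";
```

Decoding only after all bytes have been collected avoids splitting a
multi-byte sequence across two reads.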

=head1 Performance Enhancements

=over 4

=item *

The overhead of scope entry and exit has been considerably reduced, so
for example subroutine calls, loops and basic blocks are all faster now.
This empty function call now takes about a third less time to execute:

    sub f{} f();

=item *

Many languages, such as Chinese, are caseless.  Perl now knows about
most common ones, and skips much of the work when
a program tries to change case in them (like C<ucfirst()>) or match
caselessly (C<qr//i>).  This will speed up a program, such as a web
server, that can operate on multiple languages, while it is operating on a
caseless one.

=item *

C</fixed-substr/> has been made much faster.

On platforms with a libc C<memchr()> implementation which makes good use of
underlying hardware support, patterns which include fixed substrings will now
often be much faster; for example with glibc on a recent x86_64 CPU, this:

    $s = "a" x 1000 . "wxyz";
    $s =~ /wxyz/ for 1..30000

is now about 7 times faster.  On systems with slow C<memchr()>, e.g. 32-bit ARM
Raspberry Pi, there will be little or no speedup.  Conversely, some
pathological cases, such as C<"ab" x 1000 =~ /aa/> will be slower now; up to 3
times slower on the rPi, 1.5x slower on x86_64.

=item *

Faster addition, subtraction and multiplication.

Since 5.8.0, arithmetic became slower due to the need to support
64-bit integers. To deal with 64-bit integers, a lot more corner
cases need to be checked, which adds time. We now detect common
cases where there is no need to check for those corner cases,
and special-case them.

=item *

Preincrement, predecrement, postincrement, and postdecrement have been
made faster by internally splitting the functions which handled multiple
cases into different functions.

=item *

Creating Perl debugger data structures (see L<perldebguts/"Debugger Internals">)
for XSUBs and const subs has been removed.  This removed one glob/scalar combo
for each unique C<.c> file that XSUBs and const subs came from.  On startup
(C<perl -e"0">) about half a dozen glob/scalar debugger combos were created.
Loading XS modules created more glob/scalar combos.  These things were
being created regardless of whether the perl debugger was being used,
and despite the fact that it can't debug C code anyway.

=item *

On Win32, C<stat>ing or C<-X>ing a path, if the file or directory does not
exist, is now 3.5x faster than before.

=item *

Single arguments in list assign are now slightly faster:

  ($x) = (...);
  (...) = ($x);

=item *

Less peak memory is now used when compiling regular expression patterns.

=back

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over

=item *

L<arybase> has been upgraded from version 0.10 to 0.11.

=item *

L<Attribute::Handlers> has been upgraded from version 0.97 to 0.99.

=item *

L<autodie> has been upgraded from version 2.26 to 2.29.

=item *

L<autouse> has been upgraded from version 1.08 to 1.11.

=item *

L<B> has been upgraded from version 1.58 to 1.62.

=item *

L<B::Deparse> has been upgraded from version 1.35 to 1.37.

=item *

L<base> has been upgraded from version 2.22 to 2.23.

=item *

L<Benchmark> has been upgraded from version 1.2 to 1.22.

=item *

L<bignum> has been upgraded from version 0.39 to 0.42.

=item *

L<bytes> has been upgraded from version 1.04 to 1.05.

=item *

L<Carp> has been upgraded from version 1.36 to 1.40.

=item *

L<Compress::Raw::Bzip2> has been upgraded from version 2.068 to 2.069.

=item *

L<Compress::Raw::Zlib> has been upgraded from version 2.068 to 2.069.

=item *

L<Config::Perl::V> has been upgraded from version 0.24 to 0.25.

=item *

L<CPAN::Meta> has been upgraded from version 2.150001 to 2.150005.

=item *

L<CPAN::Meta::Requirements> has been upgraded from version 2.132 to 2.140.

=item *

L<CPAN::Meta::YAML> has been upgraded from version 0.012 to 0.018.

=item *

L<Data::Dumper> has been upgraded from version 2.158 to 2.160.

=item *

L<Devel::Peek> has been upgraded from version 1.22 to 1.23.

=item *

L<Devel::PPPort> has been upgraded from version 3.31 to 3.32.

=item *

L<Dumpvalue> has been upgraded from version 1.17 to 1.18.

=item *

L<DynaLoader> has been upgraded from version 1.32 to 1.38.

=item *

L<Encode> has been upgraded from version 2.72 to 2.80.

=item *

L<encoding> has been upgraded from version 2.14 to 2.17.

=item *

L<encoding::warnings> has been upgraded from version 0.11 to 0.12.

=item *

L<English> has been upgraded from version 1.09 to 1.10.

=item *

L<Errno> has been upgraded from version 1.23 to 1.25.

=item *

L<experimental> has been upgraded from version 0.013 to 0.016.

=item *

L<ExtUtils::CBuilder> has been upgraded from version 0.280221 to 0.280225.

=item *

L<ExtUtils::Embed> has been upgraded from version 1.32 to 1.33.

=item *

L<ExtUtils::MakeMaker> has been upgraded from version 7.04_01 to 7.10_01.

=item *

L<ExtUtils::ParseXS> has been upgraded from version 3.28 to 3.31.

=item *

L<ExtUtils::Typemaps> has been upgraded from version 3.28 to 3.31.

=item *

L<feature> has been upgraded from version 1.40 to 1.42.

=item *

L<fields> has been upgraded from version 2.17 to 2.23.

=item *

L<File::Find> has been upgraded from version 1.29 to 1.34.

=item *

L<File::Glob> has been upgraded from version 1.24 to 1.26.

=item *

L<File::Path> has been upgraded from version 2.09 to 2.12_01.

=item *

L<File::Spec> has been upgraded from version 3.56 to 3.63.

=item *

L<Filter::Util::Call> has been upgraded from version 1.54 to 1.55.

=item *

L<Getopt::Long> has been upgraded from version 2.45 to 2.48.

=item *

L<Hash::Util> has been upgraded from version 0.18 to 0.19.

=item *

L<Hash::Util::FieldHash> has been upgraded from version 1.15 to 1.19.

=item *

L<HTTP::Tiny> has been upgraded from version 0.054 to 0.056.

=item *

L<I18N::Langinfo> has been upgraded from version 0.12 to 0.13.

=item *

L<if> has been upgraded from version 0.0604 to 0.0606.

=item *

L<IO> has been upgraded from version 1.35 to 1.36.

=item *

IO-Compress has been upgraded from version 2.068 to 2.069.

=item *

L<IPC::Open3> has been upgraded from version 1.18 to 1.20.

=item *

L<IPC::SysV> has been upgraded from version 2.04 to 2.06_01.

=item *

L<List::Util> has been upgraded from version 1.41 to 1.42_02.

=item *

L<locale> has been upgraded from version 1.06 to 1.08.

=item *

L<Locale::Codes> has been upgraded from version 3.34 to 3.37.

=item *

L<Math::BigInt> has been upgraded from version 1.9997 to 1.999715.

=item *

L<Math::BigInt::FastCalc> has been upgraded from version 0.31 to 0.40.

=item *

L<Math::BigRat> has been upgraded from version 0.2608 to 0.260802.

=item *

L<Module::CoreList> has been upgraded from version 5.20150520 to 5.20160320.

=item *

L<Module::Metadata> has been upgraded from version 1.000026 to 1.000031.

=item *

L<mro> has been upgraded from version 1.17 to 1.18.

=item *

L<ODBM_File> has been upgraded from version 1.12 to 1.14.

=item *

L<Opcode> has been upgraded from version 1.32 to 1.34.

=item *

L<parent> has been upgraded from version 0.232 to 0.234.

=item *

L<Parse::CPAN::Meta> has been upgraded from version 1.4414 to 1.4417.

=item *

L<Perl::OSType> has been upgraded from version 1.008 to 1.009.

=item *

L<perlfaq> has been upgraded from version 5.021009 to 5.021010.

=item *

L<PerlIO::encoding> has been upgraded from version 0.21 to 0.24.

=item *

L<PerlIO::mmap> has been upgraded from version 0.014 to 0.016.

=item *

L<PerlIO::scalar> has been upgraded from version 0.22 to 0.24.

=item *

L<PerlIO::via> has been upgraded from version 0.15 to 0.16.

=item *

L<Pod::Functions> has been upgraded from version 1.09 to 1.10.

=item *

L<Pod::Perldoc> has been upgraded from version 3.25 to 3.25_02.

=item *

L<Pod::Simple> has been upgraded from version 3.29 to 3.32.

=item *

L<Pod::Usage> has been upgraded from version 1.64 to 1.68.

=item *

L<POSIX> has been upgraded from version 1.53 to 1.65.

=item *

L<Scalar::Util> has been upgraded from version 1.41 to 1.42_02.

=item *

L<SDBM_File> has been upgraded from version 1.13 to 1.14.

=item *

L<SelfLoader> has been upgraded from version 1.22 to 1.23.

=item *

L<Socket> has been upgraded from version 2.018 to 2.020_03.

=item *

L<Storable> has been upgraded from version 2.53 to 2.56.

=item *

L<strict> has been upgraded from version 1.09 to 1.11.

=item *

L<Term::ANSIColor> has been upgraded from version 4.03 to 4.04.

=item *

L<Term::Cap> has been upgraded from version 1.15 to 1.17.

=item *

L<Test> has been upgraded from version 1.26 to 1.28.

=item *

L<Test::Harness> has been upgraded from version 3.35 to 3.36.

=item *

L<Thread::Queue> has been upgraded from version 3.05 to 3.08.

=item *

L<threads> has been upgraded from version 2.01 to 2.06.

=item *

L<threads::shared> has been upgraded from version 1.48 to 1.50.

=item *

L<Tie::File> has been upgraded from version 1.01 to 1.02.

=item *

L<Tie::Scalar> has been upgraded from version 1.03 to 1.04.

=item *

L<Time::HiRes> has been upgraded from version 1.9726 to 1.9732.

=item *

L<Time::Piece> has been upgraded from version 1.29 to 1.31.

=item *

L<Unicode::Collate> has been upgraded from version 1.12 to 1.14.

=item *

L<Unicode::Normalize> has been upgraded from version 1.18 to 1.25.

=item *

L<Unicode::UCD> has been upgraded from version 0.61 to 0.64.

=item *

L<UNIVERSAL> has been upgraded from version 1.12 to 1.13.

=item *

L<utf8> has been upgraded from version 1.17 to 1.19.

=item *

L<version> has been upgraded from version 0.9909 to 0.9916.

=item *

L<warnings> has been upgraded from version 1.32 to 1.36.

=item *

L<Win32> has been upgraded from version 0.51 to 0.52.

=item *

L<Win32API::File> has been upgraded from version 0.1202 to 0.1203.

=item *

L<XS::Typemap> has been upgraded from version 0.13 to 0.14.

=item *

L<XSLoader> has been upgraded from version 0.20 to 0.21.

=back

=head1 Documentation

=head2 Changes to Existing Documentation

=head3 L<perlapi>

=over 4

=item *

The process of using undocumented globals has been documented, namely, that one
should send email to L<perl5-porters@perl.org|mailto:perl5-porters@perl.org>
first to get the go-ahead for documenting and using an undocumented function or
global variable.

=back

=head3 L<perlcall>

=over 4

=item *

A number of cleanups have been made to perlcall, including:

=over 4

=item *

use C<EXTEND(SP, n)> and C<PUSHs()> instead of C<XPUSHs()> where applicable
and update prose to match

=item *

add POPu, POPul and POPpbytex to the "complete list of POP macros",
clarify the documentation for some of the existing entries, and
add a note about side-effects

=item *

add API documentation for POPu and POPul

=item *

use ERRSV more efficiently

=item *

approaches to thread-safe storage of SVs.

=back

=back

=head3 L<perlfunc>

=over 4

=item *

The documentation of C<hex> has been revised to clarify valid inputs.

=item *

Better explain meaning of negative PIDs in C<waitpid>.
L<[perl #127080]|https://rt.perl.org/Ticket/Display.html?id=127080>

=item *

General cleanup: there's more consistency now (in POD usage, grammar, code
examples), better practices in code examples (use of C<my>, removal of bareword
filehandles, dropped usage of C<&> when calling subroutines, ...), etc.

=back

=head3 L<perlguts>

=over 4

=item *

A new section has been added, L<perlguts/"Dynamic Scope and the Context
Stack">, which explains how the perl context stack works.

=back

=head3 L<perllocale>

=over 4

=item *

A stronger caution about using locales in threaded applications is
given.  Locales are not thread-safe, and you can get wrong results or
even segfaults if you use them there.

=back

=head3 L<perlmodlib>

=over 4

=item *

We now recommend contacting the module-authors list or PAUSE in seeking
guidance on the naming of modules.

=back

=head3 L<perlop>

=over 4

=item *

The documentation of C<qx//> now describes how C<$?> is affected.

=back

=head3 L<perlpolicy>

=over 4

=item *

This note has been added to perlpolicy:

 While civility is required, kindness is encouraged; if you have any
 doubt about whether you are being civil, simply ask yourself, "Am I
 being kind?" and aspire to that.

=back

=head3 L<perlreftut>

=over 4

=item *

Fix some examples to be L<strict> clean.

=back

=head3 L<perlrebackslash>

=over 4

=item *

Clarify that in languages like Japanese and Thai, dictionary lookup
is required to determine word boundaries.

=back

=head3 L<perlsub>

=over 4

=item *

Updated to note that anonymous subroutines can have signatures.

=back

=head3 L<perlsyn>

=over 4

=item *

Fixed a broken example where C<=> was used instead of
C<==> in the conditional of the do/while example.

=back

=head3 L<perltie>

=over 4

=item *

The usage of C<FIRSTKEY> and C<NEXTKEY> has been clarified.

=back

=head3 L<perlunicode>

=over 4

=item *

Discourage use of 'In' as a prefix signifying the Unicode Block property.

=back

=head3 L<perlvar>

=over 4

=item *

The documentation of C<$@> was reworded to clarify that it is not just for
syntax errors in C<eval>.
L<[perl #124034]|https://rt.perl.org/Ticket/Display.html?id=124034>

=item *

The specific true value of C<$!{E...}> is now documented, noting that it is
subject to change and not guaranteed.

=item *

Use of C<$OLD_PERL_VERSION> is now discouraged.

=back

=head3 L<perlxs>

=over 4

=item *

The documentation of C<PROTOTYPES> has been corrected; they are I<disabled>
by default, not I<enabled>.

=back

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 New Diagnostics

=head3 New Errors

=over 4

=item *

L<%s must not be a named sequence in transliteration operator|perldiag/"%s must not be a named sequence in transliteration operator">

=item *

L<Can't find Unicode property definition "%s" in regex;|perldiag/"Can't find Unicode property definition "%s" in regex; marked by <-- HERE in m/%s/">

=item *

L<Can't redeclare "%s" in "%s"|perldiag/"Can't redeclare "%s" in "%s"">

=item *

L<Character following \p must be '{' or a single-character Unicode property name in regex;|perldiag/"Character following \%c must be '{' or a single-character Unicode property name in regex; marked by <-- HERE in m/%s/">

=item *

L<Empty \%c in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>
|perldiag/"Empty \%c in regex; marked by <-- HERE in mE<sol>%sE<sol>">

=item *

L<Illegal user-defined property name|perldiag/"Illegal user-defined property name">

=item *

L<Invalid number '%s' for -C option.|perldiag/"Invalid number '%s' for -C option.">

=item *

L<<< Sequence (?... not terminated in regex; marked by S<<-- HERE> in mE<sol>%sE<sol>|perldiag/"Sequence (?... not terminated in regex; marked by <-- HERE in mE<sol>%sE<sol>" >>>

=item *

L<<< Sequence (?PE<lt>... not terminated in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>
|perldiag/"Sequence (?PE<lt>... not terminated in regex; marked by <-- HERE in mE<sol>%sE<sol>" >>>

=item *

L<Sequence (?PE<gt>... not terminated in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>
|perldiag/"Sequence (?PE<gt>... not terminated in regex; marked by <-- HERE in mE<sol>%sE<sol>">

=back

=head3 New Warnings

=over 4

=item *

L<Assuming NOT a POSIX class since %s in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|
perldiag/Assuming NOT a POSIX class since %s in regex; marked by <-- HERE in mE<sol>%sE<sol>>

=item *

L<%s() is deprecated on :utf8 handles|perldiag/"%s() is deprecated on :utf8 handles">

=back

=head2 Changes to Existing Diagnostics

=over 4

=item *

Accessing the C<IO> part of a glob as C<FILEHANDLE> instead of C<IO> is no
longer deprecated.  It is discouraged to encourage uniformity (so that, for
example, one can grep more easily) but it will not be removed.
L<[perl #127060]|https://rt.perl.org/Ticket/Display.html?id=127060>

=item *

The diagnostic C<< Hexadecimal float: internal error >> has been changed to
C<< Hexadecimal float: internal error (%s) >> to include more information.

=item *

L<Can't modify non-lvalue subroutine call of &%s|perldiag/"Can't modify non-lvalue subroutine call of &%s">

This error now reports the name of the non-lvalue subroutine you attempted to
use as an lvalue.

=item *

When running out of memory during an attempt to increase the stack
size, perl would previously die with the cryptic message
C<< panic: av_extend_guts() negative count (-9223372036854775681) >>.
This has been fixed to show the prettier message:
L<< Out of memory during stack extend|perldiag/"Out of memory during %s extend" >>

=back

=head1 Configuration and Compilation

=over 4

=item *

C<Configure> now acts as if the C<-O> option is always passed, allowing command
line options to override saved configuration.  This should eliminate confusion
when command line options are ignored for no obvious reason.  C<-O> is now
permitted, but ignored.

=item *

Bison 3.0 is now supported.

=item *

F<Configure> no longer probes for F<libnm> by default.  Originally
this was the "New Math" library, but the name has been re-used by the
GNOME NetworkManager.
L<[perl #127131]|https://rt.perl.org/Ticket/Display.html?id=127131>

=item *

Added F<Configure> probes for C<newlocale>, C<freelocale>, and C<uselocale>.

=item *

C<< PPPort.so/PPPort.dll >> no longer get installed, as they are
not used by C<< PPPort.pm >>, only by its test files.

=item *

It is now possible to specify which compilation date to show on
C<< perl -V >> output, by setting the macro C<< PERL_BUILD_DATE >>.

=item *

Using the C<NO_HASH_SEED> define in combination with the default hash algorithm
C<PERL_HASH_FUNC_ONE_AT_A_TIME_HARD> resulted in a fatal error while compiling
the interpreter, since Perl 5.17.10.  This has been fixed.

=item *

F<Configure> should handle spaces in paths a little better.

=item *

No longer generate EBCDIC POSIX-BC tables.  We don't believe anyone is
using Perl and POSIX-BC at this time, and by not generating these tables
it saves time during development, and makes the resulting tar ball smaller.

=item *

The GNU Make makefile for Win32 now supports parallel builds.  [perl #126632]

=item *

You can now build perl with MSVC++ on Win32 using GNU Make.  [perl #126632]

=item *

The Win32 miniperl now has a real C<getcwd>, which improves build
performance: C<getcwd()> is now 605x faster in Win32 miniperl.

=item *

Configure now takes C<-Dusequadmath> into account when calculating the
C<alignbytes> configuration variable.  Previously the mis-calculated
C<alignbytes> could cause alignment errors on debugging builds. [perl
#127894]

=back

=head1 Testing

=over 4

=item *

A new test (F<t/op/aassign.t>) has been added to test the list assignment operator
C<OP_AASSIGN>.

=item *

Parallel building has been added to the dmake C<makefile.mk> makefile. All
Win32 compilers are supported.

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item AmigaOS

=over 4

=item *

The AmigaOS port has been reintegrated into the main tree, based off of
Perl 5.22.1.

=back

=item Cygwin

=over 4

=item *

Tests are more robust against unusual cygdrive prefixes.
L<[perl #126834]|https://rt.perl.org/Ticket/Display.html?id=126834>

=back

=item EBCDIC

=over 4

=item UTF-EBCDIC extended

UTF-EBCDIC is like UTF-8, but for EBCDIC platforms.  It now has been
extended so that it can represent code points up to 2 ** 64 - 1 on
platforms with 64-bit words.  This brings it into parity with UTF-8.
This enhancement requires an incompatible change to the representation
of code points in the range 2 ** 30 to 2 ** 31 -1 (the latter was the
previous maximum representable code point).  This means that a file that
contains one of these code points, written out with previous versions of
perl cannot be read in, without conversion, by a perl containing this
change.  We do not believe any such files are in existence, but if you
do have one, submit a ticket at L<perlbug@perl.org|mailto:perlbug@perl.org>,
and we will write a conversion script for you.

=item EBCDIC C<cmp()> and C<sort()> fixed for UTF-EBCDIC strings

Comparing two strings that were both encoded in UTF-8 (or more
precisely, UTF-EBCDIC) did not work properly until now.  Since C<sort()>
uses C<cmp()>, this fixes that as well.

=item EBCDIC C<tr///> and C<y///> fixed for C<\N{}>, and C<S<use utf8>> ranges

Perl v5.22 introduced the concept of portable ranges to regular
expression patterns.  A portable range matches the same set of
characters no matter what platform is being run on.  This concept is now
extended to C<tr///>.  See
C<L<trE<sol>E<sol>E<sol>|perlop/trE<sol>SEARCHLISTE<sol>REPLACEMENTLISTE<sol>cdsr>>.

There were also some problems with these operations under S<C<use
utf8>>, which are now fixed.

=back

=item FreeBSD

=over 4

=item *

Use the C<fdclose()> function from FreeBSD if it is available.
L<[perl #126847]|https://rt.perl.org/Ticket/Display.html?id=126847>

=back

=item IRIX

=over 4

=item *

Under some circumstances IRIX stdio C<fgetc()> and C<fread()> set the errno to
C<ENOENT>, which made no sense according to either IRIX or POSIX docs.  Errno
is now cleared in such cases.
L<[perl #123977]|https://rt.perl.org/Ticket/Display.html?id=123977>

=item *

Problems when multiplying long doubles by infinity have been fixed.
L<[perl #126396]|https://rt.perl.org/Ticket/Display.html?id=126396>

=back

=item MacOS X

=over 4

=item *

Until now OS X builds of perl have specified a link target of 10.3 (Panther,
2003) but have not specified a compiler target.  From now on, builds of perl on
OS X 10.6 or later (Snow Leopard, 2009) by default capture the current OS X
version and specify that as the explicit build target in both compiler and
linker flags, thus preserving binary compatibility for extensions built later
regardless of changes in OS X, SDK, or compiler and linker versions.  To
override the default value used in the build and preserved in the flags,
specify C<export MACOSX_DEPLOYMENT_TARGET=10.N> before configuring and building
perl, where 10.N is the version of OS X you wish to target.  In OS X 10.5 or
earlier there is no change to the behavior present when those systems were
current; the link target is still OS X 10.3 and there is no explicit compiler
target.

=item *

Builds with both -DDEBUGGING and threading enabled would fail with a
"panic: free from wrong pool" error when built or tested from Terminal
on OS X.  This was caused by perl's internal management of the
environment conflicting with an atfork handler using the libc
C<setenv()> function to update the environment.

Perl now uses C<setenv()>/C<unsetenv()> to update the environment on OS X.
L<[perl #126240]|https://rt.perl.org/Ticket/Display.html?id=126240>

=back

=item Solaris

=over 4

=item *

All Solaris variants now build a shared libperl

Solaris and variants like OpenIndiana now always build with the shared
Perl library (Configure -Duseshrplib).  This was required for the
OpenIndiana builds, but this has also been the setting for Oracle/Sun
Perl builds for several years.

=back

=item Tru64

=over 4

=item *

A workaround has been added for Tru64 balking when prototypes are listed as
C<< PERL_STATIC_INLINE >> while the test is built with
C<< -DPERL_NO_INLINE_FUNCTIONS >>.

=back

=item VMS

=over 4

=item *

On VMS, the math function prototypes in C<math.h> are now visible under C++.
Now building the POSIX extension with C++ will no longer crash.

=item *

VMS has had C<setenv>/C<unsetenv> since v7.0 (released in 1996), so
C<Perl_vmssetenv> now always uses C<setenv>/C<unsetenv>.

=item *

Perl now implements its own C<killpg> by scanning for processes in the
specified process group, which may not mean exactly the same thing as a Unix
process group, but allows us to send a signal to a parent (or master) process
and all of its sub-processes.  At the perl level, this means we can now send a
negative pid like so:

    kill SIGKILL, -$pid;

to signal all processes in the same group as C<$pid>.

=item *

For those C<%ENV> elements based on the CRTL environ array, we've always
preserved case when setting them but did look-ups only after upcasing the
key first, which made lower- or mixed-case entries go missing. This problem
has been corrected by making C<%ENV> elements derived from the environ array
case-sensitive on look-up as well as case-preserving on store.

=item *

Environment look-ups for C<PERL5LIB> and C<PERLLIB> previously only
considered logical names, but now consider all sources of C<%ENV> as
determined by C<PERL_ENV_TABLES> and as documented in L<perlvms/%ENV>.

=item *

The minimum supported version of VMS is now v7.3-2, released in 2003.  As a
side effect of this change, VAX is no longer supported as the terminal
release of OpenVMS VAX was v7.3 in 2001.

=back

=item Win32

=over 4

=item *

A new build option C<USE_NO_REGISTRY> has been added to the makefiles.  This
option is off by default, meaning the default is to do Windows registry
lookups.  This option stops Perl from looking inside the registry for anything.
For what values are looked up in the registry see L<perlwin32>.  Internally, in
C, the name of this option is C<WIN32_NO_REGISTRY>.

=item *

The behavior of Perl using C<HKEY_CURRENT_USER\Software\Perl> and
C<HKEY_LOCAL_MACHINE\Software\Perl> to look up certain values, including C<%ENV>
vars starting with C<PERL>, has changed.  Previously, the 2 keys were checked
for entries at all times through the perl process's lifetime even if
they did not exist.  For performance reasons, now, if the root key (i.e.
C<HKEY_CURRENT_USER\Software\Perl> or C<HKEY_LOCAL_MACHINE\Software\Perl>) does
not exist at process start time, it will not be checked again for C<%ENV>
override entries for the remainder of the perl process's life.  This more
closely matches Unix behavior in that the environment is copied or inherited
on startup and changing the variable in the parent process or another process
or editing F<.bashrc> will not change the environmental variable in other
existing, running, processes.

=item *

One glob fetch was removed for each C<-X> or C<stat> call whether done from
Perl code or internally from Perl's C code.  The glob being looked up was
C<${^WIN32_SLOPPY_STAT}> which is a special variable.  This makes C<-X> and
C<stat> slightly faster.

=item *

During miniperl's process startup, during the build process, 4 to 8 IO calls
related to the process starting F<.pl> and the F<buildcustomize.pl> file were
removed from the code opening and executing the first 1 or 2 F<.pl> files.

=item *

Builds using Microsoft Visual C++ 2003 and earlier no longer produce
an "INTERNAL COMPILER ERROR" message.  [perl #126045]

=item *

Visual C++ 2013 builds will now execute on XP and higher. Previously they would
only execute on Vista and higher.

=item *

You can now build perl with GNU Make and GCC.  [perl #123440]

=item *

C<truncate($filename, $size)> now works for files over 4GB in size.
[perl #125347]

=item *

Parallel building has been added to the dmake C<makefile.mk> makefile. All
Win32 compilers are supported.

=item *

Building a 64-bit perl with a 64-bit GCC but a 32-bit gmake would
result in an invalid C<$Config{archname}> for the resulting perl.
[perl #127584]

=item *

Errors set by Winsock functions are now put directly into C<$^E>, and the
relevant C<WSAE*> error codes are now exported from the L<Errno> and L<POSIX>
modules for testing this against.

The previous behavior of putting the errors (converted to POSIX-style C<E*>
error codes since Perl 5.20.0) into C<$!> was buggy due to the non-equivalence
of like-named Winsock and POSIX error constants, a relationship between which
has unfortunately been established in one way or another since Perl 5.8.0.

The new behavior provides a much more robust solution for checking Winsock
errors in portable software without accidentally matching POSIX tests that were
intended for other OSes and may have different meanings for Winsock.

The old behavior is currently retained, warts and all, for backwards
compatibility, but users are encouraged to change any code that tests C<$!>
against C<E*> constants for Winsock errors to instead test C<$^E> against
C<WSAE*> constants.  After a suitable deprecation period, the old behavior may
be removed, leaving C<$!> unchanged after Winsock function calls, to avoid any
possible confusion over which error variable to check.

=back
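
A portable check along the lines recommended above might look like the
following sketch.  The helper name C<connection_refused> is hypothetical, and
the availability of C<WSAECONNREFUSED> from L<Errno> on Win32 is assumed from
the statement above that the relevant C<WSAE*> codes are exported.

```perl
use strict;
use warnings;
use Errno ();

# Hypothetical helper: report whether the last socket call failed
# because the connection was refused.
sub connection_refused {
    if ($^O eq 'MSWin32') {
        # Assumption: Errno provides WSAECONNREFUSED on Win32.  The
        # fully qualified call is resolved at run time, so this file
        # still compiles on platforms without that constant.
        return $^E == Errno::WSAECONNREFUSED() ? 1 : 0;
    }
    # Elsewhere, test $! against the POSIX-style constant.
    return $!{ECONNREFUSED} ? 1 : 0;
}
```

Call the helper immediately after the failing socket operation, before any
other system call has a chance to overwrite C<$!> or C<$^E>.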

=item ppc64el

=over 4

=item floating point

The floating point format of ppc64el (Debian naming for little-endian
PowerPC) is now detected correctly.

=back

=back

=head1 Internal Changes

=over 4

=item *

The implementation of perl's context stack system, and its internal API,
have been heavily reworked. Note that no significant changes have been
made to any external APIs, but XS code which relies on such internal
details may need to be fixed. The main changes are:

=over 4

=item *

The C<PUSHBLOCK()>, C<POPSUB()> etc. macros have been replaced with static
inline functions such as C<cx_pushblock()>, C<cx_popsub()> etc. These use
function args rather than implicitly relying on local vars such as
C<gimme> and C<newsp> being available. Also their functionality has
changed: in particular, C<cx_popblock()> no longer decrements
C<cxstack_ix>. The ordering of the steps in the C<pp_leave*> functions
involving C<cx_popblock()>, C<cx_popsub()> etc. has changed. See the new
documentation, L<perlguts/"Dynamic Scope and the Context Stack">, for
details on how to use them.

=item *

Various macros, which now consistently have a CX_ prefix, have been added:

  CX_CUR(), CX_LEAVE_SCOPE(), CX_POP()

or renamed:

  CX_POP_SAVEARRAY(), CX_DEBUG(), CX_PUSHSUBST(), CX_POPSUBST()

=item *

C<cx_pushblock()> now saves C<PL_savestack_ix> and C<PL_tmps_floor>, so
C<pp_enter*> and C<pp_leave*> no longer do

  ENTER; SAVETMPS; ....; LEAVE

=item *

C<cx_popblock()> now also restores C<PL_curpm>.

=item *

In C<dounwind()> for every context type, the current savestack frame is
now processed before each context is popped; formerly this was only done
for sub-like context frames. This action has been removed from
C<cx_popsub()> and placed into its own macro, C<CX_LEAVE_SCOPE(cx)>, which
must be called before C<cx_popsub()> etc.

C<dounwind()> now also does a C<cx_popblock()> on the last popped frame
(formerly it only did the C<cx_popsub()> etc. actions on each frame).

=item *

The temps stack is now freed on scope exit; previously, temps created
during the last statement of a block wouldn't be freed until the next
C<nextstate> following the block (apart from an existing hack that did
this for recursive subs in scalar context); and in something like
C<f(g())>, the temps created by the last statement in C<g()> would
formerly not be freed until the statement following the return from
C<f()>.

=item *

Most values that were saved on the savestack on scope entry are now
saved in suitable new fields in the context struct, and saved and
restored directly by C<cx_pushfoo()> and C<cx_popfoo()>, which is much
faster.

=item *

Various context struct fields have been added, removed or modified.

=item *

The handling of C<@_> in C<cx_pushsub()> and C<cx_popsub()> has been
considerably tidied up, including removing the C<argarray> field from the
context struct, and extracting out some common (but rarely used) code into
a separate function, C<clear_defarray()>. Also, useful subsets of
C<cx_popsub()> which had been unrolled in places like C<pp_goto> have been
gathered into the new functions C<cx_popsub_args()> and
C<cx_popsub_common()>.

=item *

C<pp_leavesub> and C<pp_leavesublv> now use the same function as the rest
of the C<pp_leave*>'s to process return args.

=item *

C<CXp_FOR_PAD> and C<CXp_FOR_GV> flags have been added, and
C<CXt_LOOP_FOR> has been split into C<CXt_LOOP_LIST> and C<CXt_LOOP_ARY>.

=item *

Some variables formerly declared by C<dMULTICALL> (but not documented) have
been removed.

=back

=item *

The obscure C<PL_timesbuf> variable, effectively a vestige of Perl 1, has
been removed. It was documented as deprecated in Perl 5.20, with a statement
that it would be removed early in the 5.21.x series; that has now finally
happened.
L<[perl #121351]|https://rt.perl.org/Ticket/Display.html?id=121351>

=item *

An unwarranted assertion in C<Perl_newATTRSUB_x()> has been removed.  If
a stub subroutine definition with a prototype had been seen, then any
subsequent stub (or definition) of the same subroutine with an attribute
caused an assertion failure because of a null pointer.
L<[perl #126845]|https://rt.perl.org/Ticket/Display.html?id=126845>

=item *

C<::> has been replaced by C<__> in C<ExtUtils::ParseXS>, like it's done for
parameters/return values. This is more consistent, and simplifies writing XS
code wrapping C++ classes into a nested Perl namespace (it requires only
a typedef for C<Foo__Bar> rather than two, one for C<Foo_Bar> and the other
for C<Foo::Bar>).

=item *

The C<to_utf8_case()> function is now deprecated.  Instead use
C<toUPPER_utf8>, C<toTITLE_utf8>, C<toLOWER_utf8>, and C<toFOLD_utf8>.
(See L<http://nntp.perl.org/group/perl.perl5.porters/233287>.)

=item *

Perl core code and the threads extension have been annotated so that,
if Perl is configured to use threads, then during compile-time clang (3.6
or later) will warn about suspicious uses of mutexes.
See L<http://clang.llvm.org/docs/ThreadSafetyAnalysis.html> for more
information.

=item *

The C<signbit()> emulation has been enhanced.  This will help older
and/or more exotic platforms or configurations.


=item *

Most EBCDIC-specific code in the core has been unified with non-EBCDIC
code, to avoid repetition and make maintenance easier.

=item *

MSWin32 code for C<$^X> has been moved out of the F<win32> directory to
F<caretx.c>, where other operating systems set that variable.

=item *

C<< sv_ref() >> is now part of the API.

=item *

L<perlapi/sv_backoff> had its return type changed from C<int> to C<void>.  It
had always returned C<0> since Perl 5.000 stable, but that was
undocumented.  Although C<sv_backoff> is marked as public API, XS code is not
expected to be impacted since the proper API call would be through public API
C<sv_setsv(sv, &PL_sv_undef)>, or quasi-public C<SvOOK_off>, or non-public
C<SvOK_off> calls, and the return value of C<sv_backoff> was previously a
meaningless constant that can be rewritten as C<(sv_backoff(sv),0)>.

=item *

The C<EXTEND> and C<MEXTEND> macros have been improved to avoid various issues
with integer truncation and wrapping.  In particular, some casts formerly used
within the macros have been removed.  This means for example that passing an
unsigned C<nitems> argument is likely to raise a compiler warning now
(it's always been documented to require a signed value; formerly int,
lately SSize_t).

=item *

C<PL_sawalias> and C<GPf_ALIASED_SV> have been removed.

=item *

C<GvASSIGN_GENERATION> and C<GvASSIGN_GENERATION_set> have been removed.

=back

=head1 Selected Bug Fixes

=over 4

=item *

It now works properly to specify a user-defined property, such as

 qr/\p{mypkg1::IsMyProperty}/i

with C</i> caseless matching, an explicit package name, and
I<IsMyProperty> not defined at the time of the pattern compilation.

=item *

Perl's C<memcpy()>, C<memmove()>, C<memset()> and C<memcmp()> fallbacks are now
more compatible with the originals.  [perl #127619]

=item *

Fixed an issue where C<< s///r >> with B<< -DPERL_NO_COW >> attempted
to modify the source SV, resulting in the program dying. [perl #127635]

=item *

Fixed an EBCDIC-platform-only case where a pattern could fail to match. This
occurred when matching characters from the set of C1 controls when the
target matched string was in UTF-8.

=item *

Narrow the filename check in F<strict.pm> and F<warnings.pm>. Previously,
it assumed that if the filename (without the F<.pmc?> extension) differed
from the package name, it was a misspelled use statement (i.e. C<use Strict>
instead of C<use strict>). We now check whether there's really a
miscapitalization happening, and not some other issue.

=item *

Turn an assertion into a more user friendly failure when parsing
regexes. [perl #127599]

=item *

Correctly raise an error when trying to compile patterns with
unterminated character classes while there are trailing backslashes.
[perl #126141]

=item *

Line numbers larger than 2**31-1 but less than 2**32 are no longer
returned by C<caller()> as negative numbers.  [perl #126991]

=item *

C<< unless ( I<assignment> ) >> now properly warns when syntax
warnings are enabled.  [perl #127122]

=item *

Setting an C<ISA> glob to an array reference now properly adds
C<isaelem> magic to any existing elements.  Previously modifying such
an element would not update the ISA cache, so method calls would call
the wrong function.  Perl would also crash if the C<ISA> glob was
destroyed, since new code added in 5.23.7 would try to release the
C<isaelem> magic from the elements.  [perl #127351]

=item *

If a here-doc was found while parsing another operator, the parser had
already read end of file, and the here-doc was not terminated, perl
could produce an assertion or a segmentation fault.  This now reliably
complains about the unterminated here-doc.  [perl #125540]

=item *

C<untie()> would sometimes return the last value returned by the C<UNTIE()>
handler as well as its normal value, messing up the stack.  [perl
#126621]

=item *

Fixed an operator precedence problem when C< castflags & 2> is true.
[perl #127474]

=item *

Caching of DESTROY methods could result in a non-pointer or a
non-STASH stored in the C<SvSTASH()> slot of a stash, breaking the B
C<STASH()> method.  The DESTROY method is now cached in the MRO metadata
for the stash.  [perl #126410]

=item *

The AUTOLOAD method is now called when searching for a DESTROY method,
and correctly sets C<$AUTOLOAD> too.  [perl #124387]  [perl #127494]

=item *

Avoid parsing beyond the end of the buffer when processing a C<#line>
directive with no filename.  [perl #127334]

=item *

Perl now raises a warning when a regular expression pattern looks like
it was supposed to contain a POSIX class, like C<qr/[[:alpha:]]/>, but
there was some slight defect in its specification which causes it to
instead be treated as a regular bracketed character class.  An example
would be missing the second colon in the above like this:
C<qr/[[:alpha]]/>.  This compiles to match a sequence of two characters.
The second is C<"]">, and the first is any of: C<"[">, C<":">, C<"a">,
C<"h">, C<"l">, or C<"p">.   This is unlikely to be the intended
meaning, and now a warning is raised.  No warning is raised unless the
specification is very close to one of the 14 legal POSIX classes.  (See
L<perlrecharclass/POSIX Character Classes>.)
[perl #8904]

=item *

Certain regex patterns involving a complemented POSIX class in an
inverted bracketed character class, and matching something else
optionally would improperly fail to match.  An example of one that could
fail is C<qr/_?[^\Wbar]\x{100}/>.  This has been fixed.
[perl #127537]

=item *

Perl 5.22 added support for the C99 hexadecimal floating point notation,
but sometimes misparsed hex floats. This has been fixed.
[perl #127183]

=item *

A regression that allowed undeclared barewords in hash keys to work despite
strictures has been fixed.
L<[perl #126981]|https://rt.perl.org/Ticket/Display.html?id=126981>

=item *

Calls to the placeholder C<&PL_sv_yes> used internally when an C<import()>
or C<unimport()> method isn't found now correctly handle scalar context.
L<[perl #126042]|https://rt.perl.org/Ticket/Display.html?id=126042>

=item *

Report more context when we see an array where we expect to see an
operator and avoid an assertion failure.
L<[perl #123737]|https://rt.perl.org/Ticket/Display.html?id=123737>

=item *

Modifying an array that was previously a package C<@ISA> no longer
causes assertion failures or crashes.
L<[perl #123788]|https://rt.perl.org/Ticket/Display.html?id=123788>

=item *

Retain binary compatibility across plain and DEBUGGING perl builds.
L<[perl #127212]|https://rt.perl.org/Ticket/Display.html?id=127212>

=item *

Avoid leaking memory when setting C<$ENV{foo}> on darwin.
L<[perl #126240]|https://rt.perl.org/Ticket/Display.html?id=126240>

=item *

C</...\G/> no longer crashes on utf8 strings. When C<\G> is a fixed number
of characters from the start of the regex, perl needs to count back that
many characters from the current C<pos()> position and start matching from
there. However, it was counting back bytes rather than characters, which
could lead to panics on utf8 strings.

=item *

In some cases operators that return integers would return negative
integers as large positive integers.
L<[perl #126635]|https://rt.perl.org/Ticket/Display.html?id=126635>

=item *

The C<pipe()> operator would assert for DEBUGGING builds instead of
producing the correct error message.  The condition asserted on is
detected and reported on correctly without the assertions, so the
assertions were removed.
L<[perl #126480]|https://rt.perl.org/Ticket/Display.html?id=126480>

=item *

In some cases, failing to parse a here-doc would attempt to use freed
memory.  This was caused by a pointer not being restored correctly.
L<[perl #126443]|https://rt.perl.org/Ticket/Display.html?id=126443>

=item *

C<< @x = sort { *a = 0; $a <=> $b } 0 .. 1 >> no longer frees the GP
for *a before restoring its SV slot.
L<[perl #124097]|https://rt.perl.org/Ticket/Display.html?id=124097>

=item *

Multiple problems with the new hexadecimal floating point printf
format C<%a> were fixed:
L<[perl #126582]|https://rt.perl.org/Ticket/Display.html?id=126582>,
L<[perl #126586]|https://rt.perl.org/Ticket/Display.html?id=126586>,
L<[perl #126822]|https://rt.perl.org/Ticket/Display.html?id=126822>

=item *

Calling C<mg_set()> in C<leave_scope()> no longer leaks.

=item *

A regression from Perl v5.20 was fixed in which debugging output of regular
expression compilation was wrong.  (The pattern was correctly compiled, but
what got displayed for it was wrong.)

=item *

C<\b{sb}> works much better.  In Perl v5.22.0, this new construct didn't
seem to give the expected results, yet passed all the tests in the
extensive suite furnished by Unicode.  It turns out that it was because
these were short input strings, and the failures had to do with longer
inputs.

=item *

Certain syntax errors in
L<perlrecharclass/Extended Bracketed Character Classes> caused panics
instead of the proper error message.  This has now been fixed. [perl
#126481]

=item *

Perl 5.20 added a message when a quantifier in a regular
expression was useless, but then caused the parser to skip it;
this caused the surplus quantifier to be silently ignored, instead
of throwing an error. This is now fixed. [perl #126253]

=item *

The switch to building non-XS modules last in win32/makefile.mk (introduced
by design as part of the changes to enable parallel building) caused the
build of POSIX to break due to problems with the version module. This
is now fixed.

=item *

Improved parsing of hex float constants.

=item *

Fixed an issue with C<< pack >> where C<< pack "H" >> (and C<< pack "h" >>)
could read past the source when given a non-utf8 source, and a utf8 target.
[perl #126325]

=item *

Fixed several cases where perl would abort due to a segmentation fault,
or a C-level assert. [perl #126615], [perl #126602], [perl #126193].

=item *

There were places in regular expression patterns where comments (C<(?#...)>)
weren't allowed, but should have been.  This is now fixed.
L<[perl #116639]|https://rt.perl.org/Ticket/Display.html?id=116639>

=item *

Some regressions from Perl 5.20 have been fixed, in which some syntax errors in
L<C<(?[...])>|perlrecharclass/Extended Bracketed Character Classes> constructs
within regular expression patterns could cause a segfault instead of a proper
error message.
L<[perl #126180]|https://rt.perl.org/Ticket/Display.html?id=126180>
L<[perl #126404]|https://rt.perl.org/Ticket/Display.html?id=126404>

=item *

Another problem with
L<C<(?[...])>|perlrecharclass/Extended Bracketed Character Classes>
constructs has been fixed wherein things like C<\c]> could cause panics.
L<[perl #126181]|https://rt.perl.org/Ticket/Display.html?id=126181>

=item *

Some problems with attempting to extend the perl stack to around 2G or 4G
entries have been fixed.  This was particularly an issue on 32-bit perls built
to use 64-bit integers, and was easily noticeable with the list repetition
operator, e.g.

    @a = (1) x $big_number

Formerly perl may have crashed, depending on the exact value of C<$big_number>;
now it will typically raise an exception.
L<[perl #125937]|https://rt.perl.org/Ticket/Display.html?id=125937>

=item *

In a regex conditional expression C<(?(condition)yes-pattern|no-pattern)>, if
the condition is C<(?!)> then perl failed the match outright instead of
matching the no-pattern.  This has been fixed.
L<[perl #126222]|https://rt.perl.org/Ticket/Display.html?id=126222>

=item *

The special backtracking control verbs C<(*VERB:ARG)> now all allow an optional
argument and set C<REGERROR>/C<REGMARK> appropriately as well.
L<[perl #126186]|https://rt.perl.org/Ticket/Display.html?id=126186>

=item *

Several bugs, including a segmentation fault, have been fixed with the boundary
checking constructs (introduced in Perl 5.22) C<\b{gcb}>, C<\b{sb}>, C<\b{wb}>,
C<\B{gcb}>, C<\B{sb}>, and C<\B{wb}>.  All the C<\B{}> ones now match an empty
string; none of the C<\b{}> ones do.
L<[perl #126319]|https://rt.perl.org/Ticket/Display.html?id=126319>

=item *

Duplicating a closed file handle for write no longer creates a
filename of the form F<GLOB(0xXXXXXXXX)>.  [perl #125115]

=item *

Warning fatality is now ignored when rewinding the stack.  This
prevents infinite recursion when the now fatal error also causes
rewinding of the stack.  [perl #123398]

=item *

In perl v5.22.0, the logic changed when parsing a numeric parameter to the -C
option, such that the successfully parsed number was not saved as the option
value if it parsed to the end of the argument.  [perl #125381]

=item *

The PadlistNAMES macro is an lvalue again.

=item *

Zero -DPERL_TRACE_OPS memory for sub-threads.

C<perl_clone_using()> was missing Zero init of PL_op_exec_cnt[].  This
caused sub-threads in threaded -DPERL_TRACE_OPS builds to spew exceedingly
large op-counts at destruct.  These counts would print %x as "ABABABAB",
clearly a mem-poison value.

=item *

A leak in the XS typemap caused one scalar to be leaked each time a C<FILE *>
or a C<PerlIO *> was C<OUTPUT:>ed or imported to Perl, since perl 5.000. These
particular typemap entries are thought to be extremely rarely used by XS
modules. [perl #124181]

=item *

C<alarm()> and C<sleep()> will now warn if the argument is a negative number
and return undef. Previously they would pass the negative value to the
underlying C function which may have set up a timer with a surprising value.

=item *

Perl can again be compiled with any Unicode version.  This used to
(mostly) work, but was lost in v5.18 through v5.20.  The property
C<Name_Alias> did not exist prior to Unicode 5.0.  L<Unicode::UCD>
incorrectly said it did.  This has been fixed.

=item *

Very large code-points (beyond Unicode) in regular expressions no
longer cause a buffer overflow in some cases when converted to UTF-8.
L<[perl #125826]|https://rt.perl.org/Ticket/Display.html?id=125826>

=item *

The integer overflow check for the range operator (...) in list
context now correctly handles the case where the size of the range is
larger than the address space.  This could happen on 32-bit builds with
-Duse64bitint.
L<[perl #125781]|https://rt.perl.org/Ticket/Display.html?id=125781>

=item *

A crash with C<< %::=(); J->${\"::"} >> has been fixed.
L<[perl #125541]|https://rt.perl.org/Ticket/Display.html?id=125541>

=item *

C<qr/(?[ () ])/> no longer segfaults, giving a syntax error message instead.
[perl #125805]

=item *

Regular expression possessive quantifier v5.20 regression now fixed.
C<qr/>I<PAT>C<{>I<min>,I<max>C<}+>C</> is supposed to behave identically
to C<qr/(?E<gt>>I<PAT>C<{>I<min>,I<max>C<})/>.  Since v5.20, this didn't
work if I<min> and I<max> were equal.  [perl #125825]

=item *

C<< BEGIN <> >> no longer segfaults and properly produces an error
message.  [perl #125341]

=item *

In C<tr///> an illegal backwards range like C<tr/\x{101}-\x{100}//> was
not always detected, giving incorrect results.  This is now fixed.

=back
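
The C<\G> fix listed above turns on the difference between stepping
back a number of bytes and stepping back a number of characters in a
UTF-8 string. The following is a minimal sketch of that difference, not
the code perl uses: C<demo_utf8_hop_back()> is a hypothetical helper,
and it assumes the input is well-formed UTF-8, where continuation bytes
have the bit pattern C<10xxxxxx>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Step back n characters from byte offset pos in well-formed UTF-8.
 * Returns the new byte offset, or (size_t)-1 if fewer than n
 * characters precede pos. */
size_t demo_utf8_hop_back(const unsigned char *s, size_t pos, size_t n)
{
    assert(s != NULL);
    while (n > 0) {
        if (pos == 0)
            return (size_t)-1;
        pos--;
        /* Skip continuation bytes (10xxxxxx) to reach the start byte. */
        while (pos > 0 && (s[pos] & 0xC0) == 0x80)
            pos--;
        n--;
    }
    return pos;
}
```

Subtracting C<n> from the byte offset instead can land in the middle of
a multi-byte character, which is the kind of position that led to the
panics described.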

=head1 Acknowledgements

Perl 5.24.0 represents approximately 11 months of development since Perl 5.22.0
and contains approximately 360,000 lines of changes across 1,800 files from 75
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 250,000 lines of changes to 1,200 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers. The following people are known to have contributed the
improvements that became Perl 5.24.0:

Aaron Crane, Aaron Priven, Abigail, Achim Gratz, Alexander D'Archangel, Alex
Vandiver, Andreas König, Andy Broad, Andy Dougherty, Aristotle Pagaltzis,
Chase Whitener, Chas. Owens, Chris 'BinGOs' Williams, Craig A. Berry, Dagfinn
Ilmari Mannsåker, Dan Collins, Daniel Dragan, David Golden, David Mitchell,
Doug Bell, Dr.Ruud, Ed Avis, Ed J, Father Chrysostomos, Herbert Breunung,
H.Merijn Brand, Hugo van der Sanden, Ivan Pozdeev, James E Keenan, Jan Dubois,
Jarkko Hietaniemi, Jerry D. Hedden, Jim Cromie, John Peacock, John SJ Anderson,
Karen Etheridge, Karl Williamson, kmx, Leon Timmermans, Ludovic E. R.
Tolhurst-Cleaver, Lukas Mai, Martijn Lievaart, Matthew Horsfall, Mattia Barbon,
Max Maischein, Mohammed El-Afifi, Nicholas Clark, Nicolas R., Niko Tyni, Peter
John Acklam, Peter Martini, Peter Rabbitson, Pip Cet, Rafael Garcia-Suarez,
Reini Urban, Ricardo Signes, Sawyer X, Shlomi Fish, Sisyphus, Stanislaw Pusep,
Steffen Müller, Stevan Little, Steve Hay, Sullivan Beck, Thomas Sibley, Todd
Rinaldo, Tom Hukins, Tony Cook, Unicode Consortium, Victor Adam, Vincent Pit,
Vladimir Timofeev, Yves Orton, Zachary Storer, Zefram.

The list above is almost certainly incomplete as it is automatically generated
from version control history. In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core. We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
https://rt.perl.org/ .  There may also be information at
http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications which make it
inappropriate to send to a publicly archived mailing list, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION>
for details of how to report the issue.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
=head1 NAME

perliol - C API for Perl's implementation of IO in Layers.

=head1 SYNOPSIS

    /* Defining a layer ... */
    #include <perliol.h>

=head1 DESCRIPTION

This document describes the behavior and implementation of the PerlIO
abstraction described in L<perlapio> when C<USE_PERLIO> is defined.

=head2 History and Background

The PerlIO abstraction was introduced in perl5.003_02 but languished as
just an abstraction until perl5.7.0. However during that time a number
of perl extensions switched to using it, so the API is mostly fixed to
maintain (source) compatibility.

The aim of the implementation is to provide the PerlIO API in a flexible
and platform neutral manner. It is also a trial of an "Object Oriented
C, with vtables" approach which may be applied to Perl 6.

=head2 Basic Structure

PerlIO is a stack of layers.

The low levels of the stack work with the low-level operating system
calls (file descriptors in C) getting bytes in and out, the higher
layers of the stack buffer, filter, and otherwise manipulate the I/O,
and return characters (or bytes) to Perl.  Terms I<above> and I<below>
are used to refer to the relative positioning of the stack layers.

A layer contains a "vtable", the table of I/O operations (at C level
a table of function pointers), and status flags.  The functions in the
vtable implement operations like "open", "read", and "write".

When I/O, for example "read", is requested, the request goes from Perl
first down the stack using "read" functions of each layer, then at the
bottom the input is requested from the operating system services, then
the result is returned up the stack, finally being interpreted as Perl
data.

Requests do not always go all the way down to the operating system:
that's where PerlIO buffering comes into play.
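
As a rough illustration of why buffering keeps requests from reaching
the bottom of the stack, here is a minimal sketch. It is not PerlIO
code: C<DemoSource>, C<DemoBuffered> and their functions are
hypothetical names, and the 8-byte buffer is chosen only to keep the
example small. The lower "layer" counts how often it is called, so the
effect of the buffer can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bottom layer: serves bytes from memory, counts calls. */
typedef struct {
    const char *data;
    size_t      len;
    size_t      pos;
    int         calls;
} DemoSource;

size_t demo_source_read(DemoSource *s, char *buf, size_t count)
{
    size_t n;
    assert(s != NULL && buf != NULL);
    s->calls++;
    n = s->len - s->pos;
    if (n > count)
        n = count;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return n;
}

/* Hypothetical buffering layer sitting above the source. */
typedef struct {
    DemoSource *below;
    char        buf[8];
    size_t      ptr;    /* next unread byte in buf */
    size_t      end;    /* number of valid bytes in buf */
} DemoBuffered;

/* Return the next byte, or -1 at end of input.  The layer below is
 * consulted only when the buffer has been used up. */
int demo_getc(DemoBuffered *b)
{
    assert(b != NULL);
    if (b->ptr == b->end) {
        b->end = demo_source_read(b->below, b->buf, sizeof b->buf);
        b->ptr = 0;
        if (b->end == 0)
            return -1;
    }
    return (unsigned char)b->buf[b->ptr++];
}
```

Reading twelve bytes one at a time through C<demo_getc()> calls the
lower layer twice, not twelve times.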

When you do an open() and specify extra PerlIO layers to be deployed,
the layers you specify are "pushed" on top of the already existing
default stack.  One way to see it is that "operating system is
on the left" and "Perl is on the right".

What exact layers are in this default stack depends on a lot of
things: your operating system, Perl version, Perl compile time
configuration, and Perl runtime configuration.  See L<PerlIO>,
L<perlrun/PERLIO>, and L<open> for more information.

binmode() operates similarly to open(): by default the specified
layers are pushed on top of the existing stack.

However, note that even as the specified layers are "pushed on top"
for open() and binmode(), this doesn't mean that the effects are
limited to the "top": PerlIO layers can be very 'active' and inspect
and affect layers also deeper in the stack.  As an example there
is a layer called "raw" which repeatedly "pops" layers until
it reaches the first layer that has declared itself capable of
handling binary data.  The "pushed" layers are processed in left-to-right
order.

sysopen() operates (unsurprisingly) at a lower level in the stack than
open().  For example in Unix or Unix-like systems sysopen() operates
directly at the level of file descriptors: in the terms of PerlIO
layers, it uses only the "unix" layer, which is a rather thin wrapper
on top of the Unix file descriptors.

=head2 Layers vs Disciplines

Initial discussion of the ability to modify IO stream behaviour used
the term "discipline" for the entities which were added. This came (I
believe) from the use of the term in "sfio", which in turn borrowed it
from "line disciplines" on Unix terminals. However, this document (and
the C code) uses the term "layer".

This is, I hope, a natural term given the implementation, and should
avoid connotations that are inherent in earlier uses of "discipline"
for things which are rather different.

=head2 Data Structures

The basic data structure is a PerlIOl:

	typedef struct _PerlIO PerlIOl;
	typedef struct _PerlIO_funcs PerlIO_funcs;
	typedef PerlIOl *PerlIO;

	struct _PerlIO
	{
	 PerlIOl *	next;       /* Lower layer */
	 PerlIO_funcs *	tab;        /* Functions for this layer */
	 U32		flags;      /* Various flags for state */
	};

A C<PerlIOl *> is a pointer to the struct, and the I<application>
level C<PerlIO *> is a pointer to a C<PerlIOl *> - i.e. a pointer
to a pointer to the struct. This allows the application level C<PerlIO *>
to remain constant while the actual C<PerlIOl *> underneath
changes. (Compare perl's C<SV *> which remains constant while its
C<sv_any> field changes as the scalar's type changes.) An IO stream is
then in general represented as a pointer to this linked-list of
"layers".

It should be noted that because of the double indirection in a C<PerlIO *>,
a C<< &(perlio->next) >> "is" a C<PerlIO *>, and so to some degree
at least one layer can use the "standard" API on the next layer down.
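
The double indirection can be sketched with a much reduced model. The
types and functions below (C<DemoLayer>, C<DemoIO>, C<demo_push()> and
so on) are hypothetical stand-ins, not the definitions from
F<perliol.h>. They only show that the handle's address stays the same
while layers come and go, and that the address of a layer's C<next>
field can serve as a handle for the stream below it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature of the layer list; not the perliol.h types. */
typedef struct demo_layer DemoLayer;
typedef DemoLayer *DemoIO;          /* plays the role of PerlIO */

struct demo_layer {
    DemoLayer  *next;               /* lower layer */
    const char *name;               /* stands in for the "tab" pointer */
};

/* Push a new layer on top; the handle f itself does not move. */
int demo_push(DemoIO *f, const char *name)
{
    DemoLayer *l;
    assert(f != NULL);
    l = malloc(sizeof *l);
    if (l == NULL)
        return -1;
    l->name = name;
    l->next = *f;                   /* old top becomes the layer below */
    *f = l;                         /* the slot now points at the new top */
    return 0;
}

/* Pop the top layer, if there is one. */
void demo_pop(DemoIO *f)
{
    DemoLayer *l;
    assert(f != NULL);
    l = *f;
    if (l != NULL) {
        *f = l->next;
        free(l);
    }
}

/* The address of the top layer's next field "is" a handle for the
 * stream made up of the layers below. */
DemoIO *demo_next(DemoIO *f)
{
    assert(f != NULL && *f != NULL);
    return &(*f)->next;
}

/* Count the layers reachable from a handle. */
int demo_depth(DemoIO *f)
{
    int n = 0;
    DemoLayer *l;
    assert(f != NULL);
    for (l = *f; l != NULL; l = l->next)
        n++;
    return n;
}
```

Because C<demo_depth()> takes the same handle type that C<demo_next()>
returns, it works on the whole stream and on the part below the top
layer alike.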

A "layer" is composed of two parts:

=over 4

=item 1.

The functions and attributes of the "layer class".

=item 2.

The per-instance data for a particular handle.

=back

=head2 Functions and Attributes

The functions and attributes are accessed via the "tab" (for table)
member of C<PerlIOl>. The functions (methods of the layer "class") are
fixed, and are defined by the C<PerlIO_funcs> type. They are broadly the
same as the public C<PerlIO_xxxxx> functions:

 struct _PerlIO_funcs
 {
  Size_t     fsize;
  char *     name;
  Size_t     size;
  IV         kind;
  IV         (*Pushed)(pTHX_ PerlIO *f,
                             const char *mode,
                             SV *arg,
                             PerlIO_funcs *tab);
  IV         (*Popped)(pTHX_ PerlIO *f);
  PerlIO *   (*Open)(pTHX_ PerlIO_funcs *tab,
                           PerlIO_list_t *layers, IV n,
                           const char *mode,
                           int fd, int imode, int perm,
                           PerlIO *old,
                           int narg, SV **args);
  IV         (*Binmode)(pTHX_ PerlIO *f);
  SV *       (*Getarg)(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags);
  IV         (*Fileno)(pTHX_ PerlIO *f);
  PerlIO *   (*Dup)(pTHX_ PerlIO *f,
                          PerlIO *o,
                          CLONE_PARAMS *param,
                          int flags);
  /* Unix-like functions - cf sfio line disciplines */
  SSize_t    (*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);
  SSize_t    (*Unread)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
  SSize_t    (*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);
  IV         (*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);
  Off_t      (*Tell)(pTHX_ PerlIO *f);
  IV         (*Close)(pTHX_ PerlIO *f);
  /* Stdio-like buffered IO functions */
  IV         (*Flush)(pTHX_ PerlIO *f);
  IV         (*Fill)(pTHX_ PerlIO *f);
  IV         (*Eof)(pTHX_ PerlIO *f);
  IV         (*Error)(pTHX_ PerlIO *f);
  void       (*Clearerr)(pTHX_ PerlIO *f);
  void       (*Setlinebuf)(pTHX_ PerlIO *f);
  /* Perl's snooping functions */
  STDCHAR *  (*Get_base)(pTHX_ PerlIO *f);
  Size_t     (*Get_bufsiz)(pTHX_ PerlIO *f);
  STDCHAR *  (*Get_ptr)(pTHX_ PerlIO *f);
  SSize_t    (*Get_cnt)(pTHX_ PerlIO *f);
  void       (*Set_ptrcnt)(pTHX_ PerlIO *f,STDCHAR *ptr,SSize_t cnt);
 };

The first few members of the struct give a function table size for
compatibility checking, a "name" for the layer, the size to C<malloc>
for the per-instance data, and some flags which are attributes of the
class as a whole (such as whether it is a buffering layer). Then follow
the functions, which fall into four basic groups:

=over 4

=item 1.

Opening and setup functions

=item 2.

Basic IO operations

=item 3.

Stdio class buffering options.

=item 4.

Functions to support Perl's traditional "fast" access to the buffer.

=back

A layer does not have to implement all the functions, but the whole
table has to be present. Unimplemented slots can be NULL (which will
result in an error when called) or can be filled in with stubs to
"inherit" behaviour from a "base class". This "inheritance" is fixed
for all instances of the layer, but as the layer chooses which stubs
to populate the table, limited "multiple inheritance" is possible.
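
The following sketch shows the two ways of handling a slot that a layer
does not implement itself. It uses a hypothetical two-slot table
(C<DemoFuncs>), not the real C<PerlIO_funcs>, and the names
C<demo_base_write()> and C<demo_sink_write()> are invented for the
example. A NULL slot is reported as an error by the dispatcher, while a
stub slot "inherits" by handing the call to the layer below.

```c
#include <assert.h>
#include <stddef.h>

typedef struct demo_layer DemoLayer;

/* Hypothetical two-slot table; the real PerlIO_funcs has many more. */
typedef struct {
    const char *name;
    long (*Read)(DemoLayer *l, char *buf, long count);
    long (*Write)(DemoLayer *l, const char *buf, long count);
} DemoFuncs;

struct demo_layer {
    DemoLayer       *next;      /* lower layer */
    const DemoFuncs *tab;       /* functions for this layer */
    long             written;   /* bytes accepted by this layer */
};

/* "Base class" stub: pass the write to the layer below. */
long demo_base_write(DemoLayer *l, const char *buf, long count)
{
    assert(l != NULL);
    if (l->next == NULL || l->next->tab->Write == NULL)
        return -1;
    return l->next->tab->Write(l->next, buf, count);
}

/* A bottom layer's own write: just count the bytes. */
long demo_sink_write(DemoLayer *l, const char *buf, long count)
{
    assert(l != NULL);
    (void)buf;
    l->written += count;
    return count;
}

/* Dispatchers: a NULL slot yields an error instead of a call. */
long demo_write(DemoLayer *l, const char *buf, long count)
{
    assert(l != NULL && l->tab != NULL);
    if (l->tab->Write == NULL)
        return -1;
    return l->tab->Write(l, buf, count);
}

long demo_read(DemoLayer *l, char *buf, long count)
{
    assert(l != NULL && l->tab != NULL);
    if (l->tab->Read == NULL)
        return -1;
    return l->tab->Read(l, buf, count);
}

/* Both tables are complete; slots not implemented are NULL or a stub. */
const DemoFuncs demo_sink   = { "sink",   NULL, demo_sink_write };
const DemoFuncs demo_filter = { "filter", NULL, demo_base_write };
```

The choice of stub is made once, when the table is written, which is
why the "inheritance" is the same for every instance of the layer.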

=head2 Per-instance Data

The per-instance data are held in memory beyond the basic PerlIOl
struct, by making a PerlIOl the first member of the layer's struct
thus:

	typedef struct
	{
	 struct _PerlIO base;       /* Base "class" info */
	 STDCHAR *	buf;        /* Start of buffer */
	 STDCHAR *	end;        /* End of valid part of buffer */
	 STDCHAR *	ptr;        /* Current position in buffer */
	 Off_t		posn;       /* Offset of buf into the file */
	 Size_t		bufsiz;     /* Real size of buffer */
	 IV		oneword;    /* Emergency buffer */
	} PerlIOBuf;

In this way (as for perl's scalars) a pointer to a PerlIOBuf can be
treated as a pointer to a PerlIOl.
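
A minimal sketch of the "base struct as first member" idiom follows. It
uses hypothetical types C<DemoBase> and C<DemoBuf> in place of
C<PerlIOl> and C<PerlIOBuf>. The allocator is given only a size, much
as the C<size> member of the function table would supply it, and the
layer-specific code converts the base pointer back to its own type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for PerlIOl. */
typedef struct demo_base {
    struct demo_base *next;
    unsigned          flags;
} DemoBase;

/* Hypothetical stand-in for PerlIOBuf: the base must come first. */
typedef struct {
    DemoBase base;      /* Base "class" info */
    char    *buf;       /* Start of buffer */
    size_t   bufsiz;    /* Real size of buffer */
} DemoBuf;

/* Generic code allocates by size alone and sees only the base part.
 * The memory is zero-filled; the caller frees it with free(). */
DemoBase *demo_alloc(size_t size)
{
    assert(size >= sizeof(DemoBase));
    return calloc(1, size);
}

/* Layer-specific code recovers its own struct from the base pointer.
 * This is valid only for instances allocated with sizeof(DemoBuf). */
DemoBuf *demo_as_buf(DemoBase *l)
{
    assert(l != NULL);
    return (DemoBuf *)l;
}
```

The conversion is safe because C guarantees that a pointer to a struct
and a pointer to its first member refer to the same address.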

=head2 Layers in action.

                table           perlio          unix
            |           |
            +-----------+    +----------+    +--------+
   PerlIO ->|           |--->|  next    |--->|  NULL  |
            +-----------+    +----------+    +--------+
            |           |    |  buffer  |    |   fd   |
            +-----------+    |          |    +--------+
            |           |    +----------+


The above attempts to show how the layer scheme works in a simple case.
The application's C<PerlIO *> points to an entry in the table(s)
representing open (allocated) handles. For example the first three slots
in the table correspond to C<stdin>, C<stdout> and C<stderr>. The table
in turn points to the current "top" layer for the handle - in this case
an instance of the generic buffering layer "perlio". That layer in turn
points to the next layer down - in this case the low-level "unix" layer.

The above is roughly equivalent to a "stdio" buffered stream, but with
much more flexibility:

=over 4

=item *

If Unix level C<read>/C<write>/C<lseek> is not appropriate for (say)
sockets then the "unix" layer can be replaced (at open time or even
dynamically) with a "socket" layer.

=item *

Different handles can have different buffering schemes. The "top"
layer could be the "mmap" layer if reading disk files was quicker
using C<mmap> than C<read>. An "unbuffered" stream can be implemented
simply by not having a buffer layer.

=item *

Extra layers can be inserted to process the data as it flows through.
This was the driving need for including the scheme in perl 5.7.0+ - we
needed a mechanism to allow data to be translated between perl's
internal encoding (conceptually at least Unicode as UTF-8), and the
"native" format used by the system. This is provided by the
":encoding(xxxx)" layer which typically sits above the buffering layer.

=item *

A layer can be added that does "\n" to CRLF translation. This layer
can be used on any platform, not just those that normally do such
things.

=back
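
The pointer chain in the diagram (handle, table slot, top layer, next layer down) can be modelled with a few lines of C. The sketch below is an assumption-laden toy: C<ChainLayer>, C<ChainHandle> and both functions are hypothetical names, not part of the PerlIO API.

```c
#include <stddef.h>

typedef struct ChainLayer {
    struct ChainLayer *next;  /* next layer down, NULL at the bottom */
    const char        *name;  /* e.g. "perlio" */
    int                fd;    /* only meaningful in the bottom layer */
} ChainLayer;

/* A table slot holds a pointer to the current top layer; the
 * application's handle is a pointer to that slot. */
typedef ChainLayer *ChainHandle;

/* Ask each layer for the one below until the bottom is reached and
 * return its descriptor; -1 if the handle or the slot is empty. */
int chain_fileno(const ChainHandle *f)
{
    const ChainLayer *l;
    if (f == NULL || *f == NULL)
        return -1;
    for (l = *f; l->next != NULL; l = l->next)
        ;
    return l->fd;
}

/* Count the layers currently stacked on a handle. */
int chain_depth(const ChainHandle *f)
{
    int n = 0;
    const ChainLayer *l;
    if (f == NULL)
        return 0;
    for (l = *f; l != NULL; l = l->next)
        n++;
    return n;
}
```

Replacing or inserting a layer in this model only means relinking C<next> pointers or changing what the slot points at, which is why the application's handle stays valid while the stack changes.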

=head2 Per-instance flag bits

The generic flag bits are a hybrid of C<O_XXXXX> style flags deduced
from the mode string passed to C<PerlIO_open()>, and state bits for
typical buffer layers.

=over 4

=item PERLIO_F_EOF

End of file.

=item PERLIO_F_CANWRITE

Writes are permitted, i.e. opened as "w" or "r+" or "a", etc.

=item PERLIO_F_CANREAD

Reads are permitted, i.e. opened as "r" or "w+" (or even "a+" - ick).

=item PERLIO_F_ERROR

An error has occurred (for C<PerlIO_error()>).

=item PERLIO_F_TRUNCATE

Truncate file suggested by open mode.

=item PERLIO_F_APPEND

All writes should be appends.

=item PERLIO_F_CRLF

Layer is performing Win32-like "\n" mapped to CR,LF for output and CR,LF
mapped to "\n" for input. Normally the provided "crlf" layer is the only
layer that need bother about this. C<PerlIO_binmode()> will mess with this
flag rather than add/remove layers if the C<PERLIO_K_CANCRLF> bit is set
for the layer's class.

=item PERLIO_F_UTF8

Data written to this layer should be UTF-8 encoded; data provided
by this layer should be considered UTF-8 encoded. Can be set on any layer
by ":utf8" dummy layer. Also set on ":encoding" layer.

=item PERLIO_F_UNBUF

Layer is unbuffered - i.e. write to next layer down should occur for
each write to this layer.

=item PERLIO_F_WRBUF

The buffer for this layer currently holds data written to it but not sent
to next layer.

=item PERLIO_F_RDBUF

The buffer for this layer currently holds unconsumed data read from
layer below.

=item PERLIO_F_LINEBUF

Layer is line buffered. Write data should be passed to next layer down
whenever a "\n" is seen. Any data beyond the "\n" should then be
processed.

=item PERLIO_F_TEMP

File has been C<unlink()>ed, or should be deleted on C<close()>.

=item PERLIO_F_OPEN

Handle is open.

=item PERLIO_F_FASTGETS

This instance of this layer supports the "fast C<gets>" interface.
Normally set based on C<PERLIO_K_FASTGETS> for the class and by the
existence of the function(s) in the table. However a class that
normally provides that interface may need to avoid it on a
particular instance. The "pending" layer needs to do this when
it is pushed above a layer which does not support the interface.
(Perl's C<sv_gets()> does not expect the stream's fast C<gets> behaviour
to change during one "get".)

=back
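
As a rough illustration of how open-mode flags can be deduced from a mode string, here is a self-contained sketch. The C<TOY_F_*> values and C<toy_mode_flags()> are hypothetical; the real C<PERLIO_F_*> constants are defined in F<perliol.h>, and the real conversion also handles details this toy ignores.

```c
/* Hypothetical bit values, chosen only for this example. */
enum {
    TOY_F_CANWRITE = 0x01,
    TOY_F_CANREAD  = 0x02,
    TOY_F_TRUNCATE = 0x04,
    TOY_F_APPEND   = 0x08
};

/* Derive open-mode flag bits from an fopen()-like mode string.
 * Returns -1 if mode is NULL or does not start with 'r', 'w' or 'a'. */
int toy_mode_flags(const char *mode)
{
    int flags;
    if (mode == 0)
        return -1;
    switch (*mode++) {
    case 'r': flags = TOY_F_CANREAD;                   break;
    case 'w': flags = TOY_F_CANWRITE | TOY_F_TRUNCATE; break;
    case 'a': flags = TOY_F_CANWRITE | TOY_F_APPEND;   break;
    default:  return -1;
    }
    /* A '+' in the suffix permits both reading and writing. */
    for (; *mode != '\0'; mode++) {
        if (*mode == '+')
            flags |= TOY_F_CANREAD | TOY_F_CANWRITE;
    }
    return flags;
}
```

With these assumptions "r+" yields both the read and write bits, while "a+" additionally keeps the append bit.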

=head2 Methods in Detail

=over 4

=item fsize

	Size_t fsize;

Size of the function table. This is compared against the value PerlIO
code "knows" as a compatibility check. Future versions I<may> be able
to tolerate layers compiled against an old version of the headers.

=item name

	char * name;

The name of the layer whose open() method Perl should invoke on
open().  For example if the layer is called APR, you will call:

  open $fh, ">:APR", ...

and Perl knows that it has to invoke the PerlIOAPR_open() method
implemented by the APR layer.

=item size

	Size_t size;

The size of the per-instance data structure, e.g.:

  sizeof(PerlIOAPR)

If this field is zero then C<PerlIO_pushed> does not malloc anything
and assumes the layer's Pushed function will do any required layer stack
manipulation - used to avoid malloc/free overhead for dummy layers.
If the field is non-zero it must be at least the size of C<PerlIOl>;
C<PerlIO_pushed> will then allocate memory for the layer's data structures
and link the new layer onto the stream's stack. (If the layer's Pushed
method returns an error indication the layer is popped again.)

=item kind

	IV kind;

=over 4

=item * PERLIO_K_BUFFERED

The layer is buffered.

=item * PERLIO_K_RAW

The layer is acceptable to have in a binmode(FH) stack - i.e. it does not
(or will configure itself not to) transform bytes passing through it.

=item * PERLIO_K_CANCRLF

Layer can translate between "\n" and CRLF line ends.

=item * PERLIO_K_FASTGETS

Layer allows buffer snooping.

=item * PERLIO_K_MULTIARG

Used when the layer's open() accepts more arguments than usual. The
extra arguments should not come before the C<MODE> argument. When this
flag is used it's up to the layer to validate the args.

=back

=item Pushed

	IV	(*Pushed)(pTHX_ PerlIO *f, const char *mode, SV *arg);

The only absolutely mandatory method. Called when the layer is pushed
onto the stack.  The C<mode> argument may be NULL if this occurs
post-open. The C<arg> will be non-C<NULL> if an argument string was
passed. In most cases this should call C<PerlIOBase_pushed()> to
convert C<mode> into the appropriate C<PERLIO_F_XXXXX> flags in
addition to any actions the layer itself takes.  If a layer is not
expecting an argument it need neither save the one passed to it, nor
provide C<Getarg()> (it could perhaps C<Perl_warn> that the argument
was un-expected).

Returns 0 on success. On failure returns -1 and should set errno.

=item Popped

	IV	(*Popped)(pTHX_ PerlIO *f);

Called when the layer is popped from the stack. A layer will normally
be popped after C<Close()> is called. But a layer can be popped
without being closed if the program is dynamically managing layers on
the stream. In such cases C<Popped()> should free any resources
(buffers, translation tables, ...) not held directly in the layer's
struct.  It should also C<Unread()> any unconsumed data that has been
read and buffered from the layer below back to that layer, so that it
can be re-provided to whatever is now above.

Returns 0 on both success and failure.  If C<Popped()> returns I<true> then
I<perlio.c> assumes that either the layer has popped itself, or the
layer is super special and needs to be retained for other reasons.
In most cases it should return I<false>.

=item Open

	PerlIO *	(*Open)(...);

The C<Open()> method has lots of arguments because it combines the
functions of perl's C<open>, C<PerlIO_open>, perl's C<sysopen>,
C<PerlIO_fdopen> and C<PerlIO_reopen>.  The full prototype is as
follows:

 PerlIO *	(*Open)(pTHX_ PerlIO_funcs *tab,
			PerlIO_list_t *layers, IV n,
			const char *mode,
			int fd, int imode, int perm,
			PerlIO *old,
			int narg, SV **args);

Open should (perhaps indirectly) call C<PerlIO_allocate()> to allocate
a slot in the table and associate it with the layers information for
the opened file, by calling C<PerlIO_push>.  The I<layers> is an
array of all the layers destined for the C<PerlIO *>, and any
arguments passed to them, I<n> is the index into that array of the
layer being called. The macro C<PerlIOArg> will return a (possibly
C<NULL>) SV * for the argument passed to the layer.

The I<mode> string is an "C<fopen()>-like" string which would match
the regular expression C</^[I#]?[rwa]\+?[bt]?$/>.

The C<'I'> prefix is used during creation of C<stdin>..C<stderr> via
special C<PerlIO_fdopen> calls; the C<'#'> prefix means that this is
C<sysopen> and that I<imode> and I<perm> should be passed to
C<PerlLIO_open3>; C<'r'> means B<r>ead, C<'w'> means B<w>rite and
C<'a'> means B<a>ppend. The C<'+'> suffix means that both reading and
writing/appending are permitted.  The C<'b'> suffix means file should
be binary, and C<'t'> means it is text. (Almost all layers should do
the IO in binary mode, and ignore the b/t bits. The C<:crlf> layer
should be pushed to handle the distinction.)

If I<old> is not C<NULL> then this is a C<PerlIO_reopen>. Perl itself
does not use this (yet?) and semantics are a little vague.

If I<fd> is not negative then it is the numeric file descriptor I<fd>,
which will be open in a manner compatible with the supplied mode
string; the call is thus equivalent to C<PerlIO_fdopen>. In this case
I<narg> will be zero.

If I<narg> is greater than zero then it gives the number of arguments
passed to C<open>, otherwise it will be 1 if for example
C<PerlIO_open> was called.  In simple cases C<SvPV_nolen(*args)> is the
pathname to open.

If a layer provides C<Open()> it should normally call the C<Open()>
method of next layer down (if any) and then push itself on top if that
succeeds.  C<PerlIOBase_open> is provided to do exactly that, so in
most cases you don't have to write your own C<Open()> method.  If this
method is not defined, other layers may have difficulty pushing
themselves on top of it during open.

If C<PerlIO_push> was performed and the open then fails, the layer must
C<PerlIO_pop> itself; otherwise it stays on the stack and may cause
serious problems.

Returns C<NULL> on failure.

=item Binmode

	IV        (*Binmode)(pTHX_ PerlIO *f);

Optional. Used when C<:raw> layer is pushed (explicitly or as a result
of binmode(FH)). If not present layer will be popped. If present
should configure layer as binary (or pop itself) and return 0.
If it returns -1 for error C<binmode> will fail with layer
still on the stack.

=item Getarg

	SV *      (*Getarg)(pTHX_ PerlIO *f,
			    CLONE_PARAMS *param, int flags);

Optional. If present should return an SV * representing the string
argument passed to the layer when it was
pushed. e.g. ":encoding(ascii)" would return an SvPV with value
"ascii". (I<param> and I<flags> arguments can be ignored in most
cases)

C<Dup> uses C<Getarg> to retrieve the argument originally passed to
C<Pushed>, so you must implement this function if your layer has an
extra argument to C<Pushed> and will ever be C<Dup>ed.

=item Fileno

	IV        (*Fileno)(pTHX_ PerlIO *f);

Returns the Unix/Posix numeric file descriptor for the handle. Normally
C<PerlIOBase_fileno()> (which just asks next layer down) will suffice
for this.

Returns -1 on error, which is considered to include the case where the
layer cannot provide such a file descriptor.

=item Dup

	PerlIO * (*Dup)(pTHX_ PerlIO *f, PerlIO *o,
			CLONE_PARAMS *param, int flags);

XXX: Needs more docs.

Used as part of the "clone" process when a thread is spawned (in which
case param will be non-NULL) and when a stream is being duplicated via
'&' in the C<open>.

Similar to C<Open>, returns PerlIO* on success, C<NULL> on failure.

=item Read

	SSize_t	(*Read)(pTHX_ PerlIO *f, void *vbuf, Size_t count);

Basic read operation.

Typically will call C<Fill> and manipulate pointers (possibly via the
API).  C<PerlIOBuf_read()> may be suitable for derived classes which
provide "fast gets" methods.

Returns actual bytes read, or -1 on an error.

=item Unread

	SSize_t	(*Unread)(pTHX_ PerlIO *f,
			  const void *vbuf, Size_t count);

A superset of stdio's C<ungetc()>. Should arrange for future reads to
see the bytes in C<vbuf>. If there is no obviously better implementation
then C<PerlIOBase_unread()> provides the function by pushing a "fake"
"pending" layer above the calling layer.

Returns the number of unread chars.

=item Write

	SSize_t	(*Write)(pTHX_ PerlIO *f, const void *vbuf, Size_t count);

Basic write operation.

Returns bytes written or -1 on an error.

=item Seek

	IV	(*Seek)(pTHX_ PerlIO *f, Off_t offset, int whence);

Position the file pointer. Should normally call its own C<Flush>
method and then the C<Seek> method of next layer down.

Returns 0 on success, -1 on failure.

=item Tell

	Off_t	(*Tell)(pTHX_ PerlIO *f);

Return the file pointer. May be based on the layer's cached concept of
position to avoid overhead.

Returns -1 on failure to get the file pointer.

=item Close

	IV	(*Close)(pTHX_ PerlIO *f);

Close the stream. Should normally call C<PerlIOBase_close()> to flush
itself and close layers below, and then deallocate any data structures
(buffers, translation tables, ...) not held directly in the data
structure.

Returns 0 on success, -1 on failure.

=item Flush

	IV	(*Flush)(pTHX_ PerlIO *f);

Should make stream's state consistent with layers below. That is, any
buffered write data should be written, and file position of lower layers
adjusted for data read from below but not actually consumed.
(Should perhaps C<Unread()> such data to the lower layer.)

Returns 0 on success, -1 on failure.

=item Fill

	IV	(*Fill)(pTHX_ PerlIO *f);

The buffer for this layer should be filled (for read) from layer
below.  When you "subclass" PerlIOBuf layer, you want to use its
I<_read> method and to supply your own fill method, which fills the
PerlIOBuf's buffer.

Returns 0 on success, -1 on failure.

=item Eof

	IV	(*Eof)(pTHX_ PerlIO *f);

Return end-of-file indicator. C<PerlIOBase_eof()> is normally sufficient.

Returns 1 on end-of-file, 0 if not end-of-file, -1 on error.

=item Error

	IV	(*Error)(pTHX_ PerlIO *f);

Return error indicator. C<PerlIOBase_error()> is normally sufficient.

Returns 1 if there is an error (usually when C<PERLIO_F_ERROR> is set),
0 otherwise.

=item Clearerr

	void	(*Clearerr)(pTHX_ PerlIO *f);

Clear end-of-file and error indicators. Should call C<PerlIOBase_clearerr()>
to set the C<PERLIO_F_XXXXX> flags, which may suffice.

=item Setlinebuf

	void	(*Setlinebuf)(pTHX_ PerlIO *f);

Mark the stream as line buffered. C<PerlIOBase_setlinebuf()> sets the
PERLIO_F_LINEBUF flag and is normally sufficient.

=item Get_base

	STDCHAR *	(*Get_base)(pTHX_ PerlIO *f);

Allocate (if not already done so) the read buffer for this layer and
return pointer to it. Return NULL on failure.

=item Get_bufsiz

	Size_t	(*Get_bufsiz)(pTHX_ PerlIO *f);

Return the number of bytes that last C<Fill()> put in the buffer.

=item Get_ptr

	STDCHAR *	(*Get_ptr)(pTHX_ PerlIO *f);

Return the current read pointer relative to this layer's buffer.

=item Get_cnt

	SSize_t	(*Get_cnt)(pTHX_ PerlIO *f);

Return the number of bytes left to be read in the current buffer.

=item Set_ptrcnt

	void	(*Set_ptrcnt)(pTHX_ PerlIO *f,
			      STDCHAR *ptr, SSize_t cnt);

Adjust the read pointer and count of bytes to match C<ptr> and/or C<cnt>.
The application (or layer above) must ensure they are consistent.
(Checking is allowed by the paranoid.)

=back
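
To make the relationship between C<Get_ptr>, C<Get_cnt>, C<Set_ptrcnt> and C<Read> more concrete, the following sketch implements a read purely in terms of the three buffer accessors. Everything named C<snoop_*> or C<SnoopBuf> is hypothetical, the counts use C<size_t> rather than C<SSize_t> for brevity, and a real C<Read> would also call C<Fill> when the buffer runs dry.

```c
#include <stddef.h>
#include <string.h>

typedef struct SnoopBuf {
    char  *ptr;   /* current read position in the buffer */
    size_t cnt;   /* bytes left to read starting at ptr */
} SnoopBuf;

char  *snoop_get_ptr(SnoopBuf *b) { return b->ptr; }
size_t snoop_get_cnt(SnoopBuf *b) { return b->cnt; }

/* The caller must pass a ptr and cnt that are consistent. */
void snoop_set_ptrcnt(SnoopBuf *b, char *ptr, size_t cnt)
{
    b->ptr = ptr;
    b->cnt = cnt;
}

/* Copy up to count buffered bytes into out, then advance the read
 * pointer and shrink the count.  Returns the number of bytes copied. */
size_t snoop_read(SnoopBuf *b, void *out, size_t count)
{
    size_t avail = snoop_get_cnt(b);
    char  *p     = snoop_get_ptr(b);
    size_t take  = count < avail ? count : avail;

    memcpy(out, p, take);
    snoop_set_ptrcnt(b, p + take, avail - take);
    return take;
}
```

The same three accessors are what let a caller outside the layer scan the buffer directly, for example to look for a line ending, and then report how much it consumed.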

=head2 Utilities

To ask for the next layer down use PerlIONext(PerlIO *f).

To check that a PerlIO* is valid use PerlIOValid(PerlIO *f).  (All
this really does is check that the pointer is non-NULL and
that the pointer behind that is non-NULL.)

PerlIOBase(PerlIO *f) returns the "Base" pointer, or in other words,
the C<PerlIOl*> pointer.

PerlIOSelf(PerlIO* f, type) returns the PerlIOBase cast to a type.

Perl_PerlIO_or_Base(PerlIO* f, callback, base, failure, args) either
calls the I<callback> from the functions of the layer I<f> (just by
the name of the IO function, like "Read") with the I<args>, or if
there is no such callback, calls the I<base> version of the callback
with the same args, or if I<f> is invalid, sets errno to EBADF and
returns I<failure>.

Perl_PerlIO_or_fail(PerlIO* f, callback, failure, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, sets errno to EINVAL.  Or if I<f> is
invalid, sets errno to EBADF and returns I<failure>.

Perl_PerlIO_or_Base_void(PerlIO* f, callback, base, args) either calls
the I<callback> of the functions of the layer I<f> with the I<args>,
or if there is no such callback, calls the I<base> version of the
callback with the same args, or if I<f> is invalid, sets errno to
EBADF.

Perl_PerlIO_or_fail_void(PerlIO* f, callback, args) either calls the
I<callback> of the functions of the layer I<f> with the I<args>, or if
there is no such callback, sets errno to EINVAL.  Or if I<f> is
invalid, sets errno to EBADF.
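
The dispatch pattern behind the "or_Base" and "or_fail" helpers can be sketched as ordinary functions. This is a simplified model, not the real macros: C<DispLayer> and the C<disp_*> functions are hypothetical, only a single "Error"-like slot is modelled, and the sketch assumes that the "or_fail" variant returns the failure value when the callback is missing.

```c
#include <errno.h>
#include <stddef.h>

typedef struct DispLayer DispLayer;
typedef int (*DispErrorFn)(DispLayer *l);

struct DispLayer {
    DispErrorFn error;    /* NULL if the layer leaves the slot empty */
    int         has_err;  /* toy state consulted by the base version */
};

/* The "base" implementation used when the slot is NULL. */
int disp_base_error(DispLayer *l)
{
    return l->has_err != 0;
}

/* or_Base pattern: use the layer's own method if present, otherwise
 * the base version; an invalid handle sets errno to EBADF. */
int disp_error_or_base(DispLayer **f, int failure)
{
    if (f == NULL || *f == NULL) {
        errno = EBADF;
        return failure;
    }
    if ((*f)->error != NULL)
        return (*f)->error(*f);
    return disp_base_error(*f);
}

/* or_fail pattern: a missing method sets errno to EINVAL instead of
 * falling back to a base version. */
int disp_error_or_fail(DispLayer **f, int failure)
{
    if (f == NULL || *f == NULL) {
        errno = EBADF;
        return failure;
    }
    if ((*f)->error != NULL)
        return (*f)->error(*f);
    errno = EINVAL;
    return failure;
}
```

The validity check mirrors the description of PerlIOValid: both the handle pointer and the pointer behind it must be non-NULL before any slot is consulted.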

=head2 Implementing PerlIO Layers

If you find the implementation document unclear or not sufficient,
look at the existing PerlIO layer implementations, which include:

=over

=item * C implementations

The F<perlio.c> and F<perliol.h> in the Perl core implement the
"unix", "perlio", "stdio", "crlf", "utf8", "byte", "raw", "pending"
layers, and also the "mmap" and "win32" layers if applicable.
(The "win32" layer is currently unfinished and unused; to see what is used
instead on Win32, see L<PerlIO/"Querying the layers of filehandles">.)

PerlIO::encoding, PerlIO::scalar, PerlIO::via in the Perl core.

PerlIO::gzip and APR::PerlIO (mod_perl 2.0) on CPAN.

=item * Perl implementations

PerlIO::via::QuotedPrint in the Perl core and PerlIO::via::* on CPAN.

=back

If you are creating a PerlIO layer, you may want to be lazy, in other
words, implement only the methods that interest you.  The other methods
you can either replace with the "blank" methods

    PerlIOBase_noop_ok
    PerlIOBase_noop_fail

(which do nothing, and return zero and -1, respectively) or for
certain methods you may assume a default behaviour by using a NULL
method.  The Open method looks for help in the 'parent' layer.
The following table summarizes the behaviour:

    method      behaviour with NULL

    Clearerr    PerlIOBase_clearerr
    Close       PerlIOBase_close
    Dup         PerlIOBase_dup
    Eof         PerlIOBase_eof
    Error       PerlIOBase_error
    Fileno      PerlIOBase_fileno
    Fill        FAILURE
    Flush       SUCCESS
    Getarg      SUCCESS
    Get_base    FAILURE
    Get_bufsiz  FAILURE
    Get_cnt     FAILURE
    Get_ptr     FAILURE
    Open        INHERITED
    Popped      SUCCESS
    Pushed      SUCCESS
    Read        PerlIOBase_read
    Seek        FAILURE
    Set_cnt     FAILURE
    Set_ptrcnt  FAILURE
    Setlinebuf  PerlIOBase_setlinebuf
    Tell        FAILURE
    Unread      PerlIOBase_unread
    Write       FAILURE

 FAILURE        Set errno (to EINVAL in Unixish, to LIB$_INVARG in VMS)
                and return -1 (for numeric return values) or NULL (for
                pointers)
 INHERITED      Inherited from the layer below
 SUCCESS        Return 0 (for numeric return values) or a pointer

=head2 Core Layers

The file C<perlio.c> provides the following layers:

=over 4

=item "unix"

A basic non-buffered layer which calls Unix/POSIX C<read()>, C<write()>,
C<lseek()>, C<close()>. No buffering. Even on platforms that distinguish
between O_TEXT and O_BINARY this layer is always O_BINARY.

=item "perlio"

A very complete generic buffering layer which provides the whole of
PerlIO API. It is also intended to be used as a "base class" for other
layers. (For example its C<Read()> method is implemented in terms of
the C<Get_cnt()>/C<Get_ptr()>/C<Set_ptrcnt()> methods).

"perlio" over "unix" provides a complete replacement for stdio as seen
via PerlIO API. This is the default for USE_PERLIO on systems whose stdio
does not permit perl's "fast gets" access and which do not
distinguish between C<O_TEXT> and C<O_BINARY>.

=item "stdio"

A layer which provides the PerlIO API via the layer scheme, but
implements it by calling the system's stdio. This is (currently) the default
on systems whose stdio provides sufficient access to allow perl's "fast gets"
access and which do not distinguish between C<O_TEXT> and C<O_BINARY>.

=item "crlf"

A layer derived using "perlio" as a base class. It provides Win32-like
"\n" to CR,LF translation. Can either be applied above "perlio" or serve
as the buffer layer itself. "crlf" over "unix" is the default if system
distinguishes between C<O_TEXT> and C<O_BINARY> opens. (At some point
"unix" will be replaced by a "native" Win32 IO layer on that platform,
as Win32's read/write layer has various drawbacks.) The "crlf" layer is
a reasonable model for a layer which transforms data in some way.

=item "mmap"

If Configure detects C<mmap()> functions this layer is provided (with
"perlio" as a "base") which does "read" operations by mmap()ing the
file. Performance improvement is marginal on modern systems, so it is
mainly there as a proof of concept. It is likely to be unbundled from
the core at some point. The "mmap" layer is a reasonable model for a
minimalist "derived" layer.

=item "pending"

An "internal" derivative of "perlio" which can be used to provide
Unread() function for layers which have no buffer or cannot be
bothered.  (Basically this layer's C<Fill()> pops itself off the stack
and so resumes reading from layer below.)

=item "raw"

A dummy layer which never exists on the layer stack. Instead when
"pushed" it actually pops the stack removing itself, it then calls
Binmode function table entry on all the layers in the stack - normally
this (via PerlIOBase_binmode) removes any layers which do not have
C<PERLIO_K_RAW> bit set. Layers can modify that behaviour by defining
their own Binmode entry.

=item "utf8"

Another dummy layer. When pushed it pops itself and sets the
C<PERLIO_F_UTF8> flag on the layer which was (and now is once more)
the top of the stack.

=back

In addition F<perlio.c> also provides a number of C<PerlIOBase_xxxx()>
functions which are intended to be used in the table slots of classes
which do not need to do anything special for a particular method.
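
As a small illustration of the kind of byte transformation the "crlf" layer performs on output, the sketch below expands each "\n" into CR,LF. It is a standalone toy under stated assumptions: C<toy_lf_to_crlf()> is a hypothetical helper, not a function from F<perlio.c>, and it works on a whole buffer at once rather than through the layer's buffer methods.

```c
#include <stddef.h>

/* Copy len bytes from src to dst, writing CR,LF for every "\n".
 * Returns the number of bytes the full translation needs.  At most
 * dstsize bytes are stored, so a return value larger than dstsize
 * means the output was truncated. */
size_t toy_lf_to_crlf(const char *src, size_t len,
                      char *dst, size_t dstsize)
{
    size_t i, out = 0;

    for (i = 0; i < len; i++) {
        if (src[i] == '\n') {
            if (out < dstsize)
                dst[out] = '\r';   /* insert the CR before the LF */
            out++;
        }
        if (out < dstsize)
            dst[out] = src[i];
        out++;
    }
    return out;
}
```

Returning the required size rather than the stored size lets a caller size its buffer with a first call and translate with a second.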

=head2 Extension Layers

Layers can be made available by extension modules. When an unknown layer
is encountered the PerlIO code will perform the equivalent of:

   use PerlIO 'layer';

Where I<layer> is the unknown layer. F<PerlIO.pm> will then attempt to:

   require PerlIO::layer;

If after that process the layer is still not defined then the C<open>
will fail.

The following extension layers are bundled with perl:

=over 4

=item ":encoding"

   use PerlIO::encoding;

makes this layer available, although F<PerlIO.pm> "knows" where to
find it.  It is an example of a layer which takes an argument as it is
called thus:

   open( $fh, "<:encoding(iso-8859-7)", $pathname );

=item ":scalar"

Provides support for reading data from and writing data to a scalar.

   open( $fh, "+<:scalar", \$scalar );

When a handle is so opened, then reads get bytes from the string value
of I<$scalar>, and writes change the value. In both cases the position
in I<$scalar> starts as zero but can be altered via C<seek>, and
determined via C<tell>.

Please note that this layer is implied when calling open() thus:

   open( $fh, "+<", \$scalar );

=item ":via"

Provided to allow layers to be implemented as Perl code.  For instance:

   use PerlIO::via::StripHTML;
   open( my $fh, "<:via(StripHTML)", "index.html" );

See L<PerlIO::via> for details.

=back

=head1 TODO

Things that need to be done to improve this document.

=over

=item *

Explain how to make a valid fh without going through open() (i.e. apply
a layer). For example if the file is not opened through perl, but we
want to get back a fh, like it was opened by Perl.

How PerlIO_apply_layera fits in, where its docs are, and whether it was
made public.

Currently the example could be something like this:

  PerlIO *foo_to_PerlIO(pTHX_ const char *mode, ...)
  {
      /* mode is "w", "r", etc */
      const char *layers = ":APR"; /* the layer name */
      PerlIO *f = PerlIO_allocate(aTHX);
      if (!f) {
          return NULL;
      }

      /* PerlIO_apply_layers() returns 0 on success */
      if (PerlIO_apply_layers(aTHX_ f, mode, layers) == 0) {
          PerlIOAPR *st = PerlIOSelf(f, PerlIOAPR);
          /* fill in the st struct, as in _open(); "file" stands for
             the already-open object taken from the elided arguments */
          st->file = file;
          PerlIOBase(f)->flags |= PERLIO_F_OPEN;

          return f;
      }
      return NULL;
  }

=item *

fix/add the documentation in places marked as XXX.

=item *

The handling of errors by the layer is not specified. e.g. when $!
should be set explicitly, when the error handling should be just
delegated to the top layer.

Probably give some hints on using SETERRNO() or pointers to where they
can be found.

=item *

I think it would help to give some concrete examples to make it easier
to understand the API. Of course I agree that the API has to be
concise, but since there is no second document that is more of a
guide, I think that it'd make it easier to start with the doc which is
an API, but has examples in it in places where things are unclear, to
a person who is not a PerlIO guru (yet).

=back

=cut
If you read this file _as_is_, just ignore the funny characters you
see. It is written in the POD format (see pod/perlpod.pod) which is
specially designed to be readable as is.

=head1 NAME

perlwin32 - Perl under Windows

=head1 SYNOPSIS

These are instructions for building Perl under Windows 2000 and later.

=head1 DESCRIPTION

Before you start, you should glance through the README file
found in the top-level directory to which the Perl distribution
was extracted.  Make sure you read and understand the terms under
which this software is being distributed.

Also make sure you read L</BUGS AND CAVEATS> below for the
known limitations of this port.

The INSTALL file in the perl top-level has much information that is
only relevant to people building Perl on Unix-like systems.  In
particular, you can safely ignore any information that talks about
"Configure".

You may also want to look at one other option for building a perl that
will work on Windows: the README.cygwin file, which gives a different
set of rules to build a perl for Windows.  This method will probably
enable you to build a more Unix-compatible perl, but you will also
need to download and use various other build-time and run-time support
software described in that file.

This set of instructions is meant to describe a so-called "native"
port of Perl to the Windows platform.  This includes both 32-bit and
64-bit Windows operating systems.  The resulting Perl requires no
additional software to run (other than what came with your operating
system).  Currently, this port is capable of using one of the
following compilers on the Intel x86 architecture:

      Microsoft Visual C++    version 6.0 or later
      Intel C++ Compiler      (experimental)
      Gcc by mingw.org        gcc version 3.4.5 or later
      Gcc by mingw-w64.org    gcc version 4.4.3 or later

Note that the last two of these are actually competing projects, both
delivering a complete gcc toolchain for MS Windows:

=over 4

=item L<http://mingw.org>

Delivers a gcc toolchain targeting the 32-bit Windows platform.

=item L<http://mingw-w64.org>

Delivers a gcc toolchain targeting both 64-bit Windows and 32-bit Windows
platforms (despite the project name "mingw-w64" they are not only 64-bit
oriented). They deliver the native gcc compilers and cross-compilers
that are also supported by perl's makefile.

=back

The Microsoft Visual C++ compilers are also now being given away free. They are
available as "Visual C++ Toolkit 2003" or "Visual C++ 2005-2017 Express [or
Community, from 2017] Edition" (and also as part of the ".NET Framework SDK")
and are the same compilers that ship with "Visual C++ .NET 2003 Professional"
or "Visual C++ 2005-2017 Professional" respectively.

This port can also be built on IA64/AMD64 using:

      Microsoft Platform SDK	Nov 2001 (64-bit compiler and tools)
      MinGW64 compiler (gcc version 4.4.3 or later)

The Windows SDK can be downloaded from L<http://www.microsoft.com/>.
The MinGW64 compiler is available at L<http://mingw-w64.org>.
The latter is actually a cross-compiler targeting Win64. There's also a trimmed
down compiler (no java, or gfortran) suitable for building perl available at:
L<http://strawberryperl.com/package/kmx/64_gcctoolchain/>

NOTE: If you're using a 32-bit compiler to build perl on a 64-bit Windows
operating system, then you should set the WIN64 environment variable to "undef".
Also, the trimmed down compiler only passes tests when USE_ITHREADS *= define
(as opposed to undef) and when the CFG *= Debug line is commented out.

This port fully supports MakeMaker (the set of modules that
is used to build extensions to perl).  Therefore, you should be
able to build and install most extensions found in the CPAN sites.
See L</Usage Hints for Perl on Windows> below for general hints about this.

=head2 Setting Up Perl on Windows

=over 4

=item Make

You need a "make" program to build the sources.  If you are using
Visual C++ or the Windows SDK tools, you can use nmake supplied with Visual C++
or Windows SDK. You may also use, for Visual C++ or Windows SDK, dmake or gmake
instead of nmake.  dmake is open source software, but is not included with
Visual C++ or Windows SDK.  Builds using gcc need dmake or gmake.  nmake is not
supported for gcc builds.  Parallel building is only supported with dmake and
gmake, not nmake.  When using dmake it is recommended to use dmake 4.13 or newer
for parallel building.  Older dmakes, in parallel mode, have very high CPU usage
and pound the disk/filing system with duplicate I/O calls in an aggressive
polling loop.

A port of dmake for Windows is available from:

L<http://search.cpan.org/dist/dmake/>

Fetch and install dmake somewhere on your path.

=item Command Shell

Use the default "cmd" shell that comes with Windows.  Some versions of the
popular 4DOS/NT shell have incompatibilities that may cause you trouble.
If the build fails under that shell, try building again with the cmd
shell.

Make sure the path to the build directory does not contain spaces.  The
build usually works in this circumstance, but some tests will fail.

=item Microsoft Visual C++

The nmake that comes with Visual C++ will suffice for building. Visual C++
requires that certain things be set up in the console before it will
run successfully. To enable a console window to run the C compiler, you
must first run the C<vcvars32.bat> file to compile for x86-32, or
C<vcvarsall.bat x64> or C<vcvarsamd64.bat> to compile for x86-64. On a
typical install of a Microsoft C compiler product, these batch files will
already be in your C<PATH> environment variable so you may just type them
without an absolute path into your console. If you need to find the
absolute path to the batch file, it is usually found somewhere like
C:\Program Files\Microsoft Visual Studio\VC98\Bin.
With some newer Microsoft C products (released after ~2004), the installer will
put a shortcut in the start menu to launch a new console window with the
console already set up for your target architecture (x86-32 or x86-64 or IA64).
With the newer compilers, you may also use the older batch files if you
wish.

=item Microsoft Visual C++ 2008-2017 Express/Community Edition

These free versions of Visual C++ 2008-2017 Professional contain the same
compilers and linkers that ship with the full versions, and also contain
everything necessary to build Perl, rather than requiring a separate download
of the Windows SDK like previous versions did.

These packages can be downloaded by searching in the Download Center at
L<http://www.microsoft.com/downloads/search.aspx?displaylang=en>.  (Providing exact
links to these packages has proven a pointless task because the links keep on
changing so often.)

Install Visual C++ 2008-2017 Express/Community, then set up your environment
using, e.g.

 C:\Program Files\Microsoft Visual Studio 12.0\Common7\Tools\vsvars32.bat

(assuming the default installation location was chosen).

Perl should now build using the win32/Makefile.  You will need to edit that
file to set CCTYPE to one of MSVC90FREE-MSVC141FREE first.

=item Microsoft Visual C++ 2005 Express Edition

This free version of Visual C++ 2005 Professional contains the same compiler
and linker that ship with the full version, but doesn't contain everything
necessary to build Perl.

You will also need to download the "Windows SDK" (the "Core SDK" and "MDAC
SDK" components are required) for more header files and libraries.

These packages can both be downloaded by searching in the Download Center at
L<http://www.microsoft.com/downloads/search.aspx?displaylang=en>.  (Providing exact
links to these packages has proven a pointless task because the links keep on
changing so often.)

Try to obtain the latest version of the Windows SDK.  Sometimes these packages
contain a particular Windows OS version in their name, but actually work on
other OS versions too.  For example, the "Windows Server 2003 R2 Platform SDK"
also runs on Windows XP SP2 and Windows 2000.

Install Visual C++ 2005 first, then the Platform SDK.  Set up your environment
as follows (assuming default installation locations were chosen):

 SET PlatformSDKDir=C:\Program Files\Microsoft Platform SDK

 SET PATH=%SystemRoot%\system32;%SystemRoot%;C:\Program Files\Microsoft Visual Studio 8\Common7\IDE;C:\Program Files\Microsoft Visual Studio 8\VC\BIN;C:\Program Files\Microsoft Visual Studio 8\Common7\Tools;C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\bin;C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727;C:\Program Files\Microsoft Visual Studio 8\VC\VCPackages;%PlatformSDKDir%\Bin

 SET INCLUDE=C:\Program Files\Microsoft Visual Studio 8\VC\INCLUDE;%PlatformSDKDir%\include

 SET LIB=C:\Program Files\Microsoft Visual Studio 8\VC\LIB;C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\lib;%PlatformSDKDir%\lib

 SET LIBPATH=C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727

(The PlatformSDKDir might need to be set differently depending on which version
you are using. Earlier versions installed into "C:\Program Files\Microsoft SDK",
while the latest versions install into version-specific locations such as
"C:\Program Files\Microsoft Platform SDK for Windows Server 2003 R2".)

Perl should now build using the win32/Makefile.  You will need to edit that
file to set

 CCTYPE = MSVC80FREE

and to set CCHOME, CCINCDIR and CCLIBDIR as per the environment setup above.

=item Microsoft Visual C++ Toolkit 2003

This free toolkit contains the same compiler and linker that ship with
Visual C++ .NET 2003 Professional, but doesn't contain everything
necessary to build Perl.

You will also need to download the "Platform SDK" (the "Core SDK" and "MDAC
SDK" components are required) for header files, libraries and rc.exe, and
".NET Framework SDK" for more libraries and nmake.exe.  Note that the latter
(which also includes the free compiler and linker) requires the ".NET
Framework Redistributable" to be installed first.  This can be downloaded and
installed separately, but is included in the "Visual C++ Toolkit 2003" anyway.

These packages can all be downloaded by searching in the Download Center at
L<http://www.microsoft.com/downloads/search.aspx?displaylang=en>.  (Providing exact
links to these packages has proven a pointless task because the links keep on
changing so often.)

Try to obtain the latest version of the Windows SDK.  Sometimes these packages
contain a particular Windows OS version in their name, but actually work on
other OS versions too.  For example, the "Windows Server 2003 R2 Platform SDK"
also runs on Windows XP SP2 and Windows 2000.

Install the Toolkit first, then the Platform SDK, then the .NET Framework SDK.
Set up your environment as follows (assuming default installation locations
were chosen):

 SET PlatformSDKDir=C:\Program Files\Microsoft Platform SDK

 SET PATH=%SystemRoot%\system32;%SystemRoot%;C:\Program Files\Microsoft Visual C++ Toolkit 2003\bin;%PlatformSDKDir%\Bin;C:\Program Files\Microsoft.NET\SDK\v1.1\Bin

 SET INCLUDE=C:\Program Files\Microsoft Visual C++ Toolkit 2003\include;%PlatformSDKDir%\include;C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\include

 SET LIB=C:\Program Files\Microsoft Visual C++ Toolkit 2003\lib;%PlatformSDKDir%\lib;C:\Program Files\Microsoft Visual Studio .NET 2003\Vc7\lib

(The PlatformSDKDir might need to be set differently depending on which version
you are using. Earlier versions installed into "C:\Program Files\Microsoft SDK",
while the latest versions install into version-specific locations such as
"C:\Program Files\Microsoft Platform SDK for Windows Server 2003 R2".)

Several required files will still be missing:

=over 4

=item *

cvtres.exe is required by link.exe when using a .res file.  It is actually
installed by the .NET Framework SDK, but into a location such as the
following:

 C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322

Copy it from there to %PlatformSDKDir%\Bin

=item *

lib.exe is normally used to build libraries, but link.exe with the /lib
option also works, so change win32/config.vc to use it instead:

Change the line reading:

	ar='lib'

to:

	ar='link /lib'

It may also be useful to create a batch file called lib.bat in
C:\Program Files\Microsoft Visual C++ Toolkit 2003\bin containing:

	@echo off
	link /lib %*

for the benefit of any naughty C extension modules that you might want to build
later which explicitly reference "lib" rather than taking their value from
$Config{ar}.

=item *

setargv.obj is required to build perlglob.exe (and perl.exe if the USE_SETARGV
option is enabled).  The Platform SDK supplies this object file in source form
in %PlatformSDKDir%\src\crt.  Copy setargv.c, cruntime.h and
internal.h from there to some temporary location and build setargv.obj using

	cl.exe /c /I. /D_CRTBLD setargv.c

Then copy setargv.obj to %PlatformSDKDir%\lib

Alternatively, if you don't need perlglob.exe and don't need to enable the
USE_SETARGV option then you can safely just remove all mention of $(GLOBEXE)
from win32/Makefile and setargv.obj won't be required anyway.

=back

Perl should now build using the win32/Makefile.  You will need to edit that
file to set

	CCTYPE = MSVC70FREE

and to set CCHOME, CCINCDIR and CCLIBDIR as per the environment setup above.

=item Microsoft Platform SDK 64-bit Compiler

The nmake that comes with the Platform SDK will suffice for building
Perl.  Make sure you are building within one of the "Build Environment"
shells available after you install the Platform SDK from the Start Menu.

=item MinGW release 3 with gcc

Perl can be compiled with gcc from MinGW release 3 and later (using gcc 3.4.5
and later).  It can be downloaded here:

L<http://www.mingw.org/>

You also need dmake.  See L</"Make"> above on how to get it.

=item Intel C++ Compiler

Experimental support for using Intel C++ Compiler has been added. Edit
win32/Makefile and pick the correct CCTYPE for the Visual C that Intel C was
installed into. Also uncomment __ICC to enable Intel C on Visual C support.
To set up the build environment, from the Start Menu run
IA-32 Visual Studio 20__ mode or Intel 64 Visual Studio 20__ mode as
appropriate. Then run nmake as usual in that command prompt.

Only Intel C++ Compiler v12.1 has been tested. Other versions probably will
work. Using Intel C++ Compiler instead of Visual C has the benefit of C99
compatibility which is needed by some CPAN XS modules, while maintaining
compatibility with Visual C object code and Visual C debugging infrastructure
unlike GCC.

=back

=head2 Building

=over 4

=item *

Make sure you are in the "win32" subdirectory under the perl toplevel.
This directory contains a "Makefile" that will work with
versions of nmake that come with Visual C++ or the Windows SDK, and
a dmake "makefile.mk" that will work for all supported compilers.  The
defaults in the dmake makefile are set up to build using MinGW/gcc.

=item *

Edit the makefile.mk (or Makefile, if you're using nmake) and change
the values of INST_DRV and INST_TOP.   You can also enable various
build flags.  These are explained in the makefiles.

Note that it is generally not a good idea to try to build a perl with
INST_DRV and INST_TOP set to a path that already exists from a previous
build.  In particular, this may cause problems with the
lib/ExtUtils/t/Embed.t test, which attempts to build a test program and
may end up building against the installed perl's lib/CORE directory rather
than the one being tested.

You will have to make sure that CCTYPE is set correctly and that
CCHOME points to wherever you installed your compiler.

If building with the cross-compiler provided by
mingw-w64.org you'll need to uncomment the line that sets
GCCCROSS in the makefile.mk. Do this only if it's the cross-compiler, i.e.
only if the bin folder doesn't contain a gcc.exe. (The cross-compiler
does not provide a gcc.exe, g++.exe, ar.exe, etc. Instead, all of these
executables are prefixed with 'x86_64-w64-mingw32-'.)

The default value for CCHOME in the makefiles for Visual C++
may not be correct for some versions.  Make sure the default exists
and is valid.

You may also need to comment out the C<DELAYLOAD = ...> line in the
Makefile if you're using VC++ 6.0 without the latest service pack and
the linker reports an internal error.

If you want to build some core extensions statically into perl's dll, specify
them in the STATIC_EXT macro.

NOTE: The USE_64_BIT_INT build option is not supported with the 32-bit
Visual C++ 6.0 compiler.

Be sure to read the instructions near the top of the makefiles carefully.

=item *

Type "dmake" (or "nmake" if you are using that make).

This should build everything.  Specifically, it will create perl.exe,
perl526.dll at the perl toplevel, and various other extension dll's
under the lib\auto directory.  If the build fails for any reason, make
sure you have done the previous steps correctly.

To try dmake's parallel mode, type "dmake -P2", where 2 is the maximum number
of parallel jobs you want to run. A number of things in the build process will
run in parallel, but there are serialization points where you will see just 1
CPU maxed out. This is normal.

If you are advanced enough with building C code, here is a suggestion to speed
up building perl, and the later C<make test>. Keep your PATH environment
variable as short as possible (remembering to keep your C
compiler's folders there). C<C:\WINDOWS\system32> or C<C:\WINNT\system32>,
depending on your OS version, should be the first folder in PATH, since "cmd.exe"
is the most commonly launched program during the build and later testing.

=back

=head2 Testing Perl on Windows

Type "dmake test" (or "nmake test").  This will run most of the tests from
the testsuite (many tests will be skipped).

There should be no test failures.

If you build with Visual C++ 2013 then three tests currently may fail with
Daylight Saving Time related problems: F<t/io/fs.t>,
F<cpan/HTTP-Tiny/t/110_mirror.t> and F<lib/File/Copy.t>. The failures are
caused by bugs in the CRT in VC++ 2013 which are fixed in VC++ 2015 and
later, as explained by Microsoft here:
L<https://connect.microsoft.com/VisualStudio/feedback/details/811534/utime-sometimes-fails-to-set-the-correct-file-times-in-visual-c-2013>. In the meantime,
if you need fixed C<stat> and C<utime> functions then have a look at the
CPAN distribution Win32::UTCFileTime.

If you build with certain versions (e.g. 4.8.1) of gcc from www.mingw.org then
F<ext/POSIX/t/time.t> may fail test 17 due to a known bug in those gcc builds:
see L<http://sourceforge.net/p/mingw/bugs/2152/>.

Some test failures may occur if you use a command shell other than the
native "cmd.exe", or if you are building from a path that contains
spaces.  So don't do that.

If you are running the tests from an Emacs shell window, you may see
failures in op/stat.t.  Run "dmake test-notty" in that case.

Furthermore, you should make sure that during C<make test> you do not
have any GNU tool packages in your path: some toolkits like Unixutils
include some tools (C<type> for instance) which override the Windows
ones and make tests fail. Remove them from your path while testing to
avoid these errors.
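
One way to spot such shadowing tools before testing is to scan PATH for
them.  The following is only a sketch: the helper C<dirs_providing> and the
list of tool names are illustrative assumptions, not part of the perl build.

    use strict;
    use warnings;
    use Config;
    use File::Spec;

    # Return the directories from @dirs that contain a file called $name.
    sub dirs_providing {
        my ($name, @dirs) = @_;
        return grep { -e File::Spec->catfile($_, $name) } @dirs;
    }

    # Split PATH on the platform separator (";" on Windows).
    my @path = grep { length }
               split /\Q$Config{path_sep}\E/, ($ENV{PATH} // '');

    # Hypothetical names of tools that a GNU toolkit might install.
    for my $tool (qw(type.exe find.exe sort.exe)) {
        my @hits = dirs_providing($tool, @path);
        print "$tool: $_\n" for @hits;
    }

Any directory reported outside the Windows system folders is a candidate
for temporary removal from PATH while the tests run.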

Please report any other failures as described under L</BUGS AND CAVEATS>.

=head2 Installation of Perl on Windows

Type "dmake install" (or "nmake install").  This will put the newly
built perl and the libraries under whatever C<INST_TOP> points to in the
Makefile.  It will also install the pod documentation under
C<$INST_TOP\$INST_VER\lib\pod> and HTML versions of the same under
C<$INST_TOP\$INST_VER\lib\pod\html>.

To use the Perl you just installed you will need to add a new entry to
your PATH environment variable: C<$INST_TOP\bin>, e.g.

    set PATH=c:\perl\bin;%PATH%

If you opted to uncomment C<INST_VER> and C<INST_ARCH> in the makefile
then the installation structure is a little more complicated and you will
need to add two new PATH components instead: C<$INST_TOP\$INST_VER\bin> and
C<$INST_TOP\$INST_VER\bin\$ARCHNAME>, e.g.

    set PATH=c:\perl\5.6.0\bin;c:\perl\5.6.0\bin\MSWin32-x86;%PATH%

=head2 Usage Hints for Perl on Windows

=over 4

=item Environment Variables

The installation paths that you set during the build get compiled
into perl, so you don't have to do anything additional to start
using that perl (except add its location to your PATH variable).

If you put extensions in unusual places, you can set PERL5LIB
to a list of paths separated by semicolons where you want perl
to look for libraries.  Look for descriptions of other environment
variables you can set in L<perlrun>.

You can also control the shell that perl uses to run system() and
backtick commands via PERL5SHELL.  See L<perlrun>.

Perl does not depend on the registry, but it can look up certain default
values if you choose to put them there unless disabled at build time with
USE_NO_REGISTRY.  On Perl process start Perl checks if
C<HKEY_CURRENT_USER\Software\Perl> and C<HKEY_LOCAL_MACHINE\Software\Perl>
exist.  If the keys exist, they will be checked for certain entries for the
remainder of the Perl process's lifetime.  Entries in
C<HKEY_CURRENT_USER\Software\Perl> override entries in
C<HKEY_LOCAL_MACHINE\Software\Perl>.  One or more of the following entries
(of type REG_SZ or REG_EXPAND_SZ) may be set in the keys:

 lib-$]        version-specific standard library path to add to @INC
 lib           standard library path to add to @INC
 sitelib-$]    version-specific site library path to add to @INC
 sitelib       site library path to add to @INC
 vendorlib-$]  version-specific vendor library path to add to @INC
 vendorlib     vendor library path to add to @INC
 PERL*         fallback for all %ENV lookups that begin with "PERL"

Note the C<$]> in the above is not literal.  Substitute whatever version
of perl you want to honor that entry, e.g. C<5.6.0>.  Paths must be
separated with semicolons, as usual on Windows.
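
To see how a semicolon-separated value such as PERL5LIB breaks down,
something like the following can help.  It is a sketch only:
C<split_lib_path> is a hypothetical helper, and C<$Config{path_sep}> is used
so that the same code also runs on platforms where the separator is a colon.

    use strict;
    use warnings;
    use Config;

    # Split a PERL5LIB-style string into its directories.
    # $sep defaults to the platform's separator (";" on Windows).
    sub split_lib_path {
        my ($value, $sep) = @_;
        $sep = $Config{path_sep} unless defined $sep;
        return grep { length } split /\Q$sep\E/, $value;
    }

    my @dirs = split_lib_path($ENV{PERL5LIB} // '');
    print "PERL5LIB entry: $_\n" for @dirs;
    print "\@INC entry: $_\n"     for @INC;

Comparing the two lists shows where the extra directories end up in C<@INC>.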

=item File Globbing

By default, perl handles file globbing using the File::Glob extension,
which provides portable globbing.

If you want perl to use globbing that emulates the quirks of DOS
filename conventions, you might want to consider using File::DosGlob
to override the internal glob() implementation.  See L<File::DosGlob> for
details.

=item Using perl from the command line

If you are accustomed to using perl from various command-line
shells found in UNIX environments, you will be less than pleased
with what Windows offers by way of a command shell.

The crucial thing to understand about the Windows environment is that
the command line you type in is processed twice before Perl sees it.
First, your command shell (usually CMD.EXE) preprocesses the command
line, to handle redirection, environment variable expansion, and
location of the executable to run. Then, the perl executable splits
the remaining command line into individual arguments, using the
C runtime library upon which Perl was built.

It is particularly important to note that neither the shell nor the C
runtime do any wildcard expansions of command-line arguments (so
wildcards need not be quoted).  Also, the quoting behaviours of the
shell and the C runtime are rudimentary at best (and may, if you are
using a non-standard shell, be inconsistent).  The only (useful) quote
character is the double quote (").  It can be used to protect spaces
and other special characters in arguments.

The Windows documentation describes the shell parsing rules here:
L<http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/cmd.mspx?mfr=true>
and the C runtime parsing rules here:
L<http://msdn.microsoft.com/en-us/library/17w5ykft%28v=VS.100%29.aspx>.

Here are some further observations based on experiments: The C runtime
breaks arguments at spaces and passes them to programs in argc/argv.
Double quotes can be used to prevent arguments with spaces in them from
being split up.  You can put a double quote in an argument by escaping
it with a backslash and enclosing the whole argument within double quotes.
The backslash and the pair of double quotes surrounding the argument will
be stripped by the C runtime.

The file redirection characters "E<lt>", "E<gt>", and "|" can be quoted by
double quotes (although there are suggestions that this may not always
be true).  Single quotes are not treated as quotes by the shell or
the C runtime, and they don't get stripped by the shell (just to make
this type of quoting completely useless).  The caret "^" has also
been observed to behave as a quoting character, but this appears
to be a shell feature, and the caret is not stripped from the command
line, so Perl still sees it (and the C runtime phase does not treat
the caret as a quote character).
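
A tiny script that prints each argument exactly as perl received it makes
the effect of these two parsing stages easier to observe.  The file name
F<showargs.pl> and the helper C<bracket_args> are hypothetical.

    use strict;
    use warnings;

    # Wrap each argument in brackets so that embedded spaces and
    # leftover quote characters are visible.
    sub bracket_args {
        return map { "[$_]" } @_;
    }

    print "$_\n" for bracket_args(@ARGV);

Run it with a mix of quoting styles, for example

    perl showargs.pl "a b" 'c d'

and compare the output with the rules described above.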

Here are some examples of usage of the "cmd" shell:

This prints two doublequotes:

    perl -e "print '\"\"' "

This does the same:

    perl -e "print \"\\\"\\\"\" "

This prints "bar" and writes "foo" to the file "blurch":

    perl -e "print 'foo'; print STDERR 'bar'" > blurch

This prints "foo" ("bar" disappears into nowhereland):

    perl -e "print 'foo'; print STDERR 'bar'" 2> nul

This prints "bar" and writes "foo" into the file "blurch":

    perl -e "print 'foo'; print STDERR 'bar'" 1> blurch

This pipes "foo" to the "less" pager and prints "bar" on the console:

    perl -e "print 'foo'; print STDERR 'bar'" | less

This pipes "foo\nbar\n" to the less pager:

    perl -le "print 'foo'; print STDERR 'bar'" 2>&1 | less

This pipes "foo" to the pager and writes "bar" in the file "blurch":

    perl -e "print 'foo'; print STDERR 'bar'" 2> blurch | less


Discovering the usefulness of the "command.com" shell on Windows 9x
is left as an exercise to the reader :)

One particularly pernicious problem with the 4NT command shell for
Windows is that it (nearly) always treats a % character as indicating
that environment variable expansion is needed.  Under this shell, it is
therefore important to always double any % characters which you want
Perl to see (for example, for hash variables), even when they are
quoted.

=item Building Extensions

The Comprehensive Perl Archive Network (CPAN) offers a wealth
of extensions, some of which require a C compiler to build.
Look in L<http://www.cpan.org/> for more information on CPAN.

Note that not all of the extensions available from CPAN may work
in the Windows environment; you should check the information at
L<http://www.cpantesters.org/> before investing too much effort into
porting modules that don't readily build.

Most extensions (whether they require a C compiler or not) can
be built, tested and installed with the standard mantra:

    perl Makefile.PL
    $MAKE
    $MAKE test
    $MAKE install

where $MAKE is whatever 'make' program you have configured perl to
use.  Use "perl -V:make" to find out what this is.  Some extensions
may not provide a testsuite (so "$MAKE test" may not do anything or
fail), but most serious ones do.

It is important that you use a supported 'make' program, and
ensure Config.pm knows about it.  If you don't have nmake, you can
either get dmake from the location mentioned earlier or get an
old version of nmake reportedly available from:

L<http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/nmake15.exe>

Another option is to use the make written in Perl, available from
CPAN.

L<http://www.cpan.org/modules/by-module/Make/>

You may also use dmake.  See L</"Make"> above on how to get it.

Note that MakeMaker actually emits makefiles with different syntax
depending on what 'make' it thinks you are using.  Therefore, it is
important that one of the following values appears in Config.pm:

    make='nmake'	# MakeMaker emits nmake syntax
    make='dmake'	# MakeMaker emits dmake syntax
    any other value	# MakeMaker emits generic make syntax
    			    (e.g. GNU make, or Perl make)

If the value doesn't match the 'make' program you want to use,
edit Config.pm to fix it.
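
The configured value can also be inspected from Perl itself.  The sketch
below merely restates the table above: C<makefile_syntax> is a hypothetical
helper and is not how MakeMaker is implemented.

    use strict;
    use warnings;
    use Config;

    # Map a make program name to the makefile dialect listed above.
    sub makefile_syntax {
        my ($make) = @_;
        return 'nmake' if lc($make) eq 'nmake';
        return 'dmake' if lc($make) eq 'dmake';
        return 'generic';
    }

    my $make = $Config{make} // '';
    print "make='$make' => ", makefile_syntax($make), " syntax\n";

If the reported dialect is not the one your 'make' program expects, that is
the cue to correct the value as described above.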

If a module implements XSUBs, you will need one of the supported
C compilers.  You must make sure you have set up the environment for
the compiler for command-line compilation before running C<perl Makefile.PL>
or any invocation of make.

If a module does not build for some reason, look carefully for
why it failed, and report problems to the module author.  If
it looks like the extension building support is at fault, report
that with full details of how the build failed using the perlbug
utility.

=item Command-line Wildcard Expansion

The default command shells on DOS descendant operating systems (such
as they are) usually do not expand wildcard arguments supplied to
programs.  They consider it the application's job to handle that.
This is commonly achieved by linking the application (in our case,
perl) with startup code that the C runtime libraries usually provide.
However, doing that results in incompatible perl versions (since the
behavior of the argv expansion code differs depending on the
compiler, and it is even buggy on some compilers).  Besides, it may
be a source of frustration if you use such a perl binary with an
alternate shell that *does* expand wildcards.

Instead, the following solution works rather well. The nice things
about it are 1) you can start using it right away; 2) it is more
powerful, because it will do the right thing with a pattern like
*/*/*.c; 3) you can decide whether you do/don't want to use it; and
4) you can extend the method to add any customizations (or even
entirely different kinds of wildcard expansion).

 C:\> copy con c:\perl\lib\Wild.pm
 # Wild.pm - emulate shell @ARGV expansion on shells that don't
 use File::DosGlob;
 @ARGV = map {
	      my @g = /[*?]/ ? File::DosGlob::glob($_) : ();
	      @g ? @g : $_;
	    } @ARGV;
 1;
 ^Z
 C:\> set PERL5OPT=-MWild
 C:\> perl -le "for (@ARGV) { print }" */*/perl*.c
 p4view/perl/perl.c
 p4view/perl/perlio.c
 p4view/perl/perly.c
 perl5.005/win32/perlglob.c
 perl5.005/win32/perllib.c
 perl5.005/win32/perlglob.c
 perl5.005/win32/perllib.c
 perl5.005/win32/perlglob.c
 perl5.005/win32/perllib.c

Note there are two distinct steps there: 1) You'll have to create
Wild.pm and put it in your perl lib directory. 2) You'll need to
set the PERL5OPT environment variable.  If you want argv expansion
to be the default, just set PERL5OPT in your default startup
environment.
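
If setting PERL5OPT globally is not wanted, the same expansion can be
applied inside a single script.  This is a sketch under that assumption:
C<expand_args> is a hypothetical name, and arguments without wildcards are
passed through unchanged.

    use strict;
    use warnings;
    use File::DosGlob ();

    # Expand arguments that contain wildcards; leave the rest alone.
    # A pattern that matches nothing is kept as typed.
    sub expand_args {
        return map {
            my @g = /[*?]/ ? File::DosGlob::glob($_) : ();
            @g ? @g : $_;
        } @_;
    }

    @ARGV = expand_args(@ARGV);
    print "$_\n" for @ARGV;

Placing the C<expand_args> call near the top of the script keeps the rest of
the program unaware of whether the shell expanded the wildcards.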

If you are using the Visual C compiler, you can get the C runtime's
command line wildcard expansion built into the perl binary.  The resulting
binary will always expand unquoted command lines, which may not be
what you want if you use a shell that does that for you.  The expansion
done is also somewhat less powerful than the approach suggested above.

=item Notes on 64-bit Windows

Windows .NET Server supports the LLP64 data model on the Intel Itanium
architecture.

The LLP64 data model is different from the LP64 data model that is the
norm on 64-bit Unix platforms.  In the former, C<int> and C<long> are
both 32-bit data types, while pointers are 64 bits wide.  In addition,
there is a separate 64-bit wide integral type, C<__int64>.  In contrast,
the LP64 data model that is pervasive on Unix platforms provides C<int>
as the 32-bit type, while both the C<long> type and pointers are of
64-bit precision.  Note that both models provide for 64 bits of
addressability.
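
The type sizes that a particular perl was built with are recorded in
L<Config>, so the data model can be checked directly.  In this sketch
C<data_model> is a hypothetical helper.  The ILP32 case (all three types
32 bits wide) is not discussed above and is included only so that 32-bit
builds produce a sensible answer.

    use strict;
    use warnings;
    use Config;

    # Classify a data model from the sizes (in bytes) of int, long
    # and pointers.
    sub data_model {
        my ($int, $long, $ptr) = @_;
        return 'LLP64' if $int == 4 && $long == 4 && $ptr == 8;
        return 'LP64'  if $int == 4 && $long == 8 && $ptr == 8;
        return 'ILP32' if $int == 4 && $long == 4 && $ptr == 4;
        return 'other';
    }

    print data_model(@Config{qw(intsize longsize ptrsize)}), "\n";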

64-bit Windows running on Itanium is capable of running 32-bit x86
binaries transparently.  This means that you could use a 32-bit build
of Perl on a 64-bit system.  Given this, why would one want to build
a 64-bit build of Perl?  Here are some reasons why you would bother:

=over

=item *

A 64-bit native application will run much more efficiently on
Itanium hardware.

=item *

There is no 2GB limit on process size.

=item *

Perl automatically provides large file support when built under
64-bit Windows.

=item *

Embedding Perl inside a 64-bit application.

=back

=back

=head2 Running Perl Scripts

Perl scripts on UNIX use the "#!" (a.k.a. "shebang") line to
indicate to the OS that it should execute the file using perl.
Windows has no comparable means to indicate arbitrary files are
executables.

Instead, all available methods to execute plain text files on
Windows rely on the file "extension".  There are three methods
to use this to execute perl scripts:

=over 8

=item 1

There is a facility called "file extension associations".  This can be
manipulated via the two commands "assoc" and "ftype" that come
standard with Windows.  Type "ftype /?" for a complete example of how
to set this up for perl scripts (Say what?  You thought Windows
wasn't perl-ready? :).

=item 2

Since file associations don't work everywhere, and there are
reportedly bugs with file associations where it does work, the
old method of wrapping the perl script to make it look like a
regular batch file to the OS, may be used.  The install process
makes available the "pl2bat.bat" script which can be used to wrap
perl scripts into batch files.  For example:

	pl2bat foo.pl

will create the file "FOO.BAT".  Note "pl2bat" strips any
.pl suffix and adds a .bat suffix to the generated file.

If you use the 4DOS/NT or similar command shell, note that
"pl2bat" uses the "%*" variable in the generated batch file to
refer to all the command line arguments, so you may need to make
sure that construct works in batch files.  As of this writing,
4DOS/NT users will need a "ParameterChar = *" statement in their
4NT.INI file or will need to execute "setdos /p*" in the 4DOS/NT
startup file to enable this to work.

=item 3

Using "pl2bat" has a few problems:  the file name gets changed,
so scripts that rely on C<$0> to find what they must do may not
run properly; running "pl2bat" replicates the contents of the
original script, and so this process can be maintenance intensive
if the originals get updated often.  A different approach that
avoids both problems is possible.

A script called "runperl.bat" is available that can be copied
to any filename (along with the .bat suffix).  For example,
if you call it "foo.bat", it will run the file "foo" when it is
executed.  Since you can run batch files on Windows platforms simply
by typing the name (without the extension), this effectively
runs the file "foo", when you type either "foo" or "foo.bat".
With this method, "foo.bat" can even be in a different location
than the file "foo", as long as "foo" is available somewhere on
the PATH.  If your scripts are on a filesystem that allows symbolic
links, you can even avoid copying "runperl.bat".

Here's a diversion:  copy "runperl.bat" to "runperl", and type
"runperl".  Explain the observed behavior, or lack thereof. :)
Hint: .gnidnats llits er'uoy fi ,"lrepnur" eteled :tniH

=back
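
A script that dispatches on C<$0> can be made less sensitive to the wrapping
methods above by stripping the directory and any F<.pl>, F<.bat> or F<.cmd>
suffix before comparing names.  The following is a sketch: C<command_name>
is a hypothetical helper and the suffix list is an assumption.

    use strict;
    use warnings;
    use File::Basename qw(fileparse);

    # Return the bare command name for a path such as $0.
    sub command_name {
        my ($path) = @_;
        my ($name) = fileparse($path, qr/\.(?:pl|bat|cmd)/i);
        return $name;
    }

    print "invoked as: ", command_name($0), "\n";

With this, "foo.pl" and "FOO.BAT" reduce to "foo" and "FOO" respectively, so
a case-insensitive comparison is still needed if the file name may have been
upper-cased.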

=head2 Miscellaneous Things

A full set of HTML documentation is installed, so you should be
able to use it if you have a web browser installed on your
system.

C<perldoc> is also a useful tool for browsing information contained
in the documentation, especially in conjunction with a pager
like C<less> (recent versions of which have Windows support).  You may
have to set the PAGER environment variable to use a specific pager.
"perldoc -f foo" will print information about the perl operator
"foo".

One common mistake when using this port with a GUI library like C<Tk>
is assuming that Perl's normal behavior of opening a command-line
window will go away.  This isn't the case.  If you want to start a copy
of C<perl> without opening a command-line window, use the C<wperl>
executable built during the installation process.  Usage is exactly
the same as normal C<perl> on Windows, except that options like C<-h>
don't work (since they need a command-line window to print to).

If you find bugs in perl, you can run C<perlbug> to create a
bug report (you may have to send it manually if C<perlbug> cannot
find a mailer on your system).

=head1 BUGS AND CAVEATS

Norton AntiVirus interferes with the build process, particularly if
set to "AutoProtect, All Files, when Opened". Unlike large applications,
the perl build process opens and modifies a lot of files. Having the
AntiVirus scan each and every one slows the build process significantly.
Worse, with PERLIO=stdio the build process fails with peculiar messages
as the virus checker interacts badly with miniperl.exe writing configure
files (it seems either to catch a file partly written and treat it as
suspicious, or the virus checker may have it "locked" in a way which inhibits
miniperl from updating it). The build does complete with

   set PERLIO=perlio

but that may be just luck. Other AntiVirus software may have similar issues.

A git GUI shell extension for Windows such as TortoiseGit will cause the build
and later C<make test> to run much slower since every file is checked for its
git status as soon as it is created and/or modified. TortoiseGit doesn't cause
any test failures or build problems unlike the antivirus software described
above, but it does cause similar slowness. It is suggested to use Task Manager
to look for background processes which use high CPU amounts during the building
process.

Some of the built-in functions do not act exactly as documented in
L<perlfunc>, and a few are not implemented at all.  To avoid
surprises, particularly if you have had prior exposure to Perl
in other operating environments or if you intend to write code
that will be portable to other environments, see L<perlport>
for a reasonably definitive list of these differences.

Not all extensions available from CPAN may build or work properly
in the Windows environment.  See L</"Building Extensions">.

Most C<socket()> related calls are supported, but they may not
behave as on Unix platforms.  See L<perlport> for the full list.

Signal handling may not behave as on Unix platforms (where it
doesn't exactly "behave", either :).  For instance, calling C<die()>
or C<exit()> from signal handlers will cause an exception, since most
implementations of C<signal()> on Windows are severely crippled.
Thus, signals may work only for simple things like setting a flag
variable in the handler.  Using signals under this port should
currently be considered unsupported.
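
The "flag variable" pattern mentioned above looks roughly like this.  It is
a sketch, not a guarantee that signals are delivered reliably on this port.
The variable names and the loop are illustrative.

    use strict;
    use warnings;

    my $interrupted = 0;

    # Do nothing in the handler except record that the signal arrived.
    $SIG{INT} = sub { $interrupted = 1 };

    my $steps = 0;
    while (!$interrupted && $steps < 3) {
        $steps++;                           # one unit of work
        select(undef, undef, undef, 0.1);   # short pause between units
    }

    my $message = $interrupted ? "interrupted after $steps steps\n"
                               : "finished $steps steps\n";
    print $message;

All real work, including any call to C<die()> or C<exit()>, happens in the
main loop after the flag has been tested, never inside the handler.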

Please send detailed descriptions of any problems and solutions that
you may find to E<lt>F<perlbug@perl.org>E<gt>, along with the output
produced by C<perl -V>.

=head1 ACKNOWLEDGEMENTS

The use of a camel with the topic of Perl is a trademark
of O'Reilly and Associates, Inc. Used with permission.

=head1 AUTHORS

=over 4

=item Gary Ng E<lt>71564.1743@CompuServe.COME<gt>

=item Gurusamy Sarathy E<lt>gsar@activestate.comE<gt>

=item Nick Ing-Simmons E<lt>nick@ing-simmons.netE<gt>

=item Jan Dubois E<lt>jand@activestate.comE<gt>

=item Steve Hay E<lt>steve.m.hay@googlemail.comE<gt>

=back

This document is maintained by Jan Dubois.

=head1 SEE ALSO

L<perl>

=head1 HISTORY

This port was originally contributed by Gary Ng around 5.003_24,
and borrowed from the Hip Communications port that was available
at the time.  Various people have made numerous and sundry hacks
since then.

GCC/mingw32 support was added in 5.005 (Nick Ing-Simmons).

Support for PERL_OBJECT was added in 5.005 (ActiveState Tool Corp).

Support for fork() emulation was added in 5.6 (ActiveState Tool Corp).

Win9x support was added in 5.6 (Benjamin Stuhl).

Support for 64-bit Windows added in 5.8 (ActiveState Corp).

Last updated: 16 June 2017

=cut
perl5163delta.pod000064400000007765150344123500007557 0ustar00=encoding utf8

=head1 NAME

perl5163delta - what is new for perl v5.16.3

=head1 DESCRIPTION

This document describes differences between the 5.16.2 release and
the 5.16.3 release.

If you are upgrading from an earlier release such as 5.16.1, first read
L<perl5162delta>, which describes differences between 5.16.1 and
5.16.2.

=head1 Core Enhancements

No changes since 5.16.0.

=head1 Security

This release contains one major and a number of minor security fixes.
These latter are included mainly to allow the test suite to pass cleanly
with the clang compiler's address sanitizer facility.

=head2 CVE-2013-1667: memory exhaustion with arbitrary hash keys

With a carefully crafted set of hash keys (for example arguments on a
URL), it is possible to cause a hash to consume a large amount of memory
and CPU, and thus possibly to achieve a Denial-of-Service.

This problem has been fixed.

=head2 wrap-around with IO on long strings

Reading or writing strings greater than 2**31 bytes in size could segfault
due to integer wraparound.

This problem has been fixed.

=head2 memory leak in Encode

The UTF-8 encoding implementation in Encode.xs had a memory leak which has been
fixed.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.16.0. If any
exist, they are bugs and reports are welcome.

=head1 Deprecations

There have been no deprecations since 5.16.0.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Encode> has been upgraded from version 2.44 to version 2.44_01.

=item *

L<Module::CoreList> has been upgraded from version 2.76 to version 2.76_02.

=item *

L<XS::APItest> has been upgraded from version 0.38 to version 0.39.

=back

=head1 Known Problems

None.

=head1 Acknowledgements

Perl 5.16.3 represents approximately 4 months of development since Perl 5.16.2
and contains approximately 870 lines of changes across 39 files from 7 authors.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers. The following people are known to have contributed the
improvements that became Perl 5.16.3:

Andy Dougherty, Chris 'BinGOs' Williams, Dave Rolsky, David Mitchell, Michael
Schroeder, Ricardo Signes, Yves Orton.

The list above is almost certainly incomplete as it is automatically generated
from version control history. In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please
send it to perl5-security-report@perl.org. This points to a closed
subscription unarchived mailing list, which includes all the core
committers, who will be able to help assess the impact of issues, figure
out a resolution, and help co-ordinate the release of patches to
mitigate or fix the problem across all platforms on which Perl is
supported. Please only use this address for security issues in the Perl
core, not for modules independently distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=encoding utf8

=head1 NAME

perl5244delta - what is new for perl v5.24.4

=head1 DESCRIPTION

This document describes differences between the 5.24.3 release and the 5.24.4
release.

If you are upgrading from an earlier release such as 5.24.2, first read
L<perl5243delta>, which describes differences between 5.24.2 and 5.24.3.

=head1 Security

=head2 [CVE-2018-6797] heap-buffer-overflow (WRITE of size 1) in S_regatom (regcomp.c)

A crafted regular expression could cause a heap buffer write overflow, with
control over the bytes written.
L<[perl #132227]|https://rt.perl.org/Public/Bug/Display.html?id=132227>

=head2 [CVE-2018-6798] Heap-buffer-overflow in Perl__byte_dump_string (utf8.c)

Matching a crafted locale dependent regular expression could cause a heap
buffer read overflow and potentially information disclosure.
L<[perl #132063]|https://rt.perl.org/Public/Bug/Display.html?id=132063>

=head2 [CVE-2018-6913] heap-buffer-overflow in S_pack_rec

C<pack()> could cause a heap buffer write overflow with a large item count.
L<[perl #131844]|https://rt.perl.org/Public/Bug/Display.html?id=131844>

=head2 Assertion failure in Perl__core_swash_init (utf8.c)

Control characters in a supposed Unicode property name could cause perl to
crash.  This has been fixed.
L<[perl #132055]|https://rt.perl.org/Public/Bug/Display.html?id=132055>
L<[perl #132553]|https://rt.perl.org/Public/Bug/Display.html?id=132553>
L<[perl #132658]|https://rt.perl.org/Public/Bug/Display.html?id=132658>

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.24.3.  If any exist,
they are bugs, and we request that you submit a report.  See L</Reporting
Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Module::CoreList> has been upgraded from version 5.20170922_24 to 5.20180414_24.

=back

=head1 Selected Bug Fixes

=over 4

=item *

The C<readpipe()> built-in function now checks at compile time that it has only
one parameter expression, and puts it in scalar context, thus ensuring that it
doesn't corrupt the stack at runtime.
L<[perl #4574]|https://rt.perl.org/Public/Bug/Display.html?id=4574>

=back

=head1 Acknowledgements

Perl 5.24.4 represents approximately 7 months of development since Perl 5.24.3
and contains approximately 2,400 lines of changes across 49 files from 12
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 1,300 lines of changes to 12 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.24.4:

Abigail, Chris 'BinGOs' Williams, John SJ Anderson, Karen Etheridge, Karl
Williamson, Renee Baecker, Sawyer X, Steve Hay, Todd Rinaldo, Tony Cook, Yves
Orton, Zefram.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
L<https://rt.perl.org/> .  There may also be information at
L<http://www.perl.org/> , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications which make it
inappropriate to send to a publicly archived mailing list, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION>
for details of how to report the issue.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=encoding utf8

=head1 NAME

perl5280delta - what is new for perl v5.28.0

=head1 DESCRIPTION

This document describes differences between the 5.26.0 release and the 5.28.0
release.

If you are upgrading from an earlier release such as 5.24.0, first read
L<perl5260delta>, which describes differences between 5.24.0 and 5.26.0.

=head1 Core Enhancements

=head2 Unicode 10.0 is supported

A list of changes is at
L<http://www.unicode.org/versions/Unicode10.0.0>.

=head2 L<C<delete>|perlfunc/delete EXPR> on key/value hash slices

L<C<delete>|perlfunc/delete EXPR> can now be used on
L<keyE<sol>value hash slices|perldata/KeyE<sol>Value Hash Slices>,
returning the keys along with the deleted values.
L<[perl #131328]|https://rt.perl.org/Ticket/Display.html?id=131328>
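
A minimal sketch of the new form, assuming a hash C<%h> invented purely
for illustration:

    use v5.28;
    use warnings;

    my %h = (a => 1, b => 2, c => 3);

    # The key/value slice %h{...} yields key/value pairs, so delete
    # returns the removed pairs, ready to be assigned to another hash.
    my %removed = delete %h{qw(a b)};

    say join ',', map { "$_=$removed{$_}" } sort keys %removed;  # a=1,b=2
    say join ',', sort keys %h;                                  # c

The ordinary hash-slice form, C<delete @h{qw(a b)}>, still returns only
the deleted values.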

=head2 Experimentally, there are now alphabetic synonyms for some regular expression assertions

If you find it difficult to remember how to write certain of the pattern
assertions, there are now alphabetic synonyms.

 CURRENT        NEW SYNONYMS
 -------        ------------
 (?=...)        (*pla:...) or (*positive_lookahead:...)
 (?!...)        (*nla:...) or (*negative_lookahead:...)
 (?<=...)       (*plb:...) or (*positive_lookbehind:...)
 (?<!...)       (*nlb:...) or (*negative_lookbehind:...)
 (?>...)        (*atomic:...)

These are considered experimental, so using any of these will raise
(unless turned off) a warning in the C<experimental::alpha_assertions>
category.
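
As a rough illustration (the pattern and the test strings are made up),
the two spellings below should accept and reject the same strings:

    use v5.28;
    use warnings;
    no warnings 'experimental::alpha_assertions';

    my $symbolic   = qr/foo(?=bar)/;     # traditional positive lookahead
    my $alphabetic = qr/foo(*pla:bar)/;  # the same assertion, spelled out

    for my $str (qw(foobar foobaz)) {
        my $old = $str =~ $symbolic   ? 'match' : 'no match';
        my $new = $str =~ $alphabetic ? 'match' : 'no match';
        say "$str: $old / $new";
    }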

=head2 Mixed Unicode scripts are now detectable

A mixture of scripts, such as Cyrillic and Latin, in a string is often
the sign of a spoofing attack.  A new regular expression construct
now allows for easy detection of these.  For example, you can say

 qr/(*script_run: \d+ \b )/x

And the digits matched will all be from the same set of 10.  You won't
get a look-alike digit from a different script that has a different
value than what it appears to be.

Or:

 qr/(*sr: \b \w+ \b )/x

makes sure that all the characters come from the same script.

You can also combine script runs with C<(?E<gt>...)> (or
C<(*atomic:...)>).

Instead of writing:

    (*sr:(?>...))

you can now write:

    (*asr:...)
    # or
    (*atomic_script_run:...)

This is considered experimental, so using it will raise (unless turned
off) a warning in the C<experimental::script_run> category.

See L<perlre/Script Runs>.
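
A short sketch of spoof detection, assuming a hypothetical helper
C<single_script()> and a test string that slips one Cyrillic letter
into an otherwise Latin word:

    use v5.28;
    use warnings;
    no warnings 'experimental::script_run';

    # Hypothetical helper: true if the whole string is a single
    # script run of word characters.
    sub single_script {
        my ($str) = @_;
        return $str =~ /\A(*sr:\w+)\z/ ? 1 : 0;
    }

    say single_script("paypal");        # all Latin letters
    say single_script("p\x{430}ypal");  # U+0430 CYRILLIC SMALL LETTER A

The first call should report 1 and the second 0, since Latin and
Cyrillic letters cannot share a script run.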

=head2 In-place editing with C<perl -i> is now safer

Previously in-place editing (C<perl -i>) would delete or rename the
input file as soon as you started working on a new file.

Without backups this would result in loss of data if there was an
error, such as a full disk, when writing to the output file.

This has changed so that the input file isn't replaced until the
output file has been completely written and successfully closed.

This works by creating a work file in the same directory, which is
renamed over the input file once the output file is complete.

Incompatibilities:

=over

=item *

Since this renaming needs to only happen once, if you create a thread
or child process, that renaming will only happen in the original
thread or process.

=item *

If you change directories while processing a file, and your operating
system doesn't provide the C<unlinkat()>, C<renameat()> and C<fchmodat()>
functions, the final rename step may fail.

=back

L<[perl #127663]|https://rt.perl.org/Public/Bug/Display.html?id=127663>
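
The write-then-rename approach described above can be imitated in
ordinary code.  The following is a sketch of the general technique
only, not of perl's internal implementation; C<safe_edit()> is a
hypothetical helper, and the work file gets whatever name and
permissions L<File::Temp> chooses:

    use strict;
    use warnings;
    use File::Basename qw(dirname);
    use File::Temp qw(tempfile);

    # Hypothetical helper: pass each line of $path through $edit, and
    # replace the original only after the new contents have been
    # completely written and successfully closed.
    sub safe_edit {
        my ($path, $edit) = @_;
        open my $in, '<', $path or die "open $path: $!";

        # The work file lives in the same directory, so the final
        # rename stays on one filesystem.
        my ($out, $work) = tempfile(DIR => dirname($path), UNLINK => 0);
        while (my $line = <$in>) {
            print {$out} $edit->($line) or die "write $work: $!";
        }
        close $in;
        close $out or die "close $work: $!";
        rename $work, $path or die "rename $work to $path: $!";
        return;
    }

A call such as C<safe_edit($file, sub { uc $_[0] })> would upper-case
every line; if a write fails, the original file is left untouched.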

=head2 Initialisation of aggregate state variables

A persistent lexical array or hash variable can now be initialized,
by an expression such as C<state @a = qw(x y z)>.  Initialization of a
list of persistent lexical variables is still not possible.
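
For instance, in this small sketch (the sub and its data are invented
for illustration) the array is initialized once and then persists
across calls:

    use v5.28;
    use warnings;

    # Cycle through a fixed palette, remembering the position.
    sub next_colour {
        state @palette = qw(red green blue);  # initialized only once
        state $i = 0;
        return $palette[ $i++ % @palette ];
    }

    say join ',', map { next_colour() } 1 .. 4;  # red,green,blue,red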

=head2 Full-size inode numbers

On platforms where inode numbers are of a type larger than perl's native
integer numerical types, L<stat|perlfunc/stat> will preserve the full
content of large inode numbers by returning them in the form of strings of
decimal digits.  Exact comparison of inode numbers can thus be achieved by
comparing with C<eq> rather than C<==>.  Comparison with C<==>, and other
numerical operations (which are usually meaningless on inode numbers),
work as well as they did before, which is to say they fall back to
floating point, and ultimately operate on a fairly useless rounded inode
number if the real inode number is too big for the floating point format.
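
A minimal sketch of such a comparison, using a hypothetical helper
C<same_file()>:

    use strict;
    use warnings;

    # Hypothetical helper: true if both paths refer to the same file.
    sub same_file {
        my ($path1, $path2) = @_;
        my @st1 = stat $path1;
        my @st2 = stat $path2;
        return 0 unless @st1 && @st2;   # one of the stat calls failed

        # Field 0 is the device and field 1 the inode number.  The
        # inode is compared with eq so that no digits are lost.
        return ($st1[0] == $st2[0] && $st1[1] eq $st2[1]) ? 1 : 0;
    }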

=head2 The C<sprintf> C<%j> format size modifier is now available with pre-C99 compilers

The actual size used depends on the platform, so remains unportable.

=head2 Close-on-exec flag set atomically

When opening a file descriptor, perl now generally opens it with its
close-on-exec flag already set, on platforms that support doing so.
This improves thread safety, because it means that an C<exec> initiated
by one thread can no longer cause a file descriptor in the process
of being opened by another thread to be accidentally passed to the
executed program.

Additionally, perl now sets the close-on-exec flag more reliably, whether
it does so atomically or not.  Most file descriptors were getting the
flag set, but some were being missed.
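
The flag can be inspected from Perl with C<fcntl>.  This sketch assumes
a POSIX-like platform where L<Fcntl> provides C<F_GETFD> and
C<FD_CLOEXEC>; C<cloexec_set()> is a hypothetical helper:

    use strict;
    use warnings;
    use Fcntl qw(F_GETFD FD_CLOEXEC);
    use File::Spec;

    # Hypothetical helper: report whether the descriptor behind a
    # handle has its close-on-exec flag set.
    sub cloexec_set {
        my ($fh) = @_;
        my $flags = fcntl($fh, F_GETFD, 0);
        die "fcntl: $!" unless defined $flags;
        return ($flags & FD_CLOEXEC) ? 1 : 0;
    }

    open my $fh, '<', File::Spec->devnull or die "open: $!";
    print "close-on-exec: ", cloexec_set($fh), "\n";

Descriptors numbered above C<$^F> should normally report 1.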

=head2 String- and number-specific bitwise ops are no longer experimental

The new string-specific (C<&. |. ^. ~.>) and number-specific (C<& | ^ ~>)
bitwise operators introduced in Perl 5.22 that are available within the
scope of C<use feature 'bitwise'> are no longer experimental.
Because the number-specific ops are spelled the same way as the existing
operators that choose their behaviour based on their operands, these
operators must still be enabled via the "bitwise" feature, in either of
these two ways:

    use feature "bitwise";

    use v5.28; # "bitwise" now included

They are also now enabled by the B<-E> command-line switch.

The "bitwise" feature no longer emits a warning.  Existing code that
disables the "experimental::bitwise" warning category that the feature
previously used will continue to work.
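
The difference between the two families shows up when the operands are
strings that look like numbers.  A small sketch with made-up values:

    use v5.28;   # enables the "bitwise" feature
    use warnings;

    my ($x, $y) = ("3", "12");

    say $x |  $y;   # always numeric:    3 | 12 gives 15
    say $x |. $y;   # always stringwise: "3" |. "12" gives "32"

Without the feature, C<$x | $y> would pick string behaviour here,
because both operands are strings.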

One caveat that module authors ought to be aware of is that the numeric
operators now pass a fifth TRUE argument to overload methods.  Any methods
that check the number of operands may croak if they do not expect so many.
XS authors in particular should be aware that this:

    SV *
    bitop_handler (lobj, robj, swap)

may need to be changed to this:

    SV *
    bitop_handler (lobj, robj, swap, ...)

=head2 Locales are now thread-safe on systems that support them

These systems include Windows starting with Visual Studio 2005, and
POSIX 2008 systems.

The implication is that you are now free to use locales and change them
in a threaded environment.  Your changes affect only your thread.
See L<perllocale/Multi-threaded operation>.

=head2 New read-only predefined variable C<${^SAFE_LOCALES}>

This variable is 1 if the Perl interpreter is operating in an
environment where it is safe to use and change locales (see
L<perllocale>).  This variable is true when perl is
unthreaded, or compiled on a platform that supports thread-safe locale
operation (see previous item).
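
A program might consult the variable before changing locales, as in
this sketch (C<locale_safety()> and its messages are hypothetical):

    use v5.28;
    use warnings;

    # Hypothetical helper: describe what the running perl supports.
    sub locale_safety {
        return ${^SAFE_LOCALES}
            ? 'locales are safe to use and change here'
            : 'locale changes may affect other threads';
    }

    say locale_safety();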

=head1 Security

=head2 [CVE-2017-12837] Heap buffer overflow in regular expression compiler

Compiling certain regular expression patterns with the case-insensitive
modifier could cause a heap buffer overflow and crash perl.  This has now been
fixed.
L<[perl #131582]|https://rt.perl.org/Public/Bug/Display.html?id=131582>

=head2 [CVE-2017-12883] Buffer over-read in regular expression parser

For certain types of syntax error in a regular expression pattern, the error
message could either contain the contents of a random, possibly large, chunk of
memory, or could crash perl.  This has now been fixed.
L<[perl #131598]|https://rt.perl.org/Public/Bug/Display.html?id=131598>

=head2 [CVE-2017-12814] C<$ENV{$key}> stack buffer overflow on Windows

A possible stack buffer overflow in the C<%ENV> code on Windows has been fixed
by removing the buffer completely since it was superfluous anyway.
L<[perl #131665]|https://rt.perl.org/Public/Bug/Display.html?id=131665>

=head2 Default Hash Function Change

Perl 5.28.0 retires various older hash functions which are not viewed as
sufficiently secure for use in Perl.  We now support four general-purpose
hash functions: Siphash (2-4 and 1-3 variants), Zaphod32, and StadtX.
In addition we support SBOX32 (a form of tabular hashing) for hashing
short strings, in conjunction with any of the other hash functions provided.

By default Perl is configured to support SBOX hashing of strings up to 24
characters, in conjunction with StadtX hashing on 64 bit builds, and
Zaphod32 hashing for 32 bit builds.

You may control these settings with the following options to Configure:

    -DPERL_HASH_FUNC_SIPHASH
    -DPERL_HASH_FUNC_SIPHASH13
    -DPERL_HASH_FUNC_STADTX
    -DPERL_HASH_FUNC_ZAPHOD32

To disable SBOX hashing you can use

    -DPERL_HASH_USE_SBOX32_ALSO=0

To set the maximum string length on which SBOX32 hashing is used, pass:

    -DSBOX32_MAX_LEN=16

The maximum length allowed is 256. There probably isn't much point
in setting it higher than the default.

=head1 Incompatible Changes

=head2 Subroutine attribute and signature order

The experimental subroutine signatures feature has been changed so that
subroutine attributes must now come before the signature rather than
after. This is because attributes like C<:lvalue> can affect the
compilation of code within the signature, for example:

    sub f :lvalue ($a = do { $x = "abc"; return substr($x,0,1)}) { ...}

Note that this is the second time they have been flipped:

    sub f :lvalue ($a, $b) { ... }; # 5.20; 5.28 onwards
    sub f ($a, $b) :lvalue { ... }; # 5.22 - 5.26
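
A runnable sketch of the 5.28 ordering, with an invented C<%config>
hash and C<setting()> sub:

    use v5.28;
    use warnings;
    use feature 'signatures';
    no warnings 'experimental::signatures';

    my %config = (level => 1);

    # The attribute comes first, then the signature.
    sub setting :lvalue ($key) { $config{$key} }

    setting('level') = 3;    # assigns through the lvalue sub
    say $config{level};      # 3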

=head2 Comma-less variable lists in formats are no longer allowed

Omitting the commas between variables passed to formats is no longer
allowed.  This has been deprecated since Perl 5.000.

=head2 The C<:locked> and C<:unique> attributes have been removed

These have been no-ops and deprecated since Perl 5.12 and 5.10,
respectively.

=head2 C<\N{}> with nothing between the braces is now illegal

This has been deprecated since Perl 5.24.

=head2 Opening the same symbol as both a file and directory handle is no longer allowed

Using C<open()> and C<opendir()> to associate both a filehandle and a dirhandle
to the same symbol (glob or scalar) has been deprecated since Perl 5.10.

=head2 Use of bare C<< << >> to mean C<< <<"" >> is no longer allowed

Use of a bare terminator has been deprecated since Perl 5.000.

=head2 Setting $/ to a reference to a non-positive integer is no longer allowed

This used to work like setting it to C<undef>, but has been deprecated
since Perl 5.20.

=head2 Unicode code points with values exceeding C<IV_MAX> are now fatal

This was deprecated since Perl 5.24.

=head2 The C<B::OP::terse> method has been removed

Use C<B::Concise::b_terse> instead.

=head2 Use of inherited AUTOLOAD for non-methods is no longer allowed

This was deprecated in Perl 5.004.

=head2 Use of strings with code points over 0xFF is not allowed for bitwise string operators

Code points over C<0xFF> do not make sense for bitwise operators and such
an operation will now croak, except for a few remaining cases. See
L<perldeprecation>.

This was deprecated in Perl 5.24.

=head2 Setting C<${^ENCODING}> to a defined value is now illegal

This has been deprecated since Perl 5.22 and a no-op since Perl 5.26.

=head2 Backslash no longer escapes colon in PATH for the C<-S> switch

Previously the C<-S> switch incorrectly treated backslash ("\") as an
escape for colon when traversing the C<PATH> environment variable.
L<[perl #129183]|https://rt.perl.org/Ticket/Display.html?id=129183>

=head2 The -DH (DEBUG_H) misfeature has been removed

On a perl built with debugging support, the C<H> flag to the C<-D>
debugging option has been removed. This was supposed to dump hash values,
but has been broken for many years.

=head2 Yada-yada is now strictly a statement

By the time of its initial stable release in Perl 5.12, the C<...>
(yada-yada) operator was explicitly intended to serve as a statement,
not an expression.  However, the original implementation was confused
on this point, leading to inconsistent parsing.  The operator was
accidentally accepted in a few situations where it did not serve as a
complete statement, such as

    ... . "foo";
    ... if $a < $b;

The parsing has now been made consistent, permitting yada-yada only as
a statement.  Affected code can use C<do{...}> to put a yada-yada into
an arbitrary expression context.
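
For example, in this sketch (the C<sides()> function is made up) one
branch of a conditional expression is left unimplemented:

    use strict;
    use warnings;

    # Hypothetical function with one branch still to be written.
    sub sides {
        my ($shape) = @_;
        return $shape eq 'square' ? 4 : do { ... };
    }

    print sides('square'), "\n";
    eval { sides('circle'); 1 } or print "died: $@";

The second call dies with an C<Unimplemented> error, as yada-yada
always does when it is executed.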

=head2 Sort algorithm can no longer be specified

Since Perl 5.8, the L<sort> pragma has had subpragmata C<_mergesort>,
C<_quicksort>, and C<_qsort> that can be used to specify which algorithm
perl should use to implement the L<sort|perlfunc/sort> builtin.
This was always considered a dubious feature that might not last,
hence the underscore spellings, and they were documented as not being
portable beyond Perl 5.8.  These subpragmata have now been deleted,
and any attempt to use them is an error.  The L<sort> pragma otherwise
remains, and the algorithm-neutral C<stable> subpragma can be used to
control sorting behaviour.
L<[perl #119635]|https://rt.perl.org/Ticket/Display.html?id=119635>

=head2 Over-radix digits in floating point literals

Octal and binary floating point literals used to permit any hexadecimal
digit to appear after the radix point.  The digits are now restricted
to those appropriate for the radix, as digits before the radix point
always were.

=head2 Return type of C<unpackstring()>

The return types of the C API functions C<unpackstring()> and
C<unpack_str()> have changed from C<I32> to C<SSize_t>, in order to
accommodate datasets of more than two billion items.

=head1 Deprecations

=head2 Use of L<C<vec>|perlfunc/vec EXPR,OFFSET,BITS> on strings with code points above 0xFF is deprecated

Such strings are represented internally in UTF-8, and C<vec> is a
bit-oriented operation that will likely give unexpected results on those
strings.

=head2 Some uses of unescaped C<"{"> in regexes are no longer fatal

Perl 5.26.0 fatalized some uses of an unescaped left brace, but an
exception was made at the last minute, specifically crafted to be a
minimal change to allow GNU Autoconf to work.  That tool is heavily
depended upon, and continues to use the deprecated usage.  Its use of an
unescaped left brace is one where we have no intention of repurposing
C<"{"> to be something other than itself.

That exception is now generalized to include various other such cases
where the C<"{"> will not be repurposed. 

Note that these uses continue to raise a deprecation message.

=head2 Use of unescaped C<"{"> immediately after a C<"("> in regular expression patterns is deprecated

Using unescaped left braces is officially deprecated everywhere, but it
is not enforced in contexts where their use does not interfere with
expected extensions to the language.  A deprecation is added in this
release when the brace appears immediately after an opening parenthesis.
Before this, even if the brace was part of a legal quantifier, it was
not interpreted as such, but as the literal characters, unlike other
quantifiers that follow a C<"("> which are considered errors.  Now,
their use will raise a deprecation message, unless turned off.

=head2 Assignment to C<$[> will be fatal in Perl 5.30

Assigning a non-zero value to L<C<$[>|perlvar/$[> has been deprecated
since Perl 5.12, but was never given a deadline for removal.  This has
now been scheduled for Perl 5.30.

=head2 hostname() won't accept arguments in Perl 5.32

Passing arguments to C<Sys::Hostname::hostname()> was already deprecated,
but didn't have a removal date.  This has now been scheduled for Perl
5.32.  L<[perl #124349]|https://rt.perl.org/Ticket/Display.html?id=124349>

=head2 Module removals

The following modules will be removed from the core distribution in a
future release, and will at that time need to be installed from CPAN.
Distributions on CPAN which require these modules will need to list them as
prerequisites.

The core versions of these modules will now issue C<"deprecated">-category
warnings to alert you to this fact.  To silence these deprecation warnings,
install the modules in question from CPAN.

Note that these are (with rare exceptions) fine modules that you are encouraged
to continue to use.  Their disinclusion from core primarily hinges on their
necessity to bootstrapping a fully functional, CPAN-capable Perl installation,
not usually on concerns over their design.

=over

=item B::Debug

=item L<Locale::Codes> and its associated Country, Currency and Language modules

=back

=head1 Performance Enhancements

=over 4

=item *

The start up overhead for creating regular expression patterns with
Unicode properties (C<\p{...}>) has been greatly reduced in most cases.

=item *

Many string concatenation expressions are now considerably faster, due
to the introduction internally of a C<multiconcat> opcode which combines
multiple concatenations, and optionally a C<=> or C<.=>, into a single
action. For example, apart from retrieving C<$s>, C<$a> and C<$b>, this
whole expression is now handled as a single op:

    $s .= "a=$a b=$b\n"

As a special case, if the LHS of an assignment is a lexical variable or
C<my $s>, the op itself handles retrieving the lexical variable, which
is faster.

In general, the more the expression includes a mix of constant strings and
variable expressions, the longer the expression, and the more it mixes
together non-utf8 and utf8 strings, the more marked the performance
improvement. For example on a C<x86_64> system, this code has been
benchmarked running four times faster:

    my $s;
    my $a = "ab\x{100}cde";
    my $b = "fghij";
    my $c = "\x{101}klmn";

    for my $i (1..10_000_000) {
        $s = "\x{100}wxyz";
        $s .= "foo=$a bar=$b baz=$c";
    }

In addition, C<sprintf> expressions which have a constant format
containing only C<%s> and C<%%> format elements, and which have a fixed
number of arguments, are now also optimised into a C<multiconcat> op.

=item *

The C<ref()> builtin is now much faster in boolean context, since it no
longer bothers to construct a temporary string like C<Foo=ARRAY(0x134af48)>.

=item *

C<keys()> in void and scalar contexts is now more efficient.

=item *

The common idiom of comparing the result of C<index()> with -1 is now
specifically optimised, e.g.

    if (index(...) != -1) { ... }

=item *

C<for()> loops and similar constructs are now more efficient in most cases.

=item *

L<File::Glob> has been modified to remove unnecessary backtracking and
recursion, thanks to Russ Cox. See L<https://research.swtch.com/glob>
for more details.

=item *

The XS-level C<SvTRUE()> API function is now more efficient.

=item *

Various integer-returning ops are now more efficient in scalar/boolean context.

=item *

Slightly improved performance when parsing stash names.
L<[perl #129990]|https://rt.perl.org/Public/Bug/Display.html?id=129990>

=item *

Calls to C<require> for an already loaded module are now slightly faster.
L<[perl #132171]|https://rt.perl.org/Public/Bug/Display.html?id=132171>

=item *

The performance of pattern matching C<[[:ascii:]]> and C<[[:^ascii:]]>
has been improved significantly except on EBCDIC platforms.

=item *

Various optimizations have been applied to matching regular expression
patterns, so under the right circumstances, significant performance
gains may be noticed.  But in an application with many varied patterns,
little overall improvement will likely be seen.

=item *

Other optimizations have been applied to UTF-8 handling, but these are
not typically a major factor in most applications.

=back
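
The C<index()> comparison mentioned in the list above typically appears
in code like the following sketch, where C<contains()> is a
hypothetical helper:

    use strict;
    use warnings;

    # Hypothetical helper built on the comparison that is now
    # specifically optimised.
    sub contains {
        my ($haystack, $needle) = @_;
        return index($haystack, $needle) != -1 ? 1 : 0;
    }

    print contains("perl5280delta", "5280"), "\n";  # 1
    print contains("perl5280delta", "5300"), "\n";  # 0

No source change is needed to benefit; the optimisation applies to the
comparison as already written.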

=head1 Modules and Pragmata

Key highlights in this release across several modules:

=head2 Removal of use vars

The usage of C<use vars> has been discouraged since the introduction of
C<our> in Perl 5.6.0. Where possible the usage of this pragma has now been
removed from the Perl source code.

This had a slight effect (for the better) on the output of WARNING_BITS in
L<B::Deparse>.

=head2 Use of DynaLoader changed to XSLoader in many modules

XSLoader is more modern, and most modules already require perl 5.6 or
greater, so no functionality is lost by switching. In some cases, we have
also made changes to the local implementation that may not be reflected in
the version on CPAN due to a desire to maintain more backwards
compatibility.

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Archive::Tar> has been upgraded from version 2.24 to 2.30.

This update also handled CVE-2018-12015: directory traversal
vulnerability.
L<[cpan #125523]|https://rt.cpan.org/Ticket/Display.html?id=125523>

=item *

L<arybase> has been upgraded from version 0.12 to 0.15.

=item *

L<Attribute::Handlers> has been upgraded from version 0.99 to 1.01.

=item *

L<attributes> has been upgraded from version 0.29 to 0.33.

=item *

L<B> has been upgraded from version 1.68 to 1.74.

=item *

L<B::Concise> has been upgraded from version 0.999 to 1.003.

=item *

L<B::Debug> has been upgraded from version 1.24 to 1.26.

NOTE: L<B::Debug> is deprecated and may be removed from a future version
of Perl.

=item *

L<B::Deparse> has been upgraded from version 1.40 to 1.48.

It includes many bug fixes, and in particular, it now deparses variable
attributes correctly:

    my $x :foo;  # used to deparse as
                 # 'attributes'->import('main', \$x, 'foo'), my $x;

=item *

L<base> has been upgraded from version 2.25 to 2.27.

=item *

L<bignum> has been upgraded from version 0.47 to 0.49.

=item *

L<blib> has been upgraded from version 1.06 to 1.07.

=item *

L<bytes> has been upgraded from version 1.05 to 1.06.

=item *

L<Carp> has been upgraded from version 1.42 to 1.50.

If a package on the call stack contains a constant named C<ISA>, Carp no
longer throws a "Not a GLOB reference" error.

L<Carp>, when generating stack traces, now attempts to work around
longstanding bugs resulting from Perl's non-reference-counted stack.
L<[perl #52610]|https://rt.perl.org/Ticket/Display.html?id=52610>

Carp has been modified to avoid assuming that objects cannot be
overloaded without the L<overload> module loaded (this can happen with
objects created by XS modules).  Previously, infinite recursion would
result if an XS-defined overload method itself called Carp.
L<[perl #132828]|https://rt.perl.org/Ticket/Display.html?id=132828>

Carp now avoids using C<overload::StrVal>, partly because older versions
of L<overload> (included with perl 5.14 and earlier) load L<Scalar::Util>
at run time, which will fail if Carp has been invoked after a syntax error.

=item *

L<charnames> has been upgraded from version 1.44 to 1.45.

=item *

L<Compress::Raw::Zlib> has been upgraded from version 2.074 to 2.076.

This addresses a security vulnerability in older versions of the 'zlib' library
(which is bundled with Compress-Raw-Zlib).

=item *

L<Config::Extensions> has been upgraded from version 0.01 to 0.02.

=item *

L<Config::Perl::V> has been upgraded from version 0.28 to 0.29.

=item *

L<CPAN> has been upgraded from version 2.18 to 2.20.

=item *

L<Data::Dumper> has been upgraded from version 2.167 to 2.170.

Quoting of glob names now obeys the Useqq option
L<[perl #119831]|https://rt.perl.org/Ticket/Display.html?id=119831>.

Attempts to set an option to C<undef> through a combined getter/setter
method are no longer mistaken for getter calls
L<[perl #113090]|https://rt.perl.org/Ticket/Display.html?id=113090>.

=item *

L<Devel::Peek> has been upgraded from version 1.26 to 1.27.

=item *

L<Devel::PPPort> has been upgraded from version 3.35 to 3.40.

L<Devel::PPPort> has moved from cpan-first to perl-first maintenance.

Primary responsibility for the code in Devel::PPPort has moved into core perl.
In a practical sense there should be no change except that hopefully it will
stay more up to date with changes made to symbols in perl, rather than needing
to be updated after the fact.

=item *

L<Digest::SHA> has been upgraded from version 5.96 to 6.01.

=item *

L<DirHandle> has been upgraded from version 1.04 to 1.05.

=item *

L<DynaLoader> has been upgraded from version 1.42 to 1.45.

Its documentation now shows the use of C<__PACKAGE__> and direct object
syntax
L<[perl #132247]|https://rt.perl.org/Ticket/Display.html?id=132247>.

=item *

L<Encode> has been upgraded from version 2.88 to 2.97.

=item *

L<encoding> has been upgraded from version 2.19 to 2.22.

=item *

L<Errno> has been upgraded from version 1.28 to 1.29.

=item *

L<experimental> has been upgraded from version 0.016 to 0.019.

=item *

L<Exporter> has been upgraded from version 5.72 to 5.73.

=item *

L<ExtUtils::CBuilder> has been upgraded from version 0.280225 to 0.280230.

=item *

L<ExtUtils::Constant> has been upgraded from version 0.23 to 0.25.

=item *

L<ExtUtils::Embed> has been upgraded from version 1.34 to 1.35.

=item *

L<ExtUtils::Install> has been upgraded from version 2.04 to 2.14.

=item *

L<ExtUtils::MakeMaker> has been upgraded from version 7.24 to 7.34.

=item *

L<ExtUtils::Miniperl> has been upgraded from version 1.06 to 1.08.

=item *

L<ExtUtils::ParseXS> has been upgraded from version 3.34 to 3.39.

=item *

L<ExtUtils::Typemaps> has been upgraded from version 3.34 to 3.38.

=item *

L<ExtUtils::XSSymSet> has been upgraded from version 1.3 to 1.4.

=item *

L<feature> has been upgraded from version 1.47 to 1.52.

=item *

L<fields> has been upgraded from version 2.23 to 2.24.

=item *

L<File::Copy> has been upgraded from version 2.32 to 2.33.

It will now use the sub-second precision variant of utime() supplied by
L<Time::HiRes> where available
L<[perl #132401]|https://rt.perl.org/Ticket/Display.html?id=132401>.

=item *

L<File::Fetch> has been upgraded from version 0.52 to 0.56.

=item *

L<File::Glob> has been upgraded from version 1.28 to 1.31.

=item *

L<File::Path> has been upgraded from version 2.12_01 to 2.15.

=item *

L<File::Spec> and L<Cwd> have been upgraded from version 3.67 to 3.74.

=item *

L<File::stat> has been upgraded from version 1.07 to 1.08.

=item *

L<FileCache> has been upgraded from version 1.09 to 1.10.

=item *

L<Filter::Simple> has been upgraded from version 0.93 to 0.95.

=item *

L<Filter::Util::Call> has been upgraded from version 1.55 to 1.58.

=item *

L<GDBM_File> has been upgraded from version 1.15 to 1.17.

Its documentation now explains that C<each> and C<delete> don't mix in
hashes tied to this module
L<[perl #117449]|https://rt.perl.org/Ticket/Display.html?id=117449>.

It will now retry opening with an acceptable block size if asking gdbm
to default the block size failed
L<[perl #119623]|https://rt.perl.org/Ticket/Display.html?id=119623>.

=item *

L<Getopt::Long> has been upgraded from version 2.49 to 2.5.

=item *

L<Hash::Util::FieldHash> has been upgraded from version 1.19 to 1.20.

=item *

L<I18N::Langinfo> has been upgraded from version 0.13 to 0.17.

This module is now available on all platforms, emulating the system
L<nl_langinfo(3)> on systems that lack it.  Some caveats apply, as
L<detailed in its documentation|I18N::Langinfo>, the most severe being
that, except for MS Windows, the C<CODESET> item is not implemented on
those systems, always returning C<"">.

It now sets the UTF-8 flag in its returned scalar if the string contains
legal non-ASCII UTF-8, and the locale is UTF-8
L<[perl #127288]|https://rt.perl.org/Ticket/Display.html?id=127288>.

This update also fixes a bug in which the underlying locale was ignored
for C<RADIXCHAR> (which was always returned as a dot) and C<THOUSEP>
(which was always empty).  Now the locale-appropriate values are returned.
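
As a minimal sketch of querying these items: the values returned depend
on the locale in effect, so no output is shown, and on platforms where
C<nl_langinfo(3)> is emulated C<CODESET> may be the empty string.

    use strict;
    use warnings;
    use POSIX qw(setlocale LC_ALL);
    use I18N::Langinfo qw(langinfo CODESET RADIXCHAR THOUSEP);

    # Adopt the locale from the environment; the results vary with it.
    setlocale(LC_ALL, "");

    my %item = (
        CODESET   => langinfo(CODESET),
        RADIXCHAR => langinfo(RADIXCHAR),
        THOUSEP   => langinfo(THOUSEP),
    );
    printf "%-9s = '%s'\n", $_, $item{$_} for sort keys %item;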

=item *

L<I18N::LangTags> has been upgraded from version 0.42 to 0.43.

=item *

L<if> has been upgraded from version 0.0606 to 0.0608.

=item *

L<IO> has been upgraded from version 1.38 to 1.39.

=item *

L<IO::Socket::IP> has been upgraded from version 0.38 to 0.39.

=item *

L<IPC::Cmd> has been upgraded from version 0.96 to 1.00.

=item *

L<JSON::PP> has been upgraded from version 2.27400_02 to 2.97001.

=item *

The C<libnet> distribution has been upgraded from version 3.10 to 3.11.

=item *

L<List::Util> has been upgraded from version 1.46_02 to 1.49.

=item *

L<Locale::Codes> has been upgraded from version 3.42 to 3.56.

B<NOTE>: L<Locale::Codes> is scheduled to be removed from core in Perl 5.30.

=item *

L<Locale::Maketext> has been upgraded from version 1.28 to 1.29.

=item *

L<Math::BigInt> has been upgraded from version 1.999806 to 1.999811.

=item *

L<Math::BigInt::FastCalc> has been upgraded from version 0.5005 to 0.5006.

=item *

L<Math::BigRat> has been upgraded from version 0.2611 to 0.2613.

=item *

L<Module::CoreList> has been upgraded from version 5.20170530 to 5.20180622.

=item *

L<mro> has been upgraded from version 1.20 to 1.22.

=item *

L<Net::Ping> has been upgraded from version 2.55 to 2.62.

=item *

L<NEXT> has been upgraded from version 0.67 to 0.67_01.

=item *

L<ODBM_File> has been upgraded from version 1.14 to 1.15.

=item *

L<Opcode> has been upgraded from version 1.39 to 1.43.

=item *

L<overload> has been upgraded from version 1.28 to 1.30.

=item *

L<PerlIO::encoding> has been upgraded from version 0.25 to 0.26.

=item *

L<PerlIO::scalar> has been upgraded from version 0.26 to 0.29.

=item *

L<PerlIO::via> has been upgraded from version 0.16 to 0.17.

=item *

L<Pod::Functions> has been upgraded from version 1.11 to 1.13.

=item *

L<Pod::Html> has been upgraded from version 1.2202 to 1.24.

A title for the HTML document will now be automatically generated by
default from a "NAME" section in the POD document, as it used to be
before the module was rewritten to use L<Pod::Simple::XHTML> to do the
core of its job
L<[perl #110520]|https://rt.perl.org/Ticket/Display.html?id=110520>.

=item *

L<Pod::Perldoc> has been upgraded from version 3.28 to 3.2801.

=item *

The C<podlators> distribution has been upgraded from version 4.09 to 4.10.

Man page references and function names now follow the Linux man page
formatting standards, instead of the Solaris standard.

=item *

L<POSIX> has been upgraded from version 1.76 to 1.84.

Some more cautions were added about using locale-specific functions in
threaded applications.

=item *

L<re> has been upgraded from version 0.34 to 0.36.

=item *

L<Scalar::Util> has been upgraded from version 1.46_02 to 1.50.

=item *

L<SelfLoader> has been upgraded from version 1.23 to 1.25.

=item *

L<Socket> has been upgraded from version 2.020_03 to 2.027.

=item *

L<sort> has been upgraded from version 2.02 to 2.04.

=item *

L<Storable> has been upgraded from version 2.62 to 3.08.

=item *

L<Sub::Util> has been upgraded from version 1.48 to 1.49.

=item *

L<subs> has been upgraded from version 1.02 to 1.03.

=item *

L<Sys::Hostname> has been upgraded from version 1.20 to 1.22.

=item *

L<Term::ReadLine> has been upgraded from version 1.16 to 1.17.

=item *

L<Test> has been upgraded from version 1.30 to 1.31.

=item *

L<Test::Harness> has been upgraded from version 3.38 to 3.42.

=item *

L<Test::Simple> has been upgraded from version 1.302073 to 1.302133.

=item *

L<threads> has been upgraded from version 2.15 to 2.22.

The documentation now better describes the problems that arise when
returning values from threads, and no longer warns about creating threads
in C<BEGIN> blocks.
L<[perl #96538]|https://rt.perl.org/Ticket/Display.html?id=96538>

=item *

L<threads::shared> has been upgraded from version 1.56 to 1.58.

=item *

L<Tie::Array> has been upgraded from version 1.06 to 1.07.

=item *

L<Tie::StdHandle> has been upgraded from version 4.4 to 4.5.

=item *

L<Time::gmtime> has been upgraded from version 1.03 to 1.04.

=item *

L<Time::HiRes> has been upgraded from version 1.9741 to 1.9759.

=item *

L<Time::localtime> has been upgraded from version 1.02 to 1.03.

=item *

L<Time::Piece> has been upgraded from version 1.31 to 1.3204.

=item *

L<Unicode::Collate> has been upgraded from version 1.19 to 1.25.

=item *

L<Unicode::Normalize> has been upgraded from version 1.25 to 1.26.

=item *

L<Unicode::UCD> has been upgraded from version 0.68 to 0.70.

The function C<num> now accepts an optional parameter to help in
diagnosing error returns.
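
A sketch of how the optional parameter might be used, assuming (as the
L<Unicode::UCD> documentation describes) that it is a reference to a
scalar which receives the length of the valid leading portion of the
input.  The variable names are illustrative only.

    use strict;
    use warnings;
    use Unicode::UCD qw(num);

    my $input = "12a";
    my $valid_length;
    my $value = num($input, \$valid_length);

    if (defined $value) {
        print "numeric value: $value\n";
    }
    else {
        # Older versions of the module ignore the second argument,
        # hence the fallback for an unset length.
        my $shown = $valid_length // 'unknown';
        print "'$input' is not a number; valid prefix length: $shown\n";
    }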

=item *

L<User::grent> has been upgraded from version 1.01 to 1.02.

=item *

L<User::pwent> has been upgraded from version 1.00 to 1.01.

=item *

L<utf8> has been upgraded from version 1.19 to 1.21.

=item *

L<vars> has been upgraded from version 1.03 to 1.04.

=item *

L<version> has been upgraded from version 0.9917 to 0.9923.

=item *

L<VMS::DCLsym> has been upgraded from version 1.08 to 1.09.

=item *

L<VMS::Stdio> has been upgraded from version 2.41 to 2.44.

=item *

L<warnings> has been upgraded from version 1.37 to 1.42.

It now includes new functions with names ending in C<_at_level>, allowing
callers to specify the exact call frame.
L<[perl #132468]|https://rt.perl.org/Ticket/Display.html?id=132468>
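
The following is a minimal sketch of one of these functions,
C<warnings::warnif_at_level()>, assuming that level 0 denotes the
immediate caller, as described in the L<warnings> documentation.  The
package C<My::Util> and the function C<old_api()> are hypothetical.

    package My::Util;
    use strict;
    use warnings;

    sub old_api {
        # Level 0 refers to the code that called old_api(): the warning
        # is emitted only if 'deprecated' warnings are enabled there.
        warnings::warnif_at_level('deprecated', 0,
                                  'old_api() is deprecated');
        return 42;
    }

    package main;

    my $answer = My::Util::old_api();
    print "answer: $answer\n";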

=item *

L<XS::Typemap> has been upgraded from version 0.15 to 0.16.

=item *

L<XSLoader> has been upgraded from version 0.27 to 0.30.

Its documentation now shows the use of C<__PACKAGE__> and direct object
syntax in its example C<DynaLoader> usage
L<[perl #132247]|https://rt.perl.org/Ticket/Display.html?id=132247>.

Platforms that use C<mod2fname> to edit the names of loadable
libraries now look for bootstrap (.bs) files under the correct,
non-edited name.

=back

=head2 Removed Modules and Pragmata

=over 4

=item *

The C<VMS::stdio> compatibility shim has been removed.

=back

=head1 Documentation

=head2 Changes to Existing Documentation

We have attempted to update the documentation to reflect the changes
listed in this document.  If you find any we have missed, send email
to L<perlbug@perl.org|mailto:perlbug@perl.org>.

Additionally, the following selected changes have been made:

=head3 L<perlapi>

=over 4

=item *

The API functions C<perl_parse()>, C<perl_run()>, and C<perl_destruct()>
are now documented comprehensively, where previously the only
documentation was a reference to the L<perlembed> tutorial.

=item *

The documentation of C<newGIVENOP()> has been belatedly updated to
account for the removal of lexical C<$_>.

=item *

The API functions C<newCONSTSUB()> and C<newCONSTSUB_flags()> are
documented much more comprehensively than before.

=back

=head3 L<perldata>

=over 4

=item *

The section "Truth and Falsehood" in L<perlsyn> has been moved into
L<perldata>.

=back

=head3 L<perldebguts>

=over 4

=item *

The description of the conditions under which C<DB::sub()> will be called
has been clarified.
L<[perl #131672]|https://rt.perl.org/Ticket/Display.html?id=131672>

=back

=head3 L<perldiag>

=over 4

=item * L<perldiag/Variable length lookbehind not implemented in regex mE<sol>%sE<sol>>

This entry now suggests more workarounds for an issue that was introduced
in Perl 5.18 (but not documented explicitly in its perldelta): some
Unicode C</i> rules cause a few sequences such as

 (?<!st)

to be considered variable length, and hence disallowed.

=item * "Use of state $_ is experimental" in L<perldiag>

This entry has been removed, as the experimental support of this construct was
removed in perl 5.24.0.

=item *

The diagnostic C<Initialization of state variables in list context
currently forbidden> has changed to C<Initialization of state variables
in list currently forbidden>, because list-context initialization of
single aggregate state variables is now permitted.
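
As an illustration of the distinction, here is a short sketch (the
subroutine name is made up) in which a single array state variable is
initialized from a list, while the multi-variable form remains an error:

    use v5.28;
    use warnings;

    sub next_id {
        # List-context initialization of a single aggregate state
        # variable; it runs only on the first call.
        state @pool = (10, 20, 30);
        return shift @pool;
    }

    my @ids = map { next_id() } 1 .. 3;
    say "@ids";    # prints "10 20 30"

    # Still forbidden: state ($x, @rest) = (1, 2, 3);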

=back

=head3 L<perlembed>

=over 4

=item *

The examples in L<perlembed> have been made more portable in the way
they exit, and the example that gets an exit code from the embedded Perl
interpreter now gets it from the right place.  The examples that pass
a constructed argv to Perl now show the mandatory null C<argv[argc]>.

=item *

An example in L<perlembed> used the string value of C<ERRSV> as a
format string when calling croak().  If that string contains format
codes such as C<%s> this could crash the program.

This has been changed to a call to croak_sv().

An alternative could have been to supply a trivial format string:

  croak("%s", SvPV_nolen(ERRSV));

or as a special case for C<ERRSV> simply:

  croak(NULL);

=back

=head3 L<perlfunc>

=over 4

=item *

There is now a note that warnings generated by built-in functions are
documented in L<perldiag> and L<warnings>.
L<[perl #116080]|https://rt.perl.org/Ticket/Display.html?id=116080>

=item *

The documentation for the C<exists> operator no longer says that
autovivification behaviour "may be fixed in a future release".
We've determined that we're not going to change the default behaviour.
L<[perl #127712]|https://rt.perl.org/Ticket/Display.html?id=127712>
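
The behaviour in question can be seen in a small sketch such as the
following, where the hash and key names are purely illustrative:

    use strict;
    use warnings;

    my %config;

    # Testing a nested key autovivifies the intermediate hash ...
    my $found = exists $config{db}{host};

    # ... so the top-level key now exists, although nothing was assigned.
    my $vivified = exists $config{db};

    # Checking each level in turn avoids the side effect.
    my %other;
    my $safe      = exists $other{db} && exists $other{db}{host};
    my $untouched = !exists $other{db};

    printf "found=%d vivified=%d untouched=%d\n",
        $found ? 1 : 0, $vivified ? 1 : 0, $untouched ? 1 : 0;

This prints C<found=0 vivified=1 untouched=1>.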

=item *

A couple of small details in the documentation for the C<bless> operator
have been clarified.
L<[perl #124428]|https://rt.perl.org/Ticket/Display.html?id=124428>

=item *

The description of C<@INC> hooks in the documentation for C<require>
has been corrected to say that filter subroutines receive a useless
first argument.
L<[perl #115754]|https://rt.perl.org/Ticket/Display.html?id=115754>

=item *

The documentation of C<ref> has been rewritten for clarity.

=item *

The documentation of C<use> now explains what syntactically qualifies
as a version number for its module version checking feature.

=item *

The documentation of C<warn> has been updated to reflect that since Perl
5.14 it has treated complex exception objects in a manner equivalent
to C<die>.
L<[perl #121372]|https://rt.perl.org/Ticket/Display.html?id=121372>

=item *

The documentation of C<die> and C<warn> has been revised for clarity.

=item *

The documentation of C<each> has been improved, with a slightly more
explicit description of the sharing of iterator state, and with
caveats regarding the fragility of while-each loops.
L<[perl #132644]|https://rt.perl.org/Ticket/Display.html?id=132644>

=item *

Clarification to C<require> was added to explain the differences between

    require Foo::Bar;
    require "Foo/Bar.pm";

=back

=head3 L<perlgit>

=over 4

=item *

The precise rules for identifying C<smoke-me> branches are now stated.

=back

=head3 L<perlguts>

=over 4

=item *

The section on reference counting in L<perlguts> has been heavily revised,
to describe references in the way a programmer needs to think about them
rather than in terms of the physical data structures.

=item *

The documentation related to UTF-8 multibyte sequences has been improved.

=back

=head3 L<perlintern>

=over 4

=item *

The internal functions C<newXS_len_flags()> and C<newATTRSUB_x()> are
now documented.

=back

=head3 L<perlobj>

=over 4

=item *

The documentation about C<DESTROY> methods has been corrected, updated,
and revised, especially in regard to how they interact with exceptions.
L<[perl #122753]|https://rt.perl.org/Ticket/Display.html?id=122753>

=back

=head3 L<perlop>

=over 4

=item *

The description of the C<x> operator in L<perlop> has been clarified.
L<[perl #132460]|https://rt.perl.org/Ticket/Display.html?id=132460>

=item *

L<perlop> has been updated to note that C<qw>'s whitespace rules differ
from those of C<split> in that only ASCII whitespace is used.

=item *

The general explanation of operator precedence and associativity has
been corrected and clarified.
L<[perl #127391]|https://rt.perl.org/Ticket/Display.html?id=127391>

=item *

The documentation for the C<\> referencing operator now explains the
unusual context that it supplies to its operand.
L<[perl #131061]|https://rt.perl.org/Ticket/Display.html?id=131061>

=back

=head3 L<perlrequick>

=over 4

=item *

The descriptions of metacharacters and character classes have been
clarified.

=back

=head3 L<perlretut>

=over 4

=item *

The description of metacharacters has been clarified.

=back

=head3 L<perlrun>

=over 4

=item *

The differences between B<< -M >> and B<< -m >> have been clarified.
L<[perl #131518]|https://rt.perl.org/Ticket/Display.html?id=131518>

=back
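
In terms of Perl source, L<perlrun> describes B<-m>I<module> as
executing C<use module ();> and B<-M>I<module> as executing
C<use module;>.  A sketch of the practical difference, using
L<List::Util> merely as a convenient example module:

    use strict;
    use warnings;

    # Equivalent of -mList::Util: load the module, import nothing.
    use List::Util ();
    my $total = List::Util::sum(1 .. 4);    # must be fully qualified

    # Equivalent of -MList::Util=max: load the module and import max().
    use List::Util qw(max);
    my $largest = max(3, 9, 4);

    print "total=$total largest=$largest\n";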

=head3 L<perlsec>

=over 4

=item *

The documentation about set-id scripts has been updated and revised.
L<[perl #74142]|https://rt.perl.org/Ticket/Display.html?id=74142>

=item *

A section about using C<sudo> to run Perl scripts has been added.

=back

=head3 L<perlsyn>

=over 4

=item *

The section "Truth and Falsehood" in L<perlsyn> has been removed from
that document, where it didn't belong, and merged into the existing
paragraph on the same topic in L<perldata>.

=item *

The means to disambiguate between code blocks and hash constructors,
already documented in L<perlref>, are now documented in L<perlsyn> too.
L<[perl #130958]|https://rt.perl.org/Ticket/Display.html?id=130958>

=back
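
The two disambiguation idioms can be sketched as follows; the data is
illustrative only:

    use strict;
    use warnings;

    my @names = qw(alice bob);

    # A leading "+" marks the braces as an anonymous hash constructor,
    # so each element becomes a hash reference.
    my @records = map { +{ name => $_, length => length $_ } } @names;

    # A leading ";" marks the braces as a code block, here one that
    # returns a key/value pair for each element.
    my %length_of = map {; $_ => length $_ } @names;

    print "$records[1]{name} has $length_of{bob} characters\n";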

=head3 L<perluniprops>

=over 4

=item *

L<perluniprops> has been updated to note that C<\p{Word}> now includes
code points matching the C<\p{Join_Control}> property.  The change to
the property was made in Perl 5.18, but not documented until now.  There
are currently only two code points that match this property: U+200C (ZERO
WIDTH NON-JOINER) and U+200D (ZERO WIDTH JOINER).
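
A quick way to check this on a given perl is a sketch along these lines,
which tests both code points against the two properties:

    use strict;
    use warnings;

    for my $cp (0x200C, 0x200D) {
        my $char = chr $cp;
        printf "U+%04X: \\p{Word} %s, \\p{Join_Control} %s\n", $cp,
            ($char =~ /\A\p{Word}\z/         ? "matches" : "does not match"),
            ($char =~ /\A\p{Join_Control}\z/ ? "matches" : "does not match");
    }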

=item *

For each binary table or property, the documentation now includes which
characters in the range C<\x00-\xFF> it matches, as well as a list of
the first few ranges of code points matched above that.

=back

=head3 L<perlvar>

=over 4

=item *

The entry for C<$+> in L<perlvar> has been expanded to describe the
handling of multiply-named capturing groups.

=back
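
For orientation, this sketch shows only the basic behaviour of C<$+>
with alternative capture groups; it does not attempt to reproduce the
multiply-named case that the expanded entry covers.

    use strict;
    use warnings;

    my @captured;
    for my $line ("Version: 5.28.0", "Revision: 42") {
        # Only one alternative matches, so only one group is set; $+
        # yields the text of whichever group that was.
        if ($line =~ /Version: (.*)|Revision: (.*)/) {
            push @captured, $+;
            print "captured: $+\n";
        }
    }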

=head3 L<perlfunc>, L<perlop>, L<perlsyn>

=over 4

=item *

The documentation of the special cases in the condition expression of a
while loop, such as implicit C<defined> and assignment to C<$_>, has
been improved in various places.
L<[perl #132644]|https://rt.perl.org/Ticket/Display.html?id=132644>

=back

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 New Diagnostics

=head3 New Errors

=over 4

=item *

L<Can't "goto" into a "given" block|perldiag/"Can't E<quot>gotoE<quot> into a E<quot>givenE<quot> block">

(F) A "goto" statement was executed to jump into the middle of a C<given>
block.  You can't get there from here.  See L<perlfunc/goto>.

=item *

L<Can't "goto" into a binary or list expression|perldiag/"Can't E<quot>gotoE<quot> into a binary or list expression">

Use of C<goto> to jump into the parameter of a binary or list operator has
been prohibited, to prevent crashes and stack corruption.
L<[perl #130936]|https://rt.perl.org/Ticket/Display.html?id=130936>

You may only enter the I<first> argument of an operator that takes a fixed
number of arguments, since this is a case that will not cause stack
corruption.
L<[perl #132854]|https://rt.perl.org/Ticket/Display.html?id=132854>

=back

=head3 New Warnings

=over 4

=item *

L<Old package separator used in string|perldiag/"Old package separator used in string">

(W syntax) You used the old package separator, "'", in a variable
named inside a double-quoted string; e.g., C<"In $name's house">.  This
is equivalent to C<"In $name::s house">.  If you meant the former, put
a backslash before the apostrophe (C<"In $name\'s house">).

=item *

L<perldiag/Locale '%s' contains (at least) the following characters which
have unexpected meanings: %s  The Perl program will use the expected
meanings>

=back

=head2 Changes to Existing Diagnostics

=over 4

=item *

A false-positive warning that was issued when using a
numerically-quantified sub-pattern in a recursive regex has been
silenced. L<[perl #131868]|https://rt.perl.org/Public/Bug/Display.html?id=131868>

=item *

The warning about useless use of a concatenation operator in void context
is now generated for expressions with multiple concatenations, such as
C<$a.$b.$c>, which used to mistakenly not warn.
L<[perl #6997]|https://rt.perl.org/Ticket/Display.html?id=6997>

=item *

Warnings that a variable or subroutine "masks earlier declaration in same
...", or that an C<our> variable has been redeclared, have been moved to a
new warnings category "shadow".  Previously they were in category "misc".
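
A sketch of how the new category can be used to silence just these
warnings.  String C<eval> is used here only so that the compile-time
warning can be observed by a handler installed at run time.

    use v5.28;
    use warnings;

    my @caught;
    local $SIG{__WARN__} = sub { push @caught, $_[0] };

    # Redeclaring a lexical in the same scope triggers the warning.
    eval q{ my $x = 1; my $x = 2; 1 } or die $@;

    # With the new category disabled, the same code compiles quietly.
    eval q{ no warnings 'shadow'; my $y = 1; my $y = 2; 1 } or die $@;

    say scalar(@caught), " warning(s) caught";
    print @caught;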

=item *

The deprecation warning from C<Sys::Hostname::hostname()> saying that
it doesn't accept arguments now states the Perl version in which the
warning will be upgraded to an error.
L<[perl #124349]|https://rt.perl.org/Ticket/Display.html?id=124349>

=item *

The L<perldiag> entry for the error regarding a set-id script has been
expanded to make clear that the error is reporting a specific security
vulnerability, and to advise how to fix it.

=item *

The C<< Unable to flush stdout >> error message was missing a trailing
newline. [debian #875361]

=back

=head1 Utility Changes

=head2 L<perlbug>

=over 4

=item *

C<--help> and C<--version> options have been added.

=back

=head1 Configuration and Compilation

=over 4

=item * C89 requirement

Perl has been documented as requiring a C89 compiler to build since October
1998.  A variety of simplifications have now been made to Perl's internals to
rely on the features specified by the C89 standard. We believe that this
internal change hasn't altered the set of platforms that Perl builds on, but
please report a bug if Perl now has new problems building on your platform.

=item *

On GCC, C<-Werror=pointer-arith> is now enabled by default,
disallowing arithmetic on void and function pointers.

=item *

Where an HTML version of the documentation is installed, the HTML
documents now use relative links to refer to each other.  Links from
the index page of L<perlipc> to the individual section documents are
now correct.
L<[perl #110056]|https://rt.perl.org/Ticket/Display.html?id=110056>

=item *

F<lib/unicore/mktables> now correctly canonicalizes the names of the
dependencies stored in the files it generates.

F<regen/mk_invlists.pl>, unlike the other F<regen/*.pl> scripts, used
C<$0> to name itself in the dependencies stored in the files it
generates.  It now uses a literal so that the path stored in the
generated files doesn't depend on how F<regen/mk_invlists.pl> is
invoked.

This lack of canonical names could cause test failures in F<t/porting/regen.t>.
L<[perl #132925]|https://rt.perl.org/Ticket/Display.html?id=132925>

=item * New probes

=over 2

=item HAS_BUILTIN_ADD_OVERFLOW

=item HAS_BUILTIN_MUL_OVERFLOW

=item HAS_BUILTIN_SUB_OVERFLOW

=item HAS_THREAD_SAFE_NL_LANGINFO_L

=item HAS_LOCALECONV_L

=item HAS_MBRLEN

=item HAS_MBRTOWC

=item HAS_MEMRCHR

=item HAS_NANOSLEEP

=item HAS_STRNLEN

=item HAS_STRTOLD_L

=item I_WCHAR

=back

=back

=head1 Testing

=over 4

=item *

Testing of the XS-APItest directory is now done in parallel, where
applicable.

=item *

Perl now includes a default F<.travis.yml> file for Travis CI testing
on GitHub mirrors.
L<[perl #123981]|https://rt.perl.org/Ticket/Display.html?id=123981>

=item *

The watchdog timer count in F<re/pat_psycho.t> can now be overridden.

This test can take a long time to run, so there is a timer to keep
this in check (currently, 5 minutes).  The test now checks
the environment variable C<< PERL_TEST_TIME_OUT_FACTOR >>; if set,
the timeout setting is multiplied by its value.

=item *

F<harness> no longer waits for 30 seconds when running F<t/io/openpid.t>.
L<[perl #121028]|https://rt.perl.org/Ticket/Display.html?id=121028>
L<[perl #132867]|https://rt.perl.org/Ticket/Display.html?id=132867>

=back

=head1 Packaging

For the past few years we have released perl using three different archive
formats: bzip2 (C<.bz2>), LZMA2 (C<.xz>) and gzip (C<.gz>). Since xz compresses
better and decompresses faster, and gzip is more compatible and uses less
memory, we have dropped the C<.bz2> archive format with this release.
(If this poses a problem, do let us know; see L</Reporting Bugs>, below.)

=head1 Platform Support

=head2 Discontinued Platforms

=over 4

=item PowerUX / Power MAX OS

Compiler hints and other support for these apparently long-defunct
platforms have been removed.

=back

=head2 Platform-Specific Notes

=over 4

=item CentOS

Compilation on CentOS 5 is now fixed.

=item Cygwin

A build with the quadmath library can now be done on Cygwin.

=item Darwin

Perl now correctly uses reentrant functions, like C<asctime_r>, on
versions of Darwin that have support for them.

=item FreeBSD

FreeBSD's F<< /usr/share/mk/sys.mk >> specifies C<< -O2 >> for
architectures other than ARM and MIPS. By default, perl is now compiled
with the same optimization levels.

=item VMS

Several fix-ups have been made to F<configure.com>, marking functions
that VMS has (or doesn't have).

CRTL features can now be set by embedders before invoking Perl by using
the C<decc$feature_set> and C<decc$feature_set_value> functions.
Previously any attempt to set features after image initialization was
ignored.

=item Windows

=over 4

=item *

Support for compiling perl on Windows using Microsoft Visual Studio 2017
(containing Visual C++ 14.1) has been added.

=item *

Visual C++ compiler version detection has been improved to work on non-English
language systems.

=item *

We now set C<$Config{libpth}> correctly for 64-bit builds using Visual C++
versions earlier than 14.1.

=back

=back

=head1 Internal Changes

=over 4

=item *

A new optimisation phase has been added to the compiler,
C<optimize_optree()>, which does a top-down scan of a complete optree
just before the peephole optimiser is run. This phase is not currently
hookable.

=item *

An C<OP_MULTICONCAT> op has been added. At C<optimize_optree()> time, a
chain of C<OP_CONCAT> and C<OP_CONST> ops, together optionally with an
C<OP_STRINGIFY> and/or C<OP_SASSIGN>, are combined into a single
C<OP_MULTICONCAT> op. The op is of type C<UNOP_AUX>, and the aux array
contains the argument count, plus a pointer to a constant string and a set
of segment lengths. For example with

    my $x = "foo=$foo, bar=$bar\n";

the constant string would be C<"foo=, bar=\n"> and the segment lengths
would be (4,6,1). If the string contains characters such as C<\x80>, whose
representation changes under utf8, two sets of strings plus lengths are
precomputed and stored.
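
The relationship between the constant string, the segment lengths and
the arguments can be modelled in plain Perl.  This is only an
illustrative sketch of the data layout described above, not the
interpreter's actual implementation; the values C<1> and C<2> stand in
for C<$foo> and C<$bar>.

    use strict;
    use warnings;

    # Hypothetical model of the aux data for "foo=$foo, bar=$bar\n".
    my $const   = "foo=, bar=\n";
    my @lengths = (4, 6, 1);
    my @args    = ("1", "2");    # stand-ins for $foo and $bar

    # Cut the constant string into segments of the recorded lengths.
    my @segments = unpack join(" ", map { "a$_" } @lengths), $const;

    # Interleave: segment, argument, segment, argument, final segment.
    my $result = "";
    for my $i (0 .. $#segments) {
        $result .= $segments[$i];
        $result .= $args[$i] if $i < @args;
    }

    print $result;    # prints "foo=1, bar=2" followed by a newline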

=item *

Direct access to L<C<PL_keyword_plugin>|perlapi/PL_keyword_plugin> is not
safe in the presence of multithreading. A new
L<C<wrap_keyword_plugin>|perlapi/wrap_keyword_plugin> function has been
added to allow XS modules to safely define custom keywords even when
loaded from a thread, analogous to L<C<PL_check>|perlapi/PL_check> /
L<C<wrap_op_checker>|perlapi/wrap_op_checker>.

=item *

The C<PL_statbuf> interpreter variable has been removed.

=item *

The deprecated function C<to_utf8_case()>, accessible from XS code, has
been removed.

=item *

A new function
L<C<is_utf8_invariant_string_loc()>|perlapi/is_utf8_invariant_string_loc>
has been added that is like
L<C<is_utf8_invariant_string()>|perlapi/is_utf8_invariant_string>
but takes an extra pointer parameter into which is stored the location
of the first variant character, if any are found.

=item *

A new function, L<C<Perl_langinfo()>|perlapi/Perl_langinfo>, has been
added.  It is an (almost) drop-in replacement for the system
C<nl_langinfo(3)>, but it also works on platforms that lack that function,
is more thread-safe, and hides some gotchas with locale handling
from the caller.  Code that uses it needn't use L<C<localeconv(3)>>
(and be affected by the gotchas) to find the decimal point, thousands
separator, or currency symbol.  See L<perlapi/Perl_langinfo>.

=item *

A new API function L<C<sv_rvunweaken()>|perlapi/sv_rvunweaken> has
been added to complement L<C<sv_rvweaken()>|perlapi/sv_rvweaken>.
The implementation was taken from L<Scalar::Util/unweaken>.

=item *

A new flag, C<SORTf_UNSTABLE>, has been added. This will allow a
future commit to make mergesort unstable when the user specifies
C<no sort stable>, since it has been decided that mergesort should remain
stable by default.

=item *

XS modules can now automatically get reentrant versions of system
functions on threaded perls.

By adding

    #define PERL_REENTRANT

near the beginning of an C<XS> file, it will be compiled so that
whatever reentrant functions perl knows about on that system will
automatically and invisibly be used instead of the plain, non-reentrant
versions.  For example, if you write C<getpwnam()> in your code, on a
system that has C<getpwnam_r()> all calls to the former will be translated
invisibly into the latter.  This does not happen except on threaded
perls, as they aren't needed otherwise.  Be aware that which functions
have reentrant versions varies from system to system.

=item *

The C<PERL_NO_OP_PARENT> build define is no longer supported, which means
that perl is now always built with C<PERL_OP_PARENT> enabled.

=item *

The format and content of the non-utf8 transliteration table attached to
the C<op_pv> field of C<OP_TRANS>/C<OP_TRANSR> ops has changed. It's now a
C<struct OPtrans_map>.

=item *

A new compiler C<#define>, C<dTHX_DEBUGGING>, has been added.  This is
useful for XS or C code that needs the thread context only because its
debugging statements, which are compiled only under C<-DDEBUGGING>, need
one.

=item *

A new API function L<perlapi/Perl_setlocale> has been added.

=item *

L<perlapi/sync_locale> has been revised to return a boolean as to
whether the system was using the global locale or not.

=item *

A new kind of magic scalar, called a "nonelem" scalar, has been introduced.
It is stored in an array to denote a non-existent element, whenever such an
element is accessed in a potential lvalue context.  It replaces the
existing "defelem" (deferred element) magic wherever this is possible,
being significantly more efficient.  This means that
C<some_sub($sparse_array[$nonelem])> no longer has to create a new magic
defelem scalar each time, as long as the element is within the array.

It partially fixes the rare bug of deferred elements getting out of sync
with their arrays when the array is shifted or unshifted.
L<[perl #132729]|https://rt.perl.org/Ticket/Display.html?id=132729>

=back

=head1 Selected Bug Fixes

=over 4

=item *

List assignment (C<aassign>) could in some rare cases allocate an
entry on the mortals stack and leave the entry uninitialized, leading to
possible crashes.
L<[perl #131570]|https://rt.perl.org/Ticket/Display.html?id=131570>

=item *

Attempting to apply an attribute to an C<our> variable where a
function of that name already exists could result in a NULL pointer
being supplied where an SV was expected, crashing perl.
L<[perl #131597]|https://rt.perl.org/Ticket/Display.html?id=131597>

=item *

C<split ' '> now correctly handles the argument being split when in the
scope of the L<< C<unicode_strings>|feature/"The 'unicode_strings' feature"
>> feature. Previously, when a string using the single-byte internal
representation contained characters that are whitespace by Unicode rules but
not by ASCII rules, it treated those characters as part of fields rather
than as field separators.
L<[perl #130907]|https://rt.perl.org/Ticket/Display.html?id=130907>
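
A sketch of the affected case, using NO-BREAK SPACE as the character
that is whitespace only by Unicode rules.  On a perl with this fix the
string is expected to split into three fields; earlier versions would
be expected to report two.

    use strict;
    use warnings;
    use feature 'unicode_strings';

    # "\xA0" (NO-BREAK SPACE) is whitespace by Unicode rules but not by
    # ASCII rules; this string uses the single-byte representation.
    my $text   = "alpha\xA0beta gamma";
    my @fields = split ' ', $text;

    print scalar(@fields), " fields\n";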

=item *

Several built-in functions previously had bugs that could cause them to
write to the internal stack without allocating room for the item being
written. In rare situations, this could have led to a crash. These bugs have
now been fixed, and if any similar bugs are introduced in future, they will
be detected automatically in debugging builds.

The newly introduced internal stack usage checks are also performed
by the C<entersub> operator when calling XSUBs.  This means we can
report which XSUB failed to allocate enough stack space.
L<[perl #131975]|https://rt.perl.org/Public/Bug/Display.html?id=131975>

=item *

Using a symbolic ref with postderef syntax as the key in a hash lookup was
yielding an assertion failure on debugging builds.
L<[perl #131627]|https://rt.perl.org/Ticket/Display.html?id=131627>

=item *

Array and hash variables whose names begin with a caret now admit indexing
inside their curlies when interpolated into strings, as in C<<
"${^CAPTURE[0]}" >> to index C<@{^CAPTURE}>.
L<[perl #131664]|https://rt.perl.org/Ticket/Display.html?id=131664>
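
For example, a minimal sketch (the date string is arbitrary):

    use v5.28;
    use warnings;

    my $summary = "";
    if ("2018-06-22" =~ /(\d+)-(\d+)-(\d+)/) {
        # @{^CAPTURE} holds the capture groups, so index 0 is $1.
        $summary = "year ${^CAPTURE[0]}, day ${^CAPTURE[2]}";
        say $summary;
    }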

=item *

Fetching the name of a glob that was previously UTF-8 but wasn't any
longer would return that name flagged as UTF-8.
L<[perl #131263]|https://rt.perl.org/Ticket/Display.html?id=131263>

=item *

The perl C<sprintf()> function (via the underlying C function
C<Perl_sv_vcatpvfn_flags()>) has been heavily reworked to fix many minor
bugs, including the integer wrapping of large width and precision
specifiers and potential buffer overruns. It has also been made faster in
many cases.

=item *

Exiting from an C<eval>, whether normally or via an exception, now always
frees temporary values (possibly calling destructors) I<before> setting
C<$@>. For example:

    sub DESTROY { eval { die "died in DESTROY"; } }
    eval { bless []; };
    # $@ used to be equal to "died in DESTROY" here; it's now "".

=item *

Fixed a duplicate symbol failure with C<-flto -mieee-fp> builds.
F<pp.c> defined C<_LIB_VERSION> which C<-lieee> already defines.
L<[perl #131786]|https://rt.perl.org/Ticket/Display.html?id=131786>

=item *

The tokenizer no longer consumes the exponent part of a floating
point number if it's incomplete.
L<[perl #131725]|https://rt.perl.org/Ticket/Display.html?id=131725>

=item *

On non-threaded builds, C<m/$null/> where C<$null> is an empty
string is no longer treated as if the C</o> flag was present when the
previous matching match operator included the C</o> flag.  The
rewriting used to implement this behavior could confuse the
interpreter.  This matches the behaviour of threaded builds.
L<[perl #124368]|https://rt.perl.org/Ticket/Display.html?id=124368>

=item *

Parsing a C<sub> definition could cause a use after free if the C<sub>
keyword was followed by whitespace including newlines (and comments).
L<[perl #131836]|https://rt.perl.org/Public/Bug/Display.html?id=131836>

=item *

The tokenizer now correctly adjusts a parse pointer when skipping
whitespace in a C<< ${identifier} >> construct.
L<[perl #131949]|https://rt.perl.org/Public/Bug/Display.html?id=131949>

=item *

Accesses to C<${^LAST_FH}> no longer assert after using any of a
variety of I/O operations on a non-glob.
L<[perl #128263]|https://rt.perl.org/Public/Bug/Display.html?id=128263>

=item *

The XS-level C<Copy()>, C<Move()>, C<Zero()> macros and their variants now
assert if the pointers supplied are C<NULL>.  ISO C considers
supplying NULL pointers to the functions these macros are built upon
as undefined behaviour even when their count parameters are zero.
Based on these assertions and the original bug report three macro
calls were made conditional.
L<[perl #131746]|https://rt.perl.org/Public/Bug/Display.html?id=131746>
L<[perl #131892]|https://rt.perl.org/Public/Bug/Display.html?id=131892>

=item *

Only the C<=> operator is permitted for defining defaults for
parameters in subroutine signatures.  Previously other assignment
operators, e.g. C<+=>, were also accidentally permitted.
L<[perl #131777]|https://rt.perl.org/Public/Bug/Display.html?id=131777>

=item *

Package names are now always included in C<:prototype> warnings.
L<[perl #131833]|https://rt.perl.org/Public/Bug/Display.html?id=131833>

=item *

The C<je_old_stack_hwm> field, previously only found in the C<jmpenv>
structure on debugging builds, has been added to non-debug builds as
well. This fixes an issue with some CPAN modules caused by the size of
this structure varying between debugging and non-debugging builds.
L<[perl #131942]|https://rt.perl.org/Public/Bug/Display.html?id=131942>

=item *

The arguments to the C<ninstr()> macro are now correctly parenthesized.

=item *

A NULL pointer dereference in the C<S_regmatch()> function has been
fixed.
L<[perl #132017]|https://rt.perl.org/Public/Bug/Display.html?id=132017>

=item *

Calling L<exec PROGRAM LIST|perlfunc/exec PROGRAM LIST> with an empty C<LIST>
has been fixed.  This should call C<execvp()> with an empty C<argv> array
(containing only the terminating C<NULL> pointer), but was instead just
returning false (and not setting L<C<$!>|perlvar/$!>).
L<[perl #131730]|https://rt.perl.org/Public/Bug/Display.html?id=131730>

=item *

The C<gv_fetchmeth_sv> C function stopped working properly in Perl 5.22 when
fetching a constant with a UTF-8 name if that constant subroutine was stored in
the stash as a simple scalar reference, rather than a full typeglob.  This has
been corrected.

=item *

Single-letter debugger commands followed by an argument which starts with
punctuation (e.g. C<p$^V> and C<x@ARGV>) now work again.  They had been
wrongly requiring a space between the command and the argument.
L<[perl #120174]|https://rt.perl.org/Public/Bug/Display.html?id=120174>

=item *

L<splice|perlfunc/splice ARRAY,OFFSET,LENGTH,LIST> now throws an exception
("Modification of a read-only value attempted") when modifying a read-only
array.  Until now it had been silently modifying the array.  The new behaviour
is consistent with the behaviour of L<push|perlfunc/push ARRAY,LIST> and
L<unshift|perlfunc/unshift ARRAY,LIST>.
L<[perl #131000]|https://rt.perl.org/Public/Bug/Display.html?id=131000>

=item *

C<stat()>, C<lstat()>, and file test operators now fail if given a
filename containing a nul character, in the same way that C<open()>
already fails.

=item *

C<stat()>, C<lstat()>, and file test operators now reliably set C<$!> when
failing due to being applied to a closed or otherwise invalid file handle.

=item *

File test operators for Unix permission bits that don't exist on a
particular platform, such as C<-k> (sticky bit) on Windows, now check that
the file being tested exists before returning the blanket false result,
and yield the appropriate errors if the argument doesn't refer to a file.

=item *

Fixed a 'read before buffer' overrun when parsing a range starting with
C<\N{}> at the beginning of the character set for the transliteration
operator.
L<[perl #132245]|https://rt.perl.org/Ticket/Display.html?id=132245>

=item *

Fixed a leaked scalar when parsing an empty C<\N{}> at compile-time.
L<[perl #132245]|https://rt.perl.org/Ticket/Display.html?id=132245>

=item *

Calling C<do $path> on a directory or block device now yields a meaningful
error code in C<$!>.
L<[perl #125774]|https://rt.perl.org/Ticket/Display.html?id=125774>

=item *

Regexp substitution using an overloaded replacement value that provides
a tainted stringification now correctly taints the resulting string.
L<[perl #115266]|https://rt.perl.org/Ticket/Display.html?id=115266>

=item *

Lexical sub declarations in C<do> blocks such as C<do { my sub lex; 123 }>
could corrupt the stack, erasing items already on the stack in the
enclosing statement.  This has been fixed.
L<[perl #132442]|https://rt.perl.org/Ticket/Display.html?id=132442>

=item *

C<pack> and C<unpack> can now handle repeat counts and lengths that
exceed two billion.
L<[perl #119367]|https://rt.perl.org/Ticket/Display.html?id=119367>

=item *

Digits past the radix point in octal and binary floating point literals
now have the correct weight on platforms where a floating point
significand doesn't fit into an integer type.

=item *

The canonical truth value no longer has a spurious special meaning as a
callable subroutine.  It used to be a magic placeholder for a missing
C<import> or C<unimport> method, but is now treated like any other string
C<1>.
L<[perl #126042]|https://rt.perl.org/Ticket/Display.html?id=126042>

=item *

C<system> now reduces its arguments to strings in the parent process, so
any effects of stringifying them (such as overload methods being called
or warnings being emitted) are visible in the way the program expects.
L<[perl #121105]|https://rt.perl.org/Ticket/Display.html?id=121105>

=item *

The C<readpipe()> built-in function now checks at compile time that
it has only one parameter expression, and puts it in scalar context,
thus ensuring that it doesn't corrupt the stack at runtime.
L<[perl #4574]|https://rt.perl.org/Ticket/Display.html?id=4574>

=item *

C<sort> now performs correct reference counting when aliasing C<$a> and
C<$b>, thus avoiding premature destruction and leakage of scalars if they
are re-aliased during execution of the sort comparator.
L<[perl #92264]|https://rt.perl.org/Ticket/Display.html?id=92264>

=item *

C<reverse> with no operand, reversing C<$_> by default, is no longer in
danger of corrupting the stack.
L<[perl #132544]|https://rt.perl.org/Ticket/Display.html?id=132544>

=item *

C<exec>, C<system>, et al are no longer liable to have their argument
lists corrupted by reentrant calls and by magic such as tied scalars.
L<[perl #129888]|https://rt.perl.org/Ticket/Display.html?id=129888>

=item *

Perl's own C<malloc> no longer gets confused by attempts to allocate
more than a gigabyte on a 64-bit platform.
L<[perl #119829]|https://rt.perl.org/Ticket/Display.html?id=119829>

=item *

Stacked file test operators in a sort comparator expression no longer
cause a crash.
L<[perl #129347]|https://rt.perl.org/Ticket/Display.html?id=129347>

=item *

An identity C<tr///> transformation on a reference is no longer mistaken
for that reference for the purposes of deciding whether it can be
assigned to.
L<[perl #130578]|https://rt.perl.org/Ticket/Display.html?id=130578>

=item *

Lengthy hexadecimal, octal, or binary floating point literals no
longer cause undefined behaviour when parsing digits that are of such
low significance that they can't affect the floating point value.
L<[perl #131894]|https://rt.perl.org/Ticket/Display.html?id=131894>

=item *

C<open $$scalarref...> and similar invocations no longer leak the file
handle.
L<[perl #115814]|https://rt.perl.org/Ticket/Display.html?id=115814>

=item *

Some convoluted kinds of regexp no longer cause an arithmetic overflow
when compiled.
L<[perl #131893]|https://rt.perl.org/Ticket/Display.html?id=131893>

=item *

The default typemap, by avoiding C<newGVgen>, now no longer leaks when
XSUBs return file handles (C<PerlIO *> or C<FILE *>).
L<[perl #115814]|https://rt.perl.org/Ticket/Display.html?id=115814>

=item *

Creating a C<BEGIN> block as an XS subroutine with a prototype no longer
crashes because of the early freeing of the subroutine.

=item *

The C<printf> format specifier C<%.0f> no longer rounds incorrectly
L<[perl #47602]|https://rt.perl.org/Ticket/Display.html?id=47602>,
and now shows the correct sign for a negative zero.

=item * 

Fixed an issue where the error C<< Scalar value @arrayname[0] better
written as $arrayname >> would give an error C<< Cannot printf Inf with 'c' >>
when arrayname starts with C<< Inf >>.
L<[perl #132645]|https://rt.perl.org/Ticket/Display.html?id=132645>

=item *

The Perl implementation of C<< getcwd() >> in C<< Cwd >> in the PathTools
distribution now behaves the same as the XS implementation on errors: it
returns an error, and sets C<< $! >>.
L<[perl #132648]|https://rt.perl.org/Ticket/Display.html?id=132648>

=item *

Vivify array elements when putting them on the stack.
Fixes L<[perl #8910]|https://rt.perl.org/Ticket/Display.html?id=8910>
(reported in April 2002).

=item *

Fixed parsing of braced subscript after parens. Fixes
L<[perl #8045]|https://rt.perl.org/Ticket/Display.html?id=8045>
(reported in December 2001).

=item *

C<tr/non_utf8/long_non_utf8/c> could give the wrong results when the
length of the replacement character list was greater than 0x7fff.

=item *

C<tr/non_utf8/non_utf8/cd> failed to add the implied
C<\x{100}-\x{7fffffff}> to the search character list.

=item *

Compilation failures within "perl-within-perl" constructs, such as with
string interpolation and the right part of C<s///e>, now cause
compilation to abort earlier.

Previously compilation could continue in order to report other errors,
but the failed sub-parse could leave partly parsed constructs on the
parser shift-reduce stack, confusing the parser, leading to perl
crashes.
L<[perl #125351]|https://rt.perl.org/Ticket/Display.html?id=125351>

=item *

On threaded perls where the decimal point (radix) character is not a
dot, it has been possible for a race to occur between threads when one
needs to use the real radix character (such as with C<sprintf>).  This has
now been fixed by use of a mutex on systems without thread-safe locales,
and the problem just doesn't come up on those with thread-safe locales.

=item *

Errors while compiling a regex character class could sometimes trigger an
assertion failure.
L<[perl #132163]|https://rt.perl.org/Ticket/Display.html?id=132163>

=back
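
The C<splice> change listed above can be checked with a short script.
This is only a sketch: it assumes C<Internals::SvREADONLY> as a
convenient way to mark an array read-only, and C<@frozen> is a
hypothetical name chosen for illustration.

    use strict;
    use warnings;

    # Mark the array itself (not its elements) as read-only.
    my @frozen = (1, 2, 3);
    Internals::SvREADONLY(@frozen, 1);

    # On perl 5.28.0 or later the splice should die instead of
    # silently modifying the array.
    my $ok  = eval { splice(@frozen, 1, 1); 1 };
    my $err = $ok ? '' : $@;

    print $ok ? "splice succeeded\n" : "splice died: $err";

On older perls the same script is expected to report success, which
makes it a quick way to tell the two behaviours apart.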

=head1 Acknowledgements

Perl 5.28.0 represents approximately 13 months of development since Perl
5.26.0 and contains approximately 730,000 lines of changes across 2,200
files from 77 authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 580,000 lines of changes to 1,300 .pm, .t, .c and .h files.

Perl continues to flourish into its fourth decade thanks to a vibrant
community of users and developers. The following people are known to have
contributed the improvements that became Perl 5.28.0:

Aaron Crane, Abigail, Ævar Arnfjörð Bjarmason, Alberto Simões, Alexandr
Savca, Andrew Fresh, Andy Dougherty, Andy Lester, Aristotle Pagaltzis, Ask
Bjørn Hansen, Chris 'BinGOs' Williams, Craig A. Berry, Dagfinn Ilmari
Mannsåker, Dan Collins, Daniel Dragan, David Cantrell, David Mitchell,
Dmitry Ulanov, Dominic Hargreaves, E. Choroba, Eric Herman, Eugen Konkov,
Father Chrysostomos, Gene Sullivan, George Hartzell, Graham Knop, Harald
Jörg, H.Merijn Brand, Hugo van der Sanden, Jacques Germishuys, James E
Keenan, Jarkko Hietaniemi, Jerry D. Hedden, J. Nick Koston, John Lightsey,
John Peacock, John P. Linderman, John SJ Anderson, Karen Etheridge, Karl
Williamson, Ken Brown, Ken Cotterill, Leon Timmermans, Lukas Mai, Marco
Fontani, Marc-Philip Werner, Matthew Horsfall, Neil Bowers, Nicholas Clark,
Nicolas R., Niko Tyni, Pali, Paul Marquess, Peter John Acklam, Reini Urban,
Renee Baecker, Ricardo Signes, Robin Barker, Sawyer X, Scott Lanning, Sergey
Aleynikov, Shirakata Kentaro, Shoichi Kaji, Slaven Rezic, Smylers, Steffen
Müller, Steve Hay, Sullivan Beck, Thomas Sibley, Todd Rinaldo, Tomasz
Konojacki, Tom Hukins, Tom Wyant, Tony Cook, Vitali Peil, Yves Orton,
Zefram.

The list above is almost certainly incomplete as it is automatically
generated from version control history. In particular, it does not include
the names of the (very much appreciated) contributors who reported issues to
the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core. We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please
see the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the perl bug database
at L<https://rt.perl.org/> .  There may also be information at
L<http://www.perl.org/> , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications which make it
inappropriate to send to a publicly archived mailing list, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION>
for details of how to report the issue.

=head1 Give Thanks

If you wish to thank the Perl 5 Porters for the work we have done in Perl 5,
you can do so by running the C<perlthanks> program:

    perlthanks

This will send an email to the Perl 5 Porters list with your show of thanks.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perlqnx - Perl version 5 on QNX

=head1 DESCRIPTION

As of perl5.7.2 all tests pass under:

  QNX 4.24G
  Watcom 10.6 with Beta/970211.wcc.update.tar.F
  socket3r.lib Nov21 1996.

As of perl5.8.1 there is at least one test still failing.

Some tests may complain under known circumstances.

See below and hints/qnx.sh for more information.

Under QNX 6.2.0 there are still a few tests which fail.
See below and hints/qnx.sh for more information.

=head2 Required Software for Compiling Perl on QNX4

As with many unix ports, this one depends on a few "standard"
unix utilities which are not necessarily standard for QNX4.

=over 4

=item /bin/sh

This is used heavily by Configure and then by
perl itself. QNX4's version is fine, but Configure
will choke on the 16-bit version, so if you are
running QNX 4.22, link /bin/sh to /bin32/ksh.

=item ar

This is the standard unix library builder.
We use wlib. With Watcom 10.6, when wlib is
linked as "ar", it behaves like ar and all is
fine. Under 9.5, a cover is required. One is
included in ../qnx.

=item nm

This is used (optionally) by Configure to list
the contents of libraries. I will generate
a cover function on the fly in the UU directory.

=item cpp

Configure and perl need a way to invoke a C
preprocessor. I have created a simple cover
for cc which does the right thing. Without this,
Configure will create its own wrapper which works,
but it doesn't handle some of the command line arguments
that perl will throw at it.

=item make

You really need GNU make to compile this. GNU make
ships by default with QNX 4.23, but you can get it
from quics for earlier versions.

=back

=head2 Outstanding Issues with Perl on QNX4

There is no support for dynamically linked libraries in QNX4.

If you wish to compile with the Socket extension, you need
to have the TCP/IP toolkit, and you need to make sure that
-lsocket locates the correct copy of socket3r.lib. Beware
that the Watcom compiler ships with a stub version of
socket3r.lib which has very little functionality. Also
beware the order in which wlink searches directories for
libraries. You may have /usr/lib/socket3r.lib pointing to
the correct library, but wlink may pick up
/usr/watcom/10.6/usr/lib/socket3r.lib instead. Make sure
they both point to the correct library, that is,
/usr/tcptk/current/usr/lib/socket3r.lib.

The following tests may report errors under QNX4:

dist/Cwd/Cwd.t will complain if `pwd` and cwd don't give
the same results. cwd calls `fullpath -t`, so if you
cd `fullpath -t` before running the test, it will
pass.

lib/File/Find/taint.t will complain if '.' is in your
PATH. The PATH test is triggered because cwd calls
`fullpath -t`.

ext/IO/lib/IO/t/io_sock.t: Subtests 14 and 22 are skipped due to
the fact that the functionality to read back the non-blocking
status of a socket is not implemented in QNX's TCP/IP. This has
been reported to QNX and it may work with later versions of
TCP/IP.

t/io/tell.t: Subtest 27 is failing. We are still investigating.

=head2 QNX auxiliary files

The files in the "qnx" directory are:

=over 4

=item qnx/ar

A script that emulates the standard unix archive (aka library)
utility.  Under Watcom 10.6, ar is linked to wlib and provides the
expected interface. With Watcom 9.5, a cover function is
required. This one is fairly crude but has proved adequate for
compiling perl.

=item qnx/cpp

A script that provides C preprocessing functionality.  Configure can
generate a similar cover, but it doesn't handle all the command-line
options that perl throws at it. This might be reasonably placed in
/usr/local/bin.

=back

=head2 Outstanding issues with perl under QNX6

The following tests are still failing for Perl 5.8.1 under QNX 6.2.0:

  op/sprintf.........................FAILED at test 91
  lib/Benchmark......................FAILED at test 26

This is due to a bug in the C library's printf routine.
printf("'%e'", 0. ) produces '0.000000e+0', but ANSI requires
'0.000000e+00'. QNX has acknowledged the bug.
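
Whether a given C library shows this problem can be checked from Perl
itself.  The following is a minimal sketch, assuming that perl's
C<sprintf> hands C<%e> formatting to the C library on your platform;
the variable names are illustrative only.

    use strict;
    use warnings;

    # Format zero in exponent notation.
    my $formatted = sprintf("%e", 0);

    # ANSI C requires at least two exponent digits.
    my $two_digits = $formatted =~ /e[+-]\d{2,}\z/ ? 1 : 0;

    print "'$formatted' ",
        ($two_digits ? "has" : "lacks"),
        " the required two-digit exponent\n";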

=head2 Cross-compilation

Perl supports cross-compiling to QNX NTO through the
Native Development Kit (NDK) for the BlackBerry 10.  This means that you
can cross-compile for both ARM and x86 versions of the platform.

=head3 Setting up a cross-compilation environment

You can download the NDK from
L<http://developer.blackberry.com/native/downloads/>.

See
L<http://developer.blackberry.com/native/documentation/cascades/getting_started/setting_up.html>
for instructions to set up your device prior to attempting anything else.

Once you've installed the NDK and set up your device, all that's
left to do is to set up the cross-compilation
environment.  BlackBerry provides a script, C<bbndk-env.sh> (occasionally
named something like C<bbndk-env_10_1_0_4828.sh>) which can be used
to do this.  However, there's a bit of a snag that we have to work through:
the script modifies PATH so that 'gcc' or 'ar' point to their
cross-compilation equivalents, which breaks the build process.

So instead you'll want to do something like this:

    $ orig_path=$PATH
    $ source $location_of_bbndk/bbndk-env*.sh
    $ export PATH="$orig_path:$PATH"

Besides putting the cross-compiler and the rest of the toolchain in your
PATH, this will also provide the QNX_TARGET variable, which
we will pass to Configure through -Dsysroot.

=head3 Preparing the target system

It's quite possible that the target system doesn't have a readily
available /tmp, so it's generally safer to do something like this:

 $ ssh $TARGETUSER@$TARGETHOST 'rm -rf perl; mkdir perl; mkdir perl/tmp'
 $ export TARGETDIR=`ssh $TARGETUSER@$TARGETHOST pwd`/perl
 $ export TARGETENV="export TMPDIR=$TARGETDIR/tmp; "

Later on, we'll pass this to Configure through -Dtargetenv.

=head3 Calling Configure

If you are targeting an ARM device -- which currently includes the vast
majority of phones and tablets -- you'll want to pass
-Dcc=arm-unknown-nto-qnx8.0.0eabi-gcc to Configure.  Alternatively, if you
are targeting an x86 device, or using the simulator provided with the NDK,
you should specify -Dcc=ntox86-gcc instead.

A sample Configure invocation looks something like this:

    ./Configure -des -Dusecrosscompile \
        -Dsysroot=$QNX_TARGET          \
        -Dtargetdir=$TARGETDIR         \
        -Dtargetenv="$TARGETENV"       \
        -Dcc=ntox86-gcc                \
        -Dtarghost=... # Usual cross-compilation options

=head1 AUTHOR

Norton T. Allen (allen@huarp.harvard.edu)

If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perlhurd - Perl version 5 on Hurd

=head1 DESCRIPTION

If you want to use Perl on the Hurd, I recommend using the Debian
GNU/Hurd distribution (see L<http://www.debian.org/>), even if an
official, stable release has not yet been made.  The old "gnu-0.2"
binary distribution will most certainly have additional problems.

=head2 Known Problems with Perl on Hurd 

The Perl test suite may still report some errors on the Hurd.  The
"lib/anydbm" and "pragma/warnings" tests will almost certainly fail.
Neither failure is really specific to the Hurd, as indicated by the
test suite output.

The socket tests may fail if the network is not configured.  You have
to make "/hurd/pfinet" the translator for "/servers/socket/2", giving
it the right arguments.  Try "/hurd/pfinet --help" for more
information.

Here are the statistics for Perl 5.005_62 on my system:

 Failed Test  Status Wstat Total Fail  Failed  List of failed
 -----------------------------------------------------------------------
 lib/anydbm.t                 12    1   8.33%  12
 pragma/warnings             333    1   0.30%  215

 8 tests and 24 subtests skipped.
 Failed 2/229 test scripts, 99.13% okay. 2/10850 subtests failed,
     99.98% okay.

There are quite a few systems out there that do worse!

However, since I am running a very recent Hurd snapshot, in which a lot of
bugs that were exposed by the Perl test suite have been fixed, you may
encounter more failures.  Likely candidates are: "op/stat", "lib/io_pipe",
"lib/io_sock", "lib/io_udp" and "lib/time".

In any case, if you're seeing failures beyond those mentioned in this
document, please consider upgrading to the latest Hurd before reporting
the failure as a bug.

=head1 AUTHOR

Mark Kettenis <kettenis@gnu.org>

Last Updated: Fri, 29 Oct 1999 22:50:30 +0200

=head1 NAME

perldebguts - Guts of Perl debugging 

=head1 DESCRIPTION

This is not L<perldebug>, which tells you how to use
the debugger.  This manpage describes low-level details concerning
the debugger's internals, which range from difficult to impossible
to understand for anyone who isn't incredibly intimate with Perl's guts.
Caveat lector.

=head1 Debugger Internals

Perl has special debugging hooks at compile-time and run-time used
to create debugging environments.  These hooks are not to be confused
with the I<perl -Dxxx> command described in L<perlrun>, which is
usable only if a special Perl is built per the instructions in the
F<INSTALL> podpage in the Perl source tree.

For example, whenever you call Perl's built-in C<caller> function
from the package C<DB>, the arguments that the corresponding stack
frame was called with are copied to the C<@DB::args> array.  These
mechanisms are enabled by calling Perl with the B<-d> switch.
Specifically, the following additional features are enabled
(cf. L<perlvar/$^P>):

=over 4

=item *

Perl inserts the contents of C<$ENV{PERL5DB}> (or C<BEGIN {require
'perl5db.pl'}> if not present) before the first line of your program.

=item *

Each array C<@{"_<$filename"}> holds the lines of $filename for a
file compiled by Perl.  The same is also true for C<eval>ed strings
that contain subroutines, or which are currently being executed.
The $filename for C<eval>ed strings looks like C<(eval 34)>.

Values in this array are magical in numeric context: they compare
equal to zero only if the line is not breakable.

=item *

Each hash C<%{"_<$filename"}> contains breakpoints and actions keyed
by line number.  Individual entries (as opposed to the whole hash)
are settable.  Perl only cares about Boolean true here, although
the values used by F<perl5db.pl> have the form
C<"$break_condition\0$action">.  

The same holds for evaluated strings that contain subroutines, or
which are currently being executed.  The $filename for C<eval>ed strings
looks like C<(eval 34)>.

=item *

Each scalar C<${"_<$filename"}> contains C<"_<$filename">.  This is
also the case for evaluated strings that contain subroutines, or
which are currently being executed.  The $filename for C<eval>ed
strings looks like C<(eval 34)>.

=item *

After each C<require>d file is compiled, but before it is executed,
C<DB::postponed(*{"_<$filename"})> is called if the subroutine
C<DB::postponed> exists.  Here, the $filename is the expanded name of
the C<require>d file, as found in the values of %INC.

=item *

After each subroutine C<subname> is compiled, the existence of
C<$DB::postponed{subname}> is checked.  If this key exists,
C<DB::postponed(subname)> is called if the C<DB::postponed> subroutine
also exists.

=item *

A hash C<%DB::sub> is maintained, whose keys are subroutine names
and whose values have the form C<filename:startline-endline>.
C<filename> has the form C<(eval 34)> for subroutines defined inside
C<eval>s.

=item *

When the execution of your program reaches a point that can hold a
breakpoint, the C<DB::DB()> subroutine is called if any of the variables
C<$DB::trace>, C<$DB::single>, or C<$DB::signal> is true.  These variables
are not C<local>izable.  This feature is disabled when executing
inside C<DB::DB()>, including functions called from it 
unless C<< $^D & (1<<30) >> is true.

=item *

When execution of the program reaches a subroutine call, a call to
C<&DB::sub>(I<args>) is made instead, with C<$DB::sub> holding the
name of the called subroutine. (This doesn't happen if the subroutine
was compiled in the C<DB> package.)

X<&DB::lsub>If the call is to an lvalue subroutine, and C<&DB::lsub>
is defined, C<&DB::lsub>(I<args>) is called instead, otherwise falling
back to C<&DB::sub>(I<args>).

=item *

When execution of the program uses C<goto> to enter a non-XS
subroutine and the 0x80 bit is set in C<$^P>, a call to C<&DB::goto>
is made, with C<$DB::sub> holding the name of the subroutine being
entered.

=back

Note that if C<&DB::sub> needs external data for it to work, no
subroutine call is possible without it. As an example, the standard
debugger's C<&DB::sub> depends on the C<$DB::deep> variable
(it defines how many levels of recursion deep into the debugger you can go
before a mandatory break).  If C<$DB::deep> is not defined, subroutine
calls are not possible, even though C<&DB::sub> exists.
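
The C<%DB::sub> bookkeeping described in the list above can be observed
without loading F<perl5db.pl>.  The sketch below is not part of the
standard debugger: it writes a hypothetical two-line program to a
temporary file, runs it under B<-d> with a do-nothing C<DB::DB> supplied
through C<PERL5DB>, and prints the C<%DB::sub> entry for C<main::hello>
from an C<END> block.  It assumes a platform where the list form of a
piped C<open> is available.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical target: a sub defined on line 1, called on line 2.
    my ($fh, $script) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    print {$fh} "sub hello { return 'hi' }\nhello();\n";
    close $fh or die "close failed: $!";

    # Single quotes keep these variables for the child perl to expand.
    local $ENV{PERL5DB} = join ' ',
        'sub DB::DB {}',
        'END {',
        '  print "$_ => $DB::sub{$_}\n"',
        '    for grep { $_ eq "main::hello" } keys %DB::sub;',
        '}';

    # Run the target under -d and capture what the END block prints.
    open(my $child, '-|', $^X, '-d', $script)
        or die "cannot run $^X: $!";
    my $report = do { local $/; <$child> };
    close $child;

    print $report;

Because no C<&DB::sub> is defined here, subroutine calls in the target
program proceed normally; only the bookkeeping hash is consulted.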

=head2 Writing Your Own Debugger

=head3 Environment Variables

The C<PERL5DB> environment variable can be used to define a debugger.
For example, the minimal "working" debugger (it actually doesn't do anything)
consists of one line:

  sub DB::DB {}

It can easily be defined like this:

  $ PERL5DB="sub DB::DB {}" perl -d your-script

Another brief debugger, slightly more useful, can be created
with only the line:

  sub DB::DB {print ++$i; scalar <STDIN>}

This debugger prints a number which increments for each statement
encountered and waits for you to hit a newline before continuing
to the next statement.

The following debugger is actually useful:

  {
    package DB;
    sub DB  {}
    sub sub {print ++$i, " $sub\n"; &$sub}
  }

It prints the sequence number of each subroutine call and the name of the
called subroutine.  Note that C<&DB::sub> is being compiled into the
package C<DB> through the use of the C<package> directive.

When it starts, the debugger reads your rc file (F<./.perldb> or
F<~/.perldb> under Unix), which can set important options.
(A subroutine (C<&afterinit>) can be defined here as well; it is executed
after the debugger completes its own initialization.)

After the rc file is read, the debugger reads the PERLDB_OPTS
environment variable and uses it to set debugger options. The
contents of this variable are treated as if they were the argument
of an C<o ...> debugger command (q.v. in L<perldebug/"Configurable Options">).

=head3 Debugger Internal Variables

In addition to the file and subroutine-related variables mentioned above,
the debugger also maintains various magical internal variables.

=over 4

=item *

C<@DB::dbline> is an alias for C<@{"::_<current_file"}>, which
holds the lines of the currently-selected file (compiled by Perl), either
explicitly chosen with the debugger's C<f> command, or implicitly by flow
of execution.

Values in this array are magical in numeric context: they compare
equal to zero only if the line is not breakable.

=item *

C<%DB::dbline> is an alias for C<%{"::_<current_file"}>, which
contains breakpoints and actions keyed by line number in
the currently-selected file, either explicitly chosen with the
debugger's C<f> command, or implicitly by flow of execution.

As previously noted, individual entries (as opposed to the whole hash)
are settable.  Perl only cares about Boolean true here, although
the values used by F<perl5db.pl> have the form
C<"$break_condition\0$action">.

=back
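
A custom debugger that wants the same alias can create it by hand.  As
a rough sketch (the two-line target program and all variable names are
made up for illustration, and the list form of a piped C<open> is
assumed to be available), the following tracer aliases C<*DB::dbline>
to the glob for the current file and prints each source line before it
runs.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical target program with two breakable lines.
    my ($fh, $script) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    print {$fh} "my \$x = 1;\n\$x++;\n";
    close $fh or die "close failed: $!";

    # DB::DB finds the current file and line with caller(), aliases
    # *DB::dbline to that file's glob, and prints the source line.
    local $ENV{PERL5DB} = join ' ',
        'sub DB::DB {',
        '  my ($pkg, $file, $line) = caller;',
        '  local *DB::dbline = $main::{"_<$file"};',
        '  print $line, ": ", $DB::dbline[$line];',
        '}';

    # Run the target under -d and capture the trace.
    open(my $child, '-|', $^X, '-d', $script)
        or die "cannot run $^X: $!";
    my $trace = do { local $/; <$child> };
    close $child;

    print $trace;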

=head3 Debugger Customization Functions

Some functions are provided to simplify customization.

=over 4

=item *

See L<perldebug/"Configurable Options"> for a description of options parsed by
C<DB::parse_options(string)>.

=item *

C<DB::dump_trace(skip[,count])> skips the specified number of frames
and returns a list containing information about the calling frames (all
of them, if C<count> is missing).  Each entry is a reference to a hash
with keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine
name, or info about C<eval>), C<args> (C<undef> or a reference to
an array), C<file>, and C<line>.

=item *

C<DB::print_trace(FH, skip[, count[, short]])> prints
formatted info about caller frames.  The last two functions may be
convenient as arguments to C<< < >>, C<< << >> commands.

=back
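
To give a feel for the data shape described for C<DB::dump_trace>,
here is a small sketch that formats such entries by hand.  The frames
in C<@frames> are invented sample data, and C<format_frame> is a
hypothetical helper; under the debugger one would presumably feed it
the list returned by C<DB::dump_trace> instead.

 ```perl
 use strict;
 use warnings;

 # Invented frames shaped like the hash references described above.
 my @frames = (
     { context => '$', sub => 'main::foo', args => [ 1, 'two' ],
       file => 'demo.pl', line => 10 },
     { context => '@', sub => 'main::bar', args => undef,
       file => 'demo.pl', line => 20 },
 );

 # Hypothetical helper: render one frame as a single line of text.
 sub format_frame {
     my ($f) = @_;
     my $args = defined $f->{args} ? join(', ', @{ $f->{args} }) : '';
     return sprintf '%s = %s(%s) called from %s line %d',
         $f->{context}, $f->{sub}, $args, $f->{file}, $f->{line};
 }

 print format_frame($_), "\n" for @frames;
 ```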

Note that any variables and functions that are not documented in
this manpage (or in L<perldebug>) are considered for internal
use only, and as such are subject to change without notice.

=head1 Frame Listing Output Examples

The C<frame> option can be used to control the output of frame 
information.  For example, contrast this expression trace:

 $ perl -de 42
 Stack dump during die enabled outside of evals.

 Loading DB routines from perl5db.pl patch level 0.94
 Emacs support available.

 Enter h or 'h h' for help.

 main::(-e:1):   0
   DB<1> sub foo { 14 }

   DB<2> sub bar { 3 }

   DB<3> t print foo() * bar()
 main::((eval 172):3):   print foo() * bar();
 main::foo((eval 168):2):
 main::bar((eval 170):2):
 42

with this one, once the C<o>ption C<frame=2> has been set:

   DB<4> o f=2
                frame = '2'
   DB<5> t print foo() * bar()
 3:      foo() * bar()
 entering main::foo
  2:     sub foo { 14 };
 exited main::foo
 entering main::bar
  2:     sub bar { 3 };
 exited main::bar
 42

By way of demonstration, we present below a laborious listing
resulting from setting your C<PERLDB_OPTS> environment variable to
the value C<f=n N>, and running I<perl -d -V> from the command line.
Examples using various values of C<n> are shown to give you a feel
for the difference between settings.  Long though it may be, this
is not a complete listing, but only excerpts.

=over 4

=item 1

 entering main::BEGIN
  entering Config::BEGIN
   Package lib/Exporter.pm.
   Package lib/Carp.pm.
  Package lib/Config.pm.
  entering Config::TIEHASH
  entering Exporter::import
   entering Exporter::export
 entering Config::myconfig
  entering Config::FETCH
  entering Config::FETCH
  entering Config::FETCH
  entering Config::FETCH

=item 2

 entering main::BEGIN
  entering Config::BEGIN
   Package lib/Exporter.pm.
   Package lib/Carp.pm.
  exited Config::BEGIN
  Package lib/Config.pm.
  entering Config::TIEHASH
  exited Config::TIEHASH
  entering Exporter::import
   entering Exporter::export
   exited Exporter::export
  exited Exporter::import
 exited main::BEGIN
 entering Config::myconfig
  entering Config::FETCH
  exited Config::FETCH
  entering Config::FETCH
  exited Config::FETCH
  entering Config::FETCH

=item 3

 in  $=main::BEGIN() from /dev/null:0
  in  $=Config::BEGIN() from lib/Config.pm:2
   Package lib/Exporter.pm.
   Package lib/Carp.pm.
  Package lib/Config.pm.
  in  $=Config::TIEHASH('Config') from lib/Config.pm:644
  in  $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
   in  $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li
 in  @=Config::myconfig() from /dev/null:0
  in  $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
  in  $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
  in  $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
  in  $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574
  in  $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574
  in  $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574

=item 4

 in  $=main::BEGIN() from /dev/null:0
  in  $=Config::BEGIN() from lib/Config.pm:2
   Package lib/Exporter.pm.
   Package lib/Carp.pm.
  out $=Config::BEGIN() from lib/Config.pm:0
  Package lib/Config.pm.
  in  $=Config::TIEHASH('Config') from lib/Config.pm:644
  out $=Config::TIEHASH('Config') from lib/Config.pm:644
  in  $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
   in  $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
   out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/
  out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
 out $=main::BEGIN() from /dev/null:0
 in  @=Config::myconfig() from /dev/null:0
  in  $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
  out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574
  in  $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
  out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574
  in  $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
  out $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574
  in  $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574

=item 5

 in  $=main::BEGIN() from /dev/null:0
  in  $=Config::BEGIN() from lib/Config.pm:2
   Package lib/Exporter.pm.
   Package lib/Carp.pm.
  out $=Config::BEGIN() from lib/Config.pm:0
  Package lib/Config.pm.
  in  $=Config::TIEHASH('Config') from lib/Config.pm:644
  out $=Config::TIEHASH('Config') from lib/Config.pm:644
  in  $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
   in  $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
   out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E
  out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
 out $=main::BEGIN() from /dev/null:0
 in  @=Config::myconfig() from /dev/null:0
  in  $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
  out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574
  in  $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574
  out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574

=item 6

 in  $=CODE(0x15eca4)() from /dev/null:0
  in  $=CODE(0x182528)() from lib/Config.pm:2
   Package lib/Exporter.pm.
  out $=CODE(0x182528)() from lib/Config.pm:0
  scalar context return from CODE(0x182528): undef
  Package lib/Config.pm.
  in  $=Config::TIEHASH('Config') from lib/Config.pm:628
  out $=Config::TIEHASH('Config') from lib/Config.pm:628
  scalar context return from Config::TIEHASH:   empty hash
  in  $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
   in  $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
   out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171
   scalar context return from Exporter::export: ''
  out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0
  scalar context return from Exporter::import: ''

=back

In all cases shown above, the line indentation shows the call tree.
If bit 2 of C<frame> is set, a line is printed on exit from a
subroutine as well.  If bit 4 is set, the arguments are printed
along with the caller info.  If bit 8 is set, the arguments are
printed even if they are tied or references.  If bit 16 is set, the
return value is printed, too.

When a package is compiled, a line like this

    Package lib/Carp.pm.

is printed with proper indentation.
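
The meaning of the individual C<frame> bits can be summarized in a
small lookup table.  The sketch below assumes, as the C<frame=2>
example suggests, that "bit 2" means the mask value 2, "bit 4" the
mask value 4, and so on; C<describe_frame> is a hypothetical helper
written for this illustration.

 ```perl
 use strict;
 use warnings;

 # Mask values and their effect, as described in the text above.
 my @FRAME_BITS = (
     [  2, 'print a line on subroutine exit' ],
     [  4, 'print arguments along with caller info' ],
     [  8, 'print arguments even if tied or references' ],
     [ 16, 'print the return value' ],
 );

 # Hypothetical helper: list the effects enabled by a frame value.
 sub describe_frame {
     my ($frame) = @_;
     return map { $_->[1] } grep { $frame & $_->[0] } @FRAME_BITS;
 }

 for my $value (2, 6, 30) {
     print "frame=$value\n";
     print "  $_\n" for describe_frame($value);
 }
 ```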

=head1 Debugging Regular Expressions

There are two ways to enable debugging output for regular expressions.

If your perl is compiled with C<-DDEBUGGING>, you may use the
B<-Dr> flag on the command line.

Otherwise, one can C<use re 'debug'>, which has effects at
compile time and run time.  Since Perl 5.9.5, this pragma is lexically
scoped.
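
As a minimal sketch of the lexical scoping, the two subroutines below
use the same pattern, but only the first one is compiled and run with
C<use re 'debug'> in effect.  The subroutine names and the target
strings are made up for this example; the debugging output itself goes
to STDERR and its exact form depends on the Perl version.

 ```perl
 use strict;
 use warnings;

 sub match_with_debug {
     my ($str) = @_;
     use re 'debug';    # in effect until the end of this block only
     return $str =~ /[bc]d(ef*g)+h[ij]k$/ ? 1 : 0;
 }

 sub match_quietly {
     my ($str) = @_;
     # No pragma here: this pattern produces no debugging output.
     return $str =~ /[bc]d(ef*g)+h[ij]k$/ ? 1 : 0;
 }

 print "debug:  ", match_with_debug('abcdefgeghik'), "\n";
 print "quiet:  ", match_quietly('abcdefg__gh__'), "\n";
 ```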

=head2 Compile-time Output

The debugging output at compile time looks like this:

  Compiling REx '[bc]d(ef*g)+h[ij]k$'
  size 45 Got 364 bytes for offset annotations.
  first at 1
  rarest char g at 0
  rarest char d at 0
     1: ANYOF[bc](12)
    12: EXACT <d>(14)
    14: CURLYX[0] {1,32767}(28)
    16:   OPEN1(18)
    18:     EXACT <e>(20)
    20:     STAR(23)
    21:       EXACT <f>(0)
    23:     EXACT <g>(25)
    25:   CLOSE1(27)
    27:   WHILEM[1/1](0)
    28: NOTHING(29)
    29: EXACT <h>(31)
    31: ANYOF[ij](42)
    42: EXACT <k>(44)
    44: EOL(45)
    45: END(0)
  anchored 'de' at 1 floating 'gh' at 3..2147483647 (checking floating) 
        stclass 'ANYOF[bc]' minlen 7 
  Offsets: [45]
  	1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
  	0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
  	11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
  	0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]  
  Omitting $` $& $' support.

The first line shows the pre-compiled form of the regex.  The second
shows the size of the compiled form (in arbitrary units, usually
4-byte words) and the total number of bytes allocated for the
offset/length table, usually 4+C<size>*8.  The next line shows the
label I<id> of the first node that does a match.

The 

  anchored 'de' at 1 floating 'gh' at 3..2147483647 (checking floating) 
        stclass 'ANYOF[bc]' minlen 7 

line (split into two lines above) contains optimizer
information.  In the example shown, the optimizer found that the match 
should contain a substring C<de> at offset 1, plus substring C<gh>
at some offset between 3 and infinity.  Moreover, when checking for
these substrings (to abandon impossible matches quickly), Perl will check
for the substring C<gh> before checking for the substring C<de>.  The
optimizer may also use the knowledge that the match starts (at the
C<first> I<id>) with a character class, and no string 
shorter than 7 characters can possibly match.

The fields of interest which may appear in this line are

=over 4

=item C<anchored> I<STRING> C<at> I<POS>

=item C<floating> I<STRING> C<at> I<POS1..POS2>

See above.

=item C<matching floating/anchored>

Which substring to check first.

=item C<minlen>

The minimal length of the match.

=item C<stclass> I<TYPE>

Type of first matching node.

=item C<noscan>

Don't scan for the found substrings.

=item C<isall>

Means that the optimizer information is all that the regular
expression contains, and thus one does not need to enter the regex engine at
all.

=item C<GPOS>

Set if the pattern contains C<\G>.

=item C<plus> 

Set if the pattern starts with a repeated char (as in C<x+y>).

=item C<implicit>

Set if the pattern starts with C<.*>.

=item C<with eval> 

Set if the pattern contains eval-groups, such as C<(?{ code })> and
C<(??{ code })>.

=item C<anchored(TYPE)>

If the pattern may match only at a handful of places, with C<TYPE>
being C<SBOL>, C<MBOL>, or C<GPOS>.  See the table below.

=back

If a substring is known to match at end-of-line only, it may be
followed by C<$>, as in C<floating 'k'$>.

The optimizer-specific information is used to avoid entering the (slow)
regex engine on strings that definitely will not match.  If the C<isall>
flag is set, a call to the regex engine may be avoided even when the
optimizer found an appropriate place for the match.

Above the optimizer section is the list of I<nodes> of the compiled
form of the regex.  Each line has format 

C<   >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>)

=head2 Types of Nodes

Here are the current possible types, with short descriptions:

=for comment
This table is generated by regen/regcomp.pl.  Any changes made here
will be lost.

=for regcomp.pl begin

 # TYPE arg-description [num-args] [longjump-len] DESCRIPTION

 # Exit points

 END             no         End of program.
 SUCCEED         no         Return from a subroutine, basically.

 # Line Start Anchors:
 SBOL            no         Match "" at beginning of line: /^/, /\A/
 MBOL            no         Same, assuming multiline: /^/m

 # Line End Anchors:
 SEOL            no         Match "" at end of line: /$/
 MEOL            no         Same, assuming multiline: /$/m
 EOS             no         Match "" at end of string: /\z/

 # Match Start Anchors:
 GPOS            no         Matches where last m//g left off.

 # Word Boundary Opcodes:
 BOUND           no         Like BOUNDA for non-utf8, otherwise match ""
                            between any Unicode \w\W or \W\w
 BOUNDL          no         Like BOUND/BOUNDU, but \w and \W are defined
                            by current locale
 BOUNDU          no         Match "" at any boundary of a given type
                            using Unicode rules
 BOUNDA          no         Match "" at any boundary between \w\W or
                            \W\w, where \w is [_a-zA-Z0-9]
 NBOUND          no         Like NBOUNDA for non-utf8, otherwise match
                            "" between any Unicode \w\w or \W\W
 NBOUNDL         no         Like NBOUND/NBOUNDU, but \w and \W are
                            defined by current locale
 NBOUNDU         no         Match "" at any non-boundary of a given type
                            using Unicode rules
 NBOUNDA         no         Match "" between any \w\w or \W\W, where \w
                            is [_a-zA-Z0-9]

 # [Special] alternatives:
 REG_ANY         no         Match any one character (except newline).
 SANY            no         Match any one character.
 ANYOF           sv 1       Match character in (or not in) this class,
                            single char match only
 ANYOFD          sv 1       Like ANYOF, but /d is in effect
 ANYOFL          sv 1       Like ANYOF, but /l is in effect

 # POSIX Character Classes:
 POSIXD          none       Some [[:class:]] under /d; the FLAGS field
                            gives which one
 POSIXL          none       Some [[:class:]] under /l; the FLAGS field
                            gives which one
 POSIXU          none       Some [[:class:]] under /u; the FLAGS field
                            gives which one
 POSIXA          none       Some [[:class:]] under /a; the FLAGS field
                            gives which one
 NPOSIXD         none       complement of POSIXD, [[:^class:]]
 NPOSIXL         none       complement of POSIXL, [[:^class:]]
 NPOSIXU         none       complement of POSIXU, [[:^class:]]
 NPOSIXA         none       complement of POSIXA, [[:^class:]]

 CLUMP           no         Match any extended grapheme cluster sequence

 # Alternation

 # BRANCH        The set of branches constituting a single choice are
 #               hooked together with their "next" pointers, since
 #               precedence prevents anything being concatenated to
 #               any individual branch.  The "next" pointer of the last
 #               BRANCH in a choice points to the thing following the
 #               whole choice.  This is also where the final "next"
 #               pointer of each individual branch points; each branch
 #               starts with the operand node of a BRANCH node.
 #
 BRANCH          node       Match this alternative, or the next...

 # Literals

 EXACT           str        Match this string (preceded by length).
 EXACTL          str        Like EXACT, but /l is in effect (used so
                            locale-related warnings can be checked for).
 EXACTF          str        Match this non-UTF-8 string (not guaranteed
                            to be folded) using /id rules (w/len).
 EXACTFL         str        Match this string (not guaranteed to be
                            folded) using /il rules (w/len).
 EXACTFU         str        Match this string (folded iff in UTF-8,
                            length in folding doesn't change if not in
                            UTF-8) using /iu rules (w/len).
 EXACTFA         str        Match this string (not guaranteed to be
                            folded) using /iaa rules (w/len).

 EXACTFU_SS      str        Match this string (folded iff in UTF-8,
                            length in folding may change even if not in
                            UTF-8) using /iu rules (w/len).
 EXACTFLU8       str        Rare circumstances: like EXACTFU, but is
                            under /l, UTF-8, folded, and everything in
                            it is above 255.
 EXACTFA_NO_TRIE str        Match this string (which is not trie-able;
                            not guaranteed to be folded) using /iaa
                            rules (w/len).

 # Do nothing types

 NOTHING         no         Match empty string.
 # A variant of above which delimits a group, thus stops optimizations
 TAIL            no         Match empty string. Can jump here from
                            outside.

 # Loops

 # STAR,PLUS    '?', and complex '*' and '+', are implemented as
 #               circular BRANCH structures.  Simple cases
 #               (one character per match) are implemented with STAR
 #               and PLUS for speed and to minimize recursive plunges.
 #
 STAR            node       Match this (simple) thing 0 or more times.
 PLUS            node       Match this (simple) thing 1 or more times.

 CURLY           sv 2       Match this simple thing {n,m} times.
 CURLYN          no 2       Capture next-after-this simple thing
 CURLYM          no 2       Capture this medium-complex thing {n,m}
                            times.
 CURLYX          sv 2       Match this complex thing {n,m} times.

 # This terminator creates a loop structure for CURLYX
 WHILEM          no         Do curly processing and see if rest matches.

 # Buffer related

 # OPEN,CLOSE,GROUPP     ...are numbered at compile time.
 OPEN            num 1      Mark this point in input as start of #n.
 CLOSE           num 1      Analogous to OPEN.

 REF             num 1      Match some already matched string
 REFF            num 1      Match already matched string, folded using
                            native charset rules for non-utf8
 REFFL           num 1      Match already matched string, folded in loc.
 REFFU           num 1      Match already matched string, folded using
                            unicode rules for non-utf8
 REFFA           num 1      Match already matched string, folded using
                            unicode rules for non-utf8, no mixing ASCII,
                            non-ASCII

 # Named references.  Code in regcomp.c assumes that these all are after
 # the numbered references
 NREF            no-sv 1    Match some already matched string
 NREFF           no-sv 1    Match already matched string, folded using
                            native charset rules for non-utf8
 NREFFL          no-sv 1    Match already matched string, folded in loc.
 NREFFU          num 1      Match already matched string, folded using
                            unicode rules for non-utf8
 NREFFA          num 1      Match already matched string, folded using
                            unicode rules for non-utf8, no mixing ASCII,
                            non-ASCII

 # Support for long RE
 LONGJMP         off 1 1    Jump far away.
 BRANCHJ         off 1 1    BRANCH with long offset.

 # Special Case Regops
 IFMATCH         off 1 2    Succeeds if the following matches.
 UNLESSM         off 1 2    Fails if the following matches.
 SUSPEND         off 1 1    "Independent" sub-RE.
 IFTHEN          off 1 1    Switch, should be preceded by switcher.
 GROUPP          num 1      Whether the group matched.

 # The heavy worker

 EVAL            evl/flags  Execute some Perl code.
                 2L

 # Modifiers

 MINMOD          no         Next operator is not greedy.
 LOGICAL         no         Next opcode should set the flag only.

 # This is not used yet
 RENUM           off 1 1    Group with independently numbered parens.

 # Trie Related

 # Behave the same as A|LIST|OF|WORDS would. The '..C' variants
 # have inline charclass data (ascii only), the 'C' store it in the
 # structure.

 TRIE            trie 1     Match many EXACT(F[ALU]?)? at once.
                            flags==type
 TRIEC           trie       Same as TRIE, but with embedded charclass
                 charclass  data

 AHOCORASICK     trie 1     Aho Corasick stclass. flags==type
 AHOCORASICKC    trie       Same as AHOCORASICK, but with embedded
                 charclass  charclass data

 # Regex Subroutines
 GOSUB           num/ofs 2L recurse to paren arg1 at (signed) ofs arg2

 # Special conditionals
 NGROUPP         no-sv 1    Whether the group matched.
 INSUBP          num 1      Whether we are in a specific recurse.
 DEFINEP         none 1     Never execute directly.

 # Backtracking Verbs
 ENDLIKE         none       Used only for the type field of verbs
 OPFAIL          no-sv 1    Same as (?!), but with verb arg
 ACCEPT          no-sv/num  Accepts the current matched string, with
                 2L         verbar

 # Verbs With Arguments
 VERB            no-sv 1    Used only for the type field of verbs
 PRUNE           no-sv 1    Pattern fails at this startpoint if no-
                            backtracking through this
 MARKPOINT       no-sv 1    Push the current location for rollback by
                            cut.
 SKIP            no-sv 1    On failure skip forward (to the mark) before
                            retrying
 COMMIT          no-sv 1    Pattern fails outright if backtracking
                            through this
 CUTGROUP        no-sv 1    On failure go to the next alternation in the
                            group

 # Control what to keep in $&.
 KEEPS           no         $& begins here.

 # New charclass like patterns
 LNBREAK         none       generic newline pattern

 # SPECIAL  REGOPS

 # This is not really a node, but an optimized away piece of a "long"
 # node.  To simplify debugging output, we mark it as if it were a node
 OPTIMIZED       off        Placeholder for dump.

 # Special opcode with the property that no opcode in a compiled program
 # will ever be of this type. Thus it can be used as a flag value that
 # no other opcode has been seen. END is used similarly, in that an END
 # node can't be optimized. So END implies "unoptimizable" and PSEUDO
 # means "not seen anything to optimize yet".
 PSEUDO          off        Pseudo opcode for internal use.

=for regcomp.pl end

=for unprinted-credits
Next section M-J. Dominus (mjd-perl-patch+@plover.com) 20010421

Following the optimizer information is a dump of the offset/length
table, here split across several lines:

  Offsets: [45]
  	1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1]
  	0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0]
  	11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0]
  	0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0]  

The first line here indicates that the offset/length table contains 45
entries.  Each entry is a pair of integers, denoted by C<offset[length]>.
Entries are numbered starting with 1, so entry #1 here is C<1[4]> and
entry #12 is C<5[1]>.  C<1[4]> indicates that the node labeled C<1:>
(the C<1: ANYOF[bc]>) begins at character position 1 in the
pre-compiled form of the regex, and has a length of 4 characters.
C<5[1]> in position 12 
indicates that the node labeled C<12:>
(the C<< 12: EXACT <d> >>) begins at character position 5 in the
pre-compiled form of the regex, and has a length of 1 character.
C<12[1]> in position 14 
indicates that the node labeled C<14:>
(the C<< 14: CURLYX[0] {1,32767} >>) begins at character position 12 in the
pre-compiled form of the regex, and has a length of 1 character---that
is, it corresponds to the C<+> symbol in the precompiled regex.

C<0[0]> items indicate that there is no corresponding node.
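
The mapping from table entries back to the source of the regex can be
reproduced with a few lines of Perl.  This is only a sketch: it uses
the first 14 entries of the dump shown above, and the helper
C<node_sources> is a hypothetical name, not part of any module.

 ```perl
 use strict;
 use warnings;

 my $regex   = '[bc]d(ef*g)+h[ij]k$';
 # First 14 entries of the "Offsets:" dump shown above.
 my $offsets = '1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] '
             . '0[0] 5[1] 0[0] 12[1]';

 # Hypothetical helper: map node label => piece of the regex source.
 sub node_sources {
     my ($re, $dump) = @_;
     my %src;
     my $node = 0;
     while ($dump =~ /(\d+)\[(\d+)\]/g) {
         my ($pos, $len) = ($1, $2);
         $node++;               # entries are numbered starting with 1
         next unless $pos;      # 0[0]: no corresponding node
         # Positions in the dump are 1-based, substr() is 0-based.
         $src{$node} = substr $re, $pos - 1, $len;
     }
     return %src;
 }

 my %src = node_sources($regex, $offsets);
 print "node $_: $src{$_}\n" for sort { $a <=> $b } keys %src;
 ```

For the sample data this reports C<[bc]> for node 1, C<d> for node 12,
and C<+> for node 14, in agreement with the description above.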

=head2 Run-time Output

First of all, when doing a match, one may get no run-time output even
if debugging is enabled.  This means that the regex engine was never
entered and that all of the job was therefore done by the optimizer.

If the regex engine was entered, the output may look like this:

  Matching '[bc]d(ef*g)+h[ij]k$' against 'abcdefg__gh__'
    Setting an EVAL scope, savestack=3
     2 <ab> <cdefg__gh_>    |  1: ANYOF
     3 <abc> <defg__gh_>    | 11: EXACT <d>
     4 <abcd> <efg__gh_>    | 13: CURLYX {1,32767}
     4 <abcd> <efg__gh_>    | 26:   WHILEM
				0 out of 1..32767  cc=effff31c
     4 <abcd> <efg__gh_>    | 15:     OPEN1
     4 <abcd> <efg__gh_>    | 17:     EXACT <e>
     5 <abcde> <fg__gh_>    | 19:     STAR
			     EXACT <f> can match 1 times out of 32767...
    Setting an EVAL scope, savestack=3
     6 <bcdef> <g__gh__>    | 22:       EXACT <g>
     7 <bcdefg> <__gh__>    | 24:       CLOSE1
     7 <bcdefg> <__gh__>    | 26:       WHILEM
				    1 out of 1..32767  cc=effff31c
    Setting an EVAL scope, savestack=12
     7 <bcdefg> <__gh__>    | 15:         OPEN1
     7 <bcdefg> <__gh__>    | 17:         EXACT <e>
       restoring \1 to 4(4)..7
				    failed, try continuation...
     7 <bcdefg> <__gh__>    | 27:         NOTHING
     7 <bcdefg> <__gh__>    | 28:         EXACT <h>
				    failed...
				failed...

The most significant information in the output is about the particular I<node>
of the compiled regex that is currently being tested against the target string.
The format of these lines is

C<    >I<STRING-OFFSET> <I<PRE-STRING>> <I<POST-STRING>>   |I<ID>:  I<TYPE>

The I<TYPE> info is indented with respect to the backtracking level.
Other incidental information appears interspersed within.

=head1 Debugging Perl Memory Usage

Perl is a profligate wastrel when it comes to memory use.  There
is a saying that to estimate memory usage of Perl, assume a reasonable
algorithm for memory allocation, multiply that estimate by 10, and
while you still may miss the mark, at least you won't be quite so
astonished.  This is not absolutely true, but may provide a good
grasp of what happens.

Assume that an integer cannot take less than 20 bytes of memory, a
float cannot take less than 24 bytes, a string cannot take less
than 32 bytes (all these examples assume 32-bit architectures; the
results are quite a bit worse on 64-bit architectures).  If a variable
is accessed in two of three different ways (which require an integer,
a float, or a string), the memory footprint may increase yet another
20 bytes.  A sloppy malloc(3) implementation can inflate these
numbers dramatically.

On the opposite end of the scale, a declaration like

  sub foo;

may take up to 500 bytes of memory, depending on which release of Perl
you're running.

Anecdotal estimates of source-to-compiled code bloat suggest an
eightfold increase.  This means that the compiled form of reasonable
(normally commented, properly indented etc.) code will take
about eight times more space in memory than the code took
on disk.

The B<-DL> command-line switch is obsolete since circa Perl 5.6.0
(it was available only if Perl was built with C<-DDEBUGGING>).
The switch was used to track Perl's memory allocations and possible
memory leaks.  These days the use of malloc debugging tools like
F<Purify> or F<valgrind> is suggested instead.  See also
L<perlhacktips/PERL_MEM_LOG>.

One way to find out how much memory is being used by Perl data
structures is to install the Devel::Size module from CPAN: it gives
you the minimum number of bytes required to store a particular data
structure.  Please be mindful of the difference between size()
and total_size().
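
The following sketch shows one way the two functions might be
compared.  It assumes that Devel::Size has been installed from CPAN (it
is not bundled with perl) and skips the measurement otherwise; the
array C<@data> is arbitrary sample data.

 ```perl
 use strict;
 use warnings;

 # Arbitrary sample data: an array of 100 short strings.
 my @data = map { "string number $_" } 1 .. 100;

 # Devel::Size is a CPAN module; skip gracefully if it is missing.
 my $have_devel_size = eval { require Devel::Size; 1 } ? 1 : 0;

 if ($have_devel_size) {
     # size() looks at the array structure only ...
     my $shallow = Devel::Size::size(\@data);
     # ... while total_size() also follows what it contains.
     my $deep = Devel::Size::total_size(\@data);
     print "size: $shallow bytes, total_size: $deep bytes\n";
 }
 else {
     print "Devel::Size is not installed\n";
 }
 ```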

If Perl has been compiled using Perl's malloc you can analyze Perl
memory usage by setting C<$ENV{PERL_DEBUG_MSTATS}>.

=head2 Using C<$ENV{PERL_DEBUG_MSTATS}>

If your perl is using Perl's malloc() and was compiled with the
necessary switches (this is the default), then it will print memory
usage statistics after compiling your code when C<< $ENV{PERL_DEBUG_MSTATS}
> 1 >>, and before termination of the program when C<<
$ENV{PERL_DEBUG_MSTATS} >= 1 >>.  The report format is similar to
the following example:

 $ PERL_DEBUG_MSTATS=2 perl -e "require Carp"
 Memory allocation statistics after compilation: (buckets 4(4)..8188(8192)
    14216 free:   130   117    28     7     9   0   2     2   1 0 0
		437    61    36     0     5
    60924 used:   125   137   161    55     7   8   6    16   2 0 1
		 74   109   304    84    20
 Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048.
 Memory allocation statistics after execution:   (buckets 4(4)..8188(8192)
    30888 free:   245    78    85    13     6   2   1     3   2 0 1
		315   162    39    42    11
   175816 used:   265   176  1112   111    26  22  11    27   2 1 1
		196   178  1066   798    39
 Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144.

It is possible to ask for such a statistic at arbitrary points in
your execution using the mstat() function out of the standard
Devel::Peek module.

Here is some explanation of that format:

=over 4

=item C<buckets SMALLEST(APPROX)..GREATEST(APPROX)>

Perl's malloc() uses bucketed allocations.  Every request is rounded
up to the closest bucket size available, and a bucket is taken from
the pool of buckets of that size.

The line above describes the limits of buckets currently in use.
Each bucket has two sizes: memory footprint and the maximal size
of user data that can fit into this bucket.  Suppose in the above
example that the smallest bucket were size 4.  The biggest bucket
would have usable size 8188, and the memory footprint would be 8192.

In a Perl built for debugging, some buckets may have negative usable
size.  This means that these buckets cannot (and will not) be used.
For larger buckets, the memory footprint may be one page greater
than a power of 2.  If so, the corresponding power of two is
printed in the C<APPROX> field above.

=item Free/Used

The 1 or 2 rows of numbers following that correspond to the number
of buckets of each size between C<SMALLEST> and C<GREATEST>.  In
the first row, the sizes (memory footprints) of buckets are powers
of two--or possibly one page greater.  In the second row, if present,
the memory footprints of the buckets are between the memory footprints
of two buckets "above".

For example, suppose under the previous example, the memory footprints
were

   free:    8     16    32    64    128  256 512 1024 2048 4096 8192
	   4     12    24    48    80

With a non-C<DEBUGGING> perl, the buckets starting from C<128> have
a 4-byte overhead, and thus an 8192-long bucket may take up to
8188-byte allocations.

=item C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS>

The first two fields give the total amount of memory perl sbrk(2)ed
(ess-broken? :-) and number of sbrk(2)s used.  The third number is
what perl thinks about continuity of returned chunks.  So long as
this number is positive, malloc() will assume that it is probable
that sbrk(2) will provide continuous memory.

Memory allocated by external libraries is not counted.

=item C<pad: 0>

The amount of sbrk(2)ed memory needed to keep buckets aligned.

=item C<heads: 2192>

Although memory overhead of bigger buckets is kept inside the bucket, for
smaller buckets, it is kept in separate areas.  This field gives the
total size of these areas.

=item C<chain: 0>

malloc() may want to subdivide a bigger bucket into smaller buckets.
If only a part of the deceased bucket is left unsubdivided, the rest
is kept as an element of a linked list.  This field gives the total
size of these chunks.

=item C<tail: 6144>

To minimize the number of sbrk(2)s, malloc() asks for more memory.  This
field gives the size of the yet unused part, which is sbrk(2)ed, but
never touched.

=back
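
As an illustration of the summary line explained above, the sketch
below pulls its fields apart with a regular expression.  The helper
C<parse_sbrk_line> and the field names in the returned hash are
invented for this example; the sample line is copied from the report
shown earlier.

 ```perl
 use strict;
 use warnings;

 # Hypothetical helper: parse a "Total sbrk()" line into named fields.
 sub parse_sbrk_line {
     my ($line) = @_;
     return unless $line =~ m{
         Total \s sbrk\(\): \s (\d+) / (\d+) : (\d+) \.
         \s+ Odd \s ends: \s pad\+heads\+chain\+tail: \s
         (\d+) \+ (\d+) \+ (\d+) \+ (\d+) \.
     }x;
     my %f;
     @f{qw(sbrked sbrks continuous pad heads chain tail)}
         = ($1, $2, $3, $4, $5, $6, $7);
     return \%f;
 }

 my $line = ' Total sbrk(): 215040/47:145. '
          . 'Odd ends: pad+heads+chain+tail: 0+2192+0+6144.';
 my $stats = parse_sbrk_line($line);

 # Memory that was sbrk()ed but is not part of any bucket.
 my $odd_ends = $stats->{pad} + $stats->{heads}
              + $stats->{chain} + $stats->{tail};
 print "sbrk()ed $stats->{sbrked} bytes in $stats->{sbrks} calls, ",
       "$odd_ends bytes in odd ends\n";
 ```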

=head1 SEE ALSO

L<perldebug>,
L<perlguts>,
L<perlrun>,
L<re>,
and
L<Devel::DProf>.

If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perlhpux - Perl version 5 on Hewlett-Packard Unix (HP-UX) systems

=head1 DESCRIPTION

This document describes various features of HP's Unix operating system
(HP-UX) that will affect how Perl version 5 (hereafter just Perl) is
compiled and/or runs.

=head2 Using perl as shipped with HP-UX

The Application release of September 2001 for HP-UX 11.00 was the first
to ship with Perl.  At that time it was perl-5.6.1 in /opt/perl.  It first
appeared on CD 5012-7954 and can be installed using

  swinstall -s /cdrom perl

assuming you have mounted that CD on /cdrom.

That build was a portable hppa-1.1 multithread build that supports large
files and was compiled with gcc-2.9-hppa-991112.

If you perform a new installation, then (a newer) Perl will be installed
automatically.  Pre-installed HP-UX systems now have more recent versions
of Perl and the updated modules.

The official (threaded) builds from HP, as shipped on the
Application DVDs/CDs, are available on
L<http://www.software.hp.com/portal/swdepot/displayProductInfo.do?productNumber=PERL>
for both PA-RISC and IPF (Itanium Processor Family). They are built
with the HP ANSI-C compiler. Up to 5.8.8 that was done by ActiveState.

To see what version is included on the DVD (assumed here to be mounted
on /cdrom), issue this command:

  # swlist -s /cdrom perl
  # perl           D.5.8.8.B  5.8.8 Perl Programming Language
    perl.Perl5-32  D.5.8.8.B  32-bit 5.8.8 Perl Programming Language
                                           with Extensions
    perl.Perl5-64  D.5.8.8.B  64-bit 5.8.8 Perl Programming Language
                                           with Extensions

To see what is installed on your system:

  # swlist -R perl
  # perl                    E.5.8.8.J  Perl Programming Language
  # perl.Perl5-32           E.5.8.8.J  32-bit Perl Programming Language
                                       with Extensions
    perl.Perl5-32.PERL-MAN  E.5.8.8.J  32-bit Perl Man Pages for IA
    perl.Perl5-32.PERL-RUN  E.5.8.8.J  32-bit Perl Binaries for IA
  # perl.Perl5-64           E.5.8.8.J  64-bit Perl Programming Language
                                       with Extensions
    perl.Perl5-64.PERL-MAN  E.5.8.8.J  64-bit Perl Man Pages for IA
    perl.Perl5-64.PERL-RUN  E.5.8.8.J  64-bit Perl Binaries for IA

=head2 Using perl from HP's porting centre

The HP porting centre tries to keep up with customer demand and to release
updates from the Open Source community. Precompiled Perl binaries are
naturally among them, though "up-to-date" is relative. At the time of
writing only perl-5.10.1 was available (with 5.16.3 being the latest
stable release from the porters' point of view).

The HP porting centres are limited in what systems they are allowed
to port to and they usually choose the two most recent OS versions
available.

HP has asked the porting centre to move Open Source binaries
from /opt to /usr/local, so binaries produced since the start
of July 2002 are located in /usr/local.

One of the HP porting centre URLs is L<http://hpux.connect.org.uk/>.
The port currently available is built with GNU gcc.

=head2 Other prebuilt perl binaries

To get even more recent perl depots for the whole range of HP-UX, visit
H.Merijn Brand's site at L<http://mirrors.develooper.com/hpux/#Perl>.
Carefully read the notes to see if the available versions suit your needs.

=head2 Compiling Perl 5 on HP-UX

When compiling Perl, you must use an ANSI C compiler.  The C compiler
that ships with all HP-UX systems is a K&R compiler that should only be
used to build new kernels.

Perl can be compiled with either HP's ANSI C compiler or with gcc.  The
former is recommended, as not only can it compile Perl with no
difficulty, but also can take advantage of features listed later that
require the use of HP compiler-specific command-line flags.

If you decide to use gcc, make sure your installation is recent and
complete, and be sure to read the Perl INSTALL file for more gcc-specific
details.

=head2 PA-RISC

HP's HP9000 Unix systems run on HP's own Precision Architecture
(PA-RISC) chip.  HP-UX used to run on the Motorola MC68000 family of
chips, but any machine with this chip in it is quite obsolete and this
document will not attempt to address issues for compiling Perl on the
Motorola chipset.

The version of PA-RISC at the time of this document's last update is 2.0,
which is also the last there will be. HP PA-RISC systems are usually
referred to with model description "HP 9000". The last CPU in this series
is the PA-8900.  Support for PA-RISC based machines officially
ends as shown in the following table:

   PA-RISC End-of-Life Roadmap
 +--------+----------------+----------------+-----------------+
 | HP9000 | Superdome      | PA-8700        | Spring 2011     |
 | 4-128  |                | PA-8800/sx1000 | Summer 2012     |
 | cores  |                | PA-8900/sx1000 | 2014            |
 |        |                | PA-8900/sx2000 | 2015            |
 +--------+----------------+----------------+-----------------+
 | HP9000 | rp7410, rp8400 | PA-8700        | Spring 2011     |
 | 2-32   | rp7420, rp8420 | PA-8800/sx1000 | 2012            |
 | cores  | rp7440, rp8440 | PA-8900/sx1000 | Autumn 2013     |
 |        |                | PA-8900/sx2000 | 2015            |
 +--------+----------------+----------------+-----------------+
 | HP9000 | rp44x0         | PA-8700        | Spring 2011     |
 | 1-8    |                | PA-8800/rp44x0 | 2012            |
 | cores  |                | PA-8900/rp44x0 | 2014            |
 +--------+----------------+----------------+-----------------+
 | HP9000 | rp34x0         | PA-8700        | Spring 2011     |
 | 1-4    |                | PA-8800/rp34x0 | 2012            |
 | cores  |                | PA-8900/rp34x0 | 2014            |
 +--------+----------------+----------------+-----------------+

From L<http://www.hp.com/products1/evolution/9000/faqs.html>

 The last order date for HP 9000 systems was December 31, 2008.

A complete list of models at the time the OS was built is in the file
/usr/sam/lib/mo/sched.models. The first column corresponds to the last
part of the output of the "model" command.  The second column is the
PA-RISC version and the third column is the exact chip type used.
(Start browsing at the bottom to prevent confusion ;-)

  # model
  9000/800/L1000-44
  # grep L1000-44 /usr/sam/lib/mo/sched.models
  L1000-44        2.0     PA8500
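
The lookup above can be scripted.  The following is only a sketch: the
function names C<model_suffix> and C<pa_risc_info> are made up for this
example, and it assumes that the file really consists of the three
whitespace-separated columns described above.

```shell

  # model_suffix MODEL
  # Strip everything up to the last "/" of a model string such as
  # 9000/800/L1000-44, leaving L1000-44.
  model_suffix() {
      printf '%s\n' "${1##*/}"
  }

  # pa_risc_info NAME [FILE]
  # Print the PA-RISC version and chip type recorded for NAME in a
  # sched.models-style FILE (columns: model, version, chip).
  # Fails if NAME is not listed.
  pa_risc_info() {
      awk -v m="$1" '$1 == m { print $2, $3; found = 1; exit }
                     END { exit !found }' \
          "${2:-/usr/sam/lib/mo/sched.models}"
  }

```

On an HP-UX system this could be called as
C<pa_risc_info "$(model_suffix "$(model)")">.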

=head2 Portability Between PA-RISC Versions

An executable compiled on a PA-RISC 2.0 platform will not execute on a
PA-RISC 1.1 platform, even if they are running the same version of
HP-UX.  If you are building Perl on a PA-RISC 2.0 platform and want that
Perl to also run on a PA-RISC 1.1, the compiler flags +DAportable and
+DS32 should be used.

It is no longer possible to compile PA-RISC 1.0 executables on either
the PA-RISC 1.1 or 2.0 platforms.  The command-line flags are accepted,
but the resulting executable will not run when transferred to a PA-RISC
1.0 system.

=head2 PA-RISC 1.0

This is the original version of PA-RISC.  HP no longer sells any system
with this chip.

The following systems contained PA-RISC 1.0 chips:

  600, 635, 645, 808, 815, 822, 825, 832, 834, 835, 840, 842, 845, 850,
  852, 855, 860, 865, 870, 890

=head2 PA-RISC 1.1

An upgrade to the PA-RISC design, it shipped for many years in many different
systems.

The following systems contain PA-RISC 1.1 chips:

  705, 710, 712, 715, 720, 722, 725, 728, 730, 735, 742, 743, 744, 745,
  747, 750, 755, 770, 777, 778, 779, 800, 801, 803, 806, 807, 809, 811,
  813, 816, 817, 819, 821, 826, 827, 829, 831, 837, 839, 841, 847, 849,
  851, 856, 857, 859, 867, 869, 877, 887, 891, 892, 897, A180, A180C,
  B115, B120, B132L, B132L+, B160L, B180L, C100, C110, C115, C120,
  C160L, D200, D210, D220, D230, D250, D260, D310, D320, D330, D350,
  D360, D410, DX0, DX5, DXO, E25, E35, E45, E55, F10, F20, F30, G30,
  G40, G50, G60, G70, H20, H30, H40, H50, H60, H70, I30, I40, I50, I60,
  I70, J200, J210, J210XC, K100, K200, K210, K220, K230, K400, K410,
  K420, S700i, S715, S744, S760, T500, T520

=head2 PA-RISC 2.0

The most recent upgrade to the PA-RISC design, it added support for
64-bit integer data.

As of the date of this document's last update, the following systems
contain PA-RISC 2.0 chips:

  700, 780, 781, 782, 783, 785, 802, 804, 810, 820, 861, 871, 879, 889,
  893, 895, 896, 898, 899, A400, A500, B1000, B2000, C130, C140, C160,
  C180, C180+, C180-XP, C200+, C400+, C3000, C360, C3600, CB260, D270,
  D280, D370, D380, D390, D650, J220, J2240, J280, J282, J400, J410,
  J5000, J5500XM, J5600, J7000, J7600, K250, K260, K260-EG, K270, K360,
  K370, K380, K450, K460, K460-EG, K460-XP, K470, K570, K580, L1000,
  L2000, L3000, N4000, R380, R390, SD16000, SD32000, SD64000, T540,
  T600, V2000, V2200, V2250, V2500, V2600

Just before HP took over Compaq, some systems were renamed. The link
that contained the explanation is dead, so here's a short summary:

  HP 9000 A-Class servers, now renamed HP Server rp2400 series.
  HP 9000 L-Class servers, now renamed HP Server rp5400 series.
  HP 9000 N-Class servers, now renamed HP Server rp7400.

  rp2400, rp2405, rp2430, rp2450, rp2470, rp3410, rp3440, rp4410,
  rp4440, rp5400, rp5405, rp5430, rp5450, rp5470, rp7400, rp7405,
  rp7410, rp7420, rp7440, rp8400, rp8420, rp8440, Superdome

The current naming convention is:

  aadddd
  ||||`+- 00 - 99 relative capacity & newness (upgrades, etc.)
  |||`--- unique number for each architecture to ensure different
  |||     systems do not have the same numbering across
  |||     architectures
  ||`---- 1 - 9 identifies family and/or relative positioning
  ||
  |`----- c = ia32 (cisc)
  |       p = pa-risc
  |       x = ia-64 (Itanium & Itanium 2)
  |       h = housing
  `------ t = tower
          r = rack optimized
          s = super scalable
          b = blade
          sa = appliance
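
To make the scheme concrete, here is a small sketch that decodes a name
such as C<rx2600>.  The function name C<decode_hp_model> and the wording
of its output are invented for this example, and the two-letter C<sa>
prefix is deliberately not handled.

```shell

  # decode_hp_model NAME
  # Decode a six-character model name (two letters, four digits)
  # according to the "aadddd" scheme.  Fails for any other shape.
  decode_hp_model() {
      name=$1
      case $name in
          [trsb][cpxh][0-9][0-9][0-9][0-9]) ;;
          *) echo "unrecognized model name: $name" >&2; return 1 ;;
      esac
      form=${name%?????}      # first letter
      rest=${name#?}
      arch=${rest%????}       # second letter
      digits=${rest#?}        # the four digits
      case $form in
          t) form_desc="tower" ;;
          r) form_desc="rack optimized" ;;
          s) form_desc="super scalable" ;;
          b) form_desc="blade" ;;
      esac
      case $arch in
          c) arch_desc="ia32 (cisc)" ;;
          p) arch_desc="pa-risc" ;;
          x) arch_desc="ia-64" ;;
          h) arch_desc="housing" ;;
      esac
      mid=${digits#?}
      unique=${mid%??}        # second digit
      printf '%s: %s, %s, family %s, architecture number %s, capacity %s\n' \
          "$name" "$form_desc" "$arch_desc" \
          "${digits%???}" "$unique" "${digits#??}"
  }

```

Names that do not follow the scheme, such as C<Superdome>, are rejected
with a non-zero exit status.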

=head2 Itanium Processor Family (IPF) and HP-UX

HP-UX also runs on the new Itanium processor.  This requires the use
of a different version of HP-UX (currently 11.23 or 11i v2), and with
the exception of a few differences detailed below and in later sections,
Perl should compile with no problems.

Although PA-RISC binaries can run on Itanium systems, you should not
attempt to use a PA-RISC version of Perl on an Itanium system.  This is
because shared libraries created on an Itanium system cannot be loaded
while running a PA-RISC executable.

HP Itanium 2 systems are usually referred to with model description
"HP Integrity".

=head2 Itanium, Itanium 2 & Madison 6

HP also ships servers with the 128-bit Itanium processor(s). The cx26x0
is said to have Madison 6. As of the date of this document's last update,
the following systems contain Itanium or Itanium 2 chips (this is likely
to be out of date):

  BL60p, BL860c, BL870c, BL890c, cx2600, cx2620, rx1600, rx1620, rx2600,
  rx2600hptc, rx2620, rx2660, rx2800, rx3600, rx4610, rx4640, rx5670,
  rx6600, rx7420, rx7620, rx7640, rx8420, rx8620, rx8640, rx9610,
  sx1000, sx2000

To see all about your machine, type

  # model
  ia64 hp server rx2600
  # /usr/contrib/bin/machinfo

=head2 HP-UX versions

Not all architectures (PA = PA-RISC, IPF = Itanium Processor Family)
support all versions of HP-UX.  Here is a short list:

  HP-UX version  Kernel  Architecture End-of-factory support
  -------------  ------  ------------ ----------------------------------
  10.20          32 bit  PA           30-Jun-2003
  11.00          32/64   PA           31-Dec-2006
  11.11  11i v1  32/64   PA           31-Dec-2015
  11.22  11i v2     64        IPF     30-Apr-2004
  11.23  11i v2     64   PA & IPF     31-Dec-2015
  11.31  11i v3     64   PA & IPF     31-Dec-2020 (PA) 31-Dec-2022 (IPF)

For the full list of hardware/OS support and expected end-of-life dates, see
L<http://www.hp.com/go/hpuxservermatrix>.

=head2 Building Dynamic Extensions on HP-UX

HP-UX supports dynamically loadable libraries (shared libraries).
Shared libraries end with the suffix .sl.  On Itanium systems,
they end with the suffix .so.
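
A build script that has to pick the right suffix could do something like
the following.  This is a sketch under the assumption that the output of
the C<model> command starts with C<9000/> on PA-RISC and with C<ia64> on
Itanium, as in the examples shown in this document; the function name
C<shlib_suffix> is made up.

```shell

  # shlib_suffix MODEL_STRING
  # Print "sl" for a PA-RISC model string and "so" for an Itanium one.
  # Fails for anything else.
  shlib_suffix() {
      case $1 in
          ia64*)  echo so ;;
          9000/*) echo sl ;;
          *)      return 1 ;;
      esac
  }

```

On an HP-UX system this could be called as C<shlib_suffix "$(model)">.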

Shared libraries created on a platform using a particular PA-RISC
version are not usable on platforms using an earlier PA-RISC version by
default.  However, this backwards compatibility may be enabled using the
same +DAportable compiler flag (with the same PA-RISC 1.0 caveat
mentioned above).

Shared libraries created on an Itanium platform cannot be loaded on
a PA-RISC platform.  Shared libraries created on a PA-RISC platform
can only be loaded on an Itanium platform if it is a PA-RISC executable
that is attempting to load the PA-RISC library.  A PA-RISC shared
library cannot be loaded into an Itanium executable nor vice-versa.

To create a shared library, the following steps must be performed:

  1. Compile source modules with +z or +Z flag to create a .o module
     which contains Position-Independent Code (PIC).  The linker will
     tell you in the next step if +Z was needed.
     (For gcc, the appropriate flag is -fpic or -fPIC.)

  2. Link the shared library using the -b flag.  If the code calls
     any functions in other system libraries (e.g., libm), those
     libraries must be included on this line.

(Note that these steps are usually handled automatically by the extension's
Makefile).

If these dependent libraries are not listed at shared library creation
time, you will get fatal "Unresolved symbol" errors at run time when the
library is loaded.
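
The two steps can be sketched as a dry run that only prints the commands.
The function name C<shlib_commands> and the file names are hypothetical;
the flags are the ones named in the steps above for HP's compiler and
linker.

```shell

  # shlib_commands NAME SUFFIX [LIB...]
  # Print (do not run) the compile and link commands that would build
  # libNAME.SUFFIX from NAME.c, listing each dependent library LIB
  # on the link line.
  shlib_commands() {
      name=$1
      suffix=$2
      shift 2
      deps=
      for lib in "$@"; do
          deps="$deps -l$lib"
      done
      echo "cc +Z -c $name.c"
      echo "ld -b -o lib$name.$suffix $name.o$deps"
  }

```

For example, C<shlib_commands foo sl m> prints a compile line for
F<foo.c> and a link line that ends in C<-lm>.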

You may create a shared library that refers to another library, which
may be either an archive library or a shared library.  If this second
library is a shared library, this is called a "dependent library".  The
dependent library's name is recorded in the main shared library, but it
is not linked into the shared library.  Instead, it is loaded when the
main shared library is loaded.  This can cause problems if you build an
extension on one system and move it to another system where the
libraries may not be located in the same place as on the first system.

If the referred library is an archive library, then it is treated as a
simple collection of .o modules (all of which must contain PIC).  These
modules are then linked into the shared library.

Note that it is okay to create a library which contains a dependent
library that is already linked into perl.

Some extensions, like DB_File and Compress::Zlib use/require prebuilt
libraries for the perl extensions/modules to work. If these libraries
are built using the default configuration, it might happen that you
run into an error like "invalid loader fixup" during load phase.
HP is aware of this problem.  Search the HP-UX cxx-dev forums for
discussions about the subject.  The short answer is that B<everything>
(all libraries, everything) must be compiled with C<+z> or C<+Z> to be
PIC (position independent code).  (For gcc, that would be
C<-fpic> or C<-fPIC>).  In HP-UX 11.00 or newer the linker
error message should tell the name of the offending object file.

A more general approach is to intervene manually, as with an example for
the DB_File module, which requires SleepyCat's libdb.sl:

  # cd .../db-3.2.9/build_unix
  # vi Makefile
  ... add +Z to all cflags to create shared objects
  CFLAGS=         -c $(CPPFLAGS) +Z -Ae +O2 +Onolimit \
                  -I/usr/local/include -I/usr/include/X11R6
  CXXFLAGS=       -c $(CPPFLAGS) +Z -Ae +O2 +Onolimit \
                  -I/usr/local/include -I/usr/include/X11R6

  # make clean
  # make
  # mkdir tmp
  # cd tmp
  # ar x ../libdb.a
  # ld -b -o libdb-3.2.sl *.o
  # mv libdb-3.2.sl /usr/local/lib
  # rm *.o
  # cd /usr/local/lib
  # rm -f libdb.sl
  # ln -s libdb-3.2.sl libdb.sl

  # cd .../DB_File-1.76
  # make distclean
  # perl Makefile.PL
  # make
  # make test
  # make install

As of db-4.2.x it is no longer necessary to do this by hand. Sleepycat
has changed the configuration process to add +z on HP-UX automatically.

  # cd .../db-4.2.25/build_unix
  # env CFLAGS=+DD64 LDFLAGS=+DD64 ../dist/configure

should work to generate 64bit shared libraries for HP-UX 11.00 and 11i.

It is no longer possible to link PA-RISC 1.0 shared libraries (even
though the command-line flags are still present).

PA-RISC and Itanium object files are not interchangeable.  Although
you may be able to use ar to create an archive library of PA-RISC
object files on an Itanium system, you cannot link against it using
an Itanium link editor.

=head2 The HP ANSI C Compiler

When using this compiler to build Perl, you should make sure that the
flag -Aa is added to the cpprun and cppstdin variables in the config.sh
file (though see the section on 64-bit perl below). If you are using a
recent version of the Perl distribution, these flags are set automatically.

Even though HP-UX 10.20 and 11.00 are not actively maintained by HP
anymore, updates for the HP ANSI C compiler are still available from
time to time, and it might be advisable to see if updates are applicable.
At the moment of writing, the latest available patches for 11.00 that
should be applied are PHSS_35098, PHSS_35175, PHSS_35100, PHSS_33036,
and PHSS_33902. If you have a SUM account, you can use it to search
for updates/patches. Enter "ANSI" as keyword.

=head2 The GNU C Compiler

When you are going to use the GNU C compiler (gcc), and you don't have
gcc yet, you can either build it yourself from the sources (available
from e.g. L<http://gcc.gnu.org/mirrors.html>) or fetch
a prebuilt binary from the HP porting center
at L<http://hpux.connect.org.uk/hppd/cgi-bin/search?term=gcc&Search=Search>
or from the DSPP (you need to be a member) at
L<http://h21007.www2.hp.com/portal/site/dspp/menuitem.863c3e4cbcdc3f3515b49c108973a801?ciid=2a08725cc2f02110725cc2f02110275d6e10RCRD&jumpid=reg_r1002_usen_c-001_title_r0001>
(Browse through the list, because there are often multiple versions of
the same package available).

Most mentioned distributions are depots. H.Merijn Brand has made prebuilt
gcc binaries available on L<http://mirrors.develooper.com/hpux/> and/or
L<http://www.cmve.net/~merijn/> for HP-UX 10.20 (only 32bit), HP-UX 11.00,
HP-UX 11.11 (HP-UX 11i v1), and HP-UX 11.23 (HP-UX 11i v2 PA-RISC) in both
32- and 64-bit versions. For HP-UX 11.23 IPF and HP-UX 11.31 IPF depots are
available too. The IPF versions do not need two versions of GNU gcc.

On PA-RISC you need a different compiler for 32-bit applications and for
64-bit applications. On PA-RISC, 32-bit objects and 64-bit objects do
not mix. Period. HP C-ANSI-C and GNU gcc behave the same in this respect.
So if you require your perl binary to use 64-bit libraries, like
Oracle-64bit, you MUST build a 64-bit perl.

Building a 64-bit capable gcc on PA-RISC from source is possible only when
you have the HP C-ANSI C compiler or an already working 64-bit binary of
gcc available. Best performance for perl is achieved with HP's native
compiler.

=head2 Using Large Files with Perl on HP-UX

Beginning with HP-UX version 10.20, files larger than 2GB (2^31 bytes)
may be created and manipulated.  Three separate methods of doing this
are available.  Of these methods, the best method for Perl is to compile
using the -Duselargefiles flag to Configure.  This causes Perl to be
compiled using structures and functions in which file offsets are 64 bits wide,
rather than 32 bits wide.  (Note that this will only work with HP's ANSI
C compiler.  If you want to compile Perl using gcc, you will have to get
a version of the compiler that supports 64-bit operations. See above for
where to find it.)

There are some drawbacks to this approach.  One is that any extension
which calls any file-manipulating C function will need to be recompiled
(just follow the usual "perl Makefile.PL; make; make test; make install"
procedure).

The functions whose callers will need to be recompiled are:

  creat,          fgetpos,        fopen,
  freopen,        fsetpos,        fstat,
  fstatvfs,       fstatvfsdev,    ftruncate,
  ftw,            lockf,          lseek,
  lstat,          mmap,           nftw,
  open,           prealloc,       stat,
  statvfs,        statvfsdev,     tmpfile,
  truncate,       getrlimit,      setrlimit

Another drawback is only valid for Perl versions before 5.6.0.  This
drawback is that the seek and tell functions (both the builtin version
and POSIX module version) will not perform correctly.

It is strongly recommended that you use this flag when you run
Configure.  If you do not do this, but later answer the question about
large files when Configure asks you, you may get a configuration that
cannot be compiled, or that does not function as expected.
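
Whether an existing perl was built this way can be checked from its
configuration.  The sketch below parses one line in the
C<name='value';> format printed by C<perl -V:name>; the function name
C<config_is_defined> is made up for this example.

```shell

  # config_is_defined
  # Read one line such as  uselargefiles='define';  from standard
  # input and succeed only if the value is 'define'.
  config_is_defined() {
      read -r line
      case $line in
          *"='define';") return 0 ;;
          *)             return 1 ;;
      esac
  }

```

A typical call would be C<perl -V:uselargefiles | config_is_defined>.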

=head2 Threaded Perl on HP-UX

It is possible to compile a version of threaded Perl on any version of
HP-UX before 10.30, but it is strongly suggested that you be running on
HP-UX 11.00 at least.

To compile Perl with threads, add -Dusethreads to the arguments of
Configure.  Verify that the -D_POSIX_C_SOURCE=199506L compiler flag is
automatically added to the list of flags.  Also make sure that -lpthread
is listed before -lc in the list of libraries to link Perl with. The
hints provided for HP-UX during Configure will try very hard to get
this right for you.
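
The ordering requirement can be verified mechanically.  The following
sketch checks a whitespace-separated list of linker flags; the function
name C<pthread_before_c> is invented here, and where the list comes from
(for instance the C<libs> entry in F<config.sh>) is left to the caller.

```shell

  # pthread_before_c "FLAGS"
  # Succeed if -lpthread occurs before -lc in FLAGS.
  # Fails if -lc comes first or is not present at all.
  pthread_before_c() {
      seen_pthread=no
      for flag in $1; do
          case $flag in
              -lpthread) seen_pthread=yes ;;
              -lc)       [ "$seen_pthread" = yes ]; return ;;
          esac
      done
      return 1
  }

```

The unquoted C<$1> in the loop is intentional, so that the shell splits
the list into separate flags.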

HP-UX versions before 10.30 require a separate installation of a POSIX
threads library package. Two examples are the HP DCE package, available
on "HP-UX Hardware Extensions 3.0, Install and Core OS, Release 10.20,
April 1999 (B3920-13941)" or the freely available PTH package, available
on H.Merijn's site (L<http://mirrors.develooper.com/hpux/>). The use of PTH
will be unsupported in perl-5.12 and up and is rather buggy in 5.11.x.

If you are going to use the HP DCE package, the library used for threading
is /usr/lib/libcma.sl, but there have been multiple updates of that
library over time. Perl will build with the first version, but it
will not pass the test suite. Older Oracle versions might be a compelling
reason not to update that library, otherwise please find a newer version
in one of the following patches: PHSS_19739, PHSS_20608, or PHSS_23672.

Reformatted C<what> output:

  d3:/usr/lib 106 > what libcma-*.1
  libcma-00000.1:
     HP DCE/9000 1.5               Module: libcma.sl (Export)
                                   Date: Apr 29 1996 22:11:24
  libcma-19739.1:
     HP DCE/9000 1.5 PHSS_19739-40 Module: libcma.sl (Export)
                                   Date: Sep  4 1999 01:59:07
  libcma-20608.1:
     HP DCE/9000 1.5 PHSS_20608    Module: libcma.1 (Export)
                                   Date: Dec  8 1999 18:41:23
  libcma-23672.1:
     HP DCE/9000 1.5 PHSS_23672    Module: libcma.1 (Export)
                                   Date: Apr  9 2001 10:01:06
  d3:/usr/lib 107 >

If you choose the PTH package, use swinstall to install pth in
the default location (/opt/pth), and then make symbolic links to the
libraries from /usr/lib:

  # cd /usr/lib
  # ln -s /opt/pth/lib/libpth* .

For building perl to support Oracle, it needs to be linked with libcl
and libpthread. So even if your perl is an unthreaded build, these
libraries might be required. See "Oracle on HP-UX" below.

=head2 64-bit Perl on HP-UX

Beginning with HP-UX 11.00, programs compiled under HP-UX can take
advantage of the LP64 programming environment (LP64 means Longs and
Pointers are 64 bits wide), in which scalar variables will be able
to hold numbers larger than 2^32 with complete precision.  Perl has
proven to be consistent and reliable in 64bit mode since 5.8.1 on
all HP-UX 11.xx.

As of the date of this document, Perl is fully 64-bit compliant on
HP-UX 11.00 and up for both cc- and gcc builds. If you are about to
build a 64-bit perl with GNU gcc, please read the gcc section carefully.

Should a user have the need for compiling Perl in the LP64 environment,
use the -Duse64bitall flag to Configure.  This will force Perl to be
compiled in a pure LP64 environment (with the +DD64 flag for HP C-ANSI-C,
with no additional options for GNU gcc 64-bit on PA-RISC, and with
-mlp64 for GNU gcc on Itanium).
If you want to compile Perl using gcc, you will have to get a version of
the compiler that supports 64-bit operations.

You can also use the -Duse64bitint flag to Configure.  Although there
are some minor differences between compiling Perl with this flag versus
the -Duse64bitall flag, they should not be noticeable from a Perl user's
perspective. When configuring -Duse64bitint using a 64bit gcc on a
pa-risc architecture, -Duse64bitint is silently promoted to -Duse64bitall.

In both cases, it is strongly recommended that you use these flags when
you run Configure.  If you do not do this, but later answer the
questions about 64-bit numbers when Configure asks you, you may get a
configuration that cannot be compiled, or that does not function as
expected.
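
The compiler-specific LP64 options mentioned in this section can be
summarized in a small lookup.  This is only an illustration: the function
name C<lp64_flag> and the labels C<pa-risc> and C<ia64> are made up, and
Configure normally selects the option itself when -Duse64bitall is given.

```shell

  # lp64_flag COMPILER ARCH
  # Print the extra compiler option for a pure LP64 build.
  # COMPILER is "cc" (HP C-ANSI-C) or "gcc"; ARCH is "pa-risc" or "ia64".
  lp64_flag() {
      case "$1/$2" in
          cc/*)        echo "+DD64" ;;
          gcc/pa-risc) echo "" ;;        # 64-bit gcc needs no extra option
          gcc/ia64)    echo "-mlp64" ;;
          *)           return 1 ;;
      esac
  }

```

An unknown compiler or architecture makes the function fail instead of
guessing.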

=head2 Oracle on HP-UX

Using perl to connect to Oracle databases through DBI and DBD::Oracle
has caused a lot of people many headaches. Read README.hpux in the
DBD::Oracle distribution for much more information. The reason to mention
it here is that Oracle requires a perl built with libcl and libpthread,
the latter even when perl is built without threads. Building perl using
all defaults, while still allowing DBD::Oracle to be built later on, can
be achieved using

  Configure -A prepend:libswanted='cl pthread ' ...

Do not forget the space before the trailing quote.

Also note that this does not (yet) work with all configurations;
it is known to fail with 64-bit versions of GCC.

=head2 GDBM and Threads on HP-UX

If you attempt to compile Perl with (POSIX) threads on an 11.X system
and also link in the GDBM library, then Perl will immediately core dump
when it starts up.  The only workaround at this point is to relink the
GDBM library under 11.X, then relink it into Perl.

The error might look something like:

  Pthread internal error: message: __libc_reinit() failed, file: ../pthreads/pthread.c, line: 1096
  Return Pointer is 0xc082bf33
  sh: 5345 Quit(coredump)

and Configure will give up.

=head2 NFS filesystems and utime(2) on HP-UX

If you are compiling Perl on a remotely-mounted NFS filesystem, the test
io/fs.t may fail on test #18.  This appears to be a bug in HP-UX and no
fix is currently available.

=head2 HP-UX Kernel Parameters (maxdsiz) for Compiling Perl

By default, HP-UX comes configured with a maximum data segment size of
64MB.  This is too small to correctly compile Perl with the maximum
optimization levels.  You can increase the size of the maxdsiz kernel
parameter through the use of SAM.

When using the GUI version of SAM, click on the Kernel Configuration
icon, then the Configurable Parameters icon.  Scroll down and select
the maxdsiz line.  From the Actions menu, select the Modify Configurable
Parameter item.  Insert the new formula into the Formula/Value box.
Then follow the instructions to rebuild your kernel and reboot your
system.

In general, a value of 256MB (or "256*1024*1024") is sufficient for
Perl to compile at maximum optimization.
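
For reference, the arithmetic behind such a formula can be done in the
shell.  The function name C<maxdsiz_bytes> is made up for this sketch.

```shell

  # maxdsiz_bytes MEGABYTES
  # Print the given number of megabytes as a byte count.
  maxdsiz_bytes() {
      echo $(( $1 * 1024 * 1024 ))
  }

```

C<maxdsiz_bytes 256> prints 268435456, and the default of 64MB
corresponds to 67108864 bytes.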

=head1 nss_delete core dump from op/pwent or op/grent

You may get a bus error core dump from the op/pwent or op/grent
tests. If compiled with -g you will see a stack trace much like
the following:

  #0  0xc004216c in  () from /usr/lib/libc.2
  #1  0xc00d7550 in __nss_src_state_destr () from /usr/lib/libc.2
  #2  0xc00d7768 in __nss_src_state_destr () from /usr/lib/libc.2
  #3  0xc00d78a8 in nss_delete () from /usr/lib/libc.2
  #4  0xc01126d8 in endpwent () from /usr/lib/libc.2
  #5  0xd1950 in Perl_pp_epwent () from ./perl
  #6  0x94d3c in Perl_runops_standard () from ./perl
  #7  0x23728 in S_run_body () from ./perl
  #8  0x23428 in perl_run () from ./perl
  #9  0x2005c in main () from ./perl

The key here is the C<nss_delete> call.  One workaround for this
bug seems to be to add to the file F</etc/nsswitch.conf>
(at least) the following lines

  group: files
  passwd: files

Whether you are using NIS does not matter.  Amazingly enough,
the same bug also affects Solaris.
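
Before running the tests, one could check whether the workaround is
already in place.  The sketch below looks for a database entry that lists
C<files> as its first source; the function name C<nss_uses_files> is
invented for this example.

```shell

  # nss_uses_files DATABASE [FILE]
  # Succeed if FILE (default /etc/nsswitch.conf) has a line for
  # DATABASE whose first source is "files".
  nss_uses_files() {
      grep -Eq "^$1:[[:space:]]*files([[:space:]]|\$)" \
          "${2:-/etc/nsswitch.conf}"
  }

```

For example, C<nss_uses_files passwd && nss_uses_files group> succeeds
only when both lines are present.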

=head1 error: pasting ")" and "l" does not give a valid preprocessing token

There seems to be a broken system header file in HP-UX 11.00 that
breaks perl building in 32bit mode with GNU gcc-4.x causing this
error. The same file for HP-UX 11.11 (even though the file is older)
does not show this failure, and has the correct definition, so the
best fix is to patch the header to match:

 --- /usr/include/inttypes.h  2001-04-20 18:42:14 +0200
 +++ /usr/include/inttypes.h  2000-11-14 09:00:00 +0200
 @@ -72,7 +72,7 @@
  #define UINT32_C(__c)                   __CONCAT_U__(__c)
  #else /* __LP64 */
  #define INT32_C(__c)                    __CONCAT__(__c,l)
 -#define UINT32_C(__c)                   __CONCAT__(__CONCAT_U__(__c),l)
 +#define UINT32_C(__c)                   __CONCAT__(__c,ul)
  #endif /* __LP64 */

  #define INT64_C(__c)                    __CONCAT_L__(__c,l)

=head1 Redeclaration of "sendpath" with a different storage class specifier

The following compilation warnings may happen in HP-UX releases
earlier than 11.31 but are harmless:

 cc: "/usr/include/sys/socket.h", line 535: warning 562:
    Redeclaration of "sendfile" with a different storage class
    specifier: "sendfile" will have internal linkage.
 cc: "/usr/include/sys/socket.h", line 536: warning 562:
    Redeclaration of "sendpath" with a different storage class
    specifier: "sendpath" will have internal linkage.

They seem to be caused by broken system header files, and other
open source projects are seeing them too.  The following HP-UX patches
should make the warnings go away:

  CR JAGae12001: PHNE_27063
  Warning 562 on sys/socket.h due to redeclaration of prototypes

  CR JAGae16787:
  Warning 562 from socket.h sendpath/sendfile -D_FILEFFSET_BITS=64

  CR JAGae73470 (11.23)
  ER: Compiling socket.h with cc -D_FILEFFSET_BITS=64 warning 267/562

=head1 Miscellaneous

HP-UX 11 Y2K patch "Y2K-1100 B.11.00.B0125 HP-UX Core OS Year 2000
Patch Bundle" has been reported to break the io/fs test #18 which
tests whether utime() can change timestamps.  The Y2K patch seems to
break utime() so that over NFS the timestamps do not get changed
(on local filesystems utime() still works). This has probably been
fixed on your system by now.

=head1 AUTHOR

H.Merijn Brand <h.m.brand@xs4all.nl>
Jeff Okamoto <okamoto@corp.hp.com>

With much assistance regarding shared libraries from Marc Sabatella.

=cut

=head1 NAME

perlrun - how to execute the Perl interpreter

=head1 SYNOPSIS

B<perl>	S<[ B<-sTtuUWX> ]>
	S<[ B<-hv> ] [ B<-V>[:I<configvar>] ]>
	S<[ B<-cw> ] [ B<-d>[B<t>][:I<debugger>] ] [ B<-D>[I<number/list>] ]>
	S<[ B<-pna> ] [ B<-F>I<pattern> ] [ B<-l>[I<octal>] ] [ B<-0>[I<octal/hexadecimal>] ]>
	S<[ B<-I>I<dir> ] [ B<-m>[B<->]I<module> ] [ B<-M>[B<->]I<'module...'> ] [ B<-f> ]>
	S<[ B<-C [I<number/list>] >]>
	S<[ B<-S> ]>
	S<[ B<-x>[I<dir>] ]>
	S<[ B<-i>[I<extension>] ]>
	S<[ [B<-e>|B<-E>] I<'command'> ] [ B<--> ] [ I<programfile> ] [ I<argument> ]...>

=head1 DESCRIPTION

The normal way to run a Perl program is by making it directly
executable, or else by passing the name of the source file as an
argument on the command line.  (An interactive Perl environment
is also possible--see L<perldebug> for details on how to do that.)
Upon startup, Perl looks for your program in one of the following
places:

=over 4

=item 1.

Specified line by line via B<-e> or B<-E> switches on the command line.

=item 2.

Contained in the file specified by the first filename on the command line.
(Note that systems supporting the C<#!> notation invoke interpreters this
way. See L</Location of Perl>.)

=item 3.

Passed in implicitly via standard input.  This works only if there are
no filename arguments--to pass arguments to a STDIN-read program you
must explicitly specify a "-" for the program name.

=back

With methods 2 and 3, Perl starts parsing the input file from the
beginning, unless you've specified a B<-x> switch, in which case it
scans for the first line starting with C<#!> and containing the word
"perl", and starts there instead.  This is useful for running a program
embedded in a larger message.  (In this case you would indicate the end
of the program using the C<__END__> token.)

The C<#!> line is always examined for switches as the line is being
parsed.  Thus, if you're on a machine that allows only one argument
with the C<#!> line, or worse, doesn't even recognize the C<#!> line, you
still can get consistent switch behaviour regardless of how Perl was
invoked, even if B<-x> was used to find the beginning of the program.

Because historically some operating systems silently chopped off
kernel interpretation of the C<#!> line after 32 characters, some
switches may be passed in on the command line, and some may not;
you could even get a "-" without its letter, if you're not careful.
You probably want to make sure that all your switches fall either
before or after that 32-character boundary.  Most switches don't
actually care if they're processed redundantly, but getting a "-"
instead of a complete switch could cause Perl to try to execute
standard input instead of your program.  And a partial B<-I> switch
could also cause odd results.

Some switches do care if they are processed twice, for instance
combinations of B<-l> and B<-0>.  Either put all the switches after
the 32-character boundary (if applicable), or replace the use of
B<-0>I<digits> by C<BEGIN{ $/ = "\0digits"; }>.

Parsing of the C<#!> switches starts wherever "perl" is mentioned in the line.
The sequences "-*" and "- " are specifically ignored so that you could,
if you were so inclined, say

    #!/bin/sh
    #! -*- perl -*- -p
    eval 'exec perl -x -wS $0 ${1+"$@"}'
        if 0;

to let Perl see the B<-p> switch.

A similar trick involves the I<env> program, if you have it.

    #!/usr/bin/env perl

The examples above use a relative path to the perl interpreter,
getting whatever version is first in the user's path.  If you want
a specific version of Perl, say, perl5.14.1, you should place
that directly in the C<#!> line's path.

If the C<#!> line does not contain the word "perl" nor the word "indir",
the program named after the C<#!> is executed instead of the Perl
interpreter.  This is slightly bizarre, but it helps people on machines
that don't do C<#!>, because they can tell a program that their SHELL is
F</usr/bin/perl>, and Perl will then dispatch the program to the correct
interpreter for them.

After locating your program, Perl compiles the entire program to an
internal form.  If there are any compilation errors, execution of the
program is not attempted.  (This is unlike the typical shell script,
which might run part-way through before finding a syntax error.)

If the program is syntactically correct, it is executed.  If the program
runs off the end without hitting an exit() or die() operator, an implicit
C<exit(0)> is provided to indicate successful completion.

=head2 #! and quoting on non-Unix systems
X<hashbang> X<#!>

Unix's C<#!> technique can be simulated on other systems:

=over 4

=item OS/2

Put

    extproc perl -S -your_switches

as the first line of the C<*.cmd> file (B<-S> is needed because of a
bug in cmd.exe's C<extproc> handling).

=item MS-DOS

Create a batch file to run your program, and codify it in
C<ALTERNATE_SHEBANG> (see the F<dosish.h> file in the source
distribution for more information).

=item Win95/NT

The Win95/NT installation, when using the ActiveState installer for Perl,
will modify the Registry to associate the F<.pl> extension with the perl
interpreter.  If you install Perl by other means (including building from
the sources), you may have to modify the Registry yourself.  Note that
this means you can no longer tell the difference between an executable
Perl program and a Perl library file.

=item VMS

Put

 $ perl -mysw 'f$env("procedure")' 'p1' 'p2' 'p3' 'p4' 'p5' 'p6' 'p7' 'p8' !
 $ exit++ + ++$status != 0 and $exit = $status = undef;

at the top of your program, where B<-mysw> are any command line switches you
want to pass to Perl.  You can now invoke the program directly, by saying
C<perl program>, or as a DCL procedure, by saying C<@program> (or implicitly
via F<DCL$PATH> by just using the name of the program).

This incantation is a bit much to remember, but Perl will display it for
you if you say C<perl "-V:startperl">.

=back

Command-interpreters on non-Unix systems have rather different ideas
on quoting than Unix shells.  You'll need to learn the special
characters in your command-interpreter (C<*>, C<\> and C<"> are
common) and how to protect whitespace and these characters to run
one-liners (see L<-e|/-e commandline> below).

On some systems, you may have to change single-quotes to double ones,
which you must I<not> do on Unix or Plan 9 systems.  You might also
have to change a single % to a %%.

For example:

    # Unix
    perl -e 'print "Hello world\n"'

    # MS-DOS, etc.
    perl -e "print \"Hello world\n\""

    # VMS
    perl -e "print ""Hello world\n"""

The problem is that none of this is reliable: it depends on the
command interpreter, and it is entirely possible that neither form
works.  If I<4DOS> were
the command shell, this would probably work better:

    perl -e "print <Ctrl-x>"Hello world\n<Ctrl-x>""

B<CMD.EXE> in Windows NT slipped a lot of standard Unix functionality in
when nobody was looking, but just try to find documentation for its
quoting rules.

There is no general solution to all of this.  It's just a mess.

=head2 Location of Perl
X<perl, location of interpreter>

It may seem obvious to say, but Perl is useful only when users can
easily find it.  When possible, it's good for both F</usr/bin/perl>
and F</usr/local/bin/perl> to be symlinks to the actual binary.  If
that can't be done, system administrators are strongly encouraged
to put (symlinks to) perl and its accompanying utilities into a
directory typically found along a user's PATH, or in some other
obvious and convenient place.

In this documentation, C<#!/usr/bin/perl> on the first line of the program
will stand in for whatever method works on your system.  You are
advised to use a specific path if you care about a specific version.

    #!/usr/local/bin/perl5.14

or if you just want to be running at least a particular version, place a statement
like this at the top of your program:

    use 5.014;

=head2 Command Switches
X<perl, command switches> X<command switches>

As with all standard commands, a single-character switch may be
clustered with the following switch, if any.

    #!/usr/bin/perl -spi.orig	# same as -s -p -i.orig

A C<--> signals the end of options and disables further option processing. Any
arguments after the C<--> are treated as filenames and arguments.

Switches include:

=over 5

=item B<-0>[I<octal/hexadecimal>]
X<-0> X<$/>

specifies the input record separator (C<$/>) as an octal or
hexadecimal number.  If there are no digits, the null character is the
separator.  Other switches may precede or follow the digits.  For
example, if you have a version of I<find> which can print filenames
terminated by the null character, you can say this:

    find . -name '*.orig' -print0 | perl -n0e unlink

The special value 00 will cause Perl to slurp files in paragraph mode.
Any value 0400 or above will cause Perl to slurp files whole, but by convention
the value 0777 is the one normally used for this purpose.

You can also specify the separator character using hexadecimal notation:
B<-0xI<HHH...>>, where the C<I<H>> are valid hexadecimal digits.  Unlike
the octal form, this one may be used to specify any Unicode character, even
those beyond 0xFF.  So if you I<really> want a record separator of 0777,
specify it as B<-0x1FF>.  (This means that you cannot use the B<-x> option
with a directory name that consists of hexadecimal digits, or else Perl
will think you have specified a hex number to B<-0>.)

=item B<-a>
X<-a> X<autosplit>

turns on autosplit mode when used with a B<-n> or B<-p>.  An implicit
split command to the @F array is done as the first thing inside the
implicit while loop produced by the B<-n> or B<-p>.

    perl -ane 'print pop(@F), "\n";'

is equivalent to

    while (<>) {
	@F = split(' ');
	print pop(@F), "\n";
    }

An alternate delimiter may be specified using B<-F>.

B<-a> implicitly sets B<-n>.

=item B<-C [I<number/list>]>
X<-C>

The B<-C> flag controls some of the Perl Unicode features.

As of 5.8.1, the B<-C> can be followed either by a number or a list
of option letters.  The letters, their numeric values, and effects
are as follows; listing the letters is equal to summing the numbers.

    I     1   STDIN is assumed to be in UTF-8
    O     2   STDOUT will be in UTF-8
    E     4   STDERR will be in UTF-8
    S     7   I + O + E
    i     8   UTF-8 is the default PerlIO layer for input streams
    o    16   UTF-8 is the default PerlIO layer for output streams
    D    24   i + o
    A    32   the @ARGV elements are expected to be strings encoded
              in UTF-8
    L    64   normally the "IOEioA" are unconditional, the L makes
              them conditional on the locale environment variables
              (the LC_ALL, LC_CTYPE, and LANG, in the order of
              decreasing precedence) -- if the variables indicate
              UTF-8, then the selected "IOEioA" are in effect
    a   256   Set ${^UTF8CACHE} to -1, to run the UTF-8 caching
              code in debugging mode.

=for documenting_the_underdocumented
perl.h gives W/128 as PERL_UNICODE_WIDESYSCALLS "/* for Sarathy */"

=for todo
perltodo mentions Unicode in %ENV and filenames. I guess that these will be
options e and f (or F).

For example, B<-COE> and B<-C6> will both turn on UTF-8-ness on both
STDOUT and STDERR.  Repeating letters is just redundant, not cumulative
nor toggling.

The C<io> options mean that any subsequent open() (or similar I/O
operations) in the current file scope will have the C<:utf8> PerlIO layer
implicitly applied to them, in other words, UTF-8 is expected from any
input stream, and UTF-8 is produced to any output stream.  This is just
the default, with explicit layers in open() and with binmode() one can
manipulate streams as usual.

B<-C> on its own (not followed by any number or option list), or the
empty string C<""> for the C<PERL_UNICODE> environment variable, has the
same effect as B<-CSDL>.  In other words, the standard I/O handles and
the default C<open()> layer are UTF-8-fied I<but> only if the locale
environment variables indicate a UTF-8 locale.  This behaviour follows
the I<implicit> (and problematic) UTF-8 behaviour of Perl 5.8.0.
(See L<perl581delta/UTF-8 no longer default under UTF-8 locales>.)

You can use B<-C0> (or C<"0"> for C<PERL_UNICODE>) to explicitly
disable all the above Unicode features.

The read-only magic variable C<${^UNICODE}> reflects the numeric value
of this setting.  This variable is set during Perl startup and is
thereafter read-only.  If you want runtime effects, use the three-arg
open() (see L<perlfunc/open>), the two-arg binmode() (see L<perlfunc/binmode>),
and the C<open> pragma (see L<open>).

(In Perls earlier than 5.8.1 the B<-C> switch was a Win32-only switch
that enabled the use of Unicode-aware "wide system call" Win32 APIs.
This feature was practically unused, however, and the command line
switch was therefore "recycled".)

B<Note:> Since perl 5.10.1, if the B<-C> option is used on the C<#!> line,
it must be specified on the command line as well, since the standard streams
are already set up at this point in the execution of the perl interpreter.
You can also use binmode() to set the encoding of an I/O stream.

=item B<-c>
X<-c>

causes Perl to check the syntax of the program and then exit without
executing it.  Actually, it I<will> execute any C<BEGIN>, C<UNITCHECK>,
or C<CHECK> blocks and any C<use> statements: these are considered as
occurring outside the execution of your program.  C<INIT> and C<END>
blocks, however, will be skipped.

=item B<-d>
X<-d> X<-dt>

=item B<-dt>

runs the program under the Perl debugger.  See L<perldebug>.
If B<t> is specified, it indicates to the debugger that threads
will be used in the code being debugged.

=item B<-d:>I<MOD[=bar,baz]>
X<-d> X<-dt>

=item B<-dt:>I<MOD[=bar,baz]>

runs the program under the control of a debugging, profiling, or tracing
module installed as C<Devel::I<MOD>>. E.g., B<-d:DProf> executes the
program using the C<Devel::DProf> profiler.  As with the B<-M> flag, options
may be passed to the C<Devel::I<MOD>> package where they will be received
and interpreted by the C<Devel::I<MOD>::import> routine.  Again, like B<-M>,
use B<-d:-I<MOD>> to call C<Devel::I<MOD>::unimport> instead of import.  The
comma-separated list of options must follow a C<=> character.  If B<t> is
specified, it indicates to the debugger that threads will be used in the
code being debugged.  See L<perldebug>.

=item B<-D>I<letters>
X<-D> X<DEBUGGING> X<-DDEBUGGING>

=item B<-D>I<number>

sets debugging flags. This switch is enabled only if your perl binary has
been built with debugging enabled: normal production perls won't have
been.

For example, to watch how perl executes your program, use B<-Dtls>.
Another nice value is B<-Dx>, which lists your compiled syntax tree, and
B<-Dr> displays compiled regular expressions; the format of the output is
explained in L<perldebguts>.

As an alternative, specify a number instead of a list of letters (e.g.,
B<-D14> is equivalent to B<-Dtls>):

         1  p  Tokenizing and parsing (with v, displays parse
               stack)
         2  s  Stack snapshots (with v, displays all stacks)
         4  l  Context (loop) stack processing
         8  t  Trace execution
        16  o  Method and overloading resolution
        32  c  String/numeric conversions
        64  P  Print profiling info, source file input state
       128  m  Memory and SV allocation
       256  f  Format processing
       512  r  Regular expression parsing and execution
      1024  x  Syntax tree dump
      2048  u  Tainting checks
      4096  U  Unofficial, User hacking (reserved for private,
               unreleased use)
      8192  H  Hash dump -- usurps values()
     16384  X  Scratchpad allocation
     32768  D  Cleaning up
     65536  S  Op slab allocation
    131072  T  Tokenizing
    262144  R  Include reference counts of dumped variables
               (eg when using -Ds)
    524288  J  show s,t,P-debug (don't Jump over) on opcodes within
               package DB
   1048576  v  Verbose: use in conjunction with other flags
   2097152  C  Copy On Write
   4194304  A  Consistency checks on internal structures
   8388608  q  quiet - currently only suppresses the "EXECUTING"
               message
  16777216  M  trace smart match resolution
  33554432  B  dump suBroutine definitions, including special
               Blocks like BEGIN
  67108864  L  trace Locale-related info; what gets output is very
               subject to change
 134217728  i  trace PerlIO layer processing.  Set PERLIO_DEBUG to
               the filename to trace to.

All these flags require B<-DDEBUGGING> when you compile the Perl
executable (but see C<:opd> in L<Devel::Peek> or L<re/'debug' mode>
which may change this).
See the F<INSTALL> file in the Perl source distribution
for how to do this.

If you're just trying to get a printout of each line of Perl code
as it executes, the way that C<sh -x> provides for shell scripts,
you can't use Perl's B<-D> switch.  Instead do this

  # If you have "env" utility
  env PERLDB_OPTS="NonStop=1 AutoTrace=1 frame=2" perl -dS program

  # Bourne shell syntax
  $ PERLDB_OPTS="NonStop=1 AutoTrace=1 frame=2" perl -dS program

  # csh syntax
  % (setenv PERLDB_OPTS "NonStop=1 AutoTrace=1 frame=2"; perl -dS program)

See L<perldebug> for details and variations.

=item B<-e> I<commandline>
X<-e>

may be used to enter one line of program.  If B<-e> is given, Perl
will not look for a filename in the argument list.  Multiple B<-e>
commands may be given to build up a multi-line script.  Make sure
to use semicolons where you would in a normal program.

=item B<-E> I<commandline>
X<-E>

behaves just like B<-e>, except that it implicitly enables all
optional features (in the main compilation unit). See L<feature>.

=item B<-f>
X<-f> X<sitecustomize> X<sitecustomize.pl>

Disable executing F<$Config{sitelib}/sitecustomize.pl> at startup.

Perl can be built so that it by default will try to execute
F<$Config{sitelib}/sitecustomize.pl> at startup (in a BEGIN block).
This is a hook that allows the sysadmin to customize how Perl behaves.
It can for instance be used to add entries to the @INC array to make Perl
find modules in non-standard locations.

Perl actually inserts the following code:

    BEGIN {
        do { local $!; -f "$Config{sitelib}/sitecustomize.pl"; }
            && do "$Config{sitelib}/sitecustomize.pl";
    }

Since it is an actual C<do> (not a C<require>), F<sitecustomize.pl>
doesn't need to return a true value. The code is run in package C<main>,
in its own lexical scope. However, if the script dies, C<$@> will not
be set.

The value of C<$Config{sitelib}> is also determined in C code and not
read from C<Config.pm>, which is not loaded.

The code is executed I<very> early. For example, any changes made to
C<@INC> will show up in the output of `perl -V`. Of course, C<END>
blocks will be likewise executed very late.

To determine at runtime if this capability has been compiled in your
perl, you can check the value of C<$Config{usesitecustomize}>.

=item B<-F>I<pattern>
X<-F>

specifies the pattern to split on for B<-a>. The pattern may be
surrounded by C<//>, C<"">, or C<''>, otherwise it will be put in single
quotes. You can't use literal whitespace or NUL characters in the pattern.

B<-F> implicitly sets both B<-a> and B<-n>.

=item B<-h>
X<-h>

prints a summary of the options.

=item B<-i>[I<extension>]
X<-i> X<in-place>

specifies that files processed by the C<E<lt>E<gt>> construct are to be
edited in-place.  It does this by renaming the input file, opening the
output file by the original name, and selecting that output file as the
default for print() statements.  The extension, if supplied, is used to
modify the name of the old file to make a backup copy, following these
rules:

If no extension is supplied, and your system supports it, the original
I<file> is kept open without a name while the output is redirected to
a new file with the original I<filename>.  When perl exits, cleanly or not,
the original I<file> is unlinked.

If the extension doesn't contain a C<*>, then it is appended to the
end of the current filename as a suffix.  If the extension does
contain one or more C<*> characters, then each C<*> is replaced
with the current filename.  In Perl terms, you could think of this
as:

    ($backup = $extension) =~ s/\*/$file_name/g;

This allows you to add a prefix to the backup file, instead of (or in
addition to) a suffix:

 $ perl -pi'orig_*' -e 's/bar/baz/' fileA  # backup to
                                           # 'orig_fileA'

Or even to place backup copies of the original files into another
directory (provided the directory already exists):

 $ perl -pi'old/*.orig' -e 's/bar/baz/' fileA  # backup to
                                               # 'old/fileA.orig'

These sets of one-liners are equivalent:

 $ perl -pi -e 's/bar/baz/' fileA          # overwrite current file
 $ perl -pi'*' -e 's/bar/baz/' fileA       # overwrite current file

 $ perl -pi'.orig' -e 's/bar/baz/' fileA   # backup to 'fileA.orig'
 $ perl -pi'*.orig' -e 's/bar/baz/' fileA  # backup to 'fileA.orig'

From the shell, saying

    $ perl -p -i.orig -e "s/foo/bar/; ... "

is the same as using the program:

    #!/usr/bin/perl -pi.orig
    s/foo/bar/;

which is equivalent to

    #!/usr/bin/perl
    $extension = '.orig';
    LINE: while (<>) {
	if ($ARGV ne $oldargv) {
	    if ($extension !~ /\*/) {
		$backup = $ARGV . $extension;
	    }
	    else {
		($backup = $extension) =~ s/\*/$ARGV/g;
	    }
	    rename($ARGV, $backup);
	    open(ARGVOUT, ">$ARGV");
	    select(ARGVOUT);
	    $oldargv = $ARGV;
	}
	s/foo/bar/;
    }
    continue {
	print;	# this prints to original filename
    }
    select(STDOUT);

except that the B<-i> form doesn't need to compare $ARGV to $oldargv to
know when the filename has changed.  It does, however, use ARGVOUT for
the selected filehandle.  Note that STDOUT is restored as the default
output filehandle after the loop.

As shown above, Perl creates the backup file whether or not any output
is actually changed.  So this is just a fancy way to copy files:

    $ perl -p -i'/some/file/path/*' -e 1 file1 file2 file3...
or
    $ perl -p -i'.orig' -e 1 file1 file2 file3...

You can use C<eof> without parentheses to locate the end of each input
file, in case you want to append to each file, or reset line numbering
(see example in L<perlfunc/eof>).

If, for a given file, Perl is unable to create the backup file as
specified in the extension then it will skip that file and continue on
with the next one (if it exists).

For a discussion of issues surrounding file permissions and B<-i>, see
L<perlfaq5/Why does Perl let me delete read-only files?  Why does -i clobber
protected files?  Isn't this a bug in Perl?>.

You cannot use B<-i> to create directories or to strip extensions from
files.

Perl does not expand C<~> in filenames, which is good, since some
folks use it for their backup files:

    $ perl -pi~ -e 's/foo/bar/' file1 file2 file3...

Note that because B<-i> renames or deletes the original file before
creating a new file of the same name, Unix-style soft and hard links will
not be preserved.

Finally, the B<-i> switch does not impede execution when no
files are given on the command line.  In this case, no backup is made
(the original file cannot, of course, be determined) and processing
proceeds from STDIN to STDOUT as might be expected.

=item B<-I>I<directory>
X<-I> X<@INC>

Directories specified by B<-I> are prepended to the search path for
modules (C<@INC>).

=item B<-l>[I<octnum>]
X<-l> X<$/> X<$\>

enables automatic line-ending processing.  It has two separate
effects.  First, it automatically chomps C<$/> (the input record
separator) when used with B<-n> or B<-p>.  Second, it assigns C<$\>
(the output record separator) to have the value of I<octnum> so
that any print statements will have that separator added back on.
If I<octnum> is omitted, sets C<$\> to the current value of
C<$/>.  For instance, to trim lines to 80 columns:

    perl -lpe 'substr($_, 80) = ""'

Note that the assignment C<$\ = $/> is done when the switch is processed,
so the input record separator can be different than the output record
separator if the B<-l> switch is followed by a B<-0> switch:

    gnufind / -print0 | perl -ln0e 'print "found $_" if -p'

This sets C<$\> to newline and then sets C<$/> to the null character.

=item B<-m>[B<->]I<module>
X<-m> X<-M>

=item B<-M>[B<->]I<module>

=item B<-M>[B<->]I<'module ...'>

=item B<-[mM]>[B<->]I<module=arg[,arg]...>

B<-m>I<module> executes C<use> I<module> C<();> before executing your
program.

B<-M>I<module> executes C<use> I<module> C<;> before executing your
program.  You can use quotes to add extra code after the module name,
e.g., C<'-MI<MODULE> qw(foo bar)'>.

If the first character after the B<-M> or B<-m> is a dash (B<->)
then the 'use' is replaced with 'no'.

A little builtin syntactic sugar means you can also say
B<-mI<MODULE>=foo,bar> or B<-MI<MODULE>=foo,bar> as a shortcut for
B<'-MI<MODULE> qw(foo bar)'>.  This avoids the need to use quotes when
importing symbols.  The actual code generated by B<-MI<MODULE>=foo,bar> is
C<use module split(/,/,q{foo,bar})>.  Note that the C<=> form
removes the distinction between B<-m> and B<-M>; that is,
B<-mI<MODULE>=foo,bar> is the same as B<-MI<MODULE>=foo,bar>.

A consequence of this is that B<-MI<MODULE>=number> never does a version check,
unless C<I<MODULE>::import()> itself is set up to do a version check, which
could happen for example if I<MODULE> inherits from L<Exporter>.

=item B<-n>
X<-n>

causes Perl to assume the following loop around your program, which
makes it iterate over filename arguments somewhat like I<sed -n> or
I<awk>:

  LINE:
    while (<>) {
	...		# your program goes here
    }

Note that the lines are not printed by default.  See L</-p> to have
lines printed.  If a file named by an argument cannot be opened for
some reason, Perl warns you about it and moves on to the next file.

Also note that C<< <> >> passes command line arguments to
L<perlfunc/open>, which doesn't necessarily interpret them as file names.
See L<perlop> for possible security implications.

Here is an efficient way to delete all files that haven't been modified for
at least a week:

    find . -mtime +7 -print | perl -nle unlink

This is faster than using the B<-exec> switch of I<find> because you don't
have to start a process on every filename found (but it's not faster
than using the B<-delete> switch available in newer versions of I<find>).
It does suffer from the bug of mishandling newlines in pathnames, which
you can fix if you follow the example under B<-0>.

C<BEGIN> and C<END> blocks may be used to capture control before or after
the implicit program loop, just as in I<awk>.

=item B<-p>
X<-p>

causes Perl to assume the following loop around your program, which
makes it iterate over filename arguments somewhat like I<sed>:


  LINE:
    while (<>) {
	...		# your program goes here
    } continue {
	print or die "-p destination: $!\n";
    }

If a file named by an argument cannot be opened for some reason, Perl
warns you about it, and moves on to the next file.  Note that the
lines are printed automatically.  An error occurring during printing is
treated as fatal.  To suppress printing use the B<-n> switch.  A B<-p>
overrides a B<-n> switch.

C<BEGIN> and C<END> blocks may be used to capture control before or after
the implicit loop, just as in I<awk>.
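
The difference from B<-n> shows up when the same (made-up) program is
run under both switches:

```shell
# Prefix each line with its line number; -p prints $_ after every pass.
numbered=$(printf 'a\nb\n' | perl -pe '$_ = "$. $_"')

# The same program under -n prints nothing, because nothing asks it to.
silent=$(printf 'a\nb\n' | perl -ne '$_ = "$. $_"')

echo "$numbered"
```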

=item B<-s>
X<-s>

enables rudimentary switch parsing for switches on the command
line after the program name but before any filename arguments (or before
an argument of B<-->).  Any switch found there is removed from @ARGV and sets the
corresponding variable in the Perl program.  The following program
prints "1" if the program is invoked with a B<-xyz> switch, and "abc"
if it is invoked with B<-xyz=abc>.

    #!/usr/bin/perl -s
    if ($xyz) { print "$xyz\n" }

Do note that a switch like B<--help> creates the variable C<${-help}>, which is
not compliant with C<use strict "refs">.  Also, when using this option on a
script with warnings enabled you may get a lot of spurious "used only once"
warnings.

=item B<-S>
X<-S>

makes Perl use the PATH environment variable to search for the
program unless the name of the program contains path separators.

On some platforms, this also makes Perl append suffixes to the
filename while searching for it.  For example, on Win32 platforms,
the ".bat" and ".cmd" suffixes are appended if a lookup for the
original name fails, and if the name does not already end in one
of those suffixes.  If your Perl was compiled with C<DEBUGGING> turned
on, using the B<-Dp> switch to Perl shows how the search progresses.

Typically this is used to emulate C<#!> startup on platforms that don't
support C<#!>.  It's also convenient when debugging a script that uses C<#!>,
and is thus normally found by the shell's $PATH search mechanism.

This example works on many platforms that have a shell compatible with
Bourne shell:

    #!/usr/bin/perl
    eval 'exec /usr/bin/perl -wS $0 ${1+"$@"}'
	    if $running_under_some_shell;

The system ignores the first line and feeds the program to F</bin/sh>,
which proceeds to try to execute the Perl program as a shell script.
The shell executes the second line as a normal shell command, and thus
starts up the Perl interpreter.  On some systems $0 doesn't always
contain the full pathname, so the B<-S> tells Perl to search for the
program if necessary.  After Perl locates the program, it parses the
lines and ignores them because the variable $running_under_some_shell
is never true.  If the program will be interpreted by csh, you will need
to replace C<${1+"$@"}> with C<$*>, even though that doesn't understand
embedded spaces (and such) in the argument list.  To start up I<sh> rather
than I<csh>, some systems may have to replace the C<#!> line with a line
containing just a colon, which will be politely ignored by Perl.  Other
systems can't control that, and need a totally devious construct that
will work under any of I<csh>, I<sh>, or Perl, such as the following:

	eval '(exit $?0)' && eval 'exec perl -wS $0 ${1+"$@"}'
	& eval 'exec /usr/bin/perl -wS $0 $argv:q'
		if $running_under_some_shell;

If the filename supplied contains directory separators (and so is an
absolute or relative pathname), and if that file is not found,
platforms that append file extensions will do so and try to look
for the file with those extensions added, one by one.

On DOS-like platforms, if the program does not contain directory
separators, it will first be searched for in the current directory
before being searched for on the PATH.  On Unix platforms, the
program will be searched for strictly on the PATH.

=item B<-t>
X<-t>

Like B<-T>, but taint checks will issue warnings rather than fatal
errors.  These warnings can now be controlled normally with C<no warnings
qw(taint)>.

B<Note: This is not a substitute for C<-T>!> This is meant to be
used I<only> as a temporary development aid while securing legacy code:
for real production code and for new secure code written from scratch,
always use the real B<-T>.

=item B<-T>
X<-T>

turns on "taint" checks so you can test them.  Ordinarily
these checks are done only when running setuid or setgid.  It's a
good idea to turn them on explicitly for programs that run on behalf
of someone else whom you might not necessarily trust, such as CGI
programs or any internet servers you might write in Perl.  See
L<perlsec> for details.  For security reasons, this option must be
seen by Perl quite early; usually this means it must appear early
on the command line or in the C<#!> line for systems which support
that construct.

=item B<-u>
X<-u>

This switch causes Perl to dump core after compiling your
program.  You can then in theory take this core dump and turn it
into an executable file by using the I<undump> program (not supplied).
This speeds startup at the expense of some disk space (which you
can minimize by stripping the executable).  (Still, a "hello world"
executable comes out to about 200K on my machine.)  If you want to
execute a portion of your program before dumping, use the dump()
operator instead.  Note: availability of I<undump> is platform
specific and may not be available for a specific port of Perl.

=item B<-U>
X<-U>

allows Perl to do unsafe operations.  Currently the only "unsafe"
operations are attempting to unlink directories while running as superuser
and running setuid programs with fatal taint checks turned into warnings.
Note that warnings must be enabled along with this option to actually
I<generate> the taint-check warnings.

=item B<-v>
X<-v>

prints the version and patchlevel of your perl executable.

=item B<-V>
X<-V>

prints a summary of the major perl configuration values and the current
values of @INC.

=item B<-V:>I<configvar>

Prints to STDOUT the value of the named configuration variable(s),
with multiples when your C<I<configvar>> argument looks like a regex (has
non-letters).  For example:

    $ perl -V:libc
	libc='/lib/libc-2.2.4.so';
    $ perl -V:lib.
	libs='-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc';
	libc='/lib/libc-2.2.4.so';
    $ perl -V:lib.*
	libpth='/usr/local/lib /lib /usr/lib';
	libs='-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc';
	lib_ext='.a';
	libc='/lib/libc-2.2.4.so';
	libperl='libperl.a';
	....

Additionally, extra colons can be used to control formatting.  A
trailing colon suppresses the linefeed and terminator ";", allowing
you to embed queries into shell commands.  (mnemonic: PATH separator
":".)

    $ echo "compression-vars: " `perl -V:z.*: ` " are here !"
    compression-vars:  zcat='' zip='zip'  are here !

A leading colon removes the "name=" part of the response; this allows
you to map to the name you need.  (mnemonic: empty label)

    $ echo "goodvfork="`./perl -Ilib -V::usevfork`
    goodvfork=false;

Leading and trailing colons can be used together if you need
positional parameter values without the names.  Note that in the case
below, the C<PERL_API> params are returned in alphabetical order.

    $ echo building_on `perl -V::osname: -V::PERL_API_.*:` now
    building_on 'linux' '5' '1' '9' now

=item B<-w>
X<-w>

prints warnings about dubious constructs, such as variable names
mentioned only once and scalar variables used
before being set; redefined subroutines; references to undefined
filehandles; filehandles opened read-only that you are attempting
to write on; values used as a number that don't I<look> like numbers;
using an array as though it were a scalar; if your subroutines
recurse more than 100 deep; and innumerable other things.

This switch really just enables the global C<$^W> variable; normally,
the lexically scoped C<use warnings> pragma is preferred. You
can disable or promote into fatal errors specific warnings using
C<__WARN__> hooks, as described in L<perlvar> and L<perlfunc/warn>.
See also L<perldiag> and L<perltrap>.  A fine-grained warning
facility is also available if you want to manipulate entire classes
of warnings; see L<warnings>.
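
As a minimal sketch of the C<__WARN__> hook approach mentioned above, the
handler below promotes every warning raised inside the C<eval> block to a
fatal error.  The "FATAL:" prefix and the variable names are illustrative
assumptions only:

    use strict;
    use warnings;

    # Promote every warning raised in this scope to a fatal error.
    my $result = eval {
        local $SIG{__WARN__} = sub { die "FATAL: $_[0]" };
        my $undef;
        my $sum = 1 + $undef;    # triggers an "uninitialized value" warning
        1;
    };

    print "caught: $@" unless $result;

Because the handler is installed with C<local>, it is removed again when
the block exits.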

=item B<-W>
X<-W>

Enables all warnings regardless of C<no warnings> or C<$^W>.
See L<warnings>.

=item B<-X>
X<-X>

Disables all warnings regardless of C<use warnings> or C<$^W>.
See L<warnings>.

=item B<-x>
X<-x>

=item B<-x>I<directory>

tells Perl that the program is embedded in a larger chunk of unrelated
text, such as in a mail message.  Leading garbage will be
discarded until the first line that starts with C<#!> and contains the
string "perl".  Any meaningful switches on that line will be applied.

All references to line numbers by the program (warnings, errors, ...)
will treat the C<#!> line as the first line.
Thus a warning on the 2nd line of the program, which is on the 100th
line in the file will be reported as line 2, not as line 100.
This can be overridden by using the C<#line> directive.
(See L<perlsyn/"Plain Old Comments (Not!)">)

If a directory name is specified, Perl will switch to that directory
before running the program.  The B<-x> switch controls only the
disposal of leading garbage.  The program must be terminated with
C<__END__> if there is trailing garbage to be ignored;  the program
can process any or all of the trailing garbage via the C<DATA> filehandle
if desired.

The directory, if specified, must appear immediately following the B<-x>
with no intervening whitespace.

=back

=head1 ENVIRONMENT
X<perl, environment variables>

=over 12

=item HOME
X<HOME>

Used if C<chdir> has no argument.

=item LOGDIR
X<LOGDIR>

Used if C<chdir> has no argument and HOME is not set.

=item PATH
X<PATH>

Used in executing subprocesses, and in finding the program if B<-S> is
used.

=item PERL5LIB
X<PERL5LIB>

A list of directories in which to look for Perl library files before
looking in the standard library.
Any architecture-specific and version-specific directories,
such as F<version/archname/>, F<version/>, or F<archname/> under the
specified locations are automatically included if they exist, with this
lookup done at interpreter startup time.  In addition, any directories
matching the entries in C<$Config{inc_version_list}> are added.
(These typically would be for older compatible perl versions installed
in the same directory tree.)

If PERL5LIB is not defined, PERLLIB is used.  Directories are separated
(like in PATH) by a colon on Unixish platforms and by a semicolon on
Windows (the proper path separator being given by the command C<perl
-V:I<path_sep>>).

When running taint checks, either because the program was running setuid or
setgid, or the B<-T> or B<-t> switch was specified, neither PERL5LIB nor
PERLLIB is consulted. The program should instead say:

    use lib "/my/directory";

=item PERL5OPT
X<PERL5OPT>

Command-line options (switches).  Switches in this variable are treated
as if they were on every Perl command line.  Only the B<-[CDIMUdmtwW]>
switches are allowed.  When running taint checks (either because the
program was running setuid or setgid, or because the B<-T> or B<-t>
switch was used), this variable is ignored.  If PERL5OPT begins with
B<-T>, tainting will be enabled and subsequent options ignored.  If
PERL5OPT begins with B<-t>, tainting will be enabled, a writable dot
removed from @INC, and subsequent options honored.

=item PERLIO
X<PERLIO>

A space (or colon) separated list of PerlIO layers. If perl is built
to use the PerlIO system for IO (the default), these layers affect Perl's IO.

It is conventional to start layer names with a colon (for example, C<:perlio>) to
emphasize their similarity to variable "attributes". But the code that parses
layer specification strings, which is also used to decode the PERLIO
environment variable, treats the colon as a separator.

An unset or empty PERLIO is equivalent to the default set of layers for
your platform; for example, C<:unix:perlio> on Unix-like systems
and C<:unix:crlf> on Windows and other DOS-like systems.

The list becomes the default for I<all> Perl's IO. Consequently only built-in
layers can appear in this list, as external layers (such as C<:encoding()>) need
IO in order to load them!  See L<"open pragma"|open> for how to add external
encodings as defaults.

Layers it makes sense to include in the PERLIO environment
variable are briefly summarized below. For more details see L<PerlIO>.

=over 8

=item :bytes
X<:bytes>

A pseudolayer that turns the C<:utf8> flag I<off> for the layer below;
unlikely to be useful on its own in the global PERLIO environment variable.
You perhaps were thinking of C<:crlf:bytes> or C<:perlio:bytes>.

=item :crlf
X<:crlf>

A layer which does CRLF to C<"\n"> translation distinguishing "text" and
"binary" files in the manner of MS-DOS and similar operating systems.
(It currently does I<not> mimic MS-DOS as far as treating of Control-Z
as being an end-of-file marker.)

=item :mmap
X<:mmap>

A layer that implements "reading" of files by using I<mmap>(2) to
make an entire file appear in the process's address space, and then
using that as PerlIO's "buffer".

=item :perlio
X<:perlio>

This is a re-implementation of stdio-like buffering written as a
PerlIO layer.  As such it will call whatever layer is below it for
its operations, typically C<:unix>.

=item :pop
X<:pop>

An experimental pseudolayer that removes the topmost layer.
Use with the same care as is reserved for nitroglycerine.

=item :raw
X<:raw>

A pseudolayer that manipulates other layers.  Applying the C<:raw>
layer is equivalent to calling C<binmode($fh)>.  It makes the stream
pass each byte as-is without translation.  In particular, both CRLF
translation and intuiting C<:utf8> from the locale are disabled.

Unlike in earlier versions of Perl, C<:raw> is I<not>
just the inverse of C<:crlf>: other layers which would affect the
binary nature of the stream are also removed or disabled.

=item :stdio
X<:stdio>

This layer provides a PerlIO interface by wrapping system's ANSI C "stdio"
library calls. The layer provides both buffering and IO.
Note that the C<:stdio> layer does I<not> do CRLF translation even if that
is the platform's normal behaviour. You will need a C<:crlf> layer above it
to do that.

=item :unix
X<:unix>

Low-level layer that calls C<read>, C<write>, C<lseek>, etc.

=item :utf8
X<:utf8>

A pseudolayer that enables a flag in the layer below to tell Perl
that output should be in utf8 and that input should be regarded as
already in valid utf8 form. B<WARNING: It does not check for validity and as such
should be handled with extreme caution for input, because security violations
can occur with non-shortest UTF-8 encodings, etc.> Generally C<:encoding(UTF-8)> is
the best option when reading UTF-8 encoded data.

=item :win32
X<:win32>

On Win32 platforms this I<experimental> layer uses native "handle" IO
rather than a Unix-like numeric file descriptor layer. Known to be
buggy in this release (5.14).

=back

The default set of layers should give acceptable results on all platforms.

For Unix platforms that will be the equivalent of "unix perlio" or "stdio".
Configure is set up to prefer the "stdio" implementation if the system's library
provides for fast access to the buffer; otherwise, it uses the "unix perlio"
implementation.

On Win32 the default in this release (5.14) is "unix crlf". Win32's "stdio"
has a number of bugs/mis-features for Perl IO that depend somewhat
on the version and vendor of the C compiler. Using our own C<crlf> layer as
the buffer avoids those issues and makes things more uniform.  The C<crlf>
layer provides CRLF conversion as well as buffering.

This release (5.14) uses C<unix> as the bottom layer on Win32, and so still
uses the C compiler's numeric file descriptor routines. There is an
experimental native C<win32> layer, which is expected to be enhanced and
should eventually become the default under Win32.

The PERLIO environment variable is completely ignored when Perl
is run in taint mode.
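
The layers actually in effect on a handle can be inspected with
C<PerlIO::get_layers()>.  The following is a minimal sketch rather than a
definitive recipe: the temporary file and the variable names are
illustrative assumptions, and the default list printed depends on your
platform and on any PERLIO setting.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Create an empty scratch file so there is something to open.
    my ($tmp, $path) = tempfile(UNLINK => 1);
    close($tmp);

    # Open it read-only and list the default layers on the handle.
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    my @default_layers = PerlIO::get_layers($fh);

    # Push a validating UTF-8 decoding layer on top and list again.
    binmode($fh, ':encoding(UTF-8)') or die "binmode failed: $!";
    my @with_encoding = PerlIO::get_layers($fh);
    close($fh);

    print "default:  @default_layers\n";
    print "encoding: @with_encoding\n";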

=item PERLIO_DEBUG
X<PERLIO_DEBUG>

If set to the name of a file or device when Perl is run with the
B<-Di> command-line switch, the logging of certain operations of
the PerlIO subsystem will be redirected to the specified file rather
than going to stderr, which is the default. The file is opened in append
mode. Typical uses are in Unix:

   % env PERLIO_DEBUG=/tmp/perlio.log perl -Di script ...

and under Win32, the approximately equivalent:

   > set PERLIO_DEBUG=CON
   perl -Di script ...

This functionality is disabled for setuid scripts, for scripts run
with B<-T>, and for scripts run on a Perl built without C<-DDEBUGGING>
support.

=item PERLLIB
X<PERLLIB>

A list of directories in which to look for Perl library
files before looking in the standard library.
If PERL5LIB is defined, PERLLIB is not used.

The PERLLIB environment variable is completely ignored when Perl
is run in taint mode.

=item PERL5DB
X<PERL5DB>

The command used to load the debugger code.  The default is:

	BEGIN { require "perl5db.pl" }

The PERL5DB environment variable is only used when Perl is started with
a bare B<-d> switch.

=item PERL5DB_THREADED
X<PERL5DB_THREADED>

If set to a true value, indicates to the debugger that the code being
debugged uses threads.

=item PERL5SHELL (specific to the Win32 port)
X<PERL5SHELL>

On Win32 ports only, may be set to an alternative shell that Perl must use
internally for executing "backtick" commands or system().  Default is
C<cmd.exe /x/d/c> on WindowsNT and C<command.com /c> on Windows95.  The
value is considered space-separated.  Precede any character that
needs to be protected, like a space or backslash, with another backslash.

Note that Perl doesn't use COMSPEC for this purpose because
COMSPEC has a high degree of variability among users, leading to
portability concerns.  Besides, Perl can use a shell that may not be
fit for interactive use, and setting COMSPEC to such a shell may
interfere with the proper functioning of other programs (which usually
look in COMSPEC to find a shell fit for interactive use).

Before Perl 5.10.0 and 5.8.8, PERL5SHELL was not taint checked
when running external commands.  It is recommended that
you explicitly set (or delete) C<$ENV{PERL5SHELL}> when running
in taint mode under Windows.

=item PERL_ALLOW_NON_IFS_LSP (specific to the Win32 port)
X<PERL_ALLOW_NON_IFS_LSP>

Set to 1 to allow the use of non-IFS compatible LSPs (Layered Service Providers).
Perl normally searches for an IFS-compatible LSP because this is required
for its emulation of Windows sockets as real filehandles.  However, this may
cause problems if you have a firewall such as I<McAfee Guardian>, which requires
that all applications use its LSP but which is not IFS-compatible, because clearly
Perl will normally avoid using such an LSP.

Setting this environment variable to 1 means that Perl will simply use the
first suitable LSP enumerated in the catalog, which keeps I<McAfee Guardian>
happy--and in that particular case Perl still works too because I<McAfee
Guardian>'s LSP actually plays other games which allow applications
requiring IFS compatibility to work.

=item PERL_DEBUG_MSTATS
X<PERL_DEBUG_MSTATS>

Relevant only if Perl is compiled with the C<malloc> included with the Perl
distribution; that is, if C<perl -V:d_mymalloc> is "define".

If set, this dumps out memory statistics after execution.  If set
to an integer greater than one, also dumps out memory statistics
after compilation.

=item PERL_DESTRUCT_LEVEL
X<PERL_DESTRUCT_LEVEL>

Relevant only if your Perl executable was built with B<-DDEBUGGING>,
this controls the behaviour of global destruction of objects and other
references.  See L<perlhacktips/PERL_DESTRUCT_LEVEL> for more information.

=item PERL_DL_NONLAZY
X<PERL_DL_NONLAZY>

Set to C<"1"> to have Perl resolve I<all> undefined symbols when it loads
a dynamic library.  The default behaviour is to resolve symbols when
they are used.  Setting this variable is useful during testing of
extensions, as it ensures that you get an error on misspelled function
names even if the test suite doesn't call them.

=item PERL_ENCODING
X<PERL_ENCODING>

If using the C<use encoding> pragma without an explicit encoding name, the
PERL_ENCODING environment variable is consulted for an encoding name.

=item PERL_HASH_SEED
X<PERL_HASH_SEED>

(Since Perl 5.8.1, new semantics in Perl 5.18.0)  Used to override
the randomization of Perl's internal hash function. The value is expressed
in hexadecimal, and may include a leading 0x. Truncated patterns
are treated as though they are suffixed with sufficient 0's as required.

If the option is provided, and C<PERL_PERTURB_KEYS> is NOT set, then
a value of '0' implies C<PERL_PERTURB_KEYS=0> and any other value
implies C<PERL_PERTURB_KEYS=2>.

B<PLEASE NOTE: The hash seed is sensitive information>. Hashes are
randomized to protect against local and remote attacks against Perl
code. By manually setting a seed, this protection may be partially or
completely lost.

See L<perlsec/"Algorithmic Complexity Attacks">, L</PERL_PERTURB_KEYS>, and
L</PERL_HASH_SEED_DEBUG> for more information.

=item PERL_PERTURB_KEYS
X<PERL_PERTURB_KEYS>

(Since Perl 5.18.0)  When set to C<"0"> or C<"NO">, traversing keys
will be repeatable from run to run for the same PERL_HASH_SEED.
Insertion into a hash will not change the order, except to provide
for more space in the hash. When combined with setting PERL_HASH_SEED,
this mode is as close to pre-5.18 behavior as you can get.

When set to C<"1"> or C<"RANDOM">, traversing keys will be randomized.
Every time a hash is inserted into, the key order will change in a random
fashion. The order may not be repeatable in a following program run
even if the PERL_HASH_SEED has been specified. This is the default
mode for perl.

When set to C<"2"> or C<"DETERMINISTIC">, inserting keys into a hash
will cause the key order to change, but in a way that is repeatable
from program run to program run.

B<NOTE:> Use of this option is considered insecure, and is intended only
for debugging non-deterministic behavior in Perl's hash function. Do
not use it in production.

See L<perlsec/"Algorithmic Complexity Attacks"> and L</PERL_HASH_SEED>
and L</PERL_HASH_SEED_DEBUG> for more information. You can get and set the
key traversal mask for a specific hash by using the C<hash_traversal_mask()>
function from L<Hash::Util>.
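
As a debugging-only illustration of the repeatable mode, the sketch below
runs the same one-liner twice in a child perl with PERL_HASH_SEED and
PERL_PERTURB_KEYS both set to C<"0"> and compares the key order.  It
assumes a system where the list form of a piped C<open> is available and
a perl that honours these environment variables; the helper name
C<key_order_in_child> is hypothetical.

    use strict;
    use warnings;

    # Debugging aid only: a fixed seed removes the protection described above.
    my $code = 'my %h = map { $_ => 1 } "a" .. "t"; print join ",", keys %h';

    sub key_order_in_child {
        local $ENV{PERL_HASH_SEED}    = '0';
        local $ENV{PERL_PERTURB_KEYS} = '0';
        open(my $child, '-|', $^X, '-e', $code) or die "Cannot run $^X: $!";
        my $order = <$child>;
        close($child);
        return $order;
    }

    my $first  = key_order_in_child();
    my $second = key_order_in_child();
    print $first eq $second ? "repeatable\n" : "differs\n";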

=item PERL_HASH_SEED_DEBUG
X<PERL_HASH_SEED_DEBUG>

(Since Perl 5.8.1.)  Set to C<"1"> to display (to STDERR) information
about the hash function, seed, and what type of key traversal
randomization is in effect at the beginning of execution.  This, combined
with L</PERL_HASH_SEED> and L</PERL_PERTURB_KEYS> is intended to aid in
debugging nondeterministic behaviour caused by hash randomization.

B<Note> that any information about the hash function, especially the hash
seed is B<sensitive information>: by knowing it, one can craft a denial-of-service
attack against Perl code, even remotely; see L<perlsec/"Algorithmic Complexity Attacks">
for more information. B<Do not disclose the hash seed> to people who
don't need to know it. See also C<hash_seed()> and
C<hash_traversal_mask()> in L<Hash::Util>.

An example output might be:

 HASH_FUNCTION = ONE_AT_A_TIME_HARD HASH_SEED = 0x652e9b9349a7a032 PERTURB_KEYS = 1 (RANDOM)

=item PERL_MEM_LOG
X<PERL_MEM_LOG>

If your Perl was configured with B<-Accflags=-DPERL_MEM_LOG>, setting
the environment variable C<PERL_MEM_LOG> enables logging debug
messages. The value has the form C<< <I<number>>[m][s][t] >>, where
C<I<number>> is the file descriptor number you want to write to (2 is
default), and the combination of letters specifies that you want
information about (m)emory and/or (s)v, optionally with
(t)imestamps. For example, C<PERL_MEM_LOG=1mst> logs all
information to stdout. You can write to other opened file descriptors
in a variety of ways:

  $ 3>foo3 PERL_MEM_LOG=3m perl ...

=item PERL_ROOT (specific to the VMS port)
X<PERL_ROOT>

A translation-concealed rooted logical name that contains Perl and the
logical device for the @INC path on VMS only.  Other logical names that
affect Perl on VMS include PERLSHR, PERL_ENV_TABLES, and
SYS$TIMEZONE_DIFFERENTIAL, but are optional and discussed further in
L<perlvms> and in F<README.vms> in the Perl source distribution.

=item PERL_SIGNALS
X<PERL_SIGNALS>

Available in Perls 5.8.1 and later.  If set to C<"unsafe">, the pre-Perl-5.8.0
signal behaviour (which is immediate but unsafe) is restored.  If set
to C<safe>, then safe (but deferred) signals are used.  See
L<perlipc/"Deferred Signals (Safe Signals)">.

=item PERL_UNICODE
X<PERL_UNICODE>

Equivalent to the B<-C> command-line switch.  Note that this is not
a boolean variable. Setting this to C<"1"> is not the right way to
"enable Unicode" (whatever that would mean).  You can use C<"0"> to
"disable Unicode", though (or alternatively unset PERL_UNICODE in
your shell before starting Perl).  See the description of the B<-C>
switch for more information.

=item PERL_USE_UNSAFE_INC
X<PERL_USE_UNSAFE_INC>

If perl has been configured to not have the current directory in
L<C<@INC>|perlvar/@INC> by default, this variable can be set to C<"1">
to reinstate it.  It's primarily intended for use while building and
testing modules that have not been updated to deal with "." not being in
C<@INC> and should not be set in the environment for day-to-day use.

=item SYS$LOGIN (specific to the VMS port)
X<SYS$LOGIN>

Used if chdir has no argument and HOME and LOGDIR are not set.

=back

Perl also has environment variables that control how Perl handles data
specific to particular natural languages; see L<perllocale>.

Perl and its various modules and components, including its test frameworks,
may sometimes make use of certain other environment variables.  Some of
these are specific to a particular platform.  Please consult the
appropriate module documentation and any documentation for your platform
(like L<perlsolaris>, L<perllinux>, L<perlmacosx>, L<perlwin32>, etc) for
variables peculiar to those specific situations.

Perl makes all environment variables available to the program being
executed, and passes these along to any child processes it starts.
However, programs running setuid would do well to execute the following
lines before doing anything else, just to keep people honest:

    $ENV{PATH}  = "/bin:/usr/bin";    # or whatever you need
    $ENV{SHELL} = "/bin/sh" if exists $ENV{SHELL};
    delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};
perl5162delta.pod000064400000007012150344123500007537 0ustar00=encoding utf8

=head1 NAME

perl5162delta - what is new for perl v5.16.2

=head1 DESCRIPTION

This document describes differences between the 5.16.1 release and
the 5.16.2 release.

If you are upgrading from an earlier release such as 5.16.0, first read
L<perl5161delta>, which describes differences between 5.16.0 and
5.16.1.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.16.0.
If any exist, they are bugs, and we request that you submit a
report.  See L</Reporting Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Module::CoreList> has been upgraded from version 2.70 to version 2.76.

=back

=head1 Configuration and Compilation

=over 4

=item * configuration should no longer be confused by ls colorization

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item AIX

Configure now always adds -qlanglvl=extc99 to the CC flags on AIX when
using xlC.  This will make it easier to compile a number of XS-based modules
that assume C99 [perl #113778].

=back

=head1 Selected Bug Fixes

=over 4

=item * fix /\h/ equivalence with /[\h]/

see [perl #114220]

=back

=head1 Known Problems

There are no new known problems.

=head1 Acknowledgements

Perl 5.16.2 represents approximately 2 months of development since Perl
5.16.1 and contains approximately 740 lines of changes across 20 files
from 9 authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers. The following people are known to
have contributed the improvements that became Perl 5.16.2:

Andy Dougherty, Craig A. Berry, Darin McBride, Dominic Hargreaves, Karen
Etheridge, Karl Williamson, Peter Martini, Ricardo Signes, Tony Cook.

The list above is almost certainly incomplete as it is automatically
generated from version control history. In particular, it does not
include the names of the (very much appreciated) contributors who
reported issues to the Perl bug tracker.

For a more complete list of all of Perl's historical contributors,
please see the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please
send it to perl5-security-report@perl.org. This points to a closed
subscription unarchived mailing list, which includes all the core
committers, who will be able to help assess the impact of issues, figure
out a resolution, and help co-ordinate the release of patches to
mitigate or fix the problem across all platforms on which Perl is
supported. Please only use this address for security issues in the Perl
core, not for modules independently distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=head1 NAME

perldeprecation - list Perl deprecations

=head1 DESCRIPTION

The purpose of this document is to document what has been deprecated
in Perl, and by which version the deprecated feature will disappear,
or, for already removed features, when it was removed.

This document will try to discuss what alternatives for the deprecated
features are available.

The deprecated features will be grouped by the version of Perl in
which they will be removed.

=head2 Perl 5.32

=head3 Constants from lexical variables potentially modified elsewhere

You wrote something like

    my $var;
    $sub = sub () { $var };

but $var is referenced elsewhere and could be modified after the C<sub>
expression is evaluated.  Either it is explicitly modified elsewhere
(C<$var = 3>) or it is passed to a subroutine or to an operator like
C<printf> or C<map>, which may or may not modify the variable.

Traditionally, Perl has captured the value of the variable at that
point and turned the subroutine into a constant eligible for inlining.
In those cases where the variable can be modified elsewhere, this
breaks the behavior of closures, in which the subroutine captures
the variable itself, rather than its value, so future changes to the
variable are reflected in the subroutine's return value.

If you intended for the subroutine to be eligible for inlining, then
make sure the variable is not referenced elsewhere, possibly by
copying it:

    my $var2 = $var;
    $sub = sub () { $var2 };

If you do want this subroutine to be a closure that reflects future
changes to the variable that it closes over, add an explicit C<return>:

    my $var;
    $sub = sub () { return $var };

This usage has been deprecated, and will no longer be allowed in Perl 5.32.

=head2 Perl 5.30

=head3 C<< $* >> is no longer supported

Before Perl 5.10, setting C<< $* >> to a true value globally enabled
multi-line matching within a string. This relic from the past lost
its special meaning in 5.10. Use of this variable will be a fatal error
in Perl 5.30, freeing the variable up for a future special meaning.

To enable multiline matching one should use the C<< /m >> regexp
modifier (possibly in combination with C<< /s >>). This can be set
on a per-match basis, or can be enabled per lexical scope (including
a whole file) with C<< use re '/m' >>.
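
Both forms can be sketched as follows.  This is a minimal illustration
that assumes a perl new enough to support C<use re '/m'>; the sample text
and variable names are made up for the example.

    use strict;
    use warnings;

    my $text = "first line\nsecond line\n";

    # /m lets ^ and $ match at embedded line boundaries for this one match.
    my @starts = $text =~ /^(\w+)/mg;

    {
        # Enable /m for every pattern compiled in this lexical scope.
        use re '/m';
        print "found\n" if $text =~ /^second/;
    }

    print "@starts\n";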

=head3 C<< $# >> is no longer supported

This variable used to have a special meaning -- it could be used
to control how numbers were formatted when printed. This seldom
used functionality was removed in Perl 5.10. In order to free up
the variable for a future special meaning, its use will be a fatal
error in Perl 5.30.

To specify how numbers are formatted when printed, one is advised
to use C<< printf >> or C<< sprintf >> instead.
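
For instance, the format can be stated explicitly at the point of output.
This is only a small sketch; the value and the chosen precisions are
arbitrary.

    use strict;
    use warnings;

    my $value = 3.14159265;

    # Format explicitly at the point of output instead of relying on $#.
    my $short = sprintf('%.3f', $value);
    printf("%.2e\n", $value);
    print "$short\n";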

=head3 C<< File::Glob::glob() >> will disappear

C<< File::Glob >> has a function called C<< glob >>, which just calls
C<< bsd_glob >>. However, its prototype is different from the prototype
of C<< CORE::glob >>, and hence, C<< File::Glob::glob >> should not
be used.

C<< File::Glob::glob() >> was deprecated in Perl 5.8. A deprecation
message was issued from Perl 5.26 onwards, and the function will
disappear in Perl 5.30.

Code using C<< File::Glob::glob() >> should call
C<< File::Glob::bsd_glob() >> instead.
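
A minimal sketch of the replacement call follows.  The scratch directory
and the file names are assumptions made for the example; note that
C<bsd_glob()> treats its argument as a single pattern and does not split
it on whitespace.

    use strict;
    use warnings;
    use File::Glob qw(bsd_glob);
    use File::Temp qw(tempdir);

    # Build a scratch directory whose name contains a space.
    my $dir = tempdir('glob demo XXXXXX', TMPDIR => 1, CLEANUP => 1);
    for my $name (qw(a.pod b.pod c.txt)) {
        open(my $out, '>', "$dir/$name") or die "Cannot create $name: $!";
        close($out);
    }

    # bsd_glob() takes the whole string as one pattern, spaces included.
    my @pods = bsd_glob("$dir/*.pod");
    print scalar(@pods), " .pod files\n";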


=head3 Unescaped left braces in regular expressions

The simple rule to remember, if you want to match a literal C<{>
character (U+007B C<LEFT CURLY BRACKET>) in a regular expression
pattern, is to escape each literal instance of it in some way.
Generally easiest is to precede it with a backslash, like C<\{>
or enclose it in square brackets (C<[{]>).  If the pattern
delimiters are also braces, any matching right brace (C<}>) should
also be escaped to avoid confusing the parser, for example,

 qr{abc\{def\}ghi}

Forcing literal C<{> characters to be escaped will enable the Perl
language to be extended in various ways in future releases.  To avoid
needlessly breaking existing code, the restriction is not enforced in
contexts where there are unlikely to ever be extensions that could
conflict with the use there of C<{> as a literal.

Literal uses of C<{> were deprecated in Perl 5.20, and some uses of it
have given deprecation warnings since then. These cases were made fatal
in Perl 5.26. Due to an oversight, not all cases of a use of a literal
C<{> got a deprecation warning. These cases started warning in Perl 5.26,
and they will be fatal by Perl 5.30.

=head3 Unqualified C<dump()>

Use of C<dump()> instead of C<CORE::dump()> was deprecated in Perl 5.8,
and an unqualified C<dump()> will no longer be available in Perl 5.30.

See L<perlfunc/dump>.


=head3 Using my() in false conditional.

There has been a long-standing bug in Perl that causes a lexical variable
not to be cleared at scope exit when its declaration includes a false
conditional.  Some people have exploited this bug to achieve a kind of
static variable.  Since we intend to fix this bug, we don't want people
relying on this behavior.

Instead, it's recommended one uses C<state> variables to achieve the
same effect:

    use 5.10.0;
    sub count {state $counter; return ++ $counter}
    say count ();    # Prints 1
    say count ();    # Prints 2

C<state> variables were introduced in Perl 5.10.

Alternatively, you can achieve a similar static effect by
declaring the variable in a separate block outside the function, e.g.,

    sub f { my $x if 0; return $x++ }

becomes

    { my $x; sub f { return $x++ } }

The use of C<my()> in a false conditional was deprecated in
Perl 5.10, and it will become a fatal error in Perl 5.30.


=head3 Reading/writing bytes from/to :utf8 handles.

The sysread(), recv(), syswrite() and send() operators are
deprecated on handles that have the C<:utf8> layer, either explicitly or
implicitly, e.g., with the C<:encoding(UTF-16LE)> layer.

Both sysread() and recv() currently use only the C<:utf8> flag for the stream,
ignoring the actual layers.  Since sysread() and recv() do no UTF-8
validation they can end up creating invalidly encoded scalars.

Similarly, syswrite() and send() use only the C<:utf8> flag, otherwise ignoring
any layers.  If the flag is set, both write the value UTF-8 encoded, even if
the layer is some different encoding, such as the example above.

Ideally, all of these operators would completely ignore the C<:utf8> state,
working only with bytes, but this would result in silently breaking existing
code.  To avoid this, a future version of perl will throw an exception when
any of sysread(), recv(), syswrite() or send() is called on a handle with the
C<:utf8> layer.

In Perl 5.30, it will no longer be possible to use sysread(), recv(),
syswrite() or send() to read or send bytes from/to :utf8 handles.
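
One way to stay clear of this restriction is to keep the handle free of
any C<:utf8> layer, read bytes, and decode them explicitly afterwards.
The sketch below assumes the data is UTF-8 and small enough to collect in
memory; the temporary file, sample string, and variable names are
illustrative only.

    use strict;
    use warnings;
    use Encode qw(decode encode);
    use File::Temp qw(tempfile);

    # Write a small UTF-8 encoded sample file as raw bytes.
    my ($out, $path) = tempfile(UNLINK => 1);
    binmode($out, ':raw');
    print {$out} encode('UTF-8', "caf\x{e9} \x{263A}\n");
    close($out) or die "Cannot close $path: $!";

    # Read bytes from a handle that has no :utf8 layer ...
    open(my $in, '<:raw', $path) or die "Cannot open $path: $!";
    my $bytes = '';
    while (my $got = sysread($in, $bytes, 4096, length $bytes)) { }
    close($in);

    # ... and decode once the complete byte string is available.
    my $text = decode('UTF-8', $bytes);
    printf "%d bytes, %d characters\n", length($bytes), length($text);

Decoding only after all bytes have arrived avoids splitting a multi-byte
character across two reads.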


=head3 Use of unassigned code point or non-standalone grapheme for a delimiter.

A grapheme is what appears to a native speaker of a language to be a
character.  In Unicode (and hence Perl) a grapheme may actually be
several adjacent characters that together form a complete grapheme.  For
example, there can be a base character, like "R" and an accent, like a
circumflex "^", that appear when displayed to be a single character with
the circumflex hovering over the "R".  Perl currently allows things like
that circumflex to be delimiters of strings, patterns, I<etc>.  When
displayed, the circumflex would look like it belongs to the character
just to the left of it.  In order to move the language to be able to
accept graphemes as delimiters, we have to deprecate the use of
delimiters which aren't graphemes by themselves.  Also, a delimiter must
already be assigned (or known to be never going to be assigned) to try
to future-proof code, for otherwise code that works today would fail to
compile if the currently unassigned delimiter ends up being something
that isn't a stand-alone grapheme.  Because Unicode is never going to
assign
L<non-character code points|perlunicode/Noncharacter code points>, nor
L<code points that are above the legal Unicode maximum|
perlunicode/Beyond Unicode code points>, those can be delimiters, and
their use won't raise this warning.

In Perl 5.30, delimiters which are unassigned code points, or which
are non-standalone graphemes will be fatal.

=head3 In XS code, use of various macros dealing with UTF-8.

These macros will require an extra parameter in Perl 5.30:
C<isALPHANUMERIC_utf8>,
C<isASCII_utf8>,
C<isBLANK_utf8>,
C<isCNTRL_utf8>,
C<isDIGIT_utf8>,
C<isIDFIRST_utf8>,
C<isPSXSPC_utf8>,
C<isSPACE_utf8>,
C<isVERTWS_utf8>,
C<isWORDCHAR_utf8>,
C<isXDIGIT_utf8>,
C<isALPHANUMERIC_LC_utf8>,
C<isALPHA_LC_utf8>,
C<isASCII_LC_utf8>,
C<isBLANK_LC_utf8>,
C<isCNTRL_LC_utf8>,
C<isDIGIT_LC_utf8>,
C<isGRAPH_LC_utf8>,
C<isIDCONT_LC_utf8>,
C<isIDFIRST_LC_utf8>,
C<isLOWER_LC_utf8>,
C<isPRINT_LC_utf8>,
C<isPSXSPC_LC_utf8>,
C<isPUNCT_LC_utf8>,
C<isSPACE_LC_utf8>,
C<isUPPER_LC_utf8>,
C<isWORDCHAR_LC_utf8>,
C<isXDIGIT_LC_utf8>,
C<toFOLD_utf8>,
C<toLOWER_utf8>,
C<toTITLE_utf8>,
and
C<toUPPER_utf8>.

There is now a macro that corresponds to each one of these, simply by
appending C<_safe> to the name.  It takes the extra parameter.
For example, C<isDIGIT_utf8_safe> corresponds to C<isDIGIT_utf8>, but
takes the extra parameter, and its use doesn't generate a deprecation
warning.  All are documented in L<perlapi/Character case changing> and
L<perlapi/Character classification>.

You can change to use these versions at any time, or, if you can live
with the deprecation messages, wait until 5.30 and add the parameter to
the existing calls, without changing the names.

=head2 Perl 5.28

=head3 Attribute "%s" is deprecated, and will disappear in 5.28

The attributes C<< :locked >> (on code references) and C<< :unique >>
(on array, hash and scalar references) have had no effect since 
Perl 5.005 and Perl 5.8.8 respectively. Their use has been deprecated
since.

These attributes will no longer be recognized in Perl 5.28, and will
then result in a syntax error. Since the attributes do not do anything,
removing them from your code fixes the deprecation warning; and removing
them will not influence the behaviour of your code.


=head3 Bare here-document terminators

Perl has allowed you to use a bare here-document terminator to have the
here-document end at the first empty line. This practise was deprecated
in Perl 5.000, and this will be a fatal error in Perl 5.28.

You are encouraged to use the explicitly quoted form if you wish to
use an empty line as the terminator of the here-document:

  print <<"";
    Print this line.

  # Previous blank line ends the here-document.


=head3 Setting $/ to a reference to a non-positive integer

You assigned a reference to a scalar to C<$/> where the
referenced item is not a positive integer.  In older perls this B<appeared>
to work the same as setting it to C<undef> but was in fact internally
different, less efficient and with very bad luck could have resulted in
your file being split by a stringified form of the reference.

In Perl 5.20.0 this was changed so that it would be B<exactly> the same as
setting C<$/> to C<undef>, with the exception that this warning would be
thrown.

In Perl 5.28, this will throw a fatal error.

You are recommended to change your code to set C<$/> to C<undef> explicitly
if you wish to slurp the file.
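
As a minimal sketch of the recommended replacement, the helper below
localizes C<$/> to C<undef> before reading, so the whole file is returned
as a single string.  The name C<slurp_file> and the temporary file are
illustrative assumptions, not part of Perl itself.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read an entire file into one string.
sub slurp_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    local $/ = undef;    # explicit undef, not a reference such as \0
    my $content = <$fh>;
    close($fh) or die "Cannot close $path: $!";
    return $content;
}

# Create a small scratch file to read back.
my ($out, $name) = tempfile(UNLINK => 1);
print {$out} "line one\nline two\n";
close($out) or die "Cannot close $name: $!";

print length(slurp_file($name)), " characters read\n";
```

Using C<local> confines the change to the helper, so code elsewhere still
reads line by line.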


=head3 Limit on the value of Unicode code points.

Unicode only allows code points up to 0x10FFFF, but Perl allows much
larger ones. However, using code points exceeding the maximum value
of an integer (C<IV_MAX>) may break the perl interpreter in some constructs,
including causing it to hang in a few cases.  The known problem areas
are in C<tr///>, regular expression pattern matching using quantifiers,
as quote delimiters in C<qI<X>...I<X>> (where I<X> is the C<chr()> of a large
code point), and as the upper limits in loops.

The use of out of range code points was deprecated in Perl 5.24, and
it will be a fatal error in Perl 5.28.

If your code is to run on various platforms, keep in mind that the upper
limit depends on the platform.  It is much larger on 64-bit word sizes
than 32-bit ones.


=head3 Use of comma-less variable list in formats.

It's allowed to use a list of variables in a format, without
separating them with commas. This usage has been deprecated
for a long time, and it will be a fatal error in Perl 5.28.



=head3 Use of C<\N{}>

Use of C<\N{}> with nothing between the braces was deprecated in
Perl 5.24, and will throw a fatal error in Perl 5.28.

Since such a construct is equivalent to using an empty string,
you are recommended to remove such C<\N{}> constructs.


=head3 Using the same symbol to open a filehandle and a dirhandle

It used to be legal to use C<open()> to associate both a
filehandle and a dirhandle to the same symbol (glob or scalar).
This idiom is likely to be confusing, and it was deprecated in
Perl 5.10.

Using the same symbol to C<open()> a filehandle and a dirhandle
will be a fatal error in Perl 5.28.

You should be using two different symbols instead.
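
One way to follow this advice, sketched here under the assumption that
lexical handles suit your code, is to keep the filehandle and the
dirhandle in separate variables.  The names C<$fh> and C<$dh> and the
scratch directory are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/example.txt";

# The filehandle lives in its own lexical variable...
open(my $fh, '>', $file) or die "Cannot open $file: $!";
print {$fh} "hello\n";
close($fh) or die "Cannot close $file: $!";

# ...and the dirhandle in another, so the two can never collide.
opendir(my $dh, $dir) or die "Cannot opendir $dir: $!";
my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
closedir($dh);

print "@entries\n";
```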

=head3 ${^ENCODING} is no longer supported.

The special variable C<${^ENCODING}> was used to implement
the C<encoding> pragma. Setting this variable to anything other
than C<undef> was deprecated in Perl 5.22. Full deprecation
of the variable happened in Perl 5.25.3.

Setting this variable will become a fatal error in Perl 5.28.


=head3 C<< B::OP::terse >>

This method, which just calls C<< B::Concise::b_terse >>, has been
deprecated, and will disappear in Perl 5.28. Please use 
C<< B::Concise >> instead.



=head3 Use of inherited AUTOLOAD for non-method %s() is deprecated

As an (ahem) accidental feature, C<AUTOLOAD> subroutines are looked
up as methods (using the C<@ISA> hierarchy) even when the subroutines
to be autoloaded were called as plain functions (e.g. C<Foo::bar()>),
not as methods (e.g. C<< Foo->bar() >> or C<< $obj->bar() >>).

This bug will be rectified in future by using method lookup only for
methods' C<AUTOLOAD>s.

The simple rule is:  Inheritance will not work when autoloading
non-methods.  The simple fix for old code is:  In any module that used
to depend on inheriting C<AUTOLOAD> for non-methods from a base class
named C<BaseClass>, execute C<*AUTOLOAD = \&BaseClass::AUTOLOAD> during
startup.

In code that currently says C<use AutoLoader; @ISA = qw(AutoLoader);>
you should remove AutoLoader from @ISA and change C<use AutoLoader;> to
C<use AutoLoader 'AUTOLOAD';>.
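
The glob assignment described above can be sketched as follows.  The
packages C<BaseClass> and C<Derived> and the function C<greet> are
hypothetical names chosen for illustration; the C<AUTOLOAD> body is only
a stand-in for whatever your base class really does.

```perl
use strict;
use warnings;

package BaseClass;

our $AUTOLOAD;

# Stand-in AUTOLOAD: report which function was requested.
sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                 # strip the package qualifier
    return if $name eq 'DESTROY';      # never autoload destructors
    return "autoloaded $name(@_)";
}

package Derived;

our @ISA = ('BaseClass');

# Import the base class's AUTOLOAD explicitly, so that plain function
# calls no longer depend on inheritance to find it.
{
    no warnings 'once';
    *AUTOLOAD = \&BaseClass::AUTOLOAD;
}

package main;

# A plain function call, not a method call.
print Derived::greet('world'), "\n";
```

Because the subroutine was compiled in C<BaseClass>, the requested name
still arrives in C<$BaseClass::AUTOLOAD>.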

This feature was deprecated in Perl 5.004, and will be fatal in Perl 5.28.


=head3 Use of code points over 0xFF in string bitwise operators

The string bitwise operators, C<&>, C<|>, C<^>, and C<~>, treat
their operands as strings of bytes. As such, values above 0xFF 
are nonsensical. Using such code points with these operators
was deprecated in Perl 5.24, and will be fatal in Perl 5.28.
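
The short sketch below shows the byte-wise behaviour on operands whose
characters are all at or below 0xFF, and one possible way (an assumption
about what the code intends) to handle a wide character by working on its
numeric code point instead.  It assumes the C<bitwise> feature is not
enabled; under that feature C<^> is a numeric operator.

```perl
use strict;
use warnings;

# Byte-wise string operations: every character fits in one byte.
my $lower  = "AB" ^ "  ";                # 0x41 ^ 0x20 and 0x42 ^ 0x20
my $masked = "\xFF\x0F" & "\x0F\xFF";    # each byte ANDed with its partner

# For a character above 0xFF, operate on the code point as a number.
my $wide    = "\x{263A}";
my $flipped = chr(ord($wide) ^ 0x01);

printf "%s %vX %04X\n", $lower, $masked, ord($flipped);
```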

=head3 In XS code, use of C<to_utf8_case()>

This function is being removed; instead convert to call
the appropriate one of:
L<C<toFOLD_utf8_safe>|perlapi/toFOLD_utf8_safe>,
L<C<toLOWER_utf8_safe>|perlapi/toLOWER_utf8_safe>,
L<C<toTITLE_utf8_safe>|perlapi/toTITLE_utf8_safe>,
or
L<C<toUPPER_utf8_safe>|perlapi/toUPPER_utf8_safe>.

=head2 Perl 5.26

=head3 C<< --libpods >> in C<< Pod::Html >>

Since Perl 5.18, the option C<< --libpods >> has been deprecated, and
using this option did not do anything other than producing a warning.

The C<< --libpods >> option is no longer recognized in Perl 5.26.


=head3 The utilities C<< c2ph >> and C<< pstruct >>

These old, perl3-era utilities have been deprecated in favour of
C<< h2xs >> for a long time. In Perl 5.26, they have been removed.


=head3 Trapping C<< $SIG{__DIE__} >> other than during program exit.

The C<$SIG{__DIE__}> hook is called even inside an C<eval()>. It was
never intended to happen this way, but an implementation glitch made
this possible. This used to be deprecated, as it allowed strange action
at a distance like rewriting a pending exception in C<$@>. Plans to
rectify this have been scrapped, as users found that rewriting a
pending exception is actually a useful feature, and not a bug.

Perl never issued a deprecation warning for this; the deprecation
was by documentation policy only. But this deprecation has been 
lifted in Perl 5.26.


=head3 Malformed UTF-8 string in "%s"

This message indicates a bug either in the Perl core or in XS
code. Such code was trying to find out if a character, allegedly
stored internally encoded as UTF-8, was of a given type, such as
being punctuation or a digit.  But the character was not encoded
in legal UTF-8.  The C<%s> is replaced by a string that can be used
by knowledgeable people to determine what the type being checked
against was.

Passing malformed strings was deprecated in Perl 5.18, and
became fatal in Perl 5.26.


=head2 Perl 5.24

=head3 Use of C<< *glob{FILEHANDLE} >>

The use of C<< *glob{FILEHANDLE} >> was deprecated in Perl 5.8.
The intention was to use C<< *glob{IO} >> instead, for which 
C<< *glob{FILEHANDLE} >> is an alias.

However, this feature was undeprecated in Perl 5.24.

=head3 Calling POSIX::%s() is deprecated

The following functions in the C<POSIX> module are no longer available:
C<isalnum>, C<isalpha>, C<iscntrl>, C<isdigit>, C<isgraph>, C<islower>,  
C<isprint>, C<ispunct>, C<isspace>, C<isupper>, and C<isxdigit>.  The 
functions are buggy and don't work on UTF-8 encoded strings.  See their
entries in L<POSIX> for more information.

The functions were deprecated in Perl 5.20, and removed in Perl 5.24.
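
A common substitute is a regular expression built on a POSIX character
class.  The helpers below are a minimal sketch; C<is_alpha> and
C<is_digit> are hypothetical names, and in this version an empty string
does not match.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the removed POSIX::isalpha and POSIX::isdigit.
sub is_alpha {
    my ($string) = @_;
    return $string =~ /\A[[:alpha:]]+\z/ ? 1 : 0;
}

sub is_digit {
    my ($string) = @_;
    return $string =~ /\A[[:digit:]]+\z/ ? 1 : 0;
}

print join(',', is_alpha('Perl'), is_alpha('Perl5'), is_digit('2024')), "\n";
```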


=head2 Perl 5.16

=head3 Use of %s on a handle without * is deprecated

It used to be possible to use C<tie>, C<tied> or C<untie> on a scalar
while the scalar holds a typeglob. This caused its filehandle to be
tied. It left no way to tie the scalar itself when it held a typeglob,
and no way to untie a scalar that had had a typeglob assigned to it.

This was deprecated in Perl 5.14, and the bug was fixed in Perl 5.16.

So now C<tie $scalar> will always tie the scalar, not the handle it holds.
To tie the handle, use C<tie *$scalar> (with an explicit asterisk).  The same
applies to C<tied *$scalar> and C<untie *$scalar>.
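
The difference can be sketched as below, for Perl 5.16 or later.  The tie
class C<Upcase> is hypothetical and exists only to make the effect
visible; C<Symbol::gensym> is used here to obtain an anonymous typeglob
to store in the scalar.

```perl
use strict;
use warnings;
use Symbol qw(gensym);

package Upcase;    # hypothetical tie class that upper-cases output

sub TIEHANDLE {
    my ($class, $buffer_ref) = @_;
    return bless { buffer => $buffer_ref }, $class;
}

sub PRINT {
    my ($self, @items) = @_;
    ${ $self->{buffer} } .= uc(join('', @items));
    return 1;
}

package main;

my $captured = '';
my $scalar   = *{ gensym() };    # a scalar that holds a typeglob

# The explicit asterisk ties the handle, not the scalar itself.
tie *$scalar, 'Upcase', \$captured;
print {$scalar} "hello";
untie *$scalar;

print "$captured\n";
```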


=head1 SEE ALSO

L<warnings>, L<diagnostics>.

=cut

=encoding utf8

=head1 NAME

perlboot - Links to information on object-oriented programming in Perl

=head1 DESCRIPTION

For information on OO programming with Perl, please see L<perlootut>
and L<perlobj>.

(The above documents supersede the tutorial that was formerly here in
perlboot.)

=cut
perlamiga.pod000064400000013165150344123500007214 0ustar00If you read this file _as_is_, just ignore the funny characters you
see. It is written in the POD format (see perlpod manpage) which is
specially designed to be readable as is.

=head1 NAME

perlamiga - Perl under AmigaOS 4.1

=head1 NOTE

This is a port of Perl 5.22.1.  It is a fresh port and is not in any way
compatible with my previous ports of Perl 5.8 and 5.16.3.  This means
you will need to reinstall or rebuild any third-party modules you have
installed.

newlib.library version 53.28 or greater is required.

=head1 SYNOPSIS

Once perl is installed you can read this document in the following way

	sh -c "perldoc perlamiga"

or you may read I<as is>: either as F<README.amiga>, or F<pod/perlamiga.pod>.

=cut

       NAME
       SYNOPSIS
       DESCRIPTION
         -  Prerequisites
         -  Starting Perl programs under AmigaOS
         -  Shortcomings of Perl under AmigaOS
       INSTALLATION
       CHANGES

=head1 DESCRIPTION

=head2 Prerequisites for running Perl 5.22.1 under AmigaOS 4.1

=over 6

=item B<AmigaOS 4.1 update 6 with all updates applied as of 9th October 2013>

The most important of which is:

=item B<newlib.library version 53.28 or greater>

=item B<AmigaOS SDK>

Perl installs into the SDK directory structure and expects many of the
build tools present in the SDK to be available. So for the best results
install the SDK first.

=item B<abc-shell>

If you do not have the SDK installed you must at least have abc-shell
installed or some other suitable sh port. This is required to run
external commands and should be available as 'sh' in your path.

=back

=head2 Starting Perl programs under AmigaOS 4.1

Perl may be run from the AmigaOS shell but for best results should be
run under abc-shell.  (abc-shell handles file globbing, pattern
expansion, and sets up environment variables in the UN*Xy way that
Perl expects.)

For example:

	New Shell process 10
	10.AmigaOS4:> sh
	/AmigaOS4>perl path:to/myprog arg1 arg2 arg3

Abc-shell can also launch programs via the #! syntax at the start of
the program file.  It's best to use the form #!SDK:Local/C/perl so that
the AmigaOS shell may also find perl in the same way.  AmigaOS requires
the script bit to be set for this to work:

	10.AmigaOS4:> sh
	/AmigaOS4>myprog arg1 arg2 arg3

=head2 Limitations of Perl under AmigaOS 4.1

=over 6

=item B<Nested Piped programs can crash when run from older abc-shells>

abc-shell version 53.2 has a bug that can cause crashes in the
subprocesses used to run piped programs.  If a later version is
available you should install it instead.

=item B<Incorrect or unexpected command line unescaping>

newlib.library 53.30 and earlier incorrectly unescape backslash escape
sequences (e.g. \" \n \t), so unusual extra escaping is required.

=item B<Starting subprocesses via open has limitations>

	open FH, "command |"

Subprocesses started with open use a minimal popen() routine and
therefore they do not return pids usable with waitpid etc.

=item If you find any other limitations or bugs then let me know.

Please report bugs in this version of perl to andy@broad.ology.org.uk
in the first instance.

=back

=head1 INSTALLATION

This guide assumes you have obtained a prebuilt archive from os4depot.net.

Unpack the main archive to a temporary location (RAM: is fine).

Execute the provided install script from shell or via its icon.

You B<must not> attempt to install by hand.

Once installed you may delete the temporary archive.

This approach will preserve links in the installation without creating
duplicate binaries.

If you have the earlier ports of perl 5.16 or 5.8 installed, you may like
to rename your perl executable to perl516 or perl58 or something
similar before the installation of 5.22.1.  This will allow you to use
both versions at the same time.

=head1 Amiga Specific Modules

=head2 Amiga::ARexx

The Amiga::ARexx module allows you to easily create a perl based ARexx
host or to send ARexx commands to other programs.

Try C<perldoc Amiga::ARexx> for more info.

=head2 Amiga::Exec

The Amiga::Exec module introduces support for Wait().

Try C<perldoc Amiga::Exec> for more info.

=head1 BUILDING

To build perl under AmigaOS from the patched sources you will need to
have a recent version of the SDK.  Version 53.29 is recommended;
earlier versions will probably work too.

With the help of Jarkko Hietaniemi the Configure system has been tweaked to
run under abc-shell, so the recommended build process is as follows:

	stack 2000000
	sh Configure -de
	gmake

This will build the default setup, which installs under SDK:local/newlib/lib/.

=head1 CHANGES

=over 6

=item B<August 2015>

=over 2

=item Port to Perl 5.22

=item Add handling of NIL: to afstat()

=item Fix inheritance of environment variables by subprocesses.

=item Fix exec, and exit in "forked" subprocesses.

=item Fix issue with newlib's unlink, which could cause infinite loops.

=item Add flock() emulation using IDOS->LockRecord thanks to Tony Cook
for the suggestion.

=item Fix issue where kill was using the wrong kind of process ID

=back

=item B<27th November 2013>

=over 2

=item Create new installation system based on installperl links
and Amiga protection bits now set correctly.

=item Pod now defaults to text.

=item File::Spec should now recognise an Amiga style absolute path as well
as a Unix style one. Relative paths must always be Unix style.

=back

=item B<20th November 2013>

=over 2

=item Configured to use SDK:Local/C/perl to start standard scripts

=item Added Amiga::Exec module with support for Wait() and AmigaOS signal numbers.

=back

=item B<10th October 2013>

First release of port to 5.16.3.

=back

=head1 SEE ALSO

You like this port?  See L<http://www.broad.ology.org.uk/amiga/>
for how you can help.

=cut
perlnetware.pod000064400000014770150344123510007607 0ustar00If you read this file _as_is_, just ignore the funny characters you
see.  It is written in the POD format (see pod/perlpod.pod) which is
specifically designed to be readable as is.

=head1 NAME

perlnetware - Perl for NetWare

=head1 DESCRIPTION

This file gives instructions for building Perl 5.7 and above, and also 
Perl modules for NetWare. Before you start, you may want to read the
README file found in the top level directory into which the Perl source
code distribution was extracted. Make sure you read and understand
the terms under which the software is being distributed.

=head1 BUILD

This section describes the steps to be performed to build a Perl NLM
and other associated NLMs.

=head2 Tools & SDK

The build requires the CodeWarrior compiler and linker.  In addition,
the "NetWare SDK", "NLM & NetWare Libraries for C" and
"NetWare Server Protocol Libraries for C", all available at
L<http://developer.novell.com/wiki/index.php/Category:Novell_Developer_Kit>,
are required. Microsoft Visual C++ version 4.2 or later is also
required.

=head2 Setup

The build process is dependent on the location of the NetWare SDK.
Once the Tools & SDK are installed, the build environment has to
be set up.  The following batch files set up the environment.

=over 4

=item SetNWBld.bat

This file takes two parameters as input.  The first is the NetWare SDK
path and the second is the path to the CodeWarrior compiler and tools.
Executing this file sets these paths and also sets the build type to
Release by default.

=item Buildtype.bat

This is used to set the build type to debug or release. Change the
build type only after executing SetNWBld.bat.

Example:

=over

=item 1.

Typing "buildtype d on" at the command prompt causes the buildtype
to be set to Debug type with D2 flag set. 

=item 2.

Typing "buildtype d off" or "buildtype d" at the command prompt causes
the buildtype to be set to Debug type with D1 flag set. 

=item 3.

Typing "buildtype r" at the command prompt sets it to Release Build type.

=back

=back

=head2 Make

The make process runs only under the WinNT shell.  The NetWare makefile
is located under the NetWare folder.  It makes use of miniperl.exe to
run some of the Perl scripts.  To create miniperl.exe, first set the
required paths for the Visual C++ compiler (specify the vcvars32
location) at the command prompt.  Then run nmake from the win32 folder
at a WinNT command prompt.  The build process can be stopped after
miniperl.exe is created.  Then run nmake from the NetWare folder at a
WinNT command prompt.

Currently the following two build types are tested on NetWare:

=over 4

=item *

USE_MULTI, USE_ITHREADS & USE_IMP_SYS defined

=item *

USE_MULTI & USE_IMP_SYS defined and USE_ITHREADS not defined

=back

=head2 Interpreter

Once miniperl.exe creation is over, run nmake from the NetWare folder.
This will build the Perl interpreter for NetWare as I<perl.nlm>.
It is copied under the I<Release> folder if you are doing a release
build; otherwise it is copied under the I<Debug> folder for debug builds.

=head2 Extensions

The make process also creates the Perl extensions as
I<E<lt>ExtensionE<gt>.nlm>.

=head1 INSTALL

To install NetWare Perl onto a NetWare server, first map the Sys
volume of a NetWare server to I<i:>. This is because the makefile by
default sets the drive letter to I<i:>.  Type I<nmake nwinstall> from
NetWare folder on a WinNT command prompt.  This will copy the binaries
and module files onto the NetWare server under I<sys:\Perl>
folder. The Perl interpreter, I<perl.nlm>, is copied under
I<sys:\perl\system> folder.  Copy this to I<sys:\system> folder.

Example: At the command prompt Type "nmake nwinstall".
          This will install NetWare Perl on the NetWare Server.
          Similarly, if you type "nmake install",
          this will cause the binaries to be installed on the local machine.
          (Typically under the c:\perl folder)

=head1 BUILD NEW EXTENSIONS

To build extensions other than standard extensions, NetWare Perl has
to be installed on Windows along with Windows Perl. The Perl for
Windows can be either downloaded from the CPAN site and built using
the sources, or the binaries can be directly downloaded from the
ActiveState site.  Installation can be done by invoking I<nmake
install> from the NetWare folder on a WinNT command prompt after
building NetWare Perl by following steps given above.  This will copy
all the *.pm files and other required files.  Documentation files are
not copied.  Thus one must first install Windows Perl, then install
NetWare Perl.

Once this is done, do the following to build any extension:

=over 4

=item *

Change to the extension directory where its source files are present.

=item *

Run the following command at the command prompt:

    perl -II<path to NetWare lib dir> -II<path to lib> Makefile.pl

Example:

    perl -Ic:/perl/5.6.1/lib/NetWare-x86-multi-thread           \
                                -Ic:\perl\5.6.1\lib MakeFile.pl

or

    perl -Ic:/perl/5.8.0/lib/NetWare-x86-multi-thread           \
                                -Ic:\perl\5.8.0\lib MakeFile.pl

=item *

nmake

=item *

nmake install

Install will copy the files into the Windows machine where NetWare
Perl is installed and these files may have to be copied to the NetWare
server manually. Alternatively, pass I<INSTALLSITELIB=i:\perl\lib> as
an input to makefile.pl above. Here I<i:> is the mapped drive to the
sys: volume of the server where Perl on NetWare is installed. Now
typing I<nmake install>, will copy the files onto the NetWare server.

Example: You can execute the following on the command prompt.

  perl -Ic:/perl/5.6.1/lib/NetWare-x86-multi-thread             \
                                -Ic:\perl\5.6.1\lib MakeFile.pl
  INSTALLSITELIB=i:\perl\lib

or

  perl -Ic:/perl/5.8.0/lib/NetWare-x86-multi-thread             \
                                -Ic:\perl\5.8.0\lib MakeFile.pl
  INSTALLSITELIB=i:\perl\lib

=item * 

Note: Some modules downloaded from CPAN may require NetWare related
API in order to build on NetWare.  Other modules may however build
smoothly with or without minor changes depending on the type of
module.

=back

=head1 ACKNOWLEDGEMENTS

The makefile for Win32 is used as a reference to create the makefile
for NetWare.  Also, the make process for NetWare port uses
miniperl.exe to run scripts during the make and installation process.

=head1 AUTHORS

Anantha Kesari H Y (hyanantha@novell.com)
Aditya C (caditya@novell.com)

=head1 DATE

=over 4

=item *

Created - 18 Jan 2001

=item *

Modified - 25 June 2001

=item *

Modified - 13 July 2001

=item *

Modified - 28 May 2002

=back

=head1 NAME
X<function>

perlfunc - Perl builtin functions

=head1 DESCRIPTION

The functions in this section can serve as terms in an expression.
They fall into two major categories: list operators and named unary
operators.  These differ in their precedence relationship with a
following comma.  (See the precedence table in L<perlop>.)  List
operators take more than one argument, while unary operators can never
take more than one argument.  Thus, a comma terminates the argument of
a unary operator, but merely separates the arguments of a list
operator.  A unary operator generally provides scalar context to its
argument, while a list operator may provide either scalar or list
contexts for its arguments.  If it does both, scalar arguments
come first and the list argument follows, and there can only ever
be one such list argument.  For instance,
L<C<splice>|/splice ARRAY,OFFSET,LENGTH,LIST> has three scalar arguments
followed by a list, whereas L<C<gethostbyname>|/gethostbyname NAME> has
four scalar arguments.

In the syntax descriptions that follow, list operators that expect a
list (and provide list context for elements of the list) are shown
with LIST as an argument.  Such a list may consist of any combination
of scalar arguments or list values; the list values will be included
in the list as if each individual element were interpolated at that
point in the list, forming a longer single-dimensional list value.
Commas should separate literal elements of the LIST.
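
As a small illustration of this flattening, the sketch below passes
arrays and literal values together in one LIST.  The variable names are
arbitrary.

```perl
use strict;
use warnings;

my @first  = (1, 2);
my @second = (3, 4);

# Each array is interpolated in place, giving join() one flat list.
my $joined = join('-', @first, 'x', @second);

# Nested parentheses do not create nested lists; an empty list vanishes.
my @flat = (@first, (), @second, ('a', 'b'));

print "$joined\n";
print scalar(@flat), " elements\n";
```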

Any function in the list below may be used either with or without
parentheses around its arguments.  (The syntax descriptions omit the
parentheses.)  If you use parentheses, the simple but occasionally
surprising rule is this: It I<looks> like a function, therefore it I<is> a
function, and precedence doesn't matter.  Otherwise it's a list
operator or unary operator, and precedence does matter.  Whitespace
between the function and left parenthesis doesn't count, so sometimes
you need to be careful:

    print 1+2+4;      # Prints 7.
    print(1+2) + 4;   # Prints 3.
    print (1+2)+4;    # Also prints 3!
    print +(1+2)+4;   # Prints 7.
    print ((1+2)+4);  # Prints 7.

If you run Perl with the L<C<use warnings>|warnings> pragma, it can warn
you about this.  For example, the third line above produces:

    print (...) interpreted as function at - line 1.
    Useless use of integer addition in void context at - line 1.

A few functions take no arguments at all, and therefore work as neither
unary nor list operators.  These include such functions as
L<C<time>|/time> and L<C<endpwent>|/endpwent>.  For example,
C<time+86_400> always means C<time() + 86_400>.

For functions that can be used in either a scalar or list context,
nonabortive failure is generally indicated in scalar context by
returning the undefined value, and in list context by returning the
empty list.

Remember the following important rule: There is B<no rule> that relates
the behavior of an expression in list context to its behavior in scalar
context, or vice versa.  It might do two totally different things.
Each operator and function decides which sort of value would be most
appropriate to return in scalar context.  Some operators return the
length of the list that would have been returned in list context.  Some
operators return the first value in the list.  Some operators return the
last value in the list.  Some operators return a count of successful
operations.  In general, they do what you want, unless you want
consistency.
X<context>

A named array in scalar context is quite different from what would at
first glance appear to be a list in scalar context.  You can't get a list
like C<(1,2,3)> into being in scalar context, because the compiler knows
the context at compile time.  It would generate the scalar comma operator
there, not the list concatenation version of the comma.  That means it
was never a list to start with.
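
The distinction can be sketched as follows.  The C<void> warning is
switched off around the comma expression only because Perl would
otherwise (rightly) complain about the discarded constants.

```perl
use strict;
use warnings;

my @array = (4, 5, 6);

# A named array in scalar context yields its element count.
my $count = @array;

# A list assignment copies elements, here just the first.
my ($first) = @array;

# What looks like a list in scalar context is really the comma
# operator, which evaluates to its last operand.
my $last;
{
    no warnings 'void';
    $last = (4, 5, 6);
}

print "$count $first $last\n";
```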

In general, functions in Perl that serve as wrappers for system calls
("syscalls") of the same name (like L<chown(2)>, L<fork(2)>,
L<closedir(2)>, etc.) return true when they succeed and
L<C<undef>|/undef EXPR> otherwise, as is usually mentioned in the
descriptions below.  This is different from the C interfaces, which
return C<-1> on failure.  Exceptions to this rule include
L<C<wait>|/wait>, L<C<waitpid>|/waitpid PID,FLAGS>, and
L<C<syscall>|/syscall NUMBER, LIST>.  System calls also set the special
L<C<$!>|perlvar/$!> variable on failure.  Other functions do not, except
accidentally.
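
A minimal sketch of this convention, using a path inside a scratch
directory that is assumed not to exist: L<C<open>|/open FILEHANDLE,EXPR>
returns a false value and leaves the reason in C<$!>, which should be
captured before any other call can overwrite it.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/no-such-file";

# Failure: a false return value, with the cause recorded in $!.
my $ok        = open(my $fh, '<', $missing);
my $message   = "$!";
my $not_found = $!{ENOENT} ? 1 : 0;

# Success: a true return value.
my $made = mkdir("$dir/sub");

print "open failed: $message\n" unless $ok;
print "mkdir succeeded\n" if $made;
```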

Extension modules can also hook into the Perl parser to define new
kinds of keyword-headed expression.  These may look like functions, but
may also look completely different.  The syntax following the keyword
is defined entirely by the extension.  If you are an implementor, see
L<perlapi/PL_keyword_plugin> for the mechanism.  If you are using such
a module, see the module's documentation for details of the syntax that
it defines.

=head2 Perl Functions by Category
X<function>

Here are Perl's functions (including things that look like
functions, like some keywords and named operators)
arranged by category.  Some functions appear in more
than one place.

=over 4

=item Functions for SCALARs or strings
X<scalar> X<string> X<character>

=for Pod::Functions =String

L<C<chomp>|/chomp VARIABLE>, L<C<chop>|/chop VARIABLE>,
L<C<chr>|/chr NUMBER>, L<C<crypt>|/crypt PLAINTEXT,SALT>,
L<C<fc>|/fc EXPR>, L<C<hex>|/hex EXPR>,
L<C<index>|/index STR,SUBSTR,POSITION>, L<C<lc>|/lc EXPR>,
L<C<lcfirst>|/lcfirst EXPR>, L<C<length>|/length EXPR>,
L<C<oct>|/oct EXPR>, L<C<ord>|/ord EXPR>,
L<C<pack>|/pack TEMPLATE,LIST>,
L<C<qE<sol>E<sol>>|/qE<sol>STRINGE<sol>>,
L<C<qqE<sol>E<sol>>|/qqE<sol>STRINGE<sol>>, L<C<reverse>|/reverse LIST>,
L<C<rindex>|/rindex STR,SUBSTR,POSITION>,
L<C<sprintf>|/sprintf FORMAT, LIST>,
L<C<substr>|/substr EXPR,OFFSET,LENGTH,REPLACEMENT>,
L<C<trE<sol>E<sol>E<sol>>|/trE<sol>E<sol>E<sol>>, L<C<uc>|/uc EXPR>,
L<C<ucfirst>|/ucfirst EXPR>,
L<C<yE<sol>E<sol>E<sol>>|/yE<sol>E<sol>E<sol>>

L<C<fc>|/fc EXPR> is available only if the
L<C<"fc"> feature|feature/The 'fc' feature> is enabled or if it is
prefixed with C<CORE::>.  The
L<C<"fc"> feature|feature/The 'fc' feature> is enabled automatically
with a C<use v5.16> (or higher) declaration in the current scope.

=item Regular expressions and pattern matching
X<regular expression> X<regex> X<regexp>

=for Pod::Functions =Regexp

L<C<mE<sol>E<sol>>|/mE<sol>E<sol>>, L<C<pos>|/pos SCALAR>,
L<C<qrE<sol>E<sol>>|/qrE<sol>STRINGE<sol>>,
L<C<quotemeta>|/quotemeta EXPR>,
L<C<sE<sol>E<sol>E<sol>>|/sE<sol>E<sol>E<sol>>,
L<C<split>|/split E<sol>PATTERNE<sol>,EXPR,LIMIT>,
L<C<study>|/study SCALAR>

=item Numeric functions
X<numeric> X<number> X<trigonometric> X<trigonometry>

=for Pod::Functions =Math

L<C<abs>|/abs VALUE>, L<C<atan2>|/atan2 Y,X>, L<C<cos>|/cos EXPR>,
L<C<exp>|/exp EXPR>, L<C<hex>|/hex EXPR>, L<C<int>|/int EXPR>,
L<C<log>|/log EXPR>, L<C<oct>|/oct EXPR>, L<C<rand>|/rand EXPR>,
L<C<sin>|/sin EXPR>, L<C<sqrt>|/sqrt EXPR>, L<C<srand>|/srand EXPR>

=item Functions for real @ARRAYs
X<array>

=for Pod::Functions =ARRAY

L<C<each>|/each HASH>, L<C<keys>|/keys HASH>, L<C<pop>|/pop ARRAY>,
L<C<push>|/push ARRAY,LIST>, L<C<shift>|/shift ARRAY>,
L<C<splice>|/splice ARRAY,OFFSET,LENGTH,LIST>,
L<C<unshift>|/unshift ARRAY,LIST>, L<C<values>|/values HASH>

=item Functions for list data
X<list>

=for Pod::Functions =LIST

L<C<grep>|/grep BLOCK LIST>, L<C<join>|/join EXPR,LIST>,
L<C<map>|/map BLOCK LIST>, L<C<qwE<sol>E<sol>>|/qwE<sol>STRINGE<sol>>,
L<C<reverse>|/reverse LIST>, L<C<sort>|/sort SUBNAME LIST>,
L<C<unpack>|/unpack TEMPLATE,EXPR>

=item Functions for real %HASHes
X<hash>

=for Pod::Functions =HASH

L<C<delete>|/delete EXPR>, L<C<each>|/each HASH>,
L<C<exists>|/exists EXPR>, L<C<keys>|/keys HASH>,
L<C<values>|/values HASH>

=item Input and output functions
X<I/O> X<input> X<output> X<dbm>

=for Pod::Functions =I/O

L<C<binmode>|/binmode FILEHANDLE, LAYER>, L<C<close>|/close FILEHANDLE>,
L<C<closedir>|/closedir DIRHANDLE>, L<C<dbmclose>|/dbmclose HASH>,
L<C<dbmopen>|/dbmopen HASH,DBNAME,MASK>, L<C<die>|/die LIST>,
L<C<eof>|/eof FILEHANDLE>, L<C<fileno>|/fileno FILEHANDLE>,
L<C<flock>|/flock FILEHANDLE,OPERATION>, L<C<format>|/format>,
L<C<getc>|/getc FILEHANDLE>, L<C<print>|/print FILEHANDLE LIST>,
L<C<printf>|/printf FILEHANDLE FORMAT, LIST>,
L<C<read>|/read FILEHANDLE,SCALAR,LENGTH,OFFSET>,
L<C<readdir>|/readdir DIRHANDLE>, L<C<readline>|/readline EXPR>,
L<C<rewinddir>|/rewinddir DIRHANDLE>, L<C<say>|/say FILEHANDLE LIST>,
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
L<C<seekdir>|/seekdir DIRHANDLE,POS>,
L<C<select>|/select RBITS,WBITS,EBITS,TIMEOUT>,
L<C<syscall>|/syscall NUMBER, LIST>,
L<C<sysread>|/sysread FILEHANDLE,SCALAR,LENGTH,OFFSET>,
L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>,
L<C<syswrite>|/syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET>,
L<C<tell>|/tell FILEHANDLE>, L<C<telldir>|/telldir DIRHANDLE>,
L<C<truncate>|/truncate FILEHANDLE,LENGTH>, L<C<warn>|/warn LIST>,
L<C<write>|/write FILEHANDLE>

L<C<say>|/say FILEHANDLE LIST> is available only if the
L<C<"say"> feature|feature/The 'say' feature> is enabled or if it is
prefixed with C<CORE::>.  The
L<C<"say"> feature|feature/The 'say' feature> is enabled automatically
with a C<use v5.10> (or higher) declaration in the current scope.

=item Functions for fixed-length data or records

=for Pod::Functions =Binary

L<C<pack>|/pack TEMPLATE,LIST>,
L<C<read>|/read FILEHANDLE,SCALAR,LENGTH,OFFSET>,
L<C<syscall>|/syscall NUMBER, LIST>,
L<C<sysread>|/sysread FILEHANDLE,SCALAR,LENGTH,OFFSET>,
L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>,
L<C<syswrite>|/syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET>,
L<C<unpack>|/unpack TEMPLATE,EXPR>, L<C<vec>|/vec EXPR,OFFSET,BITS>

=item Functions for filehandles, files, or directories
X<file> X<filehandle> X<directory> X<pipe> X<link> X<symlink>

=for Pod::Functions =File

L<C<-I<X>>|/-X FILEHANDLE>, L<C<chdir>|/chdir EXPR>,
L<C<chmod>|/chmod LIST>, L<C<chown>|/chown LIST>,
L<C<chroot>|/chroot FILENAME>,
L<C<fcntl>|/fcntl FILEHANDLE,FUNCTION,SCALAR>, L<C<glob>|/glob EXPR>,
L<C<ioctl>|/ioctl FILEHANDLE,FUNCTION,SCALAR>,
L<C<link>|/link OLDFILE,NEWFILE>, L<C<lstat>|/lstat FILEHANDLE>,
L<C<mkdir>|/mkdir FILENAME,MASK>, L<C<open>|/open FILEHANDLE,EXPR>,
L<C<opendir>|/opendir DIRHANDLE,EXPR>, L<C<readlink>|/readlink EXPR>,
L<C<rename>|/rename OLDNAME,NEWNAME>, L<C<rmdir>|/rmdir FILENAME>,
L<C<select>|/select FILEHANDLE>, L<C<stat>|/stat FILEHANDLE>,
L<C<symlink>|/symlink OLDFILE,NEWFILE>,
L<C<sysopen>|/sysopen FILEHANDLE,FILENAME,MODE>,
L<C<umask>|/umask EXPR>, L<C<unlink>|/unlink LIST>,
L<C<utime>|/utime LIST>

=item Keywords related to the control flow of your Perl program
X<control flow>

=for Pod::Functions =Flow

L<C<break>|/break>, L<C<caller>|/caller EXPR>,
L<C<continue>|/continue BLOCK>, L<C<die>|/die LIST>, L<C<do>|/do BLOCK>,
L<C<dump>|/dump LABEL>, L<C<eval>|/eval EXPR>,
L<C<evalbytes>|/evalbytes EXPR>, L<C<exit>|/exit EXPR>,
L<C<__FILE__>|/__FILE__>, L<C<goto>|/goto LABEL>,
L<C<last>|/last LABEL>, L<C<__LINE__>|/__LINE__>,
L<C<next>|/next LABEL>, L<C<__PACKAGE__>|/__PACKAGE__>,
L<C<redo>|/redo LABEL>, L<C<return>|/return EXPR>,
L<C<sub>|/sub NAME BLOCK>, L<C<__SUB__>|/__SUB__>,
L<C<wantarray>|/wantarray>

L<C<break>|/break> is available only if you enable the experimental
L<C<"switch"> feature|feature/The 'switch' feature> or use the C<CORE::>
prefix.  The L<C<"switch"> feature|feature/The 'switch' feature> also
enables the C<default>, C<given> and C<when> statements, which are
documented in L<perlsyn/"Switch Statements">.
The L<C<"switch"> feature|feature/The 'switch' feature> is enabled
automatically with a C<use v5.10> (or higher) declaration in the current
scope.  In Perl v5.14 and earlier, L<C<continue>|/continue BLOCK>
required the L<C<"switch"> feature|feature/The 'switch' feature>, like
the other keywords.

L<C<evalbytes>|/evalbytes EXPR> is only available with the
L<C<"evalbytes"> feature|feature/The 'unicode_eval' and 'evalbytes' features>
(see L<feature>) or if prefixed with C<CORE::>.  L<C<__SUB__>|/__SUB__>
is only available with the
L<C<"current_sub"> feature|feature/The 'current_sub' feature> or if
prefixed with C<CORE::>.  Both the
L<C<"evalbytes">|feature/The 'unicode_eval' and 'evalbytes' features>
and L<C<"current_sub">|feature/The 'current_sub' feature> features are
enabled automatically with a C<use v5.16> (or higher) declaration in the
current scope.

=item Keywords related to scoping

=for Pod::Functions =Namespace

L<C<caller>|/caller EXPR>, L<C<import>|/import LIST>,
L<C<local>|/local EXPR>, L<C<my>|/my VARLIST>, L<C<our>|/our VARLIST>,
L<C<package>|/package NAMESPACE>, L<C<state>|/state VARLIST>,
L<C<use>|/use Module VERSION LIST>

L<C<state>|/state VARLIST> is available only if the
L<C<"state"> feature|feature/The 'state' feature> is enabled or if it is
prefixed with C<CORE::>.  The
L<C<"state"> feature|feature/The 'state' feature> is enabled
automatically with a C<use v5.10> (or higher) declaration in the current
scope.

=item Miscellaneous functions

=for Pod::Functions =Misc

L<C<defined>|/defined EXPR>, L<C<formline>|/formline PICTURE,LIST>,
L<C<lock>|/lock THING>, L<C<prototype>|/prototype FUNCTION>,
L<C<reset>|/reset EXPR>, L<C<scalar>|/scalar EXPR>,
L<C<undef>|/undef EXPR>

=item Functions for processes and process groups
X<process> X<pid> X<process id>

=for Pod::Functions =Process

L<C<alarm>|/alarm SECONDS>, L<C<exec>|/exec LIST>, L<C<fork>|/fork>,
L<C<getpgrp>|/getpgrp PID>, L<C<getppid>|/getppid>,
L<C<getpriority>|/getpriority WHICH,WHO>, L<C<kill>|/kill SIGNAL, LIST>,
L<C<pipe>|/pipe READHANDLE,WRITEHANDLE>,
L<C<qxE<sol>E<sol>>|/qxE<sol>STRINGE<sol>>,
L<C<readpipe>|/readpipe EXPR>, L<C<setpgrp>|/setpgrp PID,PGRP>,
L<C<setpriority>|/setpriority WHICH,WHO,PRIORITY>,
L<C<sleep>|/sleep EXPR>, L<C<system>|/system LIST>, L<C<times>|/times>,
L<C<wait>|/wait>, L<C<waitpid>|/waitpid PID,FLAGS>

=item Keywords related to Perl modules
X<module>

=for Pod::Functions =Modules

L<C<do>|/do EXPR>, L<C<import>|/import LIST>,
L<C<no>|/no MODULE VERSION LIST>, L<C<package>|/package NAMESPACE>,
L<C<require>|/require VERSION>, L<C<use>|/use Module VERSION LIST>

=item Keywords related to classes and object-orientation
X<object> X<class> X<package>

=for Pod::Functions =Objects

L<C<bless>|/bless REF,CLASSNAME>, L<C<dbmclose>|/dbmclose HASH>,
L<C<dbmopen>|/dbmopen HASH,DBNAME,MASK>,
L<C<package>|/package NAMESPACE>, L<C<ref>|/ref EXPR>,
L<C<tie>|/tie VARIABLE,CLASSNAME,LIST>, L<C<tied>|/tied VARIABLE>,
L<C<untie>|/untie VARIABLE>, L<C<use>|/use Module VERSION LIST>

=item Low-level socket functions
X<socket> X<sock>

=for Pod::Functions =Socket

L<C<accept>|/accept NEWSOCKET,GENERICSOCKET>,
L<C<bind>|/bind SOCKET,NAME>, L<C<connect>|/connect SOCKET,NAME>,
L<C<getpeername>|/getpeername SOCKET>,
L<C<getsockname>|/getsockname SOCKET>,
L<C<getsockopt>|/getsockopt SOCKET,LEVEL,OPTNAME>,
L<C<listen>|/listen SOCKET,QUEUESIZE>,
L<C<recv>|/recv SOCKET,SCALAR,LENGTH,FLAGS>,
L<C<send>|/send SOCKET,MSG,FLAGS,TO>,
L<C<setsockopt>|/setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL>,
L<C<shutdown>|/shutdown SOCKET,HOW>,
L<C<socket>|/socket SOCKET,DOMAIN,TYPE,PROTOCOL>,
L<C<socketpair>|/socketpair SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL>

=item System V interprocess communication functions
X<IPC> X<System V> X<semaphore> X<shared memory> X<memory> X<message>

=for Pod::Functions =SysV

L<C<msgctl>|/msgctl ID,CMD,ARG>, L<C<msgget>|/msgget KEY,FLAGS>,
L<C<msgrcv>|/msgrcv ID,VAR,SIZE,TYPE,FLAGS>,
L<C<msgsnd>|/msgsnd ID,MSG,FLAGS>,
L<C<semctl>|/semctl ID,SEMNUM,CMD,ARG>,
L<C<semget>|/semget KEY,NSEMS,FLAGS>, L<C<semop>|/semop KEY,OPSTRING>,
L<C<shmctl>|/shmctl ID,CMD,ARG>, L<C<shmget>|/shmget KEY,SIZE,FLAGS>,
L<C<shmread>|/shmread ID,VAR,POS,SIZE>,
L<C<shmwrite>|/shmwrite ID,STRING,POS,SIZE>

=item Fetching user and group info
X<user> X<group> X<password> X<uid> X<gid>  X<passwd> X</etc/passwd>

=for Pod::Functions =User

L<C<endgrent>|/endgrent>, L<C<endhostent>|/endhostent>,
L<C<endnetent>|/endnetent>, L<C<endpwent>|/endpwent>,
L<C<getgrent>|/getgrent>, L<C<getgrgid>|/getgrgid GID>,
L<C<getgrnam>|/getgrnam NAME>, L<C<getlogin>|/getlogin>,
L<C<getpwent>|/getpwent>, L<C<getpwnam>|/getpwnam NAME>,
L<C<getpwuid>|/getpwuid UID>, L<C<setgrent>|/setgrent>,
L<C<setpwent>|/setpwent>

=item Fetching network info
X<network> X<protocol> X<host> X<hostname> X<IP> X<address> X<service>

=for Pod::Functions =Network

L<C<endprotoent>|/endprotoent>, L<C<endservent>|/endservent>,
L<C<gethostbyaddr>|/gethostbyaddr ADDR,ADDRTYPE>,
L<C<gethostbyname>|/gethostbyname NAME>, L<C<gethostent>|/gethostent>,
L<C<getnetbyaddr>|/getnetbyaddr ADDR,ADDRTYPE>,
L<C<getnetbyname>|/getnetbyname NAME>, L<C<getnetent>|/getnetent>,
L<C<getprotobyname>|/getprotobyname NAME>,
L<C<getprotobynumber>|/getprotobynumber NUMBER>,
L<C<getprotoent>|/getprotoent>,
L<C<getservbyname>|/getservbyname NAME,PROTO>,
L<C<getservbyport>|/getservbyport PORT,PROTO>,
L<C<getservent>|/getservent>, L<C<sethostent>|/sethostent STAYOPEN>,
L<C<setnetent>|/setnetent STAYOPEN>,
L<C<setprotoent>|/setprotoent STAYOPEN>,
L<C<setservent>|/setservent STAYOPEN>

=item Time-related functions
X<time> X<date>

=for Pod::Functions =Time

L<C<gmtime>|/gmtime EXPR>, L<C<localtime>|/localtime EXPR>,
L<C<time>|/time>, L<C<times>|/times>

=item Non-function keywords

=for Pod::Functions =!Non-functions

C<and>, C<AUTOLOAD>, C<BEGIN>, C<CHECK>, C<cmp>, C<CORE>, C<__DATA__>,
C<default>, C<DESTROY>, C<else>, C<elseif>, C<elsif>, C<END>, C<__END__>,
C<eq>, C<for>, C<foreach>, C<ge>, C<given>, C<gt>, C<if>, C<INIT>, C<le>,
C<lt>, C<ne>, C<not>, C<or>, C<UNITCHECK>, C<unless>, C<until>, C<when>,
C<while>, C<x>, C<xor>

=back

=head2 Portability
X<portability> X<Unix> X<portable>

Perl was born in Unix and can therefore access all common Unix
system calls.  In non-Unix environments, the functionality of some
Unix system calls may not be available or details of the available
functionality may differ slightly.  The Perl functions affected
by this are:

L<C<-I<X>>|/-X FILEHANDLE>, L<C<binmode>|/binmode FILEHANDLE, LAYER>,
L<C<chmod>|/chmod LIST>, L<C<chown>|/chown LIST>,
L<C<chroot>|/chroot FILENAME>, L<C<crypt>|/crypt PLAINTEXT,SALT>,
L<C<dbmclose>|/dbmclose HASH>, L<C<dbmopen>|/dbmopen HASH,DBNAME,MASK>,
L<C<dump>|/dump LABEL>, L<C<endgrent>|/endgrent>,
L<C<endhostent>|/endhostent>, L<C<endnetent>|/endnetent>,
L<C<endprotoent>|/endprotoent>, L<C<endpwent>|/endpwent>,
L<C<endservent>|/endservent>, L<C<exec>|/exec LIST>,
L<C<fcntl>|/fcntl FILEHANDLE,FUNCTION,SCALAR>,
L<C<flock>|/flock FILEHANDLE,OPERATION>, L<C<fork>|/fork>,
L<C<getgrent>|/getgrent>, L<C<getgrgid>|/getgrgid GID>,
L<C<gethostbyname>|/gethostbyname NAME>, L<C<gethostent>|/gethostent>,
L<C<getlogin>|/getlogin>,
L<C<getnetbyaddr>|/getnetbyaddr ADDR,ADDRTYPE>,
L<C<getnetbyname>|/getnetbyname NAME>, L<C<getnetent>|/getnetent>,
L<C<getppid>|/getppid>, L<C<getpgrp>|/getpgrp PID>,
L<C<getpriority>|/getpriority WHICH,WHO>,
L<C<getprotobynumber>|/getprotobynumber NUMBER>,
L<C<getprotoent>|/getprotoent>, L<C<getpwent>|/getpwent>,
L<C<getpwnam>|/getpwnam NAME>, L<C<getpwuid>|/getpwuid UID>,
L<C<getservbyport>|/getservbyport PORT,PROTO>,
L<C<getservent>|/getservent>,
L<C<getsockopt>|/getsockopt SOCKET,LEVEL,OPTNAME>,
L<C<glob>|/glob EXPR>, L<C<ioctl>|/ioctl FILEHANDLE,FUNCTION,SCALAR>,
L<C<kill>|/kill SIGNAL, LIST>, L<C<link>|/link OLDFILE,NEWFILE>,
L<C<lstat>|/lstat FILEHANDLE>, L<C<msgctl>|/msgctl ID,CMD,ARG>,
L<C<msgget>|/msgget KEY,FLAGS>,
L<C<msgrcv>|/msgrcv ID,VAR,SIZE,TYPE,FLAGS>,
L<C<msgsnd>|/msgsnd ID,MSG,FLAGS>, L<C<open>|/open FILEHANDLE,EXPR>,
L<C<pipe>|/pipe READHANDLE,WRITEHANDLE>, L<C<readlink>|/readlink EXPR>,
L<C<rename>|/rename OLDNAME,NEWNAME>,
L<C<select>|/select RBITS,WBITS,EBITS,TIMEOUT>,
L<C<semctl>|/semctl ID,SEMNUM,CMD,ARG>,
L<C<semget>|/semget KEY,NSEMS,FLAGS>, L<C<semop>|/semop KEY,OPSTRING>,
L<C<setgrent>|/setgrent>, L<C<sethostent>|/sethostent STAYOPEN>,
L<C<setnetent>|/setnetent STAYOPEN>, L<C<setpgrp>|/setpgrp PID,PGRP>,
L<C<setpriority>|/setpriority WHICH,WHO,PRIORITY>,
L<C<setprotoent>|/setprotoent STAYOPEN>, L<C<setpwent>|/setpwent>,
L<C<setservent>|/setservent STAYOPEN>,
L<C<setsockopt>|/setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL>,
L<C<shmctl>|/shmctl ID,CMD,ARG>, L<C<shmget>|/shmget KEY,SIZE,FLAGS>,
L<C<shmread>|/shmread ID,VAR,POS,SIZE>,
L<C<shmwrite>|/shmwrite ID,STRING,POS,SIZE>,
L<C<socket>|/socket SOCKET,DOMAIN,TYPE,PROTOCOL>,
L<C<socketpair>|/socketpair SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL>,
L<C<stat>|/stat FILEHANDLE>, L<C<symlink>|/symlink OLDFILE,NEWFILE>,
L<C<syscall>|/syscall NUMBER, LIST>,
L<C<sysopen>|/sysopen FILEHANDLE,FILENAME,MODE>,
L<C<system>|/system LIST>, L<C<times>|/times>,
L<C<truncate>|/truncate FILEHANDLE,LENGTH>, L<C<umask>|/umask EXPR>,
L<C<unlink>|/unlink LIST>, L<C<utime>|/utime LIST>, L<C<wait>|/wait>,
L<C<waitpid>|/waitpid PID,FLAGS>

For more information about the portability of these functions, see
L<perlport> and other available platform-specific documentation.

=head2 Alphabetical Listing of Perl Functions

=over

=item -X FILEHANDLE
X<-r>X<-w>X<-x>X<-o>X<-R>X<-W>X<-X>X<-O>X<-e>X<-z>X<-s>X<-f>X<-d>X<-l>X<-p>
X<-S>X<-b>X<-c>X<-t>X<-u>X<-g>X<-k>X<-T>X<-B>X<-M>X<-A>X<-C>

=item -X EXPR

=item -X DIRHANDLE

=item -X

=for Pod::Functions a file test (-r, -x, etc)

A file test, where X is one of the letters listed below.  This unary
operator takes one argument, either a filename, a filehandle, or a dirhandle,
and tests the associated file to see if something is true about it.  If the
argument is omitted, tests L<C<$_>|perlvar/$_>, except for C<-t>, which
tests STDIN.  Unless otherwise documented, it returns C<1> for true and
C<''> for false.  If the file doesn't exist or can't be examined, it
returns L<C<undef>|/undef EXPR> and sets L<C<$!>|perlvar/$!> (errno).
Despite the funny names, precedence is the same as any other named unary
operator.  The operator may be any of:

    -r  File is readable by effective uid/gid.
    -w  File is writable by effective uid/gid.
    -x  File is executable by effective uid/gid.
    -o  File is owned by effective uid.

    -R  File is readable by real uid/gid.
    -W  File is writable by real uid/gid.
    -X  File is executable by real uid/gid.
    -O  File is owned by real uid.

    -e  File exists.
    -z  File has zero size (is empty).
    -s  File has nonzero size (returns size in bytes).

    -f  File is a plain file.
    -d  File is a directory.
    -l  File is a symbolic link (false if symlinks aren't
        supported by the file system).
    -p  File is a named pipe (FIFO), or Filehandle is a pipe.
    -S  File is a socket.
    -b  File is a block special file.
    -c  File is a character special file.
    -t  Filehandle is opened to a tty.

    -u  File has setuid bit set.
    -g  File has setgid bit set.
    -k  File has sticky bit set.

    -T  File is an ASCII or UTF-8 text file (heuristic guess).
    -B  File is a "binary" file (opposite of -T).

    -M  Script start time minus file modification time, in days.
    -A  Same for access time.
    -C  Same for inode change time (Unix, may differ for other
        platforms)

Example:

    while (<>) {
        chomp;
        next unless -f $_;  # ignore specials
        #...
    }

Note that C<-s/a/b/> does not do a negated substitution.  Saying
C<-exp($foo)> still works as expected, however: only single letters
following a minus are interpreted as file tests.

These operators are exempt from the "looks like a function" rule
described above.  That is, an opening parenthesis after the operator does
not affect how much of the following code constitutes the argument.  Put
the opening parenthesis before the operator to separate it from code that
follows (this applies only to operators with higher precedence than unary
operators, of course):

    -s($file) + 1024   # probably wrong; same as -s($file + 1024)
    (-s $file) + 1024  # correct

The interpretation of the file permission operators C<-r>, C<-R>,
C<-w>, C<-W>, C<-x>, and C<-X> is by default based solely on the mode
of the file and the uids and gids of the user.  There may be other
reasons you can't actually read, write, or execute the file: for
example network filesystem access controls, ACLs (access control lists),
read-only filesystems, and unrecognized executable formats.  Note
that the use of these six specific operators to verify if some operation
is possible is usually a mistake, because it may be open to race
conditions.
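
Because of that race, it is usually more robust to attempt the operation
and inspect the error than to test first.  A minimal sketch, assuming a
hypothetical file name in C<$path> that does not exist on your system:

    use strict;
    use warnings;

    my $path = "/no/such/dir/example.txt";   # hypothetical file name

    # Attempt the open directly instead of asking -r first; the
    # result of open is the only answer that cannot go stale.
    my $status;
    if (open(my $fh, '<', $path)) {
        $status = "opened";
        close($fh);
    }
    else {
        $status = "failed: $!";
    }
    print "$status\n";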

Also note that, for the superuser on the local filesystems, the C<-r>,
C<-R>, C<-w>, and C<-W> tests always return 1, and C<-x> and C<-X> return 1
if any execute bit is set in the mode.  Scripts run by the superuser
may thus need to do a L<C<stat>|/stat FILEHANDLE> to determine the
actual mode of the file, or temporarily set their effective uid to
something else.
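
As an illustration of reading the mode bits directly, the following
sketch creates a scratch file with L<File::Temp> (an assumption made only
so the example is self-contained) and masks the mode returned by
L<C<stat>|/stat FILEHANDLE>:

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    my ($fh, $file) = tempfile(UNLINK => 1);   # scratch file for the demo
    chmod 0640, $file or die "Cannot chmod $file: $!";

    # Element 2 of the stat list is the mode; keep the permission bits.
    my $mode = (stat $file)[2] & 07777;
    my $owner_can_write = $mode & 0200;

    printf "%s has mode %04o\n", $file, $mode;
    print "owner write bit is set\n" if $owner_can_write;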

If you are using ACLs, there is a pragma called L<C<filetest>|filetest>
that may produce more accurate results than the bare
L<C<stat>|/stat FILEHANDLE> mode bits.
When under C<use filetest 'access'>, the above-mentioned filetests
test whether the permission can(not) be granted using the L<access(2)>
family of system calls.  Also note that the C<-x> and C<-X> tests may
under this pragma return true even if there are no execute permission
bits set (nor any extra execute permission ACLs).  This strangeness is
due to the underlying system calls' definitions.  Note also that, due to
the implementation of C<use filetest 'access'>, the C<_> special
filehandle won't cache the results of the file tests when this pragma is
in effect.  Read the documentation for the L<C<filetest>|filetest>
pragma for more information.

The C<-T> and C<-B> tests work as follows.  The first block or so of
the file is examined to see if it is valid UTF-8 that includes non-ASCII
characters.  If so, it's a C<-T> file.  Otherwise, that same portion of
the file is examined for odd characters such as strange control codes or
characters with the high bit set.  If more than a third of the
characters are strange, it's a C<-B> file; otherwise it's a C<-T> file.
Also, any file containing a zero byte in the examined portion is
considered a binary file.  (If executed within the scope of a L<S<use
locale>|perllocale> which includes C<LC_CTYPE>, odd characters are
anything that is neither printable nor a space in the current locale.)
If C<-T> or C<-B> is used on a filehandle, the current IO buffer is
examined rather than the first block.  Both C<-T> and C<-B> return true
on an empty file, or a file at EOF when testing a filehandle.  Because
you have to read a file to do the C<-T> test, on most occasions you want
to use a C<-f> against the file first, as in
C<next unless -f $file && -T $file>.

If any of the file tests (or either the L<C<stat>|/stat FILEHANDLE> or
L<C<lstat>|/lstat FILEHANDLE> operator) is given the special filehandle
consisting of a solitary underline, then the stat structure of the
previous file test (or L<C<stat>|/stat FILEHANDLE> operator) is used,
saving a system call.  (This doesn't work with C<-t>, and you need to
remember that L<C<lstat>|/lstat FILEHANDLE> and C<-l> leave values in
the stat structure for the symbolic link, not the real file.)  (Also, if
the stat buffer was filled by an L<C<lstat>|/lstat FILEHANDLE> call,
C<-T> and C<-B> will reset it with the results of C<stat _>).
Example:

    print "Can do.\n" if -r $a || -w _ || -x _;

    stat($filename);
    print "Readable\n" if -r _;
    print "Writable\n" if -w _;
    print "Executable\n" if -x _;
    print "Setuid\n" if -u _;
    print "Setgid\n" if -g _;
    print "Sticky\n" if -k _;
    print "Text\n" if -T _;
    print "Binary\n" if -B _;

As of Perl 5.10.0, as a form of purely syntactic sugar, you can stack file
test operators, in a way that C<-f -w -x $file> is equivalent to
C<-x $file && -w _ && -f _>.  (This is only fancy syntax: if you use
the return value of C<-f $file> as an argument to another filetest
operator, no special magic will happen.)

Portability issues: L<perlport/-X>.

To avoid confusing would-be users of your code with mysterious
syntax errors, put something like this at the top of your script:

    use 5.010;  # so filetest ops can stack
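
A short sketch of the stacked form next to its longhand equivalent.  The
scratch file from L<File::Temp> is an assumption used only to have a
known plain, readable file to test:

    use 5.010;
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    my ($fh, $file) = tempfile(UNLINK => 1);   # scratch file for the demo

    # Stacked form: the tests are applied right to left (-r, then -f).
    my $stacked  = -f -r $file;

    # Longhand equivalent, reusing the stat buffer through "_".
    my $longhand = -r $file && -f _;

    print "both forms agree\n" if !!$stacked eq !!$longhand;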

=item abs VALUE
X<abs> X<absolute>

=item abs

=for Pod::Functions absolute value function

Returns the absolute value of its argument.
If VALUE is omitted, uses L<C<$_>|perlvar/$_>.
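
For instance, the following sketch (the variable names are illustrative)
relies on the L<C<$_>|perlvar/$_> default inside a
L<C<map>|/map BLOCK LIST> block:

    use strict;
    use warnings;

    my @deltas = (-3, 0, 2.5);
    my @sizes  = map { abs } @deltas;   # abs with no argument uses $_
    print "@sizes\n";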

=item accept NEWSOCKET,GENERICSOCKET
X<accept>

=for Pod::Functions accept an incoming socket connect

Accepts an incoming socket connect, just as L<accept(2)>
does.  Returns the packed address if it succeeded, false otherwise.
See the example in L<perlipc/"Sockets: Client/Server Communication">.

On systems that support a close-on-exec flag on files, the flag will
be set for the newly opened file descriptor, as determined by the
value of L<C<$^F>|perlvar/$^F>.  See L<perlvar/$^F>.

=item alarm SECONDS
X<alarm>
X<SIGALRM>
X<timer>

=item alarm

=for Pod::Functions schedule a SIGALRM

Arranges to have a SIGALRM delivered to this process after the
specified number of wallclock seconds has elapsed.  If SECONDS is not
specified, the value stored in L<C<$_>|perlvar/$_> is used.  (On some
machines, unfortunately, the elapsed time may be up to one second less
or more than you specified because of how seconds are counted, and
process scheduling may delay the delivery of the signal even further.)

Only one timer may be counting at once.  Each call disables the
previous timer, and an argument of C<0> may be supplied to cancel the
previous timer without starting a new one.  The returned value is the
amount of time remaining on the previous timer.
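
The return value can be observed by cancelling a timer right after
starting it.  This is only a sketch; the exact number reported may be one
less than the value requested, depending on how your system counts
seconds:

    use strict;
    use warnings;

    alarm 30;                  # start a 30-second timer
    my $remaining = alarm 0;   # cancel it; returns the seconds left
    print "cancelled with $remaining seconds left\n";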

For delays of finer granularity than one second, the L<Time::HiRes> module
(from CPAN, and starting from Perl 5.8 part of the standard
distribution) provides
L<C<ualarm>|Time::HiRes/ualarm ( $useconds [, $interval_useconds ] )>.
You may also use Perl's four-argument version of
L<C<select>|/select RBITS,WBITS,EBITS,TIMEOUT> leaving the first three
arguments undefined, or you might be able to use the
L<C<syscall>|/syscall NUMBER, LIST> interface to access L<setitimer(2)>
if your system supports it.  See L<perlfaq8> for details.
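
The four-argument L<C<select>|/select RBITS,WBITS,EBITS,TIMEOUT> approach
might look like the sketch below.  L<Time::HiRes> is loaded here only to
measure the delay; it is not needed for the sleep itself:

    use strict;
    use warnings;
    use Time::HiRes qw(time);

    my $start = time;
    select(undef, undef, undef, 0.25);   # wait about a quarter second
    my $elapsed = time - $start;
    printf "slept for roughly %.2f seconds\n", $elapsed;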

It is usually a mistake to intermix L<C<alarm>|/alarm SECONDS> and
L<C<sleep>|/sleep EXPR> calls, because L<C<sleep>|/sleep EXPR> may be
internally implemented on your system with L<C<alarm>|/alarm SECONDS>.

If you want to use L<C<alarm>|/alarm SECONDS> to time out a system call
you need to use an L<C<eval>|/eval EXPR>/L<C<die>|/die LIST> pair.  You
can't rely on the alarm causing the system call to fail with
L<C<$!>|perlvar/$!> set to C<EINTR> because Perl sets up signal handlers
to restart system calls on some systems.  Using
L<C<eval>|/eval EXPR>/L<C<die>|/die LIST> always works, modulo the
caveats given in L<perlipc/"Signals">.

    eval {
        local $SIG{ALRM} = sub { die "alarm\n" }; # NB: \n required
        alarm $timeout;
        my $nread = sysread $socket, $buffer, $size;
        alarm 0;
    };
    if ($@) {
        die unless $@ eq "alarm\n";   # propagate unexpected errors
        # timed out
    }
    else {
        # didn't
    }

For more information see L<perlipc>.

Portability issues: L<perlport/alarm>.

=item atan2 Y,X
X<atan2> X<arctangent> X<tan> X<tangent>

=for Pod::Functions arctangent of Y/X in the range -PI to PI

Returns the arctangent of Y/X in the range -PI to PI.

For the tangent operation, you may use the
L<C<Math::Trig::tan>|Math::Trig/B<tan>> function, or use the familiar
relation:

    sub tan { sin($_[0]) / cos($_[0])  }

The return value for C<atan2(0,0)> is implementation-defined; consult
your L<atan2(3)> manpage for more information.
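
Because both arguments are taken into account, the result identifies the
quadrant of the point (X,Y).  A small sketch, which also derives PI from
the fact that C<atan2(1, 1)> is PI/4:

    use strict;
    use warnings;

    my $pi    = 4 * atan2(1, 1);   # atan2(1, 1) is PI/4
    my $angle = atan2(1, -1);      # Y = 1, X = -1: second quadrant
    printf "pi = %.5f, angle = %.1f degrees\n", $pi, $angle * 180 / $pi;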

Portability issues: L<perlport/atan2>.

=item bind SOCKET,NAME
X<bind>

=for Pod::Functions binds an address to a socket

Binds a network address to a socket, just as L<bind(2)>
does.  Returns true if it succeeded, false otherwise.  NAME should be a
packed address of the appropriate type for the socket.  See the examples in
L<perlipc/"Sockets: Client/Server Communication">.

=item binmode FILEHANDLE, LAYER
X<binmode> X<binary> X<text> X<DOS> X<Windows>

=item binmode FILEHANDLE

=for Pod::Functions prepare binary files for I/O

Arranges for FILEHANDLE to be read or written in "binary" or "text"
mode on systems where the run-time libraries distinguish between
binary and text files.  If FILEHANDLE is an expression, the value is
taken as the name of the filehandle.  Returns true on success,
otherwise it returns L<C<undef>|/undef EXPR> and sets
L<C<$!>|perlvar/$!> (errno).

On some systems (in general, DOS- and Windows-based systems)
L<C<binmode>|/binmode FILEHANDLE, LAYER> is necessary when you're not
working with a text file.  For the sake of portability it is a good idea
always to use it when appropriate, and never to use it when it isn't
appropriate.  Also, users can configure their I/O to be UTF8-encoded
Unicode by default, rather than bytes.

In other words: regardless of platform, use
L<C<binmode>|/binmode FILEHANDLE, LAYER> on binary data, like images,
for example.

If LAYER is present it is a single string, but may contain multiple
directives.  The directives alter the behaviour of the filehandle.
When LAYER is present, using binmode on a text file makes sense.

If LAYER is omitted or specified as C<:raw> the filehandle is made
suitable for passing binary data.  This includes turning off possible CRLF
translation and marking it as bytes (as opposed to Unicode characters).
Note that, despite what may be implied in I<"Programming Perl"> (the
Camel, 3rd edition) or elsewhere, C<:raw> is I<not> simply the inverse of C<:crlf>.
Other layers that would affect the binary nature of the stream are
I<also> disabled.  See L<PerlIO>, L<perlrun>, and the discussion about the
PERLIO environment variable.

The C<:bytes>, C<:crlf>, C<:utf8>, and any other directives of the
form C<:...>, are called I/O I<layers>.  The L<open> pragma can be used to
establish default I/O layers.

I<The LAYER parameter of the L<C<binmode>|/binmode FILEHANDLE, LAYER>
function is described as "DISCIPLINE" in "Programming Perl, 3rd
Edition".  However, since the publishing of this book, by many known as
"Camel III", the consensus of the naming of this functionality has moved
from "discipline" to "layer".  All documentation of this version of Perl
therefore refers to "layers" rather than to "disciplines".  Now back to
the regularly scheduled documentation...>

To mark FILEHANDLE as UTF-8, use C<:utf8> or C<:encoding(UTF-8)>.
C<:utf8> just marks the data as UTF-8 without further checking,
while C<:encoding(UTF-8)> checks the data for actually being valid
UTF-8.  More details can be found in L<PerlIO::encoding>.
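
As a sketch of the effect of an encoding layer, the following writes a
four-character string through C<:encoding(UTF-8)> and counts the bytes
produced.  An in-memory filehandle is used here as an assumption to keep
the example self-contained; the same L<C<binmode>|/binmode FILEHANDLE, LAYER>
call applies to a handle opened on a real file:

    use strict;
    use warnings;

    my $text = "caf\x{e9}";    # 4 characters; the last is U+00E9

    # Write through an encoding layer into an in-memory file.
    my $bytes = '';
    open(my $out, '>', \$bytes) or die "Cannot open: $!";
    binmode($out, ':encoding(UTF-8)') or die "Cannot binmode: $!";
    print {$out} $text;
    close($out) or die "Cannot close: $!";

    printf "%d characters became %d bytes\n",
        length($text), length($bytes);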

In general, L<C<binmode>|/binmode FILEHANDLE, LAYER> should be called
after L<C<open>|/open FILEHANDLE,EXPR> but before any I/O is done on the
filehandle.  Calling L<C<binmode>|/binmode FILEHANDLE, LAYER> normally
flushes any pending buffered output data (and perhaps pending input
data) on the handle.  An exception to this is the C<:encoding> layer
that changes the default character encoding of the handle.
The C<:encoding> layer sometimes needs to be called in
mid-stream, and it doesn't flush the stream.  C<:encoding>
also implicitly pushes on top of itself the C<:utf8> layer because
internally Perl operates on UTF8-encoded Unicode characters.

The operating system, device drivers, C libraries, and Perl run-time
system all conspire to let the programmer treat a single
character (C<\n>) as the line terminator, irrespective of external
representation.  On many operating systems, the native text file
representation matches the internal representation, but on some
platforms the external representation of C<\n> is made up of more than
one character.

All variants of Unix, Mac OS (old and new), and Stream_LF files on VMS use
a single character to end each line in the external representation of text
(even though that single character is CARRIAGE RETURN on old, pre-Darwin
flavors of Mac OS, and is LINE FEED on Unix and most VMS files).  In other
systems like OS/2, DOS, and the various flavors of MS-Windows, your program
sees a C<\n> as a simple C<\cJ>, but what's stored in text files are the
two characters C<\cM\cJ>.  That means that if you don't use
L<C<binmode>|/binmode FILEHANDLE, LAYER> on these systems, C<\cM\cJ>
sequences on disk will be converted to C<\n> on input, and any C<\n> in
your program will be converted back to C<\cM\cJ> on output.  This is
what you want for text files, but it can be disastrous for binary files.

Another consequence of using L<C<binmode>|/binmode FILEHANDLE, LAYER>
(on some systems) is that special end-of-file markers will be seen as
part of the data stream.  For systems from the Microsoft family this
means that, if your binary data contain C<\cZ>, the I/O subsystem will
regard it as the end of the file, unless you use
L<C<binmode>|/binmode FILEHANDLE, LAYER>.
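
The following sketch round-trips eight bytes that contain both a CRLF
pair and a C<\cZ> (they happen to be the PNG file signature).  The
scratch file from L<File::Temp> is an assumption made for illustration.
With L<C<binmode>|/binmode FILEHANDLE, LAYER> on both handles, the bytes
should come back unchanged on any platform:

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    my $signature = "\x89PNG\r\n\x1a\n";   # contains CRLF and \cZ

    my ($out, $name) = tempfile(UNLINK => 1);
    binmode($out) or die "Cannot binmode: $!";
    print {$out} $signature;
    close($out) or die "Cannot close $name: $!";

    open(my $in, '<', $name) or die "Cannot open $name: $!";
    binmode($in) or die "Cannot binmode: $!";
    my $got = read($in, my $copy, 16);     # ask for more than was written
    close($in);

    print "read $got bytes, ",
        ($copy eq $signature ? "unchanged" : "altered"), "\n";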

L<C<binmode>|/binmode FILEHANDLE, LAYER> is important not only for
L<C<readline>|/readline EXPR> and L<C<print>|/print FILEHANDLE LIST>
operations, but also when using
L<C<read>|/read FILEHANDLE,SCALAR,LENGTH,OFFSET>,
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
L<C<sysread>|/sysread FILEHANDLE,SCALAR,LENGTH,OFFSET>,
L<C<syswrite>|/syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET> and
L<C<tell>|/tell FILEHANDLE> (see L<perlport> for more details).  See the
L<C<$E<sol>>|perlvar/$E<sol>> and L<C<$\>|perlvar/$\> variables in
L<perlvar> for how to manually set your input and output
line-termination sequences.

Portability issues: L<perlport/binmode>.

=item bless REF,CLASSNAME
X<bless>

=item bless REF

=for Pod::Functions create an object

This function tells the thingy referenced by REF that it is now an object
in the CLASSNAME package.  If CLASSNAME is omitted, the current package
is used.  Because a L<C<bless>|/bless REF,CLASSNAME> is often the last
thing in a constructor, it returns the reference for convenience.
Always use the two-argument version if a derived class might inherit the
method doing the blessing.  See L<perlobj> for more about the blessing
(and blessings) of objects.

Consider always blessing objects in CLASSNAMEs that are mixed case.
Namespaces with all lowercase names are considered reserved for
Perl pragmas.  Builtin types have all uppercase names.  To prevent
confusion, you may wish to avoid such package names as well.  Make sure
that CLASSNAME is a true value.

See L<perlmod/"Perl Modules">.
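
A minimal constructor sketch using the two-argument form.  The class
names C<Counter> and C<Counter::Loud> are hypothetical; the point is that
an inherited C<new> blesses into whichever class it was invoked on:

    use strict;
    use warnings;

    package Counter;              # hypothetical class for illustration

    sub new {
        my ($class, %args) = @_;
        my $self = { count => $args{start} // 0 };
        return bless $self, $class;   # two-argument form
    }

    package Counter::Loud;        # hypothetical subclass
    our @ISA = ('Counter');

    package main;

    my $obj = Counter::Loud->new(start => 5);
    print ref($obj), " starts at $obj->{count}\n";

Had C<new> used the one-argument form, the object would have been blessed
into C<Counter> even when created through C<Counter::Loud>.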

=item break

=for Pod::Functions +switch break out of a C<given> block

Break out of a C<given> block.

L<C<break>|/break> is available only if the
L<C<"switch"> feature|feature/The 'switch' feature> is enabled or if it
is prefixed with C<CORE::>. The
L<C<"switch"> feature|feature/The 'switch' feature> is enabled
automatically with a C<use v5.10> (or higher) declaration in the current
scope.

=item caller EXPR
X<caller> X<call stack> X<stack> X<stack trace>

=item caller

=for Pod::Functions get context of the current subroutine call

Returns the context of the current pure perl subroutine call.  In scalar
context, returns the caller's package name if there I<is> a caller (that is, if
we're in a subroutine or L<C<eval>|/eval EXPR> or
L<C<require>|/require VERSION>) and the undefined value otherwise.
L<C<caller>|/caller EXPR> never reports XS subs: they are skipped, and
the next pure perl sub appears in their place in its return values.  In
list context, L<C<caller>|/caller EXPR> returns

       # 0         1          2
    my ($package, $filename, $line) = caller;

With EXPR, it returns some extra information that the debugger uses to
print a stack trace.  The value of EXPR indicates how many call frames
to go back before the current one.

    #   0         1          2      3            4
    my ($package, $filename, $line, $subroutine, $hasargs,
    #   5           6          7            8       9         10
        $wantarray, $evaltext, $is_require, $hints, $bitmask, $hinthash)
      = caller($i);

Here, C<$subroutine> is the function that the caller called (rather than
the function containing the caller).  Note that C<$subroutine> may be
C<(eval)> if the frame is not a subroutine call, but an
L<C<eval>|/eval EXPR>.  In such a case additional elements C<$evaltext>
and C<$is_require> are set: C<$is_require> is true if the frame is
created by a L<C<require>|/require VERSION> or
L<C<use>|/use Module VERSION LIST> statement; C<$evaltext> contains the
text of the C<eval EXPR> statement.  In particular, for an C<eval BLOCK>
statement, C<$subroutine> is C<(eval)>, but C<$evaltext> is undefined.
(Note also that each L<C<use>|/use Module VERSION LIST> statement creates
a L<C<require>|/require VERSION> frame inside an C<eval EXPR> frame.)
C<$subroutine> may also be C<(unknown)> if this particular subroutine
happens to have been deleted from the symbol table.  C<$hasargs> is true
if a new instance of L<C<@_>|perlvar/@_> was set up for the frame.
C<$hints> and C<$bitmask> contain pragmatic hints that the caller was
compiled with.  C<$hints> corresponds to L<C<$^H>|perlvar/$^H>, and
C<$bitmask> corresponds to
L<C<${^WARNING_BITS}>|perlvar/${^WARNING_BITS}>.  The C<$hints> and
C<$bitmask> values are subject to change between versions of Perl, and
are not meant for external use.

C<$hinthash> is a reference to a hash containing the value of
L<C<%^H>|perlvar/%^H> when the caller was compiled, or
L<C<undef>|/undef EXPR> if L<C<%^H>|perlvar/%^H> was empty.  Do not
modify the values of this hash, as they are the actual values stored in
the optree.
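
A small sketch of the list-context form.  The subroutine name C<whoami>
is hypothetical, and the script is assumed to run in package C<main>:

    use strict;
    use warnings;

    sub whoami {
        # Frame 0 describes the call to whoami itself.
        my ($package, $filename, $line, $subroutine) = caller(0);
        return "$subroutine called from package $package";
    }

    my $report = whoami();
    print "$report\n";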

Furthermore, when called from within the DB package in
list context, and with an argument, caller returns more
detailed information: it sets the list variable C<@DB::args> to be the
arguments with which the subroutine was invoked.

Be aware that the optimizer might have optimized call frames away before
L<C<caller>|/caller EXPR> had a chance to get the information.  That
means that C<caller(N)> might not return information about the call
frame you expect it to, for C<< N > 1 >>.  In particular, C<@DB::args>
might have information from the previous time L<C<caller>|/caller EXPR>
was called.

Be aware that setting C<@DB::args> is I<best effort>, intended for
debugging or generating backtraces, and should not be relied upon.  In
particular, as L<C<@_>|perlvar/@_> contains aliases to the caller's
arguments, Perl does not take a copy of L<C<@_>|perlvar/@_>, so
C<@DB::args> will contain modifications the subroutine makes to
L<C<@_>|perlvar/@_> or its contents, not the original values at call
time.  C<@DB::args>, like L<C<@_>|perlvar/@_>, does not hold explicit
references to its elements, so under certain cases its elements may have
become freed and reallocated for other variables or temporary values.
Finally, a side effect of the current implementation is that the effects
of C<shift @_> can I<normally> be undone (but not C<pop @_> or other
splicing, I<and> not if a reference to L<C<@_>|perlvar/@_> has been
taken, I<and> subject to the caveat about reallocated elements), so
C<@DB::args> is actually a hybrid of the current state and initial state
of L<C<@_>|perlvar/@_>.  Buyer beware.
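
As a minimal sketch of the C<@DB::args> mechanism described above, a
function can switch to the C<DB> package just long enough to query its
caller's frame.  The helper name C<args_of_caller> is hypothetical, and
all of the best-effort caveats above still apply:

    use strict;
    use warnings;

    # Hypothetical helper: returns the arguments its caller was
    # invoked with, as recorded in @DB::args.
    sub args_of_caller {
        package DB;              # @DB::args is set only for calls from DB
        my @frame = caller(1);   # list context, with an argument
        return @frame ? @DB::args : ();
    }

    sub greet { return join ", ", args_of_caller() }
    print greet("hello", "world"), "\n";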

=item chdir EXPR
X<chdir>
X<cd>
X<directory, change>

=item chdir FILEHANDLE

=item chdir DIRHANDLE

=item chdir

=for Pod::Functions change your current working directory

Changes the working directory to EXPR, if possible.  If EXPR is omitted,
changes to the directory specified by C<$ENV{HOME}>, if set; if not,
changes to the directory specified by C<$ENV{LOGDIR}>.  (Under VMS, the
variable C<$ENV{'SYS$LOGIN'}> is also checked, and used if it is set.)  If
neither is set, L<C<chdir>|/chdir EXPR> does nothing and fails.  It
returns true on success, false otherwise.  See the example under
L<C<die>|/die LIST>.

On systems that support L<fchdir(2)>, you may pass a filehandle or
directory handle as the argument.  On systems that don't support L<fchdir(2)>,
passing handles raises an exception.
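
The following sketch uses a directory handle to return to the starting
directory after working elsewhere.  It assumes a system with
L<fchdir(2)> support; the target directory F</> is only an illustration:

    use strict;
    use warnings;

    # Remember the current directory as a directory handle.
    opendir(my $start, ".") or die "Can't opendir '.': $!";

    chdir "/" or die "Can't chdir to /: $!";
    # ... work relative to / here ...

    # Return to where we started (raises an exception without fchdir(2)).
    chdir $start or die "Can't chdir back: $!";
    closedir $start;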

=item chmod LIST
X<chmod> X<permission> X<mode>

=for Pod::Functions changes the permissions on a list of files

Changes the permissions of a list of files.  The first element of the
list must be the numeric mode, which should probably be an octal
number, and which definitely should I<not> be a string of octal digits:
C<0644> is okay, but C<"0644"> is not.  Returns the number of files
successfully changed.  See also L<C<oct>|/oct EXPR> if all you have is a
string.

    my $cnt = chmod 0755, "foo", "bar";
    chmod 0755, @executables;
    my $mode = "0644"; chmod $mode, "foo";      # !!! sets mode to
                                                # --w----r-T
    my $mode = "0644"; chmod oct($mode), "foo"; # this is better
    my $mode = 0644;   chmod $mode, "foo";      # this is best

On systems that support L<fchmod(2)>, you may pass filehandles among the
files.  On systems that don't support L<fchmod(2)>, passing filehandles raises
an exception.  Filehandles must be passed as globs or glob references to be
recognized; barewords are considered filenames.

    open(my $fh, "<", "foo");
    my $perm = (stat $fh)[2] & 07777;
    chmod($perm | 0600, $fh);

You can also import the symbolic C<S_I*> constants from the
L<C<Fcntl>|Fcntl> module:

    use Fcntl qw( :mode );
    chmod S_IRWXU|S_IRGRP|S_IXGRP|S_IROTH|S_IXOTH, @executables;
    # Identical to the chmod 0755 of the example above.
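
Because L<C<chmod>|/chmod LIST> returns the number of files changed, a
partial failure can be detected by comparing that count with the size of
the list.  A minimal sketch, assuming hypothetical file names and a mode
that arrives as a string:

    use strict;
    use warnings;

    my @files    = ("foo", "bar");     # hypothetical file names
    my $mode_str = "0644";             # e.g. read from a config file

    my $changed = chmod oct($mode_str), @files;
    warn "Changed only $changed of ", scalar(@files), " files: $!\n"
        if $changed != @files;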

Portability issues: L<perlport/chmod>.

=item chomp VARIABLE
X<chomp> X<INPUT_RECORD_SEPARATOR> X<$/> X<newline> X<eol>

=item chomp( LIST )

=item chomp

=for Pod::Functions remove a trailing record separator from a string

This safer version of L<C<chop>|/chop VARIABLE> removes any trailing
string that corresponds to the current value of
L<C<$E<sol>>|perlvar/$E<sol>> (also known as C<$INPUT_RECORD_SEPARATOR>
in the L<C<English>|English> module).  It returns the total
number of characters removed from all its arguments.  It's often used to
remove the newline from the end of an input record when you're worried
that the final record may be missing its newline.  When in paragraph
mode (C<$/ = ''>), it removes all trailing newlines from the string.
When in slurp mode (C<$/ = undef>) or fixed-length record mode
(L<C<$E<sol>>|perlvar/$E<sol>> is a reference to an integer or the like;
see L<perlvar>), L<C<chomp>|/chomp VARIABLE> won't remove anything.
If VARIABLE is omitted, it chomps L<C<$_>|perlvar/$_>.  Example:

    while (<>) {
        chomp;  # avoid \n on last field
        my @array = split(/:/);
        # ...
    }

If VARIABLE is a hash, it chomps the hash's values, but not its keys,
resetting the L<C<each>|/each HASH> iterator in the process.

You can actually chomp anything that's an lvalue, including an assignment:

    chomp(my $cwd = `pwd`);
    chomp(my $answer = <STDIN>);

If you chomp a list, each element is chomped, and the total number of
characters removed is returned.

Note that parentheses are necessary when you're chomping anything
that is not a simple variable.  This is because C<chomp $cwd = `pwd`;>
is interpreted as C<(chomp $cwd) = `pwd`;>, rather than as
C<chomp( $cwd = `pwd` )> which you might expect.  Similarly,
C<chomp $a, $b> is interpreted as C<chomp($a), $b> rather than
as C<chomp($a, $b)>.
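
A short sketch of the return value and of paragraph mode, using
illustrative data:

    use strict;
    use warnings;

    my @lines   = ("one\n", "two\n", "three");
    my $removed = chomp(@lines);    # 2: "three" had no newline to remove

    {
        local $/ = "";              # paragraph mode
        my $text = "para\n\n\n";
        chomp($text);               # $text is now "para"
    }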

=item chop VARIABLE
X<chop>

=item chop( LIST )

=item chop

=for Pod::Functions remove the last character from a string

Chops off the last character of a string and returns the character
chopped.  It is much more efficient than C<s/.$//s> because it neither
scans nor copies the string.  If VARIABLE is omitted, chops
L<C<$_>|perlvar/$_>.
If VARIABLE is a hash, it chops the hash's values, but not its keys,
resetting the L<C<each>|/each HASH> iterator in the process.

You can actually chop anything that's an lvalue, including an assignment.

If you chop a list, each element is chopped.  Only the value of the
last L<C<chop>|/chop VARIABLE> is returned.

Note that L<C<chop>|/chop VARIABLE> returns the last character.  To
return all but the last character, use C<substr($string, 0, -1)>.
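
The difference between the two can be sketched as follows (the variable
names are illustrative):

    use strict;
    use warnings;

    my $string = "hello";
    my $last   = chop $string;           # $last is "o"; $string is "hell"

    my $other  = "world";
    my $front  = substr($other, 0, -1);  # $front is "worl";
                                         # $other is unchanged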

See also L<C<chomp>|/chomp VARIABLE>.

=item chown LIST
X<chown> X<owner> X<user> X<group>

=for Pod::Functions change the ownership on a list of files

Changes the owner (and group) of a list of files.  The first two
elements of the list must be the I<numeric> uid and gid, in that
order.  A value of -1 in either position is interpreted by most
systems to leave that value unchanged.  Returns the number of files
successfully changed.

    my $cnt = chown $uid, $gid, 'foo', 'bar';
    chown $uid, $gid, @filenames;
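
To change only the group, pass -1 as the uid.  This sketch assumes a
system that honors -1 as "leave unchanged"; the file and group names are
hypothetical:

    use strict;
    use warnings;

    my $file = "foo";                  # hypothetical file name
    my $gid  = getgrnam("staff");      # hypothetical group; in scalar
                                       # context getgrnam returns the gid
    defined $gid or die "No such group\n";

    chown(-1, $gid, $file) == 1
        or warn "Can't change group of $file: $!\n";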

On systems that support L<fchown(2)>, you may pass filehandles among the
files.  On systems that don't support L<fchown(2)>, passing filehandles raises
an exception.  Filehandles must be passed as globs or glob references to be
recognized; barewords are considered filenames.

Here's an example that looks up nonnumeric uids in the passwd file:

    print "User: ";
    chomp(my $user = <STDIN>);
    print "Files: ";
    chomp(my $pattern = <STDIN>);

    my ($login,$pass,$uid,$gid) = getpwnam($user)
        or die "$user not in passwd file";

    my @ary = glob($pattern);  # expand filenames
    chown $uid, $gid, @ary;

On most systems, you are not allowed to change the ownership of the
file unless you're the superuser, although you should be able to change
the group to any of your secondary groups.  On insecure systems, these
restrictions may be relaxed, but this is not a portable assumption.
On POSIX systems, you can detect this condition this way:

    use POSIX qw(pathconf _PC_CHOWN_RESTRICTED);
    my $can_chown_giveaway
        = ! pathconf($path_of_interest, _PC_CHOWN_RESTRICTED);

Portability issues: L<perlport/chown>.

=item chr NUMBER
X<chr> X<character> X<ASCII> X<Unicode>

=item chr

=for Pod::Functions get character this number represents

Returns the character represented by that NUMBER in the character set.
For example, C<chr(65)> is C<"A"> in either ASCII or Unicode, and
C<chr(0x263a)> is a Unicode smiley face.

Negative values give the Unicode replacement character (C<chr(0xfffd)>),
except under the L<bytes> pragma, where the low eight bits of the value
(truncated to an integer) are used.

If NUMBER is omitted, uses L<C<$_>|perlvar/$_>.

For the reverse, use L<C<ord>|/ord EXPR>.

Note that characters from 128 to 255 (inclusive) are by default
internally not encoded as UTF-8 for backward compatibility reasons.

See L<perlunicode> for more about Unicode.
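
A brief sketch of the round trip between L<C<chr>|/chr NUMBER> and
L<C<ord>|/ord EXPR>, and of printing a character above 255.  The UTF-8
output layer is an assumption about the terminal:

    use strict;
    use warnings;

    my $char = chr(65);           # "A"
    my $code = ord($char);        # 65

    binmode(STDOUT, ":encoding(UTF-8)");  # avoids "Wide character" warnings
    print chr(0x263a), "\n";              # smiley face, encoded as UTF-8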

=item chroot FILENAME
X<chroot> X<root>

=item chroot

=for Pod::Functions make directory new root for path lookups

This function works like the system call by the same name: it makes the
named directory the new root directory for all further pathnames that
begin with a C</> by your process and all its children.  (It doesn't
change your current working directory.)  For security reasons, this
call is restricted to the superuser.  If FILENAME is
omitted, does a L<C<chroot>|/chroot FILENAME> to L<C<$_>|perlvar/$_>.

B<NOTE:>  It is good security practice to do C<chdir("/")>
(L<C<chdir>|/chdir EXPR> to the root directory) immediately after a
L<C<chroot>|/chroot FILENAME>.
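
A minimal sketch of that practice, assuming the process runs as the
superuser and that the hypothetical directory exists:

    my $new_root = "/var/empty";    # hypothetical directory
    chroot($new_root) or die "Can't chroot to $new_root: $!";
    chdir("/")        or die "Can't chdir to new root: $!";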

Portability issues: L<perlport/chroot>.

=item close FILEHANDLE
X<close>

=item close

=for Pod::Functions close file (or pipe or socket) handle

Closes the file or pipe associated with the filehandle, flushes the IO
buffers, and closes the system file descriptor.  Returns true if those
operations succeed and if no error was reported by any PerlIO
layer.  Closes the currently selected filehandle if the argument is
omitted.

You don't have to close FILEHANDLE if you are immediately going to do
another L<C<open>|/open FILEHANDLE,EXPR> on it, because
L<C<open>|/open FILEHANDLE,EXPR> closes it for you.  (See
L<C<open>|/open FILEHANDLE,EXPR>.) However, an explicit
L<C<close>|/close FILEHANDLE> on an input file resets the line counter
(L<C<$.>|perlvar/$.>), while the implicit close done by
L<C<open>|/open FILEHANDLE,EXPR> does not.

If the filehandle came from a piped open, L<C<close>|/close FILEHANDLE>
returns false if one of the other syscalls involved fails or if its
program exits with non-zero status.  If the only problem was that the
program exited non-zero, L<C<$!>|perlvar/$!> will be set to C<0>.
Closing a pipe also waits for the process executing on the pipe to
exit--in case you wish to look at the output of the pipe afterwards--and
implicitly puts the exit status value of that command into
L<C<$?>|perlvar/$?> and
L<C<${^CHILD_ERROR_NATIVE}>|perlvar/${^CHILD_ERROR_NATIVE}>.

If there are multiple threads running, L<C<close>|/close FILEHANDLE> on
a filehandle from a piped open returns true without waiting for the
child process to terminate, if the filehandle is still open in another
thread.

Closing the read end of a pipe before the process writing to it at the
other end is done writing results in the writer receiving a SIGPIPE.  If
the other end can't handle that, be sure to read all the data before
closing the pipe.

Example:

    open(OUTPUT, '|sort >foo')  # pipe to sort
        or die "Can't start sort: $!";
    #...                        # print stuff to output
    close OUTPUT                # wait for sort to finish
        or warn $! ? "Error closing sort pipe: $!"
                   : "Exit status $? from sort";
    open(INPUT, 'foo')          # get sort's results
        or die "Can't open 'foo' for input: $!";

FILEHANDLE may be an expression whose value can be used as an indirect
filehandle, usually the real filehandle name or an autovivified handle.
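
The same checks apply when reading from a pipe through a lexical
filehandle.  This sketch assumes a Unix-like system with an C<ls>
command; the path is deliberately one that is unlikely to exist:

    use strict;
    use warnings;

    open(my $ls, "-|", "ls", "/nonexistent")   # hypothetical command
        or die "Can't start ls: $!";
    my @output = <$ls>;

    unless (close $ls) {
        if ($!) {
            warn "Error closing ls pipe: $!\n";
        }
        else {
            warn "ls exited with status ", $? >> 8, "\n";
        }
    }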

=item closedir DIRHANDLE
X<closedir>

=for Pod::Functions close directory handle

Closes a directory opened by L<C<opendir>|/opendir DIRHANDLE,EXPR> and
returns the success of that system call.
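
A typical sequence, sketched here for the current directory:

    opendir(my $dh, ".") or die "Can't opendir '.': $!";
    my @entries = grep { !/^\.\.?\z/ } readdir $dh;   # skip . and ..
    closedir($dh) or warn "Can't closedir: $!";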

=item connect SOCKET,NAME
X<connect>

=for Pod::Functions connect to a remote socket

Attempts to connect to a remote socket, just like L<connect(2)>.
Returns true if it succeeded, false otherwise.  NAME should be a
packed address of the appropriate type for the socket.  See the examples in
L<perlipc/"Sockets: Client/Server Communication">.
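
As a minimal sketch of a TCP client using the L<Socket> module, with a
hypothetical host and port:

    use strict;
    use warnings;
    use Socket qw(PF_INET SOCK_STREAM inet_aton pack_sockaddr_in);

    my ($host, $port) = ("localhost", 80);   # hypothetical endpoint

    my $ip    = inet_aton($host) or die "Can't resolve $host\n";
    my $proto = getprotobyname("tcp");
    socket(my $sock, PF_INET, SOCK_STREAM, $proto)
        or die "socket: $!";
    connect($sock, pack_sockaddr_in($port, $ip))   # NAME is packed
        or die "connect to $host:$port: $!";
    close $sock;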

=item continue BLOCK
X<continue>

=item continue

=for Pod::Functions optional trailing block in a while or foreach

When followed by a BLOCK, L<C<continue>|/continue BLOCK> is actually a
flow control statement rather than a function.  If there is a
L<C<continue>|/continue BLOCK> BLOCK attached to a BLOCK (typically in a
C<while> or C<foreach>), it is always executed just before the
conditional is about to be evaluated again, just like the third part of
a C<for> loop in C.  Thus it can be used to increment a loop variable,
even when the loop has been continued via the L<C<next>|/next LABEL>
statement (which is similar to the C L<C<continue>|/continue BLOCK>
statement).

L<C<last>|/last LABEL>, L<C<next>|/next LABEL>, or
L<C<redo>|/redo LABEL> may appear within a
L<C<continue>|/continue BLOCK> block; L<C<last>|/last LABEL> and
L<C<redo>|/redo LABEL> behave as if they had been executed within the
main block.  So will L<C<next>|/next LABEL>, but since it will execute a
L<C<continue>|/continue BLOCK> block, it may be more entertaining.

    while (EXPR) {
        ### redo always comes here
        do_something;
    } continue {
        ### next always comes here
        do_something_else;
        # then back to the top to re-check EXPR
    }
    ### last always comes here

Omitting the L<C<continue>|/continue BLOCK> section is equivalent to
using an empty one, logically enough, so L<C<next>|/next LABEL> goes
directly back to check the condition at the top of the loop.
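
A concrete sketch: the counter is incremented in the
L<C<continue>|/continue BLOCK> block, so the loop still advances when
L<C<next>|/next LABEL> skips the rest of the body.

    use strict;
    use warnings;

    my $i = 0;
    my @odd;
    while ($i < 6) {
        next if $i % 2 == 0;    # skip even numbers
        push @odd, $i;
    } continue {
        $i++;                   # runs even after next
    }
    print "@odd\n";             # prints "1 3 5"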

When there is no BLOCK, L<C<continue>|/continue BLOCK> is a function
that falls through the current C<when> or C<default> block instead of
iterating a dynamically enclosing C<foreach> or exiting a lexically
enclosing C<given>.  In Perl 5.14 and earlier, this form of
L<C<continue>|/continue BLOCK> was only available when the
L<C<"switch"> feature|feature/The 'switch' feature> was enabled.  See
L<feature> and L<perlsyn/"Switch Statements"> for more information.

=item cos EXPR
X<cos> X<cosine> X<acos> X<arccosine>

=item cos

=for Pod::Functions cosine function

Returns the cosine of EXPR (expressed in radians).  If EXPR is omitted,
takes the cosine of L<C<$_>|perlvar/$_>.

For the inverse cosine operation, you may use the
L<C<Math::Trig::acos>|Math::Trig> function, or use this relation:

    sub acos { atan2( sqrt(1 - $_[0] * $_[0]), $_[0] ) }

=item crypt PLAINTEXT,SALT
X<crypt> X<digest> X<hash> X<salt> X<plaintext> X<password>
X<decrypt> X<cryptography> X<passwd> X<encrypt>

=for Pod::Functions one-way passwd-style encryption

Creates a digest string exactly like the L<crypt(3)> function in the C
library (assuming that you actually have a version there that has not
been extirpated as a potential munition).

L<C<crypt>|/crypt PLAINTEXT,SALT> is a one-way hash function.  The
PLAINTEXT and SALT are turned
into a short string, called a digest, which is returned.  The same
PLAINTEXT and SALT will always return the same string, but there is no
(known) way to get the original PLAINTEXT from the hash.  Small
changes in the PLAINTEXT or SALT will result in large changes in the
digest.

There is no decrypt function.  This function isn't all that useful for
cryptography (for that, look for F<Crypt> modules on your nearby CPAN
mirror) and the name "crypt" is a bit of a misnomer.  Instead it is
primarily used to check if two pieces of text are the same without
having to transmit or store the text itself.  An example is checking
if a correct password is given.  The digest of the password is stored,
not the password itself.  The user types in a password that is
L<C<crypt>|/crypt PLAINTEXT,SALT>'d with the same salt as the stored
digest.  If the two digests match, the password is correct.

When verifying an existing digest string you should use the digest as
the salt (like C<crypt($plain, $digest) eq $digest>).  The SALT used
to create the digest is visible as part of the digest.  This ensures
L<C<crypt>|/crypt PLAINTEXT,SALT> will hash the new string with the same
salt as the digest.  This allows your code to work with the standard
L<C<crypt>|/crypt PLAINTEXT,SALT> and with more exotic implementations.
In other words, assume nothing about the returned string itself nor
about how many bytes of SALT may matter.

Traditionally the result is a string of 13 bytes: the first two bytes of
the salt, followed by 11 bytes from the set C<[./0-9A-Za-z]>, and only
the first eight bytes of PLAINTEXT mattered.  But alternative
hashing schemes (like MD5), higher level security schemes (like C2),
and implementations on non-Unix platforms may produce different
strings.

When choosing a new salt, create a random two-character string whose
characters come from the set C<[./0-9A-Za-z]> (like C<join '', ('.',
'/', 0..9, 'A'..'Z', 'a'..'z')[rand 64, rand 64]>).  This set of
characters is just a recommendation; the characters allowed in
the salt depend solely on your system's crypt library, and Perl can't
restrict what salts L<C<crypt>|/crypt PLAINTEXT,SALT> accepts.
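
Putting salt generation and verification together, a minimal sketch
might look like this.  It assumes the system's crypt library accepts
traditional two-character salts, which not all do, and
L<C<rand>|/rand EXPR> is not a cryptographically strong source of
randomness:

    use strict;
    use warnings;

    my @salt_chars = ('.', '/', 0 .. 9, 'A' .. 'Z', 'a' .. 'z');
    my $salt = join '', map { $salt_chars[rand @salt_chars] } 1 .. 2;

    my $digest = crypt("secret", $salt);   # store this, not "secret"
    defined $digest
        or die "crypt rejected the salt on this system\n";

    # Later: verify a candidate password, using the digest as the salt.
    my $candidate = "secret";
    print crypt($candidate, $digest) eq $digest
        ? "match\n" : "no match\n";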

Here's an example that makes sure that whoever runs this program knows
their password:

    my $pwd = (getpwuid($<))[1];

    system "stty -echo";
    print "Password: ";
    chomp(my $word = <STDIN>);
    print "\n";
    system "stty echo";

    if (crypt($word, $pwd) ne $pwd) {
        die "Sorry...\n";
    } else {
        print "ok\n";
    }

Of course, typing in your own password to whoever asks you
for it is unwise.

The L<C<crypt>|/crypt PLAINTEXT,SALT> function is unsuitable for hashing
large quantities of data, not least of all because you can't get the
information back.  Look at the L<Digest> module for more robust
algorithms.

If using L<C<crypt>|/crypt PLAINTEXT,SALT> on a Unicode string (which
I<potentially> has characters with codepoints above 255), Perl tries to
make sense of the situation by trying to downgrade (a copy of) the
string back to an eight-bit byte string before calling
L<C<crypt>|/crypt PLAINTEXT,SALT> (on that copy).  If that works, good.
If not, L<C<crypt>|/crypt PLAINTEXT,SALT> dies with
L<C<Wide character in crypt>|perldiag/Wide character in %s>.

Portability issues: L<perlport/crypt>.

=item dbmclose HASH
X<dbmclose>

=for Pod::Functions breaks binding on a tied dbm file

[This function has been largely superseded by the
L<C<untie>|/untie VARIABLE> function.]

Breaks the binding between a DBM file and a hash.

Portability issues: L<perlport/dbmclose>.

=item dbmopen HASH,DBNAME,MASK
X<dbmopen> X<dbm> X<ndbm> X<sdbm> X<gdbm>

=for Pod::Functions create binding on a tied dbm file

[This function has been largely superseded by the
L<C<tie>|/tie VARIABLE,CLASSNAME,LIST> function.]

This binds a L<dbm(3)>, L<ndbm(3)>, L<sdbm(3)>, L<gdbm(3)>, or Berkeley
DB file to a hash.  HASH is the name of the hash.  (Unlike normal
L<C<open>|/open FILEHANDLE,EXPR>, the first argument is I<not> a
filehandle, even though it looks like one).  DBNAME is the name of the
database (without the F<.dir> or F<.pag> extension if any).  If the
database does not exist, it is created with protection specified by MASK
(as modified by the L<C<umask>|/umask EXPR>).  To prevent creation of
the database if it doesn't exist, you may specify a MASK of 0, and the
function will return a false value if it can't find an existing
database.  If your system supports only the older DBM functions, you may
make only one L<C<dbmopen>|/dbmopen HASH,DBNAME,MASK> call in your
program.  In older versions of Perl, if your system had neither DBM nor
ndbm, calling L<C<dbmopen>|/dbmopen HASH,DBNAME,MASK> produced a fatal
error; it now falls back to L<sdbm(3)>.

If you don't have write access to the DBM file, you can only read hash
variables, not set them.  If you want to test whether you can write,
either use file tests or try setting a dummy hash entry inside an
L<C<eval>|/eval EXPR> to trap the error.

Note that functions such as L<C<keys>|/keys HASH> and
L<C<values>|/values HASH> may return huge lists when used on large DBM
files.  You may prefer to use the L<C<each>|/each HASH> function to
iterate over large DBM files.  Example:

    # print out history file offsets
    dbmopen(%HIST,'/usr/lib/news/history',0666);
    while (($key,$val) = each %HIST) {
        print $key, ' = ', unpack('L',$val), "\n";
    }
    dbmclose(%HIST);

See also L<AnyDBM_File> for a more general description of the pros and
cons of the various dbm approaches, as well as L<DB_File> for a particularly
rich implementation.

You can control which DBM library you use by loading that library
before you call L<C<dbmopen>|/dbmopen HASH,DBNAME,MASK>:

    use DB_File;
    dbmopen(%NS_Hist, "$ENV{HOME}/.netscape/history.db", 0666)
        or die "Can't open netscape history file: $!";
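
Since L<C<dbmopen>|/dbmopen HASH,DBNAME,MASK> has largely been
superseded by L<C<tie>|/tie VARIABLE,CLASSNAME,LIST>, the same kind of
binding can be sketched with C<tie> and a specific DBM module.  This
assumes L<SDBM_File> is available and uses a hypothetical database path:

    use strict;
    use warnings;
    use Fcntl qw(O_RDWR O_CREAT);
    use SDBM_File;

    my $db = "/tmp/example_db";    # hypothetical path
    tie(my %h, 'SDBM_File', $db, O_RDWR|O_CREAT, 0666)
        or die "Can't tie $db: $!";
    $h{visits} = 1;                # stored in the DBM file
    untie %h;                      # breaks the binding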

Portability issues: L<perlport/dbmopen>.

=item defined EXPR
X<defined> X<undef> X<undefined>

=item defined

=for Pod::Functions test whether a value, variable, or function is defined

Returns a Boolean value telling whether EXPR has a value other than the
undefined value L<C<undef>|/undef EXPR>.  If EXPR is not present,
L<C<$_>|perlvar/$_> is checked.

Many operations return L<C<undef>|/undef EXPR> to indicate failure, end
of file, system error, uninitialized variable, and other exceptional
conditions.  This function allows you to distinguish
L<C<undef>|/undef EXPR> from other values.  (A simple Boolean test will
not distinguish among L<C<undef>|/undef EXPR>, zero, the empty string,
and C<"0">, which are all equally false.)  Note that since
L<C<undef>|/undef EXPR> is a valid scalar, its presence doesn't
I<necessarily> indicate an exceptional condition: L<C<pop>|/pop ARRAY>
returns L<C<undef>|/undef EXPR> when its argument is an empty array,
I<or> when the element to return happens to be L<C<undef>|/undef EXPR>.

You may also use C<defined(&func)> to check whether subroutine C<func>
has ever been defined.  The return value is unaffected by any forward
declarations of C<func>.  A subroutine that is not defined
may still be callable: its package may have an C<AUTOLOAD> method that
makes it spring into existence the first time that it is called; see
L<perlsub>.

Use of L<C<defined>|/defined EXPR> on aggregates (hashes and arrays) is
no longer supported. It used to report whether memory for that
aggregate had ever been allocated.  You should instead use a simple
test for size:

    if (@an_array) { print "has array elements\n" }
    if (%a_hash)   { print "has hash members\n"   }

When used on a hash element, it tells you whether the value is defined,
not whether the key exists in the hash.  Use L<C<exists>|/exists EXPR>
for the latter purpose.
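
The distinction can be sketched with a hash that holds an undefined
value (the keys are illustrative):

    my %color = (sky => "blue", glass => undef);

    print "glass exists\n"     if exists  $color{glass};  # printed
    print "glass is defined\n" if defined $color{glass};  # not printed
    print "grass exists\n"     if exists  $color{grass};  # not printed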

Examples:

    print if defined $switch{D};
    print "$val\n" while defined($val = pop(@ary));
    die "Can't readlink $sym: $!"
        unless defined($value = readlink $sym);
    sub foo { defined &$bar ? $bar->(@_) : die "No bar"; }
    $debugging = 0 unless defined $debugging;

Note:  Many folks tend to overuse L<C<defined>|/defined EXPR> and are
then surprised to discover that the number C<0> and C<""> (the
zero-length string) are, in fact, defined values.  For example, if you
say

    "ab" =~ /a(.*)b/;

The pattern match succeeds and C<$1> is defined, although it
matched "nothing".  It didn't really fail to match anything.  Rather, it
matched something that happened to be zero characters long.  This is all
very above-board and honest.  When a function returns an undefined value,
it's an admission that it couldn't give you an honest answer.  So you
should use L<C<defined>|/defined EXPR> only when questioning the
integrity of what you're trying to do.  At other times, a simple
comparison to C<0> or C<""> is what you want.

See also L<C<undef>|/undef EXPR>, L<C<exists>|/exists EXPR>,
L<C<ref>|/ref EXPR>.

=item delete EXPR
X<delete>

=for Pod::Functions deletes a value from a hash

Given an expression that specifies an element or slice of a hash,
L<C<delete>|/delete EXPR> deletes the specified elements from that hash
so that L<C<exists>|/exists EXPR> on that element no longer returns
true.  Setting a hash element to the undefined value does not remove its
key, but deleting it does; see L<C<exists>|/exists EXPR>.

In list context, returns the value or values deleted, or the last such
element in scalar context.  The return list's length always matches that of
the argument list: deleting non-existent elements returns the undefined value
in their corresponding positions.

L<C<delete>|/delete EXPR> may also be used on arrays and array slices,
but its behavior is less straightforward.  Although
L<C<exists>|/exists EXPR> will return false for deleted entries,
deleting array elements never changes indices of existing values; use
L<C<shift>|/shift ARRAY> or L<C<splice>|/splice
ARRAY,OFFSET,LENGTH,LIST> for that.  However, if any deleted elements
fall at the end of an array, the array's size shrinks to the position of
the highest element that still tests true for L<C<exists>|/exists EXPR>,
or to 0 if none do.  In other words, an array won't have trailing
nonexistent elements after a delete.

B<WARNING:> Calling L<C<delete>|/delete EXPR> on array values is
strongly discouraged.  The
notion of deleting or checking the existence of Perl array elements is not
conceptually coherent, and can lead to surprising behavior.

Deleting from L<C<%ENV>|perlvar/%ENV> modifies the environment.
Deleting from a hash tied to a DBM file deletes the entry from the DBM
file.  Deleting from a L<C<tied>|/tied VARIABLE> hash or array may not
necessarily return anything; it depends on the implementation of the
L<C<tied>|/tied VARIABLE> package's DELETE method, which may do whatever
it pleases.

The C<delete local EXPR> construct localizes the deletion to the current
block at run time.  Until the block exits, elements locally deleted
temporarily no longer exist.  See L<perlsub/"Localized deletion of elements
of composite types">.

    my %hash = (foo => 11, bar => 22, baz => 33);
    my $scalar = delete $hash{foo};         # $scalar is 11
    $scalar    = delete @hash{qw(foo bar)}; # $scalar is 22
    my @array  = delete @hash{qw(foo baz)}; # @array  is (undef,33)
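
The C<delete local> construct described above can be sketched like
this.  It requires Perl 5.12 or later, and the hash and subroutine names
are illustrative:

    use strict;
    use warnings;

    our %config = (debug => 1, verbose => 1);
    sub active { join ",", sort keys %config }

    {
        delete local $config{debug};
        print active(), "\n";    # prints "verbose"
    }
    print active(), "\n";        # prints "debug,verbose"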

The following (inefficiently) deletes all the values of %HASH and @ARRAY:

    foreach my $key (keys %HASH) {
        delete $HASH{$key};
    }

    foreach my $index (0 .. $#ARRAY) {
        delete $ARRAY[$index];
    }

And so do these:

    delete @HASH{keys %HASH};

    delete @ARRAY[0 .. $#ARRAY];

But both are slower than assigning the empty list
or undefining %HASH or @ARRAY, which is the customary
way to empty out an aggregate:

    %HASH = ();     # completely empty %HASH
    undef %HASH;    # forget %HASH ever existed

    @ARRAY = ();    # completely empty @ARRAY
    undef @ARRAY;   # forget @ARRAY ever existed

The EXPR can be arbitrarily complicated provided its
final operation is an element or slice of an aggregate:

    delete $ref->[$x][$y]{$key};
    delete @{$ref->[$x][$y]}{$key1, $key2, @morekeys};

    delete $ref->[$x][$y][$index];
    delete @{$ref->[$x][$y]}[$index1, $index2, @moreindices];

=item die LIST
X<die> X<throw> X<exception> X<raise> X<$@> X<abort>

=for Pod::Functions raise an exception or bail out

L<C<die>|/die LIST> raises an exception.  Inside an
L<C<eval>|/eval EXPR> the error message is stuffed into
L<C<$@>|perlvar/$@> and the L<C<eval>|/eval EXPR> is terminated with the
undefined value.  If the exception is outside of all enclosing
L<C<eval>|/eval EXPR>s, then the uncaught exception prints LIST to
C<STDERR> and exits with a non-zero value.  If you need to exit the
process with a specific exit code, see L<C<exit>|/exit EXPR>.
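
A minimal sketch of the first case, where the exception is trapped by an
L<C<eval>|/eval EXPR>:

    use strict;
    use warnings;

    my $result = eval {
        die "Something went wrong\n";
        1;                                 # not reached
    };
    print "eval returned undef\n" unless defined $result;
    print "Error was: $@";                 # message is kept in $@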

Equivalent examples:

    die "Can't cd to spool: $!\n" unless chdir '/usr/spool/news';
    chdir '/usr/spool/news' or die "Can't cd to spool: $!\n";

If the last element of LIST does not end in a newline, the current
script line number and input line number (if any) are also printed,
and a newline is supplied.  Note that the "input line number" (also
known as "chunk") is subject to whatever notion of "line" happens to
be currently in effect, and is also available as the special variable
L<C<$.>|perlvar/$.>.  See L<perlvar/"$/"> and L<perlvar/"$.">.

Hint: sometimes appending C<", stopped"> to your message will cause it
to make better sense when the string C<"at foo line 123"> is appended.
Suppose you are running script "canasta".

    die "/etc/games is no good";
    die "/etc/games is no good, stopped";

produce, respectively

    /etc/games is no good at canasta line 123.
    /etc/games is no good, stopped at canasta line 123.

If the output is empty and L<C<$@>|perlvar/$@> already contains a value
(typically from a previous L<C<eval>|/eval EXPR>) that value is reused after
appending C<"\t...propagated">.  This is useful for propagating exceptions:

    eval { ... };
    die unless $@ =~ /Expected exception/;

If the output is empty and L<C<$@>|perlvar/$@> contains an object
reference that has a C<PROPAGATE> method, that method will be called
with additional file and line number parameters.  The return value
replaces the value in L<C<$@>|perlvar/$@>;  i.e., as if
C<< $@ = eval { $@->PROPAGATE(__FILE__, __LINE__) }; >> were called.

If L<C<$@>|perlvar/$@> is empty, then the string C<"Died"> is used.

If an uncaught exception results in interpreter exit, the exit code is
determined from the values of L<C<$!>|perlvar/$!> and
L<C<$?>|perlvar/$?> with this pseudocode:

    exit $! if $!;              # errno
    exit $? >> 8 if $? >> 8;    # child exit status
    exit 255;                   # last resort

As with L<C<exit>|/exit EXPR>, L<C<$?>|perlvar/$?> is set prior to
unwinding the call stack; any C<DESTROY> or C<END> handlers can then
alter this value, and thus Perl's exit code.

The intent is to squeeze as much information as possible about the likely
cause into the limited space of the system exit code.  However, as
L<C<$!>|perlvar/$!> is the value of C's C<errno>, which can be set by
any system call, this means that the value of the exit code used by
L<C<die>|/die LIST> can be non-predictable, so should not be relied
upon, other than to be non-zero.

You can also call L<C<die>|/die LIST> with a reference argument, and if
this is trapped within an L<C<eval>|/eval EXPR>, L<C<$@>|perlvar/$@>
contains that reference.  This permits more elaborate exception handling
using objects that maintain arbitrary state about the exception.  Such a
scheme is sometimes preferable to matching particular string values of
L<C<$@>|perlvar/$@> with regular expressions.  Because
L<C<$@>|perlvar/$@> is a global variable and L<C<eval>|/eval EXPR> may
be used within object implementations, be careful that analyzing the
error object doesn't replace the reference in the global variable.  It's
easiest to make a local copy of the reference before any manipulations.
Here's an example:

    use Scalar::Util "blessed";

    eval { ... ; die Some::Module::Exception->new( FOO => "bar" ) };
    if (my $ev_err = $@) {
        if (blessed($ev_err)
            && $ev_err->isa("Some::Module::Exception")) {
            # handle Some::Module::Exception
        }
        else {
            # handle all other possible exceptions
        }
    }

Because Perl stringifies uncaught exception messages before display,
you'll probably want to overload stringification operations on
exception objects.  See L<overload> for details about that.

You can arrange for a callback to be run just before the
L<C<die>|/die LIST> does its deed, by setting the
L<C<$SIG{__DIE__}>|perlvar/%SIG> hook.  The associated handler is called
with the error text and can change the error message, if it sees fit, by
calling L<C<die>|/die LIST> again.  See L<perlvar/%SIG> for details on
setting L<C<%SIG>|perlvar/%SIG> entries, and L<C<eval>|/eval EXPR> for some
examples.  Although this feature was to be run only right before your
program was to exit, this is not currently so: the
L<C<$SIG{__DIE__}>|perlvar/%SIG> hook is currently called even inside
L<C<eval>|/eval EXPR>ed blocks/strings!  If one wants the hook to do
nothing in such situations, put

    die @_ if $^S;

as the first line of the handler (see L<perlvar/$^S>).  Because
this promotes strange action at a distance, this counterintuitive
behavior may be fixed in a future release.
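
A minimal sketch of such a handler, which decorates only exceptions
raised outside any L<C<eval>|/eval EXPR>.  The C<[fatal]> prefix is just
an illustration:

    use strict;
    use warnings;

    local $SIG{__DIE__} = sub {
        die @_ if $^S;              # inside an eval: pass through
        die "[fatal] ", @_;         # at top level: decorate the message
    };

    eval { die "recoverable\n" };
    print "caught: $@";             # prints "caught: recoverable"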

See also L<C<exit>|/exit EXPR>, L<C<warn>|/warn LIST>, and the L<Carp>
module.

=item do BLOCK
X<do> X<block>

=for Pod::Functions turn a BLOCK into a TERM

Not really a function.  Returns the value of the last command in the
sequence of commands indicated by BLOCK.  When modified by the C<while> or
C<until> loop modifier, executes the BLOCK once before testing the loop
condition.  (On other statements the loop modifiers test the conditional
first.)

C<do BLOCK> does I<not> count as a loop, so the loop control statements
L<C<next>|/next LABEL>, L<C<last>|/last LABEL>, or
L<C<redo>|/redo LABEL> cannot be used to leave or restart the block.
See L<perlsyn> for alternative strategies.
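
One such strategy, described in L<perlsyn>, is to double the braces so
that C<next> leaves an inner bare block.  A small sketch, with arbitrary
loop bounds:

    use strict;
    use warnings;

    my $i = 0;
    do {{
        next if $i == 2;    # leaves only the inner bare block
        print "$i\n";
    }} until ++$i >= 4;     # the loop condition is still evaluated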

=item do EXPR
X<do>

Uses the value of EXPR as a filename and executes the contents of the
file as a Perl script:

    # load the exact specified file (./ and ../ special-cased)
    do '/foo/stat.pl';
    do './stat.pl';
    do '../foo/stat.pl';

    # search for the named file within @INC
    do 'stat.pl';
    do 'foo/stat.pl';

C<do './stat.pl'> is largely like

    eval `cat stat.pl`;

except that it's more concise, runs no external processes, and keeps
track of the current filename for error messages. It also differs in that
code evaluated with C<do FILE> cannot see lexicals in the enclosing
scope; C<eval STRING> does.  It's the same, however, in that it does
reparse the file every time you call it, so you probably don't want
to do this inside a loop.

Using C<do> with a relative path (except for F<./> and F<../>), like

    do 'foo/stat.pl';

will search the L<C<@INC>|perlvar/@INC> directories, and update
L<C<%INC>|perlvar/%INC> if the file is found.  See L<perlvar/@INC>
and L<perlvar/%INC> for these variables. In particular, note that
whilst historically L<C<@INC>|perlvar/@INC> contained '.' (the
current directory) making these two cases equivalent, that is no
longer necessarily the case, as '.' is not included in C<@INC> by default
in perl versions 5.26.0 onwards. Instead, perl will now warn:

    do "stat.pl" failed, '.' is no longer in @INC;
    did you mean do "./stat.pl"?

If L<C<do>|/do EXPR> can read the file but cannot compile it, it
returns L<C<undef>|/undef EXPR> and sets an error message in
L<C<$@>|perlvar/$@>.  If L<C<do>|/do EXPR> cannot read the file, it
returns undef and sets L<C<$!>|perlvar/$!> to the error.  Always check
L<C<$@>|perlvar/$@> first, as compilation could fail in a way that also
sets L<C<$!>|perlvar/$!>.  If the file is successfully compiled,
L<C<do>|/do EXPR> returns the value of the last expression evaluated.

Inclusion of library modules is better done with the
L<C<use>|/use Module VERSION LIST> and L<C<require>|/require VERSION>
operators, which also do automatic error checking and raise an exception
if there's a problem.

You might like to use L<C<do>|/do EXPR> to read in a program
configuration file.  Manual error checking can be done this way:

    # Read in config files: system first, then user.
    # Beware of using relative pathnames here.
    for my $file ("/share/prog/defaults.rc",
               "$ENV{HOME}/.someprogrc")
    {
        unless (my $return = do $file) {
            warn "couldn't parse $file: $@" if $@;
            warn "couldn't do $file: $!"    unless defined $return;
            warn "couldn't run $file"       unless $return;
        }
    }

=item dump LABEL
X<dump> X<core> X<undump>

=item dump EXPR

=item dump

=for Pod::Functions create an immediate core dump

This function causes an immediate core dump.  See also the B<-u>
command-line switch in L<perlrun>, which does the same thing.
Primarily this is so that you can use the B<undump> program (not
supplied) to turn your core dump into an executable binary after
having initialized all your variables at the beginning of the
program.  When the new binary is executed it will begin by executing
a C<goto LABEL> (with all the restrictions that L<C<goto>|/goto LABEL>
suffers).
Think of it as a goto with an intervening core dump and reincarnation.
If C<LABEL> is omitted, restarts the program from the top.  The
C<dump EXPR> form, available starting in Perl 5.18.0, allows a name to be
computed at run time, being otherwise identical to C<dump LABEL>.

B<WARNING>: Any files opened at the time of the dump will I<not>
be open any more when the program is reincarnated, with possible
resulting confusion by Perl.

This function is now largely obsolete, mostly because it's very hard to
convert a core file into an executable.  That's why you should now invoke
it as C<CORE::dump()> if you don't want to be warned against a possible
typo.

Unlike most named operators, this has the same precedence as assignment.
It is also exempt from the looks-like-a-function rule, so
C<dump ("foo")."bar"> will cause "bar" to be part of the argument to
L<C<dump>|/dump LABEL>.

Portability issues: L<perlport/dump>.

=item each HASH
X<each> X<hash, iterator>

=item each ARRAY
X<array, iterator>

=for Pod::Functions retrieve the next key/value pair from a hash

When called on a hash in list context, returns a 2-element list
consisting of the key and value for the next element of a hash.  In Perl
5.12 and later only, it will also return the index and value for the next
element of an array so that you can iterate over it; older Perls consider
this a syntax error.  When called in scalar context, returns only the key
(not the value) in a hash, or the index in an array.
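
For instance, assuming a Perl of at least 5.12, an array can be walked
together with its indices like this (the array contents are made up for
illustration):

    use 5.012;
    use warnings;

    my @colors = qw(red green blue);
    while (my ($index, $value) = each @colors) {
        print "$index: $value\n";
    }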

Hash entries are returned in an apparently random order.  The actual random
order is specific to a given hash; the exact same series of operations
on two hashes may result in a different order for each hash.  Any insertion
into the hash may change the order, as will any deletion, with the exception
that the most recent key returned by L<C<each>|/each HASH> or
L<C<keys>|/keys HASH> may be deleted without changing the order.  So
long as a given hash is unmodified you may rely on
L<C<keys>|/keys HASH>, L<C<values>|/values HASH> and
L<C<each>|/each HASH> to repeatedly return the same order
as each other.  See L<perlsec/"Algorithmic Complexity Attacks"> for
details on why hash order is randomized.  Aside from the guarantees
provided here the exact details of Perl's hash algorithm and the hash
traversal order are subject to change in any release of Perl.

After L<C<each>|/each HASH> has returned all entries from the hash or
array, the next call to L<C<each>|/each HASH> returns the empty list in
list context and L<C<undef>|/undef EXPR> in scalar context; the next
call following I<that> one restarts iteration.  Each hash or array has
its own internal iterator, accessed by L<C<each>|/each HASH>,
L<C<keys>|/keys HASH>, and L<C<values>|/values HASH>.  The iterator is
implicitly reset when L<C<each>|/each HASH> has reached the end as just
described; it can be explicitly reset by calling L<C<keys>|/keys HASH>
or L<C<values>|/values HASH> on the hash or array.  If you add or delete
a hash's elements while iterating over it, the effect on the iterator is
unspecified; for example, entries may be skipped or duplicated--so don't
do that.  Exception: It is always safe to delete the item most recently
returned by L<C<each>|/each HASH>, so the following code works properly:

    while (my ($key, $value) = each %hash) {
        print $key, "\n";
        delete $hash{$key};   # This is safe
    }

Tied hashes may have a different ordering behaviour to perl's hash
implementation.
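
As a small illustration of the iterator reset described above, the
following sketch advances the iterator by one entry, resets it with
L<C<keys>|/keys HASH>, and then counts a complete traversal.  The hash
contents are arbitrary:

    use strict;
    use warnings;

    my %h = (a => 1, b => 2, c => 3);

    my ($first) = each %h;    # advance the iterator by one entry
    keys %h;                  # keys in void context resets the iterator

    my $count = 0;
    while (my ($k) = each %h) {
        $count++;             # visits every entry, not just the rest
    }
    print "$count\n";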

This prints out your environment like the L<printenv(1)> program,
but in a different order:

    while (my ($key,$value) = each %ENV) {
        print "$key=$value\n";
    }

Starting with Perl 5.14, an experimental feature allowed
L<C<each>|/each HASH> to take a scalar expression. This experiment has
been deemed unsuccessful, and was removed as of Perl 5.24.

As of Perl 5.18 you can use a bare L<C<each>|/each HASH> in a C<while>
loop, which will set L<C<$_>|perlvar/$_> on every iteration.

    while (each %ENV) {
        print "$_=$ENV{$_}\n";
    }

To avoid confusing would-be users of your code who are running earlier
versions of Perl with mysterious syntax errors, put this sort of thing at
the top of your file to signal that your code will work I<only> on Perls of
a recent vintage:

    use 5.012;	# so keys/values/each work on arrays
    use 5.018;	# so each assigns to $_ in a lone while test

See also L<C<keys>|/keys HASH>, L<C<values>|/values HASH>, and
L<C<sort>|/sort SUBNAME LIST>.

=item eof FILEHANDLE
X<eof>
X<end of file>
X<end-of-file>

=item eof ()

=item eof

=for Pod::Functions test a filehandle for its end

Returns 1 if the next read on FILEHANDLE will return end of file I<or> if
FILEHANDLE is not open.  FILEHANDLE may be an expression whose value
gives the real filehandle.  (Note that this function actually
reads a character and then C<ungetc>s it, so isn't useful in an
interactive context.)  Do not read from a terminal file (or call
C<eof(FILEHANDLE)> on it) after end-of-file is reached.  File types such
as terminals may lose the end-of-file condition if you do.

An L<C<eof>|/eof FILEHANDLE> without an argument uses the last file
read.  Using L<C<eof()>|/eof FILEHANDLE> with empty parentheses is
different.  It refers to the pseudo file formed from the files listed on
the command line and accessed via the C<< <> >> operator.  Since
C<< <> >> isn't explicitly opened, as a normal filehandle is, an
L<C<eof()>|/eof FILEHANDLE> before C<< <> >> has been used will cause
L<C<@ARGV>|perlvar/@ARGV> to be examined to determine if input is
available.  Similarly, an L<C<eof()>|/eof FILEHANDLE> after C<< <> >>
has returned end-of-file will assume you are processing another
L<C<@ARGV>|perlvar/@ARGV> list, and if you haven't set
L<C<@ARGV>|perlvar/@ARGV>, will read input from C<STDIN>; see
L<perlop/"I/O Operators">.

In a C<< while (<>) >> loop, L<C<eof>|/eof FILEHANDLE> or C<eof(ARGV)>
can be used to detect the end of each file, whereas
L<C<eof()>|/eof FILEHANDLE> will detect the end of the very last file
only.  Examples:

    # reset line numbering on each input file
    while (<>) {
        next if /^\s*#/;  # skip comments
        print "$.\t$_";
    } continue {
        close ARGV if eof;  # Not eof()!
    }

    # insert dashes just before last line of last file
    while (<>) {
        if (eof()) {  # check for end of last file
            print "--------------\n";
        }
        print;
        last if eof();     # needed if we're reading from a terminal
    }

Practical hint: you almost never need to use L<C<eof>|/eof FILEHANDLE>
in Perl, because the input operators typically return L<C<undef>|/undef
EXPR> when they run out of data or encounter an error.

=item eval EXPR
X<eval> X<try> X<catch> X<evaluate> X<parse> X<execute>
X<error, handling> X<exception, handling>

=item eval BLOCK

=item eval

=for Pod::Functions catch exceptions or compile and run code

C<eval> in all its forms is used to execute a little Perl program,
trapping any errors encountered so they don't crash the calling program.

Plain C<eval> with no argument is just C<eval EXPR>, where the
expression is understood to be contained in L<C<$_>|perlvar/$_>.  Thus
there are only two real C<eval> forms; the one with an EXPR is often
called "string eval".  In a string eval, the value of the expression
(which is itself determined within scalar context) is first parsed, and
if there were no errors, executed as a block within the lexical context
of the current Perl program.  This form is typically used to delay
parsing and subsequent execution of the text of EXPR until run time.
Note that the value is parsed every time the C<eval> executes.

The other form is called "block eval".  It is less general than string
eval, but the code within the BLOCK is parsed only once (at the same
time the code surrounding the C<eval> itself was parsed) and executed
within the context of the current Perl program.  This form is typically
used to trap exceptions more efficiently than the first, while also
providing the benefit of checking the code within BLOCK at compile time.
BLOCK is parsed and compiled just once.  Since errors are trapped, it
often is used to check if a given feature is available.

In both forms, the value returned is the value of the last expression
evaluated inside the mini-program; a return statement may also be used, just
as with subroutines.  The expression providing the return value is evaluated
in void, scalar, or list context, depending on the context of the
C<eval> itself.  See L<C<wantarray>|/wantarray> for more
on how the evaluation context can be determined.

If there is a syntax error or runtime error, or a L<C<die>|/die LIST>
statement is executed, C<eval> returns
L<C<undef>|/undef EXPR> in scalar context, or an empty list in list
context, and L<C<$@>|perlvar/$@> is set to the error message.  (Prior to
5.16, a bug caused L<C<undef>|/undef EXPR> to be returned in list
context for syntax errors, but not for runtime errors.) If there was no
error, L<C<$@>|perlvar/$@> is set to the empty string.  A control flow
operator like L<C<last>|/last LABEL> or L<C<goto>|/goto LABEL> can
bypass the setting of L<C<$@>|perlvar/$@>.  Beware that using
C<eval> neither silences Perl from printing warnings to
STDERR, nor does it stuff the text of warning messages into
L<C<$@>|perlvar/$@>.  To do either of those, you have to use the
L<C<$SIG{__WARN__}>|perlvar/%SIG> facility, or turn off warnings inside
the BLOCK or EXPR using S<C<no warnings 'all'>>.  See
L<C<warn>|/warn LIST>, L<perlvar>, and L<warnings>.
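
One way to collect warnings raised inside an C<eval>, sketched here with
an arbitrary warning text, is to localize the C<__WARN__> hook around
it:

    use strict;
    use warnings;

    my @warnings;
    my $result = do {
        # the hook is restored when this do block is left
        local $SIG{__WARN__} = sub { push @warnings, @_ };
        eval { warn "careful\n"; 42 };
    };
    print "result=$result, warnings=", scalar(@warnings), "\n";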

Note that, because C<eval> traps otherwise-fatal errors,
it is useful for determining whether a particular feature (such as
L<C<socket>|/socket SOCKET,DOMAIN,TYPE,PROTOCOL> or
L<C<symlink>|/symlink OLDFILE,NEWFILE>) is implemented.  It is also
Perl's exception-trapping mechanism, where the L<C<die>|/die LIST>
operator is used to raise exceptions.
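
A probe of this kind might look like the following sketch, which follows
the idiom shown under L<C<symlink>|/symlink OLDFILE,NEWFILE>; the
variable name is arbitrary:

    use strict;
    use warnings;

    # symlink raises an exception where it is unimplemented,
    # so the block yields 1 only on systems that support it
    my $symlink_exists = eval { symlink("", ""); 1 };
    print $symlink_exists ? "symlink is implemented\n"
                          : "symlink is not implemented\n";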

Before Perl 5.14, the assignment to L<C<$@>|perlvar/$@> occurred before
the restoration of localized variables, which means that for your code
to run on older versions, a temporary is required if you want to mask
some, but not all, errors:

 # alter $@ on nefarious repugnancy only
 {
    my $e;
    {
      local $@; # protect existing $@
      eval { test_repugnancy() };
      # $@ =~ /nefarious/ and die $@; # Perl 5.14 and higher only
      $@ =~ /nefarious/ and $e = $@;
    }
    die $e if defined $e
 }

There are some different considerations for each form:

=over 4

=item String eval

Since the return value of EXPR is executed as a block within the lexical
context of the current Perl program, any outer lexical variables are
visible to it, and any package variable settings or subroutine and
format definitions remain afterwards.

=over 4

=item Under the L<C<"unicode_eval"> feature|feature/The 'unicode_eval' and 'evalbytes' features>

If this feature is enabled (which is the default under a C<use 5.16> or
higher declaration), EXPR is considered to be
in the same encoding as the surrounding program.  Thus if
S<L<C<use utf8>|utf8>> is in effect, the string will be treated as being
UTF-8 encoded.  Otherwise, the string is considered to be a sequence of
independent bytes.  Bytes that correspond to ASCII-range code points
will have their normal meanings for operators in the string.  The
treatment of the other bytes depends on whether the
L<C<"unicode_strings"> feature|feature/The 'unicode_strings' feature> is
in effect.

In a plain C<eval> without an EXPR argument, being in S<C<use utf8>> or
not is irrelevant; the UTF-8ness of C<$_> itself determines the
behavior.

Any S<C<use utf8>> or S<C<no utf8>> declarations within the string have
no effect, and source filters are forbidden.  (C<unicode_strings>,
however, can appear within the string.)  See also the
L<C<evalbytes>|/evalbytes EXPR> operator, which works properly with
source filters.

Variables defined outside the C<eval> and used inside it retain their
original UTF-8ness.  Everything inside the string follows the normal
rules for a Perl program with the given state of S<C<use utf8>>.

=item Outside the C<"unicode_eval"> feature

In this case, the behavior is problematic and is not so easily
described.  Here are two bugs that cannot easily be fixed without
breaking existing programs:

=over 4

=item *

It can lose track of whether something should be encoded as UTF-8 or
not.

=item *

Source filters activated within C<eval> leak out into whichever file
scope is currently being compiled.  To give an example with the CPAN module
L<Semi::Semicolons>:

 BEGIN { eval "use Semi::Semicolons; # not filtered" }
 # filtered here!

L<C<evalbytes>|/evalbytes EXPR> fixes that to work the way one would
expect:

 use feature "evalbytes";
 BEGIN { evalbytes "use Semi::Semicolons; # filtered" }
 # not filtered

=back

=back

Problems can arise if the string expands a scalar containing a floating
point number.  That scalar can expand to letters, such as C<"NaN"> or
C<"Infinity">; or, within the scope of a L<C<use locale>|locale>, the
decimal point character may be something other than a dot (such as a
comma).  None of these are likely to parse as you are likely expecting.

You should be especially careful to remember what's being looked at
when:

    eval $x;        # CASE 1
    eval "$x";      # CASE 2

    eval '$x';      # CASE 3
    eval { $x };    # CASE 4

    eval "\$$x++";  # CASE 5
    $$x++;          # CASE 6

Cases 1 and 2 above behave identically: they run the code contained in
the variable $x.  (Although case 2 has misleading double quotes making
the reader wonder what else might be happening (nothing is).)  Cases 3
and 4 likewise behave in the same way: they run the code C<'$x'>, which
does nothing but return the value of $x.  (Case 4 is preferred for
purely visual reasons, but it also has the advantage of compiling at
compile-time instead of at run-time.)  Case 5 is a place where
normally you I<would> like to use double quotes, except that in this
particular situation, you can just use symbolic references instead, as
in case 6.

An C<eval ''> executed within a subroutine defined
in the C<DB> package doesn't see the usual
surrounding lexical scope, but rather the scope of the first non-DB piece
of code that called it.  You don't normally need to worry about this unless
you are writing a Perl debugger.

The final semicolon, if any, may be omitted from the value of EXPR.

=item Block eval

If the code to be executed doesn't vary, you may use the eval-BLOCK
form to trap run-time errors without incurring the penalty of
recompiling each time.  The error, if any, is still returned in
L<C<$@>|perlvar/$@>.
Examples:

    # make divide-by-zero nonfatal
    eval { $answer = $a / $b; }; warn $@ if $@;

    # same thing, but less efficient
    eval '$answer = $a / $b'; warn $@ if $@;

    # a compile-time error
    eval { $answer = }; # WRONG

    # a run-time error
    eval '$answer =';   # sets $@

If you want to trap errors when loading an XS module, some problems with
the binary interface (such as Perl version skew) may be fatal even with
C<eval> unless C<$ENV{PERL_DL_NONLAZY}> is set.  See
L<perlrun>.

Using the C<eval {}> form as an exception trap in libraries does have some
issues.  Due to the current arguably broken state of C<__DIE__> hooks, you
may wish not to trigger any C<__DIE__> hooks that user code may have installed.
You can use the C<local $SIG{__DIE__}> construct for this purpose,
as this example shows:

    # a private exception trap for divide-by-zero
    eval { local $SIG{'__DIE__'}; $answer = $a / $b; };
    warn $@ if $@;

This is especially significant, given that C<__DIE__> hooks can call
L<C<die>|/die LIST> again, which has the effect of changing their error
messages:

    # __DIE__ hooks may modify error messages
    {
       local $SIG{'__DIE__'} =
              sub { (my $x = $_[0]) =~ s/foo/bar/g; die $x };
       eval { die "foo lives here" };
       print $@ if $@;                # prints "bar lives here"
    }

Because this promotes action at a distance, this counterintuitive behavior
may be fixed in a future release.

C<eval BLOCK> does I<not> count as a loop, so the loop control statements
L<C<next>|/next LABEL>, L<C<last>|/last LABEL>, or
L<C<redo>|/redo LABEL> cannot be used to leave or restart the block.

The final semicolon, if any, may be omitted from within the BLOCK.

=back

=item evalbytes EXPR
X<evalbytes>

=item evalbytes

=for Pod::Functions +evalbytes similar to string eval, but intended to parse a bytestream

This function is similar to a L<string eval|/eval EXPR>, except it
always parses its argument (or L<C<$_>|perlvar/$_> if EXPR is omitted)
as a string of independent bytes.

If called when S<C<use utf8>> is in effect, the string will be assumed
to be encoded in UTF-8, and C<evalbytes> will make a temporary copy to
work from, downgraded to non-UTF-8.  If this is not possible
(because one or more characters in it require UTF-8), the C<evalbytes>
will fail with the error stored in C<$@>.

Bytes that correspond to ASCII-range code points will have their normal
meanings for operators in the string.  The treatment of the other bytes
depends on whether the L<C<"unicode_strings"> feature|feature/The
'unicode_strings' feature> is in effect.

Of course, variables that are UTF-8 and are referred to in the string
retain that:

 my $a = "\x{100}";
 evalbytes 'print ord $a, "\n"';

prints

 256

and C<$@> is empty.

Source filters activated within the evaluated code apply to the code
itself.

L<C<evalbytes>|/evalbytes EXPR> is available starting in Perl v5.16.  To
access it, you must say C<CORE::evalbytes>, but you can omit the
C<CORE::> if the
L<C<"evalbytes"> feature|feature/The 'unicode_eval' and 'evalbytes' features>
is enabled.  This is enabled automatically with a C<use v5.16> (or
higher) declaration in the current scope.

=item exec LIST
X<exec> X<execute>

=item exec PROGRAM LIST

=for Pod::Functions abandon this program to run another

The L<C<exec>|/exec LIST> function executes a system command I<and never
returns>; use L<C<system>|/system LIST> instead of L<C<exec>|/exec LIST>
if you want it to return.  It fails and
returns false only if the command does not exist I<and> it is executed
directly instead of via your system's command shell (see below).

Since it's a common mistake to use L<C<exec>|/exec LIST> instead of
L<C<system>|/system LIST>, Perl warns you if L<C<exec>|/exec LIST> is
called in void context and if there is a following statement that isn't
L<C<die>|/die LIST>, L<C<warn>|/warn LIST>, or L<C<exit>|/exit EXPR> (if
L<warnings> are enabled--but you always do that, right?).  If you
I<really> want to follow an L<C<exec>|/exec LIST> with some other
statement, you can use one of these styles to avoid the warning:

    exec ('foo')   or print STDERR "couldn't exec foo: $!";
    { exec ('foo') }; print STDERR "couldn't exec foo: $!";

If there is more than one argument in LIST, this calls L<execvp(3)> with the
arguments in LIST.  If there is only one element in LIST, the argument is
checked for shell metacharacters, and if there are any, the entire
argument is passed to the system's command shell for parsing (this is
C</bin/sh -c> on Unix platforms, but varies on other platforms).  If
there are no shell metacharacters in the argument, it is split into words
and passed directly to C<execvp>, which is more efficient.  Examples:

    exec '/bin/echo', 'Your arguments are: ', @ARGV;
    exec "sort $outfile | uniq";

If you don't really want to execute the first argument, but want to lie
to the program you are executing about its own name, you can specify
the program you actually want to run as an "indirect object" (without a
comma) in front of the LIST, as in C<exec PROGRAM LIST>.  (This always
forces interpretation of the LIST as a multivalued list, even if there
is only a single scalar in the list.)  Example:

    my $shell = '/bin/csh';
    exec $shell '-sh';    # pretend it's a login shell

or, more directly,

    exec {'/bin/csh'} '-sh';  # pretend it's a login shell

When the arguments get executed via the system shell, results are
subject to its quirks and capabilities.  See L<perlop/"`STRING`">
for details.

Using an indirect object with L<C<exec>|/exec LIST> or
L<C<system>|/system LIST> is also more secure.  This usage (which also
works fine with L<C<system>|/system LIST>) forces
interpretation of the arguments as a multivalued list, even if the
list had just one argument.  That way you're safe from the shell
expanding wildcards or splitting up words with whitespace in them.

    my @args = ( "echo surprise" );

    exec @args;               # subject to shell escapes
                                # if @args == 1
    exec { $args[0] } @args;  # safe even with one-arg list

The first version, the one without the indirect object, ran the I<echo>
program, passing it C<"surprise"> as an argument.  The second version didn't;
it tried to run a program named I<"echo surprise">, didn't find it, and set
L<C<$?>|perlvar/$?> to a non-zero value indicating failure.

On Windows, only the C<exec PROGRAM LIST> indirect object syntax will
reliably avoid using the shell; C<exec LIST>, even with more than one
element, will fall back to the shell if the first spawn fails.

Perl attempts to flush all files opened for output before the exec,
but this may not be supported on some platforms (see L<perlport>).
To be safe, you may need to set L<C<$E<verbar>>|perlvar/$E<verbar>>
(C<$AUTOFLUSH> in L<English>) or call the C<autoflush> method of
L<C<IO::Handle>|IO::Handle/METHODS> on any open handles to avoid lost
output.
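
A defensive pattern, sketched here under the assumption that
F</bin/echo> exists on the system, is to enable autoflush on the handle
before calling C<exec>:

    use strict;
    use warnings;
    use IO::Handle;

    STDOUT->autoflush(1);     # flush STDOUT after every write
    print "about to exec\n";
    exec '/bin/echo', 'replaced'
        or die "couldn't exec /bin/echo: $!";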

Note that L<C<exec>|/exec LIST> will not call your C<END> blocks, nor
will it invoke C<DESTROY> methods on your objects.

Portability issues: L<perlport/exec>.

=item exists EXPR
X<exists> X<autovivification>

=for Pod::Functions test whether a hash key is present

Given an expression that specifies an element of a hash, returns true if the
specified element in the hash has ever been initialized, even if the
corresponding value is undefined.

    print "Exists\n"    if exists $hash{$key};
    print "Defined\n"   if defined $hash{$key};
    print "True\n"      if $hash{$key};

L<C<exists>|/exists EXPR> may also be called on array elements, but its behavior is much less
obvious and is strongly tied to the use of L<C<delete>|/delete EXPR> on
arrays.

B<WARNING:> Calling L<C<exists>|/exists EXPR> on array values is
strongly discouraged.  The
notion of deleting or checking the existence of Perl array elements is not
conceptually coherent, and can lead to surprising behavior.

    print "Exists\n"    if exists $array[$index];
    print "Defined\n"   if defined $array[$index];
    print "True\n"      if $array[$index];

A hash or array element can be true only if it's defined and defined only if
it exists, but the reverse doesn't necessarily hold true.
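
The differences among the three tests can be seen with a small hash; the
keys used here are arbitrary:

    use strict;
    use warnings;

    my %h = (zero => 0, empty => undef);

    # "missing" is never stored, so it fails all three tests
    for my $key (qw(zero empty missing)) {
        printf "%-8s exists=%d defined=%d true=%d\n", $key,
            (exists  $h{$key}) ? 1 : 0,
            (defined $h{$key}) ? 1 : 0,
            ($h{$key})         ? 1 : 0;
    }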

Given an expression that specifies the name of a subroutine,
returns true if the specified subroutine has ever been declared, even
if it is undefined.  Mentioning a subroutine name for exists or defined
does not count as declaring it.  Note that a subroutine that does not
exist may still be callable: its package may have an C<AUTOLOAD>
method that makes it spring into existence the first time that it is
called; see L<perlsub>.

    print "Exists\n"  if exists &subroutine;
    print "Defined\n" if defined &subroutine;

Note that the EXPR can be arbitrarily complicated as long as the final
operation is a hash or array key lookup or subroutine name:

    if (exists $ref->{A}->{B}->{$key})  { }
    if (exists $hash{A}{B}{$key})       { }

    if (exists $ref->{A}->{B}->[$ix])   { }
    if (exists $hash{A}{B}[$ix])        { }

    if (exists &{$ref->{A}{B}{$key}})   { }

Although the most deeply nested array or hash element will not spring into
existence just because its existence was tested, any intervening ones will.
Thus C<< $ref->{"A"} >> and C<< $ref->{"A"}->{"B"} >> will spring
into existence due to the existence test for the C<$key> element above.
This happens anywhere the arrow operator is used, including even here:

    undef $ref;
    if (exists $ref->{"Some key"})    { }
    print $ref;  # prints HASH(0x80d3d5c)

This surprising autovivification in what does not at first--or even
second--glance appear to be an lvalue context may be fixed in a future
release.
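
Until then, one way to avoid the side effect is to test each level
before descending into it.  This is only a sketch, and the key names
C<A> and C<B> are placeholders:

    use strict;
    use warnings;

    my $ref;    # starts out undefined

    # each test short-circuits before the arrow can vivify anything
    my $found = ref($ref) eq 'HASH'
             && ref($ref->{A}) eq 'HASH'
             && exists $ref->{A}{B};

    print defined $ref ? "vivified\n" : "still undef\n";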

Use of a subroutine call, rather than a subroutine name, as an argument
to L<C<exists>|/exists EXPR> is an error.

    exists &sub;    # OK
    exists &sub();  # Error

=item exit EXPR
X<exit> X<terminate> X<abort>

=item exit

=for Pod::Functions terminate this program

Evaluates EXPR and exits immediately with that value.  Example:

    my $ans = <STDIN>;
    exit 0 if $ans =~ /^[Xx]/;

See also L<C<die>|/die LIST>.  If EXPR is omitted, exits with C<0>
status.  The only
universally recognized values for EXPR are C<0> for success and C<1>
for error; other values are subject to interpretation depending on the
environment in which the Perl program is running.  For example, exiting
69 (EX_UNAVAILABLE) from a I<sendmail> incoming-mail filter will cause
the mailer to return the item undelivered, but that's not true everywhere.

Don't use L<C<exit>|/exit EXPR> to abort a subroutine if there's any
chance that someone might want to trap whatever error happened.  Use
L<C<die>|/die LIST> instead, which can be trapped by an
L<C<eval>|/eval EXPR>.

The L<C<exit>|/exit EXPR> function does not always exit immediately.  It
calls any defined C<END> routines first, but these C<END> routines may
not themselves abort the exit.  Likewise any object destructors that
need to be called are called before the real exit.  C<END> routines and
destructors can change the exit status by modifying L<C<$?>|perlvar/$?>.
If this is a problem, you can call
L<C<POSIX::_exit($status)>|POSIX/C<_exit>> to avoid C<END> and destructor
processing.  See L<perlmod> for details.
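
As an illustration of an C<END> block changing the exit status, consider
this sketch, in which the status value 3 is arbitrary:

    use strict;
    use warnings;

    END { $? = 3 if $? == 0 }   # runs during exit; may alter the status
    exit 0;                     # the process ends with status 3, not 0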

Portability issues: L<perlport/exit>.

=item exp EXPR
X<exp> X<exponential> X<antilog> X<antilogarithm> X<e>

=item exp

=for Pod::Functions raise I<e> to a power

Returns I<e> (the natural logarithm base) to the power of EXPR.
If EXPR is omitted, gives C<exp($_)>.

=item fc EXPR
X<fc> X<foldcase> X<casefold> X<fold-case> X<case-fold>

=item fc

=for Pod::Functions +fc return casefolded version of a string

Returns the casefolded version of EXPR.  This is the internal function
implementing the C<\F> escape in double-quoted strings.

Casefolding is the process of mapping strings to a form where case
differences are erased; comparing two strings in their casefolded
form is effectively a way of asking if two strings are equal,
regardless of case.

Roughly, if you have ever found yourself writing one of these:

    lc($this) eq lc($that)    # Wrong!
        # or
    uc($this) eq uc($that)    # Also wrong!
        # or
    $this =~ /^\Q$that\E\z/i  # Right!

you can now write

    fc($this) eq fc($that)

and get the correct results.

Perl only implements the full form of casefolding, but you can access
the simple folds using L<Unicode::UCD/B<casefold()>> and
L<Unicode::UCD/B<prop_invmap()>>.
For further information on casefolding, refer to
the Unicode Standard, specifically sections 3.13 C<Default Case Operations>,
4.2 C<Case-Normative>, and 5.18 C<Case Mappings>,
available at L<http://www.unicode.org/versions/latest/>, as well as the
Case Charts available at L<http://www.unicode.org/charts/case/>.

If EXPR is omitted, uses L<C<$_>|perlvar/$_>.

This function behaves the same way under various pragmas, such as within
L<S<C<use feature 'unicode_strings'>>|feature/The 'unicode_strings' feature>,
as L<C<lc>|/lc EXPR> does, with the single exception of
L<C<fc>|/fc EXPR> of I<LATIN CAPITAL LETTER SHARP S> (U+1E9E) within the
scope of L<S<C<use locale>>|locale>.  The foldcase of this character
would normally be C<"ss">, but as explained in the L<C<lc>|/lc EXPR>
section, case
changes that cross the 255/256 boundary are problematic under locales,
and are hence prohibited.  Therefore, this function under locale returns
instead the string C<"\x{17F}\x{17F}">, which is two I<LATIN SMALL LETTER
LONG S> characters.  Since that character itself folds to C<"s">, the
string of two of them together should be equivalent to a single U+1E9E
when foldcased.

While the Unicode Standard defines two additional forms of casefolding,
one for Turkic languages and one that never maps one character into multiple
characters, these are not provided by the Perl core.  However, the CPAN module
L<C<Unicode::Casing>|Unicode::Casing> may be used to provide an implementation.

L<C<fc>|/fc EXPR> is available only if the
L<C<"fc"> feature|feature/The 'fc' feature> is enabled or if it is
prefixed with C<CORE::>.  The
L<C<"fc"> feature|feature/The 'fc' feature> is enabled automatically
with a C<use v5.16> (or higher) declaration in the current scope.

=item fcntl FILEHANDLE,FUNCTION,SCALAR
X<fcntl>

=for Pod::Functions file control system call

Implements the L<fcntl(2)> function.  You'll probably have to say

    use Fcntl;

first to get the correct constant definitions.  Argument processing and
value returned work just like L<C<ioctl>|/ioctl
FILEHANDLE,FUNCTION,SCALAR> below.  For example:

    use Fcntl;
    my $flags = fcntl($filehandle, F_GETFL, 0)
        or die "Can't fcntl F_GETFL: $!";

You don't have to check for L<C<defined>|/defined EXPR> on the return
from L<C<fcntl>|/fcntl FILEHANDLE,FUNCTION,SCALAR>.  Like
L<C<ioctl>|/ioctl FILEHANDLE,FUNCTION,SCALAR>, it maps a C<0> return
from the system call into C<"0 but true"> in Perl.  This string is true
in boolean context and C<0> in numeric context.  It is also exempt from
the normal
L<C<Argument "..." isn't numeric>|perldiag/Argument "%s" isn't numeric%s>
L<warnings> on improper numeric conversions.

Note that L<C<fcntl>|/fcntl FILEHANDLE,FUNCTION,SCALAR> raises an
exception if used on a machine that doesn't implement L<fcntl(2)>.  See
the L<Fcntl> module or your L<fcntl(2)> manpage to learn what functions
are available on your system.
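
The C<"0 but true"> convention described above can be seen in isolation
with a short sketch.  A literal string stands in for a real
L<C<fcntl>|/fcntl FILEHANDLE,FUNCTION,SCALAR> return value, and the
variable names are illustrative only:

    use strict;
    use warnings;

    my $ret = "0 but true";     # what a system-call return of 0 maps to

    print "call succeeded\n" if $ret;   # true in boolean context
    my $sum = $ret + 5;                 # numifies to 0, with no warning
    print "numeric value plus 5 is $sum\n";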

Here's an example of setting a filehandle named C<$REMOTE> to be
non-blocking at the system level.  You'll have to negotiate
L<C<$E<verbar>>|perlvar/$E<verbar>> on your own, though.

    use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

    my $flags = fcntl($REMOTE, F_GETFL, 0)
        or die "Can't get flags for the socket: $!\n";

    fcntl($REMOTE, F_SETFL, $flags | O_NONBLOCK)
        or die "Can't set flags for the socket: $!\n";

Portability issues: L<perlport/fcntl>.

=item __FILE__
X<__FILE__>

=for Pod::Functions the name of the current source file

A special token that returns the name of the file in which it occurs.

=item fileno FILEHANDLE
X<fileno>

=for Pod::Functions return file descriptor from filehandle

Returns the file descriptor for a filehandle, or undefined if the
filehandle is not open.  If there is no real file descriptor at the OS
level, as can happen with filehandles connected to memory objects via
L<C<open>|/open FILEHANDLE,EXPR> with a reference for the third
argument, -1 is returned.

This is mainly useful for constructing bitmaps for
L<C<select>|/select RBITS,WBITS,EBITS,TIMEOUT> and low-level POSIX
tty-handling operations.
If FILEHANDLE is an expression, the value is taken as an indirect
filehandle, generally its name.
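
A minimal sketch of both points above, assuming STDIN is open and using
a hypothetical in-memory handle C<$mem>:

    use strict;
    use warnings;

    # Build a read bitmap for select() with STDIN's descriptor set.
    my $rin = '';
    vec($rin, fileno(STDIN), 1) = 1;

    # An in-memory filehandle has no descriptor at the OS level.
    my $buffer = "some data\n";
    open(my $mem, "<", \$buffer)
        or die "Can't open in-memory handle: $!";
    my $mem_fd = fileno($mem);
    print "in-memory fileno: $mem_fd\n";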

You can use this to find out whether two handles refer to the
same underlying descriptor:

    if (fileno($this) != -1 && fileno($this) == fileno($that)) {
        print "\$this and \$that are dups\n";
    } elsif (fileno($this) != -1 && fileno($that) != -1) {
        print "\$this and \$that have different " .
            "underlying file descriptors\n";
    } else {
        print "At least one of \$this and \$that does " .
            "not have a real file descriptor\n";
    }

The behavior of L<C<fileno>|/fileno FILEHANDLE> on a directory handle
depends on the operating system.  On a system with L<dirfd(3)> or
similar, L<C<fileno>|/fileno FILEHANDLE> on a directory
handle returns the underlying file descriptor associated with the
handle; on systems with no such support, it returns the undefined value,
and sets L<C<$!>|perlvar/$!> (errno).

=item flock FILEHANDLE,OPERATION
X<flock> X<lock> X<locking>

=for Pod::Functions lock an entire file with an advisory lock

Calls L<flock(2)>, or an emulation of it, on FILEHANDLE.  Returns true
for success, false on failure.  Produces a fatal error if used on a
machine that doesn't implement L<flock(2)>, L<fcntl(2)> locking, or
L<lockf(3)>.  L<C<flock>|/flock FILEHANDLE,OPERATION> is Perl's portable
file-locking interface, although it locks entire files only, not
records.

Two potentially non-obvious but traditional L<C<flock>|/flock
FILEHANDLE,OPERATION> semantics are
that it waits indefinitely until the lock is granted, and that its locks
are B<merely advisory>.  Such discretionary locks are more flexible, but
offer fewer guarantees.  This means that programs that do not also use
L<C<flock>|/flock FILEHANDLE,OPERATION> may modify files locked with
L<C<flock>|/flock FILEHANDLE,OPERATION>.  See L<perlport>,
your port's specific documentation, and your system-specific local manpages
for details.  It's best to assume traditional behavior if you're writing
portable programs.  (But if you're not, you should as always feel perfectly
free to write for your own system's idiosyncrasies (sometimes called
"features").  Slavish adherence to portability concerns shouldn't get
in the way of your getting your job done.)

OPERATION is one of LOCK_SH, LOCK_EX, or LOCK_UN, possibly combined with
LOCK_NB.  These constants are traditionally valued 1, 2, 8 and 4, but
you can use the symbolic names if you import them from the L<Fcntl> module,
either individually, or as a group using the C<:flock> tag.  LOCK_SH
requests a shared lock, LOCK_EX requests an exclusive lock, and LOCK_UN
releases a previously requested lock.  If LOCK_NB is bitwise-or'ed with
LOCK_SH or LOCK_EX, then L<C<flock>|/flock FILEHANDLE,OPERATION> returns
immediately rather than blocking waiting for the lock; check the return
status to see if you got it.
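
A minimal sketch of a non-blocking lock attempt.  The temporary file
from L<File::Temp> is an assumption standing in for whatever file your
program actually shares:

    use strict;
    use warnings;
    use Fcntl qw(:flock);
    use File::Temp qw(tempfile);

    my ($fh, $path) = tempfile(UNLINK => 1);

    # LOCK_NB makes flock return at once instead of waiting
    my $got_lock = flock($fh, LOCK_EX | LOCK_NB);

    if ($got_lock) {
        print "got the lock on $path\n";
        # ... update the file here ...
        flock($fh, LOCK_UN) or die "Cannot unlock $path: $!\n";
    }
    else {
        warn "$path is locked by someone else: $!\n";
    }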

To avoid the possibility of miscoordination, Perl now flushes FILEHANDLE
before locking or unlocking it.

Note that the emulation built with L<lockf(3)> doesn't provide shared
locks, and it requires that FILEHANDLE be open with write intent.  These
are the semantics that L<lockf(3)> implements.  Most if not all systems
implement L<lockf(3)> in terms of L<fcntl(2)> locking, though, so the
differing semantics shouldn't bite too many people.

Note that the L<fcntl(2)> emulation of L<flock(3)> requires that FILEHANDLE
be open with read intent to use LOCK_SH and requires that it be open
with write intent to use LOCK_EX.

Note also that some versions of L<C<flock>|/flock FILEHANDLE,OPERATION>
cannot lock things over the network; you would need to use the more
system-specific L<C<fcntl>|/fcntl FILEHANDLE,FUNCTION,SCALAR> for
that.  If you like you can force Perl to ignore your system's L<flock(2)>
function, and so provide its own L<fcntl(2)>-based emulation, by passing
the switch C<-Ud_flock> to the F<Configure> program when you configure
and build a new Perl.

Here's a mailbox appender for BSD systems.

    # import LOCK_* and SEEK_END constants
    use Fcntl qw(:flock SEEK_END);

    sub lock {
        my ($fh) = @_;
        flock($fh, LOCK_EX) or die "Cannot lock mailbox - $!\n";

        # and, in case someone appended while we were waiting...
        seek($fh, 0, SEEK_END) or die "Cannot seek - $!\n";
    }

    sub unlock {
        my ($fh) = @_;
        flock($fh, LOCK_UN) or die "Cannot unlock mailbox - $!\n";
    }

    open(my $mbox, ">>", "/usr/spool/mail/$ENV{'USER'}")
        or die "Can't open mailbox: $!";

    lock($mbox);
    print $mbox $msg,"\n\n";
    unlock($mbox);

On systems that support a real L<flock(2)>, locks are inherited across
L<C<fork>|/fork> calls, whereas those that must resort to the more
capricious L<fcntl(2)> function lose their locks, making it seriously
harder to write servers.

See also L<DB_File> for other L<C<flock>|/flock FILEHANDLE,OPERATION>
examples.

Portability issues: L<perlport/flock>.

=item fork
X<fork> X<child> X<parent>

=for Pod::Functions create a new process just like this one

Does a L<fork(2)> system call to create a new process running the
same program at the same point.  It returns the child pid to the
parent process, C<0> to the child process, or L<C<undef>|/undef EXPR> if
the fork is
unsuccessful.  File descriptors (and sometimes locks on those descriptors)
are shared, while everything else is copied.  On most systems supporting
L<fork(2)>, great care has gone into making it extremely efficient (for
example, using copy-on-write technology on data pages), making it the
dominant paradigm for multitasking over the last few decades.

Perl attempts to flush all files opened for output before forking the
child process, but this may not be supported on some platforms (see
L<perlport>).  To be safe, you may need to set
L<C<$E<verbar>>|perlvar/$E<verbar>> (C<$AUTOFLUSH> in L<English>) or
call the C<autoflush> method of L<C<IO::Handle>|IO::Handle/METHODS> on
any open handles to avoid duplicate output.

If you L<C<fork>|/fork> without ever waiting on your children, you will
accumulate zombies.  On some systems, you can avoid this by setting
L<C<$SIG{CHLD}>|perlvar/%SIG> to C<"IGNORE">.  See also L<perlipc> for
more examples of forking and reaping moribund children.
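
A minimal sketch of forking one child and reaping it with
L<C<waitpid>|/waitpid PID,FLAGS>; the exit code C<3> is arbitrary:

    use strict;
    use warnings;

    my $pid = fork();
    die "Can't fork: $!" unless defined $pid;

    if ($pid == 0) {
        # child process
        exit 3;
    }

    # parent process: wait for the child so it doesn't become a zombie
    waitpid($pid, 0);
    my $exit_code = $? >> 8;
    print "child $pid exited with code $exit_code\n";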

Note that if your forked child inherits system file descriptors like
STDIN and STDOUT that are actually connected by a pipe or socket, even
if you exit, then the remote server (such as, say, a CGI script or a
backgrounded job launched from a remote shell) won't think you're done.
You should reopen those to F</dev/null> if it's any issue.

On some platforms such as Windows, where the L<fork(2)> system call is
not available, Perl can be built to emulate L<C<fork>|/fork> in the Perl
interpreter.  The emulation is designed, at the level of the Perl
program, to be as compatible as possible with the "Unix" L<fork(2)>.
However it has limitations that have to be considered in code intended
to be portable.  See L<perlfork> for more details.

Portability issues: L<perlport/fork>.

=item format
X<format>

=for Pod::Functions declare a picture format with use by the write() function

Declare a picture format for use by the L<C<write>|/write FILEHANDLE>
function.  For example:

    format Something =
        Test: @<<<<<<<< @||||| @>>>>>
              $str,     $%,    '$' . int($num)
    .

    $str = "widget";
    $num = $cost/$quantity;
    $~ = 'Something';
    write;

See L<perlform> for many details and examples.

=item formline PICTURE,LIST
X<formline>

=for Pod::Functions internal function used for formats

This is an internal function used by L<C<format>|/format>s, though you
may call it, too.  It formats (see L<perlform>) a list of values
according to the contents of PICTURE, placing the output into the format
output accumulator, L<C<$^A>|perlvar/$^A> (or C<$ACCUMULATOR> in
L<English>).  Eventually, when a L<C<write>|/write FILEHANDLE> is done,
the contents of L<C<$^A>|perlvar/$^A> are written to some filehandle.
You could also read L<C<$^A>|perlvar/$^A> and then set
L<C<$^A>|perlvar/$^A> back to C<"">.  Note that a format typically does
one L<C<formline>|/formline PICTURE,LIST> per line of form, but the
L<C<formline>|/formline PICTURE,LIST> function itself doesn't care how
many newlines are embedded in the PICTURE.  This means that the C<~> and
C<~~> tokens treat the entire PICTURE as a single line.  You may
therefore need to use multiple formlines to implement a single record
format, just like the L<C<format>|/format> compiler.

Be careful if you put double quotes around the picture, because an C<@>
character may be taken to mean the beginning of an array name.
L<C<formline>|/formline PICTURE,LIST> always returns true.  See
L<perlform> for other examples.
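
A minimal sketch of calling L<C<formline>|/formline PICTURE,LIST>
directly and collecting the result from L<C<$^A>|perlvar/$^A>.  The
picture and values are illustrative; the picture is single-quoted so
that C<@> is not interpolated:

    use strict;
    use warnings;

    $^A = "";                       # start with an empty accumulator
    formline('@<<<<<<< @>>>>>' . "\n", "widget", 42);
    my $line = $^A;                 # fetch the formatted text
    $^A = "";                       # reset for the next use
    print $line;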

If you are trying to use this instead of L<C<write>|/write FILEHANDLE>
to capture the output, you may find it easier to open a filehandle to a
scalar (C<< open my $fh, ">", \$output >>) and write to that instead.

=item getc FILEHANDLE
X<getc> X<getchar> X<character> X<file, read>

=item getc

=for Pod::Functions get the next character from the filehandle

Returns the next character from the input file attached to FILEHANDLE,
or the undefined value at end of file or if there was an error (in
the latter case L<C<$!>|perlvar/$!> is set).  If FILEHANDLE is omitted,
reads from
STDIN.  This is not particularly efficient.  However, it cannot be
used by itself to fetch single characters without waiting for the user
to hit enter.  For that, try something more like:

    if ($BSD_STYLE) {
        system "stty cbreak </dev/tty >/dev/tty 2>&1";
    }
    else {
        system "stty", '-icanon', 'eol', "\001";
    }

    my $key = getc(STDIN);

    if ($BSD_STYLE) {
        system "stty -cbreak </dev/tty >/dev/tty 2>&1";
    }
    else {
        system 'stty', 'icanon', 'eol', '^@'; # ASCII NUL
    }
    print "\n";

Determination of whether C<$BSD_STYLE> should be set is left as an
exercise to the reader.

The L<C<POSIX::getattr>|POSIX/C<getattr>> function can do this more
portably on systems purporting POSIX compliance.  See also the
L<C<Term::ReadKey>|Term::ReadKey> module on CPAN.

=item getlogin
X<getlogin> X<login>

=for Pod::Functions return who logged in at this tty

This implements the C library function of the same name, which on most
systems returns the current login from F</etc/utmp>, if any.  If it
returns the empty string, use L<C<getpwuid>|/getpwuid UID>.

    my $login = getlogin || getpwuid($<) || "Kilroy";

Do not consider L<C<getlogin>|/getlogin> for authentication: it is not
as secure as L<C<getpwuid>|/getpwuid UID>.

Portability issues: L<perlport/getlogin>.

=item getpeername SOCKET
X<getpeername> X<peer>

=for Pod::Functions find the other end of a socket connection

Returns the packed sockaddr address of the other end of the SOCKET
connection.

    use Socket;
    my $hersockaddr    = getpeername($sock);
    my ($port, $iaddr) = sockaddr_in($hersockaddr);
    my $herhostname    = gethostbyaddr($iaddr, AF_INET);
    my $herstraddr     = inet_ntoa($iaddr);

=item getpgrp PID
X<getpgrp> X<group>

=for Pod::Functions get process group

Returns the current process group for the specified PID.  Use
a PID of C<0> to get the current process group for the
current process.  Will raise an exception if used on a machine that
doesn't implement L<getpgrp(2)>.  If PID is omitted, returns the process
group of the current process.  Note that the POSIX version of
L<C<getpgrp>|/getpgrp PID> does not accept a PID argument, so only
C<PID==0> is truly portable.

Portability issues: L<perlport/getpgrp>.

=item getppid
X<getppid> X<parent> X<pid>

=for Pod::Functions get parent process ID

Returns the process id of the parent process.

Note for Linux users: Between v5.8.1 and v5.16.0 Perl would work
around non-POSIX thread semantics on the minority of Linux systems (and
Debian GNU/kFreeBSD systems) that used LinuxThreads.  This emulation
has since been removed.  See the documentation for L<$$|perlvar/$$> for
details.

Portability issues: L<perlport/getppid>.

=item getpriority WHICH,WHO
X<getpriority> X<priority> X<nice>

=for Pod::Functions get current nice value

Returns the current priority for a process, a process group, or a user.
(See L<getpriority(2)>.)  Will raise a fatal exception if used on a
machine that doesn't implement L<getpriority(2)>.

Portability issues: L<perlport/getpriority>.

=item getpwnam NAME
X<getpwnam> X<getgrnam> X<gethostbyname> X<getnetbyname> X<getprotobyname>
X<getpwuid> X<getgrgid> X<getservbyname> X<gethostbyaddr> X<getnetbyaddr>
X<getprotobynumber> X<getservbyport> X<getpwent> X<getgrent> X<gethostent>
X<getnetent> X<getprotoent> X<getservent> X<setpwent> X<setgrent> X<sethostent>
X<setnetent> X<setprotoent> X<setservent> X<endpwent> X<endgrent> X<endhostent>
X<endnetent> X<endprotoent> X<endservent>

=for Pod::Functions get passwd record given user login name

=item getgrnam NAME

=for Pod::Functions get group record given group name

=item gethostbyname NAME

=for Pod::Functions get host record given name

=item getnetbyname NAME

=for Pod::Functions get networks record given name

=item getprotobyname NAME

=for Pod::Functions get protocol record given name

=item getpwuid UID

=for Pod::Functions get passwd record given user ID

=item getgrgid GID

=for Pod::Functions get group record given group user ID

=item getservbyname NAME,PROTO

=for Pod::Functions get services record given its name

=item gethostbyaddr ADDR,ADDRTYPE

=for Pod::Functions get host record given its address

=item getnetbyaddr ADDR,ADDRTYPE

=for Pod::Functions get network record given its address

=item getprotobynumber NUMBER

=for Pod::Functions get protocol record numeric protocol

=item getservbyport PORT,PROTO

=for Pod::Functions get services record given numeric port

=item getpwent

=for Pod::Functions get next passwd record

=item getgrent

=for Pod::Functions get next group record

=item gethostent

=for Pod::Functions get next hosts record

=item getnetent

=for Pod::Functions get next networks record

=item getprotoent

=for Pod::Functions get next protocols record

=item getservent

=for Pod::Functions get next services record

=item setpwent

=for Pod::Functions prepare passwd file for use

=item setgrent

=for Pod::Functions prepare group file for use

=item sethostent STAYOPEN

=for Pod::Functions prepare hosts file for use

=item setnetent STAYOPEN

=for Pod::Functions prepare networks file for use

=item setprotoent STAYOPEN

=for Pod::Functions prepare protocols file for use

=item setservent STAYOPEN

=for Pod::Functions prepare services file for use

=item endpwent

=for Pod::Functions be done using passwd file

=item endgrent

=for Pod::Functions be done using group file

=item endhostent

=for Pod::Functions be done using hosts file

=item endnetent

=for Pod::Functions be done using networks file

=item endprotoent

=for Pod::Functions be done using protocols file

=item endservent

=for Pod::Functions be done using services file

These routines are the same as their counterparts in the
system C library.  In list context, the return values from the
various get routines are as follows:

 #    0        1          2           3         4
 my ( $name,   $passwd,   $gid,       $members  ) = getgr*
 my ( $name,   $aliases,  $addrtype,  $net      ) = getnet*
 my ( $name,   $aliases,  $port,      $proto    ) = getserv*
 my ( $name,   $aliases,  $proto                ) = getproto*
 my ( $name,   $aliases,  $addrtype,  $length,  @addrs ) = gethost*
 my ( $name,   $passwd,   $uid,       $gid,     $quota,
    $comment,  $gcos,     $dir,       $shell,   $expire ) = getpw*
 #    5        6          7           8         9

(If the entry doesn't exist, the return value is a single meaningless true
value.)

The exact meaning of the $gcos field varies, but it usually contains
the real name of the user (as opposed to the login name) and other
information pertaining to the user.  Beware, however, that on many
systems users are able to change this information, so it cannot be
trusted; the $gcos field is therefore tainted (see L<perlsec>).  The
$passwd and $shell fields, the user's encrypted password and login
shell, are also tainted, for the same reason.

In scalar context, you get the name, unless the function was a
lookup by name, in which case you get the other thing, whatever it is.
(If the entry doesn't exist you get the undefined value.)  For example:

    my $uid   = getpwnam($name);
    my $name  = getpwuid($num);
    my $name  = getpwent();
    my $gid   = getgrnam($name);
    my $name  = getgrgid($num);
    my $name  = getgrent();
    # etc.

In I<getpw*()> the fields $quota, $comment, and $expire are special
in that they are unsupported on many systems.  If the
$quota is unsupported, it is an empty scalar.  If it is supported, it
usually encodes the disk quota.  If the $comment field is unsupported,
it is an empty scalar.  If it is supported it usually encodes some
administrative comment about the user.  In some systems the $quota
field may be $change or $age, fields that have to do with password
aging.  In some systems the $comment field may be $class.  The $expire
field, if present, encodes the expiration period of the account or the
password.  For the availability and the exact meaning of these fields
in your system, please consult L<getpwnam(3)> and your system's
F<pwd.h> file.  You can also find out from within Perl what your
$quota and $comment fields mean and whether you have the $expire field
by using the L<C<Config>|Config> module and the values C<d_pwquota>, C<d_pwage>,
C<d_pwchange>, C<d_pwcomment>, and C<d_pwexpire>.  Shadow password
files are supported only if your vendor has implemented them in the
intuitive fashion that calling the regular C library routines gets the
shadow versions if you're running under privilege or if there exist
the L<shadow(3)> functions as found in System V (this includes Solaris
and Linux).  Those systems that implement a proprietary shadow password
facility are unlikely to be supported.
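
A minimal sketch of the L<C<Config>|Config> check mentioned above.  It
assumes the usual convention that a supported feature is recorded as the
string C<define>:

    use strict;
    use warnings;
    use Config;

    for my $key (qw(d_pwquota d_pwage d_pwchange d_pwcomment d_pwexpire)) {
        # an unset or "undef" entry means the field is not available
        my $supported = ($Config{$key} // '') eq 'define';
        printf "%-12s %s\n", $key, $supported ? "yes" : "no";
    }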

The $members value returned by I<getgr*()> is a space-separated list of
the login names of the members of the group.

For the I<gethost*()> functions, if the C<h_errno> variable is supported in
C, it will be returned to you via L<C<$?>|perlvar/$?> if the function
call fails.  The
C<@addrs> value returned by a successful call is a list of raw
addresses returned by the corresponding library call.  In the
Internet domain, each address is four bytes long; you can unpack it
by saying something like:

    my ($w,$x,$y,$z) = unpack('W4',$addrs[0]);

The Socket library makes this slightly easier:

    use Socket;
    my $iaddr = inet_aton("127.1"); # or whatever address
    my $name  = gethostbyaddr($iaddr, AF_INET);

    # or going the other way
    my $straddr = inet_ntoa($iaddr);

In the opposite way, to resolve a hostname to the IP address
you can write this:

    use Socket;
    my $packed_ip = gethostbyname("www.perl.org");
    my $ip_address;
    if (defined $packed_ip) {
        $ip_address = inet_ntoa($packed_ip);
    }

Make sure L<C<gethostbyname>|/gethostbyname NAME> is called in SCALAR
context and that its return value is checked for definedness.

The L<C<getprotobynumber>|/getprotobynumber NUMBER> function, even
though it only takes one argument, has the precedence of a list
operator, so beware:

    getprotobynumber $number eq 'icmp'   # WRONG
    getprotobynumber($number eq 'icmp')  # actually means this
    getprotobynumber($number) eq 'icmp'  # better this way

If you get tired of remembering which element of the return list
contains which return value, by-name interfaces are provided in standard
modules: L<C<File::stat>|File::stat>, L<C<Net::hostent>|Net::hostent>,
L<C<Net::netent>|Net::netent>, L<C<Net::protoent>|Net::protoent>,
L<C<Net::servent>|Net::servent>, L<C<Time::gmtime>|Time::gmtime>,
L<C<Time::localtime>|Time::localtime>, L<C<User::grent>|User::grent>,
and L<C<User::pwent>|User::pwent>.  These override the normal built-ins,
supplying versions that return objects with the appropriate names for
each field.  For example:

   use File::stat;
   use User::pwent;
   my $is_his = (stat($filename)->uid == getpw($whoever)->uid);

Even though it looks as though they're the same method calls (uid),
they aren't, because a C<File::stat> object is different from
a C<User::pwent> object.

Portability issues: L<perlport/getpwnam> to L<perlport/endservent>.

=item getsockname SOCKET
X<getsockname>

=for Pod::Functions retrieve the sockaddr for a given socket

Returns the packed sockaddr address of this end of the SOCKET connection,
in case you don't know the address because you have several different
IPs that the connection might have come in on.

    use Socket;
    my $mysockaddr = getsockname($sock);
    my ($port, $myaddr) = sockaddr_in($mysockaddr);
    printf "Connect to %s [%s]\n",
       scalar gethostbyaddr($myaddr, AF_INET),
       inet_ntoa($myaddr);

=item getsockopt SOCKET,LEVEL,OPTNAME
X<getsockopt>

=for Pod::Functions get socket options on a given socket

Queries the option named OPTNAME associated with SOCKET at a given LEVEL.
Options may exist at multiple protocol levels depending on the socket
type, but at least the uppermost socket level SOL_SOCKET (defined in the
L<C<Socket>|Socket> module) will exist.  To query options at another
level the protocol number of the appropriate protocol controlling the
option should be supplied.  For example, to indicate that an option is
to be interpreted by the TCP protocol, LEVEL should be set to the
protocol number of TCP, which you can get using
L<C<getprotobyname>|/getprotobyname NAME>.

The function returns a packed string representing the requested socket
option, or L<C<undef>|/undef EXPR> on error, with the reason for the
error placed in L<C<$!>|perlvar/$!>.  Just what is in the packed string
depends on LEVEL and OPTNAME; consult L<getsockopt(2)> for details.  A
common case is that the option is an integer, in which case the result
is a packed integer, which you can decode using
L<C<unpack>|/unpack TEMPLATE,EXPR> with the C<i> (or C<I>) format.

Here's an example to test whether Nagle's algorithm is enabled on a socket:

    use Socket qw(:all);

    defined(my $tcp = getprotobyname("tcp"))
        or die "Could not determine the protocol number for tcp";
    # my $tcp = IPPROTO_TCP; # Alternative
    my $packed = getsockopt($socket, $tcp, TCP_NODELAY)
        or die "getsockopt TCP_NODELAY: $!";
    my $nodelay = unpack("I", $packed);
    print "Nagle's algorithm is turned ",
           $nodelay ? "off\n" : "on\n";

Portability issues: L<perlport/getsockopt>.

=item glob EXPR
X<glob> X<wildcard> X<filename, expansion> X<expand>

=item glob

=for Pod::Functions expand filenames using wildcards

In list context, returns a (possibly empty) list of filename expansions on
the value of EXPR such as the standard Unix shell F</bin/csh> would do.  In
scalar context, glob iterates through such filename expansions, returning
undef when the list is exhausted.  This is the internal function
implementing the C<< <*.c> >> operator, but you can use it directly.  If
EXPR is omitted, L<C<$_>|perlvar/$_> is used.  The C<< <*.c> >> operator
is discussed in more detail in L<perlop/"I/O Operators">.
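
A minimal sketch of the two calling contexts, assuming the current
directory is the one you want to search:

    use strict;
    use warnings;

    # list context: all matches at once
    my @sources = glob("*.c");
    print scalar(@sources), " C source file(s)\n";

    # scalar context: one match per call, undef when exhausted
    while (my $file = glob("*.h")) {
        print "header: $file\n";
    }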

Note that L<C<glob>|/glob EXPR> splits its arguments on whitespace and
treats each segment as a separate pattern.  As such, C<glob("*.c *.h")>
matches all files with a F<.c> or F<.h> extension.  The expression
C<glob(".* *")> matches all files in the current working directory.
If you want to glob filenames that might contain whitespace, you'll
have to use extra quotes around the spacey filename to protect it.
For example, to glob filenames that have an C<e> followed by a space
followed by an C<f>, use one of:

    my @spacies = <"*e f*">;
    my @spacies = glob '"*e f*"';
    my @spacies = glob q("*e f*");

If you had to get a variable through, you could do this:

    my @spacies = glob "'*${var}e f*'";
    my @spacies = glob qq("*${var}e f*");

If non-empty braces are the only wildcard characters used in the
L<C<glob>|/glob EXPR>, no filenames are matched, but potentially many
strings are returned.  For example, this produces nine strings, one for
each pairing of fruits and colors:

    my @many = glob "{apple,tomato,cherry}={green,yellow,red}";

This operator is implemented using the standard C<File::Glob> extension.
See L<File::Glob> for details, including
L<C<bsd_glob>|File::Glob/C<bsd_glob>>, which does not treat whitespace
as a pattern separator.
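
A minimal sketch of a space-containing pattern passed to C<bsd_glob>,
which needs no extra quoting:

    use strict;
    use warnings;
    use File::Glob qw(bsd_glob);

    # The whole string is one pattern; the space is matched literally.
    my @spacies = bsd_glob("*e f*");
    print "$_\n" for @spacies;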

Portability issues: L<perlport/glob>.

=item gmtime EXPR
X<gmtime> X<UTC> X<Greenwich>

=item gmtime

=for Pod::Functions convert UNIX time into record or string using Greenwich time

Works just like L<C<localtime>|/localtime EXPR> but the returned values
are localized for the standard Greenwich time zone.

Note: When called in list context, $isdst, the last value
returned by gmtime, is always C<0>.  There is no
Daylight Saving Time in GMT.
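
A minimal sketch that breaks down epoch time C<0>, which on systems
using the Unix epoch is the start of 1 January 1970 UTC:

    use strict;
    use warnings;

    my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst)
        = gmtime(0);

    # $year counts from 1900 and $mon from 0, as with localtime
    printf "%04d-%02d-%02d %02d:%02d:%02d UTC (isdst=%d)\n",
        $year + 1900, $mon + 1, $mday, $hour, $min, $sec, $isdst;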

Portability issues: L<perlport/gmtime>.

=item goto LABEL
X<goto> X<jump> X<jmp>

=item goto EXPR

=item goto &NAME

=for Pod::Functions create spaghetti code

The C<goto LABEL> form finds the statement labeled with LABEL and
resumes execution there.  It can't be used to get out of a block or
subroutine given to L<C<sort>|/sort SUBNAME LIST>.  It can be used to go
almost anywhere else within the dynamic scope, including out of
subroutines, but it's usually better to use some other construct such as
L<C<last>|/last LABEL> or L<C<die>|/die LIST>.  The author of Perl has
never felt the need to use this form of L<C<goto>|/goto LABEL> (in Perl,
that is; C is another matter).  (The difference is that C does not offer
named loops combined with loop control.  Perl does, and this replaces
most structured uses of L<C<goto>|/goto LABEL> in other languages.)

The C<goto EXPR> form expects to evaluate C<EXPR> to a code reference or
a label name.  If it evaluates to a code reference, it will be handled
like C<goto &NAME>, below.  This is especially useful for implementing
tail recursion via C<goto __SUB__>.
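
A minimal sketch of that idiom.  C<use v5.16> is assumed because it
enables the C<current_sub> feature that provides C<__SUB__>; the
countdown routine is purely illustrative:

    use v5.16;
    use warnings;

    my $countdown = sub {
        my ($n, $steps) = @_;
        return $steps if $n == 0;
        @_ = ($n - 1, $steps + 1);   # arguments for the next iteration
        goto __SUB__;                # re-enter without a nested call
    };

    my $total = $countdown->(100_000, 0);
    print "$total steps\n";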

If the expression evaluates to a label name, its scope will be resolved
dynamically.  This allows for computed L<C<goto>|/goto LABEL>s per
FORTRAN, but isn't necessarily recommended if you're optimizing for
maintainability:

    goto ("FOO", "BAR", "GLARCH")[$i];

As shown in this example, C<goto EXPR> is exempt from the "looks like a
function" rule.  A pair of parentheses following it does not (necessarily)
delimit its argument.  C<goto("NE")."XT"> is equivalent to C<goto NEXT>.
Also, unlike most named operators, this has the same precedence as
assignment.

Use of C<goto LABEL> or C<goto EXPR> to jump into a construct is
deprecated and will issue a warning.  Even then, it may not be used to
go into any construct that requires initialization, such as a
subroutine or a C<foreach> loop.  It also can't be used to go into a
construct that is optimized away.

The C<goto &NAME> form is quite different from the other forms of
L<C<goto>|/goto LABEL>.  In fact, it isn't a goto in the normal sense at
all, and doesn't have the stigma associated with other gotos.  Instead,
it exits the current subroutine (losing any changes set by
L<C<local>|/local EXPR>) and immediately calls in its place the named
subroutine using the current value of L<C<@_>|perlvar/@_>.  This is used
by C<AUTOLOAD> subroutines that wish to load another subroutine and then
pretend that the other subroutine had been called in the first place
(except that any modifications to L<C<@_>|perlvar/@_> in the current
subroutine are propagated to the other subroutine.) After the
L<C<goto>|/goto LABEL>, not even L<C<caller>|/caller EXPR> will be able
to tell that this routine was called first.
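
A minimal sketch of the C<AUTOLOAD> pattern described above.  The
package name C<Demo> and the generated function bodies are hypothetical:

    use strict;
    use warnings;

    package Demo;
    our $AUTOLOAD;

    sub AUTOLOAD {
        my $name = $AUTOLOAD;
        $name =~ s/.*:://;              # strip the package qualifier
        return if $name eq 'DESTROY';

        # Define the missing subroutine, then jump to it with the
        # current @_, as if it had been called in the first place.
        no strict 'refs';
        *{$AUTOLOAD} = sub { return "$name(@_)" };
        goto &{$AUTOLOAD};
    }

    package main;

    my $result = Demo::hello("world");
    print "$result\n";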

NAME needn't be the name of a subroutine; it can be a scalar variable
containing a code reference or a block that evaluates to a code
reference.

=item grep BLOCK LIST
X<grep>

=item grep EXPR,LIST

=for Pod::Functions locate elements in a list test true against a given criterion

This is similar in spirit to, but not the same as, L<grep(1)> and its
relatives.  In particular, it is not limited to using regular expressions.

Evaluates the BLOCK or EXPR for each element of LIST (locally setting
L<C<$_>|perlvar/$_> to each element) and returns the list value
consisting of those
elements for which the expression evaluated to true.  In scalar
context, returns the number of times the expression was true.

    my @foo = grep(!/^#/, @bar);    # weed out comments

or equivalently,

    my @foo = grep {!/^#/} @bar;    # weed out comments

Note that L<C<$_>|perlvar/$_> is an alias to the list value, so it can
be used to
modify the elements of the LIST.  While this is useful and supported,
it can cause bizarre results if the elements of LIST are not variables.
Similarly, grep returns aliases into the original list, much as a for
loop's index variable aliases the list elements.  That is, modifying an
element of a list returned by grep (for example, in a C<foreach>,
L<C<map>|/map BLOCK LIST> or another L<C<grep>|/grep BLOCK LIST>)
actually modifies the element in the original list.
This is usually something to be avoided when writing clear code.

See also L<C<map>|/map BLOCK LIST> for a list composed of the results of
the BLOCK or EXPR.
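
A small sketch of the two points above, the scalar-context count and the
aliasing of returned elements.  The variable names and data are
illustrative only.

    use strict;
    use warnings;

    my @bar   = ('# comment', 'alpha', 'beta');
    my $count = grep { !/^#/ } @bar;    # scalar context: number of matches

    my @nums = (1, 2, 3, 4);
    # grep returns aliases, so this changes the even elements of @nums.
    $_ *= 10 for grep { $_ % 2 == 0 } @nums;

    print "$count: @nums\n";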

=item hex EXPR
X<hex> X<hexadecimal>

=item hex

=for Pod::Functions convert a hexadecimal string to a number

Interprets EXPR as a hex string and returns the corresponding numeric value.
If EXPR is omitted, uses L<C<$_>|perlvar/$_>.

    print hex '0xAf'; # prints '175'
    print hex 'aF';   # same
    $valid_input =~ /\A(?:0?[xX])?(?:_?[0-9a-fA-F])*\z/

A hex string consists of hex digits and an optional C<0x> or C<x> prefix.
Each hex digit may be preceded by a single underscore, which will be ignored.
Any other character triggers a warning and causes the rest of the string
to be ignored (even leading whitespace, unlike L<C<oct>|/oct EXPR>).
Only integers can be represented, and integer overflow triggers a warning.

To convert strings that might start with any of C<0>, C<0x>, or C<0b>,
see L<C<oct>|/oct EXPR>.  To present something as hex, look into
L<C<printf>|/printf FILEHANDLE FORMAT, LIST>,
L<C<sprintf>|/sprintf FORMAT, LIST>, and
L<C<unpack>|/unpack TEMPLATE,EXPR>.
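
For instance, a minimal round trip between hex strings and numbers might
look like this (the variable names are only for illustration):

    use strict;
    use warnings;

    my $n      = hex 'ff_ff';               # underscores are ignored
    my $same   = hex '0xFFFF';              # optional prefix, any case
    my $shown  = sprintf '%x', 255;         # back to a hex string
    my $padded = sprintf '0x%04X', 255;     # upper case, zero padded

    print "$n $same $shown $padded\n";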

=item import LIST
X<import>

=for Pod::Functions patch a module's namespace into your own

There is no builtin L<C<import>|/import LIST> function.  It is just an
ordinary method (subroutine) defined (or inherited) by modules that wish
to export names to another module.  The
L<C<use>|/use Module VERSION LIST> function calls the
L<C<import>|/import LIST> method for the package used.  See also
L<C<use>|/use Module VERSION LIST>, L<perlmod>, and L<Exporter>.

=item index STR,SUBSTR,POSITION
X<index> X<indexOf> X<InStr>

=item index STR,SUBSTR

=for Pod::Functions find a substring within a string

The index function searches for one string within another, but without
the wildcard-like behavior of a full regular-expression pattern match.
It returns the position of the first occurrence of SUBSTR in STR at
or after POSITION.  If POSITION is omitted, starts searching from the
beginning of the string.  POSITION before the beginning of the string
or after its end is treated as if it were the beginning or the end,
respectively.  POSITION and the return value are based at zero.
If the substring is not found, L<C<index>|/index STR,SUBSTR,POSITION>
returns -1.
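
One common use of POSITION is to collect every occurrence of a substring
by restarting the search just past the previous hit.  The following is a
sketch with made-up data:

    use strict;
    use warnings;

    my $str = 'abracadabra';
    my @found;
    my $pos = 0;
    while ( ( $pos = index( $str, 'abra', $pos ) ) != -1 ) {
        push @found, $pos;
        $pos++;                 # resume one character past this hit
    }

    print "@found\n";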

=item int EXPR
X<int> X<integer> X<truncate> X<trunc> X<floor>

=item int

=for Pod::Functions get the integer portion of a number

Returns the integer portion of EXPR.  If EXPR is omitted, uses
L<C<$_>|perlvar/$_>.
You should not use this function for rounding: one because it truncates
towards C<0>, and two because machine representations of floating-point
numbers can sometimes produce counterintuitive results.  For example,
C<int(-6.725/0.025)> produces -268 rather than the correct -269; that's
because it's really more like -268.99999999999994315658 instead.  Usually,
the L<C<sprintf>|/sprintf FORMAT, LIST>,
L<C<printf>|/printf FILEHANDLE FORMAT, LIST>, or the
L<C<POSIX::floor>|POSIX/C<floor>> and L<C<POSIX::ceil>|POSIX/C<ceil>>
functions will serve you better than will L<C<int>|/int EXPR>.
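
The difference between truncation and the alternatives mentioned above
can be seen in a short sketch (the sample value is arbitrary):

    use strict;
    use warnings;
    use POSIX qw(floor ceil);

    my $x = -3.7;
    my $truncated = int($x);               # truncates toward zero
    my $floored   = floor($x);             # rounds toward negative infinity
    my $ceiled    = ceil($x);              # rounds toward positive infinity
    my $rounded   = sprintf '%.0f', $x;    # rounds to nearest

    print "$truncated $floored $ceiled $rounded\n";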

=item ioctl FILEHANDLE,FUNCTION,SCALAR
X<ioctl>

=for Pod::Functions system-dependent device control system call

Implements the L<ioctl(2)> function.  You'll probably first have to say

    require "sys/ioctl.ph";  # probably in
                             # $Config{archlib}/sys/ioctl.ph

to get the correct function definitions.  If F<sys/ioctl.ph> doesn't
exist or doesn't have the correct definitions you'll have to roll your
own, based on your C header files such as F<< <sys/ioctl.h> >>.
(There is a Perl script called B<h2ph> that comes with the Perl kit that
may help you in this, but it's nontrivial.)  SCALAR will be read and/or
written depending on the FUNCTION; a C pointer to the string value of SCALAR
will be passed as the third argument of the actual
L<C<ioctl>|/ioctl FILEHANDLE,FUNCTION,SCALAR> call.  (If SCALAR
has no string value but does have a numeric value, that value will be
passed rather than a pointer to the string value.  To guarantee this to be
true, add a C<0> to the scalar before using it.)  The
L<C<pack>|/pack TEMPLATE,LIST> and L<C<unpack>|/unpack TEMPLATE,EXPR>
functions may be needed to manipulate the values of structures used by
L<C<ioctl>|/ioctl FILEHANDLE,FUNCTION,SCALAR>.

The return value of L<C<ioctl>|/ioctl FILEHANDLE,FUNCTION,SCALAR> (and
L<C<fcntl>|/fcntl FILEHANDLE,FUNCTION,SCALAR>) is as follows:

    if OS returns:      then Perl returns:
        -1               undefined value
         0              string "0 but true"
    anything else           that number

Thus Perl returns true on success and false on failure, yet you can
still easily determine the actual value returned by the operating
system:

    my $retval = ioctl(...) || -1;
    printf "System returned %d\n", $retval;

The special string C<"0 but true"> is exempt from
L<C<Argument "..." isn't numeric>|perldiag/Argument "%s" isn't numeric%s>
L<warnings> on improper numeric conversions.
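
The behavior of that special string can be checked without making a real
system call.  In this sketch the string is assigned by hand to stand in
for a successful call whose underlying return value was zero:

    use strict;
    use warnings;

    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, @_ };

    my $retval = '0 but true';     # stand-in for a successful ioctl
    my $ok     = $retval ? 1 : 0;  # true in boolean context
    my $num    = $retval + 0;      # numeric value 0, no warning

    printf "ok=%d num=%d warnings=%d\n", $ok, $num, scalar @warnings;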

Portability issues: L<perlport/ioctl>.

=item join EXPR,LIST
X<join>

=for Pod::Functions join a list into a string using a separator

Joins the separate strings of LIST into a single string with fields
separated by the value of EXPR, and returns that new string.  Example:

   my $rec = join(':', $login,$passwd,$uid,$gid,$gcos,$home,$shell);

Beware that unlike L<C<split>|/split E<sol>PATTERNE<sol>,EXPR,LIMIT>,
L<C<join>|/join EXPR,LIST> doesn't take a pattern as its first argument.
Compare L<C<split>|/split E<sol>PATTERNE<sol>,EXPR,LIMIT>.

=item keys HASH
X<keys> X<key>

=item keys ARRAY

=for Pod::Functions retrieve list of indices from a hash

Called in list context, returns a list consisting of all the keys of the
named hash, or in Perl 5.12 or later only, the indices of an array.  Perl
releases prior to 5.12 will produce a syntax error if you try to use an
array argument.  In scalar context, returns the number of keys or indices.

Hash entries are returned in an apparently random order.  The actual random
order is specific to a given hash; the exact same series of operations
on two hashes may result in a different order for each hash.  Any insertion
into the hash may change the order, as will any deletion, with the exception
that the most recent key returned by L<C<each>|/each HASH> or
L<C<keys>|/keys HASH> may be deleted without changing the order.  So
long as a given hash is unmodified you may rely on
L<C<keys>|/keys HASH>, L<C<values>|/values HASH> and L<C<each>|/each
HASH> to repeatedly return the same order
as each other.  See L<perlsec/"Algorithmic Complexity Attacks"> for
details on why hash order is randomized.  Aside from the guarantees
provided here the exact details of Perl's hash algorithm and the hash
traversal order are subject to change in any release of Perl.  Tied hashes
may behave differently to Perl's hashes with respect to changes in order on
insertion and deletion of items.

As a side effect, calling L<C<keys>|/keys HASH> resets the internal
iterator of the HASH or ARRAY (see L<C<each>|/each HASH>).  In
particular, calling L<C<keys>|/keys HASH> in void context resets the
iterator with no other overhead.
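
A brief sketch of the iterator reset, using a throwaway hash.  Because
the hash is not modified between the calls, the first key seen after the
reset is the same as the first key seen originally:

    use strict;
    use warnings;

    my %h = ( a => 1, b => 2, c => 3 );

    my ($first) = each %h;    # advance the iterator by one entry
    keys %h;                  # void context: just reset the iterator
    my ($again) = each %h;    # iteration starts over

    print "$first $again\n";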

Here is yet another way to print your environment:

    my @keys = keys %ENV;
    my @values = values %ENV;
    while (@keys) {
        print pop(@keys), '=', pop(@values), "\n";
    }

or how about sorted by key:

    foreach my $key (sort(keys %ENV)) {
        print $key, '=', $ENV{$key}, "\n";
    }

The returned values are copies of the original keys in the hash, so
modifying them will not affect the original hash.  Compare
L<C<values>|/values HASH>.

To sort a hash by value, you'll need to use a
L<C<sort>|/sort SUBNAME LIST> function.  Here's a descending numeric
sort of a hash by its values:

    foreach my $key (sort { $hash{$b} <=> $hash{$a} } keys %hash) {
        printf "%4d %s\n", $hash{$key}, $key;
    }

Used as an lvalue, L<C<keys>|/keys HASH> allows you to increase the
number of hash buckets
allocated for the given hash.  This can gain you a measure of efficiency if
you know the hash is going to get big.  (This is similar to pre-extending
an array by assigning a larger number to $#array.)  If you say

    keys %hash = 200;

then C<%hash> will have at least 200 buckets allocated for it--256 of them,
in fact, since it rounds up to the next power of two.  These
buckets will be retained even if you do C<%hash = ()>; use C<undef
%hash> if you want to free the storage while C<%hash> is still in scope.
You can't shrink the number of buckets allocated for the hash using
L<C<keys>|/keys HASH> in this way (but you needn't worry about doing
this by accident, as trying has no effect).  C<keys @array> in an lvalue
context is a syntax error.

Starting with Perl 5.14, an experimental feature allowed
L<C<keys>|/keys HASH> to take a scalar expression. This experiment has
been deemed unsuccessful, and was removed as of Perl 5.24.

To avoid confusing would-be users of your code who are running earlier
versions of Perl with mysterious syntax errors, put this sort of thing at
the top of your file to signal that your code will work I<only> on Perls of
a recent vintage:

    use 5.012;	# so keys/values/each work on arrays

See also L<C<each>|/each HASH>, L<C<values>|/values HASH>, and
L<C<sort>|/sort SUBNAME LIST>.

=item kill SIGNAL, LIST

=item kill SIGNAL
X<kill> X<signal>

=for Pod::Functions send a signal to a process or process group

Sends a signal to a list of processes.  Returns the number of arguments
that were successfully used to signal (which is not necessarily the same
as the number of processes actually killed, e.g. where a process group is
killed).

    my $cnt = kill 'HUP', $child1, $child2;
    kill 'KILL', @goners;

SIGNAL may be either a signal name (a string) or a signal number.  A signal
name may start with a C<SIG> prefix, thus C<FOO> and C<SIGFOO> refer to the
same signal.  The string form of SIGNAL is recommended for portability because
the same signal may have different numbers in different operating systems.

A list of signal names supported by the current platform can be found in
C<$Config{sig_name}>, which is provided by the L<C<Config>|Config>
module.  See L<Config> for more details.

A negative signal name is the same as a negative signal number, killing process
groups instead of processes.  For example, C<kill '-KILL', $pgrp> and
C<kill -9, $pgrp> will send C<SIGKILL> to
the entire process group specified.  That
means you usually want to use positive, not negative, signals.

If SIGNAL is either the number 0 or the string C<ZERO> (or C<SIGZERO>),
no signal is sent to the process, but L<C<kill>|/kill SIGNAL, LIST>
checks whether it's I<possible> to send a signal to it
(that means, to be brief, that the process is owned by the same user, or we are
the super-user).  This is useful to check that a child process is still
alive (even if only as a zombie) and hasn't changed its UID.  See
L<perlport> for notes on the portability of this construct.
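
A sketch of such a liveness check follows.  Here the current process ID
C<$$> stands in for the PID of a child, purely so that the example is
self-contained:

    use strict;
    use warnings;

    my $pid   = $$;                # stand-in for a child's process ID
    my $alive = kill 0, $pid;      # sends nothing, only checks

    print $alive ? "process $pid can be signalled\n"
                 : "process $pid is gone or not ours\n";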

The behavior of kill when a I<PROCESS> number is zero or negative depends on
the operating system.  For example, on POSIX-conforming systems, zero will
signal the current process group, -1 will signal all processes, and any
other negative PROCESS number will act as a negative signal number and
kill the entire process group specified.

If both the SIGNAL and the PROCESS are negative, the results are undefined.
A warning may be produced in a future version.

See L<perlipc/"Signals"> for more details.

On some platforms such as Windows where the L<fork(2)> system call is not
available, Perl can be built to emulate L<C<fork>|/fork> at the
interpreter level.
This emulation has limitations related to kill that must be considered,
both in code running on Windows and in code intended to be portable.

See L<perlfork> for more details.

If there is no I<LIST> of processes, no signal is sent, and the return
value is 0.  This form is sometimes used, however, because it causes
tainting checks to be run.  But see
L<perlsec/Laundering and Detecting Tainted Data>.

Portability issues: L<perlport/kill>.

=item last LABEL
X<last> X<break>

=item last EXPR

=item last

=for Pod::Functions exit a block prematurely

The L<C<last>|/last LABEL> command is like the C<break> statement in C
(as used in
loops); it immediately exits the loop in question.  If the LABEL is
omitted, the command refers to the innermost enclosing
loop.  The C<last EXPR> form, available starting in Perl
5.18.0, allows a label name to be computed at run time,
and is otherwise identical to C<last LABEL>.  The
L<C<continue>|/continue BLOCK> block, if any, is not executed:

    LINE: while (<STDIN>) {
        last LINE if /^$/;  # exit when done with header
        #...
    }

L<C<last>|/last LABEL> cannot be used to exit a block that returns a
value such as C<eval {}>, C<sub {}>, or C<do {}>, and should not be used
to exit a L<C<grep>|/grep BLOCK LIST> or L<C<map>|/map BLOCK LIST>
operation.

Note that a block by itself is semantically identical to a loop
that executes once.  Thus L<C<last>|/last LABEL> can be used to effect
an early exit out of such a block.
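
Both forms can be sketched as follows: a labeled C<last> that leaves two
nested loops at once, and a plain C<last> that leaves a bare block early.
The data and variable names are illustrative.

    use strict;
    use warnings;

    my @grid = ( [ 1, 2 ], [ 3, -1 ], [ 5, 6 ] );
    my $seen = 0;
    ROW: for my $row (@grid) {
        for my $cell (@$row) {
            last ROW if $cell < 0;    # leaves both loops
            $seen++;
        }
    }

    my $status = 'unset';
    {
        $status = 'entered';
        last if $seen == 3;           # early exit from the bare block
        $status = 'completed';
    }

    print "$seen $status\n";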

See also L<C<continue>|/continue BLOCK> for an illustration of how
L<C<last>|/last LABEL>, L<C<next>|/next LABEL>, and
L<C<redo>|/redo LABEL> work.

Unlike most named operators, this has the same precedence as assignment.
It is also exempt from the looks-like-a-function rule, so
C<last ("foo")."bar"> will cause "bar" to be part of the argument to
L<C<last>|/last LABEL>.

=item lc EXPR
X<lc> X<lowercase>

=item lc

=for Pod::Functions return lower-case version of a string

Returns a lowercased version of EXPR.  This is the internal function
implementing the C<\L> escape in double-quoted strings.

If EXPR is omitted, uses L<C<$_>|perlvar/$_>.

What gets returned depends on several factors:

=over

=item If C<use bytes> is in effect:

The results follow ASCII rules.  Only the characters C<A-Z> change,
to C<a-z> respectively.

=item Otherwise, if C<use locale> for C<LC_CTYPE> is in effect:

Respects current C<LC_CTYPE> locale for code points < 256; and uses Unicode
rules for the remaining code points (this last can only happen if
the UTF8 flag is also set).  See L<perllocale>.

Starting in v5.20, Perl uses full Unicode rules if the locale is
UTF-8.  Otherwise, there is a deficiency in this scheme, which is that
case changes that cross the 255/256
boundary are not well-defined.  For example, the lower case of LATIN CAPITAL
LETTER SHARP S (U+1E9E) in Unicode rules is U+00DF (on ASCII
platforms).   But under C<use locale> (prior to v5.20 or not a UTF-8
locale), the lower case of U+1E9E is
itself, because 0xDF may not be LATIN SMALL LETTER SHARP S in the
current locale, and Perl has no way of knowing if that character even
exists in the locale, much less what code point it is.  Perl returns
a result that is above 255 (almost always the input character unchanged),
for all instances (and there aren't many) where the 255/256 boundary
would otherwise be crossed; and starting in v5.22, it raises a
L<locale|perldiag/Can't do %s("%s") on non-UTF-8 locale; resolved to "%s".> warning.

=item Otherwise, if EXPR has the UTF8 flag set:

Unicode rules are used for the case change.

=item Otherwise, if C<use feature 'unicode_strings'> or C<use locale ':not_characters'> is in effect:

Unicode rules are used for the case change.

=item Otherwise:

ASCII rules are used for the case change.  The lowercase of any character
outside the ASCII range is the character itself.

=back
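
The last three cases can be compared in a short sketch.  It assumes an
ASCII platform, that neither C<use locale> nor C<use bytes> is in effect,
and that the surrounding code has not already enabled the
C<unicode_strings> feature (for example through C<use v5.12>).

    use strict;
    use warnings;

    # No UTF8 flag and no Unicode semantics: non-ASCII is left alone.
    my $ascii_rules = lc "\xC9";

    # Same string, but with unicode_strings enabled for this block.
    my $unicode_rules = do {
        use feature 'unicode_strings';
        lc "\xC9";
    };

    # A code point above 255 sets the UTF8 flag: Unicode rules apply.
    my $wide = lc "\x{100}";

    printf "%vX %vX %vX\n", $ascii_rules, $unicode_rules, $wide;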

=item lcfirst EXPR
X<lcfirst> X<lowercase>

=item lcfirst

=for Pod::Functions return a string with just the next letter in lower case

Returns the value of EXPR with the first character lowercased.  This
is the internal function implementing the C<\l> escape in
double-quoted strings.

If EXPR is omitted, uses L<C<$_>|perlvar/$_>.

This function behaves the same way under various pragmas, such as in a locale,
as L<C<lc>|/lc EXPR> does.

=item length EXPR
X<length> X<size>

=item length

=for Pod::Functions return the number of characters in a string

Returns the length in I<characters> of the value of EXPR.  If EXPR is
omitted, returns the length of L<C<$_>|perlvar/$_>.  If EXPR is
undefined, returns L<C<undef>|/undef EXPR>.

This function cannot be used on an entire array or hash to find out how
many elements these have.  For that, use C<scalar @array> and C<scalar keys
%hash>, respectively.

Like all Perl character operations, L<C<length>|/length EXPR> normally
deals in logical
characters, not physical bytes.  For how many bytes a string encoded as
UTF-8 would take up, use C<length(Encode::encode('UTF-8', EXPR))>
(you'll have to C<use Encode> first).  See L<Encode> and L<perlunicode>.
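
The distinction between characters and encoded bytes can be sketched
with a single non-ASCII character (U+263A is an arbitrary choice):

    use strict;
    use warnings;
    use Encode ();

    my $smiley = "\x{263A}";
    my $chars  = length($smiley);                               # characters
    my $bytes  = length( Encode::encode( 'UTF-8', $smiley ) );  # UTF-8 bytes

    print "$chars character, $bytes bytes\n";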

=item __LINE__
X<__LINE__>

=for Pod::Functions the current source line number

A special token that compiles to the current line number.

=item link OLDFILE,NEWFILE
X<link>

=for Pod::Functions create a hard link in the filesystem

Creates a new filename linked to the old filename.  Returns true for
success, false otherwise.

Portability issues: L<perlport/link>.

=item listen SOCKET,QUEUESIZE
X<listen>

=for Pod::Functions register your socket as a server

Does the same thing that the L<listen(2)> system call does.  Returns true if
it succeeded, false otherwise.  See the example in
L<perlipc/"Sockets: Client/Server Communication">.

=item local EXPR
X<local>

=for Pod::Functions create a temporary value for a global variable (dynamic scoping)

You really probably want to be using L<C<my>|/my VARLIST> instead,
because L<C<local>|/local EXPR> isn't what most people think of as
"local".  See L<perlsub/"Private Variables via my()"> for details.

A local modifies the listed variables to be local to the enclosing
block, file, or eval.  If more than one value is listed, the list must
be placed in parentheses.  See L<perlsub/"Temporary Values via local()">
for details, including issues with tied arrays and hashes.

The C<delete local EXPR> construct can also be used to localize the deletion
of array/hash elements to the current block.
See L<perlsub/"Localized deletion of elements of composite types">.
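
A minimal sketch of the dynamic scoping that C<local> provides: the
temporary value is visible to subroutines called from the block and is
restored when the block exits.  The names C<$depth>, C<report>, and
C<descend> are hypothetical.

    use strict;
    use warnings;

    our $depth = 0;

    sub report  { return $depth }
    sub descend {
        local $depth = $depth + 1;   # temporary value for this call
        return report();             # callee sees the temporary value
    }

    my $inside = descend();
    my $after  = report();           # original value is back

    print "$inside $after\n";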

=item localtime EXPR
X<localtime> X<ctime>

=item localtime

=for Pod::Functions convert UNIX time into record or string using local time

Converts a time as returned by the time function to a 9-element list
with the time analyzed for the local time zone.  Typically used as
follows:

    #     0    1    2     3     4    5     6     7     8
    my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
                                                localtime(time);

All list elements are numeric and come straight out of the C
C<struct tm>.  C<$sec>, C<$min>, and C<$hour> are the seconds, minutes,
and hours of the specified time.

C<$mday> is the day of the month and C<$mon> the month in
the range C<0..11>, with 0 indicating January and 11 indicating December.
This makes it easy to get a month name from a list:

    my @abbr = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
    print "$abbr[$mon] $mday";
    # $mon=9, $mday=18 gives "Oct 18"

C<$year> contains the number of years since 1900.  To get a 4-digit
year write:

    $year += 1900;

To get the last two digits of the year (e.g., "01" in 2001) do:

    $year = sprintf("%02d", $year % 100);

C<$wday> is the day of the week, with 0 indicating Sunday and 3 indicating
Wednesday.  C<$yday> is the day of the year, in the range C<0..364>
(or C<0..365> in leap years).

C<$isdst> is true if the specified time occurs during Daylight Saving
Time, false otherwise.
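
Putting the offsets for C<$mon> and C<$year> together, a date string can
be built as in the following sketch.  The helper name C<iso_date> is
hypothetical; it accepts the 9-element list so that it can be exercised
with fixed values as well as with the current time.

    use strict;
    use warnings;

    sub iso_date {
        my ( $mday, $mon, $year ) = @_[ 3, 4, 5 ];
        # $year counts from 1900 and $mon from 0, so adjust both.
        return sprintf '%04d-%02d-%02d', $year + 1900, $mon + 1, $mday;
    }

    print iso_date(localtime), "\n";

    # The same helper with hand-written list values:
    my $fixed = iso_date( 34, 54, 4, 13, 9, 94, 4, 285, 0 );
    print "$fixed\n";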

If EXPR is omitted, L<C<localtime>|/localtime EXPR> uses the current
time (as returned by L<C<time>|/time>).

In scalar context, L<C<localtime>|/localtime EXPR> returns the
L<ctime(3)> value:

    my $now_string = localtime;  # e.g., "Thu Oct 13 04:54:34 1994"

The format of this scalar value is B<not> locale-dependent but built
into Perl.  For GMT instead of local time use the
L<C<gmtime>|/gmtime EXPR> builtin.  See also the
L<C<Time::Local>|Time::Local> module (for converting seconds, minutes,
hours, and such back to the integer value returned by L<C<time>|/time>),
and the L<POSIX> module's L<C<strftime>|POSIX/C<strftime>> and
L<C<mktime>|POSIX/C<mktime>> functions.

To get somewhat similar but locale-dependent date strings, set up your
locale environment variables appropriately (please see L<perllocale>) and
try for example:

    use POSIX qw(strftime);
    my $now_string = strftime "%a %b %e %H:%M:%S %Y", localtime;
    # or for GMT formatted appropriately for your locale:
    my $now_string = strftime "%a %b %e %H:%M:%S %Y", gmtime;

Note that C<%a> and C<%b>, the short forms of the day of the week
and the month of the year, may not necessarily be three characters wide.

The L<Time::gmtime> and L<Time::localtime> modules provide a convenient,
by-name access mechanism to the L<C<gmtime>|/gmtime EXPR> and
L<C<localtime>|/localtime EXPR> functions, respectively.

For a comprehensive date and time representation look at the
L<DateTime> module on CPAN.

Portability issues: L<perlport/localtime>.

=item lock THING
X<lock>

=for Pod::Functions +5.005 get a thread lock on a variable, subroutine, or method

This function places an advisory lock on a shared variable or referenced
object contained in I<THING> until the lock goes out of scope.

The value returned is the scalar itself, if the argument is a scalar, or a
reference, if the argument is a hash, array or subroutine.

L<C<lock>|/lock THING> is a "weak keyword"; this means that if you've
defined a function
by this name (before any calls to it), that function will be called
instead.  If you are not under C<use threads::shared> this does nothing.
See L<threads::shared>.

=item log EXPR
X<log> X<logarithm> X<e> X<ln> X<base>

=item log

=for Pod::Functions retrieve the natural logarithm for a number

Returns the natural logarithm (base I<e>) of EXPR.  If EXPR is omitted,
returns the log of L<C<$_>|perlvar/$_>.  To get the
log of another base, use basic algebra:
The base-N log of a number is equal to the natural log of that number
divided by the natural log of N.  For example:

    sub log10 {
        my $n = shift;
        return log($n)/log(10);
    }

See also L<C<exp>|/exp EXPR> for the inverse operation.

=item lstat FILEHANDLE
X<lstat>

=item lstat EXPR

=item lstat DIRHANDLE

=item lstat

=for Pod::Functions stat a symbolic link

Does the same thing as the L<C<stat>|/stat FILEHANDLE> function
(including setting the special C<_> filehandle) but stats a symbolic
link instead of the file the symbolic link points to.  If symbolic links
are unimplemented on your system, a normal L<C<stat>|/stat FILEHANDLE>
is done.  For much more detailed information, please see the
documentation for L<C<stat>|/stat FILEHANDLE>.

If EXPR is omitted, stats L<C<$_>|perlvar/$_>.

Portability issues: L<perlport/lstat>.

=item m//

=for Pod::Functions match a string with a regular expression pattern

The match operator.  See L<perlop/"Regexp Quote-Like Operators">.

=item map BLOCK LIST
X<map>

=item map EXPR,LIST

=for Pod::Functions apply a change to a list to get back a new list with the changes

Evaluates the BLOCK or EXPR for each element of LIST (locally setting
L<C<$_>|perlvar/$_> to each element) and returns the list value composed
of the
results of each such evaluation.  In scalar context, returns the
total number of elements so generated.  Evaluates BLOCK or EXPR in
list context, so each element of LIST may produce zero, one, or
more elements in the returned value.

    my @chars = map(chr, @numbers);

translates a list of numbers to the corresponding characters.

    my @squares = map { $_ * $_ } @numbers;

translates a list of numbers to their squared values.

    my @squares = map { $_ > 5 ? ($_ * $_) : () } @numbers;

shows that the number of returned elements can differ from the number of
input elements.  To omit an element, return an empty list ().
This could also be achieved by writing

    my @squares = map { $_ * $_ } grep { $_ > 5 } @numbers;

which makes the intention more clear.

Map always returns a list, which can be
assigned to a hash such that the elements
become key/value pairs.  See L<perldata> for more details.

    my %hash = map { get_a_key_for($_) => $_ } @array;

is just a funny way to write

    my %hash;
    foreach (@array) {
        $hash{get_a_key_for($_)} = $_;
    }

Note that L<C<$_>|perlvar/$_> is an alias to the list value, so it can
be used to modify the elements of the LIST.  While this is useful and
supported, it can cause bizarre results if the elements of LIST are not
variables.  Using a regular C<foreach> loop for this purpose would be
clearer in most cases.  See also L<C<grep>|/grep BLOCK LIST> for a
list composed of those items of the original list for which the BLOCK
or EXPR evaluates to true.

C<{> starts both hash references and blocks, so C<map { ...> could be either
the start of map BLOCK LIST or map EXPR, LIST.  Because Perl doesn't look
ahead for the closing C<}> it has to take a guess at which it's dealing with
based on what it finds just after the
C<{>.  Usually it gets it right, but if it
doesn't it won't realize something is wrong until it gets to the C<}> and
encounters the missing (or unexpected) comma.  The syntax error will be
reported close to the C<}>, but you'll need to change something near the C<{>
such as using a unary C<+> or semicolon to give Perl some help:

 my %hash = map {  "\L$_" => 1  } @array # perl guesses EXPR. wrong
 my %hash = map { +"\L$_" => 1  } @array # perl guesses BLOCK. right
 my %hash = map {; "\L$_" => 1  } @array # this also works
 my %hash = map { ("\L$_" => 1) } @array # as does this
 my %hash = map {  lc($_) => 1  } @array # and this.
 my %hash = map +( lc($_) => 1 ), @array # this is EXPR and works!

 my %hash = map  ( lc($_), 1 ),   @array # evaluates to (1, @array)

or to force an anon hash constructor use C<+{>:

    my @hashes = map +{ lc($_) => 1 }, @array # EXPR, so needs
                                              # comma at end

to get a list of anonymous hashes each with only one entry apiece.

=item mkdir FILENAME,MASK
X<mkdir> X<md> X<directory, create>

=item mkdir FILENAME

=item mkdir

=for Pod::Functions create a directory

Creates the directory specified by FILENAME, with permissions
specified by MASK (as modified by L<C<umask>|/umask EXPR>).  If it
succeeds it returns true; otherwise it returns false and sets
L<C<$!>|perlvar/$!> (errno).
MASK defaults to 0777 if omitted, and FILENAME defaults
to L<C<$_>|perlvar/$_> if omitted.

In general, it is better to create directories with a permissive MASK
and let the user modify that with their L<C<umask>|/umask EXPR> than it
is to supply
a restrictive MASK and give the user no way to be more permissive.
The exceptions to this rule are when the file or directory should be
kept private (mail files, for instance).  The documentation for
L<C<umask>|/umask EXPR> discusses the choice of MASK in more detail.

Note that according to POSIX 1003.1-1996, FILENAME may have any
number of trailing slashes.  Some operating systems and filesystems do
not get this right, so Perl automatically removes all trailing slashes
to keep everyone happy.

To recursively create a directory structure, look at
the L<C<make_path>|File::Path/make_path( $dir1, $dir2, .... )> function
of the L<File::Path> module.
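
The following sketch contrasts a single-level C<mkdir> with
C<make_path>.  It works inside a temporary directory from L<File::Temp>
so that nothing is left behind; the directory names are arbitrary.

    use strict;
    use warnings;
    use File::Temp qw(tempdir);
    use File::Path qw(make_path);

    my $base = tempdir( CLEANUP => 1 );

    # One level at a time: check the result and report $! on failure.
    mkdir( "$base/logs", 0755 )
        or die "Cannot create $base/logs: $!";

    # The parents are missing, so plain mkdir fails and sets $!.
    my $nested_ok = mkdir "$base/a/b/c";
    my $error     = "$!";

    # make_path creates the intermediate directories as needed.
    make_path("$base/a/b/c");

    print "nested mkdir failed: $error\n" unless $nested_ok;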

=item msgctl ID,CMD,ARG
X<msgctl>

=for Pod::Functions SysV IPC message control operations

Calls the System V IPC function L<msgctl(2)>.  You'll probably have to say

    use IPC::SysV;

first to get the correct constant definitions.  If CMD is C<IPC_STAT>,
then ARG must be a variable that will hold the returned C<msqid_ds>
structure.  Returns like L<C<ioctl>|/ioctl FILEHANDLE,FUNCTION,SCALAR>:
the undefined value for error, C<"0 but true"> for zero, or the actual
return value otherwise.  See also L<perlipc/"SysV IPC"> and the
documentation for L<C<IPC::SysV>|IPC::SysV> and
L<C<IPC::Semaphore>|IPC::Semaphore>.

Portability issues: L<perlport/msgctl>.

=item msgget KEY,FLAGS
X<msgget>

=for Pod::Functions get SysV IPC message queue

Calls the System V IPC function L<msgget(2)>.  Returns the message queue
id, or L<C<undef>|/undef EXPR> on error.  See also L<perlipc/"SysV IPC">
and the documentation for L<C<IPC::SysV>|IPC::SysV> and
L<C<IPC::Msg>|IPC::Msg>.

Portability issues: L<perlport/msgget>.

=item msgrcv ID,VAR,SIZE,TYPE,FLAGS
X<msgrcv>

=for Pod::Functions receive a SysV IPC message from a message queue

Calls the System V IPC function msgrcv to receive a message from
message queue ID into variable VAR with a maximum message size of
SIZE.  Note that when a message is received, the message type as a
native long integer will be the first thing in VAR, followed by the
actual message.  This packing may be unpacked with C<unpack("l! a*")>.
Taints the variable.  Returns true if successful, false
on error.  See also L<perlipc/"SysV IPC"> and the documentation for
L<C<IPC::SysV>|IPC::SysV> and L<C<IPC::Msg>|IPC::Msg>.

Portability issues: L<perlport/msgrcv>.

=item msgsnd ID,MSG,FLAGS
X<msgsnd>

=for Pod::Functions send a SysV IPC message to a message queue

Calls the System V IPC function msgsnd to send the message MSG to the
message queue ID.  MSG must begin with the native long integer message
type, followed by the message itself.  This kind of packing can be
achieved with
C<pack("l! a*", $type, $message)>.  Returns true if successful,
false on error.  See also L<perlipc/"SysV IPC"> and the documentation
for L<C<IPC::SysV>|IPC::SysV> and L<C<IPC::Msg>|IPC::Msg>.
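
The buffer layout used by C<msgsnd> and C<msgrcv> can be examined
without creating a queue.  This sketch only packs and unpacks a buffer;
the type number and message text are arbitrary.

    use strict;
    use warnings;

    my $buf = pack 'l! a*', 7, 'hello';          # native long, then text
    my ( $type, $text ) = unpack 'l! a*', $buf;  # the reverse operation

    my $header_size = length pack( 'l!', 0 );    # size of a native long
    printf "type=%d text=%s header=%d bytes\n",
        $type, $text, $header_size;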

Portability issues: L<perlport/msgsnd>.

=item my VARLIST
X<my>

=item my TYPE VARLIST

=item my VARLIST : ATTRS

=item my TYPE VARLIST : ATTRS

=for Pod::Functions declare and assign a local variable (lexical scoping)

A L<C<my>|/my VARLIST> declares the listed variables to be local
(lexically) to the enclosing block, file, or L<C<eval>|/eval EXPR>.  If
more than one variable is listed, the list must be placed in
parentheses.

The exact semantics and interface of TYPE and ATTRS are still
evolving.  TYPE may be a bareword, a constant declared
with L<C<use constant>|constant>, or L<C<__PACKAGE__>|/__PACKAGE__>.  It
is
currently bound to the use of the L<fields> pragma,
and attributes are handled using the L<attributes> pragma, or starting
from Perl 5.8.0 also via the L<Attribute::Handlers> module.  See
L<perlsub/"Private Variables via my()"> for details.

Note that with a parenthesised list, L<C<undef>|/undef EXPR> can be used
as a dummy placeholder, for example to skip assignment of initial
values:

    my ( undef, $min, $hour ) = localtime;

=item next LABEL
X<next> X<continue>

=item next EXPR

=item next

=for Pod::Functions iterate a block prematurely

The L<C<next>|/next LABEL> command is like the C<continue> statement in
C; it starts the next iteration of the loop:

    LINE: while (<STDIN>) {
        next LINE if /^#/;  # discard comments
        #...
    }

Note that if there were a L<C<continue>|/continue BLOCK> block on the
above, it would get
executed even on discarded lines.  If LABEL is omitted, the command
refers to the innermost enclosing loop.  The C<next EXPR> form, available
as of Perl 5.18.0, allows a label name to be computed at run time, being
otherwise identical to C<next LABEL>.
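
The interaction with a C<continue> block can be sketched with two
counters (the names and input lines are illustrative):

    use strict;
    use warnings;

    my ( $kept, $seen ) = ( 0, 0 );
    for my $line ( '# note', 'code', '# more' ) {
        next if $line =~ /^#/;    # skip comment lines
        $kept++;
    }
    continue {
        $seen++;                  # still runs for the skipped lines
    }

    print "$kept kept of $seen seen\n";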

L<C<next>|/next LABEL> cannot be used to exit a block which returns a
value such as C<eval {}>, C<sub {}>, or C<do {}>, and should not be used
to exit a L<C<grep>|/grep BLOCK LIST> or L<C<map>|/map BLOCK LIST>
operation.

Note that a block by itself is semantically identical to a loop
that executes once.  Thus L<C<next>|/next LABEL> will exit such a block
early.

See also L<C<continue>|/continue BLOCK> for an illustration of how
L<C<last>|/last LABEL>, L<C<next>|/next LABEL>, and
L<C<redo>|/redo LABEL> work.

Unlike most named operators, this has the same precedence as assignment.
It is also exempt from the looks-like-a-function rule, so
C<next ("foo")."bar"> will cause "bar" to be part of the argument to
L<C<next>|/next LABEL>.

=item no MODULE VERSION LIST
X<no declarations>
X<unimporting>

=item no MODULE VERSION

=item no MODULE LIST

=item no MODULE

=item no VERSION

=for Pod::Functions unimport some module symbols or semantics at compile time

See the L<C<use>|/use Module VERSION LIST> function, of which
L<C<no>|/no MODULE VERSION LIST> is the opposite.

=item oct EXPR
X<oct> X<octal> X<hex> X<hexadecimal> X<binary> X<bin>

=item oct

=for Pod::Functions convert a string to an octal number

Interprets EXPR as an octal string and returns the corresponding
value.  (If EXPR happens to start off with C<0x>, interprets it as a
hex string.  If EXPR starts off with C<0b>, it is interpreted as a
binary string.  Leading whitespace is ignored in all three cases.)
The following will handle decimal, binary, octal, and hex in standard
Perl notation:

    $val = oct($val) if $val =~ /^0/;

If EXPR is omitted, uses L<C<$_>|perlvar/$_>.   To go the other way
(produce a number in octal), use L<C<sprintf>|/sprintf FORMAT, LIST> or
L<C<printf>|/printf FILEHANDLE FORMAT, LIST>:

    my $dec_perms = (stat("filename"))[2] & 07777;
    my $oct_perm_str = sprintf "%o", $dec_perms;

The L<C<oct>|/oct EXPR> function is commonly used when a string such as
C<644> needs
to be converted into a file mode, for example.  Although Perl
automatically converts strings into numbers as needed, this automatic
conversion assumes base 10.

Leading white space is ignored without warning, as too are any trailing
non-digits, such as a decimal point (L<C<oct>|/oct EXPR> only handles
non-negative integers, not negative integers or floating point).
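
The conversions described above can be sketched as follows.  The helper
name C<parse_number> is hypothetical, not a built-in; it simply wraps
the C<oct(...) if /^0/> idiom shown earlier:

    use strict;
    use warnings;

    # parse_number() is a hypothetical helper name, not a built-in.
    sub parse_number {
        my ($text) = @_;
        return $text =~ /^0/ ? oct($text) : $text + 0;
    }

    my $mode    = oct("644");            # octal digits, no prefix needed
    my $hex     = parse_number("0x1F");
    my $binary  = parse_number("0b101");
    my $octal   = parse_number("0755");
    my $decimal = parse_number("644");   # plain base-10 conversion

    printf "%d %d %d %d %d\n", $mode, $hex, $binary, $octal, $decimal;
    printf "%o\n", $mode;                # back to octal digits

The first line printed is C<420 31 5 493 644> and the second is C<644>,
which shows why a string such as C<644> has to go through
C<oct> before it is used as a file mode.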

=item open FILEHANDLE,EXPR
X<open> X<pipe> X<file, open> X<fopen>

=item open FILEHANDLE,MODE,EXPR

=item open FILEHANDLE,MODE,EXPR,LIST

=item open FILEHANDLE,MODE,REFERENCE

=item open FILEHANDLE

=for Pod::Functions open a file, pipe, or descriptor

Opens the file whose filename is given by EXPR, and associates it with
FILEHANDLE.

Simple examples to open a file for reading:

    open(my $fh, "<", "input.txt")
	or die "Can't open < input.txt: $!";

and for writing:

    open(my $fh, ">", "output.txt")
	or die "Can't open > output.txt: $!";

(The following is a comprehensive reference to
L<C<open>|/open FILEHANDLE,EXPR>: for a gentler introduction you may
consider L<perlopentut>.)

If FILEHANDLE is an undefined scalar variable (or array or hash element), a
new filehandle is autovivified, meaning that the variable is assigned a
reference to a newly allocated anonymous filehandle.  Otherwise if
FILEHANDLE is an expression, its value is the real filehandle.  (This is
considered a symbolic reference, so C<use strict "refs"> should I<not> be
in effect.)

If three (or more) arguments are specified, the open mode (including
optional encoding) in the second argument is distinct from the filename in
the third.  If MODE is C<< < >> or nothing, the file is opened for input.
If MODE is C<< > >>, the file is opened for output, with existing files
first being truncated ("clobbered") and nonexisting files newly created.
If MODE is C<<< >> >>>, the file is opened for appending, again being
created if necessary.

You can put a C<+> in front of the C<< > >> or C<< < >> to
indicate that you want both read and write access to the file; thus
C<< +< >> is almost always preferred for read/write updates--the
C<< +> >> mode would clobber the file first.  You can't usually use
either read-write mode for updating textfiles, since they have
variable-length records.  See the B<-i> switch in L<perlrun> for a
better approach.  The file is created with permissions of C<0666>
modified by the process's L<C<umask>|/umask EXPR> value.

These various prefixes correspond to the L<fopen(3)> modes of C<r>,
C<r+>, C<w>, C<w+>, C<a>, and C<a+>.
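
The following sketch exercises the write, append, update, and read modes
in turn.  It assumes a scratch directory from L<File::Temp>; the file
name is a placeholder.  The C<< +< >> step only overwrites bytes in
place with a string of the same length, which sidesteps the
variable-length record problem mentioned above:

    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    # The scratch directory and file name are placeholders.
    my $dir  = tempdir(CLEANUP => 1);
    my $path = "$dir/modes.txt";

    open(my $out, ">", $path) or die "Can't open > $path: $!";
    print $out "first\n";
    close($out) or die "close failed: $!";

    open(my $app, ">>", $path) or die "Can't open >> $path: $!";
    print $app "second\n";
    close($app) or die "close failed: $!";

    # "+<" keeps the existing contents; overwrite five bytes in place.
    open(my $upd, "+<", $path) or die "Can't open +< $path: $!";
    print $upd "FIRST";
    close($upd) or die "close failed: $!";

    open(my $in, "<", $path) or die "Can't open < $path: $!";
    my @lines = <$in>;
    close($in) or die "close failed: $!";

At the end C<@lines> contains C<"FIRST\n"> and C<"second\n">.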

In the one- and two-argument forms of the call, the mode and filename
should be concatenated (in that order), preferably separated by white
space.  You can--but shouldn't--omit the mode in these forms when that mode
is C<< < >>.  It is safe to use the two-argument form of
L<C<open>|/open FILEHANDLE,EXPR> if the filename argument is a known literal.

For three or more arguments if MODE is C<|->, the filename is
interpreted as a command to which output is to be piped, and if MODE
is C<-|>, the filename is interpreted as a command that pipes
output to us.  In the two-argument (and one-argument) form, one should
replace dash (C<->) with the command.
See L<perlipc/"Using open() for IPC"> for more examples of this.
(You are not allowed to L<C<open>|/open FILEHANDLE,EXPR> to a command
that pipes both in I<and> out, but see L<IPC::Open2>, L<IPC::Open3>, and
L<perlipc/"Bidirectional Communication with Another Process"> for
alternatives.)

In the form of pipe opens taking three or more arguments, if LIST is specified
(extra arguments after the command name) then LIST becomes arguments
to the command invoked if the platform supports it.  The meaning of
L<C<open>|/open FILEHANDLE,EXPR> with more than three arguments for
non-pipe modes is not yet defined, but experimental "layers" may give
extra LIST arguments meaning.

In the two-argument (and one-argument) form, opening C<< <- >>
or C<-> opens STDIN and opening C<< >- >> opens STDOUT.

You may (and usually should) use the three-argument form of open to specify
I/O layers (sometimes referred to as "disciplines") to apply to the handle
that affect how the input and output are processed (see L<open> and
L<PerlIO> for more details).  For example:

  open(my $fh, "<:encoding(UTF-8)", $filename)
    || die "Can't open UTF-8 encoded $filename: $!";

opens the UTF8-encoded file containing Unicode characters;
see L<perluniintro>.  Note that if layers are specified in the
three-argument form, then default layers stored in ${^OPEN} (see L<perlvar>;
usually set by the L<open> pragma or the switch C<-CioD>) are ignored.
Those layers will also be ignored if you specify a colon with no name
following it.  In that case the default layer for the operating system
(:raw on Unix, :crlf on Windows) is used.
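
To illustrate what a layer changes, this sketch writes one character
through C<:encoding(UTF-8)> and then reads the file back twice, once as
raw bytes and once decoded.  The scratch directory comes from
L<File::Temp> and the file name is a placeholder:

    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    my $dir  = tempdir(CLEANUP => 1);
    my $path = "$dir/smiley.txt";       # placeholder file name

    open(my $out, ">:encoding(UTF-8)", $path)
        or die "Can't open > $path: $!";
    print $out "\x{263A}";              # one character, U+263A
    close($out) or die "close failed: $!";

    open(my $raw, "<:raw", $path) or die "Can't open < $path: $!";
    my $bytes = do { local $/; <$raw> };
    close($raw) or die "close failed: $!";

    open(my $in, "<:encoding(UTF-8)", $path)
        or die "Can't open < $path: $!";
    my $chars = do { local $/; <$in> };
    close($in) or die "close failed: $!";

    printf "%d bytes on disk, %d character when decoded\n",
        length($bytes), length($chars);

The file holds three bytes, which decode to a single character.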

Open returns nonzero on success, the undefined value otherwise.  If
the L<C<open>|/open FILEHANDLE,EXPR> involved a pipe, the return value
happens to be the pid of the subprocess.

On some systems (in general, DOS- and Windows-based systems)
L<C<binmode>|/binmode FILEHANDLE, LAYER> is necessary when you're not
working with a text file.  For the sake of portability it is a good idea
always to use it when appropriate, and never to use it when it isn't
appropriate.  Also, people can set their I/O to be by default
UTF8-encoded Unicode, not bytes.

When opening a file, it's seldom a good idea to continue
if the request failed, so L<C<open>|/open FILEHANDLE,EXPR> is frequently
used with L<C<die>|/die LIST>.  Even if L<C<die>|/die LIST> won't do
what you want (say, in a CGI script, where you want to format a
suitable error message, although there are modules that can help with
that problem), always check
the return value from opening a file.

The filehandle will be closed when its reference count reaches zero.
If it is a lexically scoped variable declared with L<C<my>|/my VARLIST>,
that usually
means the end of the enclosing scope.  However, this automatic close
does not check for errors, so it is better to explicitly close
filehandles, especially those used for writing:

    close($handle)
       || warn "close failed: $!";

An older style is to use a bareword as the filehandle, as

    open(FH, "<", "input.txt")
       or die "Can't open < input.txt: $!";

Then you can use C<FH> as the filehandle, in C<< close FH >> and C<<
<FH> >> and so on.  Note that it's a global variable, so this form is
not recommended in new code.

As a shortcut a one-argument call takes the filename from the global
scalar variable of the same name as the filehandle:

    $ARTICLE = 100;
    open(ARTICLE) or die "Can't find article $ARTICLE: $!\n";

Here C<$ARTICLE> must be a global (package) scalar variable - not one
declared with L<C<my>|/my VARLIST> or L<C<state>|/state VARLIST>.

As a special case the three-argument form with a read/write mode and the third
argument being L<C<undef>|/undef EXPR>:

    open(my $tmp, "+>", undef) or die ...

opens a filehandle to a newly created empty anonymous temporary file.
(This happens under any mode, which makes C<< +> >> the only useful and
sensible mode to use.)  You will need to
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE> to do the reading.
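
A minimal sketch of the write-then-rewind pattern for such a temporary
file (the data written is arbitrary):

    use strict;
    use warnings;

    open(my $tmp, "+>", undef) or die "Can't create temporary file: $!";
    print $tmp "scratch data\n";

    # Rewind before reading back what was just written.
    seek($tmp, 0, 0) or die "Can't seek: $!";
    my $line = <$tmp>;
    close($tmp) or die "close failed: $!";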

Perl is built using PerlIO by default.  Unless you've
changed this (such as building Perl with C<Configure -Uuseperlio>), you can
open filehandles directly to Perl scalars via:

    open(my $fh, ">", \$variable) || ..

To (re)open C<STDOUT> or C<STDERR> as an in-memory file, close it first:

    close STDOUT;
    open(STDOUT, ">", \$variable)
	or die "Can't open STDOUT: $!";

See L<perliol> for detailed info on PerlIO.
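
In-memory files can be read as well as written.  As a small sketch,
assuming a PerlIO-enabled perl (the variable names are illustrative),
the following reads lines out of one scalar and appends a summary to
another:

    use strict;
    use warnings;

    my $text = "alpha\nbeta\ngamma\n";
    open(my $in, "<", \$text) or die "Can't open in-memory file: $!";
    my @lines = <$in>;
    close($in) or die "close failed: $!";

    my $report = "";
    open(my $out, ">>", \$report) or die "Can't open in-memory file: $!";
    printf $out "%d lines\n", scalar @lines;
    close($out) or die "close failed: $!";

Afterwards C<$report> contains C<"3 lines\n">.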

General examples:

 open(my $log, ">>", "/usr/spool/news/twitlog");
 # if the open fails, output is discarded

 open(my $dbase, "+<", "dbase.mine")      # open for update
     or die "Can't open 'dbase.mine' for update: $!";

 open(my $dbase, "+<dbase.mine")          # ditto
     or die "Can't open 'dbase.mine' for update: $!";

 open(my $article_fh, "-|", "caesar <$article")  # decrypt
                                                 # article
     or die "Can't start caesar: $!";

 open(my $article_fh, "caesar <$article |")      # ditto
     or die "Can't start caesar: $!";

 open(my $out_fh, "|-", "sort >Tmp$$")    # $$ is our process id
     or die "Can't start sort: $!";

 # in-memory files
 open(my $memory, ">", \$var)
     or die "Can't open memory file: $!";
 print $memory "foo!\n";              # output will appear in $var

You may also, in the Bourne shell tradition, specify an EXPR beginning
with C<< >& >>, in which case the rest of the string is interpreted
as the name of a filehandle (or file descriptor, if numeric) to be
duped (as in L<dup(2)>) and opened.  You may use C<&> after C<< > >>,
C<<< >> >>>, C<< < >>, C<< +> >>, C<<< +>> >>>, and C<< +< >>.
The mode you specify should match the mode of the original filehandle.
(Duping a filehandle does not take into account any existing contents
of IO buffers.)  If you use the three-argument
form, then you can pass either a
number, the name of a filehandle, or the normal "reference to a glob".

Here is a script that saves, redirects, and restores C<STDOUT> and
C<STDERR> using various methods:

    #!/usr/bin/perl
    open(my $oldout, ">&STDOUT")     or die "Can't dup STDOUT: $!";
    open(OLDERR,     ">&", \*STDERR) or die "Can't dup STDERR: $!";

    open(STDOUT, '>', "foo.out") or die "Can't redirect STDOUT: $!";
    open(STDERR, ">&STDOUT")     or die "Can't dup STDOUT: $!";

    select STDERR; $| = 1;  # make unbuffered
    select STDOUT; $| = 1;  # make unbuffered

    print STDOUT "stdout 1\n";  # this works for
    print STDERR "stderr 1\n";  # subprocesses too

    open(STDOUT, ">&", $oldout) or die "Can't dup \$oldout: $!";
    open(STDERR, ">&OLDERR")    or die "Can't dup OLDERR: $!";

    print STDOUT "stdout 2\n";
    print STDERR "stderr 2\n";

If you specify C<< '<&=X' >>, where C<X> is a file descriptor number
or a filehandle, then Perl will do an equivalent of C's L<fdopen(3)> of
that file descriptor (and not call L<dup(2)>); this is more
parsimonious of file descriptors.  For example:

    # open for input, reusing the fileno of $fd
    open(my $fh, "<&=", $fd)

or

    open(my $fh, "<&=$fd")

or

    # open for append, using the fileno of $oldfh
    open(my $fh, ">>&=", $oldfh)

Sharing a file descriptor this way is also useful, beyond saving
descriptors, when something depends on the descriptor itself, such as
locking with
L<C<flock>|/flock FILEHANDLE,OPERATION>.  If you do just
C<< open(my $A, ">>&", $B) >>, the filehandle C<$A> will not have the
same file descriptor as C<$B>, and therefore C<flock($A)> will not
C<flock($B)> nor vice versa.  But with C<< open(my $A, ">>&=", $B) >>,
the filehandles will share the same underlying system file descriptor.
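
The difference can be observed with L<C<fileno>|/fileno FILEHANDLE>.
This is only a sketch: the scratch file comes from L<File::Temp>, and
the handle names C<$dup> and C<$alias> are illustrative:

    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    my $dir  = tempdir(CLEANUP => 1);
    my $path = "$dir/shared.log";       # placeholder file name

    open(my $orig,  ">>",   $path) or die "Can't open >> $path: $!";
    open(my $dup,   ">>&",  $orig) or die "Can't dup: $!";
    open(my $alias, ">>&=", $orig) or die "Can't fdopen: $!";

    # dup(2) hands out a new descriptor; the "=" form reuses the old one.
    my $dup_differs  = fileno($dup)   != fileno($orig);
    my $alias_shares = fileno($alias) == fileno($orig);

Both flags should be true, so a lock taken through C<$alias> applies to
the descriptor behind C<$orig>, while C<$dup> has a descriptor of its own.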

Note that under Perls older than 5.8.0, Perl uses the standard C library's
L<fdopen(3)> to implement the C<=> functionality.  On many Unix systems,
L<fdopen(3)> fails when file descriptors exceed a certain value, typically 255.
For Perls 5.8.0 and later, PerlIO is (most often) the default.

You can see whether your Perl was built with PerlIO by running
C<perl -V:useperlio>.  If it says C<'define'>, you have PerlIO;
otherwise you don't.

If you open a pipe on the command C<-> (that is, specify either C<|-> or C<-|>
with the one- or two-argument forms of
L<C<open>|/open FILEHANDLE,EXPR>), an implicit L<C<fork>|/fork> is done,
so L<C<open>|/open FILEHANDLE,EXPR> returns twice: in the parent process
it returns the pid
of the child process, and in the child process it returns (a defined) C<0>.
Use C<defined($pid)> or C<//> to determine whether the open was successful.

For example, use either

   my $child_pid = open(my $from_kid, "-|") // die "Can't fork: $!";

or

   my $child_pid = open(my $to_kid,   "|-") // die "Can't fork: $!";

followed by

    if ($child_pid) {
        # am the parent:
        # either write $to_kid or else read $from_kid
        ...
        waitpid $child_pid, 0;
    } else {
        # am the child; use STDIN/STDOUT normally
        ...
        exit;
    }

The filehandle behaves normally for the parent, but I/O to that
filehandle is piped from/to the STDOUT/STDIN of the child process.
In the child process, the filehandle isn't opened--I/O happens from/to
the new STDOUT/STDIN.  Typically this is used like the normal
piped open when you want to exercise more control over just how the
pipe command gets executed, such as when running setuid and
you don't want to have to scan shell commands for metacharacters.

The following blocks are more or less equivalent:

    open(my $fh, "|tr '[a-z]' '[A-Z]'");
    open(my $fh, "|-", "tr '[a-z]' '[A-Z]'");
    open(my $fh, "|-") || exec 'tr', '[a-z]', '[A-Z]';
    open(my $fh, "|-", "tr", '[a-z]', '[A-Z]');

    open(my $fh, "cat -n '$file'|");
    open(my $fh, "-|", "cat -n '$file'");
    open(my $fh, "-|") || exec "cat", "-n", $file;
    open(my $fh, "-|", "cat", "-n", $file);

The last two examples in each block show the pipe as "list form", which is
not yet supported on all platforms.  A good rule of thumb is that if
your platform has a real L<C<fork>|/fork> (in other words, if your platform is
Unix, including Linux and MacOS X), you can use the list form.  You would
want to use the list form of the pipe so you can pass literal arguments
to the command without risk of the shell interpreting any shell metacharacters
in them.  However, this also bars you from opening pipes to commands
that intentionally contain shell metacharacters, such as:

    open(my $fh, "|cat -n | expand -4 | lpr")
	|| die "Can't open pipeline to lpr: $!";

See L<perlipc/"Safe Pipe Opens"> for more examples of this.

Perl will attempt to flush all files opened for
output before any operation that may do a fork, but this may not be
supported on some platforms (see L<perlport>).  To be safe, you may need
to set L<C<$E<verbar>>|perlvar/$E<verbar>> (C<$AUTOFLUSH> in L<English>)
or call the C<autoflush> method of L<C<IO::Handle>|IO::Handle/METHODS>
on any open handles.

On systems that support a close-on-exec flag on files, the flag will
be set for the newly opened file descriptor as determined by the value
of L<C<$^F>|perlvar/$^F>.  See L<perlvar/$^F>.

Closing any piped filehandle causes the parent process to wait for the
child to finish, then returns the status value in L<C<$?>|perlvar/$?> and
L<C<${^CHILD_ERROR_NATIVE}>|perlvar/${^CHILD_ERROR_NATIVE}>.
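
As a hedged sketch of reading that status, the following runs a child
perl (located through C<$^X>) that prints one line and exits with
status 3.  It uses the list form of the pipe open, so it assumes a
platform that supports it; the one-liner is purely illustrative:

    use strict;
    use warnings;

    # $^X is the path of the running perl; the one-liner is illustrative.
    open(my $kid, "-|", $^X, "-e", 'print "hello\n"; exit 3')
        or die "Can't start child: $!";
    my $output = <$kid>;

    close($kid);                  # false here: the child exited non-zero
    my $exit_code = $? >> 8;      # high byte of $? is the exit status

Here C<$exit_code> ends up as 3.  Because
L<C<close>|/close FILEHANDLE> returns false whenever the child exits
with a non-zero status, its return value is deliberately not treated as
fatal in this sketch.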

The filename passed to the one- and two-argument forms of
L<C<open>|/open FILEHANDLE,EXPR> will
have leading and trailing whitespace deleted and normal
redirection characters honored.  This property, known as "magic open",
can often be used to good effect.  A user could specify a filename of
F<"rsh cat file |">, or you could change certain filenames as needed:

    $filename =~ s/(.*\.gz)\s*$/gzip -dc < $1|/;
    open(my $fh, $filename) or die "Can't open $filename: $!";

Use the three-argument form to open a file with arbitrary weird characters in it,

    open(my $fh, "<", $file)
	|| die "Can't open $file: $!";

otherwise it's necessary to protect any leading and trailing whitespace:

    $file =~ s#^(\s)#./$1#;
    open(my $fh, "< $file\0")
	|| die "Can't open $file: $!";

(this may not work on some bizarre filesystems).  One should
conscientiously choose between the I<magic> and I<three-argument> form
of L<C<open>|/open FILEHANDLE,EXPR>:

    open(my $in, $ARGV[0]) || die "Can't open $ARGV[0]: $!";

will allow the user to specify an argument of the form C<"rsh cat file |">,
but will not work on a filename that happens to have a trailing space, while

    open(my $in, "<", $ARGV[0])
	|| die "Can't open $ARGV[0]: $!";

will have exactly the opposite restrictions. (However, some shells
support the syntax C<< perl your_program.pl <( rsh cat file ) >>, which
produces a filename that can be opened normally.)

If you want a "real" C L<open(2)>, then you should use the
L<C<sysopen>|/sysopen FILEHANDLE,FILENAME,MODE> function, which involves
no such magic (but uses different filemodes than Perl
L<C<open>|/open FILEHANDLE,EXPR>, which corresponds to C L<fopen(3)>).
This is another way to protect your filenames from interpretation.  For
example:

    use Fcntl qw(O_RDWR O_CREAT O_EXCL);  # for the O_* constants
    use IO::Handle;
    sysopen(my $fh, $path, O_RDWR|O_CREAT|O_EXCL)
        or die "Can't open $path: $!";
    $fh->autoflush(1);
    print $fh "stuff $$\n";
    seek($fh, 0, 0);
    print "File contains: ", readline($fh);

See L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE> for some details about
mixing reading and writing.

Portability issues: L<perlport/open>.

=item opendir DIRHANDLE,EXPR
X<opendir>

=for Pod::Functions open a directory

Opens a directory named EXPR for processing by
L<C<readdir>|/readdir DIRHANDLE>, L<C<telldir>|/telldir DIRHANDLE>,
L<C<seekdir>|/seekdir DIRHANDLE,POS>,
L<C<rewinddir>|/rewinddir DIRHANDLE>, and
L<C<closedir>|/closedir DIRHANDLE>.  Returns true if successful.
DIRHANDLE may be an expression whose value can be used as an indirect
dirhandle, usually the real dirhandle name.  If DIRHANDLE is an undefined
scalar variable (or array or hash element), the variable is assigned a
reference to a new anonymous dirhandle; that is, it's autovivified.
DIRHANDLEs have their own namespace separate from FILEHANDLEs.

See the example at L<C<readdir>|/readdir DIRHANDLE>.
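
For a quick self-contained sketch, the following creates two empty
files in a scratch directory (via L<File::Temp>; the file names are
placeholders) and lists them through an autovivified dirhandle:

    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    my $dir = tempdir(CLEANUP => 1);
    for my $name (qw(b.txt a.txt)) {
        open(my $fh, ">", "$dir/$name") or die "Can't create $name: $!";
        close($fh) or die "close failed: $!";
    }

    opendir(my $dh, $dir) or die "Can't opendir $dir: $!";
    # Drop the "." and ".." entries and sort what remains.
    my @names = sort grep { $_ ne "." && $_ ne ".." } readdir($dh);
    closedir($dh) or die "closedir failed: $!";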

=item ord EXPR
X<ord> X<encoding>

=item ord

=for Pod::Functions find a character's numeric representation

Returns the numeric value of the first character of EXPR.
If EXPR is an empty string, returns 0.  If EXPR is omitted, uses
L<C<$_>|perlvar/$_>.
(Note I<character>, not byte.)

For the reverse, see L<C<chr>|/chr NUMBER>.
See L<perlunicode> for more about Unicode.
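
A few illustrative calls (the variable names are arbitrary):

    use strict;
    use warnings;

    my $ascii  = ord("A");          # 65
    my $first  = ord("abc");        # 97, only the first character counts
    my $empty  = ord("");           # 0
    my $smiley = ord("\x{263A}");   # 0x263A, a character above 255
    my $back   = chr($ascii);       # "A"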

=item our VARLIST
X<our> X<global>

=item our TYPE VARLIST

=item our VARLIST : ATTRS

=item our TYPE VARLIST : ATTRS

=for Pod::Functions +5.6.0 declare and assign a package variable (lexical scoping)

L<C<our>|/our VARLIST> makes a lexical alias to a package (i.e. global)
variable of the same name in the current package for use within the
current lexical scope.

L<C<our>|/our VARLIST> has the same scoping rules as
L<C<my>|/my VARLIST> or L<C<state>|/state VARLIST>, meaning that it is
only valid within a lexical scope.  Unlike L<C<my>|/my VARLIST> and
L<C<state>|/state VARLIST>, which both declare new (lexical) variables,
L<C<our>|/our VARLIST> only creates an alias to an existing variable: a
package variable of the same name.

This means that when C<use strict 'vars'> is in effect, L<C<our>|/our
VARLIST> lets you use a package variable without qualifying it with the
package name, but only within the lexical scope of the
L<C<our>|/our VARLIST> declaration.  This applies immediately--even
within the same statement.

    package Foo;
    use strict;

    $Foo::foo = 23;

    {
        our $foo;   # alias to $Foo::foo
        print $foo; # prints 23
    }

    print $Foo::foo; # prints 23

    print $foo; # ERROR: requires explicit package name

This works even if the package variable has not been used before, as
package variables spring into existence when first used.

    package Foo;
    use strict;

    our $foo = 23;   # just like $Foo::foo = 23

    print $Foo::foo; # prints 23

Because the variable becomes legal immediately under C<use strict 'vars'>, so
long as no variable with that name is already in scope, you can then
reference the package variable again even within the same statement.

    package Foo;
    use strict;

    my  $foo = $foo; # error, undeclared $foo on right-hand side
    our $foo = $foo; # no errors

If more than one variable is listed, the list must be placed
in parentheses.

    our($bar, $baz);

An L<C<our>|/our VARLIST> declaration declares an alias for a package
variable that will be visible
across its entire lexical scope, even across package boundaries.  The
package in which the variable is entered is determined at the point
of the declaration, not at the point of use.  This means the following
behavior holds:

    package Foo;
    our $bar;      # declares $Foo::bar for rest of lexical scope
    $bar = 20;

    package Bar;
    print $bar;    # prints 20, as it refers to $Foo::bar

Multiple L<C<our>|/our VARLIST> declarations with the same name in the
same lexical
scope are allowed if they are in different packages.  If they happen
to be in the same package, Perl will emit warnings if you have asked
for them, just like multiple L<C<my>|/my VARLIST> declarations.  Unlike
a second L<C<my>|/my VARLIST> declaration, which will bind the name to a
fresh variable, a second L<C<our>|/our VARLIST> declaration in the same
package, in the same scope, is merely redundant.

    use warnings;
    package Foo;
    our $bar;      # declares $Foo::bar for rest of lexical scope
    $bar = 20;

    package Bar;
    our $bar = 30; # declares $Bar::bar for rest of lexical scope
    print $bar;    # prints 30

    our $bar;      # emits warning but has no other effect
    print $bar;    # still prints 30

An L<C<our>|/our VARLIST> declaration may also have a list of attributes
associated with it.

The exact semantics and interface of TYPE and ATTRS are still
evolving.  TYPE is currently bound to the use of the L<fields> pragma,
and attributes are handled using the L<attributes> pragma, or, starting
from Perl 5.8.0, also via the L<Attribute::Handlers> module.  See
L<perlsub/"Private Variables via my()"> for details.

Note that with a parenthesised list, L<C<undef>|/undef EXPR> can be used
as a dummy placeholder, for example to skip assignment of initial
values:

    our ( undef, $min, $hour ) = localtime;

L<C<our>|/our VARLIST> differs from L<C<use vars>|vars>, which allows
use of an unqualified name I<only> within the affected package, but
across scopes.

=item pack TEMPLATE,LIST
X<pack>

=for Pod::Functions convert a list into a binary representation

Takes a LIST of values and converts it into a string using the rules
given by the TEMPLATE.  The resulting string is the concatenation of
the converted values.  Typically, each converted value looks
like its machine-level representation.  For example, on 32-bit machines
an integer may be represented by a sequence of 4 bytes, which will in
Perl be presented as a string that's 4 characters long.

See L<perlpacktut> for an introduction to this function.

The TEMPLATE is a sequence of characters that give the order and type
of values, as follows:

    a  A string with arbitrary binary data, will be null padded.
    A  A text (ASCII) string, will be space padded.
    Z  A null-terminated (ASCIZ) string, will be null padded.

    b  A bit string (ascending bit order inside each byte,
       like vec()).
    B  A bit string (descending bit order inside each byte).
    h  A hex string (low nybble first).
    H  A hex string (high nybble first).

    c  A signed char (8-bit) value.
    C  An unsigned char (octet) value.
    W  An unsigned char value (can be greater than 255).

    s  A signed short (16-bit) value.
    S  An unsigned short value.

    l  A signed long (32-bit) value.
    L  An unsigned long value.

    q  A signed quad (64-bit) value.
    Q  An unsigned quad value.
         (Quads are available only if your system supports 64-bit
          integer values _and_ if Perl has been compiled to support
          those.  Raises an exception otherwise.)

    i  A signed integer value.
    I  An unsigned integer value.
         (This 'integer' is _at_least_ 32 bits wide.  Its exact
          size depends on what a local C compiler calls 'int'.)

    n  An unsigned short (16-bit) in "network" (big-endian) order.
    N  An unsigned long (32-bit) in "network" (big-endian) order.
    v  An unsigned short (16-bit) in "VAX" (little-endian) order.
    V  An unsigned long (32-bit) in "VAX" (little-endian) order.

    j  A Perl internal signed integer value (IV).
    J  A Perl internal unsigned integer value (UV).

    f  A single-precision float in native format.
    d  A double-precision float in native format.

    F  A Perl internal floating-point value (NV) in native format.
    D  A float of long-double precision in native format.
         (Long doubles are available only if your system supports
          long double values _and_ if Perl has been compiled to
          support those.  Raises an exception otherwise.
          Note that there are different long double formats.)

    p  A pointer to a null-terminated string.
    P  A pointer to a structure (fixed-length string).

    u  A uuencoded string.
    U  A Unicode character number.  Encodes to a character in char-
       acter mode and UTF-8 (or UTF-EBCDIC in EBCDIC platforms) in
       byte mode.

    w  A BER compressed integer (not an ASN.1 BER, see perlpacktut
       for details).  Its bytes represent an unsigned integer in
       base 128, most significant digit first, with as few digits
       as possible.  Bit eight (the high bit) is set on each byte
       except the last.

    x  A null byte (a.k.a. ASCII NUL, "\000", chr(0)).
    X  Back up a byte.
    @  Null-fill or truncate to absolute position, counted from the
       start of the innermost ()-group.
    .  Null-fill or truncate to absolute position specified by
       the value.
    (  Start of a ()-group.

One or more modifiers below may optionally follow certain letters in the
TEMPLATE (the second column lists letters for which the modifier is valid):

    !   sSlLiI     Forces native (short, long, int) sizes instead
                   of fixed (16-/32-bit) sizes.

    !   xX         Make x and X act as alignment commands.

    !   nNvV       Treat integers as signed instead of unsigned.

    !   @.         Specify position as byte offset in the internal
                   representation of the packed string.  Efficient
                   but dangerous.

    >   sSiIlLqQ   Force big-endian byte-order on the type.
        jJfFdDpP   (The "big end" touches the construct.)

    <   sSiIlLqQ   Force little-endian byte-order on the type.
        jJfFdDpP   (The "little end" touches the construct.)

The C<< > >> and C<< < >> modifiers can also be used on C<()> groups
to force a particular byte-order on all components in that group,
including all its subgroups.
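
A few of the letters and modifiers above can be checked against each
other.  This is a sketch rather than an exhaustive tour, and the
variable names are illustrative:

    use strict;
    use warnings;

    my $big    = pack("N", 1);                   # "\0\0\0\1"
    my $little = pack("V", 1);                   # "\1\0\0\0"

    # The < and > modifiers give the same bytes as the fixed-order letters.
    my $same_big    = pack("L>", 1) eq $big;
    my $same_little = pack("L<", 1) eq $little;
    my $group       = pack("(S L)>", 1, 2) eq pack("n N", 1, 2);

    my $padded  = pack("A3 a3 Z3", "x", "y", "z");  # "x  y\0\0z\0\0"
    my ($short) = unpack("n", "\x01\x02");          # 258
    my $ber     = pack("w", 300);                   # "\x82\x2C"

All three comparisons should be true: C<< L> >> matches C<N>,
C<< L< >> matches C<V>, and a modifier on a C<()> group applies to
every letter inside it.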

=begin comment

Larry recalls that the hex and bit string formats (H, h, B, b) were added to
pack for processing data from NASA's Magellan probe.  Magellan was in an
elliptical orbit, using the antenna for the radar mapping when close to
Venus and for communicating data back to Earth for the rest of the orbit.
There were two transmission units, but one of these failed, and then the
other developed a fault whereby it would randomly flip the sense of all the
bits. It was easy to automatically detect complete records with the correct
sense, and complete records with all the bits flipped. However, this didn't
recover the records where the sense flipped midway. A colleague of Larry's
was able to pretty much eyeball where the records flipped, so they wrote an
editor named kybble (a pun on the dog food Kibbles 'n Bits) to enable him to
manually correct the records and recover the data. For this purpose pack
gained the hex and bit string format specifiers.

git shows that they were added to perl 3.0 in patch #44 (Jan 1991, commit
27e2fb84680b9cc1), but the patch description makes no mention of their
addition, let alone the story behind them.

=end comment

The following rules apply:

=over

=item *

Each letter may optionally be followed by a number indicating the repeat
count.  A numeric repeat count may optionally be enclosed in brackets, as
in C<pack("C[80]", @arr)>.  The repeat count gobbles that many values from
the LIST when used with all format types other than C<a>, C<A>, C<Z>, C<b>,
C<B>, C<h>, C<H>, C<@>, C<.>, C<x>, C<X>, and C<P>, where it means
something else, described below.  Supplying a C<*> for the repeat count
instead of a number means to use however many items are left, except for:

=over

=item *

C<@>, C<x>, and C<X>, where it is equivalent to C<0>.

=item *

C<.>, where it means relative to the start of the string.

=item *

C<u>, where it is equivalent to 1 (or 45, which here is equivalent).

=back

One can replace a numeric repeat count with a template letter enclosed in
brackets to use the packed byte length of the bracketed template for the
repeat count.

For example, the template C<x[L]> skips as many bytes as in a packed long,
and the template C<"$t X[$t] $t"> unpacks twice whatever $t (when
variable-expanded) unpacks.  If the template in brackets contains alignment
commands (such as C<x![d]>), its packed length is calculated as if the
start of the template had the maximal possible alignment.

When used with C<Z>, a C<*> as the repeat count is guaranteed to add a
trailing null byte, so the resulting string is always one byte longer than
the byte length of the item itself.

When used with C<@>, the repeat count represents an offset from the start
of the innermost C<()> group.

When used with C<.>, the repeat count determines the starting position to
calculate the value offset as follows:

=over

=item *

If the repeat count is C<0>, it's relative to the current position.

=item *

If the repeat count is C<*>, the offset is relative to the start of the
packed string.

=item *

And if it's an integer I<n>, the offset is relative to the start of the
I<n>th innermost C<( )> group, or to the start of the string if I<n> is
bigger than the group level.

=back

The repeat count for C<u> is interpreted as the maximal number of bytes
to encode per line of output, with 0, 1 and 2 replaced by 45.  The repeat
count should not be more than 65.

=item *

The C<a>, C<A>, and C<Z> types gobble just one value, but pack it as a
string of length count, padding with nulls or spaces as needed.  When
unpacking, C<A> strips trailing whitespace and nulls, C<Z> strips everything
after the first null, and C<a> returns data with no stripping at all.

If the value to pack is too long, the result is truncated.  If it's too
long and an explicit count is provided, C<Z> packs only C<$count-1> bytes,
followed by a null byte.  Thus C<Z> always packs a trailing null, except
when the count is 0.

=item *

Likewise, the C<b> and C<B> formats pack a string that's that many bits long.
Each such format generates 1 bit of the result.  These are typically followed
by a repeat count like C<B8> or C<B64>.

Each result bit is based on the least-significant bit of the corresponding
input character, i.e., on C<ord($char)%2>.  In particular, characters C<"0">
and C<"1"> generate bits 0 and 1, as do characters C<"\000"> and C<"\001">.

Starting from the beginning of the input string, each 8-tuple
of characters is converted to 1 character of output.  With format C<b>,
the first character of the 8-tuple determines the least-significant bit of a
character; with format C<B>, it determines the most-significant bit of
a character.

If the length of the input string is not evenly divisible by 8, the
remainder is packed as if the input string were padded by null characters
at the end.  Similarly during unpacking, "extra" bits are ignored.

If the input string is longer than needed, remaining characters are ignored.

A C<*> for the repeat count uses all characters of the input field.
On unpacking, bits are converted to a string of C<0>s and C<1>s.
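
The bit ordering is easiest to see on a single byte.  The following sketch
assumes an ASCII platform, where C<"A"> is 0x41; the variable names are
illustrative only:

```perl
use strict;
use warnings;

# "B" fills each byte from the most-significant bit down,
# "b" from the least-significant bit up.
my $msb_first = pack("B8", "01000001");   # "A" (0x41)
my $lsb_first = pack("b8", "10000010");   # also "A"

# A short input is padded with zero bits at the end.
my $padded = pack("B4", "1111");          # "\xF0"

# Unpacking produces a string of "0" and "1" characters.
my $bits_B = unpack("B8", "A");           # "01000001"
my $bits_b = unpack("b8", "A");           # "10000010"
my $all    = unpack("B*", "AB");          # "0100000101000010"
```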

=item *

The C<h> and C<H> formats pack a string that many nybbles (4-bit groups,
representable as hexadecimal digits, C<"0".."9"> C<"a".."f">) long.

For each such format, L<C<pack>|/pack TEMPLATE,LIST> generates 4 bits of result.
With non-alphabetical characters, the result is based on the 4 least-significant
bits of the input character, i.e., on C<ord($char)%16>.  In particular,
characters C<"0"> and C<"1"> generate nybbles 0 and 1, as do bytes
C<"\000"> and C<"\001">.  For characters C<"a".."f"> and C<"A".."F">, the result
is compatible with the usual hexadecimal digits, so that C<"a"> and
C<"A"> both generate the nybble C<0xA==10>.  Use only these specific hex
characters with this format.

Starting from the beginning of the input string, each pair
of characters is converted to 1 character of output.  With format C<h>, the
first character of the pair determines the least-significant nybble of the
output character; with format C<H>, it determines the most-significant
nybble.

If the length of the input string is not even, it behaves as if padded by
a null character at the end.  Similarly, "extra" nybbles are ignored during
unpacking.

If the input string is longer than needed, extra characters are ignored.

A C<*> for the repeat count uses all characters of the input field.  For
L<C<unpack>|/unpack TEMPLATE,EXPR>, nybbles are converted to a string of
hexadecimal digits.
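
A short sketch of the two nybble orders, again assuming an ASCII platform
(C<"A"> is 0x41 and C<"B"> is 0x42); the variable names are illustrative:

```perl
use strict;
use warnings;

# "H" takes the high nybble first, "h" the low nybble first.
my $high_first = pack("H4", "4142");   # "AB"
my $low_first  = pack("h4", "1424");   # "AB"

# An odd number of digits is padded with a zero nybble.
my $odd = pack("H3", "414");           # "\x41\x40"

# Unpacking returns a string of hexadecimal digits.
my $hex_H = unpack("H*", "AB");        # "4142"
my $hex_h = unpack("h*", "AB");        # "1424"
```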

=item *

The C<p> format packs a pointer to a null-terminated string.  You are
responsible for ensuring that the string is not a temporary value, as that
could potentially get deallocated before you got around to using the packed
result.  The C<P> format packs a pointer to a structure of the size indicated
by the length.  A null pointer is created if the corresponding value for
C<p> or C<P> is L<C<undef>|/undef EXPR>; similarly with
L<C<unpack>|/unpack TEMPLATE,EXPR>, where a null pointer unpacks into
L<C<undef>|/undef EXPR>.

If your system has a strange pointer size--meaning a pointer is neither as
big as an int nor as big as a long--it may not be possible to pack or
unpack pointers in big- or little-endian byte order.  Attempting to do
so raises an exception.

=item *

The C</> template character allows packing and unpacking of a sequence of
items where the packed structure contains a packed item count followed by
the packed items themselves.  This is useful when the structure you're
unpacking has encoded the sizes or repeat counts for some of its fields
within the structure itself as separate fields.

For L<C<pack>|/pack TEMPLATE,LIST>, you write
I<length-item>C</>I<sequence-item>, and the
I<length-item> describes how the length value is packed.  Formats likely
to be of most use are integer-packing ones like C<n> for Java strings,
C<w> for ASN.1 or SNMP, and C<N> for Sun XDR.

For L<C<pack>|/pack TEMPLATE,LIST>, I<sequence-item> may have a repeat
count, in which case
the minimum of that and the number of available items is used as the argument
for I<length-item>.  If it has no repeat count or uses a '*', the number
of available items is used.

For L<C<unpack>|/unpack TEMPLATE,EXPR>, an internal stack of integer
arguments unpacked so far is
used.  You write C</>I<sequence-item> and the repeat count is obtained by
popping off the last element from the stack.  The I<sequence-item> must not
have a repeat count.

If I<sequence-item> refers to a string type (C<"A">, C<"a">, or C<"Z">),
the I<length-item> is the string length, not the number of strings.  With
an explicit repeat count for pack, the packed string is adjusted to that
length.  For example:

 This code:                             gives this result:

 unpack("W/a", "\004Gurusamy")          ("Guru")
 unpack("a3/A A*", "007 Bond  J ")      (" Bond", "J")
 unpack("a3 x2 /A A*", "007: Bond, J.") ("Bond, J", ".")

 pack("n/a* w/a","hello,","world")     "\000\006hello,\005world"
 pack("a/W2", ord("a") .. ord("z"))    "2ab"

The I<length-item> is not returned explicitly from
L<C<unpack>|/unpack TEMPLATE,EXPR>.

Supplying a count to the I<length-item> format letter is only useful with
C<A>, C<a>, or C<Z>.  Packing with a I<length-item> of C<a> or C<Z> may
introduce C<"\000"> characters, which Perl does not regard as legal in
numeric strings.
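
As a hedged illustration of a length-prefixed round trip, the sketch below
packs strings with a 16-bit big-endian length and reads them back.  The
C<(n/a*)*> group form is one way to handle a whole list; the variable names
are illustrative only:

```perl
use strict;
use warnings;

# One string: a 16-bit length followed by the bytes themselves.
my $one    = pack("n/a*", "hello");       # "\0\5hello"
my ($back) = unpack("n/a*", $one);        # "hello"

# A list of strings: repeat the length/string pair as a group.
my @words   = qw(alpha be gamma);
my $buffer  = pack("(n/a*)*", @words);
my @decoded = unpack("(n/a*)*", $buffer); # same three words

print "decoded: @decoded\n";
```

Note that the length values themselves do not appear in the unpacked list.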

=item *

The integer types C<s>, C<S>, C<l>, and C<L> may be
followed by a C<!> modifier to specify native shorts or
longs.  As shown in the example above, a bare C<l> means
exactly 32 bits, although the native C<long> as seen by the local C compiler
may be larger.  This is mainly an issue on 64-bit platforms.  You can
see whether using C<!> makes any difference this way:

    printf "format s is %d, s! is %d\n",
        length pack("s"), length pack("s!");

    printf "format l is %d, l! is %d\n",
        length pack("l"), length pack("l!");


C<i!> and C<I!> are also allowed, but only for completeness' sake:
they are identical to C<i> and C<I>.

The actual sizes (in bytes) of native shorts, ints, longs, and long
longs on the platform where Perl was built are also available from
the command line:

    $ perl -V:{short,int,long{,long}}size
    shortsize='2';
    intsize='4';
    longsize='4';
    longlongsize='8';

or programmatically via the L<C<Config>|Config> module:

       use Config;
       print $Config{shortsize},    "\n";
       print $Config{intsize},      "\n";
       print $Config{longsize},     "\n";
       print $Config{longlongsize}, "\n";

C<$Config{longlongsize}> is undefined on systems without
long long support.

=item *

The integer formats C<s>, C<S>, C<i>, C<I>, C<l>, C<L>, C<j>, and C<J> are
inherently non-portable between processors and operating systems because
they obey native byteorder and endianness.  For example, a 4-byte integer
0x12345678 (305419896 decimal) would be ordered natively (arranged in and
handled by the CPU registers) into bytes as

    0x12 0x34 0x56 0x78  # big-endian
    0x78 0x56 0x34 0x12  # little-endian

Basically, Intel and VAX CPUs are little-endian, while everybody else,
including Motorola m68k/88k, PPC, Sparc, HP PA, Power, and Cray, are
big-endian.  Alpha and MIPS can be either: Digital/Compaq uses (well, used)
them in little-endian mode, but SGI/Cray uses them in big-endian mode.

The names I<big-endian> and I<little-endian> are comic references to the
egg-eating habits of the little-endian Lilliputians and the big-endian
Blefuscudians from the classic Jonathan Swift satire, I<Gulliver's Travels>.
This entered computer lingo via the paper "On Holy Wars and a Plea for
Peace" by Danny Cohen, USC/ISI IEN 137, April 1, 1980.

Some systems may have even weirder byte orders such as

   0x56 0x78 0x12 0x34
   0x34 0x12 0x78 0x56

These are called mid-endian, middle-endian, mixed-endian, or just weird.

You can determine your system endianness with this incantation:

   printf("%#02x ", $_) for unpack("W*", pack L=>0x12345678);

The byteorder on the platform where Perl was built is also available
via L<Config>:

    use Config;
    print "$Config{byteorder}\n";

or from the command line:

    $ perl -V:byteorder

Byteorders C<"1234"> and C<"12345678"> are little-endian; C<"4321">
and C<"87654321"> are big-endian.  Systems with multiarchitecture binaries
will have C<"ffff">, signifying that static information doesn't work and
that one must use runtime probing.

For portably packed integers, either use the formats C<n>, C<N>, C<v>,
and C<V> or else use the C<< > >> and C<< < >> modifiers described
immediately below.  See also L<perlport>.
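
A small sketch contrasting the fixed-order formats with the native C<L>
format.  The probe at the end is only one possible way to classify the
running platform, and the variable names are illustrative:

```perl
use strict;
use warnings;

# "N" and "V" give the same bytes on every platform.
my $big    = pack("N", 0x12345678);   # "\x12\x34\x56\x78"
my $little = pack("V", 0x12345678);   # "\x78\x56\x34\x12"

# "L" follows the native byte order, so comparing it against the
# two fixed layouts reveals the order of the running platform.
my $native = pack("L", 0x12345678);
my $order  = $native eq $big    ? "big-endian"
           : $native eq $little ? "little-endian"
           :                      "mixed-endian";

my $value = unpack("N", $big);        # 305419896 on any platform
print "native order: $order\n";
```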

=item *

Floating-point numbers also have endianness.  Usually (but not always)
this agrees with the integer endianness.  Even though most platforms
these days use the IEEE 754 binary format, there are differences,
especially if long doubles are involved.  See the
C<Config> variables C<doublekind> and C<longdblkind> (also C<doublesize>,
C<longdblsize>): the "kind" values are enums, unlike C<byteorder>.

Portability-wise, the best option is probably to keep to IEEE 754
64-bit doubles with an agreed-upon endianness.  Another possibility
is the C<"%a"> format of L<C<printf>|/printf FILEHANDLE FORMAT, LIST>.

=item *

Starting with Perl 5.10.0, integer and floating-point formats, along with
the C<p> and C<P> formats and C<()> groups, may all be followed by the
C<< > >> or C<< < >> endianness modifiers to respectively enforce big-
or little-endian byte-order.  These modifiers are especially useful
given how C<n>, C<N>, C<v>, and C<V> don't cover signed integers,
64-bit integers, or floating-point values.

Here are some concerns to keep in mind when using an endianness modifier:

=over

=item *

Exchanging signed integers between different platforms works only
when all platforms store them in the same format.  Most platforms store
signed integers in two's-complement notation, so usually this is not an issue.

=item *

The C<< > >> or C<< < >> modifiers can only be used on floating-point
formats on big- or little-endian machines.  Otherwise, attempting to
use them raises an exception.

=item *

Forcing big- or little-endian byte-order on floating-point values for
data exchange can work only if all platforms use the same
binary representation such as IEEE floating-point.  Even if all
platforms are using IEEE, there may still be subtle differences.  Being able
to use C<< > >> or C<< < >> on floating-point values can be useful,
but also dangerous if you don't know exactly what you're doing.
It is not a general way to portably store floating-point values.

=item *

When using C<< > >> or C<< < >> on a C<()> group, this affects
all types inside the group that accept byte-order modifiers,
including all subgroups.  It is silently ignored for all other
types.  You are not allowed to override the byte-order within a group
that already has a byte-order modifier suffix.

=back
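
The following sketch shows the modifiers on signed integers and on a group.
It assumes Perl 5.10 or later and a two's-complement platform; the variable
names are illustrative only:

```perl
use strict;
use warnings;

# Signed values, which "n", "N", "v", and "V" cannot express directly.
my $be_long  = pack("l>", -2);          # "\xFF\xFF\xFF\xFE"
my $le_short = pack("s<", -2);          # "\xFE\xFF"

# A modifier on a group applies to every type inside it.
my $group = pack("(s l)>", 1, 2);       # "\0\1" . "\0\0\0\2"
my ($short, $long) = unpack("(s l)>", $group);

print "short=$short long=$long\n";
```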

=item *

Real numbers (floats and doubles) are in native machine format only.
Due to the multiplicity of floating-point formats and the lack of a
standard "network" representation for them, no facility for interchange has
been provided.  This means that packed floating-point data written on one machine
may not be readable on another, even if both use IEEE floating-point
arithmetic (because the endianness of the memory representation is not part
of the IEEE spec).  See also L<perlport>.

If you know I<exactly> what you're doing, you can use the C<< > >> or C<< < >>
modifiers to force big- or little-endian byte-order on floating-point values.

Because Perl uses doubles (or long doubles, if configured) internally for
all numeric calculation, converting from double into float and thence
to double again loses precision, so C<unpack("f", pack("f", $foo))>
will not in general equal C<$foo>.
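
A quick way to observe the precision loss, assuming IEEE 754
single-precision floats (the variable names are illustrative):

```perl
use strict;
use warnings;

my $tenth     = 0.1;
my $via_float = unpack("f", pack("f", $tenth));

# 0.1 has no exact binary representation, so the narrower float
# rounds differently from the double that Perl started with.
printf "%.17g\n%.17g\n", $tenth, $via_float;

# A value that is exactly representable survives the round trip.
my $half = unpack("f", pack("f", 0.5));
```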

=item *

Pack and unpack can operate in two modes: character mode (C<C0> mode) where
the packed string is processed per character, and UTF-8 byte mode (C<U0> mode)
where the packed string is processed in its UTF-8-encoded Unicode form on
a byte-by-byte basis.  Character mode is the default
unless the format string starts with C<U>.  You
can always switch mode mid-format with an explicit
C<C0> or C<U0> in the format.  This mode remains in effect until the next
mode change, or until the end of the C<()> group it (directly) applies to.

Using C<C0> to get Unicode characters while using C<U0> to get I<non>-Unicode
bytes is not necessarily obvious.   Probably only the first of these
is what you want:

    $ perl -CS -E 'say "\x{3B1}\x{3C9}"' |
      perl -CS -ne 'printf "%v04X\n", $_ for unpack("C0A*", $_)'
    03B1.03C9
    $ perl -CS -E 'say "\x{3B1}\x{3C9}"' |
      perl -CS -ne 'printf "%v02X\n", $_ for unpack("U0A*", $_)'
    CE.B1.CF.89
    $ perl -CS -E 'say "\x{3B1}\x{3C9}"' |
      perl -C0 -ne 'printf "%v02X\n", $_ for unpack("C0A*", $_)'
    CE.B1.CF.89
    $ perl -CS -E 'say "\x{3B1}\x{3C9}"' |
      perl -C0 -ne 'printf "%v02X\n", $_ for unpack("U0A*", $_)'
    C3.8E.C2.B1.C3.8F.C2.89

Those examples also illustrate that you should not try to use
L<C<pack>|/pack TEMPLATE,LIST>/L<C<unpack>|/unpack TEMPLATE,EXPR> as a
substitute for the L<Encode> module.

=item *

You must do any alignment or padding yourself by inserting, for example,
enough C<"x">es while packing.  There is no way for
L<C<pack>|/pack TEMPLATE,LIST> and L<C<unpack>|/unpack TEMPLATE,EXPR>
to know where characters are going to or coming from, so they
handle their output and input as flat sequences of characters.

=item *

A C<()> group is a sub-TEMPLATE enclosed in parentheses.  A group may
take a repeat count either as postfix, or for
L<C<unpack>|/unpack TEMPLATE,EXPR>, also via the C</>
template character.  Within each repetition of a group, positioning with
C<@> starts over at 0.  Therefore, the result of

    pack("@1A((@2A)@3A)", qw[X Y Z])

is the string C<"\0X\0\0YZ">.
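
Repeat counts on groups can be sketched as follows; the data and variable
names are illustrative only:

```perl
use strict;
use warnings;

# A postfix count repeats the whole group.
my $fields = pack("(A2)3", qw(a b c));             # "a b c "

# "*" repeats the group until the input is exhausted.
my @pairs = unpack("(a2)*", "aabbcc");             # ("aa", "bb", "cc")

# Groups may contain several items: here a byte and a 16-bit integer.
my @records = unpack("(C n)2", "\1\0\2\3\0\4");    # (1, 2, 3, 4)

print "pairs: @pairs; records: @records\n";
```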

=item *

C<x> and C<X> accept the C<!> modifier to act as alignment commands: they
jump forward or back to the closest position aligned at a multiple of C<count>
characters.  For example, to L<C<pack>|/pack TEMPLATE,LIST> or
L<C<unpack>|/unpack TEMPLATE,EXPR> a C structure like

    struct {
        char   c;    /* one signed, 8-bit character */
        double d;
        char   cc[2];
    }

one may need to use the template C<c x![d] d c[2]>.  This assumes that
doubles must be aligned to the size of double.

For alignment commands, a C<count> of 0 is equivalent to a C<count> of 1;
both are no-ops.
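
A platform-independent sketch of forward alignment, using an explicit
count of 4 rather than C<[d]> so that the layout does not depend on the
native size of a double (the variable names are illustrative):

```perl
use strict;
use warnings;

# One byte, then skip forward to the next multiple of 4, then a
# 32-bit big-endian integer.
my $record = pack("C x!4 N", 1, 2);    # "\1\0\0\0" . "\0\0\0\2"

# The same template skips the padding again when unpacking.
my ($flag, $count) = unpack("C x!4 N", $record);

printf "length=%d flag=%d count=%d\n", length($record), $flag, $count;
```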

=item *

C<n>, C<N>, C<v> and C<V> accept the C<!> modifier to
represent signed 16-/32-bit integers in big-/little-endian order.
This is portable only when all platforms sharing packed data use the
same binary representation for signed integers; for example, when all
platforms use two's-complement representation.
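
For instance, assuming a two's-complement platform (the variable names are
illustrative only):

```perl
use strict;
use warnings;

my $packed = pack("n!", -2);            # "\xFF\xFE"

# The same two bytes read back as signed or unsigned.
my $signed   = unpack("n!", $packed);   # -2
my $unsigned = unpack("n",  $packed);   # 65534

my $little = pack("v!", -2);            # "\xFE\xFF"
```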

=item *

Comments can be embedded in a TEMPLATE using C<#> through the end of line.
White space can separate pack codes from each other, but modifiers and
repeat counts must follow immediately.  Breaking complex templates into
individual line-by-line components, suitably annotated, can do as much to
improve legibility and maintainability of pack/unpack formats as C</x> can
for complicated pattern matches.
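
A sketch of an annotated template follows.  The record layout is
hypothetical and exists only to show the comment syntax:

```perl
use strict;
use warnings;

# Each line holds one pack code plus a comment; the newline ends
# the comment.
my $template = join "\n",
    'n    # record type: 16-bit big-endian',
    'N    # payload length: 32-bit big-endian',
    'a4   # four-byte tag',
    '';

my $header = pack($template, 1, 2, "abcd");   # "\0\1\0\0\0\2abcd"
my @fields = unpack($template, $header);      # (1, 2, "abcd")

print "fields: @fields\n";
```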

=item *

If TEMPLATE requires more arguments than L<C<pack>|/pack TEMPLATE,LIST>
is given, L<C<pack>|/pack TEMPLATE,LIST>
assumes additional C<""> arguments.  If TEMPLATE requires fewer arguments
than given, extra arguments are ignored.

=item *

Attempting to pack the special floating point values C<Inf> and C<NaN>
(infinity, also in negative, and not-a-number) into packed integer values
(like C<"L">) is a fatal error.  The reason for this is that there simply
isn't any sensible mapping for these special values into integers.
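
Because the error is fatal, code that may receive such values can trap it
with C<eval>.  This sketch assumes a Perl recent enough to enforce the
check and a platform where C<9**9**9> overflows to infinity; the variable
names are illustrative:

```perl
use strict;
use warnings;

my $inf = 9**9**9;                       # infinity on IEEE 754 platforms

# Packing into an integer format dies, so trap the exception.
my $packed = eval { pack("L", $inf) };
my $error  = $@;

print "pack failed: $error" if $error;
```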

=back

Examples:

    $foo = pack("WWWW",65,66,67,68);
    # foo eq "ABCD"
    $foo = pack("W4",65,66,67,68);
    # same thing
    $foo = pack("W4",0x24b6,0x24b7,0x24b8,0x24b9);
    # same thing with Unicode circled letters.
    $foo = pack("U4",0x24b6,0x24b7,0x24b8,0x24b9);
    # same thing with Unicode circled letters.  You don't get the
    # UTF-8 bytes because the U at the start of the format caused
    # a switch to U0-mode, so the UTF-8 bytes get joined into
    # characters
    $foo = pack("C0U4",0x24b6,0x24b7,0x24b8,0x24b9);
    # foo eq "\xe2\x92\xb6\xe2\x92\xb7\xe2\x92\xb8\xe2\x92\xb9"
    # This is the UTF-8 encoding of the string in the
    # previous example

    $foo = pack("ccxxcc",65,66,67,68);
    # foo eq "AB\0\0CD"

    # NOTE: The examples above featuring "W" and "c" are true
    # only on ASCII and ASCII-derived systems such as ISO Latin 1
    # and UTF-8.  On EBCDIC systems, the first example would be
    #      $foo = pack("WWWW",193,194,195,196);

    $foo = pack("s2",1,2);
    # "\001\000\002\000" on little-endian
    # "\000\001\000\002" on big-endian

    $foo = pack("a4","abcd","x","y","z");
    # "abcd"

    $foo = pack("aaaa","abcd","x","y","z");
    # "axyz"

    $foo = pack("a14","abcdefg");
    # "abcdefg\0\0\0\0\0\0\0"

    $foo = pack("i9pl", gmtime);
    # a real struct tm (on my system anyway)

    $utmp_template = "Z8 Z8 Z16 L";
    $utmp = pack($utmp_template, @utmp1);
    # a struct utmp (BSDish)

    @utmp2 = unpack($utmp_template, $utmp);
    # "@utmp1" eq "@utmp2"

    sub bintodec {
        unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
    }

    $foo = pack('sx2l', 12, 34);
    # short 12, two zero bytes padding, long 34
    $bar = pack('s@4l', 12, 34);
    # short 12, zero fill to position 4, long 34
    # $foo eq $bar
    $baz = pack('s.l', 12, 4, 34);
    # short 12, zero fill to position 4, long 34

    $foo = pack('nN', 42, 4711);
    # pack big-endian 16- and 32-bit unsigned integers
    $foo = pack('S>L>', 42, 4711);
    # exactly the same
    $foo = pack('s<l<', -42, 4711);
    # pack little-endian 16- and 32-bit signed integers
    $foo = pack('(sl)<', -42, 4711);
    # exactly the same

The same template may generally also be used in
L<C<unpack>|/unpack TEMPLATE,EXPR>.

=item package NAMESPACE

=item package NAMESPACE VERSION
X<package> X<module> X<namespace> X<version>

=item package NAMESPACE BLOCK

=item package NAMESPACE VERSION BLOCK
X<package> X<module> X<namespace> X<version>

=for Pod::Functions declare a separate global namespace

Declares the BLOCK or the rest of the compilation unit as being in the
given namespace.  The scope of the package declaration is either the
supplied code BLOCK or, in the absence of a BLOCK, from the declaration
itself through the end of current scope (the enclosing block, file, or
L<C<eval>|/eval EXPR>).  That is, the forms without a BLOCK are
operative through the end of the current scope, just like the
L<C<my>|/my VARLIST>, L<C<state>|/state VARLIST>, and
L<C<our>|/our VARLIST> operators.  All unqualified dynamic identifiers
in this scope will be in the given namespace, except where overridden by
another L<C<package>|/package NAMESPACE> declaration or
when they're one of the special identifiers that qualify into C<main::>,
like C<STDOUT>, C<ARGV>, C<ENV>, and the punctuation variables.

A package statement affects dynamic variables only, including those
you've used L<C<local>|/local EXPR> on, but I<not> lexically-scoped
variables, which are created with L<C<my>|/my VARLIST>,
L<C<state>|/state VARLIST>, or L<C<our>|/our VARLIST>.  Typically it
would be the first declaration in a file included by
L<C<require>|/require VERSION> or L<C<use>|/use Module VERSION LIST>.
You can switch into a
package in more than one place, since this only determines which default
symbol table the compiler uses for the rest of that block.  You can refer to
identifiers in other packages than the current one by prefixing the identifier
with the package name and a double colon, as in C<$SomePack::var>
or C<ThatPack::INPUT_HANDLE>.  If the package name is omitted, the C<main>
package is assumed.  That is, C<$::sail> is equivalent to
C<$main::sail> (as well as to C<$main'sail>, still seen in ancient
code, mostly from Perl 4).

If VERSION is provided, L<C<package>|/package NAMESPACE> sets the
C<$VERSION> variable in the given
namespace to a L<version> object with the VERSION provided.  VERSION must be a
"strict" style version number as defined by the L<version> module: a positive
decimal number (integer or decimal-fraction) without exponentiation or else a
dotted-decimal v-string with a leading 'v' character and at least three
components.  You should set C<$VERSION> only once per package.
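
A minimal sketch of the VERSION and BLOCK forms together, assuming
Perl 5.14 or later.  The C<Local::Counter> package and its subroutines are
hypothetical:

```perl
use strict;
use warnings;

package Local::Counter 1.02 {
    my $count = 0;                        # lexical, private to the block

    sub next_value { return ++$count }
    sub where      { return __PACKAGE__ } # "Local::Counter"
}

# After the block, the enclosing package is current again.
my $version = Local::Counter->VERSION;        # "1.02"
my $first   = Local::Counter::next_value();   # 1
my $name    = Local::Counter::where();

print "$name $version: $first\n";
```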

See L<perlmod/"Packages"> for more information about packages, modules,
and classes.  See L<perlsub> for other scoping issues.

=item __PACKAGE__
X<__PACKAGE__>

=for Pod::Functions +5.004 the current package

A special token that returns the name of the package in which it occurs.

=item pipe READHANDLE,WRITEHANDLE
X<pipe>

=for Pod::Functions open a pair of connected filehandles

Opens a pair of connected pipes like the corresponding system call.
Note that if you set up a loop of piped processes, deadlock can occur
unless you are very careful.  In addition, note that Perl's pipes use
IO buffering, so you may need to set L<C<$E<verbar>>|perlvar/$E<verbar>>
to flush your WRITEHANDLE after each command, depending on the
application.

Returns true on success.

See L<IPC::Open2>, L<IPC::Open3>, and
L<perlipc/"Bidirectional Communication with Another Process">
for examples of such things.
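
A minimal one-way sketch, assuming a platform where
L<C<fork>|/fork> is available.  Each process closes the end of the pipe it
does not use, which avoids the simplest kind of deadlock; the variable
names are illustrative:

```perl
use strict;
use warnings;

pipe(my $reader, my $writer) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: write one line and exit.  Closing the handle flushes it.
    close $reader;
    print {$writer} "hello from child\n";
    close $writer;
    exit 0;
}

# Parent: close the unused write end, then read until end of line.
close $writer;
my $line = <$reader>;
close $reader;
waitpid($pid, 0);

print "parent received: $line";
```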

On systems that support a close-on-exec flag on files, that flag is set
on all newly opened file descriptors whose
L<C<fileno>|/fileno FILEHANDLE>s are I<higher> than the current value of
L<C<$^F>|perlvar/$^F> (by default 2 for C<STDERR>).  See L<perlvar/$^F>.

=item pop ARRAY
X<pop> X<stack>

=item pop

=for Pod::Functions remove the last element from an array and return it

Pops and returns the last value of the array, shortening the array by
one element.

Returns the undefined value if the array is empty, although this may
also happen at other times.  If ARRAY is omitted, pops the
L<C<@ARGV>|perlvar/@ARGV> array in the main program, but the
L<C<@_>|perlvar/@_> array in subroutines, just like
L<C<shift>|/shift ARRAY>.

Starting with Perl 5.14, an experimental feature allowed
L<C<pop>|/pop ARRAY> to take a
scalar expression. This experiment has been deemed unsuccessful, and was
removed as of Perl 5.24.

=item pos SCALAR
X<pos> X<match, position>

=item pos

=for Pod::Functions find or set the offset for the last/next m//g search

Returns the offset of where the last C<m//g> search left off for the
variable in question (L<C<$_>|perlvar/$_> is used when the variable is not
specified).  This offset is in characters unless the
(no-longer-recommended) L<C<use bytes>|bytes> pragma is in effect, in
which case the offset is in bytes.  Note that 0 is a valid match offset.
L<C<undef>|/undef EXPR> indicates
that the search position is reset (usually due to match failure, but
can also be because no match has yet been run on the scalar).

L<C<pos>|/pos SCALAR> directly accesses the location used by the regexp
engine to store the offset, so assigning to L<C<pos>|/pos SCALAR> will
change that offset, and so will also influence the C<\G> zero-width
assertion in regular expressions.  Both of these effects take place for
the next match, so you can't affect the position with
L<C<pos>|/pos SCALAR> during the current match, such as in
C<(?{pos() = 5})> or C<s//pos() = 5/e>.

Setting L<C<pos>|/pos SCALAR> also resets the I<matched with
zero-length> flag, described
under L<perlre/"Repeated Patterns Matching a Zero-length Substring">.

Because a failed C<m//gc> match doesn't reset the offset, the return
from L<C<pos>|/pos SCALAR> won't change either in this case.  See
L<perlre> and L<perlop>.
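
The behaviors described above can be sketched in a few lines; the string
and variable names are illustrative only:

```perl
use strict;
use warnings;

my $text = "aXbXc";

$text =~ /X/g;
my $after_first = pos($text);   # 2: just past the first "X"

# Assigning to pos moves the point where \G anchors the next match.
pos($text) = 3;
$text =~ /\G(.)/g;
my $char = $1;                  # "X", the character at offset 3

# A failed match with /c keeps the offset; without /c it is reset.
$text =~ /Z/gc;
my $kept = pos($text);          # still 4
$text =~ /Z/g;
my $reset = pos($text);         # undef
```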

=item print FILEHANDLE LIST
X<print>

=item print FILEHANDLE

=item print LIST

=item print

=for Pod::Functions output a list to a filehandle

Prints a string or a list of strings.  Returns true if successful.
FILEHANDLE may be a scalar variable containing the name of or a reference
to the filehandle, thus introducing one level of indirection.  (NOTE: If
FILEHANDLE is a variable and the next token is a term, it may be
misinterpreted as an operator unless you interpose a C<+> or put
parentheses around the arguments.)  If FILEHANDLE is omitted, prints to the
last selected (see L<C<select>|/select FILEHANDLE>) output handle.  If
LIST is omitted, prints L<C<$_>|perlvar/$_> to the currently selected
output handle.  To use FILEHANDLE alone to print the content of
L<C<$_>|perlvar/$_> to it, you must use a bareword filehandle like
C<FH>, not an indirect one like C<$fh>.  To set the default output handle
to something other than STDOUT, use the select operation.

The current value of L<C<$,>|perlvar/$,> (if any) is printed between
each LIST item.  The current value of L<C<$\>|perlvar/$\> (if any) is
printed after the entire LIST has been printed.  Because print takes a
LIST, anything in the LIST is evaluated in list context, including any
subroutines whose return lists you pass to
L<C<print>|/print FILEHANDLE LIST>.  Be careful not to follow the print
keyword with a left
parenthesis unless you want the corresponding right parenthesis to
terminate the arguments to the print; put parentheses around all arguments
(or interpose a C<+>, but that doesn't look as good).
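
As a sketch of the separator variables and the parenthesis rule, the
following prints to an in-memory filehandle so that the result can be
inspected; the variable names are illustrative only:

```perl
use strict;
use warnings;

open(my $out, '>', \my $buffer) or die "cannot open in-memory handle: $!";

{
    local $, = "-";     # printed between LIST items
    local $\ = "\n";    # printed after the whole LIST
    print {$out} "a", "b", "c";          # writes "a-b-c\n"
}

# Parentheses around all arguments keep the multiplication inside
# the LIST instead of applying it to print's return value.
print {$out} ((1 + 2) * 3, "\n");        # writes "9\n"

close $out;
```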

If you're storing handles in an array or hash, or in general whenever
you're using any expression more complex than a bareword handle or a plain,
unsubscripted scalar variable to retrieve it, you will have to use a block
returning the filehandle value instead, in which case the LIST may not be
omitted:

    print { $files[$i] } "stuff\n";
    print { $OK ? *STDOUT : *STDERR } "stuff\n";

Printing to a closed pipe or socket will generate a SIGPIPE signal.  See
L<perlipc> for more on signal handling.

=item printf FILEHANDLE FORMAT, LIST
X<printf>

=item printf FILEHANDLE

=item printf FORMAT, LIST

=item printf

=for Pod::Functions output a formatted list to a filehandle

Equivalent to C<print FILEHANDLE sprintf(FORMAT, LIST)>, except that
L<C<$\>|perlvar/$\> (the output record separator) is not appended.  The
FORMAT and the LIST are actually parsed as a single list.  The first
argument of the list will be interpreted as the
L<C<printf>|/printf FILEHANDLE FORMAT, LIST> format.  This means that
C<printf(@_)> will use C<$_[0]> as the format.  See
L<sprintf|/sprintf FORMAT, LIST> for an explanation of the format
argument.  If C<use locale> (including C<use locale ':not_characters'>)
is in effect and L<C<POSIX::setlocale>|POSIX/C<setlocale>> has been
called, the character used for the decimal separator in formatted
floating-point numbers is affected by the C<LC_NUMERIC> locale setting.
See L<perllocale> and L<POSIX>.
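
A brief sketch of both points, using an in-memory filehandle so that the
output can be checked (the variable names are illustrative):

```perl
use strict;
use warnings;

open(my $out, '>', \my $buffer) or die "cannot open in-memory handle: $!";

# The format is simply the first element of the list.
my @args = ("%-5s|%03d\n", "ab", 7);

{
    local $\ = "!";          # print would append this; printf does not
    printf {$out} @args;     # writes "ab   |007\n"
}

close $out;
```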

For historical reasons, if you omit the list, L<C<$_>|perlvar/$_> is
used as the format;
to use FILEHANDLE without a list, you must use a bareword filehandle like
C<FH>, not an indirect one like C<$fh>.  However, this will rarely do what
you want; if L<C<$_>|perlvar/$_> contains formatting codes, they will be
replaced with the empty string and a warning will be emitted if
L<warnings> are enabled.  Just use L<C<print>|/print FILEHANDLE LIST> if
you want to print the contents of L<C<$_>|perlvar/$_>.

Don't fall into the trap of using a
L<C<printf>|/printf FILEHANDLE FORMAT, LIST> when a simple
L<C<print>|/print FILEHANDLE LIST> would do.  The
L<C<print>|/print FILEHANDLE LIST> is more efficient and less error
prone.

=item prototype FUNCTION
X<prototype>

=item prototype

=for Pod::Functions +5.002 get the prototype (if any) of a subroutine

Returns the prototype of a function as a string (or
L<C<undef>|/undef EXPR> if the
function has no prototype).  FUNCTION is a reference to, or the name of,
the function whose prototype you want to retrieve.  If FUNCTION is omitted,
L<C<$_>|perlvar/$_> is used.

If FUNCTION is a string starting with C<CORE::>, the rest is taken as a
name for a Perl builtin.  If the builtin's arguments
cannot be adequately expressed by a prototype
(such as L<C<system>|/system LIST>), L<C<prototype>|/prototype FUNCTION>
returns L<C<undef>|/undef EXPR>, because the builtin
does not really behave like a Perl function.  Otherwise, the string
describing the equivalent prototype is returned.
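
For example, with two hypothetical subroutines C<pair> and C<plain>:

```perl
use strict;
use warnings;

sub pair ($$) { return "$_[0]=$_[1]" }   # declared with a prototype
sub plain     { return scalar @_ }       # no prototype

my $by_ref  = prototype(\&pair);         # '$$'
my $by_name = prototype("pair");         # '$$'
my $none    = prototype(\&plain);        # undef

# system cannot be described by a prototype, so this is undef too.
my $system = prototype("CORE::system");
```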

=item push ARRAY,LIST
X<push> X<stack>

=for Pod::Functions append one or more elements to an array

Treats ARRAY as a stack by appending the values of LIST to the end of
ARRAY.  The length of ARRAY increases by the length of LIST.  Has the same
effect as

    for my $value (LIST) {
        $ARRAY[++$#ARRAY] = $value;
    }

but is more efficient.  Returns the number of elements in the array following
the completed L<C<push>|/push ARRAY,LIST>.
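
A short sketch of the return value and of list flattening (the variable
names are illustrative):

```perl
use strict;
use warnings;

my @queue = ("a");

# The return value is the new element count, not the last element.
my $size = push @queue, "b", "c";   # 3

# An array in LIST is flattened, so its elements are appended one by one.
my @more = (1, 2);
push @queue, @more;                 # @queue is now ("a", "b", "c", 1, 2)

print "size was $size, queue is now @queue\n";
```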

Starting with Perl 5.14, an experimental feature allowed
L<C<push>|/push ARRAY,LIST> to take a
scalar expression. This experiment has been deemed unsuccessful, and was
removed as of Perl 5.24.

=item q/STRING/

=for Pod::Functions singly quote a string

=item qq/STRING/

=for Pod::Functions doubly quote a string

=item qw/STRING/

=for Pod::Functions quote a list of words

=item qx/STRING/

=for Pod::Functions backquote quote a string

Generalized quotes.  See L<perlop/"Quote-Like Operators">.

=item qr/STRING/

=for Pod::Functions +5.005 compile pattern

Regexp-like quote.  See L<perlop/"Regexp Quote-Like Operators">.

=item quotemeta EXPR
X<quotemeta> X<metacharacter>

=item quotemeta

=for Pod::Functions quote regular expression magic characters

Returns the value of EXPR with all the ASCII non-"word"
characters backslashed.  (That is, all ASCII characters not matching
C</[A-Za-z_0-9]/> will be preceded by a backslash in the
returned string, regardless of any locale settings.)
This is the internal function implementing
the C<\Q> escape in double-quoted strings.
(See below for the behavior on non-ASCII code points.)

If EXPR is omitted, uses L<C<$_>|perlvar/$_>.

quotemeta (and C<\Q> ... C<\E>) are useful when interpolating strings into
regular expressions, because by default an interpolated variable will be
considered a mini-regular expression.  For example:

    my $sentence = 'The quick brown fox jumped over the lazy dog';
    my $substring = 'quick.*?fox';
    $sentence =~ s{$substring}{big bad wolf};

This causes C<$sentence> to become C<'The big bad wolf jumped over...'>.

On the other hand:

    my $sentence = 'The quick brown fox jumped over the lazy dog';
    my $substring = 'quick.*?fox';
    $sentence =~ s{\Q$substring\E}{big bad wolf};

Or:

    my $sentence = 'The quick brown fox jumped over the lazy dog';
    my $substring = 'quick.*?fox';
    my $quoted_substring = quotemeta($substring);
    $sentence =~ s{$quoted_substring}{big bad wolf};

Both of these leave the sentence as is.
Normally, when accepting literal string input from the user,
L<C<quotemeta>|/quotemeta EXPR> or C<\Q> must be used.
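
The quoted string itself can be inspected directly.  In this sketch the
sample strings and variable names are illustrative only:

```perl
use strict;
use warnings;

my $literal = 'a.b*c';
my $quoted  = quotemeta($literal);                  # 'a\.b\*c'

# Quoted, the pattern matches only the literal text.
my $hit  = 'xa.b*cx' =~ /$quoted/ ? 1 : 0;          # 1
my $miss = 'aXbbbc'  =~ /$quoted/ ? 1 : 0;          # 0

# Unquoted, "." and "*" act as metacharacters.
my $loose = 'aXbbbc' =~ /$literal/ ? 1 : 0;         # 1

print "$quoted\n";
```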

In Perl v5.14, all non-ASCII characters are quoted in non-UTF-8-encoded
strings, but not quoted in UTF-8 strings.

Starting in Perl v5.16, Perl adopted a Unicode-defined strategy for
quoting non-ASCII characters; the quoting of ASCII characters is
unchanged.

Also unchanged is the quoting of non-UTF-8 strings when outside the
scope of a
L<C<use feature 'unicode_strings'>|feature/The 'unicode_strings' feature>,
which is to quote all
characters in the upper Latin1 range.  This provides complete backwards
compatibility for old programs which do not use Unicode.  (Note that
C<unicode_strings> is automatically enabled within the scope of a
S<C<use v5.12>> or greater.)

Within the scope of L<C<use locale>|locale>, all non-ASCII Latin1 code
points
are quoted whether the string is encoded as UTF-8 or not.  As mentioned
above, locale does not affect the quoting of ASCII-range characters.
This protects against those locales where characters such as C<"|"> are
considered to be word characters.

Otherwise, Perl quotes non-ASCII characters using an adaptation from
Unicode (see L<http://www.unicode.org/reports/tr31/>).
The only code points that are quoted are those that have any of the
Unicode properties:  Pattern_Syntax, Pattern_White_Space, White_Space,
Default_Ignorable_Code_Point, or General_Category=Control.

Of these properties, the two important ones are Pattern_Syntax and
Pattern_White_Space.  They have been set up by Unicode for exactly this
purpose of deciding which characters in a regular expression pattern
should be quoted.  No character that can be in an identifier has these
properties.

Perl promises that if we ever add regular expression pattern
metacharacters to the dozen already defined
(C<\ E<verbar> ( ) [ { ^ $ * + ? .>), we will only use ones that have the
Pattern_Syntax property.  Perl also promises that if we ever add
characters that are considered to be white space in regular expressions
(currently mostly affected by C</x>), they will all have the
Pattern_White_Space property.

Unicode promises that the set of code points that have these two
properties will never change, so something that is not quoted in v5.16
will never need to be quoted in any future Perl release.  (Not all the
code points that match Pattern_Syntax have actually had characters
assigned to them; so there is room to grow, but they are quoted
whether assigned or not.  Perl, of course, would never use an
unassigned code point as an actual metacharacter.)

Quoting characters that have the other 3 properties is done to enhance
the readability of the regular expression and not because they actually
need to be quoted for regular expression purposes (characters with the
White_Space property are likely to be indistinguishable on the page or
screen from those with the Pattern_White_Space property; and the other
two properties contain non-printing characters).

=item rand EXPR
X<rand> X<random>

=item rand

=for Pod::Functions retrieve the next pseudorandom number

Returns a random fractional number greater than or equal to C<0> and less
than the value of EXPR.  (EXPR should be positive.)  If EXPR is
omitted, the value C<1> is used.  Currently EXPR with the value C<0> is
also special-cased as C<1> (this was undocumented before Perl 5.8.0
and is subject to change in future versions of Perl).  Automatically calls
L<C<srand>|/srand EXPR> unless L<C<srand>|/srand EXPR> has already been
called.  See also L<C<srand>|/srand EXPR>.

Apply L<C<int>|/int EXPR> to the value returned by L<C<rand>|/rand EXPR>
if you want random integers instead of random fractional numbers.  For
example,

    int(rand(10))

returns a random integer between C<0> and C<9>, inclusive.
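
The same idiom extends to arbitrary inclusive ranges and to picking a
list element at random.  This is only a sketch; the helper names
C<random_int_between> and C<random_element> are hypothetical, not
built-ins.

    # Random integer N with $min <= N <= $max (both integers).
    sub random_int_between {
        my ($min, $max) = @_;
        return $min + int(rand($max - $min + 1));
    }

    # Random element of a non-empty list; the fractional array index
    # is truncated to an integer automatically.
    sub random_element {
        my @items = @_;
        return $items[rand @items];
    }

    my $die_roll = random_int_between(1, 6);
    my $color    = random_element(qw(red green blue));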

(Note: If your rand function consistently returns numbers that are too
large or too small, then your version of Perl was probably compiled
with the wrong number of RANDBITS.)

B<L<C<rand>|/rand EXPR> is not cryptographically secure.  You should not rely
on it in security-sensitive situations.>  As of this writing, a
number of third-party CPAN modules offer random number generators
intended by their authors to be cryptographically secure,
including: L<Data::Entropy>, L<Crypt::Random>, L<Math::Random::Secure>,
and L<Math::TrulyRandom>.

=item read FILEHANDLE,SCALAR,LENGTH,OFFSET
X<read> X<file, read>

=item read FILEHANDLE,SCALAR,LENGTH

=for Pod::Functions fixed-length buffered input from a filehandle

Attempts to read LENGTH I<characters> of data into variable SCALAR
from the specified FILEHANDLE.  Returns the number of characters
actually read, C<0> at end of file, or undef if there was an error (in
the latter case L<C<$!>|perlvar/$!> is also set).  SCALAR will be grown
or shrunk
so that the last character actually read is the last character of the
scalar after the read.

An OFFSET may be specified to place the read data at some place in the
string other than the beginning.  A negative OFFSET specifies
placement at that many characters counting backwards from the end of
the string.  A positive OFFSET greater than the length of SCALAR
results in the string being padded to the required size with C<"\0">
bytes before the result of the read is appended.
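
A common use of OFFSET is to append successive chunks to one buffer.
The following is a sketch only: C<read_in_chunks> is a hypothetical
helper, and the in-memory filehandle simply stands in for a real file.

    # Read everything from $fh, $chunk_size characters at a time.
    sub read_in_chunks {
        my ($fh, $chunk_size) = @_;
        my $buf = '';
        while (1) {
            # An OFFSET of length($buf) appends to the end of $buf.
            my $n = read($fh, $buf, $chunk_size, length $buf);
            die "read failed: $!" unless defined $n;
            last if $n == 0;    # end of file
        }
        return $buf;
    }

    my $data = "abcdefghij";
    open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";
    print read_in_chunks($fh, 4), "\n";    # prints abcdefghij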

The call is implemented in terms of either Perl's or your system's native
L<fread(3)> library function.  To get a true L<read(2)> system call, see
L<sysread|/sysread FILEHANDLE,SCALAR,LENGTH,OFFSET>.

Note the I<characters>: depending on the status of the filehandle,
either (8-bit) bytes or characters are read.  By default, all
filehandles operate on bytes, but for example if the filehandle has
been opened with the C<:utf8> I/O layer (see
L<C<open>|/open FILEHANDLE,EXPR>, and the L<open>
pragma), the I/O will operate on UTF8-encoded Unicode
characters, not bytes.  Similarly for the C<:encoding> layer:
in that case pretty much any characters can be read.

=item readdir DIRHANDLE
X<readdir>

=for Pod::Functions get a directory from a directory handle

Returns the next directory entry for a directory opened by
L<C<opendir>|/opendir DIRHANDLE,EXPR>.
If used in list context, returns all the rest of the entries in the
directory.  If there are no more entries, returns the undefined value in
scalar context and the empty list in list context.

If you're planning to filetest the return values out of a
L<C<readdir>|/readdir DIRHANDLE>, you'd better prepend the directory in
question.  Otherwise, because we didn't L<C<chdir>|/chdir EXPR> there,
you would be testing the wrong file.

    opendir(my $dh, $some_dir) || die "Can't opendir $some_dir: $!";
    my @dots = grep { /^\./ && -f "$some_dir/$_" } readdir($dh);
    closedir $dh;

As of Perl 5.12 you can use a bare L<C<readdir>|/readdir DIRHANDLE> in a
C<while> loop, which will set L<C<$_>|perlvar/$_> on every iteration.

    opendir(my $dh, $some_dir) || die "Can't open $some_dir: $!";
    while (readdir $dh) {
        print "$some_dir/$_\n";
    }
    closedir $dh;

To avoid confusing would-be users of your code who are running earlier
versions of Perl with mysterious failures, put this sort of thing at the
top of your file to signal that your code will work I<only> on Perls of a
recent vintage:

    use 5.012; # so readdir assigns to $_ in a lone while test

=item readline EXPR

=item readline
X<readline> X<gets> X<fgets>

=for Pod::Functions fetch a record from a file

Reads from the filehandle whose typeglob is contained in EXPR (or from
C<*ARGV> if EXPR is not provided).  In scalar context, each call reads and
returns the next line until end-of-file is reached, whereupon the
subsequent call returns L<C<undef>|/undef EXPR>.  In list context, reads
until end-of-file is reached and returns a list of lines.  Note that the
notion of "line" used here is whatever you may have defined with
L<C<$E<sol>>|perlvar/$E<sol>> (or C<$INPUT_RECORD_SEPARATOR> in
L<English>).  See L<perlvar/"$/">.

When L<C<$E<sol>>|perlvar/$E<sol>> is set to L<C<undef>|/undef EXPR>,
when L<C<readline>|/readline EXPR> is in scalar context (i.e., file
slurp mode), and when an empty file is read, it returns C<''> the first
time, followed by L<C<undef>|/undef EXPR> subsequently.

This is the internal function implementing the C<< <EXPR> >>
operator, but you can use it directly.  The C<< <EXPR> >>
operator is discussed in more detail in L<perlop/"I/O Operators">.

    my $line = <STDIN>;
    my $line = readline(STDIN);    # same thing

If L<C<readline>|/readline EXPR> encounters an operating system error,
L<C<$!>|perlvar/$!> will be set with the corresponding error message.
It can be helpful to check L<C<$!>|perlvar/$!> when you are reading from
filehandles you don't trust, such as a tty or a socket.  The following
example uses the operator form of L<C<readline>|/readline EXPR> and dies
if the result is not defined.

    while ( ! eof($fh) ) {
        defined( $_ = readline $fh ) or die "readline failed: $!";
        ...
    }

Note that you can't handle L<C<readline>|/readline EXPR> errors
that way with the C<ARGV> filehandle.  In that case, you have to open
each element of L<C<@ARGV>|perlvar/@ARGV> yourself since
L<C<eof>|/eof FILEHANDLE> handles C<ARGV> differently.

    foreach my $arg (@ARGV) {
        open(my $fh, $arg) or warn "Can't open $arg: $!";

        while ( ! eof($fh) ) {
            defined( $_ = readline $fh )
                or die "readline failed for $arg: $!";
            ...
        }
    }

=item readlink EXPR
X<readlink>

=item readlink

=for Pod::Functions determine where a symbolic link is pointing

Returns the value of a symbolic link, if symbolic links are
implemented.  If not, raises an exception.  If there is a system
error, returns the undefined value and sets L<C<$!>|perlvar/$!> (errno).
If EXPR is omitted, uses L<C<$_>|perlvar/$_>.

Portability issues: L<perlport/readlink>.

=item readpipe EXPR

=item readpipe
X<readpipe>

=for Pod::Functions execute a system command and collect standard output

EXPR is executed as a system command.
The collected standard output of the command is returned.
In scalar context, it comes back as a single (potentially
multi-line) string.  In list context, returns a list of lines
(however you've defined lines with L<C<$E<sol>>|perlvar/$E<sol>> (or
C<$INPUT_RECORD_SEPARATOR> in L<English>)).
This is the internal function implementing the C<qx/EXPR/>
operator, but you can use it directly.  The C<qx/EXPR/>
operator is discussed in more detail in L<perlop/"I/O Operators">.
If EXPR is omitted, uses L<C<$_>|perlvar/$_>.
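
The difference between the two contexts can be seen with a command
whose output is known in advance.  This sketch assumes a Unix-like
shell for the quoting; C<$^X> (the path of the running perl) is used
only so that the command is certain to exist.

    my $cmd   = qq{"$^X" -le "print for 1..3"};
    my $text  = readpipe($cmd);    # scalar context: "1\n2\n3\n"
    my @lines = readpipe($cmd);    # list context: ("1\n", "2\n", "3\n")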

=item recv SOCKET,SCALAR,LENGTH,FLAGS
X<recv>

=for Pod::Functions receive a message over a Socket

Receives a message on a socket.  Attempts to receive LENGTH characters
of data into variable SCALAR from the specified SOCKET filehandle.
SCALAR will be grown or shrunk to the length actually read.  Takes the
same flags as the system call of the same name.  Returns the address
of the sender if SOCKET's protocol supports this; returns an empty
string otherwise.  If there's an error, returns the undefined value.
This call is actually implemented in terms of the L<recvfrom(2)> system call.
See L<perlipc/"UDP: Message Passing"> for examples.

Note the I<characters>: depending on the status of the socket, either
(8-bit) bytes or characters are received.  By default all sockets
operate on bytes, but for example if the socket has been changed using
L<C<binmode>|/binmode FILEHANDLE, LAYER> to operate with the
C<:encoding(UTF-8)> I/O layer (see the L<open> pragma), the I/O will
operate on UTF8-encoded Unicode
characters, not bytes.  Similarly for the C<:encoding> layer: in that
case pretty much any characters can be read.

=item redo LABEL
X<redo>

=item redo EXPR

=item redo

=for Pod::Functions start this loop iteration over again

The L<C<redo>|/redo LABEL> command restarts the loop block without
evaluating the conditional again.  The L<C<continue>|/continue BLOCK>
block, if any, is not executed.  If
the LABEL is omitted, the command refers to the innermost enclosing
loop.  The C<redo EXPR> form, available starting in Perl 5.18.0, allows a
label name to be computed at run time, and is otherwise identical to C<redo
LABEL>.  Programs that want to lie to themselves about what was just input
normally use this command:

    # a simpleminded Pascal comment stripper
    # (warning: assumes no { or } in strings)
    LINE: while (<STDIN>) {
        while (s|({.*}.*){.*}|$1 |) {}
        s|{.*}| |;
        if (s|{.*| |) {
            my $front = $_;
            while (<STDIN>) {
                if (/}/) {  # end of comment?
                    s|^|$front\{|;
                    redo LINE;
                }
            }
        }
        print;
    }

L<C<redo>|/redo LABEL> cannot be used to retry a block that returns a
value such as C<eval {}>, C<sub {}>, or C<do {}>, and should not be used
to exit a L<C<grep>|/grep BLOCK LIST> or L<C<map>|/map BLOCK LIST>
operation.

Note that a block by itself is semantically identical to a loop
that executes once.  Thus L<C<redo>|/redo LABEL> inside such a block
will effectively turn it into a looping construct.
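
A small sketch of a bare block used as a loop in this way (the
subroutine name C<count_to> is hypothetical):

    sub count_to {
        my ($limit) = @_;
        my $i = 0;
        {
            $i++;
            redo if $i < $limit;    # restart the bare block
        }
        return $i;
    }

    print count_to(3), "\n";    # prints 3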

See also L<C<continue>|/continue BLOCK> for an illustration of how
L<C<last>|/last LABEL>, L<C<next>|/next LABEL>, and
L<C<redo>|/redo LABEL> work.

Unlike most named operators, this has the same precedence as assignment.
It is also exempt from the looks-like-a-function rule, so
C<redo ("foo")."bar"> will cause "bar" to be part of the argument to
L<C<redo>|/redo LABEL>.

=item ref EXPR
X<ref> X<reference>

=item ref

=for Pod::Functions find out the type of thing being referenced

Returns a non-empty string if EXPR is a reference, the empty
string otherwise.  If EXPR is not specified, L<C<$_>|perlvar/$_> will be
used.  The value returned depends on the type of thing the reference is
a reference to.

Builtin types include:

    SCALAR
    ARRAY
    HASH
    CODE
    REF
    GLOB
    LVALUE
    FORMAT
    IO
    VSTRING
    Regexp

You can think of L<C<ref>|/ref EXPR> as a C<typeof> operator.

    if (ref($r) eq "HASH") {
        print "r is a reference to a hash.\n";
    }
    unless (ref($r)) {
        print "r is not a reference at all.\n";
    }

The return value C<LVALUE> indicates a reference to an lvalue that is not
a variable.  You get this from taking the reference of function calls like
L<C<pos>|/pos SCALAR> or
L<C<substr>|/substr EXPR,OFFSET,LENGTH,REPLACEMENT>.  C<VSTRING> is
returned if the reference points to a
L<version string|perldata/"Version Strings">.

The result C<Regexp> indicates that the argument is a regular expression
resulting from L<C<qrE<sol>E<sol>>|/qrE<sol>STRINGE<sol>>.

If the referenced object has been blessed into a package, then that package
name is returned instead.  But don't use that, as it's now considered
"bad practice".  For one thing, an object could be using a class called
C<Regexp> or C<IO>, or even C<HASH>.  Also, L<C<ref>|/ref EXPR> doesn't
take into account subclasses, like
L<C<isa>|UNIVERSAL/C<< $obj->isa( TYPE ) >>> does.

Instead, use L<C<blessed>|Scalar::Util/blessed> (in the L<Scalar::Util>
module) for boolean checks, L<C<isa>|UNIVERSAL/C<< $obj->isa( TYPE ) >>>
for specific class checks and L<C<reftype>|Scalar::Util/reftype> (also
from L<Scalar::Util>) for type checks.  (See L<perlobj> for details and
a L<C<blessed>|Scalar::Util/blessed>/L<C<isa>|UNIVERSAL/C<< $obj->isa( TYPE ) >>>
example.)
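
The following sketch shows how the two L<Scalar::Util> functions
separate the class from the underlying type.  The helper C<describe>
and the deliberately misleading class name C<ARRAY> are hypothetical,
chosen to show a case where plain L<C<ref>|/ref EXPR> would mislead.

    use Scalar::Util qw(blessed reftype);

    sub describe {
        my ($thing) = @_;
        return 'not a reference' unless ref $thing;
        my $type  = reftype($thing);    # underlying type, ignores bless
        my $class = blessed($thing);    # undef unless blessed
        return defined $class ? "$class object ($type)"
                              : "plain $type reference";
    }

    print describe([]), "\n";                   # plain ARRAY reference
    print describe(bless {}, 'ARRAY'), "\n";    # ARRAY object (HASH)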

See also L<perlref>.

=item rename OLDNAME,NEWNAME
X<rename> X<move> X<mv> X<ren>

=for Pod::Functions change a filename

Changes the name of a file; an existing file NEWNAME will be
clobbered.  Returns true for success, false otherwise.

Behavior of this function varies wildly depending on your system
implementation.  For example, it will usually not work across file system
boundaries, even though the system I<mv> command sometimes compensates
for this.  Other restrictions include whether it works on directories,
open files, or pre-existing files.  Check L<perlport> and either the
L<rename(2)> manpage or equivalent system documentation for details.

For a platform-independent L<C<move>|File::Copy/move> function look at
the L<File::Copy> module.
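
A minimal sketch of that approach follows.  L<File::Copy> documents
that C<move> renames the file when it can and otherwise copies it and
deletes the original; the wrapper name C<relocate> is hypothetical.

    use File::Copy qw(move);

    # Move $old to $new, dying with the system error on failure.
    sub relocate {
        my ($old, $new) = @_;
        move($old, $new) or die "Can't move $old to $new: $!";
        return $new;
    }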

Portability issues: L<perlport/rename>.

=item require VERSION
X<require>

=item require EXPR

=item require

=for Pod::Functions load in external functions from a library at runtime

Demands a version of Perl specified by VERSION, or demands some semantics
specified by EXPR or by L<C<$_>|perlvar/$_> if EXPR is not supplied.

VERSION may be either a numeric argument such as 5.006, which will be
compared to L<C<$]>|perlvar/$]>, or a literal of the form v5.6.1, which
will be compared to L<C<$^V>|perlvar/$^V> (or C<$PERL_VERSION> in
L<English>).  An exception is raised if VERSION is greater than the
version of the current Perl interpreter.  Compare with
L<C<use>|/use Module VERSION LIST>, which can do a similar check at
compile time.

Specifying VERSION as a literal of the form v5.6.1 should generally be
avoided, because it leads to misleading error messages under earlier
versions of Perl that do not support this syntax.  The equivalent numeric
version should be used instead.

    require v5.6.1;     # run time version check
    require 5.6.1;      # ditto
    require 5.006_001;  # ditto; preferred for backwards
                        # compatibility

Otherwise, L<C<require>|/require VERSION> demands that a library file be
included if it hasn't already been included.  The file is included via
the do-FILE mechanism, which is essentially just a variety of
L<C<eval>|/eval EXPR> with the
caveat that lexical variables in the invoking script will be invisible
to the included code.  If it were implemented in pure Perl, it
would have semantics similar to the following:

    use Carp 'croak';
    use version;

    sub require {
        my ($filename) = @_;
        if ( my $version = eval { version->parse($filename) } ) {
            if ( $version > $^V ) {
               my $vn = $version->normal;
               croak "Perl $vn required--this is only $^V, stopped";
            }
            return 1;
        }

        if (exists $INC{$filename}) {
            return 1 if $INC{$filename};
            croak "Compilation failed in require";
        }

        foreach my $prefix (@INC) {
            if (ref($prefix)) {
                #... do other stuff - see text below ....
            }
            # (see text below about possible appending of .pmc
            # suffix to $filename)
            my $realfilename = "$prefix/$filename";
            next if ! -e $realfilename || -d _ || -b _;
            $INC{$filename} = $realfilename;
            my $result = do($realfilename);
                         # but run in caller's namespace

            if (!defined $result) {
                $INC{$filename} = undef;
                croak $@ ? "$@Compilation failed in require"
                         : "Can't locate $filename: $!\n";
            }
            if (!$result) {
                delete $INC{$filename};
                croak "$filename did not return true value";
            }
            $! = 0;
            return $result;
        }
        croak "Can't locate $filename in \@INC ...";
    }

Note that the file will not be included twice under the same specified
name.

The file must return true as the last statement to indicate
successful execution of any initialization code, so it's customary to
end such a file with C<1;> unless you're sure it'll return true
otherwise.  But it's better just to put the C<1;>, in case you add more
statements.

If EXPR is a bareword, L<C<require>|/require VERSION> assumes a F<.pm>
extension and replaces C<::> with C</> in the filename for you,
to make it easy to load standard modules.  This form of loading of
modules does not risk altering your namespace.

In other words, if you try this:

        require Foo::Bar;     # a splendid bareword

The require function will actually look for the F<Foo/Bar.pm> file in the
directories specified in the L<C<@INC>|perlvar/@INC> array.

But if you try this:

        my $class = 'Foo::Bar';
        require $class;       # $class is not a bareword
    #or
        require "Foo::Bar";   # not a bareword because of the ""

The require function will look for the F<Foo::Bar> file in the
L<C<@INC>|perlvar/@INC> array and
will complain about not finding F<Foo::Bar> there.  In this case you can do:

        eval "require $class";
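
As an alternative that avoids a string eval, you can perform the
bareword translation yourself.  This is a sketch under the assumption
that C<$class> holds a well-formed package name; C<load_class> is a
hypothetical helper, and L<File::Spec> is used only because it ships
with Perl.

    # Load a module whose name is only known at run time.
    sub load_class {
        my ($class) = @_;
        (my $file = "$class.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
        require $file;
        return $class;
    }

    load_class('File::Spec');
    print File::Spec->catfile('a', 'b'), "\n";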

Now that you understand how L<C<require>|/require VERSION> looks for
files with a bareword argument, there is a little extra functionality
going on behind the scenes.  Before L<C<require>|/require VERSION> looks
for a F<.pm> extension, it will first look for a similar filename with a
F<.pmc> extension.  If this file is found, it will be loaded in place of
any file ending in a F<.pm> extension.

You can also insert hooks into the import facility by putting Perl code
directly into the L<C<@INC>|perlvar/@INC> array.  There are three forms
of hooks: subroutine references, array references, and blessed objects.

Subroutine references are the simplest case.  When the inclusion system
walks through L<C<@INC>|perlvar/@INC> and encounters a subroutine, this
subroutine gets called with two parameters, the first a reference to
itself, and the second the name of the file to be included (e.g.,
F<Foo/Bar.pm>).  The subroutine should return either nothing or else a
list of up to four values in the following order:

=over

=item 1

A reference to a scalar, containing any initial source code to prepend to
the file or generator output.

=item 2

A filehandle, from which the file will be read.

=item 3

A reference to a subroutine.  If there is no filehandle (previous item),
then this subroutine is expected to generate one line of source code per
call, writing the line into L<C<$_>|perlvar/$_> and returning 1, then
finally at end of file returning 0.  If there is a filehandle, then the
subroutine will be called to act as a simple source filter, with the
line as read in L<C<$_>|perlvar/$_>.
Again, return 1 for each valid line, and 0 after all lines have been
returned.

=item 4

Optional state for the subroutine.  The state is passed in as C<$_[1]>.  A
reference to the subroutine itself is passed in as C<$_[0]>.

=back

If an empty list, L<C<undef>|/undef EXPR>, or nothing that matches the
first 3 values above is returned, then L<C<require>|/require VERSION>
looks at the remaining elements of L<C<@INC>|perlvar/@INC>.
Note that this filehandle must be a real filehandle (strictly a typeglob
or reference to a typeglob, whether blessed or unblessed); tied filehandles
will be ignored and processing will stop there.

If the hook is an array reference, its first element must be a subroutine
reference.  This subroutine is called as above, but the first parameter is
the array reference.  This lets you indirectly pass arguments to
the subroutine.

In other words, you can write:

    push @INC, \&my_sub;
    sub my_sub {
        my ($coderef, $filename) = @_;  # $coderef is \&my_sub
        ...
    }

or:

    push @INC, [ \&my_sub, $x, $y, ... ];
    sub my_sub {
        my ($arrayref, $filename) = @_;
        # Retrieve $x, $y, ...
        my (undef, @parameters) = @$arrayref;
        ...
    }

If the hook is an object, it must provide an C<INC> method that will be
called as above, the first parameter being the object itself.  (Note that
you must fully qualify the sub's name, as unqualified C<INC> is always forced
into package C<main>.)  Here is a typical code layout:

    # In Foo.pm
    package Foo;
    sub new { ... }
    sub Foo::INC {
        my ($self, $filename) = @_;
        ...
    }

    # In the main program
    push @INC, Foo->new(...);

These hooks are also permitted to set the L<C<%INC>|perlvar/%INC> entry
corresponding to the files they have loaded.  See L<perlvar/%INC>.
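
To make the subroutine-reference form concrete, here is a minimal
sketch of a hook that serves module source from a hash instead of from
disk.  The module name C<My::Virtual> and the hash C<%virtual> are
hypothetical; the hook returns only a filehandle, the second of the
values listed above.

    my %virtual = (
        'My/Virtual.pm' => "package My::Virtual; sub answer { 42 } 1;\n",
    );

    push @INC, sub {
        my ($self, $filename) = @_;
        my $source = $virtual{$filename};
        return unless defined $source;    # let require keep searching
        open(my $fh, '<', \$source)
            or die "Can't open in-memory file: $!";
        return $fh;
    };

    require My::Virtual;
    print My::Virtual::answer(), "\n";    # prints 42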

For a yet-more-powerful import facility, see
L<C<use>|/use Module VERSION LIST> and L<perlmod>.

=item reset EXPR
X<reset>

=item reset

=for Pod::Functions clear all variables of a given name

Generally used in a L<C<continue>|/continue BLOCK> block at the end of a
loop to clear variables and reset C<m?pattern?> searches so that they
work again.  The
expression is interpreted as a list of single characters (hyphens
allowed for ranges).  All variables and arrays beginning with one of
those letters are reset to their pristine state.  If the expression is
omitted, one-match searches (C<m?pattern?>) are reset to match again.
Only resets variables or searches in the current package.  Always returns
1.  Examples:

    reset 'X';      # reset all X variables
    reset 'a-z';    # reset lower case variables
    reset;          # just reset m?one-time? searches

Resetting C<"A-Z"> is not recommended because you'll wipe out your
L<C<@ARGV>|perlvar/@ARGV> and L<C<@INC>|perlvar/@INC> arrays and your
L<C<%ENV>|perlvar/%ENV> hash.
Resets only package variables; lexical variables are unaffected, but
they clean themselves up on scope exit anyway, so you'll probably want
to use them instead.  See L<C<my>|/my VARLIST>.

=item return EXPR
X<return>

=item return

=for Pod::Functions get out of a function early

Returns from a subroutine, L<C<eval>|/eval EXPR>,
L<C<do FILE>|/do EXPR>, L<C<sort>|/sort SUBNAME LIST> block or regex
eval block (but not a L<C<grep>|/grep BLOCK LIST> or
L<C<map>|/map BLOCK LIST> block) with the value
given in EXPR.  Evaluation of EXPR may be in list, scalar, or void
context, depending on how the return value will be used, and the context
may vary from one execution to the next (see
L<C<wantarray>|/wantarray>).  If no EXPR
is given, returns an empty list in list context, the undefined value in
scalar context, and (of course) nothing at all in void context.
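
A short sketch of both points, using the hypothetical subroutines
C<nothing> and C<context>:

    sub nothing { return }

    my @list   = nothing();    # empty list
    my $scalar = nothing();    # undef

    sub context {
        return wantarray          ? 'list'
             : defined(wantarray) ? 'scalar'
             :                      'void';
    }

    my @in_list   = context();    # ('list')
    my $in_scalar = context();    # 'scalar'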

(In the absence of an explicit L<C<return>|/return EXPR>, a subroutine,
L<C<eval>|/eval EXPR>,
or L<C<do FILE>|/do EXPR> automatically returns the value of the last expression
evaluated.)

Unlike most named operators, this is also exempt from the
looks-like-a-function rule, so C<return ("foo")."bar"> will
cause C<"bar"> to be part of the argument to L<C<return>|/return EXPR>.

=item reverse LIST
X<reverse> X<rev> X<invert>

=for Pod::Functions flip a string or a list

In list context, returns a list value consisting of the elements
of LIST in the opposite order.  In scalar context, concatenates the
elements of LIST and returns a string value with all characters
in the opposite order.

    print join(", ", reverse "world", "Hello"); # Hello, world

    print scalar reverse "dlrow ,", "olleH";    # Hello, world

Used without arguments in scalar context, L<C<reverse>|/reverse LIST>
reverses L<C<$_>|perlvar/$_>.

    $_ = "dlrow ,olleH";
    print reverse;                         # No output, list context
    print scalar reverse;                  # Hello, world

Note that reversing an array to itself (as in C<@a = reverse @a>) will
preserve non-existent elements whenever possible; i.e., for non-magical
arrays or for tied arrays with C<EXISTS> and C<DELETE> methods.

This operator is also handy for inverting a hash, although there are some
caveats.  If a value is duplicated in the original hash, only one of those
can be represented as a key in the inverted hash.  Also, this has to
unwind one hash and build a whole new one, which may take some time
on a large hash, such as from a DBM file.

    my %by_name = reverse %by_address;  # Invert the hash
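
The duplicate-value caveat can be seen in a small sketch (the hash
names are hypothetical):

    my %color_of = (apple => 'red', cherry => 'red', lime => 'green');
    my %fruit_of = reverse %color_of;

    # Two keys survive: 'green' maps to 'lime', while 'red' maps to
    # either 'apple' or 'cherry', depending on hash ordering.
    print scalar(keys %fruit_of), "\n";    # prints 2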

=item rewinddir DIRHANDLE
X<rewinddir>

=for Pod::Functions reset directory handle

Sets the current position to the beginning of the directory for the
L<C<readdir>|/readdir DIRHANDLE> routine on DIRHANDLE.

Portability issues: L<perlport/rewinddir>.

=item rindex STR,SUBSTR,POSITION
X<rindex>

=item rindex STR,SUBSTR

=for Pod::Functions right-to-left substring search

Works just like L<C<index>|/index STR,SUBSTR,POSITION> except that it
returns the position of the I<last>
occurrence of SUBSTR in STR.  If POSITION is specified, returns the
last occurrence beginning at or before that position.
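
For example, with a hypothetical path string:

    my $path = "/usr/local/lib/perl5";
    my $last = rindex($path, "/");               # 14
    my $prev = rindex($path, "/", $last - 1);    # 10
    my $base = substr($path, $last + 1);         # "perl5"
    my $none = rindex($path, "?");               # -1 when not found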

=item rmdir FILENAME
X<rmdir> X<rd> X<directory, remove>

=item rmdir

=for Pod::Functions remove a directory

Deletes the directory specified by FILENAME if that directory is
empty.  If it succeeds it returns true; otherwise it returns false and
sets L<C<$!>|perlvar/$!> (errno).  If FILENAME is omitted, uses
L<C<$_>|perlvar/$_>.

To remove a directory tree recursively (C<rm -rf> on Unix) look at
the L<C<rmtree>|File::Path/rmtree( $dir )> function of the L<File::Path>
module.

=item s///

=for Pod::Functions replace a pattern with a string

The substitution operator.  See L<perlop/"Regexp Quote-Like Operators">.

=item say FILEHANDLE LIST
X<say>

=item say FILEHANDLE

=item say LIST

=item say

=for Pod::Functions +say output a list to a filehandle, appending a newline

Just like L<C<print>|/print FILEHANDLE LIST>, but implicitly appends a
newline.  C<say LIST> is simply an abbreviation for
C<{ local $\ = "\n"; print LIST }>.  To use FILEHANDLE without a LIST to
print the contents of L<C<$_>|perlvar/$_> to it, you must use a bareword
filehandle like C<FH>, not an indirect one like C<$fh>.

L<C<say>|/say FILEHANDLE LIST> is available only if the
L<C<"say"> feature|feature/The 'say' feature> is enabled or if it is
prefixed with C<CORE::>.  The
L<C<"say"> feature|feature/The 'say' feature> is enabled automatically
with a C<use v5.10> (or higher) declaration in the current scope.
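
A brief sketch, writing to an in-memory filehandle so that the result
can be inspected (the variable names are hypothetical):

    use feature 'say';

    my $buf = '';
    open(my $out, '>', \$buf) or die "Can't open in-memory file: $!";
    say {$out} "Hello, ", "world";    # one newline after the whole list
    say {$out} $_ for qw(a b);        # one newline per call
    close($out);
    # $buf now holds "Hello, world\na\nb\n"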

=item scalar EXPR
X<scalar> X<context>

=for Pod::Functions force a scalar context

Forces EXPR to be interpreted in scalar context and returns the value
of EXPR.

    my @counts = ( scalar @a, scalar @b, scalar @c );

There is no equivalent operator to force an expression to
be interpolated in list context because in practice, this is never
needed.  If you really wanted to do so, however, you could use
the construction C<@{[ (some expression) ]}>, but usually a simple
C<(some expression)> suffices.

Because L<C<scalar>|/scalar EXPR> is a unary operator, if you
accidentally use a
parenthesized list for the EXPR, this behaves as a scalar comma expression,
evaluating all but the last element in void context and returning the final
element evaluated in scalar context.  This is seldom what you want.

The following single statement:

    print uc(scalar(foo(), $bar)), $baz;

is the moral equivalent of these two:

    foo();
    print(uc($bar), $baz);

See L<perlop> for more details on unary operators and the comma operator,
and L<perldata> for details on evaluating a hash in scalar context.

=item seek FILEHANDLE,POSITION,WHENCE
X<seek> X<fseek> X<filehandle, position>

=for Pod::Functions reposition file pointer for random-access I/O

Sets FILEHANDLE's position, just like the L<fseek(3)> call of C C<stdio>.
FILEHANDLE may be an expression whose value gives the name of the
filehandle.  The values for WHENCE are C<0> to set the new position
I<in bytes> to POSITION; C<1> to set it to the current position plus
POSITION; and C<2> to set it to EOF plus POSITION, typically
negative.  For WHENCE you may use the constants C<SEEK_SET>,
C<SEEK_CUR>, and C<SEEK_END> (start of the file, current position, end
of the file) from the L<Fcntl> module.  Returns C<1> on success, false
otherwise.
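
A minimal sketch using the L<Fcntl> constants follows.  The helper
C<last_bytes> is hypothetical, and the in-memory filehandle stands in
for a real seekable file opened on bytes.

    use Fcntl qw(SEEK_SET SEEK_END);

    # Return the last $n bytes of a seekable filehandle.
    sub last_bytes {
        my ($fh, $n) = @_;
        seek($fh, -$n, SEEK_END) or die "seek failed: $!";
        my $got = read($fh, my $tail, $n);
        die "read failed: $!" unless defined $got;
        return $tail;
    }

    my $data = "0123456789";
    open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";
    print last_bytes($fh, 3), "\n";    # prints 789
    seek($fh, 2, SEEK_SET) or die "seek failed: $!";
    read($fh, my $chunk, 2);           # $chunk is now "23"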

Note the emphasis on bytes: even if the filehandle has been set to operate
on characters (for example using the C<:encoding(UTF-8)> I/O layer), the
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
L<C<tell>|/tell FILEHANDLE>, and
L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>
family of functions use byte offsets, not character offsets,
because seeking to a character offset would be very slow in a UTF-8 file.

If you want to position the file for
L<C<sysread>|/sysread FILEHANDLE,SCALAR,LENGTH,OFFSET> or
L<C<syswrite>|/syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET>, don't use
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>, because buffering makes its
effect on the file's read-write position unpredictable and non-portable.
Use L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE> instead.

Due to the rules and rigors of ANSI C, on some systems you have to do a
seek whenever you switch between reading and writing.  Amongst other
things, this may have the effect of calling stdio's L<clearerr(3)>.
A WHENCE of C<1> (C<SEEK_CUR>) is useful for not moving the file position:

    seek($fh, 0, 1);

This is also useful for applications emulating C<tail -f>.  Once you hit
EOF on your read and then sleep for a while, you (probably) have to stick in a
dummy L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE> to reset things.  The
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE> doesn't change the position,
but it I<does> clear the end-of-file condition on the handle, so that the
next C<readline FILE> makes Perl try again to read something.  (We hope.)

If that doesn't work (some I/O implementations are particularly
cantankerous), you might need something like this:

    for (;;) {
        for ($curpos = tell($fh); $_ = readline($fh);
             $curpos = tell($fh)) {
            # search for some stuff and put it into files
        }
        sleep($for_a_while);
        seek($fh, $curpos, 0);
    }

=item seekdir DIRHANDLE,POS
X<seekdir>

=for Pod::Functions reposition directory pointer

Sets the current position for the L<C<readdir>|/readdir DIRHANDLE>
routine on DIRHANDLE.  POS must be a value returned by
L<C<telldir>|/telldir DIRHANDLE>.  L<C<seekdir>|/seekdir DIRHANDLE,POS>
also has the same caveats about possible directory compaction as the
corresponding system library routine.

=item select FILEHANDLE
X<select> X<filehandle, default>

=item select

=for Pod::Functions reset default output or do I/O multiplexing

Returns the currently selected filehandle.  If FILEHANDLE is supplied,
sets the new current default filehandle for output.  This has two
effects: first, a L<C<write>|/write FILEHANDLE> or a
L<C<print>|/print FILEHANDLE LIST> without a filehandle defaults to this
FILEHANDLE.  Second, references to variables related to output will refer
to this output channel.

For example, to set the top-of-form format for more than one
output channel, you might do the following:

    select(REPORT1);
    $^ = 'report1_top';
    select(REPORT2);
    $^ = 'report2_top';

FILEHANDLE may be an expression whose value gives the name of the
actual filehandle.  Thus:

    my $oldfh = select(STDERR); $| = 1; select($oldfh);

Some programmers may prefer to think of filehandles as objects with
methods, preferring to write the last example as:

    STDERR->autoflush(1);

(Prior to Perl version 5.14, you had to C<use IO::Handle;> explicitly
first.)

Portability issues: L<perlport/select>.

=item select RBITS,WBITS,EBITS,TIMEOUT
X<select>

This calls the L<select(2)> syscall with the bit masks specified, which
can be constructed using L<C<fileno>|/fileno FILEHANDLE> and
L<C<vec>|/vec EXPR,OFFSET,BITS>, along these lines:

    my $rin = my $win = my $ein = '';
    vec($rin, fileno(STDIN),  1) = 1;
    vec($win, fileno(STDOUT), 1) = 1;
    $ein = $rin | $win;

If you want to select on many filehandles, you may wish to write a
subroutine like this:

    sub fhbits {
        my @fhlist = @_;
        my $bits = "";
        for my $fh (@fhlist) {
            vec($bits, fileno($fh), 1) = 1;
        }
        return $bits;
    }
    my $rin = fhbits(\*STDIN, $tty, $mysock);

The usual idiom is:

 my ($nfound, $timeleft) =
   select(my $rout = $rin, my $wout = $win, my $eout = $ein,
                                                          $timeout);

or, to block until something becomes ready, just do this:

 my $nfound =
   select(my $rout = $rin, my $wout = $win, my $eout = $ein, undef);

Most systems do not bother to return anything useful in C<$timeleft>, so
calling L<C<select>|/select RBITS,WBITS,EBITS,TIMEOUT> in scalar context
just returns C<$nfound>.

Any of the bit masks can also be L<C<undef>|/undef EXPR>.  The timeout,
if specified, is
in seconds, which may be fractional.  Note: not all implementations are
capable of returning the C<$timeleft>.  If not, they always return
C<$timeleft> equal to the supplied C<$timeout>.

You can effect a sleep of 250 milliseconds this way:

    select(undef, undef, undef, 0.25);

Note that whether L<C<select>|/select RBITS,WBITS,EBITS,TIMEOUT> gets
restarted after signals (say, SIGALRM) is implementation-dependent.  See
also L<perlport> for notes on the portability of
L<C<select>|/select RBITS,WBITS,EBITS,TIMEOUT>.

On error, L<C<select>|/select RBITS,WBITS,EBITS,TIMEOUT> behaves just
like L<select(2)>: it returns C<-1> and sets L<C<$!>|perlvar/$!>.

On some Unixes, L<select(2)> may report a socket file descriptor as
"ready for reading" even when no data is available, and thus any
subsequent L<C<read>|/read FILEHANDLE,SCALAR,LENGTH,OFFSET> would block.
This can be avoided if you always use C<O_NONBLOCK> on the socket.  See
L<select(2)> and L<fcntl(2)> for further details.

The standard L<C<IO::Select>|IO::Select> module provides a
user-friendlier interface to
L<C<select>|/select RBITS,WBITS,EBITS,TIMEOUT>, mostly because it does
all the bit-mask work for you.
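
A minimal sketch of that interface, assuming a pipe as the handle to
watch (the pipe and the variable names are illustrative, not part of any
required API):

    use IO::Select;

    pipe(my $rdr, my $wtr) or die "pipe: $!";
    my $sel = IO::Select->new($rdr);

    my @idle = $sel->can_read(0);       # nothing written yet: empty list
    syswrite($wtr, "ping\n");

    my $buf = '';
    for my $fh ($sel->can_read(1)) {    # wait up to one second
        sysread($fh, $buf, 1024);       # unbuffered read
    }
    print $buf;

The sketch reads and writes with C<sysread> and C<syswrite> throughout, so
no buffered I/O is mixed with the underlying C<select> call.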

B<WARNING>: One should not attempt to mix buffered I/O (like
L<C<read>|/read FILEHANDLE,SCALAR,LENGTH,OFFSET> or
L<C<readline>|/readline EXPR>) with
L<C<select>|/select RBITS,WBITS,EBITS,TIMEOUT>, except as permitted by
POSIX, and even then only on POSIX systems.  You have to use
L<C<sysread>|/sysread FILEHANDLE,SCALAR,LENGTH,OFFSET> instead.

Portability issues: L<perlport/select>.

=item semctl ID,SEMNUM,CMD,ARG
X<semctl>

=for Pod::Functions SysV semaphore control operations

Calls the System V IPC function L<semctl(2)>.  You'll probably have to say

    use IPC::SysV;

first to get the correct constant definitions.  If CMD is C<IPC_STAT> or
C<GETALL>, then ARG must be a variable that will hold the returned
C<semid_ds> structure or semaphore value array.  Returns like
L<C<ioctl>|/ioctl FILEHANDLE,FUNCTION,SCALAR>:
the undefined value for error, "C<0 but true>" for zero, or the actual
return value otherwise.  The ARG must consist of a vector of native
short integers, which may be created with C<pack("s!*", (0) x $nsem)>.
See also L<perlipc/"SysV IPC"> and the documentation for
L<C<IPC::SysV>|IPC::SysV> and L<C<IPC::Semaphore>|IPC::Semaphore>.

Portability issues: L<perlport/semctl>.

=item semget KEY,NSEMS,FLAGS
X<semget>

=for Pod::Functions get set of SysV semaphores

Calls the System V IPC function L<semget(2)>.  Returns the semaphore id, or
the undefined value on error.  See also
L<perlipc/"SysV IPC"> and the documentation for
L<C<IPC::SysV>|IPC::SysV> and L<C<IPC::Semaphore>|IPC::Semaphore>.

Portability issues: L<perlport/semget>.

=item semop KEY,OPSTRING
X<semop>

=for Pod::Functions SysV semaphore operations

Calls the System V IPC function L<semop(2)> for semaphore operations
such as signalling and waiting.  OPSTRING must be a packed array of
semop structures.  Each semop structure can be generated with
C<pack("s!3", $semnum, $semop, $semflag)>.  The length of OPSTRING
implies the number of semaphore operations.  Returns true if
successful, false on error.  As an example, the
following code waits on semaphore $semnum of semaphore id $semid:

    my $semop = pack("s!3", $semnum, -1, 0);
    die "Semaphore trouble: $!\n" unless semop($semid, $semop);

To signal the semaphore, replace C<-1> with C<1>.  See also
L<perlipc/"SysV IPC"> and the documentation for
L<C<IPC::SysV>|IPC::SysV> and L<C<IPC::Semaphore>|IPC::Semaphore>.
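
The three semaphore calls can be combined roughly as follows.  This is
only a sketch: it assumes a platform with System V IPC, uses a private
one-semaphore set, and the variable names are illustrative.

    use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID
                     S_IRUSR S_IWUSR SETVAL GETVAL);

    my $semid = semget(IPC_PRIVATE, 1, S_IRUSR | S_IWUSR | IPC_CREAT);
    defined $semid or die "semget: $!";

    semctl($semid, 0, SETVAL, 1) or die "semctl SETVAL: $!";

    # wait: decrement semaphore 0 (would block if it were already 0)
    semop($semid, pack("s!3", 0, -1, 0)) or die "semop wait: $!";
    my $held = semctl($semid, 0, GETVAL, 0);    # "0 but true"

    # signal: increment it again
    semop($semid, pack("s!3", 0, 1, 0)) or die "semop signal: $!";
    my $free = semctl($semid, 0, GETVAL, 0);    # 1

    semctl($semid, 0, IPC_RMID, 0) or die "semctl IPC_RMID: $!";

Removing the set with C<IPC_RMID> matters because System V semaphores
outlive the process that created them.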

Portability issues: L<perlport/semop>.

=item send SOCKET,MSG,FLAGS,TO
X<send>

=item send SOCKET,MSG,FLAGS

=for Pod::Functions send a message over a socket

Sends a message on a socket.  Attempts to send the scalar MSG to the SOCKET
filehandle.  Takes the same flags as the system call of the same name.  On
unconnected sockets, you must specify a destination to I<send to>, in which
case it does a L<sendto(2)> syscall.  Returns the number of characters sent,
or the undefined value on error.  The L<sendmsg(2)> syscall is currently
unimplemented.  See L<perlipc/"UDP: Message Passing"> for examples.

Note the I<characters>: depending on the status of the socket, either
(8-bit) bytes or characters are sent.  By default all sockets operate
on bytes, but for example if the socket has been changed using
L<C<binmode>|/binmode FILEHANDLE, LAYER> to operate with the
C<:encoding(UTF-8)> I/O layer (see L<C<open>|/open FILEHANDLE,EXPR>, or
the L<open> pragma), the I/O will operate on UTF-8
encoded Unicode characters, not bytes.  Similarly for the C<:encoding>
layer: in that case pretty much any characters can be sent.
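
As a small self-contained sketch, the following sends one datagram
between the two ends of a
L<C<socketpair>|/socketpair SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL>.  It
assumes a system that supports C<AF_UNIX> datagram sockets; the handle
names are illustrative.

    use Socket;

    socketpair(my $left, my $right, AF_UNIX, SOCK_DGRAM, PF_UNSPEC)
        or die "socketpair: $!";

    my $sent = send($left, "hello", 0);
    defined $sent or die "send: $!";

    my $msg = '';
    defined recv($right, $msg, 1024, 0) or die "recv: $!";
    print "sent $sent, received '$msg'\n";

Because the sockets are already connected to each other, no TO address is
needed here.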

=item setpgrp PID,PGRP
X<setpgrp> X<group>

=for Pod::Functions set the process group of a process

Sets the current process group for the specified PID, C<0> for the current
process.  Raises an exception when used on a machine that doesn't
implement POSIX L<setpgid(2)> or BSD L<setpgrp(2)>.  If the arguments
are omitted, it defaults to C<0,0>.  Note that the BSD 4.2 version of
L<C<setpgrp>|/setpgrp PID,PGRP> does not accept any arguments, so only
C<setpgrp(0,0)> is portable.  See also
L<C<POSIX::setsid()>|POSIX/C<setsid>>.

Portability issues: L<perlport/setpgrp>.

=item setpriority WHICH,WHO,PRIORITY
X<setpriority> X<priority> X<nice> X<renice>

=for Pod::Functions set a process's nice value

Sets the current priority for a process, a process group, or a user.
(See L<setpriority(2)>.)  Raises an exception when used on a machine
that doesn't implement L<setpriority(2)>.

Portability issues: L<perlport/setpriority>.

=item setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL
X<setsockopt>

=for Pod::Functions set some socket options

Sets the socket option requested.  Returns L<C<undef>|/undef EXPR> on
error.  Use integer constants provided by the L<C<Socket>|Socket> module
for LEVEL and OPTNAME.  Values for LEVEL can also be obtained from
L<C<getprotobyname>|/getprotobyname NAME>.  OPTVAL may be either a packed
string or an integer.  An integer OPTVAL is shorthand for
C<pack("i", OPTVAL)>.

An example disabling Nagle's algorithm on a socket:

    use Socket qw(IPPROTO_TCP TCP_NODELAY);
    setsockopt($socket, IPPROTO_TCP, TCP_NODELAY, 1);
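
The packed-string form of OPTVAL can be sketched like this, here setting
C<SO_REUSEADDR> and reading it back with
L<C<getsockopt>|/getsockopt SOCKET,LEVEL,OPTNAME>.  The choice of option
and the variable names are illustrative only.

    use Socket qw(PF_INET SOCK_STREAM IPPROTO_TCP
                  SOL_SOCKET SO_REUSEADDR);

    socket(my $sock, PF_INET, SOCK_STREAM, IPPROTO_TCP)
        or die "socket: $!";
    setsockopt($sock, SOL_SOCKET, SO_REUSEADDR, pack("i", 1))
        or die "setsockopt: $!";

    my $packed = getsockopt($sock, SOL_SOCKET, SO_REUSEADDR)
        or die "getsockopt: $!";
    my $enabled = unpack("i", $packed);    # non-zero when set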

Portability issues: L<perlport/setsockopt>.

=item shift ARRAY
X<shift>

=item shift

=for Pod::Functions remove the first element of an array, and return it

Shifts the first value of the array off and returns it, shortening the
array by 1 and moving everything down.  If there are no elements in the
array, returns the undefined value.  If ARRAY is omitted, shifts the
L<C<@_>|perlvar/@_> array within the lexical scope of subroutines and
formats, and the L<C<@ARGV>|perlvar/@ARGV> array outside a subroutine
and also within the lexical scopes
established by the C<eval STRING>, C<BEGIN {}>, C<INIT {}>, C<CHECK {}>,
C<UNITCHECK {}>, and C<END {}> constructs.
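
A short sketch of both forms (the subroutine and variable names are
illustrative):

    sub greet {
        my $name     = shift;              # first element of @_
        my $greeting = shift // "Hello";   # undef if @_ is now empty
        return "$greeting, $name!";
    }
    print greet("World"), "\n";            # prints "Hello, World!"

    my @queue = (1, 2, 3);
    my $head  = shift @queue;              # $head is 1, @queue is (2, 3)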

Starting with Perl 5.14, an experimental feature allowed
L<C<shift>|/shift ARRAY> to take a
scalar expression. This experiment has been deemed unsuccessful, and was
removed as of Perl 5.24.

See also L<C<unshift>|/unshift ARRAY,LIST>, L<C<push>|/push ARRAY,LIST>,
and L<C<pop>|/pop ARRAY>.  L<C<shift>|/shift ARRAY> and
L<C<unshift>|/unshift ARRAY,LIST> do the same thing to the left end of
an array that L<C<pop>|/pop ARRAY> and L<C<push>|/push ARRAY,LIST> do to
the right end.

=item shmctl ID,CMD,ARG
X<shmctl>

=for Pod::Functions SysV shared memory operations

Calls the System V IPC function shmctl.  You'll probably have to say

    use IPC::SysV;

first to get the correct constant definitions.  If CMD is C<IPC_STAT>,
then ARG must be a variable that will hold the returned C<shmid_ds>
structure.  Returns like L<C<ioctl>|/ioctl FILEHANDLE,FUNCTION,SCALAR>:
L<C<undef>|/undef EXPR> for error; "C<0 but true>" for zero; and the
actual return value otherwise.
See also L<perlipc/"SysV IPC"> and the documentation for
L<C<IPC::SysV>|IPC::SysV>.

Portability issues: L<perlport/shmctl>.

=item shmget KEY,SIZE,FLAGS
X<shmget>

=for Pod::Functions get SysV shared memory segment identifier

Calls the System V IPC function shmget.  Returns the shared memory
segment id, or L<C<undef>|/undef EXPR> on error.
See also L<perlipc/"SysV IPC"> and the documentation for
L<C<IPC::SysV>|IPC::SysV>.

Portability issues: L<perlport/shmget>.

=item shmread ID,VAR,POS,SIZE
X<shmread>
X<shmwrite>

=for Pod::Functions read SysV shared memory

=item shmwrite ID,STRING,POS,SIZE

=for Pod::Functions write SysV shared memory

Reads or writes the System V shared memory segment ID starting at
position POS for size SIZE by attaching to it, copying in/out, and
detaching from it.  When reading, VAR must be a variable that will
hold the data read.  When writing, if STRING is too long, only SIZE
bytes are used; if STRING is too short, nulls are written to fill out
SIZE bytes.  Returns true if successful, false on error.
L<C<shmread>|/shmread ID,VAR,POS,SIZE> taints the variable.  See also
L<perlipc/"SysV IPC"> and the documentation for
L<C<IPC::SysV>|IPC::SysV> and the L<C<IPC::Shareable>|IPC::Shareable>
module from CPAN.
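
The shared memory calls fit together roughly as follows.  This sketch
assumes a platform with System V IPC; the segment size, the message, and
the variable names are illustrative.

    use IPC::SysV qw(IPC_PRIVATE IPC_RMID S_IRUSR S_IWUSR);

    my $id = shmget(IPC_PRIVATE, 64, S_IRUSR | S_IWUSR);
    defined $id or die "shmget: $!";

    # "hello" is shorter than SIZE, so it is padded with nulls
    shmwrite($id, "hello", 0, 16) or die "shmwrite: $!";

    my $buf;
    shmread($id, $buf, 0, 16) or die "shmread: $!";
    (my $text = $buf) =~ s/\0+\z//;    # strip the padding

    shmctl($id, IPC_RMID, 0) or die "shmctl: $!";

As with semaphores, the segment persists until it is removed, so the
final C<IPC_RMID> should not be skipped.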

Portability issues: L<perlport/shmread> and L<perlport/shmwrite>.

=item shutdown SOCKET,HOW
X<shutdown>

=for Pod::Functions close down just half of a socket connection

Shuts down a socket connection in the manner indicated by HOW, which
has the same interpretation as in the syscall of the same name.

    shutdown($socket, 0);    # I/we have stopped reading data
    shutdown($socket, 1);    # I/we have stopped writing data
    shutdown($socket, 2);    # I/we have stopped using this socket

This is useful with sockets when you want to tell the other
side you're done writing but not done reading, or vice versa.
It's also a more insistent form of close because it also
disables the file descriptor in any forked copies in other
processes.

Returns C<1> for success; on error, returns L<C<undef>|/undef EXPR> if
the first argument is not a valid filehandle, or returns C<0> and sets
L<C<$!>|perlvar/$!> for any other failure.
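
A sketch of the half-close idiom, assuming C<AF_UNIX> stream sockets are
available (the handle names and messages are illustrative):

    use Socket;

    socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
        or die "socketpair: $!";

    syswrite($client, "request");
    shutdown($client, 1) or die "shutdown: $!";    # done writing

    my $request = '';
    while (sysread($server, my $chunk, 1024)) {    # 0 at end-of-file
        $request .= $chunk;
    }

    syswrite($server, "reply");        # other direction is still open
    sysread($client, my $reply, 1024);

The reader sees end-of-file only because the writer shut down its sending
half; the reply can still travel the other way.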

=item sin EXPR
X<sin> X<sine> X<asin> X<arcsine>

=item sin

=for Pod::Functions return the sine of a number

Returns the sine of EXPR (expressed in radians).  If EXPR is omitted,
returns sine of L<C<$_>|perlvar/$_>.

For the inverse sine operation, you may use the C<Math::Trig::asin>
function, or use this relation:

    sub asin { atan2($_[0], sqrt(1 - $_[0] * $_[0])) }

=item sleep EXPR
X<sleep> X<pause>

=item sleep

=for Pod::Functions block for some number of seconds

Causes the script to sleep for (integer) EXPR seconds, or forever if no
argument is given.  Returns the integer number of seconds actually slept.

May be interrupted if the process receives a signal such as C<SIGALRM>.

    eval {
        local $SIG{ALRM} = sub { die "Alarm!\n" };
        sleep;
    };
    die $@ unless $@ eq "Alarm!\n";

You probably cannot mix L<C<alarm>|/alarm SECONDS> and
L<C<sleep>|/sleep EXPR> calls, because L<C<sleep>|/sleep EXPR> is often
implemented using L<C<alarm>|/alarm SECONDS>.

On some older systems, it may sleep up to a full second less than what
you requested, depending on how it counts seconds.  Most modern systems
always sleep the full amount.  They may appear to sleep longer than that,
however, because your process might not be scheduled right away in a
busy multitasking system.

For delays of finer granularity than one second, the L<Time::HiRes>
module (from CPAN, and starting from Perl 5.8 part of the standard
distribution) provides L<C<usleep>|Time::HiRes/usleep ( $useconds )>.
You may also use Perl's four-argument
version of L<C<select>|/select RBITS,WBITS,EBITS,TIMEOUT> leaving the
first three arguments undefined, or you might be able to use the
L<C<syscall>|/syscall NUMBER, LIST> interface to access L<setitimer(2)>
if your system supports it.  See L<perlfaq8> for details.
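
For instance, a sub-second delay with L<Time::HiRes> might look like this
(a sketch; the 250-millisecond figure is arbitrary):

    use Time::HiRes qw(usleep time);

    my $start = time;                  # floating-point seconds
    usleep(250_000);                   # 250 milliseconds
    my $elapsed = time - $start;
    printf "slept for about %.3f seconds\n", $elapsed;

Importing C<time> from L<Time::HiRes> replaces the built-in for the
current package, which is what makes the elapsed time fractional.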

See also the L<POSIX> module's L<C<pause>|POSIX/C<pause>> function.

=item socket SOCKET,DOMAIN,TYPE,PROTOCOL
X<socket>

=for Pod::Functions create a socket

Opens a socket of the specified kind and attaches it to filehandle
SOCKET.  DOMAIN, TYPE, and PROTOCOL are specified the same as for
the syscall of the same name.  You should C<use Socket> first
to get the proper definitions imported.  See the examples in
L<perlipc/"Sockets: Client/Server Communication">.

On systems that support a close-on-exec flag on files, the flag will
be set for the newly opened file descriptor, as determined by the
value of L<C<$^F>|perlvar/$^F>.  See L<perlvar/$^F>.
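
As a sketch, the following creates a TCP socket and inspects its
close-on-exec flag.  It assumes a system that provides C<F_GETFD> and
C<FD_CLOEXEC> through L<Fcntl>; the variable names are illustrative.

    use Socket qw(PF_INET SOCK_STREAM IPPROTO_TCP);
    use Fcntl  qw(F_GETFD FD_CLOEXEC);

    socket(my $sock, PF_INET, SOCK_STREAM, IPPROTO_TCP)
        or die "socket: $!";

    my $flags   = fcntl($sock, F_GETFD, 0) or die "fcntl: $!";
    my $cloexec = ($flags & FD_CLOEXEC) ? 1 : 0;
    print "fd ", fileno($sock), " close-on-exec: $cloexec\n";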

=item socketpair SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL
X<socketpair>

=for Pod::Functions create a pair of sockets

Creates an unnamed pair of sockets in the specified domain, of the
specified type.  DOMAIN, TYPE, and PROTOCOL are specified the same as
for the syscall of the same name.  If unimplemented, raises an exception.
Returns true if successful.

On systems that support a close-on-exec flag on files, the flag will
be set for the newly opened file descriptors, as determined by the value
of L<C<$^F>|perlvar/$^F>.  See L<perlvar/$^F>.

Some systems define L<C<pipe>|/pipe READHANDLE,WRITEHANDLE> in terms of
L<C<socketpair>|/socketpair SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL>, in
which a call to C<pipe($rdr, $wtr)> is essentially:

    use Socket;
    socketpair(my $rdr, my $wtr, AF_UNIX, SOCK_STREAM, PF_UNSPEC);
    shutdown($rdr, 1);        # no more writing for reader
    shutdown($wtr, 0);        # no more reading for writer

See L<perlipc> for an example of socketpair use.  Perl 5.8 and later will
emulate socketpair using IP sockets to localhost if your system implements
sockets but not socketpair.

Portability issues: L<perlport/socketpair>.

=item sort SUBNAME LIST
X<sort> X<qsort> X<quicksort> X<mergesort>

=item sort BLOCK LIST

=item sort LIST

=for Pod::Functions sort a list of values

In list context, this sorts the LIST and returns the sorted list value.
In scalar context, the behaviour of L<C<sort>|/sort SUBNAME LIST> is
undefined.

If SUBNAME or BLOCK is omitted, L<C<sort>|/sort SUBNAME LIST>s in
standard string comparison
order.  If SUBNAME is specified, it gives the name of a subroutine
that returns an integer less than, equal to, or greater than C<0>,
depending on how the elements of the list are to be ordered.  (The
C<< <=> >> and C<cmp> operators are extremely useful in such routines.)
SUBNAME may be a scalar variable name (unsubscripted), in which case
the value provides the name of (or a reference to) the actual
subroutine to use.  In place of a SUBNAME, you can provide a BLOCK as
an anonymous, in-line sort subroutine.

If the subroutine's prototype is C<($$)>, the elements to be compared are
passed by reference in L<C<@_>|perlvar/@_>, as for a normal subroutine.
This is slower than unprototyped subroutines, where the elements to be
compared are passed into the subroutine as the package global variables
C<$a> and C<$b> (see example below).

If the subroutine is an XSUB, the elements to be compared are pushed on
to the stack, the way arguments are usually passed to XSUBs.  C<$a> and
C<$b> are not set.

The values to be compared are always passed by reference and should not
be modified.

You also cannot exit out of the sort block or subroutine using any of the
loop control operators described in L<perlsyn> or with
L<C<goto>|/goto LABEL>.

When L<C<use locale>|locale> (but not C<use locale ':not_characters'>)
is in effect, C<sort LIST> sorts LIST according to the
current collation locale.  See L<perllocale>.

L<C<sort>|/sort SUBNAME LIST> returns aliases into the original list,
much as a for loop's index variable aliases the list elements.  That is,
modifying an element of a list returned by L<C<sort>|/sort SUBNAME LIST>
(for example, in a C<foreach>, L<C<map>|/map BLOCK LIST> or
L<C<grep>|/grep BLOCK LIST>)
actually modifies the element in the original list.  This is usually
something to be avoided when writing clear code.

Perl 5.6 and earlier used a quicksort algorithm to implement sort.
That algorithm was not stable and I<could> go quadratic.  (A I<stable> sort
preserves the input order of elements that compare equal.  Although
quicksort's run time is O(NlogN) when averaged over all arrays of
length N, the time can be O(N**2), I<quadratic> behavior, for some
inputs.)  In 5.7, the quicksort implementation was replaced with
a stable mergesort algorithm whose worst-case behavior is O(NlogN).
But benchmarks indicated that for some inputs, on some platforms,
the original quicksort was faster.  5.8 has a L<sort> pragma for
limited control of the sort.  Its rather blunt control of the
underlying algorithm may not persist into future Perls, but the
ability to characterize the input or output in implementation
independent ways quite probably will.

Examples:

    # sort lexically
    my @articles = sort @files;

    # same thing, but with explicit sort routine
    my @articles = sort {$a cmp $b} @files;

    # now case-insensitively
    my @articles = sort {fc($a) cmp fc($b)} @files;

    # same thing in reversed order
    my @articles = sort {$b cmp $a} @files;

    # sort numerically ascending
    my @articles = sort {$a <=> $b} @files;

    # sort numerically descending
    my @articles = sort {$b <=> $a} @files;

    # this sorts the %age hash by value instead of key
    # using an in-line function
    my @eldest = sort { $age{$b} <=> $age{$a} } keys %age;

    # sort using explicit subroutine name
    sub byage {
        $age{$a} <=> $age{$b};  # presuming numeric
    }
    my @sortedclass = sort byage @class;

    sub backwards { $b cmp $a }
    my @harry  = qw(dog cat x Cain Abel);
    my @george = qw(gone chased yz Punished Axed);
    print sort @harry;
        # prints AbelCaincatdogx
    print sort backwards @harry;
        # prints xdogcatCainAbel
    print sort @george, 'to', @harry;
        # prints AbelAxedCainPunishedcatchaseddoggonetoxyz

    # inefficiently sort by descending numeric compare using
    # the first integer after the first = sign, or the
    # whole record case-insensitively otherwise

    my @new = sort {
        ($b =~ /=(\d+)/)[0] <=> ($a =~ /=(\d+)/)[0]
                            ||
                    fc($a)  cmp  fc($b)
    } @old;

    # same thing, but much more efficiently;
    # we'll build auxiliary indices instead
    # for speed
    my (@nums, @caps);
    for (@old) {
        push @nums, ( /=(\d+)/ ? $1 : undef );
        push @caps, fc($_);
    }

    my @new = @old[ sort {
                           $nums[$b] <=> $nums[$a]
                                    ||
                           $caps[$a] cmp $caps[$b]
                         } 0..$#old
                  ];

    # same thing, but without any temps
    my @new = map { $_->[0] }
           sort { $b->[1] <=> $a->[1]
                           ||
                  $a->[2] cmp $b->[2]
           } map { [$_, /=(\d+)/, fc($_)] } @old;

    # using a prototype allows you to use any comparison subroutine
    # as a sort subroutine (including other package's subroutines)
    package Other;
    sub backwards ($$) { $_[1] cmp $_[0]; }  # $a and $b are
                                             # not set here
    package main;
    my @new = sort Other::backwards @old;

    # guarantee stability, regardless of algorithm
    use sort 'stable';
    my @new = sort { substr($a, 3, 5) cmp substr($b, 3, 5) } @old;

    # force use of mergesort (not portable outside Perl 5.8)
    use sort '_mergesort';  # note discouraging _
    my @new = sort { substr($a, 3, 5) cmp substr($b, 3, 5) } @old;

Warning: syntactical care is required when sorting the list returned from
a function.  If you want to sort the list returned by the function call
C<find_records(@key)>, you can use:

    my @contact = sort { $a cmp $b } find_records @key;
    my @contact = sort +find_records(@key);
    my @contact = sort &find_records(@key);
    my @contact = sort(find_records(@key));

If instead you want to sort the array C<@key> with the comparison routine
C<find_records()> then you can use:

    my @contact = sort { find_records() } @key;
    my @contact = sort find_records(@key);
    my @contact = sort(find_records @key);
    my @contact = sort(find_records (@key));

C<$a> and C<$b> are set as package globals in the package the sort() is
called from.  That means C<$main::a> and C<$main::b> (or C<$::a> and
C<$::b>) in the C<main> package, C<$FooPack::a> and C<$FooPack::b> in the
C<FooPack> package, etc.  If the sort block is in scope of a C<my> or
C<state> declaration of C<$a> and/or C<$b>, you I<must> spell out the full
names of the variables in the sort block:

   package main;
   my $a = "C"; # DANGER, Will Robinson, DANGER !!!

   print sort { $a cmp $b }               qw(A C E G B D F H);
                                          # WRONG
   sub badlexi { $a cmp $b }
   print sort badlexi                     qw(A C E G B D F H);
                                          # WRONG
   # the above prints BACFEDGH or some other incorrect ordering

   print sort { $::a cmp $::b }           qw(A C E G B D F H);
                                          # OK
   print sort { our $a cmp our $b }       qw(A C E G B D F H);
                                          # also OK
   print sort { our ($a, $b); $a cmp $b } qw(A C E G B D F H);
                                          # also OK
   sub lexi { our $a cmp our $b }
   print sort lexi                        qw(A C E G B D F H);
                                          # also OK
   # the above print ABCDEFGH

With proper care you may mix package and my (or state) C<$a> and/or C<$b>:

   my $a = {
      tiny   => -2,
      small  => -1,
      normal => 0,
      big    => 1,
      huge   => 2
   };

   say sort { $a->{our $a} <=> $a->{our $b} }
       qw{ huge normal tiny small big};

   # prints tinysmallnormalbighuge

C<$a> and C<$b> are implicitly local to the sort() execution and regain their
former values upon completing the sort.

Sort subroutines written using C<$a> and C<$b> are bound to their calling
package. It is possible, but of limited interest, to define them in a
different package, since the subroutine must still refer to the calling
package's C<$a> and C<$b>:

   package Foo;
   sub lexi { $Bar::a cmp $Bar::b }
   package Bar;
   ... sort Foo::lexi ...

Use the prototyped versions (see above) for a more generic alternative.

The comparison function is required to behave.  If it returns
inconsistent results (sometimes saying C<$x[1]> is less than C<$x[2]> and
sometimes saying the opposite, for example) the results are not
well-defined.

Because C<< <=> >> returns L<C<undef>|/undef EXPR> when either operand
is C<NaN> (not-a-number), be careful when sorting with a
comparison function like C<< $a <=> $b >> any lists that might contain a
C<NaN>.  The following example takes advantage of the fact that
C<NaN != NaN> to eliminate any C<NaN>s from the input list.

    my @result = sort { $a <=> $b } grep { $_ == $_ } @input;

=item splice ARRAY,OFFSET,LENGTH,LIST
X<splice>

=item splice ARRAY,OFFSET,LENGTH

=item splice ARRAY,OFFSET

=item splice ARRAY

=for Pod::Functions add or remove elements anywhere in an array

Removes the elements designated by OFFSET and LENGTH from an array, and
replaces them with the elements of LIST, if any.  In list context,
returns the elements removed from the array.  In scalar context,
returns the last element removed, or L<C<undef>|/undef EXPR> if no
elements are
removed.  The array grows or shrinks as necessary.
If OFFSET is negative then it starts that far from the end of the array.
If LENGTH is omitted, removes everything from OFFSET onward.
If LENGTH is negative, removes the elements from OFFSET onward
except for -LENGTH elements at the end of the array.
If both OFFSET and LENGTH are omitted, removes everything.  If OFFSET is
past the end of the array and a LENGTH was provided, Perl issues a warning,
and splices at the end of the array.
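
The OFFSET and LENGTH rules can be illustrated with a short sketch (the
array contents are arbitrary):

    my @a = qw(a b c d e f);

    my @gone = splice(@a, 1, 2);    # @gone is (b, c)
                                    # @a is (a, d, e, f)
    splice(@a, 1, 0, qw(X Y));      # insert only
                                    # @a is (a, X, Y, d, e, f)
    splice(@a, -2);                 # negative OFFSET
                                    # @a is (a, X, Y, d)
    my @mid = splice(@a, 1, -1);    # negative LENGTH: @mid is (X, Y)
                                    # @a is (a, d)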

The following equivalences hold (assuming C<< $#a >= $i >>):

    push(@a,$x,$y)      splice(@a,@a,0,$x,$y)
    pop(@a)             splice(@a,-1)
    shift(@a)           splice(@a,0,1)
    unshift(@a,$x,$y)   splice(@a,0,0,$x,$y)
    $a[$i] = $y         splice(@a,$i,1,$y)

L<C<splice>|/splice ARRAY,OFFSET,LENGTH,LIST> can be used, for example,
to implement n-ary queue processing:

    sub nary_print {
      my $n = shift;
      while (my @next_n = splice @_, 0, $n) {
        say join q{ -- }, @next_n;
      }
    }

    nary_print(3, qw(a b c d e f g h));
    # prints:
    #   a -- b -- c
    #   d -- e -- f
    #   g -- h

Starting with Perl 5.14, an experimental feature allowed
L<C<splice>|/splice ARRAY,OFFSET,LENGTH,LIST> to take a
scalar expression. This experiment has been deemed unsuccessful, and was
removed as of Perl 5.24.

=item split /PATTERN/,EXPR,LIMIT
X<split>

=item split /PATTERN/,EXPR

=item split /PATTERN/

=item split

=for Pod::Functions split up a string using a regexp delimiter

Splits the string EXPR into a list of strings and returns the
list in list context, or the size of the list in scalar context.
(Prior to Perl 5.11, it also overwrote C<@_> with the list in
void and scalar context. If you target old perls, beware.)

If only PATTERN is given, EXPR defaults to L<C<$_>|perlvar/$_>.

Anything in EXPR that matches PATTERN is taken to be a separator
that separates the EXPR into substrings (called "I<fields>") that
do B<not> include the separator.  Note that a separator may be
longer than one character or even have no characters at all (the
empty string, which is a zero-width match).

The PATTERN need not be constant; an expression may be used
to specify a pattern that varies at runtime.

If PATTERN matches the empty string, the EXPR is split at the match
position (between characters).  As an example, the following:

    print join(':', split(/b/, 'abc')), "\n";

uses the C<b> in C<'abc'> as a separator to produce the output C<a:c>.
However, this:

    print join(':', split(//, 'abc')), "\n";

uses empty string matches as separators to produce the output
C<a:b:c>; thus, the empty string may be used to split EXPR into a
list of its component characters.

As a special case for L<C<split>|/split E<sol>PATTERNE<sol>,EXPR,LIMIT>,
the empty pattern given in
L<match operator|perlop/"m/PATTERN/msixpodualngc"> syntax (C<//>)
specifically matches the empty string, which is contrary to its usual
interpretation as the last successful match.

If PATTERN is C</^/>, then it is treated as if it used the
L<multiline modifier|perlreref/OPERATORS> (C</^/m>), since it
isn't much use otherwise.
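
For example, this splits a string into lines while keeping each line's
terminating newline (the sample string is illustrative):

    my @lines = split /^/, "one\ntwo\nthree\n";
    # @lines is ("one\n", "two\n", "three\n")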

C<E<sol>m> and any of the other pattern modifiers valid for C<qr>
(summarized in L<perlop/qrE<sol>STRINGE<sol>msixpodualn>) may be
specified explicitly.

As another special case,
L<C<split>|/split E<sol>PATTERNE<sol>,EXPR,LIMIT> emulates the default
behavior of the
command line tool B<awk> when the PATTERN is either omitted or a
string composed of a single space character (such as S<C<' '>> or
S<C<"\x20">>, but not e.g. S<C</ />>).  In this case, any leading
whitespace in EXPR is removed before splitting occurs, and the PATTERN is
instead treated as if it were C</\s+/>; in particular, this means that
I<any> contiguous whitespace (not just a single space character) is used as
a separator.  However, this special treatment can be avoided by specifying
the pattern S<C</ />> instead of the string S<C<" ">>, thereby allowing
only a single space character to be a separator.  In earlier Perls this
special case was restricted to the use of a plain S<C<" ">> as the
pattern argument to split; in Perl 5.18.0 and later this special case is
triggered by any expression which evaluates to the simple string S<C<" ">>.
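
The difference between the two forms can be seen in a short sketch (the
sample strings are illustrative):

    my @words  = split ' ', "  foo   bar\tbaz\n";
    # @words is ('foo', 'bar', 'baz')

    my @fields = split / /, "  foo bar";
    # @fields is ('', '', 'foo', 'bar')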

As of Perl 5.28, this special-cased whitespace splitting works as expected in
the scope of L<< S<C<use feature 'unicode_strings'>>|feature/The
'unicode_strings' feature >>. In previous versions, and outside the scope of
that feature, it exhibits L<perlunicode/The "Unicode Bug">: characters that are
whitespace according to Unicode rules but not according to ASCII rules can be
treated as part of fields rather than as field separators, depending on the
string's internal encoding.

If omitted, PATTERN defaults to a single space, S<C<" ">>, triggering
the previously described I<awk> emulation.

If LIMIT is specified and positive, it represents the maximum number
of fields into which the EXPR may be split; in other words, LIMIT is
one greater than the maximum number of times EXPR may be split.  Thus,
the LIMIT value C<1> means that EXPR may be split a maximum of zero
times, producing a maximum of one field (namely, the entire value of
EXPR).  For instance:

    print join(':', split(//, 'abc', 1)), "\n";

produces the output C<abc>, and this:

    print join(':', split(//, 'abc', 2)), "\n";

produces the output C<a:bc>, and each of these:

    print join(':', split(//, 'abc', 3)), "\n";
    print join(':', split(//, 'abc', 4)), "\n";

produces the output C<a:b:c>.

If LIMIT is negative, it is treated as if it were instead arbitrarily
large; as many fields as possible are produced.

If LIMIT is omitted (or, equivalently, zero), then it is usually
treated as if it were instead negative but with the exception that
trailing empty fields are stripped (empty leading fields are always
preserved); if all fields are empty, then all fields are considered to
be trailing (and are thus stripped in this case).  Thus, the following:

    print join(':', split(/,/, 'a,b,c,,,')), "\n";

produces the output C<a:b:c>, but the following:

    print join(':', split(/,/, 'a,b,c,,,', -1)), "\n";

produces the output C<a:b:c:::>.

In time-critical applications, it is worthwhile to avoid splitting
into more fields than necessary.  Thus, when assigning to a list,
if LIMIT is omitted (or zero), then LIMIT is treated as though it
were one larger than the number of variables in the list; for the
following, LIMIT is implicitly 3:

    my ($login, $passwd) = split(/:/);
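
Writing the LIMIT out makes the effect of that implicit value visible.  In
this sketch the record in C<$line> is invented for illustration; with a
LIMIT of 3, the rest of the record is left unsplit in the last field:

    my $line  = 'alice:x:1000:1000:Alice:/home/alice:/bin/sh';
    my @parts = split(/:/, $line, 3);

    print scalar(@parts), "\n";   # prints 3
    print $parts[2], "\n";        # prints 1000:1000:Alice:/home/alice:/bin/sh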

Note that splitting an EXPR that evaluates to the empty string always
produces zero fields, regardless of the LIMIT specified.
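
A short sketch of that rule, using an arbitrary comma pattern: neither a
negative nor a positive LIMIT yields a field from the empty string.

    my @negative = split(/,/, '', -1);
    my @positive = split(/,/, '', 5);

    print scalar(@negative), "\n";   # prints 0
    print scalar(@positive), "\n";   # prints 0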

An empty leading field is produced when there is a positive-width
match at the beginning of EXPR.  For instance:

    print join(':', split(/ /, ' abc')), "\n";

produces the output C<:abc>.  However, a zero-width match at the
beginning of EXPR never produces an empty field, so that:

    print join(':', split(//, ' abc'));

produces the output S<C< :a:b:c>> (rather than S<C<: :a:b:c>>).

An empty trailing field, on the other hand, is produced when there is a
match at the end of EXPR, regardless of the length of the match
(of course, unless a non-zero LIMIT is given explicitly, such fields are
removed, as in the last example).  Thus:

    print join(':', split(//, ' abc', -1)), "\n";

produces the output S<C< :a:b:c:>>.

If the PATTERN contains
L<capturing groups|perlretut/Grouping things and hierarchical matching>,
then for each separator, an additional field is produced for each substring
captured by a group (in the order in which the groups are specified,
as per L<backreferences|perlretut/Backreferences>); if any group does not
match, then it captures the L<C<undef>|/undef EXPR> value instead of a
substring.  Also,
note that any such additional field is produced whenever there is a
separator (that is, whenever a split occurs), and such an additional field
does B<not> count towards the LIMIT.  Consider the following expressions
evaluated in list context (each returned list is provided in the associated
comment):

    split(/-|,/, "1-10,20", 3)
    # ('1', '10', '20')

    split(/(-|,)/, "1-10,20", 3)
    # ('1', '-', '10', ',', '20')

    split(/-|(,)/, "1-10,20", 3)
    # ('1', undef, '10', ',', '20')

    split(/(-)|,/, "1-10,20", 3)
    # ('1', '-', '10', undef, '20')

    split(/(-)|(,)/, "1-10,20", 3)
    # ('1', '-', undef, '10', undef, ',', '20')

=item sprintf FORMAT, LIST
X<sprintf>

=for Pod::Functions formatted print into a string

Returns a string formatted by the usual
L<C<printf>|/printf FILEHANDLE FORMAT, LIST> conventions of the C
library function L<C<sprintf>|/sprintf FORMAT, LIST>.  See below for
more details and see L<sprintf(3)> or L<printf(3)> on your system for an
explanation of the general principles.

For example:

        # Format number with up to 8 leading zeroes
        my $result = sprintf("%08d", $number);

        # Round number to 3 digits after decimal point
        my $rounded = sprintf("%.3f", $number);

Perl does its own L<C<sprintf>|/sprintf FORMAT, LIST> formatting: it
emulates the C
function L<sprintf(3)>, but doesn't use it except for floating-point
numbers, and even then only standard modifiers are allowed.
Non-standard extensions in your local L<sprintf(3)> are
therefore unavailable from Perl.

Unlike L<C<printf>|/printf FILEHANDLE FORMAT, LIST>,
L<C<sprintf>|/sprintf FORMAT, LIST> does not do what you probably mean
when you pass it an array as your first argument.
The array is given scalar context,
and instead of using the 0th element of the array as the format, Perl will
use the count of elements in the array as the format, which is almost never
useful.
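
The pitfall can be sketched like this.  The array C<@args> and its
contents are made up for illustration; passing the format explicitly and
the remaining elements as the list is one way to get the intended result:

    my @args = ('%s-%s', 'x', 'y');

    my $wrong = sprintf(@args);                         # format is "3"
    my $right = sprintf($args[0], @args[1 .. $#args]);  # format is '%s-%s'

    print "$wrong\n";   # prints 3
    print "$right\n";   # prints x-y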

Perl's L<C<sprintf>|/sprintf FORMAT, LIST> permits the following
universally-known conversions:

   %%    a percent sign
   %c    a character with the given number
   %s    a string
   %d    a signed integer, in decimal
   %u    an unsigned integer, in decimal
   %o    an unsigned integer, in octal
   %x    an unsigned integer, in hexadecimal
   %e    a floating-point number, in scientific notation
   %f    a floating-point number, in fixed decimal notation
   %g    a floating-point number, in %e or %f notation

In addition, Perl permits the following widely-supported conversions:

   %X    like %x, but using upper-case letters
   %E    like %e, but using an upper-case "E"
   %G    like %g, but with an upper-case "E" (if applicable)
   %b    an unsigned integer, in binary
   %B    like %b, but using an upper-case "B" with the # flag
   %p    a pointer (outputs the Perl value's address in hexadecimal)
   %n    special: *stores* the number of characters output so far
         into the next argument in the parameter list
   %a    hexadecimal floating point
   %A    like %a, but using upper-case letters

Finally, for backward (and we do mean "backward") compatibility, Perl
permits these unnecessary but widely-supported conversions:

   %i    a synonym for %d
   %D    a synonym for %ld
   %U    a synonym for %lu
   %O    a synonym for %lo
   %F    a synonym for %f

Note that the number of exponent digits in the scientific notation produced
by C<%e>, C<%E>, C<%g> and C<%G> for numbers with the modulus of the
exponent less than 100 is system-dependent: it may be three or less
(zero-padded as necessary).  In other words, 1.23 times ten to the
99th may be either "1.23e99" or "1.23e099".  Similarly for C<%a> and C<%A>:
the exponent or the hexadecimal digits may vary between systems; in
particular, the "long doubles" Perl configuration option may cause surprises.

Between the C<%> and the format letter, you may specify several
additional attributes controlling the interpretation of the format.
In order, these are:

=over 4

=item format parameter index

An explicit format parameter index, such as C<2$>.  By default sprintf
will format the next unused argument in the list, but this allows you
to take the arguments out of order:

  printf '%2$d %1$d', 12, 34;      # prints "34 12"
  printf '%3$d %d %1$d', 1, 2, 3;  # prints "3 1 1"

=item flags

one or more of:

   space   prefix non-negative number with a space
   +       prefix non-negative number with a plus sign
   -       left-justify within the field
   0       use zeros, not spaces, to right-justify
   #       ensure the leading "0" for any octal,
           prefix non-zero hexadecimal with "0x" or "0X",
           prefix non-zero binary with "0b" or "0B"

For example:

  printf '<% d>',  12;   # prints "< 12>"
  printf '<% d>',   0;   # prints "< 0>"
  printf '<% d>', -12;   # prints "<-12>"
  printf '<%+d>',  12;   # prints "<+12>"
  printf '<%+d>',   0;   # prints "<+0>"
  printf '<%+d>', -12;   # prints "<-12>"
  printf '<%6s>',  12;   # prints "<    12>"
  printf '<%-6s>', 12;   # prints "<12    >"
  printf '<%06s>', 12;   # prints "<000012>"
  printf '<%#o>',  12;   # prints "<014>"
  printf '<%#x>',  12;   # prints "<0xc>"
  printf '<%#X>',  12;   # prints "<0XC>"
  printf '<%#b>',  12;   # prints "<0b1100>"
  printf '<%#B>',  12;   # prints "<0B1100>"

When a space and a plus sign are given as the flags at once,
the space is ignored.

  printf '<%+ d>', 12;   # prints "<+12>"
  printf '<% +d>', 12;   # prints "<+12>"

When the # flag and a precision are given in the %o conversion,
the precision is incremented if it's necessary for the leading "0".

  printf '<%#.5o>', 012;      # prints "<00012>"
  printf '<%#.5o>', 012345;   # prints "<012345>"
  printf '<%#.0o>', 0;        # prints "<0>"

=item vector flag

This flag tells Perl to interpret the supplied string as a vector of
integers, one for each character in the string.  Perl applies the format to
each integer in turn, then joins the resulting strings with a separator (a
dot C<.> by default).  This can be useful for displaying ordinal values of
characters in arbitrary strings:

  printf "%vd", "AB\x{100}";           # prints "65.66.256"
  printf "version is v%vd\n", $^V;     # Perl's version

Put an asterisk C<*> before the C<v> to override the string to
use to separate the numbers:

  printf "address is %*vX\n", ":", $addr;   # IPv6 address
  printf "bits are %0*v8b\n", " ", $bits;   # random bitstring

You can also explicitly specify the argument number to use for
the join string using something like C<*2$v>; for example:

  printf '%*4$vX %*4$vX %*4$vX',       # 3 IPv6 addresses
          @addr[1..3], ":";
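
A self-contained sketch of the vector flag follows.  The packed bytes
below are arbitrary sample data, standing in for the undefined C<$addr>
and C<$bits> used above:

    my $packed = pack('C4', 192, 168, 0, 1);   # four sample bytes

    my $dotted = sprintf('%vd', $packed);      # default "." separator
    my $hex    = sprintf('%*vX', ':', 'AB');   # ":" as the join string

    print "$dotted\n";   # prints 192.168.0.1
    print "$hex\n";      # prints 41:42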

=item (minimum) width

Arguments are usually formatted to be only as wide as required to
display the given value.  You can override the width by putting
a number here, or get the width from the next argument (with C<*>)
or from a specified argument (e.g., with C<*2$>):

 printf "<%s>", "a";       # prints "<a>"
 printf "<%6s>", "a";      # prints "<     a>"
 printf "<%*s>", 6, "a";   # prints "<     a>"
 printf '<%*2$s>', "a", 6; # prints "<     a>"
 printf "<%2s>", "long";   # prints "<long>" (does not truncate)

If a field width obtained through C<*> is negative, it has the same
effect as the C<-> flag: left-justification.
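
As a brief illustration, a negative width supplied through C<*> gives the
same result as writing the C<-> flag in the format itself:

    my $right = sprintf('<%*s>',  6, 'a');   # right-justified
    my $left  = sprintf('<%*s>', -6, 'a');   # negative width: left-justified
    my $flag  = sprintf('<%-6s>',    'a');   # explicit "-" flag

    print "$right\n";   # prints "<     a>"
    print "$left\n";    # prints "<a     >"
    print "$flag\n";    # prints "<a     >"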

=item precision, or maximum width
X<precision>

You can specify a precision (for numeric conversions) or a maximum
width (for string conversions) by specifying a C<.> followed by a number.
For floating-point formats except C<g> and C<G>, this specifies
how many places right of the decimal point to show (the default being 6).
For example:

  # these examples are subject to system-specific variation
  printf '<%f>', 1;    # prints "<1.000000>"
  printf '<%.1f>', 1;  # prints "<1.0>"
  printf '<%.0f>', 1;  # prints "<1>"
  printf '<%e>', 10;   # prints "<1.000000e+01>"
  printf '<%.1e>', 10; # prints "<1.0e+01>"

For "g" and "G", this specifies the maximum number of significant digits to
show; for example:

  # These examples are subject to system-specific variation.
  printf '<%g>', 1;        # prints "<1>"
  printf '<%.10g>', 1;     # prints "<1>"
  printf '<%g>', 100;      # prints "<100>"
  printf '<%.1g>', 100;    # prints "<1e+02>"
  printf '<%.2g>', 100.01; # prints "<1e+02>"
  printf '<%.5g>', 100.01; # prints "<100.01>"
  printf '<%.4g>', 100.01; # prints "<100>"
  printf '<%.1g>', 0.0111; # prints "<0.01>"
  printf '<%.2g>', 0.0111; # prints "<0.011>"
  printf '<%.3g>', 0.0111; # prints "<0.0111>"

For integer conversions, specifying a precision implies that the
output of the number itself should be zero-padded to this width,
where the 0 flag is ignored:

  printf '<%.6d>', 1;      # prints "<000001>"
  printf '<%+.6d>', 1;     # prints "<+000001>"
  printf '<%-10.6d>', 1;   # prints "<000001    >"
  printf '<%10.6d>', 1;    # prints "<    000001>"
  printf '<%010.6d>', 1;   # prints "<    000001>"
  printf '<%+10.6d>', 1;   # prints "<   +000001>"

  printf '<%.6x>', 1;      # prints "<000001>"
  printf '<%#.6x>', 1;     # prints "<0x000001>"
  printf '<%-10.6x>', 1;   # prints "<000001    >"
  printf '<%10.6x>', 1;    # prints "<    000001>"
  printf '<%010.6x>', 1;   # prints "<    000001>"
  printf '<%#10.6x>', 1;   # prints "<  0x000001>"

For string conversions, specifying a precision truncates the string
to fit the specified width:

  printf '<%.5s>', "truncated";   # prints "<trunc>"
  printf '<%10.5s>', "truncated"; # prints "<     trunc>"

You can also get the precision from the next argument using C<.*>, or from a
specified argument (e.g., with C<.*2$>):

  printf '<%.6x>', 1;       # prints "<000001>"
  printf '<%.*x>', 6, 1;    # prints "<000001>"

  printf '<%.*2$x>', 1, 6;  # prints "<000001>"

  printf '<%6.*2$x>', 1, 4; # prints "<  0001>"

If a precision obtained through C<*> is negative, it counts
as having no precision at all.

  printf '<%.*s>',  7, "string";   # prints "<string>"
  printf '<%.*s>',  3, "string";   # prints "<str>"
  printf '<%.*s>',  0, "string";   # prints "<>"
  printf '<%.*s>', -1, "string";   # prints "<string>"

  printf '<%.*d>',  1, 0;   # prints "<0>"
  printf '<%.*d>',  0, 0;   # prints "<>"
  printf '<%.*d>', -1, 0;   # prints "<0>"

=item size

For numeric conversions, you can specify the size to interpret the
number as using C<l>, C<h>, C<V>, C<q>, C<L>, or C<ll>.  For integer
conversions (C<d u o x X b i D U O>), numbers are usually assumed to be
whatever the default integer size is on your platform (usually 32 or 64
bits), but you can override this to use instead one of the standard C types,
as supported by the compiler used to build Perl:

   hh          interpret integer as C type "char" or "unsigned
               char" on Perl 5.14 or later
   h           interpret integer as C type "short" or
               "unsigned short"
   j           interpret integer as C type "intmax_t" on Perl
               5.14 or later, and only with a C99 compiler
               (unportable)
   l           interpret integer as C type "long" or
               "unsigned long"
   q, L, or ll interpret integer as C type "long long",
               "unsigned long long", or "quad" (typically
               64-bit integers)
   t           interpret integer as C type "ptrdiff_t" on Perl
               5.14 or later
   z           interpret integer as C type "size_t" on Perl 5.14
               or later

As of 5.14, none of these raises an exception if they are not supported on
your platform.  However, if warnings are enabled, a warning of the
L<C<printf>|warnings> warning class is issued on an unsupported
conversion flag.  Should you instead prefer an exception, do this:

    use warnings FATAL => "printf";

If you would like to know about a version dependency before you
start running the program, put something like this at its top:

    use 5.014;  # for hh/j/t/z printf modifiers

You can find out whether your Perl supports quads via L<Config>:

    use Config;
    if ($Config{use64bitint} eq "define"
        || $Config{longsize} >= 8) {
        print "Nice quads!\n";
    }

For floating-point conversions (C<e f g E F G>), numbers are usually assumed
to be the default floating-point size on your platform (double or long double),
but you can force "long double" with C<q>, C<L>, or C<ll> if your
platform supports them.  You can find out whether your Perl supports long
doubles via L<Config>:

    use Config;
    print "long doubles\n" if $Config{d_longdbl} eq "define";

You can find out whether Perl considers "long double" to be the default
floating-point size to use on your platform via L<Config>:

    use Config;
    if ($Config{uselongdouble} eq "define") {
        print "long doubles by default\n";
    }

It can also be that long doubles and doubles are the same thing:

    use Config;
    if ($Config{doublesize} == $Config{longdblsize}) {
        print "doubles are long doubles\n";
    }

The size specifier C<V> has no effect for Perl code, but is supported for
compatibility with XS code.  It means "use the standard size for a Perl
integer or floating-point number", which is the default.

=item order of arguments

Normally, L<C<sprintf>|/sprintf FORMAT, LIST> takes the next unused
argument as the value to
format for each format specification.  If the format specification
uses C<*> to require additional arguments, these are consumed from
the argument list in the order they appear in the format
specification I<before> the value to format.  Where an argument is
specified by an explicit index, this does not affect the normal
order for the arguments, even when the explicitly specified index
would have been the next argument.

So:

    printf "<%*.*s>", $a, $b, $c;

uses C<$a> for the width, C<$b> for the precision, and C<$c>
as the value to format; while:

  printf '<%*1$.*s>', $a, $b;

would use C<$a> for the width and precision, and C<$b> as the
value to format.

Here are some more examples; be aware that when using an explicit
index, the C<$> may need escaping:

 printf "%2\$d %d\n",      12, 34;     # will print "34 12\n"
 printf "%2\$d %d %d\n",   12, 34;     # will print "34 12 34\n"
 printf "%3\$d %d %d\n",   12, 34, 56; # will print "56 12 34\n"
 printf "%2\$*3\$d %d\n",  12, 34,  3; # will print " 34 12\n"
 printf "%*1\$.*f\n",       4,  5, 10; # will print "5.0000\n"

=back

If L<C<use locale>|locale> (including C<use locale ':not_characters'>)
is in effect and L<C<POSIX::setlocale>|POSIX/C<setlocale>> has been
called,
the character used for the decimal separator in formatted floating-point
numbers is affected by the C<LC_NUMERIC> locale.  See L<perllocale>
and L<POSIX>.

=item sqrt EXPR
X<sqrt> X<root> X<square root>

=item sqrt

=for Pod::Functions square root function

Return the positive square root of EXPR.  If EXPR is omitted, uses
L<C<$_>|perlvar/$_>.  Works only for non-negative operands unless you've
loaded the L<C<Math::Complex>|Math::Complex> module.

    use Math::Complex;
    print sqrt(-4);    # prints 2i
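
Without L<C<Math::Complex>|Math::Complex> loaded, taking the square root
of a negative number raises an exception.  A minimal sketch of trapping
it with L<C<eval>|/eval EXPR> (the variable names are illustrative):

    my $n    = -4;
    my $root = eval { sqrt($n) };   # dies, so eval returns undef
    my $err  = $@;

    print defined $root ? "root is $root\n" : "failed: $err";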

=item srand EXPR
X<srand> X<seed> X<randseed>

=item srand

=for Pod::Functions seed the random number generator

Sets and returns the random number seed for the L<C<rand>|/rand EXPR>
operator.

The point of the function is to "seed" the L<C<rand>|/rand EXPR>
function so that L<C<rand>|/rand EXPR> can produce a different sequence
each time you run your program.  When called with a parameter,
L<C<srand>|/srand EXPR> uses that for the seed; otherwise it
(semi-)randomly chooses a seed.  In either case, starting with Perl 5.14,
it returns the seed.  To signal that your code will work I<only> on Perls
of a recent vintage:

    use 5.014;	# so srand returns the seed

If L<C<srand>|/srand EXPR> is not called explicitly, it is called
implicitly without a parameter at the first use of the
L<C<rand>|/rand EXPR> operator.  However, there are a few situations
where programs are likely to want to call L<C<srand>|/srand EXPR>.  One
is for generating predictable results, generally for testing or
debugging.  There, you use C<srand($seed)>, with the same C<$seed> each
time.  Another case is that you may want to call L<C<srand>|/srand EXPR>
after a L<C<fork>|/fork> to avoid child processes sharing the same seed
value as the parent (and consequently each other).

Do B<not> call C<srand()> (i.e., without an argument) more than once per
process.  The internal state of the random number generator should
contain more entropy than can be provided by any seed, so calling
L<C<srand>|/srand EXPR> again actually I<loses> randomness.

Most implementations of L<C<srand>|/srand EXPR> take an integer and will
silently
truncate decimal numbers.  This means C<srand(42)> will usually
produce the same results as C<srand(42.1)>.  To be safe, always pass
L<C<srand>|/srand EXPR> an integer.

A typical use of the returned seed is for a test program which has too many
combinations to test comprehensively in the time available to it each run.  It
can test a random subset each time, and should there be a failure, log the seed
used for that run so that it can later be used to reproduce the same results.
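
That use can be sketched as follows, assuming Perl 5.14 or later so that
L<C<srand>|/srand EXPR> returns the seed.  A real test program would log
C<$seed> rather than replaying it immediately, as is done here for
brevity:

    use 5.014;                       # so srand returns the seed

    my $seed = srand();              # let Perl choose; remember the choice
    my @run1 = map { int(rand(1000)) } 1 .. 5;

    srand($seed);                    # replay the same sequence later
    my @run2 = map { int(rand(1000)) } 1 .. 5;

    print "seed $seed reproduces the run\n" if "@run1" eq "@run2";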

B<L<C<rand>|/rand EXPR> is not cryptographically secure.  You should not rely
on it in security-sensitive situations.>  As of this writing, a
number of third-party CPAN modules offer random number generators
intended by their authors to be cryptographically secure,
including: L<Data::Entropy>, L<Crypt::Random>, L<Math::Random::Secure>,
and L<Math::TrulyRandom>.

=item stat FILEHANDLE
X<stat> X<file, status> X<ctime>

=item stat EXPR

=item stat DIRHANDLE

=item stat

=for Pod::Functions get a file's status information

Returns a 13-element list giving the status info for a file, either
the file opened via FILEHANDLE or DIRHANDLE, or named by EXPR.  If EXPR is
omitted, it stats L<C<$_>|perlvar/$_> (not C<_>!).  Returns the empty
list if L<C<stat>|/stat FILEHANDLE> fails.  Typically
used as follows:

    my ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,
        $atime,$mtime,$ctime,$blksize,$blocks)
           = stat($filename);

Not all fields are supported on all filesystem types.  Here are the
meanings of the fields:

  0 dev      device number of filesystem
  1 ino      inode number
  2 mode     file mode  (type and permissions)
  3 nlink    number of (hard) links to the file
  4 uid      numeric user ID of file's owner
  5 gid      numeric group ID of file's owner
  6 rdev     the device identifier (special files only)
  7 size     total size of file, in bytes
  8 atime    last access time in seconds since the epoch
  9 mtime    last modify time in seconds since the epoch
 10 ctime    inode change time in seconds since the epoch (*)
 11 blksize  preferred I/O size in bytes for interacting with the
             file (may vary from file to file)
 12 blocks   actual number of system-specific blocks allocated
             on disk (often, but not always, 512 bytes each)

(The epoch was at 00:00 January 1, 1970 GMT.)

(*) Not all fields are supported on all filesystem types.  Notably, the
ctime field is non-portable.  In particular, you cannot expect it to be a
"creation time"; see L<perlport/"Files and Filesystems"> for details.

If L<C<stat>|/stat FILEHANDLE> is passed the special filehandle
consisting of an underscore (C<_>), no stat is done, but the current contents of
the stat structure from the last L<C<stat>|/stat FILEHANDLE>,
L<C<lstat>|/lstat FILEHANDLE>, or filetest are returned.  Example:

    if (-x $file && (($d) = stat(_)) && $d < 0) {
        print "$file is executable NFS file\n";
    }

(This works only on machines for which the device number is negative
under NFS.)

Because the mode contains both the file type and its permissions, you
should mask off the file type portion and (s)printf using a C<"%o">
if you want to see the real permissions.

    my $mode = (stat($filename))[2];
    printf "Permissions are %04o\n", $mode & 07777;

In scalar context, L<C<stat>|/stat FILEHANDLE> returns a boolean value
indicating success
or failure, and, if successful, sets the information associated with
the special filehandle C<_>.
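
A small sketch of the scalar-context form together with the C<_>
filehandle.  It assumes the core L<File::Temp> module is available and
creates a scratch file so that the example is self-contained:

    use File::Temp qw(tempfile);

    # Create a scratch file with known contents (5 bytes).
    my ($fh, $path) = tempfile(UNLINK => 1);
    print {$fh} "hello";
    close $fh or die "close failed: $!";

    my $ok   = stat($path);       # scalar context: true on success
    my $size = (stat(_))[7];      # reuses the saved data, no new stat call
    my $miss = stat("$path.does-not-exist");   # false on failure

    print "size is $size\n" if $ok;     # prints "size is 5"

Note that the C<_> lookup is done before the failing call, because each
new L<C<stat>|/stat FILEHANDLE> replaces the saved structure.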

The L<File::stat> module provides a convenient, by-name access mechanism:

    use File::stat;
    my $sb = stat($filename);
    printf "File is %s, size is %s, perm %04o, mtime %s\n",
           $filename, $sb->size, $sb->mode & 07777,
           scalar localtime $sb->mtime;

You can import symbolic mode constants (C<S_IF*>) and functions
(C<S_IS*>) from the L<Fcntl> module:

    use Fcntl ':mode';

    my $mode = (stat($filename))[2];

    my $user_rwx      = ($mode & S_IRWXU) >> 6;
    my $group_read    = ($mode & S_IRGRP) >> 3;
    my $other_execute =  $mode & S_IXOTH;

    printf "Permissions are %04o\n", S_IMODE($mode);

    my $is_setuid     =  $mode & S_ISUID;
    my $is_directory  =  S_ISDIR($mode);

You could write the last two using the C<-u> and C<-d> operators.
Commonly available C<S_I*> constants are:

    # Permissions: read, write, execute, for user, group, others.

    S_IRWXU S_IRUSR S_IWUSR S_IXUSR
    S_IRWXG S_IRGRP S_IWGRP S_IXGRP
    S_IRWXO S_IROTH S_IWOTH S_IXOTH

    # Setuid/Setgid/Stickiness/SaveText.
    # Note that the exact meaning of these is system-dependent.

    S_ISUID S_ISGID S_ISVTX S_ISTXT

    # File types.  Not all are necessarily available on
    # your system.

    S_IFREG S_IFDIR S_IFLNK S_IFBLK S_IFCHR
    S_IFIFO S_IFSOCK S_IFWHT S_ENFMT

    # The following are compatibility aliases for S_IRUSR,
    # S_IWUSR, and S_IXUSR.

    S_IREAD S_IWRITE S_IEXEC

and the C<S_I*> functions are

    S_IMODE($mode)    the part of $mode containing the permission
                      bits and the setuid/setgid/sticky bits

    S_IFMT($mode)     the part of $mode containing the file type
                      which can be bit-anded with (for example)
                      S_IFREG or with the following functions

    # The operators -f, -d, -l, -b, -c, -p, and -S.

    S_ISREG($mode) S_ISDIR($mode) S_ISLNK($mode)
    S_ISBLK($mode) S_ISCHR($mode) S_ISFIFO($mode) S_ISSOCK($mode)

    # No direct -X operator counterpart, but for the first one
    # the -g operator is often equivalent.  The ENFMT stands for
    # record flocking enforcement, a platform-dependent feature.

    S_ISENFMT($mode) S_ISWHT($mode)

See your native L<chmod(2)> and L<stat(2)> documentation for more details
about the C<S_*> constants.  To get status info for a symbolic link
instead of the target file behind the link, use the
L<C<lstat>|/lstat FILEHANDLE> function.

Portability issues: L<perlport/stat>.

=item state VARLIST
X<state>

=item state TYPE VARLIST

=item state VARLIST : ATTRS

=item state TYPE VARLIST : ATTRS

=for Pod::Functions +state declare and assign a persistent lexical variable

L<C<state>|/state VARLIST> declares a lexically scoped variable, just
like L<C<my>|/my VARLIST>.
However, those variables will never be reinitialized, contrary to
lexical variables that are reinitialized each time their enclosing block
is entered.
See L<perlsub/"Persistent Private Variables"> for details.

If more than one variable is listed, the list must be placed in
parentheses.  With a parenthesised list, L<C<undef>|/undef EXPR> can be
used as a
dummy placeholder.  However, since initialization of state variables in
list context is currently not possible, this would serve no purpose.

L<C<state>|/state VARLIST> is available only if the
L<C<"state"> feature|feature/The 'state' feature> is enabled or if it is
prefixed with C<CORE::>.  The
L<C<"state"> feature|feature/The 'state' feature> is enabled
automatically with a C<use v5.10> (or higher) declaration in the current
scope.
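
A minimal sketch of a persistent counter; the subroutine name
C<next_id> is made up for illustration:

    use feature 'state';

    sub next_id {
        state $n = 0;     # initialized only the first time through
        return ++$n;
    }

    my @ids = (next_id(), next_id(), next_id());
    print "@ids\n";       # prints "1 2 3"

Had C<$n> been declared with L<C<my>|/my VARLIST> instead, each call
would have returned 1.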


=item study SCALAR
X<study>

=item study

=for Pod::Functions no-op, formerly optimized input data for repeated searches

At this time, C<study> does nothing. This may change in the future.

Prior to Perl version 5.16, it would create an inverted index of all characters
that occurred in the given SCALAR (or L<C<$_>|perlvar/$_> if unspecified). When
matching a pattern, the rarest character from the pattern would be looked up in
this index. Rarity was based on some static frequency tables constructed from
some C programs and English text.


=item sub NAME BLOCK
X<sub>

=item sub NAME (PROTO) BLOCK

=item sub NAME : ATTRS BLOCK

=item sub NAME (PROTO) : ATTRS BLOCK

=for Pod::Functions declare a subroutine, possibly anonymously

This is a subroutine definition, not a real function I<per se>.  Without a
BLOCK it's just a forward declaration.  Without a NAME, it's an anonymous
function declaration, so does return a value: the CODE ref of the closure
just created.

See L<perlsub> and L<perlref> for details about subroutines and
references; see L<attributes> and L<Attribute::Handlers> for more
information about attributes.

=item __SUB__
X<__SUB__>

=for Pod::Functions +current_sub the current subroutine, or C<undef> if not in a subroutine

A special token that returns a reference to the current subroutine, or
L<C<undef>|/undef EXPR> outside of a subroutine.

The behaviour of L<C<__SUB__>|/__SUB__> within a regex code block (such
as C</(?{...})/>) is subject to change.

This token is only available under C<use v5.16> or the
L<C<"current_sub"> feature|feature/The 'current_sub' feature>.
See L<feature>.
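
One common use is recursion in an anonymous subroutine, which otherwise
has no name by which to call itself.  A sketch, with C<$factorial> as an
illustrative name:

    use feature 'current_sub';

    my $factorial = sub {
        my ($n) = @_;
        return $n <= 1 ? 1 : $n * __SUB__->($n - 1);
    };

    print $factorial->(5), "\n";   # prints 120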

=item substr EXPR,OFFSET,LENGTH,REPLACEMENT
X<substr> X<substring> X<mid> X<left> X<right>

=item substr EXPR,OFFSET,LENGTH

=item substr EXPR,OFFSET

=for Pod::Functions get or alter a portion of a string

Extracts a substring out of EXPR and returns it.  First character is at
offset zero.  If OFFSET is negative, starts
that far back from the end of the string.  If LENGTH is omitted, returns
everything through the end of the string.  If LENGTH is negative, leaves that
many characters off the end of the string.

    my $s = "The black cat climbed the green tree";
    my $color  = substr $s, 4, 5;      # black
    my $middle = substr $s, 4, -11;    # black cat climbed the
    my $end    = substr $s, 14;        # climbed the green tree
    my $tail   = substr $s, -4;        # tree
    my $z      = substr $s, -4, 2;     # tr

You can use the L<C<substr>|/substr EXPR,OFFSET,LENGTH,REPLACEMENT>
function as an lvalue, in which case EXPR
must itself be an lvalue.  If you assign something shorter than LENGTH,
the string will shrink, and if you assign something longer than LENGTH,
the string will grow to accommodate it.  To keep the string the same
length, you may need to pad or chop your value using
L<C<sprintf>|/sprintf FORMAT, LIST>.
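
The following sketch contrasts a plain lvalue assignment, which shrinks
the string, with one padded through
L<C<sprintf>|/sprintf FORMAT, LIST> so that the length is unchanged.
The format C<%-5.5s> pads or truncates to exactly five characters:

    my $s = 'The black cat';
    substr($s, 4, 5) = 'red';                      # shorter: string shrinks
    print "$s\n";                                  # prints "The red cat"

    my $t = 'The black cat';
    substr($t, 4, 5) = sprintf('%-5.5s', 'red');   # padded to 5 characters
    print "$t\n";                                  # prints "The red   cat"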

If OFFSET and LENGTH specify a substring that is partly outside the
string, only the part within the string is returned.  If the substring
is beyond either end of the string,
L<C<substr>|/substr EXPR,OFFSET,LENGTH,REPLACEMENT> returns the undefined
value and produces a warning.  When used as an lvalue, specifying a
substring that is entirely outside the string raises an exception.
Here's an example showing the behavior for boundary cases:

    my $name = 'fred';
    substr($name, 4) = 'dy';         # $name is now 'freddy'
    my $null = substr $name, 6, 2;   # returns "" (no warning)
    my $oops = substr $name, 7;      # returns undef, with warning
    substr($name, 7) = 'gap';        # raises an exception

An alternative to using
L<C<substr>|/substr EXPR,OFFSET,LENGTH,REPLACEMENT> as an lvalue is to
specify the
replacement string as the 4th argument.  This allows you to replace
parts of the EXPR and return what was there before in one operation,
just as you can with
L<C<splice>|/splice ARRAY,OFFSET,LENGTH,LIST>.

    my $s = "The black cat climbed the green tree";
    my $z = substr $s, 14, 7, "jumped from";    # climbed
    # $s is now "The black cat jumped from the green tree"

Note that the lvalue returned by the three-argument version of
L<C<substr>|/substr EXPR,OFFSET,LENGTH,REPLACEMENT> acts as
a 'magic bullet'; each time it is assigned to, it remembers which part
of the original string is being modified; for example:

    my $x = '1234';
    for (substr($x,1,2)) {
        $_ = 'a';   print $x,"\n";    # prints 1a4
        $_ = 'xyz'; print $x,"\n";    # prints 1xyz4
        $x = '56789';
        $_ = 'pq';  print $x,"\n";    # prints 5pq9
    }

With negative offsets, it remembers its position from the end of the string
when the target string is modified:

    my $x = '1234';
    for (substr($x, -3, 2)) {
        $_ = 'a';   print $x,"\n";    # prints 1a4, as above
        $x = 'abcdefg';
        print $_,"\n";                # prints f
    }

Prior to Perl version 5.10, the result of using an lvalue multiple times was
unspecified.  Prior to 5.16, the result with negative offsets was
unspecified.

=item symlink OLDFILE,NEWFILE
X<symlink> X<link> X<symbolic link> X<link, symbolic>

=for Pod::Functions create a symbolic link to a file

Creates a new filename symbolically linked to the old filename.
Returns C<1> for success, C<0> otherwise.  On systems that don't support
symbolic links, raises an exception.  To check for that,
use eval:

    my $symlink_exists = eval { symlink("",""); 1 };

Portability issues: L<perlport/symlink>.

=item syscall NUMBER, LIST
X<syscall> X<system call>

=for Pod::Functions execute an arbitrary system call

Calls the system call specified as the first element of the list,
passing the remaining elements as arguments to the system call.  If
unimplemented, raises an exception.  The arguments are interpreted
as follows: if a given argument is numeric, the argument is passed as
an int.  If not, the pointer to the string value is passed.  You are
responsible for making sure that a string is pre-extended long enough to
receive any result that might be written into it.  You can't use a
string literal (or other read-only string) as an argument to
L<C<syscall>|/syscall NUMBER, LIST> because Perl has to assume that any
string pointer might be written through.  If your
integer arguments are not literals and have never been interpreted in a
numeric context, you may need to add C<0> to them to force them to look
like numbers.  This emulates the
L<C<syswrite>|/syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET> function (or
vice versa):

    require 'syscall.ph';        # may need to run h2ph
    my $s = "hi there\n";
    syscall(SYS_write(), fileno(STDOUT), $s, length $s);

Note that Perl supports passing only up to 14 arguments to your syscall,
which in practice should (usually) suffice.

L<C<syscall>|/syscall NUMBER, LIST> returns whatever value is returned by
the system call it invokes.
If the system call fails, L<C<syscall>|/syscall NUMBER, LIST> returns
C<-1> and sets L<C<$!>|perlvar/$!> (errno).
Note that some system calls I<can> legitimately return C<-1>.  The proper
way to handle such calls is to assign C<$! = 0> before the call, then
check the value of L<C<$!>|perlvar/$!> if
L<C<syscall>|/syscall NUMBER, LIST> returns C<-1>.

There's a problem with C<syscall(SYS_pipe())>: it returns the file
number of the read end of the pipe it creates, but there is no way
to retrieve the file number of the other end.  You can avoid this
problem by using L<C<pipe>|/pipe READHANDLE,WRITEHANDLE> instead.

Portability issues: L<perlport/syscall>.

=item sysopen FILEHANDLE,FILENAME,MODE
X<sysopen>

=item sysopen FILEHANDLE,FILENAME,MODE,PERMS

=for Pod::Functions +5.002 open a file, pipe, or descriptor

Opens the file whose filename is given by FILENAME, and associates it with
FILEHANDLE.  If FILEHANDLE is an expression, its value is used as the real
filehandle wanted; an undefined scalar will be suitably autovivified.  This
function calls the underlying operating system's L<open(2)> function with the
parameters FILENAME, MODE, and PERMS.

Returns true on success and L<C<undef>|/undef EXPR> otherwise.

The possible values and flag bits of the MODE parameter are
system-dependent; they are available via the standard module
L<C<Fcntl>|Fcntl>.  See the documentation of your operating system's
L<open(2)> syscall to see
which values and flag bits are available.  You may combine several flags
using the C<|>-operator.

Some of the most common values are C<O_RDONLY> for opening the file in
read-only mode, C<O_WRONLY> for opening the file in write-only mode,
and C<O_RDWR> for opening the file in read-write mode.
X<O_RDONLY> X<O_RDWR> X<O_WRONLY>

For historical reasons, some values work on almost every system
supported by Perl: 0 means read-only, 1 means write-only, and 2
means read/write.  We know that these values do I<not> work under
OS/390 and on the Macintosh; you probably don't want to
use them in new code.

If the file named by FILENAME does not exist and the
L<C<open>|/open FILEHANDLE,EXPR> call creates
it (typically because MODE includes the C<O_CREAT> flag), then the value of
PERMS specifies the permissions of the newly created file.  If you omit
the PERMS argument to L<C<sysopen>|/sysopen FILEHANDLE,FILENAME,MODE>,
Perl uses the octal value C<0666>.
These permission values need to be in octal, and are modified by your
process's current L<C<umask>|/umask EXPR>.
X<O_CREAT>

In many systems the C<O_EXCL> flag is available for opening files in
exclusive mode.  This is B<not> locking: exclusiveness means here that
if the file already exists,
L<C<sysopen>|/sysopen FILEHANDLE,FILENAME,MODE> fails.  C<O_EXCL> may
not work
on network filesystems, and has no effect unless the C<O_CREAT> flag
is set as well.  Setting C<O_CREAT|O_EXCL> prevents the file from
being opened if it is a symbolic link.  It does not protect against
symbolic links in the file's path.
X<O_EXCL>
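
The C<O_CREAT|O_EXCL> combination described above can be sketched as
follows.  This is a minimal illustration, not a locking recipe: the scratch
directory, the file name F<example.lock>, and the helper
C<create_exclusively> are all hypothetical names chosen for the example.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

# Scratch directory and file name are illustrative assumptions.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/example.lock";

# Hypothetical helper: create $file only if it does not already exist.
# Returns the new handle, or undef if sysopen fails.
sub create_exclusively {
    my ($file) = @_;
    # PERMS is omitted, so Perl uses 0666, modified by the umask.
    sysopen(my $fh, $file, O_WRONLY | O_CREAT | O_EXCL) or return;
    return $fh;
}

my $first  = create_exclusively($path);   # creates the file
my $second = create_exclusively($path);   # refused: the file now exists
my $why    = "$!";                        # reason for the second failure

print defined $first  ? "first call created the file\n"
                      : "first call failed\n";
print defined $second ? "second call succeeded\n"
                      : "second call refused: $why\n";
```

Because the existence check and the creation happen in one system call,
there is no window between them in which another process could create the
file.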

Sometimes you may want to truncate an already-existing file.  This
can be done using the C<O_TRUNC> flag.  The behavior of
C<O_TRUNC> with C<O_RDONLY> is undefined.
X<O_TRUNC>

You should seldom if ever use C<0644> as an argument to
L<C<sysopen>|/sysopen FILEHANDLE,FILENAME,MODE>, because
that takes away the user's option to have a more permissive umask.
Better to omit it.  See L<C<umask>|/umask EXPR> for more on this.

Note that under Perls older than 5.8.0,
L<C<sysopen>|/sysopen FILEHANDLE,FILENAME,MODE> depends on the
L<fdopen(3)> C library function.  On many Unix systems, L<fdopen(3)> is known
to fail when file descriptors exceed a certain value, typically 255.  If
you need more file descriptors than that, consider using the
L<C<POSIX::open>|POSIX/C<open>> function.  For Perls 5.8.0 and later,
PerlIO is (most often) the default.

See L<perlopentut> for a kinder, gentler explanation of opening files.

Portability issues: L<perlport/sysopen>.

=item sysread FILEHANDLE,SCALAR,LENGTH,OFFSET
X<sysread>

=item sysread FILEHANDLE,SCALAR,LENGTH

=for Pod::Functions fixed-length unbuffered input from a filehandle

Attempts to read LENGTH bytes of data into variable SCALAR from the
specified FILEHANDLE, using L<read(2)>.  It bypasses
buffered IO, so mixing this with other kinds of reads,
L<C<print>|/print FILEHANDLE LIST>, L<C<write>|/write FILEHANDLE>,
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
L<C<tell>|/tell FILEHANDLE>, or L<C<eof>|/eof FILEHANDLE> can cause
confusion because the
perlio or stdio layers usually buffer data.  Returns the number of
bytes actually read, C<0> at end of file, or undef if there was an
error (in the latter case L<C<$!>|perlvar/$!> is also set).  SCALAR will
be grown or
shrunk so that the last byte actually read is the last byte of the
scalar after the read.

An OFFSET may be specified to place the read data at some place in the
string other than the beginning.  A negative OFFSET specifies
placement at that many characters counting backwards from the end of
the string.  A positive OFFSET greater than the length of SCALAR
results in the string being padded to the required size with C<"\0">
bytes before the result of the read is appended.
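
A read loop built on the return values and the OFFSET argument described
above might look like the following sketch.  The helper name
C<slurp_unbuffered>, the 4096-byte chunk size, and the temporary file are
assumptions made for the example; interrupted reads (C<EINTR>) are not
retried here.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Prepare a scratch file with known contents (illustrative only).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "x" x 10_000;
close $tmp_fh or die "close: $!";

# Hypothetical helper: read a whole file with sysread, appending each
# chunk at the current end of the buffer by passing OFFSET.
sub slurp_unbuffered {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "open $path: $!";
    my $buf = '';
    while (1) {
        my $n = sysread($fh, $buf, 4096, length $buf);
        die "sysread $path: $!" unless defined $n;   # undef means error
        last if $n == 0;                             # 0 means end of file
    }
    close $fh;
    return $buf;
}

my $data = slurp_unbuffered($tmp_name);
print "read ", length($data), " bytes\n";
```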

There is no syseof() function, which is ok, since
L<C<eof>|/eof FILEHANDLE> doesn't work well on device files (like ttys)
anyway.  Use L<C<sysread>|/sysread FILEHANDLE,SCALAR,LENGTH,OFFSET> and
check for a return value of 0 to decide whether you're done.

Note that if the filehandle has been marked as C<:utf8>, Unicode
characters are read instead of bytes (the LENGTH, OFFSET, and the
return value of L<C<sysread>|/sysread FILEHANDLE,SCALAR,LENGTH,OFFSET>
are in Unicode characters).  The C<:encoding(...)> layer implicitly
introduces the C<:utf8> layer.  See
L<C<binmode>|/binmode FILEHANDLE, LAYER>,
L<C<open>|/open FILEHANDLE,EXPR>, and the L<open> pragma.

=item sysseek FILEHANDLE,POSITION,WHENCE
X<sysseek> X<lseek>

=for Pod::Functions +5.004 position I/O pointer on handle used with sysread and syswrite

Sets FILEHANDLE's system position I<in bytes> using L<lseek(2)>.  FILEHANDLE may
be an expression whose value gives the name of the filehandle.  The values
for WHENCE are C<0> to set the new position to POSITION; C<1> to set it
to the current position plus POSITION; and C<2> to set it to EOF plus
POSITION, typically negative.

Note the emphasis on bytes: even if the filehandle has been set to operate
on characters (for example using the C<:encoding(UTF-8)> I/O layer), the
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
L<C<tell>|/tell FILEHANDLE>, and
L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>
family of functions use byte offsets, not character offsets,
because seeking to a character offset would be very slow in a UTF-8 file.

L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE> bypasses normal
buffered IO, so mixing it with reads other than
L<C<sysread>|/sysread FILEHANDLE,SCALAR,LENGTH,OFFSET> (for example
L<C<readline>|/readline EXPR> or
L<C<read>|/read FILEHANDLE,SCALAR,LENGTH,OFFSET>),
L<C<print>|/print FILEHANDLE LIST>, L<C<write>|/write FILEHANDLE>,
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
L<C<tell>|/tell FILEHANDLE>, or L<C<eof>|/eof FILEHANDLE> may cause
confusion.

For WHENCE, you may also use the constants C<SEEK_SET>, C<SEEK_CUR>,
and C<SEEK_END> (start of the file, current position, end of the file)
from the L<Fcntl> module.  Use of the constants is also more portable
than relying on 0, 1, and 2.  For example, to define a "systell" function:

    use Fcntl 'SEEK_CUR';
    sub systell { sysseek($_[0], 0, SEEK_CUR) }

Returns the new position, or the undefined value on failure.  A position
of zero is returned as the string C<"0 but true">; thus
L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE> returns
true on success and false on failure, yet you can still easily determine
the new position.
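
The C<"0 but true"> return value can be seen in a short sketch such as the
one below.  The temporary file and the variable names are assumptions made
for illustration only.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET SEEK_CUR SEEK_END);
use File::Temp qw(tempfile);

my ($fh, $name) = tempfile(UNLINK => 1);
defined syswrite($fh, "0123456789") or die "syswrite: $!";

# Seeking to offset zero succeeds, so the result must be true,
# yet it must also be usable as the number 0.
my $start   = sysseek($fh, 0, SEEK_SET);
my $is_true = $start ? 1 : 0;
my $numeric = $start + 0;       # "0 but true" is exempt from numeric warnings

my $end  = sysseek($fh, 0, SEEK_END);   # offset of end of file
my $here = sysseek($fh, 0, SEEK_CUR);   # current offset, like a "systell"

print "start='$start' true=$is_true numeric=$numeric end=$end here=$here\n";
```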

=item system LIST
X<system> X<shell>

=item system PROGRAM LIST

=for Pod::Functions run a separate program

Does exactly the same thing as L<C<exec>|/exec LIST>, except that a fork is
done first and the parent process waits for the child process to
exit.  Note that argument processing varies depending on the
number of arguments.  If there is more than one argument in LIST,
or if LIST is an array with more than one value, starts the program
given by the first element of the list with arguments given by the
rest of the list.  If there is only one scalar argument, the argument
is checked for shell metacharacters, and if there are any, the
entire argument is passed to the system's command shell for parsing
(this is C</bin/sh -c> on Unix platforms, but varies on other
platforms).  If there are no shell metacharacters in the argument,
it is split into words and passed directly to C<execvp>, which is
more efficient.  On Windows, only the C<system PROGRAM LIST> syntax will
reliably avoid using the shell; C<system LIST>, even with more than one
element, will fall back to the shell if the first spawn fails.

Perl will attempt to flush all files opened for
output before any operation that may do a fork, but this may not be
supported on some platforms (see L<perlport>).  To be safe, you may need
to set L<C<$E<verbar>>|perlvar/$E<verbar>> (C<$AUTOFLUSH> in L<English>)
or call the C<autoflush> method of L<C<IO::Handle>|IO::Handle/METHODS>
on any open handles.

The return value is the exit status of the program as returned by the
L<C<wait>|/wait> call.  To get the actual exit value, shift right by
eight (see below).  See also L<C<exec>|/exec LIST>.  This is I<not> what
you want to use to capture the output from a command; for that you
should simply use backticks or
L<C<qxE<sol>E<sol>>|/qxE<sol>STRINGE<sol>>, as described in
L<perlop/"`STRING`">.  A return value of -1 indicates a failure to start
the program or an error of the L<wait(2)> system call (inspect
L<C<$!>|perlvar/$!> for the reason).

If you'd like to make L<C<system>|/system LIST> (and many other bits of
Perl) die on error, have a look at the L<autodie> pragma.

Like L<C<exec>|/exec LIST>, L<C<system>|/system LIST> allows you to lie
to a program about its name if you use the C<system PROGRAM LIST>
syntax.  Again, see L<C<exec>|/exec LIST>.

Since C<SIGINT> and C<SIGQUIT> are ignored during the execution of
L<C<system>|/system LIST>, if you expect your program to terminate on
receipt of these signals you will need to arrange to do so yourself
based on the return value.

    my @args = ("command", "arg1", "arg2");
    system(@args) == 0
        or die "system @args failed: $?";

If you'd like to manually inspect L<C<system>|/system LIST>'s failure,
you can check all possible failure modes by inspecting
L<C<$?>|perlvar/$?> like this:

    if ($? == -1) {
        print "failed to execute: $!\n";
    }
    elsif ($? & 127) {
        printf "child died with signal %d, %s coredump\n",
            ($? & 127),  ($? & 128) ? 'with' : 'without';
    }
    else {
        printf "child exited with value %d\n", $? >> 8;
    }

Alternatively, you may inspect the value of
L<C<${^CHILD_ERROR_NATIVE}>|perlvar/${^CHILD_ERROR_NATIVE}> with the
L<C<W*()>|POSIX/C<WIFEXITED>> calls from the L<POSIX> module.
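
A sketch of that alternative follows.  It assumes a platform where the
L<POSIX> module provides the C<W*()> macros (typical Unix systems), and it
runs the current interpreter (C<$^X>) as a stand-in child program; the
variable names are illustrative.

```perl
use strict;
use warnings;
use POSIX qw(WIFEXITED WEXITSTATUS WIFSIGNALED WTERMSIG);

# Run a child Perl that exits with status 3.  The list form avoids the shell.
my $rc     = system($^X, '-e', 'exit 3');
my $native = ${^CHILD_ERROR_NATIVE};

my $report;
if ($rc == -1) {
    $report = "failed to start: $!";
}
elsif (WIFEXITED($native)) {
    $report = "exited with value " . WEXITSTATUS($native);
}
elsif (WIFSIGNALED($native)) {
    $report = "killed by signal " . WTERMSIG($native);
}
else {
    $report = "unexpected status $native";
}
print "$report\n";
```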

When L<C<system>|/system LIST>'s arguments are executed indirectly by
the shell, results and return codes are subject to its quirks.
See L<perlop/"`STRING`"> and L<C<exec>|/exec LIST> for details.

Since L<C<system>|/system LIST> does a L<C<fork>|/fork> and
L<C<wait>|/wait> it may affect a C<SIGCHLD> handler.  See L<perlipc> for
details.

Portability issues: L<perlport/system>.

=item syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET
X<syswrite>

=item syswrite FILEHANDLE,SCALAR,LENGTH

=item syswrite FILEHANDLE,SCALAR

=for Pod::Functions fixed-length unbuffered output to a filehandle

Attempts to write LENGTH bytes of data from variable SCALAR to the
specified FILEHANDLE, using L<write(2)>.  If LENGTH is
not specified, writes whole SCALAR.  It bypasses buffered IO, so
mixing this with reads (other than C<sysread>),
L<C<print>|/print FILEHANDLE LIST>, L<C<write>|/write FILEHANDLE>,
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
L<C<tell>|/tell FILEHANDLE>, or L<C<eof>|/eof FILEHANDLE> may cause
confusion because the perlio and stdio layers usually buffer data.
Returns the number of bytes actually written, or L<C<undef>|/undef EXPR>
if there was an error (in this case the errno variable
L<C<$!>|perlvar/$!> is also set).  If the LENGTH is greater than the
data available in the SCALAR after the OFFSET, only as much data as is
available will be written.
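
Because the return value may be smaller than the amount requested, callers
that need everything written usually loop.  The following is a minimal
sketch using LENGTH and OFFSET; the helper name C<write_all> and the
temporary file are assumptions for the example, and interrupted writes are
not retried.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: keep calling syswrite until all of $data is written.
sub write_all {
    my ($fh, $data) = @_;
    my $len    = length $data;
    my $offset = 0;
    while ($offset < $len) {
        my $n = syswrite($fh, $data, $len - $offset, $offset);
        die "syswrite: $!" unless defined $n;   # undef means error
        $offset += $n;                          # advance past what was written
    }
    return $offset;
}

my ($fh, $name) = tempfile(UNLINK => 1);
binmode $fh;                                    # plain bytes, no layers
my $written = write_all($fh, "hello world\n");
close $fh or die "close: $!";
print "wrote $written bytes\n";
```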

An OFFSET may be specified to write the data from some part of the
string other than the beginning.  A negative OFFSET specifies writing
that many characters counting backwards from the end of the string.
If SCALAR is of length zero, you can only use an OFFSET of 0.

B<WARNING>: If the filehandle is marked C<:utf8>, Unicode characters
encoded in UTF-8 are written instead of bytes, and the LENGTH, OFFSET, and
return value of L<C<syswrite>|/syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET>
are in (UTF8-encoded Unicode) characters.
The C<:encoding(...)> layer implicitly introduces the C<:utf8> layer.
Alternatively, if the handle is not marked with an encoding but you
attempt to write characters with code points over 255, raises an exception.
See L<C<binmode>|/binmode FILEHANDLE, LAYER>,
L<C<open>|/open FILEHANDLE,EXPR>, and the L<open> pragma.

=item tell FILEHANDLE
X<tell>

=item tell

=for Pod::Functions get current seekpointer on a filehandle

Returns the current position I<in bytes> for FILEHANDLE, or -1 on
error.  FILEHANDLE may be an expression whose value gives the name of
the actual filehandle.  If FILEHANDLE is omitted, assumes the file
last read.

Note the emphasis on bytes: even if the filehandle has been set to operate
on characters (for example using the C<:encoding(UTF-8)> I/O layer), the
L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
L<C<tell>|/tell FILEHANDLE>, and
L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>
family of functions use byte offsets, not character offsets,
because seeking to a character offset would be very slow in a UTF-8 file.
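
The difference between character counts and byte offsets can be observed
with a small sketch like this one.  The temporary file and the sample text
are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $name) = tempfile(UNLINK => 1);
binmode($fh, ':encoding(UTF-8)') or die "binmode: $!";

# Two characters: U+00E9 (two bytes in UTF-8) followed by "a" (one byte).
my $text = "\x{e9}a";
print {$fh} $text;

my $chars = length $text;   # counts characters
my $bytes = tell $fh;       # counts bytes in the file
close $fh or die "close: $!";

print "characters written: $chars, byte offset reported by tell: $bytes\n";
```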

The return value of L<C<tell>|/tell FILEHANDLE> for the standard streams
like STDIN depends on the operating system: it may return -1 or
something else.  L<C<tell>|/tell FILEHANDLE> on pipes, fifos, and
sockets usually returns -1.

There is no C<systell> function.  Use
L<C<sysseek($fh, 0, 1)>|/sysseek FILEHANDLE,POSITION,WHENCE> for that.

Do not use L<C<tell>|/tell FILEHANDLE> (or other buffered I/O
operations) on a filehandle that has been manipulated by
L<C<sysread>|/sysread FILEHANDLE,SCALAR,LENGTH,OFFSET>,
L<C<syswrite>|/syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET>, or
L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>.  Those functions
ignore the buffering, while L<C<tell>|/tell FILEHANDLE> does not.

=item telldir DIRHANDLE
X<telldir>

=for Pod::Functions get current seekpointer on a directory handle

Returns the current position of the L<C<readdir>|/readdir DIRHANDLE>
routines on DIRHANDLE.  The value may be given to
L<C<seekdir>|/seekdir DIRHANDLE,POS> to access a particular location in
a directory.  L<C<telldir>|/telldir DIRHANDLE> has the same caveats
about possible directory compaction as the corresponding system library
routine.

=item tie VARIABLE,CLASSNAME,LIST
X<tie>

=for Pod::Functions +5.002 bind a variable to an object class

This function binds a variable to a package class that will provide the
implementation for the variable.  VARIABLE is the name of the variable
to be enchanted.  CLASSNAME is the name of a class implementing objects
of correct type.  Any additional arguments are passed to the
appropriate constructor
method of the class (meaning C<TIESCALAR>, C<TIEHANDLE>, C<TIEARRAY>,
or C<TIEHASH>).  Typically these are arguments such as might be passed
to the L<dbm_open(3)> function of C.  The object returned by the
constructor is also returned by the
L<C<tie>|/tie VARIABLE,CLASSNAME,LIST> function, which would be useful
if you want to access other methods in CLASSNAME.

Note that functions such as L<C<keys>|/keys HASH> and
L<C<values>|/values HASH> may return huge lists when used on large
objects, like DBM files.  You may prefer to use the L<C<each>|/each
HASH> function to iterate over such.  Example:

    # print out history file offsets
    use NDBM_File;
    tie(my %HIST, 'NDBM_File', '/usr/lib/news/history', 1, 0);
    while (my ($key,$val) = each %HIST) {
        print $key, ' = ', unpack('L', $val), "\n";
    }

A class implementing a hash should have the following methods:

    TIEHASH classname, LIST
    FETCH this, key
    STORE this, key, value
    DELETE this, key
    CLEAR this
    EXISTS this, key
    FIRSTKEY this
    NEXTKEY this, lastkey
    SCALAR this
    DESTROY this
    UNTIE this

A class implementing an ordinary array should have the following methods:

    TIEARRAY classname, LIST
    FETCH this, key
    STORE this, key, value
    FETCHSIZE this
    STORESIZE this, count
    CLEAR this
    PUSH this, LIST
    POP this
    SHIFT this
    UNSHIFT this, LIST
    SPLICE this, offset, length, LIST
    EXTEND this, count
    DELETE this, key
    EXISTS this, key
    DESTROY this
    UNTIE this

A class implementing a filehandle should have the following methods:

    TIEHANDLE classname, LIST
    READ this, scalar, length, offset
    READLINE this
    GETC this
    WRITE this, scalar, length, offset
    PRINT this, LIST
    PRINTF this, format, LIST
    BINMODE this
    EOF this
    FILENO this
    SEEK this, position, whence
    TELL this
    OPEN this, mode, LIST
    CLOSE this
    DESTROY this
    UNTIE this

A class implementing a scalar should have the following methods:

    TIESCALAR classname, LIST
    FETCH this
    STORE this, value
    DESTROY this
    UNTIE this

Not all methods indicated above need be implemented.  See L<perltie>,
L<Tie::Hash>, L<Tie::Array>, L<Tie::Scalar>, and L<Tie::Handle>.
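
As a minimal sketch of the scalar interface listed above, the following
defines a hypothetical C<Counter> class (not part of any distribution)
that implements only C<TIESCALAR>, C<FETCH>, and C<STORE>, then ties a
variable to it.

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

# Constructor: the extra argument to tie() arrives here as $start.
sub TIESCALAR {
    my ($class, $start) = @_;
    my $n = $start // 0;
    return bless \$n, $class;
}

# Each read returns the current count, then increments it.
sub FETCH {
    my ($self) = @_;
    return $$self++;
}

# Each assignment resets the count.
sub STORE {
    my ($self, $value) = @_;
    $$self = $value;
}

package main;

my $obj = tie my $counter, 'Counter', 5;

my $first  = $counter;      # calls FETCH
my $second = $counter;      # calls FETCH again
$counter   = 100;           # calls STORE
my $after  = $counter;

# tied() returns the same object that tie() returned.
my $same = (tied($counter) == $obj) ? 1 : 0;

undef $obj;                 # drop the extra reference before untying
untie $counter;

print "first=$first second=$second after=$after same=$same\n";
```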

Unlike L<C<dbmopen>|/dbmopen HASH,DBNAME,MASK>, the
L<C<tie>|/tie VARIABLE,CLASSNAME,LIST> function will not
L<C<use>|/use Module VERSION LIST> or L<C<require>|/require VERSION> a
module for you; you need to do that explicitly yourself.  See L<DB_File>
or the L<Config> module for interesting
L<C<tie>|/tie VARIABLE,CLASSNAME,LIST> implementations.

For further details see L<perltie>, L<C<tied>|/tied VARIABLE>.

=item tied VARIABLE
X<tied>

=for Pod::Functions get a reference to the object underlying a tied variable

Returns a reference to the object underlying VARIABLE (the same value
that was originally returned by the
L<C<tie>|/tie VARIABLE,CLASSNAME,LIST> call that bound the variable
to a package.)  Returns the undefined value if VARIABLE isn't tied to a
package.

=item time
X<time> X<epoch>

=for Pod::Functions return number of seconds since 1970

Returns the number of non-leap seconds since whatever time the system
considers to be the epoch, suitable for feeding to
L<C<gmtime>|/gmtime EXPR> and L<C<localtime>|/localtime EXPR>.  On most
systems the epoch is 00:00:00 UTC, January 1, 1970;
a prominent exception being Mac OS Classic which uses 00:00:00, January 1,
1904 in the current local time zone for its epoch.

For measuring time in better granularity than one second, use the
L<Time::HiRes> module from Perl 5.8 onwards (or from CPAN before then), or,
if you have L<gettimeofday(2)>, you may be able to use the
L<C<syscall>|/syscall NUMBER, LIST> interface of Perl.  See L<perlfaq8>
for details.
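
A short sketch of both granularities follows.  It assumes L<Time::HiRes>
is installed (it ships with Perl from 5.8 onwards) and calls its functions
by their full names so that the builtin L<C<time>|/time> is left untouched;
the variable names are illustrative.

```perl
use strict;
use warnings;
use Time::HiRes ();

# Whole seconds since the epoch, fed to gmtime and localtime.
my $now         = time;
my $utc         = gmtime($now);
my $local       = localtime($now);
my $epoch_start = gmtime(0);      # what this system considers the epoch

# Sub-second timing with Time::HiRes.
my $t0 = Time::HiRes::time();
Time::HiRes::sleep(0.05);
my $elapsed = Time::HiRes::time() - $t0;

print "now:     $now\n";
print "UTC:     $utc\n";
print "local:   $local\n";
print "epoch:   $epoch_start\n";
printf "elapsed: %.3f seconds\n", $elapsed;
```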

For date and time processing look at the many related modules on CPAN.
For a comprehensive date and time representation look at the
L<DateTime> module.

=item times
X<times>

=for Pod::Functions return elapsed time for self and child processes

Returns a four-element list giving the user and system times in
seconds for this process and any exited children of this process.

    my ($user,$system,$cuser,$csystem) = times;

In scalar context, L<C<times>|/times> returns C<$user>.

Children's times are only included for terminated children.
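
One common use is measuring the CPU time consumed by a stretch of code, as
in this sketch.  The loop is an arbitrary stand-in workload, and the
resolution of the result depends on the system clock tick, so very short
runs may report zero.

```perl
use strict;
use warnings;

my ($user0, $system0) = times;

# Arbitrary workload, for illustration only.
my $sum = 0;
$sum += $_ for 1 .. 200_000;

my ($user1, $system1) = times;

# CPU seconds used by this process between the two calls.
my $cpu = ($user1 - $user0) + ($system1 - $system0);
printf "sum=%d, CPU time used: %.2f seconds\n", $sum, $cpu;
```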

Portability issues: L<perlport/times>.

=item tr///

=for Pod::Functions transliterate a string

The transliteration operator.  Same as
L<C<yE<sol>E<sol>E<sol>>|/yE<sol>E<sol>E<sol>>.  See
L<perlop/"Quote-Like Operators">.

=item truncate FILEHANDLE,LENGTH
X<truncate>

=item truncate EXPR,LENGTH

=for Pod::Functions shorten a file

Truncates the file opened on FILEHANDLE, or named by EXPR, to the
specified length.  Raises an exception if truncate isn't implemented
on your system.  Returns true if successful, L<C<undef>|/undef EXPR> on
error.

The behavior is undefined if LENGTH is greater than the length of the
file.

The position in the file of FILEHANDLE is left unchanged.  You may want to
call L<seek|/"seek FILEHANDLE,POSITION,WHENCE"> before writing to the
file.
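
The following sketch shortens a file and then repositions before writing
again, as suggested above.  The temporary file and its contents are
assumptions made for the example.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_END);
use File::Temp qw(tempfile);

my ($fh, $name) = tempfile(UNLINK => 1);
print {$fh} "0123456789";

# Keep only the first four bytes.
truncate($fh, 4) or die "truncate: $!";

# Move to the new end of the file before writing more data.
seek($fh, 0, SEEK_END) or die "seek: $!";
print {$fh} "ab";
close $fh or die "close: $!";

my $size = -s $name;
print "final size: $size bytes\n";
```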

Portability issues: L<perlport/truncate>.

=item uc EXPR
X<uc> X<uppercase> X<toupper>

=item uc

=for Pod::Functions return upper-case version of a string

Returns an uppercased version of EXPR.  This is the internal function
implementing the C<\U> escape in double-quoted strings.
It does not attempt to do titlecase mapping on initial letters.  See
L<C<ucfirst>|/ucfirst EXPR> for that.

If EXPR is omitted, uses L<C<$_>|perlvar/$_>.

This function behaves the same way under various pragmas, such as in a locale,
as L<C<lc>|/lc EXPR> does.

=item ucfirst EXPR
X<ucfirst> X<uppercase>

=item ucfirst

=for Pod::Functions return a string with just the next letter in upper case

Returns the value of EXPR with the first character in uppercase
(titlecase in Unicode).  This is the internal function implementing
the C<\u> escape in double-quoted strings.

If EXPR is omitted, uses L<C<$_>|perlvar/$_>.

This function behaves the same way under various pragmas, such as in a locale,
as L<C<lc>|/lc EXPR> does.

=item umask EXPR
X<umask>

=item umask

=for Pod::Functions set file creation mode mask

Sets the umask for the process to EXPR and returns the previous value.
If EXPR is omitted, merely returns the current umask.

The Unix permission C<rwxr-x---> is represented as three sets of three
bits, or three octal digits: C<0750> (the leading 0 indicates octal
and isn't one of the digits).  The L<C<umask>|/umask EXPR> value is such
a number representing disabled permissions bits.  The permission (or
"mode") values you pass L<C<mkdir>|/mkdir FILENAME,MASK> or
L<C<sysopen>|/sysopen FILEHANDLE,FILENAME,MODE> are modified by your
umask, so even if you tell
L<C<sysopen>|/sysopen FILEHANDLE,FILENAME,MODE> to create a file with
permissions C<0777>, if your umask is C<0022>, then the file will
actually be created with permissions C<0755>.  If your
L<C<umask>|/umask EXPR> were C<0027> (group can't write; others can't
read, write, or execute), then passing
L<C<sysopen>|/sysopen FILEHANDLE,FILENAME,MODE> C<0666> would create a
file with mode C<0640> (because C<0666 &~ 027> is C<0640>).
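
The arithmetic above can be checked with a short sketch.  It assumes a
Unix-like filesystem that honors permission bits; the scratch directory
and the file name F<demo> are illustrative.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/demo";

my $expected = 0666 & ~0027;          # what the umask should leave enabled

my $old = umask 0027;                 # set a new umask, remember the old one
sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0666)
    or die "sysopen $path: $!";
close $fh;
umask $old;                           # restore the previous umask

my $mode = (stat $path)[2] & 07777;   # permission bits only
printf "expected %04o, got %04o\n", $expected, $mode;
```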

Here's some advice: supply a creation mode of C<0666> for regular
files (in L<C<sysopen>|/sysopen FILEHANDLE,FILENAME,MODE>) and one of
C<0777> for directories (in L<C<mkdir>|/mkdir FILENAME,MASK>) and
executable files.  This gives users the freedom of
choice: if they want protected files, they might choose process umasks
of C<022>, C<027>, or even the particularly antisocial mask of C<077>.
Programs should rarely if ever make policy decisions better left to
the user.  The exception to this is when writing files that should be
kept private: mail files, web browser cookies, F<.rhosts> files, and
so on.

If L<umask(2)> is not implemented on your system and you are trying to
restrict access for I<yourself> (i.e., C<< (EXPR & 0700) > 0 >>),
raises an exception.  If L<umask(2)> is not implemented and you are
not trying to restrict access for yourself, returns
L<C<undef>|/undef EXPR>.

Remember that a umask is a number, usually given in octal; it is I<not> a
string of octal digits.  See also L<C<oct>|/oct EXPR>, if all you have
is a string.

Portability issues: L<perlport/umask>.

=item undef EXPR
X<undef> X<undefine>

=item undef

=for Pod::Functions remove a variable or function definition

Undefines the value of EXPR, which must be an lvalue.  Use only on a
scalar value, an array (using C<@>), a hash (using C<%>), a subroutine
(using C<&>), or a typeglob (using C<*>).  Saying C<undef $hash{$key}>
will probably not do what you expect on most predefined variables or
DBM list values, so don't do that; see L<C<delete>|/delete EXPR>.
Always returns the undefined value.
You can omit the EXPR, in which case nothing is
undefined, but you still get an undefined value that you could, for
instance, return from a subroutine, assign to a variable, or pass as a
parameter.  Examples:

    undef $foo;
    undef $bar{'blurfl'};      # Compare to: delete $bar{'blurfl'};
    undef @ary;
    undef %hash;
    undef &mysub;
    undef *xyz;       # destroys $xyz, @xyz, %xyz, &xyz, etc.
    return (wantarray ? (undef, $errmsg) : undef) if $they_blew_it;
    select undef, undef, undef, 0.25;
    my ($x, $y, undef, $z) = foo();    # Ignore third value returned

Note that this is a unary operator, not a list operator.
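
The difference between undefining a hash element and deleting it, noted in
the examples above, can be made concrete with this small sketch (the hash
and its keys are illustrative).

```perl
use strict;
use warnings;

my %h = (a => 1, b => 2);

undef $h{a};     # the key stays; only its value becomes undefined
delete $h{b};    # the key itself is removed

my $a_exists  = exists  $h{a} ? 1 : 0;
my $a_defined = defined $h{a} ? 1 : 0;
my $b_exists  = exists  $h{b} ? 1 : 0;
my @keys      = sort keys %h;

print "a exists=$a_exists defined=$a_defined; b exists=$b_exists; ",
      "keys=@keys\n";
```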

=item unlink LIST
X<unlink> X<delete> X<remove> X<rm> X<del>

=item unlink

=for Pod::Functions remove one link to a file

Deletes a list of files.  On success, it returns the number of files
it successfully deleted.  On failure, it returns false and sets
L<C<$!>|perlvar/$!> (errno):

    my $unlinked = unlink 'a', 'b', 'c';
    unlink @goners;
    unlink glob "*.bak";

On error, L<C<unlink>|/unlink LIST> will not tell you which files it
could not remove.
If you want to know which files you could not remove, try them one
at a time:

     foreach my $file ( @goners ) {
         unlink $file or warn "Could not unlink $file: $!";
     }

Note: L<C<unlink>|/unlink LIST> will not attempt to delete directories
unless you are
superuser and the B<-U> flag is supplied to Perl.  Even if these
conditions are met, be warned that unlinking a directory can inflict
damage on your filesystem.  Finally, using L<C<unlink>|/unlink LIST> on
directories is not supported on many operating systems.  Use
L<C<rmdir>|/rmdir FILENAME> instead.

If LIST is omitted, L<C<unlink>|/unlink LIST> uses L<C<$_>|perlvar/$_>.

=item unpack TEMPLATE,EXPR
X<unpack>

=item unpack TEMPLATE

=for Pod::Functions convert binary structure into normal perl variables

L<C<unpack>|/unpack TEMPLATE,EXPR> does the reverse of
L<C<pack>|/pack TEMPLATE,LIST>: it takes a string
and expands it out into a list of values.
(In scalar context, it returns merely the first value produced.)

If EXPR is omitted, unpacks the L<C<$_>|perlvar/$_> string.
See L<perlpacktut> for an introduction to this function.

The string is broken into chunks described by the TEMPLATE.  Each chunk
is converted separately to a value.  Typically, either the string is a result
of L<C<pack>|/pack TEMPLATE,LIST>, or the characters of the string
represent a C structure of some kind.

The TEMPLATE has the same format as in the
L<C<pack>|/pack TEMPLATE,LIST> function.
Here's a subroutine that does substring:

    sub substr {
        my ($what, $where, $howmuch) = @_;
        unpack("x$where a$howmuch", $what);
    }

and then there's

    sub ordinal { unpack("W",$_[0]); } # same as ord()

In addition to fields allowed in L<C<pack>|/pack TEMPLATE,LIST>, you may
prefix a field with a %<number> to indicate that
you want a <number>-bit checksum of the items instead of the items
themselves.  Default is a 16-bit checksum.  The checksum is calculated by
summing numeric values of expanded values (for string fields the sum of
C<ord($char)> is taken; for bit fields the sum of zeroes and ones).

For example, the following
computes the same number as the System V sum program:

    my $checksum = do {
        local $/;  # slurp!
        unpack("%32W*", readline) % 65535;
    };

The following efficiently counts the number of set bits in a bit vector:

    my $setbits = unpack("%32b*", $selectmask);
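
As a further sketch of template-driven unpacking and of the C<%> checksum
prefix, the following packs a small record and takes it apart again.  The
field layout and the sample values are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# A 7-byte record: 16-bit big-endian integer, one unsigned byte,
# and a 4-byte space-padded string.
my $record = pack("n C A4", 513, 7, "perl");
my ($id, $flag, $tag) = unpack("n C A4", $record);

# Checksums: count the set bits in two bytes, and sum three byte values.
my $bits = unpack("%32b*", "\x0F\xF0");
my $sum  = unpack("%16C*", "abc");

print "length=", length($record), " id=$id flag=$flag tag=$tag\n";
print "set bits=$bits byte sum=$sum\n";
```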

The C<p> and C<P> formats should be used with care.  Since Perl
has no way of checking whether the value passed to
L<C<unpack>|/unpack TEMPLATE,EXPR>
corresponds to a valid memory location, passing a pointer value that's
not known to be valid is likely to have disastrous consequences.

If there are more pack codes or if the repeat count of a field or a group
is larger than what the remainder of the input string allows, the result
is not well defined: the repeat count may be decreased, or
L<C<unpack>|/unpack TEMPLATE,EXPR> may produce empty strings or zeros,
or it may raise an exception.
If the input string is longer than one described by the TEMPLATE,
the remainder of that input string is ignored.

See L<C<pack>|/pack TEMPLATE,LIST> for more examples and notes.

=item unshift ARRAY,LIST
X<unshift>

=for Pod::Functions prepend more elements to the beginning of a list

Does the opposite of a L<C<shift>|/shift ARRAY>, or the opposite of a
L<C<push>|/push ARRAY,LIST>,
depending on how you look at it.  Prepends LIST to the front of the
array and returns the new number of elements in the array.

    unshift(@ARGV, '-e') unless $ARGV[0] =~ /^-/;

Note the LIST is prepended whole, not one element at a time, so the
prepended elements stay in the same order.  Use
L<C<reverse>|/reverse LIST> to do the reverse.
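
A brief sketch of the ordering behavior described above, using an
illustrative array:

```perl
use strict;
use warnings;

my @queue = (3, 4);

# The list is prepended whole, so 1 still comes before 2.
my $count = unshift @queue, 1, 2;
my @after_first = @queue;

# With reverse, the prepended elements arrive in the opposite order.
unshift @queue, reverse(-1, 0);
my @after_second = @queue;

print "count=$count\n";
print "after first unshift:  @after_first\n";
print "after second unshift: @after_second\n";
```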

Starting with Perl 5.14, an experimental feature allowed
L<C<unshift>|/unshift ARRAY,LIST> to take
a scalar expression. This experiment has been deemed unsuccessful, and was
removed as of Perl 5.24.

=item untie VARIABLE
X<untie>

=for Pod::Functions break a tie binding to a variable

Breaks the binding between a variable and a package.
(See L<tie|/tie VARIABLE,CLASSNAME,LIST>.)
Has no effect if the variable is not tied.

=item use Module VERSION LIST
X<use> X<module> X<import>

=item use Module VERSION

=item use Module LIST

=item use Module

=item use VERSION

=for Pod::Functions load in a module at compile time and import its namespace

Imports some semantics into the current package from the named module,
generally by aliasing certain subroutine or variable names into your
package.  It is exactly equivalent to

    BEGIN { require Module; Module->import( LIST ); }

except that Module I<must> be a bareword.
The importation can be made conditional by using the L<if> module.

In the peculiar C<use VERSION> form, VERSION may be either a positive
decimal fraction such as 5.006, which will be compared to
L<C<$]>|perlvar/$]>, or a v-string of the form v5.6.1, which will be
compared to L<C<$^V>|perlvar/$^V> (aka $PERL_VERSION).  An
exception is raised if VERSION is greater than the version of the
current Perl interpreter; Perl will not attempt to parse the rest of the
file.  Compare with L<C<require>|/require VERSION>, which can do a
similar check at run time.
Symmetrically, C<no VERSION> allows you to specify that you want a version
of Perl older than the specified one.

Specifying VERSION as a literal of the form v5.6.1 should generally be
avoided, because it leads to misleading error messages under earlier
versions of Perl (that is, prior to 5.6.0) that do not support this
syntax.  The equivalent numeric version should be used instead.

    use v5.6.1;     # compile time version check
    use 5.6.1;      # ditto
    use 5.006_001;  # ditto; preferred for backwards compatibility

This is often useful if you need to check the current Perl version before
L<C<use>|/use Module VERSION LIST>ing library modules that won't work
with older versions of Perl.
(We try not to do this more than we have to.)

C<use VERSION> also lexically enables all features available in the requested
version as defined by the L<feature> pragma, disabling any features
not in the requested version's feature bundle.  See L<feature>.
Similarly, if the specified Perl version is greater than or equal to
5.12.0, strictures are enabled lexically as
with L<C<use strict>|strict>.  Any explicit use of
C<use strict> or C<no strict> overrides C<use VERSION>, even if it comes
before it.  Later use of C<use VERSION>
will override all behavior of a previous
C<use VERSION>, possibly removing the C<strict> and C<feature> added by
C<use VERSION>.  C<use VERSION> does not
load the F<feature.pm> or F<strict.pm>
files.

The C<BEGIN> forces the L<C<require>|/require VERSION> and
L<C<import>|/import LIST> to happen at compile time.  The
L<C<require>|/require VERSION> makes sure the module is loaded into
memory if it hasn't been yet.  The L<C<import>|/import LIST> is not a
builtin; it's just an ordinary static method
call into the C<Module> package to tell the module to import the list of
features back into the current package.  The module can implement its
L<C<import>|/import LIST> method any way it likes, though most modules
just choose to derive their L<C<import>|/import LIST> method via
inheritance from the C<Exporter> class that is defined in the
L<C<Exporter>|Exporter> module.  See L<Exporter>.  If no
L<C<import>|/import LIST> method can be found, then the call is skipped,
even if there is an AUTOLOAD method.

If you do not want to call the package's L<C<import>|/import LIST>
method (for instance,
to stop your namespace from being altered), explicitly supply the empty list:

    use Module ();

That is exactly equivalent to

    BEGIN { require Module }
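
A short sketch of the difference, using the core L<POSIX> module purely
as an illustration and assuming the code runs in package C<main>: with
an empty list the module is loaded but nothing is imported, so its
functions must be called by their fully qualified names.

    use POSIX ();

    my $floor    = POSIX::floor(3.7);           # module is loaded
    my $imported = main->can('floor') ? 1 : 0;  # but nothing was imported

    print "floor is $floor, imported: $imported\n";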

If the VERSION argument is present between Module and LIST, then the
L<C<use>|/use Module VERSION LIST> will call the C<VERSION> method in
class Module with the given version as an argument:

    use Module 12.34;

is equivalent to:

    BEGIN { require Module; Module->VERSION(12.34) }

The L<default C<VERSION> method|UNIVERSAL/C<VERSION ( [ REQUIRE ] )>>,
inherited from the L<C<UNIVERSAL>|UNIVERSAL> class, croaks if the given
version is larger than the value of the variable C<$Module::VERSION>.
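
The same check can be made by hand at run time.  The sketch below
uses the core L<List::Util> module and the deliberately absurd version
number 9999 as illustrative assumptions; trapping the exception with
L<C<eval>|/eval EXPR> shows what C<use Module VERSION> would report:

    require List::Util;

    # An impossibly high requirement makes the VERSION method croak.
    my $ok = eval { List::Util->VERSION(9999); 1 };
    print "version check failed: $@" unless $ok;

    # A requirement of 0 is satisfied by any installed version.
    my $has_some = eval { List::Util->VERSION(0); 1 };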

Again, there is a distinction between omitting LIST (L<C<import>|/import
LIST> called with no arguments) and an explicit empty LIST C<()>
(L<C<import>|/import LIST> not called).  Note that there is no comma
after VERSION!

Because this is a wide-open interface, pragmas (compiler directives)
are also implemented this way.  Some of the currently implemented
pragmas are:

    use constant;
    use diagnostics;
    use integer;
    use sigtrap  qw(SEGV BUS);
    use strict   qw(subs vars refs);
    use subs     qw(afunc blurfl);
    use warnings qw(all);
    use sort     qw(stable _quicksort _mergesort);

Some of these pseudo-modules import semantics into the current
block scope (like L<C<strict>|strict> or L<C<integer>|integer>), unlike
ordinary modules, which import symbols into the current package (which
are effective through the end of the file).
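
The block scoping of a pragma can be seen in a small sketch (the
variable names are illustrative): L<C<integer>|integer> changes the
meaning of division only inside the enclosing braces.

    my $outside = 7 / 2;        # floating-point division
    my $inside;
    {
        use integer;            # in effect only until the closing brace
        $inside = 7 / 2;        # integer division
    }
    my $after = 7 / 2;          # floating-point division again

    print "$outside $inside $after\n";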

Because L<C<use>|/use Module VERSION LIST> takes effect at compile time,
it doesn't respect the ordinary flow control of the code being compiled.
In particular, putting a L<C<use>|/use Module VERSION LIST> inside the
false branch of a conditional doesn't prevent it
from being processed.  If a module or pragma only needs to be loaded
conditionally, this can be done using the L<if> pragma:

    use if $] < 5.008, "utf8";
    use if WANT_WARNINGS, warnings => qw(all);

There's a corresponding L<C<no>|/no MODULE VERSION LIST> declaration
that unimports meanings imported by L<C<use>|/use Module VERSION LIST>,
i.e., it calls C<< Module->unimport(LIST) >> instead of
L<C<import>|/import LIST>.  It behaves just as L<C<import>|/import LIST>
does with VERSION, an omitted or empty LIST,
or no unimport method being found.

    no integer;
    no strict 'refs';
    no warnings;

Care should be taken when using the C<no VERSION> form of L<C<no>|/no
MODULE VERSION LIST>.  It is
I<only> meant to be used to assert that the running Perl is of a earlier
version than its argument and I<not> to undo the feature-enabling side effects
of C<use VERSION>.

See L<perlmodlib> for a list of standard modules and pragmas.  See L<perlrun>
for the C<-M> and C<-m> command-line options to Perl that give
L<C<use>|/use Module VERSION LIST> functionality from the command-line.

=item utime LIST
X<utime>

=for Pod::Functions set a file's last access and modify times

Changes the access and modification times on each file of a list of
files.  The first two elements of the list must be the NUMERIC access
and modification times, in that order.  Returns the number of files
successfully changed.  The inode change time of each file is set
to the current time.  For example, this code has the same effect as the
Unix L<touch(1)> command when the files I<already exist> and belong to
the user running the program:

    #!/usr/bin/perl
    my $atime = my $mtime = time;
    utime $atime, $mtime, @ARGV;
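
To confirm that the times were applied, the return value can be
checked and the file examined with L<C<stat>|/stat FILEHANDLE>.  The
following is only a sketch: the scratch file comes from L<File::Temp>
and the timestamp is an arbitrary illustrative value.

    use File::Temp qw(tempfile);

    my ($fh, $path) = tempfile(UNLINK => 1);
    close $fh or die "close failed: $!";

    my $when    = 1_000_000_000;        # arbitrary timestamp
    my $changed = utime $when, $when, $path;
    my $mtime   = (stat $path)[9];      # modification time as now stored

    print "changed $changed file(s); mtime is now $mtime\n";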

Since Perl 5.8.0, if the first two elements of the list are
L<C<undef>|/undef EXPR>,
the L<utime(2)> syscall from your C library is called with a null second
argument.  On most systems, this will set the file's access and
modification times to the current time (i.e., equivalent to the example
above) and will work even on files you don't own provided you have write
permission:

    for my $file (@ARGV) {
        utime(undef, undef, $file)
            || warn "Couldn't touch $file: $!";
    }

Under NFS this will use the time of the NFS server, not the time of
the local machine.  If there is a time synchronization problem, the
NFS server and local machine will have different times.  The Unix
L<touch(1)> command will in fact normally use this form instead of the
one shown in the first example.

Passing only one of the first two elements as L<C<undef>|/undef EXPR> is
equivalent to passing a 0 and will not have the effect described when
both are L<C<undef>|/undef EXPR>.  This also triggers an
uninitialized warning.

On systems that support L<futimes(2)>, you may pass filehandles among the
files.  On systems that don't support L<futimes(2)>, passing filehandles raises
an exception.  Filehandles must be passed as globs or glob references to be
recognized; barewords are considered filenames.

Portability issues: L<perlport/utime>.

=item values HASH
X<values>

=item values ARRAY

=for Pod::Functions return a list of the values in a hash

In list context, returns a list consisting of all the values of the named
hash.  In Perl 5.12 or later only, will also return a list of the values of
an array; prior to that release, attempting to use an array argument will
produce a syntax error.  In scalar context, returns the number of values.

Hash entries are returned in an apparently random order.  The actual random
order is specific to a given hash; the exact same series of operations
on two hashes may result in a different order for each hash.  Any insertion
into the hash may change the order, as will any deletion, with the exception
that the most recent key returned by L<C<each>|/each HASH> or
L<C<keys>|/keys HASH> may be deleted without changing the order.  So
long as a given hash is unmodified you may rely on
L<C<keys>|/keys HASH>, L<C<values>|/values HASH> and
L<C<each>|/each HASH> to repeatedly return the same order
as each other.  See L<perlsec/"Algorithmic Complexity Attacks"> for
details on why hash order is randomized.  Aside from the guarantees
provided here the exact details of Perl's hash algorithm and the hash
traversal order are subject to change in any release of Perl.  Tied hashes
may behave differently to Perl's hashes with respect to changes in order on
insertion and deletion of items.
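
The ordering guarantee can be checked with a short sketch (the hash
contents are illustrative): as long as the hash is not modified in
between, element C<$i> of the C<keys> list and element C<$i> of the
C<values> list belong to the same entry.

    my %color = (apple => 'red', banana => 'yellow', lime => 'green');

    my @keys   = keys %color;
    my @values = values %color;

    # Count the positions where the two lists disagree.
    my $mismatches = grep { $color{ $keys[$_] } ne $values[$_] } 0 .. $#keys;

    print "keys and values line up\n" if $mismatches == 0;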

As a side effect, calling L<C<values>|/values HASH> resets the HASH or
ARRAY's internal iterator; see L<C<each>|/each HASH>.  (In particular,
calling L<C<values>|/values HASH> in void context resets the iterator
with no other overhead.)  Apart from resetting the iterator,
C<values @array> in list context is the same as plain C<@array>.
(We recommend that you use void context C<keys @array> for this, but
reasoned that taking C<values @array> out would require more
documentation than leaving it in.)
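
A small sketch of the iterator reset (the hash contents are
illustrative): after one call to L<C<each>|/each HASH> the iterator
sits on the second entry, and a void-context C<values> sends it back
to the start.

    my %h = (a => 1, b => 2, c => 3);

    my ($first) = each %h;      # advance the iterator by one entry
    values %h;                  # void context: only resets the iterator
    my ($again) = each %h;      # iteration starts over

    print "iterator was reset\n" if $first eq $again;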

Note that the values are not copied, which means modifying them will
modify the contents of the hash:

    for (values %hash)      { s/foo/bar/g }  # modifies %hash values
    for (@hash{keys %hash}) { s/foo/bar/g }  # same

Starting with Perl 5.14, an experimental feature allowed
L<C<values>|/values HASH> to take a
scalar expression. This experiment has been deemed unsuccessful, and was
removed as of Perl 5.24.

To avoid confusing would-be users of your code who are running earlier
versions of Perl with mysterious syntax errors, put this sort of thing at
the top of your file to signal that your code will work I<only> on Perls of
a recent vintage:

    use 5.012;  # so keys/values/each work on arrays

See also L<C<keys>|/keys HASH>, L<C<each>|/each HASH>, and
L<C<sort>|/sort SUBNAME LIST>.

=item vec EXPR,OFFSET,BITS
X<vec> X<bit> X<bit vector>

=for Pod::Functions test or set particular bits in a string

Treats the string in EXPR as a bit vector made up of elements of
width BITS and returns the value of the element specified by OFFSET
as an unsigned integer.  BITS therefore specifies the number of bits
that are reserved for each element in the bit vector.  This must
be a power of two from 1 to 32 (or 64, if your platform supports
that).

If BITS is 8, "elements" coincide with bytes of the input string.

If BITS is 16 or more, bytes of the input string are grouped into chunks
of size BITS/8, and each group is converted to a number as with
L<C<pack>|/pack TEMPLATE,LIST>/L<C<unpack>|/unpack TEMPLATE,EXPR> with
big-endian formats C<n>/C<N> (and analogously for BITS==64).  See
L<C<pack>|/pack TEMPLATE,LIST> for details.
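
The big-endian grouping can be verified against
L<C<unpack>|/unpack TEMPLATE,EXPR> with a sketch such as this one
(the four sample bytes are arbitrary):

    my $bytes = "\x12\x34\x56\x78";

    my $first16 = vec($bytes, 0, 16);   # 0x1234
    my $next16  = vec($bytes, 1, 16);   # 0x5678
    my $all32   = vec($bytes, 0, 32);   # 0x12345678

    # The same numbers come out of the big-endian unpack formats.
    my $same = $first16 == unpack("n", $bytes)
            && $all32   == unpack("N", $bytes);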

If BITS is 4 or less, the string is broken into bytes, then the bits
of each byte are broken into 8/BITS groups.  Bits of a byte are
numbered in a little-endian-ish way, as in C<0x01>, C<0x02>,
C<0x04>, C<0x08>, C<0x10>, C<0x20>, C<0x40>, C<0x80>.  For example,
breaking the single input byte C<chr(0x36)> into two groups gives a list
C<(0x6, 0x3)>; breaking it into 4 groups gives C<(0x2, 0x1, 0x3, 0x0)>.
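
The decomposition of C<chr(0x36)> described above can be reproduced
directly; this sketch simply reads every element at each width:

    my $byte = chr(0x36);                            # binary 0011 0110

    my @nibbles = map { vec($byte, $_, 4) } 0 .. 1;  # (0x6, 0x3)
    my @pairs   = map { vec($byte, $_, 2) } 0 .. 3;  # (0x2, 0x1, 0x3, 0x0)
    my @bits    = map { vec($byte, $_, 1) } 0 .. 7;  # (0,1,1,0,1,1,0,0)

    print "@nibbles | @pairs | @bits\n";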

L<C<vec>|/vec EXPR,OFFSET,BITS> may also be assigned to, in which case
parentheses are needed
to give the expression the correct precedence as in

    vec($image, $max_x * $x + $y, 8) = 3;

If the selected element is outside the string, the value 0 is returned.
If an element off the end of the string is written to, Perl will first
extend the string with sufficiently many zero bytes.   It is an error
to try to write off the beginning of the string (i.e., negative OFFSET).
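
Both behaviors at the end of the string can be seen in a brief sketch
(the one-byte starting string is illustrative): reading past the end
yields 0 without changing the string, while writing past the end pads
it with zero bytes first.

    my $s = "A";

    my $beyond = vec($s, 5, 8);     # 0: element 5 is past the end
    my $before = length $s;         # still 1; reading does not extend

    vec($s, 3, 8) = 0x42;           # writing extends the string
    my $after  = length $s;         # now 4: "A", two zero bytes, "B"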

If the string happens to be encoded as UTF-8 internally (and thus has
the UTF8 flag set), L<C<vec>|/vec EXPR,OFFSET,BITS> tries to convert it
to use a one-byte-per-character internal representation. However, if the
string contains characters with values of 256 or higher, that conversion
will fail. In that situation, C<vec> will operate on the underlying buffer
regardless, in its internal UTF-8 representation.

Strings created with L<C<vec>|/vec EXPR,OFFSET,BITS> can also be
manipulated with the logical
operators C<|>, C<&>, C<^>, and C<~>.  These operators will assume a bit
vector operation is desired when both operands are strings.
See L<perlop/"Bitwise String Operators">.
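
As a sketch of set operations on bit vectors, assuming the C<bitwise>
feature is I<not> in effect (under that feature C<|> and C<&> always
treat their operands as numbers), two small vectors can be combined
like this:

    my ($x, $y) = ('', '');
    vec($x, $_, 1) = 1 for 1, 3;        # bits 1 and 3
    vec($y, $_, 1) = 1 for 3, 4;        # bits 3 and 4

    my $union  = $x | $y;               # bits 1, 3 and 4
    my $common = $x & $y;               # bit 3 only

    print unpack("b*", $union),  "\n";  # 01011000
    print unpack("b*", $common), "\n";  # 00010000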

The following code will build up an ASCII string saying C<'PerlPerlPerl'>.
The comments show the string after each step.  Note that this code works
in the same way on big-endian or little-endian machines.

    my $foo = '';
    vec($foo,  0, 32) = 0x5065726C; # 'Perl'

    # $foo eq "Perl" eq "\x50\x65\x72\x6C", 32 bits
    print vec($foo, 0, 8);  # prints 80 == 0x50 == ord('P')

    vec($foo,  2, 16) = 0x5065; # 'PerlPe'
    vec($foo,  3, 16) = 0x726C; # 'PerlPerl'
    vec($foo,  8,  8) = 0x50;   # 'PerlPerlP'
    vec($foo,  9,  8) = 0x65;   # 'PerlPerlPe'
    vec($foo, 20,  4) = 2;      # 'PerlPerlPe'   . "\x02"
    vec($foo, 21,  4) = 7;      # 'PerlPerlPer'
                                   # 'r' is "\x72"
    vec($foo, 45,  2) = 3;      # 'PerlPerlPer'  . "\x0c"
    vec($foo, 93,  1) = 1;      # 'PerlPerlPer'  . "\x2c"
    vec($foo, 94,  1) = 1;      # 'PerlPerlPerl'
                                   # 'l' is "\x6c"

To transform a bit vector into a string or list of 0's and 1's, use these:

    my $bits = unpack("b*", $vector);
    my @bits = split(//, unpack("b*", $vector));

If you know the exact length in bits, it can be used in place of the C<*>.

Here is an example to illustrate how the bits actually fall in place:

  #!/usr/bin/perl -wl

  print <<'EOT';
                                    0         1         2         3
                     unpack("V",$_) 01234567890123456789012345678901
  ------------------------------------------------------------------
  EOT

  for $w (0..3) {
      $width = 2**$w;
      for ($shift=0; $shift < $width; ++$shift) {
          for ($off=0; $off < 32/$width; ++$off) {
              $str = pack("B*", "0"x32);
              $bits = (1<<$shift);
              vec($str, $off, $width) = $bits;
              $res = unpack("b*",$str);
              $val = unpack("V", $str);
              write;
          }
      }
  }

  format STDOUT =
  vec($_,@#,@#) = @<< == @######### @>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
  $off, $width, $bits, $val, $res
  .
  __END__

Regardless of the machine architecture on which it runs, the
example above should print the following table:

                                    0         1         2         3
                     unpack("V",$_) 01234567890123456789012345678901
  ------------------------------------------------------------------
  vec($_, 0, 1) = 1   ==          1 10000000000000000000000000000000
  vec($_, 1, 1) = 1   ==          2 01000000000000000000000000000000
  vec($_, 2, 1) = 1   ==          4 00100000000000000000000000000000
  vec($_, 3, 1) = 1   ==          8 00010000000000000000000000000000
  vec($_, 4, 1) = 1   ==         16 00001000000000000000000000000000
  vec($_, 5, 1) = 1   ==         32 00000100000000000000000000000000
  vec($_, 6, 1) = 1   ==         64 00000010000000000000000000000000
  vec($_, 7, 1) = 1   ==        128 00000001000000000000000000000000
  vec($_, 8, 1) = 1   ==        256 00000000100000000000000000000000
  vec($_, 9, 1) = 1   ==        512 00000000010000000000000000000000
  vec($_,10, 1) = 1   ==       1024 00000000001000000000000000000000
  vec($_,11, 1) = 1   ==       2048 00000000000100000000000000000000
  vec($_,12, 1) = 1   ==       4096 00000000000010000000000000000000
  vec($_,13, 1) = 1   ==       8192 00000000000001000000000000000000
  vec($_,14, 1) = 1   ==      16384 00000000000000100000000000000000
  vec($_,15, 1) = 1   ==      32768 00000000000000010000000000000000
  vec($_,16, 1) = 1   ==      65536 00000000000000001000000000000000
  vec($_,17, 1) = 1   ==     131072 00000000000000000100000000000000
  vec($_,18, 1) = 1   ==     262144 00000000000000000010000000000000
  vec($_,19, 1) = 1   ==     524288 00000000000000000001000000000000
  vec($_,20, 1) = 1   ==    1048576 00000000000000000000100000000000
  vec($_,21, 1) = 1   ==    2097152 00000000000000000000010000000000
  vec($_,22, 1) = 1   ==    4194304 00000000000000000000001000000000
  vec($_,23, 1) = 1   ==    8388608 00000000000000000000000100000000
  vec($_,24, 1) = 1   ==   16777216 00000000000000000000000010000000
  vec($_,25, 1) = 1   ==   33554432 00000000000000000000000001000000
  vec($_,26, 1) = 1   ==   67108864 00000000000000000000000000100000
  vec($_,27, 1) = 1   ==  134217728 00000000000000000000000000010000
  vec($_,28, 1) = 1   ==  268435456 00000000000000000000000000001000
  vec($_,29, 1) = 1   ==  536870912 00000000000000000000000000000100
  vec($_,30, 1) = 1   == 1073741824 00000000000000000000000000000010
  vec($_,31, 1) = 1   == 2147483648 00000000000000000000000000000001
  vec($_, 0, 2) = 1   ==          1 10000000000000000000000000000000
  vec($_, 1, 2) = 1   ==          4 00100000000000000000000000000000
  vec($_, 2, 2) = 1   ==         16 00001000000000000000000000000000
  vec($_, 3, 2) = 1   ==         64 00000010000000000000000000000000
  vec($_, 4, 2) = 1   ==        256 00000000100000000000000000000000
  vec($_, 5, 2) = 1   ==       1024 00000000001000000000000000000000
  vec($_, 6, 2) = 1   ==       4096 00000000000010000000000000000000
  vec($_, 7, 2) = 1   ==      16384 00000000000000100000000000000000
  vec($_, 8, 2) = 1   ==      65536 00000000000000001000000000000000
  vec($_, 9, 2) = 1   ==     262144 00000000000000000010000000000000
  vec($_,10, 2) = 1   ==    1048576 00000000000000000000100000000000
  vec($_,11, 2) = 1   ==    4194304 00000000000000000000001000000000
  vec($_,12, 2) = 1   ==   16777216 00000000000000000000000010000000
  vec($_,13, 2) = 1   ==   67108864 00000000000000000000000000100000
  vec($_,14, 2) = 1   ==  268435456 00000000000000000000000000001000
  vec($_,15, 2) = 1   == 1073741824 00000000000000000000000000000010
  vec($_, 0, 2) = 2   ==          2 01000000000000000000000000000000
  vec($_, 1, 2) = 2   ==          8 00010000000000000000000000000000
  vec($_, 2, 2) = 2   ==         32 00000100000000000000000000000000
  vec($_, 3, 2) = 2   ==        128 00000001000000000000000000000000
  vec($_, 4, 2) = 2   ==        512 00000000010000000000000000000000
  vec($_, 5, 2) = 2   ==       2048 00000000000100000000000000000000
  vec($_, 6, 2) = 2   ==       8192 00000000000001000000000000000000
  vec($_, 7, 2) = 2   ==      32768 00000000000000010000000000000000
  vec($_, 8, 2) = 2   ==     131072 00000000000000000100000000000000
  vec($_, 9, 2) = 2   ==     524288 00000000000000000001000000000000
  vec($_,10, 2) = 2   ==    2097152 00000000000000000000010000000000
  vec($_,11, 2) = 2   ==    8388608 00000000000000000000000100000000
  vec($_,12, 2) = 2   ==   33554432 00000000000000000000000001000000
  vec($_,13, 2) = 2   ==  134217728 00000000000000000000000000010000
  vec($_,14, 2) = 2   ==  536870912 00000000000000000000000000000100
  vec($_,15, 2) = 2   == 2147483648 00000000000000000000000000000001
  vec($_, 0, 4) = 1   ==          1 10000000000000000000000000000000
  vec($_, 1, 4) = 1   ==         16 00001000000000000000000000000000
  vec($_, 2, 4) = 1   ==        256 00000000100000000000000000000000
  vec($_, 3, 4) = 1   ==       4096 00000000000010000000000000000000
  vec($_, 4, 4) = 1   ==      65536 00000000000000001000000000000000
  vec($_, 5, 4) = 1   ==    1048576 00000000000000000000100000000000
  vec($_, 6, 4) = 1   ==   16777216 00000000000000000000000010000000
  vec($_, 7, 4) = 1   ==  268435456 00000000000000000000000000001000
  vec($_, 0, 4) = 2   ==          2 01000000000000000000000000000000
  vec($_, 1, 4) = 2   ==         32 00000100000000000000000000000000
  vec($_, 2, 4) = 2   ==        512 00000000010000000000000000000000
  vec($_, 3, 4) = 2   ==       8192 00000000000001000000000000000000
  vec($_, 4, 4) = 2   ==     131072 00000000000000000100000000000000
  vec($_, 5, 4) = 2   ==    2097152 00000000000000000000010000000000
  vec($_, 6, 4) = 2   ==   33554432 00000000000000000000000001000000
  vec($_, 7, 4) = 2   ==  536870912 00000000000000000000000000000100
  vec($_, 0, 4) = 4   ==          4 00100000000000000000000000000000
  vec($_, 1, 4) = 4   ==         64 00000010000000000000000000000000
  vec($_, 2, 4) = 4   ==       1024 00000000001000000000000000000000
  vec($_, 3, 4) = 4   ==      16384 00000000000000100000000000000000
  vec($_, 4, 4) = 4   ==     262144 00000000000000000010000000000000
  vec($_, 5, 4) = 4   ==    4194304 00000000000000000000001000000000
  vec($_, 6, 4) = 4   ==   67108864 00000000000000000000000000100000
  vec($_, 7, 4) = 4   == 1073741824 00000000000000000000000000000010
  vec($_, 0, 4) = 8   ==          8 00010000000000000000000000000000
  vec($_, 1, 4) = 8   ==        128 00000001000000000000000000000000
  vec($_, 2, 4) = 8   ==       2048 00000000000100000000000000000000
  vec($_, 3, 4) = 8   ==      32768 00000000000000010000000000000000
  vec($_, 4, 4) = 8   ==     524288 00000000000000000001000000000000
  vec($_, 5, 4) = 8   ==    8388608 00000000000000000000000100000000
  vec($_, 6, 4) = 8   ==  134217728 00000000000000000000000000010000
  vec($_, 7, 4) = 8   == 2147483648 00000000000000000000000000000001
  vec($_, 0, 8) = 1   ==          1 10000000000000000000000000000000
  vec($_, 1, 8) = 1   ==        256 00000000100000000000000000000000
  vec($_, 2, 8) = 1   ==      65536 00000000000000001000000000000000
  vec($_, 3, 8) = 1   ==   16777216 00000000000000000000000010000000
  vec($_, 0, 8) = 2   ==          2 01000000000000000000000000000000
  vec($_, 1, 8) = 2   ==        512 00000000010000000000000000000000
  vec($_, 2, 8) = 2   ==     131072 00000000000000000100000000000000
  vec($_, 3, 8) = 2   ==   33554432 00000000000000000000000001000000
  vec($_, 0, 8) = 4   ==          4 00100000000000000000000000000000
  vec($_, 1, 8) = 4   ==       1024 00000000001000000000000000000000
  vec($_, 2, 8) = 4   ==     262144 00000000000000000010000000000000
  vec($_, 3, 8) = 4   ==   67108864 00000000000000000000000000100000
  vec($_, 0, 8) = 8   ==          8 00010000000000000000000000000000
  vec($_, 1, 8) = 8   ==       2048 00000000000100000000000000000000
  vec($_, 2, 8) = 8   ==     524288 00000000000000000001000000000000
  vec($_, 3, 8) = 8   ==  134217728 00000000000000000000000000010000
  vec($_, 0, 8) = 16  ==         16 00001000000000000000000000000000
  vec($_, 1, 8) = 16  ==       4096 00000000000010000000000000000000
  vec($_, 2, 8) = 16  ==    1048576 00000000000000000000100000000000
  vec($_, 3, 8) = 16  ==  268435456 00000000000000000000000000001000
  vec($_, 0, 8) = 32  ==         32 00000100000000000000000000000000
  vec($_, 1, 8) = 32  ==       8192 00000000000001000000000000000000
  vec($_, 2, 8) = 32  ==    2097152 00000000000000000000010000000000
  vec($_, 3, 8) = 32  ==  536870912 00000000000000000000000000000100
  vec($_, 0, 8) = 64  ==         64 00000010000000000000000000000000
  vec($_, 1, 8) = 64  ==      16384 00000000000000100000000000000000
  vec($_, 2, 8) = 64  ==    4194304 00000000000000000000001000000000
  vec($_, 3, 8) = 64  == 1073741824 00000000000000000000000000000010
  vec($_, 0, 8) = 128 ==        128 00000001000000000000000000000000
  vec($_, 1, 8) = 128 ==      32768 00000000000000010000000000000000
  vec($_, 2, 8) = 128 ==    8388608 00000000000000000000000100000000
  vec($_, 3, 8) = 128 == 2147483648 00000000000000000000000000000001

=item wait
X<wait>

=for Pod::Functions wait for any child process to die

Behaves like L<wait(2)> on your system: it waits for a child
process to terminate and returns the pid of the deceased process, or
C<-1> if there are no child processes.  The status is returned in
L<C<$?>|perlvar/$?> and
L<C<${^CHILD_ERROR_NATIVE}>|perlvar/${^CHILD_ERROR_NATIVE}>.
Note that a return value of C<-1> could mean that child processes are
being automatically reaped, as described in L<perlipc>.
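
A minimal sketch of the usual pattern, assuming a platform with a
working L<C<fork>|/fork> and no C<$SIG{CHLD}> handler that reaps
children automatically (the exit status 3 is arbitrary):

    my $pid = fork() // die "fork failed: $!";
    if ($pid == 0) {
        exit 3;                     # child: terminate with status 3
    }

    my $reaped = wait;              # parent: block until the child exits
    my $status = $? >> 8;           # the high byte holds the exit status

    print "child $reaped exited with status $status\n";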

If you use L<C<wait>|/wait> in your handler for
L<C<$SIG{CHLD}>|perlvar/%SIG>, it may accidentally wait for the child
created by L<C<qx>|/qxE<sol>STRINGE<sol>> or L<C<system>|/system LIST>.
See L<perlipc> for details.

Portability issues: L<perlport/wait>.

=item waitpid PID,FLAGS
X<waitpid>

=for Pod::Functions wait for a particular child process to die

Waits for a particular child process to terminate and returns the pid of
the deceased process, or C<-1> if there is no such child process.  A
non-blocking wait (with L<WNOHANG|POSIX/C<WNOHANG>> in FLAGS) can return 0 if
there are child processes matching PID but none have terminated yet.
The status is returned in L<C<$?>|perlvar/$?> and
L<C<${^CHILD_ERROR_NATIVE}>|perlvar/${^CHILD_ERROR_NATIVE}>.

A PID of C<0> waits for any child process whose process group ID is
equal to that of the current process.  A PID of less than C<-1> waits
for any child process whose process group ID is equal to -PID.  A PID of
C<-1> waits for any child process.
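
Waiting for one particular child with FLAGS of C<0> might look like
the following sketch, which assumes a platform with a working
L<C<fork>|/fork>; the two exit codes are arbitrary:

    my @kids;
    for my $code (1, 2) {
        my $pid = fork() // die "fork failed: $!";
        exit $code if $pid == 0;    # child: exit with its own code
        push @kids, $pid;           # parent: remember the pid
    }

    my $got    = waitpid($kids[1], 0);  # block for the second child only
    my $status = $? >> 8;               # its exit status, here 2

    waitpid($kids[0], 0);               # reap the first child as well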

If you say

    use POSIX ":sys_wait_h";

    my $kid;
    do {
        $kid = waitpid(-1, WNOHANG);
    } while $kid > 0;

or

    1 while waitpid(-1, WNOHANG) > 0;

then you can do a non-blocking wait for all pending zombie processes (see
L<POSIX/WAIT>).
Non-blocking wait is available on machines supporting either the
L<waitpid(2)> or L<wait4(2)> syscalls.  However, waiting for a particular
pid with FLAGS of C<0> is implemented everywhere.  (Perl emulates the
system call by remembering the status values of processes that have
exited but have not been harvested by the Perl script yet.)

Note that on some systems, a return value of C<-1> could mean that child
processes are being automatically reaped.  See L<perlipc> for details,
and for other examples.

Portability issues: L<perlport/waitpid>.

=item wantarray
X<wantarray> X<context>

=for Pod::Functions get void vs scalar vs list context of current subroutine call

Returns true if the context of the currently executing subroutine or
L<C<eval>|/eval EXPR> is looking for a list value.  Returns false if the
context is
looking for a scalar.  Returns the undefined value if the context is
looking for no value (void context).

    return unless defined wantarray; # don't bother doing more
    my @a = complex_calculation();
    return wantarray ? @a : "@a";
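
All three results can be observed with a small self-contained sketch
(the subroutine name C<context> and the variable C<$seen> are
illustrative):

    my $seen;
    sub context {
        $seen = wantarray          ? 'list'
              : defined(wantarray) ? 'scalar'
              :                      'void';
        return $seen;
    }

    my @list   = context();     # list context
    my $scalar = context();     # scalar context
    context();                  # void context; the result is in $seen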

L<C<wantarray>|/wantarray>'s result is unspecified in the top level of a file,
in a C<BEGIN>, C<UNITCHECK>, C<CHECK>, C<INIT> or C<END> block, or
in a C<DESTROY> method.

This function should have been named wantlist() instead.

=item warn LIST
X<warn> X<warning> X<STDERR>

=for Pod::Functions print debugging info

Prints the value of LIST to STDERR.  If the last element of LIST does
not end in a newline, it appends the same file/line number text as
L<C<die>|/die LIST> does.

If the output is empty and L<C<$@>|perlvar/$@> already contains a value
(typically from a previous eval), that value is used after appending
C<"\t...caught"> to L<C<$@>|perlvar/$@>.  This is useful for staying
almost, but not entirely, similar to L<C<die>|/die LIST>.

If L<C<$@>|perlvar/$@> is empty, then the string
C<"Warning: Something's wrong"> is used.

No message is printed if there is a L<C<$SIG{__WARN__}>|perlvar/%SIG>
handler
installed.  It is the handler's responsibility to deal with the message
as it sees fit (like, for instance, converting it into a
L<C<die>|/die LIST>).  Most
handlers must therefore arrange to actually display the
warnings that they are not prepared to deal with, by calling
L<C<warn>|/warn LIST>
again in the handler.  Note that this is quite safe and will not
produce an endless loop, since C<__WARN__> hooks are not called from
inside one.
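
A handler can also collect messages instead of displaying them.  This
sketch (the array name C<@caught> is illustrative) installs a
localized handler and shows that the file and line text is already
part of the message the handler receives:

    my @caught;
    {
        local $SIG{__WARN__} = sub { push @caught, $_[0] };
        warn "first\n";     # ends in a newline: passed through unchanged
        warn "second";      # no newline: " at FILE line N.\n" is appended
    }

    print "captured: $_" for @caught;

Because the handler is installed with C<local>, the previous behavior
is restored when the enclosing block is left.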

You will find this behavior is slightly different from that of
L<C<$SIG{__DIE__}>|perlvar/%SIG> handlers (which don't suppress the
error text, but can instead call L<C<die>|/die LIST> again to change
it).

Using a C<__WARN__> handler provides a powerful way to silence all
warnings (even the so-called mandatory ones).  An example:

    # wipe out *all* compile-time warnings
    BEGIN { $SIG{'__WARN__'} = sub { warn $_[0] if $DOWARN } }
    my $foo = 10;
    my $foo = 20;          # no warning about duplicate my $foo,
                           # but hey, you asked for it!
    # no compile-time or run-time warnings before here
    $DOWARN = 1;

    # run-time warnings enabled after here
    warn "\$foo is alive and $foo!";     # does show up

See L<perlvar> for details on setting L<C<%SIG>|perlvar/%SIG> entries
and for more
examples.  See the L<Carp> module for other kinds of warnings using its
C<carp> and C<cluck> functions.

=item write FILEHANDLE
X<write>

=item write EXPR

=item write

=for Pod::Functions print a picture record

Writes a formatted record (possibly multi-line) to the specified FILEHANDLE,
using the format associated with that file.  By default the format for
a file is the one having the same name as the filehandle, but the
format for the current output channel (see the
L<C<select>|/select FILEHANDLE> function) may be set explicitly by
assigning the name of the format to the L<C<$~>|perlvar/$~> variable.

Top of form processing is handled automatically:  if there is insufficient
room on the current page for the formatted record, the page is advanced by
writing a form feed and a special top-of-page
format is used to format the new
page header before the record is written.  By default, the top-of-page
format is the name of the filehandle with C<_TOP> appended, or C<top>
in the current package if the former does not exist.  This would be a
problem with autovivified filehandles, but it may be dynamically set to the
format of your choice by assigning the name to the L<C<$^>|perlvar/$^>
variable while that filehandle is selected.  The number of lines
remaining on the current page is in variable L<C<$->|perlvar/$->, which
can be set to C<0> to force a new page.

If FILEHANDLE is unspecified, output goes to the current default output
channel, which starts out as STDOUT but may be changed by the
L<C<select>|/select FILEHANDLE> operator.  If the FILEHANDLE is an EXPR,
then the expression
is evaluated and the resulting string is used to look up the name of
the FILEHANDLE at run time.  For more on formats, see L<perlform>.

Note that write is I<not> the opposite of
L<C<read>|/read FILEHANDLE,SCALAR,LENGTH,OFFSET>.  Unfortunately.

=item y///

=for Pod::Functions transliterate a string

The transliteration operator.  Same as
L<C<trE<sol>E<sol>E<sol>>|/trE<sol>E<sol>E<sol>>.  See
L<perlop/"Quote-Like Operators">.

=back

=head2 Non-function Keywords by Cross-reference

=head3 perldata

=over

=item __DATA__

=item __END__

These keywords are documented in L<perldata/"Special Literals">.

=back

=head3 perlmod

=over

=item BEGIN

=item CHECK

=item END

=item INIT

=item UNITCHECK

These compile phase keywords are documented in L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END">.

=back

=head3 perlobj

=over

=item DESTROY

This method keyword is documented in L<perlobj/"Destructors">.

=back

=head3 perlop

=over

=item and

=item cmp

=item eq

=item ge

=item gt

=item le

=item lt

=item ne

=item not

=item or

=item x

=item xor

These operators are documented in L<perlop>.

=back

=head3 perlsub

=over

=item AUTOLOAD

This keyword is documented in L<perlsub/"Autoloading">.

=back

=head3 perlsyn

=over

=item else

=item elsif

=item for

=item foreach

=item if

=item unless

=item until

=item while

These flow-control keywords are documented in L<perlsyn/"Compound Statements">.

=item elseif

The "else if" keyword is spelled C<elsif> in Perl.  There's no C<elif>
or C<else if> either.  It does parse C<elseif>, but only to warn you
about not using it.

See the documentation for flow-control keywords in L<perlsyn/"Compound
Statements">.
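
For illustration only, a three-way branch written with the correct
spelling (the subroutine name C<sign> is hypothetical):

    sub sign {
        my ($n) = @_;
        if    ($n < 0)  { return 'negative' }
        elsif ($n == 0) { return 'zero' }
        else            { return 'positive' }
    }

    print sign(-5), " ", sign(0), " ", sign(7), "\n";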

=back

=over

=item default

=item given

=item when

These flow-control keywords related to the experimental switch feature are
documented in L<perlsyn/"Switch Statements">.

=back

=cut
perlopentut.pod000064400000022357150344123510007640 0ustar00=encoding utf8

=head1 NAME

perlopentut - simple recipes for opening files and pipes in Perl

=head1 DESCRIPTION

Whenever you do I/O on a file in Perl, you do so through what in Perl is
called a B<filehandle>.  A filehandle is an internal name for an external
file.  It is the job of the C<open> function to make the association
between the internal name and the external name, and it is the job
of the C<close> function to break that association.

For your convenience, Perl sets up a few special filehandles that are
already open when you run.  These include C<STDIN>, C<STDOUT>, C<STDERR>,
and C<ARGV>.  Since those are pre-opened, you can use them right away
without having to go to the trouble of opening them yourself:

    print STDERR "This is a debugging message.\n";

    print STDOUT "Please enter something: ";
    $response = <STDIN> // die "how come no input?";
    print STDOUT "Thank you!\n";

    while (<ARGV>) { ... }

As you see from those examples, C<STDOUT> and C<STDERR> are output
handles, and C<STDIN> and C<ARGV> are input handles.  They are
in all capital letters because they are reserved to Perl, much
like the C<@ARGV> array and the C<%ENV> hash are.  Their external
associations were set up by your shell.

You will need to open every other filehandle on your own. Although there
are many variants, the most common way to call Perl's open() function
is with three arguments and one return value:

C<    I<OK> = open(I<HANDLE>, I<MODE>, I<PATHNAME>)>

Where:

=over

=item I<OK>

will be some defined value if the open succeeds, but
C<undef> if it fails;

=item I<HANDLE>

should be an undefined scalar variable to be filled in by the
C<open> function if it succeeds;

=item I<MODE>

is the access mode and the encoding format to open the file with;

=item I<PATHNAME>

is the external name of the file you want opened.

=back

Most of the complexity of the C<open> function lies in the many
possible values that the I<MODE> parameter can take on.

One last thing before we show you how to open files: opening
files does not (usually) automatically lock them in Perl.  See
L<perlfaq5> for how to lock.
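
As a minimal sketch of the locking mentioned above (not a substitute for
the discussion in L<perlfaq5>), advisory locking with C<flock> could look
like this.  The scratch file is created only for this illustration, and an
advisory lock helps only when every cooperating program asks for it:

```perl

    use strict;
    use warnings;
    use Fcntl qw(:flock);
    use File::Temp qw(tempfile);

    # Hypothetical scratch file, created only for this illustration.
    my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
    close($tmp_fh) || die "$0: can't close $filename: $!";

    open(my $handle, ">> :encoding(UTF-8)", $filename)
        || die "$0: can't open $filename for appending: $!";

    # Block until we hold an exclusive advisory lock on the file.
    flock($handle, LOCK_EX)
        || die "$0: can't lock $filename: $!";

    print {$handle} "one locked line\n"
        or die "$0: can't write to $filename: $!";

    # Closing the handle flushes the data and releases the lock.
    close($handle) || die "$0: can't close $filename: $!";

```

The lock is taken after the open, so opening with C<< ">" >> would have
clobbered the file before the lock was acquired; append mode avoids that.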

=head1 Opening Text Files

=head2 Opening Text Files for Reading

If you want to read from a text file, first open it in
read-only mode like this:

    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";
    my $handle   = undef;     # this will be filled in on success

    open($handle, "< $encoding", $filename)
        || die "$0: can't open $filename for reading: $!";

As with the shell, in Perl the C<< "<" >> is used to open the file in
read-only mode.  If it succeeds, Perl allocates a brand new filehandle for
you and fills in your previously undefined C<$handle> argument with a
reference to that handle.

Now you may use functions like C<readline>, C<read>, C<getc>, and
C<sysread> on that handle.  Probably the most common input function
is the one that looks like an operator:

    $line = readline($handle);
    $line = <$handle>;          # same thing

Because the C<readline> function returns C<undef> at end of file or
upon error, you will sometimes see it used this way:

    $line = <$handle>;
    if (defined $line) {
        # do something with $line
    }
    else {
        # $line is not valid, so skip it
    }

You can also just quickly C<die> on an undefined value this way:

    $line = <$handle> // die "no input found";

However, if hitting EOF is an expected and normal event, you do not want to
exit simply because you have run out of input.  Instead, you probably just want
to exit an input loop.  You can then test to see if an actual error has caused
the loop to terminate, and act accordingly:

    while (<$handle>) {
        # do something with data in $_
    }
    if ($!) {
        die "unexpected error while reading from $filename: $!";
    }

B<A Note on Encodings>: Having to specify the text encoding every time
might seem a bit of a bother.  To set up a default encoding for C<open> so
that you don't have to supply it each time, you can use the C<open> pragma:

    use open qw< :encoding(UTF-8) >;

Once you've done that, you can safely omit the encoding part of the
open mode:

    open($handle, "<", $filename)
        || die "$0: can't open $filename for reading: $!";

But never use the bare C<< "<" >> without having set up a default encoding
first.  Otherwise, Perl cannot know which of the many, many, many possible
flavors of text file you have, and Perl will have no idea how to correctly
map the data in your file into actual characters it can work with.  Other
common encoding formats include C<"ASCII">, C<"ISO-8859-1">,
C<"ISO-8859-15">, C<"Windows-1252">, C<"MacRoman">, and even C<"UTF-16LE">.
See L<perlunitut> for more about encodings.
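
To make the point about encodings concrete, here is a small sketch.  It
writes one accented word to a scratch file (created only for this
illustration) through a UTF-8 layer, then reads it back twice: once
decoded as text, and once as raw bytes.

```perl

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical scratch file, created only for this illustration.
    my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
    close($tmp_fh) || die "$0: can't close $filename: $!";

    my $word = "caf\x{E9}";     # four characters; the last is e-acute

    open(my $out, "> :encoding(UTF-8)", $filename)
        || die "$0: can't open $filename for writing: $!";
    print {$out} $word
        or die "$0: can't write to $filename: $!";
    close($out) || die "$0: can't close $filename: $!";

    # Read it back as text: the layer decodes bytes into characters.
    open(my $text_in, "< :encoding(UTF-8)", $filename)
        || die "$0: can't open $filename for reading: $!";
    my $chars = <$text_in>;
    close($text_in) || die "$0: can't close $filename: $!";

    # Read it back as raw bytes: no decoding takes place.
    open(my $raw_in, "< :raw", $filename)
        || die "$0: can't open $filename for reading: $!";
    my $bytes = <$raw_in>;
    close($raw_in) || die "$0: can't close $filename: $!";

    printf "%d characters, %d bytes\n", length($chars), length($bytes);

```

The decoded string has four characters, while the raw read sees five
bytes, because the accented letter occupies two bytes in UTF-8.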

=head2 Opening Text Files for Writing

When you want to write to a file, you first have to decide what to do about
any existing contents of that file.  You have two basic choices here: to
preserve or to clobber.

If you want to preserve any existing contents, then you want to open the file
in append mode.  As in the shell, in Perl you use C<<< ">>" >>> to open an
existing file in append mode.  C<<< ">>" >>> creates the file if it does not
already exist.

    my $handle   = undef;
    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";

    open($handle, ">> $encoding", $filename)
        || die "$0: can't open $filename for appending: $!";

Now you can write to that filehandle using any of C<print>, C<printf>,
C<say>, C<write>, or C<syswrite>.

As noted above, if the file does not already exist, then the append-mode open
will create it for you.  But if the file does already exist, its contents are
safe from harm because you will be adding your new text past the end of the
old text.

On the other hand, sometimes you want to clobber whatever might already be
there.  To empty out a file before you start writing to it, you can open it
in write-only mode:

    my $handle   = undef;
    my $filename = "/some/path/to/a/textfile/goes/here";
    my $encoding = ":encoding(UTF-8)";

    open($handle, "> $encoding", $filename)
        || die "$0: can't open $filename in write-only mode: $!";

Here again Perl works just like the shell in that the C<< ">" >> clobbers
an existing file.

As with the append mode, when you open a file in write-only mode,
you can now write to that filehandle using any of C<print>, C<printf>,
C<say>, C<write>, or C<syswrite>.

What about read-write mode?  You should probably pretend it doesn't exist,
because opening text files in read-write mode is unlikely to do what you
would like.  See L<perlfaq5> for details.
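
The difference between the append and write-only modes described in this
section can be seen in a short sketch.  The scratch file and the two
helper subroutines, C<write_line> and C<read_lines>, are invented for
this illustration only.

```perl

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical scratch file, created only for this illustration.
    my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
    close($tmp_fh) || die "$0: can't close $filename: $!";
    my $encoding = ":encoding(UTF-8)";

    # Helper for this sketch: open in the given mode and write one line.
    sub write_line {
        my ($mode, $text) = @_;
        open(my $handle, "$mode $encoding", $filename)
            || die "$0: can't open $filename in mode $mode: $!";
        print {$handle} "$text\n"
            or die "$0: can't write to $filename: $!";
        close($handle) || die "$0: can't close $filename: $!";
    }

    # Helper for this sketch: return the file's lines without newlines.
    sub read_lines {
        open(my $handle, "< $encoding", $filename)
            || die "$0: can't open $filename for reading: $!";
        my @lines = <$handle>;
        close($handle) || die "$0: can't close $filename: $!";
        chomp @lines;
        return @lines;
    }

    write_line(">",  "first");      # clobber: the file holds one line
    write_line(">>", "second");     # append: the old line is preserved
    my @after_append = read_lines();

    write_line(">",  "third");      # clobber again: old lines are gone
    my @after_clobber = read_lines();

    print "after append:  @after_append\n";
    print "after clobber: @after_clobber\n";

```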

=head1 Opening Binary Files

If the file to be opened contains binary data instead of text characters,
then the C<MODE> argument to C<open> is a little different.  Instead of
specifying the encoding, you tell Perl that your data are in raw bytes.

    my $filename = "/some/path/to/a/binary/file/goes/here";
    my $encoding = ":raw :bytes";
    my $handle   = undef;     # this will be filled in on success

And then open as before, choosing C<<< "<" >>>, C<<< ">>" >>>, or
C<<< ">" >>> as needed:

    open($handle, "< $encoding", $filename)
        || die "$0: can't open $filename for reading: $!";

    open($handle, ">> $encoding", $filename)
        || die "$0: can't open $filename for appending: $!";

    open($handle, "> $encoding", $filename)
        || die "$0: can't open $filename in write-only mode: $!";

Alternately, you can change to binary mode on an existing handle this way:

    binmode($handle)    || die "cannot binmode handle";

This is especially handy for the handles that Perl has already opened for you.

    binmode(STDIN)      || die "cannot binmode STDIN";
    binmode(STDOUT)     || die "cannot binmode STDOUT";

You can also pass C<binmode> an explicit encoding to change it on the fly.
This isn't exactly "binary" mode, but we still use C<binmode> to do it:

  binmode(STDIN,  ":encoding(MacRoman)") || die "cannot binmode STDIN";
  binmode(STDOUT, ":encoding(UTF-8)")    || die "cannot binmode STDOUT";

Once you have your binary file properly opened in the right mode, you can
use all the same Perl I/O functions as you used on text files.  However,
you may wish to use the fixed-size C<read> instead of the variable-sized
C<readline> for your input.

Here's an example of how to copy a binary file:

    my $BUFSIZ   = 64 * (2 ** 10);
    my $name_in  = "/some/input/file";
    my $name_out = "/some/output/file";

    my($in_fh, $out_fh, $buffer);

    open($in_fh,  "<", $name_in)
        || die "$0: cannot open $name_in for reading: $!";
    open($out_fh, ">", $name_out)
        || die "$0: cannot open $name_out for writing: $!";

    for my $fh ($in_fh, $out_fh)  {
        binmode($fh)               || die "binmode failed";
    }

    while (read($in_fh, $buffer, $BUFSIZ)) {
        unless (print $out_fh $buffer) {
            die "couldn't write to $name_out: $!";
        }
    }

    close($in_fh)       || die "couldn't close $name_in: $!";
    close($out_fh)      || die "couldn't close $name_out: $!";
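
One detail worth knowing about the loop above: C<read> returns 0 at end
of file but C<undef> on error, and a plain C<while (read(...))> stops in
both cases without telling them apart.  A variant that distinguishes the
two might look like the following sketch, in which the directory, file
names and file contents are made up for the illustration.

```perl

    use strict;
    use warnings;
    use File::Temp qw(tempdir);

    my $BUFSIZ   = 64 * (2 ** 10);
    my $dir      = tempdir(CLEANUP => 1);   # hypothetical scratch directory
    my $name_in  = "$dir/input.bin";
    my $name_out = "$dir/output.bin";

    # Create some binary input: every byte value from 0 to 255.
    my $payload = join "", map { chr($_) } 0 .. 255;
    open(my $make_fh, "> :raw", $name_in)
        || die "$0: cannot open $name_in for writing: $!";
    print {$make_fh} $payload
        or die "$0: couldn't write to $name_in: $!";
    close($make_fh) || die "$0: couldn't close $name_in: $!";

    open(my $in_fh, "< :raw", $name_in)
        || die "$0: cannot open $name_in for reading: $!";
    open(my $out_fh, "> :raw", $name_out)
        || die "$0: cannot open $name_out for writing: $!";

    my $buffer;
    while (1) {
        my $count = read($in_fh, $buffer, $BUFSIZ);
        die "$0: error reading $name_in: $!" unless defined $count;
        last if $count == 0;                # end of file
        print {$out_fh} $buffer
            or die "$0: couldn't write to $name_out: $!";
    }

    close($in_fh)  || die "$0: couldn't close $name_in: $!";
    close($out_fh) || die "$0: couldn't close $name_out: $!";

```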

=head1 Opening Pipes

To be announced.

=head1 Low-level File Opens via sysopen

To be announced.  Or deleted.

=head1 SEE ALSO

To be announced.

=head1 AUTHOR and COPYRIGHT

Copyright 2013 Tom Christiansen.

This documentation is free; you can redistribute it and/or modify it under
the same terms as Perl itself.

perl588delta.pod000064400000061270150344123510007475 0ustar00=encoding utf8

=head1 NAME

perl588delta - what is new for perl v5.8.8

=head1 DESCRIPTION

This document describes differences between the 5.8.7 release and
the 5.8.8 release.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.8.7. If any exist,
they are bugs and reports are welcome.

=head1 Core Enhancements

=over

=item *

C<chdir>, C<chmod> and C<chown> can now work on filehandles as well as
filenames, if the system supports respectively C<fchdir>, C<fchmod> and
C<fchown>, thanks to a patch provided by Gisle Aas.

=back
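
As a sketch of the enhancement above, the following changes a file's
permissions through an open handle instead of its name.  It assumes a
platform that provides C<fchmod>; the scratch file is created only for
this illustration.

```perl

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Hypothetical scratch file; tempfile returns an open handle to it.
    my ($handle, $filename) = tempfile(UNLINK => 1);

    # Change permissions via the handle rather than the file name.
    # This relies on the system supporting fchmod.
    chmod(0640, $handle)
        || die "$0: can't chmod $filename via its handle: $!";

    my $mode = (stat($handle))[2] & 07777;
    printf "%s now has mode %04o\n", $filename, $mode;

```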

=head1 Modules and Pragmata

=over

=item *

C<Attribute::Handlers> upgraded to version 0.78_02

=over

=item *

Documentation typo fix

=back

=item *

C<attrs> upgraded to version 1.02

=over

=item *

Internal cleanup only

=back

=item *

C<autouse> upgraded to version 1.05

=over

=item *

Simplified implementation

=back

=item *

C<B> upgraded to version 1.09_01

=over

=item *

The inheritance hierarchy of the C<B::> modules has been corrected;
C<B::NV> now inherits from C<B::SV> (instead of C<B::IV>).

=back

=item *

C<blib> upgraded to version 1.03

=over

=item *

Documentation typo fix

=back

=item *

C<ByteLoader> upgraded to version 0.06

=over

=item *

Internal cleanup

=back

=item *

C<CGI> upgraded to version 3.15

=over

=item *

Extraneous "?" from C<self_url()> removed

=item *

C<scrolling_list()> select attribute fixed

=item *

C<virtual_port> now works properly with the https protocol

=item *

C<upload_hook()> and C<append()> now work in function-oriented mode

=item *

C<POST_MAX> doesn't cause the client to hang any more

=item *

Automatic tab indexes are now disabled and new C<-tabindex> pragma has
been added to turn automatic indexes back on

=item *

C<end_form()> doesn't emit empty (and non-validating) C<< <div> >>

=item *

C<CGI::Carp> works better in certain mod_perl configurations

=item *

Setting C<$CGI::TMPDIRECTORY> is now effective

=item *

Enhanced documentation

=back

=item *

C<charnames> upgraded to version 1.05

=over

=item *

C<viacode()> now accepts hex strings and has been optimized.

=back

=item *

C<CPAN> upgraded to version 1.76_02

=over

=item *

1 minor bug fix for Win32

=back

=item *

C<Cwd> upgraded to version 3.12

=over

=item *

C<canonpath()> on Win32 now collapses F<foo\..> sections correctly.

=item *

Improved behaviour on Symbian OS.

=item *

Enhanced documentation and typo fixes

=item *

Internal cleanup

=back

=item *

C<Data::Dumper> upgraded to version 2.121_08

=over

=item *

A problem where C<Data::Dumper> would sometimes update the iterator state
of hashes has been fixed

=item *

Numeric labels now work

=item *

Internal cleanup

=back

=item *

C<DB> upgraded to version 1.01

=over

=item *

A problem where the state of the regexp engine would sometimes get clobbered when running
under the debugger has been fixed.

=back

=item *

C<DB_File> upgraded to version 1.814

=over

=item *

Adds support for Berkeley DB 4.4.

=back

=item *

C<Devel::DProf> upgraded to version 20050603.00

=over

=item *

Internal cleanup

=back

=item *

C<Devel::Peek> upgraded to version 1.03

=over

=item *

Internal cleanup

=back

=item *

C<Devel::PPPort> upgraded to version 3.06_01

=over

=item *

C<--compat-version> argument checking has been improved

=item *

Files passed on the command line are filtered by default

=item *

C<--nofilter> option to override the filtering has been added

=item *

Enhanced documentation

=back

=item *

C<diagnostics> upgraded to version 1.15

=over

=item *

Documentation typo fix

=back

=item *

C<Digest> upgraded to version 1.14

=over

=item *

The constructor now knows which module implements SHA-224

=item *

Documentation tweaks and typo fixes

=back

=item *

C<Digest::MD5> upgraded to version 2.36

=over

=item *

C<XSLoader> is now used for faster loading

=item *

Enhanced documentation including MD5 weaknesses discovered lately

=back

=item *

C<Dumpvalue> upgraded to version 1.12

=over

=item *

Documentation fix

=back

=item *

C<DynaLoader> upgraded but unfortunately we're not able to increment its version number :-(

=over

=item *

Implements C<dl_unload_file> on Win32

=item *

Internal cleanup

=item *

C<XSLoader> 0.06 incorporated; small optimisation for calling
C<bootstrap_inherit()> and documentation enhancements.

=back

=item *

C<Encode> upgraded to version 2.12

=over

=item *

A coderef is now acceptable for C<CHECK>!

=item *

3 new characters added to the ISO-8859-7 encoding

=item *

New encoding C<MIME-Header-ISO_2022_JP> added

=item *

Problem with partial characters and C<< encoding(utf-8-strict) >> fixed.

=item *

Documentation enhancements and typo fixes

=back

=item *

C<English> upgraded to version 1.02

=over

=item *

The C<< $COMPILING >> variable has been added

=back

=item *

C<ExtUtils::Constant> upgraded to version 0.17

=over

=item *

Improved compatibility with older versions of perl

=back

=item *

C<ExtUtils::MakeMaker> upgraded to version 6.30 (was 6.17)

=over

=item *

Too much to list here;  see L<http://search.cpan.org/dist/ExtUtils-MakeMaker/Changes>

=back

=item *

C<File::Basename> upgraded to version 2.74, with changes contributed by Michael Schwern.

=over

=item *

Documentation clarified and errors corrected.

=item *

C<basename> now strips trailing path separators before processing the name.

=item *

C<basename> now returns C</> for parameter C</>, to make C<basename>
consistent with the shell utility of the same name.

=item *

The suffix is no longer stripped if it is identical to the remaining characters
in the name, again for consistency with the shell utility.

=item *

Some internal code cleanup.

=back

=item *

C<File::Copy> upgraded to version 2.09

=over

=item *

Copying a file onto itself used to fail.

=item *

Moving a file between file systems now preserves the access and
modification time stamps

=back

=item *

C<File::Find> upgraded to version 1.10

=over

=item *

Win32 portability fixes

=item *

Enhanced documentation

=back

=item *

C<File::Glob> upgraded to version 1.05

=over

=item *

Internal cleanup

=back

=item *

C<File::Path> upgraded to version 1.08

=over

=item *

C<mkpath> now preserves C<errno> when C<mkdir> fails

=back

=item *

C<File::Spec> upgraded to version 3.12

=over

=item *

C<< File::Spec->rootdir() >> now returns C<\> on Win32, instead of C</>

=item *

C<$^O> could sometimes become tainted. This has been fixed.

=item *

C<canonpath> on Win32 now collapses C<foo/..> (or C<foo\..>) sections
correctly, rather than doing the "misguided" work it was previously doing.
Note that C<canonpath> on Unix still does B<not> collapse these sections, as
doing so would be incorrect.

=item *

Some documentation improvements

=item *

Some internal code cleanup

=back

=item *

C<FileCache> upgraded to version 1.06

=over

=item *

POD formatting errors in the documentation fixed

=back

=item *

C<Filter::Simple> upgraded to version 0.82

=item *

C<FindBin> upgraded to version 1.47

=over

=item *

Now works better with directories where access rights are more
restrictive than usual.

=back

=item *

C<GDBM_File> upgraded to version 1.08

=over

=item *

Internal cleanup

=back

=item *

C<Getopt::Long> upgraded to version 2.35

=over

=item *

C<prefix_pattern> has now been complemented by a new configuration
option C<long_prefix_pattern> that allows the user to specify what
prefix patterns should have long option style semantics applied.

=item *

Options can now take multiple values at once (experimental)

=item *

Various bug fixes

=back

=item *

C<if> upgraded to version 0.05

=over

=item *

Give more meaningful error messages from C<if> when invoked with a
condition in list context.

=item *

Restore backwards compatibility with earlier versions of perl

=back

=item *

C<IO> upgraded to version 1.22

=over

=item *

Enhanced documentation

=item *

Internal cleanup

=back

=item *

C<IPC::Open2> upgraded to version 1.02

=over

=item *

Enhanced documentation

=back

=item *

C<IPC::Open3> upgraded to version 1.02

=over

=item *

Enhanced documentation

=back

=item *

C<List::Util> upgraded to version 1.18 (was 1.14)

=over

=item *

Fix pure-perl version of C<refaddr> to avoid blessing an un-blessed reference

=item *

Use C<XSLoader> for faster loading

=item *

Fixed various memory leaks

=item *

Internal cleanup and portability fixes

=back

=item *

C<Math::Complex> upgraded to version 1.35

=over

=item *

C<atan2(0, i)> now works, as do all the (computable) complex argument cases

=item *

Fixes for certain bugs in C<make> and C<emake>

=item *

Support returning the I<k>th root directly

=item *

Support C<[2,-3pi/8]> in C<emake>

=item *

Support C<inf> for C<make>/C<emake>

=item *

Document C<make>/C<emake> more visibly

=back

=item *

C<Math::Trig> upgraded to version 1.03

=over

=item *

Add more great circle routines: C<great_circle_waypoint> and
C<great_circle_destination>

=back

=item *

C<MIME::Base64> upgraded to version 3.07

=over

=item *

Use C<XSLoader> for faster loading

=item *

Enhanced documentation

=item *

Internal cleanup

=back

=item *

C<NDBM_File> upgraded to version 1.06

=over

=item *

Enhanced documentation

=back

=item *

C<ODBM_File> upgraded to version 1.06

=over

=item *

Documentation typo fixed

=item *

Internal cleanup

=back

=item *

C<Opcode> upgraded to version 1.06

=over

=item *

Enhanced documentation

=item *

Internal cleanup

=back

=item *

C<open> upgraded to version 1.05

=over

=item *

Enhanced documentation

=back

=item *

C<overload> upgraded to version 1.04

=over

=item *

Enhanced documentation

=back

=item *

C<PerlIO> upgraded to version 1.04

=over

=item *

C<PerlIO::via> iterates over layers properly now

=item *

C<PerlIO::scalar> understands C<< $/ = "" >> now

=item *

C<encoding(utf-8-strict)> with partial characters now works

=item *

Enhanced documentation

=item *

Internal cleanup

=back

=item *

C<Pod::Functions> upgraded to version 1.03

=over

=item *

Documentation typos fixed

=back

=item *

C<Pod::Html> upgraded to version 1.0504

=over

=item *

HTML output will now correctly link
to C<=item>s on the same page, and should be valid XHTML.

=item *

Variable names are recognized as intended

=item *

Documentation typos fixed

=back

=item *

C<Pod::Parser> upgraded to version 1.32

=over

=item *

Allow files that start with C<=head> on the first line

=item *

Win32 portability fix

=item *

Exit status of C<pod2usage> fixed

=item *

New C<-noperldoc> switch for C<pod2usage>

=item *

Arbitrary URL schemes now allowed

=item *

Documentation typos fixed

=back

=item *

C<POSIX> upgraded to version 1.09

=over

=item *

Documentation typos fixed

=item *

Internal cleanup

=back

=item *

C<re> upgraded to version 0.05

=over

=item *

Documentation typo fixed

=back

=item *

C<Safe> upgraded to version 2.12

=over

=item *

Minor documentation enhancement

=back

=item *

C<SDBM_File> upgraded to version 1.05

=over

=item *

Documentation typo fixed

=item *

Internal cleanup

=back

=item *

C<Socket> upgraded to version 1.78

=over

=item *

Internal cleanup

=back

=item *

C<Storable> upgraded to version 2.15

=over

=item *

This includes the C<STORABLE_attach> hook functionality added by
Adam Kennedy, and more frugal memory requirements when storing under C<ithreads>, by
using the C<ithreads> cloning tracking code.

=back

=item *

C<Switch> upgraded to version 2.10_01

=over

=item *

Documentation typos fixed

=back

=item *

C<Sys::Syslog> upgraded to version 0.13

=over

=item *

Now provides numeric macros and meaningful C<Exporter> tags.

=item *

No longer uses C<Sys::Hostname> as it may provide useless values in
unconfigured network environments, so instead uses C<INADDR_LOOPBACK> directly.

=item *

C<syslog()> now uses local timestamp.

=item *

C<setlogmask()> now behaves like its C counterpart.

=item *

C<setlogsock()> will now C<croak()> as documented.

=item *

Improved error and warnings messages.

=item *

Improved documentation.

=back

=item *

C<Term::ANSIColor> upgraded to version 1.10

=over

=item *

Fixes a bug in C<colored> when C<$EACHLINE> is set that caused it to not color
lines consisting solely of 0 (literal zero).

=item *

Improved tests.

=back

=item *

C<Term::ReadLine> upgraded to version 1.02

=over

=item *

Documentation tweaks

=back

=item *

C<Test::Harness> upgraded to version 2.56 (was 2.48)

=over

=item *

The C<Test::Harness> timer is now off by default.

=item *

Now shows elapsed time in milliseconds.

=item *

Various bug fixes

=back

=item *

C<Test::Simple> upgraded to version 0.62 (was 0.54)

=over

=item *

C<is_deeply()> no longer fails to work for many cases

=item *

Various minor bug fixes

=item *

Documentation enhancements

=back

=item *

C<Text::Tabs> upgraded to version 2005.0824

=over

=item *

Provides a faster implementation of C<expand>

=back

=item *

C<Text::Wrap> upgraded to version 2005.082401

=over

=item *

Adds C<$Text::Wrap::separator2>, which allows you to preserve existing newlines
but add line-breaks with some other string.

=back

=item *

C<threads> upgraded to version 1.07

=over

=item *

C<threads> will now honour C<no warnings 'threads'>

=item *

A thread's interpreter is now freed after C<< $t->join() >> rather than after
C<undef $t>, which should fix some C<ithreads> memory leaks. (Fixed by Dave
Mitchell)

=item *

Some documentation typo fixes.

=back

=item *

C<threads::shared> upgraded to version 0.94

=over

=item *

Documentation changes only

=item *

Note: An improved implementation of C<threads::shared> is available on
CPAN - this will be merged into 5.8.9 if it proves stable.

=back

=item *

C<Tie::Hash> upgraded to version 1.02

=over

=item *

Documentation typo fixed

=back

=item *

C<Time::HiRes> upgraded to version 1.86 (was 1.66)

=over

=item *

C<clock_nanosleep()> and C<clock()> functions added

=item *

Support for the POSIX C<clock_gettime()> and C<clock_getres()> has been added

=item *

Return C<undef> or an empty list if the C C<gettimeofday()> function fails

=item *

Improved C<nanosleep> detection

=item *

Internal cleanup

=item *

Enhanced documentation

=back

=item *

C<Unicode::Collate> upgraded to version 0.52

=over

=item *

Now implements UCA Revision 14 (based on Unicode 4.1.0).

=item *

C<< Unicode::Collate->new >> method no longer overwrites user's C<$_>

=item *

Enhanced documentation

=back

=item *

C<Unicode::UCD> upgraded to version 0.24

=over

=item *

Documentation typos fixed

=back

=item *

C<User::grent> upgraded to version 1.01

=over

=item *

Documentation typo fixed

=back

=item *

C<utf8> upgraded to version 1.06

=over

=item *

Documentation typos fixed

=back

=item *

C<vmsish> upgraded to version 1.02

=over

=item *

Documentation typos fixed

=back

=item *

C<warnings> upgraded to version 1.05

=over

=item *

Gentler messing with C<Carp::> internals

=item *

Internal cleanup

=item *

Documentation update

=back

=item *

C<Win32> upgraded to version 0.2601

=for cynics And how many perl 5.8.x versions can I release ahead of Vista?

=over

=item *

Provides Windows Vista support to C<Win32::GetOSName>

=item *

Documentation enhancements

=back

=item *

C<XS::Typemap> upgraded to version 0.02

=over

=item *

Internal cleanup

=back

=back
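
As an illustration of one of the module changes listed above, the revised
C<basename> behaviour in C<File::Basename> can be checked with a short
script.  This is a sketch for a Unix-like system; path handling on other
platforms differs.

```perl

    use strict;
    use warnings;
    use File::Basename qw(basename);

    # Trailing path separators are stripped before the name is processed.
    my $stripped = basename("/usr/lib/");

    # The root directory is its own basename, as with the shell utility.
    my $root = basename("/");

    # A suffix identical to the whole remaining name is left in place.
    my $dotfile = basename(".txt", ".txt");

    print "$stripped $root $dotfile\n";

```

On a Unix-like system this should print C<lib / .txt>.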

=head1 Utility Changes

=head2 C<h2xs> enhancements

C<h2xs> implements new option C<--use-xsloader> to force use of
C<XSLoader> even in backwards compatible modules.

The handling of authors' names that had apostrophes has been fixed.

Any enums with negative values are now skipped.

=head2 C<perlivp> enhancements

C<perlivp> implements new option C<-a> and will not check for F<*.ph>
files by default any more.  Use the C<-a> option to run I<all> tests.

=head1 New Documentation

The L<perlglossary> manpage is a glossary of terms used in the Perl
documentation, technical and otherwise, kindly provided by O'Reilly Media,
Inc.

=head1 Performance Enhancements

=over 4

=item *

Weak reference creation is now I<O(1)> rather than I<O(n)>, courtesy of
Nicholas Clark. Weak reference deletion remains I<O(n)>, but if deletion only
happens at program exit, it may be skipped completely.

=item *

Salvador Fandiño provided improvements to reduce the memory usage of C<sort>
and to speed up some cases.

=item *

Jarkko Hietaniemi and Andy Lester worked to mark as much data as possible in
the C source files as C<static>, to increase the proportion of the executable
file that the operating system can share between processes, and thus reduce
real memory usage on multi-user systems.

=back

=head1 Installation and Configuration Improvements

Parallel makes should work properly now, although there may still be problems
if C<make test> is instructed to run in parallel.

Building with Borland's compilers on Win32 should work more smoothly. In
particular Steve Hay has worked to sidestep many warnings emitted by their
compilers and at least one C compiler internal error.

C<Configure> will now detect C<clearenv> and C<unsetenv>, thanks to a patch
from Alan Burlison. It will also probe for C<futimes> and whether C<sprintf>
correctly returns the length of the formatted string, which will both be used
in perl 5.8.9.

There are improved hints for next-3.0, vmesa, IX, Darwin, Solaris, Linux,
DEC/OSF, HP-UX and MPE/iX.

Perl extensions on Windows now can be statically built into the Perl DLL,
thanks to work by Vadim Konovalov. (This improvement was actually in 5.8.7,
but was accidentally omitted from L<perl587delta>.)

=head1 Selected Bug Fixes

=head2 no warnings 'category' works correctly with -w

Previously, when running with warnings enabled globally via C<-w>,
selectively disabling a specific warning category would erroneously turn
off all warnings.  This is now fixed: C<no warnings 'io';> turns off only
the warnings in the C<io> class.

This bug fix may cause some programs to start correctly issuing warnings.
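
The corrected behaviour can be sketched as follows.  Setting C<$^W> in a
C<BEGIN> block is used here as a stand-in for running under C<-w>, and
the warning handler exists only to collect what would otherwise go to
C<STDERR>; both are choices made for this illustration.

```perl

    use strict;

    # Stand-in for -w.  It must happen at compile time so that the
    # "no warnings" below sees it.  Deliberately no "use warnings" here.
    BEGIN { $^W = 1 }

    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    {
        no warnings 'io';

        # A warning in the "io" class (print on a closed handle):
        # this one is suppressed.
        open(my $fh, ">", \my $sink)
            || die "$0: can't open in-memory file: $!";
        close($fh) || die "$0: can't close in-memory file: $!";
        print {$fh} "ignored";

        # A warning outside the "io" class (uninitialized value):
        # this one is still reported.
        my $undefined;
        my $text = "value: " . $undefined;
    }

    print scalar(@warnings), " warning(s) captured\n";

```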

=head2 Remove over-optimisation

Perl 5.8.4 introduced a change so that assignments of C<undef> to a
scalar, or of an empty list to an array or a hash, were optimised away. As
this could cause problems when C<goto> jumps were involved, this change
has been backed out.

=head2 sprintf() fixes

Using the sprintf() function with some formats could lead to a buffer
overflow in some specific cases. This has been fixed, along with several
other bugs, notably in bounds checking.

In related fixes, it was possible for badly written code that did not follow
the documentation of C<Sys::Syslog> to have formatting vulnerabilities.
C<Sys::Syslog> has been changed to protect people from poor quality third
party code.

=head2 Debugger and Unicode slowdown

It had been reported that running under perl's debugger when processing
Unicode data could cause unexpectedly large slowdowns. The most likely cause
of this was identified and fixed by Nicholas Clark.

=head2 Smaller fixes

=over 4

=item *

C<FindBin> now works better with directories where access rights are more
restrictive than usual.

=item *

Several memory leaks in ithreads were closed. An improved implementation of
C<threads::shared> is available on CPAN - this will be merged into 5.8.9 if
it proves stable.

=item *

Trailing spaces are now trimmed from C<$!> and C<$^E>.

=item *

Operations that require perl to read a process's list of groups, such as reads
of C<$(> and C<$)>, now dynamically allocate memory rather than using a
fixed-size array. The fixed-size array could cause C stack exhaustion on
systems configured to use large numbers of groups.

=item *

C<PerlIO::scalar> now works better with non-default C<$/> settings.

=item *

You can now use the C<x> operator to repeat a C<qw//> list. This used
to raise a syntax error.

=item *

The debugger now correctly traces execution in eval("")uated code that
contains #line directives.

=item *

The value of the C<open> pragma is no longer ignored for three-argument
opens.

=item *

The optimisation of C<for (reverse @a)> introduced in perl 5.8.6 could
misbehave when the array had undefined elements and was used in LVALUE
context. Dave Mitchell provided a fix.

=item *

Some case insensitive matches between UTF-8 encoded data and 8 bit regexps,
and vice versa, could give malformed character warnings. These have been
fixed by Dave Mitchell and Yves Orton.

=item *

C<lcfirst> and C<ucfirst> could corrupt the string for certain cases where
the length of the UTF-8 encoding of the string in lower case, upper case or title
case differed. This was fixed by Nicholas Clark.

=item *

Perl will now use the C library calls C<unsetenv> and C<clearenv> if present
to delete keys from C<%ENV> and delete C<%ENV> entirely, thanks to a patch
from Alan Burlison.

=back
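
Of the fixes above, the C<qw//> repetition is the easiest to try out.  A
minimal sketch, with a made-up list of words:

```perl

    use strict;
    use warnings;

    # In list context, x repeats a qw// list as a whole.
    my @pattern = qw(red green) x 3;

    print "@pattern\n";

```

The array ends up with six elements, alternating between the two words.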

=head1 New or Changed Diagnostics

=head2 Attempt to set length of freed array

This is a new warning, produced in situations such as this:

    $r = do {my @a; \$#a};
    $$r = 503;

=head2 Non-string passed as bitmask

This is a new warning, produced when a number has been passed as an argument
to select(), instead of a bitmask.

    # Wrong, will now warn
    $rin = fileno(STDIN);
    ($nfound,$timeleft) = select($rout=$rin, undef, undef, $timeout);
    
    # Should be
    $rin = '';
    vec($rin,fileno(STDIN),1) = 1;
    ($nfound,$timeleft) = select($rout=$rin, undef, undef, $timeout);

=head2 Search pattern not terminated or ternary operator parsed as search pattern

This syntax error indicates that the lexer couldn't find the final
delimiter of a C<?PATTERN?> construct. Mentioning the ternary operator in
this error message makes it easier to diagnose syntax errors.

=head1 Changed Internals

There has been a fair amount of refactoring of the C<C> source code, partly to
make it tidier and more maintainable. The resulting object code and the
C<perl> binary may well be smaller than 5.8.7, in particular due to a change
contributed by Dave Mitchell which reworked the warnings code to be
significantly smaller. Apart from being smaller and possibly faster, there
should be no user-detectable changes.

Andy Lester supplied many improvements to determine which function
parameters and local variables could actually be declared C<const> to the C
compiler. Steve Peters provided new C<*_set> macros and reworked the core to
use these rather than assigning to macros in LVALUE context.

Dave Mitchell improved the lexer debugging output under C<-DT>.

Nicholas Clark changed the string buffer allocation so that it is now rounded
up to the next multiple of 4 (or 8 on platforms with 64 bit pointers). This
should reduce the number of calls to C<realloc> without actually using any
extra memory.

The C<HV>'s array of C<HE*>s is now allocated at the correct (minimal) size,
thanks to another change by Nicholas Clark. Compile with
C<-DPERL_USE_LARGE_HV_ALLOC> to use the old, sloppier, default.

For XS or embedding debugging purposes, if perl is compiled with
C<-DDEBUG_LEAKING_SCALARS_FORK_DUMP> in addition to
C<-DDEBUG_LEAKING_SCALARS> then a child process is C<fork>ed just before
global destruction, which is used to display the values of any scalars
found to have leaked at the end of global destruction. Without this, the
scalars have already been freed sufficiently at the point of detection that
it is impossible to produce any meaningful dump of their contents.  This
feature was implemented by the indefatigable Nicholas Clark, based on an idea
by Mike Giroux.

=head1 Platform Specific Problems

The optimiser on HP-UX 11.23 (Itanium 2) is currently partly disabled (scaled
down to +O1) when using HP C-ANSI-C; the cause of problems at higher
optimisation levels is still unclear.

There are a handful of remaining test failures on VMS, mostly due to
test fixes and minor module tweaks with too many dependencies to
integrate into this release from the development stream, where they have
all been corrected.  The following is a list of expected failures with
the patch number of the fix where that is known:

    ext/Devel/PPPort/t/ppphtest.t  #26913
    ext/List/Util/t/p_tainted.t    #26912
    lib/ExtUtils/t/PL_FILES.t      #26813
    lib/ExtUtils/t/basic.t         #26813
    t/io/fs.t
    t/op/cmp.t

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org.  There may also be
information at http://www.perl.org, the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut
perlmacosx.pod000064400000027434150344123510007435 0ustar00If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perlmacosx - Perl under Mac OS X

=head1 SYNOPSIS

This document briefly describes Perl under Mac OS X.

  curl -O http://www.cpan.org/src/perl-5.26.3.tar.gz
  tar -xzf perl-5.26.3.tar.gz
  cd perl-5.26.3
  ./Configure -des -Dprefix=/usr/local/
  make
  make test
  sudo make install

=head1 DESCRIPTION

The latest Perl release (5.26.3 as of this writing) builds without changes
under all versions of Mac OS X from 10.3 "Panther" onwards. 

In order to build your own version of Perl you will need 'make',
which is part of Apple's developer tools - also known as Xcode. From
Mac OS X 10.7 "Lion" onwards, it can be downloaded separately as the
'Command Line Tools' bundle directly from L<https://developer.apple.com/downloads/>
(you will need a free account to log in), or as a part of the Xcode suite,
freely available at the App Store. Xcode is a pretty big app, so
unless you already have it or really want it, you are advised to get the
'Command Line Tools' bundle separately from the link above. If you want
to do it from within Xcode, go to Xcode -> Preferences -> Downloads and
select the 'Command Line Tools' option.

Between Mac OS X 10.3 "Panther" and 10.6 "Snow Leopard", the 'Command
Line Tools' bundle was called 'unix tools', and was usually supplied
with Mac OS install DVDs.

Earlier Mac OS X releases (10.2 "Jaguar" and older) did not include a
completely thread-safe libc, so threading is not fully supported. Also,
earlier releases included a buggy libdb, so some of the DB_File tests
are known to fail on those releases.


=head2 Installation Prefix

The default installation location for this release uses the traditional
UNIX directory layout under /usr/local. This is the recommended location
for most users, and will leave the Apple-supplied Perl and its modules
undisturbed.

Using an installation prefix of '/usr' will result in a directory layout
that mirrors that of Apple's default Perl, with core modules stored in
'/System/Library/Perl/${version}', CPAN modules stored in
'/Library/Perl/${version}', and the addition of
'/Network/Library/Perl/${version}' to @INC for modules that are stored
on a file server and used by many Macs.


=head2 SDK support

First, export the path to the SDK into the build environment:

 export SDK=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.8.sdk

Please make sure the SDK version (i.e. the numbers right before '.sdk')
matches your system's (in this case, Mac OS X 10.8 "Mountain Lion"), as it is
possible to have more than one SDK installed. Also make sure the path exists
in your system, and if it doesn't please make sure the SDK is properly
installed, as it should come with the 'Command Line Tools' bundle mentioned
above. Finally, if you have an older Mac OS X (10.6 "Snow Leopard" and below)
running Xcode 4.2 or lower, the SDK path might be something like
C<'/Developer/SDKs/MacOSX10.3.9.sdk'>.

You can use the SDK by exporting some additions to Perl's 'ccflags' and '..flags'
config variables:

    ./Configure -Accflags="-nostdinc -B$SDK/usr/include/gcc \
                           -B$SDK/usr/lib/gcc -isystem$SDK/usr/include \
                           -F$SDK/System/Library/Frameworks" \
                -Aldflags="-Wl,-syslibroot,$SDK" \
                -de

=head2 Universal Binary support

Note: From Mac OS X 10.6 "Snow Leopard" onwards, Apple only supports
Intel-based hardware. This means you can safely skip this section unless
you have an older Apple computer running on ppc or wish to create a perl
binary with backwards compatibility.

You can compile perl as a universal binary (built for both ppc and intel).
In Mac OS X 10.4 "Tiger", you must export the 'u' variant of the SDK:

    export SDK=/Developer/SDKs/MacOSX10.4u.sdk

Mac OS X 10.5 "Leopard" and above do not require the 'u' variant.

In addition to the compiler flags used to select the SDK, also add the flags
for creating a universal binary:

 ./Configure -Accflags="-arch i686 -arch ppc -nostdinc               \
                         -B$SDK/usr/include/gcc                      \
                        -B$SDK/usr/lib/gcc -isystem$SDK/usr/include  \
                        -F$SDK/System/Library/Frameworks"            \
             -Aldflags="-arch i686 -arch ppc -Wl,-syslibroot,$SDK"   \
             -de

Keep in mind that these compiler and linker settings will also be used when
building CPAN modules. For XS modules to be compiled as a universal binary, any
libraries it links to must also be universal binaries. The system libraries that
Apple includes with the 10.4u SDK are all universal, but user-installed libraries
may need to be re-installed as universal binaries.

=head2 64-bit PPC support

Follow the instructions in F<INSTALL> to build perl with support for 64-bit 
integers (C<use64bitint>) or both 64-bit integers and 64-bit addressing
(C<use64bitall>). In the latter case, the resulting binary will run only
on G5-based hosts.

Support for 64-bit addressing is experimental: some aspects of Perl may be
omitted or buggy. Note the messages output by F<Configure> for further 
information. Please use C<perlbug> to submit a problem report in the
event that you encounter difficulties.

When building 64-bit modules, it is your responsibility to ensure that linked
external libraries and frameworks provide 64-bit support: if they do not,
module building may appear to succeed, but attempts to use the module will
result in run-time dynamic linking errors, and subsequent test failures.
You can use C<file> to discover the architectures supported by a library:

    $ file libgdbm.3.0.0.dylib 
    libgdbm.3.0.0.dylib: Mach-O fat file with 2 architectures
    libgdbm.3.0.0.dylib (for architecture ppc):      Mach-O dynamically linked shared library ppc
    libgdbm.3.0.0.dylib (for architecture ppc64):    Mach-O 64-bit dynamically linked shared library ppc64

Note that this issue precludes the building of many Macintosh-specific CPAN
modules (C<Mac::*>), as the required Apple frameworks do not provide PPC64
support. Similarly, downloads from Fink or Darwinports are unlikely to provide
64-bit support; the libraries must be rebuilt from source with the appropriate
compiler and linker flags. For further information, see Apple's
I<64-Bit Transition Guide> at 
L<http://developer.apple.com/documentation/Darwin/Conceptual/64bitPorting/index.html>.

=head2 libperl and Prebinding

Mac OS X ships with a dynamically-loaded libperl, but the default for
this release is to compile a static libperl. The reason for this is
pre-binding. Dynamic libraries can be pre-bound to a specific address in
memory in order to decrease load time. To do this, one needs to be aware
of the location and size of all previously-loaded libraries. Apple
collects this information as part of their overall OS build process, and
thus has easy access to it when building Perl, but ordinary users would
need to go to a great deal of effort to obtain the information needed
for pre-binding.

You can override the default and build a shared libperl if you wish
(S<Configure ... -Duseshrplib>).

With Mac OS X 10.4 "Tiger" and newer, there is almost no performance
penalty for non-prebound libraries. Earlier releases will suffer a greater
load time than either the static library, or Apple's pre-bound dynamic library.

=head2 Updating Apple's Perl

In a word - don't, at least not without a *very* good reason. Your scripts
can just as easily begin with "#!/usr/local/bin/perl" as with
"#!/usr/bin/perl". Scripts supplied by Apple and other third parties as
part of installation packages and such have generally only been tested
with the /usr/bin/perl that's installed by Apple.

If you find that you do need to update the system Perl, one issue worth
keeping in mind is the question of static vs. dynamic libraries. If you
upgrade using the default static libperl, you will find that the dynamic
libperl supplied by Apple will not be deleted. If both libraries are
present when an application that links against libperl is built, ld will
link against the dynamic library by default. So, if you need to replace
Apple's dynamic libperl with a static libperl, you need to be sure to
delete the older dynamic library after you've installed the update.


=head2 Known problems

If you have installed extra libraries such as GDBM through Fink
(in other words, you have libraries under F</sw/lib>), or libdlcompat
to F</usr/local/lib>, you may need to be extra careful when running
Configure not to confuse Configure and Perl about which libraries
to use.  Such confusion shows up, for example, as "dyld" errors about
symbol problems during "make test". The safest bet is to run
Configure as

    Configure ... -Uloclibpth -Dlibpth=/usr/lib

to make Configure look only into the system libraries.  If you have some
extra library directories that you really want to use (such as newer
Berkeley DB libraries in pre-Panther systems), add those to the libpth:

    Configure ... -Uloclibpth -Dlibpth='/usr/lib /opt/lib'

The default of building Perl statically may cause problems with complex
applications like Tk: in that case consider building shared Perl

    Configure ... -Duseshrplib

but remember that there's a startup cost to pay in that case (see above
"libperl and Prebinding").

Starting with Tiger (Mac OS X 10.4), Apple shipped broken locale files for
the eu_ES locale (Basque-Spain).  In previous releases of Perl, this resulted in
failures in the F<lib/locale> test. These failures have been suppressed
in the current release of Perl by making the test ignore the broken locale.
If you need to use the eu_ES locale, you should contact Apple support.


=head2 Cocoa

There are two ways to use Cocoa from Perl. Apple's PerlObjCBridge
module, included with Mac OS X, can be used by standalone scripts to
access Foundation (i.e. non-GUI) classes and objects.

An alternative is CamelBones, a framework that allows access to both
Foundation and AppKit classes and objects, so that full GUI applications
can be built in Perl. CamelBones can be found on SourceForge, at
L<http://www.sourceforge.net/projects/camelbones/>.


=head1 Starting From Scratch

Unfortunately it is not that difficult to somehow manage to break one's
Mac OS X Perl rather severely.  If all else fails and you want to
really, B<REALLY>, start from scratch and remove even your Apple Perl
installation (which has become corrupted somehow), the following
instructions should do it.  B<Please think twice before following
these instructions: they are much like conducting brain surgery on
yourself.  Without anesthesia.>  We will B<not> come to fix your system
if you do this.

First, get rid of the libperl.dylib:

    # cd /System/Library/Perl/darwin/CORE
    # rm libperl.dylib

Then delete every .bundle file found anywhere in the folders:

    /System/Library/Perl
    /Library/Perl

You can find them for example by

    # find /System/Library/Perl /Library/Perl -name '*.bundle' -print

After this you can either copy Perl from your operating system media
(you will need at least the /System/Library/Perl and /usr/bin/perl),
or rebuild Perl from the source code with C<Configure -Dprefix=/usr
-Duseshrplib>.  NOTE: using C<-Dprefix=/usr> to replace the system Perl
works much better with Perl 5.8.1 and later; in Perl 5.8.0 the
settings were not quite right.

"Pacifist" from CharlesSoft (L<http://www.charlessoft.com/>) is a nice
way to extract the Perl binaries from the OS media, without having to
reinstall the entire OS.


=head1 AUTHOR

This README was written by Sherm Pendley E<lt>sherm@dot-app.orgE<gt>,
and subsequently updated by Dominic Dunlop E<lt>domo@computer.orgE<gt>
and Breno G. de Oliveira E<lt>garu@cpan.orgE<gt>. The "Starting From Scratch"
recipe was contributed by John Montbriand E<lt>montbriand@apple.comE<gt>.

=head1 DATE

Last modified 2013-04-29.
perlhacktips.pod000064400000154325150344123510007751 0ustar00
=encoding utf8

=for comment
Consistent formatting of this file is achieved with:
  perl ./Porting/podtidy pod/perlhacktips.pod

=head1 NAME

perlhacktips - Tips for Perl core C code hacking

=head1 DESCRIPTION

This document will help you learn the best way to go about hacking on
the Perl core C code.  It covers common problems, debugging, profiling,
and more.

If you haven't read L<perlhack> and L<perlhacktut> yet, you might want
to do that first.

=head1 COMMON PROBLEMS

Perl source plays by ANSI C89 rules: no C99 (or C++) extensions.  In
some cases we have to take pre-ANSI requirements into consideration.
You don't care about some particular platform having broken Perl? I
hear there is still a strong demand for J2EE programmers.

=head2 Perl environment problems

=over 4

=item *

Not compiling with threading

Compiling with threading (-Duseithreads) completely rewrites the
function prototypes of Perl.  You had better try your changes with that.
Related to this is the difference between "Perl_-less" and "Perl_-ly"
APIs, for example:

  Perl_sv_setiv(aTHX_ ...);
  sv_setiv(...);

The first one explicitly passes in the context, which is needed for
e.g. threaded builds.  The second one does that implicitly; do not get
them mixed.  If you are not passing in an aTHX_, you will need to do a
dTHX (or a dVAR) as the first thing in the function.

See L<perlguts/"How multiple interpreters and concurrency are
supported"> for further discussion about context.

=item *

Not compiling with -DDEBUGGING

The DEBUGGING define exposes more code to the compiler, therefore more
ways for things to go wrong.  You should try it.

=item *

Introducing (non-read-only) globals

Do not introduce any modifiable globals, truly global or file static.
They are bad form and complicate multithreading and other forms of
concurrency.  The right way is to introduce them as new interpreter
variables, see F<intrpvar.h> (at the very end for binary
compatibility).

Introducing read-only (const) globals is okay, as long as you verify
with e.g. C<nm libperl.a|egrep -v ' [TURtr] '> (if your C<nm> has
BSD-style output) that the data you added really is read-only.  (If it
is, it shouldn't show up in the output of that command.)

If you want to have static strings, make them constant:

  static const char etc[] = "...";

If you want to have arrays of constant strings, note carefully the
right combination of C<const>s:

    static const char * const yippee[] =
        {"hi", "ho", "silver"};

There is a way to completely hide any modifiable globals (they are all
moved to heap), the compilation setting
C<-DPERL_GLOBAL_STRUCT_PRIVATE>.  It is not normally used, but can be
used for testing, read more about it in L<perlguts/"Background and
PERL_IMPLICIT_CONTEXT">.
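
As a minimal sketch of the C<const> combination shown above (the names
C<greetings> and C<greeting_at> are hypothetical and do not come from
the Perl source), both the pointers and the characters they point to
are read-only, so nothing in the table is a modifiable global:

```c

    #include <stddef.h>

    /* Hypothetical table: const pointers to const chars. */
    static const char * const greetings[] = { "hi", "ho", "silver" };

    /* Returns the entry at idx, or NULL if idx is out of range. */
    const char *greeting_at(size_t idx)
    {
        const size_t count = sizeof(greetings) / sizeof(greetings[0]);

        return idx < count ? greetings[idx] : NULL;
    }

```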

=item *

Not exporting your new function

Some platforms (Win32, AIX, VMS, OS/2, to name a few) require any
function that is part of the public API (the shared Perl library) to be
explicitly marked as exported.  See the discussion about F<embed.pl> in
L<perlguts>.

=item *

Exporting your new function

The new shiny result of either genuine new functionality or your
arduous refactoring is now ready and correctly exported.  So what could
possibly go wrong?

Maybe simply that your function did not need to be exported in the
first place.  Perl has a long and not so glorious history of exporting
functions that it should not have.

If the function is used only inside one source code file, make it
static.  See the discussion about F<embed.pl> in L<perlguts>.

If the function is used across several files, but intended only for
Perl's internal use (and this should be the common case), do not export
it to the public API.  See the discussion about F<embed.pl> in
L<perlguts>.

=back

=head2 Portability problems

The following are common causes of compilation and/or execution
failures, not common to Perl as such.  The C FAQ is good bedtime
reading.  Please test your changes with as many C compilers and
platforms as possible; we will, anyway, and it's nice to save oneself
from public embarrassment.

If using gcc, you can add the C<-std=c89> option which will hopefully
catch most of these unportabilities.  (However it might also catch
incompatibilities in your system's header files.)

Use the Configure C<-Dgccansipedantic> flag to enable the gcc C<-ansi
-pedantic> flags which enforce stricter ANSI rules.

If using the C<gcc -Wall> note that not all the possible warnings (like
C<-Wuninitialized>) are given unless you also compile with C<-O>.

Note that if using gcc, starting from Perl 5.9.5 the Perl core source
code files (the ones at the top level of the source code distribution,
but not e.g. the extensions under ext/) are automatically compiled with
as many as possible of the C<-std=c89>, C<-ansi>, C<-pedantic>, and a
selection of C<-W> flags (see cflags.SH).

Also study L<perlport> carefully to avoid any bad assumptions about the
operating system, filesystems, character set, and so forth.

You may once in a while try a "make microperl" to see whether we can
still compile Perl with just the bare minimum of interfaces.  (See
README.micro.)

Do not assume an operating system indicates a certain compiler.

=over 4

=item *

Casting pointers to integers or casting integers to pointers

    void castaway(U8* p)
    {
      IV i = p;

or

    void castaway(U8* p)
    {
      IV i = (IV)p;

Both are bad, and broken, and unportable.  Use the PTR2IV() macro that
does it right.  (Likewise, there are PTR2UV(), PTR2NV(), INT2PTR(), and
NUM2PTR().)

=item *

Casting between function pointers and data pointers

Technically speaking casting between function pointers and data
pointers is unportable and undefined, but practically speaking it seems
to work, but you should use the FPTR2DPTR() and DPTR2FPTR() macros.
Sometimes you can also play games with unions.

=item *

Assuming sizeof(int) == sizeof(long)

There are platforms where longs are 64 bits, and platforms where ints
are 64 bits, and while we are out to shock you, even platforms where
shorts are 64 bits.  This is all legal according to the C standard.  (In
other words, "long long" is not a portable way to specify 64 bits, and
"long long" is not even guaranteed to be any wider than "long".)

Instead, use the definitions IV, UV, IVSIZE, I32SIZE, and so forth.
Avoid things like I32 because they are B<not> guaranteed to be
I<exactly> 32 bits, they are I<at least> 32 bits, nor are they
guaranteed to be B<int> or B<long>.  If you really explicitly need
64-bit variables, use I64 and U64, but only if guarded by HAS_QUAD.

=item *

Assuming one can dereference any type of pointer for any type of data

  char *p = ...;
  long pony = *(long *)p;    /* BAD */

Many platforms, quite rightly so, will give you a core dump instead of
a pony if the p happens not to be correctly aligned.
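
One portable alternative, sketched here with a hypothetical helper
name (C<read_long_unaligned> is not a Perl API), is to copy the bytes
into a correctly aligned object instead of dereferencing the cast
pointer:

```c

    #include <string.h>

    /* Hypothetical helper: read a long from a possibly unaligned
     * address by copying its bytes into a properly aligned object. */
    long read_long_unaligned(const char *p)
    {
        long value;

        memcpy(&value, p, sizeof(value));
        return value;
    }

```

The copy costs a few instructions on platforms that tolerate unaligned
access, but it cannot fault on those that do not.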

=item *

Lvalue casts

  (int)*p = ...;    /* BAD */

Simply not portable.  Get your lvalue to be of the right type, or maybe
use temporary variables, or dirty tricks with unions.

=item *

Assume B<anything> about structs (especially the ones you don't
control, like the ones coming from the system headers)

=over 8

=item *

That a certain field exists in a struct

=item *

That no other fields exist besides the ones you know of

=item *

That a field is of certain signedness, sizeof, or type

=item *

That the fields are in a certain order

=over 8

=item *

While C guarantees the ordering specified in the struct definition,
between different platforms the definitions might differ

=back

=item *

That the sizeof(struct) or the alignments are the same everywhere

=over 8

=item *

There might be padding bytes between the fields to align the fields -
the bytes can be anything

=item *

Structs are required to be aligned to the maximum alignment required by
the fields - which for native types is usually equivalent to
sizeof() of the field

=back

=back
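
The points about padding and layout can be checked rather than
assumed.  The following is only a sketch: C<struct sample> and both
helper functions are hypothetical, and the values they return differ
between platforms, which is exactly the point.

```c

    #include <stddef.h>

    /* Hypothetical struct, for illustration only. */
    struct sample {
        char tag;
        long count;
    };

    /* Number of bytes the compiler placed between tag and count. */
    size_t sample_padding(void)
    {
        return offsetof(struct sample, count) - sizeof(char);
    }

    /* Nonzero if the struct is larger than the sum of its fields,
     * i.e. if it contains padding somewhere. */
    int sample_has_padding(void)
    {
        return sizeof(struct sample) > sizeof(char) + sizeof(long);
    }

```

Code that needs a field's position should ask for it with
C<offsetof()> instead of adding up the sizes of the preceding fields.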

=item *

Assuming the character set is ASCIIish

Perl can compile and run under EBCDIC platforms.  See L<perlebcdic>.
This is transparent for the most part, but because the character sets
differ, you shouldn't use numeric (decimal, octal, nor hex) constants
to refer to characters.  You can safely say C<'A'>, but not C<0x41>.
You can safely say C<'\n'>, but not C<\012>.  However, you can use
macros defined in F<utf8.h> to specify any code point portably.
C<LATIN1_TO_NATIVE(0xDF)> is going to be the code point that means
LATIN SMALL LETTER SHARP S on whatever platform you are running on (on
ASCII platforms it compiles without adding any extra code, so there is
zero performance hit on those).  The acceptable inputs to
C<LATIN1_TO_NATIVE> are from C<0x00> through C<0xFF>.  If your input
isn't guaranteed to be in that range, use C<UNICODE_TO_NATIVE> instead.
C<NATIVE_TO_LATIN1> and C<NATIVE_TO_UNICODE> translate the opposite
direction.

If you need the string representation of a character that doesn't have a
mnemonic name in C, you should add it to the list in
F<regen/unicode_constants.pl>, and have Perl create C<#define>'s for you,
based on the current platform.

Note that the C<isI<FOO>> and C<toI<FOO>> macros in F<handy.h> work
properly on native code points and strings.

Also, the range 'A' - 'Z' in ASCII is an unbroken sequence of 26 upper
case alphabetic characters.  That is not true in EBCDIC.  Nor for 'a' to
'z'.  But '0' - '9' is an unbroken range in both systems.  Don't assume
anything about other ranges.  (Note that special handling of ranges in
regular expression patterns and transliterations makes it appear to Perl
code that the aforementioned ranges are all unbroken.)

Many of the comments in the existing code ignore the possibility of
EBCDIC, and may be wrong therefore, even if the code works.  This is
actually a tribute to the successful transparent insertion of being
able to handle EBCDIC without having to change pre-existing code.

UTF-8 and UTF-EBCDIC are two different encodings used to represent
Unicode code points as sequences of bytes.  Macros  with the same names
(but different definitions) in F<utf8.h> and F<utfebcdic.h> are used to
allow the calling code to think that there is only one such encoding.
This is almost always referred to as C<utf8>, but it means the EBCDIC
version as well.  Again, comments in the code may well be wrong even if
the code itself is right.  For example, the concept of UTF-8 C<invariant
characters> differs between ASCII and EBCDIC.  On ASCII platforms, only
characters that do not have the high-order bit set (i.e.  whose ordinals
are strict ASCII, 0 - 127) are invariant, and the documentation and
comments in the code may assume that, often referring to something
like, say, C<hibit>.  The situation differs and is not so simple on
EBCDIC machines, but as long as the code itself uses the
C<NATIVE_IS_INVARIANT()> macro appropriately, it works, even if the
comments are wrong.

As noted in L<perlhack/TESTING>, when writing test scripts, the file
F<t/charset_tools.pl> contains some helpful functions for writing tests
valid on both ASCII and EBCDIC platforms.  Sometimes, though, a test
can't use a function and it's inconvenient to have different test
versions depending on the platform.  There are 20 code points that are
the same in all 4 character sets currently recognized by Perl (the 3
EBCDIC code pages plus ISO 8859-1 (ASCII/Latin1)).  These can be used in
such tests, though there is a small possibility that Perl will become
available in yet another character set, breaking your test.  All but one
of these code points are C0 control characters.  The most significant
controls that are the same are C<\0>, C<\r>, and C<\N{VT}> (also
specifiable as C<\cK>, C<\x0B>, C<\N{U+0B}>, or C<\013>).  The single
non-control is U+00B6 PILCROW SIGN.  The controls that are the same have
the same bit pattern in all 4 character sets, regardless of the UTF8ness
of the string containing them.  The bit pattern for U+B6 is the same in
all 4 for non-UTF8 strings, but differs in each when its containing
string is UTF-8 encoded.  The only other code points that have some sort
of sameness across all 4 character sets are the pair 0xDC and 0xFC.
Together these represent upper- and lowercase LATIN LETTER U WITH
DIAERESIS, but which is upper and which is lower may be reversed: 0xDC
is the capital in Latin1 and 0xFC is the small letter, while 0xFC is the
capital in EBCDIC and 0xDC is the small one.  This factoid may be
exploited in writing case insensitive tests that are the same across all
4 character sets.

=item *

Assuming the character set is just ASCII

ASCII is a 7 bit encoding, but bytes have 8 bits in them.  The 128 extra
characters have different meanings depending on the locale.  Absent a
locale, currently these extra characters are generally considered to be
unassigned, and this has presented some problems.  This has been
changed starting in 5.12 so that these characters can be considered to
be Latin-1 (ISO-8859-1).

=item *

Mixing #define and #ifdef

  #define BURGLE(x) ... \
  #ifdef BURGLE_OLD_STYLE        /* BAD */
  ... do it the old way ... \
  #else
  ... do it the new way ... \
  #endif

You cannot portably "stack" cpp directives.  For example in the above
you need two separate BURGLE() #defines, one for each #ifdef branch.
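
A sketch of the portable form, with one complete C<#define> per
branch.  The macro bodies here are placeholders invented for the
example; only the structure matters.

```c

    /* Define BURGLE_OLD_STYLE before this point to get the old variant. */
    #ifdef BURGLE_OLD_STYLE
    #  define BURGLE(x) ((x) + 1)    /* hypothetical "old way" */
    #else
    #  define BURGLE(x) ((x) + 2)    /* hypothetical "new way" */
    #endif

    /* Uses whichever variant was selected above. */
    int burgle_three(void)
    {
        return BURGLE(3);
    }

```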

=item *

Adding non-comment stuff after #endif or #else

  #ifdef SNOSH
  ...
  #else !SNOSH    /* BAD */
  ...
  #endif SNOSH    /* BAD */

The #endif and #else cannot portably have anything non-comment after
them.  If you want to document what is going on (which is a good idea
especially if the branches are long), use (C) comments:

  #ifdef SNOSH
  ...
  #else /* !SNOSH */
  ...
  #endif /* SNOSH */

The gcc option C<-Wendif-labels> warns about the bad variant (by
default on starting from Perl 5.9.4).

=item *

Having a comma after the last element of an enum list

  enum color {
    CERULEAN,
    CHARTREUSE,
    CINNABAR,     /* BAD */
  };

is not portable.  Leave out the last comma.

Also note that whether enums are implicitly morphable to ints varies
between compilers; you might need to cast to (int).

=item *

Using //-comments

  // This function bamfoodles the zorklator.   /* BAD */

That is C99 or C++.  Perl is C89.  Using the //-comments is silently
allowed by many C compilers but cranking up the ANSI C89 strictness
(which we like to do) causes the compilation to fail.

=item *

Mixing declarations and code

  void zorklator()
  {
    int n = 3;
    set_zorkmids(n);    /* BAD */
    int q = 4;

That is C99 or C++.  Some C compilers allow that, but you shouldn't.

The gcc option C<-Wdeclaration-after-statement> scans for such
problems (by default on starting from Perl 5.9.4).
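
The C89-clean version simply moves every declaration to the top of the
block.  In this sketch C<double_zorkmids> is a hypothetical stand-in
for the C<set_zorkmids> call above:

```c

    /* Hypothetical stand-in for the set_zorkmids() call in the text. */
    static int double_zorkmids(int n)
    {
        return 2 * n;
    }

    int zorklator(void)
    {
        int n = 3;
        int q = 4;      /* all declarations precede the first statement */
        int total;

        total = double_zorkmids(n);
        return total + q;
    }

```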

=item *

Introducing variables inside for()

  for(int i = ...; ...; ...) {    /* BAD */

That is C99 or C++.  While it would indeed be awfully nice to have that
also in C89, to limit the scope of the loop variable, alas, we cannot.

=item *

Mixing signed char pointers with unsigned char pointers

  int foo(char *s) { ... }
  ...
  unsigned char *t = ...; /* Or U8* t = ... */
  foo(t);   /* BAD */

While this is legal practice, it is certainly dubious, and downright
fatal in at least one platform: for example VMS cc considers this a
fatal error.  One cause for people often making this mistake is that a
"naked char" and therefore dereferencing a "naked char pointer" have an
undefined signedness: it depends on the compiler and the flags of the
compiler and the underlying platform whether the result is signed or
unsigned.  For this very same reason using a 'char' as an array index is
bad.
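
When a C<char> value has to be used as an array index, converting it to
C<unsigned char> first removes the dependence on the platform's choice
of signedness.  A minimal sketch (the function name C<count_bytes> is
hypothetical):

```c

    #include <limits.h>

    /* Counts how often each byte value occurs in the NUL-terminated s.
     * counts must have room for UCHAR_MAX + 1 entries. */
    void count_bytes(const char *s, unsigned long *counts)
    {
        unsigned int i;

        for (i = 0; i <= UCHAR_MAX; i++)
            counts[i] = 0;
        for (; *s; s++)
            counts[(unsigned char)*s]++;    /* never a negative index */
    }

```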

=item *

Macros that have string constants and their arguments as substrings of
the string constants

  #define FOO(n) printf("number = %d\n", n)    /* BAD */
  FOO(10);

Pre-ANSI semantics for that was equivalent to

  printf("10umber = %d\10");

which is probably not what you were expecting.  Unfortunately at least
one reasonably common and modern C compiler does "real backward
compatibility" here: in AIX that is what still happens even though the
rest of the AIX compiler is very happily C89.

=item *

Using printf formats for non-basic C types

   IV i = ...;
   printf("i = %d\n", i);    /* BAD */

While this might by accident work in some platform (where IV happens to
be an C<int>), in general it cannot.  IV might be something larger.  The
situation is even worse with more specific types (defined by Perl's
configuration step in F<config.h>):

   Uid_t who = ...;
   printf("who = %d\n", who);    /* BAD */

The problem here is that Uid_t might be not only not C<int>-wide but it
might also be unsigned, in which case large uids would be printed as
negative values.

There is no simple solution to this because of printf()'s limited
intelligence, but for many types the right format is available with
either an 'f' or '_f' suffix, for example:

   IVdf /* IV in decimal */
   UVxf /* UV in hexadecimal */

   printf("i = %"IVdf"\n", i); /* The IVdf is a string constant. */

   Uid_t_f /* Uid_t in decimal */

   printf("who = %"Uid_t_f"\n", who);

Or you can try casting to a "wide enough" type:

   printf("i = %"IVdf"\n", (IV)something_very_small_and_signed);

See L<perlguts/Formatted Printing of Size_t and SSize_t> for how to
print those.

Also remember that the C<%p> format really does require a void pointer:

   U8* p = ...;
   printf("p = %p\n", (void*)p);

The gcc option C<-Wformat> scans for such problems.
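
Outside the Perl core, where the C<IVdf>-style format macros are not
available, the "wide enough" cast can be sketched with standard C
alone.  Here C<my_uid_t> is a hypothetical stand-in for a
Configure-determined type such as C<Uid_t>, and C<unsigned long> is
assumed to be wide enough to hold it:

```c

    #include <stdio.h>

    /* Hypothetical stand-in for a Configure-determined type. */
    typedef unsigned short my_uid_t;

    /* Formats who into buf (at least 32 bytes) and returns the number
     * of characters written.  The cast makes the argument match %lu. */
    int format_uid(char *buf, my_uid_t who)
    {
        return sprintf(buf, "who = %lu", (unsigned long)who);
    }

```

Inside the core, prefer the format macros described above, since they
track whatever type Configure actually selected.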

=item *

Blindly using variadic macros

gcc has had them for a while with its own syntax, and C99 brought them
with a standardized syntax.  Don't use the former, and use the latter
only if the HAS_C99_VARIADIC_MACROS is defined.

=item *

Blindly passing va_list

Not all platforms support passing va_list to further varargs (stdarg)
functions.  The right thing to do is to copy the va_list using the
Perl_va_copy() if the NEED_VA_COPY is defined.

=item *

Using gcc statement expressions

   val = ({...;...;...});    /* BAD */

While a nice extension, it's not portable.  The Perl code does
admittedly use them if available to gain some extra speed (essentially
as a funky form of inlining), but you shouldn't.

=item *

Binding together several statements in a macro

Use the macros STMT_START and STMT_END.

   STMT_START {
      ...
   } STMT_END
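
To see why the wrapper matters, here is a self-contained sketch.  The
C<MY_STMT_START> and C<MY_STMT_END> macros are local stand-ins that
assume the usual C<do { ... } while (0)> idiom; they are not copied
from the Perl headers, and C<SWAP_INT> and C<sort_pair> are likewise
hypothetical.

```c

    /* Local stand-ins, assumed to mirror the do/while(0) idiom. */
    #define MY_STMT_START do
    #define MY_STMT_END   while (0)

    #define SWAP_INT(a, b) MY_STMT_START { \
            int tmp_ = (a);                \
            (a) = (b);                     \
            (b) = tmp_;                    \
        } MY_STMT_END

    /* Orders *lo and *hi.  Because the macro expands to a single
     * statement, it is safe in an unbraced if/else. */
    void sort_pair(int *lo, int *hi)
    {
        if (*lo > *hi)
            SWAP_INT(*lo, *hi);
        else
            return;
    }

```

With a bare C<{ ... }> body instead of the wrapper, the semicolon after
the macro call would end the C<if> statement and leave the C<else>
dangling.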

=item *

Testing for operating systems or versions when you should be testing for
features

  #ifdef __FOONIX__    /* BAD */
  foo = quux();
  #endif

Unless you know with 100% certainty that quux() is only ever available
for the "Foonix" operating system B<and> that it is available B<and>
correctly working for B<all> past, present, B<and> future versions of
"Foonix", the above is very wrong.  This is more correct (though still
not perfect, because the below is a compile-time check):

  #ifdef HAS_QUUX
  foo = quux();
  #endif

How does the HAS_QUUX become defined where it needs to be?  Well, if
Foonix happens to be Unixy enough to be able to run the Configure
script, and Configure has been taught about detecting and testing
quux(), the HAS_QUUX will be correctly defined.  In other platforms, the
corresponding configuration step will hopefully do the same.

In a pinch, if you cannot wait for Configure to be educated, or if you
have a good hunch of where quux() might be available, you can
temporarily try the following:

  #if (defined(__FOONIX__) || defined(__BARNIX__))
  # define HAS_QUUX
  #endif

  ...

  #ifdef HAS_QUUX
  foo = quux();
  #endif

But in any case, try to keep the features and operating systems
separate.

A good resource on the predefined macros for various operating
systems, compilers, and so forth is
L<http://sourceforge.net/p/predef/wiki/Home/>

=item *

Assuming the contents of static memory pointed to by the return values
of Perl wrappers for C library functions doesn't change

Many C library functions return pointers to static storage that can be
overwritten by subsequent calls to the same or related functions.  Perl
has lightweight wrappers for some of these functions, which don't make
copies of the static memory.  A good example is the interface to the
environment variables that are in effect for the program.  Perl has
C<PerlEnv_getenv> to get values from the environment, but the return
value is a pointer to static memory in the C library.  If you use the
value only to test for something immediately, that's fine.  If you save
the value and expect it to be unchanged by later processing, you would
be wrong, though you might not notice: C library implementations behave
differently, and the one on the platform you're testing on might happen
to work for your situation.  On some platforms, a subsequent call to
C<PerlEnv_getenv> or a related function WILL overwrite the memory that
your first call points to.  This has led to some hard-to-debug
problems.  Use L<perlapi/savepv> to make a copy, thus avoiding these
problems.  You will have to free the copy when you're done to avoid
memory leaks.  If you don't have control over when it gets freed,
you'll need to make the copy in a mortal scalar, like so:

 if ((s = PerlEnv_getenv("foo")) == NULL) {
    ... /* handle NULL case */
 }
 else {
     s = SvPVX(sv_2mortal(newSVpv(s, 0)));
 }

The above example works only if C<"s"> is C<NUL>-terminated; otherwise
you have to pass its length to C<newSVpv>.

=back
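
The copy-before-reuse advice from the last item can be sketched outside
the Perl core with nothing but standard C.  The helper name C<env_copy>
is hypothetical and exists only for this illustration; inside perl you
would use C<savepv> or a mortal SV instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Perl's API): return a heap copy of
 * the environment variable `name`, or NULL if it is unset or if the
 * allocation fails.  The caller owns the copy and must free() it. */
static char *
env_copy(const char *name)
{
    const char *val;
    char *copy;
    size_t len;

    assert(name != NULL);

    val = getenv(name);          /* may point to static storage */
    if (val == NULL)
        return NULL;

    len = strlen(val) + 1;       /* include the trailing NUL */
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, val, len);  /* copy before any further getenv() */
    return copy;
}
```

The returned string stays valid no matter how the environment is read
or changed afterwards, at the price of one explicit C<free()>.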

=head2 Problematic System Interfaces

=over 4

=item *

malloc(0), realloc(0), calloc(0, 0) are non-portable.  To be portable
allocate at least one byte.  (In general you should rarely need to work
at this low level, but instead use the various malloc wrappers.)

=item *

snprintf() - the return type is unportable.  Use my_snprintf() instead.

=back
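
As a rough illustration of both points, here is a sketch in plain C.
The names C<alloc_at_least_one> and C<format_checked> are hypothetical,
and the second function assumes a C99-conforming C<vsnprintf()>; in the
Perl core you would use the malloc wrappers and C<my_snprintf()>.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: never ask malloc() for zero bytes. */
static void *
alloc_at_least_one(size_t size)
{
    return malloc(size ? size : 1);
}

/* Hypothetical wrapper: format into buf and report whether the whole
 * result fit.  Returns 1 on success, 0 on error or truncation. */
static int
format_checked(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && size > 0);

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    /* C99: negative means error, >= size means truncated output. */
    return n >= 0 && (size_t)n < size;
}
```

Reducing the return value to a yes/no answer sidesteps the differences
between implementations that the item above warns about.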

=head2 Security problems

Last but not least, here are various tips for safer coding.
See also L<perlclib> for libc/stdio replacements one should use.

=over 4

=item *

Do not use gets()

Or we will publicly ridicule you.  Seriously.

=item *

Do not use tmpfile()

Use mkstemp() instead.

=item *

Do not use strcpy() or strcat() or strncpy() or strncat()

Use my_strlcpy() and my_strlcat() instead: they either use the native
implementation, or Perl's own implementation (borrowed from the public
domain implementation of INN).

=item *

Do not use sprintf() or vsprintf()

If you really want just plain byte strings, use my_snprintf() and
my_vsnprintf() instead, which will try to use snprintf() and
vsnprintf() if those safer APIs are available.  If you want something
fancier than a plain byte string, use
L<C<Perl_form>()|perlapi/form> or SVs and
L<C<Perl_sv_catpvf()>|perlapi/sv_catpvf>.

Note that glibc C<printf()>, C<sprintf()>, etc. are buggy before glibc
version 2.17.  They won't allow a C<%.s> format with a precision to
create a string that isn't valid UTF-8 if the current underlying locale
of the program is UTF-8.  What happens is that the C<%s> and its operand
are simply skipped without any notice.  See
L<https://sourceware.org/bugzilla/show_bug.cgi?id=6530>.

=item *

Do not use atoi()

Use grok_atoUV() instead.  atoi() has ill-defined behavior on overflows,
and cannot be used for incremental parsing.  It is also affected by locale,
which is bad.

=item *

Do not use strtol() or strtoul()

Use grok_atoUV() instead.  strtol() or strtoul() (or their IV/UV-friendly
macro disguises, Strtol() and Strtoul(), or Atol() and Atoul()) are
affected by locale, which is bad.

=back

=head1 DEBUGGING

You can compile a special debugging version of Perl, which allows you
to use the C<-D> option of Perl to tell more about what Perl is doing.
But sometimes there is no alternative but to dive in with a debugger,
either to see the stack trace of a core dump (very useful in a bug
report), or to figure out what went wrong before the core dump
happened, or how we ended up with wrong or unexpected results.

=head2 Poking at Perl

To really poke around with Perl, you'll probably want to build Perl for
debugging, like this:

    ./Configure -d -DDEBUGGING
    make

C<-DDEBUGGING> turns on the C compiler's C<-g> flag to have it produce
debugging information which will allow us to step through a running
program, and to see which C function we are in (without the debugging
information we might see only the numerical addresses of the functions,
which is not very helpful).  It will also turn on the C<DEBUGGING>
compilation symbol which enables all the internal debugging code in Perl.
There are a whole bunch of things you can debug with this: L<perlrun>
lists them all, and the best way to find out about them is to play about
with them.  The most useful options are probably

    l  Context (loop) stack processing
    s  Stack snapshots (with v, displays all stacks)
    t  Trace execution
    o  Method and overloading resolution
    c  String/numeric conversions

For example

    $ perl -Dst -e '$a + 1'
    ....
    (-e:1)	gvsv(main::a)
        =>  UNDEF
    (-e:1)	const(IV(1))
        =>  UNDEF  IV(1)
    (-e:1)	add
        =>  NV(1)


Some of the functionality of the debugging code can be achieved with a
non-debugging perl by using XS modules:

    -Dr => use re 'debug'
    -Dx => use O 'Debug'

=head2 Using a source-level debugger

If the debugging output of C<-D> doesn't help you, it's time to step
through perl's execution with a source-level debugger.

=over 3

=item *

We'll use C<gdb> for our examples here; the principles will apply to
any debugger (many vendors call their debugger C<dbx>), but check the
manual of the one you're using.

=back

To fire up the debugger, type

    gdb ./perl

Or if you have a core dump:

    gdb ./perl core

You'll want to do that in your Perl source tree so the debugger can
read the source code.  You should see the copyright message, followed by
the prompt.

    (gdb)

C<help> will get you into the documentation, but here are the most
useful commands:

=over 3

=item * run [args]

Run the program with the given arguments.

=item * break function_name

=item * break source.c:xxx

Tells the debugger that we'll want to pause execution when we reach
either the named function (but see L<perlguts/Internal Functions>!) or
the given line in the named source file.

=item * step

Steps through the program a line at a time.

=item * next

Steps through the program a line at a time, without descending into
functions.

=item * continue

Run until the next breakpoint.

=item * finish

Run until the end of the current function, then stop again.

=item * 'enter'

Just pressing Enter will do the most recent operation again - it's a
blessing when stepping through miles of source code.

=item * ptype

Prints the C definition of the argument given.

  (gdb) ptype PL_op
  type = struct op {
      OP *op_next;
      OP *op_sibparent;
      OP *(*op_ppaddr)(void);
      PADOFFSET op_targ;
      unsigned int op_type : 9;
      unsigned int op_opt : 1;
      unsigned int op_slabbed : 1;
      unsigned int op_savefree : 1;
      unsigned int op_static : 1;
      unsigned int op_folded : 1;
      unsigned int op_spare : 2;
      U8 op_flags;
      U8 op_private;
  } *

=item * print

Execute the given C code and print its results.  B<WARNING>: Perl makes
heavy use of macros, and F<gdb> does not necessarily support macros
(see later L</"gdb macro support">).  You'll have to substitute them
yourself, or invoke cpp on the source code files (see L</"The .i
Targets">).  So, for instance, you can't say

    print SvPV_nolen(sv)

but you have to say

    print Perl_sv_2pv_nolen(sv)

=back

You may find it helpful to have a "macro dictionary", which you can
produce by saying C<cpp -dM perl.c | sort>.  Even then, F<cpp> won't
recursively apply those macros for you.

=head2 gdb macro support

Recent versions of F<gdb> have fairly good macro support, but in order
to use it you'll need to compile perl with macro definitions included
in the debugging information.  Using F<gcc> version 3.1, this means
configuring with C<-Doptimize=-g3>.  Other compilers might use a
different switch (if they support debugging macros at all).

=head2 Dumping Perl Data Structures

One way to get around this macro hell is to use the dumping functions
in F<dump.c>; these work a little like an internal
L<Devel::Peek|Devel::Peek>, but they also cover OPs and other
structures that you can't get at from Perl.  Let's take an example.
We'll use the C<$a = $b + $c> we used before, but give it a bit of
context: C<$b = "6XXXX"; $c = 2.3;>.  Where's a good place to stop and
poke around?

What about C<pp_add>, the function we examined earlier to implement the
C<+> operator:

    (gdb) break Perl_pp_add
    Breakpoint 1 at 0x46249f: file pp_hot.c, line 309.

Notice we use C<Perl_pp_add> and not C<pp_add> - see
L<perlguts/Internal Functions>.  With the breakpoint in place, we can
run our program:

    (gdb) run -e '$b = "6XXXX"; $c = 2.3; $a = $b + $c'

Lots of junk will go past as gdb reads in the relevant source files and
libraries, and then:

    Breakpoint 1, Perl_pp_add () at pp_hot.c:309
    309         dSP; dATARGET; tryAMAGICbin(add,opASSIGN);
    (gdb) step
    311           dPOPTOPnnrl_ul;
    (gdb)

We looked at this bit of code before, and we said that
C<dPOPTOPnnrl_ul> arranges for two C<NV>s to be placed into C<left> and
C<right> - let's slightly expand it:

 #define dPOPTOPnnrl_ul  NV right = POPn; \
                         SV *leftsv = TOPs; \
                         NV left = USE_LEFT(leftsv) ? SvNV(leftsv) : 0.0

C<POPn> takes the SV from the top of the stack and obtains its NV
either directly (if C<SvNOK> is set) or by calling the C<sv_2nv>
function.  C<TOPs> takes the next SV from the top of the stack - yes,
C<POPn> uses C<TOPs> - but doesn't remove it.  We then use C<SvNV> to
get the NV from C<leftsv> in the same way as before - yes, C<POPn> uses
C<SvNV>.
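
The difference between popping and peeking can be sketched with a toy
stack of doubles.  The C<PUSH>, C<POP> and C<TOP> macros and the
C<toy_add> function below are hypothetical stand-ins written for this
illustration; they are not Perl's C<POPn> and C<TOPs>, which operate on
SVs.

```c
#include <assert.h>

static double stack[8];
static double *sp = stack;      /* points at the current top element */

#define PUSH(v) (*++sp = (v))   /* grow the stack, store v */
#define POP     (*sp--)         /* read the top element and remove it */
#define TOP     (*sp)           /* read the top element, leave it there */

/* Add the two topmost values, leaving the sum in place of the left
 * operand, much as a binary op leaves its result on the stack. */
static double
toy_add(void)
{
    double right, left;

    assert(sp - stack >= 2);    /* need two operands */
    right = POP;                /* removes the right operand */
    left  = TOP;                /* left operand stays on the stack */
    TOP   = left + right;       /* overwrite it with the result */
    return TOP;
}
```

After pushing 6.0 and 2.5 and calling C<toy_add()>, the stack holds one
element, the sum, in the slot where the left operand used to be.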

Since we don't have an NV for C<$b>, we'll have to use C<sv_2nv> to
convert it.  If we step again, we'll find ourselves there:

    (gdb) step
    Perl_sv_2nv (sv=0xa0675d0) at sv.c:1669
    1669        if (!sv)
    (gdb)

We can now use C<Perl_sv_dump> to investigate the SV:

    (gdb) print Perl_sv_dump(sv)
    SV = PV(0xa057cc0) at 0xa0675d0
    REFCNT = 1
    FLAGS = (POK,pPOK)
    PV = 0xa06a510 "6XXXX"\0
    CUR = 5
    LEN = 6
    $1 = void

We know we're going to get C<6> from this, so let's finish the
subroutine:

    (gdb) finish
    Run till exit from #0  Perl_sv_2nv (sv=0xa0675d0) at sv.c:1671
    0x462669 in Perl_pp_add () at pp_hot.c:311
    311           dPOPTOPnnrl_ul;

We can also dump out this op: the current op is always stored in
C<PL_op>, and we can dump it with C<Perl_op_dump>.  This'll give us
similar output to L<B::Debug|B::Debug>.

    (gdb) print Perl_op_dump(PL_op)
    {
    13  TYPE = add  ===> 14
        TARG = 1
        FLAGS = (SCALAR,KIDS)
        {
            TYPE = null  ===> (12)
              (was rv2sv)
            FLAGS = (SCALAR,KIDS)
            {
    11          TYPE = gvsv  ===> 12
                FLAGS = (SCALAR)
                GV = main::b
            }
        }

# finish this later #

=head2 Using gdb to look at specific parts of a program

With the example above, you knew to look for C<Perl_pp_add>, but what if
there were multiple calls to it all over the place, or you didn't know what
the op was you were looking for?

One way to do this is to inject a rare call somewhere near what you're looking
for.  For example, you could add C<study> before your method:

    study;

And in gdb do:

    (gdb) break Perl_pp_study

And then step until you hit what you're looking for.  This works well
in a loop if you want to only break at certain iterations:

    for my $c (1..100) {
        study if $c == 50;
    }

=head2 Using gdb to look at what the parser/lexer are doing

If you want to see what perl is doing when parsing/lexing your code, you can
use C<BEGIN {}>:

    print "Before\n";
    BEGIN { study; }
    print "After\n";

And in gdb:

    (gdb) break Perl_pp_study

If you want to see what the parser/lexer is doing inside of C<if> blocks and
the like you need to be a little trickier:

    if ($a && $b && do { BEGIN { study } 1 } && $c) { ... }

=head1 SOURCE CODE STATIC ANALYSIS

Various tools exist for analysing C source code B<statically>, as
opposed to B<dynamically>, that is, without executing the code.  It is
possible to detect resource leaks, undefined behaviour, type
mismatches, portability problems, code paths that would cause illegal
memory accesses, and other similar problems by just parsing the C code
and examining what the resulting graph says about the execution and
data flows.  As a matter of fact, this is exactly how C compilers know
to give warnings about dubious code.

=head2 lint

The good old C code quality inspector, C<lint>, is available on several
platforms, but please be aware that there are several different
implementations of it by different vendors, which means that the flags
are not identical across different platforms.

There is a C<lint> target in Makefile, but you may have to
diddle with the flags (see above).

=head2 Coverity

Coverity (L<http://www.coverity.com/>) is a product similar to lint.  As
a testbed for their product, they periodically check several open source
projects, and they give open source developers accounts on the defect
databases.

There is a Coverity setup for the perl5 project:
L<https://scan.coverity.com/projects/perl5>

=head2 HP-UX cadvise (Code Advisor)

HP has a C/C++ static analyzer product for HP-UX called Code Advisor.
(Link not given here because the URL is horribly long and seems horribly
unstable; use the search engine of your choice to find it.)  The use of
the C<cadvise_cc> recipe with C<Configure ... -Dcc=./cadvise_cc>
(see cadvise "User Guide") is recommended; as is the use of C<+wall>.

=head2 cpd (cut-and-paste detector)

The cpd tool detects cut-and-paste coding.  If one instance of the
cut-and-pasted code changes, all the other spots should probably be
changed, too.  Therefore such code should probably be turned into a
subroutine or a macro.

cpd (L<http://pmd.sourceforge.net/cpd.html>) is part of the pmd project
(L<http://pmd.sourceforge.net/>).  pmd was originally written for static
analysis of Java code, but later the cpd part of it was extended to
parse also C and C++.

Download the pmd-bin-X.Y.zip from the SourceForge site, extract the
pmd-X.Y.jar from it, and then run that on source code thusly:

  java -cp pmd-X.Y.jar net.sourceforge.pmd.cpd.CPD \
   --minimum-tokens 100 --files /some/where/src --language c > cpd.txt

You may run into memory limits, in which case you should use the -Xmx
option:

  java -Xmx512M ...

=head2 gcc warnings

Though much can be written about the inconsistency and coverage
problems of gcc warnings (like C<-Wall> not meaning "all the warnings",
or some common portability problems not being covered by C<-Wall>, or
C<-ansi> and C<-pedantic> both being a poorly defined collection of
warnings, and so forth), gcc is still a useful tool in keeping our
coding nose clean.

C<-Wall> is on by default.

It would be nice to have C<-ansi> (and its sidekick, C<-pedantic>)
always on, but unfortunately they are not safe on all platforms: they
can, for example, cause fatal conflicts with the system headers (Solaris
being a prime example).  If Configure C<-Dgccansipedantic> is used, the
C<cflags> frontend selects C<-ansi -pedantic> for the platforms where
they are known to be safe.

Starting from Perl 5.9.4 the following extra flags are added:

=over 4

=item *

C<-Wendif-labels>

=item *

C<-Wextra>

=item *

C<-Wdeclaration-after-statement>

=back

The following flags would be nice to have but they would first need
their own Augean stablemaster:

=over 4

=item *

C<-Wpointer-arith>

=item *

C<-Wshadow>

=item *

C<-Wstrict-prototypes>

=back

The C<-Wtraditional> is another example of the annoying tendency of gcc
to bundle a lot of warnings under one switch (it would be impossible to
deploy in practice because it would complain a lot) but it does contain
some warnings that would be beneficial to have available on their own,
such as the warning about string constants inside macros containing the
macro arguments: this behaved differently pre-ANSI than it does in
ANSI, and some C compilers are still in transition, AIX being an
example.
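
The macro-argument case mentioned above can be sketched as follows.
The macro and function names are made up for this illustration; the
behaviour shown is that of an ANSI C preprocessor.

```c
#include <assert.h>
#include <string.h>

/* In ANSI C the parameter name inside a string literal is left alone;
 * pre-ANSI preprocessors substituted the argument there instead. */
#define QUOTED(x) "x"

/* The ANSI way to turn a macro argument into a string. */
#define STRINGIFY(x) #x

/* Returns 1 when the preprocessor followed the ANSI rules. */
static int
macro_strings_are_ansi(void)
{
    const char *literal = QUOTED(foo);
    const char *spelled = STRINGIFY(foo);

    assert(literal != NULL && spelled != NULL);
    return strcmp(literal, "x") == 0 && strcmp(spelled, "foo") == 0;
}
```

Code that relied on the old substitution inside the literal silently
changes meaning under ANSI rules, which is why a dedicated warning for
it would be useful.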

=head2 Warnings of other C compilers

Other C compilers (yes, there B<are> other C compilers than gcc) often
have their "strict ANSI" or "strict ANSI with some portability
extensions" modes on, like for example the Sun Workshop has its C<-Xa>
mode on (though implicitly), or the DEC (these days, HP...) has its
C<-std1> mode on.

=head1 MEMORY DEBUGGERS

B<NOTE 1>: Running under older memory debuggers such as Purify,
valgrind or Third Degree greatly slows down the execution: seconds
become minutes, minutes become hours.  For example, as of Perl 5.8.1,
ext/Encode/t/Unicode.t takes extraordinarily long to complete under
e.g. Purify, Third Degree, and valgrind.  Under valgrind it takes more
than six hours, even on a snappy computer.  That test must be doing
something that is quite unfriendly to memory debuggers.  If you don't
feel like waiting, you can simply kill the perl process.  Roughly,
valgrind slows down execution by a factor of 10, AddressSanitizer by a
factor of 2.

B<NOTE 2>: To minimize the number of memory leak false alarms (see
L</PERL_DESTRUCT_LEVEL> for more information), you have to set the
environment variable PERL_DESTRUCT_LEVEL to 2.  For example, like this:

    env PERL_DESTRUCT_LEVEL=2 valgrind ./perl -Ilib ...

B<NOTE 3>: There are known memory leaks when there are compile-time
errors within eval or require; seeing C<S_doeval> in the call stack is
a good sign of these.  Fixing these leaks is non-trivial, unfortunately,
but they must be fixed eventually.

B<NOTE 4>: L<DynaLoader> will not clean up after itself completely
unless Perl is built with the Configure option
C<-Accflags=-DDL_UNLOAD_ALL_AT_EXIT>.

=head2 valgrind

The valgrind tool can be used to find out both memory leaks and illegal
heap memory accesses.  As of version 3.3.0, Valgrind only supports Linux
on x86, x86-64 and PowerPC and Darwin (OS X) on x86 and x86-64.  The
special "test.valgrind" target can be used to run the tests under
valgrind.  Found errors and memory leaks are logged in files named
F<testfile.valgrind> and by default output is displayed inline.

Example usage:

    make test.valgrind

Since valgrind adds significant overhead, tests will take much longer to
run.  The valgrind tests support being run in parallel to help with this:

    TEST_JOBS=9 make test.valgrind

Note that the above two invocations will be very verbose as reachable
memory and leak-checking is enabled by default.  If you want to just see
pure errors, try:

    VG_OPTS='-q --leak-check=no --show-reachable=no' TEST_JOBS=9 \
        make test.valgrind

Valgrind also provides a cachegrind tool, invoked on perl as:

    VG_OPTS=--tool=cachegrind make test.valgrind

As system libraries (most notably glibc) also trigger errors, valgrind
allows such errors to be suppressed using suppression files.  The
default suppression file that comes with valgrind already catches a lot
of them.  Some additional suppressions are defined in F<t/perl.supp>.

To get valgrind and for more information see

    http://valgrind.org/

=head2 AddressSanitizer

AddressSanitizer is a clang and gcc extension, included in clang since
v3.1 and gcc since v4.8.  It checks illegal heap pointers, global
pointers, stack pointers and use after free errors, and is fast enough
that you can easily compile your debugging or optimized perl with it.
It does not check memory leaks though.  AddressSanitizer is available
for Linux, Mac OS X and soon on Windows.

To build perl with AddressSanitizer, your Configure invocation should
look like:

    sh Configure -des -Dcc=clang \
       -Accflags=-fsanitize=address -Aldflags=-fsanitize=address \
       -Alddlflags=-shared\ -fsanitize=address

where these arguments mean:

=over 4

=item * -Dcc=clang

This should be replaced by the full path to your clang executable if it
is not in your path.

=item * -Accflags=-fsanitize=address

Compile perl and extensions sources with AddressSanitizer.

=item * -Aldflags=-fsanitize=address

Link the perl executable with AddressSanitizer.

=item * -Alddlflags=-shared\ -fsanitize=address

Link dynamic extensions with AddressSanitizer.  You must manually
specify C<-shared> because using C<-Alddlflags=-shared> will prevent
Configure from setting a default value for C<lddlflags>, which usually
contains C<-shared> (at least on Linux).

=back

See also
L<http://code.google.com/p/address-sanitizer/wiki/AddressSanitizer>.


=head1 PROFILING

Depending on your platform there are various ways of profiling Perl.

There are two commonly used techniques of profiling executables:
I<statistical time-sampling> and I<basic-block counting>.

The first method periodically takes samples of the CPU program counter,
and since the program counter can be correlated with the code generated
for functions, we get a statistical view of which functions the program
is spending its time in.  The caveats are that very small/fast
functions have a lower probability of showing up in the profile, and
that periodically interrupting the program (this is usually done rather
frequently, on the scale of milliseconds) imposes an additional
overhead that may skew the results.  The first problem can be alleviated
by running the code for longer (in general this is a good idea for
profiling); the second problem is usually kept in check by the
profiling tools themselves.

The second method divides up the generated code into I<basic blocks>.
Basic blocks are sections of code that are entered only at the
beginning and exited only at the end.  For example, a conditional jump
starts a basic block.  Basic block profiling usually works by
I<instrumenting> the code by adding I<enter basic block #nnnn>
book-keeping code to the generated code.  During the execution of the
code the basic block counters are then updated appropriately.  The
caveat is that the added extra code can skew the results: again, the
profiling tools usually try to factor their own effects out of the
results.
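
To make the idea concrete, here is a hand-instrumented sketch.  Real
tools insert the book-keeping automatically; the block names, the
C<ENTER_BB> macro and the C<count_odd> function are all hypothetical
and exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical hand-written instrumentation: one counter per basic
 * block.  Real tools insert equivalent code automatically. */
enum { BB_ENTRY, BB_LOOP_BODY, BB_ODD, BB_EXIT, BB_COUNT };
static unsigned long bb_hits[BB_COUNT];

#define ENTER_BB(id) (bb_hits[(id)]++)

/* Count the odd numbers in 1..n, recording each block entered. */
static int
count_odd(int n)
{
    int i, odd = 0;

    assert(n >= 0);
    ENTER_BB(BB_ENTRY);
    for (i = 1; i <= n; i++) {
        ENTER_BB(BB_LOOP_BODY);
        if (i % 2) {              /* conditional jump: new block */
            ENTER_BB(BB_ODD);
            odd++;
        }
    }
    ENTER_BB(BB_EXIT);
    return odd;
}
```

After one call with C<n> set to 5, the loop body counter holds 5 and
the odd-branch counter holds 3, which is exactly the kind of data a
basic-block profiler reports per source line.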

=head2 Gprof Profiling

I<gprof> is a profiling tool available on many Unix platforms which
uses I<statistical time-sampling>.  You can build a profiled version of
F<perl> by compiling using gcc with the flag C<-pg>.  Either edit
F<config.sh> or re-run F<Configure>.  Running the profiled version of
Perl will create an output file called F<gmon.out> which contains the
profiling data collected during the execution.

quick hint:

    $ sh Configure -des -Dusedevel -Accflags='-pg' \
        -Aldflags='-pg' -Alddlflags='-pg -shared' \
        && make perl
    $ ./perl ... # creates gmon.out in current directory
    $ gprof ./perl > out
    $ less out

(you probably need to add C<-shared> to the C<-Alddlflags> line until RT
#118199 is resolved)

The F<gprof> tool can then display the collected data in various ways.
Usually F<gprof> understands the following options:

=over 4

=item * -a

Suppress statically defined functions from the profile.

=item * -b

Suppress the verbose descriptions in the profile.

=item * -e routine

Exclude the given routine and its descendants from the profile.

=item * -f routine

Display only the given routine and its descendants in the profile.

=item * -s

Generate a summary file called F<gmon.sum> which then may be given to
subsequent gprof runs to accumulate data over several runs.

=item * -z

Display routines that have zero usage.

=back

For more detailed explanation of the available commands and output
formats, see your own local documentation of F<gprof>.

=head2 GCC gcov Profiling

I<Basic block profiling> is officially available in gcc 3.0 and later.
You can build a profiled version of F<perl> by compiling using gcc with
the flags C<-fprofile-arcs -ftest-coverage>.  Either edit F<config.sh>
or re-run F<Configure>.

quick hint:

    $ sh Configure -des -Dusedevel -Doptimize='-g' \
        -Accflags='-fprofile-arcs -ftest-coverage' \
        -Aldflags='-fprofile-arcs -ftest-coverage' \
        -Alddlflags='-fprofile-arcs -ftest-coverage -shared' \
        && make perl
    $ rm -f regexec.c.gcov regexec.gcda
    $ ./perl ...
    $ gcov regexec.c
    $ less regexec.c.gcov

(you probably need to add C<-shared> to the C<-Alddlflags> line until RT
#118199 is resolved)

Running the profiled version of Perl will cause profile output to be
generated.  For each source file an accompanying F<.gcda> file will be
created.

To display the results you use the I<gcov> utility (which should be
installed if you have gcc 3.0 or newer installed).  F<gcov> is run on
source code files, like this

    gcov sv.c

which will cause F<sv.c.gcov> to be created.  The F<.gcov> files contain
the source code annotated with relative frequencies of execution
indicated by "#" markers.  If you want to generate F<.gcov> files for
all profiled object files, you can run something like this:

    for file in `find . -name \*.gcno`
    do sh -c "cd `dirname $file` && gcov `basename $file .gcno`"
    done

Useful options of F<gcov> include C<-b> which will summarise the basic
block, branch, and function call coverage, and C<-c> which instead of
relative frequencies will use the actual counts.  For more information
on the use of F<gcov> and basic block profiling with gcc, see the
latest GNU CC manual.  As of gcc 4.8, this is at
L<http://gcc.gnu.org/onlinedocs/gcc/Gcov-Intro.html#Gcov-Intro>

=head1 MISCELLANEOUS TRICKS

=head2 PERL_DESTRUCT_LEVEL

If you want to run any of the tests yourself manually using e.g.
valgrind, please note that by default perl B<does not> explicitly
clean up all the memory it has allocated (such as global memory arenas)
but instead lets the exit() of the whole program "take care" of such
allocations, also known as "global destruction of objects".

There is a way to tell perl to do complete cleanup: set the environment
variable PERL_DESTRUCT_LEVEL to a non-zero value.  The t/TEST wrapper
does set this to 2, and this is what you need to do too, if you don't
want to see the "global leaks".  For example, for running under valgrind:

    env PERL_DESTRUCT_LEVEL=2 valgrind ./perl -Ilib t/foo/bar.t

(Note: the mod_perl apache module also uses this environment variable
for its own purposes and has extended its semantics.  Refer to the
mod_perl documentation for more information.  Also, spawned threads do
the equivalent of setting this variable to the value 1.)

If, at the end of a run you get the message I<N scalars leaked>, you
can recompile with C<-DDEBUG_LEAKING_SCALARS>,
(C<Configure -Accflags=-DDEBUG_LEAKING_SCALARS>), which will cause the
addresses of all those leaked SVs to be dumped along with details as to
where each SV was originally allocated.  This information is also
displayed by Devel::Peek.  Note that the extra details recorded with
each SV increase memory usage, so it shouldn't be used in production
environments.  It also converts C<new_SV()> from a macro into a real
function, so you can use your favourite debugger to discover where
those pesky SVs were allocated.

If you see that you're leaking memory at runtime, but neither valgrind
nor C<-DDEBUG_LEAKING_SCALARS> will find anything, you're probably
leaking SVs that are still reachable and will be properly cleaned up
during destruction of the interpreter.  In such cases, using the C<-Dm>
switch can point you to the source of the leak.  If the executable was
built with C<-DDEBUG_LEAKING_SCALARS>, C<-Dm> will output SV
allocations in addition to memory allocations.  Each SV allocation has a
distinct serial number that will be written on creation and destruction
of the SV.  So if you're executing the leaking code in a loop, you need
to look for SVs that are created, but never destroyed between each
cycle.  If such an SV is found, set a conditional breakpoint within
C<new_SV()> and make it break only when C<PL_sv_serial> is equal to the
serial number of the leaking SV.  Then you will catch the interpreter in
exactly the state where the leaking SV is allocated, which is
sufficient in many cases to find the source of the leak.
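
The serial-number technique can be sketched with a toy tracker.
Everything below (C<track_new>, C<track_free>, C<first_leak> and the
fixed-size table) is hypothetical and far simpler than what perl does;
it only shows how matching creation and destruction records by serial
number exposes the one allocation that never went away.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TRACKED 64

/* Hypothetical toy tracker, not Perl's implementation: each
 * allocation gets a serial number, and we remember which serials
 * are still live. */
static unsigned long next_serial = 1;
static unsigned char live[MAX_TRACKED];

static unsigned long
track_new(void)
{
    unsigned long serial = next_serial++;
    assert(serial < MAX_TRACKED);
    live[serial] = 1;             /* corresponds to a "created" record */
    return serial;
}

static void
track_free(unsigned long serial)
{
    assert(serial < MAX_TRACKED && live[serial]);
    live[serial] = 0;             /* corresponds to a "destroyed" record */
}

/* Return the lowest serial created in [from, to) that is still live,
 * or 0 if every one of them was freed. */
static unsigned long
first_leak(unsigned long from, unsigned long to)
{
    unsigned long s;
    for (s = from; s < to && s < MAX_TRACKED; s++)
        if (live[s])
            return s;
    return 0;
}
```

The serial returned by C<first_leak> plays the role of the number you
would compare against C<PL_sv_serial> in a conditional breakpoint.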

As C<-Dm> is using the PerlIO layer for output, it will by itself
allocate quite a bunch of SVs, which are hidden to avoid recursion.  You
can bypass the PerlIO layer if you use the SV logging provided by
C<-DPERL_MEM_LOG> instead.

=head2 PERL_MEM_LOG

If compiled with C<-DPERL_MEM_LOG> (C<-Accflags=-DPERL_MEM_LOG>), both
memory and SV allocations go through logging functions, which is
handy for breakpoint setting.

Unless C<-DPERL_MEM_LOG_NOIMPL> (C<-Accflags=-DPERL_MEM_LOG_NOIMPL>) is
also compiled, the logging functions read $ENV{PERL_MEM_LOG} to
determine whether to log the event, and if so how:

    $ENV{PERL_MEM_LOG} =~ /m/           Log all memory ops
    $ENV{PERL_MEM_LOG} =~ /s/           Log all SV ops
    $ENV{PERL_MEM_LOG} =~ /t/           include timestamp in Log
    $ENV{PERL_MEM_LOG} =~ /^(\d+)/      write to FD given (default is 2)

Memory logging is somewhat similar to C<-Dm> but is independent of
C<-DDEBUGGING>, and at a higher level; all uses of Newx(), Renew(), and
Safefree() are logged with the caller's source code file and line
number (and C function name, if supported by the C compiler).  In
contrast, C<-Dm> is directly at the point of C<malloc()>.  SV logging is
similar.

Since the logging doesn't use PerlIO, all SV allocations are logged and
no extra SV allocations are introduced by enabling the logging.  If
compiled with C<-DDEBUG_LEAKING_SCALARS>, the serial number for each SV
allocation is also logged.
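
The four matching rules for C<$ENV{PERL_MEM_LOG}> listed above can be
sketched as a small decoder in plain C.  The structure and function
names are hypothetical, and this is not perl's own parsing code; it
only restates the table in executable form.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical decoded form of a PERL_MEM_LOG-style setting. */
struct mem_log_opts {
    int log_mem;     /* setting contains 'm' */
    int log_sv;      /* setting contains 's' */
    int timestamp;   /* setting contains 't' */
    int fd;          /* leading digits, default 2 (stderr) */
};

/* Sketch of the matching rules; not Perl's own parser. */
static struct mem_log_opts
parse_mem_log(const char *setting)
{
    struct mem_log_opts o = { 0, 0, 0, 2 };
    const char *p;

    assert(setting != NULL);

    if (*setting >= '0' && *setting <= '9') {   /* /^(\d+)/ */
        o.fd = 0;
        for (p = setting; *p >= '0' && *p <= '9'; p++) {
            if (o.fd > (INT_MAX - 9) / 10)
                break;                          /* avoid int overflow */
            o.fd = o.fd * 10 + (*p - '0');
        }
    }
    o.log_mem   = strchr(setting, 'm') != NULL; /* /m/ */
    o.log_sv    = strchr(setting, 's') != NULL; /* /s/ */
    o.timestamp = strchr(setting, 't') != NULL; /* /t/ */
    return o;
}
```

For instance, a setting of C<3ms> decodes to file descriptor 3 with
both memory and SV logging on and timestamps off.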

=head2 DDD over gdb

Those debugging perl with the DDD frontend over gdb may find the
following useful:

You can extend the data conversion shortcuts menu, so for example you
can display an SV's IV value with one click, without doing any typing.
To do that simply edit ~/.ddd/init file and add after:

  ! Display shortcuts.
  Ddd*gdbDisplayShortcuts: \
  /t ()   // Convert to Bin\n\
  /d ()   // Convert to Dec\n\
  /x ()   // Convert to Hex\n\
  /o ()   // Convert to Oct\n\

the following two lines:

  ((XPV*) (())->sv_any )->xpv_pv  // 2pvx\n\
  ((XPVIV*) (())->sv_any )->xiv_iv // 2ivx

so now you can do ivx and pvx lookups or you can plug there the sv_peek
"conversion":

  Perl_sv_peek(my_perl, (SV*)()) // sv_peek

(The my_perl is for threaded builds.)  Just remember that every line,
but the last one, should end with \n\

Alternatively edit the init file interactively via: 3rd mouse button ->
New Display -> Edit Menu

Note: you can define up to 20 conversion shortcuts in the gdb section.

=head2 C backtrace

On some platforms Perl supports retrieving the C level backtrace
(similar to what symbolic debuggers like gdb do).

The backtrace returns the stack trace of the C call frames,
with the symbol names (function names), the object names (like "perl"),
and if it can, also the source code locations (file:line).

The supported platforms are Linux and OS X.  Some *BSD systems might
work at least partly, but they have not yet been tested.

This feature hasn't been tested with multiple threads, but it will
only show the backtrace of the thread doing the backtracing.

The feature needs to be enabled with C<Configure -Dusecbacktrace>.

The C<-Dusecbacktrace> also enables keeping the debug information when
compiling/linking (often: C<-g>).  Many compilers/linkers do support
having both optimization and keeping the debug information.  The debug
information is needed for the symbol names and the source locations.

Static functions might not be visible for the backtrace.

Source code locations, even if available, can often be missing or
misleading if the compiler has e.g. inlined code.  The optimizer can
make matching the source code and the object code quite challenging.

=over 4

=item Linux

You B<must> have the BFD (-lbfd) library installed, otherwise C<perl> will
fail to link.  The BFD is usually distributed as part of the GNU binutils.

Summary: C<Configure ... -Dusecbacktrace>
and you need C<-lbfd>.

=item OS X

The source code locations are supported B<only> if you have
the Developer Tools installed.  (BFD is B<not> needed.)

Summary: C<Configure ... -Dusecbacktrace>
and installing the Developer Tools would be good.

=back

Optionally, for trying out the feature, you may want to enable
automatic dumping of the backtrace just before a warning or croak (die)
message is emitted, by adding C<-Accflags=-DUSE_C_BACKTRACE_ON_ERROR>
for Configure.

Unless the above additional feature is enabled, nothing about the
backtrace functionality is visible, except for the Perl/XS level.

Furthermore, even if you have enabled this feature to be compiled,
you need to enable it at runtime with an environment variable:
C<PERL_C_BACKTRACE_ON_ERROR=10>.  It must be an integer greater
than zero, giving the desired frame count.

Retrieving the backtrace from Perl level (using for example an XS
extension) would be much less exciting than one would hope: normally
you would see C<runops>, C<entersub>, and not much else.  This API is
intended to be called B<from within> the Perl implementation, not from
Perl level execution.

The C API for the backtrace is as follows:

=over 4

=item get_c_backtrace

=item free_c_backtrace

=item get_c_backtrace_dump

=item dump_c_backtrace

=back

=head2 Poison

If you see in a debugger a memory area mysteriously full of 0xABABABAB
or 0xEFEFEFEF, you may be seeing the effect of the Poison() macros, see
L<perlclib>.
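
To see why poisoned memory shows up as those repeated words, here is a
small standalone sketch.  The C<SKETCH_*> macros and C<demo_poison()>
are local stand-ins written for this example, not perl's own
definitions, and the pairing of C<0xAB> with fresh memory and C<0xEF>
with freed memory is an assumption based on the description in
L<perlclib>.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Fill bytes matching the patterns mentioned above (assumed pairing). */
#define SKETCH_POISON_NEW  0xAB
#define SKETCH_POISON_FREE 0xEF

/* Overwrite n elements of the given type with a single fill byte. */
#define SKETCH_POISON_WITH(dest, n, type, byte) \
    ((void)memset((dest), (byte), (n) * sizeof(type)))

/* Self-check: every 32-bit word reads back as the repeated byte. */
static void demo_poison(void)
{
    uint32_t words[4] = { 1, 2, 3, 4 };

    SKETCH_POISON_WITH(words, 4, uint32_t, SKETCH_POISON_NEW);
    assert(words[0] == UINT32_C(0xABABABAB));

    SKETCH_POISON_WITH(words, 4, uint32_t, SKETCH_POISON_FREE);
    assert(words[3] == UINT32_C(0xEFEFEFEF));
}
```

Because every byte holds the same value, the pattern looks identical
regardless of byte order, which is what makes it easy to recognise in a
debugger.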

=head2 Read-only optrees

Under ithreads the optree is read only.  If you want to enforce this, to
check for write accesses from buggy code, compile with
C<-Accflags=-DPERL_DEBUG_READONLY_OPS>
to enable code that allocates op memory
via C<mmap>, and sets it read-only when it is attached to a subroutine.
Any write access to an op results in a C<SIGBUS> and abort.

This code is intended for development only, and may not be portable
even to all Unix variants.  Also, it is an 80% solution, in that it
isn't able to make all ops read only.  Specifically it does not apply to
op slabs belonging to C<BEGIN> blocks.

However, as an 80% solution it is still effective, as it has caught
bugs in the past.

=head2 When is a bool not a bool?

On pre-C99 compilers, C<bool> is defined as equivalent to C<char>.
Consequently assignment of any larger type to a C<bool> is unsafe and may be
truncated.  The C<cBOOL> macro exists to cast it correctly; you may also find
that using it is shorter and clearer than writing out the equivalent
conditional expression longhand.
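
The truncation hazard can be reproduced without any perl headers.  The
sketch below models a C<char>-sized C<bool> with a local typedef;
C<char_bool>, C<SKETCH_CBOOL>, C<naive_flag()> and C<safe_flag()> are
names made up for this illustration, and C<SKETCH_CBOOL> is only
assumed to behave like C<cBOOL>.

```c
#include <assert.h>

/* Stand-in for a pre-C99 "bool" that is really a char-sized integer. */
typedef unsigned char char_bool;

/* Local truth-normalising cast; not perl's own macro definition. */
#define SKETCH_CBOOL(x) ((x) ? (char_bool)1 : (char_bool)0)

/* Plain conversion keeps only the low 8 bits of the value. */
static char_bool naive_flag(unsigned long flags)
{
    return (char_bool)flags;
}

/* Normalising first preserves "non-zero means true". */
static char_bool safe_flag(unsigned long flags)
{
    return SKETCH_CBOOL(flags);
}

/* Self-check using a flag that lives in bit 8. */
static void demo_flags(void)
{
    assert(naive_flag(0x100UL) == 0);  /* truth value lost */
    assert(safe_flag(0x100UL) == 1);   /* truth value kept */
    assert(safe_flag(0UL) == 0);
}
```

Any flag test whose set bit lies above the low byte, such as
C<flags & 0x100>, is silently turned into false by the naive version.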

On those platforms and compilers where C<bool> really is a boolean (C++,
C99), it is easy to forget the cast.  You can force C<bool> to be a C<char>
by compiling with C<-Accflags=-DPERL_BOOL_AS_CHAR>.  You may also wish to
run C<Configure> with something like

    -Accflags='-Wconversion -Wno-sign-conversion -Wno-shorten-64-to-32'

or your compiler's equivalent to make it easier to spot any unsafe truncations
that show up.

The C<TRUE> and C<FALSE> macros are available for situations where using them
would clarify intent. (But they always just mean the same as the integers 1 and
0 regardless, so using them isn't compulsory.)

=head2 The .i Targets

You can expand the macros in a F<foo.c> file by saying

    make foo.i

which will expand the macros using cpp.  Don't be scared by the
results.

=head1 AUTHOR

This document was originally written by Nathan Torkington, and is
maintained by the perl5-porters mailing list.

perlintern.pod

-*- buffer-read-only: t -*-
!!!!!!!   DO NOT EDIT THIS FILE   !!!!!!!
This file is built by autodoc.pl extracting documentation from the C source
files.
Any changes made here will be lost!

=head1 NAME

perlintern - autogenerated documentation of purely B<internal>
		 Perl functions

=head1 DESCRIPTION
X<internal Perl functions> X<interpreter functions>

This file is the autogenerated documentation of functions in the
Perl interpreter that are documented using Perl's internal documentation
format but are not marked as part of the Perl API.  In other words,
B<they are not for use in extensions>!


=head1 Compile-time scope hooks

=over 8

=item BhkENTRY
X<BhkENTRY>


NOTE: this function is experimental and may change or be
removed without notice.


Return an entry from the BHK structure.  C<which> is a preprocessor token
indicating which entry to return.  If the appropriate flag is not set
this will return C<NULL>.  The type of the return value depends on which
entry you ask for.

	void *	BhkENTRY(BHK *hk, which)

=for hackers
Found in file op.h

=item BhkFLAGS
X<BhkFLAGS>


NOTE: this function is experimental and may change or be
removed without notice.


Return the BHK's flags.

	U32	BhkFLAGS(BHK *hk)

=for hackers
Found in file op.h

=item CALL_BLOCK_HOOKS
X<CALL_BLOCK_HOOKS>


NOTE: this function is experimental and may change or be
removed without notice.


Call all the registered block hooks for type C<which>.  C<which> is a
preprocessing token; the type of C<arg> depends on C<which>.

	void	CALL_BLOCK_HOOKS(which, arg)

=for hackers
Found in file op.h


=back

=head1 Custom Operators

=over 8

=item core_prototype
X<core_prototype>

This function assigns the prototype of the named core function to C<sv>, or
to a new mortal SV if C<sv> is C<NULL>.  It returns the modified C<sv>, or
C<NULL> if the core function has no prototype.  C<code> is a code as returned
by C<keyword()>.  It must not be equal to 0.

	SV *	core_prototype(SV *sv, const char *name,
		               const int code,
		               int * const opnum)

=for hackers
Found in file op.c


=back

=head1 CV Manipulation Functions

=over 8

=item docatch
X<docatch>

Check for the cases 0 or 3 of cur_env.je_ret, only used inside an eval context.

0 is used as continue inside eval,

3 is used for a die caught by an inner eval - continue inner loop

See F<cop.h>: je_mustcatch, when set at any runlevel to TRUE, means eval ops must
establish a local jmpenv to handle exception traps.

	OP*	docatch(Perl_ppaddr_t firstpp)

=for hackers
Found in file pp_ctl.c


=back

=head1 CV reference counts and CvOUTSIDE

=over 8

=item CvWEAKOUTSIDE
X<CvWEAKOUTSIDE>

Each CV has a pointer, C<CvOUTSIDE()>, to its lexically enclosing
CV (if any).  Because pointers to anonymous sub prototypes are
stored in C<&> pad slots, it is possible to get a circular reference,
with the parent pointing to the child and vice-versa.  To avoid the
ensuing memory leak, we do not increment the reference count of the CV
pointed to by C<CvOUTSIDE> in the I<one specific instance> that the parent
has a C<&> pad slot pointing back to us.  In this case, we set the
C<CvWEAKOUTSIDE> flag in the child.  This allows us to determine under what
circumstances we should decrement the refcount of the parent when freeing
the child.

There is a further complication with non-closure anonymous subs (i.e. those
that do not refer to any lexicals outside that sub).  In this case, the
anonymous prototype is shared rather than being cloned.  This has the
consequence that the parent may be freed while there are still active
children, I<e.g.>,

    BEGIN { $a = sub { eval '$x' } }

In this case, the BEGIN is freed immediately after execution since there
are no active references to it: the anon sub prototype has
C<CvWEAKOUTSIDE> set since it's not a closure, and $a points to the same
CV, so it doesn't contribute to BEGIN's refcount either.  When $a is
executed, the C<eval '$x'> causes the chain of C<CvOUTSIDE>s to be followed,
and the freed BEGIN is accessed.

To avoid this, whenever a CV and its associated pad is freed, any
C<&> entries in the pad are explicitly removed from the pad, and if the
refcount of the pointed-to anon sub is still positive, then that
child's C<CvOUTSIDE> is set to point to its grandparent.  This will only
occur in the single specific case of a non-closure anon prototype
having one or more active references (such as C<$a> above).

One other thing to consider is that a CV may be merely undefined
rather than freed, e.g. C<undef &foo>.  In this case, its refcount may
not have reached zero, but we still delete its pad and its C<CvROOT> etc.
Since various children may still have their C<CvOUTSIDE> pointing at this
undefined CV, we keep its own C<CvOUTSIDE> for the time being, so that
the chain of lexical scopes is unbroken.  For example, the following
should print 123:

    my $x = 123;
    sub tmp { sub { eval '$x' } }
    my $a = tmp();
    undef &tmp;
    print  $a->();

	bool	CvWEAKOUTSIDE(CV *cv)

=for hackers
Found in file cv.h


=back

=head1 Embedding Functions

=over 8

=item cv_dump
X<cv_dump>

Dump the contents of a CV.

	void	cv_dump(CV *cv, const char *title)

=for hackers
Found in file pad.c

=item cv_forget_slab
X<cv_forget_slab>

When a CV has a reference count on its slab (C<CvSLABBED>), it is responsible
for making sure it is freed.  (Hence, no two CVs should ever have a
reference count on the same slab.)  The CV only needs to reference the slab
during compilation.  Once it is compiled and C<CvROOT> attached, it has
finished its job, so it can forget the slab.

	void	cv_forget_slab(CV *cv)

=for hackers
Found in file pad.c

=item do_dump_pad
X<do_dump_pad>

Dump the contents of a padlist

	void	do_dump_pad(I32 level, PerlIO *file,
		            PADLIST *padlist, int full)

=for hackers
Found in file pad.c

=item pad_alloc_name
X<pad_alloc_name>

Allocates a place in the currently-compiling
pad (via L<perlapi/pad_alloc>) and
then stores a name for that entry.  C<name> is adopted and
becomes the name entry; it must already contain the name
string.  C<typestash> and C<ourstash> and the C<padadd_STATE>
flag get added to C<name>.  None of the other
processing of L<perlapi/pad_add_name_pvn>
is done.  Returns the offset of the allocated pad slot.

	PADOFFSET pad_alloc_name(PADNAME *name, U32 flags,
	                         HV *typestash, HV *ourstash)

=for hackers
Found in file pad.c

=item pad_block_start
X<pad_block_start>

Update the pad compilation state variables on entry to a new block.

	void	pad_block_start(int full)

=for hackers
Found in file pad.c

=item pad_check_dup
X<pad_check_dup>

Check for duplicate declarations: report any of:

     * a 'my' in the current scope with the same name;
     * an 'our' (anywhere in the pad) with the same name and the
       same stash as 'ourstash'

The C<padadd_OUR> bit in C<flags> indicates that the name to check is an
C<"our"> declaration.

	void	pad_check_dup(PADNAME *name, U32 flags,
		              const HV *ourstash)

=for hackers
Found in file pad.c

=item pad_findlex
X<pad_findlex>

Find a named lexical anywhere in a chain of nested pads.  Add fake entries
in the inner pads if it's found in an outer one.

Returns the offset in the bottom pad of the lex or the fake lex.
C<cv> is the CV in which to start the search, and C<seq> is the current C<cop_seq>
to match against.  If C<warn> is true, print appropriate warnings.  The C<out_>*
vars return values, and so are pointers to where the returned values
should be stored.  C<out_capture>, if non-null, requests that the innermost
instance of the lexical is captured; C<out_name> is set to the innermost
matched pad name or fake pad name; C<out_flags> returns the flags normally
associated with the C<PARENT_FAKELEX_FLAGS> field of a fake pad name.

Note that C<pad_findlex()> is recursive; it recurses up the chain of CVs,
then comes back down, adding fake entries as it goes.  It has to be this
way because fake names in anon prototypes have to store in C<xpadn_low>
the index into the parent pad.

	PADOFFSET pad_findlex(const char *namepv,
	                      STRLEN namelen, U32 flags,
	                      const CV* cv, U32 seq, int warn,
	                      SV** out_capture,
	                      PADNAME** out_name,
	                      int *out_flags)

=for hackers
Found in file pad.c

=item pad_fixup_inner_anons
X<pad_fixup_inner_anons>

For any anon CVs in the pad, change C<CvOUTSIDE> of that CV from
C<old_cv> to C<new_cv> if necessary.  Needed when a newly-compiled CV has to be
moved to a pre-existing CV struct.

	void	pad_fixup_inner_anons(PADLIST *padlist,
		                      CV *old_cv, CV *new_cv)

=for hackers
Found in file pad.c

=item pad_free
X<pad_free>

Free the SV at offset C<po> in the current pad.

	void	pad_free(PADOFFSET po)

=for hackers
Found in file pad.c

=item pad_leavemy
X<pad_leavemy>

Cleanup at end of scope during compilation: set the max seq number for
lexicals in this scope and warn of any lexicals that never got introduced.

	void	pad_leavemy()

=for hackers
Found in file pad.c

=item padlist_dup
X<padlist_dup>

Duplicates a pad.

	PADLIST * padlist_dup(PADLIST *srcpad,
	                      CLONE_PARAMS *param)

=for hackers
Found in file pad.c

=item padname_dup
X<padname_dup>

Duplicates a pad name.

	PADNAME * padname_dup(PADNAME *src, CLONE_PARAMS *param)

=for hackers
Found in file pad.c

=item padnamelist_dup
X<padnamelist_dup>

Duplicates a pad name list.

	PADNAMELIST * padnamelist_dup(PADNAMELIST *srcpad,
	                              CLONE_PARAMS *param)

=for hackers
Found in file pad.c

=item pad_push
X<pad_push>

Push a new pad frame onto the padlist, unless there's already a pad at
this depth, in which case don't bother creating a new one.  Then give
the new pad an C<@_> in slot zero.

	void	pad_push(PADLIST *padlist, int depth)

=for hackers
Found in file pad.c

=item pad_reset
X<pad_reset>

Mark all the current temporaries for reuse

	void	pad_reset()

=for hackers
Found in file pad.c

=item pad_swipe
X<pad_swipe>

Abandon the tmp in the current pad at offset C<po> and replace with a
new one.

	void	pad_swipe(PADOFFSET po, bool refadjust)

=for hackers
Found in file pad.c


=back

=head1 GV Functions

=over 8

=item gv_try_downgrade
X<gv_try_downgrade>


NOTE: this function is experimental and may change or be
removed without notice.


If the typeglob C<gv> can be expressed more succinctly, by having
something other than a real GV in its place in the stash, replace it
with the optimised form.  Basic requirements for this are that C<gv>
is a real typeglob, is sufficiently ordinary, and is only referenced
from its package.  This function is meant to be used when a GV has been
looked up in part to see what was there, causing upgrading, but based
on what was found it turns out that the real GV isn't required after all.

If C<gv> is a completely empty typeglob, it is deleted from the stash.

If C<gv> is a typeglob containing only a sufficiently-ordinary constant
sub, the typeglob is replaced with a scalar-reference placeholder that
more compactly represents the same thing.

	void	gv_try_downgrade(GV* gv)

=for hackers
Found in file gv.c


=back

=head1 Hash Manipulation Functions

=over 8

=item hv_ename_add
X<hv_ename_add>

Adds a name to a stash's internal list of effective names.  See
C<L</hv_ename_delete>>.

This is called when a stash is assigned to a new location in the symbol
table.

	void	hv_ename_add(HV *hv, const char *name, U32 len,
		             U32 flags)

=for hackers
Found in file hv.c

=item hv_ename_delete
X<hv_ename_delete>

Removes a name from a stash's internal list of effective names.  If this is
the name returned by C<HvENAME>, then another name in the list will take
its place (C<HvENAME> will use it).

This is called when a stash is deleted from the symbol table.

	void	hv_ename_delete(HV *hv, const char *name,
		                U32 len, U32 flags)

=for hackers
Found in file hv.c

=item refcounted_he_chain_2hv
X<refcounted_he_chain_2hv>

Generates and returns a C<HV *> representing the content of a
C<refcounted_he> chain.
C<flags> is currently unused and must be zero.

	HV *	refcounted_he_chain_2hv(
		    const struct refcounted_he *c, U32 flags
		)

=for hackers
Found in file hv.c

=item refcounted_he_fetch_pv
X<refcounted_he_fetch_pv>

Like L</refcounted_he_fetch_pvn>, but takes a nul-terminated string
instead of a string/length pair.

	SV *	refcounted_he_fetch_pv(
		    const struct refcounted_he *chain,
		    const char *key, U32 hash, U32 flags
		)

=for hackers
Found in file hv.c

=item refcounted_he_fetch_pvn
X<refcounted_he_fetch_pvn>

Search along a C<refcounted_he> chain for an entry with the key specified
by C<keypv> and C<keylen>.  If C<flags> has the C<REFCOUNTED_HE_KEY_UTF8>
bit set, the key octets are interpreted as UTF-8, otherwise they
are interpreted as Latin-1.  C<hash> is a precomputed hash of the key
string, or zero if it has not been precomputed.  Returns a mortal scalar
representing the value associated with the key, or C<&PL_sv_placeholder>
if there is no value associated with the key.

	SV *	refcounted_he_fetch_pvn(
		    const struct refcounted_he *chain,
		    const char *keypv, STRLEN keylen, U32 hash,
		    U32 flags
		)

=for hackers
Found in file hv.c

=item refcounted_he_fetch_pvs
X<refcounted_he_fetch_pvs>

Like L</refcounted_he_fetch_pvn>, but takes a C<NUL>-terminated literal string
instead of a string/length pair, and no precomputed hash.

	SV *	refcounted_he_fetch_pvs(
		    const struct refcounted_he *chain,
		    const char *key, U32 flags
		)

=for hackers
Found in file hv.h

=item refcounted_he_fetch_sv
X<refcounted_he_fetch_sv>

Like L</refcounted_he_fetch_pvn>, but takes a Perl scalar instead of a
string/length pair.

	SV *	refcounted_he_fetch_sv(
		    const struct refcounted_he *chain, SV *key,
		    U32 hash, U32 flags
		)

=for hackers
Found in file hv.c

=item refcounted_he_free
X<refcounted_he_free>

Decrements the reference count of a C<refcounted_he> by one.  If the
reference count reaches zero the structure's memory is freed, which
(recursively) causes a reduction of its parent C<refcounted_he>'s
reference count.  It is safe to pass a null pointer to this function:
no action occurs in this case.

	void	refcounted_he_free(struct refcounted_he *he)

=for hackers
Found in file hv.c

=item refcounted_he_inc
X<refcounted_he_inc>

Increment the reference count of a C<refcounted_he>.  The pointer to the
C<refcounted_he> is also returned.  It is safe to pass a null pointer
to this function: no action occurs and a null pointer is returned.

	struct refcounted_he * refcounted_he_inc(
	                           struct refcounted_he *he
	                       )

=for hackers
Found in file hv.c

=item refcounted_he_new_pv
X<refcounted_he_new_pv>

Like L</refcounted_he_new_pvn>, but takes a nul-terminated string instead
of a string/length pair.

	struct refcounted_he * refcounted_he_new_pv(
	                           struct refcounted_he *parent,
	                           const char *key, U32 hash,
	                           SV *value, U32 flags
	                       )

=for hackers
Found in file hv.c

=item refcounted_he_new_pvn
X<refcounted_he_new_pvn>

Creates a new C<refcounted_he>.  This consists of a single key/value
pair and a reference to an existing C<refcounted_he> chain (which may
be empty), and thus forms a longer chain.  When using the longer chain,
the new key/value pair takes precedence over any entry for the same key
further along the chain.

The new key is specified by C<keypv> and C<keylen>.  If C<flags> has
the C<REFCOUNTED_HE_KEY_UTF8> bit set, the key octets are interpreted
as UTF-8, otherwise they are interpreted as Latin-1.  C<hash> is
a precomputed hash of the key string, or zero if it has not been
precomputed.

C<value> is the scalar value to store for this key.  C<value> is copied
by this function, which thus does not take ownership of any reference
to it, and later changes to the scalar will not be reflected in the
value visible in the C<refcounted_he>.  Complex types of scalar will not
be stored with referential integrity, but will be coerced to strings.
C<value> may be either null or C<&PL_sv_placeholder> to indicate that no
value is to be associated with the key; this, as with any non-null value,
takes precedence over the existence of a value for the key further along
the chain.

C<parent> points to the rest of the C<refcounted_he> chain to be
attached to the new C<refcounted_he>.  This function takes ownership
of one reference to C<parent>, and returns one reference to the new
C<refcounted_he>.

	struct refcounted_he * refcounted_he_new_pvn(
	                           struct refcounted_he *parent,
	                           const char *keypv,
	                           STRLEN keylen, U32 hash,
	                           SV *value, U32 flags
	                       )

=for hackers
Found in file hv.c

=item refcounted_he_new_pvs
X<refcounted_he_new_pvs>

Like L</refcounted_he_new_pvn>, but takes a C<NUL>-terminated literal string
instead of a string/length pair, and no precomputed hash.

	struct refcounted_he * refcounted_he_new_pvs(
	                           struct refcounted_he *parent,
	                           const char *key, SV *value,
	                           U32 flags
	                       )

=for hackers
Found in file hv.h

=item refcounted_he_new_sv
X<refcounted_he_new_sv>

Like L</refcounted_he_new_pvn>, but takes a Perl scalar instead of a
string/length pair.

	struct refcounted_he * refcounted_he_new_sv(
	                           struct refcounted_he *parent,
	                           SV *key, U32 hash, SV *value,
	                           U32 flags
	                       )

=for hackers
Found in file hv.c


=back
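
The ownership rules described for the C<refcounted_he> functions above
(a new entry takes over one reference to its parent, freeing an entry
releases the parent in turn, and the nearest entry for a key wins) can
be modelled in a few lines of standalone C.  This is a simplified
sketch, not perl's implementation: C<struct chain_he> and the
C<chain_*> functions are hypothetical names, keys are plain C strings
and values are plain integers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified chain node: one key/value pair plus a counted parent link.
 * Field names are illustrative and do not mirror perl's struct. */
struct chain_he {
    struct chain_he *parent;
    unsigned long    refcnt;
    const char      *key;    /* borrowed; must outlive the node */
    int              value;
};

/* Create a node in front of parent.  Takes over the caller's reference
 * to parent and returns one reference to the new node. */
static struct chain_he *chain_new(struct chain_he *parent,
                                  const char *key, int value)
{
    struct chain_he *he = malloc(sizeof *he);
    assert(he != NULL);
    he->parent = parent;
    he->refcnt = 1;
    he->key    = key;
    he->value  = value;
    return he;
}

/* Add a reference; a null pointer is passed through untouched. */
static struct chain_he *chain_inc(struct chain_he *he)
{
    if (he)
        he->refcnt++;
    return he;
}

/* Drop a reference.  When a node dies, its reference to the parent is
 * dropped in turn, so the loop walks up while counts reach zero. */
static void chain_free(struct chain_he *he)
{
    while (he && --he->refcnt == 0) {
        struct chain_he *parent = he->parent;
        free(he);
        he = parent;
    }
}

/* Return the value for key from the nearest node, or fallback. */
static int chain_fetch(const struct chain_he *he, const char *key,
                       int fallback)
{
    for (; he; he = he->parent)
        if (strcmp(he->key, key) == 0)
            return he->value;
    return fallback;
}

/* Self-check: two children sharing one parent. */
static void demo_chain(void)
{
    struct chain_he *base  = chain_new(NULL, "strict", 1);
    /* left consumes an extra reference, right consumes ours. */
    struct chain_he *left  = chain_new(chain_inc(base), "strict", 0);
    struct chain_he *right = chain_new(base, "warn", 1);

    assert(base->refcnt == 2);
    assert(chain_fetch(left, "strict", -1) == 0);   /* newest entry wins */
    assert(chain_fetch(right, "strict", -1) == 1);  /* found in parent */
    assert(chain_fetch(right, "utf8", -1) == -1);   /* absent */

    chain_free(left);            /* base survives: right still holds it */
    assert(base->refcnt == 1);
    chain_free(right);           /* frees right, then base */
}
```

The iterative loop in C<chain_free()> has the same effect as the
recursive release described above while avoiding deep recursion on long
chains.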

=head1 IO Functions

=over 8

=item start_glob
X<start_glob>


NOTE: this function is experimental and may change or be
removed without notice.


Function called by C<do_readline> to spawn a glob (or do the glob inside
perl on VMS).  This code used to be inline, but now that perl uses
C<File::Glob>, this glob starter is only used by miniperl during the build
process, or when C<PERL_EXTERNAL_GLOB> is defined.
Moving it away shrinks F<pp_hot.c>; shrinking F<pp_hot.c> helps speed perl up.

	PerlIO*	start_glob(SV *tmpglob, IO *io)

=for hackers
Found in file doio.c


=back

=head1 Lexer interface

=over 8

=item validate_proto
X<validate_proto>


NOTE: this function is experimental and may change or be
removed without notice.


This function performs syntax checking on a prototype, C<proto>.
If C<warn> is true, any illegal characters or mismatched brackets
will trigger illegalproto warnings, declaring that they were
detected in the prototype for C<name>.

The return value is C<true> if this is a valid prototype, and
C<false> if it is not, regardless of whether C<warn> was C<true> or
C<false>.

Note that C<NULL> is a valid C<proto> and will always return C<true>.

NOTE: the perl_ form of this function is deprecated.

	bool	validate_proto(SV *name, SV *proto, bool warn)

=for hackers
Found in file toke.c


=back

=head1 Magical Functions

=over 8

=item magic_clearhint
X<magic_clearhint>

Triggered by a delete from C<%^H>, records the key to
C<PL_compiling.cop_hints_hash>.

	int	magic_clearhint(SV* sv, MAGIC* mg)

=for hackers
Found in file mg.c

=item magic_clearhints
X<magic_clearhints>

Triggered by clearing C<%^H>, resets C<PL_compiling.cop_hints_hash>.

	int	magic_clearhints(SV* sv, MAGIC* mg)

=for hackers
Found in file mg.c

=item magic_methcall
X<magic_methcall>

Invoke a magic method (like FETCH).

C<sv> and C<mg> are the tied thingy and the tie magic.

C<meth> is the name of the method to call.

C<argc> is the number of args (in addition to $self) to pass to the method.

The C<flags> can be:

    G_DISCARD     invoke method with G_DISCARD flag and don't
                  return a value
    G_UNDEF_FILL  fill the stack with argc pointers to
                  PL_sv_undef

The arguments themselves are any values following the C<flags> argument.

Returns the SV (if any) returned by the method, or C<NULL> on failure.


	SV*	magic_methcall(SV *sv, const MAGIC *mg,
		               SV *meth, U32 flags, U32 argc,
		               ...)

=for hackers
Found in file mg.c

=item magic_sethint
X<magic_sethint>

Triggered by a store to C<%^H>, records the key/value pair to
C<PL_compiling.cop_hints_hash>.  It is assumed that hints aren't storing
anything that would need a deep copy.  Maybe we should warn if we find a
reference.

	int	magic_sethint(SV* sv, MAGIC* mg)

=for hackers
Found in file mg.c

=item mg_localize
X<mg_localize>

Copy some of the magic from an existing SV to a new localized version of
that SV.  Container magic (I<e.g.>, C<%ENV>, C<$1>, C<tie>) gets copied;
value magic (I<e.g.>, C<taint>, C<pos>) doesn't.

If C<setmagic> is false then no set magic will be called on the new (empty) SV.
This typically means that assignment will soon follow (e.g. S<C<'local $x = $y'>>),
and that will handle the magic.

	void	mg_localize(SV* sv, SV* nsv, bool setmagic)

=for hackers
Found in file mg.c


=back

=head1 Miscellaneous Functions

=over 8

=item free_c_backtrace
X<free_c_backtrace>

Deallocates a backtrace received from C<get_c_backtrace>.

	void	free_c_backtrace(Perl_c_backtrace* bt)

=for hackers
Found in file util.c

=item get_c_backtrace
X<get_c_backtrace>

Collects the backtrace (aka "stacktrace") into a single linear
malloced buffer, which the caller B<must> free with
C<Perl_free_c_backtrace()>.

Scans the frames back by S<C<max_depth + skip>>, then drops the C<skip>
innermost, returning at most C<max_depth> frames.

	Perl_c_backtrace* get_c_backtrace(int max_depth,
	                                  int skip)

=for hackers
Found in file util.c


=back
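
The frame selection rule described for C<get_c_backtrace> above (scan
the depth plus the skip count, drop the innermost skipped frames, return
at most the requested depth) can be modelled on a plain array.  This
sketch does not walk a real stack; C<select_frames()> and
C<demo_select()> are hypothetical names, and the raw frame list is
assumed to be ordered innermost first.

```c
#include <assert.h>
#include <stddef.h>

/* Copy up to max_depth frames from a raw, innermost-first frame list,
 * after discarding the skip innermost entries.  Returns the count. */
static size_t select_frames(void *const *raw, size_t raw_count,
                            size_t max_depth, size_t skip,
                            void **out)
{
    size_t kept = 0;
    size_t i;

    assert(raw != NULL && out != NULL);

    /* Only the first max_depth + skip raw frames are ever examined. */
    if (raw_count > max_depth + skip)
        raw_count = max_depth + skip;

    for (i = skip; i < raw_count; i++)
        out[kept++] = raw[i];
    return kept;
}

/* Self-check with five stand-in return addresses. */
static void demo_select(void)
{
    int f0, f1, f2, f3, f4;
    void *raw[5] = { &f0, &f1, &f2, &f3, &f4 };
    void *out[5];
    size_t n;

    /* Skip the collector's own frame, then keep at most three. */
    n = select_frames(raw, 5, 3, 1, out);
    assert(n == 3);
    assert(out[0] == &f1 && out[2] == &f3);

    /* A shallow stack simply yields fewer frames. */
    n = select_frames(raw, 2, 3, 1, out);
    assert(n == 1 && out[0] == &f1);
}
```

Skipping lets a collector hide its own helper frames so that the first
reported frame is the one the caller actually cares about.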

=head1 MRO Functions

=over 8

=item mro_get_linear_isa_dfs
X<mro_get_linear_isa_dfs>

Returns the Depth-First Search linearization of C<@ISA> for
the given stash.  The return value is a read-only AV*.
C<level> should be 0 (it is used internally in this
function's recursion).

You are responsible for C<SvREFCNT_inc()> on the
return value if you plan to store it anywhere
semi-permanently (otherwise it might be deleted
out from under you the next time the cache is
invalidated).

	AV*	mro_get_linear_isa_dfs(HV* stash, U32 level)

=for hackers
Found in file mro_core.c

=item mro_isa_changed_in
X<mro_isa_changed_in>

Takes the necessary steps (cache invalidations, mostly)
when the C<@ISA> of the given package has changed.  Invoked
by the C<setisa> magic; you should not need to invoke it directly.

	void	mro_isa_changed_in(HV* stash)

=for hackers
Found in file mro_core.c

=item mro_package_moved
X<mro_package_moved>

Call this function to signal to a stash that it has been assigned to
another spot in the stash hierarchy.  C<stash> is the stash that has been
assigned.  C<oldstash> is the stash it replaces, if any.  C<gv> is the glob
that is actually being assigned to.

This can also be called with a null first argument to
indicate that C<oldstash> has been deleted.

This function invalidates isa caches on the old stash, on all subpackages
nested inside it, and on the subclasses of all those, including
non-existent packages that have corresponding entries in C<stash>.

It also sets the effective names (C<HvENAME>) on all the stashes as
appropriate.

If the C<gv> is present and is not in the symbol table, then this function
simply returns.  This check will be skipped if C<flags & 1>.

	void	mro_package_moved(HV * const stash,
		                  HV * const oldstash,
		                  const GV * const gv,
		                  U32 flags)

=for hackers
Found in file mro_core.c


=back
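
As an illustration of what a depth-first linearization of C<@ISA>
produces, the following standalone sketch walks a toy class graph.  It
is not the algorithm from F<mro_core.c>; C<struct klass>,
C<linearize_dfs()> and C<demo_linearize()> are hypothetical names, and
a class is represented only by its name and a short parent list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PARENTS 4

/* Toy class record standing in for a stash and its @ISA. */
struct klass {
    const char         *name;
    const struct klass *isa[MAX_PARENTS];  /* NULL-terminated */
};

/* Append k and then, depth first, each of its parents to out, skipping
 * any class already present.  Returns the new element count. */
static size_t linearize_dfs(const struct klass *k,
                            const struct klass **out, size_t n, size_t cap)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (out[i] == k)
            return n;              /* keep the first occurrence only */

    assert(n < cap);
    out[n++] = k;

    for (i = 0; i < MAX_PARENTS && k->isa[i] != NULL; i++)
        n = linearize_dfs(k->isa[i], out, n, cap);
    return n;
}

/* Self-check on a diamond: D inherits from B and C, both from A. */
static void demo_linearize(void)
{
    static const struct klass A = { "A", { NULL } };
    static const struct klass B = { "B", { &A, NULL } };
    static const struct klass C = { "C", { &A, NULL } };
    static const struct klass D = { "D", { &B, &C, NULL } };
    const struct klass *order[8];
    size_t n = linearize_dfs(&D, order, 0, 8);

    assert(n == 4);
    assert(strcmp(order[0]->name, "D") == 0);
    assert(strcmp(order[1]->name, "B") == 0);
    assert(strcmp(order[2]->name, "A") == 0);
    assert(strcmp(order[3]->name, "C") == 0);
}
```

In the diamond case the shared ancestor appears before the second
parent, which is the characteristic difference between a depth-first
order and the C3 order.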

=head1 Optree Manipulation Functions

=over 8

=item finalize_optree
X<finalize_optree>

This function finalizes the optree.  Should be called directly after
the complete optree is built.  It does some additional
checking which can't be done in the normal C<ck_>xxx functions and makes
the tree thread-safe.

	void	finalize_optree(OP* o)

=for hackers
Found in file op.c


=back

=head1 Pad Data Structures

=over 8

=item CX_CURPAD_SAVE
X<CX_CURPAD_SAVE>

Save the current pad in the given context block structure.

	void	CX_CURPAD_SAVE(struct context)

=for hackers
Found in file pad.h

=item CX_CURPAD_SV
X<CX_CURPAD_SV>

Access the SV at offset C<po> in the saved current pad in the given
context block structure (can be used as an lvalue).

	SV *	CX_CURPAD_SV(struct context, PADOFFSET po)

=for hackers
Found in file pad.h

=item PAD_BASE_SV
X<PAD_BASE_SV>

Get the value from slot C<po> in the base (DEPTH=1) pad of a padlist

	SV *	PAD_BASE_SV(PADLIST padlist, PADOFFSET po)

=for hackers
Found in file pad.h

=item PAD_CLONE_VARS
X<PAD_CLONE_VARS>

Clone the state variables associated with running and compiling pads.

	void	PAD_CLONE_VARS(PerlInterpreter *proto_perl,
		               CLONE_PARAMS* param)

=for hackers
Found in file pad.h

=item PAD_COMPNAME_FLAGS
X<PAD_COMPNAME_FLAGS>

Return the flags for the current compiling pad name
at offset C<po>.  Assumes a valid slot entry.

	U32	PAD_COMPNAME_FLAGS(PADOFFSET po)

=for hackers
Found in file pad.h

=item PAD_COMPNAME_GEN
X<PAD_COMPNAME_GEN>

The generation number of the name at offset C<po> in the current
compiling pad (lvalue).

	STRLEN	PAD_COMPNAME_GEN(PADOFFSET po)

=for hackers
Found in file pad.h

=item PAD_COMPNAME_GEN_set
X<PAD_COMPNAME_GEN_set>

Sets the generation number of the name at offset C<po> in the current
compiling pad (lvalue) to C<gen>.

	STRLEN	PAD_COMPNAME_GEN_set(PADOFFSET po, int gen)

=for hackers
Found in file pad.h

=item PAD_COMPNAME_OURSTASH
X<PAD_COMPNAME_OURSTASH>

Return the stash associated with an C<our> variable.
Assumes the slot entry is a valid C<our> lexical.

	HV *	PAD_COMPNAME_OURSTASH(PADOFFSET po)

=for hackers
Found in file pad.h

=item PAD_COMPNAME_PV
X<PAD_COMPNAME_PV>

Return the name of the current compiling pad name
at offset C<po>.  Assumes a valid slot entry.

	char *	PAD_COMPNAME_PV(PADOFFSET po)

=for hackers
Found in file pad.h

=item PAD_COMPNAME_TYPE
X<PAD_COMPNAME_TYPE>

Return the type (stash) of the current compiling pad name at offset
C<po>.  Must be a valid name.  Returns null if not typed.

	HV *	PAD_COMPNAME_TYPE(PADOFFSET po)

=for hackers
Found in file pad.h

=item PadnameIsOUR
X<PadnameIsOUR>

Whether this is an "our" variable.

	bool	PadnameIsOUR(PADNAME pn)

=for hackers
Found in file pad.h

=item PadnameIsSTATE
X<PadnameIsSTATE>

Whether this is a "state" variable.

	bool	PadnameIsSTATE(PADNAME pn)

=for hackers
Found in file pad.h

=item PadnameOURSTASH
X<PadnameOURSTASH>

The stash in which this "our" variable was declared.

	HV *	PadnameOURSTASH()

=for hackers
Found in file pad.h

=item PadnameOUTER
X<PadnameOUTER>

Whether this entry belongs to an outer pad.  Entries for which this is true
are often referred to as 'fake'.

	bool	PadnameOUTER(PADNAME pn)

=for hackers
Found in file pad.h

=item PadnameTYPE
X<PadnameTYPE>

The stash associated with a typed lexical.  This returns the C<%Foo::> hash
for C<my Foo $bar>.

	HV *	PadnameTYPE(PADNAME pn)

=for hackers
Found in file pad.h

=item PAD_RESTORE_LOCAL
X<PAD_RESTORE_LOCAL>

Restore the old pad saved into the local variable C<opad> by
C<PAD_SAVE_LOCAL()>.

	void	PAD_RESTORE_LOCAL(PAD *opad)

=for hackers
Found in file pad.h

=item PAD_SAVE_LOCAL
X<PAD_SAVE_LOCAL>

Save the current pad to the local variable C<opad>, then make the
current pad equal to C<npad>.

	void	PAD_SAVE_LOCAL(PAD *opad, PAD *npad)

=for hackers
Found in file pad.h

=item PAD_SAVE_SETNULLPAD
X<PAD_SAVE_SETNULLPAD>

Save the current pad then set it to null.

	void	PAD_SAVE_SETNULLPAD()

=for hackers
Found in file pad.h

=item PAD_SETSV
X<PAD_SETSV>

Set the slot at offset C<po> in the current pad to C<sv>.

	SV *	PAD_SETSV(PADOFFSET po, SV* sv)

=for hackers
Found in file pad.h

=item PAD_SET_CUR
X<PAD_SET_CUR>

Set the current pad to be pad C<n> in the padlist, saving
the previous current pad.  NB currently this macro expands to a string too
long for some compilers, so it's best to replace it with

    SAVECOMPPAD();
    PAD_SET_CUR_NOSAVE(padlist,n);


	void	PAD_SET_CUR(PADLIST padlist, I32 n)

=for hackers
Found in file pad.h

=item PAD_SET_CUR_NOSAVE
X<PAD_SET_CUR_NOSAVE>

Like C<PAD_SET_CUR>, but without the save.

	void	PAD_SET_CUR_NOSAVE(PADLIST padlist, I32 n)

=for hackers
Found in file pad.h

=item PAD_SV
X<PAD_SV>

Get the value at offset C<po> in the current pad.

	SV *	PAD_SV(PADOFFSET po)

=for hackers
Found in file pad.h

=item PAD_SVl
X<PAD_SVl>

Lightweight and lvalue version of C<PAD_SV>.
Get or set the value at offset C<po> in the current pad.
Unlike C<PAD_SV>, does not print diagnostics with -DX.
For internal use only.

	SV *	PAD_SVl(PADOFFSET po)

=for hackers
Found in file pad.h

=item SAVECLEARSV
X<SAVECLEARSV>

Clear the pointed-to pad value on scope exit (i.e. the runtime action of
C<my>).

	void	SAVECLEARSV(SV **svp)

=for hackers
Found in file pad.h

=item SAVECOMPPAD
X<SAVECOMPPAD>

Save C<PL_comppad> and C<PL_curpad>.


	void	SAVECOMPPAD()

=for hackers
Found in file pad.h

=item SAVEPADSV
X<SAVEPADSV>

Save a pad slot (used to restore after an iteration).

XXX DAPM it would make more sense to make the arg a PADOFFSET

	void	SAVEPADSV(PADOFFSET po)

=for hackers
Found in file pad.h


=back

=head1 Per-Interpreter Variables

=over 8

=item PL_DBsingle
X<PL_DBsingle>

When Perl is run in debugging mode, with the B<-d> switch, this SV is a
boolean which indicates whether subs are being single-stepped.
Single-stepping is automatically turned on after every step.  This is the C
variable which corresponds to Perl's $DB::single variable.  See
C<L</PL_DBsub>>.

	SV *	PL_DBsingle

=for hackers
Found in file intrpvar.h

=item PL_DBsub
X<PL_DBsub>

When Perl is run in debugging mode, with the B<-d> switch, this GV contains
the SV which holds the name of the sub being debugged.  This is the C
variable which corresponds to Perl's $DB::sub variable.  See
C<L</PL_DBsingle>>.

	GV *	PL_DBsub

=for hackers
Found in file intrpvar.h

=item PL_DBtrace
X<PL_DBtrace>

Trace variable used when Perl is run in debugging mode, with the B<-d>
switch.  This is the C variable which corresponds to Perl's $DB::trace
variable.  See C<L</PL_DBsingle>>.

	SV *	PL_DBtrace

=for hackers
Found in file intrpvar.h

=item PL_dowarn
X<PL_dowarn>

The C variable that roughly corresponds to Perl's C<$^W> warning variable.
However, C<$^W> is treated as a boolean, whereas C<PL_dowarn> is a
collection of flag bits.

	U8	PL_dowarn

=for hackers
Found in file intrpvar.h

=item PL_last_in_gv
X<PL_last_in_gv>

The GV which was last used for a filehandle input operation.  (C<< <FH> >>)

	GV*	PL_last_in_gv

=for hackers
Found in file intrpvar.h

=item PL_ofsgv
X<PL_ofsgv>

The glob containing the output field separator - C<*,> in Perl space.

	GV*	PL_ofsgv

=for hackers
Found in file intrpvar.h

=item PL_rs
X<PL_rs>

The input record separator - C<$/> in Perl space.

	SV*	PL_rs

=for hackers
Found in file intrpvar.h


=back

=head1 Stack Manipulation Macros

=over 8

=item djSP
X<djSP>

Declare Just C<SP>.  This is actually identical to C<dSP>, and declares
a local copy of perl's stack pointer, available via the C<SP> macro.
See C<L<perlapi/SP>>.  (Available for backward source code compatibility with
the old (Perl 5.005) thread model.)

		djSP;

=for hackers
Found in file pp.h

=item LVRET
X<LVRET>

True if this op will be the return value of an lvalue subroutine.

=for hackers
Found in file pp.h


=back

=head1 SV-Body Allocation

=over 8

=item sv_2num
X<sv_2num>


NOTE: this function is experimental and may change or be
removed without notice.


Return an SV with the numeric value of the source SV, doing any necessary
reference or overload conversion.  The caller is expected to have handled
get-magic already.

	SV*	sv_2num(SV *const sv)

=for hackers
Found in file sv.c


=back

=head1 SV Manipulation Functions

An SV (or AV, HV, etc.) is allocated in two parts: the head (struct
sv, av, hv...) contains type and reference count information, and for
many types, a pointer to the body (struct xrv, xpv, xpviv...), which
contains fields specific to each type.  Some types store all they need
in the head, so don't have a body.

In all but the most memory-paranoid configurations (ex: PURIFY), heads
and bodies are allocated out of arenas, which by default are
approximately 4K chunks of memory parcelled up into N heads or bodies.
Sv-bodies are allocated by their sv-type, guaranteeing size
consistency needed to allocate safely from arrays.

For SV-heads, the first slot in each arena is reserved, and holds a
link to the next arena, some flags, and a note of the number of slots.
Snaked through each arena chain is a linked list of free items; when
this becomes empty, an extra arena is allocated and divided up into N
items which are threaded into the free list.

SV-bodies are similar, but they use arena-sets by default, which
separate the link and info from the arena itself, and reclaim the 1st
slot in the arena.  SV-bodies are further described later.

The following global variables are associated with arenas:

 PL_sv_arenaroot     pointer to list of SV arenas
 PL_sv_root          pointer to list of free SV structures

 PL_body_arenas      head of linked-list of body arenas
 PL_body_roots[]     array of pointers to list of free bodies of svtype
                     arrays are indexed by the svtype needed

A few special SV heads are not allocated from an arena, but are
instead directly created in the interpreter structure, eg PL_sv_undef.
The size of arenas can be changed from the default by setting
PERL_ARENA_SIZE appropriately at compile time.

The SV arena serves the secondary purpose of allowing still-live SVs
to be located and destroyed during final cleanup.

At the lowest level, the macros new_SV() and del_SV() grab and free
an SV head.  (If debugging with -DD, del_SV() calls the function S_del_sv()
to return the SV to the free list with error checking.) new_SV() calls
more_sv() / sv_add_arena() to add an extra arena if the free list is empty.
SVs in the free list have their SvTYPE field set to all ones.

At the time of very final cleanup, sv_free_arenas() is called from
perl_destruct() to physically free all the arenas allocated since the
start of the interpreter.

The function visit() scans the SV arenas list, and calls a specified
function for each SV it finds which is still live - ie which has an SvTYPE
other than all 1's, and a non-zero SvREFCNT. visit() is used by the
following functions (specified as [function that calls visit()] / [function
called by visit() for each SV]):

    sv_report_used() / do_report_used()
			dump all remaining SVs (debugging aid)

    sv_clean_objs() / do_clean_objs(),do_clean_named_objs(),
		      do_clean_named_io_objs(),do_curse()
			Attempt to free all objects pointed to by RVs,
			try to do the same for all objects indir-
			ectly referenced by typeglobs too, and
			then do a final sweep, cursing any
			objects that remain.  Called once from
			perl_destruct(), prior to calling sv_clean_all()
			below.

    sv_clean_all() / do_clean_all()
			SvREFCNT_dec(sv) each remaining SV, possibly
			triggering an sv_free(). It also sets the
			SVf_BREAK flag on the SV to indicate that the
			refcnt has been artificially lowered, and thus
			stopping sv_free() from giving spurious warnings
			about SVs which unexpectedly have a refcnt
			of zero.  Called repeatedly from perl_destruct()
			until there are no SVs left.


=over 8

=item sv_add_arena
X<sv_add_arena>

Given a chunk of memory, link it to the head of the list of arenas,
and split it into a list of free SVs.

	void	sv_add_arena(char *const ptr, const U32 size,
		             const U32 flags)

=for hackers
Found in file sv.c

=item sv_clean_all
X<sv_clean_all>

Decrement the refcnt of each remaining SV, possibly triggering a
cleanup.  This function may have to be called multiple times to free
SVs which are in complex self-referential hierarchies.

	I32	sv_clean_all()

=for hackers
Found in file sv.c

=item sv_clean_objs
X<sv_clean_objs>

Attempt to destroy all objects not yet freed.

	void	sv_clean_objs()

=for hackers
Found in file sv.c

=item sv_free_arenas
X<sv_free_arenas>

Deallocate the memory used by all arenas.  Note that all the individual SV
heads and bodies within the arenas must already have been freed.

	void	sv_free_arenas()

=for hackers
Found in file sv.c

=item SvTHINKFIRST
X<SvTHINKFIRST>

A quick flag check to see whether an C<sv> should be passed to C<sv_force_normal>
to be "downgraded" before C<SvIVX> or C<SvPVX> can be modified directly.

For example, if your scalar is a reference and you want to modify the C<SvIVX>
slot, you can't just do C<SvROK_off>, as that will leak the referent.

This is used internally by various sv-modifying functions, such as
C<sv_setsv>, C<sv_setiv> and C<sv_pvn_force>.

One case that this does not handle is a gv without SvFAKE set.  After

    if (SvTHINKFIRST(gv)) sv_force_normal(gv);

it will still be a gv.

C<SvTHINKFIRST> sometimes produces false positives.  In those cases
C<sv_force_normal> does nothing.

	U32	SvTHINKFIRST(SV *sv)

=for hackers
Found in file sv.h


=back

=head1 Unicode Support

=over 8

=item find_uninit_var
X<find_uninit_var>


NOTE: this function is experimental and may change or be
removed without notice.


Find the name of the undefined variable (if any) that caused the operator
to issue a "Use of uninitialized value" warning.
If match is true, only return a name if its value matches C<uninit_sv>.
So roughly speaking, if a unary operator (such as C<OP_COS>) generates a
warning, then following the direct child of the op may yield an
C<OP_PADSV> or C<OP_GV> that gives the name of the undefined variable.  On the
other hand, with C<OP_ADD> there are two branches to follow, so we only print
the variable name if we get an exact match.
C<desc_p> points to a string pointer holding the description of the op.
This may be updated if needed.

The name is returned as a mortal SV.

Assumes that C<PL_op> is the OP that originally triggered the error, and that
C<PL_comppad>/C<PL_curpad> points to the currently executing pad.

	SV*	find_uninit_var(const OP *const obase,
		                const SV *const uninit_sv,
		                bool match, const char **desc_p)

=for hackers
Found in file sv.c

=item report_uninit
X<report_uninit>

Print the appropriate "Use of uninitialized value" warning.

	void	report_uninit(const SV *uninit_sv)

=for hackers
Found in file sv.c


=back

=head1 Undocumented functions

The following functions are currently undocumented.  If you use one of
them, you may wish to consider creating and submitting documentation for
it.

=over

=item PerlIO_restore_errno
X<PerlIO_restore_errno>

=item PerlIO_save_errno
X<PerlIO_save_errno>

=item Slab_Alloc
X<Slab_Alloc>

=item Slab_Free
X<Slab_Free>

=item Slab_to_ro
X<Slab_to_ro>

=item Slab_to_rw
X<Slab_to_rw>

=item _add_range_to_invlist
X<_add_range_to_invlist>

=item _byte_dump_string
X<_byte_dump_string>

=item _core_swash_init
X<_core_swash_init>

=item _get_regclass_nonbitmap_data
X<_get_regclass_nonbitmap_data>

=item _get_swash_invlist
X<_get_swash_invlist>

=item _invlistEQ
X<_invlistEQ>

=item _invlist_array_init
X<_invlist_array_init>

=item _invlist_contains_cp
X<_invlist_contains_cp>

=item _invlist_dump
X<_invlist_dump>

=item _invlist_intersection
X<_invlist_intersection>

=item _invlist_intersection_maybe_complement_2nd
X<_invlist_intersection_maybe_complement_2nd>

=item _invlist_invert
X<_invlist_invert>

=item _invlist_len
X<_invlist_len>

=item _invlist_populate_swatch
X<_invlist_populate_swatch>

=item _invlist_search
X<_invlist_search>

=item _invlist_subtract
X<_invlist_subtract>

=item _invlist_union
X<_invlist_union>

=item _invlist_union_maybe_complement_2nd
X<_invlist_union_maybe_complement_2nd>

=item _is_grapheme
X<_is_grapheme>

=item _load_PL_utf8_foldclosures
X<_load_PL_utf8_foldclosures>

=item _mem_collxfrm
X<_mem_collxfrm>

=item _new_invlist
X<_new_invlist>

=item _new_invlist_C_array
X<_new_invlist_C_array>

=item _setup_canned_invlist
X<_setup_canned_invlist>

=item _swash_inversion_hash
X<_swash_inversion_hash>

=item _swash_to_invlist
X<_swash_to_invlist>

=item _to_fold_latin1
X<_to_fold_latin1>

=item _to_upper_title_latin1
X<_to_upper_title_latin1>

=item _warn_problematic_locale
X<_warn_problematic_locale>

=item abort_execution
X<abort_execution>

=item add_cp_to_invlist
X<add_cp_to_invlist>

=item alloc_LOGOP
X<alloc_LOGOP>

=item alloc_maybe_populate_EXACT
X<alloc_maybe_populate_EXACT>

=item allocmy
X<allocmy>

=item amagic_is_enabled
X<amagic_is_enabled>

=item append_utf8_from_native_byte
X<append_utf8_from_native_byte>

=item apply
X<apply>

=item av_extend_guts
X<av_extend_guts>

=item av_reify
X<av_reify>

=item bind_match
X<bind_match>

=item boot_core_PerlIO
X<boot_core_PerlIO>

=item boot_core_UNIVERSAL
X<boot_core_UNIVERSAL>

=item boot_core_mro
X<boot_core_mro>

=item cando
X<cando>

=item check_utf8_print
X<check_utf8_print>

=item ck_anoncode
X<ck_anoncode>

=item ck_backtick
X<ck_backtick>

=item ck_bitop
X<ck_bitop>

=item ck_cmp
X<ck_cmp>

=item ck_concat
X<ck_concat>

=item ck_defined
X<ck_defined>

=item ck_delete
X<ck_delete>

=item ck_each
X<ck_each>

=item ck_entersub_args_core
X<ck_entersub_args_core>

=item ck_eof
X<ck_eof>

=item ck_eval
X<ck_eval>

=item ck_exec
X<ck_exec>

=item ck_exists
X<ck_exists>

=item ck_ftst
X<ck_ftst>

=item ck_fun
X<ck_fun>

=item ck_glob
X<ck_glob>

=item ck_grep
X<ck_grep>

=item ck_index
X<ck_index>

=item ck_join
X<ck_join>

=item ck_length
X<ck_length>

=item ck_lfun
X<ck_lfun>

=item ck_listiob
X<ck_listiob>

=item ck_match
X<ck_match>

=item ck_method
X<ck_method>

=item ck_null
X<ck_null>

=item ck_open
X<ck_open>

=item ck_prototype
X<ck_prototype>

=item ck_readline
X<ck_readline>

=item ck_refassign
X<ck_refassign>

=item ck_repeat
X<ck_repeat>

=item ck_require
X<ck_require>

=item ck_return
X<ck_return>

=item ck_rfun
X<ck_rfun>

=item ck_rvconst
X<ck_rvconst>

=item ck_sassign
X<ck_sassign>

=item ck_select
X<ck_select>

=item ck_shift
X<ck_shift>

=item ck_smartmatch
X<ck_smartmatch>

=item ck_sort
X<ck_sort>

=item ck_spair
X<ck_spair>

=item ck_split
X<ck_split>

=item ck_stringify
X<ck_stringify>

=item ck_subr
X<ck_subr>

=item ck_substr
X<ck_substr>

=item ck_svconst
X<ck_svconst>

=item ck_tell
X<ck_tell>

=item ck_trunc
X<ck_trunc>

=item closest_cop
X<closest_cop>

=item compute_EXACTish
X<compute_EXACTish>

=item coresub_op
X<coresub_op>

=item create_eval_scope
X<create_eval_scope>

=item croak_caller
X<croak_caller>

=item croak_no_mem
X<croak_no_mem>

=item croak_popstack
X<croak_popstack>

=item current_re_engine
X<current_re_engine>

=item custom_op_get_field
X<custom_op_get_field>

=item cv_ckproto_len_flags
X<cv_ckproto_len_flags>

=item cv_clone_into
X<cv_clone_into>

=item cv_const_sv_or_av
X<cv_const_sv_or_av>

=item cv_undef_flags
X<cv_undef_flags>

=item cvgv_from_hek
X<cvgv_from_hek>

=item cvgv_set
X<cvgv_set>

=item cvstash_set
X<cvstash_set>

=item deb_stack_all
X<deb_stack_all>

=item defelem_target
X<defelem_target>

=item delete_eval_scope
X<delete_eval_scope>

=item delimcpy_no_escape
X<delimcpy_no_escape>

=item die_unwind
X<die_unwind>

=item do_aexec
X<do_aexec>

=item do_aexec5
X<do_aexec5>

=item do_eof
X<do_eof>

=item do_exec
X<do_exec>

=item do_exec3
X<do_exec3>

=item do_execfree
X<do_execfree>

=item do_ipcctl
X<do_ipcctl>

=item do_ipcget
X<do_ipcget>

=item do_msgrcv
X<do_msgrcv>

=item do_msgsnd
X<do_msgsnd>

=item do_ncmp
X<do_ncmp>

=item do_open6
X<do_open6>

=item do_open_raw
X<do_open_raw>

=item do_print
X<do_print>

=item do_readline
X<do_readline>

=item do_seek
X<do_seek>

=item do_semop
X<do_semop>

=item do_shmio
X<do_shmio>

=item do_sysseek
X<do_sysseek>

=item do_tell
X<do_tell>

=item do_trans
X<do_trans>

=item do_vecget
X<do_vecget>

=item do_vecset
X<do_vecset>

=item do_vop
X<do_vop>

=item does_utf8_overflow
X<does_utf8_overflow>

=item dofile
X<dofile>

=item drand48_init_r
X<drand48_init_r>

=item drand48_r
X<drand48_r>

=item dtrace_probe_call
X<dtrace_probe_call>

=item dtrace_probe_load
X<dtrace_probe_load>

=item dtrace_probe_op
X<dtrace_probe_op>

=item dtrace_probe_phase
X<dtrace_probe_phase>

=item dump_all_perl
X<dump_all_perl>

=item dump_packsubs_perl
X<dump_packsubs_perl>

=item dump_sub_perl
X<dump_sub_perl>

=item dump_sv_child
X<dump_sv_child>

=item emulate_cop_io
X<emulate_cop_io>

=item feature_is_enabled
X<feature_is_enabled>

=item find_lexical_cv
X<find_lexical_cv>

=item find_runcv_where
X<find_runcv_where>

=item find_script
X<find_script>

=item form_short_octal_warning
X<form_short_octal_warning>

=item free_tied_hv_pool
X<free_tied_hv_pool>

=item get_db_sub
X<get_db_sub>

=item get_debug_opts
X<get_debug_opts>

=item get_hash_seed
X<get_hash_seed>

=item get_invlist_iter_addr
X<get_invlist_iter_addr>

=item get_invlist_offset_addr
X<get_invlist_offset_addr>

=item get_invlist_previous_index_addr
X<get_invlist_previous_index_addr>

=item get_no_modify
X<get_no_modify>

=item get_opargs
X<get_opargs>

=item get_re_arg
X<get_re_arg>

=item getenv_len
X<getenv_len>

=item grok_atoUV
X<grok_atoUV>

=item grok_bslash_c
X<grok_bslash_c>

=item grok_bslash_o
X<grok_bslash_o>

=item grok_bslash_x
X<grok_bslash_x>

=item gv_fetchmeth_internal
X<gv_fetchmeth_internal>

=item gv_override
X<gv_override>

=item gv_setref
X<gv_setref>

=item gv_stashpvn_internal
X<gv_stashpvn_internal>

=item gv_stashsvpvn_cached
X<gv_stashsvpvn_cached>

=item handle_named_backref
X<handle_named_backref>

=item hfree_next_entry
X<hfree_next_entry>

=item hv_backreferences_p
X<hv_backreferences_p>

=item hv_kill_backrefs
X<hv_kill_backrefs>

=item hv_placeholders_p
X<hv_placeholders_p>

=item hv_undef_flags
X<hv_undef_flags>

=item init_argv_symbols
X<init_argv_symbols>

=item init_constants
X<init_constants>

=item init_dbargs
X<init_dbargs>

=item init_debugger
X<init_debugger>

=item invert
X<invert>

=item invlist_array
X<invlist_array>

=item invlist_clear
X<invlist_clear>

=item invlist_clone
X<invlist_clone>

=item invlist_highest
X<invlist_highest>

=item invlist_is_iterating
X<invlist_is_iterating>

=item invlist_iterfinish
X<invlist_iterfinish>

=item invlist_iterinit
X<invlist_iterinit>

=item invlist_max
X<invlist_max>

=item invlist_previous_index
X<invlist_previous_index>

=item invlist_set_len
X<invlist_set_len>

=item invlist_set_previous_index
X<invlist_set_previous_index>

=item invlist_trim
X<invlist_trim>

=item io_close
X<io_close>

=item isFF_OVERLONG
X<isFF_OVERLONG>

=item isFOO_lc
X<isFOO_lc>

=item is_utf8_common
X<is_utf8_common>

=item is_utf8_common_with_len
X<is_utf8_common_with_len>

=item is_utf8_cp_above_31_bits
X<is_utf8_cp_above_31_bits>

=item is_utf8_overlong_given_start_byte_ok
X<is_utf8_overlong_given_start_byte_ok>

=item isinfnansv
X<isinfnansv>

=item jmaybe
X<jmaybe>

=item keyword
X<keyword>

=item keyword_plugin_standard
X<keyword_plugin_standard>

=item list
X<list>

=item localize
X<localize>

=item magic_clear_all_env
X<magic_clear_all_env>

=item magic_cleararylen_p
X<magic_cleararylen_p>

=item magic_clearenv
X<magic_clearenv>

=item magic_clearisa
X<magic_clearisa>

=item magic_clearpack
X<magic_clearpack>

=item magic_clearsig
X<magic_clearsig>

=item magic_copycallchecker
X<magic_copycallchecker>

=item magic_existspack
X<magic_existspack>

=item magic_freearylen_p
X<magic_freearylen_p>

=item magic_freeovrld
X<magic_freeovrld>

=item magic_get
X<magic_get>

=item magic_getarylen
X<magic_getarylen>

=item magic_getdebugvar
X<magic_getdebugvar>

=item magic_getdefelem
X<magic_getdefelem>

=item magic_getnkeys
X<magic_getnkeys>

=item magic_getpack
X<magic_getpack>

=item magic_getpos
X<magic_getpos>

=item magic_getsig
X<magic_getsig>

=item magic_getsubstr
X<magic_getsubstr>

=item magic_gettaint
X<magic_gettaint>

=item magic_getuvar
X<magic_getuvar>

=item magic_getvec
X<magic_getvec>

=item magic_killbackrefs
X<magic_killbackrefs>

=item magic_nextpack
X<magic_nextpack>

=item magic_regdata_cnt
X<magic_regdata_cnt>

=item magic_regdatum_get
X<magic_regdatum_get>

=item magic_regdatum_set
X<magic_regdatum_set>

=item magic_scalarpack
X<magic_scalarpack>

=item magic_set
X<magic_set>

=item magic_set_all_env
X<magic_set_all_env>

=item magic_setarylen
X<magic_setarylen>

=item magic_setcollxfrm
X<magic_setcollxfrm>

=item magic_setdbline
X<magic_setdbline>

=item magic_setdebugvar
X<magic_setdebugvar>

=item magic_setdefelem
X<magic_setdefelem>

=item magic_setenv
X<magic_setenv>

=item magic_setisa
X<magic_setisa>

=item magic_setlvref
X<magic_setlvref>

=item magic_setmglob
X<magic_setmglob>

=item magic_setnkeys
X<magic_setnkeys>

=item magic_setpack
X<magic_setpack>

=item magic_setpos
X<magic_setpos>

=item magic_setregexp
X<magic_setregexp>

=item magic_setsig
X<magic_setsig>

=item magic_setsubstr
X<magic_setsubstr>

=item magic_settaint
X<magic_settaint>

=item magic_setutf8
X<magic_setutf8>

=item magic_setuvar
X<magic_setuvar>

=item magic_setvec
X<magic_setvec>

=item magic_sizepack
X<magic_sizepack>

=item magic_wipepack
X<magic_wipepack>

=item malloc_good_size
X<malloc_good_size>

=item malloced_size
X<malloced_size>

=item mem_collxfrm
X<mem_collxfrm>

=item mem_log_alloc
X<mem_log_alloc>

=item mem_log_free
X<mem_log_free>

=item mem_log_realloc
X<mem_log_realloc>

=item mg_find_mglob
X<mg_find_mglob>

=item mode_from_discipline
X<mode_from_discipline>

=item more_bodies
X<more_bodies>

=item mro_meta_dup
X<mro_meta_dup>

=item mro_meta_init
X<mro_meta_init>

=item multideref_stringify
X<multideref_stringify>

=item my_attrs
X<my_attrs>

=item my_clearenv
X<my_clearenv>

=item my_lstat_flags
X<my_lstat_flags>

=item my_stat_flags
X<my_stat_flags>

=item my_unexec
X<my_unexec>

=item newATTRSUB_x
X<newATTRSUB_x>

=item newGP
X<newGP>

=item newMETHOP_internal
X<newMETHOP_internal>

=item newSTUB
X<newSTUB>

=item newSVavdefelem
X<newSVavdefelem>

=item newXS_deffile
X<newXS_deffile>

=item newXS_len_flags
X<newXS_len_flags>

=item new_warnings_bitfield
X<new_warnings_bitfield>

=item nextargv
X<nextargv>

=item noperl_die
X<noperl_die>

=item notify_parser_that_changed_to_utf8
X<notify_parser_that_changed_to_utf8>

=item oopsAV
X<oopsAV>

=item oopsHV
X<oopsHV>

=item op_clear
X<op_clear>

=item op_integerize
X<op_integerize>

=item op_lvalue_flags
X<op_lvalue_flags>

=item op_refcnt_dec
X<op_refcnt_dec>

=item op_refcnt_inc
X<op_refcnt_inc>

=item op_relocate_sv
X<op_relocate_sv>

=item op_std_init
X<op_std_init>

=item op_unscope
X<op_unscope>

=item opmethod_stash
X<opmethod_stash>

=item opslab_force_free
X<opslab_force_free>

=item opslab_free
X<opslab_free>

=item opslab_free_nopad
X<opslab_free_nopad>

=item package
X<package>

=item package_version
X<package_version>

=item pad_add_weakref
X<pad_add_weakref>

=item padlist_store
X<padlist_store>

=item padname_free
X<padname_free>

=item padnamelist_free
X<padnamelist_free>

=item parse_unicode_opts
X<parse_unicode_opts>

=item parser_free
X<parser_free>

=item parser_free_nexttoke_ops
X<parser_free_nexttoke_ops>

=item path_is_searchable
X<path_is_searchable>

=item peep
X<peep>

=item pmruntime
X<pmruntime>

=item populate_isa
X<populate_isa>

=item ptr_hash
X<ptr_hash>

=item qerror
X<qerror>

=item re_exec_indentf
X<re_exec_indentf>

=item re_indentf
X<re_indentf>

=item re_op_compile
X<re_op_compile>

=item re_printf
X<re_printf>

=item reg_named_buff
X<reg_named_buff>

=item reg_named_buff_iter
X<reg_named_buff_iter>

=item reg_numbered_buff_fetch
X<reg_numbered_buff_fetch>

=item reg_numbered_buff_length
X<reg_numbered_buff_length>

=item reg_numbered_buff_store
X<reg_numbered_buff_store>

=item reg_qr_package
X<reg_qr_package>

=item reg_skipcomment
X<reg_skipcomment>

=item reg_temp_copy
X<reg_temp_copy>

=item regcurly
X<regcurly>

=item regprop
X<regprop>

=item report_evil_fh
X<report_evil_fh>

=item report_redefined_cv
X<report_redefined_cv>

=item report_wrongway_fh
X<report_wrongway_fh>

=item rpeep
X<rpeep>

=item rsignal_restore
X<rsignal_restore>

=item rsignal_save
X<rsignal_save>

=item rxres_save
X<rxres_save>

=item same_dirent
X<same_dirent>

=item save_strlen
X<save_strlen>

=item sawparens
X<sawparens>

=item scalar
X<scalar>

=item scalarvoid
X<scalarvoid>

=item set_caret_X
X<set_caret_X>

=item set_padlist
X<set_padlist>

=item should_warn_nl
X<should_warn_nl>

=item sighandler
X<sighandler>

=item softref2xv
X<softref2xv>

=item ssc_add_range
X<ssc_add_range>

=item ssc_clear_locale
X<ssc_clear_locale>

=item ssc_cp_and
X<ssc_cp_and>

=item ssc_intersection
X<ssc_intersection>

=item ssc_union
X<ssc_union>

=item sub_crush_depth
X<sub_crush_depth>

=item sv_add_backref
X<sv_add_backref>

=item sv_buf_to_ro
X<sv_buf_to_ro>

=item sv_del_backref
X<sv_del_backref>

=item sv_free2
X<sv_free2>

=item sv_kill_backrefs
X<sv_kill_backrefs>

=item sv_len_utf8_nomg
X<sv_len_utf8_nomg>

=item sv_magicext_mglob
X<sv_magicext_mglob>

=item sv_mortalcopy_flags
X<sv_mortalcopy_flags>

=item sv_only_taint_gmagic
X<sv_only_taint_gmagic>

=item sv_or_pv_pos_u2b
X<sv_or_pv_pos_u2b>

=item sv_resetpvn
X<sv_resetpvn>

=item sv_sethek
X<sv_sethek>

=item sv_setsv_cow
X<sv_setsv_cow>

=item sv_unglob
X<sv_unglob>

=item swash_fetch
X<swash_fetch>

=item swash_init
X<swash_init>

=item tied_method
X<tied_method>

=item tmps_grow_p
X<tmps_grow_p>

=item translate_substr_offsets
X<translate_substr_offsets>

=item try_amagic_bin
X<try_amagic_bin>

=item try_amagic_un
X<try_amagic_un>

=item unshare_hek
X<unshare_hek>

=item utilize
X<utilize>

=item varname
X<varname>

=item vivify_defelem
X<vivify_defelem>

=item vivify_ref
X<vivify_ref>

=item wait4pid
X<wait4pid>

=item was_lvalue_sub
X<was_lvalue_sub>

=item watch
X<watch>

=item win32_croak_not_implemented
X<win32_croak_not_implemented>

=item write_to_stderr
X<write_to_stderr>

=item xs_boot_epilog
X<xs_boot_epilog>

=item xs_handshake
X<xs_handshake>

=item yyerror
X<yyerror>

=item yyerror_pv
X<yyerror_pv>

=item yyerror_pvn
X<yyerror_pvn>

=item yylex
X<yylex>

=item yyparse
X<yyparse>

=item yyquit
X<yyquit>

=item yyunlex
X<yyunlex>

=back


=head1 AUTHORS

The autodocumentation system was originally added to the Perl core by
Benjamin Stuhl.  Documentation is by whoever was kind enough to
document their functions.

=head1 SEE ALSO

L<perlguts>, L<perlapi>

=cut

ex: set ro:
=head1 NAME

perlunitut - Perl Unicode Tutorial

=head1 DESCRIPTION

The days of just flinging strings around are over. It's well established that
modern programs need to be capable of communicating funny accented letters, and
things like euro symbols. This means that programmers need new habits. It's
easy to program Unicode capable software, but it does require discipline to do
it right.

There's a lot to know about character sets, and text encodings. It's probably
best to spend a full day learning all this, but the basics can be learned in
minutes. 

These are not the very basics, though. It is assumed that you already
know the difference between bytes and characters, and realise (and accept!)
that there are many different character sets and encodings, and that your
program has to be explicit about them. Recommended reading is "The Absolute
Minimum Every Software Developer Absolutely, Positively Must Know About Unicode
and Character Sets (No Excuses!)" by Joel Spolsky, at
L<http://joelonsoftware.com/articles/Unicode.html>.

This tutorial speaks in rather absolute terms, and provides only a limited view
of the wealth of character string related features that Perl has to offer. For
most projects, this information will probably suffice.

=head2 Definitions

It's important to set a few things straight first. This is the most important
part of this tutorial. This view may conflict with other information that you
may have found on the web, but that's mostly because many sources are wrong.

You may have to re-read this entire section a few times...

=head3 Unicode

B<Unicode> is a character set with room for lots of characters. The ordinal
value of a character is called a B<code point>.   (But in practice, the
distinction between code point and character is blurred, so the terms often
are used interchangeably.)

There are many, many code points, but computers work with bytes, and a byte has
room for only 256 values.  Unicode has many more characters than that,
so you need a method to make these accessible.

Unicode is encoded using several competing encodings, of which UTF-8 is the
most used. In a Unicode encoding, multiple subsequent bytes can be used to
store a single code point, or simply: character.

=head3 UTF-8

B<UTF-8> is a Unicode encoding. Many people think that Unicode and UTF-8 are
the same thing, but they're not. There are more Unicode encodings, but much of
the world has standardized on UTF-8. 

UTF-8 treats the first 128 codepoints, 0..127, the same as ASCII. They take
only one byte per character. All other characters are encoded as two to
four bytes using a complex scheme. Fortunately, Perl handles this for
us, so we don't have to worry about this.
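
As a rough illustration of the variable width, the following sketch encodes
one character from each length class and counts the resulting bytes. The
sample characters are chosen here for illustration and are not part of the
scheme itself.

    use strict;
    use warnings;
    use Encode qw(encode);

    # One sample character per UTF-8 length class:
    # U+0041, U+00E9, U+20AC and U+1F600.
    my @samples = ("A", "\x{E9}", "\x{20AC}", "\x{1F600}");

    # Encode each character on its own and count the bytes produced.
    my @byte_counts = map { length encode('UTF-8', $_) } @samples;

    printf "U+%04X takes %d byte(s)\n", ord $samples[$_], $byte_counts[$_]
        for 0 .. $#samples;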

=head3 Text strings (character strings)

B<Text strings>, or B<character strings> are made of characters. Bytes are
irrelevant here, and so are encodings. Each character is just that: the
character.

On a text string, you would do things like:

    $text =~ s/foo/bar/;
    if ($string =~ /^\d+$/) { ... }
    $text = ucfirst $text;
    my $character_count = length $text;

The value of a character (C<ord>, C<chr>) is the corresponding Unicode code
point.
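
A small sketch of that correspondence, using variable names invented for
this example:

    use strict;
    use warnings;

    my $euro       = chr 0x20AC;         # the character EURO SIGN
    my $code_point = ord $euro;          # the number 0x20AC again
    my $count      = length "caf\x{E9}"; # counts characters, not bytes

    printf "U+%04X, %d characters\n", $code_point, $count;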

=head3 Binary strings (byte strings)

B<Binary strings>, or B<byte strings> are made of bytes. Here, you don't have
characters, just bytes. All communication with the outside world (anything
outside of your current Perl process) is done in binary.

On a binary string, you would do things like:

    my (@length_content) = unpack "(V/a)*", $binary;
    $binary =~ s/\x00\x0F/\xFF\xF0/;  # for the brave :)
    print {$fh} $binary;
    my $byte_count = length $binary;

=head3 Encoding

B<Encoding> (as a verb) is the conversion from I<text> to I<binary>. To encode,
you have to supply the target encoding, for example C<iso-8859-1> or C<UTF-8>.
Some encodings, like the C<iso-8859> ("latin") range, do not support the full
Unicode standard; characters that can't be represented are lost in the
conversion.
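
The loss can be seen in a short sketch. It assumes the default behaviour of
C<encode>, which substitutes a replacement character when no check argument
is given; the sample string is made up for this example.

    use strict;
    use warnings;
    use Encode qw(encode);

    my $text   = "price: \x{20AC}5";           # contains EURO SIGN
    my $latin1 = encode('iso-8859-1', $text);  # no euro in Latin-1
    my $utf8   = encode('UTF-8', $text);       # euro becomes 3 bytes

    # In $latin1 the euro has been replaced by "?" and cannot be
    # recovered; $utf8 keeps it, at the cost of two extra bytes.
    printf "%d bytes as Latin-1, %d bytes as UTF-8\n",
        length $latin1, length $utf8;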

=head3 Decoding

B<Decoding> is the conversion from I<binary> to I<text>. To decode, you have to
know what encoding was used during the encoding phase. And most of all, it must
be something decodable. It doesn't make much sense to decode a PNG image into a
text string.
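
By default C<decode> replaces undecodable bytes with a substitution
character. If you would rather find out that the input was not decodable,
one option is the C<FB_CROAK> check from L<Encode>, sketched below with
made-up input data:

    use strict;
    use warnings;
    use Encode qw(decode);

    my $bytes = "caf\xC3\xA9";            # UTF-8 bytes of a 4-letter word
    my $text  = decode('UTF-8', $bytes);  # now 4 characters

    # FB_CROAK makes decode die on malformed input; LEAVE_SRC keeps
    # it from modifying the variable that holds the input.
    my $bad = "\xFF\xFE not UTF-8";
    my $ok  = eval {
        decode('UTF-8', $bad,
            Encode::FB_CROAK() | Encode::LEAVE_SRC());
        1;
    };
    print "input was not valid UTF-8\n" unless $ok;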

=head3 Internal format

Perl has an B<internal format>, an encoding that it uses to encode text strings
so it can store them in memory. All text strings are in this internal format.
In fact, text strings are never in any other format!

You shouldn't worry about what this format is, because conversion is
automatically done when you decode or encode.

=head2 Your new toolkit

Add to your standard heading the following line:

    use Encode qw(encode decode);

Or, if you're lazy, just:

    use Encode;

=head2 I/O flow (the actual 5 minute tutorial)

The typical input/output flow of a program is:

    1. Receive and decode
    2. Process
    3. Encode and output

If your input is binary, and is supposed to remain binary, you shouldn't decode
it to a text string, of course. But in all other cases, you should decode it.

Decoding can't happen reliably if you don't know how the data was encoded. If
you get to choose, it's a good idea to standardize on UTF-8.

    my $foo   = decode('UTF-8', get 'http://example.com/');
    my $bar   = decode('ISO-8859-1', readline STDIN);
    my $xyzzy = decode('Windows-1251', $cgi->param('foo'));

Processing happens just as it did before. The only difference is that you're
now using characters instead of bytes. That's very useful if you use things
like C<substr> or C<length>.

It's important to realize that there are no bytes in a text string. Of course,
Perl has its internal encoding to store the string in memory, but ignore that.
If you have to do anything with the number of bytes, it's probably best to move
that part to step 3, just after you've encoded the string. Then you know
exactly how many bytes it will be in the destination string.

The syntax for encoding text strings to binary strings is as simple as decoding:

    $body = encode('UTF-8', $body);

If you needed to know the length of the string in bytes, now's the perfect time
for that. Because C<$body> is now a byte string, C<length> will report the
number of bytes, instead of the number of characters. The number of
characters is no longer known, because characters only exist in text strings.

    my $byte_count = length $body;
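
Putting the three steps together, a complete run might look like the sketch
below. The input bytes are hard-coded instead of being read from a real
source, and C<ucfirst> merely stands in for whatever processing a real
program would do; both are assumptions made for the example.

    use strict;
    use warnings;
    use Encode qw(encode decode);

    # 1. Receive and decode.  These bytes are the UTF-8 encoding of
    #    "naive cafe" with accents on the "i" and the final "e".
    my $input = "na\xC3\xAFve caf\xC3\xA9\n";
    my $text  = decode('UTF-8', $input);

    # 2. Process, working on characters.
    chomp $text;
    $text = ucfirst $text;
    my $character_count = length $text;

    # 3. Encode and output.
    my $body       = encode('UTF-8', $text);
    my $byte_count = length $body;

    printf "%d characters, %d bytes\n", $character_count, $byte_count;

The two counts differ because each accented character occupies two bytes in
UTF-8 but is still a single character in the text string.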

And if the protocol you're using supports a way of letting the recipient know
which character encoding you used, please help the receiving end by using that
feature! For example, E-mail and HTTP support MIME headers, so you can use the
C<Content-Type> header. They can also have C<Content-Length> to indicate the
number of I<bytes>, which is always a good idea to supply if the number is
known.

    "Content-Type: text/plain; charset=UTF-8",
    "Content-Length: $byte_count"

=head1 SUMMARY

Decode everything you receive, encode everything you send out. (If it's text
data.)

=head1 Q and A (or FAQ)

After reading this document, you ought to read L<perlunifaq> too, then
L<perluniintro>.

=head1 ACKNOWLEDGEMENTS

Thanks to Johan Vromans from Squirrel Consultancy. His UTF-8 rants during the
Amsterdam Perl Mongers meetings got me interested and determined to find out
how to use character encodings in Perl in ways that don't break easily.

Thanks to Gerard Goossen from TTY. His presentation "UTF-8 in the wild" (Dutch
Perl Workshop 2006) inspired me to publish my thoughts and write this tutorial.

Thanks to the people who asked about this kind of stuff in several Perl IRC
channels, and have constantly reminded me that a simpler explanation was
needed.

Thanks to the people who reviewed this document for me, before it went public.
They are: Benjamin Smith, Jan-Pieter Cornet, Johan Vromans, Lukas Mai, Nathan
Gray.

=head1 AUTHOR

Juerd Waalboer <#####@juerd.nl>

=head1 SEE ALSO

L<perlunifaq>, L<perlunicode>, L<perluniintro>, L<Encode>

=encoding utf8

=head1 NAME

perlhist - the Perl history records

=head1 DESCRIPTION

This document aims to record the Perl source code releases.

=head1 INTRODUCTION

Perl history in brief, by Larry Wall:

   Perl 0 introduced Perl to my officemates.
   Perl 1 introduced Perl to the world, and changed /\(...\|...\)/ to
       /(...|...)/.  \(Dan Faigin still hasn't forgiven me. :-\)
   Perl 2 introduced Henry Spencer's regular expression package.
   Perl 3 introduced the ability to handle binary data (embedded nulls).
   Perl 4 introduced the first Camel book.  Really.  We mostly just
       switched version numbers so the book could refer to 4.000.
   Perl 5 introduced everything else, including the ability to
       introduce everything else.

=head1 THE KEEPERS OF THE PUMPKIN

Larry Wall, Andy Dougherty, Tom Christiansen, Charles Bailey, Nick
Ing-Simmons, Chip Salzenberg, Tim Bunce, Malcolm Beattie, Gurusamy
Sarathy, Graham Barr, Jarkko Hietaniemi, Hugo van der Sanden,
Michael Schwern, Rafael Garcia-Suarez, Nicholas Clark, Richard Clamp,
Leon Brocard, Dave Mitchell, Jesse Vincent, Ricardo Signes, Steve Hay,
Matt S Trout, David Golden, Florian Ragwitz, Tatsuhiko Miyagawa,
Chris C<BinGOs> Williams, Zefram, Ævar Arnfjörð Bjarmason, Stevan
Little, Dave Rolsky, Max Maischein, Abigail, Jesse Luehrs, Tony Cook,
Dominic Hargreaves, Aaron Crane, Aristotle Pagaltzis, Matthew Horsfall,
Peter Martini, Sawyer X, Chad 'Exodist' Granum, Renee Bäcker, Eric Herman,
John SJ Anderson, and Karen Etheridge.

=head2 PUMPKIN?

[from Porting/pumpkin.pod in the Perl source code distribution]

=for disclaimer orking cows is hazardous, and not legal in all jurisdictions

Chip Salzenberg gets credit for that, with a nod to his cow orker,
David Croy.  We had passed around various names (baton, token, hot
potato) but none caught on.  Then, Chip asked:

[begin quote]

   Who has the patch pumpkin?

To explain:  David Croy once told me that at a previous job,
there was one tape drive and multiple systems that used it for backups.
But instead of some high-tech exclusion software, they used a low-tech
method to prevent multiple simultaneous backups: a stuffed pumpkin.
No one was allowed to make backups unless they had the "backup pumpkin".

[end quote]

The name has stuck.  The holder of the pumpkin is sometimes called
the pumpking (keeping the source afloat?) or the pumpkineer (pulling
the strings?).

=head1 THE RECORDS

 Pump-  Release         Date            Notes
 king                                   (by no means
                                         comprehensive,
                                         see Changes*
                                         for details)
 ======================================================================

 Larry   0              Classified.     Don't ask.

 Larry   1.000          1987-Dec-18

          1.001..10     1988-Jan-30
          1.011..14     1988-Feb-02
 Schwern  1.0.15        2002-Dec-18     Modernization
 Richard  1.0_16        2003-Dec-18

 Larry   2.000          1988-Jun-05

          2.001         1988-Jun-28

 Larry   3.000          1989-Oct-18

          3.001         1989-Oct-26
          3.002..4      1989-Nov-11
          3.005         1989-Nov-18
          3.006..8      1989-Dec-22
          3.009..13     1990-Mar-02
          3.014         1990-Mar-13
          3.015         1990-Mar-14
          3.016..18     1990-Mar-28
          3.019..27     1990-Aug-10     User subs.
          3.028         1990-Aug-14
          3.029..36     1990-Oct-17
          3.037         1990-Oct-20
          3.040         1990-Nov-10
          3.041         1990-Nov-13
          3.042..43     1991-Jan-??
          3.044         1991-Jan-12

 Larry   4.000          1991-Mar-21

          4.001..3      1991-Apr-12
          4.004..9      1991-Jun-07
          4.010         1991-Jun-10
          4.011..18     1991-Nov-05
          4.019         1991-Nov-11     Stable.
          4.020..33     1992-Jun-08
          4.034         1992-Jun-11
          4.035         1992-Jun-23
 Larry    4.036         1993-Feb-05     Very stable.

          5.000alpha1   1993-Jul-31
          5.000alpha2   1993-Aug-16
          5.000alpha3   1993-Oct-10
          5.000alpha4   1993-???-??
          5.000alpha5   1993-???-??
          5.000alpha6   1994-Mar-18
          5.000alpha7   1994-Mar-25
 Andy     5.000alpha8   1994-Apr-04
 Larry    5.000alpha9   1994-May-05     ext appears.
          5.000alpha10  1994-Jun-11
          5.000alpha11  1994-Jul-01
 Andy     5.000a11a     1994-Jul-07     To fit 14.
          5.000a11b     1994-Jul-14
          5.000a11c     1994-Jul-19
          5.000a11d     1994-Jul-22
 Larry    5.000alpha12  1994-Aug-04
 Andy     5.000a12a     1994-Aug-08
          5.000a12b     1994-Aug-15
          5.000a12c     1994-Aug-22
          5.000a12d     1994-Aug-22
          5.000a12e     1994-Aug-22
          5.000a12f     1994-Aug-24
          5.000a12g     1994-Aug-24
          5.000a12h     1994-Aug-24
 Larry    5.000beta1    1994-Aug-30
 Andy     5.000b1a      1994-Sep-06
 Larry    5.000beta2    1994-Sep-14     Core slushified.
 Andy     5.000b2a      1994-Sep-14
          5.000b2b      1994-Sep-17
          5.000b2c      1994-Sep-17
 Larry    5.000beta3    1994-Sep-??
 Andy     5.000b3a      1994-Sep-18
          5.000b3b      1994-Sep-22
          5.000b3c      1994-Sep-23
          5.000b3d      1994-Sep-27
          5.000b3e      1994-Sep-28
          5.000b3f      1994-Sep-30
          5.000b3g      1994-Oct-04
 Andy     5.000b3h      1994-Oct-07
 Larry?   5.000gamma    1994-Oct-13?

 Larry   5.000          1994-Oct-17

 Andy     5.000a        1994-Dec-19
          5.000b        1995-Jan-18
          5.000c        1995-Jan-18
          5.000d        1995-Jan-18
          5.000e        1995-Jan-18
          5.000f        1995-Jan-18
          5.000g        1995-Jan-18
          5.000h        1995-Jan-18
          5.000i        1995-Jan-26
          5.000j        1995-Feb-07
          5.000k        1995-Feb-11
          5.000l        1995-Feb-21
          5.000m        1995-Feb-28
          5.000n        1995-Mar-07
          5.000o        1995-Mar-13?

 Larry   5.001          1995-Mar-13

 Andy     5.001a        1995-Mar-15
          5.001b        1995-Mar-31
          5.001c        1995-Apr-07
          5.001d        1995-Apr-14
          5.001e        1995-Apr-18     Stable.
          5.001f        1995-May-31
          5.001g        1995-May-25
          5.001h        1995-May-25
          5.001i        1995-May-30
          5.001j        1995-Jun-05
          5.001k        1995-Jun-06
          5.001l        1995-Jun-06     Stable.
          5.001m        1995-Jul-02     Very stable.
          5.001n        1995-Oct-31     Very unstable.
          5.002beta1    1995-Nov-21
          5.002b1a      1995-Dec-04
          5.002b1b      1995-Dec-04
          5.002b1c      1995-Dec-04
          5.002b1d      1995-Dec-04
          5.002b1e      1995-Dec-08
          5.002b1f      1995-Dec-08
 Tom      5.002b1g      1995-Dec-21     Doc release.
 Andy     5.002b1h      1996-Jan-05
          5.002b2       1996-Jan-14
 Larry    5.002b3       1996-Feb-02
 Andy     5.002gamma    1996-Feb-11
 Larry    5.002delta    1996-Feb-27

 Larry   5.002          1996-Feb-29     Prototypes.

 Charles  5.002_01      1996-Mar-25

         5.003          1996-Jun-25     Security release.

          5.003_01      1996-Jul-31
 Nick     5.003_02      1996-Aug-10
 Andy     5.003_03      1996-Aug-28
          5.003_04      1996-Sep-02
          5.003_05      1996-Sep-12
          5.003_06      1996-Oct-07
          5.003_07      1996-Oct-10
 Chip     5.003_08      1996-Nov-19
          5.003_09      1996-Nov-26
          5.003_10      1996-Nov-29
          5.003_11      1996-Dec-06
          5.003_12      1996-Dec-19
          5.003_13      1996-Dec-20
          5.003_14      1996-Dec-23
          5.003_15      1996-Dec-23
          5.003_16      1996-Dec-24
          5.003_17      1996-Dec-27
          5.003_18      1996-Dec-31
          5.003_19      1997-Jan-04
          5.003_20      1997-Jan-07
          5.003_21      1997-Jan-15
          5.003_22      1997-Jan-16
          5.003_23      1997-Jan-25
          5.003_24      1997-Jan-29
          5.003_25      1997-Feb-04
          5.003_26      1997-Feb-10
          5.003_27      1997-Feb-18
          5.003_28      1997-Feb-21
          5.003_90      1997-Feb-25     Ramping up to the 5.004 release.
          5.003_91      1997-Mar-01
          5.003_92      1997-Mar-06
          5.003_93      1997-Mar-10
          5.003_94      1997-Mar-22
          5.003_95      1997-Mar-25
          5.003_96      1997-Apr-01
          5.003_97      1997-Apr-03     Fairly widely used.
          5.003_97a     1997-Apr-05
          5.003_97b     1997-Apr-08
          5.003_97c     1997-Apr-10
          5.003_97d     1997-Apr-13
          5.003_97e     1997-Apr-15
          5.003_97f     1997-Apr-17
          5.003_97g     1997-Apr-18
          5.003_97h     1997-Apr-24
          5.003_97i     1997-Apr-25
          5.003_97j     1997-Apr-28
          5.003_98      1997-Apr-30
          5.003_99      1997-May-01
          5.003_99a     1997-May-09
          p54rc1        1997-May-12     Release Candidates.
          p54rc2        1997-May-14

 Chip    5.004          1997-May-15     A major maintenance release.

 Tim      5.004_01-t1   1997-???-??     The 5.004 maintenance track.
          5.004_01-t2   1997-Jun-11     aka perl5.004m1t2
          5.004_01      1997-Jun-13
          5.004_01_01   1997-Jul-29     aka perl5.004m2t1
          5.004_01_02   1997-Aug-01     aka perl5.004m2t2
          5.004_01_03   1997-Aug-05     aka perl5.004m2t3
          5.004_02      1997-Aug-07
          5.004_02_01   1997-Aug-12     aka perl5.004m3t1
          5.004_03-t2   1997-Aug-13     aka perl5.004m3t2
          5.004_03      1997-Sep-05
          5.004_04-t1   1997-Sep-19     aka perl5.004m4t1
          5.004_04-t2   1997-Sep-23     aka perl5.004m4t2
          5.004_04-t3   1997-Oct-10     aka perl5.004m4t3
          5.004_04-t4   1997-Oct-14     aka perl5.004m4t4
          5.004_04      1997-Oct-15
          5.004_04-m1   1998-Mar-04     (5.004m5t1) Maint. trials for 5.004_05.
          5.004_04-m2   1998-May-01
          5.004_04-m3   1998-May-15
          5.004_04-m4   1998-May-19
          5.004_05-MT5  1998-Jul-21
          5.004_05-MT6  1998-Oct-09
          5.004_05-MT7  1998-Nov-22
          5.004_05-MT8  1998-Dec-03
 Chip     5.004_05-MT9  1999-Apr-26
          5.004_05      1999-Apr-29

 Malcolm  5.004_50      1997-Sep-09     The 5.005 development track.
          5.004_51      1997-Oct-02
          5.004_52      1997-Oct-15
          5.004_53      1997-Oct-16
          5.004_54      1997-Nov-14
          5.004_55      1997-Nov-25
          5.004_56      1997-Dec-18
          5.004_57      1998-Feb-03
          5.004_58      1998-Feb-06
          5.004_59      1998-Feb-13
          5.004_60      1998-Feb-20
          5.004_61      1998-Feb-27
          5.004_62      1998-Mar-06
          5.004_63      1998-Mar-17
          5.004_64      1998-Apr-03
          5.004_65      1998-May-15
          5.004_66      1998-May-29
 Sarathy  5.004_67      1998-Jun-15
          5.004_68      1998-Jun-23
          5.004_69      1998-Jun-29
          5.004_70      1998-Jul-06
          5.004_71      1998-Jul-09
          5.004_72      1998-Jul-12
          5.004_73      1998-Jul-13
          5.004_74      1998-Jul-14     5.005 beta candidate.
          5.004_75      1998-Jul-15     5.005 beta1.
          5.004_76      1998-Jul-21     5.005 beta2.

 Sarathy  5.005         1998-Jul-22     Oneperl.

 Sarathy  5.005_01      1998-Jul-27     The 5.005 maintenance track.
          5.005_02-T1   1998-Aug-02
          5.005_02-T2   1998-Aug-05
          5.005_02      1998-Aug-08
 Graham   5.005_03-MT1  1998-Nov-30
          5.005_03-MT2  1999-Jan-04
          5.005_03-MT3  1999-Jan-17
          5.005_03-MT4  1999-Jan-26
          5.005_03-MT5  1999-Jan-28
          5.005_03-MT6  1999-Mar-05
          5.005_03      1999-Mar-28
 Leon     5.005_04-RC1  2004-Feb-05
          5.005_04-RC2  2004-Feb-18
          5.005_04      2004-Feb-23
          5.005_05-RC1  2009-Feb-16

 Sarathy  5.005_50      1998-Jul-26     The 5.6 development track.
          5.005_51      1998-Aug-10
          5.005_52      1998-Sep-25
          5.005_53      1998-Oct-31
          5.005_54      1998-Nov-30
          5.005_55      1999-Feb-16
          5.005_56      1999-Mar-01
          5.005_57      1999-May-25
          5.005_58      1999-Jul-27
          5.005_59      1999-Aug-02
          5.005_60      1999-Aug-02
          5.005_61      1999-Aug-20
          5.005_62      1999-Oct-15
          5.005_63      1999-Dec-09
          5.5.640       2000-Feb-02
          5.5.650       2000-Feb-08     beta1
          5.5.660       2000-Feb-22     beta2
          5.5.670       2000-Feb-29     beta3
          5.6.0-RC1     2000-Mar-09     Release candidate 1.
          5.6.0-RC2     2000-Mar-14     Release candidate 2.
          5.6.0-RC3     2000-Mar-21     Release candidate 3.

 Sarathy  5.6.0         2000-Mar-22

 Sarathy  5.6.1-TRIAL1  2000-Dec-18     The 5.6 maintenance track.
          5.6.1-TRIAL2  2001-Jan-31
          5.6.1-TRIAL3  2001-Mar-19
          5.6.1-foolish 2001-Apr-01     The "fools-gold" release.
          5.6.1         2001-Apr-08
 Rafael   5.6.2-RC1     2003-Nov-08
          5.6.2         2003-Nov-15     Fix new build issues

 Jarkko   5.7.0         2000-Sep-02     The 5.7 track: Development.
          5.7.1         2001-Apr-09
          5.7.2         2001-Jul-13     Virtual release candidate 0.
          5.7.3         2002-Mar-05
          5.8.0-RC1     2002-Jun-01
          5.8.0-RC2     2002-Jun-21
          5.8.0-RC3     2002-Jul-13

 Jarkko   5.8.0         2002-Jul-18

 Jarkko   5.8.1-RC1     2003-Jul-10     The 5.8 maintenance track
          5.8.1-RC2     2003-Jul-11
          5.8.1-RC3     2003-Jul-30
          5.8.1-RC4     2003-Aug-01
          5.8.1-RC5     2003-Sep-22
          5.8.1         2003-Sep-25
 Nicholas 5.8.2-RC1     2003-Oct-27
          5.8.2-RC2     2003-Nov-03
          5.8.2         2003-Nov-05
          5.8.3-RC1     2004-Jan-07
          5.8.3         2004-Jan-14
          5.8.4-RC1     2004-Apr-05
          5.8.4-RC2     2004-Apr-15
          5.8.4         2004-Apr-21
          5.8.5-RC1     2004-Jul-06
          5.8.5-RC2     2004-Jul-08
          5.8.5         2004-Jul-19
          5.8.6-RC1     2004-Nov-11
          5.8.6         2004-Nov-27
          5.8.7-RC1     2005-May-18
          5.8.7         2005-May-30
          5.8.8-RC1     2006-Jan-20
          5.8.8         2006-Jan-31
          5.8.9-RC1     2008-Nov-10
          5.8.9-RC2     2008-Dec-06
          5.8.9         2008-Dec-14

 Hugo     5.9.0         2003-Oct-27     The 5.9 development track
 Rafael   5.9.1         2004-Mar-16
          5.9.2         2005-Apr-01
          5.9.3         2006-Jan-28
          5.9.4         2006-Aug-15
          5.9.5         2007-Jul-07
          5.10.0-RC1    2007-Nov-17
          5.10.0-RC2    2007-Nov-25

 Rafael   5.10.0        2007-Dec-18

 David M  5.10.1-RC1    2009-Aug-06     The 5.10 maintenance track
          5.10.1-RC2    2009-Aug-18
          5.10.1        2009-Aug-22

 Jesse    5.11.0        2009-Oct-02     The 5.11 development track
          5.11.1        2009-Oct-20
 Leon     5.11.2        2009-Nov-20
 Jesse    5.11.3        2009-Dec-20
 Ricardo  5.11.4        2010-Jan-20
 Steve    5.11.5        2010-Feb-20
 Jesse    5.12.0-RC0    2010-Mar-21
          5.12.0-RC1    2010-Mar-29
          5.12.0-RC2    2010-Apr-01
          5.12.0-RC3    2010-Apr-02
          5.12.0-RC4    2010-Apr-06
          5.12.0-RC5    2010-Apr-09

 Jesse    5.12.0        2010-Apr-12

 Jesse    5.12.1-RC1    2010-May-09     The 5.12 maintenance track
          5.12.1-RC2    2010-May-13
          5.12.1        2010-May-16
          5.12.2-RC2    2010-Aug-31
          5.12.2        2010-Sep-06
 Ricardo  5.12.3-RC1    2011-Jan-09
 Ricardo  5.12.3-RC2    2011-Jan-14
 Ricardo  5.12.3-RC3    2011-Jan-17
 Ricardo  5.12.3        2011-Jan-21
 Leon     5.12.4-RC1    2011-Jun-08
 Leon     5.12.4        2011-Jun-20
 Dominic  5.12.5        2012-Nov-10

 Leon     5.13.0        2010-Apr-20     The 5.13 development track
 Ricardo  5.13.1        2010-May-20
 Matt     5.13.2        2010-Jun-22
 David G  5.13.3        2010-Jul-20
 Florian  5.13.4        2010-Aug-20
 Steve    5.13.5        2010-Sep-19
 Miyagawa 5.13.6        2010-Oct-20
 BinGOs   5.13.7        2010-Nov-20
 Zefram   5.13.8        2010-Dec-20
 Jesse    5.13.9        2011-Jan-20
 Ævar     5.13.10       2011-Feb-20
 Florian  5.13.11       2011-Mar-20
 Jesse    5.14.0RC1     2011-Apr-20
 Jesse    5.14.0RC2     2011-May-04
 Jesse    5.14.0RC3     2011-May-11

 Jesse    5.14.0        2011-May-14     The 5.14 maintenance track
 Jesse    5.14.1        2011-Jun-16
 Florian  5.14.2-RC1    2011-Sep-19
          5.14.2        2011-Sep-26
 Dominic  5.14.3        2012-Oct-12
 David M  5.14.4-RC1    2013-Mar-05
 David M  5.14.4-RC2    2013-Mar-07
 David M  5.14.4        2013-Mar-10

 David G  5.15.0        2011-Jun-20     The 5.15 development track
 Zefram   5.15.1        2011-Jul-20
 Ricardo  5.15.2        2011-Aug-20
 Stevan   5.15.3        2011-Sep-20
 Florian  5.15.4        2011-Oct-20
 Steve    5.15.5        2011-Nov-20
 Dave R   5.15.6        2011-Dec-20
 BinGOs   5.15.7        2012-Jan-20
 Max M    5.15.8        2012-Feb-20
 Abigail  5.15.9        2012-Mar-20
 Ricardo  5.16.0-RC0    2012-May-10
 Ricardo  5.16.0-RC1    2012-May-14
 Ricardo  5.16.0-RC2    2012-May-15

 Ricardo  5.16.0        2012-May-20     The 5.16 maintenance track
 Ricardo  5.16.1        2012-Aug-08
 Ricardo  5.16.2        2012-Nov-01
 Ricardo  5.16.3-RC1    2013-Mar-06
 Ricardo  5.16.3        2013-Mar-11

 Zefram   5.17.0        2012-May-26     The 5.17 development track
 Jesse L  5.17.1        2012-Jun-20
 TonyC    5.17.2        2012-Jul-20
 Steve    5.17.3        2012-Aug-20
 Florian  5.17.4        2012-Sep-20
 Florian  5.17.5        2012-Oct-20
 Ricardo  5.17.6        2012-Nov-20
 Dave R   5.17.7        2012-Dec-18
 Aaron    5.17.8        2013-Jan-20
 BinGOs   5.17.9        2013-Feb-20
 Max M    5.17.10       2013-Mar-21
 Ricardo  5.17.11       2013-Apr-20

 Ricardo  5.18.0-RC1    2013-May-11     The 5.18 maintenance track
 Ricardo  5.18.0-RC2    2013-May-12
 Ricardo  5.18.0-RC3    2013-May-13
 Ricardo  5.18.0-RC4    2013-May-15
 Ricardo  5.18.0        2013-May-18
 Ricardo  5.18.1-RC1    2013-Aug-01
 Ricardo  5.18.1-RC2    2013-Aug-03
 Ricardo  5.18.1-RC3    2013-Aug-08
 Ricardo  5.18.1        2013-Aug-12
 Ricardo  5.18.2        2014-Jan-06
 Ricardo  5.18.3-RC1    2014-Sep-17
 Ricardo  5.18.3-RC2    2014-Sep-27
 Ricardo  5.18.3        2014-Oct-01
 Ricardo  5.18.4        2014-Oct-01

 Ricardo   5.19.0       2013-May-20     The 5.19 development track
 David G   5.19.1       2013-Jun-21
 Aristotle 5.19.2       2013-Jul-22
 Steve     5.19.3       2013-Aug-20
 Steve     5.19.4       2013-Sep-20
 Steve     5.19.5       2013-Oct-20
 BinGOs    5.19.6       2013-Nov-20
 Abigail   5.19.7       2013-Dec-20
 Ricardo   5.19.8       2014-Jan-20
 TonyC     5.19.9       2014-Feb-20
 Aaron     5.19.10      2014-Mar-20
 Steve     5.19.11      2014-Apr-20

 Ricardo   5.20.0-RC1   2014-May-16     The 5.20 maintenance track
 Ricardo   5.20.0       2014-May-27
 Steve     5.20.1-RC1   2014-Aug-25
 Steve     5.20.1-RC2   2014-Sep-07
 Steve     5.20.1       2014-Sep-14
 Steve     5.20.2-RC1   2015-Jan-31
 Steve     5.20.2       2015-Feb-14
 Steve     5.20.3-RC1   2015-Aug-22
 Steve     5.20.3-RC2   2015-Aug-29
 Steve     5.20.3       2015-Sep-12

 Ricardo   5.21.0       2014-May-27     The 5.21 development track
 Matthew H 5.21.1       2014-Jun-20
 Abigail   5.21.2       2014-Jul-20
 Peter     5.21.3       2014-Aug-20
 Steve     5.21.4       2014-Sep-20
 Abigail   5.21.5       2014-Oct-20
 BinGOs    5.21.6       2014-Nov-20
 Max M     5.21.7       2014-Dec-20
 Matthew H 5.21.8       2015-Jan-20
 Sawyer X  5.21.9       2015-Feb-20
 Steve     5.21.10      2015-Mar-20
 Steve     5.21.11      2015-Apr-20

 Ricardo   5.22.0-RC1   2015-May-19     The 5.22 maintenance track
 Ricardo   5.22.0-RC2   2015-May-21
 Ricardo   5.22.0       2015-Jun-01
 Steve     5.22.1-RC1   2015-Oct-31
 Steve     5.22.1-RC2   2015-Nov-15
 Steve     5.22.1-RC3   2015-Dec-02
 Steve     5.22.1-RC4   2015-Dec-08
 Steve     5.22.1       2015-Dec-13
 Steve     5.22.2-RC1   2016-Apr-10
 Steve     5.22.2       2016-Apr-29
 Steve     5.22.3-RC1   2016-Jul-17
 Steve     5.22.3-RC2   2016-Jul-25
 Steve     5.22.3-RC3   2016-Aug-11
 Steve     5.22.3-RC4   2016-Oct-12
 Steve     5.22.3-RC5   2017-Jan-02
 Steve     5.22.3       2017-Jan-14
 Steve     5.22.4-RC1   2017-Jul-01
 Steve     5.22.4       2017-Jul-15

 Ricardo   5.23.0       2015-Jun-20     The 5.23 development track
 Matthew H 5.23.1       2015-Jul-20
 Matthew H 5.23.2       2015-Aug-20
 Peter     5.23.3       2015-Sep-20
 Steve     5.23.4       2015-Oct-20
 Abigail   5.23.5       2015-Nov-20
 David G   5.23.6       2015-Dec-21
 Stevan    5.23.7       2016-Jan-20
 Sawyer X  5.23.8       2016-Feb-20
 Abigail   5.23.9       2016-Mar-20

 Ricardo   5.24.0-RC1   2016-Apr-13     The 5.24 maintenance track
 Ricardo   5.24.0-RC2   2016-Apr-23
 Ricardo   5.24.0-RC3   2016-Apr-26
 Ricardo   5.24.0-RC4   2016-May-02
 Ricardo   5.24.0-RC5   2016-May-04
 Ricardo   5.24.0       2016-May-09
 Steve     5.24.1-RC1   2016-Jul-17
 Steve     5.24.1-RC2   2016-Jul-25
 Steve     5.24.1-RC3   2016-Aug-11
 Steve     5.24.1-RC4   2016-Oct-12
 Steve     5.24.1-RC5   2017-Jan-02
 Steve     5.24.1       2017-Jan-14
 Steve     5.24.2-RC1   2017-Jul-01
 Steve     5.24.2       2017-Jul-15
 Steve     5.24.3-RC1   2017-Sep-10
 Steve     5.24.3       2017-Sep-22
 Steve     5.24.4-RC1   2018-Mar-24
 Steve     5.24.4       2018-Apr-14

 Ricardo   5.25.0       2016-May-09     The 5.25 development track
 Sawyer X  5.25.1       2016-May-20
 Matthew H 5.25.2       2016-Jun-20
 Steve     5.25.3       2016-Jul-20
 BinGOs    5.25.4       2016-Aug-20
 Stevan    5.25.5       2016-Sep-20
 Aaron     5.25.6       2016-Oct-20
 Chad      5.25.7       2016-Nov-20
 Sawyer X  5.25.8       2016-Dec-20
 Abigail   5.25.9       2017-Jan-20
 Renee     5.25.10      2017-Feb-20
 Sawyer X  5.25.11      2017-Mar-20
 Sawyer X  5.25.12      2017-Apr-20

 Sawyer X  5.26.0-RC1   2017-May-11     The 5.26 maintenance track
 Sawyer X  5.26.0-RC2   2017-May-23
 Sawyer X  5.26.0       2017-May-30
 Steve     5.26.1-RC1   2017-Sep-10
 Steve     5.26.1       2017-Sep-22
 Steve     5.26.2-RC1   2018-Mar-24
 Steve     5.26.2       2018-Apr-14
 Steve     5.26.3-RC1   2018-Nov-08
 Steve     5.26.3       2018-Nov-29

 Sawyer X  5.27.0       2017-May-31     The 5.27 development track
 Eric      5.27.1       2017-Jun-20
 Aaron     5.27.2       2017-Jul-20
 Matthew H 5.27.3       2017-Aug-21
 John      5.27.4       2017-Sep-20
 Steve     5.27.5       2017-Oct-20
 Ether     5.27.6       2017-Nov-20
 BinGOs    5.27.7       2017-Dec-20
 Abigail   5.27.8       2018-Jan-20
 Renee     5.27.9       2018-Feb-20
 Todd      5.27.10      2018-Mar-20
 Sawyer X  5.27.11      2018-Apr-20

 Sawyer X  5.28.0-RC1   2018-May-21     The 5.28 maintenance track
 Sawyer X  5.28.0-RC2   2018-Jun-06
 Sawyer X  5.28.0-RC3   2018-Jun-18
 Sawyer X  5.28.0-RC4   2018-Jun-19
 Sawyer X  5.28.0       2018-Jun-22
 Steve     5.28.1-RC1   2018-Nov-08
 Steve     5.28.1       2018-Nov-29

 Sawyer X  5.29.0       2018-Jun-26     The 5.29 development track
 Steve     5.29.1       2018-Jul-20
 BinGOs    5.29.2       2018-Aug-20
 John      5.29.3       2018-Sep-20

=head2 SELECTED RELEASE SIZES

For example, the notation "core: 212  29" for release 1.000 means that
the core comprised 212 kilobytes in 29 files.  The categories "core"
through "doc" are explained below.

 release        core       lib         ext        t         doc
 ======================================================================

 1.000           212  29      -   -      -    -     38   51     62   3
 1.014           219  29      -   -      -    -     39   52     68   4
 2.000           309  31      2   3      -    -     55   57     92   4
 2.001           312  31      2   3      -    -     55   57     94   4
 3.000           508  36     24  11      -    -     79   73    156   5
 3.044           645  37     61  20      -    -     90   74    190   6
 4.000           635  37     59  20      -    -     91   75    198   4
 4.019           680  37     85  29      -    -     98   76    199   4
 4.036           709  37     89  30      -    -     98   76    208   5
 5.000alpha2     785  50    114  32      -    -    112   86    209   5
 5.000alpha3     801  50    117  33      -    -    121   87    209   5
 5.000alpha9    1022  56    149  43    116   29    125   90    217   6
 5.000a12h       978  49    140  49    205   46    152   97    228   9
 5.000b3h       1035  53    232  70    216   38    162   94    218  21
 5.000          1038  53    250  76    216   38    154   92    536  62
 5.001m         1071  54    388  82    240   38    159   95    544  29
 5.002          1121  54    661 101    287   43    155   94    847  35
 5.003          1129  54    680 102    291   43    166  100    853  35
 5.003_07       1231  60    748 106    396   53    213  137    976  39
 5.004          1351  60   1230 136    408   51    355  161   1587  55
 5.004_01       1356  60   1258 138    410   51    358  161   1587  55
 5.004_04       1375  60   1294 139    413   51    394  162   1629  55
 5.004_05       1463  60   1435 150    394   50    445  175   1855  59
 5.004_51       1401  61   1260 140    413   53    358  162   1594  56
 5.004_53       1422  62   1295 141    438   70    394  162   1637  56
 5.004_56       1501  66   1301 140    447   74    408  165   1648  57
 5.004_59       1555  72   1317 142    448   74    424  171   1678  58
 5.004_62       1602  77   1327 144    629   92    428  173   1674  58
 5.004_65       1626  77   1358 146    615   92    446  179   1698  60
 5.004_68       1856  74   1382 152    619   92    463  187   1784  60
 5.004_70       1863  75   1456 154    675   92    494  194   1809  60
 5.004_73       1874  76   1467 152    762  102    506  196   1883  61
 5.004_75       1877  76   1467 152    770  103    508  196   1896  62
 5.005          1896  76   1469 152    795  103    509  197   1945  63
 5.005_03       1936  77   1541 153    813  104    551  201   2176  72
 5.005_50       1969  78   1842 301    795  103    514  198   1948  63
 5.005_53       1999  79   1885 303    806  104    602  224   2002  67
 5.005_56       2086  79   1970 307    866  113    672  238   2221  75
 5.6.0          2820  79   2626 364   1096  129    863  280   2840  93
 5.6.1          2946  78   2921 430   1171  132   1024  304   3330 102
 5.6.2          2947  78   3143 451   1247  127   1303  387   3406 102
 5.7.0          2977  80   2801 425   1250  132    975  307   3206 100
 5.7.1          3351  84   3442 455   1944  167   1334  357   3698 124
 5.7.2          3491  87   4858 618   3290  298   1598  449   3910 139
 5.7.3          3299  85   4295 537   2196  300   2176  626   4171 120
 5.8.0          3489  87   4533 585   2437  331   2588  726   4368 125
 5.8.1          3674  90   5104 623   2604  353   2983  836   4625 134
 5.8.2          3633  90   5111 623   2623  357   3019  848   4634 135
 5.8.3          3625  90   5141 624   2660  363   3083  869   4669 136
 5.8.4          3653  90   5170 634   2684  368   3148  885   4689 137
 5.8.5          3664  90   4260 303   2707  369   3208  898   4689 138
 5.8.6          3690  90   4271 303   3141  396   3411  925   4709 139
 5.8.7          3788  90   4322 307   3297  401   3485  964   4744 141
 5.8.8          3895  90   4357 314   3409  431   3622 1017   4979 144
 5.8.9          4132  93   5508 330   3826  529   4364 1234   5348 152
 5.9.0          3657  90   4951 626   2603  354   3011  841   4609 135
 5.9.1          3580  90   5196 634   2665  367   3186  889   4725 138
 5.9.2          3863  90   4654 312   3283  403   3551  973   4800 142
 5.9.3          4096  91   5318 381   4806  597   4272 1214   5139 147
 5.9.4          4393  94   5718 415   4578  642   4646 1310   5335 153
 5.9.5          4681  96   6849 479   4827  671   5155 1490   5572 159
 5.10.0         4710  97   7050 486   4899  673   5275 1503   5673 160
 5.10.1         4858  98   7440 519   6195  921   6147 1751   5151 163
 5.12.0         4999 100   1146 121  15227 2176   6400 1843   5342 168
 5.12.1         5000 100   1146 121  15283 2178   6407 1846   5354 169
 5.12.2         5003 100   1146 121  15404 2178   6413 1846   5376 170
 5.12.3         5004 100   1146 121  15529 2180   6417 1848   5391 171
 5.14.0         5328 104   1100 114  17779 2479   7697 2130   5871 188
 5.16.0         5562 109   1077  80  20504 2702   8750 2375   4815 152
 5.18.0         5892 113   1088  79  20077 2760   9365 2439   4943 154
 5.20.0         6243 115   1187  75  19499 2701   9620 2457   5145 159
 5.22.0         7819 115   1284  77  19121 2635   9772 2434   5615 176
 5.24.0         7922 113   1287  77  19535 2677   9994 2465   5702 177
 5.26.0         9140 121  24925 1200 40643 3017  10514 2614   7854 211
 5.28.0        13056 128  27267 1230 41745 3130  10952 2715   8185 218

The columns "core" through "doc" cover the following files from the
Perl source code distribution.  In the glob notation, ** means
"recursively" and (.) means "regular files only".

 core   *.[hcy]
 lib    lib/**/*.p[ml]
 ext    ext/**/*.{[hcyt],xs,pm} (for -5.10.1) or
        {dist,ext,cpan}/**/*.{[hcyt],xs,pm} (for 5.12.0-)
 t      t/**/*(.) (for 1-5.005_56) or **/*.t (for 5.6.0-5.7.3)
 doc    {README*,INSTALL,*[_.]man{,.?},pod/**/*.pod}
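
As a rough illustration of how one such pair of numbers (kB and file
count) could be obtained, the sketch below walks a source tree and
totals the regular files whose relative path matches a pattern.  The
sub name C<tally>, the command-line usage, and the regular expression
standing in for the "lib" glob are assumptions made for this example.
It is not the script that produced the tables, and the rounding used
there may differ.

 use strict;
 use warnings;
 use File::Find;
 use File::Spec;

 # Hypothetical helper: return (kB, count) for the regular files
 # under $dir whose path relative to $dir matches $re.
 sub tally {
     my ($dir, $re) = @_;
     my ($bytes, $count) = (0, 0);
     find({ no_chdir => 1, wanted => sub {
         # Like (.) in the glob notation: plain files, no symlinks.
         return unless -f $_ && !-l $_;
         # Assumes "/" as the directory separator.
         my $rel = File::Spec->abs2rel($_, $dir);
         return unless $rel =~ $re;
         $bytes += -s $_;
         $count++;
     }}, $dir);
     # Round to the nearest kB (an assumption about the tables).
     return (int($bytes / 1024 + 0.5), $count);
 }

 # Usage: perl tally.pl /path/to/perl-source
 if (@ARGV) {
     # Regex approximation of the "lib" class, lib/**/*.p[ml]
     my ($kb, $n) = tally($ARGV[0], qr{^lib/.*\.p[ml]$});
     printf "lib %6d kB %5d files\n", $kb, $n;
 }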

Here are the same statistics (size in kB and number of files) for the
other subdirectories, and for one file (Configure), in the Perl source
distribution, for a smaller selection of releases.

 ======================================================================
   Legend:  kB   #

                  1.014      2.001      3.044

 Configure      31    1    37    1    62    1
 eg              -    -    34   28    47   39
 h2pl            -    -     -    -    12   12
 msdos           -    -     -    -    41   13
 os2             -    -     -    -    63   22
 usub            -    -     -    -    21   16
 x2p           103   17   104   17   137   17

 ======================================================================

                  4.000      4.019      4.036

 atarist         -    -     -    -   113   31
 Configure      73    1    83    1    86    1
 eg             47   39    47   39    47   39
 emacs          67    4    67    4    67    4
 h2pl           12   12    12   12    12   12
 hints           -    -     5   42    11   56
 msdos          57   15    58   15    60   15
 os2            81   29    81   29   113   31
 usub           25    7    43    8    43    8
 x2p           147   18   152   19   154   19

 ======================================================================

                5.000a2  5.000a12h   5.000b3h      5.000     5.001m

 apollo          8    3     8    3     8    3     8    3     8    3
 atarist       113   31   113   31     -    -     -    -     -    -
 bench           -    -     0    1     -    -     -    -     -    -
 Bugs            2    5    26    1     -    -     -    -     -    -
 dlperl         40    5     -    -     -    -     -    -     -    -
 do            127   71     -    -     -    -     -    -     -    -
 Configure       -    -   153    1   159    1   160    1   180    1
 Doc             -    -    26    1    75    7    11    1    11    1
 eg             79   58    53   44    51   43    54   44    54   44
 emacs          67    4   104    6   104    6   104    1   104    6
 h2pl           12   12    12   12    12   12    12   12    12   12
 hints          11   56    12   46    18   48    18   48    44   56
 msdos          60   15    60   15     -    -     -    -     -    -
 os2           113   31   113   31     -    -     -    -     -    -
 U               -    -    62    8   112   42     -    -     -    -
 usub           43    8     -    -     -    -     -    -     -    -
 vms             -    -    80    7   123    9   184   15   304   20
 x2p           171   22   171   21   162   20   162   20   279   20

 ======================================================================

                  5.002      5.003   5.003_07

 Configure     201    1   201    1   217    1
 eg             54   44    54   44    54   44
 emacs         108    1   108    1   143    1
 h2pl           12   12    12   12    12   12
 hints          73   59    77   60    90   62
 os2            84   17    56   10   117   42
 plan9           -    -     -    -    79   15
 Porting         -    -     -    -    51    1
 utils          87    7    88    7    97    7
 vms           500   24   475   26   505   27
 x2p           280   20   280   20   280   19

 ======================================================================

                  5.004   5.004_04   5.004_62   5.004_65   5.004_68

 beos            -    -     -    -     -    -      1   1      1   1
 Configure     225    1   225    1   240    1    248   1    256   1
 cygwin32       23    5    23    5    23    5     24   5     24   5
 djgpp           -    -     -    -    14    5     14   5     14   5
 eg             81   62    81   62    81   62     81  62     81  62
 emacs         194    1   204    1   212    2    212   2    212   2
 h2pl           12   12    12   12    12   12     12  12     12  12
 hints         129   69   132   71   144   72    151  74    155  74
 os2           121   42   127   42   127   44    129  44    129  44
 plan9          82   15    82   15    82   15     82  15     82  15
 Porting        94    2   109    4   203    6    234   8    241   9
 qnx             1    2     1    2     1    2      1   2      1   2
 utils         112    8   118    8   124    8    156   9    159   9
 vms           518   34   524   34   538   34    569  34    569  34
 win32         285   33   378   36   470   39    493  39    575  41
 x2p           281   19   281   19   281   19    282  19    281  19

 ======================================================================

               5.004_70   5.004_73   5.004_75      5.005   5.005_03

 apollo          -    -     -    -     -    -     -    -      0   1
 beos            1    1     1    1     1    1     1    1      1   1
 Configure     256    1   256    1   264    1   264    1    270   1
 cygwin32       24    5    24    5    24    5    24    5     24   5
 djgpp          14    5    14    5    14    5    14    5     15   5
 eg             86   65    86   65    86   65    86   65     86  65
 emacs         262    2   262    2   262    2   262    2    274   2
 h2pl           12   12    12   12    12   12    12   12     12  12
 hints         157   74   157   74   159   74   160   74    179  77
 mint            -    -     -    -     -    -     -    -      4   7
 mpeix           -    -     -    -     5    3     5    3      5   3
 os2           129   44   139   44   142   44   143   44    148  44
 plan9          82   15    82   15    82   15    82   15     82  15
 Porting       241    9   253    9   259   10   264   12    272  13
 qnx             1    2     1    2     1    2     1    2      1   2
 utils         160    9   160    9   160    9   160    9    164   9
 vms           570   34   572   34   573   34   575   34    583  34
 vos             -    -     -    -     -    -     -    -    156  10
 win32         577   41   585   41   585   41   587   41    600  42
 x2p           281   19   281   19   281   19   281   19    281  19

 ======================================================================

                  5.6.0      5.6.1      5.6.2      5.7.3

 apollo          8    3     8    3     8    3     8    3
 beos            5    2     5    2     5    2     6    4
 Configure     346    1   361    1   363    1   394    1
 Cross           -    -     -    -     -    -     4    2
 djgpp          19    6    19    6    19    6    21    7
 eg            112   71   112   71   112   71     -    -
 emacs         303    4   319    4   319    4   319    4
 epoc           29    8    35    8    35    8    36    8
 h2pl           24   15    24   15    24   15    24   15
 hints         242   83   250   84   321   89   272   87
 mint           11    9    11    9    11    9    11    9
 mpeix           9    4     9    4     9    4     9    4
 NetWare         -    -     -    -     -    -   423   57
 os2           214   59   224   60   224   60   357   66
 plan9          92   17    92   17    92   17    85   15
 Porting       361   15   390   16   390   16   425   21
 qnx             5    3     5    3     5    3     5    3
 utils         228   12   221   11   222   11   267   13
 uts             -    -     -    -     -    -    12    3
 vmesa          25    4    25    4    25    4    25    4
 vms           686   38   627   38   627   38   649   36
 vos           227   12   249   15   248   15   281   17
 win32         755   41   782   42   801   42  1006   50
 x2p           307   20   307   20   307   20   345   20

 ======================================================================

                  5.8.0      5.8.1      5.8.2      5.8.3      5.8.4

 apollo          8    3     8    3     8    3     8    3     8    3
 beos            6    4     6    4     6    4     6    4     6    4
 Configure     472    1   493    1   493    1   493    1   494    1
 Cross           4    2    45   10    45   10    45   10    45   10
 djgpp          21    7    21    7    21    7    21    7    21    7
 emacs         319    4   329    4   329    4   329    4   329    4
 epoc           33    8    33    8    33    8    33    8    33    8
 h2pl           24   15    24   15    24   15    24   15    24   15
 hints         294   88   321   89   321   89   321   89   348   91
 mint           11    9    11    9    11    9    11    9    11    9
 mpeix          24    5    25    5    25    5    25    5    25    5
 NetWare       488   61   490   61   490   61   490   61   488   61
 os2           361   66   445   67   450   67   488   67   488   67
 plan9          85   15   325   17   325   17   325   17   321   17
 Porting       479   22   537   32   538   32   539   32   538   33
 qnx             5    3     5    3     5    3     5    3     5    3
 utils         275   15   258   16   258   16   263   19   263   19
 uts            12    3    12    3    12    3    12    3    12    3
 vmesa          25    4    25    4    25    4    25    4    25    4
 vms           648   36   654   36   654   36   656   36   656   36
 vos           330   20   335   20   335   20   335   20   335   20
 win32        1062   49  1125   49  1127   49  1126   49  1181   56
 x2p           347   20   348   20   348   20   348   20   348   20

 ======================================================================

                  5.8.5      5.8.6      5.8.7      5.8.8      5.8.9

 apollo          8    3     8    3     8    3     8    3     8    3
 beos            6    4     6    4     8    4     8    4     8    4
 Configure     494    1   494    1   495    1   506    1   520    1
 Cross          45   10    45   10    45   10    45   10    46   10
 djgpp          21    7    21    7    21    7    21    7    21    7
 emacs         329    4   329    4   329    4   329    4   406    4
 epoc           33    8    33    8    33    8    34    8    35    8
 h2pl           24   15    24   15    24   15    24   15    24   15
 hints         350   91   352   91   355   94   360   94   387   99
 mint           11    9    11    9    11    9    11    9    11    9
 mpeix          25    5    25    5    25    5    49    6    49    6
 NetWare       488   61   488   61   488   61   490   61   491   61
 os2           488   67   488   67   488   67   488   67   552   70
 plan9         321   17   321   17   321   17   322   17   324   17
 Porting       538   34   548   35   549   35   564   37   625   41
 qnx             5    3     5    3     5    3     5    3     5    3
 utils         265   19   265   19   266   19   267   19   281   21
 uts            12    3    12    3    12    3    12    3    12    3
 vmesa          25    4    25    4    25    4    25    4    25    4
 vms           657   36   658   36   662   36   664   36   716   35
 vos           335   20   335   20   335   20   336   21   345   22
 win32        1183   56  1190   56  1199   56  1219   56  1484   68
 x2p           349   20   349   20   349   20   349   19   350   19

 ======================================================================

                  5.9.0      5.9.1      5.9.2      5.9.3      5.9.4

 apollo          8    3     8    3     8    3     8    3     8    3
 beos            6    4     6    4     8    4     8    4     8    4
 Configure     493    1   493    1   495    1   508    1   512    1
 Cross          45   10    45   10    45   10    45   10    46   10
 djgpp          21    7    21    7    21    7    21    7    21    7
 emacs         329    4   329    4   329    4   329    4   329    4
 epoc           33    8    33    8    33    8    34    8    34    8
 h2pl           24   15    24   15    24   15    24   15    24   15
 hints         321   89   346   91   355   94   359   94   366   96
 mad             -    -     -    -     -    -     -    -   174    6
 mint           11    9    11    9    11    9    11    9    11    9
 mpeix          25    5    25    5    25    5    49    6    49    6
 NetWare       489   61   487   61   487   61   489   61   489   61
 os2           444   67   488   67   488   67   488   67   488   67
 plan9         325   17   321   17   321   17   322   17   323   17
 Porting       537   32   536   33   549   36   564   38   576   38
 qnx             5    3     5    3     5    3     5    3     5    3
 symbian         -    -     -    -     -    -   293   53   293   53
 utils         258   16   263   19   268   20   273   23   275   24
 uts            12    3    12    3    12    3    12    3    12    3
 vmesa          25    4    25    4    25    4    25    4    25    4
 vms           660   36   547   33   553   33   661   33   696   33
 vos            11    7    11    7    11    7    11    7    11    7
 win32        1120   49  1124   51  1191   56  1209   56  1719   90
 x2p           348   20   348   20   349   20   349   19   349   19

 ======================================================================

                  5.9.5     5.10.0     5.10.1     5.12.0     5.12.1

 apollo          8    3     8    3     0    3     0    3     0    3
 beos            8    4     8    4     4    4     4    4     4    4
 Configure     518    1   518    1   533    1   536    1   536    1
 Cross         122   15   122   15   119   15   118   15   118   15
 djgpp          21    7    21    7    17    7    17    7    17    7
 emacs         329    4   406    4   402    4   402    4   402    4
 epoc           34    8    35    8    31    8    31    8    31    8
 h2pl           24   15    24   15    12   15    12   15    12   15
 hints         377   98   381   98   385  100   368   97   368   97
 mad           182    8   182    8   174    8   174    8   174    8
 mint           11    9    11    9     3    9     -    -     -    -
 mpeix          49    6    49    6    45    6    45    6    45    6
 NetWare       489   61   489   61   465   61   466   61   466   61
 os2           552   70   552   70   507   70   507   70   507   70
 plan9         324   17   324   17   316   17   316   17   316   17
 Porting       627   40   632   40   933   53   749   54   749   54
 qnx             5    3     5    4     1    4     1    4     1    4
 symbian       300   54   300   54   290   54   288   54   288   54
 utils         260   26   264   27   268   27   269   27   269   27
 uts            12    3    12    3     8    3     8    3     8    3
 vmesa          25    4    25    4    21    4    21    4    21    4
 vms           690   32   722   32   693   30   645   18   645   18
 vos            19    8    19    8    16    8    16    8    16    8
 win32        1482   68  1485   68  1497   70  1841   73  1841   73
 x2p           349   19   349   19   345   19   345   19   345   19

 ======================================================================

                 5.12.2     5.12.3      5.14.0     5.16.0       5.18.0

 apollo          0    3     0    3      -    -     -    -      -     -
 beos            4    4     4    4      5    4     5    4      -     -
 Configure     536    1   536    1    539    1   547    1    550     1
 Cross         118   15   118   15    118   15   118   15    118    15
 djgpp          17    7    17    7     18    7    18    7     18     7
 emacs         402    4   402    4      -    -     -    -      -     -
 epoc           31    8    31    8     32    8    30    8      -     -
 h2pl           12   15    12   15     15   15    15   15     13    15
 hints         368   97   368   97    370   96   371   96    354    91
 mad           174    8   174    8    176    8   176    8    174     8
 mpeix          45    6    45    6     46    6    46    6      -     -
 NetWare       466   61   466   61    473   61   472   61    469    61
 os2           507   70   507   70    518   70   519   70    510    70
 plan9         316   17   316   17    319   17   319   17    318    17
 Porting       750   54   750   54    855   60  1093   69   1149    70
 qnx             1    4     1    4      2    4     2    4      1     4
 symbian       288   54   288   54    292   54   292   54    290    54
 utils         269   27   269   27    249   29   245   30    246    31
 uts             8    3     8    3      9    3     9    3      -     -
 vmesa          21    4    21    4     22    4    22    4      -     -
 vms           646   18   644   18    639   17   571   15    564    15
 vos            16    8    16    8     17    8     9    7      8     7
 win32        1841   73  1841   73   1833   72  1655   67   1157    62
 x2p           345   19   345   19    346   19   345   19    344    20

 ======================================================================

                  5.20.0           5.22.0          5.24.0

 Configure    552      1       570      1      586      1
 Cross        118     15       118     15      118     15
 djgpp         18      7        17      7       17      7
 h2pl          13     15        13     15       13     15
 hints        355     90       356     87      362     87
 mad          174      8         -      -        -      -
 NetWare      467     61       466     61      467     61
 os2          510     70       510     70      510     70
 plan9        316     17       317     17      314     17
 Porting     1204     68      1393     71     1321     71
 qnx            1      4         1      4        1      4
 symbian      290     54       291     54      292     54
 utils        241     27       242     27      679     53
 vms          538     12       532     12      524     12
 vos            8      7         8      7        8      7
 win32       1183     64      1201     64     1268     65
 x2p          341     19         -      -        -      -

 ======================================================================

                  5.26.0           5.28.0

 Configure    593      1       580      1
 Cross        122     15       125     15
 djgpp         21      7        21      7
 h2pl          24     15        24     15
 hints        376     87       364     85
 mad            -      -         -      -
 NetWare      499     61       493     61
 os2          552     70       552     70
 plan9        322     17       309     17
 Porting     1380     73      1462     75
 qnx            5      4         5      4
 symbian      315     54       315     54
 utils        578     50       584     50
 vms          527     12       526     12
 vos           12      7        12      7
 win32       1313     65      1326     65
 x2p            -      -         -      -

=head2 SELECTED PATCH SIZES

The "diff lines kB" columns mean that, for example, the patch 5.003_08,
applied on top of 5.003_07 (or whatever preceded 5.003_08), added
110 kilobytes of lines, removed 19 kilobytes of lines, and changed
424 kilobytes of lines.  Only the lines themselves are counted, not
their context.  The "+ - !" markers come from the diff(1) context diff
output format.
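
A minimal sketch of such a count is given below.  It reads a context
diff (as produced by C<diff -c>) and sums the bytes of the lines marked
"+", "-" and "!".  The sub name C<diff_line_kb> is hypothetical, and
counting each "!" line in both the old and the new half of a hunk is an
assumption; the figures in the table may have been derived differently.

 use strict;
 use warnings;

 # Hypothetical helper: sum the bytes of "+", "-" and "!" lines read
 # from a context diff.  Headers and context lines are skipped.
 sub diff_line_kb {
     my ($fh) = @_;
     my %bytes = ('+' => 0, '-' => 0, '!' => 0);
     while (my $line = <$fh>) {
         # A marked line is the marker, one space, then the text.
         # Headers such as "--- file" or "*** 1,5 ****" do not match.
         next unless $line =~ /^([-+!]) (.*)$/s;
         $bytes{$1} += length $2;
     }
     return map { $_ => $bytes{$_} / 1024 } keys %bytes;
 }

 # Usage: perl diffkb.pl some.patch
 if (@ARGV) {
     open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
     my %kb = diff_line_kb($fh);
     printf "+ %.0f  - %.0f  ! %.0f\n", @kb{'+', '-', '!'};
 }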

 Pump-  Release         Date              diff lines kB
 king                                     -------------
                                          +     -     !
 ======================================================================

 Chip     5.003_08      1996-Nov-19     110    19   424
          5.003_09      1996-Nov-26      38     9   248
          5.003_10      1996-Nov-29      29     2    27
          5.003_11      1996-Dec-06      73    12   165
          5.003_12      1996-Dec-19     275     6   436
          5.003_13      1996-Dec-20      95     1    56
          5.003_14      1996-Dec-23      23     7   333
          5.003_15      1996-Dec-23       0     0     1
          5.003_16      1996-Dec-24      12     3    50
          5.003_17      1996-Dec-27      19     1    14
          5.003_18      1996-Dec-31      21     1    32
          5.003_19      1997-Jan-04      80     3    85
          5.003_20      1997-Jan-07      18     1   146
          5.003_21      1997-Jan-15      38    10   221
          5.003_22      1997-Jan-16       4     0    18
          5.003_23      1997-Jan-25      71    15   119
          5.003_24      1997-Jan-29     426     1    20
          5.003_25      1997-Feb-04      21     8   169
          5.003_26      1997-Feb-10      16     1    15
          5.003_27      1997-Feb-18      32    10    38
          5.003_28      1997-Feb-21      58     4    66
          5.003_90      1997-Feb-25      22     2    34
          5.003_91      1997-Mar-01      37     1    39
          5.003_92      1997-Mar-06      16     3    69
          5.003_93      1997-Mar-10      12     3    15
          5.003_94      1997-Mar-22     407     7   200
          5.003_95      1997-Mar-25      41     1    37
          5.003_96      1997-Apr-01     283     5   261
          5.003_97      1997-Apr-03      13     2    34
          5.003_97a     1997-Apr-05      57     1    27
          5.003_97b     1997-Apr-08      14     1    20
          5.003_97c     1997-Apr-10      20     1    16
          5.003_97d     1997-Apr-13       8     0    16
          5.003_97e     1997-Apr-15      15     4    46
          5.003_97f     1997-Apr-17       7     1    33
          5.003_97g     1997-Apr-18       6     1    42
          5.003_97h     1997-Apr-24      23     3    68
          5.003_97i     1997-Apr-25      23     1    31
          5.003_97j     1997-Apr-28      36     1    49
          5.003_98      1997-Apr-30     171    12   539
          5.003_99      1997-May-01       6     0     7
          5.003_99a     1997-May-09      36     2    61
          p54rc1        1997-May-12       8     1    11
          p54rc2        1997-May-14       6     0    40

        5.004           1997-May-15       4     0     4

 Tim      5.004_01      1997-Jun-13     222    14    57
          5.004_02      1997-Aug-07     112    16   119
          5.004_03      1997-Sep-05     109     0    17
          5.004_04      1997-Oct-15      66     8   173

=head3 The patch-free era

In more modern times, named releases don't come as often, and since
progress can be followed (nearly) instantly (with rsync and, since late
2008, git), patches between versions are no longer provided.  However,
that doesn't keep us from calculating how large a patch would have
been; the results are shown in the table below.  Unless noted
otherwise, the size given is that of the patch that brings version
x.y.z to x.y.z+1.

 Sarathy  5.6.1         2001-Apr-08     531    44   651
 Rafael   5.6.2         2003-Nov-15      20    11  1819

 Jarkko   5.8.0         2002-Jul-18    1205    31   471   From 5.7.3

          5.8.1         2003-Sep-25     243   102  6162
 Nicholas 5.8.2         2003-Nov-05      10    50   788
          5.8.3         2004-Jan-14      31    13   360
          5.8.4         2004-Apr-21      33     8   299
          5.8.5         2004-Jul-19      11    19   255
          5.8.6         2004-Nov-27      35     3   192
          5.8.7         2005-May-30      75    34   778
          5.8.8         2006-Jan-31     131    42  1251
          5.8.9         2008-Dec-14     340   132 12988

 Hugo     5.9.0         2003-Oct-27     281   168  7132   From 5.8.0
 Rafael   5.9.1         2004-Mar-16      57   250  2107
          5.9.2         2005-Apr-01     720    57   858
          5.9.3         2006-Jan-28    1124   102  1906
          5.9.4         2006-Aug-15     896    60   862
          5.9.5         2007-Jul-07    1149   128  1062

          5.10.0        2007-Dec-18      50    31 13111   From 5.9.5


=head1 THE KEEPERS OF THE RECORDS

Jarkko Hietaniemi <F<jhi@iki.fi>>.

Thanks to the collective memory of the Perlfolk.  In addition to the
Keepers of the Pumpkin, Alan Champion, Mark Dominus,
Andreas KE<0xf6>nig, John Macdonald, Matthias Neeracher, Jeff Okamoto,
Michael Peppler, Randal Schwartz, and Paul D. Smith also sent
corrections and additions.  Abigail added file and patch size data for
the 5.6.0 - 5.10 era.

=cut