EzDev.org

perl5

The Perl 5 language interpreter (MIRROR ONLY) Perl 5 - dev.perl.org


What is a fast way to replace identical branches inside nested structures with references?

Is there a readily available module for Perl that could scan an arbitrarily big nested structure of hashes and arrays and replace all identical branches (those that, for example, Test::Deep::cmp_deeply would say 'ok' about) with references to only a single value?

I have my own solution for this problem already, but I would prefer to use existing fast XS module if it is available.

Example of original structure as shown by Data::Dumper:

$VAR1 = {
    'other_elems' => [
        {
            'sub_elements' => [
                {'id' => 333},
                {
                    'props' => ['attr5', 'attr6'],
                    'id'    => 444
                }
            ],
            'other_key_for_attrs' => ['attr1', 'attr5'],
            'id'                  => 222
        },
        {
            'sub_elements' => [{'id' => 333}],
            'id'           => 111
        }
    ],
    'elems' => [
        {
            'attrs' => ['attr1', 'attr5'],
            'id'    => 1
        },
        {
            'parent' => 3,
            'attrs'  => ['attr1', 'attr5'],
            'id'     => 2
        },
        {
            'attrs' => ['attr5', 'attr6'],
            'id'    => 3
        },
        {
            'attrs' => ['attr5', 'attr6'],
            'id'    => 4
        }
    ]
};

Example of expected result structure:

$VAR1 = {
    'other_elems' => [
        {
            'sub_elements' => [
                {'id' => 333},
                {
                    'props' => ['attr5', 'attr6'],
                    'id'    => 444
                }
            ],
            'other_key_for_attrs' => ['attr1', 'attr5'],
            'id'                  => 222
        },
        {
            'sub_elements' =>
              [$VAR1->{'other_elems'}[0]{'sub_elements'}[0]],
            'id' => 111
        }
    ],
    'elems' => [
        {
            'attrs' => $VAR1->{'other_elems'}[0]{'other_key_for_attrs'},
            'id'    => 1
        },
        {
            'parent' => 3,
            'attrs'  => $VAR1->{'other_elems'}[0]{'other_key_for_attrs'},
            'id'     => 2
        },
        {
            'attrs' =>
              $VAR1->{'other_elems'}[0]{'sub_elements'}[1]{'props'},
            'id' => 3
        },
        {
            'attrs' =>
              $VAR1->{'other_elems'}[0]{'sub_elements'}[1]{'props'},
            'id' => 4
        }
    ]
};

Source: (StackOverflow)

Non-determinism in encoding when using open() with scalar and I/O layers in Perl

For several hours now I am fighting a bug in my Perl program. I am not sure if I do something wrong or the interpreter does, but the code is non-deterministic while it should be deterministic, IMO. Also it exhibits the same behavior on ancient Debian Lenny (Perl 5.10.0) and a server just upgraded to Debian Wheezy (Perl 5.14.2). It boiled down to this piece of Perl code:

#!/usr/bin/perl
use warnings;
use strict;
use utf8;
binmode STDOUT, ":utf8";
binmode STDERR, ":utf8";
my $c = "";
open C, ">:utf8", \$c;
print C "š";
close C;
die "Does not happen\n" if utf8::is_utf8($c);
print utf8::decode($c) ? "Decoded\n" : "Undecoded\n";

It initializes Perl 5 interpreter in strict mode with warnings enabled, with character strings (as opposed to byte strings) and named standard streams encoded in UTF8 (internal notion of UTF-8, but pretty close; changing to full UTF-8 makes no difference). Then it opens a file handle to an “in-memory file” (scalar variable), prints a single two-byte UTF-8 character into it and examines the variable upon closure.

The scalar variable now always has UTF8 bit flipped off. However it sometimes contains a byte string (converted to character string via utf8::decode()) and sometimes a character string that just needs to flip on its UTF8 bit (Encode::_utf8_on()).

When I execute my code repeatedly (1000 times, via Bash), it prints Undecoded and Decoded with approximately the same frequencies. When I change the string I write into the “file”, e.g. add a newline at its end, Undecoded disappears. When utf8::decode succeeds and I try it for the same original string in a loop, it keeps succeeding in the same instance of interpreter; however, if it fails, it keeps failing.

What is the explanation for the observed behavior? How can I use file handle to a scalar variable together with character strings?

Bash playground:

for i in {1..1000}; do perl -we 'use strict; use utf8; binmode STDOUT, ":utf8"; binmode STDERR, ":utf8"; my $c = ""; open C, ">:utf8", \$c; print C "š"; close C; die "Does not happen\n" if utf8::is_utf8($c); print utf8::decode($c) ? "Decoded\n" : "Undecoded\n";'; done | grep Undecoded | wc -l

For reference and to be absolutely sure, I also made a version with pedantic error handling – same results.

#!/usr/bin/perl
use warnings;
use strict;
use utf8;
binmode STDOUT, ":utf8" or die "Cannot binmode STDOUT\n";
binmode STDERR, ":utf8" or die "Cannot binmode STDERR\n";
my $c = "";
open C, ">:utf8", \$c or die "Cannot open: $!\n";
print C "š" or die "Cannot print: $!\n";
close C or die "Cannot close: $!\n";
die "Does not happen\n" if utf8::is_utf8($c);
print utf8::decode($c) ? "Decoded\n" : "Undecoded\n";

Source: (StackOverflow)

Inconsistent (silly?) data access in Perl 5 (also confusing me regarding use of sigils)

This question is about asking for some explanation of what's going on in the Perl system for I don't implicitly see the point though I'm coding for more than 25 years now. So here comes the story ...

On trying to work with Cyrus::IMAP::Admin instances in Perl5 I've tried to get and print a list of quotas resulting in a somewhat strangely structured data returned.

my %quotas = $client->listquota(@list[0]);

if ( $client->error ) {
    printf STDERR "Error: " . $client->error . "\n";
    exit 1;
}

print "root: " . $list[0] . "\n";

foreach my $quota ( keys %quotas ) {
    print( $quota, " ", $quotas{$quota}[0], "/", $quotas{$quota}[1], " KiB\n" );
}

This code is actually working as desired by printing out stuff like

root: user.myuser
STORAGE: 123/4567 KiB

This code was taken from Cyrus::IMAP::Shell reading similar to this:

my %quota = $$cyrref->listquota(@nargv);
foreach my $quota (keys %quota) {
    $lfh->[1]->print(" ", $quota, " ", $quota{$quota}[0], "/", $quota{$quota}[1]);
    if ($quota{$quota}[1]) {
        $lfh->[1]->print(" (", $quota{$quota}[0] * 100 / $quota{$quota}[1], "%)");
    }
}

This code looks somewhat silly to me for using $quota{$quota}[0]. In my case I renamed variables a bit for rejecting that mixed use of differently typed but equivalently named variables.

Prior to taking the code from Cyrus::IMAP::Admin I tried to understand its specification and to process the result by code written myself. It looked like this:

my %quotas = $client->listquota(@list[0]);

if ( $client->error ) {
    printf STDERR "Error: " . $client->error . "\n";
    exit 1;
}

print "root: " . $list[0] . "\n";

foreach my $quota ( keys %quotas ) {
    my @sizes = @quotas{$quota};
    print( $quota, " ", $sizes[0], "/", $sizes[1], "\n" );
}

However, this code didn't work and I didn't find any plausible explanation myself. My understanding here is that transferring this last code example to the initially posted form would require to have source of assignment in line 11 substituted into usages in line 12 and change the sigil of quotas from @ to $ for I'm trying to get a scalar result finally. This last code was printing an array reference before slash and nothing after it. So I had to fix my code like this to get it working:

my %quotas = $client->listquota(@list[0]);

if ( $client->error ) {
    printf STDERR "Error: " . $client->error . "\n";
    exit 1;
}

print "root: " . $list[0] . "\n";

foreach my $quota ( keys %quotas ) {
    my @sizes = @quotas{$quota};
    print( $quota, " ", $sizes[0][0], "/", $sizes[0][1], "\n" );
}

This additional dereferencing in line 12 is what I'm confused about now. Why is @sizes containing an array storing another array in its sole first element? For being confused I already tried alternative code in line 11 to no avail. These tests included

    my @sizes = $quotas{$quota};

(for its equivalence with original code posted above) and

    my $sizes = @quotas{$quota};

(for I don't know why). Switching sigils don't seem to change semantics of assignment here at all. But using this assignment seems to open different view on data structure contained in %quotas originally. What sigils are required to have @sizes matching content and structure of $quotas{$quota} as used in top-most code fragment?


Source: (StackOverflow)

What are the avaialble compilers/interpreters for Perl 5?

Like C where gcc, borland and many more compilers are available, I am wondering whether any other Compiler/Interpreters are available for Perl 5?

From my reading, I understand there was perlcc which compiled the code into B:OP format and then interpreter was used to convert the optree to machine executable.


Source: (StackOverflow)

Is there an analogue of Ruby gsub method in Perl? [duplicate]

Possible Duplicate:
How do I perform a Perl substitution on a string while keeping the original?

How do I do one line replacements in Perl without modifying the string itself? I also want it to be usable inside expressions, much like I can do p s.gsub(/from/, 'to') in Ruby.

All I can think of is

do {my $r = $s; $r =~ s/from/to/; $r}

but sure there is a better way?


Source: (StackOverflow)

Find runs of capitalized letters

I have a file containing some fully capitalized words and some mixed-case words, and I want to extract the fully capitalized runs of words (contained in one line) – that is, things separated by \b and containing at least two capital letters and no lowercase letters. Everything is 7-bit.

So, for example, if a line is

The QUICK Brown fox JUMPs OV3R T4E LAZY DoG.

then I'd want to extract QUICK and OV3R T4E LAZY.

This is what I have so far:

while (<$fh>) { # file handle
    my @array = $_ =~ /\b[^a-z]*[A-Z][^a-z]*[A-Z][^a-z]*\b/;
    push @bigarray, @array;
}

Is there a more elegant way to do it than [^a-z]*[A-Z][^a-z]*[A-Z][^a-z]*?


Source: (StackOverflow)

Are perl5 libraries are importable in perl6?

I know that perl6 will definetely allow importing perl5 code but I'm not able to do this.

Here is perl6 code

use perl5:Net::FTP;

It reports an error

Is there any configuration issue or it is not ready yet?


Source: (StackOverflow)

Can't locate CPAN.pm in @INC (@INC contains: /usr/local/lib/perl5 /usr/local/share/perl5

I tried to install some modules to a new server (fedora core 18) but I'm betting this error:

Can't locate CPAN.pm in @INC (@INC contains: /usr/local/lib/perl5 /usr/local/share/perl5 /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5 /usr/share/perl5 .).
BEGIN failed--compilation aborted.

The module I need to install is : XML/Writer.pm

because I'm getting this error:

Can't locate XML/Writer.pm in @INC (@INC contains: /usr/local/lib/perl5 /usr/local/share/perl5 /usr/lib/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib/perl5 /usr/share/perl5 .

Any of you knows why or how can I fix this errors?


Source: (StackOverflow)

issue accessing lexical scope using B

For debugging purposes I'd like to Access the lexical scope of different subroutines with a specific Attribute set. That works fine. I get a Problem when the first variable stores a string, then I get a empty string. I do something like this:

$pad = $cv->PADLIST; # $cv is the coderef to the sub
@scatchpad = $pad->ARRAY; # getting the scratchpad
@varnames = $scratchpad[0]->ARRAY; # getting the variablenames
@varcontents = $scratchpad[1]->ARRAY; # getting the Content from the vars

for (0 .. $#varnames) {
    eval {
        my $name = $varnames[$_]->PV;
        my $content;
        # following line  matches numbers, works so far
        $content = $varcontent[$_]->IVX if (scalar($varcontent[$_]) =~ /PVIV=/);
        # should match strings, but does give me undef
        $content = B::perlstring($varcontent[$_]->PV) if (scalar($varcontent[$_]) =~ /PV=/);
        print "DEBUGGER> Local variable: ", $name, " = ", $content, "\n";
    }; # there are Special vars that throw a error, but i don't care about them
}

Like I said in the comment the eval is to prevent the Errors from the B::Special objects in the scratchpad. Output:

Local variable: $test = 42
Local variable: $text = 0

The first Output is okay, the second should Output "TEXT" instead of 0.

What am I doing wrong?

EDIT: With a little bit of coding I got all values of the variables , but not stored in the same indexes of @varnames and @varcontents. So now is the question how (in which order) the values are stored in @varcontents.

use strict;
use warnings;
use B;

sub testsub {
    my $testvar1 = 42;
    my $testvar2 = 21;
    my $testvar3 = "testval3";
    print "printtest1";
    my $testvar4 = "testval4";
    print "printtest2";
    return "returnval";
}

no warnings "uninitialized";

my $coderef = \&testsub;
my $cv = B::svref_2object ( $coderef );
my $pad = $cv->PADLIST; # get scratchpad object
my @scratchpad = $pad->ARRAY;
my @varnames = $scratchpad[0]->ARRAY; # get varnames out of scratchpad
my @varcontents = $scratchpad[1]->ARRAY; # get content array out of scratchpad

my @vars; # array to store variable names adn "undef" for special objects (print-values, return-values, etc.)

for (0 .. $#varnames) {
    eval { push @vars, $varnames[$_]->PV; };
    if ($@) { push @vars, "undef"; }
}

my @cont; # array to store the content of the variables and special objects

for (0 .. $#varcontents) {
    eval { push @cont, $varcontents[$_]->IV; };
    eval { push @cont, $varcontents[$_]->PV; };
}

print $vars[$_], "\t\t\t", $cont[$_], "\n" for (0 .. $#cont);

EDIT2: Added runnable script to demonstrate the issue: Variablenames and variablevalues are not stored in the same index of the two Arrays (@varnames and @varcontents).


Source: (StackOverflow)

How do Perl Cwd::cwd and Cwd::getcwd functions differ?

The question

What is the difference between Cwd::cwd and Cwd::getcwd in Perl, generally, without regard to any specific platform? Why does Perl have both? What is the intended use, which one should I use in which scenarios? (Example use cases will be appreciated.) Does it matter? (Assuming I don’t mix them.) Does choice of either one affect portability in any way? Which one is more commonly used in modules?

Even if I interpret the manual is saying that except for corner cases cwd is `pwd` and getcwd just calls getcwd from unistd.h, what is the actual difference? This works only on POSIX systems, anyway.

I can always read the implementation but that tells me nothing about the meaning of those functions. Implementation details may change, not so defined meaning. (Otherwise a breaking change occurs, which is serious business.)

What does the manual say

Quoting Perl’s Cwd module manpage:

Each of these functions are called without arguments and return the absolute path of the current working directory.

  • getcwd

    my $cwd = getcwd();

    Returns the current working directory.

    Exposes the POSIX function getcwd(3) or re-implements it if it's not available.

  • cwd

    my $cwd = cwd();

    The cwd() is the most natural form for the current architecture. For most systems it is identical to `pwd` (but without the trailing line terminator).

And in the Notes section:

  • Actually, on Mac OS, the getcwd(), fastgetcwd() and fastcwd() functions are all aliases for the cwd() function, which, on Mac OS, calls `pwd`. Likewise, the abs_path() function is an alias for fast_abs_path()

OK, I know that on Mac OS1 there is no difference between getcwd() and cwd() as both actually boil down to `pwd`. But what on other platforms? (I’m especially interested in Debian Linux.)


1 Classic Mac OS, not OS X. $^O values are MacOS and darwin for Mac OS and OS X, respectively. Thanks, @tobyink and @ikegami.

And a little meta-question: How to avoid asking similar questions for other modules with very similar functions? Is there a universal way of discovering the difference, other than digging through the implementation? (Currently, I think that if the documentation is not clear about intended use and differences, I have to ask someone more experienced or read the implementation myself.)


Source: (StackOverflow)

Perl - pass code block as parameter inside parenthesis

Is it possible to pass a block of code to a sub using "parenthesis" syntax?

I.e. when i write

List::MoreUtils::any { defined ($_) } (undef, undef, 1);

it works. But when i try to add parenthesis

List::MoreUtils::any ( { defined ($_) } , (undef, undef, 1) );

this is interpreted as an anonymous hash, giving an error message. Neither escaping nor using eval helps.

The idea behind all the fuss is if the call is a part of an expression, i.e.

if (first_index { defined (${$_})} $jms_positions > $jms_positionals_seen )

some operator following the arguments might be executed before the call, producing an undesired result.


Source: (StackOverflow)

Cleanse a shell runtime environment to run system Perl?

I need to create a several Perl programs on a Solaris 9 SPARC environment running Oracle EBS, one of which will be run from cron. The UNIX account that will be running Perl has all the environment variables set up to run Oracle-centric programs, so when I run "/usr/bin/perl -V", I get the following compilation error. Fortunately, the cron run Perl is not impacted by the Oracle environment settings.

bash-2.05$ /usr/bin/perl -V
Perl lib version (5.00503) doesn't match executable version (5.008) at /u01/app/applmgr/pr/iAS/Apache/perl/lib/5.00503/sun4-solaris/Config.pm line 7.
Compilation failed in require.
BEGIN failed--compilation aborted.

My first thought was to use the BEGIN block to do some house cleaning so I can use the system Perl rather than the Oracle EBS supplied version.

#!/usr/bin/perl

BEGIN {
  delete $ENV{PERL5LIB};
  delete @INC[0..$#INC];
  push @INC, map { "/usr/local/lib/perl5/$_" } (
    '5.8.0','5.8.0/sun4-solaris',
    'site_perl','site_perl/5.8.0','site_perl/5.8.0/sun4-solaris'
  );
}

print "Hello clean Perl environment! :)\n";

I am not permitted to modify the UNIX account's local profile, so is this the proper way to handle this scenario?


Source: (StackOverflow)

How can I loop over an array from the first occurrence of an element with a specific value using perl?

I have an array like ("valueA", "valueB", "valueC", "valueD") etc. I want to loop over the values of the array starting from (for example) the first instance of "valueC". Everything in the array before the first instance of the value "valueC" should be ignored; so in this case only "valueC" and "valueD" would be handled by the loop.

I can just put a conditional inside my loop, but is there a neater way to express the idea using perl?


Source: (StackOverflow)

Can't use string ("1") as a subroutine ref while "strict refs" in use

In a Perl daemon reacting to various events I'm trying to use a Null object pattern in 2 cases by creating anonymous subroutines, which should just return a value of 1 aka "true" (please scroll to the right to see the check subroutines for LOGIN and ALIVE events):

package User;

our %EVENTS = (
        LOGIN   => {handler => \&handleLogin,   check => sub {1},     },
        CHAT    => {handler => \&handleChat,    check => \&mayChat,   },
        JOIN    => {handler => \&handleJoin,    check => \&mayJoin,   },
        LEAVE   => {handler => \&handleLeave,   check => \&mayLeave,  },
        ALIVE   => {handler => sub {},          check => sub {1},     },
        BID     => {handler => \&handleBid,     check => \&checkArgs, },
        TAKE    => {handler => \&handleTake,    check => \&checkArgs, },
  # .... more events ....
);


sub action($$$) {
        my $user  = shift;
        my $event = shift;
        my $arg   = shift;
        my $game  = $user->{GAME};

        unless (exists $EVENTS{$event}) {
                print STDERR "wrong event: $event\n";
                return;
        }

        my $handler = $EVENTS{$event}->{handler};
        my $check   = $EVENTS{$event}->{check};

        return unless $user->$check->($arg); # XXX fails
        $user->$handler->($arg);
}

sub mayChat($$) {
        my $user = shift;

        return if $user->{KIBITZER};
}

# ...... more methods here ...

1;

Unfortunately I get the runtime error for LOGIN event:

Can't use string ("1") as a subroutine ref while "strict refs" in use

Does anybody please know how to fix it here?

How to provide a "function pointer" to an anonymous Perl subroutine?

The handler => \&sub { 1 } doesn't do it either.

Using perl 5.8.8 and perl 5.10.1 on CentOS 5.x and 6.x

UPDATE:

I've also tried following:

    my $check = $EVENTS{$event}->{check};
    return unless $check->($user, $arg);

but it doesn't help. I think this rules out the "missing blessing" suggested in some answers.

UPDATE 2:

I have extended the source code snippet in my original question. The background is: I'm in the process of refactoring of my source code and thus I've created the %EVENTS hash as listed above, so that for each incoming event (a string sent over TCP-socket from a Flash client) there is a reference to a subroutine (check) which validates the event and a reference to another subroutine (handler) which performs some actions. I'm not sure if other subroutines work - I'm stuck already at the first LOGIN event.

I also don't understand why doesn't check => sub { 1 } above work - isn't sub supposed to return a reference to an anonymous subroutine (when the name is omitted - according to perldoc perlref section 4)?

UPDATE 3:

The output of print Dumper(\%EVENTS) -

$VAR1 = {
          'PLAY' => {
                      'check' => sub { "DUMMY" },
                      'handler' => sub { "DUMMY" },
                    },
          'JOIN' => {
                      'check' => sub { "DUMMY" },
                      'handler' => sub { "DUMMY" },
                    },
          'OVER1' => {
                       'check' => sub { "DUMMY" },
                       'handler' => sub { "DUMMY" },
                     },
          'ALIVE' => {
                       'check' => sub { "DUMMY" },
                       'handler' => sub { "DUMMY" },
                     },
          'DISCARD' => {
                         'check' => $VAR1->{'PLAY'}{'check'},
                         'handler' => sub { "DUMMY" },
                       },
          'MISS1' => {
                       'check' => sub { "DUMMY" },
                       'handler' => sub { "DUMMY" },
                     },
          'LOGIN' => {
                       'check' => sub { "DUMMY" },
                       'handler' => sub { "DUMMY" },
                     },
          'TAKE' => {
                      'check' => $VAR1->{'PLAY'}{'check'},
                      'handler' => sub { "DUMMY" },
                    },
          'ONEMORE' => {
                         'check' => sub { "DUMMY" },
                         'handler' => sub { "DUMMY" },
                       },
          'OVER2' => {
                       'check' => sub { "DUMMY" },
                       'handler' => sub { "DUMMY" },
                     },
          'MISS2' => {
                       'check' => sub { "DUMMY" },
                       'handler' => sub { "DUMMY" },
                     },
          'EXACT' => {
                       'check' => sub { "DUMMY" },
                       'handler' => sub { "DUMMY" },
                     },
          'TRUST' => {
                       'check' => $VAR1->{'PLAY'}{'check'},
                       'handler' => sub { "DUMMY" },
                     },
          'LEAVE' => {
                       'check' => sub { "DUMMY" },
                       'handler' => sub { "DUMMY" },
                     },
          'DEFEND' => {
                        'check' => $VAR1->{'PLAY'}{'check'},
                        'handler' => sub { "DUMMY" },
                      },
          'OPEN' => {
                      'check' => $VAR1->{'PLAY'}{'check'},
                      'handler' => sub { "DUMMY" },
                    },
          'REVEAL' => {
                        'check' => sub { "DUMMY" },
                        'handler' => sub { "DUMMY" },
                      },
          'CHAT' => {
                      'check' => sub { "DUMMY" },
                      'handler' => sub { "DUMMY" },
                    },
          'DECLARE' => {
                         'check' => $VAR1->{'PLAY'}{'check'},
                         'handler' => sub { "DUMMY" },
                       },
          'BACK' => {
                      'check' => sub { "DUMMY" },
                      'handler' => sub { "DUMMY" },
                    },
          'MISERE' => {
                        'check' => sub { "DUMMY" },
                        'handler' => sub { "DUMMY" },
                      },
          'BID' => {
                     'check' => $VAR1->{'PLAY'}{'check'},
                     'handler' => sub { "DUMMY" },
                   }
        };

Source: (StackOverflow)

What is indirect object notation, why is it bad, and how does one avoid it?

The title pretty much sums it up, but here's the long version anyway.

After posting a small snippet of perl code, I was told to avoid indirect object notation, "as it has several side effects". The comment referenced this particular line:

my $some_object = new Some::Module(FIELD => 'value');

As this is how I've always done it, in an effort to get with the times I therefore ask:

  • What's so bad about it? (specifically)
  • What are the potential (presumably negative) side effects?
  • How should that line be rewritten?

I was about to ask the commenter, but to me this is worthy of its own post.


Source: (StackOverflow)