Perl crash with recursive sub and regex with code eval. #22869

Open
baltitenger opened this issue Dec 19, 2024 · 5 comments

Comments

@baltitenger

Description
Perl crashes (malloc assertion or general segfault) when trying to run the following script:

Steps to Reproduce

use v5.40;

sub foo;
my $pat = qr/^(?:a|aa)(*{foo(substr $_, pos)})(*F)/;
sub foo($x) {
  say $x || '- fin -';
  $x =~ /$pat/;
}
foo 'aaaaaaaa';

Expected behavior
The code should run.

Perl configuration

Summary of my perl5 (revision 5 version 40 subversion 0) configuration:
   
  Platform:
    osname=linux
    osvers=5.12.15-arch1-1
    archname=x86_64-linux-thread-multi
    uname='archlinux'
    config_args='-des -Dusethreads -Duseshrplib -Doptimize=-march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions         -Wp,-D_FORTIFY_SOURCE=3 -Wformat -Werror=format-security         -fstack-clash-protection -fcf-protection         -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -g -ffile-prefix-map=/build/perl/src=/usr/src/debug/perl -flto=auto -Dprefix=/usr -Dvendorprefix=/usr -Dprivlib=/usr/share/perl5/core_perl -Darchlib=/usr/lib/perl5/5.40/core_perl -Dsitelib=/usr/share/perl5/site_perl -Dsitearch=/usr/lib/perl5/5.40/site_perl -Dvendorlib=/usr/share/perl5/vendor_perl -Dvendorarch=/usr/lib/perl5/5.40/vendor_perl -Dscriptdir=/usr/bin/core_perl -Dsitescript=/usr/bin/site_perl -Dvendorscript=/usr/bin/vendor_perl -Dinc_version_list=none -Dman1ext=1perl -Dman3ext=3perl -Dlddlflags=-shared -Wl,-O1 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now          -Wl,-z,pack-relative-relocs -flto=auto -Dldflags=-Wl,-O1 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now          -Wl,-z,pack-relative-relocs -flto=auto -Dloclibpth=/usr/lib/db5.3 -Dlocincpth=/usr/include/db5.3'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=define
    usemultiplicity=define
    use64bitint=define
    use64bitall=define
    uselongdouble=undef
    usemymalloc=n
    default_inc_excludes_dot=define
  Compiler:
    cc='cc'
    ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/include/db5.3 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
    optimize='-march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=3 -Wformat -Werror=format-security -fstack-clash-protection -fcf-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -g -ffile-prefix-map=/build/perl/src=/usr/src/debug/perl -flto=auto'
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/include/db5.3'
    ccversion=''
    gccversion='14.2.1 20240805'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='double'
    nvsize=8
    Off_t='off_t'
    lseeksize=8
    alignbytes=8
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags ='-Wl,-O1 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,-z,pack-relative-relocs -flto=auto -fstack-protector-strong -L/usr/lib/db5.3'
    libpth=/usr/local/lib /usr/lib /usr/lib/db5.3
    libs=-lpthread -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc -lgdbm_compat
    perllibs=-lpthread -ldl -lm -lcrypt -lutil -lc
    libc=/lib/../lib/libc.so.6
    so=so
    useshrplib=true
    libperl=libperl.so
    gnulibc_version='2.40'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=so
    d_dlsymun=undef
    ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.40/core_perl/CORE'
    cccdlflags='-fPIC'
    lddlflags='-shared -Wl,-O1 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,-z,pack-relative-relocs -flto=auto -L/usr/lib/db5.3 -fstack-protector-strong'


Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_LONG_DOUBLE
    HAS_STRTOLD
    HAS_TIMES
    MULTIPLICITY
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_HASH_FUNC_SIPHASH13
    PERL_HASH_USE_SBOX32
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_ITHREADS
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_PERLIO
    USE_PERL_ATOF
    USE_REENTRANT_API
    USE_THREAD_SAFE_LOCALE
  Built under linux
  Compiled at Sep  1 2024 11:21:17
  @INC:
    /usr/lib/perl5/5.40/site_perl
    /usr/share/perl5/site_perl
    /usr/lib/perl5/5.40/vendor_perl
    /usr/share/perl5/vendor_perl
    /usr/lib/perl5/5.40/core_perl
    /usr/share/perl5/core_perl
@mauke
Contributor

mauke commented Dec 19, 2024

Crash stack (blead):

(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007ffff7c4526e in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007ffff7c288ff in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff7c297b6 in __libc_message_impl (fmt=fmt@entry=0x7ffff7dce8d7 "%s\n") at ../sysdeps/posix/libc_fatal.c:132
#6  0x00007ffff7ca8fe5 in malloc_printerr (str=str@entry=0x7ffff7dd1eb8 "malloc(): unaligned tcache chunk detected") at ./malloc/malloc.c:5772
#7  0x00007ffff7cad93c in tcache_get_n (ep=<optimized out>, tc_idx=<optimized out>) at ./malloc/malloc.c:3183
#8  tcache_get (tc_idx=<optimized out>) at ./malloc/malloc.c:3199
#9  __GI___libc_malloc (bytes=bytes@entry=8) at ./malloc/malloc.c:3320
#10 0x00005555557da594 in Perl_safesysmalloc (size=8) at util.c:176
#11 0x00005555557d4d5a in Perl_bytes_to_utf8 (s=s@entry=0x555555a55540 "- fin -", lenp=lenp@entry=0x7fffffffb518) at /home/mauke/Projects/perl5/inline.h:1933
#12 0x00005555557ebebd in Perl_do_print (sv=0x555555a55b20, fp=fp@entry=0x555555a32690) at doio.c:2224
#13 0x000055555569da84 in Perl_pp_print () at pp_hot.c:2155
#14 0x00005555555f099a in Perl_runops_debug () at dump.c:2993
#15 0x00005555557449f1 in S_regmatch (reginfo=reginfo@entry=0x7fffffffb8b0, startpos=0x555555a52e80 "a", prog=prog@entry=0x555555a55474) at regexec.c:8352
#16 0x000055555574ea10 in S_regtry (reginfo=reginfo@entry=0x7fffffffb8b0, startposp=startposp@entry=0x7fffffffb898) at regexec.c:4422
#17 0x000055555575ac0a in Perl_regexec_flags (rx=<optimized out>, stringarg=0x555555a52e80 "a", strend=0x555555a52e81 "", strbeg=0x555555a52e80 "a", minend=<optimized out>, sv=0x555555a49b08, data=0x0, flags=97) at regexec.c:3941
#18 0x00005555556a1157 in Perl_pp_match () at pp_hot.c:3829
#19 0x00005555555f099a in Perl_runops_debug () at dump.c:2993
#20 0x00005555557449f1 in S_regmatch (reginfo=reginfo@entry=0x7fffffffbd30, startpos=0x555555a53650 "aa", prog=prog@entry=0x555555a55474) at regexec.c:8352
#21 0x000055555574ea10 in S_regtry (reginfo=reginfo@entry=0x7fffffffbd30, startposp=startposp@entry=0x7fffffffbd18) at regexec.c:4422
#22 0x000055555575ac0a in Perl_regexec_flags (rx=<optimized out>, stringarg=0x555555a53650 "aa", strend=0x555555a53652 "", strbeg=0x555555a53650 "aa", minend=<optimized out>, sv=0x555555a49c88, data=0x0, flags=97) at regexec.c:3941
#23 0x00005555556a1157 in Perl_pp_match () at pp_hot.c:3829
#24 0x00005555555f099a in Perl_runops_debug () at dump.c:2993
#25 0x00005555557449f1 in S_regmatch (reginfo=reginfo@entry=0x7fffffffc1b0, startpos=0x555555a53b50 "aaa", prog=prog@entry=0x555555a55474) at regexec.c:8352
#26 0x000055555574ea10 in S_regtry (reginfo=reginfo@entry=0x7fffffffc1b0, startposp=startposp@entry=0x7fffffffc198) at regexec.c:4422
#27 0x000055555575ac0a in Perl_regexec_flags (rx=<optimized out>, stringarg=0x555555a53b50 "aaa", strend=0x555555a53b53 "", strbeg=0x555555a53b50 "aaa", minend=<optimized out>, sv=0x555555a49e98, data=0x0, flags=97) at regexec.c:3941
#28 0x00005555556a1157 in Perl_pp_match () at pp_hot.c:3829
#29 0x00005555555f099a in Perl_runops_debug () at dump.c:2993
#30 0x00005555557449f1 in S_regmatch (reginfo=reginfo@entry=0x7fffffffc630, startpos=0x555555a512a0 "aaaaa", prog=prog@entry=0x555555a55474) at regexec.c:8352
#31 0x000055555574ea10 in S_regtry (reginfo=reginfo@entry=0x7fffffffc630, startposp=startposp@entry=0x7fffffffc618) at regexec.c:4422
#32 0x000055555575ac0a in Perl_regexec_flags (rx=<optimized out>, stringarg=0x555555a512a0 "aaaaa", strend=0x555555a512a5 "", strbeg=0x555555a512a0 "aaaaa", minend=<optimized out>, sv=0x555555a3f980, data=0x0, flags=97)
    at regexec.c:3941
#33 0x00005555556a1157 in Perl_pp_match () at pp_hot.c:3829
#34 0x00005555555f099a in Perl_runops_debug () at dump.c:2993
#35 0x00005555557449f1 in S_regmatch (reginfo=reginfo@entry=0x7fffffffcab0, startpos=0x555555a3d400 "aaaaaa", prog=prog@entry=0x555555a55474) at regexec.c:8352
#36 0x000055555574ea10 in S_regtry (reginfo=reginfo@entry=0x7fffffffcab0, startposp=startposp@entry=0x7fffffffca98) at regexec.c:4422
#37 0x000055555575ac0a in Perl_regexec_flags (rx=<optimized out>, stringarg=0x555555a3d400 "aaaaaa", strend=0x555555a3d406 "", strbeg=0x555555a3d400 "aaaaaa", minend=<optimized out>, sv=0x555555a3f3e0, data=0x0, flags=97)
    at regexec.c:3941
#38 0x00005555556a1157 in Perl_pp_match () at pp_hot.c:3829
#39 0x00005555555f099a in Perl_runops_debug () at dump.c:2993
#40 0x00005555557449f1 in S_regmatch (reginfo=reginfo@entry=0x7fffffffcf30, startpos=0x555555a2bd70 "aaaaaaa", prog=prog@entry=0x555555a55474) at regexec.c:8352
#41 0x000055555574ea10 in S_regtry (reginfo=reginfo@entry=0x7fffffffcf30, startposp=startposp@entry=0x7fffffffcf18) at regexec.c:4422
#42 0x000055555575ac0a in Perl_regexec_flags (rx=<optimized out>, stringarg=0x555555a2bd70 "aaaaaaa", strend=0x555555a2bd77 "", strbeg=0x555555a2bd70 "aaaaaaa", minend=<optimized out>, sv=0x555555a36c58, data=0x0, flags=97)
    at regexec.c:3941
#43 0x00005555556a1157 in Perl_pp_match () at pp_hot.c:3829
#44 0x00005555555f099a in Perl_runops_debug () at dump.c:2993
#45 0x00005555557449f1 in S_regmatch (reginfo=reginfo@entry=0x7fffffffd3b0, startpos=0x555555a50fc0 "aaaaaaaa", prog=prog@entry=0x555555a55474) at regexec.c:8352
#46 0x000055555574ea10 in S_regtry (reginfo=reginfo@entry=0x7fffffffd3b0, startposp=startposp@entry=0x7fffffffd398) at regexec.c:4422
#47 0x000055555575ac0a in Perl_regexec_flags (rx=<optimized out>, stringarg=0x555555a50fc0 "aaaaaaaa", strend=0x555555a50fc8 "", strbeg=0x555555a50fc0 "aaaaaaaa", minend=<optimized out>, sv=0x555555a55af0, data=0x0, flags=97)
    at regexec.c:3941
#48 0x00005555556a1157 in Perl_pp_match () at pp_hot.c:3829
#49 0x00005555555f099a in Perl_runops_debug () at dump.c:2993
#50 0x00005555555d4fb2 in S_run_body (oldscope=1) at perl.c:2883
#51 perl_run (my_perl=<optimized out>) at perl.c:2798
#52 0x000055555559d28c in main (argc=<optimized out>, argv=<optimized out>, env=<optimized out>) at perlmain.c:127

Valgrind indicates this is a use-after-free:

$ valgrind ./perl bug
==264173== Memcheck, a memory error detector
==264173== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==264173== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==264173== Command: ./perl bug
==264173== 
aaaaaaaa
aaaaaaa
aaaaaa
aaaaa  
aaaa   
aaa    
aa
a      
- fin -
- fin -
==264173== Invalid write of size 8
==264173==    at 0x4852E61: memmove (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==264173==    by 0x30DF54: memcpy (string_fortified.h:29)
==264173==    by 0x30DF54: Perl_regexec_flags (regexec.c:4336)
==264173==    by 0x255156: Perl_pp_match (pp_hot.c:3829)
==264173==    by 0x1A4999: Perl_runops_debug (dump.c:2993)
==264173==    by 0x2F89F0: S_regmatch (regexec.c:8352)
==264173==    by 0x302A0F: S_regtry (regexec.c:4422)
==264173==    by 0x30EC09: Perl_regexec_flags (regexec.c:3941)                                                        
==264173==    by 0x255156: Perl_pp_match (pp_hot.c:3829)                                                              
==264173==    by 0x1A4999: Perl_runops_debug (dump.c:2993)
==264173==    by 0x2F89F0: S_regmatch (regexec.c:8352)
==264173==    by 0x302A0F: S_regtry (regexec.c:4422)
==264173==    by 0x30EC09: Perl_regexec_flags (regexec.c:3941)
==264173==  Address 0x4c15720 is 0 bytes inside a block of size 24 free'd
==264173==    at 0x484988F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)                         
==264173==    by 0x2AA490: Perl_pregfree2 (regcomp.c:13288) 
==264173==    by 0x31EC63: Perl_sv_clear (sv.c:6812)                                                                  
==264173==    by 0x31F7A7: Perl_sv_free2 (sv.c:7337)
==264173==    by 0x239DD7: Perl_SvREFCNT_dec (sv_inline.h:696)                                                        
==264173==    by 0x239DD7: Perl_pp_regcomp (pp_ctl.c:163)                                                             
==264173==    by 0x1A4999: Perl_runops_debug (dump.c:2993)
==264173==    by 0x2F89F0: S_regmatch (regexec.c:8352)
==264173==    by 0x302A0F: S_regtry (regexec.c:4422)
==264173==    by 0x30EC09: Perl_regexec_flags (regexec.c:3941)
==264173==    by 0x255156: Perl_pp_match (pp_hot.c:3829)
==264173==    by 0x1A4999: Perl_runops_debug (dump.c:2993)
==264173==    by 0x2F89F0: S_regmatch (regexec.c:8352)
==264173==  Block was alloc'd at
==264173==    at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==264173==    by 0x38E593: Perl_safesysmalloc (util.c:176)
==264173==    by 0x2AAAE3: Perl_reg_temp_copy (regcomp.c:13397)
==264173==    by 0x239A8A: Perl_pp_regcomp (pp_ctl.c:157)
==264173==    by 0x1A4999: Perl_runops_debug (dump.c:2993)
==264173==    by 0x2F89F0: S_regmatch (regexec.c:8352)
==264173==    by 0x302A0F: S_regtry (regexec.c:4422)
==264173==    by 0x30EC09: Perl_regexec_flags (regexec.c:3941)
==264173==    by 0x255156: Perl_pp_match (pp_hot.c:3829)
==264173==    by 0x1A4999: Perl_runops_debug (dump.c:2993)
==264173==    by 0x2F89F0: S_regmatch (regexec.c:8352)
==264173==    by 0x302A0F: S_regtry (regexec.c:4422)
==264173== 

@jkeenan
Contributor

jkeenan commented Dec 23, 2024

I found that if I modified the test program as follows to include warnings, then the crash went away and the program ran to (apparent) completion.

$ diff gh-22869-crash.pl gh-22869-warnings.pl 
1a2
> use warnings;
$ perl gh-22869-warnings.pl 
aaaaaaaa
aaaaaaa
aaaaaa
aaaaa
aaaa
aaa
aa
a
- fin -
- fin -
a
- fin -
aa
a
- fin -
- fin -
aaa
aa
a
- fin -
- fin -
a
- fin -
aaaa
aaa
aa
a
- fin -
- fin -
a
- fin -
aa
a
- fin -
- fin -
aaaaa
aaaa
aaa
aa
a
- fin -
- fin -
a
- fin -
aa
a
- fin -
- fin -
aaa
aa
a
- fin -
- fin -
a
- fin -
aaaaaa
aaaaa
aaaa
aaa
aa
a
- fin -
- fin -
a
- fin -
aa
a
- fin -
- fin -
aaa
aa
a
- fin -
- fin -
a
- fin -
aaaa
aaa
aa
a
- fin -
- fin -
a
- fin -
aa
a
- fin -
- fin -
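
For comparison, here is a rough sketch of what the script appears to be computing, written as plain recursion with no regex code blocks. This is only my reading of the intended behaviour (try the prefix a, then aa, and recurse on the remainder each time); the name expected_lines is made up for illustration and is not part of the report.

```perl
use strict;
use warnings;

# Collect the lines foo() would print for a given input, assuming the regex
# tries the prefix "a" first, then "aa", and recurses on what is left.
sub expected_lines {
    my ($x, $lines) = @_;
    $lines //= [];
    push @$lines, $x || '- fin -';    # same fallback as in the report
    for my $prefix ('a', 'aa') {
        # Recurse only when $x really starts with this prefix.
        expected_lines(substr($x, length $prefix), $lines)
            if index($x, $prefix) == 0;
    }
    return $lines;
}

my $lines = expected_lines('a' x 8);
print scalar(@$lines), " lines\n";    # 88, the same count as the listing above
```

If a run of the real script prints fewer lines than this for the same input, it presumably stopped early rather than finishing.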

@baltitenger
Author

baltitenger commented Dec 23, 2024

Interesting... I just checked: if you add use warnings but call foo with 'a' x 13, it crashes again (with 'a' x 12 it runs just fine).

(btw I thought v5.40 implied warnings)
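
To find the length at which the crash shows up on a particular build, something like the following harness might help. It is only a sketch: it writes the repro (with use warnings added) to a temporary file, runs it in a child perl for a few lengths, and reports whether the child exited normally or was killed by a signal. Taking the length from $ARGV[0], the helper name run_child and the chosen lengths are my own additions, not from the report.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile);

# The repro from this thread, with "use warnings" added and the input
# length read from the command line (an assumption made for this sketch).
my $repro = <<'PERL';
use v5.40;
use warnings;
sub foo;
my $pat = qr/^(?:a|aa)(*{foo(substr $_, pos)})(*F)/;
sub foo($x) {
  say $x || '- fin -';
  $x =~ /$pat/;
}
foo 'a' x $ARGV[0];
PERL

# Run $source in a child perl and describe how the child ended.
sub run_child {
    my ($source, @args) = @_;
    my ($fh, $file) = tempfile(SUFFIX => '.pl', UNLINK => 1);
    print {$fh} $source;
    close $fh or die "close: $!";

    my $pid = fork() // die "fork: $!";
    if ($pid == 0) {
        # Child: discard output, only the wait status is of interest.
        open STDOUT, '>', File::Spec->devnull or die "open: $!";
        open STDERR, '>', File::Spec->devnull or die "open: $!";
        exec($^X, $file, @args) or die "exec: $!";
    }
    waitpid $pid, 0;
    my $status = $?;
    return ($status & 127)
        ? 'signal ' . ($status & 127)    # killed, e.g. SIGABRT or SIGSEGV
        : 'exit ' . ($status >> 8);      # normal exit code
}

printf "%2d: %s\n", $_, run_child($repro, $_) for 8, 12, 13;
```

On a perl older than 5.40 the child fails at use v5.40 and is reported as a non-zero exit, not a signal. Because this looks like memory corruption, the threshold may well differ between builds and runs.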

@richardleach
Contributor

The crashing regex compiles to:

Compiling REx "^(?:a|aa)(*{foo(substr $_, pos)})(*F)"
Final program:
   1: SBOL /^/ (2)
   2: EXACT <a> (4)
   4: TRIE-EXACT[a] (11)
      <> 
      <a> 
  11: EVAL optimistic (14)
  14: OPFAIL (16)
  16: END (0)

Without the alternation (a plain a instead of a|aa), it doesn't seem to crash for different lengths of foo() input:

Compiling REx "^a(*{foo(substr $_, pos)})(*F)"
Final program:
   1: SBOL /^/ (2)
   2: EXACT <a> (4)
   4: EVAL optimistic (7)
   7: OPFAIL (9)
   9: END (0)

i.e. the backtracking interplay between the TRIE-EXACT and OPFAIL seems to matter.

The last few steps of the re debugging output are:

Matching REx "^(?:a|aa)(*{foo(substr $_, pos)})(*F)" against "a"
Intuit: trying to determine minimum start position...
  Looking for check substr at fixed offset 0...
Intuit: Successfully guessed: match at offset 0
   0 <> <a>                  |   0| 1:SBOL /^/(2)
   0 <> <a>                  |   0| 2:EXACT <a>(4)
   1 <a> <>                  |   0| 4:TRIE-EXACT[a](11)
                             |   0| TRIE: matched empty string...
   1 <a> <>                  |   0| 11:EVAL optimistic(14)
String shorter than min possible regex match (0 < 1)
   1 <a> <>                  |   1|  14:OPFAIL(16)
                             |   1|  failed...
                             |   0| failed...
Match failed
   2 <aa> <a>                |   1|  14:OPFAIL(16)
                             |   1|  failed...
                             |   0| failed...
Match failed
   2 <aa> <aaa>              |   1|  14:OPFAIL(16)
                             |   1|  failed...
                             |   0| failed...
Match failed
   1 <a> <aaaaa>             |   2|   14:OPFAIL(16)
                             |   2|   failed...
                             |   1|  failed...
                             |   0| TRIE matched word #2, continuing
                             |   0| TRIE: only one match left, short-circuiting: #2 <a>
   2 <aa> <aaaa>             |   0| 11:EVAL optimistic(14)
Matching REx "^(?:a|aa)(*{foo(substr $_, pos)})(*F)" against "aaaa"
Intuit: trying to determine minimum start position...
  Looking for check substr at fixed offset 0...
Intuit: Successfully guessed: match at offset 0
   0 <> <aaaa>               |   0| 1:SBOL /^/(2)
   0 <> <aaaa>               |   0| 2:EXACT <a>(4)
   1 <a> <aaa>               |   0| malloc_consolidate(): unaligned fastbin chunk detected

@richardleach
Contributor

This also crashes, so it's not specific to TRIE-EXACT:

Compiling REx "^aa?(*{foo(substr $_, pos)})(*F)"
Final program:
   1: SBOL /^/ (2)
   2: EXACT <a> (4)
   4: CURLY{0,1} (10)
   8:   EXACT <a> (0)
  10: EVAL optimistic (13)
  13: OPFAIL (15)
  15: END (0)
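
For anyone who wants to regenerate program dumps like the ones quoted in this thread, a sketch along these lines should do it. I'm assuming the dumps came from the re pragma's debug mode (the comments don't say exactly how they were produced), and it needs a perl recent enough to understand (*{ ... }), such as the 5.40 from the report. The patterns are only compiled, never matched, so nothing here should trigger the crash.

```perl
use strict;
use warnings;

# Ask the regex engine to print debugging information, including the
# compiled program for each pattern. The output goes to STDERR.
use re 'debug';

# The three patterns discussed in this thread. They are compiled but never
# matched, so foo() is never called and does not need to be defined.
my @patterns = (
    qr/^(?:a|aa)(*{foo(substr $_, pos)})(*F)/,    # original report
    qr/^a(*{foo(substr $_, pos)})(*F)/,           # no alternation
    qr/^aa?(*{foo(substr $_, pos)})(*F)/,         # optional second a
);

print scalar(@patterns), " patterns compiled\n";
```

Running the full repro under the same pragma (for example via -Mre=debug on the command line) adds the per-step matching trace as well, but expect a lot of output.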
