Cleanup - remove unused parameter with magic numbers #22871

happy-barney · 2024-12-21T21:04:52Z

NewOp's first parameter has been unused since 2007, polluting the codebase with magic numbers.

This PR adds a new macro, NewOp_v542, without this unused parameter.

The macro name is also a proof of concept: it embeds the Perl version in which the macro was added,
so that reasonable backward compatibility can be maintained through a version string in the symbol name.
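The difference at a call site can be sketched as follows. This is a simplified stand-in, not Perl's actual definition: the `SKETCH_` names, `sketch_alloc` and `struct sketch_op` are hypothetical, and plain `calloc` is assumed in place of Perl's op allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the op allocator (assumption: plain calloc). */
static void *sketch_alloc(size_t size)
{
    void *p = calloc(1, size);
    assert(p != NULL);
    return p;
}

/* Old shape: the first argument is accepted but never used, so every
 * call site has to supply some arbitrary number. */
#define SKETCH_NewOp(id, var, count, type) \
    ((var) = (type *) sketch_alloc((count) * sizeof(type)))

/* New shape, named after the release that introduces it: no dead argument. */
#define SKETCH_NewOp_v542(var, count, type) \
    ((var) = (type *) sketch_alloc((count) * sizeof(type)))

struct sketch_op { int type; int flags; };

/* Both spellings allocate one zeroed op; only the call site differs. */
static int sketch_demo(void)
{
    struct sketch_op *old_style, *new_style;
    int same;

    SKETCH_NewOp(1101, old_style, 1, struct sketch_op); /* 1101 means nothing */
    SKETCH_NewOp_v542(new_style, 1, struct sketch_op);

    same = (old_style->type == new_style->type);
    free(old_style);
    free(new_style);
    return same;
}
```

Keeping the old macro next to the new one is what lets existing callers compile unchanged while new code drops the number.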

iabyn · 2024-12-23T10:51:37Z

On Sat, Dec 21, 2024 at 01:05:22PM -0800, Branislav Zahradník wrote:

> PR is adding new macro `NewOp_v542` without this unused parameter. Macro name is also POC of adding Perl version when macro will be added to maintain reasonable backward compatibility by version string in symbol name.

I can't see what practical gain you get by including the version number in the new macro. It just seems to make it worse.

…

-- Modern art: "That's easy, I could have done that!" "Ah, but you didn't!"

happy-barney · 2024-12-23T12:36:03Z

> I can't see what practical gain you get by including the version number in the new macro. It just seems to make it worse.

The symbol name describes behaviour. With the new version, the behaviour changed (3 parameters instead of 4).

It's the same as with people. Take French kings, for example:

Louis XIV
Louis XVI

They have a version in their names as well, so you can distinguish the one born in the 17th century from the one born in the 18th.

The same goes for this pattern: you can introduce a new symbol with new behaviour and still provide the old one, so there is no need to modify all old code simultaneously.

iabyn · 2024-12-23T12:49:29Z

On Mon, Dec 23, 2024 at 04:36:25AM -0800, Branislav Zahradník wrote:

> > I can't see what practical gain you get by including the version number in the new macro. It just seems to make it worse.
>
> Symbol name is describes behaviour. With new version behavior changed (3 parameters instead of 4).

But the version number tells you *nothing* about the behaviour. We have a long history in perl (as do many other projects) of naming newer variants of an API function. The new name usually tries to use some sort of simple mnemonic to indicate its changed behaviour. For example:

- my_atof2() and my_atof3() - two- and three-arg variants of my_atof();
- av_fetch_simple() - an optimised version of av_fetch() applicable in some circumstances;
- newSVsv_flags() - a variant of newSVsv() that takes an extra 'flags' argument.

Is newSVsv_530() easier for a developer to understand than newSVsv_flags()? And if not, is there any other technical benefit of using the newSVsv_530() naming convention?

…

-- Technology is dominated by two types of people: those who understand what they do not manage, and those who manage what they do not understand.

happy-barney · 2024-12-23T13:03:26Z

> But the version number tells you nothing about the behaviour.

Well, the difference is that I'm using release numbers, not a version per symbol.
That tells you one important thing: when it was introduced. The rest is a matter of documentation.

I'm not using the version to provide different behaviour under the same symbol. It will still be NewOp behaviour
with two implementations:

as implemented before v5.42
as implemented in v5.42

It may not tell much about the actual behaviour, but it tells you which version of the behaviour you get, and it ensures that it will be preserved.

Following this pattern ensures 100% forward compatibility (e.g. no need to recompile XS with each release).

It will even (with some tweaks) allow propagating newer grammar into older versions. This is the root assumption behind the idea of exact use VERSION - https://github.com/happy-barney/perl-wish-list/blob/master/exact-use-version/spec/spec.md

NewOp's first argument has been unused since 2007 but remains required, polluting the codebase with magic numbers. A new macro is provided that eliminates these magic numbers while maintaining backward/forward compatibility through Perl version identification in its name.

First parameter of NewOpSz is unused since 2007.

iabyn · 2024-12-27T08:24:56Z

On Mon, Dec 23, 2024 at 05:03:49AM -0800, Branislav Zahradník wrote:

> Following this pattern ensures 100% forward compatibility (eg: no need to recompile XS with each release).

The big problem with XS binary compatibility is data structure definitions and alignments. I don't see how a new function naming scheme is going to avoid that.

> It will even (with some tweaks) allow to propagate newer grammar into older versions - this is root assumption behind idea of exact `use VERSION` - https://github.com/happy-barney/perl-wish-list/blob/master/exact-use-version/spec/spec.md

Like 99% of your proposals, I have no real idea of what you're proposing, but a quick look at that file sounds like something insanely complex for little gain.

…

-- Any [programming] language that doesn't occasionally surprise the novice will pay for it by continually surprising the expert. -- Larry Wall

happy-barney · 2024-12-27T12:46:39Z

> On Mon, Dec 23, 2024 at 05:03:49AM -0800, Branislav Zahradník wrote:
>
> > Following this pattern ensures 100% forward compatibility (eg: no need to recompile XS with each release).
>
> The big problem with XS binary compatibility is data structure definitions and alignments, I don't see how a new function naming scheme is going to avoid that,

structs and typedefs can be versioned as well.
For example:

struct mgvtbl_v510
struct mgvtbl_v520
struct mgvtbl_v530

including versioning for each function using them.

Such structures should be opaque and provide versioned accessors.
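One way to read "opaque with versioned accessors" is sketched below. Everything here is hypothetical (`sketch_tbl`, the accessor names, the idea that a flags field was widened in some release); it only illustrates that the struct body can change while each accessor keeps the contract it was published with.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* What an extension would see: only a forward declaration. */
struct sketch_tbl;

/* What only the core would see: free to change between releases.
 * Assumption: flags grew from 8 to 32 bits in an imagined "v530". */
struct sketch_tbl {
    uint32_t flags;
};

static struct sketch_tbl *sketch_tbl_new(uint32_t flags)
{
    struct sketch_tbl *t = malloc(sizeof *t);
    assert(t != NULL);
    t->flags = flags;
    return t;
}

/* Accessor as frozen in "v510": callers were promised 8 bits of flags. */
static uint8_t sketch_tbl_flags_v510(const struct sketch_tbl *t)
{
    return (uint8_t)(t->flags & 0xFFu);
}

/* Accessor added in "v530": exposes the full width. */
static uint32_t sketch_tbl_flags_v530(const struct sketch_tbl *t)
{
    return t->flags;
}

/* Old and new callers read the same object through their own accessor. */
static int sketch_opaque_demo(void)
{
    struct sketch_tbl *t = sketch_tbl_new(0x1234u);
    int ok = sketch_tbl_flags_v510(t) == 0x34u
          && sketch_tbl_flags_v530(t) == 0x1234u;
    free(t);
    return ok;
}
```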

> > It will even (with some tweaks) allow to propagate newer grammar into older versions - this is root assumption behind idea of exact use VERSION - https://github.com/happy-barney/perl-wish-list/blob/master/exact-use-version/spec/spec.md
>
> Like 99% of your proposals, I have no real idea of what you're proposing, but a quick look at that file sounds like something insanely complex for little gain,.

Why do you think it is insanely complex?

The gain is a lower TCO for the codebase. For some that is a few cents, for others it may be millions.

Another gain is that it will be possible to build compatibility layers, so that you will, for example, be able to install a "use v5.50" library on your "use v5.42" box. And vice versa: you will be able to use old syntax in new versions, i.e. you can introduce incompatible changes without losing the capability to run older code.


iabyn · 2024-12-30T11:58:43Z

On Fri, Dec 27, 2024 at 04:47:01AM -0800, Branislav Zahradník wrote:

> structs and typedefs can be versioned as well. For example:
>
> - `struct mgvtbl_v510`
> - `struct mgvtbl_v520`
> - `struct mgvtbl_v530`
>
> including versioning for each function using them.

...

> Why do you think it is insanely complex?

So before, we had to maintain the mgvtbl structure and the functions which access it. Now we have to maintain a set of variant structs and a whole set of variant functions, along with new sets of tests which exercise all variants. That sounds insanely complex to me. And I'm still not seeing what *practical* advantages this provides.

…

-- Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.

happy-barney · 2024-12-30T13:22:13Z

re maintain struct:

No, there will be no structure maintenance, since structures are version-locked and should not be changed.
Also, a new structure and new functions are needed only when the struct changes.

Structures can even be generated.

re access functions

This is a little bit tricky. For example, for the mgvtbl workflow:

There should be per-version registration functions, each returning an opaque accessor:

struct mgvtbl_internal * mgvtbl_registry_v510 (struct mgvtbl_v510 *);
struct mgvtbl_internal * mgvtbl_registry_v520 (struct mgvtbl_v520 *);
struct mgvtbl_internal * mgvtbl_registry_v530 (struct mgvtbl_v530 *);

Such functions may look like:

struct mgvtbl_internal * mgvtbl_registry_v510 (struct mgvtbl_v510 *data) {
  /* Upgrade the v510 layout to v520; the field v510 lacks is left empty. */
  struct mgvtbl_v520 data_v520 = {
    .svt_get = data->svt_get,
    .svt_set = data->svt_set,
    .svt_added_in_520 = NULL
  };

  return mgvtbl_registry_v520 (&data_v520);
}
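A self-contained version of this stacked registration can be sketched as below. The `sketch_` types are hypothetical and far smaller than Perl's real magic vtable, and one assumption is made explicit: the newest registration function copies its input, because the older function hands it a struct that lives on the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int (*sketch_hook)(void *sv);

/* Public, version-locked layouts (hypothetical, reduced to a few hooks). */
struct sketch_vtbl_v510 { sketch_hook svt_get, svt_set; };
struct sketch_vtbl_v520 { sketch_hook svt_get, svt_set, svt_added_in_520; };

/* Core-private form; assumed to mirror the newest public layout. */
struct sketch_vtbl_internal { sketch_hook get, set, added_in_520; };

/* Newest entry point: copies the data, so the caller's struct may be
 * a temporary. */
static struct sketch_vtbl_internal *
sketch_registry_v520(const struct sketch_vtbl_v520 *data)
{
    struct sketch_vtbl_internal *in = malloc(sizeof *in);
    assert(in != NULL);
    in->get = data->svt_get;
    in->set = data->svt_set;
    in->added_in_520 = data->svt_added_in_520;
    return in;
}

/* Older entry point: translate one version up, then delegate. */
static struct sketch_vtbl_internal *
sketch_registry_v510(const struct sketch_vtbl_v510 *data)
{
    struct sketch_vtbl_v520 up = {
        .svt_get = data->svt_get,
        .svt_set = data->svt_set,
        .svt_added_in_520 = NULL
    };
    return sketch_registry_v520(&up);
}

static int sketch_get_hook(void *sv) { (void)sv; return 42; }

/* An "old" extension registers a v510 table and still gets a usable handle. */
static int sketch_chain_demo(void)
{
    struct sketch_vtbl_v510 old = { .svt_get = sketch_get_hook, .svt_set = NULL };
    struct sketch_vtbl_internal *in = sketch_registry_v510(&old);
    int ok = in->get(NULL) == 42 && in->set == NULL && in->added_in_520 == NULL;
    free(in);
    return ok;
}
```

Each new release would add one more translation step at the front of the chain, which is where the per-call cost discussed later in the thread comes from.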

Accessor functions, as long as they take a struct mgvtbl_internal * and return the same value, don't need to change at all (mostly).

re advantage:

Once a symbol (C API) is defined, it remains constant. So when you change behaviour (e.g. introduce struct mgvtbl_v550), every XS module stays compatible even if the new structure is not aligned the same way as the old structure(s).

bulk88 · 2025-01-01T07:22:51Z

> re maintain struct:
>
> no, there will be no structure maintenance, since structures are version locked and should not be changed. Also, new structure and functions should be needed only when struct changes.

Every non-FOSS commercial platform I've worked with just makes the "first" field of some struct shared between the app core and the customer plugin shared library a length counter. That struct can grow bigger for decades: any new extension can register on an old core, and any old extension can register on a new core.
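The length-counter idea can be sketched like this. The struct and function names are hypothetical, the struct is assumed to grow only by appending fields, and zero-filled memory is assumed to read back as NULL pointers, which holds on mainstream platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Layout an old plugin was compiled against. */
struct sketch_plugin_v1 {
    size_t size;                 /* always first: sizeof the caller's struct */
    int  (*init)(void);
};

/* Layout a newer core knows: same prefix, one field appended. */
struct sketch_plugin_v2 {
    size_t size;
    int  (*init)(void);
    int  (*shutdown)(void);
};

/* Core side: copy only the bytes both sides know about, zero the rest. */
static void sketch_register(const void *plugin, struct sketch_plugin_v2 *out)
{
    size_t given;
    memcpy(&given, plugin, sizeof given);   /* read the length counter */
    memset(out, 0, sizeof *out);
    memcpy(out, plugin, given < sizeof *out ? given : sizeof *out);
    out->size = sizeof *out;
}

static int sketch_old_init(void) { return 7; }

/* An old-layout plugin registering on a new-layout core. */
static int sketch_size_demo(void)
{
    struct sketch_plugin_v1 old = { sizeof old, sketch_old_init };
    struct sketch_plugin_v2 seen;
    sketch_register(&old, &seen);
    return seen.init() == 7 && seen.shutdown == NULL && seen.size == sizeof seen;
}
```

A single function name serves every layout here; the version information travels in the data rather than in the symbol.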

Abstracting your code design out to tons of getter/setter methods everywhere (no pass-by-copy C/C++ args, no malloc blocks, no structs), or assuming the compiler can do escape analysis across the glibc and perl binaries as if this were a high-level Java or web JS VM, is very inaccurate.

But this thing might confuse people nowadays https://webperl.zero-g.net/ :-)

C/C++ compilers don't really fold things backed by read-only shared-library memory. After the first & operator takes that symbol's address, even an LTO build is scared that mprotect() will rewrite it and break the C abstract machine spec. A #define 0x1234, or static inline with LTO, are the only ways to constant-fold it away. Otherwise you are de-referencing memory and making sure there are no function calls for a lot of lines of code, so the compiler doesn't keep re-reading global (malloc/ELF/DLL) storage between each function call.

> Structures can be even generated.
>
> re access functions
>
> this is little bit tricky. For example for mgvtbl workflow:
>
> there should be registration functions per version returning opaque accessor

That can't be optimized; it will have multiple-evaluation problems.

> struct mgvtbl_internal * mgvtbl_registry_v510 (struct mgvtbl_v510 *);
> struct mgvtbl_internal * mgvtbl_registry_v520 (struct mgvtbl_v520 *);
> struct mgvtbl_internal * mgvtbl_registry_v530 (struct mgvtbl_v530 *);

Why do we need bug compatibility on CPAN year by year? What happens after 10 years? That's a lot of structs.

> such functions may look like :
>
> struct mgvtbl_internal * mgvtbl_registry_v510 (struct mgvtbl_v510 *data) {
>   struct mgvtbl_v520 data_v520 = {
>     .svt_get = data->svt_get,
>     .svt_set = data->svt_set,
>     .svt_added_in_520 = NULL
>   };
>   return mgvtbl_registry_v520 (&data_v520);
> }

The implementation/code/tokens look okay to me; I've seen this pattern in other C apps.

> accessor functions, as long as they uses struct mgvtbl_internal * and returns same value, they don't need to change at all (mostly)

C is not Perl/Py/JS/Rust. C doesn't have a JIT, mark-and-sweep, user-mode VM pages based on last CPU R/W time, or tracing tilt bits.

The C abstract machine doesn't allow function pointers to ever be optimized out. Compiler LTO can't cross between two ELFs/DLLs/EXEs at compile time, and the compiler doesn't exist at runtime to fold anything or rewrite machine code. https://metacpan.org/pod/FFI::TinyCC and RPerl https://metacpan.org/pod/C::TinyCompiler exist for runtime machine code.

The mgvtbl_registry_v520 looks too much, umm, like
"Fortune 100s with premier product lines started in 1980s".
If 16-byte GUIDs are added to the register() function call prototype, this API will have decades of lifetime... Not sarcasm, but perl doesn't need IDLs or XML or JSON, so a struct member U32 size or a C function argument constant U32 0x05410100 is good enough, and better than separate C function names.
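The alternative suggested here, one function name plus a version constant as an argument, might look like the following sketch. The constants, their encoding and all `sketch_` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical version constants; the encoding is an assumption. */
#define SKETCH_API_V510 0x05100000u
#define SKETCH_API_V520 0x05200000u

struct sketch_tbl_v510 { int (*get)(void); };
struct sketch_tbl_v520 { int (*get)(void); int (*added_in_520)(void); };

/* One exported symbol; the caller states which layout it was built against.
 * Returns 0 on success, -1 for a version the core does not know. */
static int sketch_register(uint32_t api_version, const void *data,
                           struct sketch_tbl_v520 *out)
{
    switch (api_version) {
    case SKETCH_API_V510: {
        const struct sketch_tbl_v510 *old = data;
        out->get = old->get;
        out->added_in_520 = NULL;   /* field unknown to old callers */
        return 0;
    }
    case SKETCH_API_V520:
        *out = *(const struct sketch_tbl_v520 *)data;
        return 0;
    default:
        return -1;
    }
}

static int sketch_get(void) { return 5; }

/* A v510 caller is accepted; an unknown version is rejected. */
static int sketch_version_demo(void)
{
    struct sketch_tbl_v510 old = { sketch_get };
    struct sketch_tbl_v520 seen;
    int known   = sketch_register(SKETCH_API_V510, &old, &seen);
    int unknown = sketch_register(0x04000000u, &old, &seen);
    return known == 0 && seen.get() == 5 && seen.added_in_520 == NULL
        && unknown == -1;
}
```

Compared with per-version function names, the dispatch moves from link time to run time, so an unknown version becomes an error return instead of an unresolved symbol.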

> re advantage:
>
> once symbol (C-API) is defined, it remains constant. So when you change behaviour (eg: introduces struct mgvtbl_v550) every XS will be compatible even if new structure is not aligned same way as old structure(s).

This was tried before:

perl5/perlapi.h, line 120 in 3b8a46d (https://github.com/Perl/perl5/blob/3b8a46d8724fcee349bf29d0d90f830a0f9c6bd9/perlapi.h#L120)

#define PL_Argv (*Perl_IArgv_ptr(aTHX))

It was toxic. Thank goodness it's in the cemetery. I used to add -DPERL_CORE to my Config.pm to get rid of those 5.8-era ABI-forever vtables for all the CPAN XS modules I installed. Remember, a compiler can't optimize out a function pointer, and with Perl CPP macros you now have 25-100 function calls per line of C code after expansion, because of multiple evaluation.
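The multiple-evaluation cost can be reproduced in miniature. The names below are hypothetical; `SKETCH_VALUE` has the same shape as the quoted `PL_Argv` definition, and the call counter stands in for the real work of calling into another binary.

```c
#include <assert.h>

static int sketch_value = 3;
static int sketch_calls = 0;

/* Stand-in for an accessor exported by another binary: every use of the
 * variable goes through a function call. */
static int *sketch_value_ptr(void)
{
    sketch_calls++;
    return &sketch_value;
}

/* Same shape as the quoted macro: looks like a variable, is a call. */
#define SKETCH_VALUE (*sketch_value_ptr())

/* A typical function-like macro that mentions its argument twice. */
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Counts how many accessor calls one innocent-looking expression makes. */
static int sketch_count_calls(void)
{
    int m;
    sketch_calls = 0;
    m = SKETCH_MAX(SKETCH_VALUE, 1);   /* expands to two accessor calls */
    (void)m;
    return sketch_calls;
}
```

Here a single expression costs two calls: one for the comparison and one for the selected branch. Nesting such macros multiplies the count, which is the effect described above.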

If anyone wants to see this type of execution, single-step in a C debugger, in disassembly view, through any xsub from APITest.so/.dll.

Wrapping SvIV()/SvPV/PUSH() and friends into another XS library, "double buffering" through an ABI-forever XS middle module that is recompiled yearly, with a "shimmed CPAN" that lasts 10 years, makes more sense. It's probably on CPAN already.

Another question: why is there Perl XS code that CAN'T be recompiled? If it is FOSS/CPAN, obviously it can be recompiled. For private/business XS code, I assume the dev/sysadmin has the source code. I believe all Perl OSes have free-ish compilers now. If it takes over 30 minutes to recompile core and the .dlls, it's an EUMM optimization problem, not an XS/type system problem.

I'm thinking of closed-source XS binaries that are "trapped" on a legacy server/desktop/legacy perl, where a person is trying to extend the life of that server/software stack: no source, no author, no support.

happy-barney · 2025-01-01T08:29:37Z

(I will shorten the quoting to reduce comment size)

> re maintain struct:
>
> Every not-FOSS commercial platform I've worked with, just makes the ''first'' field a length counter of some struct between app-core and customer-plugin-sh-lib. That struct can grow bigger for decades, any new-ext can register on old-core, old-ext can register on new-core.

The main point here is compatibility with binary extensions: they will not need to be recompiled or rewritten every time the core changes. For example:

- XS
- mod_perl

It also allows you to introduce incompatible changes.

> - there should be registration functions per version returning opaque accessor
>
> can't optimize that, it will have multi eval problems

Please elaborate on your points.

- it doesn't need to be opaque in INTERN.h; code there will work with the newest version (as the source of change)
- registration functions will accept versioned data, translating it into the newest structure
- it will be opaque in EXTERN.h
- the aim is NOT to force users of the language to invest in a peak of additional work with each upgrade of the language. The "only" price for that is a performance impact, which is usually acceptable during the adaptation period.

> why do we need bug compatibility on CPAN year-by-year? what happens after 10 years? thats alot of structs.

Why? Bugs can be fixed, and fixes can even be easily promoted into the oldest version: a win for users, a win for reputation.

> such functions may look like :
>
> struct mgvtbl_internal * mgvtbl_registry_v510 (struct mgvtbl_v510 *data) {
>   struct mgvtbl_v520 data_v520 = {
>     .svt_get = data->svt_get,
>     .svt_set = data->svt_set,
>     .svt_added_in_520 = NULL
>   };
>   return mgvtbl_registry_v520 (&data_v520);
> }
>
> impl/code/tokens look okay to me, ive seen this pattern in other C apps,
>
> - accessor functions, as long as they uses struct mgvtbl_internal * and returns same value, they don't need to change at all (mostly)
>
> C is not Perl/Py/JS/Rust. C doesn't have JIT or mark & sweeping , user mode VM pages based on last CPU R/W time. Or tracing tilt bits.
>
> C abstract virtual machine, doesn't allow function pointers, to ever optimize out. CC LTO can't go between 2 ELFs/DLLs/EXEs at CC time. And the CC doesnt exist at runtime to fold anything or rewrite machine code. https://metacpan.org/pod/FFI::TinyCC and RPerl https://metacpan.org/pod/C::TinyCompiler exist for runtime mach code,

Sorry, I have no idea what you are talking about. If you are pointing to the `mgvtbl` structure, it already exists and is a necessary evil. Otherwise I have no idea. Please give some examples of where you think it can cause problems.

> - the mgvtbl_registry_v520 looks too much, umm,
> - "Fortune 100s with premier product lines started in 1980s",
> - if 16 byte GUIDs are added to the register() func call prototype, this API will have decades of lifetime.... not sarcasm, but perl doesn't need IDLs or XML or JSON, so struct member U32 size or c func arg constant U32 0x05410100 is good enough and better than separate c func names

Adding a GUID into the prototype misses the concept. That is, "behaviour of this function was specified in 5.20 and didn't change since".

Stacked transformation is a pattern I have great experience with (untransferable :-( ). It allows you, for example, to move old functions into external libraries and load them on demand, stating: such backward compatibility exists, but it can have a performance impact.

> re advantage:
>
> once symbol (C-API) is defined, it remains constant. So when you change behaviour (eg: introduces struct mgvtbl_v550) every XS will be compatible even if new structure is not aligned same way as old structure(s).
>
> This was tried before https://github.com/Perl/perl5/blob/3b8a46d8724fcee349bf29d0d90f830a0f9c6bd9/perlapi.h#L120
>
> It was toxic. Thank goodness its in the cemetery. I used to -DPERL_CORE my Config.pm to get rid of that 5.8-era ABI-forever vtables for all my CPAN XS modules I installed. Remember a CC can't optimize out a function pointer, and with Perl CPP macros, you now have 25-100 function calls per line of C code after expansion, b/c multiple eval.

OK, I don't think we are talking about the same topic. There are no function pointers in my proposal (they are only in the `mgvtbl` table, which already exists and which I used to demonstrate the workflow).

> Another question, why is there perl XS code that CANT be recompiled? If FOSS/CPAN, obviously it can be recompiled. Private/biz XS code, I assume the dev/sys admin has the source code. I believe all Perl OSes have free-ish CCs now. If its over 30 mins to recomp core and the .dlls, its an EUMM optimization problem, not XS/type system problem.

Recompile, test, deploy: in a world that could generate usage of the language, that means bureaucracy and testing time, and a lot of it. Providing time to adapt lowers the cost of ownership for users, while still allowing them to adopt the newest features in new or touched code.

> Im thinking of closed source XS binaries that are "trapped" on a legacy server/desktop/legacy perl, and a person is trying to extend the life of that server/software stack, no source, no author, no support.

... no user

…

Branislav Zahradník added 2 commits December 24, 2024 13:13

NewOpSz_v542 - variant without unused parameter

63ab1d4

First parameter of NewOpSz is unused since 2007.

happy-barney force-pushed the hpb/unused-magic-numbers branch from 0032ed0 to 63ab1d4 Compare December 24, 2024 12:13
