Post by Felix Winkelmann
Hello, Peter!
I generally agree with most proposed changes on this list (with the
exception of the idea to drop "fluid-let", of course.) But it must be
clear to you that you already created a "pony" page. It is impossible
to do all of that, so before we go crazy with ideas, we should perhaps
get back to what we want to achieve with CHICKEN 5.
Yeah, I kind of had that "pony" impression while writing it up :)
Post by Felix Winkelmann
As I understand it, the idea is to decruftify (that is, drop or eggify
library code), give proper names and modularize.
That's the main reason we're breaking back-compat, I think.
Post by Felix Winkelmann
This is all related
in one way or the other and looks like it can be done with the little
resources we have, especially considering that it will take ages until
only a reasonable subset of the existing eggs compiles and runs
properly in the new system.
* Designing a decent POSIX API is a hard task. I have not seen any
reasonably good API wrapper for that yet - they are either too
low-level (Basis, OCaml, etc.), or too high-level.
For now a modest refactoring would be enough.
[begin of short brain dump about the POSIX situation]
Putting things like, for example, "directory" in some other unit would
make more sense to me, because there's nothing inherently POSIXy in
reading the contents of a directory (though the _implementation_
happens to rely on the C POSIX API, of course), and I think it belongs
with make-pathname and friends (ie, a "paths" or "files" module).
Ideally, there wouldn't be much left of the "posix" unit except some
deeply POSIXy things like fork, signal, fcntl, environment vars etc.
Probably this means the really high-level things move elsewhere.
In time, we might even move the POSIX unit out of core into an egg
and keep only truly "portable" (or essential) things in core. I'm
not sure what will happen to POSIX in the future, but I think its
hegemony will end sooner rather than later. The landscape is shifting
so quickly with these mobile devices (think Windows Phone, Firefox OS
but also the crippled POSIX support on iOS and Android), OS research
is slowly picking up again and the Linux crowd seems to be taking an
increasingly aggressive stance against "backwards compatibility" (think
Wayland, systemd etc).
So, I'm not against POSIX support as such, but relying too much on it in
core itself is probably a mistake in the (very) long run.
[end of braindump]
Post by Felix Winkelmann
* Changing the string representation is much harder than you think
(quoting John: "If Chibi can do it, so can we" completely ignores
the fact that writing a string-representation implementation from
scratch is something vastly different than modifying an existing
one, one that is much older and much more widely used from
foreign/native code.)
Agreed. Recall that my suggestion was simply to "bless" UTF-8 as the
canonical internal representation (which is the case, de facto, anyway)
and *maybe* adding some detection code to reject invalid sequences rather
than just continuing with bogus data. Possibly making the default
string ops the ones from the UTF-8 egg. Anything beyond that is
overkill and I would definitely not support changing the encoding in
this effort.
Of course if someone sent in a patch, that might change my mind...
but that's just wishful "pony" thinking ;)
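
To make the "detection code" idea a bit more concrete, here is a minimal
sketch in C of what rejecting invalid sequences could look like. The
function name utf8_is_valid is hypothetical and not part of CHICKEN's
runtime; it only checks well-formedness (no overlong forms, no surrogates,
nothing above U+10FFFF) and does not touch the string representation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not an existing CHICKEN function: returns true if
   the first `len` bytes at `s` are well-formed UTF-8. */
bool utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char b = s[i];
        size_t extra;   /* continuation bytes expected after the lead byte */
        uint32_t cp;    /* code point decoded so far */
        uint32_t min;   /* smallest code point allowed for this length */

        if (b < 0x80) { i++; continue; }  /* plain ASCII */
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return false;  /* stray continuation byte or invalid lead byte */

        if (len - i - 1 < extra) return false;  /* sequence is truncated */

        for (size_t k = 1; k <= extra; k++) {
            unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80) return false;  /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }

        if (cp < min) return false;                      /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  /* UTF-16 surrogate */
        if (cp > 0x10FFFF) return false;                 /* beyond Unicode */

        i += extra + 1;
    }
    return true;
}
```

A check like this could run at the boundaries where bytes enter the system
(foreign strings, port input), which is where bogus data would otherwise
slip through; where exactly to call it is left open here.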
Post by Felix Winkelmann
* Numeric tower support: this is also hard, and will have a
considerable performance impact, needs changes in the compiler, in
all the icky C glue code and particularly in foreign code - which
means things will break all over the place in user code.
There is strong support from the community to do this, and I'm willing
to put in the required effort. I feel very strongly about adding at
least bignum support to core. I don't care as much about ratnums and
I don't care at all about compnums, but it may be simpler to add them
too; the code to support them is relatively straightforward.
Not having bignums in core causes too much headache:
- Foreign procedures returning full-width 64-bit integers are a
problem, as those simply cannot be fully represented by flonums.
- Having bignums be external to the core causes a lot of headaches when
one generates them and passes them to some library. For instance,
storing very large numbers in a database is perfectly sane and
generally possible with the DECIMAL type, but this requires all the
database eggs to pull in the numbers egg, which they currently don't.
In short, the numbers egg is "contagious".
- There are several hard to fix bugs that become trivial once bignums
are supported: #1096, #1000, #1139, #823. There have been other
such problems.
- Also, it confuses the newbies :)
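
The first point can be illustrated with a small C sketch. It assumes that a
flonum is an IEEE 754 double with a 53-bit significand; the helper name
fits_in_flonum is made up for this example and is not a CHICKEN API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if `n` survives a round trip through a double,
   i.e. if a flonum could represent it exactly. Assumes IEEE 754 binary64. */
bool fits_in_flonum(int64_t n)
{
    double d = (double)n;

    /* Values near INT64_MAX round up to 2^63, which is outside the int64_t
       range; converting that back would be undefined behaviour in C. */
    if (d >= 9223372036854775808.0 || d < -9223372036854775808.0)
        return false;

    return (int64_t)d == n;
}
```

Under that assumption, 2^53 (9007199254740992) still fits, but 2^53 + 1
does not, and neither does INT64_MAX. A foreign procedure returning such a
value would silently lose the low bits without bignums.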
If I don't make it before all the other things have been taken care of,
feel free to release CHICKEN 5 without it.
Post by Felix Winkelmann
* Port-refactoring: again - basically a good idea, but tricky to
design, and may have a large performance impact, and the refactoring
will be work-intensive (all the direct peeking and poking in port
records needs to be localized and changed). This change should also
ideally be considered to be done in tandem with changing the string
representation.
Here too, a modest change would be enough. Just using a proper
struct/record type would make later refactorings easier. The best
part is that the performance impact of adding an offset to the write
buffer is a positive one. But if we won't be able to make this work,
I won't be too sad, I promise ;)
We don't have to make a perfect design, just one that scales better
with future changes. I was thinking to make the constructors accept
keyword arguments, so that we can later add things (like position
setting etc) without breaking existing programs.
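
As a rough illustration of the write-buffer offset mentioned above, here is
a C sketch. It rests on the assumption that the alternative is to copy or
shift the unwritten remainder after a partial write. The struct and
function names are hypothetical and do not mirror CHICKEN's actual port
records.

```c
#include <stddef.h>
#include <string.h>

enum { WBUF_CAP = 4096 };

/* Hypothetical write buffer: `start` is the offset of the first byte that
   has not been written out yet, `end` is one past the last buffered byte. */
struct wbuf {
    char   data[WBUF_CAP];
    size_t start;
    size_t end;
};

/* Append up to `n` bytes; returns how many were accepted. */
size_t wbuf_put(struct wbuf *b, const char *src, size_t n)
{
    size_t room = WBUF_CAP - b->end;
    if (n > room) n = room;
    memcpy(b->data + b->end, src, n);
    b->end += n;
    return n;
}

/* Number of buffered bytes still waiting to be written. */
size_t wbuf_pending(const struct wbuf *b)
{
    return b->end - b->start;
}

/* Record that `n` pending bytes were written out. Only the offset moves;
   the remaining bytes are never copied. */
void wbuf_consume(struct wbuf *b, size_t n)
{
    size_t pending = wbuf_pending(b);
    if (n > pending) n = pending;
    b->start += n;
    if (b->start == b->end)      /* buffer drained: reuse it from the front */
        b->start = b->end = 0;
}
```

After a short write, the caller would retry from b->data + b->start with
wbuf_pending(b) bytes, so a partial write costs one addition instead of a
copy.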
Post by Felix Winkelmann
* chicken-install/setup-files: a major and very important project on
its own. I started thinking about this some time ago, but didn't get
anywhere. Something very simple needs to be found that covers most
use cases, but this is something that needs input by many people
that have experience with the egg system and applications written
in CHICKEN. Perhaps we should plan to think about this the next time
some CHICKEN-hackers meet?
Sounds like a good plan. I also think this one may be too difficult and
too much work to do it for CHICKEN 5.0 unless lots of people chip in.
Post by Felix Winkelmann
I _do_ think all the proposed changes make sense more or less, but
it's unrealistic to think that we achieve anything more than one or
two of the big parts.
Agreed. I'll put in some extra effort this week to get the numbers
egg in good shape for importing it into core, and maybe try to get
started on a core patch.
Post by Felix Winkelmann
* I think John's idea of putting all the little SRFIs in a few (or a
single) module is better than splitting everything up into
modules. Having modules for each and everything looks nice on paper
but quickly gets old when you have to modify your module imports
every time you use a common but nonstandard language construct. I
understand that some people like this kind of bureaucracy, but
what's wrong with making things easier for the user?
Yeah, I said much the same at the start of the section about SRFIs.
However, I think it *does* make it easier for the user to _also_ offer
the SRFI libraries separately. There's already a hacky workaround for
require-extension's builtin-features in eval.scm so that you can say,
for example, (require-extension (srfi 2)), so I think it makes sense to
also provide "full" library declarations, to make it simpler to use and
write portable R7RS programs.
Note that this does not mean this needs to be the only library to export
said SRFI procedures!
Post by Felix Winkelmann
* Please use long, explicit library names, it's easier to remember
("there are many ways to abbreviate something, but only one way not
to" - I forgot who said this, John will tell me, I'm sure.) And I
would also suggest to avoid using "srfi-XXX" as a module name, and
to use something meaningful (yes, I know that in the past I was
largely responsible for that mistake in numerous situations.) That
would also allow adding our own extensions.
For portability, I prefer at least also allowing the srfi numbers.
But yes, long names are good. However, there will be so few SRFIs
that will still be left as part of core that it makes very little
sense to rename the existing SRFIs, except when grouping several
constructs together.
Post by Felix Winkelmann
* I can't resist adding a pony of my own: I fear that integrating the
R7RS syntax-rules cleanly and transparently inside an egg will be
tricky. What about changing syntax-rules to have R7RS semantics in
general? I'm not sure if I understand the differences well enough,
perhaps someone (Peter?) can comment on this.
I think we already did the important bits (ellipsis identifiers and
tail patterns - ie, SRFI-46). There are two more changes, AFAIK:
- The "new" syntax-rules foolishly changed the underscore to act as
a wildcard symbol, making it - strictly speaking - incompatible with
R5RS. I don't think it's a good idea to support this in core.
- For no good reason, R7RS syntax-rules allows not only renaming
ellipsis identifiers, but also quoting them (which I think is
a bit ugly). I *think* this is entirely backwards compatible,
so we could add that to core.
This is easily put in the R7RS egg, though. Remember, any use of
syntax-rules simply expands into one big ER macro transformer, and it
is a completely self-contained file which may be taken and copied into
the R7RS egg, and tweaked there to support these two cases. But it could
be simpler to do as a simple preprocessor which generates a "core"
syntax-rules expansion.
Post by Felix Winkelmann
So, in short: forget about unicode, the full numeric tower,
chicken-install, port-refactoring and everything but modularization,
the internal structure (and size!) and the necessary issues of doing a
major release (e.g. the question of how to integrate that with
henrietta.)
I think we can do minor things now that make it possible to change
things later in a backwards-compatible way. These are important to
postpone the need to "break the world" a third time for as long as
possible.
I'd really like to hear other people's ideas about what would be the
best way to integrate the changes with Henrietta. Personally, I think
the easiest way is to simply deploy a second copy of henrietta which
reads from a different cache, populated by a second henrietta-cache cron
job which reads from a different master list.
Post by Felix Winkelmann
The major problem is that re-modularization will be the biggest
barrier in migrating user code. Once that is done we have a groundwork
for the really tricky things, and for smaller API changes that are
easier to detect via the module system.
Agreed. How will we attack the problem of bootstrapping? We will make
some breaking changes which might mean CHICKEN 4 may be unable to
bootstrap the CHICKEN-5-in-progress at some point. Now that we're on
a separate branch we can't really release snapshots in the 4.9.x series.
Maybe fall back to a simple date, or git hash versioning scheme for the
time being? We don't need to make them public "official" releases of
course. I just don't know how well our infrastructure will cope with
a different naming strategy. Should we do this by hand?
Cheers,
Peter
--
http://www.more-magic.net