Discussion:
[PATCH] Fix #1133
Peter Bex
2014-06-19 18:11:31 UTC
Hi all,

Here's a pretty simple patch for #1133. It segfaults with a regular
build, but a DEBUGBUILD says exactly what's wrong:

Error: unbound variable: make-input-port

Call history:

foo.scm:2: tcp-listen
foo.scm:2: tcp-accept <--

The problem got introduced when we fixed a similar bug in unit ports.
This time it turns out that the core itself gets its dependencies wrong:
tcp uses one procedure from ports but none from extras, even though it
only depends on extras!

I've replaced extras by ports to fix the dependency stuff. I think it's
time we started making a more concrete plan on how to make the core
system use proper modules instead of "units", while still maintaining
separate compilation of course! If we use modules, this kind of bug
will be easily detected because the module refuses to compile. It will
also make the system cleaner in general because names won't "leak out"
unless explicitly imported. Anyway, this is a long-term project that
will probably require fixing #1131 (kill ##sys#alias-global-hook) first.
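
To illustrate the kind of dependency declaration under discussion: a unit
names the units it relies on in a `uses` declaration. The following is a
minimal sketch in CHICKEN 4 syntax, not the actual patch; the unit name
`my-net` and the procedure `open-null-port` are hypothetical.

```scheme
;; Sketch only: `my-net` and `open-null-port` are hypothetical names.
(declare
  (unit my-net)     ; this file is the compilation unit "my-net"
  (uses ports))     ; ask for unit "ports" to be initialised as well

;; make-input-port comes from the ports unit, so the declaration above
;; has to name ports. Naming only extras may leave make-input-port
;; unbound at runtime unless something else happens to load ports.
(define (open-null-port)
  (make-input-port
    (lambda () #!eof)    ; read-char: always at end of file
    (lambda () #t)       ; char-ready?
    (lambda () #f)))     ; close: nothing to release
```

Nothing checks a `uses` list against the identifiers the file actually
references, which is why a mistake like this only shows up at runtime.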

This bugfix should probably go into the stability branch.

Cheers,
Peter
--
http://www.more-magic.net
John Cowan
2014-06-19 18:36:47 UTC
Post by Peter Bex
I've replaced extras by ports to fix the dependency stuff. I think it's
time we started making a more concrete plan on how to make the core
system use proper modules instead of "units", while still maintaining
separate compilation of course!
Amen. Units are really a hangover from Chicken 3 days.
--
John Cowan http://www.ccil.org/~cowan ***@ccil.org
Half the lies they tell about me are true.
--Tallulah Bankhead, American actress
Christian Kellermann
2014-06-19 19:06:26 UTC
Post by Peter Bex
Hi all,
Here's a pretty simple patch for #1133. It segfaults with a regular
Pushed, thanks.

Christian
Felix Winkelmann
2014-06-19 19:13:01 UTC
Post by Peter Bex
I've replaced extras by ports to fix the dependency stuff. I think it's
time we started making a more concrete plan on how to make the core
system use proper modules instead of "units", while still maintaining
separate compilation of course! If we use modules, this kind of bug
will be easily detected because the module refuses to compile. It will
also make the system cleaner in general because names won't "leak out"
unless explicitly imported. Anyway, this is a long-term project that
will probably require fixing #1131 (kill ##sys#alias-global-hook) first.
A few remarks:

- Modules are a namespace-specific concept, units are a
loading/linking concept. Are you really comparing these here?

- The possibility of static linking needs to be preserved. Units
provide static entry-points for compilation units, which will be
needed for this. Static linking can currently not be simply dropped
because of the restrictions that mobile OSs have (Android with its
fixed table of dynamically loadable libraries, iOS with its complete
lack of dynamic loading).

- The module system is currently optional - one can just use core
procedures in a toplevel program without using modules at all. The
machinery to make this work is admittedly ugly, but still I consider
it a feature. Forcing pervasive use of modules is not by itself a
bad thing, but needs to be thought out thoroughly.

- It's possible that I misunderstand your intentions, considering the
many meanings of the word "module" used in this context. But another
issue that comes to mind is bootstrapping: if you change the
imports/exports of a module that represents a core library unit, how
can we prevent screw-ups due to mixing up of the
import/export-information (import libraries) between the chicken
that compiles and the chicken that is compiled? This can get very
tricky.


felix
Peter Bex
2014-06-19 19:27:37 UTC
Hi Felix,

Thanks for replying!
Post by Felix Winkelmann
- Modules are a namespace-specific concept, units are a
loading/linking concept. Are you really comparing these here?
Well, kind of. Let me tip-toe around the definitions to clarify:
Currently units are used in core to define dependencies, which is one
of the things that modules are used for as well. And as we saw in
this ticket and the other one there are some problems which can be
detected in dependencies expressed as modules but not when using
"just units".
Post by Felix Winkelmann
- The possibility of static linking needs to be preserved.
I agree completely, and said as much in my email.
Post by Felix Winkelmann
- The module system is currently optional - one can just use core
procedures in a toplevel program without using modules at all. The
machinery to make this work is admittedly ugly, but still I consider
it a feature. Forcing pervasive use of modules is not by itself a
bad thing, but needs to be thought out thoroughly.
Yes, this is a concern. I think it may be possible to hook the unit
system into the module system somehow (or vice versa), but you're right
in that this won't "just work".
Post by Felix Winkelmann
- It's possible that I misunderstand your intentions, considering the
many meanings of the word "module" used in this context. But another
issue that comes to mind is bootstrapping: if you change the
imports/exports of a module that represents a core library unit, how
can we prevent screw-ups due to mixing up of the
import/export-information (import libraries) between the chicken
that compiles and the chicken that is compiled? This can get very
tricky.
Indeed, bootstrapping is an ever-present pitfall. And there's also
library.scm which does not really correspond to a particular module.
Perhaps to "scheme", but not exactly.

I just wanted to put the idea out there, as a sort of "call to arms".
This is another large-scale project that will take a lot of energy which
I personally don't have right now. Maybe I'll put it on the wish-list
for a long-term goal.

Cheers,
Peter
--
http://www.more-magic.net
Felix Winkelmann
2014-06-19 20:23:41 UTC
Post by Peter Bex
Currently units are used in core to define dependencies, which is one
of the things that modules are used for as well. And as we saw in
this ticket and the other one there are some problems which can be
detected in dependencies expressed as modules but not when using
"just units".
I understand, and agree completely.
Post by Peter Bex
I just wanted to put the idea out there, as a sort of "call to arms".
This is another large-scale project that will take a lot of energy which
I personally don't have right now. Maybe I'll put it on the wish-list
for a long-term goal.
Right, I expected that this was not planned to be done soon. Perhaps
we could start this by putting at least the compiler into proper
modules, as it only has a few entry-points that need to be accessed
externally, and this could be done via the "<module>#<binding>"
notation. The problem here is that the compiler is spread over several
files and would need a restructuring into several files + modules.


felix
Kristian Lein-Mathisen
2014-06-19 21:36:58 UTC
Thanks so much for looking into this, and fixing it, Peter! This is
probably the only patch on chicken-hackers I'll ever understand :) I should
have thought about using DEBUGBUILD myself. I'll see if I can be of more
use next time!

We are depending on this fix for our systems in prod. Will there be a new
4.9.0.2 tarball or do we have to build from git for now? What's usually the
process in a case like this?

Thanks again,
K.
Peter Bex
2014-06-20 06:22:30 UTC
Post by Kristian Lein-Mathisen
Thanks so much for looking into this, and fixing it, Peter! This is
probably the only patch on chicken-hackers I'll ever understand :) I should
have thought about using DEBUGBUILD myself. I'll see if I can be of more
use next time!
Don't worry about it, I always have a debug build lying around :)
Post by Kristian Lein-Mathisen
We are depending on this fix for our systems in prod. Will there be a new
4.9.0.2 tarball or do we have to build from git for now? What's usually the
process in a case like this?
Like I said in my mail this should go into stability, from which point
releases are periodically made. Christian indeed cherry-picked it into
that branch, so it's sitting there waiting for a release.

I don't think this change is so important that it warrants publishing a
new release; I'd prefer to have a few more bugfixes accumulate in the
branch. You can always choose to build from git from the stability
branch, though. That way you don't risk building a CHICKEN with any
new unknown bugs.

Have you seen my comment about the known workaround? You can just
(require-library ports) or (use ports) in your code (even if you're not
using the ports unit). This will fix the problem by ensuring that the
ports unit is loaded before running the program.
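
As a concrete illustration of that workaround, here is a minimal sketch
assuming CHICKEN 4.9.0 syntax. The helper name `accept-one` and the
greeting are made up for this example; only tcp-listen, tcp-accept and
the ports unit come from the report.

```scheme
;; Workaround sketch for #1133 (CHICKEN 4 syntax).
;; Loading "ports" explicitly makes sure make-input-port is bound before
;; tcp-accept needs it, even though this program uses nothing from ports.
(use tcp ports)

;; Hypothetical helper: accept a single connection, greet it, and close.
(define (accept-one port-number)
  (let ((listener (tcp-listen port-number)))
    (receive (in out) (tcp-accept listener)
      (display "hello" out)
      (newline out)
      (close-input-port in)
      (close-output-port out))
    (tcp-close listener)))
```

Calling (accept-one 8080) blocks until a client connects, so it is left
out of the sketch itself.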

Cheers,
Peter
--
http://www.more-magic.net
Kristian Lein-Mathisen
2014-06-20 08:40:25 UTC
Ah, I did not get that quick (use ports) workaround. That'll do!

Thanks,
K.
Oleg Kolosov
2014-06-23 00:02:59 UTC
Post by Peter Bex
Post by Felix Winkelmann
- Modules are a namespace-specific concept, units are a
loading/linking concept. Are you really comparing these here?
Currently units are used in core to define dependencies, which is one
of the things that modules are used for as well. And as we saw in
this ticket and the other one there are some problems which can be
detected in dependencies expressed as modules but not when using
"just units".
Units are also used for separate compilation of large code bases. It
does not always make sense to separate everything into modules. This is
useful, but should be a build-only matter. AFAIK, the current system does
not allow transparently switching between module and unit mode for a given
source file. We can try to add an option for chicken to mostly ignore
module declarations and just issue some useful warnings but compile as a
unit. This approach will enable incremental migration.

This might require huge refactoring of the current module handling
facilities. Which might be a good or bad thing depending on the point of
view.
Post by Peter Bex
Post by Felix Winkelmann
- The possibility of static linking needs to be preserved.
I agree completely, and said as much in my email.
This area definitely needs attention, especially the handling of external
extensions (eggs). The last time I tried to build a completely static
binary, it required adjustments to an egg's source code. I don't remember
the specific details; maybe this has already improved. The ability to split
up the Chicken runtime would be helpful: currently it is too big and
contains many unnecessary things (which could be moved into eggs). Windows
support is nice to have, but too alien for the current Chicken toolchain.
Also, seamless compiling for Android and iOS is a big deal.
Post by Peter Bex
Post by Felix Winkelmann
- The module system is currently optional - one can just use core
procedures in a toplevel program without using modules at all. The
machinery to make this work is admittedly ugly, but still I consider
it a feature. Forcing pervasive use of modules is not by itself a
bad thing, but needs to be thought out thoroughly.
Yes, this is a concern. I think it may be possible to hook the unit
system into the module system somehow (or vice versa), but you're right
in that this won't "just work".
What do you think about the Chibi Scheme approach
(http://synthcode.com/scheme/chibi/#h2_ModuleSystem)? Providing module
definitions separately avoids the aforementioned issues nicely and,
coupled with an option to ignore everything as proposed above, will allow
incremental migration. Also, having a compatible (and standard-compliant)
module system is a huge selling point IMO.
Post by Peter Bex
Post by Felix Winkelmann
- It's possible that I misunderstand your intentions, considering the
many meanings of the word "module" used in this context. But another
issue that comes to mind is bootstrapping: if you change the
imports/exports of a module that represents a core library unit, how
can we prevent screw-ups due to mixing up of the
import/export-information (import libraries) between the chicken
that compiles and the chicken that is compiled? This can get very
tricky.
Indeed, bootstrapping is an ever-present pitfall. And there's also
library.scm which does not really correspond to a particular module.
Perhaps to "scheme", but not exactly.
I have not dived into this too deeply yet, but can you elaborate on what
real problems you anticipate? Maybe someone could share an experience on
how other Scheme implementations handle this?
Post by Peter Bex
I just wanted to put the idea out there, as a sort of "call to arms".
This is another large-scale project that will take a lot of energy which
I personally don't have right now. Maybe I'll put it on the wish-list
for a long-term goal.
We recently migrated our project to modules precisely for the reason that
started this thread, and we are still cleaning up the mess. Suddenly, we
realized that many files are placed in wrong folders, some parts have
circular dependencies, some know way too much about others. But seeing
this on the surface is a good thing in the end.

IMO, Chicken has some of these problems too: it is often hard to
predict which module has which function due to responsibilities mixed
between library, posix, files, extras, utils, et al.

First of all, I suggest breaking things up into very small modules based
on intended usage and common sense (see R7RS or Chibi for examples) and
providing reexporting declarations for compatibility. This will allow
things to be moved around freely. Then refactor, cleanup, repeat.

I am very interested in this undertaking and want to help you out.
--
Regards, Oleg
John Cowan
2014-06-23 04:14:25 UTC
Post by Oleg Kolosov
What do you think about the Chibi Scheme approach
(http://synthcode.com/scheme/chibi/#h2_ModuleSystem)? Providing module
definitions separately avoids the aforementioned issues nicely and,
coupled with an option to ignore everything as proposed above, will allow
incremental migration. Also, having a compatible (and standard-compliant)
module system is a huge selling point IMO.
Hey, I'm all for that (no surprise).

When I develop code for a SRFI, I provide one or more implementation
files, usually named foo-impl.scm, and then an R7RS/Chibi module named
foo.sld and a Chicken version named foo.scm. These files contain just
imports for R7RS or uses for Chicken, exports, and includes. So far I
haven't felt the need for separate compilation of the components of a
single module.

IMO (and I know Felix is opposed to this, though I can't say I understand
why), it would be better if the semantics of Chicken `import` fell back
to `use` if the module is not currently loaded, rather than throwing
an error. If that worked, it would be possible to simulate a substantial
subset of R7RS module language as native Chicken code.
--
John Cowan http://www.ccil.org/~cowan ***@ccil.org
As you read this, I don't want you to feel sorry for me, because,
I believe everyone will die someday.
--From a Nigerian-type scam spam
Peter Bex
2014-06-26 12:33:31 UTC
Post by Oleg Kolosov
Post by Peter Bex
Currently units are used in core to define dependencies, which is one
of the things that modules are used for as well. And as we saw in
this ticket and the other one there are some problems which can be
detected in dependencies expressed as modules but not when using
"just units".
Units are also used for separate compilation of large code bases. It
does not always make sense to separate everything into modules. This is
useful, but should be a build-only matter. AFAIK, the current system does
not allow transparently switching between module and unit mode for a given
source file. We can try to add an option for chicken to mostly ignore
module declarations and just issue some useful warnings but compile as a
unit. This approach will enable incremental migration.
That's an interesting idea. It might work, but I'm also unsure of the
added benefit; if the code is always compiled separately in the "ignore
modules" mode, it won't add the intended benefits of additional checking
of dependencies and exports.
Post by Oleg Kolosov
This area definitely needs attention, especially the handling of external
extensions (eggs). The last time I tried to build a completely static
binary, it required adjustments to an egg's source code. I don't remember
the specific details; maybe this has already improved. The ability to split
up the Chicken runtime would be helpful: currently it is too big and
contains many unnecessary things (which could be moved into eggs). Windows
support is nice to have, but too alien for the current Chicken toolchain.
Also, seamless compiling for Android and iOS is a big deal.
I'm unsure how much further we can strip down the core. Perhaps we can
drop SRFI-1, SRFI-13 and SRFI-14, which would be just fine as eggs. But
currently core is using some procedures from these. I think that this is
a separate issue, though. Refactoring stuff as modules is a shitload of
work which is not yet certain to be feasible, and we should not drag more
work into this project.
Post by Oleg Kolosov
What do you think about the Chibi Scheme approach
(http://synthcode.com/scheme/chibi/#h2_ModuleSystem)?
That's just R7RS.
Post by Oleg Kolosov
Providing module
definitions separately avoids the aforementioned issues nicely and,
coupled with an option to ignore everything as proposed above, will allow
incremental migration.
It's not 100% separate, the (begin ...) form allows embedded Scheme forms.
Also, our module system already supports separation into files, either
by using INCLUDE or the shorthand form (module NAME (EXPORT ...) FILENAME)
from http://wiki.call-cc.org/man/4/Modules
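
For illustration, the shorthand form might be used as follows. This is a
sketch assuming CHICKEN 4 syntax as described on that wiki page; the
module name `greeter`, the file name "greeter-impl.scm" and the procedure
`greet` are hypothetical.

```scheme
;; greeter.scm: module definition only; the body lives in another file.
;; Assumes greeter-impl.scm sits next to this file and contains e.g.
;;   (import scheme)
;;   (define (greet name) (string-append "hello, " name))
;; A body consisting of a single string is treated like an INCLUDE of
;; that file, so the implementation file stays free of module syntax.
(module greeter (greet) "greeter-impl.scm")
```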
Post by Oleg Kolosov
Also, having a compatible (and standard-compliant)
module system is a huge selling point IMO.
We already support this syntax through the R7RS egg. We can't just
switch to the R7RS module system in core because it would break 100% of
the user code out there, for no technical gain whatsoever. CHICKEN core
is an R5RS system, and adding an R7RS module system to it adds no
benefit if everything else from R7RS is only available as an extension.

We have already too many people asking about the difference between use,
require, require-extension, uses etc. Adding a second module system to
core would only add to the confusion (and *replacing* it would break the
world).
Post by Oleg Kolosov
Post by Peter Bex
Indeed, bootstrapping is an ever-present pitfall. And there's also
library.scm which does not really correspond to a particular module.
Perhaps to "scheme", but not exactly.
I have not dived into this too deeply yet, but can you elaborate on what
real problems you anticipate?
All the ##sys# stuff and other "internal" procedures don't really belong
anywhere, including a lot of runtime support code that other core units
and the compiler need. Finally, there's an overlap between the r4rs and
r5rs primitive modules. That r5rs module is currently defined as r4rs
plus VALUES, DYNAMIC-WIND and CALL-WITH-VALUES.

The simple solution would be to stuff everything into a "core" module
which nobody but the core system is supposed to really use. We could
then define r4rs and r5rs/scheme as modules which import the whole of
core and reexport only the relevant procedures.
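
The re-export idea can be sketched with CHICKEN 4's `reexport` form. This
is only an illustration under assumptions: no "core" module exists yet,
so the sketch re-exports from the existing scheme module instead, and the
module name `tiny-r5rs` and the chosen subset of names are hypothetical.

```scheme
;; Sketch: a facade module that exposes only a chosen subset of bindings.
(module tiny-r5rs ()
  (import scheme chicken)            ; base bindings for this module body
  (reexport (only scheme
                  car cdr cons
                  values dynamic-wind call-with-values)))
```

A real r4rs or r5rs module would list the full set of standard
identifiers, including syntax such as lambda and define.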

But there's still the problem of code which replaces primitive
procedures. However much this is frowned upon by "modern" Scheme
standards, this is something that is explicitly allowed by RnRS | n < 6.
AFAIK toplevel code can do that even with identifiers exported by egg
modules, so perhaps this is not an issue. We have to test it though, and
luckily r4rstest.scm already checks this.
Post by Oleg Kolosov
Post by Peter Bex
I just wanted to put the idea out there, as a sort of "call to arms".
This is another large-scale project that will take a lot of energy which
I personally don't have right now. Maybe I'll put it on the wish-list
for a long-term goal.
We recently migrated our project to modules precisely for the reason that
started this thread, and we are still cleaning up the mess. Suddenly, we
realized that many files are placed in wrong folders, some parts have
circular dependencies, some know way too much about others. But seeing
this on the surface is a good thing in the end.
Yeah, people often complain when they find out that circular dependencies
are impossible in the module system, but that's a good thing if you want
to have a fighting chance at maintaining your code later.
Post by Oleg Kolosov
IMO, Chicken has some of these problems too: it is often hard to
predict which module has which function due to responsibilities mixed
between library, posix, files, extras, utils, et al.
Yeah, we have had problems with this.
Post by Oleg Kolosov
First of all, I suggest breaking things up into very small modules based
on intended usage and common sense (see R7RS or Chibi for examples) and
providing reexporting declarations for compatibility. This will allow
things to be moved around freely. Then refactor, cleanup, repeat.
Yeah. Especially the stuff from posix should be cleaned up. There are
plenty of unrelated things in there at varying abstraction levels,
which are only stuffed in there because they're somehow related to an
underlying syscall that's defined by POSIX. This is a little too
system-specific for me; it would make more sense to group those things
together into things like files, process-control, user-management etc,
and define modules in that way.
Post by Oleg Kolosov
I am very interested in this undertaking and want to help you out.
That would be much appreciated, once we get started.

Cheers,
Peter
--
http://www.more-magic.net
Felix Winkelmann
2014-06-26 13:56:47 UTC
Post by Peter Bex
I'm unsure how much further we can strip down the core. Perhaps we can
drop SRFI-1, SRFI-13 and SRFI-14, which would be just fine as eggs. But
currently core is using some procedures from these. I think that this is
a separate issue, though. Refactoring stuff as modules is a shitload of
work which is not yet certain to be feasible, and we should not drag more
work into this project.
The compiler and egg-setup stuff need a number of srfi-1/13/14
primitives, and the core tarball should be self-contained, so I don't see
much chance to eggify more of the base system.
Post by Peter Bex
We already support this syntax through the R7RS egg.
BTW, what's the status of that egg? The last time I looked it seemed
to be basically complete. Or is there still anything missing? (I know
there are some rough edges, and obscure warnings, but does it
basically work?)
Post by Peter Bex
We have already too many people asking about the difference between use,
require, require-extension, uses etc. Adding a second module system to
core would only add to the confusion (and *replacing* it would break the
world).
(add1 from me...)
Post by Peter Bex
The simple solution would be to stuff everything into a "core" module
which nobody but the core system is supposed to really use. We could
then define r4rs and r5rs/scheme as modules which import the whole of
core and reexport only the relevant procedures.
What do you mean by "everything" here? All the ##this#.. and ##that#...,
or the non-standard chicken procedures that are usually exported from
the "chicken" module?
Post by Peter Bex
But there's still the problem of code which replaces primitive
procedures. However much this is frowned upon by "modern" Scheme
standards, this is something that is explicitly allowed by RnRS | n < 6.
AFAIK toplevel code can do that even with identifiers exported by egg
modules, so perhaps this is not an issue. We have to test it though, and
luckily r4rstest.scm already checks this.
The module system doesn't prevent this, it just renames identifiers,
but doesn't change the semantics of assignment.
Post by Peter Bex
Yeah. Especially the stuff from posix should be cleaned up. There are
plenty of unrelated things in there at varying abstraction levels,
which are only stuffed in there because they're somehow related to an
underlying syscall that's defined by POSIX. This is a little too
system-specific for me; it would make more sense to group those things
together into things like files, process-control, user-management etc,
and define modules in that way.
Be aware that a platform-independent layer over those syscalls is
quite some design work. There is a reason why so many library
implementations often fall back to doing exactly this: wrapping each
and every syscall (look at ocaml, or the SML basis library).

Take windows into the equation and things get even worse. POSIX is not
the most ideal API "standard", but at least it is something one can
start with, even if Windows support is somewhat painful.
Post by Peter Bex
Post by Oleg Kolosov
I am very interested in this undertaking and want to help you out.
That would be much appreciated, once we get started.
Indeed.


felix
Peter Bex
2014-06-26 14:29:07 UTC
Post by Felix Winkelmann
The compiler and egg-setup stuff need a number of srfi-1/13/14
primitives, and the core tarball should be self-contained, so I don't see
much chance to eggify more of the base system.
Depending on how large the subset of procedures is we could perhaps pull
a simplified version of them into core and optimise that to hell, and
excise the slow remaining crap from core and move it to an egg (we
already have some duplication like foldl vs fold and string-intersperse
vs string-join). But I'm really not sure, I'd have to investigate.
Post by Felix Winkelmann
Post by Peter Bex
We already support this syntax through the R7RS egg.
BTW, what's the status of that egg? The last time I looked it seemed
to be basically complete. Or is there still anything missing? (I know
there are some rough edges, and obscure warnings, but does it
basically work?)
I've asked Seth and he said it's working adequately. Maybe Evan can
provide us with some more in-depth info.
Post by Felix Winkelmann
Post by Peter Bex
The simple solution would be to stuff everything into a "core" module
which nobody but the core system is supposed to really use. We could
then define r4rs and r5rs/scheme as modules which import the whole of
core and reexport only the relevant procedures.
What do you mean by "everything" here? All the ##this#.. and ##that#...,
or the non-standard chicken procedures that are usually exported from
the "chicken" module?
The ##-things and such. If it's already exported by a module like
chicken or scheme that's fine. But it might make more sense to put
it in scheme.scm in that case, if that's doable. We'd probably have to
split up core a bit in order to avoid having an overlap between files
and modules like we do now (eg, eval.scm, module.scm and library.scm
define things that are exported by the chicken module, but other things
from library.scm are exported by the scheme module).

We could still keep eval.scm and module.scm for example, if they strictly
export things for chicken. If that's undoable we could define lower-level
modules and compose the chicken and scheme modules from them, like it
more-or-less works right now. This could be the simplest thing to start
out with, anyway.
Post by Felix Winkelmann
Post by Peter Bex
But there's still the problem of code which replaces primitive
procedures. However much this is frowned upon by "modern" Scheme
standards, this is something that is explicitly allowed by RnRS | n < 6.
AFAIK toplevel code can do that even with identifiers exported by egg
modules, so perhaps this is not an issue. We have to test it though, and
luckily r4rstest.scm already checks this.
The module system doesn't prevent this, it just renames identifiers,
but doesn't change the semantics of assignment.
Good to know, thanks for explaining.
Post by Felix Winkelmann
Post by Peter Bex
Yeah. Especially the stuff from posix should be cleaned up. There are
plenty of unrelated things in there at varying abstraction levels,
which are only stuffed in there because they're somehow related to an
underlying syscall that's defined by POSIX. This is a little too
system-specific for me; it would make more sense to group those things
together into things like files, process-control, user-management etc,
and define modules in that way.
Be aware that a platform-independent layer over those syscalls is
quite some design work. There is a reason why so many library
implementations often fall back to doing exactly this: wrapping each
and every syscall (look at ocaml, or the SML basis library).
At least we can steal some good designs from other Schemes and Lisps,
where appropriate. :)

The biggest problem I have is the different abstraction levels of the
"posix" unit. Some are extremely thin wrappers (like for example
file-select and file-{open,read,write}, which block the entire process
instead of handling I/O in a nonblocking way and integrating with
the scheduler like the rest of the system does; file-read just returns
-1 and sets errno if you try to make it nonblocking), whereas others are
pretty sophisticated library procedures which provide a more high-level
API (like for example find-files and the signal handling stuff).
Post by Felix Winkelmann
Take windows into the equation and things get even worse. POSIX is not
the most ideal API "standard", but at least it is something one can
start with, even if Windows support is somewhat painful.
Hehehe, "somewhat" :) But you're right, most of what POSIX offers is
valuable and needs to be deeply integrated in core in order to provide
good support so it makes sense. Still, I like the idea of having only
completely portable stuff in core, so that programmers know that if it's
in core, it's going to work everywhere. That can save a lot of headaches
when porting an application to Windows, or CHICKEN to a completely new
platform.

Cheers,
Peter
--
http://www.more-magic.net
Evan Hanson
2014-06-26 18:49:01 UTC
Post by Peter Bex
Post by Felix Winkelmann
BTW, what's the status of that egg? The last time I looked it seemed
to be basically complete. Or is there still anything missing? (I know
there are some rough edges, and obscure warnings, but does it
basically work?)
I've asked Seth and he said it's working adequately. Maybe Evan can
provide us with some more in-depth info.
There's not much that's obviously missing anymore, it's mostly rough
edges as you said. The list at https://wiki.call-cc.org/r7rs-tasks#todo
is up to date.

Actually, the most important todo is probably documentation of the
gotchas/limitations/interplay between the egg and core that aren't
obvious from that list, e.g. `(import (foo bar))` doesn't load code
from ./foo/bar.scm (or ".sld" as seems to have become common), one must
`(import-for-syntax (r7rs))` to get R7RS's syntax-rules, and so on. But,
overall, I'd be fine with releasing it with such disclaimers just so
people can start hacking at it.

Cheers,

Evan
John Cowan
2014-06-26 23:42:31 UTC
Permalink
Post by Evan Hanson
There's not much that's obviously missing anymore, it's mostly rough
edges as you said. The list at https://wiki.call-cc.org/r7rs-tasks#todo
is up to date.
I've added a few things there. The only substantive point is that after
doing (use r7rs) in the REPL, the (scheme base) library should be loaded.
Currently it isn't, which means that (e.g.) you get the non-numbers version
of `/`.
Post by Evan Hanson
Actually, the most important todo is probably documentation of the
gotchas/limitations/interplay between the egg and core that aren't
obvious from that list, e.g. `(import (foo bar))` doesn't load code
from ./foo/bar.scm (or ".sld" as seems to have become common), one must
`(import-for-syntax (r7rs))` to get R7RS's syntax-rules, and so on. But,
overall, I'd be fine with releasing it with such disclaimers just so
people can start hacking at it.
+1. Why are the R7RS library names like (foo bar) mapped to foo.bar
instead of foo-bar?
--
John Cowan http://www.ccil.org/~cowan ***@ccil.org
Original line from The Warrior's Apprentice by Lois McMaster Bujold:
"Only on Barrayar would pulling a loaded needler start a stampede toward one."
English-to-Russian-to-English mangling thereof: "Only on Barrayar you risk to
lose support instead of finding it when you threat with the charged weapon."
Felix Winkelmann
2014-06-30 10:58:46 UTC
Permalink
From: Evan Hanson <***@foldling.org>
Subject: Re: [Chicken-hackers] [PATCH] Fix #1133
Date: Thu, 26 Jun 2014 11:49:01 -0700
Post by Evan Hanson
Post by Peter Bex
Post by Felix Winkelmann
BTW, what's the status of that egg? The last time I looked it seemed
to be basically complete. Or is there still anything missing? (I know
there are some rough edges, and obscure warnings, but does it
basically work?)
I've asked Seth and he said it's working adequately. Maybe Evan can
provide us with some more in-depth info.
There's not much that's obviously missing anymore, it's mostly rough
edges as you said. The list at https://wiki.call-cc.org/r7rs-tasks#todo
is up to date.
Actually, the most important todo is probably documentation of the
gotchas/limitations/interplay between the egg and core that aren't
obvious from that list, e.g. `(import (foo bar))` doesn't load code
from ./foo/bar.scm (or ".sld" as seems to have become common), one must
`(import-for-syntax (r7rs))` to get R7RS's syntax-rules, and so on. But,
overall, I'd be fine with releasing it with such disclaimers just so
people can start hacking at it.
Excellent! Evan, would you mind writing a wiki-page for the egg?
Perhaps based on the r7rs-tasks page?

I can try to figure out something for the import-for-syntax.

The thing about "define-values" is related to not being able to handle
extended (DSSSL) lambda-lists, is that correct? If yes, we can
probably ignore that for the time being.

Oh, and thanks very much for putting so much effort into the r7rs egg.
Well done, indeed!

I will try to give it some testing and perhaps I can fix a few open
issues.


felix
Evan Hanson
2014-07-01 22:23:25 UTC
Permalink
Post by Felix Winkelmann
Excellent! Evan, would you mind writing a wiki-page for the egg?
Perhaps based on the r7rs-tasks page?
Certainly.
Post by Felix Winkelmann
The thing about "define-values" is related to not being able to handle
extended (DSSSL) lambda-lists, is that correct? If yes, we can
probably ignore that for the time being.
Not DSSSL, but more basic lambda lists such as the following, which
core's `define-values` doesn't support:

(define-values a (values 1 2 3))
(define-values (a . b) (values 1 2 3))

I initially added an alternative definition of `define-values` to the
egg that supports these cases, but ran up against #1132 in that it
wasn't used within let and lambda bodies. I hope to post a patch to
extend core's version shortly.

Cheers,

Evan
Felix Winkelmann
2014-06-27 14:41:07 UTC
Permalink
Post by Peter Bex
Depending on how large the subset of procedures is we could perhaps pull
a simplified version of them into core and optimise that to hell, and
excise the slow remaining crap from core and move it to an egg (we
already have some duplication like foldl vs fold and string-intersperse
vs string-join). But I'm really not sure, I'd have to investigate.
That might be a good idea. Some of the SRFI-procedures are more
general than we need.
Post by Peter Bex
The ##-things and such. If it's already exported by a module like
chicken or scheme that's fine. But it might make more sense to put
it in scheme.scm in that case, if that's doable. We'd probably have to
split up core a bit in order to avoid having an overlap between files
and modules like we do now (eg, eval.scm, module.scm and library.scm
define things that are exported by the chicken module, but other things
from library.scm are exported by the scheme module).
We could still keep eval.scm and module.scm for example, if they strictly
export things for chicken. If that's undoable we could define lower-level
modules and compose the chicken and scheme modules from them, like it
more-or-less works right now. This could be the simplest thing to start
out with, anyway.
The scheme module is created by hand, there is no file corresponding
to it. I would propose to keep as much as possible in "chicken", even
if there is file overlap. These internal modules do not map to
specific files anyway. This means basically all the stuff in
"modules", "expand", "library", "eval" and "chicken-syntax".

It's the library-units that need to be heavily restructured, I think.
Post by Peter Bex
The biggest problem I have is the different abstraction levels of the
"posix" unit. Some are extremely thin wrappers (like for example
file-select and file-{open,read,write}, which block the entire process
instead of handling I/O in a nonblocking way and integrating with
the scheduler like the rest of the system does; file-read just returns
-1 and sets errno if you try to make it nonblocking), whereas others are
pretty sophisticated library procedures which provide a more high-level
API (like for example find-files and the signal handling stuff).
Yes, I agree.
Post by Peter Bex
Post by Felix Winkelmann
Take windows into the equation and things get even worse. POSIX is not
the most ideal API "standard", but at least it is something one can
start with, even if Windows support is somewhat painful.
Hehehe, "somewhat" :) But you're right, most of what POSIX offers is
valuable and needs to be deeply integrated in core in order to provide
good support so it makes sense. Still, I like the idea of having only
completely portable stuff in core, so that programmers know that if it's
in core, it's going to work everywhere. That can save a lot of headaches
when porting an application to Windows, or CHICKEN to a completely new
platform.
I would propose something like this:

- identify stuff that's not used in core and move that to eggs, for example:

memory-mapped files
binary-search (this screams out to be implemented as a functor)
queues
... much more, if possible ...

- Completely restructure all library units (extras, data-structures, files,
ports, lolevel, posix, utils). tcp and foreign stay as they are.

- irregex is used in core and will benefit from as much core-support as
it can get, so leave it as it is, as well.

- re-implement/copy stuff from SRFI-1/13/14 for internal use only,
move the rest into eggs.

- I would like to move srfi-18 to an egg as well, only keep the scheduler
and the internal threading-stuff in library.scm in core.

- srfi-69 can go to an egg. It is only used in chicken-profile, and
can be done using internal (symbol) hash-tables.

- I think srfi-4 can be eggified, too.

- Provide wrapper eggs for all the library units that are gone
(extras, ports, files, data-structures).

- I'm not sure about posix. Perhaps split this into higher-level
modules (eggify, if not needed in core) and keep a lower-level
interface in core.

- Use modules for the compiler. It is not used externally (well,
with the exception of user-passes, which nobody uses, I guess)
and would need some restructuring as well. But this may be a
first attempt at using modules internally.

Does this make sense?


felix
Oleg Kolosov
2014-06-27 20:18:24 UTC
Permalink
Post by Felix Winkelmann
Post by Peter Bex
Depending on how large the subset of procedures is we could perhaps pull
a simplified version of them into core and optimise that to hell, and
excise the slow remaining crap from core and move it to an egg (we
already have some duplication like foldl vs fold and string-intersperse
vs string-join). But I'm really not sure, I'd have to investigate.
That might be a good idea. Some of the SRFI-procedures are more
general than we need.
Sounds like candidates for "lolevel subset" APIs. There is also room
for leveraging Chicken's close relationship with C when optimizing code.
We have some bottlenecks where Scheme is "too high level" but using its
C API directly is too troublesome. It would be nice to have some more
low-level stuff conveniently exposed to the Scheme side.
Post by Felix Winkelmann
Post by Peter Bex
Post by Felix Winkelmann
Take windows into the equation and things get even worse. POSIX is not
the most ideal API "standard", but at least it is something one can
start with, even if Windows support is somewhat painful.
Hehehe, "somewhat" :) But you're right, most of what POSIX offers is
valuable and needs to be deeply integrated in core in order to provide
good support so it makes sense. Still, I like the idea of having only
completely portable stuff in core, so that programmers know that if it's
in core, it's going to work everywhere. That can save a lot of headaches
when porting an application to Windows, or CHICKEN to a completely new
platform.
Looks like it is mostly setup api that pulls POSIX. Making it optional
along with csc and friends will allow us to strip the core much more.
And they are hard to port to Windows (native) anyway.

I have an experimental CMake-based Chicken branch where these are not
needed. Basically, the agenda is to use CMake to detect as much as it
can during compilation, generate a bigger chicken-config.h, and get rid of
most of the platform-specific stuff in the core. Then write dumb versions of
csc, chicken-install, etc. which just generate CMakeLists with
predefined macros, to provide a smooth migration path.

Does it sound worthwhile?
Post by Felix Winkelmann
memory-mapped files
binary-search (this screams out to be implemented as a functor)
queues
... much more, if possible ...
- Completely restructure all library units (extras, data-structures, files,
ports, lolevel, posix, utils). tcp and foreign stay as they are.
- irregex is used in core and will benefit from as much core-support as
it can get, so leave it as it is, as well.
- re-implement/copy stuff from SRFI-1/13/14 for internal use only,
move the rest into eggs.
- I would like to move srfi-18 to an egg as well, only keep the scheduler
and the internal threading-stuff in library.scm in core.
Maybe it makes sense to expose some of that to make it easier to
implement stuff like concurrent-native-callbacks? I briefly investigated
the possibility to integrate it with libuv event loop and the internals
looked quite nasty.
Post by Felix Winkelmann
- srfi-69 can go to an egg. It is only used in chicken-profile, and
can be done using internal (symbol) hash-tables.
- I think srfi-4 can be eggified, too.
- Provide wrapper eggs for all the library units that are gone
(extras, ports, files, data-structures).
- I'm not sure about posix. Perhaps split this into higher-level
modules (eggify, if not needed in core) and keep a lower-level
interface in core.
- Use modules for the compiler. It is not used externally (well,
with the exception of user-passes, which nobody uses, I guess)
and would need some restructuring as well. But this may be a
first attempt at using modules internally.
We are hooking a logging system (basically extracting file and line number
information) through user-passes. There are also potential use cases like
externally getting module dependency information (sort of like the gcc -M
options). It is a nice feature to abuse.
Post by Felix Winkelmann
Does this make sense?
Looks like a great plan overall.
--
Regards, Oleg
Felix Winkelmann
2014-06-27 20:38:37 UTC
Permalink
Post by Oleg Kolosov
Sounds like candidates for "lolevel subset" APIs. There is also room
for leveraging Chicken's close relationship with C when optimizing code.
We have some bottlenecks where Scheme is "too high level" but using its
C API directly is too troublesome. It would be nice to have some more
low-level stuff conveniently exposed to the Scheme side.
What stuff would you like to see?
Post by Oleg Kolosov
We are hooking a logging system (basically extracting file and line number
information) through user-passes. There are also potential use cases like
externally getting module dependency information (sort of like the gcc -M
options). It is a nice feature to abuse.
Ah, ok. Good to know.


felix
Oleg Kolosov
2014-07-02 21:33:25 UTC
Permalink
Post by Felix Winkelmann
Post by Oleg Kolosov
Sounds like candidates for "lolevel subset" APIs. There is also room
for leveraging Chicken's close relationship with C when optimizing code.
We have some bottlenecks where Scheme is "too high level" but using its
C API directly is too troublesome. It would be nice to have some more
low-level stuff conveniently exposed to the Scheme side.
What stuff would you like to see?
More documentation for internals, even comments will do.

More low-level utilities for constructing complex data structures along
the lines of record-instance. Ways to define high-performance analogs
of defstruct's update.

Some control over the GC, with more direct support for ref-counted C
structures, for example. Or callbacks from native threads. I know about
concurrent-native-callbacks, but it looks too magical to me. Maybe there
is a simpler way. It is said that finalizers hurt performance; maybe this
can be improved with hints from the C side (for FFI code).

These are random thoughts. I only barely remember the problems I had
some time ago.
--
Regards, Oleg
Felix Winkelmann
2014-07-05 21:31:18 UTC
Permalink
From: Oleg Kolosov <***@gmail.com>
Subject: Re: [Chicken-hackers] [PATCH] Fix #1133
Date: Thu, 03 Jul 2014 01:33:25 +0400
Post by Oleg Kolosov
Post by Felix Winkelmann
Post by Oleg Kolosov
Sounds like candidates for "lolevel subset" APIs. There is also room
for leveraging Chicken's close relationship with C when optimizing code.
We have some bottlenecks where Scheme is "too high level" but using its
C API directly is too troublesome. It would be nice to have some more
low-level stuff conveniently exposed to the Scheme side.
What stuff would you like to see?
More documentation for internals, even comments will do.
Well, that can be done. External documentation will quickly get out of
sync, but comments shouldn't be a problem. If there are particular
places where one could start, please tell.
Post by Oleg Kolosov
More low-level utilities for constructing complex data structures along
the lines of record-instance. Ways to define high-performance analogs
of defstruct's update.
Do you mean high-performance analogs of functional update, for the
builtin records in the core system?
Post by Oleg Kolosov
Some control over the GC, with more direct support for ref-counted C
structures, for example. Or callbacks from native threads. I know about
concurrent-native-callbacks, but it looks too magical to me. Maybe there
is a simpler way. It is said that finalizers hurt performance; maybe this
can be improved with hints from the C side (for FFI code).
concurrent-native-callbacks is actually pretty dumb: just a pipe and
some mutexes, and a dispatcher reading pointers from the pipe. Having
something more lightweight would certainly be nice, but will need even
more magical tricks and/or complex machinery. I'm really not familiar
enough with multithreading, but someone more experienced may be.
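
The pipe-plus-dispatcher idea can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the egg's actual code: callback_request, post_callback, dispatch_one and bump are hypothetical names, everything runs in one thread to keep the example small, and the mutexes are left out because POSIX guarantees that pipe writes of at most PIPE_BUF bytes are atomic, which covers a single pointer.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical request record: which function to run, and on what. */
typedef struct {
    void (*fn)(void *arg);
    void *arg;
} callback_request;

/* Producer side (a native thread in the real setting): send the
 * address of the request through the pipe. Returns 0 on success. */
int post_callback(int write_fd, callback_request *req) {
    ssize_t n = write(write_fd, &req, sizeof req);
    return n == (ssize_t)sizeof req ? 0 : -1;
}

/* Dispatcher side: read one pointer from the pipe and run the
 * callback it describes. Returns 0 on success. */
int dispatch_one(int read_fd) {
    callback_request *req;
    ssize_t n = read(read_fd, &req, sizeof req);
    if (n != (ssize_t)sizeof req)
        return -1;
    req->fn(req->arg);
    return 0;
}

/* Example callback: increment the integer that arg points to. */
void bump(void *arg) {
    ++*(int *)arg;
}
```

In a real setting the request must stay alive until the dispatcher has run it, and a caller that needs the callback's result still has to wait for it, which is where the mutexes come back in.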

Finalizers have an impact on every GC, as every GC may free memory at
any time. What may be possible is to register finalizers only when it
is known that the associated data is not held in global variables. But
I'm not sure if it isn't in the end easier to manage the resources by
hand. One could do something like registering and de-registering
finalizers temporarily (the former is O(1), the latter O(N), though).
Whether this buys anything, is another question, though.


felix
Jörg F. Wittenberger
2014-07-06 08:42:39 UTC
Permalink
Post by Felix Winkelmann
Finalizers have an impact on every GC, as every GC may free memory at
any time. What may be possible is to register finalizers only when it
is known that the associated data is not held in global variables. But
I'm not sure if it isn't in the end easier to manage the resources by
hand. One could do something like registering and de-registering
finalizers temporarily (the former is O(1), the latter O(N), though).
Whether this buys anything, is another question, though. felix
I'm feeling ever more guilty here for failing to provide a single-feature
patch to runtime.c that changes just this.

If anyone has the patience to cherry-pick from a diff… here is code
that does de-registration in O(1) for finalizers and GC roots.
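
One common way to get O(1) de-registration is an intrusive doubly linked list in which the handle returned at registration time is the node itself, so no search is needed to unlink it. The following is a hypothetical sketch of that technique only; it is not taken from any diff posted to this list and does not claim to match the layout used in runtime.c. fin_node, fin_list and the fin_* functions are invented names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical finalizer record, linked in both directions. */
typedef struct fin_node {
    struct fin_node *prev, *next;
    void *object;               /* the object being watched */
    void (*finalize)(void *);   /* what to run when it dies */
} fin_node;

/* Circular list with a sentinel head, so unlinking needs no special
 * cases for the first or last element. */
typedef struct {
    fin_node head;
    size_t count;
} fin_list;

void fin_list_init(fin_list *l) {
    l->head.prev = l->head.next = &l->head;
    l->head.object = NULL;
    l->head.finalize = NULL;
    l->count = 0;
}

/* O(1): push a new node right after the sentinel and return it as
 * the handle. Returns NULL if allocation fails. */
fin_node *fin_register(fin_list *l, void *obj, void (*fn)(void *)) {
    fin_node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->object = obj;
    n->finalize = fn;
    n->prev = &l->head;
    n->next = l->head.next;
    l->head.next->prev = n;
    l->head.next = n;
    l->count++;
    return n;
}

/* O(1): unlink the node via its own neighbours; no list walk. */
void fin_deregister(fin_list *l, fin_node *n) {
    n->prev->next = n->next;
    n->next->prev = n->prev;
    l->count--;
    free(n);
}
```

The price is that the caller has to keep the handle around; looking a node up by object would bring the O(N) walk back.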
Peter Bex
2014-06-28 10:40:46 UTC
Permalink
Post by Oleg Kolosov
Looks like it is mostly setup api that pulls POSIX. Making it optional
along with csc and friends will allow us to strip the core much more.
And they are hard to port to Windows (native) anyway.
If I understand correctly, you propose replacing chicken-install with
cmake, but that's only able to compile things. Without the setup-api,
how are egg dependencies specified and how are the eggs downloaded and
managed? (ie, listing, installation and removal of the eggs on a
particular system)
Post by Oleg Kolosov
Post by Felix Winkelmann
- I would like to move srfi-18 to an egg as well, only keep the scheduler
and the internal threading-stuff in library.scm in core.
Maybe it makes sense to expose some of that to make it easier to
implement stuff like concurrent-native-callbacks?
Suggestions? Right now there's already the ##sys#thread stuff that
srfi-18 itself uses. I think there's not going to be a whole lot
of opportunity for exposing more stuff. If we're going to extract
srfi-18 into an egg I do agree we need to take a closer look at the
exact API it exposes. Right now threads are limited in what objects
they are able to block on. If we change that after making an egg it
will be more difficult. I think Jerry said he was working on improving
this.
Post by Oleg Kolosov
We are hooking a logging system (basically extracting file and line number
information) through user-passes. There are also potential use cases like
externally getting module dependency information (sort of like the gcc -M
options). It is a nice feature to abuse.
That's a pretty cool hack!

Cheers,
Peter
--
http://www.more-magic.net
Jörg F. Wittenberger
2014-06-29 09:57:24 UTC
Permalink
OK, since Peter mentioned me…
Post by Peter Bex
Post by Oleg Kolosov
Looks like it is mostly setup api that pulls POSIX. Making it optional
along with csc and friends will allow us to strip the core much more.
And they are hard to port to Windows (native) anyway.
If I understand correctly, you propose replacing chicken-install with
cmake, but that's only able to compile things. Without the setup-api,
how are egg dependencies specified and how are the eggs downloaded and
managed? (ie, listing, installation and removal of the eggs on a
particular system)
I'm even a fan of not depending on cmake at all. Instead I'd favor a
pure Scheme based make process. However I found the plt make macro a
little cumbersome to use, and hard to express generation of and
dependencies on .import.scm files, I extended it a little and wrote a
Scheme-language make process for chicken to test it.

Why? a) reduce dependencies b) reduce mental load for developers c)
Scheme makes some things sooo easy, we could have this make look into
the module declarations and figure out the dependency hell itself.

This does not solve the configuration according the system. That's where
I'd like to draw the line: have autoconf, cmake or whatever test system
capabilities. The build process should not be separate. (Plus: I'd like
it in Scheme; though that's more the personal taste of mine.)
Post by Peter Bex
Post by Oleg Kolosov
Post by Felix Winkelmann
- I would like to move srfi-18 to an egg as well, only keep the scheduler
and the internal threading-stuff in library.scm in core.
Maybe it makes sense to expose some of that to make it easier to
implement stuff like concurrent-native-callbacks?
Suggestions? Right now there's already the ##sys#thread stuff that
srfi-18 itself uses. I think there's not going to be a whole lot
of opportunity for exposing more stuff. If we're going to extract
srfi-18 into an egg I do agree we need to take a closer look at the
exact API it exposes. Right now threads are limited in what objects
they are able to block on. If we change that after making an egg it
will be more difficult. I think Jerry said he was working on improving
this.
Yes. And all this time I feel guilty that I don't have the time to
eventually sit down and pull those modifications over.

Also scheduler.scm mentions ";; This should really use a balanced tree:"
I'm using LLRB trees for the queues instead.


In fact I recall that there was even a fix, but I also recall that I
forgot to document the problem.

I did re-factor quite some things between the scheduler and srfi-18. And
I'm all in favor of a smaller scheduler API and alternative threading
implementations atop (like this:
http://lists.nongnu.org/archive/html/chicken-hackers/2012-05/msg00014.html).

Furthermore I'm in favor of a slight modification of the thread
structure, which would save my code a few allocations and indirections.

Quite a can of worms.
Post by Peter Bex
Post by Oleg Kolosov
We are hooking a logging system (basically extracting file and line number
information) through user-passes. There are also potential use cases like
externally getting module dependency information (sort of like the gcc -M
options). It is a nice feature to abuse.
That's a pretty cool hack!
Cheers,
Peter
Oleg Kolosov
2014-07-02 22:21:19 UTC
Permalink
Post by Jörg F. Wittenberger
OK, since Peter mentioned me…
Post by Peter Bex
Post by Oleg Kolosov
Looks like it is mostly setup api that pulls POSIX. Making it optional
along with csc and friends will allow us to strip the core much more.
And they are hard to port to Windows (native) anyway.
If I understand correctly, you propose replacing chicken-install with
cmake, but that's only able to compile things. Without the setup-api,
how are egg dependencies specified and how are the eggs downloaded and
managed? (ie, listing, installation and removal of the eggs on a
particular system)
So far I've settled on generating CMakeLists with predefined macros for
building things and leaving setup-api as is for downloading and managing,
leveraging CMake for figuring out system-dependent things. I have an
experimental implementation which downloads with chicken-install -r,
then builds everything. It is not very user friendly yet but works a few
times faster than chicken-install.

It should be straightforward to write a converter for .setup files.
Post by Jörg F. Wittenberger
I'm even a fan of not depending on cmake at all. Instead I'd favor a
pure Scheme based make process. However I found the plt make macro a
little cumbersome to use, and hard to express generation of and
dependencies on .import.scm files, I extended it a little and wrote a
Scheme-language make process for chicken to test it.
Why? a) reduce dependencies b) reduce mental load for developers c)
Scheme makes some things sooo easy, we could have this make look into
the module declarations and figure out the dependency hell itself.
There are no problems with generated dependencies in my CMake macros. We
are using them in our Chicken Scheme based application with a few external
C libraries, 40+ eggs and 50+ internal modules without major problems. I
even support emitting and using type and inline files automatically. But
I agree that CMake syntax can be quite ugly for non-trivial things.

From the package maintainer's point of view, idiosyncratic build systems
are the worst obstacles to widespread adoption.

I'm currently experimenting with automatic dependency extraction.
Unfortunately, apart from the combination of make with gcc -M, toolchains
do not expect dependencies to jump into the build graph when the process
has already started. This also breaks parallel compilation. I have some
ideas but have not yet figured out how to handle this cleanly.
Post by Jörg F. Wittenberger
This does not solve the configuration according the system. That's where
I'd like to draw the line: have autoconf, cmake or whatever test system
capabilities. The build process should not be separate. (Plus: I'd like
it in Scheme; though that's more the personal taste of mine.)
Post by Peter Bex
Post by Oleg Kolosov
Post by Felix Winkelmann
- I would like to move srfi-18 to an egg as well, only keep the scheduler
and the internal threading-stuff in library.scm in core.
Maybe it makes sense to expose some of that to make it easier to
implement stuff like concurrent-native-callbacks?
Suggestions? Right now there's already the ##sys#thread stuff that
srfi-18 itself uses. I think there's not going to be a whole lot
of opportunity for exposing more stuff. If we're going to extract
srfi-18 into an egg I do agree we need to take a closer look at the
exact API it exposes. Right now threads are limited in what objects
they are able to block on. If we change that after making an egg it
will be more difficult. I think Jerry said he was working on improving
this.
Yes. And all this time I feel guilty that I don't have the time to
eventually sit down and pull those modifications over.
Also scheduler.scm mentions ";; This should really use a balanced tree:"
I'm using LLRB trees for the queues instead.
In fact I recall that there was even a fix, but I also recall that I
forgot to document the problem.
I did re-factor quite some things between the scheduler and srfi-18. And
I'm all in favor of a smaller scheduler API and alternative threading
implementations atop (like this:
http://lists.nongnu.org/archive/html/chicken-hackers/2012-05/msg00014.html).
Furthermore I'm in favor of a slight modification of the thread
structure, which would save my code a few allocations and indirections.
Quite a can of worms.
Is there a clean way to offload some work to native threads (pure C)?
Calling back from C triggers a minor GC AFAIR, which I think can hurt
performance. Threads should have their own stack, making this
unnecessary. Maybe I'm wrong.
--
Regards, Oleg
Aleksej Saushev
2014-07-03 07:40:13 UTC
Permalink
Post by Oleg Kolosov
Post by Jörg F. Wittenberger
I'm even a fan of not depending on cmake at all. Instead I'd favor a
pure Scheme based make process. However I found the plt make macro a
little cumbersome to use, and hard to express generation of and
dependencies on .import.scm files, I extended it a little and wrote a
Scheme-language make process for chicken to test it.
Why? a) reduce dependencies b) reduce mental load for developers c)
Scheme makes some things sooo easy, we could have this make look into
the module declarations and figure out the dependency hell itself.
There are no problems with generated dependencies in my CMake macros. We
are using them in our Chicken Scheme based application with a few external
C libraries, 40+ eggs and 50+ internal modules without major problems. I
even support emitting and using type and inline files automatically. But
I agree that CMake syntax can be quite ugly for non-trivial things.
From the package maintainer's point of view, idiosyncratic build systems
are the worst obstacles to widespread adoption.
As a packager I can tell you that CMake comes next in line after SCons,
certainly much higher than autoconf, which makes it a rather bad choice
for portable software. In this respect (as in many others) I don't see
how it is any improvement over Make. It is an improvement only in one sense:
it has only one dialect. Once you realize that, you understand that you
could have insisted on using one particular Make dialect, which is
exactly what Chicken does.

Besides, generating CMakeLists automatically makes it really suspicious.
Post by Oleg Kolosov
I'm currently experimenting with automatic dependency extraction.
Unfortunately, apart from the combination of make with gcc -M, toolchains
do not expect dependencies to jump into the build graph when the process
has already started. This also breaks parallel compilation. I have some
ideas but have not yet figured out how to handle this cleanly.
This intervention changes the problem from a simple case of planning,
which reduces to a topological sort, into a more general control problem.
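
For the simple case, where every dependency is known before the build starts, the planning step can be sketched in C as below. The code repeatedly picks a target with no unmet dependencies, which yields a topological order or reports a cycle. MAX_TARGETS and plan_build are names chosen for this example only, and the quadratic scan is kept for brevity rather than speed.

```c
#include <assert.h>

#define MAX_TARGETS 16

/* deps[i][j] != 0 means "target i depends on target j".
 * On success, writes a valid build order into order[] and returns n.
 * Returns -1 if n is out of range or the graph contains a cycle. */
int plan_build(int n, int deps[][MAX_TARGETS], int order[]) {
    int pending[MAX_TARGETS];     /* unmet dependencies per target */
    int done[MAX_TARGETS] = {0};  /* already scheduled? */
    int count = 0;

    if (n < 0 || n > MAX_TARGETS)
        return -1;

    for (int i = 0; i < n; i++) {
        pending[i] = 0;
        for (int j = 0; j < n; j++)
            if (deps[i][j])
                pending[i]++;
    }

    while (count < n) {
        /* Find any unscheduled target whose dependencies are all met. */
        int picked = -1;
        for (int i = 0; i < n; i++) {
            if (!done[i] && pending[i] == 0) {
                picked = i;
                break;
            }
        }
        if (picked < 0)
            return -1;  /* everything left waits on something: a cycle */

        done[picked] = 1;
        order[count++] = picked;

        /* Building it satisfies one dependency of each dependent. */
        for (int i = 0; i < n; i++)
            if (deps[i][picked])
                pending[i]--;
    }
    return count;
}
```

The whole matrix has to be complete before the first target is picked. Once compiling a file can reveal new edges, that assumption no longer holds, which is the shift described above.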
--
HE CE3OH...
Yaroslav Tsarko
2014-07-03 08:14:10 UTC
Permalink
Besides its weird and clumsy syntax, CMake has one great advantage over the
herds of other so-called "build systems" (waf, scons, gyp ... <put your name
here>): cross compilation, and being a packager, you should be aware of
that. CMake indeed has decent support for it. And this point puts CMake
100 points ahead of other "build systems".
Post by Aleksej Saushev
As a packager I can tell you that CMake comes next in line after
SCons, certainly much higher than autoconf, which makes it a rather bad
choice for portable software. In this respect (as in many others) I
don't see how it is any improvement over Make. It is an improvement
only in one sense: it has only one dialect. Once you realize that, you
understand that you could have insisted on using one particular Make
dialect, which is exactly what Chicken does. Besides, generating
CMakeLists automatically makes it really suspicious.
Post by Oleg Kolosov
I'm currently experimenting with automatic dependency extraction.
Unfortunately, apart from the combination of make with gcc -M, toolchains
do not expect dependencies to jump into the build graph when the process
has already started. This also breaks parallel compilation. I have some
ideas but have not yet figured out how to handle this cleanly.
Felix Winkelmann
2014-07-03 12:05:25 UTC
Permalink
From: Yaroslav Tsarko <***@googlemail.com>
Subject: Re: [Chicken-hackers] [PATCH] Fix #1133
Date: Thu, 03 Jul 2014 12:14:10 +0400
Post by Yaroslav Tsarko
Besides its weird and clumsy syntax, CMake has one great advantage over
the herds of other so-called "build systems" (waf, scons, gyp ... <put
your name here>): cross compilation, and being a packager, you should
be aware of that. CMake indeed has decent support for it. And this
point puts CMake 100 points ahead of other "build systems".
Interesting. Can you point me to specifics? I find the cmake
documentation somewhat opaque, but perhaps it's just the terminology
that is alien to me. In other words, what makes cross-compilation
in CMake easier than in, say, GNU make?


felix
Yaroslav Tsarko
2014-07-03 12:37:33 UTC
Permalink
Sure. First, please take a look at the following CMake wiki page:
http://www.cmake.org/Wiki/CMake_Cross_Compiling.

In addition to that, many CMake commands already know how to work
properly in cross-compilation mode (for example, the find_library command:
http://www.cmake.org/cmake/help/v2.8.12/cmake.html#command:find_library).

In the common scenario all you need to do is set a few CMake variables
to specify your cross-compilation target, and CMake will handle many
things automatically.

As an example (I used this to build the CLucene library for Windows on my
Ubuntu 12.04 system):

cmake -D CMAKE_TOOLCHAIN_FILE=mingw-w64-toolchain.cmake -D CMAKE_INSTALL_PREFIX=<path> ..

where the mingw-w64-toolchain.cmake file mentioned above contains:

SET(CMAKE_SYSTEM_NAME Windows)
SET(GNU_HOST i686-w64-mingw32)
SET(CMAKE_C_COMPILER i686-w64-mingw32-gcc)
SET(CMAKE_CXX_COMPILER i686-w64-mingw32-g++)
SET(CMAKE_RC_COMPILER i686-w64-mingw32-windres)
SET(CMAKE_FIND_ROOT_PATH <path-to-mingw32> )
--
Thanks,
Yaroslav
Felix Winkelmann
2014-07-03 14:06:40 UTC
From: Yaroslav Tsarko <***@googlemail.com>
Subject: Re: [Chicken-hackers] [PATCH] Fix #1133
Date: Thu, 03 Jul 2014 16:37:33 +0400
Post by Yaroslav Tsarko
http://www.cmake.org/Wiki/CMake_Cross_Compiling.
In addition to that many CMake commands already know how to properly
work in cross-compilation mode (for example: find_library command
http://www.cmake.org/cmake/help/v2.8.12/cmake.html#command:find_library).
In common scenario all you need to do is to set up a few CMake
variables to specify your cross-compilation target and CMake will
handle many things automatically.
As example (I used this to build CLucene library on my Ubuntu 12.04
cmake -D CMAKE_TOOLCHAIN_FILE=mingw-w64-toolchain.cmake -D
CMAKE_INSTALL_PREFIX=<path> ..
SET(CMAKE_SYSTEM_NAME Windows)
SET(GNU_HOST i686-w64-mingw32)
SET(CMAKE_C_COMPILER i686-w64-mingw32-gcc)
SET(CMAKE_CXX_COMPILER i686-w64-mingw32-g++)
SET(CMAKE_RC_COMPILER i686-w64-mingw32-windres)
SET(CMAKE_FIND_ROOT_PATH <path-to-mingw32> )
Thanks for the information, this is very helpful.


felix
Aleksej Saushev
2014-07-03 18:23:24 UTC
Post by Yaroslav Tsarko
Sure. First, please take a look at the following CMake wiki
page: http://www.cmake.org/Wiki/CMake_Cross_Compiling.
In addition to that many CMake commands already know how to
find_library command
http://www.cmake.org/cmake/help/v2.8.12/cmake.html#command:find_library).
In common scenario all you need to do is to set up a few CMake
variables to specify your cross-compilation target and CMake
will handle many things automatically.
As example (I used this to build CLucene library on my Ubuntu
cmake -D CMAKE_TOOLCHAIN_FILE=mingw-w64-toolchain.cmake -D
CMAKE_INSTALL_PREFIX=<path> ..
SET(CMAKE_SYSTEM_NAME Windows)
SET(GNU_HOST i686-w64-mingw32)
SET(CMAKE_C_COMPILER i686-w64-mingw32-gcc)
SET(CMAKE_CXX_COMPILER i686-w64-mingw32-g++)
SET(CMAKE_RC_COMPILER i686-w64-mingw32-windres)
SET(CMAKE_FIND_ROOT_PATH <path-to-mingw32> )
This demonstrates exactly why CMake's support for cross-compilation
is no better than Make's. The analogous code for Make is:

make CC=i686-w64-mingw32-gcc CXX=i686-w64-mingw32-g++ SYSROOT=<path-to-mingw32>

The most important part here is that you have to add it manually while autoconf
has almost all the useful stuff already inside.

In general, CMake doesn't solve any real problem of Make. In some respects
it is even a step back. CMake is mostly equivalent to Make + some autotools.
The only problem of Make that it does solve is that of divergent dialects.
CMake has only one. But that's a triviality. We employ the same solution
by requiring bmake and ignoring all other dialects.
Otherwise CMake is "yet another tool" that requires additional maintenance.
--
HE CE3OH...
Jörg F. Wittenberger
2014-07-04 07:51:03 UTC
+1 …
Post by Aleksej Saushev
In general, CMake doesn't solve any real problem of Make. In some aspects
it is a step back even. CMake is mostly equivalent to Make + some autotools.
Plus: this mixing of make's job with the configuration management is
what I perceive as the real (technical) disadvantage of cmake.
Post by Aleksej Saushev
The only Make's problem it does solve is the problem of divergent dialects.
CMake has only one. But that's triviality. We employ the same solution
by requiring bmake and ignoring all other dialects.
For quite a long time I managed to use the "original" make dialect
exclusively. Until some nice guy contributed a lot of code using GNU
extensions, that is. Since then it has been *really* hard to modify this
make process.

Often enough more is not better.
Post by Aleksej Saushev
Otherwise CMake is "yet another tool" that requires additional maintenance.
+1

So far I can't see how a package maintainer's job becomes harder through
the use of a custom build tool like "ant" and such. The maintainers care
about the path of least resistance to get their configuration done. I
see some preference for autoconf, mostly because that's what's known.
Maintainers will often succeed by guessing "./configure && make". But once
one runs into this not working, it comes down to manually tweaking the
source. This is either easy in any build system, because the README
documents where to assign which variables, or it's hard work anyway,
regardless of the build system.
Aleksej Saushev
2014-07-03 17:55:36 UTC
Post by Yaroslav Tsarko
Besides its weird and clumsy syntax, CMake has one great advantage
over herds of other so-called "build systems" (waf, scons, gyp
... <put your name here>) - cross compilation, and being a
packager, you should be aware of that. CMake indeed has decent
support for it. And this point puts CMake 100 points ahead of
other "build systems".
No, I strongly disagree on the cross-compilation point.
All my experience shows that CMake is no better at cross-compilation
than anything else, with the exception of autoconf and BSD Make scripts.
--
HE CE3OH...
Peter Bex
2014-06-28 10:02:34 UTC
Post by Felix Winkelmann
memory-mapped files
binary-search (this screams out to be implemented as a functor)
queues
... much more, if possible ...
- Completely restructure all library units (extras, data-structures, files,
ports, lolevel, posix, utils). tcp and foreign stay as they are.
- irregex is used in core and will benefit from as much core-support as
it can get, so leave it as it is, as well.
- re-implement/copy stuff from SRFI-1/13/14 for internal use only,
move the rest into eggs.
- I would like to move srfi-18 to an egg as well, only keep the scheduler
and the internal threading-stuff in library.scm in core.
- srfi-69 can go to an egg. It is only used in chicken-profile, and
can be done using internal (symbol) hash-tables.
- I think srfi-4 can be eggified, too.
- Provide wrapper eggs for all the library units that are gone
(extras, ports, files, data-structures).
- I'm not sure about posix. Perhaps split this into higher-level
modules (eggify, if not needed in core) and keep a lower-level
interface in core.
- Use modules for the compiler. It is not used externally (well,
with the exception of user-passes, which nobody uses, I guess)
and would need some restructuring as well. But this may be a
first attempt at using modules internally.
Does this make sense?
Quite a lot, but it's a large project that is not exactly well-defined.
And the backwards compatibility will be pretty tricky too. However, I
think we can easily get started with removing stuff from core that isn't
used at all (like the mmapped files, and srfis). Once the core has been
made smaller, it will be easier to move things around and refactor what's
left.

Cheers,
Peter
--
http://www.more-magic.net
John Cowan
2014-06-26 22:13:35 UTC
Post by Peter Bex
It's not 100% separate, the (begin ...) form allows embedded Scheme forms.
Yes. As a matter of style, that's meant to be used in situations like
this:

(define-library (foo)
  ...
  (cond-expand
    ((library (scheme complex))
     (import (only (scheme complex) real-part imag-part)))
    (else
     (begin
       (define (real-part z) z)
       (define (imag-part z) 0))))
  ...)
Post by Peter Bex
Also, our module system already supports separation into files, either
by using INCLUDE or the shorthand form (module NAME (EXPORT ...) FILENAME)
from http://wiki.call-cc.org/man/4/Modules
But it does not support separate compilation of those files. As I said
before, I personally don't consider that a severe limitation.
--
John Cowan http://www.ccil.org/~cowan ***@ccil.org
You escaped them by the will-death and the Way of the Black Wheel.
I could not. --Great-Souled Sam
Oleg Kolosov
2014-06-27 19:10:39 UTC
Post by Peter Bex
Post by Oleg Kolosov
Units are also used for separate compilation of large code bases. It does
not always make sense to separate everything into modules. This is
useful, but should be a build-only matter. AFAIK, the current system does
not allow transparently switching between module and unit mode for a given
source file. We can try to add an option for chicken to mostly ignore
module declarations and just issue some useful warnings but compile as a
unit. This approach will enable incremental migration.
That's an interesting idea. It might work, but I'm also unsure of the
added benefit; if the code is always compiled separately in the "ignore
modules" mode, it won't add the intended benefits of additional checking
of dependencies and exports.
I mean an option to compile like: internally convert (module <name>
<exports>) to (declare <name>), hook import somehow into this for rename
and such, issue warnings for names still unresolved but just continue
and generate a unit. This will allow having something working during
the conversion to modules without requiring everything to be declared
and resolved at once. This is all hand-wavy; I have not investigated the
technical details much, but it did not seem so hard to do at first sight.
Post by Peter Bex
Post by Oleg Kolosov
...cific details, maybe it is already improved. An ability to split up
the Chicken runtime might be helpful, currently it is too big and
contains many unnecessary things (can be moved into eggs). Windows ...
I'm unsure how much further we can strip down the core. Perhaps we can
drop SRFI-1, SRFI-13 and SRFI-14, which would be just fine as eggs. But
currently core is using some procedures from these. I think that this is
a separate issue, though. Refactoring stuff as modules is a shitload of
work which is not yet certain to be feasible, and we should not drag more
work into this project.
"Eggs" was not the right word. I meant to just split it up more and make
most components optional. I was somehow under the impression that this
was the point of the "call to arms".

The motivation here is that we found out that Chicken works surprisingly
well on embedded devices and want to integrate it even deeper (like
initramfs, firmware updater and such). Experiments show it is
possible to reduce runtime size to at least half of its current size
and still have something usable. But it is too troublesome to do
manually; modularization will definitely help here and looks like a good
thing in general.
Post by Peter Bex
Post by Oleg Kolosov
What do you think about Chibi Scheme approach
(http://synthcode.com/scheme/chibi/#h2_ModuleSystem)?
Also, having compatible (and standard compliant)
module system is a huge selling point IMO.
We already support this syntax through the R7RS egg. We can't just
switch to the R7RS module system in core because it would break 100% of
the user code out there, for no technical gain whatsoever. CHICKEN core
is an R5RS system, and adding an R7RS module system to it adds no
benefit if everything else from R7RS is only available as an extension.
We have already too many people asking about the difference between use,
require, require-extension, uses etc. Adding a second module system to
core would only add to the confusion (and *replacing* it would break the
world).
The idea is: require and friends are not necessary at all (and
confusing). The user should just use import and let the runtime system
figure it out. Like John mentioned earlier.

The only real use case for detailed control over library loading I can
imagine is the support for module hot reload.
Post by Peter Bex
Post by Oleg Kolosov
Post by Peter Bex
Indeed, bootstrapping is an ever-present pitfall. And there's also
library.scm which does not really correspond to a particular module.
Perhaps to "scheme", but not exactly.
Yeah. Especially the stuff from posix should be cleaned up. There are
plenty of unrelated things in there at varying abstraction levels,
which are only stuffed in there because they're somehow related to an
underlying syscall that's defined by POSIX. This is a little too
system-specific for me; it would make more sense to group those things
together into things like files, process-control, user-management etc,
and define modules in that way.
That got me thinking: maybe we could implement module versioning? Not
just put everything into lib/chicken/<API_VERSION> but proper
module.so.1.2.3 like other shared libraries? And somehow add this to
the imported namespace (##module#1.2.3#procedure or so). In theory this
should allow moving stuff around. It could open a whole can of worms,
but still, what do you think?
--
Regards, Oleg
Peter Bex
2014-06-28 10:13:35 UTC
Post by Oleg Kolosov
I mean an option to compile like: internally convert (module <name>
<exports>) to (declare <name>), hook import somehow into this for rename
and such, issue warnings for names still unresolved but just continue
and generate a unit. This will allow having something working during
conversion to modules and not require to declare and resolve everything
at once.
That's a good idea and will make conversion to true modules much simpler.
Post by Oleg Kolosov
Post by Peter Bex
I'm unsure how much further we can strip down the core. Perhaps we can
drop SRFI-1, SRFI-13 and SRFI-14, which would be just fine as eggs. But
currently core is using some procedures from these. I think that this is
a separate issue, though. Refactoring stuff as modules is a shitload of
work which is not yet certain to be feasible, and we should not drag more
work into this project.
Eggs was not the right word. I meant to just split it up more and make
most components optional. I was under an impression somehow that this
was the point of "call to arms".
Indeed, but we must remain focused too. Felix's mail convinced me that
first removing more things from core might make conversion to modules
easier, so I guess this is not an unrelated project, but more of a
prerequisite of converting to modules.
Post by Oleg Kolosov
The motivation here is that we found out that Chicken works surprisingly
well on embedded devices and want to integrate it even deeper (like
initramfs, firmware updater and such). Experiments show it is
possible to reduce runtime size to at least half of its current size
and still have something usable. But it is too troublesome to do
manually; modularization will definitely help here and looks like a good
thing in general.
Being able to pick and choose which parts of core you want is nice,
and will indeed make it simpler to produce specialised, smaller
builds for embedded devices.
Post by Oleg Kolosov
Post by Peter Bex
We have already too many people asking about the difference between use,
require, require-extension, uses etc. Adding a second module system to
core would only add to the confusion (and *replacing* it would break the
world).
The idea is: require and friends are not necessary at all (and
confusing). The user should just use import and let the runtime system
figure it out. Like John mentioned earlier.
The only real use case for detailed control over library loading I can
imagine is the support for module hot reload.
I'm not convinced we need separate loading and importing either, as long
as it doesn't interfere with separate compilation. Right now everyone
always ends up doing (use ...) anyway. Perhaps Felix can show us an
example where only an import (no loading of code) is useful, or otherwise
how I'm missing the point?
Post by Oleg Kolosov
That got me thinking: maybe we could implement module versioning? Not
just put everything into lib/chicken/<API_VERSION> but proper
module.so.1.2.3 like other shared libraries? And somehow add this to
imported namespace (##module#1.2.3#procedure or so). In theory this
should allow to move stuff around, but can open a whole can of worms,
but still, what do you think?
I have no thoughts on this right now. It could be a good idea, but
I'm unable to foresee the consequences.

Cheers,
Peter
--
http://www.more-magic.net
Felix Winkelmann
2014-06-28 15:39:35 UTC
Post by Peter Bex
I'm not convinced we need separate loading and importing either, as long
as it doesn't interfere with separate compilation. Right now everyone
always ends up doing (use ...) anyway. Perhaps Felix can show us an
example where only an import (no loading of code) is useful, or otherwise
how I'm missing the point?
Static linking? Having several modules (library units) in a single
library (as in libchicken itself)?

We had these discussions many times and there is nothing that can be
done about it, with the exception of hiding it somehow behind modules,
or whatever.
Post by Peter Bex
Post by Oleg Kolosov
That got me thinking: maybe we could implement module versioning? Not
just put everything into lib/chicken/<API_VERSION> but proper
module.so.1.2.3 like other shared libraries? And somehow add this to
imported namespace (##module#1.2.3#procedure or so). In theory this
should allow to move stuff around, but can open a whole can of worms,
but still, what do you think?
I have no thoughts on this right now. It could be a good idea, but
I'm unable to foresee the consequences.
No thoughts here, either. But in my experience library versioning turns
out to be even more of a mess than going without. You always end up in a
situation where two versions of the same module are required in the
same program, increasing code size and the chance of collisions, in
case both modules access limited resources. It's better to fix the
conflict right away and avoid the duplication. But my experience is
limited, and perhaps Oleg means something entirely different.


felix
Peter Bex
2014-06-29 15:55:10 UTC
Post by Felix Winkelmann
Post by Peter Bex
I'm not convinced we need separate loading and importing either, as long
as it doesn't interfere with separate compilation. Right now everyone
always ends up doing (use ...) anyway. Perhaps Felix can show us an
example where only an import (no loading of code) is useful, or otherwise
how I'm missing the point?
Static linking? Having several modules (library units) in a single
library (as in libchicken itself)?
As far as I understand, if you statically link a program which contains
a (use foo) clause, you can link it together with foo.a, and it won't
try to load anything dynamically. Is that correct?

If it's correct, I don't see why (require-library) and (import) must
be separate things from (use). They could all perform the same
operation (ie, importing the module identifier and loading the code
if it isn't loaded yet).
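
For reference, a small sketch of how the forms mentioned here relate to
each other. It assumes CHICKEN 4 semantics, where "use" is shorthand for
"require-extension", and it uses the core srfi-1 unit as a stand-in for
the "foo" above:

```scheme
;; CHICKEN 4 only; srfi-1 stands in for the "foo" of the discussion.

;; Two separate steps: make the code available, then import the bindings.
(require-library srfi-1)
(import srfi-1)

;; One step: "use" (shorthand for "require-extension") does both.
(use srfi-1)

;; Either way, the srfi-1 bindings are now usable.
(print (fold + 0 '(1 2 3)))
```

The proposal amounts to giving plain (import ...) the behaviour of the
one-step form.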

Cheers,
Peter
--
http://www.more-magic.net
Felix Winkelmann
2014-06-29 18:34:24 UTC
From: Peter Bex <***@xs4all.nl>
Subject: Re: [Chicken-hackers] [PATCH] Fix #1133
Date: Sun, 29 Jun 2014 17:55:10 +0200
Post by Peter Bex
Post by Felix Winkelmann
Post by Peter Bex
I'm not convinced we need separate loading and importing either, as long
as it doesn't interfere with separate compilation. Right now everyone
always ends up doing (use ...) anyway. Perhaps Felix can show us an
example where only an import (no loading of code) is useful, or otherwise
how I'm missing the point?
Static linking? Having several modules (library units) in a single
library (as in libchicken itself)?
As far as I understand, if you statically link a program which contains
a (use foo) clause, you can link it together with foo.a, and it won't
try to load anything dynamically. Is that correct?
No, it will try to load "foo.so" dynamically. Some builtin library
units are treated as a special case.


felix
Peter Bex
2014-06-29 18:36:40 UTC
Post by Felix Winkelmann
Post by Peter Bex
As far as I understand, if you statically link a program which contains
a (use foo) clause, you can link it together with foo.a, and it won't
try to load anything dynamically. Is that correct?
No, it will try to load "foo.so" dynamically. Some builtin library
units are treated as a special case.
Ah, now I understand. That's due to register-feature, right?

Would it be enough to simply always register the library as being
loaded in case of static linking? That way, we could make (import)
perform the task of (use), and remove all the other forms that so
often confuse beginners. This will also make using the r7rs egg less
painful.

Cheers,
Peter
--
http://www.more-magic.net
Felix Winkelmann
2014-06-30 22:15:47 UTC
Post by Peter Bex
Ah, now I understand. That's due to register-feature, right?
This may be the case, I think (I'm too lazy to check right now, but
features are used for this, at least for the core library units)
Post by Peter Bex
Would it be enough to simply always register the library as being
loaded in case of static linking? That way, we could make (import)
perform the task of (use), and remove all the other forms that so
often confuse beginners. This will also make using the r7rs egg less
painful.
Hm. You mean invoking the entry-point of a library unit should
register a feature of the same name? Perhaps that might work (yet
wouldn't solve the problem of multiple compilation (and thus
entry-points) units in a single library), but I'm not sure if it is a
good idea. The separation of loading/linking and importing is IMHO
actually an advantage and provides more flexibility. Hiding
everything in "import" will result in trying to work around that very
feature, sooner or later.

I also don't think that the confusion is _that_ great,
"require-extension"/"use" normally works fine, in the usual mode most
users will eventually have - one module per compilation unit.

That the documentation does probably increase the confusion due to the
many options is a valid concern, though.

And as "use" currently performs the task of "import", why do you want
"import" to perform the task of "use", if not for R7RS compatibility?
(which is not necessary to implement in core. CHICKEN is an R5RS
system, as you said in another mail.)

And why would it make using the r7rs egg less painful? I'm not sure I
understand, but I'm probably missing something.


felix
Peter Bex
2014-07-01 07:13:45 UTC
Post by Felix Winkelmann
Hm. You mean invoking the entry-point of a library unit should
register a feature of the same name?
Or something else, I'm not sure what exactly. I think conflating
features with module names is asking for trouble down the line.
If nothing else, we could prefix it, like for example module:foo.
It's ugly but would avoid name conflicts.
Post by Felix Winkelmann
Perhaps that might work (yet
wouldn't solve the problem of multiple compilation (and thus
entry-points) units in a single library)
Can you elaborate on that?
Post by Felix Winkelmann
but I'm not sure if it is a
good idea. The separation of loading/linking and importing is IMHO
actually an advantage and provides more flexibility. Hiding
everything in "import" will result in trying to work around that very
feature, sooner or later.
Is this just a gut feeling, or can you give an example where it will
cause trouble? So far, I've only seen several questions on IRC from
newbies who got confused that import didn't actually load the code.
And having to remember to (import chicken scheme) but that we should
(use everything-else) is just a hassle, and a barrier to entry for
newbies. I admit it's a relatively minor annoyance, but if we can
remove it I don't see why we shouldn't (unless of course there's a
compelling example why loading and importing should be kept separate)
Post by Felix Winkelmann
I also don't think that the confusion is _that_ great,
"require-extension"/"use" normally works fine, in the usual mode most
users will eventually have - one module per compilation unit.
Having multiple modules inside one file is kind of problematic anyway,
so I don't think many people are doing that. (I don't think we should
actively make that impossible, however)
Post by Felix Winkelmann
That the documentation does probably increase the confusion due to the
many options is a valid concern, though.
Perhaps we should strive to get rid of some forms. I'm not sure which
though. require-extension is there for SRFI compat, isn't it?
If "import" performed the task currently performed by "use", we could
eventually get rid of "require-library" AND "use".

I'm not sure if "require" has any uses, otherwise we could start by
deprecating that. I've never seen anyone use it "in the wild".
Post by Felix Winkelmann
And as "use" currently performs the task of "import", why do you want
"import" to perform the task of "use", if not for R7RS compatibility?
For convenience and to reduce confusion. We wouldn't need "use" at all,
eventually we can get rid of it.
Post by Felix Winkelmann
(which is not necessary to implement in core. CHICKEN is an R5RS
system, as you said in another mail.)
It is, but R5RS doesn't have a module system, so we're not bound to the
spec for how that works, and can change it to whatever is most
convenient for us.
Post by Felix Winkelmann
And why would it make using the r7rs egg less painful? I'm not sure I
understand, but I'm probably missing something.
Hm, perhaps it's me who is missing something. I seem to recall that the
r7rs egg just mapped import to CHICKEN's import, requiring the user to
load the implementation of a module somehow "from the outside", but that
doesn't seem to be the case (anymore?), so just ignore that.

Cheers,
Peter
--
http://www.more-magic.net
Felix Winkelmann
2014-07-01 15:08:15 UTC
Post by Peter Bex
Post by Felix Winkelmann
Hm. You mean invoking the entry-point of a library unit should
register a feature of the same name?
Or something else, I'm not sure what exactly. I think conflating
features with module names is asking for trouble down the line.
If nothing else, we could prefix it, like for example module:foo.
It's ugly but would avoid name conflicts.
Yes, I agree. But perhaps using a different form of "registry"
for loaded stuff might be better.
Post by Peter Bex
Post by Felix Winkelmann
Perhaps that might work (yet
wouldn't solve the problem of multiple compilation (and thus
entry-points) units in a single library)
Can you elaborate on that?
You can have multiple entry-points (units) in a single library, just
as we have in libchicken at the moment. This might work with a
"feature-registry" based approach, but I'm afraid of changing anything
in that code (somewhere deep in eval.scm).
Post by Peter Bex
Post by Felix Winkelmann
but I'm not sure if it is a
good idea. The separation of loading/linking and importing is IMHO
actually an advantage and provides more flexibility. Hiding
everything in "import" will result in trying to work around that very
feature, sooner or later.
Is this just a gut feeling, or can you give an example where it will
cause trouble? So far, I've only seen several questions on IRC from
newbies who got confused that import didn't actually load the code.
And having to remember to (import chicken scheme) but that we should
(use everything-else) is just a hassle, and a barrier to entry for
newbies. I admit it's a relatively minor annoyance, but if we can
remove it I don't see why we shouldn't (unless of course there's a
compelling example why loading and importing should be kept separate)
It's a gut feeling. I don't doubt this *can* be simplified, but
getting this right will be a major job, if it is going to cover all
possible use cases. And we already are planning too much, and haven't
even got the manpower to get patches to this list reviewed in a timely
manner...
Post by Peter Bex
I'm not sure if "require" has any uses, otherwise we could start by
deprecating that. I've never seen anyone use it "in the wild".
You need it if you don't use modules, and if you don't want to
manually track what is loaded. But it is probably the least used of
those loading constructs.


felix
Felix Winkelmann
2014-07-01 22:50:02 UTC
Ok, let's start from scratch...

* We can't change the existing machinery without breaking an awful lot
of code, so any solution must be an addition to what we currently
have.

* The basic entities we have to deal with are "compilation units",
bodies of code, either statically linked into an executable (or
library) or dynamically loaded.

* These compilation units may or may not contain one or more
"modules", which are separate namespaces (or "bindings") over those
bodies of code.

* "import" incorporates bindings into the current environment, either
globally or inside another namespace (module). We want to
_automatically_ make the code associated with that namespace
available, regardless of the nature of the compilation unit that
contains that code. Is this interpretation correct?

* Making the code inside a compilation unit available happens either
by loading, resulting at one point of time in a call to "load" (this
includes interpreted code in source form, which is just another
flavor of a compilation unit), or it happens by declaring an
externally available entry-point, currently via "(declare (uses
...))".

(This needs a more obvious or natural syntax at some point, but
that isn't relevant right now)

* Declaring an entry-point into the current compilation unit
(basically the current source file) takes place by "(declare (unit
...))".

* The last 2 points are important if we want to support static
linking. Loading is in this case the simpler operation, as the
entry-point always has the same name. For static linking the
entry-points need to be named differently (there might be ways
around this limitation, but to keep things simple, let's not
consider that right now.)

* So, if we create a "registry" of linked/loaded compilation units,
"import" can consult this registry and check whether a compilation
unit of the same name is already registered and, if not, default to
loading a ".so" or ".scm" with the same name. If the latter is not
found, we have an error. If it is found, add it to the registry.

* "import" incorporates bindings from a set of available modules, also
registered somewhere, specifically in ##sys#module-table. Should it
also handle compilation-units for which no bindings exist (i.e. all
bindings are unqualified)? This is only useful at toplevel, or, in
other words, not inside a module. This will also bring up the
question whether such a behaviour might lead to head-scratching in
case a module should exist, but the binding-information is
unavailable for some erroneous reason.

* Declaring an externally available entry-point must add the
compilation unit associated with it to the registry.
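
The registry lookup described in the points above can be sketched roughly
as follows. This is only an illustration in portable Scheme: the names
unit-state, register-unit! and ensure-unit! are hypothetical and do not
exist in CHICKEN, and the file lookup and the loader are passed in as
procedures so that the sketch makes no claim about the real search path
or about how loading is implemented.

```scheme
;; Hypothetical sketch; none of these names are part of CHICKEN.
;; The registry maps a compilation-unit name (a symbol) to a state.
(define *unit-registry* '())            ; alist of (name . state)

;; Return the recorded state of NAME, or #f if it is not registered.
(define (unit-state name)
  (let ((entry (assq name *unit-registry*)))
    (and entry (cdr entry))))

;; Record (or update) the state of NAME, e.g. 'linked or 'loaded.
(define (register-unit! name state)
  (let ((entry (assq name *unit-registry*)))
    (if entry
        (set-cdr! entry state)
        (set! *unit-registry*
              (cons (cons name state) *unit-registry*)))))

;; What "import" would do before incorporating bindings:
;; LOCATE maps a unit name to a file name (".so" or ".scm") or #f,
;; LOAD-UNIT loads that file.
(define (ensure-unit! name locate load-unit)
  (cond ((unit-state name) 'already-registered)
        ((locate name)
         => (lambda (file)
              (load-unit file)
              (register-unit! name 'loaded)
              'loaded))
        (else (error "cannot find compilation unit" name))))

;; Declaring an externally available entry-point registers the unit,
;; so a later lookup does not attempt to load anything:
(register-unit! 'ports 'linked)
(ensure-unit! 'ports (lambda (n) #f) (lambda (f) #f))
```

Keeping a state per entry, instead of mere membership in a list, leaves
room to track whether a unit has been initialized.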

(Sorry, now it gets complicated...)

* libchicken contains a number of entry-points, one for each library
unit that comes with the core system. The registry must already have
entries for these. Users might want to use a similar
physical structure for their code, so we will have to provide means
to add "default" registry entries, I think (I'm not completely
sure right now - the resolution of the entry-points happens
automatically by the linker, but we have to make later "import"s
aware of this.)

* Currently "(declare (unit ...))" calls the entry-point,
_initializing_ the compilation unit. Later "import"s will just
incorporate the bindings. Do we want to initialize the compilation
unit on the first "import"? If yes, we need to separate the notions
of declaring an externally available entry-point and calling it, the
latter being done (we hope) transparently by "import".

* The same situation arises with loaded compilation units. Consider a
dynamically loaded ".so" that holds several compilation units: When
is the entry-point of each contained compilation unit called? On
first "import"? In this case it makes sense to generalize this, I
think.

* The different actions or declarations will need different constructs
to implement the low-level behaviour. Not all of them need to be
user-visible. "import" naturally will. Declaring the current
compilation unit to have a separately named entry point will do so
as well. Declaring an externally available entry-point will. And
finally something for registering a "default" (admittedly for those
special occasions...)

* The registry needs to be something more extensible than a simple
"feature" list. We have to keep track of what is initialized, and so
on. Using any existing mechanism will only make it harder to later
remove the old code and make the existing code even more complicated
than it already is.

* Changing the semantics of "import" for "late" initializing of
compilation units breaks backwards compatibility, but we don't want
to create yet another special form, right? The conservative solution
is to do initialization at the point where an externally available
entry point is declared or code is explicitly loaded, like it
currently is implemented.

(Side note: loading invokes the default entry point "C_toplevel",
declaring an externally available entry-point invokes the
entry-point derived from the name of the compilation unit. In the
case of an ".so" holding several compilation units, we have a
mixture of default entry-point + separately named entry-points. Oh,
this is fun...)

* Thinking of this now, I realize that the compilation unit itself
might already contain the binding-information - this is the case
when we compile a module without emitting an import library. So late
initialization actually doesn't work, unless we want to require
import libraries in any case. A valid approach, but this may again
have other implications.

* It would be nice to have some terminology for those "bodies of code"
that we can use to invent new special forms to cleanly perform the
above mentioned "actions". This will of course increase the
confusion in the beginning, but we can deprecate the old forms at
some point.
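
The registry idea from the points above can be sketched roughly as
follows. This is only an illustration in C under stated assumptions,
not how CHICKEN implements anything: all names (unit_record,
registry_add, registry_import, demo_unit_toplevel) are hypothetical,
entry-points are simplified to plain "void f(void)" functions, and the
fallback of loading a ".so" or ".scm" is reduced to an error return.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry of linked/loaded compilation units. */
enum { MAX_UNITS = 64 };

typedef void (*unit_entry_point)(void);

typedef struct {
    const char *name;        /* unit name, e.g. "ports" */
    unit_entry_point entry;  /* called once to initialize the unit */
    int initialized;         /* nonzero once the entry-point has run */
} unit_record;

static unit_record registry[MAX_UNITS];
static size_t registry_count = 0;

/* Look a unit up by name; returns NULL if it is not registered. */
unit_record *registry_find(const char *name) {
    assert(name != NULL);
    for (size_t i = 0; i < registry_count; i++)
        if (strcmp(registry[i].name, name) == 0)
            return &registry[i];
    return NULL;
}

/* Declare an externally available entry-point WITHOUT calling it.
   Returns the existing record if the name is already registered,
   or NULL if the table is full. */
unit_record *registry_add(const char *name, unit_entry_point entry) {
    unit_record *r = registry_find(name);
    if (r != NULL) return r;
    if (registry_count == MAX_UNITS) return NULL;
    r = &registry[registry_count++];
    r->name = name;
    r->entry = entry;
    r->initialized = 0;
    return r;
}

/* What a "late initializing" import could do: run the entry-point on
   first use only. Returns 0 on success, -1 if the unit is unknown
   (the point where a real system would try to load a file). */
int registry_import(const char *name) {
    unit_record *r = registry_find(name);
    if (r == NULL) return -1;
    if (!r->initialized) {
        r->initialized = 1;
        if (r->entry != NULL) r->entry();
    }
    return 0;
}

/* A stand-in unit, used only to observe initialization. */
static int demo_init_count = 0;
void demo_unit_toplevel(void) { demo_init_count++; }
```

Keeping "registered" and "initialized" as separate pieces of state is
the part that matters here: it is what lets declaring an entry-point
and calling it become two distinct steps, as discussed above.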

I'm sure I have forgotten something, but it is important that we think
of all possible use cases before anything is changed, or we really
start going into details.

Note that our current CHICKEN does even more than this:
"require-extension" handling feature-IDs, for example. Or
automatically loading syntax-extensions. It's not a coincidence that
handling extensions/using/importing is in part done by a procedure
called "##sys#do-the-right-thing". And then there is figuring out
where the extensions are located, or telling the compiler what units
are loaded, or handling the "(srfi N ...)" extension-specifier even in
the presence of module-binding modifiers like "rename". Wheels within
wheels - it's terrible...

All that nasty lowlevel stuff does not necessarily have to be touched,
but care must be taken before we lock down what will be allowed in the
future and what will not. This is kind of obvious, but I just wanted to
mention it once more.

I hope I haven't raised the confusion to unbearable levels. My
intention was to clear things up, but I have my doubts whether this
was successful.


felix
Peter Bex
2014-07-02 11:37:28 UTC
Permalink
Post by Felix Winkelmann
I hope I haven't raised the confusion to unbearable levels. My
intention was to clear things up, but I have my doubts whether this
was successful.
Thanks for sending this mail, it provides a complete overview of what
needs to be done. I don't have further comments right now.

Cheers,
Peter
--
http://www.more-magic.net
John Cowan
2014-07-02 19:29:24 UTC
Permalink
Post by Felix Winkelmann
* "import" incorporates bindings into the current environment, either
globally or inside another namespace (module). We want to
_automatically_ make the code associated with that namespace
available, regardless of the nature of the compilation unit that
contains that code. Is this interpretation correct?
That's what I'd like to see, yes. However, there are certain perverse
circumstances in which it won't work. For example, if "a.scm" includes
module "b" and vice versa, then neither (use a) nor (use b) will succeed,
although it is possible to get both modules loaded and imported by messing
with the low-level operations.

Use (and its synonym require-extension) essentially involves loading a named
body of code and importing a module of the same name. If either of those
operations fails, `use` fails. But it does not require that the code loaded
actually provides the module!
Post by Felix Winkelmann
* The last 2 points are important if we want to support static
linking. Loading is in this case the simpler operation, as the
entry-point always has the same name. For static linking the
entry-points need to be named differently (there might be ways
around this limitation, but to keep things simple, let's not
consider that right now.)
It seems easy to provide them with unique entry points for static
purposes, provided the name of the entry point can be deduced from the
name of the loadable chunk.
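
A minimal sketch of such a deduction, in C. The scheme used here
("C_" + mangled name + "_toplevel", with every non-alphanumeric byte
written as "_" plus two hex digits) is an assumption made for the
example, as is the function name unit_entry_point_name; the mangling
CHICKEN really uses may differ.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Write a C identifier for the entry-point of `unit` into `out`.
   Alphanumeric bytes are copied; every other byte becomes "_XX"
   (two lowercase hex digits), so distinct unit names always give
   distinct identifiers. Returns 0 on success, -1 if `out` is too
   small. */
int unit_entry_point_name(const char *unit, char *out, size_t outsize) {
    assert(unit != NULL && out != NULL);
    int n = snprintf(out, outsize, "C_");
    if (n < 0 || (size_t)n >= outsize) return -1;
    size_t pos = (size_t)n;
    for (const unsigned char *p = (const unsigned char *)unit; *p; p++) {
        if (isalnum(*p))
            n = snprintf(out + pos, outsize - pos, "%c", *p);
        else
            n = snprintf(out + pos, outsize - pos, "_%02x", *p);
        if (n < 0 || (size_t)n >= outsize - pos) return -1;
        pos += (size_t)n;
    }
    n = snprintf(out + pos, outsize - pos, "_toplevel");
    if (n < 0 || (size_t)n >= outsize - pos) return -1;
    return 0;
}
```

With this scheme a unit called "ports" maps to "C_ports_toplevel",
and "data-structures" maps to "C_data_2dstructures_toplevel".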
Post by Felix Winkelmann
* So, if we create a "registry" of linked/loaded compilation units,
"import" can consult this registry and check whether a compilation
unit of the same name is already registered and, if not, default to
loading a ".so" or ".scm" with the same name. If the latter is not
found, we have an error. If it is found, add it to the registry.
Just so.
Post by Felix Winkelmann
* "import" incorporates bindings from a set of available modules, also
registered somewhere, specifically in ##sys#module-table. Should it
also handle compilation-units for which no bindings exist (i.e. all
bindings are unqualified)? This is only useful at toplevel, or, in
other words, not inside a module. This will also bring up the
question whether such a behaviour might lead to head-scratching in
case a module should exist, but the binding-information is
unavailable for some erroneous reason.
The obvious reason would be a failure to compile with -j or -J. It's
probably surprising to newbies that it's not enough to write (module foo
...) in a file "foo.scm" and then say "(use foo)" at the REPL; you have
to compile "foo.scm" with -J at the very least.
Post by Felix Winkelmann
* libchicken contains a number of entry-points, one for each library
unit that comes with the core system. The registry must already have
entries for these. Users might want to use a similar
physical structure of their code, so we will have to provide means
to add "default" registry entries, I think (I'm not completely
sure right now - the resolution of the entry-points happens
automatically by the linker, but we have to make later "import"s
aware of this.)
IMO this should be a special case for libchicken. I see no reason why
ordinary users should bundle multiple modules into a single .so file.
Post by Felix Winkelmann
* Currently "(declare (unit ...))" calls the entry-point,
_initializing_ the compilation unit. Later "import"s will just
incorporate the bindings. Do we want to initialize the compilation
unit on the first "import"? If yes, we need to separate the notions
of declaring an externally available entry-point and calling it, the
latter being done (we hope) transparently by "import".
Well, that is what `use` already does, no?
--
John Cowan http://www.ccil.org/~cowan ***@ccil.org
Principles. You can't say A is made of B or vice versa.
All mass is interaction. --Richard Feynman
Felix Winkelmann
2014-07-05 21:16:47 UTC
Permalink
Post by John Cowan
The obvious reason would be a failure to compile with -j or -J. It's
probably surprising to newbies that it's not enough to write (module foo
...) in a file "foo.scm" and then say "(use foo)" at the REPL; you have
to compile "foo.scm" with -J at the very least.
That's true. The "obvious" solution (to load the .so and "import" the
module bindings) is only obvious to those who understand the
underlying implementation.
Post by John Cowan
Post by Felix Winkelmann
* libchicken contains a number of entry-points, one for each library
unit that comes with the core system. The registry must already have
entries for these. Users might want to use a similar
physical structure of their code, so we will have to provide means
to add "default" registry entries, I think (I'm not completely
sure right now - the resolution of the entry-points happens
automatically by the linker, but we have to make later "import"s
aware of this.)
IMO this should be a special case for libchicken. I see no reason why
ordinary users should bundle multiple modules into a single .so file.
It must be possible, though. There must at least be some low-level
operation available.
Post by John Cowan
Post by Felix Winkelmann
* Currently "(declare (unit ...))" calls the entry-point,
_initializing_ the compilation unit. Later "import"s will just
incorporate the bindings. Do we want to initialize the compilation
unit on the first "import"? If yes, we need to separate the notions
of declaring an externally available entry-point and calling it, the
latter being done (we hope) transparently by "import".
Well, that is what `use` already does, no?
"use" only does that the first time a dynamically loadable extension
is loaded. Static entry-points (declared via "(declare (unit ...))")
are called at the start of the toplevel code of the current
compilation unit, before everything else (specifically, it produces a
"##core#callunit" special form, inserted before the actual user code).
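
To illustrate only the ordering described here, a small C sketch. The
names (unit_a_entry, unit_b_entry, user_toplevel_body,
program_toplevel) are made up for the example, and the plain
"void f(void)" functions are a simplification: the real entry-points
are generated by the compiler and use a different calling convention.

```c
#include <assert.h>
#include <string.h>

/* Records the order in which things run. */
static char trace[64] = "";

static void note(const char *event) {
    assert(strlen(trace) + strlen(event) < sizeof trace);
    strcat(trace, event);
}

/* Stand-ins for the entry-points of two statically linked units. */
void unit_a_entry(void) { note("init-a;"); }
void unit_b_entry(void) { note("init-b;"); }

/* Stand-in for the user's own toplevel forms. */
void user_toplevel_body(void) { note("user;"); }

/* The toplevel of the current compilation unit: the calls to the
   static entry-points come first, then the user code. */
void program_toplevel(void) {
    unit_a_entry();
    unit_b_entry();
    user_toplevel_body();
}

const char *trace_so_far(void) { return trace; }
```

Calling program_toplevel() once leaves "init-a;init-b;user;" in the
trace: initialization is tied to the start of the toplevel, not to any
later "import".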


felix
Arthur Maciel
2014-07-06 20:02:41 UTC
Permalink
Now with correct subject. Sorry.

Date: Sat, 05 Jul 2014 23:31:18 +0200 (CEST)
Post by Oleg Kolosov
More documentation for internals, even comments will do.
Well, that can be done. External documentation will quickly get out of
sync, but comments shouldn't be a problem. If there are particular
places where one could start, please tell.
Probably all my doubts are due to my ignorance about C and computer
internals, but I would love to understand the chicken.h file better.
(
http://code.call-cc.org/cgi-bin/gitweb.cgi?p=chicken-core.git;a=blob;f=chicken.h;h=f5a103ee14314f7c679e01dd8e11c0404043791a;hb=HEAD
)

I feel really ashamed to ask this, but as I want to contribute more to
Chicken, it would help me a lot to know about these details:
- Throughout the whole file I can't guess what is the meaning of some
prefix or suffix letters in function names, like:
- C_truep(x): I suppose the "p" stands for "predicate" as in Common
Lisp, right?
- C_c_pointer(x): what does the "_c_" stand for?
- C_c_pointer_nn(x): what about "nn"?
- C_fixnum_plus(n1, n2) vs. C_u_fixnum_plus(n1, n2): does the "_u_" on
the second mean "unsigned"?
- C_ub_i_flonum_plus(x, y): what about "ub"? unsigned byte?
- C_a_i_flonum_plus(ptr, c, n1, n2) and C_a_i(a, n): what about "a" and
"i"?
- C_mpointer: the "m" stands for "memory"?
- C_mk_bool: does "mk" stand for "make"?
- Lines 777 to 838: I suppose the code is used for CPS, but I'm not sure
about it and I can't figure out why it is implemented that way and how it
is used in practice.
- Lines 876 to 909: although this excerpt starts with a comment about Clang
and G++ limitations on statement expressions, all the hackery is only
implemented when DEBUGBUILD is defined. I don't understand why. I also
can't understand the code from 892 to 909, but I suppose I need more C
knowledge for that.

I thought about comments like this in chicken.h:
/* "p" suffix (like in C_truep) stands for "predicate" as in Lisp
tradition. See: http://www.cs.cmu.edu/Groups/AI/html/cltl/clm/node69.html
*/

The CPS and statement expressions could be elsewhere.

What do you think? Is it worthwhile?

Thanks for help and sorry about the ignorance.

Best wishes,
Arthur
Arthur Maciel
2014-07-06 20:39:09 UTC
Permalink
I also started a stub wiki page about chicken.h (
https://wiki.call-cc.org/notes-on-chicken.h), but I agree that commenting
its source code is a much better solution.

Best wishes,
Arthur
Arthur Maciel
2014-07-07 12:50:57 UTC
Permalink
Peter, thank you very much for always kindly answering my silly questions.

This info helps me a lot to understand the chicken core better. As time is
limited, I have to finish an egg first, implement the new wiki style
secondly and in third place start looking into core seriously.

But just to anticipate, is 'make check' enough to report any breakage I
cause while experimenting with the core?

Thanks for making this community warm!

Best wishes,
Arthur
Peter Bex
2014-07-07 12:52:59 UTC
Permalink
Post by Arthur Maciel
Peter, thank you very much for always kindly answering my silly questions.
This info helps me a lot to understand more the chicken core. As time is
limited, I have to finish an egg first, implement the new wiki style
secondly and in third place start looking into core seriously.
Yeah, so much to do, so little time!
Post by Arthur Maciel
But just to anticipate, is 'make check' enough to report any code break I
generate experimenting with the core?
The test suite is far from complete, but yeah if "make check" exits with
an error you *know* you broke something ;)

Cheers,
Peter
--
http://www.more-magic.net