May 3, 2010

Thinking about self-hosting. Two possible paths:
  1) byte-code compiler for Irken, test via that route
  2) direct in Irken.

Why might this be a bad idea at this time? Well, the VM has barely even been born, so e.g. debugging tools are non-existent.

What is the eventual goal? For maximum performance we'll probably want Irken written in Irken, similar to how all of CPython is written in C. But do we get there by starting in the VM version and pushing down into the C version? Regardless, need to grok Irken-in-VM.

Irken currently has a 'basis' (in the ML sense) in lib/core.scm, which relies on %%cexp to implement most of the primitive operations. All of these would need to be reproduced in the VM in some way.

What about datatypes in the VM, though? Will they be implemented using the datatype? Very confusing... easy to see how it works (just write a %make-tuple), but not how it will *type*.

I can probably start exploring these issues by just putting the first foot forward - write the lisp reader, then the transformer... baby steps!

But before we can read lisp expressions, we need to be able to read characters. I/O it is! And of course that just brings us back to datatypes. How to represent a file? Or rather, what's the easiest/cleanest way to interface to the existing file object? At a bare minimum, I can implement getch/putch as primitives/vm-insns, but this wouldn't let us open *files*.

The Pythonic way to do this is to have a file object with a read() method on it. Is it time to think about implementing 'objects' using records? Or am I getting too far ahead again?

Extending the minimalist getch/putch idea, just add a new 'file' type to 'object', and get on with life. [remove it later when we grok datatypes-in-the-vm better]

--------------------------------------------------------------------------------
statically-typed bytecode.

Ok, when I started on the VM the assumption was that we were aiming for a dynamically-typed Python/Scheme-like language.
But now that I'm thinking of making Irken self-hosting, I need to think about a statically-typed VM. I have no experience with such a beast; what does it look like? Is it like Java, where the bytecodes themselves are statically typed? [e.g., "add" only works on integers]

More interestingly... is there some way to build a 'real' untyped register VM in Irken? Almost certainly not, without basically throwing away the type system altogether? [Parrot uses 'typed registers', an interesting idea]

Ok, what job does the typing phase do? Nothing, from the point of view of the end product. It acts like a filter, stopping ill-typed programs from going to the next phase. Well, not completely - information generated by the typing phase informs the backend of what tags to generate or test for. But in general, it gives a pass to code that will later make assumptions about values, such that we can omit run-time type testing...

-----------
What I did: I actually ended up just self-hosting Irken directly.
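
A rough sketch of the 'typing phase as a filter' idea, written in Python purely for illustration. Nothing here is taken from Irken or its VM: the instruction names, the SIGNATURES table, and the typecheck/run split are all hypothetical, and it uses a tiny stack machine rather than a register VM to keep it short. The point is only that once a checker has rejected ill-typed bytecode, the interpreter loop can assume operand types and skip run-time tests.

```python
# Hypothetical sketch: none of these names come from Irken or its VM.
INT, BOOL = "int", "bool"

# Each typed instruction has a fixed signature:
# (operand types popped, result type pushed), e.g. "add" only works on ints.
SIGNATURES = {
    "add": ((INT, INT), INT),
    "lt":  ((INT, INT), BOOL),
    "not": ((BOOL,), BOOL),
}

def typecheck(program):
    """The 'filter': reject ill-typed programs, return the result type."""
    stack = []  # abstract stack of types, not values
    for insn in program:
        op = insn[0]
        if op == "push_int":
            if type(insn[1]) is not int:
                raise TypeError("push_int: literal is not an int")
            stack.append(INT)
        elif op == "push_bool":
            if type(insn[1]) is not bool:
                raise TypeError("push_bool: literal is not a bool")
            stack.append(BOOL)
        else:
            args, result = SIGNATURES[op]
            if len(stack) < len(args):
                raise TypeError(f"{op}: stack underflow")
            actual = tuple(stack[-len(args):])
            if actual != args:
                raise TypeError(f"{op}: expected {args}, got {actual}")
            del stack[-len(args):]
            stack.append(result)
    if len(stack) != 1:
        raise TypeError("program must leave exactly one value")
    return stack[0]

def run(program):
    """Execute a program that already passed typecheck.

    No run-time type tests here: the checker has done that work.
    """
    stack = []
    for insn in program:
        op = insn[0]
        if op in ("push_int", "push_bool"):
            stack.append(insn[1])
        elif op == "add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "lt":
            b = stack.pop()
            a = stack.pop()
            stack.append(a < b)
        elif op == "not":
            stack.append(not stack.pop())
    return stack[0]
```

For example, [("push_int", 1), ("push_int", 2), ("add",)] passes typecheck and can be handed to run, while [("push_int", 1), ("push_bool", True), ("add",)] is stopped by typecheck before it ever reaches the interpreter loop.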