I recently did a similar project, but I tried something different to avoid large switch statements. I created a collection of functions that mutate my version of the "Frame" object, put pointers to those functions in an array, and called them as their opcodes were encountered [0]. It's a little overcomplicated for a simple example, but adding a new opcode will never be more work than this [1].
I hope one day my VM/Assembler will become as fully featured as this.
My last major achievement was implementing multiplication in my assembly language [2] so I've got a lot of catching up to do with this author!
Indeed. At least in my experience, Rust (or LLVM, presumably) will make jump tables from all sorts of nonsense, doing something efficient even when there are gaps in matched values. Manually creating jump tables seems to be a waste of time.
(Actually, it doesn't even need to be a switch statement... Just a succession of `if`s works fine.)
> At least in my experience, Rust (or LLVM, presumably) will make jump tables from all sorts of nonsense, doing something efficient even when there are gaps in matched values. Manually creating jump tables seems to be a waste of time.
I wouldn't say it is always a waste of time, if only because jump tables can be easier to read and manage when you have a decent number of opcodes. Being able to keep every opcode in its own function, with one single list that shows exactly which opcode maps to which instruction, is really nice.
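To make the "one single list" idea concrete, here is a minimal sketch in C. Everything in it (the `Frame` layout, the four opcodes, the `handlers` table, `run`) is made up for illustration and is not taken from any of the projects linked in this thread.

```c
#include <assert.h>

#define STACK_MAX 16

/* Hypothetical VM state, invented for this sketch. */
typedef struct {
    int stack[STACK_MAX];
    int sp;       /* number of values currently on the stack */
    int pc;       /* index of the next byte to read */
    int running;  /* cleared by OP_HALT */
} Frame;

/* Hypothetical opcodes; OP_COUNT is only the table size. */
enum { OP_HALT, OP_PUSH, OP_ADD, OP_MUL, OP_COUNT };

typedef void (*Handler)(Frame *f, const unsigned char *code);

static void push(Frame *f, int v) { assert(f->sp < STACK_MAX); f->stack[f->sp++] = v; }
static int pop(Frame *f) { assert(f->sp > 0); return f->stack[--f->sp]; }

/* One small function per opcode. */
static void op_halt(Frame *f, const unsigned char *code) { (void)code; f->running = 0; }
static void op_push(Frame *f, const unsigned char *code) { push(f, code[f->pc++]); }
static void op_add(Frame *f, const unsigned char *code) {
    (void)code;
    int b = pop(f);
    int a = pop(f);
    push(f, a + b);
}
static void op_mul(Frame *f, const unsigned char *code) {
    (void)code;
    int b = pop(f);
    int a = pop(f);
    push(f, a * b);
}

/* The single list: opcode -> handler. */
static const Handler handlers[OP_COUNT] = {
    [OP_HALT] = op_halt,
    [OP_PUSH] = op_push,
    [OP_ADD]  = op_add,
    [OP_MUL]  = op_mul,
};

/* Runs until OP_HALT and returns the top of the stack (0 if the stack is
   empty), or -1 if an opcode outside the table is encountered. */
int run(const unsigned char *code) {
    Frame f = {0};
    f.running = 1;
    while (f.running) {
        unsigned char op = code[f.pc++];
        if (op >= OP_COUNT) return -1;
        handlers[op](&f, code);
    }
    return f.sp > 0 ? f.stack[f.sp - 1] : 0;
}
```

With this sketch, the byte sequence `OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT` computes (2 + 3) * 4, so `run` returns 20. Adding an opcode means writing one function and adding one line to `handlers`. Whether this beats a plain switch on speed is a separate question that the sketch does not try to answer.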
If everyone uses switches or computed goto [0], then I assume there is a reason for preferring them over an array of function pointers, which is portable and easy to understand (compared to the huge function a switch produces, or the compiler-specific computed goto). Performance, probably. If an array of pointers ended up quicker than either, it would be quite funny, because computed goto is a performance tool (and, well, it allows jumping programmatically to different labels without a switch, I guess), and a large switch is IMO less readable than either of the other two. There is a big comment [1] in ceval.c in the Python 3.6 source explaining the computed goto mechanics.
I was designing my own ISA for a computer I want to build out of relays (eventually). I want everything to be in software because I'm a software guy and relays are expensive. You can find more about what I was trying to do here: http://blog.gravypod.com/?title=lets_write_a_processor
I think your implementation would fit under the traditional definition of a state machine, or maybe an interpreter, but most people would not define it as a virtual machine. It is certainly not an operating system running on another operating system, which is the most popular definition today. Nor is it as capable as the VMs that run environments such as Java or .NET, which is the other definition people might use.
> most people would not define it as a virtual machine. It is certainly not an operating system running on another operating system which is the most popular definition today.
Maybe refrain from 'correcting' someone on common definitions if you aren't familiar with the kinds and amount of commonality of all definitions?
"Virtual Machine" does today commonly mean an interpreter based on opcode instructions, aka process VM.
Furthermore, using 'interpreter' for an opcode-based VM is less common, so I feel like your suggestion goes backwards by your own metric.
> Nor is it as capable as the VM's that run environments such as Java or .NET, which is the other definition people might use.
So I'm curious what you were expecting when you clicked on "Simple Virtual Machine in C" linking to GitHub? Is it likely that someone's open source project is going to compete with the 2 largest and most widely distributed runtimes on the planet?
How do you measure capability? If the VM is Turing complete, is it really less "capable"?
What is the point of your negative comment? Do you want the author to change the title? Do you want other people to never make another VM because Java already has one? And what kind of responses do you hope for next time you share your open source project on HN?
The term virtual machine stands in distinction to real machine, so unless there is a hardware interpreter for Python (perhaps interpreting preprocessed bytecode), I wouldn't bother making the distinction, although, technically, there are unrelated bare-bones interpreters to contrast it with (edit: as is the case with the featured article, as the author points out in this thread).
> The term virtual machine stands in distinction to real machine
Right: virtual machine = software implementation of what could be a real machine.
The simplest interpreters execute directly from the AST, but if there's a pass that reduces this to an abstract machine with an ISA, I think it's fair to classify that as a VM.
7 years ago I came across a programming challenge: The Cult of the Bound Variable. The challenge was described as an ancient specification, written in sandstone, for a universal machine. This really sparked my interest: the task was to write an emulator! The question I asked myself was, how many lines of code does it take to write an emulator?
Not to hijack the comments section, but a little while ago I made a small VM for educational purposes in C. Heavily commented ANSI C with example programs and documentation. I figure it might be of interest to others in this thread.
Neat project! I did something similar a couple of years ago in Python (https://github.com/gedrap/xs-vm). These toy projects are really good for trying new things and, you know, just building something :)
Donald Knuth created a virtual machine written in MMIX that simulates the MMIX instruction set architecture, so that it can then simulate the MMIX machine which can simulate the MMIX instruction set and so on.
The "NodeMCU" IoT board can execute Lua instructions. But I notice the chips have little memory for the stack, so it's easy to write valid Lua that exceeds memory because of loops that eat up the stack.
I wonder if it's possible to write an interpreter with restrictions on loops etc. that can limit/verify programs on the basis of the memory they'd require?
I'm pretty sure that Turing completeness and/or the halting problem imply that you can't statically determine the memory use of arbitrary code.
That doesn't mean you couldn't make a useful sub-Turing language for resource-limited use.
On a final note, NodeMCU doesn't really execute "Lua instructions". It executes normal machine code like everything else, but it ships with Lua interpreter/VM firmware.
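As a rough illustration of the sub-Turing idea, here is a sketch of a checker for a made-up stack bytecode that has no jumps at all. Because the program is straight-line code, its peak stack depth can be computed before anything runs. The opcodes and `max_stack_depth` are assumptions for this example and have nothing to do with Lua or the NodeMCU firmware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical loop-free instruction set: no jumps, so every program
   executes each instruction at most once. */
enum { OP_HALT, OP_PUSH, OP_ADD, OP_POP };

/* Returns the peak stack depth the program can reach, or -1 if it would
   underflow the stack, is missing an operand, contains an unknown opcode,
   or runs off the end without OP_HALT. */
int max_stack_depth(const unsigned char *code, size_t len) {
    assert(code != NULL);
    int depth = 0;
    int peak = 0;
    size_t pc = 0;
    while (pc < len) {
        switch (code[pc]) {
        case OP_HALT:
            return peak;
        case OP_PUSH:                       /* one operand byte follows */
            if (pc + 1 >= len) return -1;
            depth += 1;
            pc += 2;
            break;
        case OP_ADD:                        /* pops two, pushes one */
            if (depth < 2) return -1;
            depth -= 1;
            pc += 1;
            break;
        case OP_POP:
            if (depth < 1) return -1;
            depth -= 1;
            pc += 1;
            break;
        default:
            return -1;
        }
        if (depth > peak) peak = depth;
    }
    return -1;
}
```

A loader could refuse any program whose result is -1 or larger than the stack the chip can spare. Once backward jumps or recursion are allowed, this simple linear scan is no longer enough, which is where the halting-problem caveat above comes in.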
Any VM can support concurrency. Look at all the OSes that provided a threads + processes model on single-CPU machines. Do you mean VMs with multiple virtual CPUs?
Let's say there's a toy language that runs on a stack VM. How is threading/concurrency handled by the VM? So if the language had thread.start(function()), the VM would have to have instructions for handling this, so I would imagine it would run another virtual machine in a real thread when it gets to this instruction?
edit: So basically, if one had the p-code machine[0], how might it handle concurrency/threading?
I guess we need to define more precisely what it means for a VM to "support concurrency". Based on your comment, I assume you mean that it's possible to write programs for the VM that use something like the pthreads API. But you don't need multiple CPUs or special hardware to provide the pthreads API. The "concurrent threads, shared memory" model of execution is still meaningful on a single-CPU machine.
Given a single-CPU VM, you can write an assembly program that implements context switching and the thread control block (TCB) data structure. Then you write a scheduler (OK, the VM needs a timer/interrupt to trigger the scheduler). thread.start(function) allocates some stack memory and a thread control block, which contains a pointer to `function` and to the stack memory you just allocated. Then `thread.start` inserts our TCB into the scheduler queue. All this stuff is implemented as software in the VM's machine language; it is not part of the VM itself. pthread_create and the Linux scheduler are not implemented in hardware either.
I guess if you want memory protection, you would need the VM to simulate an MMU.
On the other hand, if you actually want programs on the VM to be able to run on multiple CPUs of the host machine, then I agree you need some special support in the VM to do that.
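The thread control block and scheduler queue described above could look roughly like this. It is written in C only to show the data structures; in the scheme from the comment, the equivalent logic would be written in the VM's own machine language. All names (`Tcb`, `RunQueue`, `thread_start`, `schedule`) are hypothetical, and saving and restoring registers on a context switch is deliberately left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical thread control block. */
typedef struct Tcb {
    unsigned entry_pc;        /* VM address of the thread's first instruction */
    unsigned char *stack;     /* stack memory owned by this thread */
    size_t stack_size;
    struct Tcb *next;         /* link in the run queue */
} Tcb;

/* FIFO run queue; an empty queue has head == tail == NULL. */
typedef struct {
    Tcb *head;
    Tcb *tail;
} RunQueue;

static void enqueue(RunQueue *q, Tcb *t) {
    assert(q != NULL && t != NULL);
    t->next = NULL;
    if (q->tail) q->tail->next = t;
    else q->head = t;
    q->tail = t;
}

static Tcb *dequeue(RunQueue *q) {
    Tcb *t = q->head;
    if (t) {
        q->head = t->next;
        if (!q->head) q->tail = NULL;
        t->next = NULL;
    }
    return t;
}

/* What thread.start(function) would do: allocate a stack and a TCB,
   record the entry point, and put the TCB on the run queue. */
Tcb *thread_start(RunQueue *q, unsigned entry_pc, size_t stack_size) {
    Tcb *t = malloc(sizeof *t);
    if (!t) return NULL;
    t->stack = malloc(stack_size);
    if (!t->stack) {
        free(t);
        return NULL;
    }
    t->entry_pc = entry_pc;
    t->stack_size = stack_size;
    enqueue(q, t);
    return t;
}

/* Round-robin step, meant to be called from the timer interrupt: the
   current thread goes to the back of the queue and the front one is
   returned as the next thread to run. */
Tcb *schedule(RunQueue *q, Tcb *current) {
    if (current) enqueue(q, current);
    return dequeue(q);
}
```

With two threads started, repeated calls to `schedule` alternate between them, which is all a single-CPU scheduler needs to provide the illusion of concurrency.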
[0] - https://github.com/gravypod/computer/blob/master/isa/main.c#...
[1] - https://github.com/gravypod/computer/blob/master/isa/instruc...
[2] - https://github.com/gravypod/computer/blob/master/examples/mu...