In terms of I/O pins that should actually be fine (current pinout: VGA (14x), SPI (4x), CLK (1x), N64_controller (1x)), though I'm still working on the project and I don't think it'll be done by the deadline. It looks like they might do future batches though -- it's a cool idea!
That's a good idea. The two analog sticks on the GC controller would be an improvement over the single stick on the N64 controller for movement in 3D. I think that the main benefit of the N64 controller (besides a nostalgia factor, though that may just be me ;)) is how easy it is to connect. I actually just got some wire from the local hardware store, plugged pieces of it into the controller connector, and then attached some IC clips. For the GC controller, things are a bit trickier due to its connector layout, though I just found  which might be a nice solution; alternatively, buying a GC controller extension cord and wire stripping could be an option. I'll consider it!
VGA is an analog protocol, but the FPGA can only output a 0 (GND) or 1 (3.3 V) on its I/O pins. I'm using a Digilent VGA Pmod, which uses a set of resistor ladders to map each color component from a 4-bit value to an analog voltage that goes to the monitor. This means that we have 14 pins: R (4x), G (4x), B (4x), HS (1x) and VS (1x).
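To make the pin count and the 4-bit-to-voltage mapping concrete, here is a minimal sketch in Python. It assumes an idealized, perfectly linear DAC with the nominal 0.7 V VGA full-scale level; the real Pmod's resistor values and output levels are not described in this thread, and the function names are made up for illustration.

```python
# Idealized model of a 4-bit-per-channel VGA output stage.
# Assumption: a perfectly linear DAC with 0.7 V full scale (nominal VGA level).
VGA_FULL_SCALE_V = 0.7
BITS_PER_CHANNEL = 4


def channel_voltage(level: int, bits: int = BITS_PER_CHANNEL) -> float:
    """Map a digital level (0 .. 2**bits - 1) to an idealized analog voltage."""
    max_level = (1 << bits) - 1
    if not 0 <= level <= max_level:
        raise ValueError(f"level must be in 0..{max_level}")
    return VGA_FULL_SCALE_V * level / max_level


def pin_count(bits: int = BITS_PER_CHANNEL) -> int:
    """R, G and B each need `bits` pins, plus one pin each for HS and VS."""
    return 3 * bits + 2


def color_count(bits: int = BITS_PER_CHANNEL) -> int:
    """Number of distinct colors with `bits` bits per channel."""
    return 1 << (3 * bits)


if __name__ == "__main__":
    print(pin_count(), "pins,", color_count(), "colors")
    for level in (0, 8, 15):
        print(level, round(channel_voltage(level), 3), "V")
```

With 4 bits per channel this gives the 14 pins listed above and 4096 possible colors.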
Oh duh, I should've remembered that. So 12 bit VGA then. From the demo that seems sufficient. Have you tried getting rid of the LSB to see if there's a noticeable difference? I'm curious how few colors you actually need
Most textures look fine despite the 4-bit quantization, but color artifacts/hue change do become more apparent when multiplying with a light factor (to darken the textures on the bottom and sides of blocks), so I'd say 4-bit is definitely pushing the limit. 6 bits or even 8 bits per color component would be ideal, though unfortunately on a small FPGA like this we cannot afford such luxuries ;). It is probably the first thing I'd change if I were to port this to a larger FPGA, since it would be pretty straightforward to do and the increase in visual quality would likely be worth it.
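The hue change mentioned above can be reproduced with a few lines of Python. This is only a sketch: it assumes the light factor is a fraction with denominator 16 and that the result is truncated back to 4 bits per channel, which may not match what the hardware actually does. The texel value is made up.

```python
import colorsys


def shade(texel, light_num, light_den=16):
    """Darken a texel by light_num/light_den, truncating each channel to an integer."""
    return tuple((c * light_num) // light_den for c in texel)


def hue(rgb, max_value):
    """Hue in the range 0..1 for an RGB triple whose channels span 0..max_value."""
    return colorsys.rgb_to_hsv(*(c / max_value for c in rgb))[0]


if __name__ == "__main__":
    texel_4bit = (12, 7, 3)                         # hypothetical orange-ish texel, 4 bits/channel
    texel_8bit = tuple(c * 17 for c in texel_4bit)  # the same color at 8 bits/channel

    dark_4bit = shade(texel_4bit, 8)                # half brightness
    dark_8bit = shade(texel_8bit, 8)

    print("4-bit:", texel_4bit, "->", dark_4bit)
    print("8-bit:", texel_8bit, "->", dark_8bit)
    print("hue shift at 4 bits:", abs(hue(dark_4bit, 15) - hue(texel_4bit, 15)))
    print("hue shift at 8 bits:", abs(hue(dark_8bit, 255) - hue(texel_8bit, 255)))
```

At 4 bits the truncation throws away a large share of each channel, so the ratio between R, G and B changes noticeably; at 8 bits the same operation shifts the hue far less.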
The screen is connected to the FPGA over VGA. The output resolution is 1024x768 @ 60Hz, but the 3D portion of the screen is rendered at 256x128 @ 30Hz. The design consists of a custom 16-bit CPU (running at 32.6 MHz) and a custom raytracing "GPU" that can handle up to 4 rays in parallel.
Input happens via an N64 controller! Those are actually fairly easy to work with at a low level.
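As a rough illustration of why the controller is easy to work with: when polled, it answers with a single 32-bit status word. The sketch below decodes such a word in Python. The bit layout is the one given in community documentation of the N64 controller protocol (16 button bits followed by signed 8-bit X and Y stick values); it is an assumption here, not something taken from this project, and the function names are made up.

```python
# Button order, most significant bit first, as commonly documented for the
# N64 controller status response (assumption; not taken from this project).
BUTTON_NAMES = [
    "A", "B", "Z", "Start", "D-Up", "D-Down", "D-Left", "D-Right",
    "Reset", None, "L", "R", "C-Up", "C-Down", "C-Left", "C-Right",
]


def to_signed8(value: int) -> int:
    """Interpret an 8-bit value as two's complement."""
    return value - 256 if value & 0x80 else value


def decode_status(word: int):
    """Split a 32-bit status word into (pressed buttons, stick_x, stick_y)."""
    button_bits = (word >> 16) & 0xFFFF
    pressed = {
        name
        for i, name in enumerate(BUTTON_NAMES)
        if name is not None and button_bits & (0x8000 >> i)
    }
    stick_x = to_signed8((word >> 8) & 0xFF)
    stick_y = to_signed8(word & 0xFF)
    return pressed, stick_x, stick_y


if __name__ == "__main__":
    print(decode_status(0x80107F80))
```

In hardware the same decoding is just wiring: each button is one bit of a shift register, so no arithmetic is needed.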
The code is not public, though I'm considering open sourcing the project when it's done.
Moreover, there are a lot of additional details that I could potentially go into, so I'm considering also writing a few blog posts with more info if people are interested!
Wow! This is really cool, thank you for open sourcing this! Do you have any posts talking about how you got started with FPGAs? I'm a SWE and want to get into this area. Your language seems to map directly to what I was sort of expecting to see in the HDL/VHDL world but wasn't finding. You're making me want to buy a VGA monitor and an iCE.
Just do it ;). I don't have a blog post on how I got started, but you gave me the idea that perhaps I should write one.
In terms of hardware, I'm using an iCEBreaker dev board, which has worked really well (a nice bonus is that the board has a 16MB flash chip, which can be used for storage -- this is what the Minecraft clone uses to load the map and textures from on startup).
On the software side, I'm using the open source yosys/nextpnr/icestorm toolchain which is a lot faster than the vendor supplied tools. I mostly figured things out by just trying stuff, so a high iteration speed definitely helped here!
Yup, that's correct, the design is implemented in Wyre, which is a hardware definition language that I created. The language compiles to Verilog so it can work with existing hardware development toolchains. The language is open source and can be found here: https://github.com/nickmqb/wyre
I like the sound of that, but my verilog's rusty, a suggestion for the readme: show what the equivalent verilog would be for the example, or better I suppose the actual verilog that it would transpile to.
I suppose your target audience is mainly more familiar with verilog (though not necessarily I suppose - could have only ever used VHDL) but I'm interested in playing with it, just haven't used verilog, or FPGAs at all, since university.
Yes, I am. This also means that any open source distribution won't include those textures. However, if I do end up open sourcing the work I'll make sure to include instructions for people that already own Minecraft; the textures are just .png files that can be extracted from the game's .jar file and can then be transformed to be used on the FPGA.
I suggest you look at minetest textures. It's an opensource minecraft clone. Most textures have CC or MIT licenses. Read license.txt for each texture mod before you use them. They may help in opensourcing your project.
The FPGA that I'm using for this (the Lattice iCE40 UP5K) is really limited when it comes to RAM, which is the main constraint when it comes to world size. As per the title, there's only 143kb, which is insanely low for doing any kind of 3D stuff :). 48kb is used by the frame buffer, 19kb for textures, which leaves 76kb. 48kb of that is used for the map, which currently limits it to 32x32x32 blocks. However, I do have some plans to improve on that in the future!
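The budget above can be checked with a few lines of Python. This is a sketch that assumes "kb" means 1024 bytes and that the frame buffer covers the 256x128 render area mentioned elsewhere in the thread; the per-block and per-pixel bit counts are derived from those assumptions, not stated by the author.

```python
BYTES_PER_KB = 1024  # assumption: "kb" in the thread means 1024 bytes

TOTAL_KB = 143
FRAMEBUFFER_KB = 48
TEXTURES_KB = 19
MAP_KB = 48

MAP_BLOCKS = 32 * 32 * 32
RENDER_PIXELS = 256 * 128

# RAM left after the frame buffer and textures, and after the map as well.
remaining_kb = TOTAL_KB - FRAMEBUFFER_KB - TEXTURES_KB
free_after_map_kb = remaining_kb - MAP_KB

# Storage per block and per pixel implied by the figures above.
bits_per_block = MAP_KB * BYTES_PER_KB * 8 / MAP_BLOCKS
bits_per_pixel = FRAMEBUFFER_KB * BYTES_PER_KB * 8 / RENDER_PIXELS

if __name__ == "__main__":
    print("left after frame buffer and textures:", remaining_kb, "kb")
    print("left after the map:", free_after_map_kb, "kb")
    print("bits per block:", bits_per_block)
    print("bits per pixel:", bits_per_pixel)
```

Under these assumptions both the map and the frame buffer come out at 12 bits per entry, which lines up with the 12-bit VGA color output.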
The FPS (30Hz) is rock steady though! One of my pet peeves when doing DirectX/OpenGL development is that it's really hard to completely avoid frame drops, e.g. if the OS decides to schedule some other thread, or because the GPU driver decided to do something other than render your app. With hardware development, you can sidestep all of those problems. As a result, the Minecraft clone is guaranteed to not drop frames :).
Have you thought of going Shadertoy style and doing everything procedurally? Or every block procedurally? That way you can cut RAM as much as you wish. For example, if you have a procedural formula to determine if a block is populated, you don't need to store it in RAM, just use this formula in the renderer directly (in Shadertoys this usually would repeat per-pixel).
It did cross my mind. However, a problem with that approach is that evaluating such a formula is too costly/inaccurate on a small FPGA like this, which just has 8 DSPs (that can only do 16x16 bit multiplication), and some of these are already in use in other parts of the design.
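For readers who haven't seen the procedural approach discussed above, here is a minimal software sketch in Python. The hash constants and the `is_solid` rule are arbitrary choices for illustration, not anything from this project. Note that even this toy version relies on 32-bit multiplications, which hints at why it is awkward on a device with only a handful of 16x16 bit multipliers.

```python
def hash2(x: int, z: int) -> int:
    """Small integer hash of a column position (arbitrary constants)."""
    h = (x * 374761393 + z * 668265263) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    return h ^ (h >> 16)


def terrain_height(x: int, z: int, base: int = 8, variation: int = 4) -> int:
    """Pseudo-random column height in base .. base + variation - 1."""
    return base + hash2(x, z) % variation


def is_solid(x: int, y: int, z: int) -> bool:
    """A block is solid if it lies below the terrain height of its column."""
    return y < terrain_height(x, z)


if __name__ == "__main__":
    # Print a small height map; nothing is stored, every value is recomputed.
    for z in range(4):
        print([terrain_height(x, z) for x in range(8)])
```

The renderer would call `is_solid` wherever it currently reads the map from RAM, trading memory for computation on every lookup.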
If you have a guarantee for the worst case of generating a pixel (which you indicate by saying that you never drop frames), couldn't you get rid of the framebuffer? Schedule pixel generation so that they complete just in time for when they're needed for output.
This would free up RAM for other things (and be a fun exercise to get right).
That's a good observation! This technique is also known as "racing the beam". The problem is a mismatch of refresh rates; the VGA display operates at 60 Hz but the ray tracer is not capable of producing that many pixels, it can only do 30 fps. So we need a frame buffer to store the result.
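The mismatch is easy to quantify. The sketch below uses the 256x128 render resolution from elsewhere in the thread and assumes one ray per rendered pixel; it ignores blanking intervals, so it only compares average rates.

```python
RENDER_WIDTH = 256
RENDER_HEIGHT = 128


def pixels_per_second(fps: int) -> int:
    """Average number of rendered pixels needed per second at a given frame rate."""
    return RENDER_WIDTH * RENDER_HEIGHT * fps


available = pixels_per_second(30)  # what the ray tracer can deliver
required = pixels_per_second(60)   # what racing the beam at 60 Hz would need

if __name__ == "__main__":
    print("available:", available, "pixels/s")
    print("required: ", required, "pixels/s")
    print("shortfall factor:", required / available)
```

Since the display needs each pixel twice as often as the ray tracer can produce it, every rendered frame has to be stored and shown for two refreshes.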
On console and embedded the OS is either non-existent or gives your game guarantees about when it will schedule something and how often. Hardware obviously gives you way more control, but a bare-metal Raspberry Pi project, Arduboy, console homebrew, etc. can give you some of that control back in software. Awesome project btw
As an example of this, on the Nintendo Switch, games run on three cores dedicated to the game, while the rest of the OS tasks run on the fourth core. Furthermore their scheduler gives precise guarantees about the scheduling order of threads spawned on the same core. It makes sustained 60fps achievable through careful design.
Thanks! That's correct, I built the ray tracing "GPU" from scratch. It's highly optimized because it needs to trace 256 * 128 * 30 = 0.98 million rays per second on very underpowered hardware. It's specifically tailored to fast traversal of voxel grids. There are too many details to go into here, but as I wrote in another comment, I'm considering writing a few blog posts to explain how everything works in more detail!
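To give a feel for what "traversal of voxel grids" involves, here is a software sketch of the classic grid-stepping (DDA) technique in Python. This is the textbook approach, not a description of how the hardware in this project works; the 32-block grid size comes from the thread, everything else (names, step limit) is made up for illustration.

```python
import math

GRID = 32  # blocks per axis, matching the 32x32x32 map


def trace(origin, direction, is_solid, max_steps=3 * GRID):
    """Step a ray through the grid one voxel at a time.

    Returns the (x, y, z) of the first solid voxel, or None if the ray
    leaves the grid or the step limit is reached.
    """
    voxel = [int(math.floor(c)) for c in origin]
    step = [0, 0, 0]
    t_max = [math.inf] * 3    # ray parameter at the next boundary on each axis
    t_delta = [math.inf] * 3  # ray parameter needed to cross one voxel on each axis
    for i in range(3):
        d = direction[i]
        if d > 0:
            step[i] = 1
            t_delta[i] = 1.0 / d
            t_max[i] = (voxel[i] + 1 - origin[i]) / d
        elif d < 0:
            step[i] = -1
            t_delta[i] = -1.0 / d
            t_max[i] = (voxel[i] - origin[i]) / d

    for _ in range(max_steps):
        if not all(0 <= v < GRID for v in voxel):
            return None
        if is_solid(*voxel):
            return tuple(voxel)
        # Advance along the axis whose next voxel boundary is closest.
        axis = t_max.index(min(t_max))
        voxel[axis] += step[axis]
        t_max[axis] += t_delta[axis]
    return None


if __name__ == "__main__":
    wall = lambda x, y, z: x == 10  # hypothetical map: a solid wall at x = 10
    print(trace((0.5, 0.5, 0.5), (1.0, 0.0, 0.0), wall))
```

The appeal of this scheme for hardware is that the inner loop only needs comparisons and additions; the divisions happen once per ray during setup.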
I probably won't port it to other platforms, however, as I wrote in another comment, I am considering open sourcing it when it's done, so others can port it if they like.
I just looked up the MiSTer board. They are using a DE10-Nano FPGA, which is a lot more powerful than the iCE40 UP5K that I'm using (for comparison, the DE10-Nano has 110,000 lookup elements, whereas the iCE40 UP5K just has 5,280, so ~20x difference!). So porting should be pretty easy. It also opens up opportunities to increase the resolution, frame rate and render distance.
Yes and no. The design includes a custom-built 16-bit CPU, which uses a custom instruction set, which I wrote an assembler for. There is a small 4kb bank of RAM that contains a program written in this instruction set. From a hardware perspective it is just data, but from a software perspective it's that program that is ultimately responsible for running the game (by reading input from the gamepad module, setting up GPU registers, etc.).