I wrote a programming language. Here’s how you can, too.

Over the past 6 months, I’ve been working on a programming language called Pinecone. It’s not mature yet, but enough features already work to make it usable, including variables and user-defined structures. So I must be doing something right in making a completely new language.

Parsing

The parser turns a list of tokens into a tree of nodes.
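To make the tokens-to-tree step concrete, here is a minimal recursive-descent sketch in Python. It is not Pinecone's actual parser: the node classes `Num` and `BinOp`, the function names, and the tiny grammar (integers with `+` and `*`) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Num:
    """Leaf node holding an integer literal (hypothetical node type)."""
    value: int


@dataclass
class BinOp:
    """Interior node for a binary operator (hypothetical node type)."""
    op: str
    left: object
    right: object


def parse(tokens):
    """Turn a flat list like ["1", "+", "2", "*", "3"] into a tree of nodes."""
    pos = 0  # index of the next unread token

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError(f"expected a number at token {pos}")
        node = Num(int(tokens[pos]))
        pos += 1
        return node

    def parse_product():
        # "*" binds tighter than "+", so it is handled one level down.
        nonlocal pos
        node = parse_atom()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = BinOp("*", node, parse_atom())
        return node

    def parse_sum():
        nonlocal pos
        node = parse_product()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = BinOp("+", node, parse_product())
        return node

    tree = parse_sum()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

Calling `parse(["1", "+", "2", "*", "3"])` yields a `+` node whose right child is the `*` node, so operator precedence ends up encoded in the shape of the tree rather than in the token order.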

Flex

Flex is a program that generates lexers.

Decision

In the end, I didn’t see significant benefits of using Flex, at least not enough to justify adding a dependency and complicating the build process.

LLVM

It’s basically a library that will turn your language into a compiled executable binary.

Build Your Own Compiler

Because of the number of architectures and operating systems, it is impractical for any individual to write a cross-platform compiler backend.

Compiled vs Interpreted

There are two major types of languages: compiled and interpreted.

Choosing a Language

If you are writing an interpreted language, it makes a lot of sense to write it in a compiled one (like C, C++ or Swift), because the performance lost in your own interpreter and the performance lost in whatever is interpreting that interpreter will compound.

Lexing

The first step in most programming languages is lexing, or tokenizing.

Getting Started

“I have absolutely no idea where I would even start” is something I hear a lot when I tell other developers I’m writing a language.

Transpiling

Write a Pinecone-to-C++ transpiler, and add the ability to automatically compile the output source with GCC.
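As a rough idea of what the transpiling route involves, the Python sketch below walks a toy expression tree, emits C++ source text, and hands that file to GCC. The tuple-based tree format and the names `emit_expr`, `transpile` and `compile_with_gcc` are hypothetical; nothing here reflects how Pinecone itself would do it, and the last function assumes `g++` is installed and on the PATH.

```python
import subprocess


def emit_expr(tree):
    """Emit a C++ expression for a tree of ints and (op, left, right) tuples."""
    if isinstance(tree, int):
        return str(tree)
    op, left, right = tree
    # Parenthesize everything so the C++ compiler never re-groups the operands.
    return f"({emit_expr(left)} {op} {emit_expr(right)})"


def transpile(tree):
    """Wrap the expression in a complete C++ program that prints its value."""
    return (
        "#include <iostream>\n"
        "int main() {\n"
        f"    std::cout << {emit_expr(tree)} << std::endl;\n"
        "    return 0;\n"
        "}\n"
    )


def compile_with_gcc(cpp_path, exe_path):
    """Compile the generated file. Assumes g++ is available on this machine."""
    subprocess.run(["g++", cpp_path, "-o", exe_path], check=True)
```

For example, `emit_expr(("+", 1, ("*", 2, 3)))` produces the text `(1 + (2 * 3))`. The appeal of this route is that the C++ compiler takes care of optimization and platform support; the cost is that compile errors and debugging happen one step removed from the original source.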

Conclusion

If in doubt, go interpreted.

Bison

The predominant parsing library is Bison.

Why Custom Is Better

Minimize context switching in the workflow: switching between C++ and Pinecone is bad enough without throwing in Bison’s grammar grammar.

Task of the Lexer

The lexer takes in a string containing an entire file’s worth of source code and spits out a list containing every token.
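A hand-written lexer along these lines can be quite small. The sketch below is in Python for brevity and is not Pinecone's lexer; the token categories (`NUMBER`, `IDENT`, `OP`) and the set of operator characters are assumptions, since the text does not spell out Pinecone's real token set.

```python
import re

# Hypothetical token categories, tried in order at each position.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=():]"),
    ("SKIP", r"\s+"),  # whitespace is matched but never emitted
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def lex(source):
    """Split a whole source string into a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

With these assumed rules, `lex("x = 40 + 2")` returns five tokens: an identifier, an operator, a number, an operator and a number. A real lexer would usually also record line and column numbers on each token so later stages can report useful errors.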

High Level Design

A programming language is generally structured as a pipeline.
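The pipeline idea can be shown end to end with a deliberately tiny example. In this Python sketch each stage is one function whose output feeds the next; the stage names and the toy language (integers joined by `+`, separated by spaces) are assumptions chosen to keep it short, not a description of Pinecone's stages.

```python
def lex(source):
    # Stage 1: text -> tokens (whitespace-separated here, for brevity).
    return source.split()


def parse(tokens):
    # Stage 2: tokens -> tree. Only handles chains such as "1 + 2 + 3".
    if len(tokens) % 2 == 0:
        raise SyntaxError("expected operands separated by '+'")
    tree = int(tokens[0])
    for i in range(1, len(tokens), 2):
        if tokens[i] != "+":
            raise SyntaxError(f"unexpected token {tokens[i]!r}")
        tree = ("+", tree, int(tokens[i + 1]))
    return tree


def execute(tree):
    # Stage 3: tree -> result. This stage interprets the tree directly;
    # a compiler would emit code here instead.
    if isinstance(tree, int):
        return tree
    _, left, right = tree
    return execute(left) + execute(right)


def run_pipeline(source):
    # Each stage consumes exactly what the previous stage produced.
    return execute(parse(lex(source)))
```

Because every stage has a single input and a single output, each one can be written, tested and replaced on its own, which is a large part of why the pipeline structure is so common.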

Action Tree

The action tree is most akin to LLVM’s IR (intermediate representation).

Action Tree vs AST

The action tree is the AST with context.
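One way to picture "AST with context" is a pass that walks the AST with a table of known variables and produces nodes that carry resolved types. The Python sketch below takes that reading as an assumption: the `Ast*` classes, the `Action` class, the `"Int"` type name and the `resolve` function are all hypothetical and are not Pinecone's real data structures.

```python
from dataclasses import dataclass


@dataclass
class AstNum:
    value: int


@dataclass
class AstIdent:
    name: str  # the AST only knows the spelling, not what it refers to


@dataclass
class AstAdd:
    left: object
    right: object


@dataclass
class Action:
    """An AST node plus context: every action knows its resulting type."""
    kind: str
    type: str
    children: tuple = ()
    detail: object = None


def resolve(node, context):
    """Convert an AST node into an Action, using `context` (name -> type)."""
    if isinstance(node, AstNum):
        return Action("const", "Int", detail=node.value)
    if isinstance(node, AstIdent):
        if node.name not in context:
            raise NameError(f"undefined variable {node.name!r}")
        return Action("load", context[node.name], detail=node.name)
    if isinstance(node, AstAdd):
        left = resolve(node.left, context)
        right = resolve(node.right, context)
        if left.type != right.type:
            raise TypeError(f"cannot add {left.type} and {right.type}")
        return Action("add", left.type, (left, right))
    raise TypeError(f"unknown AST node {node!r}")
```

Under these assumptions, resolving `AstAdd(AstIdent("x"), AstNum(1))` with the context `{"x": "Int"}` gives an `add` action of type `Int`, while the same AST with an empty context fails with an undefined-variable error. The AST alone could not tell those two cases apart.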

Future

Pinecone will get LLVM compiling support eventually.

Source

Get in