Gophers vs Pythons: Leveraging Data to Select the Right Language for High-Performance APIs
What can you do when Python isn’t fast enough to meet your clients’ needs? When the issue doesn’t come from the algorithms, the hardware, or the network, you’re left with few other ways to make a project run faster. And the compiler (or interpreter) is one of the last and hardest options to change. At FLYR Labs, our clients require millisecond turnaround times, so we had to start searching for Python alternatives.
FLYR goes fast, and so must our APIs. When an airline needs to generate the seat prices for a flight, it calls our Dynamic Pricing Engine API. For context, a single customer sends up to 3,500 requests each second and expects a real-time response just 30 milliseconds later.
With the first and last 8 milliseconds going to network traffic, our AI is left with no more than 14 milliseconds to build an optimal answer.
It’s not enough to switch languages because of personal preferences. We need a data-driven comparison and actionable results in order to make such a shift. So we built a head-to-head competition between our choices of language.
First, why did we test Golang against Python? Why not Java? After all, there are many more Java developers in the world than Go developers. It’s a stretch to say that a language with a Hello World as unflattering and inflexible as this would be anyone’s first choice…
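For reference, a standard Go Hello World (assumed here to be the program the sentence above has in mind) looks like this:

```go
package main

import "fmt"

// Every executable Go program needs package main and a main function.
func main() {
	fmt.Println("Hello, World!")
}
```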
However, Golang has some less-than-obvious benefits that made it a contender and potentially a better option. Concurrency is a first-class citizen in Go. This reduces any concurrency-related code to a fraction of the size of its Python equivalent, thus reducing our coding time, our opportunities to introduce bugs, and the effort it takes to write unit tests. When you’re building a low-latency, high-throughput API, this benefit is key.
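As a rough illustration of how compact concurrent Go code can be, the sketch below prices a list of seats in parallel using goroutines and a `sync.WaitGroup`. The `priceSeat` formula and the function names are hypothetical placeholders, not FLYR's pricing logic.

```go
package main

import (
	"fmt"
	"sync"
)

// priceSeat is a hypothetical stand-in for per-seat pricing work.
func priceSeat(seat int) float64 {
	return 100 + float64(seat)*1.5
}

// priceAll prices every seat concurrently, one goroutine per seat,
// and returns the prices in the same order as the input.
func priceAll(seats []int) []float64 {
	prices := make([]float64, len(seats))
	var wg sync.WaitGroup
	for i, seat := range seats {
		wg.Add(1)
		go func(i, seat int) {
			defer wg.Done()
			prices[i] = priceSeat(seat) // each goroutine writes only its own index, so no lock is needed
		}(i, seat)
	}
	wg.Wait() // block until every goroutine has finished
	return prices
}

func main() {
	fmt.Println(priceAll([]int{1, 2, 3}))
}
```

The `go` keyword and the wait group are the only concurrency machinery involved; there is no thread pool or event loop to configure.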
But Python is still the better choice in terms of:
- Ease of programming
- Supporting the data science community
- Having well-documented third-party library support
- Ending up with easy-to-read code
From the outset, Go was neither a clear contender nor the expected victor. We tried our best to keep the proofs of concept fair by mapping out what the complete system should look like:
- It features a front-end server with a RESTful API;
- The server reads & parses JSON data;
- It queries an in-memory database (Redis);
- It manipulates lists and arrays to build its results; and lastly,
- It packages the results for the original caller.
We wrote each portion of the proof of concept first in Python and then in Golang. Instead of writing the full solutions for some rigorous parts, we time-boxed them. That is, we gave ourselves a set amount of time to create a reasonable application that handles the majority of the functionality. This requires enough understanding of the problem space to extrapolate how long future features will take to process. Our code isn’t perfect, but if it were, it’d be production code and not a proof of concept.
Next, we wrapped the proofs of concept into Docker containers so we could repeatedly measure performance, load, and stress. This gave us a reasonable and realistic view of how fast our solutions would run in the real world.
Python’s Last Stand
And with a single stress test, Python’s days were numbered.
Python lost the battle at every load factor, and the story kept getting worse as the synthesized requests grew in size.
But a single test isn’t exactly enough to justify the migration of the entire project’s codebase. We are, after all, predominantly a Python and TypeScript shop. And with Python being a fan favorite and a champion of data science, we continued the head-to-head testing. Next up: a simple, lightweight tool called Vegeta to load-test the servers.
Project source code: https://github.com/tsenart/vegeta. A short Vegeta command sends 500 requests per second to the server of your choosing for 10 minutes straight, allowing us to generate performance comparisons for any API we’d like.
With these results in hand, it was now time to wrestle with some big issues:
- It costs significant time to gain experience in Go.
- It costs money to develop in a different language.
- What parts of the original codebase can be left in Python?
- There weren’t enough resources – people and knowledge – to build a production environment in Golang.
- Golang is not known for data science or for its ease of use.
- Go’s continuous integration & continuous deployment (CI/CD) requirements are a noticeable shift from Python’s.
- Do we continue the search for another language?
- Will our experience as pioneers help any other internal projects?
With these issues in play, we decided to pause development in Python and take the time to learn Golang so we could port the existing code over. No matter how we chose to write the engine, Python would never be able to meet the speed requirements of our customers. It was a reasonable risk to try a language that’s known for its compiler’s speed and its use in a server environment.
We paid close attention to how Golang trades off flexibility for serious speed gains. And since this project needed to weigh speed as slightly more important than developer efficiency, we decided Golang was better suited to meet our specific needs and deliver the performance our customers require.