Celebrating one year and tentative roadmap for the future of the API2PDF product

June 24th, 2019 / by api2pdf

The Past Year

Hello everyone, Zack here! One year ago we hit the metaphorical GO button. It was anti-climactic, of course, because we had no launch strategy and Google had yet to even index our website. About 10 days later, I remember waking up one morning to find that a developer from the Netherlands who runs an e-commerce vitamin store had signed up and paid $10. I literally screamed.

We still barely show up on page 2 of Google for the coveted keyword search “HTML to PDF API”. Even so, we are averaging somewhere between 10 and 20 signups a day, mostly through word of mouth. We owe our incredible growth to you – our customers. Thank you for continuing to recommend us to your colleagues. We reflect on the past year with immense gratitude. While most SaaS companies take 3 to 5 years to become profitable, we just crossed that threshold this past month.

The last year has been fun, but we have big plans for the future. In keeping with our commitment to transparency, we want to share with you our desired roadmap for version 2 of the product and some of the R&D we have been doing. This goes without saying, but it’s all subject to change. I also do not have a specific timeline for when we plan to launch v2, so I’m not even going to hazard a guess. We really want to get it right, and not rush it.

V2 Feature Roadmap for API2PDF

Support both sync and async (with webhooks) options for endpoints. Currently we have about a 230 second time limit to generate a PDF. This is sufficient for about 95% of use cases, but the other 5% involve PDFs that take more than 230 seconds to generate (crazy!?!). We want to support that too by allowing async calls and then generating your PDFs in long-running tasks.
Use your own Amazon S3 buckets (or other provider). We wanted to implement this sooner, but were concerned about security with storing your Amazon S3 credentials on our servers. A customer emailed in recommending an alternative, no-brainer approach. Once again, you guys drive the product.
Ability to remain on a frozen version of a PDF engine. For example, API2PDF’s Headless Chrome endpoint is currently running on version 68. We generally stay evergreen, but recognize this could cause problems for some customers. If Chrome releases an update that causes a breaking change to your HTML, you won’t be happy with us if we upgrade without letting you know. We want to provide you the ability to choose “Always keep me on the latest version” or specify which version of Chrome or wkhtmltopdf you wish to use.
URL to PDF endpoints to allow custom headers. This is a common one – right now our URL to PDF endpoints leave a bit to be desired. The URL you provide has to be publicly accessible. We want to let you pass custom authorization headers that we will send when calling your URLs, so your server can confirm the request is legitimate. Total oversight on our part for missing this one from the get-go. Seems obvious, honestly.
Chrome Puppeteer. This is a big one. We often hear from developers who need to wait for JavaScript to finish rendering, or for other HTML elements to appear, before generating the PDF. The way it works now is kinda hacky because our AWS Lambda endpoints call Headless Chrome directly. Our plan is to drive Chrome through Puppeteer, which provides some additional functionality that we can leverage and pass on to you via the API.
Chrome Screenshots. Hotly requested feature. We support HTML to PDF, why not HTML to Screenshots? Since we plan on implementing Puppeteer, it is barely any additional work to get this working – might as well kill two birds with one stone. Yes, we are API2PDF and screenshots are off-brand. Whatever.
LibreOffice expansion. Let’s continue with the trend of going off-brand. Our LibreOffice endpoint was ignored for the better part of a year, but has for some reason seen an explosion in usage these past several months. Currently you can send any file that LibreOffice can convert to PDF and it will do a best-effort conversion. But LibreOffice also supports conversions to non-PDF file types. For example, what if you want to go the opposite direction – convert PDF to HTML? Let’s do it.
Wkhtmltopdf Table of Contents. We are not big fans of wkhtmltopdf – we feel Headless Chrome is better in nearly every way. But Headless Chrome does not support Table of Contents generation (we believe it is being developed by the Chromium team). Wkhtmltopdf does support ToC, except… we don’t. We never exposed that functionality in our API. Our bad. We plan to fix that because the demand for it is quite high.
PDF Bookmarks. We are looking into some bookmark functionality for PDFs though we think this may be a rabbit hole we are not prepared for from a technical standpoint.
Split. On the fence about this one. We support merging PDFs, and might also add splitting a PDF into separate PDFs. Kind of a nice-to-have.
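
To make a few of these items concrete, here is a rough sketch of what a v2 URL to PDF request could carry: a pinned engine version, custom headers, and an optional webhook that switches the call to async. Everything here is hypothetical. The class name, the field names, and the payload keys are assumptions made for illustration, not the actual API.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class UrlToPdfRequest:
    """Hypothetical v2 request body; none of these names are confirmed."""
    url: str
    # "latest" means stay evergreen; any other value pins an engine version.
    engine_version: str = "latest"
    # Headers to send when fetching `url`, e.g. an Authorization header.
    extra_headers: Dict[str, str] = field(default_factory=dict)
    # When set, the job runs async and the result is posted to this URL.
    webhook_url: Optional[str] = None

    def mode(self) -> str:
        """Return 'async' if a webhook was supplied, otherwise 'sync'."""
        return "async" if self.webhook_url else "sync"

    def to_payload(self) -> dict:
        """Build the JSON-serializable body, omitting unset options."""
        payload = {
            "url": self.url,
            "engineVersion": self.engine_version,
            "mode": self.mode(),
        }
        if self.extra_headers:
            payload["extraHeaders"] = dict(self.extra_headers)
        if self.webhook_url:
            payload["webhookUrl"] = self.webhook_url
        return payload


if __name__ == "__main__":
    request = UrlToPdfRequest(
        url="https://example.com/invoices/42",
        engine_version="68",
        extra_headers={"Authorization": "Bearer my-token"},
        webhook_url="https://example.com/hooks/pdf-ready",
    )
    print(json.dumps(request.to_payload(), indent=2))
```

Leaving `webhook_url` unset keeps the familiar sync behavior, so existing callers would not need to change anything under this sketch.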

V2 Infrastructure

Along with all of those features, we are doing heavy R&D into a new backend infrastructure that would get us off of AWS Lambda. There are a few reasons we are looking into this.

Having infrastructure that is tied specifically to a platform (AWS Lambda) bothers me overall. It’s not that valid of a concern because AWS is not going away, but it’s how I feel. I’d much rather have an infrastructure-agnostic code base.
We are experimenting with Docker Containers. The appeal of Docker is two-fold. First, you can build once and deploy your containerized applications anywhere (see the previous point), and second – we get inundated with requests from developers who want a self-hosted solution of our API because it’s the bomb. Being stuck on Lambda does not really help them out, but we could sell and distribute our Docker Containers with our API built-in and ready to go. We can offer a mix of solutions – such as just providing the images, or helping you install the images in your cloud, or maybe managing the hosting for you. We’ll explore different business models, but the idea is that these developers want our API and we have no way to service them effectively.
Switching to Docker poses another interesting challenge. The problem that AWS Lambda / Serverless solves is that you do not need VMs running idle just in case a burst of requests comes in. This is why we can offer our API at such low prices. We are simply charging a small markup on what Lambda charges. This is where I’m spending most of my time doing R&D. Using Azure Kubernetes Service in combination with Azure Container Instances, you get the best of both worlds. Azure Container Instances is serverless compute and lets you spin up new containers in 45 seconds to handle burst scale. This is remarkable and ACI is the first of its kind. I built a proof of concept of this AKS-ACI combo with Headless Chrome on Puppeteer. More research is needed here though, and it is far from a trivial engineering challenge. In the end, we may decide to stay on AWS Lambda, but we’ll continue to share our thoughts on the matter.
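
The burst idea can be sketched as a simple capacity split: jobs fill the always-on cluster first, and whatever is left over goes to on-demand containers. This is only an illustration of the reasoning, not the proof of concept itself. The function name, its parameters, and the notion of "slots" are assumptions; the only figure taken from this post is the 45 second spin-up time.

```python
import math

# Spin-up time for a burst container, as quoted in this post.
BURST_SPINUP_SECONDS = 45


def plan_burst(pending_jobs: int, cluster_slots: int, jobs_per_container: int = 1) -> dict:
    """Split pending jobs between the always-on cluster and burst containers.

    Hypothetical model: the cluster can run `cluster_slots` jobs at once,
    and each burst container can take `jobs_per_container` jobs.
    """
    if pending_jobs < 0 or cluster_slots < 0:
        raise ValueError("pending_jobs and cluster_slots must be non-negative")
    if jobs_per_container < 1:
        raise ValueError("jobs_per_container must be at least 1")

    # Fill the capacity we are already paying for first.
    on_cluster = min(pending_jobs, cluster_slots)
    overflow = pending_jobs - on_cluster
    # Round up: a partially filled container is still a whole container.
    burst_containers = math.ceil(overflow / jobs_per_container)

    return {
        "on_cluster": on_cluster,
        "overflow": overflow,
        "burst_containers": burst_containers,
        # Overflow jobs wait at least this long before they can start.
        "burst_delay_seconds": BURST_SPINUP_SECONDS if burst_containers else 0,
    }


if __name__ == "__main__":
    print(plan_burst(pending_jobs=25, cluster_slots=10, jobs_per_container=4))
```

Under this model, idle cost is limited to the cluster's baseline slots, and the trade-off is that overflow jobs pay the spin-up delay.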

We hope you enjoyed this update and stream of consciousness. We thank you again for being the best customers one could ask for. API2PDF is a labor of love, and after a year we are more motivated than ever to provide a kick ass product.

Best,

Zack, Kunal, and Hussein.