C# / .NET Tutorial – Convert PDF to HTML

July 14th, 2021 / by api2pdf


If you are a C# / .NET Core developer, you might have a niche requirement to convert a PDF into HTML, or to extract text content from a PDF for indexing purposes. Here at API2PDF, we have a PDF to HTML endpoint that makes a best effort to extract the text from a PDF and output an HTML document.

Our API takes your .pdf file and converts it to HTML. Just make sure your PDF is saved as a .pdf file and is accessible at a URL that our service can fetch, ideally on a file storage provider like S3 or Azure Blob Storage. For example, see this: http://www.api2pdf.com/wp-content/uploads/2021/01/1a082b03-2bd6-4703-989d-0443a88e3b0f-4.pdf. See the code sample in step 3.
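Before sending a link to the API, it can help to confirm that it really is a web URL pointing at a .pdf file. The sketch below is one way to do that using only the .NET base class library. The helper name LooksLikeHostedPdf is hypothetical and is not part of the Api2Pdf library, the extension check is an assumption based on the advice above, and the sample uses top-level statements, so it assumes C# 9 or later.

```csharp
using System;

// Hypothetical pre-flight check (not part of the Api2Pdf library):
// returns true only for absolute http(s) URLs whose path ends in ".pdf".
static bool LooksLikeHostedPdf(string candidate)
{
    // Reject anything that is not a well-formed absolute URI.
    if (!Uri.TryCreate(candidate, UriKind.Absolute, out var uri))
        return false;

    // A remote service can only fetch http or https links, not local file paths.
    bool isWeb = uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps;

    // AbsolutePath excludes the query string, so signed storage URLs still pass.
    return isWeb && uri.AbsolutePath.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase);
}

Console.WriteLine(LooksLikeHostedPdf("https://example.com/files/report.pdf?sig=abc"));
Console.WriteLine(LooksLikeHostedPdf(@"C:\docs\report.pdf"));
```

The first call prints True and the second prints False. This only checks the shape of the link; it does not confirm that the file exists or that it is publicly readable.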

Convert PDF to HTML with C# / .NET Core

Step 1) Open the Package Manager Console and run the command:

Install-Package Api2Pdf -Version 2.0.0

Step 2) Grab an API key from https://portal.api2pdf.com. It only takes about 60 seconds.

Step 3) Use the sample code below and replace “YOUR-API-KEY” with the API key you acquired in step 2.

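// Create the client with the API key from step 2.
var a2pClient = new Api2Pdf("YOUR-API-KEY");
var request = new LibreFileConversionRequest
{
    // Publicly reachable URL of the PDF to convert.
    Url = "https://link-to-your-pdf"
};
var apiResponse = a2pClient.LibreOffice.PdfToHtml(request);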

And that’s it! Modify the code as you see fit. Hopefully this saves you time and makes converting PDF files to HTML easy and painless for those writing C# / .NET Core code.
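If your goal is indexing rather than display, the converted HTML still needs its markup removed. The sketch below shows one rough way to do that once you have the HTML as a string; how you retrieve it from the API response is outside this example. The helper name HtmlToIndexableText is hypothetical, the regex-based stripping is a simplification rather than a full HTML parser, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the Api2Pdf library): reduces an HTML
// document to whitespace-normalized plain text suitable for a search index.
static string HtmlToIndexableText(string html)
{
    // Remove script and style blocks together with their contents.
    string withoutCode = Regex.Replace(
        html, @"<(script|style)\b[^>]*>.*?</\1>", " ",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Replace every remaining tag with a space so adjacent words stay separated.
    string withoutTags = Regex.Replace(withoutCode, @"<[^>]+>", " ");

    // Turn entities such as &amp; back into plain characters.
    string decoded = WebUtility.HtmlDecode(withoutTags);

    // Collapse runs of whitespace into single spaces.
    return Regex.Replace(decoded, @"\s+", " ").Trim();
}

string sample = "<html><body><h1>Invoice</h1><p>Total: 42 &amp; paid</p></body></html>";
Console.WriteLine(HtmlToIndexableText(sample));
```

For the sample input this prints Invoice Total: 42 & paid. For messy or malformed HTML, a dedicated HTML parsing library is likely to be more reliable than regular expressions.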

See the full GitHub library

We have a complete .NET client library for our API that does a lot more than just this. Check out the full library capabilities here: https://github.com/Api2Pdf/api2pdf.dotnet
