Inkfluence AI Tech Stack (2025): How the Platform Works
A transparent deep dive into the architecture that powers Inkfluence AI: React + Vite frontend, Firebase backend, model-agnostic AI layer, and a custom rendering pipeline.
Most users never see what is happening behind the interface of a SaaS product. They click a button and content appears. A project exports to PDF. An audiobook generates in two minutes.
But every one of those interactions is the result of dozens of technical decisions - infrastructure choices, API integrations, rendering engines, concurrency limits, state management patterns.
- Try the core tool: AI ebook generator
- Export pipeline: PDF ebook maker
- Audio stack: AI audiobook generator
Inkfluence AI was not built by a large team with unlimited resources. It was built by one developer who needed to ship fast, iterate quickly, and avoid infrastructure complexity. The stack had to support instant user value without requiring DevOps expertise or cloud architecture babysitting.
This is a breakdown of how Inkfluence AI works under the hood - the technologies, the reasons behind each choice, and the tradeoffs that shaped the product.
The Foundation: React + Vite
The frontend is built with React 19 and bundled with Vite. This is a common pairing for modern web applications, but it was chosen deliberately, not by default.
React provides component-based UI architecture. Every feature - the dashboard, the chapter editor, the AI chat interface, the export modals - is isolated, testable, and reusable. New features can be added without touching the rest of the app.
Vite was chosen for developer experience. Hot module replacement during development is instant. Build times are measured in seconds, not minutes. The entire frontend pipeline - from code change to live preview - is faster than most alternatives.
The UI also uses Radix UI primitives for accessibility and Tailwind CSS for styling. These are not cosmetic decisions. Radix ensures that interactive elements work with keyboards, screen readers, and assistive technology without manual ARIA implementation. Tailwind keeps the CSS bundle small and design system consistent.
React Router handles navigation. Framer Motion adds animations where they improve the user experience. The entire frontend is a single-page application served from a CDN, which means global latency is low and time-to-interactive is under one second for most users.
The Backend: Firebase (Firestore + Cloud Storage + Auth)
Firebase is the entire backend infrastructure. No custom database servers. No Docker containers. No Kubernetes orchestration. Everything runs on Google's managed services.
Firestore stores all user data: projects, chapters, AI prompts, metadata, usage tracking, subscription status, generation history. Documents are nested hierarchically under users. This structure makes it trivial to query all projects for a specific user or all chapters within a project.
Firestore also handles realtime sync. When a user generates a new chapter on their laptop and opens the project on their phone, the data is already there. No refresh required. The client SDK listens for document changes and updates the UI automatically.
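A minimal sketch of what this looks like with the Firestore client SDK. The collection names and the `index` field are illustrative, not Inkfluence's actual schema:

```javascript
// Hierarchy described above (names are assumptions):
//   users/{userId}/projects/{projectId}/chapters/{chapterId}
import { getFirestore, collection, query, orderBy, onSnapshot } from "firebase/firestore";

const db = getFirestore();

// Listen to all chapters in a project. The callback fires immediately
// with current data and again on every remote change - which is how the
// laptop-to-phone sync above works without a refresh.
function watchChapters(userId, projectId, onChange) {
  const chaptersRef = collection(db, "users", userId, "projects", projectId, "chapters");
  const q = query(chaptersRef, orderBy("index"));
  return onSnapshot(q, (snapshot) => {
    onChange(snapshot.docs.map((d) => ({ id: d.id, ...d.data() })));
  });
}
```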
Cloud Storage handles all file exports: PDFs, audiobooks (MP3/WAV), cover images (PNG/JPG). Files are uploaded directly from the backend API functions after generation. Users download them via signed URLs with expiration timestamps. There is no file hosting infrastructure to manage.
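With the Firebase Admin SDK, an expiring download link is essentially one call. A sketch, where the file path and one-hour lifetime are assumptions:

```javascript
import { getStorage } from "firebase-admin/storage";

// Returns a time-limited download link for a generated export.
async function getDownloadUrl(filePath) {
  const bucket = getStorage().bucket();
  const [url] = await bucket.file(filePath).getSignedUrl({
    action: "read",
    expires: Date.now() + 60 * 60 * 1000, // signed URL expires in one hour
  });
  return url;
}
```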
Firebase Auth manages user authentication: email/password signup, Google OAuth, session persistence, password resets. The entire auth layer is handled by Firebase SDKs. There is no custom user table or JWT implementation.
The choice to use Firebase was strategic. It eliminated the need to design database schemas, manage server uptime, handle backups, or optimize query performance manually. Firestore auto-scales. Firebase Auth is SOC 2 compliant. Cloud Storage is globally distributed. All of this infrastructure existed before the first line of Inkfluence code was written.
The AI Layer: OpenAI + Neural TTS + Replicate
Inkfluence does not train its own AI models. It integrates with existing APIs and orchestrates them intelligently.
Text Generation: All chapter content is generated via OpenAI's GPT-4 API. The system sends structured prompts with user instructions, tone profiles, formatting rules, and chapter blueprints. The API returns markdown-formatted chapters that are parsed and stored in Firestore.
The prompts are modular. There are separate prompt templates for tone (formal, casual, academic), format (listicle, narrative, instructional), and blueprint (hero's journey, problem-solution, case study). Users can mix and match these settings without the AI layer caring about implementation details.
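A simplified sketch of how such modular templates can compose. The template text and setting names here are illustrative, not the production prompts:

```javascript
// Each concern is an independent template, so tones, formats, and
// blueprints mix freely without the AI call changing.
const TONES = {
  formal: "Write in a formal, professional register.",
  casual: "Write in a relaxed, conversational register.",
  academic: "Write with precise, citation-friendly academic language.",
};

const FORMATS = {
  listicle: "Structure the chapter as a numbered list of key points.",
  narrative: "Structure the chapter as a flowing narrative.",
  instructional: "Structure the chapter as step-by-step instructions.",
};

function buildChapterPrompt({ tone, format, blueprint, outline }) {
  return [TONES[tone], FORMATS[format], blueprint, `Chapter outline:\n${outline}`].join("\n\n");
}
```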
Streaming is enabled for chapter previews. Users see words appear in realtime as GPT-4 generates them. This is not just a UX flourish - it builds trust. Users can stop generation early if they see the output diverging from their intent.
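With OpenAI's Node SDK, streaming is a matter of setting `stream: true` and forwarding deltas as they arrive. A minimal sketch (the token handler is a placeholder):

```javascript
import OpenAI from "openai";

const openai = new OpenAI();

// Tokens are forwarded to the UI as they arrive; the loop can simply
// break if the user hits "stop" mid-generation.
async function streamChapter(prompt, onToken) {
  const stream = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [{ role: "user", content: prompt }],
    stream: true,
  });
  for await (const chunk of stream) {
    const token = chunk.choices[0]?.delta?.content;
    if (token) onToken(token);
  }
}
```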
Audiobook Narration: Text-to-speech is powered by OpenAI's neural TTS models. Users select from 12 voice options (Alloy, Echo, Fable, Onyx, Nova, Shimmer, and others). The backend splits chapter text into chunks under 4096 characters, sends them to the TTS API in parallel, and stitches the audio files together using FFmpeg.
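A single chunk's synthesis call follows OpenAI's published TTS API; the helper itself is a sketch, and the chunking and stitching are shown later in the audiobook section:

```javascript
import OpenAI from "openai";

const openai = new OpenAI();

async function synthesizeChunk(text, voice = "alloy") {
  const response = await openai.audio.speech.create({
    model: "tts-1",
    voice,        // "alloy", "echo", "fable", "onyx", "nova", "shimmer", ...
    input: text,  // must stay under the 4096-character input limit
  });
  return Buffer.from(await response.arrayBuffer());
}
```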
Audiobook generation is handled by Inngest, a background job orchestration service. Long-running audio synthesis tasks run asynchronously without blocking the user's browser. Users can close the tab and return later to download the finished audiobook.
Cover Design: Cover images are generated via Replicate's Flux.1 Schnell model. Users enter a description (or let the AI suggest one based on the ebook title), and the backend sends it to the Replicate API. The generated image is downloaded and uploaded to Cloud Storage.
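With Replicate's JS client this is roughly one call. The aspect ratio is an assumed setting, and output handling varies by client version:

```javascript
import Replicate from "replicate";

const replicate = new Replicate(); // reads REPLICATE_API_TOKEN from the environment

// Generates a cover image from a text description; the result is then
// downloaded and re-uploaded to Cloud Storage as described above.
async function generateCover(prompt) {
  const output = await replicate.run("black-forest-labs/flux-schnell", {
    input: { prompt, aspect_ratio: "2:3" }, // portrait ratio for a book cover
  });
  return output[0]; // URL of the generated image
}
```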
The AI layer is deliberately model-agnostic. If OpenAI's pricing increases or a better text model launches, the integration can be swapped by changing the API endpoint and adjusting the prompt format. The rest of the application does not need to know.
Deployment: Vercel Serverless Functions
Inkfluence is deployed on Vercel. The frontend is served as static files from Vercel's global CDN. The backend API routes are serverless functions that run on-demand.
Every API endpoint is a standalone JavaScript function in the /api directory. Vercel automatically deploys these as serverless functions with configurable timeouts and memory limits. AI generation endpoints have a 30-second timeout. Audiobook synthesis (via Inngest) has a 5-minute timeout.
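A skeleton of one such function. The route name and payload are illustrative; the per-endpoint timeout lives in vercel.json, e.g. `{ "functions": { "api/generate-chapter.js": { "maxDuration": 30 } } }`:

```javascript
// api/generate-chapter.js
export default async function handler(req, res) {
  if (req.method !== "POST") {
    return res.status(405).json({ error: "Method not allowed" });
  }
  const { projectId, prompt } = req.body;        // hypothetical payload
  const chapter = await generateChapter(prompt); // calls the OpenAI layer
  res.status(200).json({ projectId, chapter });
}
```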
The deployment pipeline is Git-based. Push to the main branch and Vercel rebuilds the frontend, redeploys the API functions, and invalidates the CDN cache. The entire process takes under two minutes. There is no manual deployment step.
Vercel also handles environment variables, custom domains, SSL certificates, and edge caching. The infrastructure overhead is zero. The focus stays on building features, not configuring servers.
The Rendering Pipeline: jsPDF + Custom Export Logic
Exporting ebooks to PDF is not as simple as converting HTML to a file. Professional ebooks need:
- Automatic chapter numbering
- Consistent heading hierarchy (H1 for titles, H2 for sections, H3 for subsections)
- Typography rules (proper line spacing, paragraph indentation, widow/orphan control)
- Page breaks that do not split headings or code blocks
- Embedded cover images and metadata
Inkfluence uses jsPDF as the core PDF generation library. It converts HTML to PDF on the client side (no server rendering required). Custom logic handles:
- Chapter Structuring: Each chapter is parsed for headings, paragraphs, lists, and code blocks. The renderer applies consistent spacing and margins.
- Font Embedding: Custom fonts can be embedded in the PDF to match brand guidelines.
- Table of Contents: Automatically generated from chapter titles with page number references.
- Metadata: Author name, title, keywords, and publication date are embedded in the PDF file properties.
The export happens entirely in the user's browser. No file is uploaded to a server. The PDF is generated, compressed, and downloaded directly. This keeps costs low and export speed fast.
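A stripped-down sketch of the approach using jsPDF primitives. The real pipeline layers the table of contents, widow/orphan control, and smarter page-break rules on top of logic like this:

```javascript
import { jsPDF } from "jspdf";

function exportPdf(book) {
  const doc = new jsPDF({ unit: "pt", format: "a4" });
  doc.setProperties({ title: book.title, author: book.author }); // embedded PDF metadata

  const pageHeight = doc.internal.pageSize.getHeight();

  book.chapters.forEach((chapter, i) => {
    if (i > 0) doc.addPage();                      // each chapter starts on a fresh page
    let y = 72;
    doc.setFontSize(20);
    doc.text(`${i + 1}. ${chapter.title}`, 72, y); // automatic chapter numbering
    y += 36;
    doc.setFontSize(11);
    for (const line of doc.splitTextToSize(chapter.body, 450)) {
      if (y > pageHeight - 72) {                   // naive page break; the real
        doc.addPage();                             // logic avoids splitting headings
        y = 72;
      }
      doc.text(line, 72, y);
      y += 16;
    }
  });

  doc.save(`${book.title}.pdf`); // generated and downloaded entirely in-browser
}
```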
The Audiobook Engine: Inngest + FFmpeg + OpenAI TTS
Generating a full audiobook from a 10,000-word ebook is not a synchronous operation. It requires:
- Splitting the text into chunks (TTS APIs have character limits)
- Sending each chunk to the TTS API
- Retrying failed chunks with exponential backoff
- Stitching the audio files together in the correct order
- Normalizing volume levels
- Exporting the final file to Cloud Storage
This workflow is orchestrated by Inngest. Inngest is a background job platform that queues tasks, handles retries, and manages concurrency limits. When a user clicks "Generate Audiobook," the backend creates an Inngest job with the chapter text and voice settings. Inngest runs the job asynchronously.
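A sketch of that orchestration with Inngest's SDK. The event name and helper functions are assumptions, and since step outputs must be JSON-serializable, each synthesis step returns a storage path rather than raw audio:

```javascript
import { Inngest } from "inngest";

const inngest = new Inngest({ id: "inkfluence" }); // app id is illustrative

export const generateAudiobook = inngest.createFunction(
  { id: "generate-audiobook" },
  { event: "audiobook/requested" }, // event name is an assumption
  async ({ event, step }) => {
    // Each step.run() is checkpointed and retried independently on failure.
    const chunks = await step.run("split-text", () =>
      splitIntoChunks(cleanText(event.data.text), 4096)
    );

    // Steps launched inside Promise.all run in parallel. Each one writes
    // its audio to a temp file in Cloud Storage and returns the path.
    const parts = await Promise.all(
      chunks.map((chunk, i) =>
        step.run(`synthesize-${i}`, () => synthesizeToTempFile(chunk, event.data.voice))
      )
    );

    const url = await step.run("stitch-and-upload", () =>
      stitchAndUpload(parts, event.data.projectId) // FFmpeg concat, then final upload
    );

    await step.run("save-url", () => saveDownloadUrl(event.data.projectId, url));
    return { url };
  }
);
```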
The audiobook generation function:
- Cleans the text (removes HTML tags, math symbols, formatting artifacts)
- Splits it into chunks under 4096 characters
- Sends each chunk to OpenAI's TTS API in parallel (up to 12 concurrent requests)
- Saves each audio buffer to memory
- Uses FFmpeg (via fluent-ffmpeg) to concatenate the buffers into a single MP3 file
- Uploads the final file to Cloud Storage
- Updates Firestore with the download URL
The entire process takes 1-3 minutes for a typical 5,000-word chapter. Users can track progress in realtime via Firestore listeners.
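The chunking step might look like this sketch, which splits on sentence boundaries so no chunk cuts a sentence in half. This is illustrative, not Inkfluence's exact code:

```javascript
// Splits text into chunks under the TTS character limit, keeping
// sentences intact. A single sentence longer than maxLen would need a
// hard split, omitted here for brevity.
function splitIntoChunks(text, maxLen = 4096) {
  const sentences = text.match(/[^.!?]+[.!?]+(\s+|$)/g) ?? [text];
  const chunks = [];
  let current = "";
  for (const sentence of sentences) {
    if (current.length + sentence.length > maxLen && current) {
      chunks.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}
```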
Concurrency and Rate Limiting
AI APIs have rate limits. OpenAI caps requests and tokens per minute. Replicate has queue limits. Firestore has per-second write limits.
Inkfluence handles this with custom concurrency logic:
- Per-User Limits: Each user can run one AI generation task at a time. This prevents a single user from queuing 50 chapters simultaneously and hitting API limits.
- Global Limits: The entire system runs up to 12 concurrent AI tasks across all users. This keeps API costs predictable and prevents rate limit errors.
- Queue Management: If a user requests generation while another task is running, the new request is queued. The frontend shows a "queued" status with estimated wait time.
This system ensures that Inkfluence can scale to hundreds of concurrent users without hitting API throttling or exhausting service quotas.
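A minimal in-process sketch of that policy. In a real serverless deployment the counters and queue would live in a shared store such as Firestore, or be delegated to Inngest's concurrency controls; everything here is illustrative:

```javascript
const GLOBAL_LIMIT = 12;          // global cap across all users
let globalActive = 0;
const activeUsers = new Set();    // users with a task in flight
const waiting = [];               // FIFO queue of { userId, resolve }

function tryDequeue() {
  if (globalActive >= GLOBAL_LIMIT) return;
  const i = waiting.findIndex((w) => !activeUsers.has(w.userId));
  if (i === -1) return;
  const { userId, resolve } = waiting.splice(i, 1)[0];
  activeUsers.add(userId);
  globalActive += 1;
  resolve(); // wake the queued request
}

async function withGenerationSlot(userId, task) {
  // Wait until this user has no running task AND a global slot is free.
  if (activeUsers.has(userId) || globalActive >= GLOBAL_LIMIT) {
    await new Promise((resolve) => waiting.push({ userId, resolve }));
  } else {
    activeUsers.add(userId);
    globalActive += 1;
  }
  try {
    return await task();
  } finally {
    activeUsers.delete(userId);
    globalActive -= 1;
    tryDequeue(); // hand the slot to the next eligible request
  }
}
```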
Why This Stack Works for a Solo Developer
The technology choices behind Inkfluence were not about using the latest frameworks or chasing trends. They were about shipping a product that works, scales predictably, and does not require constant maintenance.
No servers to manage. Firebase and Vercel handle infrastructure. There is no SSH access, no Docker configuration, no Kubernetes manifests.
No custom AI training. Inkfluence uses hosted APIs. There is no model fine-tuning, no GPU clusters, no ML pipeline to maintain.
Fast iteration. New features can be deployed in minutes. A/B tests can be run without infrastructure changes. The feedback loop is measured in hours, not weeks.
Predictable costs. Firebase charges per read/write. Vercel charges per function invocation. OpenAI charges per token. There are no surprise server bills or unused capacity.
Global performance. Vercel's CDN ensures low latency worldwide. Firebase replicates data across regions. Users in London and Singapore see the same fast load times.
Variations of this serverless-first approach power companies like Linear, Notion, and Supabase. It is proven at scale. It works for solo developers and 50-person teams.
What Comes Next: Future Stack Improvements
The current architecture is stable, but there are planned improvements:
- Vector Memory for Style Retention: Storing user writing samples in a vector database (Pinecone or Qdrant) so the AI can mimic the user's tone across chapters.
- Realtime Collaboration: Adding Firestore-based multiplayer editing so multiple users can co-author ebooks.
- Custom Model Fine-Tuning: Allowing power users to fine-tune GPT-4 on their writing style using OpenAI's fine-tuning API.
- Voice Cloning: Integrating ElevenLabs or PlayHT for users who want audiobooks narrated in their own voice.
- Server-Side Rendering for SEO: Moving to Next.js for blog pages and landing pages to improve search engine indexing.
These additions will happen incrementally. The stack is modular enough to support them without major rewrites.
The Lessons from Building This Stack
If you are building a SaaS product in 2025, you do not need a complex infrastructure. You need:
- A frontend framework that ships fast (React, Vue, Svelte - pick one and move on)
- A managed backend (Firebase, Supabase, AWS Amplify - avoid rolling your own database)
- API integrations for AI (OpenAI, Anthropic, Replicate - do not train models unless you must)
- A deployment platform that handles CDN, serverless functions, and CI/CD (Vercel, Netlify, Cloudflare Pages)
The hardest part of building Inkfluence was not the technology. It was designing the UX, writing the prompts, and understanding what users actually needed. The stack just had to get out of the way.
Explore Inkfluence AI
If you want to see this stack in action, try Inkfluence AI for free. Generate an ebook in your niche, export it to PDF, and turn it into an audiobook.