Manual Installation (Preferred)

This guide provides step-by-step instructions for setting up SurfSense without Docker. This approach gives you more control over the installation process and allows for customization of the environment.

Prerequisites

Before beginning the manual installation, ensure you have completed all the prerequisite setup steps, including:

- PGVector setup
- File Processing ETL Service (choose one):
  - Unstructured.io API key (supports 34+ formats)
  - LlamaCloud (LlamaIndex) API key (enhanced parsing, supports 50+ formats)
  - Docling (local processing, no API key required; supports PDF, Office docs, images, HTML, CSV)
- Other required API keys

Backend Setup

The backend is the core of SurfSense. Follow these steps to set it up:

1. Environment Configuration

First, create and configure your environment variables by copying the example file:

Linux/macOS:

cd surfsense_backend
cp .env.example .env

Windows (Command Prompt):

cd surfsense_backend
copy .env.example .env

Windows (PowerShell):

cd surfsense_backend
Copy-Item -Path .env.example -Destination .env

Edit the .env file and set the following variables:

ENV VARIABLE	DESCRIPTION
DATABASE_URL	PostgreSQL connection string (e.g., `postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense`)
SECRET_KEY	JWT secret key for authentication (should be a long, securely generated random string)
NEXT_FRONTEND_URL	URL where your frontend application is hosted (e.g., `http://localhost:3000`)
AUTH_TYPE	Authentication method: `GOOGLE` for OAuth with Google, `LOCAL` for email/password authentication
GOOGLE_OAUTH_CLIENT_ID	(Optional) Client ID from Google Cloud Console (required if AUTH_TYPE=GOOGLE)
GOOGLE_OAUTH_CLIENT_SECRET	(Optional) Client secret from Google Cloud Console (required if AUTH_TYPE=GOOGLE)
EMBEDDING_MODEL	Name of the embedding model (e.g., `mixedbread-ai/mxbai-embed-large-v1`)
RERANKERS_MODEL_NAME	Name of the reranker model (e.g., `ms-marco-MiniLM-L-12-v2`)
RERANKERS_MODEL_TYPE	Type of reranker model (e.g., `flashrank`)
TTS_SERVICE	Text-to-Speech API provider for Podcasts (e.g., `local/kokoro`, `openai/tts-1`). See supported providers
TTS_SERVICE_API_KEY	API key for the Text-to-Speech service
TTS_SERVICE_API_BASE	(Optional) Custom API base URL for the Text-to-Speech service
STT_SERVICE	Speech-to-Text API provider for Podcasts (e.g., `openai/whisper-1`). See supported providers
STT_SERVICE_API_KEY	API key for the Speech-to-Text service
STT_SERVICE_API_BASE	(Optional) Custom API base URL for the Speech-to-Text service
FIRECRAWL_API_KEY	API key for Firecrawl service for web crawling
ETL_SERVICE	Document parsing service: `UNSTRUCTURED` (supports 34+ formats), `LLAMACLOUD` (supports 50+ formats including legacy document types), or `DOCLING` (local processing, supports PDF, Office docs, images, HTML, CSV)
UNSTRUCTURED_API_KEY	API key for Unstructured.io service for document parsing (required if ETL_SERVICE=UNSTRUCTURED)
LLAMA_CLOUD_API_KEY	API key for LlamaCloud service for document parsing (required if ETL_SERVICE=LLAMACLOUD)
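
The conditional requirements in the table above (Google credentials only when AUTH_TYPE=GOOGLE, a parsing API key only for the hosted ETL services) are easy to miss. The following is a minimal sketch, not part of SurfSense, that generates a random value for SECRET_KEY and lists the variables that appear to be missing. The helper names and the choice of which variables count as always required are assumptions made for illustration.

```python
import secrets

# Assumption: this "always required" list is an illustrative subset,
# not an authoritative list from SurfSense.
ALWAYS_REQUIRED = [
    "DATABASE_URL",
    "SECRET_KEY",
    "NEXT_FRONTEND_URL",
    "AUTH_TYPE",
    "ETL_SERVICE",
]

# (selector variable, selected value) -> variables the table marks as
# required for that choice.
CONDITIONAL_REQUIREMENTS = {
    ("AUTH_TYPE", "GOOGLE"): ["GOOGLE_OAUTH_CLIENT_ID", "GOOGLE_OAUTH_CLIENT_SECRET"],
    ("ETL_SERVICE", "UNSTRUCTURED"): ["UNSTRUCTURED_API_KEY"],
    ("ETL_SERVICE", "LLAMACLOUD"): ["LLAMA_CLOUD_API_KEY"],
}


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random string that could be used as SECRET_KEY."""
    return secrets.token_urlsafe(num_bytes)


def missing_variables(env: dict) -> list:
    """List variables that are unset or empty given the chosen options."""
    missing = [name for name in ALWAYS_REQUIRED if not env.get(name)]
    for (selector, value), required in CONDITIONAL_REQUIREMENTS.items():
        if env.get(selector) == value:
            missing += [name for name in required if not env.get(name)]
    return missing


if __name__ == "__main__":
    import os

    print("Suggested SECRET_KEY:", generate_secret_key())
    print("Missing variables:", missing_variables(dict(os.environ)))
```

With ETL_SERVICE=DOCLING no parsing key is reported as missing, which matches the table: Docling runs locally and needs no API key.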

Optional Backend LangSmith Observability:

ENV VARIABLE	DESCRIPTION
LANGSMITH_TRACING	Enable LangSmith tracing (e.g., `true`)
LANGSMITH_ENDPOINT	LangSmith API endpoint (e.g., `https://api.smith.langchain.com`)
LANGSMITH_API_KEY	Your LangSmith API key
LANGSMITH_PROJECT	LangSmith project name (e.g., `surfsense`)

Uvicorn Server Configuration

ENV VARIABLE	DESCRIPTION	DEFAULT VALUE
UVICORN_HOST	Host address to bind the server	0.0.0.0
UVICORN_PORT	Port to run the backend API	8000
UVICORN_LOG_LEVEL	Logging level (e.g., info, debug, warning)	info
UVICORN_PROXY_HEADERS	Enable/disable proxy headers	false
UVICORN_FORWARDED_ALLOW_IPS	Comma-separated list of allowed IPs	127.0.0.1
UVICORN_WORKERS	Number of worker processes	1
UVICORN_ACCESS_LOG	Enable/disable access log (true/false)	true
UVICORN_LOOP	Event loop implementation	auto
UVICORN_HTTP	HTTP protocol implementation	auto
UVICORN_WS	WebSocket protocol implementation	auto
UVICORN_LIFESPAN	Lifespan implementation	auto
UVICORN_LOG_CONFIG	Path to logging config file or empty string
UVICORN_SERVER_HEADER	Enable/disable Server header	true
UVICORN_DATE_HEADER	Enable/disable Date header	true
UVICORN_LIMIT_CONCURRENCY	Max concurrent connections
UVICORN_LIMIT_MAX_REQUESTS	Max requests before worker restart
UVICORN_TIMEOUT_KEEP_ALIVE	Keep-alive timeout (seconds)	5
UVICORN_TIMEOUT_NOTIFY	Worker shutdown notification timeout (seconds)	30
UVICORN_SSL_KEYFILE	Path to SSL key file
UVICORN_SSL_CERTFILE	Path to SSL certificate file
UVICORN_SSL_KEYFILE_PASSWORD	Password for SSL key file
UVICORN_SSL_VERSION	SSL version
UVICORN_SSL_CERT_REQS	SSL certificate requirements
UVICORN_SSL_CA_CERTS	Path to CA certificates file
UVICORN_SSL_CIPHERS	SSL ciphers
UVICORN_HEADERS	Comma-separated list of headers
UVICORN_USE_COLORS	Enable/disable colored logs	true
UVICORN_UDS	Unix domain socket path
UVICORN_FD	File descriptor to bind to
UVICORN_ROOT_PATH	Root path for the application

Refer to the .env.example file for all available Uvicorn options and their usage. Uncomment the ones you need and set their values in your .env file.

For more details, see the Uvicorn documentation.
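
To illustrate how string values from .env could become typed server settings, here is a rough sketch that maps a few of the UVICORN_* variables above to keyword arguments of `uvicorn.run()`, falling back to the defaults listed in the table. The helper `uvicorn_kwargs` and the mapping are hypothetical; SurfSense's own main.py may read these variables differently.

```python
# Hypothetical helper: covers only a subset of the UVICORN_* table.

def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


# env variable -> (uvicorn.run keyword, converter, default from the table)
SETTINGS = {
    "UVICORN_HOST": ("host", str, "0.0.0.0"),
    "UVICORN_PORT": ("port", int, 8000),
    "UVICORN_LOG_LEVEL": ("log_level", str, "info"),
    "UVICORN_PROXY_HEADERS": ("proxy_headers", _as_bool, False),
    "UVICORN_WORKERS": ("workers", int, 1),
    "UVICORN_ACCESS_LOG": ("access_log", _as_bool, True),
    "UVICORN_TIMEOUT_KEEP_ALIVE": ("timeout_keep_alive", int, 5),
}


def uvicorn_kwargs(env) -> dict:
    """Build keyword arguments from a mapping of environment variables."""
    kwargs = {}
    for variable, (keyword, convert, default) in SETTINGS.items():
        raw = env.get(variable, "").strip()
        # Unset or empty variables fall back to the documented default.
        kwargs[keyword] = convert(raw) if raw else default
    return kwargs


if __name__ == "__main__":
    import os

    print(uvicorn_kwargs(os.environ))
```

Treating an empty value the same as an unset one means a line left blank in .env behaves as if it were still commented out.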

2. Install Dependencies

Install the backend dependencies using uv:

Linux/macOS:

# Install uv if you don't have it
curl -fsSL https://astral.sh/uv/install.sh | bash

# Install dependencies
uv sync

Windows (PowerShell):

# Install uv if you don't have it
iwr -useb https://astral.sh/uv/install.ps1 | iex

# Install dependencies
uv sync

Windows (Command Prompt):

REM Install dependencies with uv (after installing uv)
uv sync

3. Run the Backend

Start the backend server:

Linux/macOS/Windows:

# Run without hot reloading
uv run main.py

# Or with hot reloading for development
uv run main.py --reload

If everything is set up correctly, you should see output indicating the server is running on http://localhost:8000.
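
If the startup output is ambiguous, a quick way to confirm that something is listening on the backend port is a plain TCP connection attempt. This is a minimal sketch that assumes the default port 8000 from the Uvicorn table; the function name `is_listening` is illustrative. It only shows that the port accepts connections, not that the API itself is healthy.

```python
import socket


def is_listening(host: str = "127.0.0.1", port: int = 8000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("Backend port reachable:", is_listening())
```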

Frontend Setup

1. Environment Configuration

Set up the frontend environment:

Linux/macOS:

cd surfsense_web
cp .env.example .env

Windows (Command Prompt):

cd surfsense_web
copy .env.example .env

Windows (PowerShell):

cd surfsense_web
Copy-Item -Path .env.example -Destination .env

Edit the .env file and set:

ENV VARIABLE	DESCRIPTION
NEXT_PUBLIC_FASTAPI_BACKEND_URL	Backend URL (e.g., `http://localhost:8000`)
NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE	Same value as the backend AUTH_TYPE: `GOOGLE` for OAuth with Google, or `LOCAL` for email/password authentication
NEXT_PUBLIC_ETL_SERVICE	Document parsing service (should match backend ETL_SERVICE): `UNSTRUCTURED`, `LLAMACLOUD`, or `DOCLING` - affects supported file formats in upload interface
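
Because two of the frontend variables must mirror backend values, a small cross-check can catch mismatches early. The sketch below uses a deliberately minimal KEY=VALUE parser (no multi-line values, no `export` prefix) and compares the paired variables named in the tables. The function names are hypothetical, and the file paths assume the script is run from the repository root.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# (backend variable, frontend variable) pairs that should hold the same value.
PAIRS = [
    ("AUTH_TYPE", "NEXT_PUBLIC_FASTAPI_BACKEND_AUTH_TYPE"),
    ("ETL_SERVICE", "NEXT_PUBLIC_ETL_SERVICE"),
]


def mismatches(backend: dict, frontend: dict) -> list:
    """Return the variable pairs whose values differ between the two files."""
    return [(b, f) for b, f in PAIRS if backend.get(b) != frontend.get(f)]


if __name__ == "__main__":
    from pathlib import Path

    backend_file = Path("surfsense_backend/.env")
    frontend_file = Path("surfsense_web/.env")
    if backend_file.exists() and frontend_file.exists():
        found = mismatches(
            parse_env(backend_file.read_text()),
            parse_env(frontend_file.read_text()),
        )
        print("Mismatched pairs:", found)
    else:
        print("Could not find both .env files; run from the repository root.")
```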

2. Install Dependencies

Install the frontend dependencies:

Linux/macOS:

# Install pnpm if you don't have it
npm install -g pnpm

# Install dependencies
pnpm install

Windows:

# Install pnpm if you don't have it
npm install -g pnpm

# Install dependencies
pnpm install

3. Run the Frontend

Start the Next.js development server:

Linux/macOS/Windows:

pnpm run dev

The frontend should now be running at http://localhost:3000.

Browser Extension Setup (Optional)

The SurfSense browser extension allows you to save any webpage, including those protected behind authentication.

1. Environment Configuration

Linux/macOS:

cd surfsense_browser_extension
cp .env.example .env

Windows (Command Prompt):

cd surfsense_browser_extension
copy .env.example .env

Windows (PowerShell):

cd surfsense_browser_extension
Copy-Item -Path .env.example -Destination .env

Edit the .env file:

ENV VARIABLE	DESCRIPTION
PLASMO_PUBLIC_BACKEND_URL	SurfSense Backend URL (e.g., `http://127.0.0.1:8000`)

2. Build the Extension

Build the extension for your browser using the Plasmo framework.

Linux/macOS/Windows:

# Install dependencies
pnpm install

# Build for Chrome (default)
pnpm build

# Or for other browsers
pnpm build --target=firefox
pnpm build --target=edge

3. Load the Extension

Load the extension in your browser's developer mode and configure it with your SurfSense API key.

Verification

To verify your installation:

Open your browser and navigate to http://localhost:3000
Sign in with your Google account (AUTH_TYPE=GOOGLE) or with your email and password (AUTH_TYPE=LOCAL)
Create a search space and try uploading a document
Test the chat functionality with your uploaded content

Troubleshooting

Database Connection Issues: Verify your PostgreSQL server is running and pgvector is properly installed
Authentication Problems: Check your Google OAuth configuration and ensure redirect URIs are set correctly
LLM Errors: Confirm your LLM API keys are valid and the selected models are accessible
File Upload Failures: Validate the API key for your chosen ETL service (Unstructured.io or LlamaCloud); Docling runs locally and requires no key
Windows-specific: If you encounter path issues, ensure you're using the correct path separator (\ instead of /)
macOS-specific: If you encounter permission issues, you may need to use sudo for some installation commands
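
For database connection issues, it can help to confirm that DATABASE_URL parses into the parts you expect before checking the server itself. The following sketch uses only the standard library to split the connection string. The function name is illustrative, and the password is intentionally left out of the result so the output is safe to paste into a bug report. It does not verify that PostgreSQL is running or that pgvector is installed.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a SQLAlchemy-style URL into its non-secret components."""
    parts = urlsplit(url)
    # A scheme such as "postgresql+asyncpg" is "dialect+driver".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/") or None,
    }


if __name__ == "__main__":
    example = "postgresql+asyncpg://postgres:postgres@localhost:5432/surfsense"
    for key, value in describe_database_url(example).items():
        print(f"{key}: {value}")
```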

Next Steps

Now that you have SurfSense running locally, you can explore its features:

Create search spaces for organizing your content
Upload documents or use the browser extension to save webpages
Ask questions about your saved content
Explore the advanced RAG capabilities

For production deployments, consider setting up:

A reverse proxy like Nginx
SSL certificates for secure connections
Proper database backups
User access controls