Lemonade helps users discover and run local AI apps by serving optimized LLM, image, and speech models right from their own GPUs and NPUs.
Apps like [n8n](https://n8n.io/integrations/lemonade-model/), [VS Code Copilot](https://marketplace.visualstudio.com/items?itemName=lemonade-sdk.lemonade-sdk), [Morphik](https://www.morphik.ai/docs/local-inference#lemonade), and many more use Lemonade to seamlessly run generative AI on any PC.
## Getting Started
1. **Install**: [Windows](https://lemonade-server.ai/install_options.html#windows) · [Linux](https://lemonade-server.ai/install_options.html#linux) · [macOS (beta)](https://lemonade-server.ai/install_options.html#macos) · [Docker](https://lemonade-server.ai/install_options.html#docker) · [Source](./docs/dev-getting-started.md)
2. **Get Models**: Browse and download with the [Model Manager](#model-library)
3. **Generate**: Try models with the built-in interfaces for chat, image gen, speech gen, and more
4. **Mobile**: Take your lemonade to go: [iOS](https://apps.apple.com/us/app/lemonade-mobile/id6757372210) · [Android](https://play.google.com/store/apps/details?id=com.lemonade.mobile.chat.ai&pli=1) · [Source](https://github.com/lemonade-sdk/lemonade-mobile)
5. **Connect**: Use Lemonade with your favorite apps.
## Using the CLI
To run and chat with Gemma 3:
```
lemonade-server run Gemma-3-4b-it-GGUF
```
More modalities:
```
# image gen
lemonade-server run SDXL-Turbo
# speech gen
lemonade-server run kokoro-v1
# transcription
lemonade-server run Whisper-Large-v3-Turbo
```
To list available models and download them:
```
lemonade-server list
lemonade-server pull Gemma-3-4b-it-GGUF
```
To see the backends available on your PC:
```
lemonade-server recipes
```
## Model Library
Lemonade supports a wide variety of models across CPU, GPU, and NPU: LLMs (**GGUF**, **FLM**, and **ONNX**), Whisper speech-to-text models, Stable Diffusion image models, and more.
Use `lemonade-server pull` or the built-in **Model Manager** to download models. You can also import custom GGUF/ONNX models from Hugging Face.
**[Browse all built-in models →](https://lemonade-server.ai/models.html)**
## Supported Configurations
Lemonade supports multiple recipes (text generation, speech-to-text, text-to-speech, and image generation), and each recipe has its own backend and hardware requirements.
| Modality         | Recipe      | Backend | Device            | OS             |
|------------------|-------------|---------|-------------------|----------------|
| Text generation  | llamacpp    | vulkan  | GPU               | Windows, Linux |
|                  | llamacpp    | rocm    | Select AMD GPUs*  | Windows, Linux |
|                  | llamacpp    | cpu     | x86_64            | Windows, Linux |
|                  | llamacpp    | metal   | Apple Silicon GPU | macOS (beta)   |
|                  | flm         | npu     | XDNA2 NPU         | Windows        |
|                  | ryzenai-llm | npu     | XDNA2 NPU         | Windows        |
| Speech-to-text   | whispercpp  | npu     | XDNA2 NPU         | Windows        |
|                  | whispercpp  | cpu     | x86_64            | Windows        |
| Text-to-speech   | kokoro      | cpu     | x86_64            | Windows, Linux |
| Image generation | sd-cpp      | rocm    | Select AMD GPUs*  | Windows, Linux |
|                  | sd-cpp      | cpu     | x86_64            | Windows, Linux |
To check exactly which recipes/backends are supported on your own machine, run:
```
lemonade-server recipes
```
\* Supported AMD ROCm platforms:

| Architecture       | Platform Support | GPU Models                                            |
|--------------------|------------------|-------------------------------------------------------|
| gfx1151 (STX Halo) | Windows, Ubuntu  | Ryzen AI MAX+ Pro 395                                 |
| gfx120X (RDNA4)    | Windows, Ubuntu  | Radeon AI PRO R9700, RX 9070 XT/GRE/9070, RX 9060 XT  |
## Project Roadmap
| Under Development | Under Consideration | Recently Completed |
|---------------------------|-----------------------------|------------------------|
| MLX support | vLLM support | macOS (beta) |
| More whisper.cpp backends | Enhanced custom model usage | Image generation |
| More SD.cpp backends | | Speech-to-text |
| | | Text-to-speech |
| | | Apps marketplace |
## Integrate Lemonade Server with Your Application
You can use any OpenAI-compatible client library by configuring it to use `http://localhost:8000/api/v1` as the base URL. The table below lists official and popular OpenAI clients in different languages; pick whichever suits your project.
| Python | C++ | Java | C# | Node.js | Go | Ruby | Rust | PHP |
|--------|-----|------|----|---------|----|-------|------|-----|
| [openai-python](https://github.com/openai/openai-python) | [openai-cpp](https://github.com/olrea/openai-cpp) | [openai-java](https://github.com/openai/openai-java) | [openai-dotnet](https://github.com/openai/openai-dotnet) | [openai-node](https://github.com/openai/openai-node) | [go-openai](https://github.com/sashabaranov/go-openai) | [ruby-openai](https://github.com/alexrudall/ruby-openai) | [async-openai](https://github.com/64bit/async-openai) | [openai-php](https://github.com/openai-php/client) |
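If your language is not covered, no SDK is required: the server speaks plain JSON over HTTP. Here is a minimal sketch using only Python's standard library (the model name is illustrative, and the server must be running before you actually send the request):

```python
import json
import urllib.request

def chat_request(model, messages, base_url="http://localhost:8000/api/v1"):
    """Build an OpenAI-style chat completion request for Lemonade Server."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Build (but do not yet send) a request; send it with urllib.request.urlopen(req)
# once the server is running. The response JSON follows OpenAI's chat schema.
req = chat_request(
    "Gemma-3-4b-it-GGUF",
    [{"role": "user", "content": "What is the capital of France?"}],
)
```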
### Python Client Example
```python
from openai import OpenAI

# Initialize the client to use Lemonade Server
client = OpenAI(
    base_url="http://localhost:8000/api/v1",
    api_key="lemonade"  # required but unused
)

# Create a chat completion
completion = client.chat.completions.create(
    model="Llama-3.2-1B-Instruct-Hybrid",  # or any other available model
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

# Print the response
print(completion.choices[0].message.content)
```
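OpenAI-compatible servers also support streaming: pass `stream=True` and the completion arrives as server-sent events, one JSON chunk per `data:` line, terminated by `data: [DONE]`. The sketch below shows how such a stream is reassembled; the sample chunks are illustrative, not captured server output:

```python
import json

def collect_stream(lines):
    """Reassemble text from OpenAI-style streaming chunks (SSE 'data:' lines)."""
    text = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives and blank separator lines
        payload = line[len("data: "):]
        if payload.strip() == "[DONE]":  # end-of-stream sentinel
            break
        delta = json.loads(payload)["choices"][0]["delta"]
        text.append(delta.get("content", ""))
    return "".join(text)

# Illustrative chunks shaped like an OpenAI streaming response
sample = [
    'data: {"choices":[{"delta":{"content":"Pa"}}]}',
    'data: {"choices":[{"delta":{"content":"ris"}}]}',
    "data: [DONE]",
]
```

When using the `openai` client library instead of raw HTTP, this decoding is handled for you and you simply iterate over the returned chunks.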
For more detailed integration instructions, see the [Integration Guide](./docs/server/server_integration.md).
## FAQ
To read our frequently asked questions, see our [FAQ Guide](./docs/faq.md).
## Contributing
We are actively seeking collaborators from across the industry. If you would like to contribute to this project, please check out our [contribution guide](./docs/contribute.md).
New contributors can find beginner-friendly issues tagged with "Good First Issue" to get started.
## Maintainers
This is a community project maintained by @amd-pworfolk @bitgamma @danielholanda @jeremyfowers @Geramy @ramkrishna2910 @siavashhub @sofiageo @superm1 @vgodsoe, and sponsored by AMD. You can reach us by filing an [issue](https://github.com/lemonade-sdk/lemonade/issues), emailing [lemonade@amd.com](mailto:lemonade@amd.com), or joining our [Discord](https://discord.gg/5xXzkMu8Zk).
## Code Signing Policy
Free code signing provided by [SignPath.io](https://signpath.io), certificate by [SignPath Foundation](https://signpath.org).
- **Committers and reviewers**: [Maintainers](#maintainers) of this repo
- **Approvers**: [Owners](https://github.com/orgs/lemonade-sdk/people?query=role%3Aowner)
**Privacy policy**: This program will not transfer any information to other networked systems unless specifically requested by the user or the person installing or operating it. When the user requests it, Lemonade downloads AI models from [Hugging Face Hub](https://huggingface.co/) (see their [privacy policy](https://huggingface.co/privacy)).
## License and Attribution
This project is:
- Built with C++ (server) and React (app) with ❤️ for the open source community,
- Standing on the shoulders of great tools from:
- [ggml/llama.cpp](https://github.com/ggml-org/llama.cpp)
- [ggml/whisper.cpp](https://github.com/ggerganov/whisper.cpp)
- [ggml/stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp)
- [kokoros](https://github.com/lucasjinreal/Kokoros)
- [OnnxRuntime GenAI](https://github.com/microsoft/onnxruntime-genai)
- [Hugging Face Hub](https://github.com/huggingface/huggingface_hub)
- [OpenAI API](https://github.com/openai/openai-python)
- [IRON/MLIR-AIE](https://github.com/Xilinx/mlir-aie)
- and more...
- Accelerated by mentorship from the OCV Catalyst program.
- Licensed under the [Apache 2.0 License](https://github.com/lemonade-sdk/lemonade/blob/main/LICENSE).
- Portions of the project are licensed as described in [NOTICE.md](./NOTICE.md).
lemonade-sdk-lemonade-d88f5d9/data/lemonade-app.desktop
[Desktop Entry]
Version=1.0
Type=Application
Name=Lemonade App
Comment=Local LLMs with GPU and NPU acceleration - Desktop Application
GenericName=AI Model Manager
Exec=lemonade-app
Icon=lemonade-app
Terminal=false
Categories=Development;Utility;
Keywords=AI;LLM;GPU;NPU;Machine Learning;
lemonade-sdk-lemonade-d88f5d9/data/lemonade-server.service.in
[Unit]
Description=Lemonade Server
After=network-online.target
[Service]
Type=simple
User=lemonade
Group=lemonade
WorkingDirectory=@CMAKE_INSTALL_FULL_LOCALSTATEDIR@/lib/lemonade
EnvironmentFile=/etc/lemonade/lemonade.conf
EnvironmentFile=/etc/lemonade/secrets.conf
ExecStart=@CMAKE_INSTALL_FULL_BINDIR@/lemonade-server serve
Restart=on-failure
RestartSec=5s
KillSignal=SIGINT
LimitMEMLOCK=infinity
# Security hardening
PrivateTmp=yes
NoNewPrivileges=yes
ProtectSystem=full
ProtectHome=yes
ReadWritePaths=@CMAKE_INSTALL_FULL_LOCALSTATEDIR@/lib/lemonade
RestrictRealtime=yes
RestrictNamespaces=yes
LockPersonality=yes
[Install]
WantedBy=multi-user.target
lemonade-sdk-lemonade-d88f5d9/data/lemonade-web-app
#!/bin/bash
set -e

LEMONADE_HOST="localhost"
LEMONADE_PORT="8000"

if [ -f /etc/lemonade/lemonade.conf ]; then
    source /etc/lemonade/lemonade.conf
fi
if [ -f ~/.config/lemonade/lemonade.conf ]; then
    source ~/.config/lemonade/lemonade.conf
fi

URL="http://${LEMONADE_HOST}:${LEMONADE_PORT}/"

# Prefer Chromium-based browsers; otherwise fall back to xdg-open
if [ "$(echo "${LEMONADE_PREFER_CHROMIUM:-true}" | tr '[:upper:]' '[:lower:]')" = "true" ]; then
    if command -v google-chrome &> /dev/null; then
        google-chrome --app="$URL"
    elif command -v microsoft-edge-stable &> /dev/null; then
        microsoft-edge-stable --app="$URL"
    elif command -v chromium &> /dev/null; then
        chromium --app="$URL"
    elif command -v chromium-browser &> /dev/null; then
        chromium-browser --app="$URL"
    else
        xdg-open "$URL"
    fi
else
    xdg-open "$URL"
fi
lemonade-sdk-lemonade-d88f5d9/data/lemonade-web-app.desktop
[Desktop Entry]
Version=1.0
Type=Application
Name=Lemonade Web App
Comment=Local LLMs with GPU and NPU acceleration - Web Interface
GenericName=AI Model Manager
Exec=lemonade-web-app
Icon=lemonade-app
Terminal=false
Categories=Development;Utility;
Keywords=AI;LLM;GPU;NPU;Machine Learning;
lemonade-sdk-lemonade-d88f5d9/data/lemonade.conf
#LEMONADE_HOST=
#LEMONADE_PORT=
#LEMONADE_LOG_LEVEL=
#LEMONADE_LLAMACPP=
#LEMONADE_CTX_SIZE=
#LEMONADE_LLAMACPP_ARGS=
#LEMONADE_EXTRA_MODELS_DIR=
#LEMONADE_DISABLE_MODEL_FILTERING=
#LEMONADE_ENABLE_DGPU_GTT=
#LEMONADE_LLAMACPP_ROCM_BIN=
#LEMONADE_LLAMACPP_VULKAN_BIN=
#LEMONADE_LLAMACPP_CPU_BIN=
#LEMONADE_WHISPERCPP_CPU_BIN=
#LEMONADE_WHISPERCPP_NPU_BIN=
#LEMONADE_RYZENAI_SERVER_BIN=
#LEMONADE_KOKORO_CPU_BIN=
#LEMONADE_SDCPP_CPU_BIN=
#LEMONADE_SDCPP_ROCM_BIN=
#LEMONADE_SDCPP_VULKAN_BIN=
#LEMONADE_LLAMACPP_PREFER_SYSTEM=
#LEMONADE_NO_BROADCAST=
#LEMONADE_MAX_LOADED_MODELS=
lemonade-sdk-lemonade-d88f5d9/data/secrets.conf
#LEMONADE_API_KEY=
lemonade-sdk-lemonade-d88f5d9/docs/CNAME
lemonade-server.ai
lemonade-sdk-lemonade-d88f5d9/docs/README.md
# Lemonade SDK Documentation
This documentation has moved to our main documentation site.
## Lemonade Server
For the Lemonade Server (OpenAI-compatible HTTP server for local LLMs), see:
**[Lemonade Server Documentation →](https://lemonade-server.ai/docs/)**
## lemonade-eval CLI
For the `lemonade-eval` CLI (benchmarking, accuracy evaluation, and model preparation), see:
**[lemonade-eval Documentation →](./eval/README.md)**
lemonade-sdk-lemonade-d88f5d9/docs/assets/carousel.js
// Simple YouTube video carousel for MkDocs Material
document.addEventListener('DOMContentLoaded', function () {
  var carousel = document.getElementById('yt-carousel');
  if (!carousel) return;

  // Support both data-ids (comma-separated) and data-videos (JSON array of {id, title})
  var videos = [];
  if (carousel.dataset.videos) {
    try {
      videos = JSON.parse(carousel.dataset.videos);
    } catch (e) {
      console.error('Invalid JSON in data-videos:', e);
    }
  } else if (carousel.dataset.ids) {
    videos = carousel.dataset.ids.split(',').map(function(id) {
      return { id: id.trim(), title: '' };
    });
  }
  if (!videos.length) return;

  var idx = 0;

  function render() {
    var video = videos[idx];
    var titleHtml = video.title ? `