././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1740761919.8178308 llm-0.23/0000755000175100001660000000000014760365500011633 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/LICENSE0000644000175100001660000002613514760365472012657 0ustar00runnerdocker Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. 
Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. 
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

llm-0.23/MANIFEST.in

global-exclude tests/*

llm-0.23/PKG-INFO

Metadata-Version: 2.2
Name: llm
Version: 0.23
Summary: CLI utility and Python library for interacting with Large Language Models from organizations like OpenAI, Anthropic and Gemini plus local models installed on your own machine.
Home-page: https://github.com/simonw/llm
Author: Simon Willison
License: Apache License, Version 2.0
Project-URL: Documentation, https://llm.datasette.io/
Project-URL: Issues, https://github.com/simonw/llm/issues
Project-URL: CI, https://github.com/simonw/llm/actions
Project-URL: Changelog, https://github.com/simonw/llm/releases
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click
Requires-Dist: openai>=1.55.3
Requires-Dist: click-default-group>=1.2.3
Requires-Dist: sqlite-utils>=3.37
Requires-Dist: sqlite-migrate>=0.1a2
Requires-Dist: pydantic>=2.0.0
Requires-Dist: PyYAML
Requires-Dist: pluggy
Requires-Dist: python-ulid
Requires-Dist: setuptools
Requires-Dist: pip
Requires-Dist: pyreadline3; sys_platform == "win32"
Requires-Dist: puremagic
Provides-Extra: test
Requires-Dist: pytest; extra == "test"
Requires-Dist: numpy; extra == "test"
Requires-Dist: pytest-httpx>=0.33.0; extra == "test"
Requires-Dist: pytest-asyncio; extra == "test"
Requires-Dist: cogapp; extra == "test"
Requires-Dist: mypy>=1.10.0; extra == "test"
Requires-Dist: black>=25.1.0; extra == "test"
Requires-Dist: ruff; extra == "test"
Requires-Dist: types-click; extra == "test"
Requires-Dist: types-PyYAML; extra == "test"
Requires-Dist: types-setuptools; extra == "test"
Dynamic: author
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license
Dynamic: project-url
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# LLM

[![PyPI](https://img.shields.io/pypi/v/llm.svg)](https://pypi.org/project/llm/)
[![Documentation](https://readthedocs.org/projects/llm/badge/?version=latest)](https://llm.datasette.io/)
[![Changelog](https://img.shields.io/github/v/release/simonw/llm?include_prereleases&label=changelog)](https://llm.datasette.io/en/stable/changelog.html)
[![Tests](https://github.com/simonw/llm/workflows/Test/badge.svg)](https://github.com/simonw/llm/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/llm/blob/main/LICENSE)
[![Discord](https://img.shields.io/discord/823971286308356157?label=discord)](https://datasette.io/discord-llm)
[![Homebrew](https://img.shields.io/homebrew/installs/dy/llm?color=yellow&label=homebrew&logo=homebrew)](https://formulae.brew.sh/formula/llm)

A CLI utility and Python library for interacting with Large Language Models, both via remote APIs and models that can be installed and run on your own machine.
[Run prompts from the command-line](https://llm.datasette.io/en/stable/usage.html#executing-a-prompt), [store the results in SQLite](https://llm.datasette.io/en/stable/logging.html), [generate embeddings](https://llm.datasette.io/en/stable/embeddings/index.html) and more.

Consult the **[LLM plugins directory](https://llm.datasette.io/en/stable/plugins/directory.html)** for plugins that provide access to remote and local models.

Full documentation: **[llm.datasette.io](https://llm.datasette.io/)**

Background on this project:

- [llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs](https://simonwillison.net/2023/May/18/cli-tools-for-llms/)
- [The LLM CLI tool now supports self-hosted language models via plugins](https://simonwillison.net/2023/Jul/12/llm/)
- [Accessing Llama 2 from the command-line with the llm-replicate plugin](https://simonwillison.net/2023/Jul/18/accessing-llama-2/)
- [Run Llama 2 on your own Mac using LLM and Homebrew](https://simonwillison.net/2023/Aug/1/llama-2-mac/)
- [Catching up on the weird world of LLMs](https://simonwillison.net/2023/Aug/3/weird-world-of-llms/)
- [LLM now provides tools for working with embeddings](https://simonwillison.net/2023/Sep/4/llm-embeddings/)
- [Build an image search engine with llm-clip, chat with models with llm chat](https://simonwillison.net/2023/Sep/12/llm-clip-and-chat/)
- [Many options for running Mistral models in your terminal using LLM](https://simonwillison.net/2023/Dec/18/mistral/)

## Installation

Install this tool using `pip`:

```bash
pip install llm
```

Or using [Homebrew](https://brew.sh/):

```bash
brew install llm
```

[Detailed installation instructions](https://llm.datasette.io/en/stable/setup.html).

## Getting started

If you have an [OpenAI API key](https://platform.openai.com/api-keys) you can get started using the OpenAI models right away.

As an alternative to OpenAI, you can [install plugins](https://llm.datasette.io/en/stable/plugins/installing-plugins.html) to access models by other providers, including models that can be installed and run on your own device.

Save your OpenAI API key like this:

```bash
llm keys set openai
```

This will prompt you for your key like so:

```
Enter key:
```

Now that you've saved a key you can run a prompt like this:

```bash
llm "Five cute names for a pet penguin"
```
```
1. Waddles
2. Pebbles
3. Bubbles
4. Flappy
5. Chilly
```

Read the [usage instructions](https://llm.datasette.io/en/stable/usage.html) for more.

## Installing a model that runs on your own machine

[LLM plugins](https://llm.datasette.io/en/stable/plugins/index.html) can add support for alternative models, including models that run on your own machine.

To download and run Mistral 7B Instruct locally, you can install the [llm-gpt4all](https://github.com/simonw/llm-gpt4all) plugin:

```bash
llm install llm-gpt4all
```

Then run this command to see which models it makes available:

```bash
llm models
```
```
gpt4all: all-MiniLM-L6-v2-f16 - SBert, 43.76MB download, needs 1GB RAM
gpt4all: orca-mini-3b-gguf2-q4_0 - Mini Orca (Small), 1.84GB download, needs 4GB RAM
gpt4all: mistral-7b-instruct-v0 - Mistral Instruct, 3.83GB download, needs 8GB RAM
...
```

Each model file will be downloaded once the first time you use it.
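LLM is a Python library as well as a CLI tool, so plugin-installed models can also be driven from Python. A minimal sketch (the model ID is taken from the `llm models` listing above and assumes the `llm-gpt4all` plugin is installed; any other model ID or alias works the same way, and API-backed models pick up the key saved with `llm keys set`):

```python
import llm

# Look up a model by ID or alias - raises llm.UnknownModelError if it isn't installed
model = llm.get_model("mistral-7b-instruct-v0")

# Run a prompt and read back the full response text
response = model.prompt("Difference between a pelican and a walrus")
print(response.text())
```

Both `llm.get_model()` and `UnknownModelError` are defined in `llm/__init__.py` later in this listing.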
Try Mistral out like this:

```bash
llm -m mistral-7b-instruct-v0 'difference between a pelican and a walrus'
```

You can also start a chat session with the model using the `llm chat` command:

```bash
llm chat -m mistral-7b-instruct-v0
```
```
Chatting with mistral-7b-instruct-v0
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
>
```

## Using a system prompt

You can use the `-s/--system` option to set a system prompt, providing instructions for processing other input to the tool.

To describe how the code in a file works, try this:

```bash
cat mycode.py | llm -s "Explain this code"
```

## Help

For help, run:

    llm --help

You can also use:

    python -m llm --help

llm-0.23/README.md

# LLM

[![PyPI](https://img.shields.io/pypi/v/llm.svg)](https://pypi.org/project/llm/)
[![Documentation](https://readthedocs.org/projects/llm/badge/?version=latest)](https://llm.datasette.io/)
[![Changelog](https://img.shields.io/github/v/release/simonw/llm?include_prereleases&label=changelog)](https://llm.datasette.io/en/stable/changelog.html)
[![Tests](https://github.com/simonw/llm/workflows/Test/badge.svg)](https://github.com/simonw/llm/actions?query=workflow%3ATest)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/llm/blob/main/LICENSE)
[![Discord](https://img.shields.io/discord/823971286308356157?label=discord)](https://datasette.io/discord-llm)
[![Homebrew](https://img.shields.io/homebrew/installs/dy/llm?color=yellow&label=homebrew&logo=homebrew)](https://formulae.brew.sh/formula/llm)

A CLI utility and Python library for interacting with Large Language Models, both via remote APIs and models that can be installed and run on your own machine.

[Run prompts from the command-line](https://llm.datasette.io/en/stable/usage.html#executing-a-prompt), [store the results in SQLite](https://llm.datasette.io/en/stable/logging.html), [generate embeddings](https://llm.datasette.io/en/stable/embeddings/index.html) and more.

Consult the **[LLM plugins directory](https://llm.datasette.io/en/stable/plugins/directory.html)** for plugins that provide access to remote and local models.
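Which models are available depends entirely on which plugins you have installed. As a rough sketch, the same information shown by `llm models` can also be pulled from Python using helpers defined in `llm/__init__.py` later in this listing:

```python
import llm

# Print the ID of every model registered by the installed plugins
for model in llm.get_models():
    print(model.model_id)
```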
Full documentation: **[llm.datasette.io](https://llm.datasette.io/)**

Background on this project:

- [llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs](https://simonwillison.net/2023/May/18/cli-tools-for-llms/)
- [The LLM CLI tool now supports self-hosted language models via plugins](https://simonwillison.net/2023/Jul/12/llm/)
- [Accessing Llama 2 from the command-line with the llm-replicate plugin](https://simonwillison.net/2023/Jul/18/accessing-llama-2/)
- [Run Llama 2 on your own Mac using LLM and Homebrew](https://simonwillison.net/2023/Aug/1/llama-2-mac/)
- [Catching up on the weird world of LLMs](https://simonwillison.net/2023/Aug/3/weird-world-of-llms/)
- [LLM now provides tools for working with embeddings](https://simonwillison.net/2023/Sep/4/llm-embeddings/)
- [Build an image search engine with llm-clip, chat with models with llm chat](https://simonwillison.net/2023/Sep/12/llm-clip-and-chat/)
- [Many options for running Mistral models in your terminal using LLM](https://simonwillison.net/2023/Dec/18/mistral/)

## Installation

Install this tool using `pip`:

```bash
pip install llm
```

Or using [Homebrew](https://brew.sh/):

```bash
brew install llm
```

[Detailed installation instructions](https://llm.datasette.io/en/stable/setup.html).

## Getting started

If you have an [OpenAI API key](https://platform.openai.com/api-keys) you can get started using the OpenAI models right away.

As an alternative to OpenAI, you can [install plugins](https://llm.datasette.io/en/stable/plugins/installing-plugins.html) to access models by other providers, including models that can be installed and run on your own device.

Save your OpenAI API key like this:

```bash
llm keys set openai
```

This will prompt you for your key like so:

```
Enter key:
```

Now that you've saved a key you can run a prompt like this:

```bash
llm "Five cute names for a pet penguin"
```
```
1. Waddles
2. Pebbles
3. Bubbles
4. Flappy
5. Chilly
```

Read the [usage instructions](https://llm.datasette.io/en/stable/usage.html) for more.

## Installing a model that runs on your own machine

[LLM plugins](https://llm.datasette.io/en/stable/plugins/index.html) can add support for alternative models, including models that run on your own machine.

To download and run Mistral 7B Instruct locally, you can install the [llm-gpt4all](https://github.com/simonw/llm-gpt4all) plugin:

```bash
llm install llm-gpt4all
```

Then run this command to see which models it makes available:

```bash
llm models
```
```
gpt4all: all-MiniLM-L6-v2-f16 - SBert, 43.76MB download, needs 1GB RAM
gpt4all: orca-mini-3b-gguf2-q4_0 - Mini Orca (Small), 1.84GB download, needs 4GB RAM
gpt4all: mistral-7b-instruct-v0 - Mistral Instruct, 3.83GB download, needs 8GB RAM
...
```

Each model file will be downloaded once the first time you use it. Try Mistral out like this:

```bash
llm -m mistral-7b-instruct-v0 'difference between a pelican and a walrus'
```

You can also start a chat session with the model using the `llm chat` command:

```bash
llm chat -m mistral-7b-instruct-v0
```
```
Chatting with mistral-7b-instruct-v0
Type 'exit' or 'quit' to exit
Type '!multi' to enter multiple lines, then '!end' to finish
>
```

## Using a system prompt

You can use the `-s/--system` option to set a system prompt, providing instructions for processing other input to the tool.
To describe how the code in a file works, try this: ```bash cat mycode.py | llm -s "Explain this code" ``` ## Help For help, run: llm --help You can also use: python -m llm --help ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1740761919.8148308 llm-0.23/llm/0000755000175100001660000000000014760365500012417 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/__init__.py0000644000175100001660000002500014760365472014535 0ustar00runnerdockerfrom .hookspecs import hookimpl from .errors import ( ModelError, NeedsKeyException, ) from .models import ( AsyncConversation, AsyncKeyModel, AsyncModel, AsyncResponse, Attachment, Conversation, EmbeddingModel, EmbeddingModelWithAliases, KeyModel, Model, ModelWithAliases, Options, Prompt, Response, ) from .utils import schema_dsl from .embeddings import Collection from .templates import Template from .plugins import pm, load_plugins import click from typing import Dict, List, Optional import json import os import pathlib import struct __all__ = [ "AsyncConversation", "AsyncKeyModel", "AsyncResponse", "Attachment", "Collection", "Conversation", "get_async_model", "get_key", "get_model", "hookimpl", "KeyModel", "Model", "ModelError", "NeedsKeyException", "Options", "Prompt", "Response", "Template", "user_dir", "schema_dsl", ] DEFAULT_MODEL = "gpt-4o-mini" def get_plugins(all=False): plugins = [] plugin_to_distinfo = dict(pm.list_plugin_distinfo()) for plugin in pm.get_plugins(): if not all and plugin.__name__.startswith("llm.default_plugins."): continue plugin_info = { "name": plugin.__name__, "hooks": [h.name for h in pm.get_hookcallers(plugin)], } distinfo = plugin_to_distinfo.get(plugin) if distinfo: plugin_info["version"] = distinfo.version plugin_info["name"] = ( getattr(distinfo, "name", None) or distinfo.project_name ) plugins.append(plugin_info) return plugins def get_models_with_aliases() -> List["ModelWithAliases"]: model_aliases = [] # Include aliases from aliases.json aliases_path = user_dir() / "aliases.json" extra_model_aliases: Dict[str, list] = {} if aliases_path.exists(): configured_aliases = json.loads(aliases_path.read_text()) for alias, model_id in configured_aliases.items(): extra_model_aliases.setdefault(model_id, []).append(alias) def register(model, async_model=None, aliases=None): alias_list = list(aliases or []) if model.model_id in extra_model_aliases: alias_list.extend(extra_model_aliases[model.model_id]) model_aliases.append(ModelWithAliases(model, async_model, alias_list)) load_plugins() pm.hook.register_models(register=register) return model_aliases def get_embedding_models_with_aliases() -> List["EmbeddingModelWithAliases"]: model_aliases = [] # Include aliases from aliases.json aliases_path = user_dir() / "aliases.json" extra_model_aliases: Dict[str, list] = {} if aliases_path.exists(): configured_aliases = json.loads(aliases_path.read_text()) for alias, model_id in configured_aliases.items(): extra_model_aliases.setdefault(model_id, []).append(alias) def register(model, aliases=None): alias_list = list(aliases or []) if model.model_id in extra_model_aliases: alias_list.extend(extra_model_aliases[model.model_id]) model_aliases.append(EmbeddingModelWithAliases(model, alias_list)) load_plugins() pm.hook.register_embedding_models(register=register) return model_aliases def get_embedding_models(): models = [] def register(model, aliases=None): models.append(model) load_plugins() 
pm.hook.register_embedding_models(register=register) return models def get_embedding_model(name): aliases = get_embedding_model_aliases() try: return aliases[name] except KeyError: raise UnknownModelError("Unknown model: " + str(name)) def get_embedding_model_aliases() -> Dict[str, EmbeddingModel]: model_aliases = {} for model_with_aliases in get_embedding_models_with_aliases(): for alias in model_with_aliases.aliases: model_aliases[alias] = model_with_aliases.model model_aliases[model_with_aliases.model.model_id] = model_with_aliases.model return model_aliases def get_async_model_aliases() -> Dict[str, AsyncModel]: async_model_aliases = {} for model_with_aliases in get_models_with_aliases(): if model_with_aliases.async_model: for alias in model_with_aliases.aliases: async_model_aliases[alias] = model_with_aliases.async_model async_model_aliases[model_with_aliases.model.model_id] = ( model_with_aliases.async_model ) return async_model_aliases def get_model_aliases() -> Dict[str, Model]: model_aliases = {} for model_with_aliases in get_models_with_aliases(): if model_with_aliases.model: for alias in model_with_aliases.aliases: model_aliases[alias] = model_with_aliases.model model_aliases[model_with_aliases.model.model_id] = model_with_aliases.model return model_aliases class UnknownModelError(KeyError): pass def get_models() -> List[Model]: "Get all registered models" models_with_aliases = get_models_with_aliases() return [mwa.model for mwa in models_with_aliases if mwa.model] def get_async_models() -> List[AsyncModel]: "Get all registered async models" models_with_aliases = get_models_with_aliases() return [mwa.async_model for mwa in models_with_aliases if mwa.async_model] def get_async_model(name: Optional[str] = None) -> AsyncModel: "Get an async model by name or alias" aliases = get_async_model_aliases() name = name or get_default_model() try: return aliases[name] except KeyError: # Does a sync model exist? sync_model = None try: sync_model = get_model(name, _skip_async=True) except UnknownModelError: pass if sync_model: raise UnknownModelError("Unknown async model (sync model exists): " + name) else: raise UnknownModelError("Unknown model: " + name) def get_model(name: Optional[str] = None, _skip_async: bool = False) -> Model: "Get a model by name or alias" aliases = get_model_aliases() name = name or get_default_model() try: return aliases[name] except KeyError: # Does an async model exist? if _skip_async: raise UnknownModelError("Unknown model: " + name) async_model = None try: async_model = get_async_model(name) except UnknownModelError: pass if async_model: raise UnknownModelError("Unknown model (async model exists): " + name) else: raise UnknownModelError("Unknown model: " + name) def get_key( explicit_key: Optional[str], key_alias: str, env_var: Optional[str] = None ) -> Optional[str]: """ Return an API key based on a hierarchy of potential sources. :param provided_key: A key provided by the user. This may be the key, or an alias of a key in keys.json. :param key_alias: The alias used to retrieve the key from the keys.json file. :param env_var: Name of the environment variable to check for the key. 
""" stored_keys = load_keys() # If user specified an alias, use the key stored for that alias if explicit_key in stored_keys: return stored_keys[explicit_key] if explicit_key: # User specified a key that's not an alias, use that return explicit_key # Stored key over-rides environment variables over-ride the default key if key_alias in stored_keys: return stored_keys[key_alias] # Finally try environment variable if env_var and os.environ.get(env_var): return os.environ[env_var] # Couldn't find it return None def load_keys(): path = user_dir() / "keys.json" if path.exists(): return json.loads(path.read_text()) else: return {} def user_dir(): llm_user_path = os.environ.get("LLM_USER_PATH") if llm_user_path: path = pathlib.Path(llm_user_path) else: path = pathlib.Path(click.get_app_dir("io.datasette.llm")) path.mkdir(exist_ok=True, parents=True) return path def set_alias(alias, model_id_or_alias): """ Set an alias to point to the specified model. """ path = user_dir() / "aliases.json" path.parent.mkdir(parents=True, exist_ok=True) if not path.exists(): path.write_text("{}\n") try: current = json.loads(path.read_text()) except json.decoder.JSONDecodeError: # We're going to write a valid JSON file in a moment: current = {} # Resolve model_id_or_alias to a model_id try: model = get_model(model_id_or_alias) model_id = model.model_id except UnknownModelError: # Try to resolve it to an embedding model try: model = get_embedding_model(model_id_or_alias) model_id = model.model_id except UnknownModelError: # Set the alias to the exact string they provided instead model_id = model_id_or_alias current[alias] = model_id path.write_text(json.dumps(current, indent=4) + "\n") def remove_alias(alias): """ Remove an alias. """ path = user_dir() / "aliases.json" if not path.exists(): raise KeyError("No aliases.json file exists") try: current = json.loads(path.read_text()) except json.decoder.JSONDecodeError: raise KeyError("aliases.json file is not valid JSON") if alias not in current: raise KeyError("No such alias: {}".format(alias)) del current[alias] path.write_text(json.dumps(current, indent=4) + "\n") def encode(values): return struct.pack("<" + "f" * len(values), *values) def decode(binary): return struct.unpack("<" + "f" * (len(binary) // 4), binary) def cosine_similarity(a, b): dot_product = sum(x * y for x, y in zip(a, b)) magnitude_a = sum(x * x for x in a) ** 0.5 magnitude_b = sum(x * x for x in b) ** 0.5 return dot_product / (magnitude_a * magnitude_b) def get_default_model(filename="default_model.txt", default=DEFAULT_MODEL): path = user_dir() / filename if path.exists(): return path.read_text().strip() else: return default def set_default_model(model, filename="default_model.txt"): path = user_dir() / filename if model is None and path.exists(): path.unlink() else: path.write_text(model) def get_default_embedding_model(): return get_default_model("default_embedding_model.txt", None) def set_default_embedding_model(model): set_default_model(model, "default_embedding_model.txt") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/__main__.py0000644000175100001660000000007314760365472014521 0ustar00runnerdockerfrom .cli import cli if __name__ == "__main__": cli() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/cli.py0000644000175100001660000021526114760365472013557 0ustar00runnerdockerimport asyncio import click from click_default_group import DefaultGroup from dataclasses import asdict 
import io import json import re from llm import ( Attachment, AsyncConversation, AsyncKeyModel, AsyncResponse, Collection, Conversation, Response, Template, UnknownModelError, KeyModel, encode, get_async_model, get_default_model, get_default_embedding_model, get_embedding_models_with_aliases, get_embedding_model_aliases, get_embedding_model, get_plugins, get_model, get_model_aliases, get_models_with_aliases, user_dir, set_alias, set_default_model, set_default_embedding_model, remove_alias, ) from llm.models import _BaseConversation from .migrations import migrate from .plugins import pm, load_plugins from .utils import ( mimetype_from_path, mimetype_from_string, token_usage_string, extract_fenced_code_block, make_schema_id, output_rows_as_json, resolve_schema_input, schema_summary, multi_schema, schema_dsl, find_unused_key, ) import base64 import httpx import pathlib import pydantic import readline from runpy import run_module import shutil import sqlite_utils from sqlite_utils.utils import rows_from_file, Format import sys import textwrap from typing import cast, Optional, Iterable, Union, Tuple import warnings import yaml warnings.simplefilter("ignore", ResourceWarning) DEFAULT_TEMPLATE = "prompt: " class AttachmentType(click.ParamType): name = "attachment" def convert(self, value, param, ctx): if value == "-": content = sys.stdin.buffer.read() # Try to guess type mimetype = mimetype_from_string(content) if mimetype is None: raise click.BadParameter("Could not determine mimetype of stdin") return Attachment(type=mimetype, path=None, url=None, content=content) if "://" in value: # Confirm URL exists and try to guess type try: response = httpx.head(value) response.raise_for_status() mimetype = response.headers.get("content-type") except httpx.HTTPError as ex: raise click.BadParameter(str(ex)) return Attachment(mimetype, None, value, None) # Check that the file exists path = pathlib.Path(value) if not path.exists(): self.fail(f"File {value} does not exist", param, ctx) path = path.resolve() # Try to guess type mimetype = mimetype_from_path(str(path)) if mimetype is None: raise click.BadParameter(f"Could not determine mimetype of {value}") return Attachment(type=mimetype, path=str(path), url=None, content=None) def attachment_types_callback(ctx, param, values): collected = [] for value, mimetype in values: if "://" in value: attachment = Attachment(mimetype, None, value, None) elif value == "-": content = sys.stdin.buffer.read() attachment = Attachment(mimetype, None, None, content) else: # Look for file path = pathlib.Path(value) if not path.exists(): raise click.BadParameter(f"File {value} does not exist") path = path.resolve() attachment = Attachment(mimetype, str(path), None, None) collected.append(attachment) return collected def json_validator(object_name): def validator(ctx, param, value): if value is None: return value try: obj = json.loads(value) if not isinstance(obj, dict): raise click.BadParameter(f"{object_name} must be a JSON object") return obj except json.JSONDecodeError: raise click.BadParameter(f"{object_name} must be valid JSON") return validator def schema_option(fn): click.option( "schema_input", "--schema", help="JSON schema, filepath or ID", )(fn) return fn @click.group( cls=DefaultGroup, default="prompt", default_if_no_args=True, ) @click.version_option() def cli(): """ Access Large Language Models from the command-line Documentation: https://llm.datasette.io/ LLM can run models from many different providers. 
Consult the plugin directory for a list of available models: https://llm.datasette.io/en/stable/plugins/directory.html To get started with OpenAI, obtain an API key from them and: \b $ llm keys set openai Enter key: ... Then execute a prompt like this: llm 'Five outrageous names for a pet pelican' """ @cli.command(name="prompt") @click.argument("prompt", required=False) @click.option("-s", "--system", help="System prompt to use") @click.option("model_id", "-m", "--model", help="Model to use") @click.option( "attachments", "-a", "--attachment", type=AttachmentType(), multiple=True, help="Attachment path or URL or -", ) @click.option( "attachment_types", "--at", "--attachment-type", type=(str, str), multiple=True, callback=attachment_types_callback, help="Attachment with explicit mimetype", ) @click.option( "options", "-o", "--option", type=(str, str), multiple=True, help="key/value options for the model", ) @schema_option @click.option( "--schema-multi", help="JSON schema to use for multiple results", ) @click.option("-t", "--template", help="Template to use") @click.option( "-p", "--param", multiple=True, type=(str, str), help="Parameters for template", ) @click.option("--no-stream", is_flag=True, help="Do not stream output") @click.option("-n", "--no-log", is_flag=True, help="Don't log to database") @click.option("--log", is_flag=True, help="Log prompt and response to the database") @click.option( "_continue", "-c", "--continue", is_flag=True, flag_value=-1, help="Continue the most recent conversation.", ) @click.option( "conversation_id", "--cid", "--conversation", help="Continue the conversation with the given ID.", ) @click.option("--key", help="API key to use") @click.option("--save", help="Save prompt with this template name") @click.option("async_", "--async", is_flag=True, help="Run prompt asynchronously") @click.option("-u", "--usage", is_flag=True, help="Show token usage") @click.option("-x", "--extract", is_flag=True, help="Extract first fenced code block") @click.option( "extract_last", "--xl", "--extract-last", is_flag=True, help="Extract last fenced code block", ) def prompt( prompt, system, model_id, attachments, attachment_types, options, schema_input, schema_multi, template, param, no_stream, no_log, log, _continue, conversation_id, key, save, async_, usage, extract, extract_last, ): """ Execute a prompt Documentation: https://llm.datasette.io/en/stable/usage.html Examples: \b llm 'Capital of France?' llm 'Capital of France?' -m gpt-4o llm 'Capital of France?' -s 'answer in Spanish' Multi-modal models can be called with attachments like this: \b llm 'Extract text from this image' -a image.jpg llm 'Describe' -a https://static.simonwillison.net/static/2024/pelicans.jpg cat image | llm 'describe image' -a - # With an explicit mimetype: cat image | llm 'describe image' --at - image/jpeg The -x/--extract option returns just the content of the first ``` fenced code block, if one is present. If none are present it returns the full response. 
\b llm 'JavaScript function for reversing a string' -x """ if log and no_log: raise click.ClickException("--log and --no-log are mutually exclusive") log_path = logs_db_path() (log_path.parent).mkdir(parents=True, exist_ok=True) db = sqlite_utils.Database(log_path) migrate(db) if schema_multi: schema_input = schema_multi schema = resolve_schema_input(db, schema_input, load_template) if schema_multi: # Convert that schema into multiple "items" of the same schema schema = multi_schema(schema) model_aliases = get_model_aliases() def read_prompt(): nonlocal prompt, schema # Is there extra prompt available on stdin? stdin_prompt = None if not sys.stdin.isatty(): stdin_prompt = sys.stdin.read() if stdin_prompt: bits = [stdin_prompt] if prompt: bits.append(prompt) prompt = " ".join(bits) if ( prompt is None and not save and sys.stdin.isatty() and not attachments and not attachment_types and not schema ): # Hang waiting for input to stdin (unless --save) prompt = sys.stdin.read() return prompt if save: # We are saving their prompt/system/etc to a new template # Fields to save: prompt, system, model - and more in the future disallowed_options = [] for option, var in ( ("--template", template), ("--continue", _continue), ("--cid", conversation_id), ): if var: disallowed_options.append(option) if disallowed_options: raise click.ClickException( "--save cannot be used with {}".format(", ".join(disallowed_options)) ) path = template_dir() / f"{save}.yaml" to_save = {} if model_id: try: to_save["model"] = model_aliases[model_id].model_id except KeyError: raise click.ClickException("'{}' is not a known model".format(model_id)) prompt = read_prompt() if prompt: to_save["prompt"] = prompt if system: to_save["system"] = system if param: to_save["defaults"] = dict(param) if extract: to_save["extract"] = True if extract_last: to_save["extract_last"] = True if schema: to_save["schema_object"] = schema path.write_text( yaml.dump( to_save, indent=4, default_flow_style=False, sort_keys=False, ), "utf-8", ) return if template: params = dict(param) # Cannot be used with system if system: raise click.ClickException("Cannot use -t/--template and --system together") template_obj = load_template(template) extract = template_obj.extract extract_last = template_obj.extract_last if template_obj.schema_object: schema = template_obj.schema_object prompt = read_prompt() try: prompt, system = template_obj.evaluate(prompt, params) except Template.MissingVariables as ex: raise click.ClickException(str(ex)) if model_id is None and template_obj.model: model_id = template_obj.model if extract or extract_last: no_stream = True conversation = None if conversation_id or _continue: # Load the conversation - loads most recent if no ID provided try: conversation = load_conversation(conversation_id, async_=async_) except UnknownModelError as ex: raise click.ClickException(str(ex)) # Figure out which model we are using if model_id is None: if conversation: model_id = conversation.model.model_id else: model_id = get_default_model() # Now resolve the model try: if async_: model = get_async_model(model_id) else: model = get_model(model_id) except UnknownModelError as ex: raise click.ClickException(ex) if conversation: # To ensure it can see the key conversation.model = model # Validate options validated_options = {} if options: # Validate with pydantic try: validated_options = dict( (key, value) for key, value in model.Options(**dict(options)) if value is not None ) except pydantic.ValidationError as ex: raise 
click.ClickException(render_errors(ex.errors())) kwargs = {**validated_options} resolved_attachments = [*attachments, *attachment_types] should_stream = model.can_stream and not no_stream if not should_stream: kwargs["stream"] = False if isinstance(model, (KeyModel, AsyncKeyModel)): kwargs["key"] = key prompt = read_prompt() response = None prompt_method = model.prompt if conversation: prompt_method = conversation.prompt try: if async_: async def inner(): if should_stream: response = prompt_method( prompt, attachments=resolved_attachments, system=system, schema=schema, **kwargs, ) async for chunk in response: print(chunk, end="") sys.stdout.flush() print("") else: response = prompt_method( prompt, attachments=resolved_attachments, system=system, schema=schema, **kwargs, ) text = await response.text() if extract or extract_last: text = ( extract_fenced_code_block(text, last=extract_last) or text ) print(text) return response response = asyncio.run(inner()) else: response = prompt_method( prompt, attachments=resolved_attachments, system=system, schema=schema, **kwargs, ) if should_stream: for chunk in response: print(chunk, end="") sys.stdout.flush() print("") else: text = response.text() if extract or extract_last: text = extract_fenced_code_block(text, last=extract_last) or text print(text) # List of exceptions that should never be raised in pytest: except (ValueError, NotImplementedError) as ex: raise click.ClickException(str(ex)) except Exception as ex: # All other exceptions should raise in pytest, show to user otherwise if getattr(sys, "_called_from_test", False): raise raise click.ClickException(str(ex)) if isinstance(response, AsyncResponse): response = asyncio.run(response.to_sync_response()) if usage: # Show token usage to stderr in yellow click.echo( click.style( "Token usage: {}".format(response.token_usage()), fg="yellow", bold=True ), err=True, ) # Log to the database if (logs_on() or log) and not no_log: response.log_to_db(db) @cli.command() @click.option("-s", "--system", help="System prompt to use") @click.option("model_id", "-m", "--model", help="Model to use") @click.option( "_continue", "-c", "--continue", is_flag=True, flag_value=-1, help="Continue the most recent conversation.", ) @click.option( "conversation_id", "--cid", "--conversation", help="Continue the conversation with the given ID.", ) @click.option("-t", "--template", help="Template to use") @click.option( "-p", "--param", multiple=True, type=(str, str), help="Parameters for template", ) @click.option( "options", "-o", "--option", type=(str, str), multiple=True, help="key/value options for the model", ) @click.option("--no-stream", is_flag=True, help="Do not stream output") @click.option("--key", help="API key to use") def chat( system, model_id, _continue, conversation_id, template, param, options, no_stream, key, ): """ Hold an ongoing chat with a model. 
""" # Left and right arrow keys to move cursor: if sys.platform != "win32": readline.parse_and_bind("\\e[D: backward-char") readline.parse_and_bind("\\e[C: forward-char") else: readline.parse_and_bind("bind -x '\\e[D: backward-char'") readline.parse_and_bind("bind -x '\\e[C: forward-char'") log_path = logs_db_path() (log_path.parent).mkdir(parents=True, exist_ok=True) db = sqlite_utils.Database(log_path) migrate(db) conversation = None if conversation_id or _continue: # Load the conversation - loads most recent if no ID provided try: conversation = load_conversation(conversation_id) except UnknownModelError as ex: raise click.ClickException(str(ex)) template_obj = None if template: params = dict(param) # Cannot be used with system if system: raise click.ClickException("Cannot use -t/--template and --system together") template_obj = load_template(template) if model_id is None and template_obj.model: model_id = template_obj.model # Figure out which model we are using if model_id is None: if conversation: model_id = conversation.model.model_id else: model_id = get_default_model() # Now resolve the model try: model = get_model(model_id) except KeyError: raise click.ClickException("'{}' is not a known model".format(model_id)) if conversation is None: # Start a fresh conversation for this chat conversation = Conversation(model=model) else: # Ensure it can see the API key conversation.model = model # Validate options validated_options = {} if options: try: validated_options = dict( (key, value) for key, value in model.Options(**dict(options)) if value is not None ) except pydantic.ValidationError as ex: raise click.ClickException(render_errors(ex.errors())) kwargs = {} kwargs.update(validated_options) should_stream = model.can_stream and not no_stream if not should_stream: kwargs["stream"] = False if key and isinstance(model, KeyModel): kwargs["key"] = key click.echo("Chatting with {}".format(model.model_id)) click.echo("Type 'exit' or 'quit' to exit") click.echo("Type '!multi' to enter multiple lines, then '!end' to finish") in_multi = False accumulated = [] end_token = "!end" while True: prompt = click.prompt("", prompt_suffix="> " if not in_multi else "") if prompt.strip().startswith("!multi"): in_multi = True bits = prompt.strip().split() if len(bits) > 1: end_token = "!end {}".format(" ".join(bits[1:])) continue if in_multi: if prompt.strip() == end_token: prompt = "\n".join(accumulated) in_multi = False accumulated = [] else: accumulated.append(prompt) continue if template_obj: try: prompt, system = template_obj.evaluate(prompt, params) except Template.MissingVariables as ex: raise click.ClickException(str(ex)) if prompt.strip() in ("exit", "quit"): break response = conversation.prompt(prompt, system=system, **kwargs) # System prompt only sent for the first message: system = None for chunk in response: print(chunk, end="") sys.stdout.flush() response.log_to_db(db) print("") def load_conversation( conversation_id: Optional[str], async_=False ) -> Optional[_BaseConversation]: db = sqlite_utils.Database(logs_db_path()) migrate(db) if conversation_id is None: # Return the most recent conversation, or None if there are none matches = list(db["conversations"].rows_where(order_by="id desc", limit=1)) if matches: conversation_id = matches[0]["id"] else: return None try: row = cast(sqlite_utils.db.Table, db["conversations"]).get(conversation_id) except sqlite_utils.db.NotFoundError: raise click.ClickException( "No conversation found with id={}".format(conversation_id) ) # Inflate that conversation 
conversation_class = AsyncConversation if async_ else Conversation response_class = AsyncResponse if async_ else Response conversation = conversation_class.from_row(row) for response in db["responses"].rows_where( "conversation_id = ?", [conversation_id] ): conversation.responses.append(response_class.from_row(db, response)) return conversation @cli.group( cls=DefaultGroup, default="list", default_if_no_args=True, ) def keys(): "Manage stored API keys for different models" @keys.command(name="list") def keys_list(): "List names of all stored keys" path = user_dir() / "keys.json" if not path.exists(): click.echo("No keys found") return keys = json.loads(path.read_text()) for key in sorted(keys.keys()): if key != "// Note": click.echo(key) @keys.command(name="path") def keys_path_command(): "Output the path to the keys.json file" click.echo(user_dir() / "keys.json") @keys.command(name="get") @click.argument("name") def keys_get(name): """ Return the value of a stored key Example usage: \b export OPENAI_API_KEY=$(llm keys get openai) """ path = user_dir() / "keys.json" if not path.exists(): raise click.ClickException("No keys found") keys = json.loads(path.read_text()) try: click.echo(keys[name]) except KeyError: raise click.ClickException("No key found with name '{}'".format(name)) @keys.command(name="set") @click.argument("name") @click.option("--value", prompt="Enter key", hide_input=True, help="Value to set") def keys_set(name, value): """ Save a key in the keys.json file Example usage: \b $ llm keys set openai Enter key: ... """ default = {"// Note": "This file stores secret API credentials. Do not share!"} path = user_dir() / "keys.json" path.parent.mkdir(parents=True, exist_ok=True) if not path.exists(): path.write_text(json.dumps(default)) path.chmod(0o600) try: current = json.loads(path.read_text()) except json.decoder.JSONDecodeError: current = default current[name] = value path.write_text(json.dumps(current, indent=2) + "\n") @cli.group( cls=DefaultGroup, default="list", default_if_no_args=True, ) def logs(): "Tools for exploring logged prompts and responses" @logs.command(name="path") def logs_path(): "Output the path to the logs.db file" click.echo(logs_db_path()) @logs.command(name="status") def logs_status(): "Show current status of database logging" path = logs_db_path() if not path.exists(): click.echo("No log database found at {}".format(path)) return if logs_on(): click.echo("Logging is ON for all prompts".format()) else: click.echo("Logging is OFF".format()) db = sqlite_utils.Database(path) migrate(db) click.echo("Found log database at {}".format(path)) click.echo("Number of conversations logged:\t{}".format(db["conversations"].count)) click.echo("Number of responses logged:\t{}".format(db["responses"].count)) click.echo( "Database file size: \t\t{}".format(_human_readable_size(path.stat().st_size)) ) @logs.command(name="on") def logs_turn_on(): "Turn on logging for all prompts" path = user_dir() / "logs-off" if path.exists(): path.unlink() @logs.command(name="off") def logs_turn_off(): "Turn off logging for all prompts" path = user_dir() / "logs-off" path.touch() LOGS_COLUMNS = """ responses.id, responses.model, responses.prompt, responses.system, responses.prompt_json, responses.options_json, responses.response, responses.response_json, responses.conversation_id, responses.duration_ms, responses.datetime_utc, responses.input_tokens, responses.output_tokens, responses.token_details, conversations.name as conversation_name, conversations.model as conversation_model, 
schemas.content as schema_json""" LOGS_SQL = """ select {columns} from responses left join schemas on responses.schema_id = schemas.id left join conversations on responses.conversation_id = conversations.id{extra_where} order by responses.id desc{limit} """ LOGS_SQL_SEARCH = """ select {columns} from responses left join schemas on responses.schema_id = schemas.id left join conversations on responses.conversation_id = conversations.id join responses_fts on responses_fts.rowid = responses.rowid where responses_fts match :query{extra_where} order by responses_fts.rank desc{limit} """ ATTACHMENTS_SQL = """ select response_id, attachments.id, attachments.type, attachments.path, attachments.url, length(attachments.content) as content_length from attachments join prompt_attachments on attachments.id = prompt_attachments.attachment_id where prompt_attachments.response_id in ({}) order by prompt_attachments."order" """ @logs.command(name="list") @click.option( "-n", "--count", type=int, default=None, help="Number of entries to show - defaults to 3, use 0 for all", ) @click.option( "-p", "--path", type=click.Path(readable=True, exists=True, dir_okay=False), help="Path to log database", ) @click.option("-m", "--model", help="Filter by model or model alias") @click.option("-q", "--query", help="Search for logs matching this string") @schema_option @click.option( "--schema-multi", help="JSON schema used for multiple results", ) @click.option( "--data", is_flag=True, help="Output newline-delimited JSON data for schema" ) @click.option("--data-array", is_flag=True, help="Output JSON array of data for schema") @click.option("--data-key", help="Return JSON objects from array in this key") @click.option( "--data-ids", is_flag=True, help="Attach corresponding IDs to JSON objects" ) @click.option("-t", "--truncate", is_flag=True, help="Truncate long strings in output") @click.option( "-s", "--short", is_flag=True, help="Shorter YAML output with truncated prompts" ) @click.option("-u", "--usage", is_flag=True, help="Include token usage") @click.option("-r", "--response", is_flag=True, help="Just output the last response") @click.option("-x", "--extract", is_flag=True, help="Extract first fenced code block") @click.option( "extract_last", "--xl", "--extract-last", is_flag=True, help="Extract last fenced code block", ) @click.option( "current_conversation", "-c", "--current", is_flag=True, flag_value=-1, help="Show logs from the current conversation", ) @click.option( "conversation_id", "--cid", "--conversation", help="Show logs for this conversation ID", ) @click.option("--id-gt", help="Return responses with ID > this") @click.option("--id-gte", help="Return responses with ID >= this") @click.option( "json_output", "--json", is_flag=True, help="Output logs as JSON", ) def logs_list( count, path, model, query, schema_input, schema_multi, data, data_array, data_key, data_ids, truncate, short, usage, response, extract, extract_last, current_conversation, conversation_id, id_gt, id_gte, json_output, ): "Show recent logged prompts and their responses" path = pathlib.Path(path or logs_db_path()) if not path.exists(): raise click.ClickException("No log database found at {}".format(path)) db = sqlite_utils.Database(path) migrate(db) if schema_multi: schema_input = schema_multi schema = resolve_schema_input(db, schema_input, load_template) if schema_multi: schema = multi_schema(schema) if short and (json_output or response): invalid = " or ".join( [ flag[0] for flag in (("--json", json_output), ("--response", response)) 
if flag[1] ] ) raise click.ClickException("Cannot use --short and {} together".format(invalid)) if response and not current_conversation and not conversation_id: current_conversation = True if current_conversation: try: conversation_id = next( db.query( "select conversation_id from responses order by id desc limit 1" ) )["conversation_id"] except StopIteration: # No conversations yet raise click.ClickException("No conversations found") # For --conversation set limit 0, if not explicitly set if count is None: if conversation_id: count = 0 else: count = 3 model_id = None if model: # Resolve alias, if any try: model_id = get_model(model).model_id except UnknownModelError: # Maybe they uninstalled a model, use the -m option as-is model_id = model sql = LOGS_SQL if query: sql = LOGS_SQL_SEARCH limit = "" if count is not None and count > 0: limit = " limit {}".format(count) sql_format = { "limit": limit, "columns": LOGS_COLUMNS, "extra_where": "", } where_bits = [] if model_id: where_bits.append("responses.model = :model") if conversation_id: where_bits.append("responses.conversation_id = :conversation_id") if id_gt: where_bits.append("responses.id > :id_gt") if id_gte: where_bits.append("responses.id >= :id_gte") schema_id = None if schema: schema_id = make_schema_id(schema)[0] where_bits.append("responses.schema_id = :schema_id") if where_bits: where_ = " and " if query else " where " sql_format["extra_where"] = where_ + " and ".join(where_bits) final_sql = sql.format(**sql_format) rows = list( db.query( final_sql, { "model": model_id, "query": query, "conversation_id": conversation_id, "schema_id": schema_id, "id_gt": id_gt, "id_gte": id_gte, }, ) ) # Reverse the order - we do this because we 'order by id desc limit 3' to get the # 3 most recent results, but we still want to display them in chronological order # ... except for searches where we don't do this if not query and not data: rows.reverse() # Fetch any attachments ids = [row["id"] for row in rows] attachments = list(db.query(ATTACHMENTS_SQL.format(",".join("?" 
* len(ids))), ids)) attachments_by_id = {} for attachment in attachments: attachments_by_id.setdefault(attachment["response_id"], []).append(attachment) if data or data_array or data_key or data_ids: # Special case for --data to output valid JSON to_output = [] for row in rows: response = row["response"] or "" try: decoded = json.loads(response) new_items = [] if ( isinstance(decoded, dict) and (data_key in decoded) and all(isinstance(item, dict) for item in decoded[data_key]) ): for item in decoded[data_key]: new_items.append(item) else: new_items.append(decoded) if data_ids: for item in new_items: item[find_unused_key(item, "response_id")] = row["id"] item[find_unused_key(item, "conversation_id")] = row["id"] to_output.extend(new_items) except ValueError: pass click.echo(output_rows_as_json(to_output, not data_array)) return for row in rows: if truncate: row["prompt"] = _truncate_string(row["prompt"]) row["response"] = _truncate_string(row["response"]) # Either decode or remove all JSON keys keys = list(row.keys()) for key in keys: if key.endswith("_json") and row[key] is not None: if truncate: del row[key] else: row[key] = json.loads(row[key]) output = None if json_output: # Output as JSON if requested for row in rows: row["attachments"] = [ {k: v for k, v in attachment.items() if k != "response_id"} for attachment in attachments_by_id.get(row["id"], []) ] output = json.dumps(list(rows), indent=2) elif extract or extract_last: # Extract and return first code block for row in rows: output = extract_fenced_code_block(row["response"], last=extract_last) if output is not None: break elif response: # Just output the last response if rows: output = rows[-1]["response"] if output is not None: click.echo(output) else: # Output neatly formatted human-readable logs current_system = None should_show_conversation = True for row in rows: if short: system = _truncate_string(row["system"], 120, end=True) prompt = _truncate_string(row["prompt"], 120, end=True) cid = row["conversation_id"] attachments = attachments_by_id.get(row["id"]) obj = { "model": row["model"], "datetime": row["datetime_utc"].split(".")[0], "conversation": cid, } if system: obj["system"] = system if prompt: obj["prompt"] = prompt if attachments: items = [] for attachment in attachments: details = {"type": attachment["type"]} if attachment.get("path"): details["path"] = attachment["path"] if attachment.get("url"): details["url"] = attachment["url"] items.append(details) obj["attachments"] = items if usage and (row["input_tokens"] or row["output_tokens"]): usage_details = { "input": row["input_tokens"], "output": row["output_tokens"], } if row["token_details"]: usage_details["details"] = json.loads(row["token_details"]) obj["usage"] = usage_details click.echo(yaml.dump([obj], sort_keys=False).strip()) continue click.echo( "# {}{}\n{}".format( row["datetime_utc"].split(".")[0], ( " conversation: {} id: {}".format( row["conversation_id"], row["id"] ) if should_show_conversation else "" ), ( "\nModel: **{}**\n".format(row["model"]) if should_show_conversation else "" ), ) ) # In conversation log mode only show it for the first one if conversation_id: should_show_conversation = False click.echo("## Prompt\n\n{}".format(row["prompt"] or "-- none --")) if row["system"] != current_system: if row["system"] is not None: click.echo("\n## System\n\n{}".format(row["system"])) current_system = row["system"] if row["schema_json"]: click.echo( "\n## Schema\n\n```json\n{}\n```".format( json.dumps(row["schema_json"], indent=2) ) ) attachments = 
attachments_by_id.get(row["id"]) if attachments: click.echo("\n### Attachments\n") for i, attachment in enumerate(attachments, 1): if attachment["path"]: path = attachment["path"] click.echo( "{}. **{}**: `{}`".format(i, attachment["type"], path) ) elif attachment["url"]: click.echo( "{}. **{}**: {}".format( i, attachment["type"], attachment["url"] ) ) elif attachment["content_length"]: click.echo( "{}. **{}**: `<{} bytes>`".format( i, attachment["type"], f"{attachment['content_length']:,}", ) ) # If a schema was provided and the row is valid JSON, pretty print and syntax highlight it response = row["response"] if row["schema_json"]: try: parsed = json.loads(response) response = "```json\n{}\n```".format(json.dumps(parsed, indent=2)) except ValueError: pass click.echo("\n## Response\n\n{}\n".format(response)) if usage: token_usage = token_usage_string( row["input_tokens"], row["output_tokens"], json.loads(row["token_details"]) if row["token_details"] else None, ) if token_usage: click.echo("## Token usage:\n\n{}\n".format(token_usage)) @cli.group( cls=DefaultGroup, default="list", default_if_no_args=True, ) def models(): "Manage available models" _type_lookup = { "number": "float", "integer": "int", "string": "str", "object": "dict", } @models.command(name="list") @click.option( "--options", is_flag=True, help="Show options for each model, if available" ) @click.option("async_", "--async", is_flag=True, help="List async models") @click.option("--schemas", is_flag=True, help="List models that support schemas") @click.option( "-q", "--query", multiple=True, help="Search for models matching these strings", ) def models_list(options, async_, schemas, query): "List available models" models_that_have_shown_options = set() for model_with_aliases in get_models_with_aliases(): if async_ and not model_with_aliases.async_model: continue if query: # Only show models where every provided query string matches if not all(model_with_aliases.matches(q) for q in query): continue if schemas and not model_with_aliases.model.supports_schema: continue extra = "" if model_with_aliases.aliases: extra = " (aliases: {})".format(", ".join(model_with_aliases.aliases)) model = ( model_with_aliases.model if not async_ else model_with_aliases.async_model ) output = str(model) + extra if options and model.Options.model_json_schema()["properties"]: output += "\n Options:" for name, field in model.Options.model_json_schema()["properties"].items(): any_of = field.get("anyOf") if any_of is None: any_of = [{"type": field.get("type", "str")}] types = ", ".join( [ _type_lookup.get(item.get("type"), item.get("type", "str")) for item in any_of if item.get("type") != "null" ] ) bits = ["\n ", name, ": ", types] description = field.get("description", "") if description and ( model.__class__ not in models_that_have_shown_options ): wrapped = textwrap.wrap(description, 70) bits.append("\n ") bits.extend("\n ".join(wrapped)) output += "".join(bits) models_that_have_shown_options.add(model.__class__) if options and model.attachment_types: attachment_types = ", ".join(sorted(model.attachment_types)) wrapper = textwrap.TextWrapper( width=min(max(shutil.get_terminal_size().columns, 30), 70), initial_indent=" ", subsequent_indent=" ", ) output += "\n Attachment types:\n{}".format(wrapper.fill(attachment_types)) features = ( [] + (["streaming"] if model.can_stream else []) + (["schemas"] if model.supports_schema else []) + (["async"] if model_with_aliases.async_model else []) ) if options and features: output += "\n Features:\n{}".format( 
"\n".join(" - {}".format(feature) for feature in features) ) click.echo(output) if not query and not options and not schemas: click.echo(f"Default: {get_default_model()}") @models.command(name="default") @click.argument("model", required=False) def models_default(model): "Show or set the default model" if not model: click.echo(get_default_model()) return # Validate it is a known model try: model = get_model(model) set_default_model(model.model_id) except KeyError: raise click.ClickException("Unknown model: {}".format(model)) @cli.group( cls=DefaultGroup, default="list", default_if_no_args=True, ) def templates(): "Manage stored prompt templates" @templates.command(name="list") def templates_list(): "List available prompt templates" path = template_dir() pairs = [] for file in path.glob("*.yaml"): name = file.stem template = load_template(name) text = [] if template.system: text.append(f"system: {template.system}") if template.prompt: text.append(f" prompt: {template.prompt}") else: text = [template.prompt if template.prompt else ""] pairs.append((name, "".join(text).replace("\n", " "))) try: max_name_len = max(len(p[0]) for p in pairs) except ValueError: return else: fmt = "{name:<" + str(max_name_len) + "} : {prompt}" for name, prompt in sorted(pairs): text = fmt.format(name=name, prompt=prompt) click.echo(display_truncated(text)) @cli.group( cls=DefaultGroup, default="list", default_if_no_args=True, ) def schemas(): "Manage stored schemas" @schemas.command(name="list") @click.option( "-p", "--path", type=click.Path(readable=True, exists=True, dir_okay=False), help="Path to log database", ) @click.option( "queries", "-q", "--query", multiple=True, help="Search for schemas matching this string", ) @click.option("--full", is_flag=True, help="Output full schema contents") def schemas_list(path, queries, full): "List stored schemas" path = pathlib.Path(path or logs_db_path()) if not path.exists(): raise click.ClickException("No log database found at {}".format(path)) db = sqlite_utils.Database(path) migrate(db) params = [] where_sql = "" if queries: where_bits = ["schemas.content like ?" 
for _ in queries] where_sql += " where {}".format(" and ".join(where_bits)) params.extend("%{}%".format(q) for q in queries) sql = """ select schemas.id, schemas.content, max(responses.datetime_utc) as recently_used, count(*) as times_used from schemas join responses on responses.schema_id = schemas.id {} group by responses.schema_id order by recently_used """.format( where_sql ) rows = db.query(sql, params) for row in rows: click.echo("- id: {}".format(row["id"])) if full: click.echo( " schema: |\n{}".format( textwrap.indent( json.dumps(json.loads(row["content"]), indent=2), " " ) ) ) else: click.echo( " summary: |\n {}".format( schema_summary(json.loads(row["content"])) ) ) click.echo( " usage: |\n {} time{}, most recently {}".format( row["times_used"], "s" if row["times_used"] != 1 else "", row["recently_used"], ) ) @schemas.command(name="show") @click.argument("schema_id") @click.option( "-p", "--path", type=click.Path(readable=True, exists=True, dir_okay=False), help="Path to log database", ) def schemas_show(schema_id, path): "Show a stored schema" path = pathlib.Path(path or logs_db_path()) if not path.exists(): raise click.ClickException("No log database found at {}".format(path)) db = sqlite_utils.Database(path) migrate(db) try: row = db["schemas"].get(schema_id) except sqlite_utils.db.NotFoundError: raise click.ClickException("Invalid schema ID") click.echo(json.dumps(json.loads(row["content"]), indent=2)) @schemas.command(name="dsl") @click.argument("input") @click.option("--multi", is_flag=True, help="Wrap in an array") def schemas_dsl_debug(input, multi): """ Convert LLM's schema DSL to a JSON schema \b llm schema dsl 'name, age int, bio: their bio' """ schema = schema_dsl(input, multi) click.echo(json.dumps(schema, indent=2)) @cli.group( cls=DefaultGroup, default="list", default_if_no_args=True, ) def aliases(): "Manage model aliases" @aliases.command(name="list") @click.option("json_", "--json", is_flag=True, help="Output as JSON") def aliases_list(json_): "List current aliases" to_output = [] for alias, model in get_model_aliases().items(): if alias != model.model_id: to_output.append((alias, model.model_id, "")) for alias, embedding_model in get_embedding_model_aliases().items(): if alias != embedding_model.model_id: to_output.append((alias, embedding_model.model_id, "embedding")) if json_: click.echo( json.dumps({key: value for key, value, type_ in to_output}, indent=4) ) return max_alias_length = max(len(a) for a, _, _ in to_output) fmt = "{alias:<" + str(max_alias_length) + "} : {model_id}{type_}" for alias, model_id, type_ in to_output: click.echo( fmt.format( alias=alias, model_id=model_id, type_=f" ({type_})" if type_ else "" ) ) @aliases.command(name="set") @click.argument("alias") @click.argument("model_id", required=False) @click.option( "-q", "--query", multiple=True, help="Set alias for model matching these strings", ) def aliases_set(alias, model_id, query): """ Set an alias for a model Example usage: \b llm aliases set mini gpt-4o-mini Alternatively you can omit the model ID and specify one or more -q options. The first model matching all of those query strings will be used. 
\b llm aliases set mini -q 4o -q mini """ if not model_id: if not query: raise click.ClickException( "You must provide a model_id or at least one -q option" ) # Search for the first model matching all query strings found = None for model_with_aliases in get_models_with_aliases(): if all(model_with_aliases.matches(q) for q in query): found = model_with_aliases break if not found: raise click.ClickException( "No model found matching query: " + ", ".join(query) ) model_id = found.model.model_id set_alias(alias, model_id) click.echo( f"Alias '{alias}' set to model '{model_id}'", err=True, ) else: set_alias(alias, model_id) @aliases.command(name="remove") @click.argument("alias") def aliases_remove(alias): """ Remove an alias Example usage: \b $ llm aliases remove turbo """ try: remove_alias(alias) except KeyError as ex: raise click.ClickException(ex.args[0]) @aliases.command(name="path") def aliases_path(): "Output the path to the aliases.json file" click.echo(user_dir() / "aliases.json") @cli.command(name="plugins") @click.option("--all", help="Include built-in default plugins", is_flag=True) def plugins_list(all): "List installed plugins" click.echo(json.dumps(get_plugins(all), indent=2)) def display_truncated(text): console_width = shutil.get_terminal_size()[0] if len(text) > console_width: return text[: console_width - 3] + "..." else: return text @templates.command(name="show") @click.argument("name") def templates_show(name): "Show the specified prompt template" template = load_template(name) click.echo( yaml.dump( dict((k, v) for k, v in template.model_dump().items() if v is not None), indent=4, default_flow_style=False, ) ) @templates.command(name="edit") @click.argument("name") def templates_edit(name): "Edit the specified prompt template using the default $EDITOR" # First ensure it exists path = template_dir() / f"{name}.yaml" if not path.exists(): path.write_text(DEFAULT_TEMPLATE, "utf-8") click.edit(filename=path) # Validate that template load_template(name) @templates.command(name="path") def templates_path(): "Output the path to the templates directory" click.echo(template_dir()) @cli.command() @click.argument("packages", nargs=-1, required=False) @click.option( "-U", "--upgrade", is_flag=True, help="Upgrade packages to latest version" ) @click.option( "-e", "--editable", help="Install a project in editable mode from this path", ) @click.option( "--force-reinstall", is_flag=True, help="Reinstall all packages even if they are already up-to-date", ) @click.option( "--no-cache-dir", is_flag=True, help="Disable the cache", ) def install(packages, upgrade, editable, force_reinstall, no_cache_dir): """Install packages from PyPI into the same environment as LLM""" args = ["pip", "install"] if upgrade: args += ["--upgrade"] if editable: args += ["--editable", editable] if force_reinstall: args += ["--force-reinstall"] if no_cache_dir: args += ["--no-cache-dir"] args += list(packages) sys.argv = args run_module("pip", run_name="__main__") @cli.command() @click.argument("packages", nargs=-1, required=True) @click.option("-y", "--yes", is_flag=True, help="Don't ask for confirmation") def uninstall(packages, yes): """Uninstall Python packages from the LLM environment""" sys.argv = ["pip", "uninstall"] + list(packages) + (["-y"] if yes else []) run_module("pip", run_name="__main__") @cli.command() @click.argument("collection", required=False) @click.argument("id", required=False) @click.option( "-i", "--input", type=click.Path(exists=True, readable=True, allow_dash=True), help="File to 
embed", ) @click.option("-m", "--model", help="Embedding model to use") @click.option("--store", is_flag=True, help="Store the text itself in the database") @click.option( "-d", "--database", type=click.Path(file_okay=True, allow_dash=False, dir_okay=False, writable=True), envvar="LLM_EMBEDDINGS_DB", ) @click.option( "-c", "--content", help="Content to embed", ) @click.option("--binary", is_flag=True, help="Treat input as binary data") @click.option( "--metadata", help="JSON object metadata to store", callback=json_validator("metadata"), ) @click.option( "format_", "-f", "--format", type=click.Choice(["json", "blob", "base64", "hex"]), help="Output format", ) def embed( collection, id, input, model, store, database, content, binary, metadata, format_ ): """Embed text and store or return the result""" if collection and not id: raise click.ClickException("Must provide both collection and id") if store and not collection: raise click.ClickException("Must provide collection when using --store") # Lazy load this because we do not need it for -c or -i versions def get_db(): if database: return sqlite_utils.Database(database) else: return sqlite_utils.Database(user_dir() / "embeddings.db") collection_obj = None model_obj = None if collection: db = get_db() if Collection.exists(db, collection): # Load existing collection and use its model collection_obj = Collection(collection, db) model_obj = collection_obj.model() else: # We will create a new one, but that means model is required if not model: model = get_default_embedding_model() if model is None: raise click.ClickException( "You need to specify an embedding model (no default model is set)" ) collection_obj = Collection(collection, db=db, model_id=model) model_obj = collection_obj.model() if model_obj is None: if model is None: model = get_default_embedding_model() try: model_obj = get_embedding_model(model) except UnknownModelError: raise click.ClickException( "You need to specify an embedding model (no default model is set)" ) show_output = True if collection and (format_ is None): show_output = False # Resolve input text if not content: if not input or input == "-": # Read from stdin input_source = sys.stdin.buffer if binary else sys.stdin content = input_source.read() else: mode = "rb" if binary else "r" with open(input, mode) as f: content = f.read() if not content: raise click.ClickException("No content provided") if collection_obj: embedding = collection_obj.embed(id, content, metadata=metadata, store=store) else: embedding = model_obj.embed(content) if show_output: if format_ == "json" or format_ is None: click.echo(json.dumps(embedding)) elif format_ == "blob": click.echo(encode(embedding)) elif format_ == "base64": click.echo(base64.b64encode(encode(embedding)).decode("ascii")) elif format_ == "hex": click.echo(encode(embedding).hex()) @cli.command() @click.argument("collection") @click.argument( "input_path", type=click.Path(exists=True, dir_okay=False, allow_dash=True, readable=True), required=False, ) @click.option( "--format", type=click.Choice(["json", "csv", "tsv", "nl"]), help="Format of input file - defaults to auto-detect", ) @click.option( "--files", type=(click.Path(file_okay=False, dir_okay=True, allow_dash=False), str), multiple=True, help="Embed files in this directory - specify directory and glob pattern", ) @click.option( "encodings", "--encoding", help="Encoding to use when reading --files", multiple=True, ) @click.option("--binary", is_flag=True, help="Treat --files as binary data") @click.option("--sql", help="Read 
input using this SQL query") @click.option( "--attach", type=(str, click.Path(file_okay=True, dir_okay=False, allow_dash=False)), multiple=True, help="Additional databases to attach - specify alias and file path", ) @click.option( "--batch-size", type=int, help="Batch size to use when running embeddings" ) @click.option("--prefix", help="Prefix to add to the IDs", default="") @click.option("-m", "--model", help="Embedding model to use") @click.option( "--prepend", help="Prepend this string to all content before embedding", ) @click.option("--store", is_flag=True, help="Store the text itself in the database") @click.option( "-d", "--database", type=click.Path(file_okay=True, allow_dash=False, dir_okay=False, writable=True), envvar="LLM_EMBEDDINGS_DB", ) def embed_multi( collection, input_path, format, files, encodings, binary, sql, attach, batch_size, prefix, model, prepend, store, database, ): """ Store embeddings for multiple strings at once Input can be CSV, TSV or a JSON list of objects. The first column is treated as an ID - all other columns are assumed to be text that should be concatenated together in order to calculate the embeddings. Input data can come from one of three sources: \b 1. A CSV, JSON, TSV or JSON-nl file (including on standard input) 2. A SQL query against a SQLite database 3. A directory of files """ if binary and not files: raise click.UsageError("--binary must be used with --files") if binary and encodings: raise click.UsageError("--binary cannot be used with --encoding") if not input_path and not sql and not files: raise click.UsageError("Either --sql or input path or --files is required") if files: if input_path or sql or format: raise click.UsageError( "Cannot use --files with --sql, input path or --format" ) if database: db = sqlite_utils.Database(database) else: db = sqlite_utils.Database(user_dir() / "embeddings.db") for alias, attach_path in attach: db.attach(alias, attach_path) try: collection_obj = Collection( collection, db=db, model_id=model or get_default_embedding_model() ) except ValueError: raise click.ClickException( "You need to specify an embedding model (no default model is set)" ) expected_length = None if files: encodings = encodings or ("utf-8", "latin-1") def count_files(): i = 0 for directory, pattern in files: for path in pathlib.Path(directory).glob(pattern): i += 1 return i def iterate_files(): for directory, pattern in files: p = pathlib.Path(directory) if not p.exists() or not p.is_dir(): # fixes issue/274 - raise error if directory does not exist raise click.UsageError(f"Invalid directory: {directory}") for path in pathlib.Path(directory).glob(pattern): if path.is_dir(): continue # fixed issue/280 - skip directories relative = path.relative_to(directory) content = None if binary: content = path.read_bytes() else: for encoding in encodings: try: content = path.read_text(encoding=encoding) except UnicodeDecodeError: continue if content is None: # Log to stderr click.echo( "Could not decode text in file {}".format(path), err=True, ) else: yield {"id": str(relative), "content": content} expected_length = count_files() rows = iterate_files() elif sql: rows = db.query(sql) count_sql = "select count(*) as c from ({})".format(sql) expected_length = next(db.query(count_sql))["c"] else: def load_rows(fp): return rows_from_file(fp, Format[format.upper()] if format else None)[0] try: if input_path != "-": # Read the file twice - first time is to get a count expected_length = 0 with open(input_path, "rb") as fp: for _ in load_rows(fp): 
expected_length += 1 rows = load_rows( open(input_path, "rb") if input_path != "-" else io.BufferedReader(sys.stdin.buffer) ) except json.JSONDecodeError as ex: raise click.ClickException(str(ex)) with click.progressbar( rows, label="Embedding", show_percent=True, length=expected_length ) as rows: def tuples() -> Iterable[Tuple[str, Union[bytes, str]]]: for row in rows: values = list(row.values()) id: str = prefix + str(values[0]) content: Optional[Union[bytes, str]] = None if binary: content = cast(bytes, values[1]) else: content = " ".join(v or "" for v in values[1:]) if prepend and isinstance(content, str): content = prepend + content yield id, content or "" embed_kwargs = {"store": store} if batch_size: embed_kwargs["batch_size"] = batch_size collection_obj.embed_multi(tuples(), **embed_kwargs) @cli.command() @click.argument("collection") @click.argument("id", required=False) @click.option( "-i", "--input", type=click.Path(exists=True, readable=True, allow_dash=True), help="File to embed for comparison", ) @click.option("-c", "--content", help="Content to embed for comparison") @click.option("--binary", is_flag=True, help="Treat input as binary data") @click.option( "-n", "--number", type=int, default=10, help="Number of results to return" ) @click.option( "-d", "--database", type=click.Path(file_okay=True, allow_dash=False, dir_okay=False, writable=True), envvar="LLM_EMBEDDINGS_DB", ) def similar(collection, id, input, content, binary, number, database): """ Return top N similar IDs from a collection using cosine similarity. Example usage: \b llm similar my-collection -c "I like cats" Or to find content similar to a specific stored ID: \b llm similar my-collection 1234 """ if not id and not content and not input: raise click.ClickException("Must provide content or an ID for the comparison") if database: db = sqlite_utils.Database(database) else: db = sqlite_utils.Database(user_dir() / "embeddings.db") if not db["embeddings"].exists(): raise click.ClickException("No embeddings table found in database") try: collection_obj = Collection(collection, db, create=False) except Collection.DoesNotExist: raise click.ClickException("Collection does not exist") if id: try: results = collection_obj.similar_by_id(id, number) except Collection.DoesNotExist: raise click.ClickException("ID not found in collection") else: # Resolve input text if not content: if not input or input == "-": # Read from stdin input_source = sys.stdin.buffer if binary else sys.stdin content = input_source.read() else: mode = "rb" if binary else "r" with open(input, mode) as f: content = f.read() if not content: raise click.ClickException("No content provided") results = collection_obj.similar(content, number) for result in results: click.echo(json.dumps(asdict(result))) @cli.group( cls=DefaultGroup, default="list", default_if_no_args=True, ) def embed_models(): "Manage available embedding models" @embed_models.command(name="list") @click.option( "-q", "--query", multiple=True, help="Search for embedding models matching these strings", ) def embed_models_list(query): "List available embedding models" output = [] for model_with_aliases in get_embedding_models_with_aliases(): if query: if not all(model_with_aliases.matches(q) for q in query): continue s = str(model_with_aliases.model) if model_with_aliases.aliases: s += " (aliases: {})".format(", ".join(model_with_aliases.aliases)) output.append(s) click.echo("\n".join(output)) @embed_models.command(name="default") @click.argument("model", required=False) @click.option( 
"--remove-default", is_flag=True, help="Reset to specifying no default model" ) def embed_models_default(model, remove_default): "Show or set the default embedding model" if not model and not remove_default: default = get_default_embedding_model() if default is None: click.echo("", err=True) else: click.echo(default) return # Validate it is a known model try: if remove_default: set_default_embedding_model(None) else: model = get_embedding_model(model) set_default_embedding_model(model.model_id) except KeyError: raise click.ClickException("Unknown embedding model: {}".format(model)) @cli.group( cls=DefaultGroup, default="list", default_if_no_args=True, ) def collections(): "View and manage collections of embeddings" @collections.command(name="path") def collections_path(): "Output the path to the embeddings database" click.echo(user_dir() / "embeddings.db") @collections.command(name="list") @click.option( "-d", "--database", type=click.Path(file_okay=True, allow_dash=False, dir_okay=False, writable=True), envvar="LLM_EMBEDDINGS_DB", help="Path to embeddings database", ) @click.option("json_", "--json", is_flag=True, help="Output as JSON") def embed_db_collections(database, json_): "View a list of collections" database = database or (user_dir() / "embeddings.db") db = sqlite_utils.Database(str(database)) if not db["collections"].exists(): raise click.ClickException("No collections table found in {}".format(database)) rows = db.query( """ select collections.name, collections.model, count(embeddings.id) as num_embeddings from collections left join embeddings on collections.id = embeddings.collection_id group by collections.name, collections.model """ ) if json_: click.echo(json.dumps(list(rows), indent=4)) else: for row in rows: click.echo("{}: {}".format(row["name"], row["model"])) click.echo( " {} embedding{}".format( row["num_embeddings"], "s" if row["num_embeddings"] != 1 else "" ) ) @collections.command(name="delete") @click.argument("collection") @click.option( "-d", "--database", type=click.Path(file_okay=True, allow_dash=False, dir_okay=False, writable=True), envvar="LLM_EMBEDDINGS_DB", help="Path to embeddings database", ) def collections_delete(collection, database): """ Delete the specified collection Example usage: \b llm collections delete my-collection """ database = database or (user_dir() / "embeddings.db") db = sqlite_utils.Database(str(database)) try: collection_obj = Collection(collection, db, create=False) except Collection.DoesNotExist: raise click.ClickException("Collection does not exist") collection_obj.delete() def template_dir(): path = user_dir() / "templates" path.mkdir(parents=True, exist_ok=True) return path def _truncate_string(s, max_length=100, end=False): if not s: return s if end: s = re.sub(r"\s+", " ", s) if len(s) <= max_length: return s return s[: max_length - 3] + "..." if len(s) <= max_length: return s return s[: max_length - 3] + "..." 
def logs_db_path(): return user_dir() / "logs.db" def load_template(name): path = template_dir() / f"{name}.yaml" if not path.exists(): raise click.ClickException(f"Invalid template: {name}") try: loaded = yaml.safe_load(path.read_text()) except yaml.YAMLError as ex: raise click.ClickException("Invalid YAML: {}".format(str(ex))) if isinstance(loaded, str): return Template(name=name, prompt=loaded) loaded["name"] = name try: return Template(**loaded) except pydantic.ValidationError as ex: msg = "A validation error occurred:\n" msg += render_errors(ex.errors()) raise click.ClickException(msg) def get_history(chat_id): if chat_id is None: return None, [] log_path = logs_db_path() db = sqlite_utils.Database(log_path) migrate(db) if chat_id == -1: # Return the most recent chat last_row = list(db["logs"].rows_where(order_by="-id", limit=1)) if last_row: chat_id = last_row[0].get("chat_id") or last_row[0].get("id") else: # Database is empty return None, [] rows = db["logs"].rows_where( "id = ? or chat_id = ?", [chat_id, chat_id], order_by="id" ) return chat_id, rows def render_errors(errors): output = [] for error in errors: output.append(", ".join(error["loc"])) output.append(" " + error["msg"]) return "\n".join(output) load_plugins() pm.hook.register_commands(cli=cli) def _human_readable_size(size_bytes): if size_bytes == 0: return "0B" size_name = ("B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB") i = 0 while size_bytes >= 1024 and i < len(size_name) - 1: size_bytes /= 1024.0 i += 1 return "{:.2f}{}".format(size_bytes, size_name[i]) def logs_on(): return not (user_dir() / "logs-off").exists() ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1740761919.8158307 llm-0.23/llm/default_plugins/0000755000175100001660000000000014760365500015604 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/default_plugins/__init__.py0000644000175100001660000000000014760365472017713 0ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/default_plugins/openai_models.py0000644000175100001660000006470114760365472021014 0ustar00runnerdockerfrom llm import AsyncKeyModel, EmbeddingModel, KeyModel, hookimpl import llm from llm.utils import ( dicts_to_table_string, remove_dict_none_values, logging_client, simplify_usage_dict, ) import click import datetime from enum import Enum import httpx import openai import os from pydantic import field_validator, Field from typing import AsyncGenerator, List, Iterable, Iterator, Optional, Union import json import yaml @hookimpl def register_models(register): # GPT-4o register( Chat("gpt-4o", vision=True, supports_schema=True), AsyncChat("gpt-4o", vision=True, supports_schema=True), aliases=("4o",), ) register( Chat("chatgpt-4o-latest", vision=True), AsyncChat("chatgpt-4o-latest", vision=True), aliases=("chatgpt-4o",), ) register( Chat("gpt-4o-mini", vision=True, supports_schema=True), AsyncChat("gpt-4o-mini", vision=True, supports_schema=True), aliases=("4o-mini",), ) for audio_model_id in ( "gpt-4o-audio-preview", "gpt-4o-audio-preview-2024-12-17", "gpt-4o-audio-preview-2024-10-01", "gpt-4o-mini-audio-preview", "gpt-4o-mini-audio-preview-2024-12-17", ): register( Chat(audio_model_id, audio=True), AsyncChat(audio_model_id, audio=True), ) # 3.5 and 4 register( Chat("gpt-3.5-turbo"), AsyncChat("gpt-3.5-turbo"), aliases=("3.5", "chatgpt") ) register( Chat("gpt-3.5-turbo-16k"), 
AsyncChat("gpt-3.5-turbo-16k"), aliases=("chatgpt-16k", "3.5-16k"), ) register(Chat("gpt-4"), AsyncChat("gpt-4"), aliases=("4", "gpt4")) register(Chat("gpt-4-32k"), AsyncChat("gpt-4-32k"), aliases=("4-32k",)) # GPT-4 Turbo models register(Chat("gpt-4-1106-preview"), AsyncChat("gpt-4-1106-preview")) register(Chat("gpt-4-0125-preview"), AsyncChat("gpt-4-0125-preview")) register(Chat("gpt-4-turbo-2024-04-09"), AsyncChat("gpt-4-turbo-2024-04-09")) register( Chat("gpt-4-turbo"), AsyncChat("gpt-4-turbo"), aliases=("gpt-4-turbo-preview", "4-turbo", "4t"), ) # GPT-4.5 register( Chat("gpt-4.5-preview-2025-02-27", vision=True, supports_schema=True), AsyncChat("gpt-4.5-preview-2025-02-27", vision=True, supports_schema=True), ) register( Chat("gpt-4.5-preview", vision=True, supports_schema=True), AsyncChat("gpt-4.5-preview", vision=True, supports_schema=True), aliases=("gpt-4.5",), ) # o1 for model_id in ("o1", "o1-2024-12-17"): register( Chat( model_id, vision=True, can_stream=False, reasoning=True, supports_schema=True, ), AsyncChat( model_id, vision=True, can_stream=False, reasoning=True, supports_schema=True, ), ) register( Chat("o1-preview", allows_system_prompt=False), AsyncChat("o1-preview", allows_system_prompt=False), ) register( Chat("o1-mini", allows_system_prompt=False), AsyncChat("o1-mini", allows_system_prompt=False), ) register( Chat("o3-mini", reasoning=True, supports_schema=True), AsyncChat("o3-mini", reasoning=True, supports_schema=True), ) # The -instruct completion model register( Completion("gpt-3.5-turbo-instruct", default_max_tokens=256), aliases=("3.5-instruct", "chatgpt-instruct"), ) # Load extra models extra_path = llm.user_dir() / "extra-openai-models.yaml" if not extra_path.exists(): return with open(extra_path) as f: extra_models = yaml.safe_load(f) for extra_model in extra_models: model_id = extra_model["model_id"] aliases = extra_model.get("aliases", []) model_name = extra_model["model_name"] api_base = extra_model.get("api_base") api_type = extra_model.get("api_type") api_version = extra_model.get("api_version") api_engine = extra_model.get("api_engine") headers = extra_model.get("headers") reasoning = extra_model.get("reasoning") kwargs = {} if extra_model.get("can_stream") is False: kwargs["can_stream"] = False if extra_model.get("completion"): klass = Completion else: klass = Chat chat_model = klass( model_id, model_name=model_name, api_base=api_base, api_type=api_type, api_version=api_version, api_engine=api_engine, headers=headers, reasoning=reasoning, **kwargs, ) if api_base: chat_model.needs_key = None if extra_model.get("api_key_name"): chat_model.needs_key = extra_model["api_key_name"] register( chat_model, aliases=aliases, ) @hookimpl def register_embedding_models(register): register( OpenAIEmbeddingModel("text-embedding-ada-002", "text-embedding-ada-002"), aliases=( "ada", "ada-002", ), ) register( OpenAIEmbeddingModel("text-embedding-3-small", "text-embedding-3-small"), aliases=("3-small",), ) register( OpenAIEmbeddingModel("text-embedding-3-large", "text-embedding-3-large"), aliases=("3-large",), ) # With varying dimensions register( OpenAIEmbeddingModel( "text-embedding-3-small-512", "text-embedding-3-small", 512 ), aliases=("3-small-512",), ) register( OpenAIEmbeddingModel( "text-embedding-3-large-256", "text-embedding-3-large", 256 ), aliases=("3-large-256",), ) register( OpenAIEmbeddingModel( "text-embedding-3-large-1024", "text-embedding-3-large", 1024 ), aliases=("3-large-1024",), ) class OpenAIEmbeddingModel(EmbeddingModel): needs_key = "openai" 
key_env_var = "OPENAI_API_KEY" batch_size = 100 def __init__(self, model_id, openai_model_id, dimensions=None): self.model_id = model_id self.openai_model_id = openai_model_id self.dimensions = dimensions def embed_batch(self, items: Iterable[Union[str, bytes]]) -> Iterator[List[float]]: kwargs = { "input": items, "model": self.openai_model_id, } if self.dimensions: kwargs["dimensions"] = self.dimensions client = openai.OpenAI(api_key=self.get_key()) results = client.embeddings.create(**kwargs).data return ([float(r) for r in result.embedding] for result in results) @hookimpl def register_commands(cli): @cli.group(name="openai") def openai_(): "Commands for working directly with the OpenAI API" @openai_.command() @click.option("json_", "--json", is_flag=True, help="Output as JSON") @click.option("--key", help="OpenAI API key") def models(json_, key): "List models available to you from the OpenAI API" from llm import get_key api_key = get_key(key, "openai", "OPENAI_API_KEY") response = httpx.get( "https://api.openai.com/v1/models", headers={"Authorization": f"Bearer {api_key}"}, ) if response.status_code != 200: raise click.ClickException( f"Error {response.status_code} from OpenAI API: {response.text}" ) models = response.json()["data"] if json_: click.echo(json.dumps(models, indent=4)) else: to_print = [] for model in models: # Print id, owned_by, root, created as ISO 8601 created_str = datetime.datetime.fromtimestamp( model["created"], datetime.timezone.utc ).isoformat() to_print.append( { "id": model["id"], "owned_by": model["owned_by"], "created": created_str, } ) done = dicts_to_table_string("id owned_by created".split(), to_print) print("\n".join(done)) class SharedOptions(llm.Options): temperature: Optional[float] = Field( description=( "What sampling temperature to use, between 0 and 2. Higher values like " "0.8 will make the output more random, while lower values like 0.2 will " "make it more focused and deterministic." ), ge=0, le=2, default=None, ) max_tokens: Optional[int] = Field( description="Maximum number of tokens to generate.", default=None ) top_p: Optional[float] = Field( description=( "An alternative to sampling with temperature, called nucleus sampling, " "where the model considers the results of the tokens with top_p " "probability mass. So 0.1 means only the tokens comprising the top " "10% probability mass are considered. Recommended to use top_p or " "temperature but not both." ), ge=0, le=1, default=None, ) frequency_penalty: Optional[float] = Field( description=( "Number between -2.0 and 2.0. Positive values penalize new tokens based " "on their existing frequency in the text so far, decreasing the model's " "likelihood to repeat the same line verbatim." ), ge=-2, le=2, default=None, ) presence_penalty: Optional[float] = Field( description=( "Number between -2.0 and 2.0. Positive values penalize new tokens based " "on whether they appear in the text so far, increasing the model's " "likelihood to talk about new topics." ), ge=-2, le=2, default=None, ) stop: Optional[str] = Field( description=("A string where the API will stop generating further tokens."), default=None, ) logit_bias: Optional[Union[dict, str]] = Field( description=( "Modify the likelihood of specified tokens appearing in the completion. 
" 'Pass a JSON string like \'{"1712":-100, "892":-100, "1489":-100}\'' ), default=None, ) seed: Optional[int] = Field( description="Integer seed to attempt to sample deterministically", default=None, ) @field_validator("logit_bias") def validate_logit_bias(cls, logit_bias): if logit_bias is None: return None if isinstance(logit_bias, str): try: logit_bias = json.loads(logit_bias) except json.JSONDecodeError: raise ValueError("Invalid JSON in logit_bias string") validated_logit_bias = {} for key, value in logit_bias.items(): try: int_key = int(key) int_value = int(value) if -100 <= int_value <= 100: validated_logit_bias[int_key] = int_value else: raise ValueError("Value must be between -100 and 100") except ValueError: raise ValueError("Invalid key-value pair in logit_bias dictionary") return validated_logit_bias class ReasoningEffortEnum(str, Enum): low = "low" medium = "medium" high = "high" class OptionsForReasoning(SharedOptions): json_object: Optional[bool] = Field( description="Output a valid JSON object {...}. Prompt must mention JSON.", default=None, ) reasoning_effort: Optional[ReasoningEffortEnum] = Field( description=( "Constraints effort on reasoning for reasoning models. Currently supported " "values are low, medium, and high. Reducing reasoning effort can result in " "faster responses and fewer tokens used on reasoning in a response." ), default=None, ) def _attachment(attachment): url = attachment.url base64_content = "" if not url or attachment.resolve_type().startswith("audio/"): base64_content = attachment.base64_content() url = f"data:{attachment.resolve_type()};base64,{base64_content}" if attachment.resolve_type().startswith("image/"): return {"type": "image_url", "image_url": {"url": url}} else: format_ = "wav" if attachment.resolve_type() == "audio/wav" else "mp3" return { "type": "input_audio", "input_audio": { "data": base64_content, "format": format_, }, } class _Shared: def __init__( self, model_id, key=None, model_name=None, api_base=None, api_type=None, api_version=None, api_engine=None, headers=None, can_stream=True, vision=False, audio=False, reasoning=False, supports_schema=False, allows_system_prompt=True, ): self.model_id = model_id self.key = key self.supports_schema = supports_schema self.model_name = model_name self.api_base = api_base self.api_type = api_type self.api_version = api_version self.api_engine = api_engine self.headers = headers self.can_stream = can_stream self.vision = vision self.allows_system_prompt = allows_system_prompt self.attachment_types = set() if reasoning: self.Options = OptionsForReasoning if vision: self.attachment_types.update( { "image/png", "image/jpeg", "image/webp", "image/gif", } ) if audio: self.attachment_types.update( { "audio/wav", "audio/mpeg", } ) def __str__(self): return "OpenAI Chat: {}".format(self.model_id) def build_messages(self, prompt, conversation): messages = [] current_system = None if conversation is not None: for prev_response in conversation.responses: if ( prev_response.prompt.system and prev_response.prompt.system != current_system ): messages.append( {"role": "system", "content": prev_response.prompt.system} ) current_system = prev_response.prompt.system if prev_response.attachments: attachment_message = [] if prev_response.prompt.prompt: attachment_message.append( {"type": "text", "text": prev_response.prompt.prompt} ) for attachment in prev_response.attachments: attachment_message.append(_attachment(attachment)) messages.append({"role": "user", "content": attachment_message}) else: 
messages.append( {"role": "user", "content": prev_response.prompt.prompt} ) messages.append( {"role": "assistant", "content": prev_response.text_or_raise()} ) if prompt.system and prompt.system != current_system: messages.append({"role": "system", "content": prompt.system}) if not prompt.attachments: messages.append({"role": "user", "content": prompt.prompt or ""}) else: attachment_message = [] if prompt.prompt: attachment_message.append({"type": "text", "text": prompt.prompt}) for attachment in prompt.attachments: attachment_message.append(_attachment(attachment)) messages.append({"role": "user", "content": attachment_message}) return messages def set_usage(self, response, usage): if not usage: return input_tokens = usage.pop("prompt_tokens") output_tokens = usage.pop("completion_tokens") usage.pop("total_tokens") response.set_usage( input=input_tokens, output=output_tokens, details=simplify_usage_dict(usage) ) def get_client(self, key, *, async_=False): kwargs = {} if self.api_base: kwargs["base_url"] = self.api_base if self.api_type: kwargs["api_type"] = self.api_type if self.api_version: kwargs["api_version"] = self.api_version if self.api_engine: kwargs["engine"] = self.api_engine if self.needs_key: kwargs["api_key"] = self.get_key(key) else: # OpenAI-compatible models don't need a key, but the # openai client library requires one kwargs["api_key"] = "DUMMY_KEY" if self.headers: kwargs["default_headers"] = self.headers if os.environ.get("LLM_OPENAI_SHOW_RESPONSES"): kwargs["http_client"] = logging_client() if async_: return openai.AsyncOpenAI(**kwargs) else: return openai.OpenAI(**kwargs) def build_kwargs(self, prompt, stream): kwargs = dict(not_nulls(prompt.options)) json_object = kwargs.pop("json_object", None) if "max_tokens" not in kwargs and self.default_max_tokens is not None: kwargs["max_tokens"] = self.default_max_tokens if json_object: kwargs["response_format"] = {"type": "json_object"} if prompt.schema: kwargs["response_format"] = { "type": "json_schema", "json_schema": {"name": "output", "schema": prompt.schema}, } if stream: kwargs["stream_options"] = {"include_usage": True} return kwargs class Chat(_Shared, KeyModel): needs_key = "openai" key_env_var = "OPENAI_API_KEY" default_max_tokens = None class Options(SharedOptions): json_object: Optional[bool] = Field( description="Output a valid JSON object {...}. 
Prompt must mention JSON.", default=None, ) def execute(self, prompt, stream, response, conversation=None, key=None): if prompt.system and not self.allows_system_prompt: raise NotImplementedError("Model does not support system prompts") messages = self.build_messages(prompt, conversation) kwargs = self.build_kwargs(prompt, stream) client = self.get_client(key) usage = None if stream: completion = client.chat.completions.create( model=self.model_name or self.model_id, messages=messages, stream=True, **kwargs, ) chunks = [] for chunk in completion: chunks.append(chunk) if chunk.usage: usage = chunk.usage.model_dump() try: content = chunk.choices[0].delta.content except IndexError: content = None if content is not None: yield content response.response_json = remove_dict_none_values(combine_chunks(chunks)) else: completion = client.chat.completions.create( model=self.model_name or self.model_id, messages=messages, stream=False, **kwargs, ) usage = completion.usage.model_dump() response.response_json = remove_dict_none_values(completion.model_dump()) yield completion.choices[0].message.content self.set_usage(response, usage) response._prompt_json = redact_data({"messages": messages}) class AsyncChat(_Shared, AsyncKeyModel): needs_key = "openai" key_env_var = "OPENAI_API_KEY" default_max_tokens = None class Options(SharedOptions): json_object: Optional[bool] = Field( description="Output a valid JSON object {...}. Prompt must mention JSON.", default=None, ) async def execute( self, prompt, stream, response, conversation=None, key=None ) -> AsyncGenerator[str, None]: if prompt.system and not self.allows_system_prompt: raise NotImplementedError("Model does not support system prompts") messages = self.build_messages(prompt, conversation) kwargs = self.build_kwargs(prompt, stream) client = self.get_client(key, async_=True) usage = None if stream: completion = await client.chat.completions.create( model=self.model_name or self.model_id, messages=messages, stream=True, **kwargs, ) chunks = [] async for chunk in completion: if chunk.usage: usage = chunk.usage.model_dump() chunks.append(chunk) try: content = chunk.choices[0].delta.content except IndexError: content = None if content is not None: yield content response.response_json = remove_dict_none_values(combine_chunks(chunks)) else: completion = await client.chat.completions.create( model=self.model_name or self.model_id, messages=messages, stream=False, **kwargs, ) response.response_json = remove_dict_none_values(completion.model_dump()) usage = completion.usage.model_dump() yield completion.choices[0].message.content self.set_usage(response, usage) response._prompt_json = redact_data({"messages": messages}) class Completion(Chat): class Options(SharedOptions): logprobs: Optional[int] = Field( description="Include the log probabilities of most likely N per token", default=None, le=5, ) def __init__(self, *args, default_max_tokens=None, **kwargs): super().__init__(*args, **kwargs) self.default_max_tokens = default_max_tokens def __str__(self): return "OpenAI Completion: {}".format(self.model_id) def execute(self, prompt, stream, response, conversation=None, key=None): if prompt.system: raise NotImplementedError( "System prompts are not supported for OpenAI completion models" ) messages = [] if conversation is not None: for prev_response in conversation.responses: messages.append(prev_response.prompt.prompt) messages.append(prev_response.text()) messages.append(prompt.prompt) kwargs = self.build_kwargs(prompt, stream) client = self.get_client(key) if 
stream: completion = client.completions.create( model=self.model_name or self.model_id, prompt="\n".join(messages), stream=True, **kwargs, ) chunks = [] for chunk in completion: chunks.append(chunk) try: content = chunk.choices[0].text except IndexError: content = None if content is not None: yield content combined = combine_chunks(chunks) cleaned = remove_dict_none_values(combined) response.response_json = cleaned else: completion = client.completions.create( model=self.model_name or self.model_id, prompt="\n".join(messages), stream=False, **kwargs, ) response.response_json = remove_dict_none_values(completion.model_dump()) yield completion.choices[0].text response._prompt_json = redact_data({"messages": messages}) def not_nulls(data) -> dict: return {key: value for key, value in data if value is not None} def combine_chunks(chunks: List) -> dict: content = "" role = None finish_reason = None # If any of them have log probability, we're going to persist # those later on logprobs = [] usage = {} for item in chunks: if item.usage: usage = item.usage.dict() for choice in item.choices: if choice.logprobs and hasattr(choice.logprobs, "top_logprobs"): logprobs.append( { "text": choice.text if hasattr(choice, "text") else None, "top_logprobs": choice.logprobs.top_logprobs, } ) if not hasattr(choice, "delta"): content += choice.text continue role = choice.delta.role if choice.delta.content is not None: content += choice.delta.content if choice.finish_reason is not None: finish_reason = choice.finish_reason # Imitations of the OpenAI API may be missing some of these fields combined = { "content": content, "role": role, "finish_reason": finish_reason, "usage": usage, } if logprobs: combined["logprobs"] = logprobs if chunks: for key in ("id", "object", "model", "created", "index"): value = getattr(chunks[0], key, None) if value is not None: combined[key] = value return combined def redact_data(input_dict): """ Recursively search through the input dictionary for any 'image_url' keys and modify the 'url' value to be just 'data:...'. Also redact input_audio.data keys """ if isinstance(input_dict, dict): for key, value in input_dict.items(): if ( key == "image_url" and isinstance(value, dict) and "url" in value and value["url"].startswith("data:") ): value["url"] = "data:..." elif key == "input_audio" and isinstance(value, dict) and "data" in value: value["data"] = "..." else: redact_data(value) elif isinstance(input_dict, list): for item in input_dict: redact_data(item) return input_dict ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/embeddings.py0000644000175100001660000002754314760365472015115 0ustar00runnerdockerfrom .models import EmbeddingModel from .embeddings_migrations import embeddings_migrations from dataclasses import dataclass import hashlib from itertools import islice import json from sqlite_utils import Database from sqlite_utils.db import Table import time from typing import cast, Any, Dict, Iterable, List, Optional, Tuple, Union @dataclass class Entry: id: str score: Optional[float] content: Optional[str] = None metadata: Optional[Dict[str, Any]] = None class Collection: class DoesNotExist(Exception): pass def __init__( self, name: str, db: Optional[Database] = None, *, model: Optional[EmbeddingModel] = None, model_id: Optional[str] = None, create: bool = True, ) -> None: """ A collection of embeddings Returns the collection with the given name, creating it if it does not exist. 
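Example (an illustrative sketch, assuming this class is exposed as
llm.Collection and that the "3-small" embedding model alias is available):

    import sqlite_utils
    import llm

    db = sqlite_utils.Database("embeddings.db")
    collection = llm.Collection("entries", db, model_id="3-small")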
If you set create=False a Collection.DoesNotExist exception will be raised if the collection does not already exist. Args: db (sqlite_utils.Database): Database to store the collection in name (str): Name of the collection model (llm.models.EmbeddingModel, optional): Embedding model to use model_id (str, optional): Alternatively, ID of the embedding model to use create (bool, optional): Whether to create the collection if it does not exist """ import llm self.db = db or Database(memory=True) self.name = name self._model = model embeddings_migrations.apply(self.db) rows = list(self.db["collections"].rows_where("name = ?", [self.name])) if rows: row = rows[0] self.id = row["id"] self.model_id = row["model"] else: if create: # Collection does not exist, so model or model_id is required if not model and not model_id: raise ValueError( "Either model= or model_id= must be provided when creating a new collection" ) # Create it if model_id: # Resolve alias model = llm.get_embedding_model(model_id) self._model = model model_id = cast(EmbeddingModel, model).model_id self.id = ( cast(Table, self.db["collections"]) .insert( { "name": self.name, "model": model_id, } ) .last_pk ) else: raise self.DoesNotExist(f"Collection '{name}' does not exist") def model(self) -> EmbeddingModel: "Return the embedding model used by this collection" import llm if self._model is None: self._model = llm.get_embedding_model(self.model_id) return cast(EmbeddingModel, self._model) def count(self) -> int: """ Count the number of items in the collection. Returns: int: Number of items in the collection """ return next( self.db.query( """ select count(*) as c from embeddings where collection_id = ( select id from collections where name = ? ) """, (self.name,), ) )["c"] def embed( self, id: str, value: Union[str, bytes], metadata: Optional[Dict[str, Any]] = None, store: bool = False, ) -> None: """ Embed value and store it in the collection with a given ID. Args: id (str): ID for the value value (str or bytes): value to be embedded metadata (dict, optional): Metadata to be stored store (bool, optional): Whether to store the value in the content or content_blob column """ from llm import encode content_hash = self.content_hash(value) if self.db["embeddings"].count_where( "content_hash = ? and collection_id = ?", [content_hash, self.id] ): return embedding = self.model().embed(value) cast(Table, self.db["embeddings"]).insert( { "collection_id": self.id, "id": id, "embedding": encode(embedding), "content": value if (store and isinstance(value, str)) else None, "content_blob": value if (store and isinstance(value, bytes)) else None, "content_hash": content_hash, "metadata": json.dumps(metadata) if metadata else None, "updated": int(time.time()), }, replace=True, ) def embed_multi( self, entries: Iterable[Tuple[str, Union[str, bytes]]], store: bool = False, batch_size: int = 100, ) -> None: """ Embed multiple texts and store them in the collection with given IDs. Args: entries (iterable): Iterable of (id: str, text: str) tuples store (bool, optional): Whether to store the text in the content column batch_size (int, optional): custom maximum batch size to use """ self.embed_multi_with_metadata( ((id, value, None) for id, value in entries), store=store, batch_size=batch_size, ) def embed_multi_with_metadata( self, entries: Iterable[Tuple[str, Union[str, bytes], Optional[Dict[str, Any]]]], store: bool = False, batch_size: int = 100, ) -> None: """ Embed multiple values along with metadata and store them in the collection with given IDs. 
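Example (an illustrative sketch; the IDs, texts and metadata are hypothetical):

    collection.embed_multi_with_metadata(
        [
            ("doc-1", "First document text", {"category": "notes"}),
            ("doc-2", "Second document text", None),
        ],
        store=True,
    )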
Args: entries (iterable): Iterable of (id: str, value: str or bytes, metadata: None or dict) store (bool, optional): Whether to store the value in the content or content_blob column batch_size (int, optional): custom maximum batch size to use """ import llm batch_size = min(batch_size, (self.model().batch_size or batch_size)) iterator = iter(entries) collection_id = self.id while True: batch = list(islice(iterator, batch_size)) if not batch: break # Calculate hashes first items_and_hashes = [(item, self.content_hash(item[1])) for item in batch] # Any of those hashes already exist? existing_ids = [ row["id"] for row in self.db.query( """ select id from embeddings where collection_id = ? and content_hash in ({}) """.format( ",".join("?" for _ in items_and_hashes) ), [collection_id] + [item_and_hash[1] for item_and_hash in items_and_hashes], ) ] filtered_batch = [item for item in batch if item[0] not in existing_ids] embeddings = list( self.model().embed_multi(item[1] for item in filtered_batch) ) with self.db.conn: cast(Table, self.db["embeddings"]).insert_all( ( { "collection_id": collection_id, "id": id, "embedding": llm.encode(embedding), "content": ( value if (store and isinstance(value, str)) else None ), "content_blob": ( value if (store and isinstance(value, bytes)) else None ), "content_hash": self.content_hash(value), "metadata": json.dumps(metadata) if metadata else None, "updated": int(time.time()), } for (embedding, (id, value, metadata)) in zip( embeddings, filtered_batch ) ), replace=True, ) def similar_by_vector( self, vector: List[float], number: int = 10, skip_id: Optional[str] = None ) -> List[Entry]: """ Find similar items in the collection by a given vector. Args: vector (list): Vector to search by number (int, optional): Number of similar items to return Returns: list: List of Entry objects """ import llm def distance_score(other_encoded): other_vector = llm.decode(other_encoded) return llm.cosine_similarity(other_vector, vector) self.db.register_function(distance_score, replace=True) where_bits = ["collection_id = ?"] where_args = [str(self.id)] if skip_id: where_bits.append("id != ?") where_args.append(skip_id) return [ Entry( id=row["id"], score=row["score"], content=row["content"], metadata=json.loads(row["metadata"]) if row["metadata"] else None, ) for row in self.db.query( """ select id, content, metadata, distance_score(embedding) as score from embeddings where {where} order by score desc limit {number} """.format( where=" and ".join(where_bits), number=number, ), where_args, ) ] def similar_by_id(self, id: str, number: int = 10) -> List[Entry]: """ Find similar items in the collection by a given ID. Args: id (str): ID to search by number (int, optional): Number of similar items to return Returns: list: List of Entry objects """ import llm matches = list( self.db["embeddings"].rows_where( "collection_id = ? and id = ?", (self.id, id) ) ) if not matches: raise self.DoesNotExist("ID not found") embedding = matches[0]["embedding"] comparison_vector = llm.decode(embedding) return self.similar_by_vector(comparison_vector, number, skip_id=id) def similar(self, value: Union[str, bytes], number: int = 10) -> List[Entry]: """ Find similar items in the collection by a given value. 
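Example (an illustrative sketch; the query string is hypothetical):

    for entry in collection.similar("climbing ropes", number=3):
        print(entry.id, entry.score, entry.content)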
Args: value (str or bytes): value to search by number (int, optional): Number of similar items to return Returns: list: List of Entry objects """ comparison_vector = self.model().embed(value) return self.similar_by_vector(comparison_vector, number) @classmethod def exists(cls, db: Database, name: str) -> bool: """ Does this collection exist in the database? Args: name (str): Name of the collection """ rows = list(db["collections"].rows_where("name = ?", [name])) return bool(rows) def delete(self): """ Delete the collection and its embeddings from the database """ with self.db.conn: self.db.execute("delete from embeddings where collection_id = ?", [self.id]) self.db.execute("delete from collections where id = ?", [self.id]) @staticmethod def content_hash(input: Union[str, bytes]) -> bytes: "Hash content for deduplication. Override to change hashing behavior." if isinstance(input, str): input = input.encode("utf8") return hashlib.md5(input).digest() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/embeddings_migrations.py0000644000175100001660000000467214760365472017347 0ustar00runnerdockerfrom sqlite_migrate import Migrations import hashlib import time embeddings_migrations = Migrations("llm.embeddings") @embeddings_migrations() def m001_create_tables(db): db["collections"].create({"id": int, "name": str, "model": str}, pk="id") db["collections"].create_index(["name"], unique=True) db["embeddings"].create( { "collection_id": int, "id": str, "embedding": bytes, "content": str, "metadata": str, }, pk=("collection_id", "id"), ) @embeddings_migrations() def m002_foreign_key(db): db["embeddings"].add_foreign_key("collection_id", "collections", "id") @embeddings_migrations() def m003_add_updated(db): db["embeddings"].add_column("updated", int) # Pretty-print the schema db["embeddings"].transform() # Assume anything existing was last updated right now db.query( "update embeddings set updated = ? 
where updated is null", [int(time.time())] ) @embeddings_migrations() def m004_store_content_hash(db): db["embeddings"].add_column("content_hash", bytes) db["embeddings"].transform( column_order=( "collection_id", "id", "embedding", "content", "content_hash", "metadata", "updated", ) ) # Register functions manually so we can de-register later def md5(text): return hashlib.md5(text.encode("utf8")).digest() def random_md5(): return hashlib.md5(str(time.time()).encode("utf8")).digest() db.conn.create_function("temp_md5", 1, md5) db.conn.create_function("temp_random_md5", 0, random_md5) with db.conn: db.execute( """ update embeddings set content_hash = temp_md5(content) where content is not null """ ) db.execute( """ update embeddings set content_hash = temp_random_md5() where content is null """ ) db["embeddings"].create_index(["content_hash"]) # De-register functions db.conn.create_function("temp_md5", 1, None) db.conn.create_function("temp_random_md5", 0, None) @embeddings_migrations() def m005_add_content_blob(db): db["embeddings"].add_column("content_blob", bytes) db["embeddings"].transform( column_order=("collection_id", "id", "embedding", "content", "content_blob") ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/errors.py0000644000175100001660000000030414760365472014312 0ustar00runnerdockerclass ModelError(Exception): "Models can raise this error, which will be displayed to the user" class NeedsKeyException(ModelError): "Model needs an API key which has not been provided" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/hookspecs.py0000644000175100001660000000076214760365472015004 0ustar00runnerdockerfrom pluggy import HookimplMarker from pluggy import HookspecMarker hookspec = HookspecMarker("llm") hookimpl = HookimplMarker("llm") @hookspec def register_commands(cli): """Register additional CLI commands, e.g. 
'llm mycommand ...'""" @hookspec def register_models(register): "Register additional model instances representing LLM models that can be called" @hookspec def register_embedding_models(register): "Register additional model instances that can be used for embedding" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/migrations.py0000644000175100001660000001403014760365472015153 0ustar00runnerdockerimport datetime from typing import Callable, List MIGRATIONS: List[Callable] = [] migration = MIGRATIONS.append def migrate(db): ensure_migrations_table(db) already_applied = {r["name"] for r in db["_llm_migrations"].rows} for fn in MIGRATIONS: name = fn.__name__ if name not in already_applied: fn(db) db["_llm_migrations"].insert( { "name": name, "applied_at": str(datetime.datetime.now(datetime.timezone.utc)), } ) already_applied.add(name) def ensure_migrations_table(db): if not db["_llm_migrations"].exists(): db["_llm_migrations"].create( { "name": str, "applied_at": str, }, pk="name", ) @migration def m001_initial(db): # Ensure the original table design exists, so other migrations can run if db["log"].exists(): # It needs to have the chat_id column if "chat_id" not in db["log"].columns_dict: db["log"].add_column("chat_id") return db["log"].create( { "provider": str, "system": str, "prompt": str, "chat_id": str, "response": str, "model": str, "timestamp": str, } ) @migration def m002_id_primary_key(db): db["log"].transform(pk="id") @migration def m003_chat_id_foreign_key(db): db["log"].transform(types={"chat_id": int}) db["log"].add_foreign_key("chat_id", "log", "id") @migration def m004_column_order(db): db["log"].transform( column_order=( "id", "model", "timestamp", "prompt", "system", "response", "chat_id", ) ) @migration def m004_drop_provider(db): db["log"].transform(drop=("provider",)) @migration def m005_debug(db): db["log"].add_column("debug", str) db["log"].add_column("duration_ms", int) @migration def m006_new_logs_table(db): columns = db["log"].columns_dict for column, type in ( ("options_json", str), ("prompt_json", str), ("response_json", str), ("reply_to_id", int), ): # It's possible people running development code like myself # might have accidentally created these columns already if column not in columns: db["log"].add_column(column, type) # Use .transform() to rename options and timestamp_utc, and set new order db["log"].transform( column_order=( "id", "model", "prompt", "system", "prompt_json", "options_json", "response", "response_json", "reply_to_id", "chat_id", "duration_ms", "timestamp_utc", ), rename={ "timestamp": "timestamp_utc", "options": "options_json", }, ) @migration def m007_finish_logs_table(db): db["log"].transform( drop={"debug"}, rename={"timestamp_utc": "datetime_utc"}, drop_foreign_keys=("chat_id",), ) with db.conn: db.execute("alter table log rename to logs") @migration def m008_reply_to_id_foreign_key(db): db["logs"].add_foreign_key("reply_to_id", "logs", "id") @migration def m008_fix_column_order_in_logs(db): # reply_to_id ended up at the end after foreign key added db["logs"].transform( column_order=( "id", "model", "prompt", "system", "prompt_json", "options_json", "response", "response_json", "reply_to_id", "chat_id", "duration_ms", "timestamp_utc", ), ) @migration def m009_delete_logs_table_if_empty(db): # We moved to a new table design, but we don't delete the table # if someone has put data in it if not db["logs"].count: db["logs"].drop() @migration def m010_create_new_log_tables(db): 
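# This migration introduces the conversations and responses tables that replace the earlier single logs table (dropped in m009 when empty); each response row references its conversation via a conversation_id foreign key.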
db["conversations"].create( { "id": str, "name": str, "model": str, }, pk="id", ) db["responses"].create( { "id": str, "model": str, "prompt": str, "system": str, "prompt_json": str, "options_json": str, "response": str, "response_json": str, "conversation_id": str, "duration_ms": int, "datetime_utc": str, }, pk="id", foreign_keys=(("conversation_id", "conversations", "id"),), ) @migration def m011_fts_for_responses(db): db["responses"].enable_fts(["prompt", "response"], create_triggers=True) @migration def m012_attachments_tables(db): db["attachments"].create( { "id": str, "type": str, "path": str, "url": str, "content": bytes, }, pk="id", ) db["prompt_attachments"].create( { "response_id": str, "attachment_id": str, "order": int, }, foreign_keys=( ("response_id", "responses", "id"), ("attachment_id", "attachments", "id"), ), pk=("response_id", "attachment_id"), ) @migration def m013_usage(db): db["responses"].add_column("input_tokens", int) db["responses"].add_column("output_tokens", int) db["responses"].add_column("token_details", str) @migration def m014_schemas(db): db["schemas"].create( { "id": str, "content": str, }, pk="id", ) db["responses"].add_column("schema_id", str, fk="schemas", fk_col="id") # Clean up SQL create table indentation db["responses"].transform() # These changes may have dropped the FTS configuration, fix that db["responses"].enable_fts( ["prompt", "response"], create_triggers=True, replace=True ) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/models.py0000644000175100001660000006541714760365472014301 0ustar00runnerdockerimport asyncio import base64 from dataclasses import dataclass, field import datetime from .errors import NeedsKeyException import hashlib import httpx from itertools import islice import re import time from typing import ( Any, AsyncGenerator, Callable, Dict, Iterable, Iterator, List, Optional, Set, Union, ) from .utils import ( mimetype_from_path, mimetype_from_string, token_usage_string, make_schema_id, ) from abc import ABC, abstractmethod import json from pydantic import BaseModel, ConfigDict from ulid import ULID CONVERSATION_NAME_LENGTH = 32 @dataclass class Usage: input: Optional[int] = None output: Optional[int] = None details: Optional[Dict[str, Any]] = None @dataclass class Attachment: type: Optional[str] = None path: Optional[str] = None url: Optional[str] = None content: Optional[bytes] = None _id: Optional[str] = None def id(self): # Hash of the binary content, or of '{"url": "https://..."}' for URL attachments if self._id is None: if self.content: self._id = hashlib.sha256(self.content).hexdigest() elif self.path: self._id = hashlib.sha256(open(self.path, "rb").read()).hexdigest() else: self._id = hashlib.sha256( json.dumps({"url": self.url}).encode("utf-8") ).hexdigest() return self._id def resolve_type(self): if self.type: return self.type # Derive it from path or url or content if self.path: return mimetype_from_path(self.path) if self.url: response = httpx.head(self.url) response.raise_for_status() return response.headers.get("content-type") if self.content: return mimetype_from_string(self.content) raise ValueError("Attachment has no type and no content to derive it from") def content_bytes(self): content = self.content if not content: if self.path: content = open(self.path, "rb").read() elif self.url: response = httpx.get(self.url) response.raise_for_status() content = response.content return content def base64_content(self): return 
base64.b64encode(self.content_bytes()).decode("utf-8") @classmethod def from_row(cls, row): return cls( _id=row["id"], type=row["type"], path=row["path"], url=row["url"], content=row["content"], ) @dataclass class Prompt: prompt: Optional[str] model: "Model" attachments: Optional[List[Attachment]] system: Optional[str] prompt_json: Optional[str] schema: Optional[Union[Dict, type[BaseModel]]] options: "Options" def __init__( self, prompt, model, *, attachments=None, system=None, prompt_json=None, options=None, schema=None, ): self.prompt = prompt self.model = model self.attachments = list(attachments or []) self.system = system self.prompt_json = prompt_json if schema and not isinstance(schema, dict) and issubclass(schema, BaseModel): schema = schema.model_json_schema() self.schema = schema self.options = options or {} @dataclass class _BaseConversation: model: "_BaseModel" id: str = field(default_factory=lambda: str(ULID()).lower()) name: Optional[str] = None responses: List["_BaseResponse"] = field(default_factory=list) @classmethod @abstractmethod def from_row(cls, row: Any) -> "_BaseConversation": raise NotImplementedError @dataclass class Conversation(_BaseConversation): def prompt( self, prompt: Optional[str] = None, *, attachments: Optional[List[Attachment]] = None, system: Optional[str] = None, schema: Optional[Union[dict, type[BaseModel]]] = None, stream: bool = True, key: Optional[str] = None, **options, ) -> "Response": return Response( Prompt( prompt, model=self.model, attachments=attachments, system=system, schema=schema, options=self.model.Options(**options), ), self.model, stream, conversation=self, key=key, ) @classmethod def from_row(cls, row): from llm import get_model return cls( model=get_model(row["model"]), id=row["id"], name=row["name"], ) def __repr__(self): count = len(self.responses) s = "s" if count == 1 else "" return f"<{self.__class__.__name__}: {self.id} - {count} response{s}" @dataclass class AsyncConversation(_BaseConversation): def prompt( self, prompt: Optional[str] = None, *, attachments: Optional[List[Attachment]] = None, system: Optional[str] = None, schema: Optional[Union[dict, type[BaseModel]]] = None, stream: bool = True, key: Optional[str] = None, **options, ) -> "AsyncResponse": return AsyncResponse( Prompt( prompt, model=self.model, attachments=attachments, system=system, schema=schema, options=self.model.Options(**options), ), self.model, stream, conversation=self, key=key, ) @classmethod def from_row(cls, row): from llm import get_async_model return cls( model=get_async_model(row["model"]), id=row["id"], name=row["name"], ) def __repr__(self): count = len(self.responses) s = "s" if count == 1 else "" return f"<{self.__class__.__name__}: {self.id} - {count} response{s}" class _BaseResponse: """Base response class shared between sync and async responses""" prompt: "Prompt" stream: bool conversation: Optional["_BaseConversation"] = None _key: Optional[str] = None def __init__( self, prompt: Prompt, model: "_BaseModel", stream: bool, conversation: Optional[_BaseConversation] = None, key: Optional[str] = None, ): self.prompt = prompt self._prompt_json = None self.model = model self.stream = stream self._key = key self._chunks: List[str] = [] self._done = False self.response_json = None self.conversation = conversation self.attachments: List[Attachment] = [] self._start: Optional[float] = None self._end: Optional[float] = None self._start_utcnow: Optional[datetime.datetime] = None self.input_tokens: Optional[int] = None self.output_tokens: 
Optional[int] = None self.token_details: Optional[dict] = None self.done_callbacks: List[Callable] = [] if self.prompt.schema and not self.model.supports_schema: raise ValueError(f"{self.model} does not support schemas") def set_usage( self, *, input: Optional[int] = None, output: Optional[int] = None, details: Optional[dict] = None, ): self.input_tokens = input self.output_tokens = output self.token_details = details @classmethod def from_row(cls, db, row, _async=False): from llm import get_model, get_async_model if _async: model = get_async_model(row["model"]) else: model = get_model(row["model"]) # Schema schema = None if row["schema_id"]: schema = json.loads(db["schemas"].get(row["schema_id"])["content"]) response = cls( model=model, prompt=Prompt( prompt=row["prompt"], model=model, attachments=[], system=row["system"], schema=schema, options=model.Options(**json.loads(row["options_json"])), ), stream=False, ) response.id = row["id"] response._prompt_json = json.loads(row["prompt_json"] or "null") response.response_json = json.loads(row["response_json"] or "null") response._done = True response._chunks = [row["response"]] # Attachments response.attachments = [ Attachment.from_row(arow) for arow in db.query( """ select attachments.* from attachments join prompt_attachments on attachments.id = prompt_attachments.attachment_id where prompt_attachments.response_id = ? order by prompt_attachments."order" """, [row["id"]], ) ] return response def token_usage(self) -> str: return token_usage_string( self.input_tokens, self.output_tokens, self.token_details ) def log_to_db(self, db): conversation = self.conversation if not conversation: conversation = Conversation(model=self.model) db["conversations"].insert( { "id": conversation.id, "name": _conversation_name( self.prompt.prompt or self.prompt.system or "" ), "model": conversation.model.model_id, }, ignore=True, ) schema_id = None if self.prompt.schema: schema_id, schema_json = make_schema_id(self.prompt.schema) db["schemas"].insert({"id": schema_id, "content": schema_json}, ignore=True) response_id = str(ULID()).lower() response = { "id": response_id, "model": self.model.model_id, "prompt": self.prompt.prompt, "system": self.prompt.system, "prompt_json": self._prompt_json, "options_json": { key: value for key, value in dict(self.prompt.options).items() if value is not None }, "response": self.text_or_raise(), "response_json": self.json(), "conversation_id": conversation.id, "duration_ms": self.duration_ms(), "datetime_utc": self.datetime_utc(), "input_tokens": self.input_tokens, "output_tokens": self.output_tokens, "token_details": ( json.dumps(self.token_details) if self.token_details else None ), "schema_id": schema_id, } db["responses"].insert(response) # Persist any attachments - loop through with index for index, attachment in enumerate(self.prompt.attachments): attachment_id = attachment.id() db["attachments"].insert( { "id": attachment_id, "type": attachment.resolve_type(), "path": attachment.path, "url": attachment.url, "content": attachment.content, }, replace=True, ) db["prompt_attachments"].insert( { "response_id": response_id, "attachment_id": attachment_id, "order": index, }, ) class Response(_BaseResponse): model: "Model" conversation: Optional["Conversation"] = None def on_done(self, callback): if not self._done: self.done_callbacks.append(callback) else: callback(self) def _on_done(self): for callback in self.done_callbacks: callback(self) def __str__(self) -> str: return self.text() def _force(self): if not self._done: 
list(self) def text(self) -> str: self._force() return "".join(self._chunks) def text_or_raise(self) -> str: return self.text() def json(self) -> Optional[Dict[str, Any]]: self._force() return self.response_json def duration_ms(self) -> int: self._force() return int(((self._end or 0) - (self._start or 0)) * 1000) def datetime_utc(self) -> str: self._force() return self._start_utcnow.isoformat() if self._start_utcnow else "" def usage(self) -> Usage: self._force() return Usage( input=self.input_tokens, output=self.output_tokens, details=self.token_details, ) def __iter__(self) -> Iterator[str]: self._start = time.monotonic() self._start_utcnow = datetime.datetime.now(datetime.timezone.utc) if self._done: yield from self._chunks return if isinstance(self.model, Model): for chunk in self.model.execute( self.prompt, stream=self.stream, response=self, conversation=self.conversation, ): yield chunk self._chunks.append(chunk) elif isinstance(self.model, KeyModel): for chunk in self.model.execute( self.prompt, stream=self.stream, response=self, conversation=self.conversation, key=self.model.get_key(self._key), ): yield chunk self._chunks.append(chunk) else: raise Exception("self.model must be a Model or KeyModel") if self.conversation: self.conversation.responses.append(self) self._end = time.monotonic() self._done = True self._on_done() def __repr__(self): text = "... not yet done ..." if self._done: text = "".join(self._chunks) return "<Response prompt='{}' text='{}'>".format(self.prompt.prompt, text) class AsyncResponse(_BaseResponse): model: "AsyncModel" conversation: Optional["AsyncConversation"] = None @classmethod def from_row(cls, db, row, _async=False): return super().from_row(db, row, _async=True) async def on_done(self, callback): if not self._done: self.done_callbacks.append(callback) else: if callable(callback): callback = callback(self) if asyncio.iscoroutine(callback): await callback async def _on_done(self): for callback in self.done_callbacks: if callable(callback): callback = callback(self) if asyncio.iscoroutine(callback): await callback def __aiter__(self): self._start = time.monotonic() self._start_utcnow = datetime.datetime.now(datetime.timezone.utc) return self async def __anext__(self) -> str: if self._done: if not self._chunks: raise StopAsyncIteration chunk = self._chunks.pop(0) if not self._chunks: raise StopAsyncIteration return chunk if not hasattr(self, "_generator"): if isinstance(self.model, AsyncModel): self._generator = self.model.execute( self.prompt, stream=self.stream, response=self, conversation=self.conversation, ) elif isinstance(self.model, AsyncKeyModel): self._generator = self.model.execute( self.prompt, stream=self.stream, response=self, conversation=self.conversation, key=self.model.get_key(self._key), ) else: raise ValueError("self.model must be an AsyncModel or AsyncKeyModel") try: chunk = await self._generator.__anext__() self._chunks.append(chunk) return chunk except StopAsyncIteration: if self.conversation: self.conversation.responses.append(self) self._end = time.monotonic() self._done = True await self._on_done() raise async def _force(self): if not self._done: async for _ in self: pass return self def text_or_raise(self) -> str: if not self._done: raise ValueError("Response not yet awaited") return "".join(self._chunks) async def text(self) -> str: await self._force() return "".join(self._chunks) async def json(self) -> Optional[Dict[str, Any]]: await self._force() return self.response_json async def duration_ms(self) -> int: await self._force() return int(((self._end or 0) -
(self._start or 0)) * 1000) async def datetime_utc(self) -> str: await self._force() return self._start_utcnow.isoformat() if self._start_utcnow else "" async def usage(self) -> Usage: await self._force() return Usage( input=self.input_tokens, output=self.output_tokens, details=self.token_details, ) def __await__(self): return self._force().__await__() async def to_sync_response(self) -> Response: await self._force() response = Response( self.prompt, self.model, self.stream, conversation=self.conversation, ) response._chunks = self._chunks response._done = True response._end = self._end response._start = self._start response._start_utcnow = self._start_utcnow response.input_tokens = self.input_tokens response.output_tokens = self.output_tokens response.token_details = self.token_details return response @classmethod def fake( cls, model: "AsyncModel", prompt: str, *attachments: List[Attachment], system: str, response: str, ): "Utility method to help with writing tests" response_obj = cls( model=model, prompt=Prompt( prompt, model=model, attachments=attachments, system=system, ), stream=False, ) response_obj._done = True response_obj._chunks = [response] return response_obj def __repr__(self): text = "... not yet awaited ..." if self._done: text = "".join(self._chunks) return "<AsyncResponse prompt='{}' text='{}'>".format(self.prompt.prompt, text) class Options(BaseModel): model_config = ConfigDict(extra="forbid") _Options = Options class _get_key_mixin: needs_key: Optional[str] = None key: Optional[str] = None key_env_var: Optional[str] = None def get_key(self, explicit_key: Optional[str] = None) -> Optional[str]: from llm import get_key if self.needs_key is None: # This model doesn't use an API key return None if self.key is not None: # Someone already set model.key='...' return self.key # Attempt to load a key using llm.get_key() key = get_key( explicit_key=explicit_key, key_alias=self.needs_key, env_var=self.key_env_var, ) if key: return key # Show a useful error message message = "No key found - add one using 'llm keys set {}'".format( self.needs_key ) if self.key_env_var: message += " or set the {} environment variable".format(self.key_env_var) raise NeedsKeyException(message) class _BaseModel(ABC, _get_key_mixin): model_id: str can_stream: bool = False attachment_types: Set = set() supports_schema = False class Options(_Options): pass def _validate_attachments( self, attachments: Optional[List[Attachment]] = None ) -> None: if attachments and not self.attachment_types: raise ValueError("This model does not support attachments") for attachment in attachments or []: attachment_type = attachment.resolve_type() if attachment_type not in self.attachment_types: raise ValueError( f"This model does not support attachments of type '{attachment_type}', " f"only {', '.join(self.attachment_types)}" ) def __str__(self) -> str: return "{}{}: {}".format( self.__class__.__name__, " (async)" if isinstance(self, (AsyncModel, AsyncKeyModel)) else "", self.model_id, ) def __repr__(self) -> str: return f"<{str(self)}>" class _Model(_BaseModel): def conversation(self) -> Conversation: return Conversation(model=self) def prompt( self, prompt: Optional[str] = None, *, attachments: Optional[List[Attachment]] = None, system: Optional[str] = None, stream: bool = True, schema: Optional[Union[dict, type[BaseModel]]] = None, **options, ) -> Response: key = options.pop("key", None) self._validate_attachments(attachments) return Response( Prompt( prompt, attachments=attachments, system=system, schema=schema, model=self, options=self.Options(**options),
), self, stream, key=key, ) class Model(_Model): @abstractmethod def execute( self, prompt: Prompt, stream: bool, response: Response, conversation: Optional[Conversation], ) -> Iterator[str]: pass class KeyModel(_Model): @abstractmethod def execute( self, prompt: Prompt, stream: bool, response: Response, conversation: Optional[Conversation], key: Optional[str], ) -> Iterator[str]: pass class _AsyncModel(_BaseModel): def conversation(self) -> AsyncConversation: return AsyncConversation(model=self) def prompt( self, prompt: Optional[str] = None, *, attachments: Optional[List[Attachment]] = None, system: Optional[str] = None, schema: Optional[Union[dict, type[BaseModel]]] = None, stream: bool = True, **options, ) -> AsyncResponse: key = options.pop("key", None) self._validate_attachments(attachments) return AsyncResponse( Prompt( prompt, attachments=attachments, system=system, schema=schema, model=self, options=self.Options(**options), ), self, stream, key=key, ) class AsyncModel(_AsyncModel): @abstractmethod async def execute( self, prompt: Prompt, stream: bool, response: AsyncResponse, conversation: Optional[AsyncConversation], ) -> AsyncGenerator[str, None]: yield "" class AsyncKeyModel(_AsyncModel): @abstractmethod async def execute( self, prompt: Prompt, stream: bool, response: AsyncResponse, conversation: Optional[AsyncConversation], key: Optional[str], ) -> AsyncGenerator[str, None]: yield "" class EmbeddingModel(ABC, _get_key_mixin): model_id: str key: Optional[str] = None needs_key: Optional[str] = None key_env_var: Optional[str] = None supports_text: bool = True supports_binary: bool = False batch_size: Optional[int] = None def _check(self, item: Union[str, bytes]): if not self.supports_binary and isinstance(item, bytes): raise ValueError( "This model does not support binary data, only text strings" ) if not self.supports_text and isinstance(item, str): raise ValueError( "This model does not support text strings, only binary data" ) def embed(self, item: Union[str, bytes]) -> List[float]: "Embed a single text string or binary blob, return a list of floats" self._check(item) return next(iter(self.embed_batch([item]))) def embed_multi( self, items: Iterable[Union[str, bytes]], batch_size: Optional[int] = None ) -> Iterator[List[float]]: "Embed multiple items in batches according to the model batch_size" iter_items = iter(items) batch_size = self.batch_size if batch_size is None else batch_size if (not self.supports_binary) or (not self.supports_text): def checking_iter(items): for item in items: self._check(item) yield item iter_items = checking_iter(items) if batch_size is None: yield from self.embed_batch(iter_items) return while True: batch_items = list(islice(iter_items, batch_size)) if not batch_items: break yield from self.embed_batch(batch_items) @abstractmethod def embed_batch(self, items: Iterable[Union[str, bytes]]) -> Iterator[List[float]]: """ Embed a batch of strings or blobs, return a list of lists of floats """ pass def __str__(self) -> str: return "{}: {}".format(self.__class__.__name__, self.model_id) def __repr__(self) -> str: return f"<{str(self)}>" @dataclass class ModelWithAliases: model: Model async_model: AsyncModel aliases: Set[str] def matches(self, query: str) -> bool: query = query.lower() all_strings: List[str] = [] all_strings.extend(self.aliases) if self.model: all_strings.append(str(self.model)) if self.async_model: all_strings.append(str(self.async_model.model_id)) return any(query in alias.lower() for alias in all_strings) @dataclass class 
EmbeddingModelWithAliases: model: EmbeddingModel aliases: Set[str] def matches(self, query: str) -> bool: query = query.lower() all_strings: List[str] = [] all_strings.extend(self.aliases) all_strings.append(str(self.model)) return any(query in alias.lower() for alias in all_strings) def _conversation_name(text): # Collapse whitespace, including newlines text = re.sub(r"\s+", " ", text) if len(text) <= CONVERSATION_NAME_LENGTH: return text return text[: CONVERSATION_NAME_LENGTH - 1] + "…" ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/plugins.py0000644000175100001660000000304314760365472014462 0ustar00runnerdockerimport importlib from importlib import metadata import os import pluggy import sys from . import hookspecs DEFAULT_PLUGINS = ("llm.default_plugins.openai_models",) pm = pluggy.PluginManager("llm") pm.add_hookspecs(hookspecs) LLM_LOAD_PLUGINS = os.environ.get("LLM_LOAD_PLUGINS", None) _loaded = False def load_plugins(): global _loaded if _loaded: return _loaded = True if not hasattr(sys, "_called_from_test") and LLM_LOAD_PLUGINS is None: # Only load plugins if not running tests pm.load_setuptools_entrypoints("llm") # Load any plugins specified in LLM_LOAD_PLUGINS") if LLM_LOAD_PLUGINS is not None: for package_name in [ name for name in LLM_LOAD_PLUGINS.split(",") if name.strip() ]: try: distribution = metadata.distribution(package_name) # Updated call llm_entry_points = [ ep for ep in distribution.entry_points if ep.group == "llm" ] for entry_point in llm_entry_points: mod = entry_point.load() pm.register(mod, name=entry_point.name) # Ensure name can be found in plugin_to_distinfo later: pm._plugin_distinfo.append((mod, distribution)) # type: ignore except metadata.PackageNotFoundError: sys.stderr.write(f"Plugin {package_name} could not be found\n") for plugin in DEFAULT_PLUGINS: mod = importlib.import_module(plugin) pm.register(mod, plugin) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/templates.py0000644000175100001660000000375214760365472015006 0ustar00runnerdockerfrom pydantic import BaseModel, ConfigDict import string from typing import Optional, Any, Dict, List, Tuple class Template(BaseModel): name: str prompt: Optional[str] = None system: Optional[str] = None model: Optional[str] = None defaults: Optional[Dict[str, Any]] = None # Should a fenced code block be extracted? 
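# extract and extract_last control which fenced code block (if any) is pulled out of the model response — compare extract_fenced_code_block(text, last=...) in llm/utils.py.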
extract: Optional[bool] = None extract_last: Optional[bool] = None schema_object: Optional[dict] = None model_config = ConfigDict(extra="forbid") class MissingVariables(Exception): pass def evaluate( self, input: str, params: Optional[Dict[str, Any]] = None ) -> Tuple[Optional[str], Optional[str]]: params = params or {} params["input"] = input if self.defaults: for k, v in self.defaults.items(): if k not in params: params[k] = v prompt: Optional[str] = None system: Optional[str] = None if not self.prompt: system = self.interpolate(self.system, params) prompt = input else: prompt = self.interpolate(self.prompt, params) system = self.interpolate(self.system, params) return prompt, system @classmethod def interpolate(cls, text: Optional[str], params: Dict[str, Any]) -> Optional[str]: if not text: return text # Confirm all variables in text are provided string_template = string.Template(text) vars = cls.extract_vars(string_template) missing = [p for p in vars if p not in params] if missing: raise cls.MissingVariables( "Missing variables: {}".format(", ".join(missing)) ) return string_template.substitute(**params) @staticmethod def extract_vars(string_template: string.Template) -> List[str]: return [ match.group("named") for match in string_template.pattern.finditer(string_template.template) ] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/llm/utils.py0000644000175100001660000003066414760365472014152 0ustar00runnerdockerimport click import hashlib import httpx import json import pathlib import puremagic import re import sqlite_utils import textwrap from typing import Any, List, Dict, Optional, Tuple MIME_TYPE_FIXES = { "audio/wave": "audio/wav", } def mimetype_from_string(content) -> Optional[str]: try: type_ = puremagic.from_string(content, mime=True) return MIME_TYPE_FIXES.get(type_, type_) except puremagic.PureError: return None def mimetype_from_path(path) -> Optional[str]: try: type_ = puremagic.from_file(path, mime=True) return MIME_TYPE_FIXES.get(type_, type_) except puremagic.PureError: return None def dicts_to_table_string( headings: List[str], dicts: List[Dict[str, str]] ) -> List[str]: max_lengths = [len(h) for h in headings] # Compute maximum length for each column for d in dicts: for i, h in enumerate(headings): if h in d and len(str(d[h])) > max_lengths[i]: max_lengths[i] = len(str(d[h])) # Generate formatted table strings res = [] res.append(" ".join(h.ljust(max_lengths[i]) for i, h in enumerate(headings))) for d in dicts: row = [] for i, h in enumerate(headings): row.append(str(d.get(h, "")).ljust(max_lengths[i])) res.append(" ".join(row)) return res def remove_dict_none_values(d): """ Recursively remove keys with value of None or value of a dict that is all values of None """ if not isinstance(d, dict): return d new_dict = {} for key, value in d.items(): if value is not None: if isinstance(value, dict): nested = remove_dict_none_values(value) if nested: new_dict[key] = nested elif isinstance(value, list): new_dict[key] = [remove_dict_none_values(v) for v in value] else: new_dict[key] = value return new_dict class _LogResponse(httpx.Response): def iter_bytes(self, *args, **kwargs): for chunk in super().iter_bytes(*args, **kwargs): click.echo(chunk.decode(), err=True) yield chunk class _LogTransport(httpx.BaseTransport): def __init__(self, transport: httpx.BaseTransport): self.transport = transport def handle_request(self, request: httpx.Request) -> httpx.Response: response = self.transport.handle_request(request) return 
_LogResponse( status_code=response.status_code, headers=response.headers, stream=response.stream, extensions=response.extensions, ) def _no_accept_encoding(request: httpx.Request): request.headers.pop("accept-encoding", None) def _log_response(response: httpx.Response): request = response.request click.echo(f"Request: {request.method} {request.url}", err=True) click.echo(" Headers:", err=True) for key, value in request.headers.items(): if key.lower() == "authorization": value = "[...]" if key.lower() == "cookie": value = value.split("=")[0] + "=..." click.echo(f" {key}: {value}", err=True) click.echo(" Body:", err=True) try: request_body = json.loads(request.content) click.echo( textwrap.indent(json.dumps(request_body, indent=2), " "), err=True ) except json.JSONDecodeError: click.echo(textwrap.indent(request.content.decode(), " "), err=True) click.echo(f"Response: status_code={response.status_code}", err=True) click.echo(" Headers:", err=True) for key, value in response.headers.items(): if key.lower() == "set-cookie": value = value.split("=")[0] + "=..." click.echo(f" {key}: {value}", err=True) click.echo(" Body:", err=True) def logging_client() -> httpx.Client: return httpx.Client( transport=_LogTransport(httpx.HTTPTransport()), event_hooks={"request": [_no_accept_encoding], "response": [_log_response]}, ) def simplify_usage_dict(d): # Recursively remove keys with value 0 and empty dictionaries def remove_empty_and_zero(obj): if isinstance(obj, dict): cleaned = { k: remove_empty_and_zero(v) for k, v in obj.items() if v != 0 and v != {} } return {k: v for k, v in cleaned.items() if v is not None and v != {}} return obj return remove_empty_and_zero(d) or {} def token_usage_string(input_tokens, output_tokens, token_details) -> str: bits = [] if input_tokens is not None: bits.append(f"{format(input_tokens, ',')} input") if output_tokens is not None: bits.append(f"{format(output_tokens, ',')} output") if token_details: bits.append(json.dumps(token_details)) return ", ".join(bits) def extract_fenced_code_block(text: str, last: bool = False) -> Optional[str]: """ Extracts and returns Markdown fenced code block found in the given text. The function handles fenced code blocks that: - Use at least three backticks (`). - May include a language tag immediately after the opening backticks. - Use more than three backticks as long as the closing fence has the same number. If no fenced code block is found, the function returns None. Args: text (str): The input text to search for a fenced code block. last (bool): Extract the last code block if True, otherwise the first. Returns: Optional[str]: The content of the fenced code block, or None if not found. """ # Regex pattern to match fenced code blocks # - ^ or \n ensures that the fence is at the start of a line # - (`{3,}) captures the opening backticks (at least three) # - (\w+)? optionally captures the language tag # - \n matches the newline after the opening fence # - (.*?) 
non-greedy match for the code block content # - (?P=fence) ensures that the closing fence has the same number of backticks # - [ ]* allows for optional spaces between the closing fence and newline # - (?=\n|$) ensures that the closing fence is followed by a newline or end of string pattern = re.compile( r"""(?m)^(?P<fence>`{3,})(?P<language>\w+)?\n(?P<code>.*?)^(?P=fence)[ ]*(?=\n|$)""", re.DOTALL, ) matches = list(pattern.finditer(text)) if matches: match = matches[-1] if last else matches[0] return match.group("code") return None def make_schema_id(schema: dict) -> Tuple[str, str]: schema_json = json.dumps(schema, separators=(",", ":")) schema_id = hashlib.blake2b(schema_json.encode(), digest_size=16).hexdigest() return schema_id, schema_json def output_rows_as_json(rows, nl=False): """ Output rows as JSON - either newline-delimited or an array Parameters: - rows: List of dictionaries to output - nl: Boolean, if True, use newline-delimited JSON Returns: - String with formatted JSON output """ if not rows: return "" if nl else "[]" lines = [] end_i = len(rows) - 1 for i, row in enumerate(rows): is_first = i == 0 is_last = i == end_i line = "{firstchar}{serialized}{maybecomma}{lastchar}".format( firstchar=("[" if is_first else " ") if not nl else "", serialized=json.dumps(row), maybecomma="," if (not nl and not is_last) else "", lastchar="]" if (is_last and not nl) else "", ) lines.append(line) return "\n".join(lines) def resolve_schema_input(db, schema_input, load_template): # schema_input might be JSON or a filepath or an ID or t:name if not schema_input: return if schema_input.strip().startswith("t:"): name = schema_input.strip()[2:] template = load_template(name) if not template.schema_object: raise click.ClickException("Template '{}' has no schema".format(name)) return template.schema_object if schema_input.strip().startswith("{"): try: return json.loads(schema_input) except ValueError: pass if " " in schema_input.strip() or "," in schema_input: # Treat it as schema DSL return schema_dsl(schema_input) # Is it a file on disk? path = pathlib.Path(schema_input) if path.exists(): try: return json.loads(path.read_text()) except ValueError: raise click.ClickException("Schema file contained invalid JSON") # Last attempt: is it an ID in the DB? try: row = db["schemas"].get(schema_input) return json.loads(row["content"]) except (sqlite_utils.db.NotFoundError, ValueError): raise click.BadParameter("Invalid schema") def schema_summary(schema: dict) -> str: """ Extract property names from a JSON schema and format them in a concise way that highlights the array/object structure. Args: schema (dict): A JSON schema dictionary Returns: str: A human-friendly summary of the schema structure """ if not schema or not isinstance(schema, dict): return "" schema_type = schema.get("type", "") if schema_type == "object": props = schema.get("properties", {}) prop_summaries = [] for name, prop_schema in props.items(): prop_type = prop_schema.get("type", "") if prop_type == "array": items = prop_schema.get("items", {}) items_summary = schema_summary(items) prop_summaries.append(f"{name}: [{items_summary}]") elif prop_type == "object": nested_summary = schema_summary(prop_schema) prop_summaries.append(f"{name}: {nested_summary}") else: prop_summaries.append(name) return "{" + ", ".join(prop_summaries) + "}" elif schema_type == "array": items = schema.get("items", {}) return schema_summary(items) return "" def schema_dsl(schema_dsl: str, multi: bool = False) -> Dict[str, Any]: """ Build a JSON schema from a concise schema string.
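For example, "name, age int, bio: a short biography" describes an object with a string "name", an integer "age" and a string "bio" whose description is "a short biography".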
Args: schema_dsl: A string representing a schema in the concise format. Can be comma-separated or newline-separated. multi: Boolean, return a schema for an "items" array of these Returns: A dictionary representing the JSON schema. """ # Type mapping dictionary type_mapping = { "int": "integer", "float": "number", "bool": "boolean", "str": "string", } # Initialize the schema dictionary with required elements json_schema: Dict[str, Any] = {"type": "object", "properties": {}, "required": []} # Check if the schema is newline-separated or comma-separated if "\n" in schema_dsl: fields = [field.strip() for field in schema_dsl.split("\n") if field.strip()] else: fields = [field.strip() for field in schema_dsl.split(",") if field.strip()] # Process each field for field in fields: # Extract field name, type, and description if ":" in field: field_info, description = field.split(":", 1) description = description.strip() else: field_info = field description = "" # Process field name and type field_parts = field_info.strip().split() field_name = field_parts[0].strip() # Default type is string field_type = "string" # If type is specified, use it if len(field_parts) > 1: type_indicator = field_parts[1].strip() if type_indicator in type_mapping: field_type = type_mapping[type_indicator] # Add field to properties json_schema["properties"][field_name] = {"type": field_type} # Add description if provided if description: json_schema["properties"][field_name]["description"] = description # Add field to required list json_schema["required"].append(field_name) if multi: return multi_schema(json_schema) else: return json_schema def multi_schema(schema: dict) -> dict: "Wrap JSON schema in an 'items': [] array" return { "type": "object", "properties": {"items": {"type": "array", "items": schema}}, "required": ["items"], } def find_unused_key(item: dict, key: str) -> str: 'Return unused key, e.g. for {"id": "1"} and key "id" returns "id_"' while key in item: key += "_" return key ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1740761919.8168309 llm-0.23/llm.egg-info/0000755000175100001660000000000014760365500014111 5ustar00runnerdocker././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761919.0 llm-0.23/llm.egg-info/PKG-INFO0000644000175100001660000001517414760365477015233 0ustar00runnerdockerMetadata-Version: 2.2 Name: llm Version: 0.23 Summary: CLI utility and Python library for interacting with Large Language Models from organizations like OpenAI, Anthropic and Gemini plus local models installed on your own machine. 
Home-page: https://github.com/simonw/llm Author: Simon Willison License: Apache License, Version 2.0 Project-URL: Documentation, https://llm.datasette.io/ Project-URL: Issues, https://github.com/simonw/llm/issues Project-URL: CI, https://github.com/simonw/llm/actions Project-URL: Changelog, https://github.com/simonw/llm/releases Requires-Python: >=3.9 Description-Content-Type: text/markdown License-File: LICENSE Requires-Dist: click Requires-Dist: openai>=1.55.3 Requires-Dist: click-default-group>=1.2.3 Requires-Dist: sqlite-utils>=3.37 Requires-Dist: sqlite-migrate>=0.1a2 Requires-Dist: pydantic>=2.0.0 Requires-Dist: PyYAML Requires-Dist: pluggy Requires-Dist: python-ulid Requires-Dist: setuptools Requires-Dist: pip Requires-Dist: pyreadline3; sys_platform == "win32" Requires-Dist: puremagic Provides-Extra: test Requires-Dist: pytest; extra == "test" Requires-Dist: numpy; extra == "test" Requires-Dist: pytest-httpx>=0.33.0; extra == "test" Requires-Dist: pytest-asyncio; extra == "test" Requires-Dist: cogapp; extra == "test" Requires-Dist: mypy>=1.10.0; extra == "test" Requires-Dist: black>=25.1.0; extra == "test" Requires-Dist: ruff; extra == "test" Requires-Dist: types-click; extra == "test" Requires-Dist: types-PyYAML; extra == "test" Requires-Dist: types-setuptools; extra == "test" Dynamic: author Dynamic: description Dynamic: description-content-type Dynamic: home-page Dynamic: license Dynamic: project-url Dynamic: provides-extra Dynamic: requires-dist Dynamic: requires-python Dynamic: summary # LLM [![PyPI](https://img.shields.io/pypi/v/llm.svg)](https://pypi.org/project/llm/) [![Documentation](https://readthedocs.org/projects/llm/badge/?version=latest)](https://llm.datasette.io/) [![Changelog](https://img.shields.io/github/v/release/simonw/llm?include_prereleases&label=changelog)](https://llm.datasette.io/en/stable/changelog.html) [![Tests](https://github.com/simonw/llm/workflows/Test/badge.svg)](https://github.com/simonw/llm/actions?query=workflow%3ATest) [![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/simonw/llm/blob/main/LICENSE) [![Discord](https://img.shields.io/discord/823971286308356157?label=discord)](https://datasette.io/discord-llm) [![Homebrew](https://img.shields.io/homebrew/installs/dy/llm?color=yellow&label=homebrew&logo=homebrew)](https://formulae.brew.sh/formula/llm) A CLI utility and Python library for interacting with Large Language Models, both via remote APIs and models that can be installed and run on your own machine. [Run prompts from the command-line](https://llm.datasette.io/en/stable/usage.html#executing-a-prompt), [store the results in SQLite](https://llm.datasette.io/en/stable/logging.html), [generate embeddings](https://llm.datasette.io/en/stable/embeddings/index.html) and more. Consult the **[LLM plugins directory](https://llm.datasette.io/en/stable/plugins/directory.html)** for plugins that provide access to remote and local models. 
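LLM can also be used from Python. As a minimal, illustrative sketch of the embeddings API (the collection name, model ID and strings below are examples only; using `3-small` requires an OpenAI API key):

```python
import sqlite_utils
import llm

# Store embeddings in a SQLite database and search them by similarity
db = sqlite_utils.Database("demo.db")
collection = llm.Collection("articles", db, model_id="3-small")
collection.embed("1", "A pelican is a large water bird", store=True)
collection.embed("2", "Walruses live in the Arctic", store=True)
for entry in collection.similar("sea birds", number=2):
    print(entry.id, entry.score)
```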
Full documentation: **[llm.datasette.io](https://llm.datasette.io/)** Background on this project: - [llm, ttok and strip-tags—CLI tools for working with ChatGPT and other LLMs](https://simonwillison.net/2023/May/18/cli-tools-for-llms/) - [The LLM CLI tool now supports self-hosted language models via plugins](https://simonwillison.net/2023/Jul/12/llm/) - [Accessing Llama 2 from the command-line with the llm-replicate plugin](https://simonwillison.net/2023/Jul/18/accessing-llama-2/) - [Run Llama 2 on your own Mac using LLM and Homebrew](https://simonwillison.net/2023/Aug/1/llama-2-mac/) - [Catching up on the weird world of LLMs](https://simonwillison.net/2023/Aug/3/weird-world-of-llms/) - [LLM now provides tools for working with embeddings](https://simonwillison.net/2023/Sep/4/llm-embeddings/) - [Build an image search engine with llm-clip, chat with models with llm chat](https://simonwillison.net/2023/Sep/12/llm-clip-and-chat/) - [Many options for running Mistral models in your terminal using LLM](https://simonwillison.net/2023/Dec/18/mistral/) ## Installation Install this tool using `pip`: ```bash pip install llm ``` Or using [Homebrew](https://brew.sh/): ```bash brew install llm ``` [Detailed installation instructions](https://llm.datasette.io/en/stable/setup.html). ## Getting started If you have an [OpenAI API key](https://platform.openai.com/api-keys) you can get started using the OpenAI models right away. As an alternative to OpenAI, you can [install plugins](https://llm.datasette.io/en/stable/plugins/installing-plugins.html) to access models by other providers, including models that can be installed and run on your own device. Save your OpenAI API key like this: ```bash llm keys set openai ``` This will prompt you for your key like so: ``` Enter key: ``` Now that you've saved a key you can run a prompt like this: ```bash llm "Five cute names for a pet penguin" ``` ``` 1. Waddles 2. Pebbles 3. Bubbles 4. Flappy 5. Chilly ``` Read the [usage instructions](https://llm.datasette.io/en/stable/usage.html) for more. ## Installing a model that runs on your own machine [LLM plugins](https://llm.datasette.io/en/stable/plugins/index.html) can add support for alternative models, including models that run on your own machine. To download and run Mistral 7B Instruct locally, you can install the [llm-gpt4all](https://github.com/simonw/llm-gpt4all) plugin: ```bash llm install llm-gpt4all ``` Then run this command to see which models it makes available: ```bash llm models ``` ``` gpt4all: all-MiniLM-L6-v2-f16 - SBert, 43.76MB download, needs 1GB RAM gpt4all: orca-mini-3b-gguf2-q4_0 - Mini Orca (Small), 1.84GB download, needs 4GB RAM gpt4all: mistral-7b-instruct-v0 - Mistral Instruct, 3.83GB download, needs 8GB RAM ... ``` Each model file will be downloaded once the first time you use it. Try Mistral out like this: ```bash llm -m mistral-7b-instruct-v0 'difference between a pelican and a walrus' ``` You can also start a chat session with the model using the `llm chat` command: ```bash llm chat -m mistral-7b-instruct-v0 ``` ``` Chatting with mistral-7b-instruct-v0 Type 'exit' or 'quit' to exit Type '!multi' to enter multiple lines, then '!end' to finish > ``` ## Using a system prompt You can use the `-s/--system` option to set a system prompt, providing instructions for processing other input to the tool. 
To describe how the code in a file works, try this: ```bash cat mycode.py | llm -s "Explain this code" ``` ## Help For help, run: llm --help You can also use: python -m llm --help ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761919.0 llm-0.23/llm.egg-info/SOURCES.txt0000644000175100001660000000072514760365477016016 0ustar00runnerdockerLICENSE MANIFEST.in README.md setup.py llm/__init__.py llm/__main__.py llm/cli.py llm/embeddings.py llm/embeddings_migrations.py llm/errors.py llm/hookspecs.py llm/migrations.py llm/models.py llm/plugins.py llm/templates.py llm/utils.py llm.egg-info/PKG-INFO llm.egg-info/SOURCES.txt llm.egg-info/dependency_links.txt llm.egg-info/entry_points.txt llm.egg-info/requires.txt llm.egg-info/top_level.txt llm/default_plugins/__init__.py llm/default_plugins/openai_models.py././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761919.0 llm-0.23/llm.egg-info/dependency_links.txt0000644000175100001660000000000114760365477020174 0ustar00runnerdocker ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761919.0 llm-0.23/llm.egg-info/entry_points.txt0000644000175100001660000000004414760365477017422 0ustar00runnerdocker[console_scripts] llm = llm.cli:cli ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761919.0 llm-0.23/llm.egg-info/requires.txt0000644000175100001660000000051614760365477016530 0ustar00runnerdockerclick openai>=1.55.3 click-default-group>=1.2.3 sqlite-utils>=3.37 sqlite-migrate>=0.1a2 pydantic>=2.0.0 PyYAML pluggy python-ulid setuptools pip puremagic [:sys_platform == "win32"] pyreadline3 [test] pytest numpy pytest-httpx>=0.33.0 pytest-asyncio cogapp mypy>=1.10.0 black>=25.1.0 ruff types-click types-PyYAML types-setuptools ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761919.0 llm-0.23/llm.egg-info/top_level.txt0000644000175100001660000000000414760365477016652 0ustar00runnerdockerllm ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1740761919.8178308 llm-0.23/setup.cfg0000644000175100001660000000004614760365500013454 0ustar00runnerdocker[egg_info] tag_build = tag_date = 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1740761914.0 llm-0.23/setup.py0000644000175100001660000000347214760365472013363 0ustar00runnerdockerfrom setuptools import setup, find_packages import os VERSION = "0.23" def get_long_description(): with open( os.path.join(os.path.dirname(os.path.abspath(__file__)), "README.md"), encoding="utf8", ) as fp: return fp.read() setup( name="llm", description=( "CLI utility and Python library for interacting with Large Language Models from " "organizations like OpenAI, Anthropic and Gemini plus local models installed on your own machine." 
), long_description=get_long_description(), long_description_content_type="text/markdown", author="Simon Willison", url="https://github.com/simonw/llm", project_urls={ "Documentation": "https://llm.datasette.io/", "Issues": "https://github.com/simonw/llm/issues", "CI": "https://github.com/simonw/llm/actions", "Changelog": "https://github.com/simonw/llm/releases", }, license="Apache License, Version 2.0", version=VERSION, packages=find_packages(), entry_points=""" [console_scripts] llm=llm.cli:cli """, install_requires=[ "click", "openai>=1.55.3", "click-default-group>=1.2.3", "sqlite-utils>=3.37", "sqlite-migrate>=0.1a2", "pydantic>=2.0.0", "PyYAML", "pluggy", "python-ulid", "setuptools", "pip", "pyreadline3; sys_platform == 'win32'", "puremagic", ], extras_require={ "test": [ "pytest", "numpy", "pytest-httpx>=0.33.0", "pytest-asyncio", "cogapp", "mypy>=1.10.0", "black>=25.1.0", "ruff", "types-click", "types-PyYAML", "types-setuptools", ] }, python_requires=">=3.9", )