Jonathan's blog

Ollama with Llama3 and Code Interpreter

Jonathan Adly — Fri, 26 Apr 2024 18:53:07 GMT

I try to run an experiment once a week with open-source LLMs. This week experiment was using Llama3 via Ollama and AgentRun to have an open-source, 100% local Code Interpreter.

The idea is, give an LLM a query that is better answered via code execution instead of its training. Run the code in AgentRun, then return the answer to the user. It is more or less a proof of concept, that can be expanded on with additional tools that an LLM can use.

For this experiment, I had Ollama installed and running as well as the AgentRun API. My goal was use code generated by an LLM it to answer some questions that normally an LLM would struggle with. Like, what is 12345 * 54321? Or what is the largest prime number under 1000?

The full code is available here: https://jonathan-adly.github.io/AgentRun/examples/ollama_llama3/

Step 1: Setting Up

If you don't have Ollama installed, first install it from here. Then, run a test query to make sure everything is working.

curl -X POST http://localhost:11434/api/generate -d '{  "model": "llama3",  "prompt":"What is 1+1?" }'

Next, install AgentRun and have its REST API running. You will need docker installed to use docker-compose.

git clone https://github.com/Jonathan-Adly/agentruncd agentrun/agentrun-apicp .env.example .env.devdocker-compose up -d --build

And again, let's make a test request to make sure everything is running correctly.

curl -X GET http://localhost:8000/v1/health/# {"status":"ok"}

Next, we will run a Python script that will be our starting point to run queries against Llama3 with Agentrun.

python -m venv agentrun-venv# windows: .\agentrun-venv\Scripts\activatesource agentrun-venv/bin/activate# windows: New-Item main.py -type filetouch main.pypip install requests json_repair

In the file, we will start off by importing the necessary libraries. We'll need json for handling data and requests for making HTTP calls. Were also using a cool library called json_repair just in case our JSON data decides to act up and we need to fix it on the fly. This is especially the case if use 8B version of Llama3 where the JSON sometimes is slightly broken.

import jsonimport json_repairimport requests

Step 2: Define the Function & Tools

We've crafted a simple function execute_python_code. This function is pretty straightforwardit sends a Python code snippet to a code execution environment provided by AgentRun and fetches the output.

Here's a quick peek at how this works:

def execute_python_code(code: str) -> str:    code = json.dumps({"code": code})    response = requests.post(        "http://localhost:8000/v1/run/",        data=code,        headers={"Content-Type": "application/json"},    )    print(code)        output = response.json()["output"]    return output

We basically format the code snippet into JSON, send it off to our localhost where the magic happens, and get back the result. You can read more about how AgentRun works here.

Next, we would use this function as our basis for defining the tool that we want Llama3 to use. Here is what this looks like.

tools = [    {        "type": "function",        "function": {            "name": "execute_python_code",            "description": """Sends a python code snippet to the code execution environment and returns the output.             The code execution environment can automatically import any library or package by importing.             The code snippet to execute must be a valid python code and must use print() to output the result.""",            "parameters": {                "type": "object",                "properties": {                    "code": {                        "type": "string",                        "description": "The code snippet to execute. Must be a valid python code. Must use print() to output the result.",                    },                },                "required": ["code"],            },        },    },]

Lastly, we will set up our model here. We can use the base Llama3 or any of the finetunes provided by the community. For the sake of experimentations, I ran my experiment using Dolphin-llama3 8b finetune.

# Ollama dolphin-llama3 page: https://ollama.com/library/dolphin-llama3MODEL = "dolphin-llama3"

Step 3: The Integration with Ollama and Llama3

Moving on to the cooler elementintegration with the Ollama and Llama3.

Heres a how the query processing and tool selection works:

def generate_full_completion(prompt: str, model: str = MODEL) -> dict[str, str]:    # setting up the parameters including our model    params = {        "model": model,        "prompt": prompt,        "stream": False,        # seed and temperature for deterministic output        "temperature": 0,        "seed": 123,        # format is JSON, since we are interested in tools/function calling        "format": "json",    }    # making the post request and handling responses    try:        response = requests.post(            f"http://localhost:11434/api/generate",            headers={"Content-Type": "application/json"},            data=json.dumps(params),            timeout=60,        )        return json_repair.loads(response.text)    except requests.RequestException as err:        return {"error": f"API call error: {str(err)}"}

Step 4: Putting It All Together

Now, that we have everything setup. We will simply use a prompt to nudge the model toward using our execute_python_code tool for its outputs.

def get_answer(query: str) -> str:    functions_prompt = f"""        You have access to the following tools:            {tools}        You must follow these instructions:        If a user query requires a tool, you must select the appropriate tool from the list of tools provided.        Always select one or more of the above tools based on the user query        If a tool is found, you must respond in the JSON format matching the following schema:        {{        "tools": {{            "tool": "",            "tool_input": {query}        """    r_dict = generate_full_completion(functions_prompt)    r_tools = json_repair.loads(r_dict["response"])["tools"]    code = r_tools["tool_input"]["code"]    response = execute_python_code(code)    return response

Finally, when you feed it a query like "what's the 12312 * 321?" the whole system whirls into action, the model figures out which tool and code snippet to use, executes it, and bam! You've got your answer.

Just to Show Off

Lets see it in action with a couple of examples:

# 3952152print(get_answer("what's 12312 *321?"))# 500print(get_answer("how many even numbers are there between 1 and 1000?"))# Parisprint(get_answer("what's the capital of France?"))

We're blending advanced model integration with practical code execution. Whether you're automating tasks, building out a project, or just playing around to see the capabilities, this setup might just be your next go-to.

And, there you goa delightful mix of Python, APIs, and some AI magic to streamline how you handle and execute code snippets. As always, tweak, tinker, and tailor it to your needs. Happy coding, everyone!

Open Sourcing a Python Project the Right Way in 2024

Jonathan Adly — Thu, 18 Apr 2024 02:19:42 GMT

Every Python developer I've talked to has written some code that others would find useful. At the same time, they've all spent days, if not longer, wrestling with the tooling and packaging that comes with the language. My aim with this article is to simplify the process of open-sourcing your Python code as much as possible. By the time you finish reading, you'll know how to take your existing code base and turn it into an open source project that's easy to use and contribute to.

This was inspired by:

Trying to open-source my package AgentRun and struggling for a few days with Python tooling
Jeff Knupp excellent article on how to open source a Python project hitting the 10+ year mark and needing an update
Simon Willison updated cookiecutter tool python-lib that did most of the heavy lifting with the pain of automating the publishing process to PyPI.

Tools and Concepts

When you're gearing up to open source a Python project, there are a handful of tools and concepts that really come in handy. I'm going to walk you through some of the essentials that I've found invaluable. Keep in mind, my recommendations are based on personal experience, so they might be a bit subjective.

Let's break it down:

Project Layout: How to structure your files.
The pyproject.toml File: This is crucial for project settings.
Pytest: For all your testing needs.
GitHub Actions: Automate workflows directly from your GitHub repository.
MkDocs: For awesome documentation.
PyPI Trusted Publishing: Get your package out there easily.
Cookiecutter: A lifesaver for starting projects quickly.
Recipe: Step-by-step guide to get you rolling.

If you're already familiar with these tools, feel free to jump straight to the Recipe section where you can follow the practical steps to get your project live.

Project Layout

When you're open sourcing a Python project, the way you organize your project layout is crucial. It's often the first thing potential contributors or users notice. A cluttered or confusing structure can be overwhelming for newcomers, so it's essential to get it right from the start.

Every project should include at least three key directories:

A docs directory for all your project documentation.
A directory named after your project where the actual Python package lives.
A tests directory to hold all your test files.

On top of these, you'll typically have several important top-level files like LICENSE, README.md, and possibly a few others. However, it's wise to keep the number of top-level files to a minimum. To give you a clearer picture, here's a simplified snapshot of the layout for one of my projects, AgentRun.

 .github    workflows        publish.yml        test.yml .gitignore LICENSE README.md agentrun    __init__.py agentrun-api docs mkdocs.yml pyproject.toml tests     test_agentrun.py

The `pyproject.toml` File

The pyproject.toml file is a configuration file for Python projects, standardized by PEP 518. It specifies build system requirements and can be used to configure tools like black, isort, and pytest and many others. This file is crucial for modern Python packaging and dependency management.

In the context of open-sourcing a Python package, pyproject.toml plays several roles:

Dependency Specification: It declares build dependencies required to compile your package from source. This ensures that anyone trying to build your package from the source will have the right tools installed automatically.
Package Metadata: It can include metadata about your package such as name, version, authors, and more. This information is essential for package distribution and maintenance.
Tool Configuration: It allows you to configure various tools used during development in a single, standardized file. For example, settings for formatters, linters, and test frameworks can be specified here, ensuring consistency across environments.
Build System Declaration: It declares which build backend (like setuptools, flit, or poetry) is used to build your package. This is crucial for reproducibility and compatibility in different environments.

The difficult thing about this part is that you have many options and configurations you can choose from. In the beginning I recommend to keep it simple and gradually add more tools as you are get more comfortable.

Here is a minimal pyproject.toml similar to the one generated by python-lib that I recommend to start with.

[project]name = "Your project name"version = "0.1"description = "Your project description"readme = "README.md"requires-python = ">=3.8"authors = [{name = "your name"}]license = {text = "the license choosen from the project"}classifiers = [    "License :: OSI Approved :: {{ the license name }} "]dependencies = [][build-system]requires = ["setuptools"]build-backend = "setuptools.build_meta"[project.urls]Homepage = "your github repo page"[project.optional-dependencies]test = ["pytest", "mkdocs"]

This pyproject.toml file doesn't have any dependencies and only pytest and mkdocs for the test version of the package. So, this is just a starting point and would need to be adjusted depending on your own project needs.

Pytest

When publishing an open-source package, it's good to include some tests. This not only invites contributors to your project but also reassures users about the reliability of your package. Without tests, there's a risk that contributors might unintentionally break existing features, and potential users might hesitate to depend on your software.

Pytest is one of the most popular Python testing framework and I recommend starting from the beginning with it.

Let's imagine you have a package that simply increments a number by 1. Here is the code in your package/__init__.py :

def inc(x):    return x + 1

Then under your tests/ directory. You should have a file test_inc.py that uses Pytest to test your code.

from package import incdef test_inc():    assert inc(3) == 4

To run your tests, simply type pytest in the terminal. Pytest will execute all your tests and report back whether they passed or failed. This immediate feedback loop is invaluable for maintaining a robust codebase.

Github Actions

When you're planning to share and collaborate on a project, having a solid CI/CD setup is important. Continuous integration (CI) refers to the practice of automatically integrating and testing code changes into a shared source code repository without breaking anything. Continuous Delivery, automates the release of validated code to a repository following the tests that happen in CI.

At a minimum, you want your code tested automatically every time someone pushes a new change. There are many vendors and tools that can do that, but I like Github Actions. Github is by far the most popular platform to store and share code and having native CI/CD in the same place as your code makes things easier.

You store all your Github actions in a directory called .github/workflows . Here is a simple github action that tests your code every time someone's open a Pull Request or pushes code.

name: Teston: [push, pull_request]permissions:  contents: readjobs:  test:    runs-on: ubuntu-latest    strategy:      matrix:        python-version: ["3.10"]    steps:    - uses: actions/checkout@v4    - name: Set up Python ${{ matrix.python-version }}      uses: actions/setup-python@v5      with:        python-version: ${{ matrix.python-version }}        cache: pip        cache-dependency-path: pyproject.toml    - name: Install dependencies      run: |        pip install '.[test]'    - name: Run tests      run: |        pytest

Mkdocs

MkDocs is a fast, and simple static site generator that's geared towards building project documentation. Documentation source files are written in Markdown, and configured with a single YAML configuration file - mkdocs.yml.

There are many plugins, themes, and automation recipes that helps with documenting your package. But to start, you really need only 2 things.

mkdocs.yml configuration file
index.md file under the docs/ directory

The mkdocs.yml only needs a site name and a site url to be valid. Here is a minimal example:

site_name: My Documentationsite_url: https://example.com

PyPI Trusted Publishing

Your code should now be ready for you to build and distribute to PyPI. The process is quite simple with PyPI Trusted publishing mechanism. You need to sign into PyPI and create a new pending publisher. Here is what that looks like.

Trusted publishing is essentially you allowing your GitHub repository your-name/your-package to publish packages with the name your-package via a github action.

The next step is to create a publish.yml in your .github/workflows that uses Github Action to publish your code to PyPI automatically whenever you create a release.

name: Publish Python Packageon:  release:    types: [created]permissions:  contents: readjobs:  test:    runs-on: ubuntu-latest    strategy:      matrix:        python-version: ["3.10"]    steps:    - uses: actions/checkout@v4    - name: Set up Python ${{ matrix.python-version }}      uses: actions/setup-python@v5      with:        python-version: ${{ matrix.python-version }}        cache: pip        cache-dependency-path: pyproject.toml    - name: Install dependencies      run: |        pip install '.[test]'    - name: Run tests      run: |        pytest  deploy:    runs-on: ubuntu-latest    needs: [test]    environment: release    permissions:      id-token: write    steps:    - uses: actions/checkout@v4    - name: Set up Python      uses: actions/setup-python@v5      with:        python-version: "3.12"        cache: pip        cache-dependency-path: pyproject.toml    - name: Install dependencies      run: |        pip install setuptools wheel build    - name: Build      run: |        python -m build    - name: Publish      uses: pypa/gh-action-pypi-publish@release/v1

Let's simplify this file and see what's happening:

Essentially, we have two main tasks here: testing and deploying.
In the deployment section, we're tackling a couple of key activities. First off, we install the necessary tools for packagingthese include setuptools, wheel, and build. Next, we use Pythons build module to actually build the package. Finally, we publish the package to PyPI using the pypa/gh-action-pypi-publish@release/v1 action.

So, in a nutshell, we're testing the code, building it, and then pushing it up to PyPI.

Cookiecutter

Cookiecutter is a command line tool that automates the process of starting a project. Instead of going through all these steps manually, you can just run cookiecutter and the tool will generate all the necessary files and directory structure.

For tools discussed, I tried to follow python-lib structure and recommendations (only adding mkdocs as an extra dependency). So, to achieve the same results you can:

Install cookiecutter - pipx install cookiecutter
Run it - cookiecutter gh:simonw/python-lib
Optional: Add mkdocs as discussed above.

There are many cookiecutter templates around with all kind of different options. The truth is, you really want a minimal well-maintained one.

Cookiecutter is great as a starting point, but relying completely on it without understanding all the tools inside can be painful when things go wrong. That's why I recommend python-lib which gives you the minimum tools you need to publish your package and nothing else.

Recipe

Start by creating a PyPI account if you don't already have one. Check to ensure the package name you want isnt already taken.
Next, set up a new GitHub repository with your package name.
Install Cookiecutter if you haven't already
pipx install cookiecutter
Run Cookiecutter and answer the prompts to generate your project skeleton
cookiecutter gh:simonw/python-lib

Create a virtual environment in your project directory and activate it.

 cd my-new-library # Create and activate a virtual environment: python3 -mvenv venv source venv/bin/activate # Install dependencies so you can edit the project: pip install -e '.[test]' # With zsh you have to run this again for some reason: source venv/bin/activate # test the example function pytest

Initialize a Git repository, commit the initial project structure, and push it to GitHub

 git init git add -A git commit -m "Initial structure from template" git remote add origin https://github.com/{{github_name}}/{{repo_name}}.git git push -u origin main

Check the GitHub Actions tab in your repository to confirm that the test workflow is running. This setup ensures your code is tested with every push.
On your local machine, create a 'develop' branch for ongoing development.
```
 git checkout -b "develop"
```

Add MkDocs to your development dependencies in the pyproject.toml file.

 # nothing else changed [project.optional-dependencies] test = ["pytest", "mkdocs"]

Create a docs directory and add index.md in there. You can copy the info in the README.md to index.md for now.

Create a mkdocs.yml file. Here is a minimal example.

site_name:  documentationsite_url: https://.github.io/-docs/

Install mkdocs and deploy a documentation site.

# update the dependenciespip install -e '.[test]'# test and adjust the documentations as neededmkdocs serve# deploy your documentations to github pagesmkdocs gh-deploy# Add site/ to your .gitignore fileecho "site/" >> .gitignore

When you are happy with the changes, commit your documentation changes to the 'develop' branch and push
```
git add -Agit commit -m "documentation"git push -u origin develop
```

For new features, create a feature branch, add your code, and then merge it back into 'develop' when ready

git checkout -b "new-feature"# work on your code until donegit add -Agit commit -m "feature complete"git checkout develop# now we merge the completed featuregit merge "new-feature"

When ready to release, merge 'develop' into 'main', pushing the changes to GitHub
💡
Don't forget to upgrade your version in pyproject.toml every time you merge develop into main after the initial release.
```
git checkout maingit merge developgit push
```
On Github - create a new release for your package:
On your repositorys main page, click on "Releases" which is typically found on the right side of the sub-navigation menu under the code tab.

Using GPT-4 Over Email

Jonathan Adly — Mon, 15 Apr 2024 19:33:39 GMT

I recently came across a tweet from a founder in my network, who had an interesting question on Twitter: "Why can't you email an assistant and just a get a response without any hassle?" Around the same time, I noticed a discussion on r/residency about GPT-4 being blocked across various hospital networks.

A few days later, I discovered that GPT-4 access is restricted in several countries, and private usage is quite limited. This prompted me to revisit some old code I had, which was designed to handle email webhooks for incoming messages. I decided to adapt this code to create a solution for these issues because it would be cool.

Thus, Assistant-OverMail was bornan open-source project aimed at simplifying access to GPT-4 via email.

Assistant-OverMail

Assistant-OverMail is an open-source web application designed to simplify your workflow by emailing assistant@overmail.ai (or any domain name if you self-host) and getting a response back from GPT-4.

You can use this to use GPT-4 privately, go around employers' blocks, or simply as a short-cut to deal with emails in the inbox. You can try it at: overmail.ai

For those interested in self-hosting, the setup requires Docker, Docker Compose, and a standard Nginx + SSL configuration. Assistant-OverMail is compatible with any email service provider that supports SMTP, such as Mailgun, SendGrid, or AWS SES. Just provide your SMTP credentials in the .env file to get started.

How does it work?

The main business logic is really around emails webhooks. All mail providers as far as I know provide a way to send you webhooks on receiving emails that match certain rules. Here is an example from my mailgun dashboard.

In our case, whenever an email is sent to "assistant@overmail.ai", mailgun forwards this email as a form POST to a designated endpoint. This endpoint performs several checks before passing the email content to GPT-4 for processing. After generating a response, the system utilizes Django's built-in email capabilities to send this response back to the sender.

Parsing Email Addresses

One new thing I learned from trying to improve my old code is Python's parseaddr. My old code had a big block of code handling all kind of different emails formatting on the FROM field with if statements.

The parseaddr function from Python's email.utils module is used to parse a string into its constituent name and email address parts. It takes a single string argument, which typically represents an email address with an optional display name, and returns a tuple of the form (realname, email_address). Here's a quick breakdown:

Input: A string in the format "Display Name <email@example.com>" or just "email@example.com".
Output: A tuple where the first element is the display name (if provided, otherwise an empty string), and the second element is the email address.

The function is quite handy for extracting email addresses and names from headers or similar formats in emails where such data may be presented in a more human-readable form.

Payments

Stripe checkouts are relatively straightforward. You generate a URL on the server from your Stripe key and the Price id, and redirect your user to that URL. They pay on Stripe infrastructure, and you get notified via a webhook on success.

I did want to experiment with Bitcoin payments though as something new to learn. This feature, ended up taking almost a week - mostly because I didn't know what I was doing (but now I do). It really deserves its own post, but at a glance - there are three ways you can accept Bitcoin payments over the internet, ranked by effort:

Via BTCPay server on your own infrastructure
Custodial services APIs (Alby or Strike for example).
Via a payment processor on their infrastructure

Since this is a small project that I wasn't looking forward to maintain a server for or build a custom checkout experience on top of Alby or Strike APIs- I ended up going with #3.

OpenNode is a Stripe-like service that essentially offers the same type of workflow for Bitcoin. A checkout URL is generated on the server, and the user is redirected there. They pay on OpenNode infrastructure, and you receive a webhook on success.

Frontend

This is a quite simple project, and so I decided to go with a Carrd template for it. All what was needed is really a form to submit an email address too, and everything else is either payment on someone's else infrastructure or a webhook.

Alternatives

There are a few Zaps from Zapier that offers Email to GPT-4 integration. There is also at least 1 extension that works in the browser to autofill email responses.

I consider this project complete and overall a success. You can find the completed project here: https://overmail.ai and the code - MIT licensed here: https://github.com/Jonathan-Adly/assistant-overmail/

How to start a Python project in 2024

Jonathan Adly — Mon, 15 Apr 2024 01:53:56 GMT

When I begin a new Python project, one of the first steps I take is to create a virtual environment. This is crucial for managing dependencies and ensuring that the libraries used in my project do not conflict with those of other projects or the system itself. Heres how I set up a Python project using a virtual environment.

Step 1: Create a New Project Directory

I start by creating a new directory for my project and navigate into it:

mkdir my_projectcd my_project

Step 2: Initialize a Git Repository

Next, I initialize a git repository to manage version control:

git init

Step 3: Create the Virtual Environment

I use the venv module to create a virtual environment in my project directory, naming it after my project followed by -venv to maintain clarity:

python -m venv myproject-venv

This command creates a directory named myproject-venv where the virtual environment files are stored.

Step 4: Activate the Virtual Environment

Before installing any packages, I activate the virtual environment. The method differs slightly depending on the operating system:

On macOS and Linux:
```
  source myproject-venv/bin/activate
```

On Windows:
```
  .\myproject-venv\Scripts\activate
```

The prompt in the shell changes to show the name of the environment, indicating that it is now active.

Step 5: Install Required Packages

With the virtual environment activated, I install the necessary packages using pip:

pip install

For instance, to install Flask:

pip install flask

Step 6: Save Dependencies

Its good practice to keep track of the project's dependencies. I do this by creating a requirements.txt file:

pip freeze > requirements.txt

This file is crucial for replicating the environment on other machines or by other developers working on the project.

Step 7: Start Coding

Now, I can start coding my project. I create Python scripts in the project directory and run them using the Python interpreter that is part of my virtual environment.

Step 8: Deactivate the Virtual Environment

When I'm done working, I deactivate the virtual environment by running:

deactivate

This workflow helps me maintain a clean and organized working environment and makes it easier to manage project-specific dependencies. By following these steps, I ensure that each of my Python projects is set up correctly and ready for development.