ChatGPT API Python Tutorial: From First Call to Production

Introduction

The ChatGPT API (officially the OpenAI API) is one of the most direct ways to add AI capability to your Python projects: chatbots, summarizers, classifiers, code assistants, agents. The hard part isn't making a single call. It's understanding the patterns that turn that one call into a production-ready app.

In this tutorial, we'll build four working examples, each more advanced than the last:

1. Your first API call
2. Streaming responses in real time
3. Multi-turn conversations with memory
4. Function calling for tool-using agents

We'll finish with a complete CLI assistant you can actually use.

Setup: Install and Authenticate

First, install the official OpenAI Python SDK:

pip install openai python-dotenv

Get your API key from platform.openai.com/api-keys. Never commit it to git: store it in a .env file at the root of your project and add .env to your .gitignore:

OPENAI_API_KEY=sk-proj-your-key-here

Load it once in your script and you're authenticated:

import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

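That client object is all you need for the rest of this tutorial. The later examples create it with a bare OpenAI(), which reads OPENAI_API_KEY from the environment on its own, so either export the variable in your shell or keep the load_dotenv() call at the top of each script.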
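A missing key otherwise surfaces as a bare KeyError or an authentication failure on the first request. As a minimal sketch, a small check at startup can fail with a clearer message instead. The helper name require_api_key is hypothetical and not part of the SDK:

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ,
                    name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from `env`, or raise with an actionable message."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return key


# Demo with a plain dict standing in for the real environment.
print(require_api_key({"OPENAI_API_KEY": "sk-proj-your-key-here"}))
```

In a real script you would call require_api_key() after load_dotenv() and pass the result to OpenAI(api_key=...).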

1. Your First API Call

The simplest possible call — send a prompt, get a response:

from openai import OpenAI

client = OpenAI()

def ask(prompt: str, model: str = "gpt-4o-mini") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are a concise, helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        temperature=0.7,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    answer = ask("Explain Python decorators in two sentences.")
    print(answer)

The three things to understand about every API call:

Parameter	What it does
`model`	Which model to use. `gpt-4o-mini` is cheap and fast; `gpt-4o` is smarter and more expensive.
`messages`	A list of role-tagged messages. `system` sets behavior, `user` is the prompt, `assistant` is past replies.
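`temperature`	Controls randomness on a 0–2 scale. Lower values are more focused and repeatable (though not strictly deterministic); higher values are more varied. Use 0–0.3 for factual tasks, 0.7+ for writing.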

That's the whole foundation. Everything else is variations on this.
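The response object carries more than the text. As a rough sketch, the helper below (describe_response is a hypothetical name) pulls out the finish reason and token usage, which are useful for spotting truncated answers and tracking cost. The fake response built with SimpleNamespace is only a stand-in so the example runs without a network call; a real response from client.chat.completions.create exposes the same attributes:

```python
from types import SimpleNamespace


def describe_response(response) -> dict:
    """Collect the reply text plus the metadata worth logging."""
    choice = response.choices[0]
    usage = response.usage
    return {
        "text": choice.message.content,
        # "stop" means a natural end; "length" means the token limit cut it off.
        "finish_reason": choice.finish_reason,
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "total_tokens": usage.total_tokens,
    }


# Stand-in object shaped like a chat completion response (assumption for the demo).
fake = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="Hi"),
                             finish_reason="stop")],
    usage=SimpleNamespace(prompt_tokens=12, completion_tokens=3, total_tokens=15),
)
print(describe_response(fake))
```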

2. Streaming Responses in Real Time

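Without streaming, users stare at a blank screen until the whole reply is ready, which can take several seconds for long answers. With streaming, tokens appear as they're generated, so the first words show up almost immediately even though the total generation time is unchanged: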

from openai import OpenAI

client = OpenAI()

def ask_streaming(prompt: str, model: str = "gpt-4o-mini"):
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )

    full_response = ""
    for chunk in stream:
        if not chunk.choices:  # some chunks (e.g. usage-only) carry no choices
            continue
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)
        full_response += delta

    print()  # final newline
    return full_response

if __name__ == "__main__":
    ask_streaming("Write a haiku about debugging Python at 3am.")

Two things to notice: stream=True in the request, and flush=True in print so output appears immediately instead of being buffered. Use streaming for any user-facing interface. Skip it for background jobs where you just need the full response.
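If the consumer is a web handler or a GUI rather than print, it can help to separate reading the stream from displaying it. One possible shape, sketched here with a hypothetical iter_text generator, yields only the text pieces and skips chunks that have no choices or no content. The fake chunks are stand-ins so the example runs offline:

```python
from types import SimpleNamespace
from typing import Iterable, Iterator


def iter_text(stream: Iterable) -> Iterator[str]:
    """Yield the non-empty text deltas from a chat completion stream."""
    for chunk in stream:
        if not chunk.choices:
            continue  # nothing to display in this chunk
        delta = chunk.choices[0].delta.content
        if delta:
            yield delta


def _fake_chunk(text):
    # Stand-in shaped like a streaming chunk (assumption for the demo).
    return SimpleNamespace(
        choices=[SimpleNamespace(delta=SimpleNamespace(content=text))]
    )


fake_stream = [
    _fake_chunk("Hello"),
    SimpleNamespace(choices=[]),  # chunk without choices
    _fake_chunk(None),            # chunk without text
    _fake_chunk(", world"),
]
text = "".join(iter_text(fake_stream))
print(text)
```

With a real stream, the caller decides what to do with each piece: print it, push it over a websocket, or append it to a buffer.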

3. Multi-Turn Conversations With Memory

The ChatGPT API is stateless — it does not remember previous messages. You remember them, by passing the entire conversation history with every call:

from openai import OpenAI

client = OpenAI()

class Chatbot:
    def __init__(self, system_prompt: str, model: str = "gpt-4o-mini"):
        self.model = model
        self.messages = [{"role": "system", "content": system_prompt}]

    def send(self, user_message: str) -> str:
        self.messages.append({"role": "user", "content": user_message})

        response = client.chat.completions.create(
            model=self.model,
            messages=self.messages,
        )
        reply = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": reply})

        return reply

    def reset(self):
        self.messages = self.messages[:1]  # keep only system prompt


if __name__ == "__main__":
    bot = Chatbot("You are a friendly Python tutor.")

    print(bot.send("What is a list comprehension?"))
    print(bot.send("Show me an example with filtering."))
    print(bot.send("Now rewrite it as a for-loop."))

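The bot now "remembers": each call sees the full history. Be careful with long conversations. Every message is re-sent on every turn, so each request grows linearly with the length of the chat, and the total tokens you pay for over the whole conversation grow roughly quadratically with the number of turns. Long histories will also eventually exceed the model's context window. For long chats, summarize older messages or truncate to the last N turns.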
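Truncating to the last N turns can be sketched as follows. The function name trim_history and the definition of a "turn" (one user message plus whatever followed it) are assumptions for this example. Cutting at a user message keeps the trimmed history from starting with an orphaned assistant reply:

```python
def trim_history(messages: list[dict], max_turns: int) -> list[dict]:
    """Keep system messages plus the last `max_turns` user turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if max_turns <= 0:
        return system

    # A turn starts at each user message.
    user_indexes = [i for i, m in enumerate(rest) if m["role"] == "user"]
    if len(user_indexes) <= max_turns:
        return system + rest

    start = user_indexes[-max_turns]
    return system + rest[start:]


history = [
    {"role": "system", "content": "You are a friendly Python tutor."},
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "q3"},
    {"role": "assistant", "content": "a3"},
]
trimmed = trim_history(history, max_turns=2)
print([m["content"] for m in trimmed])
```

In the Chatbot class above, you could pass trim_history(self.messages, N) to the API call instead of self.messages, at the cost of the model forgetting anything older than N turns.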

4. Function Calling: Let the Model Use Your Code

This is where the API stops being a chatbot and starts being an agent. You define Python functions; the model decides when to call them and with what arguments:

import json
from openai import OpenAI

client = OpenAI()

# 1. Define the actual Python functions
def get_weather(city: str) -> dict:
    # In real code, call a weather API. Here, just mock it.
    return {"city": city, "temp_c": 22, "condition": "sunny"}

def calculate(expression: str) -> float:
    # WARNING: eval is unsafe for untrusted input, even with empty builtins. Use a real parser.
    return eval(expression, {"__builtins__": {}}, {})

# 2. Describe them to the model as tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate a math expression like '23 * 4 + 9'.",
            "parameters": {
                "type": "object",
                "properties": {"expression": {"type": "string"}},
                "required": ["expression"],
            },
        },
    },
]

# 3. Map names back to real functions
AVAILABLE_FUNCTIONS = {"get_weather": get_weather, "calculate": calculate}

def run_agent(user_message: str, model: str = "gpt-4o-mini") -> str:
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.chat.completions.create(
            model=model, messages=messages, tools=tools,
        )
        msg = response.choices[0].message
        messages.append(msg)

        if not msg.tool_calls:
            return msg.content  # done

        # Execute each tool call the model asked for
        for call in msg.tool_calls:
            fn = AVAILABLE_FUNCTIONS[call.function.name]
            args = json.loads(call.function.arguments)
            result = fn(**args)
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(result),
            })

if __name__ == "__main__":
    print(run_agent("What's the weather in Lahore, and what's 12% of 850?"))
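The calculate tool above relies on eval, which the comment already flags as unsafe. One way to replace it, shown here as a sketch rather than a complete math parser, is to walk the expression's syntax tree with the standard ast module and allow only numbers and basic arithmetic. The name safe_calculate is hypothetical, and exponentiation is left out on purpose because huge powers can hang the process:

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> float:
    """Evaluate +, -, *, /, % on numeric literals without using eval."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](_eval(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")

    return float(_eval(ast.parse(expression, mode="eval")))


print(safe_calculate("23 * 4 + 9"))
```

Names, function calls, and attribute access all raise ValueError, so an input such as __import__('os') never runs.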

The flow: model receives the question → decides it needs get_weather("Lahore") and calculate("0.12 * 850") → you run them → send results back → model writes the final answer. Most "AI agent" frameworks are built around this same loop.
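The loop above assumes the model always names a known tool and sends valid JSON arguments. A more defensive dispatch step, sketched below under the assumption that errors should be reported back to the model rather than crash the agent, might look like this. The helper name execute_tool_call is hypothetical:

```python
import json


def execute_tool_call(name: str, raw_arguments: str, registry: dict) -> str:
    """Run one tool call and always return a JSON string for the tool message."""
    fn = registry.get(name)
    if fn is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(raw_arguments or "{}")
        result = fn(**args)
    except Exception as exc:  # bad JSON, wrong arguments, or a failing tool
        return json.dumps({"error": f"{type(exc).__name__}: {exc}"})
    # default=str keeps json.dumps from failing on values like datetimes.
    return json.dumps(result, default=str)


# Demo registry with a mocked weather tool.
registry = {"get_weather": lambda city: {"city": city, "temp_c": 22}}
print(execute_tool_call("get_weather", '{"city": "Lahore"}', registry))
print(execute_tool_call("launch_rocket", "{}", registry))
```

Inside run_agent, the returned string would go into the "content" field of the tool message. It is also worth capping the number of loop iterations so a confused model cannot call tools forever.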

Putting It All Together: A CLI Assistant

Here's a complete, working terminal assistant that combines streaming, memory, and the chat interface:

from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()
client = OpenAI()

SYSTEM_PROMPT = """You are a senior Python developer.
Answer questions clearly, give code examples, and be honest
when you don't know something."""

def run_assistant():
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    print("Python Assistant — type 'exit' to quit, 'reset' to clear context.\n")

    while True:
        try:
            user_input = input("you ▸ ").strip()
        except (EOFError, KeyboardInterrupt):
            print()
            break

        if user_input.lower() == "exit":
            break
        if user_input.lower() == "reset":
            messages = messages[:1]
            print("(context cleared)\n")
            continue
        if not user_input:
            continue

        messages.append({"role": "user", "content": user_input})

        print("ai  ▸ ", end="", flush=True)
        stream = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            stream=True,
        )

        reply = ""
        for chunk in stream:
            if not chunk.choices:  # some chunks (e.g. usage-only) carry no choices
                continue
            delta = chunk.choices[0].delta.content or ""
            print(delta, end="", flush=True)
            reply += delta
        print("\n")

        messages.append({"role": "assistant", "content": reply})

if __name__ == "__main__":
    run_assistant()

Save it as assistant.py, run python assistant.py, and you have a working AI assistant in your terminal. Under 50 lines of code.

Production Checklist

Before you ship anything using the ChatGPT API, run through this:

Concern	What to do
Secrets	Use environment variables, never hardcode keys
Cost	Set hard usage limits in the OpenAI dashboard
Errors	Wrap calls in `try/except` and retry with exponential backoff
Latency	Stream responses for any user-facing UI
Token limits	Truncate or summarize long conversation history
Safety	Validate user input; don't blindly `eval` model output
Logging	Log prompts + responses for debugging (scrub PII)

The official SDK has built-in retry logic, but you should still handle RateLimitError and APIError explicitly so failures are visible, not silent.
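Retrying with exponential backoff, as mentioned in the checklist, can be sketched in a few lines. This is a generic illustration, not the SDK's internal retry logic: the helper name call_with_backoff and its parameters are hypothetical, and the demo uses TimeoutError and a fake flaky function so it runs without a network call:

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on the given exceptions with growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the failure
            # 1s, 2s, 4s, ... plus random jitter so clients don't retry in sync.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            print(f"attempt {attempt} failed ({exc!r}); retrying in {delay:.1f}s")
            sleep(delay)
    raise RuntimeError("unreachable")


# Demo: a function that fails twice, then succeeds. Sleeping is stubbed out.
_state = {"calls": 0}


def _flaky() -> str:
    _state["calls"] += 1
    if _state["calls"] < 3:
        raise TimeoutError("temporary failure")
    return "ok"


print(call_with_backoff(_flaky, retry_on=(TimeoutError,), sleep=lambda s: None))
```

With the real client, you would likely pass something like retry_on=(openai.RateLimitError, openai.APIConnectionError) and wrap the ask function from earlier in a lambda. Since the SDK already retries some failures on its own, keep the total number of attempts modest.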

Install Everything

pip install openai python-dotenv

That's the entire stack. The OpenAI SDK gives you streaming, function calling, file uploads, vision, and embeddings out of the box.

Final Thought

The four patterns in this tutorial — single call, streaming, memory, and tool use — are the building blocks of every AI product you'll ever build. Chatbots, customer support agents, code reviewers, research assistants, internal tools: all of them are remixes of these four ideas.

The API itself takes ten minutes to learn. The interesting work is everything that wraps around it — prompt design, evaluation, cost control, and UX. Start with the CLI assistant above, then build the version that solves a real problem in your own workflow. That's how you actually learn this.