
ChatGPT API Python Tutorial: From First Call to Production

Build real applications with the OpenAI ChatGPT API in Python. Learn authentication, streaming, conversation memory, function calling, and how to ship a working CLI assistant.

May 23, 2026 · 11 min read

Introduction

The ChatGPT API is the easiest way to add real AI capability to your Python projects — chatbots, summarizers, classifiers, code assistants, agents. The hard part isn't making a single call. It's understanding the patterns that turn that one call into a production-ready app.

In this tutorial, we'll build four working examples, each more advanced than the last:

  1. Your first API call
  2. Streaming responses in real time
  3. Multi-turn conversations with memory
  4. Function calling for tool-using agents

We'll finish with a complete CLI assistant you can actually use.

Setup: Install and Authenticate

First, install the official OpenAI Python SDK:

pip install openai python-dotenv

Get your API key from platform.openai.com/api-keys. Never commit it to git. Store it in a .env file at the root of your project:

OPENAI_API_KEY=sk-proj-your-key-here

Load it once in your script and you're authenticated:

import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

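That client object is all you need for the rest of this tutorial. The examples below create it with OpenAI() and no arguments, which reads OPENAI_API_KEY from the environment; if your key lives only in .env, call load_dotenv() before creating the client.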
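A missing key otherwise surfaces as a bare KeyError, so a fail-fast check with a readable message can save some confusion. The following is a minimal sketch; require_api_key is a hypothetical helper for this tutorial, not part of the OpenAI SDK.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the OpenAI API key, or raise an error that says how to fix it.

    Hypothetical helper: the SDK does not provide this function.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Add it to your .env file "
            "and call load_dotenv() before creating the client."
        )
    return key
```

You would then create the client with OpenAI(api_key=require_api_key()) after calling load_dotenv().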

1. Your First API Call

The simplest possible call — send a prompt, get a response:

from openai import OpenAI

client = OpenAI()

def ask(prompt: str, model: str = "gpt-4o-mini") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "You are a concise, helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        temperature=0.7,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    answer = ask("Explain Python decorators in two sentences.")
    print(answer)

The three things to understand about every API call:

Parameter    What it does
model        Which model to use. gpt-4o-mini is cheap and fast; gpt-4o is smarter and more expensive.
messages     A list of role-tagged messages. system sets behavior, user is the prompt, assistant is past replies.
temperature  Controls randomness, from 0 to 2. 0 is the most repeatable (though not strictly deterministic); higher values give more varied output. Use 0–0.3 for factual tasks, 0.7+ for writing.

That's the whole foundation. Everything else is variations on this.

2. Streaming Responses in Real Time

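Without streaming, users stare at a blank screen until the whole response is ready, which can take several seconds. With streaming, tokens appear as they're generated. Total generation time is about the same, but the first words show up almost immediately, so the app feels much faster: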

from openai import OpenAI

client = OpenAI()

def ask_streaming(prompt: str, model: str = "gpt-4o-mini"):
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )

    full_response = ""
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks (e.g. a final usage chunk) carry no choices
        delta = chunk.choices[0].delta.content or ""
        print(delta, end="", flush=True)
        full_response += delta

    print()  # final newline
    return full_response

if __name__ == "__main__":
    ask_streaming("Write a haiku about debugging Python at 3am.")

Two things to notice: stream=True in the request, and flush=True in print so output appears immediately instead of being buffered. Use streaming for any user-facing interface. Skip it for background jobs where you just need the full response.
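The text-extraction part of a streaming loop can be pulled into a small generator, which also makes it testable without a network call. This is a sketch under two assumptions: chunks expose the same choices[0].delta.content attribute path used above, and a chunk may occasionally arrive with an empty choices list or no content. iter_text and fake_chunk are hypothetical names, not SDK functions.

```python
from types import SimpleNamespace
from typing import Iterable, Iterator, Optional


def iter_text(chunks: Iterable) -> Iterator[str]:
    """Yield the text pieces of a chat-completions stream.

    Skips chunks that have no choices and deltas that have no content.
    """
    for chunk in chunks:
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:
            yield piece


def fake_chunk(text: Optional[str]) -> SimpleNamespace:
    """Build a stand-in chunk with the same attribute path as a real one."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


if __name__ == "__main__":
    demo = [
        fake_chunk("Hel"),
        SimpleNamespace(choices=[]),  # chunk with no choices
        fake_chunk(None),             # delta with no text
        fake_chunk("lo"),
    ]
    print("".join(iter_text(demo)))
```

With a real stream you would write for piece in iter_text(stream): print(piece, end="", flush=True).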

3. Multi-Turn Conversations With Memory

The ChatGPT API is stateless — it does not remember previous messages. You remember them, by passing the entire conversation history with every call:

from openai import OpenAI

client = OpenAI()

class Chatbot:
    def __init__(self, system_prompt: str, model: str = "gpt-4o-mini"):
        self.model = model
        self.messages = [{"role": "system", "content": system_prompt}]

    def send(self, user_message: str) -> str:
        self.messages.append({"role": "user", "content": user_message})

        response = client.chat.completions.create(
            model=self.model,
            messages=self.messages,
        )
        reply = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": reply})

        return reply

    def reset(self):
        self.messages = self.messages[:1]  # keep only system prompt


if __name__ == "__main__":
    bot = Chatbot("You are a friendly Python tutor.")

    print(bot.send("What is a list comprehension?"))
    print(bot.send("Show me an example with filtering."))
    print(bot.send("Now rewrite it as a for-loop."))

The bot now "remembers": each call sees the full history. Be careful with long conversations. Every message is re-sent on every turn, so the input size of each request grows linearly with the conversation, and the total tokens you pay for across the whole chat grow roughly quadratically. For long chats, summarize older messages or truncate to the last N turns.
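Truncating to the last N turns can be sketched as follows. It assumes the history holds only system, user, and assistant messages, with the system prompt first; if your history contains tool calls, trimming must also keep each tool result together with the assistant message that requested it, which this sketch does not handle. trim_history is a hypothetical helper name.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(messages: List[Message], max_turns: int) -> List[Message]:
    """Keep the system prompt plus the last `max_turns` user turns.

    A turn starts at a user message and includes everything after it,
    so the trimmed history never begins with an orphaned assistant reply.
    Returns a new list; the input is not modified.
    """
    if max_turns < 1:
        raise ValueError("max_turns must be at least 1")

    has_system = bool(messages) and messages[0]["role"] == "system"
    system = messages[:1] if has_system else []
    rest = messages[len(system):]

    user_positions = [i for i, m in enumerate(rest) if m["role"] == "user"]
    if len(user_positions) <= max_turns:
        return system + rest

    start = user_positions[-max_turns]
    return system + rest[start:]
```

In the Chatbot class you could call self.messages = trim_history(self.messages, max_turns=10) right after appending the user message. The value 10 is only an illustration; the right limit depends on your model's context window and budget.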

4. Function Calling: Let the Model Use Your Code

This is where the API stops being a chatbot and starts being an agent. You define Python functions; the model decides when to call them and with what arguments:

import json
from openai import OpenAI

client = OpenAI()

# 1. Define the actual Python functions
def get_weather(city: str) -> dict:
    # In real code, call a weather API. Here, just mock it.
    return {"city": city, "temp_c": 22, "condition": "sunny"}

def calculate(expression: str) -> float:
    # WARNING: eval is unsafe for untrusted input. Use a real parser.
    return eval(expression, {"__builtins__": {}}, {})

# 2. Describe them to the model as tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate a math expression like '23 * 4 + 9'.",
            "parameters": {
                "type": "object",
                "properties": {"expression": {"type": "string"}},
                "required": ["expression"],
            },
        },
    },
]

# 3. Map names back to real functions
AVAILABLE_FUNCTIONS = {"get_weather": get_weather, "calculate": calculate}

def run_agent(user_message: str, model: str = "gpt-4o-mini") -> str:
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.chat.completions.create(
            model=model, messages=messages, tools=tools,
        )
        msg = response.choices[0].message
        messages.append(msg)

        if not msg.tool_calls:
            return msg.content  # done

        # Execute each tool call the model asked for
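        for call in msg.tool_calls:
            try:
                fn = AVAILABLE_FUNCTIONS[call.function.name]
                args = json.loads(call.function.arguments)
                result = fn(**args)
            except Exception as exc:
                # Unknown tool, bad JSON, or a failing function: report the
                # error to the model instead of crashing the loop
                result = {"error": f"{type(exc).__name__}: {exc}"}
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(result),
            })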

if __name__ == "__main__":
    print(run_agent("What's the weather in Lahore, and what's 12% of 850?"))

The flow: model receives the question → decides it needs get_weather("Lahore") and calculate("0.12 * 850") → you run them → send results back → model writes the final answer. This request, execute, respond loop is the core pattern behind most "AI agent" frameworks.
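The calculate tool above relies on eval, which stays risky even with builtins removed. One possible replacement is to parse the expression with Python's ast module and allow only number literals and basic arithmetic. This is a sketch, not a complete calculator: safe_calculate is a hypothetical name, it supports only +, -, *, / and parentheses, and it leaves exponentiation out on purpose because huge powers can hang the process.

```python
import ast
import operator

# Whitelist of allowed operators; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> float:
    """Evaluate basic arithmetic on numeric literals; reject everything else.

    Raises ValueError for unsupported syntax (names, calls, attributes),
    SyntaxError for malformed input, and ZeroDivisionError for x / 0.
    """

    def walk(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    tree = ast.parse(expression, mode="eval")
    return float(walk(tree.body))
```

To use it, point the "calculate" entry in AVAILABLE_FUNCTIONS at safe_calculate. Any error it raises should be caught in the tool loop and sent back to the model as the tool result, so the model can retry with a valid expression.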

Putting It All Together: A CLI Assistant

Here's a complete, working terminal assistant that combines streaming, memory, and the chat interface:

from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()
client = OpenAI()

SYSTEM_PROMPT = """You are a senior Python developer.
Answer questions clearly, give code examples, and be honest
when you don't know something."""

def run_assistant():
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    print("Python Assistant — type 'exit' to quit, 'reset' to clear context.\n")

    while True:
        try:
            user_input = input("you ▸ ").strip()
        except (EOFError, KeyboardInterrupt):
            print()
            break

        if user_input.lower() == "exit":
            break
        if user_input.lower() == "reset":
            messages = messages[:1]
            print("(context cleared)\n")
            continue
        if not user_input:
            continue

        messages.append({"role": "user", "content": user_input})

        print("ai  ▸ ", end="", flush=True)
        stream = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            stream=True,
        )

        reply = ""
        for chunk in stream:
            if not chunk.choices:
                continue  # some chunks (e.g. a final usage chunk) carry no choices
            delta = chunk.choices[0].delta.content or ""
            print(delta, end="", flush=True)
            reply += delta
        print("\n")

        messages.append({"role": "assistant", "content": reply})

if __name__ == "__main__":
    run_assistant()

Save it as assistant.py, run python assistant.py, and you have a working AI assistant in your terminal. About 50 lines of code.

Production Checklist

Before you ship anything using the ChatGPT API, run through this:

Concern       What to do
Secrets       Use environment variables, never hardcode keys
Cost          Set hard usage limits in the OpenAI dashboard
Errors        Wrap calls in try/except and retry with exponential backoff
Latency       Stream responses for any user-facing UI
Token limits  Truncate or summarize long conversation history
Safety        Validate user input; don't blindly eval model output
Logging       Log prompts + responses for debugging (scrub PII)

The official SDK has built-in retry logic, but you should still handle RateLimitError and APIError explicitly so failures are visible, not silent.
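If you want control over the retry policy yourself, exponential backoff can be sketched as a small wrapper. This is a generic illustration, not the SDK's own retry mechanism: with_backoff and its parameters are hypothetical names, and the delays (1s, 2s, 4s, plus up to 25% jitter) are illustrative defaults rather than recommended values.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_backoff(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on the given exceptions with exponential backoff.

    The last failure is re-raised so it stays visible to the caller.
    `sleep` is injectable so the function can be tested without waiting.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error
            # Delay doubles each attempt; jitter spreads out simultaneous retries
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.25))
```

With the ask function from the first example, a call might look like with_backoff(lambda: ask("Hi"), retry_on=(openai.RateLimitError, openai.APIConnectionError)), after import openai. Since the SDK already retries some failures on its own, you may want to lower the client's max_retries setting when wrapping calls this way so the two layers don't multiply.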

Install Everything

pip install openai python-dotenv

That's the entire stack. The OpenAI SDK gives you streaming, function calling, file uploads, vision, and embeddings out of the box.

Final Thought

The four patterns in this tutorial — single call, streaming, memory, and tool use — are the building blocks of every AI product you'll ever build. Chatbots, customer support agents, code reviewers, research assistants, internal tools: all of them are remixes of these four ideas.

The API itself takes ten minutes to learn. The interesting work is everything that wraps around it — prompt design, evaluation, cost control, and UX. Start with the CLI assistant above, then build the version that solves a real problem in your own workflow. That's how you actually learn this.


Written by

Raretechsol

Software company from Pakistan, specializing in Python and JavaScript. Passionate about automation, AI, and building practical web applications.