Scaling Intelligence: Deploying Google ADK Agents on Google Cloud Run
Modern AI development is moving away from monolithic designs toward modular, agentic architectures. In this post, I’ll walk you through how to deploy a Google Agent Development Kit (ADK) agent as a specialized microservice.
The Fiona AI Platform: A Tool-Centric Approach
The Fiona AI platform is an ecosystem purposefully built around specialized AI tools. While my core application runs on Django, I wanted to create a fleet of autonomous agents—specifically for lead management—that operate independently.
To achieve this, I decided to host my agents as Flask microservices on Google Cloud Run. This separation ensures that my main backend remains lightweight while my AI agents can scale horizontally based on the intensity of the research tasks.
Why Flask on Cloud Run?
Most tutorials suggest deploying agents via Vertex AI Agent Engine to take advantage of built-in UIs. However, for a platform like Fiona, a "chat bubble" isn't the goal—structured JSON data is. By using Flask, we treat the agent as a private API: my Django server sends a request, and the agent returns a clean JSON response, making it easy to integrate AI research directly into a database or a CRM workflow.
The Lead Research Agent
The agent showcased here is a Personalized Outreach & Research Agent. By feeding it basic lead information, the agent utilizes the Google Search tool to scrape, verify, and synthesize data about a prospect.
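The exact prompts and tool wiring live in the repo, but the agent's core can be thought of as a function that maps lead details to structured findings. Here is a minimal sketch of that contract — the field names and the `research_lead` helper are illustrative, not the repo's actual schema:

```python
import json

def research_lead(lead: dict) -> dict:
    """Sketch of the agent's input/output contract.

    In the real implementation, this is where the ADK agent runs its
    Google Search tool and synthesizes results; here we only validate
    the input and shape the response so the contract is clear.
    """
    required = ("name", "company")
    missing = [field for field in required if not lead.get(field)]
    if missing:
        raise ValueError(f"missing lead fields: {missing}")

    # Placeholder for the ADK agent call; the repo's
    # personalized_outreach.py performs the actual research.
    return {
        "lead": lead["name"],
        "company": lead["company"],
        "findings": [],        # filled by the agent's search results
        "outreach_draft": "",  # filled by the agent's synthesis step
    }

result = research_lead({"name": "Ada Lovelace", "company": "Analytical Engines"})
print(json.dumps(result))
```

Keeping this boundary a plain dict-in, dict-out function is what makes the agent easy to call from Django, Postman, or a test harness alike.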
You can find all the source code, prompts, and logic in my GitHub repository: 👉 FionaAgents - Personalized Outreach Agent
The Anatomy of the Deployment
To run this on Cloud Run, you need to prepare four essential files:
- main.py: The Flask entry point.
- personalized_outreach.py: The core ADK agent logic and tool definitions.
- prompts.json: The system instructions defining the agent's behavior.
- requirements.txt: Your dependencies (flask, google-adk, etc.).
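A requirements.txt along these lines is enough for the setup described here (the exact package set and any version pins should be taken from the repo, not this sketch):

```text
flask
google-adk
functions-framework
```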
Note on functions_framework.http: In main.py, we use the @functions_framework.http decorator. This is a critical piece of the puzzle for Google Cloud. It allows your Flask function to be "wrapped" as an HTTP-triggered service, handling the underlying server lifecycle so you can focus strictly on the request and response logic.
Step-by-Step Deployment on GCP
1. Initialize the Service
Head to the Google Cloud Console, search for Cloud Run, and click "Create Service."
- Workflow: You can use the "Service" option or the simplified "Function" workflow. Select Python as your runtime.
- Authentication: Since this is a backend-to-backend service for our internal platform, we can select "Allow unauthenticated invocations" during the testing phase (ensure you implement header-based secrets or IAM for production).
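If you go the header-based-secret route before enabling IAM, the check itself is small. A sketch, where the header name and how you source the secret are choices you'd make yourself:

```python
import hmac

# Illustrative header name; in production, load the expected secret
# from an environment variable or Secret Manager, never from code.
SECRET_HEADER = "X-Agent-Secret"

def is_authorized(headers: dict, expected_secret: str) -> bool:
    """Constant-time comparison of the shared-secret header."""
    provided = headers.get(SECRET_HEADER, "")
    return hmac.compare_digest(provided, expected_secret)

print(is_authorized({"X-Agent-Secret": "s3cret"}, "s3cret"))  # True
print(is_authorized({}, "s3cret"))  # False
```

`hmac.compare_digest` avoids the timing side channel of a plain `==` comparison; for anything beyond a stopgap, Cloud Run's built-in IAM invoker permissions are the sturdier option.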
2. Configuration & Cost Control
Under the Container/Scaling settings:
- Instance Limit: To keep costs at a minimum (or within the free tier), set the "Maximum number of instances" to 1 or 2.
- Memory: 512 MiB is usually sufficient for standard ADK agents.
3. Environment Variables (The API Key)
Your agent requires a GOOGLE_API_KEY to communicate with the LLM.
- Get your key: Go to Google AI Studio and generate an API key.
- Set the variable: In the Cloud Run settings, add an environment variable named GOOGLE_API_KEY and paste your key there.
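It's worth failing fast at startup if the variable is missing, rather than discovering it on the first LLM call. A small sketch (the simulated `setdefault` line just stands in for Cloud Run injecting the variable):

```python
import os

def require_api_key() -> str:
    """Fail fast at startup if GOOGLE_API_KEY is missing."""
    key = os.environ.get("GOOGLE_API_KEY")
    if not key:
        raise RuntimeError(
            "GOOGLE_API_KEY is not set; add it under the Cloud Run "
            "service's environment variables."
        )
    return key

# Simulate Cloud Run's injected configuration for this offline sketch.
os.environ.setdefault("GOOGLE_API_KEY", "demo-key")
print("API key loaded:", bool(require_api_key()))
```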
Deployment & Testing
Once you hit Deploy, Google will build the container and provide a Service URL.
Testing with Postman
Copy your new URL and open Postman. Create a new POST request:
- Body: Set to raw and JSON.
- Data: Provide the lead information as defined in the repo documentation.
- Result: You should receive a detailed JSON object containing the agent's research findings.
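If you'd rather script the smoke test than click through Postman, the same POST can be issued from Python. The URL and payload fields below are placeholders — substitute your actual Service URL and the schema from the repo docs:

```python
import json
import urllib.request

# Placeholder Service URL -- substitute the one Cloud Run gives you.
SERVICE_URL = "https://your-service-abc123-uc.a.run.app"

# Illustrative lead fields; use the schema from the repo documentation.
payload = {"name": "Ada Lovelace", "company": "Analytical Engines"}

req = urllib.request.Request(
    SERVICE_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Sending is left commented out so this sketch is safe to run offline:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
print(req.method, req.get_full_url())
```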
Conclusion
You’ve now learned how to decouple AI logic from your main application by deploying Google ADK agents as Flask microservices. This modularity is what allows the Fiona platform to remain fast, scalable, and intelligent.
Happy coding, and let's build something great!
Any queries? Reach out to me here: mrphilip.cv/contact/